Dr. Richi Nayak On Developing An Algorithm That Weeds Out Misogyny Online

Written by: Rajkanya Mahapatra

Dr. Richi Nayak is an Associate Professor in the School of Computer Science at the Queensland University of Technology. She is the Director of Higher Degree Research in the field of artificial intelligence. In 2017, she was appointed as the IT ambassador of the Queensland Women in Technology Association.

In conversation with Rituparna Chatterjee, Director of Communications at Ungender, Dr. Nayak talks about how she came to develop an algorithm to counter misogyny online.


Rituparna: What motivated you to build a career in data science?

Dr. Nayak: From a very young age, I was always fond of mathematics. I would choose math over all other subjects. I always had an inclination towards maths and engineering. As it happened in India in my time, you could choose a career in either engineering or medicine. I chose the former.

During engineering, I was always curious to know how algorithms work, how certain calculus equations would work – I would think about how certain problems could be solved in a faster, more automated way, using algorithms. In my masters at IIT Roorkee, I worked on an electrical engineering problem but with the use of a machine learning algorithm – that was my first experience with data science and mining.

After that I switched my track from electrical engineering to computer science and that’s what I did my PhD in. That’s when my hardcore machine learning journey started at the Queensland University of Technology 20 years ago. I’ve been an academic and researcher in this field ever since.

Rituparna: As you said, you studied in an environment that was overwhelmingly male – so much so that you thought it was the norm. Do you think STEM has changed for women in all these years?

Dr. Nayak: I would love to say ‘yes’ but unfortunately, ‘no’. We became quite progressive 15-20 years ago. I follow the Australian numbers, so I would say we’ve become a bit more regressive since. Here, the field of IT is very under-represented – only about 20% of graduates are female. Somehow women are not choosing maths, IT or engineering; they’ve been going for more creative fields.

Rituparna: Dr. Nayak, you’ve developed what is called ‘Long Short-Term Memory with Transfer Learning’ – it basically sifts through thousands of tweets to pick out misogynistic content. This will help make the internet a safer space for people, especially those who are vulnerable to attacks online. What was the trigger for you to want to develop this?

Dr. Nayak: I would say this research was not motivated by any personal incident or experience as such. It came out of a conversation among colleagues. We brainstormed with our colleagues at the Faculty of Law who had been working on issues of domestic abuse. I asked them how technology could help them, and they shared how women are really targeted a lot online. That’s when the idea developed: we wondered how things would change if the process of picking out misogynistic content online could be done in an automated fashion.

We also wondered whether such an algorithm could help big companies like Twitter and Facebook think about this issue. Would our algorithm inspire them to make one of their own? So, I think it was more about my inclination towards research, how technology could meet social science, and how that could be used to benefit society.

Rituparna: There’s rampant sexual harassment on platforms like Facebook and Twitter disguised as trolling. How does your program work to address this issue?

Dr. Nayak: The high point of the algorithm we’ve developed is that it came out of collaborative research. We were working with social scientists who understand the problem. In machine learning, we always say, “the outcome is only as good as the data.” We were lucky to have a research team who developed this data set – it was a very daunting job for the research assistants to go through so many tweets and identify which tweet was misogynistic and which one was not. I would say we’re still living in a good world – it was not as if every next tweet was misogynistic.

Once we had the data set prepared, the next task was to figure out how to get the machine to learn context, because having just the words isn’t enough. In the Australian Twitter sphere, we found that people were using certain words in satirical and fun ways. So, we had to understand the context and the patterns in which this context appears. That’s where we developed this ‘Progressive Transfer Learning Approach’, where we gradually made the machine understand the context.

We first taught the machine general language – we used Wikipedia-style data so it could learn how language is expressed. Once it had learnt the language, we taught the machine what qualified as ‘aggressive language’ – we used product reviews for that. In those reviews, a lot of the time people are very angry and use very derogatory words. Once the machine had learnt that, we made it learn the Twitter language; we know that Twitter language is completely different from our everyday language. After this, the algorithm began to learn patterns. Finally, we introduced misogynistic tweets to the algorithm, and it started to understand which tweet was misogynistic and which was not.

We use a deep learning algorithm called LSTM, and this combined approach worked. Our model is very simple – there aren’t too many parameters – so it’s easy to implement.
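For readers curious what this staged set-up can look like in code, here is a minimal sketch in Python with Keras. Everything in it is an illustrative assumption rather than the published model – the placeholder corpora, the labels, the layer sizes, and the reuse of a single binary objective across all stages (the actual research would train each stage on its own task). The point is only the pattern Dr. Nayak describes: keep fine-tuning the same small LSTM on progressively more specific data, ending with tweets labelled as misogynistic or not.

```python
# Rough sketch of staged ("progressive") fine-tuning with a small LSTM.
# Corpora, labels and hyperparameters below are illustrative placeholders.
import tensorflow as tf
from tensorflow.keras import layers

MAX_TOKENS, SEQ_LEN = 20_000, 60

# Turns raw text into padded integer sequences the LSTM can consume.
vectorizer = layers.TextVectorization(max_tokens=MAX_TOKENS,
                                      output_sequence_length=SEQ_LEN)

def build_classifier() -> tf.keras.Model:
    """Deliberately small model: embedding -> single LSTM -> sigmoid head."""
    return tf.keras.Sequential([
        layers.Embedding(MAX_TOKENS, 64, mask_zero=True),
        layers.LSTM(64),
        layers.Dense(1, activation="sigmoid"),
    ])

def train_in_stages(model, stages, epochs_per_stage=2):
    """Fine-tune the same weights on each corpus in turn, so later stages
    inherit what was learnt from the earlier, more general ones."""
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    for name, texts, labels in stages:
        print(f"Stage: {name} ({len(texts)} examples)")
        x = vectorizer(tf.constant(texts))
        model.fit(x, tf.constant(labels), epochs=epochs_per_stage, verbose=0)
    return model

# Placeholder corpora standing in for the four stages described above:
# general language, aggressive language (product reviews), Twitter
# language, and finally tweets labelled as misogynistic or not.
stages = [
    ("general language",   ["the cat sat on the mat"],          [0]),
    ("aggressive reviews", ["this product is absolute garbage"], [1]),
    ("twitter language",   ["lol cant even rn tbh"],             [0]),
    ("misogyny labels",    ["go back to the kitchen"],           [1]),
]

vectorizer.adapt(tf.constant([t for _, texts, _ in stages for t in texts]))
model = train_in_stages(build_classifier(), stages)
```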

Rituparna: Let’s talk a little bit more about context. How young folks talk today is very different – when they say something is ‘sick!’, they mean it’s great. Did you come across such words too? Could you share some examples?

Dr. Nayak: One thing that really amazed me was the phrase, “Go to the kitchen.” It’s such a general phrase, but put into context it’s such a typical, misogynistic remark. The algorithm was able to pick out some of those tweets. It’s really interesting because none of the abusive/derogatory words were there – there was no tone, no emotion, only plain text. We were really surprised that the algorithm was able to pick up some of these lines.

Another interesting example was the use of the word ‘bitch’. It was difficult to gauge whether the word was being used in a friendly or a derogatory manner – something like, “oh, c’mon bitch, you can’t do that,” is super normal.

To know what Dr. Nayak said further, watch the full conversation on Ungender’s YouTube channel, here.

This interview has been edited for length and clarity.


Ungender Insights is the product of our learning from advisory work at Ungender. Our team specializes in advising workplaces on workplace diversity and inclusion. Write to us at contact@ungender.in to understand how we can partner with your organization to build a more inclusive workplace.
