Introduction

Unfortunately, episodes of harassment against women have become increasingly frequent, and misogynistic comments abound in social media, where misogynists hide behind the safety of anonymity. It is therefore very important to identify misogyny in social media. Recent investigations have studied how the misogyny phenomenon takes place, for example as unjustified slurring or as stereotyping of the role or body of a woman (e.g., the hashtag #getbacktokitchen), as described in the book by Poland [1]. A preliminary research work was conducted by Hewitt et al. [2] as a first attempt at the manual classification of misogynous tweets. Automatic misogyny identification in social media was first investigated by Anzovino et al. [3].

However, when training a supervised model, it is important to guarantee the fairness of the model and therefore to reduce the error due to unintended bias [4], i.e., the tendency of a model to perform better on comments about some groups than on comments about others. As shown in [5], when addressing misogyny detection, this biased behaviour on new posts can be observed when processing sentences containing specific identity terms that frequently conveyed misogyny in the training data, e.g., "girlfriend" and "wife".
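To make the notion of unintended bias concrete, the following is a minimal sketch of a subgroup AUC measurement in the spirit of Dixon et al. [4]: the model's ranking quality is computed separately on the subset of posts mentioning a given identity term and compared with the overall score. The toy posts, labels, scores, and identity terms are illustrative assumptions, not the official AMI data or evaluation code.

```python
# Minimal sketch of subgroup AUC in the spirit of Dixon et al. [4].
# The toy posts, gold labels, and model scores below are illustrative
# placeholders, not the official AMI data or evaluation code.
from sklearn.metrics import roc_auc_score

texts = [
    "my wife is great",
    "my wife should shut up",
    "my girlfriend made dinner",
    "my girlfriend is an idiot, like all women",
    "nice weather today",
    "women do not deserve an opinion",
]
labels = [0, 1, 0, 1, 0, 1]              # 1 = misogynous gold label
scores = [0.2, 0.8, 0.4, 0.9, 0.1, 0.7]  # hypothetical model probabilities

def subgroup_auc(identity_term):
    """AUC restricted to the posts that mention the given identity term."""
    pairs = [(y, s) for t, y, s in zip(texts, labels, scores)
             if identity_term in t.lower()]
    sub_labels = [y for y, _ in pairs]
    sub_scores = [s for _, s in pairs]
    if len(set(sub_labels)) < 2:         # AUC is undefined for a single class
        return None
    return roc_auc_score(sub_labels, sub_scores)

print("overall AUC:", roc_auc_score(labels, scores))
for term in ["girlfriend", "wife"]:
    print(term, "subgroup AUC:", subgroup_auc(term))
```

A subgroup AUC noticeably below the overall AUC indicates that the model ranks posts mentioning that identity term worse than posts in general, which is exactly the biased behaviour described above.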

The shared task has previously been organized on the occasion of IberEval 2018 [6] and EVALITA 2018 [7], and as part of the HatEval shared task at SemEval 2019 [8].

Task Description

The AMI shared task proposes the automatic identification of misogynous content in the Italian language on Twitter. More specifically, it is organized into two main subtasks:

  • Subtask A - Misogyny & Aggressive Behaviour Identification: a system must recognize whether a text is misogynous or not and, in the case of misogyny, whether it expresses an aggressive attitude (a minimal baseline sketch is given after this list).

  • Subtask B - Unbiased Misogyny Identification: a system must discriminate misogynous content from non-misogynous content, while guaranteeing the fairness of the model (in terms of unintended bias) on a synthetic dataset.
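To make the setting of Subtask A concrete, the following is a minimal two-stage baseline sketch assuming a TF-IDF representation and logistic regression; the toy English texts, label names, and two-stage design are assumptions made for illustration, not the official task baseline or data format.

```python
# Minimal two-stage baseline sketch for Subtask A (illustrative only):
# stage 1 predicts misogynous vs. non-misogynous; stage 2 predicts the
# aggressive attitude and is trained on the misogynous examples alone.
# The toy English texts stand in for the official Italian AMI corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "women belong in the kitchen",
    "I will hurt you, you stupid woman",
    "lovely day at the beach",
    "great match yesterday",
    "all women are useless",
    "shut your mouth or else, woman",
]
misogynous = [1, 1, 0, 0, 1, 1]   # Subtask A, first label
aggressive = [0, 1, 0, 0, 0, 1]   # Subtask A, second label

# Stage 1: misogyny identification, trained on the whole training set.
miso_clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
miso_clf.fit(train_texts, misogynous)

# Stage 2: aggressiveness, trained only where misogyny is present.
agg_texts = [t for t, m in zip(train_texts, misogynous) if m == 1]
agg_labels = [a for a, m in zip(aggressive, misogynous) if m == 1]
agg_clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
agg_clf.fit(agg_texts, agg_labels)

def predict(text):
    """Return the pair of Subtask A labels for a single post."""
    if miso_clf.predict([text])[0] == 0:
        return ("non-misogynous", None)
    is_aggressive = agg_clf.predict([text])[0] == 1
    return ("misogynous", "aggressive" if is_aggressive else "non-aggressive")

print(predict("get back to the kitchen, woman"))
```

The two-stage design mirrors the task structure, in which aggressiveness is only defined for misogynous posts; a joint multi-label classifier would be an equally reasonable starting point.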

References

1. Poland, B. (2016). Haters: Harassment, Abuse, and Violence Online. University of Nebraska Press.

2. Hewitt, S., Tiropanis, T., and Bokhove, C. (2016). The Problem of Identifying Misogynist Language on Twitter (and Other Online Social Spaces). In Proceedings of the 8th ACM Conference on Web Science, pp. 333–335.

3. Anzovino, M., Fersini, E., and Rosso, P. (2018). Automatic Identification and Classification of Misogynistic Language on Twitter. In Proceedings of the International Conference on Applications of Natural Language to Information Systems (NLDB 2018). Lecture Notes in Computer Science, vol. 10859, pp. 57–64.

4. Dixon, L., Li, J., Sorensen, J., Thain, N., and Vasserman, L. (2018). Measuring and Mitigating Unintended Bias in Text Classification. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pp. 67–73.

5. Nozza, D., Volpetti, C., and Fersini, E. (2019). Unintended Bias in Misogyny Detection. In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence (WI '19), pp. 149–155.

6. Fersini, E., Anzovino, M., and Rosso, P. (2018). Overview of the Task on Automatic Misogyny Identification at IberEval 2018. In Proceedings of the 3rd Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018). CEUR Workshop Proceedings, pp. 214–228.

7. Fersini, E., Nozza, D., and Rosso, P. (2018). Overview of the EVALITA 2018 Task on Automatic Misogyny Identification (AMI). In Proceedings of the 6th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian: Final Workshop (EVALITA 2018). CEUR Workshop Proceedings.

8. Basile, V., Bosco, C., Fersini, E., Nozza, D., Patti, V., Rangel Pardo, F. M., Rosso, P., and Sanguinetti, M. (2019). SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter. In Proceedings of the 13th International Workshop on Semantic Evaluation, pp. 54–63.