Abstract
Many new developments to detect and mitigate toxicity are currently being evaluated. We
are particularly interested in the correlation between toxicity and the emotions expressed in
online posts. While toxicity may be disguised by amending the wording of posts, emotions
will not be. Therefore, we describe here an ensemble method to identify toxicity and classify
the emotions expressed in a corpus of annotated posts published by Task 5 of SemEval 2021.
Our analysis shows that the majority of such posts express anger, sadness and fear. Our
method to identify toxicity combines a lexicon-based approach, which on its own achieves an
F1 score of 61.07%, with a supervised learning approach, which on its own achieves an
F1 score of 60%. When both methods are combined, the ensemble achieves an F1 score of 66.37%.
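
To make the idea of combining a lexicon-based detector with a supervised one concrete, the following is a minimal sketch and not the method used in the paper. The abstract does not state how the two components are combined, so the weighted average, the `weight` and `threshold` values, the toy lexicon and the `classifier_proba` callable are all assumptions introduced for illustration. The sketch also treats detection as a per-post yes/no decision, which is a simplification.

```python
from typing import Callable, Sequence

# Hypothetical toy lexicon; the lexicon actually used is not given in the abstract.
TOXIC_LEXICON = frozenset({"idiot", "stupid", "moron"})


def lexicon_score(text: str, lexicon: frozenset = TOXIC_LEXICON) -> float:
    """Return 1.0 if any token of the post appears in the lexicon, else 0.0."""
    tokens = (tok.strip(".,!?;:\"'").lower() for tok in text.split())
    return 1.0 if any(tok in lexicon for tok in tokens) else 0.0


def ensemble_predict(
    text: str,
    classifier_proba: Callable[[str], float],
    weight: float = 0.5,      # assumed share given to the lexicon component
    threshold: float = 0.4,   # assumed decision threshold
) -> bool:
    """Flag a post as toxic from a weighted average of the two components.

    `classifier_proba` stands in for any supervised model that maps a post
    to a toxicity probability in [0, 1].
    """
    combined = weight * lexicon_score(text) + (1.0 - weight) * classifier_proba(text)
    return combined >= threshold


def f1_score(gold: Sequence[bool], pred: Sequence[bool]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when there are no positives."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

With these assumed settings, a lexicon hit alone is enough to flag a post, while the classifier alone must be fairly confident (probability of at least 0.8). This is one way an ensemble can recover posts that either component would miss on its own.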
| Original language | English |
|---|---|
| Pages (from-to) | 860-864 |
| Journal | Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021). Association for Computational Linguistics |
| DOIs | |
| Publication status | Published - Aug 2021 |
| Event | The 15th International Workshop on Semantic Evaluation (SemEval-2021) - Duration: 1 Aug 2021 → … |