From 92cb1e0f6d7435045acaba18fc3f4692effb6800 Mon Sep 17 00:00:00 2001
From: Janis Goldzycher <24439984+jagol@users.noreply.github.com>
Date: Mon, 29 Apr 2024 13:30:22 +0200
Subject: [PATCH] Add German Adversarial Hate Speech Dataset to README (#100)

---
 README.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/README.md b/README.md
index 3b825fd..2951d92 100644
--- a/README.md
+++ b/README.md
@@ -1233,6 +1233,19 @@ nt-dates/data/](https://amiibereval2018.wordpress.com/important-dates/data/)
 * Medium: Text
 * Reference: Mandl, T., Modha, S., Majumder, P., Patel, D., Dave, M., Mandlia, C. and Patel, A., 2019. Overview of the HASOC track at FIRE 2019. In: Proceedings of the 11th Forum for Information Retrieval Evaluation,.
 
+#### GAHD: A German Adversarial Hate Speech Dataset
+* Link to publication: [https://arxiv.org/abs/2403.19559](https://arxiv.org/abs/2403.19559)
+* Link to data: [https://github.com/jagol/gahd](https://github.com/jagol/gahd)
+* Task description: Binary hate speech detection ("hate speech", "not-hate speech")
+* Details of task: Consists of adversarial and contrastive examples
+* Size of dataset: 10,996 texts
+* Percentage abusive: 42.4%
+* Language: German
+* Level of annotation: Post/Sentence
+* Platform: Synthetic data and news sentences
+* Medium: Text
+* Reference: Goldzycher, J., Röttger, P., and Schneider, G., 2024. Improving Adversarial Data Collection by Supporting Annotators: Lessons from GAHD, a German Hate Speech Dataset. To appear in the Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL 2024), Mexico City, Mexico, June 17–19.
+
 ### Greek
 
 #### Deep Learning for User Comment Moderation, Flagged Comments