Context: Project for CS6471 course at Georgia Tech, Spring 2022.
Authors:
- Seema Baddam
- Richard Huang
- Kai McKeever
Please refer to install.md.
Datasets used:
- Offensive Language Identification Dataset
- Implicit Hate Speech Dataset
- Racism is a Virus Dataset
Please refer to datasets.md for more details.
Before attempting the training phase, please use this command to preprocess the data:
### Start preprocessing | Defaults to all datasets
python -m src.utils.preprocess_utils --dataset_name all
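If you would rather trigger preprocessing from a Python script (for example, as the first step of a larger pipeline), the command above can be wrapped roughly as follows. This is only a sketch: the helper names `build_preprocess_command` and `run_preprocessing` are hypothetical and not part of this repository, and the only flag value assumed is `--dataset_name all`, as shown above.

```python
import subprocess
import sys
from typing import List


def build_preprocess_command(dataset_name: str = "all") -> List[str]:
    """Build the argument list for the preprocessing module (hypothetical helper)."""
    # Uses the current interpreter so the active environment is respected.
    return [
        sys.executable,
        "-m",
        "src.utils.preprocess_utils",
        "--dataset_name",
        dataset_name,
    ]


def run_preprocessing(dataset_name: str = "all") -> None:
    """Run preprocessing from the repository root; raises if the command fails."""
    subprocess.run(build_preprocess_command(dataset_name), check=True)
```

Calling `run_preprocessing()` from the repository root should be equivalent to running the command above by hand.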
Please refer to training.md for more details.
We provide the trained models here. To use them, please put them in the saved-models/ folder.
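To confirm that the downloaded models ended up in the right place, a quick check along these lines may help. It is a minimal sketch: `list_saved_models` is a hypothetical helper, and it only lists whatever files are present in `saved-models/` without assuming any particular file names or formats.

```python
from pathlib import Path
from typing import List


def list_saved_models(folder: str = "saved-models") -> List[str]:
    """Return the sorted names of files found directly in the given folder."""
    root = Path(folder)
    if not root.is_dir():
        # Folder missing: nothing has been placed there yet.
        return []
    return sorted(p.name for p in root.iterdir() if p.is_file())
```

An empty result means the folder is missing or contains no files, in which case evaluation will likely not find the trained models.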
Please refer to evaluation.md for more details.
Please refer to interpret.md for more details.