This file describes how to set up the data folder so that the code works properly. The folder contains multiple subfolders. The train folder stores all the data used to train the text, voice, and face emotion classification models. The other folders contain the custom data used for evaluation across the different modalities.
To download the datasets required to train certain emotion classification models, use the bash scripts provided in the train folder.
Some of the datasets cannot be downloaded automatically: you need to request access to them, and they may be used for research only.
The following sections list all the datasets and explain how to access them.
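
The download step can be sketched as a small wrapper around the three scripts named in the sections below. This is only a sketch and not part of the repository: `run_download_scripts` is a hypothetical helper, and it assumes that the train folder is at data/train and that each script is meant to be run from inside that folder.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): run every download
# script found in the train folder and report the ones that are missing.
run_download_scripts() {
    local train_dir="${1:-data/train}"   # assumed location of the train folder
    local script missing=0
    for script in download_text_data.sh download_image_data.sh download_speech_data.sh; do
        if [ -f "$train_dir/$script" ]; then
            echo "Running $script"
            # Run in a subshell so the caller's working directory is unchanged.
            ( cd "$train_dir" && bash "./$script" ) || return 1
        else
            echo "Skipping $script: not found in $train_dir" >&2
            missing=$((missing + 1))
        fi
    done
    # Succeed only if all three scripts were found and ran without error.
    [ "$missing" -eq 0 ]
}
```

For example, `run_download_scripts data/train` could be called from the directory that contains the data folder. Datasets that need a manual step still have to be prepared first.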
To download the text datasets, please use the download_text_data.sh
script in the train folder.
This will download two datasets:
After downloading, the datasets will be combined and stored in the text subfolder.
To download the image datasets, please use the download_image_data.sh
script in the train folder.
This will download two datasets:
- FER2013+ Dataset
  - This dataset requires a manual step: the FER2013 dataset needs to be downloaded manually.
  - Go to this website and download the fer2013.tar.gz file.
  - Create the folder data/train/image/fer2013.
  - Copy the file fer2013.csv from fer2013.tar.gz to the created folder.
  - After that, run the download_image_data.sh script.
- Kaggle Dataset
  - This requires you to create a Kaggle account and set up the Kaggle API first.
  - After that, run the download_image_data.sh script.
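
The manual preparation above can be sketched in bash. Both functions are hypothetical helpers, not part of the repository. `prepare_fer2013` assumes only that the archive contains a file named fer2013.csv somewhere inside it. `kaggle_credentials_present` assumes the usual Kaggle API credential locations: the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, or a kaggle.json file in `~/.kaggle` (or in `KAGGLE_CONFIG_DIR` if that is set).

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy fer2013.csv out of the downloaded archive
# into the folder that the image download script expects.
prepare_fer2013() {
    local archive="$1"                            # path to fer2013.tar.gz
    local dest="${2:-data/train/image/fer2013}"   # target folder from the steps above
    local tmp csv status
    tmp="$(mktemp -d)" || return 1
    # Unpack into a scratch folder so the archive's internal layout does not matter.
    tar -xzf "$archive" -C "$tmp" || { rm -rf "$tmp"; return 1; }
    csv="$(find "$tmp" -type f -name 'fer2013.csv' | head -n 1)"
    if [ -z "$csv" ]; then
        echo "fer2013.csv not found in $archive" >&2
        rm -rf "$tmp"
        return 1
    fi
    mkdir -p "$dest" && cp "$csv" "$dest/fer2013.csv"
    status=$?
    rm -rf "$tmp"
    return "$status"
}

# Hypothetical helper: succeed if Kaggle API credentials appear to be configured.
kaggle_credentials_present() {
    # Credentials can be supplied through these two environment variables ...
    if [ -n "${KAGGLE_USERNAME:-}" ] && [ -n "${KAGGLE_KEY:-}" ]; then
        return 0
    fi
    # ... or through a kaggle.json token file in the Kaggle config folder.
    [ -f "${KAGGLE_CONFIG_DIR:-$HOME/.kaggle}/kaggle.json" ]
}
```

A possible order of use is `prepare_fer2013 ~/Downloads/fer2013.tar.gz` (the download location is an assumption), then `kaggle_credentials_present || echo "Set up the Kaggle API first"`, and finally the download_image_data.sh script.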
You can also download additional datasets yourself and add them to the folder. I recommend the following datasets:
- JAFFE Dataset
  - You need to download this dataset manually. Go to the link above and request access to the data.
  - Download the data and put the zip file in the data/train/images folder.
  - After that, run the download_image_data.sh script.
- CK+ Dataset
  - This data needs to be downloaded manually. Go to the link above and request the data.
- AffectNet Dataset
  - You need to request access to the data yourself from the page above.
  - Labelling the data yourself is recommended, because many of the default labels are incorrect.
- FFHQ Dataset
  - Download the data and label it manually. No labels are available.
- BU-3DFE Dataset
  - You need to request the data online. It is already labelled.
All of the datasets above can be extracted automatically by the download_image_data.sh script.
Take a look at the script to see how the data should be formatted.
To download the speech datasets, please use the download_speech_data.sh
script in the train folder.
This will download these datasets:
- RAVDESS database
- MELD dataset
- CREMA-D Dataset
  - This dataset is stored in your TFDS (TensorFlow Datasets) download folder (usually /home/$USER/tensorflow_datasets).
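
To check where the CREMA-D data ended up, the folder lookup can be sketched as follows. TFDS uses the `TFDS_DATA_DIR` environment variable if it is set and otherwise defaults to `~/tensorflow_datasets` (with an underscore), and the TFDS catalog name of the dataset is `crema_d`. The two function names are hypothetical, and the sketch assumes that the download script does not pass its own data directory to TFDS.

```shell
#!/usr/bin/env bash
# Print the folder where TFDS keeps its data: TFDS_DATA_DIR if it is set,
# otherwise the library default of ~/tensorflow_datasets.
tfds_data_dir() {
    echo "${TFDS_DATA_DIR:-$HOME/tensorflow_datasets}"
}

# Succeed if a crema_d folder exists there ("crema_d" is the TFDS catalog name).
crema_d_downloaded() {
    [ -d "$(tfds_data_dir)/crema_d" ]
}
```

For example, `crema_d_downloaded && echo "CREMA-D found in $(tfds_data_dir)"` prints the location once the speech download script has run.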
I collected this part of the data myself during my time at MIT.
It cannot be disclosed publicly, in order to protect the privacy of the subjects of the data study.
The plant data should be cut to the experiment duration and then placed in the plant
subfolder.