Caspar Addyman [email protected]
A demonstration project using machine learning models to analyse a dataset of videos of parents demonstrating jokes to babies. This dataset was assembled for the Sage Ethical AI hackathon 2023. It serves as a small test case for exploring the challenges of applying machine learning models to parent-child interactions. You can watch a video motivating the project here: Sage Hackathon 2023 - PCI Video Analysis (6m20).
A small test dataset is provided in the LookitLaughter.test folder. It consists of 54 videos of parents demonstrating simple jokes to their babies. Metadata is provided in _LookitLaughter.xlsx. Each video shows one joke from a set of five possibilities [Peekaboo, TearingPaper, NomNomNom, ThatsNotAHat, ThatsNotACat]. For each joke, parents rated how funny the child found it [Not Funny, Slightly Funny, Funny, Extremely Funny] and whether the child laughed [Yes, No].
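When working with the metadata, the label sets described above can be written down as constants. The sketch below is illustrative only: the names `JOKE_TYPES`, `FUNNINESS_SCALE`, `encode_rating`, `encode_laughed`, and `is_known_joke` are hypothetical and not part of this repository, and nothing is assumed about the column names in _LookitLaughter.xlsx. It maps the parent ratings onto an ordinal score so they can be compared with model outputs.

```python
# Illustrative sketch only: these names are hypothetical, not part of this repository.
JOKE_TYPES = ("Peekaboo", "TearingPaper", "NomNomNom", "ThatsNotAHat", "ThatsNotACat")
FUNNINESS_SCALE = ("Not Funny", "Slightly Funny", "Funny", "Extremely Funny")


def encode_rating(rating: str) -> int:
    """Map a funniness rating to an ordinal score (0 = Not Funny ... 3 = Extremely Funny)."""
    cleaned = rating.strip().lower()
    for score, label in enumerate(FUNNINESS_SCALE):
        if cleaned == label.lower():
            return score
    raise ValueError(f"Unknown funniness rating: {rating!r}")


def encode_laughed(answer: str) -> bool:
    """Map the Yes/No 'did the child laugh' answer to a boolean."""
    cleaned = answer.strip().lower()
    if cleaned not in ("yes", "no"):
        raise ValueError(f"Expected 'Yes' or 'No', got: {answer!r}")
    return cleaned == "yes"


def is_known_joke(name: str) -> bool:
    """Check that a joke label is one of the five joke types in the dataset."""
    return name in JOKE_TYPES


if __name__ == "__main__":
    print(encode_rating("Slightly Funny"), encode_laughed("Yes"), is_known_joke("Peekaboo"))
```

Raising on unknown labels, rather than returning a default, makes typos in the spreadsheet visible early.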
A larger dataset with 1425 videos is available on request.
This project makes use of the following libraries and versions:
- Python 3.12
- PyTorch 2.4.0 (required by YOLOv8, deepface, and whisper)
- ultralytics 8.2 (wrapper for YOLOv8 object detection model)
- deepface 0.0.93 (Facial Expression Recognition)
- openai-whisper (the open-source version of OpenAI's Whisper speech recognition model)
You can run this project using Docker. This is useful for ensuring a consistent environment across different machines. For detailed instructions, please refer to the Docker Setup Guide.
A Conda environment.yml file is provided, but the dependencies are complex and can fail to install in a single step. The culprit seems to be the PyTorch dependencies, so instead run the following commands in the terminal.
- Create a new Python 3.12 environment
conda create --name "babyjokes" python=3.12
- Activate the environment
conda activate babyjokes
- Install PyTorch. It is advisable to follow the instructions at pytorch.org to get the correct version for your system.
- Add the other dependencies.
Run the following command from the root directory of this project.
conda env update --file environment.yml
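After these steps, it can help to confirm which PyTorch build ended up in the environment. The following is a minimal sketch, not part of this repository: `describe_torch` is a hypothetical helper that reports the installed version and whether a CUDA GPU is visible, and returns a hint instead of crashing if PyTorch cannot be imported.

```python
def describe_torch() -> str:
    """Return a one-line summary of the local PyTorch install, or a hint if it is missing."""
    try:
        # Imported lazily so the script also runs before PyTorch is installed.
        import torch
    except (ImportError, OSError):
        return "PyTorch could not be imported in this environment."
    device = "CUDA GPU available" if torch.cuda.is_available() else "CPU only"
    return f"PyTorch {torch.__version__} ({device})"


if __name__ == "__main__":
    print(describe_torch())
```

If the summary reports "CPU only" on a machine with an NVIDIA GPU, the installed wheel probably does not match the local CUDA setup.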
We also provide a pip requirements.txt file. This should work but has not been tested.
We recommend following similar steps to the conda installation above.
- Create a new Python 3.12 environment.
- Install PyTorch
For example, on Windows with Python 3.12 and CUDA 12.4:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124 --user
- Install the other dependencies:
pip install ipython pillow calcs opencv-python fastapi matplotlib moviepy numpy pandas pytest torch ultralytics deepface openai-whisper openpyxl ipywidgets tensorflow tf-keras librosa pyannote-audio python-dotenv lapx
Or install them from our requirements.txt:
pip install -r requirements.txt
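Because the pip route is untested, a quick check of which core libraries are importable may save some debugging. This is a hedged sketch rather than a script shipped with the project: `CORE_PACKAGES` and `check_packages` are hypothetical names, and the list covers only the main libraries named in this README. It uses only the standard library, so it runs even when nothing else is installed.

```python
from __future__ import annotations

from importlib import metadata, util

# Import name -> pip distribution name for the main libraries named in this README.
CORE_PACKAGES = {
    "torch": "torch",
    "ultralytics": "ultralytics",
    "deepface": "deepface",
    "whisper": "openai-whisper",
    "cv2": "opencv-python",
}


def check_packages(packages: dict[str, str]) -> dict[str, str | None]:
    """Return the installed version per import name, or None if it cannot be found."""
    report: dict[str, str | None] = {}
    for import_name, dist_name in packages.items():
        if util.find_spec(import_name) is None:
            report[import_name] = None  # module is not importable
            continue
        try:
            report[import_name] = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            # Importable, but installed under a different distribution name.
            report[import_name] = "unknown"
    return report


if __name__ == "__main__":
    for name, version in check_packages(CORE_PACKAGES).items():
        print(f"{name:12s} {version if version is not None else 'MISSING'}")
```

The versions this reports, together with your OS, are useful details to include when telling us what worked.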
If you get this working, please let us know what you did (and what OS you are using) so we can update this README.
Sage data scientist Yu-Cheng has a write-up of his team's approach to the problem on the Sage-AI blog: Quantifying Parent-Child Interactions: Advancing Video Understanding with Multi-Modal LLMs.
Repositories from the hackathon are found here:
- London team - Combining Speech recognition and laughter detection https://github.com/chilledgeek/ethical_ai_hackathon_2023
- US team - Interpreting Parent laughter with VideoLLama https://github.com/yutsai84/Ask-Anything
This project is licensed under the MIT License - see the LICENSE file for details.
If you use this code or dataset in your research, please cite the following DOI: