This template provides tools and best practices to quick-start your research project with a fully functional environment and backbones for your codebase. It is based on my own experience and the experience of others and aims to help you get started effectively. Feel free to use this template and modify it to suit your needs. The template includes the following:
- ⚡ PyTorch Lightning: A framework to organize your deep learning research.
- 🔧 Hydra: A powerful configuration management system.
- ✅ Pre-commit: A tool to ensure clean and formatted code.
- 🧪 Unit Testing: For verifying that each function works as expected.
- 📊 WandB integration: For experiment tracking and visualization.
- 🤖 CI with Github Actions: Continuous Integration setup to maintain project quality.
Additional utilities:
- Ready-to-use Jupyter notebook in `report/plots/notebook.ipynb` for making reproducible Seaborn plots, pulling data directly from your WandB project.
- Pre-implemented VS Code debugger config file in `.vscode/launch.json` for debugging your code.
This template is built around the PyTorch Lightning framework. You are expected to organize your modules in the `src` folder:
- `src/model.py`: Define your model architecture and the `forward` function. Each model should be a class inheriting from `pl.LightningModule`.
- `src/dataset.py`: Define your datasets (`torch.utils.data.Dataset`) and datamodules (`pl.LightningDataModule`).
- `src/task.py`: Implement your global forward function, loss function, training and evaluation steps, and metrics. Add custom callbacks if needed.
- `src/train.py`: The main script. It loads the configuration file, instantiates components, trains the model, and saves logs and outputs.
Learn more about PyTorch Lightning here.
The template uses Hydra for flexible configuration management. Configuration files are stored in the `configs` folder:
- `configs/train.yaml`: The main config file where you define hyperparameters.
You can also define different configurations for different experiments, override configs, create nested configs, etc. The configuration system is very flexible and lets you define your own structure. More details here.
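For illustration, a `configs/train.yaml` could be structured like this (the keys and values below are an assumption for the sake of example; adapt them to your project):

```yaml
defaults:
  - _self_

seed: 0
trainer:
  max_epochs: 10
model:
  lr: 1e-3
```

Any value can then be overridden from the command line, e.g. `python train.py trainer.max_epochs=50`.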
Pre-commit hooks ensure that your code is clean and formatted before any change is committed, which is especially useful when working with multiple collaborators. The hooks are defined in the `.pre-commit-config.yaml` file.
When a hook modifies files, re-stage and re-commit the changes. The hooks are also run automatically by the CI pipeline on your remote repository to maintain code quality.
Install them with:
pre-commit install
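For reference, a `.pre-commit-config.yaml` typically looks something like the following (the specific hooks and pinned versions here are illustrative, not necessarily the ones shipped with the template):

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
  - repo: https://github.com/psf/black
    rev: 24.3.0
    hooks:
      - id: black
```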
A unit test file, `test_all.py`, is included to verify that each of your functions works as expected. While not mandatory for simple projects, it is a good practice for larger or collaborative projects. The tests are automatically run by the CI pipeline on your remote repository, and notifications are sent if any test fails.
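As an illustration (the utility function and test below are hypothetical, not part of the template), tests in `test_all.py` are plain functions that `pytest` discovers by their `test_` prefix:

```python
# Hypothetical utility that might live in src/ -- inlined here for self-containment.
def accuracy(preds: list, targets: list) -> float:
    """Fraction of positions where the prediction equals the target."""
    assert len(preds) == len(targets), "inputs must have equal length"
    correct = sum(p == t for p, t in zip(preds, targets))
    return correct / len(preds)


def test_accuracy():
    # Exact-match checks on small hand-computed cases
    assert accuracy([1, 0, 1], [1, 1, 1]) == 2 / 3
    assert accuracy([0], [0]) == 1.0
```

Run the suite locally with `pytest test_all.py` before pushing, so CI failures don't surprise you.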
Log experiments and metrics seamlessly with WandB. The integration is already included in the template, and logging is as simple as using the `self.log()` function in PyTorch Lightning. To configure WandB, just edit `configs/train.yaml`:
```yaml
logger:
  _target_: lightning.pytorch.loggers.WandbLogger
  entity: # Add your WandB entity here
  project: # Add your WandB project here
```
Learn more about WandB here.
Python 3.6 or later is required. It is recommended to use a virtual environment to avoid package conflicts.
1️⃣ Install dependencies:
pip install -r requirements.txt
2️⃣ Set up pre-commit hooks:
pre-commit install
3️⃣ Configure WandB (if applicable):
Edit `configs/train.yaml` with your WandB entity and project information.
4️⃣ You're good to go!
To run your code, simply execute the `train.py` script. Pass hyperparameters as arguments:
python train.py seed=0 my_custom_argument=config_1
This will launch a training run with the specified hyperparameters.
For parallel jobs on a cluster, use Hydra's `--multirun` feature:
python train.py --multirun seed=0,1,2,3,4 my_custom_argument=config_1,config_2
If using Slurm, the default launcher config `hydra/launcher/slurm.yaml`, based on the `submitit` plugin for Hydra, will be used.
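As an indication of what a submitit-based Slurm launcher config can contain (the field values below are illustrative assumptions; check the template's actual `hydra/launcher/slurm.yaml` for the real defaults), typical fields include:

```yaml
# Typically selected via the defaults list, e.g.:
#   - override hydra/launcher: submitit_slurm
hydra:
  launcher:
    timeout_min: 60      # job time limit in minutes
    cpus_per_task: 4
    gpus_per_node: 1
    partition: main      # cluster-specific; an assumption
```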
Learn more about Hydra here.
All kinds of contributions are welcome! You can add tools, improve practices, or suggest trade-offs.
👉 If you add external dependencies, make sure to update the `requirements.txt` file.
This template is directly inspired by our project PrequentialCode, made possible by Eric Elmoznino and Tejas Kasetty:
Feel free to dive in and start your project! 🌟