Here's a zoo of many machine learning applications!
Why did I make this? Many ML applications are written with different frameworks and different dependencies, and the code quality of many implementations (even official ones!) is concerning. My goal with this repo is to provide concise, well-commented, easy-to-understand implementations of many important ML models.
Guiding Philosophy and Goals for the code in this repo:
- Correctness: Implement a given application as faithfully as possible to the original source. Note that this does not guarantee a perfect reproduction; some differences may exist.
- Simplicity: Simplify implementations down to their bare essence -- reducing dependencies as much as possible and rewriting needlessly complex code with an eye toward simplicity.
- Educational Value: Provide concise comments on model architecture and unclear code to make this an educational codebase.
- Distillation of Algorithmic and Systems Contributions: Provide (where possible) the ability to generate synthetic data with realistic input shapes, enabling systems research without access to the original training data (see the sketch after this list).
- Performance Profiling: Write code in a way that enables easy performance profiling. Specifically, I aim to ensure code works with PyTorch Dynamo (`torch.compile`) where possible (also sketched below).
- Modularity: All apps are written completely independently of each other, so any app can be ripped out and used elsewhere.
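As a rough illustration of the synthetic-data and Dynamo-compatibility goals above, here is a minimal sketch. The `ToyMLP` model, the `make_synthetic_batch` helper, and all shapes are hypothetical stand-ins for illustration only, not code from any app in this repo.

```python
import torch
import torch.nn as nn

class ToyMLP(nn.Module):
    """Hypothetical stand-in model; each real app defines its own architecture."""
    def __init__(self, in_features=1024, hidden=4096, num_classes=1000):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x):
        return self.net(x)

def make_synthetic_batch(batch_size=256, in_features=1024, num_classes=1000):
    # Random tensors with realistic shapes stand in for the original training
    # data, which is enough for systems-style benchmarking and profiling.
    x = torch.randn(batch_size, in_features)
    y = torch.randint(0, num_classes, (batch_size,))
    return x, y

model = ToyMLP()
compiled = torch.compile(model)  # PyTorch Dynamo entry point (PyTorch >= 2.0)
x, y = make_synthetic_batch()
loss = nn.functional.cross_entropy(compiled(x), y)
loss.backward()
```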
Each of the models in this repository is based on an existing academic publication presenting that model. The implementations here are intended to reproduce the original model as faithfully as possible, adopting "default" hyperparameters where such defaults exist. To indicate how accurate these recreations are, the following table summarizes the level of confidence / validation that has been achieved.
| Validation Level | Description |
|---|---|
| Level 0 | The model was written based only on the original academic publication; no code reference was used (either none was available or none was found). |
| Level 1 | The model was written based on an official code implementation by the original authors. Additional code references may have been used as well. |
| Level 2 | In addition to being based on reference code (Level 1), the model has been checked to produce similar loss values (eyeball validation) when run on the provided reference training data. |
| Level 3 | The model has been verified to produce numerically identical output to the reference implementation given identical input and weight values. |
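For concreteness, a Level 3 check can be sketched roughly as below. The names `reference_model`, `repo_model`, and the tolerances are hypothetical, and the sketch assumes the two implementations share parameter names so the same `state_dict` loads into both.

```python
import torch

@torch.no_grad()
def outputs_match(reference_model, repo_model, example_input,
                  rtol=1e-5, atol=1e-6):
    # Copy the reference weights into the repo implementation (assumes the
    # two models use identical parameter names and shapes).
    repo_model.load_state_dict(reference_model.state_dict())
    reference_model.eval()
    repo_model.eval()
    # Identical inputs + identical weights should yield (near-)identical outputs.
    return torch.allclose(reference_model(example_input),
                          repo_model(example_input),
                          rtol=rtol, atol=atol)
```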
| Application | Year | Paper | Validation | Description |
|---|---|---|---|---|
| BERT | 2018 | arXiv | Level 3 | Natural language processing |
| DLRM | 2019 | arXiv | Level 1 | Recommendation model |
| MeshGraphNets | 2020 | arXiv | Level 3 | Mesh-based physics simulation with learned dynamics |
| NeRF | 2020 | arXiv | Level 1 | View synthesis with Neural Radiance Fields |
| PINNs | 2019 | ScienceDirect | Level 2 | Physics-Informed Neural Networks |
| ResNet50 | 2015 | arXiv | Level 1 | Image classification |
| RNN-T | 2012 | arXiv | Level 3 | Speech to text |
| SSD | 2016 | arXiv | Level 2 | Object detection |
| TabNet | 2019 | arXiv | Level 2 | Learning on tabular data |
| 3D-UNet | 2016 | arXiv | Level 1 | 3D image segmentation for cancer detection |