Domino Experiment Manager Examples

This repo compares Weights & Biases (W&B) with Domino's Experiment Management capability.


Create a user environment variable OPENAI_API_KEY. Do not echo this anywhere in your notebooks. For WANDB we will be using anonymous access.
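As a minimal sketch of how a notebook might pick up the key without ever printing it, the snippet below reads OPENAI_API_KEY from the environment and exposes only a status string. The helper names `require_api_key` and `key_status` are hypothetical and are not part of this repo.

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the key from the environment, failing loudly if it is missing."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; define it as a user environment variable")
    return key


def key_status(key: str) -> str:
    """Status string that confirms a key was loaded without revealing its value."""
    return "API key loaded" if key else "API key missing"
```

In a notebook cell, display `key_status(require_api_key())` rather than the key itself, so the value never ends up in saved cell output.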

Create two conda environments and the corresponding Jupyter kernels:

./  #Creates the "ray" kernel
./ #Creates the "tensorboard" kernel

This repo contains several notebooks that compare the features of these experiment tracking products.

First the basic Experiment Management Flow

There are three basic notebooks included in this repo to demonstrate the use of Experiment Manager for:

  1. Running basic experiments - This notebook demonstrates how to create basic experiments, runs and model versions. It also demonstrates how to download artifacts for a specific Model Version via the experiment run id attached to the model version. This is useful for deploying models based on the artifacts contained in the model registry.
  2. Running basic hyperparameter search using Spark
  3. Running Ray Tune based hyperparameter search using . This demonstrates how you can create nested runs to better organize your experiment runs. Select the "ray" kernel to run this notebook.

Execute these three notebooks to get a feel for the basics of the Domino Experiment Manager.
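The nested-run idea from the third notebook can be sketched without Ray Tune. The example below is an assumption-laden illustration, not the notebook's code: a plain grid loop stands in for Ray Tune, and `run_grid_search` is a hypothetical helper. Its `tracker` argument is expected to expose `start_run`, `log_params` and `log_metric` the way the `mlflow` module does, so passing the imported `mlflow` module is one option.

```python
from itertools import product


def run_grid_search(tracker, objective, grid, parent_name="hyperparameter-search"):
    """Log one parent run with a nested child run per parameter combination.

    `objective` maps a params dict to a loss; `grid` maps each parameter
    name to the list of values to try.
    """
    names = sorted(grid)
    best = None
    with tracker.start_run(run_name=parent_name):
        for values in product(*(grid[name] for name in names)):
            params = dict(zip(names, values))
            # nested=True attaches the child run to the active parent run
            with tracker.start_run(run_name=str(params), nested=True):
                loss = objective(params)
                tracker.log_params(params)
                tracker.log_metric("loss", loss)
            if best is None or loss < best[1]:
                best = (params, loss)
        # Summarize the whole search on the parent run
        if best is not None:
            tracker.log_metric("best_loss", best[1])
    return best
```

Grouping the trials under one parent keeps a search collapsed into a single row in the experiment view, with the individual trials one level down.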

Domino Experiment Manager alongside Weights & Biases


  1. Basic Notebook demonstrating the use of WANDB and Experiment Manager to track the same experiment

This notebook demonstrates tracking DL model training using MLFLOW and WANDB. The example function that runs a simultaneous WANDB and MLFLOW experiment is train_model_mlflow_wandb.

W&B vs. MLflow comparisons based on this notebook:

| Criteria | W&B | Domino Experiment Manager |
| --- | --- | --- |
| Track Parameters | The wandb.init function receives the entire config of type SimpleNamespace | mlflow.log_params(vars(config)) is needed in Domino. A helper function mlflow_utils.init(project_name, config) performs the same function as wandb.init |
| Track CPU/GPU and other system metrics | wandb provides this out of the box | mlflow_utils.init uses the nvidia_smi and psutil packages to track these metrics via a running thread which has the same lifecycle as the experiment run. Only a few CPU/GPU metrics are tracked currently, but this function can be expanded to track any system metrics of your choice. |

Now invoke it for any metric you wish to graph

import pandas as pd

metrics = train_model_mlflow(config)
df_train_loss = pd.DataFrame(data=metrics)  # tabular form, ready to plot

And this generates the graph. Track Metrics using DCA: Mlflow allows you to track metrics interactively, or you could simply download this image and save it as an artifact.
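The comparison above describes two mechanics: turning the SimpleNamespace config into parameters, and sampling system metrics on a thread that lives as long as the run. The following is a minimal sketch of both, under stated assumptions: `config_to_params` and `MetricsSampler` are hypothetical names and do not reproduce the repo's mlflow_utils helper. The metric reader and the logger are injected as callables, so nothing here depends on a particular tracking library.

```python
import threading
from types import SimpleNamespace


def config_to_params(config: SimpleNamespace) -> dict:
    """Flatten a SimpleNamespace config into the dict that mlflow.log_params accepts."""
    return dict(vars(config))


class MetricsSampler:
    """Sample metrics on a background thread for the duration of a `with` block."""

    def __init__(self, read_metrics, log_metric, interval_s=1.0):
        self._read_metrics = read_metrics  # callable returning {name: value}
        self._log_metric = log_metric      # callable(name, value, step=...)
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def _loop(self):
        step = 0
        while not self._stop.is_set():
            for name, value in self._read_metrics().items():
                self._log_metric(name, value, step=step)
            step += 1
            # Sleep between samples, but wake immediately when stop is requested
            self._stop.wait(self._interval_s)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()
        return False
```

Wrapping the training call in `with MetricsSampler(...)` ties the thread's lifetime to the run. For `read_metrics`, a function built on `psutil.cpu_percent()` is one option. For `log_metric`, a small wrapper that binds the run id and calls `MlflowClient().log_metric(run_id, name, value, step=step)` is one way to make sure samples logged from the background thread land on the intended run.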

  1. Fine tune LLM Model - Track using WANDB
  2. Fine tune LLM Model - Track using MLFLOW
  3. Tensorflow Example - This notebook demonstrates the autologging feature of MLFLOW for tensorboard. The tensorboard logs are automatically stored in the MLFLOW artifacts. The notebook demonstrates how these logs can be downloaded, for any experiment run for which they have been logged, and rendered in a local tensorboard instance. Select the "tensorboard" kernel to run this notebook.


To clear the outputs of all notebooks before committing, run the following command:

jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace *.ipynb
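For readers curious what clearing amounts to, here is a rough Python sketch of the core effect on a version 4 notebook file: code-cell outputs are emptied and execution counts reset. It is an approximation written for illustration, with hypothetical function names; nbconvert itself may also strip some cell metadata, so the command above remains the recommended route.

```python
import json
from pathlib import Path


def clear_outputs(notebook: dict) -> dict:
    """Return a copy of a v4 notebook dict with code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # deep copy of plain JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def clear_file(path: Path) -> None:
    """Rewrite one .ipynb file in place without its outputs."""
    notebook = json.loads(path.read_text(encoding="utf-8"))
    cleaned = clear_outputs(notebook)
    path.write_text(json.dumps(cleaned, indent=1) + "\n", encoding="utf-8")
```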


Use the following two environments:

  1. Ray Compute
  2. Ray Cluster