Toy project using Kedro and MLflow to demonstrate how to track data science experiments and make them easier to collaborate on.

bkolasa/kedro-workshop


Kedro project including experiment tracking

Prerequisites

The main branch of this project contains the content of a finished workshop. To start the workshop from scratch, first switch to the start_from_scratch git branch.

git switch start_from_scratch

We will be using a pipenv virtual environment. Let's install pipenv first.

pip install pipenv

Then we install all of the required dependencies.

pipenv install

This command will install kedro, kedro-viz, and other required packages. Finally, we can enter the virtual environment shell.

pipenv shell

Bootstrapping a new Kedro project

During the workshop we will work on an example project provided by Kedro. We can initialize it by executing the following command.

kedro new --starter=spaceflights

Working with Kedro

Let's start by visualizing our pipeline in order to investigate what it actually does. The following command starts a Kedro-Viz instance and opens it in the browser.

kedro viz

As a next step, let's run our full model training pipeline and create artifacts.

kedro run

The kedro run command is highly customizable and lets us choose what to run. For example, the --pipeline argument runs only the selected pipeline.

kedro run --pipeline data_processing

--to-outputs runs the whole chain of operations required to produce the given output.

kedro run --to-outputs="evaluation_plot"

--from-inputs runs the whole path starting from the given dataset.

kedro run --from-inputs=model_input_table

We can also override parameters configured in the parameters.yml file.

kedro run --params model_options.test_size=0.1
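The overridden model_options.test_size value lives in conf/base/parameters.yml. As a rough sketch (based on the spaceflights starter; your file may differ):

```yaml
model_options:
  test_size: 0.2      # fraction of the data held out for testing
  random_state: 3     # seed for the train/test split
```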

Tasks

  1. Add a new parameter to the linear regression model in the data science pipeline (e.g. the n_jobs parameter of LinearRegression). Check how the Kedro-Viz diagram has changed and that you can set the parameter via the command line. (10 min.)
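As a rough sketch of what the modified node could look like (the function signature and parameter names are assumptions based on the spaceflights starter, not the exact workshop code):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression


def train_model(X_train: pd.DataFrame, y_train: pd.Series, parameters: dict) -> LinearRegression:
    """Train a linear regression model, reading n_jobs from parameters.yml."""
    # "model_options.n_jobs" is the newly added parameter; default to 1 if unset.
    regressor = LinearRegression(n_jobs=parameters["model_options"].get("n_jobs", 1))
    regressor.fit(X_train, y_train)
    return regressor
```

With the parameter wired into the node like this, kedro run --params model_options.n_jobs=2 would override it at runtime.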

Tracking experiments

Starting experiment tracking in Kedro requires modifying the project. First, we need to set up the store for our experiments.

  1. Paste this snippet into settings.py:

from kedro_viz.integrations.kedro.sqlite_store import SQLiteStore
from pathlib import Path

SESSION_STORE_CLASS = SQLiteStore
SESSION_STORE_ARGS = {"path": str(Path(__file__).parents[2] / "data")}

  2. Create a directory for tracking artifacts:

mkdir -p data/09_tracking

  3. Add metric datasets to the catalog:

metrics:
  type: tracking.MetricsDataSet
  filepath: data/09_tracking/metrics.json

companies_columns:
  type: tracking.JSONDataSet
  filepath: data/09_tracking/companies_columns.json

  4. Modify the pipeline and nodes. See https://docs.kedro.org/en/stable/experiment_tracking/index.html#modify-your-nodes-and-pipelines-to-log-metrics for more details.
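For the last step, a node that logs metrics simply returns a dictionary that the pipeline maps to the tracked metrics dataset. A minimal sketch (assuming a scikit-learn regressor, in the spirit of the spaceflights evaluation node; names are assumptions):

```python
import pandas as pd
from sklearn.metrics import mean_absolute_error, r2_score


def evaluate_model(regressor, X_test: pd.DataFrame, y_test: pd.Series) -> dict:
    """Return metrics as a dict; Kedro saves it to the tracked metrics dataset."""
    y_pred = regressor.predict(X_test)
    return {
        "r2_score": r2_score(y_test, y_pred),
        "mae": mean_absolute_error(y_test, y_pred),
    }
```

In the pipeline definition, map this node's output to the "metrics" catalog entry so each run records the values.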

Tasks

  1. Register more parameters for tracking.
  2. Save all the parameters defined in the catalog into experiment tracking.
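For the second task, one possible approach (the entry name and path below are assumptions, not prescribed by the workshop) is to add another tracking dataset to the catalog and have a node return the parameters it received:

```yaml
model_options_tracking:
  type: tracking.JSONDataSet
  filepath: data/09_tracking/model_options.json
```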

Deploying your model

Out of the box, Kedro offers building Python .whl and .egg packages.

kedro package

However, we can leverage container support by using the kedro-docker plugin. The following command generates a Dockerfile for us.

pipenv install kedro-docker && kedro docker init

Then, we can build an image that we can distribute.

kedro docker build

It is highly likely that the build will fail due to missing OS dependencies. Adding the following command to the Dockerfile should help.

RUN apt-get update && apt-get -y install python3-dev \
                        gcc \
                        libc-dev
