Contributing

👍 🎉 First off, thanks for taking the time to read the guidelines and for considering contributing! 🎉 👍

The following is a set of guidelines for contributing to the functions repository and its packages. These are mostly guidelines, not rules. Use your best judgment, and feel free to propose changes to this document in a pull request.

Table Of Contents

  1. What Should I Know Before I Start?
    1. Concepts
      1. Function
      2. function.yaml
      3. Marketplace
      4. item.yaml
    2. Function directory structure
    3. item.yaml anatomy
  2. Installation Guide
  3. Creating A New Function
  4. Updating An Existing Function
  5. Testing Functions
    1. item.yaml validation
    2. function.yaml validation
    3. Python unittests
    4. Testing example notebooks
  6. Command Line Utility

What Should I Know Before I Start?

Concepts

1) Functions

All executions in MLRun are based on serverless functions. A function specifies the code and all of its operational aspects (image, required packages, CPU/memory/GPU resources, storage, environment, etc.). The different function runtimes automatically transform the code and spec into fully managed, elastic services over Kubernetes, which saves significant operational overhead, addresses scalability, and reduces infrastructure costs.

MLRun supports batch functions (based on Kubernetes jobs, Spark, Dask, Horovod, etc.) and real-time functions for serving, APIs, and stream processing (based on the high-performance Nuclio engine).

Further reading:
MLRun docs
Function runtimes
Nuclio docs

2) function.yaml

A configuration structure that resembles Kubernetes resource definitions and includes the apiVersion, kind, metadata, spec, and status sections. This file is the result of running:

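import mlrun

# Build a function object from your code (arguments elided here)
fn = mlrun.code_to_function(...)
# Write the function spec to a YAML file
fn.export()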

Further reading:
Code to function
Function configuration
Deploying functions

3) Marketplace

The Function Marketplace is a user-friendly representation of the mlrun/functions repository. Its main purpose is to provide a simple yet interactive and explorable interface for users to find, filter, and discover MLRun functions. It is partially inspired by Helm's Artifact Hub.

Further reading:
Visit Function Marketplace
Helm Chart Artifact Hub

4) item.yaml

A configuration structure that enables:

  1. Generating a function.yaml without introducing any configuration in the example notebook
  2. Listing the function on the marketplace

Function directory structure

This is the suggested function structure. Deviating from this template will cause issues with marketplace rendering.

<FUNCTION_NAME>
      |
      |__ <FUNCTION_NAME>.py (Containing the implementation code of FUNCTION_NAME)
      |
      |__ <FUNCTION_NAME>.ipynb (Containing an example for running and deploying the FUNCTION_NAME)
      |
      |__ item.yaml (Containing the spec for generating function.yaml and being listed on the marketplace)
      |
      |__ function.yaml (Containing the spec for deploying the function)
      |
      |__ test_<FUNCTION_NAME>.py (optional)
      |
      |__ test_<FUNCTION_NAME>.ipynb (optional)
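
The layout above can be checked mechanically before opening a pull request. The following is a minimal sketch, not part of the repository's tooling: the helper name `missing_files` and the `my_function` directory are hypothetical, and only the four non-optional files from the tree above are checked.

```python
import tempfile
from pathlib import Path


def missing_files(function_dir: Path) -> list:
    """Return the non-optional files that are absent from a function directory."""
    name = function_dir.name
    required = [f"{name}.py", f"{name}.ipynb", "item.yaml", "function.yaml"]
    return [filename for filename in required if not (function_dir / filename).is_file()]


# Demonstration on a throwaway directory that is deliberately incomplete.
with tempfile.TemporaryDirectory() as tmp:
    function_dir = Path(tmp) / "my_function"  # hypothetical function name
    function_dir.mkdir()
    (function_dir / "my_function.py").write_text("# implementation\n")
    (function_dir / "item.yaml").write_text("name: my_function\n")
    result = missing_files(function_dir)

print(result)
```

In this run the notebook and function.yaml are reported as missing, since only the implementation file and item.yaml were created.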

item.yaml anatomy

apiVersion: v1
categories: []         # List of category names
description: ''        # Short description
example: ''            # Path to example notebook
generationDate:        # Automatically created when creating a new item using the cli
hidden: false          # Hide function from the UI
icon: ''               # Path to icon file
labels: {}             # Key-value label pairs
maintainers: []        # List of maintainers
mlrunVersion: ''       # Function’s MLRun version requirement, should follow Python’s versioning schema
name: ''               # Function name
platformVersion: ''    # Function’s Iguazio version requirement, should follow Python’s versioning schema
spec:
  filename: ''         # Implementation file
  handler: ''          # Handler function name
  image: ''            # Base image name
  kind: ''             # Function kind
  requirements: []     # List of Python library requirements
  customFields: {}     # Key-value pairs of custom spec fields
  env: []              # Spec environment params
version: 0.0.1         # Function version, should follow the standard semantic versioning schema
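
To make the template above more concrete, here is a minimal sketch of a sanity check on a parsed item.yaml, represented as a plain Python dict. It is not the repository's validation logic (the testing section is still a work in progress). The helper name `validate_item`, the choice of which keys count as required, and the sample values are all assumptions made for illustration.

```python
import re

# Top-level keys taken from the template above; treating them as required is an assumption.
REQUIRED_KEYS = ("apiVersion", "categories", "description", "example", "name", "spec", "version")
# Spec keys assumed to be needed for generating a function.yaml.
REQUIRED_SPEC_KEYS = ("filename", "handler", "image", "kind")
# Plain MAJOR.MINOR.PATCH, a simplified reading of semantic versioning.
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")


def validate_item(item: dict) -> list:
    """Return a list of human-readable problems found in a parsed item.yaml."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in item]
    spec = item.get("spec") or {}
    problems += [f"missing spec key: {key}" for key in REQUIRED_SPEC_KEYS if key not in spec]
    version = str(item.get("version", ""))
    if not SEMVER.match(version):
        problems.append(f"version is not MAJOR.MINOR.PATCH: {version!r}")
    return problems


# Hypothetical item, as it might look after loading item.yaml into a dict.
sample_item = {
    "apiVersion": "v1",
    "categories": ["data-preparation"],
    "description": "Hypothetical example item",
    "example": "my_function.ipynb",
    "name": "my_function",
    "spec": {"filename": "my_function.py", "handler": "handler", "image": "mlrun/mlrun", "kind": "job"},
    "version": "0.0.1",
}

print(validate_item(sample_item))
```

An empty list means no problems were found under these assumed rules.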

Suggested function categories

The table below represents the available options to choose from when filling item.yaml's categories property. Please select one or more categories and feel free to suggest additional options.

| Name | How should it be written in item.yaml | When should I choose this category? |
| --- | --- | --- |
| Data Preparation | data-preparation | A function that prepares, cleans, or otherwise processes data in any number of other ways |
| Extract, Transform, Load | etl | A function that can be used to take data from some source, transform it, and finally load it to some destination |
| Machine Learning | machine-learning | A function that is related to machine learning in any way; can be used alongside categories like Data Preparation, Model Training, and Model Serving |
| Model Serving | model-serving | A function that can be used to serve models via V2ModelServer |
| Model Training | model-training | A function that can be used for training machine learning models |
| Model Testing | model-testing | A function that can be used to test trained models |
| Data Analysis | data-analysis | A function that can be used to perform analysis and exploration on data |
| Monitoring | monitoring | A function that can be used to monitor and/or measure metrics regarding data, models, and model endpoints |
| Utilities | Utilities | A function that is used to alert, notify, or provide services in any way |
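
As a small illustration of how the category names in the table might be checked, the sketch below flags entries that are not in the suggested list. The helper name `unknown_categories` is hypothetical, and the set is copied from the table as written. Since additional categories may be suggested, an unknown entry is a prompt for review rather than an error.

```python
# Category names copied from the table above, exactly as written there.
SUGGESTED_CATEGORIES = {
    "data-preparation",
    "etl",
    "machine-learning",
    "model-serving",
    "model-training",
    "model-testing",
    "data-analysis",
    "monitoring",
    "Utilities",
}


def unknown_categories(categories: list) -> list:
    """Return the categories that do not appear in the suggested list, sorted."""
    return sorted(set(categories) - SUGGESTED_CATEGORIES)


print(unknown_categories(["etl", "machine-learning", "deep-learning"]))
```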

Installation Guide

Using the functions package in a dedicated environment is highly advised, since Pipenv is used as part of the testing routine. Conda can be used instead.

  1. Install miniconda (or any other environment manager)
  2. Clone the repository: git clone https://github.com/mlrun/functions.git
  3. Change to the functions directory: cd functions
  4. Create a new environment: conda create -n functions python=3.8
  5. Activate the environment: conda activate functions
  6. Install the requirements: pip install -r requirements.txt

Creating A New Function

See Command Line Utility (section 6)
See Function directory structure
See Testing Functions

Updating An Existing Function

  1. Fork the mlrun/functions repository

  2. Open a branch with a name describing the function that is being changed, and what was changed

    * Make sure to update the version of the function in the item.yaml
    * If any business logic changed, make sure to update the function.yaml by running the python functions.py item-to-function [OPTIONS] command

  3. Submit a PR
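
Since every update requires a new version in item.yaml, and that version is expected to follow semantic versioning, the bump itself can be sketched as below. This is only an illustration: `bump_version` is a hypothetical helper, not part of the functions CLI, and it handles plain MAJOR.MINOR.PATCH strings only.

```python
def bump_version(version: str, part: str = "patch") -> str:
    """Return `version` with the given part incremented and lower parts reset."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


print(bump_version("0.0.1"))           # small fix
print(bump_version("0.0.1", "minor"))  # new backwards-compatible behavior
```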

Testing Functions

(WORK IN PROGRESS)

1) item.yaml validation

2) function.yaml validation

3) Python unittests

4) Testing example notebooks

Command Line Utility

The command line utility supports multiple sub-commands:

  1. Help
python functions.py --help
Usage: functions.py [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  build-marketplace
  create-legacy-catalog
  function-to-item
  item-to-function
  new-item
  run-tests

  2. Build Marketplace
python functions.py build-marketplace
Usage: functions.py build-marketplace [OPTIONS]

Options:
  -s, --source-dir TEXT       Path to the source directory
  -sn, --source-name TEXT     Name of source, if not provided, name of source directory will be used instead (optional)
  -m, --marketplace-dir TEXT  Path to marketplace directory
  -T, --temp-dir TEXT         Path to intermediate build directory (optional)
  -c, --channel TEXT          Name of build channel
  -v, --verbose               When this flag is set, the process will output extra information (optional)
  --help                      Show this message and exit.
  3. Create Legacy Catalog
python functions.py create-legacy-catalog
Usage: functions.py create-legacy-catalog [OPTIONS]

Options:
  -r, --root-dir TEXT  Path to root project directory
  --help               Show this message and exit.
  4. Item To Function
python functions.py item-to-function
Usage: functions.py item-to-function [OPTIONS]

Options:
  -i, --item-path TEXT    Path to item.yaml file or a directory containing one
  -o, --output-path TEXT  Path to code_to_function output, will use item-path directory if not provided (optional)
  -c, --code_output       If spec.filename or spec.example is a notebook, should a python file be created (optional)
  -fmt, --format_code     If both -c/--code_output and -fmt/--format_code are enabled, the code output will be
                          formatted by the black formatter (optional)
  --help                  Show this message and exit.

  5. Function To Item
python functions.py function-to-item
Usage: functions.py function-to-item [OPTIONS]

Options:
  -p, --path TEXT  Path to one of: a specific function.yaml, a directory containing a function.yaml, or a root
                   directory to search for function.yaml files in
  --help          Show this message and exit.
  6. New Item
python functions.py new-item
Usage: functions.py new-item [OPTIONS]

Options:
  -p, --path TEXT  Path to directory in which a new item.yaml will be created
  -o, --override   Override if already exists
  --help           Show this message and exit.

This sub-command creates a directory (if it doesn't already exist) containing a copy of the item.yaml template. Use -o/--override to overwrite an existing item.yaml.
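
The behavior described above can be sketched roughly as follows. This is not the CLI's actual implementation: the function `new_item`, the stand-in `ITEM_TEMPLATE` string, and the choice to silently skip an existing file when override is off are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for the real item.yaml template shipped with the repository.
ITEM_TEMPLATE = "apiVersion: v1\nname: ''\nversion: 0.0.1\n"


def new_item(path: Path, override: bool = False) -> bool:
    """Write an item.yaml template into `path`; return True if a file was written."""
    path.mkdir(parents=True, exist_ok=True)  # create the directory if it is missing
    item_path = path / "item.yaml"
    if item_path.exists() and not override:
        return False  # assumed behavior: leave the existing file untouched
    item_path.write_text(ITEM_TEMPLATE)
    return True


# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "my_function"  # hypothetical function directory
    first = new_item(target)                  # directory and file are created
    second = new_item(target)                 # file exists, override is off
    third = new_item(target, override=True)   # file is rewritten

print(first, second, third)
```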

  7. Test Suite
python functions.py run-tests
Usage: functions.py run-tests [OPTIONS]

Options:
  -r, --root-directory TEXT     Path to root directory
  -s, --suite TEXT              Type of suite to run [py/ipynb/examples/items]
  -mp, --multi-processing TEXT  Run multiple tests
  -fn, --function-name TEXT     Run a specific function by name
  -f, --stop-on-failure         When set, the entire test run will fail once a single test fails
  --help                        Show this message and exit.
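
For scripted use, the run-tests invocation can be assembled programmatically. The sketch below only builds the argument list from the options documented above and does not execute anything. The helper name `run_tests_command` and the `my_function` value are hypothetical.

```python
import shlex

# Suite types listed in the run-tests help text above.
VALID_SUITES = ("py", "ipynb", "examples", "items")


def run_tests_command(root_directory, suite, function_name=None, stop_on_failure=False):
    """Build the argument list for `python functions.py run-tests`."""
    if suite not in VALID_SUITES:
        raise ValueError(f"suite must be one of {VALID_SUITES}, got {suite!r}")
    command = ["python", "functions.py", "run-tests", "-r", root_directory, "-s", suite]
    if function_name:
        command += ["-fn", function_name]  # limit the run to one function
    if stop_on_failure:
        command.append("-f")  # fail the whole run on the first failing test
    return command


command = run_tests_command(".", "py", function_name="my_function", stop_on_failure=True)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run` from the repository root if the tests should actually be executed.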