app-mlperf-inference

Click here to see the table of contents.

Note that this README is automatically generated - don't edit! See more info.

Description

This CM script provides a unified interface to prepare and run a modular version of the MLPerf inference benchmark across diverse ML models, datasets, frameworks, libraries, runtime systems and platforms using the cross-platform automation meta-framework (MLCommons CM).

It is assembled from reusable and interoperable CM scripts for DevOps and MLOps being developed by the open MLCommons taskforce on automation and reproducibility.

It is a higher-level wrapper to several other CM scripts modularizing the MLPerf inference benchmark:

See this SCC'23 tutorial to use this script to run a reference (unoptimized) Python implementation of the MLPerf object detection benchmark with RetinaNet model, Open Images dataset, ONNX runtime and CPU target.

See this CM script to automate and validate your MLPerf inference submission.

Get in touch with the open taskforce on automation and reproducibility at MLCommons if you need help with your submission, or if you would like to participate in further modularization of MLPerf and in collaborative design-space exploration and optimization of ML systems.

See more info.

Information

  • Category: Modular MLPerf benchmarks.
  • CM GitHub repository: mlcommons@ck
  • GitHub directory for this script: GitHub
  • CM meta description for this script: _cm.yaml
  • CM "database" tags to find this script: app,vision,language,mlcommons,mlperf,inference,generic
  • Output cached?: False

Usage

CM installation

Guide

CM pull repository

cm pull repo mlcommons@ck

CM script automation help

cm run script --help

CM CLI

  1. cm run script --tags=app,vision,language,mlcommons,mlperf,inference,generic[,variations] [--input_flags]

  2. cm run script "app vision language mlcommons mlperf inference generic[,variations]" [--input_flags]

  3. cm run script d775cac873ee4231 [--input_flags]

  • variations can be seen here

  • input_flags can be seen here
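The three CLI forms above resolve to the same script. As a hedged illustration only (this is not the real CM tag resolver), the space-separated tag form can be normalized to the comma-separated form, with variations appended using the `_` prefix:

```python
# Illustrative sketch (not the actual CM resolver): normalize the
# space-separated tag form to the comma-separated form and append
# optional variations, each prefixed with "_".
def normalize_tags(tag_string, variations=None):
    tags = tag_string.replace(" ", ",")
    if variations:
        tags += "," + ",".join("_" + v.lstrip("_") for v in variations)
    return tags

print(normalize_tags("app vision language mlcommons mlperf inference generic",
                     ["resnet50", "onnxruntime"]))
```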

CM Python API

Click here to expand this section.
import cmind

r = cmind.access({'action':'run',
                  'automation':'script',
                  'tags':'app,vision,language,mlcommons,mlperf,inference,generic',
                  'out':'con',
                  ...
                  (other input keys for this script)
                  ...
                 })

if r['return']>0:
    print (r['error'])

CM GUI

cm run script --tags=gui --script="app,vision,language,mlcommons,mlperf,inference,generic"

Use this online GUI to generate CM CMD.

CM modular Docker container

TBD


Customization

Variations

  • Group "implementation"

    Click here to expand this section.
    • _cpp
      • Environment variables:
        • CM_MLPERF_CPP: yes
        • CM_MLPERF_IMPLEMENTATION: cpp
        • CM_IMAGENET_ACCURACY_DTYPE: float32
        • CM_OPENIMAGES_ACCURACY_DTYPE: float32
      • Workflow:
        1. Read "posthook_deps" on other CM scripts
          • app,mlperf,cpp,inference
            • if (CM_SKIP_RUN != True)
            • CM names: --adr.['cpp-mlperf-inference', 'mlperf-inference-implementation']...
    • _nvidia
    • _nvidia-original
      • Environment variables:
        • CM_MLPERF_IMPLEMENTATION: nvidia-original
        • CM_SQUAD_ACCURACY_DTYPE: float16
        • CM_IMAGENET_ACCURACY_DTYPE: int32
        • CM_LIBRISPEECH_ACCURACY_DTYPE: int8
      • Workflow:
        1. Read "posthook_deps" on other CM scripts
          • reproduce,mlperf,nvidia,inference
            • if (CM_SKIP_RUN != True)
            • CM names: --adr.['nvidia-original-mlperf-inference', 'nvidia-harness', 'mlperf-inference-implementation']...
    • _reference (default)
      • Aliases: _python
      • Environment variables:
        • CM_MLPERF_PYTHON: yes
        • CM_MLPERF_IMPLEMENTATION: reference
        • CM_SQUAD_ACCURACY_DTYPE: float32
        • CM_IMAGENET_ACCURACY_DTYPE: float32
        • CM_OPENIMAGES_ACCURACY_DTYPE: float32
        • CM_LIBRISPEECH_ACCURACY_DTYPE: float32
      • Workflow:
        1. Read "posthook_deps" on other CM scripts
          • app,mlperf,reference,inference
            • if (CM_SKIP_RUN != True)
            • CM names: --adr.['python-reference-mlperf-inference', 'mlperf-inference-implementation']...
    • _tflite-cpp
      • Environment variables:
        • CM_MLPERF_TFLITE_CPP: yes
        • CM_MLPERF_CPP: yes
        • CM_MLPERF_IMPLEMENTATION: tflite-cpp
        • CM_IMAGENET_ACCURACY_DTYPE: float32
      • Workflow:
        1. Read "posthook_deps" on other CM scripts
          • app,mlperf,tflite-cpp,inference
            • if (CM_SKIP_RUN != True)
            • CM names: --adr.['tflite-cpp-mlperf-inference', 'mlperf-inference-implementation']...
  • Group "backend"

    Click here to expand this section.
    • _deepsparse
      • Environment variables:
        • CM_MLPERF_BACKEND: deepsparse
      • Workflow:
    • _onnxruntime (default)
      • Environment variables:
        • CM_MLPERF_BACKEND: onnxruntime
      • Workflow:
    • _pytorch
      • Environment variables:
        • CM_MLPERF_BACKEND: pytorch
      • Workflow:
    • _tensorrt
      • Environment variables:
        • CM_MLPERF_BACKEND: tensorrt
      • Workflow:
    • _tf
      • Environment variables:
        • CM_MLPERF_BACKEND: tf
      • Workflow:
    • _tflite
      • Environment variables:
        • CM_MLPERF_BACKEND: tflite
      • Workflow:
    • _tvm-onnx
      • Environment variables:
        • CM_MLPERF_BACKEND: tvm-onnx
      • Workflow:
    • _tvm-pytorch
      • Environment variables:
        • CM_MLPERF_BACKEND: tvm-pytorch
      • Workflow:
    • _tvm-tflite
      • Environment variables:
        • CM_MLPERF_BACKEND: tvm-tflite
      • Workflow:
  • Group "device"

    Click here to expand this section.
    • _cpu (default)
      • Environment variables:
        • CM_MLPERF_DEVICE: cpu
      • Workflow:
    • _cuda
      • Environment variables:
        • CM_MLPERF_DEVICE: gpu
      • Workflow:
  • Group "model"

    Click here to expand this section.
    • _3d-unet-99
      • Environment variables:
        • CM_MODEL: 3d-unet-99
      • Workflow:
    • _3d-unet-99.9
      • Environment variables:
        • CM_MODEL: 3d-unet-99.9
      • Workflow:
    • _bert-99
      • Environment variables:
        • CM_MODEL: bert-99
      • Workflow:
    • _bert-99.9
      • Environment variables:
        • CM_MODEL: bert-99.9
      • Workflow:
    • _dlrm-99
      • Environment variables:
        • CM_MODEL: dlrm-99
      • Workflow:
    • _dlrm-99.9
      • Environment variables:
        • CM_MODEL: dlrm-99.9
      • Workflow:
    • _efficientnet
      • Environment variables:
        • CM_MODEL: efficientnet
      • Workflow:
        1. Read "deps" on other CM scripts
        2. Read "post_deps" on other CM scripts
          • run,accuracy,mlperf,_imagenet
            • if (CM_MLPERF_LOADGEN_MODE in ['accuracy', 'all'] AND CM_MLPERF_ACCURACY_RESULTS_DIR == on)
            • CM names: --adr.['mlperf-accuracy-script', 'imagenet-accuracy-script']...
    • _mobilenet
      • Environment variables:
        • CM_MODEL: mobilenet
      • Workflow:
        1. Read "deps" on other CM scripts
        2. Read "post_deps" on other CM scripts
          • run,accuracy,mlperf,_imagenet
            • if (CM_MLPERF_LOADGEN_MODE in ['accuracy', 'all'] AND CM_MLPERF_ACCURACY_RESULTS_DIR == on)
            • CM names: --adr.['mlperf-accuracy-script', 'imagenet-accuracy-script']...
    • _resnet50 (default)
      • Environment variables:
        • CM_MODEL: resnet50
      • Workflow:
        1. Read "deps" on other CM scripts
        2. Read "post_deps" on other CM scripts
          • run,accuracy,mlperf,_imagenet
            • if (CM_MLPERF_LOADGEN_MODE in ['accuracy', 'all'] AND CM_MLPERF_ACCURACY_RESULTS_DIR == on)
            • CM names: --adr.['mlperf-accuracy-script', 'imagenet-accuracy-script']...
    • _retinanet
      • Environment variables:
        • CM_MODEL: retinanet
      • Workflow:
        1. Read "post_deps" on other CM scripts
          • run,accuracy,mlperf,_openimages
            • if (CM_MLPERF_LOADGEN_MODE in ['accuracy', 'all'] AND CM_MLPERF_ACCURACY_RESULTS_DIR == on)
            • CM names: --adr.['mlperf-accuracy-script', 'openimages-accuracy-script']...
    • _rnnt
      • Environment variables:
        • CM_MODEL: rnnt
      • Workflow:
        1. Read "post_deps" on other CM scripts
          • run,accuracy,mlperf,_librispeech
            • if (CM_MLPERF_LOADGEN_MODE in ['accuracy', 'all'] AND CM_MLPERF_ACCURACY_RESULTS_DIR == on)
            • CM names: --adr.['mlperf-accuracy-script', 'librispeech-accuracy-script']...
  • Group "precision"

    Click here to expand this section.
    • _fp32 (default)
      • Environment variables:
        • CM_MLPERF_QUANTIZATION: False
        • CM_MLPERF_MODEL_PRECISION: float32
      • Workflow:
    • _int8
      • Aliases: _quantized
      • Environment variables:
        • CM_MLPERF_QUANTIZATION: True
        • CM_MLPERF_MODEL_PRECISION: int8
      • Workflow:
    • _uint8
      • Environment variables:
        • CM_MLPERF_QUANTIZATION: True
        • CM_MLPERF_MODEL_PRECISION: uint8
      • Workflow:
  • Group "execution-mode"

    Click here to expand this section.
    • _fast
      • Environment variables:
        • CM_FAST_FACTOR: 5
        • CM_OUTPUT_FOLDER_NAME: fast_results
        • CM_MLPERF_RUN_STYLE: fast
      • Workflow:
    • _test (default)
      • Environment variables:
        • CM_OUTPUT_FOLDER_NAME: test_results
        • CM_MLPERF_RUN_STYLE: test
      • Workflow:
    • _valid
      • Environment variables:
        • CM_OUTPUT_FOLDER_NAME: valid_results
        • CM_MLPERF_RUN_STYLE: valid
      • Workflow:
  • Group "reproducibility"

    Click here to expand this section.
    • _r2.1_default
      • Environment variables:
        • CM_RERUN: yes
        • CM_SKIP_SYS_UTILS: yes
        • CM_TEST_QUERY_COUNT: 100
      • Workflow:
  • Internal group (variations should not be selected manually)

    Click here to expand this section.
    • _3d-unet_
      • Workflow:
        1. Read "post_deps" on other CM scripts
          • run,accuracy,mlperf,_kits19,_int8
            • if (CM_MLPERF_LOADGEN_MODE in ['accuracy', 'all'] AND CM_MLPERF_ACCURACY_RESULTS_DIR == on)
            • CM names: --adr.['mlperf-accuracy-script', '3d-unet-accuracy-script']...
    • _bert_
      • Workflow:
        1. Read "deps" on other CM scripts
          • get,dataset,squad,language-processing
            • if (CM_DATASET_SQUAD_VAL_PATH not in on)
          • get,dataset-aux,squad-vocab
            • if (CM_ML_MODEL_BERT_VOCAB_FILE_WITH_PATH not in on)
        2. Read "post_deps" on other CM scripts
          • run,accuracy,mlperf,_squad
            • if (CM_MLPERF_LOADGEN_MODE in ['accuracy', 'all'] AND CM_MLPERF_ACCURACY_RESULTS_DIR == on)
            • CM names: --adr.['squad-accuracy-script', 'mlperf-accuracy-script']...
    • _dlrm_
      • Workflow:
        1. Read "post_deps" on other CM scripts
          • run,accuracy,mlperf,_terabyte,_float32
            • if (CM_MLPERF_LOADGEN_MODE in ['accuracy', 'all'] AND CM_MLPERF_ACCURACY_RESULTS_DIR == on)
            • CM names: --adr.['terabyte-accuracy-script', 'mlperf-accuracy-script']...
  • No group (any variation can be selected)

    Click here to expand this section.
    • _batch_size.#
      • Environment variables:
        • CM_MLPERF_LOADGEN_MAX_BATCHSIZE: #
      • Workflow:
    • _power
      • Environment variables:
        • CM_MLPERF_POWER: yes
        • CM_SYSTEM_POWER: yes
      • Workflow:
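Wildcard variations such as `_batch_size.#` substitute the value after the dot into the corresponding environment variable. A hypothetical sketch of that expansion (the real logic lives in the script's _cm.yaml meta and the CM engine):

```python
# Hypothetical sketch: expand a wildcard variation such as
# "_batch_size.64" into its documented environment variable.
def expand_variation(variation):
    if variation.startswith("_batch_size."):
        # The part after "_batch_size." becomes the max batch size.
        return {"CM_MLPERF_LOADGEN_MAX_BATCHSIZE": variation.split(".", 1)[1]}
    if variation == "_power":
        return {"CM_MLPERF_POWER": "yes", "CM_SYSTEM_POWER": "yes"}
    return {}

print(expand_variation("_batch_size.64"))
```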

Unsupported or invalid variation combinations

  • _resnet50,_pytorch
  • _retinanet,_tf
  • _nvidia-original,_tf
  • _nvidia-original,_onnxruntime
  • _nvidia-original,_pytorch
  • _nvidia,_tf
  • _nvidia,_onnxruntime
  • _nvidia,_pytorch

Default variations

_cpu,_fp32,_onnxruntime,_reference,_resnet50,_test
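The unsupported combinations and defaults above can be checked programmatically before composing a tag string. This is an illustrative sketch, not part of CM itself; CM performs its own validation from the script meta:

```python
# Illustrative check (not CM's own validator): reject the documented
# unsupported variation combinations listed above.
UNSUPPORTED = {
    frozenset({"_resnet50", "_pytorch"}),
    frozenset({"_retinanet", "_tf"}),
    frozenset({"_nvidia-original", "_tf"}),
    frozenset({"_nvidia-original", "_onnxruntime"}),
    frozenset({"_nvidia-original", "_pytorch"}),
    frozenset({"_nvidia", "_tf"}),
    frozenset({"_nvidia", "_onnxruntime"}),
    frozenset({"_nvidia", "_pytorch"}),
}

# The documented default variations for this script.
DEFAULTS = ["_cpu", "_fp32", "_onnxruntime", "_reference", "_resnet50", "_test"]

def is_supported(variations):
    chosen = set(variations)
    return not any(combo <= chosen for combo in UNSUPPORTED)

print(is_supported(DEFAULTS))
```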

Input description

  • --scenario MLPerf inference scenario {Offline,Server,SingleStream,MultiStream} (Offline)
  • --mode MLPerf inference mode {performance,accuracy} (accuracy)
  • --test_query_count Specifies the number of samples to be processed during a test run
  • --target_qps Target QPS
  • --target_latency Target Latency
  • --max_batchsize Maximum batchsize to be used
  • --num_threads Number of CPU threads to launch the application with
  • --hw_name Valid value: any system description that has a config file (with the same name) defined here
  • --output_dir Location where the outputs are produced
  • --rerun Redo the run even if previous run files exist (True)
  • --regenerate_files Regenerates measurement files including accuracy.txt files even if a previous run exists. This option is redundant if --rerun is used
  • --adr.python.name Python virtual environment name (optional) (mlperf)
  • --adr.python.version_min Minimal Python version (3.8)
  • --adr.python.version Force Python version (must have all system deps)
  • --adr.compiler.tags Compiler for loadgen (gcc)
  • --adr.inference-src-loadgen.env.CM_GIT_URL Git URL for MLPerf inference sources to build LoadGen (to enable non-reference implementations)
  • --adr.inference-src.env.CM_GIT_URL Git URL for MLPerf inference sources to run benchmarks (to enable non-reference implementations)
  • --quiet Quiet run (select default values for all questions) (False)
  • --readme Generate README with the reproducibility report
  • --debug Debug MLPerf script

Above CLI flags can be used in the Python CM API as follows:

r=cm.access({... , "scenario":...})

Script flags mapped to environment

Click here to expand this section.
  • --clean=value → CM_MLPERF_CLEAN_SUBMISSION_DIR=value
  • --count=value → CM_MLPERF_LOADGEN_QUERY_COUNT=value
  • --debug=value → CM_DEBUG_SCRIPT_BENCHMARK_PROGRAM=value
  • --docker=value → CM_RUN_DOCKER_CONTAINER=value
  • --hw_name=value → CM_HW_NAME=value
  • --imagenet_path=value → IMAGENET_PATH=value
  • --max_amps=value → CM_MLPERF_POWER_MAX_AMPS=value
  • --max_batchsize=value → CM_MLPERF_LOADGEN_MAX_BATCHSIZE=value
  • --max_volts=value → CM_MLPERF_POWER_MAX_VOLTS=value
  • --mode=value → CM_MLPERF_LOADGEN_MODE=value
  • --multistream_target_latency=value → CM_MLPERF_LOADGEN_MULTISTREAM_TARGET_LATENCY=value
  • --ntp_server=value → CM_MLPERF_POWER_NTP_SERVER=value
  • --num_threads=value → CM_NUM_THREADS=value
  • --offline_target_qps=value → CM_MLPERF_LOADGEN_OFFLINE_TARGET_QPS=value
  • --output_dir=value → OUTPUT_BASE_DIR=value
  • --power=value → CM_MLPERF_POWER=value
  • --power_server=value → CM_MLPERF_POWER_SERVER_ADDRESS=value
  • --readme=value → CM_MLPERF_README=value
  • --regenerate_files=value → CM_REGENERATE_MEASURE_FILES=value
  • --rerun=value → CM_RERUN=value
  • --scenario=value → CM_MLPERF_LOADGEN_SCENARIO=value
  • --server_target_qps=value → CM_MLPERF_LOADGEN_SERVER_TARGET_QPS=value
  • --singlestream_target_latency=value → CM_MLPERF_LOADGEN_SINGLESTREAM_TARGET_LATENCY=value
  • --target_latency=value → CM_MLPERF_LOADGEN_TARGET_LATENCY=value
  • --target_qps=value → CM_MLPERF_LOADGEN_TARGET_QPS=value
  • --test_query_count=value → CM_TEST_QUERY_COUNT=value

Above CLI flags can be used in the Python CM API as follows:

r=cm.access({... , "clean":...})
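As a hedged illustration of the mapping table above (a subset only; the authoritative table is in the script's _cm.yaml meta), the flag-to-environment translation can be sketched as:

```python
# Illustrative subset of the documented flag-to-environment mapping.
FLAG_TO_ENV = {
    "scenario": "CM_MLPERF_LOADGEN_SCENARIO",
    "mode": "CM_MLPERF_LOADGEN_MODE",
    "test_query_count": "CM_TEST_QUERY_COUNT",
    "max_batchsize": "CM_MLPERF_LOADGEN_MAX_BATCHSIZE",
    "output_dir": "OUTPUT_BASE_DIR",
}

def flags_to_env(flags):
    """Translate known script flags into their environment variables."""
    return {FLAG_TO_ENV[k]: str(v) for k, v in flags.items() if k in FLAG_TO_ENV}

print(flags_to_env({"scenario": "Offline", "mode": "accuracy"}))
```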

Default environment

Click here to expand this section.

These keys can be updated via --env.KEY=VALUE or env dictionary in @input.json or using script flags.

  • CM_MLPERF_LOADGEN_MODE: accuracy
  • CM_MLPERF_LOADGEN_SCENARIO: Offline
  • CM_OUTPUT_FOLDER_NAME: test_results
  • CM_MLPERF_RUN_STYLE: test
  • CM_TEST_QUERY_COUNT: 10
  • CM_MLPERF_QUANTIZATION: False
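The override order is: defaults first, then any --env.KEY=VALUE flags (or the env dictionary in @input.json). A minimal sketch of that merge, using the keys listed above:

```python
# Sketch of the documented override order: start from the default
# environment and apply user overrides on top.
DEFAULT_ENV = {
    "CM_MLPERF_LOADGEN_MODE": "accuracy",
    "CM_MLPERF_LOADGEN_SCENARIO": "Offline",
    "CM_OUTPUT_FOLDER_NAME": "test_results",
    "CM_MLPERF_RUN_STYLE": "test",
    "CM_TEST_QUERY_COUNT": "10",
    "CM_MLPERF_QUANTIZATION": "False",
}

def apply_overrides(overrides):
    """Return the default environment updated by --env.KEY=VALUE overrides."""
    env = dict(DEFAULT_ENV)
    env.update(overrides)
    return env

env = apply_overrides({"CM_TEST_QUERY_COUNT": "100"})
print(env["CM_TEST_QUERY_COUNT"])
```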

Script workflow, dependencies and native scripts

Click here to expand this section.
  1. Read "deps" on other CM scripts from meta
  2. Run "preprocess" function from customize.py
  3. Read "prehook_deps" on other CM scripts from meta
  4. Run native script if exists
  5. Read "posthook_deps" on other CM scripts from meta
  6. Run "postprocess" function from customize.py
  7. Read "post_deps" on other CM scripts from meta
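The seven steps above can be sketched as an ordered pipeline. The phase names mirror the numbered list; the actual logic lives in the CM engine and this script's customize.py, so this is only an illustrative model:

```python
# Illustrative model of the documented execution order.
PHASES = [
    "deps",           # 1. "deps" on other CM scripts from meta
    "preprocess",     # 2. preprocess() from customize.py
    "prehook_deps",   # 3. "prehook_deps" from meta
    "native_script",  # 4. native script, if one exists
    "posthook_deps",  # 5. "posthook_deps" from meta
    "postprocess",    # 6. postprocess() from customize.py
    "post_deps",      # 7. "post_deps" from meta
]

def run_workflow(handlers):
    """Run any provided per-phase callables in the documented order."""
    executed = []
    for phase in PHASES:
        fn = handlers.get(phase)
        if fn is not None:
            fn()
        executed.append(phase)
    return executed
```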

Script output

New environment keys (filter)

  • CM_MLPERF_*

New environment keys auto-detected from customize

  • CM_MLPERF_ACCURACY_RESULTS_DIR

Maintainers