Segregate tests + Script to upgrade components (#2288)
* script to upgrade components

* add underscore directory name

* fix code style issue

* fix doc style issue

* add env_version arg and upgrade components

* add missing changes

* update package name for yaml

* update README

* try fixing token expiration issue

* Revert "try fixing token expiration issue"

This reverts commit 798f2ab.

* fix token expiration issue via singleton pattern

* fix thread issue

* Revert "fix thread issue"

This reverts commit 8959601.

* Revert "fix token expiration issue via singleton pattern"

This reverts commit 2a21f93.

* separate claude, batch_bench, prompt_crafter tests

* fix failing tests

* fix downloader test

* fix for when component is not published
iamrk04 authored Feb 13, 2024 · 1 parent: e31af43 · commit: 2665d01
Showing 50 changed files with 675 additions and 381 deletions.
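
The asset.yaml changes below all follow one pattern: each component's `tests_dir` is re-pointed from the shared `../../tests` folder to a component-specific path, so a component's CI run collects only its own tests. Judging only by the paths that appear in this diff, the segregated layout looks roughly like this (directories for the multi-component suites, single files elsewhere; the exact tree is an assumption):

```
assets/aml-benchmark/tests/
├── test_batch_benchmark_inference/   # shared by the batch-inference components
├── test_claude/
├── test_prompt_crafter/
├── test_benchmark_result_aggregator.py
├── test_compute_perf_metrics.py
├── test_dataset_downloader.py
├── test_dataset_preprocessor.py
├── test_dataset_sampler.py
└── test_inference_postprocessor.py
```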
19 changes: 18 additions & 1 deletion assets/aml-benchmark/README.md
@@ -72,4 +72,21 @@ python scripts/validation/copyright_validation.py -i assets/aml-benchmark/
In the root of the repo, run the following in **powershell**:
```
python scripts/validation/doc_style.py -i assets/aml-benchmark/
-```
+```
+
+# Release checklist
+
+## 1. Component release
+- From the root of this repo, run either of the following to install the dependencies:
+  - `pip install -r assets/aml-benchmark/requirements.txt`
+  - `conda env create -f assets/aml-benchmark/dev_conda_env.yaml`
+- Make sure the spec files of all components are up to date before kicking off the release process. From the root of this repo, run the following command to upgrade the components:
+  ```
+  python assets/aml-benchmark/scripts/_internal/upgrade_components.py [--env_version <version>]
+  ```
+  The `env_version` parameter accepts the following values:
+  | **Value** | **Description** |
+  | --- | --- |
+  | `"latest"` | The default. Upgrades the components' environment to the latest version. |
+  | `""` | Keeps the components' environment version as is. |
+  | `"<specific_version>"` | Upgrades the components' environment to the specified version. |
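
For orientation, here is a minimal sketch of what an upgrade helper like `upgrade_components.py` might look like. It is a hypothetical reconstruction (the script's actual source is not included in this diff); the directory layout, regex anchors, and patch-bump policy are assumptions inferred from the spec.yaml changes in this commit.

```python
"""Minimal sketch of a component-upgrade helper; hypothetical, not the real
scripts/_internal/upgrade_components.py, whose source is not in this commit."""
import argparse
import re
from pathlib import Path

COMPONENTS_ROOT = Path("assets/aml-benchmark/components")  # assumed layout

# Matches top-level `version: x.y.z` lines in a component spec.yaml.
VERSION_RE = re.compile(r"^(version: )(\d+)\.(\d+)\.(\d+)[ \t]*$", re.MULTILINE)
# Matches the environment reference, whether pinned by label or by version.
ENV_RE = re.compile(
    r"^(environment: azureml://registries/azureml/environments/[\w.-]+)"
    r"/(?:labels|versions)/[\w.-]+[ \t]*$",
    re.MULTILINE,
)


def upgrade_spec(text: str, env_version: str) -> str:
    """Bump the component's patch version and retarget its environment."""
    # x.y.z -> x.y.(z+1), matching the bumps visible in this commit's diffs.
    text = VERSION_RE.sub(
        lambda m: f"{m.group(1)}{m.group(2)}.{m.group(3)}.{int(m.group(4)) + 1}",
        text,
        count=1,
    )
    if env_version == "latest":
        # Stand-in only: the real script presumably resolves the newest
        # registry version (19 at the time of this commit) and pins it.
        text = ENV_RE.sub(r"\1/labels/latest", text, count=1)
    elif env_version:  # "" leaves the environment reference untouched
        text = ENV_RE.sub(rf"\1/versions/{env_version}", text, count=1)
    return text


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--env_version", default="latest")
    args = parser.parse_args()
    for spec_path in COMPONENTS_ROOT.glob("**/spec.yaml"):
        spec_path.write_text(upgrade_spec(spec_path.read_text(), args.env_version))
```

Editing the raw text with anchored regexes, rather than round-tripping through a YAML parser, preserves each spec file's formatting and comments; resolving what `"latest"` actually means would take a registry query, which the sketch deliberately stubs out.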
@@ -5,4 +5,4 @@ test:
pytest:
enabled: true
conda_environment: ../../dev_conda_env.yaml
-tests_dir: ../../tests
+tests_dir: ../../tests/test_batch_benchmark_inference
@@ -4,7 +4,7 @@ type: pipeline
name: batch_benchmark_inference
display_name: Batch Benchmark Inference
description: Components for batch endpoint inference
-version: 0.0.5
+version: 0.0.6

inputs:
input_dataset:
@@ -149,7 +149,7 @@ jobs:
# Preparer
batch_inference_preparer:
type: command
-component: azureml:batch_inference_preparer:0.0.6
+component: azureml:batch_inference_preparer:0.0.7
inputs:
input_dataset: ${{parent.inputs.input_dataset}}
model_type: ${{parent.inputs.model_type}}
@@ -167,7 +167,7 @@
# Inference
endpoint_batch_score:
type: parallel
-component: azureml:batch_benchmark_score:0.0.5
+component: azureml:batch_benchmark_score:0.0.6
inputs:
model_type: ${{parent.inputs.model_type}}
online_endpoint_url: ${{parent.inputs.endpoint_url}}
@@ -199,7 +199,7 @@ jobs:
# Reformat
batch_output_formatter:
type: command
-component: azureml:batch_output_formatter:0.0.6
+component: azureml:batch_output_formatter:0.0.7
inputs:
model_type: ${{parent.inputs.model_type}}
batch_inference_output: ${{parent.jobs.endpoint_batch_score.outputs.mini_batch_results_out_directory}}
@@ -5,4 +5,4 @@ test:
pytest:
enabled: true
conda_environment: ../../dev_conda_env.yaml
-tests_dir: ../../tests
+tests_dir: ../../tests/test_batch_benchmark_inference
@@ -1,6 +1,6 @@
$schema: http://azureml/sdk-2-0/ParallelComponent.json
name: batch_benchmark_score
-version: 0.0.5
+version: 0.0.6
display_name: Batch Benchmark Score
is_deterministic: False
type: parallel
@@ -77,7 +77,7 @@ outputs:
type: uri_folder
task:
code: ../src
-environment: azureml://registries/azureml/environments/model-evaluation/versions/15
+environment: azureml://registries/azureml/environments/model-evaluation/versions/19
program_arguments: --append_row_safe_output True --debug_mode ${{inputs.debug_mode}} $[[--model_type ${{inputs.model_type}}]] --online_endpoint_url ${{inputs.online_endpoint_url}} $[[--additional_properties ${{inputs.additional_properties}}]] $[[--additional_headers ${{inputs.additional_headers}}]] $[[--user_agent_segment ${{inputs.user_agent_segment}}]] --metrics_out_directory ${{outputs.metrics_out_directory}} --tally_failed_requests False --tally_exclusions none --run_type parallel --segment_large_requests disabled --segment_max_token_size 600 --ensure_ascii ${{inputs.ensure_ascii}} --output_behavior append_row --initial_worker_count ${{inputs.initial_worker_count}} --max_worker_count ${{inputs.max_worker_count}} $[[--max_retry_time_interval ${{inputs.max_retry_time_interval}}]] --save_mini_batch_results enabled --mini_batch_results_out_directory ${{outputs.mini_batch_results_out_directory}} --connections_name ${{inputs.connections_name}} $[[--deployment_name ${{inputs.deployment_name}}]] $[[--input_metadata ${{inputs.deployment_metadata}}]] $[[--mini_batch_size ${{inputs.mini_batch_size}}]]
entry_script: aml_benchmark.batch_benchmark_score.batch_score.main
type: run_function
@@ -5,4 +5,4 @@ test:
pytest:
enabled: true
conda_environment: ../../dev_conda_env.yaml
-tests_dir: ../../tests
+tests_dir: ../../tests/test_batch_benchmark_inference
@@ -4,7 +4,7 @@ type: command
name: batch_inference_preparer
display_name: Batch Inference Preparer
description: Prepare the jsonl file and endpoint for batch inference component.
-version: 0.0.6
+version: 0.0.7

inputs:
input_dataset:
@@ -64,7 +64,7 @@ outputs:
description: Path to the folder where the ground truth metadata will be stored.

code: ../src
-environment: azureml://registries/azureml/environments/model-evaluation/labels/latest
+environment: azureml://registries/azureml/environments/model-evaluation/versions/19
command: >-
python -m aml_benchmark.batch_inference_preparer.main
--batch_input_pattern '${{inputs.batch_input_pattern}}'
@@ -5,4 +5,4 @@ test:
pytest:
enabled: true
conda_environment: ../../dev_conda_env.yaml
-tests_dir: ../../tests
+tests_dir: ../../tests/test_batch_benchmark_inference
@@ -1,5 +1,5 @@
name: batch_output_formatter
-version: 0.0.6
+version: 0.0.7
display_name: Batch Output Formatter
is_deterministic: True
type: command
@@ -53,7 +53,7 @@ outputs:
ground_truth:
type: uri_file
code: ../src
-environment: azureml://registries/azureml/environments/model-evaluation/labels/latest
+environment: azureml://registries/azureml/environments/model-evaluation/versions/19

resources:
instance_count: 1
@@ -5,4 +5,4 @@ test:
pytest:
enabled: true
conda_environment: ../../dev_conda_env.yaml
-tests_dir: ../../tests
+tests_dir: ../../tests/test_batch_benchmark_inference
@@ -5,4 +5,4 @@ test:
pytest:
enabled: true
conda_environment: ../../dev_conda_env.yaml
-tests_dir: ../../tests
+tests_dir: ../../tests/test_claude
@@ -4,7 +4,7 @@ type: pipeline
name: batch_benchmark_inference_claude
display_name: Batch Benchmark Inference with claude support
description: Components for batch endpoint inference
-version: 0.0.2
+version: 0.0.3

inputs:
input_dataset:
@@ -151,7 +151,7 @@ jobs:
# Preparer
batch_inference_preparer:
type: command
-component: azureml:batch_inference_preparer:0.0.6
+component: azureml:batch_inference_preparer:0.0.7
inputs:
input_dataset: ${{parent.inputs.input_dataset}}
model_type: ${{parent.inputs.model_type}}
@@ -168,7 +168,7 @@
# Inference
endpoint_batch_score:
type: parallel
-component: azureml:batch_benchmark_score:0.0.5
+component: azureml:batch_benchmark_score:0.0.6
inputs:
model_type: ${{parent.inputs.model_type}}
online_endpoint_url: ${{parent.inputs.endpoint_url}}
@@ -199,7 +199,7 @@ jobs:
# Reformat
batch_output_formatter:
type: command
-component: azureml:batch_output_formatter:0.0.6
+component: azureml:batch_output_formatter:0.0.7
inputs:
model_type: ${{parent.inputs.model_type}}
batch_inference_output: ${{parent.jobs.endpoint_batch_score.outputs.mini_batch_results_out_directory}}
@@ -5,4 +5,4 @@ test:
pytest:
enabled: true
conda_environment: ../../dev_conda_env.yaml
-tests_dir: ../../tests
+tests_dir: ../../tests/test_benchmark_result_aggregator.py
@@ -4,7 +4,7 @@ type: command
name: benchmark_result_aggregator
display_name: Benchmark result aggregator
description: Aggregate quality metrics, performance metrics and all of the metadata from the pipeline. Also add them to the root run.
-version: 0.0.4
+version: 0.0.5
is_deterministic: false

inputs:
@@ -23,7 +23,7 @@ outputs:
description: The json file with all of the aggregated results.

code: ../src
-environment: azureml://registries/azureml/environments/model-evaluation/labels/latest
+environment: azureml://registries/azureml/environments/model-evaluation/versions/19
command: >-
python -m aml_benchmark.result_aggregator.main
$[[--quality_metrics_path ${{inputs.quality_metrics}}]]
@@ -5,4 +5,4 @@ test:
pytest:
enabled: true
conda_environment: ../../dev_conda_env.yaml
-tests_dir: ../../tests
+tests_dir: ../../tests/test_compute_perf_metrics.py
@@ -4,7 +4,7 @@ type: command
name: compute_performance_metrics
display_name: Compute Performance Metrics
description: Performs performance metric post processing using data from a model inference run.
-version: 0.0.2
+version: 0.0.3
is_deterministic: true

inputs:
@@ -57,7 +57,7 @@ outputs:
description: Path to the file where the calculated performance metric results will be stored.

code: ../src
-environment: azureml://registries/azureml/environments/model-evaluation/labels/latest
+environment: azureml://registries/azureml/environments/model-evaluation/versions/19
command: >-
python -m aml_benchmark.perf_metrics.main
--performance_data ${{inputs.performance_data}}
@@ -5,4 +5,4 @@ test:
pytest:
enabled: true
conda_environment: ../../dev_conda_env.yaml
-tests_dir: ../../tests
+tests_dir: ../../tests/test_dataset_downloader.py
4 changes: 2 additions & 2 deletions assets/aml-benchmark/components/dataset-downloader/spec.yaml
@@ -4,7 +4,7 @@ type: command
name: dataset_downloader
display_name: Dataset Downloader
description: Downloads the dataset onto blob store.
-version: 0.0.2
+version: 0.0.3

inputs:
dataset_name:
@@ -34,7 +34,7 @@ outputs:
description: Path to the directory where the dataset will be downloaded.

code: ../src
-environment: azureml://registries/azureml/environments/model-evaluation/labels/latest
+environment: azureml://registries/azureml/environments/model-evaluation/versions/19
command: >-
python -m aml_benchmark.dataset_downloader.main
$[[--dataset_name ${{inputs.dataset_name}}]]
@@ -5,4 +5,4 @@ test:
pytest:
enabled: true
conda_environment: ../../dev_conda_env.yaml
-tests_dir: ../../tests
+tests_dir: ../../tests/test_dataset_preprocessor.py
@@ -4,7 +4,7 @@ type: command
name: dataset_preprocessor
display_name: Dataset Preprocessor
description: Dataset Preprocessor
-version: 0.0.2
+version: 0.0.3
is_deterministic: true

inputs:
@@ -50,7 +50,7 @@ outputs:
code: ../src

-environment: azureml://registries/azureml/environments/model-evaluation/labels/latest
+environment: azureml://registries/azureml/environments/model-evaluation/versions/19

command: >-
python -m aml_benchmark.dataset_preprocessor.main
2 changes: 1 addition & 1 deletion assets/aml-benchmark/components/dataset-sampler/asset.yaml
@@ -5,4 +5,4 @@ test:
pytest:
enabled: true
conda_environment: ../../dev_conda_env.yaml
-tests_dir: ../../tests
+tests_dir: ../../tests/test_dataset_sampler.py
4 changes: 2 additions & 2 deletions assets/aml-benchmark/components/dataset-sampler/spec.yaml
@@ -4,7 +4,7 @@ type: command
name: dataset_sampler
display_name: Dataset Sampler
description: Samples a dataset containing JSONL file(s).
-version: 0.0.2
+version: 0.0.3

inputs:
dataset:
@@ -47,7 +47,7 @@ outputs:
description: Path to the jsonl file where the sampled dataset will be saved.

code: ../src
-environment: azureml://registries/azureml/environments/model-evaluation/labels/latest
+environment: azureml://registries/azureml/environments/model-evaluation/versions/19
command: >-
python -m aml_benchmark.dataset_sampler.main
--dataset ${{inputs.dataset}}
@@ -5,4 +5,4 @@ test:
pytest:
enabled: true
conda_environment: ../../dev_conda_env.yaml
-tests_dir: ../../tests
+tests_dir: ../../tests/test_inference_postprocessor.py
@@ -4,7 +4,7 @@ type: command
name: inference_postprocessor
display_name: Inference Postprocessor
description: Inference Postprocessor
-version: 0.0.3
+version: 0.0.4
is_deterministic: true

inputs:
@@ -130,7 +130,7 @@ outputs:
code: ../src

-environment: azureml://registries/azureml/environments/model-evaluation/labels/latest
+environment: azureml://registries/azureml/environments/model-evaluation/versions/19

command: >-
python -m aml_benchmark.inference_postprocessor.main
2 changes: 1 addition & 1 deletion assets/aml-benchmark/components/prompt_crafter/asset.yaml
@@ -5,4 +5,4 @@ test:
pytest:
enabled: true
conda_environment: ../../dev_conda_env.yaml
-tests_dir: ../../tests
+tests_dir: ../../tests/test_prompt_crafter
4 changes: 2 additions & 2 deletions assets/aml-benchmark/components/prompt_crafter/spec.yaml
@@ -6,7 +6,7 @@ display_name: Prompt Crafter
description: This component is used to create prompts from a given dataset. From a
given jinja prompt template, it will generate prompts. It can also create
few-shot prompts given a few-shot dataset and the number of shots.
-version: 0.0.5
+version: 0.0.6
is_deterministic: true

inputs:
@@ -134,7 +134,7 @@ outputs:
description: Output file path where few_shot_prompt data will be written.

code: ../src
-environment: azureml://registries/azureml/environments/model-evaluation/labels/latest
+environment: azureml://registries/azureml/environments/model-evaluation/versions/19
command: >-
python -m aml_benchmark.prompt_crafter.main
--test_data ${{inputs.test_data}}
3 changes: 3 additions & 0 deletions assets/aml-benchmark/dev_conda_env.yaml
@@ -15,6 +15,9 @@ dependencies:
- mltable>=1.5.0
- datasets
- ddt
+- tqdm
+- pyyaml
+- azure-core
## Test requirements
- pytest
- pytest-xdist
3 changes: 3 additions & 0 deletions assets/aml-benchmark/requirements.txt
@@ -10,3 +10,6 @@ datasets
pytest
pytest-xdist
ddt
+tqdm
+pyyaml
+azure-core