Bhavanatumma/ta2614748 #1294

Merged
merged 54 commits into from
Mar 7, 2024
Changes from 10 commits
a4b06e8
[bhavanatumma/ta2614748] - import model modification for OSS models
Sep 18, 2023
da9eb1a
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Sep 18, 2023
412e200
[bhavanatumma/ta2614748] - Converting to MLFlow OSS transformers flavor
Sep 22, 2023
bef9664
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Sep 22, 2023
cef6acc
[bhavanatumma/ta2614748] - Converting to MLFlow OSS transformers flavor
Sep 22, 2023
9272238
[bhavanatumma/ta2614748] - Converting to MLFlow OSS transformers flavor
Sep 22, 2023
45f5aa5
[bhavanatumma/ta2614748] - Converting to MLFlow OSS transformers flav…
Sep 22, 2023
c9ecc80
[bhavanatumma/ta2614748] - Converting to MLFlow OSS transformers flav…
Sep 27, 2023
b562a8c
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Sep 27, 2023
ae8f9ed
[bhavanatumma/ta2614748] - lint fix
Sep 27, 2023
6533991
[bhavanatumma/ta2614748] - review comments
Sep 30, 2023
0d91738
[bhavanatumma/ta2614748] - current progress
Oct 15, 2023
4cc8252
[bhavanatumma/ta2614748] - current progress v2
Oct 17, 2023
da84c56
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Oct 29, 2023
f67cb2c
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Oct 29, 2023
5bf20e7
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Nov 7, 2023
30ffa2e
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Dec 1, 2023
6b8e174
[bhavanatumma/ta2614748] - ref point
Dec 5, 2023
1258560
[bhavanatumma/ta2614748] - ref point
Dec 6, 2023
ab9ee5d
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Dec 6, 2023
7ad4980
[bhavanatumma/ta2614748] - ref point v2
Dec 15, 2023
2972bfa
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Jan 5, 2024
4c9a9a6
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Jan 11, 2024
b87aabd
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Jan 12, 2024
bdef5eb
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Jan 12, 2024
ac8c8d9
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Jan 15, 2024
dea69eb
[bhavanatumma/ta2614748] - added OSS and HF transformers support vari…
Jan 16, 2024
4f996c4
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Jan 16, 2024
0caeb39
[bhavanatumma/ta2614748] - lint fixes
Jan 16, 2024
6530c92
[bhavanatumma/ta2614748] - test fixes
Jan 17, 2024
bd23a34
[bhavanatumma/ta2614748] - test fixes
Jan 17, 2024
24a0cbb
[bhavanatumma/ta2614748] - test fixes
Jan 17, 2024
cfead71
[bhavanatumma/ta2614748] - review comments and adding Whisper model
Jan 17, 2024
a6668fc
[bhavanatumma/ta2614748] - review comments and adding Whisper model
Jan 17, 2024
411fda3
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Feb 2, 2024
42eb412
[bhavanatumma/ta2614748] - changing version
Feb 2, 2024
22f074f
[bhavanatumma/ta2614748] - changing versions
Feb 5, 2024
7e5fd0f
[bhavanatumma/ta2614748] - changing signatures
Feb 6, 2024
21ffe69
[bhavanatumma/ta2614748] - adding new env
Feb 8, 2024
29ec544
[bhavanatumma/ta2614748] - adding new task text2text
Feb 9, 2024
f5d7780
[bhavanatumma/ta2614748] - adding new task text2text
Feb 12, 2024
913ed97
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Feb 13, 2024
347661d
[bhavanatumma/ta2614748] - supporting t2t tasks
Feb 19, 2024
180e1b3
[bhavanatumma/ta2614748] - whisper changes
Feb 20, 2024
69807fb
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Mar 5, 2024
5848cd2
[bhavanatumma/ta2614748] - whisper changes + convertors
Mar 6, 2024
67a1ea4
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Mar 6, 2024
746ef77
[bhavanatumma/ta2614748] - fixing tests
Mar 6, 2024
66d6868
Merge branch 'main' of https://github.com/Azure/azureml-assets into b…
Mar 6, 2024
1f5bcbf
[bhavanatumma/summarization-oss] - version change
Mar 6, 2024
8f324c5
[bhavanatumma/ta2614748] - lint fix
Mar 6, 2024
2dab0dc
[bhavanatumma/ta2614748] - signature fix
Mar 7, 2024
1f4298a
[bhavanatumma/ta2614748] - signature fix
Mar 7, 2024
3f9fbc4
pl
Mar 7, 2024
Original file line number Diff line number Diff line change
@@ -1,15 +1,15 @@
$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json

name: convert_model_to_mlflow
-version: 0.0.15
+version: 0.0.16
type: command

is_deterministic: True

display_name: Convert models to MLflow
description: Component converts models from supported frameworks to MLflow model packaging format

-environment: azureml://registries/azureml/environments/model-management/versions/13
+environment: azureml://registries/azureml/environments/model-management/versions/15

code: ../../src/
command: >
Original file line number Diff line number Diff line change
@@ -4,7 +4,7 @@ type: pipeline
name: import_model
display_name: Import model
description: Import a model into a workspace or a registry
-version: 0.0.17
+version: 0.0.18

# Pipeline inputs
inputs:
@@ -227,7 +227,7 @@ jobs:
type: uri_folder

convert_model_to_mlflow:
-component: azureml:convert_model_to_mlflow:0.0.14
+component: azureml:convert_model_to_mlflow:0.0.16
compute: ${{parent.inputs.compute}}
resources:
instance_type: '${{parent.inputs.instance_type}}'
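
Note on the component reference: the job in the hunk above points at the component with a string of the form `azureml:<name>:<version>`, and this change moves it from 0.0.14 to 0.0.16 while the component spec itself moves from 0.0.15 to 0.0.16. A minimal sketch of a consistency check between the two, assuming only that three-part short form (registry URIs such as `azureml://registries/...` are deliberately rejected). `parse_component_ref` and `check_pinned_version` are hypothetical helper names, not part of azureml-assets.

```python
from typing import Tuple


def parse_component_ref(ref: str) -> Tuple[str, str]:
    """Split a short-form reference 'azureml:<name>:<version>' into (name, version)."""
    parts = ref.split(":")
    if len(parts) != 3 or parts[0] != "azureml" or not parts[1] or not parts[2]:
        raise ValueError(f"unsupported component reference: {ref!r}")
    return parts[1], parts[2]


def check_pinned_version(ref: str, spec_name: str, spec_version: str) -> bool:
    """Return True when the pipeline reference matches the component spec."""
    name, version = parse_component_ref(ref)
    return name == spec_name and version == spec_version


# Values taken from the two hunks above.
new_ref = "azureml:convert_model_to_mlflow:0.0.16"
old_ref = "azureml:convert_model_to_mlflow:0.0.14"
print(check_pinned_version(new_ref, "convert_model_to_mlflow", "0.0.16"))
print(check_pinned_version(old_ref, "convert_model_to_mlflow", "0.0.16"))
```

A check like this could run in CI so that a pipeline never keeps pointing at a stale component version after the spec is bumped.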
Original file line number Diff line number Diff line change
@@ -2,12 +2,15 @@
# Licensed under the MIT License.

"""HFTransformers MLflow model convertors."""

from typing import List
import transformers
import platform
import mlflow
import os
import yaml
from abc import ABC, abstractmethod
from azureml.evaluate import mlflow as hf_mlflow
from azureml.core.conda_dependencies import CondaDependencies
from azureml.model.mgmt.processors.convertors import MLFLowConvertorInterface
from azureml.model.mgmt.processors.transformers.config import (
HF_CONF,
@@ -193,29 +196,58 @@ def _save(
logger.info("Experimental features enabled for MLflow conversion")
self._hf_conf["exp"] = True

-hf_mlflow.hftransformers.save_model(
-config=config,
-tokenizer=tokenizer,
-hf_model=model,
-hf_conf=self._hf_conf,
-conda_env=conda_env,
-code_paths=code_paths,
-signature=self._signatures,
-input_example=input_example,
-requirements_file=requirements_file,
-pip_requirements=pip_requirements,
-extra_pip_requirements=self._extra_pip_requirements,
-path=self._output_dir,
-)
+try:
+# create a conda environment for OSS transformers Flavor
+python_version = platform.python_version()
+pip_pkgs = self._get_curated_environment_pip_package_list()
+conda_deps = CondaDependencies.create(conda_packages=None,
+python_version=python_version,
+pip_packages=pip_pkgs,
+pin_sdk_version=False)
+curated_conda_env = conda_deps.as_dict()
+
+model_pipeline = transformers.pipeline(task=self._task, model=model)
+
+mlflow.transformers.save_model(
+transformers_model=model_pipeline,
+conda_env=curated_conda_env,
+code_paths=code_paths,
+signature=self._signatures,
+input_example=input_example,
+pip_requirements=pip_requirements,
+extra_pip_requirements=self._extra_pip_requirements,
+path=self._output_dir,
+)
+
+# move metadata files to parent folder
+logger.info("Moving meta files such as license, use_policy, readme to parent")
+move_files(
+Path(self._output_dir) / "data/model",
+self._output_dir,
+include_pattern_str=META_FILE_PATTERN,
+ignore_case=True
+)
+logger.info("Model saved with mlflow OSS flow for task: {}".format(self._task))
+except Exception as e:
+logger.error("Model save failed with mlflow OSS flow for task: {} "
+"with exception: {}".format(self._task, e))
+
+hf_mlflow.hftransformers.save_model(
+config=config,
+tokenizer=tokenizer,
+hf_model=model,
+hf_conf=self._hf_conf,
+conda_env=conda_env,
+code_paths=code_paths,
+signature=self._signatures,
+input_example=input_example,
+requirements_file=requirements_file,
+pip_requirements=pip_requirements,
+extra_pip_requirements=self._extra_pip_requirements,
+path=self._output_dir,
+)

+logger.info("Model saved with transformers evaluate flow for task: {}".format(self._task))
-# move metadata files to parent folder
-logger.info("Moving meta files such as license, use_policy, readme to parent")
-move_files(
-Path(self._output_dir) / "data/model",
-self._output_dir,
-include_pattern_str=META_FILE_PATTERN,
-ignore_case=True
-)

# pin pycocotools==2.0.4
self._update_conda_dependencies({"pycocotools": "2.0.4"})
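
Note on the save flow: the hunk above tries the MLflow OSS transformers flavor first and, on any exception, logs the failure and falls back to the `hf_mlflow.hftransformers` flavor. The control flow can be sketched on its own as below. This is an illustration under assumptions, not the code in this PR: `save_with_fallback`, `oss_save`, and `evaluate_save` are hypothetical names, and the savers are stand-in callables rather than real MLflow calls.

```python
import logging
from typing import Callable, Optional, Sequence, Tuple

logger = logging.getLogger(__name__)


def save_with_fallback(savers: Sequence[Tuple[str, Callable[[], None]]]) -> str:
    """Run each (label, saver) in order and return the label of the first that succeeds."""
    last_error: Optional[Exception] = None
    for label, saver in savers:
        try:
            saver()
        except Exception as err:  # broad on purpose, mirroring the hunk above
            logger.error("Model save failed with %s flow: %s", label, err)
            last_error = err
            continue
        logger.info("Model saved with %s flow", label)
        return label
    # Every flow failed (or none was given): surface the last error, do not hide it.
    raise RuntimeError("all save flows failed") from last_error


def oss_save() -> None:
    # Stand-in for the first attempt; fails to exercise the fallback path.
    raise ValueError("task not supported by this flavor")


def evaluate_save() -> None:
    # Stand-in for the fallback flavor; succeeds.
    pass


used = save_with_fallback([("mlflow OSS", oss_save), ("transformers evaluate", evaluate_save)])
print(used)  # transformers evaluate
```

Returning the label makes it explicit to the caller which flavor produced the artifact. One thing the sketch does not address is cleanup: if the first attempt fails partway through, it may leave files in the output directory before the fallback writes to the same path.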
@@ -253,6 +285,42 @@ def _update_conda_dependencies(self, package_details):
yaml.safe_dump(conda_dict, f)
logger.info("updated conda.yaml")

+def _get_curated_environment_pip_package_list(self) -> List[str]:
+"""
+Retrieve the packages using 'conda list' command.
+
+:return: A List of the pip package and the corresponding versions.
+"""
+import subprocess
+import json
+
+PIP_LIST = ['mlflow', 'accelerate', 'cffi', 'dill', 'google-api-core', 'numpy',
+'packaging', 'pillow', 'protobuf', 'pyyaml', 'requests', 'scikit-learn',
+'scipy', 'sentencepiece', 'torch', 'transformers']
+ADD_PACKAGE_LIST = ['torchvision==0.14.1']
+
+conda_list_cmd = ["conda", "list", "--json"]
+try:
+process = subprocess.run(conda_list_cmd, shell=False, check=True,
+stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+except (FileNotFoundError, subprocess.CalledProcessError) as err:
+logger.warning('subprocess failed to get dependencies list from conda with error: {}'.format(err))
+return []
+output_str = process.stdout.decode('ascii')
+output_json = json.loads(output_str)
+pip_list = []
+for pkg in output_json:
+pkg_name = pkg['name']
+pkg_version = pkg['version']
+if pkg_name in PIP_LIST:
+pip_list.append(pkg_name + "==" + pkg_version)
+
+for pkg in ADD_PACKAGE_LIST:
+pip_list.append(pkg)
+
+logger.info("pip list: {}".format(pip_list))
+return pip_list
+
def _validate(self, translate_params):
if not translate_params.get("task"):
raise Exception("task is a required parameter for hftransformers MLflow flavor.")
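
Note on the package filtering: `_get_curated_environment_pip_package_list` in the hunk above reads `conda list --json`, keeps only an allow-list of package names, and emits `name==version` pins plus a fixed extra pin. The filtering step can be exercised without conda by separating it from the subprocess call. A minimal sketch, assuming the JSON is a list of objects carrying `name` and `version` keys as the hunk relies on. `pins_from_conda_json` is a hypothetical helper, and the sample versions are made up for illustration.

```python
import json
from typing import Iterable, List


def pins_from_conda_json(conda_json: str,
                         wanted: Iterable[str],
                         extra: Iterable[str] = ()) -> List[str]:
    """Build 'name==version' pins for allow-listed packages, then append fixed extras."""
    wanted_set = set(wanted)  # set membership instead of repeated list scans
    pins = [
        "{}=={}".format(pkg["name"], pkg["version"])
        for pkg in json.loads(conda_json)
        if pkg["name"] in wanted_set
    ]
    pins.extend(extra)
    return pins


# Made-up sample standing in for the output of `conda list --json`.
sample = json.dumps([
    {"name": "numpy", "version": "1.23.5"},
    {"name": "openssl", "version": "3.0.12"},
    {"name": "torch", "version": "1.13.1"},
])

pins = pins_from_conda_json(sample, wanted=["numpy", "torch"], extra=["torchvision==0.14.1"])
print(pins)  # ['numpy==1.23.5', 'torch==1.13.1', 'torchvision==0.14.1']
```

Keeping the parsing pure in this way lets a unit test cover the allow-list logic with a fixture string, while the `subprocess.run` call and its error handling stay a thin wrapper.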
@@ -412,4 +480,4 @@ def save_as_mlflow(self):
hf_conf[HF_CONF.HF_PRETRAINED_CLASS.value] = self._hf_model_cls.__name__
hf_conf[HF_CONF.HF_TOKENIZER_CLASS.value] = self._hf_tokenizer_cls.__name__

-return super()._save(segregate=True)
+return super()._save(segregate=False)