Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flytekit remote updates for notebooks and versions #3109

Merged
merged 7 commits into from
Feb 12, 2025

Conversation

kumare3
Copy link
Contributor

@kumare3 kumare3 commented Feb 5, 2025

Problems:

  1. Version had to be provided when registering certain entities from the notebook
  2. Execute worked, but register did not work in the same way
  3. In cases when same entity is registered in non interactive mode, remote entity was auto selected to be latest registered if version field was empty. This is not explicitly called latest as a special version
  4. Makes version handling uniform across all entities

Example cases, now works
Screenshot 2025-02-04 at 9 35 13 PM

Screenshot 2025-02-04 at 9 35 47 PM Screenshot 2025-02-04 at 9 36 12 PM

Summary by Bito

Implements standardized version handling in Flytekit's remote operations with 'latest' as default, enhancing consistency across interactive and notebook environments. Improves documentation for execution_name_prefix parameter and clarifies behavior of 'latest' version parameter. Includes code cleanup by removing unused elements and replacing print statements with proper logging. Updates documentation for improved clarity and conciseness.

Unit tests added: False

Estimated effort to review (1-5, lower is better): 2

@flyte-bot
Copy link
Contributor

flyte-bot commented Feb 5, 2025

Code Review Agent Run #7eedad

Actionable Suggestions - 1
  • flytekit/remote/remote.py - 1
Additional Suggestions - 1
  • flytekit/remote/remote.py - 1
    • Consider consolidating version check conditions · Line 166-167
Review Details
  • Files reviewed - 1 · Commit Range: 9e0af8a..9e0af8a
    • flytekit/remote/remote.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

@flyte-bot
Copy link
Contributor

flyte-bot commented Feb 5, 2025

Changelist by Bito

This pull request implements the following key changes.

Key Change Files Impacted
Feature Improvement - Enhanced Version Handling in Remote Operations

remote.py - Standardized version handling with 'latest' as default, improved version resolution logic, and added better logging

Comment on lines 2055 to 2067
def _resolve_version(
self, version: typing.Optional[str], entity: typing.Any, ss: SerializationSettings
) -> typing.Tuple[str, typing.Optional[PickledEntity]]:
if version is None and self.interactive_mode_enabled:
md5_bytes, pickled_target_dict = _get_pickled_target_dict(entity)
return self._version_from_hash(
md5_bytes, ss, entity.python_interface.default_inputs_as_kwargs, *self._get_image_names(entity)
), pickled_target_dict
elif version is not None:
return version, None
raise ValueError(
"Version must be provided when not in interactive mode. If you want to use latest version pass 'latest'"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider splitting version resolution logic

Consider extracting the version resolution logic into a separate method for better code organization. The _resolve_version method currently handles both version resolution and pickled entity management, which could be split for better maintainability.

Code suggestion
Check the AI-generated fix before applying
Suggested change
def _resolve_version(
self, version: typing.Optional[str], entity: typing.Any, ss: SerializationSettings
) -> typing.Tuple[str, typing.Optional[PickledEntity]]:
if version is None and self.interactive_mode_enabled:
md5_bytes, pickled_target_dict = _get_pickled_target_dict(entity)
return self._version_from_hash(
md5_bytes, ss, entity.python_interface.default_inputs_as_kwargs, *self._get_image_names(entity)
), pickled_target_dict
elif version is not None:
return version, None
raise ValueError(
"Version must be provided when not in interactive mode. If you want to use latest version pass 'latest'"
)
def _get_pickled_entity(self, entity: typing.Any) -> typing.Optional[PickledEntity]:
if not self.interactive_mode_enabled:
return None
md5_bytes, pickled_target_dict = _get_pickled_target_dict(entity)
return pickled_target_dict
def _resolve_version(
self, version: typing.Optional[str], entity: typing.Any, ss: SerializationSettings
) -> typing.Tuple[str, typing.Optional[PickledEntity]]:
if version is not None:
return version, None
if not self.interactive_mode_enabled:
raise ValueError(
"Version must be provided when not in interactive mode. If you want to use latest version pass 'latest'"
)
md5_bytes, pickled_target_dict = _get_pickled_target_dict(entity)
version = self._version_from_hash(
md5_bytes, ss, entity.python_interface.default_inputs_as_kwargs, *self._get_image_names(entity)
)
return version, pickled_target_dict

Code Review Run #7eedad


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@flyte-bot
Copy link
Contributor

flyte-bot commented Feb 7, 2025

Code Review Agent Run #dda473

Actionable Suggestions - 2
  • flytekit/remote/remote.py - 2
Review Details
  • Files reviewed - 1 · Commit Range: 9e0af8a..e020bd1
    • flytekit/remote/remote.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

Comment on lines 809 to 821
def _resolve_version(
self, version: typing.Optional[str], entity: typing.Any, ss: SerializationSettings
) -> typing.Tuple[str, typing.Optional[PickledEntity]]:
if version is None and self.interactive_mode_enabled:
md5_bytes, pickled_target_dict = _get_pickled_target_dict(entity)
return self._version_from_hash(
md5_bytes, ss, entity.python_interface.default_inputs_as_kwargs, *self._get_image_names(entity)
), pickled_target_dict
elif version is not None:
return version, None
raise ValueError(
"Version must be provided when not in interactive mode. If you want to use latest version pass 'latest'"
)
Copy link
Contributor

@flyte-bot flyte-bot Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding specific type hints

Consider adding type hints for entity parameter in _resolve_version(). The current implementation accepts typing.Any which could lead to runtime errors if invalid types are passed.

Code suggestion
Check the AI-generated fix before applying
 -    def _resolve_version(self, version: typing.Optional[str], entity: typing.Any, ss: SerializationSettings
 +    def _resolve_version(self, version: typing.Optional[str], entity: typing.Union[WorkflowBase, PythonTask], ss: SerializationSettings

Code Review Run #dda473

Consider consolidating version handling logic

Consider consolidating the version handling logic. The conditions version is not None and ss.version is not None could be combined since they have the same outcome of returning version, None or ss.version, None respectively.

Code suggestion
Check the AI-generated fix before applying
 -        elif version is not None:
 -            return version, None
 -        elif ss.version is not None:
 -            return ss.version, None
 +        elif version is not None or ss.version is not None:
 +            return version or ss.version, None

Code Review Run #0c0f7f


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

project=self.default_project,
domain=self.default_domain,
)

version, _ = self._resolve_version(version, entity, serialization_settings)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider moving version resolution earlier

Consider moving the version resolution logic before the serialization settings initialization to avoid potential inconsistencies between version and settings.

Code suggestion
Check the AI-generated fix before applying
 -            serialization_settings = SerializationSettings(
 -                image_config=ImageConfig.auto_default_image(),
 -                project=self.default_project,
 -                domain=self.default_domain,
 -            )
 -
 -            version, _ = self._resolve_version(version, entity, serialization_settings)
 +            version, _ = self._resolve_version(version, entity, None)
 +            serialization_settings = SerializationSettings(
 +                image_config=ImageConfig.auto_default_image(),
 +                project=self.default_project,
 +                domain=self.default_domain,
 +                version=version
 +            )

Code Review Run #dda473


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@wild-endeavor
Copy link
Contributor

There's a breaking change here I wanted to highlight which may or may not be okay. I think it's technically more correct, but it is a change in behavior.

Assuming you have a simple file like this

from flytekit import task, workflow, ImageSpec, FlyteRemote, Config

image = ImageSpec(
    name="yt_public",
    builder="default",
    registry="ghcr.io/wild-endeavor",
    packages=["flytekit==v1.15.0b2"],
)

@task(container_image=image, enable_deck=True)
def print_hello():
    print("hello")

@workflow
def wf():
    print_hello()

if __name__ == "__main__":
    remote = FlyteRemote(Config.for_sandbox(), default_project="flytesnacks", default_domain="development")
    remote.register_workflow(wf, version="from_main_v4_pr")

the behavior on master is as follows

  • if the user does pyflyte register, fast register is used, and the code is copied into the tarball. Code is not built into the image.
  • if the user does python file.py, fast register is not used, and the code is built into the image.

The reason the code is built into the image in the second case is because the code currently is setting the project root. When the register workflow call triggers the downstream register task call, it gets set on the image spec, causing the user code to be built in.

With this pr, without setting project root, pyflyte register remains the same, but python file.py will no longer copy code - when the workflow is run, the task will fail.

If the user wants to make this work, they'd have to change to fast_register_workflow, which actually feels more correct.

I think we can make a better experience actually - the register_workflow function right now never does fast register... even if the user passes in a serialization settings object with fast register enabled, so there's some discrepancy here to clean up.

Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
Signed-off-by: Kevin Su <[email protected]>
pingsutw
pingsutw previously approved these changes Feb 7, 2025
@flyte-bot
Copy link
Contributor

flyte-bot commented Feb 7, 2025

Code Review Agent Run #0c0f7f

Actionable Suggestions - 1
  • flytekit/remote/remote.py - 1
    • Consider consolidating version handling logic · Line 809-821
Review Details
  • Files reviewed - 1 · Commit Range: e020bd1..8368ca8
    • flytekit/remote/remote.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

Signed-off-by: Yee Hing Tong <[email protected]>
@flyte-bot
Copy link
Contributor

flyte-bot commented Feb 11, 2025

Code Review Agent Run #b0d8cc

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: 8368ca8..f9a9b8d
    • flytekit/remote/remote.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

Signed-off-by: Yee Hing Tong <[email protected]>
@wild-endeavor
Copy link
Contributor

@kumare3 pushed a nit to get the monodocs build to pass. should we merge? Keep in mind there is a small breaking change (detailed above). I'm okay with this given the fact that this function is a bit broken to begin with (fast settings not being respected), and the behavior now actually might better align with expectations.

@flyte-bot
Copy link
Contributor

flyte-bot commented Feb 11, 2025

Code Review Agent Run #37064e

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: f9a9b8d..394fc86
    • flytekit/remote/remote.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

@wild-endeavor wild-endeavor merged commit b2dbd24 into master Feb 12, 2025
108 of 110 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants