Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Python regressions in 1.9.0beta #857

Merged
merged 2 commits into from
Nov 26, 2024
Merged

Fix Python regressions in 1.9.0beta #857

merged 2 commits into from
Nov 26, 2024

Conversation

benc-db
Copy link
Collaborator

@benc-db benc-db commented Nov 26, 2024

Description

Fixing two regressions related to Command submission of python models.

1.) Exceptions were not properly getting propagated to the user
2.) I had accidentally brought back the race condition that the community had already fixed for us.

Checklist

  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt-databricks next" section.

if self.status(cluster_id) not in ["RUNNING", "PENDING"]:
raise DbtRuntimeError(f"Error starting terminated cluster.\n {response.content!r}")
else:
logger.debug("Presuming race condition, waiting for cluster to start")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we query state after error, if it's pending or running it means the start failed due to race condition as another thread got the cluster started.

Copy link
Collaborator

@jackyhu-db jackyhu-db Nov 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be info rather than debug or debug with status_code?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything that is info will show in the normal dbt output, so we are very conservative about what we log with info.
This is normal operation; if they are using multiple python models and the command api they will hit this, so I don't think this is worth bringing to the users' attention if it's running or pending. For whatever reason, the cluster start API errors if you ask to start a cluster that is already in the process of starting. If it's not running or pending, they'll get the full output from raising the error.

)
).json()

if response["results"]["resultType"] == "error":
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lost this in the refactor: for some reason Command exec will give a state of 'Finished' rather then 'Error' some times, and then stuff the error in the results.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if the result or resultType does not exist

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are published parts of the Databricks REST API. If these don't exist, the entire API becomes untrustworthy.

@@ -85,7 +85,7 @@
SHOW_TABLE_EXTENDED_MACRO_NAME = "show_table_extended"
SHOW_TABLES_MACRO_NAME = "show_tables"
SHOW_VIEWS_MACRO_NAME = "show_views"
GET_COLUMNS_COMMENTS_MACRO_NAME = "get_columns_comments"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed this wasn't referenced anywhere when doing my debugging.

@@ -70,6 +71,8 @@ def __init__(

@override
def submit(self, compiled_code: str) -> None:
logger.debug("Submitting Python model using the Command API.")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding debug statement so that we can quickly determine submission method from logs.

@@ -0,0 +1,27 @@

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved these to somewhere more sensible (during investigation realized these were in persist_docs, but are sometimes called outside of persisting docs).

Copy link
Contributor

@alexguo-db alexguo-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@benc-db benc-db merged commit 6398033 into main Nov 26, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants