Support mlflow.log_table #634

qwe313 · 2025-02-16T14:28:42Z

Description

To manually evaluate some outputs of a run in the Evaluation tab in the MLflow UI, you need to log data using the mlflow.log_table method, which is not currently supported by any of the Dataset implementations in kedro-mlflow.

Context

I wanted to be able to view the table I created in one of the nodes in the Evaluation tab to compare it with other runs.
I've tried logging JSON as an artifact using MlflowArtifactDataset and pandas.JSONDataset with orient: "split" (since this is the same format MLflow uses when logging artifacts with mlflow.log_table).

As a workaround, I manually call mlflow.log_table from inside the node at the end of execution, but I am not sure if this is reliable and will always log data to the correct run, as I was unable to retrieve the run_id of the current run from inside the node.
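For reference, the workaround can be sketched roughly as below. This is only an illustration: the function name `log_evaluation_table` and the default artifact path are made up, and it assumes the kedro-mlflow hook has already started a run in the same process, so that `mlflow.active_run()` returns it. With a parallel runner or a different threading setup that assumption may not hold.

```python
from typing import Any, Dict


def log_evaluation_table(
    table: Dict[str, Any],
    artifact_file: str = "evaluation/table.json",  # hypothetical path
) -> str:
    """Log `table` to the active MLflow run and return that run's ID."""
    # Local import keeps the node module importable where MLflow is absent.
    import mlflow

    # Assumption: a run was started earlier in this process (e.g. by a hook).
    run = mlflow.active_run()
    if run is None:
        raise RuntimeError("No active MLflow run; refusing to log the table.")

    run_id = run.info.run_id
    # Passing run_id explicitly makes the target run unambiguous.
    mlflow.log_table(data=table, artifact_file=artifact_file, run_id=run_id)
    return run_id
```

Raising when no run is active, instead of letting MLflow create a new one implicitly, at least makes a mismatch visible rather than silently logging to the wrong run.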

Possible Implementation

Probably an additional Dataset that uses mlflow.log_table instead of mlflow.log_artifact.

Possible Alternatives

Manual logging from inside the node, but this is probably not very reliable.

github-project-automation bot added this to kedro-mlflow roadmap Feb 16, 2025

github-project-automation bot moved this to 🆕 New in kedro-mlflow roadmap Feb 16, 2025

Galileo-Galilei moved this from 🆕 New to 🔖 Ready in kedro-mlflow roadmap Feb 16, 2025

Galileo-Galilei added the enhancement New feature or request label Feb 16, 2025

Galileo-Galilei removed the status in kedro-mlflow roadmap Feb 17, 2025

Galileo-Galilei moved this to 🔖 Ready in kedro-mlflow roadmap Feb 17, 2025
