Hugo Evers
05/31/2024, 2:13 PM
dependancy_figure:
  type: kedro_mlflow.io.artifacts.MlflowArtifactDataset
  dataset:
    type: kedro_datasets.plotly.JSONDataset
    filepath: ....
However, when testing whether this would give the same result as mlflow.log_figure(),
I got issues with kedro assuming the dataset is versioned while the s3 bucket is not.
Anyway, I’d propose to make a dataset to do this.
Would the authors be open to a PR?
And if so, do you have opinions on the implementation/naming? Should I subclass the MlflowArtifactDataset, or another class?
Nok Lam Chan
05/31/2024, 2:16 PM
"However, when testing whether this would give the same result as mlflow.log_figure(), I got issues with kedro assuming the dataset is versioned while the s3 bucket is not."
Can you show the error?
Hugo Evers
05/31/2024, 2:24 PM
shap_dependency_figure:
  type: kedro_mlflow.io.artifacts.MlflowArtifactDataset
  dataset:
    type: kedro_datasets.plotly.JSONDataset
    filepath: s3://aw-science/performance_optimisation/${oc.env:ENV}/${oc.env:CLIENT_NAME}/data/08_reporting/shap_dependency.json
code:
import plotly.express as px
fig = px.bar(x=["a", "b", "c"], y=[1, 3, 2])
catalog.save("shap_dependency_figure", fig)
error:
DatasetError: Cannot save versioned dataset 'shap_dependency.json' to
'aw-science/performance_optimisation/dev/Reed/data/08_reporting' because a file with the same name already exists
in the directory. This is likely because versioning was enabled on a dataset already saved previously. Either
remove 'shap_dependency.json' from the directory or manually convert it into a versioned dataset by placing it in a
versioned directory (e.g. with default versioning format
'aw-science/performance_optimisation/dev/Reed/data/08_reporting/shap_dependency.json/YYYY-MM-DDThh.mm.ss.sssZ/shap_dependency.json').
Hugo Evers
05/31/2024, 2:25 PM
Hugo Evers
05/31/2024, 2:27 PM
shap_dependency_figure:
  type: kedro_datasets.plotly.JSONDataset
  filepath: s3://aw-science/performance_optimisation/${oc.env:ENV}/${oc.env:CLIENT_NAME}/data/08_reporting/shap_dependency.json
it saves just fine
Hugo Evers
05/31/2024, 2:27 PM
mlflow.log_figure(fig, "shap_dependency_figure.html")
also, it logs just fine, and i can view the plot in mlflow
Juan Luis
05/31/2024, 2:28 PM
Juan Luis
05/31/2024, 2:28 PM
rm -r aw-science/performance_optimisation/dev/Reed/data/08_reporting
(or make a backup somewhere else)
Hugo Evers
05/31/2024, 2:29 PM
Hugo Evers
05/31/2024, 2:29 PM
Hugo Evers
05/31/2024, 2:30 PM
log_figure?
Or does the mlflow artefact dataset do the same thing?
Juan Luis
05/31/2024, 2:39 PM
"but not that folder"
hmm okay, I see it now: https://kedro-mlflow.readthedocs.io/en/stable/source/04_experimentation_tracking/03_version_datasets.html#how-to-version-data-in-a-kedro-project
Juan Luis
05/31/2024, 2:39 PM
# must be a local file, wherever you want to log the data in the end
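For context on the comment quoted above: as far as I understand it, the wrapper first lets the underlying dataset write a local file and then logs that file to MLflow as an artifact. A minimal sketch of the same two-step flow done by hand, assuming a Plotly figure. The helper name `log_figure_via_local_file` and its `log_artifact` parameter are made up for illustration and are not part of kedro-mlflow; only `fig.write_html` and `mlflow.log_artifact` are real calls.

```python
import os
import tempfile
from typing import Any, Callable, Optional


def log_figure_via_local_file(
    fig: Any,
    filename: str,
    log_artifact: Optional[Callable[[str], None]] = None,
) -> None:
    """Write `fig` to a local temporary file, then log that file as an artifact.

    Hypothetical helper: `fig` is assumed to be a Plotly figure (anything
    exposing a `write_html(path)` method will work).
    """
    if log_artifact is None:
        # Imported lazily so the sketch can be exercised without MLflow installed.
        import mlflow

        log_artifact = mlflow.log_artifact

    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, filename)
        fig.write_html(local_path)  # step 1: the file exists locally
        log_artifact(local_path)  # step 2: the local file is logged to the run
```

The point of the sketch is only that the path handed to the saving step is a local one; where the artifact ends up is decided by the MLflow tracking setup, not by that path.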
Hugo Evers
05/31/2024, 2:39 PM
shap_dependency_figure:
  type: kedro_mlflow.io.artifacts.MlflowArtifactDataset
  dataset:
    type: kedro_datasets.plotly.JSONDataset
    filepath: data/08_reporting/shap_dependency.html
The thing is, regardless of the file extension (html, json, etc.), in mlflow you will see json and not a figure.
Which makes sense because the underlying dataset is saving json
Juan Luis
05/31/2024, 2:39 PM
filepath: s3 with MlflowArtifactDataset isn't supported
Juan Luis
05/31/2024, 2:40 PM
"The thing is, regardless of the file extension (html, json, etc.), in mlflow you will see json and not a figure. Which makes sense because the underlying dataset is saving json"
yup I see it now. pinging @Yolan Honoré-Rougé but maybe it's better that you continue in that discussion you pointed out https://github.com/Galileo-Galilei/kedro-mlflow/discussions/338
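A rough sketch of what the dataset proposed earlier in the thread might look like, so that the figure is sent through `mlflow.log_figure` instead of being saved as JSON. This is not existing kedro-mlflow API: the class name `MlflowFigureDataset` and the `log_fn` parameter are hypothetical, and a real implementation would presumably subclass `kedro.io.AbstractDataset` and implement `_save`, `_load` and `_describe` rather than being a plain class.

```python
from typing import Any, Callable, Optional


class MlflowFigureDataset:
    """Hypothetical save-only dataset that forwards a figure to mlflow.log_figure."""

    def __init__(
        self,
        artifact_file: str,
        log_fn: Optional[Callable[[Any, str], None]] = None,
    ) -> None:
        # e.g. "shap_dependency_figure.html", the name used with log_figure above
        self._artifact_file = artifact_file
        self._log_fn = log_fn

    def save(self, figure: Any) -> None:
        log_fn = self._log_fn
        if log_fn is None:
            # Imported lazily so the sketch can be exercised without MLflow installed.
            import mlflow

            log_fn = mlflow.log_figure
        # No local path to manage here: MLflow serialises the figure itself.
        log_fn(figure, self._artifact_file)

    def load(self) -> Any:
        raise NotImplementedError("This sketch only supports saving.")
```

With MLflow installed and a run active, `MlflowFigureDataset("shap_dependency_figure.html").save(fig)` would be the equivalent of the direct `mlflow.log_figure(fig, "shap_dependency_figure.html")` call shown earlier.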
Juan Luis
05/31/2024, 2:40 PM
Hugo Evers
05/31/2024, 2:41 PM
kedro_mlflow.io.artifacts.MlflowArtifactDataset
or adding a MlflowFigure dataset
Hugo Evers
05/31/2024, 2:42 PM
Hugo Evers
05/31/2024, 2:42 PM
Yolan Honoré-Rougé
06/04/2024, 8:56 PM
log_figure because wrapping with MlflowArtifactDataset is supposed to cover all use cases, but I am totally open to a PR if you want to avoid dealing with local paths manually.
Hugo Evers
06/05/2024, 9:08 AM