Is it possible to save datasets to a directory (by...
# questions
s
Is it possible to save datasets to a directory (by having the dataset name be the output of a node) without manually defining dataset names in catalog.yaml? I am asking because there was a handy Kedro plug-in, kedro-wings; unfortunately, its last commit is from 3 years ago and it is probably outdated for modern Kedro. The feature itself was very handy: the plug-in would infer the dataset types from the input and output names (and so use the appropriate dataset class to load and save), e.g.:
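# split_data is the user's own function; with kedro-wings, the dataset
# type was inferred from the file extension in each input/output name.
node(
    split_data,
    inputs=["01_raw/iris.csv", "params:example_test_data_ratio"],
    outputs=dict(
        train_x="02_intermediate/example_train_x.csv",
        train_y="02_intermediate/example_train_y.csv",
    ),
)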
Kedro-wings would automatically populate catalog.yaml. Concrete use case: training runs with 10 plots and 3 txt log files. Currently, I would have to manually define 13 datasets in catalog.yaml and maintain them by hand whenever I add or remove a node that saves data to disk. Is something like this possible today, or is there a different recommended way of solving this?
m
Similar functionality can be achieved with dataset factories https://docs.kedro.org/en/stable/data/kedro_dataset_factories.html
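For the use case above (10 plots and 3 txt log files), a rough sketch of what this could look like in catalog.yaml is below. The pattern names ({name}_plot, {name}_log) and the file paths are made up for illustration, and the dataset type names are assumptions that depend on the installed kedro-datasets version (the matplotlib dataset class in particular has been renamed between releases), so check them against your setup.

```yaml
# Dataset factory patterns: any node output whose name matches a pattern
# is resolved to a catalog entry, with the placeholder filled in.
# All names and paths below are hypothetical examples.

# Matches outputs such as "loss_curve_plot" -> data/08_reporting/loss_curve.png
"{name}_plot":
  type: matplotlib.MatplotlibWriter  # assumption: class name varies by kedro-datasets version
  filepath: data/08_reporting/{name}.png

# Matches outputs such as "training_log" -> data/08_reporting/training.txt
"{name}_log":
  type: text.TextDataset
  filepath: data/08_reporting/{name}.txt
```

With two patterns like these, the 13 individual entries would not be needed: adding or removing a node only requires its output name to follow the naming convention (for example, outputs="loss_curve_plot").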
s
Thanks, will take a look at it.