Guillaume Tauzin
02/04/2025, 8:36 AM
`kedro catalog list` will automatically resolve them (for a given pipeline - see this bit of code) while doing `catalog.list()` in a kedro jupyter notebook will just list non-factory datasets (and parameters). Are those two returning different outputs by design or is it a bug?
Thanks!
Hall
02/04/2025, 8:36 AM
Guillaume Tauzin
02/04/2025, 8:38 AM
datajoely
02/04/2025, 9:06 AM
catalog.list(Pipeline.inputs() | Pipeline.outputs())
Guillaume Tauzin
02/04/2025, 9:16 AM
from kedro.framework.project import pipelines
pipeline = pipelines.get("__default__")
catalog.list(pipeline.inputs() | pipeline.outputs())
returns
AttributeError: 'set' object has no attribute 'strip'
Seems like regex_search is supposed to be a string?
If I pass `regex_search=".*KWD.*"`, where KWD is part of the name of one of my factored datasets, it also does not find it.
Ankita Katiyar
02/04/2025, 10:33 AM
catalog.list()
(Discussion in https://github.com/kedro-org/kedro/issues/3312)
With the new catalog you can do -
catalog["<dataset_name>"]
And it’ll resolve and get you the factory dataset
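from kedro.framework.project import pipelines

for dataset in pipelines['__default__'].datasets():
    # Touching each dataset makes the catalog resolve its factory pattern
    catalog.exists(dataset)  # or catalog.get_dataset(dataset)
# now it'll show up
catalog.list()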
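The behavior described in this thread can be sketched with a toy catalog using only the standard library. This is not Kedro's implementation: `ToyCatalog`, its methods, and the dataset names are hypothetical stand-ins. It assumes only what the thread describes: `list()` shows registered names, `regex_search` has to be a string, and checking a name against a factory pattern registers it.

```python
import re


class ToyCatalog:
    """Toy stand-in for a catalog with dataset factories (not Kedro's code)."""

    def __init__(self, explicit, patterns):
        self._datasets = set(explicit)   # explicitly declared dataset names
        self._patterns = list(patterns)  # factory patterns such as "{name}_csv"

    @staticmethod
    def _to_regex(pattern):
        # Turn "{name}_csv" into a regex; each {placeholder} matches 1+ characters.
        parts = re.split(r"\{\w+\}", pattern)
        return re.compile("^" + ".+".join(re.escape(p) for p in parts) + "$")

    def exists(self, name):
        # Resolving a name against a pattern registers it as a side effect.
        if name in self._datasets:
            return True
        if any(self._to_regex(p).match(name) for p in self._patterns):
            self._datasets.add(name)
            return True
        return False

    def list(self, regex_search=None):
        # Only registered names are listed; the filter must be a string.
        if regex_search is None:
            return sorted(self._datasets)
        if not isinstance(regex_search, str):
            raise TypeError("regex_search must be a string")
        return sorted(n for n in self._datasets if re.search(regex_search, n))


catalog = ToyCatalog(explicit=["companies"], patterns=["{name}_csv"])
pipeline_datasets = {"companies", "reviews_csv", "shuttles_csv"}  # hypothetical

before = catalog.list()          # factory datasets are not registered yet
for name in pipeline_datasets:
    catalog.exists(name)         # resolves and registers matching names
after = catalog.list()

print(before)  # ['companies']
print(after)   # ['companies', 'reviews_csv', 'shuttles_csv']
```

In this sketch a regex filter cannot find a factory dataset before resolution either, because it only searches names that are already registered.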
Guillaume Tauzin
02/04/2025, 1:36 PM