# questions
k
Hi, I have a dataset factory specified, and when I do
catalog.list()
I don't see the entries that match the factory (so in the end I don't know that they exist at all). However,
catalog.load("<name>")
works. Is this the intended behaviour?
👍 1
d
which version of Kedro are you running?
k
0.19.5
a
Hey Kacper, yeah, the factory datasets are not eagerly registered in the catalog; they are only registered the first time they are loaded or checked for existence.
👍 2
d
@Ankita Katiyar do you think it makes sense to provide a way of triggering this?
i.e. something like this
catalog.list(eager=True)
k
I see, now that I've loaded them I can see them when I do
catalog.list()
Is there a way to list them programmatically before loading at the moment?
a
The workaround is:
for dataset in pipelines["__default__"].datasets():
    catalog.exists(dataset)
There is a ticket for this that we haven’t gotten around to yet: https://github.com/kedro-org/kedro/issues/3312
👍 1
👍🏼 1
k
Can confirm that doing this:
for dataset in pipelines["__default__"].datasets():
    catalog.exists(dataset)
makes the entries show up in the
catalog.list()
a
(might throw a warning if the dataset does not exist just yet)
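To make the behaviour described in this thread concrete, here is a minimal toy model in plain Python. It is not Kedro's implementation: `LazyCatalog`, the `{name}_csv` pattern and the dataset names are all hypothetical, and `exists` here only models the registration side effect, not a check on the underlying data. It just shows why the entries are missing from the listing until each name has been touched once.

```python
import re


class LazyCatalog:
    """Toy stand-in for a catalog with dataset factories (hypothetical, not Kedro's class)."""

    def __init__(self, patterns):
        # Maps a factory pattern such as "{name}_csv" to a filepath template.
        self._patterns = patterns
        # Only names that have been resolved at least once end up here.
        self._datasets = {}

    def _resolve(self, name):
        """Return the filepath for the first pattern matching `name`, else None."""
        for pattern, template in self._patterns.items():
            # Turn "{name}_csv" into the regex "(?P<name>.+)_csv".
            regex = re.sub(r"\{(\w+)\}", r"(?P<\1>.+)", pattern)
            match = re.fullmatch(regex, name)
            if match:
                return template.format(**match.groupdict())
        return None

    def exists(self, name):
        """Register `name` if a pattern matches; report whether it is now known."""
        if name not in self._datasets:
            filepath = self._resolve(name)
            if filepath is None:
                return False
            self._datasets[name] = filepath
        return True

    def list(self):
        """List only the entries registered so far."""
        return sorted(self._datasets)


catalog = LazyCatalog({"{name}_csv": "data/{name}.csv"})
before = catalog.list()  # nothing has been resolved yet

# Same shape as the workaround: touch every dataset name the pipeline uses.
pipeline_datasets = ["reviews_csv", "companies_csv"]  # hypothetical names
for dataset in pipeline_datasets:
    catalog.exists(dataset)

after = catalog.list()  # both names are now registered
```

In this sketch the pattern is never expanded on its own, because the catalog cannot know which concrete names match until something asks for one. That is why the loop over the pipeline's dataset names is needed.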
k
Great! Thanks for the help 🙌
I'm up for
catalog.list(eager=True)
, but if it's possible I would make that the default
👍 1
It would be the most convenient for the end user, imo
👍 1
a
Would you mind adding a comment to the issue so we have your use case recorded? 😄
k
yeah sure!
m
It depends though, if you have a huge catalog, loading all datasets eagerly might not be what you want. It could be very time and memory consuming.
n
@Merel Should we put this back into the inbox? The priority of this wasn't high originally, but we have seen multiple requests
m
In that case, let's just up the priority. No need to re-discuss in grooming 🙂
👍🏼 1
👍 1