# questions
a
Hi team, quick question: is it possible to use the catalog from one environment (let's say 'prod') when running a pipeline in another environment (let's say 'prod2')? CC @Jose Luis Lavado Sánchez
l
Hi Alex, what exactly is your use-case?
I submitted this issue a few days ago. It proposes a `--from-env` flag that allows reading from one env and writing to another. Not sure if this is what you're looking for: https://github.com/kedro-org/kedro/issues/4155
If you wish to extend your catalog entries, you could look into codifying shared entries in `base` and overriding entries in another env, e.g. `cloud`. Kedro will load both `base` and `cloud`, and give priority to the configuration in `cloud`.
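(For illustration: a minimal sketch of how that layered lookup can be inspected programmatically, assuming Kedro's `OmegaConfigLoader` and a standard `conf/` layout; the env name `cloud` is just an example.)

```python
from kedro.config import OmegaConfigLoader

# Entries in conf/cloud/catalog.yml override same-named entries in
# conf/base/catalog.yml when both environments are loaded.
loader = OmegaConfigLoader(
    conf_source="conf",
    env="cloud",              # the overriding environment
    base_env="base",
    default_run_env="local",  # fallback when no env is given
)
catalog_config = loader["catalog"]  # merged dict: dataset name -> dataset config
```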
j
At execution time we use the context to access a key vault and set the needed creds; in this context we use the env variable to know which creds are the correct ones. So there are two envs that share a catalog (there are others in the project) but need different credentials, for example the same query structure but a different database.
Maybe the `--from-env` is the solution, not sure.
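(A rough sketch of the credential-selection pattern described above, assuming an `after_context_created` hook that reads `context.env`; `VAULT_URLS` and `fetch_secrets` are hypothetical stand-ins for the real key-vault client.)

```python
from kedro.framework.hooks import hook_impl

# Hypothetical mapping of Kedro environment -> key-vault endpoint.
VAULT_URLS = {
    "prod": "https://prod-vault.example.net",
    "prod_A": "https://prod-a-vault.example.net",
}

def fetch_secrets(vault_url: str) -> dict:
    """Hypothetical stand-in for a real key-vault client call."""
    raise NotImplementedError

class KeyVaultCredentialsHook:
    @hook_impl
    def after_context_created(self, context):
        # The active env decides which vault, and therefore which creds, to use.
        secrets = fetch_secrets(VAULT_URLS[context.env])
        # Inject the fetched secrets as runtime credentials.
        context.config_loader["credentials"] = secrets
```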
l
I think `base` and a custom environment should do the trick, no? Codify all your entries in `base`, and for your custom structure override them in another env.
j
Not sure, because there are other environments in the project that have the same datasets in the catalog but with other queries, config, etc. But maybe it could work just through the precedence order?
l
aha you have more envs
alternatively, you could implement support for selecting multiple environments with a priority order
it's something I've been thinking about for a while as well
j
Yes. What is the default precedence order in Kedro? For example, if I run env `dev`, will it look in `/conf/dev` and, if it doesn't find the dataset names there, look for them in `/conf/base`? If that's the case, I can make it work with that.
l
though the `--from-env` flag would cover your use-case already as well; you would only be reading stuff from this env. What the flag does is: it loads the `--from-env` catalog and attempts to override all input datasets of the selected pipeline (or selection of nodes) to use the catalog entries from the `--from-env`. (It currently errors out if an input dataset does not exist in the `--from-env`, but you could choose to skip the error and default to the catalog entry from the `--env`.)
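(Since `--from-env` is only proposed in issue #4155, the following is a conceptual sketch of the override it describes, not an existing Kedro API; the function and argument names are made up.)

```python
def apply_from_env(
    run_catalog: dict,       # catalog entries from the --env environment
    source_catalog: dict,    # catalog entries from the --from-env environment
    pipeline_inputs: set,    # free inputs of the selected pipeline or nodes
) -> dict:
    """Rewire pipeline inputs to read from the source environment's catalog."""
    merged = dict(run_catalog)
    for name in pipeline_inputs:
        # The proposal errors out when an input is missing from the source
        # env; a softer variant could fall back to run_catalog[name] instead.
        merged[name] = source_catalog[name]
    return merged
```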
no, it's essentially a dictionary, so the YAML dicts are merged in very much the same fashion Python merges dicts
but it merges the base env, the local env, and the selected env
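(A toy illustration of that merge with made-up entries, not Kedro internals: later environments win on duplicate top-level keys, exactly like Python dict merging.)

```python
base = {"companies": {"type": "pandas.CSVDataset", "filepath": "data/companies.csv"}}
local = {}  # typically machine-local overrides
prod_A = {"companies": {"type": "pandas.SQLQueryDataset", "sql": "SELECT ..."}}

# base < local < selected env: the last dict wins for duplicate keys.
catalog = {**base, **local, **prod_A}
assert catalog["companies"]["type"] == "pandas.SQLQueryDataset"
```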
j
So, if I set for example `--from-env prod --env prod_A`, in the context I will see that the environment is `prod_A`, but I will get the catalog from `/conf/prod/catalog.yml`?
l
it will only use the entries from `prod` for the INPUTS of your pipeline
all others will use `prod_A`
j
Okay perfect, that's all I need, thanks!
l
check it out! can drop notes on the issue as well
m
@Jose Luis Lavado Sánchez To get back to your comment earlier ("if I run env `dev`, will it look in `/conf/dev` and, if it doesn't find the dataset names there, look for them in `/conf/base`?"): this is exactly how Kedro works, as described here: https://docs.kedro.org/en/stable/configuration/configuration_basics.html#configuration-environments You can use settings to specify what your default overriding environment and base environment should be if you want them to be different from "base" and "local": https://docs.kedro.org/en/stable/configuration/configuration_basics.html#how-to-change-the-default-overriding-environment
j
Thank you, that solves my problem
👍 1