#questions

Aspen Olszewska

03/14/2024, 2:50 PM
Hi everyone! First, thanks for reading my question. Now, onto it. I'm developing a kedro-pandera plugin to add data validation (I might open-source it once it has worked fine for some time on my company's end). In one of the commands I get the Kedro context through the session:
with KedroSession.create(metadata.package_name, project_path, env=env) as session:
    context = session.load_context()
I'm passing the env here because I iterate through the environments to create schema files for the datasets selected by the args passed to the command. Fast forward, I try to load a dataset from the Data Catalog:
dataset = context._get_catalog(save_version="")._get_dataset(dataset_name)
And here I experience an error:
$ kedro pandera init --env base

(...)

/usr/local/Caskroom/miniconda/base/envs/liquidity-prediction-env/lib/python3.10/site-packages/kedro/io/data_catalog.py:50 in _get_credentials

KeyError: "Unable to find credentials '<redacted>': check your data catalog and credentials configuration. See <https://kedro.readthedocs.io/en/stable/kedro.io.DataCatalog.html> for an example."
It seems that when I created the session, context, and then the catalog, none of them loaded conf/local/credentials.yml. Why is that? Is it on purpose (to prevent plugins from stealing credentials) or am I doing something wrong? Why does it work when the session, context and catalog are created in the project itself while running kedro run? I'm using kedro==0.18.4.

Juan Luis

03/14/2024, 2:51 PM
hi @Aspen Olszewska! sorry for the digression and not directly answering your question - I don't want to discourage you from creating your plugin, but have you seen https://github.com/galileo-Galilei/kedro-pandera ?
👍 1

Aspen Olszewska

03/14/2024, 2:53 PM
yes, I have. I created mine around two years ago, and I've come back to it now as I need data validation again. I developed kedro-mlflow with galileo-galilei in the early days.
👋 1
still, I'd like to know the reason for the discrepancy between session/context/catalog creation when running kedro run vs a plugin command
I didn't use any credentials back then, when I created the plugin, and now I do
@Juan Luis I have looked at the source code of the galileo's plugin and mine is more advanced/more developed tbh
👍🏼 1
👍 1

Juan Luis

03/14/2024, 3:05 PM
exciting 🔥

Nok Lam Chan

03/14/2024, 6:27 PM
Hey, not sure if I have it right. I think the problem is that Kedro always runs with base + env (default: local)
What you want seems to be local + a custom env, if I understand correctly. If you pass in env="local", are the credentials loaded properly?
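For illustration, a minimal plain-Python sketch of the "base + env" layering described above. It does not use Kedro's own loader; the environment contents and the effective_config helper are made up for the example.

```python
from typing import Dict, Optional

# Hypothetical contents of each conf/<env>/ folder, flattened to dicts.
CONF: Dict[str, Dict[str, str]] = {
    "base": {"catalog": "base catalog"},
    "local": {"credentials": "local credentials"},
    "prod": {"catalog": "prod catalog"},
}


def effective_config(
    env: Optional[str] = None,
    base_env: str = "base",
    default_run_env: str = "local",
) -> Dict[str, str]:
    """Overlay the run environment on top of the base environment."""
    run_env = env or default_run_env
    merged = dict(CONF[base_env])  # lowest priority
    merged.update(CONF[run_env])   # the run env wins on duplicate keys
    return merged


print(effective_config())            # base + local: credentials present
print(effective_config(env="base"))  # base + base: no credentials
print(effective_config(env="prod"))  # base + prod: no credentials either
```

Under this model, passing an explicit env replaces local as the second layer, which would explain why credentials kept only in conf/local are not found.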

Aspen Olszewska

03/14/2024, 6:28 PM
let me see

Nok Lam Chan

03/14/2024, 6:31 PM
and re: kedro-pandera, we started developing it last year but I don't think anyone is using it actively. happy to contribute to something that already exists if it's open source. Does it support the features described in https://github.com/Galileo-Galilei/kedro-pandera/issues?

Aspen Olszewska

03/14/2024, 6:31 PM
ok, so that does do the trick of loading the credentials AND the base's Data Catalog
but now it saves the schemas in conf/local/...

Nok Lam Chan

03/14/2024, 6:33 PM
Where do you expect it to save? And what does the code look like? I guess this comes down to your implementation

Aspen Olszewska

03/14/2024, 6:33 PM
I would need to load the env passed through the command line and overlay local on top
yes, it is my implementation
it's not open source yet
I need to run it through my org first to see if I have a green light, but that shouldn't be a problem
👍🏼 1
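One way to approximate this without changing settings.py, sketched for Kedro 0.18.x (the thread mentions kedro==0.18.4): load the catalog configuration from the requested env and the credentials from local, then build the catalog by hand. The helper name catalog_with_local_credentials and the conf/ location are assumptions, and the glob patterns are meant to mirror the 0.18 defaults. Treat it as a starting point rather than a verified recipe; unlike the project context, it does not convert relative dataset paths to absolute ones.

```python
from pathlib import Path


def catalog_with_local_credentials(project_path: Path, env: str):
    """Build a DataCatalog from base + env, with credentials from base + local.

    Hypothetical helper; needs Kedro 0.18.x, imported lazily so the sketch
    can be loaded without Kedro installed.
    """
    from kedro.config import ConfigLoader
    from kedro.io import DataCatalog

    conf_source = str(project_path / "conf")  # assumed default conf location

    # Catalog entries: base overlaid with the env passed on the command line.
    catalog_conf = ConfigLoader(conf_source=conf_source, env=env).get(
        "catalog*", "catalog*/**", "**/catalog*"
    )
    # Credentials: base overlaid with local, regardless of the target env.
    credentials = ConfigLoader(conf_source=conf_source, env="local").get(
        "credentials*", "credentials*/**", "**/credentials*"
    )
    return DataCatalog.from_config(catalog_conf, credentials)
```

The schema files could then still be written under conf/<env>/, since the target env stays separate from the place the credentials come from.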

Nok Lam Chan

03/14/2024, 6:35 PM
This might be useful for you, but long story short, the Kedro default is base + local (which you can override via the CLI). There is a long discussion thread about this that I cannot find right now.
CONFIG_LOADER_ARGS = {
    "base_env": "base",
    "default_run_env": "local",
    # "config_patterns": {
    #     "spark": ["spark*/"],
    #     "parameters": ["parameters*", "parameters*/**", "**/parameters*"],
    # }
}
If you start a new kedro project, you will find this in settings.py

Aspen Olszewska

03/14/2024, 6:35 PM
as for the features, I cannot tell. I didn't read any of the issues in Yolan's repo

Nok Lam Chan

03/14/2024, 6:35 PM
so what "base" is, is up to you, and you can change it
though I rarely see people do it, but it's possible

Aspen Olszewska

03/14/2024, 6:36 PM
huh, interesting
thanks for the pointer
and for the help with the config loading part!

Nok Lam Chan

03/14/2024, 6:37 PM
for your case you just need to flip the config
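A sketch of what flipping could look like in settings.py, assuming it means swapping the two values from the default snippet. This is one reading of the suggestion, not a confirmed recipe.

```python
# settings.py (sketch): swap the roles of "base" and "local".
# Assumption: conf/local then acts as the always-loaded base layer, and
# conf/base becomes the default run environment when --env is not given.
CONFIG_LOADER_ARGS = {
    "base_env": "local",
    "default_run_env": "base",
}
```

With this arrangement conf/local is always read, but anything defined in the env passed via --env takes priority over it.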

Aspen Olszewska

03/14/2024, 6:37 PM
yup
I wonder how it works out of the box for regular kedro run though...

Nok Lam Chan

03/14/2024, 6:38 PM
It would work too, but now you have to keep in mind that what you call "local" is the "base" everywhere Kedro refers to it
local will have a lower priority than the custom env
which may be undesired

Aspen Olszewska

03/14/2024, 6:39 PM
yeah I mean credentials loading - it works right out of the box without flipping the base env name
by default it runs base but local is loaded as well
I wonder how that happens
might be best to replicate that

Nok Lam Chan

03/14/2024, 6:40 PM
so out of the box Kedro runs base + local; if you don't specify anything, local will be loaded
what doesn't work after you flip it?
maybe I got a bit lost

Aspen Olszewska

03/14/2024, 6:41 PM
it does work after flipping
but I mean pure kedro
pure kedro run
without any plugin
when you run kedro run --env base it still loads credentials from local
👀 1
with default settings
anyways, new thing to figure out! I'll go through the list of issues in Yolan's repo tomorrow and run open-sourcing my plugin by someone in my company. thank you for your help @Nok Lam Chan!
👍🏼 1

Nok Lam Chan

03/14/2024, 6:46 PM
I see what you mean, I think I missed something but I gotta go now. I'll come back to this tmr, but let's see if someone beats me to it
I checked: kedro run --env base won't read local, can you double check?