Hi kedro community I have encountered an issue when working Kedro #questions

Luis Chaves Rodriguez

01/14/2025, 4:36 PM

Hi kedro community!! I have encountered an issue when working with Kedro within a marimo notebook (I think the issue would be just the same in a Jupyter notebook). Basically, I was initially working on my notebook by launching it from the command line in the Kedro project root folder, something like:

marimo edit notebooks/nb.py

where my folder structure is something like:

├── README.md
├── conf
│   ├── base
│   ├── local
├── data ...
├── notebooks
│   ├── nb.py
├── pyproject.toml
├── requirements.txt
├── src ... 
└── tests ...

Within nb.py I have a cell that runs:

from pathlib import Path

from kedro.config import OmegaConfigLoader
from kedro.framework.project import settings
from kedro.io import DataCatalog

conf_loader = OmegaConfigLoader(
    conf_source=Path(__file__).parent / settings.CONF_SOURCE,
    default_run_env="base",
)

catalog = DataCatalog.from_config(conf_loader["catalog"], credentials=conf_loader["credentials"])

and later...

weekly_sales = pl.from_pandas(
    catalog.load("mytable")
)

The issue is that, within the catalog, all the filepaths are relative (e.g. conf/base/sql/somequery.sql, data/mydataset.csv) and assume that whatever uses the catalog is running from the Kedro project root, even though the conf_source argument of the OmegaConfigLoader instance is an absolute path. So if I run my notebook from the root of my Kedro project, all is fine, but if I were to run:

cd notebooks; marimo edit nb.py

then catalog.load will attempt to load the query or dataset from

notebooks/conf/base/sql/somequery.sql

Is it clear? PD: please don't ask me why there is SQL code within the conf folder 😅, it's moving soon
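
To make the behaviour described above concrete, here is a minimal stdlib-only sketch (no Kedro involved) of how the same relative path resolves to different locations depending on the working directory. The helper names resolve_from_cwd and demo are hypothetical, and the temporary folder only stands in for the project layout shown earlier.

```python
import os
import tempfile
from pathlib import Path


def resolve_from_cwd(relative_path: str) -> Path:
    """Return the absolute location a relative path points to from the current working directory."""
    return (Path.cwd() / relative_path).resolve()


def demo() -> tuple[Path, Path]:
    """Resolve the same relative path from the project root and from notebooks/."""
    original_cwd = Path.cwd()
    with tempfile.TemporaryDirectory() as tmp:
        project_root = Path(tmp).resolve()  # stand-in for the Kedro project root
        (project_root / "notebooks").mkdir()
        try:
            os.chdir(project_root)
            from_root = resolve_from_cwd("data/mydataset.csv")
            os.chdir(project_root / "notebooks")
            from_notebooks = resolve_from_cwd("data/mydataset.csv")
        finally:
            # Always restore the working directory before the temp folder is removed.
            os.chdir(original_cwd)
        # Report both results relative to the project root for easy comparison.
        return (
            from_root.relative_to(project_root),
            from_notebooks.relative_to(project_root),
        )


if __name__ == "__main__":
    root_result, notebooks_result = demo()
    print(root_result)
    print(notebooks_result)
```

The second result lands under notebooks/, which matches the symptom reported for cd notebooks; marimo edit nb.py.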

Juan Luis

01/14/2025, 4:40 PM

~~hi @Luis Chaves Rodriguez! I think your message is incomplete? or otherwise could you clarify what the issue is?~~ solved

Luis Chaves Rodriguez

01/14/2025, 4:50 PM

Yes sorry, I pressed Enter by mistake as I was writing it. It's complete now, let me know if it's unclear @Juan Luis. I believe the main issue is how the catalog defines the paths to the files that the catalog items are based on.

👍🏼 1

Luis Chaves Rodriguez

01/14/2025, 5:27 PM

I see that the problem is solved in jupyter notebooks by using magic, but I wonder if there's a magic-free solution?

Luis Chaves Rodriguez

01/14/2025, 5:35 PM

could this be relevant? https://docs.kedro.org/en/stable/_modules/kedro/ipython.html#magic_reload_kedro

Rashida Kanchwala

01/14/2025, 5:36 PM

hi, this is a known issue, and it looks like the solution for now was to improve our error messaging - https://github.com/kedro-org/kedro/issues/3248. Maybe you can raise this issue on GitHub, and we can revisit.

Luis Chaves Rodriguez

01/14/2025, 5:41 PM

but isn't this a solved issue in Jupyter? It should be possible to reproduce in other environments no? Couldn't we get the project root/session/context programmatically just like it happens with the magic?

Juan Luis

01/14/2025, 6:10 PM

the story of relative filepaths in the catalog is a bit tricky unfortunately. indeed, using %load_ext kedro works, but there's not a good magic-free solution. @Luis Chaves Rodriguez one thing you can try is to use runtime parameters. in your dataset:

ds:
  filepath: ${runtime_params:project_root}/data/01_raw/thing.csv

and then you can specify it as follows:

config_loader = OmegaConfigLoader(..., runtime_params={"project_root": Path(...).as_posix()})

the missing bit then is how to find the Path(...) to the project root. https://docs.kedro.org/en/latest/configuration/advanced_configuration.html#how-to-override-configuration-with-[…]rameters-with-the-omegaconfigloader does this make sense?
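
One possible way to fill in that missing bit is to walk up from the notebook's location until a marker file is found. The sketch below is stdlib-only; find_project_root is a hypothetical helper (not a Kedro API), and it assumes that the nearest pyproject.toml marks the project root, which matches the folder structure shown earlier but may not hold for every repository.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, marker: str = "pyproject.toml") -> Optional[Path]:
    """Return the closest directory at or above `start` that contains `marker`, else None."""
    start = start.resolve()
    if start.is_file():
        # Begin the search from the folder that holds the file (e.g. notebooks/).
        start = start.parent
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Search upwards from this script's own location.
    print(find_project_root(Path(__file__)))
```

Assuming a root is found, its as_posix() value could then be passed as the project_root runtime parameter, so the result no longer depends on the directory the notebook was launched from.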

Luis Chaves Rodriguez

01/14/2025, 6:17 PM

that makes sense, so every file, based on its location in the project, would need to have a different Path(...), correct? Would catalog.load respect that? In my example, would it be the following?

conf_loader = OmegaConfigLoader(
    ...,
    default_run_env = "base",
    runtime_params = {"project_root": Path(__file__).parent }
)

If you had to start from scratch how would you fix this? How do other similar projects approach this?

Juan Luis

01/14/2025, 6:35 PM

catalog.load will respect it because it will know nothing about it. you’ll instantiate the catalog from the config loader. the translation happens at that step. so it’s a matter of properly prefixing your file paths in the catalog and then instantiating the config loader with the right runtime_params. you can probably wrap that in a function if you’re using it more than once
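
To illustrate the point that the translation happens before the catalog exists, here is a small stand-in written with plain string substitution. It is not OmegaConf or Kedro code: resolve_placeholders is a hypothetical helper that only mimics the ${runtime_params:...} placeholder from the dataset entry above, and /home/me/project is a made-up project root.

```python
import re
from typing import Any

# Matches placeholders of the form ${runtime_params:some_name}.
_PLACEHOLDER = re.compile(r"\$\{runtime_params:(\w+)\}")


def resolve_placeholders(config: Any, runtime_params: dict[str, str]) -> Any:
    """Recursively replace runtime_params placeholders in a nested config structure."""
    if isinstance(config, dict):
        return {key: resolve_placeholders(value, runtime_params) for key, value in config.items()}
    if isinstance(config, list):
        return [resolve_placeholders(value, runtime_params) for value in config]
    if isinstance(config, str):
        return _PLACEHOLDER.sub(lambda match: runtime_params[match.group(1)], config)
    return config


# Raw catalog entry, as it would be written in the YAML file.
raw_catalog = {"ds": {"filepath": "${runtime_params:project_root}/data/01_raw/thing.csv"}}

# Hypothetical project root; in practice this would be computed, not hard-coded.
resolved_catalog = resolve_placeholders(raw_catalog, {"project_root": "/home/me/project"})

if __name__ == "__main__":
    print(resolved_catalog["ds"]["filepath"])
```

Whatever consumes resolved_catalog only ever sees a plain absolute filepath, which is why nothing on the loading side needs to know about the parameter.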

👌🏼 1

Luis Chaves Rodriguez

01/15/2025, 8:47 AM

What about this?

If you had to start from scratch how would you fix this? How do other similar projects approach this?

Luis Chaves Rodriguez

01/24/2025, 5:52 PM

Hey @Juan Luis why not use the _find_kedro_project function for this? https://github.com/kedro-org/kedro/blob/46259b9f5b89a226d47e2119afb40ad7b4fa5e63/kedro/utils.py#L66

Juan Luis

01/25/2025, 12:14 PM

maybe! @Luis Chaves Rodriguez have you tried it? btw, I just read https://github.com/kedro-org/kedro/issues/4440, thanks for opening it 💯

👍🏼 1

Luis Chaves Rodriguez

01/25/2025, 12:30 PM

I tried it briefly on Friday but the project I’m working on is not properly set up as a Python package, so I got some errors at import. I need to clean up some of how the repo was initially set up by the people that came before me. I’ll report back on this next week

👍🏼 1
