Good morning, we have a question about Kedro dataset factories Kedro #questions

Jacques Vergine

02/11/2025, 11:27 AM

Good morning, we have a question about Kedro dataset factories and were hoping you'd be able to help. I will put the details in the thread to keep this channel tidy 🙂

Hall

02/11/2025, 11:28 AM

Someone will reply to you shortly.

Jacques Vergine

02/11/2025, 11:28 AM

We have a custom dataset defined as

class MyDataset(SparkDataset):

    def __init__(  # noqa: PLR0913
        self,
        *,
        filepath: str,
        table: str
    ):
        ...

We are then trying to use it in our catalog, but this entry was failing

integration.int.{source}.data1:
  type: MyDataset
  filepath: ${globals:integration_source_path}/int/{source}/data1
  table: {source}_data1

with the following error pointing to the `table: {source}_data1` line:

An error has occurred: Invalid YAML or JSON file .../catalog.yml, unable to read line 20, position 17.
                    ERROR    An error has occurred: Invalid YAML or   ....py:212
                             JSON file                                          
                             .../catalog.yml,           
                             unable to read line 20, position 17.

We managed to solve it by putting `{source}` at the end of the table name, like this:

integration.int.{source}.data1:
  type: MyDataset
  filepath: ${globals:integration_source_path}/int/{source}/data1
  table: data1_{source}

Is this expected behaviour, or should we raise it as an issue?

Jitendra Gundaniya

02/11/2025, 11:43 AM

Hi Jacques, YAML gets confused because it sees the leading `{` and tries (and fails) to parse it as a mapping. So either `table: data1_{source}` or `table: "{source}_data1"` should work, and I don't think there's any need to raise an issue.
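
For anyone who wants to reproduce this outside Kedro, here is a minimal sketch using PyYAML directly. It assumes PyYAML is installed, and the `try_parse` helper is only for illustration, it is not part of Kedro. It parses the three variants of the `table` line from this thread and shows that only the unquoted value with a leading `{` fails:

```python
import yaml  # PyYAML; assumed to be installed (pip install pyyaml)


def try_parse(snippet: str):
    """Return the parsed YAML document, or the YAMLError raised while parsing.

    Hypothetical helper for this demonstration only.
    """
    try:
        return yaml.safe_load(snippet)
    except yaml.YAMLError as exc:
        return exc


# Unquoted value starting with "{": YAML opens a flow mapping, closes it
# at "}", and then cannot make sense of the trailing "_data1".
unquoted_leading = try_parse("table: {source}_data1")

# Unquoted value starting with a letter: the whole value is a plain string,
# braces included.
unquoted_trailing = try_parse("table: data1_{source}")

# Quoted value: always a string, so the leading "{" is harmless.
quoted_leading = try_parse('table: "{source}_data1"')

for label, result in [
    ("unquoted, leading placeholder", unquoted_leading),
    ("unquoted, trailing placeholder", unquoted_trailing),
    ("quoted, leading placeholder", quoted_leading),
]:
    print(f"{label}: {result!r}")
```

In both working variants the `{source}` placeholder reaches the dataset factory as literal text, so quoting lets you keep the original `{source}_data1` table name.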

Jacques Vergine

02/11/2025, 12:00 PM

Thanks a lot, I'll try with the double quotes to see if it works!

👍 1

Jacques Vergine

02/11/2025, 3:59 PM

it worked, thanks again 🙂
