# questions
c
Hey fellas, I have 100 pipelines, each with its own namespace, created in a loop. Each namespace has its own input, which is stored in a common table in MySQL. How do I pass those inputs efficiently to my pipelines?
m
If what you’re looking for is effectively “how to pass the same data catalog entry to each namespaced pipeline”, then you can map it when creating the namespaced pipeline:
Copy code
from kedro.pipeline.modular_pipeline import pipeline

pipeline(
    pipe,  # the base Pipeline object being namespaced
    inputs={"<name of the dataset within the modular pipeline>": "name of the dataset from the *root* data catalog"},
    namespace="your_ns",
    # ... other pipeline params
)
Note that "<name of the dataset within the modular pipeline>" is WITHOUT the namespace prefix.
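As a rough sketch of generating the 100 namespaced pipelines in a loop and mapping a shared root-catalog entry into each one (the node, dataset, and namespace names here are assumptions for illustration):
Copy code
from kedro.pipeline import Pipeline, node
from kedro.pipeline.modular_pipeline import pipeline


def run_backtest(df):  # hypothetical node function
    ...


# base pipeline that every namespaced copy reuses
base = Pipeline([node(run_backtest, inputs="model_input", outputs="result")])

namespaces = [f"bt_{i}" for i in range(100)]  # illustrative namespace names

backtests = sum(
    (
        pipeline(
            base,
            namespace=ns,
            # "model_input" (no namespace prefix) is mapped to one entry
            # in the root data catalog that every copy shares
            inputs={"model_input": "shared_model_input"},
        )
        for ns in namespaces
    ),
    Pipeline([]),
)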
c
No, every input is different. I am creating a backtesting pipeline; each pipeline will have a different input.
m
So just create entries in the data catalog with namespaces, like this:
Copy code
your_ns.dataset_name:
   type: xyz
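Since the inputs live in one MySQL table, each namespaced entry could also be backed by a SQL query dataset instead of a file; a minimal sketch, assuming a `pipeline_inputs` table, a `namespace` column, and `mysql_credentials` defined in credentials.yml:
Copy code
your_ns.model_input:
  type: pandas.SQLQueryDataSet
  sql: "SELECT * FROM pipeline_inputs WHERE namespace = 'your_ns'"
  credentials: mysql_credentials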
c
Yes, that would work, but I’d have to create 100 catalog entries for 100 inputs :/ instead of fetching them directly from a single DB source
m
Is there some kind of a pattern in the data catalog?
If there is, then you can use dataset factories if you’re on Kedro >= 0.18.12:
Copy code
"{namespace}.dataset_name":
   type: xyz
   filepath: "path/to/data/{namespace}/file.csv"
c
The pipeline would look roughly like this, except there would be 100 of them, each with its own namespace.
m
You can do this with custom names too, like:
Copy code
"{namespace}.data_from_{month}":
   type: xyz
   filepath: "path/to/data/{namespace}/{month}.csv"
Will that work for you?
c
How do I change the month every time in each pipeline’s input?
Oh got it
m
You’re generating the pipelines via code, so I guess that you have those `Mar`, `Jan`, `Feb`… etc. in some kind of a list
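A rough sketch of that generation loop, wiring each copy's input to a name that matches the `{namespace}.data_from_{month}` factory pattern (the month list, namespace scheme, and node are assumptions for illustration):
Copy code
from kedro.pipeline import Pipeline, node
from kedro.pipeline.modular_pipeline import pipeline


def run_backtest(df):  # hypothetical node function
    ...


base = Pipeline([node(run_backtest, inputs="model_input", outputs="result")])

months = ["jan", "feb", "mar"]  # your real list of months

pipes = []
for month in months:
    ns = f"bt_{month}"  # illustrative namespace naming scheme
    pipes.append(
        pipeline(
            base,
            namespace=ns,
            # explicitly mapped names are not re-prefixed, so this matches
            # the "{namespace}.data_from_{month}" catalog factory pattern
            inputs={"model_input": f"{ns}.data_from_{month}"},
        )
    )

backtests = sum(pipes, Pipeline([]))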
c
Thanks mate
m
Does that solve your problem?
c
I can use this as a workaround; I will use the month as a parameter in my SQL query, which will help me get a different input every time.
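Reading “parameter” here as the factory placeholder rather than a params: entry, a sketch of a catalog entry that templates both the namespace and the month into the query (table and column names are assumptions):
Copy code
"{namespace}.data_from_{month}":
  type: pandas.SQLQueryDataSet
  sql: "SELECT * FROM pipeline_inputs WHERE namespace = '{namespace}' AND month = '{month}'"
  credentials: mysql_credentials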
How do I add {namespace} to parameters.yml as well?