Hello, I would like to know if I can pass...
# questions
a
Hello, I would like to know whether multiple nodes living in different namespaces / tags can produce exactly the same output, so that it can be reused later:
node(
    first_namespace_fn,
    inputs=["some_input"],
    outputs="shared_name_so_it_can_reused_somewhere_else",
    namespace="first"
),

node(
    second_namespace_fn,
    inputs=None,
    outputs="shared_name_so_it_can_reused_somewhere_else",
    namespace="second"
),

node(
    third_common_fn,
    inputs='shared_name_so_it_can_reused_somewhere_else',
    outputs="final_output",
),
Thank you!
d
I don't think I've ever seen somebody explicitly set `namespace` on a node; I don't think it does any input remapping. I assume what you're saying in this case is that you will either run the first and third node or the second and third node, but not first and second in the same run? In that case, it should be possible. When using a modular pipeline multiple times, you can use a mapping dictionary to say that a catalog entry shouldn't get namespaced. For example:
cook_breakfast_pipeline = pipeline(
    [
        node(func=defrost, inputs="frozen_potatoes", outputs="veg", name="defrost_node"),
        node(func=sauté, inputs="veg", outputs="breakfast_potatoes"),
    ]
)
cook_lunch_pipeline = pipeline(
    [
        node(func=defrost, inputs="frozen_carrots", outputs="veg", name="defrost_node"),
        node(func=blanch, inputs="veg", outputs="cooked_veggies"),
    ]
)
eat_veggies = pipeline(
    [
        node(func=nom, inputs="cooked_veggies", outputs="leftovers", name="consume")
    ]
)

# Run either of the below pipelines, not both at once
eat_breakfast_pipeline = pipeline(
    pipe=cook_breakfast_pipeline,
    outputs={"breakfast_potatoes": "cooked_veggies"},
    namespace="breakfast",
)
eat_lunch_pipeline = pipeline(
    pipe=cook_lunch_pipeline,
    outputs="cooked_veggies",  # Alternatively, `outputs={"cooked_veggies": "cooked_veggies"},`
    namespace="lunch",
)
You should be able to run the code above with minimal setup, e.g.:
from kedro.pipeline import pipeline, node

# Make dummy nodes
defrost = lambda x:x
sauté = lambda x:x
blanch = lambda x:x
nom = lambda x:x

# Copy/paste above code here
...

# Then if you run:
>>> eat_breakfast_pipeline.inputs()
{'breakfast.frozen_potatoes'}
>>> eat_breakfast_pipeline.outputs()
{'cooked_veggies'}
>>> eat_lunch_pipeline.inputs()
{'lunch.frozen_carrots'}
>>> eat_lunch_pipeline.outputs()
{'cooked_veggies'}
Is this what you wanted?
a
Yes, that is what I want, but now I do not know how to run it. With kedro run --namespace breakfast it says there is no pipeline defined.
Would you mind sharing a complete pipeline.py that I can run using the kedro command? Thank you.
E.g. I want one micropackage to produce a different meal depending on the namespace: either lunch or breakfast. But I do not want to create two micropackages; I want to have it in one.
Is that possible? Thank you so much for your answers!
d
Off the top of my head, I don't think a node could belong to two namespaces, so I think that's not possible. You could put them in the same micropackage and add tags ("breakfast" and "lunch", and add both tags to the node that you'd want to run in either case), and then you could do `kedro run --tag breakfast`.
a
Thank you for your reply. Sure, I do not need to put them on a node…
I will try.
Hmm, I am not sure we are talking about the same thing. How can I create a pipeline that would be a combination of eat_breakfast_pipeline + eat_lunch_pipeline + clean_kitchen, where the clean_kitchen pipeline would expect ['cooked_veggies'] as input and, depending on the namespace or tag applied to eat_lunch_pipeline or eat_breakfast_pipeline, one of them would produce cooked_veggies?
If I do so, then it says:
Output(s) ['cooked_veggies'] are returned by more than one nodes. Node outputs must be unique.
Also, I am not sure how to declare it so that find_pipelines picks it up automatically. Thank you again for your help @Deepyaman Datta
d
How can I create a pipeline that would be a combination of eat_breakfast_pipeline + eat_lunch_pipeline + clean_kitchen, where the clean_kitchen pipeline would expect ['cooked_veggies'] as input and, depending on the namespace or tag applied to eat_lunch_pipeline or eat_breakfast_pipeline, one of them would produce cooked_veggies?
If I do so, then it says:
Output(s) ['cooked_veggies'] are returned by more than one nodes. Node outputs must be unique.
You can't do this, because you're creating an invalid Kedro pipeline (as mentioned in that error). Even in my example, I wrote:
# Run either of the below pipelines, not both at once
a
And do you plan to support such a scenario? Also, it would be cool to have one example project with tags and namespaces, so I do not need to bother you here ;) Anyway, thanks a lot for the quick and straightforward communication! Kedro is great!
d
And do you plan to support such a scenario?
Not that I know of; allowing two nodes in the same pipeline to write to a single output causes nondeterministic behavior. It is possible to get this same behavior by defining pipelines separately, so that seems to be the way to go.
Also, it would be cool to have one example project with tags and namespaces, so I do not need to bother you here ;)
Can you create an issue on github.com/kedro-org/kedro/issues to request this? That way, the broader team can decide if they will add this and how to structure the docs.
a
Btw: can I pass an input param that is optional?