# questions
f
Hello! Could someone tell me the difference between specifying inputs/outputs inside the `pipeline` function vs. specifying inputs/outputs inside the `node` function? It seems redundant, because these two snippets seem to yield the same thing:
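```python
from kedro.pipeline import Pipeline, node, pipeline

# mafonc1 and mafonc2 are the poster's own functions, defined elsewhere.


def create_pipeline(**kwargs) -> Pipeline:
    pipe1 = pipeline(
        pipe=[
            node(
                func=mafonc1,
                inputs='params:nombre',
                outputs='result1'
            ),
            node(
                func=mafonc2,
                inputs='result1',
                outputs='result2'
            )
        ],
        namespace='ns1',
        # Pipeline-level inputs/outputs, in addition to the node-level ones
        inputs='params:nombre',
        outputs='result2'
    )

    return pipe1
```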
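```python
from kedro.pipeline import Pipeline, node, pipeline

# mafonc1 and mafonc2 are the poster's own functions, defined elsewhere.


def create_pipeline(**kwargs) -> Pipeline:
    pipe1 = pipeline(
        pipe=[
            node(
                func=mafonc1,
                inputs='params:nombre',
                outputs='result1'
            ),
            node(
                func=mafonc2,
                inputs='result1',
                outputs='result2'
            )
        ],
        # Namespace only, no pipeline-level inputs/outputs
        namespace='ns1'
    )

    return pipe1
```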
TL;DR: the one in `node` is for specifying the `inputs` and `outputs`; the one in `pipeline` is for optionally escaping the namespace.
That is, whenever you provide the `namespace` argument, all inputs and outputs will be read as `namespace.xxx` instead of `xxx`. In some cases you do want it to read the non-namespaced version, and then you need to provide that name in the `pipeline` arguments.
Tip: try printing the pipeline Python object, and you should see the difference.
If you have this in a Kedro project, you can run `kedro ipython` and then just print the `pipelines` object.
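To make the renaming rule above concrete, here is a minimal pure-Python sketch. It does not call Kedro at all: `apply_namespace` is a hypothetical helper that only mimics the naming convention described in this thread (dataset names gain the `ns1.` prefix, parameters become `params:ns1.<name>`, and any name exposed at the pipeline level keeps its original form). Kedro's real implementation covers more cases and adds validation, so treat this as an illustration only.

```python
def apply_namespace(name: str, namespace: str, exposed: frozenset = frozenset()) -> str:
    """Hypothetical helper, not part of Kedro: mimic how a namespace renames a dataset."""
    if name in exposed:
        # Listed at the pipeline level: escapes the namespace, name stays as-is.
        return name
    if name.startswith("params:"):
        # Parameters keep the "params:" prefix and get the namespace after it.
        return f"params:{namespace}.{name[len('params:'):]}"
    # Everything else is simply prefixed with the namespace.
    return f"{namespace}.{name}"


# Dataset names used by the two nodes in the question.
node_datasets = ["params:nombre", "result1", "result2"]

# Namespace only: every name is prefixed.
namespaced = [apply_namespace(n, "ns1") for n in node_datasets]

# Namespace plus names exposed at the pipeline level: those escape the prefix.
exposed = frozenset({"params:nombre", "result2"})
escaped = [apply_namespace(n, "ns1", exposed) for n in node_datasets]

print(namespaced)
print(escaped)
```

The difference only shows up in the final dataset names, which is why the two snippets in the question can look equivalent until another pipeline or the catalog has to refer to `result2` by name.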
f
Thank you, this is clear! So inputs/outputs for `pipeline` are optional, whereas inputs/outputs for `node` are mandatory.
n
Yes, you only need to provide it for `pipeline` if you need to escape from the namespace.
The function signature should reflect this; if not, it's something that we should fix :)
f
Oh yes, I just saw this in `modular_pipeline.py`:
```
inputs: A name or collection of input names to be exposed as connection points
    to other pipelines upstream. This is optional; if not provided, the
    pipeline inputs are automatically inferred from the pipeline structure.
    When str or set[str] is provided, the listed input names will stay
    the same as they are named in the provided pipeline.
    When dict[str, str] is provided, current input names will be
    mapped to new names.
    Must only refer to the pipeline's free inputs.
```
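The three accepted forms in that docstring (a single name, a set of names, or a dict of old to new names) can be read as one mapping from current name to exposed name. Below is a small sketch of that reading; `to_mapping` and the name `final_result` are hypothetical and only for illustration, not Kedro's actual code.

```python
def to_mapping(names=None) -> dict:
    """Hypothetical helper, not Kedro's API: normalize str | set | dict | None to {old: new}."""
    if names is None:
        # Nothing exposed explicitly.
        return {}
    if isinstance(names, str):
        # A single name stays the same.
        return {names: names}
    if isinstance(names, dict):
        # A dict maps current names to new names.
        return dict(names)
    # Any other collection (e.g. a set): every name stays the same.
    return {name: name for name in names}


print(to_mapping("result2"))
print(to_mapping({"result1", "result2"}))
print(to_mapping({"result2": "final_result"}))
```

With a str or a set the names are kept as they are, and only the dict form renames anything, which matches the wording of the docstring.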