Luis Chaves Rodriguez
01/17/2025, 12:58 PM
from kedro.pipeline import Pipeline, node, pipeline

@node(inputs=1, outputs="first_sum")
def step1(number):
    return number + 1

@node(inputs="first_sum", outputs="second_sum")
def step2(number):
    return number + 1

@node(inputs="second_sum", outputs="final_result")
def step3(number):
    return number + 2

pipeline = pipeline(
    [
        step1,
        step2,
        step3,
    ]
)
The node name could be inferred from the function name.
Hall
01/17/2025, 12:58 PM
Nok Lam Chan
01/17/2025, 1:01 PM
Luis Chaves Rodriguez
01/17/2025, 1:01 PM
from kedro.pipeline import Pipeline, node, pipeline
from .nodes import step1, step2, step3

pipeline = pipeline(
    [
        step1,
        step2,
        step3,
    ]
)
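A minimal, stdlib-only sketch of how a decorator like the one proposed above might behave. Everything here is an assumption for illustration: NodeSpec, the decorator's signature, and the run helper are hypothetical and are not Kedro's API. The first input is a named dataset ("number") rather than the literal 1 used in the proposal.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical stand-in for a pipeline node; not Kedro's Node class."""
    func: Callable[..., Any]
    inputs: List[str]
    outputs: str
    name: str


def node(inputs, outputs, name=None):
    """Decorator factory (hypothetical): wrap a function in a NodeSpec."""
    input_names = [inputs] if isinstance(inputs, str) else list(inputs)

    def decorator(func):
        # Infer the node name from the function name unless one is given.
        return NodeSpec(func, input_names, outputs, name or func.__name__)

    return decorator


@node(inputs="number", outputs="first_sum")
def step1(number):
    return number + 1


@node(inputs="first_sum", outputs="second_sum")
def step2(number):
    return number + 1


@node(inputs="second_sum", outputs="final_result")
def step3(number):
    return number + 2


def run(nodes, data):
    """Run nodes in the listed order against a plain dict of datasets."""
    data = dict(data)
    for spec in nodes:
        args = [data[key] for key in spec.inputs]
        data[spec.outputs] = spec.func(*args)
    return data


result = run([step1, step2, step3], {"number": 1})
print([spec.name for spec in (step1, step2, step3)])
print(result["final_result"])
```

Note that in this sketch the decorated names (step1 and so on) are no longer plain functions, which is one of the tradeoffs raised later in the thread.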
Luis Chaves Rodriguez
01/17/2025, 1:02 PM
Nok Lam Chan
01/17/2025, 1:02 PM
Luis Chaves Rodriguez
01/17/2025, 1:04 PM
Luis Chaves Rodriguez
01/17/2025, 1:04 PM
datajoely
01/17/2025, 1:06 PM
> What complex cases would this not cover though?
You get into funny situations where different decorators conflict; for example, combining this with Pandera might be painful. We also try to have only one way of doing things; whilst this rule is broken in some places, having several ways can cause headaches.
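One way such a conflict could show up, sketched with the standard library only. This assumes a node decorator that returns a wrapper object instead of a function; check_positive is a hypothetical stand-in for a validation decorator and does not use Pandera's API.

```python
import functools
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical node wrapper; not part of Kedro."""
    func: Callable[..., Any]
    outputs: str


def node(outputs):
    """Hypothetical decorator factory that replaces the function with a NodeSpec."""
    def decorator(func):
        return NodeSpec(func, outputs)
    return decorator


def check_positive(func):
    """Hypothetical validator standing in for a schema-checking decorator."""
    if not callable(func):
        raise TypeError(f"expected a function, got {type(func).__name__}")

    @functools.wraps(func)
    def wrapper(number):
        if number <= 0:
            raise ValueError("number must be positive")
        return func(number)

    return wrapper


# Works: the validator wraps the function first, then node() wraps the result.
@node(outputs="first_sum")
@check_positive
def good_step(number):
    return number + 1


# Fails at definition time: the validator receives a NodeSpec, not a function.
try:
    @check_positive
    @node(outputs="first_sum")
    def bad_step(number):
        return number + 1
except TypeError as exc:
    conflict = str(exc)

print(good_step.func(1))
print(conflict)
```

The point is only that decorator order starts to matter once one of the decorators changes the type of the thing it wraps.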
Nok Lam Chan
01/17/2025, 1:09 PM
Nok Lam Chan
01/17/2025, 1:09 PM
Nok Lam Chan
01/17/2025, 1:10 PM
datajoely
01/17/2025, 1:10 PM
Luis Chaves Rodriguez
01/17/2025, 1:11 PM
Luis Chaves Rodriguez
01/17/2025, 1:12 PM
Luis Chaves Rodriguez
01/17/2025, 1:13 PM
datajoely
01/17/2025, 1:13 PM
Luis Chaves Rodriguez
01/17/2025, 1:14 PM
It will make simple things simpler but the complex cases more difficult. That's the main tradeoff.
datajoely
01/17/2025, 1:14 PM
Luis Chaves Rodriguez
01/17/2025, 1:15 PM
datajoely
01/17/2025, 1:15 PM
Nok Lam Chan
01/17/2025, 1:15 PM
@node(inputs=["a", "b"], outputs="sum")
def pipeline_step(a, b):
    return reusable_fn(a, b)
Is this simpler than node(reusable_fn, ["a", "b"], outputs="sum")? It's a few more keystrokes, though maybe slightly clearer since the arguments are highlighted at the top.
Luis Chaves Rodriguez
01/17/2025, 1:15 PM
datajoely
01/17/2025, 1:15 PM
datajoely
01/17/2025, 1:15 PM
datajoely
01/17/2025, 1:16 PM
Nok Lam Chan
01/17/2025, 1:16 PM
Nok Lam Chan
01/17/2025, 1:16 PM
Luis Chaves Rodriguez
01/17/2025, 1:16 PM
> Is this simpler than node(reusable_fn, ["a", "b"], outputs="sum")? It's a few more keystrokes, though maybe slightly clearer since the arguments are highlighted at the top
@Nok Lam Chan yeah, I'm not really sure which one is simpler at that point, hence agreeing with your point that it complicates the "advanced" use cases.
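To make the comparison concrete, here is a small stdlib-only sketch of both styles applied to the same reusable function. NodeSpec, as_node and run are hypothetical helpers; only the argument order node(func, inputs, outputs) mirrors Kedro's real node function. The dataset "c" and the second node are added purely for illustration.

```python
from typing import Any, Callable, List, NamedTuple


class NodeSpec(NamedTuple):
    """Hypothetical stand-in for a pipeline node."""
    func: Callable[..., Any]
    inputs: List[str]
    outputs: str


def node(func, inputs, outputs):
    """Call form: same argument order as node(func, inputs, outputs)."""
    return NodeSpec(func, list(inputs), outputs)


def as_node(inputs, outputs):
    """Decorator form (hypothetical), built on top of the call form."""
    def decorator(func):
        return node(func, inputs, outputs)
    return decorator


def reusable_fn(a, b):
    return a + b


# Call form: the same function backs two nodes with no wrappers.
call_nodes = [
    node(reusable_fn, ["a", "b"], outputs="sum"),
    node(reusable_fn, ["sum", "c"], outputs="total"),
]


# Decorator form: each reuse needs its own thin wrapper function.
@as_node(inputs=["a", "b"], outputs="sum")
def pipeline_step(a, b):
    return reusable_fn(a, b)


@as_node(inputs=["sum", "c"], outputs="total")
def second_step(partial_sum, c):
    return reusable_fn(partial_sum, c)


def run(nodes, data):
    """Run nodes in order against a plain dict of datasets."""
    data = dict(data)
    for spec in nodes:
        data[spec.outputs] = spec.func(*(data[key] for key in spec.inputs))
    return data


inputs = {"a": 1, "b": 2, "c": 3}
call_result = run(call_nodes, inputs)
decorator_result = run([pipeline_step, second_step], inputs)
print(call_result["total"], decorator_result["total"])
```

Both styles produce the same datasets; the difference is where the wiring lives and how much wrapper code each reuse costs.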
Luis Chaves Rodriguez
01/17/2025, 1:17 PM
Deepyaman Datta
01/17/2025, 5:04 PM
... node. Instead, I think you can get something halfway like:

@node
def step1(number):
    return number + 1

@node
def step2(number):
    return number + 1

@node
def step3(number):
    return number + 2

@pipeline
def my_pipe(my_input):
    first_sum = step1(my_input)
    second_sum = step2(first_sum)
    final_result = step3(second_sum)
    return final_result

my_pipe(1)
This is definitely not 100% what it should look like, but I think the benefit is that you are still constructing the DAG, even though it's function calls, at pipeline definition time.
Deepyaman Datta
01/17/2025, 5:12 PM
Ben Shaughnessy
01/17/2025, 9:27 PM
datajoely
01/18/2025, 2:49 AM
Luis Chaves Rodriguez
01/20/2025, 9:04 AM
> Venusian is a library which allows you to defer the action of decorators. Instead of taking actions when a function, method, or class decorator is executed at import time, you can defer the action until a separate "scan" phase.
https://docs.pylonsproject.org/projects/venusian/en/latest/#using-venusian
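The deferral idea in that quote can be imitated with the standard library alone. The sketch below does not use Venusian's API; the attribute name, the node decorator and the scan helper are all hypothetical, and scan takes a plain dict where Venusian would scan a module or package.

```python
# Attribute used to stash deferred callbacks on a function (hypothetical name).
_CALLBACKS_ATTR = "__deferred_node_callbacks__"


def node(inputs, outputs):
    """Hypothetical decorator: record intent, register nothing at import time."""
    def decorator(func):
        def callback(registry):
            # Runs only during the scan phase.
            registry.append((func.__name__, inputs, outputs, func))

        callbacks = getattr(func, _CALLBACKS_ATTR, [])
        setattr(func, _CALLBACKS_ATTR, callbacks + [callback])
        return func  # The function itself is returned unchanged.

    return decorator


def scan(namespace):
    """Separate scan phase: run the deferred callbacks found in a namespace."""
    registry = []
    for obj in list(namespace.values()):
        for callback in getattr(obj, _CALLBACKS_ATTR, []):
            callback(registry)
    return registry


@node(inputs="first_sum", outputs="second_sum")
def step2(number):
    return number + 1


# The decorated function is still an ordinary callable...
print(step2(1))

# ...and node definitions only materialise when scan() is called.
registry = scan({"step2": step2})
print([(name, ins, outs) for name, ins, outs, _ in registry])
```

Because the function is returned unchanged, it stays directly callable and testable, and other decorators can still be stacked on it.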
Luis Chaves Rodriguez
01/27/2025, 5:58 PM
Alexandre Ouellet
01/31/2025, 3:25 PM
Alexandre Ouellet
01/31/2025, 3:30 PM
Alexandre Ouellet
01/31/2025, 3:30 PM