# questions
y
I have a quick question about running only selected pipelines. I know we can do
kedro run --pipeline=data_science
but how could I run two or three pipelines instead of one (and not all of them)? In particular, I'm looking to do something like
kedro run --pipeline=data_science+evaluation
where only these two selected pipelines run.
m
As of now, you have to create the “two or three” pipeline combinations statically in register_pipelines. You can use https://docs.python.org/3/library/itertools.html#itertools.combinations for that if you have a lot of pipelines.
Tagging @datajoely, @Nok Lam Chan - slightly related to dynamic pipelines 🙂
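The itertools.combinations suggestion could look roughly like the sketch below. The helper name register_combinations, its sizes and sep parameters, and the "+"-joined key naming are hypothetical choices for illustration, not part of Kedro. The only assumption about the pipeline objects is that they support +, as Pipeline objects are said to later in this thread.

```python
from functools import reduce
from itertools import combinations
from operator import add


def register_combinations(pipelines, sizes=(2, 3), sep="+"):
    """Return a copy of `pipelines` extended with combined entries.

    Hypothetical helper: for every group of 2 or 3 names (by default),
    add an entry keyed like "data_science+evaluation" whose value is
    the sum of the member pipelines.
    """
    combined = dict(pipelines)  # leave the input mapping untouched
    names = sorted(pipelines)   # sorted so the generated keys are predictable
    for size in sizes:
        for group in combinations(names, size):
            # reduce(add, ...) only relies on the objects supporting `+`
            combined[sep.join(group)] = reduce(add, (pipelines[n] for n in group))
    return combined
```

Inside register_pipelines this might be called as pipelines = register_combinations(find_pipelines()), before adding "__default__". Whether the CLI accepts a name containing + is an assumption here; a different sep can be passed if it does not. The number of entries grows quickly with the number of pipelines, so keeping sizes small seems sensible.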
d
In this situation I'd just use the command line:
kedro run --pipeline=data_science & kedro run --pipeline=evaluation
to run them in parallel, or
kedro run --pipeline=data_science && kedro run --pipeline=evaluation
to run them in sequence.
n
Or just register a separate entry in the registry that combines the two:
pipeline_data_science + pipeline_evaluation
These Pipeline objects can be combined with + or subtracted with - quite easily.
j
how would I subtract a pipeline?
def register_pipelines() -> dict[str, Pipeline]:
    """Register the project's pipelines.

    Returns:
        A mapping from pipeline names to ``Pipeline`` objects.
    """
    pipelines = find_pipelines()
    pipelines["__default__"] = sum(pipelines.values())
    pipelines["except-train"] = ???
    return pipelines
n
Let's say you have an end-to-end pipeline composed of 4 steps: ingest, process, train, eval.
You can have
pipelines["all"] = ingest + process + train + eval
pipelines["all_except_eval"] = pipelines["all"] - eval
j
this is what I did:
from .pipelines.model_training import create_pipeline as create_model_training_pipeline

...
pipelines["all"] = sum(pipelines.values())
pipelines["all_except_eval"] = pipelines["all"] - create_model_training_pipeline()
Is there a better way of doing it? Not sure what eval is in your example, @Nok Lam Chan.
n
eval would be a Pipeline object, so your snippet should work. More likely, though, you would structure your project as modular pipelines, so you have pipelines/eval/. find_pipelines should then take care of it, and you can just get the object from pipelines["eval"].
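Putting the + and - suggestions together, an "everything except X" entry could be built with a small helper like the sketch below. The name all_except is hypothetical and not a Kedro API. It assumes only that the registered objects support + and -, and that the mapping is shaped like what find_pipelines() returns (names mapped to Pipeline objects).

```python
from functools import reduce
from operator import add


def all_except(pipelines, *excluded):
    """Combine every registered pipeline, then subtract the named ones.

    Hypothetical helper: `pipelines` maps names to objects that
    support `+` and `-`.
    """
    # Skip "__default__" so an existing aggregate entry is not counted as a part.
    parts = [p for name, p in pipelines.items() if name != "__default__"]
    combined = reduce(add, parts)
    for name in excluded:
        combined = combined - pipelines[name]
    return combined
```

In register_pipelines this might be used as pipelines["all_except_eval"] = all_except(pipelines, "eval"). Only "__default__" is skipped, so any other aggregate entries should be added after calling it.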
j
Aaaah, so when I do pipelines = find_pipelines(), then pipelines["eval"] will be my object, gotcha 👍🏼
y
Fantastic, these are awesome and resolved my question more than I expected. Thank you so much for the comprehensive replies!! 🎉