# questions
ł
Hello, I want to get all pipelines' names in a simple script and use SequentialRunner to run all of my pipelines. Copilot suggested something like this, but I cannot get it to work:
import os
from kedro.runner import SequentialRunner
from kedro.framework.session import KedroSession

def main():
    # Get the parent directory of the current directory
    current_directory = os.path.dirname(os.path.abspath(__file__))
    project_path = os.path.dirname(current_directory)

    # Initialize the Kedro session
    with KedroSession.create(package_name="data_science_pipelines", project_path=project_path) as session:
        context = session.load_context()

        # Initialize the SequentialRunner
        runner = SequentialRunner()

        # Get all pipeline names
        pipelines = context.pipelines

        # Start all pipelines
        for pipeline_name, pipeline in pipelines.items():
            print(f"Starting pipeline: {pipeline_name}")
            runner.run(pipeline, context.catalog)
            print(f"Finished pipeline: {pipeline_name}")

if __name__ == "__main__":
    main()
h
Someone will reply to you shortly. In the meantime, this might help:
ł
Neither the context nor the session has a pipelines field.
l
To get the pipelines in my project I do this:
from pathlib import Path
from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project

project_path = Path.cwd()
bootstrap_project(project_path)
with KedroSession.create(project_path=project_path) as session:
    pipelines = find_pipelines()
Mind you I am not sure this is the best/correct way to do it, maybe the real Kedro experts can help us both in this regard
ł
Okay, I will try it out, thank you!
@Lorenzo Castellino Can I ask which module you import the find_pipelines function from? EDIT: Got it:
from kedro.framework.project import find_pipelines
n
The default run already executes all the pipelines. Why do you need to do this?
ł
I have separate nodes for train/test data ingestion, preprocessing, and training, with the detection pipeline at the end. I want to run through all the pipelines once, then schedule an outside call that triggers only the last part, detection on new data.
n
Still, pipelines are composable. You can define as many combinations as you want, e.g.
pipelines["everything"] = pipelines["train"] + pipelines["test"] + pipelines["detection"]
When you need to run everything, do kedro run -p everything, and when you only need a subset of it, do kedro run -p detection.