#questions

Nicolas Rosso

03/02/2023, 1:27 PM
Hello friends, I am using Kedro 0.18.4 with Python 3.7. When I run this pipeline with the 'kedro run' command, I get this error message: "TypeError: pipeline() got an unexpected keyword argument 'tags_hierarchy'". Does anyone have any idea how to fix this?
from kedro.pipeline import Pipeline, node, pipeline
from .nodes import medium_posts_extract_file, medium_posts_transform_file, medium_posts_upload_transformed_file_to_gcp, medium_posts_persist_file_in_gcp, delete_files
from datetime import datetime

timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
# Define the nodes in the pipeline and their execution order. Each node can have one or more functions (defined in nodes.py)
def create_pipeline(**kwargs) -> Pipeline:
    return pipeline(
        [
            node( 
                func=medium_posts_extract_file,
                inputs=None,
                outputs="medium_posts_raw_file",
                name="medium_posts_extract_file_node",
                tags=["extract"]
            ),
            node(
                func=medium_posts_transform_file,
                inputs="medium_posts_raw_file",
                outputs="medium_posts_transformed_file",
                name="medium_posts_transform_file_node",
                tags=["transform"]
            ),
            node(
                func=medium_posts_upload_transformed_file_to_gcp,
                inputs="medium_posts_transformed_file",
                outputs=None,
                name="medium_posts_upload_transformed_file_to_gcp_node",
                tags=["upload"]
            ),
            node(
                func=medium_posts_persist_file_in_gcp,
                inputs="medium_posts_raw_file",
                outputs=None,
                name="medium_posts_persist_file_in_gcp_node",
                tags=["persist"]
            ),
            node(
                func=delete_files,
                inputs="medium_posts_transformed_file",
                outputs=None,
                name="delete_files_node",
                tags=["delete"]
            )
        ],
        tags_hierarchy={
            "extract": [],
            "transform": ["extract"],
            "upload": ["transform"],
            "persist": ["upload"],
            "delete": ["persist"]
        }
    )

datajoely

03/02/2023, 1:28 PM
you have this in the pipeline constructor
tags_hierarchy={
            "extract": [],
            "transform": ["extract"],
            "upload": ["transform"],
            "persist": ["upload"],
            "delete": ["persist"]
        }
that’s not expected by Kedro unless you’ve modified it

Nicolas Rosso

03/02/2023, 1:29 PM
Yes, I modified it because I need the nodes to execute in that specific order

datajoely

03/02/2023, 1:30 PM
the way to do that is to pass inputs/outputs explicitly in the order you need them
it will act funky if you’re doing that sort of thing
you don’t have to do anything with objects passed between them
but the topological sort needs to be valid
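
To illustrate what a valid topological sort means here, the sketch below derives an execution order purely from declared inputs and outputs. It uses Python's standard-library graphlib (Python 3.9+) rather than Kedro, so it is not Kedro's implementation, only the same idea in miniature. The node and dataset names come from the pipeline above, except upload_done and persist_done, which are hypothetical marker outputs added only to chain the last three nodes.

```python
from graphlib import TopologicalSorter

# (node name, inputs, outputs), deliberately listed out of order.
# "upload_done" and "persist_done" are hypothetical marker datasets
# that exist only to force an ordering between nodes.
NODES = [
    ("delete_files_node",
     ["medium_posts_transformed_file", "persist_done"], []),
    ("medium_posts_persist_file_in_gcp_node",
     ["medium_posts_raw_file", "upload_done"], ["persist_done"]),
    ("medium_posts_upload_transformed_file_to_gcp_node",
     ["medium_posts_transformed_file"], ["upload_done"]),
    ("medium_posts_transform_file_node",
     ["medium_posts_raw_file"], ["medium_posts_transformed_file"]),
    ("medium_posts_extract_file_node",
     [], ["medium_posts_raw_file"]),
]


def execution_order(nodes):
    """Order nodes so each one runs after the nodes that produce its inputs."""
    # Map every dataset name to the node that outputs it.
    producer = {out: name for name, _, outputs in nodes for out in outputs}
    sorter = TopologicalSorter()
    for name, inputs, _ in nodes:
        # A node depends on whichever node produces each of its inputs.
        deps = [producer[i] for i in inputs if i in producer]
        sorter.add(name, *deps)
    return list(sorter.static_order())


ORDER = execution_order(NODES)
for step in ORDER:
    print(step)
```

Because every node after the first consumes something the previous one produces, only one ordering is valid, regardless of the order in which the nodes are listed.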

Nicolas Rosso

03/02/2023, 1:31 PM
hmmm, so if I'm understanding correctly, the inputs/outputs are what determine the execution order of the nodes in the pipeline?

datajoely

03/02/2023, 1:31 PM
yup

Nicolas Rosso

03/02/2023, 1:32 PM
ok, thank you very much for your help.

Nicolas Rosso

03/02/2023, 1:33 PM
I'll take a look at it. Thank you
Oh, one more question. My last node, "delete_files_node", is just meant to delete some files. Is there a way to create a node with no inputs and no outputs? I think that is what is messing up my pipeline.
If I declare inputs=None, outputs=None it gives me an error message 😩

datajoely

03/02/2023, 1:39 PM
so you need an input, again you don’t need to use it for anything
but Kedro needs to know when to execute it

Nicolas Rosso

03/02/2023, 1:40 PM
and that input has to be declared in the catalog.yml?

datajoely

03/02/2023, 2:01 PM
the very first one does, but after that it can simply be an ephemeral output of another node
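
A sketch of what that could look like on the nodes.py side, assuming a hypothetical marker value that the persist step returns and the delete step accepts but never uses. In the pipeline, the persist node would then declare outputs="persist_done" and the delete node inputs=["medium_posts_transformed_file", "persist_done"]; because persist_done is produced by another node, it would not need a catalog entry. The function names, bodies, and the file-path argument are placeholders, not the original implementation.

```python
import os
import tempfile


def persist_file(raw_path: str) -> bool:
    """Placeholder persist step: returns a marker once it has finished."""
    # The real upload logic would go here; this sketch only signals completion.
    return True


def delete_local_file(path: str, _persist_done: bool) -> None:
    """Delete a local file.

    `_persist_done` is intentionally unused. It exists only so that the
    node has an input tying it to the step that must run first.
    """
    if os.path.exists(path):
        os.remove(path)


# Plain-Python usage, outside Kedro, to show the data flow between the steps.
fd, tmp_path = tempfile.mkstemp()
os.close(fd)
done = persist_file(tmp_path)
delete_local_file(tmp_path, done)
```

The marker's value never matters. Its only job is to make the delete step depend on the persist step.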