#questions

Lukas Innig

11/27/2023, 8:28 PM
When I make a modular pipeline with a namespace (mainly to group nodes visually), I sometimes have a problem: if I want a dataset to be an output, and it is also used by internal nodes, it doesn’t appear as an output, even if I specify it in the outputs. My workaround is to create a “private” version of the dataset and then add a dummy node that just copies the input to the output (example in thread). Is there a better way?
from kedro.pipeline import Pipeline, node, pipeline


def create_pipeline() -> Pipeline:
    return pipeline(
        [
            node(
                func=create_vector_database,
                inputs=[
                    "all_docs",
                    "params:embedding_model_name",
                ],
                outputs="_vector_db",
                name="create_vector_database",
            ),
            node(
                func=test_vector_db_local,
                inputs=[
                    "_vector_db",
                    "params:query_text_vdb",
                ],
                outputs="vector_db_test_result",
                name="test_vector_db_local",
            ),
            node(
                func=lambda x: x,
                inputs="_vector_db",
                outputs="vector_db",
                name="vector_db",
            ),
        ],
        namespace="vector_database",
        inputs="all_docs",
        outputs={"vector_db", "vector_db_test_result"},
    )
Copilot suggested the lambda trick on its own, so I guess this is standard practice 👀

Emilio Gagliardi

11/27/2023, 8:36 PM
I'm working on something similar. Are you passing the vector db between nodes? I was wondering about this myself 😛

Lukas Innig

11/27/2023, 8:46 PM
Yeah, basically this node just renames it:
node(
    func=lambda x: x,
    inputs="_vector_db",
    outputs="vector_db",
    name="vector_db",
),

Emilio Gagliardi

11/27/2023, 9:26 PM
So the lambda function just passes the vector db to the next node? That's cool! I've only just started getting used to lambda functions. Neat work 🙂

Lukas Innig

11/27/2023, 9:36 PM
A lambda function is basically a function without a name. It's the same as saying:
def copy(x):
    return x
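
A minimal sketch of that equivalence in plain Python, with no Kedro dependency. The names `passthrough_lambda`, `passthrough_named`, and the stand-in `vector_db` dict are illustrative assumptions, not taken from the thread.

```python
# Two equivalent pass-through functions: one anonymous, one named.
passthrough_lambda = lambda x: x


def passthrough_named(x):
    """Return the argument unchanged."""
    return x


# Stand-in object, purely illustrative (not a real vector database).
vector_db = {"doc-1": [0.1, 0.2]}

# Both functions hand back the very same object they were given.
assert passthrough_lambda(vector_db) is vector_db
assert passthrough_named(vector_db) is vector_db

# The visible difference is the function's name.
print(passthrough_lambda.__name__)  # <lambda>
print(passthrough_named.__name__)   # passthrough_named
```

The named form may be preferable wherever the function name shows up in logs or error messages, since `<lambda>` says nothing about what the node does.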