Kedro is an open-source Python framework for creating maintainable and modular data science code.

Kedro

Hi kedro peeps :slightly_smiling_face: What is the Kedro approach to handling pipelines provided by imported packages? For example, I'm building a project which uses LlamaIndex for RAG functionality. In their newest version they've released an ingestion pipeline construct where you specify a sequence of class calls to ingest/preprocess your data. The benefit is that each step is cached (in memory or in an external database).
It looks like:
```
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=25, chunk_overlap=0),
        TitleExtractor(),
        OpenAIEmbedding(),
    ]
)
```
Now in this context, if I put that pipeline definition inside a single Kedro node, the node no longer performs a single task, but I don't understand how to split it across multiple Kedro nodes. It also makes me think of scikit-learn pipelines... any wisdom or advice is greatly appreciated!

This is something that comes up now and again, but no great solution exists. See <https://kedro-org.slack.com/archives/C03RKP2LW64/p1666869042994569?thread_ts=1666864575.162609&cid=C03RKP2LW64> for some context.

Ok, thank you kindly for illuminating the issue here. It sounds like the most practical approach is to just use the package's pipeline within a single Kedro node. Cheers!
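
For anyone finding this thread later, here is a minimal sketch of what "the package pipeline within a single Kedro node" could look like. It is only an illustration under stated assumptions: `StandInPipeline`, `split_sentences`, `uppercase`, and `ingest_documents` are hypothetical names, and the stand-in class merely mimics the "list of transformations, then `run`" shape of the snippet above so the example runs without any third-party install. In a real project the body of `ingest_documents` would build and run the library's own pipeline object instead.

```python
from typing import Callable, List, Sequence

# A transformation takes a list of text items and returns a new list.
Transformation = Callable[[List[str]], List[str]]


class StandInPipeline:
    """Hypothetical stand-in for a third-party pipeline object.

    It applies each transformation in order, which is the general shape of
    the ingestion pipeline quoted in the question. It does no caching.
    """

    def __init__(self, transformations: Sequence[Transformation]) -> None:
        self.transformations = list(transformations)

    def run(self, documents: List[str]) -> List[str]:
        items = list(documents)
        for transform in self.transformations:
            items = transform(items)
        return items


def split_sentences(docs: List[str]) -> List[str]:
    """Toy splitter: break each document on full stops, drop empty pieces."""
    return [s.strip() for d in docs for s in d.split(".") if s.strip()]


def uppercase(docs: List[str]) -> List[str]:
    """Toy transformation standing in for an extractor or embedding step."""
    return [d.upper() for d in docs]


def ingest_documents(documents: List[str]) -> List[str]:
    """Node function: the whole external pipeline runs inside this one call."""
    pipeline = StandInPipeline(transformations=[split_sentences, uppercase])
    return pipeline.run(documents)


# In a Kedro project this function would be registered as a single node,
# for example (dataset names are hypothetical):
#   from kedro.pipeline import node
#   node(func=ingest_documents, inputs="raw_documents", outputs="ingested_nodes")
```

The trade-off is the one raised in the question: Kedro sees one coarse node, so any per-step caching stays the responsibility of the imported pipeline rather than of Kedro's catalog.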