Hello everyone! Does anyone know how to pass in li...
# questions
j
Hello everyone! Does anyone know how to pass a list of dataframes as an input to a pipeline node in Kedro? I have a function that takes a list of dataframes, but it doesn’t seem straightforward to implement.
m
Nodes can take single inputs as well as lists; you’ll just have to specify that your node input is a list.
n
Can your function just take *args? Or can you elaborate a bit on what the problem is, since you can define a list of inputs in the pipeline?
j
@Merel @Nok Lam Chan Sorry, I think I wasn’t clear enough! I’m trying to do something like the below in the node:
inputs=[[list of dataframes]]
because my function looks like this:
def f(dataframes: List[DataFrame]) -> DataFrame:
It’s currently raising an unhashable list error.
n
I see. In this case you can write a thin wrapper node function that builds the list however you like and calls your function f inside it. @datajoely is this the common way to do so? I can’t remember whether there is a good reason why we can’t resolve the inputs as a list/dict/tuple of something to match the function signature exactly.
d
so this should work
we know it doesn’t work if you try and map *args in modular pipelines
but I accept *args here
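The snippet referred to here is not preserved in this log. A minimal sketch of the idea, assuming a thin node function that accepts *args and forwards them to f as a list, might look like the following. The names f_node, df_a, df_b, df_c and combined are hypothetical, and plain lists of row dicts stand in for DataFrames so the example runs without pandas or Kedro installed.

```python
from typing import List

# Stand-in for a DataFrame so the sketch has no third-party dependencies.
Table = List[dict]


def f(dataframes: List[Table]) -> Table:
    """Stand-in for the original function: stacks all tables row-wise."""
    combined: Table = []
    for df in dataframes:
        combined.extend(df)
    return combined


def f_node(*dataframes: Table) -> Table:
    """Thin node function (hypothetical name).

    Kedro passes each entry of a flat `inputs` list as a separate
    positional argument, so *args collects them back into one list.
    """
    return f(list(dataframes))


# Hypothetical wiring; the dataset names are placeholders. Note the flat
# list of names: a nested list such as [["df_a", "df_b"]] is what leads
# to the unhashable list error.
#
# from kedro.pipeline import node
# node(func=f_node, inputs=["df_a", "df_b", "df_c"], outputs="combined")
```

Each name in inputs must be a string that the catalog can look up, which is why the list itself cannot be nested.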
j
Thank you @datajoely for sharing! Managed to fix the issue :)
🥳 3
t
There is a different approach that may help: use incremental datasets ☺️ You can create many dataframes as the output of a node and they will be treated as a single dataset! The same applies to inputs 👌
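A rough sketch of how a node might consume such a dataset, assuming it arrives as a dictionary keyed by partition id. With Kedro's partitioned datasets the values are typically load callables, while incremental datasets hand over already-loaded data, so the sketch accepts both. The function name combine_partitions is hypothetical, and lists of row dicts again stand in for DataFrames.

```python
from typing import Callable, Dict, List, Union

# Stand-in for a DataFrame so the sketch has no third-party dependencies.
Table = List[dict]


def f(dataframes: List[Table]) -> Table:
    """Stand-in for the original function: stacks all tables row-wise."""
    combined: Table = []
    for df in dataframes:
        combined.extend(df)
    return combined


def combine_partitions(
    partitions: Dict[str, Union[Table, Callable[[], Table]]]
) -> Table:
    """Node function (hypothetical name) taking one dict-shaped input.

    Partitions are visited in sorted id order. A value that is callable
    is treated as a lazy loader and invoked; anything else is assumed to
    be already-loaded data.
    """
    tables: List[Table] = []
    for partition_id in sorted(partitions):
        value = partitions[partition_id]
        tables.append(value() if callable(value) else value)
    return f(tables)
```

On the output side, a node would return a similar dictionary mapping partition ids to dataframes. The exact dataset class name and catalog entry depend on the Kedro version in use.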
s
I think creating a custom DataCatalog entry may also work in this case? Define your load and save functionality as desired to handle the list of dataframes.
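As a sketch of that idea, the class below shows the three methods a custom Kedro dataset implements (_load, _save and _describe). It is written as a plain class so it runs without Kedro. In a real project it would subclass Kedro's abstract dataset base class (AbstractDataSet in kedro.io for older releases, AbstractDataset in newer ones). The class name TableListDataSet, the dirpath argument and the JSON-per-table layout are all assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Any, Dict, List

# Stand-in for a DataFrame so the sketch has no third-party dependencies.
Table = List[dict]


class TableListDataSet:
    """Hypothetical dataset that stores a list of tables in one directory.

    In a Kedro project this would inherit from the abstract dataset base
    class, which calls _load, _save and _describe on your behalf.
    """

    def __init__(self, dirpath: str):
        self._dirpath = Path(dirpath)

    def _load(self) -> List[Table]:
        # Zero-padded file names keep lexicographic order equal to list order.
        files = sorted(self._dirpath.glob("part_*.json"))
        return [json.loads(path.read_text()) for path in files]

    def _save(self, data: List[Table]) -> None:
        self._dirpath.mkdir(parents=True, exist_ok=True)
        # Remove parts from an earlier save so a shorter list does not
        # leave stale files behind.
        for old in self._dirpath.glob("part_*.json"):
            old.unlink()
        for index, table in enumerate(data):
            path = self._dirpath / f"part_{index:04d}.json"
            path.write_text(json.dumps(table))

    def _describe(self) -> Dict[str, Any]:
        return {"dirpath": str(self._dirpath)}
```

With a dataset like this registered in the catalog, the node can declare a single input name and receive the whole list, so f can be used as the node function directly.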
👍 1