#questions

Jong Hyeok Lee

02/10/2023, 9:32 AM
Hello everyone! Does anyone know how to pass a list of dataframes as an input to a pipeline node in Kedro? I have a function that takes a list of dataframes, but it doesn't seem straightforward to implement.

Merel

02/10/2023, 9:34 AM
Nodes can take single inputs as well as lists; you'll just have to specify that your node input is a list.

Nok Lam Chan

02/10/2023, 9:35 AM
Can your function just take *args? Or can you elaborate a bit on what the problem is, since you can define a list of inputs in the pipeline?

Jong Hyeok Lee

02/10/2023, 9:39 AM
@Merel @Nok Lam Chan Sorry, I think I wasn't clear enough! I'm trying to do something like the below in the node:
inputs=[[list of dataframes]]
because my function looks like this:
def f(dataframes: List[DataFrame]) -> DataFrame:
It's currently raising an unhashable list error.
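For readers hitting the same error, here is a minimal sketch of one plausible cause. It assumes that the input names get hashed somewhere during node validation (for example by being collected into a set); that is an assumption about Kedro internals, not something confirmed in this thread. The dataset names are hypothetical.

```python
# Hypothetical dataset names; plain strings stand in for catalog entries.
flat_inputs = ["df_a", "df_b", "df_c"]
nested_inputs = [["df_a", "df_b", "df_c"]]

# A flat list of names can be collected into a set without trouble.
unique_names = set(flat_inputs)

# A nested list cannot: the inner list is mutable and therefore not hashable.
try:
    set(nested_inputs)
    nested_error = None
except TypeError as exc:
    nested_error = str(exc)

# The message mentions an unhashable 'list' type.
print(nested_error)
```

If that assumption holds, the error comes from the extra level of nesting in `inputs`, not from the function `f` itself.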

Nok Lam Chan

02/10/2023, 10:17 AM
I see. In this case you can write a thin wrapper node function that builds the arguments however you like, and call your function f inside it. @datajoely is this the common way to do so? I can't remember whether there is a good reason why we can't resolve the inputs as a list/dict/tuple of something to match the function signature exactly.

datajoely

02/10/2023, 10:17 AM
So this should work.
We know it doesn't work if you try to map *args in modular pipelines,
but I accept *args here.
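The snippet shared with this message did not survive in the archive. What follows is only a rough sketch of the thin-wrapper and *args idea discussed above, not the original code. The names `f_node`, `df_a`, `df_b`, `df_c` and `combined_df` are hypothetical, and plain lists of row dicts stand in for real dataframes so the example runs without pandas.

```python
from typing import Any, List


def f(dataframes: List[Any]) -> Any:
    """Stand-in for the real function: concatenates the rows of every frame."""
    combined = []
    for frame in dataframes:
        combined.extend(frame)
    return combined


def f_node(*dataframes: Any) -> Any:
    """Thin wrapper: each dataset arrives as its own positional argument."""
    return f(list(dataframes))


# Calling the wrapper directly, the way a node would with three inputs.
result = f_node([{"x": 1}], [{"x": 2}], [{"x": 3}])

# Optional Kedro wiring, skipped when Kedro is not installed.
try:
    from kedro.pipeline import node
except ImportError:
    node = None

if node is not None:
    combine_node = node(
        func=f_node,
        inputs=["df_a", "df_b", "df_c"],  # flat list of names, not a nested list
        outputs="combined_df",
        name="combine_dataframes",
    )
```

The wrapper keeps `f` unchanged, so its signature can stay `List[DataFrame]` while the node declares a flat list of input names.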

Jong Hyeok Lee

02/10/2023, 1:30 PM
Thank you @datajoely for sharing! Managed to fix the issue :)
🥳 3

Toni - TomTom - Madrid

02/10/2023, 5:26 PM
There is a different approach that may help: use incremental datasets ☺️ A node can produce many dataframes as output and they will be treated as a single dataset by the next node! Same for input 👌
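A rough sketch of what the consuming node might look like with this approach. It assumes the node receives one dictionary keyed by partition id, whose values are either already-loaded data or zero-argument load callables (the sketch handles both). The partition names and the fake loaders are hypothetical stand-ins, and lists of row dicts again stand in for dataframes.

```python
from typing import Any, Callable, Dict, List, Union


def f(dataframes: List[Any]) -> Any:
    """Stand-in for the real function: concatenates the rows of every frame."""
    combined = []
    for frame in dataframes:
        combined.extend(frame)
    return combined


def combine_partitions(partitions: Dict[str, Union[Callable[[], Any], Any]]) -> Any:
    """Single-input node function: turns the partition dict into a list for f."""
    frames = []
    for partition_id in sorted(partitions):  # sort for a deterministic order
        entry = partitions[partition_id]
        # Accept either a lazy loader or data that is already loaded.
        frames.append(entry() if callable(entry) else entry)
    return f(frames)


# Fake partitions standing in for what the catalog would hand to the node.
fake_partitions = {
    "part_b": lambda: [{"x": 2}],
    "part_a": lambda: [{"x": 1}],
}
result = combine_partitions(fake_partitions)
```

With this shape the node declares a single input name, so the number of dataframes can change without touching the pipeline definition.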

Sergei Benkovich

02/12/2023, 8:34 PM
I think creating a custom data catalog entry may also work in this case? Define your load and save functionality as desired to handle the list of dataframes.
👍 1