#questions

Hugo Evers

06/27/2023, 2:01 PM
Hi everyone, I’m trying to pass a dictionary of keyword arguments to a function in a Kedro node, but it doesn’t seem to be working. Instead, I have to use a lambda function to pass the arguments as separate inputs. For example, I would like to have a node that looks like this (knowing that best practice is to move `sample_size` to a config):
node(
    func=train_test_split,
    inputs={"df": "input", "sample_size": 50},
    ...
),
However, this doesn’t seem to work, and I get an error referring to a separator. I noticed that in the modular pipeline, a similar syntax is allowed. Is that on purpose? What does work is:
node(
    func=lambda df: train_test_split(df, sample_size=50),
    inputs="input",
    ...
)
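A possible alternative to the lambda, sketched here in plain Python, is `functools.partial`, which binds the keyword argument ahead of time. The `train_test_split` below is a hypothetical stand-in for the function in the thread (its real signature is not shown), and the `node(...)` line in the comment is only an illustration of where the callable would go.

```python
from functools import partial, update_wrapper


def train_test_split(df, sample_size):
    """Hypothetical stand-in: split a sequence into two parts."""
    return df[:sample_size], df[sample_size:]


# Bind the keyword argument up front, as the lambda above does.
split_50 = partial(train_test_split, sample_size=50)
# Copy __name__ and __doc__ from the original so the callable stays identifiable.
update_wrapper(split_50, train_test_split)

# In a pipeline this would be passed as: node(func=split_50, inputs="input", ...)
train, test = split_50(list(range(120)))
print(split_50.__name__, len(train), len(test))
```

Either way the value stays hard-coded in the pipeline definition, so moving it to the parameters file remains the cleaner option.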

Nok Lam Chan

06/27/2023, 2:06 PM
Because `sample_size` does not exist. You need to have `sample_size` in your `parameters.yml` and use `params:sample_size` instead.
> I noticed that in the modular pipeline, a similar syntax is allowed.
Can you share the example?

Hugo Evers

06/27/2023, 2:07 PM
Ah, okay, I got the impression the error was because I was passing a dictionary, but I see I have to move the sample_size to config:
inputs={
    "df": "input",
    "sample_size": "params:finetuner.sample_size",
},
That’s the syntax I was referring to (the parameter overriding example from the modular pipelines). But to be clear, it’s not that `sample_size` does not exist, right? The issue is that 50 is not a valid input.
👍🏼 1
okay, in that case the error message could be clearer

Nok Lam Chan

06/27/2023, 2:10 PM
Everything inside the `pipeline` is a string literal, which will be a reference to a `dataset` or `parameter`; you cannot pass a value directly there.
Can you share what error message you got?
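To make the "everything is a name" idea concrete, here is a toy model in plain Python. It is not Kedro's implementation: `resolve`, `run_node`, the `catalog` and `params` dictionaries, and the stand-in `train_test_split` are all hypothetical, and only the `params:finetuner.sample_size` naming comes from the thread.

```python
def resolve(name, catalog, params):
    """Toy lookup: 'params:a.b' walks the params dict; anything else is a dataset name."""
    if name.startswith("params:"):
        value = params
        for key in name[len("params:"):].split("."):
            value = value[key]
        return value
    return catalog[name]


def run_node(func, inputs, catalog, params):
    """Resolve every input name to a value, then call func with keyword arguments."""
    kwargs = {arg: resolve(name, catalog, params) for arg, name in inputs.items()}
    return func(**kwargs)


def train_test_split(df, sample_size):
    # Hypothetical stand-in for the function discussed in the thread.
    return df[:sample_size], df[sample_size:]


catalog = {"input": list(range(120))}        # dataset name -> loaded data
params = {"finetuner": {"sample_size": 50}}  # mirrors an entry in a parameters file

train, test = run_node(
    train_test_split,
    {"df": "input", "sample_size": "params:finetuner.sample_size"},
    catalog,
    params,
)
print(len(train), len(test))
```

In this model the dictionary keys are the function's argument names and the values are always names to look up, which is why a bare `50` has nothing to resolve to.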

Hugo Evers

06/27/2023, 2:10 PM
AttributeError: 'int' object has no attribute 'split'
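The message is what Python raises when string-only code receives an integer. The sketch below reproduces it under the assumption that input names are split on a separator character somewhere in the framework. The `"@"` separator, `base_name`, and `check_inputs` are illustrative guesses, not Kedro's actual code. It also shows what a clearer up-front check could look like.

```python
SEPARATOR = "@"  # assumed separator, for illustration only


def base_name(name):
    """Mimics string-only handling: keep the part of the name before the separator."""
    return name.split(SEPARATOR)[0]


def check_inputs(inputs):
    """Hypothetical up-front check that reports which input is not a string."""
    for arg, name in inputs.items():
        if not isinstance(name, str):
            raise ValueError(
                f"input {arg!r} must be a dataset or 'params:' name (a string), "
                f"got {type(name).__name__}: {name!r}"
            )


inputs = {"df": "input", "sample_size": 50}

try:
    [base_name(name) for name in inputs.values()]
except AttributeError as err:
    raw_message = str(err)  # the unhelpful message from the thread

try:
    check_inputs(inputs)
except ValueError as err:
    clear_message = str(err)  # names the offending input and its type

print(raw_message)
print(clear_message)
```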

Nok Lam Chan

06/27/2023, 2:11 PM
We are happy to take a PR to improve the error message if you would like to.

Hugo Evers

06/27/2023, 2:14 PM
Cool, I don’t have the bandwidth to do that PR at the moment, since I’m working on an improved AWS Batch runner, which I would like to contribute in the near future (preferably by making it a plugin in line with kedro-sagemaker).
(Also, looking at the stack trace of the error, I don’t know exactly where the improved error handling should go, and I’m also not 100% sure I found the dictionary node dataset input mentioned in the documentation, so in my mind those two should go hand in hand.)
👍🏼 1

Nok Lam Chan

06/27/2023, 2:33 PM

Hugo Evers

06/27/2023, 2:33 PM
nice!

Nok Lam Chan

06/27/2023, 2:41 PM
I don’t know where in the code the fix should go, but I have opened an issue here to document it: https://github.com/kedro-org/kedro/issues/2733

Hugo Evers

06/27/2023, 2:42 PM
thanks!

Nok Lam Chan

06/27/2023, 3:01 PM
https://github.com/kedro-org/kedro/pull/2734 I have created a draft PR here.