Hi everyone, I’m trying to pass a dictionary of ke...
# questions
h
Hi everyone, I’m trying to pass a dictionary of keyword arguments to a function in a Kedro node, but it doesn’t seem to be working. Instead, I have to use a lambda function to pass the arguments as separate inputs. For example, I would like to have a node that looks like this (knowing that best practice is to move the sample_size to a config):
node(
    func=train_test_split,
    inputs={"df": "input", "sample_size": 50},
    ...
),
However, this doesn’t seem to work, and I get an error referring to a separator. I noticed that in the modular pipeline, a similar syntax is allowed. Is that on purpose? What does work is:
node(
    func=lambda df: train_test_split(df, sample_size=50),
    inputs="input",
    ...
)
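A possible alternative to the lambda, as a minimal sketch: functools.partial binds the keyword argument ahead of time, and functools.update_wrapper copies the original function's name onto the partial. The train_test_split below is a hypothetical stand-in that works on a plain list rather than a DataFrame, and split_50 is a made-up name.

```python
from functools import partial, update_wrapper


def train_test_split(df, sample_size):
    """Hypothetical stand-in: the first `sample_size` rows are the sample, the rest are held out."""
    return df[:sample_size], df[sample_size:]


# Bind sample_size up front so the resulting callable takes only `df`.
split_50 = partial(train_test_split, sample_size=50)
# Partials have no __name__ by default; copy it over from the wrapped function.
update_wrapper(split_50, train_test_split)

sample, rest = split_50(list(range(120)))
print(split_50.__name__, len(sample), len(rest))
```

split_50 could then stand in for the lambda as func, with inputs="input" unchanged. The value 50 is still hard-coded here, so this only tidies the workaround.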
n
because sample_size does not exist. You need to have sample_size in your parameters.yml and use params:sample_size instead.
I noticed that in the modular pipeline, a similar syntax is allowed.
Can you share the example?
h
ah, okay, I got the impression the error was because I was passing a dictionary, but I see I have to move the sample_size to config
inputs={
    "df": "input",
    "sample_size": "params:finetuner.sample_size",
},
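To make the naming concrete, here is a rough sketch of how a string such as params:finetuner.sample_size can be read as a dotted path into a nested parameters dictionary. This is an illustration only and not Kedro's implementation; resolve_input and the example parameters dictionary are made up for this sketch.

```python
from typing import Any

# Hypothetical contents of parameters.yml, already loaded into a dict.
parameters = {"finetuner": {"sample_size": 50}}


def resolve_input(name: str, params: dict) -> Any:
    """Illustrative helper: look up a 'params:a.b' reference in a nested dict."""
    prefix = "params:"
    if not name.startswith(prefix):
        raise ValueError(f"{name!r} is a dataset name, not a parameter reference")
    value: Any = params
    # Walk the dotted path one key at a time.
    for key in name[len(prefix):].split("."):
        value = value[key]
    return value


print(resolve_input("params:finetuner.sample_size", parameters))
```

The point of the sketch is that the node input is always a name; the actual value (50 here) lives in the configuration and is looked up from that name.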
that's the syntax I was referring to (the parameter overriding example from the modular pipelines)
but to be clear, it's not that sample_size does not exist, right? the issue is that 50 is not a valid input
okay, in that case the error message could be clearer
n
Everything inside the pipeline is a string literal, which will be a reference to a dataset or parameter; you cannot pass a value directly there.
Can you share the error message you got?
h
AttributeError: 'int' object has no attribute 'split'
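That message is what Python raises when a string method is called on an integer, which fits the explanation that every node input is expected to be a name. A minimal sketch, assuming the name is split on a separator somewhere during input handling (the "@" separator and both helper functions are hypothetical stand-ins, not Kedro's code), together with the kind of up-front check that could give a clearer message:

```python
def split_name(name, separator="@"):
    """Hypothetical stand-in for any step that treats an input as a string name."""
    return name.split(separator)


def check_inputs(inputs: dict) -> None:
    """Hypothetical validation: reject raw values with a readable message."""
    for arg, name in inputs.items():
        if not isinstance(name, str):
            raise TypeError(
                f"Input for argument {arg!r} must be a dataset or parameter name "
                f"(a string), got {type(name).__name__}: {name!r}"
            )


inputs = {"df": "input", "sample_size": 50}

try:
    for name in inputs.values():
        split_name(name)
except AttributeError as err:
    print(err)  # the unhelpful message

try:
    check_inputs(inputs)
except TypeError as err:
    print(err)  # a message that names the offending argument
```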
n
We are happy to take a PR to improve the error message if you would like to.
h
cool, I don't have the bandwidth to do that PR ATM, since I'm working on an improved AWS batch runner. I would like to contribute that in the near future (preferably by making it a plugin in line with kedro-sagemaker)
(also because, looking at the stack trace of the error, I don't know exactly where the improved error handling should go, and I'm also not 100% sure I found the dictionary node dataset input mentioned in the documentation, so in my mind those two should go hand in hand)
n
h
nice!
n
I'm not sure where the fix should go in the code, but I have opened an issue to document it: https://github.com/kedro-org/kedro/issues/2733
h
thanks!
n
I have created a draft PR here: https://github.com/kedro-org/kedro/pull/2734