Elaine Resende
11/20/2023, 7:41 PM
--params argument in the command line? If so, how should I do it?
I tried: kedro run --pipeline=inference --params=job="{'job_id':'98727','timezone':'-8.0'},source=mwb,version=1", but I got a ParserError...
ParserError: while parsing a flow mapping
  in "<unicode string>", line 1, column 1:
    {'job_id':'98727'
    ^
expected ',' or '}', but got '<stream end>'
  in "<unicode string>", line 1, column 18:
    {'job_id':'98727'
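One possible reading of this error, sketched under assumptions: if the --params string is split on commas before each key=value pair is parsed, the dict literal gets cut at its first comma, which leaves exactly the truncated text shown in the traceback. The helper naive_split_params below is hypothetical and only mimics that behaviour; it is not Kedro's actual implementation.

```python
# Hypothetical helper: a simplified stand-in for comma-based splitting of a
# --params string. This is an assumption, not Kedro's real code.
def naive_split_params(raw: str) -> dict[str, str]:
    """Split "k1=v1,k2=v2" on commas, then on the first "=" of each item."""
    result = {}
    for item in raw.split(","):
        key, _, value = item.partition("=")
        result[key.strip()] = value.strip()
    return result


# What the shell hands over once the double quotes are stripped.
raw = "job={'job_id':'98727','timezone':'-8.0'},source=mwb,version=1"
parsed = naive_split_params(raw)

# The dict literal is cut at its first comma, so the value for "job" is an
# unterminated mapping of 17 characters; a parser reaches end-of-input at
# column 18, matching the error message.
print(parsed["job"])
print(len(parsed["job"]))
```

If this reading is right, any value containing a comma will be truncated the same way, regardless of how it is quoted in the shell.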
datajoely
11/20/2023, 8:47 PM
Nok Lam Chan
11/21/2023, 2:34 AM
marrrcin
11/21/2023, 8:10 AM
Iñigo Hidalgo
11/21/2023, 10:44 AM
datajoely
11/21/2023, 10:44 AM
--config piece?
Iñigo Hidalgo
11/21/2023, 10:44 AM
datajoely
11/21/2023, 10:44 AM
marrrcin
11/21/2023, 10:45 AM
datajoely
11/21/2023, 10:45 AM
Iñigo Hidalgo
11/21/2023, 10:48 AM
@click.option(
    "--config",
    "-c",
    type=click.Path(exists=True, dir_okay=False, resolve_path=True),
    help=CONFIG_FILE_HELP,
    callback=_config_file_callback,
)
@click.option("--params", type=str, default="", help=PARAMS_ARG_HELP, callback=_split_params)
def run(
    tag,
    env,
    parallel,
    runner,
    is_async,
    node_names,
    to_nodes,
    from_nodes,
    from_inputs,
    to_outputs,
    load_version,
    pipeline,
    config,
    params,
):
    """Run the pipeline."""
    if parallel and runner:
        raise KedroCliError(
            "Both --parallel and --runner options cannot be used together. " "Please use either --parallel or --runner."
        )
    runner = runner or "SequentialRunner"
    if parallel:
        runner = "ParallelRunner"
    runner_class = load_obj(runner, "kedro.runner")

    tag = _get_values_as_tuple(tag) if tag else tag
    node_names = _get_values_as_tuple(node_names) if node_names else node_names

    package_name = str(Path(__file__).resolve().parent.name)
    with KedroSession.create(package_name, env=env, extra_params=params) as session:
        session.run(
            tags=tag,
            runner=runner_class(is_async=is_async),
            node_names=node_names,
            from_nodes=from_nodes,
            to_nodes=to_nodes,
            from_inputs=from_inputs,
            to_outputs=to_outputs,
            load_versions=load_version,
            pipeline_name=pipeline,
        )
This is the run function in my cli.py; it looks like we're somehow discarding that config, right?
marrrcin
11/21/2023, 10:54 AM
_config_file_callback updates the context
Iñigo Hidalgo
11/21/2023, 10:56 AM
def _config_file_callback(ctx, param, value):  # pylint: disable=unused-argument
    """Config file callback, that replaces command line options with config file
    values. If command line options are passed, they override config file values.
    """
    # for performance reasons
    import anyconfig  # pylint: disable=import-outside-toplevel

    ctx.default_map = ctx.default_map or {}
    section = ctx.info_name

    if value:
        config = anyconfig.load(value)[section]
        ctx.default_map.update(config)

    return value
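The precedence described in that docstring can be sketched without click: values from the config file act as defaults, and anything passed explicitly on the command line wins. The function resolve_options and the sample values are hypothetical, for illustration only; click's own default_map handling is more involved than this.

```python
# Hypothetical illustration of "config file values as defaults, command line
# overrides". Names and values here are assumptions, not Kedro or click code.
def resolve_options(config_section: dict, cli_options: dict) -> dict:
    """Merge options so explicitly passed CLI values override config values."""
    resolved = dict(config_section)  # start from the config file's section
    for name, value in cli_options.items():
        if value is not None:  # None stands for "not passed on the command line"
            resolved[name] = value
    return resolved


config_section = {"pipeline": "pipeline1", "env": "env1"}
cli_options = {"pipeline": "inference", "env": None}

resolved = resolve_options(config_section, cli_options)
print(resolved)  # {'pipeline': 'inference', 'env': 'env1'}
```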
datajoely
11/21/2023, 10:57 AM
Iñigo Hidalgo
11/21/2023, 10:59 AM
Nok Lam Chan
11/21/2023, 11:05 AM
--params will still take priority if both are defined.
I used to use both: --config for some plugin metadata, and --params for Kedro's parameters.
Iñigo Hidalgo
11/21/2023, 11:16 AM
Elaine Resende
11/21/2023, 12:41 PM
run:
  tags: tag1, tag2, tag3
  pipeline: pipeline1
  parallel: true
  nodes_names: node1, node2
  env: env1
  params: job="{'job_id':'98727','timezone':'-8.0'}"
datajoely
11/21/2023, 12:44 PM
params:
  job:
    job_id: '98727'
    timezone: '-8.0'
Elaine Resende
11/21/2023, 1:17 PM
ValueError: Pipeline input(s) {'job'} not found in the DataCatalog
marrrcin
11/21/2023, 1:27 PM
datajoely
11/21/2023, 1:28 PM
Elaine Resende
11/21/2023, 2:16 PM