# questions
a
Hi all, I am developing a namespace pipeline and have run into an issue when using Neptune to track my experiments. When I pass
"neptune_run"
as an input to a pipeline node I get the following error:
ValueError: Pipeline input(s) {'NAMESPACE.neptune_run'} not found in the DataCatalog
Where
"NAMESPACE"
is my namespace pipeline name. Is there a way to use Neptune along with namespace pipelines? Thanks again.
👀 1
n
At what point did you get this error? Do you still get the error if you are not running with Neptune? I assume you are using their plugin; if you uninstall it, do you still get the error?
a
If I set
enable: false
in
neptune.yml
I do not get this error.
I can also run the first nodes in my pipeline with
enable: true
and track the run with Neptune. However, in one of my final nodes I want to upload error metrics. So I pass
"neptune_run"
as an input to this node. Within the function called at this node I run:
neptune_run["evaluation/metrics"] = error_metrics
to track the error metrics. This is the point at which I get the
ValueError
above. Should I implement this in a different way?
Thanks for the response @Nok Lam Chan
n
In that case, I think it's likely a
kedro-neptune
issue. Maybe worth asking in #plugins-integrations. I would also suggest asking in their repository https://github.com/neptune-ai/kedro-neptune since this is a community-maintained plugin.
a
Ok will do, thanks @Nok Lam Chan
j
@Andrew Doherty could you share the complete traceback?
a
Of course
INFO     Loading data from 'parameters' (MemoryDataSet)...                                                                                                               data_catalog.py:343
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /Users/andrew/code │
│ 8 in <module>                                                                                    │
│                                                                                                  │
│ /Users/andrew/code│
│ 3.9/site-packages/kedro/framework/cli/cli.py:211 in main                                         │
│                                                                                                  │
│   208 │   """                                                                                    │
│   209 │   _init_plugins()                                                                        │
│   210 │   cli_collection = KedroCLI(project_path=Path.cwd())                                     │
│ ❱ 211 │   cli_collection()                                                                       │
│   212                                                                                            │
│                                                                                                  │
│ /Users/andrew/code│
│ 3.9/site-packages/click/core.py:1130 in __call__                                                 │
│                                                                                                  │
│ /Users/andrew/code│
│ 3.9/site-packages/kedro/framework/cli/cli.py:139 in main                                         │
│                                                                                                  │
│   136 │   │   )                                                                                  │
│   137 │   │                                                                                      │
│   138 │   │   try:                                                                               │
│ ❱ 139 │   │   │   super().main(                                                                  │
│   140 │   │   │   │   args=args,                                                                 │
│   141 │   │   │   │   prog_name=prog_name,                                                       │
│   142 │   │   │   │   complete_var=complete_var,                                                 │
│                                                                                                  │
│ /Users/andrew/code │
│ 3.9/site-packages/click/core.py:1055 in main                                                     │
│                                                                                                  │
│ /Users/andrew│
│ 3.9/site-packages/click/core.py:1657 in invoke                                                   │
│                                                                                                  │
│ /Users/andrew/code │
│ 3.9/site-packages/click/core.py:1404 in invoke                                                   │
│                                                                                                  │
│ /Users/andrew/code │
│ 3.9/site-packages/click/core.py:760 in invoke                                                    │
│                                                                                                  │
│ /Users/andrew/code │
│ 3.9/site-packages/kedro/framework/cli/project.py:472 in run                                      │
│                                                                                                  │
│   469 │   with KedroSession.create(                                                              │
│   470 │   │   env=env, conf_source=conf_source, extra_params=params                              │
│   471 │   ) as session:                                                                          │
│ ❱ 472 │   │   session.run(                                                                       │
│   473 │   │   │   tags=tag,                                                                      │
│   474 │   │   │   runner=runner(is_async=is_async),                                              │
│   475 │   │   │   node_names=node_names,                                                         │
│                                                                                                  │
│ /Users/andrew/code│
│ 3.9/site-packages/kedro/framework/session/session.py:426 in run                                  │
│                                                                                                  │
│   423 │   │   )                                                                                  │
│   424 │   │                                                                                      │
│   425 │   │   try:                                                                               │
│ ❱ 426 │   │   │   run_result = runner.run(                                                       │
│   427 │   │   │   │   filtered_pipeline, catalog, hook_manager, session_id                       │
│   428 │   │   │   )                                                                              │
│   429 │   │   │   self._run_called = True                                                        │
│                                                                                                  │
│ /Users/andrew/code│
│ 3.9/site-packages/kedro/runner/runner.py:78 in run                                               │
│                                                                                                  │
│    75 │   │                                                                                      │
│    76 │   │   unsatisfied = pipeline.inputs() - set(catalog.list())                              │
│    77 │   │   if unsatisfied:                                                                    │
│ ❱  78 │   │   │   raise ValueError(                                                              │
│    79 │   │   │   │   f"Pipeline input(s) {unsatisfied} not found in the DataCatalog"            │
│    80 │   │   │   )                                                                              │
│    81                                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Pipeline input(s) {'dah_market.neptune_run'} not found in the DataCatalog
Because the namespace adds the prefix "dah_market" (my namespace), it appears that Kedro then sees this as a dataset rather than the
neptune_run
object. I wonder if I could pass the
neptune_run
as a literal to the function using partial. I'll try tomorrow as I need to dig into how this works.
Any thoughts @Juan Luis?
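A rough sketch of what seems to be happening, in plain Python rather than Kedro itself. The helper names `apply_namespace` and `unsatisfied_inputs` and the `predictions` dataset are hypothetical, and the assumption that the plugin registers an unprefixed `neptune_run` entry in the catalog is inferred from the non-namespaced case working. Only the set difference mirrors the check visible in the traceback (`pipeline.inputs() - set(catalog.list())`).

```python
from typing import Dict, Iterable, Set


def apply_namespace(
    names: Iterable[str], namespace: str, exposed: Iterable[str] = ()
) -> Dict[str, str]:
    """Prefix every dataset name with the namespace, except names listed in `exposed`."""
    keep = set(exposed)
    return {name: name if name in keep else f"{namespace}.{name}" for name in names}


def unsatisfied_inputs(pipeline_inputs: Iterable[str], catalog_names: Iterable[str]) -> Set[str]:
    """Same shape as the runner check in the traceback: inputs minus catalog entries."""
    return set(pipeline_inputs) - set(catalog_names)


# Assumption: the catalog holds an unprefixed "neptune_run" entry added by the plugin.
catalog_names = {"parameters", "neptune_run", "dah_market.predictions"}
node_inputs = ["predictions", "neptune_run"]  # "predictions" is a made-up dataset

# Every input gets the prefix, so "dah_market.neptune_run" is looked up and not found.
prefixed = apply_namespace(node_inputs, "dah_market")
missing_before = unsatisfied_inputs(prefixed.values(), catalog_names)

# Exempting "neptune_run" from prefixing lets the lookup match the catalog entry.
exposed = apply_namespace(node_inputs, "dah_market", exposed={"neptune_run"})
missing_after = unsatisfied_inputs(exposed.values(), catalog_names)

print(missing_before)  # {'dah_market.neptune_run'}
print(missing_after)   # set()
```

Kedro's `pipeline()` wrapper has an `inputs` argument that is documented as keeping the listed names unprefixed when a namespace is applied, so that may be worth trying here. Whether it resolves this case with kedro-neptune is not confirmed at this point in the thread.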
j
no more thoughts for now, haven't looked into this in depth yet but I intend to do so. if you have any further findings drop them here @Andrew Doherty, there's a chance this is an actual bug
a
OK thanks @Juan Luis . Shall I raise this as an issue in
kedro-neptune
GitHub?
j
yes, go ahead - the worst that can happen is that it isn't an actual bug 🙂 but will be good to draw their attention. I don't see them in this Slack.
👍 1
a
Hi @Juan Luis, I got a response on the GitHub issue which doesn't touch on the specific namespace pipeline challenge I am experiencing: https://github.com/neptune-ai/kedro-neptune/issues/66#issuecomment-1554557741. I think I need to create a basic example to share. Do you have any advice on how to create a simple reproducible namespace pipeline for this?
j
@Andrew Doherty thanks for your patience. maybe following the Spaceflights example? https://docs.kedro.org/en/0.18.0/tutorial/namespace_pipelines.html#adding-a-namespace-to-the-data-processing-pipeline you could put that in a repository and try to reproduce the problem. then write in the issue how to do so (
git clone spaceflights-modular && cd spaceflights-modular && kedro run
)
a
Nice @Juan Luis, that's very helpful. I'll work on that.
🙌🏼 1
Hi @Juan Luis. I have added the following comment with a reproducible example. https://github.com/neptune-ai/kedro-neptune/issues/66#issuecomment-1561367556
Let me know your thoughts
j
that looks perfect 💯 thanks @Andrew Doherty! hoping for the maintainers to respond, otherwise we can do some debugging
a
Great, thanks @Juan Luis. If the "neptune_run" object could be imported into the file where the pipeline is defined, we could use partial to pass it to the function rather than as a node input here: https://github.com/adoherty21/kedro_neptune_namespace_issue/blob/6abd300ff5980b953[…]/src/basic_namespace/common/pipelines/common_pipeline_01_raw.py If you have any thoughts on how we could do that, this would be a neat workaround.
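For reference, the `partial` idea could look roughly like the sketch below. It assumes a run handle is available when the pipeline is defined, which is exactly the open question here, so `fake_run` is a plain dict standing in for it and `log_error_metrics` is a hypothetical node function. Only the `neptune_run["evaluation/metrics"] = error_metrics` assignment comes from the thread.

```python
from functools import partial, update_wrapper
from typing import Any, Dict, MutableMapping


def log_error_metrics(
    error_metrics: Dict[str, float], neptune_run: MutableMapping[str, Any]
) -> None:
    """Store the error metrics under the key used earlier in this thread."""
    neptune_run["evaluation/metrics"] = error_metrics


# Hypothetical stand-in for the Neptune run handle; a dict supports the same
# item assignment, which is all this sketch needs.
fake_run: Dict[str, Any] = {}

# Bind the handle up front so the node only needs `error_metrics` as an input.
bound = partial(log_error_metrics, neptune_run=fake_run)
# partial objects have no __name__; copy it (and the docstring) from the original.
update_wrapper(bound, log_error_metrics)

bound({"rmse": 0.42})
print(bound.__name__)  # log_error_metrics
print(fake_run)        # {'evaluation/metrics': {'rmse': 0.42}}
```

With this shape, `neptune_run` would no longer appear among the node inputs, so the namespace prefix would not touch it. The trade-off is that the handle has to exist at import time instead of being supplied by the catalog at run time.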
s
Hey all, Siddhant here from neptune.ai 👋 Here's an update: https://github.com/neptune-ai/kedro-neptune/issues/66#issuecomment-1564608359 I've asked our devs to have a look, but since the same behavior works fine with normal pipelines, I think we might need support from Kedro's engineering team as well. I'll keep the thread updated. Have a great weekend!
👍 1
j
thanks @Siddhant Sadangi for chiming in!
n
Sure, happy to help if needed
👍 1
🤝 1
a
Hi all, just to confirm that @Siddhant Sadangi provided a workaround which has been tested and works. https://github.com/neptune-ai/kedro-neptune/issues/66 Thanks all for your time and effort.
🎉 2
j
Fantastic news!
we have an issue for Kedro now by the way https://github.com/kedro-org/kedro/issues/2646
👍 1