# questions
r
Hi, what would be the proper way to terminate pipeline execution early based on a condition? The scenario is the following:
• Node A: parses a website;
• Node B: downloads and cleans the required data;
• Node C: writes new data to a SQL database.
If Node A finds no new data, I want to terminate the pipeline cleanly without executing nodes B and C.
i
Maybe even using `on_node_error` to catch a custom `DataNotFound` exception? Just spitballing haha
r
For now I did something like:
    if <no new data>:
        sys.exit(0)
at the end of Node A. But Kedro still thinks that the pipeline wasn’t properly terminated.
With `on_node_error` I get the same problem as with `sys.exit(0)`.
i
Maybe someone else will come along with a better idea... The only other thing I can think of would be to allow nodes B and C to accept `None` as input and just do nothing if that's the input. But then you might save over intermediate outputs with `None` if B outputs anything. Also not ideal.
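A minimal sketch of the `None` pass-through idea, written as plain Python functions rather than a real Kedro pipeline. The function names, arguments, and the in-memory `table` list are hypothetical stand-ins for the three nodes described above; the cleaning step is just a placeholder.

```python
from typing import Optional


def parse_website(seen: set, found: list) -> Optional[list]:
    """Node A (hypothetical): return unseen items, or None when nothing is new."""
    new_items = [item for item in found if item not in seen]
    return new_items or None


def download_and_clean(new_items: Optional[list]) -> Optional[list]:
    """Node B (hypothetical): do nothing when Node A reported no new data."""
    if new_items is None:
        return None
    # Placeholder cleaning step: trim whitespace and lowercase.
    return [item.strip().lower() for item in new_items]


def write_to_database(rows: Optional[list], table: list) -> int:
    """Node C (hypothetical): skip the write when there is nothing to store."""
    if rows is None:
        return 0
    table.extend(rows)  # stand-in for the SQL insert
    return len(rows)
```

Every node still runs, so the downside mentioned above remains: any dataset saved between the nodes would be written with `None` on a run that finds no new data.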
n
I think using `on_node_error` makes a lot of sense, though I checked, and the current implementation won't work without a custom runner. The hook simply catches the error and allows you to do something (e.g. logging) before the error is raised again.
r
@Nok Lam Chan does this look like a good place to start with a custom runner? https://kedro.org/blog/build-a-custom-kedro-runner I understand that I need to build my own runner which will treat a specific custom exception as a pipeline termination signal. So instead of calling `sys.exit(0)` I will raise, for example, `TerminateExecutionException`, and my custom runner will handle it properly. Does my understanding make sense?
n
This is a good place to start, and I think you are right. You will need to raise a custom exception and, when the runner catches it, break out of the execution loop and finish gracefully.
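The "raise a custom exception, catch it, break out of the loop" idea can be sketched in plain Python. This is a toy loop only, not Kedro's runner API: `run_nodes` is a hypothetical helper, and a real custom runner would follow the structure in the blog post linked above. `TerminateExecutionException` is the name proposed earlier in the thread.

```python
class TerminateExecutionException(Exception):
    """Raised by a node to request an early, clean stop of the run."""


def run_nodes(nodes, data=None):
    """Toy sequential execution loop (hypothetical, not Kedro's runner).

    Calls each node with the previous node's output. Returns the last
    successfully produced output and the names of the nodes that completed.
    """
    completed = []
    for node in nodes:
        try:
            data = node(data)
        except TerminateExecutionException:
            # Termination signal: skip the remaining nodes without failing.
            break
        completed.append(node.__name__)
    return data, completed
```

Only the dedicated exception is caught, so any other error raised by a node still propagates and fails the run as usual.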
i
I'm not with the Kedro team, but I will say: when I find I want to do some strange logic like this, it's usually because there were some design decisions in the pipeline which should be changed. @Ian Whalen’s suggestion to "allow node B and C to accept `None` as input and just do nothing if that's the input" goes in that direction. Sometimes the solution is to rethink how you're structuring a pipeline in order to attain the result you want, rather than trying to modify Kedro's behavior to work with your current code.