#questions

Sohum Sachathamakul

10/30/2023, 3:01 AM
Situation:
• Kedro (0.18.8), Python (3.8.16), Databricks (9.1 LTS)
• We run
%load_ext kedro.ipython
and
%reload_kedro
to run our Kedro pipeline in the Databricks environment, and this works.
Complication:
• When the code hits a
raise Exception
the Kedro pipeline fails but the Databricks job does not, so the error never surfaces and users do not know the job actually failed.
• The
%load_ext kedro.ipython
magic seems to be what suppresses the exception. Before running it, a raise in the notebook fails the job; afterwards the exception is suppressed.
Help needed:
• Where should we look in the Kedro codebase to debug this? Is there anything that stands out as a place where exceptions might be getting suppressed?
👀 1

Juan Luis

10/30/2023, 9:27 AM
thanks for reporting @Sohum Sachathamakul and sorry you're having trouble with this. I think we have enough information to reproduce the issue. we'll investigate a bit and get back to you.
cannot reproduce on Databricks Community with an empty project after
%pip install kedro
. @Sohum Sachathamakul can you share your complete list of dependencies and Kedro plugins?
this is the
.ipynb
I used by the way

Kyle Chung

10/30/2023, 11:21 AM

Juan Luis

10/30/2023, 11:26 AM
thanks @Kyle Chung! I'm not sure
kedro.ipython
is injecting
ipywidgets
, but will keep this in mind. @Sohum Sachathamakul and I are debugging, will report back when we know what caused this issue
okay, we narrowed it down by a lot:
• we started by removing hooks, a custom config loader, the logging config, and all the datasets. it still happened.
• ⚠️ the "broken traceback" appears after doing
import kedro.framework.cli.utils
⚠️
• copy-pasting
kedro/framework/cli/utils.py
into the notebook and executing it didn't break the traceback.
• this behavior appears in Kedro 0.18.8, 0.18.9 and 0.18.14, so it's not something we've fixed since.
•
sys.excepthook
didn't change before and after, it was basically
<bound method InteractiveShell.excepthook of <IPython.terminal.interactiveshell.TerminalInteractiveShell object at ...>>
we'll continue tomorrow.
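One way to continue this kind of before/after comparison is to snapshot a few more hooks than sys.excepthook around the suspect import. The sketch below is only a guess at what is worth checking: sys.displayhook and the IPython shell's showtraceback attributes are not mentioned above, and snapshot_hooks / hooks_changed_by_import are hypothetical helper names, not Kedro or IPython APIs.

```python
import importlib
import sys


def snapshot_hooks():
    """Collect the hooks that control how exceptions and values are displayed."""
    hooks = {
        "sys.excepthook": sys.excepthook,
        "sys.displayhook": sys.displayhook,
    }
    try:
        from IPython import get_ipython  # only available when IPython is installed
    except ImportError:
        return hooks
    shell = get_ipython()  # None when not running inside an IPython shell
    if shell is not None:
        # getattr with a default, since attribute names may differ between versions
        hooks["shell.showtraceback"] = getattr(shell, "showtraceback", None)
        hooks["shell._showtraceback"] = getattr(shell, "_showtraceback", None)
    return hooks


def hooks_changed_by_import(module_name):
    """Import a module and return the names of the hooks that differ afterwards."""
    before = snapshot_hooks()
    importlib.import_module(module_name)
    after = snapshot_hooks()
    # Compare with != rather than `is not`: bound methods are re-created on each access
    return sorted(name for name in after if after[name] != before.get(name))
```

In a fresh notebook session (so that the module has not been imported yet), hooks_changed_by_import("kedro.framework.cli.utils") would list whichever of these hooks the import replaced. An empty list would suggest the change happens somewhere this sketch does not look.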

Nok Lam Chan

11/01/2023, 4:11 PM
Update:
pip install git+https://github.com/kedro-org/kedro.git@noklam/tempfix-databricks
This branch seems to fix the issue. It suppresses the call to
rich.pretty.install
I suspect the issue is that the older Databricks runtime 9.0 has some strange behavior, because this cannot be reproduced on the current Databricks version (and it would affect more people if this were a generic issue).
I think we should reconsider making
rich
optional: it should only install the hook when
rich
is installed, so there is an easy way to disable it without too much configuration.
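The "only install the hook when rich is installed" idea could look roughly like the sketch below. This is not Kedro's actual code: install_pretty_printing is a hypothetical helper, and the only rich function it relies on is rich.pretty.install.

```python
import importlib
import importlib.util


def install_pretty_printing(package="rich"):
    """Install the package's pretty-printing hook only if the package is importable.

    Returns True when the hook was installed and False when the package is absent.
    """
    if importlib.util.find_spec(package) is None:
        # Package not installed: skip quietly so plain output and tracebacks remain
        return False
    pretty = importlib.import_module(f"{package}.pretty")
    pretty.install()
    return True
```

With this shape, uninstalling rich would be enough to turn the hook off, with no extra configuration.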

datajoely

11/01/2023, 4:12 PM
yep
sorry about that 🙈

Nok Lam Chan

11/01/2023, 4:37 PM
For reference: https://docs.databricks.com/en/notebooks/ipython-kernel.html The Databricks docs indicate that Databricks Runtime >=10 uses the IPython kernel, which affects how rich hooks in. I don't fully understand what happened here, but most likely rich interferes in a different way depending on the kernel. Cc @Antony Milne

datajoely

11/01/2023, 4:42 PM
It means you no longer need to do this! https://github.com/Textualize/rich/issues/2422

Nok Lam Chan

11/01/2023, 4:50 PM
Yeah, I have a separate PR to make the syshook install reversible, but it seems their team is busy working on something else, so that PR has been hanging for a while.

Sohum Sachathamakul

11/01/2023, 5:01 PM
@Nok Lam Chan thank you so much for creating a fork, which did indeed fix the problem. May I ask you for a short-term solution (while this goes through the proper route to a hotfix 🙂)? I can think of the following:
• Override the logging (?) somehow in my Kedro project. This is preferred because it needs no environment change and only minimal code change.
• Keep increasing the Databricks version until the issue is resolved.

Nok Lam Chan

11/02/2023, 12:00 PM
If possible, bumping the Databricks version should work. Alternatively, if it's acceptable, you can mock the rich function so that it doesn't do anything.
from unittest.mock import patch
patch("rich.pretty.install")
You can include this at the top of the notebook.

Juan Luis

11/02/2023, 12:05 PM
hmm or rather
patch("kedro.logging.rich.pretty.install")
? (always confusing how mock works)

Nok Lam Chan

11/02/2023, 2:22 PM
both should work, though
kedro.logging
was only introduced partway through the 0.18.x series, so I am not sure this is the correct namespace. The difference is that
patch("kedro.logging.rich.pretty.install")
only affects the scope of Kedro, while
patch("rich.pretty.install")
affects any use of rich.pretty.install
👍🏼 1
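A caveat on the scoping distinction, sketched with a stdlib stand-in: unittest.mock.patch resolves a dotted target by walking attributes, so a path that goes through another module's import of rich would end up at the same rich.pretty module object. Below, logging.os plays the role of kedro.logging.rich (this assumes kedro.logging imports rich at module level, which is not verified here), and the longer target string replaces the function for every caller.

```python
import logging
import os
from unittest.mock import patch

# The logging module does `import os`, so `logging.os` is the very same module object.
assert logging.os is os

with patch("logging.os.getcwd") as fake_getcwd:
    # Patching via the longer path still swaps the attribute on the shared `os` module,
    # so every caller of os.getcwd sees the mock, not just code inside `logging`.
    patched_globally = os.getcwd is fake_getcwd

# On leaving the block the original function is restored.
restored = os.getcwd is not fake_getcwd
```

If the same holds for the Kedro path, the two patch targets would behave identically, and the shorter one has the advantage of not depending on the kedro.logging namespace existing.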

Sohum Sachathamakul

11/04/2023, 4:51 AM
Hey @Nok Lam Chan, bumping the Databricks version up to 11.3 didn't resolve the issue. I also tried putting both
patch("kedro.logging.rich.pretty.install")
and
patch("rich.pretty.install")
at the top of the notebook, and that didn't work either 😢

Nok Lam Chan

11/04/2023, 4:53 AM
That's strange... Did the old solution (the fork branch) still work?
👌 1
Just checking: you need to use
patch
as a context manager. Because of that you cannot use the
%reload_kedro
magic, since it's a Jupyter magic, but you should be able to call
reload_kedro
as a function. The code should look like this:
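from unittest.mock import patch
import rich.pretty  # make sure the patch target is importable


# Keep rich.pretty.install mocked while Kedro is imported and loaded
with patch("rich.pretty.install"):
    from kedro.ipython import reload_kedro

    # Function form of %reload_kedro; it should make `session` available in the notebook
    reload_kedro()
    session.run()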
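Why a bare patch(...) call at the top of a notebook has no effect can be seen with a stdlib target: calling patch only builds a patcher object, and nothing is replaced until it is entered as a context manager or started explicitly. The sketch uses os.getcwd purely as a stand-in for rich.pretty.install.

```python
import os
from unittest.mock import patch

original = os.getcwd

# 1. Creating the patcher alone changes nothing.
patcher = patch("os.getcwd")
unchanged_after_bare_call = os.getcwd is original

# 2. As a context manager, the replacement is active only inside the block.
with patch("os.getcwd") as fake:
    active_inside_with = os.getcwd is fake

# 3. start()/stop() keep the patch active across statements (or notebook cells).
mock = patcher.start()
active_after_start = os.getcwd is mock
patcher.stop()
restored_after_stop = os.getcwd is original
```

The start()/stop() form might allow keeping the %reload_kedro magic in a later cell, on the untested assumption that nothing has imported Kedro (and so triggered rich.pretty.install) before the patch is started.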