#questions

Artur Dobrogowski

01/15/2024, 2:36 PM
Hello, I'm trying to better understand `kedro viz`. I wanted to explore a project that is new to me, so I started with `kedro viz`. However, it fails because of some data-related issues that aren't very important here. The first thing that seemed wrong to me: why was Spark initialized and run just to run `kedro viz`? After some digging, I found that it runs because the project uses `SparkDatasets`, so for some reason it starts Spark to query the datasets. My impression so far was that `kedro viz` is only a tool for exploring the Kedro structure: what is defined in the pipelines and the data catalog. It was a big surprise to me that it tried to query anything external. I searched for an option to disable this deeper inspection, but I didn't find anything. Can someone shed some light on why it needs to do this? Here's the stack trace showing why it was trying to access the dataset:

datajoely

01/15/2024, 2:39 PM
Currently, Kedro-Viz requires the underlying project to be valid in order to work; this is something we want to change in the future. For now, you need to satisfy everything that `kedro run` requires before Viz will render.

Juan Luis

01/15/2024, 3:03 PM
cc @Rashida Kanchwala @Nero Okwa

Nok Lam Chan

01/15/2024, 3:05 PM
I don't think initialising Spark is necessary anymore. There is an option to skip hooks; at some point we argued it should already be the default, but I don't know the latest status.
Depending on which versions you are using, it is possibly one of these situations:
1. Spark is initialising the connection in hooks (not in the pipeline or catalog).
2. It's not really "reading" data, but validating the existence of the data (as noted by the `.exists()` method). I think this was a bug related to dataset factories, which should be fixed already(?).
3. You are using `preview` in the SparkDataset, so Viz needs to fetch some sample data from the corresponding dataset. Check if there is a "preview_args" in the `catalog.yml`.
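To make the "preview_args" check from point 3 concrete, here is a minimal sketch that searches an already-parsed catalog for that key. It assumes the catalog has been loaded into a plain dictionary (for example with a YAML parser); the helper names, the example entries, and the nesting under `metadata` are hypothetical and only illustrate the search, since the exact location of the key may differ between versions.

```python
from typing import Any, Dict, List


def contains_key(node: Any, key: str) -> bool:
    """Return True if `key` appears anywhere inside nested dicts/lists."""
    if isinstance(node, dict):
        return key in node or any(contains_key(v, key) for v in node.values())
    if isinstance(node, list):
        return any(contains_key(item, key) for item in node)
    return False


def entries_with_preview_args(catalog: Dict[str, Any]) -> List[str]:
    """List catalog entry names whose config mentions `preview_args` at any depth."""
    return sorted(
        name for name, cfg in catalog.items() if contains_key(cfg, "preview_args")
    )


# Hypothetical catalog contents, as they might look after parsing catalog.yml.
example_catalog = {
    "raw_events": {
        "type": "spark.SparkDataSet",
        "filepath": "data/01_raw/events.parquet",
        "metadata": {"kedro-viz": {"preview_args": {"nrows": 5}}},
    },
    "model_input": {
        "type": "pandas.ParquetDataSet",
        "filepath": "data/05_model_input/table.parquet",
    },
}

print(entries_with_preview_args(example_catalog))
```

Searching at any depth avoids relying on one particular layout of the entry, which seems safer when the project mixes older and newer package versions.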

Artur Dobrogowski

01/15/2024, 3:18 PM
I'm using kedro `0.18.14` for this project and some old datasets, at a version around `1.0.2`. I plan to bump it up soon.

Nero Okwa

01/15/2024, 3:20 PM
@Artur Dobrogowski thanks for the feedback. What version of Kedro-Viz are you using?

Artur Dobrogowski

01/15/2024, 3:23 PM
6.7.0
@Nok Lam Chan I've disabled it in hooks; according to the trace, it's using the `.exists()` method.
It's nice to hear that it's a bug that's fixed in newer versions, I'll try it out when I manage to upgrade later and come back with feedback.

Rashida Kanchwala

01/15/2024, 3:28 PM
Yes, the `.exists()` issue has been fixed in kedro-viz 7.0.0. Alternatively, you can downgrade viz to 6.6.0 and it should work.

Artur Dobrogowski

01/15/2024, 3:46 PM
Yeah, the issue is gone with `kedro-viz~=7.0`, thanks @Rashida Kanchwala and others 🙂
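For anyone wanting to confirm which side of the fix their environment is on, the following is a rough sketch that reads the installed `kedro-viz` version and checks it against the `~=7.0` pin mentioned in the thread. The helper names are hypothetical, and the version parsing is deliberately simplified (numeric release components only), so treat it as an approximation of the pin rather than a full version-specifier implementation.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def release_tuple(version_string: str) -> Tuple[int, ...]:
    """Turn '7.0.0' into (7, 0, 0); stops at the first non-numeric component."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def matches_compatible_release_7_0(version_string: str) -> bool:
    """Approximate the pin `kedro-viz~=7.0`, i.e. >= 7.0 and < 8."""
    release = release_tuple(version_string)
    # Pad short versions such as '7' to at least two components.
    release = release + (0,) * (2 - len(release))
    return (7, 0) <= release < (8,)


def installed_viz_version() -> Optional[str]:
    """Return the installed kedro-viz version, or None if it is not installed."""
    try:
        return version("kedro-viz")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_viz_version()
    if found is None:
        print("kedro-viz is not installed in this environment")
    else:
        print(found, "satisfies ~=7.0:", matches_compatible_release_7_0(found))
```

Under these assumptions, 6.7.0 (the version reported as affected above) falls outside the pin, while 7.0.0 falls inside it.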