Artur Dobrogowski
01/15/2024, 2:36 PM
kedro viz. I wanted to explore a project that was new to me, so I started with kedro viz. However, it fails because of some data-related issues that are not very important here. The first thing that seemed wrong to me: why was Spark initialized and run just to run kedro viz? After some digging, I found that it runs because the project uses SparkDatasets, so for some reason it starts Spark to query the datasets. My impression so far was that kedro viz is only a tool to explore the Kedro structure: what is defined in the pipelines and the data catalog. It was a big surprise to me that it tried to query anything external. So I searched for an option to disable this deeper inspection, but I didn't find anything. Can someone shed some light on why it needs to do this?
Here's the stack trace showing why it was trying to access the dataset:
datajoely
01/15/2024, 2:39 PM
kedro run work before viz will render
Juan Luis
01/15/2024, 3:03 PM
Nok Lam Chan
01/15/2024, 3:05 PM
Nok Lam Chan
01/15/2024, 3:09 PM
.exists() method. I think it was a bug related to the dataset factory, which should be fixed already(?)
3. You are using preview in the SparkDataset, so viz needs to fetch some sample data from the corresponding dataset. Check whether there is a "preview_args" entry in the catalog.yml.
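A rough way to do that check across a large catalog is sketched below. It is only an illustration, not part of kedro or kedro-viz: it assumes the catalog.yml has already been loaded into a plain dict (for example with a YAML parser), and it searches every entry recursively for a "preview_args" key so it does not depend on the exact nesting. The helper names, the dataset names, and the "metadata" / "kedro-viz" nesting in the sample data are assumptions made for the example.

```python
from typing import Any


def has_key(node: Any, key: str) -> bool:
    """Return True if `key` appears at any depth inside nested dicts/lists."""
    if isinstance(node, dict):
        return key in node or any(has_key(value, key) for value in node.values())
    if isinstance(node, list):
        return any(has_key(item, key) for item in node)
    return False


def datasets_with_preview(catalog: dict) -> list:
    """List catalog entries whose config mentions `preview_args` anywhere."""
    return sorted(
        name for name, config in catalog.items() if has_key(config, "preview_args")
    )


# Hypothetical catalog contents, shaped like a parsed catalog.yml.
# The entry names, paths, and nesting are illustrative assumptions.
example_catalog = {
    "raw_events": {
        "type": "spark.SparkDataSet",
        "filepath": "data/01_raw/events.parquet",
        "metadata": {"kedro-viz": {"preview_args": {"nrows": 5}}},
    },
    "clean_events": {
        "type": "spark.SparkDataSet",
        "filepath": "data/02_intermediate/events.parquet",
    },
}

print(datasets_with_preview(example_catalog))  # ['raw_events']
```

Any entry this reports is a candidate for triggering a data load when viz starts.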
Artur Dobrogowski
01/15/2024, 3:18 PM
0.18.14 for this project and some old datasets, version about 1.0.2
Artur Dobrogowski
01/15/2024, 3:19 PM
Nero Okwa
01/15/2024, 3:20 PM
Artur Dobrogowski
01/15/2024, 3:23 PM
Artur Dobrogowski
01/15/2024, 3:26 PM
.exists() method.
Artur Dobrogowski
01/15/2024, 3:27 PM
Rashida Kanchwala
01/15/2024, 3:28 PM
Artur Dobrogowski
01/15/2024, 3:46 PM
kedro-viz~=7.0
thanks @Rashida Kanchwala and others 🙂