Hello! When running either `kedro run` or `kedro v...
# questions
f
Hello! When running either `kedro run` or `kedro viz run`, I have these two warnings:
24/01/19 12:58:38 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/01/19 12:58:40 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
In your experience, do you think that these warnings can slow down the rendering of Kedro visualization? It takes about 35 seconds to render, and my project is not that big yet. Is 35 seconds a typical time to render?
n
It shouldn't take so long. Do you have a relatively new kedro-datasets version? In the past there were issues with certain dataset types that establish a connection to a database when initialised, which slowed the process down a bit.
Anyway, are you aware of the `autoreload` flag? You should only need to launch it once, so I'm not sure the 35 seconds would bother you too much.
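To illustrate the point about datasets that do expensive work when initialised, here is a minimal sketch. `EagerDataset`, `LazyDataset` and `CountingConnector` are hypothetical names made up for this example and are not Kedro classes; the sketch only assumes that a dataset object gets constructed before anyone asks for its data.

```python
class CountingConnector:
    """Hypothetical stand-in for an expensive setup step (e.g. opening a DB connection)."""

    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return "connection"


class EagerDataset:
    """Hypothetical dataset that does the expensive work in __init__."""

    def __init__(self, connect):
        # Runs as soon as the object is constructed, even if load() is never called.
        self._conn = connect()

    def load(self):
        return self._conn


class LazyDataset:
    """Hypothetical dataset that defers the expensive work until load()."""

    def __init__(self, connect):
        self._connect = connect
        self._conn = None

    def load(self):
        # Connect on first use only, then reuse the cached result.
        if self._conn is None:
            self._conn = self._connect()
        return self._conn
```

With the lazy variant, simply constructing the dataset costs nothing; the expensive step only runs on the first `load()` call.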
f
Yes, I'm aware of autoreload! Thanks. But the reload takes a long time too! I have PickleDatasets, CSVDataset, ParquetDataset and a custom dataset that loads and saves SentenceTransformer models.
Maybe it's my custom dataset. The folder containing the SentenceTransformer model is 418 MB...
r
It usually doesn't take that long, because kedro-viz currently only loads certain datasets: CSV and Excel (if you have specified `preview_args` in the catalog so you can preview x rows on kedro-viz), or plotly/matplotlib datasets.
Even then, it doesn't load them initially; the loading only happens when you open the metadata panel for that dataset.
n
Any chance you can run a profiler to see what’s the bottleneck?
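One way to do that with Python's built-in `cProfile` is sketched below. `profile_call` is a helper written for this example, and `load_big_model` is a hypothetical stand-in for whatever step is suspected to be slow (for instance the custom dataset's load); neither is part of Kedro.

```python
import cProfile
import io
import pstats
import time


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the slowest call chains appear first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def load_big_model():
    """Hypothetical slow step; replace with the real suspect."""
    time.sleep(0.05)
    return "model"
```

Calling `result, report = profile_call(load_big_model)` and then `print(report)` lists the functions where the time was spent, which should show whether the custom dataset or something else is the bottleneck.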