# questions
f
hey everyone, Is there a way to run kedro-viz on docker without actually installing the lib? I am asking because i wanted to keep the env a bit clean and I thought docker for viz would be nice. Did anyone do that before?
d
so you can basically compile a portable version that's just a single page app https://docs.kedro.org/projects/kedro-viz/en/stable/platform_agnostic_sharing_with_kedro_viz.html
f
hmm but i would still have to run build every time i update my command, no? I was asking for myself, I don't need to share with someone else. I would then put this into my project's docker compose file so that I can run kedro viz in an isolated docker image, not in my local env
r
I'm not sure if that's possible at the moment. Kedro-Viz works by reading the Kedro project and creating JSON endpoints, which the frontend uses for visualization. Even if you host the frontend in a Docker container, you'll still need Kedro-Viz as a library to convert the Kedro project into JSON files.
f
I can map my project files into the container so that's no problem.
r
One option is to keep your project environment clean: if your project is on GitHub, you could use https://github.com/kedro-org/publish-kedro-viz. This would do the kedro-viz installation on the GitHub CI and host your kedro-viz on GitHub Pages
n
I don't think there is an official way, but there's nothing to stop you from creating your own Docker image to run kedro-viz in a container. To do that you will need both the project dependencies and the kedro-viz dependencies inside the image.
It's also completely fine to run kedro-viz in a separate virtual env. It's more or less the same idea as Docker; it depends on how much isolation you are looking for
f
thanks, github page also cool, might try that later. @Nok Lam Chan yes i will possibly do that, i was just wondering if anyone did that before but yeah i can write some docker config files to do that 👍
d
if you get a nice solution working please share 🙂 always keen to understand how best to do this
you may also be interested in `kedro-viz --lite`, which was just shipped and builds the DAG through AST introspection without actually executing your code, so you can now run kedro-viz without any of the actual dependencies (other than Kedro) installed
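For readers wondering what "AST introspection" looks like in practice, here is a minimal Python sketch. It is not Kedro-Viz's actual implementation; the function names `imported_modules` and `missing_modules` and the package name `surely_missing_pkg_xyz` are made up for illustration. It only shows the general idea: parse the source, collect the imports, and check which ones are not installed, all without running the code.

```python
import ast
import importlib.util


def imported_modules(source: str) -> set[str]:
    """Return the top-level module names imported by `source` without executing it."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as `from . import x`
            names.add(node.module.split(".")[0])
    return names


def missing_modules(source: str) -> set[str]:
    """Return the imported top-level modules that cannot be found in this environment."""
    return {
        name
        for name in imported_modules(source)
        if importlib.util.find_spec(name) is None
    }


# Hypothetical node file: one stdlib import, one package that is not installed.
example_source = (
    "import json\n"
    "from surely_missing_pkg_xyz.client import Client\n"
    "from . import sibling\n"
)
print(sorted(imported_modules(example_source)))
print(sorted(missing_modules(example_source)))
```

A tool built on this idea could then substitute placeholders for the missing modules before importing the project, which is roughly where the mocking discussed later in the thread comes in.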
f
Will do that, thanks 🙂
j
ideally, `uvx --with kedro-viz kedro viz run --lite` should work in any project. I just tested it.
d
that's bonkers @Juan Luis
there's a blog post in there
j
also since @Fazil Topal was asking specifically about Docker:
$ cat Dockerfile
FROM python:3.9-slim
RUN pip install uv && uv pip install --system kedro-viz kedro
EXPOSE 4141
WORKDIR /app
ENTRYPOINT ["kedro", "viz", "run", "--lite", "--host", "0.0.0.0"]
$ docker build -t kedro-viz-lite .
...
$ docker run -p 4141:4141 -v ~/Projects/demo:/app kedro-viz-lite
this works just fine 🙂
f
Thanks for this, I've tried different combos and somehow i end up getting the following error:
File "/app/src/projx/models/text/base.py", line 34, in <module>
    class AnthropicAssistant(BaseMessage):
  File "/opt/conda/envs/py/lib/python3.10/dataclasses.py", line 1184, in dataclass
    return wrap(cls)
  File "/opt/conda/envs/py/lib/python3.10/dataclasses.py", line 1175, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
  File "/opt/conda/envs/py/lib/python3.10/dataclasses.py", line 908, in _process_class
    for b in cls.__mro__[-1:0:-1]:
  File "/opt/conda/envs/py/lib/python3.10/unittest/mock.py", line 643, in __getattr__
    raise AttributeError("Mock object has no attribute %r" % name)
AttributeError: Mock object has no attribute '__mro__'
For some reason it leads to dataclasses and errors out there. Code works fine normally, i am not sure why it does that. I thought it's `uv` stuff but replicating the same env in the container also results in the same error. I'll have a look later
j
🙃 this is our fault for sure, `kedro viz --lite` uses unittest `Mock`. cc @Ravi Kumar Pilla @Rashida Kanchwala @Fazil Topal do you mind opening an issue on Kedro Viz about this?
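To see why a mocked import can end in this `__mro__` error, here is a small standalone sketch. It assumes the base class was replaced by a plain `MagicMock` instance, which may not match exactly what `--lite` does internally; `BaseMessage` and `AnthropicAssistant` are simply the names from the traceback above. "Subclassing" a mock instance produces another mock rather than a real class, and `@dataclass` then fails when it asks for `__mro__`.

```python
from dataclasses import dataclass
from unittest.mock import MagicMock

# Stand-in for a base class imported from a module that was swapped for a mock.
BaseMessage = MagicMock()


def define_subclass():
    """Try to build a dataclass on the mocked base; return the error text, if any."""
    try:

        @dataclass
        class AnthropicAssistant(BaseMessage):
            text: str = ""

    except AttributeError as exc:
        return str(exc)
    return None


error_message = define_subclass()
print(error_message)
```

Replacing the mock with a real base class (or installing the dependency it stands in for) makes the class definition behave normally again.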
f
Ahh, okay i was super confused 😄 I will open one soon. I tried without lite and I'm still getting errors about my custom dataset definitions. I played with PYTHONPATH but no luck
Example:
kedro.io.core.DatasetError: Class 'projx.models.audio.io.LargeModel' not found, is this a typo?
n
projx.models.audio.io.LargeModel
open a python terminal:
from projx.models.audio.io import LargeModel
What do you get?
j
that must be a separate error I'm sure. if `python -c "from projx.models.audio.io import LargeModel"` works but `kedro run` doesn't, then you have some problem with your installation
f
It was inside the docker, looks like some other deps were missing, when I run that in python i got the no module named `elevenlabs` so installing that fixed it. I wonder why that error is not thrown in kedro tho 🤔
r
For some context on `kedro viz --lite`: it only mocks dependencies within your kedro project. It does not mock any transitive dependencies. For this:
> i got the no module named elevenlabs so installing that fixed it. I wonder why that error is not thrown in kedro tho
do you mean `kedro viz --lite` did not raise an error, or `kedro run`?
f
Well actually both of them work now, the missing dependency had some different error outputs. Not sure why
I do have a different problem now 😄
viz:
    image: projx:viz
    build:
      context: .
      target: viz
    entrypoint: [ "bash" ]
    command:
      - -c
      - "kedro viz run --host 0.0.0.0"
    ports:
      - 4141:4141
    volumes:
      - ./:/app
This returns the error `bash: line 1: kedro: command not found`, but when i comment out the `command` section and then run `kedro viz run` inside the container, it works. Does anyone have a clue? I feel like im missing something super obvious here 😅
r
I guess the bash file is run in a different env than the container command run
f
Yeah im also getting the error when i add the line to dockerfile
ENTRYPOINT ["kedro", "viz", "run", "--lite", "--host", "0.0.0.0"]
what's the best way to add kedro to containers bin path?
r
command:
    - -c
    - "source ~/.bashrc && kedro viz run --host 0.0.0.0"
Check whether the above command works. If not, can you try installing kedro and kedro-viz globally when building the image?
f
thanks that works 😄
n
> what's the best way to add kedro to containers bin path?
You are not supposed to need that. Installing kedro-viz should already add it to the Python environment's binary path.
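A quick way to check this from inside the container is a few lines of Python run with the same interpreter that has kedro installed. This is only a diagnostic sketch; `diagnose_command` is a made-up helper name. It reports where that interpreter puts console scripts and whether the current `PATH` (for example the one a non-interactive `bash -c` sees) includes that directory.

```python
import os
import shutil
import sys
import sysconfig


def diagnose_command(command: str = "kedro") -> dict:
    """Report where console scripts are installed and whether `command` resolves."""
    scripts_dir = sysconfig.get_path("scripts")
    path_entries = os.environ.get("PATH", "").split(os.pathsep)
    return {
        "python": sys.executable,                      # interpreter running this check
        "scripts_dir": scripts_dir,                    # where pip puts console scripts
        "scripts_dir_on_path": scripts_dir in path_entries,
        "resolved": shutil.which(command),             # None if not found on PATH
    }


if __name__ == "__main__":
    for key, value in diagnose_command().items():
        print(f"{key}: {value}")
```

If `scripts_dir_on_path` is false in the failing shell but true in an interactive one, the environment is probably being activated by a shell startup file, which would match what was seen with `~/.bashrc` in this thread.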
f
I think some python level stuff was on the `bashrc` so invoking that solved it. Now i see the kedro viz as expected. Thanks for the support 😄 🎉
Should we still open an issue about user-level code errors being hidden in kedro? I feel this led to extra debugging sessions, whereas it should have been clear from the beginning that a dependency was missing. Somehow the error is being caught somewhere
n
> I think some python level stuff was on the `bashrc` so invoking that solved it.
I think it's most likely the virtual env
Feel free to open an issue, ideally with something we can reproduce.
f
No I meant the earlier stack traces where No Module named `elevenlabs` was not thrown in kedro
n
We have been fighting this a bit, trying to reconcile two conflicting requirements:
• Kedro dynamically imports datasets from different places, so it's tricky to figure out whether an ImportError is expected or not
• We want to surface the correct error
We partly fixed this before, so it should be able to tell you whether it's missing dependencies. Maybe there is something more we can do on the Kedro side
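One way to tell the two cases apart is the `name` attribute of `ModuleNotFoundError`, which says which module could not be found. The sketch below is a hypothetical loader, not Kedro's actual code; `load_class`, `failure_message`, `demo_dataset_xyz` and `missing_sdk_xyz` are invented for the example. If the missing module is the requested path (or one of its parents), the path is probably a typo; otherwise the module exists and one of its own imports failed.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_class(dotted_path: str):
    """Hypothetical helper: resolve 'package.module.ClassName' to a class."""
    module_path, _, class_name = dotted_path.rpartition(".")
    try:
        module = importlib.import_module(module_path)
    except ModuleNotFoundError as exc:
        missing = exc.name or ""
        # The path itself is wrong if the module we asked for, or one of its
        # parent packages, is the thing that could not be found.
        path_is_wrong = module_path == missing or module_path.startswith(missing + ".")
        if path_is_wrong:
            raise ImportError(f"Module '{module_path}' not found, is this a typo?") from exc
        raise ImportError(
            f"'{module_path}' exists but could not be imported: "
            f"missing dependency '{missing}'"
        ) from exc
    return getattr(module, class_name)


def failure_message(dotted_path: str):
    """Return the error text produced for `dotted_path`, or None on success."""
    try:
        load_class(dotted_path)
    except ImportError as exc:
        return str(exc)
    return None


# Throwaway module that imports a package which is not installed.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "demo_dataset_xyz.py").write_text(
    "import missing_sdk_xyz\n\nclass LargeModel: ...\n"
)
sys.path.insert(0, str(demo_dir))

dependency_error = failure_message("demo_dataset_xyz.LargeModel")
typo_error = failure_message("no_such_module_xyz.LargeModel")
print(dependency_error)
print(typo_error)
```

Chaining with `from exc` keeps the original `ModuleNotFoundError` in the traceback, so the underlying cause is still visible even when a friendlier message is raised on top.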
f
The issue i had was this:
Traceback (most recent call last):
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro/io/core.py", line 159, in from_config
    class_obj, config = parse_dataset_definition(
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro/io/core.py", line 501, in parse_dataset_definition
    raise DatasetError(
kedro.io.core.DatasetError: Class 'projx.models.audio.io.LargeModel' not found, is this a typo?
Hint: If you are trying to use a dataset from `kedro-datasets`, make sure that the package is installed in your current environment. You can do so by running `pip install kedro-datasets` or `pip install kedro-datasets[<dataset-group>]` to install `kedro-datasets` along with related dependencies for the specific dataset group.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/envs/py/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/opt/conda/envs/py/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro_viz/server.py", line 122, in run_server
    load_and_populate_data(
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro_viz/server.py", line 59, in load_and_populate_data
    catalog, pipelines, session_store, stats_dict = kedro_data_loader.load_data(
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro_viz/integrations/kedro/data_loader.py", line 172, in load_data
    return _load_data_helper(
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro_viz/integrations/kedro/data_loader.py", line 101, in _load_data_helper
    catalog = context.catalog
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro/framework/context/context.py", line 190, in catalog
    return self._get_catalog()
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro/framework/context/context.py", line 234, in _get_catalog
    catalog: DataCatalog = settings.DATA_CATALOG_CLASS.from_config(
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro/io/data_catalog.py", line 330, in from_config
    datasets[ds_name] = AbstractDataset.from_config(
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro/io/core.py", line 163, in from_config
    raise DatasetError(
kedro.io.core.DatasetError: An exception occurred when parsing config for dataset 'narrator#lam':
Class 'projx.models.audio.io.LargeModel' not found, is this a typo?
Hint: If you are trying to use a dataset from `kedro-datasets`, make sure that the package is installed in your current environment. You can do so by running `pip install kedro-datasets` or `pip install kedro-datasets[<dataset-group>]` to install `kedro-datasets` along with related dependencies for the specific dataset group.
so this is what i saw and when i ran the import statement in the python as you suggested, i got this:
>>> from projx.models.audio.io import LargeModel
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/app/src/projx/models/audio/__init__.py", line 1, in <module>
    from .base import LAM
  File "/app/src/projx/models/audio/base.py", line 1, in <module>
    from elevenlabs.client import ElevenLabs
ModuleNotFoundError: No module named 'elevenlabs'
so I sort of was expecting kedro to show this in the first place. If there is an open issue about it happy to comment it there, otherwise i'd open a new one
j
> so I sort of was expecting kedro to show this in the first place. If there is an open issue about it happy to comment it there, otherwise i'd open a new one
sorry for the delay. I think we've improved this recently. @Fazil Topal what version of Kedro is this?
f
0.19.7 is what i am using.
j
@Ankita Katiyar @Nok Lam Chan I was under the impression that we had fixed this as part of https://github.com/kedro-org/kedro/issues/2943, but maybe this is yet another case we have to handle?
n
kedro.io.core.DatasetError: An exception occurred when parsing config for dataset 'ingestion.int_typed_companies':
No module named 'pandas'. Please see the documentation on how to install relevant dependencies for kedro_datasets.pandas.ParquetDataset:
https://docs.kedro.org/en/stable/kedro_project_setup/dependencies.html#install-dependencies-related-to-the-data-catalog
This is what happened when I try to use `pandas.CSVDataset` when I have `kedro-datasets` installed but not `pandas` (pip uninstall on purpose)
f
Hmm, could it be related to custom datasets created by the user? This example uses my own custom dataset definition, perhaps it doesn't work as expected there?
n
could be, but I am not 100% sure here. The way kedro-datasets is structured, each dataset usually has a `dataset` module with an `__init__.py` and an implementation file:
• some_dataset_module
  ◦ __init__.py
  ◦ some_dataset.py
We never import some_dataset directly, but usually through `some_dataset_module` with `from .some_dataset import XYZDataset`. We also have lazy loading implemented, which could be another reason why we are able to catch errors better for `kedro-datasets`
f
I have the following one:
dataset_name
- __init__.py -> from .base import LAM
- base.py -> Wrapper code around the API endpoint, defines LAM
- io.py -> Reads the kedro config and creates the LAM instance.
So basically the module init would try to load the package `elevenlabs`, which is imported in `base`, and that's the error that was not caught. For context, I use the dataset package itself to import LAM for type declarations in nodes, hence the separate io file and base file.
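As an illustration of the lazy-loading idea mentioned above, here is a sketch using a module-level `__getattr__` (PEP 562). It is not how kedro-datasets implements it, and every name in it (`lazy_dataset_demo`, `missing_sdk_xyz`) is hypothetical. With this layout, importing the `io` submodule does not pull in `base.py`, so the missing SDK only surfaces when `LAM` is actually requested. If `io.py` itself imports from `base`, as in the layout described above, the dependency is of course still needed.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Build a throwaway package on disk so the sketch runs as-is.
root = Path(tempfile.mkdtemp())
pkg = root / "lazy_dataset_demo"
pkg.mkdir()

(pkg / "__init__.py").write_text(textwrap.dedent('''
    import importlib

    _LAZY = {"LAM": ".base"}  # public name -> submodule that defines it

    def __getattr__(name):
        # PEP 562: only called when `name` is not found the normal way.
        if name in _LAZY:
            module = importlib.import_module(_LAZY[name], __name__)
            return getattr(module, name)
        raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''))
(pkg / "base.py").write_text("import missing_sdk_xyz\n\nclass LAM: ...\n")
(pkg / "io.py").write_text("class LargeModel: ...\n")

sys.path.insert(0, str(root))

# Importing the io submodule runs __init__.py, but not base.py.
io_module = importlib.import_module("lazy_dataset_demo.io")

# The missing dependency only shows up when LAM is accessed.
try:
    importlib.import_module("lazy_dataset_demo").LAM
    lazy_error = None
except ModuleNotFoundError as exc:
    lazy_error = exc.name

print(io_module.LargeModel, lazy_error)
```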
j