Marc Gris
06/23/2023, 7:58 AM
data_processing_node and my model_training_node have conflicting dependencies.
How would you handle such an (unfortunately common) situation?
I know that in MLflow it is possible to have task-specific venvs…
Does Kedro offer such a possibility?
If not, how could one work around the issue? 🙂
Many thanks in advance,
M.
Iñigo Hidalgo
06/23/2023, 8:08 AM
Nok Lam Chan
06/23/2023, 8:27 AM
Marc Gris
06/23/2023, 10:49 AM
kedro run? If so, how?
Or do you suggest having separate (i.e. non-modular) projects / repos?
Regarding deployment:
I know that, for example, Airflow offers PythonVirtualenvOperator etc…
But if I read the packaging section of the docs correctly, once a Kedro pipeline has been packaged, it must be installed in the same venv as Airflow, right?
So if we develop separate Kedro pipelines with conflicting dependencies, we will run into trouble when we try to install their respective packages in Airflow’s venv, won't we?
Many thanks in advance for your help / suggestions
M.
P.S: @Nok => regarding our conflicting dependencies: as an example, some of the libraries we work with require older versions of numpy, while others require more recent versions of numpy, etc…
Iñigo Hidalgo
06/23/2023, 11:18 AM
python -m venv .venv_data_processing
python -m venv .venv_model
.venv_data_processing/bin/python -m pip install . # (and specific data requirements)
.venv_model/bin/python -m pip install . # (and specific model requirements)
.venv_data_processing/bin/python -m kedro run --pipeline data_processing
.venv_model/bin/python -m kedro run --pipeline model
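The last two commands could also be wrapped in a small Python driver. This is only a rough sketch: the mapping PIPELINE_VENVS and the helpers venv_python, kedro_run_command and run_pipelines are hypothetical names, not part of Kedro, and it assumes the venvs above already exist with the project installed in each.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical mapping: pipeline name -> venv directory (names from the commands above).
PIPELINE_VENVS = {
    "data_processing": Path(".venv_data_processing"),
    "model": Path(".venv_model"),
}


def venv_python(venv_dir: Path) -> Path:
    """Return the interpreter path inside a venv (the layout differs on Windows)."""
    if sys.platform == "win32":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def kedro_run_command(pipeline: str) -> list[str]:
    """Build the `kedro run` command for one pipeline, using that pipeline's venv."""
    python = venv_python(PIPELINE_VENVS[pipeline])
    return [str(python), "-m", "kedro", "run", "--pipeline", pipeline]


def run_pipelines(pipelines: list[str]) -> None:
    """Run pipelines one after another, each in its own interpreter.

    Assumes the venvs already exist and the project is installed in each.
    """
    for name in pipelines:
        # check=True stops the sequence as soon as one pipeline fails.
        subprocess.run(kedro_run_command(name), check=True)
```

Calling run_pipelines(["data_processing", "model"]) would then run both in order. Since each pipeline runs in its own process, anything passed from one to the other has to be persisted to disk rather than kept in memory.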
But this doesn't seem very scalable tbh, and just seeing the snippet is making my eyes bleed 😅
Marc Gris
06/23/2023, 11:27 AM
Iñigo Hidalgo
06/23/2023, 12:54 PM
Marc Gris
06/25/2023, 3:34 AM