# user-research
j
hello @here! 👋🏼 I come with a thorny question today: Python environments. before launching a proper poll, I'd like to ask for some qualitative feedback from you:
• what tool do you use to create and manage environments for your Kedro projects? venv, virtualenv, virtualenvwrapper, conda, mamba, Pipenv, Poetry, PDM, Hatch, Rye, nothing at all?
• where do you place your environments when working with Kedro projects? a global location (`~/.miniconda`, `~/.virtualenvs`), or next to the code (`~/Projects/spaceflights/.venv`)?
• when you create a new Kedro project, what are the steps you usually follow? for example 1. create and activate conda environment, 2. `pip install kedro`, 3. `kedro new`
• what do you think of the current process?
(please leave a reply on the thread 🧵, 1 comment per person to keep the conversation tidy) your feedback and ideas are very much welcome 🙏🏼
bringing @Роман Белый’s response here 😄 https://kedro-org.slack.com/archives/C03R8N2M8KT/p1698933191366269
👍 1
d
I use miniconda for envs and pip for everything else currently; will move to Mamba or maybe Pixi when I next have to set up a machine
✔️ 1
i
what tool do you use to create and manage environments for your Kedro projects?
Conda for virtual env, then pip for package installation
where do you place your environments when working with Kedro projects?
Default conda directory (`~/.anaconda`) but store package versions in a `requirements.txt` or something like that
when you create a new Kedro project, what are the steps you usually follow?
I use the workflow you point out above. Create env, install kedro, kedro new
what do you think of the current process?
I think it's fine! It reflects what one should do with any other kind of project.
✔️ 3
m
j
Hi!
• use venv + pip
• place envs in a global location
• venv + pip install kedro==some_version + kedro new
• At this point I do it without thinking
Envs inside the project folder "feel" messy to me
✔️ 1
j
Hi, I like to:
• Use venv and pip to manage environments
• Place my virtual environments in the local directory of the project
• (venv) pip install kedro
• It has worked so far and is pretty straightforward; I haven't played around enough with other possibilities.
👀 1
l
I used to use pyenv, but have since moved to vanilla venv. Much easier to find the correct env: if you have one per project, having them globally accessible makes a mess, and if you share envs it creates a version mess (pandas 2.x breaking changes, anyone?).
.venv folder next to the code.
I have kedro installed in my global Python as well.
👀 1
m
I strictly use Docker containers! So far, I manage a base image separate from the Kedro project code. This means that:
• The Dockerfile in my Kedro project consists of 3 layers: a `FROM base_image`, a `COPY` to copy the relevant directories, and an `ENTRYPOINT` and/or `CMD`.
• The initial base image lives in a separate repo containing all the base images we use. We install the relevant Linux libraries and Python dependencies. We use pip-tools to generate a `requirements.txt` with all versions pinned (`==`) from a `requirements.in`. Installation is done with pip as root (no need for a venv or anything, as we are already using containers for isolation).
I would like to make the split less dramatic in future projects, but to do so, I miss the equivalent of Rust workspaces in the Python world. Essentially, splitting your Rust project into workspaces allows you to create subprojects, each with its own set of dependencies defined. But in the end, a general lockfile is created to ensure we have consistent versions across workspaces.
👍🏼 1
👍 1
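The "all versions pinned (`==`)" convention mentioned above can be checked with a few lines of Python. This is only a rough sketch and not something from the thread: the helper name `unpinned_requirements` is hypothetical, and the parsing is deliberately simplified (it ignores comments, option lines such as `-r` or `--hash`, and environment markers, and does not implement the full pip requirements syntax).

```python
from typing import List


def unpinned_requirements(text: str) -> List[str]:
    """Return requirement specs that are not pinned with '=='.

    Simplified on purpose: comments, blank lines and option lines
    (those starting with '-') are skipped, and anything after ';'
    (an environment marker) is ignored.
    """
    unpinned = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        # Keep only the part before an environment marker.
        spec = line.split(";", 1)[0].strip()
        if "==" not in spec:
            unpinned.append(spec)
    return unpinned


if __name__ == "__main__":
    sample = "pandas==2.1.0\nkedro>=0.18\nnumpy\n"
    print(unpinned_requirements(sample))
```

A check like this could run in CI against the generated `requirements.txt`, failing the build when the returned list is not empty.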
t
• micromamba + pip
• The envs live in the project folder
I'm waiting for a mature Rust implementation of pip
👍🏼 1
👍 1
j
what tool do you use to create and manage environments for your Kedro projects?
Micromamba, Poetry. Would also like to experiment with Rye/Pixi + Kedro
where do you place your environments when working with Kedro projects?
Next to the code.
when you create a new Kedro project, what are the steps you usually follow?
Usually create a conda environment and use Poetry inside it. I have an environment lockfile, and a packages lockfile, which makes shoving things in a container and having my package installed in editable mode straightforward. However, once Pixi can install from PyPI and build/install local development packages, I can stick with a single tool, perhaps 😃
what do you think of the current process?
Personally, I'm not a massive fan of pip alone - I like the convenience of a workflow tool.
🚀 1
👀 1
b
miniconda
• Global named environments
• `conda create...` -> `pip install -U kedro` -> `kedro new` -> `cd ...` -> `pip install -r src/requirements.txt`
-----
Process is easy. I hate `requirements.txt` files - they allow stuff that works in `pip` but not anywhere else! Can we use `pyproject.toml` instead?
👍🏼 1
🔥 3
m
• I went back to using the good ol' `venv` & `pip`
• I like to have the `.venv` in my projects (easily go to definitions etc.)
• my steps:
  ◦ `kedro new`
  ◦ `cd <new-project>`
  ◦ `python -m venv .venv && source .venv/bin/activate`
  ◦ `pip install -r src/requirements.txt`
2
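The "venv next to the code" step that several replies describe can also be done from Python with the standard-library `venv` module. The following is a minimal sketch, not something posted in the thread: the helper name `create_project_venv` is hypothetical, and pip is left out of the environment by default so that the sketch runs offline.

```python
import sys
import venv
from pathlib import Path


def create_project_venv(project_dir: Path, with_pip: bool = False) -> Path:
    """Create a `.venv` folder inside `project_dir`.

    Returns the expected path of the environment's Python interpreter.
    `with_pip=False` keeps the sketch fast and offline; pass True to
    bootstrap pip into the environment as `python -m venv` does.
    """
    env_dir = Path(project_dir) / ".venv"
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    # The interpreter location differs between Windows and POSIX systems.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


if __name__ == "__main__":
    # Assumes the current directory is the project root.
    print(create_project_venv(Path.cwd()))
```

With `with_pip=True`, the returned interpreter can then be used to run `python -m pip install -r src/requirements.txt`, mirroring the last step listed above.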
i
• I use `conda` to install different Python versions, so as a Windows `pyenv` equivalent (I know pyenv-win exists, but I couldn't get it to work without admin rights). I have conda environments called `py38` and `py310` which I use as a base.
• Then I use `venv` to create a local `.venv`, then do `pip install requirements`.
I don't ever use `kedro new`, I just use plain `cookiecutter`, which is installed through `pipx` so is globally available.
If I were to use `kedro new` I'd probably aim to have it installed through pipx, though I never saw the value of that vs `cookiecutter` relative to the potential headache I'd have with conflicting versions of kedro depending on the `PATH` order
👀 1
f
I also went back to `venv` with the help of rtx. I use `rtx` to install a version of Python, create the virtualenv, and activate it as I `cd` into the directory. Then `pip install -r requirements.txt` (or `requirements.lock` in CI and production). This is the `.rtx.toml` that I mainly use:
dotenv = ".env"

[tools]
python =  {version = "3.10.11", virtualenv = ".venv"}
(the `dotenv` entry is to load the env variables with `rtx` as well, not needed here)
👍🏼 1
j
thanks everybody who replied to this and the other thread. the motivation was to assess how useful it would be if `kedro new` could create a Kedro project in the current directory, which is difficult to do because the underlying library, `cookiecutter`, doesn't support it. however, from the responses I get that the main annoyance is that people tend to have a "global Kedro" and then a "project-specific" Kedro. feel free to drop your thoughts in https://github.com/kedro-org/kedro/issues/681#issuecomment-1798043381
🙌 2
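The "global Kedro" versus "project-specific Kedro" situation comes down to which executable is found first on `PATH`. As a rough illustration (not something from the thread), the sketch below lists every match for a command name in `PATH` order. The helper name `find_all_on_path` is hypothetical; it relies only on `shutil.which` from the standard library.

```python
import os
import shutil
from typing import List, Optional


def find_all_on_path(name: str, path: Optional[str] = None) -> List[str]:
    """List every executable called `name`, in PATH order.

    The first entry is the one a shell would run; later entries are
    shadowed copies (for example a global install hidden behind an
    activated virtual environment).
    """
    search = os.environ.get("PATH", "") if path is None else path
    found: List[str] = []
    for directory in search.split(os.pathsep):
        if not directory:
            continue
        # Restrict the lookup to a single directory at a time.
        hit = shutil.which(name, path=directory)
        if hit and hit not in found:
            found.append(hit)
    return found


if __name__ == "__main__":
    for position, exe in enumerate(find_all_on_path("kedro"), start=1):
        print(position, exe)
```

If more than one path is printed, the first is the Kedro that actually runs, which may not be the one installed in the project's environment.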
m
In my team we have a cookiecutter / copier template that we initialize for each new project. That project uses poetry, kedro, docker, k8s, etc. to set everything up the same way
👀 1
d
Would love to learn more about your use of Copier @Markus Sagen!
m
We're slowly migrating from Cookiecutter to Copier because of the built-in support for managing and responding to changes in the template. We tried Cruft too, but it doesn't seem too well maintained at the moment.
We created a shared template with minimal code and setup, just structure. Shared code is then defined in another package and not in the template.
When we update the version of kedro, core packages, or the shared package, or add global yaml files to our projects, we can do it in one place. We can then add a CI/CD check to verify that each project is up to date with the template
1000 1
We're also looking to move away from poetry to Hatch, Rye or PDM. We want something that aligns more with the PEP standards, is minimal (for Docker builds), reproducible, and fast
❤️ 2
d
that’s really really neat