Hi team, Question on `kedro pull micropkg` and sd...
# questions
e
Hi team, question on `kedro pull micropkg` and sdist here. Would anyone know how best to deal with the multiple egg.info error in the screenshot? Context: with `kedro==0.18.3` on a Mac, I used `python -m build --sdist path/to/package` to create a tar.gz inside our Git repo (I know the alternative is to create the sdist through `kedro micropkg package`, but I can't do it from inside a Kedro project). I see the archive, but when running `kedro pull micropkg` (from the Kedro project root), the following error comes up. This may or may not be related to the existing issue.
j
hi @Elena Mironova! we were discussing this code just today. it's not exactly related to that issue, but it's definitely related to https://github.com/kedro-org/kedro/issues/2567
could you run `find . -name "*.egg-info"` and tell us what you see? I see you're erasing the name of the wheel for confidentiality purposes. if you can't share it, maybe the output will at least give you an indication of what's happening
my theory is that you have one `.egg-info` directory corresponding to the micropipeline you're trying to pull, and another one corresponding to something else. but it could be another issue
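for anyone who prefers Python over `find`, the same search can be sketched roughly as below. this assumes it is run from the project root; the helper name `find_egg_info_dirs` is made up for illustration and is not part of Kedro.

```python
from pathlib import Path
from typing import List


def find_egg_info_dirs(root: str = ".") -> List[str]:
    """Return every path under `root` whose name ends in .egg-info, sorted."""
    # rglob walks the whole tree, like `find . -name "*.egg-info"`.
    return sorted(str(p) for p in Path(root).rglob("*.egg-info"))


if __name__ == "__main__":
    for path in find_egg_info_dirs("."):
        print(path)
```

more than one line of output would be consistent with the theory above, though it would not prove it.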
e
Thanks a lot for the super swift response! If I execute it in the root of the Kedro project (where I have also unpacked the `.tar`), it only returns one entry, for the `.egg-info` directory of the pipeline
j
@Elena Mironova hmm I'm talking nonsense, the code unpacks the sdist in a temporary directory: https://github.com/kedro-org/kedro/blob/4db0ae3be611f60f3c98da2501364d5549c784f2/kedro/framework/cli/micropkg.py#L142-L149 I see you're doing `kedro micropkg pull pipeline-name.tar.gz --verbose`, right? in principle, Kedro checks that the file exists: https://github.com/kedro-org/kedro/blob/ae2022840147d0c46a985fb3061beac452b2ab59/kedro/framework/cli/micropkg.py#L320 so I think when you specify a full filename ending in `.tar.gz`, it has to be the path of an actual file on disk. otherwise, it tries to `pip download` it, most likely doesn't find it, and so the error appears. could you make sure that your `pipeline-name.tar.gz` exists?
btw based on this I added a comment on https://github.com/kedro-org/kedro/issues/2542 to improve the debuggability of this command
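in the meantime, one way to check both things at once (that the archive is a real file, and which `.egg-info` directories it contains) without unpacking it into the project is a small standard-library sketch like the one below. it does not reproduce Kedro's own logic; the function name `egg_info_dirs_in_sdist` is hypothetical, and `pipeline-name.tar.gz` is just the placeholder name used in this thread.

```python
import tarfile
from pathlib import Path, PurePosixPath
from typing import List


def egg_info_dirs_in_sdist(archive: str) -> List[str]:
    """List the distinct *.egg-info directories contained in a .tar.gz sdist."""
    path = Path(archive)
    if not path.is_file():
        raise FileNotFoundError(f"{archive} is not a file on disk")

    found = set()
    with tarfile.open(path, mode="r:gz") as tar:
        for name in tar.getnames():
            parts = PurePosixPath(name).parts
            # Record the path up to (and including) the first *.egg-info component.
            for i, part in enumerate(parts):
                if part.endswith(".egg-info"):
                    found.add("/".join(parts[: i + 1]))
                    break
    return sorted(found)


if __name__ == "__main__":
    import sys

    # Usage (assumed): python this_script.py pipeline-name.tar.gz
    for entry in egg_info_dirs_in_sdist(sys.argv[1]):
        print(entry)
```

if this prints zero entries or more than one, or if the directory is nested somewhere unexpected inside the archive, that output would be useful to share here.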
e
Thanks for the suggestions, Juan! I do indeed have the files on disk. If I manually unpack them (I tried 2 different packages), I can see the egg.info folder too
j
so, just to be clear: from the same directory where you'd do `ls pipeline-name.tar.gz`, and provided there are no leftover `.egg-info` directories, does `kedro micropkg pull pipeline-name.tar.gz --verbose` fail? if so, more debugging will be needed, and maybe we could set up a meeting to discuss that live