user
01/12/2023, 2:49 PM

Afaque Ahmad
01/13/2023, 7:03 AM
I'm using the `kedro-airflow` plugin to generate the DAGs. Is there a guide I can follow for a step-by-step process to get the DAG up and running on Airflow? Do I need to put the `.whl` file anywhere after running `kedro package`?
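Not an official guide, but the usual `kedro-airflow` flow is roughly the following untested sketch; the output paths and the `$AIRFLOW_HOME/dags` location are assumptions about a default Airflow setup:

```shell
# Rough sketch of the kedro-airflow workflow (untested; paths assumed):
pip install kedro-airflow

# 1. Generate an Airflow DAG file from the registered pipeline
kedro airflow create          # writes a DAG file, e.g. under airflow_dags/

# 2. Package the project so Airflow workers can import the nodes
kedro package                 # writes a .whl under dist/

# 3. Copy the DAG into Airflow's dags folder and install the wheel
#    into the environment the Airflow workers use
cp airflow_dags/*_dag.py "$AIRFLOW_HOME/dags/"
pip install dist/*.whl        # run in the worker environment
```

So yes, under these assumptions the `.whl` does need to end up installed wherever the Airflow workers run, not just left in `dist/`.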
user
01/13/2023, 1:28 PM

Simen Husøy
01/15/2023, 3:02 PM
I'm using `plotly.PlotlyDataSet` to make bar plots etc., but I am having a hard time figuring out how to plot an image in Kedro-Viz, similar to how you do it with `plt.imshow(...)`. Does anyone here know how to do this?
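One possible approach (a sketch, not necessarily the only way): render the image with matplotlib and save the returned figure through a `matplotlib.MatplotlibWriter` catalog entry (from `kedro-datasets`), which Kedro-Viz can preview. The catalog wiring is an assumption; the node itself would just be:

```python
# Sketch: a node that renders a 2-D array as an image, like
# plt.imshow(...). With a catalog entry of type
# matplotlib.MatplotlibWriter (an assumption about your setup), the
# saved PNG can then be previewed in Kedro-Viz.
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt
import numpy as np


def plot_image(image: np.ndarray) -> plt.Figure:
    """Render a 2-D array as a grayscale image and return the Figure."""
    fig, ax = plt.subplots()
    ax.imshow(image, cmap="gray")
    ax.set_axis_off()
    return fig
```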
Dustin
01/17/2023, 1:49 AM

Dustin
01/17/2023, 1:50 AM

Dustin
01/17/2023, 1:51 AM

Gaetan
01/17/2023, 10:43 AM

user
01/17/2023, 10:58 AM

Simen Husøy
01/17/2023, 12:23 PM
I have a pipeline, `pipeline1`, that uses a dataset `x` as data input. This dataset is a custom dataset class that downloads a set of data from a REST API we have. Multiple nodes use `x` as input.
I want to make a test pipeline that wraps `pipeline1` by loading a different dataset (still from the REST API, but with different query parameters), together with additional test nodes that run performance metrics on the results from `pipeline1`. I have implemented this using the override functionality of `pipeline`, wrapping `pipeline1` in a new pipeline function and giving it an override dictionary so it uses the test dataset instead of the original one: `inputs={"x": "test_x"}`.
This seems to work, but I notice that it downloads the data multiple times, which is not preferable since fetching the dataset from the API takes some time. It seems that each node in `pipeline1` that uses `x` downloads (loads) the dataset separately, instead of it being loaded once for the whole test pipeline.
Does anyone know how to prevent the dataset from being loaded for each node?
(code in the comments)
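One option (a sketch, under the assumption that the custom dataset controls its own `_load`): memoize the download inside the dataset, so repeated loads within one run reuse the first result; Kedro's `kedro.io.CachedDataSet` wrapper achieves a similar effect at the catalog level. A framework-free illustration of the memoizing `_load`:

```python
# Sketch of caching the expensive download inside the dataset itself,
# so repeated loads within one run reuse the first result. In a real
# Kedro project the class would subclass AbstractDataSet, or you could
# wrap the catalog entry in kedro.io.CachedDataSet instead; this
# standalone class just shows the idea.
from typing import Any, Callable, Optional


class CachingRestDataset:
    def __init__(self, fetch: Callable[[], Any]):
        self._fetch = fetch            # e.g. the REST API call
        self._cache: Optional[Any] = None

    def _load(self) -> Any:
        if self._cache is None:        # download only on the first load
            self._cache = self._fetch()
        return self._cache
```

With the `CachedDataSet` route, the catalog entry would wrap the existing dataset definition and no node code changes.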
Miguel Angel Ortiz Marin
01/17/2023, 3:24 PM

Linda Sun
01/17/2023, 9:42 PM

Vici
01/18/2023, 9:51 AM

Damian Fiłonowicz
01/18/2023, 10:14 AMmy app requires fastapi==0.81.0, but you have fastapi 0.66.1 which is incompatible.
my app requires uvicorn[standard]==0.18.3, but you have uvicorn 0.17.6 which is incompatible.
I also see that the kedro-static-viz plugin is dead for like 2 years already: https://github.com/WaylonWalker/kedro-static-viz
Hence, what is an advised way of deploying this viz with the latest versions?
Does anybody use it in a small container, provides it with project's code and/or the json file, and starts it with --load-file FILE args? If not, is there any nice solution to this? 🙂Vaibhav
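That container approach can work; a rough, untested sketch of such an image is below. Everything here — the base image, the exported `pipeline.json` name (assumed to come from `kedro viz --save-file`), and running the server from the copied project — is an assumption, not a verified recipe:

```dockerfile
# Untested sketch: isolate Kedro-Viz (and its fastapi/uvicorn pins)
# from the app by running it in its own small image, serving a
# pre-exported pipeline JSON. File names and flags are assumptions.
FROM python:3.10-slim
RUN pip install --no-cache-dir kedro kedro-viz
WORKDIR /app
COPY . /app    # project code, so `kedro viz` has a project context
EXPOSE 4141
CMD ["kedro", "viz", "--load-file", "pipeline.json", \
     "--host", "0.0.0.0", "--port", "4141", "--no-browser"]
```

This keeps the viz's dependency pins out of the app's environment entirely, which sidesteps the fastapi/uvicorn conflict.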
Vaibhav
01/18/2023, 11:15 AM

Simen Husøy
01/18/2023, 3:11 PM
kedro.framework.cli.utils.KedroCliError: not enough values to unpack (expected 3, got 1)
Run with --verbose to see the full exception
Error: not enough values to unpack (expected 3, got 1)
This worked with the previous version. Does anyone know why this happens? (full stack trace in comments)

João Areias
01/18/2023, 7:10 PM

William Caicedo
01/19/2023, 4:49 AM
`reload_kedro` magic and Kedro 0.18.4?

datajoely
01/19/2023, 8:55 AM

Afaque Ahmad
01/19/2023, 9:10 AM
I'm creating a `LivyRunner` to be able to submit jobs to an EMR cluster using Livy. I'm using Kedro 0.18.4. I need to pass the code as a string to Livy. Has anyone created something similar? Any help is really appreciated.
I'm trying to pass the code in `_run` to Livy. How do I figure out which pipeline and nodes to run? We do have the following parameters in the `_run` function, but they cannot be passed in the string:

```python
def _run(
    self,
    pipeline: Pipeline,
    catalog: DataCatalog,
    hook_manager: PluginManager,
    session_id: str = None,
) -> None:
```
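For what it's worth, one possible workaround (a sketch, not a known-good `LivyRunner`): instead of serializing the `Pipeline`/`DataCatalog` objects into the string, send Livy a snippet that re-creates the Kedro session on the cluster and runs the pipeline by name, so `_run` only needs the node names. The project path, pipeline name, and helper name below are all assumptions:

```python
# Sketch: build the code string for Livy's /statements endpoint from
# just the pipeline name and node names, rather than trying to embed
# Pipeline/DataCatalog objects. In a runner, you might derive
# node_names from [n.name for n in pipeline.nodes]; project_path and
# pipeline_name would come from your own configuration.
from typing import List


def build_livy_statement(project_path: str, pipeline_name: str,
                         node_names: List[str]) -> dict:
    code = (
        "from kedro.framework.session import KedroSession\n"
        f"with KedroSession.create(project_path={project_path!r}) as session:\n"
        f"    session.run(pipeline_name={pipeline_name!r},\n"
        f"                node_names={node_names!r})\n"
    )
    return {"code": code}  # POST this as JSON to Livy's /statements
```

This assumes the packaged project (the `.whl` from `kedro package`) is already installed on the EMR cluster so the session can find it.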
Iñigo Hidalgo
01/19/2023, 9:17 AM
`test_size` needs to be passed by name. It would need to be a combination of passing an iterable as well as a dictionary to the `inputs` of the node, which as far as I know isn't doable. If it's not possible, how would you suggest I proceed? My objective is to feed outputs from different nodes into that function, which then outputs into a train node.
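If I've understood the constraint, one common workaround is a thin wrapper that takes all node inputs positionally and forwards the last one as the keyword argument the underlying function requires. `split_data` and all names here are hypothetical stand-ins for the real function:

```python
# Sketch: a Kedro node passes inputs either as a list (positional) or
# a dict (by name), not both. This wrapper lets the node use a flat
# list of inputs while the real function still receives test_size by
# keyword. split_data stands in for the actual function.
import functools


def split_data(features, labels, *, test_size):
    """Hypothetical function requiring test_size as keyword-only."""
    cut = int(len(features) * (1 - test_size))
    return features[:cut], features[cut:], labels[:cut], labels[cut:]


def make_split_node(func):
    @functools.wraps(func)
    def node_func(features, labels, test_size):
        # adapt: last positional input becomes the keyword argument
        return func(features, labels, test_size=test_size)
    return node_func

# Hypothetical usage in the pipeline definition:
# node(make_split_node(split_data),
#      inputs=["features", "labels", "params:test_size"],
#      outputs=["x_train", "x_test", "y_train", "y_test"])
```

The wrapper keeps the node's `inputs` a plain list, so the outputs of the upstream nodes can converge into it without needing mixed positional/keyword inputs.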
Balazs Konig
01/19/2023, 10:55 AM

Juan Marin
01/19/2023, 12:32 PM
Is there a `kedro` command to import datasets from a path into my data directory in the project? Thanks!

Simen Husøy
01/19/2023, 2:08 PM
Waiting for the remaining 582 operations to synchronize with Neptune. Do not kill this process.
Still waiting for the remaining 582 operations (0.00% done). Please wait.
Brandon Meek
01/19/2023, 8:05 PM
By default, `kedro run` loads the configuration from `conf/base` and then overrides it with `conf/local`, and you can use the `--env` argument to use a different environment instead of `conf/local`. But I was wondering if there is a way to make the `--env` argument waterfall instead of just overriding, so that running `kedro run --env=dev` would go `conf/base` -> `conf/dev` -> `conf/local`.
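There's no built-in flag for that as far as I know, but the waterfall itself is just an ordered merge; a framework-free sketch of the `base -> dev -> local` precedence is below. Wiring it into Kedro (e.g. via a custom config loader subclass registered in `settings.py`) is an assumption about where you'd hook it in:

```python
# Sketch: merge config environments in order, later ones winning, so
# base -> dev -> local gives conf/local the final say. In Kedro this
# logic would live in a custom config loader; this standalone function
# just demonstrates the merge order.
from typing import Dict, List


def waterfall(envs: List[Dict]) -> Dict:
    merged: Dict = {}
    for env in envs:           # e.g. [base_conf, dev_conf, local_conf]
        merged.update(env)     # later environments override earlier ones
    return merged
```

Note this is a shallow merge; nested config sections would need a recursive variant.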
Dustin
01/20/2023, 4:05 AMHi team, I would like to discuss a feature idea (or this is already implemented?) to seek your thought :)
Context:
It is common in practice to know the consuming time of the whole pipeline and the consuming time of each node in the pipeline.
I assume the stakeholder/engineers would like to understand the performance of pipline and which part can be optimized.
Features:
1. Is it possible to show consuming time (in second/minutes/) of each node in the pipeline?
1.1 by default, it is shown in the console as part of logging and you can configure to turn it off
2. Given feature 1, is it possible to show the consuming time of each pipeline?
2.1 by default, it is shown in the console as part of logging at the end of each pipeline running
2.2 in case there are multiple pipeline, show it for each pipeline, you can configure to turn it off
3. Given feature 2, is it possible to show the consuming time of all pipelines in total
3.1 by default, it is shown in the console at the end of all pipelines running and you can configure
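Much of feature 1 is already achievable with Kedro hooks (`before_node_run`/`after_node_run`). The sketch below omits the Kedro imports and simplifies the hook signatures to a plain node name; in a real project the methods would take the `node` object, be decorated with `@hook_impl` from `kedro.framework.hooks`, and the class registered via `HOOKS` in `settings.py`:

```python
# Sketch of a node-timing hook (simplified: real Kedro hooks receive a
# node object, not a name, and need the @hook_impl decorator). This
# standalone version just shows the timing logic.
import time


class NodeTimerHooks:
    def __init__(self):
        self._starts = {}
        self.durations = {}    # node name -> elapsed seconds

    def before_node_run(self, node_name: str) -> None:
        self._starts[node_name] = time.perf_counter()

    def after_node_run(self, node_name: str) -> None:
        elapsed = time.perf_counter() - self._starts.pop(node_name)
        self.durations[node_name] = elapsed
        print(f"{node_name} took {elapsed:.2f}s")
```

Summing `durations.values()` at the end of a run would give the per-pipeline total from feature 2.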
Dustin
01/20/2023, 4:08 AM

Artur Dobrogowski
01/20/2023, 12:52 PM
There is a `setup.py` present in `src/`. I can't find information in the documentation about what it is used for. Is the Kedro pipeline built as a Python package for some portability feature? I'd like to know what's going on, if someone can shed some light here 🙂
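Yes, that's essentially it: `kedro package` uses that `setup.py` to build the project into a wheel that can be installed and run elsewhere. A hypothetical minimal version, with illustrative names (the real file generated by the project template differs in detail):

```python
# Hypothetical minimal src/setup.py, illustrating what `kedro package`
# builds from: a normal setuptools package definition, so the pipeline
# code can be distributed as dist/<name>-<version>-py3-none-any.whl.
# All names here are placeholders, not the template's exact contents.
from setuptools import find_packages, setup

setup(
    name="my_project",         # assumption: your package name
    version="0.1",
    packages=find_packages(exclude=["tests"]),
)
```

So it exists for exactly the portability you guessed at: deployment targets (Airflow workers, Spark clusters, etc.) can `pip install` the wheel instead of copying source trees around.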
Massinissa Saïdi
01/20/2023, 3:50 PM

Raghav Gupta
01/21/2023, 7:22 PM