Harsh Maheshwari
04/01/2023, 9:02 PM
pb95
04/03/2023, 10:25 AM
Damian Fiłonowicz
04/03/2023, 11:11 AM
Damian Fiłonowicz
04/03/2023, 11:12 AM
Dawid Bugajny
04/03/2023, 12:02 PM
Dawid Bugajny
04/03/2023, 12:23 PM
Guillaume Latour
04/03/2023, 1:28 PM
Olivia Lihn
04/03/2023, 10:36 PM
Balachandran Ponnusamy
04/03/2023, 10:37 PM
Zemeio
04/04/2023, 4:52 AM
FlorianGD
04/04/2023, 7:55 AM
Melvin Kok
04/04/2023, 10:58 AM
after_catalog_created hook is triggered before after_context_created. However this is fixed when kedro-telemetry is uninstalled (I have raised an issue here)
2. kedro-telemetry is still sending information about the data catalog, the default pipeline etc. to heapanalytics.com even if consent is set to false. Under KedroTelemetryProjectHooks, it is calling _send_heap_event without checking for consent.
Gary McCormack
04/05/2023, 10:44 AM
Olivier Ho
04/05/2023, 1:22 PM
get_current_session has been deprecated for hooks
Franco Zentilli
04/05/2023, 4:12 PM
kedro viz is not working properly: it is displaying all nodes in one row, which is very weird. There are no errors in the terminal when it is executed, and my kedro-viz version is 6.0.1. Does anyone know what could be happening? It is as if kedro viz is not recognising the connections between nodes in pipelines.
William Caicedo
04/05/2023, 8:10 PM
Jannik Wiedenhaupt
04/05/2023, 8:54 PM
Dawid Bugajny
04/06/2023, 7:39 AM
Matthias Roels
04/06/2023, 12:22 PM
_load for the pandas parquet_dataset is different in kedro.extras.datasets vs the kedro-datasets plugin! The difference is significant, as the one in kedro extras can be extremely slow (2 hours compared to 10 seconds to load a dataset)!
In our case, we had a dataset on S3 generated by a Spark job (hence a “directory” of (snappy) parquet files with a _SUCCESS file) with 137808 rows and 6410 columns. With that dataset, I could validate that
pq.ParquetDataset(load_path, filesystem=self._fs).read(**self._load_args)
indeed took longer than 15 minutes (after that, I ran out of patience, since pd.read_parquet() on the same dataset loaded within 10 seconds).
So the question is: should we already switch from kedro extras datasets to the new kedro-datasets plugin to solve this issue? Is this plugin already ready to use with the current kedro version (v0.18.x)? And can we then simply remove the pandas extras from our requirements?
Guillaume Latour
04/06/2023, 2:46 PM
kedro.framework.project.find_pipelines)? Is it a desired behaviour? How can I have all the tags on kedro viz (without manually labeling this node with the tags of my 2 pipelines)?
Guilherme Parreira
04/06/2023, 5:17 PM
Ian Whalen
04/07/2023, 1:07 PM
return {"__default__": pipeline(Pipeline([node(foo, "params:value", None)]), namespace="bar")}
In my parameters yaml:
value: 1
I’ll get an error that says "Pipeline input(s) {'params:bar.value'} not found in the DataCatalog" when I kick off a kedro run.
Rather than define a bar.value (and so on for each namespace), is there a way to use defaults as a fallback and only use bar.value if it appears in my parameters?
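One way to sketch the fallback asked about here, assuming the parameters dict is already available when the pipeline is built (for example, loaded from the YAML file beforehand): check per parameter whether a namespaced override exists, and exempt a parameter from namespacing only when it does not. The helper names has_nested_key and fallback_parameters are hypothetical and not part of Kedro; the returned mapping is meant to be passed as the parameters argument of pipeline().

```python
from typing import Any, Dict, Iterable


def has_nested_key(params: Dict[str, Any], dotted: str) -> bool:
    """Return True if a dotted key such as "bar.value" exists in nested params."""
    current: Any = params
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return False
        current = current[part]
    return True


def fallback_parameters(
    params: Dict[str, Any], namespace: str, names: Iterable[str]
) -> Dict[str, str]:
    """Build a parameters mapping that keeps the un-namespaced default
    only for parameters that have no override under the namespace."""
    mapping: Dict[str, str] = {}
    for name in names:
        if not has_nested_key(params, f"{namespace}.{name}"):
            # No "<namespace>.<name>" entry: keep pointing at the default.
            mapping[f"params:{name}"] = f"params:{name}"
    return mapping
```

With only value: 1 defined this returns {"params:value": "params:value"}, so the node keeps reading params:value; once a bar: {value: ...} entry exists it returns an empty mapping and the namespaced params:bar.value is used instead.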
I know I could do pipeline(…, parameters={"params:value": "params:value"}), but that would always use the default value, rather than using the namespaced value when it's defined.
Roman Shevtsiv
04/07/2023, 4:08 PM
Aaron Niskin (amniskin)
04/07/2023, 4:48 PM
Ryoji Kuwae Neto
04/07/2023, 9:05 PM
Iñigo Hidalgo
04/10/2023, 3:02 PM
Tim
04/11/2023, 2:45 AM
Christianne Rio Ortega
04/11/2023, 3:38 AM
Dotun O
04/11/2023, 1:25 PM
Gary McCormack
04/11/2023, 2:08 PM
conf/base/parameters.yaml file
param_1: foo
param_2: bar
my_nice_new_params:
If I have the following toy code block:
@hook_impl
def before_node_run(..args.., catalog: DataCatalog, ..more_args..):
    # Inspect the parameter as it currently sits in the catalog
    print(catalog._get_dataset('params:my_nice_new_params'))
    new_param1, new_param2 = run_some_super_cool_logic()
    # Overwrite the parameter entry in the catalog
    catalog.add_feed_dict(
        {'params:my_nice_new_params': [new_param1, new_param2]},
        replace=True,
    )
    print(catalog._get_dataset('params:my_nice_new_params'))
Then the printed stdout will be something like
MemoryDataSet(data=<NoneType>)
MemoryDataSet(data=<list>)
which is what I would have hoped for. However, when the node itself is run and accesses 'params:my_nice_new_params', the original None value remains. Is there a step that I'm missing that saves the most recent instance of the catalog?
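A possible explanation, sketched without Kedro itself: as far as I can tell from the 0.18.x runner, a node's inputs are loaded from the catalog before before_node_run fires, so a catalog change made inside the hook arrives too late for that node. The hook's return value (a dict mapping dataset names to new values) is what updates the inputs. Everything below (run_node and both hook functions) is a hypothetical stand-in that mimics this ordering, not Kedro code.

```python
from typing import Any, Callable, Dict, List, Optional

Hook = Callable[[Dict[str, Any], Dict[str, Any]], Optional[Dict[str, Any]]]


def run_node(
    func: Callable[..., Any],
    input_names: List[str],
    catalog: Dict[str, Any],
    hook: Hook,
) -> Any:
    """Toy runner: load inputs first, then call the hook, then run the node."""
    # 1. Inputs are read from the catalog before the hook is called.
    inputs = {name: catalog[name] for name in input_names}
    # 2. The hook runs; only what it returns can change the loaded inputs.
    overrides = hook(catalog, inputs)
    if overrides:
        inputs.update(overrides)
    # 3. The node function receives the (possibly overridden) inputs.
    return func(*(inputs[name] for name in input_names))


def mutate_catalog_hook(catalog: Dict[str, Any], inputs: Dict[str, Any]) -> None:
    """Changes the catalog only; the already-loaded inputs are untouched."""
    catalog["params:my_nice_new_params"] = ["new_1", "new_2"]


def return_override_hook(
    catalog: Dict[str, Any], inputs: Dict[str, Any]
) -> Dict[str, Any]:
    """Returns the new value, which the runner merges into the inputs."""
    return {"params:my_nice_new_params": ["new_1", "new_2"]}
```

If that reading of the runner is right, the hook in the question would return {'params:my_nice_new_params': [new_param1, new_param2]} instead of (or in addition to) calling catalog.add_feed_dict.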