Iñigo Hidalgo
11/08/2023, 5:18 PM
I have a time-series problem where I compute a lot of lags, rolling statistics, etc. When designing my training pipeline, I have a target number of days I want my master table to include.
Due to the way lags are carried out in pandas, we need to pad our initial queries by the maximum length of lag, as otherwise we would get nulls at the start. This maximum would then be an input to some initial nodes which filter sql tables.
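To make the padding concrete, here is a minimal sketch of the date arithmetic, assuming daily data with one row per day. The function name `padded_query_window` and the example lags and windows are hypothetical, not taken from the thread: a lag of k needs k earlier rows, and a rolling window of w needs w - 1.

```python
from datetime import date, timedelta


def padded_query_window(end_date, target_days, lags, rolling_windows=()):
    """Return (query_start, target_start, pad) for a daily time series.

    Assumption: one row per day. A lag of k needs k earlier rows and a
    rolling window of w needs w - 1 earlier rows to avoid leading nulls.
    """
    pad = max([*lags, *(w - 1 for w in rolling_windows)], default=0)
    # First day that should appear in the final master table.
    target_start = end_date - timedelta(days=target_days - 1)
    # First day the SQL filter has to include so lags are fully populated.
    query_start = target_start - timedelta(days=pad)
    return query_start, target_start, pad


query_start, target_start, pad = padded_query_window(
    end_date=date(2023, 11, 1),
    target_days=30,
    lags=[1, 7, 28],
    rolling_windows=[14],
)
```

The filtering nodes would query from `query_start`, and rows before `target_start` would be dropped once the lags are computed.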
run something like kedro run --params target_date:2023-11-01
we do this for simple behaviors but the runtime params are kinda limited when working with nested dicts. maybe hooks could be a way forward? after_catalog_created?
and whilst it's technically possible, it's not nice to feed runtime arguments into catalog definitions to dynamically change load behaviour.
Nok Lam Chan
11/08/2023, 5:21 PM
Iñigo Hidalgo
11/08/2023, 5:31 PM
datajoely
11/08/2023, 5:32 PM
Nok Lam Chan
11/08/2023, 5:33 PM
"If I wanted to only change the final item in the list, I would need to pass the whole dictionary anew in the CLI"
Why? I thought it only updates the one key.
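Whether an override touches one key or the whole structure comes down to how overrides are merged. The sketch below assumes a recursive merge; `deep_merge` and the parameter names are hypothetical illustrations, not Kedro's implementation, and the behaviour of a given Kedro version may differ. It shows that a nested dict key can be overridden on its own, while a list is replaced as a whole, so changing only its final item means passing the full list again.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Recursively merge override into a copy of base.

    Dicts are merged key by key; any other value (including a list)
    replaces the existing value wholesale.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical parameters for a lag/rolling feature pipeline.
params = {"features": {"lags": [1, 7, 28], "rolling": {"window": 14}}}

# Overriding a single nested key leaves its siblings untouched.
one_key = deep_merge(params, {"features": {"rolling": {"window": 30}}})

# A list has no per-item address here, so the whole list must be supplied.
new_list = deep_merge(params, {"features": {"lags": [1, 7, 56]}})
```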
Iñigo Hidalgo
11/08/2023, 5:34 PM
Nok Lam Chan
11/08/2023, 5:37 PM
Iñigo Hidalgo
11/08/2023, 5:38 PM
Nok Lam Chan
11/08/2023, 5:39 PM
Iñigo Hidalgo
11/08/2023, 5:39 PM
Iñigo Hidalgo
11/08/2023, 5:39 PM
Iñigo Hidalgo
11/08/2023, 5:39 PM
Nok Lam Chan
11/08/2023, 5:39 PM
Nok Lam Chan
11/08/2023, 5:39 PM
Iñigo Hidalgo
11/08/2023, 5:40 PM
Nok Lam Chan
11/08/2023, 5:40 PM
Nok Lam Chan
11/08/2023, 5:41 PM
Iñigo Hidalgo
11/08/2023, 5:42 PM
Iñigo Hidalgo
11/08/2023, 5:43 PM
Nok Lam Chan
11/08/2023, 5:47 PM
Iñigo Hidalgo
11/08/2023, 6:00 PM
18.X but it hasn't been a massive priority for us so it's slow progress
Iñigo Hidalgo
11/08/2023, 6:02 PM
Nok Lam Chan
11/08/2023, 6:04 PM
kedro-datasets even if you are in 0.17.x. The downside is you need to define the full dataset path just like a third party plugin, i.e. kedro_datasets.pandas.CSVDataset
Iñigo Hidalgo
11/08/2023, 6:05 PM
Iñigo Hidalgo
11/08/2023, 6:06 PM