#questions

Chandan Malla

11/30/2023, 11:11 AM
Hello Kedro Team, I am using SQLTableDataSet to save data to my DB, but if I use the same dataset name as the input to the next node in my pipeline, the data is loaded again from the DB instead of being served from a MemoryDataSet.
catalog.yml:
"{NAMESPACE}.ballbyball_final":
  type: pandas.SQLTableDataSet
  table_name: MODEL_{NAMESPACE}_BALLBYBALL_V2
  credentials: db2
  save_args:
    if_exists: replace
    chunksize: 10000
pipelines.py:
node(
    func=total_balls_done,
    inputs=["ballbyball_final_1", "params:MIN_TOTAL_BALLS_MATCH"],
    outputs="ballbyball_final",  # ---> Data is saved to DB over here
    name="total_balls_done",
    tags="ballbyball_preprocessing",
),
node(
    func=lambda x: x,
    inputs="ballbyball_final",  # ----> Data is loaded from SQLTableDataSet instead of MemoryDataSet
    outputs="cache_ballbyball_final",
    name="cache_ballbyball_final",
    tags="ballbyball_preprocessing",
),
This is how it looks when the pipeline is running:

Ankita Katiyar

11/30/2023, 1:27 PM
Try a different name for the dataset you want kept in memory: the current name is still getting matched to the dataset factory pattern.
MemoryDataset is only used for datasets that are not mentioned in the catalog.
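To make the matching point concrete, here is a rough sketch of how a factory-style pattern can claim a dataset name. It is a simplified approximation written with the standard library only and is not Kedro's actual implementation; the helper names `pattern_to_regex` and `match_pattern`, the namespace `ipl`, and the name `ballbyball_final_mem` are all made up for illustration.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn a factory-style pattern such as "{NAMESPACE}.name" into a regex.

    Hypothetical helper: each {placeholder} becomes a named capture group,
    everything else is matched literally.
    """
    parts = re.split(r"(\{\w+\})", pattern)
    regex = ""
    for part in parts:
        if re.fullmatch(r"\{\w+\}", part):
            regex += f"(?P<{part[1:-1]}>.+?)"
        else:
            regex += re.escape(part)
    return re.compile(regex)


def match_pattern(pattern: str, dataset_name: str):
    """Return the captured placeholder values, or None if the name does not match."""
    match = pattern_to_regex(pattern).fullmatch(dataset_name)
    return match.groupdict() if match else None


PATTERN = "{NAMESPACE}.ballbyball_final"

# "ipl" is an assumed namespace, used only for this example.
matched = match_pattern(PATTERN, "ipl.ballbyball_final")

# A differently named dataset is not claimed by the pattern, so a catalog
# with only this pattern would have no entry for it.
unmatched = match_pattern(PATTERN, "ipl.ballbyball_final_mem")
```

In this sketch `matched` holds the captured namespace, while `unmatched` is `None`, which corresponds to the case where the dataset is not mentioned in the catalog and stays in memory.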

Chandan Malla

12/01/2023, 6:10 AM
I am creating a cache, but to create the cache I also need to pass the variable as input, so I cannot change the input variable name.

Ankita Katiyar

12/01/2023, 11:47 AM

Chandan Malla

12/01/2023, 1:13 PM
No, I was not.
"{NAMESPACE}.cache_ballbyball_final":
  type: CachedDataset
  dataset:
    type: pandas.SQLTableDataSet
    table_name: MODEL_{NAMESPACE}_BALLBYBALL_V2
    credentials: db2
    save_args:
      if_exists: replace
      chunksize: 10000
But now {NAMESPACE} is not getting resolved inside table_name.
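For readers unfamiliar with the idea behind a cached dataset, the following is a minimal sketch of a caching wrapper: it writes through to a slow store and keeps the saved object in memory, so a later load does not hit the store again. `SlowStore` and `CachingWrapper` are hypothetical stand-ins written for this illustration; they are not Kedro classes and leave out details a real implementation would handle.

```python
class SlowStore:
    """Stand-in for a database-backed dataset; counts how often it is read."""

    def __init__(self):
        self._data = None
        self.load_calls = 0

    def save(self, data):
        self._data = data

    def load(self):
        self.load_calls += 1
        return self._data


class CachingWrapper:
    """Hypothetical wrapper: saves to the wrapped store and remembers the data."""

    def __init__(self, dataset):
        self._dataset = dataset
        self._cache = None
        self._has_cache = False

    def save(self, data):
        # Write through to the slow store, then keep a reference in memory.
        self._dataset.save(data)
        self._cache = data
        self._has_cache = True

    def load(self):
        # Only fall back to the slow store if nothing has been cached yet.
        if not self._has_cache:
            self._cache = self._dataset.load()
            self._has_cache = True
        return self._cache


store = SlowStore()
cached = CachingWrapper(store)
cached.save([1, 2, 3])
result = cached.load()  # served from memory; the store is not read
```

After the save, `store.load_calls` is still zero, which is the behaviour the question was after: the next node receives the data without a second round trip to the DB.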

Ankita Katiyar

12/01/2023, 3:11 PM
What version of Kedro are you using?
This was a bug that we fixed in 0.18.14.
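The thread only says that the placeholder inside the nested `dataset` entry was not resolved before 0.18.14. As a general illustration of what resolving placeholders at every nesting level involves, here is a small sketch; `resolve_config` is a hypothetical helper and not Kedro's code, and the namespace value `ipl` is assumed for the example.

```python
def resolve_config(config, values):
    """Recursively substitute {placeholders} in every string of a nested config."""
    if isinstance(config, dict):
        return {key: resolve_config(value, values) for key, value in config.items()}
    if isinstance(config, list):
        return [resolve_config(item, values) for item in config]
    if isinstance(config, str) and "{" in config:
        return config.format_map(values)
    # Numbers, booleans and plain strings are returned unchanged.
    return config


# Catalog entry from the thread, written as a Python dict.
entry = {
    "type": "CachedDataset",
    "dataset": {
        "type": "pandas.SQLTableDataSet",
        "table_name": "MODEL_{NAMESPACE}_BALLBYBALL_V2",
        "credentials": "db2",
        "save_args": {"if_exists": "replace", "chunksize": 10000},
    },
}

# "ipl" is an assumed namespace value, used only for illustration.
resolved = resolve_config(entry, {"NAMESPACE": "ipl"})
```

Because the helper descends into nested dictionaries, `resolved["dataset"]["table_name"]` has the placeholder filled in, while non-string values such as `chunksize` pass through untouched.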

Chandan Malla

12/01/2023, 5:28 PM
Wow, works now, thank you @Ankita Katiyar 😄