Simen Husøy
01/17/2023, 12:23 PM
I have a pipeline, pipeline1, that uses a dataset x as data input. This dataset is a custom dataset class that downloads a set of data from a REST API we have. Multiple nodes use x as input.
I want to make a test pipeline that wraps pipeline1 by loading a different dataset (still from a REST API, but with different query parameters) together with additional test nodes that run performance metrics on the results from pipeline1. I have implemented this with the override functionality of pipeline: I wrap pipeline1 in a new pipeline function and give it an override dictionary so that it uses the test dataset instead of the original dataset, inputs={x: test_x}.
This seems to work, but I notice that it downloads the data multiple times, which is undesirable since each download from the API takes some time. It seems that every node in pipeline1 that uses x downloads (loads) the dataset itself, instead of the dataset being loaded once for the whole test pipeline.
Do you know how to prevent the dataset from being loaded for each node?
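(code in the comments)

    from kedro.pipeline import Pipeline, node, pipeline

    # create_cross_section_pipeline and cross_section_visualizer are
    # defined elsewhere in the project.


    def create_pipeline(**kwargs) -> Pipeline:
        # cross_section_pipeline = create_cross_section_pipeline()
        # Wrap the existing pipeline and swap its input dataset for the test dataset.
        cross_section_pipeline = pipeline(
            pipe=create_cross_section_pipeline(),
            inputs={"radar_data": "falcon_test_data"},
        )
        # Extra test node that plots the pipeline result against the test data.
        cross_section_plotting = node(
            func=cross_section_visualizer,
            inputs=["concatenated_result", "falcon_test_data"],
            outputs="cross_section_plot",
        )
        reporting_pipeline = pipeline([cross_section_plotting])
        return cross_section_pipeline + reporting_pipeline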
datajoely
01/17/2023, 12:24 PM
Simen Husøy
01/17/2023, 12:27 PM
falcon_test_data for each node that uses radar_data within the cross_section_pipeline. Hmmm....
I'll look at your example! Thanks
MemoryDataSet
radar_data for all nodes?
datajoely
01/17/2023, 1:19 PM
Simen Husøy
01/17/2023, 1:35 PM
_load from my custom dataset, but figured out that I had to take care of this caching myself. Just stored the data in a class variable and it sorted things out! Thanks for the help 😊
datajoely
01/17/2023, 1:43 PM
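For reference, a minimal sketch of the class-variable caching described above. It is not the original poster's code: CachedApiDataSet, fake_fetch, and the "site" parameter are hypothetical names, and the class is a plain Python stand-in so that it runs without Kedro. In a real project the same _load logic would presumably sit inside the custom dataset class.

```python
from typing import Any, Callable, Dict, Tuple


class CachedApiDataSet:
    """Hypothetical stand-in for a custom dataset that downloads from a REST API."""

    # Class-level cache shared by every instance, keyed by the query parameters.
    _cache: Dict[Tuple, Any] = {}

    def __init__(self, fetch: Callable[[Dict[str, Any]], Any], params: Dict[str, Any]):
        self._fetch = fetch    # function that performs the actual download
        self._params = params  # query parameters (values must be hashable)

    def _load(self) -> Any:
        key = tuple(sorted(self._params.items()))
        if key not in CachedApiDataSet._cache:
            # First load for these parameters: download once and remember the result.
            CachedApiDataSet._cache[key] = self._fetch(self._params)
        return CachedApiDataSet._cache[key]


# Usage: a fake download function that records how often it is called.
calls = []


def fake_fetch(params: Dict[str, Any]) -> Dict[str, Any]:
    calls.append(params)
    return {"rows": [1, 2, 3], "params": params}


dataset = CachedApiDataSet(fake_fetch, {"site": "test"})
first = dataset._load()
second = dataset._load()  # served from the class-level cache, no second download
```

Every load returns the same cached object, so a node that mutates the data in place would affect the other nodes; returning a copy avoids that at the cost of memory. Kedro also ships a CachedDataSet wrapper in kedro.io that may cover the same need without custom code.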