How to use MemoryDataset in Kedro with PySpark
I am trying to use a MemoryDataset with Kedro so that the intermediate result is not saved to disk.
# nodes.py
def preprocess_format_tracksessions(tracksess: DataFrame, userid_profiles: pd.DataFrame, parameters: Dict) -> MemoryDataset:
In the pipeline I define the node's inputs and outputs:
# pipeline.py
def create_pipeline(**kwargs) -> Pipeline:
    return pipeline([
        node(
            func=preprocess_format_tracksessions,
            inputs=["track_sessions",...