# questions
a
Hi, I'm a new Kedro user. For the past couple of weeks I have been trying to understand how to use the tool. I have the following problem: in the catalog I have different keys pointing to different config files:
Model1Config:
  type: yaml.YAMLDataset
  filepath: src/mykedro/models/Model1/config.yaml

Model2Config:
  type: yaml.YAMLDataset
  filepath: src/mykedro/models/Model2/config.yaml
I would like to use these config files in a loop (for example, to train the two models sequentially). For that I need access to the `catalog` (or, even better, to the current session) inside the node. My first idea was to create a dummy node that exposes the session as a variable:
from kedro.framework.session.session import get_current_session

def load_session():
    return get_current_session()


node(
    func=load_session,
    inputs=None,
    outputs="current_session",
),
The problem is that `get_current_session` is deprecated and no longer available (I am using Kedro 0.19). My question is: how can I pass these values to node inputs? Note that creating a new `KedroSession` in `load_session` like this:
from kedro.framework.session import KedroSession
def load_session():
    with KedroSession.create() as session:
        return session
is not working for me (I receive a memory error).
y
So how it works is, you first define a pure Python function, e.g.:
import pandas as pd
from sklearn.base import BaseEstimator  # assuming a scikit-learn model
def train_model(data: pd.DataFrame, config: dict) -> BaseEstimator:
    ...
Here `config` expects the dictionary you have in your YAML files. Next, when defining a pipeline, you wrap this function in a node, and it is during this wrapping that you point it to the datasets you've defined:
# Inside your pipeline definition
from kedro.pipeline import Pipeline, node

def create_pipeline() -> Pipeline:
    return Pipeline([
        # ... other nodes ...
        node(
            func=train_model,
            inputs={
                "data": "how_your_data_is_called_in_catalog",
                "config": "Model1Config",  # This is the name of what you have in catalog
            },
            outputs="trained_model_1",
        ),
        # ... other nodes ...
    ])
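To make the `config` argument concrete, here is a rough sketch of what the node function sees. The YAML keys (`learning_rate`, `n_epochs`), the list-based data, and the function `toy_train_model` are all invented for illustration; the real config files and training code will look different.

```python
from typing import Any

# Hypothetical contents of a model config file after loading; the keys
# below are made up for this example.
model1_config: dict[str, Any] = {"learning_rate": 0.1, "n_epochs": 3}


def toy_train_model(data: list[float], config: dict[str, Any]) -> dict[str, float]:
    """Toy stand-in for real training: scale the mean of the data."""
    mean = sum(data) / len(data)
    return {
        "weight": mean * config["learning_rate"],
        "epochs": float(config["n_epochs"]),
    }


# Outside a pipeline the function is called like any other Python function.
# Inside a pipeline, the dataset named in the node's inputs is loaded and
# passed in as the same kind of plain dict.
model = toy_train_model([1.0, 2.0, 3.0], model1_config)
```

Because the function only receives plain Python objects, it can be unit tested without a session or a catalog.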
I'm almost sure that for this task you won't need to interact with the `session` object, or even know that it exists.
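For the "loop" part of the question, one possible approach is to do the looping when the pipeline is built rather than inside a node: generate one training node per config entry. The sketch below only builds the keyword arguments for each node, so it runs without Kedro installed. The helper `build_node_specs`, the output names, and the node names are assumptions made for this example; the catalog names come from the snippets above.

```python
from typing import Any

# Model names matching the catalog entries "Model1Config" and "Model2Config".
MODEL_NAMES = ["Model1", "Model2"]


def build_node_specs(model_names: list[str]) -> list[dict[str, Any]]:
    """Return one set of node keyword arguments per model."""
    specs = []
    for name in model_names:
        specs.append(
            {
                "inputs": {
                    "data": "how_your_data_is_called_in_catalog",
                    "config": f"{name}Config",  # catalog entry for this model
                },
                "outputs": f"trained_{name.lower()}",  # hypothetical output name
                "name": f"train_{name.lower()}",  # hypothetical node name
            }
        )
    return specs


specs = build_node_specs(MODEL_NAMES)

# In the pipeline file these could be unpacked into real nodes, for example:
#     from kedro.pipeline import Pipeline, node
#     Pipeline([node(func=train_model, **spec) for spec in specs])
```

The two training nodes do not depend on each other, so the pipeline itself does not fix which one runs first; a runner that executes one node at a time would still train them one after the other.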