# questions
p
Hey team, I'm migrating a legacy data engineering pipeline to Kedro. In the past we created classes to handle the connections to our data sources, e.g. `SnowflakeInstance`, `SharePointSite`, etc., functioning as clients to these sources with custom read/transform/write functionality. *The problem:* we're still initializing the clients inside the node functions from configuration arguments (screenshot 1). *My question:* is there an easy way to take the client itself as the node input (screenshot 2)? How can I define it in the catalog?
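For context, here's a minimal sketch of the two patterns I mean; all names are illustrative stand-ins for our in-house code, not the exact contents of the screenshots:
```python
# Stand-in for our in-house client class (the real one lives elsewhere).
class SnowflakeInstance:
    def __init__(self, account: str, database: str):
        self.account = account
        self.database = database

    def read(self, table: str):
        ...  # custom read logic


# Current pattern (screenshot 1): the node rebuilds the client
# from configuration arguments on every call.
def extract_table(account: str, database: str, table: str):
    client = SnowflakeInstance(account=account, database=database)
    return client.read(table)


# Desired pattern (screenshot 2): the node receives a ready-made
# client as an input, defined somewhere in the catalog.
def extract_table_from_client(client: SnowflakeInstance, table: str):
    return client.read(table)
```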
Sajid Alam
You can use Kedro's config for managing env variables by storing them in the `conf` directory, with subdirectories for different environments. See the docs for it here.
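For reference, the standard layout looks roughly like this (the `local` environment typically holds credentials and is gitignored):
```
conf/
├── base/           # defaults shared across all environments
│   ├── catalog.yml
│   └── parameters.yml
└── local/          # per-environment overrides, e.g. credentials
    └── credentials.yml
```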
p
Thanks @Sajid Alam, but my question is not about how to load env variables from config, rather about how to load a client (a custom Python object) from the catalog, such as the `SnowflakeInstance` in my example.
Sajid Alam
Right, I see. It sounds like you already have some classes that behave like datasets, `SnowflakeInstance` and `SharePointSite`. I think these need to be turned into custom datasets in Kedro; then you can define them in `catalog.yml` and load them into nodes directly. You can follow this guide to make these into Kedro custom datasets.
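Roughly, a minimal sketch of what that could look like, assuming a recent Kedro where the base class is `kedro.io.AbstractDataset` (older releases spell it `AbstractDataSet`); the `SnowflakeInstanceDataset` name and constructor arguments are illustrative:
```python
from typing import Any, Dict, Optional

from kedro.io import AbstractDataset  # AbstractDataSet in older Kedro releases


class SnowflakeInstance:
    """Stand-in for the existing in-house client class."""

    def __init__(self, account: str, database: str, **credentials: Any):
        self.account = account
        self.database = database
        self._credentials = credentials


class SnowflakeInstanceDataset(AbstractDataset):
    """Exposes a configured SnowflakeInstance client as a catalog entry."""

    def __init__(
        self,
        account: str,
        database: str,
        credentials: Optional[Dict[str, Any]] = None,
    ):
        self._account = account
        self._database = database
        self._credentials = credentials or {}

    def _load(self) -> SnowflakeInstance:
        # Called when a node lists this catalog entry as an input.
        return SnowflakeInstance(self._account, self._database, **self._credentials)

    def _save(self, data: Any) -> None:
        raise NotImplementedError("This dataset only provides a client; it is read-only.")

    def _describe(self) -> Dict[str, Any]:
        # Keep credentials out of logs and descriptions.
        return {"account": self._account, "database": self._database}
```
The catalog entry then points at the class path, and the `credentials` key is resolved by name from `credentials.yml` (paths and names illustrative):
```yaml
# conf/base/catalog.yml
snowflake_client:
  type: my_project.datasets.SnowflakeInstanceDataset
  account: my_account
  database: analytics
  credentials: snowflake_creds  # looked up in conf/local/credentials.yml
```
A node can then take `snowflake_client` as a regular input and receive the loaded `SnowflakeInstance`.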
p
Thanks! So I guess one can only load simple objects like strings, integers, and floats from the params or other config files, and custom objects have to come through the catalog as custom datasets. Is that correct?
👍 1