# questions
a
Hi everyone, I have created a client encapsulated in an AbstractDataset that should be used in multiple places in my project. I now want to create a few task datasets (also inheriting from AbstractDataset) that use this client (and therefore a dataset). What is the most common practice for doing this? The only thing I can think of is to store the client as a field in each of the task datasets, but I doubt that is a good idea 🙃
r
Hi Ayoub, you could probably look into creating a Custom Dataset. Here are the Kedro docs on how to go about it - https://docs.kedro.org/en/stable/data/how_to_create_a_custom_dataset.html
a
That is what I did, actually. The client dataset I am talking about is just a custom dataset. I want this custom dataset to be used in multiple other "services" datasets in my code. The problem is that I don't know how to load the "client dataset" (which is supposed to be unique) inside the "services datasets".
n
@Ayoub Chouikha Can you provide some pseudo-code to demonstrate what you mean? I am not sure what you mean by a client dataset. Do you mean a connection or something else?
a
A client dataset is just a custom dataset that wraps a client for a remote server. A task dataset is also a custom dataset, and it uses this client to execute some API requests.
n
Do you actually use this ClientDataset directly? Or is it only used indirectly through these task datasets?
a
It is being used indirectly through the "services/tasks datasets" to execute the requests
n
I don't think it's wrong to have a dataset inside a dataset (e.g. PartitionedDataset, IncrementalDataset). But if the main point is to create a client that can be reused for other tasks, you don't really need a full dataset, just a normal Python class. If you look at the SQLDataset, I think it's similar to how database connections are created and shared across different datasets, if I understand correctly.
Depending on your use case, you may want the client to be unique (like you said), or, in the opposite case, you may want to create a minimal number of database connections that are shared between different datasets.
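To make the "normal Python class" idea concrete, here is a minimal sketch. Everything in it is hypothetical: `ApiClient`, `TaskDataset`, the `base_url` and `endpoint` parameters, and the class-level `_clients` cache are illustration names, not Kedro APIs. In a real project `TaskDataset` would inherit from Kedro's AbstractDataset and implement whatever methods your Kedro version requires; the base class is left out here so the snippet runs on its own.

```python
from __future__ import annotations

from typing import Any


class ApiClient:
    """Hypothetical plain-Python client for a remote server (not a dataset)."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def get(self, endpoint: str) -> dict[str, Any]:
        # Placeholder: a real client would send an HTTP request here.
        return {"url": f"{self.base_url}/{endpoint}"}


class TaskDataset:
    """Sketch of a task dataset (assumption: would subclass AbstractDataset in Kedro)."""

    # Class-level cache: one client per base URL, shared by every instance.
    _clients: dict[str, ApiClient] = {}

    def __init__(self, base_url: str, endpoint: str) -> None:
        self._base_url = base_url
        self._endpoint = endpoint

    @property
    def client(self) -> ApiClient:
        # Create the client lazily on first use, then reuse it afterwards.
        if self._base_url not in TaskDataset._clients:
            TaskDataset._clients[self._base_url] = ApiClient(self._base_url)
        return TaskDataset._clients[self._base_url]

    def load(self) -> dict[str, Any]:
        # Each task dataset runs its own request through the shared client.
        return self.client.get(self._endpoint)


users = TaskDataset("https://example.invalid", "users")
orders = TaskDataset("https://example.invalid", "orders")
print(users.client is orders.client)
```

With this layout the client never has to be loaded as a dataset or declared in the catalog: two task datasets pointing at the same `base_url` end up sharing one client object, while a different `base_url` gets its own.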