#questions

Brandon Meek

03/09/2023, 5:55 PM
Hey all, if I wanted to request that a part of Kedro be moved to a plugin so people could install it as a standalone tool, would I do that in Kedro or in Kedro-Plugins?

Deepyaman Datta

03/09/2023, 5:58 PM
Main Kedro repo is fine. What do you want to move though? 😄
👍 1

Brandon Meek

03/09/2023, 6:02 PM
I'd like to move `AbstractDataSet`. I have a library that implements several datasets for my employer's proprietary data tools, and I'd love to share it with my colleagues, including the ones I haven't convinced to start using Kedro, so they could use the code API. But a full install of Kedro carries a lot of overhead that isn't necessary for using just the `AbstractDataSet`.
👍 1
@Deepyaman Datta thanks for the advice, I've created this
👍 1

Deepyaman Datta

03/09/2023, 6:30 PM
Great, makes sense!

Brandon Meek

03/09/2023, 6:31 PM
Awesome! I'd also really love to help out on the implementation of this if it gets approved. I've gotten a lot of use out of all of you folks' efforts on Kedro, and I'd love to give back.
🙌 1

Nok Lam Chan

03/09/2023, 11:24 PM
Curious if you just use the DataSet and no other component of Kedro?

Brandon Meek

03/09/2023, 11:50 PM
@Nok Lam Chan I use a lot of features of Kedro, but my coworkers could benefit from having just the dataset feature. I could unify all of my company's legacy data connectors behind a single code API, regardless of whether they use Kedro or not.

Merel

03/10/2023, 9:27 AM
Hi Brandon, we talked about moving the `AbstractDataSet` out of core Kedro as part of moving all dataset implementations into their own repo (`kedro-datasets`). However, we decided against it. You can read the discussion here: https://github.com/kedro-org/kedro/issues/1776#issuecomment-1234432081

Nok Lam Chan

03/10/2023, 1:22 PM
If you just use DataSet as a standalone component, then the `AbstractDataSet` will be too much, since things like `_release` and `from_config` are not useful without other components like a kedro run or `DataCatalog`.
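
To make the point above concrete, here is a minimal sketch of what a stripped-down dataset base class could look like if only loading and saving were needed. `MinimalDataSet` and `JSONFileDataSet` are hypothetical names made up for this illustration; this is not Kedro's actual `AbstractDataSet`, it only mirrors the `_load` / `_save` / `_describe` split and leaves out catalog-oriented pieces such as `from_config` and `_release`.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict


class MinimalDataSet(ABC):
    """Hypothetical stripped-down base class (not Kedro's AbstractDataSet)."""

    def load(self) -> Any:
        # Public entry point; delegates to the subclass implementation.
        return self._load()

    def save(self, data: Any) -> None:
        # Assumption for this sketch: saving None is treated as a mistake.
        if data is None:
            raise ValueError("Refusing to save None")
        self._save(data)

    def __str__(self) -> str:
        return f"{type(self).__name__}({self._describe()})"

    @abstractmethod
    def _load(self) -> Any:
        """Read and return the data."""

    @abstractmethod
    def _save(self, data: Any) -> None:
        """Persist the data."""

    @abstractmethod
    def _describe(self) -> Dict[str, Any]:
        """Return the arguments that identify this dataset."""


class JSONFileDataSet(MinimalDataSet):
    """Hypothetical connector that stores data as a JSON file."""

    def __init__(self, filepath: str) -> None:
        self._filepath = Path(filepath)

    def _load(self) -> Any:
        with self._filepath.open(encoding="utf-8") as handle:
            return json.load(handle)

    def _save(self, data: Any) -> None:
        with self._filepath.open("w", encoding="utf-8") as handle:
            json.dump(data, handle)

    def _describe(self) -> Dict[str, Any]:
        return {"filepath": str(self._filepath)}


# Round trip through a temporary file, with no catalog or project involved.
with tempfile.TemporaryDirectory() as tmp:
    dataset = JSONFileDataSet(str(Path(tmp) / "example.json"))
    dataset.save({"rows": 3})
    roundtrip = dataset.load()
    description = str(dataset)
```

A class shaped like this has no dependencies beyond the standard library, which is the trade-off being discussed: less functionality, but nothing extra to install.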

Matthias Roels

03/10/2023, 1:35 PM
I agree with the comments on https://github.com/kedro-org/kedro/issues/1776, but I do see the value of making kedro-datasets a standalone component (maybe even combined with the catalog concept). We could then let kedro depend on this component (instead of the other way around, as is currently the case).
👍 1

Brandon Meek

03/10/2023, 1:39 PM
@Merel thank you for the background. @Nok Lam Chan I don't disagree that there's certainly some overhead, but I think that kedro-datasets could have a value proposition as a stand-alone tool outside of its integration with Kedro.
👍 2

Nok Lam Chan

03/10/2023, 1:50 PM
I think that's a good sign that these Kedro components are quite modular. Personally, I am quite happy that you can use some Kedro components without needing the full project. When you are using DataSet as a standalone component, how is the configuration being handled? Or do you always supply the arguments in place when creating the dataset?

Brandon Meek

03/10/2023, 2:22 PM
So my coworkers who aren't using Kedro are interacting with it through the code API instead of YAML.
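
The difference between the two styles can be sketched as follows. Everything here is hypothetical and for illustration only: `TextDataSet`, `REGISTRY`, and `dataset_from_config` are made-up names, and the dict stands in for what a parsed YAML entry might look like. It is not Kedro's own configuration loading.

```python
import tempfile
from pathlib import Path
from typing import Any, Dict, Type


class TextDataSet:
    """Hypothetical dataset used only for this illustration."""

    def __init__(self, filepath: str, encoding: str = "utf-8") -> None:
        self._filepath = Path(filepath)
        self._encoding = encoding

    def load(self) -> str:
        return self._filepath.read_text(encoding=self._encoding)

    def save(self, data: str) -> None:
        self._filepath.write_text(data, encoding=self._encoding)


# Hypothetical registry mapping a config "type" string to a class.
REGISTRY: Dict[str, Type[TextDataSet]] = {"TextDataSet": TextDataSet}


def dataset_from_config(config: Dict[str, Any]) -> TextDataSet:
    """Build a dataset from a dict shaped like a parsed YAML entry."""
    kwargs = dict(config)  # copy so the caller's dict is not mutated
    dataset_class = REGISTRY[kwargs.pop("type")]
    return dataset_class(**kwargs)


with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "notes.txt")

    # Code API: arguments go straight into the constructor.
    direct = TextDataSet(filepath=path)
    direct.save("hello")

    # Config-driven: the same arguments arrive as data instead of code.
    configured = dataset_from_config({"type": "TextDataSet", "filepath": path})
    loaded = configured.load()
```

With the code API, the constructor call is the configuration, so no separate config file or loader is needed for colleagues who only want the connector.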

Deepyaman Datta

03/10/2023, 4:05 PM
FWIW my personal opinion is quite aligned with what @Matthias Roels wrote and generally how @Brandon Meek is looking to use the dataset. To copy a snippet from an idea/vision I shared internal to the maintainer team:
Instead, we should see Kedro as an ecosystem of core components (e.g. data catalog, datasets, modular pipelines), plus an opinionated (relatively lightweight) wrapper that stitches these all together and includes best practices. However, a power user can use each of these components or a combination thereof with their own lightweight stitching, if they don't like the way Kedro does it.
But, of course, there's effort and alignment required on that, since I'm pretty sure there are arguments in the other direction too. 🙂
👍 3