#questions

Iñigo Hidalgo

02/07/2024, 10:01 AM
Follow-up question on factories: I have this dataset definition.
country_technology_granularity__predictions:
  type: axpo.kedro.datasets.pandas_arrow_dataset.ParquetArrowDataset
  path: abfs://container/country/technology/granularity/predictions/
  credentials: blob_storage
  versioned: true
  write_mode: append
  partition_method: datetime
  datetime_column: gas_date
  partition_by: [year, month, day]
From `type` to `write_mode`, all the config will be the same regardless of the dataset, but I would like to make the last 3 configurable. Could I somehow refer to a global defined like so?
dataset_config:
    country_technology_granularity:
        datetime_column: gas_date
        partition_by: [year, month, day]

Ahdra Merali

02/07/2024, 10:27 AM
Do you mean something like using variable interpolation?

Iñigo Hidalgo

02/07/2024, 10:46 AM
Yeah, either normal variable interpolation or using globals. But I would need to use the captured groups from the dataset factory to point to the right variable to interpolate. Something like this:
_dataset_config:
  country_technology_granularity:
    partition_method: datetime
    datetime_column: gas_date
    partition_by: [year, month, day]

'{signal_name}__predictions':
  type: axpo.kedro.datasets.pandas_arrow_dataset.ParquetArrowDataset
  path: abfs://container/{signal_name}/predictions/
  credentials: blob_storage
  versioned: true
  write_mode: append
  partition_method: {_dataset_config.{signal_name}.partition_method}
From what I can tell, and as @Ankita Katiyar mentioned in the other thread, the variable interpolation happens before the actual dataset factory is "applied", so I'm getting the impression it's not really possible to do what I want.

Ankita Katiyar

02/07/2024, 11:07 AM
This isn’t possible right now, again because config loading happens well before dataset factory resolution. We do have an issue open to collect use cases for something like this, so if you want to add yours too, that’d be great 😄 - https://github.com/kedro-org/kedro/issues/3086
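
The ordering described above can be illustrated with a two-stage toy model. This is only a sketch in plain Python, not Kedro's or OmegaConf's actual implementation: `resolve_interpolations`, `apply_factory`, and the flat `variables` dict are hypothetical stand-ins for "config loading" and "dataset factory resolution". It shows why a `{signal_name}` placeholder nested inside a `${...}` interpolation cannot be resolved: when stage 1 runs, the placeholder is still literal text.

```python
import re

# Hypothetical two-stage model of the ordering described above; not Kedro's code.
INTERPOLATION = re.compile(r"\$\{(.+)\}")


def resolve_interpolations(entry, variables):
    """Stage 1, at config-loading time: replace ${key} with known variables."""
    def lookup(match):
        key = match.group(1)
        if key not in variables:
            raise KeyError(f"no variable named {key!r} at config-loading time")
        return str(variables[key])

    return {
        field: INTERPOLATION.sub(lookup, value) if isinstance(value, str) else value
        for field, value in entry.items()
    }


def apply_factory(pattern, entry, dataset_name):
    """Stage 2, at dataset lookup time: fill {placeholders} from the name."""
    # Turn "{signal_name}__predictions" into a regex with a named group.
    regex = re.escape(pattern).replace(r"\{", "(?P<").replace(r"\}", ">.+)")
    match = re.fullmatch(regex, dataset_name)
    if match is None:
        raise ValueError(f"{dataset_name!r} does not match {pattern!r}")
    return {
        field: value.format(**match.groupdict()) if isinstance(value, str) else value
        for field, value in entry.items()
    }


variables = {
    "_dataset_config.country_technology_granularity.partition_method": "datetime",
}
pattern = "{signal_name}__predictions"

# A placeholder in an ordinary field survives stage 1 and is filled in stage 2.
plain = {"path": "abfs://container/{signal_name}/predictions/", "versioned": True}
resolved = apply_factory(
    pattern,
    resolve_interpolations(plain, variables),
    "country_technology_granularity__predictions",
)

# A placeholder inside ${...} is still literal text when stage 1 runs,
# so the lookup key does not exist yet and resolution fails.
nested = {"partition_method": "${_dataset_config.{signal_name}.partition_method}"}
try:
    resolve_interpolations(nested, variables)
    stage_one_error = None
except KeyError as error:
    stage_one_error = str(error)
```

In this model the only fix is to change the order of the stages or to resolve the nested lookup during stage 2, which is what the linked issue is collecting use cases for.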

Iñigo Hidalgo

02/07/2024, 11:09 AM
Would that be the correct issue? I'm talking about a different way of interpolating but that issue seems to be about using the factories pattern in other config types, right?

Ahdra Merali

02/07/2024, 11:16 AM
It's the best issue at present - this comment seems similar to what you're trying to do. As Ankita confirmed, it's not currently possible, but do upvote the issue/comment (or add an additional one if the comment above doesn't quite capture your use case). We're still in our first iteration of this feature, so all suggestions for further extensions are welcome 😄

Iñigo Hidalgo

02/07/2024, 11:23 AM
Posted a reply, thank you 🙂
At this point I will probably look into some hacky method to get this behavior, as factories without the ability to customize certain parameters don't get me where I need to be for the product I'm trying to build. I would welcome any thoughts here on more complex solutions involving hooks. I can see an `after_catalog_created` hook, but subclassing the `DataCatalog`, which is where all this resolving happens, is also a possibility.
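
One possible shape for such a hacky method is to skip the factory for these datasets and generate explicit catalog entries up front, one per key in `_dataset_config`. The sketch below is plain Python under stated assumptions: the catalog config is available as ordinary dicts mirroring the YAML in this thread, and `expand_entries`, `TEMPLATE`, and `DATASET_CONFIG` are hypothetical names. How the result would be fed back into Kedro (a hook, a custom config loader, or a `DataCatalog` subclass) is not shown and would need checking against the Kedro version in use.

```python
from copy import deepcopy

# Assumption: these dicts mirror the YAML from the thread after it is loaded.
DATASET_CONFIG = {
    "country_technology_granularity": {
        "partition_method": "datetime",
        "datetime_column": "gas_date",
        "partition_by": ["year", "month", "day"],
    },
}

# Fields that are identical for every dataset of this kind.
TEMPLATE = {
    "type": "axpo.kedro.datasets.pandas_arrow_dataset.ParquetArrowDataset",
    "path": "abfs://container/{signal_name}/predictions/",
    "credentials": "blob_storage",
    "versioned": True,
    "write_mode": "append",
}


def expand_entries(template, dataset_config, name_pattern="{signal_name}__predictions"):
    """Build one explicit catalog entry per configured signal (hypothetical helper)."""
    catalog = {}
    for signal_name, overrides in dataset_config.items():
        # Fill the placeholder in every string field of the shared template.
        entry = {
            key: value.format(signal_name=signal_name) if isinstance(value, str) else value
            for key, value in template.items()
        }
        # Layer the per-dataset settings on top, copied so entries stay independent.
        entry.update(deepcopy(overrides))
        catalog[name_pattern.format(signal_name=signal_name)] = entry
    return catalog


catalog = expand_entries(TEMPLATE, DATASET_CONFIG)
```

The trade-off is that every signal has to be listed in `_dataset_config` ahead of time, so the open-ended matching that factories provide is lost for these datasets.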