Rassul Yermagambet
01/05/2024, 11:01 AM
@hook_impl
def register_config_loader(
    self,
    conf_paths: Iterable[str],
    env: str,
    extra_params: Dict[str, Any],
) -> TemplatedConfigLoader:
    return TemplatedConfigLoader(conf_paths, globals_pattern="*globals.yml")
@hook_impl
def register_catalog(
    self,
    catalog: Optional[Dict[str, Dict[str, Any]]],
    credentials: Dict[str, Dict[str, Any]],
    load_versions: Dict[str, str],
    save_version: str,
    # journal: Journal,
) -> DataCatalog:
    return DataCatalog.from_config(
        catalog, credentials, load_versions, save_version,
        # journal,
    )
Merel
01/05/2024, 11:06 AM
Use settings.py instead to set a custom config loader and catalog. See the migration guide https://github.com/kedro-org/kedro/blob/main/RELEASE.md#breaking-changes-to-the-api-7 as well as: https://docs.kedro.org/en/stable/kedro_project_setup/settings.html
Rassul Yermagambet
01/05/2024, 11:47 AM
Rassul Yermagambet
01/08/2024, 8:47 AM
ValueError: Failed to format pattern '${folders.ref}': no config value found, no default provided
Filepaths with templated values are defined in globals.yml as follows:
# file: /conf/base/globals.yml
base_dir: ""
folders:
  ref: "data/reference"
  raw: "data/base/01_raw"
  intermediate: "data/base/02_intermediate"
  primary: "data/base/03_primary"
  features: "data/base/04_feature"
  model_input: "data/base/05_model_input"
  models: "data/base/06_models"
  model_output: "data/base/07_model_output"
  reporting: "data/base/08_reporting"
# timezone set for the whole pipeline
pipeline_timezone: 'UTC'
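For readers following along, the substitution mechanism behind these globals can be sketched in plain Python (this is not Kedro's actual implementation, just an illustration of how a `${dotted.key}` pattern resolves against the globals above, and why a missing key produces the ValueError quoted in this thread):

```python
import re

# Subset of the globals.yml above, loaded as a dict
GLOBALS = {
    "base_dir": "",
    "folders": {
        "ref": "data/reference",
        "models": "data/base/06_models",
    },
}

def resolve(value: str, globals_dict: dict) -> str:
    """Replace each ${dotted.key} in `value` with its entry from globals_dict."""
    def lookup(match: re.Match) -> str:
        node = globals_dict
        for part in match.group(1).split("."):
            if not isinstance(node, dict) or part not in node:
                # Mirrors the shape of the error seen in this thread
                raise ValueError(
                    f"Failed to format pattern '${{{match.group(1)}}}': "
                    "no config value found, no default provided"
                )
            node = node[part]
        return str(node)
    return re.sub(r"\$\{([^}]+)\}", lookup, value)

print(resolve("${folders.ref}/sensors.csv", GLOBALS))  # -> data/reference/sensors.csv
```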
I defined the config files in settings.py as follows:
CONFIG_LOADER_CLASS = TemplatedConfigLoader(
    # conf_source=str(Path(__file__).parents[2].resolve() / settings.CONF_SOURCE),
    conf_source="conf",
    base_env="base",
    default_run_env="local",
)
CONFIG_LOADER_ARGS = {
    "globals_pattern": "*globals.yml",
    "config_pattern": {
        "catalog": ["catalog*", "catalog*/**", "**/catalog*"],
        "parameters": ["parameters*", "parameters*/**", "**/parameters*"],
        "credentials": ["credentials*", "credentials*/**", "**/credentials*"],
    },
}
CONF_CATALOG = CONFIG_LOADER_CLASS["catalog"]
CONF_CREDENTIALS = CONFIG_LOADER_CLASS["credentials"]
DATA_CATALOG_CLASS = DataCatalog.from_config(
    catalog=CONF_CATALOG, credentials=CONF_CREDENTIALS
)
Merel
01/08/2024, 10:51 AM
You should assign the TemplatedConfigLoader class itself (not an instance) in settings, so you can just do:
from kedro.config import TemplatedConfigLoader # new import
CONFIG_LOADER_CLASS = TemplatedConfigLoader
You also don't need to overwrite the patterns like you have here in CONFIG_LOADER_ARGS
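Putting this advice together, a minimal settings.py for this setup might look like the following (a sketch for Kedro 0.18.x as discussed in this thread, not a verified configuration):

```python
# settings.py -- sketch combining the advice above (Kedro 0.18.x)
from kedro.config import TemplatedConfigLoader

CONFIG_LOADER_CLASS = TemplatedConfigLoader  # the class itself, not an instance
CONFIG_LOADER_ARGS = {"globals_pattern": "*globals.yml"}
# No config_pattern override needed (defaults already match catalog*/parameters*/credentials*),
# and no DATA_CATALOG_CLASS / CONF_CATALOG overrides: the default DataCatalog is used.
```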
Merel
01/08/2024, 10:51 AM
You're on Kedro 0.18.1, right?
Merel
01/08/2024, 10:53 AM
CONF_CATALOG = CONFIG_LOADER_CLASS["catalog"]
CONF_CREDENTIALS = CONFIG_LOADER_CLASS["credentials"]
isn't possible until 0.18.4, but I also don't think this is necessary because you're using the default DataCatalog
Rassul Yermagambet
01/08/2024, 11:17 AM
CONFIG_LOADER_CLASS = TemplatedConfigLoader
DATA_CATALOG_CLASS = DataCatalog
But I am getting the following error:
ValueError: Failed to format pattern '${telegram_logger.token}': no config value found, no default provided
The logger is defined in logging.yml. I tried several modifications of the class parameters, but every attempt raised an error related to the ValueError above.
Merel
01/08/2024, 11:21 AM
Where do you use ${telegram_logger.token}, and is the value for telegram_logger.token within your globals?
Rassul Yermagambet
01/08/2024, 11:38 AM
It is used in logging.yml under conf/base. It is not within globals. For context: it is used for the Telegram bot API to send log messages. I removed it for a while to advance the work, since it is not so important, but I would appreciate help resolving the error. After removing the Telegram handler from logging.yml, I got the following error:
ModularPipelineError: Failed to map datasets and/or parameters: params:train_model, params:train_model.report
prediction_pipeline = Pipeline(
    [
    ......
    parameters={
        'params:train_model': 'params:train_power_model',
        'params:train_model.report': 'params:train_power_model.report',
    ......
])
train_model is the custom modular pipeline. It is also used as follows in `parameter.yml`:
train_model_power:
  type: pickle.PickleDataSet
  filepath: ${folders.models}/model_power_21_02_23.pickle
  layer: train_model
But I do not understand why and how it is passed as parameters.
Merel
01/08/2024, 1:58 PM
ModularPipelineError: Failed to map datasets and/or parameters: params:train_model, params:train_model.report
is basically telling you that the parameters train_model and train_model.report can't be found. Do you have parameters with those names?
Rassul Yermagambet
01/09/2024, 3:52 AM
There is no train_model in the parameters. The project only has a modular pipeline named train_model. I did not understand how it was implemented in Kedro 0.17.2, but it worked that way. That's why I thought the issue might be related to the correct way of registering catalog files.
Merel
01/09/2024, 1:43 PM
In this snippet:
prediction_pipeline = Pipeline(
    [
    ......
    parameters={
        'params:train_model': 'params:train_power_model',
        'params:train_model.report': 'params:train_power_model.report',
    ......
])
you reference params:train_model. So what are you trying to do there?
Rassul Yermagambet
01/11/2024, 4:27 AM
pipeline(
    pipe=train_model.create_pipeline().only_nodes(
        'train_model.load_regressor',
        'train_model.add_transformers',
        'train_model.train_model',
        'train_model.create_train_predictions',
        'train_model.create_test_predictions',
        'train_model.generate_performance_report',
    ),
    inputs={
        'train_model.train_set': 'data_master_train',
        'train_model.test_set': 'data_master_test',
        'train_model.input': 'data_master',
        'train_model.td': 'tag_dictionary',
    },
    parameters={
        'params:train_model': 'params:train_power_model',
        'params:train_model.report': 'params:train_power_model.report',
    },
    outputs={
        'train_model.train_set_model': 'train_model_power',
        'train_model.train_set_feature_importance': 'power_model_feature_importance',
        'train_model.train_set_predictions': 'train_set_power_model_predictions',
        'train_model.train_set_metrics': 'train_set_power_model_metrics',
        'train_model.test_set_predictions': 'test_set_power_model_predictions',
        'train_model.test_set_metrics': 'test_set_power_model_metrics',
    },
    namespace='train_power_model',
),
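For context on the mapping above: the `parameters=` argument renames the parameter datasets that the wrapped nodes consume, so it can only resolve if entries with the target names exist in parameters.yml. A hypothetical sketch of what that file would need (all keys and values here are illustrative; the real parameter names depend on what the train_model nodes actually take):

```yaml
# parameters.yml -- hypothetical sketch; real keys depend on the train_model nodes
train_power_model:
  report:
    output_dir: data/base/08_reporting  # illustrative value
```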
Merel
01/12/2024, 4:36 PM
Datasets need to be defined in catalog.yml and parameters in parameters.yml, and Kedro then loads and saves them for you when running the pipeline.
Rassul Yermagambet
01/15/2024, 5:20 AM
Merel
01/15/2024, 9:23 AM