Clement
03/07/2024, 2:27 PM
Nok Lam Chan
03/07/2024, 2:37 PM
${runtime_params: xxx, ${globals: xxx}}
Nok Lam Chan
03/07/2024, 2:38 PM
globals.yml? Not so sure about this, Cc @Ankita Katiyar
Clement
03/07/2024, 2:50 PM
get_kedro_boot_session("project").run(
    parameters={...},
)
I have a global variable that I use in my catalog.
Edit: I understood your message, I will try it and get back to you, thanks
Nok Lam Chan
03/07/2024, 2:52 PM
Takieddine Kadiri
03/07/2024, 3:35 PM
globals are resolved and an iteration time (Kedro Boot session run) when itertime_params are resolved.
You can try using itertime_params instead of parameters if you want to inject a parameter at each run iteration. Here is how to declare it in your catalog:
your_dataset:
  type: …
  filepath: …${itertime_params:<param_name>}
Then you can run the Kedro Boot session with:
session.run(itertime_params={"param_name": param_value})
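The two resolution times mentioned above can be pictured with a toy model in plain Python. This is only a sketch of the idea, not Kedro Boot or OmegaConf code: the `resolve` helper, the regex, and the example path are all hypothetical.

```python
import re

# Matches placeholders such as ${globals:folder} or ${itertime_params:airline}.
PLACEHOLDER = re.compile(r"\$\{(\w+):(\w+)\}")


def resolve(template: str, resolver: str, values: dict) -> str:
    """Substitute only the placeholders that belong to `resolver`."""

    def replace(match: re.Match) -> str:
        name, key = match.group(1), match.group(2)
        if name != resolver:
            return match.group(0)  # leave other resolvers untouched
        return str(values[key])

    return PLACEHOLDER.sub(replace, template)


# Hypothetical catalog path mixing both kinds of placeholder.
path = "data/${globals:folder}/${itertime_params:airline}.csv"

# First phase: done once, before any run.
compiled = resolve(path, "globals", {"folder": "raw"})

# Second phase: repeated for every run with fresh values.
per_run = [
    resolve(compiled, "itertime_params", {"airline": airline})
    for airline in ("iad", "jfk")
]

print(compiled)
print(per_run)
```

In this model the itertime_params placeholder survives the first phase untouched, which is why it can take a different value on each run.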
Otherwise, can you elaborate more on your use case to see if Kedro Boot is a good fit?
Clement
03/07/2024, 4:32 PM
if submitted:
    outputs = get_kedro_boot_session("project").run(
        itertime_params={
            "airline": airline,
            ...
        }
    )
kedro.io.core.DatasetError:
Port could not be cast to integer value as 'airline,None}'.
Failed to instantiate dataset 'fdr' of type 'kedro_datasets.partitions.partitioned_dataset.PartitionedDataset'.
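One plausible reading of the error above: the path still contained an unresolved placeholder when something parsed it as a URL, so the text after the placeholder's colon was taken for a port. The sketch below reproduces that message with the standard library only. The exact placeholder string is an assumption reconstructed from the error text, and which layer does the parsing is not confirmed here.

```python
from urllib.parse import urlsplit

# Assumed shape of the path at dataset instantiation time: the placeholder
# has not been substituted yet (reconstructed from the error message).
unresolved_path = "gs://oa-sbfe-generated-fdr-${itertime_params:airline,None}"

parts = urlsplit(unresolved_path)
print(parts.netloc)  # the whole bucket string, placeholder included

error_message = None
try:
    # Everything after the ':' in the netloc is treated as the port.
    parts.port
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```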
Thanks for your help
Clement
03/07/2024, 4:39 PM
fdr:
  type: partitions.PartitionedDataset
  path: gs://oa-sbfe-generated-fdr-${runtime_params:airline}
  dataset:
    type: fdr_explorer_streamlit.datasets.fdr_dataset.FdrDataset
    load_args:
      low_memory: False
And here is my simple streamlit app:
from datetime import datetime, timedelta
from pathlib import Path

import streamlit as st
from kedro_boot.app.booter import boot_package, boot_project
from ydata_profiling import ProfileReport

st.set_page_config(layout="wide")


def get_kedro_boot_session(boot_type):
    if boot_type == "project":
        return boot_project(project_path=Path.cwd(), kedro_args={"pipeline": "__default__"})
    else:
        return boot_package(
            package_name="fdr_explorer_streamlit",
            kedro_args={"pipeline": "__default__", "conf_source": "conf"},
        )


# get_kedro_boot_session("project")
with st.container():
    with st.form("Parameters"):
        airline = st.text_input("Choose an airline", placeholder="iad")
        recorded_flight_id = st.text_input("Chose a recorded flight id")
        date_min, date_max = datetime.now().date() - timedelta(days=365), datetime.now().date()
        date_range = st.slider("Choose a date range", date_min, date_max, (date_min, date_max))
        submitted = st.form_submit_button("Explore FDRs")
    if submitted:
        # session.run(namespace="download")
        outputs = get_kedro_boot_session("project").run(
            parameters={
                "airline": airline,
                "metadatas_parameters": {
                    "recorded_flight_id": recorded_flight_id,
                    "date_range": date_range,
                },
            }
        )
        report = ProfileReport(
            outputs.get("processed_fdr").reset_index(),
            title="FDR profiling",
            minimal=True,
        ).to_html()
        st.components.v1.html(report, height=1020, scrolling=True)
In the case of runtime_params, I get this error:
omegaconf.errors.InterpolationResolutionError: Runtime parameter 'airline' not found and no default value provided.
    full_key: fdr.path
    object_type=dict
Takieddine Kadiri
03/07/2024, 6:34 PM
itertime_params is related to the underlying DataSet. But anyway, neither itertime_params nor parameters would help with the need to pass the airline param to the credentials.
The only current way of providing credentials to your Kedro pipeline is to create a session for each iteration. You'll end up with:
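import os  # needed in addition to the app's existing imports

if submitted:
    # Export the credentials first so the new session picks them up.
    os.environ["your_creds_entry1"] = creds_entry1
    os.environ["your_creds_entry2"] = creds_entry2
    ...
    # Boot a fresh session per submission, passing airline as a runtime param.
    session = boot_project(
        project_path=Path.cwd(),
        kedro_args={"pipeline": "__default__", "params": {"airline": airline, ...}},
    )
    session.run()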
In your catalog.yml and parameters.yml, use ${runtime_params:airline}.
In your credentials.yml, use the oc.env resolver.
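Since this pattern writes credentials into os.environ on every submission, it may help to scope them to one run so values from a previous airline do not linger. Below is a minimal sketch using only the standard library. The `temporary_env` helper and the `KEDRO_BOOT_DEMO_TOKEN` variable are hypothetical names, and the Kedro Boot calls are left as comments because they need a real project.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_env(**entries):
    """Set environment variables for the duration of the block, then restore them."""
    previous = {key: os.environ.get(key) for key in entries}
    os.environ.update(entries)
    try:
        yield
    finally:
        for key, old_value in previous.items():
            if old_value is None:
                os.environ.pop(key, None)  # variable did not exist before
            else:
                os.environ[key] = old_value


# Hypothetical credential entry; credentials.yml would read it with
# ${oc.env:KEDRO_BOOT_DEMO_TOKEN}.
with temporary_env(KEDRO_BOOT_DEMO_TOKEN="token-for-iad"):
    inside = os.environ["KEDRO_BOOT_DEMO_TOKEN"]
    # session = boot_project(project_path=Path.cwd(), kedro_args={...})
    # session.run()

outside = os.environ.get("KEDRO_BOOT_DEMO_TOKEN")
print(inside, outside)
```

The session has to be booted inside the `with` block, since the credentials are only visible in the environment while the block is active.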