Afiq Johari
10/23/2023, 3:44 PMparameters.yml
. It runs a SQL stored procedure to generate the latest data in the SQL database. Current node has None as output but I prefer to return the EIU_sql
which should become the input for my next node.
# Generates EIU data using a SQL stored procedure and stores it in the database.
node(
func=generate_EIU,
inputs=["params:EIU"],
# preferred outputs
# outputs=["EIU_sql"],
# current output
outputs=None,
name="generate_EIU",
namespace="EIU"
),
1. Excel Data Export Node: The second node takes as input the SQL table that were updated or generated by the first node. It is responsible for writing the data from these SQL tables into an Excel file.
# Transforms EIU data to Excel format and saves it to a file.
node(
func=EIU_sql_to_excel,
inputs=["EIU_sql"],
# preferred outputs
# outputs=["EIU_excel"],
# current output
outputs=None,
name="EIU_sql_to_excel",
namespace="EIU"
),
Additionally, I've defined the SQL table and the Excel output location in the catalog.yml
file.
EIU_sql:
type: pandas.SQLQueryDataset
sql: "SELECT ..."
credentials: cred
EIU_excel:
type: pandas.ExcelDataSet
filepath: data/03_primary/EIUdata.xlsx
load_args:
sheet_name: data
save_args:
sheet_name: data
functions
def generate_EIU(parameters: Dict[str, Any]) -> None:
"""
Generates EIU data using a SQL stored procedure and stores it in the database.
Input:
- parameters: A dictionary containing configuration parameters
Output:
- None
"""
# ... (code for generating and storing EIU data)
def EIU_sql_to_excel(
EIU_sql: Optional[pd.DataFrame]) -> None:
"""
Transforms EIU data to Excel format and saves it to a file.
Input:
- EIU_sql: Yearly EIU data in a DataFrame.
Output:
- None
"""
# ... (code for saving EIU data to Excel)
catalog.load("EIU_sql")
within the generate_EIU
function and return `EIU_sql`table after the stored procedure is completed.