WEN XIN (Jessie 文馨)
02/03/2023, 4:47 AM
spark job to EMR through livy for a kedro project?
datajoely
02/03/2023, 9:54 AM
kedro package and deploy it that way
WEN XIN (Jessie 文馨)
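02/07/2023, 11:26 AM
# entry-point script submitted through livy (test_kedro_livy.py)
import sys
from test_livy.__main__ import main  # main() of the packaged kedro project
if __name__ == "__main__":
    # forward the arguments received from livy to the kedro entry point
    main(sys.argv)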
• in the airflow livy operator, pass the location of this file to the "file" parameter
• the kedro command options are passed through the "args" parameter of the airflow livy operator
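# airflow task
from airflow.providers.apache.livy.operators.livy import LivyOperator

# `dag` is assumed to be a DAG object defined elsewhere in this file
t1 = LivyOperator(
    task_id="run_kedro_pipeline",
    driver_memory="1g",
    num_executors=1,
    executor_memory="1g",
    executor_cores=1,
    polling_interval=30,
    # entry-point script uploaded to S3
    file="s3://{{ var.json.AWS_BUCKETS.app.name }}/applications/spark/emr/test_kedro_livy.py",
    # kedro run options, forwarded to the script as command-line arguments
    args=["--pipeline", "test", "--params", "pipeline:test,app_name:test,ds:{{ ds }}"],
    dag=dag,
    livy_conn_id="livy_emr",
)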
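As a rough sketch of how the "args" list above could be assembled instead of written by hand: the helper name build_kedro_args is hypothetical (it is not part of kedro or airflow), and it only assumes the comma-separated key:value format for --params that the task above already uses.

```python
from typing import List, Mapping


def build_kedro_args(pipeline: str, params: Mapping[str, str]) -> List[str]:
    """Hypothetical helper: build the list passed as "args" to the livy operator."""
    # Join the parameters as comma-separated key:value pairs,
    # mirroring the --params string used in the task above.
    params_str = ",".join(f"{key}:{value}" for key, value in params.items())
    args = ["--pipeline", pipeline]
    if params_str:
        args += ["--params", params_str]
    return args


# "{{ ds }}" is left as a template string so that airflow can render it at run time.
kedro_args = build_kedro_args(
    "test",
    {"pipeline": "test", "app_name": "test", "ds": "{{ ds }}"},
)
print(kedro_args)
```

The resulting list could then be passed as args=kedro_args in the operator call, which keeps the pipeline name and parameters in one place if several tasks share them.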
datajoely
02/07/2023, 11:27 AM
WEN XIN (Jessie 文馨)
02/07/2023, 11:28 AM
datajoely
02/07/2023, 11:28 AM
WEN XIN (Jessie 文馨)
02/07/2023, 12:34 PM
datajoely
02/07/2023, 12:57 PM
WEN XIN (Jessie 文馨)
02/08/2023, 3:59 AM
datajoely
02/08/2023, 10:13 AM
WEN XIN (Jessie 文馨)
02/08/2023, 10:13 AM