Afaque Ahmad
01/19/2023, 9:10 AM
LivyRunner to be able to submit jobs to an EMR cluster using Livy. I'm using Kedro 0.18.4. I need to pass the code as a string to Livy. Has anyone created something similar? Any help is really appreciated.
I'm trying to pass the code in _run to Livy. How do I figure out which pipeline and node to run? We do have the following parameters in the _run function, but they cannot be passed into the string.
def _run(
    self,
    pipeline: Pipeline,
    catalog: DataCatalog,
    hook_manager: PluginManager,
    session_id: str = None,
) -> None:
datajoely
01/19/2023, 9:14 AM
after_context_created or before_pipeline_run hook here?
Afaque Ahmad
01/19/2023, 9:20 AM
_run as a string to Livy.
Something like the below (not complete):
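import json
import textwrap

import requests

# Code that Livy should execute remotely, sent as a plain string.
# Still incomplete: node, catalog, hook_manager, self and session_id only
# exist in the local _run scope, so they are undefined inside this string.
cmd = textwrap.dedent("""
    import json
    import sys
    import time
    from collections import Counter
    from itertools import chain
    import requests
    from pluggy import PluginManager
    from kedro.io import AbstractDataSet, DataCatalog, MemoryDataSet
    from kedro.pipeline import Pipeline
    from kedro.runner.runner import AbstractRunner, run_node
    run_node(node, catalog, hook_manager, self._is_async, session_id)
""")

# Body of the Livy statement request; it is built locally, outside the string.
data = {
    "code": cmd,
    "kind": "pyspark",
}
# session_url and headers are assumed to be defined elsewhere.
statements_url = session_url + '/statements'
r = requests.post(statements_url, data=json.dumps(data), headers=headers)
Afaque Ahmad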
01/19/2023, 9:28 AM
load_context and the set of pipelines and nodes running (the ones passed to the _run function)?
Afaque Ahmad
01/19/2023, 9:37 AM
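A minimal sketch of one way to approach the question above, not a tested LivyRunner. Instead of trying to serialise the pipeline, catalog and hook_manager objects, it sends only the node names and lets the remote session rebuild everything from the project. It assumes the Kedro project is already available on the cluster at project_path (a hypothetical path) and that running the selected nodes remotely through KedroSession.run is acceptable. build_livy_statement is a made-up helper name, not part of Kedro or Livy.

```python
import json
import textwrap


def build_livy_statement(node_names, project_path, pipeline_name="__default__"):
    """Build the JSON body for a Livy POST <session_url>/statements request.

    node_names would typically come from the runner's pipeline argument,
    e.g. [n.name for n in pipeline.nodes].
    """
    # Remote-side code. repr() (via !r) turns the Python values into literals,
    # so nothing from the local _run scope is referenced inside the string.
    template = textwrap.dedent(
        """
        from pathlib import Path

        from kedro.framework.session import KedroSession
        from kedro.framework.startup import bootstrap_project

        project_path = Path({project_path!r})
        bootstrap_project(project_path)
        with KedroSession.create(project_path=project_path) as session:
            session.run(pipeline_name={pipeline_name!r}, node_names={node_names!r})
        """
    )
    code = template.format(
        project_path=str(project_path),
        pipeline_name=pipeline_name,
        node_names=list(node_names),
    )
    return {"code": code, "kind": "pyspark"}


if __name__ == "__main__":
    # Hypothetical node names and cluster path, for illustration only.
    payload = build_livy_statement(["preprocess", "train"], "/home/hadoop/my-project")
    print(json.dumps(payload, indent=2))
```

Inside _run, the returned dict could be posted the same way as in the snippet earlier in the thread: requests.post(statements_url, data=json.dumps(payload), headers=headers). Hooks and the catalog are then created on the cluster by the remote KedroSession rather than passed across.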