Nelson Zambrano
04/11/2024, 5:29 AM
ThreadRunner in an AzureML cluster (no scaling, 1 node compute).
Kedro==0.19.3
sqlalchemy==2.0.29
oracledb==2.1.1 / cx-Oracle==8.3.0 (tried both, same results).
The Kedro Pipeline:
Executed through a kedro_script.py (which essentially is a KedroSession.create + session.run)
1. It takes in 21 SQLQueryDataset inputs.
2. Performs transformations on each in different nodes.
3. Writes to Azure blob storage using a ParquetDataset
4. Uses all outputs and combines them.
(I attached a viz of the pipeline.)
The problem:
Using ThreadRunner in a cluster -- 20 of the transformation nodes (2) run and write their output to storage (3), all except the last (random) one. Then it fails with a DB error and the stdout attached.
Using ThreadRunner in a compute instance with the same environment (docker image, compute type, etc.) works just fine.
Using SequentialRunner in the cluster does not reproduce the error; it runs just fine (is_async=True/False).
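The observations above contrast a sequential run with a threaded one. As a rough, standard-library-only illustration of how the two differ in the number of loads that overlap in time, here is a minimal sketch. It does not use Kedro or Oracle: `fake_load` and `ConnectionCounter` are hypothetical stand-ins for a dataset load that holds a database connection briefly, and the thread pool only approximates how a threaded runner schedules work. It is not ThreadRunner's actual implementation.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ConnectionCounter:
    """Tracks how many simulated connections are open at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.open_now = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.open_now += 1
            self.peak = max(self.peak, self.open_now)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self.open_now -= 1
        return False


def fake_load(counter, hold_seconds=0.02):
    """Hypothetical stand-in for one dataset load holding a connection."""
    with counter:
        time.sleep(hold_seconds)


def peak_sequential(n_datasets):
    """Load datasets one after another and report the peak overlap."""
    counter = ConnectionCounter()
    for _ in range(n_datasets):
        fake_load(counter)
    return counter.peak


def peak_threaded(n_datasets, max_workers):
    """Load datasets from a thread pool and report the peak overlap."""
    counter = ConnectionCounter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fake_load, counter) for _ in range(n_datasets)]
        for future in futures:
            future.result()  # re-raise any exception from the worker
    return counter.peak
```

In this toy model the sequential path never has more than one simulated connection open, while the threaded path can have up to `max_workers` open at once, which is the kind of difference that would only show up against a database or network that limits concurrent sessions.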
Tried:
• Different Oracle (ugh, I know) drivers
• Different versions of oracledb and cx-Oracle, no luck.
• Different number of workers
• Different engine parameters: pool_size, max_overflow, thick_mode (yay to the support of sqlalchemy engine params)
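For context on the pool_size and max_overflow parameters mentioned above: in SQLAlchemy's default queue-based pool, at most pool_size + max_overflow connections can be checked out at once, and a checkout that waits longer than the pool timeout raises an error. The sketch below imitates only that cap with a semaphore from the standard library. `BoundedPool` and `run_queries` are hypothetical names, no real database or SQLAlchemy call is involved, and nothing here establishes that the pool cap is the cause of the failure described in this thread.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class BoundedPool:
    """Toy stand-in for a pool capped at pool_size + max_overflow connections."""

    def __init__(self, pool_size, max_overflow, timeout):
        self._slots = threading.BoundedSemaphore(pool_size + max_overflow)
        self._timeout = timeout

    def checkout(self):
        # Wait up to `timeout` seconds for a free slot, then give up.
        if not self._slots.acquire(timeout=self._timeout):
            raise TimeoutError("no connection available within timeout")

    def checkin(self):
        self._slots.release()


def run_queries(n_queries, pool, hold_seconds):
    """Run n_queries concurrently; return (succeeded, timed_out) counts."""

    def one_query(_):
        try:
            pool.checkout()
        except TimeoutError:
            return False
        try:
            time.sleep(hold_seconds)  # simulated query duration
            return True
        finally:
            pool.checkin()

    with ThreadPoolExecutor(max_workers=n_queries) as executor:
        results = list(executor.map(one_query, range(n_queries)))
    return results.count(True), results.count(False)
```

With 21 concurrent queries, a cap of 5 + 10 = 15, and queries that outlast the timeout, 15 succeed and 6 time out in this model; with a generous timeout all 21 eventually succeed.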
Any idea what might be happening here?
Nelson Zambrano
04/11/2024, 5:31 AM
datajoely
04/11/2024, 6:09 AM
datajoely
04/11/2024, 6:10 AM
Nelson Zambrano
04/11/2024, 6:15 AM
Nok Lam Chan
04/11/2024, 8:25 AM
Nelson Zambrano
04/11/2024, 3:52 PM
SQLQueryDataset
It formats the query in a special way using parameters in the catalog and then super()
Deepyaman Datta
04/11/2024, 3:57 PM
Deepyaman Datta
04/11/2024, 3:58 PM
ThreadRunner, you're trying to create 21 sessions concurrently. And maybe that's problematic.
Deepyaman Datta
04/11/2024, 3:59 PM
Nelson Zambrano
04/14/2024, 2:33 AM
N connections; this is from previous tests in a compute instance in Azure with ThreadRunner.
The weird part is that it fails when running in a cluster using a CommandComponent as part of a Pipeline Job in AzureML under the same conditions (the pipeline architecture and environment).
Nelson Zambrano
04/14/2024, 5:36 AM