# questions
j
Hi everyone, I have a question. I was trying to convert a Kedro pipeline to Airflow DAGs following the steps described [here]. Here are my steps:
1. Ran `kedro airflow create`; the DAG file is then generated inside the `airflow_dags/` directory.
2. I have a separate folder for Airflow, so I copied the generated DAG to the corresponding `dags/` folder in the Airflow directory.
3. Packaged the Kedro pipeline using `kedro package` and installed it in the corresponding Airflow folder.
4. Copied the `conf/` and `data/` folders to the corresponding Airflow folder.

The problem is, when I run the Airflow task, it outputs the following error.
ValueError: Given configuration path either does not exist or is not a valid directory: /home/jackson/MASS_AIR_PIPELINE/conf/base
May I know if I missed something in my steps above? Shouldn't it locate the `conf/base` folder inside my Airflow directory? FYI, `/home/jackson/MASS_AIR_PIPELINE/` is the folder where I wrote all my Kedro pipeline code, and `/home/jackson/airflow` is where I store my Airflow DAGs.
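The error suggests the configuration path is being resolved against the original Kedro project directory rather than the Airflow directory the task runs from. A minimal sketch of that kind of resolution logic (hypothetical illustration, not Kedro's actual loader code; the function name and signature are made up):

```python
from pathlib import Path

def resolve_conf(project_path: str, conf_source: str = "conf") -> Path:
    # The loader joins the *project* path with conf/base, so if the DAG was
    # generated with project_path=/home/jackson/MASS_AIR_PIPELINE, it looks
    # there even when Airflow itself runs from /home/jackson/airflow.
    conf_path = Path(project_path) / conf_source / "base"
    if not conf_path.is_dir():
        raise ValueError(
            "Given configuration path either does not exist or is not "
            f"a valid directory: {conf_path}"
        )
    return conf_path
```

So copying `conf/` into the Airflow folder does not help by itself if the DAG still points at the old project path.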
n
Hey! Sorry you're having a bumpy experience. Could you run the `tree` command and print out the directory structures?
j
Sure, this is the directory structure of my Kedro pipeline:
.
├── airflow_dags
│   └── mass_air_pipeline_dag.py
├── conf
│   ├── base
│   ├── local
│   └── README.md
├── data
│   ├── 01_raw
│   ├── 02_intermediate
│   ├── 03_primary
│   ├── 04_feature
│   ├── 05_model_input
│   ├── 06_models
│   ├── 07_model_output
│   └── 08_reporting
├── dist
│   ├── conf-mass_air_pipeline.tar.gz
│   └── mass_air_pipeline-0.1-py3-none-any.whl
├── docs
│   └── source
├── info.log
├── logs
├── notebooks
│   └── make_request.ipynb
├── pyproject.toml
├── README.md
├── setup.cfg
└── src
    ├── build
    ├── mass_air_pipeline
    ├── mass_air_pipeline.egg-info
    ├── requirements.txt
    ├── setup copy.py
    ├── setup.py
    └── tests
Here is my Airflow directory structure:
.
├── airflow.cfg
├── airflow.db
├── airflow_lib
│   ├── bin
│   ├── etc
│   ├── generated
│   ├── include
│   ├── lib
│   ├── lib64 -> lib
│   ├── pyvenv.cfg
│   └── share
├── airflow-webserver.pid
├── conf
│   ├── base
│   ├── local
│   └── README.md
├── dags
│   ├── mass_air_pipeline_dag.py
│   └── __pycache__
├── data
│   ├── 01_raw
│   ├── 02_intermediate
│   ├── 03_primary
│   ├── 04_feature
│   ├── 05_model_input
│   ├── 06_models
│   ├── 07_model_output
│   └── 08_reporting
├── dist
│   ├── conf-mass_air_pipeline.tar.gz
│   └── mass_air_pipeline-0.1-py3-none-any.whl
├── logs
│   ├── dag_id=example_python_operator
│   ├── dag_id=mass-air-pipeline
│   ├── dag_processor_manager
│   └── scheduler
└── webserver_config.py
n
I see. Copying the `conf` folder is not working here because when you package a project, `conf` is not saved inside the package.
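Note that your `dist/` folder already contains a separate configuration archive, `conf-mass_air_pipeline.tar.gz`, alongside the wheel. A self-contained sketch (assumed demo paths and file contents, standard tools only) of how such an archive can be unpacked into the Airflow project so that `conf/base` exists where the task looks for it:

```shell
# Demo with made-up paths: build a conf archive like the one in dist/,
# then extract it on the "Airflow side".
set -e
mkdir -p /tmp/demo/kedro/conf/base /tmp/demo/kedro/conf/local /tmp/demo/kedro/dist /tmp/demo/airflow
echo "example_dataset: {type: pandas.CSVDataSet}" > /tmp/demo/kedro/conf/base/catalog.yml

# Roughly what the packaged conf archive contains: the conf/ tree.
tar -czf /tmp/demo/kedro/dist/conf-mass_air_pipeline.tar.gz -C /tmp/demo/kedro conf

# Unpack it where the DAG expects to find conf/base.
tar -xzf /tmp/demo/kedro/dist/conf-mass_air_pipeline.tar.gz -C /tmp/demo/airflow
ls /tmp/demo/airflow/conf/base
```

The key point is that the configuration has to be made available at the path the DAG resolves at runtime, not just copied somewhere nearby.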