# questions
c
Hey team, I'm researching how to deploy Kedro in Dagster. So far I've found two ways that work properly:
1. Running as a subprocess
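from subprocess import PIPE, STDOUT, Popen
from dagster import OpExecutionContext, op

@op
def run_kedro_terminal(context: OpExecutionContext):
    # Run the Kedro CLI; stderr is merged into stdout so every line is captured
    process = Popen("kedro run", stdout=PIPE, stderr=STDOUT, shell=True)
    with process.stdout:
        # log_subprocess_output is the helper shown further down in this message
        log_subprocess_output(context, process.stdout)
    process.wait()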
2. Running in code
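from pathlib import Path
from dagster import OpExecutionContext, op
from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project

@op
def kedro_run(context: OpExecutionContext):
    # Assumes the op is executed from the Kedro project root
    project_entry = Path.cwd()
    bootstrap_project(project_entry)
    with KedroSession.create(project_path=project_entry) as session:
        session.run()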
So far I can't get the Kedro logs to show with the 2nd option. I found that `logger = get_dagster_logger()` lets me log properly from inside the Kedro node functions, but I haven't found a way to show Kedro's own logs for node and pipeline execution.
In the Dagster UI:
• With option 1, I can show the Kedro logs because I'm using a function that forwards each line to the Dagster context logger:
def log_subprocess_output(context, pipe):
    for line in iter(pipe.readline, b''):
        context.log.info(line.decode('utf-8'))
• With option 2, Kedro logs (like `INFO     Running node`) are not shown.
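For reference, option 1 can be reduced to a sketch that doesn't need Dagster at all, which makes it easy to try locally. It assumes only that the target has an `info` method, as `context.log` does. `run_and_forward` is a made-up helper name, and in the real op the command would be `["kedro", "run"]`.

```python
import subprocess


def run_and_forward(cmd, log):
    """Run cmd and forward each output line to log.info (hypothetical helper)."""
    # stderr is merged into stdout so that nothing is lost; text=True yields str lines
    with subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    ) as proc:
        for line in proc.stdout:
            log.info(line.rstrip("\n"))
    # Leaving the with-block waits for the process, so returncode is set here
    return proc.returncode
```

Returning the exit code lets the caller fail the op when `kedro run` fails, which the original snippet doesn't check.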
d
Hey @Camilo López, getting a Dagster tutorial out has been on our radar for a while, so working out what works would be very helpful.
If you're on 0.19.x, we also have a new logging env var that may be helpful here: https://docs.kedro.org/en/stable/logging/index.html#how-to-customise-kedro-logging
I have a feeling the solution for Prefect would make sense here too: https://discourse.prefect.io/t/prefect-and-kedro-integration-logging/3123
c
Hey @datajoely, cool, happy to collaborate on this. I'm just starting this new project with Kedro, MinIO and Dagster.
n
Both Kedro and Dagster use Python's standard `logging` module.
Any chance there are some verbosity settings you can switch on first?
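Since both sides go through the standard `logging` module, one thing to try for the in-code option is a plain handler on the `kedro` logger that re-emits each record on Dagster's logger. This is only a sketch: `ForwardToLogger` and `forward_logs` are made-up names, and it assumes the target (for example `context.log`) exposes the usual `log(level, message)` method.

```python
import logging


class ForwardToLogger(logging.Handler):
    """Re-emit every record on another logger-like object (hypothetical helper)."""

    def __init__(self, target, level=logging.INFO):
        super().__init__(level)
        self.target = target  # anything with a .log(level, message) method

    def emit(self, record):
        try:
            message = f"{record.name} - {record.getMessage()}"
            self.target.log(record.levelno, message)
        except Exception:
            self.handleError(record)


def forward_logs(source_name, target, level=logging.INFO):
    """Attach a forwarding handler to the named logger and return the handler."""
    source = logging.getLogger(source_name)
    source.setLevel(level)
    handler = ForwardToLogger(target, level)
    source.addHandler(handler)
    return handler
```

Inside the op this would look something like `handler = forward_logs("kedro", context.log)` before `session.run()`, and `logging.getLogger("kedro").removeHandler(handler)` afterwards. Depending on the Kedro version, its logging config may be applied when the project is bootstrapped, so attaching the handler after `bootstrap_project` is probably the safer order.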
c
Looks like you can put the names of the loggers you need to capture in dagster.yaml:
python_logs:
  managed_python_loggers:
    - my_logger
    - my_other_logger
I tried with no results:
python_logs:
  managed_python_loggers:
    - kedro
    - kedro_handler
d
can you post your existing logging.yml?
c
It's just the spaceflights logging.yml:
version: 1

disable_existing_loggers: False

formatters:
  simple:
    format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"

handlers:
  console:
    class: logging.StreamHandler
    level: INFO
    formatter: simple
    stream: ext://sys.stdout

  info_file_handler:
    class: logging.handlers.RotatingFileHandler
    level: INFO
    formatter: simple
    filename: logs/info.log
    maxBytes: 10485760 # 10MB
    backupCount: 20
    encoding: utf8
    delay: True

  error_file_handler:
    class: logging.handlers.RotatingFileHandler
    level: ERROR
    formatter: simple
    filename: logs/errors.log
    maxBytes: 10485760 # 10MB
    backupCount: 20
    encoding: utf8
    delay: True

  rich:
    class: rich.logging.RichHandler

loggers:
  kedro:
    level: INFO

  minio_spaceflights:
    level: INFO

root:
  handlers: [rich, info_file_handler, error_file_handler]
d
Can you change your handlers at the bottom and swap `rich` for `console`?
n
Looks like you can put the names of the loggers you need to capture in dagster.yaml
What's the name of your logger? You mentioned that logging from inside the node works correctly.
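If it's unclear which logger names exist at runtime, the standard library can list them. A small sketch (the function name `describe_loggers` is made up) that could be called from inside the op once Kedro has been imported:

```python
import logging


def describe_loggers(prefix=""):
    """List (name, effective level, handler class names, propagate) per logger."""
    rows = []
    for name, obj in sorted(logging.root.manager.loggerDict.items()):
        # loggerDict also holds PlaceHolder entries for intermediate names
        if not isinstance(obj, logging.Logger):
            continue
        if not name.startswith(prefix):
            continue
        rows.append(
            (
                name,
                logging.getLevelName(obj.getEffectiveLevel()),
                [type(handler).__name__ for handler in obj.handlers],
                obj.propagate,
            )
        )
    return rows
```

Calling `describe_loggers("kedro")` shows whether the Kedro loggers exist, what level they end up with, and whether any handler is attached directly to them rather than to the root logger.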
c
I changed from rich to console and it didn't work 😞. I don't know what the logger name is; inside the node it works if I do the following:
logger = get_dagster_logger()
Solved 😄! I forgot to mention that Dagster was running Kedro in a Docker container, so what was missing was exporting `KEDRO_LOGGING_CONFIG=<project_root>/conf/logging.yml` in the Dockerfile. After that, the changes in logging.yml worked properly.
In my case, that just meant adding this to the Dockerfile:
ENV KEDRO_LOGGING_CONFIG=/opt/dagster/app/conf/base/logging.yml
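A quick way to catch this earlier is to check the variable at the start of the op. This is a sketch, not part of Kedro: `check_logging_config` is a made-up helper that only verifies the variable is set and points at an existing file.

```python
import os
from pathlib import Path


def check_logging_config(env=None):
    """Return a problem description, or None if the setting looks usable."""
    env = os.environ if env is None else env
    value = env.get("KEDRO_LOGGING_CONFIG")
    if not value:
        return "KEDRO_LOGGING_CONFIG is not set"
    if not Path(value).is_file():
        return f"KEDRO_LOGGING_CONFIG points to a missing file: {value}"
    return None
```

Logging the returned message with `context.log.warning(...)` when it is not `None` would have pointed at the missing `ENV` line straight away.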
thanks @datajoely & @Nok Lam Chan
n
Are you using Kedro-Docker, or did you write your own Dockerfile?
Just curious why this fails only at this stage, since you still need this config to run locally. Maybe you upgraded to 0.19 recently?
c
I wrote my own, and yes, I'm running 0.19.
qq: is there a way to get rid of `[dark_orange]` or the colors in the Kedro default logging?
n
I don't think it's possible. Cc @Ahdra Merali, is the markup syntax `rich`-specific?
d
You can do so directly with rich's environment variables: https://rich.readthedocs.io/en/stable/console.html#environment-variables
NO_COLOR=true
It's apparently a wider convention: https://no-color.org/
g
@datajoely have you perhaps made any progress on the Dagster + Kedro tutorial idea mentioned above? Also, @Camilo López / @Nok Lam Chan, do you know of any GitHub repos (or other material) one can consult?
n
@George p unfortunately I don't, I know there are teams using it internally but I don't recall we have any blog post to showcase. I will ask around
c
Hey! I haven't, but we can do something, I'm open to it! The main struggle I found was just the logger connection mentioned above. I'll ping you this week @Nok Lam Chan, wdyt?
The project I was working on was Dagster deployed on k8s, scheduling Kedro with MinIO as storage. It's a great open source example btw.
d
If it's open source, can we showcase the repo?
n
It would be fantastic if it can be open sourced. At a bare minimum, a blog post summarising what it takes, the general approach and some code snippets would be a great start.
Feel free to ping me. @Juan Luis may be interested too?
j
would love to have a look at the repo yes!
g
Consider me an amateur, but i am happy to help out as well 🙌🏻
@Florian d ^
f
Thanks @George p Curious to see examples here too :) I’ll play around myself