Ralf Kowatsch
09/18/2025, 9:53 AM
datajoely
09/18/2025, 11:01 AM
kedro catalog resolve to review what Kedro compiles at runtime.
• Can you explain this a bit more? My view with Kedro (and any other framework) is that you should adopt the principle of 'loose coupling, high cohesion'. That means your business logic should be an independently well-tested pure Python package, and your Kedro pipeline should be dumb, in the sense that you are not coupling business logic to your expression of flow. In my opinion the nodes.py we generate is purely for newbies; in practice you should only need a pipeline.py which imports the Python functions from elsewhere.
• There are a couple of different ways to achieve parallelism in Kedro. The ParallelRunner uses multiprocessing and is a good fit for local processing engines like Pandas / Polars. The ThreadRunner is perfect for Spark / Snowpark / Ibis since it delegates execution to a remote computation backend. We don't have a great deal of control over how that gets delegated beyond the number of threads, so I think some tuning will be required on the engine side if performance is a bottleneck.
• Data quality is an interesting topic. Many frameworks have come and gone, so it's been hard to build long-term integrations. Pandera is by far my favourite way of doing runtime expectation testing. It has support for Spark / Polars / Ibis (Snowpark may work since it's Spark-like at an API level, but don't rely on me saying that). The other advantage is that you annotate the pure Python functions I mentioned above, without coupling to your flow framework like Kedro. In summary, if you have well unit-tested pure Python functions annotated with Pandera schemas, you have a solid foundation of trust to work against. This doesn't support expectation tests on persisted data the way dbt or Great Expectations do. kedro-pandera does exist, but I'm not sure how up to date it is.
• In truth, most examples of Kedro at scale are not open to the public beyond our tutorial docs. Maybe explore the GitHub dependents view?
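
To make the 'loose coupling, high cohesion' point concrete, here is a minimal sketch of what a "dumb" pipeline.py could look like. The function names (clean_amounts, total_by_customer) and dataset names (raw_orders, clean_orders, customer_totals) are made up for illustration, and the business logic is shown in the same file only to keep the sketch short; the idea is that it would live in its own tested package.

```python
# --- Business logic: plain Python, no Kedro import. In a real project this
# --- would live in its own package (hypothetical: my_company/transforms.py).
def clean_amounts(rows: list[dict]) -> list[dict]:
    """Drop rows whose amount is missing or negative."""
    return [r for r in rows if r.get("amount") is not None and r["amount"] >= 0]


def total_by_customer(rows: list[dict]) -> dict[str, float]:
    """Sum amounts per customer."""
    totals: dict[str, float] = {}
    for r in rows:
        totals[r["customer"]] = totals.get(r["customer"], 0.0) + r["amount"]
    return totals


# --- Flow: the only Kedro-aware code, i.e. what pipeline.py would contain.
def create_pipeline():
    # Imported inside the function only so the functions above stay usable
    # without Kedro in this single-file sketch; in a real pipeline.py the
    # import would sit at the top of the module.
    from kedro.pipeline import Pipeline, node

    return Pipeline(
        [
            node(clean_amounts, inputs="raw_orders", outputs="clean_orders",
                 name="clean_amounts"),
            node(total_by_customer, inputs="clean_orders", outputs="customer_totals",
                 name="total_by_customer"),
        ]
    )
```

Because the two functions know nothing about Kedro, they can be unit tested with plain pytest and reused outside the pipeline; pipeline.py only wires names together.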