# questions
t
Hi! Are there best practices regarding exploratory data analysis and data cleaning? Do you start with notebooks and move code to kedro nodes later on? Thanks for your suggestions
g
For EDA, Kedro is great if you want to present how you get to specific plots. Notebooks can speed up exploration but, of course, the more work you do there, the harder it becomes to translate it to nodes. So consider translating your code to nodes as soon as you have a clearer idea of what you want to show. You can then use those nodes in your notebooks as you develop new ones.
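To make that concrete, here is a minimal sketch of the idea: a node is just a plain Python function, so you can call it directly in a notebook and wrap it as a Kedro node later. The function name, dataset names, and sample rows are made up for illustration, and the import is guarded so the cell still runs where Kedro isn't installed.

```python
from typing import Any


def drop_incomplete_rows(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only the rows in which no value is None."""
    return [row for row in rows if all(value is not None for value in row.values())]


# In a notebook: call the function directly while exploring.
# (Hypothetical sample data, not from a real project.)
raw_rows = [
    {"id": 1, "price": 10.0},
    {"id": 2, "price": None},
    {"id": 3, "price": 7.5},
]
clean_rows = drop_incomplete_rows(raw_rows)

# Later: wrap the very same function as a Kedro node.
# "raw_rows" and "clean_rows" are assumed dataset names.
try:
    from kedro.pipeline import node

    clean_node = node(
        func=drop_incomplete_rows,
        inputs="raw_rows",
        outputs="clean_rows",
        name="drop_incomplete_rows",
    )
except ImportError:
    clean_node = None  # Kedro not installed; the plain function still works
```

Because the function doesn't read or write files itself, moving it from the notebook into a pipeline shouldn't need any changes to its body.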
r
Hi Tim, you can also use Kedro during your exploration phase and visualize your results with Kedro-Viz. I agree that notebooks make exploration faster, but they can become harder to maintain over time. We often see people start with notebooks and then convert their work into Kedro projects. We’ve also recently developed Kedro MCP, which includes a notebook conversion feature that lets you turn your notebook exploration into a Kedro project. It’s still in the testing phase, but we’d love for you to try it out.
t
I'm about to start a new project. So your suggestion is not to use notebooks but put everything in nodes right from the beginning?
r
With Kedro, you have the flexibility to do both. You could start by setting up a Kedro project, since it already includes a notebooks/ folder, so you can do your explorations using Kedro as a library in your notebooks (see docs). You should use the DataCatalog from the beginning to organise your data and keep things consistent. Once things start to take shape, you can move that code into nodes to make it more reproducible.
t
I'll try that. Thanks for your advice.