# questions
h
Hello Kedro Friends, I have a colleague who loves using notebooks for their pipelines. Personally, I feel that notebooks are not ideal for writing production code, and I've been trying to convince them to switch over to a more structured approach, like Kedro. However, there's one aspect of notebooks that I haven't been able to replicate with Kedro: documentation. The markdown cells in notebooks are great for explaining the steps within a pipeline. As pipelines become more complex, Kedro Viz alone doesn't seem sufficient to make the process easy to understand. So, here's my question: How do you document your code when using Kedro?
y
• First, pipelines consist of nodes, and nodes are wrappers around ordinary Python functions, so those functions should have proper type hints and docstrings. They should be arranged into Python modules in a way that reflects the project flow.
• Second, there should be some docs explaining which pipelines exist in the project, how they relate to each other (for example, whether they run sequentially or in parallel), and why they are split this way. For this, I typically use the project README or a separate Markdown file.
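To make the first point concrete, here is a minimal sketch of what a typed, documented node function could look like. The `Order` schema, the function name, and the dataset names in the closing comment are all hypothetical and only there for illustration; the Kedro wiring is left as a comment so the snippet runs without Kedro installed.

```python
from typing import TypedDict


class Order(TypedDict):
    """One raw order record (hypothetical schema, for illustration only)."""

    customer_id: str
    amount: float


def total_spend_per_customer(orders: list[Order]) -> dict[str, float]:
    """Sum order amounts per customer.

    Args:
        orders: Raw order records, one dict per order.

    Returns:
        Mapping from customer ID to the total amount spent.
    """
    totals: dict[str, float] = {}
    for order in orders:
        customer = order["customer_id"]
        totals[customer] = totals.get(customer, 0.0) + order["amount"]
    return totals


# In a Kedro project this function would typically be wrapped in a node, e.g.
# node(total_spend_per_customer, inputs="orders", outputs="customer_totals")
# where "orders" and "customer_totals" are assumed catalog dataset names.
```

The docstring then plays the role a markdown cell would play in a notebook: it sits right next to the step it explains and shows up in IDE tooltips and `help()`.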
šŸ‘ 1
Also, a few fun thoughts on the drawbacks of notebooks: https://github.com/samuelcolvin/notbook
d
On that topic, an amazing watch on why Joel Grus doesn't like notebooks:

https://www.youtube.com/watch?v=7jiPeIFXb6U

j
Functionalise everything and add docstrings. The docstrings can serve as documentation of what is being executed. Beyond that, use README files for deeper explanations.
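As a rough sketch of how docstrings can double as step-by-step documentation, the snippet below collects the first docstring line of each function into a numbered Markdown list. The two step functions and the `describe_steps` helper are hypothetical names made up for this example; only the standard-library `inspect.getdoc` is used, and nothing here is a Kedro API.

```python
import inspect
from typing import Callable, Iterable


def clean_orders(orders: list[dict]) -> list[dict]:
    """Drop orders with a non-positive amount."""
    return [o for o in orders if o["amount"] > 0]


def count_orders(orders: list[dict]) -> int:
    """Count the orders that survived cleaning."""
    return len(orders)


def describe_steps(steps: Iterable[Callable]) -> str:
    """Render a numbered Markdown list of function names and docstring summaries."""
    lines = []
    for index, func in enumerate(steps, start=1):
        # getdoc returns None when there is no docstring, so fall back to a marker.
        doc = inspect.getdoc(func) or "(no docstring)"
        summary = doc.splitlines()[0]
        lines.append(f"{index}. `{func.__name__}`: {summary}")
    return "\n".join(lines)


print(describe_steps([clean_orders, count_orders]))
```

The resulting text could be pasted into a README, which keeps the narrative explanation in sync with the functions themselves.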
šŸ‘ 1