# resources
a
Hey Kedro community! :kedro: 🧬 Just published a blog post about using Kedro to structure BioLM protocols: https://blog.biolm.ai/translating-biolm-protocols-into-kedro-workflows/

Over at BioLM, we are using language models to power biological research. Our platform provides models for protein structure prediction, sequence generation, and molecular design. We make frequent use of Jupyter notebooks to compose multi-step modeling workflows, often for user demonstration. For reproducibility, though, a notebook can be a bit too "fluid", so sometimes we need a more defined workflow. Kedro feels like a natural transitional fit in such cases.

We used Kedro to translate a Jupyter notebook (demonstrating antibody engineering with protein structure analysis) into a structured, reproducible pipeline. I think it worked out pretty well!

Some key Kedro features we found really useful for this purpose:
• Configuration management: Externalized file paths, API parameters, and target definitions into YAML files
• Data catalog: Made data dependencies explicit, and the dataset previews in Kedro-Viz are great
• Checkpoint functionality: `kedro run --from-nodes` and `--from-outputs` are super helpful for iterative development
• Kedro-Viz: Publishing the static Kedro-Viz build via GitHub Actions → GitHub Pages is a great way to share and present workflows (like this: https://biolm.github.io/biolm-kedro/)

The workflow processes multiple protein targets (EGFR, PDL1, MBP, IL-7Rα) through BioLM's AI models, generating variant analysis and visualizations. It really showcases how Kedro's modular approach works well for bioinformatics pipelines that involve external API calls and long-running analyses.

Would love to hear if others in the community are using Kedro for similar scientific workflows! If anyone is interested in the intersection of biology and AI/ML, we'd love to invite you to our own Slack community at https://biolm.ai/community/ - it's a great space for sharing projects, getting help, and staying updated on bio AI developments!
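For anyone curious what resuming from a node buys you, here is a toy, stdlib-only sketch of the idea behind `kedro run --from-nodes`: run the named node plus everything downstream of it, and skip the rest. This is not Kedro's implementation and not the BioLM pipeline. The `Step` class, the `steps_from` helper, and all step and dataset names are hypothetical, made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """A hypothetical pipeline step: a name plus the datasets it reads and writes."""
    name: str
    inputs: tuple
    outputs: tuple


# Hypothetical steps, loosely modeled on the kind of workflow described above.
# Assumed to be listed in a valid execution (topological) order.
STEPS = [
    Step("load_targets", (), ("targets",)),
    Step("predict_structures", ("targets",), ("structures",)),
    Step("generate_variants", ("structures",), ("variants",)),
    Step("plot_results", ("variants",), ("figures",)),
]


def steps_from(start_names, steps=STEPS):
    """Return the names of the start steps plus every step downstream of them."""
    selected = []
    produced = set()  # datasets written by the steps selected so far
    for step in steps:
        # A step is selected if it was named explicitly, or if it reads
        # something that an already-selected step writes.
        if step.name in start_names or produced.intersection(step.inputs):
            selected.append(step.name)
            produced.update(step.outputs)
    return selected


# Resume after the (expensive) structure prediction has already run.
resumed = steps_from({"generate_variants"})
print(resumed)  # ['generate_variants', 'plot_results']
```

In a real Kedro project the equivalent would be something like `kedro run --from-nodes=generate_variants` (node name hypothetical). The skipped steps' outputs need to be persisted through the data catalog rather than held only in memory, otherwise the resumed run has nothing to load.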
❤️ 9
🚀 14
🥳 15
d
Would love to hear if others in the community are using Kedro for similar scientific workflows!
@Andrew Stewart Feels like it could be a great SciPy talk in the future! Always very cool to see real-world applications of Kedro in spaces like this.
👍 1
a
Indeed, @Deepyaman Datta! We'll actually be presenting at https://summit.nextflow.io/ this year.
💛 1
I'm always surprised more of the bioinformatics community isn't (yet!) familiar with Kedro.. it's pretty ideal for managing bioinformatics workflows. Checkpointing is especially useful due to long compute steps.
I think I actually recall seeing a FASTA Dataset somewhere??
d
I'm always surprised more of the bioinformatics community isn't (yet!) familiar with Kedro.. it's pretty ideal for managing bioinformatics workflows. Checkpointing is especially useful due to long compute steps.
Same! I went to SciPy for the first time this year, and a lot of people in the community talk about issues that Kedro has done a great job solving, but a lot of people are trying to reinvent solutions. Awareness takes time, but it's great to see things like this!
👍 1