# resources
j
I've recently found the layers of Databricks' Medallion Architecture (bronze, silver, gold) to be a useful mental model for data pipelines. This article takes that a step further, bridges it with the concept of data marts + proper data modeling, and provides lots of useful guidance: https://lakshmanok.medium.com/what-goes-into-bronze-silver-and-gold-layers-of-a-medallion-data-architecture-4b6fdfb405fc (it adds one more metal 🙈 but it's a necessary evil). This roughly maps to Kedro's `data/` convention, although ours is way more ML-oriented. Would love to know your thoughts!
💡 2
d
I think it's time we adopt this, it's a clear industry standard at this point and super compatible with our own model
🥉 Bronze is `intermediate`
🥈 Silver is `Primary`
🥇 Gold is `Model input+`
💡 3
👍 3
💯 1
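For reference, the mapping proposed above could be sketched in Python roughly like this. The folder names follow Kedro's standard `data/` layout; reading "Model input+" as "model input and every later layer" is an assumption, and `medallion_layer` is a hypothetical helper, not part of Kedro. Raw and feature were not assigned a medal in the thread, so they are left unmapped here.

```python
from typing import Optional

# Kedro's conventional data/ folders, in pipeline order.
KEDRO_LAYERS = [
    "01_raw",
    "02_intermediate",
    "03_primary",
    "04_feature",
    "05_model_input",
    "06_models",
    "07_model_output",
    "08_reporting",
]

# The three anchors named in the thread.
MEDALLION_ANCHORS = {
    "02_intermediate": "bronze",
    "03_primary": "silver",
    "05_model_input": "gold",
}


def medallion_layer(kedro_layer: str) -> Optional[str]:
    """Return the medallion layer for a Kedro data folder, or None if unmapped.

    Hypothetical helper. Assumption: "Model input+" means model input and
    everything after it counts as gold.
    """
    if kedro_layer not in KEDRO_LAYERS:
        raise ValueError(f"unknown Kedro layer: {kedro_layer}")
    if kedro_layer in MEDALLION_ANCHORS:
        return MEDALLION_ANCHORS[kedro_layer]
    # Layers after model input are treated as gold (assumed reading of "+").
    if KEDRO_LAYERS.index(kedro_layer) > KEDRO_LAYERS.index("05_model_input"):
        return "gold"
    # Raw and feature were not given a medal in the thread.
    return None
```

Under these assumptions, `medallion_layer("03_primary")` gives silver and `medallion_layer("08_reporting")` gives gold, while raw and feature come back as `None` until someone decides where they belong.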
g
Interesting! In our case we have a complicated pipeline. For example, our FactReceipts (primary layer) is a join of 20+ tables, so I had to add an extra layer at the very beginning and create some intermediate entities (joins) that are used by multiple entities in intermediate.
👀 1