Principles
Data pipelines are products
Pipelines must respond to and embrace regular change, so they should be treated as products rather than projects.
Each pipeline needs a product manager who understands its current status and operability, and who prioritises the work.
Find ways to make common use of the data
When creating pipelines, we try to architect them for reuse whilst remaining lean in our implementation choices.
A pipeline should operate as a well-defined unit of work
Pipelines should be built around use cases
Continuously deliver your pipelines
Adopting a continuous delivery mindset and its practices is essential to support continuous improvement and to create feedback loops that rapidly expose problems and address user feedback.
Consider how you name and partition your data
Consider what will be the most frequent queries when specifying bucket names, table partitions, shards, and so on.
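As a minimal sketch of this idea, suppose the most frequent query filters by event date and then by region. One option is to put those fields first in the object key, in that order, so a query can skip whole partitions. The dataset name, the field names (event_date, region) and the key=value layout below are illustrative assumptions, not a prescribed convention.

```python
from datetime import date


def partition_path(dataset: str, event_date: date, region: str) -> str:
    """Build a partitioned object key prefix (hypothetical layout).

    The most frequently filtered field (assumed here to be the event
    date) comes first, so readers can prune by prefix.
    """
    return (
        f"{dataset}/"
        f"event_date={event_date.isoformat()}/"
        f"region={region}/"
    )


def prefix_for_day(dataset: str, event_date: date) -> str:
    """Prefix that selects every region for a single day."""
    return f"{dataset}/event_date={event_date.isoformat()}/"


if __name__ == "__main__":
    key = partition_path("orders", date(2024, 1, 5), "eu")
    print(key)
    # A query for one day only needs to list keys under this prefix.
    print(key.startswith(prefix_for_day("orders", date(2024, 1, 5))))
```

If the dominant query were by region across all dates, reversing the order of the two fields would usually serve it better, which is why the expected queries should drive the layout.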