Nripesh Srivastava’s Post

No one asked for a shared package. I built one anyway. Multiple teams at a global pharmaceutical company were running the same logic. Fetch data from source. Transform it. Write to ADLS Gen2. Each team had their own version. Assumption: custom code per team is safer. Easier to change without breaking someone else’s pipeline. Reality: five codebases with five variations of the same bug. Every upstream schema change meant five separate fixes. I built an OOP-based Python package. Parameterized. Modular. One abstraction for retrieval, one for transformation, one for storage. Other teams started using it. Then more teams. It became the default pattern not because someone mandated it, but because it was simply better. Reusability isn’t about efficiency. It’s about reducing drift between what you intended and what ten teams independently decided to implement. The hardest part wasn’t the code. It was designing the interface so teams could configure it without needing to understand what was underneath. That’s the real engineering skill. Not writing a good function. Writing one that other engineers trust enough not to rewrite. What’s a pattern you built that spread further than you expected? #DataEngineering #Python #AzureDatabricks

To view or add a comment, sign in

Explore content categories