Data Engineering and Data Science - Yin and Yang: Lessons from Data Science Projects
I recently came across a couple of AI/ML and data science (DS) projects, and I felt compelled to write this article based on what I learned from them. I think it is well established that most data science projects fail because production deployment and operationalization are not considered from the start of the project. According to Gartner analyst Nick Heudecker, over 85% of data science projects fail. A report from Dimensional Research indicated that only 4% of companies have succeeded in deploying ML models to a production environment. I wrote an article calling out the distinction between Exploratory AI and Operational AI, and why data scientists need to think about Operational AI: for these projects to deliver value to organizations, they need to come out of the sandbox and operate in the real world.
In spite of this being widely known, it was evident from the projects that the teams lacked "product thinking" and neglected to include the user stories that would address the operational needs of the DS models. As a result, the data engineering team and the DS team worked in silos right up to the point when the model had to be deployed in production. This put undue stress on the project timelines and outcomes. In fact, it called the success of the projects into question, even though the DS model had been developed to meet the business requirements and was running successfully in an isolated sandbox environment. Here are some lessons learned from these projects that might help you avoid the mistakes the teams made.
Apply Product thinking:
This approach to implementing an AI/ML or DS project forces you to think in terms of Operational AI from the beginning. You have to create user stories that capture not only functional requirements but also production deployment and operational (model maintenance) requirements. As a result, you plan better for production deployment and maintenance.
Break down Siloed teams:
Siloed data teams (engineering and DS) can hinder the iterative model development process and slow innovation, that is, the development of Operational AI. It is imperative that leaders understand that AI/ML and DS projects require effective collaboration between the data engineering and DS teams to be implemented efficiently. If the teams do not understand each other's requirements and constraints, the model may ultimately be unsuitable for production deployment, or may require significant effort on both sides to deploy. This adds to the timeline and the overall effort. In one of the projects, when it came time to deploy the model in production, the engineering team had to try several architectural approaches before finding one that fit the DS model's requirements for running in production. This could have been avoided had the two teams collaborated to understand the integration points and requirements of the operational DS model.
Integration points between the engineering team and the data science team should exist throughout the life cycle of the project
The two teams will run in parallel. They might feel like distinct streams of work, but they are complementary to each other like Yin and Yang. Yin and Yang describe how seemingly opposite or contrary forces may actually be complementary, interconnected, and interdependent in the natural world, and how they may give rise to each other as they interrelate to one another. (Source: Wikipedia).
Matei Zaharia, who is the chief technologist at Databricks and started the Apache Spark project, said in his recent article "How to empower data teams in 3 critical ways":
We need to break silos down, enable collaboration between data engineering and data science teams, and build a new data team structure.
He also argues that converging roles will be on the rise in the future, with hybrid roles focused more on delivering business value and less on specialized skill sets. This will become possible as platforms evolve to support the convergence of such roles. ML engineers will have full-stack data experience, and data engineers will have DS experience.
Think MLOps from the beginning:
For Operational AI to deliver business value successfully, there is a need for MLOps thinking from the beginning of the project. Danny Farah did an excellent job explaining the need for MLOps, as well as the blueprint of MLOps in detail, in his article "The Modern MLOps Blueprint". Once deployed in production, the model needs to be monitored, updated as the needs of the business change (because the model needs to deliver value at all times), and checked for edge cases and biases. Without an MLOps framework in place, this can become a very difficult and cumbersome process.
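To make the monitoring point a little more concrete, here is a minimal sketch of one kind of check an MLOps setup might run after deployment: comparing the values of a single input feature seen in production against the values seen during training. The function names (`drift_score`, `needs_review`), the threshold of 2.0, and the sample numbers are all assumptions made up for illustration; they do not come from the projects described here or from any particular MLOps tool, and a real framework would track many more signals than this.

```python
from statistics import mean, stdev


def drift_score(baseline, live):
    """Shift of the live mean from the baseline mean, in baseline standard deviations.

    `baseline` holds feature values from training; `live` holds values seen in production.
    Both names are hypothetical and used only for this illustration.
    """
    base_sd = stdev(baseline)
    if base_sd == 0:
        # A constant baseline: any change in the mean counts as unbounded drift.
        return 0.0 if mean(live) == mean(baseline) else float("inf")
    return abs(mean(live) - mean(baseline)) / base_sd


def needs_review(baseline, live, threshold=2.0):
    """Flag the feature for review when the shift exceeds the (assumed) threshold."""
    return drift_score(baseline, live) > threshold


# Made-up sample data, for illustration only.
training_values = [10, 12, 11, 13, 12, 11, 10, 12]
production_stable = [11, 12, 10, 13]
production_shifted = [20, 22, 21, 23]

print(needs_review(training_values, production_stable))
print(needs_review(training_values, production_shifted))
```

A check like this is only useful if the engineering team exposes production inputs to the DS team and both teams agree on what happens when the flag is raised, which is one more reason to plan it as a user story from the start.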
Do not underestimate the need for a Scrum Master for a DS project:
This might seem like a very obvious project management requirement, but you might be surprised to find that a lot of projects make one of the leads the Scrum Master to cut costs. That is a mistake. Like any data engineering project, a DS project needs a dedicated Scrum Master. The Scrum Master needs to make sure all team members are working collaboratively to deliver their user stories at the end of each sprint, and that all impediments to the project are removed.
Above all, the Scrum Master needs to break down all silos within the team.
A Gartner survey reveals that 66% of organizations increased or did not change their AI investments since the onset of COVID-19. As companies continue to invest in AI projects to improve customer experience and retention, revenue growth, and cost optimization, they must focus on creating strong, collaborative teams where the data engineering and DS teams work like Yin and Yang, focused on delivering business value. That is how data teams will have a bigger impact on innovation.
Mainak thanks for drafting that article. I'd be curious to hear where you think ML success rate will be in 2 years. Is a talent gap also part of the reason for the low success rate? Troy Hall, Zola Petkovic, Marc Lobree - you all are people I respect in the AI/ML space. Would be curious to hear your thoughts.
Nice article Mainak Sarkar. It has been my learning too: - Consolidate ownership - Integrate early - Iterate often https://ml4devs.substack.com/p/003-why-machine-learning-projects-fail
Enamored with the science, but neglectful of the #productmindset needed for success...agreed Mainak!