Rise of DataOps
Combining Data Management with Agile Development and DevOps
Background
All organizations are aware that, whether or not their business is growing, their data certainly is, and often exponentially. With the advent of mobile computing and the Internet of Things (IoT), the only thing organizations know for sure is that there will be new sources of data and that older sources will generate more data at a faster pace, as newer technologies gain wider traction and business drivers force organizations to seek value from this asset. To maintain their competitive advantage, such enterprises must understand their data and how it affects the bottom line. To be understood, the data needs to be collated, organized and managed, and only then made available for business uses such as traditional reporting, analytics and now machine learning and its sexier cousin, Artificial Intelligence (AI).
The key challenges companies face are the increasing volume of data and the velocity at which it is generated. It is therefore imperative for any company to derive value from its data rapidly, without incurring too much "data debt" along the way. To meet and beat these challenges, an organization's data management strategy must aspire to be lean and agile. This is where DataOps comes in.
What is DataOps?
DataOps is a framework that fuses concepts from agile development and DevOps to produce a rapid, flexible and robust data analytics capability. A key consideration is that the implemented processes are repeatable and reusable. A DataOps framework gives an organization the capability to rapidly integrate, deploy and derive value from existing data sources, as well as to embrace new data sources through the reuse of existing components. What previously took months is now achievable in days, and with far better results, thanks to the automation of key processes.
A core premise of DataOps is agility. The advent of cloud technologies and, more importantly, the acceptance of public cloud infrastructure by large organizations have resolved a key dilemma: how do you address new data requirements, with their ever-growing infrastructure needs, in a rapid and efficient manner? Note that cost is but one factor. Previously, the relatively long lead times needed to scale critical infrastructure were not conducive to rapid delivery of data management solutions. This has largely been solved by always-available cloud infrastructure, which gives organizations the computing elasticity needed to meet their business requirements quickly.
This rapid change from on-prem to cloud, and from bespoke builds to a "build once, use repeatedly" model, requires a rethink of traditional data management strategies. The key difference is that organizations are moving towards a self-service model. Business users, with IT support, are being enabled to acquire, ingest and analyze data on their own. Using the principles of agile development and DevOps, users can now rapidly set up analytical environments using proven ingestion, integration and deployment processes. These processes are heavily parameterized and require only configuration changes rather than custom coding, which even a non-technical person can make. This drives large savings in time and development effort. There is also an opportunity-cost benefit: requirements are met more quickly than before, and business users can focus on analyzing their data rather than worrying about when they can start. Organizations now have the operational flexibility they have hankered after for decades.
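As a rough illustration of what "configuration changes rather than custom coding" might look like, here is a minimal Python sketch. It is not taken from any particular DataOps product: the `SourceConfig` class, the `ingest` function and the sample source names and columns are all hypothetical, chosen only to show a single reusable ingestion routine being driven by per-source settings.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class SourceConfig:
    """Hypothetical per-source settings; onboarding a source means adding one of these."""
    name: str
    delimiter: str = ","
    rename: dict = field(default_factory=dict)  # source column -> target column
    required: tuple = ()                        # target columns that must be non-empty


def ingest(raw_text: str, cfg: SourceConfig) -> list:
    """Parse delimited text, rename columns, and drop rows missing required fields."""
    reader = csv.DictReader(io.StringIO(raw_text), delimiter=cfg.delimiter)
    rows = []
    for record in reader:
        # Apply the configured column mapping; unmapped columns keep their names.
        row = {cfg.rename.get(col, col): (value or "").strip()
               for col, value in record.items()}
        if all(row.get(col) for col in cfg.required):
            rows.append(row)
    return rows


# Two sources with different layouts, handled by the same code path.
crm = SourceConfig(name="crm", delimiter=",",
                   rename={"cust_id": "customer_id"}, required=("customer_id",))
web = SourceConfig(name="web", delimiter="|",
                   rename={"uid": "customer_id"}, required=("customer_id",))

crm_rows = ingest("cust_id,region\n101,EMEA\n,APAC\n", crm)
web_rows = ingest("uid|page\n101|/home\n", web)
```

In this sketch, the second CRM row is dropped because its customer identifier is empty, and adding a third source would mean writing another `SourceConfig` rather than another parser. A real implementation would likely load such settings from a file or catalog instead of defining them inline.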
Implementing DataOps
Implementing DataOps requires significant changes to the mindset and the status quo. It pushes people out of their comfort zone and forces a change in their behaviour. And therein lies the biggest challenge to a successful implementation: people. Implementing DataOps is not a technology issue but a people issue. To succeed, companies need a clear vision of what the change is and why it is critical for the company's future. A clear roadmap is required to show everyone how and when the change will be achieved. To achieve these goals, DataOps team members should have a horizontal mix of skill sets rather than a vertical one, so that they work with and support each other rather than working in silos. As with any project, quick wins are essential to gain traction with executives and to build motivation and momentum. Of course, communication is key in every aspect of achieving the enterprise vision.
Parting Shot
While DataOps allows organizations to rapidly gain value from one of their most undervalued assets, it is not the be-all and end-all. It cannot turn bad data into a gold mine. The need for good-quality data remains paramount. This has become even more relevant with the advent of Artificial Intelligence and its need for vast amounts of good-quality data. Most enterprises still struggle with data quality, and if efforts are not made to improve it, even the best AI projects will struggle or fail.