Polyglot Persistence: A Preferred Path to Big Data in the Cloud for Enterprises
In the big data field, "polyglot persistence" is a term increasingly used to describe a strategy in which a combination of data storage technologies is used to build a data solution. Different specialized storage technologies are chosen based on how the data will be used by different components of the solution.
Most enterprises realize that a one-size-fits-all approach to big data is not always suitable. An architect must evaluate different use cases to make decisions such as when to use traditional Hadoop vs. a Massively Parallel Processing (MPP) database vs. NoSQL vs. an RDBMS vs. an in-memory store. Is the data best modeled as a row store, a column store, or a key-value store? Is there a need for a document store or a graph store? Do we need stream processing or batch processing? Very often it's not a choice of one vs. the other: many data solutions need to be built using the right combination of these technologies, considering latency expectations, access patterns, retention needs, and overall cost.
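To make the idea concrete, here is a minimal sketch of what polyglot persistence looks like in application code: one solution routing different data to specialized stores based on access pattern. The store classes below are hypothetical in-memory stand-ins (not real database clients) for a key-value store, a document store, and a columnar store.

```python
class KeyValueStore:
    """Stand-in for a low-latency key-value store, e.g. for user sessions."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class DocumentStore:
    """Stand-in for a schema-light document store, e.g. a product catalog."""
    def __init__(self):
        self._docs = {}

    def upsert(self, doc_id, doc):
        self._docs[doc_id] = dict(doc)

    def find(self, predicate):
        # Flexible queries over semi-structured records.
        return [d for d in self._docs.values() if predicate(d)]


class ColumnStore:
    """Stand-in for a columnar store, e.g. for batch analytics on events."""
    def __init__(self, columns):
        self._cols = {c: [] for c in columns}

    def append(self, row):
        for c in self._cols:
            self._cols[c].append(row.get(c))

    def column(self, name):
        # Column-wise access suits scans and aggregations.
        return self._cols[name]


# One "solution" composed of the right store per access pattern:
sessions = KeyValueStore()                      # point lookups, low latency
catalog = DocumentStore()                       # flexible, nested records
events = ColumnStore(["user", "action"])        # append-heavy analytics

sessions.put("sess-1", {"user": "alice"})
catalog.upsert("p1", {"name": "widget", "price": 9.99})
events.append({"user": "alice", "action": "view"})
events.append({"user": "alice", "action": "buy"})
```

In a real deployment each class would be backed by a purpose-built managed service, but the shape of the solution is the same: the application picks the store whose data model and access pattern fit each workload.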
However, there is significant operational overhead in deploying and supporting multiple specialized technologies such as those mentioned above. The prospect of locating the right talent, sizing and procuring the hardware, and installing, configuring, managing, monitoring, patching, and upgrading the technical components can be daunting for most enterprises. This forces enterprises to limit the options they make available. As a result, solutions are built to fit the available tools rather than by picking the right tools for the job, producing sub-optimal solutions.
With the move to the cloud, however, we are able to leverage managed services and adopt a pay-for-what-you-use model. This frees the organization from most of the operational overhead and significantly reduces up-front capital investment. The focus then shifts to choosing the right set of technologies to solve the business problem. It also significantly shrinks the time to deliver the solution to the business, vastly improving productivity and time to value. Lastly, for the more challenging use cases, this allows us to fail fast, learn what works and what does not, and iterate until we are successful.
Born-in-the-cloud startups have been using this model for quite a while, but traditional enterprises are only now slowly starting to adopt this pattern as they mature in their use of cloud technologies. At least that has been my own experience. What does your enterprise's journey to the cloud look like? Has it been more of a lift-and-shift of existing on-premise data infrastructure to the cloud, or is your organization re-engineering data solutions to leverage the best of what the cloud has to offer?
Please add your comments.