Polyglot Persistence: A Preferred Path to Big Data in the Cloud for Enterprises
In the big data field, "polyglot persistence" is a term increasingly used to describe a strategy in which a combination of data storage technologies is used to build a data solution. Different specialized storage technologies are chosen based on how the data will be used by different components of the solution.
Most enterprises realize that a one-size-fits-all approach to big data is not always suitable. An architect must evaluate different use cases to make decisions such as when to use traditional Hadoop vs. a Massively Parallel Processing (MPP) database vs. NoSQL vs. an RDBMS vs. an in-memory store. Is the data best modeled as a row store, a column store, or a key-value store? Is there a need for a document store or a graph store? Do we need stream processing or batch processing? Very often it's not a choice of one vs. the other: many data solutions need to be built using the right combination of these technologies, considering latency expectations, access patterns, retention needs, and overall cost.
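To make the idea concrete, here is a minimal sketch of what polyglot persistence looks like in application code: one solution routing different data to specialized stores based on access pattern. The store classes below are hypothetical in-memory stand-ins (not real database clients) for a key-value store, a document store, and a columnar store.

```python
class KeyValueStore:
    """Stand-in for a low-latency key-value store, e.g. for user sessions."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class DocumentStore:
    """Stand-in for a schema-light document store, e.g. a product catalog."""
    def __init__(self):
        self._docs = {}

    def upsert(self, doc_id, doc):
        self._docs[doc_id] = dict(doc)

    def find(self, predicate):
        # Flexible queries over semi-structured records.
        return [d for d in self._docs.values() if predicate(d)]


class ColumnStore:
    """Stand-in for a columnar store, e.g. for batch analytics on events."""
    def __init__(self, columns):
        self._cols = {c: [] for c in columns}

    def append(self, row):
        for c in self._cols:
            self._cols[c].append(row.get(c))

    def column(self, name):
        # Column-wise access suits scans and aggregations.
        return self._cols[name]


# One "solution" composed of the right store per access pattern:
sessions = KeyValueStore()                      # point lookups, low latency
catalog = DocumentStore()                       # flexible, nested records
events = ColumnStore(["user", "action"])        # append-heavy analytics

sessions.put("sess-1", {"user": "alice"})
catalog.upsert("p1", {"name": "widget", "price": 9.99})
events.append({"user": "alice", "action": "view"})
events.append({"user": "alice", "action": "buy"})
```

In a real deployment each class would be backed by a purpose-built managed service, but the shape of the solution is the same: the application picks the store whose data model and access pattern fit each workload.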
However, there is significant operational overhead in deploying and supporting multiple specialized technologies such as those mentioned above. The prospect of locating the right talent, sizing and procuring the hardware, and installing, configuring, managing, monitoring, patching, and upgrading the technical components can be daunting for most enterprises. This forces enterprises to limit the options they make available. As a result, solutions are built to fit the available tools rather than by picking the right tools for the job, producing sub-optimal solutions.
With the move to the cloud, however, we are able to leverage managed services and adopt a pay-for-what-you-use model. This frees the organization from most of the operational overhead and significantly reduces up-front capital investment. The focus then shifts to choosing the right set of technologies to solve the business problem. It also significantly shrinks the time to deliver the solution to the business, vastly improving productivity and time to value. Lastly, for the more challenging use cases, this allows us to fail fast, learn what works and what does not, and iterate until we are successful.
Born-in-the-cloud startups have been using this model for quite a while, but traditional enterprises are only now slowly starting to adopt this pattern as they mature in their use of cloud technologies. At least that has been my own experience. What does your enterprise's journey to the cloud look like? Has it been more of a lift-and-shift of existing on-premise data infrastructure to the cloud, or is your organization re-engineering data solutions to leverage the best of what the cloud has to offer?
Please add your comments.