Is Big Data the Same as a Large Amount of Data?
Big Data is the new buzzword in the industry. But what actually is Big Data? As the name suggests, most people assume it simply means large datasets, possibly terabytes or petabytes of data, and nothing more.
But that’s not all Big Data is. It is commonly characterized by the “3Vs”: Volume, Velocity, and Variety of data.
- Volume: Large datasets, of course, including both structured and unstructured data
- Velocity: Fast-moving, ever-changing data. Think streaming
- Variety: Text, pictures, and videos uploaded to social media, chat messages, tweets, images from cameras, and readings from IoT devices, mobile devices, and RFID readers
Beyond the 3Vs, what really makes data “Big Data” is the insight it provides. It’s not just about how much or what kind of data you have, but what you do with that data.
Having said that, a large amount of data is still very important: the more data we have, the more representative our sample is, which in turn leads to better models and better insights.
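The intuition that more data gives a more representative sample can be made concrete with the standard error of the mean, which shrinks with the square root of the sample size. A minimal sketch, assuming a hypothetical population standard deviation of 10:

```python
import math

# Hypothetical population standard deviation (an assumption for illustration).
sigma = 10.0

# Standard error of the mean: sigma / sqrt(n).
# A 100x larger sample makes the estimate 10x more precise.
for n in (100, 10_000, 1_000_000):
    se = sigma / math.sqrt(n)
    print(f"sample size {n:>9,}: standard error = {se:.3f}")
```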
Some use cases of Big Data:
- Understanding and Targeting Customers: understand customer behavior and preferences, and build predictive models
- Fraud Detection: based on pattern recognition
- Understanding and Optimizing Business Processes: retailers are able to optimize their stock based on predictions generated from social media data, web search trends, and weather forecasts
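As a toy illustration of the pattern-recognition idea behind fraud detection, one simple pattern is "a transaction far from the typical amount". A minimal sketch using z-scores, with made-up transaction amounts and an illustrative threshold (not a production fraud model):

```python
import statistics

def flag_anomalies(amounts, threshold=2.0):
    """Flag amounts whose z-score exceeds the threshold."""
    mean = statistics.mean(amounts)
    stdev = statistics.stdev(amounts)
    return [a for a in amounts if abs(a - mean) / stdev > threshold]

# Hypothetical card transactions: everyday purchases plus one outlier.
txns = [12.0, 10.0, 11.0, 13.0, 9.0, 10.0, 12.0, 11.0, 9500.0]
print(flag_anomalies(txns))  # flags the 9500.0 transaction
```

Real fraud systems learn far richer patterns (merchant, location, timing), but the principle is the same: model normal behavior and flag deviations from it.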
Now that we have established that Big Data is not just about large volumes of data but about putting that data to use, let's talk about how to process it.
Batch processing is an efficient way to generate insights when working with a high volume of data. Processing time can range from minutes to hours, and the operations can be complex.
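A minimal sketch of the batch idea in Python (the event-log format here is made up): process a large volume of records one bounded chunk at a time and aggregate the results, trading latency for throughput and memory efficiency.

```python
from collections import Counter
from itertools import islice

def batch_process(lines, batch_size=1000):
    """Aggregate event counts one batch at a time, so memory use
    stays bounded no matter how large the input is."""
    totals = Counter()
    it = iter(lines)
    # Pull at most batch_size lines per iteration (an empty list ends the loop).
    while batch := list(islice(it, batch_size)):
        # Assumed log format: the first whitespace-separated field is the event type.
        totals.update(line.split()[0] for line in batch)
    return totals

print(batch_process(["click /home", "view /a", "click /b"], batch_size=2))
```

In a real pipeline `lines` would be a lazy reader over files in distributed storage, but the shape is the same: read a bounded chunk, update an aggregate, repeat.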
When your most important consideration is extracting near real-time insights from massive amounts of data, you need stream processing. Here we are dealing with large volumes of data arriving at high velocity. Operations need to be simpler, with response times of seconds, and streaming data is processed even before it lands in a data warehouse.
A few use cases of stream processing:
- A music streaming service looks at user-listening data to automatically improve its user recommendations.
- Network monitoring
- Intelligence and surveillance
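Use cases like these often boil down to maintaining running aggregates over an unbounded stream. A minimal sketch of the idea, updating a mean one event at a time without storing the stream (the latency values below are made up):

```python
class RunningMean:
    """Incrementally update a mean as each event arrives,
    without keeping the full stream in memory."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def add(self, value):
        self.count += 1
        # Incremental mean update: new_mean = old_mean + (x - old_mean) / n
        self.mean += (value - self.mean) / self.count
        return self.mean

monitor = RunningMean()
for latency_ms in [10.0, 20.0, 30.0]:  # hypothetical network latencies
    monitor.add(latency_ms)
print(monitor.mean)  # 20.0
```

The same incremental pattern extends to counts, sums, and sliding-window statistics, which is what makes second-level response times feasible on high-velocity data.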
There are two approaches to stream processing:
- Native stream processing: every event is processed as it arrives, resulting in the lowest possible latency. But processing every incoming event individually is also computationally very expensive
- Micro-batch processing: incoming events are grouped into batches, either by arrival time or once a batch reaches a certain size. This reduces the computational cost of processing but can introduce latency
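The micro-batch idea can be sketched in a few lines of Python: buffer incoming events and flush a batch when it reaches a size limit or when enough time has passed (names and limits here are illustrative, not any particular framework's API):

```python
import time

def micro_batches(events, max_size=3, max_wait=1.0):
    """Group a stream of events into batches, flushing when the batch
    reaches max_size or max_wait seconds have elapsed."""
    batch, started = [], time.monotonic()
    for event in events:
        batch.append(event)
        if len(batch) >= max_size or time.monotonic() - started >= max_wait:
            yield batch
            batch, started = [], time.monotonic()
    if batch:  # flush whatever is left when the stream ends
        yield batch

# Size-triggered flushes: two full batches, then a final partial one.
print(list(micro_batches(range(7))))  # [[0, 1, 2], [3, 4, 5], [6]]
```

A real engine would read from a message queue and flush on a timer even when no events arrive; this sketch only checks the clock when an event comes in.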
“Big Data” is rapidly changing the world and powering AI at scale.