__Getting started with Big Data__

This article is for readers who want to learn the basics of Big Data: what it is, why the IT industry needs it today, what challenges it brings, and how companies overcome those challenges.

So let's start with a very simple question: what is big data?

In a way, big data is exactly what it sounds like: a lot of data. Since the advent of the Internet, we've been producing data in staggering amounts. It's been estimated that in all the time leading up to the year 2003, only 5 exabytes of data were generated, which is equal to 5 billion gigabytes. But from 2003 to 2012, the amount reached around 2.7 zettabytes (or 2,700 exabytes, or 2.7 trillion gigabytes) [source: Intel]. According to Berkeley researchers, we are now producing roughly 5 quintillion bytes (or around 4.3 exabytes) of data every two days.
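The unit conversions in these figures are easy to check yourself. Here is a minimal sketch, assuming decimal (SI) units where 1 gigabyte is 10^9 bytes, 1 exabyte is 10^18 bytes and 1 zettabyte is 10^21 bytes. The constant and variable names are mine, chosen only for illustration. One detail worth knowing: 5 quintillion bytes is exactly 5 decimal exabytes, and the "around 4.3" figure only appears if an exabyte is taken as the binary unit of 2^60 bytes, which is presumably how that number was derived.

```python
# Decimal (SI) byte units, assumed for the figures quoted above.
GB = 10**9    # gigabyte
EB = 10**18   # exabyte
ZB = 10**21   # zettabyte
EIB = 2**60   # binary "exabyte" (exbibyte), used only for the last check

pre_2003_bytes = 5 * EB        # data generated up to 2003
by_2012_bytes = 27 * 10**20    # 2.7 zettabytes, kept as an integer
two_day_bytes = 5 * 10**18     # 5 quintillion bytes

# Convert each figure into the units mentioned in the text.
pre_2003_gb = pre_2003_bytes // GB
by_2012_eb = by_2012_bytes // EB
by_2012_gb = by_2012_bytes // GB
two_day_eib = round(two_day_bytes / EIB, 1)

print(f"Up to 2003: {pre_2003_gb:,} GB")
print(f"By 2012: {by_2012_eb:,} EB or {by_2012_gb:,} GB")
print(f"Every two days: {two_day_eib} binary exabytes")
```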

We are all familiar with Facebook, which processes about 2.5 billion pieces of content and 500+ terabytes of data each day. It pulls in 2.7 billion Like actions and 300 million photos per day, and it scans roughly 105 terabytes of data each half hour.

But a lot of data does not mean that it has to be millions of gigabytes or zettabytes. Any amount of data that cannot be managed and processed by traditional means of data management falls under the category of Big Data. If an organization has no way to manage even a few gigabytes of data, let's say 10 or 20 GB, with existing traditional tools such as relational databases, then that amount of data also counts as big data for that organization. Generally, though, the term Big Data refers to a large volume of data with enormous variety.

Now we come to the need for Big Data.

In very general terms, data is managed and processed to draw actionable insights from it and grow the business. Let's talk about some use cases:

A real example of a company that uses big data analytics to drive customer retention is Coca-Cola. In 2015, Coca-Cola strengthened its data strategy by building a digital-led loyalty program. In an interview with ADMA's managing editor, Coca-Cola's director of data strategy made it clear that big data analytics is strongly behind customer retention at Coca-Cola. To read the whole interview, see: https://www.adma.com.au/resources/how-coca-cola-uses-data-to-supercharge-its-superbrand-status

Another is Netflix, which is a good example of a big brand that uses big data analytics for targeted advertising. With over 100 million subscribers, the company collects a huge amount of data, which is the key to achieving the industry status Netflix boasts. If you are a subscriber, you are familiar with how they send you suggestions for the next movie you should watch. Basically, this is done using your past search and watch data, which gives them insight into what interests each subscriber most.
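To make the idea of suggestions based on watch history concrete, here is a toy sketch in Python. It is not how Netflix actually does it; their real system is far more sophisticated and is not described in this article. The catalog, the titles, the genres and the function name `suggest` are all made up for illustration. The sketch simply counts which genres a subscriber has watched most and ranks the unseen titles by that count.

```python
from collections import Counter

# Hypothetical catalog: title -> genre (invented for this example).
catalog = {
    "Space Saga": "sci-fi",
    "Robot Dawn": "sci-fi",
    "Star Drift": "sci-fi",
    "Baking Duel": "cooking",
    "Quiet Hills": "drama",
}

# Hypothetical watch history of one subscriber.
history = ["Space Saga", "Robot Dawn", "Baking Duel"]


def suggest(watch_history, catalog, k=2):
    """Return up to k unseen titles, favouring the most-watched genres."""
    # Count how often each genre appears in the watch history.
    genre_counts = Counter(catalog[t] for t in watch_history if t in catalog)
    # Only titles the subscriber has not watched yet are candidates.
    unseen = [t for t in catalog if t not in watch_history]
    # Rank by genre count (highest first), then by title to break ties.
    ranked = sorted(unseen, key=lambda t: (-genre_counts[catalog[t]], t))
    return ranked[:k]


print(suggest(history, catalog, k=1))
```

Since this subscriber watched two sci-fi titles, the remaining sci-fi title is ranked first.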

Big data analytics can help change all business operations. This includes matching customer expectations, changing the company’s product line and, of course, ensuring that marketing campaigns are powerful. Big data analytics also helps in risk management, and there are many other scenarios where companies benefit greatly from the use of Big Data.

Challenges posed by Big Data.

Let's talk about the three Vs of big data: Volume, Velocity and Variety. These three Vs are not the only challenges, but they cover most of what there is to say about the challenges of big data.

  • Volume: Big data is any set of data so large that the organization that owns it faces challenges in storing or processing it. In reality, trends like ecommerce, mobility, social media and the Internet of Things (IoT) are generating so much information that nearly every organization probably meets this criterion.
  • Velocity: If an organization is generating new data at a rapid pace and needs to respond in real time, it has the velocity associated with big data. Most organizations that are involved in ecommerce, social media or IoT satisfy this criterion for big data.
  • Variety: If the data resides in many different formats, it has the variety associated with big data. For example, big data stores typically include email messages, word processing documents, images, video and presentations, as well as data that resides in structured relational databases.
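The three Vs can be turned into a rough checklist. The sketch below is only an illustration under my own assumptions: the `DatasetProfile` fields, the function `three_vs` and the `min_formats` threshold of 3 are hypothetical, not an industry standard. It also reflects the earlier point that volume is relative to what an organization's existing tools can handle.

```python
from dataclasses import dataclass


@dataclass
class DatasetProfile:
    """Hypothetical description of an organization's data."""
    size_gb: float          # how much data there is
    capacity_gb: float      # how much the existing tools can manage
    needs_real_time: bool   # must the organization respond in real time?
    formats: set            # e.g. {"email", "video", "sql"}


def three_vs(profile, min_formats=3):
    """Return which of the three Vs apply to the given profile."""
    return {
        # Volume: more data than the current tools can store or process.
        "volume": profile.size_gb > profile.capacity_gb,
        # Velocity: data arrives fast and needs a real-time response.
        "velocity": profile.needs_real_time,
        # Variety: data lives in many formats (threshold is an assumption).
        "variety": len(profile.formats) >= min_formats,
    }


# A small organization with 20 GB of data but tools that cope with only 5 GB.
small_org = DatasetProfile(
    size_gb=20,
    capacity_gb=5,
    needs_real_time=False,
    formats={"email", "images", "video", "sql"},
)
print(three_vs(small_org))
```

Even at 20 GB, this hypothetical organization meets the volume and variety criteria, because its existing tools cannot keep up.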

Overcoming these challenges is a big discussion in itself, so we are going to talk about it later. For now, I want to tell you about a very interesting framework used by many big IT firms such as Facebook: Hadoop. Hadoop is a distributed data storage and management framework. You can search for more about it on your own, or wait until I cover it in a later article.

Thanks for reading my article on Introduction to Big Data.



Akash Saini ( Akki )