Big Data Problems

The numbers are increasing day by day: the number of people, the number of connections, and the amount of data. Since I'm referring here to the MNCs, let's go through some of the difficulties these companies face.

Each user possesses some data, and that data keeps increasing day by day.

This doesn't mean these companies can simply tell a user, "Hey, delete your previous data and store something new instead."




That is not the right way for these companies to look at handling big data.





Now consider one more scenario. Let's say users are typing some text files, and there is not just a single user but millions of them. Assume that every user now sends whatever they have written to the servers of these big companies. Will the computing units in these companies' data centres be able to handle storing such a huge amount of data?

But first, why am I talking about CPU and RAM here? Why would we even need them just for storing data 🤔?


Here is a fact most of us don't realise: as users of an operating system, when we write a text file we normally write only a few words, fewer than 5,000 at once. Processing that much data is no big deal for our computers.



But if you somehow write lakhs of words and then try to save the text file, you will definitely feel some lag. Also, if you can monitor the RAM on your PC, give it a try and you will find your answer. This is proof that we need not just storage units but high-end computing units as well.
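You can reproduce this experiment with a few lines of plain Python (the filename below is made up): build a text of several lakh words in memory, then write it out, and watch the time it takes and your RAM monitor while it runs.

```python
import time

# Hold ~5 lakh words in RAM at once -- this alone costs a few MB of memory.
words = "hello " * 500_000

# Time the save; on a slow disk or a low-RAM machine this is where the lag shows.
start = time.time()
with open("big_note.txt", "w") as f:
    f.write(words)
elapsed = time.time() - start

print(f"wrote {len(words)} characters in {elapsed:.2f} s")
```

Try raising the multiplier to 50 million and the lag becomes impossible to miss.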

Conclusion !

So all these points make it clear that these companies' data centres need an increase in storage capacity along with advances in their computing units.

So let's look at some figures and judge how much storage capacity they really need.


Here is some data, based on updates as of last year, on the amount generated per day:

* Twitter - 500 million tweets are sent

* Facebook - 4 petabytes of data are created

* WhatsApp - 65 billion messages are sent

* Google Search - 3.5 billion searches are made

* Emails - 294 billion emails are sent

Okay, we get it: we really need some big hard disks, lots of RAM, and CPUs too.

But how big can this data be? Do hard disks of such huge sizes even exist?

Actually, for this we have lots of big companies like IBM, Dell EMC, and many more who are ready with such huge storage devices.


Looking at their offerings, one can see that we really have a lot of options.

Optimisation is still going on, but in the meantime a solution came to developers' minds: what if they could create a cluster of these machines 💡?

Here starts the journey of the big data management world.

In this process, they wrote some programs and planned a master and slave node setup, where each slave node has its own RAM, CPU, and storage, and shares all of these resources with the master node.

How do we achieve this type of setup?

A piece of software is developed that is responsible for distributing the incoming data from users around the world to the various slave nodes, creating a setup that stores data across many machines. This methodology is known as a Distributed Storage Cluster.
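A minimal sketch of this idea in plain Python (this is an illustration, not Hadoop itself; the class and node names are made up): a master splits incoming data into fixed-size blocks, spreads them across slave nodes round-robin, and remembers where each block went.

```python
class Master:
    """Toy master node: splits files into blocks and places them on slaves."""

    def __init__(self, slaves, block_size=4):
        self.slaves = {name: [] for name in slaves}  # slave name -> stored blocks
        self.block_size = block_size
        self.index = {}  # file name -> list of (slave, block number)

    def store(self, filename, data):
        """Split data into fixed-size blocks and distribute them round-robin."""
        blocks = [data[i:i + self.block_size]
                  for i in range(0, len(data), self.block_size)]
        names = list(self.slaves)
        placement = []
        for n, block in enumerate(blocks):
            slave = names[n % len(names)]  # round-robin placement
            self.slaves[slave].append(block)
            placement.append((slave, n))
        self.index[filename] = placement
        return placement


master = Master(["slave1", "slave2", "slave3"])
placement = master.store("notes.txt", "hello big data world")
# Each slave now holds only a part of the file; the master's index
# records how to reassemble it.
```

A real cluster adds replication (each block copied to several slaves) so that losing one machine loses no data, but the placement idea is the same.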

What issues can it solve?


This approach addresses the major issues we face in the Big Data world.

The topic was Hadoop, right? So where is it?

Hadoop is a software framework from Apache. It allows for the distributed processing of large data sets across clusters of computers using simple programming models.

We can use it to overcome big data issues: its Distributed Storage Cluster approach sets up the master and slave nodes for us, and hence overcomes the Big Data problems.
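To give a small, concrete taste: in Hadoop, the `fs.defaultFS` property in `core-site.xml` is what tells every node in the cluster where the master (the NameNode) lives. Roughly, the file looks like this (the hostname below is a placeholder):

```xml
<!-- core-site.xml: points all nodes at the master (NameNode).
     "master.example.com" is a placeholder hostname. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master.example.com:9000</value>
  </property>
</configuration>
```

With this in place on every machine, the slave nodes (DataNodes) register with the master and start sharing their storage with the cluster.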



