BIG DATA


Big data is not a new technology or a kind of software. It is a problem: the data we need to store and process has grown beyond the capacity of our hard disks. The problem has three main aspects, described below:

  1. VOLUME: This refers to the large amount of data, such as 500 TB or 500 PB, generated through browsers, applications and portals. Sites like Amazon and Facebook generate data every second. Sizes are measured in MB, GB, TB, PB and so on.

Millions of data items are uploaded in just one second.

  2. VELOCITY: This refers to the speed at which data is generated and processed, for example on social media. It matters for real-time processing, where data arrives in bits, bytes and batches. Staying with the social media example, about 900 million photos are uploaded to Facebook every day, and all of that data has to be processed.


  3. VARIETY: This refers to the number of types of data: structured, semi-structured and unstructured data gathered from multiple sources and generated either by humans or by machines. The most commonly added types are text, tweets, pictures and videos.
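To put the sizes and rates from the list above in perspective, here is a minimal Python sketch of the arithmetic. It assumes decimal (SI) units, so 1 TB = 10^12 bytes, and it treats the 900 million photos per day as an evenly spread average. The helper names `to_bytes` and `per_second` are made up for this illustration.

```python
# Assumption: decimal (SI) units, e.g. 1 TB = 10**12 bytes.
UNITS = {"MB": 10**6, "GB": 10**9, "TB": 10**12, "PB": 10**15}
SECONDS_PER_DAY = 24 * 60 * 60


def to_bytes(amount: int, unit: str) -> int:
    """Convert an amount in MB/GB/TB/PB to bytes."""
    return amount * UNITS[unit]


def per_second(events_per_day: int) -> float:
    """Average number of events per second for a given daily total."""
    return events_per_day / SECONDS_PER_DAY


print(to_bytes(500, "TB"))                # 500000000000000
print(to_bytes(500, "PB"))                # 500000000000000000
print(round(per_second(900_000_000)))     # 10417
```

So 900 million photos a day works out to roughly ten thousand photos arriving every second on average, which is why velocity is treated as a problem of its own.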


HOW MUCH DATA DO AMAZON, FACEBOOK AND OTHER BIG COMPANIES HANDLE?


Despite the hype, many organizations don’t realize they have a big data problem or they simply don’t think of it in terms of big data. In general, an organization is likely to benefit from big data technologies when existing databases and applications can no longer scale to support sudden increases in volume, variety, and velocity of data.

Failure to correctly address big data challenges can result in escalating costs, as well as reduced productivity and competitiveness. On the other hand, a sound big data strategy can help organizations reduce costs and gain operational efficiencies by migrating heavy existing workloads to big data technologies and by deploying new applications to capitalize on new opportunities.

These companies generate lots of data on a regular basis, on the order of 500 TB or 500 PB. So how can that much data be handled on hard disks?
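A quick calculation shows why a single machine is not enough. This sketch assumes a hypothetical 4 TB hard disk and decimal units. The disk size and the `disks_needed` helper are assumptions for illustration and do not come from any real deployment.

```python
TB = 10**12  # assumption: decimal units
PB = 10**15

DISK_SIZE = 4 * TB  # hypothetical capacity of one hard disk


def disks_needed(data_bytes: int, disk_bytes: int = DISK_SIZE) -> int:
    """Smallest number of disks whose total capacity covers the data."""
    return (data_bytes + disk_bytes - 1) // disk_bytes  # integer ceiling


print(disks_needed(500 * TB))  # 125
print(disks_needed(500 * PB))  # 125000
```

Even before counting any backup copies, data at this scale has to be spread over many disks, and therefore over many machines.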

This problem is solved by Hadoop, big data software that is built in Java (JDK).

On-cluster storage with HDFS

Hadoop also includes a distributed storage system, the Hadoop Distributed File System (HDFS), which stores data across local disks of your cluster in large blocks. HDFS has a configurable replication factor (with a default of 3x), giving increased availability and durability. HDFS monitors replication and balances your data across your nodes as nodes fail and new nodes are added.
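The idea of large blocks plus replication can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the block size of 128 MB, the node names and the round-robin placement are all chosen for illustration. Real HDFS placement is more involved (it takes racks into account, for example), and the functions here are not part of any Hadoop API.

```python
from typing import Dict, List

MB = 1024**2
GB = 1024**3

BLOCK_SIZE = 128 * MB  # assumed block size; configurable in a real cluster
REPLICATION = 3        # replication factor of 3, as described above


def split_into_blocks(file_size: int, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return the size of each block; the last block may be smaller."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def place_replicas(num_blocks: int, nodes: List[str],
                   replication: int = REPLICATION) -> Dict[int, List[str]]:
    """Toy round-robin placement: each block lands on `replication` distinct nodes."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


blocks = split_into_blocks(1 * GB)
nodes = ["node1", "node2", "node3", "node4"]  # hypothetical cluster nodes
placement = place_replicas(len(blocks), nodes)

print(len(blocks))                            # 8
print(sum(blocks) * REPLICATION // GB)        # 3 (GB of raw disk for a 1 GB file)
print(placement[0])                           # ['node1', 'node2', 'node3']
```

Because every block lives on three different nodes in this model, losing any single node still leaves two copies of each block it held. The cost is that a 1 GB file takes up 3 GB of raw disk space.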


MASTER-SLAVE CLUSTER


That is big data, just in brief.

Thank you!




Article by Santosh Yadav