Apache Hadoop in nutshell

Apache Hadoop is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly gain insight from massive amounts of structured and unstructured data. Numerous Apache Software Foundation projects make up the services required by an enterprise to deploy, integrate and work with Hadoop.



To view or add a comment, sign in

More articles by Paresh Goyal,PMP

  • Difference between Flume and Sqoop

    Both Flume and Sqoop are meant for data movement. Sqoop and Flume both are meant to fulfill data ingestion needs but…

    1 Comment
  • Difference Between flume and sqoop

    Both Flume and Sqoop are meant for data movement. Sqoop and Flume both are meant to fulfill data ingestion needs but…

    1 Comment
  • Big Data Vs Business Intelligence

    Many people bandy around the terms “big data” and “business intelligence” as if they are interchangeable. In some…

  • hadoop vs rdbms

    Hadoop is not a database, it is basically a distributed file system which is used to process and store large data sets…

    6 Comments
  • Hadoop vs Spark

    Should we go for Hadoop or Spark as our big data framework? Spark has overtaken Hadoop as the most active open source…

    7 Comments

Explore content categories