Top Amazing facts about Big Data Hadoop

May 20, 2016 | Big Data

Most students have probably heard the term Big Data Hadoop without knowing exactly what it is. No worries. It does sound like a heavyweight sci-fi term. Here we will look at what Big Data Hadoop is and at some of its amazing features. So let’s rock n roll.

Strictly speaking, Big Data and Hadoop are two different concepts, but in practice they are often referred to together as Big Data Hadoop.


Big Data Hadoop combines two different terms: Big Data and Hadoop. Before moving on to the amazing facts, one should know what big data is and what Hadoop is. Let’s start with some basic concepts:

What is Big Data

Strictly, data is the plural form of datum, which means a single piece of information, but in general use data serves as both singular and plural. Data are facts, figures and statistics collected together for research, reference and analysis. Data can exist in a variety of forms: as numbers or text, as bits and bytes, or as facts and figures.

Note: Bits and bytes are the units used to measure data, just as grams and kilograms are used to measure weight. One byte is 8 bits.
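To make these units concrete, here is a minimal Python sketch that converts a raw byte count into a larger unit. It assumes decimal units (1 KB = 1,000 bytes), and the function name `human_readable` is made up for this illustration.

```python
# Decimal (SI) units: each step up is a factor of 1,000.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]

def human_readable(num_bytes):
    """Express a byte count in the largest unit that keeps the value at 1 or above."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop when the value fits this unit, or when we run out of units.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000

def bytes_to_bits(num_bytes):
    """One byte is 8 bits."""
    return num_bytes * 8

print(human_readable(1500))      # a small text file
print(human_readable(10**18))    # one quintillion bytes
print(bytes_to_bits(1))
```

On this scale, one quintillion bytes is a single exabyte, which gives a feel for how large "quintillions of bytes" really is.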

Big data is a concept that describes an enormous volume of data collected over a period of time, both organized (structured) and unorganized (unstructured). Every day we create quintillions of bytes of data, of which about 80% is unorganized/unstructured.

According to IBM, the data created today comes from sources such as satellites used to gather climate information, social media sites, digital pictures and videos, and GPS signals, to name a few.


Interestingly, it is not the amount of data that matters. What matters is how we analyze this unorganized data, because the resulting insights lead to better decisions and vital business moves in the future.

Check out more details about Big Data on Wikipedia.

What is Hadoop

Imagine you had a file larger than your computer’s storage capacity. You can’t store that file, right? Hadoop makes it possible to store files much larger than the capacity of any single computer by splitting them into blocks and spreading those blocks across many machines.
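The idea of spreading one large file over several machines can be sketched in a few lines of Python. This is only an illustration, not how HDFS is actually implemented: the node names and the simple round-robin placement are assumptions made for the example, and real HDFS also keeps several copies of each block. The 128 MB block size matches the default in recent Hadoop versions.

```python
import math

def plan_blocks(file_size_mb, block_size_mb=128,
                nodes=("node1", "node2", "node3", "node4")):
    """Split a file into fixed-size blocks and assign them to nodes in turn.

    Hypothetical helper: round-robin placement stands in for the real
    HDFS placement policy.
    """
    n_blocks = math.ceil(file_size_mb / block_size_mb)
    placement = {node: [] for node in nodes}
    for block_id in range(n_blocks):
        node = nodes[block_id % len(nodes)]
        placement[node].append(block_id)
    return placement

# A 1,000 MB file does not need to fit on any single node.
for node, blocks in plan_blocks(1000).items():
    print(node, "holds blocks", blocks)
```

Each node ends up holding only a fraction of the file, which is why the whole file can be larger than any one machine's disk.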

Apache Hadoop is an open-source software framework written in Java for storing very large data sets and processing them in a distributed fashion. It is called Apache Hadoop because it is developed under the Apache Software Foundation.

History of Hadoop

As the World Wide Web grew in the late 1990s and early 2000s, search engines like Yahoo were created to help locate relevant information. In the early years, search results were returned by humans. But as the web grew from thousands to millions of pages, search engines needed automation.

An open-source web search engine called Nutch was created by Doug Cutting and Mike Cafarella in 2002. Its main aim was to return web search results much faster by distributing data and calculations across different computers. Another search engine project, Google, was underway at the time, motivated by the same goal.

In 2006, Doug Cutting joined Yahoo and took with him the Nutch project as well as ideas based on Google’s early work. The Nutch project was then divided: the web crawler portion remained as Nutch, and the distributed computing and processing portion was named Hadoop (after Cutting’s son’s toy elephant). In 2008, Hadoop became a top-level open-source project of the Apache Software Foundation, which manages and maintains it today.


Architecture & Working

Five main components together make up this platform:

  • MapReduce Framework
  • YARN (Yet Another Resource Negotiator) Infrastructure
  • HDFS Federation
  • Storage
  • Cluster

Check more details about Hadoop Architecture 

Hadoop has two main parts: a distributed file system for storing data and a data processing part. The Hadoop Distributed File System (HDFS) is a collection of many storage components that hold the actual data. You keep your data in HDFS, and it remains there until it is needed for analysis. The data processing part, on the other hand, is the tool that works through big chunks of data and processes them. This is a Java-based framework known as MapReduce.
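The classic way to picture the processing part is a word count. The sketch below mimics the map, shuffle and reduce steps in plain Python on a single machine. It is not Hadoop code and uses no Hadoop API; all function names are chosen for this illustration. In a real cluster, the map and reduce steps would run in parallel on the machines that hold the data.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine all the values for one key into a single count."""
    return key, sum(values)

def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

print(word_count(["big data", "Big Hadoop"]))
```

Because each line is mapped independently and each word is reduced independently, the work can be divided among as many machines as are available.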

Check more details on Hadoop Working

Now we know what big data is and what Hadoop is. Hadoop is a framework used to store and process very large data sets for analysis and decision making.

Here are some amazing big data facts.

  • Over 90% of the data in the world was created in the past two years.
  • Every 2 days we create as much information as we did from the beginning of time until 2003.
  • Google now processes over 40,000 search queries every second.
  • Around 100 hours of video are uploaded to YouTube every minute, so it would take roughly 16 years of non-stop viewing to watch every video uploaded by users in one day.
  • Today’s data centres occupy an area of land equal to that of 6,000 football fields.
  • The big data industry is expected to grow from US $10.2 billion in 2013 to US $54.3 billion by 2017.
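The YouTube and market-growth figures in this list can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check: it assumes non-stop viewing for 24 hours a day and a 365-day year, and the function names are made up for this illustration.

```python
def years_to_watch_one_day_of_uploads(hours_uploaded_per_minute=100):
    """Years of non-stop viewing needed to watch one day's worth of uploads."""
    hours_uploaded_per_day = hours_uploaded_per_minute * 60 * 24
    hours_in_a_year = 24 * 365
    return hours_uploaded_per_day / hours_in_a_year

def annual_growth_rate(start_value, end_value, years):
    """Compound annual growth rate implied by a start and end value."""
    return (end_value / start_value) ** (1 / years) - 1

print(f"{years_to_watch_one_day_of_uploads():.1f} years of viewing")
# Forecast from the list: US $10.2 billion (2013) to US $54.3 billion (2017).
print(f"{annual_growth_rate(10.2, 54.3, 2017 - 2013):.0%} growth per year")
```

Under these assumptions, one day of uploads works out to a little over 16 years of viewing, and the market forecast implies growth of roughly 52% per year.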


  • According to a Forbes report, almost 90% of global companies invest in big data analytics.
  • According to research firm Allied Market Research, the value of the Hadoop market is expected to rise from $2 billion in 2013 to $50 billion by 2020.
  • Did you know that the White House has already invested more than $200 million in big data projects?
  • Hadoop skills can accelerate career growth and bring a hike in pay package.
  • Your dream company may well be on a hiring spree for a Hadoop-skilled workforce.


These facts and figures are enough to say that the Big Data and Hadoop industry has a bright future. So, what are you waiting for? Pursue a job at your dream company.

I-Medita

I-Medita is an ISO 9001:2015 certified Professional Training Company. I-Medita is India's Most Trusted Networking Training Company. We help in providing industry oriented skill training to networking enthusiasts and professionals to kick-start their career in Networking domains. Our efforts are to keep momentum with the Industry technological demands and diversifying universe of knowledge.