Top Amazing Facts about Big Data Hadoop
Most students have heard the name Big Data Hadoop without knowing exactly what it is. No worries. It does sound like a heavyweight sci-fi term. Here we will look at what Big Data Hadoop is and at some amazing facts about it. So let’s rock n roll.
Strictly speaking, Big Data and Hadoop are two different concepts, but in practice they are often referred to together as Big Data Hadoop.
Big Data Hadoop combines two different terms: Big Data and Hadoop. Before moving on to the amazing facts, one should know what Big Data is and what Hadoop is. Let’s start with some basic concepts:
What is Big Data
Strictly, data is the plural of datum, which means a single piece of information, but in everyday use data serves as both singular and plural. Data are facts, figures and statistics collected for research, reference and analysis. Data can exist in a variety of forms: as numbers or text, as bits and bytes, or as facts and figures.
Note: Bits and bytes are the units used to measure data, just as grams and kilograms are used to measure weight.
Big data is a concept that describes an enormous volume of data collected over a period of time, both organized (structured) and unorganized (unstructured). Every day we create quintillions of bytes of data, of which about 80% is unstructured.
According to a survey by IBM, 80% of the data created today comes from sources such as satellites gathering climate information, social media sites, digital pictures and videos, and GPS signals, to name a few.
Remarkably, it is not the amount of data that matters most. What matters is how we analyze this unorganized data, because good analysis leads to better decisions and stronger business moves in the future.
Check out more details about Big Data on Wikipedia.
What is Hadoop
Imagine you had a file that is larger than your computer’s storage capacity. You can’t store that file, right? Hadoop makes it possible to store files much larger than any single machine could hold by splitting them up and spreading the pieces across many machines.
Apache Hadoop is an open-source software framework written in Java for storing enormous amounts of data and processing very large data sets in a distributed way. It is called Apache Hadoop because it is developed by the Apache Software Foundation.
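To make the idea of storing a file that is too big for one machine more concrete, here is a minimal sketch in plain Python. It is not Hadoop code: the function names, the tiny 16-byte block size, the node names and the round-robin placement are all assumptions made for illustration. Real HDFS works with far larger blocks and also keeps several copies of each block.

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(blocks, nodes):
    """Assign block indexes to nodes round-robin (a simplifying assumption)."""
    placement = {node: [] for node in nodes}
    for index in range(len(blocks)):
        placement[nodes[index % len(nodes)]].append(index)
    return placement


def read_file(blocks):
    """Rebuild the original file by joining the blocks in order."""
    return b"".join(blocks)


file_data = b"0123456789" * 10                      # a 100-byte stand-in "file"
blocks = split_into_blocks(file_data, block_size=16)
placement = place_blocks(blocks, ["node-a", "node-b", "node-c"])

print(len(blocks))   # number of blocks the file was cut into
print(placement)     # which hypothetical node holds which block index
```

No single node holds the whole file, yet joining the blocks in order gives back the original data.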
History of Hadoop
As the World Wide Web grew in the late 1990s and early 2000s, search engines and indexes such as Yahoo were created to help people find relevant information. In the early days, search results were curated by humans. But as the web grew from thousands of pages to millions, the need for automation became clear.
An open-source web search engine called Nutch was started by Doug Cutting and Mike Cafarella in 2002. Its main aim was to return web search results faster by distributing data and calculations across different computers. Meanwhile, another search engine project, Google, was underway with the same goal.
In 2006, Doug Cutting joined Yahoo and took with him the Nutch project as well as ideas based on Google’s early work. The Nutch project was then divided: the web crawler portion remained as Nutch, and the distributed computing and processing portion was named Hadoop (after Cutting’s son’s toy elephant). In 2008, Hadoop became a top-level open-source project of the Apache Software Foundation, which manages and maintains it today.
Architecture & Working
Several components together make up this platform. These include:
- MapReduce Framework
- YARN (Yet Another Resource Negotiator) Infrastructure
- HDFS Federation
Check more details about Hadoop Architecture
Hadoop has two main parts: a data processing part and a distributed file system for storing data. The Hadoop Distributed File System (HDFS) is a collection of many storage nodes that hold the actual data. You keep your data in HDFS, and it remains there until it is needed for analysis. The data processing part, on the other hand, is the tool that works through large chunks of data and processes them. It is a Java-based framework known as MapReduce.
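The article names MapReduce without showing what a job looks like, so here is a minimal word-count sketch of the general map, group and reduce pattern. It is plain Python running on one machine, not Hadoop’s Java API, and the function names and sample chunks are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


# Each string stands in for a chunk of data stored on a different node.
chunks = ["big data needs big storage", "Hadoop stores big data"]

mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle_phase(mapped))

print(word_counts["big"])   # 3
print(word_counts["data"])  # 2
```

In a real cluster the map step would run in parallel on the machines that hold each chunk, which is what makes the approach suitable for very large data.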
Check more details on Hadoop Working
Now we know what Big Data and Hadoop each are. Hadoop is a framework used to store and process very large data sets for analysis and decision making.
Here are some amazing Big Data facts.
- Over 90% of the data in the world was created in the past two years.
- Every 2 days we create as much information as we did from the beginning of time until 2003.
- Google now processes over 40,000 search queries every second.
- Around 100 hours of video are uploaded to YouTube every minute, and it would take about 15 years to watch every video uploaded by users in one day.
- Today’s data centres occupy an area of land equal to that of 6,000 football fields.
- The big data industry is expected to grow from US $10.2 billion in 2013 to US $54.3 billion by 2017.
- According to a Forbes report, almost 90% of global companies invest in big data analytics.
- According to research firm Allied Market Research, the value of the Hadoop market is expected to rise from $2 billion in 2013 to $50 billion by 2020.
- The White House has already invested more than $200 million in big data projects.
- Hadoop skills can bring accelerated career growth and a hike in pay.
- Your dream company may well be on a spree of hiring a Hadoop-skilled workforce.
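The YouTube figure in the list above can be sanity-checked with simple arithmetic. The short sketch below assumes the quoted upload rate of 100 hours per minute and a viewer who watches around the clock; the variable names are just for illustration.

```python
HOURS_UPLOADED_PER_MINUTE = 100   # upload rate quoted in the list above
MINUTES_PER_DAY = 24 * 60

# Total hours of video uploaded in one day.
hours_uploaded_per_day = HOURS_UPLOADED_PER_MINUTE * MINUTES_PER_DAY

# Time needed to watch it all, viewing 24 hours a day.
days_to_watch = hours_uploaded_per_day / 24
years_to_watch = days_to_watch / 365

print(hours_uploaded_per_day)      # 144000
print(round(years_to_watch, 1))    # 16.4
```

The result is roughly 16 years, which is in the same ballpark as the 15-year figure usually quoted.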
These facts and figures are enough to say that the Big Data and Hadoop industry has a bright future. So, what are you waiting for? Pursue a job at your dream company.