Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs.

What are the benefits of Hadoop?

One of the top reasons that organizations turn to Hadoop is its ability to store and process huge amounts of any kind of data, quickly. With data volumes and varieties constantly increasing, especially from social media and the Internet of Things, that is a key consideration. Other benefits include:

Computing power. Its distributed computing model processes big data quickly. The more computing nodes you use, the more processing power you have (see the word-count sketch at the end of this section).

Flexibility. Unlike traditional relational databases, you don't have to pre-process data before storing it. You can store as much data as you want and decide how to use it later. That includes unstructured data like text, images and videos (see the ingest sketch below).

Fault tolerance. Data and application processing are protected against hardware failure. If a node goes down, jobs are automatically redirected to other nodes to make sure the distributed computing does not fail. And Hadoop automatically stores multiple copies of all data (see the replication sketch below).

Low cost. The open-source framework is free and uses commodity hardware to store large quantities of data.

Scalability. You can easily grow your system simply by adding more nodes, and little administration is required (see the scaling sketch below).

What is Hadoop used for?

Going beyond its original goal of searching millions (or billions) of web pages and returning relevant results, many organizations are looking to Hadoop as their next big data platform. Popular uses today include:

Low-cost storage and active data archive. The modest cost of commodity hardware makes Hadoop useful for storing and combining data such as transactional, social media, sensor, machine, scientific and clickstream data. The low-cost storage lets you keep information that is not deemed currently critical but that you might want to analyze later.

Staging area for a data warehouse and analytics store. One of the most prevalent uses is to stage large amounts of raw data for loading into an enterprise data warehouse (EDW) or an analytical store for activities such as advanced analytics, query and reporting. Organizations are looking at Hadoop to handle new types of data (e.g., unstructured), as well as to offload some historical data from their enterprise data warehouses (see the export sketch below).

Data lake. Hadoop is often used to store large amounts of data without the constraints introduced by schemas commonly found in the SQL-based world. It is used as a low-cost compute-cycle platform that supports processing ETL and data quality jobs in parallel using hand-coded or commercial data management technologies. Refined results can then be passed to other systems as needed (see the schema-on-read sketch below).

Sandbox for discovery and analysis. Because Hadoop was designed to deal with large volumes of data in a variety of shapes and forms, it can run analytical algorithms. Big data analytics on Hadoop can help your organization operate more efficiently, uncover new opportunities and derive next-level competitive advantage. The sandbox approach provides an opportunity to innovate with minimal investment.

Recommendation systems. One of the most popular analytical uses by some of Hadoop's largest adopters is web-based recommendation systems: Facebook (people you may know); LinkedIn (jobs you may be interested in);
Netflix, eBay and Hulu (items you may be interested in). These systems analyze huge amounts of data in real time to quickly predict preferences before customers leave the web page.
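
A few short sketches make the points above concrete. First, ingest. The flexibility benefit comes from storing raw data exactly as it arrives, with no schema defined up front. A minimal sketch using the standard HDFS shell, assuming a running cluster; the file name clicks.json and the path are hypothetical:

  # Create a landing directory and load a raw file as-is, no pre-processing
  hdfs dfs -mkdir -p /data/raw/clicks
  hdfs dfs -put clicks.json /data/raw/clicks/
  hdfs dfs -ls /data/raw/clicks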
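
Replication is what backs the fault-tolerance claim: HDFS keeps multiple copies of every block (three by default), so a failed node does not lose data. A sketch, assuming you administer the cluster, using the standard dfs.replication property and the hdfs dfs -setrep command:

  <!-- hdfs-site.xml: keep three copies of every block (the HDFS default) -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>

  # Raise replication on a hypothetical critical dataset; -w waits until it takes effect
  hdfs dfs -setrep -w 4 /data/raw/clicks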
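
Scaling out is similarly mechanical. A sketch, assuming Hadoop 3 and a new host already provisioned with the cluster's configuration (the hostname is hypothetical; in Hadoop 2 the workers file is named slaves and the daemon commands differ):

  # On the NameNode: register the new host so cluster scripts include it
  echo "worker-05.example.com" >> $HADOOP_HOME/etc/hadoop/workers

  # On the new host: start the storage and compute daemons
  hdfs --daemon start datanode
  yarn --daemon start nodemanager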
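
The distributed computing model itself is easiest to see in word count, the canonical Hadoop example. A minimal sketch using Hadoop Streaming with Python, assuming the input loaded above; the streaming jar path varies by version and install:

  #!/usr/bin/env python3
  # mapper.py: emit (word, 1) for every word read from stdin
  import sys

  for line in sys.stdin:
      for word in line.split():
          print(f"{word}\t1")

  #!/usr/bin/env python3
  # reducer.py: input arrives sorted by key, so counts can be summed
  # over each run of identical words
  import sys

  current, count = None, 0
  for line in sys.stdin:
      word, n = line.rsplit("\t", 1)
      if word != current:
          if current is not None:
              print(f"{current}\t{count}")
          current, count = word, 0
      count += int(n)
  if current is not None:
      print(f"{current}\t{count}")

  hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
    -input /data/raw/clicks -output /data/wordcount \
    -mapper "python3 mapper.py" -reducer "python3 reducer.py" \
    -file mapper.py -file reducer.py

Each mapper runs on a node holding a block of the input, so adding nodes adds parallel map tasks. That is the "more nodes, more processing power" claim in practice.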
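
Schema-on-read is how the data-lake pattern avoids the up-front constraints of the SQL-based world: files stay as they landed, and a schema is projected over them only at query time. A sketch in HiveQL, assuming Hive is installed and the raw files are tab-delimited; the column names are hypothetical:

  CREATE EXTERNAL TABLE clicks (
    user_id STRING,
    url     STRING,
    ts      BIGINT
  )
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
  LOCATION '/data/raw/clicks';

  -- Dropping an EXTERNAL table removes only the metadata; the raw files stay put
  SELECT url, COUNT(*) AS hits FROM clicks GROUP BY url;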
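
Finally, for the staging-area use, refined results are commonly moved into the EDW with a bulk transfer tool such as Apache Sqoop. A sketch, assuming a MySQL warehouse; the connection string, table and directory are hypothetical:

  # Push a summarized dataset from HDFS into a warehouse table
  sqoop export \
    --connect jdbc:mysql://dw.example.com/warehouse \
    --username etl -P \
    --table daily_clicks \
    --export-dir /data/summary/daily_clicks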