Learning Hadoop 2
Garry Turkington & Gabriele Modena
Google started the change that would eventually be known as Hadoop when, in 2003 and 2004, it released two academic papers describing the Google File System (GFS) and MapReduce. Together, the two provided a platform for highly efficient, very large-scale data processing.
At the same time, Doug Cutting was working on the Nutch open source web crawler. He was working on elements within the system that resonated strongly once the Google GFS and MapReduce papers were published. Doug started work on open source implementations of these Google ideas, and Hadoop was soon born, first as a subproject of Lucene and then as its own top-level project within the Apache Software Foundation. Yahoo! hired Doug Cutting in 2006 and quickly became one of the most prominent supporters of the Hadoop project. In addition to often publicizing some of the largest Hadoop deployments in the world, Yahoo! allowed Doug and other engineers to contribute to Hadoop while employed by the company, and it contributed back some of its own internally developed Hadoop improvements and extensions.