What is Hadoop?
Hadoop - is an open-source software framework for storing and processing big data in a distributed fashion on large clusters of commodity hardware. Essentially, it accomplishes two tasks: massive data storage and faster processing.
Currently three core components are included with your basic download from the Apache Software Foundation.
HDFS - the Java-based distributed file system that can store all kinds of data without
prior organization.
MapReduce –
a software programming model for processing large sets of data in parallel.
YARN – a resource management framework for scheduling
and handling resource requests from distributed applications.
Hadoop link: http://hadoop.apache.org/releases.html#Download
No comments:
Post a Comment