Hadoop 1
Hadoop 1.x supports only the MapReduce (MR) processing model. It does not support non-MR tools.
MR handles both data processing and cluster resource management.
1.x has limited scalability: it is limited to 4,000 nodes per cluster.
Works on the concept of slots: each slot can run only a Map task or only a Reduce task.
A single Namenode manages the entire namespace.
1.x has a single point of failure (SPOF) because there is only one Namenode. If the Namenode fails, manual intervention is needed to recover.
The MR API is compatible with Hadoop 1.x. A program written for Hadoop 1 runs on Hadoop 1.x without any additional files.
1.x is limited as a platform for event processing, streaming, and real-time operations.
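
The slot model described above can be illustrated with a small toy sketch. It is not Hadoop code: the class `SlotNode` and its fields are hypothetical names chosen for illustration, and the slot counts are made-up values. It only shows the consequence of typed slots, namely that an idle Reduce slot cannot pick up a waiting Map task.

```python
from dataclasses import dataclass


@dataclass
class SlotNode:
    """Toy model of a Hadoop 1.x worker with fixed, typed slots (hypothetical)."""
    map_slots: int
    reduce_slots: int

    def schedule(self, map_tasks: int, reduce_tasks: int) -> dict:
        # Each task type may only occupy slots of its own type.
        running_maps = min(map_tasks, self.map_slots)
        running_reduces = min(reduce_tasks, self.reduce_slots)
        idle = (self.map_slots - running_maps) + (self.reduce_slots - running_reduces)
        return {
            "running_maps": running_maps,
            "running_reduces": running_reduces,
            "waiting_maps": map_tasks - running_maps,
            "waiting_reduces": reduce_tasks - running_reduces,
            "idle_slots": idle,
        }


# Assumed example values: 4 map slots, 2 reduce slots, a map-only workload.
node = SlotNode(map_slots=4, reduce_slots=2)
result = node.schedule(map_tasks=6, reduce_tasks=0)
print(result)
```

In this sketch two Map tasks wait while two Reduce slots sit idle, which is the kind of under-utilization the slot model can cause.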
Hadoop 2
Hadoop 2.x supports MR as well as other distributed computing models such as Spark, Hama, Giraph, MPI (Message Passing Interface), and HBase coprocessors.
YARN (Yet Another Resource Negotiator) handles cluster resource management, while processing is done by the different processing models.
2.x has better scalability: it scales up to 10,000 nodes per cluster.
Works on the concept of containers. A container can run generic tasks.
Multiple Namenode servers manage multiple namespaces.
2.x overcomes the SPOF with a standby Namenode. If the active Namenode fails, the cluster is configured for automatic recovery.
The MR API requires additional files for a program written for Hadoop 1.x to run on Hadoop 2.x.
Can serve as a platform for a wide variety of data analytics: it is possible to run event processing, streaming, and real-time operations.
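
The container model described above can be illustrated with a similar toy sketch. It is not YARN code: `ContainerNode`, `allocate`, and the memory figures are hypothetical, and the sketch tracks only memory, whereas a real scheduler considers more than that. The point is that a container is a generic resource grant, so any kind of task can use free capacity.

```python
from dataclasses import dataclass, field


@dataclass
class ContainerNode:
    """Toy model of a worker that hands out generic containers (hypothetical)."""
    memory_mb: int
    running: list = field(default_factory=list)

    def allocate(self, task_name: str, memory_mb: int) -> bool:
        # Grant a container to any task type as long as memory is available.
        used = sum(mem for _, mem in self.running)
        if used + memory_mb > self.memory_mb:
            return False
        self.running.append((task_name, memory_mb))
        return True


# Assumed example values: a 6 GB node serving MR map tasks and a Spark executor.
node = ContainerNode(memory_mb=6144)
requests = [("map", 1024)] * 4 + [("spark-executor", 2048), ("map", 1024)]
granted = [node.allocate(name, mem) for name, mem in requests]
print(granted)
```

Here the non-MR task is admitted alongside the Map tasks, and a request is refused only when the node's capacity is exhausted, not because of the task's type.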