Showing posts from July, 2013

Hadoop Multi Node Set Up

In this post I share the procedure for setting up Apache Hadoop on multiple nodes; it is a continuation of the post Hadoop Single Node Set-up. The steps set up a two-node cluster, which can then be expanded to more nodes according to the volume of data. Hadoop's distinctive capabilities are best observed when it processes a large volume of data on a multi-node cluster of commodity hardware. Before proceeding to the setup, it is useful to have a general idea of the HDFS (Hadoop Distributed File System) architecture, the default data storage for Hadoop, so that we understand the steps we are following and what happens at execution. In brief, it is a master-slave architecture in which the master acts as the NameNode, which manages the file system namespace, and the slaves act as the DataNodes, which manage the storage of each node. There are also JobTrackers, which are master nodes, and TaskTrackers, which are ...
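To make the NameNode/DataNode split a little more concrete, here is a minimal toy sketch in Python. It is not Hadoop code and does not use any Hadoop API: the class names only mirror the roles described above, and the method names, the tiny `block_size`, the round-robin placement, and the node names `master` and `slave` are all assumptions made for illustration. Replication is left out, and in real HDFS the client streams block data to the DataNodes itself rather than through the NameNode.

```python
# Toy model of the HDFS master-slave split; illustrative only, not Hadoop code.
from itertools import cycle


class DataNode:
    """Slave role: stores the actual block contents held on its node."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block id -> bytes

    def store(self, block_id, data):
        self.blocks[block_id] = data

    def read(self, block_id):
        return self.blocks[block_id]


class NameNode:
    """Master role: keeps the namespace, i.e. which blocks form a file and where they live."""

    def __init__(self, datanodes, block_size=4):
        self.datanodes = {d.name: d for d in datanodes}
        self.block_size = block_size  # hypothetical tiny value so the split is easy to see
        self.namespace = {}  # path -> list of (block id, datanode name)
        self._next_node = cycle(list(self.datanodes))  # simple round-robin placement

    def write(self, path, data):
        # Simplification: the NameNode forwards the bytes itself.
        locations = []
        for offset in range(0, len(data), self.block_size):
            block_id = f"{path}#{offset // self.block_size}"
            node_name = next(self._next_node)
            chunk = data[offset:offset + self.block_size]
            self.datanodes[node_name].store(block_id, chunk)
            locations.append((block_id, node_name))
        self.namespace[path] = locations

    def read(self, path):
        # Look up block locations in the namespace, then fetch each block.
        return b"".join(
            self.datanodes[node].read(block) for block, node in self.namespace[path]
        )


if __name__ == "__main__":
    cluster = NameNode([DataNode("master"), DataNode("slave")])
    cluster.write("/demo.txt", b"hello hadoop")
    print(cluster.namespace["/demo.txt"])
    print(cluster.read("/demo.txt"))
```

The point of the sketch is only the division of labour: the master holds metadata about where blocks are, while the slaves hold the bytes.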