Dear Technocrats,
After installing Hadoop on a single node, you can easily extend it into a multinode cluster. For single-node installation, please refer to my previous post.
This post is intended for experienced practitioners who have already installed Hadoop on single nodes and are familiar with the common commands and paths.
In a multinode cluster you have one master node (the NameNode) and as many slave nodes (DataNodes) as you wish to add.
The steps to build a multinode Hadoop cluster are as follows.
1. Update the /etc/hosts file on all nodes with the IP addresses and user-defined names of your master node and slave nodes. This lets you refer to any node by name rather than by IP address, which keeps the configuration simple and readable.
2. The master node needs to access resources on the other nodes (HDFS locations etc.), and all of this communication must be authorized through SSH. To set this up, append the master node's public SSH key to the authorized_keys file on every datanode.
3. Update the masters and slaves files in the configuration folder. For Hadoop 2.x, all configuration files are found under the hadoop-2.x/etc/hadoop directory. List the names of all slave nodes in the slaves file, one per line. You need to create the masters file in this directory yourself and write the name of the master node in it. Save and close both files. Do this in the master node's configuration only; it is not needed on the datanodes.
4. Update core-site.xml and mapred-site.xml. You just need to replace the word 'localhost' with 'masternode' in every property where it appears. Do this on all nodes, i.e. the master node as well as the slave nodes.
5. Format the namenode from the master node and start all services from there. Your multinode cluster is ready. If all goes well, you will see the cluster with multiple datanodes, as shown in the images below.
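As a rough illustration of steps 1, 3 and 4, here is a minimal Python sketch that generates the file contents into a staging folder so you can review them before copying them into place. The node names, IP addresses, the `hosts.append` file name and all function names are hypothetical and only for illustration; the sketch does not touch /etc/hosts or your live configuration.

```python
from pathlib import Path

# Hypothetical node names and addresses; substitute your own.
MASTER = ("masternode", "192.168.1.10")
SLAVES = [("slavenode1", "192.168.1.11"), ("slavenode2", "192.168.1.12")]


def hosts_entries(master, slaves):
    """Step 1: lines to append to /etc/hosts on every node."""
    return "".join(f"{ip}\t{name}\n" for name, ip in [master, *slaves])


def slaves_file(slaves):
    """Step 3: one slave node name per line (master node only)."""
    return "".join(f"{name}\n" for name, _ in slaves)


def point_at_master(xml_text, master_name):
    """Step 4: swap 'localhost' for the master's name in a site file."""
    return xml_text.replace("localhost", master_name)


def stage_files(conf_dir, out_dir, master, slaves):
    """Write the generated files into out_dir for review before copying."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "hosts.append").write_text(hosts_entries(master, slaves))
    (out / "slaves").write_text(slaves_file(slaves))
    (out / "masters").write_text(master[0] + "\n")
    # Only rewrite the site files that actually exist in conf_dir.
    for name in ("core-site.xml", "mapred-site.xml"):
        src = Path(conf_dir) / name
        if src.exists():
            rewritten = point_at_master(src.read_text(), master[0])
            (out / name).write_text(rewritten)
    return sorted(p.name for p in out.iterdir())
```

For example, calling `stage_files("hadoop-2.x/etc/hadoop", "staging", MASTER, SLAVES)` would leave the generated files in a `staging` folder. The plain text replacement in `point_at_master` mirrors the manual find-and-replace described in step 4, so check the result by eye in case 'localhost' appears somewhere you did not intend to change.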
Note: I have two different versions of Hadoop 2.x installed on my two machines, so the cluster shows a notification about this, as below.
Note: If one or more datanodes do not appear in the cluster, stop all services, clear the temporary storage directory of your Hadoop installation (typically the one set by hadoop.tmp.dir), and start the cluster again. All your components should then come up. For any other issue, leave a comment on this post and I'll try to resolve it.