Sometimes we may need to remove a node from a Hadoop cluster without losing the data stored on it.
For this we have to follow the decommissioning procedure.
Decommissioning excludes a node from the cluster after replicating the data present on that node to the other active nodes.
The procedure is very simple. The steps are explained below.
First, stop the TaskTracker on the node to be decommissioned.
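For example, on an MRv1 cluster installed from a tarball, the TaskTracker can be stopped with the hadoop-daemon.sh script (the exact command depends on your installation; a distribution may provide a service/init script instead):

# run on the node being decommissioned
$HADOOP_HOME/bin/hadoop-daemon.sh stop tasktracker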
On the NameNode machine, add the property below to hdfs-site.xml:
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>
</property>
Here dfs.exclude is a file that we have to create and place in a safe location; it is best kept in HADOOP_CONF_DIR (/etc/hadoop/conf). Note that if this property was not already set, the NameNode has to be restarted once to pick it up; after that, changes to the exclude file itself only need a refresh (see below).
Create a file named dfs.exclude and add the hostnames of the machines to be decommissioned, one per line.
E.g., dfs.exclude:
hostname1
hostname2
hostname3
After doing this, execute the following command as the HDFS superuser on the NameNode machine.
hadoop dfsadmin -refreshNodes
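On newer Hadoop versions (2.x and later), the hadoop dfsadmin form is deprecated; the equivalent command is:

hdfs dfsadmin -refreshNodes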
After this, check the NameNode web UI, i.e. http://namenode:50070.
You will be able to see the machines listed under Decommissioning Nodes.
The decommissioning process will take some time, since every block on the node has to be re-replicated to the remaining active nodes.
Once re-replication is complete, the machine will move to the Decommissioned Nodes list.
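You can also monitor the progress from the command line; the dfsadmin report prints a Decommission Status for each datanode. Here hostname1 is just the example host from dfs.exclude above, and the number of context lines may need adjusting for your output:

hadoop dfsadmin -report | grep -A 5 hostname1

The status line will read "Decommission in progress" while re-replication is running, and "Decommissioned" when it is done.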
After this, the decommissioned node can be safely removed from the cluster. 🙂
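As a final cleanup (assuming a tarball install; paths and file names may differ on your distribution), stop the DataNode daemon on the machine and take its hostname out of the cluster's node lists:

# on the decommissioned node
$HADOOP_HOME/bin/hadoop-daemon.sh stop datanode

# on the namenode: remove the host from the slaves file and from dfs.exclude,
# then refresh the node list once more
hadoop dfsadmin -refreshNodes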