Namenode is the single point of failure in hadoop cluster. Because it stores the metadata of the entire hadoop system.
So extra care should be given in maintaining it. We use the best hardware for namenode machines.
Even if we use best hardware, complete protection cannot be guarenteed, because hardware issues can happen at anytime. So a backup for namenode is very necessary.
One of the methods is creating a simple backup storage by mounting the partition of another machine located in a different place to the namenode machine.
The back up machine should have the same hardware/software specifications as of namenode machine and installed with hadoop similar to namenode machine. But hadoop services are not started in that machine.
Incase of failure, we can start namenode in this backup machine and it runs like normal namenode. The only thing we need to do is assigning ipaddress/hostname of actual namenode to the backup namenode.
In the hdfs-site.xml, we are giving an additional value to dfs.name.dir property.
ie actual location, backup location.
Eg:
<property> <name>dfs.name.dir</name> <value>/app/hadoop/name,/app/hadoop/backup</value> <description> Determines where on the local filesystem the DFS name node should store the name table(fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy. </description> </property>
here /app/hadoop/name is the actual namenode storage location and /app/hadoop/backup is the location where the partition is mounted for storing the namenode backup.
In case of failure of the first namenode machine, the namenode data will be safe in the second machine(backup), so we can start the namenode in the second machine.
The second machine is placed in different location and is provided with a differnt power supply, so that the dependencies of both the machines will be different, thus making an efficient backup.