Advertisements

Migrating Namenode from one host to another host

Namenode is the heart of the hadoop cluster. So namenode will be installed in a good quality machine compared to the other nodes. If we want to migrate namenode from one node to another node, the following steps are required. This is a rare scenario.

Manual Approach

Method 1: (By migrating the harddrive)

  • Stop all the running jobs in the cluster
  • Enter into Namenode Safe
    • hdfs dfsadmin -safemode enter
  • Execute the following command to save the currrent namespace to the storage directories and reset editlogs..
    • hdfs dfsadmin -saveNamespace
  • Stop the entire cluster
  • Remove the hard disk from the old namenode host and attach it to the new namenode host
  • Release the ipaddress from the old namenode host and assign it to the new namenode host
  • Start the new namenode (DO NOT PERFORM FORMAT)
  • Start all the services

Method 2: (New Harddrive)

  • Stop all the running jobs in the cluster
  • Enter into Namenode Safe
    • hdfs dfsadmin -safemode enter
  • Execute the following command to save the currrent namespace to the storage directories and reset editlogs..
    • hdfs dfsadmin -saveNamespace
  • Stop the entire cluster
  • Login to the namenode host.
  • Navigate to the namenode storage directories.
  • Copy the namenode metadata. Always better to keep this as a compressed file. Notedown the folder and file permissions & ownership.
  • Take a back up of the configuration files.
  • Install namenode of the same version as that of the existing system to the new machine.
  • Ensure that the ipaddress of the old host is taken and assigned to the new host.
  • Copy the configuration files and metadata to the new namenode host
  • Create namenode storage directory structure in the new host.
  • Maintain the same folder permissions and ownership in the new host also.
  • If there are any changes in namenode directory structure, make the corresponding changes in config files.
  • Incase of a kerberised cluster, create appropriate principles for the new host and place the proper keytabs.
  • Start the new namenode. (DO NOT PERFORM FORMAT)
  • Start the remaining services.
  • Test the working of the cluster by executing file system operations as well as MR operations.

Automated Approach in a cluster managed using Cloudera Manager (CM above 5.4)

If you are using cloudera manager 5.4 or above, there is a new feature known as Namenode Role Migration that helps us to migrate namenode from one host to another. This requires HDFS HA to be enabled.

Advertisements

Migrating hive from one hadoop cluster to another cluster

Recently I have migrated a hive installation from one cluster to another cluster. I havent find any
document regarding this migration. So I did it with my experience and knowledge.
Hive stores the metadata in some databases, ie it stores the data about the tables in some database.
For developement/ production grade installations, we normally use mysql/oracle/postgresql databases.
Here I am explaining about the migration of the hive with its metastore database in mysql.
The metadata contains the information about the tables. The contents of the table are stored in hdfs.
So the metadata contains hdfs uri and other details. So if we migrate hive from one cluster to another
cluster, we have to point the metadata to the hdfs of new cluster. If we haven’t do this, it will point
to the hdfs of older cluster.

For migrating a hive installation, we have to do the following things.

1) Install hive in the new hadoop cluster
2) Transfer the data present in the hive metastore directory (/user/hive/warehouse) to the new hadoop
cluster
3) take the mysql metastore dump.
4) Install mysql in the new hadoop cluster
5) Open the hive mysql-metastore dump using text readers such as notepad, notepad++ etc and search for
hdfs://ip-address-old-namenode:port and replace with hdfs://ip-address-new-namenode:port and save it.

Where ip-address-old-namenode is the ipaddress of namenode of old hadoop cluster and ip-address-
new-namenode
is the ipaddress of namenode of new hadoop cluster.

6) After doing the above steps, restore the editted mysql dump into the mysql of new hadoop cluster.
7) Configure hive as normal and do the hive schema upgradations if needed.

This is a solution that I discovered when I faced the migration issues. I dont know whether any other
standard methods are available.
This worked for me perfectly. 🙂