Programmatic way to identify the status of the namenode in an HA enabled hadoop cluster

In an namenode HA enabled hadoop cluster, one of the namenodes will be active and the other will be standby. If you want to perform some operations on HDFS programmatically, some of the the libraries or packages need the details of active namenode (some of the packages in python need the details of active namenode, they will not support the nameservice). In this case, the easiest way to get the status is to issue a GET request similar to the one given below on each of the namenodes. This will help us to identify the status of each namenode.

GET REQUEST

curl 'http://namenode.1.host:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus'

SAMPLE OUTPUT

{
"beans" : [ {
"name" : "Hadoop:service=NameNode,name=NameNodeStatus",
"modelerType" : "org.apache.hadoop.hdfs.server.namenode.NameNode",
"State" : "active",
"SecurityEnabled" : false,
"NNRole" : "NameNode",
"HostAndPort" : "namenode.1.host:8020",
"LastHATransitionTime" : 0
} ]
}

Load Balancers – HA Proxy and ELB

ELB

HAProxy

Earlier I wondered how the sites like google handles the large number of requests reaching there. Later I came to know that there is a concept of load balancing. Using load balancing we can keep multiple servers in the back end and route the incoming requests to the back end servers. This will ensure faster response as well as high availability. This Load balancers play a very important role. There are a lot of opensource load balancers as well as paid services. HAProxy is one of the opensource load balancer. Amazon is providing a Load Balancer as a service known as Elastic Load Balancer (ELB).Using the load balancer, we can handle very large number of requests in a very reliable and optimal way. We can use this load balancer in Impala for load balancing the requests hitting the impala server. For on-premise environments, we can configure HAProxy and for cloud environments, we can use ELB.The ELB is a ready to use service, we just have to add the details of ports to be forwarded and the listener machines. HA Proxy is a very simple application that is available in the linux repositories. It is very easy to configure also.