CDH cluster installation failing in “distributing” stage: failure due to stall on seeded torrent

I faced this issue while distributing the downloaded packages in Cloudera Manager.

The solution that worked for me was to add the IP address to hostname mapping to the /etc/hosts file on every host, that is, on the Cloudera Manager server and on all the agents.

/etc/hosts

192.168.0.101   cdhdatanode1
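
To confirm that the mapping really is present on each host, a small check like the one below can help. This is only a sketch: the function name has_host_mapping is made up for illustration, and the IP address and hostname are the example values from above.

```shell
#!/bin/sh
# has_host_mapping FILE IP HOSTNAME
# Succeeds (exit status 0) if FILE has a line whose first field is IP
# and which lists HOSTNAME as one of its names.
has_host_mapping() {
    awk -v ip="$2" -v host="$3" '
        /^[[:space:]]*#/ { next }            # skip comment lines
        $1 == ip {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # stop at a trailing comment
                if ($i == host) found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example values from this post; run on the server and on every agent.
if has_host_mapping /etc/hosts 192.168.0.101 cdhdatanode1; then
    echo "mapping present"
else
    echo "mapping missing"
fi
```

The check only looks at the file; it does not prove that the name resolves the same way through DNS.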

ERROR Failed to collect NTP metrics – Cloudera Manager Agent

If you see an error like “Failed to collect NTP metrics” from the Cloudera Manager Agent, the following solution might help. The error appears when no NTP service is installed and running on the host. NTP keeps the system time in sync with network time. The commands below work on CentOS/RHEL systems.

yum install ntp

systemctl enable ntpd

systemctl restart ntpd
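
After the restart it is worth checking that ntpd has actually picked a time source. The sketch below relies on ntpq -pn, which marks the peer currently selected for synchronisation with a leading "*". The helper name ntp_has_sync_peer is my own and not part of any package.

```shell
#!/bin/sh
# ntp_has_sync_peer
# Reads `ntpq -pn` output on stdin and succeeds if some peer line
# starts with "*", the tally code for the selected system peer.
ntp_has_sync_peer() {
    grep -q '^\*'
}

if command -v ntpq >/dev/null 2>&1; then
    if ntpq -pn | ntp_has_sync_peer; then
        echo "ntpd has selected a peer"
    else
        echo "no peer selected yet"
    fi
else
    echo "ntpq not found; is the ntp package installed?"
fi
```

It can take a few minutes after the restart before a peer is selected, so a "no peer selected yet" result right away is not necessarily a problem.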

Installing Cloudera Manager in an existing Hadoop cluster

Cloudera Manager is an infrastructure management and monitoring tool provided by Cloudera. It has become an excellent tool for managing big data infrastructure. I would say it has reduced the pain of administrators by about 80%. Almost everything an administrator needs is integrated into it, and it is very user friendly. Cloudera Manager became this powerful only recently, so a lot of existing clusters are still running without it. If you want to manage an existing cluster with Cloudera Manager, the following steps may help you. You have to completely uninstall the existing Hadoop setup for this. No data loss will happen, because we are not touching any data. The configurations also remain the same. These are just pointers.

1) Stop all the services
2) Back up the Hive metastore, the NameNode metadata and all the other required metastores (e.g. Hue, Oozie)
3) Back up all the configurations
4) Note down the existing storage directories
5) Uninstall all the Hadoop services (never touch the data)
6) Install Cloudera Manager Server and Agent
7) Install all the services (use the same versions as before to make the installation smoother)
8) Add the configurations (use the same configurations as before; there is an option to add XML configs in CM)
9) Point the storage directories in the Cloudera Manager configurations to the existing ones
10) Point the new installation to the existing metastores (Hive, Oozie, Hue etc.)
11) Start all the services (don’t format the NameNode)
12) Test the cluster
12) Test the cluster
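
The backup steps (2 and 3) can be sketched roughly as below. Everything specific here is an assumption for illustration: the helper name backup_dir, the backup location, the configuration paths and the MySQL database name will differ on your cluster. The same helper can be pointed at the NameNode metadata directory you noted down in step 4.

```shell
#!/bin/sh
# backup_dir SRC DEST
# Archives the contents of directory SRC into
# DEST/<path-with-underscores>-<timestamp>.tar.gz and prints the archive path.
backup_dir() {
    src=$1
    dest=$2
    [ -d "$src" ] || { echo "skip: $src is not a directory" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # Build the archive name from the full path so that /etc/hadoop/conf
    # and /etc/hive/conf do not overwrite each other.
    name=$(printf '%s' "$src" | sed 's|^/||; s|/|_|g')
    archive="$dest/$name-$(date +%Y%m%d%H%M%S).tar.gz"
    # -C enters SRC first, so symlinked conf directories are followed.
    tar -czf "$archive" -C "$src" . || return 1
    echo "$archive"
}

# Hypothetical locations; replace them with your own directories.
BACKUP_ROOT=${BACKUP_ROOT:-/tmp/pre-cm-backup}
for dir in /etc/hadoop/conf /etc/hive/conf; do
    backup_dir "$dir" "$BACKUP_ROOT" || true
done

# Assumption: the Hive metastore lives in a MySQL database named "metastore".
# mysqldump -u hive -p metastore > "$BACKUP_ROOT/hive-metastore.sql"
```

Keep the backups on a disk that the uninstall in step 5 will not touch.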

Monitoring Tools for Hadoop Clusters

To get the exact status of server machines, we use monitoring tools. Nowadays a lot of monitoring tools are available with good monitoring and alerting capabilities. I was searching for a good monitoring and alerting tool for Hadoop clusters, and I found some free tools:

1) Ganglia
2) Nagios
3) Zabbix
4) Cloudera Manager
5) Apache Ambari

My observations are as follows.

To use Ambari or Cloudera Manager, the cluster has to be installed with that tool itself.
That means we cannot monitor an existing cluster with these tools as it is.
Ganglia provides good metrics, and we can capture custom metrics with it, so it is very flexible. Hadoop comes with a set of configurations for sending Hadoop metrics to Ganglia; these properties can be seen in the hadoop-metrics.properties file. The new Ganglia web UI is very good, and we can export the metrics as CSV or JSON files, which is a very useful feature. But Ganglia cannot raise alerts, such as sending mail when something goes wrong. This is where Nagios comes in. A Nagios-Ganglia integration is a good setup for monitoring Hadoop clusters, because we get good metrics capture as well as alerting.
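
To see which Hadoop metrics contexts are already wired to Ganglia, the properties file can be scanned with a sketch like the one below. It assumes the older metrics framework, where an enabled context has a line along the lines of dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31. Treat that class name, the file path and the helper name ganglia_contexts as assumptions, and check them against your Hadoop version.

```shell
#!/bin/sh
# ganglia_contexts FILE
# Prints the metrics contexts (dfs, mapred, jvm, ...) whose ".class"
# property names a Ganglia class, ignoring commented-out lines.
ganglia_contexts() {
    grep -v '^[[:space:]]*#' "$1" |
        sed -n 's/^[[:space:]]*\([A-Za-z0-9_]*\)\.class[[:space:]]*=.*[Gg]anglia.*$/\1/p'
}

# Assumed path; adjust to wherever your Hadoop configuration lives.
PROPS=${PROPS:-/etc/hadoop/conf/hadoop-metrics.properties}
if [ -f "$PROPS" ]; then
    ganglia_contexts "$PROPS"
else
    echo "no $PROPS on this host"
fi
```

A context that does not show up here is either commented out or still pointing at a non-Ganglia class.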

Ganglia is free. The base version of Nagios is also free, and it serves our needs.

Zabbix is also a good tool. A lot of production clusters run with Zabbix as the monitoring tool.