To know the exact status of server machines, we use monitoring tools. Nowadays a lot of tools are available with good monitoring and alerting capabilities. I was searching for a good monitoring and alerting tool for Hadoop clusters, and I found some free tools:
4) Cloudera Manager
6) Apache Ambari
My observations are as follows.
To use Ambari or Cloudera Manager, the cluster has to be installed using that tool itself.
That means we cannot monitor an existing cluster with these tools.
Ganglia provides good metrics, and we can capture custom metrics with it. Ganglia is very flexible. Hadoop comes with a set of configurations that can be used for capturing Hadoop metrics using Ganglia. These properties can be seen in the hadoop-metrics.properties file. The new Ganglia web UI is very good, and we can export the metrics as CSV or JSON files, which is a very useful feature. But Ganglia has no alerting capability, such as sending mails in case of issues. Here we can use Nagios. A Nagios-Ganglia integration is a good tool for monitoring Hadoop clusters, because we get good metrics capture as well as alerting.
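To make the Nagios side of this a bit more concrete, here is a minimal sketch in Python of the threshold logic a Nagios-style check could apply to a metric value collected by Ganglia. It follows the usual Nagios plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN). The function name check_metric, the metric name and the thresholds are my own assumptions for illustration. How the value is actually read from Ganglia is left out, so treat this as a sketch and not a complete plugin.

```python
# Exit codes used by Nagios plugins (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_metric(name, value, warn, crit):
    """Return (exit_code, status_line) for a single metric sample.

    Hypothetical helper: higher values are treated as worse, and `value`
    is None when no sample could be read from Ganglia.
    """
    if value is None:
        return UNKNOWN, f"UNKNOWN - no data for {name}"
    if value >= crit:
        return CRITICAL, f"CRITICAL - {name}={value} (>= {crit})"
    if value >= warn:
        return WARNING, f"WARNING - {name}={value} (>= {warn})"
    return OK, f"OK - {name}={value}"


# Example with a made-up metric name and made-up thresholds.
code, line = check_metric("heap_used_percent", 93.0, warn=80, crit=90)
print(line)
```

In a real plugin the script would print the status line and exit with the returned code, so that Nagios can decide whether to send a mail.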
Ganglia is free. The base version of Nagios is also free, and it serves our needs.
Zabbix is also a good tool. A lot of production clusters run with Zabbix as their monitoring tool.