Notification on completion of MapReduce jobs

Heavy MapReduce jobs may run for several hours. When there are several such jobs, checking their status manually is tedious. Wrapping the Java programs in an external script that polls for their status is not a clean approach either, and I consider that kind of design a poor one.

My requirement was to get a notification when MapReduce jobs complete. These are critical jobs, and I did not want to keep checking their status while waiting for them to finish.

Hadoop provides a useful configuration property that solves this problem, and it takes only a few lines of code. Add these three lines to the Driver class:

conf.set("job.end.notification.url", "http://myserverhost:8888/notification/jobId=$jobId?status=$jobStatus");
conf.setInt("job.end.retry.attempts", 3);
conf.setInt("job.end.retry.interval", 1000);

With these properties set, Hadoop sends an HTTP request to the URL given in job.end.notification.url when the job completes. The variables $jobId and $jobStatus in the URL are replaced with the actual values. The other two properties control retries: with the values above, a failed notification is retried up to 3 times, 1000 milliseconds apart.

We need a small piece of code for a webservice that accepts this HTTP request and sends an email. I used Python for the webservice and the email utility because of its simplicity. When a request reaches the webservice, it parses the arguments and calls the email-sending module.

This is a very simple example. Instead of email, we can send other kinds of notifications, such as an SMS or a phone call, or trigger some other application. The property job.end.notification.url is very helpful for tracking asynchronous MapReduce jobs. It is a clean approach because no separate script is running to track the status of the job: the job itself reports its status, and the Python program only collects that status and sends the notification.
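To make the variable substitution concrete, here is a small sketch that mimics it with plain string replacement. The job ID is a made-up example, and the status value SUCCEEDED is an assumption about what Hadoop reports for a successful job (failed or killed jobs would carry a different status string).

```python
# URL template exactly as configured in the Driver class.
TEMPLATE = "http://myserverhost:8888/notification/jobId=$jobId?status=$jobStatus"


def expand_notification_url(template, job_id, job_status):
    """Illustrate the placeholder substitution using plain string replacement."""
    return template.replace("$jobId", job_id).replace("$jobStatus", job_status)


# Hypothetical job ID and status, for illustration only.
url = expand_notification_url(TEMPLATE, "job_201301011200_0001", "SUCCEEDED")
print(url)
# http://myserverhost:8888/notification/jobId=job_201301011200_0001?status=SUCCEEDED
```

With this URL layout the job ID ends up in the last path segment and the status in the query string, so the receiving webservice has to read both places.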

The Python code for the webservice and email notification is given below.
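A minimal sketch of such a webservice, using only the Python standard library, might look like the following. The SMTP host, the sender and recipient addresses, and all function and class names are assumptions made for illustration. The port 8888 and the URL layout come from the configuration shown earlier.

```python
import smtplib
from email.message import EmailMessage
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# Hypothetical settings: replace with your own mail relay and addresses.
SMTP_HOST = "localhost"
SENDER = "hadoop@example.com"
RECIPIENT = "admin@example.com"


def parse_notification(request_path):
    """Extract (job_id, status) from a path such as
    /notification/jobId=job_201301011200_0001?status=SUCCEEDED"""
    parsed = urlparse(request_path)
    last_segment = parsed.path.rsplit("/", 1)[-1]  # e.g. "jobId=job_..."
    job_id = last_segment.partition("=")[2]        # text after the first "="
    status = parse_qs(parsed.query).get("status", ["UNKNOWN"])[0]
    return job_id, status


def build_email(job_id, status):
    """Compose the notification email for one finished job."""
    msg = EmailMessage()
    msg["Subject"] = f"MapReduce job {job_id} finished: {status}"
    msg["From"] = SENDER
    msg["To"] = RECIPIENT
    msg.set_content(f"Job {job_id} completed with status {status}.")
    return msg


def send_email(msg):
    """Hand the message to the (assumed) local SMTP relay."""
    with smtplib.SMTP(SMTP_HOST) as smtp:
        smtp.send_message(msg)


class NotificationHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        job_id, status = parse_notification(self.path)
        try:
            send_email(build_email(job_id, status))
            self.send_response(200)
        except OSError:
            # Mail could not be sent: answer with an error status so the
            # notification can be retried by the sender.
            self.send_response(500)
        self.end_headers()


if __name__ == "__main__":
    # Listen on the port used in job.end.notification.url.
    HTTPServer(("", 8888), NotificationHandler).serve_forever()
```

Keeping the parsing and the mail composition in separate functions means the email step can later be swapped for an SMS gateway or another trigger without touching the HTTP handler.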

About amalgjose
I am an Electrical Engineer by qualification, now I am working as a Software Engineer. I am very much interested in Electrical, Electronics, Mechanical and now in Software fields. I like exploring things in these fields. I like travelling, long drives and very much addicted to music.
