Heavy MapReduce jobs may run for several hours, and several of them may be running at the same time. Checking the status of these jobs manually is a boring task, and wrapping the Java programs in a monitoring script is not a clean approach; I consider such designs poor.
My requirement was to get a notification on completion of my MapReduce jobs. These are critical jobs, and I don’t want to keep checking their status while waiting for them to finish.
Hadoop provides a configuration property that solves this problem, and it takes only a few lines of code. Add these three lines to the Driver class:
conf.set("job.end.notification.url", "http://myserverhost:8888/notification?jobId=$jobId&status=$jobStatus");
conf.setInt("job.end.retry.attempts", 3);
conf.setInt("job.end.retry.interval", 1000);
With these properties set, Hadoop sends an HTTP request to the URL configured in job.end.notification.url when the job completes, replacing the $jobId and $jobStatus placeholders with the actual values. All we need is a small webservice that accepts this request and sends an email; I used Python for both the webservice and the email utility because of its simplicity. When a request arrives, the webservice parses the arguments and calls the email-sending module. This is a very simple example: instead of email, we could trigger other kinds of notifications such as SMS, a phone call, or another application. The job.end.notification.url property is very helpful for tracking asynchronous MapReduce jobs, and the approach is clean because we do not run any extra script to poll the job status. The job itself reports its status; the Python program simply collects it and sends the notification.
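To make the mechanism concrete, here is a minimal Python sketch of what Hadoop effectively does with this property: it substitutes $jobId and $jobStatus into the configured URL and issues an HTTP GET, retrying on failure as job.end.retry.attempts and job.end.retry.interval configure. The function names and the sample job id below are my own illustrations, not part of Hadoop's API.

```python
import time
import urllib.request


def expand_notification_url(url_template, job_id, job_status):
    # Hadoop replaces these two placeholders with the real values
    return (url_template
            .replace("$jobId", job_id)
            .replace("$jobStatus", job_status))


def notify(url_template, job_id, job_status, attempts=3, interval_ms=1000):
    """Send the end-of-job callback, retrying like job.end.retry.* configure."""
    url = expand_notification_url(url_template, job_id, job_status)
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url) as response:
                return response.status  # 200 means the webservice received it
        except OSError:
            time.sleep(interval_ms / 1000.0)
    return None  # all attempts failed


template = "http://myserverhost:8888/notification?jobId=$jobId&status=$jobStatus"
print(expand_notification_url(template, "job_201307091604_0001", "SUCCEEDED"))
```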
The Python code for the webservice and the email notification is given below.
__author__ = 'Amal G Jose'

import smtplib


class SendEmail(object):
    def __init__(self):
        self.email_id = "sender@gmail.com"
        self.email_password = "password"

    def send_email(self, job_id, job_status):
        FROM = self.email_id
        TO = ['recipient@email.com']
        SUBJECT = "Mapreduce Job Status of Job ID : %s" % (job_id)
        TEXT = "Status of JobId : %s was : %s" % (job_id, job_status)
        # Prepare the actual message
        message = "From: %s\nTo: %s\nSubject: %s\n\n%s" % (
            FROM, ", ".join(TO), SUBJECT, TEXT)
        try:
            server = smtplib.SMTP("smtp.gmail.com", 587)
            server.ehlo()
            server.starttls()
            server.login(self.email_id, self.email_password)
            server.sendmail(FROM, TO, message)
            server.close()
            print('Email Sent Successfully')
        except Exception:
            print("Failed to send email")
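Building the message by hand with string formatting is fragile. As a sketch of a sturdier alternative for the message-construction step above, the standard library's email package handles header formatting and encoding; the addresses and job values here are placeholders of my own.

```python
from email.mime.text import MIMEText


def build_message(sender, recipients, job_id, job_status):
    # MIMEText takes care of header formatting and text encoding
    msg = MIMEText("Status of JobId : %s was : %s" % (job_id, job_status))
    msg["Subject"] = "Mapreduce Job Status of Job ID : %s" % job_id
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    return msg.as_string()


print(build_message("sender@gmail.com", ["recipient@email.com"],
                    "job_0001", "SUCCEEDED"))
```

The resulting string can be passed to server.sendmail() exactly like the hand-built message.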
__author__ = 'Amal G Jose'

import tornado.httpserver
import tornado.ioloop
import tornado.web

from SendEmail import SendEmail

PORT = 8888


class JobStatus(tornado.web.RequestHandler):
    def get(self):
        # Hadoop calls this endpoint with jobId and status as query arguments
        job_id = self.get_argument("jobId", default="unknown", strip=True)
        job_status = self.get_argument("status", default="unknown", strip=True)
        notify = SendEmail()
        notify.send_email(job_id, job_status)
        self.write("Notified !!")


if __name__ == "__main__":
    app = tornado.web.Application(handlers=[(r"/notification", JobStatus)])
    http_server = tornado.httpserver.HTTPServer(app)
    http_server.listen(PORT)
    print("Starting server")
    tornado.ioloop.IOLoop.current().start()
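Once the server is running, the callback can be exercised by hand by requesting http://localhost:8888/notification?jobId=test_job&status=SUCCEEDED from a browser. The argument parsing the handler performs can be sketched with the standard library alone; the sample URL and values are my own.

```python
from urllib.parse import urlsplit, parse_qs

# The kind of URL Hadoop requests once the job finishes
url = "http://localhost:8888/notification?jobId=job_0001&status=SUCCEEDED"

parts = urlsplit(url)
args = parse_qs(parts.query)

# Tornado's get_argument() hands the handler these same values
job_id = args.get("jobId", ["unknown"])[0]
job_status = args.get("status", ["unknown"])[0]
print(job_id, job_status)
```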
I believe this is a good approach, but wouldn’t using a cron job be a more fault-tolerant approach?
This is not for scheduling a MapReduce job; it is for getting a notification after the job completes, so the two solve different problems.