MapReduce – textinputformat.record.delimiter

The default input format of Hadoop MapReduce is TextInputFormat, which reads plain text files.

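The default record delimiter is the newline character '\n', so the input is read line by line, with each line becoming one record.
But reading line by line is not suitable in every case, for example when a single logical record spans several lines. In such cases we can make it split records on our own delimiter instead of the default '\n'.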
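
To make the difference concrete, here is a small sketch in plain Java (no Hadoop involved) that splits the same text once on '\n' and once on a custom delimiter. The class name RecordSplitDemo, the sample data and the "||" delimiter are made up for illustration; it only imitates how the input gets cut into records, not how Hadoop implements it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

public class RecordSplitDemo {

    // Cuts the input into records at every occurrence of the delimiter.
    // Pattern.quote makes the delimiter a literal string, not a regex.
    public static List<String> splitRecords(String input, String delimiter) {
        List<String> records = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(Pattern.quote(delimiter));
            while (scanner.hasNext()) {
                records.add(scanner.next());
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // Three logical records, each spanning two lines, separated by "||".
        String input = "id=1\nname=a||id=2\nname=b||id=3\nname=c";

        // Splitting on newline breaks the logical records apart.
        System.out.println(splitRecords(input, "\n").size());  // prints 4

        // Splitting on "||" keeps each multi-line record together.
        System.out.println(splitRecords(input, "||").size());  // prints 3
    }
}
```

With the newline delimiter the second piece is "name=a||id=2", which mixes two records. With "||" each piece is one complete record.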

This can be done by setting the property textinputformat.record.delimiter.

This property can be set either in the program or on the command line when running the job.

The format for setting it in the program (Driver class) is conf.set("textinputformat.record.delimiter", "delimiter"), where the second argument is the delimiter string you want to use.
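
A minimal driver sketch showing where that call could go. It needs the Hadoop client libraries on the classpath and assumes a Hadoop version whose new-API TextInputFormat honours this property. The class name DelimiterDriver, the "||" delimiter and the map-only job setup are placeholders of mine, not something from a real project.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver: runs a map-only job with the default (identity) mapper,
// so the output simply shows how the input was cut into records.
public class DelimiterDriver extends Configured implements Tool {

    private static final String DELIMITER_KEY = "textinputformat.record.delimiter";

    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = getConf();

        // Use "||" (placeholder) unless a delimiter was already given with -D.
        // Set it before creating the Job, because the Job copies the Configuration.
        if (conf.get(DELIMITER_KEY) == null) {
            conf.set(DELIMITER_KEY, "||");
        }

        Job job = Job.getInstance(conf, "custom record delimiter");
        job.setJarByClass(DelimiterDriver.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setNumReduceTasks(0);                    // map-only job
        job.setOutputKeyClass(LongWritable.class);   // byte offset of the record
        job.setOutputValueClass(Text.class);         // the record itself

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner parses generic options such as -D key=value into the Configuration.
        System.exit(ToolRunner.run(new Configuration(), new DelimiterDriver(), args));
    }
}
```

Because the driver goes through ToolRunner, the delimiter can also be passed on the command line as a generic option, placed before the input and output paths, for example -D textinputformat.record.delimiter='||'. The conf.set call in the sketch is only a fallback for when no such option is given.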

About amalgjose
I am an Electrical Engineer by qualification, now working as a Software Engineer. I am very much interested in the Electrical, Electronics, Mechanical and now Software fields, and I like exploring things in these fields. I like travelling and long drives, and I am very much addicted to music.