Compressing the output of sqoop

The output of a sqoop job can be compressed directly. Sqoop job is a mapreduce job, so by setting the mapreduce output compression codec, we can get the output of sqoop compressed. It is very simple, just put an argument to the sqoop command string.

--compression-codec <compression codec>

For snappy compressed output the argument will be as below.

--compression-codec org.apache.hadoop.io.compress.SnappyCodec

For Gzip compression

--compression-codec org.apache.hadoop.io.compress.GzipCodec

For Bzip compression

--compression-codec org.apache.hadoop.io.compress.BZip2Codec
Advertisements

About amalgjose
I am an Electrical Engineer by qualification, now I am working as a Software Engineer. I am very much interested in Electrical, Electronics, Mechanical and now in Software fields. I like exploring things in these fields. I like travelling, long drives and very much addicted to music.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: