Compressing the output of sqoop

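The output of a Sqoop job can be compressed directly. A Sqoop import runs as a MapReduce job, so setting the MapReduce output compression codec compresses the files that Sqoop writes. It only takes an extra argument on the sqoop command line. According to the Sqoop user guide, compression itself is switched on with --compress (or -z), and the codec, gzip by default, is chosen with the argument below.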

--compression-codec <compression codec>

For Snappy-compressed output, the argument is:

--compression-codec org.apache.hadoop.io.compress.SnappyCodec

For gzip compression:

--compression-codec org.apache.hadoop.io.compress.GzipCodec

For bzip2 compression:

--compression-codec org.apache.hadoop.io.compress.BZip2Codec
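
To show where the argument sits in a full command, here is a minimal sketch of an import with compressed output. The JDBC URL, user name, table and target directory are made-up placeholders, and the helper names codec_class and build_import_cmd are only illustrative. The script prints the command instead of running it, and it includes --compress on the assumption that your Sqoop version needs that flag to enable compression, as the Sqoop user guide describes.

```shell
#!/bin/sh
# Sketch: print a complete Sqoop import command with compressed output.

# Map a short codec name (our own shorthand) to the Hadoop codec classes
# listed above.
codec_class() {
  case "$1" in
    snappy) echo "org.apache.hadoop.io.compress.SnappyCodec" ;;
    gzip)   echo "org.apache.hadoop.io.compress.GzipCodec" ;;
    bzip2)  echo "org.apache.hadoop.io.compress.BZip2Codec" ;;
    *)      echo "unknown codec: $1" >&2; return 1 ;;
  esac
}

# Build the command line. All connection values are hypothetical
# placeholders; replace them with your own.
build_import_cmd() {
  codec=$(codec_class "$1") || return 1
  echo "sqoop import" \
    "--connect jdbc:mysql://dbhost/salesdb" \
    "--username sqoop_user -P" \
    "--table orders" \
    "--target-dir /data/orders_compressed" \
    "--compress" \
    "--compression-codec $codec"
}

# Dry run: print the command rather than executing it.
build_import_cmd snappy
```

Once the printed command looks right, it can be copied to a shell on a machine where Sqoop is installed. Swapping snappy for gzip or bzip2 changes only the codec class at the end of the command.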