Pig – Local and Distributed Execution Modes

Pig currently has two execution environments:

  • Local execution in a single JVM
  • Distributed execution on a Hadoop cluster

Local mode

In local mode, Pig runs in a single JVM and uses the local file system, so no Hadoop cluster is needed. The execution type is set with the -x or -exectype option. To enter local execution mode, type the command below in the terminal. You will see output similar to the following and then enter the grunt shell. The INFO logs show that Pig is using the local file system (file:///).


pig -x local

 

2013-07-10 16:46:56,344 [main] INFO  org.apache.pig.Main - Apache Pig version 0.10.0-cdh4.1.2 (rexported) compiled Nov 01 2012, 18:38:58

2013-07-10 16:46:56,345 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/amal_george/pig_1373455016342.log

2013-07-10 16:46:56,500 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at:file:///

grunt>
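
As a rough sketch of how the -x option selects the execution type, the shell helper below builds the command line for either mode. The function name pig_command is a hypothetical helper, not part of Pig; local and mapreduce are the values passed to -x. The sketch only prints the command, so it can be tried on a machine without Pig installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pig): print the pig command line
# for the requested execution type.
pig_command() {
    case "$1" in
        local|mapreduce)
            printf 'pig -x %s\n' "$1"
            ;;
        *)
            echo "unknown execution type: $1" >&2
            return 1
            ;;
    esac
}

pig_command local      # single JVM, local file system
pig_command mapreduce  # Hadoop cluster, HDFS
```

To start the grunt shell, run the printed command directly in the terminal.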

Distributed mode

On a machine with Pig installed, typing pig in the terminal starts distributed execution mode by default. In distributed mode, jobs run as MapReduce jobs and use HDFS as the file system, so a Hadoop cluster is required.

When you type pig in the terminal, you will see output similar to the following and then enter the grunt shell. The INFO logs show that Pig is connecting to a cluster: the file system is at hdfs://master:9000 and the map-reduce job tracker is at master:9001.

2013-07-10 16:47:52,510 [main] INFO  org.apache.pig.Main - Apache Pig version 0.10.0-cdh4.1.2 (rexported) compiled Nov 01 2012, 18:38:58

2013-07-10 16:47:52,511 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/amal_george/pig_1373455072507.log
2013-07-10 16:47:52,797 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://master:9000

2013-07-10 16:47:53,487 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: master:9001

grunt>
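
The two log excerpts differ in the URI reported on the "Connecting to hadoop file system at:" line: file:/// in local mode and hdfs://master:9000 in distributed mode. A minimal sketch of checking this from a script follows. The name detect_mode is a hypothetical helper, and the sketch assumes the log message format shown in the excerpts above.

```shell
#!/bin/sh
# Hypothetical helper: classify a Pig startup log line by the file system
# URI it reports. Assumes the HExecutionEngine message format shown above.
detect_mode() {
    case "$1" in
        *"Connecting to hadoop file system at:"*"file:///"*)
            echo "local" ;;
        *"Connecting to hadoop file system at:"*"hdfs://"*)
            echo "distributed" ;;
        *)
            echo "unknown" ;;
    esac
}

# Sample lines shortened from the log excerpts above.
detect_mode "HExecutionEngine - Connecting to hadoop file system at:file:///"
detect_mode "HExecutionEngine - Connecting to hadoop file system at: hdfs://master:9000"
```

Lines that do not mention the file system, such as the version banner, are reported as unknown.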