Pig currently has two execution environments:
- Local execution in a single JVM
- Distributed execution on a Hadoop cluster.
Local Mode
In local mode, Pig runs in a single JVM and uses the local file system, so no Hadoop cluster is needed. The execution type is set with the -x or -exectype option. To enter local execution mode, type the command below in the terminal. You will see output similar to the following and then enter the Grunt shell. The INFO logs show that Pig is using the local file system (file:///).
pig -x local
2013-07-10 16:46:56,344 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.0-cdh4.1.2 (rexported) compiled Nov 01 2012, 18:38:58
2013-07-10 16:46:56,345 [main] INFO org.apache.pig.Main - Logging error messages to: /home/amal_george/pig_1373455016342.log
2013-07-10 16:46:56,500 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at:file:///
grunt>
Distributed Mode
On a machine with Pig installed, typing pig in the terminal starts distributed execution mode by default. In distributed mode, jobs run as MapReduce jobs and use HDFS as the file system, so a Hadoop cluster is required to run Pig in this mode.
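Because distributed mode is the default, it can also be requested explicitly with the same -x option used for local mode. The sketch below assumes the value mapreduce, which is the name Pig 0.10 uses for distributed execution. The function name pig_command_for is hypothetical; it only prints the command line so it can be inspected before being run.

```shell
#!/bin/sh
# Hypothetical helper: print the pig command line for a requested mode.
# "distributed" maps to "-x mapreduce" (assumed Pig 0.10 exectype name).
pig_command_for() {
    case "$1" in
        local)       echo "pig -x local" ;;
        distributed) echo "pig -x mapreduce" ;;
        *)           echo "unknown mode: $1" >&2; return 1 ;;
    esac
}
```

For example, pig_command_for distributed prints pig -x mapreduce, which should behave the same as typing plain pig on a machine configured for a cluster.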
When you type pig in the terminal, you will see output similar to the following and then enter the Grunt shell. The INFO logs show that Pig is connecting to a cluster.
2013-07-10 16:47:52,510 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.0-cdh4.1.2 (rexported) compiled Nov 01 2012, 18:38:58
2013-07-10 16:47:52,511 [main] INFO org.apache.pig.Main - Logging error messages to: /home/amal_george/pig_1373455072507.log
2013-07-10 16:47:52,797 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://master:9000
2013-07-10 16:47:53,487 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: master:9001
grunt>
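The two startup logs differ mainly in the file system URI that follows "Connecting to hadoop file system at:", so that line is a quick way to confirm which mode a session is in. The following is a minimal sketch, assuming the log format shown above. The function name pig_mode_from_log is hypothetical and not part of Pig.

```shell
#!/bin/sh
# Hypothetical helper: classify Pig's execution mode from its startup log.
# Reads log text on stdin and prints "local", "distributed", or "unknown".
pig_mode_from_log() {
    # Pull out the URI that follows the "file system at:" message
    # (the space after the colon is optional in the sample logs).
    uri=$(sed -n 's/.*Connecting to hadoop file system at: *\([^ ]*\).*/\1/p' | head -n 1)
    case "$uri" in
        file:*) echo "local" ;;        # local file system, single JVM
        hdfs:*) echo "distributed" ;;  # HDFS on a Hadoop cluster
        *)      echo "unknown" ;;      # message missing or unrecognized scheme
    esac
}
```

A saved copy of the startup output can be piped into it, for example: pig_mode_from_log < startup.log, where startup.log is a placeholder file name.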