Role of Automation & AI in Modern Agriculture

Food is one of the fundamental needs of any living organism. For humans in the current age, food is something that has to be purchased from shops. Once humans became civilized, freely available food became a commodity. In earlier ages, the whole idea behind work and salary was food; later it was food, medicine and shelter. Now the priorities have changed, and we are all running behind a fast-moving world.

Now everyone is busy, and getting clean, good food is very difficult.

  • Vegetables are full of pesticides & chemicals
  • Meat is laced with antibiotics
  • Rice & grains contain harmful chemicals; people even use plastic-like materials to manufacture imitation grains
  • Water and water bodies are polluted
  • Air is polluted
  • Soil is polluted

Everything is polluted….!!!

In general, the food we eat is not good for our health and may even take our lives. What is the solution?

People, including me, are busy with work and have no time for cultivation. I come from an agricultural background and also have farmland, but it all lies unused now. Food production is dropping drastically, basically for the following reasons.

  • The cost of farming is high.
  • No steady income.
  • Lack of stability in the prices and markets for agricultural products.
  • The major share goes to middlemen.
  • High labour cost. Most conventional farming needs a lot of manpower, and manpower is expensive in today's society.
  • Poor social status.
  • Dependency on climate and sudden climatic changes.

Because of all these factors, the new generation is not even thinking about agriculture. I too moved away from agriculture for the same reasons.

Currently I stay in an apartment surrounded by farmland. When I observed the cultivation process and the deadly pesticides being used, I was shocked. None of these vegetables are in an edible condition. I started thinking and researching ways to control this.

One of the images that made me rethink is shared below. The picture shows two oxen with their mouths masked. The reason for the masks is to keep them from eating the cabbage leaves in the farmland while ploughing the fields, because those cabbages were sprayed with highly poisonous chemicals.

[Image: two oxen ploughing a field with their mouths masked]

One of my goals this year is to start an organic farm. Automation is required to improve efficiency and reduce human effort. My blog posts will now include updates on the progress of my organic farm and my learning.

Making Hive Usable by Multiple Users in a Hadoop Cluster

By default, Hive operations are limited to the superuser. If you are using CDH, the superuser is hdfs.
The reason for this is the permission on the Hive warehouse directory:
by default, read/write permission on this directory is given only to the superuser.
So if we want to use Hive from multiple users, we have to change the permission of this directory accordingly.
If you want to make Hive usable by all users, run the following commands.

hadoop fs -chmod -R 777 /user/hive/warehouse

hadoop fs -chmod -R 777 /tmp

If you organize the users into specific groups, you can instead grant read/write permission to the group only, i.e. 775.
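
For example, a minimal sketch assuming the Hive users belong to a group called hiveusers (the group name here is only an illustration):

hadoop fs -chgrp -R hiveusers /user/hive/warehouse    # hiveusers is a hypothetical group name

hadoop fs -chmod -R 775 /user/hive/warehouse    # owner and group get full access, others read/execute only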

Rhipe Installation

Rhipe was first developed by Saptarshi Guha.
Rhipe needs R and Hadoop, so first install R and Hadoop. The installation of R and Hadoop is well explained in my previous posts. The latest version of Rhipe as of now is Rhipe-0.73.1, and the latest available version of R is R-3.0.0. If you are using CDH4 (the Cloudera distribution of Hadoop), use Rhipe-0.73 or a later version, because older versions may not work with CDH4.
Rhipe is the R and Hadoop Integrated Programming Environment. It is very good for statistical and analytical calculations on very large data: because R is integrated with Hadoop, the processing runs in distributed mode, i.e. as MapReduce.
Further explanation of Rhipe is available at http://www.datadr.org/

Prerequisites

Hadoop, R, Protocol Buffers and rJava should be installed before installing Rhipe.
We are installing Rhipe in a Hadoop cluster, so a submitted job may execute on any of the tasktracker nodes. Therefore we have to install R and Rhipe on all the tasktracker nodes; otherwise you will get an exception such as "Cannot find R". A quick way to check this across the nodes is sketched below.
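
A rough check, assuming passwordless SSH to the nodes and the hypothetical hostnames node01, node02 and node03 (replace them with your tasktracker nodes):

for host in node01 node02 node03; do    # hostnames are placeholders
  ssh "$host" 'which R || echo "R not found on this host"'
done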

Installing Protocol Buffers

Download Protocol Buffers 2.4.1 from the link below, then extract, build and install it.

http://protobuf.googlecode.com/files/protobuf-2.4.1.tar.gz

tar -xzvf protobuf-2.4.1.tar.gz

chmod -R 755 protobuf-2.4.1

cd protobuf-2.4.1

./configure

make

make install

Set the environment variable PKG_CONFIG_PATH

nano /etc/bashrc

export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig

Save and exit, then run source /etc/bashrc (or open a new shell) so that the variable takes effect.

Then execute the following commands to check the installation.

pkg-config --modversion protobuf

This will show the version number 2.4.1
Then execute

pkg-config --libs protobuf

This will display the following things

-pthread -L/usr/local/lib -lprotobuf -lz -lpthread

If these two commands work fine, protobuf is properly installed.

Set the environment variables for hadoop

For example

nano /etc/bashrc

export HADOOP_HOME=/usr/lib/hadoop

export HADOOP_BIN=/usr/lib/hadoop/bin

export HADOOP_CONF_DIR=/etc/hadoop/conf

save and exit
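
To apply the variables in the current shell and do a quick sanity check:

source /etc/bashrc

echo $HADOOP_HOME

echo $HADOOP_CONF_DIR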

Then register /usr/local/lib with the dynamic linker so that the protobuf shared library can be found.


cd /etc/ld.so.conf.d/

nano Protobuf-x86.conf

/usr/local/lib   # add this value as the content of Protobuf-x86.conf

Save and exit

/sbin/ldconfig

Installing rJava

Download the rJava tarball from the link below.

http://cran.r-project.org/web/packages/rJava/index.html

The latest version of rJava available as of now is rJava_0.9-4.tar.gz

Install rJava using the following command.

R CMD INSTALL rJava_0.9-4.tar.gz
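
If the install fails because R cannot find Java, a common fix (assuming a JDK is already installed on the machine) is to let R re-detect the Java configuration and then retry the install:

sudo R CMD javareconf

R CMD INSTALL rJava_0.9-4.tar.gz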

Installing Rhipe

Rhipe can be downloaded from the following link
https://github.com/saptarshiguha/RHIPE/blob/master/code/Rhipe_0.73.1.tar.gz

R CMD INSTALL Rhipe_0.73.1.tar.gz

This will install Rhipe.

After this, type R in the terminal. You will enter the R prompt.

Then type

library(Rhipe)

#This will display

------------------------------------------------

| Please call rhinit() else RHIPE will not run |

------------------------------------------------

rhinit()

#This will display

Rhipe: Detected CDH4 jar files, using RhipeCDH4.jar
Initializing Rhipe v0.73
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/client-0.20/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/client/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Initializing mapfile caches

Now you can execute your Rhipe scripts.

Setting Up Multiple Users in Hadoop Clusters

 

Need for multiple users

In Hadoop we run different tasks and store data in HDFS.

If several users run tasks using the same user account, it is difficult to trace the jobs and track the work done by each user.

The other issue is security.

If everyone is given the same user account, all users have the same privileges: anyone can access, modify, run jobs on, and even delete anyone else's data.

This is a very serious issue.

For this we need to create multiple user accounts.

Benefits of creating multiple users

1) A user's directories and files cannot be modified by other users.

2) Other users cannot add new files to a user's directory.

3) Other users cannot perform any tasks (MapReduce etc.) on a user's files.

In short, data is safe and accessible only to the assigned user and the superuser.

Steps for setting up multiple user accounts

To add a new user capable of performing Hadoop operations, do the following steps.

Step 1

Creating a New User

For Ubuntu

sudo adduser --ingroup <groupname> <username>

For RedHat variants

useradd -g <groupname> <username>

passwd <username>

Then enter the user details and password.
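
For example, on a RedHat variant, with a hypothetical group hadoopgroup and a hypothetical user analyst1 (both names are only illustrations):

groupadd hadoopgroup    # create the group if it does not exist yet

useradd -g hadoopgroup analyst1

passwd analyst1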

Step 2

We need to change the permission of a directory in HDFS where Hadoop stores its temporary data.

Open the core-site.xml file

Find the value of hadoop.tmp.dir.

In my core-site.xml, it is /app/hadoop/tmp. In the following steps, I will be using /app/hadoop/tmp as my directory for storing Hadoop data (i.e. the value of hadoop.tmp.dir).

Then, from the superuser account, do the following step. Mode 1777 sets the sticky bit, so every user can create files in the staging directory but cannot delete or move files owned by other users.

hadoop fs -chmod -R 1777 /app/hadoop/tmp/mapred/staging

Step 3

The next step is to give write permission to our user group on hadoop.tmp.dir (here /app/hadoop/tmp; open core-site.xml to get the path of hadoop.tmp.dir). This is a permission change on the local filesystem, so it should be done only on the machine (node) where the new user is added.

chmod 777 /app/hadoop/tmp

Step 4

The next step is to create a directory structure in HDFS for the new user.

For that from the superuser, create a directory structure.

Eg: hadoop fs -mkdir /user/username/

Step 5

With this alone we will not be able to run MapReduce programs, because the ownership of the newly created directory structure is with the superuser. So change the ownership of the newly created directory in HDFS to the new user.

hadoop fs -chown -R username:groupname <directory to access in HDFS>

Eg: hadoop fs -chown -R username:groupname /user/username/

Step 6

Log in as the new user and perform Hadoop jobs.

su - username

Note: Run Hadoop tasks only in the assigned HDFS path, i.e. /user/username. A quick end-to-end check is sketched below.
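
To verify the whole setup, you can log in as the new user and run a small test job. A rough sketch, reusing the hypothetical user analyst1 from above; the path of the Hadoop examples jar varies by distribution:

su - analyst1

hadoop fs -mkdir /user/analyst1/input

hadoop fs -put sample.txt /user/analyst1/input/    # sample.txt is any small local text file

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount /user/analyst1/input /user/analyst1/output    # jar path varies by distribution
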
Enjoy…. 🙂