Changing the Hive Warehouse Directory

By default, the Hive warehouse directory is located at the HDFS path /user/hive/warehouse.

Everyone using Hive should have appropriate read/write permissions on the warehouse directory.

If you want to change its location, add the following property to hive-site.xml.

<property>
   <name>hive.metastore.warehouse.dir</name>
   <value>/user/hivestore/warehouse</value>
   <description>location of the warehouse directory</description>
 </property>
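
If the new location does not exist yet, it has to be created in HDFS and opened up to the Hive users. The following is a minimal sketch, assuming Hadoop 2 or later (where -mkdir accepts -p) and that the hadoop command is on the PATH. The helper name create_warehouse_dir is hypothetical, and group write access is only one possible way to grant the permissions.

```shell
# Sketch: create the warehouse directory in HDFS and make it group-writable.
# create_warehouse_dir is a hypothetical helper, not part of Hive or Hadoop.
create_warehouse_dir() {
    # Default to the path used in the property above.
    dir="${1:-/user/hivestore/warehouse}"
    # -p creates any missing parent directories (Hadoop 2 and later).
    hadoop fs -mkdir -p "$dir" || return 1
    # Allow members of the owning group to write to the warehouse.
    hadoop fs -chmod g+w "$dir"
}
```

Calling create_warehouse_dir with no argument uses the path from the property above; pass a different HDFS path as the first argument if your hive-site.xml points elsewhere.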

Hive Installation with MySQL metastore

Hive Installation

  • Download the Hive tarball (alternatively, install Hive with apt-get or yum, or install the Hive RPM manually).
  • Extract the tarball on a slave node.
  • Set HIVE_HOME and PATH in the /etc/bash.bashrc file.
  • Log out and log in again.
  • Type hive in a terminal.
  • The Hive shell should start.
  • This is just a trial installation. The metastore database used here is Derby, which is the default. It does not support a multiuser setup: with the embedded Derby database, only one user can use Hive at a time.
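
The environment step above can be sketched as the following lines appended to /etc/bash.bashrc. The path /usr/local/hive is an assumption for illustration; use the directory where the tarball was actually extracted.

```shell
# Lines to append to /etc/bash.bashrc.
# /usr/local/hive is an assumed extraction directory; replace it with yours.
export HIVE_HOME="/usr/local/hive"
# Put the hive launcher script on the command search path.
export PATH="$PATH:$HIVE_HOME/bin"
```

After logging in again, echo $HIVE_HOME should print the directory you configured.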

Changing the Hive metastore database from Derby to MySQL

  • Set up a MySQL database.
    • In the MySQL command line, type:
CREATE USER 'hive_user'@'%' IDENTIFIED BY 'hive@123';
GRANT ALL PRIVILEGES ON *.* TO 'hive_user'@'%' WITH GRANT OPTION;
  • To connect to MySQL as a particular user, use the command:
C:/Mysql/bin>mysql -uUsername -pPassword
  • Note the IP address, port, username, and password of the MySQL setup.
  • Create an inbound firewall rule on the machine where MySQL is installed. Otherwise the firewall will block connections to the MySQL port, and the JDBC connection from Hive will fail.
  • Add the mysql-connector-java-5.0.5.jar file to the lib directory of the Hive installation.
  • Create a hive-site.xml file in the conf directory of the Hive installation and fill it with the necessary properties. This file may not be present in the conf directory by default, in which case create a new one.
  • Restart Hive.
  • The multiuser setup is now ready.
  • If you want JDBC connectivity, start the Thrift server:
hive --service hiveserver &

* The trailing & runs the Hive server process in the background.
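
Before restarting Hive, it can help to confirm from the Hive machine that the firewall rule actually lets traffic through to MySQL. The following is a rough sketch, assuming bash (built with /dev/tcp support) and the coreutils timeout command are available. The helper name mysql_port_open is hypothetical, and 3306 is only the MySQL default port; use whatever port your server listens on.

```shell
# Sketch: test whether a TCP port on the MySQL machine is reachable.
# mysql_port_open is a hypothetical helper, not part of Hive or MySQL.
mysql_port_open() {
    host="$1"
    port="${2:-3306}"   # 3306 is the MySQL default; adjust if yours differs
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT.
    # timeout gives up after 5 seconds if a firewall silently drops packets.
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

For example, mysql_port_open followed by the MySQL machine's IP address returns success (exit status 0) when the port accepts connections and a non-zero status when the connection is refused or times out.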

A sample hive-site.xml file is shown below.

<configuration>

<property>
   <name>hive.metastore.local</name>
   <value>true</value>
</property>

<!-- Replace MYSQL_IP_ADDRESS with the IP address of the MySQL machine. -->
<property>
   <name>javax.jdo.option.ConnectionURL</name>
   <value>jdbc:mysql://MYSQL_IP_ADDRESS/hive?createDatabaseIfNotExist=true</value>
</property>

<property>
   <name>javax.jdo.option.ConnectionDriverName</name>
   <value>com.mysql.jdbc.Driver</value>
</property>

<property>
   <name>javax.jdo.option.ConnectionUserName</name>
   <value>hive_user</value>
</property>

<property>
   <name>javax.jdo.option.ConnectionPassword</name>
   <value>hive@123</value>
</property>

</configuration>