Advertisements

Hue Error – DatabaseError: database is locked

You may face this error in Hue while using Impala or Hive. This is because of the lock happening in the backend database used in Hue. Hue uses a backend database to store all the metadata and history. By default it uses sqlite, which is not suitable for multiuser environments. The usage of the sqlite causes this issue.

We can resolve this by using mysql, postgresql or oracle database as the metastore for hue.

Advertisements

Migrating hive from one hadoop cluster to another cluster

Recently I have migrated a hive installation from one cluster to another cluster. I havent find any
document regarding this migration. So I did it with my experience and knowledge.
Hive stores the metadata in some databases, ie it stores the data about the tables in some database.
For developement/ production grade installations, we normally use mysql/oracle/postgresql databases.
Here I am explaining about the migration of the hive with its metastore database in mysql.
The metadata contains the information about the tables. The contents of the table are stored in hdfs.
So the metadata contains hdfs uri and other details. So if we migrate hive from one cluster to another
cluster, we have to point the metadata to the hdfs of new cluster. If we haven’t do this, it will point
to the hdfs of older cluster.

For migrating a hive installation, we have to do the following things.

1) Install hive in the new hadoop cluster
2) Transfer the data present in the hive metastore directory (/user/hive/warehouse) to the new hadoop
cluster
3) take the mysql metastore dump.
4) Install mysql in the new hadoop cluster
5) Open the hive mysql-metastore dump using text readers such as notepad, notepad++ etc and search for
hdfs://ip-address-old-namenode:port and replace with hdfs://ip-address-new-namenode:port and save it.

Where ip-address-old-namenode is the ipaddress of namenode of old hadoop cluster and ip-address-
new-namenode
is the ipaddress of namenode of new hadoop cluster.

6) After doing the above steps, restore the editted mysql dump into the mysql of new hadoop cluster.
7) Configure hive as normal and do the hive schema upgradations if needed.

This is a solution that I discovered when I faced the migration issues. I dont know whether any other
standard methods are available.
This worked for me perfectly. 🙂

Hive Installation with MySQL metastore

Hive Installation

  • Download the hive tar ball. ( otherwise do apt-get/ yum install/ install the hive rpm manually)
  • Extract the tar ball in a slave node.
  • Set HIVE_HOME  and PATH in /etc/bash.bashrc file.
  • Logout and login again.
  • Type hive in terminal.
  • Hive will work.
  • This is just a play around installation. Here the database used is derby database. This is the default database, but this will not work for multiuser setup. Multiuser means only one can use hive at a time.

Changing the Hive metastore database from derby to MySql

  • Setup a MySql database.
    • In MySql command line type
CREATE USER 'hive_user'@'%' IDENTIFIED BY 'hive@123';
GRANT ALL PRIVILEGES ON *.* TO 'hive_user'@'%' WITH GRANT OPTION;
  • For connecting to a user account in MySql use the command.
C:/Mysql/bin>mysql –uUsername -pPassword
  • Remember the ip address, port, username, password of the mysql setup.
  • Make a firewall inbound rule in the machine where mysql is installed. Otherwise connection to the mysql port will be blocked by the firewall. So jdbc connectivity will not happen.
  • Add mysql-connector-java-5.0.5.jar  file to the lib directory inside hive installation.
  • Create hive-site.xml file in the conf directory of hive installation. Fill that file with all the necessary properties. This file may not be present in the conf directory. So create a new file.
  • Restart Hive.
  • Now multiuser setup will be ready.
  • If you want JDBC Connectivity, thrift server should be started
 hive  --service hiveserver&

*& is used for running that hive service server process in the background.

Sample hive-site.xml file is shown below

<configuration>

<property>

<name>hive.metastore.local</name>

<value>true</value>

</property>

<property>

<name>javax.jdo.option.ConnectionURL</name>

<value>jdbc:mysql://<ip address of mysql machine>/hive?createDatabaseIfNotExist=true</value>

</property>

<property>

<name>javax.jdo.option.ConnectionDriverName</name>

<value>com.mysql.jdbc.Driver</value>

</property>

<property>

<name>javax.jdo.option.ConnectionUserName</name>

<value>hive_user</value>

</property>

<property>

<name>javax.jdo.option.ConnectionPassword</name>

<value>hive@123</value>

</property>

</configuration>