Hive Installation on Ubuntu 14.04 With Pre-Built Derby Database

posted on Nov 20th, 2016

Apache Hive

Apache Hive is a data warehouse infrastructure built on top of Hadoop that provides data summarization, query, and analysis. Hive offers an SQL-like interface for querying data stored in the various databases and file systems that integrate with Hadoop. Without Hive, traditional SQL queries would have to be implemented in the MapReduce Java API to run over distributed data. Hive provides the necessary abstraction by translating SQL-like queries (HiveQL) into jobs against the underlying Java API, so there is no need to write queries in the low-level API directly. Since most data warehousing applications work with SQL-based query languages, Hive makes it easy to port SQL-based applications to Hadoop.

Prerequisites

1) A machine with Ubuntu 14.04 LTS operating system

2) Apache Hadoop 2.6.4 pre-installed (How to install Hadoop on Ubuntu 14.04)

3) Apache Hive 2.1.0 Software (Download Here)

Hive Installation With Pre-Built Derby Database

NOTE

Hive versions 1.2 onward require Java 1.7 or newer. Hive versions 0.14 to 1.1 also work with Java 1.6.

Hadoop 2.x is preferred; Hadoop 1.x is not supported from Hive 2.0.0 onward. Hive versions up to 0.13 also supported Hadoop 0.20.x and 0.23.x.
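Before proceeding, you can check which Java version is installed (the exact output will vary with your JDK):

$ java -version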


Hive Installation Steps

Step 1 - Create the Hive directory. Open a new terminal (Ctrl + Alt + T) and enter the following command.

$ sudo mkdir /usr/local/hive

Step 2 - Change the ownership and permissions of the directory /usr/local/hive. Here 'hduser' is an Ubuntu username.

$ sudo chown -R hduser /usr/local/hive
$ sudo chmod -R 755 /usr/local/hive

Step 3 - Switch user. The su command lets you execute commands with the privileges of another user account; here, switch to 'hduser'.

$ su hduser

Step 4 - Change to the directory containing the download. In this example the downloaded apache-hive-2.1.0-bin.tar.gz file is in /home/hduser/Desktop; it may be in your Downloads folder instead, so adjust the path accordingly.

$ cd /home/hduser/Desktop/

Step 5 - Untar the apache-hive-2.1.0-bin.tar.gz file.

$ tar xzf apache-hive-2.1.0-bin.tar.gz

Step 6 - Move the contents of the apache-hive-2.1.0-bin folder to /usr/local/hive.

$ mv apache-hive-2.1.0-bin/* /usr/local/hive

Step 7 - Edit the $HOME/.bashrc file by adding the Hive path.

$ sudo gedit $HOME/.bashrc

Add the following lines to the $HOME/.bashrc file:

export HIVE_HOME=/usr/local/hive
export PATH=$HIVE_HOME/bin:$HIVE_HOME/lib:$PATH

Step 8 - Reload your changed $HOME/.bashrc settings

$ source $HOME/.bashrc
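To confirm that the variables were picked up, you can print HIVE_HOME and check that the hive launcher now resolves on the PATH:

$ echo $HIVE_HOME
$ which hive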

Step 9 - Change the directory to /usr/local/hive/conf

$ cd $HIVE_HOME/conf

Step 10 - Copy the default hive-env.sh.template to hive-env.sh

$ cp hive-env.sh.template hive-env.sh

Step 11 - Edit hive-env.sh file.

$ gedit hive-env.sh

Step 12 - Add the lines below to the hive-env.sh file. Save and close.

export HADOOP_HOME=/usr/local/hadoop
export HIVE_CONF_DIR=$HIVE_CONF_DIR
export HIVE_AUX_JARS_PATH=$HIVE_AUX_JARS_PATH

Step 13 - Copy the default hive-default.xml.template to hive-site.xml

$ cp hive-default.xml.template hive-site.xml

Step 14 - Edit hive-site.xml file.

$ gedit hive-site.xml

Step 15 - Add or update the properties below in the hive-site.xml file. Save and close.

  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
    <description>Disables metastore schema version verification, avoiding errors caused by an existing metastore_db</description>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission.</description>
  </property>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/tmp/${user.name}</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/tmp/${user.name}_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>
  <property>
    <name>hive.scratch.dir.permission</name>
    <value>733</value>
    <description>The permission for the user specific scratch directories that get created.</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>

Step 16 - Change the directory to /usr/local/hadoop/sbin

$ cd /usr/local/hadoop/sbin

Step 17 - Start all Hadoop daemons. (In Hadoop 2.x, start-all.sh is deprecated; you can equivalently run start-dfs.sh followed by start-yarn.sh.)

$ start-all.sh
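You can verify that the daemons started with the jps tool from the JDK; you should see entries such as NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager (process IDs will differ):

$ jps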

Step 18 - Before you can create a table in Hive, use the HDFS commands below to create /tmp and /user/hive/warehouse (the default hive.metastore.warehouse.dir) and make them group-writable (chmod g+w).

$ hdfs dfs -mkdir -p /tmp
$ hdfs dfs -mkdir -p /user/hive/warehouse
$ hdfs dfs -chmod g+w /tmp
$ hdfs dfs -chmod g+w /user/hive/warehouse

Step 19 - Change the directory to /usr/local/hive/bin

$ cd $HIVE_HOME/bin

Step 20 - Run the schematool command below as a one-time initialization step, using "derby" as the database type.

$ schematool -initSchema -dbType derby


Step 21 - Start the Hive command line interface (CLI) from the shell.

$ ./hive

Step 22 - From the Hive prompt, list all the tables present in the Derby-backed metastore.

hive> show tables;
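As an end-to-end sanity check, you can create and then drop a throwaway table from the Hive prompt (the table name test_table here is just an example):

hive> CREATE TABLE test_table (id INT, name STRING);
hive> SHOW TABLES;
hive> DROP TABLE test_table;

If all three statements complete without errors, the metastore and the HDFS warehouse directory are both working.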

