Installation of Hive is pretty straightforward. With minimal chit-chat, I will get down to business for ya!
Prerequisites
Sun Java 6
Hadoop requires Sun Java 5.0.x. However, the Hive wiki mentions a prerequisite of Sun Java 6.0. Thus we will stick to Sun Java 6.0.
Hadoop (0.17.x – 0.19.x)
We must have Hadoop already up and running (support for 0.20.x is still in progress, so 0.17.x to 0.19.x is preferable).
Note:
a) For the purpose of this tutorial, we will be referring to a single-node Hadoop installation
SVN
SVN, aka Subversion, is an open-source version control system. Most of the Apache projects are hosted over SVN, so it's a good idea to have it on your system if you don't already. For the current tutorial, you will need it to grab the code out of the Hive SVN repository.
Download it from: http://subversion.tigris.org/
Ant
Ant, or Apache Ant, is a Java-based build tool. In the present context, you will need it to build the checked-out Hive code.
Download it from: http://ant.apache.org/
Downloading and Building Hive
Hive is available via SVN at: http://svn.apache.org/repos/asf/hadoop/hive/trunk
We will first check out Hive's code:
svn co http://svn.apache.org/repos/asf/hadoop/hive/trunk hive
This will put the contents of Hive's trunk (Hive's development repository) into your local 'hive' directory.
Now, we will build the downloaded code
cd hive
ant -Dhadoop.version="<your-hadoop-version>" package
For example:
ant -Dhadoop.version="0.19.2" package
Your built code is now in the build/dist directory
cd build/dist
On 'ls' you will see the following content:
ls
README.txt
bin/ (all the shell scripts)
lib/ (required jar files)
conf/ (configuration files)
examples/ (sample input and query files)
The "build/dist/" directory is your Hive installation, and moving further we are going to call it Hive Home.
Let us set an environment variable for our Hive Home too:
export HIVE_HOME=<some path>/build/dist
For example:
export HIVE_HOME=/data/build/dist
Hadoop Side Changes
Hive uses Hadoop, which means:
1. you must have Hadoop in your path, OR
2. export HADOOP_HOME=<hadoop-install-dir>
In addition, you must create /tmp and /user/hive/warehouse (aka hive.metastore.warehouse.dir) in HDFS and make them group-writable (chmod g+w) before a table can be created in Hive.
Commands to perform these changes:
$HADOOP_HOME/bin/hadoop fs -mkdir /tmp
$HADOOP_HOME/bin/hadoop fs -mkdir /user/hive/warehouse
$HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp
$HADOOP_HOME/bin/hadoop fs -chmod g+w /user/hive/warehouse
Running Hive
Now, you are all set to run Hive for yourself! Invoke the command-line interface (CLI) from the shell:
$HIVE_HOME/bin/hive
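As a quick sanity check, you can send a couple of statements through the CLI. Below is a minimal sketch of a first session; the table name `pokes` is purely illustrative (borrowed from Hive's own getting-started examples), and the actual `hive -e` invocation is left commented out since it needs Hadoop up and HIVE_HOME exported as above:

```shell
#!/bin/sh
# A minimal first Hive session, sketched as a script.
# The table name "pokes" is illustrative, not required by Hive.
HIVEQL='CREATE TABLE pokes (foo INT, bar STRING);
SHOW TABLES;
DESCRIBE pokes;'

# With Hadoop running and HIVE_HOME exported, run the statements
# non-interactively via the CLI's -e option:
#   "$HIVE_HOME"/bin/hive -e "$HIVEQL"

# For now, just print the statements that would be sent to Hive:
echo "$HIVEQL"
```

If everything is wired up correctly, `SHOW TABLES;` should list `pokes` right after it is created, and `DESCRIBE pokes;` should show its two columns.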