Thursday, 7 June 2012

Install Hive

Step 1: Enable multiverse repo and get packages
The first thing we need to do is make sure the multiverse repos are enabled. Using your favorite editor (vi), add these lines to your /etc/apt/sources.list:


deb http://us.archive.ubuntu.com/ubuntu/ lucid multiverse
deb-src http://us.archive.ubuntu.com/ubuntu/ lucid multiverse
deb http://us.archive.ubuntu.com/ubuntu/ lucid-updates multiverse
deb-src http://us.archive.ubuntu.com/ubuntu/ lucid-updates multiverse
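
If you'd rather script this than edit by hand, here is a minimal sketch that appends the four entries only when they are missing, so it is safe to run twice. The SOURCES_FILE variable and the add_entry helper are names made up for this example. By default it writes to a scratch file; set SOURCES_FILE=/etc/apt/sources.list and run it as root to apply it for real.

```shell
#!/bin/sh
# SOURCES_FILE is a hypothetical knob: it defaults to a throwaway temp file.
SOURCES_FILE="${SOURCES_FILE:-$(mktemp)}"

add_entry() {
    # -x matches the whole line, -F treats the entry as a literal string.
    grep -qxF "$1" "$SOURCES_FILE" || echo "$1" >> "$SOURCES_FILE"
}

# Same four lines as above: deb and deb-src for lucid and lucid-updates.
for suite in lucid lucid-updates; do
    for kind in deb deb-src; do
        add_entry "$kind http://us.archive.ubuntu.com/ubuntu/ $suite multiverse"
    done
done

echo "multiverse entries now in $SOURCES_FILE:"
grep multiverse "$SOURCES_FILE"
```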

With that done, update your package lists, upgrade the system, and install the Java, Ant, and Subversion packages you'll need for the build.

sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get install openjdk-6-jdk ant subversion
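
Before moving on, it can save time to confirm the tools actually landed on your PATH. This is just a sketch; check_tools is a helper name invented for this example, not something the packages provide.

```shell
#!/bin/sh
# Report which of the given commands can be found on PATH.
# Returns 0 when all are present, 1 when any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

check_tools java ant svn || echo "install the missing packages before continuing"
```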


Step 2: Get Hadoop
The next thing we'll do is grab Hadoop. Check which version is current before you download; this tutorial uses 0.20.2.

wget http://mirror.its.uidaho.edu/pub/apache/hadoop/core/hadoop-0.20.2/hadoop-0.20.2.tar.gz
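
Mirrors occasionally hand back a truncated file, so a quick readability check before extracting is cheap insurance. A minimal sketch follows; archive_ok is a made-up helper name. Note that it only tells you the tarball can be read end to end, not that it is the genuine release.

```shell
#!/bin/sh
# Succeeds only if tar can list the whole gzip-compressed archive.
archive_ok() {
    # t lists contents without extracting, z reads through gzip.
    tar tzf "$1" > /dev/null 2>&1
}

if archive_ok hadoop-0.20.2.tar.gz; then
    echo "archive looks intact"
else
    echo "archive missing or damaged, download it again"
fi
```
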
We'll move this to /usr/local, untar it, and then rename the directory. Feel free to use an alternative layout here (symlinks, a different directory, and so on); there's no magic in this step.

sudo mv hadoop-0.20.2.tar.gz /usr/local/
cd /usr/local
sudo tar xvzf hadoop-0.20.2.tar.gz
sudo mv hadoop-0.20.2 hadoop
cd hadoop

Once you've extracted it and moved into the directory, find the JAVA_HOME line in the environment script, uncomment it, and set it like so:

sudo vi conf/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk/
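
If you want to make the same change without opening an editor, something along these lines should work. It is a sketch, not the official way to configure Hadoop: set_java_home is a helper name invented here, and it assumes the script contains at most a single commented or uncommented "export JAVA_HOME=" line.

```shell
#!/bin/sh
# set_java_home FILE PATH: make FILE export JAVA_HOME=PATH.
set_java_home() {
    env_file="$1"
    java_home="$2"
    tmp="$(mktemp)"
    # Turn "# export JAVA_HOME=..." or "export JAVA_HOME=..." into the line we want.
    sed "s|^#\{0,1\} *export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$env_file" > "$tmp" \
        && cat "$tmp" > "$env_file"
    rm -f "$tmp"
    # If the file had no JAVA_HOME line at all, add one at the end.
    grep -qxF "export JAVA_HOME=$java_home" "$env_file" \
        || echo "export JAVA_HOME=$java_home" >> "$env_file"
}

# Usage (as root, from the hadoop directory):
#   set_java_home conf/hadoop-env.sh /usr/lib/jvm/java-6-openjdk/
```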

Then build Hadoop with ant:

sudo ant
Finally, when ant is done doing its thing, remove the build directory:

sudo rm -rf /usr/local/hadoop/build


Step 3: Get Hive

From /usr/local, let's check out Hive using Subversion and then build it:

sudo svn co http://svn.apache.org/repos/asf/hadoop/hive/trunk hive
cd hive
sudo ant package
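
Ant prints a lot of output, so it's easy to miss a failure. One quick sanity check is to look for the hive launcher under build/dist/bin, which is the directory Step 4 puts on the PATH. This is a sketch; hive_built is a made-up helper, and /usr/local/hive assumes you checked out where this step did.

```shell
#!/bin/sh
# hive_built DIR: succeeds if DIR/build/dist/bin/hive exists and is executable.
hive_built() {
    [ -x "$1/build/dist/bin/hive" ]
}

if hive_built /usr/local/hive; then
    echo "hive launcher found"
else
    echo "no launcher under build/dist/bin, check the ant output for errors"
fi
```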

By default Hive uses a directory called /user/hive/warehouse. You can change that if you like, but for simplicity we'll just create it.

sudo mkdir -p /user/hive/warehouse


Step 4: Add the ingredients to your PATH
I'm running Hive as root in development, but you can add these PATH statements to the shell profile of whichever user has the right permissions.

export PATH=$PATH:/usr/local/hive/build/dist/bin/
export PATH=$PATH:/usr/local/hive/build/dist/lib/
export PATH=$PATH:/usr/local/hadoop/bin
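
Plain export lines like these add a duplicate entry every time the profile is sourced again. If that bothers you, here is a small sketch that appends a directory only when it isn't already there. path_append and HIVE_DIST are names invented for this example; HIVE_DIST assumes the checkout location from Step 3, so point it at wherever your build/dist directory actually ended up.

```shell
#!/bin/sh
# path_append DIR: add DIR to the end of PATH unless it is already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

# Hypothetical variable; override it if Hive lives somewhere else.
HIVE_DIST="${HIVE_DIST:-/usr/local/hive/build/dist}"

path_append "$HIVE_DIST/bin"
path_append /usr/local/hadoop/bin
export PATH
```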

Once done, log out and log back in (so your path takes hold) and then as root you can launch hive using this command:

hive --service hiveserver

If you get an error about hadoop not being found, make sure you've renamed your hadoop-0.20.2 folder to just hadoop (or used a symlink), and that /usr/local/hadoop/bin is on your PATH.
