1) First of all, install Java JDK 1.6.
In my case, JAVA_HOME=/usr/java/jdk1.6.0_33
2) Create a dedicated Hadoop user and group to separate the Hadoop installation from other software applications.
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser
This will add the user hduser and the group hadoop to your local machine.
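You can confirm the account and group were created with id (the numeric ids below are just illustrative; yours may differ):
$ id hduser
uid=1001(hduser) gid=1001(hadoop) groups=1001(hadoop)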
3) Configure SSH
Name the nodes master and slave so it is easy to distinguish between the two.
Master's IP: 192.168.0.106
Slave's IP: 192.168.0.122
For Master
root@master:~$ su - hduser
hduser@master:~$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa):
Created directory '/home/hduser/.ssh'.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
The key fingerprint is:
f6:de:f8:ad:ff:b4:aa:06:63:57:2f:ae:f3:bf:ce:93 hduser@master
Copy the key into the authorized_keys file of both machines:
[hduser@master ~]$ scp /home/hduser/.ssh/id_rsa.pub 192.168.0.106:$HOME/.ssh/authorized_keys
id_rsa.pub 100% 394 0.4KB/s 00:00
[hduser@master ~]$ scp /home/hduser/.ssh/id_rsa.pub 192.168.0.122:$HOME/.ssh/authorized_keys
id_rsa.pub 100% 394 0.4KB/s 00:00
For Slave
[root@slave ~]# su - hduser
[hduser@slave ~]$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa):
Created directory '/home/hduser/.ssh'.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
The key fingerprint is:
f6:de:f8:ad:ff:b4:aa:06:63:57:2f:ae:f3:bf:ce:93 hduser@slave
Copy the key to the master's authorized keys file:
[hduser@slave ~]$ scp /home/hduser/.ssh/id_rsa.pub 192.168.0.106:$HOME/.ssh/authorized_keys2
hduser@192.168.0.106's password:
id_rsa.pub 100% 394 0.4KB/s 00:00
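At this point passwordless SSH should work in both directions, and it is worth verifying before moving on. If you are still prompted for a password, check that ~/.ssh has permissions 700 and the authorized keys file has 600 on the target machine.
hduser@master:~$ ssh slave
hduser@slave:~$ ssh master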
4) In /etc/hosts of both machines, add:
192.168.0.106 master
192.168.0.122 slave
Caution: Avoid using localhost; it creates problems later.
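You can quickly confirm that both names resolve to the right addresses (the addresses shown are the ones used in this setup):
$ getent hosts master
192.168.0.106 master
$ getent hosts slave
192.168.0.122 slave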
5) You need to disable IPv6.
Check the status:
$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6
A return value of 0 means IPv6 is currently enabled; we want it disabled.
We will make Hadoop prefer IPv4 by adding this to conf/hadoop-env.sh:
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
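If you prefer to disable IPv6 system-wide instead (a standard Ubuntu sysctl approach, not part of the original steps), append these lines to /etc/sysctl.conf:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
Then run sudo sysctl -p; the check above should now return 1.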
6) Download Hadoop and extract its contents to /usr/local/hadoop:
$ cd /usr/local
$ sudo tar xzf hadoop-1.0.3.tar.gz
$ sudo mv hadoop-1.0.3 hadoop
$ sudo chown -R hduser:hadoop hadoop
export HADOOP_HOME=/usr/local/hadoop
export JAVA_HOME=/usr/java/jdk1.6.0_33
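To make these settings survive logouts, append them to hduser's ~/.bashrc on both machines (adding $HADOOP_HOME/bin to PATH is an optional convenience, not one of the original steps):
export HADOOP_HOME=/usr/local/hadoop
export JAVA_HOME=/usr/java/jdk1.6.0_33
export PATH=$PATH:$HADOOP_HOME/bin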
Make a directory /app/hadoop/tmp to store temporary data:
$ sudo mkdir -p /app/hadoop/tmp
$ sudo chown hduser:hadoop /app/hadoop/tmp
# ...and if you want to tighten up security, chmod from 755 to 750...
$ sudo chmod 750 /app/hadoop/tmp
7) In /usr/local/hadoop/conf/hadoop-env.sh
For the master, add or uncomment these:
export JAVA_HOME=/usr/java/jdk1.6.0_33
# Extra Java CLASSPATH elements. Optional.
export HADOOP_CLASSPATH=$HBASE_HOME/hbase-0.92.1.jar:$HBASE_HOME/conf:$HBASE_HOME/lib/zookeeper-3.4.3.jar
# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=2000
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
# export HADOOP_TASKTRACKER_OPTS=
For the slaves, add or uncomment these:
export JAVA_HOME=/usr/java/jdk1.6.0_33
export HADOOP_CLASSPATH=$HBASE_HOME/hbase-0.92.1.jar:$HBASE_HOME/conf:$HBASE_HOME/lib/zookeeper-3.4.3.jar
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
# export HADOOP_TASKTRACKER_OPTS=
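Since the master and slave settings are effectively identical, you can also edit hadoop-env.sh once on the master and push it to the slave (this assumes Hadoop lives at the same path on both machines):
hduser@master:~$ scp /usr/local/hadoop/conf/hadoop-env.sh slave:/usr/local/hadoop/conf/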
8) In /usr/local/hadoop/conf/core-site.xml
For both the master and the slaves, add these properties between the <configuration> tags:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:54310</value>
  <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a filesystem.</description>
</property>
9) In /usr/local/hadoop/conf/mapred-site.xml
For both the master and the slaves, add this property between the <configuration> tags:
<property>
  <name>mapred.job.tracker</name>
  <value>master:54311</value>
  <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
</property>
10) In /usr/local/hadoop/conf/hdfs-site.xml
For both the master and the slaves, add this property between the <configuration> tags; a value of 2 suits this two-node cluster:
<property>
  <name>dfs.replication</name>
  <value>2</value>
  <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
</property>
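Note: a multi-node setup also needs the master to know its workers. Assuming the standard Hadoop 1.x layout (this step is not spelled out in the original write-up), edit these two files on the master only:
/usr/local/hadoop/conf/masters:
master
/usr/local/hadoop/conf/slaves:
master
slave
conf/masters names the machine that runs the SecondaryNameNode; conf/slaves lists every machine that runs a DataNode and TaskTracker (here the master doubles as a worker).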
11) The first step in starting up your Hadoop installation is formatting the Hadoop filesystem, which is implemented on top of the local filesystems of your cluster. You need to do this the first time you set up a Hadoop cluster.
Caution: Do not format a running Hadoop filesystem, as you will lose all the data currently in the cluster (in HDFS). To format the filesystem (which simply initializes the directory specified by the dfs.name.dir variable), run this command on the master only:
hduser@ubuntu:~$ /usr/local/hadoop/bin/hadoop namenode -format
The output will look like this:
hduser@ubuntu:/usr/local/hadoop$ bin/hadoop namenode -format
10/05/08 16:59:56 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ubuntu/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
10/05/08 16:59:56 INFO namenode.FSNamesystem: fsOwner=hduser,hadoop
10/05/08 16:59:56 INFO namenode.FSNamesystem: supergroup=supergroup
10/05/08 16:59:56 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/05/08 16:59:56 INFO common.Storage: Image file of size 96 saved in 0 seconds.
10/05/08 16:59:57 INFO common.Storage: Storage directory .../hadoop-hduser/dfs/name has been successfully formatted.
10/05/08 16:59:57 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
hduser@ubuntu:/usr/local/hadoop$
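With formatting done, you can bring the cluster up from the master using the stock Hadoop 1.x start scripts (start-dfs.sh starts the NameNode and DataNodes, start-mapred.sh the JobTracker and TaskTrackers):
hduser@master:~$ /usr/local/hadoop/bin/start-dfs.sh
hduser@master:~$ /usr/local/hadoop/bin/start-mapred.sh
Running jps on each node lists the Java daemons: expect NameNode, SecondaryNameNode and JobTracker on the master, and DataNode and TaskTracker on the slave (plus DataNode and TaskTracker on the master if it also acts as a worker).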
For installing HBase, refer to
http://biforbeginners.blogspot.in/2012/07/step-by-step-installation-of-hbase.html