(Original) Setting up a Hadoop cluster under VMware
I. Preparation
1. Install VMware_Workstation_wmb
2. Install three CentOS-6.3-i386-bin-DVD1 virtual machines:
Master: 192.168.66.174
Slave1: 192.168.66.171
Slave2: 192.168.66.173
II. Installation steps
(Set each machine's hostname during the CentOS install, so you don't have to change it later.)
1. On every machine, add the following to /etc/hosts:
127.0.0.1 localhost
192.168.66.174 master
192.168.66.171 slave1
192.168.66.173 slave2
2. Install Java on every machine.
Add to /etc/profile:
export JAVA_HOME=/usr/local/java/jdk1.6.0_45
export PATH=$JAVA_HOME/bin:/sbin:/usr/bin:/usr/sbin:/bin
export CLASSPATH=.:$JAVA_HOME/lib
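Then reload the profile and confirm the JDK is picked up (a quick check; assumes the JDK was unpacked to the path above):
source /etc/profile
java -version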
3. Configure passwordless SSH login.
On every machine:
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 700 ~/.ssh
chmod 644 ~/.ssh/authorized_keys
Edit the sshd configuration file /etc/ssh/sshd_config and uncomment the line:
AuthorizedKeysFile .ssh/authorized_keys
On master:
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave2
Restart sshd:
service sshd restart
On slave1:
ssh-copy-id -i ~/.ssh/id_rsa.pub root@master
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave2
Restart sshd:
service sshd restart
On slave2:
ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@master
Restart sshd:
service sshd restart
Then verify with ssh slave1, ssh slave2, and ssh master from each machine.
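A quick loop to run all three checks at once (a minimal sketch using the hostnames from the /etc/hosts entries above; run it on each machine):
for h in master slave1 slave2; do ssh $h hostname; done
Each iteration should print the remote hostname without prompting for a password.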
** If you hit "Agent admitted failure to sign using the key":
Fix it by loading the private key with ssh-add (substitute your own key filename if it is not id_rsa):
# ssh-add ~/.ssh/id_rsa
** If you hit "ssh: connect to host master port 22: No route to host", it is an IP address problem; check /etc/hosts.
4. Configure Hadoop.
After downloading Hadoop, copy the tarball to the slaves (shown for slave1; repeat for slave2), then unpack it on each machine:
scp hadoop-1.2.0.tar.gz root@slave1:/usr/local
tar xzvf hadoop-1.2.0.tar.gz
mv hadoop-1.2.0 /usr/local/hadoop
Add to /etc/profile and ~/.bashrc:
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
Configure hadoop-env.sh:
Set JAVA_HOME.
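Concretely, uncomment the JAVA_HOME line in conf/hadoop-env.sh and point it at the JDK installed earlier:
export JAVA_HOME=/usr/local/java/jdk1.6.0_45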
Configure core-site.xml:
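A minimal core-site.xml sketch (fs.default.name is the Hadoop 1.x property naming the default filesystem; port 9000 is a common choice, not a requirement):
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>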
Configure mapred-site.xml:
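A matching mapred-site.xml sketch (mapred.job.tracker is the Hadoop 1.x property for the JobTracker address; port 9001 is conventional):
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>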
Configure hdfs-site.xml:
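An hdfs-site.xml sketch; dfs.replication defaults to 3, which a cluster with only the two slaves configured below cannot satisfy, so 2 is a reasonable value here:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>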
Configure masters:
Change the content to master (or the corresponding IP address).
Configure slaves:
slave1
slave2
(Alternatively, on the slave machines you can copy the configured Hadoop home directory from the master via scp: sudo scp -r test@192.168.30.20:/usr/local/hadoop /usr/local)
Turn off the firewall.
As root:
service iptables stop
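Optionally, to keep iptables from coming back after a reboot (standard CentOS 6 service management):
chkconfig iptables off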
Format the namenode.
On master: hadoop namenode -format
(Formatting erases any existing HDFS metadata, so do this only on first setup.)
Then start the cluster from master with start-all.sh (this works directly because hadoop's bin directory was added to the PATH above). Verify on master by running jps:
3711 NameNode
4085 Jps
3970 JobTracker
3874 SecondaryNameNode
Verify on each slave by running jps:
2892 Jps
2721 DataNode
2805 TaskTracker
Or, on master, run hadoop dfsadmin -report:
Safe mode is ON
Configured Capacity: 15481700352 (14.42 GB)
Present Capacity: 137******** (12.79 GB)
DFS Remaining: 134******** (12.53 GB)
DFS Used: 276422656 (263.62 MB)
DFS Used%: 2.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 3 (3 total, 0 dead)
Name: 192.168.160.143:50010
Decommission Status : Normal
Configured Capacity: 5160566784 (4.81 GB)
DFS Used: 41160704 (39.25 MB)
Non DFS Used: 582455296 (555.47 MB)
DFS Remaining: 4536950784(4.23 GB)
DFS Used%: 0.8%
DFS Remaining%: 87.92%
Last contact: Mon May 06 16:12:02 CST 2013

Name: 192.168.160.140:50010
Decommission Status : Normal
Configured Capacity: 5160566784 (4.81 GB)
DFS Used: 97075200 (92.58 MB)
Non DFS Used: 582545408 (555.56 MB)
DFS Remaining: 4480946176(4.17 GB)
DFS Used%: 1.88%
DFS Remaining%: 86.83%
Last contact: Mon May 06 16:12:01 CST 2013

Name: 192.168.160.141:50010
Decommission Status : Normal
Configured Capacity: 5160566784 (4.81 GB)
DFS Used: 138186752 (131.79 MB)
Non DFS Used: 582406144 (555.43 MB)
DFS Remaining: 4439973888(4.14 GB)
DFS Used%: 2.68%
DFS Remaining%: 86.04%
Last contact: Mon May 06 16:12:00 CST 2013
Output like the above means the cluster is up and running. ("Safe mode is ON" right after startup is normal; HDFS leaves safe mode automatically once enough blocks have been reported.)
To shut the Hadoop cluster down: stop-all.sh