本Hadoop与HBase集群有1台NameNode, 7台DataNode
1. /etc/hostname文件
NameNode:
node1
DataNode 1:
node2
DataNode 2:
node3
.......
DataNode 7:
node8
2. /etc/hosts文件
NameNode:
127.0.0.1 localhost
#127.0.1.1 node1
#-------edit by HY(2014-05-04)--------
#127.0.1.1 node1
125.216.241.113 node1
125.216.241.112 node2
125.216.241.96 node3
125.216.241.111 node4
125.216.241.114 node5
125.216.241.115 node6
125.216.241.116 node7
125.216.241.117 node8
#-------end edit--------
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
DataNode 1:?
127.0.0.1 localhost
#127.0.0.1 node2
#127.0.1.1 node2
#--------eidt by HY(2014-05-04)--------
125.216.241.113 node1
125.216.241.112 node2
125.216.241.96 node3
125.216.241.111 node4
125.216.241.114 node5
125.216.241.115 node6
125.216.241.116 node7
125.216.241.117 node8
#-------end eidt---------
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
其他的DataNode类似,只是注意要保持hostname与hosts中的域名要一样, 如果不一样, 在集群上跑任务时会出一些莫名奇妙的问题, 具体什么问题忘记了.
3. 在hadoop-env.sh中注释
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
增加
JAVA_HOME=/usr/lib/jvm/java-6-sun
4. core-site.xml
fs.default.name
hdfs://node1:49000
hadoop.tmp.dir
/home/hadoop/newdata/hadoop-1.2.1/tmp
io.compression.codecs
org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec
io.compression.codec.lzo.class
com.hadoop.compression.lzo.LzoCodec
dfs.datanode.socket.write.timeout
3000000
dfs.socket.timeout
3000000
5. hdfs-site.xml?
dfs.name.dir
/home/hadoop/newdata/hadoop-1.2.1/name1,/home/hadoop/newdata/hadoop-1.2.1/name2
数据元信息存储位置
dfs.data.dir
/home/hadoop/newdata/hadoop-1.2.1/data1,/home/hadoop/newdata/hadoop-1.2.1/data2
数据块存储位置
dfs.replication
2
6. mapred-site.xml
mapred.job.tracker
node1:49001
mapred.local.dir
/home/hadoop/newdata/hadoop-1.2.1/tmp
mapred.compress.map.output
true
mapred.map.output.compression.codec
com.hadoop.compression.lzo.LzoCodec
?
7. masters
node1
8. slaves
node2
node3
node4
node5
node6
node7
node8
9. 在hbase-env.sh
增加
JAVA_HOME=/usr/lib/jvm/java-6-sun
并启用export HBASE_MANAGES_ZK=true //为true表示使用自带的Zookeeper, 如果需要独立的Zookeeper,则设置为false, 并且安装Zookeeper
10. hbase-site.xml
hbase.rootdir
hdfs://node1:49000/hbase
The directory shared by RegionServers.
hbase.cluster.distributed
true
The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
hbase.master
node1:60000
hbase.tmp.dir
/home/hadoop/newdata/hbase/tmp
Temporary directory on the local filesyst