Installing Hadoop 2.8.3 on CentOS 7.3 in single-node mode.
1. Basic environment setup
Set up the JDK, disable the firewall, and disable SELinux:
[root@centos]# vim /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
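The same edit can be scripted instead of done by hand in vim. The sketch below is only an illustration: set_selinux_mode is a hypothetical helper name, and the demo runs against a temporary stand-in file rather than the real /etc/selinux/config. On CentOS 7 the firewall is normally stopped with systemctl stop firewalld and systemctl disable firewalld; those calls appear only as comments so the block stays safe to run anywhere.

```shell
# Hypothetical helper: rewrite the SELINUX= line of a config file.
# $1 = config file, $2 = mode (enforcing | permissive | disabled)
set_selinux_mode() {
  tmp="$(mktemp)"
  # ^SELINUX= does not match SELINUXTYPE= or the commented lines
  sed "s/^SELINUX=.*/SELINUX=$2/" "$1" > "$tmp" && cat "$tmp" > "$1"
  rm -f "$tmp"
}

# Demo on a temporary copy (assumption: the real file is /etc/selinux/config)
demo_conf="$(mktemp)"
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$demo_conf"
set_selinux_mode "$demo_conf" disabled
cat "$demo_conf"

# On the real machine, as root:
#   systemctl stop firewalld && systemctl disable firewalld
#   set_selinux_mode /etc/selinux/config disabled
```

The change to the config file takes effect after a reboot.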
Rename the machine:
Check the current hostname:
[root@centos /]# hostnamectl status
Change the hostname:
[root@centos /]# hostnamectl set-hostname centos.hadoop1
2. Extract Hadoop
[root@centos hadoop]# tar -zxvf hadoop-2.8.3.tar.gz -C /usr/hadoop/
3. Configure the Hadoop environment variables
Edit /etc/profile and add the Hadoop variables:
[root@centos hadoop]# vim /etc/profile
#set hadoop environment
export HADOOP_HOME=/usr/hadoop/hadoop-2.8.3
export PATH=$PATH:$HADOOP_HOME/bin
[root@centos hadoop]# source /etc/profile
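If this step is scripted, it helps to add the export lines only when they are not already present, so re-running the script does not pile up duplicates in /etc/profile. A minimal sketch, assuming a hypothetical helper called append_once and using a temporary file as a stand-in for /etc/profile:

```shell
# Hypothetical helper: append a line to a file only if that exact line is missing.
# $1 = file, $2 = line
append_once() {
  grep -qxF -e "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

profile="$(mktemp)"   # stand-in for /etc/profile
for run in 1 2; do    # run twice to show the second pass adds nothing
  append_once "$profile" 'export HADOOP_HOME=/usr/hadoop/hadoop-2.8.3'
  append_once "$profile" 'export PATH=$PATH:$HADOOP_HOME/bin'
done
cat "$profile"
```

After sourcing the real profile, running hadoop version is a quick way to confirm that the PATH change took effect.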
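4. Edit /usr/hadoop/hadoop-2.8.3/etc/hadoop/slaves
Delete the default localhost entry and add the hostname of each slave machine, one per line (on a single-node install, that is this machine itself).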
[root@centos hadoop-2.8.3]# vim etc/hadoop/slaves
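The slaves file is plain text with one hostname per line, so it can also be written from a script. The sketch below assumes a hypothetical helper write_slaves and writes to a temporary file; the hostname centos.hadoop1 is the one set earlier and is assumed to resolve on this machine (for example through /etc/hosts).

```shell
# Hypothetical helper: overwrite a slaves file with the given hostnames.
# $1 = slaves file, remaining arguments = hostnames
write_slaves() {
  target="$1"
  shift
  printf '%s\n' "$@" > "$target"   # one hostname per line
}

slaves_file="$(mktemp)"   # stand-in for etc/hadoop/slaves
write_slaves "$slaves_file" centos.hadoop1
cat "$slaves_file"
```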
5. Configure /usr/hadoop/hadoop-2.8.3/etc/hadoop/hadoop-env.sh
[root@centos hadoop-2.8.3]# vim etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_151
6. Configure /usr/hadoop/hadoop-2.8.3/etc/hadoop/core-site.xml
<configuration>
<!-- Address of the HDFS master (NameNode) -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://127.0.0.1:9000</value>
</property>
<!-- Directory where Hadoop stores the files it generates at runtime -->
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/hadoop/hadoop-2.8.3/tmp</value>
</property>
</configuration>
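A typo in a property name or value is easy to miss in these XML files. As a rough check, the value of a property can be read back from the file. The sketch below uses a hypothetical awk-based helper, get_prop, and assumes each <name> and <value> element sits on its own line, as in the file above. It is not a real XML parser. On an installed system, hdfs getconf -confKey fs.defaultFS reports the value Hadoop itself sees.

```shell
# Hypothetical helper: print the value of one property from a Hadoop XML file.
# $1 = config file, $2 = property name
get_prop() {
  awk -v want="$2" '
    /<name>/  { gsub(/.*<name>|<\/name>.*/, ""); name = $0; next }
    /<value>/ { gsub(/.*<value>|<\/value>.*/, "")
                if (name == want) { print; exit } }
  ' "$1"
}

# Demo on a temporary copy of the core-site.xml shown above
demo_xml="$(mktemp)"
cat > "$demo_xml" <<'EOF'
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://127.0.0.1:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/hadoop/hadoop-2.8.3/tmp</value>
</property>
</configuration>
EOF

get_prop "$demo_xml" fs.defaultFS
get_prop "$demo_xml" hadoop.tmp.dir
```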
7. Configure /usr/hadoop/hadoop-2.8.3/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/usr/hadoop/hdfs/name</value>
<description>Physical location where the NameNode stores the HDFS name table</description>
</property>
<property>
<name>dfs.data.dir</name>
<value>/usr/hadoop/hdfs/data</value>
<description>Physical location where the DataNode stores data blocks</description>
</property>
<!-- HDFS replication factor. The default is 3; setting it to 1 keeps a single copy of each block -->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
<description>Whether to check permissions when reading or writing files on HDFS</description>
</property>
</configuration>
8. Passwordless SSH login
[root@centos hadoop]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
[root@centos hadoop]# cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[root@centos hadoop]# chmod 0600 ~/.ssh/authorized_keys
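Passwordless login often fails because of file modes: with its default StrictModes setting, sshd ignores keys when the .ssh directory or authorized_keys is writable by other users. The sketch below tightens both. fix_ssh_perms is a hypothetical helper name, the demo runs on a temporary directory standing in for ~/.ssh, and stat -c assumes GNU coreutils, as on CentOS.

```shell
# Hypothetical helper: apply the modes sshd expects.
# $1 = the .ssh directory
fix_ssh_perms() {
  chmod 700 "$1" && chmod 600 "$1/authorized_keys"
}

ssh_dir="$(mktemp -d)"                  # stand-in for ~/.ssh
touch "$ssh_dir/authorized_keys"
chmod 777 "$ssh_dir"                    # deliberately too open
chmod 666 "$ssh_dir/authorized_keys"

fix_ssh_perms "$ssh_dir"
stat -c '%a %n' "$ssh_dir" "$ssh_dir/authorized_keys"
```

Once the host key for localhost has been accepted, ssh -o BatchMode=yes localhost true should return without asking for a password.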
9. Start and stop HDFS
HDFS must be formatted before its first start; later starts do not need it:
[root@centos]# cd /usr/hadoop/hadoop-2.8.3
[root@centos hadoop-2.8.3]# ./bin/hdfs namenode -format
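Formatting again by accident wipes the existing namespace, so a scripted setup can guard the command. A formatted NameNode directory contains a current/VERSION file, and the sketch below checks for it. needs_format is a hypothetical helper, and the demo uses a temporary directory in place of the dfs.name.dir path from this guide (/usr/hadoop/hdfs/name).

```shell
# Hypothetical helper: succeed only if the NameNode directory has never been formatted.
# $1 = NameNode storage directory (dfs.name.dir)
needs_format() {
  [ ! -f "$1/current/VERSION" ]
}

name_dir="$(mktemp -d)"   # stand-in for /usr/hadoop/hdfs/name

if needs_format "$name_dir"; then before="format needed"; else before="already formatted"; fi
echo "fresh directory: $before"

# Simulate what a completed format leaves behind
mkdir -p "$name_dir/current" && touch "$name_dir/current/VERSION"

if needs_format "$name_dir"; then after="format needed"; else after="already formatted"; fi
echo "after format: $after"

# On the real machine:
#   needs_format /usr/hadoop/hdfs/name && ./bin/hdfs namenode -format
```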
Start command:
[root@centos hadoop-2.8.3]# ./sbin/start-dfs.sh
Use the jps command to see which processes were started:
[root@centos hadoop-2.8.3]# jps
3969 NameNode
4275 SecondaryNameNode
4389 Jps
4071 DataNode
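Reading the jps listing by eye works for one machine, but a script can do the comparison. The sketch below reads jps-style output on standard input and prints every expected daemon that is missing. missing_daemons is a hypothetical helper, and the sample text is the listing above, so ResourceManager is reported as absent because YARN has not been started yet.

```shell
# Hypothetical helper: print each expected daemon name not found in jps output.
# stdin = output of jps, arguments = expected daemon names
missing_daemons() {
  listing="$(cat)"
  for daemon in "$@"; do
    # Compare the whole second column so NameNode does not match SecondaryNameNode
    printf '%s\n' "$listing" \
      | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' \
      || printf '%s\n' "$daemon"
  done
}

sample='3969 NameNode
4275 SecondaryNameNode
4389 Jps
4071 DataNode'

missing="$(printf '%s\n' "$sample" \
  | missing_daemons NameNode DataNode SecondaryNameNode ResourceManager)"
echo "missing: $missing"

# On the real machine:
#   jps | missing_daemons NameNode DataNode SecondaryNameNode
```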
Test HDFS by creating a test directory:
[root@centos hadoop-2.8.3]# ./bin/hdfs dfs -mkdir /test
Stop command:
[root@centos hadoop-2.8.3]# ./sbin/stop-dfs.sh
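10. Configure YARN (mapred-site.xml)
Apache Hadoop YARN (Yet Another Resource Negotiator) is a newer Hadoop resource manager. It is a general-purpose resource management system that provides unified resource management and scheduling for the applications running on top of it, which improves cluster utilization, centralizes resource management, and makes data sharing easier.
Configure /usr/hadoop/hadoop-2.8.3/etc/hadoop/mapred-site.xml. Note that Hadoop ships only mapred-site.xml.template by default. To use YARN, rename mapred-site.xml.template to mapred-site.xml; if you do not want to run YARN, rename it back.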
[root@centos hadoop]# mv mapred-site.xml.template mapred-site.xml
<configuration>
<!-- Use YARN as the resource management framework -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
11. Configure yarn-site.xml
<configuration>
<!-- Auxiliary service run on the NodeManager. It must be set to mapreduce_shuffle for MapReduce jobs to run -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
12. Start and stop YARN
Starting YARN launches the ResourceManager and NodeManager:
[root@centos hadoop-2.8.3]# ./sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/hadoop/hadoop-2.8.3/logs/yarn-root-resourcemanager-centos.hbase.out
localhost: starting nodemanager, logging to /usr/hadoop/hadoop-2.8.3/logs/yarn-root-nodemanager-centos.hadoop1.out
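Open the following address in a browser:
http://192.168.2.5:8088/ (8088 is the default port; if it is already in use, find the process holding it with netstat -anp and stop it first). This opens the Hadoop cluster page.
Use jps again to see which processes are running: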
[root@centos hadoop-2.8.3]# jps
3969 NameNode
5298 Jps
4275 SecondaryNameNode
4071 DataNode
4965 NodeManager
4858 ResourceManager
Stop YARN:
[root@centos hadoop-2.8.3]# ./sbin/stop-yarn.sh
13. Start and stop all of Hadoop (YARN, HDFS, MapReduce)
[root@centos hadoop-2.8.3]# ./sbin/start-all.sh
[root@centos hadoop-2.8.3]# ./sbin/stop-all.sh