I've recently been reading the Hadoop source code, and sometimes the only way to trace how things really work is to step through it in a debugger.
While setting up the debugging environment I ran into quite a few problems, which I eventually solved one by one. I'm sharing them here for readers and colleagues in the same situation.
NoClassDefFoundError
The first problem was baffling: a class could not be found at runtime, yet the IDE flagged nothing in the code. I spent a long time suspecting a broken environment.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/Path
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:195)
	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:123)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.Path
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	... 3 more
It turned out that pom.xml contained an extra <scope>provided</scope> line. I had copied the dependency snippet straight from the Maven repository page, and since the IDE showed no errors I never noticed it. Commenting out the scope line fixed the problem, leaving the dependency as:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.6.0</version>
</dependency>
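For contrast, the snippet as originally copied from the Maven repository page looked roughly like this; the scope line is what kept the jar off the runtime classpath:

```xml
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.6.0</version>
    <!-- "provided" tells Maven the runtime environment supplies this jar,
         so it is excluded from the classpath when running from the IDE -->
    <scope>provided</scope>
</dependency>
```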
Could not locate executable null\bin\winutils.exe in the Hadoop binaries
Next came the second problem:
18/05/17 13:54:44 ERROR util.Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
Solution: set the hadoop.home.dir property in code, download winutils.exe, and place it in the bin directory under that Hadoop home:
System.setProperty("hadoop.home.dir", "D:\\aws\\hadoop-2.6.0");
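To see exactly which file Hadoop will probe for, you can print the path that results from the property. A minimal stdlib-only sketch (the D:\aws location is just the example directory used above); note that when the property is unset, the resolved prefix is the literal string "null", which is exactly why the error message says null\bin\winutils.exe:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class WinutilsPathDemo {
    public static void main(String[] args) {
        // Same property as in the fix above; adjust to your local unpack dir.
        System.setProperty("hadoop.home.dir", "D:\\aws\\hadoop-2.6.0");

        // Hadoop resolves winutils relative to hadoop.home.dir as bin\winutils.exe.
        String home = System.getProperty("hadoop.home.dir");
        Path winutils = Paths.get(home, "bin", "winutils.exe");
        System.out.println("Hadoop will look for: " + winutils);
    }
}
```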
No FileSystem for scheme: hdfs
The corresponding file system implementation cannot be found:
java.io.IOException: No FileSystem for scheme: hdfs
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
The code compiles without errors, but a dependency needed at runtime is missing. The fix is to add it to pom.xml:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.6.0</version>
</dependency>
Permission denied
org.apache.hadoop.security.AccessControlException: Permission denied:
user=XXX, access=WRITE, inode="/user":root:supergroup:drwxr-xr-x
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
The Windows 7 client's local user name does not match the HDFS user. The fix is to set the HADOOP_USER_NAME property in code:
System.setProperty("HADOOP_USER_NAME", "root");
java.net.UnknownHostException
Exception in thread "main" java.lang.IllegalArgumentException: java.net.UnknownHostException: 1.txt
The problem is the file URI. After "hdfs:" you must either write three slashes directly, or "//" followed by "ip:port" and then the path. Never write "hdfs://1.txt", because "1.txt" is then parsed as a host name.
The correct forms are:
fs = FileSystem.get(URI.create("hdfs:///1.txt"), conf);
or
fs = FileSystem.get(URI.create("hdfs://10.205.84.14:9000/1.txt"), conf);
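The difference is easy to see with java.net.URI alone: with only two slashes, "1.txt" lands in the authority position and is parsed as the host (hence the UnknownHostException), while three slashes leave the host empty and make "/1.txt" the path. A quick sketch:

```java
import java.net.URI;

public class HdfsUriDemo {
    public static void main(String[] args) {
        // Wrong: after "hdfs://", the next component is the authority (host:port),
        // so "1.txt" is taken to be a host name, not a file.
        URI bad = URI.create("hdfs://1.txt");
        System.out.println("host=" + bad.getHost() + ", path=" + bad.getPath());

        // Right: an empty authority (three slashes) makes "/1.txt" the path,
        // and the client falls back to the default file system from core-site.xml.
        URI good = URI.create("hdfs:///1.txt");
        System.out.println("host=" + good.getHost() + ", path=" + good.getPath());
    }
}
```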
Summary
Add a resource folder to the client project containing two files, "core-site.xml" and "log4j.properties", copied from the Hadoop cluster so they stay consistent with it.
Write the pom.xml properly, mainly the "hadoop-common" and "hadoop-hdfs" dependencies.
Set the required properties in the client code (they can also be supplied by other means).
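For reference, a minimal core-site.xml might look like the following sketch; the namenode address is taken from the example URI used earlier and must be replaced with your own cluster's host and port:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <!-- Default file system; the client resolves "hdfs:///..." URIs against this. -->
        <name>fs.defaultFS</name>
        <value>hdfs://10.205.84.14:9000</value>
    </property>
</configuration>
```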
Sample client code
package test.test;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;
import java.net.URI;

public class TestHdfs {
    public static void main(String[] args) {
        System.setProperty("hadoop.home.dir", "D:\\aws\\hadoop-2.6.0");
        System.setProperty("HADOOP_USER_NAME", "root");
        Configuration conf = new Configuration();
        FileSystem fs = null;
        try {
            fs = FileSystem.get(URI.create("hdfs:///1.txt"), conf);
            Path path = new Path("3.txt");
            FSDataOutputStream out = fs.create(path);
            out.write("hello".getBytes("UTF-8"));
            out.writeUTF("da jia hao,cai shi zhen de hao!");
            out.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Sample pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>test.test</groupId>
    <artifactId>test_debug_hadoop</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>testtest</name>
    <url>http://maven.apache.org</url>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.6.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.6.0</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>