Resolving Hadoop's "mapred.ReduceTask: java.net.ConnectException: Connection timed out" error
2014-11-24 13:33:00
One node in the cluster, node 191, developed a fault, and the following warnings appeared in the logs:
[plain]
2013-11-08 08:32:13,908 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201311061017_18902_r_000000_0 copy failed: attempt_201311061017_18902_m_000003_0 from node-192
2013-11-08 08:32:13,921 WARN org.apache.hadoop.mapred.ReduceTask: java.net.ConnectException: Connection timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at sun.net.NetworkClient.doConnect(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.<init>(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1631)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.setupSecureConnection(ReduceTask.java:1588)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1488)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1399)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1331)
Looking at the Hadoop code, this is how the TaskTracker determines its own hostname:
[java]
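// Excerpt from the TaskTracker's initialization (fConf is its job configuration).
localFs = FileSystem.getLocal(fConf);
// An explicit slave.host.name setting takes precedence.
if (fConf.get("slave.host.name") != null) {
    this.localHostname = fConf.get("slave.host.name");
}
// Otherwise the hostname is obtained through DNS for the configured interface.
if (localHostname == null) {
    this.localHostname =
        DNS.getDefaultHost(
            fConf.get("mapred.tasktracker.dns.interface", "default"),
            fConf.get("mapred.tasktracker.dns.nameserver", "default"));
}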
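The selection order in that excerpt can be sketched in isolation. This is only an illustration, not Hadoop code: a plain `Map` stands in for the Hadoop configuration object, the `Supplier` stands in for the `DNS.getDefaultHost(...)` call, and the names `LocalHostnameSketch`, `chooseLocalHostname` and `name-from-dns` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Illustrative stand-in for the hostname selection in the excerpt; not Hadoop code. */
final class LocalHostnameSketch {

    /**
     * Returns the explicit "slave.host.name" value when it is set; otherwise
     * asks the supplied fallback, which stands in for DNS.getDefaultHost(...).
     */
    static String chooseLocalHostname(Map<String, String> conf, Supplier<String> dnsFallback) {
        String configured = conf.get("slave.host.name");
        if (configured != null) {
            return configured;
        }
        return dnsFallback.get();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // No override: the name comes from the (hypothetical) DNS fallback.
        System.out.println(chooseLocalHostname(conf, () -> "name-from-dns"));
        // With the override set, the DNS fallback is never consulted.
        conf.put("slave.host.name", "node-191");
        System.out.println(chooseLocalHostname(conf, () -> "name-from-dns"));
    }
}
```

Either way, the name the TaskTracker ends up with is only useful if the other nodes can resolve it to the right address, which is what the next check looks at.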
Ping that hostname from the node:
[plain]
ping node-191
PING node-128-191.localhost (220.250.64.228) 56(84) bytes of data.
64 bytes from 220.250.64.228: icmp_seq=1 ttl=247 time=14.8 ms
64 bytes from 220.250.64.228: icmp_seq=2 ttl=247 time=14.3 ms
64 bytes from 220.250.64.228: icmp_seq=3 ttl=247 time=14.4 ms
The address that comes back is not 191's IP at all.
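The same check can be done from Java, which is closer to what the reduce task sees when it opens its HTTP connection. The following is a minimal sketch, assuming you know the IP the node should have; the class name `ResolveCheck` and the method `resolvesTo` are made up for illustration, and only the standard `java.net.InetAddress` API is used.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Illustrative check that a hostname resolves to the address you expect. */
final class ResolveCheck {

    /** True when hostname resolves to at least one address equal to expectedIp. */
    static boolean resolvesTo(String hostname, String expectedIp) {
        try {
            for (InetAddress addr : InetAddress.getAllByName(hostname)) {
                if (addr.getHostAddress().equals(expectedIp)) {
                    return true;
                }
            }
        } catch (UnknownHostException e) {
            // A name that does not resolve at all counts as a mismatch.
        }
        return false;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java ResolveCheck.java <hostname> <expected-ip>");
            return;
        }
        System.out.println(args[0] + " resolves to " + args[1] + ": "
                + resolvesTo(args[0], args[1]));
    }
}
```

Run on each node with the hostname and the intended IP, a `false` result would point to the same kind of misresolution the ping output shows.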
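Checking the hosts file on that node showed no entry for 191's hostname either.
That explains the problem.
The fix: add 191's hostname to the hosts file on every node in the cluster, then restart the TaskTracker.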
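To confirm the entry is present on a node after the change, the hosts file can be scanned for the hostname. The sketch below assumes the usual hosts layout (an address followed by one or more names, with `#` starting a comment). The class `HostsFileCheck`, the method `lookup` and the sample address `192.0.2.191` (taken from a documentation-only range) are hypothetical and are not the cluster's real values.

```java
import java.util.Optional;

/** Illustrative lookup of a hostname in hosts-file text. */
final class HostsFileCheck {

    /** Returns the address that the given hosts-file text maps hostname to, if any. */
    static Optional<String> lookup(String hostsText, String hostname) {
        for (String line : hostsText.split("\\R")) {
            // Drop anything after a '#' comment marker.
            int hash = line.indexOf('#');
            String body = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (body.isEmpty()) {
                continue;
            }
            // First field is the address; the remaining fields are names and aliases.
            String[] fields = body.split("\\s+");
            for (int i = 1; i < fields.length; i++) {
                if (fields[i].equalsIgnoreCase(hostname)) {
                    return Optional.of(fields[0]);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Sample content only; the address is a placeholder, not the real IP of 191.
        String sample = "127.0.0.1 localhost\n192.0.2.191 node-191  # hypothetical entry\n";
        System.out.println(lookup(sample, "node-191"));
        System.out.println(lookup(sample, "node-192"));
    }
}
```

In practice the text would be read from the node's hosts file rather than from a sample string, and an empty result on any node means the entry still needs to be added there.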