
Hadoop Stories, Part 1
2019-02-19 00:20:19 | Views: 53

I. Background

1. Last week several new nodes were added to the production cluster. After running start-balancer, progress was extremely slow, and it still had not finished after several days.

2. To make matters worse, on Saturday the power line was cut by construction work; after the server-room UPS held out for a few hours, the whole cluster went down.

3. On Monday, once power was restored, the cluster was started again.
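The slow balancing mentioned in item 1 can often be mitigated by raising the bandwidth cap the balancer is allowed to use and loosening its utilization threshold. A minimal sketch follows; the bandwidth and threshold values are illustrative assumptions, not the cluster's actual settings:

```shell
# Raise the per-datanode bandwidth the balancer may use, in bytes/sec.
# 104857600 (100 MB/s) is an illustrative value.
hdfs dfsadmin -setBalancerBandwidth 104857600

# Run the balancer with a 10% utilization threshold; a looser threshold
# finishes sooner because fewer blocks need to move.
hdfs balancer -threshold 10
```

The start-balancer.sh script is a thin wrapper around `hdfs balancer`, so the same options apply either way.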

II. Problems

1. Symptoms

(1) Hadoop/HDFS: after startup, file uploads worked normally and no exceptions appeared in the logs.

(2) HBase: it could be started, but afterwards many tables' regions failed to load, and `hbase hbck` reported a large number of inconsistencies. Once HBase was running, HDFS file uploads started failing. HBase tables were still accessible, but access was abnormally slow.

2. Resolution

(1) Ruled out hardware failures on the servers.

(2) Found that some servers' clocks were not synchronized with the NTP server; synchronized them manually once, then checked and reconfigured the scheduled sync job.

(3) The key fix: based on the obvious errors reported in the node logs, adjusted a parameter in hdfs-site.xml. After restarting HDFS and HBase, everything returned to normal.
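To quantify the region damage described above, `hbase hbck` can first be run in read-only mode before attempting any repair. A sketch, with the caveat that repair flags vary by HBase version:

```shell
# Read-only consistency check; summarizes the number of inconsistencies found.
hbase hbck

# Per-region detail, useful for seeing which tables are affected.
hbase hbck -details

# On HBase 1.x, region assignment problems can often be repaired with
# -fixAssignments; verify the flag exists for your version first.
hbase hbck -fixAssignments
```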
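The clock fix in step (2) can be sketched as below; `ntp1.example.com` and the cron schedule are placeholders, not the cluster's real configuration:

```shell
# Query the offset against the NTP server without stepping the clock.
ntpdate -q ntp1.example.com

# One-off manual synchronization (what step (2) did by hand on each node).
ntpdate -u ntp1.example.com

# Re-create the scheduled sync, e.g. every 30 minutes via a crontab entry:
#   */30 * * * * /usr/sbin/ntpdate -u ntp1.example.com
```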

III. Logs

1. Errors reported when uploading a file to HDFS
17/03/29 13:30:31 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2282)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1346)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
17/03/29 13:30:31 INFO hdfs.DFSClient: Abandoning BP-903121414-10.141.17.33-1461912427616:blk_1076230712_2574868
17/03/29 13:30:31 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[10.141.17.47:50010,DS-9cf11117-1b97-400e-87f7-0dd4aad6c266,DISK]
17/03/29 13:30:31 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Got error, status message , ack with firstBadLink as 192.168.17.46:50010
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1363)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
17/03/29 13:30:31 INFO hdfs.DFSClient: Abandoning BP-903121414-10.141.17.33-1461912427616:blk_1076230713_2574869
17/03/29 13:30:31 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[192.168.17.46:50010,DS-91025977-762a-46b7-bdd9-be49b4873cb5,DISK]
17/03/29 13:30:32 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Got error, status message , ack with firstBadLink as 192.168.17.37:50010
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:140)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1363)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
17/03/29 13:30:32 INFO hdfs.DFSClient: Abandoning BP-903121414-10.141.17.33-1461912427616:blk_1076230715_2574871
17/03/29 13:30:32 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[192.168.17.37:50010,DS-ab386599-037c-44e1-929d-ca955d21dcf3,DISK]
17/03/29 13:30:32 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2282)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1346)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
17/03/29 13:30:32 INFO hdfs.DFSClient: Abandoning BP-903121414-10.141.17.33-1461912427616:blk_1076230716_2574872
17/03/29 13:30:32 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[192.168.17.42:50010,DS-8f01a8c7-beed-4609-98bb-529501104d90,DISK]
17/03/29 13:30:32 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2282)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1346)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
17/03/29 13:30:32 INFO hdfs.DFSClient: Abandoning BP-903121414-10.141.17.33-1461912427616:blk_1076230717_2574873
17/03/29 13:30:32 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[192.168.17.34:50010,DS-cce3cb5f-30d4-4e8c-97b6-b0104be9589e,DISK]
17/03/29 13:30:32 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2282)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1346)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
17/03/29 13:30:32 INFO hdfs.DFSClient: Abandoning BP-903121414-10.141.17.33-1461912427616:blk_1076230718_2574874
17/03/29 13:30:32 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[192.168.17.48:50010,DS-3ef57df8-b025-4d1e-8a10-271353590385,DISK]
17/03/29 13:30:32 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: Unable to create new block.
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1279)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
17/03/29 13:30:32 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/hadoop/jhm/2016-01-01pmsln.txt._COPYING_" - Aborting...
put: Premature EOF: no length prefix available

2. DataNode warnings
2017-03-30 08:58:21,378 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: datanode35:50010:DataXceiverServer:
java.io.IOException: Xceiver count 8193 exceeds the limit of concurrent xcievers: 8192
at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:140)
at java.lang.Thread.run(Thread.java:744)
2017-03-30 08:58:21,484 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: datanode35:50010:DataXceiverServer:
java.io.IOException: Xceiver count 8193 exceeds the limit of concurrent xcievers: 8192
at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:140)
at java.lang.Thread.run(Thread.java:744)
2017-03-30 08:58:21,494 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: datanode35:50010:DataXceiverServer:
java.io.IOException: Xceiver count 8193 exceeds the limit of concurrent xcievers: 8192
at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:140)
at java.lang.Thread.run(Thread.java:744)
2017-03-30 08:58:22,302 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: datanode35:50010:DataXceiverServer:
java.io.IOException: Xceiver count 8193 exceeds the limit of concurrent xcievers: 8192
at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:140)
at java.lang.Thread.run(Thread.java:744)
2017-03-30 08:58:22,304 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: datanode35:50010:DataXceiverServer:
java.io.IOException: Xceiver count 8193 exceeds the limit of concurrent xcievers: 8192

IV. Lessons learned

1. Troubleshooting cluster problems requires thorough log analysis, on the datanodes as well as on the namenode.

2. The `dfs.datanode.max.transfer.threads` parameter in hdfs-site.xml (historically known by the misspelled name `dfs.datanode.max.xcievers`) was raised from 8192 to 32768. Setting it too low directly throttles HBase's requests. How large is large enough? Each xceiver thread costs roughly 1 MB of stack, so 8192 threads correspond to about 8 GB of memory and 32768 to about 32 GB; since my nodes have 384 GB of RAM and ample CPU, the increase caused no problems.
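The change in item 2 corresponds to an hdfs-site.xml fragment roughly like the following (a sketch; on older Hadoop releases the key is the misspelled `dfs.datanode.max.xcievers`):

```xml
<property>
  <!-- Upper bound on concurrent DataXceiver threads per datanode.
       Raised from 8192 to 32768 to stop the "exceeds the limit of
       concurrent xcievers" warnings seen in the datanode logs. -->
  <name>dfs.datanode.max.transfer.threads</name>
  <value>32768</value>
</property>
```

The datanodes must be restarted for the new value to take effect.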
