linux - HdfsIOException: Build pipeline to recovery block failed: all datanodes are bad
I have a file in HDFS with 8 billion records, and while flushing it into an internal table I encountered the following error:

HdfsIOException: Build pipeline to recovery block [block pool ID: BP-2080382728-10.3.50.10-1444849419015, block ID: 1076905963_3642418] failed: all datanodes are bad.
We tried setting the pipeline recovery parameters dfs.client.block.write.replace-datanode-on-failure.enable=true, dfs.client.block.write.replace-datanode-on-failure.policy=DEFAULT, and dfs.client.block.write.replace-datanode-on-failure.best-effort=true (we know the best-effort setting can lead to data loss if datanodes go down, but we still wanted to give it a try so our insert process would run smoothly). However, it didn't work.
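For context, here is a minimal sketch of how those three client-side properties can be declared in hdfs-site.xml; the values shown are simply the ones described above, and the exact file and placement depend on your distribution:

    <!-- Sketch only: pipeline recovery settings we tried, as hdfs-site.xml properties -->
    <property>
      <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
      <value>true</value>
    </property>
    <property>
      <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
      <value>DEFAULT</value>
    </property>
    <property>
      <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
      <value>true</value>
    </property>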
Here is the complete stack trace of the error:
    2017-09-13 00:06:27,099 WARN  datanode.DataNode (BlockReceiver.java:receivePacket(520)) - Slow BlockReceiver write packet to mirror took 2247ms (threshold=300ms)
    2017-09-13 00:06:27,601 WARN  datanode.DataNode (BlockReceiver.java:receivePacket(520)) - Slow BlockReceiver write packet to mirror took 3647ms (threshold=300ms)
    2017-09-13 00:06:27,662 INFO  datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(444)) - Verification succeeded for BP-2080382728-10.3.50.10-1444849419015:blk_1073898400_180103
    2017-09-13 00:06:28,731 WARN  datanode.DataNode (BlockReceiver.java:receivePacket(520)) - Slow BlockReceiver write packet to mirror took 1629ms (threshold=300ms)
    2017-09-13 00:06:29,062 WARN  datanode.DataNode (BlockReceiver.java:receivePacket(520)) - Slow BlockReceiver write packet to mirror took 1147ms (threshold=300ms)
    2017-09-13 00:06:29,762 INFO  datanode.DataNode (BlockReceiver.java:run(1223)) - PacketResponder: BP-2080382728-10.3.50.10-1444849419015:blk_1076901876_3637612, type=HAS_DOWNSTREAM_IN_PIPELINE
    java.io.EOFException: Premature EOF: no length prefix available
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2211)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:176)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1165)
        at java.lang.Thread.run(Thread.java:745)
    2017-09-13 00:06:29,809 INFO  datanode.DataNode (BlockReceiver.java:receiveBlock(817)) - Exception for BP-2080382728-10.3.50.10-1444849419015:blk_1076901876_3637612
    java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.3.50.16:50010 remote=/10.3.50.15:11687]
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
        at java.io.DataInputStream.read(DataInputStream.java:149)
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:467)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:781)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:740)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
        at java.lang.Thread.run(Thread.java:745)
    2017-09-13 00:06:29,810 WARN  datanode.DataNode (BlockReceiver.java:run(1257)) - IOException in BlockReceiver.run():
    java.nio.channels.ClosedByInterruptException
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:496)
        at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
        at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
        at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
        at java.io.DataOutputStream.flush(DataOutputStream.java:123)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstreamUnprotected(BlockReceiver.java:1389)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstream(BlockReceiver.java:1328)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1249)
        at java.lang.Thread.run(Thread.java:745)
    2017-09-13 00:06:29,810 INFO  datanode.DataNode (BlockReceiver.java:run(1260)) - PacketResponder: BP-2080382728-10.3.50.10-1444849419015:blk_1076901876_3637612, type=HAS_DOWNSTREAM_IN_PIPELINE
    java.nio.channels.ClosedByInterruptException
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:496)
        at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
Can anyone suggest a possible reason for this error and how it can be fixed?
Your help is appreciated.
Thanks.