linux - HdfsIOException: Build pipeline to recovery block failed: all datanodes are bad


I have a file in HDFS with 8 billion records, and while flushing it into an internal table we encountered the following error.

HdfsIOException: Build pipeline to recovery block [block pool ID: BP-2080382728-10.3.50.10-1444849419015, block id: 1076905963_3642418] failed: all datanodes are bad.

We tried setting the pipeline-recovery parameters dfs.client.block.write.replace-datanode-on-failure.enable=true, dfs.client.block.write.replace-datanode-on-failure.policy=DEFAULT, and dfs.client.block.write.replace-datanode-on-failure.best-effort=true (we know this setting can lead to data loss if datanodes go down, but we still wanted to give it a try so our insert process would run smoothly). However, it didn't work.
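For reference, a minimal sketch of how we applied those settings, assuming they are placed in the client-side hdfs-site.xml (the property names are the standard HDFS replace-datanode-on-failure options; the values are exactly the ones we tried):

    <!-- Sketch only: assumes these pipeline-recovery settings are applied
         through the client's hdfs-site.xml. -->
    <property>
      <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
      <value>true</value>
    </property>
    <property>
      <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
      <value>DEFAULT</value>
    </property>
    <property>
      <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
      <value>true</value>
    </property>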

Here is the complete stack trace of the error:

2017-09-13 00:06:27,099 WARN  datanode.DataNode (BlockReceiver.java:receivePacket(520)) - Slow BlockReceiver write packet to mirror took 2247ms (threshold=300ms)
2017-09-13 00:06:27,601 WARN  datanode.DataNode (BlockReceiver.java:receivePacket(520)) - Slow BlockReceiver write packet to mirror took 3647ms (threshold=300ms)
2017-09-13 00:06:27,662 INFO  datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(444)) - Verification succeeded for BP-2080382728-10.3.50.10-1444849419015:blk_1073898400_180103
2017-09-13 00:06:28,731 WARN  datanode.DataNode (BlockReceiver.java:receivePacket(520)) - Slow BlockReceiver write packet to mirror took 1629ms (threshold=300ms)
2017-09-13 00:06:29,062 WARN  datanode.DataNode (BlockReceiver.java:receivePacket(520)) - Slow BlockReceiver write packet to mirror took 1147ms (threshold=300ms)
2017-09-13 00:06:29,762 INFO  datanode.DataNode (BlockReceiver.java:run(1223)) - PacketResponder: BP-2080382728-10.3.50.10-1444849419015:blk_1076901876_3637612, type=HAS_DOWNSTREAM_IN_PIPELINE
java.io.EOFException: Premature EOF: no length prefix available
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2211)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:176)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1165)
        at java.lang.Thread.run(Thread.java:745)
2017-09-13 00:06:29,809 INFO  datanode.DataNode (BlockReceiver.java:receiveBlock(817)) - Exception for BP-2080382728-10.3.50.10-1444849419015:blk_1076901876_3637612
java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/10.3.50.16:50010 remote=/10.3.50.15:11687]
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
        at java.io.DataInputStream.read(DataInputStream.java:149)
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:467)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:781)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:740)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
        at java.lang.Thread.run(Thread.java:745)
2017-09-13 00:06:29,810 WARN  datanode.DataNode (BlockReceiver.java:run(1257)) - IOException in BlockReceiver.run():
java.nio.channels.ClosedByInterruptException
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:496)
        at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
        at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
        at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
        at java.io.DataOutputStream.flush(DataOutputStream.java:123)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstreamUnprotected(BlockReceiver.java:1389)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstream(BlockReceiver.java:1328)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1249)
        at java.lang.Thread.run(Thread.java:745)
2017-09-13 00:06:29,810 INFO  datanode.DataNode (BlockReceiver.java:run(1260)) - PacketResponder: BP-2080382728-10.3.50.10-1444849419015:blk_1076901876_3637612, type=HAS_DOWNSTREAM_IN_PIPELINE
java.nio.channels.ClosedByInterruptException
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:496)
        at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
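For what it's worth, the 60000 ms read timeout in the trace matches the default of dfs.client.socket-timeout (which the datanodes also use for pipeline reads), and the repeated "Slow BlockReceiver write packet to mirror" warnings suggest the mirror writes really are slow. One experiment we considered is raising the pipeline socket timeouts; a sketch, assuming cluster-wide settings in hdfs-site.xml (the values below are illustrative assumptions, not tuned recommendations):

    <!-- Illustrative sketch only: raising pipeline socket timeouts as an
         experiment. Defaults are 60000 ms (read) and 480000 ms (write);
         these values are assumptions for illustration. -->
    <property>
      <name>dfs.client.socket-timeout</name>
      <value>180000</value>
    </property>
    <property>
      <name>dfs.datanode.socket.write.timeout</name>
      <value>960000</value>
    </property>

Longer timeouts would only mask slow mirror writes, though; the warnings may point at disk or network pressure on the datanodes themselves.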

Can anyone suggest a possible reason for this error and how it can be fixed?

Your help is appreciated.

Thanks

