[ https://issues.apache.org/jira/browse/HDFS-12326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andras Bokor resolved HDFS-12326.
---------------------------------
    Resolution: Not A Problem

It seems like a question, not a bug.

> What is the correct way of retrying when failure occurs during writing
> ----------------------------------------------------------------------
>
>                 Key: HDFS-12326
>                 URL: https://issues.apache.org/jira/browse/HDFS-12326
>             Project: Hadoop HDFS
>          Issue Type: Test
>          Components: hdfs-client
>            Reporter: ZhangBiao
>
> I'm using the Go HDFS client https://github.com/colinmarc/hdfs to write to HDFS, on Hadoop 2.7.3.
> When the number of concurrently open files is large, for example 200, I always get a 'broken pipe' error.
> I want to retry so that writing can continue. What is the correct way to retry? Because https://github.com/colinmarc/hdfs cannot recover the stream state after an error occurs during writing, I have to reopen the file to get a new stream. So I tried the following steps (a Go sketch follows the namenode log below):
> 1. Close the current stream
> 2. Append to the file to get a new stream
> But when I close the stream, I get the error "updateBlockForPipeline call failed with ERROR_APPLICATION (java.io.IOException)", and the namenode complains:
> {code:java}
> 2017-08-20 03:22:55,598 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.updateBlockForPipeline from 192.168.0.39:46827 Call#50183 Retry#-1
> java.io.IOException: BP-1152809458-192.168.0.39-1502261411064:blk_1073825071_111401 does not exist or is not under Constructionblk_1073825071_111401{UCState=COMMITTED, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-d61914ba-df64-467b-bb75-272875e5e865:NORMAL:192.168.0.39:50010|RBW], ReplicaUC[[DISK]DS-1314debe-ab08-4001-ab9a-8e234f28f87c:NORMAL:192.168.0.38:50010|RBW]]}
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkUCBlock(FSNamesystem.java:6241)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updateBlockForPipeline(FSNamesystem.java:6309)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updateBlockForPipeline(NameNodeRpcServer.java:806)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updateBlockForPipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:955)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> 2017-08-20 03:22:56,333 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* blk_1073825071_111401{UCState=COMMITTED, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-d61914ba-df64-467b-bb75-272875e5e865:NORMAL:192.168.0.39:50010|RBW], ReplicaUC[[DISK]DS-1314debe-ab08-4001-ab9a-8e234f28f87c:NORMAL:192.168.0.38:50010|RBW]]} is not COMPLETE (ucState = COMMITTED, replication# = 0 < minimum = 1) in file /user/am/scan_task/2017-08-20/192.168.0.38_audience_f/user-bak010-20170820030804.log
> {code}
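> In Go, those two steps look roughly like the sketch below. This is a minimal illustration assuming the colinmarc/hdfs Client and FileWriter API; the reopenForAppend helper name is mine, not part of the library:
> {code:go}
> package hdfsretry
>
> import (
>     "log"
>
>     "github.com/colinmarc/hdfs"
> )
>
> // reopenForAppend is the retry I tried: step 1 closes the broken
> // writer, step 2 asks the namenode for a fresh append stream. Close()
> // itself can fail while the last block is still COMMITTED (log above),
> // and the immediate Append() then fails with
> // AlreadyBeingCreatedException (log below) because this client is
> // still the file's lease holder.
> func reopenForAppend(client *hdfs.Client, w *hdfs.FileWriter, name string) (*hdfs.FileWriter, error) {
>     // Step 1: close the current stream. Log the error but continue;
>     // the write pipeline is already broken at this point.
>     if err := w.Close(); err != nil {
>         log.Printf("close after write failure: %v", err)
>     }
>     // Step 2: append the file to get a new stream.
>     return client.Append(name)
> }
> {code}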
> When I then append to get a new stream, I get the error 'append call failed with ERROR_APPLICATION (org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException)', and the corresponding error in the namenode is:
> {code:java}
> 2017-08-20 03:22:56,335 WARN org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.append: Failed to APPEND_FILE /user/am/scan_task/2017-08-20/192.168.0.38_audience_f/user-bak010-20170820030804.log for go-hdfs-OAfvZiSUM2Eu894p on 192.168.0.39 because go-hdfs-OAfvZiSUM2Eu894p is already the current lease holder.
> 2017-08-20 03:22:56,335 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.append from 192.168.0.39:46827 Call#50186 Retry#-1: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: Failed to APPEND_FILE /user/am/scan_task/2017-08-20/192.168.0.38_audience_f/user-bak010-20170820030804.log for go-hdfs-OAfvZiSUM2Eu894p on 192.168.0.39 because go-hdfs-OAfvZiSUM2Eu894p is already the current lease holder.
> {code}
> Could you please suggest the correct way for the client to retry when a write fails?
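> For reference, the workaround I'm currently testing is sketched below: retry Append with exponential backoff, and on a persistent AlreadyBeingCreatedException switch to a fresh client, since each new colinmarc/hdfs client appears to pick a new go-hdfs-* client name (as the random suffix in the log suggests) and HDFS lets a different lease holder preempt the lease after the soft lease limit. The 60-second soft limit, the appendWithRetry name, and the backoff values are my assumptions, not verified library behavior:
> {code:go}
> package hdfsretry
>
> import (
>     "fmt"
>     "strings"
>     "time"
>
>     "github.com/colinmarc/hdfs"
> )
>
> // appendWithRetry retries Append with exponential backoff. The
> // AlreadyBeingCreatedException above means my own client name is
> // still the lease holder, so retrying from the same client cannot
> // succeed until the namenode finishes lease recovery. After the soft
> // lease limit (60s by default, assumed from HDFS defaults) a client
> // with a different name may preempt the lease, so on that error we
> // build a fresh client, which picks a new go-hdfs-* client name.
> func appendWithRetry(addr, name string, maxAttempts int) (*hdfs.Client, *hdfs.FileWriter, error) {
>     client, err := hdfs.New(addr)
>     if err != nil {
>         return nil, nil, err
>     }
>     backoff := 5 * time.Second
>     for attempt := 0; attempt < maxAttempts; attempt++ {
>         w, err := client.Append(name)
>         if err == nil {
>             return client, w, nil
>         }
>         if strings.Contains(err.Error(), "AlreadyBeingCreatedException") {
>             // A stale lease held by our own old client name: switch
>             // to a fresh client so the namenode sees a new holder.
>             if fresh, cerr := hdfs.New(addr); cerr == nil {
>                 client = fresh
>             }
>         }
>         time.Sleep(backoff)
>         backoff *= 2 // eventually outlasts the soft lease limit
>     }
>     return nil, nil, fmt.Errorf("append retries exhausted for %s", name)
> }
> {code}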