This looks like a bug with the new inode ID code in trunk, rather than a bug with QJM or HA.
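For reference, the failing check in the trace below (INodeId.checkId, called from FSNamesystem.checkLease on addBlock) compares the inode ID the client sends against the ID the NameNode has on file for the lease. A minimal sketch of that kind of validation, paraphrased from the stack trace rather than copied from trunk (the GRANDFATHER_INODE_ID sentinel for old clients is an assumption here):

    import java.io.FileNotFoundException;

    public class InodeIdCheckSketch {
        // Assumed sentinel: clients that don't send an inode ID skip the check.
        static final long GRANDFATHER_INODE_ID = 0;

        // Paraphrase of the validation implied by the stack trace, not the
        // exact trunk source. If a client retries addBlock against the newly
        // active NameNode after a failover and the IDs have diverged, this
        // throws the "ID mismatch" seen below (request 1073 vs saved 1050).
        static void checkId(long requestId, long savedId) throws FileNotFoundException {
            if (requestId != GRANDFATHER_INODE_ID && requestId != savedId) {
                throw new FileNotFoundException(
                    "ID mismatch. Request id and saved id: " + requestId + " , " + savedId);
            }
        }
    }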
Suresh/Brandon, any thoughts?

-Todd

On Sun, Mar 31, 2013 at 6:43 PM, Azuryy Yu <azury...@gmail.com> wrote:
> Hi All,
>
> I configured HDFS HA using source code from trunk r1463074.
>
> I got the following exception when I put a file to HDFS:
>
> 13/04/01 09:33:45 WARN retry.RetryInvocationHandler: Exception while
> invoking addBlock of class ClientNamenodeProtocolTranslatorPB. Trying to
> fail over immediately.
> 13/04/01 09:33:45 WARN hdfs.DFSClient: DataStreamer Exception
> java.io.FileNotFoundException: ID mismatch. Request id and saved id: 1073 , 1050
>     at org.apache.hadoop.hdfs.server.namenode.INodeId.checkId(INodeId.java:51)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2501)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2298)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2212)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:498)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:356)
>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40979)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:526)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1018)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1818)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1814)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1489)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1812)
>
> To reproduce:
>
>     hdfs dfs -put test.data /user/data/test.data
>
> After this command starts running, kill the active NameNode process.
>
> I have only three nodes (A, B, C) for this test:
> A and B are NameNodes.
> B and C are DataNodes.
> ZK is deployed on A, B, and C.
> A, B, and C are all JournalNodes.
>
> Thanks.

--
Todd Lipcon
Software Engineer, Cloudera
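For anyone trying to reproduce this, a hedged sketch of the steps described in the report above; the NameNode ID ("nn1") and the jps-based kill are illustrative assumptions, not from the original message:

    # Start a long-running write in the background.
    hdfs dfs -put test.data /user/data/test.data &

    # Find which NameNode is currently active (assumes an NN named "nn1").
    hdfs haadmin -getServiceState nn1

    # On the active NameNode's host, kill the NameNode process mid-write.
    kill -9 $(jps | awk '/ NameNode$/ {print $1}')

    # The client should fail over to the standby; watch for the ID mismatch.
    wait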