Hi Devs,

Has anyone seen the NameNode hit a FATAL error and exit unexpectedly
because the edit log was not synced in time?
I have hit this several times and tried to find the root cause, without
success. HDFS-10943 <https://issues.apache.org/jira/browse/HDFS-10943>
tracks this issue. Any suggestions and further discussion are welcome.

I would like to offer some more information:
Environment:
branch-2.7 HDFS HA using QJM
FATAL Log:

> 2019-03-08 10:53:17,111 FATAL
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: finalize log
> segment 117479864532, 117479959500 failed for required journal
> (JournalAndStream(mgr=QJM to [journalhosts], stream=QuorumOutputStream
> starting at txid 117479864532))
> java.io.IOException: FSEditStream has 141 bytes still to be flushed and
> cannot be closed.
>         at
> org.apache.hadoop.hdfs.server.namenode.EditsDoubleBuffer.close(EditsDoubleBuffer.java:66)
>         at
> org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.close(QuorumOutputStream.java:65)
>         at
> org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalAndStream.closeStream(JournalSet.java:115)
>         at
> org.apache.hadoop.hdfs.server.namenode.JournalSet$4.apply(JournalSet.java:235)
>         at
> org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
>         at
> org.apache.hadoop.hdfs.server.namenode.JournalSet.finalizeLogSegment(JournalSet.java:231)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.endCurrentLogSegment(FSEditLog.java:1274)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.rollEditLog(FSEditLog.java:1203)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSImage.rollEditLog(FSImage.java:1331)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:6102)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1297)
>         at
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:142)
>         at
> org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:12025)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:847)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:790)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1686)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2458)
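
To make the exception message easier to reason about, here is a minimal sketch of a double buffer whose close() refuses to run while the current buffer still holds unflushed bytes. It is a simplified illustration, not the Hadoop source: the class name DoubleBufferSketch and its method bodies are assumptions, and only the wording of the error message is taken from the log above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Simplified illustration only; NOT the actual EditsDoubleBuffer implementation.
class DoubleBufferSketch {
    // Receives new writes.
    private ByteArrayOutputStream current = new ByteArrayOutputStream();
    // Holds data that has been handed over for flushing.
    private ByteArrayOutputStream ready = new ByteArrayOutputStream();

    void write(byte[] data) {
        current.write(data, 0, data.length);
    }

    // Swap the buffers so that pending writes become eligible for flushing.
    void setReadyToFlush() {
        if (ready.size() != 0) {
            throw new IllegalStateException("previous data not flushed yet");
        }
        ByteArrayOutputStream tmp = ready;
        ready = current;
        current = tmp;
    }

    // Write the ready buffer to the underlying stream and empty it.
    void flushTo(OutputStream out) throws IOException {
        ready.writeTo(out);
        ready.reset();
    }

    // Closing is only allowed when nothing is left in the current buffer.
    void close() throws IOException {
        int pending = current.size();
        if (pending != 0) {
            throw new IOException("FSEditStream has " + pending
                + " bytes still to be flushed and cannot be closed.");
        }
    }

    public static void main(String[] args) throws IOException {
        DoubleBufferSketch buf = new DoubleBufferSketch();
        buf.write(new byte[141]);
        try {
            buf.close(); // fails: 141 bytes were written but never swapped and flushed
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
        buf.setReadyToFlush();
        buf.flushTo(new ByteArrayOutputStream());
        buf.close(); // succeeds once the pending bytes have been flushed
    }
}
```

In this sketch the failure only needs a write that lands in the current buffer after the last swap and before close(). Whether that is what happens in the NameNode during rollEditLog is exactly the open question here.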


Best Regards,
Hexiaoqiao
