Sergey Soldatov created HDDS-13621:
--------------------------------------

             Summary: NPE in OzoneManagerRatisServer.checkRetryCache
                 Key: HDDS-13621
                 URL: https://issues.apache.org/jira/browse/HDDS-13621
             Project: Apache Ozone
          Issue Type: Bug
          Components: Ozone Manager
    Affects Versions: 2.1.0
            Reporter: Sergey Soldatov


Under load, the OM periodically throws a NullPointerException while checking the RetryCache:
{code:java}
2025-08-27 16:18:09,562 WARN ipc.Server: IPC Server handler 0 on default port 9862, call Call#5998989 Retry#2 org.apache.hadoop.ozone.om.protocol.OzoneManagerProtocol.submitRequest from 10.88.252.12:48376
java.lang.NullPointerException: Cannot invoke "org.apache.ratis.protocol.Message.getContent()" because the return value of "org.apache.ratis.protocol.RaftClientReply.getMessage()" is null
        at org.apache.hadoop.ozone.om.helpers.OMRatisHelper.getOMResponseFromRaftClientReply(OMRatisHelper.java:68)
        at org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.getOMResponse(OzoneManagerRatisServer.java:570)
        at org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisServer.checkRetryCache(OzoneManagerRatisServer.java:495)
        at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.internalProcessRequest(OzoneManagerProtocolServerSideTranslatorPB.java:168)
        at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:124)
        at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
        at org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:115)
        at org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.processCall(ProtobufRpcEngine.java:484)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:595)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1246)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1169)
        at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
        at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3198)
{code}
It is not yet clear whether this is an Ozone issue or a Ratis issue. Root cause analysis (RCA) is in progress.
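
For readers following the trace: the exception arises because getMessage() returned null on a reply and getContent() was then invoked on that null. The sketch below reproduces only that failure mode in isolation and shows one possible defensive pattern. It uses hypothetical stand-in classes (Reply, Payload) and hypothetical method names, not the real Ratis or Ozone types, and it is not a proposed fix, since why the message is null has not been determined.

```java
import java.util.Optional;

/** Illustration only: hypothetical stand-ins, NOT the real Ratis or Ozone classes. */
final class RetryCacheReplySketch {

  /** Stand-in for a reply payload (plays the role of a message body). */
  static final class Payload {
    private final byte[] content;
    Payload(byte[] content) { this.content = content; }
    byte[] getContent() { return content; }
  }

  /** Stand-in for a cached reply whose payload may be absent. */
  static final class Reply {
    private final Payload message; // may be null, as in the reported trace
    Reply(Payload message) { this.message = message; }
    Payload getMessage() { return message; }
  }

  /** Unguarded access: throws NullPointerException when the payload is null. */
  static byte[] extractUnguarded(Reply reply) {
    return reply.getMessage().getContent();
  }

  /** Guarded access: yields an empty Optional instead of throwing. */
  static Optional<byte[]> extractGuarded(Reply reply) {
    return Optional.ofNullable(reply)
        .map(Reply::getMessage)
        .map(Payload::getContent);
  }
}
```

With the guarded variant, a caller could treat an empty result as "no usable cached response" and decide explicitly how to proceed. Whether that is the right behavior for the OM depends on the outcome of the RCA.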


