[
https://issues.apache.org/jira/browse/HDDS-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zita Dombi updated HDDS-14705:
------------------------------
Status: Patch Available (was: Open)
> Ozone clients should retry when OM is in prepare mode
> -----------------------------------------------------
>
> Key: HDDS-14705
> URL: https://issues.apache.org/jira/browse/HDDS-14705
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Zita Dombi
> Assignee: Zita Dombi
> Priority: Major
> Labels: pull-request-available
>
> If OM is in prepare mode there is no failover handling:
> [https://github.com/apache/ozone/blob/17a126da1775d45f8843b47985bb5deb0ea3e928/hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/ha/OMFailoverProxyProvider.java#L473-L483]
>
> {code:java}
> } else if (ex instanceof StateMachineException) {
>   StateMachineException smEx = (StateMachineException) ex;
>   Throwable cause = smEx.getCause();
>   if (cause instanceof OMException) {
>     OMException omEx = (OMException) cause;
>     // Do not failover if the operation was blocked because the OM was
>     // prepared.
>     return omEx.getResult() !=
>         OMException.ResultCodes.NOT_SUPPORTED_OPERATION_WHEN_PREPARED;
>   }
> }{code}
> This can cause job failures:
> {code:java}
> 26/01/21 16:22:21 INFO mapreduce.Job: Task Id : attempt_1768994470888_0006_m_000007_0, Status : FAILED
> Error: NOT_SUPPORTED_OPERATION_WHEN_PREPARED
> org.apache.hadoop.ozone.om.exceptions.OMException: Cannot apply write request CreateFile when OM is in prepare mode.
>   at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:761)
>   at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleSubmitRequestAndSCMSafeModeRetry(OzoneManagerProtocolClientSideTranslatorPB.java:2332)
>   at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.createFile(OzoneManagerProtocolClientSideTranslatorPB.java:2321)
>   at org.apache.hadoop.ozone.client.rpc.RpcClient.createFile(RpcClient.java:2250)
>   at org.apache.hadoop.ozone.client.OzoneBucket.createFile(OzoneBucket.java:962)
>   at org.apache.hadoop.fs.ozone.BasicRootedOzoneClientAdapterImpl.createFile(BasicRootedOzoneClientAdapterImpl.java:413)
>   at org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem.createOutputStream(BasicRootedOzoneFileSystem.java:317)
>   at org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem.lambda$create$1(BasicRootedOzoneFileSystem.java:277)
>   at org.apache.hadoop.hdds.tracing.TracingUtil.executeInSpan(TracingUtil.java:167)
>   at org.apache.hadoop.hdds.tracing.TracingUtil.executeInNewSpan(TracingUtil.java:157)
>   at org.apache.hadoop.fs.ozone.BasicRootedOzoneFileSystem.create(BasicRootedOzoneFileSystem.java:276)
>   at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1233)
>   at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1210)
>   at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1091)
>   at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1078)
>   at org.apache.hadoop.examples.terasort.TeraOutputFormat.getRecordWriter(TeraOutputFormat.java:141)
>   at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:660)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:780)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
>   at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
>   at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1964)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
> {code}
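
To illustrate the behavior the issue title asks for (retrying instead of failing the task when the OM reports prepare mode), here is a minimal, self-contained sketch. It is not the patch attached to this issue and does not use Ozone's classes: SketchOmException, ResultCode, isBlockedByPrepare and callWithRetry are hypothetical names invented for this example, and the fixed sleep between attempts is an assumed policy.

```java
import java.util.concurrent.Callable;

public class PrepareModeRetrySketch {

  /** Hypothetical stand-in for OMException.ResultCodes; only two values are modelled. */
  enum ResultCode { NOT_SUPPORTED_OPERATION_WHEN_PREPARED, OTHER }

  /** Hypothetical stand-in for OMException, carrying a result code. */
  static class SketchOmException extends Exception {
    private final ResultCode result;

    SketchOmException(String message, ResultCode result) {
      super(message);
      this.result = result;
    }

    ResultCode getResult() {
      return result;
    }
  }

  /** Returns true if the exception or any of its causes reports prepare mode. */
  static boolean isBlockedByPrepare(Throwable ex) {
    for (Throwable t = ex; t != null; t = t.getCause()) {
      if (t instanceof SketchOmException
          && ((SketchOmException) t).getResult()
              == ResultCode.NOT_SUPPORTED_OPERATION_WHEN_PREPARED) {
        return true;
      }
    }
    return false;
  }

  /**
   * Runs the call, retrying only when it was blocked by prepare mode.
   * Any other failure is rethrown immediately; after maxAttempts the
   * last prepare-mode failure is rethrown.
   */
  static <T> T callWithRetry(Callable<T> call, int maxAttempts, long sleepMillis)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return call.call();
      } catch (Exception ex) {
        if (!isBlockedByPrepare(ex)) {
          throw ex;
        }
        last = ex;
        if (attempt < maxAttempts) {
          Thread.sleep(sleepMillis); // assumed fixed delay between attempts
        }
      }
    }
    throw last;
  }
}
```

With a policy like this, a write such as CreateFile that hits a prepared OM would be retried for a bounded number of attempts rather than surfacing NOT_SUPPORTED_OPERATION_WHEN_PREPARED to the job on the first failure.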