[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

ASF GitHub Bot (Jira) Tue, 14 Mar 2023 10:19:05 -0700


     [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=850965&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-850965
 ]


ASF GitHub Bot logged work on HIVE-27097:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Mar/23 17:18
            Start Date: 14/Mar/23 17:18
    Worklog Time Spent: 10m 
      Work Description: wecharyu commented on code in PR #4076:
URL: https://github.com/apache/hive/pull/4076#discussion_r1135936973


##########
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java:
##########
@@ -264,17 +238,6 @@ public Object run() throws MetaException {
     return ret;
   }
 
-  private static boolean isRecoverableMetaException(MetaException e) {
-    String m = e.getMessage();
-    if (m == null) {
-      return false;
-    }
-    if (m.contains("java.sql.SQLIntegrityConstraintViolationException")) {
-      return false;
-    }
-    return IO_JDO_TRANSPORT_PROTOCOL_EXCEPTION_PATTERN.matcher(m).matches();

Review Comment:
   @saihemanth-cloudera good point, and thanks for the review.
   IMO we should separate TTransportException from MetaException, because they 
are responsible for different things:
   
   - TTransportException signals a failure in the transport layer, such as a 
network issue.
   - MetaException signals a metadata error, such as corrupted metadata.
   
   BTW, could you explain in which cases a TTransportException would be wrapped 
in a MetaException?
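
   To make the distinction above concrete, here is a minimal sketch of a 
client-side retry loop that retries only on transport failures and lets 
metadata errors propagate at once. It is not the code in this PR: 
`TransportFailure` and `MetadataFailure` are hypothetical stand-ins for 
TTransportException and MetaException, and `callWithRetry` is an illustrative 
helper. A real client would presumably also reconnect and wait between 
attempts, which is omitted here.

```java
import java.util.concurrent.Callable;

final class ClientRetrySketch {

    /** Hypothetical stand-in for a transport-level failure (e.g. a network issue). */
    static class TransportFailure extends Exception {
        TransportFailure(String message) { super(message); }
    }

    /** Hypothetical stand-in for a metadata-level failure (e.g. corrupted metadata). */
    static class MetadataFailure extends Exception {
        MetadataFailure(String message) { super(message); }
    }

    /**
     * Runs the call up to maxAttempts times. Only TransportFailure triggers
     * another attempt; every other exception is rethrown immediately.
     */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TransportFailure last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (TransportFailure e) {
                // Communication problem: a later attempt may succeed.
                last = e;
            }
            // MetadataFailure and anything else is not caught, so it propagates.
        }
        // All attempts failed with a transport problem; surface the last one.
        throw last;
    }
}
```

   With this shape, a call that fails twice with `TransportFailure` and then 
succeeds is invoked three times, while a call that throws `MetadataFailure` 
is invoked exactly once.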





Issue Time Tracking
-------------------

    Worklog Id:     (was: 850965)
    Time Spent: 1h 10m  (was: 1h)

> Improve the retry strategy for Metastore client and server
> ----------------------------------------------------------
>
>                 Key: HIVE-27097
>                 URL: https://issues.apache.org/jira/browse/HIVE-27097
>             Project: Hive
>          Issue Type: Improvement
>          Components: Hive
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Wechar
>            Assignee: Wechar
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> *Background*
> Hive provides *{{RetryingMetaStoreClient}}* and *{{RetryingHMSHandler}}* to 
> retry when a Thrift request fails:
>  * RetryingMetaStoreClient retries on *Thrift-related exceptions* and some 
> *MetaException* cases.
>  * RetryingHMSHandler retries on every {*}JDOException{*} or 
> *NucleusException*.
> *Motivation*
> The current retry mechanism leads to many unnecessary retries in both the 
> client and the server. To simplify the process, we introduce the following 
> retry mechanism:
>  * The client side is concerned only with communication errors, i.e., 
> {*}TTransportException{*}.
>  * The server side can skip exceptions that always fail even after a retry, 
> such as {*}SQLIntegrityConstraintViolationException{*}.
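
The server-side idea in the description (skip exceptions that can never 
succeed on retry) can be sketched as a check over the exception's cause 
chain. This is an illustration under assumptions, not the implementation in 
the pull request: the class `ServerRetrySketch`, the method 
`hasUnrecoverableCause`, and the depth limit are hypothetical. Only 
`java.sql.SQLIntegrityConstraintViolationException` is a real JDK class.

```java
import java.sql.SQLIntegrityConstraintViolationException;

final class ServerRetrySketch {

    /** Illustrative bound so a cyclic cause chain cannot loop forever. */
    private static final int MAX_CAUSE_DEPTH = 32;

    /**
     * Returns true if the error, or any of its causes, is a constraint
     * violation. Retrying the same statement would hit the same violation,
     * so a caller could use this to fail fast instead of retrying.
     */
    static boolean hasUnrecoverableCause(Throwable error) {
        Throwable current = error;
        for (int depth = 0; current != null && depth < MAX_CAUSE_DEPTH; depth++) {
            if (current instanceof SQLIntegrityConstraintViolationException) {
                return true;
            }
            current = current.getCause();
        }
        return false;
    }
}
```

Checking the exception type through the cause chain, rather than matching 
the class name inside the message string as the removed 
`isRecoverableMetaException` did, keeps the check independent of how the 
message is formatted. It only works while the original cause is still 
attached to the exception.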



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
