[ https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826572#comment-15826572 ]
Eugene Koifman commented on HIVE-13014:
---------------------------------------

The patch makes a few methods safe to retry (they were not before) and
annotates the others to indicate their retry semantics.

The worst case to avoid is when the server-side operation succeeds (and
commits against the metastore RDBMS) but the remote caller doesn't know this
and retries an operation that cannot be retried.

[~alangates] could you review please

> RetryingMetaStoreClient is retrying too aggressively
> ----------------------------------------------------
>
>                 Key: HIVE-13014
>                 URL: https://issues.apache.org/jira/browse/HIVE-13014
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore, Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Critical
>         Attachments: HIVE-13014.01.patch, HIVE-13014.02.patch,
> HIVE-13014.03.patch
>
>
> Not all metastore operations are idempotent. For example, commit_txn()
> consists of
> 1. request from client to server
> 2. server action
> 3. ack to client
> If the network connection is broken after (or during) 2 but before 3
> happens, RetryingMetaStoreClient will retry the operation, thus causing an
> attempt to commit the same txn twice (sometimes concurrently).
> The 2nd attempt is guaranteed to fail and thus returns an error to the
> caller (which doesn't know the operation is being retried), while the first
> attempt has actually succeeded. Thus the caller thinks the commit failed and
> will likely attempt to redo the transaction, which is not what we want in
> most cases.
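
The failure mode described for commit_txn() (server action succeeds, ack is
lost, client retries) can be sketched in a few lines of Java. This is only an
illustration and not Hive code: FakeServer, commitTxnIdempotent and
callWithBlindRetry are hypothetical names, the lost ack is simulated by
dropping the reply to the first attempt, and the idempotent variant shows just
one possible way an operation could be made safe to retry. The actual patch
may take a different approach.

```java
import java.util.HashSet;
import java.util.Set;

final class RetryDemo {

    /** Hypothetical stand-in for the metastore's transaction state. */
    static final class FakeServer {
        private final Set<Long> open = new HashSet<>();
        private final Set<Long> committed = new HashSet<>();

        void openTxn(long id) {
            open.add(id);
        }

        /** Not idempotent: committing the same txn a second time fails. */
        void commitTxn(long id) {
            if (!open.remove(id)) {
                throw new IllegalStateException("txn " + id + " is not open");
            }
            committed.add(id);
        }

        /** Idempotent variant (assumption): repeating a successful commit is a no-op. */
        void commitTxnIdempotent(long id) {
            if (committed.contains(id)) {
                return;
            }
            commitTxn(id);
        }

        boolean isCommitted(long id) {
            return committed.contains(id);
        }
    }

    /**
     * Runs the server-side action and retries blindly when the ack is lost.
     * The ack for the first attempt is always dropped to simulate a broken
     * connection between step 2 (server action) and step 3 (ack to client).
     * Returns what the caller ends up believing about the outcome.
     */
    static boolean callWithBlindRetry(Runnable serverAction, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                serverAction.run();      // step 2: the server does the work
                if (attempt == 1) {
                    continue;            // step 3 lost: no ack, so the client retries
                }
                return true;             // ack received
            } catch (IllegalStateException e) {
                return false;            // server error is surfaced to the caller
            }
        }
        return false;                    // attempts exhausted without an ack
    }

    public static void main(String[] args) {
        FakeServer s1 = new FakeServer();
        s1.openTxn(1L);
        boolean seen1 = callWithBlindRetry(() -> s1.commitTxn(1L), 3);
        System.out.println("non-idempotent: caller sees success=" + seen1
                + ", server committed=" + s1.isCommitted(1L));

        FakeServer s2 = new FakeServer();
        s2.openTxn(2L);
        boolean seen2 = callWithBlindRetry(() -> s2.commitTxnIdempotent(2L), 3);
        System.out.println("idempotent:     caller sees success=" + seen2
                + ", server committed=" + s2.isCommitted(2L));
    }
}
```

With the non-idempotent commit, the caller is told the commit failed even
though the server has committed the txn, which is the mismatch the issue
describes. With the idempotent variant, the retry is harmless and the caller
sees the true outcome.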