[ 
https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832666#comment-15832666
 ] 

Alan Gates commented on HIVE-13014:
-----------------------------------

In general the patch looks fine.  I have a couple of questions:
# What's the performance impact of looking up the annotations on the method 
every time through the retry handler?  Is it enough that we should build a map 
of methods to retriability so that subsequent lookups become O(1)?
# Why does this not apply to other metastore operations, like create table?  
That would also seem to be a case where a first attempt that timed out but 
actually succeeded could be masked by a failed second attempt.
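
To make the first question concrete, here is a minimal sketch of such a map. It is not taken from the patch: the annotation name NoRetry, the DemoClient interface and the RetriabilityCache class are all hypothetical stand-ins. The only assumption is that retriability is decided by a runtime-retained annotation on the client method.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical marker annotation; the one used in the patch may differ.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface NoRetry {}

// Hypothetical interface standing in for the metastore client API.
interface DemoClient {
  @NoRetry
  void commitTxn(long txnId);

  String getTable(String name);
}

final class RetriabilityCache {
  // Method implements equals/hashCode, so it is safe to use as a map key.
  private static final ConcurrentMap<Method, Boolean> RETRIABLE =
      new ConcurrentHashMap<>();

  private RetriabilityCache() {}

  // The annotation is inspected only on the first call for a given method;
  // later calls are a single hash lookup.
  static boolean isRetriable(Method method) {
    return RETRIABLE.computeIfAbsent(
        method, m -> !m.isAnnotationPresent(NoRetry.class));
  }

  // Convenience overload for callers that only have a method name.
  static boolean isRetriable(Class<?> type, String name, Class<?>... params) {
    try {
      return isRetriable(type.getMethod(name, params));
    } catch (NoSuchMethodException e) {
      throw new IllegalArgumentException(e);
    }
  }

  // Number of methods cached so far.
  static int cachedCount() {
    return RETRIABLE.size();
  }
}
```

Whether this is worth doing depends on how expensive the annotation lookup turns out to be in practice; the sketch only shows that the cache itself is cheap to add.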

> RetryingMetaStoreClient is retrying too aggressively
> ----------------------------------------------------
>
>                 Key: HIVE-13014
>                 URL: https://issues.apache.org/jira/browse/HIVE-13014
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore, Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Critical
>         Attachments: HIVE-13014.01.patch, HIVE-13014.02.patch, 
> HIVE-13014.03.patch, HIVE-13014.04.patch, HIVE-13014.05.patch, 
> HIVE-13014.06.patch, HIVE-13014.07.patch
>
>
> Not all metastore operations are idempotent.  For example, commit_txn() 
> consists of 
> 1. request from client to server
> 2. server action
> 3. ack to client
> If the network connection is broken after (or during) 2 but before 3 happens, 
> RetryingMetaStoreClient will retry the operation, thus causing an attempt to 
> commit the same txn twice (sometimes concurrently).
> The 2nd attempt is guaranteed to fail and thus return an error to the caller 
> (which doesn't know the operation is being retried), while the first attempt 
> has actually succeeded.  Thus the caller thinks commit failed and will likely 
> attempt to redo the transactions - not what we want in most cases.
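
The lost-ack scenario in the description can be sketched with a dynamic proxy that retries only calls not marked as non-idempotent. This is an illustration under stated assumptions, not the code in the attached patches: NonIdempotent, TxnClient, FlakyClient and SelectiveRetryHandler are hypothetical names, and the real handler also deals with reconnects and delays, which are left out here.

```java
import java.io.IOException;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical marker for operations that must not be retried blindly.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface NonIdempotent {}

// Hypothetical client interface with one non-idempotent operation.
interface TxnClient {
  @NonIdempotent
  void commitTxn(long txnId) throws Exception;

  String getTable(String name) throws Exception;
}

// Simulates a server whose ack is lost: the commit is applied (step 2)
// but the client never sees the response (step 3).
final class FlakyClient implements TxnClient {
  int commitCalls = 0;
  int getCalls = 0;

  @Override
  public void commitTxn(long txnId) throws Exception {
    commitCalls++;
    throw new IOException("connection lost before ack");
  }

  @Override
  public String getTable(String name) throws Exception {
    getCalls++;
    if (getCalls == 1) {
      throw new IOException("transient failure");
    }
    return name;
  }
}

final class SelectiveRetryHandler implements InvocationHandler {
  private final Object target;
  private final int maxAttempts;

  SelectiveRetryHandler(Object target, int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    this.target = target;
    this.maxAttempts = maxAttempts;
  }

  @Override
  public Object invoke(Object proxy, Method method, Object[] args)
      throws Throwable {
    // Non-idempotent calls get exactly one attempt; the caller sees the
    // original failure instead of a misleading error from a second attempt.
    int attempts =
        method.isAnnotationPresent(NonIdempotent.class) ? 1 : maxAttempts;
    Throwable last = null;
    for (int i = 0; i < attempts; i++) {
      try {
        return method.invoke(target, args);
      } catch (InvocationTargetException e) {
        last = e.getCause();
      }
    }
    throw last;
  }

  // Wraps a target in a proxy that applies the selective retry policy.
  static <T> T wrap(Class<T> iface, T target, int maxAttempts) {
    Object proxy = Proxy.newProxyInstance(
        iface.getClassLoader(),
        new Class<?>[] {iface},
        new SelectiveRetryHandler(target, maxAttempts));
    return iface.cast(proxy);
  }
}
```

With this policy a caller of commitTxn still gets an error when the ack is lost, but it is the connection error from the only attempt, so the caller can check the transaction state rather than assume the commit failed.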



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
