[ 
https://issues.apache.org/jira/browse/IMPALA-10866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18012056#comment-18012056
 ] 

ASF subversion and git services commented on IMPALA-10866:
----------------------------------------------------------

Commit 59fdd7169a4523a2c4916096d550855e49c8a35a in impala's branch 
refs/heads/master from Yida Wu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=59fdd7169 ]

IMPALA-10866: Add testcases for failure cases involving the admission service

The admission service uses the statestore as the only source of
truth to determine whether a coordinator is down. If the statestore
reports a coordinator is down, all running and queued queries
associated with it should be cancelled or rejected.

In IMPALA-12057, we introduced logic to reject queued queries if
the corresponding coordinator has been removed, along with tests
for that behavior.

This patch adds additional test cases to cover other failure
scenarios, such as the coordinator or the statestore going down
with running queries, and verifies that the behavior is as expected
in each case.

Tests:
Passed exhaustive tests.

Change-Id: If617326cbc6fe2567857d6323c6413d98c92d009
Reviewed-on: http://gerrit.cloudera.org:8080/23217
Reviewed-by: Riza Suminto <[email protected]>
Reviewed-by: Abhishek Rawat <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Ensure consistency between failure detection and registration/Ack of a 
> coordinator by the admission service
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-10866
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10866
>             Project: IMPALA
>          Issue Type: Sub-task
>    Affects Versions: Impala 4.0.0
>            Reporter: Bikramjeet Vig
>            Assignee: Bikramjeet Vig
>            Priority: Critical
>
> Ensure consistency between failure detection and registration/Ack of a 
> coordinator by the admission service.
>  Currently, the admission service uses statestore membership updates to 
> detect a coordinator going down, but it still services RPCs from that 
> coordinator if it is still up and able to contact the admission service.
>  Using the current mechanisms of statestore updates (IMPALA-10594), admission 
> heartbeats (IMPALA-10590, IMPALA-10720), and coordinator 
> registration (IMPALA-9976), ensure that consistency is maintained between 
> these mechanisms.
>  A possible implementation is:
>  - Use statestore as the only source of truth.
>  ** Consistency: Only allow a coord to register if it is registered with the 
> statestore.
>  ** Atomicity: If the statestore update signals that a coord is down, remove 
> all its state (running and queued queries) before allowing it to register 
> again.
>  OR
>  - Eventual consistency: Remove queries between subsequent statestore 
> updates, and if the coord comes back up and sends the full admission state, 
> update the state of that query id if it has not been removed yet (since 
> the full admission state only contains running queries). This can't be used 
> because only changes to the membership initiate the query removal process, 
> which would happen only once if a coord is removed.
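
The "statestore as the only source of truth" option described above can be sketched as a small in-memory model. This is an illustration only, not Impala's implementation: the class `AdmissionState`, its methods, and the coordinator and query ids are all hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class AdmissionState:
    """Hypothetical model of the admission service's per-coordinator state."""
    statestore_members: set = field(default_factory=set)
    registered: set = field(default_factory=set)
    running: dict = field(default_factory=dict)  # coord -> set of query ids
    queued: dict = field(default_factory=dict)   # coord -> set of query ids

    def on_statestore_update(self, members):
        """Apply a membership update and purge state of removed coordinators."""
        members = set(members)
        removed = self.statestore_members - members
        for coord in removed:
            # Atomicity: drop running and queued queries before the
            # coordinator is allowed to register again.
            self.running.pop(coord, None)
            self.queued.pop(coord, None)
            self.registered.discard(coord)
        self.statestore_members = members
        return removed

    def register(self, coord):
        """Consistency: accept registration only from statestore members."""
        if coord not in self.statestore_members:
            return False
        self.registered.add(coord)
        self.running.setdefault(coord, set())
        self.queued.setdefault(coord, set())
        return True

    def admit(self, coord, query_id, queue=False):
        """Record a query as running or queued; reject unregistered coords."""
        if coord not in self.registered:
            return False
        target = self.queued if queue else self.running
        target[coord].add(query_id)
        return True
```

In this sketch a coordinator that the statestore has reported as down cannot submit or re-register until a later membership update lists it again, which is the behavior the Consistency and Atomicity bullets ask for.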



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
