[ 
https://issues.apache.org/jira/browse/KUDU-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Serbin updated KUDU-3587:
--------------------------------
    Description: 
As of Kudu 1.17.0, the implementation of RetriableRpc for WriteRpc in the C++ 
client uses a linear back-off strategy, where the hold-off time interval (in 
milliseconds) is computed as
{noformat}
num_attempts + (rand() % 5)
{noformat}
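
To illustrate how slowly this delay grows, here is a minimal stand-alone sketch of the formula above. The function name LinearBackoffMs and the use of std::mt19937 in place of rand() are illustrative assumptions, not the client's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper mirroring the formula above:
//   delay_ms = num_attempts + (random value in [0, 4])
// This is a sketch for illustration, not code from the Kudu client.
inline int64_t LinearBackoffMs(int num_attempts, std::mt19937& rng) {
  assert(num_attempts >= 0);
  // Stands in for rand() % 5 in the original formula.
  std::uniform_int_distribution<int> jitter(0, 4);
  return static_cast<int64_t>(num_attempts) + jitter(rng);
}
```

Assuming attempts are counted from 1, the 10th retry waits only 10 to 14 ms, and the first 10 retries together wait between 55 and 95 ms in total, so a client keeps resending at a nearly constant rate while leadership is unsettled.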

Even if Kudu servers use separate incoming queues for different RPC interfaces 
(e.g. TabletServerService, ConsensusService, etc.), this retry strategy can 
still cause harm. In the presence of many active clients, many tablet replicas 
per tablet server, and ongoing Raft election storms due to frozen and/or slow 
RPC worker threads, the TabletServerService RPC queues can overflow with 
retried write requests to tablets whose leader replicas aren't yet established, 
so many unrelated write requests are dropped as well. It doesn't make sense to 
self-inflict such a DoS condition because of a non-optimal RPC retry strategy 
on the client side.

One option might be to use a linear back-off strategy when going round-robin 
through the recently refreshed list of tablet replicas, but to switch to an 
exponential strategy upon completing a full circle and issuing the next 
GetTablesLocation request to the Kudu master.
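
The option above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function HybridBackoffMs, its parameters (base_ms, max_ms), the 1-based attempt counter, and the cap on the exponential delay are all hypothetical and do not exist in the Kudu client. Jitter is left out to keep the sketch deterministic; a random component like the one in the current formula could be added on top.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical sketch of the hybrid back-off idea; not Kudu client code.
//   attempt      - 1-based count of failed attempts so far
//   num_replicas - number of replicas in the recently refreshed list
//   base_ms      - delay after the first full circle (assumed parameter)
//   max_ms       - upper bound for the exponential delay (assumed parameter)
inline int64_t HybridBackoffMs(int attempt, int num_replicas,
                               int64_t base_ms, int64_t max_ms) {
  assert(attempt > 0 && num_replicas > 0);
  assert(base_ms > 0 && max_ms >= base_ms);

  const int position = attempt % num_replicas;
  if (position != 0) {
    // Still going round-robin through the replica list: linear back-off
    // based on the position within the current circle.
    return position;
  }

  // A full circle is complete, so the next step would be a location refresh
  // from the master: back off exponentially in the number of full circles.
  const int rounds = attempt / num_replicas;
  int64_t delay = base_ms;
  for (int i = 1; i < rounds && delay < max_ms; ++i) {
    // Double the delay, saturating at max_ms without overflowing.
    delay = (delay > max_ms / 2) ? max_ms : delay * 2;
  }
  return std::min(delay, max_ms);
}
```

For example, with 3 replicas, base_ms = 16 and max_ms = 1000, failed attempts 1 and 2 wait 1 and 2 ms, attempt 3 (end of the first circle) waits 16 ms, attempt 6 waits 32 ms, and the delay after a full circle saturates at 1000 ms from the 7th circle on.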

  was:
As of Kudu 1.17.0, the implementation of RetriableRpc for WriteRpc in the C++ 
client uses a linear back-off strategy, where the hold-off time interval (in 
milliseconds) is computed as
{noformat}
num_attempts + (rand() % 5)
{noformat}

Since Kudu servers use a single queue for all their RPC interfaces (e.g. 
TabletServerService, ConsensusService, etc.), in the presence of many active 
clients and busy server nodes, this might start a Raft election storm or 
exacerbate an existing one by keeping the RPC queue full or almost full, so 
more ConsensusService requests are dropped out of overflown RPC queues.

Of course, separating RPC queues for different interfaces is one part of the 
remedy (e.g., see [KUDU-2955|https://issues.apache.org/jira/browse/KUDU-2955]), 
but even with separate RPC queues it doesn't make sense to self-inflict a DoS 
condition because of a non-optimal RPC retry strategy when there are many 
active clients and tablet leadership transition is in progress for many "hot" 
tables.

One option might be to use a linear back-off strategy when going round-robin 
through the recently refreshed list of tablet replicas, but to switch to an 
exponential strategy upon completing a full circle and issuing the next 
GetTablesLocation request to the Kudu master.


> Implement smarter back-off strategy for RetriableRpc upon receiving 
> REPLICA_NOT_LEADER response
> ----------------------------------------------------------------------------------------------
>
>                 Key: KUDU-3587
>                 URL: https://issues.apache.org/jira/browse/KUDU-3587
>             Project: Kudu
>          Issue Type: Improvement
>          Components: client
>            Reporter: Alexey Serbin
>            Priority: Major
>
> As of Kudu 1.17.0, the implementation of RetriableRpc for WriteRpc in the C++ 
> client uses a linear back-off strategy, where the hold-off time interval (in 
> milliseconds) is computed as
> {noformat}
> num_attempts + (rand() % 5)
> {noformat}
> Even if Kudu servers use separate incoming queues for different RPC 
> interfaces (e.g. TabletServerService, ConsensusService, etc.), this retry 
> strategy can still cause harm. In the presence of many active clients, many 
> tablet replicas per tablet server, and ongoing Raft election storms due to 
> frozen and/or slow RPC worker threads, the TabletServerService RPC queues 
> can overflow with retried write requests to tablets whose leader replicas 
> aren't yet established, so many unrelated write requests are dropped as 
> well. It doesn't make sense to self-inflict such a DoS condition because of 
> a non-optimal RPC retry strategy on the client side.
> One option might be to use a linear back-off strategy when going round-robin 
> through the recently refreshed list of tablet replicas, but to switch to an 
> exponential strategy upon completing a full circle and issuing the next 
> GetTablesLocation request to the Kudu master.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
