[ 
https://issues.apache.org/jira/browse/KUDU-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Serbin updated KUDU-3124:
--------------------------------
    Affects Version/s: 1.12.0
                       1.13.0

> A safer way to handle CreateTablet requests
> -------------------------------------------
>
>                 Key: KUDU-3124
>                 URL: https://issues.apache.org/jira/browse/KUDU-3124
>             Project: Kudu
>          Issue Type: Improvement
>          Components: master, tserver
>    Affects Versions: 1.2.0, 1.3.0, 1.3.1, 1.4.0, 1.5.0, 1.6.0, 1.7.0, 1.8.0, 
> 1.7.1, 1.9.0, 1.10.0, 1.10.1, 1.11.0, 1.12.0, 1.11.1, 1.13.0
>            Reporter: Alexey Serbin
>            Priority: Major
>
> As of now, the catalog manager (a part of kudu-master) sends
> {{CreateTabletRequest}} RPCs as soon as they are realized by
> {{CatalogManager::ProcessPendingAssignments()}}
> when processing the list of deferred DDL operations, and at this level there
> aren't any restrictions on how many of those might be in flight or sent to
> a particular tablet server (NOTE: there is the {{\-\-max_create_tablets_per_ts}}
> flag, but it works on a higher level and only during initial creation of a table).
> The {{CreateTablet}} requests are sent asynchronously, and if the tablet isn't
> created within {{\-\-tablet_creation_timeout_ms}} milliseconds, the catalog
> manager replaces all the tablet replicas, generating a new tablet UUID and
> sending corresponding {{CreateTabletRequest}} RPCs to a potentially different
> set of tablet servers.  Corresponding {{DeleteTabletRequest}} RPCs (to remove
> the replicas of the stalled-during-creation tablet) are sent separately in an
> asynchronous way as well.
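
The timeout-and-replace behavior described above can be pictured with a toy model. This is only a sketch under stated assumptions, not Kudu code: the class {{ToyCatalogManager}}, its methods, and the {{tablet-N}} identifiers are hypothetical, time is passed in explicitly, and the choice of tablet servers is left out entirely.

```python
import itertools


class ToyCatalogManager:
    """Hypothetical stand-in for the replace-on-timeout logic (not Kudu's API)."""

    def __init__(self, creation_timeout_ms):
        self.creation_timeout_ms = creation_timeout_ms
        self._ids = itertools.count(1)
        self.pending = {}  # tablet id -> time (ms) its create request was sent
        self.sent = []     # log of (rpc name, tablet id), in send order

    def create_tablet(self, now_ms):
        # Generate a fresh tablet id and "send" the create request.
        tablet_id = f"tablet-{next(self._ids)}"
        self.pending[tablet_id] = now_ms
        self.sent.append(("CreateTablet", tablet_id))
        return tablet_id

    def report_created(self, tablet_id):
        # The tablet came up in time, so it is no longer pending.
        self.pending.pop(tablet_id, None)

    def process_pending(self, now_ms):
        # Replace every tablet whose creation has exceeded the timeout:
        # delete the stalled one and create a replacement under a new id.
        for tablet_id, sent_at in list(self.pending.items()):
            if now_ms - sent_at >= self.creation_timeout_ms:
                del self.pending[tablet_id]
                self.sent.append(("DeleteTablet", tablet_id))
                self.create_tablet(now_ms)


cm = ToyCatalogManager(creation_timeout_ms=100)
cm.create_tablet(now_ms=0)
cm.process_pending(now_ms=100)  # the first tablet stalled, so it is replaced
for rpc in cm.sent:
    print(rpc)
```

Note that in this model nothing bounds how many create requests are outstanding at once, which is the gap the issue description goes on to discuss.
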
> There are at least two issues with this approach:
> # The {{\-\-max_create_tablets_per_ts}} threshold limits the number of
> concurrent requests hitting one tablet server only during the initial
> creation of a table. However, nothing limits how many requests to create a
> tablet replica might hit a tablet server when adding partitions to an existing
> table as a result of an ALTER TABLE request.
> # {{DeleteTabletRequest}} RPCs sometimes might not get into the RPC queues of
> corresponding tablet servers, and the catalog manager stops retrying them
> after the {{\-\-unresponsive_ts_rpc_timeout_ms}} interval.  This might spiral
> into a situation where requests to create replacement tablet replicas are
> passing through and being executed by tablet servers, but the corresponding
> requests to delete tablet replicas cannot get through because of queue
> overflows, with the catalog manager eventually giving up on retrying the
> latter.  Eventually, tablet servers end up with a huge number of tablet
> replicas created, and they crash after running out of memory.  The crashed
> tablet servers cannot start after that because they run out of memory again
> while trying to bootstrap the huge number of tablet replicas. See
> https://gerrit.cloudera.org/#/c/15912/ for the reproduction scenario and
> [KUDU-2453|https://issues.apache.org/jira/browse/KUDU-2453] for the
> corresponding issue reported some time ago.
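
To make the second issue more concrete, here is a rough simulation of the spiral. It is a deliberately simplified sketch, not a model of Kudu's RPC layer: the function {{simulate_spiral}} and all of its parameters are hypothetical, there is a single tablet server that accepts a fixed number of requests per round, create requests are assumed to be queued ahead of delete requests, and delete retries are counted in rounds rather than bounded by a timeout.

```python
def simulate_spiral(rounds, creates_per_round, queue_capacity, max_delete_attempts):
    """Return (live_replicas, abandoned_deletes) after the given number of rounds."""
    live = set()          # replicas present on the toy tablet server
    delete_attempts = {}  # tablet id -> failed delete attempts so far
    abandoned = 0
    next_id = 0

    for _ in range(rounds):
        # Assumption: every earlier tablet is considered stalled, so each round
        # brings a batch of replacement creates plus deletes for older tablets.
        creates = list(range(next_id, next_id + creates_per_round))
        next_id += creates_per_round
        requests = [("create", t) for t in creates]
        requests += [("delete", t) for t in delete_attempts]

        # The server takes only the first `queue_capacity` requests;
        # the rest are rejected as a queue overflow.
        accepted = requests[:queue_capacity]
        rejected = requests[queue_capacity:]

        for kind, tablet in accepted:
            if kind == "create":
                live.add(tablet)
            else:
                live.discard(tablet)
                del delete_attempts[tablet]

        for kind, tablet in rejected:
            if kind == "delete":
                delete_attempts[tablet] += 1
                if delete_attempts[tablet] >= max_delete_attempts:
                    # The manager gives up; the replica is never removed.
                    del delete_attempts[tablet]
                    abandoned += 1

        # Deletes for this round's tablets are first sent in the next round.
        for tablet in creates:
            delete_attempts[tablet] = 0

    return len(live), abandoned


print(simulate_spiral(5, 4, queue_capacity=4, max_delete_attempts=2))
print(simulate_spiral(5, 4, queue_capacity=100, max_delete_attempts=2))
```

With a queue just large enough for the creates, the replica count grows every round while the deletes are eventually abandoned; with ample queue capacity the count stays flat. The real system is asynchronous and far less orderly, so the numbers here only illustrate the direction of the effect.
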



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
