[ https://issues.apache.org/jira/browse/KUDU-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Serbin updated KUDU-3124:
--------------------------------
    Affects Version/s: 1.12.0
                       1.13.0

> A safer way to handle CreateTablet requests
> -------------------------------------------
>
>                 Key: KUDU-3124
>                 URL: https://issues.apache.org/jira/browse/KUDU-3124
>             Project: Kudu
>          Issue Type: Improvement
>          Components: master, tserver
>    Affects Versions: 1.2.0, 1.3.0, 1.3.1, 1.4.0, 1.5.0, 1.6.0, 1.7.0, 1.8.0,
>                      1.7.1, 1.9.0, 1.10.0, 1.10.1, 1.11.0, 1.12.0, 1.11.1, 1.13.0
>            Reporter: Alexey Serbin
>            Priority: Major
>
> As of now, catalog manager (a part of kudu-master) sends
> {{CreateTabletRequest}} RPCs as soon as they are produced by
> {{CatalogManager::ProcessPendingAssignments()}} while it processes the list
> of deferred DDL operations. At this level there aren't any restrictions on
> how many of those might be in flight or sent to a particular tablet server
> (NOTE: there is the {{\-\-max_create_tablets_per_ts}} flag, but it works at a
> higher level and only during the initial creation of a table).
>
> The {{CreateTablet}} requests are sent asynchronously. If the tablet isn't
> created within {{\-\-tablet_creation_timeout_ms}} milliseconds, catalog
> manager replaces all the tablet replicas, generating a new tablet UUID and
> sending the corresponding {{CreateTabletRequest}} RPCs to a potentially
> different set of tablet servers. The corresponding {{DeleteTabletRequest}}
> RPCs (to remove the replicas of the stalled-during-creation tablet) are sent
> separately, also asynchronously.
>
> There are at least two issues with this approach:
> # The {{\-\-max_create_tablets_per_ts}} threshold limits the number of
> concurrent requests hitting one tablet server only during the initial
> creation of a table. However, nothing limits how many requests to create a
> tablet replica might hit a tablet server when adding partitions to an
> existing table as a result of an ALTER TABLE request.
> # {{DeleteTabletRequest}} RPCs sometimes might not get into the RPC queues of
> the corresponding tablet servers, and catalog manager stops retrying them
> after the {{\-\-unresponsive_ts_rpc_timeout_ms}} interval. This might spiral
> into a situation where requests to create replacement tablet replicas get
> through and are executed by tablet servers, but the corresponding requests to
> delete tablet replicas cannot get through because of queue overflows, and
> catalog manager eventually gives up retrying them. As a result, tablet
> servers end up with a huge number of tablet replicas and crash, running out
> of memory. The crashed tablet servers cannot start after that because they
> run out of memory again while trying to bootstrap the huge number of tablet
> replicas.
> See https://gerrit.cloudera.org/#/c/15912/ for the reproduction scenario and
> [KUDU-2453|https://issues.apache.org/jira/browse/KUDU-2453] for the
> corresponding issue reported some time ago.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
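
For illustration only: the first issue in the description is that nothing caps the number of CreateTablet requests in flight to one tablet server once a table already exists. The toy Python model below sketches what such a per-server cap could look like. It is not Kudu code and not the fix proposed in the issue; the class name, method names, and the max_in_flight_per_ts parameter are all hypothetical.

```python
from collections import defaultdict, deque


class CreateTabletThrottle:
    """Toy per-tablet-server cap on in-flight CreateTablet requests.

    Hypothetical illustration; none of these names exist in Kudu.
    """

    def __init__(self, max_in_flight_per_ts):
        if max_in_flight_per_ts < 1:
            raise ValueError("max_in_flight_per_ts must be positive")
        self.max_in_flight_per_ts = max_in_flight_per_ts
        # ts_uuid -> number of requests sent but not yet answered
        self.in_flight = defaultdict(int)
        # ts_uuid -> tablet ids waiting for a free slot, in arrival order
        self.deferred = defaultdict(deque)

    def submit(self, ts_uuid, tablet_id):
        """Return True if the request may be sent now, False if deferred."""
        if self.in_flight[ts_uuid] < self.max_in_flight_per_ts:
            self.in_flight[ts_uuid] += 1
            return True
        self.deferred[ts_uuid].append(tablet_id)
        return False

    def complete(self, ts_uuid):
        """Record a response or timeout for one in-flight request.

        Returns the next deferred tablet id to send, or None if none waits.
        """
        if self.in_flight[ts_uuid] == 0:
            raise RuntimeError("no in-flight request for " + ts_uuid)
        self.in_flight[ts_uuid] -= 1
        if self.deferred[ts_uuid]:
            # The freed slot is handed to the oldest deferred request.
            self.in_flight[ts_uuid] += 1
            return self.deferred[ts_uuid].popleft()
        return None


# Five requests target the same server while the cap is two:
# the first two are sent, the remaining three wait for a slot.
throttle = CreateTabletThrottle(max_in_flight_per_ts=2)
sent = [throttle.submit("ts-1", "tablet-%d" % i) for i in range(5)]
```

In this sketch a deferred request is released only when an earlier one completes or times out, so one server never has more than max_in_flight_per_ts creation requests outstanding, regardless of whether they come from CREATE TABLE or ALTER TABLE.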