[ https://issues.apache.org/jira/browse/KUDU-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913097#comment-17913097 ]
ASF subversion and git services commented on KUDU-3486:
-------------------------------------------------------

Commit 0ddaac556f7bc7aeb47db740300921d10eabd856 in kudu's branch refs/heads/master from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=0ddaac556 ]

[tserver] disable KUDU-3486 behavior by default

This is a quick-and-dirty fix to mitigate KUDU-3638. This patch isn't
focusing on properly addressing the issues that KUDU-3486 has introduced
apart from fixing the obvious bug of missing updates of the
Heartbeater::Thread::last_tombstoned_report_time_ field. Also, with this
patch, the functionality introduced with KUDU-3486 is now disabled by
default. To re-enable it back, customize the setting for the
--tserver_send_tombstoned_tablets_report_inteval_secs flag, if needed.

Properly implementing the functionality that KUDU-3486 attempted to add
would be a much more involved patch because there are several items to
address from both the design and implementation standpoints.

Change-Id: I8e32aafab99c74f0ead3ba65aea58ce91d40297c
Reviewed-on: http://gerrit.cloudera.org:8080/22341
Reviewed-by: Abhishek Chennaka <achenn...@cloudera.com>
Tested-by: Alexey Serbin <ale...@apache.org>

> Tserver: Too many tombstone tablet may lead to high memory usage.
> -----------------------------------------------------------------
>
>                 Key: KUDU-3486
>                 URL: https://issues.apache.org/jira/browse/KUDU-3486
>             Project: Kudu
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.14.0
>            Reporter: Song Jiacheng
>            Priority: Minor
>             Fix For: 1.18.0, 1.17.1
>
>         Attachments: image-2023-07-06-15-59-44-181.png
>
>
> There are two kinds of tablet replica deletion: tombstone and delete. A
> tombstone tablet replica might never be deleted since the delete-type
> deletion could only occur when the tablet is deleted, and the requests will
> be sent to the voters, not including the tombstone ones.
> Here is an example:
> Tablet T:
>   replica A
>   replica B
>   replica C
> After rebalance:
>   replica A
>   replica B
>   replica C (Tombstone)
>   replica D
> When tablet T is deleted, A, B, and D are deleted, and C exists forever.
> As this picture shows, the tablet had already been deleted at 3:00 am on
> 13th Jun, but the tombstone replica still exists.
> !image-2023-07-06-15-59-44-181.png|width=568,height=261!
> The data of a tombstone replica is deleted, but its metadata is kept in
> memory; SchemaPB, the biggest part of it, occupies a lot of memory.
> In some of our clusters, the tombstone replicas on each tserver can reach
> 50k ~ 100k, which takes about 10G.
> Adding a vector to each tablet to store the history of tablet servers that
> used to hold a replica of the tablet would take too many resources. So I
> think a periodic heartbeat might be a good way to solve the problem.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
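
The commit message above mentions two things: a report-time field that was never updated, and an interval flag that now leaves the feature off by default. A minimal sketch of that kind of interval gating follows. It is not Kudu's code: the class TombstonedReportTimer and its method ShouldReport are hypothetical names, and treating a non-positive interval as "disabled" is an assumption of this sketch, not something the commit message states.

```cpp
#include <chrono>

// Hypothetical helper (not part of Kudu): decides whether a periodic report
// of tombstoned replicas is due.
class TombstonedReportTimer {
 public:
  using Clock = std::chrono::steady_clock;

  // Assumption of this sketch: an interval <= 0 means the feature is off.
  explicit TombstonedReportTimer(std::chrono::seconds interval)
      : interval_(interval), has_reported_(false), last_report_() {}

  // Returns true if a report should be sent at `now`. The first call with a
  // positive interval always reports.
  bool ShouldReport(Clock::time_point now) {
    if (interval_ <= std::chrono::seconds::zero()) {
      return false;  // feature disabled
    }
    if (has_reported_ && now - last_report_ < interval_) {
      return false;  // too soon since the previous report
    }
    // Record the report time. Without this update the elapsed-time check
    // above would always compare against a stale value.
    has_reported_ = true;
    last_report_ = now;
    return true;
  }

 private:
  std::chrono::seconds interval_;
  bool has_reported_;
  Clock::time_point last_report_;
};
```

Passing the current time in as an argument, rather than reading the clock inside the method, keeps the sketch easy to exercise with fixed time points.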