Alexey Serbin created KUDU-3638:
-----------------------------------

             Summary: Deficiencies in cleaning up tombstoned tablets lead to flooding logs and high CPU usage at Kudu master nodes
                 Key: KUDU-3638
                 URL: https://issues.apache.org/jira/browse/KUDU-3638
             Project: Kudu
          Issue Type: Bug
          Components: master, tserver
    Affects Versions: 1.17.1
            Reporter: Alexey Serbin

In the scope of implementing [KUDU-3486|https://issues.apache.org/jira/browse/KUDU-3486], a few deficiencies were introduced that manifest themselves at least as follows:
* Tablet servers hosting tombstoned replicas of tablets that belong to still-existing tables report all of those replicas to the leader master with every incremental heartbeat. This starts about 30 minutes after the tablet server starts, or after the interval customized by the {{\-\-tserver_send_tombstoned_tablets_report_inteval_secs}} flag.
* The leader master floods its INFO log with messages like the one below, logging the same records again and again while processing every incremental heartbeat from the tablet servers described in the previous item:
{noformat}
... catalog_manager.cc:5516] TS <ts_UUID> (<ta_node_name>:7050) does not have the latest schema for tablet <tablet_UUID> (table <table_name> [id=<table_UUID>]). Expected version A got B
{noformat}

As a temporary workaround, set {{\-\-tserver_send_tombstoned_tablets_report_inteval_secs=-1}} for tablet servers. NOTE: since the flag is runtime by its nature, it's possible to address the issue without restarting tablet servers by using the {{kudu tserver set_flag}} CLI tool.
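
For clusters with many tablet servers, the runtime workaround above could be scripted. The following is only a minimal sketch: it builds and prints the command lines instead of running them. The host names are hypothetical, and the positional order {{<tserver_address> <flag> <value>}} is an assumption to verify against the usage output of {{kudu tserver set_flag}} for the installed version. Whether the negative value or the flag itself needs extra options to be accepted by the tool is not covered here.

```python
import shlex

# Flag name copied from this report (the "inteval" spelling is as written there).
FLAG = "tserver_send_tombstoned_tablets_report_inteval_secs"


def build_set_flag_command(tserver_addr, flag=FLAG, value="-1"):
    """Return the argv list for one `kudu tserver set_flag` invocation."""
    # Assumed positional order: <tserver_address> <flag> <value>.
    return ["kudu", "tserver", "set_flag", tserver_addr, flag, str(value)]


def render_commands(tserver_addrs):
    """Return one shell-quoted command line per tablet server address."""
    return [shlex.join(build_set_flag_command(addr)) for addr in tserver_addrs]


if __name__ == "__main__":
    # Hypothetical addresses; 7050 is the port shown in the log sample above.
    for line in render_commands(["ts1.example.com:7050", "ts2.example.com:7050"]):
        print(line)
```

Printing the commands first makes it possible to review them before applying the change to a live cluster. A flag changed at runtime this way would presumably not persist across a tablet server restart, so the setting may also need to go into the tablet servers' flag files.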