Alexey Serbin created KUDU-3638:
-----------------------------------

             Summary: Deficiencies in cleaning up tombstoned tablets lead to 
flooding logs and high CPU usage at Kudu master nodes
                 Key: KUDU-3638
                 URL: https://issues.apache.org/jira/browse/KUDU-3638
             Project: Kudu
          Issue Type: Bug
          Components: master, tserver
    Affects Versions: 1.17.1
            Reporter: Alexey Serbin


While implementing 
[KUDU-3486|https://issues.apache.org/jira/browse/KUDU-3486], a few deficiencies 
were introduced. They manifest themselves at least as follows:
* Tablet servers that host tombstoned replicas of tablets belonging to 
still-existing tables report all of those replicas to the leader master with 
every incremental heartbeat. This begins about 30 minutes after the tablet 
server starts, or after the interval customized by the 
{{\-\-tserver_send_tombstoned_tablets_report_inteval_secs}} flag
* The leader master floods its INFO log with messages like the one below, 
logging the same records again and again while processing every incremental 
heartbeat from the tablet servers described in the previous item {noformat}
... catalog_manager.cc:5516] TS <ts_UUID> (<ta_node_name>:7050) does not have 
the latest schema for tablet <tablet_UUID> (table <table_name> 
[id=<table_UUID>]). Expected version A got B
{noformat}
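
To gauge how heavily a master is affected, the repeated messages can be counted per tablet server and tablet. The following is only a rough sketch: the pattern is derived from the sample message above, it assumes each message occupies a single line in the real log (the wrapping above comes from the mail formatting), and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern modeled on the sample log message in this report; the log prefix
# (timestamp, thread id, source location) is deliberately not matched.
STALE_SCHEMA_RE = re.compile(
    r"TS (?P<ts>\S+) \((?P<addr>[^)]*)\) does not have the latest schema "
    r"for tablet (?P<tablet>\S+) \(table .*? \[id=[^\]]+\]\)\. "
    r"Expected version (?P<expected>\d+) got (?P<got>\d+)"
)


def count_stale_schema_reports(lines):
    """Count matching messages per (tablet server UUID, tablet UUID) pair."""
    counts = Counter()
    for line in lines:
        match = STALE_SCHEMA_RE.search(line)
        if match:
            counts[(match.group("ts"), match.group("tablet"))] += 1
    return counts
```

Feeding the function the lines of a master INFO log and looking at {{counts.most_common(10)}} would show which pairs are logged most often. Pairs with counts that keep growing across heartbeats are consistent with the behavior described above.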

As a temporary workaround, set 
{{\-\-tserver_send_tombstoned_tablets_report_inteval_secs=-1}} for tablet 
servers. NOTE: since the flag is runtime by its nature, it can be changed 
without restarting tablet servers by using the {{kudu tserver set_flag}} CLI 
tool.
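
A minimal sketch of applying the workaround across several tablet servers is shown here. It assumes the {{kudu}} binary is on the PATH and that the flag name is spelled exactly as in the first mention in this report. The helper names and the server addresses are hypothetical. Whether {{--force}} is required depends on how the flag is tagged in a given build, so it is left as an option.

```python
import subprocess

# Flag name copied from this report, including its original spelling.
FLAG = "tserver_send_tombstoned_tablets_report_inteval_secs"


def build_set_flag_cmd(tserver_addr, value=-1, force=False):
    """Build the argv for `kudu tserver set_flag <address> <flag> <value>`."""
    cmd = ["kudu", "tserver", "set_flag", tserver_addr, FLAG, str(value)]
    if force:
        # Assumption: only needed if the flag is not tagged as runtime.
        cmd.append("--force")
    return cmd


def apply_workaround(tserver_addrs, dry_run=True, force=False):
    """Print (dry run) or execute the set_flag command for each server."""
    cmds = [build_set_flag_cmd(addr, force=force) for addr in tserver_addrs]
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return cmds
```

For example, {{apply_workaround(["ts1.example.com:7050"])}} only prints the command, and passing {{dry_run=False}} runs it. A flag changed this way does not persist across a tablet server restart, so the same value would also have to be added to the server's flag file to survive one.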



