[ https://issues.apache.org/jira/browse/KUDU-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated KUDU-2147:
------------------------------
Resolution: Fixed
Fix Version/s: 1.6.0
Status: Resolved (was: In Review)
> Unknown leader treated as valid TS UUID by catalog manager
> ----------------------------------------------------------
>
> Key: KUDU-2147
> URL: https://issues.apache.org/jira/browse/KUDU-2147
> Project: Kudu
> Issue Type: Bug
> Components: consensus, master
> Reporter: Mike Percy
> Assignee: Mike Percy
> Fix For: 1.6.0
>
>
> A bug was observed on a development cluster in which an empty string, reported
> as the leader in a TS heartbeat to the master, was treated as a valid TS UUID
> by the TSPicker. As a result, the master attempted to contact an empty-string
> member of the cluster to add a new replica and was unable to connect.
> The ksck output looked like this:
> {code}
> Tablet c4e8c3260dda48efbb7e182b569a7fc6 of table
> 'impala::tpch_1000_kudu.customer' is under-replicated: configuration has 2
> replicas vs desired 3
> 09d6bf7a02124145b43f43cb7a667b3d (host1.foo.example.com:7050): RUNNING
> a662440710624c02bd5612df32cb0235 (host2.foo.example.com:7050): RUNNING
> 2 replicas' active configs differ from the master's.
> All the peers reported by the master and tablet servers are:
> A = 09d6bf7a02124145b43f43cb7a667b3d
> B = a662440710624c02bd5612df32cb0235
> The consensus matrix is:
>  Config source | Voters   | Current term | Config index | Committed?
> ---------------+----------+--------------+--------------+------------
>  master        | A B      |              |              | Yes
>  A             | A* B     | 101          | 5441         | Yes
>  B             | A* B     | 101          | 5441         | Yes
> Table impala::tpch_1000_kudu.customer has 1 under-replicated tablet(s)
> {code}
> There were accompanying error messages printed in the catalog manager log
> that looked like this:
> {code}
> I0914 15:03:35.774370 22121 catalog_manager.cc:2988] Scheduling retry of
> AddServer ChangeConfig RPC for tablet c4e8c3260dda48efbb7e182b569a7fc6 with
> cas_config_opid_index 5441 with a delay of 24 ms (attempt = 1)
> W0914 15:03:35.774382 22121 catalog_manager.cc:3007] Async tablet task
> AddServer ChangeConfig RPC for tablet c4e8c3260dda48efbb7e182b569a7fc6 with
> cas_config_opid_index 5441 failed: Not found: Failed to reset TS proxy: Could
> not find TS for UUID
> {code}
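
To illustrate the failure mode described above, here is a minimal Python sketch of a leader lookup that refuses an empty leader UUID before consulting the table of known tablet servers. This is not Kudu's code and not the actual fix for this issue: `ConsensusState`, `NoLeaderError`, `pick_leader`, and the `registered` mapping are hypothetical names made up for the example. The only assumption taken from the report is that an empty string means "leader unknown".

```python
from dataclasses import dataclass
from typing import Dict, Tuple


class NoLeaderError(Exception):
    """Raised when the reported state names no usable leader (hypothetical)."""


@dataclass
class ConsensusState:
    # Hypothetical stand-in for the consensus state carried in a TS heartbeat.
    leader_uuid: str          # empty string means "leader unknown"
    voters: Tuple[str, ...]   # UUIDs of the voting replicas


def pick_leader(cstate: ConsensusState, registered: Dict[str, str]) -> str:
    """Return the address of the reported leader, or raise if there is none.

    `registered` maps tablet server UUID -> "host:port" (an assumption made
    for this sketch).
    """
    uuid = cstate.leader_uuid
    # An empty UUID is never a valid server identity, so fail early with a
    # distinct error instead of looking it up like a real UUID.
    if not uuid:
        raise NoLeaderError("no leader reported for this tablet")
    # A leader that is not one of the voters is also not usable.
    if uuid not in cstate.voters:
        raise NoLeaderError(f"reported leader {uuid} is not a voter")
    addr = registered.get(uuid)
    if addr is None:
        raise LookupError(f"could not find TS for UUID {uuid}")
    return addr
```

With a check like this, a caller can tell "no leader known yet" apart from "leader UUID not registered" and retry later, without trying to open a connection to an empty-string server.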