[ https://issues.apache.org/jira/browse/KUDU-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929183#comment-17929183 ]
ASF subversion and git services commented on KUDU-3641:
-------------------------------------------------------

Commit 8ce91854a3ae749ea02c45096dfff4b877050a82 in kudu's branch refs/heads/master from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=8ce91854a ]

KUDU-3641 fix flaky TestNewLeaderCantResolvePeers (take 2)

I noticed that even after [1] the TestNewLeaderCantResolvePeers scenario
was still failing about once in 60 runs [2], so I took a closer look.
It turns out StartElection() doesn't trigger a re-election if it arrives
at the current Raft leader. To address that, this patch switches from
StartElection() back to LeaderStepDown(), but with the new target leader
being the tablet replica at the third tablet server.

This is a follow-up to [1].

[1] https://github.com/apache/kudu/commit/6c77ec875
[2] http://dist-test.cloudera.org:8080/test_drilldown?test_name=raft_consensus_election-itest

Change-Id: I3bee924353079f7c8bfab6d0d5a6367bd1ee243e
Reviewed-on: http://gerrit.cloudera.org:8080/22516
Tested-by: Alexey Serbin <ale...@apache.org>
Reviewed-by: Yifan Zhang <chinazhangyi...@163.com>


> RaftConsensusElectionITest.TestNewLeaderCantResolvePeers scenario fails from time to time
> -----------------------------------------------------------------------------------------
>
>                 Key: KUDU-3641
>                 URL: https://issues.apache.org/jira/browse/KUDU-3641
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus, test
>    Affects Versions: 1.17.0, 1.17.1
>            Reporter: Alexey Serbin
>            Assignee: Alexey Serbin
>            Priority: Major
>             Fix For: 1.18.0
>
>         Attachments: raft_consensus_election-itest.log.xz
>
>
> The {{RaftConsensusElectionITest.TestNewLeaderCantResolvePeers}} scenario of
> {{raft_consensus_election-itest}} fails spuriously, at least in DEBUG and
> ASAN builds, with errors like the following:
> {noformat}
> src/kudu/integration-tests/raft_consensus_election-itest.cc:291: Failure
> Value of: tablets.empty()
>   Actual: true
> Expected: false
> src/kudu/util/test_util.cc:401: Failure
> Failed
> Timed out waiting for assertion to pass.
> {noformat}
> The log is attached.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
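
The difference between the two calls named in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not Kudu's code: the ToyRaftGroup class, its methods, and the "ts-0".."ts-2" names are hypothetical, and elections are assumed to always succeed. It models just the behavior the commit message describes: an election request that lands on the current leader changes nothing, while a step-down with an explicit target moves leadership.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyRaftGroup:
    """Toy stand-in for a three-replica Raft group (hypothetical, not Kudu's API)."""
    replicas: List[str]
    leader: str
    term: int = 1

    def start_election(self, target: str) -> bool:
        """Ask `target` to start an election; return True if leadership changed."""
        if target == self.leader:
            # Behavior described in the commit message: the request is a
            # no-op when it arrives at the replica that is already leader.
            return False
        # Assumption for the toy model: the election always succeeds.
        self.term += 1
        self.leader = target
        return True

    def leader_step_down(self, new_leader: str) -> bool:
        """Make the current leader step down in favor of `new_leader`."""
        if new_leader not in self.replicas or new_leader == self.leader:
            return False
        self.term += 1
        self.leader = new_leader
        return True


group = ToyRaftGroup(replicas=["ts-0", "ts-1", "ts-2"], leader="ts-0")
changed_by_election = group.start_election("ts-0")     # lands on the leader
changed_by_step_down = group.leader_step_down("ts-2")  # explicit target
```

In this model the first call leaves both the leader and the term untouched, which is one plausible reading of why the scenario could occasionally stall when the request happened to reach the current leader, whereas the targeted step-down always produces a leadership change.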