[ https://issues.apache.org/jira/browse/KAFKA-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905939#comment-15905939 ]
Apurva Mehta commented on KAFKA-4574:
-------------------------------------

Here are all the leader changes for this partition across all the controller logs:

{noformat}
amehta-macbook-pro:KafkaService-0-140193561885648 apurva$ for i in `find . -name controller.log`; do echo $i; perl -lne 'if ($_ =~ /^\[(.*)\] DEBUG.*After leader election.*(\[test_topic,2\] -> \(Leader:\d+,ISR:\d+,LeaderEpoch:\d+,ControllerEpoch:\d+\))/) { print "$1 -> $2"; }' $i; done
./worker2/debug/controller.log
2017-03-09 05:20:43,538 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:2,ControllerEpoch:2)
2017-03-09 05:20:43,544 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:2,ControllerEpoch:2)
2017-03-09 05:20:43,573 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:2,ControllerEpoch:2)
2017-03-09 05:20:43,579 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:2,ControllerEpoch:2)
2017-03-09 05:20:43,593 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:2,ControllerEpoch:2)
2017-03-09 05:20:43,608 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:2,ControllerEpoch:2)
2017-03-09 05:20:43,621 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:2,ControllerEpoch:2)
2017-03-09 05:20:43,629 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:2,ControllerEpoch:2)
2017-03-09 05:20:43,635 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:2,ControllerEpoch:2)
2017-03-09 05:20:43,641 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:2,ControllerEpoch:2)
2017-03-09 05:20:43,660 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:2,ControllerEpoch:2)
./worker2/info/controller.log
./worker6/debug/controller.log
2017-03-09 05:20:51,053 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,059 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,065 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,072 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,079 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,086 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,092 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,098 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,104 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,112 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,118 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,130 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,137 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,146 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,153 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,164 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,189 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:20:51,196 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:3,ControllerEpoch:3)
2017-03-09 05:21:18,092 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,101 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,108 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,116 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,126 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,133 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,145 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,153 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,164 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,171 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,177 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,184 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,192 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,199 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,207 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,218 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,225 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:21:18,230 -> [test_topic,2] -> (Leader:1,ISR:1,LeaderEpoch:7,ControllerEpoch:6)
2017-03-09 05:22:59,462 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,472 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,480 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,489 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,499 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,508 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,517 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,526 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,544 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,554 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,562 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,570 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,578 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,586 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,594 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
2017-03-09 05:22:59,602 -> [test_topic,2] -> (Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:9)
./worker6/info/controller.log
./worker8/debug/controller.log
./worker8/info/controller.log
amehta-macbook-pro:KafkaService-0-140193561885648 apurva$
{noformat}

> Transient failure in ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade with security_protocol = SASL_PLAINTEXT, SSL
> -----------------------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-4574
> URL: https://issues.apache.org/jira/browse/KAFKA-4574
> Project: Kafka
> Issue Type: Test
> Components: system tests
> Reporter: Shikhar Bhushan
> Assignee: Apurva Mehta
>
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-12-29--001.1483003056--apache--trunk--dc55025/report.html
> {{ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade}} failed with these {{security_protocol}} parameters
> {noformat}
> ====================================================================================================
> test_id: kafkatest.tests.core.zookeeper_security_upgrade_test.ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol=SASL_PLAINTEXT
> status: FAIL
> run time: 3 minutes 44.094 seconds
> 1 acked message did not make it to the Consumer. They are: [5076]. We validated that the first 1 of these missing messages correctly made it into Kafka's data files. This suggests they were lost on their way to the consumer.
> Traceback (most recent call last):
>   File "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 123, in run
>     data = self.run_test()
>   File "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 176, in run_test
>     return self.test_context.function(self.test)
>   File "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/mark/_mark.py", line 321, in wrapper
>     return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/core/zookeeper_security_upgrade_test.py", line 117, in test_zk_security_upgrade
>     self.run_produce_consume_validate(self.run_zk_migration)
>   File "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 101, in run_produce_consume_validate
>     self.validate()
>   File "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 163, in validate
>     assert success, msg
> AssertionError: 1 acked message did not make it to the Consumer. They are: [5076]. We validated that the first 1 of these missing messages correctly made it into Kafka's data files. This suggests they were lost on their way to the consumer.
> {noformat}
> {noformat}
> ====================================================================================================
> test_id: kafkatest.tests.core.zookeeper_security_upgrade_test.ZooKeeperSecurityUpgradeTest.test_zk_security_upgrade.security_protocol=SSL
> status: FAIL
> run time: 3 minutes 50.578 seconds
> 1 acked message did not make it to the Consumer. They are: [3559]. We validated that the first 1 of these missing messages correctly made it into Kafka's data files. This suggests they were lost on their way to the consumer.
> Traceback (most recent call last):
>   File "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 123, in run
>     data = self.run_test()
>   File "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 176, in run_test
>     return self.test_context.function(self.test)
>   File "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/mark/_mark.py", line 321, in wrapper
>     return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/core/zookeeper_security_upgrade_test.py", line 117, in test_zk_security_upgrade
>     self.run_produce_consume_validate(self.run_zk_migration)
>   File "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 101, in run_produce_consume_validate
>     self.validate()
>   File "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 163, in validate
>     assert success, msg
> AssertionError: 1 acked message did not make it to the Consumer. They are: [3559]. We validated that the first 1 of these missing messages correctly made it into Kafka's data files. This suggests they were lost on their way to the consumer.
> {noformat}
> Previously: KAFKA-3985

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
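
For anyone repeating the log search from the comment, here is a rough Python sketch of the same extraction. It reuses the regex from the perl one-liner and additionally drops consecutive duplicates, so only actual changes in leader or epoch are reported. The function name `leader_changes` and the choice to collapse duplicates are my own assumptions, not part of the original command, and the exact text between "DEBUG" and the partition state in a real controller.log line is not shown in the comment, so the pattern leaves it loose.

```python
import re

# Same shape as the perl pattern in the comment: a bracketed timestamp, a DEBUG
# entry containing "After leader election", then the state of [test_topic,2].
# Like the original, it only matches a single-broker ISR (ISR:\d+).
PATTERN = re.compile(
    r"^\[(.*?)\] DEBUG.*After leader election.*"
    r"\[test_topic,2\] -> "
    r"\(Leader:(\d+),ISR:(\d+),LeaderEpoch:(\d+),ControllerEpoch:(\d+)\)"
)


def leader_changes(lines):
    """Yield (timestamp, (leader, isr, leader_epoch, controller_epoch)).

    Hypothetical helper: an entry is emitted only when the state differs from
    the previously matched one, which collapses the repeated log lines.
    """
    previous = None
    for line in lines:
        match = PATTERN.match(line)
        if match is None:
            continue
        state = tuple(int(group) for group in match.groups()[1:])
        if state != previous:
            yield match.group(1), state
            previous = state


if __name__ == "__main__":
    import sys

    # Usage: python leader_changes.py ./worker6/debug/controller.log
    for path in sys.argv[1:]:
        print(path)
        with open(path) as handle:
            for timestamp, state in leader_changes(handle):
                print("%s -> Leader:%d,ISR:%d,LeaderEpoch:%d,ControllerEpoch:%d"
                      % ((timestamp,) + state))
```

Run per controller.log, this should reduce the output quoted in the comment to four distinct states: one in worker2 (leader 3, epoch 2) and three in worker6 (leader 1 at epochs 3 and 7, then leader 3 at epoch 10).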