[ 
https://issues.apache.org/jira/browse/KAFKA-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15885716#comment-15885716
 ] 

Ismael Juma commented on KAFKA-4779:
------------------------------------

The test failed again, this time with a different message:

{code}
--------------------------------------------------------------------------------
test_id:    kafkatest.tests.core.security_rolling_upgrade_test.TestSecurityRollingUpgrade.test_rolling_upgrade_phase_two.broker_protocol=SASL_PLAINTEXT.client_protocol=SSL
status:     FAIL
run time:   4 minutes 32.586 seconds


    1152 acked message did not make it to the Consumer. They are: 12288, 12289, 
12290, 12291, 12292, 12293, 12294, 12295, 12296, 12297, 12298, 12299, 12300, 
12301, 12302, 12303, 12304, 12305, 12306, 12307...plus 1132 more. Total Acked: 
12184, Total Consumed: 11032. We validated that the first 1000 of these missing 
messages correctly made it into Kafka's data files. This suggests they were 
lost on their way to the consumer.
Traceback (most recent call last):
  File "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 123, in run
    data = self.run_test()
  File "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 176, in run_test
    return self.test_context.function(self.test)
  File "/var/lib/jenkins/workspace/system-test-kafka/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/mark/_mark.py", line 321, in wrapper
    return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/core/security_rolling_upgrade_test.py", line 148, in test_rolling_upgrade_phase_two
    self.run_produce_consume_validate(self.roll_in_secured_settings, client_protocol, broker_protocol)
  File "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 117, in run_produce_consume_validate
    self.validate()
  File "/var/lib/jenkins/workspace/system-test-kafka/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 179, in validate
    assert success, msg
AssertionError: 1152 acked message did not make it to the Consumer. They are: 
12288, 12289, 12290, 12291, 12292, 12293, 12294, 12295, 12296, 12297, 12298, 
12299, 12300, 12301, 12302, 12303, 12304, 12305, 12306, 12307...plus 1132 more. 
Total Acked: 12184, Total Consumed: 11032. We validated that the first 1000 of 
these missing messages correctly made it into Kafka's data files. This suggests 
they were lost on their way to the consumer.
--------------------------------------------------------------------------------
{code}

http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2017-02-26--001.1488103947--apache--trunk--5b682ba/report.html
http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2017-02-26--001.1488103947--apache--trunk--5b682ba/TestSecurityRollingUpgrade/test_rolling_upgrade_phase_two/broker_protocol=SASL_PLAINTEXT.client_protocol=SSL/62.tgz
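
For reference, the counts in the assertion message are self-consistent: 12184 acked minus 11032 consumed gives the 1152 missing messages reported. Below is a minimal sketch of that kind of acked-versus-consumed check. It is not the actual code in produce_consume_validate.py; the function name `find_missing_acked` and the sample data are hypothetical, and it assumes message payloads are unique integers.

```python
def find_missing_acked(acked, consumed):
    """Return acked messages that never reached the consumer, in ack order.

    Hypothetical helper for illustration only; assumes payloads are
    unique, hashable values (e.g. integers).
    """
    consumed_set = set(consumed)
    return [msg for msg in acked if msg not in consumed_set]


if __name__ == "__main__":
    # Made-up sample data: ten acked messages, three of which were never consumed.
    acked = list(range(10))
    consumed = [0, 1, 2, 3, 7, 8, 9]
    missing = find_missing_acked(acked, consumed)
    print("%d acked messages did not make it to the consumer: %s"
          % (len(missing), missing))
```

A set lookup keeps the check linear in the number of messages, which matters little at this scale but avoids a quadratic scan on longer runs.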


> Failure in kafka/tests/kafkatest/tests/core/security_rolling_upgrade_test.py
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-4779
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4779
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Apurva Mehta
>            Assignee: Rajini Sivaram
>             Fix For: 0.10.3.0, 0.10.2.1
>
>
> This test failed on 01/29 on both trunk and 0.10.2, with the following error message:
> {noformat}
> The consumer has terminated, or timed out, on node ubuntu@worker3.
> Traceback (most recent call last):
>   File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 123, in run
>     data = self.run_test()
>   File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 176, in run_test
>     return self.test_context.function(self.test)
>   File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/mark/_mark.py", line 321, in wrapper
>     return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/core/security_rolling_upgrade_test.py", line 148, in test_rolling_upgrade_phase_two
>     self.run_produce_consume_validate(self.roll_in_secured_settings, client_protocol, broker_protocol)
>   File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 100, in run_produce_consume_validate
>     self.stop_producer_and_consumer()
>   File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 87, in stop_producer_and_consumer
>     self.check_alive()
>   File "/var/lib/jenkins/workspace/system-test-kafka-0.10.2/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 79, in check_alive
>     raise Exception(msg)
> Exception: The consumer has terminated, or timed out, on node ubuntu@worker3.
> {noformat}
> It looks like the console consumer timed out:
> {noformat}
> [2017-01-30 04:56:00,972] ERROR Error processing message, terminating consumer process:  (kafka.tools.ConsoleConsumer$)
> kafka.consumer.ConsumerTimeoutException
>         at kafka.consumer.NewShinyConsumer.receive(BaseConsumer.scala:90)
>         at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:120)
>         at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:75)
>         at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:50)
>         at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> {noformat}
> A number of these security_rolling_upgrade tests failed. In every case, the 
> producer produced ~15k messages, of which ~7k were acked, and the consumer 
> received only ~2600 before timing out. 
> The producer and consumer both report many messages like the following, for 
> different request types:
> {noformat}
> [2017-01-30 05:13:35,954] WARN Received unknown topic or partition error in 
> produce request on partition test_topic-0. The topic/partition may not exist 
> or the user may not have Describe access to it 
> (org.apache.kafka.clients.producer.internals.Sender)
> {noformat}



