Ewen Cheslack-Postava created KAFKA-3481:
--------------------------------------------

             Summary: Transient failure in TestVerifiableProducer
                 Key: KAFKA-3481
                 URL: https://issues.apache.org/jira/browse/KAFKA-3481
             Project: Kafka
          Issue Type: Bug
          Components: system tests
            Reporter: Ewen Cheslack-Postava
            Assignee: Geoff Anderson


See, for example, the result from: 
http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-03-28--001.1459178417--apache--trunk--1fbe445/report.html

Based on the trace collected, it looks like the command that lists the 
producer's processes and then extracts version information may be failing. 
Looking at the parameters for VerifiableProducer, I think we may be relying on 
careful timing -- throughput is 1000 messages/sec and num_messages is only 
100, so the producer finishes sending in about 0.1 seconds. Unless I'm 
misreading the rates, the test is actually surprisingly reliable; it must be 
relying on JVM startup time to keep the process alive, since SSH overhead is 
relatively high and simply running the command to check on the processes takes 
a while.
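
To make the timing concern concrete, here is a rough sketch of the arithmetic. 
The helper name is made up for illustration and is not part of the system 
tests; it ignores JVM startup and assumes the producer sends at exactly the 
configured throughput.

```python
def send_duration_sec(num_messages: int, throughput_per_sec: float) -> float:
    """Approximate time spent sending, ignoring JVM startup (hypothetical helper)."""
    if throughput_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return num_messages / throughput_per_sec


# Parameters quoted in this report: 100 messages at 1000 messages/sec.
duration = send_duration_sec(num_messages=100, throughput_per_sec=1000)
print(f"{duration:.1f} s")  # 0.1 s
```

Any process check that takes longer than that (plus JVM startup) to run over 
SSH can find the producer already gone.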

A very simple fix would be to make sure the process runs long enough for the 
check to complete. Presumably finishing the test successfully would forcefully 
shut it down via SIGINT anyway.
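
One way to size such a fix, as a sketch only: pick a minimum runtime and derive 
num_messages from the throughput. The function name and the 30-second target 
below are assumptions for illustration, not values taken from the test.

```python
import math


def min_messages_for_runtime(throughput_per_sec: float, min_runtime_sec: float) -> int:
    """Smallest num_messages that keeps the producer sending for min_runtime_sec.

    Hypothetical helper; assumes the producer sends at the configured rate.
    """
    if throughput_per_sec <= 0 or min_runtime_sec < 0:
        raise ValueError("throughput must be positive and runtime non-negative")
    return math.ceil(throughput_per_sec * min_runtime_sec)


# Assumed target: keep the producer alive for 30 seconds at 1000 messages/sec.
print(min_messages_for_runtime(throughput_per_sec=1000, min_runtime_sec=30))  # 30000
```

Lowering the throughput instead of raising num_messages would have the same 
effect on runtime.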



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
