Ewen Cheslack-Postava created KAFKA-3481:
--------------------------------------------
Summary: Transient failure in TestVerifiableProducer
Key: KAFKA-3481
URL: https://issues.apache.org/jira/browse/KAFKA-3481
Project: Kafka
Issue Type: Bug
Components: system tests
Reporter: Ewen Cheslack-Postava
Assignee: Geoff Anderson
See, for example, the result from:
http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-03-28--001.1459178417--apache--trunk--1fbe445/report.html
Based on the trace collected, it looks like the command that lists the
producer's processes and then extracts version information may be failing.
Looking at the parameters for VerifiableProducer, I think we may be relying on
careful timing: we start with a throughput of 1000 messages/sec and
num_messages is only 100, so producing takes only about 0.1 seconds. Unless
I'm misreading the rates, the test is actually surprisingly reliable and must
be relying on JVM startup time to keep the process alive. SSH overhead is
relatively high too, so simply running the command to check on the processes
takes a while.
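
To make the timing concern concrete, here is a rough back-of-the-envelope
sketch. The function name is hypothetical and not part of the test code; it
only divides the two parameters quoted above and ignores JVM startup and
shutdown time.

```python
def produce_duration_sec(num_messages: int, throughput: float) -> float:
    """Approximate time spent producing, ignoring JVM startup and shutdown."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return num_messages / throughput


# Values quoted in this issue: 100 messages at 1000 messages/sec.
duration = produce_duration_sec(num_messages=100, throughput=1000)
print(f"approximate produce time: {duration:.2f} s")
```

With these parameters the producing window is roughly a tenth of a second,
which is likely shorter than a single SSH round trip to the worker.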
A very simple fix would be to ensure the process runs long enough for the
check to complete. Presumably finishing the test successfully would forcefully
shut it down via SIGINT anyway.
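
One possible way to size the run, as a minimal sketch: pick num_messages from
the throughput and a minimum lifetime. The helper name and the 15-second
figure are assumptions chosen for illustration; neither comes from the test
or from this report.

```python
import math


def min_num_messages(throughput: float, min_runtime_sec: float) -> int:
    """Smallest message count that keeps the producer busy for min_runtime_sec."""
    if throughput <= 0 or min_runtime_sec <= 0:
        raise ValueError("throughput and min_runtime_sec must be positive")
    return math.ceil(throughput * min_runtime_sec)


# Hypothetical target: stay alive for at least 15 seconds at 1000 messages/sec.
print(min_num_messages(throughput=1000, min_runtime_sec=15))
```

Lowering the throughput instead of raising num_messages would have the same
effect on the process lifetime.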