Ewen Cheslack-Postava created KAFKA-3481:
--------------------------------------------

             Summary: Transient failure in TestVerifiableProducer
                 Key: KAFKA-3481
                 URL: https://issues.apache.org/jira/browse/KAFKA-3481
             Project: Kafka
          Issue Type: Bug
          Components: system tests
            Reporter: Ewen Cheslack-Postava
            Assignee: Geoff Anderson


See, for example, the result from: http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2016-03-28--001.1459178417--apache--trunk--1fbe445/report.html

Based on the trace collected, it looks like the command that lists the producer's processes and then extracts version information may be failing.

Looking at the parameters for VerifiableProducer, I think we may be relying on careful timing -- we start with a throughput of 1000 messages/sec and num_messages is only 100, so the producer needs only about 0.1 seconds to send everything. Unless I'm misreading the rates, the test is actually surprisingly reliable and must be relying on JVM startup time to keep the process alive. SSH overhead is relatively high too, so simply running the command to check on the processes takes a while, and the producer may already have exited by the time it runs.

A very simple fix would be to ensure the process runs long enough. Presumably finishing the test successfully would forcefully shut it down via SIGINT anyway.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
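
For reference, the timing argument in the description can be checked with some rough arithmetic. This is only a sketch: the helper names are hypothetical and not part of the Kafka system tests, and the 10-second window is an assumed stand-in for the SSH round trip plus the process check, not a measured value.

```python
import math


def producer_send_time_s(num_messages, throughput_per_s):
    """Seconds spent sending num_messages at a throttled rate (ignores JVM startup)."""
    if throughput_per_s <= 0:
        raise ValueError("throughput must be positive")
    return num_messages / throughput_per_s


def min_messages_for_window(throughput_per_s, window_s):
    """Smallest num_messages that keeps the producer sending for at least window_s seconds."""
    return math.ceil(throughput_per_s * window_s)


# Parameters quoted in the issue: 100 messages at 1000 messages/sec.
send_time = producer_send_time_s(100, 1000)

# Assumed window (hypothetical): time needed to SSH in and inspect the process.
assumed_window_s = 10.0
needed = min_messages_for_window(1000, assumed_window_s)

print("send time: %.1f s, messages needed for a %.0f s window: %d"
      % (send_time, assumed_window_s, needed))
```

With the quoted parameters the send phase lasts a tenth of a second, so any check that takes longer than JVM startup plus that tenth of a second can miss the process. Raising num_messages (or lowering throughput) is one way to make the process "run long enough", as the description suggests.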