[ 
https://issues.apache.org/jira/browse/KAFKA-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797404#comment-15797404
 ] 

Ewen Cheslack-Postava commented on KAFKA-4558:
----------------------------------------------

[~apurva] I'm thinking it'd be a lot easier to handle this type of issue if we 
forced system tests to use a custom MetricsReporter that exposes metrics via 
HTTP. Right now it's a pain to grab metrics because the JMX reporter isn't 
easily accessible to the test driver app written in Python. If we provided an 
alternative that gave access to metrics via HTTP, we'd have a really easy way 
to validate metrics from system tests.

To fix this specific problem we might still need to add another metric to the 
consumer, but as I looked into how to get the metrics required in the system 
tests, it seemed really painful in its current form. This would probably also 
simplify some other tests currently relying on the {{JmxMixin}} class in the 
system tests.

What do you think? We could add this in the test binaries for now to avoid any 
KIPs, dependency issues, etc., though I suspect we might eventually want to 
graduate it to its own module as many folks might find it useful to be able to 
just ping a URL periodically and collect all metrics data from a Kafka process.
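
As a rough sketch of what the test-driver side of this could look like: the 
code below polls an HTTP endpoint until a given metric satisfies a condition. 
Everything about the endpoint is an assumption, since no such reporter exists 
yet: the URL, the response format (a flat JSON object mapping metric names to 
values), and the function names {{fetch_metrics}} and {{wait_for_metric}} are 
all hypothetical, and the metric name in the usage note is only illustrative.

```python
import json
import time
import urllib.request


def fetch_metrics(url, timeout_sec=5):
    """Fetch metrics from a (hypothetical) HTTP reporter endpoint.

    Assumes the response body is a flat JSON object of metric name -> value.
    """
    with urllib.request.urlopen(url, timeout=timeout_sec) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_metric(fetch, name, predicate, timeout_sec=30, backoff_sec=0.5):
    """Call fetch() repeatedly until predicate(metrics[name]) is true.

    `fetch` is any zero-argument callable returning a dict of metrics.
    Returns the metric value, or raises TimeoutError once the deadline passes.
    """
    deadline = time.monotonic() + timeout_sec
    while True:
        try:
            metrics = fetch()
        except (OSError, ValueError):
            # Endpoint not reachable yet, or the body was not valid JSON.
            metrics = {}
        if name in metrics and predicate(metrics[name]):
            return metrics[name]
        if time.monotonic() >= deadline:
            raise TimeoutError("metric %r not ready after %ss" % (name, timeout_sec))
        time.sleep(backoff_sec)
```

A test could then block the producer start on something like 
{{wait_for_metric(lambda: fetch_metrics(url), "assigned-partitions", lambda v: v > 0)}}, 
where {{url}} points at wherever the reporter would be listening. Passing the 
fetch function in as a parameter keeps the polling logic testable without a 
running Kafka process.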

> throttling_test fails if the producer starts too fast.
> ------------------------------------------------------
>
>                 Key: KAFKA-4558
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4558
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Apurva Mehta
>            Assignee: Apurva Mehta
>
> As described in https://issues.apache.org/jira/browse/KAFKA-4526, the 
> throttling test will fail if the producer in the produce-consume-validate 
> loop starts up before the consumer is fully initialized.
> We need to block the start of the producer until the consumer is ready to go. 
> The current plan is to poll the consumer for a particular metric (for 
> instance, partition assignment) which will act as a good proxy for successful 
> initialization. Currently, we just check for the existence of a process with 
> the PID, which is not a strong enough check, causing the test to fail 
> intermittently. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
