[jira] [Commented] (KAFKA-5741) Prioritize threads in Connect distributed worker process

Ewen Cheslack-Postava (JIRA) Wed, 06 Sep 2017 11:38:32 -0700

    [ 
https://issues.apache.org/jira/browse/KAFKA-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155857#comment-16155857
 ]


Ewen Cheslack-Postava commented on KAFKA-5741:
----------------------------------------------

It would be good to have clear indications that this is actually a problem in 
practice and that other threads starving the herder thread caused the 
rebalance. First, heartbeating actually happens in a background thread, so 
you'd have to starve that thread as well to hit the session timeout. And the 
actual processing done in that thread is very minimal, so you'd have to 
completely starve it for a long time. It's much more likely that something 
like waiting for other threads to flush data during a rebalance is what causes 
the worker to fall out of the group.
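
To make the two failure paths concrete, here is a rough sketch of the 
distributed worker settings involved. The property names are real Connect 
worker configs; the values, the class name WorkerTimeoutSketch and its helper 
methods are only illustrative assumptions, not recommendations and not 
Connect code.

```java
import java.util.Properties;

public class WorkerTimeoutSketch {
    // Illustrative values only (assumption): pick real ones for your cluster.
    static Properties illustrativeTimeouts() {
        Properties props = new Properties();
        // Missed heartbeats for this long get the worker dropped from the group.
        props.setProperty("session.timeout.ms", "10000");
        // How often the background heartbeat thread is expected to check in.
        props.setProperty("heartbeat.interval.ms", "3000");
        // Time each worker has to rejoin once a rebalance starts; this bounds
        // how long tasks may take to flush pending data and commit offsets.
        props.setProperty("rebalance.timeout.ms", "60000");
        return props;
    }

    // Hypothetical sanity check: heartbeats must be more frequent than the
    // session timeout, or the worker could never stay in the group.
    static boolean heartbeatFitsInSession(Properties props) {
        long session = Long.parseLong(props.getProperty("session.timeout.ms"));
        long heartbeat = Long.parseLong(props.getProperty("heartbeat.interval.ms"));
        return heartbeat < session;
    }

    public static void main(String[] args) {
        Properties props = illustrativeTimeouts();
        System.out.println("heartbeat fits in session: " + heartbeatFitsInSession(props));
    }
}
```

With settings like these, a starved herder thread alone would not trip the 
session timeout, while a slow flush during a rebalance runs against 
rebalance.timeout.ms.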

I'm also skeptical of the prioritization because to me, if this really occurred 
for this reason, it would suggest that the hardware is just underprovisioned 
for the workload. Prioritizing the DistributedHerder thread would probably just 
end up starving other threads if there really is that much resource contention, 
and so the connectors won't even really be functioning correctly anyway...
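
For reference, the kind of prioritization being proposed would look roughly 
like the sketch below: a single-threaded executor whose one thread is created 
at maximum priority. This is not the DistributedHerder code; the class name 
HerderPrioritySketch, the method newHighPriorityExecutor and the thread name 
are hypothetical and only there for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class HerderPrioritySketch {
    // Builds a single-threaded executor whose only thread asks for the
    // highest Java priority. Hypothetical helper, not part of Kafka Connect.
    static ExecutorService newHighPriorityExecutor(String threadName) {
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable, threadName);
            // Only a hint to the scheduler; the JVM or OS may ignore it.
            t.setPriority(Thread.MAX_PRIORITY);
            return t;
        };
        return Executors.newSingleThreadExecutor(factory);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService herderExecutor = newHighPriorityExecutor("herder-sketch");
        // Run a task on the executor's thread and report its priority.
        Callable<Integer> readPriority = () -> Thread.currentThread().getPriority();
        System.out.println("priority = " + herderExecutor.submit(readPriority).get());
        herderExecutor.shutdown();
    }
}
```

Note that Java thread priorities are only a hint to the scheduler and may 
have little or no effect on some platforms, which is one more reason to want 
evidence of starvation before relying on this.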

> Prioritize threads in Connect distributed worker process
> --------------------------------------------------------
>
>                 Key: KAFKA-5741
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5741
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>    Affects Versions: 0.11.0.0
>            Reporter: Randall Hauch
>            Priority: Critical
>
> Connect's distributed worker process uses the {{DistributedHerder}} to 
> perform all administrative operations, including: starting, stopping, 
> pausing, resuming, reconfiguring connectors; rebalancing; etc. The 
> {{DistributedHerder}} uses a single-threaded executor service to do all this 
> work and to do it sequentially. If this thread gets preempted for any reason 
> (e.g., connector tasks are bogging down the process, DoS, etc.), then the 
> herder's membership in the group may be dropped, causing a rebalance.
> This herder thread should be run at a much higher priority than all of the 
> other threads in the system.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
