Re: GC pauses and rebalance failures

David DeMaagd Mon, 14 Apr 2014 12:59:12 -0700

Correct - heavy client GC leads to numerous problems.  There are
two things you can do:


1) Tune the client JVM so that GC pauses drop to a more reasonable level
2) Increase the zookeeper session timeout value (this is generally a
   workaround for #1, but it can buy you time to dig into it)
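
For #2, here is a rough sketch of the consumer properties involved,
assuming the 0.8 high-level consumer. The property keys are the consumer
config names as I understand them; the class name, method names, hosts,
and every numeric value are placeholders for illustration, not tuned
recommendations. The commonly cited rule of thumb is to keep
rebalance.backoff.ms * rebalance.max.retries above the session timeout,
so the retries outlast a stale session.

```java
import java.util.Properties;

// Illustrative only: class/method names and all values are assumptions.
class ConsumerTimeouts {

    // Builds consumer properties with a longer ZK session timeout.
    static Properties build(String zkConnect, String groupId) {
        Properties props = new Properties();
        props.setProperty("zookeeper.connect", zkConnect);
        props.setProperty("group.id", groupId);
        // A longer session timeout lets the session survive longer GC pauses.
        props.setProperty("zookeeper.session.timeout.ms", "30000");
        // Total retry window = backoff * retries (40000 ms with these values).
        props.setProperty("rebalance.backoff.ms", "10000");
        props.setProperty("rebalance.max.retries", "4");
        return props;
    }

    // Total time the consumer keeps retrying a failed rebalance.
    static long rebalanceWindowMs(Properties props) {
        long backoff = Long.parseLong(props.getProperty("rebalance.backoff.ms"));
        long retries = Long.parseLong(props.getProperty("rebalance.max.retries"));
        return backoff * retries;
    }

    public static void main(String[] args) {
        Properties props = build("zk1.example.com:2181", "example-group");
        props.list(System.out);
    }
}
```

The resulting Properties would then go into the ConsumerConfig you
already build for your connectors. For #1, it helps to measure the
pauses before tuning anything; on JVMs of this vintage, GC logging
flags such as -XX:+PrintGCDetails and
-XX:+PrintGCApplicationStoppedTime show how long the stalls really are.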

-- 
Dave DeMaagd | S'aite Reliability Engineering, Y'all
ddema...@linkedin.com | 818 262 7958

(cl...@breyman.com - Mon, Apr 14, 2014 at 12:41:15PM -0700)
> I've got some consumers under decent GC pressure and, as a result, they are
> having ZK sessions expire and the consumers never recover. I see a number
> of rebalance failures in the log after the ZK session expiration followed
> by silence (and consumed partitions).
> 
> My hypothesis is that, since the GC pause is global to the JVM, I'll have
> multiple ConsumerConnectors get expired at the same time and have
> synchronized rebalance/backoff cycles. Since rebalance fails if new
> consumers join mid balance, the multiple expired connectors will always
> collide with each other while attempting to rebalance.
> 
> Is this hypothesis crazy? If not, is there a more likely situation? If the
> hypothesis isn't crazy, how might I avoid this when the JVM is under GC
> pressure?
> 
> Thanks in advance.
