Hi Cliff,

One other case I observed in my environment: duplicates showed up when there were GC pauses on one of the high-level consumers in the group.
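With the 0.8.x high-level consumer, a stop-the-world GC pause longer than the ZooKeeper session timeout makes the consumer's ephemeral node expire, which triggers a group rebalance and a replay from the last committed offset. A minimal sketch of the consumer properties involved (the values shown are the 0.8.2 defaults; tune them for your environment):

```properties
# Consumer-side settings governing session expiry and rebalancing.
# A GC pause longer than the session timeout makes the consumer look dead
# to ZooKeeper, so the group rebalances underneath it.
zookeeper.session.timeout.ms=6000
zookeeper.connection.timeout.ms=6000
# How the consumer retries when a rebalance is in progress.
rebalance.max.retries=4
rebalance.backoff.ms=2000
```

Raising zookeeper.session.timeout.ms gives long GC pauses some headroom, at the cost of slower detection of genuinely dead consumers.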
Thanks,
Kris

On Wed, Oct 21, 2015 at 10:12 AM, Cliff Rhyne <crh...@signal.co> wrote:

> Hi James,
>
> There are two scenarios we run:
>
> 1. Multiple partitions with one consumer per partition. This rarely has
> starting/stopping of consumers, so the pool is very static. There is a
> configured consumer timeout, which causes the ConsumerTimeoutException
> to be thrown prior to the test starting. We handle this exception and
> then resume consuming.
> 2. Single partition with one consumer. This consumer is started by a
> triggered condition (number of messages pending to be processed in the
> kafka topic, or a schedule). The consumer is stopped after processing is
> completed.
>
> In both cases, based on my understanding there shouldn't be a rebalance,
> as either (a) all consumers are running or (b) there's only one consumer
> per partition. Also, the same consumer group is used by all consumers in
> scenarios 1 and 2. Is there a good way to investigate whether rebalances
> are occurring?
>
> Thanks,
> Cliff
>
> On Wed, Oct 21, 2015 at 11:37 AM, James Cheng <jch...@tivo.com> wrote:
>
> > Do you have multiple consumers in a consumer group?
> >
> > I think that when a new consumer joins the consumer group, the existing
> > consumers will stop consuming during the group rebalance, and then,
> > when they start consuming again, they will consume from the last
> > committed offset.
> >
> > You should get more verification on this, though. I might be
> > remembering wrong.
> >
> > -James
> >
> > > On Oct 21, 2015, at 8:40 AM, Cliff Rhyne <crh...@signal.co> wrote:
> > >
> > > Hi,
> > >
> > > My team and I are looking into a problem where the Java high-level
> > > consumer provides duplicate messages if we turn auto commit off
> > > (using version 0.8.2.1 of the server and Java client). The expected
> > > sequence of events is:
> > >
> > > 1. Start the high-level consumer and initialize a KafkaStream to get
> > > a ConsumerIterator
> > > 2. Consume n items (could be 10,000, could be 1,000,000) from the
> > > iterator
> > > 3. Commit the new offsets
> > >
> > > What we are seeing is that during step 2, some number of the n
> > > messages are getting returned by the iterator in duplicate (in some
> > > cases, we've seen n*5 messages consumed). The problem appears to go
> > > away if we turn on auto commit (and committing offsets to kafka
> > > helped too), but auto commit causes conflicts with our offset
> > > rollback logic. The issue seems to happen more when we are in our
> > > test environment on a lower-cost cloud provider.
> > >
> > > Diving into the Java and Scala classes, including the
> > > ConsumerIterator, it's not obvious what event causes a duplicate
> > > offset to be requested or returned (there's even a loop in that class
> > > that is supposed to exclude duplicate messages). I tried turning on
> > > trace logging, but my log4j config isn't getting the Kafka client
> > > logs to write out.
> > >
> > > Does anyone have suggestions of where to look or how to enable
> > > logging?
> > >
> > > Thanks,
> > > Cliff
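On the logging question: the 0.8.2 Scala client classes log through log4j 1.x, and their loggers live under the `kafka` package (e.g. `kafka.consumer.ConsumerIterator`), so a root-logger setting at INFO won't surface their TRACE output on its own. A minimal log4j.properties sketch, assuming log4j 1.x on the classpath (the appender name and pattern here are illustrative):

```properties
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n

# Turn up the consumer internals specifically; TRACE on the whole
# "kafka" package is extremely verbose.
log4j.logger.kafka.consumer=TRACE
log4j.logger.kafka.consumer.ConsumerIterator=TRACE
```

If the client logs still don't appear, it's worth checking that a different logging binding (e.g. slf4j routing to another backend) isn't intercepting log4j calls in your application.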