Re: (Re-)joining group for a longer time

Ivan Yurchenko Mon, 05 Aug 2019 01:01:46 -0700

Hi,

Kamesh, do any of your workers' logs look like the ones in
https://issues.apache.org/jira/browse/KAFKA-7941?focusedCommentId=16899851&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16899851
?
I.e.

INFO [Worker clientId=connect-1, groupId=connect] Was selected to
perform assignments, but do not have latest config found in sync
request. Returning an empty configuration to trigger re-sync.
(org.apache.kafka.connect.runtime.distributed.WorkerCoordinator:208)
INFO [GroupCoordinator 3]: Assignment received from leader for group
connect for generation 436 (kafka.coordinator.group.GroupCoordinator)
INFO [Worker clientId=connect-1, groupId=connect] Successfully joined
group with generation 436
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:455)
INFO Joined group and got assignment: Assignment{error=1,
leader='connect-1-caf0b504-cb29-4456-a28d-3172cdf67d73',
leaderUrl='http://<url>/', offset=1, connectorIds=[], taskIds=[]}
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1216)
INFO [Worker clientId=connect-1, groupId=connect] (Re-)joining group
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:491)
INFO [GroupCoordinator 3]: Preparing to rebalance group connect in
state PreparingRebalance with old generation 436
(__consumer_offsets-30) (reason: Updating metadata for member
connect-1-caf0b504-cb29-4456-a28d-3172cdf67d73)
(kafka.coordinator.group.GroupCoordinator)
INFO [GroupCoordinator 3]: Stabilized group connect generation 437
(__consumer_offsets-30) (kafka.coordinator.group.GroupCoordinator)

If so, this might be the cause.
In my experiments, only restarting the workers helps here.
A fix might be available soon: https://github.com/apache/kafka/pull/6283
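
To check quickly whether a worker log shows this pattern, something like the following rough sketch could help. It only searches for message substrings copied from the excerpt above; the function name scan_worker_log is made up for illustration, and it assumes each log message sits on a single line in the real log file (the excerpt above is wrapped by the mail client).

```python
import re
from collections import Counter

# Substrings copied from the log excerpt in this message.
RESYNC = "Returning an empty configuration to trigger re-sync"
REJOIN = "(Re-)joining group"
GENERATION = re.compile(r"Successfully joined group with generation (\d+)")


def scan_worker_log(lines):
    """Count re-sync and re-join messages and collect the joined generations.

    `lines` is any iterable of log lines, e.g. an open file object.
    Hypothetical helper for illustration; not part of Kafka Connect.
    """
    counts = Counter()
    generations = []
    for line in lines:
        if RESYNC in line:
            counts["resync"] += 1
        if REJOIN in line:
            counts["rejoin"] += 1
        match = GENERATION.search(line)
        if match:
            generations.append(int(match.group(1)))
    return counts, generations
```

Passing an open worker log file to scan_worker_log gives the two counts and the list of generations. A non-zero "resync" count together with a steadily growing generation list would suggest the workers are stuck in the loop shown above, although that is only a hint and not a definite diagnosis.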

Best,
Ivan

On Sat, 3 Aug 2019 at 19:36, Boyang Chen <reluctanthero...@gmail.com> wrote:

> Hey Kamesh,
>
> Thank you for the question. Could you also check the broker-side log to see
> if the group is forming generations properly? With the information we have
> so far, it is hard to tell what's going on. Also, since you have upgraded to
> 2.3, you will see two rebalances in a row during incremental rebalancing, but
> tasks won't be revoked or reassigned unless necessary. Could you verify that
> for the old connectors their partitions are not getting revoked during the
> first rebalance?
>
> Boyang
>
>
>
> On Sat, Aug 3, 2019 at 1:40 AM Kamesh <kam.iit...@gmail.com> wrote:
>
> > Hi,
> >  I am using a Kafka Connect cluster to write data to S3. I have observed
> > that whenever I add a new connector or update the config of an existing
> > connector, a group rebalance seems to happen, and it affects all of the
> > existing connectors as well. All of my log files are filled with the
> > following log message:
> >
> > *[2019-08-03 08:28:28,668] INFO [Consumer clientId=connector-consumer-xxx,
> > groupId=connect-xxx] (Re-)joining group
> > (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:505)*
> >
> >  This rebalancing takes a very long time, and sometimes it does not even
> > complete. Any pointers on this?
> >
> > I am using Kafka Connect *2.3.0*. This version supports incremental
> > rebalancing, which should not affect existing connectors. Am I
> > missing something?
> >
> > I have already tuned the *session.timeout.ms* and *max.poll.interval.ms*
> > configs and increased their values, as suggested in the community.
> >
> >
> > Thanks & Regards
> > Kamesh.
> >
>
