Hi Liam,

Sorry for the misidentification in the last email; that's the fun of
answering email on a phone instead of a desktop. I've confirmed that the
log messages I included come from this location in the `kafka-clients`
library:

https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerCoordinator.java#L422

And it lists only 8 of the 10 partitions that were assigned to that
consumer instance.

-Richard

On Fri, Mar 18, 2022 at 11:15 PM Richard Ney <kamisama....@gmail.com> wrote:

> Hi Ngā mihi,
>
> I believe the log entry I included was from the underlying kafka-clients
> library, given that the logger identified is
> “org.apache.kafka.clients.consumer.internals.ConsumerCoordinator”. I’ll
> admit at first I thought it might also be the fs2-kafka wrapper, given
> that 2.4.0 is the first version with correct support for the messaging
> from the ConsumerCoordinator. I am planning to do a test with the
> 3.0.0-M5 version, which is built on the updated 3.1.0 kafka-clients
> library, and will let the list know.
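>
> For reference, that test is just a dependency bump; in sbt coordinates
> (assuming the usual com.github.fd4s group id) it would be:
>
>     libraryDependencies += "com.github.fd4s" %% "fs2-kafka" % "3.0.0-M5"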
>
> -Richard Ney
>
> Sent from my iPhone
>
> > On Mar 18, 2022, at 10:55 PM, Liam Clarke-Hutchinson
> > <lclar...@redhat.com> wrote:
> >
> > Kia ora Richard,
> >
> > I see support for the Cooperative Sticky Assignor in fs2-kafka is quite
> > new. Have you discussed this issue with the community of that client at
> > all? I ask because I see on GitHub that fs2-kafka is using kafka-clients
> > 2.8.1 as the underlying client, and there's been a fair few bugfixes
> > around the cooperative sticky assignor since that version.
> >
> > Could you perhaps try overriding the kafka-clients dependency of
> > fs2-kafka and trying a higher version, perhaps 3.1.0, to see if the
> > issue remains? I'm not sure how well that'll work, but it might help
> > narrow down the issue.
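> >
> > If you're on sbt, a minimal sketch of that override (adjust the version
> > as needed) would be:
> >
> >     dependencyOverrides += "org.apache.kafka" % "kafka-clients" % "3.1.0"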
> >
> > Ngā mihi,
> >
> > Liam Clarke-Hutchinson
> >
> >> On Sat, 19 Mar 2022 at 14:24, Richard Ney <kamisama....@gmail.com>
> >> wrote:
> >>
> >> Thanks for the additional information, Bruno. Does this look like a
> >> possible bug in the CooperativeStickyAssignor? I have 5 consumers
> >> reading from a 50-partition topic. Based on the log messages, this
> >> application instance is only getting assigned 8 partitions, but when I
> >> ask the consumer group for LAG information, the group thinks the
> >> correct number of 10 partitions was assigned. As a result, 2 partitions
> >> aren't getting read because the application doesn't know it has them.
> >>
> >> {"timestamp":"2022-03-19T00:54:46.025Z","message":"[Consumer
> >> instanceId=i-0e89c9bee06f71f68,
> >>
> clientId=consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68,
> >> groupId=app-query-platform-aoa-backfill-v7] Updating assignment with\n\t
> >> Assigned partitions: [
> >> platform-data.appquery-platform.aoav3.backfill-28,
> >> platform-data.appquery-platform.aoav3.backfill-43,
> >> platform-data.appquery-platform.aoav3.backfill-31,
> >> platform-data.appquery-platform.aoav3.backfill-46,
> >> platform-data.appquery-platform.aoav3.backfill-34,
> >> platform-data.appquery-platform.aoav3.backfill-49,
> >> platform-data.appquery-platform.aoav3.backfill-40,
> >> platform-data.appquery-platform.aoav3.backfill-37] \n\t
> >> Current owned partitions:                  []\n\t
> >>
> >> Added partitions (assigned - owned):       [
> >> platform-data.appquery-platform.aoav3.backfill-28,
> >> platform-data.appquery-platform.aoav3.backfill-43,
> >> platform-data.appquery-platform.aoav3.backfill-31,
> >> platform-data.appquery-platform.aoav3.backfill-46,
> >> platform-data.appquery-platform.aoav3.backfill-34,
> >> platform-data.appquery-platform.aoav3.backfill-49,
> >> platform-data.appquery-platform.aoav3.backfill-40,
> >> platform-data.appquery-platform.aoav3.backfill-37]\n\t
> >>
> >> Revoked partitions (owned - assigned):
> >>
> >>
> []\n","logger_name":"org.apache.kafka.clients.consumer.internals.ConsumerCoordinator","thread_name":"fs2-kafka-consumer-95","level":"INFO"}
> >>
> >>
> >> {"timestamp":"2022-03-19T00:54:46.026Z","message":"[Consumer
> >> instanceId=i-0e89c9bee06f71f68,
> >>
> clientId=consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68,
> >> groupId=app-query-platform-aoa-backfill-v7]
> >>
> >> Notifying assignor about the new Assignment(partitions=[
> >> platform-data.appquery-platform.aoav3.backfill-28,
> >> platform-data.appquery-platform.aoav3.backfill-31,
> >> platform-data.appquery-platform.aoav3.backfill-34,
> >> platform-data.appquery-platform.aoav3.backfill-37,
> >> platform-data.appquery-platform.aoav3.backfill-40,
> >> platform-data.appquery-platform.aoav3.backfill-43,
> >> platform-data.appquery-platform.aoav3.backfill-46,
> >>
> >>
> platform-data.appquery-platform.aoav3.backfill-49])","logger_name":"org.apache.kafka.clients.consumer.internals.ConsumerCoordinator","thread_name":"fs2-kafka-consumer-95","level":"INFO"}
> >>
> >> {"timestamp":"2022-03-19T00:54:46.028Z","message":"[Consumer
> >> instanceId=i-0e89c9bee06f71f68,
> >>
> clientId=consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68,
> >> groupId=app-query-platform-aoa-backfill-v7]
> >>
> >> Adding newly assigned partitions:
> >> platform-data.appquery-platform.aoav3.backfill-28,
> >> platform-data.appquery-platform.aoav3.backfill-43,
> >> platform-data.appquery-platform.aoav3.backfill-31,
> >> platform-data.appquery-platform.aoav3.backfill-46,
> >> platform-data.appquery-platform.aoav3.backfill-34,
> >> platform-data.appquery-platform.aoav3.backfill-49,
> >> platform-data.appquery-platform.aoav3.backfill-40,
> >>
> >>
> platform-data.appquery-platform.aoav3.backfill-37","logger_name":"org.apache.kafka.clients.consumer.internals.ConsumerCoordinator","thread_name":"fs2-kafka-consumer-95","level":"INFO"}
> >>
> >> *OUTPUT FROM `/usr/local/opt/kafka/bin/kafka-consumer-groups`*
> >>
> >> (All rows share GROUP app-query-platform-aoa-backfill-v7, TOPIC
> >> platform-data.appquery-platform.aoav3.backfill, CONSUMER-ID
> >> i-0e89c9bee06f71f68-2c1558e4-7975-493e-8c6b-08da7c4bf941, HOST
> >> /10.123.16.69, and CLIENT-ID
> >> consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68.)
> >>
> >> PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG
> >> 40         8369679         8369696         17
> >> 37         8369643         8369658         15
> >> 46         8368044         8368055         11
> >> 34         8379346         8379358         12
> >> 28         8374244         8374247         3
> >> 49         8364656         8364665         9
> >> 43         8369980         8369988         8
> >> 25         8369261         8370063         802
> >> 31         8368087         8368097         10
> >> 22         8370475         8371319         844
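> >>
> >> Worth noting: the two partitions missing from the client logs above (22
> >> and 25) are exactly the ones whose lag keeps growing. To rule out a
> >> logging artifact, I plan to diff the client's own view of its
> >> assignment against the CLI output, roughly like this (assuming
> >> fs2-kafka's KafkaConsumer#assignmentStream, which emits the
> >> assigned-partition set on every change; untested):
> >>
> >>     import cats.effect.IO
> >>     import fs2.kafka.KafkaConsumer
> >>
> >>     // Log this instance's view of its assignment whenever it changes,
> >>     // to diff against what kafka-consumer-groups --describe reports.
> >>     def logAssignments(consumer: KafkaConsumer[IO, String, String]): fs2.Stream[IO, Unit] =
> >>       consumer.assignmentStream
> >>         .evalMap(tps => IO.println(s"client-side assignment (${tps.size}): $tps"))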
> >>
> >>> On Thu, Mar 17, 2022 at 2:29 AM Bruno Cadonna <cado...@apache.org>
> >>> wrote:
> >>>
> >>> Hi Richard,
> >>>
> >>> The group.instance.id config is orthogonal to the partition assignment
> >>> strategy: it enables static membership, which is unrelated to how
> >>> partitions are assigned.
> >>>
> >>> If you think you have found a bug, could you please open a JIRA ticket
> >>> with steps to reproduce it?
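> >>>
> >>> To illustrate that they are independent knobs, a minimal sketch in
> >>> plain kafka-clients consumer properties (the instance id value is just
> >>> an example):
> >>>
> >>>     import org.apache.kafka.clients.consumer.{ConsumerConfig, CooperativeStickyAssignor}
> >>>     import java.util.Properties
> >>>
> >>>     val props = new Properties()
> >>>     // static membership -- works with any assignor
> >>>     props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, "instance-1")
> >>>     // assignment strategy -- works with or without static membership
> >>>     props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
> >>>       classOf[CooperativeStickyAssignor].getName)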
> >>>
> >>> Best,
> >>> Bruno
> >>>
> >>> On 16.03.22 10:01, Luke Chen wrote:
> >>>> Hi Richard,
> >>>>
> >>>> Right, you are not missing any settings beyond the partition
> >>>> assignment strategy and the group instance id.
> >>>> For troubleshooting, you might need to check the logs to find out why
> >>>> the rebalance was triggered.
> >>>>
> >>>> Thank you.
> >>>> Luke
> >>>>
> >>>> On Wed, Mar 16, 2022 at 3:02 PM Richard Ney <kamisama....@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi Luke,
> >>>>>
> >>>>> I did end up with a situation where I had two instances connecting to
> >>>>> the same consumer group and they got stuck in a rebalance tug-of-war:
> >>>>> all partitions kept going back and forth between the two microservice
> >>>>> instances. That was a test case where I'd removed the Group Instance
> >>>>> Id setting to see what would happen. After 20+ rebalances, I
> >>>>> stabilized that one by reducing the group to a single consumer.
> >>>>>
> >>>>> The other issue I'm seeing may be a bug in the functional Scala
> >>>>> `fs2-kafka` wrapper, where I see the partitions cleanly assigned but
> >>>>> one or more instances isn't ingesting. I found out that they only
> >>>>> recently added support for the cooperative sticky assignor in their
> >>>>> stream-recreation logic, since they had been assuming a full
> >>>>> revocation of the partitions on every rebalance.
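> >>>>>
> >>>>> To illustrate what changes for a wrapper like fs2-kafka, here's a
> >>>>> rough sketch of the callback semantics with the plain kafka-clients
> >>>>> API (not fs2-kafka internals):
> >>>>>
> >>>>>     import org.apache.kafka.clients.consumer.ConsumerRebalanceListener
> >>>>>     import org.apache.kafka.common.TopicPartition
> >>>>>     import java.util.{Collection => JCollection}
> >>>>>
> >>>>>     val listener = new ConsumerRebalanceListener {
> >>>>>       // Eager protocol: the full owned set on every rebalance.
> >>>>>       // Cooperative protocol: only the partitions actually taken
> >>>>>       // away, often an empty set.
> >>>>>       def onPartitionsRevoked(partitions: JCollection[TopicPartition]): Unit =
> >>>>>         println(s"revoked: $partitions")
> >>>>>       // Eager protocol: the full new assignment.
> >>>>>       // Cooperative protocol: only the newly added partitions.
> >>>>>       def onPartitionsAssigned(partitions: JCollection[TopicPartition]): Unit =
> >>>>>         println(s"added: $partitions")
> >>>>>     }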
> >>>>>
> >>>>> So I basically wanted to make sure I wasn't missing any settings
> >>>>> beyond the partition assignment strategy and the group instance id.
> >>>>>
> >>>>> -Richard
> >>>>>
> >>>>> On Tue, Mar 15, 2022 at 11:27 PM Luke Chen <show...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi Richard,
> >>>>>>
> >>>>>> To use `CooperativeStickyAssignor`, no other special configuration
> >>>>>> is required.
> >>>>>>
> >>>>>> I'm not sure what `make the rebalance happen cleanly` means.
> >>>>>> Did you find any problems during the group rebalance?
> >>>>>>
> >>>>>> Thank you.
> >>>>>> Luke
> >>>>>>
> >>>>>> On Wed, Mar 16, 2022 at 1:00 PM Richard Ney
> >>>>>> <richard....@lookout.com.invalid> wrote:
> >>>>>>
> >>>>>>> I'm trying to find a good sample of what consumer settings, besides
> >>>>>>> setting
> >>>>>>>
> >>>>>>> ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG to
> >>>>>>> org.apache.kafka.clients.consumer.CooperativeStickyAssignor
> >>>>>>>
> >>>>>>> are needed to make the rebalance happen cleanly. I've been unable to
> >>>>>>> find any decent documentation or code samples. I have set the Group
> >>>>>>> Instance Id to the EC2 instance id based on one blog write-up I
> >>>>>>> found.
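> >>>>>>>
> >>>>>>> Concretely, what I have so far is roughly this (a sketch; the
> >>>>>>> instance-id lookup from EC2 metadata is elided):
> >>>>>>>
> >>>>>>>     import org.apache.kafka.clients.consumer.{ConsumerConfig, CooperativeStickyAssignor}
> >>>>>>>     import java.util.Properties
> >>>>>>>
> >>>>>>>     val props = new Properties()
> >>>>>>>     props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
> >>>>>>>       classOf[CooperativeStickyAssignor].getName)
> >>>>>>>     // ec2InstanceId is read from the EC2 instance metadata endpoint (not shown)
> >>>>>>>     props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, ec2InstanceId)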
> >>>>>>>
> >>>>>>> Any help would be appreciated
> >>>>>>>
> >>>>>>> -Richard
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
>
