So to clarify, you're using kafka-clients directly? Or via fx2-kafka? If it's kafka-clients directly, what version please?
On Sat, 19 Mar 2022 at 19:59, Richard Ney <kamisama....@gmail.com> wrote: > Hi Liam, > > Sorry for the mis-identification in the last email. The fun of answering an > email on a phone instead of a desktop. I confirmed the upper log messages I > included in the message come from this location in the `kafka-clients` > library > > > https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerCoordinator.java#L422 > > And it's only including 8 of the 10 partitions that were assigned to that > consumer instance. > > -Richard > > On Fri, Mar 18, 2022 at 11:15 PM Richard Ney <kamisama....@gmail.com> > wrote: > > > Hi Ngā mihi, > > > > I believe the log entry I included was from the underlying kafka-clients > > library given that the logger identified is > > “org.apache.kafka.clients.consumer.internals.ConsumerCoordinator”. I’ll > > admit at first I thought it also might be the fs2-kafka wrapper given > that > > the 2.4.0 version is the first version that has correct support for the > > messaging from the ConsumerCoordinator. I am planning to do a test with > the > > 3.0.0-M5 version which is built on the updated 3.1.0 kafka-clients > library > > and will let the list know. > > > > -Richard Ney > > > > Sent from my iPhone > > > > > On Mar 18, 2022, at 10:55 PM, Liam Clarke-Hutchinson < > > lclar...@redhat.com> wrote: > > > > > > Kia ora Richard, > > > > > > I see support for the Cooperative Sticky Assignor in fs2-kafka is quite > > > new. Have you discussed this issue with the community of that client at > > > all? I ask because I see on GitHub that fs2-kafka is using > kafka-clients > > > 2.8.1 as the underlying client, and there's been a fair few bugfixes > > around > > > the cooperative sticky assignor since that version. > > > > > > Could you perhaps try overriding the kafka-clients dependency of > > fs2-kafka > > > and try a higher version, perhaps 3.1.0, and see if the issue remains? > > I'm > > > not sure how well that'll work, but might help narrow down the issue. > > > > > > Ngā mihi, > > > > > > Liam Clarke-Hutchinson > > > > > >> On Sat, 19 Mar 2022 at 14:24, Richard Ney <kamisama....@gmail.com> > > wrote: > > >> > > >> Thanks for the additional information Bruno. Does this look like a > > possible > > >> bug in the CooperativeStickyAssignor? I have 5 consumers reading from > a > > 50 > > >> partition topic. Based on the log messages this application instance > is > > >> only getting assigned 8 partitions but when I ask the consumer group > for > > >> LAG information the consumer group thinks the correct number of 10 > > >> partitions were assigned but as should 2 partitions aren't getting > read > > due > > >> to the application not knowing it has them. > > >> > > >> {"timestamp":"2022-03-19T00:54:46.025Z","message":"[Consumer > > >> instanceId=i-0e89c9bee06f71f68, > > >> > > clientId=consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68, > > >> groupId=app-query-platform-aoa-backfill-v7] Updating assignment > with\n\t > > >> Assigned partitions: [ > > >> platform-data.appquery-platform.aoav3.backfill-28, > > >> platform-data.appquery-platform.aoav3.backfill-43, > > >> platform-data.appquery-platform.aoav3.backfill-31, > > >> platform-data.appquery-platform.aoav3.backfill-46, > > >> platform-data.appquery-platform.aoav3.backfill-34, > > >> platform-data.appquery-platform.aoav3.backfill-49, > > >> platform-data.appquery-platform.aoav3.backfill-40, > > >> platform-data.appquery-platform.aoav3.backfill-37] \n\t > > >> Current owned partitions: []\n\t > > >> > > >> Added partitions (assigned - owned): [ > > >> platform-data.appquery-platform.aoav3.backfill-28, > > >> platform-data.appquery-platform.aoav3.backfill-43, > > >> platform-data.appquery-platform.aoav3.backfill-31, > > >> platform-data.appquery-platform.aoav3.backfill-46, > > >> platform-data.appquery-platform.aoav3.backfill-34, > > >> platform-data.appquery-platform.aoav3.backfill-49, > > >> platform-data.appquery-platform.aoav3.backfill-40, > > >> platform-data.appquery-platform.aoav3.backfill-37]\n\t > > >> > > >> Revoked partitions (owned - assigned): > > >> > > >> > > > []\n","logger_name":"org.apache.kafka.clients.consumer.internals.ConsumerCoordinator","thread_name":"fs2-kafka-consumer-95","level":"INFO"} > > >> > > >> > > >> {"timestamp":"2022-03-19T00:54:46.026Z","message":"[Consumer > > >> instanceId=i-0e89c9bee06f71f68, > > >> > > clientId=consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68, > > >> groupId=app-query-platform-aoa-backfill-v7] > > >> > > >> Notifying assignor about the new Assignment(partitions=[ > > >> platform-data.appquery-platform.aoav3.backfill-28, > > >> platform-data.appquery-platform.aoav3.backfill-31, > > >> platform-data.appquery-platform.aoav3.backfill-34, > > >> platform-data.appquery-platform.aoav3.backfill-37, > > >> platform-data.appquery-platform.aoav3.backfill-40, > > >> platform-data.appquery-platform.aoav3.backfill-43, > > >> platform-data.appquery-platform.aoav3.backfill-46, > > >> > > >> > > > platform-data.appquery-platform.aoav3.backfill-49])","logger_name":"org.apache.kafka.clients.consumer.internals.ConsumerCoordinator","thread_name":"fs2-kafka-consumer-95","level":"INFO"} > > >> > > >> {"timestamp":"2022-03-19T00:54:46.028Z","message":"[Consumer > > >> instanceId=i-0e89c9bee06f71f68, > > >> > > clientId=consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68, > > >> groupId=app-query-platform-aoa-backfill-v7] > > >> > > >> Adding newly assigned partitions: > > >> platform-data.appquery-platform.aoav3.backfill-28, > > >> platform-data.appquery-platform.aoav3.backfill-43, > > >> platform-data.appquery-platform.aoav3.backfill-31, > > >> platform-data.appquery-platform.aoav3.backfill-46, > > >> platform-data.appquery-platform.aoav3.backfill-34, > > >> platform-data.appquery-platform.aoav3.backfill-49, > > >> platform-data.appquery-platform.aoav3.backfill-40, > > >> > > >> > > > platform-data.appquery-platform.aoav3.backfill-37","logger_name":"org.apache.kafka.clients.consumer.internals.ConsumerCoordinator","thread_name":"fs2-kafka-consumer-95","level":"INFO"} > > >> > > >> *OUTPUT FROM `/usr/local/opt/kafka/bin/kafka-consumer-groups`* > > >> > > >> GROUP TOPIC > > >> PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG > > >> CONSUMER-ID HOST > > >> CLIENT-ID > > >> app-query-platform-aoa-backfill-v7 > > >> platform-data.appquery-platform.aoav3.backfill 40 8369679 > > >> 8369696 17 > > >> i-0e89c9bee06f71f68-2c1558e4-7975-493e-8c6b-08da7c4bf941 / > 10.123.16.69 > > >> consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68 > > >> app-query-platform-aoa-backfill-v7 > > >> platform-data.appquery-platform.aoav3.backfill 37 8369643 > > >> 8369658 15 > > >> i-0e89c9bee06f71f68-2c1558e4-7975-493e-8c6b-08da7c4bf941 / > 10.123.16.69 > > >> consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68 > > >> app-query-platform-aoa-backfill-v7 > > >> platform-data.appquery-platform.aoav3.backfill 46 8368044 > > >> 8368055 11 > > >> i-0e89c9bee06f71f68-2c1558e4-7975-493e-8c6b-08da7c4bf941 / > 10.123.16.69 > > >> consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68 > > >> app-query-platform-aoa-backfill-v7 > > >> platform-data.appquery-platform.aoav3.backfill 34 8379346 > > >> 8379358 12 > > >> i-0e89c9bee06f71f68-2c1558e4-7975-493e-8c6b-08da7c4bf941 / > 10.123.16.69 > > >> consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68 > > >> app-query-platform-aoa-backfill-v7 > > >> platform-data.appquery-platform.aoav3.backfill 28 8374244 > > >> 8374247 3 > > >> i-0e89c9bee06f71f68-2c1558e4-7975-493e-8c6b-08da7c4bf941 / > 10.123.16.69 > > >> consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68 > > >> app-query-platform-aoa-backfill-v7 > > >> platform-data.appquery-platform.aoav3.backfill 49 8364656 > > >> 8364665 9 > > >> i-0e89c9bee06f71f68-2c1558e4-7975-493e-8c6b-08da7c4bf941 / > 10.123.16.69 > > >> consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68 > > >> app-query-platform-aoa-backfill-v7 > > >> platform-data.appquery-platform.aoav3.backfill 43 8369980 > > >> 8369988 8 > > >> i-0e89c9bee06f71f68-2c1558e4-7975-493e-8c6b-08da7c4bf941 / > 10.123.16.69 > > >> consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68 > > >> app-query-platform-aoa-backfill-v7 > > >> platform-data.appquery-platform.aoav3.backfill 25 8369261 > > >> 8370063 802 > > >> i-0e89c9bee06f71f68-2c1558e4-7975-493e-8c6b-08da7c4bf941 / > 10.123.16.69 > > >> consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68 > > >> app-query-platform-aoa-backfill-v7 > > >> platform-data.appquery-platform.aoav3.backfill 31 8368087 > > >> 8368097 10 > > >> i-0e89c9bee06f71f68-2c1558e4-7975-493e-8c6b-08da7c4bf941 / > 10.123.16.69 > > >> consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68 > > >> app-query-platform-aoa-backfill-v7 > > >> platform-data.appquery-platform.aoav3.backfill 22 8370475 > > >> 8371319 844 > > >> i-0e89c9bee06f71f68-2c1558e4-7975-493e-8c6b-08da7c4bf941 / > 10.123.16.69 > > >> consumer-app-query-platform-aoa-backfill-v7-i-0e89c9bee06f71f68 > > >> > > >>> On Thu, Mar 17, 2022 at 2:29 AM Bruno Cadonna <cado...@apache.org> > > wrote: > > >>> > > >>> Hi Richard, > > >>> > > >>> The group.instance.id config is orthogonal to the partition > assignment > > >>> strategy. The group.instance.id is used if you want to have static > > >>> membership which is not related to the partition assignment strategy. > > >>> > > >>> If you think you found a bug, could you please open a JIRA ticket > with > > >>> steps to reproduce the bug. > > >>> > > >>> Best, > > >>> Bruno > > >>> > > >>> On 16.03.22 10:01, Luke Chen wrote: > > >>>> Hi Richard, > > >>>> > > >>>> Right, you are not missing any settings beyond the partition > > assignment > > >>>> strategy and the group instance id. > > >>>> You might need to know from the log that why the rebalance triggered > > to > > >>> do > > >>>> troubleshooting. > > >>>> > > >>>> Thank you. > > >>>> Luke > > >>>> > > >>>> On Wed, Mar 16, 2022 at 3:02 PM Richard Ney <kamisama....@gmail.com > > > > >>> wrote: > > >>>> > > >>>>> Hi Luke, > > >>>>> > > >>>>> I did end up with a situation where I had two instances connecting > to > > >>> the > > >>>>> same consumer group and they ended up in a rebalance trade-off. All > > >>>>> partitions kept going back and forth between the two microservice > > >>>>> instances. That was a test case where I'd removed the Group > Instance > > >> Id > > >>>>> setting to see what would happen. I stabilized that one by reducing > > it > > >>> to a > > >>>>> single consumer after 20+ rebalances. > > >>>>> > > >>>>> The other issue I'm seeing may be a bug in the Functional Scala > > >>> `fs2-kafka` > > >>>>> wrapper where I see the partitions cleanly assigned but one or more > > >>>>> instances isn't ingesting. I found out that they recently added > > >> support > > >>> for > > >>>>> the cooperative sticky assignor for the stream recreation since > they > > >>> were > > >>>>> assuming a full revocation of the partitions. > > >>>>> > > >>>>> So I basically wanted to make sure I wasn't missing any settings > > >> beyond > > >>> the > > >>>>> partition assignment strategy and the group instance id. > > >>>>> > > >>>>> -Richard > > >>>>> > > >>>>> -Richard > > >>>>> > > >>>>> On Tue, Mar 15, 2022 at 11:27 PM Luke Chen <show...@gmail.com> > > wrote: > > >>>>> > > >>>>>> Hi Richard, > > >>>>>> > > >>>>>> To use `CooperativeStickyAssignor`, no other special configuration > > is > > >>>>>> required. > > >>>>>> > > >>>>>> I'm not sure what does `make the rebalance happen cleanly` mean. > > >>>>>> Did you find any problem during group rebalance? > > >>>>>> > > >>>>>> Thank you. > > >>>>>> Luke > > >>>>>> > > >>>>>> On Wed, Mar 16, 2022 at 1:00 PM Richard Ney < > > richard....@lookout.com > > >>>>>> .invalid> > > >>>>>> wrote: > > >>>>>> > > >>>>>>> Trying to find a good sample of what consumer settings besides > > >> setting > > >>>>>>> > > >>>>>>> ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG to > > >>>>>>> org.apache.kafka.clients.consumer.CooperativeStickyAssignor > > >>>>>>> > > >>>>>>> is needed to make the rebalance happen cleanly. Unable to find > and > > >>>>> decent > > >>>>>>> documentation or code samples. I have set the Group Instance Id > to > > >> the > > >>>>>> EC2 > > >>>>>>> instance id based on one blog write up I found. > > >>>>>>> > > >>>>>>> Any help would be appreciated > > >>>>>>> > > >>>>>>> -Richard > > >>>>>>> > > >>>>>> > > >>>>> > > >>>> > > >>> > > >> > > >