Dear all, I am running a simple reactive autoscaling experiment for a Kafka consumer group on Kubernetes. In the first run I used the stop-the-world range assignor, and in the second run I used the incremental cooperative (cooperative sticky) assignor. My workload is shown below, where the x-axis is the time in seconds and the y-axis is the size of the batch of messages sent to the broker at that time.
[cid:5282405a-51fd-4b74-92a9-0bef0e0ef4ef]

On the consumer side, I start with one consumer configured for a maximum consumption rate of 100 messages per second (max.poll.records = 100, with a 1-second sleep between calls to poll). Every time the total arrival rate becomes greater than (current number of consumers * 100), I automatically add a consumer. There is no complex processing logic for the records, just simple logging.

I ran the experiment with the stop-the-world range assignor (first figure) and with the cooperative sticky assignor (second figure). Unexpectedly, the stop-the-world assignor performed better than the incremental cooperative assignor. The percentage of messages whose waiting time (in the broker) plus processing time (in the poll loop) exceeds 5 seconds is 2.8% with stop-the-world and 9.98% with incremental cooperative. Isn't that a bit weird? Autoscaling triggers a rebalance, and I would expect the incremental, non-stop-the-world protocol to do better in terms of message waiting plus processing time.

Can someone please give me a hint about the cause of this behavior, or about what I am missing? Could it be caused by some kind of non-determinism across runs of the experiment (rebalancing time, heartbeats, provisioning time, etc.)? Please help me interpret these numbers. I am happy to provide further logs, code, or information if needed.

[cid:a17ac2aa-d9a7-44d3-8812-4e19a22bb762] [cid:eba1f7c4-8d5c-4e6d-9dc3-371a9b7bbbd2]

Thank you so much.
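
P.S. For reference, the scale-up rule and the latency metric described above amount to roughly the following. This is only a minimal sketch in plain Python, not the actual experiment code: the function names (`scale_up`, `percent_over_threshold`) and the constant name are illustrative, and it assumes each consumer handles at most 100 messages per second and that per-message latencies (waiting plus processing time, in seconds) have already been collected.

```python
# Assumed per-consumer capacity: max.poll.records = 100, one poll per second.
RATE_PER_CONSUMER = 100


def scale_up(current_consumers, arrival_rate, rate_per_consumer=RATE_PER_CONSUMER):
    """Return the new consumer count under the reactive rule.

    One consumer is added whenever the total arrival rate (messages/second)
    exceeds the combined capacity of the current consumers.
    """
    if arrival_rate > current_consumers * rate_per_consumer:
        return current_consumers + 1
    return current_consumers


def percent_over_threshold(latencies_s, threshold_s=5.0):
    """Percentage of messages whose waiting + processing time exceeds the threshold."""
    if not latencies_s:
        return 0.0
    late = sum(1 for latency in latencies_s if latency > threshold_s)
    return 100.0 * late / len(latencies_s)


if __name__ == "__main__":
    # Hypothetical values, only to show how the two helpers are called.
    print(scale_up(current_consumers=1, arrival_rate=150))
    print(percent_over_threshold([0.5, 6.2, 1.1, 7.9]))
```

The metric is computed the same way for both runs, so the 2.8% and 9.98% figures are directly comparable.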