Good morning, I'm hoping for some help understanding the expected behavior of an offset commit request and why such a request might fail on the broker.
*Context:*

My configuration looks like this:

- Three brokers
- Consumer offsets topic replication factor set to 3
- Auto commit enabled
- The user application topic, which I will call "my_topic", also has a replication factor of 3 and has 800 partitions
- 4000 consumers attached in consumer group "my_group"

(A minimal sketch of how each consumer is configured is in the P.S. below.)

*Issue:*

When I attach the consumers, the coordinator repeatedly logs the following error for each generation:

ERROR [Group Metadata Manager on Broker 0]: Appending metadata message for group my_group generation 2066 failed due to org.apache.kafka.common.errors.RecordTooLargeException, returning UNKNOWN error code to the client (kafka.coordinator.GroupMetadataManager)

*Observed behavior:*

The consumer group does not stay connected long enough to consume messages. It is effectively stuck in a rebalance loop, and the "my_topic" data has become unavailable.

*Investigation:*

Following the Group Metadata Manager code, it looks like the broker writes to a cache after it writes an Offset Commit Request to the log file. If this cache write fails, the broker logs the error above and returns an error code in the response. In this case, the error from the cache is MESSAGE_TOO_LARGE, which is logged as a RecordTooLargeException, but the broker then sets the error code to UNKNOWN on the Offset Commit Response. (I have paraphrased this flow as a second sketch in the P.P.S. below.)

It seems the issue is the size of the metadata in the Offset Commit Request, so I have the following questions:

1. What is the size limit for this request? Are we exceeding that limit, and is that why the request fails?
2. If this is a metadata size issue, what would cause abnormally large metadata?
3. How is this cache used within the broker?

Thanks in advance for any insights you can provide.

Regards,
Robert Quinlivan
Software Engineer, Signal
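P.S. In case it helps, here is a minimal sketch of how each consumer is configured and attached. The group id, topic name, and auto commit setting are as described above; the broker list, deserializers, commit interval, and poll loop are illustrative placeholders, not our exact application code.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MyTopicConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Illustrative broker list; we run three brokers.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
        props.put("group.id", "my_group");
        // Auto commit is enabled, so offsets are committed on the commit interval.
        props.put("enable.auto.commit", "true");
        props.put("auto.commit.interval.ms", "5000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my_topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    // Application processing happens here.
                }
            }
        }
    }
}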
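P.P.S. To make my reading of the coordinator code concrete, here is the flow I described in the Investigation section as a small self-contained sketch. Every name here (class, methods, enum) is a placeholder of my own, not the actual broker source; please correct me if this paraphrase is wrong.

// Paraphrase of my reading of the append path; all names are placeholders.
public class AppendFlowSketch {

    enum LogAppendError { NONE, MESSAGE_TOO_LARGE }

    // Stand-in for whatever the real append to the offsets topic log returns.
    static LogAppendError appendToOffsetsTopicLog(String group, byte[] payload) {
        // Pretend the record is rejected for being too large.
        return LogAppendError.MESSAGE_TOO_LARGE;
    }

    static void updateGroupCache(String group, byte[] payload) {
        // In my reading, the broker updates its in-memory group/offset cache here.
    }

    /** Returns the error code that would be placed in the client response. */
    static String handleAppend(String group, int generation, byte[] payload) {
        LogAppendError error = appendToOffsetsTopicLog(group, payload);
        if (error == LogAppendError.NONE) {
            updateGroupCache(group, payload);
            return "NONE";
        }
        // The append error (MESSAGE_TOO_LARGE in our case) is logged...
        System.err.printf(
            "Appending metadata message for group %s generation %d failed due to %s, "
            + "returning UNKNOWN error code to the client%n", group, generation, error);
        // ...but the code returned to the client is rewritten to UNKNOWN.
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        System.out.println(handleAppend("my_group", 2066, new byte[0]));
    }
}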