Luke Chen created KAFKA-19462:
---------------------------------
Summary: "fetch.max.bytes" config is not honored when remote +
local fetch
Key: KAFKA-19462
URL: https://issues.apache.org/jira/browse/KAFKA-19462
Project: Kafka
Issue Type: Bug
Reporter: Luke Chen
Assignee: Luke Chen
Currently, in the local fetch case, we calculate the remaining bytes that can be
fetched for each partition from the "fetch.max.bytes" and
"max.partition.fetch.bytes" configs. For example:
# Config:
max.partition.fetch.bytes = 1MB
fetch.max.bytes = 1.5MB
# Topic foo has 2 partitions.
# Consumer fetches data from topic foo.
# Fetches from foo-0 first and gets 1MB of data, so 0.5MB remains
available to be fetched.
# Fetches from foo-1 with a max of 0.5MB.
# Total returned: 1.5MB of records.
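
The local-only accounting above can be sketched as follows. This is a minimal illustration, not the broker's actual code: the function name plan_local_fetch and the dict of available bytes are hypothetical, and it ignores the rule that a first batch larger than the limit is still returned so the consumer can make progress.

```python
MB = 1024 * 1024

def plan_local_fetch(available, fetch_max_bytes, max_partition_fetch_bytes):
    """Return bytes read per partition, visiting partitions in request order.

    `available` maps a partition name to the bytes it could serve locally
    (a hypothetical stand-in for the local log read).
    """
    remaining = fetch_max_bytes
    plan = {}
    for partition, size in available.items():
        # Each partition is capped by both the per-partition limit
        # and whatever is left of the overall fetch budget.
        read = min(size, max_partition_fetch_bytes, remaining)
        plan[partition] = read
        remaining -= read
    return plan

plan = plan_local_fetch(
    {"foo-0": 4 * MB, "foo-1": 4 * MB},
    fetch_max_bytes=int(1.5 * MB),
    max_partition_fetch_bytes=1 * MB,
)
total = sum(plan.values())
print(plan, total)
```

With these inputs foo-0 is served 1MB and foo-1 is limited to the remaining 0.5MB, so the total stays within fetch.max.bytes.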
However, in the remote + local fetch case, we don't know how much data we
can fetch before querying the remote log metadata manager or other resources, so we
have no value to give replicaManager beforehand. Currently, we treat the remote
read as 0 bytes read, which is why the final returned data can exceed the
"fetch.max.bytes" value.
For example:
# Config:
max.partition.fetch.bytes = 1MB
fetch.max.bytes = 1.5MB
# Topic foo has 2 partitions, and topic boo has 1 partition with tiered storage
enabled.
# Consumer fetches data from topics foo and boo.
# Fetches from boo-0: because we don't know how much data we can get, it is
counted as 0 bytes read and handed off to an async remote read.
# Fetches from foo-0 and gets 1MB of data, so 0.5MB remains
available to be fetched.
# Fetches from foo-1 with a max of 0.5MB.
# The async remote read for boo-0 gets 1MB of data (max.partition.fetch.bytes).
# Total returned: 2.5MB of records, which exceeds `fetch.max.bytes = 1.5MB`.
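
The overshoot in the steps above can be reproduced with a small model. This is only a sketch under stated assumptions: plan_mixed_fetch and the (name, available_bytes, is_remote) tuples are hypothetical, and the deferred remote read is modeled as bounded only by max.partition.fetch.bytes, as the issue describes.

```python
MB = 1024 * 1024

def plan_mixed_fetch(partitions, fetch_max_bytes, max_partition_fetch_bytes):
    """partitions: list of (name, available_bytes, is_remote) in request order."""
    remaining = fetch_max_bytes
    returned = {}
    deferred = []
    for name, size, is_remote in partitions:
        if is_remote:
            # Size is unknown up front, so it is counted as 0 bytes read
            # and the budget is left untouched.
            deferred.append((name, size))
            continue
        read = min(size, max_partition_fetch_bytes, remaining)
        returned[name] = read
        remaining -= read
    # The async remote reads complete later and only see the
    # per-partition limit, not the remaining overall budget.
    for name, size in deferred:
        returned[name] = min(size, max_partition_fetch_bytes)
    return returned

fetch_max = int(1.5 * MB)
result = plan_mixed_fetch(
    [("boo-0", 4 * MB, True), ("foo-0", 4 * MB, False), ("foo-1", 4 * MB, False)],
    fetch_max_bytes=fetch_max,
    max_partition_fetch_bytes=1 * MB,
)
total = sum(result.values())
print(result, total, total > fetch_max)
```

In this model the local partitions use up the whole 1.5MB budget and the remote read adds another 1MB on top, giving the 2.5MB total from the example.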
--
This message was sent by Atlassian Jira
(v8.20.10#820010)