Hi Andrew,

Thanks for the review!

The initial idea was to introduce the configuration on the broker side,
similar to how remote.fetch.max.wait.ms complements fetch.max.wait.ms:

   - fetch.max.wait.ms: configured via ConsumerConfig
     <https://sourcegraph.com/github.com/apache/kafka/-/blob/clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java?L205>
   - remote.fetch.max.wait.ms: configured via RemoteLogManagerConfig
     <https://sourcegraph.com/github.com/apache/kafka/-/blob/storage/src/main/java/org/apache/kafka/server/log/remote/storage/RemoteLogManagerConfig.java?L185>
     (broker-side)

With KIP-74
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-74%3A+Add+Fetch+Response+Size+Limit+in+Bytes>,
max.partition.fetch.bytes
<https://sourcegraph.com/github.com/apache/kafka/-/blob/clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java?L218-224>
is treated as a soft limit. This means the actual size of the FETCH response
returned to the client may exceed this value, particularly when a single
RecordBatch is larger than the configured limit.
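
To make the soft-limit semantics concrete, here is a rough sketch in Python.
It is a simplified model of the behaviour described above, not the broker's
actual code; select_batches and its parameters are names I made up for
illustration.

```python
MB = 1024 * 1024


def select_batches(batch_sizes, max_partition_fetch_bytes):
    """Return the batch sizes a soft per-partition limit would admit.

    Simplified model (an assumption, not broker code): the first batch is
    always returned, even if it alone exceeds the limit, so the consumer
    can make progress. Later batches are added only while the running
    total stays within the limit.
    """
    selected = []
    total = 0
    for size in batch_sizes:
        # Only batches after the first are subject to the limit.
        if selected and total + size > max_partition_fetch_bytes:
            break
        selected.append(size)
        total += size
    return selected
```

Under this model, a single 4 MB batch with a 1 MB limit is still returned
whole, while a run of small batches stops once the next one would cross the
limit.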

One area that remains unclear is how third-party clients behave when they
are configured with the default values:

   - max.partition.fetch.bytes = 1 MB
   - fetch.max.bytes = 50 MB

If the broker responds with a 4 MB partition response containing multiple
RecordBatches, does the client fail to process the records due to exceeding
max.partition.fetch.bytes, or does it handle the larger response gracefully?
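
To spell out the two outcomes I have in mind, here is a small hypothetical
sketch in Python. It does not describe any real client; the function name and
the strict flag are assumptions used only to contrast a client that enforces
the limit as a hard cap with one that treats it as a soft limit.

```python
MB = 1024 * 1024


def accepts_partition_response(response_bytes, max_partition_fetch_bytes, strict):
    """Hypothetical client-side size check for one partition's data.

    strict=True models a client that rejects anything above its configured
    max.partition.fetch.bytes; strict=False models a client that accepts
    whatever the broker returns.
    """
    if strict and response_bytes > max_partition_fetch_bytes:
        return False
    return True
```

With a 4 MB partition response and the 1 MB default, the strict variant
returns False and the lenient one returns True. The open question is which of
the two the third-party clients resemble.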

Thanks,
Kamal

On Thu, May 8, 2025 at 7:19 PM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi Kamal,
> Thanks for the KIP.
>
> While it makes a lot of sense to me to be able to control the fetching from
> remote storage to make sure it's sympathetic to the characteristics of the
> storage provider, it seems to me that extending this concept all the way to
> the individual consumers is not a good idea. You might have different
> consumers specifying their own wildly different values, when really you want
> a consistent configuration which applies whenever data is fetched from remote
> storage. Could a broker or topic config be used to achieve this more
> effectively? It worries me whenever we have a configuration which would
> ideally be used by all consumers setting the same value. It suggests that
> they shouldn't be able to provide their own values at all.
>
> Thanks,
> Andrew
> ________________________________________
> From: Kamal Chandraprakash <kamal.chandraprak...@gmail.com>
> Sent: 08 May 2025 12:07
> To: dev@kafka.apache.org <dev@kafka.apache.org>
> Subject: [DISCUSS] KIP-1178: Introduce remote.max.partition.fetch.bytes in
> Consumer
>
> Hi all,
>
> I've opened KIP-1178 to add a new config
> 'remote.max.partition.fetch.bytes' in the consumer. This config allows it
> to read from remote storage faster.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1178%3A+Introduce+remote.max.partition.fetch.bytes+config+in+Consumer
>
> Please take a look and share your thoughts.
>
> Thanks,
> Kamal
>
