Re: [DISCUSS] KIP-1241: Reduce tiered storage redundancy with delayed upload

jian fu Wed, 18 Mar 2026 04:27:16 -0700

Hi Chia-Ping,

Thanks for your review and comments.
Q1: Is this broker-level configuration dynamic?
Yes, the configurations are dynamic broker-level configurations, similar to
the other ones. I have already updated the KIP to state this. Thanks for the
reminder.
Q2: Should we add a metric to track the 'delayed size'?
Currently, we do have a way to measure how much delay there is, although
it is not very convenient. I think that in most cases there is no need to
monitor this delay continuously. In addition, we can use the API
introduced in KIP-1187 (Support to retrieve remote log size via
DescribeLogDirs RPC) to query it directly in the future. So, given that this
requirement is not particularly urgent, I think we can hold off on adding a
metric for now.


Thanks for your comments!

Regards
Jian


Chia-Ping Tsai <[email protected]> wrote on Wed, Mar 18, 2026 at 18:43:

> Hi Jian,
>
> Sorry for the late review. I have some questions below.
>
> Is this broker-level configuration dynamic? We should clarify this in the
> KIP. Also, should we add a metric to track the 'delayed size'?
>
> Best,
> Chia-Ping
>
> On 2025/11/19 13:29:11 jian fu wrote:
> > Hi everyone, I'd like to start a discussion on KIP-1241. The goal is to
> > reduce remote storage usage.
> >
> > KIP:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1241%3A+Reduce+tiered+storage+redundancy+with+delayed+upload
> >
> > Draft PR: https://github.com/apache/kafka/pull/20913
> >
> > Problem: Currently, Kafka's tiered storage implementation uploads all
> > non-active local log segments to remote storage immediately, even when
> > they are still within the local retention period. This results in
> > redundant storage of the same data in both the local and remote tiers.
> >
> > When there is no requirement for real-time analytics or immediate
> > consumption based on remote storage, this has the following drawbacks:
> >
> > 1. Wastes storage capacity and costs: The same data is stored twice
> > during the local retention window.
> > 2. Provides no immediate benefit: During the local retention period,
> > reads prioritize local data, making the remote copy unnecessary.
> >
> > So this KIP aims to reduce tiered storage redundancy with delayed upload.
> > You can check the test result example here directly:
> > https://github.com/apache/kafka/pull/20913#issuecomment-3547156286
> >
> > Looking forward to your feedback!
> >
> > Best regards,
> > Jian
> >
>
