Luke,
Thanks for putting in the effort to summarize the three KIPs and bring the
focus to the community's direction.
For KIP-1176, we are implementing some of the recommendations from the 
community (e.g. creating a separate metadata topic, tuning S3E1Z 
performance), and we will have some updates to share in the next few weeks.
We would love to see the new updates from the other two KIPs to see how we
can leverage or converge on the initiatives.
One of the advantages of KIP-1176 is its simpler architecture (no new
components such as a Batch Coordinator or a new cache system to build),
relying mostly on existing building blocks (page cache, tiered storage
classes, tiered storage metadata topic). We would like to see those design
choices considered in the new updates from the other KIPs.

    On Wednesday, August 6, 2025 at 12:56:46 AM PDT, Luke Chen 
<show...@gmail.com> wrote:  
 
 Hi Josep,

Thanks for the update.

> Luke, thank you for being proactive and caring about this topic!
I believe many community users also care about this topic! :)

Look forward to seeing the updated KIP!


Hi Stanislav,

Yes, it'd be good for the community to decide which way we want to go.
Leaderless or leader-based is absolutely one of the decisions.
And yes, having more than one KIP is also fine with me. It's just that we
need a way to move them forward.
Otherwise, suppose one of the KIPs is ready for voting, we can anticipate
requests to wait for the other two related KIPs.
Any good suggestions?

Hi Xinyu,

Thanks for the reply.
Look forward to seeing the updated KIP!

> If the community plans to adopt a leaderless architecture, will the focus
> be on a complete transition to leaderless, or will both architectures
> coexist in the long term?

I don't think we will abandon the leader-based design as a lot of users are
still relying on it.
Besides, KIP-1150 also claims the existing leader-based protocol works as
usual.
So, I think they should coexist in the long term.


Thank you.
Luke


On Wed, Aug 6, 2025 at 10:13 AM Xinyu Zhou <yu...@apache.org> wrote:

> Hi Luke,
>
> Thank you for creating this dedicated thread; we definitely need a space to
> discuss future steps for these topics. I apologize for my delay on KIP-1183
> and will provide more details in the coming weeks.
>
> I agree with Stanislav that we should first focus on the community's
> direction. Specifically, should we consider introducing a leaderless
> architecture to Kafka, given that it currently relies on a partitioned,
> leader-based model?
>
> From my own perspective, I’m particularly interested in how Leaderless and
> Leader-based architectures differ when it comes to handling data
> locality—which directly affects batching and fetch efficiency—and in the
> way core features are implemented. For instance, ordering, compaction,
> transactions, idempotent producers, and queues all have to be realized on
> the Coordinator in a Leaderless design, whereas in a Leader-based design
> they are handled by the Leader Partition.
>
> If the community plans to adopt a leaderless architecture, will the focus
> be on a complete transition to leaderless, or will both architectures
> coexist in the long term?
>
> I welcome discussions on this topic and am eager to hear diverse opinions.
>
> Regards,
> Xinyu
>
> On Wed, Aug 6, 2025 at 3:05 AM Stanislav Kozlovski <
> stanislavkozlov...@apache.org> wrote:
>
> > Thank you Luke for this wonderful summary and taking initiative.
> >
> > To me, it seems like a large differentiator between KIP-1150 and the
> > others is the leaderless design. The other two don’t allow for it.
> >
> > It sounds productive to focus the discussion on whether the leaderless
> > design is worth it on top of the replication cost savings.
> >
> > I’m of the opinion that it’s worth pursuing - both for the truly zero
> > network cost (no producer cross-AZ) but perhaps even more importantly the
> > zero state architecture that promises to significantly simplify
> > operations, including auto scaling brokers and scaling throughput per
> > partition.
> >
> > It would be great if the folks at Aiven could address the concerns
> > regarding queue and transactions support. I’m not of the opinion that
> > these things need to ship with v1, but it would be wise to ensure nothing
> > in the architecture blocks these features from being shipped in the
> > future.
> >
> > KIP-1176 is also very cool; addressing the acks=1 case will still be
> > necessary. I think it’s a necessary feature to implement, but I’d be
> > disappointed if that’s the only diskless solution the community agrees
> > on.
> >
> > A good path, if possible, may be to merge KIP-1150 and KIP-1176.
> >
> > If instead the community decides leaderless isn’t necessary, then
> > KIP-1183 seems fit.
> >
> > That’s my opinion. Happy to hear if anyone disagrees.
> >
> > On 2025/08/05 14:30:45 Josep Prat wrote:
> > > Hi Luke and community!
> > >
> > > Luke, thank you for being proactive and caring about this topic!
> > >
> > > In the meantime we have been keeping ourselves busy pushing our
> > > implementation of KIP-1150 to production to validate our assumptions
> > > and confirm its strengths while discovering its weaknesses.
> > > Now, after gathering some experience running it, we are (as I'm writing
> > > this, gathered in the same room) working on an improved proposal for
> > > KIP-1150 that also addresses the concerns from the community.
> > > We expect to share the updated KIP in the next couple of weeks.
> > >
> > > We apologize for the recent period of silence and are committed to more
> > > regular communication as we move forward.
> > >
> > > Best,
> > >
> > >
> > > On Tue, Aug 5, 2025 at 10:31 AM Luke Chen <show...@gmail.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > The Kafka community is currently seeing an unprecedented situation with
> > > > three KIPs (KIP-1150, KIP-1176, KIP-1183) simultaneously addressing the
> > > > same challenge of high replication costs when running Kafka across
> > > > multiple cloud availability zones. Each KIP offers a different solution
> > > > to this issue. While diversity of innovative ideas is a key strength of
> > > > open-source projects, it creates a burden for reviewers and users who
> > > > must compare and comment on multiple proposals simultaneously.
> > > > Furthermore, discussion around the three KIPs has stalled for over two
> > > > months now. This could be due to the authors being hesitant to proceed
> > > > due to the existence of alternative, potentially conflicting, solutions.
> > > > Addressing replication cost is a key concern of Kafka’s userbase and we
> > > > should try to move the conversation forward if we can.
> > > >
> > > > From what I understand, these three KIPs are not mutually exclusive. But
> > > > adopting all three KIPs in the community might not be what we expect.
> > > > Thus, I would like to *start a discussion on how we could move the
> > > > conversation forward*.
> > > >
> > > > To save time for the KIP readers/reviewers, I have created this
> > > > document [1] to help summarize each of the KIPs and describe their
> > > > current status. *Hope to get some suggestions/feedback from the
> > > > community*.
> > > >
> > > >
> > > > [1]
> > > > https://cwiki.apache.org/confluence/display/KAFKA/The+Path+Forward+for+Saving+Cross-AZ+Replication+Costs+KIPs
> > > >
> > > > KIP-1150:
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics
> > > > KIP-1176:
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1176%3A+Tiered+Storage+for+Active+Log+Segment
> > > > KIP-1183:
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1183%3A+Unified+Shared+Storage
> > > >
> > > >
> > > > Thank you.
> > > > Luke
> > > >
> > >
> > >
> > > --
> > > *Josep Prat*
> > > Sr. Engineering Director, Streaming Services, *Aiven*
> > > josep.p...@aiven.io  |  +491715557497
> > >
> >
>
  
