Hi all,

Thanks everyone for the votes and discussion.
I'm happy to announce that KIP-1150 is now accepted with
+9 binding votes (Stanislav Kozlovski, Chris Egerton, Luke Chen, Josep Prat,
Greg Harris, Andrew Schofield, Jun Rao, Satish Duggana, Chia-Ping Tsai) and
+5 non-binding votes (Henry Cai, Jian Fu, Andrew Mills, Varun Ghai, Vaquar Khan).

As a reminder, KIP-1150 is a motivational KIP: it establishes community
consensus on whether Apache Kafka should pursue object storage as the
primary storage backend for a new topic type, without prescribing a
specific implementation. All implementation details will be defined and
discussed in the follow-up KIPs:

   - KIP-1163: Diskless Core (DISCUSS thread):
   https://lists.apache.org/thread/3dj67w04r7pcmlytl912gv69j22o3g4j
   - KIP-1164: Diskless Coordinator (DISCUSS thread):
   https://lists.apache.org/thread/m9l6lbqv2cffxtz5frypylmqjd7bsqoz

We encourage everyone to continue the conversation in those threads. Each of
those KIPs will have its own discussion and voting process.

Thanks again to all who participated!

~Anatolii.

On Mon, Mar 2, 2026 at 7:31 AM Chia-Ping Tsai <[email protected]> wrote:

> +1 (binding)
>
> > On Feb 28, 2026, at 11:00 AM, Satish Duggana <[email protected]> wrote:
> >
> > Thanks for the KIP.
> > I've reviewed the updated KIP and agree with the motivation behind
> > KIP-1150, overall LGTM.
> > It seems KIP-1163 and KIP-1164 require more details, which we can discuss
> > in those respective threads.
> >
> > +1 (binding) for KIP-1150.
> >
> > ~Satish.
> >
> >> On Fri, 27 Feb 2026 at 23:28, Jun Rao via dev <[email protected]> wrote:
> >>
> >> Hi, Anatolii,
> >>
> >> Thanks for the KIP. The link you posted for KIP-1150 seems incorrect and
> >> it points to KIP-1163. Otherwise, +1.
> >>
> >> Jun
> >>
> >>> On Wed, Feb 25, 2026 at 2:59 PM vaquar khan <[email protected]> wrote:
> >>>
> >>> Fair point, Chris. I agree with that architectural boundary. KIP-1150
> >>> successfully sets the high-level mandate, and we can rigorously tackle the
> >>> exact EOS and RPC mechanics over in the KIP-1164 thread.
> >>>
> >>> Andrew, I am fully aligned with you on the massive operational value of
> >>> eliminating those cross-AZ replication costs. It is absolutely the right
> >>> strategic direction for Kafka.
> >>>
> >>> Since my initial concerns on the storage side are resolved, and we are
> >>> aligned on where the transactional interfaces will be finalized, I am
> >>> officially withdrawing my objection.
> >>> +1 (non-binding) for KIP-1150.
> >>>
> >>> I will migrate my open questions over to the KIP-1164 discussion thread so
> >>> we can lock down the data safety details there.
> >>>
> >>> Regards,
> >>> Vaquar Khan
> >>>
> >>> On Wed, 25 Feb 2026 at 15:24, Chris Egerton <[email protected]>
> >>> wrote:
> >>>
> >>>> Hi Vaquar,
> >>>>
> >>>>> Let me know what you guys think about locking down the text for these
> >>>>> interfaces.
> >>>>
> >>>> I think this KIP has the appropriate level of detail and any concerns about
> >>>> EOS can be addressed in the relevant sub-KIP.
> >>>>
> >>>> Chris
> >>>>
> >>>> On Wed, Feb 25, 2026 at 4:20 PM vaquar khan <[email protected]> wrote:
> >>>>
> >>>>> Hi everyone,
> >>>>>
> >>>>> First off, thanks to the authors for the Feb 12th updates to KIP-1163.
> >>>>> Adding the periodic reconciliation loop clears up my concerns about the
> >>>>> orphaned "Upload-then-Commit" segments, so I'm officially withdrawing my
> >>>>> objection on the storage leak issue.
> >>>>>
> >>>>> Chris and Greg, since you both mentioned digging into the 1164 details, I
> >>>>> wanted to pick your brains on how Exactly-Once Semantics (EOS) is going to
> >>>>> safely operate here. In standard Kafka, the Partition Leader is our single
> >>>>> serialization point. It receives the data, tracks ongoing transactions via
> >>>>> the ProducerStateManager, and calculates the Last Stable Offset (LSO)
> >>>>> locally. Since KIP-1150 removes the leader, the Batch Coordinator takes
> >>>>> over. But as I read through the current text, a few critical
> >>>>> synchronization barriers seem to me to be missing:
> >>>>>
> >>>>> 1. LSO Calculation: How exactly will the Batch Coordinator maintain and
> >>>>> calculate the LSO? Justine Olshan brought this up earlier too. Will the
> >>>>> coordinator run its own ProducerStateManager to track ongoing transactions,
> >>>>> or is there a totally different state machine planned?
> >>>>>
> >>>>> 2. RPC Protocol: What's the exact synchronization protocol between the
> >>>>> legacy Transaction Coordinator and the new Batch Coordinator? When the Txn
> >>>>> Coordinator sends a commit marker, how does the Batch Coordinator actually
> >>>>> verify it has received all the prerequisite data batches for that specific
> >>>>> transaction epoch?
> >>>>>
> >>>>> 3. Delayed Data Race Condition: Let's say a broker hits a GC pause right
> >>>>> *after* uploading a batch to object storage, but *before* committing the
> >>>>> coordinates. If the transaction commit marker arrives at the Coordinator
> >>>>> first, what happens? Does the Coordinator wait? If not, couldn't the
> >>>>> transaction commit with missing data, completely violating read_committed
> >>>>> isolation?
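
A minimal sketch of the Last Stable Offset rule referenced in point 1, for readers who want the leader-side behaviour spelled out. It assumes a single partition and treats the log end offset and the high watermark as the same value (no replication lag). The class `PartitionTxnState` and its method names are hypothetical; they are not taken from KIP-1164 or from Kafka's `ProducerStateManager`, and how the Batch Coordinator would keep equivalent state is exactly the open question.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PartitionTxnState:
    """Hypothetical per-partition transaction state (illustration only)."""

    # Simplification: log end offset and high watermark are the same value.
    high_watermark: int = 0
    # producer_id -> first offset of that producer's open transaction
    ongoing: Dict[int, int] = field(default_factory=dict)

    def append(self, producer_id: int, num_records: int, transactional: bool) -> int:
        """Assign offsets to a batch and return its base offset."""
        base = self.high_watermark
        if transactional:
            # Only the first batch of a transaction sets its start offset.
            self.ongoing.setdefault(producer_id, base)
        self.high_watermark += num_records
        return base

    def end_txn(self, producer_id: int) -> None:
        """Apply a commit or abort marker; the marker occupies one offset."""
        self.ongoing.pop(producer_id, None)
        self.high_watermark += 1

    def last_stable_offset(self) -> int:
        """First offset of the earliest open transaction, else the high watermark."""
        return min(self.ongoing.values(), default=self.high_watermark)


state = PartitionTxnState()
state.append(producer_id=1, num_records=3, transactional=True)   # offsets 0-2
state.append(producer_id=2, num_records=2, transactional=False)  # offsets 3-4
print(state.last_stable_offset())  # producer 1 still has an open transaction
state.end_txn(producer_id=1)
print(state.last_stable_offset())
```

A read_committed consumer is only served records below the value returned by `last_stable_offset()`, which is why the component that sequences markers must also know about every data batch that belongs to the transaction.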
> >>>>>
> >>>>> The KIP vaguely mentions *transactional checks* but leaves the actual
> >>>>> commit protocol and public interfaces undefined right now. I'm not saying
> >>>>> the design itself is broken, but I really think others and I need to
> >>>>> see these RPC flows explicitly documented before we implement and adopt
> >>>>> this. Otherwise, we risk baking in some severe data isolation headaches
> >>>>> down the line.
> >>>>>
> >>>>> Let me know what you guys think about locking down the text for these
> >>>>> interfaces.
> >>>>>
> >>>>> Regards,
> >>>>> Vaquar Khan
> >>>>>
> >>>>> On Wed, 25 Feb 2026 at 10:33, Greg Harris via dev <[email protected]> wrote:
> >>>>>
> >>>>>> Hey all,
> >>>>>>
> >>>>>> I'm excited to discuss more details in 1163 and 1164 with everyone.
> >>>>>>
> >>>>>> +1 (binding)
> >>>>>>
> >>>>>> Thanks!
> >>>>>> Greg
> >>>>>>
> >>>>>> On Wed, Feb 25, 2026 at 1:08 AM Anatolii Popov via dev <[email protected]> wrote:
> >>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> Given the importance of this KIP, we want to keep the vote open for a few
> >>>>>>> more days to give time to people who had comments in the DISCUSS thread to
> >>>>>>> cast their vote if they want.
> >>>>>>>
> >>>>>>> On Wed, Feb 25, 2026 at 10:47 AM Josep Prat via dev <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Hi all,
> >>>>>>>> As a co-author of the KIP, I want to explicitly cast my vote for this
> >>>>>>>> KIP.
> >>>>>>>>
> >>>>>>>> +1 (binding)
> >>>>>>>>
> >>>>>>>> On Wed, Feb 25, 2026 at 9:02 AM Luke Chen <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> I've re-read KIP-1150, and still agree this is what we need for Apache
> >>>>>>>>> Kafka.
> >>>>>>>>>
> >>>>>>>>> +1 (binding) from me.
> >>>>>>>>>
> >>>>>>>>> Thank you,
> >>>>>>>>> Luke
> >>>>>>>>>
> >>>>>>>>> On Wed, Feb 25, 2026 at 12:10 PM Chris Egerton <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> Thanks for the KIP. I've reviewed 1150, 1163, and 1164, as well as the
> >>>>>>>>>> relevant discussion threads. I may have granular comments about 1163 and
> >>>>>>>>>> 1164 but the overall approach suggested in 1150 looks good to me. I
> >>>>>>>>>> especially like that the approach covers two main pain points of operating
> >>>>>>>>>> and paying for Kafka today: it allows cross-AZ traffic to be reduced (even
> >>>>>>>>>> eliminated in some cases), and it also allows local disk usage by brokers
> >>>>>>>>>> to be reduced (if operators opt for a small local cache on follower brokers
> >>>>>>>>>> for non-tiered segments).
> >>>>>>>>>>
> >>>>>>>>>> +1 (binding)
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>>
> >>>>>>>>>> Chris
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Jan 26, 2026 at 3:36 PM vaquar khan <[email protected]> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Josep,
> >>>>>>>>>>>
> >>>>>>>>>>> Thank you for the detailed response. I appreciate the clarification
> >>>>>>>>>>> regarding the distinction between the Inkless POC and the KIP design.
> >>>>>>>>>>>
> >>>>>>>>>>> However, my objection is not based on temporary bugs in the fork, but *on
> >>>>>>>>>>> architectural gaps in the KIPs themselves* that these implementation issues
> >>>>>>>>>>> highlighted. If we are voting to approve the design, the design documents
> >>>>>>>>>>> must be structurally complete regarding data safety.
> >>>>>>>>>>>
> >>>>>>>>>>> *1. Regarding Storage Leaks (The Missing Design)*
> >>>>>>>>>>> You mentioned that cleanup logic "can be defined later." However, KIP-1163
> >>>>>>>>>>> explicitly delegates this responsibility to a separate process, and KIP-1165
> >>>>>>>>>>> (Object Compaction/GC) is currently marked as "Discarded" in the wiki.
> >>>>>>>>>>>
> >>>>>>>>>>> We cannot vote to approve a storage engine that has no specified mechanism
> >>>>>>>>>>> for garbage collection. The "Upload-then-Commit" pattern described in
> >>>>>>>>>>> KIP-1163 structurally creates orphaned segments during broker failures.
> >>>>>>>>>>> Without an active KIP defining the reconciliation protocol (since KIP-1165
> >>>>>>>>>>> was withdrawn), the proposal effectively describes a system with unbounded
> >>>>>>>>>>> storage growth during failure modes. This is a blocking design gap, not an
> >>>>>>>>>>> implementation detail.
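
The failure mode described above, and the kind of periodic reconciliation that would bound it, can be sketched as follows. This is an illustration only: `find_orphans`, the grace period, and the in-memory inputs are assumptions made for the example, not the mechanism specified in KIP-1163.

```python
from typing import Dict, List, Set


def find_orphans(
    stored: Dict[str, float],
    committed: Set[str],
    now: float,
    grace_seconds: float,
) -> List[str]:
    """Return object keys that were uploaded but never committed.

    stored        -- object key -> upload time (seconds), as listed from storage
    committed     -- object keys referenced by committed batch metadata
    grace_seconds -- objects younger than this are skipped, because their
                     upload may simply not have been committed yet
    """
    return sorted(
        key
        for key, uploaded_at in stored.items()
        if key not in committed and now - uploaded_at > grace_seconds
    )


# A broker uploaded seg-b and then failed before committing its coordinates.
# seg-c was uploaded 5 seconds ago and may still be committed normally.
stored = {"seg-a": 0.0, "seg-b": 50.0, "seg-c": 95.0}
committed = {"seg-a"}
print(find_orphans(stored, committed, now=100.0, grace_seconds=30.0))
```

Running such a pass periodically and deleting what it returns keeps storage growth bounded by the amount uploaded within one grace period plus one pass interval; without the grace period, the pass could delete objects whose commit is still in flight.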
> >>>>>>>>>>>
> >>>>>>>>>>> *2. Regarding EOS (The Coordinator Synchronization Gap)*
> >>>>>>>>>>> This is not a misunderstanding of standard Kafka transactions; it is a
> >>>>>>>>>>> critique of how KIP-1150 changes them. Standard EOS relies on the Partition
> >>>>>>>>>>> Leader to sequence markers and calculate the LSO (Last Stable Offset) in
> >>>>>>>>>>> memory. KIP-1150 removes the Leader.
> >>>>>>>>>>>
> >>>>>>>>>>> KIP-1164 (Batch Coordinator) must explicitly define the RPC flow between
> >>>>>>>>>>> the Transaction Coordinator and the Batch Coordinator to replace the
> >>>>>>>>>>> leader's role. Currently, the KIP does not specify how the system prevents
> >>>>>>>>>>> a "Split Brain" scenario where a consumer reads ahead of a transaction
> >>>>>>>>>>> marker that hasn't yet been sequenced by the Batch Coordinator. This is a
> >>>>>>>>>>> protocol-level correctness issue that must be resolved in the text before
> >>>>>>>>>>> adoption.
> >>>>>>>>>>>
> >>>>>>>>>>> Please note: I am maintaining my objection based on missing
> >>>>>>>>>>> specifications, not code bugs.
> >>>>>>>>>>>
> >>>>>>>>>>> I respectfully request that we pause the vote until:
> >>>>>>>>>>>
> >>>>>>>>>>>    - A valid design for Garbage Collection (replacing the discarded
> >>>>>>>>>>>      KIP-1165) is added to the proposal.
> >>>>>>>>>>>    - The Transaction/LSO synchronization protocol is explicitly documented
> >>>>>>>>>>>      in KIP-1164.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>>
> >>>>>>>>>>> Vaquar Khan
> >>>>>>>>>>> Sr Data Architect
> >>>>>>>>>>> https://www.linkedin.com/in/vaquar-khan-b695577/
> >>>>>>>>>>>


-- 
Anatolii Popov
Senior Software Developer, *Aiven OY*
m: +358505126242
w: aiven.io  e: [email protected]
<https://www.facebook.com/aivencloud>
<https://www.linkedin.com/company/aiven/>   <https://twitter.com/aiven_io>