Greg,

Thanks for the revisions on KIP-1150 and KIP-1163.  I like the idea of reusing 
KIP-405 tiered storage and Kafka's existing strengths in the page cache and 
local log segment files, which greatly simplifies the design and 
implementation.

I have a few questions:

HC1: The design aims to be both diskless and leaderless.  It is probably 
better to focus on one problem per KIP.  I think that with KIP-392 (Fetch from 
Follower) and KIP-1123 (Rack-aware partitioning for the Kafka producer), both 
producers and consumers can already read from / write to a broker in the same 
AZ to avoid cross-AZ cost, so the leader broker is no longer a blocking 
component in the read/write paths (the client can now read from / write to a 
different broker).  By removing the leader broker, the revised KIP-1150 design 
has to move logic (e.g. offset assignment) that originally lived in the leader 
into the batch coordinator; this adds latency and complicates the logic (it is 
easier to implement in a single leader broker).  In the current Kafka design, 
control is distributed across many leader brokers, with no hotspot in the 
cluster, whereas KIP-1150 moves that distributed control into a central 
component, the batch coordinator, which becomes a hotspot/bottleneck for the 
cluster.
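For reference, the same-AZ reads mentioned above are purely configuration-driven 
under KIP-392.  A minimal sketch (the property names are the documented Kafka 
ones; "us-east-1a" is a made-up rack/AZ id):

```python
# Sketch: configs that let a consumer fetch from a same-AZ replica (KIP-392).
# "us-east-1a" is a hypothetical rack/AZ id used only for illustration.
broker_props = {
    # Broker side: choose the fetch replica by matching the client's rack.
    "replica.selector.class":
        "org.apache.kafka.common.replica.RackAwareReplicaSelector",
    "broker.rack": "us-east-1a",
}
consumer_props = {
    # Consumer side: advertise the consumer's rack so the broker can match it.
    "client.rack": "us-east-1a",
}

# When the racks match, the fetch is served from a replica in the consumer's
# own AZ, avoiding cross-AZ traffic.
same_az = broker_props["broker.rack"] == consumer_props["client.rack"]
```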

HC2: Although the revised KIP-1150 reuses the page cache and local log 
segments on the follower broker to avoid designing another caching system, the 
page cache and local log segment files are populated much later on the 
follower, whereas in current Kafka the page cache is populated on the leader 
broker as soon as the produced data arrives.  This affects how soon a consumer 
can read the data if it was connected to the original leader broker.

HC3: The latency of acks=1 produces is longer than in current Kafka, since the 
producer needs to wait longer.  acks=1 performs much better (compared to 
acks=all) for use cases that can tolerate occasional message loss.
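To make the trade-off concrete, a rough sketch of what a produce request waits 
on under each acks mode (the descriptions for classic Kafka are standard; the 
note about the coordinator path is my reading of the revised design, not a 
claim from the KIP):

```python
# Sketch of what a producer waits for under each acks setting in classic
# Kafka.  Under KIP-1150 (as I understand it), even the acks=1 path must go
# through the batch coordinator before an offset can be acknowledged, which
# is where the extra latency comes from.
def produce_wait(acks: str) -> str:
    if acks == "0":
        return "no broker acknowledgement (fire and forget)"
    if acks == "1":
        # Classic Kafka: ack as soon as the leader has appended locally.
        return "single local append acknowledged"
    if acks == "all":
        # Ack only after all in-sync replicas have the record.
        return "append acknowledged by all in-sync replicas"
    raise ValueError(f"unknown acks setting: {acks}")
```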

HC4: Although the revised design is leaderless, there is still a leader 
concept when it comes to uploading closed log segments to tiered storage.  How 
is that leader elected, and how are leadership switches handled?

HC5: Message ordering is only maintained within the same broker.  When the 
producer cares about ordering and uses a message key to keep messages with the 
same key in order, I guess the producer client needs to always send messages 
with the same key to the same broker.  How is that consistent routing 
implemented, given there is no leader broker concept anymore?  And even if 
those messages are routed to the same broker but at different times, how is 
the ordering maintained?
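For context, classic Kafka gets per-key ordering from deterministic key 
hashing in the producer's default partitioner (the real partitioner uses 
murmur2; the hash below is only a simplified stand-in to show the idea):

```python
import hashlib


def pick_partition(key: bytes, num_partitions: int) -> int:
    """Deterministically map a key to a partition.

    Kafka's default partitioner uses murmur2; md5 here is only a stand-in
    to illustrate that the same key always maps to the same partition (and
    hence, in classic Kafka, to the same leader broker).
    """
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


# Same key -> same partition on every call, which is what gives per-key
# ordering in classic Kafka.
p1 = pick_partition(b"order-42", 12)
p2 = pick_partition(b"order-42", 12)
```

In the leaderless design the open question is which broker handles a given 
partition's batches, and how concurrent appends for the same partition are 
sequenced by the batch coordinator.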

HC6: For the topic-based batch coordinator, does the read-only coordinator 
live on each broker?  If so, there will be a large fan-out of reads from that 
metadata topic.

HC7: For the topic-based batch coordinator, is the embedded SQLite engine 
always needed if the size of the metadata topic is contained?

On 2025/09/03 19:59:48 Greg Harris wrote:
> Hi all,
> 
> Thank you all for your questions and design input on KIP-1150.
> 
> We have just updated KIP-1150 and KIP-1163 with a new design. To summarize
> the changes:
> 
> 1. The design prioritizes integrating with the existing KIP-405 Tiered
> Storage interfaces, permitting data produced to a Diskless topic to be
> moved to tiered storage.
> This lowers the scalability requirements for the Batch Coordinator
> component, and allows Diskless to compose with Tiered Storage plugin
> features such as encryption and alternative data formats.
> 
> 2. Consumer fetches are now served from local segments, making use of the
> indexes, page cache, request purgatory, and zero-copy functionality already
> built into classic topics.
> However, local segments are now considered cache elements, do not need to
> be durably stored, and can be built without contacting any other replicas.
> 
> 3. The design has been simplified substantially, by removing the previous
> Diskless consume flow, distributed cache component, and "object
> compaction/merging" step.
> 
> The design maintains leaderless produces as enabled by the Batch
> Coordinator, and the same latency profiles as the earlier design, while
> being simpler and integrating better into the existing ecosystem.
> 
> Thanks, and we are eager to hear your feedback on the new design.
> Greg Harris
> 
> On Mon, Jul 21, 2025 at 3:30 PM Jun Rao <ju...@confluent.io.invalid> wrote:
> 
> > Hi, Jan,
> >
> > For me, the main gap of KIP-1150 is the support of all existing client
> > APIs. Currently, there is no design for supporting APIs like transactions
> > and queues.
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Jul 21, 2025 at 3:53 AM Jan Siekierski
> > <ja...@kentra.io.invalid> wrote:
> >
> > > Would it be a good time to ask for the current status of this KIP? I
> > > haven't seen much activity here for the past 2 months, the vote got
> > vetoed
> > > but I think the pending questions have been answered since then. KIP-1183
> > > (AutoMQ's proposal) also didn't have any activity since May.
> > >
> > > In my eyes KIP-1150 and KIP-1183 are two real choices that can be
> > > made, with a coordinator-based approach being by far the dominant one
> > when
> > > it comes to market adoption - but all these are standalone products.
> > >
> > > I'm a big fan of both approaches, but would hate to see a stall. So the
> > > question is: can we get an update?
> > >
> > > Maybe it's time to start another vote? Colin McCabe - have your questions
> > > been answered? If not, is there anything I can do to help? I'm deeply
> > > familiar with both architectures and have written about both?
> > >
> > > Kind regards,
> > > Jan
> > >
> > > On Tue, Jun 24, 2025 at 10:42 AM Stanislav Kozlovski <
> > > stanislavkozlov...@apache.org> wrote:
> > >
> > > > I have some nits - it may be useful to
> > > >
> > > > a) group all the KIP email threads in the main one (just a bunch of
> > links
> > > > to everything)
> > > > b) create the email threads
> > > >
> > > > It's a bit hard to track it all - for example, I was searching for a
> > > > discuss thread for KIP-1165 for a while; As far as I can tell, it
> > doesn't
> > > > exist yet.
> > > >
> > > > Since the KIPs are published (by virtue of having the root KIP be
> > > > published, having a DISCUSS thread and links to sub-KIPs where were
> > aimed
> > > > to move the discussion towards), I think it would be good to create
> > > DISCUSS
> > > > threads for them all.
> > > >
> > > > Best,
> > > > Stan
> > > >
> > > > On 2025/04/16 11:58:22 Josep Prat wrote:
> > > > > Hi Kafka Devs!
> > > > >
> > > > > We want to start a new KIP discussion about introducing a new type of
> > > > > topics that would make use of Object Storage as the primary source of
> > > > > storage. However, as this KIP is big we decided to split it into
> > > multiple
> > > > > related KIPs.
> > > > > We have the motivational KIP-1150 (
> > > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics
> > > > )
> > > > > that aims to discuss if Apache Kafka should aim to have this type of
> > > > > feature at all. This KIP doesn't go onto details on how to implement
> > > it.
> > > > > This follows the same approach used when we discussed KRaft.
> > > > >
> > > > > But as we know that it is sometimes really hard to discuss on that
> > meta
> > > > > level, we also created several sub-kips (linked in KIP-1150) that
> > offer
> > > > an
> > > > > implementation of this feature.
> > > > >
> > > > > We kindly ask you to use the proper DISCUSS threads for each type of
> > > > > concern and keep this one to discuss whether Apache Kafka wants to
> > have
> > > > > this feature or not.
> > > > >
> > > > > Thanks in advance on behalf of all the authors of this KIP.
> > > > >
> > > > > ------------------
> > > > > Josep Prat
> > > > > Open Source Engineering Director, Aiven
> > > > > josep.p...@aiven.io   |   +491715557497 | aiven.io
> > > > > Aiven Deutschland GmbH
> > > > > Alexanderufer 3-7, 10117 Berlin
> > > > > Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> > > > > Anna Richardson, Kenneth Chen
> > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > >
> > > >
> > >
> >
> 
