Re: [DISCUSS] KIP-1150 Diskless Topics

Xiaorui Wang Wed, 23 Apr 2025 02:14:00 -0700

Dear Community,

I am truly delighted to see the KIP-1150 proposal put forth by Josep, which 
presents an exciting new architecture for the cloud economy.


It was an honor to hear Ziming and Stanislav mention AutoMQ[1] during the 
discussion. Over the past few years, AutoMQ has dedicated itself to this 
direction, creating what we call a "Stateless Kafka" that is entirely built on 
object storage. AutoMQ has quickly gained recognition from numerous medium to 
large enterprises. For instance, Grab, based in Singapore, has already put it 
into use. We even have individual customers whose peak throughput has exceeded 
50 GiB/s.

As a production-ready solution, the core storage layer of AutoMQ can be simply 
understood as a new implementation of Kafka's LogSegment. From a feasibility 
standpoint, it could serve as another storage engine for the community, 
developing in parallel with the existing ISR-based storage engine. If it 
aligns with the community's roadmap, we would be more than willing to discuss 
the possibility of merging AutoMQ's code into Kafka. This is a new option for 
everyone to consider.

If this is not the appropriate thread for discussion, we are open to starting 
a new thread for a more comprehensive discussion.

Thanks again for KIP-1150 and everyone involved in this discussion.

[1] https://github.com/AutoMQ/automq

Best Regards,  
Xiaorui  
Co-founder of AutoMQ

On 2025/04/20 20:04:14 Stanislav Kozlovski wrote:
> This is an amazing initiative. Huge kudos for driving it. We should 
> incorporate it one way or another.
> 
> I have a suggestion I'd like to hear your thoughts on. I'm cognizant of the 
> effort required for KIP-1150 so I don't necessarily want to increase the 
> scope - but thinking about this early on can help design later on, plus shape 
> the motivation.
> 
> The idea is to introduce support for replicationless acks=1 writes. This 
> would be very similar to how AutoMQ's WAL+S3 feature works, as far as I 
> understand it.
> 
> Could we have Diskless Brokers serve acks=1 produce requests by immediately 
> persisting the data on disk (not sure if we should use fsync or not), 
> responding to the request, and then still asynchronously batching said data 
> with regular acks=all data via the "diskless.append.commit.interval.ms"/ 
> "diskless.append.buffer.max.bytes" configs?
> 
> If I'm not mistaken, this would offer very similar guarantees as today's 
> acks=1 requests, where a period of low durability exists between the time the 
> leader persists to its local disk and the time all followers persist to their 
> disk. Granted, in traditional Kafka this period is probably no more than a 
> hundred milliseconds, and here it'd be at least 2x higher. But I believe that 
> given the major savings, many acks=1 users will be happy to make the tradeoff.
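
The acks=1 flow proposed above can be illustrated with a minimal sketch. This is a toy model, not Kafka code: the class name, the in-memory lists standing in for the local disk and object storage, and the explicit `now_ms` argument are all assumptions made for illustration. The two constructor parameters stand in for the "diskless.append.commit.interval.ms" and "diskless.append.buffer.max.bytes" configs, and the fsync question is left open, as in the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Acks1Broker:
    """Toy model of a replicationless acks=1 write path (hypothetical)."""
    commit_interval_ms: int  # stands in for diskless.append.commit.interval.ms
    buffer_max_bytes: int    # stands in for diskless.append.buffer.max.bytes
    local_log: list = field(default_factory=list)     # stand-in for the local disk
    pending: list = field(default_factory=list)       # records awaiting upload
    object_store: list = field(default_factory=list)  # stand-in for object storage
    last_commit_ms: int = 0

    def produce_acks1(self, record: bytes) -> bool:
        # Persist locally first, then acknowledge; the upload happens later.
        self.local_log.append(record)
        self.pending.append(record)
        return True  # the ack does not wait for object storage

    def pending_bytes(self) -> int:
        return sum(len(r) for r in self.pending)

    def maybe_commit(self, now_ms: int) -> bool:
        # Called periodically by a background loop in this sketch.
        if not self.pending:
            return False
        interval_elapsed = now_ms - self.last_commit_ms >= self.commit_interval_ms
        buffer_full = self.pending_bytes() >= self.buffer_max_bytes
        if not (interval_elapsed or buffer_full):
            return False
        # Upload one combined batch; only now is the data durable remotely.
        self.object_store.append(b"".join(self.pending))
        self.pending.clear()
        self.local_log.clear()  # local copy is no longer needed
        self.last_commit_ms = now_ms
        return True
```

In this model, the window of low durability is the time between `produce_acks1` returning and the next successful `maybe_commit`, which is bounded by the commit interval unless the buffer fills first.
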
> 
> While on the topic of cost, I hastily ran some cost calculations and found 
> that the KIP should reduce replication costs by more than 80x 
> (https://topicpartition.io/blog/kip-1150-diskless-topics-in-apache-kafka). 
> There may be some errors there as the batch coordinator RPC and merging aren't 
> fully fleshed out - but I believe it's directionally correct. It may be worth 
> adding that to the motivation in one way or another - so as to be able to 
> quantify the numbers.
> 
> Best,
> Stanislav
> 
> On 2025/04/19 11:02:30 Ivan Yurchenko wrote:
> > Hi Ziming,
> > 
> > > 1. Is this feature available with just a minor config adjustment, or will 
> > > it intrude on the current code heavily? For example, AutoMQ is 100% 
> > > compatible with Kafka and doesn’t intrude on the code heavily.
> > 
> > If we speak about the part visible to the user, we expect:
> >  1. Minimal changes to the client code (with potential fallback with even 0 
> > changes for older clients).
> >  2. A limited set of new configurations for broker and topics.
> > Otherwise, this should be a perfectly normal Apache Kafka.
> > 
> > > 2. Though we are not discussing implementation details, it’s worth giving 
> > > some high-level architecture ideas, and it’s better to compare with 
> > > AutoMQ-like systems.
> > 
> > There's quite a bit of high-level architecture in the sub-KIP, KIP-1163 [1].
> > We didn't do a comparison with AutoMQ (to the best of our knowledge, they 
> > have a fairly different approach), but if this helps the community to get 
> > the idea then sure, we should do this.
> > 
> > > 3. What will we provide through it? I think we will just provide a common 
> > > interface and put implementations in other repos, just as we did for 
> > > Kafka Connect and Kafka Tiered Storage.
> > 
> > This is true for the component that does CRUD operations on object storage. 
> > However, for the batch coordinator we would like to provide a decent 
> > out-of-the-box self-contained (i.e. no external deps like database) 
> > implementation that many Kafka users who don't have challenging scaling 
> > requirements would benefit from. There's the sub-KIP-1164 [2] for this.
> > 
> > > 4. How do we deal with the KRaft-related protocol? Since metadata is 
> > > managed differently, through __cluster_metadata, will this KIP close the 
> > > gap between __cluster_metadata and data topics by putting metadata in 
> > > object storage? If so, will there be no standby controllers, since 
> > > standby controllers are the __cluster_metadata followers and there will 
> > > be no followers?
> > 
> > The current plan is not to work directly with KRaft and 
> > __cluster_metadata. What we need from KRaft is 3 types of events: 
> > topic/partition creation, topic deletion, and topic configuration changes 
> > (with the possibility to limit this set to topic deletion only). We think 
> > that'd be enough if we have a "bridge" that watches for these events in 
> > __cluster_metadata and reflects them in the batch coordinator (basically, 
> > by sending requests).
> > Does this answer the question or maybe I misunderstood?
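
The "bridge" described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the event names, the `MetadataRecord` type, and the `send_to_coordinator` callback are hypothetical and do not correspond to actual KRaft record types or batch coordinator RPCs.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, FrozenSet, Iterable, Optional


class MetadataEvent(Enum):
    # Hypothetical event kinds; real __cluster_metadata records differ.
    TOPIC_CREATED = auto()         # topic/partition creation
    TOPIC_DELETED = auto()
    TOPIC_CONFIG_CHANGED = auto()
    BROKER_REGISTERED = auto()     # example of an event the bridge ignores


@dataclass(frozen=True)
class MetadataRecord:
    event: MetadataEvent
    topic: Optional[str] = None


# The three event types the batch coordinator needs, per the plan above.
RELEVANT_EVENTS: FrozenSet[MetadataEvent] = frozenset({
    MetadataEvent.TOPIC_CREATED,
    MetadataEvent.TOPIC_DELETED,
    MetadataEvent.TOPIC_CONFIG_CHANGED,
})


def bridge(
    records: Iterable[MetadataRecord],
    send_to_coordinator: Callable[[MetadataRecord], None],
    relevant: FrozenSet[MetadataEvent] = RELEVANT_EVENTS,
) -> int:
    """Forward only the relevant metadata events; return how many were sent."""
    forwarded = 0
    for record in records:
        if record.event in relevant:
            send_to_coordinator(record)  # stands in for a coordinator request
            forwarded += 1
    return forwarded
```

Passing `relevant=frozenset({MetadataEvent.TOPIC_DELETED})` models the reduced variant mentioned above, where the set is limited to topic deletion only.
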
> > 
> > Best,
> > Ivan
> > 
> > [1] 
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1163%3A+Diskless+Core
> > [2] 
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1164%3A+Topic+Based+Batch+Coordinator
> > 
> > On Fri, Apr 18, 2025, at 12:42, Ziming Deng wrote:
> > > Hi Josep,
> > > 
> > > This would be a fascinating feature; some well-known Kafka users are 
> > > running Kafka in cloud-native environments. As far as I know, there are 
> > > already some derivative versions of Kafka that provide this feature. For 
> > > example, I am using AutoMQ (https://github.com/AutoMQ/automq) in my 
> > > environment, which has significantly helped me reduce costs. So I think 
> > > it’s worthwhile to clarify some related details:
> > > 1. Is this feature available with just a minor config adjustment, or will 
> > > it intrude on the current code heavily? For example, AutoMQ is 100% 
> > > compatible with Kafka and doesn’t intrude on the code heavily.
> > > 2. Though we are not discussing implementation details, it’s worth giving 
> > > some high-level architecture ideas, and it’s better to compare with 
> > > AutoMQ-like systems.
> > > 3. What will we provide through it? I think we will just provide a common 
> > > interface and put implementations in other repos, just as we did for 
> > > Kafka Connect and Kafka Tiered Storage.
> > > 4. How do we deal with the KRaft-related protocol? Since metadata is 
> > > managed differently, through __cluster_metadata, will this KIP close the 
> > > gap between __cluster_metadata and data topics by putting metadata in 
> > > object storage? If so, will there be no standby controllers, since 
> > > standby controllers are the __cluster_metadata followers and there will 
> > > be no followers?
> > > 
> > > — 
> > > Ziming
> > > 
> > > > On Apr 16, 2025, at 19:58, Josep Prat <[email protected]> 
> > > > wrote:
> > > > 
> > > > Hi Kafka Devs!
> > > > 
> > > > > We want to start a new KIP discussion about introducing a new type of
> > > > > topic that would make use of Object Storage as the primary source of
> > > > > storage. However, as this KIP is big, we decided to split it into
> > > > > multiple related KIPs.
> > > > > We have the motivational KIP-1150 (
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics)
> > > > > that aims to discuss whether Apache Kafka should have this type of
> > > > > feature at all. This KIP doesn't go into details on how to implement it.
> > > > > This follows the same approach used when we discussed KRaft.
> > > > 
> > > > > But as we know that it is sometimes really hard to discuss on that meta
> > > > > level, we also created several sub-KIPs (linked in KIP-1150) that offer
> > > > > an implementation of this feature.
> > > > 
> > > > We kindly ask you to use the proper DISCUSS threads for each type of
> > > > concern and keep this one to discuss whether Apache Kafka wants to have
> > > > this feature or not.
> > > > 
> > > > Thanks in advance on behalf of all the authors of this KIP.
> > > > 
> > > > ------------------
> > > > Josep Prat
> > > > Open Source Engineering Director, Aiven
> > > > [email protected]   |   +491715557497 | aiven.io
> > > > Aiven Deutschland GmbH
> > > > Alexanderufer 3-7, 10117 Berlin
> > > > Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> > > > Anna Richardson, Kenneth Chen
> > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > 
> > > 
> > 
>
