I'm not sure if this reply will get threaded correctly. I'm joining the discussion late and intend this as a reply to https://lists.apache.org/thread/ljxc495nf39myp28pmf77sm2xydwjm6d

I haven't read the design doc in detail, but as a heavy user of Kafka I want to say that my team and I are very interested in diskless topics: if not this implementation, then something with similar properties (i.e. publishing directly to object storage, which avoids cross-AZ costs on replication and produce/consume, with the tradeoff of higher latency).

Our biggest motivation by far would be saving cost. To the extent that it also makes clusters easier to manage, by having brokers be mostly stateless and by letting clusters expand and shrink elastically, that's a bonus.

Not all of our topics would be candidates for something like this due to the higher latency, but as a rough estimate something like half might be.

For one half of our topics we need low latency on the produce side, either because publishes are in the hot path of serving requests (and it would take a lot of engineering effort to change many existing systems to deal with high produce latency, which is tough in languages like Ruby with limited concurrency), or because we need tight end-to-end latency for consumers.

For the other half, the topics where this would be a good fit, the producers would be fine with higher latency, either because they are able to use concurrency or because they are okay with a small chance of data loss (e.g. because the data being published is sampled anyway), which means we can hide the latency by making the publish asynchronous.

Overall I'm excited about this KIP, and happy that there has been so much innovation happening in and around Kafka lately, with tiered storage, diskless topics, and more.

Thanks,
Donny

On 2025/04/16 11:58:22 Josep Prat wrote:
> Hi Kafka Devs!
>
> We want to start a new KIP discussion about introducing a new type of
> topics that would make use of Object Storage as the primary source of
> storage. However, as this KIP is big we decided to split it into multiple
> related KIPs.
> We have the motivational KIP-1150 (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics )
> that aims to discuss if Apache Kafka should aim to have this type of
> feature at all. This KIP doesn't go onto details on how to implement it.
> This follows the same approach used when we discussed KRaft.
>
> But as we know that it is sometimes really hard to discuss on that meta
> level, we also created several sub-kips (linked in KIP-1150) that offer an
> implementation of this feature.
>
> We kindly ask you to use the proper DISCUSS threads for each type of
> concern and keep this one to discuss whether Apache Kafka wants to have
> this feature or not.
>
> Thanks in advance on behalf of all the authors of this KIP.
>
> ------------------
> Josep Prat
> Open Source Engineering Director, Aiven
> [email protected] | +491715557497 | aiven.io
> Aiven Deutschland GmbH
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> Anna Richardson, Kenneth Chen
> Amtsgericht Charlottenburg, HRB 209739 B
