I'm not sure if this reply will get threaded correctly. I'm joining the
discussion late and intend this as a reply to
https://lists.apache.org/thread/ljxc495nf39myp28pmf77sm2xydwjm6d

I haven't read the design doc in detail but just want to say, as a heavy
user of Kafka: my team and I are very interested in diskless topics, if
not this implementation then something with similar properties (i.e.
publishing directly to object storage, avoiding cross-AZ costs on
replication and produce/consume, with the tradeoff of higher latency).

Our biggest motivation by far would be saving cost. To the extent that it
also makes clusters easier to manage, by making brokers mostly stateless
and letting clusters expand and shrink elastically, that's a bonus.

Not all of our topics would be candidates for something like this due to
the higher latency, but as a rough estimate about half might be. For the
half that would not be a fit, we need low latency on the produce side,
either because publishes are in the hot path of serving requests (and it
would take a lot of engineering effort to change many existing systems to
deal with high produce latency, which is tough in languages like Ruby with
limited concurrency), or because we need tight end-to-end latency for
consumers. For the other half, where this would be a good fit, the
producers would be fine with higher latency, either because they can use
concurrency or because they can tolerate a small chance of data loss (e.g.
because the data being published is sampled anyway), which means we can
hide the latency by making the publish asynchronous.

Overall I'm excited about this KIP and happy that in general there has been
so much innovation happening in and around Kafka lately with tiered
storage, diskless topics, and more.

Thanks,
Donny

On 2025/04/16 11:58:22 Josep Prat wrote:
> Hi Kafka Devs!
>
> We want to start a new KIP discussion about introducing a new type of
> topics that would make use of Object Storage as the primary source of
> storage. However, as this KIP is big we decided to split it into multiple
> related KIPs.
> We have the motivational KIP-1150 (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics
> )
> that aims to discuss if Apache Kafka should aim to have this type of
> feature at all. This KIP doesn't go into details on how to implement it.
> This follows the same approach used when we discussed KRaft.
>
> But as we know that it is sometimes really hard to discuss on that meta
> level, we also created several sub-kips (linked in KIP-1150) that offer an
> implementation of this feature.
>
> We kindly ask you to use the proper DISCUSS threads for each type of
> concern and keep this one to discuss whether Apache Kafka wants to have
> this feature or not.
>
> Thanks in advance on behalf of all the authors of this KIP.
>
> ------------------
> Josep Prat
> Open Source Engineering Director, Aiven
> [email protected]   |   +491715557497 | aiven.io
> Aiven Deutschland GmbH
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> Anna Richardson, Kenneth Chen
> Amtsgericht Charlottenburg, HRB 209739 B
>
