Hello Jun,

Thank you for your interest in the KIP.

Regarding your question about transactions:

> JR2. Transactions on Diskless Topics is listed in the future work.
> Currently, we try to support all existing client APIs for every new
> feature. For example, remote storage (KIP-405) supports transactions in the
> very first release. Similarly, queue for Kafka (KIP-932) will support
> transactions and remote storage in its first release. The reasoning is that
> without the full support of all client APIs, it's going to be hard for a
> Kafka admin to adopt the new feature, since it has the potential to break
> existing or new users. So, it would be better if this KIP can follow the
> current convention to support all existing client APIs such as transactions
> and queue for Kafka. The current implementations of both transactions and
> queues depend on a partition leader. Since this KIP no longer has partition
> leaders, it will be useful to think through how those APIs can be supported
> in the new architecture.

We excluded transactions and queues for a practical reason: to keep the
whole process manageable and to move forward in incremental steps, each
delivering new value. Transactions and queues are great features of Kafka,
but many use cases, especially those where diskless topics shine, may be
just fine without them, so it seems unnecessary to delay serving them.
Other similarly big Kafka features have not always supported every other
feature either. For example, tiered storage doesn’t support compacted
topics, and the community seems fine with this because compacted topics are
rarely big enough to benefit from remote storage. Besides, some other big
features were released in availability stages (EA, GA), and diskless topics
will most likely follow this route as well. So it seems fine if some
features aren’t supported from the beginning.

We also don’t think existing users will be affected, because this is a new
topic type: nobody can start using it unknowingly and thereby break an
existing workload.

The path to transactions seems more or less clear. We plan to support
idempotent produce from the beginning by storing producer state in the
batch coordinator; for classic topics, keeping this state is the
responsibility of partition leaders. To support transactions, we plan to
use the existing transaction coordinator mechanism together with an
extension of this approach.
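
To illustrate the idea only, here is a minimal Python sketch of how a
coordinator could keep per-producer state and use it to deduplicate retried
batches. It is not taken from the KIP: the class, method, and result names
are hypothetical, and it simplifies the real protocol (no sequence-number
wrap-around, no window of recently seen batches, and a new producer or
epoch is assumed to start at sequence 0).

```python
from dataclasses import dataclass, field
from enum import Enum


class AppendResult(Enum):
    ACCEPTED = "accepted"
    DUPLICATE = "duplicate"
    OUT_OF_ORDER = "out_of_order"
    FENCED = "fenced"


@dataclass
class ProducerEntry:
    epoch: int
    last_sequence: int  # last sequence number accepted for this partition


@dataclass
class BatchCoordinatorSketch:
    """Hypothetical stand-in for the state a batch coordinator might keep."""

    # Keyed by (producer_id, partition); with classic topics the partition
    # leader keeps the equivalent state.
    state: dict = field(default_factory=dict)

    def try_append(self, producer_id, epoch, partition, first_seq, last_seq):
        key = (producer_id, partition)
        entry = self.state.get(key)

        if entry is None or epoch > entry.epoch:
            # Assumption: a new producer or a bumped epoch starts at 0.
            if first_seq != 0:
                return AppendResult.OUT_OF_ORDER
            self.state[key] = ProducerEntry(epoch, last_seq)
            return AppendResult.ACCEPTED

        if epoch < entry.epoch:
            # An older incarnation of the producer must not write anymore.
            return AppendResult.FENCED

        if last_seq <= entry.last_sequence:
            # A retry of a batch that was already committed.
            return AppendResult.DUPLICATE

        if first_seq != entry.last_sequence + 1:
            # A gap or partial overlap in the sequence numbers.
            return AppendResult.OUT_OF_ORDER

        entry.last_sequence = last_seq
        return AppendResult.ACCEPTED
```

In this sketch a retried batch is reported as a duplicate instead of being
written twice, which is the guarantee idempotent produce is meant to give,
regardless of which broker forwarded the request.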

Best regards,
Giuseppe

On Tue, May 13, 2025 at 6:35 PM Jun Rao <j...@confluent.io.invalid> wrote:

> Hi, Josep,
>
> Thanks for the KIP. At the high level, the KIP is well thought through and
> provides multiple benefits for Kafka in the Cloud. A few comments below.
>
> JR1. One of the key motivations is to eliminate inter-zone data transfer
> costs from Kafka replication. It would be useful to provide a short summary
> regarding the cost saving from the major Cloud providers. As people
> mentioned in another email, currently Azure doesn't charge for inter-zone
> data transfer.
>
> JR2. Transactions on Diskless Topics is listed in the future work.
> Currently, we try to support all existing client APIs for every new
> feature. For example, remote storage (KIP-405) supports transactions in the
> very first release. Similarly, queue for Kafka (KIP-932) will support
> transactions and remote storage in its first release. The reasoning is that
> without the full support of all client APIs, it's going to be hard for a
> Kafka admin to adopt the new feature, since it has the potential to break
> existing or new users. So, it would be better if this KIP can follow the
> current convention to support all existing client APIs such as transactions
> and queue for Kafka. The current implementations of both transactions and
> queues depend on a partition leader. Since this KIP no longer has partition
> leaders, it will be useful to think through how those APIs can be supported
> in the new architecture.
>
> JR3. "Permit multi-region active-active topics with automatic failover".
> Could you elaborate on the benefit of this? Cloud providers still charge
> cross region data transfer in object stores, right?
>
> JR4. "Balance traffic among brokers and eliminate broker hotspots with
> per-client granularity". Does that mean all traffic from a client is served
> from a single broker? This seems to reduce the scalability from the client
> perspective.
>
> JR5. Regarding the name diskless. It might be ok, but people may associate
> it with less durability. Under the cover, the Cloud storage will still
> store the data on some disks. I am wondering if there is another name that
> captures the essence but without the potential negative impression.
>
> Thanks,
>
> Jun
>
>
> On Wed, Apr 16, 2025 at 5:00 AM Josep Prat <josep.p...@aiven.io.invalid>
> wrote:
>
> > Hi Kafka Devs!
> >
> > We want to start a new KIP discussion about introducing a new type of
> > topics that would make use of Object Storage as the primary source of
> > storage. However, as this KIP is big we decided to split it into multiple
> > related KIPs.
> > We have the motivational KIP-1150 (
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics
> > )
> > that aims to discuss if Apache Kafka should aim to have this type of
> > feature at all. This KIP doesn't go onto details on how to implement it.
> > This follows the same approach used when we discussed KRaft.
> >
> > But as we know that it is sometimes really hard to discuss on that meta
> > level, we also created several sub-kips (linked in KIP-1150) that offer
> > an implementation of this feature.
> >
> > We kindly ask you to use the proper DISCUSS threads for each type of
> > concern and keep this one to discuss whether Apache Kafka wants to have
> > this feature or not.
> >
> > Thanks in advance on behalf of all the authors of this KIP.
> >
> > ------------------
> > Josep Prat
> > Open Source Engineering Director, Aiven
> > josep.p...@aiven.io   |   +491715557497 | aiven.io
> > Aiven Deutschland GmbH
> > Alexanderufer 3-7, 10117 Berlin
> > Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> > Anna Richardson, Kenneth Chen
> > Amtsgericht Charlottenburg, HRB 209739 B
> >
>
