Hi Ziming,

> 1. Is this feature available by just a minor adjust of config or it will
> intrude current code heavily, say, AutoMq is 100% compatible with Kafka and
> doesn’t intrude the code heavily

As far as the user-visible part is concerned, we expect:
1. Minimal changes to the client code (potentially with a fallback that requires no changes at all for older clients).
2. A limited set of new configurations for brokers and topics.
Otherwise, this should be a perfectly normal Apache Kafka.

> 2. Though we are not discussing implement details, it’s worth giving some
> high-level architecture ideas, and it’s better to compare with AutoMq like
> systems.

There's quite a bit of high-level architecture in the sub-KIP, KIP-1163 [1]. We didn't do a comparison with AutoMQ (to the best of our knowledge, they have a fairly different approach), but if it helps the community get the idea, then sure, we should do this.

> 3. What we will provide through it, I think we will just provide a common
> interface and put implementations in another repos, just as we did for Kafka
> Connect and Kafka Tired Storage.

This is true for the component that does CRUD operations on object storage. However, for the batch coordinator we would like to provide a decent out-of-the-box, self-contained (i.e. no external dependencies such as a database) implementation that would benefit the many Kafka users who don't have challenging scaling requirements. There's the sub-KIP, KIP-1164 [2], for this.

> 4. How to deal with KRaft related protocol, since metadata topic is managed
> differently with __cluster_metadata, through this KIP, will we align the gap
> between __cluster_metadata and data topics by put metadata in an object
> storage? if so, there will be no standby controller? since standby controller
> is the __cluster_metadata followers and there will be no followers.

The current plan is not to work directly with KRaft and __cluster_metadata. What we need from KRaft is three types of events: topic/partition creation, topic deletion, and topic configuration changes (with the possibility of limiting this set to topic deletion only).
We think it would be enough to have a "bridge" that watches for these events in __cluster_metadata and reflects them in the batch coordinator (basically, by sending requests to it).

Does this answer the question, or did I misunderstand it?

Best,
Ivan

[1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-1163%3A+Diskless+Core
[2] https://cwiki.apache.org/confluence/display/KAFKA/KIP-1164%3A+Topic+Based+Batch+Coordinator

On Fri, Apr 18, 2025, at 12:42, Ziming Deng wrote:
> Hi Josep,
>
> This would be a fascinating feature, some well known Kafka users are using
> Kafka in a cloud-native env. As for as I know, there are already some
> secondary development version Kafka which provide this feature, for example,
> I am using AutoMq(https://github.com/AutoMQ/automq) in my environment, which
> significantly helped ms reduced the cost, so I think it’s worthwhile to
> clarify some related details:
> 1. Is this feature available by just a minor adjust of config or it will
> intrude current code heavily, say, AutoMq is 100% compatible with Kafka and
> doesn’t intrude the code heavily
> 2. Though we are not discussing implement details, it’s worth giving some
> high-level architecture ideas, and it’s better to compare with AutoMq like
> systems.
> 3. What we will provide through it, I think we will just provide a common
> interface and put implementations in another repos, just as we did for Kafka
> Connect and Kafka Tired Storage.
> 4. How to deal with KRaft related protocol, since metadata topic is managed
> differently with __cluster_metadata, through this KIP, will we align the gap
> between __cluster_metadata and data topics by put metadata in an object
> storage? if so, there will be no standby controller? since standby controller
> is the __cluster_metadata followers and there will be no followers.
>
> —
> Ziming
>
> > On Apr 16, 2025, at 19:58, Josep Prat <josep.p...@aiven.io.INVALID> wrote:
> >
> > Hi Kafka Devs!
> >
> > We want to start a new KIP discussion about introducing a new type of
> > topics that would make use of Object Storage as the primary source of
> > storage. However, as this KIP is big we decided to split it into multiple
> > related KIPs.
> > We have the motivational KIP-1150 (
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics)
> > that aims to discuss if Apache Kafka should aim to have this type of
> > feature at all. This KIP doesn't go onto details on how to implement it.
> > This follows the same approach used when we discussed KRaft.
> >
> > But as we know that it is sometimes really hard to discuss on that meta
> > level, we also created several sub-kips (linked in KIP-1150) that offer an
> > implementation of this feature.
> >
> > We kindly ask you to use the proper DISCUSS threads for each type of
> > concern and keep this one to discuss whether Apache Kafka wants to have
> > this feature or not.
> >
> > Thanks in advance on behalf of all the authors of this KIP.
> >
> > ------------------
> > Josep Prat
> > Open Source Engineering Director, Aiven
> > josep.p...@aiven.io | +491715557497 | aiven.io
> > Aiven Deutschland GmbH
> > Alexanderufer 3-7, 10117 Berlin
> > Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> > Anna Richardson, Kenneth Chen
> > Amtsgericht Charlottenburg, HRB 209739 B
> >