Re: [DISCUSS] KIP-1150 Diskless Topics
Hi Luke and all!

I'll be participating in this discussion from the authors' side together with Josep and some other colleagues.

> 2. "Write through to object storage, avoiding local disk usage"
> While this title and the goal said no local disk usage, I'd like to make
> sure is it really zero local disk needed?

You're right, this needs clarification. First of all, when we speak about disk, we mean broker disk. Data will be stored in the object store, and most likely there is some form of disk underneath, but that storage has different economics and performance characteristics (taking advantage of them is the main focus of the KIP).

1. For reading, writing, and storing the data itself, broker disk is not used. There are also no index files and the like.
2. Where metadata is stored depends on the batch coordinator implementation, which is supposed to be pluggable. However, the reference implementation we propose in KIP-1164 uses normal Kafka topics, so some broker disk will be used for metadata.
3. There's also caching for the read path, which may optionally use disk instead of memory.

So, strictly speaking, it's not zero disk. But even though some disk is used, we still call the whole approach diskless because the amount stored on broker disks is a tiny fraction of the total amount of user data it supports.

Does this make sense to you?

Best,
Ivan

On Thu, Apr 17, 2025, at 14:11, Luke Chen wrote:
> Hi Josep, > > Thanks for the KIP! > Quite exciting to see this feature brought into Apache Kafka > > Comments: > 1. "Permit multi-region active-active topics with automatic failover" > I didn't see any future work mentioning this. Does it mean, with diskless > topic MVP, this will work by default? > > 2. "Write through to object storage, avoiding local disk usage" > While this title and the goal said no local disk usage, I'd like to make > sure is it really zero local disk needed? > We might need to clarify it in the KIP. > > Thank you. 
> Luke > > On Wed, Apr 16, 2025 at 7:58 PM Josep Prat > wrote: > > > Hi Kafka Devs! > > > > We want to start a new KIP discussion about introducing a new type of > > topics that would make use of Object Storage as the primary source of > > storage. However, as this KIP is big we decided to split it into multiple > > related KIPs. > > We have the motivational KIP-1150 ( > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics > > ) > > that aims to discuss if Apache Kafka should aim to have this type of > > feature at all. This KIP doesn't go onto details on how to implement it. > > This follows the same approach used when we discussed KRaft. > > > > But as we know that it is sometimes really hard to discuss on that meta > > level, we also created several sub-kips (linked in KIP-1150) that offer an > > implementation of this feature. > > > > We kindly ask you to use the proper DISCUSS threads for each type of > > concern and keep this one to discuss whether Apache Kafka wants to have > > this feature or not. > > > > Thanks in advance on behalf of all the authors of this KIP. > > > > -- > > Josep Prat > > Open Source Engineering Director, Aiven > > josep.p...@aiven.io | +491715557497 | aiven.io > > Aiven Deutschland GmbH > > Alexanderufer 3-7, 10117 Berlin > > Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen, > > Anna Richardson, Kenneth Chen > > Amtsgericht Charlottenburg, HRB 209739 B > > >
Re: [VOTE] 3.9.1 RC0
Hi David,

Thanks for the quick fix! I will test this patch.

Sincerely,
TengYao

On Sun, Apr 20, 2025 at 3:14 AM David Arthur wrote:
> TengYao, I have a patch here that hopefully fixes the tag issue. It's > pretty hard to test without actually doing a release, so I was hoping you > could test it for me :) > > https://github.com/apache/kafka/pull/19518 > > -David > > On Wed, Apr 16, 2025 at 9:07 PM Matthias J. Sax wrote: > > > Thanks for cutting the first RC. > > > > I just want to point to the email on the dev list with subject: `Kafka > > tags on repo`. > > > > It seems, our release scripts have some issue. David Arthur is actively > > investigation: https://issues.apache.org/jira/browse/KAFKA-19166 > > > > Apparently something is not pushed correctly, cf > > > > > https://github.com/apache/kafka/commit/3cb4749195b9bfcfa64f6b5855a019dfbd173ed4 > > which say > > > > > This commit does not belong to any branch on this repository, and may > > belong to a fork outside of the repository. > > > > While the RC can be verified, we should hold off to moving forward with > > the release until the root cause of this issue was identified and fixed. > > > > > > -Matthias > > > > > > > > On 4/16/25 2:00 AM, TengYao Chi wrote: > > > Hello Kafka users, developers, and client-developers, > > > > > > This is the first candidate for the release of Apache Kafka 3.9.1. 
> > > > > > This is a bug-fix release with several fixes, and most importantly, it > > adds > > > Java 23 support for 3.9 > > > > > > Release notes for the 3.9.1 release: > > > > > > https://dist.apache.org/repos/dist/dev/kafka/3.9.1-rc0/RELEASE_NOTES.html > > > > > > Please download, test, and vote by *Wednesday, April 23, 9:00 AM PT* > > > > > > Kafka's KEYS file containing PGP keys we use to sign the release: > > > https://kafka.apache.org/KEYS > > > > > > * Release artifacts to be voted upon (source and binary): > > > https://dist.apache.org/repos/dist/dev/kafka/3.9.1-rc0/ > > > > > > * Docker release artifacts to be voted upon: > > > apache/kafka:3.9.1-rc0 > > > > > > https://hub.docker.com/layers/apache/kafka/3.9.1-rc0/images/sha256-3f7d1298c0ff2cdfac2a65e36c2515ca9288d89c64e07138bc61843b648e > > > apache/kafka-native:3.9.1-rc0 > > > > > > https://hub.docker.com/layers/apache/kafka-native/3.9.1-rc0/images/sha256-2a7f7f178d862b7f0b41b007732edb1c5882f2f6062c47dda48f90a10e1bd6fa > > > > > > * Maven artifacts to be voted upon: > > > https://repository.apache.org/content/groups/staging/org/apache/kafka/ > > > > > > * Javadoc: > > > https://dist.apache.org/repos/dist/dev/kafka/3.9.1-rc0/javadoc/ > > > > > > * Tag to be voted upon (off 3.9 branch) is the 3.9.1 tag: > > > https://github.com/apache/kafka/releases/tag/3.9.1-rc0 > > > > > > * Documentation: > > > https://kafka.apache.org/39/documentation.html > > > > > > * Protocol: > > > https://kafka.apache.org/39/protocol.html > > > > > > *CI builds for the 3.9 branch: > > > Unit/integration tests (There are some flaky tests): > > > https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.9/191/ > > > > > > System tests: In progress. Will provide it once it is finished. 
> > > > > > * Successful Docker Image Github Actions Pipeline for 3.9 branch: > > > Docker Build Test Pipeline (JVM): > > > https://github.com/apache/kafka/actions/runs/14485509197 > > > Docker Build Test Pipeline (Native): > > > https://github.com/apache/kafka/actions/runs/14485512365 > > > > > > > > > Thanks, > > > TengYao Chi > > > > > > > > > -- > David Arthur >
Re: KIP-1141: Simplifying MetadataQuorumCommand by Leveraging Admin API for Controller Management
Hi José,

Apologies for the delayed response. Do you still have any questions regarding the `Admin#describeConfigs` part? As Chia-Ping mentioned, we are able to retrieve all broker configurations through that method:
https://github.com/apache/kafka/blob/da46cf6e79afbbed1da2bae831e0f70992e85f9b/core/src/main/scala/kafka/server/ConfigHelper.scala#L121-L123

Please feel free to reach out if you have any further questions or need clarification. Thank you again for your valuable feedback!

Best,
Kuan-Po Tseng

On 2025/04/03 16:04:36 José Armando García Sancio wrote:
> Hi Chia, > > On Thu, Apr 3, 2025 at 10:24 AM Chia-Ping Tsai wrote: > > We propose to use `Admin#describeConfigs` to get the configs for specific > > controller if the bootstrap.controllers is configured. This approach is > > similar to what `MetadataQuorumCommand` does, and the difference is > > `MetadataQuorumCommand` read those configs from local file and this KIP > > gets those configs by `Admin#describeConfigs` > > I am not sure. I have to look at that code but doesn't > "Admin#describeConfigs" only return dynamic configuration for the > controller? Most users configure the controller using the server > properties file. My current understanding is that values coming from > the properties file won't show up in Admin#describeConfigs. > > Thanks, > -- > -José >
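Editorial sketch: the open question in this thread is whether a describe-configs call reports only dynamic overrides or also values that come from the server properties file. The toy model below is not Kafka code. It only illustrates the idea that each effective value carries a source, so static entries can be reported alongside dynamic ones. The source labels mirror names from Kafka's `ConfigEntry.ConfigSource` enum; the function `describe_configs`, the precedence order encoded here (dynamic over static over default), and all sample values are assumptions made for illustration.

```python
from typing import Dict, Tuple

# Precedence assumed for this sketch: dynamic override, then static file, then default.
SOURCES = ("DYNAMIC_BROKER_CONFIG", "STATIC_BROKER_CONFIG", "DEFAULT_CONFIG")


def describe_configs(
    dynamic: Dict[str, str],
    static: Dict[str, str],
    defaults: Dict[str, str],
) -> Dict[str, Tuple[str, str]]:
    """Return {name: (effective_value, source)} for every known config name."""
    layers = dict(zip(SOURCES, (dynamic, static, defaults)))
    result: Dict[str, Tuple[str, str]] = {}
    for name in sorted(set(dynamic) | set(static) | set(defaults)):
        for source in SOURCES:  # the first layer that defines the name wins
            if name in layers[source]:
                result[name] = (layers[source][name], source)
                break
    return result


# Hypothetical values, not taken from a real cluster.
described = describe_configs(
    dynamic={"log.retention.ms": "3600000"},
    static={"controller.quorum.voters": "1@host1:9093", "log.retention.ms": "604800000"},
    defaults={"num.io.threads": "8"},
)
for name, (value, source) in described.items():
    print(f"{name} = {value} ({source})")
```

In this model a value set only in the properties file is still returned, tagged as static, which is the behaviour Kuan-Po and Chia-Ping describe for the real method.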
Re: [DISCUSS] KIP-1150 Diskless Topics
Hi Ziming,

> 1. Is this feature available by just a minor adjust of config or it will
> intrude current code heavily, say, AutoMq is 100% compatible with Kafka and
> doesn’t intrude the code heavily

As far as the part visible to the user is concerned, we expect:
1. Minimal changes to the client code (with a potential fallback that needs no changes at all for older clients).
2. A limited set of new configurations for brokers and topics.
Otherwise, this should be a perfectly normal Apache Kafka.

> 2. Though we are not discussing implement details, it’s worth giving some
> high-level architecture ideas, and it’s better to compare with AutoMq like
> systems.

There's quite a bit of high-level architecture in the sub-KIP KIP-1163 [1]. We didn't do a comparison with AutoMQ (to the best of our knowledge, they take a fairly different approach), but if it helps the community get the idea, then sure, we should do this.

> 3. What we will provide through it, I think we will just provide a common
> interface and put implementations in another repos, just as we did for Kafka
> Connect and Kafka Tired Storage.

This is true for the component that does CRUD operations on object storage. However, for the batch coordinator we would like to provide a decent out-of-the-box, self-contained (i.e. no external dependencies such as a database) implementation that would benefit the many Kafka users who don't have challenging scaling requirements. The sub-KIP KIP-1164 [2] covers this.

> 4. How to deal with KRaft related protocol, since metadata topic is managed
> differently with __cluster_metadata, through this KIP, will we align the gap
> between __cluster_metadata and data topics by put metadata in an object
> storage? if so, there will be no standby controller? since standby controller
> is the __cluster_metadata followers and there will be no followers.

The current plan is not to work directly with KRaft and __cluster_metadata.
What we need from KRaft is three types of events: topic/partition creation, topic deletion, and topic configuration changes (with the possibility of limiting this set to topic deletion only). We think it would be enough to have a "bridge" that watches for these events in __cluster_metadata and reflects them in the batch coordinator (basically, by sending requests).

Does this answer the question, or did I misunderstand it?

Best,
Ivan

[1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-1163%3A+Diskless+Core
[2] https://cwiki.apache.org/confluence/display/KAFKA/KIP-1164%3A+Topic+Based+Batch+Coordinator

On Fri, Apr 18, 2025, at 12:42, Ziming Deng wrote:
> Hi Josep, > > This would be a fascinating feature, some well known Kafka users are using > Kafka in a cloud-native env. As for as I know, there are already some > secondary development version Kafka which provide this feature, for example, > I am using AutoMq(https://github.com/AutoMQ/automq) in my environment, which > significantly helped ms reduced the cost, so I think it’s worthwhile to > clarify some related details: > 1. Is this feature available by just a minor adjust of config or it will > intrude current code heavily, say, AutoMq is 100% compatible with Kafka and > doesn’t intrude the code heavily > 2. Though we are not discussing implement details, it’s worth giving some > high-level architecture ideas, and it’s better to compare with AutoMq like > systems. > 3. What we will provide through it, I think we will just provide a common > interface and put implementations in another repos, just as we did for Kafka > Connect and Kafka Tired Storage. > 4. How to deal with KRaft related protocol, since metadata topic is managed > differently with __cluster_metadata, through this KIP, will we align the gap > between __cluster_metadata and data topics by put metadata in an object > storage? if so, there will be no standby controller? 
since standby controller > is the __cluster_metadata followers and there will be no followers. > > — > Ziming > > > On Apr 16, 2025, at 19:58, Josep Prat wrote: > > > > Hi Kafka Devs! > > > > We want to start a new KIP discussion about introducing a new type of > > topics that would make use of Object Storage as the primary source of > > storage. However, as this KIP is big we decided to split it into multiple > > related KIPs. > > We have the motivational KIP-1150 ( > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics) > > that aims to discuss if Apache Kafka should aim to have this type of > > feature at all. This KIP doesn't go onto details on how to implement it. > > This follows the same approach used when we discussed KRaft. > > > > But as we know that it is sometimes really hard to discuss on that meta > > level, we also created several sub-kips (linked in KIP-1150) that offer an > > implementation of this feature. > > > > We kindly ask you to use the proper DISCUSS threads for each type of > > concern and keep this one to discuss whether Apache Kafka wants to have
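Editorial sketch: Ivan's reply in this thread describes a "bridge" that watches __cluster_metadata for topic/partition creation, topic deletion, and topic configuration changes, and reflects them in the batch coordinator by sending requests. The snippet below is a minimal, hypothetical model of that filtering step only. The class `MetadataBridge`, the event type names, and the `send_request` callback are all invented for illustration and do not come from KIP-1150 or its sub-KIPs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

# Hypothetical labels for the three kinds of events named in the reply.
TOPIC_CREATED = "topic_created"
TOPIC_DELETED = "topic_deleted"
TOPIC_CONFIG_CHANGED = "topic_config_changed"
RELEVANT = frozenset({TOPIC_CREATED, TOPIC_DELETED, TOPIC_CONFIG_CHANGED})


@dataclass
class MetadataEvent:
    kind: str
    topic: str
    payload: Dict[str, str] = field(default_factory=dict)


class MetadataBridge:
    """Forwards relevant metadata events to a callback that stands in for coordinator requests."""

    def __init__(
        self,
        send_request: Callable[[MetadataEvent], None],
        kinds: Iterable[str] = RELEVANT,
    ) -> None:
        self._send_request = send_request
        self._kinds = frozenset(kinds)

    def on_events(self, events: Iterable[MetadataEvent]) -> int:
        """Forward matching events in order and return how many were forwarded."""
        forwarded = 0
        for event in events:
            if event.kind in self._kinds:
                self._send_request(event)
                forwarded += 1
        return forwarded


# A made-up slice of metadata records, including one the bridge should ignore.
log = [
    MetadataEvent(TOPIC_CREATED, "orders", {"partitions": "3"}),
    MetadataEvent("broker_registered", ""),
    MetadataEvent(TOPIC_CONFIG_CHANGED, "orders", {"retention.ms": "1000"}),
    MetadataEvent(TOPIC_DELETED, "orders"),
]

received: List[MetadataEvent] = []
forwarded_count = MetadataBridge(received.append).on_events(log)

# The reply mentions the set could be limited to topic deletion only.
deletions: List[MetadataEvent] = []
deleted_count = MetadataBridge(deletions.append, kinds={TOPIC_DELETED}).on_events(log)
print(forwarded_count, deleted_count)
```

Keeping the set of watched event kinds as a constructor parameter makes the "deletion only" variant a configuration choice rather than a separate code path.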
Re: [VOTE] 3.9.1 RC0
TengYao, I have a patch here that hopefully fixes the tag issue. It's pretty hard to test without actually doing a release, so I was hoping you could test it for me :) https://github.com/apache/kafka/pull/19518 -David On Wed, Apr 16, 2025 at 9:07 PM Matthias J. Sax wrote: > Thanks for cutting the first RC. > > I just want to point to the email on the dev list with subject: `Kafka > tags on repo`. > > It seems, our release scripts have some issue. David Arthur is actively > investigation: https://issues.apache.org/jira/browse/KAFKA-19166 > > Apparently something is not pushed correctly, cf > > https://github.com/apache/kafka/commit/3cb4749195b9bfcfa64f6b5855a019dfbd173ed4 > which say > > > This commit does not belong to any branch on this repository, and may > belong to a fork outside of the repository. > > While the RC can be verified, we should hold off to moving forward with > the release until the root cause of this issue was identified and fixed. > > > -Matthias > > > > On 4/16/25 2:00 AM, TengYao Chi wrote: > > Hello Kafka users, developers, and client-developers, > > > > This is the first candidate for the release of Apache Kafka 3.9.1. 
> > > > This is a bug-fix release with several fixes, and most importantly, it > adds > > Java 23 support for 3.9 > > > > Release notes for the 3.9.1 release: > > > https://dist.apache.org/repos/dist/dev/kafka/3.9.1-rc0/RELEASE_NOTES.html > > > > Please download, test, and vote by *Wednesday, April 23, 9:00 AM PT* > > > > Kafka's KEYS file containing PGP keys we use to sign the release: > > https://kafka.apache.org/KEYS > > > > * Release artifacts to be voted upon (source and binary): > > https://dist.apache.org/repos/dist/dev/kafka/3.9.1-rc0/ > > > > * Docker release artifacts to be voted upon: > > apache/kafka:3.9.1-rc0 > > > https://hub.docker.com/layers/apache/kafka/3.9.1-rc0/images/sha256-3f7d1298c0ff2cdfac2a65e36c2515ca9288d89c64e07138bc61843b648e > > apache/kafka-native:3.9.1-rc0 > > > https://hub.docker.com/layers/apache/kafka-native/3.9.1-rc0/images/sha256-2a7f7f178d862b7f0b41b007732edb1c5882f2f6062c47dda48f90a10e1bd6fa > > > > * Maven artifacts to be voted upon: > > https://repository.apache.org/content/groups/staging/org/apache/kafka/ > > > > * Javadoc: > > https://dist.apache.org/repos/dist/dev/kafka/3.9.1-rc0/javadoc/ > > > > * Tag to be voted upon (off 3.9 branch) is the 3.9.1 tag: > > https://github.com/apache/kafka/releases/tag/3.9.1-rc0 > > > > * Documentation: > > https://kafka.apache.org/39/documentation.html > > > > * Protocol: > > https://kafka.apache.org/39/protocol.html > > > > *CI builds for the 3.9 branch: > > Unit/integration tests (There are some flaky tests): > > https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.9/191/ > > > > System tests: In progress. Will provide it once it is finished. > > > > * Successful Docker Image Github Actions Pipeline for 3.9 branch: > > Docker Build Test Pipeline (JVM): > > https://github.com/apache/kafka/actions/runs/14485509197 > > Docker Build Test Pipeline (Native): > > https://github.com/apache/kafka/actions/runs/14485512365 > > > > > > Thanks, > > TengYao Chi > > > > -- David Arthur