[jira] [Reopened] (KAFKA-18092) TransactionsTest.testBumpTransactionalEpochWithTV2Enabled is flaky
[ https://issues.apache.org/jira/browse/KAFKA-18092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax reopened KAFKA-18092: - Just failed again (two times in a row) on [https://github.com/apache/kafka/pull/18402] > TransactionsTest.testBumpTransactionalEpochWithTV2Enabled is flaky > -- > > Key: KAFKA-18092 > URL: https://issues.apache.org/jira/browse/KAFKA-18092 > Project: Kafka > Issue Type: Bug >Reporter: Andrew Schofield >Assignee: Justine Olshan >Priority: Major > Fix For: 4.0.0 > > > [https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.tasks=test&search.timeZoneId=Europe%2FLondon&tests.container=kafka.api.TransactionsTest] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-18421) Test ConsumerProtocolMigrationTest.testDowngradeFromEmptyConsumerToClassicGroupWithDowngradePolicy failed
Matthias J. Sax created KAFKA-18421: --- Summary: Test ConsumerProtocolMigrationTest.testDowngradeFromEmptyConsumerToClassicGroupWithDowngradePolicy failed Key: KAFKA-18421 URL: https://issues.apache.org/jira/browse/KAFKA-18421 Project: Kafka Issue Type: Test Components: clients, consumer, unit tests Reporter: Matthias J. Sax Cf [https://github.com/apache/kafka/actions/runs/12636867829/job/3542253?pr=18402] {{FAILED ❌ ConsumerProtocolMigrationTest > testDowngradeFromEmptyConsumerToClassicGroupWithDowngradePolicy [1] Type=Raft-Isolated, MetadataVersion=4.0-IV3,BrokerSecurityProtocol=PLAINTEXT,BrokerListenerName=ListenerName(EXTERNAL),ControllerSecurityProtocol=PLAINTEXT,ControllerListenerName=ListenerName(CONTROLLER)}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [DISCUSS]KIP-1107: Adding record-level acks for producers
Hi Divij and Kirk, Thanks for your response. You are right, this change is not straightforward and I apologize for that. > we haven't answered the question about protocol for ProduceRequest raised above. Sorry but which question did I miss, this KIP has been modified from record-level to topic-level. > Note that there are disadvantages of "vertically scaling" a producer i.e. > reusing a producer with multiple threads. This change is optional so users can choose to adopt it. If they don't want to use this, it would not have any impact. > making producer(s) cheap to create is a goal worth pursuing. > I'd rather attack that in a more direct manner Thanks for your suggestion, I will investigate this approach simultaneously. Best, TaiJuWu Kirk True 於 2025年1月7日 週二 上午8:34寫道: > Hi TaiJu! > > I will echo the concerns about the likelihood of gotchas arising in an > effort to work around the existing API and protocol design. > > If the central concern is the performance impact and/or resource overhead > of multiple client instances, I'd rather attack that in a more direct > manner. > > Thanks, > Kirk > > On Fri, Jan 3, 2025, at 8:03 AM, Divij Vaidya wrote: > > Hey TaiJu > > > > I read the latest version of the KIP. > > > > I understand the problem you are trying to solve here. But the solution > > needs more changes than you proposed and hence, is not straightforward. > As > > an example, we haven't answered the question about protocol for > > ProduceRequest raised above. A `ProduceRequest` defines `ack` at a > request > > level where the payload consists of records belonging to multiple topics. > > One way to solve it is to define topic-level `ack` at the server as > > suggested above in this thread, but wouldn't that require us to > > remove/deprecate this field? > > > > Alternatively, have you tried to explore the option of decreasing the > > resource footprint of an idle producer so that it is not expensive to > > create 3x producers? > > Note that there are disadvantages of "vertically scaling" a producer i.e. > > reusing a producer with multiple threads. One of the many disadvantages > is > > that all requests from the producer will be handled by the same network > > thread on the broker. If that network thread is busy doing IO for some > > reason (perhaps reading from disk is slow), then it will impact all other > > requests from that producer. Hence, making producer(s) cheap to create > is a > > goal worth pursuing. > > > > -- > > Divij Vaidya > > > > > > > > On Fri, Jan 3, 2025 at 4:39 AM TaiJu Wu wrote: > > > > > Hello folk, > > > > > > This thread is pending for a long time, I want to bump this thread and > get > > > more feedback. > > > Any questions are welcome. > > > > > > Best, > > > TaiJuWu > > > > > > TaiJu Wu 於 2024年11月23日 週六 下午9:15寫道: > > > > > > > Hi Chia-Ping, > > > > > > > > Sorry for late reply and thanks for your feedback to make this KIP > more > > > > valuable. > > > > After initial verification, I think this can do without large > changes. > > > > > > > > I have updated KIP, thanks a lot. > > > > > > > > Best, > > > > TaiJuWu > > > > > > > > > > > > Chia-Ping Tsai 於 2024年11月20日 週三 下午5:06寫道: > > > > > > > >> hi TaiJuWu > > > >> > > > >> Is there a possibility to extend this KIP to include topic-level > > > >> compression for the producer? This is another issue that prevents us > > > from > > > >> sharing producers across different threads, as it's common to use > > > different > > > >> compression types for different topics (data). 
> > > >> > > > >> Best, > > > >> Chia-Ping > > > >> > > > >> On 2024/11/18 08:36:25 TaiJu Wu wrote: > > > >> > Hi Chia-Ping, > > > >> > > > > >> > Thanks for your suggestions and feedback. > > > >> > > > > >> > Q1: I have updated this according your suggestions. > > > >> > Q2: This is necessary change since there is a assumption about > > > >> > *RecourdAccumulator > > > >> > *that all records have same acks(e.g. ProducerConfig.acks) so we > need > > > >> to a > > > >> > method to distinguish which acks belong to each Batch. > > > >> > > > > >> > Best, > > > >> > TaiJuWu > > > >> > > > > >> > Chia-Ping Tsai 於 2024年11月18日 週一 上午2:17寫道: > > > >> > > > > >> > > hi TaiJuWu > > > >> > > > > > >> > > Q0: > > > >> > > > > > >> > > `Format: topic.acks` the dot is acceptable character in topic > > > >> naming, so > > > >> > > maybe we should reverse the format to "acks.${topic}" to get the > > > acks > > > >> of > > > >> > > topic easily > > > >> > > > > > >> > > Q1: `Return Map> when > > > >> > > RecordAccumulator#drainBatchesForOneNode is called.` > > > >> > > > > > >> > > this is weird to me, as all we need to do is pass `Map Acks> > > > >> to > > > >> > > `Sender` and make sure `Sender#sendProduceRequest` add correct > acks > > > to > > > >> > > ProduceRequest, right? > > > >> > > > > > >> > > Best, > > > >> > > Chia-Ping > > > >> > > > > > >> > > > > > >> > > > > > >> > > On 2024/11/15 05:12:33 TaiJu Wu wrote: > > > >> > > > Hi all, > > > >> > > > > > > >>
Re: [DISCUSS]KIP-1107: Adding record-level acks for producers
Hi TaiJu! I will echo the concerns about the likelihood of gotchas arising in an effort to work around the existing API and protocol design. If the central concern is the performance impact and/or resource overhead of multiple client instances, I'd rather attack that in a more direct manner. Thanks, Kirk On Fri, Jan 3, 2025, at 8:03 AM, Divij Vaidya wrote: > Hey TaiJu > > I read the latest version of the KIP. > > I understand the problem you are trying to solve here. But the solution > needs more changes than you proposed and hence, is not straightforward. As > an example, we haven't answered the question about protocol for > ProduceRequest raised above. A `ProduceRequest` defines `ack` at a request > level where the payload consists of records belonging to multiple topics. > One way to solve it is to define topic-level `ack` at the server as > suggested above in this thread, but wouldn't that require us to > remove/deprecate this field? > > Alternatively, have you tried to explore the option of decreasing the > resource footprint of an idle producer so that it is not expensive to > create 3x producers? > Note that there are disadvantages of "vertically scaling" a producer i.e. > reusing a producer with multiple threads. One of the many disadvantages is > that all requests from the producer will be handled by the same network > thread on the broker. If that network thread is busy doing IO for some > reason (perhaps reading from disk is slow), then it will impact all other > requests from that producer. Hence, making producer(s) cheap to create is a > goal worth pursuing. > > -- > Divij Vaidya > > > > On Fri, Jan 3, 2025 at 4:39 AM TaiJu Wu wrote: > > > Hello folk, > > > > This thread is pending for a long time, I want to bump this thread and get > > more feedback. > > Any questions are welcome. > > > > Best, > > TaiJuWu > > > > TaiJu Wu 於 2024年11月23日 週六 下午9:15寫道: > > > > > Hi Chia-Ping, > > > > > > Sorry for late reply and thanks for your feedback to make this KIP more > > > valuable. > > > After initial verification, I think this can do without large changes. > > > > > > I have updated KIP, thanks a lot. > > > > > > Best, > > > TaiJuWu > > > > > > > > > Chia-Ping Tsai 於 2024年11月20日 週三 下午5:06寫道: > > > > > >> hi TaiJuWu > > >> > > >> Is there a possibility to extend this KIP to include topic-level > > >> compression for the producer? This is another issue that prevents us > > from > > >> sharing producers across different threads, as it's common to use > > different > > >> compression types for different topics (data). > > >> > > >> Best, > > >> Chia-Ping > > >> > > >> On 2024/11/18 08:36:25 TaiJu Wu wrote: > > >> > Hi Chia-Ping, > > >> > > > >> > Thanks for your suggestions and feedback. > > >> > > > >> > Q1: I have updated this according your suggestions. > > >> > Q2: This is necessary change since there is a assumption about > > >> > *RecourdAccumulator > > >> > *that all records have same acks(e.g. ProducerConfig.acks) so we need > > >> to a > > >> > method to distinguish which acks belong to each Batch. 
> > >> > > > >> > Best, > > >> > TaiJuWu > > >> > > > >> > Chia-Ping Tsai 於 2024年11月18日 週一 上午2:17寫道: > > >> > > > >> > > hi TaiJuWu > > >> > > > > >> > > Q0: > > >> > > > > >> > > `Format: topic.acks` the dot is acceptable character in topic > > >> naming, so > > >> > > maybe we should reverse the format to "acks.${topic}" to get the > > acks > > >> of > > >> > > topic easily > > >> > > > > >> > > Q1: `Return Map> when > > >> > > RecordAccumulator#drainBatchesForOneNode is called.` > > >> > > > > >> > > this is weird to me, as all we need to do is pass `Map > > >> to > > >> > > `Sender` and make sure `Sender#sendProduceRequest` add correct acks > > to > > >> > > ProduceRequest, right? > > >> > > > > >> > > Best, > > >> > > Chia-Ping > > >> > > > > >> > > > > >> > > > > >> > > On 2024/11/15 05:12:33 TaiJu Wu wrote: > > >> > > > Hi all, > > >> > > > > > >> > > > I have updated the contents of this KIP > > >> > > > Please take a look and let me know what you think. > > >> > > > > > >> > > > Thanks, > > >> > > > TaiJuWu > > >> > > > > > >> > > > On Thu, Nov 14, 2024 at 2:21 PM TaiJu Wu > > >> wrote: > > >> > > > > > >> > > > > Hi all, > > >> > > > > > > >> > > > > Thanks for your feeback and @Chia-Ping's help. > > >> > > > > . > > >> > > > > I also agree topic-level acks config is more reasonable and it > > can > > >> > > simply > > >> > > > > the story. > > >> > > > > When I try implementing record-level acks, I notice I don't have > > >> good > > >> > > idea > > >> > > > > to avoid iterating batches for get partition information (need > > by > > >> > > > > *RecordAccumulator#partitionChanged*). > > >> > > > > > > >> > > > > Back to the init question how can I handle different acks for > > >> batches: > > >> > > > > First, we can attach *topic-level acks *to > > >> > > *RecordAccumulator#TopicInfo*. > > >> > > > > Second, we can return *Map>* when > > >> > > *RecordAccumulator#drainBat
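To make the topic-level acks discussion above concrete, here is a minimal sketch, assuming the "acks.${topic}" override format suggested by Chia-Ping and representing a "batch" simply by its topic name. The class and method names (AckGroupingExample, resolveAcks, groupBatchesByAcks) are hypothetical and are not part of the KIP or the Kafka producer API; the point is only that, because ProduceRequest carries acks at the request level, ready batches for one broker would have to be grouped by their effective acks and sent as one request per group.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    // Illustrative sketch only: these names are hypothetical and not part of the KIP
    // or the Kafka producer implementation.
    public class AckGroupingExample {

        // Resolve the acks value for a topic from a hypothetical "acks.${topic}" override,
        // falling back to the producer-level "acks" setting ("all" maps to -1 as usual).
        static short resolveAcks(Properties props, String topic) {
            String override = props.getProperty("acks." + topic);
            String value = override != null ? override : props.getProperty("acks", "all");
            return value.equals("all") ? (short) -1 : Short.parseShort(value);
        }

        // ProduceRequest has a single acks field, so batches headed to the same broker
        // are grouped by acks here; each group would become its own request.
        static Map<Short, List<String>> groupBatchesByAcks(Properties props, List<String> topicsWithReadyBatches) {
            Map<Short, List<String>> grouped = new HashMap<>();
            for (String topic : topicsWithReadyBatches) {
                grouped.computeIfAbsent(resolveAcks(props, topic), acks -> new ArrayList<>()).add(topic);
            }
            return grouped;
        }

        public static void main(String[] args) {
            Properties props = new Properties();
            props.setProperty("acks", "all");
            props.setProperty("acks.metrics-topic", "0");  // fire-and-forget for low-value data
            props.setProperty("acks.orders-topic", "-1");  // full durability for critical data

            // e.g. {0=[metrics-topic], -1=[orders-topic, logs]}
            System.out.println(groupBatchesByAcks(props, List.of("metrics-topic", "orders-topic", "logs")));
        }
    }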
[jira] [Resolved] (KAFKA-18374) Remove EncryptingPasswordEncoder
[ https://issues.apache.org/jira/browse/KAFKA-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai resolved KAFKA-18374. Resolution: Fixed trunk: https://github.com/apache/kafka/commit/23e77ed2d44b53deb968231345a0888f86f8ccb5 4.0: https://github.com/apache/kafka/commit/01dcba56616d9cf4056c48e21f051822e37bcc5d > Remove EncryptingPasswordEncoder > > > Key: KAFKA-18374 > URL: https://issues.apache.org/jira/browse/KAFKA-18374 > Project: Kafka > Issue Type: Sub-task >Reporter: Chia-Ping Tsai >Assignee: Mingdao Yang >Priority: Major > Fix For: 4.0.0 > > > It is no longer used -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-18419) Accept plugin.version configurations for transforms and predicates
Greg Harris created KAFKA-18419: --- Summary: Accept plugin.version configurations for transforms and predicates Key: KAFKA-18419 URL: https://issues.apache.org/jira/browse/KAFKA-18419 Project: Kafka Issue Type: New Feature Components: connect Reporter: Greg Harris Assignee: Snehashis Pal Fix For: 4.1.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-18388) test-kraft-server-start.sh should use log4j2.yaml
[ https://issues.apache.org/jira/browse/KAFKA-18388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai resolved KAFKA-18388. Resolution: Fixed trunk: https://github.com/apache/kafka/commit/a52aedd6ff9ade0230c0f41b473e8fbdfa2e0345 4.0: https://github.com/apache/kafka/commit/efdfa0184259a41e0c22359df36a169c8d97214d > test-kraft-server-start.sh should use log4j2.yaml > - > > Key: KAFKA-18388 > URL: https://issues.apache.org/jira/browse/KAFKA-18388 > Project: Kafka > Issue Type: Improvement >Reporter: Chia-Ping Tsai >Assignee: PoAn Yang >Priority: Blocker > Fix For: 4.0.0 > > > as title, and we should remove kraft-log4j.properties as well -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-18131) Improve logs for voters
[ https://issues.apache.org/jira/browse/KAFKA-18131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] TengYao Chi resolved KAFKA-18131. - Resolution: Fixed > Improve logs for voters > --- > > Key: KAFKA-18131 > URL: https://issues.apache.org/jira/browse/KAFKA-18131 > Project: Kafka > Issue Type: Improvement >Reporter: Luke Chen >Assignee: TengYao Chi >Priority: Major > Labels: newbie > Fix For: 4.0.0 > > > Saw logs like this, which have no info about "voters". > > _[2024-11-27 09:43:13,853] INFO [RaftManager id=2] Did not receive fetch > request from the majority of the voters within 3000ms. Current fetched voters > are [], and voters are java.util.stream.ReferencePipeline$3@39660237 > (org.apache.kafka.raft.LeaderState)_ > > The voters are important info when diagnosing issues. We should clearly log > them. -- This message was sent by Atlassian Jira (v8.20.10#820010)
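For reference, the "voters are java.util.stream.ReferencePipeline$3@39660237" in the log above is what you get when a java.util.stream.Stream is concatenated into a log message instead of its elements. A minimal sketch of the pitfall and the usual fix; the voterIds() helper below is hypothetical and merely stands in for however LeaderState exposes its voter set:

    import java.util.List;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    // Minimal sketch of the logging pitfall behind KAFKA-18131; voterIds() is a
    // hypothetical stand-in for however the voter set is exposed.
    public class VoterLoggingExample {

        static Stream<Integer> voterIds() {
            return Stream.of(1, 2, 3);
        }

        public static void main(String[] args) {
            // Before: interpolating the Stream itself prints its toString(), e.g.
            // "java.util.stream.ReferencePipeline$3@39660237".
            System.out.println("voters are " + voterIds());

            // After: materialize the stream so the actual voter ids show up in the log.
            List<Integer> voters = voterIds().collect(Collectors.toList());
            System.out.println("voters are " + voters);  // voters are [1, 2, 3]
        }
    }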
[jira] [Created] (KAFKA-18407) Remove ZkAdminManager
黃竣陽 created KAFKA-18407: --- Summary: Remove ZkAdminManager Key: KAFKA-18407 URL: https://issues.apache.org/jira/browse/KAFKA-18407 Project: Kafka Issue Type: Improvement Reporter: 黃竣陽 Assignee: 黃竣陽 Once we delete ZkSupport https://issues.apache.org/jira/browse/KAFKA-18399 , we will be able to remove ZkAdminManager -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-18417) Remove controlled.shutdown.max.retries and controlled.shutdown.retry.backoff.ms
Chia-Ping Tsai created KAFKA-18417: -- Summary: Remove controlled.shutdown.max.retries and controlled.shutdown.retry.backoff.ms Key: KAFKA-18417 URL: https://issues.apache.org/jira/browse/KAFKA-18417 Project: Kafka Issue Type: Sub-task Reporter: Chia-Ping Tsai Assignee: Chia-Ping Tsai -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-18418) Flaky test in KafkaStreamsTest::shouldThrowOnCleanupWhileShuttingDownStreamClosedWithCloseOptionLeaveGroupFalse
Ao Li created KAFKA-18418: - Summary: Flaky test in KafkaStreamsTest::shouldThrowOnCleanupWhileShuttingDownStreamClosedWithCloseOptionLeaveGroupFalse Key: KAFKA-18418 URL: https://issues.apache.org/jira/browse/KAFKA-18418 Project: Kafka Issue Type: Bug Reporter: Ao Li KafkaStreams does not synchronize with CloseThread after the shutdown thread starts at line https://github.com/apache/kafka/blob/c1163549081561cade03bbc6a29bfe6caad332a2/streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java#L1571 So it is possible for the shutdown helper to update the state of the KafkaStreams (https://github.com/apache/kafka/blob/c1163549081561cade03bbc6a29bfe6caad332a2/streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java#L1530) before `waitOnState` is called (https://github.com/apache/kafka/blob/c1163549081561cade03bbc6a29bfe6caad332a2/streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java#L1577). If this happens, `KafkaStreamsTest::shouldThrowOnCleanupWhileShuttingDownStreamClosedWithCloseOptionLeaveGroupFalse` will fail. Please check the code at https://github.com/aoli-al/kafka/tree/KAFKA-159, and run `./gradlew :streams:test --tests "org.apache.kafka.streams.KafkaStreamsTest.shouldThrowOnCleanupWhileShuttingDownStreamClosedWithCloseOptionLeaveGroupFalse"` to reproduce the failure. -- This message was sent by Atlassian Jira (v8.20.10#820010)
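A stripped-down sketch of the shape of the race described above, assuming nothing about the actual KafkaStreams internals beyond what the report states: close() hands the final state transition to a helper thread, so a concurrently invoked cleanUp() may already observe the terminal state and skip the exception the test asserts. All names below are illustrative, not the real KafkaStreams code.

    import java.util.concurrent.CountDownLatch;

    // Illustration of the race shape in KAFKA-18418; not the actual KafkaStreams code.
    public class ShutdownRaceExample {

        enum State { RUNNING, PENDING_SHUTDOWN, NOT_RUNNING }

        private volatile State state = State.RUNNING;

        void cleanUp() {
            // Mirrors the test's expectation: cleanUp() should fail while shutdown is in progress.
            if (state == State.PENDING_SHUTDOWN) {
                throw new IllegalStateException("Cannot clean up while shutting down");
            }
        }

        void close() throws InterruptedException {
            state = State.PENDING_SHUTDOWN;
            CountDownLatch done = new CountDownLatch(1);
            Thread shutdownHelper = new Thread(() -> {
                state = State.NOT_RUNNING;  // may run before other threads observe PENDING_SHUTDOWN
                done.countDown();
            });
            shutdownHelper.start();
            done.await();  // analogue of waitOnState(...)
        }

        public static void main(String[] args) throws InterruptedException {
            ShutdownRaceExample streams = new ShutdownRaceExample();
            Thread closer = new Thread(() -> {
                try {
                    streams.close();
                } catch (InterruptedException ignored) {
                }
            });
            closer.start();
            // Depending on scheduling, state may already be NOT_RUNNING here, so cleanUp()
            // silently succeeds and a test asserting the exception becomes flaky.
            try {
                streams.cleanUp();
                System.out.println("cleanUp() did not throw; state=" + streams.state);
            } catch (IllegalStateException e) {
                System.out.println("cleanUp() threw as expected");
            }
            closer.join();
        }
    }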
[jira] [Resolved] (KAFKA-18419) Accept plugin.version configurations for transforms and predicates
[ https://issues.apache.org/jira/browse/KAFKA-18419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Harris resolved KAFKA-18419. - Resolution: Fixed > Accept plugin.version configurations for transforms and predicates > -- > > Key: KAFKA-18419 > URL: https://issues.apache.org/jira/browse/KAFKA-18419 > Project: Kafka > Issue Type: New Feature > Components: connect >Reporter: Greg Harris >Assignee: Snehashis Pal >Priority: Major > Fix For: 4.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-18420) Find out the license which is in the license file but is not in distribution
kangning.li created KAFKA-18420: --- Summary: Find out the license which is in the license file but is not in distribution Key: KAFKA-18420 URL: https://issues.apache.org/jira/browse/KAFKA-18420 Project: Kafka Issue Type: Improvement Reporter: kangning.li Assignee: kangning.li see discussion: https://github.com/apache/kafka/pull/18299#discussion_r1904604076 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-17539) Implement registerMetricsForSubscription
[ https://issues.apache.org/jira/browse/KAFKA-17539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Schofield resolved KAFKA-17539. -- Resolution: Fixed > Implement registerMetricsForSubscription > > > Key: KAFKA-17539 > URL: https://issues.apache.org/jira/browse/KAFKA-17539 > Project: Kafka > Issue Type: Sub-task >Reporter: Andrew Schofield >Assignee: Andrew Schofield >Priority: Minor > Fix For: 4.1.0 > > > Add the equivalent of KIP-1076 to KIP-932 and implement the new methods. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (KAFKA-18036) TransactionsWithTieredStoreTest testReadCommittedConsumerShouldNotSeeUndecidedData is flaky
[ https://issues.apache.org/jira/browse/KAFKA-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu-Lin Chen reopened KAFKA-18036: - Reopen this Jira as the issue still exists in trunk and can be reproduced locally within 20 loops. I'm working on troubleshooting it. > TransactionsWithTieredStoreTest > testReadCommittedConsumerShouldNotSeeUndecidedData is flaky > --- > > Key: KAFKA-18036 > URL: https://issues.apache.org/jira/browse/KAFKA-18036 > Project: Kafka > Issue Type: Test >Reporter: David Arthur >Assignee: Chia-Ping Tsai >Priority: Major > Labels: flaky-test > > https://ge.apache.org/scans/tests?search.names=CI%20workflow%2CGit%20repository&search.rootProjectNames=kafka&search.tags=github%2Ctrunk&search.tasks=test&search.timeZoneId=America%2FNew_York&search.values=CI%2Chttps:%2F%2Fgithub.com%2Fapache%2Fkafka&tests.container=org.apache.kafka.tiered.storage.integration.TransactionsWithTieredStoreTest&tests.sortField=FLAKY&tests.test=testReadCommittedConsumerShouldNotSeeUndecidedData(String%2C%20String)%5B2%5D -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-18415) Flaky ApplicationEventHandlerTest testRecordApplicationEventQueueSize
Lianet Magrans created KAFKA-18415: -- Summary: Flaky ApplicationEventHandlerTest testRecordApplicationEventQueueSize Key: KAFKA-18415 URL: https://issues.apache.org/jira/browse/KAFKA-18415 Project: Kafka Issue Type: Test Components: consumer Affects Versions: 4.0.0 Reporter: Lianet Magrans Assignee: Lianet Magrans Fails with org.opentest4j.AssertionFailedError: expected: <1.0> but was: <0.0> [https://github.com/apache/kafka/actions/runs/12600961241/job/35121534600?pr=17099] Looks like a race condition. I guess this could easily happen if the background thread removes the event from the queue (recording the queue-size metric as 0) before the app thread records the queue size as 1 (which happens after adding the event to the queue). -- This message was sent by Atlassian Jira (v8.20.10#820010)
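A toy reproduction of the suspected interleaving, with made-up names rather than the actual consumer internals or metrics plumbing: the background thread drains the event first, so the queue size ends up recorded as 0.0 instead of the expected 1.0, matching the assertion failure above.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Toy reproduction of the interleaving suspected in KAFKA-18415; names are made up
    // and do not correspond to the actual consumer internals.
    public class QueueSizeMetricRaceExample {

        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> queue = new LinkedBlockingQueue<>();
            double[] lastRecordedQueueSize = {Double.NaN};

            // Application thread, step 1: enqueue the event.
            queue.add("application-event");

            // Background thread runs in between: drains the event and records size 0.
            Thread background = new Thread(() -> {
                queue.poll();
                lastRecordedQueueSize[0] = queue.size();  // 0.0
            });
            background.start();
            background.join();

            // Application thread, step 2: only now records the queue size metric.
            lastRecordedQueueSize[0] = queue.size();      // still 0.0, not the expected 1.0

            // Matches the reported failure "expected: <1.0> but was: <0.0>".
            System.out.println("recorded queue size = " + lastRecordedQueueSize[0]);
        }
    }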
Re: [DISCUSS] KIP-1101: Trigger rebalance on rack topology changes
Hi PoAn, Thanks for the update. I haven't read the updated KIP yet. DJ02: I am not sure about using Guava as a dependency. I mentioned it more as an inspiration/reference. I suppose that we could use it on the server but we should definitely not use it on the client. I am not sure how others feel about it. Best, David On Mon, Jan 6, 2025 at 5:21 AM PoAn Yang wrote: > Hi Chia-Ping / David / Lucas, > > Happy new year and thanks for the review. > > DJ02: Thanks for the suggestion. I updated the PR to use Guava. > > DJ03: Yes, I updated the description to mention ISR change, > add altering partition reassignment case, and mention that > non-related topic change doesn’t trigger a rebalance. > DJ03.1: Yes, I will keep using ModernGroup#requestMetadataRefresh > to notify group. > > DJ06: Updated the PR to use Guava Hashing#combineUnordered > function to combine topic hash. > > DJ07: Renamed it to MetadataHash. > > DJ08: Added a sample hash function to the KIP and use first byte as magic > byte. This is also included in latest PR. > > DJ09: Added two paragraphs about upgraded and downgraded. > > DJ10: According to Lucas’s comment, I add StreamsGroupMetadataValue update > to this KIP. > > Thanks, > PoAn > > > > On Dec 20, 2024, at 3:58 PM, Chia-Ping Tsai wrote: > > > >> because assignors are sticky. > > > > I forgot about that spec again :( > > > > > > > > > > David Jacot 於 2024年12月20日 週五 下午3:41寫道: > > > >> Hi Chia-Ping, > >> > >> DJ08: In my opinion, changing the format will be rare so it is > >> acceptable if rebalances are triggered in this case on > >> upgrade/downgrade. It is also what will happen if a cluster is > >> downgraded from 4.1 (with this KIP) to 4.0. The rebalance won't change > >> anything if the topology of the group is the same because assignors > >> are sticky. The default ones are and we recommend custom ones to also > >> be. > >> > >> Best, > >> David > >> > >> On Fri, Dec 20, 2024 at 2:11 AM Chia-Ping Tsai > >> wrote: > >>> > >>> ummm, it does not work for downgrade as the old coordinator has no idea > >> about new format :( > >>> > >>> > >>> On 2024/12/20 00:57:27 Chia-Ping Tsai wrote: > hi David > > > DJ08: > > That's a good question. If the "hash" lacks version control, it could > >> trigger a series of unnecessary rebalances. However, adding additional > >> information ("magic") to the hash does not help the upgraded coordinator > >> determine the "version." This means that the upgraded coordinator would > >> still trigger unnecessary rebalances because it has no way to know which > >> format to use when comparing the hash. > > Perhaps we can add a new field to ConsumerGroupMetadataValue to > >> indicate the version of the "hash." This would allow the coordinator, > when > >> handling subscription metadata, to compute the old hash and determine > >> whether an epoch bump is necessary. Additionally, the coordinator can > >> generate a new record to upgrade the hash without requiring an epoch > bump. > > Another issue is whether the coordinator should cache all versions of > >> the hash. I believe this is necessary; otherwise, during an upgrade, > there > >> would be extensive recomputing of old hashes. > > I believe this idea should also work for downgrades, and that's just > >> my two cents. > > Best, > Chia-Ping > > > On 2024/12/19 14:39:41 David Jacot wrote: > > Hi PoAn and Chia-Ping, > > > > Thanks for your responses. > > > > DJ02: Sorry, I was not clear. I was wondering whether we could > >> compute the > > hash without having to convert to bytes before. 
Guava has a nice > >> interface > > for this allowing to incrementally add primitive types to the hash. > >> We can > > discuss this in the PR as it is an implementation detail. > > > > DJ03: Thanks. I don't think that the replicas are updated when a > >> broker > > shuts down. What you said applies to the ISR. I suppose that we can > >> rely on > > the ISR changes to trigger updates. It is also worth noting > > that TopicsDelta#changedTopics is updated for every change (e.g. ISR > > change, leader change, replicas change, etc.). I suppose that it is > >> OK but > > it seems that it will trigger refreshes which are not necessary. > >> However, a > > rebalance won't be triggered because the hash won't change. > > DJ03.1: I suppose that we will continue to rely on > > ModernGroup#requestMetadataRefresh to notify groups that must > >> refresh their > > hashes. Is my understanding correct? > > > > DJ05: Fair enough. > > > > DJ06: You mention in two places that you would like to combine > >> hashes by > > additioning them. I wonder if this is a good practice. Intuitively, > >> I would > > have used XOR or hashed the hashed. Guava has a method for combining > > hashes. It may be worth looking into the algorithm used. > > > > DJ07: I would
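A sketch of how the per-topic hashes discussed in this thread could be combined with Guava, assuming murmur3_128 and a leading magic byte for format versioning; the inputs fed into each topic hash below (name, partition count) are placeholders, and whether Guava is acceptable as a dependency is still open per DJ02.

    import com.google.common.hash.HashCode;
    import com.google.common.hash.HashFunction;
    import com.google.common.hash.Hasher;
    import com.google.common.hash.Hashing;

    import java.nio.charset.StandardCharsets;
    import java.util.List;

    // Sketch only: the real set of hashed fields (rack info, etc.) is defined by KIP-1101,
    // not by this example.
    public class TopicHashExample {

        private static final byte MAGIC = 0;  // version byte so the hash format can evolve

        static HashCode topicHash(String topicName, int numPartitions) {
            HashFunction fn = Hashing.murmur3_128();
            Hasher hasher = fn.newHasher()
                    .putByte(MAGIC)
                    .putString(topicName, StandardCharsets.UTF_8)
                    .putInt(numPartitions);
            return hasher.hash();
        }

        public static void main(String[] args) {
            // combineUnordered makes the group-level value independent of topic order,
            // avoiding hand-rolled addition/XOR of hashes.
            List<HashCode> perTopic = List.of(
                    topicHash("orders", 12),
                    topicHash("payments", 6));
            HashCode metadataHash = Hashing.combineUnordered(perTopic);
            System.out.println(metadataHash.asLong());
        }
    }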
[jira] [Created] (KAFKA-18406) Remove ZkBrokerEpochManager
黃竣陽 created KAFKA-18406: --- Summary: Remove ZkBrokerEpochManager Key: KAFKA-18406 URL: https://issues.apache.org/jira/browse/KAFKA-18406 Project: Kafka Issue Type: Improvement Reporter: 黃竣陽 Assignee: 黃竣陽 Once we delete ZkSupport https://issues.apache.org/jira/browse/KAFKA-18399 , we will be able to remove ZkBrokerEpochManager -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-18405) Remove ZooKeeper logic from DynamicBrokerConfig
Mickael Maison created KAFKA-18405: -- Summary: Remove ZooKeeper logic from DynamicBrokerConfig Key: KAFKA-18405 URL: https://issues.apache.org/jira/browse/KAFKA-18405 Project: Kafka Issue Type: Sub-task Reporter: Mickael Maison -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-18412) Remove EmbeddedZookeeper
TengYao Chi created KAFKA-18412: --- Summary: Remove EmbeddedZookeeper Key: KAFKA-18412 URL: https://issues.apache.org/jira/browse/KAFKA-18412 Project: Kafka Issue Type: Improvement Reporter: TengYao Chi Assignee: TengYao Chi Fix For: 4.0.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-18413) Remove AdminZkClient
TengYao Chi created KAFKA-18413: --- Summary: Remove AdminZkClient Key: KAFKA-18413 URL: https://issues.apache.org/jira/browse/KAFKA-18413 Project: Kafka Issue Type: Improvement Reporter: TengYao Chi Assignee: TengYao Chi Fix For: 4.0.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-17616) Remove KafkaServer
[ https://issues.apache.org/jira/browse/KAFKA-17616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mickael Maison resolved KAFKA-17616. Fix Version/s: 4.0.0 Resolution: Fixed > Remove KafkaServer > -- > > Key: KAFKA-17616 > URL: https://issues.apache.org/jira/browse/KAFKA-17616 > Project: Kafka > Issue Type: Sub-task >Reporter: Colin McCabe >Assignee: Mickael Maison >Priority: Major > Fix For: 4.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-18414) Remove KRaftRegistrationResult
TengYao Chi created KAFKA-18414: --- Summary: Remove KRaftRegistrationResult Key: KAFKA-18414 URL: https://issues.apache.org/jira/browse/KAFKA-18414 Project: Kafka Issue Type: Improvement Reporter: TengYao Chi Assignee: TengYao Chi Fix For: 4.0.0 This trait is actually unused since we are removing ZK code. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-18411) Remove ZkProducerIdManager
黃竣陽 created KAFKA-18411: --- Summary: Remove ZkProducerIdManager Key: KAFKA-18411 URL: https://issues.apache.org/jira/browse/KAFKA-18411 Project: Kafka Issue Type: Improvement Reporter: 黃竣陽 Assignee: 黃竣陽 as we remove KafkaServer, we also can remove ZkProducerIdManager -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-18307) Flaky test report includes disabled or removed tests.
[ https://issues.apache.org/jira/browse/KAFKA-18307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Arthur resolved KAFKA-18307. -- Resolution: Fixed > Flaky test report includes disabled or removed tests. > - > > Key: KAFKA-18307 > URL: https://issues.apache.org/jira/browse/KAFKA-18307 > Project: Kafka > Issue Type: Sub-task >Reporter: David Arthur >Priority: Major > > I noticed in this report > [https://github.com/apache/kafka/actions/runs/12398964575] that two of the > problematic tests have been removed or disabled on trunk. Following their > links to Develocity shows that there is actually no recent data. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-18416) Ensure we capture all flaky/failing tests in report
David Arthur created KAFKA-18416: Summary: Ensure we capture all flaky/failing tests in report Key: KAFKA-18416 URL: https://issues.apache.org/jira/browse/KAFKA-18416 Project: Kafka Issue Type: Sub-task Reporter: David Arthur I noticed in our Develocity report that we are missing a catch-all section for failing/flaky tests. We currently have these sections: * Quarantined tests which are continuing to fail * Tests that recently began failing (regressions) * Quarantined tests which have started to pass We are missing a section for failing/flaky tests that are not recent. For example, looking at [https://ge.apache.org/scans/tests?search.names=Git%20Repository&search.rootProjectNames=kafka&search.tags=github%2Ctrunk&search.tasks=test&search.timeZoneId=America%2FNew_York&search.values=https:%2F%2Fgithub.com%2Fapache%2Fkafka&tests.sortField=FLAKY#] we have org.apache.kafka.clients.consumer.internals.ApplicationEventHandlerTest which is quite flaky but is missing from the report. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-18422) add Kafka client upgrade path
Chia-Ping Tsai created KAFKA-18422: -- Summary: add Kafka client upgrade path Key: KAFKA-18422 URL: https://issues.apache.org/jira/browse/KAFKA-18422 Project: Kafka Issue Type: Improvement Reporter: Chia-Ping Tsai Assignee: Kuan Po Tseng Fix For: 4.0.0 https://github.com/apache/kafka/pull/18193#issuecomment-2572283545 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[DISCUSS] Proposed KIP: Event-Driven State Store Cleanup through Changelog Deletion Notifications
Hi All,

I'd like to propose a KIP that transforms state store cleanup from a time-driven to an event-driven approach by introducing changelog deletion notifications.

Problem: Currently, state stores have no way to know when records are deleted from their changelog topics due to retention. This leads to:
- Resource-intensive periodic scans
- Blind cleanup operations
- Inefficient resource utilization

We face this at significant scale:
- State stores with 25B+ records
- Daily ingestion of 50M records per store
- Retention periods from 2 days to 5 years

Proposed Solution: Introduce a notification mechanism when records are deleted from changelog topics, enabling:
- Event-driven cleanup instead of time-based scans
- Targeted deletion of specific records
- Better resource utilization

Would love to get the community's thoughts on:
1. Viability of this approach
2. Implementation considerations (callbacks vs alternatives)
3. Potential impact on broker performance

If there's interest, I can share a more detailed technical design. Looking forward to your feedback.

Best regards,
Thulasiram V
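To make point 2 ("callbacks vs alternatives") concrete, here is a purely hypothetical sketch of what a changelog-deletion callback could look like; neither the listener interface nor the offset-to-key bookkeeping below exists in Kafka Streams or in this proposal, it is only an illustration of event-driven cleanup.

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch of the "callback" option; none of these types exist in Kafka Streams.
    public class ChangelogDeletionCallbackSketch {

        /** Invoked when records in a changelog topic partition are deleted by retention. */
        interface ChangelogDeletionListener {
            void onRecordsDeleted(String changelogTopic, int partition, long newLogStartOffset);
        }

        // A state store could react by deleting only the keys whose changelog entries are
        // gone, instead of periodically scanning the whole store.
        static class CleanupOnNotification implements ChangelogDeletionListener {
            private final Map<Long, String> offsetToKey;  // changelog offset -> store key

            CleanupOnNotification(Map<Long, String> offsetToKey) {
                this.offsetToKey = offsetToKey;
            }

            @Override
            public void onRecordsDeleted(String changelogTopic, int partition, long newLogStartOffset) {
                offsetToKey.entrySet().removeIf(e -> e.getKey() < newLogStartOffset);
                System.out.printf("Dropped store entries below offset %d for %s-%d%n",
                        newLogStartOffset, changelogTopic, partition);
            }
        }

        public static void main(String[] args) {
            ChangelogDeletionListener listener =
                    new CleanupOnNotification(new HashMap<>(Map.of(10L, "key-a", 20L, "key-b")));
            listener.onRecordsDeleted("app-store-changelog", 0, 15L);
        }
    }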
Re: [VOTE] KIP-1098: Reverse Checkpointing in MirrorMaker
Hey everyone, Trying to bump once more, maybe someone will notice :) TIA Daniel Dániel Urbán ezt írta (időpont: 2024. dec. 17., K, 18:26): > Hi everyone, > Bumping in hope for some votes - consider checking this, small KIP with > some useful improvements. > TIA > Daniel > > Dániel Urbán ezt írta (időpont: 2024. dec. 13., > P, 9:22): > >> Bumping - please consider voting on this KIP. >> TIA >> Daniel >> >> Dániel Urbán ezt írta (időpont: 2024. dec. 9., >> H, 9:06): >> >>> Gentle bump - please consider checking the KIP and voting. >>> Daniel >>> >>> Dániel Urbán ezt írta (időpont: 2024. dec. 5., >>> Cs, 12:08): >>> Bumping this vote - the change has a relatively small footprint, but fills a sizable gap in MM2. Please consider checking the KIP and chiming in. TIA Daniel Viktor Somogyi-Vass ezt írta (időpont: 2024. dec. 2., H, 10:40): > +1 (binding) > > Thanks for the KIP Daniel! > > Viktor > > On Mon, Dec 2, 2024 at 10:36 AM Vidor Kanalas > > wrote: > > > Hi, thanks for the KIP! > > +1 (non-binding) > > > > Best, > > Vidor > > On Mon, Dec 2, 2024 at 10:15 AM Dániel Urbán > > wrote: > > > > > Hi everyone, > > > > > > I'd like to start the vote on KIP-1098: Reverse Checkpointing in > > > MirrorMaker ( > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1098%3A+Reverse+Checkpointing+in+MirrorMaker > > > ). > > > > > > TIA, > > > Daniel > > > > > >
[jira] [Created] (KAFKA-18408) tweak the 'tag' field for BrokerHeartbeatRequest.json, BrokerRegistrationChangeRecord.json and RegisterBrokerRecord.json
Chia-Ping Tsai created KAFKA-18408: -- Summary: tweak the 'tag' field for BrokerHeartbeatRequest.json, BrokerRegistrationChangeRecord.json and RegisterBrokerRecord.json Key: KAFKA-18408 URL: https://issues.apache.org/jira/browse/KAFKA-18408 Project: Kafka Issue Type: Improvement Reporter: Chia-Ping Tsai Assignee: Chia-Ping Tsai "tag": "0" -> "tag": 0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] MINOR: Fix newlines not working in LISTENERS_DOC for 3.8 and 3.9 docs [kafka-site]
clarkwtc commented on PR #658: URL: https://github.com/apache/kafka-site/pull/658#issuecomment-2573118367 Preview 3.9: [screenshot] Preview 3.8: [screenshot] -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (KAFKA-18409) ShareGroupStateMessageFormatter should use ApiMessageFormatter
David Jacot created KAFKA-18409: --- Summary: ShareGroupStateMessageFormatter should use ApiMessageFormatter Key: KAFKA-18409 URL: https://issues.apache.org/jira/browse/KAFKA-18409 Project: Kafka Issue Type: Improvement Reporter: David Jacot ShareGroupStateMessageFormatter should extend ApiMessageFormatter in order to have a consistent handling of records of coordinators. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-18410) Should GroupMetadataMessageFormatter print new records too?
David Jacot created KAFKA-18410: --- Summary: Should GroupMetadataMessageFormatter print new records too? Key: KAFKA-18410 URL: https://issues.apache.org/jira/browse/KAFKA-18410 Project: Kafka Issue Type: Improvement Reporter: David Jacot At the moment, GroupMetadataMessageFormatter only prints out the metadata record of the classic groups. It seems that we should extend it to also print out the metadata of other groups (e.g. consumer, share, stream, etc.). Thoughts? -- This message was sent by Atlassian Jira (v8.20.10#820010)