[jira] [Reopened] (KAFKA-18092) TransactionsTest.testBumpTransactionalEpochWithTV2Enabled is flaky

2025-01-06 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax reopened KAFKA-18092:
-

Just failed again (two times in a row) on 
[https://github.com/apache/kafka/pull/18402] 

> TransactionsTest.testBumpTransactionalEpochWithTV2Enabled is flaky
> --
>
> Key: KAFKA-18092
> URL: https://issues.apache.org/jira/browse/KAFKA-18092
> Project: Kafka
>  Issue Type: Bug
>Reporter: Andrew Schofield
>Assignee: Justine Olshan
>Priority: Major
> Fix For: 4.0.0
>
>
> [https://ge.apache.org/scans/tests?search.rootProjectNames=kafka&search.tasks=test&search.timeZoneId=Europe%2FLondon&tests.container=kafka.api.TransactionsTest]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18421) Test ConsumerProtocolMigrationTest.testDowngradeFromEmptyConsumerToClassicGroupWithDowngradePolicy failed

2025-01-06 Thread Matthias J. Sax (Jira)
Matthias J. Sax created KAFKA-18421:
---

 Summary: Test 
ConsumerProtocolMigrationTest.testDowngradeFromEmptyConsumerToClassicGroupWithDowngradePolicy
 failed
 Key: KAFKA-18421
 URL: https://issues.apache.org/jira/browse/KAFKA-18421
 Project: Kafka
  Issue Type: Test
  Components: clients, consumer, unit tests
Reporter: Matthias J. Sax


Cf 
[https://github.com/apache/kafka/actions/runs/12636867829/job/3542253?pr=18402]
 

{{FAILED ❌ ConsumerProtocolMigrationTest > 
testDowngradeFromEmptyConsumerToClassicGroupWithDowngradePolicy [1] 
Type=Raft-Isolated, 
MetadataVersion=4.0-IV3,BrokerSecurityProtocol=PLAINTEXT,BrokerListenerName=ListenerName(EXTERNAL),ControllerSecurityProtocol=PLAINTEXT,ControllerListenerName=ListenerName(CONTROLLER)}}




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2025-01-06 Thread TaiJu Wu
Hi Divij and Kirk,

Thanks for your response.
You are right, this change is not straightforward and I apologize for that.

> we haven't answered the question about protocol for ProduceRequest raised
above.
Sorry, but which question did I miss? This KIP has already been modified from
record-level to topic-level acks.

> Note that there are disadvantages of "vertically scaling" a producer i.e.
> reusing a producer with multiple threads.
This change is optional, so users can choose whether to adopt it. If they don't
want to use it, there is no impact on them.

> making producer(s) cheap to create is a goal worth pursuing.
> I'd rather attack that in a more direct manner
Thanks for your suggestion; I will investigate this approach in parallel.

Best,
TaiJuWu

Kirk True wrote on Tue, Jan 7, 2025 at 8:34 AM:

> Hi TaiJu!
>
> I will echo the concerns about the likelihood of gotchas arising in an
> effort to work around the existing API and protocol design.
>
> If the central concern is the performance impact and/or resource overhead
> of multiple client instances, I'd rather attack that in a more direct
> manner.
>
> Thanks,
> Kirk
>
> On Fri, Jan 3, 2025, at 8:03 AM, Divij Vaidya wrote:
> > Hey TaiJu
> >
> > I read the latest version of the KIP.
> >
> > I understand the problem you are trying to solve here. But the solution
> > needs more changes than you proposed and hence, is not straightforward.
> As
> > an example, we haven't answered the question about protocol for
> > ProduceRequest raised above. A `ProduceRequest` defines `ack` at a
> request
> > level where the payload consists of records belonging to multiple topics.
> > One way to solve it is to define topic-level `ack` at the server as
> > suggested above in this thread, but wouldn't that require us to
> > remove/deprecate this field?
> >
> > Alternatively, have you tried to explore the option of decreasing the
> > resource footprint of an idle producer so that it is not expensive to
> > create 3x producers?
> > Note that there are disadvantages of "vertically scaling" a producer i.e.
> > reusing a producer with multiple threads. One of the many disadvantages
> is
> > that all requests from the producer will be handled by the same network
> > thread on the broker. If that network thread is busy doing IO for some
> > reason (perhaps reading from disk is slow), then it will impact all other
> > requests from that producer. Hence, making producer(s) cheap to create
> is a
> > goal worth pursuing.
> >
> > --
> > Divij Vaidya
> >
> >
> >
> > On Fri, Jan 3, 2025 at 4:39 AM TaiJu Wu  wrote:
> >
> > > Hello folks,
> > >
> > > This thread has been pending for a long time; I want to bump it and get
> > > more feedback.
> > > Any questions are welcome.
> > >
> > > Best,
> > > TaiJuWu
> > >
> > > TaiJu Wu wrote on Sat, Nov 23, 2024 at 9:15 PM:
> > >
> > > > Hi Chia-Ping,
> > > >
> > > > Sorry for the late reply, and thanks for your feedback to make this KIP
> > > > more valuable.
> > > > After initial verification, I think this can be done without large
> > > > changes.
> > > >
> > > > I have updated the KIP, thanks a lot.
> > > >
> > > > Best,
> > > > TaiJuWu
> > > >
> > > >
> > > > Chia-Ping Tsai wrote on Wed, Nov 20, 2024 at 5:06 PM:
> > > >
> > > >> hi TaiJuWu
> > > >>
> > > >> Is there a possibility to extend this KIP to include topic-level
> > > >> compression for the producer? This is another issue that prevents us from
> > > >> sharing producers across different threads, as it's common to use different
> > > >> compression types for different topics (data).
> > > >>
> > > >> Best,
> > > >> Chia-Ping
> > > >>
> > > >> On 2024/11/18 08:36:25 TaiJu Wu wrote:
> > > >> > Hi Chia-Ping,
> > > >> >
> > > >> > Thanks for your suggestions and feedback.
> > > >> >
> > > >> > Q1: I have updated this according to your suggestions.
> > > >> > Q2: This is a necessary change, since there is an assumption in
> > > >> > *RecordAccumulator* that all records have the same acks (e.g.
> > > >> > ProducerConfig.acks), so we need a method to distinguish which acks
> > > >> > belong to each batch.
> > > >> >
> > > >> > Best,
> > > >> > TaiJuWu
> > > >> >
> > > >> > Chia-Ping Tsai wrote on Mon, Nov 18, 2024 at 2:17 AM:
> > > >> >
> > > >> > > hi TaiJuWu
> > > >> > >
> > > >> > > Q0:
> > > >> > >
> > > >> > > `Format: topic.acks`: the dot is an acceptable character in topic
> > > >> > > names, so maybe we should reverse the format to "acks.${topic}" to make
> > > >> > > it easy to get the acks of a topic.
> > > >> > >
> > > >> > > Q1: `Return Map> when
> > > >> > > RecordAccumulator#drainBatchesForOneNode is called.`
> > > >> > >
> > > >> > > This is weird to me, as all we need to do is pass `Map Acks> to
> > > >> > > `Sender` and make sure `Sender#sendProduceRequest` adds the correct acks
> > > >> > > to the ProduceRequest, right?
> > > >> > >
> > > >> > > Best,
> > > >> > > Chia-Ping
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > On 2024/11/15 05:12:33 TaiJu Wu wrote:
> > > >> > > > Hi all,
> > > >> > > >
> > > >>
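
To make the topic-level acks idea discussed above concrete, here is a minimal sketch of how a producer might be configured, assuming the "acks.${topic}" key format suggested in this thread. None of the per-topic keys below exist in any released client; they are purely illustrative.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TopicLevelAcksSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Existing producer-wide setting, used for topics without an override.
        props.put("acks", "all");
        // Hypothetical per-topic overrides in the "acks.${topic}" format.
        props.put("acks.audit-log", "all"); // durability-sensitive topic
        props.put("acks.metrics", "0");     // fire-and-forget telemetry topic

        // A single shared producer could then serve both workloads instead of 3x producers.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("audit-log", "user-1", "login"));
            producer.send(new ProducerRecord<>("metrics", "cpu", "0.42"));
        }
    }
}

As Divij points out above, the open protocol question is how such per-topic settings would map onto ProduceRequest, which today carries a single acks value per request.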

Re: [DISCUSS]KIP-1107: Adding record-level acks for producers

2025-01-06 Thread Kirk True
Hi TaiJu!

I will echo the concerns about the likelihood of gotchas arising in an effort 
to work around the existing API and protocol design.

If the central concern is the performance impact and/or resource overhead of 
multiple client instances, I'd rather attack that in a more direct manner.

Thanks,
Kirk

On Fri, Jan 3, 2025, at 8:03 AM, Divij Vaidya wrote:
> Hey TaiJu
> 
> I read the latest version of the KIP.
> 
> I understand the problem you are trying to solve here. But the solution
> needs more changes than you proposed and hence, is not straightforward. As
> an example, we haven't answered the question about protocol for
> ProduceRequest raised above. A `ProduceRequest` defines `ack` at a request
> level where the payload consists of records belonging to multiple topics.
> One way to solve it is to define topic-level `ack` at the server as
> suggested above in this thread, but wouldn't that require us to
> remove/deprecate this field?
> 
> Alternatively, have you tried to explore the option of decreasing the
> resource footprint of an idle producer so that it is not expensive to
> create 3x producers?
> Note that there are disadvantages of "vertically scaling" a producer i.e.
> reusing a producer with multiple threads. One of the many disadvantages is
> that all requests from the producer will be handled by the same network
> thread on the broker. If that network thread is busy doing IO for some
> reason (perhaps reading from disk is slow), then it will impact all other
> requests from that producer. Hence, making producer(s) cheap to create is a
> goal worth pursuing.
> 
> --
> Divij Vaidya
> 
> 
> 
> On Fri, Jan 3, 2025 at 4:39 AM TaiJu Wu  wrote:
> 
> > Hello folks,
> >
> > This thread has been pending for a long time; I want to bump it and get
> > more feedback.
> > Any questions are welcome.
> >
> > Best,
> > TaiJuWu
> >
> > TaiJu Wu wrote on Sat, Nov 23, 2024 at 9:15 PM:
> >
> > > Hi Chia-Ping,
> > >
> > > Sorry for the late reply, and thanks for your feedback to make this KIP more
> > > valuable.
> > > After initial verification, I think this can be done without large changes.
> > >
> > > I have updated the KIP, thanks a lot.
> > >
> > > Best,
> > > TaiJuWu
> > >
> > >
> > > Chia-Ping Tsai wrote on Wed, Nov 20, 2024 at 5:06 PM:
> > >
> > >> hi TaiJuWu
> > >>
> > >> Is there a possibility to extend this KIP to include topic-level
> > >> compression for the producer? This is another issue that prevents us from
> > >> sharing producers across different threads, as it's common to use different
> > >> compression types for different topics (data).
> > >>
> > >> Best,
> > >> Chia-Ping
> > >>
> > >> On 2024/11/18 08:36:25 TaiJu Wu wrote:
> > >> > Hi Chia-Ping,
> > >> >
> > >> > Thanks for your suggestions and feedback.
> > >> >
> > >> > Q1: I have updated this according to your suggestions.
> > >> > Q2: This is a necessary change, since there is an assumption in
> > >> > *RecordAccumulator* that all records have the same acks (e.g.
> > >> > ProducerConfig.acks), so we need a method to distinguish which acks
> > >> > belong to each batch.
> > >> >
> > >> > Best,
> > >> > TaiJuWu
> > >> >
> > >> > Chia-Ping Tsai wrote on Mon, Nov 18, 2024 at 2:17 AM:
> > >> >
> > >> > > hi TaiJuWu
> > >> > >
> > >> > > Q0:
> > >> > >
> > >> > > `Format: topic.acks`: the dot is an acceptable character in topic
> > >> > > names, so maybe we should reverse the format to "acks.${topic}" to make
> > >> > > it easy to get the acks of a topic.
> > >> > >
> > >> > > Q1: `Return Map> when
> > >> > > RecordAccumulator#drainBatchesForOneNode is called.`
> > >> > >
> > >> > > This is weird to me, as all we need to do is pass `Map Acks> to
> > >> > > `Sender` and make sure `Sender#sendProduceRequest` adds the correct acks
> > >> > > to the ProduceRequest, right?
> > >> > >
> > >> > > Best,
> > >> > > Chia-Ping
> > >> > >
> > >> > >
> > >> > >
> > >> > > On 2024/11/15 05:12:33 TaiJu Wu wrote:
> > >> > > > Hi all,
> > >> > > >
> > >> > > > I have updated the contents of this KIP
> > >> > > > Please take a look and let me know what you think.
> > >> > > >
> > >> > > > Thanks,
> > >> > > > TaiJuWu
> > >> > > >
> > >> > > > On Thu, Nov 14, 2024 at 2:21 PM TaiJu Wu 
> > >> wrote:
> > >> > > >
> > >> > > > > Hi all,
> > >> > > > >
> > >> > > > > Thanks for your feedback and @Chia-Ping's help.
> > >> > > > > I also agree that a topic-level acks config is more reasonable and it
> > >> > > > > can simplify the story.
> > >> > > > > When I tried implementing record-level acks, I noticed I didn't have a
> > >> > > > > good idea of how to avoid iterating over batches to get the partition
> > >> > > > > information (needed by *RecordAccumulator#partitionChanged*).
> > >> > > > >
> > >> > > > > Back to the initial question of how to handle different acks for
> > >> > > > > batches:
> > >> > > > > First, we can attach the *topic-level acks* to
> > >> > > > > *RecordAccumulator#TopicInfo*.
> > >> > > > > Second, we can return *Map>* when
> > >> > > *RecordAccumulator#drainBat

[jira] [Resolved] (KAFKA-18374) Remove EncryptingPasswordEncoder

2025-01-06 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-18374.

Resolution: Fixed

trunk: 
https://github.com/apache/kafka/commit/23e77ed2d44b53deb968231345a0888f86f8ccb5

4.0: 
https://github.com/apache/kafka/commit/01dcba56616d9cf4056c48e21f051822e37bcc5d

> Remove EncryptingPasswordEncoder
> 
>
> Key: KAFKA-18374
> URL: https://issues.apache.org/jira/browse/KAFKA-18374
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: Mingdao Yang
>Priority: Major
> Fix For: 4.0.0
>
>
> It is no longer used



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18419) Accept plugin.version configurations for transforms and predicates

2025-01-06 Thread Greg Harris (Jira)
Greg Harris created KAFKA-18419:
---

 Summary: Accept plugin.version configurations for transforms and 
predicates
 Key: KAFKA-18419
 URL: https://issues.apache.org/jira/browse/KAFKA-18419
 Project: Kafka
  Issue Type: New Feature
  Components: connect
Reporter: Greg Harris
Assignee: Snehashis Pal
 Fix For: 4.1.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-18388) test-kraft-server-start.sh should use log4j2.yaml

2025-01-06 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-18388.

Resolution: Fixed

trunk: 
https://github.com/apache/kafka/commit/a52aedd6ff9ade0230c0f41b473e8fbdfa2e0345

4.0: 
https://github.com/apache/kafka/commit/efdfa0184259a41e0c22359df36a169c8d97214d

> test-kraft-server-start.sh should use log4j2.yaml
> -
>
> Key: KAFKA-18388
> URL: https://issues.apache.org/jira/browse/KAFKA-18388
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: PoAn Yang
>Priority: Blocker
> Fix For: 4.0.0
>
>
> as title, and we should remove kraft-log4j.properties as well



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-18131) Improve logs for voters

2025-01-06 Thread TengYao Chi (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

TengYao Chi resolved KAFKA-18131.
-
Resolution: Fixed

> Improve logs for voters
> ---
>
> Key: KAFKA-18131
> URL: https://issues.apache.org/jira/browse/KAFKA-18131
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Luke Chen
>Assignee: TengYao Chi
>Priority: Major
>  Labels: newbie
> Fix For: 4.0.0
>
>
> Saw logs like this, which have no info about the "voters".
>  
> _[2024-11-27 09:43:13,853] INFO [RaftManager id=2] Did not receive fetch 
> request from the majority of the voters within 3000ms. Current fetched voters 
> are [], and voters are java.util.stream.ReferencePipeline$3@39660237 
> (org.apache.kafka.raft.LeaderState)_
>  
> The voters are important info when diagnosing issues. We should clearly log 
> them.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18407) Remove ZkAdminManager

2025-01-06 Thread Jira
黃竣陽 created KAFKA-18407:
---

 Summary: Remove ZkAdminManager
 Key: KAFKA-18407
 URL: https://issues.apache.org/jira/browse/KAFKA-18407
 Project: Kafka
  Issue Type: Improvement
Reporter: 黃竣陽
Assignee: 黃竣陽


Once we delete ZkSupport https://issues.apache.org/jira/browse/KAFKA-18399 , we 
will be able to remove ZkAdminManager 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18417) Remove controlled.shutdown.max.retries and controlled.shutdown.retry.backoff.ms

2025-01-06 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18417:
--

 Summary: Remove controlled.shutdown.max.retries and 
controlled.shutdown.retry.backoff.ms
 Key: KAFKA-18417
 URL: https://issues.apache.org/jira/browse/KAFKA-18417
 Project: Kafka
  Issue Type: Sub-task
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18418) Flaky test in KafkaStreamsTest::shouldThrowOnCleanupWhileShuttingDownStreamClosedWithCloseOptionLeaveGroupFalse

2025-01-06 Thread Ao Li (Jira)
Ao Li created KAFKA-18418:
-

 Summary: Flaky test in 
KafkaStreamsTest::shouldThrowOnCleanupWhileShuttingDownStreamClosedWithCloseOptionLeaveGroupFalse
 Key: KAFKA-18418
 URL: https://issues.apache.org/jira/browse/KAFKA-18418
 Project: Kafka
  Issue Type: Bug
Reporter: Ao Li


KafkaStreams does not synchronize with CloseThread after the shutdown thread starts 
at line 
https://github.com/apache/kafka/blob/c1163549081561cade03bbc6a29bfe6caad332a2/streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java#L1571

So it is possible for the shutdown helper to update the state of the KafkaStreams instance 
(https://github.com/apache/kafka/blob/c1163549081561cade03bbc6a29bfe6caad332a2/streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java#L1530)
 before `waitOnState` is called 
(https://github.com/apache/kafka/blob/c1163549081561cade03bbc6a29bfe6caad332a2/streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java#L1577).

If this happens, 
`KafkaStreamsTest::shouldThrowOnCleanupWhileShuttingDownStreamClosedWithCloseOptionLeaveGroupFalse`
 will fail. 

Please check code https://github.com/aoli-al/kafka/tree/KAFKA-159, and run 
`./gradlew :streams:test --tests 
"org.apache.kafka.streams.KafkaStreamsTest.shouldThrowOnCleanupWhileShuttingDownStreamClosedWithCloseOptionLeaveGroupFalse"`
 to reproduce the failure.
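
For readers less familiar with this kind of race, here is a minimal, self-contained illustration of the hazard described above; this is not KafkaStreams code, and all names are made up. Once the shutdown thread is started, it may advance the shared state before the caller reaches its own wait/check, so a test asserting on the intermediate state is timing-dependent.

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class ShutdownStateRace {
    enum State { RUNNING, PENDING_SHUTDOWN, NOT_RUNNING }

    public static void main(String[] args) throws InterruptedException {
        AtomicReference<State> state = new AtomicReference<>(State.RUNNING);
        CountDownLatch done = new CountDownLatch(1);

        // Analogous to the shutdown helper: it may run to completion immediately.
        Thread shutdownHelper = new Thread(() -> {
            state.set(State.PENDING_SHUTDOWN);
            state.set(State.NOT_RUNNING);
            done.countDown();
        });
        shutdownHelper.start();

        // Analogous to the caller that only starts checking/waiting afterwards:
        // depending on scheduling it observes PENDING_SHUTDOWN or already NOT_RUNNING.
        System.out.println("observed state: " + state.get());

        done.await();
        shutdownHelper.join();
    }
}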





--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-18419) Accept plugin.version configurations for transforms and predicates

2025-01-06 Thread Greg Harris (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris resolved KAFKA-18419.
-
Resolution: Fixed

> Accept plugin.version configurations for transforms and predicates
> --
>
> Key: KAFKA-18419
> URL: https://issues.apache.org/jira/browse/KAFKA-18419
> Project: Kafka
>  Issue Type: New Feature
>  Components: connect
>Reporter: Greg Harris
>Assignee: Snehashis Pal
>Priority: Major
> Fix For: 4.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18420) Find out the license which is in the license file but is not in distribution

2025-01-06 Thread kangning.li (Jira)
kangning.li created KAFKA-18420:
---

 Summary: Find out the license which is in the license file but is 
not in distribution
 Key: KAFKA-18420
 URL: https://issues.apache.org/jira/browse/KAFKA-18420
 Project: Kafka
  Issue Type: Improvement
Reporter: kangning.li
Assignee: kangning.li


see discussion:   
https://github.com/apache/kafka/pull/18299#discussion_r1904604076



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-17539) Implement registerMetricsForSubscription

2025-01-06 Thread Andrew Schofield (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Schofield resolved KAFKA-17539.
--
Resolution: Fixed

> Implement registerMetricsForSubscription
> 
>
> Key: KAFKA-17539
> URL: https://issues.apache.org/jira/browse/KAFKA-17539
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Andrew Schofield
>Assignee: Andrew Schofield
>Priority: Minor
> Fix For: 4.1.0
>
>
> Add the equivalent of KIP-1076 to KIP-932 and implement the new methods.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (KAFKA-18036) TransactionsWithTieredStoreTest testReadCommittedConsumerShouldNotSeeUndecidedData is flaky

2025-01-06 Thread Yu-Lin Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu-Lin Chen reopened KAFKA-18036:
-

Reopening this Jira, as the issue still exists on trunk and can be reproduced 
locally within 20 loops. I'm working on troubleshooting it.

> TransactionsWithTieredStoreTest 
> testReadCommittedConsumerShouldNotSeeUndecidedData is flaky
> ---
>
> Key: KAFKA-18036
> URL: https://issues.apache.org/jira/browse/KAFKA-18036
> Project: Kafka
>  Issue Type: Test
>Reporter: David Arthur
>Assignee: Chia-Ping Tsai
>Priority: Major
>  Labels: flaky-test
>
> https://ge.apache.org/scans/tests?search.names=CI%20workflow%2CGit%20repository&search.rootProjectNames=kafka&search.tags=github%2Ctrunk&search.tasks=test&search.timeZoneId=America%2FNew_York&search.values=CI%2Chttps:%2F%2Fgithub.com%2Fapache%2Fkafka&tests.container=org.apache.kafka.tiered.storage.integration.TransactionsWithTieredStoreTest&tests.sortField=FLAKY&tests.test=testReadCommittedConsumerShouldNotSeeUndecidedData(String%2C%20String)%5B2%5D



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18415) Flaky ApplicationEventHandlerTest testRecordApplicationEventQueueSize

2025-01-06 Thread Lianet Magrans (Jira)
Lianet Magrans created KAFKA-18415:
--

 Summary: Flaky ApplicationEventHandlerTest 
testRecordApplicationEventQueueSize
 Key: KAFKA-18415
 URL: https://issues.apache.org/jira/browse/KAFKA-18415
 Project: Kafka
  Issue Type: Test
  Components: consumer
Affects Versions: 4.0.0
Reporter: Lianet Magrans
Assignee: Lianet Magrans


Fails with 

org.opentest4j.AssertionFailedError: expected: <1.0> but was: <0.0>

[https://github.com/apache/kafka/actions/runs/12600961241/job/35121534600?pr=17099]

Looks like a race condition. I guess this could easily happen if the background 
thread removes the event from the queue (recording the queue-size metric as 0) 
before the app thread records the queue size as 1 (which happens after adding 
the event to the queue).
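
A minimal, hypothetical illustration of that interleaving follows; none of this is the actual ApplicationEventHandler code. The value the app thread records depends on whether the background thread has already drained the queue.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueSizeMetricRace {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Object> queue = new LinkedBlockingQueue<>();

        // Background thread: takes the event off the queue as soon as it appears.
        Thread background = new Thread(() -> {
            try {
                queue.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        background.start();

        queue.put(new Object());                 // app thread enqueues the event...
        double recordedQueueSize = queue.size(); // ...then records the queue-size metric

        background.join();
        // If the background thread won the race, this prints 0.0 instead of the
        // expected 1.0, the same shape as the flaky assertion above.
        System.out.println(recordedQueueSize);
    }
}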



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-1101: Trigger rebalance on rack topology changes

2025-01-06 Thread David Jacot
Hi PoAn,

Thanks for the update. I haven't read the updated KIP yet.

DJ02: I am not sure about using Guava as a dependency. I mentioned it more
as an inspiration/reference. I suppose that we could use it on the server
but we should definitely not use it on the client. I am not sure how others
feel about it.

Best,
David

On Mon, Jan 6, 2025 at 5:21 AM PoAn Yang  wrote:

> Hi Chia-Ping / David / Lucas,
>
> Happy new year and thanks for the review.
>
> DJ02: Thanks for the suggestion. I updated the PR to use Guava.
>
> DJ03: Yes, I updated the description to mention ISR changes,
> add the altering-partition-reassignment case, and note that
> unrelated topic changes don’t trigger a rebalance.
> DJ03.1: Yes, I will keep using ModernGroup#requestMetadataRefresh
> to notify groups.
>
> DJ06: Updated the PR to use the Guava Hashing#combineUnordered
> function to combine topic hashes.
>
> DJ07: Renamed it to MetadataHash.
>
> DJ08: Added a sample hash function to the KIP and used the first byte as a
> magic byte. This is also included in the latest PR.
>
> DJ09: Added two paragraphs about upgrades and downgrades.
>
> DJ10: According to Lucas’s comment, I added the StreamsGroupMetadataValue update
> to this KIP.
>
> Thanks,
> PoAn
>
>
> > On Dec 20, 2024, at 3:58 PM, Chia-Ping Tsai  wrote:
> >
> >> because assignors are sticky.
> >
> > I forgot about that spec again :(
> >
> >
> >
> >
> > David Jacot wrote on Fri, Dec 20, 2024 at 3:41 PM:
> >
> >> Hi Chia-Ping,
> >>
> >> DJ08: In my opinion, changing the format will be rare so it is
> >> acceptable if rebalances are triggered in this case on
> >> upgrade/downgrade. It is also what will happen if a cluster is
> >> downgraded from 4.1 (with this KIP) to 4.0. The rebalance won't change
> >> anything if the topology of the group is the same because assignors
> >> are sticky. The default ones are and we recommend custom ones to also
> >> be.
> >>
> >> Best,
> >> David
> >>
> >> On Fri, Dec 20, 2024 at 2:11 AM Chia-Ping Tsai 
> >> wrote:
> >>>
> >>> ummm, it does not work for downgrade as the old coordinator has no idea
> >> about new format :(
> >>>
> >>>
> >>> On 2024/12/20 00:57:27 Chia-Ping Tsai wrote:
>  hi David
> 
> > DJ08:
> 
>  That's a good question. If the "hash" lacks version control, it could
> >> trigger a series of unnecessary rebalances. However, adding additional
> >> information ("magic") to the hash does not help the upgraded coordinator
> >> determine the "version." This means that the upgraded coordinator would
> >> still trigger unnecessary rebalances because it has no way to know which
> >> format to use when comparing the hash.
> 
>  Perhaps we can add a new field to ConsumerGroupMetadataValue to
> >> indicate the version of the "hash." This would allow the coordinator,
> when
> >> handling subscription metadata, to compute the old hash and determine
> >> whether an epoch bump is necessary. Additionally, the coordinator can
> >> generate a new record to upgrade the hash without requiring an epoch
> bump.
> 
>  Another issue is whether the coordinator should cache all versions of
> >> the hash. I believe this is necessary; otherwise, during an upgrade,
> there
> >> would be extensive recomputing of old hashes.
> 
>  I believe this idea should also work for downgrades, and that's just
> >> my two cents.
> 
>  Best,
>  Chia-Ping
> 
> 
>  On 2024/12/19 14:39:41 David Jacot wrote:
> > Hi PoAn and Chia-Ping,
> >
> > Thanks for your responses.
> >
> > DJ02: Sorry, I was not clear. I was wondering whether we could compute the
> > hash without having to convert to bytes first. Guava has a nice interface
> > for this, allowing us to incrementally add primitive types to the hash. We can
> > discuss this in the PR as it is an implementation detail.
> >
> > DJ03: Thanks. I don't think that the replicas are updated when a broker
> > shuts down. What you said applies to the ISR. I suppose that we can rely on
> > the ISR changes to trigger updates. It is also worth noting
> > that TopicsDelta#changedTopics is updated for every change (e.g. ISR
> > change, leader change, replicas change, etc.). I suppose that it is OK, but
> > it seems that it will trigger refreshes which are not necessary. However, a
> > rebalance won't be triggered because the hash won't change.
> > DJ03.1: I suppose that we will continue to rely on
> > ModernGroup#requestMetadataRefresh to notify groups that must refresh their
> > hashes. Is my understanding correct?
> >
> > DJ05: Fair enough.
> >
> > DJ06: You mention in two places that you would like to combine hashes by
> > adding them. I wonder if this is a good practice. Intuitively, I would
> > have used XOR or hashed the hashes. Guava has a method for combining
> > hashes. It may be worth looking into the algorithm used.
> >
> > DJ07: I would 
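
Since Guava keeps coming up for DJ02 and DJ06, here is a rough sketch of the incremental-hashing and unordered-combining APIs being referred to (Hashing.murmur3_128, Hasher#put*, Hashing.combineUnordered). The field layout, the leading version byte, and the use of integer replica ids are assumptions for illustration only, not the KIP's final format.

import com.google.common.hash.HashCode;
import com.google.common.hash.Hasher;
import com.google.common.hash.Hashing;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class TopicHashSketch {
    // Incrementally hash one topic's metadata without building an intermediate byte array.
    static HashCode topicHash(String topic, int numPartitions, List<List<Integer>> replicasPerPartition) {
        Hasher hasher = Hashing.murmur3_128().newHasher()
                .putByte((byte) 0)                         // leading byte reserved for the hash-format version
                .putString(topic, StandardCharsets.UTF_8)
                .putInt(numPartitions);
        for (List<Integer> replicas : replicasPerPartition) {
            for (int brokerId : replicas) {
                hasher.putInt(brokerId);
            }
        }
        return hasher.hash();
    }

    // Combine per-topic hashes order-independently, so topic iteration order does not matter.
    static HashCode groupMetadataHash(List<HashCode> perTopicHashes) {
        return Hashing.combineUnordered(perTopicHashes);
    }

    public static void main(String[] args) {
        HashCode foo = topicHash("foo", 2, List.of(List.of(0, 1), List.of(1, 2)));
        HashCode bar = topicHash("bar", 1, List.of(List.of(0)));
        System.out.println(groupMetadataHash(List.of(foo, bar)));
    }
}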

[jira] [Created] (KAFKA-18406) Remove ZkBrokerEpochManager

2025-01-06 Thread Jira
黃竣陽 created KAFKA-18406:
---

 Summary: Remove ZkBrokerEpochManager
 Key: KAFKA-18406
 URL: https://issues.apache.org/jira/browse/KAFKA-18406
 Project: Kafka
  Issue Type: Improvement
Reporter: 黃竣陽
Assignee: 黃竣陽


Once we delete ZkSupport https://issues.apache.org/jira/browse/KAFKA-18399 , we 
will be able to remove ZkBrokerEpochManager 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18405) Remove ZooKeeper logic from DynamicBrokerConfig

2025-01-06 Thread Mickael Maison (Jira)
Mickael Maison created KAFKA-18405:
--

 Summary: Remove ZooKeeper logic from DynamicBrokerConfig
 Key: KAFKA-18405
 URL: https://issues.apache.org/jira/browse/KAFKA-18405
 Project: Kafka
  Issue Type: Sub-task
Reporter: Mickael Maison






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18412) Remove EmbeddedZookeeper

2025-01-06 Thread TengYao Chi (Jira)
TengYao Chi created KAFKA-18412:
---

 Summary: Remove EmbeddedZookeeper
 Key: KAFKA-18412
 URL: https://issues.apache.org/jira/browse/KAFKA-18412
 Project: Kafka
  Issue Type: Improvement
Reporter: TengYao Chi
Assignee: TengYao Chi
 Fix For: 4.0.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18413) Remove AdminZkClient

2025-01-06 Thread TengYao Chi (Jira)
TengYao Chi created KAFKA-18413:
---

 Summary: Remove AdminZkClient
 Key: KAFKA-18413
 URL: https://issues.apache.org/jira/browse/KAFKA-18413
 Project: Kafka
  Issue Type: Improvement
Reporter: TengYao Chi
Assignee: TengYao Chi
 Fix For: 4.0.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-17616) Remove KafkaServer

2025-01-06 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-17616.

Fix Version/s: 4.0.0
   Resolution: Fixed

> Remove KafkaServer
> --
>
> Key: KAFKA-17616
> URL: https://issues.apache.org/jira/browse/KAFKA-17616
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Colin McCabe
>Assignee: Mickael Maison
>Priority: Major
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18414) Remove KRaftRegistrationResult

2025-01-06 Thread TengYao Chi (Jira)
TengYao Chi created KAFKA-18414:
---

 Summary: Remove KRaftRegistrationResult
 Key: KAFKA-18414
 URL: https://issues.apache.org/jira/browse/KAFKA-18414
 Project: Kafka
  Issue Type: Improvement
Reporter: TengYao Chi
Assignee: TengYao Chi
 Fix For: 4.0.0


This trait is actually unused since we are removing ZK code.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18411) Remove ZkProducerIdManager

2025-01-06 Thread Jira
黃竣陽 created KAFKA-18411:
---

 Summary: Remove ZkProducerIdManager
 Key: KAFKA-18411
 URL: https://issues.apache.org/jira/browse/KAFKA-18411
 Project: Kafka
  Issue Type: Improvement
Reporter: 黃竣陽
Assignee: 黃竣陽


As we removed KafkaServer, we can also remove ZkProducerIdManager.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-18307) Flaky test report includes disabled or removed tests.

2025-01-06 Thread David Arthur (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Arthur resolved KAFKA-18307.
--
Resolution: Fixed

> Flaky test report includes disabled or removed tests.
> -
>
> Key: KAFKA-18307
> URL: https://issues.apache.org/jira/browse/KAFKA-18307
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: David Arthur
>Priority: Major
>
> I noticed in this report 
> [https://github.com/apache/kafka/actions/runs/12398964575] that two of the 
> problematic tests have been removed or disabled on trunk. Following their 
> links to Develocity shows that there is actually no recent data.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18416) Ensure we capture all flaky/failing tests in report

2025-01-06 Thread David Arthur (Jira)
David Arthur created KAFKA-18416:


 Summary: Ensure we capture all flaky/failing tests in report
 Key: KAFKA-18416
 URL: https://issues.apache.org/jira/browse/KAFKA-18416
 Project: Kafka
  Issue Type: Sub-task
Reporter: David Arthur


I noticed in our Develocity report that we are missing a catch-all section for 
failing/flaky tests. We currently have these sections:

 
 * Quarantined tests which are continuing to fail
 * Tests that recently began failing (regressions)
 * Quarantined tests which have started to pass

 

We are missing a section for failing/flaky tests that are not recent. For 
example, looking at

 

[https://ge.apache.org/scans/tests?search.names=Git%20Repository&search.rootProjectNames=kafka&search.tags=github%2Ctrunk&search.tasks=test&search.timeZoneId=America%2FNew_York&search.values=https:%2F%2Fgithub.com%2Fapache%2Fkafka&tests.sortField=FLAKY#]

 

we have org.apache.kafka.clients.consumer.internals.ApplicationEventHandlerTest 
which is quite flaky but is missing from the report.

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18422) add Kafka client upgrade path

2025-01-06 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18422:
--

 Summary: add Kafka client upgrade path
 Key: KAFKA-18422
 URL: https://issues.apache.org/jira/browse/KAFKA-18422
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Kuan Po Tseng
 Fix For: 4.0.0


https://github.com/apache/kafka/pull/18193#issuecomment-2572283545



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSS] Proposed KIP: Event-Driven State Store Cleanup through Changelog Deletion Notifications

2025-01-06 Thread Thulasi Ram
Hi All,

I'd like to propose a KIP that transforms state store cleanup from a
time-driven to an event-driven approach by introducing changelog deletion
notifications.

Problem:
Currently, state stores have no way to know when records are deleted from
their changelog topics due to retention. This leads to:
- Resource-intensive periodic scans
- Blind cleanup operations
- Inefficient resource utilization

We face this at significant scale:
- State stores with 25B+ records
- Daily ingestion of 50M records per store
- Retention periods from 2 days to 5 years

Proposed Solution:
Introduce a notification mechanism when records are deleted from changelog
topics, enabling:
- Event-driven cleanup instead of time-based scans
- Targeted deletion of specific records
- Better resource utilization

Would love to get the community's thoughts on:
1. Viability of this approach
2. Implementation considerations (callbacks vs alternatives)
3. Potential impact on broker performance

If there's interest, I can share more detailed technical design.
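
To make the callback option (point 2 above) more concrete, here is a purely hypothetical sketch of what such a hook could look like. None of these types exist in Kafka Streams today, and the delivery mechanism (broker push vs. client-side discovery of the log start offset) is exactly the open question.

// Hypothetical API sketch only; not part of Kafka Streams.
public interface ChangelogRetentionListener {

    /**
     * Invoked when retention has deleted records below {@code newLogStartOffset}
     * for the given changelog partition, so the state store can drop the
     * corresponding local entries instead of running a periodic scan.
     */
    void onRecordsDeleted(String changelogTopic, int partition, long newLogStartOffset);
}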

Looking forward to your feedback.

Best regards,
Thulasiram V


Re: [VOTE] KIP-1098: Reverse Checkpointing in MirrorMaker

2025-01-06 Thread Dániel Urbán
Hey everyone,
Trying to bump once more, maybe someone will notice :)
TIA
Daniel

Dániel Urbán wrote (on Tue, Dec 17, 2024, 18:26):

> Hi everyone,
> Bumping in hope for some votes - consider checking this, small KIP with
> some useful improvements.
> TIA
> Daniel
>
> Dániel Urbán wrote (on Fri, Dec 13, 2024, 9:22):
>
>> Bumping - please consider voting on this KIP.
>> TIA
>> Daniel
>>
>> Dániel Urbán wrote (on Mon, Dec 9, 2024, 9:06):
>>
>>> Gentle bump - please consider checking the KIP and voting.
>>> Daniel
>>>
>>> Dániel Urbán wrote (on Thu, Dec 5, 2024, 12:08):
>>>
 Bumping this vote - the change has a relatively small footprint, but
 fills a sizable gap in MM2.
 Please consider checking the KIP and chiming in.
 TIA
 Daniel

Viktor Somogyi-Vass wrote (on Mon, Dec 2, 2024, 10:40):

> +1 (binding)
>
> Thanks for the KIP Daniel!
>
> Viktor
>
> On Mon, Dec 2, 2024 at 10:36 AM Vidor Kanalas  >
> wrote:
>
> > Hi, thanks for the KIP!
> > +1 (non-binding)
> >
> > Best,
> > Vidor
> > On Mon, Dec 2, 2024 at 10:15 AM Dániel Urbán 
> > wrote:
> >
> > > Hi everyone,
> > >
> > > I'd like to start the vote on KIP-1098: Reverse Checkpointing in
> > > MirrorMaker (
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1098%3A+Reverse+Checkpointing+in+MirrorMaker
> > > ).
> > >
> > > TIA,
> > > Daniel
> > >
> >
>



[jira] [Created] (KAFKA-18408) tweak the 'tag' field for BrokerHeartbeatRequest.json, BrokerRegistrationChangeRecord.json and RegisterBrokerRecord.json

2025-01-06 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18408:
--

 Summary: tweak the 'tag' field for BrokerHeartbeatRequest.json, 
BrokerRegistrationChangeRecord.json and RegisterBrokerRecord.json
 Key: KAFKA-18408
 URL: https://issues.apache.org/jira/browse/KAFKA-18408
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


"tag": "0" -> "tag": 0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] MINOR: Fix newlines not working in LISTENERS_DOC for 3.8 and 3.9 docs [kafka-site]

2025-01-06 Thread via GitHub


clarkwtc commented on PR #658:
URL: https://github.com/apache/kafka-site/pull/658#issuecomment-2573118367

   Preview 3.9:
   ![Screenshot 2025-01-06 212809](https://github.com/user-attachments/assets/e45ad4b7-db3e-4a9a-9cb4-e538de11f84a)

   Preview 3.8:
   ![Screenshot 2025-01-06 212840](https://github.com/user-attachments/assets/83b9f348-c451-4760-a563-e1bd57d73b54)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (KAFKA-18409) ShareGroupStateMessageFormatter should use ApiMessageFormatter

2025-01-06 Thread David Jacot (Jira)
David Jacot created KAFKA-18409:
---

 Summary: ShareGroupStateMessageFormatter should use 
ApiMessageFormatter
 Key: KAFKA-18409
 URL: https://issues.apache.org/jira/browse/KAFKA-18409
 Project: Kafka
  Issue Type: Improvement
Reporter: David Jacot


ShareGroupStateMessageFormatter should extend ApiMessageFormatter in order to 
have consistent handling of coordinator records.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18410) Should GroupMetadataMessageFormatter print new records too?

2025-01-06 Thread David Jacot (Jira)
David Jacot created KAFKA-18410:
---

 Summary: Should GroupMetadataMessageFormatter print new records 
too?
 Key: KAFKA-18410
 URL: https://issues.apache.org/jira/browse/KAFKA-18410
 Project: Kafka
  Issue Type: Improvement
Reporter: David Jacot


At the moment, GroupMetadataMessageFormatter only prints out the metadata 
record of the classic groups. It seems that we should extend it to also print 
out the metadata of other groups (e.g. consumer, share, stream, etc.). Thoughts?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)