Re: So long, Jenkins 👋

2024-09-26 Thread Chia-Ping Tsai
hi Josep

> Do you see any potential impact if we backport the change to those?

In my opinion, the main concern is that non-trunk PRs can't effectively
leverage the cache, meaning they require more time and resources to run CI.
Additionally, GitHub CI is currently triggered only by the trunk branch, and
we have not yet tested it on non-trunk branches. Given that the 3.9.0 and
3.8.1 releases are in progress, we could continue using Jenkins CI to avoid
the additional overhead of backporting.

By the way, we'll eventually need to backport GitHub CI to the non-trunk
branches once the 4.1 branch is created.

Best,
Chia-Ping



On Thu, Sep 26, 2024 at 4:15 PM, Chia-Ping Tsai wrote:

> Thanks to David for providing us with an improved CI!
>
> Cheers,
> Chia-Ping
>
> On Thu, Sep 26, 2024 at 8:51 AM, David Arthur wrote:
>
>> Today, we disabled the Jenkins build on trunk. With this change, we should
>> now be expecting all green status checks on PRs before merging. Of course,
>> flaky tests still exist, but generally speaking we should have green
>> builds
>> (see KIP-1090 for some plans on flaky tests).
>>
>> Any committer or "collaborator" (as defined in .asf.yaml) is able to
>> manually re-run a GitHub Action via the UI.
>>
>> For non-committers, someone must approve the workflow. There is a
>> "approve-workflows.py" script in committer-tools to help with this. I'm
>> still investigating options to improve this.
>>
>> We will keep the Jenkins build enabled for 3.9 and other release branches.
>>
>> Cheers,
>> David A
>>
>


Re: So long, Jenkins 👋

2024-09-26 Thread Josep Prat
That's what I feared



-- 
[image: Aiven] 

*Josep Prat*
Open Source Engineering Director, *Aiven*
josep.p...@aiven.io   |   +491715557497
aiven.io    |   
     
*Aiven Deutschland GmbH*
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
Anna Richardson, Kenneth Chen
Amtsgericht Charlottenburg, HRB 209739 B


Re: So long, Jenkins 👋

2024-09-26 Thread Chia-Ping Tsai
Thanks to David for providing us with an improved CI!

Cheers,
Chia-Ping

On Thu, Sep 26, 2024 at 8:51 AM, David Arthur wrote:

> Today, we disabled the Jenkins build on trunk. With this change, we should
> now be expecting all green status checks on PRs before merging. Of course,
> flaky tests still exist, but generally speaking we should have green builds
> (see KIP-1090 for some plans on flaky tests).
>
> Any committer or "collaborator" (as defined in .asf.yaml) is able to
> manually re-run a GitHub Action via the UI.
>
> For non-committers, someone must approve the workflow. There is a
> "approve-workflows.py" script in committer-tools to help with this. I'm
> still investigating options to improve this.
>
> We will keep the Jenkins build enabled for 3.9 and other release branches.
>
> Cheers,
> David A
>


[jira] [Created] (KAFKA-17619) Remove zk type and instance from ClusterTest

2024-09-26 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-17619:
--

 Summary: Remove zk type and instance from ClusterTest
 Key: KAFKA-17619
 URL: https://issues.apache.org/jira/browse/KAFKA-17619
 Project: Kafka
  Issue Type: Sub-task
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


as title



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17620) Simplify share partition acquire API

2024-09-26 Thread Apoorv Mittal (Jira)
Apoorv Mittal created KAFKA-17620:
-

 Summary: Simplify share partition acquire API
 Key: KAFKA-17620
 URL: https://issues.apache.org/jira/browse/KAFKA-17620
 Project: Kafka
  Issue Type: Sub-task
Reporter: Apoorv Mittal
Assignee: Apoorv Mittal


Simplify the share partition acquire API by removing the completable future, 
as it no longer involves any future-based calls.





[jira] [Resolved] (KAFKA-17496) Add heterogeneous configuration to TargetAssignmentBuilderBenchmark

2024-09-26 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-17496.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

> Add heterogeneous configuration to TargetAssignmentBuilderBenchmark
> ---
>
> Key: KAFKA-17496
> URL: https://issues.apache.org/jira/browse/KAFKA-17496
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Sean Quah
>Assignee: Sean Quah
>Priority: Minor
> Fix For: 4.0.0
>
>






[jira] [Resolved] (KAFKA-17571) Revert #17219

2024-09-26 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-17571.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

> Revert #17219
> -
>
> Key: KAFKA-17571
> URL: https://issues.apache.org/jira/browse/KAFKA-17571
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 4.0.0
>Reporter: David Jacot
>Assignee: David Jacot
>Priority: Blocker
> Fix For: 4.0.0
>
>
> Revert https://github.com/apache/kafka/pull/17219





[jira] [Resolved] (KAFKA-17584) Fix incorrect synonym handling for dynamic log configurations

2024-09-26 Thread Luke Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Chen resolved KAFKA-17584.
---
Fix Version/s: 3.9.0
   3.8.1
   Resolution: Fixed

> Fix incorrect synonym handling for dynamic log configurations
> -
>
> Key: KAFKA-17584
> URL: https://issues.apache.org/jira/browse/KAFKA-17584
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.9.0
>Reporter: Christo Lolov
>Assignee: Christo Lolov
>Priority: Blocker
> Fix For: 3.9.0, 3.8.1
>
>
> Updating certain dynamic configurations (for example `message.max.bytes`) 
> causes retention based on time to reset to the default value (source code) 
> for log.retention.ms. This poses a durability issue if users have set their 
> retention by using log.retention.hours or log.retention.minutes. In other 
> words, if a user has set log.retention.hours=-1 (infinite retention) and they 
> dynamically change `message.max.bytes`, their retention will immediately 
> change back to the default of 604800000 ms (7 days), and data older than this 
> will be scheduled for deletion immediately.
> Steps to reproduce:
>  1. Add log.retention.minutes=1,log.retention.check.interval.ms=1000 to 
> server.properties
>  2. Start a single ZK or KRaft instance + a single Kafka instance
>  3. Create a topic using
> {code:java}
> bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic A 
> --replication-factor 1 --partitions 1 --config min.insync.replicas=1 --config 
> segment.bytes=512{code}
>  4. Create a few segments with the console producer
>  5. Observe that they are deleted after 1 minute
>  6. Use the following command
> {code:java}
> bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type brokers 
> --entity-default --alter --add-config message.max.bytes=1048609{code}
> (the value of `message.max.bytes` is irrelevant)
>  7. Create a few more segments with the console producer
>  8. Observe that segments are no longer deleted after 1 minute
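
To make the arithmetic in the report above concrete, here is a minimal sketch of how the three retention synonyms could resolve to a single millisecond value. It is only an illustration, not Kafka's code: the function name `effective_retention_ms` is hypothetical, and it assumes the documented precedence (log.retention.ms over log.retention.minutes over log.retention.hours) and a default of 168 hours.

```python
DEFAULT_RETENTION_HOURS = 168  # 7 days, assumed broker default


def effective_retention_ms(props):
    """Resolve time-based retention from a dict of broker properties.

    Assumed precedence: log.retention.ms, then log.retention.minutes,
    then log.retention.hours. Any negative value means "retain forever"
    and is normalised to -1.
    """
    if "log.retention.ms" in props:
        millis = int(props["log.retention.ms"])
    elif "log.retention.minutes" in props:
        millis = int(props["log.retention.minutes"]) * 60 * 1000
    else:
        hours = int(props.get("log.retention.hours", DEFAULT_RETENTION_HOURS))
        millis = hours * 60 * 60 * 1000
    return -1 if millis < 0 else millis


# With nothing configured, the fallback is 7 days expressed in milliseconds.
print(effective_retention_ms({}))
# With only the hours synonym set to -1, retention is infinite.
print(effective_retention_ms({"log.retention.hours": "-1"}))
```

The bug described above amounts to the second case silently turning into the first after an unrelated dynamic update.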





[jira] [Created] (KAFKA-17618) group consumer heartbeat interval should be less than session timeout

2024-09-26 Thread PoAn Yang (Jira)
PoAn Yang created KAFKA-17618:
-

 Summary: group consumer heartbeat interval should be less than 
session timeout
 Key: KAFKA-17618
 URL: https://issues.apache.org/jira/browse/KAFKA-17618
 Project: Kafka
  Issue Type: Task
Reporter: PoAn Yang
Assignee: PoAn Yang


[KIP-848|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=217387038#KIP848:TheNextGenerationoftheConsumerRebalanceProtocol-Heartbeat&Session]
 mentions:
bq. The member is expected to heartbeat every 
group.consumer.heartbeat.interval.ms in order to keep its session opened. If it 
does not heartbeat at least once within the group.consumer.session.timeout.ms, 
the group coordinator will kick the member out from the group.

To prevent users from configuring _group.consumer.heartbeat.interval.ms_ to be 
larger than _group.consumer.session.timeout.ms_, we can add validation for it.

We can do similar validation for _group.share.heartbeat.interval.ms_ and 
_group.share.session.timeout.ms_ as well.
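
A rough sketch of the proposed check, written in Python purely for illustration (the real validation would live in the broker's Java/Scala config code). The function name `validate_heartbeat_interval` and the `prefix` parameter are hypothetical, and the strict "less than" comparison is an assumption taken from the ticket title.

```python
def validate_heartbeat_interval(configs, prefix="group.consumer"):
    """Raise ValueError unless the heartbeat interval is below the session timeout.

    `prefix` selects the config family, e.g. "group.consumer" or "group.share".
    """
    heartbeat_key = f"{prefix}.heartbeat.interval.ms"
    session_key = f"{prefix}.session.timeout.ms"
    heartbeat = int(configs[heartbeat_key])
    session = int(configs[session_key])
    if heartbeat >= session:
        raise ValueError(
            f"{heartbeat_key} ({heartbeat}) must be less than "
            f"{session_key} ({session})"
        )


# Sample values only; a member that heartbeats every 5 s with a 45 s session passes.
validate_heartbeat_interval({
    "group.consumer.heartbeat.interval.ms": "5000",
    "group.consumer.session.timeout.ms": "45000",
})
```

The same helper covers the share-group pair by passing `prefix="group.share"`.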





[jira] [Created] (KAFKA-17624) Remove the E2E uses of accessing ACLs from zk

2024-09-26 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-17624:
--

 Summary: Remove the E2E uses of accessing ACLs from zk
 Key: KAFKA-17624
 URL: https://issues.apache.org/jira/browse/KAFKA-17624
 Project: Kafka
  Issue Type: Sub-task
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


To remove ZooKeeper code from AclCommand, we first need to remove the related 
system tests. This task should remove zookeeper.py#list_acls and its usages, 
including zookeeper_tls_test.py, zookeeper_tls_encrypt_only_test.py, 
zookeeper_security_upgrade_test.py, `test_rolling_upgrade_phase_two`, 
`test_rolling_upgrade_sasl_mechanism_phase_two`, and  
upgrade_test.py#perform_upgrade.





Re: [VOTE] KIP-1034: Dead letter queue in Kafka Streams

2024-09-26 Thread Bill Bejeck
Thanks for the KIP, this will be a great addition.

+1(binding)

Regards,
Bill

On Thu, Sep 26, 2024 at 9:19 AM Bruno Cadonna  wrote:

> Thanks Loïc, Sebastien, and Damien,
>
> +1 (binding)
>
> Best,
> Bruno
>
> On 9/26/24 3:15 AM, Sophie Blee-Goldman wrote:
> > +1 (binding)
> >
> > thanks for the KIP guys!
> >
> > On Mon, Sep 23, 2024 at 3:38 AM Sebastien Viale <
> > sebastien.vi...@michelin.com> wrote:
> >
> >> Hi everyone,
> >>
> >> Just a quick reminder that the vote for KIP-1034 is still open.
> >> Thank you all for your participation!
> >>
> >> Best regards,
> >> Damien, Sebastien and Loic
> >>
> >>
> >> From: Sebastien Viale
> >> Sent: Wednesday, September 11, 2024 09:26
> >> To: dev
> >> Subject: Marketing: [VOTE] KIP-1034: Dead letter queue in Kafka Streams
> >>
> >> Hi all,
> >>
> >> We would like to start a vote for KIP-1034: Dead letter queue in Kafka
> >> Streams.
> >>
> >> The KIP is available at
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1034%3A+Dead+letter+queue+in+Kafka+Streams
> >>
> >> If you have any suggestions or feedback, feel free to participate in the
> >> discussion thread:
> >> https://lists.apache.org/thread/1nhhsrogmmv15o7mk9nj4kvkb5k2bx9s
> >>
> >> Best regards,
> >>
> >> Damien, Sebastien and Loic
> >>
> >>
> >> This email was screened for spam and malicious content but exercise
> >> caution anyway.
> >>
> >>
> >>
> >
>
>


Re: [VOTE] KIP-1034: Dead letter queue in Kafka Streams

2024-09-26 Thread Bruno Cadonna

Thanks Loïc, Sebastien, and Damien,

+1 (binding)

Best,
Bruno


Re: So long, Jenkins 👋

2024-09-26 Thread Josep Prat
Hi David,
I think we need a way to flag in the PR list (github.com/apache/kafka/pulls)
the ones that are waiting for a committer to approve the workflows. As an
example:
[image: image.png]
This PR has a green checkmark where the check status usually goes. But if
one navigates to the PR in question, one can see that the CI tasks have not
started and are waiting for a committer to approve and run them.
[image: image.png]
Do you have another way to identify these PRs? Or should we maybe work on
auto-labelling PRs from non-committers (the ones that would wait for CI to
run)?



Re: So long, Jenkins 👋

2024-09-26 Thread Josep Prat
I see you have the Python script under "committer-tools". I guess I might
need to get used to calling that script instead of going to the "pulls" page.

Best,

Josep Prat


Re: So long, Jenkins 👋

2024-09-26 Thread David Arthur
We can probably get the new CI working on older release branches; it will
just take a bit of effort. As a start, we can just disable the build cache
for these builds. I'm not sure caching is even that useful beyond the time
between the branch point and the .0 release (since the rate of change slows
way down after a release). There is also a 10 GB limit for our total cache
items, which we are pretty close to already.

On Thu, Sep 26, 2024 at 9:51 AM Chia-Ping Tsai  wrote:

> It seems we need to promote approve-workflows.py to all committers 😀

[jira] [Created] (KAFKA-17623) Flaky testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback

2024-09-26 Thread Lianet Magrans (Jira)
Lianet Magrans created KAFKA-17623:
--

 Summary: Flaky 
testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback
 Key: KAFKA-17623
 URL: https://issues.apache.org/jira/browse/KAFKA-17623
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Reporter: Lianet Magrans


Flaky for the new consumer, failing with:

org.apache.kafka.common.KafkaException: User rebalance callback throws an error
 at app//org.apache.kafka.clients.consumer.internals.ConsumerUtils.maybeWrapAsKafkaException(ConsumerUtils.java:259)
 at app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.invokeRebalanceCallbacks(AsyncKafkaConsumer.java:1867)
 at app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer$BackgroundEventProcessor.process(AsyncKafkaConsumer.java:195)
 at app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer$BackgroundEventProcessor.process(AsyncKafkaConsumer.java:181)
 at app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.processBackgroundEvents(AsyncKafkaConsumer.java:1758)
 at app//org.apache.kafka.clients.consumer.internals.AsyncKafkaConsumer.updateAssignmentMetadataIfNeeded(AsyncKafkaConsumer.java:1618)

...

Caused by: java.lang.IllegalStateException: No current assignment for partition topic-0
 at org.apache.kafka.clients.consumer.internals.SubscriptionState.assignedState(SubscriptionState.java:378)
 at org.apache.kafka.clients.consumer.internals.SubscriptionState.seekUnvalidated(SubscriptionState.java:395)
 at org.apache.kafka.clients.consumer.internals.events.ApplicationEventProcessor.process(ApplicationEventProcessor.java:425)
 at org.apache.kafka.clients.consumer.internals.events.ApplicationEventProcessor.process(ApplicationEventProcessor.java:147)
 at org.apache.kafka.clients.consumer.internals.ConsumerNetworkThread.processApplicationEvents(ConsumerNetworkThread.java:171)

 

Flaky behaviour:

 

https://ge.apache.org/scans/tests?search.buildOutcome=failure&search.names=Git%20branch&search.rootProjectNames=kafka&search.startTimeMax=172740959&search.startTimeMin=172248480&search.timeZoneId=America%2FToronto&search.values=trunk&tests.container=integration.kafka.api.PlaintextConsumerCallbackTest&tests.test=testSeekPositionAndPauseNewlyAssignedPartitionOnPartitionsAssignedCallback(String%2C%20String)%5B3%5D





Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.8 #93

2024-09-26 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-15266) Static configs set for non primary synonyms are ignored for Log configs

2024-09-26 Thread Kamal Chandraprakash (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamal Chandraprakash resolved KAFKA-15266.
--
Resolution: Duplicate

> Static configs set for non primary synonyms are ignored for Log configs
> ---
>
> Key: KAFKA-15266
> URL: https://issues.apache.org/jira/browse/KAFKA-15266
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.6.0
>Reporter: Aman Harish Gandhi
>Assignee: Aman Harish Gandhi
>Priority: Major
>
> In our server.properties we had the following config
> {code:java}
> log.retention.hours=48
> {code}
> We noticed that after running alter configs to update a broker-level config (for 
> a config unrelated to retention), we were only deleting data after 7 days 
> instead of the configured 2.
> The alter config command we ran was similar to this:
> {code:java}
> sh kafka-config.sh --bootstrap-server localhost:9092 --alter --add-config 
> "log.segment.bytes=50"
> {code}
> Digging deeper, the issue could be pinpointed to the reconfigure block of 
> DynamicLogConfig inside DynamicBrokerConfig. Here we only look at the 
> "primary" KafkaConfig synonym of the LogConfig, and if it is not set then we 
> remove the value set in the default log config as well. This eventually leads to 
> retention.ms not being set in the default log config, which in turn leads to 
> the default value of 7 days being used. The value set in 
> "log.retention.hours" is completely ignored in this case.
> Pasting the relevant code block here
> {code:java}
> newConfig.valuesFromThisConfig.forEach { (k, v) =>
>   if (DynamicLogConfig.ReconfigurableConfigs.contains(k)) {
>     DynamicLogConfig.KafkaConfigToLogConfigName.get(k).foreach { configName =>
>       if (v == null)
>         newBrokerDefaults.remove(configName)
>       else
>         newBrokerDefaults.put(configName, v.asInstanceOf[AnyRef])
>     }
>   }
> } {code}
> In the above block, `DynamicLogConfig.ReconfigurableConfigs` contains only 
> log.retention.ms. It does not contain the other synonyms like 
> `log.retention.minutes` or `log.retention.hours`.
> This issue seems to occur in all cases where there is more than one 
> KafkaConfig synonym for the LogConfig.
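
The mechanism described in this report can be reproduced with a small toy model. The sketch below is not Kafka's code: `reconfigure`, `KAFKA_TO_LOG_CONFIG`, and the sample values are hypothetical stand-ins, and the single mapping dict collapses the `ReconfigurableConfigs` check and the `KafkaConfigToLogConfigName` lookup from the quoted Scala into one step.

```python
# Hypothetical mapping from broker config names to log config names.
# Only the "primary" synonym is present, mirroring the reported behaviour.
KAFKA_TO_LOG_CONFIG = {"log.retention.ms": "retention.ms"}
DEFAULT_RETENTION_MS = 7 * 24 * 60 * 60 * 1000  # 7 days


def reconfigure(broker_defaults, new_config):
    """Mimic the quoted loop: an unset primary key removes the log default."""
    updated = dict(broker_defaults)
    for key, value in new_config.items():
        log_name = KAFKA_TO_LOG_CONFIG.get(key)
        if log_name is None:
            continue  # not a reconfigurable log config in this toy model
        if value is None:
            updated.pop(log_name, None)
        else:
            updated[log_name] = value
    return updated


# At startup, log.retention.hours=48 was resolved into retention.ms.
defaults = {"retention.ms": 48 * 60 * 60 * 1000}
# On a dynamic update, log.retention.ms is seen as unset because only the
# hours synonym was configured.
new_config = {"log.retention.ms": None, "log.segment.bytes": 50}

after = reconfigure(defaults, new_config)
effective = after.get("retention.ms", DEFAULT_RETENTION_MS)
print(effective)  # falls back to the 7-day default instead of 48 hours
```

In this model, a fix would need to consult every synonym before deciding that the value is unset.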





Re: [DISCUSS] KIP-1092: Extend Consumer#close with an option to leave the group or not

2024-09-26 Thread Chia-Ping Tsai
Dear all,

The main purpose is to allow consumers to leave a group permanently, even if 
they have a static member ID. Additionally, we don't have insights into the use 
case where a dynamic member might not want to leave the group.

Therefore, should we enhance the close option to support our goal of allowing 
static members to leave the group when closing the consumer? This means the 
flag should be renamed to 'releaseStaticMember'.

Best,
Chia-Ping

On 2024/09/25 03:22:17 TengYao Chi wrote:
> Hi Kirk,
> 
> Thanks for your feedback and questions!
> 
> KT1:
> While it might seem that there’s no immediate benefit for non-Kafka Streams
> applications to set `leaveGroup=false`, we still offer this option for
> consistency with `KafkaStreams#CloseOptions` and to provide flexibility for
> future features. However, I agree with your concern that users might misuse
> this option, leading to potential throughput issues. To avoid this, we
> could include documentation that strongly recommends setting
> `leaveGroup=true` unless the user fully understands the implications of
> leaving it as `false`.
> 
> KT2:
> Thanks for the reminder.
> I have updated the KIP to include the actual API changes. Please take a
> look.
> 
> KT3:
> I believe we should respect the timeout setting, as this offers several key
> advantages. First, as Chia-Ping mentioned, it gives users more control over
> the close process, allowing them to better manage their application flow.
> Second, as a public interface, users will expect the timeout parameter to
> act as a contract between Kafka and the user. If the consumer does not
> respect this setting, it could lead to confusion and force users to
> implement unnecessary workarounds. Ensuring that the `close()` operation
> honors the provided timeout will lead to a more predictable and
> user-friendly API.
> 
> Thanks again for your questions.
> Let me know if you have any further comments.
> 
> Best,
> TengYao
> 
> On Wed, Sep 25, 2024 at 4:51 AM Chia-Ping Tsai wrote:
> 
> > hi Kirk
> >
> > > KT1: Why would a non-Kafka Streams application want to set
> > leaveGroup=false? Because Kafka Streams manages the group membership
> > assignment under the covers, it can re-assign partitions to a new Consumer
> > when the old one closes. But in a non-Kafka Streams application, doesn’t
> > this just leave the partitions assigned until the coordinator kicks the
> > member out of the group?
> >
> > That's definitely a great question. I don’t have a clear answer regarding
> > whether leaveGroup=false means not sending a LEAVE_GROUP request, but I
> > have a crazy idea—what if leaveGroup=false means temporarily leaving the
> > group? My idea is that we could potentially integrate dynamic members with
> > static members.
> >
> > Here’s how it could work with changes to the new protocol and consumer:
> >
> >1. The new consumer will exclusively use static members. If a user
> >doesn't specify one, the new consumer will automatically generate it
> >(similar to KIP-1082).
> >2. Users can define a close option to either leave the group permanently
> >or temporarily.
> >
> > This approach offers several advantages:
> >
> >1. It addresses the needs of Kafka Streams (if they start to use new
> >consumer)
> >2. It simplifies the new coordinator, protocol, and consumer, as they no
> >longer need to handle two types of members.
> >3. It addresses KIP-1082. The new consumer always sends a LEAVE_GROUP
> >request during closing, even if it hasn't received a response from the
> >server.
> >
> > > KT3: Does setting leaveGroup=true carry the *guarantee* that the member
> > will leave the group? I’m currently battling some edge cases where close()
> > is called with a zero timeout and it times out before the consumer can
> > leave the group cleanly.
> >
> > In my opinion, this is a trade-off between 'not honoring the timeout' and
> > 'creating ghost members.' However, I prefer honoring the timeout, as it
> > provides users with more control over the 'close' process. We should trust
> > that users fully understand the options they choose.
> >
> >
> > Best,
> > Chia-Ping
> >
> >
> >
> > On Tue, Sep 24, 2024 at 8:37 AM Kirk True wrote:
> >
> > > Hi TengYao,
> > >
> > > Thanks for writing up this KIP :)
> > >
> > > Questions:
> > >
> > > KT1: Why would a non-Kafka Streams application want to set
> > > leaveGroup=false? Because Kafka Streams manages the group membership
> > > assignment under the covers, it can re-assign partitions to a new
> > Consumer
> > > when the old one closes. But in a non-Kafka Streams application, doesn’t
> > > this just leave the partitions assigned until the coordinator kicks the
> > > member out of the group?
> > >
> > > KT2: Can you add the actual API changes to the KIP?
> > >
> > > KT3: Does setting leaveGroup=true carry the *guarantee* that the member
> > > will leave the group? I’m currently battling some edge cases where
> > close()
> > > is called with a zero timeout and it tim

[jira] [Resolved] (KAFKA-17612) Remove some tests that only apply to ZK mode or migration

2024-09-26 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-17612.

Fix Version/s: 4.0.0
   Resolution: Fixed

> Remove some tests that only apply to ZK mode or migration
> -
>
> Key: KAFKA-17612
> URL: https://issues.apache.org/jira/browse/KAFKA-17612
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Colin McCabe
>Assignee: Colin McCabe
>Priority: Major
> Fix For: 4.0.0
>
>
> Remove some tests that only apply to ZK mode



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17621) Reduce logging verbosity on ConsumerGroupHeartbeat path

2024-09-26 Thread David Jacot (Jira)
David Jacot created KAFKA-17621:
---

 Summary: Reduce logging verbosity on ConsumerGroupHeartbeat path
 Key: KAFKA-17621
 URL: https://issues.apache.org/jira/browse/KAFKA-17621
 Project: Kafka
  Issue Type: Sub-task
Reporter: David Jacot
Assignee: David Jacot






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: So long, Jenkins 👋

2024-09-26 Thread Chia-Ping Tsai
It seems we need to promote approve-workflows.py to all committers 😀

On Thu, Sep 26, 2024 at 9:42 PM Josep Prat wrote:

> I see you have the python script under "committer-tools", I guess I might
> need to get used to calling that script instead of going to the "pulls" page.
>
> Best,
>
> On Thu, Sep 26, 2024 at 3:36 PM Josep Prat  wrote:
>
>> Hi David,
>> I think we need a way to flag in the PR list (
>> github.com/apache/kafka/pulls) the ones that are waiting for a committer
>> to approve the workflows. As an example:
>> [image: image.png]
>> This PR has a green checkmark where the check status usually goes. But if
>> one navigates to the PR in question, one can see that the CI tasks didn't
>> start and wait for a committer to approve and run.
>> [image: image.png]
>> Do you have another way to identify these PRs? Or should we maybe work on
>> auto labelling PRs from non-committers (the ones that would wait for CI to
>> run).
>>
>> On Thu, Sep 26, 2024 at 11:00 AM Josep Prat  wrote:
>>
>>> That's what I feared
>>>
>>> On Thu, Sep 26, 2024 at 10:31 AM Chia-Ping Tsai 
>>> wrote:
>>>
>>>> hi Josep
>>>>
>>>> > Do you see any potential impact if we backport the change to those?
>>>>
>>>> In my opinion, the main concern is that non-trunk PRs can't effectively
>>>> leverage the cache, meaning they require more time and resources to run CI.
>>>> Additionally, github-ci is triggered by trunk branch only, and we have not
>>>> tested it on non-trunk branch yet. Given that 3.9.0 and 3.8.1 releases are
>>>> processing, we could continue using Jenkins CI to avoid the additional
>>>> overhead of backporting.
>>>>
>>>> By the way, we'll eventually need to backport GitHub CI to the non-trunk
>>>> branches once the 4.1 branch is created.
>>>>
>>>> Best,
>>>> Chia-Ping
>>>>
>>>> On Thu, Sep 26, 2024 at 4:15 PM Chia-Ping Tsai wrote:
>>>>
>>>> > Thanks to David for providing us with an improved CI!
>>>> >
>>>> > Cheers,
>>>> > Chia-Ping
>>>> >
>>>> > On Thu, Sep 26, 2024 at 8:51 AM David Arthur wrote:
>>>> >
>>>> >> Today, we disabled the Jenkins build on trunk. With this change, we should
>>>> >> now be expecting all green status checks on PRs before merging. Of course,
>>>> >> flaky tests still exist, but generally speaking we should have green builds
>>>> >> (see KIP-1090 for some plans on flaky tests).
>>>> >>
>>>> >> Any committer or "collaborator" (as defined in .asf.yaml) is able to
>>>> >> manually re-run a GitHub Action via the UI.
>>>> >>
>>>> >> For non-committers, someone must approve the workflow. There is a
>>>> >> "approve-workflows.py" script in committer-tools to help with this. I'm
>>>> >> still investigating options to improve this.
>>>> >>
>>>> >> We will keep the Jenkins build enabled for 3.9 and other release branches.
>>>> >>
>>>> >> Cheers,
>>>> >> David A
>>>> >>
>>>> >
>>>>
>>>
>>>
>>> --
>>> *Josep Prat*
>>> Open Source Engineering Director, *Aiven*
>>> josep.p...@aiven.io   |   +491715557497
>>> aiven.io
>>> *Aiven Deutschland GmbH*
>>> Alexanderufer 3-7, 10117 Berlin
>>> Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
>>> Anna Richardson, Kenneth Chen
>>> Amtsgericht Charlottenburg, HRB 209739 B
>>>
>>
>>
>>
>
>
>


[jira] [Created] (KAFKA-17622) Kafka Streams Timeout During Partition Rebalance - Seeking Insights on NotLeaderOrFollowerException

2024-09-26 Thread Alieh Saeedi (Jira)
Alieh Saeedi created KAFKA-17622:


 Summary: Kafka Streams Timeout During Partition Rebalance - 
Seeking Insights on NotLeaderOrFollowerException
 Key: KAFKA-17622
 URL: https://issues.apache.org/jira/browse/KAFKA-17622
 Project: Kafka
  Issue Type: Bug
Reporter: Alieh Saeedi


Re: 
[https://forum.confluent.io/t/kafka-streams-timeout-during-partition-rebalance-seeking-insights-on-notleaderorfollowerexception/11362]

Calling {{Consumer.position()}} from Kafka Streams to compute the offset 
that must be committed suffers from a race condition: by the time we 
want to commit, the position may be gone.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (KAFKA-17612) Remove some tests that only apply to ZK mode or migration

2024-09-26 Thread Colin McCabe (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin McCabe reopened KAFKA-17612:
--

> Remove some tests that only apply to ZK mode or migration
> -
>
> Key: KAFKA-17612
> URL: https://issues.apache.org/jira/browse/KAFKA-17612
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Colin McCabe
>Assignee: Colin McCabe
>Priority: Major
> Fix For: 4.0.0
>
>
> Remove some tests that only apply to ZK mode



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17625) Remove ZK from ducktape in 4.0

2024-09-26 Thread Colin McCabe (Jira)
Colin McCabe created KAFKA-17625:


 Summary: Remove ZK from ducktape in 4.0
 Key: KAFKA-17625
 URL: https://issues.apache.org/jira/browse/KAFKA-17625
 Project: Kafka
  Issue Type: Sub-task
Reporter: Colin McCabe






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17626) Move common fetch related classes from storage to storage-api

2024-09-26 Thread Apoorv Mittal (Jira)
Apoorv Mittal created KAFKA-17626:
-

 Summary: Move common fetch related classes from storage to 
storage-api
 Key: KAFKA-17626
 URL: https://issues.apache.org/jira/browse/KAFKA-17626
 Project: Kafka
  Issue Type: Sub-task
Reporter: Apoorv Mittal
Assignee: Apoorv Mittal






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17627) ConfigProvider TTLs do not restart Tasks

2024-09-26 Thread Greg Harris (Jira)
Greg Harris created KAFKA-17627:
---

 Summary: ConfigProvider TTLs do not restart Tasks
 Key: KAFKA-17627
 URL: https://issues.apache.org/jira/browse/KAFKA-17627
 Project: Kafka
  Issue Type: Bug
  Components: connect
Affects Versions: 2.0.0
Reporter: Greg Harris


The ConfigProvider interface allows for implementations to provide 
configurations with an accompanying Time To Live, a lifetime that the returned 
value is valid for. Callers which receive a TTL should periodically call the 
ConfigProvider to refresh and receive any updated value.

The WorkerConfigTransformer is responsible for accepting TTLs from 
ConfigProviders, and scheduling restarts for connectors to force them to 
refresh their externalized configurations.

Since the introduction of the ConfigProvider interface in 2.0.0, when the 
WorkerConfigTransformer processes task configurations, it schedules a restart 
of the connector, which may or may not be present on the running machine. 
Instead, when secrets for a task include a TTL, the task should be restarted 
directly.
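
The difference between the current and the proposed behaviour can be sketched as follows. This is a Python illustration, not the WorkerConfigTransformer implementation; plan_restarts, its parameters and the "connector:"/"task:" target strings are hypothetical.

```python
def plan_restarts(connector, connector_ttl_ms, task_ttls_ms, restart_tasks_directly):
    """Return a sorted list of (delay_ms, target) restart requests.

    connector_ttl_ms: TTL seen while transforming the connector config, or None.
    task_ttls_ms: mapping of task id -> TTL seen in that task's config.
    restart_tasks_directly: False models the reported behaviour (a task TTL
    schedules a connector restart); True models the suggested fix.
    """
    plan = []
    if connector_ttl_ms is not None:
        plan.append((connector_ttl_ms, f"connector:{connector}"))
    for task_id, ttl_ms in task_ttls_ms.items():
        if restart_tasks_directly:
            target = f"task:{connector}-{task_id}"
        else:
            target = f"connector:{connector}"
        plan.append((ttl_ms, target))
    return sorted(plan)
```

With restart_tasks_directly=False every task TTL targets the connector, which may not even be running on the worker that owns the task; with True each TTL targets the task whose configuration carried it.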



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-17621) Reduce logging verbosity on ConsumerGroupHeartbeat path

2024-09-26 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-17621.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

> Reduce logging verbosity on ConsumerGroupHeartbeat path
> ---
>
> Key: KAFKA-17621
> URL: https://issues.apache.org/jira/browse/KAFKA-17621
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: David Jacot
>Assignee: David Jacot
>Priority: Major
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: New release branch 3.9

2024-09-26 Thread Greg Harris
Hi Colin,

It has been brought to my attention that Java 23 is now GA, and Kafka
clients, brokers, and connect are now experiencing
UnsupportedOperationExceptions by default due to our use of the
deprecated-for-removal SecurityManager.
This only happens upon upgrading to Java 23, so this is not a regression.
Users can work around this issue themselves by setting a system property or
by not upgrading to Java 23.

I have implemented a patch to avoid these UnsupportedOperationExceptions,
and effectively make Kafka compatible with Java 23 by default.
Issue: https://issues.apache.org/jira/browse/KAFKA-17078 and the associated
PR: https://github.com/apache/kafka/pull/16522
The patch has low risk for users who don't use SecurityManager. It has a
moderate risk for users that use SecurityManager, as we don't have good
visibility into those use-cases.

Is this something that you want to include in 3.9.0?

Thanks,
Greg

On Wed, Sep 25, 2024 at 2:31 PM José Armando García Sancio
 wrote:

> Hi Colin,
>
> We found a bug that we should fix for 3.9.0
> (https://issues.apache.org/jira/browse/KAFKA-17608). Alyssa is going
> to work on the fix and we expect a PR soon.
>
> Thanks,
> -José
>


[jira] [Resolved] (KAFKA-17563) Move RequestConvertToJson to server module

2024-09-26 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-17563.

Fix Version/s: 4.0.0
   Resolution: Fixed

> Move RequestConvertToJson to server module
> --
>
> Key: KAFKA-17563
> URL: https://issues.apache.org/jira/browse/KAFKA-17563
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: kangning.li
>Priority: Major
> Fix For: 4.0.0
>
>
> as title



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: New release branch 3.9

2024-09-26 Thread Colin McCabe
Hi all,

I wanted to give a quick update on the 3.9.0 release.

We had several fixes in the last few days (KAFKA-17459, KAFKA-17584, etc.) and 
uncovered two new blocker bugs which now have PRs available (KAFKA-17608, 
KAFKA-17604). Once those are in, I'll create a new RC.

thanks, all.
Colin

On Thu, Sep 26, 2024, at 10:12, Colin McCabe wrote:
> Hi Greg,
>
> Thank you for working on this. Isn't this work part of KIP-1006? That 
> KIP hasn't even been approved (although it seems very likely to be so). 
> Certainly it isn't in 3.9. The PR is also kind of large with 16 files 
> changed. I think supporting new JDK versions that we haven't supported 
> before should be a 4.0 thing.
>
> best,
> Colin
>
>
> On Thu, Sep 26, 2024, at 09:44, Greg Harris wrote:
>> Hi Colin,
>>
>> It has been brought to my attention that Java 23 is now GA, and Kafka
>> clients, brokers, and connect are now experiencing
>> UnsupportedOperationExceptions by default due to our use of the
>> deprecated-for-removal SecurityManager.
>> This only happens upon upgrading to Java 23, so this is not a regression.
>> Users can work around this issue themselves by setting a system property or
>> by not upgrading to Java 23.
>>
>> I have implemented a patch to avoid these UnsupportedOperationExceptions,
>> and effectively make Kafka compatible with Java 23 by default.
>> Issue: https://issues.apache.org/jira/browse/KAFKA-17078 and the associated
>> PR: https://github.com/apache/kafka/pull/16522
>> The patch has low risk for users who don't use SecurityManager. It has a
>> moderate risk for users that use SecurityManager, as we don't have good
>> visibility into those use-cases.
>>
>> Is this something that you want to include in 3.9.0?
>>
>> Thanks,
>> Greg
>>
>> On Wed, Sep 25, 2024 at 2:31 PM José Armando García Sancio
>>  wrote:
>>
>>> Hi Colin,
>>>
>>> We found a bug that we should fix for 3.9.0
>>> (https://issues.apache.org/jira/browse/KAFKA-17608). Alyssa is going
>>> to work on the fix and we expect a PR soon.
>>>
>>> Thanks,
>>> -José
>>>


Re: New release branch 3.9

2024-09-26 Thread Greg Harris
Hey Colin,

KIP-1006 and the associated vote are about removing support for
SecurityManager, which is a user-facing breaking change.
The PR is an internal-facing, backwards-compatible change that I motivated
and described in the KIP as an interim solution, and doesn't require the
vote to pass in order to be merged.

> I think supporting new JDK versions that we haven't supported before
> should be a 4.0 thing.

Sure, especially as we don't have any infrastructure for testing Java 23
set up at the moment. This is possibly the first and most obvious
compatibility problem, but there could be more.

I'll plan on including this in 3.9.1, thanks!
Greg

On Thu, Sep 26, 2024 at 10:16 AM Colin McCabe  wrote:

> Hi all,
>
> I wanted to give a quick update on the 3.9.0 release.
>
> We had several fixes in the last few days (KAFKA-17459, KAFKA-17584, etc.)
> and uncovered two new blocker bugs which now have PRs available
> (KAFKA-17608, KAFKA-17604). Once those are in, I'll create a new RC.
>
> thanks, all.
> Colin
>
> On Thu, Sep 26, 2024, at 10:12, Colin McCabe wrote:
> > Hi Greg,
> >
> > Thank you for working on this. Isn't this work part of KIP-1006? That
> > KIP hasn't even been approved (although it seems very likely to be so).
> > Certainly it isn't in 3.9. The PR is also kind of large with 16 files
> > changed. I think supporting new JDK versions that we haven't supported
> > before should be a 4.0 thing.
> >
> > best,
> > Colin
> >
> >
> > On Thu, Sep 26, 2024, at 09:44, Greg Harris wrote:
> >> Hi Colin,
> >>
> >> It has been brought to my attention that Java 23 is now GA, and Kafka
> >> clients, brokers, and connect are now experiencing
> >> UnsupportedOperationExceptions by default due to our use of the
> >> deprecated-for-removal SecurityManager.
> >> This only happens upon upgrading to Java 23, so this is not a regression.
> >> Users can work around this issue themselves by setting a system property or
> >> by not upgrading to Java 23.
> >>
> >> I have implemented a patch to avoid these UnsupportedOperationExceptions,
> >> and effectively make Kafka compatible with Java 23 by default.
> >> Issue: https://issues.apache.org/jira/browse/KAFKA-17078 and the associated
> >> PR: https://github.com/apache/kafka/pull/16522
> >> The patch has low risk for users who don't use SecurityManager. It has a
> >> moderate risk for users that use SecurityManager, as we don't have good
> >> visibility into those use-cases.
> >>
> >> Is this something that you want to include in 3.9.0?
> >>
> >> Thanks,
> >> Greg
> >>
> >> On Wed, Sep 25, 2024 at 2:31 PM José Armando García Sancio
> >>  wrote:
> >>
> >>> Hi Colin,
> >>>
> >>> We found a bug that we should fix for 3.9.0
> >>> (https://issues.apache.org/jira/browse/KAFKA-17608). Alyssa is going
> >>> to work on the fix and we expect a PR soon.
> >>>
> >>> Thanks,
> >>> -José
> >>>
>


Re: So long, Jenkins 👋

2024-09-26 Thread Chia-Ping Tsai
> I'm not sure caching is even that useful beyond the time
> between the branch point and the .0 release (since the rate of change slows
> way down after a release).

I try to keep us optimistic. 🙂

With the restore keys provided by setup-gradle, CI will always find a cache to 
restore. While some task outputs might not be reusable, at least we avoid 
downloading all dependencies again.

By the way, the bulk of the heavy dependencies comes from different versions of 
rocksdb required by the upgrade-system-tests-xxx.

In short, the cache remains valuable even for branches with slower changes.
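
The fallback described above can be modelled with a short Python sketch. It assumes the usual GitHub Actions cache behaviour (an exact key match wins; otherwise each restore key is tried in order as a prefix and the newest matching entry is used). pick_cache and the key names in the usage note are hypothetical and are not the keys that setup-gradle actually generates.

```python
def pick_cache(entries, key, restore_keys):
    """Pick the cache entry to restore.

    entries: list of (cache_key, created_at) tuples.
    key: the exact key for this build.
    restore_keys: ordered list of fallback prefixes.
    Returns the chosen cache key, or None if nothing matches.
    """
    # 1. An exact hit always wins.
    for cache_key, _ in entries:
        if cache_key == key:
            return cache_key
    # 2. Otherwise try each prefix in order, newest matching entry first.
    for prefix in restore_keys:
        matches = [(created, k) for k, created in entries if k.startswith(prefix)]
        if matches:
            return max(matches)[1]
    return None
```

For example, a build on a new release branch with key "gradle-4.0-zzz" and restore keys ["gradle-4.0-", "gradle-"] would fall back to the newest "gradle-" entry, so downloaded dependencies can be reused even when task outputs cannot.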

Best,
Chia-Ping

On 2024/09/26 14:13:55 David Arthur wrote:
> We can probably get the new CI working on older release branches, it will
> just take a bit of effort. As a start, we can just disable the build cache
> for these builds. I'm not sure caching is even that useful beyond the time
> between the branch point and the .0 release (since the rate of change slows
> way down after a release). There is also a 10 GB limit for our total cache
> items, which we are pretty close to already.
> 
> On Thu, Sep 26, 2024 at 9:51 AM Chia-Ping Tsai  wrote:
> 
> > It seems we need to promote approve-workflows.py to all committers 😀
> >
> > On Thu, Sep 26, 2024 at 9:42 PM Josep Prat wrote:
> >
> > > I see you have the python script under "committer-tools", I guess I might
> > > need to get used to calling that script instead of going to the "pulls"
> > > page.
> > >
> > > Best,
> > >
> > > On Thu, Sep 26, 2024 at 3:36 PM Josep Prat  wrote:
> > >
> > >> Hi David,
> > >> I think we need a way to flag in the PR list (
> > >> github.com/apache/kafka/pulls) the ones that are waiting for a committer
> > >> to approve the workflows. As an example:
> > >> [image: image.png]
> > >> This PR has a green checkmark where the check status usually goes. But if
> > >> one navigates to the PR in question, one can see that the CI tasks didn't
> > >> start and wait for a committer to approve and run.
> > >> [image: image.png]
> > >> Do you have another way to identify these PRs? Or should we maybe work on
> > >> auto labelling PRs from non-committers (the ones that would wait for CI to
> > >> run).
> > >>
> > >> On Thu, Sep 26, 2024 at 11:00 AM Josep Prat wrote:
> > >>
> > >>> That's what I feared
> > >>>
> > >>> On Thu, Sep 26, 2024 at 10:31 AM Chia-Ping Tsai 
> > >>> wrote:
> > >>>
> > >>>> hi Josep
> > >>>>
> > >>>> > Do you see any potential impact if we backport the change to those?
> > >>>>
> > >>>> In my opinion, the main concern is that non-trunk PRs can't effectively
> > >>>> leverage the cache, meaning they require more time and resources to run CI.
> > >>>> Additionally, github-ci is triggered by trunk branch only, and we have not
> > >>>> tested it on non-trunk branch yet. Given that 3.9.0 and 3.8.1 releases are
> > >>>> processing, we could continue using Jenkins CI to avoid the additional
> > >>>> overhead of backporting.
> > >>>>
> > >>>> By the way, we'll eventually need to backport GitHub CI to the non-trunk
> > >>>> branches once the 4.1 branch is created.
> > >>>>
> > >>>> Best,
> > >>>> Chia-Ping
> > >>>>
> > >>>> On Thu, Sep 26, 2024 at 4:15 PM Chia-Ping Tsai wrote:
> > >>>>
> > >>>> > Thanks to David for providing us with an improved CI!
> > >>>> >
> > >>>> > Cheers,
> > >>>> > Chia-Ping
> > >>>> >
> > >>>> > On Thu, Sep 26, 2024 at 8:51 AM David Arthur wrote:
> > >>>> >
> > >>>> >> Today, we disabled the Jenkins build on trunk. With this change, we should
> > >>>> >> now be expecting all green status checks on PRs before merging. Of course,
> > >>>> >> flaky tests still exist, but generally speaking we should have green builds
> > >>>> >> (see KIP-1090 for some plans on flaky tests).
> > >>>> >>
> > >>>> >> Any committer or "collaborator" (as defined in .asf.yaml) is able to
> > >>>> >> manually re-run a GitHub Action via the UI.
> > >>>> >>
> > >>>> >> For non-committers, someone must approve the workflow. There is a
> > >>>> >> "approve-workflows.py" script in committer-tools to help with this. I'm
> > >>>> >> still investigating options to improve this.
> > >>>> >>
> > >>>> >> We will keep the Jenkins build enabled for 3.9 and other release branches.
> > >>>> >>
> > >>>> >> Cheers,
> > >>>> >> David A
> > >>>> >>
> > >>>> >
> > >>>>
> > >>>
> > >>>
> > >>>
> 

Re: New release branch 3.9

2024-09-26 Thread Colin McCabe
Hi Greg,

Thank you for working on this. Isn't this work part of KIP-1006? That KIP 
hasn't even been approved (although it seems very likely to be so). Certainly 
it isn't in 3.9. The PR is also kind of large with 16 files changed. I think 
supporting new JDK versions that we haven't supported before should be a 4.0 
thing.

best,
Colin


On Thu, Sep 26, 2024, at 09:44, Greg Harris wrote:
> Hi Colin,
>
> It has been brought to my attention that Java 23 is now GA, and Kafka
> clients, brokers, and connect are now experiencing
> UnsupportedOperationExceptions by default due to our use of the
> deprecated-for-removal SecurityManager.
> This only happens upon upgrading to Java 23, so this is not a regression.
> Users can work around this issue themselves by setting a system property or
> by not upgrading to Java 23.
>
> I have implemented a patch to avoid these UnsupportedOperationExceptions,
> and effectively make Kafka compatible with Java 23 by default.
> Issue: https://issues.apache.org/jira/browse/KAFKA-17078 and the associated
> PR: https://github.com/apache/kafka/pull/16522
> The patch has low risk for users who don't use SecurityManager. It has a
> moderate risk for users that use SecurityManager, as we don't have good
> visibility into those use-cases.
>
> Is this something that you want to include in 3.9.0?
>
> Thanks,
> Greg
>
> On Wed, Sep 25, 2024 at 2:31 PM José Armando García Sancio
>  wrote:
>
>> Hi Colin,
>>
>> We found a bug that we should fix for 3.9.0
>> (https://issues.apache.org/jira/browse/KAFKA-17608). Alyssa is going
>> to work on the fix and we expect a PR soon.
>>
>> Thanks,
>> -José
>>


[jira] [Resolved] (KAFKA-17586) AsyncKafkaConsumer#seek should NOT wait the completion of background

2024-09-26 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-17586.

Resolution: Not A Problem

> AsyncKafkaConsumer#seek should NOT wait the completion of background
> ---
>
> Key: KAFKA-17586
> URL: https://issues.apache.org/jira/browse/KAFKA-17586
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Chia-Ping Tsai
>Assignee: TaiJuWu
>Priority: Major
>  Labels: kip-848-client-support
>
> see https://github.com/apache/kafka/pull/17230/files#r1768666772



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


RE: [VOTE] KIP-1052: Enable warmup in producer performance test

2024-09-26 Thread Welch, Matt
Hi Chia-Ping

Earlier in the discussion phase, Federico Valeri also proposed the idea of 
a "producer autopilot" which could switch to a steady-state mode once the p99 
was acceptably stable.
I agree with both of you that an automatic warmup would be a very useful 
addition to the producer performance tooling, but creating a universal 
definition of "stable" and the automatic detection of stability are both 
somewhat complex. For these reasons, I think it's best that automatic warmup is 
contained in a follow-on KIP. I've attempted to include your ideas in the 
Rejected Alternatives section of the KIP.
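
For readers following the thread, the record-count approach that the KIP keeps can be sketched roughly as below. This is a Python illustration and not the ProducerPerformance code: summarize, warmup_records and the nearest-rank p99 calculation are assumptions made for the example.

```python
import math


def summarize(latencies_ms, warmup_records):
    """Report stats for the whole run and for the steady state only.

    The first `warmup_records` latencies are treated as warmup and are
    excluded from the steady-state figures.
    """

    def stats(sample):
        if not sample:
            return None
        ordered = sorted(sample)
        # Nearest-rank 99th percentile.
        rank = max(1, math.ceil(0.99 * len(ordered)))
        return {
            "count": len(ordered),
            "avg_ms": sum(ordered) / len(ordered),
            "p99_ms": ordered[rank - 1],
        }

    return {
        "all": stats(latencies_ms),
        "steady_state": stats(latencies_ms[warmup_records:]),
    }
```

A fixed record count keeps the split trivial, as above; an automatic variant would have to replace the slice with some stability test on the latency stream, which is the part that is hard to define universally.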

Thanks,
Matt

-----Original Message-----
From: Chia-Ping Tsai  
Sent: Tuesday, September 24, 2024 4:22 AM
To: dev@kafka.apache.org
Subject: RE: [VOTE] KIP-1052: Enable warmup in producer performance test

hi Matt

Apologies for the delayed response. I completely agree that ProducerPerformance 
should separate warmup statistics from steady-state metrics. Overall +1, but I 
have a small question:

Have you considered implementing an explicit check for warmup instead of 
relying on sending a set number of records? My concern is that users may not 
know how many records are necessary to complete the warmup. Rather than relying 
on a rule of thumb, ProducerPerformance could check the metadata and node 
latency (via metrics) to ensure that the node information (such as connection, 
DNS, and metadata) is ready. 

In summary, this approach introduces a flag called enable-warmup instead of 
using warmup-records. The advantage is that users no longer need to specify the 
number of warmup records. When the flag is enabled, ProducerPerformance will 
continue sending warmup records until the node information is fully ready.

Best,
Chia-Ping

On 2024/09/04 22:20:15 "Welch, Matt" wrote:
> Hi Kafka devs,
> 
> Bumping this VOTE thread again for visibility.
> 
> Thanks,
> Matt
> 
> -----Original Message-----
> From: Welch, Matt 
> Sent: Friday, August 23, 2024 4:26 PM
> To: dev@kafka.apache.org
> Subject: RE: [VOTE] KIP-1052: Enable warmup in producer performance 
> test
> 
> Hi Kafka devs,
> 
> Bumping this VOTE thread for visibility.
> 
> Thanks,
> Matt
> 
> -----Original Message-----
> From: Federico Valeri 
> Sent: Monday, August 19, 2024 12:38 AM
> To: dev@kafka.apache.org
> Subject: Re: [VOTE] KIP-1052: Enable warmup in producer performance 
> test
> 
> Hi Matt, +1 (non binding) from me. Thanks!
> 
> Just a suggestion: I think that the following output line does not add much 
> value and could be removed.
> 
> "Warmup first 10 records. Steady-state results will print after the 
> complete-test summary."
> 
> On Wed, Aug 14, 2024 at 8:06 PM Welch, Matt  wrote:
> >
> >
> > Hi all,
> >
> > It seems discussion has been quiet for a couple of weeks so I'd like 
> > to call a vote on KIP-1052 
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1052%3A+Enable+warmup+in+producer+performance+test
> >
> > Thanks,
> > Matt Welch
> >
> 


Re: [VOTE] KIP-1052: Enable warmup in producer performance test

2024-09-26 Thread Chia-Ping Tsai
hi Matt

thanks for your response. +1 (binding)

Best,
Chia-Ping

On Fri, Sep 27, 2024 at 2:31 AM Welch, Matt wrote:

> Hi Chia-Ping
>
> Earlier in the discussion phase, Federico Valeri also proposed the
> idea of a "producer autopilot" which could switch to a steady-state mode
> once the p99 was acceptably stable.
> I agree with both of you that an automatic warmup would be a very useful
> addition to the producer performance tooling, but creating a universal
> definition of "stable" and the automatic detection of stability are both
> somewhat complex. For these reasons, I think it's best that automatic
> warmup is contained in a follow-on KIP. I've attempted to include your
> ideas in the Rejected Alternatives section of the KIP.
>
> Thanks,
> Matt
>
> -----Original Message-----
> From: Chia-Ping Tsai 
> Sent: Tuesday, September 24, 2024 4:22 AM
> To: dev@kafka.apache.org
> Subject: RE: [VOTE] KIP-1052: Enable warmup in producer performance test
>
> hi Matt
>
> Apologies for the delayed response. I completely agree that
> ProducerPerformance should separate warmup statistics from steady-state
> metrics. Overall +1, but I have a small question:
>
> Have you considered implementing an explicit check for warmup instead of
> relying on sending a set number of records? My concern is that users may
> not know how many records are necessary to complete the warmup. Rather than
> relying on a rule of thumb, ProducerPerformance could check the metadata
> and node latency (via metrics) to ensure that the node information (such as
> connection, DNS, and metadata) is ready.
>
> In summary, this approach introduces a flag called enable-warmup instead
> of using warmup-records. The advantage is that users no longer need to
> specify the number of warmup records. When the flag is enabled,
> ProducerPerformance will continue sending warmup records until the node
> information is fully ready.
>
> Best,
> Chia-Ping
>
> On 2024/09/04 22:20:15 "Welch, Matt" wrote:
> > Hi Kafka devs,
> >
> > Bumping this VOTE thread again for visibility.
> >
> > Thanks,
> > Matt
> >
> > -Original Message-
> > From: Welch, Matt 
> > Sent: Friday, August 23, 2024 4:26 PM
> > To: dev@kafka.apache.org
> > Subject: RE: [VOTE] KIP-1052: Enable warmup in producer performance
> > test
> >
> > Hi Kafka devs,
> >
> > Bumping this VOTE thread for visibility.
> >
> > Thanks,
> > Matt
> >
> > -Original Message-
> > From: Federico Valeri 
> > Sent: Monday, August 19, 2024 12:38 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [VOTE] KIP-1052: Enable warmup in producer performance
> > test
> >
> > Hi Matt, +1 (non binding) from me. Thanks!
> >
> > Just a suggestion: I think that the following output line does not add
> much value and could be removed.
> >
> > "Warmup first 10 records. Steady-state results will print after the
> complete-test summary."
> >
> > On Wed, Aug 14, 2024 at 8:06 PM Welch, Matt 
> wrote:
> > >
> > >
> > > Hi all,
> > >
> > > It seems discussion has been quiet for a couple of weeks so I'd like
> > > to call a vote on KIP-1052
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1052%3A+Enable+warmup+in+producer+performance+test
> > >
> > > Thanks,
> > > Matt Welch
> > >
> >
>
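
The point made in this thread, that ProducerPerformance should report warmup statistics separately from steady-state metrics, can be sketched roughly as follows. This is a minimal Python model, not the ProducerPerformance implementation: the function names `split_warmup` and `percentile`, the nearest-rank percentile rule, and the sample latencies are all assumptions made for illustration.

```python
def percentile(samples, pct):
    """Nearest-rank percentile (integer pct) of a non-empty list of latencies."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, clamped to at least rank 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def split_warmup(latencies_ms, warmup_records):
    """Return (whole_test_stats, steady_state_stats).

    The first `warmup_records` samples are treated as warmup and excluded
    from the steady-state figures. This mirrors the idea of a warmup-records
    style option; it is a hypothetical model, not Kafka code.
    """
    if warmup_records < 0:
        raise ValueError("warmup_records must be non-negative")

    def stats(samples):
        if not samples:
            return None  # nothing left to summarize
        return {
            "count": len(samples),
            "avg_ms": sum(samples) / len(samples),
            "p99_ms": percentile(samples, 99),
        }

    steady = latencies_ms[warmup_records:]
    return stats(latencies_ms), stats(steady)


# Slow first records (connection setup, metadata fetch), then steady state.
latencies = [250, 180, 90] + [5] * 97
whole, steady = split_warmup(latencies, warmup_records=3)
print("whole test  :", whole)
print("steady state:", steady)
```

With a fixed record count the user still has to guess how many records the warmup needs, which is the concern behind the enable-warmup suggestion in the thread. Detecting readiness automatically would replace the `warmup_records` slice with a stability check, which this sketch deliberately leaves out.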


[jira] [Resolved] (KAFKA-16683) Extract security-related helpers from scala.TestUtils to java class

2024-09-26 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-16683.

Fix Version/s: 4.0.0
   Resolution: Fixed

> Extract security-related helpers from scala.TestUtils to java class
> ---
>
> Key: KAFKA-16683
> URL: https://issues.apache.org/jira/browse/KAFKA-16683
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: PoAn Yang
>Priority: Minor
> Fix For: 4.0.0
>
>
> We can merge them into `JaasTestUtils` and then rename `JaasTestUtils` to
> `SecurityTestUtils`.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17628) Automate workflow approvals for CI

2024-09-26 Thread David Arthur (Jira)
David Arthur created KAFKA-17628:


 Summary: Automate workflow approvals for CI
 Key: KAFKA-17628
 URL: https://issues.apache.org/jira/browse/KAFKA-17628
 Project: Kafka
  Issue Type: Improvement
  Components: build
Reporter: David Arthur
Assignee: David Arthur


Now that we have switched to GitHub Actions for our CI, we need to make it 
easier for non-committer contributors to have their PRs built. With Jenkins, a 
push to the branch would trigger a build. However, with GitHub Actions, we must 
manually approve each workflow. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-1090 Flaky Tests 👻

2024-09-26 Thread David Arthur
If there is no more feedback on this, I'll go ahead and move to a vote.

-David

On Sun, Sep 22, 2024 at 11:04 AM Chia-Ping Tsai  wrote:

>
>
> > On Sep 22, 2024, at 10:07 PM, David Arthur wrote:
> >
> > Q2: Yes, I think we should run the quarantined tests on all CI builds, PRs
> > and trunk. We can achieve this with --rerun-tasks. This will let PR authors
> > gain feedback about their changes' effect on the flaky tests. We could even
> > create a PR-specific report that shows if their changes improved or
> > worsened the flakiness of the quarantined tests.
>
> I guess it will be an individual status like “Gradle Build Scan”? The
> failure of quarantined tests does not block us from merging a PR unless
> the goal of the PR is to fix a specific flaky test.
>
> Thanks,
> Chia-Ping



-- 
David Arthur


[jira] [Resolved] (KAFKA-17277) Add version mapping and feature dependencies commands to tools (KIP-1022)

2024-09-26 Thread Justine Olshan (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justine Olshan resolved KAFKA-17277.

Resolution: Fixed

> Add version mapping and feature dependencies commands to tools (KIP-1022)
> -
>
> Key: KAFKA-17277
> URL: https://issues.apache.org/jira/browse/KAFKA-17277
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ritika Reddy
>Assignee: Ritika Reddy
>Priority: Major
>
> As a part of KIP-1022 the following features need to be implemented:
>  * Add {{version-mapping}} command to look up the corresponding features
> for a given metadata version. Using the command with no
> {{--release-version}} argument will return the mapping for the latest stable
> metadata version.
>  * Add {{feature-dependencies}} command to look up dependencies for a given 
> feature version supplied by {{--feature}} flag. If the feature is not known 
> or the version not yet defined, throw an error. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-1092: Extend Consumer#close with an option to leave the group or not

2024-09-26 Thread Kirk True
Hi all,

I see that leaveGroup has been renamed as releaseStaticMember. I hate to ask, 
but what is the behavior when a user sets this to true for a non-static member? 
Should it just work, log a warning, throw an error, or something else?

I think I’m actually OK with leaving it as leaveGroup with a lot of 
documentation that warns users away from changing it arbitrarily.

Thanks!

> On Sep 26, 2024, at 7:32 AM, TengYao Chi  wrote:
> 
> Hi Chia-Ping
> 
> I got your point and updated the content of KIP accordingly.
> Please take a look and let me know what you think.
> 
> Best,
> TengYao
> 
> On Thu, Sep 26, 2024 at 8:41 PM Chia-Ping Tsai wrote:
> 
>> Dear all,
>> 
>> The main purpose is to allow consumers to leave a group permanently, even
>> if they have a static member ID. Additionally, we don't have insights into
>> the use case where a dynamic member might not want to leave the group.
>> 
>> Therefore, should we enhance the close option to support our goal—allowing
>> static members to leave the group when closing the consumer? This means the
>> flag should be renamed to 'releaseStaticMember'.
>> 
>> Best,
>> Chia-Ping
>> 
>> On 2024/09/25 03:22:17 TengYao Chi wrote:
>>> Hi Kirk,
>>> 
>>> Thanks for your feedback and questions!
>>> 
>>> KT1:
>>> While it might seem that there’s no immediate benefit for non-Kafka Streams
>>> applications to set `leaveGroup=false`, we still offer this option for
>>> consistency with `KafkaStreams#CloseOptions` and to provide flexibility for
>>> future features. However, I agree with your concern that users might misuse
>>> this option, leading to potential throughput issues. To avoid this, we
>>> could include documentation that strongly recommends setting
>>> `leaveGroup=true` unless the user fully understands the implications of
>>> leaving it as `false`.
>>>
>>> KT2:
>>> Thanks for the reminder.
>>> I have updated the KIP to include the actual API changes. Please take a
>>> look.
>>>
>>> KT3:
>>> I believe we should respect the timeout setting, as this offers several key
>>> advantages. First, as Chia-Ping mentioned, it gives users more control over
>>> the close process, allowing them to better manage their application flow.
>>> Second, as a public interface, users will expect the timeout parameter to
>>> act as a contract between Kafka and the user. If the consumer does not
>>> respect this setting, it could lead to confusion and force users to
>>> implement unnecessary workarounds. Ensuring that the `close()` operation
>>> honors the provided timeout will lead to a more predictable and
>>> user-friendly API.
>>> 
>>> Thanks again for your questions.
>>> Let me know if you have any further comments.
>>> 
>>> Best,
>>> TengYao
>>> 
>>> On Wed, Sep 25, 2024 at 4:51 AM Chia-Ping Tsai wrote:
>>> 
>>>> hi Kirk
>>>>
>>>> > KT1: Why would a non-Kafka Streams application want to set
>>>> leaveGroup=false? Because Kafka Streams manages the group membership
>>>> assignment under the covers, it can re-assign partitions to a new Consumer
>>>> when the old one closes. But in a non-Kafka Streams application, doesn’t
>>>> this just leave the partitions assigned until the coordinator kicks the
>>>> member out of the group?
>>>>
>>>> That's definitely a great question. I don’t have a clear answer regarding
>>>> whether leaveGroup=false means not sending a LEAVE_GROUP request, but I
>>>> have a crazy idea: what if leaveGroup=false means temporarily leaving the
>>>> group? My idea is that we could potentially integrate dynamic members with
>>>> static members.
>>>>
>>>> Here’s how it could work with changes to the new protocol and consumer:
>>>>
>>>>   1. The new consumer will exclusively use static members. If a user
>>>>   doesn't specify one, the new consumer will automatically generate it
>>>>   (similar to KIP-1082).
>>>>   2. Users can define a close option to either leave the group permanently
>>>>   or temporarily.
>>>>
>>>> This approach offers several advantages:
>>>>
>>>>   1. It addresses the needs of Kafka Streams (if they start to use new
>>>>   consumer)
>>>>   2. It simplifies the new coordinator, protocol, and consumer, as they no
>>>>   longer need to handle two types of members.
>>>>   3. It addresses KIP-1082. The new consumer always sends a LEAVE_GROUP
>>>>   request during closing, even if it hasn't received a response from the
>>>>   server.
>>>>
>>>> > KT3: Does setting leaveGroup=true carry the *guarantee* that the member
>>>> will leave the group? I’m currently battling some edge cases where close()
>>>> is called with a zero timeout and it times out before the consumer can
>>>> leave the group cleanly.
>>>>
>>>> In my opinion, this is a trade-off between 'not honoring the timeout' and
>>>> 'creating ghost members.' However, I prefer honoring the timeout, as it
>>>> provides users with more control over the 'close' process. We should trust
>>>> that users fully understand the options they choose.
>>>>
>>>>
>>>> Best,
>>>> Chia-Ping
>>

Re: [DISCUSS] KIP-1094 Add a new constructor method with nextOffsets to ConsumerRecords

2024-09-26 Thread Bill Bejeck
Hi Alieh,

Thanks for the KIP, it will be very useful to Kafka Streams.
I have one comment.  In the "Proposed Changes" section, you mention that
"The `nextOffsets` object contains the next offset and the last leader
epoch per partition".
If I understand the KIP correctly, it should be something along the lines of
"The `nextOffsets` method returns a map of `TopicPartition` to
`OffsetAndMetadata` objects and `OffsetAndMetadata` contains the next
offset and the last leader epoch per partition"

Other than that, the KIP LGTM.

Thanks,
Bill


On Wed, Sep 25, 2024 at 7:43 AM Alieh Saeedi 
wrote:

> Hi all,
>
> I would like to open a discussion for KIP-1094: Add a new constructor
> method with `nextOffsets` to `ConsumerRecords`.
>
> You can find the detailed proposal here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1094%3A+Add+a+new+constructor+method+with+nextOffsets+to+ConsumerRecords
>
> I look forward to your feedback and suggestions.
>
> Thanks,
> Alieh
>
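
The wording suggested above describes `nextOffsets` as a method returning a map from `TopicPartition` to `OffsetAndMetadata`, where each value carries the next offset and the last leader epoch. The sketch below only models that data shape in Python so the relationship is easy to see. The class and method names mirror the Kafka names for readability, but everything here is a hypothetical stand-in and not the Java client API; in particular, the next offsets are passed in by the caller rather than derived from the records.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TopicPartition:
    topic: str
    partition: int


@dataclass(frozen=True)
class OffsetAndMetadata:
    offset: int                         # next offset for the partition
    leader_epoch: Optional[int] = None  # last known leader epoch, if any
    metadata: str = ""


class ConsumerRecords:
    """Hypothetical model of a record batch that also carries next offsets."""

    def __init__(self,
                 records: Dict[TopicPartition, List[str]],
                 next_offsets: Dict[TopicPartition, OffsetAndMetadata]):
        self._records = dict(records)
        self._next_offsets = dict(next_offsets)

    def records(self, tp: TopicPartition) -> List[str]:
        return list(self._records.get(tp, []))

    def next_offsets(self) -> Dict[TopicPartition, OffsetAndMetadata]:
        # Return a copy so callers cannot mutate the internal map.
        return dict(self._next_offsets)


tp = TopicPartition("orders", 0)
batch = ConsumerRecords(
    records={tp: ["a", "b"]},
    next_offsets={tp: OffsetAndMetadata(offset=42, leader_epoch=7)},
)
for partition, position in batch.next_offsets().items():
    print(partition, position.offset, position.leader_epoch)
```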


Re: [VOTE] KIP-1090 Flaky Test Management

2024-09-26 Thread Chia-Ping Tsai
+1

nit: Could you please add the KIP link to KAFKA-17629

On Fri, Sep 27, 2024 at 6:31 AM David Arthur wrote:

> I would like to call a vote for KIP-1090. Please take a moment to review
> the proposal and cast your vote.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1090+Flaky+Test+Management
>
> Thanks!
> David A
>


[jira] [Resolved] (KAFKA-17482) Make share partition initialization async

2024-09-26 Thread Apoorv Mittal (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apoorv Mittal resolved KAFKA-17482.
---
Resolution: Fixed

> Make share partition initialization async
> -
>
> Key: KAFKA-17482
> URL: https://issues.apache.org/jira/browse/KAFKA-17482
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Apoorv Mittal
>Assignee: Apoorv Mittal
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-17483) Complete pending fetch request on broker shutdown

2024-09-26 Thread Apoorv Mittal (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apoorv Mittal resolved KAFKA-17483.
---
Resolution: Fixed

> Complete pending fetch request on broker shutdown
> -
>
> Key: KAFKA-17483
> URL: https://issues.apache.org/jira/browse/KAFKA-17483
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Apoorv Mittal
>Assignee: Apoorv Mittal
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-1090 Flaky Test Management

2024-09-26 Thread Bill Bejeck
+1

On Thu, Sep 26, 2024 at 8:08 PM Lianet M.  wrote:

> +1.
>
> Thanks David!
>
>
> On Thu, Sep 26, 2024, 7:47 p.m. Matthias J. Sax  wrote:
>
> > +1
> >
> > On 9/26/24 3:38 PM, Chia-Ping Tsai wrote:
> > > +1
> > >
> > > nit: Could you please add the KIP link to KAFKA-17629
> > >
> > > On Fri, Sep 27, 2024 at 6:31 AM David Arthur wrote:
> > >
> > >> I would like to call a vote for KIP-1090. Please take a moment to
> review
> > >> the proposal and cast your vote.
> > >>
> > >>
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1090+Flaky+Test+Management
> > >>
> > >> Thanks!
> > >> David A
> > >>
> > >
> >
>


Re: [DISCUSS] KIP-1092: Extend Consumer#close with an option to leave the group or not

2024-09-26 Thread Chia-Ping Tsai
> I think I’m actually OK with leaving it as leaveGroup with a lot of
documentation that warns users away from changing it arbitrarily.

Pardon me, I just want to ensure we are all on the same page.

   1. `leaveGroup=true`:  `ClassicKafkaConsumer` sends a
   `LeaveGroupRequest` for either the dynamic or static member.
   2. `leaveGroup=false`:  `ClassicKafkaConsumer` does not send any
   `LeaveGroupRequest` for either the dynamic or static member.
   3. `leaveGroup=default` (current behavior): `ClassicKafkaConsumer` sends
   a `LeaveGroupRequest` for dynamic member, and it does NOT send any
   `LeaveGroupRequest` for static member
   4. `leaveGroup=true`:  `AsyncKafkaConsumer` sends a
   `ConsumerGroupHeartbeatRequest` with "-1" epoch for either the dynamic or
   static member
   5. `leaveGroup=false`: `AsyncKafkaConsumer` sends a
   `ConsumerGroupHeartbeatRequest` with "-2" epoch for the static member, and
   it does NOT send any `ConsumerGroupHeartbeatRequest` for dynamic member
   6. `leaveGroup=default` (current behavior): `AsyncKafkaConsumer` sends a
   `ConsumerGroupHeartbeatRequest` with "-1" epoch for dynamic member and
   "-2" epoch for static member

Best,
Chia-Ping
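
The six cases listed above can be written down as a small decision function, which makes it easier to check that every combination is covered. This is a sketch in Python of the behavior as described in this message only, not of the consumer code. The function name `request_on_close`, the `"classic"`/`"async"` labels, and the use of `None` for the default setting are assumptions for illustration, and case 3 is read here as meaning that the classic consumer sends nothing for a static member.

```python
from typing import Optional, Tuple

LEAVE_GROUP = "LeaveGroupRequest"
HEARTBEAT = "ConsumerGroupHeartbeatRequest"


def request_on_close(consumer: str,
                     leave_group: Optional[bool],
                     is_static: bool) -> Optional[Tuple[str, Optional[int]]]:
    """Return (request name, member epoch) sent on close, or None if nothing is sent.

    leave_group=None stands for the default (current) behavior.
    The epoch is None for the classic consumer, which has no epoch here.
    """
    if consumer == "classic":
        if leave_group is None:
            send = not is_static      # default: only dynamic members leave
        else:
            send = leave_group        # explicit setting applies to both kinds
        return (LEAVE_GROUP, None) if send else None

    if consumer == "async":
        if leave_group is True:
            return (HEARTBEAT, -1)    # leave for dynamic and static members
        if leave_group is False:
            # Static members send epoch -2; dynamic members send nothing.
            return (HEARTBEAT, -2) if is_static else None
        # Default: -2 for static members, -1 for dynamic members.
        return (HEARTBEAT, -2) if is_static else (HEARTBEAT, -1)

    raise ValueError("unknown consumer type: " + consumer)


for consumer in ("classic", "async"):
    for leave_group in (True, False, None):
        for is_static in (False, True):
            kind = "static" if is_static else "dynamic"
            print(consumer, leave_group, kind,
                  request_on_close(consumer, leave_group, is_static))
```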


Build failed in Jenkins: Kafka » Kafka PowerPC Daily » test-powerpc #69

2024-09-26 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 133567 lines...]
[2024-09-27T04:29:06.042Z] 
[2024-09-27T04:29:06.042Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testEmptyWrite() PASSED
[2024-09-27T04:29:06.042Z] 
[2024-09-27T04:29:06.042Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testReadMigrateAndWriteProducerId() STARTED
[2024-09-27T04:29:07.605Z] 
[2024-09-27T04:29:07.605Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testReadMigrateAndWriteProducerId() PASSED
[2024-09-27T04:29:07.605Z] 
[2024-09-27T04:29:07.605Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testExistingKRaftControllerClaim() STARTED
[2024-09-27T04:29:07.605Z] 
[2024-09-27T04:29:07.605Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testExistingKRaftControllerClaim() PASSED
[2024-09-27T04:29:07.605Z] 
[2024-09-27T04:29:07.605Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testMigrateTopicConfigs() STARTED
[2024-09-27T04:29:09.166Z] 
[2024-09-27T04:29:09.166Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testMigrateTopicConfigs() PASSED
[2024-09-27T04:29:09.166Z] 
[2024-09-27T04:29:09.166Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testNonIncreasingKRaftEpoch() STARTED
[2024-09-27T04:29:09.166Z] 
[2024-09-27T04:29:09.166Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testNonIncreasingKRaftEpoch() PASSED
[2024-09-27T04:29:09.166Z] 
[2024-09-27T04:29:09.166Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testMigrateEmptyZk() STARTED
[2024-09-27T04:29:10.729Z] 
[2024-09-27T04:29:10.729Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testMigrateEmptyZk() PASSED
[2024-09-27T04:29:10.729Z] 
[2024-09-27T04:29:10.729Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testTopicAndBrokerConfigsMigrationWithSnapshots() 
STARTED
[2024-09-27T04:29:10.729Z] 
[2024-09-27T04:29:10.729Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testTopicAndBrokerConfigsMigrationWithSnapshots() 
PASSED
[2024-09-27T04:29:10.729Z] 
[2024-09-27T04:29:10.729Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testClaimAndReleaseExistingController() STARTED
[2024-09-27T04:29:12.293Z] 
[2024-09-27T04:29:12.293Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testClaimAndReleaseExistingController() PASSED
[2024-09-27T04:29:12.293Z] 
[2024-09-27T04:29:12.293Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testClaimAbsentController() STARTED
[2024-09-27T04:29:13.906Z] 
[2024-09-27T04:29:13.906Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testClaimAbsentController() PASSED
[2024-09-27T04:29:13.906Z] 
[2024-09-27T04:29:13.906Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testIdempotentCreateTopics() STARTED
[2024-09-27T04:29:13.906Z] 
[2024-09-27T04:29:13.906Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testIdempotentCreateTopics() PASSED
[2024-09-27T04:29:13.906Z] 
[2024-09-27T04:29:13.906Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testCreateNewTopic() STARTED
[2024-09-27T04:29:15.593Z] 
[2024-09-27T04:29:15.593Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testCreateNewTopic() PASSED
[2024-09-27T04:29:15.593Z] 
[2024-09-27T04:29:15.593Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testUpdateExistingTopicWithNewAndChangedPartitions() 
STARTED
[2024-09-27T04:29:15.593Z] 
[2024-09-27T04:29:15.593Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZkMigrationClientTest > testUpdateExistingTopicWithNewAndChangedPartitions() 
PASSED
[2024-09-27T04:29:15.593Z] 
[2024-09-27T04:29:15.593Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZooKeeperClientTest > testZNodeChangeHandlerForDataChange() STARTED
[2024-09-27T04:29:15.593Z] 
[2024-09-27T04:29:15.593Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZooKeeperClientTest > testZNodeChangeHandlerForDataChange() PASSED
[2024-09-27T04:29:15.593Z] 
[2024-09-27T04:29:15.593Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZooKeeperClientTest > testZooKeeperSessionStateMetric() STARTED
[2024-09-27T04:29:17.156Z] 
[2024-09-27T04:29:17.156Z] Gradle Test Run :core:test > Gradle Test Executor 72 
> ZooKeeperClientTest > testZooKeeperSessionStateMetric() PASSED
[2024-09-27T04:29:17.156Z] 
[2024-09-27T04:29:17.156Z] Gradl

[jira] [Created] (KAFKA-17629) KIP-1090 Flaky Test Management

2024-09-26 Thread David Arthur (Jira)
David Arthur created KAFKA-17629:


 Summary: KIP-1090 Flaky Test Management
 Key: KAFKA-17629
 URL: https://issues.apache.org/jira/browse/KAFKA-17629
 Project: Kafka
  Issue Type: Improvement
  Components: build
Reporter: David Arthur






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[VOTE] KIP-1090 Flaky Test Management

2024-09-26 Thread David Arthur
I would like to call a vote for KIP-1090. Please take a moment to review
the proposal and cast your vote.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-1090+Flaky+Test+Management

Thanks!
David A


Re: [VOTE] KIP-1090 Flaky Test Management

2024-09-26 Thread Matthias J. Sax

+1

On 9/26/24 3:38 PM, Chia-Ping Tsai wrote:

+1

nit: Could you please add the KIP link to KAFKA-17629

On Fri, Sep 27, 2024 at 6:31 AM David Arthur wrote:


I would like to call a vote for KIP-1090. Please take a moment to review
the proposal and cast your vote.


https://cwiki.apache.org/confluence/display/KAFKA/KIP-1090+Flaky+Test+Management

Thanks!
David A





Re: [VOTE] KIP-1090 Flaky Test Management

2024-09-26 Thread Lianet M.
+1.

Thanks David!


On Thu, Sep 26, 2024, 7:47 p.m. Matthias J. Sax  wrote:

> +1
>
> On 9/26/24 3:38 PM, Chia-Ping Tsai wrote:
> > +1
> >
> > nit: Could you please add the KIP link to KAFKA-17629
> >
> > On Fri, Sep 27, 2024 at 6:31 AM David Arthur wrote:
> >
> >> I would like to call a vote for KIP-1090. Please take a moment to review
> >> the proposal and cast your vote.
> >>
> >>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1090+Flaky+Test+Management
> >>
> >> Thanks!
> >> David A
> >>
> >
>


Re: [VOTE] KIP-1090 Flaky Test Management

2024-09-26 Thread David Arthur
> Could you please add the KIP link to KAFKA-17629

Done.

On Thu, Sep 26, 2024 at 8:11 PM Bill Bejeck  wrote:

> +1
>
> On Thu, Sep 26, 2024 at 8:08 PM Lianet M.  wrote:
>
> > +1.
> >
> > Thanks David!
> >
> >
> > On Thu, Sep 26, 2024, 7:47 p.m. Matthias J. Sax 
> wrote:
> >
> > > +1
> > >
> > > On 9/26/24 3:38 PM, Chia-Ping Tsai wrote:
> > > > +1
> > > >
> > > > nit: Could you please add the KIP link to KAFKA-17629
> > > >
> > > > On Fri, Sep 27, 2024 at 6:31 AM David Arthur wrote:
> > > >
> > > >> I would like to call a vote for KIP-1090. Please take a moment to
> > review
> > > >> the proposal and cast your vote.
> > > >>
> > > >>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1090+Flaky+Test+Management
> > > >>
> > > >> Thanks!
> > > >> David A
> > > >>
> > > >
> > >
> >
>


-- 
David Arthur


Re: So long, Jenkins 👋

2024-09-26 Thread David Arthur
Josep, I've filed KAFKA-17628 and submitted a PR to partially automate the
workflow approval. With this change, we just need a committer to add a
label one time to a PR, then it will get auto-approved.

I've also wondered about a "triage" or "new" label that is automatically
added to new PRs. This would make for an easy filter for committers to use
when seeing what needs attention. The trouble with this is we would then
need to remove the label.

-David


On Thu, Sep 26, 2024 at 1:27 PM Chia-Ping Tsai  wrote:

> > I'm not sure caching is even that useful beyond the time
> > between the branch point and the .0 release (since the rate of change
> slows
> > way down after a release).
>
> I try to keep us optimistic. 🙂
>
> With the restore keys provided by setup-gradle, CI will always find a
> cache to restore. While some task outputs might not be reusable, at least
> we avoid downloading all dependencies again.
>
> By the way, the bulk of the heavy dependencies comes from different
> versions of rocksdb required by the upgrade-system-tests-xxx.
>
> In short, the cache remains valuable even for branches with slower changes.
>
> Best,
> Chia-Ping
>
> On 2024/09/26 14:13:55 David Arthur wrote:
> > We can probably get the new CI working on older release branches, it will
> > just take a bit of effort. As a start, we can just disable the build
> cache
> > for these builds. I'm not sure caching is even that useful beyond the
> time
> > between the branch point and the .0 release (since the rate of change
> slows
> > way down after a release). There is also a 10 GB limit for our total cache
> > items, which we are pretty close to already.
> >
> > On Thu, Sep 26, 2024 at 9:51 AM Chia-Ping Tsai 
> wrote:
> >
> > > It seems we need to promote approve-workflows.py to all committers 😀
> > >
> > > On Thu, Sep 26, 2024 at 9:42 PM Josep Prat wrote:
> > >
> > > > I see you have the python script under "committer-tools", I guess I
> might
> > > > need to get used to call that script instead of going to the "pulls"
> > > page.
> > > >
> > > > Best,
> > > >
> > > > On Thu, Sep 26, 2024 at 3:36 PM Josep Prat 
> wrote:
> > > >
> > > >> Hi David,
> > > >> I think we need a way to flag in the PR list (
> > > >> github.com/apache/kafka/pulls) the ones that are waiting for a
> > > committer
> > > >> to approve the workflows. As an example:
> > > >> [image: image.png]
> > > >> This PR has a green checkmark where the check status usually goes.
> But
> > > if
> > > >> one navigates to the PR in question, one can see that the CI tasks
> > > didn't
> > > >> start and wait for a committer to approve and run.
> > > >> [image: image.png]
> > > >> Do you have another way to identify these PRs? Or should we maybe
> work
> > > on
> > > >> auto labelling PRs from non-committers (the ones that would wait
> for CI
> > > to
> > > >> run).
> > > >>
> > > >> On Thu, Sep 26, 2024 at 11:00 AM Josep Prat 
> > > wrote:
> > > >>
> > > >>> That's what I feared
> > > >>>
> > > >>> On Thu, Sep 26, 2024 at 10:31 AM Chia-Ping Tsai <
> chia7...@gmail.com>
> > > >>> wrote:
> > > >>>
> > > > >>>> hi Josep
> > > > >>>>
> > > > >>>> > Do you see any potential impact if we backport the change to those?
> > > > >>>>
> > > > >>>> In my opinion, the main concern is that non-trunk PRs can't effectively
> > > > >>>> leverage the cache, meaning they require more time and resources to run CI.
> > > > >>>> Additionally, github-ci is triggered by trunk branch only, and we have not
> > > > >>>> tested it on non-trunk branch yet. Given that 3.9.0 and 3.8.1 releases are
> > > > >>>> processing, we could continue using Jenkins CI to avoid the additional
> > > > >>>> overhead of backporting.
> > > > >>>>
> > > > >>>> By the way, we'll eventually need to backport GitHub CI to the non-trunk
> > > > >>>> branches once the 4.1 branch is created.
> > > > >>>>
> > > > >>>> Best,
> > > > >>>> Chia-Ping
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> On Thu, Sep 26, 2024 at 4:15 PM Chia-Ping Tsai wrote:
> > > > >>>>
> > > > >>>> > Thanks to David for providing us with an improved CI!
> > > > >>>> >
> > > > >>>> > Cheers,
> > > > >>>> > Chia-Ping
> > > > >>>> >
> > > > >>>> > On Thu, Sep 26, 2024 at 8:51 AM David Arthur wrote:
> > > > >>>> >
> > > > >>>> >> Today, we disabled the Jenkins build on trunk. With this change, we should
> > > > >>>> >> now be expecting all green status checks on PRs before merging. Of course,
> > > > >>>> >> flaky tests still exist, but generally speaking we should have green builds
> > > > >>>> >> (see KIP-1090 for some plans on flaky tests).
> > > > >>>> >>
> > > > >>>> >> Any committer or "collaborator" (as defined in .asf.yaml) is able to
> > > > >>>> >> manually re-run a GitHub Action via the UI.
> > > > >>>> >>
> > > > >>>> >> For non-committers, someone must approve the workflow. There is a
> > > > >>>> >> "approve-workflows.py" script in committer-tools to help with this. I'm
> > > > >>>> >> still investigating options to improve this

Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.9 #78

2024-09-26 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-1094 Add a new constructor method with nextOffsets to ConsumerRecords

2024-09-26 Thread Sophie Blee-Goldman
Should we deprecate the old constructor to make sure that all info gets
passed in when creating a ConsumerRecords instance?

On Thu, Sep 26, 2024 at 3:37 PM Bill Bejeck  wrote:

> Hi Alieh,
>
> Thanks for the KIP, it will be very useful to Kafka Streams.
> I have one comment.  In the "Proposed Changes" section, you mention that
> "The `nextOffsets` object contains the next offset and the last leader
> epoch per partition".
> If I understand the KIP correctly, it should be something along the lines of
> "The `nextOffsets` method returns a map of `TopicPartition` to
> `OffsetAndMetadata` objects and `OffsetAndMetadata` contains the next
> offset and the last leader epoch per partition"
>
> Other than that, the KIP LGTM.
>
> Thanks,
> Bill
>
>
> On Wed, Sep 25, 2024 at 7:43 AM Alieh Saeedi wrote:
>
> > Hi all,
> >
> > I would like to open a discussion for KIP-1094: Add a new constructor
> > method with `nextOffsets` to `ConsumerRecords`.
> >
> > You can find the detailed proposal here:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1094%3A+Add+a+new+constructor+method+with+nextOffsets+to+ConsumerRecords
> >
> > I look forward to your feedback and suggestions.
> >
> > Thanks,
> > Alieh
> >
>


[jira] [Created] (KAFKA-17630) Convert ClientQuotasRequestTest#testClientQuotasForScramUsers to kraft

2024-09-26 Thread PoAn Yang (Jira)
PoAn Yang created KAFKA-17630:
-

 Summary: Convert 
ClientQuotasRequestTest#testClientQuotasForScramUsers to kraft
 Key: KAFKA-17630
 URL: https://issues.apache.org/jira/browse/KAFKA-17630
 Project: Kafka
  Issue Type: Task
Reporter: PoAn Yang
Assignee: PoAn Yang


We would like to remove zk type from ClusterTest, but 
ClientQuotasRequestTest#testClientQuotasForScramUsers has no kraft test. We 
should convert the test to kraft.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17631) Convert SaslApiVersionsRequestTest to kraft

2024-09-26 Thread PoAn Yang (Jira)
PoAn Yang created KAFKA-17631:
-

 Summary: Convert SaslApiVersionsRequestTest to kraft
 Key: KAFKA-17631
 URL: https://issues.apache.org/jira/browse/KAFKA-17631
 Project: Kafka
  Issue Type: Task
Reporter: PoAn Yang
Assignee: PoAn Yang


We would like to remove zk type from ClusterTest, but 
SaslApiVersionsRequestTest has no kraft test. We should convert the test to 
kraft.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-1092: Extend Consumer#close with an option to leave the group or not

2024-09-26 Thread Sophie Blee-Goldman
Thanks for the KIP! Quick request for readability, can you please include
the exact APIs that you're proposing to add or change under the "Public
Interfaces" section? The KIP should display the actual method signature and
any applicable javadocs for new public APIs.

You can look at other KIPs for a clear sense of what it should contain, but
here's one example you could work from:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1036%3A+Extend+RecordDeserializationException+exception

On Thu, Sep 26, 2024 at 6:22 PM Chia-Ping Tsai  wrote:

> > I think I’m actually OK with leaving it as leaveGroup with a lot of
> documentation that warns users away from changing it arbitrarily.
>
> Pardon me, I just want to ensure we are all on the same page.
>
>   1. `leaveGroup=true`:  `ClassicKafkaConsumer` sends a
>   `LeaveGroupRequest` for either the dynamic or static member.
>   2. `leaveGroup=false`:  `ClassicKafkaConsumer` does not send any
>   `LeaveGroupRequest` for either the dynamic or static member.
>   3. `leaveGroup=default` (current behavior): `ClassicKafkaConsumer` sends
>   a `LeaveGroupRequest` for dynamic member, and it does NOT send any
>   `LeaveGroupRequest` for static member
>   4. `leaveGroup=true`:  `AsyncKafkaConsumer` sends a
>   `ConsumerGroupHeartbeatRequest` with "-1" epoch for either the dynamic or
>   static member
>   5. `leaveGroup=false`: `AsyncKafkaConsumer` sends a
>   `ConsumerGroupHeartbeatRequest` with "-2" epoch for the static member, and
>   it does NOT send any `ConsumerGroupHeartbeatRequest` for dynamic member
>   6. `leaveGroup=default` (current behavior): `AsyncKafkaConsumer` sends a
>   `ConsumerGroupHeartbeatRequest` with "-1" epoch for dynamic member and
>   "-2" epoch for static member
>
> Best,
> Chia-Ping
>