Re: [DISCUSS] KIP-1124: Providing a clear Kafka Client upgrade path

2025-02-26 Thread Matthias J. Sax

The difference I see for EOS is the following:

We have a different cut-off version. Instead of 2.3 (or earlier) like we 
have for eager/cooperative, 2.4 and 2.5 also require the bridge-release 
upgrade to 3.9, to transition the app from EOSv1 to EOSv2.


A 2.5 (or earlier) app with EOSv1 enabled cannot safely be upgraded to 
4.0 and EOSv2. In this context, "safely" means that during the upgrade 
phase you won't have EOS protection, and an error/crash during the 
upgrade could result in duplicates.


For code changes: that's actually not necessarily true. Most people don't 
hard-code configs, but load them from a config file, so you might 
be able to upgrade w/o any code changes, just by updating the 
config file.
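
For illustration only (the file name "streams.properties" is made up and the 
topology is elided), such an app reads everything from the file at startup:

import java.io.FileInputStream;
import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;

public class MyStreamsApp {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream("streams.properties")) {
            // application.id, bootstrap.servers, processing.guarantee, ... all live here
            props.load(in);
        }
        StreamsBuilder builder = new StreamsBuilder();
        // ... build the topology ...
        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
    }
}

Flipping processing.guarantee in that file is then the whole change on the 
application side.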



For the docs, I am not too worried. Even if an RC does not contain all 
doc changes, that's no reason not to release it. We could always close the 
gap afterwards, too, by updating docs in the trunk/4.0 branch and asf-site at 
the same time.


But yes, I am sure we will find/make time to get it done properly.


-Matthias

On 2/26/25 2:46 PM, Sophie Blee-Goldman wrote:

I personally think we should just recommend 3.9 as the sole bridge release,
and not include 2.8. Major versions are about API compatibility, there
shouldn't be any inherent risk in upgrading more than one major release at
a time. Things are already complicated enough :)

I do agree that we should just lay out a specific version to recommend for
a bridge release rather than trying to include the range of all possible
bridge versions.

I also think we should make a separate section in the Streams upgrade guide
specifically about when and why the `upgrade.from` config & double rolling
bounce is needed. Then we can just link to that in the KIP and don't have
to get into the weeds too much. I won't be able to put this together until
mid next week at the earliest but maybe Matthias will have time for it
before that?

So for Kafka Streams this KIP should basically just state two things: users
upgrading from 2.3 or below need to first upgrade to 3.9 then to 4.0, and
they should refer to the upgrade guide (specifically the `upgrade.from`
section we'll add soon) for specific instructions regarding the first
upgrade to 3.9
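
(As a rough sketch of those instructions, assuming StreamsConfig is in scope and
"2.3" stands in for whatever version the app is actually on:)

// Phase 1: deploy the 3.9 jars, tell Streams which version you are coming from,
// and restart instances one at a time.
props.put(StreamsConfig.UPGRADE_FROM_CONFIG, "2.3");

// Phase 2: once every instance runs 3.9, remove upgrade.from and do a second
// rolling restart.
props.remove(StreamsConfig.UPGRADE_FROM_CONFIG);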

Side question: do we need to call out that the rows are not mutually

exclusive, but cumulative? Ie, if one is on 2.2 with EOSv1 enabled, two
rows apply: need to switch from eager to cooperative rebalancing, and
need to switch from EOSv1 to EOSv2. -- Or is this clear anyway?



Lastly, was just thinking about the EOS thing, and I realize I don't
understand why we have to treat this specially or outline a different
upgrade path. It's the same as any other API change from upgrading across
major versions, right? For example: if you try to upgrade an EOSv1 app
(let's say 2.4) to 4.0, you get a compiler error, and change to the new
config. You just need to update your source code, it's not like you can't
do that upgrade or have to go through a bridge release. At most you need to
make sure the brokers are 2.5 or above but that's not what this table is
about
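
(To make the source-level change concrete -- assuming the app sets the config in
code via StreamsConfig:)

// 2.4.x - 3.9.x with EOSv1:
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
// StreamsConfig.EXACTLY_ONCE is removed in 4.0, so that line stops compiling and becomes:
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
// or, if it lives in a config file: processing.guarantee=exactly_once_v2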

On Wed, Feb 26, 2025 at 8:12 AM Kuan Po Tseng  wrote:


Hi Matthias,

I appreciate your feedback; it really helped me a lot!
Regarding the issue of upgrading streams from a very old version to 4.x,
I understand that streams are much more complicated than Kafka Client.
I think it's reasonable to do two bumps, but I'm not a Kafka Streams
expert,
and I would like to hear others' opinions as well.

I just updated the KIP, and I hope I have addressed all your comments
above.
Please let me know if I missed anything.

Best,
Kuan-Po Tseng

On 2025/02/25 03:32:06 "Matthias J. Sax" wrote:

Thanks all. Seems we are converging. :)

Again, sorry for the previous very long email, but I thought it's 
important to paint a full end-to-end picture.

I agree that we should try to keep it simple to the extent reasonably 
possible! If we really want to suggest just 2.8 / 3.9 as bridge 
versions, I am ok with this.

About upgrades across multiple major versions. Yes, it's certainly 
possible for some simple apps, and if we want to keep the guidelines 
simple, we can also drop 2.8 as bridge release and only use 3.9. My take 
was just that it seems to de-risk an upgrade not to recommend skipping 
a major release; but I can be convinced otherwise :)

Guess it would simplify the table, and we could cut one column. Let's
hear from Sophie again about it.



For the rows in the table:


Kafka Streams
0.11.x - 2.3.x


It says


No, you'll need to do two-rolling bounce twice.


I don't think that "two-rolling bounce twice" is necessary, and it might 
be simpler to be less detailed in the KIP anyway and refer to the docs? 
Similar for other rows... whether a two-rolling bounce is necessary is an 
"it depends" answer in many cases, and just omitting it from this table 
might be easier.

Instead, it might be good to call out what needs to be upgraded though, 
ie, the need to upgrade from "eager" to "cooperative" rebalancing if the old 
version is 0.11 to 2.3.

[jira] [Resolved] (KAFKA-18850) Fix the docs of org.apache.kafka.automatic.config.providers

2025-02-26 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-18850.

Resolution: Fixed

trunk: 
[https://github.com/apache/kafka/commit/dd85938661b7b84640941dac0ac92a04fcd3705b]

 

4.0: 
https://github.com/apache/kafka/commit/242573d7de4cec06863b6486202675b57bff0dab

> Fix the docs of org.apache.kafka.automatic.config.providers
> ---
>
> Key: KAFKA-18850
> URL: https://issues.apache.org/jira/browse/KAFKA-18850
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Nick Guo
>Priority: Major
> Fix For: 4.0.0
>
>
> It seems to me that =env is incorrect. According to the source code, the 
> correct value should be the class name. for example: 
> `org.apache.kafka.common.config.provider.EnvVarConfigProvider`
> ```
> for (String provider : configProviders.split(",")) {
>     String providerClass = providerClassProperty(provider);
>     if (indirectConfigs.containsKey(providerClass)) {
>         String providerClassName = indirectConfigs.get(providerClass);
>         if (classNameFilter.test(providerClassName)) {
>             providerMap.put(provider, providerClassName);
>         } else {
>             throw new ConfigException(providerClassName + " is not allowed. Update System property '"
>                     + AUTOMATIC_CONFIG_PROVIDERS_PROPERTY + "' to allow " + providerClassName);
>         }
>     }
> }
> ```
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java#L616
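>
> A hedged example of the corrected usage (the "env" alias is illustrative; the
> property names follow the code above):
> ```
> # client configuration
> config.providers=env
> config.providers.env.class=org.apache.kafka.common.config.provider.EnvVarConfigProvider
>
> # JVM system property listing the provider classes that are allowed to be used:
> -Dorg.apache.kafka.automatic.config.providers=org.apache.kafka.common.config.provider.EnvVarConfigProvider
> ```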



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSS] KIP-1138: Clean up TopologyConfig and API for supplying configs needed by the topology

2025-02-26 Thread Sebastien Viale
Hi Everyone,

I would like to start a discussion on KIP-1138: Clean up TopologyConfig and API 
for supplying configs needed by the 
topology

This proposal aims to simplify Kafka Streams configuration by unifying APIs and 
ensuring topology-specific configs (e.g., topology.optimization, 
processor.wrapper.class) are correctly applied to prevent silent 
misconfigurations.

Regards,

Sébastien



[jira] [Created] (KAFKA-18879) Formatter for share group specific records in __consumer_offsets

2025-02-26 Thread Sushant Mahajan (Jira)
Sushant Mahajan created KAFKA-18879:
---

 Summary: Formatter for share group specific records in 
__consumer_offsets
 Key: KAFKA-18879
 URL: https://issues.apache.org/jira/browse/KAFKA-18879
 Project: Kafka
  Issue Type: Sub-task
Reporter: Sushant Mahajan
Assignee: Sushant Mahajan






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-1127: Flexible Windows for Late Arriving Data

2025-02-26 Thread Almog Gavra
Hi All,

KIP-1127 has been accepted with 3 binding votes (Sophie, Lucas, Matthias).
Voting is now closed -- thanks everyone!

Cheers,
Almog

On Mon, Feb 24, 2025 at 6:51 PM Matthias J. Sax  wrote:

> +1 (binding)
>
> On 2/18/25 8:58 AM, Lucas Brutschy wrote:
> > +1 (binding)
> >
> > Thanks for the KIP!
> >
> > On Wed, Feb 5, 2025 at 8:06 AM Sophie Blee-Goldman
> >  wrote:
> >>
> >> +1 (binding)
> >>
> >> Thanks for the KIP! Neat and practical idea
> >>
> >> On Tue, Feb 4, 2025 at 10:52 AM Almog Gavra 
> wrote:
> >>
> >>> Hello All,
> >>>
> >>> I'd like to start a vote on KIP-1127. Please take a look at the KIP and
> >>> Discussion thread and let me know what you think.
> >>>
> >>> Note that the discussion thread was incorrectly prefixed with KIP-1124
> >>> instead of KIP-1127 (oops!).
> >>>
> >>> Link to KIP:
> >>>
> >>>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1127+Flexible+Windows+for+Late+Arriving+Data
> >>>
> >>> Thanks,
> >>> Almog
> >>>
>
>


Re: New PR workflow in KAFKA-18748

2025-02-26 Thread David Arthur
Sophie,

> are tests automatically sorted into these buckets or do we have to
manually move them

This part hasn't changed -- we still need to manually mark tests as @Flaky
when they are flaky. The recent change has just separated out the flaky
tests into a parallel job rather than a sequential step. It also added the
"new" tests as a separate job rather than baked into the now-removed
quarantinedTest step.
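
For reference, marking a test looks something like the sketch below. (The import
path and the Jira-reference argument are my assumptions -- check the @Flaky
definition in the test-common module for the exact form.)

import org.apache.kafka.common.test.api.Flaky;   // assumed location of @Flaky
import org.junit.jupiter.api.Test;

class SomethingIntegrationTest {

    @Flaky("KAFKA-XXXXX")   // assumed: reference the Jira tracking the flakiness
    @Test
    void testThatOccasionallyTimesOut() {
        // ...
    }
}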

New tests are determined based on the test-catalog branch
https://github.com/apache/kafka/tree/test-catalog



On Tue, Feb 25, 2025 at 9:41 PM Sophie Blee-Goldman 
wrote:

> Thanks David! This is awesome, really glad to see this effort to reduce
> test flakiness.
>
> One question -- are tests automatically sorted into these buckets or do we
> have to manually move them? And if so, how does that work (eg a test in
> "main" becomes flaky)
>
> On Tue, Feb 25, 2025 at 3:20 PM David Arthur  wrote:
>
> > > Can we merge the PR if only flaky or new tests fail?
> >
> > I agree with Ismael that new tests must be solid -- no flakiness should
> be
> > expected when adding a test. Obviously, we will miss things, so we have
> to
> > tolerate them on trunk (along with environmental flaky factors).
> >
> > If there are existing *unrelated* tests that are flaky on a PR, that is
> > fine. Ideally, each failing test or flaky tests on a PR should be
> > investigated.
> >
> > For PRs:
> > "new" tests -- no flakiness
> > "flaky" tests -- expect some flakiness, still look at these failures to
> > make sure the PR didn't make it worse
> > "main" tests -- normal amounts of flakiness, still look at these failures
> > to make sure the PR didn't make it worse (and file a Jira to report
> flaky,
> > if applicable)
> >
> > ---
> >
> >
> > BTW If your PRs have failed build scans, try merging in trunk.
> >
> > [image: image.png]
> >
> > -David A
> >
> >
> > On Mon, Feb 24, 2025 at 7:16 PM Ismael Juma  wrote:
> >
> >> >
> >> > Can we merge the PR if only flaky or new tests fail?
> >>
> >>
> >> We certainly cannot merge if new tests fail - the goal is to ensure new
> >> tests are solid.
> >>
> >> For the flaky ones, I'm not sure how we intend to use these. I would
> >> prefer
> >> if we only merge when the PR status is green. Otherwise, we often end up
> >> merging things we shouldn't (by accident).
> >>
> >> Ismael
> >>
> >> On Mon, Feb 24, 2025 at 2:50 PM Chia-Ping Tsai 
> >> wrote:
> >>
> >> > hi David
> >> >
> >> > Thanks for all your improvement. I do love the new test suites!
> >> >
> >> > one small question:
> >> > Can we merge the PR if only flaky or new tests fail? Sometimes, I list
> >> > tickets for flaky (or unrelated) tests before merging. However, since
> we
> >> > now have a separate test suite for stable tests (non-flaky, non-new),
> I
> >> > assume the new condition is that "stable tests must pass"?
> >> >
> >> > Best,
> >> > Chia-Ping
> >> >
> >> >
> >> >
> >> >
> >> > Ismael Juma  於 2025年2月25日 週二 上午6:24寫道:
> >> >
> >> > > Thanks David - this is another important improvement to our CI
> >> pipeline
> >> > and
> >> > > is super helpful for the project and community.
> >> > >
> >> > > Ismael
> >> > >
> >> > > On Mon, Feb 24, 2025 at 2:15 PM David Arthur 
> >> wrote:
> >> > >
> >> > > > Hey everyone, just wanted to inform you all that we just merged
> >> > > KAFKA-18748
> >> > > >
> >> > > > https://github.com/apache/kafka/pull/18770
> >> > > >
> >> > > > This splits our CI workflow into more parallel jobs which run
> >> subsets
> >> > of
> >> > > > the tests with different settings. The JUnit tests are now split
> >> into
> >> > > > "new", "flaky", and the remainder.
> >> > > >
> >> > > > "New" tests are what we previously called auto-quarantined tests.
> >> > > >
> >> > > > On PR builds, "new" tests are anything that do not exist on trunk.
> >> They
> >> > > are
> >> > > > run with zero tolerance for flakiness.
> >> > > >
> >> > > > On trunk builds, "new" tests are anything added in the last 7
> days.
> >> > They
> >> > > > are run with some tolerance for flakiness.
> >> > > >
> >> > > > The point of this is to discourage flaky tests from being added to
> >> > trunk.
> >> > > >
> >> > > > Please update your PRs with trunk and let me know if you see any
> >> > > weirdness.
> >> > > > Feel free to tag me in the PR, reply to this thread, or email me
> >> > directly
> >> > > > with questions.
> >> > > >
> >> > > > Thanks!
> >> > > > David A
> >> > > >
> >> > >
> >> >
> >>
> >
> >
> > --
> > David Arthur
> >
>


-- 
David Arthur


Re: [VOTE] 4.0.0 RC0

2025-02-26 Thread Ismael Juma
Thanks for the clarification Jose. Makes sense.

Ismael

On Wed, Feb 26, 2025 at 9:22 AM José Armando García Sancio
 wrote:

> Hi Ismael,
>
> On Wed, Feb 26, 2025 at 7:51 AM Ismael Juma  wrote:
> >
> > Hi Jose,
> >
> > Is it a low risk change? If not, I would suggest giving it a bit of
> > stabilization time before cherry-picking to older branches including 4.0
> > and 3.9. I don't think it's necessary to backport to other older branches
> > given that it's not a regression and I don't believe there are any cases
> of
> > users running into it.
>
> We do have users that have encountered this issue in KRaft. I
> documented two cases in the Jira.
>
> Since this is not a regression, I am okay not including this in the
> 4.0.0 release. We should include it in the 4.0.1 release. I updated
> the Jira so that the fixed version is 4.0.1.
>
> Thanks,
> --
> -José
>


Re: [VOTE] 4.0.0 RC0

2025-02-26 Thread Ismael Juma
Hi Jose,

Is it a low risk change? If not, I would suggest giving it a bit of
stabilization time before cherry-picking to older branches including 4.0
and 3.9. I don't think it's necessary to backport to other older branches
given that it's not a regression and I don't believe there are any cases of
users running into it.

Ismael

On Tue, Feb 25, 2025 at 11:30 AM José Armando García Sancio
 wrote:

> Hi David and all,
>
> I am about to merge https://github.com/apache/kafka/pull/18852 to
> trunk: https://issues.apache.org/jira/browse/KAFKA-18723. I would like
> to cherry-pick it to 4.0.0. What do you think?
>
> I will also be cherry picking the change back to 3.7.x, 3.8.x and
> 3.9.x. If it is not included in the 4.0.0 release, we will need to
> include it in the 4.0.1 release.
>
> Thanks,
> --
> -José
>


[jira] [Created] (KAFKA-18873) Incorrect error message for max.in.flight.requests.per.connection when using transactional producer.

2025-02-26 Thread Eslam Mohamed (Jira)
Eslam Mohamed created KAFKA-18873:
-

 Summary: Incorrect error message for 
max.in.flight.requests.per.connection when using transactional producer.
 Key: KAFKA-18873
 URL: https://issues.apache.org/jira/browse/KAFKA-18873
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Eslam Mohamed
Assignee: Eslam Mohamed


 
{code:java}
KafkaProducerTest.testInflightRequestsAndIdempotenceForIdempotentProduces{code}

The above unit test checks for configuration validation errors when 
instantiating a {{ProducerConfig}} with invalid properties. One of the 
assertions in this test "invalidProps4" is designed to validate the constraint 
that {{max.in.flight.requests.per.connection}} must be at most {{5}} when using 
a transactional producer. However, the error message thrown by the 
{{ProducerConfig}} constructor in this scenario is incorrect.
 * *Observed Behavior:*
When {{max.in.flight.requests.per.connection}} is set to {{6}} for a 
transactional producer, the test expects an exception with the message:
{{"Must set max.in.flight.requests.per.connection to at most 5 when using the 
transactional producer."}}
Instead, the error message states:
{{"Must set retries to non-zero when using the idempotent producer."}}

 * *Expected Behavior:*
The error message should explicitly indicate the violation of the 
{{max.in.flight.requests.per.connection}} constraint for transactional 
producers:
{{"Must set max.in.flight.requests.per.connection to at most 5 when using the 
transactional producer."}}

The mismatch in the error message can lead to confusion for developers 
debugging the configuration error, as it incorrectly hints at a {{retries}} 
configuration issue instead of the actual 
{{max.in.flight.requests.per.connection}} issue.
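
A hedged repro sketch (not taken from the test itself; constant names are from 
{{ProducerConfig}}):
{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class Repro {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "tx-1");
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 6); // > 5 is invalid here
        // Expected: ConfigException about max.in.flight.requests.per.connection being at most 5.
        // Observed (per this ticket): "Must set retries to non-zero when using the idempotent producer."
        new KafkaProducer<String, String>(props).close();
    }
}
{code}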



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-1124: Providing a clear Kafka Client upgrade path

2025-02-26 Thread Kuan Po Tseng
Hi Matthias,

I appreciate your feedback; it really helped me a lot!
Regarding the issue of upgrading streams from a very old version to 4.x,
I understand that streams are much more complicated than Kafka Client.
I think it's reasonable to do two bumps, but I'm not a Kafka Streams expert,
and I would like to hear others' opinions as well.

I just updated the KIP, and I hope I have addressed all your comments above.
Please let me know if I missed anything.

Best,
Kuan-Po Tseng

On 2025/02/25 03:32:06 "Matthias J. Sax" wrote:
> Thanks all. Seems we are converging. :)
> 
> Again, sorry for the previous very long email, but I thought it's 
> important to paint a full end-to-end picture.
> 
> I agree that we should try to keep it simple to the extent reasonably 
> possible! If we really want to suggest just 2.8 / 3.9 as bridge 
> versions, I am ok with this.
> 
> About upgrades across multiple major versions. Yes, it's certainly 
> possible for some simple apps, and if we want to keep the guidelines 
> simple, we can also drop 2.8 as bridge release and only use 3.9. My take 
> was just that it seems to de-risk an upgrade not to recommend skipping 
> a major release; but I can be convinced otherwise :)
> 
> Guess it would simplify the table, and we could cut one column. Let's 
> hear from Sophie again about it.
> 
> 
> 
> For the rows in the table:
> 
> > Kafka Streams 
> > 0.11.x - 2.3.x
> 
> It says
> 
> > No, you'll need to do two-rolling bounce twice.
> 
> I don't think that "two-rolling bounce twice" is necessary, and it might 
> be simpler to be less detailed in the KIP anyway and refer to the docs? 
> Similar for other rows... whether a two-rolling bounce is necessary is an 
> "it depends" answer in many cases, and just omitting it from this table 
> might be easier.
> 
> Instead, it might be good to call out what needs to be upgraded though, 
> ie, the need to upgrade from "eager" to "cooperative" rebalancing if the old 
> version is 0.11 to 2.3. Similar to what we already say for:
> 
> > Kafka Streams 
> > 2.4.x - 3.9.x with EOSv1
> 
> when we call out EOSv1 is removed with 4.0.
> 
> 
> 
> Side question: do we need to call out that the rows are not mutually 
> exclusive, but cumulative? Ie, if one is on 2.2 with EOSv1 enabled, two 
> rows apply: need to switch from eager to cooperative rebalancing, and 
> need to switch from EOSv1 to EOSv2. -- Or is this clear anyway?
> 
> 
> 
> > Since Kafka Streams 2.4.0 introduced cooperative rebalancing, which is 
> > enabled by default, it is no longer possible to directly upgrade a Streams 
> > application from a version prior to 2.4 to 2.4 or higher.
> 
> "which is enabled by default" is not really the reason why the upgrade 
> from 2.3 (and earlier) to 4.0 breaks. The reason is, that "eager" 
> support is dropped with 4.0.
> 
> 
> 
> 
> -Matthias
> 
> 
> On 2/24/25 8:28 AM, Kuan Po Tseng wrote:
> > Hello everyone,
> > 
> > Thanks Chia-Ping for the advice. I’ve created a table to cover all upgrade 
> > path scenarios, hoping it provides more clarity. Please let me know if I’ve 
> > misunderstood anything. I appreciate any corrections!
> > 
> > Additionally, as I recently updated the KIP title, here’s the new link:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1124%3A+Providing+a+clear+Kafka+Client+upgrade+path+for+4.x
> > 
> > Regarding Kafka Connect, I’m still investigating and will update the KIP 
> > soon. I’ll share any new findings with you as soon as possible.
> > 
> > Thank you!
> > 
> > Best,
> > Kuan-Po
> > 
> > On 2025/02/23 19:12:41 Chia-Ping Tsai wrote:
> >> hi Kuan-Po
> >>
> >>> Apologies for my mistake... Indeed, 2.1 should be the starting point for 
> >>> the bridge version.
> >>
> >> Have you updated the KIP? it seems the bridge version of client still 
> >> starts from 2.0
> >>
> >> Additionally, if I were a user hesitant to adopt the bridge version, it 
> >> would be helpful to list common reasons to aid in choosing the "best" 
> >> bridge version. For example:
> >>
> >> ​Client Upgrade Paths
> >>
> >> // here is the table
> >>
> >> Best Bridge Version to you
> >>
> >> // add some explanation
> >>
> >> 1. minimize code refactoring - // Kafka Client: 3.3.x - 3.9.x + Kafka 
> >> Streams: 3.6.x - 3.9.x
> >> 2. starts with quorum APIs - // 3.3.x - 3.9.x
> >> 3. xxx
> >> 4. aaa
> >> n. last stable/active version: 3.9.x // we can emphasize the 3.9 is 
> >> recommended by community
> >>
> >> Best,
> >> Chia-Ping
> >>
> >>
> >>
> >> On 2025/02/23 16:10:03 Kuan Po Tseng wrote:
> >>> Thanks, Jun and Juma,
> >>>
> >>> Apologies for my mistake... Indeed, 2.1 should be the starting point for 
> >>> the bridge version.
> >>> I will revise my statement as follows:
> >>> - For Kafka Clients below 2.1, users need to upgrade to 3.9 first, then 
> >>> to 4.x.
> >>> - For Kafka Clients from 2.1 or above, users can directly upgrade to 4.x.
> >>>
> >>> As for Kafka Connect, I initially didn’t consider it because I saw it as 
> >>> another form of a server.
> >>> However,

Re: [DISCUSS] KIP-1124: Providing a clear Kafka Client upgrade path

2025-02-26 Thread Sophie Blee-Goldman
I personally think we should just recommend 3.9 as the sole bridge release,
and not include 2.8. Major versions are about API compatibility, there
shouldn't be any inherent risk in upgrading more than one major release at
a time. Things are already complicated enough :)

I do agree that we should just lay out a specific version to recommend for
a bridge release rather than trying to include the range of all possible
bridge versions.

I also think we should make a separate section in the Streams upgrade guide
specifically about when and why the `upgrade.from` config & double rolling
bounce is needed. Then we can just link to that in the KIP and don't have
to get into the weeds too much. I won't be able to put this together until
mid next week at the earliest but maybe Matthias will have time for it
before that?

So for Kafka Streams this KIP should basically just state two things: users
upgrading from 2.3 or below need to first upgrade to 3.9 then to 4.0, and
they should refer to the upgrade guide (specifically the `upgrade.from`
section we'll add soon) for specific instructions regarding the first
upgrade to 3.9

Side question: do we need to call out that the rows are not mutually
> exclusive, but cumulative? Ie, if one is on 2.2 with EOSv1 enabled, two
> rows apply: need to switch from eager to cooperative rebalancing, and
> need to switch from EOSv1 to EOSv2. -- Or is this clear anyway?


Lastly, was just thinking about the EOS thing, and I realize I don't
understand why we have to treat this specially or outline a different
upgrade path. It's the same as any other API change from upgrading across
major versions, right? For example: if you try to upgrade an EOSv1 app
(let's say 2.4) to 4.0, you get a compiler error, and change to the new
config. You just need to update your source code, it's not like you can't
do that upgrade or have to go through a bridge release. At most you need to
make sure the brokers are 2.5 or above but that's not what this table is
about

On Wed, Feb 26, 2025 at 8:12 AM Kuan Po Tseng  wrote:

> Hi Matthias,
>
> I appreciate your feedback; it really helped me a lot!
> Regarding the issue of upgrading streams from a very old version to 4.x,
> I understand that streams are much more complicated than Kafka Client.
> I think it's reasonable to do two bumps, but I'm not a Kafka Streams
> expert,
> and I would like to hear others' opinions as well.
>
> I just updated the KIP, and I hope I have addressed all your comments
> above.
> Please let me know if I missed anything.
>
> Best,
> Kuan-Po Tseng
>
> On 2025/02/25 03:32:06 "Matthias J. Sax" wrote:
> > Thanks all. Seems we are converging. :)
> >
> > Again, sorry for the previous very long email, but I thought it's
> > important to paint a full end-to-end picture.
> >
> > I agree that we should try to keep it simple to the extent reasonably
> > possible! If we really want to suggest just 2.8 / 3.9 as bridge
> > versions, I am ok with this.
> >
> > About upgrades across multiple major versions. Yes, it's certainly
> > possible for some simple apps, and if we want to keep the guidelines
> > simple, we can also drop 2.8 as bridge release and only use 3.9. My take
> > was just that it seems to de-risk an upgrade not to recommend skipping
> > a major release; but I can be convinced otherwise :)
> >
> > Guess it would simplify the table, and we could cut one column. Let's
> > hear from Sophie again about it.
> >
> >
> >
> > For the rows in the table:
> >
> > > Kafka Streams
> > > 0.11.x - 2.3.x
> >
> > It says
> >
> > > No, you'll need to do two-rolling bounce twice.
> >
> > I don't think that "two-rolling bounce twice" is necessary, and it might
> > be simpler to be less detailed in the KIP anyway and refer to the docs?
> > Similar for other rows... whether a two-rolling bounce is necessary is an
> > "it depends" answer in many cases, and just omitting it from this table
> > might be easier.
> >
> > Instead, it might be good to call out what needs to be upgraded though,
> > ie, the need to upgrade from "eager" to "cooperative" rebalancing if the old
> > version is 0.11 to 2.3. Similar to what we already say for:
> >
> > > Kafka Streams
> > > 2.4.x - 3.9.x with EOSv1
> >
> > when we call out EOSv1 is removed with 4.0.
> >
> >
> >
> > Side question: do we need to call out that the rows are not mutually
> > exclusive, but cumulative? Ie, if one is on 2.2 with EOSv1 enabled, two
> > rows apply: need to switch from eager to cooperative rebalancing, and
> > need to switch from EOSv1 to EOSv2. -- Or is this clear anyway?
> >
> >
> >
> > > Since Kafka Streams 2.4.0 introduced cooperative rebalancing, which is
> enabled by default, it is no longer possible to directly upgrade a Streams
> application from a version prior to 2.4 to 2.4 or higher.
> >
> > "which is enabled by default" is not really the reason why the upgrade
> > from 2.3 (and earlier) to 4.0 breaks. The reason is, that "eager"
> > support is dropped with 4.0.
> >
> >
> >
> >
> > -Matthias
> >
> >
> > On 

[jira] [Resolved] (KAFKA-18863) Runtime additions for connector multiversioning.

2025-02-26 Thread Greg Harris (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris resolved KAFKA-18863.
-
Fix Version/s: 4.1.0
   Resolution: Fixed

> Runtime additions for connector multiversioning.
> 
>
> Key: KAFKA-18863
> URL: https://issues.apache.org/jira/browse/KAFKA-18863
> Project: Kafka
>  Issue Type: New Feature
>  Components: connect
>Affects Versions: 4.1.0
>Reporter: Snehashis Pal
>Assignee: Snehashis Pal
>Priority: Major
> Fix For: 4.1.0
>
>
> Updates to connect worker to support connector multi versioning.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18878) Implement share session cache metrics for share fetch

2025-02-26 Thread Apoorv Mittal (Jira)
Apoorv Mittal created KAFKA-18878:
-

 Summary: Implement share session cache metrics for share fetch
 Key: KAFKA-18878
 URL: https://issues.apache.org/jira/browse/KAFKA-18878
 Project: Kafka
  Issue Type: Sub-task
Reporter: Apoorv Mittal
Assignee: Apoorv Mittal






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: kafka producer exception due to TimeoutException:

2025-02-26 Thread Kirk True
Hi Giri,

The first question I would ask is: what happens when you run the producer with 
the default configuration? Producer timeouts are usually caused by client 
misconfiguration, network issues, broker load/topology changes, or a 
combination of those.

Try to remove as many configuration overrides as reasonably possible on the 
client and see if the issue still persists.

Thanks,
Kirk

On Tue, Feb 25, 2025, at 11:23 AM, Greg Harris wrote:
> Hi Giri,
> 
> Thanks for your additional context. Good to see you're running experiments!
> 
> > How to achieve workload among a different number of producers..could you
> > suggest java code for this. This is an api calling from dB stored
> procedure
> > and they can call any number of times  this api with messages as payload.
> 
> This is something specific to your application or framework, I won't be
> able to give you a specific answer. You should work backwards from where
> the producers are configured and instantiated in order to figure out how to
> instantiate more of them.
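>
> As a rough illustration only (producerProps(), topic, key, and value below are
> placeholders for whatever your application already has), "dividing the workload"
> usually just means creating several producers up front and picking one per send:
>
> // assumes the usual imports: java.util.*, org.apache.kafka.clients.producer.*
> int poolSize = 4;  // tune by experiment
> List<KafkaProducer<String, String>> pool = new ArrayList<>();
> for (int i = 0; i < poolSize; i++) {
>     pool.add(new KafkaProducer<>(producerProps()));  // reuse your existing config
> }
> // later, per request -- e.g. pick by key so per-key ordering is preserved:
> KafkaProducer<String, String> producer = pool.get((key.hashCode() & 0x7fffffff) % poolSize);
> producer.send(new ProducerRecord<>(topic, key, value));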
> 
> > How to find downstream bottleneck (network, brokers, partitions).
> 
> You should examine the metrics emitted by clients and brokers to gather
> more information about the workload. Try changing configurations, reducing
> the incoming workload, etc, and see how the metrics respond. And check for
> resource exhaustion or over-utilization in your operating system.
> These are only general points about where to start. Optimizing Kafka is a
> large topic, so I would recommend searching online for more resources.
> 
> Thanks,
> Greg
> 
> On Tue, Feb 25, 2025 at 10:50 AM giri mungi  wrote:
> 
> > Hi Greg,
> >
> > Thanks for your insights! I tried increasing the timeouts to:
> >
> > request.timeout.ms = 6
> > delivery.timeout.ms = 90
> > However, the issue persists. Some messages are still failing intermittently
> > with a timeout, while others are successfully delivered.
> >
> > props.put(ProducerConfig.ACKS_CONFIG, "all");
> >props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);  // Keep 32 KB
> >props.put(ProducerConfig.LINGER_MS_CONFIG, 10);  // Allow 10ms to batch
> > messages
> >props.put(ProducerConfig.RETRIES_CONFIG, 10);
> >props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, 5000);  // Increase
> > backoff to 5s
> >props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 6);
> >props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 90);
> >props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432);  // Increase
> > buffer to 32MB
> >props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");  // Reduce
> > payload size
> >
> >
> > What I've Tried So Far:
> > 1) Increasing timeouts (as mentioned above) – but this hasn’t fully
> > resolved the issue.
> > 2) Tuning linger.ms and batch.size to optimize batching, but I haven’t
> > observed a significant improvement.
> >
> > Looking for Advice on Next Steps:
> >
> > you suggested we can consider dividing  workload among a different number
> > of producers ==> how to do this.
> >
> > What would you suggest as the next steps to:
> >
> > How to achieve workload among a different number of producers..could you
> > suggest java code for this. This is an api calling from dB stored procedure
> > and they can call any number of times  this api with messages as payload.
> >
> > How to find downstream bottleneck (network, brokers, partitions).
> >
> > On Tue, Feb 25, 2025 at 10:25 PM Greg Harris  > >
> > wrote:
> >
> > > Hi Giri,
> > >
> > > Since nobody with more experience has answered yet, let me give you my
> > > amateur understanding of this error.
> > >
> > > The TimeoutException will appear whenever the load generation (code
> > calling
> > > the Producer) runs faster than all downstream components (Producer,
> > > Network, Brokers, etc) can handle.
> > > Records are accepted by send() but the throughput is not high enough to
> > > acknowledge the record before the delivery.timeout.ms expires.
> > >
> > > You may be able to change the behavior of the timeout to mitigate the
> > > errors without changing the throughput by configuring the
> > > delivery.timeout.ms, buffer.memory, max.block.ms, request.timeout.ms,
> > etc.
> > > You may also be able to improve the throughput of the producer by
> > > configuring the linger.ms, batch.size, compression, send.buffer.bytes,
> > > etc.
> > > If the performance bottleneck is elsewhere downstream, you will need to
> > > investigate and optimize that.
> > > You can also consider dividing your workload among a different number of
> > > producers, partitions, or brokers to see how the throughput behaves.
> > >
> > > Without knowing the details of your setup and debugging it directly, it's
> > > hard to give specific tuning advice. You can try looking online for
> > others'
> > > tuning strategy.
> > >
> > > Thanks,
> > > Greg
> > >
> > > On Fri, Feb 21, 2025 at 10:45 AM giri mungi 
> > wrote:
> > >
> > > >Hi all, I am encountering a TimeoutException while publishing me

Re: [DISCUSS] KIP-1136: Make ConsumerGroupMetadata an interface

2025-02-26 Thread Kirk True
Hi Paweł,

Thanks for the KIP!

My questions:

KT1. What will prevent developers from implementing their own 
ConsumerGroupMetadata and passing that to sendOffsetsToTransaction()? I assume 
the code will check the incoming object is of type DefaultConsumerGroupMetadata?

KT2. To me, the use of the adjective "default" in DefaultConsumerGroupMetadata 
implies that there could be other implementations. Is the intention that there 
could be other implementations in the future?

KT3. DefaultConsumerGroupMetadata should be defined in an "internals" package 
of some sort, right? Will users ever reference the implementation class name in 
their code? I'm assuming not.

Thanks!
Kirk 

On Tue, Feb 25, 2025, at 8:24 AM, Paweł Szymczyk wrote:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1136%3A+Make+ConsumerGroupMetadata+an+interface
> 
> -- 
> Pozdrawiam
> Paweł Szymczyk
> 


Re: [VOTE] 4.0.0 RC0

2025-02-26 Thread Chia-Ping Tsai
hi David

I apologize for not testing the release candidate (RC) as I focused on 
reviewing the documentation. I have created tickets to address issues in the 
documentation [0][1][2][3], which will be merged into 4.0 as the risk is 
minimal.

Please let me know if you have any other options for those tickets.

[0] https://issues.apache.org/jira/browse/KAFKA-18869
[1] https://issues.apache.org/jira/browse/KAFKA-18868
[2] https://issues.apache.org/jira/browse/KAFKA-18850
[3] https://issues.apache.org/jira/browse/KAFKA-18849

Best,
Chia-Ping

On 2025/02/22 10:16:43 David Jacot wrote:
> Hello Kafka users, developers and client-developers,
> 
> This is the first candidate for release of Apache Kafka 4.0.0. We
> still have some remaining blockers but we figured that getting a first
> release candidate will help the community to test this major release.
> 
> - This is the first release without Apache Zookeeper
> - The Next Generation of the Consumer Rebalance Protocol is Generally 
> Available
> - The Transactions Server-Side Defense (Phase 2) is Generally Available
> - Queues for Kafka is in Early Access
> - Kafka uses log4j2
> - Drop broker and tools support for Java 11
> - Remove old client protocol API versions
> 
> Release notes for the 4.0.0 release:
> https://dist.apache.org/repos/dist/dev/kafka/4.0.0-rc0/RELEASE_NOTES.html
> 
> *** Please download and test the release. Voting is not necessary as
> we still have blockers.
> 
> Kafka's KEYS file containing PGP keys we use to sign the release:
> https://kafka.apache.org/KEYS
> 
> * Release artifacts to be voted upon (source and binary):
> https://dist.apache.org/repos/dist/dev/kafka/4.0.0-rc0/
> 
> * Docker release artifacts to be voted upon:
> apache/kafka:4.0.0-rc0
> apache/kafka-native:4.0.0-rc0 (Building the native image failed, I
> need to investigate it)
> 
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
> 
> * Javadoc:
> https://dist.apache.org/repos/dist/dev/kafka/4.0.0-rc0/javadoc/
> 
> * Tag to be voted upon (off 4.0 branch) is the 4.0.0 tag:
> https://github.com/apache/kafka/releases/tag/4.0.0-rc0
> 
> * Documentation:
> https://kafka.apache.org/40/documentation.html
> 
> * Protocol:
> https://kafka.apache.org/40/protocol.html
> 
> * Successful CI builds for the 4.0 branch:
> Unit/integration tests: 
> https://github.com/apache/kafka/actions/runs/13459676207
> System tests: TBD
> 
> * Successful Docker Image Github Actions Pipeline for 4.0 branch:
> Docker Build Test Pipeline (JVM):
> https://github.com/apache/kafka/actions/runs/13471603921
> Docker Build Test Pipeline (Native):
> https://github.com/apache/kafka/actions/runs/13471605941
> 
> /**
> 
> Thanks,
> David
> 


[jira] [Resolved] (KAFKA-18875) KRaft controller does not retry registration if the first attempt times out

2025-02-26 Thread TaiJuWu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

TaiJuWu resolved KAFKA-18875.
-
Resolution: Duplicate

> KRaft controller does not retry registration if the first attempt times out
> ---
>
> Key: KAFKA-18875
> URL: https://issues.apache.org/jira/browse/KAFKA-18875
> Project: Kafka
>  Issue Type: Bug
>Reporter: Daniel Fonai
>Priority: Minor
>
> There is a [retry 
> mechanism|https://github.com/apache/kafka/blob/3.9.0/core/src/main/scala/kafka/server/ControllerRegistrationManager.scala#L274]
>  with exponential backoff built into KRaft controller registration. The 
> timeout of the first attempt is 5 s for KRaft controllers 
> ([code|https://github.com/apache/kafka/blob/3.9.0/core/src/main/scala/kafka/server/ControllerServer.scala#L448])
>  which is not configurable.
> If for some reason the controller's first registration request times out, the 
> attempt should be retried but in practice this does not happen and the 
> controller is not able to join the quorum. We see the following in the faulty 
> controller's log:
> {noformat}
> 2025-02-21 13:31:46,606 INFO [ControllerRegistrationManager id=3 
> incarnation=mEzjHheAQ_eDWejAFquGiw] sendControllerRegistration: attempting to 
> send ControllerRegistrationRequestData(controllerId=3, 
> incarnationId=mEzjHheAQ_eDWejAFquGiw, zkMigrationReady=true, 
> listeners=[Listener(name='CONTROLPLANE-9090', 
> host='kraft-rollback-kafka-controller-pool-3.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-631e64ac.svc',
>  port=9090, securityProtocol=1)], features=[Feature(name='kraft.version', 
> minSupportedVersion=0, maxSupportedVersion=1), 
> Feature(name='metadata.version', minSupportedVersion=1, 
> maxSupportedVersion=21)]) (kafka.server.ControllerRegistrationManager) 
> [controller-3-registration-manager-event-handler]
> ...
> 2025-02-21 13:31:51,627 ERROR [ControllerRegistrationManager id=3 
> incarnation=mEzjHheAQ_eDWejAFquGiw] RegistrationResponseHandler: channel 
> manager timed out before sending the request. 
> (kafka.server.ControllerRegistrationManager) 
> [controller-3-to-controller-registration-channel-manager]
> 2025-02-21 13:31:51,726 INFO [ControllerRegistrationManager id=3 
> incarnation=mEzjHheAQ_eDWejAFquGiw] maybeSendControllerRegistration: waiting 
> for the previous RPC to complete. 
> (kafka.server.ControllerRegistrationManager) 
> [controller-3-registration-manager-event-handler]
> {noformat}
> After this we can not see any controller retry in the log.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-18869) add remote storage threads to "Updating Thread Configs" section

2025-02-26 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-18869.

Resolution: Fixed

trunk: 
[https://github.com/apache/kafka/commit/8bbca913efe260ba59c824801466f73584c46f8f]

4.0: 
https://github.com/apache/kafka/commit/32d012fd8e0329c7b696dff9487ea704481f94a7

> add remote storage threads to "Updating Thread Configs" section
> ---
>
> Key: KAFKA-18869
> URL: https://issues.apache.org/jira/browse/KAFKA-18869
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Priority: Trivial
> Fix For: 4.0.0
>
>
> # remote.log.reader.threads
>  # remote.log.manager.copier.thread.pool.size
>  # remote.log.manager.expiration.thread.pool.size
> those configs should be added



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka PowerPC Daily » test-powerpc #222

2025-02-26 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-18874) KRaft controller does not retry registration if the first attempt times out

2025-02-26 Thread Daniel Fonai (Jira)
Daniel Fonai created KAFKA-18874:


 Summary: KRaft controller does not retry registration if the first 
attempt times out
 Key: KAFKA-18874
 URL: https://issues.apache.org/jira/browse/KAFKA-18874
 Project: Kafka
  Issue Type: Bug
Reporter: Daniel Fonai


There is a [retry 
mechanism|https://github.com/apache/kafka/blob/3.9.0/core/src/main/scala/kafka/server/ControllerRegistrationManager.scala#L274]
 with exponential backoff built into KRaft controller registration. The 
timeout of the first attempt is 5 s for KRaft controllers 
([code|https://github.com/apache/kafka/blob/3.9.0/core/src/main/scala/kafka/server/ControllerServer.scala#L448])
 which is not configurable.

If for some reason the controller's first registration request times out, the 
attempt should be retried but in practice this does not happen and the 
controller is not able to join the quorum. We see the following in the faulty 
controller's log:
{noformat}
2025-02-21 13:31:46,606 INFO [ControllerRegistrationManager id=3 
incarnation=mEzjHheAQ_eDWejAFquGiw] sendControllerRegistration: attempting to 
send ControllerRegistrationRequestData(controllerId=3, 
incarnationId=mEzjHheAQ_eDWejAFquGiw, zkMigrationReady=true, 
listeners=[Listener(name='CONTROLPLANE-9090', 
host='kraft-rollback-kafka-controller-pool-3.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-631e64ac.svc',
 port=9090, securityProtocol=1)], features=[Feature(name='kraft.version', 
minSupportedVersion=0, maxSupportedVersion=1), Feature(name='metadata.version', 
minSupportedVersion=1, maxSupportedVersion=21)]) 
(kafka.server.ControllerRegistrationManager) 
[controller-3-registration-manager-event-handler]
...
2025-02-21 13:31:51,627 ERROR [ControllerRegistrationManager id=3 
incarnation=mEzjHheAQ_eDWejAFquGiw] RegistrationResponseHandler: channel 
manager timed out before sending the request. 
(kafka.server.ControllerRegistrationManager) 
[controller-3-to-controller-registration-channel-manager]
2025-02-21 13:31:51,726 INFO [ControllerRegistrationManager id=3 
incarnation=mEzjHheAQ_eDWejAFquGiw] maybeSendControllerRegistration: waiting 
for the previous RPC to complete. (kafka.server.ControllerRegistrationManager) 
[controller-3-registration-manager-event-handler]
{noformat}
After this we can not see any controller retry in the log.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] 4.0.0 RC0

2025-02-26 Thread José Armando García Sancio
Hi Ismael,

On Wed, Feb 26, 2025 at 7:51 AM Ismael Juma  wrote:
>
> Hi Jose,
>
> Is it a low risk change? If not, I would suggest giving it a bit of
> stabilization time before cherry-picking to older branches including 4.0
> and 3.9. I don't think it's necessary to backport to other older branches
> given that it's not a regression and I don't believe there are any cases of
> users running into it.

We do have users that have encountered this issue in KRaft. I
documented two cases in the Jira.

Since this is not a regression, I am okay not including this in the
4.0.0 release. We should include it in the 4.0.1 release. I updated
the Jira so that the fixed version is 4.0.1.

Thanks,
-- 
-José


Re: New PR workflow in KAFKA-18748

2025-02-26 Thread David Arthur
Another thing that changed this week is the addition of a PR Linter. This
is part of the merge queue effort.

With merge queues, we cannot alter the commit message like we do with the
merge button. Instead the commit message comes from the PR title and PR
body. A script for validating the PR title+body was added that runs any
time a PR is edited or reviewed.

To pass the validation, the PR body must include the "Reviewers: " line at
the bottom. To reduce noise, this particular check is only performed *after*
the PR has been approved.
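
For example (names are placeholders), a passing PR body ends with a line like:

Reviewers: Jane Doe <jane@example.com>, John Roe <john@example.com>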

There are other validations done by the script, but this is the most
significant.

-David A

On Wed, Feb 26, 2025 at 12:59 PM David Arthur  wrote:

> Sophie,
>
> > are tests automatically sorted into these buckets or do we have to
> manually move them
>
> This part hasn't changed -- we still need to manually mark tests as @Flaky
> when they are flaky. The recent change has just separated out the flaky
> tests into a parallel job rather than a sequential step. It also added the
> "new" tests as a separate job rather than baked into the now-removed
> quarantinedTest step.
>
> New tests are determined based on the test-catalog branch
> https://github.com/apache/kafka/tree/test-catalog
>
>
>
> On Tue, Feb 25, 2025 at 9:41 PM Sophie Blee-Goldman 
> wrote:
>
>> Thanks David! This is awesome, really glad to see this effort to reduce
>> test flakiness.
>>
>> One question -- are tests automatically sorted into these buckets or do we
>> have to manually move them? And if so, how does that work (eg a test in
>> "main" becomes flaky)
>>
>> On Tue, Feb 25, 2025 at 3:20 PM David Arthur  wrote:
>>
>> > > Can we merge the PR if only flaky or new tests fail?
>> >
>> > I agree with Ismael that new tests must be solid -- no flakiness should
>> be
>> > expected when adding a test. Obviously, we will miss things, so we have
>> to
>> > tolerate them on trunk (along with environmental flaky factors).
>> >
>> > If there are existing *unrelated* tests that are flaky on a PR, that is
>> > fine. Ideally, each failing test or flaky tests on a PR should be
>> > investigated.
>> >
>> > For PRs:
>> > "new" tests -- no flakiness
>> > "flaky" tests -- expect some flakiness, still look at these failures to
>> > make sure the PR didn't make it worse
>> > "main" tests -- normal amounts of flakiness, still look at these
>> failures
>> > to make sure the PR didn't make it worse (and file a Jira to report
>> flaky,
>> > if applicable)
>> >
>> > ---
>> >
>> >
>> > BTW If your PRs have failed build scans, try merging in trunk.
>> >
>> > [image: image.png]
>> >
>> > -David A
>> >
>> >
>> > On Mon, Feb 24, 2025 at 7:16 PM Ismael Juma  wrote:
>> >
>> >> >
>> >> > Can we merge the PR if only flaky or new tests fail?
>> >>
>> >>
>> >> We certainly cannot merge if new tests fail - the goal is to ensure new
>> >> tests are solid.
>> >>
>> >> For the flaky ones, I'm not sure how we intend to use these. I would
>> >> prefer
>> >> if we only merge when the PR status is green. Otherwise, we often end
>> up
>> >> merging things we shouldn't (by accident).
>> >>
>> >> Ismael
>> >>
>> >> On Mon, Feb 24, 2025 at 2:50 PM Chia-Ping Tsai 
>> >> wrote:
>> >>
>> >> > hi David
>> >> >
>> >> > Thanks for all your improvement. I do love the new test suites!
>> >> >
>> >> > one small question:
>> >> > Can we merge the PR if only flaky or new tests fail? Sometimes, I
>> list
>> >> > tickets for flaky (or unrelated) tests before merging. However,
>> since we
>> >> > now have a separate test suite for stable tests (non-flaky,
>> non-new), I
>> >> > assume the new condition is that "stable tests must pass"?
>> >> >
>> >> > Best,
>> >> > Chia-Ping
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > Ismael Juma  於 2025年2月25日 週二 上午6:24寫道:
>> >> >
>> >> > > Thanks David - this is another important improvement to our CI
>> >> pipeline
>> >> > and
>> >> > > is super helpful for the project and community.
>> >> > >
>> >> > > Ismael
>> >> > >
>> >> > > On Mon, Feb 24, 2025 at 2:15 PM David Arthur 
>> >> wrote:
>> >> > >
>> >> > > > Hey everyone, just wanted to inform you all that we just merged
>> >> > > KAFKA-18748
>> >> > > >
>> >> > > > https://github.com/apache/kafka/pull/18770
>> >> > > >
>> >> > > > This splits our CI workflow into more parallel jobs which run
>> >> subsets
>> >> > of
>> >> > > > the tests with different settings. The JUnit tests are now split
>> >> into
>> >> > > > "new", "flaky", and the remainder.
>> >> > > >
>> >> > > > "New" tests are what we previously called auto-quarantined tests.
>> >> > > >
>> >> > > > On PR builds, "new" tests are anything that do not exist on
>> trunk.
>> >> They
>> >> > > are
>> >> > > > run with zero tolerance for flakiness.
>> >> > > >
>> >> > > > On trunk builds, "new" tests are anything added in the last 7
>> days.
>> >> > They
>> >> > > > are run with some tolerance for flakiness.
>> >> > > >
>> >> > > > The point of this is to discourage flaky tests from being added
>> to
>> >> > trunk.
>> >> > > >
>> >> 

Re: Support for other OAuth2 grant types in Kafka

2025-02-26 Thread Kirk True
Hi Subra,

I'm one of the authors of the OAuth support in Kafka. Answers to your questions 
are below...

On Tue, Feb 25, 2025, at 3:05 AM, Subra I wrote:
> Hello All,
> 
> I see that Kafka by itself supports client credentials as a grant type for
> OAuth2. I see this mentioned in one of the Kafka KIPs as well:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=186877575
> 
> Is there a way to support other grant types as well? I came across the
> following page:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-%3A+Add+support+for+OAuth+jwt-bearer+grant+type
> 
> Here it says that there is a proposal to support the jwt-bearer grant type as
> well, but no details are mentioned, and it looks like it may only come out in
> the future.

Yes, that KIP is a work in progress. I'm planning to submit a reviewable 
version of that KIP in the next couple of weeks. I'd love to get your input on 
it, so watch this mailing list for when it's made available.

> 
> 1. Any idea when support for jwt bearer grant type will be available?

Support for the jwt-bearer grant type is scheduled for inclusion in Kafka 
4.1.0, which will come out mid-2025.

> 2. Is there a way to support other grant types? Any references for the same?

Absolutely! I'm not sure if there's a tutorial for it or anything, though :(

You can implement your own AuthenticateCallbackHandler implementation and 
configure your application's sasl.login.callback.handler.class [1] 
configuration to your new class. For OAuth, the configuration for the handler 
class is OAuthBearerLoginCallbackHandler [2], but you can swap in your own. You 
can take some of the bits you need from the existing implementation if you want.
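
As a rough skeleton only (the class name and the token-fetching helper below are 
made up; the interfaces are the real Kafka ones), a custom handler looks roughly 
like this:

import java.util.List;
import java.util.Map;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.UnsupportedCallbackException;
import javax.security.auth.login.AppConfigurationEntry;
import org.apache.kafka.common.security.auth.AuthenticateCallbackHandler;
import org.apache.kafka.common.security.oauthbearer.OAuthBearerToken;
import org.apache.kafka.common.security.oauthbearer.OAuthBearerTokenCallback;

public class MyGrantTypeLoginCallbackHandler implements AuthenticateCallbackHandler {

    @Override
    public void configure(Map<String, ?> configs, String saslMechanism,
                          List<AppConfigurationEntry> jaasConfigEntries) {
        // Read the token endpoint, credentials, etc. for your grant type here.
    }

    @Override
    public void handle(Callback[] callbacks) throws UnsupportedCallbackException {
        for (Callback callback : callbacks) {
            if (callback instanceof OAuthBearerTokenCallback) {
                // Run your grant type's flow against the identity provider and wrap the
                // response in an OAuthBearerToken implementation (the existing
                // OAuthBearerLoginCallbackHandler shows one way to do the wrapping).
                OAuthBearerToken token = fetchTokenWithMyGrantType();
                ((OAuthBearerTokenCallback) callback).token(token);
            } else {
                throw new UnsupportedCallbackException(callback);
            }
        }
    }

    private OAuthBearerToken fetchTokenWithMyGrantType() {
        throw new UnsupportedOperationException("grant-type specific flow goes here");
    }

    @Override
    public void close() {
    }
}

Then point sasl.login.callback.handler.class at the fully qualified class name.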

One of the goals of the new KIP is to expose some of the primitives that are 
used internally by the OAuth callback handler. That will provide some building 
blocks so that it's easier for individuals to write custom handlers without 
having to resort to a bunch of code duplication.

Let me know if you have additional questions!

Thanks,
Kirk

[1] 
https://kafka.apache.org/documentation/#producerconfigs_sasl.login.callback.handler.class
[2] 
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/security/oauthbearer/OAuthBearerLoginCallbackHandler.java


> Thanks,
> Subra
> 


[jira] [Created] (KAFKA-18875) KRaft controller does not retry registration if the first attempt times out

2025-02-26 Thread Daniel Fonai (Jira)
Daniel Fonai created KAFKA-18875:


 Summary: KRaft controller does not retry registration if the first 
attempt times out
 Key: KAFKA-18875
 URL: https://issues.apache.org/jira/browse/KAFKA-18875
 Project: Kafka
  Issue Type: Bug
Reporter: Daniel Fonai


There is a [retry 
mechanism|https://github.com/apache/kafka/blob/3.9.0/core/src/main/scala/kafka/server/ControllerRegistrationManager.scala#L274]
 with exponential backoff built into KRaft controller registration. The 
timeout of the first attempt is 5 s for KRaft controllers 
([code|https://github.com/apache/kafka/blob/3.9.0/core/src/main/scala/kafka/server/ControllerServer.scala#L448])
 which is not configurable.

If for some reason the controller's first registration request times out, the 
attempt should be retried but in practice this does not happen and the 
controller is not able to join the quorum. We see the following in the faulty 
controller's log:
{noformat}
2025-02-21 13:31:46,606 INFO [ControllerRegistrationManager id=3 
incarnation=mEzjHheAQ_eDWejAFquGiw] sendControllerRegistration: attempting to 
send ControllerRegistrationRequestData(controllerId=3, 
incarnationId=mEzjHheAQ_eDWejAFquGiw, zkMigrationReady=true, 
listeners=[Listener(name='CONTROLPLANE-9090', 
host='kraft-rollback-kafka-controller-pool-3.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-631e64ac.svc',
 port=9090, securityProtocol=1)], features=[Feature(name='kraft.version', 
minSupportedVersion=0, maxSupportedVersion=1), Feature(name='metadata.version', 
minSupportedVersion=1, maxSupportedVersion=21)]) 
(kafka.server.ControllerRegistrationManager) 
[controller-3-registration-manager-event-handler]
...
2025-02-21 13:31:51,627 ERROR [ControllerRegistrationManager id=3 
incarnation=mEzjHheAQ_eDWejAFquGiw] RegistrationResponseHandler: channel 
manager timed out before sending the request. 
(kafka.server.ControllerRegistrationManager) 
[controller-3-to-controller-registration-channel-manager]
2025-02-21 13:31:51,726 INFO [ControllerRegistrationManager id=3 
incarnation=mEzjHheAQ_eDWejAFquGiw] maybeSendControllerRegistration: waiting 
for the previous RPC to complete. (kafka.server.ControllerRegistrationManager) 
[controller-3-registration-manager-event-handler]
{noformat}
After this we can not see any controller retry in the log.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18876) 4.0 documentation improvement

2025-02-26 Thread Jun Rao (Jira)
Jun Rao created KAFKA-18876:
---

 Summary: 4.0 documentation improvement
 Key: KAFKA-18876
 URL: https://issues.apache.org/jira/browse/KAFKA-18876
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 4.0.0
Reporter: Jun Rao


We need to fix a few things in the 4.0 documentation.
 
6.10 Consumer Rebalance Protocol 
It's missing from the index on the left.
 
ConsumerGroupPartitionAssignor is cut off in 
org.apache.kafka.coordinator.group.api.assignor.ConsumerGroupPartitionA.
 
6.11 Transaction Protocol
It's missing from the index on the left.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18877) a mechanism to find cases where we accessed variables from the wrong thread

2025-02-26 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18877:
--

 Summary: a mechanism to find cases where we accessed variables 
from the wrong thread
 Key: KAFKA-18877
 URL: https://issues.apache.org/jira/browse/KAFKA-18877
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


from [https://github.com/apache/kafka/pull/18997#pullrequestreview-2645589959]

There are some _non-thread-safe_ classes storing important information, and so 
they are expected to be accessed by a specific thread. Otherwise, it may cause 
unexpected behavior.
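
A minimal sketch (illustrative, not from the PR discussion) of one such mechanism: 
remember the owning thread and fail fast on access from any other thread.

{code:java}
public final class ThreadConfined<T> {
    private final Thread owner = Thread.currentThread();
    private T value;

    public T get() {
        checkThread();
        return value;
    }

    public void set(T newValue) {
        checkThread();
        value = newValue;
    }

    private void checkThread() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("Accessed from thread "
                + Thread.currentThread().getName() + " but confined to " + owner.getName());
        }
    }
}
{code}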



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] 4.0.0 RC0

2025-02-26 Thread Andrew Schofield
Hi David,
Thanks for running the release.

I performed the following steps:
* Built from source using Java 17
* Enabled KIP-932 and ran through some basic actions to make sure it worked  
based on
https://cwiki.apache.org/confluence/display/KAFKA/Queues+for+Kafka+%28KIP-932%29+-+Early+Access+Release+Notes
* Validated the APIs enabled using kafka-broker-api-versions.sh
* Stopped the broker and removed the config changes so KIP-932 was no longer 
enabled
* Restarted the broker and validated the APIs enabled did not include KIP-932

All good.

I also looked at the protocol docs and observe the following:
* A bunch of APIs were removed in 4.0 (correct, of course) - these are not in 
the protocol docs
* The APIs which are supported by the controller, such as Vote, are omitted 
from the protocol docs

My view is that these APIs are supported in 4.0 and probably should be in the 
protocol docs.
I haven't created an issue at this point because I'm not sure whether others 
will agree.

Thanks,
Andrew

From: Ismael Juma 
Sent: 26 February 2025 00:08
To: dev@kafka.apache.org 
Subject: Re: [VOTE] 4.0.0 RC0

Hi Jun,

When it comes to the upgrade documentation, a couple of changes landed
after the RC was generated:

*
https://github.com/apache/kafka/commit/da8f390c4599d7199c4cdf2bb85441146e859b17
*
https://github.com/apache/kafka/commit/da3b8e88dc61a1b749895866394cab68410e0eda

Regarding the message format, I agree we can improve the documentation - I
will submit a PR for that momentarily.

Ismael

On Tue, Feb 25, 2025 at 3:58 PM Jun Rao  wrote:

> Hi, David,
>
> Thanks for preparing RC0.
>
> A few comments on the documentation.
>
> 1.5 Upgrading
> There is no documentation on which previous releases can be upgraded to
> 4.0.
>
> 5.3.3 Old Message Format
> Should we remove this section since 4.0 only supports the V2 message
> format?
>
> 6.10 Consumer Rebalance Protocol
> It's missing from the index on the left.
>
> ConsumerGroupPartitionAssignor is cut off in
> org.apache.kafka.coordinator.group.api.assignor.ConsumerGroupPartitionA.
>
> 6.11 Transaction Protocol
> It's missing from the index on the left.
>
> Jun
>
>
> On Tue, Feb 25, 2025 at 11:28 AM José Armando García Sancio
>  wrote:
>
> > Hi David and all,
> >
> > I am about to merge https://github.com/apache/kafka/pull/18852 to
> > trunk: https://issues.apache.org/jira/browse/KAFKA-18723. I would like
> > to cherry-pick it to 4.0.0. What do you think?
> >
> > I will also be cherry picking the change back to 3.7.x, 3.8.x and
> > 3.9.x. If it is not included in the 4.0.0 release, we will need to
> > include it in the 4.0.1 release.
> >
> > Thanks,
> > --
> > -José
> >
>


[jira] [Created] (KAFKA-18871) KRaft migration rollback causes downtime

2025-02-26 Thread Daniel Urban (Jira)
Daniel Urban created KAFKA-18871:


 Summary: KRaft migration rollback causes downtime
 Key: KAFKA-18871
 URL: https://issues.apache.org/jira/browse/KAFKA-18871
 Project: Kafka
  Issue Type: Bug
  Components: kraft, migration
Affects Versions: 3.9.0
Reporter: Daniel Urban


When testing the KRaft migration rollback feature, we found the following scenario:
 # Execute KRaft migration on a 3 broker 3 ZK node cluster to the last step, 
but do not finalize the migration.
 ## In the test, we put a slow but continuous produce+consume load on the 
cluster, with a topic (partitions=3, RF=3, min ISR=2)
 # Start the rollback procedure
 # First we roll back the brokers from KRaft mode to migration mode (both 
controller and ZK configs are set, process roles are removed, 
{{zookeeper.metadata.migration.enable}} is true)
 # Then we delete the KRaft controllers, delete the /controller znode
 # Then we immediately start rolling the brokers 1 by 1 to ZK mode by removing 
the {{zookeeper.metadata.migration.enable}} flag and the controller.* 
configurations.
 # At this point, when we restart the first broker (let's call it broker 0) into 
ZK mode, we find an issue which occurs roughly 1 out of 20 times:
If broker 0 is not in the ISR for one of the partitions, it can simply never 
become part of the ISR. Since we are aiming for zero downtime, we check the ISR 
state of the partitions between broker restarts (a sketch of such a check follows 
below), and our process gets blocked here. We have tried multiple workarounds, 
but it seems that there is no workaround which still ensures zero downtime.
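
For reference, a minimal sketch of that kind of ISR gate using the Admin client; 
the bootstrap address, topic name and min ISR value are assumptions for 
illustration, not part of the reported setup:
{code:java}
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

public class IsrCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder bootstrap address.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // Describe the topic under load and check every partition's ISR size.
            Map<String, TopicDescription> topics =
                admin.describeTopics(List.of("test-topic")).allTopicNames().get();
            int minIsr = 2; // matches min ISR of the test topic
            boolean safe = topics.values().stream()
                .flatMap(d -> d.partitions().stream())
                .allMatch(p -> p.isr().size() >= minIsr);
            System.out.println(safe
                ? "ISR healthy, safe to restart the next broker"
                : "ISR below minimum, blocking the roll");
        }
    }
}
{code}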

Some more details about the process:
 * We are using Strimzi to drive this process, but have verified that Strimzi 
follows the documented steps precisely.
 * When we reach the error state, it doesn't matter which broker becomes the 
controller through the ZK znode; the brokers still in migration mode get stuck 
and flood the logs with the following error:

{code:java}
2025-02-26 10:55:21,985 WARN [RaftManager id=0] Error connecting to node 
kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local:9090
 (id: 5 rack: null) (org.apache.kafka.clients.NetworkClient) 
[kafka-raft-outbound-request-thread]
java.net.UnknownHostException: 
kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local
        at 
java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:801)
        at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
        at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1385)
        at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1306)
        at 
org.apache.kafka.clients.DefaultHostResolver.resolve(DefaultHostResolver.java:27)
        at org.apache.kafka.clients.ClientUtils.resolve(ClientUtils.java:125)
        at 
org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.resolveAddresses(ClusterConnectionStates.java:536)
        at 
org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.currentAddress(ClusterConnectionStates.java:511)
        at 
org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.access$200(ClusterConnectionStates.java:466)
        at 
org.apache.kafka.clients.ClusterConnectionStates.currentAddress(ClusterConnectionStates.java:173)
        at 
org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:1075)
        at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:321)
        at 
org.apache.kafka.server.util.InterBrokerSendThread.sendRequests(InterBrokerSendThread.java:146)
        at 
org.apache.kafka.server.util.InterBrokerSendThread.pollOnce(InterBrokerSendThread.java:109)
        at 
org.apache.kafka.server.util.InterBrokerSendThread.doWork(InterBrokerSendThread.java:137)
        at 
org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:136)
 {code}
 * We manually verified the last offsets of the replicas, and broker 0 is caught 
up in the partition.
 * Even after stopping the produce load, the issue persists.
 * Even after removing the /controller node manually (to retrigger election), 
regardless of which broker becomes the controller, the issue persists.

Based on the above, it seems that during the rollback, brokers in migration 
mode cannot handle the KRaft controllers being removed from the system. Since 
broker 0 is caught up in the partition, we suspect that the other brokers 
(still in migration mode) do not respect the controller state in ZK, and do not 
report changes in the ISR of the partitions they are leading.

This means that if a replica becomes out of sync in the last restart (e.g. due 
to a slow broker restart), we cannot restart the brokers while ensuring zero 
downtime.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Created] (KAFKA-18866) JDK23: UnsupportedOperationException: getSubject is supported only if a security manager is allowed

2025-02-26 Thread Christian Habermehl (Jira)
Christian Habermehl created KAFKA-18866:
---

 Summary: JDK23: UnsupportedOperationException: getSubject is 
supported only if a security manager is allowed
 Key: KAFKA-18866
 URL: https://issues.apache.org/jira/browse/KAFKA-18866
 Project: Kafka
  Issue Type: Bug
  Components: security
Affects Versions: 3.8.1
 Environment: e.g.
OpenJDK 64-Bit Server VM Corretto-23.0.2.7.1 (build 23.0.2+7-FR, mixed mode, 
sharing)
All OSes should be affected
Reporter: Christian Habermehl


The Kafka client is unable to connect to the broker with JDK 23, because the 
SecurityManager is deprecated:
{code}
Caused by: javax.security.sasl.SaslException: User name or extensions could not 
be obtained
        at 
org.apache.kafka.common.security.scram.internals.ScramSaslClient.evaluateChallenge(ScramSaslClient.java:112)
        at 
org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.lambda$createSaslToken$1(SaslClientAuthenticator.java:535)
        at 
java.base/jdk.internal.vm.ScopedValueContainer.callWithoutScope(ScopedValueContainer.java:162)
        at 
java.base/jdk.internal.vm.ScopedValueContainer.call(ScopedValueContainer.java:147)
        at java.base/java.lang.ScopedValue$Carrier.call(ScopedValue.java:420)
        at java.base/java.lang.ScopedValue.callWhere(ScopedValue.java:568)
        at java.base/javax.security.auth.Subject.callAs(Subject.java:439)
        at java.base/javax.security.auth.Subject.doAs(Subject.java:614)
        at 
org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.createSaslToken(SaslClientAuthenticator.java:535)
        at 
org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.sendSaslClientToken(SaslClientAuthenticator.java:434)
        at 
org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.sendInitialToken(SaslClientAuthenticator.java:333)
        at 
org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.authenticate(SaslClientAuthenticator.java:274)
        at 
org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:181)
        at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:547)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:485)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:595)
        at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:281)
        at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:231)
        at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:289)
        at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:263)
        at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.coordinatorUnknownAndUnreadySync(ConsumerCoordinator.java:450)
        at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:482)
        at 
org.apache.kafka.clients.consumer.internals.LegacyKafkaConsumer.updateAssignmentMetadataIfNeeded(LegacyKafkaConsumer.java:652)
        at 
org.apache.kafka.clients.consumer.internals.LegacyKafkaConsumer.poll(LegacyKafkaConsumer.java:611)
        at 
org.apache.kafka.clients.consumer.internals.LegacyKafkaConsumer.poll(LegacyKafkaConsumer.java:591)
        at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:874)
...
Caused by: java.lang.UnsupportedOperationException: getSubject is supported 
only if a security manager is allowed
        at java.base/javax.security.auth.Subject.getSubject(Subject.java:347)
        at 
org.apache.kafka.common.security.authenticator.SaslClientCallbackHandler.handle(SaslClientCallbackHandler.java:58)
        at 
org.apache.kafka.common.security.scram.internals.ScramSaslClient.evaluateChallenge(ScramSaslClient.java:104)
        ... 28 common frames omitted
{code}

The workaround for JDK 23 is to use the JVM flag 
{{-Djava.security.manager=allow}}. As far as I know, this won't work for JDK 24.
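
For illustration, a minimal standalone sketch (not Kafka code) of the API 
difference behind the stack trace, assuming JDK 18 or newer; {{Subject.current()}} 
is the replacement that does not depend on the security manager:
{code:java}
import javax.security.auth.Subject;

public class SubjectLookupSketch {
    public static void main(String[] args) throws Exception {
        Subject subject = new Subject();
        // Subject.callAs binds the subject, as the Kafka authenticator does.
        String result = Subject.callAs(subject, () -> {
            // Old pattern (as in the reported SaslClientCallbackHandler):
            //   Subject.getSubject(AccessController.getContext())
            // throws UnsupportedOperationException on JDK 23 unless
            // -Djava.security.manager=allow is set, and cannot work on JDK 24+.
            //
            // JDK 18+ replacement that works without the flag:
            Subject current = Subject.current();
            return current == subject
                ? "current() returned the bound subject"
                : "unexpected subject";
        });
        System.out.println(result);
    }
}
{code}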



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18867) add tests to describe topic configs with empty name

2025-02-26 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18867:
--

 Summary: add tests to describe topic configs with empty name
 Key: KAFKA-18867
 URL: https://issues.apache.org/jira/browse/KAFKA-18867
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai


Describing default topic configs (i.e. a TOPIC resource with an empty name) is 
disallowed, but we lack a test for this. {{testDescribeConfigsForTopic}} would be 
a good place to add a new integration test.
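
A minimal sketch of what such a test body might look like, assuming an {{admin}} 
client connected to the test cluster; the exact error to assert should be 
confirmed against the broker behaviour:
{code:java}
import java.util.List;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.common.config.ConfigResource;
import static org.junit.jupiter.api.Assertions.assertThrows;

class DescribeEmptyTopicConfigSketch {
    // Hypothetical test body; in the real suite this would live next to
    // testDescribeConfigsForTopic and reuse its admin client.
    void describeTopicConfigsWithEmptyNameShouldFail(Admin admin) {
        ConfigResource emptyTopic = new ConfigResource(ConfigResource.Type.TOPIC, "");
        // The request is expected to be rejected by the broker.
        assertThrows(ExecutionException.class,
            () -> admin.describeConfigs(List.of(emptyTopic)).all().get());
    }
}
{code}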



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18869) add remote storage threads to "Updating Thread Configs" section

2025-02-26 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18869:
--

 Summary: add remote storage threads to "Updating Thread Configs" 
section
 Key: KAFKA-18869
 URL: https://issues.apache.org/jira/browse/KAFKA-18869
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai


# remote.log.reader.threads
# remote.log.manager.copier.thread.pool.size
# remote.log.manager.expiration.thread.pool.size

Those configs should be added.
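
For context, a minimal sketch of how such a thread config could be updated 
dynamically via the Admin client, assuming these configs are dynamically 
updatable as the "Updating Thread Configs" section implies; the bootstrap address 
and new value are placeholders:
{code:java}
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class UpdateRemoteLogThreads {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            // Empty broker name targets the cluster-wide dynamic default.
            ConfigResource cluster = new ConfigResource(ConfigResource.Type.BROKER, "");
            AlterConfigOp op = new AlterConfigOp(
                new ConfigEntry("remote.log.reader.threads", "12"),
                AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(cluster, List.of(op))).all().get();
        }
    }
}
{code}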



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18868) add the "default value" explanation to the docs of num.replica.alter.log.dirs.threads

2025-02-26 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18868:
--

 Summary: add the "default value" explanation to the docs of 
num.replica.alter.log.dirs.threads
 Key: KAFKA-18868
 URL: https://issues.apache.org/jira/browse/KAFKA-18868
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai


The default value of {{num.replica.alter.log.dirs.threads}} is equal to the 
number of log directories, but the documentation doesn't mention this. Users only 
see a "null" default in the docs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18870) implement describeDelegationToken for controller

2025-02-26 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18870:
--

 Summary: implement describeDelegationToken for controller
 Key: KAFKA-18870
 URL: https://issues.apache.org/jira/browse/KAFKA-18870
 Project: Kafka
  Issue Type: Sub-task
Reporter: Chia-Ping Tsai


as title



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18872) Convert StressTestLog and TestLinearWriteSpeed to jmh benchmarks

2025-02-26 Thread Mickael Maison (Jira)
Mickael Maison created KAFKA-18872:
--

 Summary: Convert StressTestLog and TestLinearWriteSpeed to jmh 
benchmarks
 Key: KAFKA-18872
 URL: https://issues.apache.org/jira/browse/KAFKA-18872
 Project: Kafka
  Issue Type: Task
Reporter: Mickael Maison


Both StressTestLog and TestLinearWriteSpeed are in the jmh-benchmarks module 
but are not actually benchmarks.

We can keep the main methods if people want, but it would be helpful to be able 
to run them as benchmarks to ensure changes to our log layer do not negatively 
impact performance.
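
A minimal JMH skeleton of what the conversion could look like; the class name and 
the payload loop are stand-ins for illustration, not the actual TestLinearWriteSpeed 
logic:
{code:java}
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.TearDown;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
@Fork(1)
public class LinearWriteSpeedBenchmark {

    private byte[] payload;

    @Setup(Level.Trial)
    public void setUp() {
        // Stand-in for the record batch built by the existing tool.
        payload = new byte[1024];
    }

    @Benchmark
    public int append() {
        // Stand-in work; the converted benchmark would call the log append path here.
        int sum = 0;
        for (byte b : payload) {
            sum += b;
        }
        return sum;
    }

    @TearDown(Level.Trial)
    public void tearDown() {
        payload = null;
    }
}
{code}
The existing main methods could simply delegate to the JMH runner so both entry 
points keep working.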



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-18757) Create full-function SimpleAssignor to match KIP-932 description

2025-02-26 Thread Abhinav Dixit (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhinav Dixit resolved KAFKA-18757.
---
Fix Version/s: 4.1.0
   Resolution: Fixed

> Create full-function SimpleAssignor to match KIP-932 description
> 
>
> Key: KAFKA-18757
> URL: https://issues.apache.org/jira/browse/KAFKA-18757
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Abhinav Dixit
>Assignee: Abhinav Dixit
>Priority: Major
> Fix For: 4.1.0
>
>
> The SimpleAssignor currently assigns all the subscribed topic partitions to 
> all the members of a share group. We need to change it to match the KIP-932 
> description 
> [https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=255070434#KIP932:QueuesforKafka-TheSimpleAssignor]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)