Hi Matthias,
I appreciate your feedback; it really helped me a lot!
Regarding upgrading Streams from a very old version to 4.x, I understand
that Streams is much more complicated than the Kafka client.
I think it's reasonable to do two bumps, but I'm not a Kafka Streams
expert, and I would like to hear others' opinions as well.
I just updated the KIP, and I hope I have addressed all your comments
above.
Please let me know if I missed anything.
Best,
Kuan-Po Tseng
On 2025/02/25 03:32:06 "Matthias J. Sax" wrote:
Thanks all. Seems we are converging. :)
Again, sorry for the previous very long email, but I thought it's
important to paint a full end-to-end picture.
I agree that we should try to keep it simple to the extent reasonably
possible! If we really want to suggest just 2.8 / 3.9 as bridge
versions, I am ok with this.
About upgrades across multiple major versions: yes, it's certainly
possible for some simple apps, and if we want to keep the guidelines
simple, we can also drop 2.8 as bridge release and only use 3.9. My take
was just that it seems to de-risk an upgrade to not recommend skipping
a major release; but I can be convinced otherwise :)
Guess it would simplify the table, and we could cut one column. Let's
hear from Sophie again about it.
For the rows in the table:
Kafka Streams
0.11.x - 2.3.x
It says
No, you'll need to do two-rolling bounce twice.
I don't think that "two-rolling bounce twice" is necessary, and it might
be simpler to give less detail in the KIP anyway and refer to the docs
instead?
Similarly for other rows: whether a two-rolling bounce is necessary is an
"it depends" answer in many cases, and just omitting it from this table
might be easier.
Instead, it might be good to call out what needs to be changed, i.e.,
the need to switch from "eager" to "cooperative" rebalancing if the old
version is 0.11 to 2.3. Similar to what we already say for:
Kafka Streams
2.4.x - 3.9.x with EOSv1
where we call out that EOSv1 is removed with 4.0.
Side question: do we need to call out that the rows are not mutually
exclusive, but cumulative? I.e., if one is on 2.2 with EOSv1 enabled, two
rows apply: need to switch from eager to cooperative rebalancing, and
need to switch from EOSv1 to EOSv2. -- Or is this clear anyway?
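To illustrate the "cumulative rows" reading, here is a minimal sketch in Python. It only models the two conditions named above (pre-2.4 versions use eager rebalancing; EOSv1 is removed with 4.0). The function names and the wording of the returned strings are hypothetical and not taken from the KIP.

```python
def parse_version(text):
    """Turn '2.2' or '0.11.0' into a comparable (major, minor) tuple."""
    parts = text.split(".")
    return int(parts[0]), int(parts[1])


def required_changes(streams_version, uses_eos_v1):
    """Collect every change that applies before moving to 4.x.

    The two checks are independent of each other, so both can apply
    to the same application (the rows are cumulative).
    """
    changes = []
    if parse_version(streams_version) < (2, 4):
        changes.append("switch from eager to cooperative rebalancing")
    if uses_eos_v1:
        changes.append("switch from EOSv1 to EOSv2")
    return changes


# An application on 2.2 with EOSv1 enabled matches both rows.
print(required_changes("2.2", uses_eos_v1=True))
# An ALOS application on 3.5 matches neither row.
print(required_changes("3.5", uses_eos_v1=False))
```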
Since Kafka Streams 2.4.0 introduced cooperative rebalancing, which is
enabled by default, it is no longer possible to directly upgrade a Streams
application from a version prior to 2.4 to 2.4 or higher.
"which is enabled by default" is not really the reason why the upgrade
from 2.3 (and earlier) to 4.0 breaks. The reason is, that "eager"
support is dropped with 4.0.
-Matthias
On 2/24/25 8:28 AM, Kuan Po Tseng wrote:
Hello everyone,
Thanks Chia-Ping for the advice. I’ve created a table to cover all
upgrade path scenarios, hoping it provides more clarity. Please let me know
if I’ve misunderstood anything. I appreciate any corrections!
Additionally, as I recently updated the KIP title, here’s the new link:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1124%3A+Providing+a+clear+Kafka+Client+upgrade+path+for+4.x
Regarding Kafka Connect, I’m still investigating and will update the
KIP soon. I’ll share any new findings with you as soon as possible.
Thank you!
Best,
Kuan-Po
On 2025/02/23 19:12:41 Chia-Ping Tsai wrote:
hi Kuan-Po
Apologies for my mistake... Indeed, 2.1 should be the starting point
for the bridge version.
Have you updated the KIP? it seems the bridge version of client still
starts from 2.0
Additionally, if I were a user hesitant to adopt the bridge version,
it would be helpful to list common reasons to aid in choosing the "best"
bridge version. For example:
Client Upgrade Paths
// here is the table
Best Bridge Version for you
// add some explanation
1. minimize code refactoring - // Kafka Client: 3.3.x - 3.9.x + Kafka
Streams: 3.6.x - 3.9.x
2. starts with quorum APIs - // 3.3.x - 3.9.x
3. xxx
4. aaa
n. last stable/active version: 3.9.x // we can emphasize that 3.9 is
recommended by the community
Best,
Chia-Ping
On 2025/02/23 16:10:03 Kuan Po Tseng wrote:
Thanks, Jun and Juma,
Apologies for my mistake... Indeed, 2.1 should be the starting point
for the bridge version.
I will revise my statement as follows:
- For Kafka Clients below 2.1, users need to upgrade to 3.9 first,
then to 4.x.
- For Kafka Clients from 2.1 or above, users can directly upgrade to
4.x.
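As a minimal sketch of the two client rules above (Python, with a hypothetical function name; it assumes the input is a released version below 4.0):

```python
def client_upgrade_path(current_version):
    """Return the list of versions to step through for Kafka Clients.

    Clients below 2.1 go through 3.9 as the bridge version;
    clients on 2.1 or above can go to 4.x directly.
    """
    parts = current_version.split(".")
    major_minor = (int(parts[0]), int(parts[1]))
    if major_minor < (2, 1):
        return [current_version, "3.9", "4.x"]
    return [current_version, "4.x"]


print(client_upgrade_path("2.0"))  # needs the 3.9 bridge
print(client_upgrade_path("2.1"))  # direct upgrade
```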
As for Kafka Connect, I initially didn’t consider it because I saw
it as another form of a server.
However, I’m not very familiar with this area and will need some
time to look into it.
---
Thanks, Matthias and Sophie, for the detailed explanation.
The Streams upgrade is indeed complex... I took some time to digest
these details.
It sounds like the key concerns for the Streams upgrade are:
1. Users still using EOSv1
2. Users relying on Eager rebalancing (i.e., versions before 2.4)
With this in mind, should we adjust the recommended upgrade path to
the following?
- For Kafka Streams below 2.4, you’ll need to first upgrade to 2.8,
then to 3.9, and finally to 4.x.
- For Kafka Streams using EOSv1 on versions 2.4 or above, you’ll
need to upgrade to 3.9 first, then to 4.x.
- Others can directly upgrade to 4.x.
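The three Streams rules proposed above can be sketched the same way. This is only an illustration in Python with hypothetical names. It models version hops only, so config changes made on a bridge release (such as moving from EOSv1 to EOSv2) are not represented.

```python
def parse_version(text):
    """Turn '2.3' or '0.11.0' into a comparable (major, minor) tuple."""
    parts = text.split(".")
    return int(parts[0]), int(parts[1])


def streams_upgrade_path(current_version, uses_eos_v1=False):
    """Return the version hops for a Kafka Streams application."""
    current = parse_version(current_version)
    if current < (2, 4):
        bridges = ["2.8", "3.9"]   # eager rebalancing era
    elif uses_eos_v1:
        bridges = ["3.9"]          # EOSv1 on 2.4 or above
    else:
        bridges = []               # everyone else: direct upgrade
    # Skip a bridge the application is already running on.
    hops = [b for b in bridges if parse_version(b) > current]
    return [current_version] + hops + ["4.x"]


print(streams_upgrade_path("2.3"))
print(streams_upgrade_path("2.6", uses_eos_v1=True))
print(streams_upgrade_path("3.0"))
```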
*Note*: If upgrading Streams from 3.4 or below, you’ll need to
perform two rolling bounces.
For more details, please refer to: [Kafka Streams Upgrade Guide](https://kafka.apache.org/39/documentation/streams/upgrade-guide).
Also, Kafka Streams requires broker compatibility considerations,
see: [Streams API Broker Compatibility](https://kafka.apache.org/39/documentation/streams/upgrade-guide#streams_api_broker_compat)
The above approach simplifies the definition of the bridge version.
Instead of providing a range (e.g., [3.5-3.9]), would it be better to
specify 3.9 directly to reduce users' decision anxiety? This aligns with
Juma’s recommendation for bridge versions in the Kafka Clients upgrade.
3.9 is the last version before 4.0, containing the most bug fixes and
offering greater stability.
For detailed upgrade steps, such as two-rolling bounce upgrades, we
should direct users to the Streams upgrade documentation, as Matthias
recommended.
I also feel the examples in my KIP are too simplistic, but I don’t think
I should make them as detailed as the Streams upgrade guide; otherwise,
I’d just be duplicating content. The goal of this KIP should be to
provide users with a clear high-level upgrade path, while the detailed
steps should be referenced from the Streams upgrade documentation.
Regarding ALOS, it looks like we support versions from 0.11.x to 4.x,
so there’s no need to specify additional details, right? Or am I
missing something?
Lastly, based on Matthias' suggestion, I have revised the motivation
section to emphasize:
- Simplifying testing
- Reducing compatibility challenges across too many versions
- Clearly defining the recommended upgrade paths
Best,
Kuan-Po Tseng
On 2025/02/21 23:55:51 Sophie Blee-Goldman wrote:
whew, long response from Matthias :P Lot to digest but I want to add
on/respond to a few points:
If they want to be "advantageous", they could make it a two step upgrade
I guess, and go from 2.5 (or older) directly to 3.x and apply all
required code changes in a single upgrade step, and repeat the same to
upgrade to 4.0. But I would not necessarily recommend to do a non-API
compatible upgrade directly, and for sure officially discourage it for
two major releases.
Are we still talking about only API compatibility here? Because I'm not
so sure why we would officially discourage upgrading across 2 major
releases as long as their code is compatible. Of course if you're
referring to possible gotchas from upgrading over such a long period,
that's worth discussing, but it's independent of API compatibility. Imo
API compatibility is a binary thing: either it's possible to do a direct
upgrade or it's not. Why do we have to officially recommend anything?
Or we can distinguish between ALOS and EOS and have a "bridge release
version range" for both cases.
I like this idea. EOS and ALOS are very distinct in Streams and may only
become more divided over time. It's worth calling them out as separate
cases.
Now regarding the eager/cooperative rebalancing protocol thing in
Streams:
As Matthias said we hope to officially drop support for eager rebalancing
in 4.0, and I've prepared a PR for this already:
https://github.com/apache/kafka/pull/18988
This does have the effect of forcing a bridge release for users hoping to
upgrade directly from 2.3 or below to 4.0+, and users will have to follow
a specific upgrade path to do so as outlined in the PR description.
Assuming we fit that into 4.0, it should definitely be called out in this
KIP. (Basically users need to use the `upgrade.from` config to first
upgrade to the bridge release, then go on to 4.0)
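A rough sketch of the two-rolling-bounce pattern that `upgrade.from` is used for, assuming the procedure described in the Streams upgrade guide: the first bounce deploys the bridge release with `upgrade.from` set to the old version, and the second bounce removes that setting. The function name and the shape of the returned plan are hypothetical. The PR description and the upgrade guide remain the authoritative source for the exact steps.

```python
def two_rolling_bounce_plan(old_version, bridge_version):
    """Describe the Streams config used in each rolling bounce.

    Bounce 1: new jar, but `upgrade.from` tells it to keep talking the
              old (eager) protocol while old instances are still around.
    Bounce 2: `upgrade.from` is removed, so every instance switches to
              the new behavior.
    """
    return [
        {
            "bounce": 1,
            "streams_version": bridge_version,
            "config": {"upgrade.from": old_version},
        },
        {
            "bounce": 2,
            "streams_version": bridge_version,
            "config": {},  # upgrade.from removed
        },
    ]


for step in two_rolling_bounce_plan("2.3", "3.9"):
    print(step)
```

Only after both bounces on the bridge release are complete would the application move on to 4.0.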
There are also other runtime incompatibilities that have been introduced
into Streams over the years that restrict direct live upgrades across
certain versions. It would also be good to call this out in the KIP and
point to the `upgrade.from` config, though we can point to the Streams
upgrade guide for details rather than try to reiterate everything here.
On Thu, Feb 20, 2025 at 5:21 PM Matthias J. Sax <mj...@apache.org> wrote:
Hello,
took me some time, and sorry for the long email, but it's
complicated...
First, I just re-read the latest version of the KIP. Thanks for all the
updates.
One thing that I am missing in the motivation is that we really want to
stop supporting direct upgrades from older versions, to cut down our
upgrade matrix for testing. The motivation does somewhat touch on it,
but I think it would be good to be more explicit. Even if it isn't
something users care about, it's a second main motivation for us, in
addition to the complexity to actually keep the versions compatible.
I also want to further clarify my understanding of the KIP. The goal is
not to define what upgrades are possible, right? What is possible is
much more nuanced. -- But we rather want to define what we recommend? Is
this understanding correct? If yes, it might also be worth adding to the
motivation section.
I also think we actually need to more explicitly distinguish three
categories of compatibility, but did so far only discuss two of them,
even if the KIP does mention all three. Ideally, we should have a
section in the motivation, explaining the three different types of
compatibility, and explicitly state which one this KIP is concerned
with, and which ones it's not concerned with.
(1) protocol compatibility: ie, what client-broker versions are
compatible
This one is not in the focus of the KIP, but it might still be good to
be explicit about it. Could be explained in the motivation for
completeness, and maybe refer to KIP-896 for 4.x related changes.
Btw: there are also some additional limitations for KS-broker
compatibility:
https://kafka.apache.org/39/documentation/streams/upgrade-guide#streams_api_broker_compat
Many of you know this, but wanted to mention it for completeness. Not
sure if we need to mention it on the KIP.
(2) API compatibility (ie Java/Scala API).
This is only mentioned briefly in the KIP, and again, it's not the core
of the KIP, but I think it is still important to include it more
explicitly, because we talk about "bridge version".
Given the rule that we are allowed to break API compatibility in a major
release, but still guarantee API compatibility for the last three minor
releases, it can be confusing and it would be great to explain it
better.
In the end, directly upgrading from 2.5 or older to 4.x is practically
impossible as we went through two major releases which did remove
deprecated APIs, and I would not recommend to do such a direct upgrade.
From an API POV, if one is on 2.5 or older, they should first upgrade
to 2.6/2.7/2.8, and then lazily migrate off any older stuff that is
removed with 3.0. Afterwards, they can upgrade to 3.7/3.8/3.9 following
the same pattern, and only upgrade to 4.0 in a third step.
If they want to be "advantageous", they could make it a two step upgrade
I guess, and go from 2.5 (or older) directly to 3.x and apply all
required code changes in a single upgrade step, and repeat the same to
upgrade to 4.0. But I would not necessarily recommend to do a non-API
compatible upgrade directly, and for sure officially discourage it for
two major releases.
Thus, the information in the KIP about "bridge version" to 4.x being
"2.4.x - 3.9.x" seems to fall short, and mentioning
To minimize code refactoring, we recommend the following bridge versions
that maintain API compatibility with Kafka 4.x:
Kafka Client: 3.3.x - 3.9.x
Kafka Streams: 3.6.x - 3.9.x
seems not to be sufficient to me.
Hence, the provided "Upgrade Examples" might be oversimplified, and we
might want to refine them.
(3) Runtime compatibility. This one is specific to Kafka Streams, but
not to clients from my understanding. Clients are stateless and thus they
don't face any issue, but Kafka Streams is stateful, and thus needs to
take care of it. Please correct me if I am wrong.
The KIP so far seems to only consider this one, and what is proposed
makes sense to me on a high level. However, I am confused why Kafka
Clients are mentioned here, too, as this type of compatibility should
not really be relevant for them? Even if clients might also have some
semantic changes, these should always align with API changes (ie, the
old deprecated API might have slightly different semantics than the new
API).
Now about the currently proposed ranges for Kafka Streams:
Kafka Streams
Current Version 0.11.x - 2.3.x
Bridge Release 2.4.x - 3.9.x
Target Version 4.x
This could make sense for "eager vs cooperative" rebalancing, however,
at the current point, we did not remove "eager" in 4.0 yet. I was
actually just syncing up with Sophie about it, and it was a slip, and we
want to propose to remove "eager" in 4.0 (Sophie will prepare a PR), so
we can avoid keeping "eager" until 5.0.
We did officially deprecate "eager" in 3.1 release, so we are covered to
actually remove it with 4.0.
If we would not drop "eager", using `2.4.x to 3.9.x` would not make
sense though. If we keep "eager" in 4.0, users can still upgrade from
2.0.x to 4.0.x w/o issues from a runtime perspective.
If we drop "eager" we also need to drop the corresponding system tests
that upgrade to 4.0, and also stop testing upgrading from "eager to
cooperative" with 3.9 being the highest target version in this system
test. And if we don't test it, it's not officially supported any
longer... (even if people could still upgrade via an offline upgrade --
what really breaks if we remove "eager" is "only" the online
[two-]rolling bounce upgrade...)
However, there is another change we want to consider: we did remove
EOSv1 in 4.0 release, which was replaced with EOSv2 in Kafka Streams 2.6
via KIP-447.
Thus, EOSv1 users cannot directly upgrade to 4.0 either; only EOSv2
users can. Thus, it might make sense to actually use "bridge releases
2.6.x - 3.9.x" just to keep it simple... Or we can distinguish between
ALOS and EOS and have a "bridge release version range" for both cases.
Btw: using EOSv2 requires broker version 2.5+, which we might also want
to call out.
Last but not least, while we are very explicit in the KS upgrade docs,
it might be worth calling out that some upgrades require a two-rolling
bounce approach, and users should always consult the upgrade docs... We
use two-rolling bounce upgrade to bridge runtime backward incompatible
changes (similar to what we do broker side, when IBP version is bumped).
So overall, it seems that we really need to have two guidelines, not
just one? One for API compatibility, which is much stricter, and one for
runtime compatibility?
If we really want to make a recommendation that is easiest to
understand, we might want to only go with API compatibility. Not sure if
this might be "too restrictive" though?
Curious to get your thoughts on all this.
-Matthias
On 2/19/25 5:51 PM, Kuan Po Tseng wrote:
Hi Lianet,
Thank you for your feedback!
Yes, the current KIP focuses solely on the client upgrade for 4.x. I
have updated the title accordingly and also included the KS upgrade link
in the KIP.
Thanks again!
Best regards,
Kuan-Po
On 2025/02/19 16:59:25 "Lianet M." wrote:
Hello all, sorry a bit late, just minor comments on this one:
- Should we clarify in the title or at the beginning of the KIP that it
is proposing a client upgrade path for 4.x? The broader considerations
for upgrades discussed in this thread will be tackled separately (seems
we all agree).
- The KS upgrade path seems to be the tricky one, and all that the user
needs to consider to successfully follow the provided path for KS is not
clear in the KIP, but it's all well explained on the KS upgrade notes for
3.9, should we add a ref to that?
https://kafka.apache.org/39/documentation/streams/upgrade-guide
Thanks Kuan Po!
Lianet
On Tue, Feb 11, 2025 at 11:22 AM Kuan Po Tseng <brandb...@gmail.com> wrote:
Hello everyone,
If there are no other opinions, I would like to start a vote tomorrow,
thank you!
Best,
Kuan Po
On Sat, Feb 8, 2025 at 1:51 AM Kuan Po Tseng <brandb...@apache.org> wrote:
Hi all,
Based on our discussion, I added a section on choosing the appropriate
bridge version from an API compatibility perspective for upgrading to
Kafka 4.0. Let me know if you have any thoughts. Thank you!
Best,
Kuan-Po
On 2025/02/07 03:34:46 Kuan Po Tseng wrote:
Hi Chia-Ping,
Sorry for the delayed response. I’ve checked all relevant JIRAs using
the following JQL (Jira Query Language) query:
project = KAFKA AND status in (Resolved, Closed) AND fixVersion = 4.0.0 AND text ~ "Remove" order by updated DESC
Based on this, I checked the JIRAs related to removing deprecated
methods in client modules. The minimum backward-compatible client
versions for client 4.0 are as follows:
- Producer: 3.3.0
Reason: Partitioner#onNewBatch was deprecated in 3.3.0, and was removed
by https://issues.apache.org/jira/browse/KAFKA-18295
- Consumer: 2.4.0
Reason: Consumer#committed was deprecated in 2.4.0, and was removed by
https://issues.apache.org/jira/browse/KAFKA-17451
- Admin: 3.3.0
Reason: ListConsumerGroupOffsetsOptions was deprecated in 3.3.0 and was
removed by https://issues.apache.org/jira/browse/KAFKA-18291
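If an application uses all three client types, the lowest version that is API-compatible with 4.0 is the highest of the three per-client minimums. A minimal Python sketch of that arithmetic follows. The dictionary simply restates the list above, and the helper names are hypothetical.

```python
# Minimum backward-compatible versions per client type, from the list above.
MIN_COMPATIBLE = {
    "Producer": "3.3.0",
    "Consumer": "2.4.0",
    "Admin": "3.3.0",
}


def version_key(text):
    """Compare versions numerically, so that '3.10.0' sorts above '3.9.0'."""
    return tuple(int(part) for part in text.split("."))


def lowest_common_version(minimums):
    """The strictest (highest) minimum is the lowest version that works for all."""
    return max(minimums.values(), key=version_key)


print(lowest_common_version(MIN_COMPATIBLE))  # prints 3.3.0
```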
You can find a list of all related JIRAs and pull requests in this
Google Sheet:
https://docs.google.com/spreadsheets/d/1ZWNRk1rjWptjpGM2UtT0Q3lDULhrqkP_UfHr9roQW3M/edit?usp=sharing
There are also some public methods removed in 4.0, such as:
- KafkaFuture#Function, KafkaFuture#thenApply
https://issues.apache.org/jira/browse/KAFKA-17903
- JmxReporter(String)
https://issues.apache.org/jira/browse/KAFKA-18077
but I'm uncertain about how we should handle these.
Best,
Kuan-Po
On 2025/02/06 19:08:49 Chia-Ping Tsai wrote:
hi Kuan-Po
any update? Now that an upgrade path for bridge versions exists, we can
introduce additional "conditions" to assist users in selecting the
"best" bridge version. For example, we can provide guidance on which
bridge versions offer backward compatibility with Kafka 4.0 client or
are compatible with Kafka 4.0 server.
Best,
Chia-Ping
On 2025/01/22 04:48:36 Chia-Ping Tsai wrote:
- If we support 2.0+ to 4.0 client/KS upgrade it's simpler, but of
course brokers cannot be 4.0 yet -- but I guess this would be something
natural? Given that the clients would be on 2.0, brokers cannot be 4.0
yet, or clients would have crashed already... Thus, I think I slightly
prefer this one.
Using a major version as a bridge is a viable approach. We can
emphasize the limitations of this method to guide users in selecting the
most suitable bridge version.
For KS, from an API compatibility POV, upgrading from anything older
than 3.6 might not work any longer (for DSL users; of course, depending
on what APIs they are using). And for PAPI, the old API was removed too,
so a seamless upgrade would work smoothly only if the new one
(introduced in 2.7) is used.
You make a valid point. The previous discussion overlooked the APIs
that were removed in version 4.0.
We could also emphasize the BC advantages. As an example, users have
the option of using version 2.7 as a bridge and subsequently upgrading
without code alterations or recompilation. Of course, we need to check
the versions of the other PAPI removals.
Best,
Chia-Ping
Matthias J. Sax <mj...@apache.org> wrote on 2025-01-22 at 2:55 AM:
For KS, from an API compatibility POV, upgrading from anything older
than 3.6 might not work any longer (for DSL users; of course, depending
on what APIs they are using). And for PAPI, the old API was removed too,
so a seamless upgrade would work smoothly only if the new one
(introduced in 2.7) is used.