Re: What branches should perf fixes be targeting

Dmitry Konstantinov Thu, 23 Jan 2025 06:19:59 -0800

Hi Stefan,

Thank you very much for the detailed feedback! A few comments:


>> I think this is already the case, more or less. We are not doing perf
changes in older branches.
Yes, I understand the concern about the stability of older branches. The
primary issue for me is that if I contribute even a small improvement to
trunk, I cannot really use it for a long time (except by carrying it in my
own fork), because there is no release that delivers it to me or anybody
else.

>> Maybe it would be better to make the upgrading process as smooth as
possible so respective businesses are open to upgrade their clusters in a
more frequent manner.
About the upgrade process: my personal experience with upgrades in Cassandra
(3.0.x -> 3.11.x -> 4.0.x -> 4.1.x) has been positive (I suppose the
automated tests which cover upgrades are really helpful), and I have not
experienced any serious issues. I suppose most of the time when people have
an issue with upgrades, it is due to delaying them for too long and staying
on very old unsupported versions until the last moment.

>>  Cassandra is not JDK. We need to fix bugs in older branches we said we
support
Regarding the necessity to support older branches, it is the same story
for the JDK: they now support and fix bugs in JDK 8, JDK 11, JDK 17 and
JDK 21 as LTS versions and JDK 23 as the latest release, while developing
and releasing JDK 24 now.
Another example: Postgres does a major release every year:
https://www.postgresql.org/support/versioning/ and supports the last 5
major versions.

>> please keep in mind that there are people behind the releases who are
spending time on that.
Yes, as I already mentioned, I am really thankful to Brandon and Mick for
doing it! It is hard, exhausting and not the most exciting work to do.
Please contact me if I can help somehow with it, e.g. by checking and fixing
CI test failures (I have already been doing that for a while), doing some
scripting, etc.
I have a hypothesis (maybe I am completely wrong here) that the low interest
in the release process is somehow related to many contributors having their
own Cassandra fork, so there is no big demand for regular mainline releases
if you already have the changes in a fork.

Regards,
Dmitry

On Thu, 23 Jan 2025 at 12:30, Štefan Miklošovič <smikloso...@apache.org>
wrote:

> I think the current guidelines are sensible.
>
> Going through your suggestions:
>
> 1) I think this is already the case, more or less. We are not doing perf
> changes in older branches. This is what we see in CASSANDRA-19429, a user
> reported that it is a performance improvement, and most probably he is
> right, but I am hesitant to refactor / introduce changes into older
> branches.
>
> Cassandra has a lot of inertia; we cannot mess with what works, even if
> performance improvements are appealing. Maybe it would be better to make
> the upgrading process as smooth as possible so respective businesses are
> open to upgrade their clusters in a more frequent manner.
>
> 2) Well, but Cassandra is not JDK. We need to fix bugs in older branches
> we said we support. This is again related to inertia Cassandra has as a
> database. Bug fixes are always welcome, especially if there is zero risk in
> deploying them.
>
> What particularly resonates with me is your wording "more frequent and
> predictable". Well ... I understand it would be the most ideal outcome, but
> please keep in mind that there are people behind the releases who are
> spending time on that. I have been following this project for a couple
> years and the only people who are taking care of releases are Brandon and
> Mick. I was helping here and there to at least stage it and I am willing to
> continue to do so, but that is basically it. "two and a half" people are
> doing releases. For all these years.
>
> So if you ask for more frequent releases, that is something which is going
> to directly affect respective people involved in them. I guess they are
> doing it basically out of courtesy and it would be great to see more PMCs
> involved in release processes. As of now, it looks like everybody just
> assumes that "it will be somehow released" and "releases just happen" but
> that is not the case. Releases are not "just happening". There are people
> behind them who need to plan when it is going to happen and they need to
> find time for that etc. There are a lot of things not visible behind the
> scenes and doing releases is a job in itself.
>
> So if we ask for more frequent releases, it is a good question to ask who
> would be actually releasing that.
>
> On Wed, Jan 22, 2025 at 12:17 PM Dmitry Konstantinov <netud...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> I am one of the contributors for the recent perf changes, like:
>> https://issues.apache.org/jira/browse/CASSANDRA-20165
>> https://issues.apache.org/jira/browse/CASSANDRA-20226
>> https://issues.apache.org/jira/browse/CASSANDRA-19557
>> ...
>>
>> My motivation: I am currently using 4.1.x and planning to adopt 5.0.x in
>> the next quarter. Of course, I want to have it in the best possible shape
>> from a performance point of view; performance is one of the important
>> selling points for upgrades. In general, performance is one of the key
>> reasons why people select NoSQL, and Cassandra in particular, so any
>> improvement here should be appreciated by users, especially in the current
>> cloud-oriented world where every such improvement is a potential cost
>> saving.
>>
>> For me the question is tightly related to release scheduling. We have
>> periodic and quite frequent patch releases now; many thanks to the
>> people who spend their time doing them. When we speak about minor
>> releases, the release process looks much slower and less predictable: it
>> can be a year or even more before I can get any minor release which
>> includes a change, and nobody can give even a preliminary date for it.
>> As a result, when I have a performance patch and it is suggested to merge
>> it only to trunk, I will not get the improvement back to use for a long
>> time. So, I have 2 options in this case:
>> 1) relax and wait (potentially losing interest due to delayed feedback)
>> 2) keep my own private fork to accumulate such changes, with the
>> corresponding overhead (which is what I actually do now)
>>
>> As a guy who supports Cassandra in production for systems with 99.999%
>> availability requirements, of course I care about stability too, but
>> I think we need some balance here, and we should rely more on things like
>> test coverage and different policies for different branches so that we do
>> not stagnate due to fear of any change. I am not talking about massive
>> breaking changes, especially those which modify (even in a compatible way)
>> network communication protocols or disk data formats; those should get a
>> separate individual discussion.
>>
>> The situation reminds me of the story of the JDK prior to Java 9. There
>> were also some big bang minor releases (1.5/1.6/1.7/1.8) which we waited a
>> very long time for, and Java was evolving very slowly. Now we have a model
>> where a new release is available every half a year and some of them are
>> supported long term. So, the people who prefer stability select and use
>> LTS versions, the people who want access to new features/improvements
>> can take the latest release, and all are happy. Similar models with
>> stable/latest releases are available for other products.
>>
>> So, my suggestion is one of the following options:
>> 1) Classify the current release branches as more and less stable, like:
>> -- 4.0.x/4.1.x - avoid perf changes unless the change is really bug-like
>> -- 5.0.x - more relaxed rules
>>
>> 2) Do something similar to the JDK with LTS versions: make minor releases
>> for the latest major version (like 5.1/5.2) more frequent and predictable,
>> like a release train; do not create a fix branch for every one, but
>> periodically establish fix branches for some selected minor versions and
>> release patch versions for them.
>>
>> Thank you,
>> Dmitry
>>
>>
>> On Wed, 22 Jan 2025 at 09:02, Jeff Jirsa <jji...@gmail.com> wrote:
>>
>>> I think the status quo is fine - perf goes to trunk, if you think
>>> something is special, it goes to the mailing list to justify exceptions
>>>
>>>
>>> On Jan 22, 2025, at 3:36 AM, Jordan West <jw...@apache.org> wrote:
>>>
>>> 
>>> Thanks for the initial feedback. I hear a couple different themes /
>>> POVs.
>>>
>>> David/Paulo, it sounds like maybe a guide for perf backports + mailing
>>> list consensus when necessary + clear documentation of this could be a way
>>> forward. I agree that each change comes with stability risks but at the
>>> same time the greatest stability risk with Cassandra historically has been
>>> major version upgrades (although we have made great improvements here). For
>>> folks who want only the performance improvements, we are asking them to
>>> take greater risk by upgrading a major version or to maintain a fork. The
>>> fork is reasonable for some of the larger operators but not others. That
>>> said, I do agree we need to use judgement. Not all changes are worth
>>> backporting and some may incur too much risk. We could also add to the
>>> guide suggestions of how to de-risk a change (e.g. code is isolated, config
>>> to turn it off / off by default, etc).
>>>
>>> Jeff, I agree 1% wins aren't worth it if they are invasive and in risky
>>> areas. Not all of the improvements are that minor.
>>>
>>> Jordan
>>>
>>> On Tue, Jan 21, 2025 at 1:57 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>>
>>>> We expect users to treat patch and minor releases as low risk. Changing
>>>> something deep in the storage engine to be 1% faster is not worth the risk,
>>>> because most users will skip the type of qualification that finds those one
>>>> in a billion regressions.
>>>>
>>>> Patch releases are for bug fixes not perf improvements.
>>>>
>>>>
>>>> On Jan 21, 2025, at 9:10 PM, Jordan West <jw...@apache.org> wrote:
>>>>
>>>> 
>>>> Hi folks,
>>>>
>>>> A topic that’s come up recently is what branches are valid targets for
>>>> performance improvements. Should they only go into trunk? This has come up
>>>> in the context of BTI improvements, Dmitry’s work on reducing object
>>>> overhead, and my work on CASSANDRA-15452.
>>>>
>>>> We currently have guidelines published:
>>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=199530302#Patching,versioning,andLTSreleases-Wheretoapplypatches.
>>>> But there’s no explicit discussion of how to handle performance
>>>> improvements. We tend to discuss whether they’re “bugfixes”.
>>>>
>>>> I’d like to discuss whether performance improvements should target more
>>>> than just trunk. I believe they should target every active branch because
>>>> performance is a major selling point of Cassandra. It’s not practical to
>>>> ask users to upgrade major versions for simple performance wins. A major
>>>> version can be deployed for years, especially when the next one has major
>>>> changes. But we shouldn’t target non-supported major versions, either.
>>>> Also, there will be exceptions: patches that are too large, invasive,
>>>> risky, or complicated to backport. For these, we rely on the contributor
>>>> and reviewer’s judgment and the mailing list. So, I’m proposing an
>>>> allowance to backport to active branches, not a requirement to merge them.
>>>>
>>>> I’m curious to hear your thoughts.
>>>> Jordan
>>>>
>>>>
>>
>> --
>> Dmitry Konstantinov
>>
>

-- 
Dmitry Konstantinov
