Hi all,

I am one of the contributors to the recent perf changes, such as:
https://issues.apache.org/jira/browse/CASSANDRA-20165
https://issues.apache.org/jira/browse/CASSANDRA-20226
https://issues.apache.org/jira/browse/CASSANDRA-19557
...

My motivation: I am currently using 4.1.x and planning to adopt 5.0.x in
the next quarter. Of course, I want to have it in the best possible shape
from a performance point of view; performance is one of the important
selling points for upgrades. In general, performance is one of the key
reasons why people select NoSQL, and Cassandra in particular, so any
improvement here should be appreciated by users, especially in the current
cloud-oriented world where every such improvement is a potential cost saving.

For me the question is tightly related to release scheduling. We have
periodic and quite frequent patch releases now; many thanks to the
people who spend their time on them. Minor releases, however, look much
slower and less predictable: it can be a year or even more before I get
a minor release which includes a change, and nobody can give even a
preliminary date for it. As a result, when I have a performance patch and
it is suggested to merge it only to trunk, I will not be able to use the
improvement for a long time.
So, I have 2 options in this case:
1) relax and wait (potentially losing interest due to delayed feedback)
2) keep my own private fork to accumulate such changes, with the
corresponding overhead (which is what I actually do now)

As a guy who supports Cassandra in production for systems with 99.999%
availability requirements, of course I care about stability too, but
I think we need some balance here: we should rely more on things like
test coverage and different policies for different branches, so that we do
not stagnate out of fear of any change. I am not talking about massive
breaking changes, especially those which modify (even in a compatible way)
network communication protocols or disk data formats; those deserve a
separate, individual discussion.

The situation reminds me of the story of the JDK prior to Java 9. There
were also some big-bang minor releases (1.5/1.6/1.7/1.8) which we waited a
very long time for, and Java was evolving very slowly. Now we have a model
where a new release is available every half year and some of them are
supported long term. So, the people who prefer stability select and use LTS
versions, the people who want access to new features/improvements can take
the latest release, and everyone is happy. Similar stable/latest release
models exist for other products.

So, my suggestion is one of the following options:
1) Classify the current release branches as more and less stable, like:
-- 4.0.x/4.1.x - avoid perf changes unless the change is really bug-like
-- 5.0.x - more relaxed rules

2) Do something similar to the JDK with LTS versions: make minor releases
for the latest major version (like 5.1/5.2) more frequent and predictable,
like a release train. Do not create a fix branch for every one; instead,
periodically establish fix branches for some selected minor versions and
release patch versions for them.

Thank you,
Dmitry


On Wed, 22 Jan 2025 at 09:02, Jeff Jirsa <jji...@gmail.com> wrote:

> I think the status quo is fine - perf goes to trunk, if you think
> something is special, it goes to the mailing list to justify exceptions
>
>
> On Jan 22, 2025, at 3:36 AM, Jordan West <jw...@apache.org> wrote:
>
> 
> Thanks for the initial feedback. I hear a couple different themes / POVs.
>
> David/Paulo, it sounds like maybe a guide for perf backports + mailing
> list consensus when necessary + clear documentation of this could be a way
> forward. I agree that each change comes with stability risks but at the
> same time the greatest stability risk with Cassandra historically has been
> major version upgrades (although we have made great improvements here). For
> folks who want only the performance improvements, we are asking them to
> take greater risk by upgrading a major version or to maintain a fork. The
> fork is reasonable for some of the larger operators but not others. That
> said, I do agree we need to use judgement. Not all changes are worth
> backporting and some may incur too much risk. We could also add to the
> guide suggestions of how to de-risk a change (e.g. code is isolated, config
> to turn it off / off by default, etc).
>
> Jeff, I agree 1% wins aren't worth it if they are invasive and in risky
> areas. Not all of the improvements are that minor.
>
> Jordan
>
> On Tue, Jan 21, 2025 at 1:57 PM Jeff Jirsa <jji...@gmail.com> wrote:
>
>> We expect users to treat patch and minor releases as low risk. Changing
>> something deep in the storage engine to be 1% faster is not worth the risk,
>> because most users will skip the type of qualification that finds those one
>> in a billion regressions.
>>
>> Patch releases are for bug fixes not perf improvements.
>>
>>
>> On Jan 21, 2025, at 9:10 PM, Jordan West <jw...@apache.org> wrote:
>>
>> 
>> Hi folks,
>>
>> A topic that’s come up recently is what branches are valid targets for
>> performance improvements. Should they only go into trunk? This has come up
>> in the context of BTI improvements, Dmitry’s work on reducing object
>> overhead, and my work on CASSANDRA-15452.
>>
>> We currently have guidelines published:
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=199530302#Patching,versioning,andLTSreleases-Wheretoapplypatches.
>> But there’s no explicit discussion of how to handle performance
>> improvements. We tend to discuss whether they’re “bugfixes”.
>>
>> I’d like to discuss whether performance improvements should target more
>> than just trunk. I believe they should target every active branch because
>> performance is a major selling point of Cassandra. It’s not practical to
>> ask users to upgrade major versions for simple performance wins. A major
>> version can be deployed for years, especially when the next one has major
>> changes. But we shouldn’t target non-supported major versions, either.
>> Also, there will be exceptions: patches that are too large, invasive,
>> risky, or complicated to backport. For these, we rely on the contributor
>> and reviewer’s judgment and the mailing list. So, I’m proposing an
>> allowance to backport to active branches, not a requirement to merge them.
>>
>> I’m curious to hear your thoughts.
>> Jordan
>>
>>

-- 
Dmitry Konstantinov
