Hi all,

I am one of the contributors for the recent perf changes, like:
https://issues.apache.org/jira/browse/CASSANDRA-20165
https://issues.apache.org/jira/browse/CASSANDRA-20226
https://issues.apache.org/jira/browse/CASSANDRA-19557
...

My motivation: I am currently using 4.1.x and planning to adopt 5.0.x in the next quarter. Of course, I want to have it in the best possible shape from a performance point of view; performance is one of the important selling points for upgrades. In general, performance is one of the key reasons why people select NoSQL, and Cassandra in particular, so any improvement here should be appreciated by users, especially in the current cloud-oriented world where every such improvement is a potential cost saving.

For me the question is tightly related to release scheduling. We have periodic and quite frequent patch releases now; thank you a lot to the people who spend their time on this. When we speak about minor releases, the release process looks much slower and less predictable: it can be a year or even more before I can get a minor release which includes a change, and nobody can give even a preliminary date for it. As a result, when I have a performance patch and it is suggested to merge it only to trunk, I will not be able to use the improvement for a long time. So, I have 2 options in this case:
1) relax and wait (potentially losing interest due to delayed feedback)
2) keep my own private fork to accumulate such changes, with the corresponding overhead (which is what I actually do now)

As a guy who supports Cassandra in production for systems with 99.999% availability requirements, of course I care about stability too, but I think we need some balance here, and we should rely more on things like test coverage and different policies for different branches so that we do not stagnate due to fear of any change. I am not talking about massive breaking changes, especially those which modify (even in a compatible way) network communication protocols or disk data formats; those should get a separate individual discussion.

The situation reminds me of the story of JDK prior to Java 9.
There were also some big-bang minor releases (1.5/1.6/1.7/1.8) which we waited a very long time for, and Java was evolving very slowly. Now we have a model where a new release is available every half year and some of them are supported long term. So, the people who prefer stability select and use LTS versions, the people who want access to new features/improvements can take the latest release, and all are happy. Similar models with stable/latest releases exist for other products.

So, my suggestion is one of the following options:
1) Classify the current release branches as more and less stable, like:
-- 4.0.x/4.1.x - avoid perf changes unless it is really bug-like
-- 5.0.x - more relaxed rules
2) Do something similar to JDK with LTS versions: make minor releases for the latest major version (like 5.1/5.2) more frequent and predictable, like a release train; do not create a fix branch for every one; periodically, for some selected minor versions, establish fix branches and release patch versions for them.

Thank you,
Dmitry

On Wed, 22 Jan 2025 at 09:02, Jeff Jirsa <jji...@gmail.com> wrote:

> I think the status quo is fine - perf goes to trunk, if you think
> something is special, it goes to the mailing list to justify exceptions
>
>
> On Jan 22, 2025, at 3:36 AM, Jordan West <jw...@apache.org> wrote:
>
>
> Thanks for the initial feedback. I hear a couple different themes / POVs.
>
> David/Paulo, it sounds like maybe a guide for perf backports + mailing
> list consensus when necessary + clear documentation of this could be a way
> forward. I agree that each change comes with stability risks but at the
> same time the greatest stability risk with Cassandra historically has been
> major version upgrades (although we have made great improvements here). For
> folks who want only the performance improvements, we are asking them to
> take greater risk by upgrading a major version or to maintain a fork.
> The fork is reasonable for some of the larger operators but not others. That
> said, I do agree we need to use judgement. Not all changes are worth
> backporting and some may incur too much risk. We could also add to the
> guide suggestions of how to de-risk a change (e.g. code is isolated, config
> to turn it off / off by default, etc).
>
> Jeff, I agree 1% wins aren't worth it if they are invasive and in risky
> areas. Not all of the improvements are that minor.
>
> Jordan
>
> On Tue, Jan 21, 2025 at 1:57 PM Jeff Jirsa <jji...@gmail.com> wrote:
>
>> We expect users to treat patch and minor releases as low risk. Changing
>> something deep in the storage engine to be 1% faster is not worth the risk,
>> because most users will skip the type of qualification that finds those one
>> in a billion regressions.
>>
>> Patch releases are for bug fixes not perf improvements.
>>
>>
>> On Jan 21, 2025, at 9:10 PM, Jordan West <jw...@apache.org> wrote:
>>
>>
>> Hi folks,
>>
>> A topic that’s come up recently is what branches are valid targets for
>> performance improvements. Should they only go into trunk? This has come up
>> in the context of BTI improvements, Dmitry’s work on reducing object
>> overhead, and my work on CASSANDRA-15452.
>>
>> We currently have guidelines published:
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=199530302#Patching,versioning,andLTSreleases-Wheretoapplypatches.
>> But there’s no explicit discussion of how to handle performance
>> improvements. We tend to discuss whether they’re “bugfixes”.
>>
>> I’d like to discuss whether performance improvements should target more
>> than just trunk. I believe they should target every active branch because
>> performance is a major selling point of Cassandra. It’s not practical to
>> ask users to upgrade major versions for simple performance wins. A major
>> version can be deployed for years, especially when the next one has major
>> changes.
>> But we shouldn’t target non-supported major versions, either.
>> Also, there will be exceptions: patches that are too large, invasive,
>> risky, or complicated to backport. For these, we rely on the contributor
>> and reviewer’s judgment and the mailing list. So, I’m proposing an
>> allowance to backport to active branches, not a requirement to merge them.
>>
>> I’m curious to hear your thoughts.
>> Jordan
>>
>>

-- 
Dmitry Konstantinov