My thinking is most closely aligned with Blake and Benedict’s views here.
For the specific refactor in question, I support adoption of the language feature for new code or to cut existing code over to the new syntax as changes are made to the respective areas of the codebase. But I don’t support a sweeping project-wide refactor on trunk in this case.
Here is my thinking:
- If there are 2000 target sites for the refactor, that means this is going to be a 5000+ line diff. - The safety improvement here is marginal but nonzero. - If we have a 5000 line refactor, it should accomplish a significant and compelling purpose in the project. - Any execution of the refactor will require manual review of each of those 2000 refactor sites on the part of the implementer and two reviewers. - Since the check is compile-time, we’d learn that by the initial refactor the first time it’s compiled, and we short-circuit to having gained 100% of the value by being able to fix the broken callsites. - The act of that per-call site review would inform us as to whether we had incorrect casts; and we would immediately achieve the value of the “safer” approach by having identified the bugs. - 2x reviewer coverage for a 5000 line patch set is a significant commitment of reviewer resources. These reviewer resources have significant opportunity cost and can put to a better purpose. - Blake/others mention that such refactors create conflicts when bug fixes are backported to previous releases, requiring refactors of those rebased patches to bring fixes to versions that predate the large refactor.
I think this is a good language feature. I think we should use it. I think it’d be completely reasonable to cut existing syntax over to it as we make changes to the respective subsystems.
But I wouldn’t do a big bang refactor in this case. The juice isn’t worth the squeeze for me. On May 9, 2025, at 11:33 AM, Blake Eggleston <bl...@ultrablake.com> wrote:
No one is treating the codebase like a house of cards that can’t be touched.
In this case I think the cost/risk of doing this change outweighs the potential benefits the project might see from it. Josh counts ~2000 instances where we’re casting objects so we’re talking about a not-insignificant change which may introduce it’s own bugs. Even if no new bugs are introduced, this will be an refactor annoyance for projects in development, but the real concern I have with any large change is how it complicates the process of fixing bugs across versions. On the other hand, I don’t think that incorrectly casting objects has historically been a source of pain for us, so it seems like the benefit would be small if any.
On Fri, May 9, 2025, at 10:38 AM, Jon Haddad wrote: Why not?
Personally, I hate the idea of treating a codebase (any codebase) like a house of cards that can't be touched. It never made sense to me to try to bundle new features / bug fixes with improvements to code quality.
Making the code more reliable should be a goal in itself, rather than a side effect of other work.
Jon
This seems like a cool feature that will be useful in future development work, but not something we should be proactively refactoring the project to make use of.
On Fri, May 9, 2025, at 10:18 AM, Vivekanand Koya wrote: I would say that https://openjdk.org/jeps/394 (instanceOf) aims to
provide safer and less poisoning in the code by default. Instead of
having a production server halt/impaired due to a RuntimeException, instead it is verified at compile time. If a new language feature makes code more robust and addresses a hazardous, historical design choice, I believe it's time has arrived. Curious to see what everyone thinks.
Thanks, Vivekanand K.
I would like to refactor the codebase (Trunk 5+) to eliminate unsafe explicit casting with instanceOf.
We have a rich history of broad sweeping refactors dying on the rocks of the community's aversion to instability and risk w/out a concrete outcome we're trying to achieve. :)
All of which is to say: do we have examples of instanceOf casting blowing things up for users that would warrant going through the codebase to tidy this up? Between src/java and test/unit and test/distributed we have around 2,000 occurrences of this pattern.
On Fri, May 9, 2025, at 10:14 AM, Vivekanand Koya wrote: Sounds great. I would like to refactor the codebase (Trunk 5+) to eliminate unsafe explicit casting with instanceOf.
Thanks, Vivekanand
Yep, that approach seems more than sufficient to me. No need for lots of ceremony, but good to keep everyone in the decision loop.
I think it doesn’t cost us much to briefly discuss new language features before using them.
I had that thought as well but on balance my intuition was there were enough new features that the volume of discussion to do that would be a poor cost/benefit compared to the "lazy consensus, revert" approach.
So if I actually do the work required to have an opinion ;):
JDK21: JDK17: JDK16: JDK15: JDK14: JDK11:
Assuming we just lazily evaluate and deal with new features as people actually care about them and seeing them add value, a simple "[DISCUSS] I'm thinking about using new language feature X; any objection?" lazy consensus that we then dumped onto a wiki article / code style page as "stuff we're good to use" would probably be fine?
On Fri, May 9, 2025, at 7:58 AM, Benedict wrote:
I think it doesn’t cost us much to briefly discuss new language features before using them. Lambdas, Streams and var all have problems - and even with the guidance we publish some are still misused.
The flow scoping improvement to instanceof seems obviously good though. For new feature work on trunk, targeting the highest supported language level featureset (jdk17 right now, jdk21 within the next couple of weeks) makes sense to me. For bugfixing, targeting the oldest supported GA branch and the highest language level that works there would allow maximum flexibility with minimal re-implementation.
If anyone has any misgivings with certain features (i.e. the debate around usage of "var") they can bring it up on the dev ML and we can adjust, but otherwise I'd prefer to see us have more modern evolving options on how contributors engage rather than less.
On Fri, May 9, 2025, at 1:56 AM, Vivekanand Koya wrote: Hello,
I want to understand the community's thoughts on using newer features (post JDK11) in upcoming releases in Cassandra. An example is flow scoping instead of explicitly casting types with instanceOf: https://openjdk.org/jeps/395. I want your thoughts on JDK requirements for the main Cassandra repository, Accord, and Sidecar.
Much appreciated. Thanks, Vivekanand K.
|