Re: [DISCUSS] How we handle JDK support

Josh McKenzie Wed, 21 May 2025 15:15:33 -0700

Great context - thanks for that insight.

Operators running the older supported versions of C* will retain the *option* 
to run the older JDK; however, if they want to upgrade their JDK version and C* 
version *separately* under the above paradigm, they'd need to rev the JDK on 
their clusters first, before running the C* version upgrade.
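
A minimal sketch of that ordering, assuming the support matrix from the 
proposal quoted below (N-2: JDK, JDK-1, JDK-2; N-1: JDK, JDK-1; N: JDK). The 
concrete JDK numbers, the SUPPORTED_JDKS table and the upgrade_steps helper are 
hypothetical and exist only for illustration; nothing like this lives in the 
Cassandra tree:

```python
# Hypothetical model of the proposed support matrix. The JDK numbers are
# placeholders standing in for "JDK-2", "JDK-1" and "JDK".
SUPPORTED_JDKS = {
    "N-2": {11, 17, 21},
    "N-1": {17, 21},
    "N": {21},
}


def upgrade_steps(from_branch, to_branch, current_jdk):
    """Return the ordered steps to move a cluster between two C* branches."""
    if current_jdk not in SUPPORTED_JDKS[from_branch]:
        raise ValueError(f"{from_branch} does not support JDK {current_jdk}")

    # JDKs on which both the old and the new C* version can run.
    shared = SUPPORTED_JDKS[from_branch] & SUPPORTED_JDKS[to_branch]
    if not shared:
        raise ValueError(f"no JDK is supported by both {from_branch} and {to_branch}")

    steps = []
    if current_jdk not in shared:
        # Rev the JDK first, while still on the old C* version.
        target_jdk = min(shared)  # smallest JDK jump that still works
        steps.append(f"upgrade JDK {current_jdk} -> {target_jdk} on {from_branch}")
    steps.append(f"upgrade C* {from_branch} -> {to_branch}")
    return steps


if __name__ == "__main__":
    for step in upgrade_steps("N-2", "N", current_jdk=11):
        print(step)
```

An operator already on a JDK shared by both branches gets a single C* upgrade 
step; anyone on an older JDK gets the JDK bump scheduled first.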


The need to bump deps for JDK support is very real and that does concern me - 
really great point. Bumping dependencies in an older C* version because a newer 
JDK you're not interested in using needed to be supported would not be a 
positive experience for a user; you're effectively taking on risk for no new 
functionality. *In theory*, we could have conditional dependency inclusion 
based on which version of the JDK you're building Cassandra for. A cursory 
look at both gradle and ant shows it's possible, if a bit cleaner and simpler 
in the former than the latter.

My recollection of JDK17 to JDK21 was fewer than 5 dependencies we needed to 
bump, so maintaining a per-JDK list of conditional dependency versions wouldn't 
be an overwhelming burden - at least from that one example. Do you recall how 
many dependencies needed to be bumped on the JDK11 to JDK17 transition, 
Ekaterina?
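
To make the "per-JDK list of conditional dependency versions" idea concrete, 
here is a rough sketch of the bookkeeping only. It is not how ant or gradle 
would express it, and the library names, version numbers, and the assumption 
that overrides accumulate as the target JDK rises are all made up for 
illustration:

```python
# Baseline dependency versions for a branch (names and versions are invented).
BASE_DEPS = {
    "lib-a": "1.2.0",
    "lib-b": "3.4.1",
    "lib-c": "0.9.7",
}

# Overrides that only apply when building for the given JDK or newer.
JDK_OVERRIDES = {
    17: {"lib-a": "1.3.0"},
    21: {"lib-b": "3.5.0"},
}


def resolve_deps(target_jdk):
    """Return the dependency versions to use when building for target_jdk."""
    deps = dict(BASE_DEPS)
    # Apply overrides in ascending JDK order so newer ones win on conflict.
    for jdk in sorted(JDK_OVERRIDES):
        if jdk <= target_jdk:
            deps.update(JDK_OVERRIDES[jdk])
    return deps


if __name__ == "__main__":
    for jdk in (11, 17, 21):
        print(jdk, resolve_deps(jdk))
```

A build for the oldest JDK resolves to the untouched baseline, so a user who 
isn't interested in the newer JDK takes on no dependency changes.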

And your point about our lack of performance testing, and JDK changes 
translating into perf changes, is also one that resonates strongly with me. 
There's a straightforward fix there too. :D

For the rare edge case where we have to stop supporting something entirely 
because it's incompatible with a JDK release (has this happened more than the 1 
time?) - I think a reasonable fallback is to just not backport new JDK support 
and consider carrying forward the older JDK support until the release w/the 
feature in it is EoL'ed. That'd allow us to continue to run in-jvm upgrade 
dtests between the versions on the older JDK.

Also - we can't up the language level on older branches w/newer JDK support, 
which I hand-waved at in my wishlist. They'd obviously not build on older JDKs 
if we did that, and that'd force the ecosystem that relies on cassandra-all to 
update at that time as well, which wouldn't be pretty.

On Wed, May 21, 2025, at 3:27 PM, Ekaterina Dimitrova wrote:
> Benedict, I am not sure what you mean by optional feature. FWIW we cannot 
> compile cassandra-4.1 until we removed the feature in cassandra-5.0. As a 
> user, I would be very disappointed to see a feature removed in a patch release. 
> 
> Yes, replacing nashorn was the unpleasant part. I did not raise the nashorn 
> part as if removing the scripted UDFs was a hard technical task, but more to 
> flag we wouldn’t want to make such breaking changes in patch releases.
> 
> “We may well hit similar issues in future, some perhaps even harder to 
> surmount, but I’m sure we can address them as they come on a case by case 
> basis. Worst case we have to postpone the migration by one major for any 
> deprecation to take effect.”
> 
> Agreed, though the lack of performance testing still stands for me. 
> 
> I just got reminded - there was also some time format issue with JDK11 that 
> Scott mentioned before, if I remember correctly.
> 
> So yeah, these are the type of things we may have in front of us. Also, I 
> can’t wait to find a replacement for jamm so we don’t have to think of it 
> anymore. 
> 
> On Wed, 21 May 2025 at 15:17, Benedict <[email protected]> wrote:
>> 
>> Yes the issue of Nashorn did spring to mind, but as I recall this was an 
>> optional feature. I don’t remember how hard it would have been to simply 
>> declare the feature unavailable if you use the newer JDK, but my vague 
>> recollection is the hard part was primarily finding a suitable replacement.
>> 
>> We may well hit similar issues in future, some perhaps even harder to 
>> surmount, but I’m sure we can address them as they come on a case by case 
>> basis. Worst case we have to postpone the migration by one major for any 
>> deprecation to take effect.
>> 
>> 
>>> On 21 May 2025, at 19:57, Ekaterina Dimitrova <[email protected]> wrote:
>>> 
>>> “I'm curious what this raises for you.”
>>> 
>>> A few points that come to mind:
>>> 
>>> - every time we switch/add JDKs we also need to do a bunch of changes in CI 
>>> systems, ccm, etc, not only C* - so more work to call out. Also, if we make 
>>> older versions support newer JDK, I guess we need to ensure drivers, etc 
>>> will support it too probably? Are we discussing JDK support here only for 
>>> Cassandra repo?
>>> - very often we need to bump library versions to support newer JDK versions 
>>> but at the same time we try not to upgrade dependencies in patch release; 
>>> only if it is bug related, in most cases
>>> - whether it is a lot of work or not to backport, I’d say it depends. My 
>>> assumption is that if we keep our maintenance regularly going (which we 
>>> missed with the long development cycle of 4.0) - it is more feasible. 
>>> Though we know that we removed a whole feature to move to JDK17 quicker - 
>>> the scripted UDFs. If we have similar needs at any time - we can’t do such 
>>> breaking changes in a patch release.
>>> - Benedict made a great point on performance changes with JDK upgrades - we 
>>> do not have regular performance testing so probably introducing a new JDK 
>>> in a patch version will come with a huge warning - test thoroughly and move 
>>> to prod at your own judgement or something like that. 
>>> 
>>> I guess there are more things to consider but these are immediate things 
>>> that come to my mind now.
>>> 
>>> Best regards,
>>> Ekaterina
>>> 
>>> On Wed, 21 May 2025 at 10:31, Josh McKenzie <[email protected]> wrote:
>>>> Lessons learned from advancing JDK support on trunk *should* translate 
>>>> into older branches making that effort much smaller; Ekaterina you have a 
>>>> lot of experience here so I'm curious what this raises for you. I like the 
>>>> productivity implications of us being able to adopt new language features 
>>>> faster on trunk; I think this is a solid evolution of the idea, definitely.
>>>> 
>>>> Distilling to bulleted lists to try and snapshot the state of the thread 
>>>> w/the above proposal:
>>>> 
>>>> *[New LTS JDK Adoption]*
>>>>  • Trunk supports 1 JDK at a time
>>>>  • That JDK will be the GA LTS the day we cut a frozen branch for a new 
>>>> major (i.e. from moment of previous release bifurcation, trunk snapshots 
>>>> the JDK at that moment). Obviously there will be some flexibility here in 
>>>> terms of when the work lands on trunk and supporting on other branches, 
>>>> but the general pattern / intent holds - push to snapshot latest GA LTS JDK 
>>>> on trunk ASAP after branching for a major.
>>>>  • Trunk targets the language level of that JDK
>>>>  • CI on trunk is that single JDK only
>>>>  • We merge new JDK LTS support to all supported branches at the same time 
>>>> as trunk
>>>>  • We up the supported language level for all supported branches to the 
>>>> latest supported JDK at this time
>>>>  • We don't need to worry about dropping JDK support as that will happen 
>>>> naturally w/the dropping of support for a branch. Branches will slowly 
>>>> gain JDK support w/each subsequent trunk-based LTS integration.
>>>> *[Branch JDK Support]*
>>>>  • N-2: JDK, JDK-1, JDK-2
>>>>  • N-1: JDK, JDK-1
>>>>  • N: JDK
>>>> *[CI, JDK's, Upgrades]*
>>>>  • CI:
>>>>    • For each branch we run per-commit CI for the latest JDK they support
>>>>    • Periodically we run all CI pipelines for older JDK's per-branch 
>>>> (cadence TBD)
>>>>  • Upgrades
>>>>    • N-2 -> N-1: tested on JDK and JDK-1
>>>>    • N-2 -> N: tested on JDK
>>>>    • N-1 -> N: tested on JDK
>>>> That'd give us 4 upgrade paths we'd need to support and test which feels 
>>>> like it's in the territory of "doable on each commit" if we limit the 
>>>> upgrade tests to the in-jvm variety and let the periodic run capture the 
>>>> python upgrade tests space.
>>>> 
>>>> On Wed, May 21, 2025, at 9:30 AM, Benedict wrote:
>>>>> 
>>>>> Perhaps we should consider back porting support for newer Java LTS 
>>>>> releases to older C* versions, and suggesting users upgrade JDK first. 
>>>>> This way we can have trunk always on the latest LTS, advancing language 
>>>>> feature support more quickly. 
>>>>> 
>>>>> That is, we would have something like 
>>>>> 
>>>>> N-2: JDK, JDK-1, JDK-2
>>>>> N-1: JDK, JDK-1
>>>>> N: JDK
>>>>> 
>>>>> I think to assist those deploying trunk and reduce churn for development, 
>>>>> we might only want to advance the LTS version for trunk after we release 
>>>>> a new major, fixing the next release’s Java version at that point.
>>>>> 
>>>>>> On 21 May 2025, at 13:57, Josh McKenzie <[email protected]> wrote:
>>>>>> 
>>>>>>> You don’t have to run every suite on every commit since as folks have 
>>>>>>> pointed out for the most part the JVM isn’t the culprit. Need to run it 
>>>>>>> enough times to catch when it is for some assumption of “enough”. 
>>>>>> So riffing on this. We could move to something like:
>>>>>>  • For each given supported C* branch, confirm it **builds** on all 
>>>>>> supported JDKs (pre-commit verification, post-commit reactive runs)
>>>>>>  • Constrain language level on any given C* branch to **lowest supported 
>>>>>> JDK**
>>>>>>  • Run all reactive post-commit CI pipelines against **the highest 
>>>>>> supported JDK only**
>>>>>>  • Once per N (day, week, month?), run all pipelines against all supported 
>>>>>> JDKs on all branches
>>>>>>    • Augment notification mechanisms so it squawks to dev list and slack 
>>>>>> on failure of non-highest JDK pipelines
>>>>>> That approach would tweak our balance towards our perception of the 
>>>>>> infrequency of per-JDK failures while allowing us to "scale up" the 
>>>>>> matrix of tests that we perform.
>>>>>> 
>>>>>> i.e. once a week we could have a heavy 9x run (3 branches, 3 JDK's) 
>>>>>> which we could then plan around and space out in terms of resource 
>>>>>> allocation, but otherwise we run a single set of pipelines per branch 
>>>>>> post-commit.
>>>>>> 
>>>>>> That'd give us the confidence to say "we tested the upgrade path we're 
>>>>>> recommending for you" without having to pay the tax of doing it on every 
>>>>>> commit or allowing potential defects to pile up to a once-a-year 
>>>>>> JDK-specific bug-bash.
>>>>>> 
>>>>>> In terms of JDK support when bumping (mapping of relative C* version and 
>>>>>> relative JDK version):
>>>>>>  • N-2: JDK-2, JDK-3, JDK-4
>>>>>>  • N-1: JDK-1, JDK-2, JDK-3 
>>>>>>  • N: JDK, JDK-1, JDK-2
>>>>>> So we'd have 3 supported LTS per branch, be able to adhere to "you can 
>>>>>> upgrade from N-2 to N using the same JDK", and allow us to balance our 
>>>>>> CI coverage to our expected surfacing of defects.
>>>>>> 
>>>>>> Then if we rev JDK we support on any given N+1, we end up with (keeping 
>>>>>> with N above as reference):
>>>>>>  • N-1: JDK-1, JDK-2, JDK-3
>>>>>>  • N: JDK, JDK-1, JDK-2
>>>>>>  • N+1: JDK+1, JDK, JDK-1
>>>>>> So shared JDK across all 3 on that rev is JDK-1.
>>>>>> 
>>>>>> I think 3 LTS per branch gives us the ability to both add / drop a JDK 
>>>>>> per major and test / provide for upgrades from N-2 to N w/out requiring 
>>>>>> a new JDK cert too.
>>>>>> 
>>>>>> On Wed, May 21, 2025, at 3:27 AM, Mick Semb Wever wrote:
>>>>>>>   
>>>>>>>>> So yeah. I think we'll need to figure out how much coverage is 
>>>>>>>>> reasonable to call something "tested". I don't think it's sustainable 
>>>>>>>>> for us to have, at any given time, 3 branches we test across 3 JDK's 
>>>>>>>>> each with all our in-jvm test suites is it?
>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Correct.
>>>>>>> For non-upgrade tests, where testing against more than one jdk exists, 
>>>>>>> we should start the conversation of the value of running more than one 
>>>>>>> JDK for all tests per-commit CI, before we go adding a third.
>>>>>>> 
>>>>>>> I'm not against weekly/fortnightly CI runs, just that it deserves the 
>>>>>>> discussion of cost (it's not necessarily cheaper due to saturation, nor 
>>>>>>> are we a team that has assigned build barons).  The actual change is 
>>>>>>> relatively easy, just adding a profile and a jdk element here: 
>>>>>>> https://github.com/apache/cassandra/blob/trunk/.jenkins/Jenkinsfile#L126-L135
>>>>>>>  
>>>>>> 
>>>>
