> My expectation is that in trunk SCM CASSANDRA_4 would change to SCM 
> CASSANDRA_5.

Assuming you upgrade from 4.0 to 5.0, then you are running on CASSANDRA_4… how 
many people know that they are expected to do something about that (Sam 
documented the steps earlier)?  What if you leave things alone and try to 
upgrade to 5.1/6.0… now what?

What about users who create a new 5.0 cluster… we still default to 
compatibility mode in this case, so a new 5.0 cluster is running with 
CASSANDRA_4…

> This is why I want to remove the coupling between SCM and messaging version.

Feels like we just had a similar conversation, Sam, with regard to TCM / Accord ;)

I don’t see the messaging version itself as the problem. The confusion comes 
from messaging and disk versions being intertwined (they use the same 
serializers)… If we are running with messaging version VERSION_40, why does it 
matter if we write to disk with VERSION_40 or VERSION_50?  If we want to be 
able to downgrade, we should block _50 and only use _40 for disk, but why 
should networking not be allowed to use _50?  What we write to disk impacts our 
ability to downgrade, and messaging can already downgrade its version if its 
peers don’t know the latest version.

In short, I agree with you, Sam: we should decouple… I think it makes sense for 
SCM to control the version we use for disk, but not networking… 
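
To make the split concrete, here is a rough sketch of what I mean. All of the 
names (Version, CompatMode, diskVersion, messagingVersion) are made up for 
illustration and are not the actual Cassandra classes or the real SCM 
implementation. The only assumption is that versions are totally ordered: the 
compatibility mode caps what we write to disk, while the messaging version is 
negotiated per peer and never looks at the mode.

```java
// Hypothetical names throughout; this is not Cassandra's actual API.
final class VersionDecouplingSketch {
    // Serialization versions, oldest first, so enum order is age order.
    enum Version { VERSION_40, VERSION_50, VERSION_51 }

    // Hypothetical compatibility modes: each one only caps the on-disk version.
    enum CompatMode {
        CASSANDRA_4(Version.VERSION_40),
        NONE(Version.VERSION_51);

        final Version maxDiskVersion;

        CompatMode(Version maxDiskVersion) {
            this.maxDiskVersion = maxDiskVersion;
        }
    }

    // Returns the older of two versions (enums compare by declaration order).
    static Version min(Version a, Version b) {
        return a.compareTo(b) <= 0 ? a : b;
    }

    // Disk: capped by the compatibility mode, so a downgraded node can still
    // read what was written.
    static Version diskVersion(Version localMax, CompatMode mode) {
        return min(localMax, mode.maxDiskVersion);
    }

    // Messaging: negotiated with each peer, independent of the mode.
    static Version messagingVersion(Version localMax, Version peerMax) {
        return min(localMax, peerMax);
    }

    public static void main(String[] args) {
        Version local = Version.VERSION_51;
        System.out.println(diskVersion(local, CompatMode.CASSANDRA_4));
        System.out.println(messagingVersion(local, Version.VERSION_51));
        System.out.println(messagingVersion(local, Version.VERSION_40));
    }
}
```

With the mode set to CASSANDRA_4, this node keeps writing VERSION_40 to disk 
but still talks VERSION_51 to peers that understand it, and falls back to 
VERSION_40 only for peers that don't.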

> On Dec 12, 2024, at 8:46 AM, Sam Tunnicliffe <s...@beobal.com> wrote:
> 
> No, we initially tried to preserve all the previous paths and put the whole 
> thing behind a feature flag, but it was just way too pervasive and doing so 
> would've added years to the project. So for the period before the CMS is 
> initialized, certain operations are not available. 
> 
> However, it should be entirely possible to downgrade and rollback to 5.0 
> after cutting over to TCM, as long as SSTables are still in the old format. 
> By "should be" I mean it is absolutely possible and has been tested, but it 
> requires the SCM to guard the on disk format, which has the unfortunate 
> effect of limiting the messaging version, and that in turn makes it impossible 
> to actually cut over to TCM. i.e. the testing has been done with a patch 
> which disables some things which rely on messaging VERSION_51. This is why I 
> want to remove the coupling between SCM and messaging version.
> 
> Also, I misspoke slightly in my previous email because I forgot that we did 
> manage to enable a decent subset of TCM to work with 
> VERSION_40/VERSION_50. In this scenario, you still get the linearized schema 
> updates via the metadata log but replicas/coordinators don't exchange epochs 
> during reads/writes so the consistency guarantees are weakened.
> 
> Thanks,
> Sam
> 
> 
>> On 12 Dec 2024, at 16:17, Jeremiah Jordan <jeremiah.jor...@gmail.com> wrote:
>> 
>> My expectation is that in trunk SCM CASSANDRA_4 would change to SCM 
>> CASSANDRA_5.  I think we should be striving to support full 
>> downgrade/rollback ability to the previous major version from trunk.
>> With TCM I would expect that when running in CASSANDRA_5 mode that 
>> initializing TCM would not be possible, as once initialized you could no 
>> longer roll back.
>> Do we have no way to support the gossip paths continuing to work prior to 
>> initializing TCM?
>> 
>> -Jeremiah
>> 
>> On Dec 11, 2024 at 7:41:48 AM, Sam Tunnicliffe <s...@beobal.com> wrote:
>>> My point is that the upgrade to 5.1/6.0 isn't really complete until the CMS 
>>> is initialised and this can't be done while running with SCM CASSANDRA_4 
>>> because of the messaging service limitation. Until that point, schema 
>>> changes & node replacements are not supported which affects how long a bake 
>>> time is tolerable. 
>>> This specific issue could probably be fixed by revisiting the SCM 
>>> implementation in 5.1/6.0, so we should certainly do that but the fact 
>>> remains that we don't have great test coverage to indicate how clusters 
>>> behave when running in SCM for a prolonged period.  
>>> 
>>> Thanks, 
>>> Sam
>>> 
>>>> On 11 Dec 2024, at 13:29, Brandon Williams <dri...@gmail.com> wrote:
>>>> 
>>>> On Wed, Dec 11, 2024 at 7:22 AM Sam Tunnicliffe <s...@beobal.com> wrote:
>>>>> 
>>>>> so running in any SCM mode for a prolonged period is not really viable.
>>>> 
>>>> This is what many users want to do though, upgrade one DC and let it
>>>> bake to see how it goes before continuing.  I don't think that's
>>>> unreasonable, but from working on CASSANDRA-20118 I know how difficult
>>>> that is already.  I don't think we've built enough SCM muscle yet to
>>>> think about handling multiple previous versions.
>>>> 
>>>> Kind Regards,
>>>> Brandon
>>> 
> 
