Re: Downgradability

Jeremiah D Jordan Wed, 22 Feb 2023 13:22:51 -0800

We have multiple tickets about to merge that introduce new on-disk format 
changes.  I see no reason to block those indefinitely while we figure out how 
to do the on-disk format downgrade stuff.


-Jeremiah

> On Feb 22, 2023, at 3:12 PM, Benedict <bened...@apache.org> wrote:
> 
> Ok I will be honest, I was fairly sure we hadn’t yet broken downgrade - but I 
> was wrong. CASSANDRA-18061 introduced a new column to a system table, which 
> is a breaking change. 
> 
> But that’s it, as far as I can tell. I have run a downgrade test successfully 
> after reverting that ticket, using the one line patch below. This makes every 
> in-jvm upgrade test also a downgrade test. I’m sure somebody more familiar 
> with dtests can readily do the same there.
> 
> While we look to fix 18061 and enable downgrade tests (and get a clean run of 
> the full suite), can we all agree not to introduce new breaking changes?
> 
> 
> index e41444fe52..085b25f8af 100644
> --- a/test/distributed/org/apache/cassandra/distributed/upgrade/UpgradeTestBase.java
> +++ b/test/distributed/org/apache/cassandra/distributed/upgrade/UpgradeTestBase.java
> @@ -104,6 +104,7 @@ public class UpgradeTestBase extends DistributedTestBase
>                  .addEdge(v40, v41)
>                  .addEdge(v40, v42)
>                  .addEdge(v41, v42)
> +                .addEdge(v42, v41)
>                  .build();
> 
> 
>> On 22 Feb 2023, at 15:08, Jeff Jirsa <jji...@gmail.com> wrote:
>> 
>> When people are serious about this requirement, they’ll build the downgrade 
>> equivalents of the upgrade tests and run them automatically, often, so 
>> people understand what the real gap is and when something new makes it break.
>> 
>> Until those tests exist, I think collectively we should all stop pretending 
>> like this is dogma. Best effort is best effort. 
>> 
>> 
>> 
>>> On Feb 22, 2023, at 6:57 AM, Branimir Lambov <branimir.lam...@datastax.com> 
>>> wrote:
>>> 
>>> 
>>> > 1. Major SSTable changes should begin with forward-compatibility in a 
>>> > prior release.
>>> 
>>> This requires "feature" changes, i.e. new non-trivial code for previous 
>>> patch releases. It also entails porting over any further format 
>>> modification.
>>> 
>>> Instead of this, in combination with your second point, why not implement 
>>> backwards write compatibility? The opt-in is then clearer to define (i.e. 
>>> upgrades start with e.g. a "4.1-compatible" settings set that includes file 
>>> format compatibility and disabling of new features, new nodes start with 
>>> "current" settings set). When the upgrade completes and the user is happy 
>>> with the result, the settings set can be replaced.
>>> 
>>> Doesn't this achieve what you want (and we all agree is a worthy goal) with 
>>> much less effort for everyone? Supporting backwards-compatible writing is 
>>> trivial, and we even have a proof-of-concept in the stats metadata 
>>> serializer. It also simplifies by a serious margin the amount of work and 
>>> thinking one has to do when a format improvement is implemented -- e.g. the 
>>> TTL patch can just address this in exactly the way the problem was 
>>> addressed in earlier versions of the format, by capping to 2038, without 
>>> any need to specify, obey or test any configuration flags.
>>> 
>>> >> It’s a commitment, and it requires every contributor to consider it as 
>>> >> part of work they produce.
>>> 
>>> > But it shouldn't be a burden. Ability to downgrade is a testable problem, 
>>> > so I see this work as a function of the suite of tests the project is 
>>> > willing to agree on supporting.
>>> 
>>> I fully agree with this sentiment, and I feel that the current "try to not 
>>> introduce breaking changes" approach is adding the burden, but not the 
>>> benefits -- because the latter cannot be proven, and are most likely 
>>> already broken.
>>> 
>>> Regards,
>>> Branimir
>>> 
>>> On Wed, Feb 22, 2023 at 1:01 AM Abe Ratnofsky <a...@aber.io> wrote:
>>>> Some interesting existing work on this subject is "Understanding and 
>>>> Detecting Software Upgrade Failures in Distributed Systems" - 
>>>> https://dl.acm.org/doi/10.1145/3477132.3483577,
>>>> also summarized by Andrey Satarin here: 
>>>> https://asatarin.github.io/talks/2022-09-upgrade-failures-in-distributed-systems/
>>>> 
>>>> They specifically tested Cassandra upgrades, and have a solid list of 
>>>> defects that they found. They also describe their testing mechanism 
>>>> DUPTester, which includes a component that confirms that the leftover 
>>>> state from one version can start up on the next version. There is a wider 
>>>> scope of upgrade defects highlighted in the paper, beyond SSTable version 
>>>> support.
>>>> 
>>>> I believe the project would benefit from expanding our test suite 
>>>> similarly, by parametrizing more tests on upgrade version pairs.
>>>> 
>>>> Also, per Benedict's comment:
>>>> 
>>>> > It’s a commitment, and it requires every contributor to consider it as 
>>>> > part of work they produce.
>>>> 
>>>> But it shouldn't be a burden. Ability to downgrade is a testable problem, 
>>>> so I see this work as a function of the suite of tests the project is 
>>>> willing to agree on supporting.
>>>> 
>>>> Specifically - I agree with Scott's proposal to emulate the HDFS 
>>>> upgrade-then-finalize approach. I would also support automatic 
>>>> finalization based on a time threshold or similar, to balance the 
>>>> priorities of safe and straightforward upgrades. Users need to be aware of 
>>>> the range of SSTable formats supported by a given version, and how to 
>>>> handle when their SSTables wouldn't be supported by an upcoming upgrade.
>>>> 
>>>> --
>>>> Abe
>>> 
>>> 
>>> -- 
>>> Branimir Lambov
>>> e. branimir.lam...@datastax.com
>>> w. www.datastax.com
>>>
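
For anyone trying to picture the "settings set" idea in Branimir's mail quoted above, here is a rough sketch of what backwards-compatible writing could look like. Everything in it is hypothetical: the class, the SettingsSet enum, the format labels and the method names are made up for illustration and are not Cassandra code or configuration. The cap value is assumed to be the signed 32-bit seconds limit, which is where the 2038 boundary comes from.

```java
// Hypothetical sketch: none of these names exist in Cassandra.
class CompatibilityModeSketch {
    /** Illustrative settings sets: one for mid-upgrade clusters, one for finalized ones. */
    enum SettingsSet { COMPAT_4_1, CURRENT }

    /** Assumed legacy limit: largest expiration that fits in signed 32-bit seconds (January 2038). */
    static final long LEGACY_MAX_EXPIRATION_SECONDS = Integer.MAX_VALUE;

    /** Picks which on-disk format to write; the labels are placeholders, not real SSTable versions. */
    static String formatToWrite(SettingsSet settings) {
        return settings == SettingsSet.COMPAT_4_1 ? "previous-format" : "current-format";
    }

    /**
     * Computes the expiration time to persist. In compatibility mode the value is
     * capped so that the previous release can still read it; otherwise it is
     * written unchanged.
     */
    static long expirationToWrite(SettingsSet settings, long nowSeconds, long ttlSeconds) {
        long expiration = Math.addExact(nowSeconds, ttlSeconds);
        if (settings == SettingsSet.COMPAT_4_1) {
            return Math.min(expiration, LEGACY_MAX_EXPIRATION_SECONDS);
        }
        return expiration;
    }

    public static void main(String[] args) {
        long now = 1_677_100_000L;                  // roughly February 2023
        long twentyYears = 20L * 365 * 24 * 3600;   // TTL that crosses the 2038 boundary
        for (SettingsSet s : SettingsSet.values()) {
            System.out.println(s + " -> " + formatToWrite(s)
                               + ", expiration=" + expirationToWrite(s, now, twentyYears));
        }
    }
}
```

The point of the sketch is only that the compatibility decision lives in one place (the active settings set), so a format change does not need its own flag: while the cluster runs with the compatible set, writers fall back to the old behaviour, and switching the set after the upgrade is accepted enables the new format.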
