Re: [UPDATE] CEP-37

Jon Haddad Sun, 09 Mar 2025 14:34:16 -0700

This is all really exciting.  Getting a built-in, orchestrated repair is a
massive achievement.  Thank you for your work on this, it's incredibly
valuable to the community!!


Jon

On Sun, Mar 9, 2025 at 2:25 PM Jaydeep Chovatia <[email protected]>
wrote:

> No problem, Dave! Thank you.
>
> Jaydeep
>
> On Sun, Mar 9, 2025 at 10:46 AM Dave Herrington <[email protected]>
> wrote:
>
>> Jaydeep,
>>
>> Thank you for taking time to answer my questions and for the links to the
>> design and overview docs, which are excellent and answer all of my
>> remaining questions.  Sorry I missed those links in the CEP page.
>>
>> Great work and I will continue to follow your progress on this powerful
>> new feature.
>>
>> Thanks!
>> -Dave
>>
>> On Sat, Mar 8, 2025 at 9:36 AM Jaydeep Chovatia <
>> [email protected]> wrote:
>>
>>> Hi David,
>>>
>>> Thanks for the kind words!
>>>
>>> >Is there a goal in this CEP to make automated repair work during
>>> rolling upgrades, when multiple versions exist in the cluster?
>>> We debated this at length on ASF Slack
>>> (#cassandra-repair-scheduling-cep37). The summary is that, ideally, we want
>>> repair to function while the cluster is running mixed versions, but the
>>> reality is that there is currently no test suite inside Apache Cassandra to
>>> verify streaming behavior across mixed versions, so confidence is low.
>>> We agreed on the following: 1) Keeping safety in mind, disable repair by
>>> default while the cluster is running mixed versions. 2) Add a comprehensive
>>> test suite. 3) Allow repair during mixed versions. Currently, we are at #1.
>>>
>>> >Would automated repair be smart enough to automatically stop, if it
>>> sees incompatible versions?
>>> That's the plan, and we already have a PR (CASSANDRA-20048
>>> <https://issues.apache.org/jira/browse/CASSANDRA-20048>) out from Chris
>>> Lohfink. What we are debating is whether to stop only on a major version
>>> mismatch or also on a minor version mismatch, and we are leaning towards
>>> stopping only on a major version mismatch. Regardless, this should be
>>> available soon.
>>> We are also extending this further, per feedback from David Capwell, so
>>> that repair stops automatically if we detect a new DC or a change in a
>>> keyspace's RF. That will be covered later as part of
>>> CASSANDRA-20414 <https://issues.apache.org/jira/browse/CASSANDRA-20414>.
>>>
>>> >If automated repair must be disabled for the entire cluster, will this
>>> be a single nodetool command, or must automated repair be disabled on each
>>> node individually?
>>> Yes, it is a nodetool command and does not require any restarts! All the
>>> *nodetool* command details are currently covered in the design doc
>>> <https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit?tab=t.0#heading=h.89fmsespiosd>,
>>> and the same details will also be available in the Cassandra
>>> overview.adoc
>>> <https://github.com/apache/cassandra/pull/3598/files?short_path=e901018#diff-e90101885c1188844bb4188d1301277bfdc4a9e1e705c4ab8a6cc5a4b44460c0>
>>> .
>>>
>>> >Would it make sense for automated repair to upgrade sstables, if it
>>> finds old formats? (Maybe this could be a feature that could be optionally
>>> enabled?)
>>> My opinion is that it should not be part of the repair. It is best
>>> suited as part of the Cassandra upgrade framework; I guess Paulo M is
>>> looking at it.
>>>
>>> >W.R.T. the repair logging tables in the system_distributed keyspace,
>>> will these tables have a configurable TTL, or must they be periodically
>>> truncated to limit their size?
>>> The number of entries will equal the number of Cassandra nodes in a
>>> cluster. There is no TTL because each row represents the repair status of
>>> that particular node. The entries would be automatically added/removed as
>>> nodes are added/removed from the Cassandra cluster.
>>>
>>> Jaydeep
>>>
>>> On Sat, Mar 8, 2025 at 7:46 AM Dave Herrington <[email protected]>
>>> wrote:
>>>
>>>> Jaydeep,
>>>>
>>>> Thank you for your excellent efforts on this mission-critical feature.
>>>> The stated goals of CEP-37 are noble and stand to make valuable
>>>> improvements for cluster operations.  I look forward to testing these new
>>>> capabilities.
>>>>
>>>> My apologies up-front if you’ve already answered these questions.  I
>>>> did read the CEP a number of times and the linked JIRAs, but these are my
>>>> questions that I couldn’t answer myself.
>>>>
>>>> I’m interested to understand the goals of CEP-37 W.R.T. rolling
>>>> upgrades of large clusters, as I am responsible for maintaining the cluster
>>>> operations runbooks for a number of customers.
>>>>
>>>> Operators have to navigate the upgrade gauntlet with automated repairs
>>>> disabled and get all nodes upgraded within gc_grace_seconds and then do a
>>>> full repair, before restarting automated repairs.
>>>>
>>>> I see that CASSANDRA-7530
>>>> https://issues.apache.org/jira/browse/CASSANDRA-7530 is related to
>>>> this.
>>>>
>>>> Is there a goal in this CEP to make automated repair work during
>>>> rolling upgrades, when multiple versions exist in the cluster?
>>>>
>>>> (I think this would imply that stopping automated repairs would no
>>>> longer be a pre-upgrade step.)
>>>>
>>>> Would automated repair be smart enough to automatically stop, if it
>>>> sees incompatible versions?
>>>>
>>>> Would automated repair continue between nodes with compatible versions,
>>>> or would it stop for the entire cluster?
>>>>
>>>> If automated repair must be disabled for the entire cluster, will this
>>>> be a single nodetool command, or must automated repair be disabled on each
>>>> node individually?
>>>>
>>>> Would it make sense for automated repair to upgrade sstables, if it
>>>> finds old formats? (Maybe this could be a feature that could be optionally
>>>> enabled?)
>>>>
>>>> W.R.T. the repair logging tables in the system_distributed keyspace,
>>>> will these tables have a configurable TTL, or must they be periodically
>>>> truncated to limit their size?
>>>>
>>>> Thanks,
>>>> -Dave
>>>>
>>>> David A. Herrington II
>>>> President and Chief Engineer
>>>> RhinoSource, Inc.
>>>>
>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>
>>>> www.rhinosource.com
>>>>
>>>>
>>>> On Fri, Mar 7, 2025 at 11:48 AM Jaydeep Chovatia <
>>>> [email protected]> wrote:
>>>>
>>>>> Hello Everyone,
>>>>>
>>>>> I wanted to update you on CEP-37
>>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+Apache+Cassandra+Unified+Repair+Solution>
>>>>>  (Jira:
>>>>> CASSANDRA-19918
>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-19918>) work.
>>>>> Over the last year, some of us (Andy Tolbert, Chris Lohfink,
>>>>> Francisco Guerrero, and Kristijonas Zalys) have been working closely on
>>>>> making CEP-37 rock solid, with support from Josh McKenzie, Dinesh Joshi,
>>>>> and David Capwell.
>>>>> First and foremost, a huge thank you to everyone, including the
>>>>> broader Apache Cassandra community, for their invaluable contributions in
>>>>> making CEP-37 robust and solid!
>>>>>
>>>>> Here is the current status:
>>>>>
>>>>> *Feature stability*
>>>>>
>>>>>    - *Voted feature:* All the features mentioned in CEP-37 have
>>>>>    worked as expected.
>>>>>    - *Post-voted feature:* A few new minor improvements
>>>>>    <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=272927365#CEP37ApacheCassandraUnifiedRepairSolution-Post-VoteUpdates>
>>>>>    have been added post-vote, and they are also working as expected.
>>>>>    - Multiple people have tested the functionality over time.
>>>>>    - Some other facts: it has already been validated at scale
>>>>>    <https://www.youtube.com/watch?v=xFicEj6Nhq8>. Another big
>>>>>    Cassandra use case is in the process of validating/adopting it in their
>>>>>    environment.
>>>>>
>>>>> *Source Code*
>>>>>
>>>>>    - It is an opt-in feature; nobody notices anything unless someone
>>>>>    opts in.
>>>>>    - By default, this feature is pretty isolated (in a separate
>>>>>    package) from the source code point of view (94% of the source code
>>>>>    lines are in the new files)
>>>>>    - Thorough documentation has been added:
>>>>>       - overview.doc
>>>>>       - metrics.doc
>>>>>       - cassandra.yaml doc
>>>>>       - NEWS.txt overview
>>>>>    - Five people (Andy Tolbert, Chris Lohfink, Francisco Guerrero,
>>>>>    Kristijonas Zalys, and Jaydeep Chovatia) have contributed.
>>>>>    - The source code has been reviewed multiple times by the same
>>>>>    five people.
>>>>>
>>>>> *Test Coverage*
>>>>>
>>>>>    - Comprehensive test coverage has been added to cover all
>>>>>    aspects.
>>>>>    - The entire test suite has been passing.
>>>>>
>>>>>
>>>>> We are in the final review phase and nearly ready to merge. If anyone
>>>>> has any last-minute feedback, this is the final opportunity for review.
>>>>>
>>>>> Thank you!
>>>>> Andy Tolbert, Chris Lohfink, Francisco Guerrero, Kristijonas Zalys,
>>>>> and Jaydeep
>>>>>
>>>>
>>
>> --
>> -Dave
>>
>> David A. Herrington II
>> President and Chief Engineer
>> RhinoSource, Inc.
>>
>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>
>> www.rhinosource.com
>>
>
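
The version gate discussed in the quoted thread (pause automated repair when
nodes report different major versions, with stricter minor-version checking as
an option) could be sketched roughly as follows. This is only an illustration
and not the CASSANDRA-20048 implementation: the function names, the
include_minor flag, and the assumed release-string format ("5.0.2",
"5.1-SNAPSHOT") are all hypothetical.

```python
from typing import Iterable, Tuple


def parse_version(release: str) -> Tuple[int, int]:
    """Return (major, minor) from a release string such as '5.0.2'.

    Assumption: releases look like 'MAJOR.MINOR[.PATCH][-SUFFIX]'.
    """
    core = release.split("-", 1)[0]  # drop suffixes such as -SNAPSHOT
    parts = core.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def should_pause_repair(releases: Iterable[str], include_minor: bool = False) -> bool:
    """Return True when the reported node versions count as 'mixed'.

    By default only a major version mismatch pauses repair, which is the
    direction the thread says the authors are leaning. Setting the
    hypothetical include_minor flag also treats minor mismatches as mixed.
    """
    parsed = {parse_version(r) for r in releases}
    if include_minor:
        return len(parsed) > 1
    return len({major for major, _ in parsed}) > 1


if __name__ == "__main__":
    # Mid-upgrade cluster: some nodes on 4.1.x, some on 5.0.x.
    print(should_pause_repair(["4.1.3", "5.0.2", "5.0.2"]))
    # Patch-level differences only.
    print(should_pause_repair(["5.0.2", "5.0.4"]))
```

In this sketch a cluster with only patch-level differences is never treated
as mixed, even with include_minor enabled, because the patch component is
discarded during parsing.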
