This is all really exciting. Getting a built-in, orchestrated repair is a massive achievement. Thank you for your work on this; it's incredibly valuable to the community!
Jon

On Sun, Mar 9, 2025 at 2:25 PM Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:

> No problem, Dave! Thank you.
>
> Jaydeep
>
> On Sun, Mar 9, 2025 at 10:46 AM Dave Herrington <he...@rhinosource.com>
> wrote:
>
>> Jaydeep,
>>
>> Thank you for taking time to answer my questions and for the links to the
>> design and overview docs, which are excellent and answer all of my
>> remaining questions. Sorry I missed those links in the CEP page.
>>
>> Great work and I will continue to follow your progress on this powerful
>> new feature.
>>
>> Thanks!
>> -Dave
>>
>> On Sat, Mar 8, 2025 at 9:36 AM Jaydeep Chovatia <
>> chovatia.jayd...@gmail.com> wrote:
>>
>>> Hi David,
>>>
>>> Thanks for the kind words!
>>>
>>> >Is there a goal in this CEP to make automated repair work during
>>> rolling upgrades, when multiple versions exist in the cluster?
>>> We debated this at length on ASF Slack
>>> (#cassandra-repair-scheduling-cep37). The summary is that, ideally, we
>>> want repair to function in a mixed-version cluster, but there is
>>> currently no test suite inside Apache Cassandra to verify streaming
>>> behavior across mixed versions, so confidence is low.
>>> We agreed on the following:
>>> 1) Keeping safety in mind, disable repair by default while the cluster
>>> is running mixed versions.
>>> 2) Add a comprehensive test suite.
>>> 3) Allow repair while the cluster is running mixed versions.
>>> Currently, we are at #1.
>>>
>>> >Would automated repair be smart enough to automatically stop, if it
>>> sees incompatible versions?
>>> That's the plan, and we already have a PR (CASSANDRA-20048
>>> <https://issues.apache.org/jira/browse/CASSANDRA-20048>) out from Chris
>>> Lohfink. The thing we are debating is whether to stop only on a major
>>> version mismatch or also on a minor version mismatch, and we are leaning
>>> towards disabling only for a major version mismatch. Regardless, this
>>> should be available soon.
>>> We are also extending this further, as per feedback from David
>>> Capwell, so that repair stops automatically if we detect a new DC or a
>>> change in keyspace RF. That will be covered later as part of
>>> CASSANDRA-20414 <https://issues.apache.org/jira/browse/CASSANDRA-20414>
>>>
>>> >If automated repair must be disabled for the entire cluster, will this
>>> be a single nodetool command, or must automated repair be disabled on each
>>> node individually?
>>> Yes, it is a nodetool command and does not require any restarts! All the
>>> *nodetool* command details are currently covered in the design doc
>>> <https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit?tab=t.0#heading=h.89fmsespiosd>,
>>> and the same details will also be available in the Cassandra
>>> overview.adoc
>>> <https://github.com/apache/cassandra/pull/3598/files?short_path=e901018#diff-e90101885c1188844bb4188d1301277bfdc4a9e1e705c4ab8a6cc5a4b44460c0>.
>>>
>>> >Would it make sense for automated repair to upgrade sstables, if it
>>> finds old formats? (Maybe this could be a feature that could be optionally
>>> enabled?)
>>> My opinion is that it should not be part of the repair. It is best
>>> suited as part of the Cassandra upgrade framework; I guess Paulo M is
>>> looking at it.
>>>
>>> >W.R.T. the repair logging tables in the system_distributed keyspace,
>>> will these tables have a configurable TTL, or must they be periodically
>>> truncated to limit their size?
>>> The number of entries will equal the number of Cassandra nodes in a
>>> cluster. There is no TTL because each row represents the repair status of
>>> that particular node. The entries are automatically added and removed as
>>> nodes are added to and removed from the Cassandra cluster.
>>>
>>> Jaydeep
>>>
>>> On Sat, Mar 8, 2025 at 7:46 AM Dave Herrington <he...@rhinosource.com>
>>> wrote:
>>>
>>>> Jaydeep,
>>>>
>>>> Thank you for your excellent efforts on this mission-critical feature.
>>>> The stated goals of CEP-37 are noble and stand to make valuable
>>>> improvements for cluster operations. I look forward to testing these new
>>>> capabilities.
>>>>
>>>> My apologies up-front if you’ve already answered these questions. I
>>>> did read the CEP a number of times and the linked JIRAs, but these are my
>>>> questions that I couldn’t answer myself.
>>>>
>>>> I’m interested to understand the goals of CEP-37 W.R.T. rolling
>>>> upgrades of large clusters, as I am responsible for maintaining the cluster
>>>> operations runbooks for a number of customers.
>>>>
>>>> Operators have to navigate the upgrade gauntlet with automated repairs
>>>> disabled and get all nodes upgraded within gc_grace_seconds and then do a
>>>> full repair, before restarting automated repairs.
>>>>
>>>> I see that CASSANDRA-7530
>>>> <https://issues.apache.org/jira/browse/CASSANDRA-7530> is related to
>>>> this.
>>>>
>>>> Is there a goal in this CEP to make automated repair work during
>>>> rolling upgrades, when multiple versions exist in the cluster?
>>>>
>>>> (I think this would imply that stopping automated repairs would no
>>>> longer be a pre-upgrade step.)
>>>>
>>>> Would automated repair be smart enough to automatically stop, if it
>>>> sees incompatible versions?
>>>>
>>>> Would automated repair continue between nodes with compatible versions,
>>>> or would it stop for the entire cluster?
>>>>
>>>> If automated repair must be disabled for the entire cluster, will this
>>>> be a single nodetool command, or must automated repair be disabled on each
>>>> node individually?
>>>>
>>>> Would it make sense for automated repair to upgrade sstables, if it
>>>> finds old formats? (Maybe this could be a feature that could be optionally
>>>> enabled?)
>>>>
>>>> W.R.T. the repair logging tables in the system_distributed keyspace,
>>>> will these tables have a configurable TTL, or must they be periodically
>>>> truncated to limit their size?
>>>>
>>>> Thanks,
>>>> -Dave
>>>>
>>>> David A. Herrington II
>>>> President and Chief Engineer
>>>> RhinoSource, Inc.
>>>>
>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>
>>>> www.rhinosource.com
>>>>
>>>>
>>>> On Fri, Mar 7, 2025 at 11:48 AM Jaydeep Chovatia <
>>>> chovatia.jayd...@gmail.com> wrote:
>>>>
>>>>> Hello Everyone,
>>>>>
>>>>> I wanted to update you on CEP-37
>>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+Apache+Cassandra+Unified+Repair+Solution>
>>>>> (Jira: CASSANDRA-19918
>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-19918>) work.
>>>>> Over the last year, some of us (Andy Tolbert, Chris Lohfink,
>>>>> Francisco Guerrero, and Kristijonas Zalys) have been working closely on
>>>>> making CEP-37 rock solid, with support from Josh McKenzie, Dinesh Joshi,
>>>>> and David Capwell.
>>>>> First and foremost, a huge thank you to everyone, including the
>>>>> broader Apache Cassandra community, for their invaluable contributions in
>>>>> making CEP-37 robust and solid!
>>>>>
>>>>> Here is the current status:
>>>>>
>>>>> *Feature stability*
>>>>>
>>>>> - *Voted feature:* All the features mentioned in CEP-37 have
>>>>>   worked as expected.
>>>>> - *Post-voted feature:* A few new minor improvements
>>>>>   <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=272927365#CEP37ApacheCassandraUnifiedRepairSolution-Post-VoteUpdates>
>>>>>   have been added post-vote, and they are also working as expected.
>>>>> - The functionality has been tested by multiple people over time.
>>>>> - Some other facts: it has already been validated at scale
>>>>>   <https://www.youtube.com/watch?v=xFicEj6Nhq8>. Another big
>>>>>   Cassandra use case is in the process of validating/adopting it in their
>>>>>   environment.
>>>>>
>>>>> *Source Code*
>>>>>
>>>>> - It is an opt-in feature; nobody notices anything unless someone
>>>>>   opts in.
>>>>> - By default, this feature is pretty isolated (in a separate
>>>>>   package) from the source code point of view (94% of the source code
>>>>>   lines are in the new files).
>>>>> - Thorough documentation has been added:
>>>>>   - overview.doc
>>>>>   - metrics.doc
>>>>>   - cassandra.yaml doc
>>>>>   - NEWS.txt overview
>>>>> - Five people (Andy Tolbert, Chris Lohfink, Francisco Guerrero,
>>>>>   Kristijonas Zalys, and Jaydeep Chovatia) have contributed.
>>>>> - The source code has been reviewed multiple times by the same
>>>>>   five people.
>>>>>
>>>>> *Test Coverage*
>>>>>
>>>>> - Comprehensive test coverage has been added to cover all
>>>>>   aspects.
>>>>> - The entire test suite has been passing.
>>>>>
>>>>>
>>>>> We are in the final review phase and nearly ready to merge. If anyone
>>>>> has any last-minute feedback, this is the final opportunity for review.
>>>>>
>>>>> Thank you!
>>>>> Andy Tolbert, Chris Lohfink, Francisco Guerrero, Kristijonas Zalys,
>>>>> and Jaydeep
>>>>>
>>>>
>>
>> --
>> -Dave
>>
>> David A. Herrington II
>> President and Chief Engineer
>> RhinoSource, Inc.
>>
>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>
>> www.rhinosource.com
>>
>