Thanks a lot, Jon! This has truly been a team effort, with Andy Tolbert, Chris Lohfink, Francisco Guerrero, and Kristijonas Zalys all contributing over the past year. The credit belongs to everyone!

Jaydeep

On Sun, Mar 9, 2025 at 2:35 PM Jon Haddad <j...@rustyrazorblade.com> wrote:

> This is all really exciting. Getting a built in, orchestrated repair is a
> massive achievement. Thank you for your work on this, it's incredibly
> valuable to the community!!
>
> Jon
>
> On Sun, Mar 9, 2025 at 2:25 PM Jaydeep Chovatia <
> chovatia.jayd...@gmail.com> wrote:
>
>> No problem, Dave! Thank you.
>>
>> Jaydeep
>>
>> On Sun, Mar 9, 2025 at 10:46 AM Dave Herrington <he...@rhinosource.com>
>> wrote:
>>
>>> Jaydeep,
>>>
>>> Thank you for taking time to answer my questions and for the links to
>>> the design and overview docs, which are excellent and answer all of my
>>> remaining questions. Sorry I missed those links in the CEP page.
>>>
>>> Great work and I will continue to follow your progress on this powerful
>>> new feature.
>>>
>>> Thanks!
>>> -Dave
>>>
>>> On Sat, Mar 8, 2025 at 9:36 AM Jaydeep Chovatia <
>>> chovatia.jayd...@gmail.com> wrote:
>>>
>>>> Hi David,
>>>>
>>>> Thanks for the kind words!
>>>>
>>>> >Is there a goal in this CEP to make automated repair work during
>>>> rolling upgrades, when multiple versions exist in the cluster?
>>>> We debated a lot on this over ASF Slack
>>>> (#cassandra-repair-scheduling-cep37). The summary is that, ideally, we want
>>>> to have a repair function during the mixed version, but the reality is that
>>>> currently, there is no test suite available inside Apache Cassandra to
>>>> verify the streaming behavior during the mixed version, so the confidence
>>>> is low.
>>>> We agreed on the following: 1) Keeping safety in mind, we should by
>>>> default disable the repair during mixed version 2) Add a comprehensive test
>>>> suite 3) Allow repair during mixed version. Currently, we are at #1
>>>>
>>>> >Would automated repair be smart enough to automatically stop, if it
>>>> sees incompatible versions?
>>>> That's the plan, and we already have a PR (CASSANDRA-20048
>>>> <https://issues.apache.org/jira/browse/CASSANDRA-20048>) out from
>>>> Chris Lohfink. The thing we are debating is whether to stop only on a
>>>> major version mismatch or also on a minor version mismatch, and we are
>>>> leaning towards only disabling for a major version mismatch. Regardless,
>>>> this should be available soon.
>>>> We are also extending this further as per feedback from David
>>>> Capwell that we should automatically stop repair if we detect a new DC or
>>>> a change in keyspace RF. That will be covered later as part of
>>>> CASSANDRA-20414 <https://issues.apache.org/jira/browse/CASSANDRA-20414>
>>>>
>>>> >If automated repair must be disabled for the entire cluster, will this
>>>> be a single nodetool command, or must automated repair be disabled on each
>>>> node individually?
>>>> Yes, it is a nodetool command and does not require any restarts! All
>>>> the *nodetool* command details are currently covered in the design doc
>>>> <https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit?tab=t.0#heading=h.89fmsespiosd>,
>>>> and the same details will also be available in the Cassandra
>>>> overview.adoc
>>>> <https://github.com/apache/cassandra/pull/3598/files?short_path=e901018#diff-e90101885c1188844bb4188d1301277bfdc4a9e1e705c4ab8a6cc5a4b44460c0>.
>>>>
>>>> >Would it make sense for automated repair to upgrade sstables, if it
>>>> finds old formats? (Maybe this could be a feature that could be optionally
>>>> enabled?)
>>>> My opinion is that it should not be part of the repair. It is best
>>>> suited as part of the Cassandra upgrade framework; I guess Paulo M is
>>>> looking at it.
>>>>
>>>> >W.R.T. the repair logging tables in the system_distributed keyspace,
>>>> will these tables have a configurable TTL, or must they be periodically
>>>> truncated to limit their size?
>>>> The number of entries will equal the number of Cassandra nodes in a
>>>> cluster. There is no TTL because each row represents the repair status of
>>>> that particular node. The entries would be automatically added/removed as
>>>> nodes are added/removed from the Cassandra cluster.
>>>>
>>>> Jaydeep
>>>>
>>>> On Sat, Mar 8, 2025 at 7:46 AM Dave Herrington <he...@rhinosource.com>
>>>> wrote:
>>>>
>>>>> Jaydeep,
>>>>>
>>>>> Thank you for your excellent efforts on this mission-critical
>>>>> feature. The stated goals of CEP-37 are noble and stand to make valuable
>>>>> improvements for cluster operations. I look forward to testing these new
>>>>> capabilities.
>>>>>
>>>>> My apologies up-front if you’ve already answered these questions. I
>>>>> did read the CEP a number of times and the linked JIRAs, but these are my
>>>>> questions that I couldn’t answer myself.
>>>>>
>>>>> I’m interested to understand the goals of CEP-37 W.R.T. rolling
>>>>> upgrades of large clusters, as I am responsible for maintaining the
>>>>> cluster operations runbooks for a number of customers.
>>>>>
>>>>> Operators have to navigate the upgrade gauntlet with automated repairs
>>>>> disabled and get all nodes upgraded within gc_grace_seconds and then do a
>>>>> full repair, before restarting automated repairs.
>>>>>
>>>>> I see that CASSANDRA-7530
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-7530 is related to
>>>>> this.
>>>>>
>>>>> Is there a goal in this CEP to make automated repair work during
>>>>> rolling upgrades, when multiple versions exist in the cluster?
>>>>>
>>>>> (I think this would imply that stopping automated repairs would no
>>>>> longer be a pre-upgrade step.)
>>>>>
>>>>> Would automated repair be smart enough to automatically stop, if it
>>>>> sees incompatible versions?
>>>>>
>>>>> Would automated repair continue between nodes with compatible
>>>>> versions, or would it stop for the entire cluster?
>>>>>
>>>>> If automated repair must be disabled for the entire cluster, will this
>>>>> be a single nodetool command, or must automated repair be disabled on each
>>>>> node individually?
>>>>>
>>>>> Would it make sense for automated repair to upgrade sstables, if it
>>>>> finds old formats? (Maybe this could be a feature that could be optionally
>>>>> enabled?)
>>>>>
>>>>> W.R.T. the repair logging tables in the system_distributed keyspace,
>>>>> will these tables have a configurable TTL, or must they be periodically
>>>>> truncated to limit their size?
>>>>>
>>>>> Thanks,
>>>>> -Dave
>>>>>
>>>>> David A. Herrington II
>>>>> President and Chief Engineer
>>>>> RhinoSource, Inc.
>>>>>
>>>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>>>
>>>>> www.rhinosource.com
>>>>>
>>>>>
>>>>> On Fri, Mar 7, 2025 at 11:48 AM Jaydeep Chovatia <
>>>>> chovatia.jayd...@gmail.com> wrote:
>>>>>
>>>>>> Hello Everyone,
>>>>>>
>>>>>> I wanted to update you on CEP-37
>>>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+Apache+Cassandra+Unified+Repair+Solution>
>>>>>> (Jira:
>>>>>> CASSANDRA-19918
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-19918>) work.
>>>>>> Over the last year, some of us (Andy Tolbert, Chris Lohfink,
>>>>>> Francisco Guerrero, and Kristijonas Zalys) have been working closely on
>>>>>> making CEP-37 rock solid, with support from Josh McKenzie, Dinesh Joshi,
>>>>>> and David Capwell.
>>>>>> First and foremost, a huge thank you to everyone, including the
>>>>>> broader Apache Cassandra community, for their invaluable contributions in
>>>>>> making CEP-37 robust and solid!
>>>>>>
>>>>>> Here is the current status:
>>>>>>
>>>>>> *Feature stability*
>>>>>>
>>>>>>    - *Voted feature:* All the features mentioned in CEP-37 have
>>>>>>    worked as expected.
>>>>>>    - *Post-voted feature:* A few new minor improvements
>>>>>>    <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=272927365#CEP37ApacheCassandraUnifiedRepairSolution-Post-VoteUpdates>
>>>>>>    have been added post-voting, and they are also working as expected.
>>>>>>    - The functionality has been tested by multiple people over time.
>>>>>>    - Some other facts: it has already been validated at scale
>>>>>>    <https://www.youtube.com/watch?v=xFicEj6Nhq8>. Another big
>>>>>>    Cassandra use case is in the process of validating/adopting it in
>>>>>>    their environment.
>>>>>>
>>>>>> *Source Code*
>>>>>>
>>>>>>    - It is an opt-in feature; nobody notices anything unless someone
>>>>>>    opts in.
>>>>>>    - By default, this feature is pretty isolated (in a separate
>>>>>>    package) from the source code point of view (94% of the source code
>>>>>>    lines are in the new files).
>>>>>>    - Thorough documentation has been added:
>>>>>>       - overview.doc
>>>>>>       - metrics.doc
>>>>>>       - cassandra.yaml doc
>>>>>>       - NEWS.txt overview
>>>>>>    - Five people (Andy Tolbert, Chris Lohfink, Francisco Guerrero,
>>>>>>    Kristijonas Zalys, and Jaydeep) have contributed.
>>>>>>    - The source code has been reviewed multiple times by the same
>>>>>>    five people.
>>>>>>
>>>>>> *Test Coverage*
>>>>>>
>>>>>>    - Comprehensive test coverage has been added to cover all
>>>>>>    aspects.
>>>>>>    - The entire test suite has been passing.
>>>>>>
>>>>>>
>>>>>> We are in the final review phase and nearly ready to merge. If anyone
>>>>>> has any last-minute feedback, this is the final opportunity for review.
>>>>>>
>>>>>> Thank you!
>>>>>> Andy Tolbert, Chris Lohfink, Francisco Guerrero, Kristijonas Zalys,
>>>>>> and Jaydeep
>>>>>>
>>>>>
>>>
>>> --
>>> -Dave
>>>
>>> David A. Herrington II
>>> President and Chief Engineer
>>> RhinoSource, Inc.
>>>
>>> *Data Lake Architecture, Cloud Computing and Advanced Analytics.*
>>>
>>> www.rhinosource.com
>>>
>>