I love the idea of a repair service being there by default for an install of C*. My main concern here is that it is putting more services into the main database process. I actually think we should be looking at how we can move things out of the database process. The C* process being a giant monolith has always been a pain point. Is there any way it makes sense for this to be an external process rather than a new thread pool inside the C* process?
-Jeremiah Jordan

On Oct 18, 2024 at 2:58:15 PM, Mick Semb Wever <m...@apache.org> wrote:

>
> This is looking strong, thanks Jaydeep.
>
> I would suggest folk take a look at the design doc and the PR in the CEP.
> A lot is there (that I have completely missed).
>
> I would especially ask all authors of prior art (Reaper, DSE nodesync,
> ecchronos) to take a final review of the proposal
>
> Jaydeep, can we ask for a two week window while we reach out to these
> people ? There's a lot of prior art in this space, and it feels like we're
> in a good place now where it's clear this has legs and we can use that to
> bring folk in and make sure there's no remaining blindspots.
>
>
> On Fri, 18 Oct 2024 at 01:40, Jaydeep Chovatia <chovatia.jayd...@gmail.com>
> wrote:
>
>> Sorry, there is a typo in the CEP-37 link; here is the correct link
>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+Apache+Cassandra+Unified+Repair+Solution>
>>
>>
>> On Thu, Oct 17, 2024 at 4:36 PM Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:
>>
>>> First, thank you for your patience while we strengthened the CEP-37.
>>>
>>>
>>> Over the last eight months, Chris Lohfink, Andy Tolbert, Josh McKenzie,
>>> Dinesh Joshi, Kristijonas Zalys, and I have done tons of work (online
>>> discussions/a dedicated Slack channel #cassandra-repair-scheduling-cep37)
>>> to come up with the best possible design that not only significantly
>>> simplifies repair operations but also includes the most common features
>>> that everyone will benefit from running at Scale.
>>>
>>> For example,
>>>
>>> - Apache Cassandra must be capable of running multiple repair types,
>>>   such as Full, Incremental, Paxos, and Preview - so the framework should
>>>   be easily extendable with no additional overhead from the operator’s
>>>   point of view.
>>> - An easy way to extend the token-split calculation algorithm with a
>>>   default implementation should exist.
>>> - Running incremental repair reliably at Scale is pretty challenging,
>>>   so we need to place safeguards, such as migration/rollback w/o restart
>>>   and stopping incremental repair automatically if the disk is about to
>>>   get full.
>>>
>>> We are glad to inform you that CEP-37 (i.e., Repair inside Cassandra) is
>>> now officially ready for review after multiple rounds of design, testing,
>>> code reviews, documentation reviews, and, more importantly, validation that
>>> it runs at Scale!
>>>
>>>
>>> Some facts about CEP-37.
>>>
>>> - Multiple members have verified all aspects of CEP-37 numerous times.
>>> - The design proposed in CEP-37 has been thoroughly tried and tested
>>>   on an immense scale (hundreds of unique Cassandra clusters, tens of
>>>   thousands of Cassandra nodes, with tens of millions of QPS) on top of 4.1
>>>   open-source for more than five years; please see more details here
>>>   <https://www.uber.com/en-US/blog/how-uber-optimized-cassandra-operations-at-scale/>.
>>> - The following presentation
>>>   <https://docs.google.com/presentation/d/1Zilww9c7LihHULk_ckErI2s4XbObxjWknKqRtbvHyZc/edit#slide=id.g30a4fd4fcf7_0_13>
>>>   highlights the rigor applied to CEP-37, which was given during last
>>>   week’s Apache Cassandra Bay Area Meetup
>>>   <https://www.meetup.com/apache-cassandra-bay-area/events/303469006/>.
>>>
>>>
>>> Since things are massively overhauled, we believe it is almost ready for
>>> a final pass pre-VOTE. We would like you to please review the CEP-37
>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+Apache+Cassandra+Unified+Repair+Solution)>
>>> and the associated detailed design doc
>>> <https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit#heading=h.r112r46toau0>.
>>>
>>> Thank you everyone!
>>>
>>> Chris, Andy, Josh, Dinesh, Kristijonas, and Jaydeep
>>>
>>>
>>>
>>> On Thu, Sep 19, 2024 at 11:26 AM Josh McKenzie <jmcken...@apache.org>
>>> wrote:
>>>
>>>> Not quite; finishing touches on the CEP and design doc are in flight
>>>> (as of last / this week).
>>>>
>>>> Soon(tm).
>>>>
>>>> On Thu, Sep 19, 2024, at 2:07 PM, Patrick McFadin wrote:
>>>>
>>>> Is this CEP ready for a VOTE thread?
>>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+%28DRAFT%29+Apache+Cassandra+Unified+Repair+Solution
>>>>
>>>> On Sun, Feb 25, 2024 at 12:25 PM Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:
>>>>
>>>> Thanks, Josh. I've just updated the CEP
>>>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+%28DRAFT%29+Apache+Cassandra+Official+Repair+Solution>
>>>> and included all the solutions you mentioned below.
>>>>
>>>> Jaydeep
>>>>
>>>> On Thu, Feb 22, 2024 at 9:33 AM Josh McKenzie <jmcken...@apache.org>
>>>> wrote:
>>>>
>>>>
>>>> Very late response from me here (basically necro'ing this thread).
>>>>
>>>> I think it'd be useful to get this condensed into a CEP that we can
>>>> then discuss in that format. It's clearly something we all agree we need
>>>> and having an implementation that works, even if it's not in your preferred
>>>> execution domain, is vastly better than nothing IMO.
>>>>
>>>> I don't have cycles (nor background ;) ) to do that, but it sounds like
>>>> you do Jaydeep given the implementation you have on a private fork +
>>>> design.
>>>>
>>>> A non-exhaustive list of things that might be useful incorporating into
>>>> or referencing from a CEP:
>>>> Slack thread:
>>>> https://the-asf.slack.com/archives/CK23JSY2K/p1690225062383619
>>>> Joey's old C* ticket:
>>>> https://issues.apache.org/jira/browse/CASSANDRA-14346
>>>> Even older automatic repair scheduling:
>>>> https://issues.apache.org/jira/browse/CASSANDRA-10070
>>>> Your design gdoc:
>>>> https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit#heading=h.r112r46toau0
>>>> PR with automated repair:
>>>> https://github.com/jaydeepkumar1984/cassandra/commit/ef6456d652c0d07cf29d88dfea03b73704814c2c
>>>>
>>>> My intuition is that we're all basically in agreement that this is
>>>> something the DB needs, we're all willing to bikeshed for our personal
>>>> preference on where it lives and how it's implemented, and at the end of
>>>> the day, code talks. I don't think anyone's said they'll die on the hill of
>>>> implementation details, so that feels like CEP time to me.
>>>>
>>>> If you were willing and able to get a CEP together for automated repair
>>>> based on the above material, given you've done the work and have the proof
>>>> points it's working at scale, I think this would be a *huge
>>>> contribution* to the community.
>>>>
>>>> On Thu, Aug 24, 2023, at 7:26 PM, Jaydeep Chovatia wrote:
>>>>
>>>> Is anyone going to file an official CEP for this?
>>>> As mentioned in this email thread, here is one of the solution's design
>>>> doc
>>>> <https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit#heading=h.r112r46toau0>
>>>> and source code on a private Apache Cassandra patch. Could you go through
>>>> it and let me know what you think?
>>>>
>>>> Jaydeep
>>>>
>>>> On Wed, Aug 2, 2023 at 3:54 PM Jon Haddad <rustyrazorbl...@apache.org>
>>>> wrote:
>>>>
>>>> > That said I would happily support an effort to bring repair
>>>> scheduling to the sidecar immediately. This has nothing blocking it, and
>>>> would potentially enable the sidecar to provide an official repair
>>>> scheduling solution that is compatible with current or even previous
>>>> versions of the database.
>>>>
>>>> This is something I hadn't thought much about, and is a pretty good
>>>> argument for using the sidecar initially. There's a lot of deployments out
>>>> there and having an official repair option would be a big win.
>>>>
>>>>
>>>> On 2023/07/26 23:20:07 "C. Scott Andreas" wrote:
>>>> > I agree that it would be ideal for Cassandra to have a repair
>>>> scheduler in-DB.
>>>> >
>>>> > That said I would happily support an effort to bring repair
>>>> scheduling to the sidecar immediately. This has nothing blocking it, and
>>>> would potentially enable the sidecar to provide an official repair
>>>> scheduling solution that is compatible with current or even previous
>>>> versions of the database.
>>>> >
>>>> > Once TCM has landed, we’ll have much stronger primitives for repair
>>>> orchestration in the database itself. But I don’t think that should block
>>>> progress on a repair scheduling solution in the sidecar, and there is
>>>> nothing that would prevent someone from continuing to use a sidecar-based
>>>> solution in perpetuity if they preferred.
>>>> >
>>>> > - Scott
>>>> >
>>>> > > On Jul 26, 2023, at 3:25 PM, Jon Haddad <rustyrazorbl...@apache.org>
>>>> wrote:
>>>> > >
>>>> > > I'm 100% in favor of repair being part of the core DB, not the
>>>> sidecar. The current (and past) state of things where running the DB
>>>> correctly *requires* running a separate process (either community
>>>> maintained or official C* sidecar) is incredibly painful for folks. The
>>>> idea that your data integrity needs to be opt-in has never made sense to me
>>>> from the perspective of either the product or the end user.
>>>> > >
>>>> > > I've worked with way too many teams that have either configured
>>>> this incorrectly or not at all.
>>>> > >
>>>> > > Ideally Cassandra would ship with repair built in and on by
>>>> default. Power users can disable if they want to continue to maintain
>>>> their own repair tooling for some reason.
>>>> > >
>>>> > > Jon
>>>> > >
>>>> > >> On 2023/07/24 20:44:14 German Eichberger via dev wrote:
>>>> > >> All,
>>>> > >> We had a brief discussion in [2] about the Uber article [1] where
>>>> they talk about having integrated repair into Cassandra and how great that
>>>> is. I expressed my disappointment that they didn't work with the community
>>>> on that (Uber, if you are listening time to make amends 🙂) and it turns
>>>> out Joey already had the idea and wrote the code [3] - so I wanted to start
>>>> a discussion to gauge interest and maybe how to revive that effort.
>>>> > >> Thanks,
>>>> > >> German
>>>> > >> [1] https://www.uber.com/blog/how-uber-optimized-cassandra-operations-at-scale/
>>>> > >> [2] https://the-asf.slack.com/archives/CK23JSY2K/p1690225062383619
>>>> > >> [3] https://issues.apache.org/jira/browse/CASSANDRA-14346