Not quite; finishing touches on the CEP and design doc are in flight (as of last / this week).
Soon(tm).

On Thu, Sep 19, 2024, at 2:07 PM, Patrick McFadin wrote:
> Is this CEP ready for a VOTE thread?
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+%28DRAFT%29+Apache+Cassandra+Unified+Repair+Solution
>
> On Sun, Feb 25, 2024 at 12:25 PM Jaydeep Chovatia
> <chovatia.jayd...@gmail.com> wrote:
>> Thanks, Josh. I've just updated the CEP
>> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+%28DRAFT%29+Apache+Cassandra+Official+Repair+Solution>
>> and included all the solutions you mentioned below.
>>
>> Jaydeep
>>
>> On Thu, Feb 22, 2024 at 9:33 AM Josh McKenzie <jmcken...@apache.org> wrote:
>>> Very late response from me here (basically necro'ing this thread).
>>>
>>> I think it'd be useful to get this condensed into a CEP that we can then
>>> discuss in that format. It's clearly something we all agree we need and
>>> having an implementation that works, even if it's not in your preferred
>>> execution domain, is vastly better than nothing IMO.
>>>
>>> I don't have cycles (nor background ;) ) to do that, but it sounds like you
>>> do Jaydeep given the implementation you have on a private fork + design.
>>>
>>> A non-exhaustive list of things that might be useful incorporating into or
>>> referencing from a CEP:
>>> Slack thread: https://the-asf.slack.com/archives/CK23JSY2K/p1690225062383619
>>> Joey's old C* ticket: https://issues.apache.org/jira/browse/CASSANDRA-14346
>>> Even older automatic repair scheduling:
>>> https://issues.apache.org/jira/browse/CASSANDRA-10070
>>> Your design gdoc:
>>> https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit#heading=h.r112r46toau0
>>> PR with automated repair:
>>> https://github.com/jaydeepkumar1984/cassandra/commit/ef6456d652c0d07cf29d88dfea03b73704814c2c
>>>
>>> My intuition is that we're all basically in agreement that this is
>>> something the DB needs, we're all willing to bikeshed for our personal
>>> preference on where it lives and how it's implemented, and at the end of
>>> the day, code talks. I don't think anyone's said they'll die on the hill of
>>> implementation details, so that feels like CEP time to me.
>>>
>>> If you were willing and able to get a CEP together for automated repair
>>> based on the above material, given you've done the work and have the proof
>>> points it's working at scale, I think this would be a *huge contribution*
>>> to the community.
>>>
>>> On Thu, Aug 24, 2023, at 7:26 PM, Jaydeep Chovatia wrote:
>>>> Is anyone going to file an official CEP for this?
>>>> As mentioned in this email thread, here is one of the solution's design
>>>> doc
>>>> <https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit#heading=h.r112r46toau0>
>>>> and source code on a private Apache Cassandra patch. Could you go through
>>>> it and let me know what you think?
>>>>
>>>> Jaydeep
>>>>
>>>> On Wed, Aug 2, 2023 at 3:54 PM Jon Haddad <rustyrazorbl...@apache.org>
>>>> wrote:
>>>>> > That said I would happily support an effort to bring repair scheduling
>>>>> > to the sidecar immediately. This has nothing blocking it, and would
>>>>> > potentially enable the sidecar to provide an official repair scheduling
>>>>> > solution that is compatible with current or even previous versions of
>>>>> > the database.
>>>>>
>>>>> This is something I hadn't thought much about, and is a pretty good
>>>>> argument for using the sidecar initially. There's a lot of deployments
>>>>> out there and having an official repair option would be a big win.
>>>>>
>>>>>
>>>>> On 2023/07/26 23:20:07 "C. Scott Andreas" wrote:
>>>>> > I agree that it would be ideal for Cassandra to have a repair scheduler
>>>>> > in-DB.
>>>>> >
>>>>> > That said I would happily support an effort to bring repair scheduling
>>>>> > to the sidecar immediately. This has nothing blocking it, and would
>>>>> > potentially enable the sidecar to provide an official repair scheduling
>>>>> > solution that is compatible with current or even previous versions of
>>>>> > the database.
>>>>> >
>>>>> > Once TCM has landed, we’ll have much stronger primitives for repair
>>>>> > orchestration in the database itself. But I don’t think that should
>>>>> > block progress on a repair scheduling solution in the sidecar, and
>>>>> > there is nothing that would prevent someone from continuing to use a
>>>>> > sidecar-based solution in perpetuity if they preferred.
>>>>> >
>>>>> > - Scott
>>>>> >
>>>>> > > On Jul 26, 2023, at 3:25 PM, Jon Haddad <rustyrazorbl...@apache.org>
>>>>> > > wrote:
>>>>> > >
>>>>> > > I'm 100% in favor of repair being part of the core DB, not the
>>>>> > > sidecar. The current (and past) state of things where running the DB
>>>>> > > correctly *requires* running a separate process (either community
>>>>> > > maintained or official C* sidecar) is incredibly painful for folks.
>>>>> > > The idea that your data integrity needs to be opt-in has never made
>>>>> > > sense to me from the perspective of either the product or the end
>>>>> > > user.
>>>>> > >
>>>>> > > I've worked with way too many teams that have either configured this
>>>>> > > incorrectly or not at all.
>>>>> > >
>>>>> > > Ideally Cassandra would ship with repair built in and on by default.
>>>>> > > Power users can disable if they want to continue to maintain their
>>>>> > > own repair tooling for some reason.
>>>>> > >
>>>>> > > Jon
>>>>> > >
>>>>> > >> On 2023/07/24 20:44:14 German Eichberger via dev wrote:
>>>>> > >> All,
>>>>> > >> We had a brief discussion in [2] about the Uber article [1] where
>>>>> > >> they talk about having integrated repair into Cassandra and how
>>>>> > >> great that is. I expressed my disappointment that they didn't work
>>>>> > >> with the community on that (Uber, if you are listening time to make
>>>>> > >> amends 🙂) and it turns out Joey already had the idea and wrote the
>>>>> > >> code [3] - so I wanted to start a discussion to gauge interest and
>>>>> > >> maybe how to revive that effort.
>>>>> > >> Thanks,
>>>>> > >> German
>>>>> > >> [1]
>>>>> > >> https://www.uber.com/blog/how-uber-optimized-cassandra-operations-at-scale/
>>>>> > >> [2] https://the-asf.slack.com/archives/CK23JSY2K/p1690225062383619
>>>>> > >> [3] https://issues.apache.org/jira/browse/CASSANDRA-14346
>>>>> > >>>