Is this CEP ready for a VOTE thread? https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+%28DRAFT%29+Apache+Cassandra+Unified+Repair+Solution
On Sun, Feb 25, 2024 at 12:25 PM Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:

> Thanks, Josh. I've just updated the CEP
> <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+%28DRAFT%29+Apache+Cassandra+Official+Repair+Solution>
> and included all the solutions you mentioned below.
>
> Jaydeep
>
> On Thu, Feb 22, 2024 at 9:33 AM Josh McKenzie <jmcken...@apache.org> wrote:
>
>> Very late response from me here (basically necro'ing this thread).
>>
>> I think it'd be useful to get this condensed into a CEP that we can then discuss in that format. It's clearly something we all agree we need and having an implementation that works, even if it's not in your preferred execution domain, is vastly better than nothing IMO.
>>
>> I don't have cycles (nor background ;) ) to do that, but it sounds like you do Jaydeep given the implementation you have on a private fork + design.
>>
>> A non-exhaustive list of things that might be useful incorporating into or referencing from a CEP:
>> Slack thread: https://the-asf.slack.com/archives/CK23JSY2K/p1690225062383619
>> Joey's old C* ticket: https://issues.apache.org/jira/browse/CASSANDRA-14346
>> Even older automatic repair scheduling: https://issues.apache.org/jira/browse/CASSANDRA-10070
>> Your design gdoc: https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit#heading=h.r112r46toau0
>> PR with automated repair: https://github.com/jaydeepkumar1984/cassandra/commit/ef6456d652c0d07cf29d88dfea03b73704814c2c
>>
>> My intuition is that we're all basically in agreement that this is something the DB needs, we're all willing to bikeshed for our personal preference on where it lives and how it's implemented, and at the end of the day, code talks. I don't think anyone's said they'll die on the hill of implementation details, so that feels like CEP time to me.
>>
>> If you were willing and able to get a CEP together for automated repair based on the above material, given you've done the work and have the proof points it's working at scale, I think this would be a *huge contribution* to the community.
>>
>> On Thu, Aug 24, 2023, at 7:26 PM, Jaydeep Chovatia wrote:
>>
>> Is anyone going to file an official CEP for this?
>> As mentioned in this email thread, here is one of the solution's design doc
>> <https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit#heading=h.r112r46toau0>
>> and source code on a private Apache Cassandra patch. Could you go through it and let me know what you think?
>>
>> Jaydeep
>>
>> On Wed, Aug 2, 2023 at 3:54 PM Jon Haddad <rustyrazorbl...@apache.org> wrote:
>>
>> > That said I would happily support an effort to bring repair scheduling to the sidecar immediately. This has nothing blocking it, and would potentially enable the sidecar to provide an official repair scheduling solution that is compatible with current or even previous versions of the database.
>>
>> This is something I hadn't thought much about, and is a pretty good argument for using the sidecar initially. There's a lot of deployments out there and having an official repair option would be a big win.
>>
>> On 2023/07/26 23:20:07 "C. Scott Andreas" wrote:
>> > I agree that it would be ideal for Cassandra to have a repair scheduler in-DB.
>> >
>> > That said I would happily support an effort to bring repair scheduling to the sidecar immediately. This has nothing blocking it, and would potentially enable the sidecar to provide an official repair scheduling solution that is compatible with current or even previous versions of the database.
>> >
>> > Once TCM has landed, we’ll have much stronger primitives for repair orchestration in the database itself. But I don’t think that should block progress on a repair scheduling solution in the sidecar, and there is nothing that would prevent someone from continuing to use a sidecar-based solution in perpetuity if they preferred.
>> >
>> > - Scott
>> >
>> > > On Jul 26, 2023, at 3:25 PM, Jon Haddad <rustyrazorbl...@apache.org> wrote:
>> > >
>> > > I'm 100% in favor of repair being part of the core DB, not the sidecar. The current (and past) state of things where running the DB correctly *requires* running a separate process (either community maintained or official C* sidecar) is incredibly painful for folks. The idea that your data integrity needs to be opt-in has never made sense to me from the perspective of either the product or the end user.
>> > >
>> > > I've worked with way too many teams that have either configured this incorrectly or not at all.
>> > >
>> > > Ideally Cassandra would ship with repair built in and on by default. Power users can disable if they want to continue to maintain their own repair tooling for some reason.
>> > >
>> > > Jon
>> > >
>> > >> On 2023/07/24 20:44:14 German Eichberger via dev wrote:
>> > >> All,
>> > >> We had a brief discussion in [2] about the Uber article [1] where they talk about having integrated repair into Cassandra and how great that is. I expressed my disappointment that they didn't work with the community on that (Uber, if you are listening time to make amends 🙂) and it turns out Joey already had the idea and wrote the code [3] - so I wanted to start a discussion to gauge interest and maybe how to revive that effort.
>> > >> Thanks,
>> > >> German
>> > >> [1] https://www.uber.com/blog/how-uber-optimized-cassandra-operations-at-scale/
>> > >> [2] https://the-asf.slack.com/archives/CK23JSY2K/p1690225062383619
>> > >> [3] https://issues.apache.org/jira/browse/CASSANDRA-14346