Hi Jaydeep,

I've taken a look at the proposed design and have a few comments/questions.
As one of the maintainers of Reaper, I'm looking at this through the lens
of how Reaper does things.


*The approach taken in the CEP-37 design is "node-centric" vs a "range
centric" approach (which is the one Reaper takes).* I'm worried that this
will not allow spreading the repair load evenly across the cluster, since
nodes are the concurrency unit. You could allow running repair on 3 nodes
concurrently, for example, but these 3 nodes could all involve the same
replicas, making those replicas process 3 concurrent repairs while other
nodes are left uninvolved in any repair at all.
Taking a range-centric approach (we're not repairing nodes, we're repairing
the token ranges) allows spreading the load evenly without overlap in the
replica sets.
I'm even more worried with incremental repair here, because you might end
up with conflicts around sstables which would be in the pending repair
pool but would be needed by a competing repair job.
I don't know whether, in the latest versions, such sstables would be
entirely ignored or the competing repair job would fail.

*Each repair command will repair all keyspaces (with the ability to fully
exclude some tables), and I haven't seen a notion of schedule, which seems
to suggest repairs run continuously (unless I missed something?).*
There are many cases where one might have differentiated gc_grace_seconds
settings to optimize reclaiming tombstones when applicable. That requires
fine control over the repair cycle for a given keyspace/set of tables.
Here, nodes will be processed sequentially, and each node will process the
keyspaces sequentially, tying the repair cycles of all keyspaces together.
If one of the ranges for a specific keyspace cannot be repaired within the
3 hour timeout, it could block the repairs of all the other keyspaces.
Continuous repair might also create a lot of overhead for full repairs,
which often don't require more than 1 run per week.
It also will not allow running a mix of scheduled full/incremental repairs
(I'm unsure whether that is still a recommendation, but it was still
recommended not so long ago).
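To make the gc_grace_seconds point concrete, here is a hedged sketch (the 0.5 safety margin and the table values are invented for illustration) of why repair cadence has to be derived per table: a table must complete a full repair cycle within its gc_grace_seconds, or deleted data can resurrect once tombstones are collected.

```python
# Illustrative only: each table's repair interval is bounded by its own
# gc_grace_seconds, so tables with short gc_grace need a much tighter
# cadence than others - one shared continuous cycle can't express that.

DAY = 86400

def max_repair_interval(gc_grace_seconds, safety_margin=0.5):
    # Leave headroom for a cycle that overruns or has to be retried.
    return int(gc_grace_seconds * safety_margin)

# Hypothetical per-table gc_grace settings:
tables = {"events": 3 * DAY, "users": 10 * DAY}
for name, ggs in tables.items():
    print(name, max_repair_interval(ggs) // DAY, "days")
# events 1 days
# users 5 days
```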

*The timeout base duration is large*
I think the 3 hour timeout is quite large and probably means a lot of data
is being repaired for each split. That usually involves some level of
overstreaming. I don't have numbers to support this; it's more about my
own experience sizing splits in production with Reaper to reduce the
impact on cluster performance as much as possible.
We use 30 minutes as the default in Reaper, with subsequent attempts
growing the timeout dynamically for challenging splits.
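A rough sketch of that dynamic growth (the doubling factor and the cap are illustrative, not Reaper's actual values): start from a small base timeout and widen the window only for splits that keep timing out, instead of granting every split a large timeout up front.

```python
# Hypothetical adaptive segment timeout: 30-minute base, doubled per
# failed attempt, capped so a pathological split can't grow unbounded.

def next_timeout(base_minutes=30, attempt=0, factor=2, cap_minutes=180):
    return min(base_minutes * factor ** attempt, cap_minutes)

print([next_timeout(attempt=a) for a in range(4)])
# [30, 60, 120, 180]
```

Only the splits that actually need the large window ever get one; well-sized splits stay on the short timeout, keeping the steady-state impact low.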

Finally, thanks for picking this up. I'm eager to see Reaper no longer
being needed, and the database managing its own repairs!


On Tue, Oct 22, 2024 at 21:10, Benedict <bened...@apache.org> wrote:

> I realise it’s out of scope, but to counterbalance all of the
> pro-decomposition messages I wanted to chime in with a strong -1. But we
> can debate that in a suitable context later.
>
> On 22 Oct 2024, at 16:36, Jordan West <jw...@apache.org> wrote:
>
> 
> Agreed with the sentiment that decomposition is a good target but out of
> scope here. I’m personally excited to see an in-tree repair scheduler and
> am supportive of the approach shared here.
>
> Jordan
>
> On Tue, Oct 22, 2024 at 08:12 Dinesh Joshi <djo...@apache.org> wrote:
>
>> Decomposing Cassandra may be architecturally desirable but that is not
>> the goal of this CEP. This CEP brings value to operators today so it should
>> be considered on that merit. We definitely need to have a separate
>> conversation on Cassandra's architectural direction.
>>
>> On Tue, Oct 22, 2024 at 7:51 AM Joseph Lynch <joe.e.ly...@gmail.com>
>> wrote:
>>
>>> Definitely like this in C* itself. We only changed our proposal to
>>> putting repair scheduling in the sidecar before because trunk was frozen
>>> for the foreseeable future at that time. With trunk unfrozen and
>>> development on the main process going at a fast pace I think it makes way
>>> more sense to integrate natively as table properties as this CEP proposes.
>>> Completely agree the scheduling overhead should be minimal.
>>>
>>> Moving the actual repair operation (comparing data and streaming
>>> mismatches) along with compaction operations to a separate process long
>>> term makes a lot of sense but imo only once we both have a release of
>>> sidecar and a contract figured out between them on communication. I'm
>>> watching CEP-38 there as I think CQL and virtual tables are looking much
>>> stronger than when we wrote CEP-1 and chose HTTP but that's for that
>>> discussion and not this one.
>>>
>>> -Joey
>>>
>>> On Mon, Oct 21, 2024 at 3:25 PM Francisco Guerrero <fran...@apache.org>
>>> wrote:
>>>
>>>> Like others have said, I was expecting the scheduling portion of
>>>> repair is negligible. I was mostly curious if you had something handy
>>>> that you can quickly share.
>>>>
>>>> On 2024/10/21 18:59:41 Jaydeep Chovatia wrote:
>>>> > >Jaydeep, do you have any metrics on your clusters comparing them
>>>> > before and after introducing repair scheduling into the Cassandra
>>>> > process?
>>>> >
>>>> > Yes, I had made some comparisons when I started rolling this feature
>>>> > out to our production five years ago :)  Here are the details:
>>>> > *The Scheduling*
>>>> > The scheduling itself is exceptionally lightweight, as only one
>>>> > additional thread monitors the repair activity, updating the status
>>>> > to a system table once every few minutes or so. So, it does not
>>>> > appear anywhere in the CPU charts, etc. Unfortunately, I do not have
>>>> > those graphs now, but I can do a quick comparison if it helps!
>>>> >
>>>> > *The Repair Itself*
>>>> > As we all know, the Cassandra repair algorithm is a heavy-weight
>>>> > process due to Merkle tree/streaming, etc., no matter how we
>>>> > schedule it. But it is an orthogonal topic and folks are already
>>>> > discussing creating a new CEP.
>>>> >
>>>> > Jaydeep
>>>> >
>>>> > On Mon, Oct 21, 2024 at 10:02 AM Francisco Guerrero
>>>> > <fran...@apache.org> wrote:
>>>> >
>>>> > > Jaydeep, do you have any metrics on your clusters comparing them
>>>> > > before and after introducing repair scheduling into the Cassandra
>>>> > > process?
>>>> > >
>>>> > > On 2024/10/21 16:57:57 "J. D. Jordan" wrote:
>>>> > > > Sounds good. Just wanted to bring it up. I agree that the
>>>> > > > scheduling bit is pretty light weight and the ideal would be to
>>>> > > > bring the whole of the repair external, which is a much bigger
>>>> > > > can of worms to open.
>>>> > > >
>>>> > > > -Jeremiah
>>>> > > >
>>>> > > > > On Oct 21, 2024, at 11:21 AM, Chris Lohfink
>>>> > > > > <clohfin...@gmail.com> wrote:
>>>> > > > >
>>>> > > > > > I actually think we should be looking at how we can move
>>>> > > > > > things out of the database process.
>>>> > > > >
>>>> > > > > While worth pursuing, I think we would need a different CEP
>>>> > > > > just to figure out how to do that. Not only is there a lot of
>>>> > > > > infrastructure difficulty in running multi process, the inter
>>>> > > > > app communication needs to be figured out better then JMX.
>>>> > > > > Even the sidecar we dont have a solid story on how to ensure
>>>> > > > > both are running or anything yet. It's up to each app owner to
>>>> > > > > figure it out. Once we have a good thing in place I think we
>>>> > > > > can start moving compactions, repairs, etc out of the
>>>> > > > > database. Even then it's the _repairs_ that is expensive, not
>>>> > > > > the scheduling.
>>>> > > > >
>>>> > > > > On Mon, Oct 21, 2024 at 9:45 AM Jeremiah Jordan
>>>> > > > > <jeremiah.jor...@gmail.com> wrote:
>>>> > > > >
>>>> > > > >> I love the idea of a repair service being there by default
>>>> > > > >> for an install of C*.  My main concern here is that it is
>>>> > > > >> putting more services into the main database process.  I
>>>> > > > >> actually think we should be looking at how we can move things
>>>> > > > >> out of the database process.  The C* process being a giant
>>>> > > > >> monolith has always been a pain point.  Is there anyway it
>>>> > > > >> makes sense for this to be an external process rather than a
>>>> > > > >> new thread pool inside the C* process?
>>>> > > > >>
>>>> > > > >> -Jeremiah Jordan
>>>> > > > >>
>>>> > > > >> On Oct 18, 2024 at 2:58:15 PM, Mick Semb Wever
>>>> > > > >> <m...@apache.org> wrote:
>>>> > > > >>
>>>> > > > >>> This is looking strong, thanks Jaydeep.
>>>> > > > >>>
>>>> > > > >>> I would suggest folk take a look at the design doc and the
>>>> > > > >>> PR in the CEP. A lot is there (that I have completely
>>>> > > > >>> missed).
>>>> > > > >>>
>>>> > > > >>> I would especially ask all authors of prior art (Reaper, DSE
>>>> > > > >>> nodesync, ecchronos) to take a final review of the proposal.
>>>> > > > >>>
>>>> > > > >>> Jaydeep, can we ask for a two week window while we reach out
>>>> > > > >>> to these people?  There's a lot of prior art in this space,
>>>> > > > >>> and it feels like we're in a good place now where it's clear
>>>> > > > >>> this has legs and we can use that to bring folk in and make
>>>> > > > >>> sure there's no remaining blindspots.
>>>> > > > >>>
>>>> > > > >>> On Fri, 18 Oct 2024 at 01:40, Jaydeep Chovatia
>>>> > > > >>> <chovatia.jayd...@gmail.com> wrote:
>>>> > > > >>>
>>>> > > > >>>> Sorry, there is a typo in the CEP-37 link; here is the
>>>> > > > >>>> correct link:
>>>> > > > >>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+Apache+Cassandra+Unified+Repair+Solution
>>>> > > > >>>>
>>>> > > > >>>> On Thu, Oct 17, 2024 at 4:36 PM Jaydeep Chovatia
>>>> > > > >>>> <chovatia.jayd...@gmail.com> wrote:
>>>> > > > >>>>
>>>> > > > >>>>> First, thank you for your patience while we strengthened
>>>> > > > >>>>> CEP-37.
>>>> > > > >>>>>
>>>> > > > >>>>> Over the last eight months, Chris Lohfink, Andy Tolbert,
>>>> > > > >>>>> Josh McKenzie, Dinesh Joshi, Kristijonas Zalys, and I have
>>>> > > > >>>>> done tons of work (online discussions/a dedicated Slack
>>>> > > > >>>>> channel #cassandra-repair-scheduling-cep37) to come up
>>>> > > > >>>>> with the best possible design that not only significantly
>>>> > > > >>>>> simplifies repair operations but also includes the most
>>>> > > > >>>>> common features that everyone will benefit from running at
>>>> > > > >>>>> Scale.
>>>> > > > >>>>>
>>>> > > > >>>>> For example,
>>>> > > > >>>>>   * Apache Cassandra must be capable of running multiple
>>>> > > > >>>>>     repair types, such as Full, Incremental, Paxos, and
>>>> > > > >>>>>     Preview - so the framework should be easily extendable
>>>> > > > >>>>>     with no additional overhead from the operator’s point
>>>> > > > >>>>>     of view.
>>>> > > > >>>>>   * An easy way to extend the token-split calculation
>>>> > > > >>>>>     algorithm with a default implementation should exist.
>>>> > > > >>>>>   * Running incremental repair reliably at Scale is pretty
>>>> > > > >>>>>     challenging, so we need to place safeguards, such as
>>>> > > > >>>>>     migration/rollback w/o restart and stopping
>>>> > > > >>>>>     incremental repair automatically if the disk is about
>>>> > > > >>>>>     to get full.
>>>> > > > >>>>>
>>>> > > > >>>>> We are glad to inform you that CEP-37 (i.e., Repair inside
>>>> > > > >>>>> Cassandra) is now officially ready for review after
>>>> > > > >>>>> multiple rounds of design, testing, code reviews,
>>>> > > > >>>>> documentation reviews, and, more importantly, validation
>>>> > > > >>>>> that it runs at Scale!
>>>> > > > >>>>>
>>>> > > > >>>>> Some facts about CEP-37:
>>>> > > > >>>>>   * Multiple members have verified all aspects of CEP-37
>>>> > > > >>>>>     numerous times.
>>>> > > > >>>>>   * The design proposed in CEP-37 has been thoroughly
>>>> > > > >>>>>     tried and tested on an immense scale (hundreds of
>>>> > > > >>>>>     unique Cassandra clusters, tens of thousands of
>>>> > > > >>>>>     Cassandra nodes, with tens of millions of QPS) on top
>>>> > > > >>>>>     of 4.1 open-source for more than five years; please
>>>> > > > >>>>>     see more details here:
>>>> > > > >>>>>     https://www.uber.com/en-US/blog/how-uber-optimized-cassandra-operations-at-scale/
>>>> > > > >>>>>   * The following presentation highlights the rigor
>>>> > > > >>>>>     applied to CEP-37, and was given during last week’s
>>>> > > > >>>>>     Apache Cassandra Bay Area Meetup
>>>> > > > >>>>>     (https://www.meetup.com/apache-cassandra-bay-area/events/303469006/):
>>>> > > > >>>>>     https://docs.google.com/presentation/d/1Zilww9c7LihHULk_ckErI2s4XbObxjWknKqRtbvHyZc/edit#slide=id.g30a4fd4fcf7_0_13
>>>> > > > >>>>>
>>>> > > > >>>>> Since things are massively overhauled, we believe it is
>>>> > > > >>>>> almost ready for a final pass pre-VOTE. We would like you
>>>> > > > >>>>> to please review CEP-37
>>>> > > > >>>>> (https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+Apache+Cassandra+Unified+Repair+Solution)
>>>> > > > >>>>> and the associated detailed design doc
>>>> > > > >>>>> (https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit#heading=h.r112r46toau0).
>>>> > > > >>>>>
>>>> > > > >>>>> Thank you everyone!
>>>> > > > >>>>>
>>>> > > > >>>>> Chris, Andy, Josh, Dinesh, Kristijonas, and Jaydeep
>>>> > > > >>>>>
>>>> > > > >>>>> On Thu, Sep 19, 2024 at 11:26 AM Josh McKenzie
>>>> > > > >>>>> <jmcken...@apache.org> wrote:
>>>> > > > >>>>>
>>>> > > > >>>>>> Not quite; finishing touches on the CEP and design doc
>>>> > > > >>>>>> are in flight (as of last / this week).
>>>> > > > >>>>>>
>>>> > > > >>>>>> Soon(tm).
>>>> > > > >>>>>>
>>>> > > > >>>>>> On Thu, Sep 19, 2024, at 2:07 PM, Patrick McFadin wrote:
>>>> > > > >>>>>>
>>>> > > > >>>>>>> Is this CEP ready for a VOTE thread?
>>>> > > > >>>>>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+%28DRAFT%29+Apache+Cassandra+Unified+Repair+Solution
>>>> > > > >>>>>>>
>>>> > > > >>>>>>> On Sun, Feb 25, 2024 at 12:25 PM Jaydeep Chovatia
>>>> > > > >>>>>>> <chovatia.jayd...@gmail.com> wrote:
>>>> > > > >>>>>>>
>>>> > > > >>>>>>>> Thanks, Josh. I've just updated the CEP
>>>> > > > >>>>>>>> (https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-37+%28DRAFT%29+Apache+Cassandra+Official+Repair+Solution)
>>>> > > > >>>>>>>> and included all the solutions you mentioned below.
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>> Jaydeep
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>> On Thu, Feb 22, 2024 at 9:33 AM Josh McKenzie
>>>> > > > >>>>>>>> <jmcken...@apache.org> wrote:
>>>> > > > >>>>>>>>
>>>> > > > >>>>>>>>> Very late response from me here (basically necro'ing
>>>> > > > >>>>>>>>> this thread).
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> I think it'd be useful to get this condensed into a
>>>> > > > >>>>>>>>> CEP that we can then discuss in that format. It's
>>>> > > > >>>>>>>>> clearly something we all agree we need and having an
>>>> > > > >>>>>>>>> implementation that works, even if it's not in your
>>>> > > > >>>>>>>>> preferred execution domain, is vastly better than
>>>> > > > >>>>>>>>> nothing IMO.
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> I don't have cycles (nor background ;) ) to do that,
>>>> > > > >>>>>>>>> but it sounds like you do Jaydeep given the
>>>> > > > >>>>>>>>> implementation you have on a private fork + design.
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> A non-exhaustive list of things that might be useful
>>>> > > > >>>>>>>>> incorporating into or referencing from a CEP:
>>>> > > > >>>>>>>>> Slack thread:
>>>> > > > >>>>>>>>> https://the-asf.slack.com/archives/CK23JSY2K/p1690225062383619
>>>> > > > >>>>>>>>> Joey's old C* ticket:
>>>> > > > >>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-14346
>>>> > > > >>>>>>>>> Even older automatic repair scheduling:
>>>> > > > >>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-10070
>>>> > > > >>>>>>>>> Your design gdoc:
>>>> > > > >>>>>>>>> https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit#heading=h.r112r46toau0
>>>> > > > >>>>>>>>> PR with automated repair:
>>>> > > > >>>>>>>>> https://github.com/jaydeepkumar1984/cassandra/commit/ef6456d652c0d07cf29d88dfea03b73704814c2c
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> My intuition is that we're all basically in agreement
>>>> > > > >>>>>>>>> that this is something the DB needs, we're all willing
>>>> > > > >>>>>>>>> to bikeshed for our personal preference on where it
>>>> > > > >>>>>>>>> lives and how it's implemented, and at the end of the
>>>> > > > >>>>>>>>> day, code talks. I don't think anyone's said they'll
>>>> > > > >>>>>>>>> die on the hill of implementation details, so that
>>>> > > > >>>>>>>>> feels like CEP time to me.
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> If you were willing and able to get a CEP together for
>>>> > > > >>>>>>>>> automated repair based on the above material, given
>>>> > > > >>>>>>>>> you've done the work and have the proof points it's
>>>> > > > >>>>>>>>> working at scale, I think this would be a _huge
>>>> > > > >>>>>>>>> contribution_ to the community.
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>> On Thu, Aug 24, 2023, at 7:26 PM, Jaydeep Chovatia
>>>> > > > >>>>>>>>> wrote:
>>>> > > > >>>>>>>>>
>>>> > > > >>>>>>>>>> Is anyone going to file an official CEP for this?
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> As mentioned in this email thread, here is one of the
>>>> > > > >>>>>>>>>> solution's design doc
>>>> > > > >>>>>>>>>> (https://docs.google.com/document/d/1CJWxjEi-mBABPMZ3VWJ9w5KavWfJETAGxfUpsViPcPo/edit#heading=h.r112r46toau0)
>>>> > > > >>>>>>>>>> and source code on a private Apache Cassandra patch.
>>>> > > > >>>>>>>>>> Could you go through it and let me know what you
>>>> > > > >>>>>>>>>> think?
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> Jaydeep
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>> On Wed, Aug 2, 2023 at 3:54 PM Jon Haddad
>>>> > > > >>>>>>>>>> <rustyrazorbl...@apache.org> wrote:
>>>> > > > >>>>>>>>>>
>>>> > > > >>>>>>>>>>> > That said I would happily support an effort to
>>>> > > > >>>>>>>>>>> > bring repair scheduling to the sidecar
>>>> > > > >>>>>>>>>>> > immediately. This has nothing blocking it, and
>>>> > > > >>>>>>>>>>> > would potentially enable the sidecar to provide an
>>>> > > > >>>>>>>>>>> > official repair scheduling solution that is
>>>> > > > >>>>>>>>>>> > compatible with current or even previous versions
>>>> > > > >>>>>>>>>>> > of the database.
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> This is something I hadn't thought much about, and
>>>> > > > >>>>>>>>>>> is a pretty good argument for using the sidecar
>>>> > > > >>>>>>>>>>> initially.  There's a lot of deployments out there
>>>> > > > >>>>>>>>>>> and having an official repair option would be a big
>>>> > > > >>>>>>>>>>> win.
>>>> > > > >>>>>>>>>>>
>>>> > > > >>>>>>>>>>> On 2023/07/26 23:20:07 "C. Scott Andreas" wrote:
>>>> > > > >>>>>>>>>>> > I agree that it would be ideal for Cassandra to
>>>> > > > >>>>>>>>>>> > have a repair scheduler in-DB.
>>>> > > > >>>>>>>>>>> >
>>>> > > > >>>>>>>>>>> > That said I would happily support an effort to
>>>> > > > >>>>>>>>>>> > bring repair scheduling to the sidecar
>>>> > > > >>>>>>>>>>> > immediately. This has nothing blocking it, and
>>>> > > > >>>>>>>>>>> > would potentially enable the sidecar to provide an
>>>> > > > >>>>>>>>>>> > official repair scheduling solution that is
>>>> > > > >>>>>>>>>>> > compatible with current or even previous versions
>>>> > > > >>>>>>>>>>> > of the database.
>>>> > > > >>>>>>>>>>> >
>>>> > > > >>>>>>>>>>> > Once TCM has landed, we’ll have much stronger
>>>> > > > >>>>>>>>>>> > primitives for repair orchestration in the
>>>> > > > >>>>>>>>>>> > database itself. But I don’t think that should
>>>> > > > >>>>>>>>>>> > block progress on a repair scheduling solution in
>>>> > > > >>>>>>>>>>> > the sidecar, and there is nothing that would
>>>> > > > >>>>>>>>>>> > prevent someone from continuing to use a
>>>> > > > >>>>>>>>>>> > sidecar-based solution in perpetuity if they
>>>> > > > >>>>>>>>>>> > preferred.
>>>> > > > >>>>>>>>>>> >
>>>> > > > >>>>>>>>>>> > - Scott
>>>> > > > >>>>>>>>>>> >
>>>> > > > >>>>>>>>>>> > > On Jul 26, 2023, at 3:25 PM, Jon Haddad
>>>> > > > >>>>>>>>>>> > > <rustyrazorbl...@apache.org> wrote:
>>>> > > > >>>>>>>>>>> > >
>>>> > > > >>>>>>>>>>> > > I'm 100% in favor of repair being part of the
>>>> > > > >>>>>>>>>>> > > core DB, not the sidecar.  The current (and
>>>> > > > >>>>>>>>>>> > > past) state of things where running the DB
>>>> > > > >>>>>>>>>>> > > correctly *requires* running a separate process
>>>> > > > >>>>>>>>>>> > > (either community maintained or official C*
>>>> > > > >>>>>>>>>>> > > sidecar) is incredibly painful for folks.  The
>>>> > > > >>>>>>>>>>> > > idea that your data integrity needs to be opt-in
>>>> > > > >>>>>>>>>>> > > has never made sense to me from the perspective
>>>> > > > >>>>>>>>>>> > > of either the product or the end user.
>>>> > > > >>>>>>>>>>> > >
>>>> > > > >>>>>>>>>>> > > I've worked with way too many teams that have
>>>> > > > >>>>>>>>>>> > > either configured this incorrectly or not at
>>>> > > > >>>>>>>>>>> > > all.
>>>> > > > >>>>>>>>>>> > >
>>>> > > > >>>>>>>>>>> > > Ideally Cassandra would ship with repair built
>>>> > > > >>>>>>>>>>> > > in and on by default.  Power users can disable
>>>> > > > >>>>>>>>>>> > > if they want to continue to maintain their own
>>>> > > > >>>>>>>>>>> > > repair tooling for some reason.
>>>> > > > >>>>>>>>>>> > >
>>>> > > > >>>>>>>>>>> > > Jon
>>>> > > > >>>>>>>>>>> > >
>>>> > > > >>>>>>>>>>> > >> On 2023/07/24 20:44:14 German Eichberger via
>>>> > > > >>>>>>>>>>> > >> dev wrote:
>>>> > > > >>>>>>>>>>> > >> All,
>>>> > > > >>>>>>>>>>> > >> We had a brief discussion in [2] about the
>>>> > > > >>>>>>>>>>> > >> Uber article [1] where they talk about having
>>>> > > > >>>>>>>>>>> > >> integrated repair into Cassandra and how great
>>>> > > > >>>>>>>>>>> > >> that is. I expressed my disappointment that
>>>> > > > >>>>>>>>>>> > >> they didn't work with the community on that
>>>> > > > >>>>>>>>>>> > >> (Uber, if you are listening time to make
>>>> > > > >>>>>>>>>>> > >> amends 🙂) and it turns out Joey already had
>>>> > > > >>>>>>>>>>> > >> the idea and wrote the code [3] - so I wanted
>>>> > > > >>>>>>>>>>> > >> to start a discussion to gauge interest and
>>>> > > > >>>>>>>>>>> > >> maybe how to revive that effort.
>>>> > > > >>>>>>>>>>> > >> Thanks,
>>>> > > > >>>>>>>>>>> > >> German
>>>> > > > >>>>>>>>>>> > >> [1] https://www.uber.com/blog/how-uber-optimized-cassandra-operations-at-scale/
>>>> > > > >>>>>>>>>>> > >> [2] https://the-asf.slack.com/archives/CK23JSY2K/p1690225062383619
>>>> > > > >>>>>>>>>>> > >> [3] https://issues.apache.org/jira/browse/CASSANDRA-14346