Re: Resources for fire drills

2017-03-01 Thread Oskar Kjellin
Throttle your compaction so low that it practically stops, then try to save the nodes, to simulate not keeping up with compaction. > On 1 Mar 2017, at 14:35, Stefan Podkowinski wrote: > > I've just created a page for this topic that we can use to collect some > content: >
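
The throttling drill suggested in this message could be scripted roughly as follows. This is only a sketch: it assumes `nodetool` is on the PATH of a test node, and the helper names `compaction_throttle_command` and `throttle_compaction` are made up for illustration. Note that `nodetool setcompactionthroughput 0` removes the cap entirely, so the lowest useful value for this drill is 1 (MB/s).

```python
import subprocess


def compaction_throttle_command(mb_per_sec=1):
    """Build the nodetool command that caps compaction throughput (MB/s).

    Hypothetical helper for illustration. A value of 0 is rejected because
    nodetool treats 0 as "disable throttling", the opposite of the drill.
    """
    if mb_per_sec < 1:
        raise ValueError("use 1 MB/s or more; 0 disables throttling in nodetool")
    return ["nodetool", "setcompactionthroughput", str(mb_per_sec)]


def throttle_compaction(mb_per_sec=1):
    """Run the command on the local node (assumes nodetool is installed)."""
    subprocess.run(compaction_throttle_command(mb_per_sec), check=True)
```

Calling `throttle_compaction(1)` on a test node under write load should let pending compactions pile up; raising the value again ends the drill.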

Re: Resources for fire drills

2017-03-01 Thread Stefan Podkowinski
I've just created a page for this topic that we can use to collect some content: https://github.com/spodkowinski/cassandra-collab/blob/docs_firedrill/doc/source/operating/failure_scenarios.rst I've invited both of you, Malte and Benjamin, as collaborators on GitHub, so you can either push changes or

Re: Resources for fire drills

2017-03-01 Thread benjamin roth
@Doc: http://cassandra.apache.org/doc/latest/ is built from the git repo, so you can add documentation in doc/source and submit a patch. I personally don't think that is the best place or way to build a knowledge DB, but that's what we have. 2017-03-01 13:39 GMT+01:00 Malte Pickhan : > Hi, > >

Re: Resources for fire drills

2017-03-01 Thread Malte Pickhan
Hi, really cool that this discussion gets attention. You are right, my question was quite open. For me it would already be helpful to compile a list, like the one Ben started, of scenarios that can happen to a cluster and the actions/strategies you have to take to resolve the incident without losing

Re: Resources for fire drills

2017-03-01 Thread Stefan Podkowinski
I've been thinking about this for a while, but haven't found a practical solution yet, although the term "fire drill" leaves a lot of room for interpretation. The most basic requirements I'd have for this kind of training would start with automated cluster provisioning for each scenario (either f

Re: Resources for fire drills

2017-03-01 Thread benjamin roth
But if you want to do fire-drills you only have to break things on purpose. Examples:
- Cut off a commitlog file at a random position and restart CS
- Overwrite some bytes in an SSTable and read all data from it
- Delete some files in /var/lib/cassandra and try to restore them from backups or dif
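
The first two examples above (cutting off a file at a random position, overwriting some bytes) could be sketched like this. It is an illustration under assumptions, not an established tool: the function names are hypothetical, the code knows nothing about Cassandra's file formats, and it should only ever be pointed at files on a stopped test node that you are prepared to lose.

```python
import os
import random


def truncate_at_random_offset(path, rng=random):
    """Cut the file at a random byte offset and return the new length.

    At least one byte is kept and at least one byte is dropped.
    """
    size = os.path.getsize(path)
    if size < 2:
        raise ValueError("file too small to truncate")
    new_size = rng.randrange(1, size)
    os.truncate(path, new_size)
    return new_size


def overwrite_random_bytes(path, count=16, rng=random):
    """Overwrite `count` consecutive bytes at a random offset with random data.

    The file length is unchanged. Returns the offset that was overwritten.
    """
    size = os.path.getsize(path)
    if count < 1 or size < count:
        raise ValueError("count must be between 1 and the file size")
    offset = rng.randrange(0, size - count + 1)
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(bytes(rng.randrange(256) for _ in range(count)))
    return offset
```

Logging the returned offsets makes it easier to compare what was damaged with what the node reports after the restart or read.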

Re: Resources for fire drills

2017-03-01 Thread benjamin roth
As far as I know there is no such resource, at least not officially. IMHO things like this can be improved a lot within the CS community. I just proposed on the dev-list to move the official docs out of the repo into an easier-to-maintain place like a wiki or something. This could help the community to

Re: Resources for fire drills

2017-03-01 Thread Malte Pickhan
Yeah, that's the point. What I mean is an overview of basic scenarios for fire drills, so that you can exercise them with your team. Best > On 1 Mar 2017, at 11:01, benjamin roth wrote: > > Could you specify it a little bit? There are really a lot of things that can > go wrong. > > 2017-0

Re: Resources for fire drills

2017-03-01 Thread benjamin roth
Could you specify it a little bit? There are really a lot of things that can go wrong. 2017-03-01 10:59 GMT+01:00 Malte Pickhan : > Hi Cassandra users, > > I am looking for some resources/guides for firedrill scenarios with apache > cassandra. > > Do you know anything like that? > > Best, > > Mal

Resources for fire drills

2017-03-01 Thread Malte Pickhan
Hi Cassandra users, I am looking for some resources/guides for fire drill scenarios with Apache Cassandra. Do you know anything like that? Best, Malte