Re: Repair Failing due to bad network

2012-10-12 Thread Rob Coli
https://issues.apache.org/jira/browse/CASSANDRA-3483 is directly on point for the use case in question, and introduces the "rebuild" concept. https://issues.apache.org/jira/browse/CASSANDRA-3487 and https://issues.apache.org/jira/browse/CASSANDRA-3112 are for improvements in repair sessions. https://

Re: Repair Failing due to bad network

2012-10-12 Thread David Koblas
Jim, Great idea - though it doesn't look like it's in 1.1.3 (which is what I'm running). My lame idea of the morning is that I'm going to just read the whole keyspace with QUORUM reads to force read repairs - the unfortunate truth is that this is about 2B reads... --david On 10/11/12 4:51
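To put the "about 2B reads" figure in perspective, here is a rough back-of-the-envelope sketch. The function name and the 5,000 reads/second rate are assumptions for illustration only, not numbers from this thread; the sustained QUORUM read rate of a real cluster would have to be measured.

```python
def full_scan_hours(total_reads, reads_per_second):
    """Estimate wall-clock hours needed to issue total_reads at a steady rate.

    reads_per_second is an assumed sustained cluster-wide QUORUM read rate.
    """
    if reads_per_second <= 0:
        raise ValueError("reads_per_second must be positive")
    return total_reads / reads_per_second / 3600.0


# 2 billion reads at an assumed 5,000 reads/second:
hours = full_scan_hours(2_000_000_000, 5000)
print(f"{hours:.1f} hours")  # about 111 hours, i.e. more than four days
```

Even at that assumed rate the scan runs for days, which is presumably why it was described as a lame idea.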

Re: Repair Failing due to bad network

2012-10-11 Thread Jim Cistaro
I am not aware of any built-in mechanism for retrying repairs. I believe you will have to build that into your process. As for reducing the time of each repair command to fit in your windows: If you have multiple reasonably sized column families, and are not already doing this, one approach might
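Since retries have to be built into your own process, a minimal sketch of such a wrapper might look like the following. The function name, the attempt count, and the backoff delays are all hypothetical choices. It also assumes the repair command exits non-zero when it fails, which is worth verifying on your version; a repair that hangs rather than fails would additionally need a timeout.

```python
import subprocess
import time


def run_with_retries(command, max_attempts=5, base_delay=60.0,
                     runner=subprocess.call, sleep=time.sleep):
    """Run command until it exits 0 or max_attempts is reached.

    Returns the number of attempts used. Waits base_delay seconds after the
    first failure and doubles the wait after each further failure.
    runner and sleep are injectable so the logic can be tested offline.
    """
    for attempt in range(1, max_attempts + 1):
        if runner(command) == 0:
            return attempt
        if attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(
        "command failed after %d attempts: %r" % (max_attempts, command))


# Hypothetical usage, one column family at a time (keyspace and column
# family names are placeholders):
# run_with_retries(["nodetool", "-h", "localhost", "repair",
#                   "my_keyspace", "my_column_family"])
```

Passing the command as a list avoids shell quoting issues, and running one column family per call keeps each attempt short, so a network reset costs less repeated work.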

Repair Failing due to bad network

2012-10-11 Thread David Koblas
I'm trying to bring up a new Datacenter - while I probably could have brought things up in another way I've now got a DC that has a ready Cassandra with keys allocated. The problem is that I cannot get a repair to complete, since it appears that some part of my network decides to restart al