Hi Tyler,
I think the scenario needs some correction. With a 20-node cluster, RF=5, read/write quorum, and gc grace period=20: if a node goes down, repair -pr would fail on the 4 nodes maintaining replicas of its primary range, and a full repair would fail on an even greater number of nodes, but not 19. Please confirm.
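
For illustration, here is a rough sketch of that counting in Python, assuming SimpleStrategy-style placement on a single ring with no vnodes (the node count, RF and indices below are just the numbers from this scenario, not anything Cassandra-specific):

# Assumed setup: 20 nodes on one ring, RF=5, no vnodes, SimpleStrategy-style
# placement (each range lives on its primary owner plus the next RF-1 nodes
# clockwise around the ring).
N, RF = 20, 5
down = 0  # index of the failed node

def replicas(primary):
    # Nodes holding the range whose primary owner is `primary`.
    return {(primary + k) % N for k in range(RF)}

def replicated_ranges(node):
    # Primary owners of every range that `node` stores a replica of.
    return {(node - k) % N for k in range(RF)}

# "repair -pr" on node x only repairs x's primary range, so it fails if any
# replica of that range is down.
pr_blocked = [x for x in range(N) if x != down and down in replicas(x)]

# A full repair on node x covers every range x stores, so it fails if any
# replica of any of those ranges is down.
full_blocked = [x for x in range(N) if x != down
                and any(down in replicas(r) for r in replicated_ranges(x))]

print(len(pr_blocked), len(full_blocked))  # -> 4 and 8 with these numbers

So under these assumptions a single down node blocks repair -pr on 4 other nodes and full repair on 8, which is more than 4 but well short of 19.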
Either way, system health would be impacted, as multiple nodes cannot repair after a single node failure.
Thanks,
Anuj

Sent from Yahoo Mail on Android
 
  On Tue, 19 Jan, 2016 at 10:48 pm, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:
There is a JIRA issue for this: https://issues.apache.org/jira/browse/CASSANDRA-10446.
However, it is open with Minor priority and typed as an Improvement. I think it is a very
valid concern for everyone, especially for users with bigger clusters, and is more an issue
with a design decision than an improvement. Can we change its priority so that it gets
appropriate attention?

Thanks,
Anuj
 
  On Tue, 19 Jan, 2016 at 10:35 pm, Tyler Hobbs <ty...@datastax.com> wrote:
On Tue, Jan 19, 2016 at 10:44 AM, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:


Consider a scenario where I have a 20-node cluster, RF=5, read/write quorum, and gc grace
period=20. My cluster is fault tolerant and can afford a 2-node failure. Suddenly, one node
goes down due to a hardware issue. It has now been 10 days since the node went down, none of
the 19 remaining nodes are being repaired, and it is decision time. I am not sure how soon
the issue will be fixed, maybe 8 days before gc grace expires, so I shouldn't remove the node
early and add it back, as that would cause unnecessary streaming. At the same time, if I
don't remove the failed node, the health of my entire system is in question and it becomes a
panic situation, as no data has been repaired in the last 10 days and gc grace is
approaching. I need sufficient time to repair 19 nodes.
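
As a rough back-of-the-envelope for that timeline (the per-node repair duration below is purely an assumption; real repair times vary widely with data volume and load):

# Numbers from the scenario above, plus a hypothetical repair duration.
gc_grace_days = 20
days_down = 10
nodes_to_repair = 19
repair_days_per_node = 0.5  # assumption only; depends entirely on data size

days_left = gc_grace_days - days_down
days_needed = nodes_to_repair * repair_days_per_node  # if run one node at a time
print(days_left, days_needed)  # -> 10 days remaining vs. 9.5 days of repair work

Even with an optimistic per-node repair time, the remaining window before gc grace leaves almost no slack.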
What looked like a fault-tolerant system that could afford a 2-node failure required urgent
attention and manual decision making when a single node went down. Why can't we just go ahead
and repair the remaining replicas if some replicas are down? If the failed node comes back up
before gc grace expires, we would run repair to fix the inconsistencies; otherwise we would
discard its data and bootstrap it. I think that would be a really robust, fault-tolerant
system.

That makes sense.  It seems like having the option to ignore down replicas 
during repair could be at least somewhat helpful, although it may be tricky to 
decide how this should interact with incremental repairs.  If there isn't a 
jira ticket for this already, can you open one with the scenario above?


-- 
Tyler Hobbs
DataStax
    
