One other slight advantage of -pr: we sometimes have repairs that hang and need to be killed and restarted. With -pr, you only have to redo a fraction of the work.
jc

-----Original Message-----
From: <Hiller>, Dean <dean.hil...@nrel.gov>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Friday, March 1, 2013 5:46 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: -pr vs. no -pr

>Sweet, I 100% understand this now from these last few emails. It has
>always been a bit confusing.
>
>Thanks,
>Dean
>
>From: Sylvain Lebresne <sylv...@datastax.com>
>Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>Date: Friday, March 1, 2013 4:36 AM
>To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>Subject: Re: -pr vs. no -pr
>
>On Thu, Feb 28, 2013 at 11:39 PM, Hiller, Dean <dean.hil...@nrel.gov> wrote:
>Isn't it true that if I have 6 nodes, I could run nodetool repair on just 2
>nodes (RF=3) instead of using nodetool repair -pr?
>
>Yes, it is true.
>
>To be more precise, in your case you have 2 options:
>  1) doing a repair *without* -pr on 2 nodes (assuming you pick the correct
>2 nodes; it's *not* any 2 nodes)
>  2) doing a repair *with* -pr on all 6 nodes
>
>Both options would 1) repair the full ring and 2) do the same
>amount of work.
>
>What is the advantage of -pr then?
>
>As it happens, your case is a special case: your number of nodes
>is a multiple of your replication factor. If that weren't the
>case (say 5, 7 or 8 nodes with RF=3), then there is *no way* you can
>repair the whole cluster *without* -pr without doing *more* work than
>a repair *with* -pr on all nodes.
>
>So the advantages of -pr (which, btw, should be used to repair the whole
>cluster, not when you want to rebuild a specific node) are:
>  1) it always does the minimum amount of work, while repair without -pr is
>wasteful if the number of nodes is not a multiple of the replication
>factor (no matter how smart you are at scheduling the repairs).
>  2) even if your number of nodes is a multiple of the replication factor,
>you still have to make sure you pick the right N/RF nodes to repair if
>you don't use -pr. If you don't pick the correct ones, you will not
>repair the full ring. With -pr it is much harder to shoot yourself in the
>foot: you have to run it on every node, period.
>
>--
>Sylvain
>
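
To make the counting argument above concrete, here is a small sketch in Python. It is a toy model, not Cassandra code: it assumes one token per node (no vnodes) and simple ring placement, where range i is the primary range of node i and is also replicated on the next RF-1 nodes. All function names are made up for illustration, and "work" is simply the number of range repairs performed.

```python
from math import ceil


def ranges_repaired(node, n, rf, primary_only):
    """Ranges touched by a repair started on `node` (toy model).

    Assumption: range i is primary on node i and replicated on the next
    rf - 1 nodes, so a node holds its own range plus the rf - 1 before it.
    """
    if primary_only:  # models "repair -pr"
        return {node}
    return {(node - k) % n for k in range(rf)}  # models "repair" without -pr


def covered(nodes, n, rf, primary_only=False):
    """Union of ranges repaired when running repair on each node in `nodes`."""
    result = set()
    for node in nodes:
        result |= ranges_repaired(node, n, rf, primary_only)
    return result


def work_with_pr(n, rf):
    """Range repairs performed when running -pr on every node."""
    return sum(len(ranges_repaired(i, n, rf, True)) for i in range(n))


def work_without_pr(n, rf):
    """Pick ceil(n / rf) nodes that cover the ring; return (nodes, work).

    Each node covers rf ranges, so fewer nodes cannot cover n ranges.
    """
    nodes = [min(k * rf + rf - 1, n - 1) for k in range(ceil(n / rf))]
    assert covered(nodes, n, rf) == set(range(n)), "ring not fully covered"
    return nodes, len(nodes) * rf


for n in (5, 6, 7, 8):
    nodes, work = work_without_pr(n, 3)
    print(f"N={n} RF=3: with -pr = {work_with_pr(n, 3)} range repairs, "
          f"without -pr on nodes {nodes} = {work} range repairs")

# Picking the wrong 2 nodes on a 6-node ring leaves ranges unrepaired.
print("nodes [0, 1] cover:", sorted(covered([0, 1], 6, 3)))
```

In this model the two approaches tie only when N is a multiple of RF, and choosing adjacent nodes for the no -pr case leaves part of the ring unrepaired, which matches both points in the quoted explanation.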