The way I've always thought about it is that -pr makes sure the information a specific node originates is consistent with its replicas.
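To make that concrete, here is a rough Python sketch of a toy ring. It is only an illustration, not how Cassandra implements repair: the function names are made up, and it assumes one token range per node and simple placement where a range lives on its owner plus the next RF-1 nodes in the ring (no vnodes, no multi-DC strategy). The six-node ring A to F with RF=3 is the same example used in the quoted message further down.

```python
# Toy model only: one primary range per node, replicas on the next RF-1 nodes.
# All names here are hypothetical; none of this is Cassandra or nodetool API.

def replicas_of(owner_index, nodes, rf):
    """Nodes holding the range whose primary owner is nodes[owner_index]."""
    return [nodes[(owner_index + k) % len(nodes)] for k in range(rf)]


def ranges_repaired(node, nodes, rf, primary_only):
    """Replica sets touched by a repair started on `node`.

    primary_only=True models `nodetool repair -pr`;
    primary_only=False models a repair without -pr.
    """
    i = nodes.index(node)
    if primary_only:
        return [replicas_of(i, nodes, rf)]
    # Without -pr: the node's own range plus the ranges owned by the
    # previous RF-1 nodes, since this node is a replica for those too.
    return [replicas_of((i - k) % len(nodes), nodes, rf) for k in range(rf)]


ring = ["A", "B", "C", "D", "E", "F"]

print(ranges_repaired("A", ring, 3, primary_only=True))
# [['A', 'B', 'C']]
print(ranges_repaired("A", ring, 3, primary_only=False))
# [['A', 'B', 'C'], ['F', 'A', 'B'], ['E', 'F', 'A']]
```

With -pr the repair on A touches one range; without it, three. That is where the extra work comes from.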
So, we know that a node is responsible for a specific token range, and the next nodes in the ring hold its replicas. The -pr option makes sure that a specific node's information is consistent with its replicas, but it does not make sure that node has all the replicated information it can get from the nodes before it in the ring. Without -pr, the current node not only makes sure its own information is consistent with its replicas, it also makes sure that all the information it holds as a replica for other nodes is consistent.

If you run regular repairs on all the nodes in your cluster, then -pr is sufficient. Every node runs repair and makes sure its information is consistent with its replicas, eventually producing a fully consistent cluster. This is a quicker process, and it has less impact on your operations by essentially spreading out the pain.

For instance, we run a 12 node cluster. We run "nodetool repair -pr" on nodes that are opposite each other in the ring, 4 nodes a day (2 nodes in the morning, 2 nodes in the evening). With a grace period of 10 days, this allows us to run repairs twice a week on a specific node, and to occasionally skip a repair on a specific node once a week. In this case, without -pr, a lot of extra work would be done. In fact, with an RF of 3 (in our case), the time per repair would increase severalfold, since each node would then repair every range it holds a replica of (three with RF=3), not just its primary range.

Another way to think about it, although likely not 100% technically correct: a repair with -pr causes a node to push its information to its replicas. Without -pr, it causes that push, and it also causes the nodes it is a replica for to push their information.

-Mike

On Feb 28, 2013, at 9:39 PM, Hiller, Dean wrote:

> Isn't there more to it than that? You really have nodes responsible for
> token ranges like so (using describe ring).
>
> What we see is this from our describe ring... (1 to 6 are token ranges while
> A to F are servers)...
> A - 1, 2, 3
> B - 2, 3, 4
> C - 3, 4, 5
> D - 4, 5, 6
> E - 5, 6, 1
> F - 6, 1, 2
>
> With -pr, only token range 1 is repaired I think, right? 2 and 3 are only
> repaired without the -pr option? This means if I have a node that I just
> joined the cluster, I should "not" be using -pr as 2 and 3 on node A will
> not be up to date. Using -pr is nice if I am going to repair every single
> node and is nice for the cron job that has to happen before
> gc_grace_seconds. Am I wrong here? I.e., -pr is really only good for use
> in the cron job as it would miss 2 and 3 above. I could run the cron on
> just two servers but then my nodes are different which can be a hassle.
>
> Please verify that is what you believe is what happens as well?
>
> Thanks,
> Dean
>
> On 2/28/13 5:58 PM, "Takenori Sato(Cloudian)" <ts...@cloudian.com> wrote:
>
>> Hi,
>>
>> Please note that I confirmed on v1.0.7.
>>
>>> I mean a repair involves all three nodes and pushes and pulls data,
>>> right?
>>
>> Yes, but that's how -pr works. A repair without -pr does more.
>>
>> For example, suppose you have a ring with RF=3 like this.
>>
>> A - B - C - D - E - F
>>
>> Then, a repair on A without -pr covers 3 ranges as follows:
>> [A, B, C]
>> [E, F, A]
>> [F, A, B]
>>
>> Among them, the first one, [A, B, C], is the primary range of A.
>>
>> So, with -pr, a repair runs only for:
>> [A, B, C]
>>
>>> I could run nodetool repair on just 2 nodes (RF=3) instead of using
>>> nodetool repair -pr???
>>
>> Yes.
>>
>> You need to run two repairs, on A and D.
>>
>>> What is the advantage of -pr then?
>>
>> Whenever you want to minimize repair impacts.
>>
>> For example, suppose you got one node down for a while, and bring it
>> back to the cluster.
>>
>> You need to run repair without affecting the entire cluster. Then, -pr
>> is the option.
>>
>> Thanks,
>> Takenori
>>
>> (2013/03/01 7:39), Hiller, Dean wrote:
>>> Isn't it true if I have 6 nodes, I could run nodetool repair on just 2
>>> nodes (RF=3) instead of using nodetool repair -pr???
>>>
>>> What is the advantage of -pr then?
>>>
>>> I mean a repair involves all three nodes and pushes and pulls data,
>>> right?
>>>
>>> Thanks,
>>> Dean
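
For what it's worth, the two-node claim in the quoted thread (a repair without -pr on A and D covers a six-node ring at RF=3) can be checked with a small Python sketch. Same caveats as any toy model: it assumes one range per node with replicas on the next RF-1 nodes in the ring, and the function names are hypothetical, not anything from Cassandra or nodetool.

```python
# Toy coverage check: each range is identified by its primary owner.
# Hypothetical helper names; assumes one range per node, replicas on the
# next RF-1 nodes clockwise.

def ranges_covered(node, nodes, rf, primary_only):
    """Primary owners of the ranges that a repair on `node` touches."""
    i = nodes.index(node)
    span = 1 if primary_only else rf
    # A node is a replica for its own range and for the ranges owned by
    # the previous RF-1 nodes in the ring.
    return {nodes[(i - k) % len(nodes)] for k in range(span)}


def covers_ring(repair_nodes, nodes, rf, primary_only):
    """True if repairing `repair_nodes` touches every range in the ring."""
    covered = set()
    for n in repair_nodes:
        covered |= ranges_covered(n, nodes, rf, primary_only)
    return covered == set(nodes)


ring = list("ABCDEF")

print(covers_ring(["A", "D"], ring, 3, primary_only=False))  # True
print(covers_ring(["A", "D"], ring, 3, primary_only=True))   # False
print(covers_ring(ring, ring, 3, primary_only=True))         # True
```

So in this model either approach reaches every range: a repair without -pr on every third node, or -pr on every node. The -pr route just splits the work into smaller pieces per node.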