I'm not sure what the fix is. When using an order preserving partitioner it's up to you to ensure the ring is correctly balanced.
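The underlying issue is that nodetool can only report an ownership percentage when the total size of the token space is known. A minimal sketch of that calculation, assuming a RandomPartitioner style ring of 2^127 md5 derived tokens (the ring size, class and method names below are illustrative, not Cassandra's actual code):

import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

public class RingOwnership {
    // Assumed size of the token space: 2^127 for md5 derived RandomPartitioner tokens.
    static final BigInteger RING_SIZE = BigInteger.ONE.shiftLeft(127);

    // Fraction of the ring owned by the node at `token`, where `previous` is the
    // token of the node immediately before it on the ring (wraps around the ring).
    static double ownership(BigInteger previous, BigInteger token) {
        BigInteger width = token.subtract(previous).mod(RING_SIZE);
        return new BigDecimal(width)
                .divide(new BigDecimal(RING_SIZE), 10, RoundingMode.HALF_UP)
                .doubleValue();
    }

    public static void main(String[] args) {
        // Four evenly spaced tokens -> each node owns ~25%, regardless of which keys it stores.
        int nodes = 4;
        BigInteger[] tokens = new BigInteger[nodes];
        for (int i = 0; i < nodes; i++) {
            tokens[i] = RING_SIZE.multiply(BigInteger.valueOf(i)).divide(BigInteger.valueOf(nodes));
        }
        for (int i = 0; i < nodes; i++) {
            BigInteger previous = tokens[(i + nodes - 1) % nodes];
            System.out.printf("node %d owns %.2f%%%n", i + 1, ownership(previous, tokens[i]) * 100);
        }
    }
}

With an order preserving partitioner there is no fixed ring size to divide by, so ownership can only be estimated from the keys each node actually holds.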
Say you have the following setup:

node : token
1    : a
2    : h
3    : p

If keys are always 1 character we can say each node owns roughly 33% of the ring, because we know there are only 26 possible keys. With the RP we also know how many possible keys there are: the output of the md5 calculation is a 128 bit integer, so we can say what fraction of the total each range covers (as in the sketch above). If, in the example above, keys can be of any length, how many values exist between a and h?

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 22/08/2011, at 3:33 AM, Thibaut Britz wrote:

> Hi,
>
> I will wait until this is fixed before I upgrade, just to be sure.
>
> Shall I open a new ticket for this issue?
>
> Thanks,
> Thibaut
>
> On Sun, Aug 21, 2011 at 11:57 AM, aaron morton <aa...@thelastpickle.com> wrote:
>> This looks like an artifact of the way ownership is calculated for the OPP, see
>> https://github.com/apache/cassandra/blob/cassandra-0.8.4/src/java/org/apache/cassandra/dht/OrderPreservingPartitioner.java#L177
>> It was changed in this ticket:
>> https://issues.apache.org/jira/browse/CASSANDRA-2800
>> The change applied in CASSANDRA-2800 was not applied to the
>> AbstractByteOrderedPartitioner. Looks like it should have been. I'll chase that up.
>>
>> When each node calculates the ownership for the token ranges (for OPP and BOP)
>> it is based on the number of keys the node has in that range, as there is no way
>> for the OPP to understand the range of values the keys may take.
>> If you look at the 192 node it's showing ownership mostly with 192, 191 and 190,
>> so I'm assuming RF 3 and that 192 also has data from the ranges owned by 191 and 190.
>> IMHO you can ignore this.
>> You can use the load and the number-of-keys estimate from cfstats to get an idea
>> of what's happening.
>> Hope that helps.
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 19/08/2011, at 9:42 PM, Thibaut Britz wrote:
>>
>> Hi,
>>
>> we were using apache-cassandra-2011-06-28_08-04-46.jar so far in
>> production and wanted to upgrade to 0.8.4.
>>
>> Our cluster was well balanced and we only saved keys with a lower case
>> md5 prefix (order preserving partitioner).
>> Each node owned 20% of the tokens, which was also displayed on each
>> node in nodetool -h localhost ring.
>>
>> After upgrading, our well-balanced cluster shows completely wrong
>> percentages for who owns which keys:
>>
>> *.*.*.190:
>> Address        DC           Rack    Status  State    Load       Owns     Token
>>                                                                          ffffffffffffffff
>> *.*.*.190      datacenter1  rack1   Up      Normal   87.95 GB   34.57%   2a
>> *.*.*.191      datacenter1  rack1   Up      Normal   84.3 GB    0.02%    55
>> *.*.*.192      datacenter1  rack1   Up      Normal   79.46 GB   0.02%    80
>> *.*.*.194      datacenter1  rack1   Up      Normal   68.16 GB   0.02%    aa
>> *.*.*.196      datacenter1  rack1   Up      Normal   79.9 GB    65.36%   ffffffffffffffff
>>
>> *.*.*.191:
>> Address        DC           Rack    Status  State    Load       Owns     Token
>>                                                                          ffffffffffffffff
>> *.*.*.190      datacenter1  rack1   Up      Normal   87.95 GB   36.46%   2a
>> *.*.*.191      datacenter1  rack1   Up      Normal   84.3 GB    26.02%   55
>> *.*.*.192      datacenter1  rack1   Up      Normal   79.46 GB   0.02%    80
>> *.*.*.194      datacenter1  rack1   Up      Normal   68.16 GB   0.02%    aa
>> *.*.*.196      datacenter1  rack1   Up      Normal   79.9 GB    37.48%   ffffffffffffffff
>>
>> *.*.*.192:
>> Address        DC           Rack    Status  State    Load       Owns     Token
>>                                                                          ffffffffffffffff
>> *.*.*.190      datacenter1  rack1   Up      Normal   87.95 GB   38.16%   2a
>> *.*.*.191      datacenter1  rack1   Up      Normal   84.3 GB    27.61%   55
>> *.*.*.192      datacenter1  rack1   Up      Normal   79.46 GB   34.17%   80
>> *.*.*.194      datacenter1  rack1   Up      Normal   68.16 GB   0.02%    aa
>> *.*.*.196      datacenter1  rack1   Up      Normal   79.9 GB    0.02%    ffffffffffffffff
>>
>> *.*.*.194:
>> Address        DC           Rack    Status  State    Load       Owns     Token
>>                                                                          ffffffffffffffff
>> *.*.*.190      datacenter1  rack1   Up      Normal   87.95 GB   0.03%    2a
>> *.*.*.191      datacenter1  rack1   Up      Normal   84.3 GB    31.43%   55
>> *.*.*.192      datacenter1  rack1   Up      Normal   79.46 GB   39.69%   80
>> *.*.*.194      datacenter1  rack1   Up      Normal   68.16 GB   28.82%   aa
>> *.*.*.196      datacenter1  rack1   Up      Normal   79.9 GB    0.03%    ffffffffffffffff
>>
>> *.*.*.196:
>> Address        DC           Rack    Status  State    Load       Owns     Token
>>                                                                          ffffffffffffffff
>> *.*.*.190      datacenter1  rack1   Up      Normal   87.95 GB   0.02%    2a
>> *.*.*.191      datacenter1  rack1   Up      Normal   84.3 GB    0.02%    55
>> *.*.*.192      datacenter1  rack1   Up      Normal   79.46 GB   0.02%    80
>> *.*.*.194      datacenter1  rack1   Up      Normal   68.16 GB   27.52%   aa
>> *.*.*.196      datacenter1  rack1   Up      Normal   79.9 GB    72.42%   ffffffffffffffff
>>
>> Interestingly, each server shows something completely different.
>>
>> Removing the LocationInfo files didn't help.
>> -Dcassandra.load_ring_state=false didn't help either.
>>
>> Our cassandra.yaml is at http://pastebin.com/pCVCt3RM
>>
>> Any idea what might cause this? Is it safe to suspect that operating under
>> this distribution will cause severe data loss? Or can I safely ignore this?
>>
>> Thanks,
>> Thibaut