You're going to be mad at how simple the answer turns out to be. :) Nodes "own" the range from (previous, token], NOT from [token, next). So, the last node will get from (50, 75] and the first will get from (75, 0].
On Tue, Jul 12, 2011 at 9:38 AM, Eric tamme <eta...@gmail.com> wrote: > I have been reading through some code in TokenMetadata.java, > specifically with ringIterator() and firstTokenIndex(). I am trying > to get a very firm grasp on how nodes are collected for writes. > > I have run into a bit of confusion about what happens when the data's > token is larger than than the largest initial token. It looks to me > like there is no way for the largest token node to get a write. I > know this can't be correct, and that I must be missing some thing. If > any one can help to fill in what I am missing I would appreciate it. > > Lets say I have a hypothetical token range of 0-100 (for simplicity > sake) and 4 nodes using RandomPartitioner and SimpleStrategy. My > nodes are configured with initial tokens: 0,25,50,75. > > If I receive a write for data whose token is 85 (or any number > 75) > this is the code path as far as I can tell: > > SimpleStrategy will create a ringIterator and pass it in a list of all > the initial tokens [0,25,50,75] > > The ringIterator will in turn call firstTokenIndex and pass it the > token list, and token we are trying to insert > > final int startIndex = firstTokenIndex(ring, start, insertMin);. > > firstTokenIndex will then do a Collections.binarySearch to try to find > the index of where the token should be inserted > > int i = Collections.binarySearch(list, token); > > Here is the JavaDoc on Collections.binarySearch: > > "Returns: > index of the search key, if it is contained in the list; > otherwise, (-(insertion point) - 1). The insertion point is defined as > the point at which the key would be inserted into the list: the index > of the first element greater than the key, or list.size(), if all > elements in the list are less than the specified key. Note that this > guarantees that the return value will be >= 0 if and only if the key > is found. " > > So in this example, 85 is larger than any of the available tokens so > (-(list.size())-1) or (-(4)-1) = -5 will be returned. > > From there, firstTokenIndex does some processing on the return value: > > if (i < 0) > { > i = (i + 1) * (-1); // i is now 4 > if (i >= ring.size()) // evaluates true > i = insertMin ? -1 : 0; // 0 is returned > } > return i; > > So we return index 0 to the ringIterator... which will try to execute > a ring.get(0) for the first next() based on our return value. > > here is the calling, and processing code for ringIterator: > > return new AbstractIterator<Token>() > { > int j = startIndex; > protected Token computeNext() > { > if (j < -1) > return endOfData(); > try > { > // return minimum for index == -1 > if (j == -1) > return > StorageService.getPartitioner().getMinimumToken(); > // return ring token for other indexes > return ring.get(j); > } > finally > { > j++; > if (j == ring.size()) > j = insertMin ? -1 : 0; > if (j == startIndex) > // end iteration > j = -2; > } > } > }; > > Thanks, > Eric > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com