You're going to be mad at how simple the answer turns out to be. :)

Nodes "own" the range from (previous, token], NOT from [token, next).
So, the last node will get from (50, 75] and the first will get from
(75, 0].

On Tue, Jul 12, 2011 at 9:38 AM, Eric tamme <eta...@gmail.com> wrote:
> I have been reading through some code in TokenMetadata.java,
> specifically with ringIterator() and firstTokenIndex().  I am trying
> to get a very firm grasp on how nodes are collected for writes.
>
> I have run into a bit of confusion about what happens when the data's
> token is larger than than the largest initial token. It looks to me
> like there is no way for the largest token node to get a write.  I
> know this can't be correct, and that I must be missing some thing.  If
> any one can help to fill in what I am missing I would appreciate it.
>
> Lets say I have a hypothetical token range of 0-100 (for simplicity
> sake) and 4 nodes using RandomPartitioner and SimpleStrategy.  My
> nodes are configured with initial tokens: 0,25,50,75.
>
> If I receive a write for data whose token is 85 (or any number > 75)
> this is the code path as far as I can tell:
>
> SimpleStrategy will create a ringIterator and pass it in a list of all
> the initial tokens [0,25,50,75]
>
> The ringIterator will in turn call firstTokenIndex and pass it the
> token list, and token we are trying to insert
>
> final int startIndex = firstTokenIndex(ring, start, insertMin);.
>
> firstTokenIndex will then do a Collections.binarySearch to try to find
> the index of where the token should be inserted
>
> int i = Collections.binarySearch(list, token);
>
> Here is the JavaDoc on Collections.binarySearch:
>
> "Returns:
>    index of the search key, if it is contained in the list;
> otherwise, (-(insertion point) - 1). The insertion point is defined as
> the point at which the key would be inserted into the list: the index
> of the first element greater than the key, or list.size(), if all
> elements in the list are less than the specified key. Note that this
> guarantees that the return value will be >= 0 if and only if the key
> is found. "
>
> So in this example, 85 is larger than any of the available tokens so
> (-(list.size())-1) or (-(4)-1) = -5 will be returned.
>
> From there, firstTokenIndex does some processing on the return value:
>
> if (i < 0)
> {
>  i = (i + 1) * (-1);        // i is now 4
>  if (i >= ring.size())      // evaluates true
>    i = insertMin ? -1 : 0;  // 0 is returned
> }
> return i;
>
> So we return index 0 to the ringIterator... which will try to execute
> a ring.get(0) for the first next() based on our return value.
>
> here is the calling, and processing code for ringIterator:
>
>        return new AbstractIterator<Token>()
>        {
>            int j = startIndex;
>            protected Token computeNext()
>            {
>                if (j < -1)
>                    return endOfData();
>                try
>                {
>                    // return minimum for index == -1
>                    if (j == -1)
>                        return
> StorageService.getPartitioner().getMinimumToken();
>                    // return ring token for other indexes
>                    return ring.get(j);
>                }
>                finally
>                {
>                    j++;
>                    if (j == ring.size())
>                        j = insertMin ? -1 : 0;
>                    if (j == startIndex)
>                        // end iteration
>                        j = -2;
>                }
>            }
>        };
>
> Thanks,
> Eric
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Reply via email to