I don't mind missing data for a few hours; it's the weird behaviour of
get_range_slices that's bothering me.  I added some logging to
ColumnFamilyRecordReader to see what's going on:

Split startToken=67160993471237854630929198835217410155, endToken=68643623863384825230116928934887817211

...

Getting batch for range: 67965855060996012099315582648654139032 to 68643623863384825230116928934887817211

Token for last row is: 50448492574454416067449808504057295946

Getting batch for range: 50448492574454416067449808504057295946 to 68643623863384825230116928934887817211

...


Notice that the get_range_slices response is invalid: the last row it
returns has a token well below the start of the requested range.  Since the
reader uses the token of the last row as the start of the next batch, this
poisons the batching loop and causes the task to spin out of control.
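
For reference, here is the shape of the loop I'm describing, as a simplified
sketch rather than the actual ColumnFamilyRecordReader code.  fetchBatch(),
tokenOf() and process() are placeholders for the Thrift get_range_slices
call, the partitioner and the consumer, and the range check at the end is
only there to show where an out-of-range row would be caught; it assumes a
non-wrapping split.

import java.math.BigInteger;
import java.util.List;

// Simplified sketch of the token-range batching loop in a Hadoop record
// reader.  The abstract methods are placeholders, not the real
// ColumnFamilyRecordReader / Thrift API.
public abstract class RangeBatchSketch {

    // Placeholder for client.get_range_slices(...) with a token-based
    // KeyRange: returns up to batchSize row keys for (startToken, endToken].
    protected abstract List<String> fetchBatch(BigInteger startToken,
                                               BigInteger endToken,
                                               int batchSize);

    // Placeholder for partitioner.getToken(key) under RandomPartitioner.
    protected abstract BigInteger tokenOf(String rowKey);

    // Placeholder for handing the row to the mapper.
    protected abstract void process(String rowKey);

    public void readSplit(BigInteger splitStart, BigInteger splitEnd,
                          int batchSize) {
        BigInteger start = splitStart;
        while (true) {
            List<String> rows = fetchBatch(start, splitEnd, batchSize);
            if (rows.isEmpty()) {
                break;                          // split exhausted
            }
            for (String key : rows) {
                process(key);
            }

            // The next batch starts at the token of the last row returned.
            BigInteger last = tokenOf(rows.get(rows.size() - 1));

            // If the server hands back a row whose token lies outside
            // (start, splitEnd], the loop is thrown back behind the split
            // and never terminates -- the behaviour shown in the log above.
            if (last.compareTo(start) <= 0 || last.compareTo(splitEnd) > 0) {
                throw new IllegalStateException(
                        "out-of-range token " + last + " for range ("
                        + start + ", " + splitEnd + "]");
            }
            start = last;

            if (rows.size() < batchSize) {
                break;                          // short batch: no more rows
            }
        }
    }
}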

/joost

On Tue, Jun 22, 2010 at 9:09 AM, Jonathan Ellis <jbel...@gmail.com> wrote:

> What I would expect to have happen is for the removed node to
> disappear from the ring and for nodes that are supposed to get more
> data to start streaming it over.  I would expect it to be hours before
> any new data started appearing anywhere when you are anticompacting
> 80+GB prior to the streaming part.
> http://wiki.apache.org/cassandra/Streaming
>
> On Tue, Jun 22, 2010 at 12:57 AM, Joost Ouwerkerk <jo...@openplaces.org>
> wrote:
> > Yes, although "forget" implies that we once knew we were supposed to do so.
> > Given the following before-and-after states, on which nodes are we supposed
> > to run repair?  Should the cluster be restarted?  Is there anything else we
> > should be doing, or not doing?
> >
> > 1. Node is down due to hardware failure
> >
> > 192.168.1.104 Up         111.75 GB    8954799129498380617457226511362321354      |   ^
> > 192.168.1.106 Up         113.25 GB    17909598258996761234914453022724642708     v   |
> > 192.168.1.107 Up         75.65 GB     22386997823745951543643066278405803385     |   ^
> > 192.168.1.108 Down       75.77 GB     26864397388495141852371679534086964062     v   |
> > 192.168.1.109 Up         76.14 GB     35819196517993522469828906045449285416     |   ^
> > 192.168.1.110 Up         75.9 GB      40296596082742712778557519301130446093     v   |
> > 192.168.1.111 Up         95.21 GB     49251395212241093396014745812492767447     |   ^
> >
> > 2. nodetool removetoken 26864397388495141852371679534086964062
> >
> > 192.168.1.104 Up         111.75 GB    8954799129498380617457226511362321354      |   ^
> > 192.168.1.106 Up         113.25 GB    17909598258996761234914453022724642708     v   |
> > 192.168.1.107 Up         75.65 GB     22386997823745951543643066278405803385     |   ^
> > 192.168.1.109 Up         76.14 GB     35819196517993522469828906045449285416     |   ^
> > 192.168.1.110 Up         75.9 GB      40296596082742712778557519301130446093     v   |
> > 192.168.1.111 Up         95.21 GB     49251395212241093396014745812492767447     |   ^
> >
> > At this point we're expecting 192.168.1.107 to pick up the slack for the
> > removed token, and for 192.168.1.109 and/or 192.168.1.110 to start streaming
> > data to 192.168.1.107 since they are holding the replicated data for that
> > range.
> >
> > 3. nodetool repair ?
> >
> > On Tue, Jun 22, 2010 at 12:03 AM, Benjamin Black <b...@b3k.us> wrote:
> >>
> >> Did you forget to run repair?
> >>
> >> On Mon, Jun 21, 2010 at 7:02 PM, Joost Ouwerkerk <jo...@openplaces.org>
> >> wrote:
> >> > I believe we did nodetool removetoken on nodes that were already down
> >> > (due to hardware failure), but I will check to make sure. We're running
> >> > Cassandra 0.6.2.
> >> >
> >> > On Mon, Jun 21, 2010 at 9:59 PM, Joost Ouwerkerk <jo...@openplaces.org>
> >> > wrote:
> >> >>
> >> >> Greg, can you describe the steps we took to decommission the nodes?
> >> >>
> >> >> ---------- Forwarded message ----------
> >> >> From: Rob Coli <rc...@digg.com>
> >> >> Date: Mon, Jun 21, 2010 at 8:08 PM
> >> >> Subject: Re: get_range_slices confused about token ranges after
> >> >> decommissioning a node
> >> >> To: user@cassandra.apache.org
> >> >>
> >> >>
> >> >> On 6/21/10 4:57 PM, Joost Ouwerkerk wrote:
> >> >>>
> >> >>> We're seeing very strange behaviour after decommissioning a node: when
> >> >>> requesting a get_range_slices with a KeyRange by token, we are getting
> >> >>> back tokens that are out of range.
> >> >>
> >> >> What sequence of actions did you take to "decommission" the node? What
> >> >> version of Cassandra are you running?
> >> >>
> >> >> =Rob
> >> >>
> >> >
> >> >
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>
