On Sat, Nov 6, 2010 at 4:51 PM, Reverend Chip <rev.c...@gmail.com> wrote:
> On 11/6/2010 1:48 PM, Jonathan Ellis wrote:
>>   Did any of the nodes log any dropped messages?
>
> I didn't keep timestamps of the maintenance steps, so I can't be
> sure which log entries correspond to which failure states.  I did
> find dropped message log entries on node X.22, though.  Here's the batch
> that happened at more or less the time things went wrong:
>
>  WARN [ScheduledTasks:1] 2010-11-05 17:15:03,294 MessagingService.java (line 515) Dropped 9122 messages in the last 1000ms

> Am I to understand that
> ring maintenance requests can just fail when partially complete, in the
> same manner as a regular insert might fail, perhaps due to inter-node
> RPC overflow?

Yes, in beta3 this can happen.  This was fixed in CASSANDRA-1676.

> It would appear, then, that Cassandra isn't designed to be operated and
> understood without constant log watching of all nodes.

Not in beta, it's not. :)

(In fact I would recommend running beta nodes at debug log level so
when something goes wrong you have a better picture of what happened.)
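
If you want to pull the dropped-message entries out of each node's log
after the fact, something like the following might help. It is only a
rough sketch: the pattern is based solely on the WARN line quoted above,
so other versions may format it differently, and the names find_dropped
and sample are made up for illustration. The second sample line is
invented to show a non-matching entry.

```python
import re

# Pattern derived from the single log line quoted in this thread
# (an assumption: other Cassandra versions may log this differently).
DROPPED_RE = re.compile(
    r"^\s*(?P<level>\w+) \[(?P<thread>[^\]]+)\] "
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*?"
    r"Dropped (?P<count>\d+) messages in the last (?P<window_ms>\d+)ms"
)


def find_dropped(lines):
    """Yield (timestamp, count, window_ms) for each dropped-message entry."""
    for line in lines:
        match = DROPPED_RE.search(line)
        if match:
            yield (
                match.group("timestamp"),
                int(match.group("count")),
                int(match.group("window_ms")),
            )


# First line is the entry from node X.22; the second is a hypothetical
# unrelated entry that should be skipped.
sample = [
    " WARN [ScheduledTasks:1] 2010-11-05 17:15:03,294 MessagingService.java"
    " (line 515) Dropped 9122 messages in the last 1000ms",
    " INFO [main] 2010-11-05 17:15:04,000 Example.java (line 1) unrelated entry",
]

for timestamp, count, window_ms in find_dropped(sample):
    print(f"{timestamp}: dropped {count} messages in {window_ms}ms")
```

Since find_dropped accepts any iterable of lines, an open log file can be
passed in directly, and running it per node gives timestamps to line up
against the maintenance steps.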

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
