Reddit posted a blog entry about some recent downtime, partially due
to issues with Cassandra.
http://blog.reddit.com/2010/05/reddits-may-2010-state-of-servers.html

This part surprised me:
"
First, Cassandra has an internal queue of work to do. When it times
out a client (10s by default), it still leaves the operation in the
queue of work to complete (even though the person that asked for the
read is no longer even holding the socket), which given a constant
stream of requests makes the amount of pending work snowball
effectively infinitely (specifically, ROW-READ-STAGE's PENDING
operations grow unbounded).
"

I searched Jira for an issue related to this and did not find one.
It seems like a bug to keep reads in the queue when the result is
useless (because the reader is gone).  Obviously a 10-second read is
not a normal run condition, but dropping stale reads could remove one
cause of cascading failure.
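
To make the idea concrete, here is a minimal sketch of what I mean by
removing stale reads. It is a toy model in Python, not Cassandra code:
ReadStage, ReadTask, submit, run_next and RPC_TIMEOUT_S are all names I
made up for illustration, and the 10-second value is only assumed to
match the client timeout mentioned in the quote. Each task records when
it was queued, and the worker discards any task older than the timeout
instead of executing it.

```python
import collections
from dataclasses import dataclass
from typing import Callable, Deque, Optional

# Assumption: mirrors the 10s client timeout mentioned in the quote.
RPC_TIMEOUT_S = 10.0


@dataclass
class ReadTask:
    key: str
    enqueued_at: float  # clock reading at the moment the read was queued


class ReadStage:
    """Toy single-threaded stand-in for a read stage (hypothetical)."""

    def __init__(self, clock: Callable[[], float],
                 timeout_s: float = RPC_TIMEOUT_S) -> None:
        self._clock = clock
        self._timeout_s = timeout_s
        self._pending: Deque[ReadTask] = collections.deque()
        self.dropped = 0  # count of stale reads discarded without running

    def submit(self, key: str) -> None:
        """Queue a read, stamping it with the current clock value."""
        self._pending.append(ReadTask(key, self._clock()))

    def pending(self) -> int:
        return len(self._pending)

    def run_next(self, do_read: Callable[[str], str]) -> Optional[str]:
        """Run the oldest non-stale read; return None if none remain."""
        while self._pending:
            task = self._pending.popleft()
            if self._clock() - task.enqueued_at > self._timeout_s:
                # The client has already been timed out, so the result
                # would be thrown away. Skip the work entirely.
                self.dropped += 1
                continue
            return do_read(task.key)
        return None
```

With a check like this at dequeue time, a backlog of already-timed-out
reads drains at the cost of one timestamp comparison per task rather
than one disk read per task, so the pending count can recover once the
slow period ends. The clock is passed in only so the behavior can be
exercised without sleeping.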

Should I open a ticket, or have I misunderstood something?
