On Sun, Aug 29, 2010 at 12:20:10PM -0700, Benjamin Black wrote:
> On Sun, Aug 29, 2010 at 11:04 AM, Anthony Molinaro
> <antho...@alumni.caltech.edu> wrote:
> >
> >
> > I don't know; it seems to tax our setup of 39 extra-large EC2 nodes. It's
> > also closer to 24,000 reqs/sec at peak, since there are different tables
> > (2 tables for each read and 2 for each write).
> >
> 
> Could you clarify what you mean here?  On the face of it, this
> performance seems really poor given the number and size of nodes.

As you say, I would expect much better performance given the node size, but
if you go back through some of the issues we've seen over time, you'll find
we've been hit by nodes that were too small, too few nodes for the request
volume, OOMs, bad sstables, the ring appearing different to different nodes,
and several other problems.

Many of the i/o problems presented themselves as MessageDeserializer pool
backups (although those stopped after Jonathan came by and suggested a row
cache of about 1 GB, thanks Riptano!).  We currently have mystery OOMs,
probably caused by GC storms during compactions (although the nodes usually
restart and compact fine, so who knows).  I also regularly watch nodes go
away for 30 seconds or so (the logs show a node go dead, then come back to
life a few seconds later).
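For anyone following along, the row cache is configured per column family; in
the 0.6-era storage-conf.xml it looks roughly like this (the CF name and row
count below are placeholders, not our actual setup -- size the count so the
cached rows total about 1 GB for your row sizes):

```xml
<!-- storage-conf.xml fragment: enable the row cache on one hot CF.
     "RowsCached" accepts an absolute row count or a percentage
     (e.g. "10%"); the CF name here is purely illustrative. -->
<ColumnFamily Name="HotReads"
              CompareWith="BytesType"
              RowsCached="100000"/>
```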

I've sort of given up worrying about these, as we're in the process of moving
this cluster to our own machines in a colo.  I figure I should wait until the
move is done and see how the new machines do before worrying more about
performance.

-Anthony

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <antho...@alumni.caltech.edu>
