On Mon, Feb 13, 2012 at 8:09 PM, Peter Schuller <peter.schul...@infidyne.com> wrote:

> > the servers spending >50% of the time in io-wait
>
> Note that I/O wait is not necessarily a good indicator, depending on
> situation. In particular if you have multiple drives, I/O wait can
> mostly be ignored. Similarly if you have non-trivial CPU usage in
> addition to disk I/O, it is also not a good indicator. I/O wait is
> essentially giving you the amount of time CPUs spend doing nothing
> because the only processes that would otherwise be runnable are
> waiting on disk I/O. But even a single process waiting on disk I/O
> can produce lots of I/O wait, even if you have 24 drives.
>

Yep - user-space CPU is <20%, or much worse, when the io-wait goes into the
90s - it looks a great deal like an IO bottleneck
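
For anyone following along: the io-wait figures here are presumably the
aggregate iowait from /proc/stat (the number top and vmstat report). A
rough, untested sketch of how that figure is derived, assuming a standard
Linux /proc/stat layout:

    # Sample /proc/stat twice and report iowait as a share of all CPU time.
    import time

    def cpu_times():
        with open("/proc/stat") as f:
            fields = f.readline().split()      # aggregate "cpu" line
        return [int(x) for x in fields[1:]]

    def iowait_percent(interval=1.0):
        before = cpu_times()
        time.sleep(interval)
        after = cpu_times()
        deltas = [a - b for a, b in zip(after, before)]
        total = sum(deltas)
        return 100.0 * deltas[4] / total if total else 0.0   # field 4 = iowait

    print("iowait %.1f%%" % iowait_percent())

As Peter says, that number is host-wide, so on its own it doesn't tell you
which disk (if any) is actually the bottleneck.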


>
> The per-disk % utilization is generally a much better indicator
> (assuming no hardware raid device, and assuming no SSD), along with
> the average queue size.
>

I doubt that figure is available in any meaningful form on an EC2 instance
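
For reference, on a plain Linux box these are the %util and avgqu-sz
columns from iostat -x. A rough sketch of how they are derived from
/proc/diskstats is below (the device name is just a hypothetical example;
on EC2/EBS the "device" is virtual, so the numbers may not mean much):

    # Derive %util and average queue size for one device from /proc/diskstats.
    import time

    def disk_fields(dev):
        with open("/proc/diskstats") as f:
            for line in f:
                parts = line.split()
                if parts[2] == dev:
                    # parts[12] = ms spent doing I/O, parts[13] = weighted ms
                    return int(parts[12]), int(parts[13])
        raise ValueError("device %s not found" % dev)

    def util_and_queue(dev, interval=1.0):
        io1, wio1 = disk_fields(dev)
        time.sleep(interval)
        io2, wio2 = disk_fields(dev)
        elapsed_ms = interval * 1000.0
        util = 100.0 * (io2 - io1) / elapsed_ms      # %util
        avg_queue = (wio2 - wio1) / elapsed_ms       # avgqu-sz
        return util, avg_queue

    print(util_and_queue("xvdb"))   # hypothetical EBS volume name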


>
> >> In general, if you have queries that come in at some rate that
> >> is determined by outside sources (rather than by the time the last
> >> query took to execute),
> >
> > That's an interesting approach - is that likely to give close to
> > optimal performance?
>
> I just mean that it all depends on the situation. If you have, for
> example, some N number of clients that are doing work as fast as they
> can, bottlenecking only on Cassandra, you're essentially saturating
> the Cassandra cluster no matter what (until the client/network becomes
> a bottleneck). Under such conditions (saturation) you should
> generally not expect good latencies.
>
> For most non-batch job production use-cases, you tend to have incoming
> requests driven by something external such as user behavior or
> automated systems not related to the Cassandra cluster. In these cases,
> you tend to have a certain amount of incoming requests at any given
> time that you must serve within a reasonable time frame, and that's
> where the question comes in of how much I/O you're doing in relation
> to maximum. For good latencies, you always want to be significantly
> below maximum - particularly when platter based disk I/O is involved.
>
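
A back-of-the-envelope way to see why headroom matters so much: under a
simple M/M/1 queueing approximation (which disk I/O under Cassandra
certainly isn't exactly - treat the numbers as illustrative only), mean
response time is the bare service time divided by (1 - utilisation), so
latency blows up as you approach saturation:

    # Illustrative only: M/M/1 mean response time = S / (1 - rho),
    # where S is the bare service time and rho is utilisation in [0, 1).
    def mm1_response_time(service_time_ms, rho):
        return service_time_ms / (1.0 - rho)

    for rho in (0.5, 0.7, 0.9, 0.95, 0.99):
        print("util %2.0f%% -> mean latency %6.0f ms"
              % (rho * 100, mm1_response_time(10.0, rho)))
    # For a 10 ms service time: 50% -> 20 ms, 90% -> 100 ms, 99% -> 1000 ms
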
> > That may well explain it - I'll have to think about what that means for
> > our use case as load will be extremely bursty
>
> To be clear though, even your typical un-bursty load is still bursty
> once you look at it at sufficient resolution, unless you have
> something specifically ensuring that it is entirely smooth. A
> completely random distribution over time, for example, would look very
> even on almost any graph you can imagine unless you have sub-second
> resolution, but would still exhibit unevenness and have an effect on
> latency.
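
This is easy to convince yourself of with a quick simulation - arrivals at
a perfectly steady average rate (Poisson) look flat per second but are
quite uneven in 10 ms buckets. A rough sketch:

    # Events at a steady *average* rate still look bursty at fine resolution.
    import random

    rate = 1000                         # average arrivals per second
    arrivals, t = [], 0.0
    while True:
        t += random.expovariate(rate)   # exponential inter-arrival times
        if t >= 1.0:
            break
        arrivals.append(t)

    buckets = [0] * 100                 # 10 ms buckets across one second
    for a in arrivals:
        buckets[int(a * 100)] += 1

    print("total in 1 s:", len(arrivals))                    # ~1000 each run
    print("per 10 ms min/max:", min(buckets), max(buckets))  # e.g. ~3 / ~19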
>
> --
> / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
>



-- 

*Franc Carter* | Systems architect | Sirca Ltd

franc.car...@sirca.org.au | www.sirca.org.au

Tel: +61 2 9236 9118

Level 9, 80 Clarence St, Sydney NSW 2000

PO Box H58, Australia Square, Sydney NSW 1215
