On 01/18/2013 03:08 AM, Amit Kale wrote:
>> > Can you explain what you mean by that in a little more detail?
> Let's say latency of a block device is 10ms for 4kB requests. With single 
> threaded IO, the throughput will be 4kB/10ms = 400kB/s. If the device is 
> capable of more throughput, a multithreaded IO will generate more throughput. 
> So with 2 threads the throughput will be roughly 800kB/s. We can keep 
> increasing the number of threads, resulting in approximately linear 
> throughput scaling. It'll saturate at the maximum capacity the device 
> has, perhaps at 8MB/s. Increasing the number of threads beyond 
> this will not increase throughput.
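
To make sure I follow: here is a quick back-of-envelope sketch of that
saturation curve in Python, using only the example figures above (4kB
requests, 10ms latency, and an assumed ~8MB/s ceiling):

# Back-of-envelope model of the saturation behaviour described above.
# The 4kB request size, 10ms latency and ~8MB/s ceiling are the example
# figures from the paragraph, not measurements.
BLOCK_KB = 4
LATENCY_S = 0.010
MAX_KBPS = 8000  # treating 8MB/s as 8000 kB/s for simplicity

def throughput_kbps(threads):
    # Each synchronous thread completes one request per LATENCY_S,
    # until the device ceiling caps the total.
    return min(threads * BLOCK_KB / LATENCY_S, MAX_KBPS)

for n in (1, 2, 4, 8, 16, 32):
    print(n, throughput_kbps(n))  # 400, 800, 1600, 3200, 6400, 8000 kB/s
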
>
> This is a simplistic computation. Throughput, latency and the number of 
> threads are related in a more complex way. Latency is still important, 
> but throughput is more important.
>
> The way all this matters for SSD caching is that caching will typically 
> show higher latency compared to the base SSD, even for a 100% hit 
> ratio. It may be 
> possible to reach the maximum throughput achievable with the base SSD using a 
> high number of threads. Let's say an SSD shows 450MB/s with 4 threads. A 
> cache may show 440MB/s with 8 threads.
>
> A practical difficulty in measuring latency is that the latency seen by an 
> application is the sum of the device latency and the time spent in the 
> request queue (and caching layer, when present). Increasing the number 
> of threads shows a latency increase, although that is only because the 
> requests stay in the request queue for a longer duration. Latency 
> measurement in a multithreaded 
> environment is very challenging. Measurement of throughput is fairly 
> straightforward.
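
If I'm reading this right, that is essentially Little's law: the latency
an application observes is its requests in flight divided by throughput,
so once the device saturates, extra threads only add queue time.  A toy
sketch with the same example figures (the 2000 IOPS ceiling is just
8000 kB/s divided by 4 kB requests):

# Little's law: concurrency = throughput * latency, so the latency an
# application sees is (requests in flight) / (requests completed per
# second).  Figures are the example numbers from above, not measurements.
DEVICE_LATENCY_S = 0.010   # intrinsic per-request latency (10 ms)
MAX_IOPS = 2000            # device ceiling: 8000 kB/s / 4 kB per request

def observed_latency_s(threads):
    # Below saturation each thread sees the device latency; beyond it,
    # extra requests just wait in the request queue.
    iops = min(threads / DEVICE_LATENCY_S, MAX_IOPS)
    return threads / iops

for n in (1, 10, 20, 40, 80):
    print(n, round(observed_latency_s(n) * 1000, 1), "ms")
    # 10ms up to 20 threads, then 20ms at 40 and 40ms at 80:
    # the increase is queueing, not the device getting slower.
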
>
>> > 
>> > As an enterprise-level user I see both as important overall.  However,
>> > the biggest driving factor in wanting a cache device in front of any
>> > sort of target in my use cases is to hide latency as the number of
>> > threads reading and writing to the backing device goes up.  So for me the
>> > cache is basically a tier stage where your ability to keep dirty blocks
>> > on it is determined by the specific use case.
> SSD caching will help in this case since an SSD's latency remains almost 
> constant regardless of the location of the data. HDD latency for sequential and 
> random IO could vary by a factor of 5 or even much more.
>
> Throughput with caching could even be 100 times the HDD throughput when 
> using multithreaded non-sequential IO.
> -Amit
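
That 100x figure looks plausible from first principles.  A rough sanity
check, using order-of-magnitude assumptions for the device
characteristics (none of these are measured values):

# Ballpark check on the ~100x claim for random IO.  All figures below
# are rough, typical-order-of-magnitude assumptions, not measurements.
HDD_RANDOM_LATENCY_S = 0.008   # ~8 ms seek + rotational delay per 4 kB
SSD_RANDOM_LATENCY_S = 0.0001  # ~100 us per 4 kB read
SSD_PARALLELISM = 8            # internal channels serving requests at once

hdd_iops = 1 / HDD_RANDOM_LATENCY_S                 # ~125 IOPS
ssd_iops = SSD_PARALLELISM / SSD_RANDOM_LATENCY_S   # ~80000 IOPS
print(round(ssd_iops / hdd_iops))  # ~640x, comfortably above 100x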

Thank you for the explanation.  In context your reasoning makes more
sense to me.

If I am understanding you correctly, when you refer to throughput you're
speaking more in terms of IOPS than what most people would think of as
just bit rate.
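
For the archive, the conversion I have in mind is simply bandwidth =
IOPS x request size, e.g.:

# IOPS and bit rate are two views of the same throughput, tied together
# by the request size: bandwidth = IOPS * block size.
def bandwidth_mbps(iops, block_kb=4):
    return iops * block_kb / 1000

print(bandwidth_mbps(100))    # ~0.4 MB/s: HDD-like random-read IOPS
print(bandwidth_mbps(50000))  # ~200 MB/s: SSD-like random-read IOPS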

I would expect a small increase in minimum and average latency when
adding another layer that the blocks have to traverse.  If my minimum
and average latencies increase by 20% on most of my workloads, that is
very acceptable as long as there is a decrease in the 95th and 99th
percentile latencies.  I would hope that the absolute maximum would
decrease as well, but that is going to be much harder to achieve.
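
To be concrete about what I would report, here is a sketch of how I
would summarize per-request latency samples from a benchmark run; the
input file name and the one-value-per-line format are just assumptions:

# Sketch: summarize per-request latencies (one value in ms per line)
# into the min/avg/p95/p99/max figures discussed above.
import statistics

with open("latency_samples_ms.txt") as f:  # file name/format assumed
    samples = sorted(float(line) for line in f if line.strip())

q = statistics.quantiles(samples, n=100)  # q[94] = p95, q[98] = p99
print("min", samples[0])
print("avg", statistics.fmean(samples))
print("p95", q[94])
print("p99", q[98])
print("max", samples[-1])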

If I can help test and benchmark all three of these solutions, please
ask.  I have a lot of hardware resources available to me, and perhaps I
can add value from an outsider's perspective.

Jason