The theoretical answer involves Little's Law
<https://en.wikipedia.org/wiki/Little%27s_law> (*L = λW*). But the practical
experience is, as you say, dependent on a fair number of factors. We recently
wrote a blog post
<https://www.scylladb.com/2019/11/20/maximizing-performance-via-concurrency-while-minimizing-timeouts-in-distributed-databases/>
that should help inform your thinking about parallelism, throughput,
latency, and timeouts.
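
To make Little's Law concrete, here is a rough back-of-the-envelope sketch
in Python; the numbers are purely illustrative, not measurements from any
particular cluster:

    # Little's Law: L = lambda * W
    # L = requests in flight (concurrency), lambda = throughput (ops/sec),
    # W = mean latency (seconds). All numbers below are illustrative.

    throughput_ops = 100_000   # target throughput, ops/sec (assumption)
    mean_latency_s = 0.002     # mean per-request latency, 2 ms (assumption)

    concurrency = throughput_ops * mean_latency_s
    print(f"requests in flight needed: {concurrency:.0f}")   # -> 200

    # Rearranged: with a fixed concurrency cap, latency bounds throughput.
    max_concurrency = 128
    max_throughput = max_concurrency / mean_latency_s
    print(f"max throughput at that cap: {max_throughput:.0f} ops/sec")  # -> 64000

The practical takeaway is the same as in the blog: if you raise offered load
without enough client-side concurrency (or with too much), latency and
timeouts move in ways Little's Law lets you predict roughly in advance.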

Earlier this year, we also wrote a blog post about sizing Scylla clusters
<https://www.scylladb.com/2019/06/20/sizing-up-your-scylla-cluster/> that
touches on latency and throughput. For example, a general rule of thumb is
that with the current generation of Intel cores, for payloads of <1 KB you
can get ~12.5k ops/core with Scylla. If there are similar blogs about
sizing Cassandra clusters, I'd be interested in reading them as well!
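
As a quick illustration of how that rule of thumb translates into node
counts (the target throughput and instance size below are made up, and this
ignores replication factor, consistency level, and headroom):

    import math

    # Rough sizing from the ~12.5k ops/core rule of thumb above.
    target_ops_per_sec = 500_000   # desired peak throughput (assumption)
    ops_per_core = 12_500          # rule of thumb for <1 KB payloads
    cores_per_node = 16            # e.g. a 16-vCPU instance (assumption)

    cores_needed = target_ops_per_sec / ops_per_core          # 40 cores
    nodes_needed = math.ceil(cores_needed / cores_per_node)   # 3 nodes
    print(cores_needed, nodes_needed)

Treat that only as a starting point for capacity planning; real workloads
need benchmarking.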

Also, in terms of latency, I want to point out that a great deal depends on
the nature of your data, queries, and caching. For example, if you have a
very low cache hit rate, expect higher latencies: data will still need to be
read from storage even if you add more nodes.
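
A simplified way to reason about that effect (a model, not a benchmark; all
of the latency figures are assumptions) is a weighted average of cache-hit
and cache-miss latencies:

    # Expected read latency as a weighted average of hits and misses.
    cache_hit_rate = 0.60     # fraction of reads served from memory (assumption)
    cache_latency_ms = 0.5    # typical in-memory read latency (assumption)
    disk_latency_ms = 5.0     # typical NVMe/SSD read latency (assumption)

    effective_ms = (cache_hit_rate * cache_latency_ms
                    + (1 - cache_hit_rate) * disk_latency_ms)
    print(f"expected mean read latency: {effective_ms:.2f} ms")   # ~2.30 ms

Adding nodes helps the throughput side of the equation, but a cold cache or
a scan-heavy query pattern keeps the miss term large no matter how many
nodes you have.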

On Tue, Dec 10, 2019 at 6:57 AM Fred Habash <fmhab...@gmail.com> wrote:

> I'm looking for an empirical way to answer these two question:
>
> 1. If I increase application work load (read/write requests) by some
> percentage, how is it going to affect read/write latency. Of course, all
> other factors remaining constant e.g. ec2 instance class, ssd specs, number
> of nodes, etc.
>
> 2) How many nodes do I have to add to maintain a given read/write latency?
>
> Are there any methods or instruments out there that can help answer
> these questions?
>
>
>
> ----------------------------------------
> Thank you
>
>
>

-- 
Peter Corless
Technical Marketing Manager
ScyllaDB
e: pe...@scylladb.com
t: @petercorless <https://twitter.com/PeterCorless>
v: 650-906-3134
