Great question. The safe answer is to build a proof-of-concept implementation
and try various request rates to determine where the bottleneck is. The
answer will also depend on the row size. It is hard to say in advance whether
you will be limited by cluster load or by network bandwidth.
Is there only one client talking to your cluster? Or are you asking how many
requests each of, say, one million clients can have in flight at once?
The rate of requests will matter as well, particularly if the cluster has a
non-trivial load.
My ultimate rule of thumb is simple: moderation. Not too many threads, and
not too high a request rate.
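
As a rough illustration of that rule of thumb, here is a minimal sketch that
caps the number of in-flight asynchronous requests with a semaphore. It is
not tied to any Cassandra driver: fetch_row is a hypothetical stand-in for a
real asynchronous query, and MAX_IN_FLIGHT = 32 is an assumed starting value
to tune by measurement, not a recommendation.

```python
import asyncio

# Assumed starting point only; find the right value by measuring.
MAX_IN_FLIGHT = 32


async def fetch_row(key):
    # Hypothetical stand-in for a real asynchronous driver query.
    await asyncio.sleep(0.001)
    return {"key": key}


async def fetch_all(keys, max_in_flight=MAX_IN_FLIGHT, fetch=fetch_row):
    # The semaphore lets at most max_in_flight requests run at once.
    semaphore = asyncio.Semaphore(max_in_flight)

    async def bounded(key):
        async with semaphore:
            return await fetch(key)

    # gather returns results in the same order as the input keys.
    return await asyncio.gather(*(bounded(key) for key in keys))


if __name__ == "__main__":
    rows = asyncio.run(fetch_all(range(10_000)))
    print(len(rows))
```

Rerunning this with different max_in_flight values against a real cluster,
while watching latency and error rates, is one way to do the proof of
concept described above.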
It would be nice if the cluster could calculate both numbers for you, so
that a client (driver) could query the cluster for them, and if the cluster
could also return a suggested wait interval before the next request, based
on its actual load.
-- Jack Krupansky
-----Original Message-----
From: Robert Wille
Sent: Tuesday, November 25, 2014 10:57 AM
To: user@cassandra.apache.org
Subject: Rule of thumb for concurrent asynchronous queries?
Suppose I have the primary keys for 10,000 rows and I want them all. Is
there a rule of thumb for the maximum number of concurrent asynchronous
queries I should execute?