> Did you see about equal CPU usage on the cassandra nodes during the
> test? Is it possible that most or all of the keys generated by
> stress.py simply fall on a single node?

CPU was approximately equal across the cluster; it was around 50%.

stress.py generates keys either randomly or using a Gaussian distribution;
both methods showed the same results.

Finally, we're using the random partitioner, so Cassandra hashes each key
with MD5 to map it to a position on the ring.
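
To illustrate why the hashing matters here, below is a minimal sketch (not Cassandra's actual code). It generates zero-padded decimal keys the same way key_generator_random() in stress.py does, hashes each one with MD5, and counts how many keys land on each node. The names RING_SIZE, md5_token, node_for and key_counts are hypothetical, the evenly spaced node tokens are an assumption, and reducing the digest modulo 2**127 is a simplification of how the real partitioner derives its token.

```python
import bisect
import hashlib
from random import Random

# Assumption: a token space of 2**127 positions, used here only for illustration.
RING_SIZE = 2 ** 127


def md5_token(key):
    """Map a key to a ring position via MD5 (simplified, see assumptions above)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def node_for(token, node_tokens):
    """Index of the node owning `token`: the first node token >= token, wrapping."""
    return bisect.bisect_left(node_tokens, token) % len(node_tokens)


def key_counts(num_nodes=4, total_keys=100000, samples=20000, seed=0):
    """Count how many randomly generated keys fall on each node."""
    rng = Random(seed)
    # Same key format as key_generator_random(): zero-padded decimal strings.
    fmt = '%0' + str(len(str(total_keys))) + 'd'
    # Hypothetical cluster with tokens spaced evenly around the ring.
    node_tokens = [i * RING_SIZE // num_nodes for i in range(num_nodes)]
    counts = [0] * num_nodes
    for _ in range(samples):
        key = fmt % rng.randint(0, total_keys - 1)
        counts[node_for(md5_token(key), node_tokens)] += 1
    return counts


if __name__ == "__main__":
    print(key_counts())
```

Even though every raw key starts with a character in '0'-'9', the hashed positions should spread roughly evenly over the nodes, which is consistent with the equal CPU usage we saw.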

--
David Schoonover

On Jul 19, 2010, at 4:14 PM, Peter Schuller wrote:

> The following is completely irrelevant if you are indeed using the
> default storage-conf.xml as you said. However since I wrote it and it
> remains relevant for anyone testing with the order preserving
> partitioner, I might as well post it rather than discard it...
> 
> Begin probably irrelevant post:
> 
> Another stab in the dark:
> 
> You do specifically mention that you distributed tokens evenly across
> the cluster and independently for each cluster size. However, were the
> tokens distributed evenly *within the range used by the stress test*?
> 
> This is the random key generator in stress.py:
> 
> def key_generator_random():
>    fmt = '%0' + str(len(str(total_keys))) + 'd'
>    return fmt % randint(0, total_keys - 1)
> 
> Unless I am misreading/mis-testing, this will generate keys that are
> essentially ASCII decimal characters in strings of equal length, with
> numerical values distributed in the range [0,total_keys - 1]. However,
> the key prefixes covered by the range '0-9' make up a very limited
> subset of the token spaces into which cluster nodes are placed, for
> both byte strings and UTF-8 strings.
> 
> Did you see about equal CPU usage on the cassandra nodes during the
> test? Is it possible that most or all of the keys generated by
> stress.py simply fall on a single node?
> 
> -- 
> / Peter Schuller
