Re: Cassandra benchmarking on Rackspace Cloud

2010-07-19 Thread David Schoonover
> Now keep adding clients until it stops making the numbers go up... Neither adding additional readers nor additional cluster nodes showed performance gains. The numbers, they do not move. -- David Schoonover On Jul 19, 2010, at 5:18 PM, Jonathan Ellis wrote: > Now keep adding clients

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-19 Thread David Schoonover
r using a gaussian distribution, both methods showed the same results. Finally, we're using a random partitioner, so Cassandra will hash the keys using md5 to map it to a position on the ring. -- David Schoonover On Jul 19, 2010, at 4:14 PM, Peter Schuller wrote: > The following is completely
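
The key-to-ring mapping mentioned above can be sketched in a few lines of Python. This is only an illustration, not Cassandra's code: the names `ring_token` and `owner_index` are hypothetical, the signed-integer interpretation of the digest is an assumption, and the four evenly spaced node tokens are invented for the example.

```python
import bisect
import hashlib


def ring_token(key: bytes) -> int:
    """Map a row key to a ring position via MD5 (illustrative only)."""
    digest = hashlib.md5(key).digest()
    # Assumption: treat the 16-byte digest as a signed big-endian integer and
    # take its absolute value, which yields a token in the range 0..2**127.
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


def owner_index(node_tokens, token):
    """Return the index of the first node whose token is >= the key's token.

    node_tokens must be sorted; a key past the last token wraps to node 0.
    """
    i = bisect.bisect_left(node_tokens, token)
    return i % len(node_tokens)


if __name__ == "__main__":
    # Hypothetical four-node ring with evenly spaced tokens.
    nodes = [i * (2**127 // 4) for i in range(4)]
    for key in (b"0000001", b"0000002", b"0000003"):
        t = ring_token(key)
        print(key, t, "-> node", owner_index(nodes, t))
```

Because the hash scatters even sequential keys across the whole token range, the key distribution chosen by the client should not by itself concentrate load on one node, which is consistent with both key-selection methods showing the same results.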

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-19 Thread David Schoonover
runs of inserts.) My version is uploaded here: http://gist.github.com/481966 -- David Schoonover On Jul 19, 2010, at 4:26 PM, Juho Mäkinen wrote: > I'm about to extend my two node cluster with four dedicated nodes and > removing one of the old nodes, leaving a five node cluster. The

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-19 Thread David Schoonover
e else could attempt to replicate our results. Ultimately, our goal is to see an increase in throughput given an increase in cluster size. -- David Schoonover On Jul 19, 2010, at 2:25 PM, Stu Hood wrote: > If you put 25 processes on each of the 2 machines, all you are testing is how >

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-19 Thread David Schoonover
ted hardware now to ensure that result was not an artifact of the cloud. David Schoonover On Jul 19, 2010, at 1:38 PM, Jonathan Ellis wrote: > On Mon, Jul 19, 2010 at 12:30 PM, David Schoonover > wrote: >>> How many physical client machines are running stress.py? >> >

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-19 Thread David Schoonover
stress.py uses multiprocessing if it is present, circumventing the GIL; we ran the tests with python 2.6.5. David Schoonover On Jul 19, 2010, at 1:51 PM, Peter Schuller wrote: >>> One with 50 threads; it is remote from the cluster but within the same >>> DC in both cases. I

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-19 Thread David Schoonover
> Another thing: Is the py_stress traffic definitely non-deterministic > such that each client will generate a definitely unique series of > requests? The tests were run both with --random and --std 0.1; in both cases, the key-sequence is non-deterministic. Cheers, Dave On Jul 19, 2010, at 1
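
For readers unfamiliar with the two key-selection modes named above, here is a rough sketch of what uniform and gaussian key selection could look like. It is not taken from stress.py: the function names, the zero-padded key format, and the reading of 0.1 as a fraction of the key count are all assumptions made for illustration.

```python
import random


def uniform_key(rng, total_keys):
    """Pick any key in [0, total_keys) with equal probability."""
    width = len(str(total_keys))
    return "%0*d" % (width, rng.randint(0, total_keys - 1))


def gaussian_key(rng, total_keys, std_fraction=0.1):
    """Pick a key clustered around the middle of the key range.

    Assumption: the standard deviation is std_fraction * total_keys and
    out-of-range draws are simply retried.
    """
    mean = total_keys / 2
    stdev = total_keys * std_fraction
    width = len(str(total_keys))
    while True:
        guess = rng.gauss(mean, stdev)
        if 0 <= guess < total_keys:
            return "%0*d" % (width, int(guess))


if __name__ == "__main__":
    # An unseeded Random() is seeded from OS entropy (or the clock), so each
    # client process produces its own key sequence.
    rng = random.Random()
    print([uniform_key(rng, 1000000) for _ in range(3)])
    print([gaussian_key(rng, 1000000) for _ in range(3)])
```

In this sketch each client owns an unseeded generator, so two clients would not replay the same series of requests; passing a fixed seed to `random.Random` would make a run repeatable instead.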

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-19 Thread David Schoonover
ote: > How many physical client machines are running stress.py? > > -----Original Message----- > From: "David Schoonover" > Sent: Monday, July 19, 2010 12:11pm > To: user@cassandra.apache.org > Subject: Re: Cassandra benchmarking on Rackspace Cloud > > Hello all,

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-19 Thread David Schoonover
Hello all, I'm Oren's partner in crime on all this. I've got a few more numbers to add. In an effort to eliminate everything but the scaling issue, I set up a cluster on dedicated hardware (non-virtualized; 8-core, 16G RAM). No data was loaded into Cassandra -- 100% of requests were misses. Th