Sorry for the mixed signals in my response. I was partially replying to suggestions that we were limited by the box's NIC or the DC's bandwidth (which is gigabit, so no dice there). I also ran the tests with -t50 on multiple tester machines in the cloud with no change in performance; I've now rerun those tests on dedicated hardware.
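As a rough sanity check on the bandwidth question, here is a small sketch. It assumes the roughly 20MB/sec per-server network i/o figure from the notes below, decimal megabytes, and a nominal 1 Gbit/s link with no protocol overhead; the function and constant names are illustrative only.

```python
# Rough link-utilization check using the figures quoted in this thread.
OBSERVED_MB_PER_SEC = 20   # approximate per-server network i/o (from the notes)
LINK_GBIT_PER_SEC = 1      # nominal gigabit link (assumption: overhead ignored)


def link_utilization(mb_per_sec, link_gbit_per_sec):
    """Return the fraction of the link in use, treating 1 MB as 10**6 bytes."""
    bits_per_sec = mb_per_sec * 10**6 * 8
    return bits_per_sec / (link_gbit_per_sec * 10**9)


if __name__ == "__main__":
    used = link_utilization(OBSERVED_MB_PER_SEC, LINK_GBIT_PER_SEC)
    print(f"{used:.0%} of the link")
```

Under those assumptions the servers are using about 16% of a gigabit link, which fits with the network not being the limit.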

          reads/sec @
nodes     one client    two clients
1         53k           73k
2         37k           50k
4         37k           50k

Notes:
- All notes from the previous dataset apply here.
- All clients were reading with 50 processes.
- Test clients were not co-located with the databases or each other.
- All machines are in the same DC.
- Servers showed about 20MB/sec in network i/o for the multi-node clusters, which is well under the max for gigabit.
- Latency was about 2.5ms/req.

At this point, we'd really appreciate it if anyone else could attempt to replicate our results. Ultimately, our goal is to see an increase in throughput given an increase in cluster size.

--
David Schoonover

On Jul 19, 2010, at 2:25 PM, Stu Hood wrote:

> If you put 25 processes on each of the 2 machines, all you are testing is how
> fast 50 processes can hit Cassandra... the point of using more machines is
> that you can use more processes.
>
> Presumably, for a single machine, there is some limit (K) to the number of
> processes that will give you additional gains: above that point, you should
> use more machines, each running K processes.
>