On Sun, Mar 20, 2011 at 10:23 AM, pob <peterob...@gmail.com> wrote:

> Hi,
>
> I was searching for a similar topic in the mailing list; I think there is
> still misunderstanding about measuring a cluster. It would be nice if
> someone could write down the right definitions. What are we measuring?
> Ops/sec? Throughput in Mbit/s? Number of clients/threads writing/reading
> data?
>
> I read Jonathan said it doesn't matter whether you use CL.ONE or
> CL.QUORUM, but, for example, writing with CL.ONE into one node of a
> 3-node cluster with RF = 3 works fine, whereas writing with CL.ONE into
> all 3 nodes in parallel, randomly (stress.py -d node1,node2,node3), on
> the same 3-node cluster with RF = 3 ends with nodes crashing because of
> Java out-of-memory errors.
>
> Another thing: it was said that if you use RF = N, the throughput of the
> whole cluster is one node's throughput / 3. What is throughput in that
> case? Bandwidth? Ops/sec? And what is one node's throughput — one node
> with RF = 1? I'm getting completely lost while trying to estimate how big
> a stream I can write into the cluster, what happens if I double the
> number of nodes, and so on.
>
> Thanks for explanation or any hints.
>
> Best,
> Peter
>
> 2011/3/20 pob <peterob...@gmail.com>
>>
>> Hello,
>>
>> I set up a cluster with 3 nodes (4 GB RAM, 4 cores, RAID 0 each). I did
>> an experiment with stress.py to see how fast my inserts are. The results
>> are confusing.
>> In each case stress.py was inserting 170KB of data:
>>
>> 1) stress.py was inserting directly into one node (-dNode1), RF=3, CL.ONE
>>    300000 inserts in 1296 sec (300000,246,246,0.01123401983,1296)
>>
>> 2) stress.py was inserting directly into one node (-dNode1), RF=3, CL.QUORUM
>>    300000 inserts in 987 sec (300000,128,128,0.00894131883979,978)
>>
>> 3) stress.py was inserting randomly into all 3 nodes (-dNode1,Node2,Node3), RF=3, CL.QUORUM
>>    300000 inserts in 784 sec (300000,157,157,0.00900169542641,784)
>>
>> 4) stress.py was inserting directly into one node (-dNode1), RF=3, CL.ALL
>>    similar to case 1)
>>
>> -------
>>
>> I'm not surprised by cases 2) and 3), but the biggest surprise for me is
>> why CL.ONE is slower than CL.QUORUM. CL.ONE has fewer "acks", a shorter
>> wait time, and so on. I was looking at some blog posts about the "write"
>> architecture, but the reason is still not clear to me.
>>
>> http://www.mikeperham.com/2010/03/13/cassandra-internals-writing/
>> http://prettyprint.me/2010/05/02/understanding-cassandra-code-base/
>>
>> Thanks for advice.
>>
>> Best,
>> Peter
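On the "what are we measuring?" question: the comma-separated tuples in the results above are the progress lines stress.py prints. As a rough illustration, here is a small parser; the column meanings are my assumption about the old contrib/py_stress output (total keys inserted, interval op rate, interval key rate, average latency in seconds, elapsed seconds), not something stated in the thread:

```python
# Hypothetical parser for stress.py progress lines.
# Column meanings are assumptions: total, interval op rate,
# interval key rate, avg latency (s), elapsed time (s).
def parse_stress_line(line):
    total, op_rate, key_rate, latency, elapsed = line.split(",")
    return {
        "total": int(total),            # keys inserted so far
        "op_rate": int(op_rate),        # ops/sec over the last interval
        "key_rate": int(key_rate),      # keys/sec over the last interval
        "avg_latency_s": float(latency),
        "elapsed_s": int(elapsed),
    }

row = parse_stress_line("300000,246,246,0.01123401983,1296")
# Overall average rate = total inserts / elapsed seconds
print(row["total"] / row["elapsed_s"])  # roughly 231 ops/sec for case 1
```

That distinction matters: ops/sec here counts client-visible inserts, not bytes on the wire, so it is a different number from Mbit/s throughput.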
Peter,

There are too many combinations of replication factor, consistency level,
node count, and workload to have extended write-ups about how each
situation performs. The paper that does the best job explaining this is
the Yahoo! Cloud Serving Benchmark: http://research.yahoo.com/files/ycsb.pdf

* This paper is old.
** This paper tests with an older version of Cassandra.
*** YCSB seems to be fragmented across GitHub now.

Also remember that stress-test tools create fictitious workloads. You can
"game" a stress test and produce incredible results, or vice versa (you
always get more of what you measure). I cannot speak for anyone else, but
I imagine the stress-test tools are used primarily by the developers to
ensure there are no performance regressions after patches.

I think one way to look at the performance is by saying "it is what it
is": you have disks, you have RAM, data is sorted, and it is designed to
be as fast as it can be. Scale-out means you can grow the cluster
indefinitely, so how hard you can drive a single node becomes less of an
issue.
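On the "one node throughput / 3" confusion: the reasoning behind that rule of thumb is that with RF = N every insert is physically written N times, so the replica writes eat into each node's capacity. A back-of-the-envelope sketch (the per-node number below is an illustrative assumption, not a measured figure):

```python
def cluster_ops_per_sec(nodes, per_node_writes_per_sec, rf):
    """Rough client-visible insert rate for the whole cluster.

    Every client insert costs `rf` physical writes, spread evenly
    across the cluster, so aggregate capacity is divided by rf.
    """
    return nodes * per_node_writes_per_sec / rf

# 3 nodes, each assumed to absorb ~750 physical writes/sec, RF=3:
print(cluster_ops_per_sec(3, 750, 3))   # 750.0 — same as one node at RF=1
# Doubling the cluster to 6 nodes (same RF) doubles the aggregate rate:
print(cluster_ops_per_sec(6, 750, 3))   # 1500.0
```

So "throughput" in that rule of thumb is ops/sec, not bandwidth, and the takeaway is that adding nodes still scales the cluster linearly; RF only applies a constant divisor.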