I am amazed that you don't get OOM errors with this setup...

1 - For performance, and given Cassandra's replication properties and I/O usage, you might want to try RAID 0. But I imagine this is a trade-off.
2 - A billion rows is quite a lot, and each of your nodes takes the full load. You might want to try RF 2 and CL ONE if performance is what you are looking for.

3 - A 50 GB key cache is something I have never seen and can't be good since, afaik, the key cache is on heap and you don't really want a heap bigger than 8 GB (or 10-12 GB in some cases). Try with the default heap size and key cache.

4 - Are you querying the whole set at once? You might want to query rows one by one, maybe synchronously, to get back pressure.

Another question: did you use the native protocol or Thrift? (http://www.datastax.com/dev/blog/cassandra-2-1-now-over-50-faster)

BTW, interesting benchmark, but having the right conf matters. Also, you might want to move to 2.1.7, which mainly fixes a memory leak afaik.

C*heers,

Alain

On 25 June 2015 at 01:23, "Zhiyan Shao" <zhiyan.s...@gmail.com> wrote:

> Hi,
>
> we recently experimented read performance on both versions and found read
> is slower in 2.1.6. Here is our setup:
>
> 1. Machines: 3 physical hosts. Each node has 24 cores CPU, 256G memory and
> 8x600GB SAS disks with raid 1.
> 2. Replica is 3 and a billion rows of data is inserted.
> 3. Key cache capacity is increased to 50G on each node.
> 4. Keep querying the same set of a million partition keys in a loop.
>
> Result:
> For 2.0.14, we can get an average of 6 ms while for 2.1.6, we can only get
> 18 ms
>
> It seems key cache hit rate 0.011 is pretty low even though the same set
> of keys were used. Has anybody done similar read performance testing? Could
> you share your results?
>
> Thanks,
> Zhiyan
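
To illustrate point 4 of the reply (querying rows one by one, synchronously), here is a minimal Python sketch. It is only an assumption about how the benchmark loop could look: `query_one_by_one` and `fake_fetch` are hypothetical names, and `fetch` stands in for whatever blocking call your client driver provides. The next request is sent only after the previous one has returned, so the client never has more than one request in flight.

```python
import time
from typing import Callable, Iterable, List


def query_one_by_one(keys: Iterable[str],
                     fetch: Callable[[str], object]) -> List[float]:
    """Fetch each key synchronously; return per-request latencies in ms.

    The loop blocks on every call, so at most one request is in flight
    at any time. That is a simple form of back pressure.
    """
    latencies_ms = []
    for key in keys:
        start = time.perf_counter()
        fetch(key)  # blocks until the result for this key comes back
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def fake_fetch(key: str) -> dict:
    # Hypothetical stand-in for a real, blocking driver call.
    return {"key": key}


if __name__ == "__main__":
    lat = query_one_by_one((f"k{i}" for i in range(1000)), fake_fetch)
    print(f"requests: {len(lat)}, mean latency: {sum(lat) / len(lat):.3f} ms")
```

Swapping `fake_fetch` for a real blocking read gives per-request latencies that can be compared between 2.0.14 and 2.1.6 without the client flooding the cluster.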