This comment and some testing were enough for us: "Generally, a value between 128 and 512 here coupled with a large key cache size on CFs results in the best trade offs. This value is not often changed, however if you have many very small rows (many to an OS page), then increasing this will often lower memory usage without an impact on performance."
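
Concretely, it is a one-line change in cassandra.yaml. The line in our file now looks roughly like this (the comments are mine, just to spell out the effect; the rest of the file is as described in my first message, quoted below):

index_interval: 512
# default is 128: one row key out of every index_interval keys is kept in
# memory per SSTable, so 512 holds 4 times fewer index samples on the heap

A quick way to compare two nodes without pulling up graphs is "nodetool info", which prints the heap currently used out of the total.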
And indeed, I started using this config on only one node, without seeing any performance degradation. Mean read latency was around 4 ms on all the servers, including this one. And the heap no longer fills up: heap usage now goes from 2.5 GB to 5.5 GB, growing slowly, instead of staying stuck between 5.0 GB and 6.5 GB (out of an 8 GB heap). All the graphs I could compare while the two configurations (128 / 512) were running on different servers were almost identical, except for the heap. So 512 was a lot better in our case.

Hope it will help you, since that was also the purpose of this thread.

Alain

2013/7/9 Mike Heffner <m...@librato.com>

> I'm curious, because we are experimenting with a very similar
> configuration: what basis did you use for expanding the index_interval to
> that value? Do you have before and after numbers, or was it simply the
> reduction of the heap pressure warnings that you looked for?
>
> thanks,
>
> Mike
>
> On Tue, Jul 9, 2013 at 10:11 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>
>> Hi,
>>
>> Using C* 1.2.2.
>>
>> We recently dropped our 18 m1.xlarge (4 CPU, 15 GB RAM, 4 RAID-0 disks)
>> servers to get 3 hi1.4xlarge (16 CPU, 60 GB RAM, 2 RAID-0 SSD) servers
>> instead, for about the same price.
>>
>> We tried it after reading some benchmarks published by Netflix.
>>
>> It is awesome and I recommend it to anyone who is using more than 18
>> xlarge servers or can afford these high-cost / high-performance EC2
>> instances. The SSDs give very good throughput with awesome latency.
>>
>> Yet we had about 200 GB of data per server before, and now about 1 TB.
>>
>> To alleviate memory pressure inside the heap I had to reduce the index
>> sampling: I changed the index_interval value from 128 to 512, with no
>> visible impact on latency, but a great improvement inside the heap,
>> which doesn't complain about any pressure anymore.
>>
>> Is there some more tuning I could use, more tricks that could be useful
>> on big servers, with a lot of data per node and relatively high
>> throughput?
>>
>> The SSDs are at 20-40 % of their throughput capacity (according to
>> OpsCenter), the CPU load almost never goes above 5 or 6 (with 16 CPUs),
>> and 15 GB of RAM is used out of 60 GB.
>>
>> At this point I have kept my previous configuration, which is almost the
>> default one from the DataStax community AMI. Here is part of it; you can
>> consider that any property not listed here is configured as default:
>>
>> cassandra.yaml
>>
>> key_cache_size_in_mb: (empty) - so the default, 100 MB (hit rate between
>> 88 % and 92 %, good enough?)
>> row_cache_size_in_mb: 0 (not usable in our use case: a lot of different
>> and random reads)
>> flush_largest_memtables_at: 0.80
>> reduce_cache_sizes_at: 0.90
>>
>> concurrent_reads: 32 (I am thinking of increasing this to 64 or more,
>> since I have just a few servers to handle all the concurrency)
>> concurrent_writes: 32 (I am thinking of increasing this to 64 or more too)
>> memtable_total_space_in_mb: 1024 (to avoid having a full heap; should I
>> use a bigger value, and why?)
>>
>> rpc_server_type: sync (I tried hsha and got the "ERROR 12:02:18,971 Read
>> an invalid frame size of 0. Are you using TFramedTransport on the client
>> side?" error. I have no idea how to fix this, and I use 5 different
>> clients for different purposes: Hector, Cassie, phpCassa, Astyanax,
>> Helenus...)
>>
>> multithreaded_compaction: false (Should I try enabling this since I now
>> use SSDs?)
>> compaction_throughput_mb_per_sec: 16 (I will definitely raise this to 32
>> or even more)
>>
>> cross_node_timeout: true
>> endpoint_snitch: Ec2MultiRegionSnitch
>>
>> index_interval: 512
>>
>> cassandra-env.sh
>>
>> I am not sure about how to tune the heap, so I mainly use the defaults:
>>
>> MAX_HEAP_SIZE="8G"
>> HEAP_NEWSIZE="400M" (I tried higher values and they produced longer GC
>> pauses: 1600 ms instead of the < 200 ms I get now with 400M)
>>
>> -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC
>> -XX:+CMSParallelRemarkEnabled
>> -XX:SurvivorRatio=8
>> -XX:MaxTenuringThreshold=1
>> -XX:CMSInitiatingOccupancyFraction=70
>> -XX:+UseCMSInitiatingOccupancyOnly
>>
>> Does this configuration seem coherent? Right now performance is correct,
>> with latency < 5 ms almost all the time. What can I do to handle more
>> data per node and keep this level of performance, or even improve it?
>>
>> I know this is a long message, but if you have any comment or insight,
>> even on part of it, don't hesitate to share it. I guess this kind of
>> feedback on configuration is useful for the whole community.
>>
>> Alain
>>
>
>
> --
>
> Mike Heffner <m...@librato.com>
> Librato, Inc.
>
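
PS, on the compaction_throughput_mb_per_sec line quoted above: if you want to try 32 before putting it in cassandra.yaml, the throttle can also be changed on a live node with nodetool, no restart needed (it reverts to the yaml value at the next restart):

nodetool -h localhost setcompactionthroughput 32

I have not compared 16 and 32 here, so take it only as a convenient way to test, not as a recommendation.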