Hi,

Using C* 1.2.2.
We recently replaced our 18 m1.xlarge servers (4 CPUs, 15 GB RAM, 4 RAID-0 disks) with 3 hi1.4xlarge servers (16 CPUs, 60 GB RAM, 2 RAID-0 SSDs), for about the same price. We tried it after reading a benchmark published by Netflix. It is awesome, and I recommend it to anyone running more than 18 xlarge servers, or who can afford these high-cost / high-performance EC2 instances. The SSDs give very good throughput with excellent latency.

However, we went from about 200 GB of data per server to about 1 TB. To relieve memory pressure inside the heap I had to reduce the index sampling: I changed index_interval from 128 to 512, with no visible impact on latency but a great improvement inside the heap, which no longer complains about any pressure.

Is there more tuning I could do, more tricks that would be useful on big servers with a lot of data per node and relatively high throughput? The SSDs are at 20-40% of their throughput capacity (according to OpsCenter), the CPU load almost never exceeds 5 or 6 (on 16 cores), and only 15 GB of the 60 GB of RAM is used.

So far I have kept my previous configuration, which is almost the default one from the DataStax community AMI. Here is part of it; assume that any property not listed is at its default:

cassandra.yaml:

  key_cache_size_in_mb: (empty, so the default of 100 MB; hit rate between 88% and 92%, is that good enough?)
  row_cache_size_in_mb: 0 (not usable in our case: a lot of different, random reads)
  flush_largest_memtables_at: 0.80
  reduce_cache_sizes_at: 0.90
  concurrent_reads: 32 (I am thinking of increasing this to 64 or more, since I now have just a few servers each handling more concurrency)
  concurrent_writes: 32 (thinking of increasing this to 64 or more as well)
  memtable_total_space_in_mb: 1024 (to avoid filling the heap; should I use a bigger value, and what would it buy me?)
  rpc_server_type: sync (I tried hsha and got "ERROR 12:02:18,971 Read an invalid frame size of 0. Are you using TFramedTransport on the client side?". I have no idea how to fix this, and I use 5 different clients for different purposes: Hector, Cassie, phpCassa, Astyanax, Helenus...)
  multithreaded_compaction: false (should I try enabling this now that I am on SSDs?)
  compaction_throughput_mb_per_sec: 16 (I will definitely raise this to 32 or even more)
  cross_node_timeout: true
  endpoint_snitch: Ec2MultiRegionSnitch
  index_interval: 512

cassandra-env.sh (I am not sure how to tune the heap, so I mostly use the defaults):

  MAX_HEAP_SIZE="8G"
  HEAP_NEWSIZE="400M" (I tried higher values and got longer GC pauses: 1600 ms instead of the < 200 ms I see now with 400M)
  -XX:+UseParNewGC
  -XX:+UseConcMarkSweepGC
  -XX:+CMSParallelRemarkEnabled
  -XX:SurvivorRatio=8
  -XX:MaxTenuringThreshold=1
  -XX:CMSInitiatingOccupancyFraction=70
  -XX:+UseCMSInitiatingOccupancyOnly

Does this configuration seem coherent? Right now performance is good, with latency under 5 ms almost all the time. What can I do to handle more data per node while keeping this level of performance, or even improving it? (I have put concrete snippets of the changes I am considering in a PS below.)

I know this is a long message, but if you have any comment or insight, even on only part of it, don't hesitate to share it. I imagine this kind of configuration discussion is useful to the whole community.

Alain
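PS: to make a few of the questions above more concrete, here is what I am considering. None of it is applied yet, so treat these as sketches; the keyspace and column family names are just placeholders.

Before raising concurrent_reads / concurrent_writes, I want to confirm that the read and mutation stages are actually backing up:

    # Pending or blocked tasks in ReadStage / MutationStage would justify
    # raising concurrent_reads / concurrent_writes; all-zero "Pending" would not.
    nodetool -h localhost tpstats

    # Per-CF latency distribution, to compare before and after each change
    # (MyKeyspace / MyColumnFamily are placeholders).
    nodetool -h localhost cfhistograms MyKeyspace MyColumnFamily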
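If the pools do show pending tasks, these are the cassandra.yaml values I have in mind. They start from the rules of thumb in the default yaml comments (16 * number_of_drives for reads, 8 * number_of_cores for writes), adjusted upward for SSD; the exact numbers are my guesses, feedback welcome:

    concurrent_reads: 64        # 16 * 2 drives would only give 32, but SSDs
                                # should tolerate far more parallel reads
    concurrent_writes: 128      # 8 * 16 cores
    compaction_throughput_mb_per_sec: 32   # doubling from 16; the disks are
                                           # only at 20-40% of capacity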
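Likewise, before flipping multithreaded_compaction on, I would first check whether compaction even falls behind at the current throttle:

    # If pending compactions stay near zero, compaction keeps up and
    # multithreaded_compaction probably has little to offer.
    nodetool -h localhost compactionstats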
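For the key cache, since only 15 GB of the 60 GB of RAM is used, one experiment could be to size it explicitly and see whether the hit rate climbs above the current 88-92% (the 512 figure is an arbitrary guess, not a recommendation):

    key_cache_size_in_mb: 512    # up from the 100 MB default; watch the hit
                                 # rate via nodetool info or OpsCenter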
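And for the heap, rather than jumping straight to a much larger HEAP_NEWSIZE again, I may retry in smaller steps with GC logging enabled, so pause times can be measured instead of guessed. The 800M value is just a hypothetical next step; the logging flags are standard HotSpot options appended to JVM_OPTS in cassandra-env.sh:

    MAX_HEAP_SIZE="8G"
    HEAP_NEWSIZE="800M"   # hypothetical step up from 400M; revert if ParNew pauses grow

    # GC logging, to compare pause times between settings:
    JVM_OPTS="$JVM_OPTS -verbose:gc"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
    JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"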