I am not sure whether the new default is to use compression, but I do not believe compression is a good default. I find compression works better for larger column families that are sparsely read. For high-throughput CFs, I feel that decompressing larger blocks hurts performance more than compression helps.
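As a rough sanity check, the iostat figures quoted below can be turned into an average amount of data transferred per read operation. This is only a back-of-the-envelope sketch: the helper name `avg_kb_per_read` is made up for illustration, it assumes 1 MB = 1024 KB, and it takes the approximate figures from the original message (3K reads/sec, 7 MB/s on 1.1, 30-50 MB/s on 1.2) at face value.

```python
# Approximate figures quoted from the original message (iostat output).
READS_PER_SEC = 3_000  # roughly the same in both clusters


def avg_kb_per_read(mb_per_sec, reads_per_sec=READS_PER_SEC):
    """Average KB transferred per read op (assumes 1 MB = 1024 KB)."""
    return mb_per_sec * 1024 / reads_per_sec


old_cluster = avg_kb_per_read(7)    # 1.1 cluster
new_low = avg_kb_per_read(30)       # 1.2 cluster, low end of the range
new_high = avg_kb_per_read(50)      # 1.2 cluster, high end of the range

print(f"1.1 cluster: {old_cluster:.1f} KB per read")
print(f"1.2 cluster: {new_low:.1f} to {new_high:.1f} KB per read")
print(f"ratio: {new_low / old_cluster:.1f}x to {new_high / old_cluster:.1f}x")
```

Under those assumptions the 1.2 cluster moves roughly four to seven times as much data per read as the 1.1 cluster. That would be consistent with each random read pulling in and decompressing a larger block, though the numbers alone do not prove that compression is the cause.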
On Thu, May 16, 2013 at 10:14 AM, Keith Wright <kwri...@nanigans.com> wrote:
> Hi all,
>
> I currently have 2 clusters, one running on 1.1.10 using CQL2 and one
> running on 1.2.4 using CQL3 and Vnodes. The machines in the 1.2.4 cluster
> are expected to have better IO performance as we are going from 1 SSD data
> disk per node in the 1.1 cluster to 3 SSD data disks per node in the 1.2
> cluster with higher end drives (commit logs are on their own disk shared
> with the OS). I am doing some stress testing on the 1.2 cluster and have
> found that although the reads / sec as seen from iostat are approximately
> the same (3K / sec) in both clusters, the MB/s read in the new cluster is
> MUCH higher (7 MB/s in 1.1 as compared to 30-50 MB/s in 1.2). As a result,
> I am seeing excessive iowait in the 1.2 cluster causing high average read
> times of 30 ms under the same load (1.1 cluster sees around 5 ms). They
> are both using Leveled compaction but one thing I did change in the new
> cluster was to increase the sstable size from the OOTB setting to 32 MB.
> Note that my reads are by definition highly random as we are running
> memcached in front for various reasons. Does cassandra need to read the
> entire SSTable when fetching a row or only the relevant chunk (I have the
> OOTB chunk size and BF settings)? I just decreased the sstable size to 5
> MB and am waiting for compactions to complete to see if that makes a
> difference.
>
> Thanks!
>
> Relevant table definition if helpful (note that I also changed to the LZ4
> compressor expecting better read performance and I decreased the crc check
> chance again to minimize read latency):
>
> CREATE TABLE global_user (
>   user_id BIGINT,
>   app_id INT,
>   type TEXT,
>   name TEXT,
>   last TIMESTAMP,
>   paid BOOLEAN,
>   values map<TIMESTAMP,FLOAT>,
>   sku_time map<TEXT,TIMESTAMP>,
>   extra_param map<TEXT,TEXT>,
>   PRIMARY KEY (user_id, app_id, type, name)
> ) with
> compression={'crc_check_chance':0.1,'sstable_compression':'LZ4Compressor'}
> and
> compaction={'class':'LeveledCompactionStrategy'} and
> compaction_strategy_options = {'sstable_size_in_mb':5} and
> gc_grace_seconds = 86400;
>