Re: SSTable size versus read performance

Edward Capriolo Thu, 16 May 2013 07:23:59 -0700

I am not sure if the new default is to use compression, but I do not
believe compression is a good default. I find compression works better for
larger column families that are sparsely read. For high-throughput CFs I
feel that decompressing larger blocks hurts performance more than
compression helps.



On Thu, May 16, 2013 at 10:14 AM, Keith Wright <kwri...@nanigans.com> wrote:

> Hi all,
>
>     I currently have 2 clusters, one running on 1.1.10 using CQL2 and one
> running on 1.2.4 using CQL3 and Vnodes.   The machines in the 1.2.4 cluster
> are expected to have better IO performance as we are going from 1 SSD data
> disk per node in the 1.1 cluster to 3 SSD data disks per node in the 1.2
> cluster with higher end drives (commit logs are on their own disk shared
> with the OS).  I am doing some stress testing on the 1.2 cluster and have
> found that although the reads / sec as seen from iostat are approximately
> the same (3K / sec) in both clusters, the MB/s read in the new cluster is
> MUCH higher (7 MB/s in 1.1 as compared to 30-50 MB/s in 1.2).  As a result,
> I am seeing excessive iowait in the 1.2 cluster causing high average read
> times of 30 ms under the same load (1.1 cluster sees around 5 ms).  They
> are both using Leveled compaction but one thing I did change in the new
> cluster was to increase the sstable size from the OOTB setting to 32 MB.
> Note that my reads are by definition highly random as we are running
> memcached in front for various reasons.  Does Cassandra need to read the
> entire SSTable when fetching a row or only the relevant chunk (I have the
> OOTB chunk size and BF settings)?  I just decreased the sstable size to 5
> MB and am waiting for compactions to complete to see if that makes a
> difference.
>
> Thanks!
>
> Relevant table definition if helpful (note that I also changed to the LZ4
> compressor expecting better read performance and I decreased the crc check
> chance, again to minimize read latency):
>
> CREATE TABLE global_user (
> user_id BIGINT,
> app_id INT,
> type TEXT,
> name TEXT,
> last TIMESTAMP,
> paid BOOLEAN,
> values map<TIMESTAMP,FLOAT>,
> sku_time map<TEXT,TIMESTAMP>,
> extra_param map<TEXT,TEXT>,
> PRIMARY KEY (user_id, app_id, type, name)
> ) with compression = {'crc_check_chance':0.1,'sstable_compression':'LZ4Compressor'}
> and compaction = {'class':'LeveledCompactionStrategy'}
> and compaction_strategy_options = {'sstable_size_in_mb':5}
> and gc_grace_seconds = 86400;
>
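
As a rough sanity check on the iostat figures quoted above, dividing the
read bandwidth by the read rate gives the average amount of data pulled
from disk per read. This is only a sketch: the function name is made up for
illustration, and it assumes "MB" means 2**20 bytes and that the 3K reads/sec
figure applies equally to both clusters.

```python
# Rough arithmetic on the iostat figures quoted in this thread.
# Assumption: "MB" is MiB (2**20 bytes); swap 1024.0 for 1000.0 if your
# iostat reports decimal megabytes.

def avg_kib_per_read(mb_per_sec, reads_per_sec):
    """Average KiB transferred from disk per read request."""
    return mb_per_sec * 1024.0 / reads_per_sec

READS_PER_SEC = 3000  # "3K / sec", reported as about the same on both clusters

for label, mb_per_sec in [("1.1 cluster", 7),
                          ("1.2 cluster (low)", 30),
                          ("1.2 cluster (high)", 50)]:
    kib = avg_kib_per_read(mb_per_sec, READS_PER_SEC)
    print("%-20s %5.1f KiB per read" % (label, kib))
```

Under those assumptions the 1.1 cluster averages a little over 2 KiB per
read, while the 1.2 cluster averages roughly 10 to 17 KiB. Each read on 1.2
is therefore pulling several times more data from disk, which fits with
larger blocks being read and decompressed per request, although the numbers
alone do not show which setting is responsible.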
