Hi

I am currently running a Cassandra cluster of 3 nodes (with data replicated to 
both other nodes) and am experiencing poor read performance: queries usually 
take seconds to respond when I am expecting, and need, millisecond response 
times. I have a table which looks like:

CREATE TABLE tracker.all_ad_impressions_counter_1d (
    time_bucket bigint,
    ad_id text,
    uc text,
    count counter,
    PRIMARY KEY ((time_bucket, ad_id), uc)
) WITH CLUSTERING ORDER BY (uc ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'base_time_seconds': '3600',
        'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy',
        'max_sstable_age_days': '30', 'max_threshold': '32', 'min_threshold': '4',
        'timestamp_resolution': 'MILLISECONDS'}
    AND compression = {'chunk_length_in_kb': '64',
        'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';


and queries which look like:

        SELECT
            time_bucket,
            uc,
            count
        FROM
            all_ad_impressions_counter_1d
        WHERE ad_id = ?
            AND time_bucket = ?

The cluster is running on servers with 16 GB RAM, 4 CPU cores and three 100 GB 
datastores. The storage is not local, and the VMs are managed through 
OpenStack. Roughly 200 million records are written per day (1 time_bucket), 
with maybe a few thousand records per partition (time_bucket, ad_id) at most. 
The write load does not seem to have a significant effect on read performance: 
when writes are stopped, the read response time does not improve noticeably. I 
have attached a trace of one query which took around 3 seconds; I would expect 
it to take well below a second. I have also included the cassandra.yaml file 
and the JVM options file. We do intend to move to local storage and expect 
this to have a significant impact, but I was wondering whether there is 
anything else that could be changed that would also have a significant impact 
on read performance?
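
For a sense of scale, here is a rough back-of-envelope sketch of what the 
figures above work out to. Only the 200 million records per day comes from the 
numbers quoted; the 3,000 rows per partition is an assumed stand-in for "a few 
thousand", and treating every record as a distinct row is also an assumption 
(counter updates may well hit the same row repeatedly), so the partition count 
is only indicative.

```python
# Back-of-envelope figures derived from the numbers quoted in this message.
RECORDS_PER_DAY = 200_000_000   # stated: roughly 200 million records per day
ROWS_PER_PARTITION = 3_000      # ASSUMPTION: stand-in for "a few thousand"
SECONDS_PER_DAY = 24 * 60 * 60


def avg_writes_per_second(records_per_day: int) -> float:
    """Average write rate if the daily volume were spread evenly over the day."""
    return records_per_day / SECONDS_PER_DAY


def partitions_per_day(records_per_day: int, rows_per_partition: int) -> int:
    """Partitions needed per day if every record were a distinct row and every
    partition held rows_per_partition rows (ceiling division)."""
    return -(-records_per_day // rows_per_partition)


if __name__ == "__main__":
    rate = avg_writes_per_second(RECORDS_PER_DAY)
    parts = partitions_per_day(RECORDS_PER_DAY, ROWS_PER_PARTITION)
    print(f"average writes per second: {rate:.0f}")
    print(f"partitions per time_bucket (under the assumptions above): {parts}")
```

Real traffic is unlikely to be spread evenly, so peak write rates will be 
higher than the average shown here.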

Thanks
Ian

Attachment: cassandra.yaml

Attachment: jvm_options.sh

Attachment: Query 2.xlsx
