Hi,

I am currently running a Cassandra cluster of 3 nodes (with data replicated to both of the other nodes) and am experiencing poor read performance: queries usually take seconds to respond, when I expect and need millisecond response times. I have a table which looks like:

    CREATE TABLE tracker.all_ad_impressions_counter_1d (
        time_bucket bigint,
        ad_id text,
        uc text,
        count counter,
        PRIMARY KEY ((time_bucket, ad_id), uc)
    ) WITH CLUSTERING ORDER BY (uc ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
        AND comment = ''
        AND compaction = {'base_time_seconds': '3600', 'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy', 'max_sstable_age_days': '30', 'max_threshold': '32', 'min_threshold': '4', 'timestamp_resolution': 'MILLISECONDS'}
        AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.1
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99PERCENTILE';

and queries which look like:

    SELECT time_bucket, uc, count
    FROM all_ad_impressions_counter_1d
    WHERE ad_id = ? AND time_bucket = ?

The cluster is running on servers with 16 GB RAM, 4 CPU cores and three 100 GB datastores each. The storage is not local, and the VMs are managed through OpenStack.

Roughly 200 million records are written per day (one time_bucket), with at most a few thousand records per partition (time_bucket, ad_id). The write load does not seem to have a significant effect on read performance: when writes are stopped, the read response time does not improve noticeably.

I have attached a trace of one query I ran which took around 3 seconds; I would expect it to take well below a second. I have also included the cassandra.yaml file and the JVM options file.

We do intend to move to local storage and expect that to have a significant impact, but I was wondering whether there is anything else we could change that would also significantly improve read performance?

Thanks,
Ian
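
P.S. As a rough sanity check on the write volume described above, here is a back-of-envelope calculation in Python. It assumes the writes are spread evenly across the day, which is an assumption and not something I have measured; the function name is only illustrative.

```python
# Back-of-envelope average write rate from the figures in this post.
# Assumption: writes are spread evenly over the day (real traffic is
# probably burstier, so peak rates may be noticeably higher).

RECORDS_PER_DAY = 200_000_000      # from the post: ~200 million writes per day
SECONDS_PER_DAY = 24 * 60 * 60     # 86,400 seconds


def average_writes_per_second(records_per_day: int) -> float:
    """Return the cluster-wide average write rate in writes per second."""
    return records_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    rate = average_writes_per_second(RECORDS_PER_DAY)
    print(f"Average cluster-wide write rate: {rate:.0f} writes/s")
```

Under that even-spread assumption the cluster sees a little over 2,300 writes per second on average.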
cassandra.yaml
jvm_options.sh
Query 2.xlsx