Hi, I am by no means an expert on Cassandra, nor on DateTieredCompactionStrategy. However, looking in "Query 2.xlsx" I see a lot of lines like this one:
    Partition index with 0 entries found for sstable 186

To me, that looks like Cassandra is inspecting a lot of sstables and only realizing too late that they don't contain any relevant data.

Are you using TTLs when you write data? Do the TTLs vary? If they do, there is a risk that Cassandra has to inspect a lot of sstables that turn out to hold only expired data.

Also, have you checked `nodetool cfstats` and its bloom filter false positive numbers? Does `nodetool cfhistograms` give you any insights? I'm mostly thinking in terms of unbalanced partition keys here. Have you checked the logs for how long the GC pauses are?

Somewhat implementation specific: would adjusting the time bucket to a smaller time resolution be an option? Also, since you are using DateTieredCompactionStrategy, have you considered using a TIMESTAMP constraint [1]? That might actually help you a lot.
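Just to make the "smaller time bucket" idea a bit more concrete, here is a rough, untested sketch of what an hourly-bucketed variant of your table could look like. The table name all_ad_impressions_counter_1h and the hourly resolution are placeholders I made up to illustrate the idea, not something I have benchmarked, and I have left most table options at their defaults for brevity:

    -- Hypothetical sketch: same layout as all_ad_impressions_counter_1d,
    -- but with the partition key bucketed per hour instead of per day,
    -- so each read hits a smaller and more recently written partition.
    CREATE TABLE tracker.all_ad_impressions_counter_1h (
        time_bucket bigint,   -- e.g. hours since epoch instead of days
        ad_id text,
        uc text,
        count counter,
        PRIMARY KEY ((time_bucket, ad_id), uc)
    ) WITH CLUSTERING ORDER BY (uc ASC)
        AND compaction = {
            'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy',
            'base_time_seconds': '3600',
            'max_sstable_age_days': '30',
            'timestamp_resolution': 'MILLISECONDS'};

    -- A daily view then has to read each hourly bucket and sum client side:
    SELECT time_bucket, uc, count
    FROM tracker.all_ad_impressions_counter_1h
    WHERE time_bucket = ? AND ad_id = ?;

The obvious trade-off is the client-side aggregation over 24 partitions per ad and day, but each individual read stays within a single, much smaller partition.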
[1] https://issues.apache.org/jira/browse/CASSANDRA-5514

Cheers,
Jens

On Mon, Oct 31, 2016 at 11:10 PM, _ _ <rage...@hotmail.com> wrote:

> Hi
>
> Currently I am running a Cassandra cluster of 3 nodes (with data replicating
> to both other nodes) and am experiencing poor performance, usually getting
> multi-second response times on queries where I expect/need millisecond
> response times. Currently I have a table which looks like:
>
> CREATE TABLE tracker.all_ad_impressions_counter_1d (
>     time_bucket bigint,
>     ad_id text,
>     uc text,
>     count counter,
>     PRIMARY KEY ((time_bucket, ad_id), uc)
> ) WITH CLUSTERING ORDER BY (uc ASC)
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'base_time_seconds': '3600',
>         'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy',
>         'max_sstable_age_days': '30', 'max_threshold': '32', 'min_threshold': '4',
>         'timestamp_resolution': 'MILLISECONDS'}
>     AND compression = {'chunk_length_in_kb': '64',
>         'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
>
> and queries which look like:
>
> SELECT time_bucket, uc, count
> FROM all_ad_impressions_counter_1d
> WHERE ad_id = ?
>     AND time_bucket = ?
>
> The cluster is running on servers with 16 GB RAM, 4 CPU cores and 3 x 100 GB
> datastores; the storage is not local and these VMs are managed through
> OpenStack. There are roughly 200 million records written per day (one
> time_bucket) and at most a few thousand records per partition (time_bucket,
> ad_id). The amount of writes is not having a significant effect on our read
> performance, as the read response time does not improve noticeably when
> writes are stopped. I have attached a trace of one query I ran which took
> around 3 seconds and which I would expect to take well below a second. I have
> also included the cassandra.yaml file and the JVM options file. We do intend
> to change to local storage and expect this will have a significant impact,
> but I was wondering if there is anything else that could be changed which
> would also have a significant impact on read performance?
>
> Thanks
> Ian

--
Jens Rantil
Backend engineer
Tink AB

Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se

Facebook <https://www.facebook.com/#!/tink.se>
Linkedin <http://www.linkedin.com/company/2735919?trk=vsrp_companies_res_photo&trkInfo=VSRPsearchId%3A1057023381369207406670%2CVSRPtargetId%3A2735919%2CVSRPcmpt%3Aprimary>
Twitter <https://twitter.com/tink>