Thanks Jeff. Since we have low writes and high reads, most of the time the data lives in memtables only. When I initially noticed the issue there were no sstables on disk at all; everything was still in the memtable.
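As an aside, here is a minimal sketch of the partition-cache idea from Jeff's first reply below, assuming "partition cache" refers to the row cache; "keyspace"."table" stands in for the real (anonymized) names, and the row cache would also need to be sized in cassandra.yaml (row_cache_size_in_mb > 0) before this setting has any effect:

```
-- Illustrative only: cache whole partitions for this small, read-heavy table.
-- With roughly 5k single-row partitions, the entire table fits in a small cache.
ALTER TABLE "keyspace"."table"
    WITH caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'};
```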
On Sat, Feb 23, 2019, 10:01 PM Jeff Jirsa <jji...@gmail.com> wrote:

> Also given your short ttl and low write rate, you may want to think about
> how you can keep more in memory - this may mean larger memtables and higher
> flush thresholds (reading from the memtable), or perhaps the partition
> cache (if you are likely to read the same key multiple times). You’ll also
> probably win some with basic perf and GC tuning, but can’t really do that
> via email. CASSANDRA-8150 has some pointers.
>
> --
> Jeff Jirsa
>
>
> On Feb 23, 2019, at 6:52 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>
> You’ll only ever have one tombstone per read, so your load is based on
> normal read rate, not tombstones. The metric isn’t wrong, but it’s not
> indicative of a problem here given your data model.
>
> You’re using STCS, so you may be reading from more than one sstable if you
> update column2 for a given column1; otherwise you’re probably just seeing
> normal read load. Consider dropping your compression chunk size a bit
> (given the sizes in your cfstats I’d probably go to 4K instead of 64K), and
> maybe consider LCS or TWCS instead of STCS (which one is appropriate
> depends on a lot of factors, but STCS is probably causing a fair bit of
> unnecessary compaction and is probably very slow to expire data).
>
> --
> Jeff Jirsa
>
>
> On Feb 23, 2019, at 6:31 PM, Rahul Reddy <rahulreddy1...@gmail.com> wrote:
>
> Do you see anything wrong with this metric?
>
> Metric to scan tombstones:
>
> increase(cassandra_Table_TombstoneScannedHistogram{keyspace="mykeyspace",Table="tablename",function="Count"}[5m])
>
> And at the same time the CPU spikes to 50% whenever I see a high tombstone
> alert.
>
> On Sat, Feb 23, 2019, 9:25 PM Jeff Jirsa <jji...@gmail.com> wrote:
>
>> Your schema is such that you’ll never read more than one tombstone per
>> select (unless you’re also doing range reads / table scans that you didn’t
>> mention) - I’m not quite sure what you’re alerting on, but you’re not going
>> to have tombstone problems with that table / that select.
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Feb 23, 2019, at 5:55 PM, Rahul Reddy <rahulreddy1...@gmail.com> wrote:
>>
>> Changing gcgs didn't help.
>>
>> CREATE KEYSPACE ksname WITH replication = {'class':
>> 'NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'} AND durable_writes =
>> true;
>>
>> ```
>> CREATE TABLE keyspace."table" (
>>     "column1" text PRIMARY KEY,
>>     "column2" text
>> ) WITH bloom_filter_fp_chance = 0.01
>>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>>     AND comment = ''
>>     AND compaction = {'class':
>>         'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
>>         'max_threshold': '32', 'min_threshold': '4'}
>>     AND compression = {'chunk_length_in_kb': '64', 'class':
>>         'org.apache.cassandra.io.compress.LZ4Compressor'}
>>     AND crc_check_chance = 1.0
>>     AND dclocal_read_repair_chance = 0.1
>>     AND default_time_to_live = 18000
>>     AND gc_grace_seconds = 60
>>     AND max_index_interval = 2048
>>     AND memtable_flush_period_in_ms = 0
>>     AND min_index_interval = 128
>>     AND read_repair_chance = 0.0
>>     AND speculative_retry = '99PERCENTILE';
>> ```
>>
>> Flushed the table and took an sstabledump:
>>
>> ```
>> grep -i '"expired" : true' SSTables.txt | wc -l
>> 16439
>> grep -i '"expired" : false' SSTables.txt | wc -l
>> 2657
>> ```
>>
>> TTL is 4 hours.
>>
>> ```
>> INSERT INTO keyspace."TABLE_NAME" ("column1", "column2") VALUES (?, ?) USING TTL ?;
>> SELECT * FROM keyspace."TABLE_NAME" WHERE "column1" = ?;
>> ```
>>
>> Metric to scan tombstones:
>>
>> increase(cassandra_Table_TombstoneScannedHistogram{keyspace="mykeyspace",Table="tablename",function="Count"}[5m])
>>
>> During peak hours we only have a couple of hundred inserts and 5-8k
>> reads/s per node.
>>
>> ```tablestats
>> Read Count: 605231874
>> Read Latency: 0.021268529760215503 ms.
>> Write Count: 2763352
>> Write Latency: 0.027924007871599422 ms.
>> Pending Flushes: 0
>>         Table: name
>>         SSTable count: 1
>>         Space used (live): 1413203
>>         Space used (total): 1413203
>>         Space used by snapshots (total): 0
>>         Off heap memory used (total): 28813
>>         SSTable Compression Ratio: 0.5015090954531143
>>         Number of partitions (estimate): 19568
>>         Memtable cell count: 573
>>         Memtable data size: 22971
>>         Memtable off heap memory used: 0
>>         Memtable switch count: 6
>>         Local read count: 529868919
>>         Local read latency: 0.020 ms
>>         Local write count: 2707371
>>         Local write latency: 0.024 ms
>>         Pending flushes: 0
>>         Percent repaired: 0.0
>>         Bloom filter false positives: 1
>>         Bloom filter false ratio: 0.00000
>>         Bloom filter space used: 23888
>>         Bloom filter off heap memory used: 23880
>>         Index summary off heap memory used: 4717
>>         Compression metadata off heap memory used: 216
>>         Compacted partition minimum bytes: 73
>>         Compacted partition maximum bytes: 124
>>         Compacted partition mean bytes: 99
>>         Average live cells per slice (last five minutes): 1.0
>>         Maximum live cells per slice (last five minutes): 1
>>         Average tombstones per slice (last five minutes): 1.0
>>         Maximum tombstones per slice (last five minutes): 1
>>         Dropped Mutations: 0
>>
>> histograms
>> Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
>>                            (micros)      (micros)         (bytes)
>> 50%             0.00         20.50         17.08              86            1
>> 75%             0.00         24.60         20.50             124            1
>> 95%             0.00         35.43         29.52             124            1
>> 98%             0.00         35.43         42.51             124            1
>> 99%             0.00         42.51         51.01             124            1
>> Min             0.00          8.24          5.72              73            0
>> Max             1.00         42.51        152.32             124            1
>> ```
>>
>> 3 nodes in dc1 and 3 nodes in dc2. Instance type is AWS EC2 m4.xlarge.
>>
>> On Sat, Feb 23, 2019, 7:47 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>
>>> Would also be good to see your schema (anonymized if needed) and the
>>> select queries you’re running.
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>> On Feb 23, 2019, at 4:37 PM, Rahul Reddy <rahulreddy1...@gmail.com>
>>> wrote:
>>>
>>> Thanks Jeff,
>>>
>>> I have gcgs set to 10 mins and have also changed the table-level TTL to
>>> 5 hours, versus the insert TTL of 4 hours. Tracing doesn't show any
>>> tombstone scans for the reads, and the logs don't show tombstone scan
>>> warnings either. Yet while reads run at 5-8k per node during peak hours,
>>> the metric shows a tombstone scan count of about 1M.
>>>
>>> On Fri, Feb 22, 2019, 11:46 AM Jeff Jirsa <jji...@gmail.com> wrote:
>>>
>>>> If all of your data is TTL’d and you never explicitly delete a cell
>>>> without using a TTL, you can probably drop your GCGS to 1 hour (or less).
>>>>
>>>> Which compaction strategy are you using? You need a way to clear out
>>>> those tombstones. There exist tombstone compaction sub-properties that can
>>>> help encourage compaction to grab sstables just because they’re full of
>>>> tombstones, which will probably help you.
>>>>
>>>> --
>>>> Jeff Jirsa
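In case it is useful, here is a minimal sketch of those tombstone compaction sub-properties applied to the existing STCS configuration. The threshold and interval values are illustrative assumptions rather than recommendations from this thread, and "keyspace"."table" again stands in for the anonymized names:

```
-- Illustrative only: keep STCS but allow single-sstable compactions whenever an
-- sstable's estimated droppable-tombstone ratio exceeds the threshold, instead
-- of waiting for a full size-tiered bucket to form.
ALTER TABLE "keyspace"."table"
    WITH compaction = {
        'class': 'SizeTieredCompactionStrategy',
        'unchecked_tombstone_compaction': 'true',
        'tombstone_threshold': '0.2',
        'tombstone_compaction_interval': '3600'
    };
```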
>>>> On Feb 22, 2019, at 8:37 AM, Kenneth Brotman
>>>> <kenbrot...@yahoo.com.invalid> wrote:
>>>>
>>>> Can we see the histogram? Why wouldn’t you at times have that many
>>>> tombstones? Makes sense.
>>>>
>>>> Kenneth Brotman
>>>>
>>>> *From:* Rahul Reddy [mailto:rahulreddy1...@gmail.com]
>>>> *Sent:* Thursday, February 21, 2019 7:06 AM
>>>> *To:* user@cassandra.apache.org
>>>> *Subject:* Tombstones in memtable
>>>>
>>>> We have a small table; records number about 5k.
>>>>
>>>> All the inserts come with a 4-hour TTL, the table-level TTL is 1 day,
>>>> and gc_grace_seconds is 3 hours. We do 5k reads a second during peak
>>>> load, and during that peak we are seeing alerts for the tombstone
>>>> scanned histogram reaching a million.
>>>>
>>>> Cassandra version is 3.11.1. Please let me know how this tombstone
>>>> scanning in the memtable can be avoided.
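For completeness, Jeff's earlier suggestions in this thread (4 KB compression chunks, and TWCS rather than STCS) would look roughly like the sketch below. The window unit and size are illustrative assumptions sized against the 4-hour TTL, not values given anywhere in the thread, and the anonymized names are reused:

```
-- Illustrative only: smaller compression chunks for these ~100-byte partitions,
-- and time-window compaction so whole sstables can be dropped once everything
-- in a window has expired past gc_grace_seconds.
ALTER TABLE "keyspace"."table"
    WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': '4'};

ALTER TABLE "keyspace"."table"
    WITH compaction = {
        'class': 'TimeWindowCompactionStrategy',
        'compaction_window_unit': 'HOURS',
        'compaction_window_size': '1'
    };
```

Whether TWCS or LCS is the better fit depends on the factors Jeff mentions above, in particular whether column2 is ever rewritten for an existing column1.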