nodetool cfhistograms is also very helpful in diagnosing these kinds of data modelling issues.
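For example, run against the keyspace and table discussed in this thread (names taken from the CREATE TABLE quoted further down), a minimal invocation would be:

    nodetool cfhistograms default metrics

Its partition-size percentiles make outliers like the 36 GB maximum partition reported below easy to spot.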
On 23 March 2015 at 14:43, Chris Lohfink <clohfin...@gmail.com> wrote:

>> Compacted partition maximum bytes: 36904729268
>
> That's huge... 36 GB partitions are going to cause a lot of problems: even when
> you ask for a precise cell inside one, Cassandra still has to deserialize an
> enormous column index on every read of that partition. As mentioned above, you
> should include your attribute name in the partition key ((row_time, attrs)) to
> spread this out... I'd call that critical.
>
> Chris
>
> On Mon, Mar 23, 2015 at 4:13 PM, Dave Galbraith <david92galbra...@gmail.com> wrote:
>
>> I haven't deleted anything. Here's output from a traced cqlsh query (I
>> tried to make the spaces line up, hope it's legible):
>>
>> activity | timestamp | source | source_elapsed
>> ----------------------------------------------
>> Execute CQL3 query | 2015-03-23 21:04:37.422000 | 172.31.32.211 | 0
>> Parsing select * from default.metrics where row_time = 16511 and attrs = '[redacted]' limit 100; [SharedPool-Worker-2] | 2015-03-23 21:04:37.423000 | 172.31.32.211 | 93
>> Preparing statement [SharedPool-Worker-2] | 2015-03-23 21:04:37.423000 | 172.31.32.211 | 696
>> Executing single-partition query on metrics [SharedPool-Worker-1] | 2015-03-23 21:04:37.425000 | 172.31.32.211 | 2807
>> Acquiring sstable references [SharedPool-Worker-1] | 2015-03-23 21:04:37.425000 | 172.31.32.211 | 2993
>> Merging memtable tombstones [SharedPool-Worker-1] | 2015-03-23 21:04:37.426000 | 172.31.32.211 | 3049
>> Partition index with 484338 entries found for sstable 15966 [SharedPool-Worker-1] | 2015-03-23 21:04:38.625000 | 172.31.32.211 | 202304
>> Seeking to partition indexed section in data file [SharedPool-Worker-1] | 2015-03-23 21:04:38.625000 | 172.31.32.211 | 202354
>> Bloom filter allows skipping sstable 5613 [SharedPool-Worker-1] | 2015-03-23 21:04:38.625000 | 172.31.32.211 | 202445
>> Bloom filter allows skipping sstable 5582 [SharedPool-Worker-1] | 2015-03-23 21:04:38.625000 | 172.31.32.211 | 202478
>> Bloom filter allows skipping sstable 5611 [SharedPool-Worker-1] | 2015-03-23 21:04:38.625000 | 172.31.32.211 | 202508
>> Bloom filter allows skipping sstable 5610 [SharedPool-Worker-1] | 2015-03-23 21:04:38.625000 | 172.31.32.211 | 202539
>> Bloom filter allows skipping sstable 5549 [SharedPool-Worker-1] | 2015-03-23 21:04:38.625001 | 172.31.32.211 | 202678
>> Bloom filter allows skipping sstable 5544 [SharedPool-Worker-1] | 2015-03-23 21:04:38.625001 | 172.31.32.211 | 202720
>> Bloom filter allows skipping sstable 5237 [SharedPool-Worker-1] | 2015-03-23 21:04:38.625001 | 172.31.32.211 | 202752
>> Bloom filter allows skipping sstable 2516 [SharedPool-Worker-1] | 2015-03-23 21:04:38.625001 | 172.31.32.211 | 202782
>> Bloom filter allows skipping sstable 2632 [SharedPool-Worker-1] | 2015-03-23 21:04:38.625001 | 172.31.32.211 | 202812
>> Bloom filter allows skipping sstable 3015 [SharedPool-Worker-1] | 2015-03-23 21:04:38.625001 | 172.31.32.211 | 202852
>> Skipped 0/11 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-1] | 2015-03-23 21:04:38.625001 | 172.31.32.211 | 202882
>> Merging data from memtables and 1 sstables [SharedPool-Worker-1] | 2015-03-23 21:04:38.625001 | 172.31.32.211 | 202902
>> Read 101 live and 0 tombstoned cells [SharedPool-Worker-1] | 2015-03-23 21:04:38.626000 | 172.31.32.211 | 203752
>> Request complete | 2015-03-23 21:04:38.628253 | 172.31.32.211 | 206253
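Concretely, Chris's ((row_time, attrs)) suggestion means moving attrs into the partition key so that each day/series pair becomes its own partition. A minimal sketch of what that could look like, keeping only the compaction and compression options from the original definition (the table name is a placeholder for illustration, and changing a partition key means creating a new table and rewriting the existing data, not an ALTER):

    CREATE TABLE "default".metrics_by_attr (
        row_time int,
        attrs    varchar,
        offset   int,
        value    double,
        PRIMARY KEY ((row_time, attrs), offset)
    ) WITH COMPACT STORAGE
      AND compaction = {'class': 'DateTieredCompactionStrategy', 'timestamp_resolution': 'MILLISECONDS'}
      AND compression = {'sstable_compression': 'LZ4Compressor'};

The query pattern in this thread (WHERE row_time = ? AND attrs = ?) still hits exactly one partition, but at the stated write rate (3k points/second spread over 100k attrs) each daily partition shrinks from roughly 3,000 x 86,400, about 260 million cells under the current schema, to a few thousand cells per series per day.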
>> On Mon, Mar 23, 2015 at 11:53 AM, Eric Stevens <migh...@gmail.com> wrote:
>>
>>> Enable tracing in cqlsh and see how many sstables are being read to
>>> satisfy the query (are you repeatedly writing to the same partition
>>> [row_time] over time?).
>>>
>>> Also watch for whether you're hitting a lot of tombstones (are you
>>> deleting lots of values in the same partition over time?).
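For anyone following along, tracing the problem query from cqlsh is just the following (the SELECT is the one from the original post below; this is a sketch, not output captured from this cluster):

    cqlsh> TRACING ON;
    cqlsh> SELECT * FROM "default".metrics WHERE row_time = 5 AND attrs = 'potatoes_and_jam';

cqlsh then prints a step-by-step trace like the one Dave pasted above. In that trace, source_elapsed jumps from about 3 ms to about 202 ms at "Partition index with 484338 entries found for sstable 15966", which is exactly the oversized-partition-index cost Chris describes.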
>>> On Mon, Mar 23, 2015 at 4:01 AM, Dave Galbraith <david92galbra...@gmail.com> wrote:
>>>
>>>> Duncan: I'm thinking it might be something like that. I'm also seeing
>>>> just a ton of garbage collection on the box. Could it be pulling rows for
>>>> all 100k attrs for a given row_time into memory, since only row_time is
>>>> the partition key?
>>>>
>>>> Jens: I'm not using EBS (although I used to, until I read up on how
>>>> useless it is). I'm not sure what constitutes proper paging, but my client
>>>> has a pretty small amount of available memory, so I'm doing pages of size
>>>> 5k using the C++ DataStax driver.
>>>>
>>>> Thanks for the replies!
>>>>
>>>> -Dave
>>>>
>>>> On Mon, Mar 23, 2015 at 2:00 AM, Jens Rantil <jens.ran...@tink.se> wrote:
>>>>
>>>>> Also, two control questions:
>>>>>
>>>>> - Are you using EBS for data storage? It might introduce additional latencies.
>>>>> - Are you doing proper paging when querying the keyspace?
>>>>>
>>>>> Cheers,
>>>>> Jens
>>>>>
>>>>> On Mon, Mar 23, 2015 at 5:56 AM, Dave Galbraith <david92galbra...@gmail.com> wrote:
>>>>>
>>>>>> Hi! So I've got a table like this:
>>>>>>
>>>>>> CREATE TABLE "default".metrics (
>>>>>>     row_time int,
>>>>>>     attrs varchar,
>>>>>>     offset int,
>>>>>>     value double,
>>>>>>     PRIMARY KEY (row_time, attrs, offset)
>>>>>> ) WITH COMPACT STORAGE
>>>>>>   AND bloom_filter_fp_chance = 0.01
>>>>>>   AND caching = 'KEYS_ONLY'
>>>>>>   AND comment = ''
>>>>>>   AND dclocal_read_repair_chance = 0
>>>>>>   AND gc_grace_seconds = 864000
>>>>>>   AND index_interval = 128
>>>>>>   AND read_repair_chance = 1
>>>>>>   AND replicate_on_write = 'true'
>>>>>>   AND populate_io_cache_on_flush = 'false'
>>>>>>   AND default_time_to_live = 0
>>>>>>   AND speculative_retry = 'NONE'
>>>>>>   AND memtable_flush_period_in_ms = 0
>>>>>>   AND compaction = {'class': 'DateTieredCompactionStrategy', 'timestamp_resolution': 'MILLISECONDS'}
>>>>>>   AND compression = {'sstable_compression': 'LZ4Compressor'};
>>>>>>
>>>>>> and I'm running Cassandra on an EC2 m3.2xlarge out in the cloud, with
>>>>>> 4 GB of heap space. It's time-series data, so I increment "row_time"
>>>>>> each day, "attrs" is additional identifying information about each
>>>>>> series, and "offset" is the number of milliseconds into the day for
>>>>>> each data point. For the past 5 days I've been inserting 3k
>>>>>> points/second distributed across 100k distinct "attrs" values. And now
>>>>>> when I try to run queries on this data that look like
>>>>>>
>>>>>> SELECT * FROM "default".metrics WHERE row_time = 5 AND attrs = 'potatoes_and_jam';
>>>>>>
>>>>>> it takes an absurdly long time and sometimes just times out. I ran
>>>>>> "nodetool cfstats default" and here's what I get:
>>>>>>
>>>>>> Keyspace: default
>>>>>>     Read Count: 59
>>>>>>     Read Latency: 397.12523728813557 ms
>>>>>>     Write Count: 155128
>>>>>>     Write Latency: 0.3675690719921613 ms
>>>>>>     Pending Flushes: 0
>>>>>>         Table: metrics
>>>>>>         SSTable count: 26
>>>>>>         Space used (live): 35146349027
>>>>>>         Space used (total): 35146349027
>>>>>>         Space used by snapshots (total): 0
>>>>>>         SSTable Compression Ratio: 0.10386468749216264
>>>>>>         Memtable cell count: 141800
>>>>>>         Memtable data size: 31071290
>>>>>>         Memtable switch count: 41
>>>>>>         Local read count: 59
>>>>>>         Local read latency: 397.126 ms
>>>>>>         Local write count: 155128
>>>>>>         Local write latency: 0.368 ms
>>>>>>         Pending flushes: 0
>>>>>>         Bloom filter false positives: 0
>>>>>>         Bloom filter false ratio: 0.00000
>>>>>>         Bloom filter space used: 2856
>>>>>>         Compacted partition minimum bytes: 104
>>>>>>         Compacted partition maximum bytes: 36904729268
>>>>>>         Compacted partition mean bytes: 986530969
>>>>>>         Average live cells per slice (last five minutes): 501.66101694915255
>>>>>>         Maximum live cells per slice (last five minutes): 502.0
>>>>>>         Average tombstones per slice (last five minutes): 0.0
>>>>>>         Maximum tombstones per slice (last five minutes): 0.0
>>>>>>
>>>>>> Ouch! 400 ms of read latency, orders of magnitude higher than it has
>>>>>> any right to be. How could this have happened? Is there something
>>>>>> fundamentally broken about my data model? Thanks!
>>>>>
>>>>> --
>>>>> Jens Rantil
>>>>> Backend engineer
>>>>> Tink AB
>>>>>
>>>>> Email: jens.ran...@tink.se
>>>>> Phone: +46 708 84 18 32
>>>>> Web: www.tink.se
>>>>>
>>>>> Facebook <https://www.facebook.com/#!/tink.se> Linkedin
>>>>> <http://www.linkedin.com/company/2735919?trk=vsrp_companies_res_photo&trkInfo=VSRPsearchId%3A1057023381369207406670%2CVSRPtargetId%3A2735919%2CVSRPcmpt%3Aprimary>
>>>>> Twitter <https://twitter.com/tink>

--
Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr <http://twitter.com/instaclustr> | (650) 284 9692