Enable tracing in cqlsh and see how many sstables are being hit to satisfy the query (are you repeatedly writing to the same partition [row_time] over time?).
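For example, using the table and query from your original mail below, something like:

    TRACING ON;
    SELECT * FROM "default".metrics WHERE row_time = 5 AND attrs = 'potatoes_and_jam';
    TRACING OFF;

The trace cqlsh prints after the query should include lines along the lines of "Merging data from memtables and N sstables" and "Read X live and Y tombstone cells", which tell you how many sstables the read touched and whether tombstones are a factor.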
Also watch for whether you're hitting a lot of tombstones (are you deleting lots of values in the same partition over time?).

On Mon, Mar 23, 2015 at 4:01 AM, Dave Galbraith <david92galbra...@gmail.com> wrote:

> Duncan: I'm thinking it might be something like that. I'm also seeing just
> a ton of garbage collection on the box; could it be pulling rows for all
> 100k attrs for a given row_time into memory, since only row_time is the
> partition key?
>
> Jens: I'm not using EBS (although I used to, until I read up on how useless
> it is). I'm not sure what constitutes proper paging, but my client has a
> pretty small amount of available memory, so I'm doing pages of size 5k
> using the C++ DataStax driver.
>
> Thanks for the replies!
>
> -Dave
>
> On Mon, Mar 23, 2015 at 2:00 AM, Jens Rantil <jens.ran...@tink.se> wrote:
>
>> Also, two control questions:
>>
>> - Are you using EBS for data storage? It might introduce additional
>>   latencies.
>> - Are you doing proper paging when querying the keyspace?
>>
>> Cheers,
>> Jens
>>
>> On Mon, Mar 23, 2015 at 5:56 AM, Dave Galbraith <david92galbra...@gmail.com> wrote:
>>
>>> Hi! So I've got a table like this:
>>>
>>> CREATE TABLE "default".metrics (
>>>     row_time int,
>>>     attrs varchar,
>>>     offset int,
>>>     value double,
>>>     PRIMARY KEY (row_time, attrs, offset)
>>> ) WITH COMPACT STORAGE
>>>   AND bloom_filter_fp_chance = 0.01
>>>   AND caching = 'KEYS_ONLY'
>>>   AND comment = ''
>>>   AND dclocal_read_repair_chance = 0
>>>   AND gc_grace_seconds = 864000
>>>   AND index_interval = 128
>>>   AND read_repair_chance = 1
>>>   AND replicate_on_write = 'true'
>>>   AND populate_io_cache_on_flush = 'false'
>>>   AND default_time_to_live = 0
>>>   AND speculative_retry = 'NONE'
>>>   AND memtable_flush_period_in_ms = 0
>>>   AND compaction = {'class': 'DateTieredCompactionStrategy', 'timestamp_resolution': 'MILLISECONDS'}
>>>   AND compression = {'sstable_compression': 'LZ4Compressor'};
>>>
>>> and I'm running Cassandra on an EC2 m3.2xlarge out in the cloud, with 4 GB
>>> of heap space. It's time-series data: I increment "row_time" each day,
>>> "attrs" is additional identifying information about each series, and
>>> "offset" is the number of milliseconds into the day for each data point.
>>> For the past 5 days I've been inserting 3k points/second distributed
>>> across 100k distinct "attrs"es. And now when I try to run queries on this
>>> data that look like
>>>
>>> SELECT * FROM "default".metrics WHERE row_time = 5 AND attrs = 'potatoes_and_jam';
>>>
>>> it takes an absurdly long time and sometimes just times out. I ran
>>> "nodetool cfstats default" and here's what I get:
>>>
>>> Keyspace: default
>>>     Read Count: 59
>>>     Read Latency: 397.12523728813557 ms.
>>>     Write Count: 155128
>>>     Write Latency: 0.3675690719921613 ms.
>>>     Pending Flushes: 0
>>>         Table: metrics
>>>         SSTable count: 26
>>>         Space used (live): 35146349027
>>>         Space used (total): 35146349027
>>>         Space used by snapshots (total): 0
>>>         SSTable Compression Ratio: 0.10386468749216264
>>>         Memtable cell count: 141800
>>>         Memtable data size: 31071290
>>>         Memtable switch count: 41
>>>         Local read count: 59
>>>         Local read latency: 397.126 ms
>>>         Local write count: 155128
>>>         Local write latency: 0.368 ms
>>>         Pending flushes: 0
>>>         Bloom filter false positives: 0
>>>         Bloom filter false ratio: 0.00000
>>>         Bloom filter space used: 2856
>>>         Compacted partition minimum bytes: 104
>>>         Compacted partition maximum bytes: 36904729268
>>>         Compacted partition mean bytes: 986530969
>>>         Average live cells per slice (last five minutes): 501.66101694915255
>>>         Maximum live cells per slice (last five minutes): 502.0
>>>         Average tombstones per slice (last five minutes): 0.0
>>>         Maximum tombstones per slice (last five minutes): 0.0
>>>
>>> Ouch! 400ms of read latency, orders of magnitude higher than it has any
>>> right to be. How could this have happened? Is there something
>>> fundamentally broken about my data model? Thanks!
>>
>> --
>> Jens Rantil
>> Backend engineer
>> Tink AB
>>
>> Email: jens.ran...@tink.se
>> Phone: +46 708 84 18 32
>> Web: www.tink.se
>>
>> Facebook <https://www.facebook.com/#!/tink.se>
>> LinkedIn <http://www.linkedin.com/company/2735919>
>> Twitter <https://twitter.com/tink>