Also, two control questions:

- Are you using EBS for data storage? It might introduce additional latencies.
- Are you doing proper paging when querying the keyspace? (See the sketch below.)
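For reference, here's a minimal sketch of what paging looks like with the DataStax Java driver (assuming a 2.x driver; the contact point, fetch size, and query literals are just placeholders, not taken from your setup):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class PagedMetricsRead {
    public static void main(String[] args) {
        // Placeholder contact point; point this at one of your nodes.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // Ask the driver to pull the slice back in pages of 1000 rows
        // instead of materializing the whole (potentially huge) partition at once.
        Statement stmt = new SimpleStatement(
                "SELECT * FROM \"default\".metrics WHERE row_time = 5 AND attrs = 'potatoes_and_jam'");
        stmt.setFetchSize(1000);

        ResultSet rs = session.execute(stmt);
        for (Row row : rs) {
            // The driver transparently fetches the next page as you iterate.
            System.out.println(row.getInt("offset") + " -> " + row.getDouble("value"));
        }

        cluster.close();
    }
}

If you're querying from cqlsh instead, recent versions page automatically (see the PAGING command); if you go through another client, make sure it isn't trying to pull the whole partition in one request.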
Cheers,
Jens

On Mon, Mar 23, 2015 at 5:56 AM, Dave Galbraith <david92galbra...@gmail.com> wrote:
> Hi! So I've got a table like this:
>
> CREATE TABLE "default".metrics (row_time int, attrs varchar, offset int,
> value double, PRIMARY KEY(row_time, attrs, offset)) WITH COMPACT STORAGE
> AND bloom_filter_fp_chance=0.01 AND caching='KEYS_ONLY' AND comment=''
> AND dclocal_read_repair_chance=0 AND gc_grace_seconds=864000 AND
> index_interval=128 AND read_repair_chance=1 AND replicate_on_write='true'
> AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND
> speculative_retry='NONE' AND memtable_flush_period_in_ms=0 AND
> compaction={'class':'DateTieredCompactionStrategy','timestamp_resolution':'MILLISECONDS'}
> AND compression={'sstable_compression':'LZ4Compressor'};
>
> and I'm running Cassandra on an EC2 m3.2xlarge out in the cloud, with 4 GB
> of heap space. It's time-series data, so I increment "row_time" each day,
> "attrs" is additional identifying information about each series, and
> "offset" is the number of milliseconds into the day for each data point.
> For the past 5 days, I've been inserting 3k points/second distributed
> across 100k distinct "attrs"es. And now when I try to run queries on this
> data that look like
>
> "SELECT * FROM "default".metrics WHERE row_time = 5 AND attrs = 'potatoes_and_jam'"
>
> it takes an absurdly long time and sometimes just times out. I ran
> "nodetool cfstats default" and here's what I get:
>
> Keyspace: default
>     Read Count: 59
>     Read Latency: 397.12523728813557 ms.
>     Write Count: 155128
>     Write Latency: 0.3675690719921613 ms.
>     Pending Flushes: 0
>         Table: metrics
>         SSTable count: 26
>         Space used (live): 35146349027
>         Space used (total): 35146349027
>         Space used by snapshots (total): 0
>         SSTable Compression Ratio: 0.10386468749216264
>         Memtable cell count: 141800
>         Memtable data size: 31071290
>         Memtable switch count: 41
>         Local read count: 59
>         Local read latency: 397.126 ms
>         Local write count: 155128
>         Local write latency: 0.368 ms
>         Pending flushes: 0
>         Bloom filter false positives: 0
>         Bloom filter false ratio: 0.00000
>         Bloom filter space used: 2856
>         Compacted partition minimum bytes: 104
>         Compacted partition maximum bytes: 36904729268
>         Compacted partition mean bytes: 986530969
>         Average live cells per slice (last five minutes): 501.66101694915255
>         Maximum live cells per slice (last five minutes): 502.0
>         Average tombstones per slice (last five minutes): 0.0
>         Maximum tombstones per slice (last five minutes): 0.0
>
> Ouch! 400 ms of read latency, orders of magnitude higher than it has any
> right to be. How could this have happened? Is there something fundamentally
> broken about my data model? Thanks!

--
Jens Rantil
Backend engineer
Tink AB

Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se

Facebook <https://www.facebook.com/#!/tink.se>
Linkedin <http://www.linkedin.com/company/2735919?trk=vsrp_companies_res_photo&trkInfo=VSRPsearchId%3A1057023381369207406670%2CVSRPtargetId%3A2735919%2CVSRPcmpt%3Aprimary>
Twitter <https://twitter.com/tink>