Is the underlying disk a spinning disk? If so, 80-90 ms is about right for a cold (non-cached) read. The fast reads are most likely served from the buffer cache, or purely from the memtable.
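
One rough way to check this is to time the same query many times and see whether the latencies fall into two clusters. Below is a minimal Python sketch, not a definitive test. It assumes a hypothetical `run_query` callable that executes your SELECT (the driver setup is not shown), and the 20 ms cutoff between "warm" and "cold" is an arbitrary value picked for illustration.

```python
import time
from statistics import median


def time_reads(run_query, repeats=50):
    """Call run_query() repeatedly and return each call's latency in milliseconds.

    run_query is a hypothetical zero-argument callable that executes the SELECT.
    """
    latencies = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def split_cold_warm(latencies_ms, threshold_ms=20.0):
    """Split latencies into (warm, cold) lists around threshold_ms.

    threshold_ms is an assumed cutoff, not a value from Cassandra itself.
    """
    warm = [x for x in latencies_ms if x < threshold_ms]
    cold = [x for x in latencies_ms if x >= threshold_ms]
    return warm, cold


def summarize(latencies_ms, threshold_ms=20.0):
    """Return the count and median latency of the warm and cold groups."""
    warm, cold = split_cold_warm(latencies_ms, threshold_ms)
    return {
        "warm_count": len(warm),
        "cold_count": len(cold),
        "warm_median_ms": median(warm) if warm else None,
        "cold_median_ms": median(cold) if cold else None,
    }
```

If the slow group shrinks or disappears when you re-read the same partition right away, that would be consistent with the first read going to disk and the later ones coming from cache.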

On Wed, Dec 24, 2014 at 5:32 AM, nitin padalia <padalia.ni...@gmail.com> wrote:

> Is merging a costly operation with wide rows?
> On Dec 10, 2014 5:53 PM, "nitin padalia" <padalia.ni...@gmail.com> wrote:
>
>> I am using a schema like the one below:
>>
>> CREATE TABLE user_location_map (
>>     store_id uuid,
>>     location_id uuid,
>>     user_serial_number text,
>>     userobjectid uuid,
>>     PRIMARY KEY ((store_id, location_id), user_serial_number)
>> ) WITH CLUSTERING ORDER BY (user_serial_number ASC)
>>     AND bloom_filter_fp_chance = 0.01
>>     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>>     AND comment = ''
>>     AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
>>     AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>     AND dclocal_read_repair_chance = 0.1
>>     AND default_time_to_live = 0
>>     AND gc_grace_seconds = 864000
>>     AND max_index_interval = 2048
>>     AND memtable_flush_period_in_ms = 0
>>     AND min_index_interval = 128
>>     AND read_repair_chance = 0.0
>>     AND speculative_retry = '99.0PERCENTILE';
>>
>> I run a query like:
>>
>> select * from user_location_map
>>     where store_id = 17b73358-79e6-11e4-bfd4-0050568aa211
>>     and location_id = 2c269ea4-dbfd-32dd-9bd7-a5c22677d18b
>>     and user_serial_number = 'uI2201';
>>
>> Sometimes queries like the one above complete in 3-4 milliseconds, but
>> a few times they take around 80-90 milliseconds. The data is around 1
>> million rows distributed across 5 nodes with RF 3.
>>
>> Tracing shows that every time, most of the time is consumed by:
>> Merging data from memtables and 1 sstables
>>
>> What could be the reason that this sometimes takes so long, while the
>> rest of the time it is fast?

--
Ryan Svihla
Solution Architect, DataStax