Hello,

We've had a 5-node C* cluster (version 1.1.0) running for several months. Up until now we've mostly been writing data, but now we're starting to serve more read traffic. We're seeing far more disk I/O to service these reads than I would have anticipated.
The CF being queried holds chat messages. Each row represents a conversation between two people, and each column represents one message. The column key is composite, consisting of the message date plus a few other bits of information. The CF uses compression. The query asks for at most 50 messages between two dates, in reverse chronological order; usually the two endpoints are 30 days ago and the current time. The Astyanax query looks like this:

    ColumnList<ConversationTextMessageKey> result = keyspace.prepareQuery(CF_CONVERSATION_TEXT_MESSAGE)
        .setConsistencyLevel(ConsistencyLevel.CL_QUORUM)
        .getKey(conversationKey)
        .withColumnRange(
            textMessageSerializer.makeEndpoint(endDate, Equality.LESS_THAN).toBytes(),
            textMessageSerializer.makeEndpoint(startDate, Equality.GREATER_THAN_EQUALS).toBytes(),
            true,
            maxMessages)
        .execute()
        .getResult();

We're currently servicing around 30 of these queries per second. Here's what cfstats looks like for the CF:

    Column Family: conversation_text_message
    SSTable count: 15
    Space used (live): 211762982685
    Space used (total): 211762982685
    Number of Keys (estimate): 330118528
    Memtable Columns Count: 68063
    Memtable Data Size: 53093938
    Memtable Switch Count: 9743
    Read Count: 4313344
    Read Latency: 118.831 ms.
    Write Count: 817876950
    Write Latency: 0.023 ms.
    Pending Tasks: 0
    Bloom Filter False Positives: 6055
    Bloom Filter False Ratio: 0.00260
    Bloom Filter Space Used: 686266048
    Compacted row minimum size: 87
    Compacted row maximum size: 14530764
    Compacted row mean size: 1186

On the C* nodes, iostat output like this is typical, and it can spike to be much worse:

    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               1.91    0.00    2.08   30.66    0.50   64.84

    Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
    xvdap1            0.13         0.00         1.07          0         16
    xvdb            474.20     13524.53        25.33     202868        380
    xvdc            469.87     13455.73        30.40     201836        456
    md0             972.13     26980.27        55.73     404704        836

Any thoughts on what could be causing this much disk read I/O from these queries?

Much thanks!

-Jon
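P.S. For context, the composite column key and column family are defined roughly like the sketch below. This is a simplified approximation rather than our exact code: the wrapper class name, the key components after the date, and the String row key are stand-ins, but it shows the shape of the data the query above slices over.

    import java.util.Date;
    import java.util.UUID;

    import com.netflix.astyanax.annotations.Component;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.serializers.AnnotatedCompositeSerializer;
    import com.netflix.astyanax.serializers.StringSerializer;

    public class ConversationTextMessageSchema {

        // Composite column key: the message date is the first component, so the
        // reversed range in the query slices on it; the remaining components only
        // disambiguate messages that share a timestamp.
        public static class ConversationTextMessageKey {
            @Component(ordinal = 0) public Date messageDate;
            @Component(ordinal = 1) public UUID messageId;
            @Component(ordinal = 2) public String senderId;
        }

        public static final AnnotatedCompositeSerializer<ConversationTextMessageKey> textMessageSerializer =
            new AnnotatedCompositeSerializer<ConversationTextMessageKey>(ConversationTextMessageKey.class);

        // Row key identifies the conversation between the two people.
        public static final ColumnFamily<String, ConversationTextMessageKey> CF_CONVERSATION_TEXT_MESSAGE =
            new ColumnFamily<String, ConversationTextMessageKey>(
                "conversation_text_message",
                StringSerializer.get(),
                textMessageSerializer);
    }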