Yea this was a bit of a read, so I think what I need really is this
http://www.slideshare.net/driftx/cassandra-summit-2010-performance-tuning On Tue, Sep 14, 2010 at 6:55 PM, Jonathan Ellis <jbel...@gmail.com> wrote: > If your question is, "should I optimize for faster compaction or less > impact on my cluster," the answer is almost always "less impact." > > On Tue, Sep 14, 2010 at 8:33 PM, Dathan Pattishall <datha...@gmail.com> > wrote: > > Okay from what I gather. When data is written its always written to > memory. > > The flow for our concerns is the data is written to the commitLog then to > > the memtable. > > > > If any of memtable's 3 tunable thresholds are hit a flush occurs writing > the > > data sorted by key to the SSTABLE still enabling sequential disk access. > The > > fastest disk access is sequential while random is the slowest - this is > just > > a statement to make sure people are on the same page. > > > > Now the data in the SSTABLE is immutable, and when the SSTABLE threshold > is > > hit (4 by default) the multiple SSTABLES are turned into a single TABLE, > > this is compaction. Now I can't verify this because I never see more then > 1 > > SSTABLE at a time, which could just be me misinterpreting the info. This > > Operation singly should produce the most load. > > > > Using JCONSOLE connected to a live Cass box JMX port and forced > operations > > from the MBEANS tab. I've quantified that normal operations for my > > environment is 3% CPU utilization. > > Each Node roughly does 200 operations per second on a standard key/value > > pair sorted by UTF8 with a Random Partitioner. > > > > Doing a Flush for this 10MB Memtable consumes 33% of my CPU from Java's > > point of view and that 33% is sustained for 15 mins - I assume some > sorting > > needs to be done according to this link: > > > http://wiki.apache.org/cassandra/JmxInterface#org.apache.cassandra.concurrent.FLUSH-WRITER-POOL > > > > Doing a compaction consumes 30% of my CPU and lasts for 10 mins from the > > subsequent flush. > > > > Now to give some context my MemtableDataSize was around 11,000,000. I am > > going to assume that this is in units of bytes so roughly 10MB of data. > Is > > assuming that MemtableDataSize is in units of bytes correct? If so why > does > > it take 15 mins to flush 10MB of data to a RAID-10 8 15K RPM disk array > with > > a 256 MB of memory and BBC on a dedicated hardware controller? > > > > Additional to this I see in system.log > > > > WARN [MESSAGE-DESERIALIZER-POOL:1] 2010-08-26 08:27:11,541 > > MessageDeserializationTask.java (line 47) dropping message (9,713ms past > > timeout) > > > > 39,625 events are showing. How do I get rid of this? If its a warning and > is > > dropping messages how can I fix the warning? > > > > > > > > So, in summary. On Cassandra 0.6.4 > > with > > TotalDiskSpaceUsed - 40GB > > MemtableDataSize - 10MB > > MemtableColumnsCount 130K > > > > How do I remove MessageDeserializationTasks warnings? > > How do I reduce the compaction time? > > Why is flushing taking so long? > > What do I look at to tune Memtable thresholds, because this > > http://wiki.apache.org/cassandra/MemtableThresholds does not help, and I > > believe that I'm not in a position where I need to tune the defaults. > > > > http://www.riptano.com/blog/cassandra-annotated-changelog-063 allows me > to > > lower the priority of compaction, but to get the best compaction times > > verses system load how do I tune for that? > > > > > > > > Below is some sample output of system.log > > > > Notice that SSTable 1155 is 320 MB, is the sorting taking all the time? > > Because 320MB written even randomly should be very fast. 320 MB written > > sequentially takes 2 seconds, while randomly I would expect 9 seconds. > (320 > > MB/34 MB/sec random writes == 9 seconds) > > > > > > INFO [COMPACTION-POOL:1] 2010-09-14 20:40:39,120 CompactionManager.java > > (line 320) Compacted to > > /data/cassandra/data/TimeFrameClicks/Standard2-1155-Data.db. > > 339681221/339679627 bytes for 1253279 keys. Time: 18645ms. > > INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 21:14:49,518 > > SSTableDeletingReference.java (line 104) Deleted > > /data/cassandra/data/TimeFrameClicks/Standard2-1154-Data.db > > INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 21:14:49,547 > > SSTableDeletingReference.java (line 104) Deleted > > /data/cassandra/data/TimeFrameClicks/Standard2-1152-Data.db > > INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 21:14:49,574 > > SSTableDeletingReference.java (line 104) Deleted > > /data/cassandra/data/TimeFrameClicks/Standard2-1153-Data.db > > INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 21:14:49,603 > > SSTableDeletingReference.java (line 104) Deleted > > /data/cassandra/data/TimeFrameClicks/Standard2-1151-Data.db > > INFO [ROW-MUTATION-STAGE:6] 2010-09-14 22:24:33,320 > ColumnFamilyStore.java > > (line 357) Standard2 has reached its threshold; switching in a fresh > > Memtable at > > > CommitLogContext(file='/data/cassandra/commitlog/CommitLog-1284496449043.log', > > position=68512986) > > INFO [ROW-MUTATION-STAGE:6] 2010-09-14 22:24:33,320 > ColumnFamilyStore.java > > (line 609) Enqueuing flush of memtable-standa...@798596883(11324628 > bytes, > > 314573 operations) > > INFO [FLUSH-WRITER-POOL:1] 2010-09-14 22:24:33,321 Memtable.java (line > 148) > > Writing memtable-standa...@798596883(11324628 bytes, 314573 operations) > > INFO [FLUSH-WRITER-POOL:1] 2010-09-14 22:24:38,974 Memtable.java (line > 162) > > Completed flushing > > /data/cassandra/data/TimeFrameClicks/Standard2-1156-Data.db > > INFO [COMPACTION-POOL:1] 2010-09-14 22:25:19,326 CompactionManager.java > > (line 246) Compacting > > > [org.apache.cassandra.io.SSTableReader(path='/data/cassandra/data/TimeFrameClicks/Standard2-1156-Data.db'),org.apache.cassandra.io.SSTableReader(path='/data/cassandra/data/TimeFrameClicks/Standard2-1129-Data.db'),org.apache.cassandra.io.SSTableReader(path='/data/cassandra/data/TimeFrameClicks/Standard2-1150-Data.db'),org.apache.cassandra.io.SSTableReader(path='/data/cassandra/data/TimeFrameClicks/Standard2-1108-Data.db'),org.apache.cassandra.io.SSTableReader(path='/data/cassandra/data/TimeFrameClicks/Standard2-1023-Data.db'),org.apache.cassandra.io.SSTableReader(path='/data/cassandra/data/TimeFrameClicks/Standard2-1155-Data.db')] > > INFO [GC inspection] 2010-09-14 22:25:21,236 GCInspector.java (line 116) > GC > > for ConcurrentMarkSweep: 1825 ms, 650770408 reclaimed leaving 472796832 > > used; max is 7645888512 > > INFO [GC inspection] 2010-09-14 22:25:21,237 GCInspector.java (line 137) > > Pool Name Active Pending > > INFO [GC inspection] 2010-09-14 22:25:21,238 GCInspector.java (line 143) > > STREAM-STAGE 0 0 > > INFO [GC inspection] 2010-09-14 22:25:21,238 GCInspector.java (line 143) > > FILEUTILS-DELETE-POOL 0 0 > > INFO [GC inspection] 2010-09-14 22:25:21,238 GCInspector.java (line 143) > > RESPONSE-STAGE 0 0 > > INFO [GC inspection] 2010-09-14 22:25:21,239 GCInspector.java (line 143) > > ROW-READ-STAGE 1 5 > > INFO [GC inspection] 2010-09-14 22:25:21,239 GCInspector.java (line 143) > > LB-OPERATIONS 0 0 > > INFO [GC inspection] 2010-09-14 22:25:21,239 GCInspector.java (line 143) > > GMFD 0 3 > > INFO [GC inspection] 2010-09-14 22:25:21,240 GCInspector.java (line 143) > > MESSAGE-DESERIALIZER-POOL 0 121 > > INFO [GC inspection] 2010-09-14 22:25:21,240 GCInspector.java (line 143) > > CONSISTENCY-MANAGER 0 7 > > INFO [GC inspection] 2010-09-14 22:25:21,240 GCInspector.java (line 143) > > LB-TARGET 0 0 > > INFO [GC inspection] 2010-09-14 22:25:21,240 GCInspector.java (line 143) > > ROW-MUTATION-STAGE 0 0 > > INFO [GC inspection] 2010-09-14 22:25:21,241 GCInspector.java (line 143) > > MESSAGE-STREAMING-POOL 0 0 > > INFO [GC inspection] 2010-09-14 22:25:21,241 GCInspector.java (line 143) > > LOAD-BALANCER-STAGE 0 0 > > INFO [GC inspection] 2010-09-14 22:25:21,241 GCInspector.java (line 143) > > FLUSH-SORTER-POOL 0 0 > > INFO [GC inspection] 2010-09-14 22:25:21,242 GCInspector.java (line 143) > > MEMTABLE-POST-FLUSHER 0 0 > > INFO [GC inspection] 2010-09-14 22:25:21,242 GCInspector.java (line 143) > > AE-SERVICE-STAGE 0 0 > > INFO [GC inspection] 2010-09-14 22:25:21,243 GCInspector.java (line 143) > > FLUSH-WRITER-POOL 0 0 > > INFO [GC inspection] 2010-09-14 22:25:21,243 GCInspector.java (line 143) > > HINTED-HANDOFF-POOL 0 0 > > INFO [GC inspection] 2010-09-14 22:25:21,244 GCInspector.java (line 148) > > CompactionManager n/a 0 > > INFO [COMPACTION-POOL:1] 2010-09-14 22:38:13,035 CompactionManager.java > > (line 320) Compacted to > > /data/cassandra/data/TimeFrameClicks/Standard2-1157-Data.db. > > 13065811194/8886372187 bytes for 32990587 keys. Time: 773708ms. > > INFO [COMPACTION-POOL:1] 2010-09-14 22:40:23,776 CompactionManager.java > > (line 246) Compacting > > > [org.apache.cassandra.io.SSTableReader(path='/data/cassandra/data/TimeFrameClicks/Standard2-1157-Data.db')] > > INFO [COMPACTION-POOL:1] 2010-09-14 22:54:20,197 CompactionManager.java > > (line 320) Compacted to > > /data/cassandra/data/TimeFrameClicks/Standard2-1158-Data.db. > > 8886372187/8886372187 bytes for 32990587 keys. Time: 836420ms. > > INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 23:26:54,242 > > SSTableDeletingReference.java (line 104) Deleted > > /data/cassandra/data/TimeFrameClicks/Standard2-1157-Data.db > > INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 23:26:54,422 > > SSTableDeletingReference.java (line 104) Deleted > > /data/cassandra/data/TimeFrameClicks/Standard2-1156-Data.db > > INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 23:26:56,137 > > SSTableDeletingReference.java (line 104) Deleted > > /data/cassandra/data/TimeFrameClicks/Standard2-1129-Data.db > > INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 23:26:59,458 > > SSTableDeletingReference.java (line 104) Deleted > > /data/cassandra/data/TimeFrameClicks/Standard2-1108-Data.db > > INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 23:27:01,106 > > SSTableDeletingReference.java (line 104) Deleted > > /data/cassandra/data/TimeFrameClicks/Standard2-1150-Data.db > > INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 23:27:12,050 > > SSTableDeletingReference.java (line 104) Deleted > > /data/cassandra/data/TimeFrameClicks/Standard2-1023-Data.db > > INFO [SSTABLE-CLEANUP-TIMER] 2010-09-14 23:27:12,534 > > SSTableDeletingReference.java (line 104) Deleted > > /data/cassandra/data/TimeFrameClicks/Standard2-1155-Data.db > > > > > > > > > > > > > > > > > > > > -- > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of Riptano, the source for professional Cassandra support > http://riptano.com >