I'm having difficulties with leveled compaction: it isn't making progress
fast enough. I'm on a quad-core box, but it only runs one compaction at a
time. Cassandra version: 1.0.6. Here's the nodetool compactionstats output:

# nodetool -h localhost compactionstats
pending tasks: 2568
          compaction type        keyspace          column family   bytes compacted   bytes total   progress
               Compaction        rslog_production  req_text        4974195           314597326     1.58%

The number of pending tasks decreases extremely slowly. In the log, I can
see it perform 3-4 compactions, but the number of tasks only decreases by
one. (I turned the clients off and even disabled thrift to make sure this
isn't caused by concurrent writes.) In the log, I see neatly paired
Compacting.../Compacted... lines, one after the other; it doesn't look
like there's ever more than one compaction running at a time, and I have
three CPUs sitting idle. My cassandra.yaml has:

snapshot_before_compaction: false
column_index_size_in_kb: 128
in_memory_compaction_limit_in_mb: 64
multithreaded_compaction: false
compaction_throughput_mb_per_sec: 16
compaction_preheat_key_cache: true
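
If I read the 1.0 configuration docs right, two knobs govern compaction
parallelism, and both are effectively serial in my config above. A change I
could try (assumption: multithreaded_compaction and concurrent_compactors
behave as documented for 1.0.x; I haven't tested these values on this
cluster):

```yaml
# cassandra.yaml -- untested sketch, not my current settings
multithreaded_compaction: true   # parallelize work within a single compaction
concurrent_compactors: 4         # allow up to 4 compaction tasks at once (one per core)
```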

I've issued a "nodetool -h localhost setcompactionthroughput 100", which
didn't seem to make a difference. Here are some sample log lines:

 INFO [CompactionExecutor:117] 2012-03-12 02:54:43,963 CompactionTask.java (line 113) Compacting [SSTableReader(path='/mnt/ebs/data/rslog_production/req_text-hc-753342-Data.db')]
 INFO [CompactionExecutor:117] 2012-03-12 02:54:47,793 CompactionTask.java (line 218) Compacted to [/mnt/ebs/data/rslog_production/req_text-hc-753992-Data.db,].  30,198,523 to 30,197,052 (~99% of original) bytes for 39,269 keys at 7.519100MB/s.  Time: 3,830ms.
 INFO [CompactionExecutor:119] 2012-03-12 02:54:47,795 CompactionTask.java (line 113) Compacting [SSTableReader(path='/mnt/ebs/data/rslog_production/req_text-hc-753933-Data.db')]
 INFO [CompactionExecutor:119] 2012-03-12 02:54:51,731 CompactionTask.java (line 218) Compacted to [/mnt/ebs/data/rslog_production/req_text-hc-753994-Data.db,].  31,462,495 to 31,462,495 (~100% of original) bytes for 40,267 keys at 7.625152MB/s.  Time: 3,935ms.
 INFO [CompactionExecutor:119] 2012-03-12 02:54:51,734 CompactionTask.java (line 113) Compacting [SSTableReader(path='/mnt/ebs/data/rslog_production/req_text-hc-753343-Data.db')]
 INFO [CompactionExecutor:119] 2012-03-12 02:54:56,093 CompactionTask.java (line 218) Compacted to [/mnt/ebs/data/rslog_production/req_text-hc-753996-Data.db,].  32,643,675 to 32,643,958 (~100% of original) bytes for 57,473 keys at 7.141937MB/s.  Time: 4,359ms.
 INFO [CompactionExecutor:118] 2012-03-12 02:54:56,095 CompactionTask.java (line 113) Compacting [SSTableReader(path='/mnt/ebs/data/rslog_production/req_text-hc-753934-Data.db')]
 INFO [CompactionExecutor:118] 2012-03-12 02:54:59,635 CompactionTask.java (line 218) Compacted to [/mnt/ebs/data/rslog_production/req_text-hc-753998-Data.db,].  30,709,285 to 30,709,285 (~100% of original) bytes for 32,172 keys at 8.275404MB/s.  Time: 3,539ms.
 INFO [CompactionExecutor:118] 2012-03-12 02:54:59,638 CompactionTask.java (line 113) Compacting [SSTableReader(path='/mnt/ebs/data/rslog_production/req_text-hc-753344-Data.db')]

I recently added a second node to the ring and, in what I suspect is a
related problem, I can't get any data transferred to it (RF=1). Somewhere
I read that the compaction executor also handles data streaming, so I'm
wondering whether the streaming tasks are all stuck in that queue.
nodetool ring:

# nodetool -h localhost ring
Address         DC          Rack   Status State   Load       Owns    Token
                                                                     85070591730234615865843651857942052865
10.102.37.168   datacenter1 rack1  Up     Normal  811.87 GB  50.00%  0
10.80.161.101   datacenter1 rack1  Up     Normal  1.08 MB    50.00%  85070591730234615865843651857942052865
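
To see whether any streams are actually in flight, rather than queued
behind compactions, I could check both hosts with netstats (assumption:
1.0.x nodetool netstats reports streaming sessions the way I expect):

```shell
# look for "Streaming to/from" sections in the output; a report of no
# sending/receiving streams would mean no data is moving to the new node
nodetool -h 10.102.37.168 netstats
nodetool -h 10.80.161.101 netstats
```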

The last thing I attempted was to move the empty node from ...864 to ...865:

# nodetool -h localhost move 85070591730234615865843651857942052865
 INFO 19:59:19,625 Moving /10.80.161.101 from 85070591730234615865843651857942052864 to 85070591730234615865843651857942052865.
 INFO 19:59:19,628 Sleeping 30000 ms before start streaming/fetching ranges.
 INFO 19:59:49,639 MOVING: fetching new ranges and streaming old ranges
 INFO 19:59:52,680 Finished streaming session 97489049918693 from /10.102.37.168
 INFO 19:59:52,681 Enqueuing flush of Memtable-LocationInfo@227137515(36/45 serialized/live bytes, 1 ops)
 INFO 19:59:52,682 Writing Memtable-LocationInfo@227137515(36/45 serialized/live bytes, 1 ops)
 INFO 19:59:52,706 Completed flushing /mnt/ebs/data/system/LocationInfo-hc-19-Data.db (87 bytes)
 INFO 19:59:52,708 Node /10.80.161.101 state jump to normal

I'm pretty stumped at this point... any pointers to what I should do, or
to what I may have done wrong?
Thanks!
Thorsten
