I'm looking into a case where it appears that recycling a commit log segment 
and flushing the dirty CF's results in 46 CF's being flushed. Out of 47 in the 
keyspace. All this flush activity blocks writes. 

Before I dig further I wanted to confirm my understanding.

At 10:46 the MeteredFlusher kicks and flushed the CF…. 

2013-03-18T10:46:59-07:00 INFO [OptionalTasks:1] 2013-03-18 10:46:59,009 
MeteredFlusher.java (line 62) flushing high-traffic column family 
CFS(Keyspace='KSNAME', ColumnFamily='CFNAME') (estimated 198959043 bytes)
2013-03-18T10:46:59-07:00 INFO [OptionalTasks:1] 2013-03-18 10:46:59,009 
ColumnFamilyStore.java (line 659) Enqueuing flush of 
Memtable-CFNAME@172178999(158553073/198959043 serialized/live bytes, 470065 ops)
2013-03-18T10:46:59-07:00 INFO [FlushWriter:482] 2013-03-18 10:46:59,010 
Memtable.java (line 264) Writing Memtable-CFNAME@172178999(158553073/198959043 
serialized/live bytes, 470065 ops)
2013-03-18T10:47:01-07:00 INFO [FlushWriter:482] 2013-03-18 10:47:01,503 
Memtable.java (line 305) Completed flushing 
/mnt/cassandra/data/KSNAME/CFNAME/KSNAME-CFNAME-hf-590-Data.db (29212692 bytes) 
for commitlog position ReplayPosition(segmentId=1363374084494, 
position=982002808)

I expect that the CF is marked as clean in all CL segments other than the 
active one (1363374084494), and in the active one if the last write location 
for the CF is before the current context position (982002808).

There is nothing else in the log (other than liveRatio measuring) then a storm 
of flushing without the MeteredFlusher being involved. I attribute this to the 
CL growing over capacity and the CF's marked as dirty in the oldest segment 
being flushed.

As part of that the CF from above is flushed. 

2013-03-18T10:50:05-07:00 INFO [OptionalTasks:1] 2013-03-18 10:50:05,777 
ColumnFamilyStore.java (line 659) Enqueuing flush of 
Memtable-CFNAME@1841064520(93240219/117001736 serialized/live bytes, 71513 ops)
2013-03-18T10:50:05-07:00 INFO [FlushWriter:483] 2013-03-18 10:50:05,942 
Memtable.java (line 264) Writing Memtable-CFNAME@1841064520(93240219/117001736 
serialized/live bytes, 71513 ops)
2013-03-18T10:50:07-07:00 INFO [FlushWriter:483] 2013-03-18 10:50:07,277 
Memtable.java (line 305) Completed flushing 
/mnt/cassandra/data/KSNAME/CFNAME/KSNAME-CFNAME-hf-591-Data.db (12050522 bytes) 
for commitlog position ReplayPosition(segmentId=1363374084495, position=8542666)

Not that the segmentId has only increased by 1, so we have not put a huge 
amount of traffic through the commit log. 

My understanding is that after the first flush the CF would be clean in the 
oldest CL segment and I should not see the CF being flushed a second time. Is 
that correct? 

running on 1.1.6
  
Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

Reply via email to