Are you able to put together a test case, maybe using the stress testing tool, 
that models your data layout?

If so, can you add it to https://issues.apache.org/jira/browse/CASSANDRA-3592
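
For what it's worth, the hot frame in your stack (AbstractList$Itr.remove inside removeDeletedStandard) is iterator removal from an ArrayList, which shifts the whole tail of the array on every remove() call; over a row with millions of index columns that adds up to quadratic work. Here's a minimal sketch of the pattern (hypothetical class and method names, not actual Cassandra code):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative only: shows why removing elements through an ArrayList
// iterator is O(n) per removal (tail copy), hence O(n^2) overall when
// purging tombstones from a row with millions of columns.
public class IteratorRemoveCost {

    // Remove every even value via the iterator, mirroring the
    // remove-while-iterating loop seen in the stack trace.
    static List<Integer> purgeWithIterator(List<Integer> columns) {
        for (Iterator<Integer> it = columns.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) {
                it.remove(); // ArrayList: copies the tail on every call
            }
        }
        return columns;
    }

    // Equivalent O(n) approach: rebuild the list, keeping survivors.
    static List<Integer> purgeByRebuild(List<Integer> columns) {
        List<Integer> kept = new ArrayList<Integer>(columns.size());
        for (Integer c : columns) {
            if (c % 2 != 0) kept.add(c);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> a = new ArrayList<Integer>();
        List<Integer> b = new ArrayList<Integer>();
        for (int i = 0; i < 10; i++) { a.add(i); b.add(i); }
        System.out.println(purgeWithIterator(a)); // [1, 3, 5, 7, 9]
        System.out.println(purgeByRebuild(b));    // [1, 3, 5, 7, 9]
    }
}
```

Both loops produce the same result; the difference is only the cost per removal, which is what matters at the row sizes you describe.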

Thanks

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/07/2012, at 8:17 PM, 黄荣桢 wrote:

> Hello,
> 
> I find that compacting my secondary index takes a long time and occupies a 
> lot of CPU.
> 
>  INFO [CompactionExecutor:8] 2012-07-16 12:03:16,408 CompactionTask.java 
> (line 213) Compacted to [XXX].  71,018,346 to 9,020 (~0% of original) bytes 
> for 3 keys at 0.000022MB/s.  Time: 397,602ms.
> 
> The stack trace of this overloaded thread is:
> "CompactionReducer:5" - Thread t@1073
>    java.lang.Thread.State: RUNNABLE
>       at java.util.AbstractList$Itr.remove(AbstractList.java:360)
>       at 
> org.apache.cassandra.db.ColumnFamilyStore.removeDeletedStandard(ColumnFamilyStore.java:851)
>       at 
> org.apache.cassandra.db.ColumnFamilyStore.removeDeletedColumnsOnly(ColumnFamilyStore.java:835)
>       at 
> org.apache.cassandra.db.ColumnFamilyStore.removeDeleted(ColumnFamilyStore.java:826)
>       at 
> org.apache.cassandra.db.compaction.PrecompactedRow.removeDeletedAndOldShards(PrecompactedRow.java:77)
>       at 
> org.apache.cassandra.db.compaction.ParallelCompactionIterable$Reducer$MergeTask.call(ParallelCompactionIterable.java:224)
>       at 
> org.apache.cassandra.db.compaction.ParallelCompactionIterable$Reducer$MergeTask.call(ParallelCompactionIterable.java:198)
>       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>       at java.lang.Thread.run(Thread.java:662)
> 
>    Locked ownable synchronizers:
>       - locked <4be5863d> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
> 
> I guess this problem is due to the huge number of columns in my index. The 
> indexed column only has 3 distinct values, and one value has several 
> million records, so the index row has several million columns. Compacting 
> these columns takes a long time. 
> 
> I find a similar issue on the jira:
> https://issues.apache.org/jira/browse/CASSANDRA-3592
> 
> Is there any way to work around this issue?  Is there any way to improve the 
> efficiency of compacting this index?
> 
