Hi,

I now understand what is really happening. INDEX_INTERVAL (in IndexSummary.java) was set to 4, so at least 1/4 of the index entries are held in the heap. For a node with 20M columns, most of the heap is occupied by index entries, which of course leads to poor performance when processing large files.

Is it possible to change INDEX_INTERVAL without rebuilding the sstables? I modified the code and restarted every node, but "NotFoundException()" is reported when reading the columns.

Thanks!
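
To put rough numbers on the sampling arithmetic above, here is a minimal sketch. It is not Cassandra code: the class and method names, the 64-byte cost per in-memory entry, and the comparison interval of 128 are hypothetical values chosen for illustration. Only the interval of 4 and the 20M entry count come from the description above.

```java
// Illustrative estimate only; not taken from the Cassandra source tree.
class IndexSampleEstimate {

    // Number of index entries kept in memory when every `interval`-th
    // entry of the on-disk index is sampled (entries 0, interval, 2*interval, ...).
    static long sampledEntries(long totalEntries, int interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        return (totalEntries + interval - 1) / interval; // ceiling division
    }

    // Rough heap cost of the sampled entries, given an assumed per-entry size.
    static long estimatedHeapBytes(long totalEntries, int interval, long bytesPerEntry) {
        return sampledEntries(totalEntries, interval) * bytesPerEntry;
    }

    public static void main(String[] args) {
        long totalEntries = 20_000_000L; // "20M columns" from the message
        long bytesPerEntry = 64L;        // assumption: key + offset + object overhead

        // 4 is the value from the message; 128 is a hypothetical larger interval.
        for (int interval : new int[] {4, 128}) {
            long entries = sampledEntries(totalEntries, interval);
            long bytes = estimatedHeapBytes(totalEntries, interval, bytesPerEntry);
            System.out.println("interval=" + interval
                    + " sampledEntries=" + entries
                    + " estimatedHeapBytes=" + bytes);
        }
    }
}
```

Under these assumptions an interval of 4 keeps 5,000,000 entries in memory, while an interval of 128 keeps 156,250, so the in-memory sample shrinks in proportion to the interval.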
Best regards,
Cao Jiguang

------------------
casablinca126.com
2010-04-29

-------------------------------------------------------------
From: Jonathan Ellis
Sent: 2010-04-28 22:48:44
To: u...@cassandra.apache.org
Cc:
Subject: Re: compaction slow while sstable>25GB,limitation of the sstablesize?

Compaction time is proportional to the size of the sstable, yes. Not sure how it could be otherwise.

And it does generate a lot of garbage. So unless you are seeing concurrent failures in the GC and corresponding large pause times, your heap should be fine, as long as the rows you are compacting aren't too large.

2010/4/28 casablinca126.com <casabli...@126.com>:
> Hi,
>     The compaction process is very slow when the size of the newly generated
> sstable file grows beyond 25GB;
> in the meantime, the garbage collector is running frequently.
>     Firstly, I have a question: is there a limit on the sstable
> size? If not, is a 2GB heap not
> enough for processing such a large file?
>     I'm using cassandra-0.6.1; the JVM heap size is 2GB (the maximum on a 32-bit
> system).
>
>     Thanks in advance!
> Best Regards,
> Cao Jiguang
>
> --------------
> casablinca126.com
> 2010-04-28

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com