Hi,
        Now I am starting to understand what is really happening. INDEX_INTERVAL (in
IndexSummary.java) was set to 4, so at least 1/4
of the index entries are held in the heap. For a node with 20M columns, most of the heap
is occupied by index entries, which of course leads to poor performance
when processing large files.
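
        To make the arithmetic concrete, here is a rough sketch of how the
number of sampled index entries scales with the interval. It is not the code
in IndexSummary.java: the class and method names are hypothetical, the 64
bytes per sampled entry is an assumed figure rather than a measurement, and
the intervals 32 and 128 are only there for comparison.

```java
// Hypothetical sketch; this is NOT Cassandra's IndexSummary implementation.
final class IndexSampleEstimate {
    private IndexSampleEstimate() {}

    /** Number of entries kept in memory when every interval-th index entry is sampled. */
    public static long sampledEntries(long totalEntries, int interval) {
        if (totalEntries < 0 || interval <= 0) {
            throw new IllegalArgumentException("totalEntries must be >= 0 and interval > 0");
        }
        // Ceiling division: entries 0, interval, 2*interval, ... are sampled.
        return (totalEntries + interval - 1) / interval;
    }

    /** Rough heap estimate; bytesPerSample is an assumed per-entry cost. */
    public static long estimatedHeapBytes(long totalEntries, int interval, long bytesPerSample) {
        return sampledEntries(totalEntries, interval) * bytesPerSample;
    }

    public static void main(String[] args) {
        long entries = 20_000_000L;   // the "20M" figure from this message
        long bytesPerSample = 64L;    // assumption, not a measured value
        for (int interval : new int[] {4, 32, 128}) {
            long samples = sampledEntries(entries, interval);
            long megabytes = estimatedHeapBytes(entries, interval, bytesPerSample) / (1024 * 1024);
            System.out.printf("interval=%d samples=%d heap~%d MB%n", interval, samples, megabytes);
        }
    }
}
```

        Under these assumptions the in-memory sample count falls in proportion
to the interval, which is why a very small interval is costly with a 2GB heap.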
        Is it possible to modify INDEX_INTERVAL without rebuilding the
sstables? I modified the code and restarted every node,
but "NotFoundException()" is reported when reading the columns.
        Thanks!

Best regards,
        Cao Jiguang

------------------                               
casablinca126.com
2010-04-29

-------------------------------------------------------------
From: Jonathan Ellis
Sent: 2010-04-28 22:48:44
To: u...@cassandra.apache.org
Cc:
Subject: Re: compaction slow while sstable>25GB,limitation of the sstablesize?

Compaction time is proportional to the size of the sstable, yes.  Not
sure how it could be otherwise.  And it does generate a lot of
garbage.  So unless you are seeing concurrent failures in the GC and
corresponding large pause times, your heap should be fine, as long as
the rows you are compacting aren't too large.

2010/4/28 casablinca126.com <casabli...@126.com>:
> Hi,
>     The compaction process is very slow when the newly generated
> sstable file grows beyond 25GB;
> at the same time, the garbage collector runs frequently.
>     First, I have a question: is there a limit on the sstable
> size? If not, is a 2GB heap not
> enough for processing such a large file?
>     I'm using cassandra-0.6.1; the JVM heap size is 2GB (the maximum on a 32-bit
> system).
>
>     Thanks in advance!
> Best Regards,
> Cao Jiguang
>
> --------------
> casablinca126.com
> 2010-04-28
>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
