I can test on 3 servers, and I can test using up to 86 GB on each. Is there anything specific you want to test in this case? I am running Cassandra 0.6.3 with a much smaller amount of RAM, but if you think it is interesting I will add it to my to-do list. I don't know if I will have more servers soon, because that is still under consideration.
/Justus

-----Original message-----
From: B. Todd Burruss [mailto:bburr...@real.com]
Sent: 27 July 2010 01:33
To: user@cassandra.apache.org
Subject: Re: Key Caching

I run Cassandra with a 30 GB heap on machines with 48 GB total, with good results. I don't use more just because I want to leave some for the OS to cache disk pages, etc.

I did have a problem a couple of times with GC doing a full stop on the JVM because it couldn't keep up. My understanding of the CMS GC is that it kicks in when a certain percentage of the JVM heap is used; by tweaking -XX:CMSInitiatingOccupancyFraction you can make it kick in sooner (or later), and this fixed it for me (a small occupancy-monitoring sketch follows after this thread).

My JVM opts differ just slightly from the latest Cassandra changes in 0.6:

JVM_OPTS=" \
    -ea \
    -Xms30G \
    -Xmx30G \
    -XX:SurvivorRatio=128 \
    -XX:MaxTenuringThreshold=0 \
    -XX:TargetSurvivorRatio=90 \
    -XX:+AggressiveOpts \
    -XX:+UseParNewGC \
    -XX:+UseConcMarkSweepGC \
    -XX:+CMSParallelRemarkEnabled \
    -XX:CMSInitiatingOccupancyFraction=88 \
    -XX:+HeapDumpOnOutOfMemoryError \
    -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc \
    -Dnetworkaddress.cache.ttl=60 \
    -Dcom.sun.management.jmxremote.port=6786 \
    -Dcom.sun.management.jmxremote.ssl=false \
    -Dcom.sun.management.jmxremote.authenticate=false \
"

On Mon, 2010-07-26 at 14:04 -0700, Peter Schuller wrote:
> > If the cache is stored in the heap, how big can the heap be made
> > realistically on a 24 GB RAM machine? I am a Java newbie, but I have
> > read concerns about going over 8 GB for the heap, as the GC can be
> > too painful/take too long. I have already seen timeout issues ("node
> > is dead" errors) under load during GC or compaction. Can/should the
> > heap be set to 16 GB with 24 GB of RAM?
>
> I have never run Cassandra in production with such a large heap, so
> I'll let others comment on practical experience with that.
>
> In general, however, with the JVM and the CMS garbage collector (which
> is enabled by default with Cassandra), having a large heap is not
> necessarily a problem, depending on the application's workload.
>
> In terms of GCs taking too long: with the default throughput collector
> used by the JVM, you will tend to see the longest pause times scale
> roughly linearly with heap size. Most pauses will still be short
> (these are what are known as young-generation collections), but
> periodically a so-called full collection is done. With the throughput
> collector, this implies stopping all Java threads while the *entire*
> Java heap is garbage collected.
>
> With the CMS (Concurrent Mark/Sweep) collector, the intent is that the
> periodic scans of the entire Java heap are done concurrently with the
> application, without pausing it. Fallback to full stop-the-world
> garbage collections can still happen if CMS fails to complete such
> work fast enough, in which case tweaking of garbage collection
> settings may be required.
>
> One thing to consider in any case is how much memory you actually
> need; the more you give to the JVM, the less there is left for the OS
> to cache file contents. If, for example, your true working set in
> Cassandra is, to grab a random number, 3 GB and you set the heap size
> to 15 GB, you are wasting a lot of memory by allowing the JVM to
> postpone GC until it starts approaching the 15 GB mark. This is
> actually good (normally) for overall GC throughput, but not
> necessarily good overall for something like Cassandra, where there is
> a direct trade-off with cache eviction in the operating system
> possibly causing additional I/O.
>
> Personally, I'd be very interested in hearing any stories about
> running Cassandra nodes with 10+ GB heap sizes, and how well it has
> worked. My gut feeling is that it should work reasonably well, but I
> have no evidence of that and I may very well be wrong. Anyone?
>
> (On a related note, my limited testing of the G1 collector with
> Cassandra has indicated it works pretty well. Though I'm concerned
> about the weak-ref-finalization-based cleanup of compacted sstables,
> since the G1 collector will be much less deterministic about when a
> particular object may be collected. Has anyone deployed Cassandra
> with G1 on very large heaps under real load?)
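To make the non-determinism concrete: reference-based cleanup means the files of a compacted sstable are deleted only once the collector gets around to clearing a reference to the corresponding in-memory object, so the deletion delay is whatever the GC's schedule happens to be. Below is a minimal standalone sketch of that dependency (the class name is hypothetical, and this is an illustration of the mechanism rather than Cassandra's actual cleanup code):

    import java.lang.ref.ReferenceQueue;
    import java.lang.ref.WeakReference;

    public class WeakRefTiming {
        public static void main(String[] args) throws InterruptedException {
            ReferenceQueue<Object> queue = new ReferenceQueue<>();
            // Stands in for the in-memory handle of a compacted sstable.
            Object resource = new Object();
            WeakReference<Object> ref = new WeakReference<>(resource, queue);

            resource = null; // no strong references remain; the object is collectable

            // Cleanup (e.g. deleting the sstable's files) can only proceed once
            // the collector clears the reference and enqueues it -- and *when*
            // that happens depends on the GC's schedule, not on program order.
            long start = System.nanoTime();
            while (queue.poll() == null) {
                System.gc(); // only a hint; the JVM may defer or ignore it
                Thread.sleep(10);
            }
            System.out.printf("reference cleared after %.1f ms%n",
                    (System.nanoTime() - start) / 1e6);
            System.out.println("ref.get() = " + ref.get()); // null once cleared
        }
    }

With -XX:+DisableExplicitGC the System.gc() hint is ignored outright, and the loop waits for a natural collection to clear the reference; on a large, mostly idle heap that can take a long time, which is exactly the concern with a less deterministic collector.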
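And on the -XX:CMSInitiatingOccupancyFraction tuning mentioned earlier in the thread: the flag sets the old-generation occupancy percentage at which CMS starts a concurrent cycle (note it is only honored strictly together with -XX:+UseCMSInitiatingOccupancyOnly; without that flag, HotSpot treats it as a starting hint and adapts the trigger from runtime statistics). Here is a minimal sketch, with a hypothetical class name, that polls old-gen occupancy through the standard java.lang.management API so you can see how close a node runs to the 88% trigger in the opts above:

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;
    import java.lang.management.MemoryUsage;

    public class OldGenOccupancy {
        // Mirrors -XX:CMSInitiatingOccupancyFraction=88 from the opts above.
        private static final double INITIATING_FRACTION = 0.88;

        public static void main(String[] args) throws InterruptedException {
            while (true) {
                for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                    // On HotSpot with CMS enabled, the pool is named "CMS Old Gen".
                    if (pool.getName().contains("Old Gen")) {
                        MemoryUsage u = pool.getUsage();
                        if (u.getMax() > 0) {
                            double used = (double) u.getUsed() / u.getMax();
                            System.out.printf("%s: %.1f%% of max (CMS initiates near %.0f%%)%n",
                                    pool.getName(), 100 * used, 100 * INITIATING_FRACTION);
                        }
                    }
                }
                Thread.sleep(5000); // poll every 5 seconds
            }
        }
    }

Run as-is, this watches its own JVM; to watch a Cassandra node instead, connect a JMX client such as jconsole to the remote port the opts above expose (6786) and inspect the same memory pool.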