Hello everybody, I actually have the exact same problem. I have very little data (a few hundred KB) and the memory consumption goes up without any end in sight. On my node I have limited RAM (2 GB) to run Cassandra, but since I have very little data, I thought it would not be a problem. Here is the result of du:
vic...@****:~$ du /opt/cassandra/data/ -h
40K     /opt/cassandra/data/system
1,7M    /opt/cassandra/data/FallingDown
1,7M    /opt/cassandra/data/

Now, if I look at:

vic...@****:~$ sudo ps aux | grep "cassandra"
cassandra 11034 0.2 22.9 *1107772 462764* ? Sl Dec17 6:13 /usr/bin/java -ea -Xms128M -Xmx512M -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Dcom.sun.management.jmxremote.port=8081 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dstorage-config=bin/../conf -Dcassandra-foreground=yes -cp bin/../conf:bin/../build/classes:bin/../lib/antlr-3.1.3.jar:bin/../lib/apache-cassandra-0.6.6.jar:bin/../lib/clhm-production.jar:bin/../lib/commons-cli-1.1.jar:bin/../lib/commons-codec-1.2.jar:bin/../lib/commons-collections-3.2.1.jar:bin/../lib/commons-lang-2.4.jar:bin/../lib/google-collections-1.0.jar:bin/../lib/hadoop-core-0.20.1.jar:bin/../lib/high-scale-lib.jar:bin/../lib/ivy-2.1.0.jar:bin/../lib/jackson-core-asl-1.4.0.jar:bin/../lib/jackson-mapper-asl-1.4.0.jar:bin/../lib/jline-0.9.94.jar:bin/../lib/json-simple-1.1.jar:bin/../lib/libthrift-r917130.jar:bin/../lib/log4j-1.2.14.jar:bin/../lib/slf4j-api-1.5.8.jar:bin/../lib/slf4j-log4j12-1.5.8.jar org.apache.cassandra.thrift.CassandraDaemon

Cassandra uses 462764 KB, roughly 460 MB, for 2 MB of data... and it keeps getting bigger. It is important to know that I do just a few inserts, though quite a lot of reads. Also, Cassandra seems to completely ignore JVM limits such as -Xmx. If I don't stop and relaunch Cassandra every 15 or 20 days, it simply crashes due to OOM errors.

Is there an explanation for this?

Thank you all,
Victor

2010/12/18 Zhu Han <schumi....@gmail.com>

> Here is a typo, sorry...
>
> best regards,
> hanzhu
>
> On Sun, Dec 19, 2010 at 10:29 AM, Zhu Han <schumi....@gmail.com> wrote:
>
>> The problem still seems to be the C-heap of the JVM, which leaks about
>> 70 MB every day. Here is the summary:
>>
>> on 12/19: 00000000010c3000 178548K rw--- [ anon ]
>> on 12/18: 00000000010c3000 110320K rw--- [ anon ]
>> on 12/17: 00000000010c3000 39256K rw--- [ anon ]
>>
>> This should not be the JVM object heap, because the object heap size is
>> fixed by the JVM settings below. Here is the map of the JVM object heap,
>> which remains constant:
>>
>> 00000000010c3000 39256K rw--- [ anon ]
>
> It should be:
>
> 00002b58433c0000 1069824K rw--- [ anon ]
>
>> I'll post it to the OpenJDK mailing list to seek help.
>>
>>> Zhu,
>>> Couple of quick questions:
>>> How many threads are in your JVM?
>>
>> There are hundreds of threads. Here are the settings of Cassandra:
>>
>> *<ConcurrentReads>8</ConcurrentReads>
>> <ConcurrentWrites>128</ConcurrentWrites>*
>>
>> The thread stack size on this server is 1 MB, so I observe hundreds of
>> individual 1 MB mmap segments.
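>>
>> Those stacks are already a sizable chunk of native memory by themselves:
>> a few hundred threads at 1 MB of stack each is a few hundred MB that will
>> never show up inside the object heap. Here is a minimal sketch of pulling
>> the live thread count over JMX to put a number on it (assuming the remote
>> JMX port 8080 from the command line below, with authentication and SSL
>> disabled as configured; the 1 MB per stack is this server's default, not
>> something the JVM reports):
>>
>> import javax.management.MBeanServerConnection;
>> import javax.management.ObjectName;
>> import javax.management.remote.JMXConnector;
>> import javax.management.remote.JMXConnectorFactory;
>> import javax.management.remote.JMXServiceURL;
>>
>> public class StackEstimate {
>>     public static void main(String[] args) throws Exception {
>>         // standard JMX-over-RMI URL; 8080 is the jmxremote.port setting
>>         JMXServiceURL url = new JMXServiceURL(
>>                 "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
>>         JMXConnector c = JMXConnectorFactory.connect(url);
>>         try {
>>             MBeanServerConnection mbs = c.getMBeanServerConnection();
>>             // ThreadCount is a standard attribute of the Threading MXBean
>>             int n = (Integer) mbs.getAttribute(
>>                     new ObjectName("java.lang:type=Threading"), "ThreadCount");
>>             // assume 1 MB of stack per thread (the platform default here)
>>             System.out.printf("%d threads x 1 MB stack ~= %d MB off-heap%n", n, n);
>>         } finally {
>>             c.close();
>>         }
>>     }
>> }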
>>
>>> Can you also post the full command line as well?
>>
>> Sure. All of them are default settings.
>>
>> /usr/bin/java -ea -Xms1G -Xmx1G -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Dcom.sun.management.jmxremote.port=8080 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dstorage-config=bin/../conf -cp bin/../conf:bin/../build/classes:bin/../lib/antlr-3.1.3.jar:bin/../lib/apache-cassandra-0.6.8.jar:bin/../lib/clhm-production.jar:bin/../lib/commons-cli-1.1.jar:bin/../lib/commons-codec-1.2.jar:bin/../lib/commons-collections-3.2.1.jar:bin/../lib/commons-lang-2.4.jar:bin/../lib/google-collections-1.0.jar:bin/../lib/hadoop-core-0.20.1.jar:bin/../lib/high-scale-lib.jar:bin/../lib/ivy-2.1.0.jar:bin/../lib/jackson-core-asl-1.4.0.jar:bin/../lib/jackson-mapper-asl-1.4.0.jar:bin/../lib/jline-0.9.94.jar:bin/../lib/jna.jar:bin/../lib/json-simple-1.1.jar:bin/../lib/libthrift-r917130.jar:bin/../lib/log4j-1.2.14.jar:bin/../lib/slf4j-api-1.5.8.jar:bin/../lib/slf4j-log4j12-1.5.8.jar org.apache.cassandra.thrift.CassandraDaemon
>>
>>> Also, output of cat /proc/meminfo
>>
>> This is an OpenVZ-based testing environment, so /proc/meminfo is not very
>> helpful. Anyway, I paste it here:
>>
>> MemTotal:      9838380 kB
>> MemFree:       4005900 kB
>> Buffers:             0 kB
>> Cached:              0 kB
>> SwapCached:          0 kB
>> Active:              0 kB
>> Inactive:            0 kB
>> HighTotal:           0 kB
>> HighFree:            0 kB
>> LowTotal:      9838380 kB
>> LowFree:       4005900 kB
>> SwapTotal:           0 kB
>> SwapFree:            0 kB
>> Dirty:               0 kB
>> Writeback:           0 kB
>> AnonPages:           0 kB
>> Mapped:              0 kB
>> Slab:                0 kB
>> PageTables:          0 kB
>> NFS_Unstable:        0 kB
>> Bounce:              0 kB
>> CommitLimit:         0 kB
>> Committed_AS:        0 kB
>> VmallocTotal:        0 kB
>> VmallocUsed:         0 kB
>> VmallocChunk:        0 kB
>> HugePages_Total:     0
>> HugePages_Free:      0
>> HugePages_Rsvd:      0
>> Hugepagesize:     2048 kB
>>
>>> thanks,
>>> Sri
>>>
>>> On Fri, Dec 17, 2010 at 7:15 PM, Zhu Han <schumi....@gmail.com> wrote:
>>>
>>> > Seems like the problem is still there after I upgraded to "OpenJDK
>>> > Runtime Environment (IcedTea6 1.9.2)". So it is not related to the
>>> > bug I reported two days ago.
>>> >
>>> > Can somebody else share some info with us? What's the Java environment
>>> > you use? Is it stable for long-lived Cassandra instances?
>>> >
>>> > best regards,
>>> > hanzhu
>>> >
>>> > On Thu, Dec 16, 2010 at 9:28 PM, Zhu Han <schumi....@gmail.com> wrote:
>>> >
>>> > > I've tried it, but it did not work for me this afternoon.
>>> > >
>>> > > Thank you!
>>> > >
>>> > > best regards,
>>> > > hanzhu
>>> > >
>>> > > On Thu, Dec 16, 2010 at 8:59 PM, Matthew Conway <m...@backupify.com> wrote:
>>> > >
>>> > >> Thanks for debugging this, I'm running into the same problem.
>>> > >> BTW, if you can ssh into your nodes, you can use jconsole over ssh:
>>> > >> http://simplygenius.com/2010/08/jconsole-via-socks-ssh-tunnel.html
>>> > >>
>>> > >> Matt
>>> > >>
>>> > >> On Dec 16, 2010, at 2:39 AM, Zhu Han wrote:
>>> > >>
>>> > >> > Sorry for the spam again. :-)
>>> > >> >
>>> > >> > I think I have found the root cause. Here is a bug report [1] on a
>>> > >> > memory leak in ParNewGC. It is solved in OpenJDK 1.6.0_20 (IcedTea6
>>> > >> > 1.9.2) [2].
>>> > >> >
>>> > >> > So the suggestion is: whoever runs Cassandra on Ubuntu 10.04,
>>> > >> > please upgrade OpenJDK to the latest version.
>>> > >> >
>>> > >> > [1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6824570
>>> > >> > [2] http://blog.fuseyism.com/index.php/2010/09/10/icedtea6-19-released/
>>> > >> >
>>> > >> > best regards,
>>> > >> > hanzhu
>>> > >> >
>>> > >> > On Thu, Dec 16, 2010 at 3:10 PM, Zhu Han <schumi....@gmail.com> wrote:
>>> > >> >
>>> > >> >> The test node is behind a firewall, so I took some time to find a
>>> > >> >> way to get JMX diagnostic information from it.
>>> > >> >>
>>> > >> >> What's interesting is that both the HeapMemoryUsage and
>>> > >> >> NonHeapMemoryUsage reported by the JVM are quite reasonable. So it
>>> > >> >> is a mystery why the JVM process maps such a big anonymous memory
>>> > >> >> region...
>>> > >> >>
>>> > >> >> $ java -Xmx128m -jar /tmp/cmdline-jmxclient-0.10.3.jar - localhost:8080 java.lang:type=Memory HeapMemoryUsage
>>> > >> >> 12/16/2010 15:07:45 +0800 org.archive.jmx.Client HeapMemoryUsage:
>>> > >> >> committed: 1065025536
>>> > >> >> init: 1073741824
>>> > >> >> max: 1065025536
>>> > >> >> used: 18295328
>>> > >> >>
>>> > >> >> $ java -Xmx128m -jar /tmp/cmdline-jmxclient-0.10.3.jar - localhost:8080 java.lang:type=Memory NonHeapMemoryUsage
>>> > >> >> 12/16/2010 15:01:51 +0800 org.archive.jmx.Client NonHeapMemoryUsage:
>>> > >> >> committed: 34308096
>>> > >> >> init: 24313856
>>> > >> >> max: 226492416
>>> > >> >> used: 21475376
>>> > >> >>
>>> > >> >> If anybody is interested in it, I can provide more diagnostic
>>> > >> >> information before I restart the instance.
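>>> > >> >>
>>> > >> >> In case it helps anyone compare the heap as the JVM sees it against
>>> > >> >> the process RSS, the same two numbers can be pulled programmatically
>>> > >> >> instead of via cmdline-jmxclient. A small sketch under the same
>>> > >> >> assumptions (JMX on localhost:8080, auth and SSL off):
>>> > >> >>
>>> > >> >> import java.lang.management.MemoryUsage;
>>> > >> >> import javax.management.MBeanServerConnection;
>>> > >> >> import javax.management.ObjectName;
>>> > >> >> import javax.management.openmbean.CompositeData;
>>> > >> >> import javax.management.remote.JMXConnector;
>>> > >> >> import javax.management.remote.JMXConnectorFactory;
>>> > >> >> import javax.management.remote.JMXServiceURL;
>>> > >> >>
>>> > >> >> public class MemCheck {
>>> > >> >>     public static void main(String[] args) throws Exception {
>>> > >> >>         JMXServiceURL url = new JMXServiceURL(
>>> > >> >>                 "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
>>> > >> >>         JMXConnector c = JMXConnectorFactory.connect(url);
>>> > >> >>         try {
>>> > >> >>             MBeanServerConnection mbs = c.getMBeanServerConnection();
>>> > >> >>             ObjectName mem = new ObjectName("java.lang:type=Memory");
>>> > >> >>             for (String attr : new String[] { "HeapMemoryUsage", "NonHeapMemoryUsage" }) {
>>> > >> >>                 // remote MXBean attributes arrive as CompositeData
>>> > >> >>                 MemoryUsage u = MemoryUsage.from(
>>> > >> >>                         (CompositeData) mbs.getAttribute(mem, attr));
>>> > >> >>                 System.out.printf("%s: committed=%d init=%d max=%d used=%d%n",
>>> > >> >>                         attr, u.getCommitted(), u.getInit(), u.getMax(), u.getUsed());
>>> > >> >>             }
>>> > >> >>         } finally {
>>> > >> >>             c.close();
>>> > >> >>         }
>>> > >> >>     }
>>> > >> >> }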
>>> > >> >>
>>> > >> >> best regards,
>>> > >> >> hanzhu
>>> > >> >>
>>> > >> >> On Thu, Dec 16, 2010 at 1:00 PM, Zhu Han <schumi....@gmail.com> wrote:
>>> > >> >>
>>> > >> >>> After investigating it more deeply, I suspect it is a native
>>> > >> >>> memory leak in the JVM. The large anonymous map in the lower
>>> > >> >>> address space should be the native heap of the JVM, not the Java
>>> > >> >>> object heap. Has anybody met this before?
>>> > >> >>>
>>> > >> >>> I'll try to upgrade the JVM tonight.
>>> > >> >>>
>>> > >> >>> best regards,
>>> > >> >>> hanzhu
>>> > >> >>>
>>> > >> >>> On Thu, Dec 16, 2010 at 10:50 AM, Zhu Han <schumi....@gmail.com> wrote:
>>> > >> >>>
>>> > >> >>>> Hi,
>>> > >> >>>>
>>> > >> >>>> I have a test node with apache-cassandra-0.6.8 on Ubuntu 10.04.
>>> > >> >>>> The hardware environment is an OpenVZ container. The JVM is:
>>> > >> >>>>
>>> > >> >>>> # java -Xmx128m -version
>>> > >> >>>> java version "1.6.0_18"
>>> > >> >>>> OpenJDK Runtime Environment (IcedTea6 1.8.2) (6b18-1.8.2-4ubuntu2)
>>> > >> >>>> OpenJDK 64-Bit Server VM (build 16.0-b13, mixed mode)
>>> > >> >>>>
>>> > >> >>>> These are the memory settings:
>>> > >> >>>>
>>> > >> >>>> "/usr/bin/java -ea -Xms1G -Xmx1G ..."
>>> > >> >>>>
>>> > >> >>>> And the on-disk footprint of the sstables is very small:
>>> > >> >>>>
>>> > >> >>>> # du -sh data/
>>> > >> >>>> 9.8M    data/
>>> > >> >>>>
>>> > >> >>>> The node was infrequently accessed in the last three weeks.
>>> > >> >>>> After that, I observed abnormal memory utilization in top:
>>> > >> >>>>
>>> > >> >>>> PID   USER  PR  NI  *VIRT*   *RES*   SHR  S  %CPU  %MEM  TIME+    COMMAND
>>> > >> >>>> 7836  root  15   0  *3300m*  *2.4g*  13m  S     0  26.0  2:58.51  java
>>> > >> >>>>
>>> > >> >>>> The JVM heap utilization is quite normal:
>>> > >> >>>>
>>> > >> >>>> # sudo jstat -gc -J"-Xmx128m" 7836
>>> > >> >>>> S0C     S1C     S0U    S1U  *EC*       *EU*      *OC*        *OU*       *PC       PU*      YGC  YGCT   FGC  FGCT   GCT
>>> > >> >>>> 8512.0  8512.0  372.8  0.0  *68160.0*  *5225.7*  *963392.0  508200.7  30604.0  18373.4*  480  3.979  2    0.005  3.984
>>> > >> >>>>
>>> > >> >>>> Then I tried pmap to see the native memory mapping. *There are
>>> > >> >>>> two large anonymous mmap regions:*
>>> > >> >>>>
>>> > >> >>>> 00000000080dc000 1573568K rw--- [ anon ]
>>> > >> >>>> 00002b2afc900000 1079180K rw--- [ anon ]
>>> > >> >>>>
>>> > >> >>>> The second one should be the JVM heap. What is the first one?
>>> > >> >>>> An mmap of an sstable should never be an anonymous mmap, but a
>>> > >> >>>> file-backed one. *Is it a native memory leak?* Does Cassandra
>>> > >> >>>> allocate any DirectByteBuffer?
>>> > >> >>>>
>>> > >> >>>> best regards,
>>> > >> >>>> hanzhu
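>>> > >> >>>>
>>> > >> >>>> P.S. For anyone who wants to see what direct buffers look like from
>>> > >> >>>> the OS side, here is a toy program (not Cassandra code; the sizes
>>> > >> >>>> are arbitrary, and the JVM may need -XX:MaxDirectMemorySize raised
>>> > >> >>>> to allow this much). The object heap stays nearly empty while the
>>> > >> >>>> process RSS grows by roughly 256 MB, and the allocations appear in
>>> > >> >>>> pmap as anonymous regions, just like the ones above:
>>> > >> >>>>
>>> > >> >>>> import java.nio.ByteBuffer;
>>> > >> >>>> import java.util.ArrayList;
>>> > >> >>>> import java.util.List;
>>> > >> >>>>
>>> > >> >>>> public class DirectDemo {
>>> > >> >>>>     public static void main(String[] args) throws Exception {
>>> > >> >>>>         // hold references so the buffers are not garbage collected
>>> > >> >>>>         List<ByteBuffer> pinned = new ArrayList<ByteBuffer>();
>>> > >> >>>>         for (int i = 0; i < 16; i++) {
>>> > >> >>>>             // each direct buffer lives outside the -Xmx object heap
>>> > >> >>>>             pinned.add(ByteBuffer.allocateDirect(16 * 1024 * 1024)); // 16 MB
>>> > >> >>>>         }
>>> > >> >>>>         Runtime rt = Runtime.getRuntime();
>>> > >> >>>>         System.out.printf("object heap used: %d MB%n",
>>> > >> >>>>                 (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024));
>>> > >> >>>>         Thread.sleep(60000); // park so top/pmap can inspect the process
>>> > >> >>>>     }
>>> > >> >>>> }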