> When we analyzed the heap, almost all of it was memtables.

What were the top classes? I would normally expect an OOM in pre-1.2 days to be the result of bloom filters, compaction metadata, and index samples.
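[Editor's note: the "top classes" question is usually answered from a `jmap -histo <pid>` dump. A minimal sketch of pulling out the biggest heap consumers, using a few lines from the [2] histogram later in this thread as stand-in input (the temp file path is arbitrary):

```shell
# Sort a saved `jmap -histo <pid>` dump by the #bytes column (3rd field)
# and keep the largest entries. The heredoc stands in for a captured dump.
cat <<'EOF' > /tmp/histo.txt
1: 152855 86035312 int[]
2: 13395 45388008 long[]
3: 49517 9712000 java.lang.Object[]
4: 120094 8415560 char[]
EOF
sort -k3,3nr /tmp/histo.txt | head -3
```

On a real dump this surfaces the classes to report back, e.g. whether memtable-related types dominate.]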
> Is there any known issue with 1.1.5 which causes memtable_total_space_in_mb not to be respected, or not defaulting to 1/3rd of the heap size?

Nothing I can remember. We estimate the in-memory size of the memtables using the live ratio. That's been pretty good for a while now, but you may want to check the change log for changes there.

> The latest test was running on high performance 32-core, 128 GB RAM, 7 RAID-0 1TB disks (regular).

With all those cores, grab the TLAB setting from the 1.2 cassandra-env.sh file.

Cheers

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 1/11/2013, at 2:59 pm, Arindam Barua <aba...@247-inc.com> wrote:

>
> Thank you for your responses. In another recent test, the heap actually got full, and we got an out-of-memory error. When we analyzed the heap, almost all of it was memtables. Is there any known issue with 1.1.5 which causes memtable_total_space_in_mb not to be respected, or not defaulting to 1/3rd of the heap size? Or is it possible that the load in the test is so high that Cassandra is not able to keep flushing, even though it starts the process when memtable_total_space_in_mb is 1/3rd of the heap?
>
> We recently switched to LeveledCompaction; however, when we got the earlier heap warning, that was running on SizeTiered.
> The latest test was running on high performance 32-core, 128 GB RAM, 7 RAID-0 1TB disks (regular). Earlier tests were run on lesser hardware with the same load, but there was no memory problem. We are running more tests to check if this is always reproducible.
>
> Answering some of the earlier questions if it helps:
>
> We have Cassandra 1.1.5 running in production. Upgrading to the latest 1.2.x release is on the roadmap, but till then this needs to be figured out.
>
>> - How much data do you have per node?
> We are running into these errors while running tests in QA starting with 0 load. These are around 4 hr tests which end up adding under 1 GB of data on each node of a 4-node ring, or a 2-node ring.
>
>> - What is the value of "index_interval" (cassandra.yaml)?
> It's the default value of 128.
>
> Thanks,
> Arindam
>
> -----Original Message-----
> From: Aaron Morton [mailto:aa...@thelastpickle.com]
> Sent: Monday, October 28, 2013 12:09 AM
> To: Cassandra User
> Subject: Re: Heap almost full
>
>> [1] [14/10/2013:19:15:08 PDT] ScheduledTasks:1: WARN GCInspector.java (line 145) Heap is 0.8287082580489245 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
>
> This means that the CMS GC was unable to free memory quickly; you've not run out, but you may under heavy load.
>
> CMS uses CPU resources to do its job; how much CPU do you have available?
> Check the behaviour of the CMS collector using JConsole or another tool to watch the heap size; you should see a nice saw-tooth graph. It should gradually grow, then drop quickly to below 3-ish GB. If the heap size after a CMS collection is not low enough, you will spend more time in GC.
>
> You may also want to adjust flush_largest_memtables_at to .8 to give CMS a chance to do its work. It starts at .75.
>
>> In 1.2+ bloom filters are off-heap, you can use vnodes...
> +1 for 1.2 with off-heap bloom filters.
>
>> - increasing the heap to 10GB.
>
> -1
> Unless you have a node under heavy memory pressure (pre-1.2, with 1+ billion rows and lots of bloom filters), increasing the heap is not the answer. It will increase the time taken for ParNew and CMS, and just kicks the problem down the road.
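[Editor's note: for the 8 GB heap used in this thread, the two heap fractions discussed above (memtable_total_space_in_mb left unset defaulting to a third of the heap, and flush_largest_memtables_at moving from the default .75 to .8) work out as follows. A back-of-the-envelope sketch, not anything Cassandra itself runs:

```shell
HEAP_MB=8192   # the 8 GB heap the cluster in this thread is using

# memtable_total_space_in_mb left unset defaults to one third of the heap
MEMTABLE_SPACE_MB=$((HEAP_MB / 3))

# flush_largest_memtables_at: default .75 versus the suggested .8
FLUSH_AT_DEFAULT_MB=$((HEAP_MB * 75 / 100))
FLUSH_AT_RAISED_MB=$((HEAP_MB * 80 / 100))

echo "$MEMTABLE_SPACE_MB $FLUSH_AT_DEFAULT_MB $FLUSH_AT_RAISED_MB"
# → 2730 6144 6553
```

So memtables should start flushing well before the heap-nearly-full warning at ~82% fires, which is why the warning points at something else holding memory.]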
>
> Cheers
>
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
>
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On 26/10/2013, at 8:32 am, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>
>> If you are starting with Cassandra, I really advise you to start with 1.2.11.
>>
>> In 1.2+ bloom filters are off-heap, and you can use vnodes...
>>
>> "I summed up the bloom filter usage reported by nodetool cfstats in all the CFs and it was under 50 MB."
>>
>> This is quite a small value. Is there no error in your conversion from the bytes reported in cfstats?
>>
>> If you are trying to understand this, could you tell us:
>>
>> - How much data do you have per node?
>> - What is the value of "index_interval" (cassandra.yaml)?
>>
>> If you are trying to fix this, you can try:
>>
>> - changing memtable_total_space_in_mb to 1024
>> - increasing the heap to 10GB.
>>
>> Hope this helps somehow :).
>>
>> Good luck
>>
>>
>> 2013/10/16 Arindam Barua <aba...@247-inc.com>
>>
>> During performance testing being run on our 4-node Cassandra 1.1.5 cluster, we are seeing warning logs about the heap being almost full - [1]. I'm trying to figure out why, and how to prevent it.
>>
>> The tests are being run on a Cassandra ring consisting of 4 dedicated boxes with 32 GB of RAM each.
>> The heap size is set to 8 GB as recommended.
>> All the other relevant settings I know of are the defaults:
>>
>> - memtable_total_space_in_mb is not set in the yaml, so it should default to 1/3rd of the heap size.
>> - The key cache should be 100 MB at most. I checked the key cache the day after the tests were run via nodetool info, and it reported 4.5 MB being used.
>> - row cache is not being used.
>> - I summed up the bloom filter usage reported by nodetool cfstats in all the CFs and it was under 50 MB.
>>
>> The resident size of the cassandra process according to top is 8.4g even now.
>> Did a heap histogram using jmap, but I'm not sure how to interpret those results usefully - [2].
>>
>> Performance test details:
>>
>> - The test is write-only, and writes a relatively large amount of data to one CF.
>> - There is some other traffic constantly running that writes smaller amounts of data to many CFs, and does some reads.
>>
>> The total number of CFs is 114, but quite a few of them are not used.
>>
>> Thanks,
>> Arindam
>>
>> [1] [14/10/2013:19:15:08 PDT] ScheduledTasks:1: WARN GCInspector.java (line 145) Heap is 0.8287082580489245 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
>>
>> [2] Object Histogram:
>>
>> num       #instances   #bytes      Class description
>> --------------------------------------------------------------------------
>> 1:        152855       86035312    int[]
>> 2:        13395        45388008    long[]
>> 3:        49517        9712000     java.lang.Object[]
>> 4:        120094       8415560     char[]
>> 5:        145106       6965088     java.nio.HeapByteBuffer
>> 6:        40525        5891040     * ConstMethodKlass
>> 7:        231258       5550192     java.lang.Long
>> 8:        40525        5521592     * MethodKlass
>> 9:        134574       5382960     java.math.BigInteger
>> 10:       36692        4403040     java.net.SocksSocketImpl
>> 11:       3741         4385048     * ConstantPoolKlass
>> 12:       63875        3538128     * SymbolKlass
>> 13:       104048       3329536     java.lang.String
>> 14:       132636       3183264     org.apache.cassandra.db.DecoratedKey
>> 15:       97466        3118912     java.util.concurrent.ConcurrentHashMap$HashEntry
>> 16:       97216        3110912     com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node
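[Editor's note: the "summed up the bloom filter usage" figure questioned earlier in the thread can be produced by piping `nodetool cfstats` through awk. A sketch using canned sample lines in place of a live node; the field name matches 1.1-era cfstats output, but the byte values here are made up:

```shell
# Sum "Bloom Filter Space Used" (reported in bytes) across all column
# families. On a live node the input would come from `nodetool cfstats`;
# the printf below stands in with two hypothetical CF entries.
printf '%s\n' \
  'Bloom Filter Space Used: 24816' \
  'Bloom Filter Space Used: 1048576' |
  awk '/Bloom Filter Space Used/ { total += $NF } END { print total }'
# → 1073392
```

Dividing the total by 1048576 gives MB, which is where a conversion slip of the kind Alain asks about could creep in.]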