And here is my cassandra-env.sh: https://gist.github.com/kunalg/2c092cb2450c62be9a20
Kunal

On 11 July 2015 at 00:04, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
> From the jhat output, the top 10 entries for "Instance Count for All Classes
> (excluding platform)" are:
>
> 2088223 instances of class org.apache.cassandra.db.BufferCell
> 1983245 instances of class org.apache.cassandra.db.composites.CompoundSparseCellName
> 1885974 instances of class org.apache.cassandra.db.composites.CompoundDenseCellName
> 630000 instances of class org.apache.cassandra.io.sstable.IndexHelper$IndexInfo
> 503687 instances of class org.apache.cassandra.db.BufferDeletedCell
> 378206 instances of class org.apache.cassandra.cql3.ColumnIdentifier
> 101800 instances of class org.apache.cassandra.utils.concurrent.Ref
> 101800 instances of class org.apache.cassandra.utils.concurrent.Ref$State
> 90704 instances of class org.apache.cassandra.utils.concurrent.Ref$GlobalState
> 71123 instances of class org.apache.cassandra.db.BufferDecoratedKey
>
> At the bottom of the page, it shows:
> Total of 8739510 instances occupying 193607512 bytes.
> JFYI.
>
> Kunal
>
> On 10 July 2015 at 23:49, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
>
>> Thanks for the quick reply.
>>
>> 1. I don't know what thresholds I should look for. So, to save this
>> back-and-forth, I'm attaching the cfstats output for the keyspace.
>>
>> There is one table - daily_challenges - which shows compacted partition
>> max bytes of ~460M, and another one - daily_guest_logins - which shows
>> compacted partition max bytes of ~36M.
>>
>> Can that be a problem?
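To avoid eyeballing a whole cfstats dump for this, a filter along these lines can flag oversized tables automatically. This is only a sketch: the `nodetool_cfstats_sample` function below fakes two lines of `nodetool cfstats` output matching the numbers above (real output varies slightly by version), and the 100 MB threshold is an assumption, not an official limit.

```shell
#!/bin/sh
# Flag tables whose "Compacted partition maximum bytes" exceeds ~100 MB,
# a ballpark at which large partitions commonly start causing GC pressure.
# Hypothetical sample standing in for `nodetool cfstats` output:
nodetool_cfstats_sample() {
cat <<'EOF'
Table: daily_challenges
Compacted partition maximum bytes: 460000000
Table: daily_guest_logins
Compacted partition maximum bytes: 36000000
EOF
}

# Remember the current table name; print it when its max partition
# size crosses the threshold.
nodetool_cfstats_sample | awk '
  /Table:/ { table = $2 }
  /Compacted partition maximum bytes:/ {
    bytes = $NF
    if (bytes > 100 * 1024 * 1024)
      printf "%s: %.0f MB max partition\n", table, bytes / 1048576
  }'
```

With the sample input this prints only `daily_challenges`; the ~36 MB table stays under the threshold. In practice you would pipe the real `nodetool cfstats <keyspace>` output into the awk filter instead.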
>> Here is the CQL schema for the daily_challenges column family:
>>
>> CREATE TABLE app_10001.daily_challenges (
>>     segment_type text,
>>     date timestamp,
>>     user_id int,
>>     sess_id text,
>>     data text,
>>     deleted boolean,
>>     PRIMARY KEY (segment_type, date, user_id, sess_id)
>> ) WITH CLUSTERING ORDER BY (date DESC, user_id ASC, sess_id ASC)
>>     AND bloom_filter_fp_chance = 0.01
>>     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>>     AND comment = ''
>>     AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
>>     AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>     AND dclocal_read_repair_chance = 0.1
>>     AND default_time_to_live = 0
>>     AND gc_grace_seconds = 864000
>>     AND max_index_interval = 2048
>>     AND memtable_flush_period_in_ms = 0
>>     AND min_index_interval = 128
>>     AND read_repair_chance = 0.0
>>     AND speculative_retry = '99.0PERCENTILE';
>>
>> CREATE INDEX idx_deleted ON app_10001.daily_challenges (deleted);
>>
>> 2. I don't know - how do I check? As I mentioned, I just installed the
>> dsc21 package from DataStax's debian repo (ver 2.1.7).
>>
>> Really appreciate your help.
>>
>> Thanks,
>> Kunal
>>
>> On 10 July 2015 at 23:33, Sebastian Estevez <sebastian.este...@datastax.com> wrote:
>>
>>> 1. You want to look at the # of sstables in cfhistograms, or in cfstats look at:
>>> Compacted partition maximum bytes
>>> Maximum live cells per slice
>>>
>>> 2) No. Here's the env.sh from 3.0, which should work with some tweaks:
>>> https://github.com/tobert/cassandra/blob/0f70469985d62aeadc20b41dc9cdc9d72a035c64/conf/cassandra-env.sh
>>>
>>> You'll at least have to modify the jamm version to what's in yours.
>>> I think it's 2.5.
>>>
>>> All the best,
>>>
>>> Sebastián Estévez
>>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>>>
>>> DataStax is the fastest, most scalable distributed database technology,
>>> delivering Apache Cassandra to the world's most innovative enterprises.
>>> DataStax is built to be agile, always-on, and predictably scalable to any
>>> size. With more than 500 customers in 45 countries, DataStax is the
>>> database technology and transactional backbone of choice for the world's
>>> most innovative companies such as Netflix, Adobe, Intuit, and eBay.
>>>
>>> On Fri, Jul 10, 2015 at 1:42 PM, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
>>>
>>>> Thanks, Sebastian.
>>>>
>>>> A couple of questions (I'm really new to Cassandra):
>>>> 1. How do I interpret the output of 'nodetool cfstats' to figure out
>>>> the issues? Any documentation pointer on that would be helpful.
>>>>
>>>> 2. I'm primarily a python/c developer - so, totally clueless about the
>>>> JVM environment. So, please bear with me, as I will need a lot of
>>>> hand-holding. Should I just copy+paste the settings you gave and try to
>>>> restart the failing cassandra server?
>>>>
>>>> Thanks,
>>>> Kunal
>>>>
>>>> On 10 July 2015 at 22:35, Sebastian Estevez <sebastian.este...@datastax.com> wrote:
>>>>
>>>>> #1 You need more information.
>>>>>
>>>>> a) Take a look at your .hprof file (the memory heap dump from the OOM)
>>>>> with an introspection tool like jhat, visualvm, or Java Flight Recorder,
>>>>> and see what is using up your RAM.
>>>>>
>>>>> b) How big are your large rows (use nodetool cfstats on each node)? If
>>>>> your data model is bad, you are going to have to re-design it no matter
>>>>> what.
>>>>>
>>>>> #2 As a possible workaround, try using the G1GC collector with the
>>>>> settings from C* 3.0 instead of CMS. I've seen lots of success with it
>>>>> lately (tl;dr G1GC is much simpler than CMS and almost as good as a
>>>>> finely tuned CMS). *Note:* Use it with the latest Java 8 from Oracle.
>>>>> Do *not* set the newgen size; G1 sets it dynamically:
>>>>>
>>>>>> # min and max heap sizes should be set to the same value to avoid
>>>>>> # stop-the-world GC pauses during resize, and so that we can lock the
>>>>>> # heap in memory on startup to prevent any of it from being swapped
>>>>>> # out.
>>>>>> JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}"
>>>>>> JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"
>>>>>>
>>>>>> # Per-thread stack size.
>>>>>> JVM_OPTS="$JVM_OPTS -Xss256k"
>>>>>>
>>>>>> # Use the Hotspot garbage-first collector.
>>>>>> JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
>>>>>>
>>>>>> # Have the JVM do less remembered set work during STW, instead
>>>>>> # preferring concurrent GC. Reduces p99.9 latency.
>>>>>> JVM_OPTS="$JVM_OPTS -XX:G1RSetUpdatingPauseTimePercent=5"
>>>>>>
>>>>>> # The JVM maximum is 8 PGC threads and 1/4 of that for ConcGC.
>>>>>> # Machines with > 10 cores may need additional threads.
>>>>>> # Increase to <= full cores (do not count HT cores).
>>>>>> #JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=16"
>>>>>> #JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=16"
>>>>>>
>>>>>> # Main G1GC tunable: lowering the pause target will lower throughput
>>>>>> # and vice versa.
>>>>>> # 200ms is the JVM default and lowest viable setting.
>>>>>> # 1000ms increases throughput. Keep it smaller than the timeouts in
>>>>>> # cassandra.yaml.
>>>>>> JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=500"
>>>>>>
>>>>>> # Do reference processing in parallel GC.
>>>>>> JVM_OPTS="$JVM_OPTS -XX:+ParallelRefProcEnabled"
>>>>>>
>>>>>> # This may help eliminate STW.
>>>>>> # The default in Hotspot 8u40 is 40%.
>>>>>> #JVM_OPTS="$JVM_OPTS -XX:InitiatingHeapOccupancyPercent=25"
>>>>>>
>>>>>> # For workloads that do large allocations, increasing the region
>>>>>> # size may make things more efficient. Otherwise, let the JVM
>>>>>> # set this automatically.
>>>>>> #JVM_OPTS="$JVM_OPTS -XX:G1HeapRegionSize=32m"
>>>>>>
>>>>>> # Make sure all memory is faulted and zeroed on startup.
>>>>>> # This helps prevent soft faults in containers and makes
>>>>>> # transparent hugepage allocation more effective.
>>>>>> JVM_OPTS="$JVM_OPTS -XX:+AlwaysPreTouch"
>>>>>>
>>>>>> # Biased locking does not benefit Cassandra.
>>>>>> JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
>>>>>>
>>>>>> # Larger interned string table, for gossip's benefit (CASSANDRA-6410)
>>>>>> JVM_OPTS="$JVM_OPTS -XX:StringTableSize=1000003"
>>>>>>
>>>>>> # Enable thread-local allocation blocks and allow the JVM to
>>>>>> # automatically resize them at runtime.
>>>>>> JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"
>>>>>>
>>>>>> # http://www.evanjones.ca/jvm-mmap-pause.html
>>>>>> JVM_OPTS="$JVM_OPTS -XX:+PerfDisableSharedMem"
>>>>>
>>>>> All the best,
>>>>>
>>>>> Sebastián Estévez
>>>>> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
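Since a stray typo in these options can keep the JVM from starting at all, it may be worth sanity-checking the assembled JVM_OPTS before restarting the node. A minimal sketch, assuming the Xms/Xmx and pause-target conventions from the snippet above (the 8G heap value here is only an example, not a recommendation):

```shell
#!/bin/sh
# Assemble the heap/G1 options the same way cassandra-env.sh does, then
# verify two invariants from the comments above: min and max heap must
# match, and the pause target must stay below the ~1000 ms ceiling.
MAX_HEAP_SIZE="8G"
JVM_OPTS="-Xms${MAX_HEAP_SIZE} -Xmx${MAX_HEAP_SIZE}"
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC -XX:MaxGCPauseMillis=500"

# Pull the values back out of the option string.
xms=$(echo "$JVM_OPTS" | grep -o 'Xms[^ ]*' | cut -c4-)
xmx=$(echo "$JVM_OPTS" | grep -o 'Xmx[^ ]*' | cut -c4-)
pause=$(echo "$JVM_OPTS" | grep -o 'MaxGCPauseMillis=[0-9]*' | cut -d= -f2)

[ "$xms" = "$xmx" ] || { echo "heap min/max differ: $xms vs $xmx"; exit 1; }
[ "$pause" -lt 1000 ] || { echo "pause target too high: ${pause}ms"; exit 1; }
echo "JVM_OPTS look sane"
```

Running a check like this against the edited cassandra-env.sh (e.g. by sourcing it first) catches mismatched heap bounds before a restart does.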
>>>>>
>>>>> On Fri, Jul 10, 2015 at 12:55 PM, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
>>>>>
>>>>>> I upgraded my instance from 8GB to a 14GB one, and
>>>>>> allocated 8GB to the JVM heap in cassandra-env.sh.
>>>>>>
>>>>>> And now, it crashes even faster with an OOM..
>>>>>>
>>>>>> Earlier, with the 4GB heap, I could get up to ~90% replication completion
>>>>>> (as reported by nodetool netstats); now, with the 8GB heap, I cannot even
>>>>>> get there. I've already restarted the cassandra service 4 times with the
>>>>>> 8GB heap.
>>>>>>
>>>>>> No clue what's going on.. :(
>>>>>>
>>>>>> Kunal
>>>>>>
>>>>>> On 10 July 2015 at 17:45, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>>>>>>
>>>>>>> You, and only you, are responsible for knowing your data and data
>>>>>>> model.
>>>>>>>
>>>>>>> If columns per row or rows per partition can be large, then an 8GB
>>>>>>> system is probably too small. But the real issue is that you need to
>>>>>>> keep your partition size from getting too large.
>>>>>>>
>>>>>>> Generally, an 8GB system is okay, but only for reasonably-sized
>>>>>>> partitions, like under 10MB.
>>>>>>>
>>>>>>> -- Jack Krupansky
>>>>>>>
>>>>>>> On Fri, Jul 10, 2015 at 8:05 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I'm new to Cassandra.
>>>>>>>> How do I find those out? - mainly, the partition params that you
>>>>>>>> asked for. The others, I think I can figure out.
>>>>>>>>
>>>>>>>> We don't have any large objects/blobs in the column values - it's
>>>>>>>> all textual, date-time, numeric and uuid data.
>>>>>>>>
>>>>>>>> We use cassandra primarily to store segmentation data - with
>>>>>>>> segment type as the partition key.
>>>>>>>> That is again divided into two separate
>>>>>>>> column families; but they have a similar structure.
>>>>>>>>
>>>>>>>> Columns per row can be fairly large - each segment type is the row
>>>>>>>> key, with the associated user ids and a timestamp as the column value.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Kunal
>>>>>>>>
>>>>>>>> On 10 July 2015 at 16:36, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> What does your data and data model look like - partition size,
>>>>>>>>> rows per partition, number of columns per row, any large values/blobs
>>>>>>>>> in column values?
>>>>>>>>>
>>>>>>>>> You could run fine on an 8GB system, but only if your rows and
>>>>>>>>> partitions are reasonably small. Any large partitions could blow you
>>>>>>>>> away.
>>>>>>>>>
>>>>>>>>> -- Jack Krupansky
>>>>>>>>>
>>>>>>>>> On Fri, Jul 10, 2015 at 4:22 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Attaching the stack dump captured from the last OOM.
>>>>>>>>>>
>>>>>>>>>> Kunal
>>>>>>>>>>
>>>>>>>>>> On 10 July 2015 at 13:32, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Forgot to mention: the data size is not that big - it's barely
>>>>>>>>>>> 10GB in all.
>>>>>>>>>>>
>>>>>>>>>>> Kunal
>>>>>>>>>>>
>>>>>>>>>>> On 10 July 2015 at 13:29, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I have a 2-node setup on Azure (East US region) running Ubuntu
>>>>>>>>>>>> Server 14.04 LTS.
>>>>>>>>>>>> Both nodes have 8GB RAM.
>>>>>>>>>>>>
>>>>>>>>>>>> One of the nodes (the seed node) died with an OOM - so, I am
>>>>>>>>>>>> trying to add a replacement node with the same configuration.
>>>>>>>>>>>>
>>>>>>>>>>>> The problem is this new node also keeps dying with an OOM - I've
>>>>>>>>>>>> restarted the cassandra service like 8-10 times hoping that it
>>>>>>>>>>>> would finish the replication. But it didn't help.
>>>>>>>>>>>>
>>>>>>>>>>>> The one node that is still up is happily chugging along.
>>>>>>>>>>>> All nodes have a similar configuration - with libjna installed.
>>>>>>>>>>>>
>>>>>>>>>>>> Cassandra is installed from DataStax's debian repo - pkg: dsc21,
>>>>>>>>>>>> version 2.1.7.
>>>>>>>>>>>> I started off with the default configuration - i.e. the default
>>>>>>>>>>>> cassandra-env.sh - which calculates the heap size automatically
>>>>>>>>>>>> (1/4 * RAM = 2GB).
>>>>>>>>>>>>
>>>>>>>>>>>> But that didn't help. So, I then tried to increase the heap to
>>>>>>>>>>>> 4GB manually and restarted. It still keeps crashing.
>>>>>>>>>>>>
>>>>>>>>>>>> Any clue as to why it's happening?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Kunal
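For reference, the automatic sizing described above can be reproduced outside of cassandra-env.sh. This sketch mirrors the stock 2.1 formula as I read it (the larger of: half the RAM capped at 1 GB, and a quarter of the RAM capped at 8 GB), which is how an 8 GB box ends up with a 2 GB default heap:

```shell
#!/bin/sh
# Mirror the default MAX_HEAP_SIZE calculation from the stock
# cassandra-env.sh (2.1): max(min(ram/2, 1024 MB), min(ram/4, 8192 MB)).
calculate_heap_mb() {
  ram_mb=$1
  half=$(( ram_mb / 2 ))
  if [ "$half" -gt 1024 ]; then half=1024; fi
  quarter=$(( ram_mb / 4 ))
  if [ "$quarter" -gt 8192 ]; then quarter=8192; fi
  # Take the larger of the two capped values.
  if [ "$half" -gt "$quarter" ]; then echo "$half"; else echo "$quarter"; fi
}

# An 8 GB node gets a 2 GB default heap, as reported in the thread.
calculate_heap_mb 8192    # prints 2048
```

By this formula a 14 GB node would default to about 3.5 GB, so the 8 GB heap used above is a manual override; note that heap size alone did not fix the OOM here, which points back at partition size.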