I am curious, thanks. (I am in the same situation: big nodes choking under a 300-400G data load, 500 million keys.)
What does your "cfhistograms Keyspace CF" output look like? How many sstable reads? What is your bloom filter fp chance?

Regards,
Andras

On 20/03/13 13:54, "Hiller, Dean" <dean.hil...@nrel.gov> wrote:

>Oh, and to give you an idea of memory savings, we had a node at 10G RAM
>usage...we had upped a few nodes to 16G from 8G as we don't have our new
>nodes ready yet(we know we should be at 8G but we would have a dead
>cluster if we did that).
>
>On startup, the initial RAM is around 6-8G. Startup with
>index_interval=512 resulted in a 2.5G-2.8G initial RAM and I have seen it
>grow to 3.3G and back down to 2.8G. We just rolled this out an hour ago.
>Our website response time is the same as before as well.
>
>We rolled to only 2 nodes(out of 6) in our cluster so far to test it out
>and let it soak a bit. We will slowly roll to more nodes monitoring the
>performance as we go. Also, since dynamic snitch is not working with
>SimpleSnitch, we know that just one slow node affects our website(from
>personal pain/experience of nodes hitting RAM limit and slowing down
>causing website to get real slow).
>
>Dean
>
>On 3/20/13 6:41 AM, "Andras Szerdahelyi"
><andras.szerdahe...@ignitionone.com> wrote:
>
>>2. Upping index_interval from 128 to 512 (this seemed to reduce our
>>memory usage significantly!!!)
>>
>>
>>I'd be very careful with that as a one-stop improvement solution for two
>>reasons AFAIK
>>1) you have to rebuild sstables ( not an issue if you are evaluating,
>>doing test writes.. Etc, not so much in production )
>>2) it can affect reads ( number of sstable reads to serve a read )
>>especially if your key/row cache is ineffective
>>
>>On 20/03/13 13:34, "Hiller, Dean" <dean.hil...@nrel.gov> wrote:
>>
>>>Also, look at the cassandra logs. I bet you see the typical...blah blah
>>>is at 0.85, doing memory cleanup which is not exactly GC but cassandra
>>>memory management.....and of course, you have GC on top of that.
>>>
>>>If you need to get your memory down, there are multiple ways
>>>1. Switching size tiered compaction to leveled compaction(with 1 billion
>>>narrow rows, this helped us quite a bit)
>>>2. Upping index_interval from 128 to 512 (this seemed to reduce our
>>>memory usage significantly!!!)
>>>3. Just add more nodes as moving the rows to other servers reduces
>>>memory from #1 and #2 above since the server would have less rows
>>>
>>>Later,
>>>Dean
>>>
>>>On 3/20/13 6:29 AM, "Andras Szerdahelyi"
>>><andras.szerdahe...@ignitionone.com> wrote:
>>>
>>>>
>>>>I'd say GC. Please fill in form CASS-FREEZE-001 below and get back to
>>>>us :-) ( sorry )
>>>>
>>>>How big is your JVM heap ? How many CPUs ?
>>>>Garbage collection taking long ? ( look for log lines from GCInspector)
>>>>Running out of heap ? ( "heap is .. full" log lines )
>>>>Any tasks backing up / being dropped ? ( nodetool tpstats and ".. dropped
>>>>in last .. ms" log lines )
>>>>Are writes really slow? ( nodetool cfhistograms Keyspace ColumnFamily )
>>>>
>>>>How much is lots of data? Wide or skinny rows? Mutations/sec ?
>>>>Which Compaction Strategy are you using? Output of show schema (
>>>>cassandra-cli ) for the relevant Keyspace/CF might help as well
>>>>
>>>>What consistency are you doing your writes with ? I assume ONE or ANY
>>>>if you have a single node.
>>>>
>>>>What are the values for these settings in cassandra.yaml
>>>>
>>>>memtable_total_space_in_mb:
>>>>memtable_flush_writers:
>>>>memtable_flush_queue_size:
>>>>compaction_throughput_mb_per_sec:
>>>>
>>>>concurrent_writes:
>>>>
>>>>
>>>>
>>>>Which version of Cassandra?
>>>>
>>>>
>>>>
>>>>Regards,
>>>>Andras
>>>>
>>>>From: Joel Samuelsson <samuelsson.j...@gmail.com>
>>>>Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>Date: Wednesday 20 March 2013 13:06
>>>>To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>Subject: Cassandra freezes
>>>>
>>>>
>>>>Hello,
>>>>
>>>>I've been trying to load test a one node cassandra cluster. When I add
>>>>lots of data, the Cassandra node freezes for 4-5 minutes during which
>>>>neither reads nor writes are served.
>>>>During this time, Cassandra takes 100% of a single CPU core.
>>>>My initial thought was that this was Cassandra flushing memtables to
>>>>the disk, however, the disk i/o is very low during this time.
>>>>Any idea what my problem could be?
>>>>I'm running in a virtual environment in which I have no control of
>>>>drives.
>>>>So commit log and data directory is (probably) on the same drive.
>>>>
>>>>Best regards,
>>>>Joel Samuelsson
>>>>
>>>
>>
>
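
A rough back-of-the-envelope sketch of why raising index_interval from 128 to 512 could shrink heap usage the way Dean describes: the node keeps roughly one in every index_interval row keys in an in-memory index sample, so the number of sampled entries drops about fourfold. The 500 million key count comes from the thread; the per-entry byte cost (ASSUMED_BYTES_PER_ENTRY) and the function names are illustrative assumptions, not measured values or Cassandra APIs, and real usage also depends on key size and how many sstables hold each key.

```python
def sampled_index_entries(total_keys: int, index_interval: int) -> int:
    """Approximate count of row keys kept in the in-memory index sample."""
    if index_interval <= 0:
        raise ValueError("index_interval must be positive")
    # About one key in every `index_interval` is sampled (ceiling division).
    return -(-total_keys // index_interval)


def estimated_sample_bytes(total_keys: int, index_interval: int,
                           bytes_per_entry: int) -> int:
    """Very rough heap estimate: sampled entries times an assumed entry cost."""
    return sampled_index_entries(total_keys, index_interval) * bytes_per_entry


TOTAL_KEYS = 500_000_000        # key count mentioned in the thread
ASSUMED_BYTES_PER_ENTRY = 64    # hypothetical; depends on key size and JVM overhead

for interval in (128, 512):
    entries = sampled_index_entries(TOTAL_KEYS, interval)
    mib = estimated_sample_bytes(TOTAL_KEYS, interval,
                                 ASSUMED_BYTES_PER_ENTRY) / (1024 * 1024)
    print(f"index_interval={interval}: ~{entries:,} sampled keys, ~{mib:,.0f} MiB")
```

The trade-off Andras raises still applies: a sparser sample means more index entries scanned per lookup, so reads that miss the key cache can get slower.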