We have seen much better stability (and MUCH fewer GC pauses) from G1 with a
variety of heap sizes. I don’t even consider CMS any more.
Sean Durity
From: Gopal, Dhruva [mailto:dhruva.go...@aspect.com]
Sent: Tuesday, April 04, 2017 5:34 PM
To: user@cassandra.apache.org
Subject: Re: cassandra OOM
Thanks, that’s interesting – so CMS is a better option for
stability/performance? We’ll try this out in our cluster.
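As a minimal sketch of what "trying G1" means in practice: something like the following cassandra-env.sh-style fragment replaces the CMS flag set. The flag names are standard HotSpot options, but the heap size and pause target below are illustrative placeholders, not recommendations from this thread.

```shell
# Sketch: G1 flags in cassandra-env.sh style. Sizes/targets are illustrative.
JVM_OPTS="-Xms8G -Xmx8G"                      # fixed heap, no resize pauses
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"             # enables G1 (replaces CMS flags)
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=500" # G1 pause-time goal
echo "$JVM_OPTS"
```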
From: Alexander Dejanovski
Reply-To: "user@cassandra.apache.org"
Date: Monday, April 3, 2017 at 10:31 PM
To: "user@cassandra.apache.org"
Subject: Re: cassandra OOM
ConcGCThreads is 1/4 of ParallelGCThreads.
> # Setting both to the same value can reduce STW durations.
> #-XX:ConcGCThreads=16
>
> ### GC logging options -- uncomment to enable
> #-XX:+PrintGCDetails
> #-XX:+PrintGCDateStamps
From: Alexander Dejanovski
Reply-To: "user@cassandra.apache.org"
Date: Monday, April 3, 2017 at 8:00 AM
To: "user@cassandra.apache.org"
Subject: Re: cassandra OOM
Hi,
could you share your GC settings ? G1 or CMS ? Heap size, etc...
Thanks,
On Sun, Apr 2, 2017 at 10:30 PM Gopal, Dhruva
wrote:
Hi –
We’ve had what looks like an OOM situation with Cassandra (we have a dump
file that got generated) in our staging (performance/load testing environment)
and I wanted to reach out to this user group to see if you had any
recommendations on how we should approach our investigation as to the
cause.

Check the limits of the running Cassandra process:
$ cat /proc//limits
You can try the above settings and share your results.
Thanks
Anuj
Sent from Yahoo Mail on Android
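Inspecting those limits looks like this (a sketch: `$$`, this shell's own PID, stands in for the real Cassandra PID, which you would get from something like `pgrep -f CassandraDaemon`):

```shell
# Sketch: read kernel-enforced limits for a process via /proc.
# $$ (this shell) stands in for the Cassandra PID.
pid=$$
grep -i 'open files' "/proc/$pid/limits"   # e.g. the file-handle limit row
```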
From:"Sebastian Estevez"
Date:Mon, 13 Jul, 2015 at 7:02 pm
Subject:Re: Cassandra OOM on joining existing ring
Are you on the azure premium storage?
http://www.datastax.com/2015/04/getting-started-with-azure-premium-storage-and-datastax-enterprise-dse
Secondary indexes are built for convenience not performance.
http://www.datastax.com/resources/data-modeling
What's your compaction strategy? Your nodes hav
Hi,
Looks like that is my primary problem - the sstable count for the
daily_challenges column family is >5k. Azure had scheduled maintenance
window on Sat. All the VMs got rebooted one by one - including the current
cassandra one - and it's taking forever to bring cassandra back up online.
Is the
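One rough way to spot a runaway sstable count is to count `*-Data.db` files (one per sstable) under the data directory. The sketch below runs against a temporary mock directory; real sstables follow the same naming convention, though the path layout varies by Cassandra version.

```shell
# Sketch: count *-Data.db files (one per sstable) under a data directory.
# A temp directory stands in for /var/lib/cassandra/data here.
demo=$(mktemp -d)
mkdir -p "$demo/ks/daily_challenges"
touch "$demo/ks/daily_challenges/ks-daily_challenges-ic-1-Data.db"
touch "$demo/ks/daily_challenges/ks-daily_challenges-ic-2-Data.db"
sstables=$(find "$demo" -name '*-Data.db' | wc -l)
echo "$sstables"   # 2 here; >5k on the affected node would be the red flag
```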
#1
> There is one table - daily_challenges - which shows compacted partition
> max bytes as ~460M and another one - daily_guest_logins - which shows
> compacted partition max bytes as ~36M.
460MB is high; I like to keep my partitions under 100MB when possible. I've
seen worse, though. The fix is to
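Pulling those two rows out of `nodetool cfstats` output is a one-line grep. The sample text below is fabricated and abridged just to demonstrate the filter; only the field names match the real tool, and 460000000 echoes the ~460M partition from this thread.

```shell
# Sketch: filter the interesting rows out of `nodetool cfstats` output.
# The sample is fabricated/abridged; field names match the real tool.
cfstats_sample='Table: daily_challenges
Compacted partition maximum bytes: 460000000
Maximum live cells per slice (last five minutes): 5000'
echo "$cfstats_sample" | grep -E 'Compacted partition maximum|live cells'
```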
And here is my cassandra-env.sh
https://gist.github.com/kunalg/2c092cb2450c62be9a20
Kunal
On 11 July 2015 at 00:04, Kunal Gangakhedkar wrote:
From jhat output, top 10 entries for "Instance Count for All Classes
(excluding platform)" shows:
2088223 instances of class org.apache.cassandra.db.BufferCell
1983245 instances of class
org.apache.cassandra.db.composites.CompoundSparseCellName
1885974 instances of class
org.apache.cassandra.db.c
Thanks for quick reply.
1. I don't know what are the thresholds that I should look for. So, to save
this back-and-forth, I'm attaching the cfstats output for the keyspace.
There is one table - daily_challenges - which shows compacted partition max
bytes as ~460M and another one - daily_guest_logins - which shows compacted
partition max bytes as ~36M.
1. You want to look at # of sstables in cfhistograms or in cfstats look at:
Compacted partition maximum bytes
Maximum live cells per slice
2. No, here's the env.sh from 3.0 which should work with some tweaks:
https://github.com/tobert/cassandra/blob/0f70469985d62aeadc20b41dc9cdc9d72a035c64/conf/ca
Thanks, Sebastian.
Couple of questions (I'm really new to cassandra):
1. How do I interpret the output of 'nodetool cfstats' to figure out the
issues? Any documentation pointer on that would be helpful.
2. I'm primarily a python/c developer - so, totally clueless about JVM
environment. So, please
#1 You need more information.
a) Take a look at your .hprof file (memory heap from the OOM) with an
introspection tool like jhat or visualvm or java flight recorder and see
what is using up your RAM.
b) How big are your large rows (use nodetool cfstats on each node). If your
data model is bad, yo
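A sketch of that heap-dump workflow with stock JDK tools (jmap and jhat ship with older JDKs; the PID and dump path below are placeholders, not values from this thread):

```shell
# Sketch of the heap-dump workflow with stock JDK tools.
hprof=/tmp/cassandra.hprof
# 1. capture a dump from the live JVM (or use the .hprof the OOM wrote):
#      jmap -dump:format=b,file=$hprof <cassandra-pid>
# 2. browse an object histogram at http://localhost:7000 :
#      jhat -port 7000 $hprof
echo "$hprof"
```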
I upgraded my instance from 8GB to a 14GB one.
Allocated 8GB to jvm heap in cassandra-env.sh.
And now, it crashes even faster with an OOM.
Earlier, with 4GB heap, I could go upto ~90% replication completion (as
reported by nodetool netstats); now, with 8GB heap, I cannot even get
there. I've alr
You, and only you, are responsible for knowing your data and data model.
If columns per row or rows per partition can be large, then an 8GB system
is probably too small. But the real issue is that you need to keep your
partition size from getting too large.
Generally, an 8GB system is okay, but o
I'm new to cassandra
How do I find those out? - mainly, the partition params that you asked for.
Others, I think I can figure out.
We don't have any large objects/blobs in the column values - it's all
textual, date-time, numeric and uuid data.
We use cassandra to primarily store segmentation data
What does your data and data model look like - partition size, rows per
partition, number of columns per row, any large values/blobs in column
values?
You could run fine on an 8GB system, but only if your rows and partitions
are reasonably small. Any large partitions could blow you away.
-- Jack
Attaching the stack dump captured from the last OOM.
Kunal
On 10 July 2015 at 13:32, Kunal Gangakhedkar wrote:
Forgot to mention: the data size is not that big - it's barely 10GB in all.
Kunal
On 10 July 2015 at 13:29, Kunal Gangakhedkar wrote:
Hi,
I have a 2 node setup on Azure (east us region) running Ubuntu server
14.04LTS.
Both nodes have 8GB RAM.
One of the nodes (seed node) died with OOM - so, I am trying to add a
replacement node with same configuration.
The problem is this new node also keeps dying with OOM - I've restarted the
> For JVM Heap it is 2G
Try 4G
> and gc_grace = 1800
Realised that I did not provide a warning about the implications this has for
nodetool repair. If you are doing deletes on the CF you need to run nodetool
repair every gc_grace seconds.
In this case I think your main problem was not enough
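Turning that rule into a schedule can be sketched as follows: with gc_grace = 1800, every replica must be repaired well inside the window, so derive the cadence with a safety margin. The cron line and keyspace name are illustrative only.

```shell
# Sketch: derive a repair cadence from gc_grace with a safety margin,
# then schedule it, e.g. (illustrative cron line, keyspace made up):
#   */10 * * * *  nodetool repair my_keyspace >> /var/log/repair.log 2>&1
gc_grace=1800
margin=3
interval=$((gc_grace / margin))
echo "$interval"   # run repair at least every 600 seconds
```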
Thanks for your reply. We will try both of your recommendations. The OS
memory is 8G; for the JVM heap it is 2G. DeletedColumn instances used 1.4G,
rooted from the readStage threads. Do you think we need to increase the size
of the JVM heap?
Configuration for the index columnFamily is
create column family purge
You need to provide some details of the machine and the JVM configuration. But
let's say you need to have 4GB to 8GB for the JVM heap.
If you have many deleted columns I would say you have a *lot* of garbage in
each row. Consider reducing gc_grace seconds so the columns are purged more
frequently
hmm.. did you manage to take a look using nodetool tpstats? That may give
you further indication.
Jason
On Thu, Mar 7, 2013 at 1:56 PM, 金剑 wrote:
Hi,
My version is 1.1.7.
Our use case is: we have an index column family to record how many resources
are stored for a user. The number might vary from tens to millions.
We provide a feature to let users delete resources by prefix.
We found some Cassandra nodes will OOM after some period. The
Everything still runs smoothly. It's really plausible that version 1.1.3
resolved this bug.
2012/8/13 Robin Verlangen
3 hours ago I finished the upgrade of our cluster. Currently it runs quite
smoothly. I'll give an update within a week if this really solved our issues.
Cheers!
2012/8/13 Robin Verlangen
@Tyler: We were already running most of our machines in 64bit JVM (Sun, not
the OpenJDK). Those also crashed.
@Holger: Good to hear that. I'll schedule an update for our Cassandra
cluster.
Thank you both for your time.
2012/8/13 Holger Hoffstaette
On Sun, 12 Aug 2012 13:36:42 +0200, Robin Verlangen wrote:
> Hmm, is issue caused by some 1.x version? Before it never occurred to us.
This bug was introduced in 1.1.0 and has been fixed in 1.1.3, where the
closed/recycled segments are now closed & unmapped properly. The default
sizes are also smaller.
Hmm, is this issue caused by some 1.x version? It never occurred to us before.
On 11 Aug 2012 at 22:36, "Tyler Hobbs" wrote:
We've seen something similar when running on a 32bit JVM, so make sure
you're using the latest 64bit Java 6 JVM.
On Sat, Aug 11, 2012 at 11:59 AM, Robin Verlangen wrote:
Hi there,
I currently see Cassandra crash every couple of days. I run a 3 node
cluster on version 1.1.2. Does anyone have a clue why it crashes? I
couldn't find a fix for it in a newer release. Is this an actual bug or did I
do something wrong?
Thank you in advance for your time.
Last 100 log lines
On Tue, Feb 7, 2012 at 10:45 AM, aaron morton wrote:
> Just to ask the stupid question, have you tried setting it really high ?
> Like 50 ?
>
No I have not. I moved to mmap_index_only as a stopgap solution.
Is it possible for there to be that many mmaps for about 300 db files?
--
Regards,
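One way to answer that question empirically is to count the live mappings in /proc/<pid>/maps and compare against the sysctl limit (a sketch; `$$` stands in for the Cassandra PID):

```shell
# Sketch: compare a process's live mapping count against the kernel limit.
# $$ (this shell) stands in for the Cassandra PID.
map_count=$(wc -l < "/proc/$$/maps")
limit=$(cat /proc/sys/vm/max_map_count)
echo "mappings: $map_count of $limit allowed"
```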
Just to ask the stupid question, have you tried setting it really high ? Like
50 ?
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 7/02/2012, at 10:27 AM, Ajeet Grewal wrote:
Here are the last few lines of strace (of one of the threads). There
are a bunch of mmap system calls. Notice the last mmap call a couple
of lines before the trace ends. Could the last mmap call fail?
== BEGIN STRACE ==
mmap(NULL, 2147487599, PROT_READ, MAP_SHARED, 37, 0xbb000) = 0x7709b54000
On Mon, Feb 6, 2012 at 11:50 AM, Ajeet Grewal wrote:
The number of fi
On Sat, Feb 4, 2012 at 7:03 AM, Jonathan Ellis wrote:
> Sounds like you need to increase sysctl vm.max_map_count
This did not work. I increased vm.max_map_count from 65536 to 131072.
I am still getting the same error.
ERROR [SSTableBatchOpen:4] 2012-02-06 11:43:50,463
AbstractCassandraDaemon.jav
Sounds like you need to increase sysctl vm.max_map_count
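For reference, a sketch of checking and raising that limit; the value 1048575 is just an example, not a tuned recommendation.

```shell
# Sketch: inspect vm.max_map_count. Raising it needs root, e.g.:
#   sysctl -w vm.max_map_count=1048575
# and persist it in /etc/sysctl.conf to survive reboots.
current=$(cat /proc/sys/vm/max_map_count)
echo "$current"
```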
On Fri, Feb 3, 2012 at 7:27 PM, Ajeet Grewal wrote:
Hey guys,
I am getting an out of memory (mmap failed) error with Cassandra
1.0.2. The relevant log lines are pasted at
http://pastebin.com/UM28ZC1g.
Cassandra works fine until it reaches about 300-400GB of load (on one
instance, I have 12 nodes RF=2). Then nodes start failing with such
errors. Th
2012/1/4 Vitalii Tymchyshyn
04.01.12 14:25, Radim Kolar wrote:
> So, what are cassandra memory requirement? Is it 1% or 2% of disk data?
It depends on the number of rows you have. If you have a lot of rows then the
primary memory eaters are index sampling data and bloom filters. I use
index sampling 512 and bloom filters set
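The index-sampling cost can be sanity-checked with back-of-envelope arithmetic: one sampled entry is held on heap per index_interval row keys. The row count below is invented for illustration.

```shell
# Back-of-envelope sketch: index sample entries = rows / index_interval.
# The row count here is made up.
rows=1000000000
interval=512
sampled=$((rows / interval))
echo "$sampled"   # 1953125 sampled keys held on heap
```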
> Looking at heap dumps, a lot of memory is taken by memtables, much
more than 1/3 of heap. At the same time, logs say that it has nothing to
flush since there are not dirty memtables.
I've seen this too.
Hello.
BTW: It would be great for Cassandra to shut down on errors like OOM,
because now I am not sure if the problem described in the previous email is
the root cause, or whether an OOM error found in the log made some "writer"
stop.
I am now looking at different OOMs in my cluster. Currently each node
h
The DynamicSnitch can result in fewer read operations being sent to a node, but
as long as a node is marked as UP, mutations are sent to all replicas. Nodes
will shed load when they pull messages off the queue that have expired past
rpc_timeout, but they will not feed back flow control to the other
Hello.
We have been using Cassandra for some time in our project. Currently we are on
1.1 trunk (it was an accidental migration, but since it's hard to migrate back
and it's performing nicely enough, we are currently on 1.1).
During the New Year holidays one of the servers produced a number of OOM
messages in
Can't think of any.
On Sun, Jul 17, 2011 at 1:27 PM, Andrey Stepachev wrote:
Looks like problem in the code:

    public IndexSummary(long expectedKeys)
    {
        long expectedEntries = expectedKeys / DatabaseDescriptor.getIndexInterval();
        if (expectedEntries > Integer.MAX_VALUE)
            // TODO: that's a _lot_ of keys, or a very low interval
            throw n
Looks like key indexes eat all memory:
http://paste.kde.org/97213/
2011/7/15 Andrey Stepachev
UPDATE:
I found that:
a) with a minimum 10G heap, Cassandra survives.
b) I have ~1000 sstables.
c) CompactionManager uses PrecompactedRows instead of LazilyCompactedRow.
So, I have a question:
a) if a row is bigger than 64MB before compaction, why is it compacted in memory?
b) if it is smaller, what eats so much memory?
Hi all.
Cassandra constantly OOMs on repair or compaction. Increasing memory (6G)
doesn't help.
I can give it more, but I think this is not a regular situation. The cluster
has 4 nodes, RF=3.
Cassandra version 0.8.1.
The ring looks like this:
Address  DC  Rack  Status  State  Load