strace -F -f -c java is what I use for other related issues. I haven't
used it with Cassandra, though.
On Wed, Oct 12, 2011 at 3:22 PM, Ashley Martens wrote:
> This is a production node on real hardware. I like the strace idea, do you
> have a workable command line for that?
>
> On Wed, Oct 12, 2011 at
This is a production node on real hardware. I like the strace idea, do you
have a workable command line for that?
On Wed, Oct 12, 2011 at 1:13 PM, Mohit Anchlia wrote:
> Yes. If you have exhausted all the options I think it will be good to
> see if this issue persists across other nodes after yo
I've never seen a JVM crash that was polite enough to run shutdown
hooks first, but it's worth a try.
On Wed, Oct 12, 2011 at 3:27 PM, Erik Forkalsrud wrote:
>
> My suggestion would be to put a recent Sun JVM on the problematic node and
> see if that eliminates the crashes.
>
> The Sun JVM appear
My suggestion would be to put a recent Sun JVM on the problematic node
and see if that eliminates the crashes.
The Sun JVM appears to be the mainstream choice when running Cassandra,
so that's a better-tested configuration. You can search the list
archives for OpenJDK related bugs to see
Yes. If you have exhausted all the options I think it will be good to
see if this issue persists across other nodes after you decommission
that node.
If this is not production and the issue is easily reproducible, you can
also try using strace with the fork option to see if it gets killed at the
same plac
I guess it could be an option but I can't puppet the Oracle JDK install so I
would rather not.
On Wed, Oct 12, 2011 at 12:35 PM, Erik Forkalsrud wrote:
> On 10/12/2011 11:33 AM, Ashley Martens wrote:
>
> java version "1.6.0_20"
> OpenJDK Runtime Environment (IcedTea6 1.9.9) (6b20-1.9.9-0ubuntu1~
We have 20 nodes in this cluster.
Yes, however are you recommending that I decommission the node?
I noted the compaction because it is commonly the last line in the log
file. For reference:
INFO [FlushWriter:12] 2011-10-12 18:10:09,823 Memtable.java (line 157)
Writing Memtable-HintsColumnFamily
On 10/12/2011 11:33 AM, Ashley Martens wrote:
java version "1.6.0_20"
OpenJDK Runtime Environment (IcedTea6 1.9.9) (6b20-1.9.9-0ubuntu1~10.10.2)
OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
This may have been mentioned before, but is it an option to use the
Sun/Oracle JDK?
- Erik -
You mentioned this happens only on one node? How many nodes do you
have? Is it possible to turn off this node completely and run
compactions on other nodes and see if this happens there too?
Also, you mentioned this happens after compaction. Did you mean during
compaction or right after it? What l
No.
On Wed, Oct 12, 2011 at 11:46 AM, Brandon Williams wrote:
> Anything from the OOM killer in the last few lines from dmesg?
>
>
Anything from the OOM killer in the last few lines from dmesg?
On Wed, Oct 12, 2011 at 1:33 PM, Ashley Martens wrote:
> Ubuntu 10.10
>
> java version "1.6.0_20"
> OpenJDK Runtime Environment (IcedTea6 1.9.9) (6b20-1.9.9-0ubuntu1~10.10.2)
> OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
>
>
Ubuntu 10.10
java version "1.6.0_20"
OpenJDK Runtime Environment (IcedTea6 1.9.9) (6b20-1.9.9-0ubuntu1~10.10.2)
OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
Always the same node. No other nodes in this cluster, which all have the
same hardware and OS, have this issue.
I don't see any re
What OS? JVM version? Is it always on the same node or all nodes? I had a
similar problem in the past in which the OS killed Cassandra because it felt
threatened and needed more resources.
On Wed, Oct 12, 2011 at 7:47 PM, Ashley Martens wrote:
> The thing is we only see that error once every s
The thing is we only see that error once every so often. Additionally, since
Cassandra is not logging a shutdown message, it must be a violent
termination, which leaves no traces in the system logs. It's possible that
there is something wrong with the hardware, but the OS side I don't see what
wo
I'm comfortable with saying that there's some problem with your
environment that hasn't been identified yet, because we see many many
people running 0.7.9 and it just does not die randomly.
Again, the exception you see in the log is consistent with being
killed externally (and not consistent with
Tue Oct 11 21:34:10 UTC 2011 - Fuck this Cassandra bullshit... it died again
Tue Oct 11 22:06:10 UTC 2011 - Fuck this Cassandra bullshit... it died again
Tue Oct 11 22:36:10 UTC 2011 - Fuck this Cassandra bullshit... it died again
Wed Oct 12 00:40:10 UTC 2011 - Fuck this Cassandra bullshit... it di
deploy@mobage-prod-cassandra150:~$ grep -i 'killed process' /var/log/messages
deploy@mobage-prod-cassandra150:~$
On Tue, Oct 11, 2011 at 5:57 PM, Jonathan Ellis wrote:
> grep -i 'killed process' /var/log/messages
>
>
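The same search can be scripted when several log files need checking. This is a minimal Python sketch, assuming the logs are readable as plain text. `find_oom_kills` and `scan_files` are hypothetical helper names, and the only path mentioned is the one from the grep command in this thread.

```python
import re

# Same phrase as the grep in this thread, matched case-insensitively (like -i).
KILLED = re.compile(r"killed process", re.IGNORECASE)


def find_oom_kills(lines):
    """Return the lines that mention a killed process, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if KILLED.search(line)]


def scan_files(paths):
    """Return (path, line) pairs for every matching line in the given files."""
    hits = []
    for path in paths:
        # errors="replace" keeps the scan going if the log holds odd bytes.
        with open(path, errors="replace") as handle:
            hits.extend((path, line) for line in find_oom_kills(handle))
    return hits


# Example call (the path is the one used earlier in the thread):
# for path, line in scan_files(["/var/log/messages"]):
#     print(path, line)
```

An empty result only shows that no such line exists in the files scanned. It does not rule out other causes of an abrupt exit.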
grep -i 'killed process' /var/log/messages
On Tue, Oct 11, 2011 at 5:25 PM, Ashley Martens wrote:
> So we created a script to check if Cassandra is alive and run it every two
> minutes. Here are some results for today:
>
> Tue Oct 11 18:28:09 UTC 2011 - F this Cassandra bullshit... it died again
So we created a script to check if Cassandra is alive and run it every two
minutes. Here are some results for today:
Tue Oct 11 18:28:09 UTC 2011 - F this Cassandra bullshit... it died again
Tue Oct 11 19:00:10 UTC 2011 - F this Cassandra bullshit... it died again
Tue Oct 11 19:30:10 UTC 2011 - F
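The script itself was not posted, so the following is only a sketch of how such a liveness check might look in Python on a POSIX system. It assumes the Cassandra PID is already known. The function names and the message text are hypothetical, and the timestamp layout merely imitates the lines above.

```python
import os
from datetime import datetime, timezone


def is_alive(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not one process
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence/permission check
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def death_notice(now):
    """Format one report line, stamped like the watchdog output above."""
    return now.strftime("%a %b %d %H:%M:%S UTC %Y") + " - Cassandra is not running"


def check(pid, now=None):
    """Return a report line if the process is gone, otherwise None."""
    if is_alive(pid):
        return None
    return death_notice(now or datetime.now(timezone.utc))
```

Run from cron every two minutes, a check like this records when the process was found dead, not when it died, so each timestamp can lag the real crash by up to one interval.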
It is actually not at the exact same time of the day. It varies but happens
within certain blocks of time, like between 00hr and 02hr. It could be up
for hours or it could crash again in 15 minutes. The memory is fine, just
using a larger footprint than 0.6 in all ways.
On Mon, Oct 10, 2011 at 1:
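The timing pattern described in this message can be checked against the watchdog output rather than by eye. This is a sketch in Python, assuming each line starts with a timestamp in the layout shown earlier in the thread (for example "Tue Oct 11 21:34:10 UTC 2011") followed by " - ". The function names are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Layout of the timestamps in the watchdog lines quoted in this thread.
STAMP_FORMAT = "%a %b %d %H:%M:%S UTC %Y"


def parse_stamp(line):
    """Parse the timestamp that precedes ' - ' on a watchdog line."""
    stamp, _, _ = line.partition(" - ")
    return datetime.strptime(stamp.strip(), STAMP_FORMAT)


def crashes_per_hour(lines):
    """Count detected crashes by hour of day (0-23)."""
    return Counter(parse_stamp(line).hour for line in lines if line.strip())


def gaps_in_minutes(lines):
    """Return the minutes between consecutive detections, oldest first."""
    stamps = sorted(parse_stamp(line) for line in lines if line.strip())
    return [(later - earlier).total_seconds() / 60
            for earlier, later in zip(stamps, stamps[1:])]
```

A cluster in one or two hour buckets would point towards something scheduled, while evenly spread counts would not.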
The service keeps dying at the same time every day and there is nothing in the
app logs, so it's going to be something external.
Sorry, but I'm not sure what the problem with the memory usage is. Is the server
running out of memory, or is it experiencing a lot of GC?
Cheers
-
Aa
I have checked both the output file and the system log; neither has errors in
them. I don't believe anything external is killing the process, I could be
wrong but this node's setup is the same as all my other nodes (including
hardware) so it doesn't make much sense.
jsvc.exec -user cassandra -home
Have you checked /var/log/cassandra/output.txt (the packaged install pipes
stdout/stderr to there) or the system logs? If there are no errors in the logs, it
may well be something external killing it.
With regard to memory usage, it's hard for people to help unless you provide
some numbers. Wha
Okay, this is still a problem. This node keeps dying at 1am every day, most
times without an error in the log. I'd appreciate any help in tracking down
why.
Additionally, I don't understand why 0.7.x is using *way* more RAM than 0.6.x
and 0.8.x, from a top or ps perspective. I'm now watching the JVM
Check this: http://wiki.apache.org/cassandra/FAQ#mmap
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 6/10/2011, at 9:25 AM, Ashley Martens wrote:
> I could be wrong. I just looked at the amount of memory being used and it's
> huge.
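The FAQ entry linked above concerns memory-mapped files, which count towards a process's virtual size without necessarily being resident in RAM. One way to see the two figures separately on Linux is to read `VmSize` and `VmRSS` from `/proc/<pid>/status`. This is a minimal Python sketch with hypothetical function names, assuming a Linux `/proc` filesystem:

```python
def parse_status(text):
    """Extract VmSize and VmRSS (both in kB) from /proc/<pid>/status content."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(rest.split()[0])  # value is followed by the unit "kB"
    return fields


def read_memory(pid):
    """Read the virtual and resident sizes of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_status(handle.read())


# Example call with a hypothetical PID:
# print(read_memory(12345))
```

A large `VmSize` next to a much smaller `VmRSS` would be consistent with mapped files inflating the number that top reports as virtual memory.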
I could be wrong. I just looked at the amount of memory being used and it's
huge. WTF?
No OOM errors appear and the memory used is far below physical and Java max.
I changed the JAR to 0.7.8 to see if that works. If so I'll find a way to
roll out that version instead of 0.7.9.
"I can't schedule this task because I'm shutting down" is a symptom of
your node crashing, not a cause. Is it being OOMkilled, perhaps?
On Wed, Oct 5, 2011 at 12:42 PM, Ashley Martens wrote:
> I'm getting the following exception on a 0.7.9 node before the node crashes.
> I don't have this proble
I'm getting the following exception on a 0.7.9 node before the node crashes.
I don't have this problem with the other nodes running 0.7.8. Does anyone
know what the problem is?
ERROR [Thread-47] 2011-10-05 05:07:03,840 AbstractCassandraDaemon.java (line
133) Fatal exception in thread Thread[Thread