Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Mohit Anchlia
strace -F -f -c java is what I use for other related issues. I haven't used it with Cassandra, though. On Wed, Oct 12, 2011 at 3:22 PM, Ashley Martens wrote: > This is a production node on real hardware. I like the strace idea, do you > have a workable command line for that? > > On Wed, Oct 12, 2011 at

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Ashley Martens
This is a production node on real hardware. I like the strace idea, do you have a workable command line for that? On Wed, Oct 12, 2011 at 1:13 PM, Mohit Anchlia wrote: > Yes. If you have exhausted all the options I think it will be good to > see if this issue persists across other nodes after yo

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Jonathan Ellis
I've never seen a JVM crash that was polite enough to run shutdown hooks first, but it's worth a try. On Wed, Oct 12, 2011 at 3:27 PM, Erik Forkalsrud wrote: > > My suggestion would be to put a recent Sun JVM on the problematic node and > see if that eliminates the crashes. > > The Sun JVM appear

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Erik Forkalsrud
My suggestion would be to put a recent Sun JVM on the problematic node and see if that eliminates the crashes. The Sun JVM appears to be the mainstream choice when running Cassandra, so that's a better-tested configuration. You can search the list archives for OpenJDK related bugs to see

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Mohit Anchlia
Yes. If you have exhausted all the options I think it will be good to see if this issue persists across other nodes after you decommission that node. If this is not production and the issue is easily reproducible you can also try using strace with the fork option to see if it gets killed at the same plac

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Ashley Martens
I guess it could be an option but I can't puppet the Oracle JDK install so I would rather not. On Wed, Oct 12, 2011 at 12:35 PM, Erik Forkalsrud wrote: > On 10/12/2011 11:33 AM, Ashley Martens wrote: > > java version "1.6.0_20" > OpenJDK Runtime Environment (IcedTea6 1.9.9) (6b20-1.9.9-0ubuntu1~

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Ashley Martens
We have 20 nodes in this cluster. Yes, however are you recommending that I decommission the node? I noted the compaction because it is common for the last line in the log file. For reference: INFO [FlushWriter:12] 2011-10-12 18:10:09,823 Memtable.java (line 157) Writing Memtable-HintsColumnFamily

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Erik Forkalsrud
On 10/12/2011 11:33 AM, Ashley Martens wrote: java version "1.6.0_20" OpenJDK Runtime Environment (IcedTea6 1.9.9) (6b20-1.9.9-0ubuntu1~10.10.2) OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode) This may have been mentioned before, but is it an option to use the Sun/Oracle JDK? - Erik -

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Mohit Anchlia
You mentioned this happens only on one node? How many nodes do you have? Is it possible to turn off this node completely and run compactions on other nodes and see if this happens there too? Also, you mentioned this happens after compaction. Did you mean during compaction or right after it? What l

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Ashley Martens
No. On Wed, Oct 12, 2011 at 11:46 AM, Brandon Williams wrote: > Anything from the OOM killer in the last few lines from dmesg? > >

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Brandon Williams
Anything from the OOM killer in the last few lines from dmesg? On Wed, Oct 12, 2011 at 1:33 PM, Ashley Martens wrote: > Ubuntu 10.10 > > java version "1.6.0_20" > OpenJDK Runtime Environment (IcedTea6 1.9.9) (6b20-1.9.9-0ubuntu1~10.10.2) > OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode) > >

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Ashley Martens
Ubuntu 10.10 java version "1.6.0_20" OpenJDK Runtime Environment (IcedTea6 1.9.9) (6b20-1.9.9-0ubuntu1~10.10.2) OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode) Always the same node. No other nodes in this cluster, which all have the same hardware and OS, have this issue. I don't see any re

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Sasha Dolgy
What OS? JVM version? Is it always on the same node or all nodes? I had a similar problem in the past in that the OS killed Cassandra because it felt threatened and needed more resources. On Wed, Oct 12, 2011 at 7:47 PM, Ashley Martens wrote: > The thing is we only see that error once every s

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Ashley Martens
The thing is we only see that error once every so often. Additionally, since Cassandra is not logging a shutdown message, it must be a violent termination, which leaves no traces in the system logs. It's possible that there is something wrong with the hardware, but the OS side I don't see what wo

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Jonathan Ellis
I'm comfortable with saying that there's some problem with your environment that hasn't been identified yet, because we see many many people running 0.7.9 and it just does not die randomly. Again, the exception you see in the log is consistent with being killed externally (and not consistent with

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Ashley Martens
Tue Oct 11 21:34:10 UTC 2011 - Fuck this Cassandra bullshit... it died again Tue Oct 11 22:06:10 UTC 2011 - Fuck this Cassandra bullshit... it died again Tue Oct 11 22:36:10 UTC 2011 - Fuck this Cassandra bullshit... it died again Wed Oct 12 00:40:10 UTC 2011 - Fuck this Cassandra bullshit... it di

Re: 0.7.9 RejectedExecutionException

2011-10-12 Thread Ashley Martens
deploy@mobage-prod-cassandra150:~$ grep -i 'killed process' /var/log/messages deploy@mobage-prod-cassandra150:~$ On Tue, Oct 11, 2011 at 5:57 PM, Jonathan Ellis wrote: > grep -i 'killed process' /var/log/messages > >

Re: 0.7.9 RejectedExecutionException

2011-10-11 Thread Jonathan Ellis
grep -i 'killed process' /var/log/messages On Tue, Oct 11, 2011 at 5:25 PM, Ashley Martens wrote: > So we created a script to check if Cassandra is alive and run it every two > minutes. Here are some results for today: > > Tue Oct 11 18:28:09 UTC 2011 - F this Cassandra bullshit... it died again

Re: 0.7.9 RejectedExecutionException

2011-10-11 Thread Ashley Martens
So we created a script to check if Cassandra is alive and run it every two minutes. Here are some results for today: Tue Oct 11 18:28:09 UTC 2011 - F this Cassandra bullshit... it died again Tue Oct 11 19:00:10 UTC 2011 - F this Cassandra bullshit... it died again Tue Oct 11 19:30:10 UTC 2011 - F
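The check script itself is not shown in the thread. As a rough sketch of one way such a liveness check could work, the Java program below tries to open a TCP connection to the node and reports whether anything is listening. The class name PortProbe, the default host localhost, and the default port 9160 (assumed here to be the node's Thrift RPC port) are illustrative assumptions, not details taken from the poster's setup.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Date;

// Hypothetical liveness probe; not the script used by the original poster.
public class PortProbe {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    public static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host not resolvable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: adjust host and port to match the node being watched.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9160;

        boolean up = isListening(host, port, 2000);
        System.out.println(new Date() + " - " + host + ":" + port
                + (up ? " is accepting connections" : " is NOT accepting connections"));

        // A non-zero exit code lets cron or a wrapper script react to a dead node.
        System.exit(up ? 0 : 1);
    }
}
```

A probe like this only shows that the port is open. It says nothing about why the process died, so it complements the log and dmesg checks discussed in the thread and does not replace them.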

Re: 0.7.9 RejectedExecutionException

2011-10-10 Thread Ashley Martens
It is actually not at the exact same time of the day. It varies but happens within certain blocks of time, like between 00hr and 02hr. The node could be up for hours or it could crash again in 15 minutes. The memory is fine, just using a larger footprint than 0.6 in all ways. On Mon, Oct 10, 2011 at 1:

Re: 0.7.9 RejectedExecutionException

2011-10-10 Thread aaron morton
The service keeps dying at the same time every day and there is nothing in the app logs, so it's going to be something external. Sorry, but I'm not sure what the problem with the memory usage is. Is the server running out of memory, or is it experiencing a lot of GC? Cheers - Aa

Re: 0.7.9 RejectedExecutionException

2011-10-10 Thread Ashley Martens
I have checked both the output file and the system log, and neither has errors in it. I don't believe anything external is killing the process. I could be wrong, but this node's setup is the same as all my other nodes (including hardware) so it doesn't make much sense. jsvc.exec -user cassandra -home

Re: 0.7.9 RejectedExecutionException

2011-10-10 Thread aaron morton
Have you checked /var/log/cassandra/output.txt (the packaged install pipes stdout/stderr to there) or the system logs? If there are no errors in the logs it may well be something external killing it. With regard to memory usage, it's hard for people to help unless you provide some numbers. Wha

Re: 0.7.9 RejectedExecutionException

2011-10-07 Thread Ashley Martens
Okay, this is still a problem. This node keeps dying at 1am every day, most times without an error in the log. I'd appreciate any help in tracking down why. Additionally, I don't understand why 0.7.x is using *way* more RAM than 0.6.x and 0.8.x, from a top or ps perspective. I'm now watching the JVM

Re: 0.7.9 RejectedExecutionException

2011-10-05 Thread aaron morton
Check this: http://wiki.apache.org/cassandra/FAQ#mmap Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 6/10/2011, at 9:25 AM, Ashley Martens wrote: > I could be wrong. I just looked at the amount of memory being used and it's > huge.

Re: 0.7.9 RejectedExecutionException

2011-10-05 Thread Ashley Martens
I could be wrong. I just looked at the amount of memory being used and it's huge. WTF?

Re: 0.7.9 RejectedExecutionException

2011-10-05 Thread Ashley Martens
No OOM errors appear and the memory used is far below physical and Java max. I changed the JAR to 0.7.8 to see if that works. If so I'll find a way to roll out that version instead of 0.7.9.

Re: 0.7.9 RejectedExecutionException

2011-10-05 Thread Jonathan Ellis
"I can't schedule this task because I'm shutting down" is a symptom of your node crashing, not a cause. Is it being OOMkilled, perhaps? On Wed, Oct 5, 2011 at 12:42 PM, Ashley Martens wrote: > I'm getting the following exception on a 0.7.9 node before the node crashes. > I don't have this proble
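To illustrate the point that the exception is a symptom of shutdown, not a cause, here is a minimal standalone sketch using only java.util.concurrent. It is not Cassandra code, and the class and method names are made up for the example. It shows that an executor which has already been shut down rejects any new task with RejectedExecutionException, which matches the "I can't schedule this task because I'm shutting down" reading of the error.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical demo class; it only mirrors the general executor behaviour.
public class ShutdownRejectionDemo {

    // Returns true if a task submitted after shutdown() is rejected.
    public static boolean rejectedAfterShutdown() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        pool.shutdown(); // the pool stops accepting new work from this point on
        try {
            pool.execute(() -> { /* never runs */ });
            return false;
        } catch (RejectedExecutionException e) {
            // The default rejection policy throws when the pool is shut down.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected after shutdown: " + rejectedAfterShutdown());
    }
}
```

Under that reading, the useful question is what started the shutdown in the first place, which is why the replies focus on external causes such as the OOM killer.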

0.7.9 RejectedExecutionException

2011-10-05 Thread Ashley Martens
I'm getting the following exception on a 0.7.9 node before the node crashes. I don't have this problem with the other nodes running 0.7.8. Does anyone know what the problem is? ERROR [Thread-47] 2011-10-05 05:07:03,840 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[Thread