Check /var/log/cassandra/output.log (assuming the default init scripts)
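
If the JVM exited by itself, whatever it printed on the way out would normally be in that file rather than in system.log. Here is a rough sketch of how you could scan it. The marker strings are my own guess at typical JVM crash messages, not an exhaustive list, and the function names are just illustrative. Adjust the path if your install writes output elsewhere.

```python
import re

# Assumed markers of a JVM-level failure; extend as needed.
FATAL_PATTERNS = re.compile(
    r"OutOfMemoryError|Exception in thread|A fatal error has been detected"
)


def find_fatal_lines(lines):
    """Return (line_number, text) pairs for lines matching FATAL_PATTERNS."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if FATAL_PATTERNS.search(line)
    ]


def scan_log(path="/var/log/cassandra/output.log"):
    """Scan a log file on disk; the default path assumes the default init scripts."""
    with open(path, errors="replace") as handle:
        return find_fatal_lines(handle)
```

Calling scan_log() on the node that went down and printing the result should show whether the process logged an error before it stopped. An empty list only means none of these markers were found, not that nothing went wrong.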

A
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 22 Jul 2011, at 10:13, Sameer Farooqui wrote:

> Hmm. Just looked at the log more closely.
> 
> So, what actually happened is while Repair was running on this specific node, 
> the Cassandra java process terminated itself automatically. The last entries 
> in the log are:
> 
>  INFO [ScheduledTasks:1] 2011-07-21 13:00:20,285 GCInspector.java (line 128) 
> GC for ParNew: 214 ms, 162748656 reclaimed leaving 1845274888 used; max is 
> 4030726144
>  INFO [ScheduledTasks:1] 2011-07-21 13:00:27,375 GCInspector.java (line 128) 
> GC for ParNew: 266 ms, 158835624 reclaimed leaving 1864471688 used; max is 
> 4030726144
>  INFO [ScheduledTasks:1] 2011-07-21 13:00:57,658 GCInspector.java (line 128) 
> GC for ParNew: 251 ms, 148861328 reclaimed leaving 1931111120 used; max is 
> 4030726144
>  INFO [ScheduledTasks:1] 2011-07-21 13:01:19,358 GCInspector.java (line 128) 
> GC for ParNew: 260 ms, 157638152 reclaimed leaving 1955746368 used; max is 
> 4030726144
>  INFO [ScheduledTasks:1] 2011-07-21 13:01:22,729 GCInspector.java (line 128) 
> GC for ParNew: 325 ms, 154157352 reclaimed leaving 1969361176 used; max is 
> 4030726144
>  INFO [ScheduledTasks:1] 2011-07-21 13:01:51,187 GCInspector.java (line 128) 
> GC for ParNew: 202 ms, 153219160 reclaimed leaving 2040879600 used; max is 
> 4030726144
>  
> When we came in this morning, nodetool ring from another node showed the 1st 
> node as down and OpsCenter also reported it as down.
> 
> Next we ran "sudo netstat -anp | grep 7199" from the 1st node to see the 
> status of the Cassandra PID and it was not running.
> 
> We then started Cassandra:
> 
> INFO [main] 2011-07-21 15:48:07,233 AbstractCassandraDaemon.java (line 78) 
> Logging initialized
>  INFO [main] 2011-07-21 15:48:07,266 AbstractCassandraDaemon.java (line 96) 
> Heap size: 3894411264/3894411264
>  INFO [main] 2011-07-21 15:48:11,678 CLibrary.java (line 106) JNA mlockall 
> successful
>  INFO [main] 2011-07-21 15:48:11,702 DatabaseDescriptor.java (line 121) 
> Loading settings from 
> file:/home/ubuntu/brisk/resources/cassandra/conf/cassandra.yaml
> 
> 
> It was during this start process that the java.io.EOFException was seen, but 
> yes, like you said Jonathan, the Cassandra process started back up and joined 
> the ring. 
> 
> We're now wondering why the Repair failed and why Cassandra crashed in the 
> first place. We only had default level logging enabled. Is there something 
> else I can check or that you suspect?
> 
> Should we turn the logging up to debug and retry the Repair?
> 
> 
> - Sameer
> 
> 
> On Thu, Jul 21, 2011 at 12:37 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> Looks harmless to me.
> 
> On Thu, Jul 21, 2011 at 1:41 PM, Sameer Farooqui
> <cassandral...@gmail.com> wrote:
> > While running Repair on a 0.8.1 node, we got this error in the system.log:
> >
> > ERROR [Thread-23] 2011-07-21 15:48:43,868 AbstractCassandraDaemon.java (line
> > 113) Fatal exception in thread Thread[Thread-23,5,main]
> > java.io.IOError: java.io.EOFException
> > at
> > org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78)
> > Caused by: java.io.EOFException
> > at java.io.DataInputStream.readInt(DataInputStream.java:375)
> > at
> > org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)
> >
> > There's just a bunch of informational messages about Gossip before this.
> >
> > Looks like the file or stream unexpectedly ended?
> > http://download.oracle.com/javase/1.4.2/docs/api/java/io/EOFException.html
> >
> > Is this a bug or something wrong in our environment?
> >
> >
> > - Sameer
> >
> 
> 
> 
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
> 