Hey Aaron, I don't have any output.log files in that folder:
ubuntu@ip-10-2-x-x:~$ cd /var/log/cassandra ubuntu@ip-10-2-x-x:/var/log/cassandra$ ls system.log system.log.11 system.log.4 system.log.7 system.log.1 system.log.2 system.log.5 system.log.8 system.log.10 system.log.3 system.log.6 system.log.9 On Thu, Jul 21, 2011 at 3:40 PM, aaron morton <aa...@thelastpickle.com>wrote: > Check /var/log/cassandra/output.log (assuming the default init scripts) > > A > ----------------- > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On 22 Jul 2011, at 10:13, Sameer Farooqui wrote: > > Hmm. Just looked at the log more closely. > > So, what actually happened is while Repair was running on this specific > node, the Cassandra java process terminated itself automatically. The last > entries in the log are: > > INFO [ScheduledTasks:1] 2011-07-21 13:00:20,285 GCInspector.java (line > 128) GC for ParNew: 214 ms, 162748656 reclaimed leaving 1845274888 used; max > is 4030726144 > INFO [ScheduledTasks:1] 2011-07-21 13:00:27,375 GCInspector.java (line > 128) GC for ParNew: 266 ms, 158835624 reclaimed leaving 1864471688 used; max > is 4030726144 > INFO [ScheduledTasks:1] 2011-07-21 13:00:57,658 GCInspector.java (line > 128) GC for ParNew: 251 ms, 148861328 reclaimed leaving 1931111120 used; max > is 4030726144 > INFO [ScheduledTasks:1] 2011-07-21 13:01:19,358 GCInspector.java (line > 128) GC for ParNew: 260 ms, 157638152 reclaimed leaving 1955746368 used; max > is 4030726144 > INFO [ScheduledTasks:1] 2011-07-21 13:01:22,729 GCInspector.java (line > 128) GC for ParNew: 325 ms, 154157352 reclaimed leaving 1969361176 used; max > is 4030726144 > INFO [ScheduledTasks:1] 2011-07-21 13:01:51,187 GCInspector.java (line > 128) GC for ParNew: 202 ms, 153219160 reclaimed leaving 2040879600 used; max > is 4030726144 > > When we came in this morning, nodetool ring from another node showed the > 1st node as down and OpsCenter also reported it as down. > > Next we ran "sudo netstat -anp | grep 7199" from the 1st node to see the > status of the Cassandra PID and it was not running. > > We then started Cassandra: > > INFO [main] 2011-07-21 15:48:07,233 AbstractCassandraDaemon.java (line 78) > Logging initialized > INFO [main] 2011-07-21 15:48:07,266 AbstractCassandraDaemon.java (line 96) > Heap size: 3894411264/3894411264 > INFO [main] 2011-07-21 15:48:11,678 CLibrary.java (line 106) JNA mlockall > successful > INFO [main] 2011-07-21 15:48:11,702 DatabaseDescriptor.java (line 121) > Loading settings from > file:/home/ubuntu/brisk/resources/cassandra/conf/cassandra.yaml > > > It was during this start process that the java.io.EOFException was seen, > but yes, like you said Jonathan, the Cassandra process started back up and > joined the ring. > > We're now wondering why the Repair failed and why Cassandra crashed in the > first place. We only had default level logging enabled. Is there something > else I can check or that you suspect? > > Should we turn the logging up to debug and retry the Repair? > > > - Sameer > > > On Thu, Jul 21, 2011 at 12:37 PM, Jonathan Ellis <jbel...@gmail.com>wrote: > >> Looks harmless to me. >> >> On Thu, Jul 21, 2011 at 1:41 PM, Sameer Farooqui >> <cassandral...@gmail.com> wrote: >> > While running Repair on a 0.8.1 node, we got this error in the >> system.log: >> > >> > ERROR [Thread-23] 2011-07-21 15:48:43,868 AbstractCassandraDaemon.java >> (line >> > 113) Fatal exception in thread Thread[Thread-23,5,main] >> > java.io.IOError: java.io.EOFException >> > at >> > >> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78) >> > Caused by: java.io.EOFException >> > at java.io.DataInputStream.readInt(DataInputStream.java:375) >> > at >> > >> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66) >> > >> > There's just a bunch of informational messages about Gossip before this. >> > >> > Looks like the file or stream unexpectedly ended? >> > >> http://download.oracle.com/javase/1.4.2/docs/api/java/io/EOFException.html >> > >> > Is this a bug or something wrong in our environment? >> > >> > >> > - Sameer >> > >> >> >> >> -- >> Jonathan Ellis >> Project Chair, Apache Cassandra >> co-founder of DataStax, the source for professional Cassandra support >> http://www.datastax.com >> > > >