Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-26 Thread Sameer Farooqui
Thanks for the info guys. I'm running compaction on the two very highly loaded nodes now in hopes of the data volume going down. But I'm skeptical because I don't see how it got so unbalanced in the first place (all nodes were up while the writes were being injected). I should have an update tomo

Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-26 Thread aaron morton
Was guessing something like a token move may have happened in the past. Good suggestion to also kick off a major compaction. I've seen that make a big difference even for apps that do not do deletes, but do do overwrites. Cheers - Aaron Morton Freelance Cassandra Developer @aa

Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-26 Thread Sylvain Lebresne
> If they are and repair has completed use nodetool cleanup to remove the > data the node is no longer responsible for. See bootstrap section above. I've seen that said a few times, so allow me to correct it. Cleanup is useless after a repair. 'nodetool cleanup' removes rows the node is not responsible a

Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-25 Thread aaron morton
Background: http://wiki.apache.org/cassandra/Operations Use nodetool ring to check if the tokens are evenly distributed. If not, then check the Load Balancing and Moving Nodes sections in the page above. If they are and repair has completed, use nodetool cleanup to remove the data the node is n
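
As a rough illustration of what "evenly distributed" means for RandomPartitioner, the sketch below computes evenly spaced tokens and the share of the ring each token owns. It assumes the RandomPartitioner token space of 0 to 2**127 and that a node owns the range from the previous token up to its own. The names RING_SIZE, balanced_tokens and ownership are made up for this example and are not part of Cassandra or nodetool.

```python
# Sketch only: helper names are hypothetical, not a Cassandra API.
RING_SIZE = 2**127  # assumed RandomPartitioner token space


def balanced_tokens(node_count):
    """Return evenly spaced tokens for a ring of node_count nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * RING_SIZE // node_count for i in range(node_count)]


def ownership(tokens):
    """Map each token to the fraction of the ring it owns.

    A node is assumed to own the range (previous token, its own token],
    wrapping around at the end of the ring. Tokens must be distinct.
    """
    ordered = sorted(tokens)
    shares = {}
    for idx, token in enumerate(ordered):
        previous = ordered[idx - 1]  # idx 0 wraps to the last token
        width = (token - previous) % RING_SIZE
        if width == 0:
            width = RING_SIZE  # a single node owns the whole ring
        shares[token] = width / RING_SIZE
    return shares
```

Comparing ownership() of the tokens reported by nodetool ring against ownership(balanced_tokens(n)) gives a quick feel for how far a ring is from balanced. It says nothing about on-disk load, which also depends on compaction and leftover data.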

Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-25 Thread Sameer Farooqui
Looks like the repair finished successfully the second time. However, the cluster is still severely unbalanced. I was hoping the repair would balance the nodes. We're using the random partitioner. One node has 900GB and others have 128GB, 191GB, 129GB, 257GB, etc. The 900GB and the 646GB are just insa

Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-22 Thread Sameer Farooqui
I don't see a JVM crash log (hs_err_pid[pid].log) in ~/brisk/resources/cassandra/bin or /tmp. So maybe the JVM didn't crash? We're running a pretty up-to-date Sun Java: ubuntu@ip-10-2-x-x:/tmp$ java -version java version "1.6.0_24" Java(TM) SE Runtime Environment (build 1.6.0_24-b07) Java HotSpo

Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-21 Thread Jonathan Ellis
Did you check for a JVM crash log? You should make sure you're running the latest Sun JVM; older versions and OpenJDK in particular are prone to segfaulting. On Thu, Jul 21, 2011 at 6:53 PM, Sameer Farooqui wrote: > We are starting Cassandra with "brisk cassandra", so as a stand-alone > process,

Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-21 Thread Sameer Farooqui
We are starting Cassandra with "brisk cassandra", so as a stand-alone process, not a service. The syslog on the node doesn't show anything regarding the Cassandra Java process around the time the last entries were made in the Cassandra system.log (2011-07-21 13:01:51): Jul 21 12:35:01 ip-10-2-206

Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-21 Thread aaron morton
The default init.d script will direct stdout/stderr to that file. How are you starting brisk / cassandra? Check the syslog and other logs in /var/log to see if the OS killed Cassandra. Also, what was the last thing in the Cassandra log before INFO [main] 2011-07-21 15:48:07,233 AbstractCassandra
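
One way to check whether the OS killed the process is to scan the syslog for out-of-memory killer messages. The sketch below is a minimal example under stated assumptions: the search patterns are guesses at common Linux kernel wording, which varies between kernel versions, and find_oom_kills is a hypothetical helper name, not part of any tool mentioned in this thread.

```python
import re

# Assumed wording of kernel OOM-killer messages; adjust for your kernel.
OOM_PATTERN = re.compile(r"out of memory|oom-killer|killed process",
                         re.IGNORECASE)


def find_oom_kills(lines):
    """Return the log lines that look like OOM-killer activity."""
    hits = []
    for line in lines:
        if OOM_PATTERN.search(line):
            hits.append(line.rstrip("\n"))
    return hits


# Example usage (the path is an assumption, typical for Ubuntu):
# with open("/var/log/syslog") as log:
#     for hit in find_oom_kills(log):
#         print(hit)
```

An empty result does not prove the OS left the process alone. It only means none of the assumed patterns appeared in the file that was scanned.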

Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-21 Thread Sameer Farooqui
Hey Aaron, I don't have any output.log files in that folder: ubuntu@ip-10-2-x-x:~$ cd /var/log/cassandra ubuntu@ip-10-2-x-x:/var/log/cassandra$ ls system.log system.log.11 system.log.4 system.log.7 system.log.1 system.log.2 system.log.5 system.log.8 system.log.10 system.log.3 system

Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-21 Thread aaron morton
Check /var/log/cassandra/output.log (assuming the default init scripts) A - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 22 Jul 2011, at 10:13, Sameer Farooqui wrote: > Hmm. Just looked at the log more closely. > > So, what actually hap

Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-21 Thread Sameer Farooqui
Hmm. Just looked at the log more closely. So, what actually happened is while Repair was running on this specific node, the Cassandra java process terminated itself automatically. The last entries in the log are: INFO [ScheduledTasks:1] 2011-07-21 13:00:20,285 GCInspector.java (line 128) GC for

Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-21 Thread Jonathan Ellis
Looks harmless to me. On Thu, Jul 21, 2011 at 1:41 PM, Sameer Farooqui wrote: > While running Repair on a 0.8.1 node, we got this error in the system.log: > > ERROR [Thread-23] 2011-07-21 15:48:43,868 AbstractCassandraDaemon.java (line > 113) Fatal exception in thread Thread[Thread-23,5,main] > j

Repair fails with java.io.IOError: java.io.EOFException

2011-07-21 Thread Sameer Farooqui
While running Repair on a 0.8.1 node, we got this error in the system.log: ERROR [Thread-23] 2011-07-21 15:48:43,868 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-23,5,main] java.io.IOError: java.io.EOFException at org.apache.cassandra.net.IncomingTcpConnection.ru