Re: Ec2Snitch to Ec2MultiRegionSnitch

2013-04-22 Thread Dane Miller
On Thu, Apr 18, 2013 at 7:41 AM, Alain RODRIGUEZ wrote: > I am wondering about the process to grow from one data center to a few of > them. First thing is we use EC2Snitch for now. So I guess we have to switch > to Ec2MultiRegionSnitch. > > c/ I am using the SimpleStrategy. Is it worth it/mandator

Re: unexplained hinted handoff

2013-04-16 Thread Dane Miller
On Sun, Apr 14, 2013 at 11:28 AM, aaron morton wrote: >> If hints are being stored, doesn't that imply DOWN nodes, and why don't I >> see that in the logs? > > Hints are stored for two reasons. First if the node is down when the write > request starts, second if the node does not reply to the coo

Re: unexplained hinted handoff

2013-04-12 Thread Dane Miller
On Fri, Apr 12, 2013 at 1:12 PM, Dane Miller wrote: > I'm seeing hinted handoff kick in on all our nodes during periods of > high activity, but all the nodes seem to be up (according to the logs > and nodetool status). The pattern in the logs is something like this: > >

unexplained hinted handoff

2013-04-12 Thread Dane Miller
I'm seeing hinted handoff kick in on all our nodes during periods of high activity, but all the nodes seem to be up (according to the logs and nodetool status). The pattern in the logs is something like this: 18:10:45 194 READ messages dropped in last 5000ms 18:11:10 Started hinted handoff for ho

Re: IndexOutOfBoundsException during repair, streaming

2013-04-04 Thread Dane Miller
On Wed, Apr 3, 2013 at 6:08 PM, aaron morton wrote: > We deleted and recreated those CFs before moving into > production mode. > > We have a wiener. > > The comparator is applying the current schema to the byte value read from > disk (schema on read) which describes a value with more than 2 compon

Re: IndexOutOfBoundsException during repair, streaming

2013-04-02 Thread Dane Miller
On Mon, Apr 1, 2013 at 10:19 PM, aaron morton wrote: > ERROR [Thread-232] 2013-04-01 22:22:21,760 CassandraDaemon.java (line > 133) Exception in thread Thread[Thread-232,5,main] > java.lang.IndexOutOfBoundsException: index (2) must be less than size (2) >at > com.google.common.base.Precond

IndexOutOfBoundsException during repair, streaming

2013-04-01 Thread Dane Miller
I hit some more errors while running repair. Repair starts streaming, and then hangs. On 2 of the 3 nodes involved in the repair, "nodetool netstats" shows no streams. On the 3rd node there is an error in the logs and "nodetool netstats" shows several streams that appear hung -- no progress in abo

hints compaction

2013-04-01 Thread Dane Miller
Hi, Several of my nodes have been compacting system.hints for over 24 hours with no progress, causing high load on otherwise idle nodes. I'm seeing 30-50 Data.db files in system/hints/ What are the proper compaction settings for the hints CF? Mine are: compaction={'min_threshold': '0', 'class':

Re: Stream fails during repair, two nodes out-of-memory

2013-03-23 Thread Dane Miller
wondering if I should throttle streaming, and/or repair only one CF at a time. > From: "Dane Miller" > Subject: Re: Stream fails during repair, two nodes out-of-memory > > On Thu, Mar 21, 2013 at 10:28 AM, aaron morton > wrote: >> heap of 1867M is kind of small. According

Re: Stream fails during repair, two nodes out-of-memory

2013-03-22 Thread Dane Miller
On Thu, Mar 21, 2013 at 10:28 AM, aaron morton wrote: > heap of 1867M is kind of small. According to the discussion on this list, > it's advisable to have m1.xlarge. > > +1 > > In cassandra-env.sh set the MAX_HEAP_SIZE to 4GB, and the NEW_HEAP_SIZE to > 400M > > In the yaml file set > > in_memory_

Stream fails during repair, two nodes out-of-memory

2013-03-20 Thread Dane Miller
After having just solved one repair problem, I immediately hit another. Again, much appreciation for suggestions... I'm having problems repairing a CF, and the failure consistently brings down 2 of the 6 nodes in the cluster. I'm running "repair -pr" on a single CF on node2, the repair starts str

Re: Errors on replica nodes halt repair

2013-03-20 Thread Dane Miller
e so you can work out which one is > failing. Thanks Aaron. I was able to determine the problem CFs from the logs, and then fixed with nodetool scrub. Repair now completes. Thank you! Dane > On 19/03/2013, at 10:58 AM, Dane Miller wrote: > > I'm having trouble completing a repa

Errors on replica nodes halt repair

2013-03-18 Thread Dane Miller
I'm having trouble completing a repair on several of my nodes due to errors during compaction. This is a 6 node cluster using the simple replication strategy, rf=3, with each node assigned a single token. I'm running "nodetool repair -pr" on node1, which progresses until a specific keyspace then a

Re: repair hangs

2013-03-14 Thread Dane Miller
On Thu, Mar 14, 2013 at 6:34 AM, aaron morton wrote: >> 1. is this a nodetool bug? is there any way to propagate the >> java.io.IOException back to nodetool? > The repair continues to work even if nodetool fails, it's a server side thing. > >> 2. network problems on EC2, I'm shocked! are there r

Re: repair hangs

2013-03-13 Thread Dane Miller
On Wed, Mar 13, 2013 at 12:39 PM, Wei Zhu wrote: > My guess would be there is some exception during the repair and your session > is aborted. > Here is the code of doing repair: > >https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/AntiEntropyService.java > > loo

Re: repair hangs

2013-03-13 Thread Dane Miller
On Wed, Mar 13, 2013 at 11:44 AM, Wei Zhu wrote: >Do you see anything related to "merkle" tree in your log? > >Also do a nodetool compactionstats, during merkle tree calculation, you will >see >validation there. The last mention of "merkle" is 2 days old. compactionstats are: $ nodetool compac

repair hangs

2013-03-13 Thread Dane Miller
Hi, On one of my nodes, nodetool repair -pr has been running for 48 hours and appears to be hung, with no output and no AntiEntropy messages in system.log for 40+ hours. Load, cpu, etc are all near zero. There are no other repair jobs running in my cluster. What's the recommended way to deal wi

migrating from SimpleStrategy to NetworkTopologyStrategy

2013-03-11 Thread Dane Miller
Hi, I'd like to resurrect this thread from April 2012 - http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/migrating-from-SimpleStrategy-to-NetworkTopologyStrategy-td7481090.html - "migrating from SimpleStrategy to NetworkTopologyStrategy" We're in a similar situation, and I'd like