Jeremiah,
Thanks!
I'm running 1.0.8, two interesting things to note:
- I don't have sufficient disk space to handle a straight bump to a
replication factor of 4, so I think I'm going to have to do it one
step at a time (1, 2, 3, then 4) with a round of cleanups in between.
- Also, LOCAL_QUORUM doesn't work for me: my application has a hard
response-time limit, and with a quorum my read latency ends up being
that of the slowest responding node. What I want is LOCAL_ONE, which
doesn't exist in the API (unless I missed something).
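For reference, the one-step-at-a-time plan from the first point could be
sketched as a dry-run shell loop like the one below. The keyspace name and
node address are placeholders, the strategy-option syntax is my assumption
for cassandra-cli on 1.0.x, and the script only echoes each command rather
than running it:

```shell
#!/bin/sh
# Dry-run sketch: bump dc2's replication factor one step at a time,
# with a repair to stream data in and a cleanup between bumps.
# KEYSPACE, the dc1:2 setting, and <dc2-node> are placeholders.
KEYSPACE=my_keyspace
for rf in 1 2 3 4; do
  # Raise dc2's replication factor by one step.
  echo "cassandra-cli> update keyspace $KEYSPACE with strategy_options={dc1:2,dc2:$rf};"
  # Stream the newly owned replicas onto each dc2 node.
  echo "nodetool -h <dc2-node> repair $KEYSPACE"
  # Reclaim disk space between bumps, per the plan above.
  echo "nodetool -h <dc2-node> cleanup $KEYSPACE"
done
```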
Yes, CASSANDRA-3483 is really what I'm looking for.
--david
On 3/5/12 8:02 AM, Jeremiah Jordan wrote:
You need to make sure your clients are reading using LOCAL_* settings
so that they don't try to get data from the other data center. But
you shouldn't get errors while the replication factor is 0. Once you
change the replication factor to 4, you could get missing data on
LOCAL_* reads in DC2 until the repairs finish.
What version are you using?
See the IRC logs at the beginning of this JIRA discussion thread for
some info:
https://issues.apache.org/jira/browse/CASSANDRA-3483
But you should be able to:
1. Set dc2:0 in the strategy options (the per-DC replication factor).
2. Set bootstrap to false on the new nodes.
3. Start all of the new nodes.
4. Change the replication factor to dc2:4.
5. Run repair on the nodes in dc2.
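A dry-run sketch of those steps, which just echoes each command instead of
executing it (the keyspace name, the dc1:2 setting, and the host names are
placeholders, and the cassandra-cli syntax is my assumption for 1.0.x):

```shell
#!/bin/sh
# Echo the add-a-second-datacenter steps; nothing here is executed.
KEYSPACE=my_keyspace
# 1. Define dc2 with a replication factor of 0 before any dc2 node joins.
echo "cassandra-cli> update keyspace $KEYSPACE with strategy_options={dc1:2,dc2:0};"
# 2. Disable bootstrap on each new node before starting it.
echo "edit cassandra.yaml on each dc2 node: auto_bootstrap: false"
# 3. Start all of the new nodes; with dc2:0 they receive no replicas yet.
echo "start the cassandra service on each dc2 node"
# 4. Raise dc2 to its target replication factor.
echo "cassandra-cli> update keyspace $KEYSPACE with strategy_options={dc1:2,dc2:4};"
# 5. Repair each dc2 node so it streams its replicas over.
for host in dc2-node1 dc2-node2; do
  echo "nodetool -h $host repair $KEYSPACE"
done
```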
Once the repairs finish you should be able to start using DC2. You
are still going to need a bunch of extra space, because the repair
will temporarily leave you with extra copies of the data.
Once 1.1 comes out it will have new nodetool commands for making this
a little nicer per CASSANDRA-3483
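Once 1.1 is out, the repair step could presumably be replaced by the
nodetool rebuild command that CASSANDRA-3483 adds, run on each new node
with the source data center as its argument. A dry-run sketch (the host
and DC names are placeholders, and the exact syntax is an assumption
until 1.1 ships):

```shell
#!/bin/sh
# Sketch of the 1.1-era alternative: stream a new node's ranges
# directly from the existing data center instead of running repair.
for host in dc2-node1 dc2-node2; do
  echo "nodetool -h $host rebuild DC1"
done
```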
-Jeremiah
On 03/05/2012 09:42 AM, David Koblas wrote:
Everything that I've read about data centers focuses on setting
things up at the beginning of time.
I have the following situation:
10 machines in a datacenter (DC1), with replication factor of 2.
I want to set up a second data center (DC2) with the following
configuration:
20 machines with a replication factor of 4
What I've found is that if I just start adding nodes, the first
machine to join the cluster attempts to replicate all of the data
from DC1 and fills up its disk. I've played with setting the
strategy options to give DC2 a replication factor of 0; then I can
bring up all 20 machines in DC2, but I start getting a huge number
of read errors on DC1.
Is there a simple cookbook for adding a second DC? I'm currently
trying to set the replication factor to 1 and run a repair, but that
doesn't feel like the right approach.
Thanks,