Data Distribution / Replication

Stefan Kaufmann Thu, 12 Aug 2010 08:30:37 -0700

Hello again,

Over the last few days I ran several tests with Cassandra and learned quite a lot.


However, there are still plenty of things I need to understand.
One of them is how data replication works.
For my testing:
1. I set the replication factor to 3, started with 1 active node (the
seed), and inserted some test keys.
2. I started 2 more nodes, which joined the cluster.
3. I waited for the existing data to replicate to the new nodes, which
didn't happen.
4. I inserted more keys, and it looked like they were distributed to
all three nodes.
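
To make my mental model concrete, here is a minimal Python sketch of how I assume replica placement works on a simple ring: the key is hashed to a token, the first node at or after that token is chosen, and the following nodes clockwise hold the other replicas. This is only my assumption, not Cassandra's actual code; the names `token_for` and `replicas_for`, the MD5-based hash, and the node tokens are made up for illustration.

```python
import bisect
import hashlib


def token_for(key, ring_size=2**32):
    # Hypothetical stand-in for the partitioner: hash the key onto the ring.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % ring_size


def replicas_for(key, ring, replication_factor):
    # ring: list of (token, node_name) pairs, sorted by token.
    # Start at the first node whose token is >= the key's token
    # (wrapping around to the first node), then walk clockwise.
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    # A ring with fewer nodes than the replication factor can only
    # hold as many copies as it has nodes.
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Step 1: only the seed node exists, so each key gets a single copy.
one_node = [(0, "node1")]
print(replicas_for("testkey", one_node, 3))

# Step 4: with three nodes, a newly written key lands on all of them.
three_nodes = [(0, "node1"), (2**32 // 3, "node2"), (2 * 2**32 // 3, "node3")]
print(replicas_for("testkey", three_nodes, 3))
```

If this model is right, it would explain steps 3 and 4: new writes are sent to all three replicas, but nothing in the placement logic itself copies keys that were written while only one node existed.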

So here is my question:
How can I ensure that every key exists on at least three nodes, so
that when I start with one node and later add two more, the existing
data gets distributed to them?
Shouldn't this happen automatically? Am I just not patient enough?
How is this handled in production environments? For instance, if one
node has a hardware failure and is replaced with a new, blank one,
how does that node get its data back?

I searched the mailing list, and the only answer I found was to copy
the data manually. Is this true?

I'm currently using Cassandra 0.6.4 in our testing environment, with
the RackUnawareStrategy.

Stefan
