I would like to verify one thing: if I set up a multi-data-center
Cassandra cluster, will each data center hold the complete data set?

Say I have two data centers, each with two nodes, and a partitioner whose
tokens range from 0 to 100. Initial tokens are assigned this way:

DC1:N1 = 00
DC2:N1 = 25
DC1:N2 = 50
DC2:N2 = 75

where DCX is data center X and NX is node X. *Which one of the following
options is true?*

*Option #1: *DC1 and DC2 will each hold the complete dataset, with keys
bucketed as follows:
DC1:N1 = (50, 00] => 50 keys
DC1:N2 = (00, 50] => 50 keys
----
Complete data set mirrored at DC1

DC2:N1 = (75, 25] => 50 keys
DC2:N2 = (25, 75] => 50 keys
----
Complete data set mirrored at DC2

*Option #2: *DC1 and DC2 will each hold 50% of the data, with keys bucketed
as follows (much the same way as in a single-data-center setup):
DC1:N1 = (75, 00] => 25 keys
DC2:N1 = (00, 25] => 25 keys
DC1:N2 = (25, 50] => 25 keys
DC2:N2 = (50, 75] => 25 keys
----
Data is divided between the two data centers.
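
To make the arithmetic behind the two options concrete, here is a minimal
Python sketch. It assumes a toy ring of 100 tokens in which each node owns
the range (previous token, own token]. The helpers primary_ranges and per_dc
are hypothetical names made up for this illustration, not Cassandra APIs, and
the sketch only reproduces the bucketing of each option; it does not say
which one Cassandra actually uses.

```python
# Toy model only: a ring of 100 tokens, as in the example above.
RING_SIZE = 100

# Initial tokens from the question.
tokens = {"DC1:N1": 0, "DC2:N1": 25, "DC1:N2": 50, "DC2:N2": 75}


def primary_ranges(node_tokens):
    """Map each node to (start, end, width); the node owns (start, end]."""
    ordered = sorted(node_tokens.items(), key=lambda item: item[1])
    ranges = {}
    for i, (node, token) in enumerate(ordered):
        # For i == 0 the index -1 wraps around to the last node on the ring.
        prev_token = ordered[i - 1][1]
        # A lone node owns the whole ring, hence the "or RING_SIZE".
        width = (token - prev_token) % RING_SIZE or RING_SIZE
        ranges[node] = (prev_token, token, width)
    return ranges


def per_dc(node_tokens, dc):
    """Keep only the nodes whose name starts with the given data center."""
    return {n: t for n, t in node_tokens.items() if n.startswith(dc + ":")}


# Option #1: each data center is treated as its own ring.
option1 = {dc: primary_ranges(per_dc(tokens, dc)) for dc in ("DC1", "DC2")}

# Option #2: all four nodes share one ring.
option2 = primary_ranges(tokens)

for dc, ranges in option1.items():
    print("Option #1", dc, ranges)
print("Option #2", option2)
```

Under Option #1 every node ends up with a range 50 tokens wide, and under
Option #2 every node ends up with a range 25 tokens wide, matching the
buckets listed above.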

Thanks,
PP
