Hi,

Since this thread already contains the system setup, I just want to ask another question:
If you have 3 data centers (DC1, DC2 and DC3) and a keyspace whose strategy options give each DC one replica, and you only write to the nodes in DC1, what path do the replicas take, assuming you've correctly interleaved and evenly spaced the tokens of all the nodes? If you write a record to a node in DC1, will it replicate the record to the node in DC2, which then replicates it to the node in DC3? Or will the node in DC1 replicate the record to both DC2 and DC3?

Cheers,
Alex

On Thu, Mar 15, 2012 at 11:26 PM, Alexandru Sicoe <adsi...@gmail.com> wrote:

> Sorry for that last message, I was confused because I thought I needed to
> use the DseSimpleSnitch but of course I can use the PropertyFileSnitch and
> that allows me to get the configuration with 3 data centers explained.
>
> Cheers,
> Alex
>
> On Thu, Mar 15, 2012 at 10:56 AM, Alexandru Sicoe <adsi...@gmail.com> wrote:
>
>> Thanks Tyler,
>> I see that cassandra.yaml has "endpoint_snitch:
>> com.datastax.bdp.snitch.DseSimpleSnitch". Will this pick up the
>> configuration from the cassandra-topology.properties file as does the
>> PropertyFileSnitch? Or is there some other way of telling it which nodes
>> are in which DC?
>>
>> Cheers,
>> Alex
>>
>> On Wed, Mar 14, 2012 at 9:09 PM, Tyler Hobbs <ty...@datastax.com> wrote:
>>
>>> Yes, you can do this.
>>>
>>> You will want to have three DCs: DC1 with [1, 2, 3], DC2 with [4, 5, 6],
>>> and DC3 with [7, 8, 9]. For your normal data keyspace, the replication
>>> strategy should be NTS, and the strategy_options should have some replicas
>>> in each of the three DCs. For example: {DC1: 3, DC2: 3, DC3: 3} if you
>>> need that level of replication in each one (although you probably only want
>>> an RF of 1 for DC3).
>>>
>>> Your clients that are performing writes should only open connections
>>> against the nodes in DC1, and you should write at CL.ONE or
>>> CL.LOCAL_QUORUM. Likewise for reads, your clients should only connect to
>>> nodes in DC2, and you should read at CL.ONE or CL.LOCAL_QUORUM.
>>>
>>> The nodes in DC3 should run as analytics nodes. I believe the default
>>> CL for m/r jobs is ONE, which would work.
>>>
>>> As far as tokens go, interleaving all three DCs and evenly spacing the
>>> tokens will work. For example, the ordering of your nodes might be [1, 4,
>>> 7, 2, 5, 8, 3, 6, 9].
>>>
>>> On Wed, Mar 14, 2012 at 12:05 PM, Alexandru Sicoe <adsi...@gmail.com> wrote:
>>>
>>>> Hi everyone,
>>>> I want to test out the Datastax Enterprise software to have a mixed
>>>> workload setup with an analytics and a real time part.
>>>>
>>>> However I am not sure how to configure it to achieve what I want: I
>>>> will have 3 real machines on one side of a gateway (1,2,3) and 6 VMs on
>>>> another (4,5,6).
>>>> 1,2,3 will each have a normal Cassandra node that just takes data
>>>> directly from my data sources. I want them to replicate the data to the
>>>> other 6 VMs. Now, out of those 6 VMs 4,5,6 will run normal Cassandra nodes
>>>> and 7,8,9 will run Analytics nodes. So I only want to write to the 1,2,3
>>>> and I only want to serve user reads from 4,5,6 and do analytics on 7,8,9.
>>>> Can I achieve this by configuring 1,2,3,4,5,6 as normal nodes and the rest
>>>> as analytics nodes? If I alternate the tokens as it's explained in
>>>> http://www.datastax.com/docs/1.0/datastax_enterprise/init_dse_cluster#init-dse
>>>> is it analogous to achieving something like 3 DCs each getting their own
>>>> replica?
>>>>
>>>> Thanks,
>>>> Alex
>>>
>>> --
>>> Tyler Hobbs
>>> DataStax <http://datastax.com/>
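
For reference, here is a minimal sketch of the token layout discussed in the quoted thread: nine nodes, interleaved across the three DCs in the order [1, 4, 7, 2, 5, 8, 3, 6, 9], with evenly spaced tokens. It assumes the RandomPartitioner (token range 0 to 2**127), which the thread does not state explicitly. The names RING_SIZE, RING_ORDER, DC_OF and interleaved_tokens are made up for illustration; the printed values would go into each node's initial_token setting.

```python
# Assumption: RandomPartitioner, whose tokens span 0 .. 2**127.
RING_SIZE = 2 ** 127

# Ring order from the quoted reply: nodes of the three DCs alternate.
RING_ORDER = [1, 4, 7, 2, 5, 8, 3, 6, 9]

# DC membership from the quoted reply: DC1 = 1,2,3; DC2 = 4,5,6; DC3 = 7,8,9.
DC_OF = {
    1: "DC1", 2: "DC1", 3: "DC1",
    4: "DC2", 5: "DC2", 6: "DC2",
    7: "DC3", 8: "DC3", 9: "DC3",
}


def interleaved_tokens(ring_order, ring_size=RING_SIZE):
    """Return {node: token}, with tokens evenly spaced in the given ring order."""
    n = len(ring_order)
    return {node: i * ring_size // n for i, node in enumerate(ring_order)}


tokens = interleaved_tokens(RING_ORDER)
for node in RING_ORDER:
    print(f"node {node} ({DC_OF[node]}): initial_token = {tokens[node]}")
```

Walking the ring in token order visits DC1, DC2, DC3 in turn, so within each DC the three nodes are themselves evenly spaced.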