Re: Questions about the replicas selection and remote coordinator

Jun Wu Mon, 01 Feb 2016 13:47:40 -0800

Hi Steve,

    Thank you so much for your kind reply; it makes more sense now. As for 
the remote coordinator issue, it’s definitely an interesting topic. If you reach 
any other conclusions on this, I’d be happy to learn from you.


    Thanks again!

Jun
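
A rough way to compare the two figures discussed in this thread is to count how many messages cross the link between data centers for a single write. The Python sketch below does only that arithmetic. The function name `cross_dc_messages` and its parameters are hypothetical; it models the two diagrams as they are described in this thread, not Cassandra's actual code path, and it says nothing about which behavior a given version implements.

```python
def cross_dc_messages(remote_rf, use_remote_coordinator):
    """Count messages for one write sent to one remote data center.

    Returns (inter_dc, intra_dc): messages that cross the link between
    data centers, and messages that stay inside the remote data center.
    Illustrative model of the two diagrams only, not Cassandra code.
    """
    if use_remote_coordinator:
        # 1.2-style figure: one message crosses to the remote coordinator,
        # which then sends one copy to each replica in its own data center.
        return 1, remote_rf
    # 2.1-style figure: the original coordinator sends one copy to each
    # remote replica directly, so every copy crosses the inter-DC link.
    return remote_rf, 0


if __name__ == "__main__":
    for forwarded in (True, False):
        inter_dc, intra_dc = cross_dc_messages(3, forwarded)
        print(f"remote coordinator={forwarded}: "
              f"{inter_dc} inter-DC, {intra_dc} intra-DC messages")
```

With a replication factor of 3 in the remote data center, the forwarded scheme sends 1 message between data centers and the direct scheme sends 3. That difference is what the remote coordinator question comes down to.
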
> On Jan 29, 2016, at 13:09, Steve Robenalt <sroben...@highwire.org> wrote:
> 
> Hi Jun,
> 
> The replicas are chosen according to factors that are generally more easily 
> selected internally, as is the case with coordinators. Even if the replicas 
> were selected in a completely round-robin fashion initially, they could end 
> up being re-distributed as a result of node failures, additions/removals 
> to/from the cluster, etc, particularly when vnodes are used. As such, the 
> diagrams and the nodes they refer to are hypothetical, but accurate in the 
> sense that they are non-contiguous, and that different sets of replicas are 
> distributed to various parts of the cluster.
> 
> As far as the remote coordinator is concerned, I'm not sure what motivated 
> the change from 1.2 to 2.1 and would be interested in understanding that 
> change myself. I do know that improved performance was a big part of the 2.1 
> release, but I'm not sure if the change in coordinators was part of that 
> effort or not.
> 
> Steve
> 
> 
> On Fri, Jan 29, 2016 at 10:13 AM, Jun Wu <wuxiaomi...@hotmail.com> wrote:
> Hi Steve,
> 
>    Thank you so much for your reply. 
> 
>    Yes, you're right, I'm using version 2.1, so the 1.2 figure I was 
> looking at is outdated for my case.
> 
>     However, this raises another interesting question: why was this part 
> changed between version 1.2 and version 2.1? In the 1.2 figure, node 10 in 
> DC 1 connects to node 10 in DC 2, and node 10 in DC 2 then sends 3 copies to 
> 3 nodes in DC 2. That should save more time than the 2.1 approach, which 
> sends data from node 10 in DC 1 to the 3 nodes in DC 2 directly.
> 
>     Also, is there any information on how the replicas are chosen? For 
> example, in the figure here: 
> https://docs.datastax.com/en/cassandra/2.1/cassandra/dml/architectureClientRequestsMultiDCWrites_c.html
>     Why are nodes 1, 3, 6 chosen as replicas, and 4, 8, 11 as the other 3 
> replicas?
> 
>     Also, is node 11 acting as the remote coordinator here? Or does the 
> concept of a remote coordinator really exist? As the figure shows, we don't 
> even need the remote coordinator.
> 
>     Thanks!
> 
> Jun
> 
>     
>     
> 
> Date: Fri, 29 Jan 2016 09:55:58 -0800
> Subject: Re: Questions about the replicas selection and remote coordinator
> From: sroben...@highwire.org
> To: user@cassandra.apache.org
> 
> 
> Hi Jun,
> 
> The 2 diagrams you are comparing come from versions of Cassandra that are 
> significantly different - 1.2 in the first case and 2.1 in the second case, 
> so it's not surprising that there are differences. Since you haven't 
> qualified your question with the Cassandra version you are asking about, I 
> would assume that the 2.1 example is more representative of what you would be 
> likely to see. In any case, it's best to use a consistent version for your 
> documentation because Cassandra changes quite rapidly with many of the 
> releases.
> 
> As far as choosing the coordinator node, I don't think there's a way to force 
> it, nor would it be a good idea to do so. In order to make a reasonable 
> selection of coordinators, you would need a lot of internal knowledge about 
> load on the nodes in the cluster and you'd need to also handle certain 
> classes of failures and retries, so you would end up duplicating what is 
> already being done for you internally.
> 
> Steve
> 
> 
> On Fri, Jan 29, 2016 at 9:11 AM, Jun Wu <wuxiaomi...@hotmail.com> wrote:
> Hi there,
> 
>     I have some questions about the replicas selection. 
> 
>     Let's say that we have 2 data centers, DC1 and DC2. The figure can be 
> found here: 
> https://docs.datastax.com/en/cassandra/1.2/cassandra/images/write_access_multidc_12.png
> There are 10 nodes in each data center. We set the replication factor to 3 
> in each data center, which means there will be 3 replicas in each data 
> center.
> 
>     (1) My first question is how the 3 nodes to write data to are chosen. In 
> the figure above, the 3 replicas are nodes 1, 2, and 7. Is there any 
> mechanism for selecting these 3?
> 
>     (2) Another question is about the remote coordinator. The previous figure 
> shows that node 10 in DC1 writes data to node 10 in DC2, and node 10 in DC2 
> then writes 3 copies to 3 nodes in DC2.
> 
>     But another figure from DataStax shows a different method. It can be 
> found here: 
> https://docs.datastax.com/en/cassandra/2.1/cassandra/dml/architectureClientRequestsMultiDCWrites_c.html
> It shows node 10 in DC1 sending 3 copies directly to 3 nodes in DC2, 
> without using a remote coordinator.
> 
>     I'm wondering which case is true, because with multiple data centers the 
> time taken by these two methods differs a lot.
> 
>     Also, is there any mechanism for selecting which node acts as the remote 
> coordinator?
> 
>     Thanks!
> 
> Jun
> 
> 
> 
> -- 
> Steve Robenalt 
> Software Architect
> sroben...@highwire.org
> (office/cell): 916-505-1785
> 
> HighWire Press, Inc.
> 425 Broadway St, Redwood City, CA 94063
> www.highwire.org
> 
> Technology for Scholarly Communication
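
On question (1) in the thread: one common placement scheme, roughly in the spirit of Cassandra's NetworkTopologyStrategy, walks the token ring clockwise from the partition's token and takes the first distinct nodes it meets in each data center until that data center's replication factor is met. The Python sketch below illustrates that idea under simplifying assumptions: racks and the snitch are ignored, the ring, tokens, and node names are made up, and `pick_replicas` is a hypothetical helper, not a Cassandra API.

```python
from bisect import bisect_left

# Hypothetical ring: (token, node, data center), sorted by token.
# DC1 owns tokens 10..50 and DC2 owns 15..55, so the two DCs interleave.
EXAMPLE_RING = sorted(
    [(i * 10, f"DC1-n{i}", "DC1") for i in range(1, 6)]
    + [(i * 10 + 5, f"DC2-n{i}", "DC2") for i in range(1, 6)]
)


def pick_replicas(ring, token, rf_per_dc):
    """Walk the ring clockwise from `token`, filling each DC's quota.

    ring      -- list of (token, node, dc) tuples sorted by token
    token     -- token of the partition being written
    rf_per_dc -- dict mapping data center name to its replication factor
    Simplified illustration: rack awareness is not modeled.
    """
    tokens = [t for t, _, _ in ring]
    # First ring position whose token is >= the partition token,
    # wrapping to position 0 past the end of the ring.
    start = bisect_left(tokens, token) % len(ring)
    chosen = {dc: [] for dc in rf_per_dc}
    for step in range(len(ring)):
        _, node, dc = ring[(start + step) % len(ring)]
        if dc not in chosen:
            continue
        # Skip repeats so a node with several tokens (vnodes) counts once.
        if len(chosen[dc]) < rf_per_dc[dc] and node not in chosen[dc]:
            chosen[dc].append(node)
    return chosen


if __name__ == "__main__":
    print(pick_replicas(EXAMPLE_RING, 42, {"DC1": 2, "DC2": 2}))
```

Because the walk starts wherever the partition's token falls, different partitions get different replica sets, which is why the nodes highlighted in the figures look scattered around the ring.
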
