You’re right: Ignite does not provide any built-in facilities for this. There 
are some commercial solutions for replication, or you could write code that 
listens for changes and pushes them onto a queue/message bus.
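
As a rough sketch of that listen-and-forward approach, something like Ignite's 
continuous queries could work. The cache name and the publish() stub below are 
placeholders for your own tables and message-bus producer:

import javax.cache.event.CacheEntryEvent;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.ContinuousQuery;

public class ChangeForwarder {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();
        IgniteCache<Object, Object> cache = ignite.cache("Person"); // your table's cache

        ContinuousQuery<Object, Object> qry = new ContinuousQuery<>();

        // Invoked on this node for every create/update/remove in the cache.
        qry.setLocalListener(events -> {
            for (CacheEntryEvent<?, ?> e : events)
                publish(e.getKey(), e.getValue()); // stand-in for your bus producer
        });

        cache.query(qry); // keep the returned cursor open to keep receiving events
    }

    private static void publish(Object key, Object val) {
        // e.g. hand off to a Kafka/RabbitMQ producer here.
    }
}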

Whether this is worth doing depends on how reliable and fast your network is, 
and on your RPO/RTO requirements.

> On 16 Aug 2021, at 12:30, Courtney Robinson <courtney.robin...@hypi.io> wrote:
> 
> Hi Stephen,
> We've been considering those points you've raised. The challenge with having 
> isolated clusters is how to deal with synchronisation issues. Within one 
> cluster, Ignite will handle an offline node re-joining the cluster. If there 
> are multiple clusters, we'd need to detect and replay changes from the 
> application side, effectively duplicating part of what Ignite is doing.
> 
> Did I miss anything? And if not, how would you suggest handling this in the 
> case of multiple clusters - one in each data centre?
> 
> Regards,
> Courtney Robinson
> Founder and CEO, Hypi
> Tel: +44 208 123 2413 (GMT+0)
> 
> https://hypi.io
> 
> On Mon, Aug 16, 2021 at 10:19 AM Stephen Darlington 
> <stephen.darling...@gridgain.com> wrote:
> A word of caution: you’re generally better off replicating your data across 
> clusters than stretching a single cluster across data centres. If the latency 
> is very low it should work, but it could degrade your throughput, and you 
> need to be careful about split-brain and other networking issues.
> 
> Regards,
> Stephen
> 
>> On 5 Aug 2021, at 15:24, Courtney Robinson <courtney.robin...@hypi.io> 
>> wrote:
>> 
>> Hi Alex,
>> Thanks for the reply. I'm glad I asked before the team went any further.
>> So we can achieve this with the built-in affinity function and the backup 
>> filter. The real complexity is going to be in migrating our existing caches.
>> 
>> So to clarify, the steps involved here are:
>> 1. Because Ignite registers all env. vars as node attributes, we can set 
>> e.g. NODE_DC=<EU_WEST|EU_EAST|CAN0> as an environment variable in each k8s 
>> cluster.
>> 2. Then set the backup filter's constructor-arg.value to NODE_DC. This will 
>> tell Ignite that two backups cannot be placed on any two nodes with the same 
>> NODE_DC value - correct?
>> 3. When we call CREATE TABLE, we must set template=myTemplateName.
>> 4. Before creating any tables, myTemplateName must be created and must 
>> include the backup filter with NODE_DC.
>> Have I got that right? A rough sketch of what we have in mind follows below.
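>> 
>> Something like this, assuming the Java config API (the Spring XML 
>> equivalent would set the same properties):
>> 
>> import org.apache.ignite.Ignite;
>> import org.apache.ignite.cache.affinity.rendezvous.ClusterNodeAttributeAffinityBackupFilter;
>> import org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction;
>> import org.apache.ignite.configuration.CacheConfiguration;
>> 
>> void registerTemplate(Ignite ignite) {
>>     RendezvousAffinityFunction aff = new RendezvousAffinityFunction();
>> 
>>     // A backup may only go to a node whose NODE_DC attribute differs from
>>     // the primary's and from any previously assigned backup's.
>>     aff.setAffinityBackupFilter(
>>         new ClusterNodeAttributeAffinityBackupFilter("NODE_DC"));
>> 
>>     CacheConfiguration<Object, Object> tpl =
>>         new CacheConfiguration<>("myTemplateName");
>>     tpl.setBackups(2);
>>     tpl.setAffinity(aff);
>> 
>>     // Register before any CREATE TABLE ... WITH "template=myTemplateName".
>>     ignite.addCacheConfiguration(tpl);
>> }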
>> 
>> If so, it seems simple enough. Now the real challenge is where you said the 
>> cache has to be re-created.
>> 
>> I can't see how we do this without major downtime. We have functionality in 
>> place that allows customers to effectively do a "copy from table A to B and 
>> then delete A", but it will be impossible to get all of them to do this any 
>> time soon.
>> 
>> Has anyone else had to do something similar? How is the community generally 
>> doing migrations like this?
>> 
>> Side note: the only thing that comes to mind is that we will need to build 
>> a virtual catalog that we maintain, so that there isn't a one-to-one mapping 
>> between customer tables and the actual Ignite table name.
>> So if a table is currently called A and we add a virtual catalog, then we 
>> keep a mapping that says when the user wants to query "A" it should really 
>> go to table "A_v2" or something. This comes with its own challenges and a 
>> massive testing overhead.
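>> 
>> Roughly what we're imagining (illustrative only; the mapping itself would 
>> live somewhere durable, e.g. a REPLICATED cache):
>> 
>> import java.util.Map;
>> import java.util.concurrent.ConcurrentHashMap;
>> 
>> // Illustrative virtual catalog: maps the table name a customer uses to the
>> // physical Ignite table currently backing it (e.g. "A" -> "A_v2").
>> class VirtualCatalog {
>>     private final Map<String, String> mapping = new ConcurrentHashMap<>();
>> 
>>     void remap(String logical, String physical) {
>>         mapping.put(logical, physical);
>>     }
>> 
>>     String resolve(String logical) {
>>         return mapping.getOrDefault(logical, logical);
>>     }
>> }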
>> 
>> Regards,
>> Courtney Robinson
>> Founder and CEO, Hypi
>> Tel: +44 208 123 2413 (GMT+0)
>> 
>> https://hypi.io
>> 
>> On Thu, Aug 5, 2021 at 11:43 AM Alex Plehanov <plehanov.a...@gmail.com> 
>> wrote:
>> Hello,
>> 
>> You can create your own cache templates with the affinity function you 
>> require (currently you use a predefined "partitioned" template, which only 
>> sets cache mode to "PARTITIONED"). See [1] for more information about cache 
>> templates.
>> 
>> > Is this the right approach
>> > How do we handle existing data, changing the affinity function will cause 
>> > Ignite to not be able to find existing data right?
>> You can't change the cache configuration after cache creation. In your 
>> example these changes will simply be ignored. The only way to change a 
>> cache's configuration is to create a new cache and migrate the data.
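>> 
>> For the copy step, a rough sketch using SQL from Java (the target table 
>> name is illustrative, and you would still need to handle writes that arrive 
>> during the copy):
>> 
>> import org.apache.ignite.Ignite;
>> import org.apache.ignite.cache.query.SqlFieldsQuery;
>> 
>> // Assumes Person_v2 was already created with the new template, e.g.
>> // CREATE TABLE Person_v2 (...) WITH "template=myTemplateName".
>> void copyTable(Ignite ignite) {
>>     ignite.cache("Person")
>>         .query(new SqlFieldsQuery("INSERT INTO Person_v2 SELECT * FROM Person"))
>>         .getAll();
>> }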
>> 
>> > How would you recommend implementing the affinity function to be aware of 
>> > the data centre?
>> It's better to use the standard affinity function with a backup filter for 
>> such cases. There is one shipped with Ignite (see [2]).
>> 
>> [1]: 
>> https://ignite.apache.org/docs/latest/configuring-caches/configuration-overview#cache-templates
>> [2]: 
>> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/cache/affinity/rendezvous/ClusterNodeAttributeAffinityBackupFilter.html
>> On Thu, 5 Aug 2021 at 09:40, Courtney Robinson <courtney.robin...@hypi.io> 
>> wrote:
>> Hi all,
>> Our growth with Ignite continues, and as we enter the next phase we need to 
>> support multi-cluster deployments for our platform.
>> We deploy Ignite and the rest of our stack in Kubernetes, and we're in the 
>> early stages of designing what a multi-region deployment should look like.
>> We are 90% SQL-based when using Ignite; the other 10% includes Ignite 
>> messaging, queues and compute.
>> 
>> In our case we have thousands of tables, created like this:
>> CREATE TABLE IF NOT EXISTS Person (
>>   id int,
>>   city_id int,
>>   name varchar,
>>   company_id varchar,
>>   PRIMARY KEY (id, city_id)
>> ) WITH "template=...";
>> In our case, most tables use a template that looks like this:
>> 
>> partitioned,backups=2,data_region=hypi,cache_group=hypi,write_synchronization_mode=primary_sync,affinity_key=instance_id,atomicity=ATOMIC,cache_name=Person,key_type=PersonKey,value_type=PersonValue
>> 
>> I'm aware of affinity co-location 
>> (https://ignite.apache.org/docs/latest/data-modeling/affinity-collocation), 
>> and in the past, when we used the key-value APIs more than SQL, we also used 
>> a custom affinity function to control placement.
>> 
>> What I don't know is how best to do this with SQL-defined caches.
>> We will have at least 3 Kubernetes clusters, each in a different data 
>> centre; let's say EU_WEST, EU_EAST, CAN0.
>> 
>> Previously we provided environment variables that our custom affinity 
>> function would use, and we're thinking of providing the data centre name the 
>> same way.
>> 
>> We have 2 backups in all cases plus the primary, so we want the primary in 
>> one DC and each backup in a different DC.
>> 
>> There is no syntax in the SQL template that we could find that enables 
>> specifying a custom affinity function.
>> The instance_id column we currently use has no common prefix or anything 
>> else to associate it with a DC.
>> 
>> We're thinking of getting the cache for each table and then setting the 
>> affinity function to replace the default RendezvousAffinityFunction, the way 
>> we did before we switched to SQL.
>> Something like this:
>> repo.ctx.ignite.cache("Person")
>>     .getConfiguration(org.apache.ignite.configuration.CacheConfiguration.class)
>>     .setAffinity(new org.apache.ignite.cache.affinity.AffinityFunction() {
>>         ...
>>     });
>> 
>> There are a few things unclear about this:
>> 1. Is this the right approach?
>> 2. How do we handle existing data? Changing the affinity function will 
>> cause Ignite to not be able to find existing data, right?
>> 3. How would you recommend implementing the affinity function to be aware 
>> of the data centre?
>> 4. Are there any other caveats we need to be thinking about?
>> There is a lot of existing data, and we want to avoid a full copy/move to 
>> new tables if possible; that would prove very difficult in production.
>> 
>> Regards,
>> Courtney Robinson
>> Founder and CEO, Hypi
>> Tel: +44 208 123 2413 (GMT+0)
>> 
>> https://hypi.io
> 

