Hi Stephen,
We've been considering the points you've raised. The challenge with
having isolated clusters is dealing with synchronisation. Within a single
cluster, Ignite will handle an offline node re-joining the cluster. With
multiple clusters we'd need to detect and replay changes from the
application side, effectively duplicating part of what Ignite already does.
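
To make it concrete, the sort of thing we'd end up building is roughly the
sketch below, using Ignite's continuous queries to capture changes on one
cluster so the application can replay them into another. The
replicateToOtherDc() sender is hypothetical - it stands in for whatever
transport we'd use between data centres.

import javax.cache.Cache;
import javax.cache.event.CacheEntryEvent;
import org.apache.ignite.Ignite;
import org.apache.ignite.cache.query.ContinuousQuery;
import org.apache.ignite.cache.query.QueryCursor;

// Sketch: application-side change capture on the "Person" cache.
void captureChanges(Ignite ignite) {
    ContinuousQuery<Object, Object> qry = new ContinuousQuery<>();

    qry.setLocalListener(events -> {
        for (CacheEntryEvent<?, ?> e : events)
            replicateToOtherDc(e.getKey(), e.getValue()); // removals would need handling too
    });

    // The cursor must stay open for as long as we want to receive updates.
    QueryCursor<Cache.Entry<Object, Object>> cur =
        ignite.<Object, Object>cache("Person").query(qry);
}

// Hypothetical: ship a change to the other data centre.
void replicateToOtherDc(Object key, Object val) {
    // e.g. publish to a queue or push via a client of the remote cluster
}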

Did I miss anything? If not, how would you suggest handling this in the
case of multiple clusters, one in each data centre?

Regards,
Courtney Robinson
Founder and CEO, Hypi
Tel: +44 208 123 2413 (GMT+0) <https://hypi.io>



On Mon, Aug 16, 2021 at 10:19 AM Stephen Darlington <
stephen.darling...@gridgain.com> wrote:

> A word of caution: you’re generally better off replicating your data across
> clusters than stretching a single cluster across data centres. If the
> latency is very low it should work, but it could degrade your throughput
> and you need to be careful about split-brain and other networking issues.
>
> Regards,
> Stephen
>
> On 5 Aug 2021, at 15:24, Courtney Robinson <courtney.robin...@hypi.io>
> wrote:
>
> Hi Alex,
> Thanks for the reply. I'm glad I asked before the team went any further.
> So we can achieve this with the built-in affinity function and the backup
> filter. The real complexity is going to be in migrating our existing caches.
>
> So to clarify, the steps involved here are:
>
>    1. Because Ignite registers all environment variables as node attributes,
>    we can set e.g. NODE_DC=<EU_WEST|EU_EAST|CAN0> as an environment variable
>    in each k8s cluster
>    2. Then set the backup filter's constructor-arg.value to be NODE_DC.
>    This will tell Ignite that two backups cannot be placed on any two nodes
>    with the same NODE_DC value - correct?
>    3. When we call create table, we must set template=myTemplateName
>    4. Before creating any tables, myTemplateName must be created and must
>    include the backup filter with NODE_DC
>
> Have I got that right?
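>
> For our own notes, here is roughly how we picture steps 1-4 in code - just
> a sketch, assuming a template named "hypiDcAware" and the NODE_DC attribute
> above; please shout if registering templates via addCacheConfiguration
> isn't the right way to do this:
>
> import org.apache.ignite.Ignite;
> import org.apache.ignite.cache.CacheMode;
> import org.apache.ignite.cache.affinity.rendezvous.ClusterNodeAttributeAffinityBackupFilter;
> import org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction;
> import org.apache.ignite.configuration.CacheConfiguration;
>
> // Sketch: register a DC-aware cache template ("hypiDcAware" is our placeholder name).
> void registerDcAwareTemplate(Ignite ignite) {
>     RendezvousAffinityFunction aff = new RendezvousAffinityFunction();
>
>     // Backups go only to nodes whose NODE_DC attribute differs from the nodes
>     // already chosen for that partition (primary included).
>     aff.setAffinityBackupFilter(new ClusterNodeAttributeAffinityBackupFilter("NODE_DC"));
>
>     CacheConfiguration<Object, Object> tpl = new CacheConfiguration<>("hypiDcAware");
>     tpl.setCacheMode(CacheMode.PARTITIONED);
>     tpl.setBackups(2);
>     tpl.setAffinity(aff);
>
>     ignite.addCacheConfiguration(tpl); // register as a template
>
>     // Then at table creation time:
>     // CREATE TABLE ... WITH "template=hypiDcAware,backups=2,..."
> }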
>
> If so, it seems simple enough. Now the real challenge is where you said
> the cache has to be re-created.
>
> I can't see how we'd do this without major downtime. We have functionality
> in place that allows customers to effectively do a "copy from table A to B
> and then delete A", but it will be impossible to get all of them to do this
> any time soon.
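>
> The mechanics of the copy are simple enough per table - roughly the sketch
> below, where "Person_v2" and the "hypiDcAware" template name are
> placeholders. The hard part is doing that live, for every customer table,
> without breaking anything.
>
> import org.apache.ignite.Ignite;
> import org.apache.ignite.IgniteCache;
> import org.apache.ignite.cache.query.SqlFieldsQuery;
>
> // Sketch: per-table migration onto a DC-aware template (names are placeholders).
> void migratePersonTable(Ignite ignite) {
>     IgniteCache<?, ?> cache = ignite.cache("Person");
>
>     // Create the replacement table against the new template.
>     cache.query(new SqlFieldsQuery(
>         "CREATE TABLE Person_v2 (id int, city_id int, name varchar, company_id varchar, " +
>         "PRIMARY KEY (id, city_id)) WITH \"template=hypiDcAware\"")).getAll();
>
>     // Copy the data across, then drop the old table.
>     cache.query(new SqlFieldsQuery(
>         "INSERT INTO Person_v2 (id, city_id, name, company_id) " +
>         "SELECT id, city_id, name, company_id FROM Person")).getAll();
>
>     cache.query(new SqlFieldsQuery("DROP TABLE Person")).getAll();
> }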
>
> Has anyone else had to do something similar? How is the community
> generally handling migrations like this?
>
> Side note: the only thing that comes to mind is that we will need to build
> a virtual catalog that we maintain, so that there isn't a one-to-one mapping
> between customer tables and the actual Ignite table name.
> So if a table is currently called A and we add a virtual catalog, then we
> keep a mapping that says when the user wants to query "A" it should really
> go to table "A_v2" or something. This comes with its own challenges and a
> massive testing overhead.
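>
> Roughly what I mean by a virtual catalog - illustrative only, the mapping
> store and helper names are made up:
>
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
> import org.apache.ignite.cache.query.SqlFieldsQuery;
>
> // Illustrative: resolve a customer-facing table name to the physical Ignite
> // table name before building SQL, e.g. "Person" -> "Person_v2".
> final Map<String, String> catalog = new ConcurrentHashMap<>();
>
> String resolvePhysicalName(String customerTable) {
>     return catalog.getOrDefault(customerTable, customerTable);
> }
>
> SqlFieldsQuery selectAllFrom(String customerTable) {
>     // Table names come from our own catalog, not raw user input.
>     return new SqlFieldsQuery("SELECT * FROM " + resolvePhysicalName(customerTable));
> }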
>
> Regards,
> Courtney Robinson
> Founder and CEO, Hypi
> Tel: +44 208 123 2413 (GMT+0) <https://hypi.io/>
>
>
>
> On Thu, Aug 5, 2021 at 11:43 AM Alex Plehanov <plehanov.a...@gmail.com>
> wrote:
>
>> Hello,
>>
>> You can create your own cache templates with the affinity function you
>> require (currently you use a predefined "partitioned" template, which only
>> sets cache mode to "PARTITIONED"). See [1] for more information about cache
>> templates.
>>
>> > Is this the right approach
>> > How do we handle existing data, changing the affinity function will
>> cause Ignite to not be able to find existing data right?
>> You can't change a cache's configuration after the cache has been created. In
>> your example these changes will just be ignored. The only way to change the
>> cache configuration is to create a new cache and migrate the data.
>>
>> > How would you recommend implementing the affinity function to be aware
>> of the data centre?
>> It's better to use the standard affinity function with a backup filter
>> for such cases. There is one shipped with Ignite (see [2]).
>>
>> [1]:
>> https://ignite.apache.org/docs/latest/configuring-caches/configuration-overview#cache-templates
>> [2]:
>> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/cache/affinity/rendezvous/ClusterNodeAttributeAffinityBackupFilter.html
>>
>> On Thu, 5 Aug 2021 at 09:40, Courtney Robinson <courtney.robin...@hypi.io>
>> wrote:
>>
>>> Hi all,
>>> Our growth with Ignite continues and as we enter the next phase, we need
>>> to support multi-cluster deployments for our platform.
>>> We deploy Ignite and the rest of our stack in Kubernetes and we're in
>>> the early stages of designing what a multi-region deployment should look
>>> like.
>>> We are 90% SQL-based when using Ignite; the other 10% includes Ignite
>>> messaging, queues and compute.
>>>
>>> In our case we have thousands of tables:
>>>
>>> CREATE TABLE IF NOT EXISTS Person (
>>>   id int,
>>>   city_id int,
>>>   name varchar,
>>>   company_id varchar,
>>>   PRIMARY KEY (id, city_id)) WITH "template=...";
>>>
>>> In our case, most tables use a template that looks like this:
>>>
>>>
>>> partitioned,backups=2,data_region=hypi,cache_group=hypi,write_synchronization_mode=primary_sync,affinity_key=instance_id,atomicity=ATOMIC,cache_name=Person,key_type=PersonKey,value_type=PersonValue
>>>
>>> I'm aware of affinity co-location (
>>> https://ignite.apache.org/docs/latest/data-modeling/affinity-collocation)
>>> and in the past, when we used the key-value APIs more than SQL, we also
>>> used a custom affinity function to control placement.
>>>
>>> What I don't know is how best to do this with SQL-defined caches.
>>> We will have at least 3 Kubernetes clusters, each in a different data
>>> centre; let's say EU_WEST, EU_EAST and CAN0.
>>>
>>> Previously we provided environment variables that our custom affinity
>>> function would use and we're thinking of providing the data centre name
>>> this way.
>>>
>>> We have 2 backups in all cases plus the primary, so we want the primary
>>> in one DC and each backup in a different DC.
>>>
>>> There is no syntax in the SQL template that we could find that enables
>>> specifying a custom affinity function.
>>> The instance_id column we currently use has no common prefix or anything
>>> else to associate it with a DC.
>>>
>>> We're thinking of getting the cache for each table and then setting the
>>> affinity function to replace the default RendezvousAffinityFunction, the
>>> way we did before we switched to SQL.
>>> Something like this:
>>>
>>> repo.ctx.ignite.cache("Person").getConfiguration(org.apache.ignite.configuration.CacheConfiguration)
>>> .setAffinity(new org.apache.ignite.cache.affinity.AffinityFunction() {
>>>     ...
>>> })
>>>
>>>
>>> There are a few things unclear about this:
>>>
>>>    1. Is this the right approach?
>>>    2. How do we handle existing data? Changing the affinity function
>>>    will cause Ignite to not be able to find existing data, right?
>>>    3. How would you recommend implementing the affinity function to be
>>>    aware of the data centre?
>>>    4. Are there any other caveats we need to be thinking about?
>>>
>>> There is a lot of existing data and we want to avoid a full copy/move to
>>> new tables if possible, as that will prove very difficult in production.
>>>
>>> Regards,
>>> Courtney Robinson
>>> Founder and CEO, Hypi
>>> Tel: +44 208 123 2413 (GMT+0) <https://hypi.io/>
>>>
>>>
>>
>
>
