Re: How to avoid "all partition owners have left the grid" or handle automatically.

John Smith Mon, 20 Feb 2023 10:15:34 -0800

My cache config for the distributed cache is as follows. Maintenance on a
machine can take about 10-20 minutes, depending on what the maintenance is.
I don't lose data. I just get the "all partition owners have left" message,
and then I use the control script to reset the flag for that specific cache.


<bean id="cache-template-bean" abstract="true"
      class="org.apache.ignite.configuration.CacheConfiguration">
    <!-- when you create a template via XML configuration, you must add an
         asterisk to the name of the template -->
    <property name="name" value="partitionedTpl*"/>
    <property name="cacheMode" value="PARTITIONED"/>
    <property name="backups" value="1"/>
    <property name="partitionLossPolicy" value="READ_WRITE_SAFE"/>
</bean>
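
For reference, the manual step mentioned above (running the control script
with reset_lost_partitions for each affected cache) could be wrapped in a
small script. This is only a rough sketch: the script path, host name and
cache names are placeholders, the helper names are made up for illustration,
and it does not check whether any data was actually lost before resetting.

```python
import subprocess
from typing import List, Sequence


def build_reset_command(control_script: str, host: str, cache_name: str) -> List[str]:
    """Build the argument list for one reset_lost_partitions call.

    Mirrors the command quoted in this thread:
    ./control.sh --host <host> --cache reset_lost_partitions <cache>
    """
    return [control_script, "--host", host, "--cache", "reset_lost_partitions", cache_name]


def reset_lost_partitions(
    control_script: str,
    host: str,
    cache_names: Sequence[str],
    dry_run: bool = True,
) -> List[List[str]]:
    """Run (or, with dry_run=True, only print) one reset command per cache.

    Returns the list of commands so the caller can log or inspect them.
    """
    commands = [build_reset_command(control_script, host, name) for name in cache_names]
    for cmd in commands:
        if dry_run:
            # Show what would be executed without touching the cluster.
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if control.sh exits non-zero.
            subprocess.run(cmd, check=True)
    return commands
```

Calling reset_lost_partitions("./control.sh", "ignite-xxxxxx", ["some-cache"])
with the default dry_run=True only prints the command. Passing dry_run=False
would actually execute it, which still leaves the decision of when it is safe
to reset up to whoever runs the script.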

On Mon., Feb. 20, 2023, 7:03 a.m. Stephen Darlington, <
stephen.darling...@gridgain.com> wrote:

> How are your caches configured? If they have at least one backup, you
> should be able to restart one node at a time without data loss.
>
> There is no automated way to reset lost partitions. Nor should there be
> (IMHO). If you have lost partitions, you have probably lost data. That
> should require manual intervention.
>
> On 14 Feb 2023, at 17:58, John Smith <java.dev....@gmail.com> wrote:
>
> Hello, does anyone have insights on this?
>
> On Thu., Feb. 9, 2023, 4:28 p.m. John Smith, <java.dev....@gmail.com>
> wrote:
>
>> Any thoughts on this?
>>
>> On Mon., Feb. 6, 2023, 8:38 p.m. John Smith, <java.dev....@gmail.com>
>> wrote:
>>
>>> That Jira doesn't look like the issue at all. That issue seems to
>>> suggest that there is a "data loss" exception. In our case the grid sets
>>> the cache in a "safe" mode... "all partition owners have left the grid"
>>> which requires us to then manually reset the flag.
>>>
>>> On Mon, Feb 6, 2023 at 7:46 PM 18624049226 <18624049...@163.com> wrote:
>>>
>>>> https://issues.apache.org/jira/browse/IGNITE-17657
>>>> On 2023/2/7 05:41, John Smith wrote:
>>>>
>>>> Hi, sometimes when we perform maintenance and reboot nodes we get "All
>>>> partition owners have left the grid" and then we go and run ./control.sh
>>>> --host ignite-xxxxxx --cache reset_lost_partitions some-cache and
>>>> everything is fine again...
>>>>
>>>> This seems to happen with partitioned caches and we are running as
>>>> READ_WRITE_SAFE.
>>>>
>>>> We have a few caches and instead of relying on a human to manually go
>>>> run the command is there a way for this to happen automatically?
>>>>
>>>> And if there is an automatic way how do we enable it and what are the
>>>> consequences?
>>>>
>>>>
>
