Re: How to avoid "all partition owners have left the grid" or handle automatically.

2023-02-21 Thread John Smith
OK, not sure what happened, but I'm pretty sure it was one machine at a time. So just to be clear: with backup = 1, we can lose 1 machine for any amount of time, as long as it comes back fully online and rebalanced before we go on to the next machine? On Tue, Feb 21, 2023 at 10:01 AM Stephen Da

Re: How to avoid "all partition owners have left the grid" or handle automatically.

2023-02-21 Thread Stephen Darlington
I think there is an argument that when you have persistence enabled and a sensible partition loss policy, then you shouldn’t have to reset lost partitions. As you note, the data is still consistent. You’ve just temporarily lost some availability. However, that’s not how it currently works. If y

Re: How to avoid "all partition owners have left the grid" or handle automatically.

2023-02-20 Thread John Smith
My cache config for distributed cache is as follows... The maintenance of a machine can take about 10-20 mins depending on what the maintenance is. I don't lose data. I just get the "all partition owners have left" message and then I use the control script to reset the flag for that specific cache.

Re: How to avoid "all partition owners have left the grid" or handle automatically.

2023-02-20 Thread Stephen Darlington
How are your caches configured? If they have at least one backup, you should be able to restart one node at a time without data loss. There is no automated way to reset lost partitions. Nor should there be (IMHO). If you have lost partitions, you have probably lost data. That should require man
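To illustrate the point about backups, here is a minimal sketch of a toy model, not Ignite's actual affinity function. It assumes a fixed round-robin placement of each partition on a primary plus one backup, and treats a partition as lost when every node holding a copy is offline at the same time. The node names and the helpers `assign_owners` and `lost_partitions` are hypothetical and exist only for this example.

```python
def assign_owners(num_partitions, nodes, backups=1):
    """Toy round-robin placement (an assumption, not Ignite's algorithm):
    each partition is held by 1 primary plus `backups` other nodes."""
    copies = backups + 1
    return {
        p: [nodes[(p + i) % len(nodes)] for i in range(copies)]
        for p in range(num_partitions)
    }


def lost_partitions(owners, offline):
    """Return the partitions whose every holder is currently offline."""
    return sorted(
        p for p, holders in owners.items()
        if all(node in offline for node in holders)
    )


# Hypothetical three-node cluster with backups = 1.
nodes = ["ignite-01", "ignite-02", "ignite-03"]
owners = assign_owners(8, nodes, backups=1)

# Rolling maintenance: only one node is offline at any moment.
rolling = [lost_partitions(owners, {node}) for node in nodes]

# Overlapping maintenance: a second node stops before the first is back.
overlap = lost_partitions(owners, {"ignite-01", "ignite-02"})

print("rolling:", rolling)
print("overlap:", overlap)
```

In this model a one-at-a-time restart never leaves a partition without an owner, while two overlapping outages do, which is consistent with the advice to wait until a node is fully back before stopping the next one.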

Re: How to avoid "all partition owners have left the grid" or handle automatically.

2023-02-14 Thread John Smith
Hello, does anyone have insights on this?
On Thu., Feb. 9, 2023, 4:28 p.m. John Smith wrote:
> Any thoughts on this?
>
> On Mon., Feb. 6, 2023, 8:38 p.m. John Smith wrote:
>> That Jira doesn't look like the issue at all. That issue seems to suggest
>> that there is a "data loss" exception

Re: How to avoid "all partition owners have left the grid" or handle automatically.

2023-02-09 Thread John Smith
Any thoughts on this?
On Mon., Feb. 6, 2023, 8:38 p.m. John Smith wrote:
> That Jira doesn't look like the issue at all. That issue seems to suggest
> that there is a "data loss" exception. In our case the grid sets the cache
> in a "safe" mode... "all partition owners have left the grid" which

Re: How to avoid "all partition owners have left the grid" or handle automatically.

2023-02-06 Thread John Smith
That Jira doesn't look like the issue at all. That issue seems to suggest that there is a "data loss" exception. In our case the grid sets the cache in a "safe" mode... "all partition owners have left the grid" which requires us to then manually reset the flag. On Mon, Feb 6, 2023 at 7:46 PM 18624

Re: How to avoid "all partition owners have left the grid" or handle automatically.

2023-02-06 Thread 18624049226
https://issues.apache.org/jira/browse/IGNITE-17657 On 2023/2/7 05:41, John Smith wrote: Hi, sometimes when we perform maintenance and reboot nodes we get "All partition owners have left the grid" and then we go and run ./control.sh --host ignite-xx --cache reset_lost_partitions some-cache and