Hello!

You could try a rolling restart of multiple nodes, waiting for the data to rebalance after each restart, to avoid data loss.

Regards,
--
Ilya Kasnacheev

Mon, Oct 28, 2019 at 21:51, Abhishek Gupta (BLOOMBERG/ 919 3RD A) <[email protected]>:

> Thanks Ilya. The thing is, I've seen these exceptions without any errors
> occurring before them. Also, I'm not using persistence, and I've seen this
> happen on multiple nodes at the same time. If I bounce multiple nodes, I
> would lose data (since I have only 1 backup). Anything else I could do?
>
> -Abhishek
>
> From: [email protected] At: 10/28/19 12:47:23
> Cc: [email protected]
> Subject: Re: Intermittent "Partition states validation has failed for
> group" issues
>
> Hello!
>
> I think this means that backup/primary contents are inconsistent.
>
> The implication is that in case of node failure there will be data
> inconsistency (or maybe it's already there).
>
> The recommendation is to a) check logs for any oddities/exceptions, and b)
> maybe remove problematic partitions' files from persistence and/or restart
> problematic nodes.
>
> Regards,
> --
> Ilya Kasnacheev
>
> Mon, Oct 21, 2019 at 23:17, Abhishek Gupta (BLOOMBERG/ 731 LEX) <
> [email protected]>:
>
>> In my otherwise stably running grid (on 2.7.5) I sometimes see an
>> intermittent GridDhtPartitionsExchangeFuture warning. This warning then
>> occurs periodically and goes away after some time. I couldn't find any
>> documentation or other threads about this warning and its implications.
>> * What is the trigger for this warning?
>> * What are the implications?
>> * Is there any recommendation around fixing this issue?
>>
>> 2019-10-21 16:09:44.378 [WARN ] [sys-#26240]
>> GridDhtPartitionsExchangeFuture - Partition states validation has failed
>> for group: mainCache. Partitions cache sizes are inconsistent for Part 0:
>> [id-dgcasp-ob-398-csp-drp-ny-1=43417 id-dgcasp-ob-080-csp-drp-ny-1=43416 ]
>> Part 1: [id-dgcasp-ob-080-csp-drp-ny-1=43720
>> id-dgcasp-ob-471-csp-drp-ny-1=43724 ] Part 2:
>> [id-dgcasp-ob-762-csp-drp-ny-1=43388 id-dgcasp-ob-471-csp-drp-ny-1=43376 ]
>> Part 3: [id-dgcasp-ob-775-csp-drp-ny-1=43488
>> id-dgcasp-ob-403-csp-drp-ny-1=43484 ] Part 4:
>> [id-dgcasp-ob-080-csp-drp-ny-1=43338 id-dgcasp-ob-471-csp-drp-ny-1=43339 ]
>> Part 5: [id-dgcasp-ob-398-csp-drp-ny-1=43105
>> id-dgcasp-ob-471-csp-drp-ny-1=43106 ] Part 7:
>> [id-dgcasp-ob-775-csp-drp-ny-1=43151 id-dgcasp-ob-762-csp-drp-ny-1=43157 ]
>> Part 8: [id-dgcasp-ob-398-csp-drp-ny-1=42975
>> id-dgcasp-ob-471-csp-drp-ny-1=42976 ] Part 10:
>> [id-dgcasp-ob-775-csp-drp-ny-1=43033 id-dgcasp-ob-471-csp-drp-ny-1=43036 ]
>> Part 11: [id-dgcasp-ob-762-csp-drp-ny-1=43303
>> id-dgcasp-ob-471-csp-drp-ny-1=43299 ] Part 12:
>> [id-dgcasp-ob-398-csp-drp-ny-1=43262 id-dgcasp-ob-471-csp-drp-ny-1=43265 ]
>> Part 13: [id-dgcasp-ob-762-csp-drp-ny-1=43123
>> id-dgcasp-ob-471-csp-drp-ny-1=43120 ] Part 15:
>> [id-dgcasp-ob-775-csp-drp-ny-1=43412 id-dgcasp-ob-398-csp-drp-ny-1=43413 ]
>> Part 16: [id-dgcasp-ob-471-csp-drp-ny-1=43934
>> id-dgcasp-ob-403-csp-drp-ny-1=43933 ] Part 20:
>> [id-dgcasp-ob-080-csp-drp-ny-1=43146 id-dgcasp-ob-471-csp-drp-ny-1=43148 ]
>> Part 21: [id-dgcasp-ob-762-csp-drp-ny-1=43196
>> id-dgcasp-ob-080-csp-drp-ny-1=43197 ] Part 22:
>> [id-dgcasp-ob-398-csp-drp-ny-1=43233 id-dgcasp-ob-762-csp-drp-ny-1=43234 ]
>> Part 23: [id-dgcasp-ob-398-csp-drp-ny-1=43127
>> id-dgcasp-ob-471-csp-drp-ny-1=43128 ] Part 24:
>> [id-dgcasp-ob-775-csp-drp-ny-1=43144 id-dgcasp-ob-398-csp-drp-ny-1=43142 ]
>> ... TRUNCATED
>>
>> Thanks,
>> Abhishek
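
For anyone who wants to see which partitions and nodes disagree, and by how much, the warning text can be parsed with a small script. The following is only a rough sketch in Python. It assumes the message keeps the `Part N: [node=size node=size ]` layout shown in the log above; the function names (`parse_inconsistent_partitions`, `size_spread`) and the `SAMPLE` constant are made up for illustration and are not part of Ignite.

```python
import re

# Matches "Part 12: [nodeA=100 nodeB=101 ]" as printed in the warning above.
PART_RE = re.compile(r"Part (\d+):\s*\[([^\]]*)\]")
# Matches a single "node-id=size" pair inside the brackets.
PAIR_RE = re.compile(r"(\S+)=(\d+)")


def parse_inconsistent_partitions(text):
    """Return {partition_id: {node_id: size}} parsed from the warning text."""
    # Drop mail quote markers (">>") and join wrapped lines into one string.
    flat = " ".join(re.sub(r"^[>\s]+", "", line) for line in text.splitlines())
    result = {}
    for part, body in PART_RE.findall(flat):
        result[int(part)] = {node: int(size) for node, size in PAIR_RE.findall(body)}
    return result


def size_spread(sizes):
    """Difference between the largest and smallest copy of one partition."""
    return max(sizes.values()) - min(sizes.values())


# First two partitions copied from the log in this thread.
SAMPLE = """\
>> Partitions cache sizes are inconsistent for Part 0:
>> [id-dgcasp-ob-398-csp-drp-ny-1=43417 id-dgcasp-ob-080-csp-drp-ny-1=43416 ]
>> Part 1: [id-dgcasp-ob-080-csp-drp-ny-1=43720
>> id-dgcasp-ob-471-csp-drp-ny-1=43724 ]
"""

if __name__ == "__main__":
    for part, sizes in sorted(parse_inconsistent_partitions(SAMPLE).items()):
        print(part, size_spread(sizes), sizes)
```

Running this over successive warnings would show whether the same nodes keep appearing with differing counts, which may help narrow down which nodes to restart first.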
