On 18/09/2023 12:16, Rich Freeman wrote:
> This is part of why I like storage implementations that have more robustness built into the software. Granted, it is still only as good as your clients, but with distributed storage I really don't want to be paying for ECC on all of my nodes. If the client calculates a checksum and it remains independent of the data, then any RAM corruption should be detectable as a mismatch (that of course assumes the checksum is preserved and not re-calculated at any point).
Which is why I run raid-5 over dm-integrity. I'm not sure how stable it is :-( but it means any disk corruption gets picked up at the integrity layer, so raid-5 just sees a read error on that block, which triggers a rebuild of the block from parity without data loss.
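
For anyone who wants to see the idea rather than take my word for it, here is a toy sketch in Python. It is not how the kernel does it: the class names (IntegrityDisk, ToyRaid), the 16-byte block size, the CRC32 checksum and the fixed parity disk are all made up for illustration (real raid-5 rotates parity across the disks). It only shows the principle: a per-block checksum that is stored apart from the data and never recalculated on read turns silent corruption into a read error, and the raid layer then rebuilds the block by XORing the others.

```python
import zlib
from functools import reduce

BLOCK = 16  # toy block size in bytes (assumption, for illustration only)


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (raid-5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class IntegrityDisk:
    """Hypothetical stand-in for a disk with an integrity layer on top:
    one checksum per block, kept apart from the data, computed only on write."""

    def __init__(self):
        self.data = {}
        self.sums = {}

    def write(self, n, block):
        self.data[n] = bytearray(block)
        self.sums[n] = zlib.crc32(block)

    def read(self, n):
        block = bytes(self.data[n])
        if zlib.crc32(block) != self.sums[n]:
            # Corruption surfaces as a plain read error, not as bad data
            raise IOError(f"checksum mismatch on block {n}")
        return block


class ToyRaid:
    """Three data disks plus one fixed parity disk (simplified layout)."""

    def __init__(self):
        self.disks = [IntegrityDisk() for _ in range(4)]

    def write_stripe(self, n, blocks):
        assert len(blocks) == 3
        for disk, block in zip(self.disks, list(blocks) + [xor_blocks(blocks)]):
            disk.write(n, block)

    def read_block(self, n, i):
        try:
            return self.disks[i].read(n)
        except IOError:
            # Rebuild from the surviving disks, then rewrite the bad block
            others = [d.read(n) for j, d in enumerate(self.disks) if j != i]
            good = xor_blocks(others)
            self.disks[i].write(n, good)
            return good


raid = ToyRaid()
stripe = [b"A" * BLOCK, b"B" * BLOCK, b"C" * BLOCK]
raid.write_stripe(0, stripe)
raid.disks[1].data[0][3] ^= 0xFF  # simulate silent bit rot on disk 1
recovered = raid.read_block(0, 1)  # read error underneath, rebuilt from parity
```

Without the checksum layer, the read of disk 1 would have happily returned the flipped byte and raid would have had no reason to touch parity at all.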
Cheers, Wol