On 11/22/19 10:58 AM, Robert Haas wrote:
> On Tue, Nov 19, 2019 at 8:49 AM Andrew Dunstan
> <andrew.duns...@2ndquadrant.com> wrote:
>> I admit I haven't been following along closely, but why do we need a
>> cryptographic checksum here instead of, say, a CRC? Do we think that
>> somehow the checksum might be forged? Use of cryptographic hashes as
>> general purpose checksums has become far too common IMNSHO.
>
> I tend to agree with you. I suspect if we just use CRC, some people
> are going to complain that they want something "stronger" because that
> will make them feel better about error detection rates or obscure
> threat models or whatever other things a SHA-based approach might be
> able to catch that CRC would not catch.

Well, the maximum amount of data that can be protected with a 32-bit CRC
is 512MB according to all the sources I found (NIST, Wikipedia, etc). I
presume that's what we are talking about since I can't find any 64-bit
CRC code in core or this patch. So, that's half of what we need with the
default relation segment size (I've seen larger in the field).

> I don't think we
> should offer an option for MD5, because MD5 is a dirty word these days
> and will cause problems for users who have to worry about FIPS 140-2
> compliance.

+1.

> Phrased more positively, if you want a cryptographic hash
> at all, you should probably use one that isn't widely viewed as too
> weak.

Sure. There's another advantage to picking an algorithm with lower
collision rates, though.

CRCs are fine for catching transmission errors (as caveated above) but
not as great for comparing two files for equality. With strong hashes
you can confidently compare local files against the path, size, and hash
stored in the manifest and save yourself a round-trip to the remote
storage to grab the file if it has not changed locally.

This is the basic premise of what we call delta restore which can speed
up restores by orders of magnitude.

Delta restore is the main advantage that made us decide to require SHA1
checksums. In most cases, restore speed is more important than backup
speed.

Regards,
-- 
-David
da...@pgmasters.net
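
To make the delta restore idea above concrete, here is a minimal sketch of the comparison step. It is an illustration only, not the code from any existing tool or from this patch: the manifest layout (a dict of relative path to size and SHA-1 hex digest) and the names sha1_of_file and files_to_fetch are assumptions made up for the example.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_fetch(manifest, restore_dir):
    """Return the manifest paths whose local copy is missing or differs.

    `manifest` is a hypothetical dict: relative path -> (size, sha1 hex).
    Anything not returned can be left in place during the restore.
    """
    needed = []
    for rel_path, (size, sha1_hex) in manifest.items():
        local = os.path.join(restore_dir, rel_path)
        # Cheap checks first: a missing file or a size mismatch
        # already means the file has to be fetched.
        if not os.path.isfile(local) or os.path.getsize(local) != size:
            needed.append(rel_path)
        # Same size: only a matching hash lets us skip the fetch.
        elif sha1_of_file(local) != sha1_hex:
            needed.append(rel_path)
    return needed
```

Only the paths returned by files_to_fetch would need a round-trip to remote storage. Skipping the rest is safe only to the extent that a hash match can be trusted to mean the contents are equal, which is the argument for a strong hash over a CRC here.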