On Mon, 2013-03-04 at 10:36 +0200, Heikki Linnakangas wrote:
> On 04.03.2013 09:11, Simon Riggs wrote:
> > Are there objectors?
>
> FWIW, I still think that checksumming belongs in the filesystem, not
> PostgreSQL.

Doing checksums in the filesystem has some downsides. One is that you
need to use a copy-on-write filesystem like btrfs or zfs, which (by
design) will fragment the heap on random writes. If we're going to start
pushing people toward those systems, we will probably need to spend some
effort to mitigate this problem (aside: my patch to remove
PD_ALL_VISIBLE might get some new wind behind it).

There are also other issues, like what fraction of our users can freely
move to btrfs, and when. If it doesn't happen to be already there, you
need root to get it there, which has never been a requirement before.

I don't fundamentally disagree. We probably need to perform reasonably
well on btrfs in COW mode[1] regardless, because a lot of people will be
using it a few years from now. But there are a lot of unknowns here, and
I'm concerned about tying checksums to a series of things that will be
resolved a few years from now, if ever.

[1] Interestingly, you can turn off COW mode on btrfs, but you lose
checksums if you do.

> If you go ahead with this anyway, at the very least I'd like
> to see some sort of a comparison with e.g btrfs. How do performance,
> error-detection rate, and behavior on error compare? Any other metrics
> that are relevant here?

I suspect it will be hard to get an apples-to-apples comparison here
because of the heap fragmentation, which means that a sequential scan is
not so sequential. That may be acceptable for some workloads but not for
others, so it would get tricky to compare. And any performance numbers
from an experimental filesystem are somewhat suspect anyway.

Also, it's a little more challenging to test corruption on a filesystem,
because you need to find the location of the file you want to corrupt,
and corrupt it out from underneath the filesystem.

Greg may have more comments on this matter.

Regards,
	Jeff Davis