On Wed, Jan 15, 2014 at 10:12:38AM -0500, Robert Haas wrote:
> On Wed, Jan 15, 2014 at 4:35 AM, Jan Kara <j...@suse.cz> wrote:
> > Filesystems could in theory provide facility like atomic write (at least up
> > to a certain size say in MB range) but it's not so easy and when there are
> > no strong usecases fs people are reluctant to make their code more complex
> > unnecessarily. OTOH without widespread atomic write support I understand
> > application developers have similar stance. So it's kind of chicken and egg
> > problem. BTW, e.g. ext3/4 has quite a bit of the infrastructure in place
> > due to its data=journal mode so if someone on the PostgreSQL side wanted to
> > research on this, knitting some experimental ext4 patches should be doable.
>
> Atomic 8kB writes would improve performance for us quite a lot. Full
> page writes to WAL are very expensive. I don't remember what
> percentage of write-ahead log traffic that accounts for, but it's not
> small.

Essentially, the "atomic writes" will be journalled data, so initially
there is not going to be any difference in performance between
journalling the data in userspace and journalling it in the filesystem
journal. Indeed, it could be worse, because the filesystem journal is
typically much smaller than a database WAL file, and it will flush much
more frequently and without the database having any say in when that
occurs.

AFAICT, we're stuck with sucky WAL until the block layer and hardware
support atomic writes.

FWIW, I've certainly considered adding per-file data journalling
capabilities to XFS in the past. If we decide that this is the way to
proceed (i.e. as a stepping stone towards hardware atomic write
support), then I can go back to my notes from a few years ago and see
what still needs to be done to support it....

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com