I wonder whether the kernel could provide a weaker version of fsync() that does not force all pending data to be written immediately but only serves as a write barrier, guaranteeing that all write operations preceding the call are completed before any subsequent write operation.

This would allow the implementation of weaker transaction models that do not satisfy all ACID requirements (the results of a committed transaction can be lost in case of a power failure or OS crash) but still preserve database consistency. That is acceptable for many applications and can provide much better performance.

Right now it is possible to implement something like this at the application level using an asynchronous writer process: all write/sync operations are redirected to that process. But such a process can become a bottleneck, reducing the scalability of the system. Also, the communication channels with this process can cause significant memory/CPU overhead.
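
To make the application-level approach concrete, here is a minimal sketch, using a thread rather than a separate process for brevity. The names AsyncWriter, write, barrier and demo_writer are made up for illustration; only os.pwrite(), os.pread() and os.fsync() are real calls. Callers enqueue operations and return immediately; the single writer applies them in order and uses an ordinary fsync() where a barrier is requested, so only the writer blocks.

```python
import os
import queue
import tempfile
import threading


class AsyncWriter:
    """Hypothetical single writer that applies queued operations in order."""

    def __init__(self):
        self._ops = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def write(self, fd, offset, data):
        # Returns immediately; the caller never waits for the disk.
        self._ops.put(("write", fd, offset, data))

    def barrier(self, fds):
        # Returns immediately; ordering is enforced inside the writer.
        self._ops.put(("barrier", tuple(fds)))

    def close(self):
        self._ops.put(("stop",))
        self._thread.join()

    def _run(self):
        while True:
            op = self._ops.get()
            if op[0] == "stop":
                return
            if op[0] == "write":
                _, fd, offset, data = op
                view = memoryview(data)
                while len(view) > 0:  # handle short writes
                    n = os.pwrite(fd, view, offset)
                    view = view[n:]
                    offset += n
            else:
                # fsync() blocks only this thread. Later writes are not
                # issued until it returns, so they cannot reach the disk
                # before the earlier ones.
                for fd in op[1]:
                    os.fsync(fd)


def demo_writer():
    with tempfile.TemporaryDirectory() as d:
        log_fd = os.open(os.path.join(d, "wal"), os.O_RDWR | os.O_CREAT, 0o600)
        data_fd = os.open(os.path.join(d, "data"), os.O_RDWR | os.O_CREAT, 0o600)
        try:
            w = AsyncWriter()
            w.write(log_fd, 0, b"log record")
            w.barrier([log_fd])  # log must be on disk before the page
            w.write(data_fd, 0, b"page image")
            w.close()
            return os.pread(log_fd, 10, 0), os.pread(data_fd, 10, 0)
        finally:
            os.close(log_fd)
            os.close(data_fd)


if __name__ == "__main__":
    print(demo_writer())
```

The single queue is exactly where the bottleneck mentioned above would show up: every backend funnels its writes through one consumer, and each payload has to be copied or shared with it.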

In most DBMSes, including PostgreSQL, the transaction log and the database data are located in separate files. So such a write barrier should be associated not with a single file but with a set of files, or maybe with the whole file system. I wonder whether there are any fundamental problems in implementing or using such a file system write barrier?
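
As an illustration of why the barrier has to cover a set of files, here is a rough sketch of a simplified checkpoint-like sequence (not PostgreSQL's actual code). The function names barrier_over, checkpoint_like_update and demo_checkpoint are hypothetical, and barrier_over is emulated with plain fsync() on every file, which is stronger than what is being asked for because it waits for the data to reach the disk.

```python
import os
import tempfile


def barrier_over(fds):
    # Hypothetical barrier over a set of files, emulated here with
    # fsync() on each descriptor. A real barrier would only order the
    # writes, not wait for them to complete.
    for fd in fds:
        os.fsync(fd)


def checkpoint_like_update(data_fds, pages, log_fd, record):
    # 1. Modify pages in several data files.
    for fd, page in zip(data_fds, pages):
        os.pwrite(fd, page, 0)
    # 2. All writes above must be ordered before the log record below.
    barrier_over(data_fds)
    # 3. Only now write the record that declares the pages written.
    os.write(log_fd, record)


def demo_checkpoint():
    with tempfile.TemporaryDirectory() as d:
        paths = [os.path.join(d, name) for name in ("rel1", "rel2", "wal")]
        fds = [os.open(p, os.O_RDWR | os.O_CREAT, 0o600) for p in paths]
        try:
            checkpoint_like_update(
                fds[:2], [b"page-A", b"page-B"], fds[2], b"checkpoint"
            )
            return [os.pread(fd, 16, 0) for fd in fds]
        finally:
            for fd in fds:
                os.close(fd)


if __name__ == "__main__":
    print(demo_checkpoint())
```

A barrier tied to a single file descriptor could not express step 2, since the writes being ordered go to different files than the write that follows them.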



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)