On Wed, Jan 15, 2014 at 10:35:44AM +0100, Jan Kara wrote:
> Filesystems could in theory provide facility like atomic write (at least up
> to a certain size say in MB range) but it's not so easy and when there are
> no strong usecases fs people are reluctant to make their code more complex
> unnecessarily. OTOH without widespread atomic write support I understand
> application developers have similar stance. So it's kind of chicken and egg
> problem. BTW, e.g. ext3/4 has quite a bit of the infrastructure in place
> due to its data=journal mode so if someone on the PostgreSQL side wanted to
> research on this, knitting some experimental ext4 patches should be doable.

For the record, a researcher (plus his PhD student) at HP Labs actually
implemented a prototype based on ext3 which created an atomic write
facility. It was good up to about 25% of the ext4 journal size (so, a
couple of MB), and it was used to research persistent memory by building
a persistent heap out of standard in-memory data structures as a
replacement for a database. The result of their research was that ext3
plus atomic write plus standard Java associative arrays beat SQLite.

It was a research prototype, so they didn't handle OOM kill conditions,
and they also didn't try benchmarking against a real database instead of
a toy database such as SQLite, but if someone wants to experiment with
atomic write, there are patches against ext3 that we can probably get
from HP Labs.

- Ted

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers