>>>>> "c" == Miles Nordin <car...@ivy.net> writes:
    c> fbarrier()

On second thought, that couldn't help this problem. The goal is to associate writing to the directory (rename) with writing to the file referenced by that inode/handle (write/fsync/``fbarrier''), and in POSIX these two things are pretty distant and unrelated to each other. The POSIX way to associate them is to wait for fsync() to return before asking for the rename. The waiting is expressive: it's an extremely simple, easy-to-understand API for associating one thing with another. I thought maybe this was so simple that there was only one thing, not two, so the wait could be skipped, but I was wrong.

It is too bad, because as others have said, it means these fsync()s will have to go in to make the app correct and portable with the API we have to work under, even though ZFS has certain convenient quirks and probably doesn't need them.

IMHO the best reaction to the KDE hysteria would be to make sure SQLite and BerkeleyDB are as fast as possible and effortlessly correct on ZFS, and anything that's slow because of too much synchronous writing to tiny files should use a library instead. This is not currently the case, because for high performance one has to manually match DB and ZFS record sizes, which isn't practical for these tiny throwaway databases that must share a filesystem with non-DB stuff. There might be room for improvement in terms of online defragmentation too.
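
For concreteness, the wait-for-fsync()-then-rename ordering described above might look roughly like the sketch below. It is written in Python only because the os module maps directly onto the POSIX calls; the function name atomic_replace and the ".tmp-" prefix are made up for illustration, and it assumes a POSIX system where rename() replaces an existing target atomically. It only orders the file data before the rename and does not fsync() the containing directory afterwards.

```python
import os
import tempfile


def atomic_replace(path: str, data: bytes) -> None:
    """Replace `path` with `data`, waiting for fsync() before rename().

    Hypothetical helper for illustration; assumes POSIX rename() semantics.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename
    # never has to cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # hand Python's buffer to the kernel
            os.fsync(f.fileno())  # block until the kernel reports the data written
        # Only after fsync() has returned do we ask for the rename.
        os.rename(tmp_path, path)
    except BaseException:
        # On any failure, remove the temporary file if it is still there.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Calling atomic_replace("settings.conf", b"key=value\n") twice in a row should leave a single file holding the second payload. The blocking fsync() is the whole cost being argued about: it is the only way the API offers to tie the file write to the directory update.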
_______________________________________________ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss