On 2012-Jul-05 06:47:36 +1000, Nico Williams <n...@cryptonector.com> wrote:
>On Wed, Jul 4, 2012 at 11:14 AM, Bob Friesenhahn
><bfrie...@simple.dallas.tx.us> wrote:
>> On Tue, 3 Jul 2012, James Litchfield wrote:
>>> Agreed - msync/munmap is the only guarantee.
>>
>> I don't see that the munmap definition assures that anything is written to
>> "disk". The system is free to buffer the data in RAM as long as it likes
>> without writing anything at all.
>
>Oddly enough the manpages at the Open Group don't make this clear.

They don't specify the behaviour on write(2) or close(2) either. All
this means is that there is no guarantee that munmap(2) (or write(2)
or close(2)) will immediately flush the data to stable storage.

> So
>I think it may well be advisable to use msync(3C) before munmap() on
>MAP_SHARED mappings.

If you want to be certain that your changes will be flushed to stable
storage by a particular point in your program execution then you must
call msync(MS_SYNC) before munmap(2).

> However, I think all implementors should, and
>probably all do (Linux even documents that it does) have an implied
>msync(2) when doing a munmap(2).

There's nothing in the standard requiring this behaviour and it will
adversely impact performance in the general case so I would expect
that implementors _wouldn't_ force msync(2) on munmap(2). FreeBSD
definitely doesn't. As for Linux, I keep finding cases where, if a
standard doesn't mandate specific behaviour, Linux will implement
(and document) different behaviour to the way other OSs behave in the
same situation.

> I really makes no sense at all to
>have munmap(2) not imply msync(3C).

Actually, it makes no more sense for munmap(2) to imply msync(2) than
it does for close(2) [which is functionally equivalent] to imply
fsync(2) - ie none at all.

>(That's another thing, I don't see where the standard requires that
>munmap(2) be synchronous.

http://pubs.opengroup.org/onlinepubs/009695399/functions/munmap.html
states "Further references to these pages shall result in the
generation of a SIGSEGV signal to the process." It's difficult to
see how to implement this behaviour unless munmap(2) is synchronous.

> Async munmap(2) -> no need to mount
>cross-calls, instead allowing to mapping to be torn down over time.
>Doing a synchronous msync(3C), then a munmap(2) is a recipe for going
>real slow, but if munmap(2) does not portably guarantee an implied
>msync(3C), then would it be safe to do an async msync(2) then
>munmap(2)??)

I don't understand what you are trying to achieve here. munmap(2)
should be a relatively cheap operation so there is very little to be
gained by making it asynchronous. Can you please explain a scenario
where munmap(2) would be slow (other than cases where implementors
have deliberately and unnecessarily made it slow)?

I agree that msync(MS_SYNC) is slow but if you want a guarantee that
your data is securely written to stable storage then you need to wait
for that stable storage. msync(MS_ASYNC) should have no impact on a
later munmap(2) and it should always be safe to call msync(MS_ASYNC)
before munmap(2) (in fact, it's a good idea to maximise portability).

-- 
Peter Jeremy
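
For illustration, the msync(MS_SYNC)-before-munmap(2) sequence discussed
in this message could look roughly like the C sketch below. The helper
name write_durably() and its return convention are hypothetical (they
are not from the thread or any library), and error handling is kept to
a minimum.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy len bytes into a file through a MAP_SHARED
 * mapping and wait for them to reach stable storage before unmapping.
 * Returns 0 on success, -1 on failure. */
int write_durably(const char *path, const void *data, size_t len)
{
    assert(path != NULL && data != NULL);
    if (len == 0)
        return -1;              /* mmap() rejects zero-length mappings */

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    /* Size the file first; touching mapped pages beyond EOF raises SIGBUS. */
    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return -1;
    }

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(map, data, len);

    /* MS_SYNC blocks until the dirty pages have been written out.
     * MS_ASYNC would only schedule the write and return at once. */
    int rc = msync(map, len, MS_SYNC);

    /* The standard does not require munmap() to flush anything. */
    if (munmap(map, len) == -1)
        rc = -1;
    /* Likewise, close() does not imply fsync(). */
    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```

A caller would use it as write_durably("some/file", buf, buflen) and
treat a non-zero return as "the data may not be on stable storage".
Swapping MS_SYNC for MS_ASYNC gives the cheaper variant that only
requests the write-back without waiting for it.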
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss