On Thursday, 17 April 2025 at 00:10:08 UTC+1 Karel Bílek wrote:

I was quite surprised recently that, at least on Linux, file.Close() does 
not guarantee file.Sync(); in some edge cases, files may not be written 
properly to the filesystem.

If data integrity is critical, it seems better to call file.Sync() 
before file.Close().


It depends on what you mean by "data integrity". If you want to be 
reasonably sure[^1] that the data has been persisted to disk *before you 
continue with anything else*, e.g. in case the power is pulled later, then 
yes, you need to fsync the file[^2].
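
For the Go case in the original question, a minimal sketch of that ordering 
(write, then Sync, then Close, checking every error, including the one from 
Close itself) might look like this; the function and file names are just 
illustrative:

    package main

    import (
        "log"
        "os"
    )

    // writeDurably writes data to path and fsyncs it before closing.
    // Close alone does not guarantee the data is on stable storage.
    func writeDurably(path string, data []byte) error {
        f, err := os.Create(path)
        if err != nil {
            return err
        }
        if _, err := f.Write(data); err != nil {
            f.Close()
            return err
        }
        // Sync flushes the file contents to the device (fsync(2) on Linux).
        if err := f.Sync(); err != nil {
            f.Close()
            return err
        }
        return f.Close()
    }

    func main() {
        if err := writeDurably("example.txt", []byte("hello\n")); err != nil {
            log.Fatal(err)
        }
    }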

However, of course, someone could pull the power plug *before* your program 
gets to the point of calling fsync() and/or close(). So it's really a 
question of how your application recovers from errors when it next starts, 
and/or how it communicates with other applications. For example, say that 
its next step is to send a reply meaning "yes, I got your request, you don't 
need to worry about it any more": the contract may say that the other party 
is entitled to assume the data has been persisted safely once it receives 
that reply. In that case, you should persist to disk before sending the 
reply.
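
As a rough sketch of that contract, reusing the writeDurably helper above 
(the handler signature, the connection, and the "OK" reply are hypothetical, 
not something from the thread):

    import "net"

    // handleRequest acknowledges only after the payload is on disk, so the
    // other side may assume the request has been persisted once it sees "OK".
    func handleRequest(conn net.Conn, path string, payload []byte) error {
        if err := writeDurably(path, payload); err != nil {
            return err // do not acknowledge: the data may not be on disk
        }
        _, err := conn.Write([]byte("OK\n"))
        return err
    }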

Theodore Ts'o explains this very nicely:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/54
https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/
https://thunk.org/tytso/blog/2009/03/12/delayed-allocation-and-the-zero-length-file-problem/

In particular, people assumed that if you wrote and closed a file (without 
syncing), followed by an atomic rename, the filesystem would guarantee that 
*either* the old file *or* the new file would be persisted to disk. Because 
of delayed allocation this was a false assumption, but it is so ingrained in 
so much code that a workaround was put into ext4 so that people get the 
behaviour they "expect".
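
For reference, here is a Go sketch of the safe version of that pattern. It 
is only a sketch: the temp-file naming and the directory fsync at the end 
are assumptions of mine, and fsyncing a directory this way is 
Linux-specific.

    package main

    import (
        "log"
        "os"
        "path/filepath"
    )

    // replaceFile atomically replaces path with data: write a temp file in
    // the same directory, fsync it, rename it over the target, then fsync
    // the directory so the rename itself is durable.
    func replaceFile(path string, data []byte) error {
        dir := filepath.Dir(path)

        tmp, err := os.CreateTemp(dir, ".tmp-*")
        if err != nil {
            return err
        }
        tmpName := tmp.Name()
        defer os.Remove(tmpName) // harmless once the rename has succeeded

        if _, err := tmp.Write(data); err != nil {
            tmp.Close()
            return err
        }
        if err := tmp.Sync(); err != nil { // otherwise the rename can beat the data to disk
            tmp.Close()
            return err
        }
        if err := tmp.Close(); err != nil {
            return err
        }
        if err := os.Rename(tmpName, path); err != nil {
            return err
        }

        // Fsync the containing directory (works on Linux) so the rename
        // itself is persisted as well.
        d, err := os.Open(dir)
        if err != nil {
            return err
        }
        defer d.Close()
        return d.Sync()
    }

    func main() {
        if err := replaceFile("config.json", []byte(`{"ok":true}`)); err != nil {
            log.Fatal(err)
        }
    }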

Aside: sometimes all that you need for data integrity is sequencing, which 
can be enforced by write barriers, without having to wait for things to 
complete (since the write barrier passes down the stack even through 
deferred writes).

For example, suppose your application did the following:
- write chunk A
- barrier
- write chunk B

The power could be pulled out *at any point*, even in the middle of a 
write. On restart you will have to deal with these situations:
- corrupt or incomplete A only
- complete A, corrupt or incomplete B
- complete A, complete B

But with a write barrier, you will never see:
- corrupt or incomplete A, corrupt or incomplete B
- corrupt or incomplete A, complete B

[^1] if the device lies, e.g. it says the data has been put in persistent 
storage but it's only in non-battery-backed RAM, then all bets are off.

[^2] it's also essential to check the return code from fsync():
https://wiki.postgresql.org/wiki/Fsync_Errors
