On Thursday, 17 April 2025 at 00:10:08 UTC+1 Karel Bílek wrote:

I was quite surprised recently that, at least on Linux, file.Close() does 
not guarantee file.Sync(); in some edge cases, files may not be written 
properly to the filesystem.

If data integrity is critical, it seems better to call file.Sync() 
before file.Close().


It depends on what you mean by "data integrity". If you want to be 
reasonably sure[^1] that the data has been persisted to disk *before you 
continue with anything else*, e.g. in case the power is pulled later, then 
yes, you need to fsync the file[^2].
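
For the Go case in the original question, a minimal sketch of that ordering 
(write, then Sync, then Close, checking every error, including the one from 
Close itself) might look like this; the function and file names are just 
illustrative:

    package main

    import (
        "log"
        "os"
    )

    // writeDurably writes data to path and fsyncs it before closing.
    // Close alone does not guarantee the data is on stable storage.
    func writeDurably(path string, data []byte) error {
        f, err := os.Create(path)
        if err != nil {
            return err
        }
        if _, err := f.Write(data); err != nil {
            f.Close()
            return err
        }
        // Sync flushes the file contents to the device (fsync(2) on Linux).
        if err := f.Sync(); err != nil {
            f.Close()
            return err
        }
        return f.Close()
    }

    func main() {
        if err := writeDurably("example.txt", []byte("hello\n")); err != nil {
            log.Fatal(err)
        }
    }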

However, of course, someone could pull the power plug *before* your program 
gets to the point of calling fsync() and/or close(). So it's really a 
question of how your application recovers from errors when it next starts, 
and/or how it communicates with other applications. For example, say that 
its next step is to send a reply meaning "yes, I got your request, you don't 
need to worry about it any more": the contract may say that the other party 
is entitled to assume the data has been persisted safely once it receives 
that reply. In that case, you should persist to disk before sending the 
reply.
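
As a rough sketch of that contract, reusing the writeDurably helper above 
(the handler signature, the connection, and the "OK" reply are hypothetical, 
not something from the thread):

    import "net"

    // handleRequest acknowledges only after the payload is on disk, so the
    // other side may assume the request has been persisted once it sees "OK".
    func handleRequest(conn net.Conn, path string, payload []byte) error {
        if err := writeDurably(path, payload); err != nil {
            return err // do not acknowledge: the data may not be on disk
        }
        _, err := conn.Write([]byte("OK\n"))
        return err
    }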

Theodore Ts'o explains this very nicely:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/54
https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/
https://thunk.org/tytso/blog/2009/03/12/delayed-allocation-and-the-zero-length-file-problem/

In particular, people assumed that if you wrote and closed a file (without 
syncing), followed by an atomic rename, the filesystem would guarantee that 
*either* the old file *or* the new file would be persisted to disk. Because 
of delayed allocation this was a false assumption, but it is so ingrained in 
so much code that a workaround was put into ext4 so that people get the 
behaviour they "expect".
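
For reference, here is a Go sketch of the safe version of that pattern. It 
is only a sketch: the temp-file naming and the directory fsync at the end 
are assumptions of mine, and fsyncing a directory this way is 
Linux-specific.

    package main

    import (
        "log"
        "os"
        "path/filepath"
    )

    // replaceFile atomically replaces path with data: write a temp file in
    // the same directory, fsync it, rename it over the target, then fsync
    // the directory so the rename itself is durable.
    func replaceFile(path string, data []byte) error {
        dir := filepath.Dir(path)

        tmp, err := os.CreateTemp(dir, ".tmp-*")
        if err != nil {
            return err
        }
        tmpName := tmp.Name()
        defer os.Remove(tmpName) // harmless once the rename has succeeded

        if _, err := tmp.Write(data); err != nil {
            tmp.Close()
            return err
        }
        if err := tmp.Sync(); err != nil { // otherwise the rename can beat the data to disk
            tmp.Close()
            return err
        }
        if err := tmp.Close(); err != nil {
            return err
        }
        if err := os.Rename(tmpName, path); err != nil {
            return err
        }

        // Fsync the containing directory (works on Linux) so the rename
        // itself is persisted as well.
        d, err := os.Open(dir)
        if err != nil {
            return err
        }
        defer d.Close()
        return d.Sync()
    }

    func main() {
        if err := replaceFile("config.json", []byte(`{"ok":true}`)); err != nil {
            log.Fatal(err)
        }
    }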

Aside: sometimes all that you need for data integrity is sequencing, which 
can be enforced by write barriers, without having to wait for things to 
complete (since the write barrier passes down the stack even through 
deferred writes).

For example, suppose your application did the following:
- write chunk A
- barrier
- write chunk B

The power could be pulled out *at any point*, even in the middle of a 
write. On restart you will have to deal with these situations:
- corrupt or incomplete A only
- complete A, corrupt or incomplete B
- complete A, complete B

But with a write barrier, you will never see:
- corrupt or incomplete A, corrupt or incomplete B
- corrupt or incomplete A, complete B

[^1] if the device lies, e.g. it says the data has been put in persistent 
storage but it's only in non-battery-backed RAM, then all bets are off.

[^2] it's also essential to check the return code from fsync():
https://wiki.postgresql.org/wiki/Fsync_Errors
