On Mon, Mar 7, 2016 at 3:38 PM, Andres Freund <and...@anarazel.de> wrote: > On 2016-03-05 19:54:05 -0800, Andres Freund wrote: >> I started working on this; delayed by taking longer than planned on the >> logical decoding stuff (quite a bit complicated by >> e1a11d93111ff3fba7a91f3f2ac0b0aca16909a8). I'm not very happy with the >> error handling as it is right now. For one, you have rename_safe return >> error codes, and act on them in the callers, but on the other hand you >> call fsync_fname which always errors out in case of failure. I also >> don't like the new messages much. >> >> Will continue working on this tomorrow. > > So, here's my current version of this. I've not performed any testing > yet, and it's hot of the press. There's some comment smithing > needed. But otherwise I'm starting to like this. > > Changes: > * renamed rename_safe to durable_rename > * renamed replace_safe to durable_link_or_rename (there was never any > replacing going on) > * pass through elevel to the underlying routines, otherwise we could > error out with ERROR when we don't want to. That's particularly > important in case of things like InstallXLogFileSegment(). > * made fsync_fname use fsync_fname_ext, add 'accept permission errors' > param > * have walkdir call a wrapper, to add ignore_perms param > > What do you think?
I have spent a couple of hours looking at that in details, and the patch is neat. + * This routine ensures that, after returning, the effect of renaming file + * persists in case of a crash. A crash while this routine is running will + * leave you with either the old, or the new file. "you" is not really Postgres-like, "the server" or "the backend" perhaps? + /* XXX: perform close() before? might be outside a transaction. Consider errno! */ ereport(elevel, (errcode_for_file_access(), errmsg("could not fsync file \"%s\": %m", fname))); + (void) CloseTransientFile(fd); + return -1; close() should be called before. slot.c for example does so and we don't want to link an fd here in case of elevel >= ERROR. + * It does so by using fsync on the sourcefile before the rename, and the + * target file and directory after. fsync is issued as well on the target file if it exists. I think that's worth mentioning in the header. + /* XXX: Add racy file existence check? */ + if (rename(oldfile, newfile) < 0) I am not sure we should worry about that, what do you think could cause the old file from going missing all of a sudden. Other backend processes are not playing with it in the code paths where this routine is called. Perhaps adding a comment in the header to let users know that would help? Instead of "durable" I think that "persistent" makes more sense. We want to make those renames persistent on disk on case of a crash. So I would suggest the following routine names: - rename_persistent - rename_or_link_persistent Having the verb first also helps identifying that this is a system-level, rename()-like, routine. > I sure wish we had the recovery testing et al. in all the back > branches... Sure, what we have now is something that should really be backpatched, I was just waiting to have all the existing stability issues addressed, the last one on my agenda being the failure of hamster for test 005 I mentioned in another thread before sending patches for other branches. I proposed a couple of potential regarding that actually, see here: http://www.postgresql.org/message-id/cab7npqsaz9hnucmoua30jo2wj8mnrem18p2a7mcra-zrjxj...@mail.gmail.com -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers