On Thu, Jan 12, 2023 at 10:17:55PM -0600, Justin Pryzby wrote: > On Thu, Jan 12, 2023 at 06:43:54PM -0800, Andres Freund wrote: > > > It looks like logical decoding may be the "most wrong" place that > > > wal_sync_method is being used, so maybe my change is reasonable to > > > consider, and not just a workaround. > > > > I don't follow. What does using fsync_fname() vs fsync_fname_ext() have to > > do > > with pg_fsync() using wal_sync_method? fsync_fname() is just a wrapper > > around > > fsync_fname_ext(). Both end up in pg_fsync(). > > My patch used fsync_fname_ext() which would cause an ERROR rather than a > PANIC when failing to fsync the logical decoding pathname. > > > Are you actually proposing that we don't PANIC after an fsync for the > > category > > of files that you list here, even with data_sync_retry set? > > Yes, but I'm referring only to my change to SnapBuildSerialize(). > > The rest of the verbage was me trying to figure out the > history/evolution of pg_fsync usage.
Also note the existing comment (originating from Thomas' "fsync-gate" commit, which introduced data_sync_retry): + * It's safe to just ERROR on fsync() here because we'll retry the whole + * operation including the writes. Also, it seems to work fine if one calls pg_fsync() again, rather than calling fsync_fname(), which re-opens the file. It also seems to work fine if one omits the initial call to fsync_fname("pg_logical/snapshots", true); Since SnapBuildSerialize() isn't atomic (the system could crash at any point), I'm not seeing why these wouldn't be adequately safe. But also hoping Thomas will comment on that. -- Justin