On Sun, Feb 20, 2022 at 6:03 AM Andres Freund <and...@anarazel.de> wrote:
> On 2022-02-19 14:10:39 +0000, Simon Riggs wrote:
> > * wal_receiver - 100ms, currently gets woken when WAL arrives
>
> This is a fairly insane one. We should compute a precise timeout based on
> wal_receiver_timeout.

I proposed a patch to do that here:

https://commitfest.postgresql.org/37/3520/

It needs a couple more tweaks (I realised it needs to round
microsecond sleep times up to the nearest whole millisecond,
explaining a few spurious wakeups, and Horiguchi-san had some more
feedback for me that I haven't got to yet), but it seems close.

> And it's not just one syscall every 100ms, it's
>
> recvfrom(4, 0x7fd66134b960, 16384, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
> epoll_create1(EPOLL_CLOEXEC) = 6
> epoll_ctl(6, EPOLL_CTL_ADD, 9, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=1630593560, u64=140558730322456}}) = 0
> epoll_ctl(6, EPOLL_CTL_ADD, 3, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=1630593584, u64=140558730322480}}) = 0
> epoll_ctl(6, EPOLL_CTL_ADD, 4, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=1630593608, u64=140558730322504}}) = 0
> epoll_wait(6, [], 1, 100) = 0
> close(6) = 0

Yeah, I have a patch for that (no CF entry yet, will create shortly),
and together these two patches get walreceiver's wait loop down to a
single epoll_wait() call (or local equivalent) that waits the exact
amount of time required to perform the next periodic action.

> > 5. Startup process has a hardcoded 5s loop because it checks for
> > trigger file to promote it. So hibernating would mean that it would
> > promote more slowly, and/or restart failing walreceiver more slowly,
> > so this requires user approval, and hence add new GUCs to approve that
> > choice. This is a valid choice because a long-term idle server is
> > obviously not in current use, so waiting 60s for failover or restart
> > is very unlikely to cause significant issue.
>
> There's plenty of databases that are close to read-only but business critical,
> so I don't buy that argument.
>
> IMO we should instead consider either deprecating file based promotion, or
> adding an optional dependency on filesystem monitoring APIs (i.e. inotify etc)
> that avoid the need to poll for file creation.

Yeah, I pondered inotify/EVFILT_VNODE/FindFirstChangeNotification for
this while working on 600f2f50. I realised I could probably teach a
WaitEventSet to wake up when a file is added on most OSes, but I didn't
try to prototype it... I agree that the whole file system-watching
concept feels pretty clunky, so if just getting rid of it is an
option...

It would be nice to tame the walwriter's wakeups...
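
To make the timeout arithmetic mentioned above concrete, here is a minimal sketch (not the code from the patch) of computing how long to sleep until the next periodic action, rounding microseconds up to whole milliseconds. Rounding down would turn a remaining 0.4ms into a 0ms wait that returns before the deadline and causes an extra trip around the loop. The names usec_t, usecs_to_ms_round_up and ms_until_next_deadline are hypothetical and do not come from PostgreSQL.

```c
#include <stdint.h>

/* Hypothetical type for this sketch: a point in time or interval, in microseconds. */
typedef int64_t usec_t;

/*
 * Convert a microsecond interval to milliseconds, rounding up, so that a
 * millisecond-resolution wait never ends before the deadline. Intervals
 * that are zero or already in the past yield 0 (do not sleep at all).
 */
static long
usecs_to_ms_round_up(usec_t usecs)
{
	if (usecs <= 0)
		return 0;
	return (long) ((usecs + 999) / 1000);
}

/*
 * Given the current time and n absolute deadlines (all in microseconds),
 * return the number of milliseconds to wait until the earliest one.
 * Returns -1 when there are no deadlines, which wait APIs such as
 * epoll_wait() interpret as "block indefinitely".
 */
static long
ms_until_next_deadline(usec_t now, const usec_t *deadlines, int n)
{
	usec_t		earliest;
	int			i;

	if (n <= 0)
		return -1;

	earliest = deadlines[0];
	for (i = 1; i < n; i++)
	{
		if (deadlines[i] < earliest)
			earliest = deadlines[i];
	}
	return usecs_to_ms_round_up(earliest - now);
}
```

The result could be passed as the timeout of a single wait call, so the process wakes only for incoming data or for the next scheduled action rather than every 100ms.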