On 19.05.25 at 15:09, Maximiliano Sandoval wrote:
> One sync comes after warning that the watchdog is about to expire, and a
> second right after the watchdog expires.
>
> To maximize the chances the log will contain entries relevant to a fence
> event. This would be extremely useful for detecting whether a node
> fenced.
>
> Signed-off-by: Maximiliano Sandoval <m.sando...@proxmox.com>
> ---
>  src/watchdog-mux.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/watchdog-mux.c b/src/watchdog-mux.c
> index e14c768..8669b10 100644
> --- a/src/watchdog-mux.c
> +++ b/src/watchdog-mux.c
> @@ -268,11 +268,13 @@ main(void)
>                  ) {
>                      client_list[i].warning_state = WARNING_ISSUED;
>                      fprintf(stderr, "client watchdog is about to expire\n");
> +                    sync_journal_unsafe();

The "unsafe" is there for a reason: on a loaded machine the above might
trigger a few times and leave a zombie process behind for each of those
calls.

The simplest fix might be a double fork here, so that the parent of the
process doing the sync no longer exists and systemd collects the child's
exit status, albeit that wouldn't be the most efficient solution.

>                  }
>
>                  if ((ctime - client_list[i].time) > client_watchdog_timeout) {
>                      update_watchdog = 0;
>                      fprintf(stderr, "client watchdog expired - disable watchdog updates\n");
> +                    sync_journal_unsafe();

This is basically useless compared to the status quo: there is already such
a call a few (compiled) instructions after that branch is hit anyway, as we
break out of the main loop then.

>                  }
>              }
>          }
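
For reference, the double fork mentioned above could look roughly like the
following. This is only a minimal sketch, not a tested patch: the helper name
spawn_detached() is hypothetical and does not exist in watchdog-mux.c, and it
assumes the sync is done by exec'ing an external command.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of watchdog-mux.c): run `path` with `argv`
 * in a grandchild process so that the caller never accumulates zombies.
 * Returns 0 if the intermediate child was set up and reaped, -1 otherwise.
 */
static int spawn_detached(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Intermediate child: fork once more and exit right away. */
        pid_t grandchild = fork();
        if (grandchild < 0)
            _exit(1);
        if (grandchild == 0) {
            /* Grandchild: its parent exits immediately, so it gets
             * re-parented (to PID 1 or the nearest subreaper), which
             * then collects its exit status. */
            execv(path, argv);
            _exit(127); /* only reached if execv() failed */
        }
        _exit(0);
    }

    /* Caller: reap the short-lived intermediate child, retrying on EINTR. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The caller would pass the journal sync command, e.g. an argv of
{"journalctl", "--sync", NULL} together with the binary's path (the exact
path is an assumption here). The waitpid() only covers the intermediate
child, which exits right after the second fork, but it is still an extra
fork plus a blocking wait in the main loop, which is the inefficiency
mentioned above.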