There is yet another bug in Openntpd. This is direct copy-paste from openntpd code (ntpd.c:main()):
do { if ((pid = wait(NULL)) == -1 && errno != EINTR && errno != ECHILD) fatal("wait"); } while (pid != -1 || (pid == -1 && errno == EINTR)); What this code intends to do is to reap all children and move on when there are no more. Instead, it ends up blocking indefinitely even when there are no children to reap! The way I fixed the bug is by doing this: Index: openntpd-src/ntpd.c =================================================================== --- openntpd-src.orig/ntpd.c +++ openntpd-src/ntpd.c @@ -90,10 +90,11 @@ main(int argc, char *argv[]) { struct ntpd_conf lconf; struct pollfd pfd[POLL_MAX]; - pid_t chld_pid = 0, pid; + pid_t chld_pid = 0, pid=0; const char *conffile; int ch, nfds, timeout = INFTIM; int pipe_chld[2]; + int status; extern char *__progname; __progname = _compat_get_progname(argv[0]); @@ -233,11 +234,11 @@ main(int argc, char *argv[]) if (chld_pid) kill(chld_pid, SIGTERM); - do { - if ((pid = wait(NULL)) == -1 && - errno != EINTR && errno != ECHILD) - fatal("wait"); - } while (pid != -1 || (pid == -1 && errno == EINTR)); + if (chld_pid && (pid = waitpid(chld_pid, &status, 0)) == -1 && + errno != EINTR && errno != ECHILD) + fatal("wait"); + if (pid !=-1) + log_info("child %d exited with return code %d", pid, WEXITSTATUS(status)); msgbuf_clear(&ibuf->w); free(ibuf); It forks one child anyway. So it suffices to reap that one child. I hope I will get some response to this. If not, I will assume that there is really no interest in fixing bugs in openntpd and in that case, I will patch only our local copy of the ntpd codebase (as opposed to reporting to the community). Thanks, Ani >-----Original Message----- >From: Anirban Sinha >Sent: Thursday, December 04, 2008 6:04 PM >To: 'misc@openbsd.org' >Subject: possible bug in OpenNTPD code? > >Hi: > >I am sort of digging my way through the OpenNTPD codebase for my work. I >think I find a bug in the code. Please help me to understand the reason >if this is not a bug. > >In function ntp_main() (ntp.c), we poll() to check if there are any >events of interest. We do this: > >1. Check internal fds (PIPE_MAIN) >2. Then check PIPE_DNS fds >3. Then check PIPE_HOTPLUG fds > >Next, for the server, we check all the fds we are listening on. And then >finally, for nfs clients, we check the fds for the remote servers. Now, >there's the issue in this line; > >for (j = 1; nfds > 0 && j < idx_peers; j++) { >... >} > >Shouldn't the index start with 3? That is, shouldn't we do this: > >for (j = 3; nfds > 0 && j < idx_peers; j++) > >since, indices 0,1 and 2 correspond to the three checks I have written >above which are already done. > >In other words, can we apply the following patch to fix the issue? > >Index: ntpd/ntp.c >=================================================================== >--- ntpd.orig/ntp.c >+++ ntpd/ntp.c >@@ -344,7 +344,7 @@ ntp_main(int pipe_prnt[2], struct ntpd_c > sensor_hotplugevent(hotplugfd); > } > >- for (j = 1; nfds > 0 && j < idx_peers; j++) >+ for (j = PFD_MAX; nfds > 0 && j < idx_peers; j++) > if (pfd[j].revents & (POLLIN|POLLERR)) { > nfds--; > if (server_dispatch(pfd[j].fd, conf) == -1) > > > >Thanks, > >Ani