The ypserv bug (the one where ypserv randomly stops responding or
just seg-faults) is still very much alive. I had to restart it
about 11 times in the course of 20 minutes this morning. That's
the bad news. The good news is that I started it each time under
'ktrace -i'.
Going back a bit, Matt Dillon suggested that the problem might be
in the signal handler for SIGCHLD. I looked at the signal handler and
it does not appear to be doing anything dangerous at all (just a
child_count--;). Is it doing something dangerous that I am just not seeing?
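
For reference, here is a minimal sketch of the usual ways a "harmless"
counting handler can still bite. It is not the ypserv code: only the name
child_count comes from the handler described above, and everything else
(the counter's type, the function names, reaping with waitpid()) is an
assumption made for illustration. The points it shows are that the handler
saves and restores errno, that it reaps in a loop because several SIGCHLDs
can be coalesced into one delivery, and that the main code blocks SIGCHLD
around its own update of the counter so the two cannot interleave.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumed type; sig_atomic_t is the portable choice for handler-shared state. */
static volatile sig_atomic_t child_count = 0;

/* Reap every exited child; several exits may arrive as one SIGCHLD. */
static void sigchld_handler(int sig)
{
    int saved_errno = errno;    /* waitpid() may overwrite errno */

    (void)sig;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        child_count--;
    errno = saved_errno;
}

/* Hypothetical helper: install the handler. Returns 0 on success. */
static int install_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted syscalls where possible */
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Hypothetical helper: fork one child that exits immediately. */
static int spawn_child(void)
{
    sigset_t block, old;
    pid_t pid;

    /* Block SIGCHLD so the handler cannot run in the middle of child_count++. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);

    pid = fork();
    if (pid == 0)
        _exit(0);               /* child: nothing to do */
    if (pid > 0)
        child_count++;

    sigprocmask(SIG_SETMASK, &old, NULL);
    return pid > 0 ? 0 : -1;
}

/* Hypothetical helper: sleep until the handler has reaped every child. */
static void wait_for_children(void)
{
    sigset_t block, old, waitmask;

    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);

    waitmask = old;
    sigdelset(&waitmask, SIGCHLD);
    while (child_count > 0)
        sigsuspend(&waitmask);  /* atomically unblock SIGCHLD and wait */

    sigprocmask(SIG_SETMASK, &old, NULL);
    assert(child_count == 0);
}
```

If the real handler does none of these things wrong, the remaining effect
of SIGCHLD is indirect: a blocking call in the main loop can be
interrupted and return EINTR.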
Also, in the last 200 lines of kdump output for each and every crash there
is the sequence of calls "select(); gettimeofday();". That sequence of
calls never appears in the ypserv source code, but it does appear in
svc_tcp.c in librpc. My question is: ypserv defines its own svc_run and
handles TCP connections itself very carefully, so how is the svc_tcp.c
code getting called at all? I think the answer to that is the source of
the problem. (It should also be noted that in the runs where ypserv did
not die and I collected ktrace information, up to 8 gig of it, the
"select(); gettimeofday();" sequence _never_ appears.)
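
To make the signature concrete, here is a rough sketch of the kind of code
that leaves a select() followed by gettimeofday() in a trace: a wait with a
deadline that recomputes the time remaining after select() is interrupted.
This is an illustration only, not the svc_tcp.c source; the function name
wait_readable and its millisecond timeout are made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
static int wait_readable(int fd, long timeout_ms)
{
    struct timeval start, now, tv;
    long remaining = timeout_ms;

    assert(timeout_ms >= 0);
    if (gettimeofday(&start, NULL) == -1)
        return -1;

    for (;;) {
        fd_set rfds;
        long elapsed;
        int n;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = remaining / 1000;
        tv.tv_usec = (remaining % 1000) * 1000;

        n = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (n > 0)
            return 1;           /* data is waiting */
        if (n == 0)
            return 0;           /* deadline passed */
        if (errno != EINTR)
            return -1;          /* real error */

        /* Interrupted by a signal: this is the select(); gettimeofday(); pair. */
        if (gettimeofday(&now, NULL) == -1)
            return -1;
        elapsed = (now.tv_sec - start.tv_sec) * 1000
                + (now.tv_usec - start.tv_usec) / 1000;
        remaining = timeout_ms - elapsed;
        if (remaining <= 0)
            return 0;
    }
}
```

In a loop shaped like this, the gettimeofday() call only follows select()
when select() was interrupted, which would be one thing to check for in
the kdump output around each occurrence.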
One of my ktraces is _very_ small, only 330K, covering everything from
fork()/exec() to SIG_DFL/SEGV, so I am hoping it will provide easily
digestible information.
I did not include context-switch information in the ktrace for the following
reasons:
1) It didn't appear to be useful, and since I did specify -i, it is
obvious where context switches occur (to the only thing that could affect
anything: the children).
2) It caused ypserv to act strangely: instead of dying, it just got
very slow and stopped responding.
Anyone interested in helping me track this one down?
--
David Cross | email: [EMAIL PROTECTED]
Lab Director | Rm: 308 Lally Hall
Rensselaer Polytechnic Institute, | Ph: 518.276.2860
Department of Computer Science | Fax: 518.276.4033
I speak only for myself. | WinNT:Linux::Linux:FreeBSD