> If the hardware isn't completely identical then it is reasonable to
> have differences in the parallel boot timings.


Theoretically, the machines were identical, but I haven't inspected them to
make sure. The fact is that the time needed to bind to the NIS server
differed considerably from one machine to the other.


> Using it with NIS/YP is not so common so I think it
> not unlikely that there is a bug related to it there.
>

It turned out there was no real bug (see below).


> That seems like a completely separate issue.  Probably should separate
> the two problems and address each one individually.  Would be happy to
> help with the DNS configuration too.  Describe how it is set up and
> the list could provide feedback on how to improve it.
>

Knowing that there are people ready to help out there always makes me feel
good about the community. Thank you for your willingness to help. However,
DNS is corporate infrastructure business, and it is out of my scope.


> DNS is a marvelously designed distributed database system... ... But it is
> only as good as the configured network around it.
>

Indeed! :-)


> Try this experiment.  At the last point in the /etc/init.d/nis startup
> script add a short sleep.  That will give the daemons time to finish
> and get ready to go.  It is possible that they are not yet quite ready
> yet and so immediately after the end of the script the next one to run
> hits them too early.
>
> I suggest changing this in file /etc/init.d/nis:
>
>   case "$1" in
>     start)
>           do_start
>           ;;
>     stop)
>
> To this as an experiment:
>
>   case "$1" in
>     start)
>           do_start
>           sleep 5   # <-- Add this sleep to give things more time.
>           ;;
>     stop)
>
>
Bingo!!!!  I haven't done exactly that, but your suggestion helped me
understand the NIS init script a bit better. So, I just increased the maximum
count of the existing "wait for bind to succeed" loop from 10 seconds to
20 seconds, and that did the trick. Even the slowest of our machines boots
properly now. The change is shown below:

The init script was:

--------------
bind_wait()
{
	[ "`ypwhich 2>/dev/null`" = "" ] && sleep 1

	if [ "`ypwhich 2>/dev/null`" = "" ]
	then
		bound=""
		log_action_begin_msg "binding to YP server"
		for i in 1 2 3 4 5 6 7 8 9 10
		do
			sleep 1
			log_action_cont_msg "."
			if [ "`ypwhich 2>/dev/null`" != "" ]
			then
				echo -n " done] "
				bound="yes"
				break
			fi
		done
		# This should potentially be an error
		if [ "$bound" ] ; then
			log_action_end_msg 0
		else
			log_action_end_msg 1 "backgrounded"
		fi
	fi
}
--------------
... and I changed that to:
--------------
bind_wait()
{
	[ "`ypwhich 2>/dev/null`" = "" ] && sleep 1

	if [ "`ypwhich 2>/dev/null`" = "" ]
	then
		bound=""
		log_action_begin_msg "binding to YP server"
		for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
		do
			sleep 1
			log_action_cont_msg "."
			if [ "`ypwhich 2>/dev/null`" != "" ]
			then
				echo -n " done] "
				bound="yes"
				break
			fi
		done
		# This should potentially be an error
		if [ "$bound" ] ; then
			log_action_end_msg 0
		else
			log_action_end_msg 1 "backgrounded"
		fi
	fi
}
--------------
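
The same wait can also be written with the limit held in a variable, so that
a longer timeout does not mean extending the list of numbers by hand. What
follows is only a minimal sketch: BIND_TIMEOUT and wait_for_bind are
hypothetical names that do not exist in the stock /etc/init.d/nis script, and
the log_action_* calls of the real script are left out.

```shell
#!/bin/sh
# Sketch only: BIND_TIMEOUT and wait_for_bind are made-up names,
# not part of the stock /etc/init.d/nis script.
BIND_TIMEOUT="${BIND_TIMEOUT:-20}"   # maximum number of seconds to wait

wait_for_bind()
{
    tries=0
    while [ "$tries" -lt "$BIND_TIMEOUT" ]
    do
        # ypwhich prints the name of the bound NIS server once a
        # binding exists; empty output means "not bound yet".
        if [ -n "$(ypwhich 2>/dev/null)" ]
        then
            return 0
        fi
        sleep 1
        tries=$((tries + 1))
    done
    # Still not bound after BIND_TIMEOUT attempts.
    return 1
}
```

Called as `wait_for_bind || echo "not bound yet"`, it returns 0 as soon as
ypwhich reports a server and 1 once the limit is reached. Setting
BIND_TIMEOUT=30 in the environment would lengthen the wait without touching
the loop.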

Now, this will suffice, for me, for now. I'll have to keep an eye on this
script every time I perform system updates, but the machines will boot
properly until the IT crew manages to figure out what is delaying the NIS
binding process. I'm assuming this is not a real Debian squeeze issue.

Thank you very much for your help,

João
