On Jul 15 2009, I wrote:
We had an incident last night on the authoritative nameserver which
is master for dnssec-test.csi.cam.ac.uk (a signed zone). At the time
it was running BIND 9.6.1rc1 (but I doubt if 9.6.1 is going to make
a difference). A script-generated update timed out, and it subsequently
failed to respond to any DNS queries or rndc commands (although the
named process was still running).
It has to have been the update itself that caused this. (It had just
processed updates to two unsigned zones without any trouble.) On
the other hand, it had previously processed dozens of updates to the
signed zone without any problems (it is maintained as an approximate
clone of cam.ac.uk), and there wasn't anything unusual about this one.
Indeed there was no problem re-applying it after BIND had been restarted.
I am reduced to speculating about timing effects, e.g. collision with
a re-signing event.
Unfortunately I failed to get a core dump of named in the non-responding
state (I need to review my procedures for that!) so I haven't got enough
to report to bind-bugs. This is an appeal to ask if anyone has seen
anything similar.
Some extra information: for the 14+ hours before the hang, it had been
logging messages like this:
Jul 14 10:44:24 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 14 10:45:54 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 14 10:50:22 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 14 10:51:51 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 14 10:56:15 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
...
Jul 15 00:50:56 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 15 00:52:22 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 15 00:53:47 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 15 00:55:13 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error]
general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
But I am no nearer to understanding what causes these messages. Several
externally applied updates to the zone were processed (apparently
successfully) during this period, before the one that hung.
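For anyone wanting to look at the timing of the "not exact" messages, here
is a minimal sketch in Python that pulls the timestamps out of such log
lines and prints the gaps between them. The names (LOG, message_times,
gaps_seconds) are made up for illustration, it assumes each log entry has
been joined back onto a single line, and the year is an assumption because
syslog timestamps do not carry one. It says nothing about what named is
doing internally; it only measures the spacing of the messages.

```python
import re
from datetime import datetime

# First five entries from the excerpt above, each joined onto one line.
LOG = """\
Jul 14 10:44:24 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error] general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 14 10:45:54 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error] general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 14 10:50:22 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error] general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 14 10:51:51 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error] general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
Jul 14 10:56:15 authdns0.csx.cam.ac.uk named[1900]: [ID 873579 local7.error] general: error: zone dnssec-test.csi.cam.ac.uk/IN: updatesecure -> not exact
"""

# Capture the leading "Mon DD HH:MM:SS" stamp of lines carrying the message.
PATTERN = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) .*updatesecure -> not exact")


def message_times(text, year=2009):
    """Return datetimes of all matching log lines (year is assumed)."""
    times = []
    for line in text.splitlines():
        m = PATTERN.match(line)
        if m:
            stamp = f"{year} {m.group(1)}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


def gaps_seconds(times):
    """Return the gap in whole seconds between consecutive timestamps."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    print(gaps_seconds(message_times(LOG)))
```

On the five sample lines this prints [90, 268, 89, 264], i.e. the messages
alternate between gaps of roughly a minute and a half and roughly four and
a half minutes. The later entries in the excerpt are all about 85 seconds
apart.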
--
Chris Thompson
Email: c...@cam.ac.uk