Hi!
We use Bind with inline-signing as "bump-in-the-wire". We started with
Bind 9.9, used 9.10 (several versions) and recently we switched to
9.11.0-P2.
All of them showed the same 2 problems:
1. Bind gets stuck in a signing loop and consumes memory until it is
killed by the Linux OOM killer.
2. Bind produces broken zones (signatures not updated, invalid
signatures, missing RRSIGs, ...).
Problem 1 was already reported in detail to bind-b...@isc.org but we
never received an answer.
So I will describe the problems in more detail below. It would be great
if you could give us some advice on how we can track this down.
ad 1) Bind endlessly re-signs a zone. In the logs this shows up as
repeated "sending notifies" messages with an ever-increasing SOA serial,
followed by the slaves fetching the zone. Bind itself slaves the zone
from a hidden master, but the zone on the hidden master has not been
updated:
20:38:09 named[3374]: zone klaus-dev.dnssec-signiert.at/IN (signed):
sending notifies (serial 5691271)
20:38:10 named[3374]: client @0x7fe570031500 11.22.34.27#53632
(klaus-dev.dnssec-signiert.at): transfer of
'klaus-dev.dnssec-signiert.at/IN': AXFR started
(serial 5691289)
20:38:10 named[3374]: client @0x7fe570031500 11.22.34.27#53632
(klaus-dev.dnssec-signiert.at): transfer of
'klaus-dev.dnssec-signiert.at/IN': AXFR ended
20:38:10 named[3374]: client @0x7fe5780cb530 11.22.34.29#57629
(klaus-dev.dnssec-signiert.at): transfer of
'klaus-dev.dnssec-signiert.at/IN': AXFR started
(serial 5691302)
20:38:10 named[3374]: client @0x7fe5780cb530 11.22.34.29#57629
(klaus-dev.dnssec-signiert.at): transfer of
'klaus-dev.dnssec-signiert.at/IN': AXFR ended
20:38:14 named[3374]: zone klaus-dev.dnssec-signiert.at/IN (signed):
sending notifies (serial 5691381)
20:38:15 named[3374]: client @0x7fe578496d60 11.22.34.27#36770
(klaus-dev.dnssec-signiert.at): transfer of
'klaus-dev.dnssec-signiert.at/IN': AXFR started
(serial 5691416)
20:38:15 named[3374]: client @0x7fe578496d60 11.22.34.27#36770
(klaus-dev.dnssec-signiert.at): transfer of
'klaus-dev.dnssec-signiert.at/IN': AXFR ended
20:38:15 named[3374]: client @0x7fe570031500 11.22.34.29#45449
(klaus-dev.dnssec-signiert.at): transfer of
'klaus-dev.dnssec-signiert.at/IN': AXFR started
(serial 5691421)
20:38:15 named[3374]: client @0x7fe570031500 11.22.34.29#45449
(klaus-dev.dnssec-signiert.at): transfer of
'klaus-dev.dnssec-signiert.at/IN': AXFR ended
20:38:19 named[3374]: zone klaus-dev.dnssec-signiert.at/IN (signed):
sending notifies (serial 5691509)
While doing this, Bind consumes more and more memory until it is killed
by the OOM killer. After a restart, Bind runs fine again.
On our production server we have this issue every 2 or 3 months. On our
development server we have this issue every second day. The difference
is in the ZSK rollover timings:
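One way to spot such a loop early is to measure how fast the signed serial grows between NOTIFY messages. The following is a minimal sketch, not part of our setup: it assumes each log entry is on a single line (unwrapped, unlike in this mail), starts with an HH:MM:SS timestamp, and that the window does not cross midnight or a serial wrap. The function names are made up for illustration.

```python
import re
from datetime import datetime

# Matches lines like:
#   20:38:09 named[3374]: zone <name>/IN (signed): sending notifies (serial N)
# Assumption: one log entry per line, time-only timestamp at the start.
NOTIFY_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}) .*zone (\S+)/IN \(signed\): "
    r"sending notifies \(serial (\d+)\)"
)

def notify_events(lines):
    """Yield (time, zone, serial) for every signed-zone NOTIFY line."""
    for line in lines:
        m = NOTIFY_RE.match(line)
        if m:
            t = datetime.strptime(m.group(1), "%H:%M:%S")
            yield t, m.group(2), int(m.group(3))

def serial_rates(lines):
    """Return {zone: serial increments per second} between the first and
    last NOTIFY seen for each zone. Zones with only one NOTIFY, or with
    all NOTIFYs in the same second, are left out."""
    first, last = {}, {}
    for t, zone, serial in notify_events(lines):
        first.setdefault(zone, (t, serial))
        last[zone] = (t, serial)
    rates = {}
    for zone, (t0, s0) in first.items():
        t1, s1 = last[zone]
        seconds = (t1 - t0).total_seconds()
        if seconds > 0:
            rates[zone] = (s1 - s0) / seconds
    return rates
```

For the excerpt above (serial 5691271 at 20:38:09, serial 5691509 at 20:38:19) this gives 23.8 serial increments per second for a zone whose hidden master has not changed, so a simple threshold on this rate could be used to raise an alarm before the OOM killer steps in.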
prod: ZSK rollover every 90 days, sig-validity-interval=30days, ~350
zones
dev: ZSK rollover every 2 days, sig-validity-interval=1day, ~10 zones,
dnssec-dnskey-kskonly
On the dev system we have multiple published and active keys, which is
certainly an atypical setup, but nevertheless Bind should not endlessly
re-sign the zone.
ad 2) Before we deploy the signed zone on the public name servers we
verify the zone with validns, dnssec-verify and ldns-verify. When
receiving a NOTIFY from Bind we AXFR the zone and then let the tools
inspect it. About once a month we get a broken zone (reported
identically by all 3 tools). Typical errors are (shown here as reported
by validns):
no corresponding NSEC3 found for ...
NSEC3 mentions RRSIG, but no such record found for ...
NSEC3 without a corresponding record (or empty non-terminal)
bad SHA-256 hash length
broken NSEC3 chain, expected ... but found ...
NSEC3 mentions NSEC3PARAM, but no such record found for
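The verification step described above can be sketched roughly as follows. This is only an illustration under assumptions, not our actual script: the zone is assumed to have been written to a file already (for example from an AXFR), the helper names are hypothetical, and the command lines in verifier_commands() are assumed invocations that should be checked against the man pages of the installed tools.

```python
import subprocess

def run_checks(commands, timeout=300):
    """Run each verifier command and return
    {name: (exit_code, combined stdout/stderr text)}."""
    results = {}
    for name, argv in commands.items():
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,
        )
        results[name] = (proc.returncode, proc.stdout)
    return results

def zone_is_clean(results):
    """A zone passes only if every verifier exited with status 0."""
    return all(code == 0 for code, _ in results.values())

def verifier_commands(origin, zone_file):
    # Assumed invocations; verify the flags against your installed versions.
    return {
        "validns": ["validns", "-z", origin, zone_file],
        "dnssec-verify": ["dnssec-verify", "-o", origin, zone_file],
        "ldns-verify-zone": ["ldns-verify-zone", zone_file],
    }
```

Requiring all tools to agree before a zone is published keeps a bug in a single verifier from letting a broken zone through, and keeping the captured output next to the archived zone makes later comparison easier.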
Sometimes we are lucky and can solve the problem with "rndc sign ..."
or "rndc retransfer ...". Most of the time none of these tricks work,
and even a Bind restart does not help. In such a case we have to stop
Bind, delete the zone file and the journal file, and then start Bind
(causing a fresh incoming AXFR and signing). We have archived
these broken zone files for inspection.
We are willing to spend time debugging these issues (when they happen
again) if you can give us some advice on what we should check when an
error occurs.
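Since the broken files are worth keeping, the "delete" step can be done as a move into an archive directory. A minimal sketch, to be run only while Bind is stopped: the function name and directory layout are hypothetical, and it assumes that the files belonging to a zone are the configured zone file plus siblings named like that file with an extra suffix (for example ".jnl" or ".signed").

```python
import glob
import shutil
import time
from pathlib import Path

def archive_zone_files(zone_dir, zone_file_name, archive_root):
    """Move a zone file and its suffixed siblings into a timestamped
    directory under archive_root and return the archived paths.

    Caveat: the "<name>.*" match is by file name only, so a zone file
    called "example.co" would also catch "example.co.uk".
    """
    zone_dir = Path(zone_dir)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    target = Path(archive_root) / f"{zone_file_name}-{stamp}"
    target.mkdir(parents=True, exist_ok=False)

    # The zone file itself plus everything named "<zone file>.<suffix>".
    candidates = [zone_dir / zone_file_name]
    candidates += sorted(zone_dir.glob(glob.escape(zone_file_name) + ".*"))

    moved = []
    for path in candidates:
        if path.is_file():
            dest = target / path.name
            shutil.move(str(path), str(dest))
            moved.append(dest)
    return moved
```

After the move, starting Bind again triggers the fresh AXFR and signing as before, while the broken state stays available for inspection.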
Thanks
Klaus
_______________________________________________
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users