Laurent Bigonville <bigon+b...@bigon.be> wrote:
>
> Don't take what I said about the internal working of systemd-resolved for
> granted :)
>
> Looking at the log that I initially provided
> (https://github.com/systemd/systemd/issues/8897), it seems to revalidate the
> complete chain.

Yes, you are right, I shouldn't have immediately gone for the full blast
of sarcasm without verifying that systemd-resolved deserves it. So I
looked at the log - details below. (Spoiler: my prejudices have been
confirmed.)

> An idea what should be done to fix this then?

Well, the good options are to fix Facebook (as Mark rightly said) and to
fix systemd-resolved. Alternatively you can add negative trust anchors
for broken domains like Facebook.

OK, logs. After a lot of setup faff we have:

16:24:21 Switching to system DNS server 10.200.0.200.
16:24:23 Cache miss for www.facebook.com IN A
16:24:23 Transaction 41850 for <www.facebook.com IN A> scope dns on */*.
16:24:23 Using DNS server 10.200.0.200 for transaction 41850.
16:24:23 Timeout reached on transaction 41850.

That's a remarkably hair-trigger timeout.

16:24:23 Switching to system DNS server 10.122.17.186.
16:24:23 Transaction 41850 for <www.facebook.com IN A> scope dns on */*.
16:24:23 Processing incoming packet on transaction 41850. (rcode=SUCCESS)
16:24:23 Verified we get a response at feature level UDP+EDNS0+DO from DNS server 10.122.17.186.

OK so we know at this point that systemd-resolved is not designed for
fast validation, because it hasn't sent the queries for the validation
chain yet. A big shame for new code.

16:24:23 Requesting parent SOA to validate transaction 41850 (www.facebook.com, unsigned CNAME/DNAME/DS RRset).
16:24:23 Transaction 60936 for <facebook.com IN SOA> scope dns on */*.

Wat? How does a SOA query help anything? There's no point wasting time
looking for zone cuts before you request DNSKEY and DS records, because
the DNSKEY and DS responses tell you where the zone cuts are as a side
effect. This is just a waste of time.

16:24:23 Requesting DS to validate transaction 41850 (c10r.facebook.com, unsigned SOA/NS RRset).
16:24:23 Transaction 36881 for <c10r.facebook.com IN DS> scope dns on */*.
16:24:23 Requesting DS to validate transaction 41850 (c10r.facebook.com, unsigned SOA/NS RRset).

Twice??

16:24:23 Processing incoming packet on transaction 60936. (rcode=SUCCESS)
16:24:23 Requesting DS to validate transaction 60936 (facebook.com, unsigned SOA/NS RRset).
16:24:23 Transaction 35625 for <facebook.com IN DS> scope dns on */*.
16:24:23 Processing incoming packet on transaction 35625. (rcode=SUCCESS)
16:24:23 Requesting DNSKEY to validate transaction 35625 (com, RRSIG with key tag: 36707).

Then there's a lot of upwards validation faff for com and root zones.

16:24:23 Found verdict for lookup facebook.com IN DS: insecure
16:24:23 Added NODATA cache entry for facebook.com IN DS 105s
16:24:23 Transaction 35625 for <facebook.com IN DS> on scope dns on */* now complete with <success> from network (unsigned).
16:24:23 Transaction 60936 for <facebook.com IN SOA> on scope dns on */* now complete with <success> from network (unsigned).

OK so far.

16:24:24 Timeout reached on transaction 36881.
16:24:24 Retrying transaction 36881.

At this point systemd-resolved should have abandoned transaction 36881:
facebook.com is insecure so the c10r DS is immaterial.

It then spends another 1.5 minutes (!!!) retrying 36881.

If you get a SERVFAIL from one recursive server, it's reasonable to retry
on alternative recursive servers if you have them, but it's almost always
futile to retry against the same server. systemd-resolved needs to give
up way faster.

It seems to be using SERVFAIL as a feature negotiation signal. Weirdly,
it doesn't reduce the LARGE buffer size feature on timeout (which would
make sense) but only after it gets the first SERVFAIL response (which
doesn't make sense). It also tries to make a DS query with DO=0 which is
nonsense.

16:25:52 Transaction 36881 for <c10r.facebook.com IN DS> on scope dns on */* now complete with <attempts-max-reached> from network (unsigned).
16:25:52 Auxiliary DNSSEC RR query failed with attempts-max-reached

Sheesh. At long last!
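
To spell out the two behaviours argued for above (drop an auxiliary DS
query as soon as an ancestor zone is known to be insecure, and move on
to a different recursive server after a SERVFAIL or timeout instead of
hammering the same one), here is a minimal Python sketch. It is not how
systemd-resolved is implemented, and it is only one plausible policy; the
function names, the `send` callback and the result strings are all made
up for illustration.

```python
def is_strictly_below(name, zone):
    """True if `name` is a proper subdomain of `zone` (case-insensitive)."""
    n = name.rstrip(".").lower()
    z = zone.rstrip(".").lower()
    if z == "":                      # the root zone: everything else is below it
        return n != ""
    return n.endswith("." + z)


def ds_query_still_needed(qname, insecure_zones):
    """A DS query is pointless once any ancestor zone is proven insecure."""
    return not any(is_strictly_below(qname, z) for z in insecure_zones)


def resolve_aux_ds(qname, servers, insecure_zones, send):
    """Try each server at most once; re-check the insecure set before each try.

    `send(server, qname)` is a hypothetical callback returning an rcode
    string such as "NOERROR" or "SERVFAIL", or None on timeout.
    `insecure_zones` is a set that other validation work may add to
    while this query is outstanding.
    """
    tried = set()
    while True:
        if not ds_query_still_needed(qname, insecure_zones):
            return "abandoned-insecure"   # nothing below an insecure zone can validate
        remaining = [s for s in servers if s not in tried]
        if not remaining:
            return "failed"               # every server tried once: give up quickly
        server = remaining[0]
        tried.add(server)
        if send(server, qname) == "NOERROR":
            return "done"
        # SERVFAIL or timeout: fall through and try a different server
```

With the two servers from the log and `facebook.com` added to
`insecure_zones` after the first attempt, `resolve_aux_ds` for
`c10r.facebook.com` stops after a single query rather than retrying for
a minute and a half.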

16:25:52 DNSSEC validation failed for question www.facebook.com IN A: failed-auxiliary
16:25:52 Transaction 41850 for <www.facebook.com IN A> on scope dns on */* now complete with <dnssec-failed> from network (unsigned).

WRONG. You already validated it insecure!

Good grief.

Tony.
-- 
f.anthony.n.finch <d...@dotat.at> http://dotat.at/
Shannon: Northerly or northwesterly 3 or 4, backing westerly or southwesterly
4 or 5 in northwest. Moderate. Rain later in northwest. Good, occasionally
moderate later in northwest.