Thank you for dnsmasq. I'm just happy I was able to come up with a
fairly simple way to reproduce the bug.
I applied the change in commit 930428fb970 as a patch to dnsmasq 2.87,
and that fixes the problem for me as well.
Thanks again for all your hard work.
On 10/17/22 15:26, Simon Kelley wrote:
Thank you very much for the information below. Having that saved me a
load of time.
The problem, as ever, is linked lists and
https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=930428fb970f4991e5c2933fd5a5d2504c18a551
fixes things for me.
To preempt the next question, I intend to make a 2.88 release fairly
soon. I'm working through a backlog of patches from before 2.87, and
once they are done in a week or so, 2.88 will go into the release
sausage-grinder.
Cheers,
Simon.
On 16/10/2022 22:25, Christopher J. Madsen wrote:
I tried building dnsmasq 2.87 with a patch that reverts commit
553c4c99, and that does seem to fix the problem.
Using dbus-monitor (thanks, I hadn't been aware of that), I was able
to create 2 dbus-send commands that reproduce the problem without
having to set up a VPN or openresolv:
dbus-send --system --dest=uk.org.thekelleys.dnsmasq \
  /uk/org/thekelleys/dnsmasq uk.org.thekelleys.SetDomainServers \
  array:string:"/example.com/10.3.10.24","/example.com/10.3.10.26","/example.com/10.3.10.25","/example.org/10.3.10.24","/example.org/10.3.10.26","/example.org/10.3.10.25","/lan.example.net/192.168.1.1","/lan.example.net/fd00::1"

dbus-send --system --dest=uk.org.thekelleys.dnsmasq \
  /uk/org/thekelleys/dnsmasq uk.org.thekelleys.SetDomainServers \
  array:string:"/lan.example.net/192.168.1.1","/lan.example.net/fd00::1"
(Yes, I did use example domains when running the commands. It breaks
lookups for those domains, since those nameservers don't exist, but
other domains still work fine.)
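
For anyone who wants to script the reproduction, here is a minimal Python sketch that builds the same two dbus-send argument lists from a domain-to-servers mapping. The function names and the CONNECTED/DISCONNECTED dictionaries are hypothetical helpers made up for illustration; only the bus name, object path, method name, and the "/domain/server" string format are taken from the commands above. Because the argument list is meant to be passed to the program directly rather than through a shell, the double quotes around each entry are not needed.

```python
def domain_server_args(servers_by_domain):
    """Flatten {domain: [server, ...]} into "/domain/server" strings."""
    return [
        f"/{domain}/{server}"
        for domain, servers in servers_by_domain.items()
        for server in servers
    ]


def dbus_send_argv(servers_by_domain):
    """Build a dbus-send argument list that calls SetDomainServers."""
    entries = ",".join(domain_server_args(servers_by_domain))
    return [
        "dbus-send",
        "--system",
        "--dest=uk.org.thekelleys.dnsmasq",
        "/uk/org/thekelleys/dnsmasq",
        "uk.org.thekelleys.SetDomainServers",
        "array:string:" + entries,
    ]


# Hypothetical names for the two states used in the reproduction.
CONNECTED = {
    "example.com": ["10.3.10.24", "10.3.10.26", "10.3.10.25"],
    "example.org": ["10.3.10.24", "10.3.10.26", "10.3.10.25"],
    "lan.example.net": ["192.168.1.1", "fd00::1"],
}
DISCONNECTED = {
    "lan.example.net": ["192.168.1.1", "fd00::1"],
}

if __name__ == "__main__":
    # Print the commands instead of running them; pass each list to
    # subprocess.run() to actually send the messages.
    for state in (CONNECTED, DISCONNECTED):
        print(" ".join(dbus_send_argv(state)))
```

Printing rather than executing keeps the sketch safe to run on a machine where dnsmasq is in use.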
If I start dnsmasq 2.87 and watch the debug log, the first command
just adds the domain-specific nameservers to the global ones, but the
second command sets only domain-specific nameservers and removes the
global ones. The same commands on 2.86 (or the patched 2.87) work fine.
However, if I remove ',"/lan.example.net/fd00::1"' from the end of
each dbus-send command, then I don't see the problem. I'm not sure
if it's the IPv6 address or the number of nameservers, but the
problem only happens when lan.example.net has both IPv4 and IPv6
nameservers.
Hopefully, this will help you track down the issue. Thanks for your
help.
On 10/13/22 09:36, Simon Kelley wrote:
On 10/10/2022 00:21, Christopher J. Madsen wrote:
I have configured dnsmasq and openresolv as described in
https://unix.stackexchange.com/a/575449/2421 so that the DNS
servers provided by the VPN are only used for the domains on that
network.
With dnsmasq 2.86 and openresolv 3.12.0 this was working great, but
I was setting up a new computer the same way and discovered that
DNS lookups broke when I disconnected from the VPN (causing
resolvconf to remove the private DNS servers). I soon realized
that the new machine had gotten dnsmasq 2.87, which I hadn't yet
upgraded to on the old machine (it had dnsmasq 2.86).
The symptom is that all DNS requests (except those for other
machines on my LAN) are refused by dnsmasq:
$ nslookup www.google.com
Server: ::1
Address: ::1#53
** server can't find www.google.com: REFUSED
Restarting dnsmasq fixes the problem until the next time I
disconnect the VPN.
I installed dnsmasq 2.86 on the new machine and the problem went
away. If I put 2.87 back, the problem also comes back. It seems
that something in 2.87 breaks with my setup. BTW, openresolv
3.12.0 uses DBus to add/remove nameservers instead of editing the
dnsmasq config files.
I turned on debug logging. When I connect the VPN, I see this in
the log:
Oct 9 16:40:15 dnsmasq[105349]: setting upstream servers from DBus
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 192.168.1.1#53
Oct 9 16:40:15 dnsmasq[105349]: using nameserver fd...::1#53
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.24#53 for domain example.com
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.26#53 for domain example.com
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.25#53 for domain example.com
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.24#53 for domain example.org
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.26#53 for domain example.org
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 10.3.10.25#53 for domain example.org
Oct 9 16:40:15 dnsmasq[105349]: using nameserver 192.168.1.1#53 for domain lan.example.net
Oct 9 16:40:15 dnsmasq[105349]: using nameserver fd...::1#53 for domain lan.example.net
Oct 9 16:40:15 dnsmasq[105349]: read /etc/hosts - 0 addresses
I have redacted the IPv6 address, but it is exactly the same in all
log entries. I have also redacted the domains. The VPN provides
example.com and example.org, and lan.example.net is my LAN. This
part of the log looks exactly the same in 2.86 and 2.87; only the
timestamps change.
Here is what dnsmasq 2.86 reports when I disconnect the VPN:
Oct 9 16:40:43 dnsmasq[105349]: setting upstream servers from DBus
Oct 9 16:40:43 dnsmasq[105349]: using nameserver 192.168.1.1#53
Oct 9 16:40:43 dnsmasq[105349]: using nameserver fd...::1#53
Oct 9 16:40:43 dnsmasq[105349]: using nameserver 192.168.1.1#53 for domain lan.example.net
Oct 9 16:40:43 dnsmasq[105349]: using nameserver fd...::1#53 for domain lan.example.net
Oct 9 16:40:43 dnsmasq[105349]: read /etc/hosts - 0 addresses
Here is what dnsmasq 2.87 reports when I disconnect the VPN:
Oct 9 16:46:21 dnsmasq[105730]: setting upstream servers from DBus
Oct 9 16:46:21 dnsmasq[105730]: using nameserver 192.168.1.1#53 for domain lan.example.net
Oct 9 16:46:21 dnsmasq[105730]: using nameserver fd...::1#53 for domain lan.example.net
Oct 9 16:46:21 dnsmasq[105730]: read /etc/hosts - 0 addresses
Oct 9 16:46:22 dnsmasq[105730]: query[A] ipv4only.arpa from ::1
Oct 9 16:46:22 dnsmasq[105730]: config error is REFUSED (EDE: not ready)
Notice that 2.87 does not show any "using nameserver" lines that
don't also say "for domain". As a result, I can only look up hosts
under the lan.example.net domain. Everything else is refused.
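
The difference between the two logs can also be checked mechanically. Below is a minimal Python sketch, assuming each log entry sits on a single line (that is, with the email line wrapping undone). It separates "using nameserver" entries into general and domain-specific ones. The regular expressions, the function name, and the LOG_286/LOG_287 lists are illustrative and not part of dnsmasq; the log text itself is copied from the excerpts above.

```python
import re

# A general upstream server: the line ends right after "address#port".
GLOBAL_RE = re.compile(r"using nameserver (\S+)#\d+$")
# A domain-specific server: "address#port for domain name".
DOMAIN_RE = re.compile(r"using nameserver (\S+)#\d+ for domain (\S+)$")


def split_nameservers(log_lines):
    """Return (general_servers, {domain: [servers]}) found in the log."""
    general, per_domain = [], {}
    for line in log_lines:
        line = line.strip()
        match = DOMAIN_RE.search(line)
        if match:
            per_domain.setdefault(match.group(2), []).append(match.group(1))
            continue
        match = GLOBAL_RE.search(line)
        if match:
            general.append(match.group(1))
    return general, per_domain


# Log entries after disconnecting the VPN, as reported above.
LOG_286 = [
    "Oct 9 16:40:43 dnsmasq[105349]: setting upstream servers from DBus",
    "Oct 9 16:40:43 dnsmasq[105349]: using nameserver 192.168.1.1#53",
    "Oct 9 16:40:43 dnsmasq[105349]: using nameserver fd...::1#53",
    "Oct 9 16:40:43 dnsmasq[105349]: using nameserver 192.168.1.1#53 for domain lan.example.net",
    "Oct 9 16:40:43 dnsmasq[105349]: using nameserver fd...::1#53 for domain lan.example.net",
]
LOG_287 = [
    "Oct 9 16:46:21 dnsmasq[105730]: setting upstream servers from DBus",
    "Oct 9 16:46:21 dnsmasq[105730]: using nameserver 192.168.1.1#53 for domain lan.example.net",
    "Oct 9 16:46:21 dnsmasq[105730]: using nameserver fd...::1#53 for domain lan.example.net",
]

if __name__ == "__main__":
    for name, log in (("2.86", LOG_286), ("2.87", LOG_287)):
        general, per_domain = split_nameservers(log)
        status = "ok" if general else "NO general nameservers"
        print(f"dnsmasq {name}: general={general} per_domain={per_domain} ({status})")
```

An empty general list matches the symptom described here: with no general nameservers left, anything outside lan.example.net is refused.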
I don't know how to see the DBus messages that openresolv is
sending to dnsmasq, but I would assume they're the same in both
cases. The only thing that changed is the version of dnsmasq. But
for whatever reason, dnsmasq 2.87 isn't setting up generic
nameservers when the VPN disconnects, but 2.86 is.
I've stared at this for a while, but not found an obvious problem
yet. An obvious commit on 2.87 that should be looked at is
https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=553c4c99cca173e9964d0edbd0676ed96c30f62b
Maybe the massive confusion is not as resolved as we thought. If you
can build a test binary which reverts that change and see if it
fixes things, that would be very useful.
Another useful bit of data would be to see the DBus messages being
sent by openresolv. dbus-monitor should enable you to get that.
Cheers,
Simon.
_______________________________________________
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss