Peter, thanks for that very useful bug report.
Your diagnosis is a little off: what happened is that the new code to
check for mismatched case when an answer comes back from upstream got
into the code path for two identical queries in quick succession.
The compare failed due to the case mismatch, triggered the warning and
suppressed the combination of the two queries.
Stupid bug, really, but it made me think about this situation a bit more
and I came up with a more insidious bug that's been around forever.
Consider queries for example.com and Example.com that arrive close
together and get combined. example.com gets sent upstream. The answer
comes back and gets returned to the requestor for example.com - tick.
It also gets returned, as example.com, to the requestor for
Example.com - not so good if it's doing 0x20 encoding and case
sensitive matching between request and response.
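
To make the failure concrete, here is a minimal Python sketch of the
situation. It is not the dnsmasq code: the function names are
hypothetical and the model is reduced to comparing query names. It
shows a case-sensitive (0x20-style) requestor rejecting a shared
answer, and one plausible remedy, which is to echo each requestor's
own spelling of the name when fanning the answer out.

```python
def client_accepts(sent_name: str, reply_name: str) -> bool:
    """Hypothetical 0x20-checking requestor: case-sensitive match."""
    return sent_name == reply_name


def same_question(a: str, b: str) -> bool:
    """DNS names compare case-insensitively (ASCII names assumed)."""
    return a.lower() == b.lower()


def fan_out_naive(upstream_name: str, waiting: list[str]) -> list[str]:
    """Hand the upstream spelling to every combined requestor."""
    return [upstream_name for _ in waiting]


def fan_out_per_requestor(upstream_name: str, waiting: list[str]) -> list[str]:
    """Hand each combined requestor the spelling it originally sent."""
    replies = []
    for requested in waiting:
        if not same_question(requested, upstream_name):
            raise ValueError("combined queries must be for the same name")
        replies.append(requested)
    return replies


waiting = ["example.com", "Example.com"]

naive = fan_out_naive("example.com", waiting)
print([client_accepts(w, r) for w, r in zip(waiting, naive)])

fixed = fan_out_per_requestor("example.com", waiting)
print([client_accepts(w, r) for w, r in zip(waiting, fixed)])
```

With the naive fan-out the second requestor sees a case mismatch; with
the per-requestor fan-out both see the name exactly as they sent it.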
Fixing that took all day.
Your bug fix is at
https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=e44165c0f77929a4dd56694e1323337d68b624d1
and the second one is at
https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=77c4e95d4a55ef7899ee011ab2640d93194aa1d1
I hope there are better explanations there.
I'm a little uncomfortable releasing number two as "stable" in less
than 24 hours, so I've tagged a second release candidate, and I hope
people will keep bashing on it.
Setting deadlines seems to work.
Cheers,
Simon.
On 2/6/25 03:08, Peter Tirsek wrote:
On Mon, 3 Feb 2025, Simon Kelley wrote:
I did wonder if this might happen.
Can you share with us what the misbehaving server is?
On Mon, 3 Feb 2025, wornandrew via Dnsmasq-discuss wrote:
I don't have much info about the server.
It's on an enterprise intranet. No doubt something ancient.
I just upgraded, saw that warning in the log after about half an hour,
and started digging into it a little further. I tried isolating it to a
certain upstream server, but all of them seemed to exhibit it, and my
primary one definitely isn't ancient (BIND 9.20.5).
After some additional investigating, I discovered that it happens when
dnsmasq is presented with more than one query for the same record in
rapid succession. I didn't look through the source yet, but the problem
can be reproduced by restarting dnsmasq, then issuing two queries for
the same record in rapid succession, such as with:
$ for i in 1 2; do dig www.google.com A @10.0.0.1 & done
(10.0.0.1 is my router running dnsmasq, of course).
Here's a sample tcpdump of the WAN side of the router running dnsmasq,
when configured to use 8.8.8.8 as the upstream server, and presented
with two such rapid queries:
16:25:45.957031 IP xxx.xxx.xxx.xxx.41668 > 8.8.8.8.53: 10400+ [1au] A? WwW.GooGLE.CoM. (55)
16:25:45.957267 IP xxx.xxx.xxx.xxx.53410 > 8.8.8.8.53: 56858+ [1au] A? www.GooglE.cOM. (55)
16:25:46.008817 IP 8.8.8.8.53 > xxx.xxx.xxx.xxx.41668: 10400 1/0/1 A 142.250.190.68 (59)
16:25:46.022715 IP 8.8.8.8.53 > xxx.xxx.xxx.xxx.53410: 56858 1/0/1 A 142.250.190.68 (59)
It does not appear to make any difference whether the first or second
query is answered first. Perhaps dnsmasq only stores the one random
capitalization, and when the "wrong" response comes in, this is seen as
a problem and the warning is emitted.
If that's the case, a few potential ways of solving this could be:
1) Never issue another query to the upstream server while a previous
query is still in flight, and when the answer comes in, respond to
both downstream queries
2) Keep track of randomization per outgoing request ID instead of only
per name and record type
3) Reuse the same randomization pattern for subsequent upstream
requests while another is still in flight.
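
As an illustration of option 2 only, here is a rough Python sketch
that remembers the randomized spelling per outgoing query ID rather
than per name and record type. The class and method names are made up
for this example and it does not reflect how dnsmasq stores its
forwarding state; it is just meant to show that two in-flight queries
for the same name can each be checked against their own
capitalization.

```python
import random


def randomize_case(name: str, rng: random.Random) -> str:
    """0x20-style encoding: pick upper or lower case per character."""
    return "".join(rng.choice((c.lower(), c.upper())) for c in name)


class InFlightTable:
    """Hypothetical table mapping outgoing query ID to the name as sent."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.sent = {}

    def send(self, query_id: int, name: str) -> str:
        wire_name = randomize_case(name, self.rng)
        self.sent[query_id] = wire_name
        return wire_name

    def accept(self, query_id: int, reply_name: str) -> bool:
        """Accept a reply only if its case matches what this ID sent."""
        expected = self.sent.get(query_id)
        if expected is None or expected != reply_name:
            return False
        del self.sent[query_id]
        return True


table = InFlightTable()
first = table.send(10400, "www.google.com")
second = table.send(56858, "www.google.com")

# Replies may arrive in either order; each is checked against its own ID.
print(table.accept(56858, second))
print(table.accept(10400, first))
```

Because the lookup key is the query ID, the order in which the two
answers arrive does not matter, which matches the observation above
that the warning appeared regardless of which query was answered
first.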
Finally, this may not be the same problem that wornandrew experienced. I
don't have an overall problem running this version of dnsmasq. It didn't
even seem to ignore the answer that triggered the warning in the log. At
least the client received a reply to both questions, so maybe dnsmasq
answered from the cache after ignoring the second answer, and what I'm
seeing is purely a cosmetic problem?
Either way, I thought I'd add my 2 cents' worth before this ends up in a
lot of people's logs and causes a lot of confusion.