Peter, thanks for that very useful bug report.

Your diagnosis is a little off. What happened is that the new code, which checks for mismatched case when an answer comes back from upstream, got into the code path for two identical queries in quick succession.

The compare failed due to the case mismatch, triggered the warning, and suppressed the combination of the two queries.

Stupid bug, really, but it made me think about this situation a bit more, and I came up with a more insidious bug that's been around forever.

Consider queries for example.com and Example.com that arrive close together and get combined. example.com gets sent upstream. The answer comes back and gets returned to the requestor for example.com - tick. It also gets returned, as example.com, to the requestor for Example.com - not so good if it's doing 0x20 encoding and case-sensitive matching between request and response.
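
To make the combined-query case concrete, here is a minimal sketch in Python. It is not dnsmasq's code (dnsmasq is written in C, and the real fix is in the commits linked in this message); the names Pending, combine and deliver are hypothetical and exist only for illustration. The idea it shows is that each combined requestor's original spelling is remembered, so every reply can carry the case that requestor asked with.

```python
# Hypothetical sketch, not dnsmasq code: all names here are illustrative.
from dataclasses import dataclass, field


@dataclass
class Pending:
    """One upstream query in flight, shared by every combined requestor."""
    sent_name: str                               # spelling sent upstream
    waiters: list = field(default_factory=list)  # (requestor, name as asked)


def combine(pending: dict, requestor: str, qname: str) -> bool:
    """Attach a query to an in-flight one if possible.

    Returns True when a new upstream query is needed. The lookup key is
    lower-cased because DNS names compare case-insensitively.
    """
    key = qname.lower()
    is_new = key not in pending
    if is_new:
        pending[key] = Pending(sent_name=qname)
    # Remember the exact case this requestor used.
    pending[key].waiters.append((requestor, qname))
    return is_new


def deliver(pending: dict, answer_name: str, rdata: str) -> list:
    """Build one reply per waiter, restoring the case each one asked with."""
    entry = pending.pop(answer_name.lower())
    return [(who, asked, rdata) for who, asked in entry.waiters]
```

With this bookkeeping, the requestor that asked for Example.com gets its answer labelled Example.com even though only example.com went upstream, so a client doing case-sensitive matching would accept it.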

Fixing that took all day.

Your bug fix is at

https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=e44165c0f77929a4dd56694e1323337d68b624d1

and the second one is at

https://thekelleys.org.uk/gitweb/?p=dnsmasq.git;a=commit;h=77c4e95d4a55ef7899ee011ab2640d93194aa1d1

I hope there are better explanations there.

Number two is a little uncomfortable to be releasing as "stable" in less than 24 hours, so I've tagged a second release candidate, and I hope people will keep bashing on it.

Setting deadlines seems to work.

Cheers,

Simon.



On 2/6/25 03:08, Peter Tirsek wrote:
On Mon, 3 Feb 2025, Simon Kelley wrote:

I did wonder if this might happen.
Can you share with us what the misbehaving server is?

On Mon, 3 Feb 2025, wornandrew via Dnsmasq-discuss wrote:

I don't have much info about the server.
It's on an enterprise intranet. No doubt something ancient.

I just upgraded, saw that warning in the log after about half an hour, and started digging into it a little further. I tried isolating it to a certain upstream server, but all of them seemed to exhibit it, and my primary one definitely isn't ancient (BIND 9.20.5).

After some additional investigating, I discovered that it happens when dnsmasq is presented with more than one query for the same record in rapid succession. I didn't look through the source yet, but the problem can be reproduced by restarting dnsmasq, then issuing two queries for the same record in rapid succession, such as with:

$ for i in 1 2; do dig www.google.com A @10.0.0.1 & done

(10.0.0.1 is my router running dnsmasq, of course).


Here's a sample tcpdump of the WAN side of the router running dnsmasq, when configured to use 8.8.8.8 as the upstream server, and presented with two such rapid queries:

16:25:45.957031 IP xxx.xxx.xxx.xxx.41668 > 8.8.8.8.53: 10400+ [1au] A? WwW.GooGLE.CoM. (55)
16:25:45.957267 IP xxx.xxx.xxx.xxx.53410 > 8.8.8.8.53: 56858+ [1au] A? www.GooglE.cOM. (55)
16:25:46.008817 IP 8.8.8.8.53 > xxx.xxx.xxx.xxx.41668: 10400 1/0/1 A 142.250.190.68 (59)
16:25:46.022715 IP 8.8.8.8.53 > xxx.xxx.xxx.xxx.53410: 56858 1/0/1 A 142.250.190.68 (59)

It does not appear to make any difference whether the first or second query is answered first. Perhaps dnsmasq only stores the one random capitalization, and when the "wrong" response comes in, this is seen as a problem and the warning is emitted.


If that's the case, a few potential ways of solving this could be:

1) Never issue another query to the upstream server while a previous
    query is still in flight, and when the answer comes in, respond to
    both downstream queries

2) Keep track of randomization per outgoing request ID instead of only
    per name and record type

3) Reuse the same randomization pattern for subsequent upstream
    requests while another is still in flight.
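
As an illustration of option 2 above, here is a rough Python sketch of keeping the random capitalization per outgoing request ID. It only illustrates the suggestion; it does not describe what dnsmasq actually does, and the names scramble_case and InFlight are made up for the example.

```python
# Hypothetical sketch of option 2, not dnsmasq code.
import random


def scramble_case(name, rng):
    """Randomize the case of each letter (0x20 encoding); dots are unchanged."""
    return "".join(c.upper() if rng.random() < 0.5 else c.lower() for c in name)


class InFlight:
    """Remember the exact spelling sent upstream, keyed by query ID."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.by_id = {}

    def send(self, query_id, name):
        """Record and return the scrambled name used for this query ID."""
        sent = scramble_case(name, self.rng)
        self.by_id[query_id] = sent
        return sent

    def accept(self, query_id, answer_name):
        """Accept an answer only if its case matches what this ID sent."""
        expected = self.by_id.pop(query_id, None)
        return expected is not None and expected == answer_name
```

Because each ID carries its own pattern, two queries for the same name can be in flight at once, and their answers can arrive in either order without one being checked against the other's capitalization.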


Finally, this may not be the same problem that wornandrew experienced. I don't have an overall problem running this version of dnsmasq. It didn't even seem to ignore the answer that triggered the warning in the log. At least the client received a reply to both questions, so maybe dnsmasq answered from the cache after ignoring the second answer, and what I'm seeing is purely a cosmetic problem?

Either way, I thought I'd add my 2 cents worth before this ends up in a lot of people's logs and causing a lot of confusion.



_______________________________________________
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss
