Re: [Dnsmasq-discuss] dnsmasq 2.86 seems to stop reading from one of its dns sockets after a period of time under load

Simon Kelley Mon, 16 May 2022 11:42:54 -0700

What value did you use?

On my Ubuntu desktop, /proc/sys/net/core/wmem_default and wmem_max are both 212992, which is a fair few DNS replies.
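
To compare values between machines, the two sysctls and the send buffer a new UDP socket actually gets can be read with something like the sketch below. It assumes Linux and uses Python only for brevity; `read_sysctl_int` and `udp_send_buffer_size` are hypothetical helper names, not part of dnsmasq.

```python
import socket
from pathlib import Path


def read_sysctl_int(path):
    """Return the integer stored in a /proc/sys file, or None if unavailable."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        # File missing (non-Linux), unreadable, or not a plain integer.
        return None


def udp_send_buffer_size():
    """Create a throwaway UDP socket and return its SO_SNDBUF value in bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


if __name__ == "__main__":
    print("wmem_default:", read_sysctl_int("/proc/sys/net/core/wmem_default"))
    print("wmem_max:", read_sysctl_int("/proc/sys/net/core/wmem_max"))
    print("SO_SNDBUF of a new UDP socket:", udp_send_buffer_size())
```

On a Linux system that has not set SO_SNDBUF explicitly, the last value would normally be expected to match wmem_default.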



Simon.


On 16/05/2022 18:34, Tom Keddie wrote:

Hi Simon,
Thanks for your response. I don't have the detailed logs, but it's a noisy QA wireless environment where clients are coming and going a lot. E.g. in syslog I could see instances where we would get a DHCP request and then an L2 wireless disassociate message would appear immediately afterwards; that response isn't going to be deliverable as unicast (although for DHCP it might fall back to broadcast eventually).
As we know, DNS isn't logged in such a manner, but you could see the same scenario unfolding where we get a bunch of DNS requests, the client drops off immediately afterwards and the responses can't be delivered. When there are a lot of requests or a lot of clients you can see how the socket buffer would fill.
Increasing the socket buffers as I described below allowed the test to run for the required 96 hours; without it we weren't making it past the 48 hour mark.
A dynamic solution might work provided it was carefully bounded to prevent DoS. If you have something you'd like us to test I can probably arrange a time slot; it's a busy setup that needs lots of hardware though.
Thanks,
Tom Keddie
ps. This is a controlled environment (as much as you can control wifi); there are no malicious actors nor intent in this scenario. It's a soak test with a large variety of clients all doing busy work like video streaming etc.
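
The dynamic approach discussed in this thread (grow the send buffer when a send returns EAGAIN, with a hard bound so it cannot be driven up without limit) could look roughly like the sketch below. This is not dnsmasq code and is in Python rather than dnsmasq's C; `MAX_SNDBUF`, `next_sndbuf_size` and `send_with_growth` are hypothetical names, and the 4 MiB cap and the doubling policy are arbitrary assumptions.

```python
import socket

# Hypothetical hard cap so repeated EAGAINs cannot grow the buffer unboundedly.
MAX_SNDBUF = 4 * 1024 * 1024


def next_sndbuf_size(current, cap=MAX_SNDBUF):
    """Return the next size to request (double, clamped to cap), or None at the cap."""
    if current >= cap:
        return None
    return min(current * 2, cap)


def send_with_growth(sock, data, addr, cap=MAX_SNDBUF):
    """Send one datagram on a non-blocking UDP socket.

    If the send buffer is full (EAGAIN, raised as BlockingIOError), request a
    larger SO_SNDBUF once and retry. Returns True if the datagram was handed
    to the kernel, False if it had to be dropped.
    """
    try:
        sock.sendto(data, addr)
        return True
    except BlockingIOError:
        pass

    current = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    wanted = next_sndbuf_size(current, cap)
    if wanted is None:
        return False  # already at the cap: drop rather than grow further

    # On Linux the kernel doubles the requested value for bookkeeping overhead
    # and, for unprivileged processes, limits the request to wmem_max, so the
    # size actually obtained may differ from 'wanted'.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, wanted)
    try:
        sock.sendto(data, addr)
        return True
    except BlockingIOError:
        return False
```

Growing the buffer only makes room for more queued datagrams; if the queue is full of replies that can never be delivered, the retry can still fail, which is why the function reports a drop instead of looping.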
On Fri, May 13, 2022 at 12:48 PM Simon Kelley <si...@thekelleys.org.uk> wrote:
    On 10/05/2022 16:40, Tom Keddie via Dnsmasq-discuss wrote:
     > Hi All,
     >
     >     I think you're saying that it's not surprising that dnsmasq is not
     >     reading from the socket because the send queue is also full.
     >
     >
     > As per this thread on netdev
     > (https://lore.kernel.org/netdev/cabuuw65r3or9hehsmt_isvx1f-7b6ecppdr+bnr6f6wbkpn...@mail.gmail.com/)
     > it seems we were consuming the socket send buffer with pending packets
     > waiting for ARP responses that were never coming.  This was causing
     > failures sending to devices that were still live.
     >
     > As per that thread we increased the /proc/sys/net/core/wmem_default
     > value so all sockets will have larger send buffers (the device has very
     > few sockets in use). It might be useful to add dnsmasq config options to
     > increase SO_SNDBUF on the dhcp and dns sockets to allow more granular
     > control.
     >
     > Thanks, Tom Keddie

    So queries are being received, and answered, but the reply is being
    dropped by the kernel because the send queue is full of replies to dead
    hosts? If the hosts are dead, where are the queries coming from to
    generate these blocked replies?

    It might be sensible to automatically increase the send queue length
    when a packet send gets EAGAIN, at least the first time, but I'd like to
    understand exactly what's going on first.


    Simon.

     >
     > _______________________________________________
     > Dnsmasq-discuss mailing list
     > Dnsmasq-discuss@lists.thekelleys.org.uk
     > https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss

