On Thu, Sep 05, 2024 at 10:00:42AM +0300, Danil Smirnov wrote:
> Hi,
> So I managed to isolate and reproduce the issue quite reliably.
> Every day at exactly 06:10 UTC,
> my dnsmasq container stops responding.


> During the event, I can successfully query my external DNS
> servers but not dnsmasq:
> $ dig domain.tld @
> ; <<>> DiG 9.16.23-RH <<>> domain.tld @
> ;; global options: +cmd
> ;; connection timed out; no servers could be reached
> I see hundreds of errors like this in the system log:
> Sep 05 06:10:58 mm4.lax.icann.org dockerd[1150]:
> time="2024-09-05T06:10:58.464185887Z" level=error msg="[resolver] failed to
> query external DNS server" client-addr="udp:"
> dns-server="udp:" error="read udp>
> i/o timeout" question=";_dmarc.domain.tld.\tIN\t TXT"
> However, there is nothing suspicious in the /var/log/messages and
> /var/log/cron that might explain what happened.
> Before the container restarted at 06:15, I tried to collect stats via the
> "kill --signal=USR1" command but the stats weren't posted in the logs -
> obviously, dnsmasq was so stuck it couldn't even process the signal.
> (However, I don't think stats would be helpful since the time of the event
> doesn't change even if I restart dnsmasq in between 6:10 events.)
> Resource-wise, there was an increase in memory consumption by dnsmasq when
> the issue started and then a spike in the middle of it (the time shown is 3
> hours later than UTC):
> [image: Screenshot 2024-09-05 at 09.42.38.png]
> https://smirnov.la/Screenshot%202024-09-05%20at%2009.42.38.png
> I'm using these params 
> https://github.com/dockur/dnsmasq/blob/master/entry.sh#L14
> plus "fast-dns-retry". Also tried adding "no-negcache" and "all-servers" but
> it didn't fix the issue.
> Any idea where to continue the investigation?

Check what is being triggered on the network at 9:10 (06:10 UTC).
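
One way to narrow that down is to probe dnsmasq and an upstream server side
by side around the event and compare the timestamps with whatever else runs
at that minute. Below is a minimal Python sketch under stated assumptions:
the function names (build_query, probe, watch) are made up for illustration,
the addresses you pass in are your own, and it only sends plain A queries
over UDP. It is a measuring aid, not a diagnosis.

```python
import socket
import struct
import time
from datetime import datetime, timezone


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet (RFC 1035), recursion desired."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE as given (1 = A), QCLASS 1 = IN
    return header + qname + struct.pack("!HH", qtype, 1)


def probe(server: str, name: str, timeout: float = 2.0, port: int = 53):
    """Send one query; return the round-trip time in seconds, or None."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(packet, (server, port))
            data, _ = sock.recvfrom(4096)
        except OSError:
            # Covers timeouts as well as ICMP "port unreachable" errors
            return None
        if data[:2] != packet[:2]:
            # Reply ID does not match the query ID; treat as no answer
            return None
        return time.monotonic() - start


def watch(servers, name="domain.tld", interval=1.0, rounds=600):
    """Probe every server once per interval and print a UTC-stamped line."""
    for _ in range(rounds):
        now = datetime.now(timezone.utc).isoformat(timespec="seconds")
        for server in servers:
            rtt = probe(server, name)
            status = "timeout" if rtt is None else f"{rtt * 1000:.1f} ms"
            print(f"{now} {server} {status}", flush=True)
        time.sleep(interval)
```

Start it a few minutes before 06:10 UTC with something like
watch(["<dnsmasq address>", "<upstream address>"]), filling in your own
addresses. If only the dnsmasq column goes to "timeout", the first failing
timestamp tells you which second to look at in the other logs.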

Geert Stappers
Silence is hard to parse

Dnsmasq-discuss mailing list