I suspect if you tcpdump/wireshark the DNS traffic, you'll find a query goes out, and either the response is delayed by 2 seconds, or no response is received and your client re-sends the request.
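You can also confirm from inside the process that the time really is spent in the lookup. Since you're already using http-trace, here's a minimal sketch that logs only the slow lookups, so you can correlate the spikes with what you see on the wire; the 500ms threshold and the URL are placeholders, not recommendations:

```go
// Sketch: log any DNS lookup slower than a threshold via net/http/httptrace.
// The 500ms threshold and example.com URL are illustrative only.
package main

import (
	"context"
	"log"
	"net/http"
	"net/http/httptrace"
	"time"
)

func main() {
	var (
		dnsStart time.Time
		dnsHost  string
	)

	trace := &httptrace.ClientTrace{
		DNSStart: func(info httptrace.DNSStartInfo) {
			dnsStart, dnsHost = time.Now(), info.Host
		},
		DNSDone: func(info httptrace.DNSDoneInfo) {
			// Only report lookups over the (illustrative) threshold.
			if d := time.Since(dnsStart); d > 500*time.Millisecond {
				log.Printf("slow DNS lookup for %q: %v (coalesced=%v err=%v addrs=%v)",
					dnsHost, d, info.Coalesced, info.Err, info.Addrs)
			}
		},
	}

	ctx := httptrace.WithClientTrace(context.Background(), trace)
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "https://example.com/", nil)
	if err != nil {
		log.Fatal(err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	resp.Body.Close()
}
```

If the duration logged there matches what tcpdump shows on the wire, the delay is in the network or the upstream DNS rather than anything inside the Go resolver.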
To understand this, inside your pod you'll need to find out what your upstream DNS recursive server is. That might be a matter of `cat /etc/resolv.conf`, but if resolution goes through systemd it could be `resolvectl status` or something similar. Then you need to work out what's going on upstream.

Note that a 2 second delay a few times per day for DNS resolution is not unusual, and there are lots of possible reasons. It could be as simple as some packet loss between your k8s node and your DNS recursor (DNS is usually sent over UDP, and UDP does not guarantee delivery). Just one lost packet can cause a 1-2 second delay, depending on the client's retransmission policy (see the Go-side sketch at the end of this message for one way to cap that per lookup).

A more likely explanation, though, is this: the record has expired from the recursor's cache. When the recursor next gets a query for that name, it needs to locate the upstream authoritative DNS servers for the domain. If the one it picks first is down, it will time out and retry against a different one. It also needs to resolve the *names* of those authoritative servers (from the NS records) into addresses, and if those have expired too, that adds further delay. Several seconds for all of this is quite common. This is just life: many DNS domains are broken in this way, because people don't know how to delegate properly or run their authoritative nameservers properly. If you tell us the actual domain you're querying, maybe we can identify the problem with it - but you'll have to get the domain owner to fix it.

As a sticking-plaster over the problem: if you run your own DNS recursor with suitable software, you can get it to refresh the record *before* it expires. In powerdns-recursor this is controlled by refresh-on-ttl-perc <https://docs.powerdns.com/recursor/settings.html#refresh-on-ttl-perc>; BIND calls it "prefetch" <https://kb.isc.org/docs/aa-01122>. (Other nameserver software may or may not have this feature.)

At the end of the day though, DNS issues are not related to the Go programming language.

On Friday, 9 May 2025 at 17:40:57 UTC+1 Cipov Peter wrote:

> Hello Community
>
> I have question regarding native golang DNS lookup as my app is compiled
> statically (CGO_ENABLED=0). For some reason this solution behaves
> unpredictable, having sometimes (few times a day) dns lookup >2s. I am
> using http-trace to get this number. I have tried to look into code whether
> there is possibility to drill-down to these 2s (no luck right now). My
> usecase is making quick http call to integration (optimal request total
> time < 500ms).
>
> using
> GODEBUG="netdns=2"
> CGO_ENABLED=0
> GOOS=linux
> GOARCH=amd64
>
> running in docker debian bookworm as k8s pod
>
> logs:
> go package net: confVal.netCgo = false netGo = false
> go package net: cgo resolver not supported; using Go's DNS resolver
>
> I have checked the source code I have not seen much tracing information
> into why sometimes dns spikes occurs. Did I missed some option to get
> insights why dns lookup takes so long ? I cannot distinguish whether it is
> waiting for network call or some internal timeout.
>
> Thank you
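P.S. On the retransmission point above: if the spikes turn out to be occasional UDP packet loss rather than a slow authoritative chain, you can at least cap what one lost packet costs an individual request on the Go side. This is only a sketch under assumptions (illustrative timeouts, placeholder hostname, Go's built-in resolver as in your CGO_ENABLED=0 setup) and it does not fix the underlying DNS problem:

```go
// Sketch: bound a single lookup with a short context deadline and retry once,
// so a lost UDP packet costs roughly 300ms instead of the resolver's default
// retransmission wait. The timeouts and hostname are illustrative only.
package main

import (
	"context"
	"log"
	"net"
	"time"
)

func lookupWithRetry(host string) ([]net.IPAddr, error) {
	r := &net.Resolver{PreferGo: true} // Go's built-in resolver

	var lastErr error
	for attempt := 0; attempt < 2; attempt++ {
		ctx, cancel := context.WithTimeout(context.Background(), 300*time.Millisecond)
		addrs, err := r.LookupIPAddr(ctx, host)
		cancel()
		if err == nil {
			return addrs, nil
		}
		lastErr = err
	}
	return nil, lastErr
}

func main() {
	addrs, err := lookupWithRetry("example.com") // placeholder name
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("resolved: %v", addrs)
}
```

Whether that's worth doing depends on how tight your 500ms budget really is; fixing the recursor or the domain's delegation is the proper cure.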