Hi Alexander,
thanks for your suggestion.
Alexander Leidinger <alexan...@leidinger.net> writes:
Quoting Mathias Picker <mathias.pic...@virtual-earth.de> (from Tue, 10
Jan 2023 06:51:06 +0100):
Hi all,
I’m testing a few Linux triplestores in a Linux jail, and used 13.1,
which worked fine most of the time.
Now one of the stores shows dropped connections with many clients, and
as I can see lots of netlink errors in the logs, I thought I’d try
-CURRENT.
Sadly, my Linux jail (Ubuntu 16.04.7) now shows an irritating
behaviour: some programs seem to hang indefinitely waiting for name
resolution:
Inside the jail:
Working version with ping
[example]
Non-working with wget (same for curl and others)
[example]
So, this tcpdump looks pretty much as if both got answers from unbound.
Why is wget (and host, and curl, and sudo) not “getting” this answer?
Any ideas where to look or questions about my setup welcome!
Current has netlink support, 13.1 doesn't. Current may also have
changes in the linuxulator which aren't in 13.1. You need to debug the
kernel path. Possible tools to do so are ktrace and dtrace.
The simplest command line would be ktrace, whereas dtrace gives more
flexibility in what you do and how you look at it. As a first step I
would recommend ktrace.
Not sure if it will work as I want it to work...
ktrace -di jexec "ID or name of jail" ping google.de
After you have seen the answer with tcpdump, you can kill ktrace/ping
(or wait for a timeout, but this will increase the amount of data
traced) and inspect the result via "kdump" (this will take the file
"ktrace.out" in the current directory and print out the data).
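A minimal sketch for narrowing down kdump output once it exists,
assuming the text layout that appears in this thread (pid, command,
record type, details). The function name failed_syscalls and the
regular expression are illustrative only; they are not part of ktrace
or kdump, and real kdump output may need a looser pattern.

```python
import re

# Matches kdump "RET" records that report a failed system call, e.g.
#   32282 wget RET linux_recvmsg -1 errno -4 Interrupted system call
# The layout is an assumption based on the output quoted in this thread.
RET_ERR = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<comm>\S+)\s+RET\s+(?P<syscall>\S+)"
    r"\s+-1\s+errno\s+(?P<errno>-?\d+)\s*(?P<msg>.*)$"
)


def failed_syscalls(kdump_text):
    """Return (pid, command, syscall, errno, message) for each failed RET."""
    failures = []
    for line in kdump_text.splitlines():
        m = RET_ERR.match(line)
        if m:
            failures.append(
                (int(m["pid"]), m["comm"], m["syscall"],
                 int(m["errno"]), m["msg"].strip())
            )
    return failures


# Hypothetical sample input in the same layout as the kdump output.
sample = """\
32282 wget CALL linux_recvmsg(0x3,0x7fffffffad70,0)
32282 wget RET linux_recvmsg -1 errno -4 Interrupted system call
32282 wget RET linux_bind 0
"""

for record in failed_syscalls(sample):
    print(record)
```

Only RET records that carry "-1 errno" are reported, so successful
calls such as the linux_bind line in the sample are skipped.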
The trace (taken with wget) ends with:
32282 wget CALL linux_socket(0x10,0x3,0)
32282 wget RET linux_socket 3
32282 wget CALL linux_bind(0x3,0x7fffffffad20,0xc)
32282 wget STRU struct sockaddr { AF_NETLINK, unknown address family }
32282 wget RET linux_bind 0
32282 wget CALL linux_getsockname(0x3,0x7fffffffad20,0x7fffffffad1c)
32282 wget STRU struct sockaddr { AF_NETLINK, unknown address family }
32282 wget RET linux_getsockname 0
32282 wget CALL linux_sendto(0x3,0x7fffffffad50,0x14,0,0x7fffffffad30,0xc)
32282 wget GIO fd 3 wrote 20 bytes
    0x0000 1400 0000 1600 0103 f324 |.........$|
    0x000a bd63 0000 0000 0000 0000 |.c........|
32282 wget RET linux_sendto 20/0x14
32282 wget CALL linux_recvmsg(0x3,0x7fffffffad70,0)
32282 wget GIO fd 3 read 40 bytes
    0x0000 2800 0000 0200 0000 f324 |(........$|
    0x000a bd63 1a7e 0000 eaff ffff |.c.~......|
    0x0014 1400 0000 1600 0103 f324 |.........$|
    0x001e bd63 1a7e 0000 0000 0000 |.c.~......|
32282 wget STRU struct sockaddr { AF_NETLINK, unknown address family }
32282 wget RET linux_recvmsg 40/0x28
32282 wget CALL linux_recvmsg(0x3,0x7fffffffad70,0)
32282 wget RET linux_recvmsg -1 errno -4 Interrupted system call
32282 wget PSIG SIGINT SIG_DFL code=SI_KERNEL
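The two GIO dumps can be read as netlink messages. The sketch below
decodes them, assuming a little-endian machine (e.g. amd64) and the
standard Linux struct nlmsghdr layout; the helper names parse_kdump_hex
and decode are made up for illustration. Under those assumptions the
request looks like an RTM_GETADDR dump and the reply like an
NLMSG_ERROR carrying -22 (EINVAL). This only decodes the bytes; it does
not by itself explain the hang.

```python
import struct

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid.
# Netlink uses host byte order; little-endian is ASSUMED here.
NLMSGHDR = struct.Struct("<IHHII")

# Constants from the Linux netlink / rtnetlink headers.
NLMSG_ERROR = 2
RTM_GETADDR = 22
NLM_F_REQUEST = 0x001
NLM_F_DUMP = 0x300  # NLM_F_ROOT | NLM_F_MATCH


def parse_kdump_hex(dump):
    """Turn kdump-style hex groups ("1400 0000 ...") into bytes."""
    return bytes.fromhex("".join(dump.split()))


def decode(msg):
    """Decode the first netlink header in msg (illustrative helper)."""
    length, mtype, flags, seq, pid = NLMSGHDR.unpack_from(msg, 0)
    info = {"len": length, "type": mtype, "flags": flags,
            "seq": seq, "pid": pid}
    if mtype == NLMSG_ERROR:
        # struct nlmsgerr begins with a signed 32-bit negated errno.
        (info["error"],) = struct.unpack_from("<i", msg, NLMSGHDR.size)
    return info


# Bytes copied from the "wrote 20 bytes" and "read 40 bytes" dumps.
request = parse_kdump_hex(
    "1400 0000 1600 0103 f324 bd63 0000 0000 0000 0000"
)
reply = parse_kdump_hex(
    "2800 0000 0200 0000 f324 bd63 1a7e 0000 eaff ffff "
    "1400 0000 1600 0103 f324 bd63 1a7e 0000 0000 0000"
)

print(decode(request))
print(decode(reply))
```

The pid field of the reply decodes to 32282, which matches the wget
process id in the trace.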
Sadly, I have to get the benchmarks up and running, so I will
install Linux on the machine and cannot follow up on this.
Maybe I’ll try this again next week on another server, since
installing -CURRENT in another boot environment was so easy.
Thanks,
Mathias
If this works (I'm not sure whether ktrace follows the process into
the jail), you will see the calls to jexec and the exec of ping, and
what they all do in the kernel. This will then give a hint where to
look next.
If this doesn't work, you can use "ktrace -di -p <pid of ping>" from
the jail host while ping is running. If ping tries to redo the DNS
lookup, or a second nameserver is configured and it tries to get the
info from the second one after a timeout, you may be lucky enough to
catch that in the trace.
Bye,
Alexander.
--
Mathias Picker
Geschäftsführer
mathias.pic...@virtual-earth.de
virtual earth Gesellschaft für Wissens re/prä sentation mbH
http://www.virtual-earth.de/ HRB126870
supp...@virtual-earth.de Westendstr. 142
089 / 1250 3943