On Wednesday 16 September 2020 at 22:08:24+0200, Pierre-Elliott Bécue wrote:
> On Friday 11 September 2020 at 11:12:25+0200, Iñaki Malerba wrote:
> > Hi Pebs,
> >
> > Thanks for checking this.
> >
> > On Sat, 5 Sep 2020 23:23:30 +0200 Pierre-Elliott Bécue
> > <p...@debian.org> wrote:
> > >
> > > LXC's devs told me that 4.0.4 should solve it. I'm uploading this
> > > release now. Please don't hesitate to tell me if it helps.
> >
> > Run a pipeline removing the pinning of lxc, and the behaviour seems to
> > be the same.
> >
> > Image building:
> > https://salsa.debian.org/ina/pipeline/-/jobs/990332
> > > Setting up lxc (1:4.0.4-1) ..
> >
> > Running lxc:
> > https://salsa.debian.org/ina/pipeline/-/jobs/990352
> > > <VirtSubproc>: failure: ['sudo', 'timeout', '600', 'lxc-stop',
> > > '--quiet', '--kill', '--name', 'ci-254-b2fcad5f'] failed (exit status 1,
> > > stderr '')
> >
> > Please let me know if you want us to test something else.
> >
> > Abrazos,
>
> Could you get me a full trace like the previous time? I have no
> technical means of running proper tests currently, sorry. :/
>
> Cheers!

I found a way to run tests on my own.

I tried adding a

  lxc-attach autopkgtest-stable-amd64 -- ps auxf

to see the process tree in case I could find something useful, and… the
container successfully stopped that time. I retried and it kept working.

The process tree I see is:

─( 23:09:35 )─< /home/becue/tmp >───────────────────────────────────────────────[ 0 ]─
root@dawaj # docker run --rm --privileged -i autopkgtest
Starting LXC network bridge: :Starting LXC autoboot containers: :USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root         4  0.0  0.0   7644  2760 ?  R   21:09  0:00 ps auxf
root         1  0.0  0.0  20904  7492 ?  Ds  21:09  0:00 /sbin/init
ok

After some more tests, it seems that lxc-start && lxc-stop isn't working
properly because the signal is sent before the container is ready to
handle it.

After this test I decided to add a sleep 2 before the
lxc-attach ... -- ps command:

Starting LXC network bridge: :Starting LXC autoboot containers: :USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root        52  0.0  0.0   7644  2768 ?  R   21:10  0:00 ps auxf
root         1  2.5  0.1  21524  9596 ?  Ss  21:10  0:00 /sbin/init
root        17  0.5  0.1  27444  8404 ?  Ss  21:10  0:00 /lib/systemd/systemd-journald
root        27  0.0  0.0   2348  1772 ?  Ss  21:10  0:00 /sbin/ifup -a --read-environment
root        42  0.0  0.0   2392   764 ?  S   21:10  0:00  \_ /bin/sh -c /sbin/dhclient -4 -v -i -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases -I -df /var/lib/dhcp/dhclient6.eth0.leases eth0 .
root        43  0.0  0.0   8456  1936 ?  S   21:10  0:00  \_ /sbin/dhclient -4 -v -i -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases -I -df /var/lib/dhcp/dhclient6.eth0.leases eth0
root        44  0.0  0.0   9492  5644 ?  S   21:10  0:00  \_ /sbin/dhclient -4 -v -i -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases -I -df /var/lib/dhcp/dhclient6.eth0.leases eth0
message+    50  0.0  0.0   8696  3636 ?  Ss  21:10  0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
root        51  0.5  0.0  19308  6376 ?  Ss  21:10  0:00 /lib/systemd/systemd-logind
ok

It turns out your lxc-stop comes really quickly after lxc-start and is
therefore not caught properly by LXC.

While I appreciate that a corner case like this shouldn't make things
explode, do you think you could take this finding into account and put a
temporary fix in place, so that the severity of this bug can be lowered?

I'll still try to see what upstream could offer to handle this in a
better way.

Cheers,

-- 
Pierre-Elliott Bécue
GPG: 9AE0 4D98 6400 E3B6 7528 F493 0D44 2664 1949 74E2
It's far easier to fight for one's principles than to live up to them.
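
PS: as a possible shape for the temporary fix, here is a minimal sketch that replaces a fixed sleep 2 with a bounded polling loop. The helper name wait_until is hypothetical, and the readiness probe shown in the usage comment (checking for /run/systemd/system inside the container) is only an assumption about what "ready to handle the signal" could mean; I have not verified that it is sufficient.

```shell
# Hypothetical helper: run a probe command up to N times, pausing between
# failed attempts. Returns 0 as soon as the probe succeeds, 1 otherwise.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # The probe's own output is discarded; only its exit status matters.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage sketch (the probe is an assumption, not a tested fix):
#   lxc-start --name autopkgtest-stable-amd64
#   wait_until 20 1 lxc-attach --name autopkgtest-stable-amd64 -- \
#       test -d /run/systemd/system
#   lxc-stop --quiet --kill --name autopkgtest-stable-amd64
```

Compared with a plain sleep, this only waits as long as needed and gives up after a known number of attempts, so a container that never comes up does not hang the pipeline.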