On Wed, Feb 14, 2024 at 18:44:54 +0100, Simon Josefsson wrote:
> Tom Parkin <tpar...@katalix.com> writes:
>
> > On Tue, Jan 23, 2024 at 18:05:23 +0100, Simon Josefsson wrote:
> >> Tom Parkin <tpar...@katalix.com> writes:
> >>
> >> > Hi Simon,
> >> >
> >> > On Mon, Jan 22, 2024 at 20:15:11 +0100, Simon Josefsson wrote:
> >> >> golang-github-katalix-go-l2tp
> >> >> https://salsa.debian.org/jas/golang-google-grpc/-/jobs/5191076
> >> >> === RUN
> >> >> TestBasicSendReceive/5:_send/recv_[::1]:9000_[::1]:9001_L2TPv3_IP
> >> >> level=info function=transport message=retransmit
> >> >> message_type=avpMsgTypeHello
> >> >> level=info function=transport message=retransmit
> >> >> message_type=avpMsgTypeHello
> >> >> level=info function=transport message=retransmit
> >> >> message_type=avpMsgTypeHello
> >> >> level=error function=transport message="socket read failed"
> >> >> error="resource temporarily unavailable"
> >> >> level=error function=transport message="transport down" error="transmit
> >> >> of avpMsgTypeHello failed after 3 retry attempts"
> >> >> transport_test.go:388: test sender function reported an error:
> >> >> failed to send Hello message: transmit of avpMsgTypeHello failed
> >> >> after 3 retry attempts
> >> >> panic: test timed out after 10m0s
> >> >
> >> > This test is failing to send a packet over an IPv6 L2TPIP socket: it
> >> > will depend on the go runtime support for L2TPIP (which has been in
> >> > for ages), and also the kernel having the l2tp_ipv6 driver loaded.
> >> >
> >> > I'd sort of expect to see messages along those lines when trying to
> >> > open the socket, though, rather than tx/rx failing :-/
> >> >
> >> > I'm not at all familiar with the environment of the Salsa test
> >> > pipeline -- could you expand on what the configuration is here?
> >>
> >> Thanks for looking at the logs Tom. I don't really know much about the
> >> environment except for these pointers:
> >>
> >> https://wiki.debian.org/Salsa/Doc#Runners
> >> https://salsa.debian.org/salsa-ci-team/pipeline/
> >>
> >> Does it setup a server on ::1 properly? Any outbound connections? Only
> >> http(s) is allowed.
> >
> > So I *think* the runtime env is a VM using the "Google Container-Optimized
> > OS". The fact that the socket opens successfully but the packet is
> > apparently lost is suggestive of some kind of firewalling. I'll see
> > if I can figure anything out from the Google docs.
> >
> > The tests work OK when run manually and when run as part of the
> > package build here, so I think it must be something specific to the
> > pipeline VM but I'm not sure what at the moment.
> >
> > In terms of the test configuration, it basically opens a socket for
> > each end of the connection and verifies it can send/receive over those
> > sockets. It's the same test code for each configuration.
>
> This problem happens for others:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1063746
>
> Interestingly the failure seems arch-specific:
>
> https://ci.debian.net/packages/g/golang-github-katalix-go-l2tp/

Interesting -- thank you for the further information.

The fact it seems arch-specific is striking as you say, but it's odd
that amd64 is failing since that's what the code has been developed
on.

If I can reproduce it in a sid chroot that'll be a good starting point
I think. I will try this and see if I can get any more information.

I've unfortunately not had time to dig further into the Google VM
docs; so a way to reproduce it outside that environment would be most
welcome.

> It could still well be that something in salsa and debci VM, and the
> #1063746 reporter's machine, that is causing this -- but it seems this
> clearly happens often enough, and is causing build failures checking
> reverse dependencies of several packages going into experimental, so it
> would be nice to fix it. Do you have any ideas? Could some test be
> disabled or silenced somehow? I'm ignoring build failures in
> golang-github-katalix-go-l2tp meanwhile.

Possibly the test could be skipped if we could figure out the root
cause.

I did find when working on Fedora packaging that F38 had a strange
issue whereby the l2tp_ip kernel module was blacklisted, which would
cause the first IP encap test to fail. Strangely the l2tp_ip6 module
is not blacklisted, so on the second time around the IP encap test
would pass as the l2tp_ip6 module would autoload l2tp_ip as the former
depends on the latter.

There's a fix in go-l2tp upstream for this issue, I'm not sure whether
something similar might apply here.

If I can reproduce the issue I'll see whether a workaround can be
applied.

Thanks again,
Tom

-- 
Tom Parkin
Katalix Systems Ltd
https://katalix.com
Catalysts for your Embedded Linux software development