On Tue, Jul 18, 2023 at 07:51:24AM +0700, Bagas Sanjaya wrote:
> Hi,
> 
> I notice a regression report on Bugzilla [1]. Quoting from it:
> 
> > Hi,
> > 
> > After updating to 6.4 through an Arch Linux kernel update, I suddenly
> > noticed random packet loss on my router-like nodes. The networking-relevant
> > configuration on these nodes is:
> > 
> > 1. Using Arch Linux
> > 2. Network config through systemd-networkd
> > 3. Using bird2 for BGP routing, but not relevant to this bug.
> > 4. Using nftables for traffic control, but it seems not relevant to this bug.
> > 5. Not using fail2ban-like dynamic filtering tools, at least at the L3/L4 level
> > 
> > After ruling out systemd-networkd and nftables-related issues, I tracked
> > the problem down to the kernel.
> > 
> > Here's the tcpdump I'm seeing on one side of my node ""
> > 
> > ```
> > sudo tcpdump -i fios_wan port 38851
> > tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
> > listening on fios_wan, link-type EN10MB (Ethernet), snapshot length 262144 bytes
> > 10:33:06.073236 IP [BOS1_NODE].38851 > [REDACTED_PUBLIC_IPv4_1].38851: UDP, length 148
> > 10:33:11.406607 IP [BOS1_NODE].38851 > [REDACTED_PUBLIC_IPv4_1].38851: UDP, length 148
> > 10:33:16.739969 IP [BOS1_NODE].38851 > [REDACTED_PUBLIC_IPv4_1].38851: UDP, length 148
> > 10:33:21.859856 IP [BOS1_NODE].38851 > [REDACTED_PUBLIC_IPv4_1].38851: UDP, length 148
> > 10:33:27.193176 IP [BOS1_NODE].38851 > [REDACTED_PUBLIC_IPv4_1].38851: UDP, length 148
> > 5 packets captured
> > 5 packets received by filter
> > 0 packets dropped by kernel
> > ```
> > 
> > But on the other side, "[REDACTED_PUBLIC_IPv4_1]", tcpdump shows replies
> > being sent in this WireGuard stream, so the packets are lost somewhere on
> > the link.
> > 
> > From the other side, I ran "mtr" to "[BOS1_NODE]"'s public IP and found
> > that the link is lost right at "[BOS1_NODE]", which means that
> > "[BOS1_NODE]"'s networking stack completely drops inbound packets from
> > specific IP addresses.
> > 
> > Some more digging
> > 
> > 1. The problem starts at varying delays after boot. Sometimes it triggers
> > 30 seconds after booting, and sometimes after 18 hours or more.
> > 2. It can evolve into a worse case where "ip neigh show" reports the IPv4
> > ARP table and IPv6 neighbor discovery entries as "invalid", meaning
> > Internet connectivity is completely lost.
> > 3. When this happens on the WAN-facing interface, the LAN-facing
> > interfaces seem OK. The WAN interface is an Intel X710-T4L using i40e and
> > the LAN side uses virtio.
> > 4. I tried to bisect between 6.3 and 6.4, and the first bad commit it
> > reported was "a3efabee5878b8d7b1863debb78cb7129d07a346". But this is not
> > relevant to networking at all, so it may be the wrong commit to look at. In
> > the meantime, because I haven't found a way to trigger the issue 100%
> > reliably, some commits marked "good" during the bisect may actually be
> > bad.
> > 5. I also looked at "dmesg", but nothing interesting popped up. I'll
> > make it available upon request.
> > 
> > This is my first bug report. Sorry for any confusion it may cause, and
> > thanks for reading.
> 
> See Bugzilla for the full thread.
> 
> Thorsten: The reporter had a bad bisect (some bad commits were marked as good
> instead), hence the SoB chain for the (unrelated) culprit ipvu commit is in
> the To: list. I also asked the reporter (also in To:) to provide dmesg and to
> rerun the bisection, but he doesn't currently have a reliable reproducer.
> Is this the best I can do?
> 
> Anyway, I'm adding this regression to be tracked in regzbot:
> 
> #regzbot introduced: a3efabee5878b8 
> https://bugzilla.kernel.org/show_bug.cgi?id=217678
> #regzbot title: packet drop on Intel X710-T4L due to ipvu boot fix
> 

This time, the bisection points to the v6.4 networking pull, so:

#regzbot introduced: 6e98b09da931a0

(also Cc: Linus.)

Thanks.

-- 
An old man doll... just what I always wanted! - Clara
