On Sun, Aug 26, 2018 at 6:14 AM <mr...@linux.ee> wrote:
>
> > BTW, removing the FCS also means GRO is going to work, finally on this NIC
> > ;)
> >
> > GRO does not like packets with padding.
>
> As a follow-up, I am seeing hw csum failures on Sun V440 that has
> onboard Sun Cassini with sungem driver. First tested version was 4.18
> (it happened there once) and now that I tried 4.18+git, it still
> happens:
>
> [ 21.563282] libphy: Fixed MDIO Bus: probed
> [ 21.617116] cassini: cassini.c:v1.6 (21 May 2008)
> [ 21.678962] cassini 0000:00:02.0: enabling device (0144 -> 0146)
> [ 21.761931] cassini 0000:00:02.0 eth0: Sun Cassini+ (64bit/66MHz PCI/Cu)
> Ethernet[6] 00:03:ba:6f:14:39
> [ 21.884952] cassini 0003:00:01.0: enabling device (0144 -> 0146)
> [ 21.967868] cassini 0003:00:01.0 eth1: Sun Cassini+ (64bit/66MHz PCI/Cu)
> Ethernet[29] 00:03:ba:6f:14:3a
> [...]
> [ 54.341212] eth0: hw csum failure
> [ 54.384725] CPU: 2 PID: 0 Comm: swapper/2 Not tainted
> 4.18.0-12952-g2923b27 #1397
> [ 54.483167] Call Trace:
> [ 54.515209] [000000000077838c] __skb_checksum_complete+0xcc/0xe0
> [ 54.595272] [000000000080fc84] igmp_rcv+0x224/0x920
> [ 54.660475] [00000000007ca3d0] ip_local_deliver+0xb0/0x240
> [ 54.733675] [00000000007ca5c0] ip_rcv+0x60/0xa0
> [ 54.794304] [0000000000781a30] __netif_receive_skb_one_core+0x30/0x60
> [ 54.880094] [0000000000782914] process_backlog+0x94/0x140
> [ 54.952161] [0000000000788f6c] net_rx_action+0x1ec/0x320
> [ 55.023083] [0000000000870de8] __do_softirq+0xc8/0x200
> [ 55.091719] [000000000042c4cc] do_softirq_own_stack+0x2c/0x40
> [ 55.168362] [00000000004662d8] irq_exit+0xb8/0xe0
> [ 55.231266] [0000000000870ac0] handler_irq+0xc0/0x100
> [ 55.298756] [00000000004208b4] tl0_irq5+0x14/0x20
> [ 55.361670] [000000000042cafc] arch_cpu_idle+0x9c/0xa0
> [ 55.447055] [000000000048a254] cpu_startup_entry+0x14/0x40
> [ 55.536998] [000000000095f4b4] 0x95f4b4
> [ 55.588471] [0000000040000000] 0x40000000
> [ 179.780371] eth0: hw csum failure
> [ 179.823878] CPU: 3 PID: 0 Comm: swapper/3 Not tainted
> 4.18.0-12952-g2923b27 #1397
> [ 179.922230] Call Trace:
> [ 179.954267] [000000000077838c] __skb_checksum_complete+0xcc/0xe0
> [ 180.034335] [000000000080fc84] igmp_rcv+0x224/0x920
> [ 180.099536] [00000000007ca3d0] ip_local_deliver+0xb0/0x240
> [ 180.172740] [00000000007ca5c0] ip_rcv+0x60/0xa0
> [ 180.233368] [0000000000781a30] __netif_receive_skb_one_core+0x30/0x60
> [ 180.319159] [0000000000782914] process_backlog+0x94/0x140
> [ 180.391225] [0000000000788f6c] net_rx_action+0x1ec/0x320
> [ 180.462148] [0000000000870de8] __do_softirq+0xc8/0x200
> [ 180.530782] [000000000042c4cc] do_softirq_own_stack+0x2c/0x40
> [ 180.607422] [00000000004662d8] irq_exit+0xb8/0xe0
> [ 180.670331] [0000000000870ac0] handler_irq+0xc0/0x100
> [ 180.737822] [00000000004208b4] tl0_irq5+0x14/0x20
> [ 180.800735] [000000000042caf8] arch_cpu_idle+0x98/0xa0
> [ 180.869373] [0000000000489f60] do_idle+0xe0/0x1c0
> [ 180.932281] [000000000048a25c] cpu_startup_entry+0x1c/0x40
> [ 181.005491] [000000000098e9b4] start_kernel+0x3b8/0x3c8

Note that these traces are for non-TCP packets (both go through igmp_rcv()). I suspect the driver should not provide CHECKSUM_COMPLETE for non-TCP packets; maybe the NIC is mishandling this case.
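
To make the suggestion concrete, here is a rough userspace sketch of the decision I have in mind. It is not cassini driver code and not a tested patch: the enum, PROTO_TCP and rx_csum_for_ipv4() are hypothetical stand-ins for skb->ip_summed and the driver's RX path, and it assumes the driver can look at the IPv4 header before it hands the packet to the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel's skb->ip_summed values. */
enum rx_csum_state {
    RX_CSUM_NONE,     /* let the stack verify the checksum in software */
    RX_CSUM_COMPLETE  /* hand the hardware-computed sum to the stack */
};

#define PROTO_TCP 6   /* IANA protocol number for TCP */

/*
 * Hypothetical helper: decide what to report for a received IPv4 packet.
 * Only TCP keeps the hardware sum; everything else (IGMP, UDP, ICMP, ...)
 * falls back to software verification.
 */
static enum rx_csum_state rx_csum_for_ipv4(const uint8_t *ip_hdr, size_t len)
{
    assert(ip_hdr != NULL);

    if (len < 20)                 /* shorter than a minimal IPv4 header */
        return RX_CSUM_NONE;
    if ((ip_hdr[0] >> 4) != 4)    /* version field is not IPv4 */
        return RX_CSUM_NONE;

    /* Byte 9 of the IPv4 header is the protocol field. */
    return ip_hdr[9] == PROTO_TCP ? RX_CSUM_COMPLETE : RX_CSUM_NONE;
}
```

With something along these lines, the IGMP packets from the traces would be reported without a hardware checksum and verified in software, at the cost of losing offload for other non-TCP traffic. Whether that is the right trade-off depends on what the NIC actually computes for those packets.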