On Sat, 2020-10-31 at 09:48 +0800, Yunsheng Lin wrote:
> On 2020/10/30 19:50, Joakim Tjernlund wrote:
> > On Fri, 2020-10-30 at 09:36 +0800, Yunsheng Lin wrote:
> > > On 2020/10/29 23:18, David Ahern wrote:
> > > > On 10/29/20 8:10 AM, Joakim Tjernlund wrote:
> > > > > OK, bisecting (was a bit of a bother since we merge upstream releases
> > > > > into our tree, is there a way to just bisect that?)
> > > > >
> > > > > Result was commit "net: sch_generic: aviod concurrent reset and
> > > > > enqueue op for lockless qdisc"
> > > > > (749cc0b0c7f3dcdfe5842f998c0274e54987384f)
> > > > >
> > > > > Reverting that commit on top of our tree made it work again. How to
> > > > > fix?
> > > >
> > > > Adding the author of that patch (linyunsh...@huawei.com) to take a look.
> > > >
> > > >
> > > > >
> > > > > Jocke
> > > > >
> > > > > On Mon, 2020-10-26 at 12:31 -0600, David Ahern wrote:
> > > > > >
> > > > > > On 10/26/20 6:58 AM, Joakim Tjernlund wrote:
> > > > > > > Ping (maybe it should read "arping" instead :)
> > > > > > >
> > > > > > > Jocke
> > > > > > >
> > > > > > > On Thu, 2020-10-22 at 17:19 +0200, Joakim Tjernlund wrote:
> > > > > > > > strace arping -q -c 1 -b -U -I eth1 0.0.0.0
> > > > > > > > ...
> > > > > > > > sendto(3, "\0\1\10\0\6\4\0\1\0\6\234\v\6
> > > > > > > > \v\v\v\v\377\377\377\377\377\377\0\0\0\0", 28, 0,
> > > > > > > > {sa_family=AF_PACKET, proto=0x806, if4, pkttype=PACKET_HOST,
> > > > > > > > addr(6)={1, ffffffffffff},
> > > > > > > > 20) = -1 ENOBUFS (No buffer space available)
> > > > > > > > ....
> > > > > > > > and then arping loops.
> > > > > > > >
> > > > > > > > in 4.19.127 it was:
> > > > > > > > sendto(3,
> > > > > > > > "\0\1\10\0\6\4\0\1\0\6\234\5\271\362\n\322\212E\377\377\377\377\377\377\0\0\0\0",
> > > > > > > > 28, 0, {sa_family=AF_PACKET, proto=0x806, if4,
> > > > > > > > pkttype=PACKET_HOST, addr(6)={1,
> > > > > > > > ffffffffffff}, 20) = 28
> > > > > > > >
> > > > > > > > Seems like something has changed the IP behaviour between now
> > > > > > > > and then ?
> > > > > > > > eth1 is UP but not RUNNING and has an IP address.
> > >
> > > "eth1 is UP but not RUNNING" usually mean user has configure the netdev
> > > as up,
> > > but the hardware has not detected a linkup yet.
> > >
> > > Also What is the output of "ethtool eth1"?
> >
> > echo 1 > /sys/class/net/eth1/carrier
> > cu3-jocke ~ # arping -q -c 1 -b -U -I eth1 0.0.0.0
> > cu3-jocke ~ # echo 0 > /sys/class/net/eth1/carrier
> > cu3-jocke ~ # arping -q -c 1 -b -U -I eth1 0.0.0.0
> > ^Ccu3-jocke ~ # ethtool eth1
> > Settings for eth1:
> > 	Supported ports: [ MII ]
> > 	Supported link modes:   1000baseT/Full
> > 	Supported pause frame use: Symmetric Receive-only
> > 	Supports auto-negotiation: Yes
> > 	Advertised link modes:  1000baseT/Full
> > 	Advertised pause frame use: Symmetric Receive-only
> > 	Advertised auto-negotiation: Yes
> > 	Speed: 10Mb/s
> > 	Duplex: Half
> > 	Port: MII
> > 	PHYAD: 1
> > 	Transceiver: external
> > 	Auto-negotiation: on
> > 	Current message level: 0x00000037 (55)
> > 			       drv probe link ifdown ifup
> > 	Link detected: no
> >
> > We have a writeable carrier since eth device is PHY less. Maybe that path
> > is different ?
> > Check drivers/net/ethernet/freescale/dpaa/dpa_eth.c
>
> The above difference does not seems to matter.
>
> >
> > >
> > > It would be good to see the status of netdev before and after executing
> > > arping cmd
> > > too.
> >
> > hmm, how do you mean?
>
> I was trying to find out when the netdev' state became "eth1 is UP but not
> RUNNING".
>
> Anyway, when I looked at the backported patch, I did find new qdisc
> assignment is
> missing from the upstream patch.
>
> Please see if the below patch fix your problem, thanks:
>
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index bd96fd2..4e15913 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -1116,10 +1116,13 @@ static void dev_deactivate_queue(struct net_device *dev,
>  				 void *_qdisc_default)
>  {
>  	struct Qdisc *qdisc = rtnl_dereference(dev_queue->qdisc);
> +	struct Qdisc *qdisc_default = _qdisc_default;
>
>  	if (qdisc) {
>  		if (!(qdisc->flags & TCQ_F_BUILTIN))
>  			set_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state);
> +
> +		rcu_assign_pointer(dev_queue->qdisc, qdisc_default);
>  	}
>  }

This patch seems to have resolved the problem, thanks.
Please CC me on the formal patch for 4.19.x.

 Jocke
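
For anyone reading the archive later, here is a rough userspace sketch of why the missing assignment could plausibly produce the ENOBUFS seen in the strace output. It is not kernel code, and every toy_* name and TOY_* constant is made up for illustration. It assumes that a packet handed to a qdisc marked deactivated is dropped and reported to the sender as -ENOBUFS, while the default (noop) qdisc discards the packet quietly and the send still reports success. That matches the behaviour described in this thread, but treat it as an assumption rather than a statement about the exact code path.

```c
#include <errno.h>
#include <stddef.h>

/* Toy userspace model only; all names here are hypothetical. */
enum { TOY_XMIT_SUCCESS = 0, TOY_XMIT_DROP = 1, TOY_XMIT_CN = 2 };

struct toy_qdisc {
	int deactivated;	/* stands in for __QDISC_STATE_DEACTIVATED */
	int (*enqueue)(struct toy_qdisc *q);
};

struct toy_queue {
	struct toy_qdisc *qdisc;	/* stands in for dev_queue->qdisc */
};

/* A "real" qdisc: assumed to drop packets once it has been deactivated. */
static int toy_real_enqueue(struct toy_qdisc *q)
{
	return q->deactivated ? TOY_XMIT_DROP : TOY_XMIT_SUCCESS;
}

/* The default (noop) qdisc: assumed to discard quietly. */
static int toy_noop_enqueue(struct toy_qdisc *q)
{
	(void)q;
	return TOY_XMIT_CN;
}

/* Assumed mapping: only a hard drop becomes -ENOBUFS for the sender. */
static int toy_xmit_errno(int rc)
{
	return rc == TOY_XMIT_DROP ? -ENOBUFS : 0;
}

/*
 * Mirrors the shape of the patched function: mark the old qdisc as
 * deactivated and, when apply_fix is set, also install the default one.
 */
static void toy_deactivate_queue(struct toy_queue *txq,
				 struct toy_qdisc *qdisc_default,
				 int apply_fix)
{
	struct toy_qdisc *qdisc = txq->qdisc;

	if (qdisc) {
		qdisc->deactivated = 1;
		if (apply_fix)
			txq->qdisc = qdisc_default;
	}
}

/* Result of one send after the carrier went down, as the sender sees it. */
int toy_send_after_carrier_down(int apply_fix)
{
	struct toy_qdisc real = { 0, toy_real_enqueue };
	struct toy_qdisc noop = { 0, toy_noop_enqueue };
	struct toy_queue txq = { &real };

	toy_deactivate_queue(&txq, &noop, apply_fix);
	return toy_xmit_errno(txq.qdisc->enqueue(txq.qdisc));
}
```

Calling toy_send_after_carrier_down(0) models the backport without the assignment and returns -ENOBUFS; calling it with 1 models the patched function and returns 0, which corresponds to sendto() succeeding even though the frame never leaves the box.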