From: Tobias Waldekranz <tob...@waldekranz.com> Sent: Friday, July 3, 2020 3:55 PM > On Fri Jul 3, 2020 at 4:45 AM CEST, Andy Duan wrote: > > From: Tobias Waldekranz <tob...@waldekranz.com> Sent: Friday, July 3, > > 2020 4:58 AM > > > In the ISR, we poll the event register for the queues in need of > > > service and then enter polled mode. After this point, the event > > > register will never be read again until we exit polled mode. > > > > > > In a scenario where a UDP flow is routed back out through the same > > > interface, i.e. "router-on-a-stick" we'll typically only see an rx queue > > > event > initially. > > > Once we start to process the incoming flow we'll be locked polled > > > mode, but we'll never clean the tx rings since that event is never caught. > > > > > > Eventually the netdev watchdog will trip, causing all buffers to be > > > dropped and then the process starts over again. > > > > > > Rework the NAPI poll to keep trying to consome the entire budget as > > > long as new events are coming in, making sure to service all rx/tx > > > queues, in priority order, on each pass. > > > > > > Fixes: 4d494cdc92b3 ("net: fec: change data structure to support > > > multiqueue") > > > Signed-off-by: Tobias Waldekranz <tob...@waldekranz.com> > > > --- > > > > > > v1 -> v2: > > > * Always do a full pass over all rx/tx queues as soon as any event is > > > received, as suggested by David. > > > * Keep dequeuing packets as long as events keep coming in and we're > > > under budget. > > > > > > drivers/net/ethernet/freescale/fec.h | 5 -- > > > drivers/net/ethernet/freescale/fec_main.c | 94 > > > ++++++++--------------- > > > 2 files changed, 31 insertions(+), 68 deletions(-) > > > > > > diff --git a/drivers/net/ethernet/freescale/fec.h > > > b/drivers/net/ethernet/freescale/fec.h > > > index a6cdd5b61921..d8d76da51c5e 100644 > > > --- a/drivers/net/ethernet/freescale/fec.h > > > +++ b/drivers/net/ethernet/freescale/fec.h > > > @@ -525,11 +525,6 @@ struct fec_enet_private { > > > unsigned int total_tx_ring_size; > > > unsigned int total_rx_ring_size; > > > > > > - unsigned long work_tx; > > > - unsigned long work_rx; > > > - unsigned long work_ts; > > > - unsigned long work_mdio; > > > - > > > struct platform_device *pdev; > > > > > > int dev_id; > > > diff --git a/drivers/net/ethernet/freescale/fec_main.c > > > b/drivers/net/ethernet/freescale/fec_main.c > > > index 2d0d313ee7c5..84589d464850 100644 > > > --- a/drivers/net/ethernet/freescale/fec_main.c > > > +++ b/drivers/net/ethernet/freescale/fec_main.c > > > @@ -75,8 +75,6 @@ static void fec_enet_itr_coal_init(struct > > > net_device *ndev); > > > > > > #define DRIVER_NAME "fec" > > > > > > -#define FEC_ENET_GET_QUQUE(_x) ((_x == 0) ? 1 : ((_x == 1) ? 2 : > > > 0)) > > > - > > > /* Pause frame feild and FIFO threshold */ > > > #define FEC_ENET_FCE (1 << 5) > > > #define FEC_ENET_RSEM_V 0x84 > > > @@ -1248,8 +1246,6 @@ fec_enet_tx_queue(struct net_device *ndev, > u16 > > > queue_id) > > > > > > fep = netdev_priv(ndev); > > > > > > - queue_id = FEC_ENET_GET_QUQUE(queue_id); > > > - > > > txq = fep->tx_queue[queue_id]; > > > /* get next bdp of dirty_tx */ > > > nq = netdev_get_tx_queue(ndev, queue_id); @@ -1340,17 > > > +1336,14 @@ fec_enet_tx_queue(struct net_device *ndev, u16 > queue_id) > > > writel(0, txq->bd.reg_desc_active); } > > > > > > -static void > > > -fec_enet_tx(struct net_device *ndev) > > > +static void fec_enet_tx(struct net_device *ndev) > > > { > > > struct fec_enet_private *fep = netdev_priv(ndev); > > > - u16 queue_id; > > > - /* First process class A queue, then Class B and Best Effort > queue */ > > > - for_each_set_bit(queue_id, &fep->work_tx, > > > FEC_ENET_MAX_TX_QS) { > > > - clear_bit(queue_id, &fep->work_tx); > > > - fec_enet_tx_queue(ndev, queue_id); > > > - } > > > - return; > > > + int i; > > > + > > > + /* Make sure that AVB queues are processed first. */ > > > + for (i = fep->num_rx_queues - 1; i >= 0; i--) > > > > In fact, you already change the queue priority comparing before. > > Before: queue1 (Audio) > queue2 (video) > queue0 (best effort) > > Now: queue2 (video) > queue1 (Audio) > queue0 (best effort) > > Yes, thank you, I meant to ask about that. I was looking at these definitions > in > fec.h: > > #define RCMR_CMP_1 (RCMR_CMP_CFG(0, 0) | > RCMR_CMP_CFG(1, 1) | \ > RCMR_CMP_CFG(2, 2) | > RCMR_CMP_CFG(3, 3)) > #define RCMR_CMP_2 (RCMR_CMP_CFG(4, 0) | > RCMR_CMP_CFG(5, 1) | \ > RCMR_CMP_CFG(6, 2) | > RCMR_CMP_CFG(7, 3)) > > I read that as PCP 0-3 being mapped to queue 1 and 4-7 to queue 2. That led > me to believe that the order should be 2, 1, 0. Is the driver supposed to > prioritize PCP 0-3 over 4-7, or have I misunderstood completely?
The configuration is for RX queues. If consider PCP 0 is high priority, that does make sense: 2 > 1 > 0. > > > Other logic seems fine, but you should run stress test to avoid any > > block issue since the driver cover more than 20 imx platforms. > > I have run stress tests and I observe that we're dequeuing about as many > packets from each queue when the incoming line is filled with 1/3 each of > untagged/tagged-pcp0/tagged-pcp7 traffic: > > root@envoy:~# ply -c "sleep 2" ' > t:net/napi_gro_receive_entry { > @[data->napi_id, data->queue_mapping] = count(); }' > ply: active > ply: deactivating > > @: > { 66, 3 }: 165811 > { 66, 2 }: 167733 > { 66, 1 }: 169470 > > It seems like this is due to "Receive flushing" not being enabled in the FEC. > If I > manually enable it for queue 0, processing is restricted to only queue 1 and > 2: > > root@envoy:~# devmem 0x30be01f0 32 $((1 << 3)) root@envoy:~# ply -c > "sleep 2" ' > t:net/napi_gro_receive_entry { > @[data->napi_id, data->queue_mapping] = count(); }' > ply: active > ply: deactivating > > @: > { 66, 2 }: 275055 > { 66, 3 }: 275870 > > Enabling flushing on queue 1, focuses all processing on queue 2: Please don't enable flush, there have one IC issue. NXP latest errata doc should includes the issue, but the flush issue was fixed at imx8dxl platform. > > root@envoy:~# devmem 0x30be01f0 32 $((3 << 3)) root@envoy:~# ply -c > "sleep 2" ' > t:net/napi_gro_receive_entry { > @[data->napi_id, data->queue_mapping] = count(); }' > ply: active > ply: deactivating > > @: > { 66, 3 }: 545442 > > Changing the default QoS settings feels like a separate change, but I can > submit a v3 as a series if you want? I think the version is fine. No need to submit separate change. > > I do not have access to a single-queue iMX device, would it be possible for > you > to test this change on such a device? Yes, I will do stress test on imx8 and legacy platform like imx6 with single-queue, try to avoid any block issue. Thank you !