On 2023/6/23 0:40, Ali Alnubani wrote:
-----Original Message-----
From: Jie Hai <haij...@huawei.com>
Sent: Friday, June 9, 2023 12:04 PM
To: Aman Singh <aman.deep.si...@intel.com>; Yuying Zhang
<yuying.zh...@intel.com>; Anatoly Burakov <anatoly.bura...@intel.com>;
Matan Azrad <ma...@nvidia.com>; Dmitry Kozlyuk
<dmitry.kozl...@gmail.com>
Cc: dev@dpdk.org; liudongdo...@huawei.com; shiyangx...@intel.com;
ferruh.yi...@amd.com
Subject: [PATCH v4] app/testpmd: fix primary process not polling all queues

Here's how the problem arises.
step1: Start the app.
     dpdk-testpmd -a 0000:35:00.0 -l 0-3 -- -i --rxq=10 --txq=10

step2: Perform the following steps and send traffic. As expected,
queue 7 does not send or receive packets, and other queues do.
     port 0 rxq 7 stop
     port 0 txq 7 stop
     set fwd mac
     start

step3: Perform the following steps and send traffic. All queues
are expected to send and receive packets normally, but that's not
the case for queue 7.
     stop
     port stop all
     port start all
     start
     show port xstats all

In fact, only rx_q7_packets for queue 7 is non-zero, which means
queue 7 is enabled in the driver but is not involved in packet
receiving and forwarding by the software. If we check the queue
state with the commands 'show rxq info 0 7' and 'show txq info 0 7',
we see queue 7 is started, just like the other queues:
     Rx queue state: started
     Tx queue state: started
Queue 7 is started but cannot forward. That's the problem.

We know that each stream has a read-only "disabled" field that
controls whether the stream should be used for forwarding. This field
depends on testpmd's local queue state; please see
commit 3c4426db54fc ("app/testpmd: do not poll stopped queues").
The DPDK framework maintains the ethdev queue state reported by
drivers, which indicates the real state of the queues.
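The polling decision can be sketched roughly as below. This is a
simplified, self-contained model, not the actual testpmd code: the
real struct fwd_stream lives in app/test-pmd/testpmd.h and has many
more fields, and the names here are illustrative.

```c
#include <stdbool.h>

/* Simplified stand-in for testpmd's fwd_stream. */
struct fwd_stream {
	unsigned int rx_queue;
	bool disabled;	/* set from testpmd's *local* queue state */
};

/* The forwarding loop skips disabled streams entirely, so a stream
 * whose local state says "stopped" is never polled, regardless of
 * what the driver-side (ethdev) queue state says.
 * Returns 1 if the stream would be polled, 0 if it is skipped. */
static int poll_stream(const struct fwd_stream *fs)
{
	if (fs->disabled)
		return 0;	/* never reaches rte_eth_rx_burst() */
	return 1;		/* would poll the queue here */
}
```

This is exactly the divergence described above: queue 7's ethdev state
says "started", but the stream's local "disabled" flag keeps it out of
the forwarding loop.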

There are commands, such as 'port X rxq|txq start|stop', that update
both kinds of queue state. But these operations take effect only
within one stop-start round; in the following stop-start round the
preceding operations no longer apply. At that point only the ethdev
queue state is updated, so the testpmd and ethdev state information
diverge, causing unexpected side effects such as the problem above.

There was a similar problem for the secondary process, please see
commit 5028f207a4fa ("app/testpmd: fix secondary process packet
forwarding").

This patch applies a similar workaround, with some differences, to the
primary process. Not all PMDs implement rte_eth_rx_queue_info_get and
rte_eth_tx_queue_info_get, yet they may still support deferred_start
with the primary process. To avoid breaking their behavior, the
original testpmd local queue state is retained for those PMDs.
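The fallback logic can be sketched like this. The mock below only
models the -ENOTSUP contract of the real queue-info API; function and
enum names are made up for illustration and are not testpmd's.

```c
#include <errno.h>
#include <stdbool.h>

enum queue_state { QSTATE_STOPPED = 0, QSTATE_STARTED = 1 };

/* Hypothetical stand-in for rte_eth_rx_queue_info_get(): returns 0
 * and a queue state on success, or -ENOTSUP when the PMD does not
 * implement the queue-info callback. */
static int mock_rx_queue_info_get(bool pmd_supports_info,
				  enum queue_state driver_state,
				  enum queue_state *out)
{
	if (!pmd_supports_info)
		return -ENOTSUP;
	*out = driver_state;
	return 0;
}

/* The idea of the workaround: prefer the driver-reported state, but
 * if the PMD returns -ENOTSUP, keep testpmd's local state untouched
 * so that deferred_start behavior is preserved. */
static enum queue_state resolve_state(bool pmd_supports_info,
				      enum queue_state driver_state,
				      enum queue_state local_state)
{
	enum queue_state st;

	if (mock_rx_queue_info_get(pmd_supports_info, driver_state, &st) == 0)
		return st;	/* sync local state with the driver */
	return local_state;	/* no queue-info op: keep local state */
}
```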

Fixes: 3c4426db54fc ("app/testpmd: do not poll stopped queues")
Cc: sta...@dpdk.org

Signed-off-by: Jie Hai <haij...@huawei.com>
---

Hi Jie,

With this patch, I see the error below when starting a representor
port after reattaching it. Is it expected?

$ sudo ./build/app/dpdk-testpmd -n 4 -a 0000:08:00.0,dv_esw_en=1,representor=vf0-1 -a 
auxiliary: -a 00:00.0 --iova-mode="va" -- -i
[..]
testpmd> port stop all
testpmd> port close 0
testpmd> device detach 0000:08:00.0
testpmd> port attach 0000:08:00.0,dv_esw_en=1,representor=0-1
testpmd> port start 1
Configuring Port 1 (socket 0)
Port 1: FA:9E:D8:5F:D7:D8
Invalid Rx queue_id=0
testpmd: Failed to get rx queue info
Invalid Tx queue_id=0
testpmd: Failed to get tx queue info

Regards,
Ali

Hi Ali,
Thanks for your feedback.

When update_queue_state is called, the state of all queues on all
ports is updated. The number of queues is nb_rxq|nb_txq, which is
stored locally by the testpmd process. All ports in the same process
share the same nb_rxq|nb_txq.

After port 0 is detached and reattached, its number of queues is 0.
It changes only when the port is reconfigured by testpmd,
which happens when port 0 is started.

If we start port 1 first, update_queue_state will try to update the
state of nb_rxq|nb_txq queues on port 0, which is invalid because
port 0 has zero queues.
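The mismatch can be modeled with a toy example. This is not testpmd
code; the struct and function names are illustrative, and the point is
only that looping 0..nb_rxq on every port over-runs a port whose real
queue count is smaller.

```c
/* Toy model: testpmd keeps one process-wide nb_rxq and uses it for
 * every port, but a freshly reattached port has 0 queues until it is
 * (re)configured. */
struct port {
	unsigned int nb_queues_configured;	/* what the driver really has */
};

/* Returns the number of invalid queue-info queries that a naive
 * "loop 0..nb_rxq on every port" update would issue. Each such query
 * corresponds to an "Invalid Rx queue_id=..." message. */
static unsigned int count_invalid_queries(const struct port *ports,
					  unsigned int nb_ports,
					  unsigned int process_nb_rxq)
{
	unsigned int invalid = 0;

	for (unsigned int p = 0; p < nb_ports; p++)
		for (unsigned int q = 0; q < process_nb_rxq; q++)
			if (q >= ports[p].nb_queues_configured)
				invalid++;
	return invalid;
}
```

With port 0 reattached but unconfigured (0 queues) and port 1
configured with 1 queue, a process-wide nb_rxq of 1 yields one invalid
query, matching the errors shown above.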

Even without this patch, the same problem occurs in the multi-process
scenario when the secondary process detaches and reattaches the port
and then starts it.

I will submit a patch to fix this problem: when a port starts, update
the queue state based on the number of queues reported by the driver.
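The intended fix amounts to clamping the update loop to the per-port
queue count the driver reports (in DPDK this would come from something
like rte_eth_dev_info_get()) instead of the process-wide nb_rxq. A
minimal sketch, with an illustrative function name:

```c
/* Sketch of the proposed fix: walk only as many queues as the port
 * actually has, never past the driver-reported count. */
static unsigned int queues_to_update(unsigned int driver_nb_queues,
				     unsigned int process_nb_rxq)
{
	return driver_nb_queues < process_nb_rxq ?
	       driver_nb_queues : process_nb_rxq;
}
```

For a reattached, unconfigured port the driver reports 0 queues, so no
invalid queries are issued.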

Thanks,
Jie Hai
