> -----Original Message-----
> From: Ferruh Yigit <ferruh.yi...@amd.com>
> Sent: Tuesday, June 27, 2023 2:05 PM
> To: Jie Hai <haij...@huawei.com>; Ali Alnubani <alia...@nvidia.com>;
> Aman Singh <aman.deep.si...@intel.com>; Yuying Zhang <yuying.zh...@intel.com>;
> Anatoly Burakov <anatoly.bura...@intel.com>; Matan Azrad <ma...@nvidia.com>;
> Dmitry Kozlyuk <dmitry.kozl...@gmail.com>
> Cc: dev@dpdk.org; liudongdo...@huawei.com; shiyangx...@intel.com;
> Raslan Darawsheh <rasl...@nvidia.com>; NBU-Contact-Thomas Monjalon
> (EXTERNAL) <tho...@monjalon.net>
> Subject: Re: [PATCH v4] app/testpmd: fix primary process not polling all
> queues
>
> On 6/26/2023 10:30 AM, Jie Hai wrote:
> > On 2023/6/23 0:40, Ali Alnubani wrote:
> >>> -----Original Message-----
> >>> From: Jie Hai <haij...@huawei.com>
> >>> Sent: Friday, June 9, 2023 12:04 PM
> >>> To: Aman Singh <aman.deep.si...@intel.com>; Yuying Zhang <yuying.zh...@intel.com>;
> >>> Anatoly Burakov <anatoly.bura...@intel.com>; Matan Azrad <ma...@nvidia.com>;
> >>> Dmitry Kozlyuk <dmitry.kozl...@gmail.com>
> >>> Cc: dev@dpdk.org; liudongdo...@huawei.com; shiyangx...@intel.com;
> >>> ferruh.yi...@amd.com
> >>> Subject: [PATCH v4] app/testpmd: fix primary process not polling all queues
> >>>
> >>> Here's how the problem arises.
> >>> step1: Start the app.
> >>>   dpdk-testpmd -a 0000:35:00.0 -l 0-3 -- -i --rxq=10 --txq=10
> >>>
> >>> step2: Perform the following steps and send traffic. As expected,
> >>> queue 7 does not send or receive packets, and the other queues do.
> >>>   port 0 rxq 7 stop
> >>>   port 0 txq 7 stop
> >>>   set fwd mac
> >>>   start
> >>>
> >>> step3: Perform the following steps and send traffic. All queues
> >>> are expected to send and receive packets normally, but that's not
> >>> the case for queue 7.
> >>>   stop
> >>>   port stop all
> >>>   port start all
> >>>   start
> >>>   show port xstats all
> >>>
> >>> In fact, only the value of rx_q7_packets for queue 7 is not zero,
> >>> which means queue 7 is enabled for the driver but is not involved
> >>> in packet receiving and forwarding by the software. If we check the
> >>> queue state with the commands 'show rxq info 0 7' and 'show txq info 0 7',
> >>> we see that queue 7 is started, just like the other queues:
> >>>   Rx queue state: started
> >>>   Tx queue state: started
> >>> Queue 7 is started but cannot forward. That's the problem.
> >>>
> >>> We know that each stream has a read-only "disabled" field that
> >>> controls whether this stream should be used for forwarding. This field
> >>> depends on testpmd's local queue state, see
> >>> commit 3c4426db54fc ("app/testpmd: do not poll stopped queues").
> >>> The DPDK framework maintains the ethdev queue state reported by the
> >>> drivers, which indicates the real state of the queues.
> >>>
> >>> There are commands that update these two kinds of queue state, such as
> >>> 'port X rxq|txq start|stop'. But these operations take effect only
> >>> in one stop-start round. In the following stop-start round, the
> >>> preceding operations no longer take effect. However, only
> >>> the ethdev queue state is updated, causing the testpmd and ethdev
> >>> state information to diverge, with unexpected side effects
> >>> such as the problem above.
> >>>
> >>> There was a similar problem for the secondary process, see
> >>> commit 5028f207a4fa ("app/testpmd: fix secondary process packet
> >>> forwarding").
> >>>
> >>> This patch applies its workaround, with some differences, to the
> >>> primary process.
> >>> Not all PMDs implement rte_eth_rx_queue_info_get and
> >>> rte_eth_tx_queue_info_get, but they may support deferred_start
> >>> with the primary process. To avoid breaking their behavior, retain
> >>> the original testpmd local queue state for those PMDs.
> >>>
> >>> Fixes: 3c4426db54fc ("app/testpmd: do not poll stopped queues")
> >>> Cc: sta...@dpdk.org
> >>>
> >>> Signed-off-by: Jie Hai <haij...@huawei.com>
> >>> ---
> >>
> >> Hi Jie,
> >>
> >> I see the error below when starting a representor port after
> >> reattaching it with this patch, is it expected?
> >>
> >> $ sudo ./build/app/dpdk-testpmd -n 4 -a
> >> 0000:08:00.0,dv_esw_en=1,representor=vf0-1 -a auxiliary: -a 00:00.0
> >> --iova-mode="va" -- -i
> >> [..]
> >> testpmd> port stop all
> >> testpmd> port close 0
> >> testpmd> device detach 0000:08:00.0
> >> testpmd> port attach 0000:08:00.0,dv_esw_en=1,representor=0-1
> >> testpmd> port start 1
> >> Configuring Port 1 (socket 0)
> >> Port 1: FA:9E:D8:5F:D7:D8
> >> Invalid Rx queue_id=0
> >> testpmd: Failed to get rx queue info
> >> Invalid Tx queue_id=0
> >> testpmd: Failed to get tx queue info
> >>
> >> Regards,
> >> Ali
> >
> > Hi Ali,
> > Thanks for your feedback.
> >
> > When update_queue_state is called, the state of all queues on all
> > ports is updated.
> > The number of queues is nb_rxq|nb_txq, which is stored locally by the
> > testpmd process.
> > All ports in the same process share the same nb_rxq|nb_txq.
> >
> > After the port is detached and attached, the number of queues of
> > port 0 is 0. It changes only when the port is reconfigured by testpmd,
> > which happens when port 0 is started.
> >
> > If we start port 1 first, update_queue_state will update the state of
> > nb_rxq|nb_txq queues of port 0, which is invalid because port 0 has
> > zero queues.
> >
> > If this patch is not applied, the same problem occurs when the
> > secondary process detaches and attaches the port and then starts it
> > in the multi-process scenario.
> >
> > I will submit a patch to fix this problem. When a port starts, the
> > queue state will be updated based on the number of queues reported
> > by the driver.
> >
>
> Hi Ali,
>
> How big a blocker is this issue, should the fix be part of -rc2?
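A minimal sketch of the queue-state sync discussed in the quoted commit message and follow-up, assuming a stand-alone helper; sync_rxq_state() and local_rxq_state[] are illustrative stand-ins for testpmd's real per-port bookkeeping, not the exact patch. The driver-reported state is mirrored only when the PMD implements rte_eth_rx_queue_info_get(); -ENOTSUP keeps the existing local state, and the error path corresponds to the reattached representor port above, which has zero configured queues.

#include <errno.h>
#include <rte_ethdev.h>

/* Illustrative stand-in for testpmd's per-port, per-queue local state. */
static uint8_t local_rxq_state[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT];

static void
sync_rxq_state(uint16_t port_id, uint16_t nb_rxq)
{
	struct rte_eth_rxq_info qinfo;
	uint16_t q;
	int rc;

	for (q = 0; q < nb_rxq; q++) {
		rc = rte_eth_rx_queue_info_get(port_id, q, &qinfo);
		if (rc == 0) {
			/* Driver reports the real queue state: mirror it so
			 * the forwarding streams poll the right queues. */
			local_rxq_state[port_id][q] = qinfo.queue_state;
		} else if (rc == -ENOTSUP) {
			/* PMD has no info_get callback: keep the existing
			 * local state so deferred_start still behaves. */
		} else {
			/* e.g. -EINVAL when the port has fewer queues than
			 * nb_rxq, as with the reattached representor port
			 * above ("Failed to get rx queue info"). */
		}
	}
}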
Hi Ferruh,

I missed your email, sorry about that.
Jie already sent a patch and it resolved it for me.

Thanks,
Ali
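For context on the per-stream "disabled" field referenced in the quoted commit message, here is a minimal sketch of the gating it implies; the struct layout and function names are illustrative assumptions, not testpmd's exact code.

#include <stdbool.h>
#include <stdint.h>

/* Illustrative stream descriptor: the real testpmd structure carries many
 * more fields, but the "disabled" flag is the part that matters here. */
struct fwd_stream {
	uint16_t rx_port;
	uint16_t rx_queue;
	uint16_t tx_port;
	uint16_t tx_queue;
	bool disabled;  /* derived from testpmd's local queue state */
};

static void
run_fwd_streams(struct fwd_stream **streams, unsigned int nb_fs,
		void (*pkt_fwd)(struct fwd_stream *fs))
{
	unsigned int i;

	/* Streams whose queues testpmd locally considers stopped are never
	 * polled; if that local state goes stale after "port stop/start all",
	 * a queue the driver has started (queue 7 above) is still skipped. */
	for (i = 0; i < nb_fs; i++)
		if (!streams[i]->disabled)
			pkt_fwd(streams[i]);
}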