On 11/07/2018 04:17 AM, Zhao1, Wei wrote:
Hi, Luca Boccassi

     The purpose of this patch is to reduce the mailbox interrupts from VF to
PF, but some points in this patch seem to need discussion.
First, I do not know why you changed the code of ixgbe_check_mac_link_vf(),
because rte_eth_link_get_nowait() and rte_eth_link_get() call
ixgbe_dev_link_update()->ixgbe_dev_link_update_share()->ixgbevf_check_link()
for a VF, NOT the ixgbe_check_mac_link_vf() changed in your patch!

Second, in ixgbevf_check_link() there is a mailbox message read operation
for the VF,
" if (mbx->ops.read(hw, &in_msg, 1, 0))", that is ixgbe_read_mbx_vf().
This causes an interrupt from VF to PF, which is exactly the point of this
patch and the problem that you want to solve.
So you use the autoneg_wait_to_complete flag to control this mailbox message
read; presumably you will then call rte_eth_link_get_nowait(), which sets
autoneg_wait_to_complete = 0, so that the interrupts from VF to PF are reduced.

But I do not think this patch is necessary, because ixgbevf_check_link()
already has

I think you are right here. This patch dates to before the addition
of the vf argument to ixgbe_dev_link_update_share() and the split of
.link_update between ixgbe and ixgbevf.  At one point, this patch was
especially beneficial if you were running bonding (which tends to make
quite a few link status checks).

So this patch probably hasn't been helping at this point.  I will try
to get some time to locally test this.

"
bool no_pflink_check = wait_to_complete == 0;

        ////////////////////////

        if (no_pflink_check) {
                if (*speed == IXGBE_LINK_SPEED_UNKNOWN)
                        mac->get_link_status = true;
                else
                        mac->get_link_status = false;

                goto out;
        }
"
The comment "for a quick link status checking, wait_to_compelet == 0, skip PF
link status checking" is clear.

That means rte_eth_link_get_nowait() skips this mailbox read and its
interrupt; only rte_eth_link_get() triggers it. So I think all you need to do
is replace rte_eth_link_get() with rte_eth_link_get_nowait() in your
application; that will reduce the VF-to-PF interrupts caused by mailbox reads.
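
To make the argument above concrete, here is a minimal, self-contained model
of the quick path. It is only a sketch: struct model_mbx, model_check_link()
and model_poll() are hypothetical names invented for illustration, not the
real ixgbevf_check_link() or any DPDK API. It assumes, as described above,
that a zero wait_to_complete skips the PF link check and therefore the
mailbox read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the VF mailbox: counts the reads that would
 * each generate VF-to-PF mailbox traffic. Not a real ixgbe structure. */
struct model_mbx {
	unsigned int reads;
};

/* Hypothetical model of the quick path quoted above; this is NOT the
 * real ixgbevf_check_link(). link_reg_up stands for the link state the
 * VF can read locally without involving the PF. */
static bool
model_check_link(struct model_mbx *mbx, bool link_reg_up, int wait_to_complete)
{
	bool no_pflink_check = wait_to_complete == 0;

	assert(mbx != NULL);

	if (!link_reg_up)
		return false;

	/* Quick check: skip the PF link status check, no mailbox read. */
	if (no_pflink_check)
		return true;

	/* Full check: in this model, the read that the PF sees as traffic. */
	mbx->reads++;
	return true;
}

/* Poll the link n times and report how many mailbox reads were issued. */
static unsigned int
model_poll(unsigned int n, int wait_to_complete)
{
	struct model_mbx mbx = { 0 };
	unsigned int i;

	for (i = 0; i < n; i++)
		(void)model_check_link(&mbx, true, wait_to_complete);

	return mbx.reads;
}
```

In this model, model_poll(1000, 0) issues no mailbox reads while
model_poll(1000, 1) issues 1000, which mirrors the difference between polling
with rte_eth_link_get_nowait() and with rte_eth_link_get().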


-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Luca Boccassi
Sent: Wednesday, August 15, 2018 10:15 PM
To: dev@dpdk.org
Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin
<konstantin.anan...@intel.com>; Luca Boccassi <bl...@debian.org>;
sta...@dpdk.org
Subject: [dpdk-dev] [PATCH] net/ixgbe: reduce PF mailbox interrupt rate

We have observed a high rate of NIC PF interrupts when a VNF uses the DPDK
APIs rte_eth_link_get_nowait() and rte_eth_link_get(), as they cause the VF
driver to send many MBOX ACK messages.

With these changes, the interrupt rates go down significantly. Here are some
test results:

Without the patch:

$ egrep 'CPU|ens1f' /proc/interrupts ; sleep 10; egrep 'CPU|ens1f' /proc/interrupts
        CPU0   CPU1   CPU2   CPU3   CPU4   CPU5   CPU6   CPU7   CPU8   CPU9  CPU10  CPU11  CPU12  CPU13  CPU14  CPU15
  34:     88      0      0      0      0     41     30    509      0      0    350     24     88    114    461    562  PCI-MSI 1572864-edge  ens1f0-TxRx-0
  35:     49     24      0      0     65    130     64     29     67      0     10      0      0     46     38    764  PCI-MSI 1572865-edge  ens1f0-TxRx-1
  36:     53      0      0     64     15     85    132     71    108      0     30      0    165    215    303    104  PCI-MSI 1572866-edge  ens1f0-TxRx-2
  37:     46    196      0      0     10     48     62     68     51      0      0      0    103     82     54    192  PCI-MSI 1572867-edge  ens1f0-TxRx-3
  38:    226      0      0      0    159    145    749    265      0      0    202      0  69229    166    450      0  PCI-MSI 1572868-edge  ens1f0
  52:     95    896      0      0      0     18     53      0    494      0      0      0      0    265     79    124  PCI-MSI 1574912-edge  ens1f1-TxRx-0
  53:     50      0     18      0     72     33      0    168    330      0      0      0    141     22     12     65  PCI-MSI 1574913-edge  ens1f1-TxRx-1
  54:     65      0      0      0    239    104    166     49    442      0      0      0    126     26    307      0  PCI-MSI 1574914-edge  ens1f1-TxRx-2
  55:     57      0      0      0    123     35     83     54    157    106      0      0     26     29    312     97  PCI-MSI 1574915-edge  ens1f1-TxRx-3
  56:    232      0  13910      0     16     21      0  54422      0      0      0     24     25      0     78      0  PCI-MSI 1574916-edge  ens1f1
        CPU0   CPU1   CPU2   CPU3   CPU4   CPU5   CPU6   CPU7   CPU8   CPU9  CPU10  CPU11  CPU12  CPU13  CPU14  CPU15
  34:     88      0      0      0      0     41     30    509      0      0    350     24     88    119    461    562  PCI-MSI 1572864-edge  ens1f0-TxRx-0
  35:     49     24      0      0     65    130     64     29     67      0     10      0      0     46     38    771  PCI-MSI 1572865-edge  ens1f0-TxRx-1
  36:     53      0      0     64     15     85    132     71    108      0     30      0    165    215    303    113  PCI-MSI 1572866-edge  ens1f0-TxRx-2
  37:     46    196      0      0     10     48     62     68     56      0      0      0    103     82     54    192  PCI-MSI 1572867-edge  ens1f0-TxRx-3
  38:    226      0      0      0    159    145    749    265      0      0    202      0  71281    166    450      0  PCI-MSI 1572868-edge  ens1f0
  52:     95    896      0      0      0     18     53      0    494      0      0      0      0    265     79    133  PCI-MSI 1574912-edge  ens1f1-TxRx-0
  53:     50      0     18      0     72     33      0    173    330      0      0      0    141     22     12     65  PCI-MSI 1574913-edge  ens1f1-TxRx-1
  54:     65      0      0      0    239    104    166     49    442      0      0      0    126     26    312      0  PCI-MSI 1574914-edge  ens1f1-TxRx-2
  55:     57      0      0      0    123     35     83     59    157    106      0      0     26     29    312     97  PCI-MSI 1574915-edge  ens1f1-TxRx-3
  56:    232      0  15910      0     16     21      0  54422      0      0      0     24     25      0     78      0  PCI-MSI 1574916-edge  ens1f1

During the 10s interval, CPU2 jumped by 2000 interrupts (IRQ 56, ens1f1) and
CPU12 by 2052 interrupts (IRQ 38, ens1f0), for about 200 interrupts/second on
each port. That's on the order of what we expect. I would have guessed 100/s
but perhaps there are two mailbox messages.
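
The per-second figure can be reproduced from the two snapshots. The sketch
below is only illustrative: struct irq_sample and irq_rate_per_sec() are
hypothetical helper names, and the counter values are copied from the
"Without the patch" output above (IRQ 56 on CPU2 and IRQ 38 on CPU12).

```c
#include <assert.h>
#include <stddef.h>

/* One /proc/interrupts counter sampled at the start and end of an interval. */
struct irq_sample {
	unsigned long before;
	unsigned long after;
};

/* Interrupts per second over the interval (integer division).
 * Hypothetical helper, not part of DPDK or the kernel. */
static unsigned long
irq_rate_per_sec(const struct irq_sample *s, unsigned long interval_sec)
{
	assert(s != NULL && interval_sec > 0 && s->after >= s->before);
	return (s->after - s->before) / interval_sec;
}

/* Counter values copied from the two snapshots taken without the patch. */
static const struct irq_sample ens1f1_cpu2  = { 13910, 15910 }; /* IRQ 56 */
static const struct irq_sample ens1f0_cpu12 = { 69229, 71281 }; /* IRQ 38 */
```

With the 10 second sleep used above, irq_rate_per_sec(&ens1f1_cpu2, 10) gives
200 and irq_rate_per_sec(&ens1f0_cpu12, 10) gives 205.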

With the patch:

$ egrep 'CPU|ens1f' /proc/interrupts ; sleep 10; egrep 'CPU|ens1f' /proc/interrupts
        CPU0   CPU1   CPU2   CPU3   CPU4   CPU5   CPU6   CPU7   CPU8   CPU9  CPU10  CPU11  CPU12  CPU13  CPU14  CPU15
  34:     88      0      0      0      0     25     19    177      0      0    350     24     88    100    362    559  PCI-MSI 1572864-edge  ens1f0-TxRx-0
  35:     49     19      0      0     65    130     64     29     67      0     10      0      0     46     38    543  PCI-MSI 1572865-edge  ens1f0-TxRx-1
  36:     53      0      0     64     15     53     85     71    108      0     24      0     85    215    292     31  PCI-MSI 1572866-edge  ens1f0-TxRx-2
  37:     46    196      0      0     10     43     57     39     19      0      0      0     78     69     49    149  PCI-MSI 1572867-edge  ens1f0-TxRx-3
  38:    226      0      0      0    159    145    749    247      0      0    202      0  58250      0    450      0  PCI-MSI 1572868-edge  ens1f0
  52:     95    896      0      0      0     18     53      0    189      0      0      0      0    265     79     25  PCI-MSI 1574912-edge  ens1f1-TxRx-0
  53:     50      0     18      0     72     33      0     90    330      0      0      0    136      5     12      0  PCI-MSI 1574913-edge  ens1f1-TxRx-1
  54:     65      0      0      0     10    104    166     49    442      0      0      0    126     26    226      0  PCI-MSI 1574914-edge  ens1f1-TxRx-2
  55:     57      0      0      0     61     35     83     30    157    101      0      0     26     15    312      0  PCI-MSI 1574915-edge  ens1f1-TxRx-3
  56:    232      0   2062      0     16     21      0  54422      0      0      0     24     25      0     78      0  PCI-MSI 1574916-edge  ens1f1
        CPU0   CPU1   CPU2   CPU3   CPU4   CPU5   CPU6   CPU7   CPU8   CPU9  CPU10  CPU11  CPU12  CPU13  CPU14  CPU15
  34:     88      0      0      0      0     25     19    177      0      0    350     24     88    102    362    562  PCI-MSI 1572864-edge  ens1f0-TxRx-0
  35:     49     19      0      0     65    130     64     29     67      0     10      0      0     46     38    548  PCI-MSI 1572865-edge  ens1f0-TxRx-1
  36:     53      0      0     64     15     53     85     71    108      0     24      0     85    215    292     36  PCI-MSI 1572866-edge  ens1f0-TxRx-2
  37:     46    196      0      0     10     45     57     39     19      0      0      0     78     69     49    152  PCI-MSI 1572867-edge  ens1f0-TxRx-3
  38:    226      0      0      0    159    145    749    247      0      0    202      0  58259      0    450      0  PCI-MSI 1572868-edge  ens1f0
  52:     95    896      0      0      0     18     53      0    194      0      0      0      0    265     79     25  PCI-MSI 1574912-edge  ens1f1-TxRx-0
  53:     50      0     18      0     72     33      0     95    330      0      0      0    136      5     12      0  PCI-MSI 1574913-edge  ens1f1-TxRx-1
  54:     65      0      0      0     10    104    166     49    442      0      0      0    126     26    231      0  PCI-MSI 1574914-edge  ens1f1-TxRx-2
  55:     57      0      0      0     66     35     83     30    157    101      0      0     26     15    312      0  PCI-MSI 1574915-edge  ens1f1-TxRx-3
  56:    232      0   2071      0     16     21      0  54422      0      0      0     24     25      0     78      0  PCI-MSI 1574916-edge  ens1f1

Note the interrupt rate has gone way down. During the 10s interval, the two
mailbox vectors (IRQ 38 on CPU12 and IRQ 56 on CPU2) each saw only 9
interrupts.

Note that this patch was originally provided by Intel directly to AT&T and
Vyatta, but unfortunately I am unable to find records of the exact author.

We have been using this in production for more than a year.

Fixes: af75078fece3 ("first public release")
Cc: sta...@dpdk.org

Signed-off-by: Luca Boccassi <bl...@debian.org>
---
  drivers/net/ixgbe/base/ixgbe_vf.c | 33 ++++++++++++++++---------------
  1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ixgbe/base/ixgbe_vf.c b/drivers/net/ixgbe/base/ixgbe_vf.c
index 5b25a6b4d4..16086670b1 100644
--- a/drivers/net/ixgbe/base/ixgbe_vf.c
+++ b/drivers/net/ixgbe/base/ixgbe_vf.c
@@ -586,7 +586,6 @@ s32 ixgbe_check_mac_link_vf(struct ixgbe_hw *hw, ixgbe_link_speed *speed,
        s32 ret_val = IXGBE_SUCCESS;
        u32 links_reg;
        u32 in_msg = 0;
-       UNREFERENCED_1PARAMETER(autoneg_wait_to_complete);

        /* If we were hit with a reset drop the link */
        if (!mbx->ops.check_for_rst(hw, 0) || !mbx->timeout)
@@ -643,23 +642,25 @@ s32 ixgbe_check_mac_link_vf(struct ixgbe_hw *hw, ixgbe_link_speed *speed,
                *speed = IXGBE_LINK_SPEED_UNKNOWN;
        }

-       /* if the read failed it could just be a mailbox collision, best wait
-        * until we are called again and don't report an error
-        */
-       if (mbx->ops.read(hw, &in_msg, 1, 0))
-               goto out;
+       if (autoneg_wait_to_complete) {
+               /* if the read failed it could just be a mailbox collision, best wait
+                * until we are called again and don't report an error
+                */
+               if (mbx->ops.read(hw, &in_msg, 1, 0))
+                       goto out;

-       if (!(in_msg & IXGBE_VT_MSGTYPE_CTS)) {
-               /* msg is not CTS and is NACK we must have lost CTS status */
-               if (in_msg & IXGBE_VT_MSGTYPE_NACK)
+               if (!(in_msg & IXGBE_VT_MSGTYPE_CTS)) {
+                       /* msg is not CTS and is NACK we must have lost CTS status */
+                       if (in_msg & IXGBE_VT_MSGTYPE_NACK)
+                               ret_val = -1;
+                       goto out;
+               }
+
+               /* the pf is talking, if we timed out in the past we reinit */
+               if (!mbx->timeout) {
                        ret_val = -1;
-               goto out;
-       }
-
-       /* the pf is talking, if we timed out in the past we reinit */
-       if (!mbx->timeout) {
-               ret_val = -1;
-               goto out;
+                       goto out;
+               }
        }

        /* if we passed all the tests above then the link is up and we no
--
2.18.0
