On Wed, Feb 7, 2024 11:13 AM, ferruh.yi...@amd.com wrote:
> On 2/1/2024 3:00 AM, Jiawen Wu wrote:
> > To optimize Rx/Tx burst process, add SSE/NEON vector instructions on
> > x86/arm architecture.
> >
> 
> Do you have any performance improvement number with vector
> implementation, if so can you put it into commit log for record?

On our local x86 platforms, performance already reaches full speed without
the vector path, so we do not have an improvement number for SSE yet.
I will add the test result for arm.

> > @@ -2198,8 +2220,15 @@ txgbe_set_tx_function(struct rte_eth_dev *dev, struct txgbe_tx_queue *txq)
> >  #endif
> >                     txq->tx_free_thresh >= RTE_PMD_TXGBE_TX_MAX_BURST) {
> >             PMD_INIT_LOG(DEBUG, "Using simple tx code path");
> > -           dev->tx_pkt_burst = txgbe_xmit_pkts_simple;
> >             dev->tx_pkt_prepare = NULL;
> > +           if (txq->tx_free_thresh <= RTE_TXGBE_TX_MAX_FREE_BUF_SZ &&
> > +                           (rte_eal_process_type() != RTE_PROC_PRIMARY ||
> >
> 
> Why vector Tx enable only for secondary process?

It is not only for secondary processes. The full condition is:

(rte_eal_process_type() != RTE_PROC_PRIMARY || txgbe_txq_vec_setup(txq) == 0)

This code follows ixgbe, which explains:
"When using multiple processes, the TX function used in all processes
 should be the same, otherwise the secondary processes cannot transmit
 more than tx-ring-size - 1 packets.
 To achieve this, we extract out the code to select the ixgbe TX function
 to be used into a separate function inside the ixgbe driver, and call
 that from a secondary process when it is attaching to an
 already-configured NIC."
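
To spell out how that condition evaluates, here is a minimal standalone
sketch. The enum, the stub, and the helper name are hypothetical stand-ins,
not the DPDK or txgbe definitions, and the tx_free_thresh check from the
patch is left out. It only shows the short-circuit: a non-primary process
passes without calling the setup function, while a primary process passes
only if setup returns 0.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the EAL process type (not the DPDK enum). */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

/* Hypothetical stub for the vector setup call; counts how often it runs. */
static int vec_setup_calls;

static int stub_txq_vec_setup(int result)
{
	vec_setup_calls++;
	return result;
}

/*
 * Same shape as the quoted condition:
 *   process_type != PRIMARY || vec_setup(txq) == 0
 * The right-hand side is evaluated only in the primary process.
 */
static bool vec_tx_allowed(enum proc_type type, int setup_result)
{
	return type != PROC_PRIMARY || stub_txq_vec_setup(setup_result) == 0;
}
```

So the vector Tx path is available to both process types; only the primary
process has to get through the setup call first.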

> > +++ b/drivers/net/txgbe/txgbe_rxtx_vec_neon.c
> > @@ -0,0 +1,604 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2015-2024 Beijing WangXun Technology Co., Ltd.
> > + * Copyright(c) 2010-2015 Intel Corporation
> > + */
> > +
> > +#include <ethdev_driver.h>
> > +#include <rte_malloc.h>
> > +#include <rte_vect.h>
> > +
> > +#include "txgbe_ethdev.h"
> > +#include "txgbe_rxtx.h"
> > +#include "txgbe_rxtx_vec_common.h"
> > +
> > +#pragma GCC diagnostic ignored "-Wcast-qual"
> > +
> 
> Is this pragma really required?

Yes. Without it, compilation produces these warnings:

[1909/2921] Compiling C object drivers/libtmp_rte_net_txgbe.a.p/net_txgbe_txgbe_rxtx_vec_neon.c.o
../drivers/net/txgbe/txgbe_rxtx_vec_neon.c: In function ‘txgbe_rxq_rearm’:
../drivers/net/txgbe/txgbe_rxtx_vec_neon.c:37:15: warning: cast discards ‘volatile’ qualifier from pointer target type [-Wcast-qual]
     vst1q_u64((uint64_t *)&rxdp[i], zero);
               ^
../drivers/net/txgbe/txgbe_rxtx_vec_neon.c:60:13: warning: cast discards ‘volatile’ qualifier from pointer target type [-Wcast-qual]
   vst1q_u64((uint64_t *)rxdp++, dma_addr0);
             ^
../drivers/net/txgbe/txgbe_rxtx_vec_neon.c:65:13: warning: cast discards ‘volatile’ qualifier from pointer target type [-Wcast-qual]
   vst1q_u64((uint64_t *)rxdp++, dma_addr1);
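
For reference, the pattern behind these warnings can be reproduced without
NEON. The sketch below is only an illustration under assumptions: the
descriptor struct and the store helper are hypothetical stand-ins for the
real Rx descriptor and vst1q_u64(). It also scopes the suppression with
GCC's diagnostic push/pop, which is one possible way to limit it to the
casts in question and is not what the patch does (the patch ignores the
warning for the whole file).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte descriptor standing in for the real Rx descriptor. */
struct desc {
	uint64_t lo;
	uint64_t hi;
};

/* Hypothetical stand-in for vst1q_u64(): takes a non-volatile pointer. */
static void store_u64x2(uint64_t *dst, const uint64_t src[2])
{
	memcpy(dst, src, 2 * sizeof(uint64_t));
}

static void clear_desc(volatile struct desc *d)
{
	const uint64_t zero[2] = { 0, 0 };

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wcast-qual"
	/* This cast drops 'volatile', which is what -Wcast-qual reports. */
	store_u64x2((uint64_t *)d, zero);
#pragma GCC diagnostic pop
}
```

Building this with -Wcast-qual and without the pragma lines should give the
same kind of "cast discards 'volatile' qualifier" warning as above.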

