On Wed, May 04, 2016 at 12:09:50PM +0100, Bruce Richardson wrote:
> On Tue, May 03, 2016 at 06:12:07PM +0530, Jerin Jacob wrote:
> > Elements of struct rte_eth_dev used in the fast path.
> > Make struct rte_eth_dev cache aligned to avoid the cases where
> > rte_eth_dev elements share the same cache line with other structures.
> >
> > Signed-off-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
> > ---
> > v2:
> > Remove __rte_cache_aligned from rte_eth_devices and keep
> > it only at struct rte_eth_dev definition as suggested by Bruce
> > http://dpdk.org/dev/patchwork/patch/12328/
> > ---
> >  lib/librte_ether/rte_ethdev.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > index 2757510..48f14d5 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -1615,7 +1615,7 @@ struct rte_eth_dev {
> >  	struct rte_eth_rxtx_callback *pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT];
> >  	uint8_t attached; /**< Flag indicating the port is attached */
> >  	enum rte_eth_dev_type dev_type; /**< Flag indicating the device type */
> > -};
> > +} __rte_cache_aligned;
> >
> >  struct rte_eth_dev_sriov {
> >  	uint8_t active; /**< SRIOV is active with 16, 32 or 64
> >  	                 pools */
> > --
>
> Hi Jerin,
Hi Bruce,

> have you seen a performance degradation due to ethdev elements sharing a
> cache line?

No, not because of cache line sharing. I do see a performance degradation in
the PMD in my setup, but it is caused by independent changes (~<100 kpps),
not by this patch. That is why I thought it made sense to cache-align
structures wherever it is justified, so that unrelated changes do not affect
PMD performance; this patch was an initiative in that direction.

> I ask because, surprisingly for me, I actually see a performance regression
> when I apply the above patch. It's not a big change - perf reduction of <1% -
> but still noticeable across multiple runs using testpmd. I'm using two 1x40G
> NICs using i40e driver, and I see ~100kpps less traffic per port after
> applying the patch. [CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz]

This particular patch does not show any performance degradation in my setup.
CPU: ThunderX

>
> Testpmd cmd and output shown below.
>
> Regards,
> /Bruce
>
> $ sudo ./x86_64-native-linuxapp-gcc/app/testpmd -c C0000 -n 4 -- --rxd=512 --txd=512 --numa
> EAL: Detected 36 lcore(s)
> EAL: Probing VFIO support...
> PMD: nfp_net_pmd_init(): librte_pmd_nfp_net version 0.1
>
> EAL: PCI device 0000:01:00.0 on NUMA socket 0
> EAL: probe driver: 8086:1521 rte_igb_pmd
> EAL: PCI device 0000:01:00.1 on NUMA socket 0
> EAL: probe driver: 8086:1521 rte_igb_pmd
> EAL: PCI device 0000:05:00.0 on NUMA socket 0
> EAL: probe driver: 8086:154a rte_ixgbe_pmd
> EAL: PCI device 0000:05:00.1 on NUMA socket 0
> EAL: probe driver: 8086:154a rte_ixgbe_pmd
> EAL: PCI device 0000:08:00.0 on NUMA socket 0
> EAL: probe driver: 8086:154a rte_ixgbe_pmd
> EAL: PCI device 0000:08:00.1 on NUMA socket 0
> EAL: probe driver: 8086:154a rte_ixgbe_pmd
> EAL: PCI device 0000:81:00.0 on NUMA socket 1
> EAL: probe driver: 8086:1584 rte_i40e_pmd
> PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 80002281
> EAL: PCI device 0000:88:00.0 on NUMA socket 1
> EAL: probe driver: 8086:1584 rte_i40e_pmd
> PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 80002281
> Configuring Port 0 (socket 1)
> Port 0: 68:05:CA:27:D4:4E
> Configuring Port 1 (socket 1)
> Port 1: 68:05:CA:27:D2:0A
> Checking link statuses...
> Port 0 Link Up - speed 40000 Mbps - full-duplex
> Port 1 Link Up - speed 40000 Mbps - full-duplex
> Done
> No commandline core given, start packet forwarding
> io packet forwarding - CRC stripping disabled - packets/burst=32
> nb forwarding cores=1 - nb forwarding ports=2
> RX queues=1 - RX desc=512 - RX free threshold=32
> RX threshold registers: pthresh=8 hthresh=8 wthresh=0
> TX queues=1 - TX desc=512 - TX free threshold=32
> TX threshold registers: pthresh=32 hthresh=0 wthresh=0
> TX RS bit threshold=32 - TXQ flags=0xf01
> Press enter to exit
>
> Telling cores to stop...
> Waiting for lcores to finish...
>
>   ---------------------- Forward statistics for port 0  ----------------------
>   RX-packets: 1940564672    RX-dropped: 1456035742    RX-total: 3396600414
>   TX-packets: 1940564736    TX-dropped: 0             TX-total: 1940564736
>   ----------------------------------------------------------------------------
>
>   ---------------------- Forward statistics for port 1  ----------------------
>   RX-packets: 1940564671    RX-dropped: 1456036082    RX-total: 3396600753
>   TX-packets: 1940564736    TX-dropped: 0             TX-total: 1940564736
>   ----------------------------------------------------------------------------
>
>   +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
>   RX-packets: 3881129343    RX-dropped: 2912071824    RX-total: 6793201167
>   TX-packets: 3881129472    TX-dropped: 0             TX-total: 3881129472
>   ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> Done.
>
> Shutting down port 0...
> Stopping ports...
> Done
> Closing ports...
> Done
>
> Shutting down port 1...
> Stopping ports...
> Done
> Closing ports...
> Done
>
> Bye...
>
>
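
For anyone following the thread who has not used __rte_cache_aligned, the
standalone sketch below illustrates the effect the patch relies on. It is an
illustration under stated assumptions, not DPDK code: the member names only
echo the real fast-path pointers in struct rte_eth_dev, the layout is
simplified, and the hard-coded 64-byte line size stands in for
RTE_CACHE_LINE_SIZE (64 on both the Xeon and ThunderX targets discussed
above).

/* cache_align_sketch.c - illustration only, not DPDK code.
 * Shows how an aligned attribute placed after the closing brace of a
 * struct (as "} __rte_cache_aligned;" does in the patch) pads the type
 * so that adjacent instances never share a cache line. */
#include <stdio.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* assumption: matches RTE_CACHE_LINE_SIZE here */

/* Unaligned: 24 bytes on a 64-bit target, so neighbouring instances (or a
 * neighbouring global) can sit in the same 64-byte cache line. */
struct dev_plain {
	void *rx_pkt_burst;   /* hot: read on every rx burst */
	void *tx_pkt_burst;   /* hot: read on every tx burst */
	uint64_t stats;       /* written frequently, possibly by another core */
};

/* Aligned: sizeof() is padded up to 64 and every instance starts on its own
 * cache line, so stores to a neighbouring instance no longer invalidate the
 * line holding the hot pointers. */
struct dev_aligned {
	void *rx_pkt_burst;
	void *tx_pkt_burst;
	uint64_t stats;
} __attribute__((aligned(CACHE_LINE_SIZE)));

int main(void)
{
	struct dev_plain   p[2];
	struct dev_aligned a[2];

	/* Element stride shows whether two instances can straddle one line. */
	printf("plain:   size %zu, element stride %zu\n",
	       sizeof(p[0]), (size_t)((uintptr_t)&p[1] - (uintptr_t)&p[0]));
	printf("aligned: size %zu, element stride %zu\n",
	       sizeof(a[0]), (size_t)((uintptr_t)&a[1] - (uintptr_t)&a[0]));
	return 0;
}

On a 64-bit build this prints a size and stride of 24 for the plain struct
and 64 for the aligned one. The padding buys isolation at the cost of memory,
and it also shifts whatever data ends up adjacent, which is one plausible
(but unconfirmed) reason the same change can measure slightly differently on
the two setups reported above.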