> -----Original Message-----
> From: dev <dev-boun...@dpdk.org> On Behalf Of Marcin Zapolski
> Sent: Tuesday, July 30, 2019 6:20 PM
> To: dev@dpdk.org
> Cc: Marcin Zapolski <marcinx.a.zapol...@intel.com>
> Subject: [dpdk-dev] [RFC 19.11 1/2] ethdev: make DPDK core functions non-
> inline
>
> Make rte_eth_rx_burst, rte_eth_tx_burst and other static inline ethdev
> functions not inline. They are referencing DPDK internal structures and
> inlining forces those structures to be exposed to user applications.
>
> In internal testing with i40e NICs a performance drop of about 2% was
> observed with testpmd.

I tested on two classes of arm64 machines (high-end and low-end): one shows
a 1.4% drop and the other a 3.6% drop.

I second the goal of not exposing internal data structures, to avoid ABI
breaks. IMO, this patch has a performance issue because it fixes the problem
in the simplest way. It is not worth having function call overhead just to
call the driver function.

Some thoughts below on reducing the performance impact without exposing
internal structures. I also think we need to follow a similar mechanism for
cryptodev, eventdev, rawdev, etc., so a common scheme that addresses this
would be useful.

>
> Signed-off-by: Marcin Zapolski <marcinx.a.zapol...@intel.com>
> ---
>  lib/librte_ethdev/rte_ethdev.c           | 168 +++++++++++++++++++++++
>  lib/librte_ethdev/rte_ethdev.h           | 166 ++--------------------
>  lib/librte_ethdev/rte_ethdev_version.map |  12 ++
>  3 files changed, 195 insertions(+), 151 deletions(-)
>
> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index 17d183e1f..31432a956 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -749,6 +749,174 @@ rte_eth_dev_get_sec_ctx(uint16_t port_id)
>  	return rte_eth_devices[port_id].security_ctx;
>  }
>
> +uint16_t
> +rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
> +		 struct rte_mbuf **rx_pkts, const uint16_t nb_pkts) {
> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +	uint16_t nb_rx;

I think we only need to store three function pointers per port. IMO, let's
have a structure for that, i.e. split the struct rte_eth_dev content into
public and private parts. I think only the following elements of
rte_eth_dev need to be public:

	struct rte_eth_dev_fns {
		eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */
		eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */
		eth_tx_prep_t tx_pkt_prepare; /**< Pointer to PMD transmit prepare function. */
	};

	struct rte_eth_dev {
		struct rte_eth_dev_fns fns; /* keeping this as the first member
					     * allows a type cast between
					     * struct rte_eth_dev_fns and
					     * struct rte_eth_dev */
		/* private ones */
	};

> +
> +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> +	RTE_FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);
> +
> +	if (queue_id >= dev->data->nb_rx_queues) {
> +		RTE_ETHDEV_LOG(ERR, "Invalid RX queue_id=%u\n",
> queue_id);
> +		return 0;
> +	}
> +#endif
> +	nb_rx = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],

I think if we make the driver functions take the device, i.e.

	(*dev->rx_pkt_burst)(dev, rx_pkts, nb_pkts)

then there is no need to dereference dev->data from the inline function.
Let's expose a helper function from the driver layer and let the PMD use it
to access the queue memory. There is no need to expose that helper to the
user application.

> +			rx_pkts, nb_pkts);
> +
> +#ifdef RTE_ETHDEV_RXTX_CALLBACKS

# If we have an ethdev driver helper function for this that the PMD can
  call as well, there is no need to call it from this inline function.
# I think it makes sense to handle this as RX_OFFLOAD_FLAGS, so that it is
  included in the fast path only when the app needs it.
# Lastly, if we are not exposing rte_eth_dev to the application, then I
  think we can remove rte_ from the name.

> +	if (unlikely(dev->post_rx_burst_cbs[queue_id] != NULL)) {
> +		struct rte_eth_rxtx_callback *cb =
> +				dev->post_rx_burst_cbs[queue_id];
> +
> +		do {
> +			nb_rx = cb->fn.rx(port_id, queue_id, rx_pkts, nb_rx,
> +						nb_pkts, cb->param);
> +			cb = cb->next;
> +		} while (cb != NULL);
> +	}
> +#endif
> +
> +	return nb_rx;
> +}
> +
>
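
To make the public/private split suggested above more concrete, here is a
rough, self-contained sketch. It is not DPDK code: the names eth_dev_fns,
eth_dev, eth_dev_rx_queue(), toy_rx_burst() and eth_rx_burst() are all
hypothetical, only the rx pointer is shown, the "queue" is a toy counter,
and passing queue_id to the driver function is my assumption (the call
shown above omits it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct rte_mbuf;      /* opaque in this sketch */
struct eth_dev_fns;   /* hypothetical public part */

/* Assumed driver signature: takes the device instead of the queue pointer. */
typedef uint16_t (*eth_rx_burst_t)(struct eth_dev_fns *fns, uint16_t queue_id,
				   struct rte_mbuf **rx_pkts, uint16_t nb_pkts);

/* Public part: the only layout an application-facing header would expose. */
struct eth_dev_fns {
	eth_rx_burst_t rx_pkt_burst;
};

/* Private part: visible to the ethdev library and PMDs only. */
struct eth_dev {
	struct eth_dev_fns fns;   /* must stay the first member */
	void **rx_queues;
	uint16_t nb_rx_queues;
};

/* The cast below is only valid while fns sits at offset 0. */
_Static_assert(offsetof(struct eth_dev, fns) == 0, "fns must be first");

/* Hypothetical driver-layer helper: PMDs use it to reach queue memory. */
static inline void *
eth_dev_rx_queue(struct eth_dev_fns *fns, uint16_t queue_id)
{
	struct eth_dev *dev = (struct eth_dev *)fns;

	return queue_id < dev->nb_rx_queues ? dev->rx_queues[queue_id] : NULL;
}

/* Toy PMD rx function: the queue is just a count of pending packets. */
static uint16_t
toy_rx_burst(struct eth_dev_fns *fns, uint16_t queue_id,
	     struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
{
	uint16_t *pending = eth_dev_rx_queue(fns, queue_id);
	uint16_t n;

	(void)rx_pkts;   /* no real mbufs in this sketch */
	if (pending == NULL)
		return 0;
	n = *pending < nb_pkts ? *pending : nb_pkts;
	*pending = (uint16_t)(*pending - n);
	return n;
}

/* Application-facing inline wrapper: touches the public part only. */
static inline uint16_t
eth_rx_burst(struct eth_dev_fns *fns, uint16_t queue_id,
	     struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
{
	assert(fns != NULL && fns->rx_pkt_burst != NULL);
	return fns->rx_pkt_burst(fns, queue_id, rx_pkts, nb_pkts);
}
```

The point of the sketch is that the inline wrapper costs one indirect call
and never looks at rx_queues or nb_rx_queues, so those fields can change
without affecting applications.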