> > > Add support to the ixgbe driver for the API rte_eth_tx_done_cleanup to > > > force free consumed buffers on Tx ring. > > > > > > Signed-off-by: Chenxu Di <chenxux...@intel.com> > > > --- > > > drivers/net/ixgbe/ixgbe_ethdev.c | 2 + > > > drivers/net/ixgbe/ixgbe_rxtx.c | 116 +++++++++++++++++++++++++++++++ > > > drivers/net/ixgbe/ixgbe_rxtx.h | 2 + > > > 3 files changed, 120 insertions(+) > > > > > > diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c > > > b/drivers/net/ixgbe/ixgbe_ethdev.c > > > index 2c6fd0f13..0091405db 100644 > > > --- a/drivers/net/ixgbe/ixgbe_ethdev.c > > > +++ b/drivers/net/ixgbe/ixgbe_ethdev.c > > > @@ -601,6 +601,7 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops > > > = { .udp_tunnel_port_add = ixgbe_dev_udp_tunnel_port_add, > > > .udp_tunnel_port_del = ixgbe_dev_udp_tunnel_port_del, > > > .tm_ops_get = ixgbe_tm_ops_get, > > > +.tx_done_cleanup = ixgbe_tx_done_cleanup, > > > > Don't see how we can have one tx_done_cleanup() for different tx functions? > > Vector and scalar TX path use different format for sw_ring[] entries. > > Also offload and simile TX paths use different method to track used/free > > descriptors, and use different functions to free them: > > offload uses tx_entry next_id, last_id plus txq. last_desc_cleaned, while > > simple > > TX paths use tx_next_dd. > > > > This patches will be not include function for Vector, and I will update my > code to > Make it work for offload and simple . > > > > > }; > > > > > > /* > > > @@ -649,6 +650,7 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops > > = { > > > .reta_query = ixgbe_dev_rss_reta_query, > > > .rss_hash_update = ixgbe_dev_rss_hash_update, > > > .rss_hash_conf_get = ixgbe_dev_rss_hash_conf_get, > > > +.tx_done_cleanup = ixgbe_tx_done_cleanup, > > > }; > > > > > > /* store statistics names and its offset in stats structure */ diff > > > --git a/drivers/net/ixgbe/ixgbe_rxtx.c > > > b/drivers/net/ixgbe/ixgbe_rxtx.c index fa572d184..520b9c756 100644 > > > --- a/drivers/net/ixgbe/ixgbe_rxtx.c > > > +++ b/drivers/net/ixgbe/ixgbe_rxtx.c > > > @@ -2306,6 +2306,122 @@ ixgbe_tx_queue_release_mbufs(struct > > > ixgbe_tx_queue *txq) } } > > > > > > +int ixgbe_tx_done_cleanup(void *q, uint32_t free_cnt) > > > > That seems to work only for offload(full) TX path (ixgbe_xmit_pkts). > > Simple(fast) path seems not covered by this function. > > > > Same as above > > > > +{ > > > +struct ixgbe_tx_queue *txq = (struct ixgbe_tx_queue *)q; struct > > > +ixgbe_tx_entry *sw_ring; volatile union ixgbe_adv_tx_desc *txr; > > > +uint16_t tx_first; /* First segment analyzed. */ > > > +uint16_t tx_id; /* Current segment being processed. */ > > > +uint16_t tx_last; /* Last segment in the current packet. */ uint16_t > > > +tx_next; /* First segment of the next packet. */ int count; > > > + > > > +if (txq == NULL) > > > +return -ENODEV; > > > + > > > +count = 0; > > > +sw_ring = txq->sw_ring; > > > +txr = txq->tx_ring; > > > + > > > +/* > > > + * tx_tail is the last sent packet on the sw_ring. Goto the end > > > + * of that packet (the last segment in the packet chain) and > > > + * then the next segment will be the start of the oldest segment > > > + * in the sw_ring. > > > > Not sure I understand the sentence above. > > tx_tail is the value of TDT HW register (most recently armed by SW TD). > > last_id is the index of last descriptor for multi-seg packet. > > next_id is just the index of next descriptor in HW TD ring. > > How do you conclude that it will be the ' oldest segment in the sw_ring'? > > > > The tx_tail is the last sent packet on the sw_ring. While the xmit_cleanup or > Tx_free_bufs will be call when the nb_tx_free < tx_free_thresh . > So the sw_ring[tx_tail].next_id must be the begin of mbufs which are not used > or > Already freed . then begin the loop until the mbuf is used and begin to free > them. > > > > > Another question why do you need to write your own functions? > > Why can't you reuse existing ixgbe_xmit_cleanup() for full(offload) path and > > ixgbe_tx_free_bufs() for simple path? > > Yes, ixgbe_xmit_cleanup() doesn't free mbufs, but at least it could be > > used to > > determine finished TX descriptors. > > Based on that you can you can free appropriate sw_ring[] entries. > > > > The reason why I don't reuse existing function is that they all free several > mbufs > While the free_cnt of the API rte_eth_tx_done_cleanup() is the number of > packets. > It also need to be done that check which mbuffs are from the same packet.
At first, I don't see anything bad if tx_done_cleanup() will free only some segments from the packet. As long as it is safe - there is no problem with that. I think rte_eth_tx_done_cleanup() operates on mbuf, not packet quantities. But in our case I think it doesn't matter, as ixgbe_xmit_cleanup() mark TXDs as free only when HW is done with all TXDs for that packet. As long as there is a way to reuse existing code and avoid duplication (without introducing any degradation) - we should use it. And I think there is a very good opportunity here to reuse existing ixgbe_xmit_cleanup() for tx_done_cleanup() implementation. Moreover because your code doesn't follow ixgbe_xmit_pkts()/ixgbe_xmit_cleanup() logic and infrastructure, it introduces unnecessary scans over TXD ring, and in some cases doesn't work as expected: + while (1) { + tx_last = sw_ring[tx_id].last_id; + + if (sw_ring[tx_last].mbuf) { + if (txr[tx_last].wb.status & + IXGBE_TXD_STAT_DD) { ... + } else { + /* + * mbuf still in use, nothing left to + * free. + */ + break; It is not correct to expect that IXGBE_TXD_STAT_DD will be set on last TXD for *every* packet. We set IXGBE_TXD_CMD_RS bit only on threshold packet last descriptor. Plus ixgbe_xmit_cleanup() can cleanup TXD wb.status. So I strongly recommend to reuse ixgbe_xmit_cleanup() here. It would be much less error prone and will help to avoid code duplication. Konstantin > > > > >This is the first packet that will be > > > + * attempted to be freed. > > > + */ > > > + > > > +/* Get last segment in most recently added packet. */ tx_last = > > > +sw_ring[txq->tx_tail].last_id; > > > + > > > +/* Get the next segment, which is the oldest segment in ring. */ > > > +tx_first = sw_ring[tx_last].next_id; > > > + > > > +/* Set the current index to the first. */ tx_id = tx_first; > > > + > > > +/* > > > + * Loop through each packet. For each packet, verify that an > > > + * mbuf exists and that the last segment is free. If so, free > > > + * it and move on. > > > + */ > > > +while (1) { > > > +tx_last = sw_ring[tx_id].last_id; > > > + > > > +if (sw_ring[tx_last].mbuf) { > > > +if (!(txr[tx_last].wb.status & > > > +IXGBE_TXD_STAT_DD)) > > > +break; > > > + > > > +/* Get the start of the next packet. */ tx_next = > > > +sw_ring[tx_last].next_id; > > > + > > > +/* > > > + * Loop through all segments in a > > > + * packet. > > > + */ > > > +do { > > > +rte_pktmbuf_free_seg(sw_ring[tx_id].mbuf); > > > +sw_ring[tx_id].mbuf = NULL; > > > +sw_ring[tx_id].last_id = tx_id; > > > + > > > +/* Move to next segment. */ > > > +tx_id = sw_ring[tx_id].next_id; > > > + > > > +} while (tx_id != tx_next); > > > + > > > +/* > > > + * Increment the number of packets > > > + * freed. > > > + */ > > > +count++; > > > + > > > +if (unlikely(count == (int)free_cnt)) break; } else { > > > +/* > > > + * There are multiple reasons to be here: > > > + * 1) All the packets on the ring have been > > > + * freed - tx_id is equal to tx_first > > > + * and some packets have been freed. > > > + * - Done, exit > > > + * 2) Interfaces has not sent a rings worth of > > > + * packets yet, so the segment after tail is > > > + * still empty. Or a previous call to this > > > + * function freed some of the segments but > > > + * not all so there is a hole in the list. > > > + * Hopefully this is a rare case. > > > + * - Walk the list and find the next mbuf. If > > > + * there isn't one, then done. > > > + */ > > > +if (likely(tx_id == tx_first && count != 0)) break; > > > + > > > +/* > > > + * Walk the list and find the next mbuf, if any. > > > + */ > > > +do { > > > +/* Move to next segment. */ > > > +tx_id = sw_ring[tx_id].next_id; > > > + > > > +if (sw_ring[tx_id].mbuf) > > > +break; > > > + > > > +} while (tx_id != tx_first); > > > + > > > +/* > > > + * Determine why previous loop bailed. If there > > > + * is not an mbuf, done. > > > + */ > > > +if (sw_ring[tx_id].mbuf == NULL) > > > +break; > > > +} > > > +} > > > + > > > +return count; > > > +} > > > + > > > static void __attribute__((cold)) > > > ixgbe_tx_free_swring(struct ixgbe_tx_queue *txq) { diff --git > > > a/drivers/net/ixgbe/ixgbe_rxtx.h b/drivers/net/ixgbe/ixgbe_rxtx.h > > > index 505d344b9..2c3770af6 100644 > > > --- a/drivers/net/ixgbe/ixgbe_rxtx.h > > > +++ b/drivers/net/ixgbe/ixgbe_rxtx.h > > > @@ -285,6 +285,8 @@ int ixgbe_rx_vec_dev_conf_condition_check(struct > > > rte_eth_dev *dev); int ixgbe_rxq_vec_setup(struct ixgbe_rx_queue > > > *rxq); void ixgbe_rx_queue_release_mbufs_vec(struct ixgbe_rx_queue > > > *rxq); > > > > > > +int ixgbe_tx_done_cleanup(void *txq, uint32_t free_cnt); > > > + > > > extern const uint32_t ptype_table[IXGBE_PACKET_TYPE_MAX]; > > > extern const uint32_t ptype_table_tn[IXGBE_PACKET_TYPE_TN_MAX]; > > > > > > -- > > > 2.17.1 > > >