On Tue, Dec 20, 2016 at 12:17:10PM +0000, Ananyev, Konstantin wrote:
>
>
> > -----Original Message-----
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Tuesday, December 20, 2016 11:28 AM
> > To: Billy McFall <bmcf...@redhat.com>
> > Cc: thomas.monja...@6wind.com; Lu, Wenzhuo <wenzhuo...@intel.com>;
> > dev@dpdk.org; Stephen Hemminger <step...@networkplumber.org>
> > Subject: Re: [dpdk-dev] [PATCH 1/3] ethdev: New API to free consumed buffers in TX ring
> >
> > Hi Billy,
> >
> > On Fri, Dec 16, 2016 at 07:48:49AM -0500, Billy McFall wrote:
> > > Add a new API to force free consumed buffers on TX ring. API will return
> > > the number of packets freed (0-n) or error code if feature not supported
> > > (-ENOTSUP) or input invalid (-ENODEV).
> > >
> > > Because rte_eth_tx_buffer() may be used, and mbufs may still be held
> > > in local buffer, the API also accepts *buffer and *sent. Before
> > > attempting to free, rte_eth_tx_buffer_flush() is called to make sure
> > > all mbufs are sent to Tx ring. rte_eth_tx_buffer_flush() is called even
> > > if threshold is not met.
> > >
> > > Signed-off-by: Billy McFall <bmcf...@redhat.com>
> > > ---
> > >  lib/librte_ether/rte_ethdev.h | 56 +++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 56 insertions(+)
> > >
> > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > index 9678179..e3f2be4 100644
> > > --- a/lib/librte_ether/rte_ethdev.h
> > > +++ b/lib/librte_ether/rte_ethdev.h
> > > @@ -1150,6 +1150,9 @@ typedef uint32_t (*eth_rx_queue_count_t)(struct rte_eth_dev *dev,
> > >  typedef int (*eth_rx_descriptor_done_t)(void *rxq, uint16_t offset);
> > >  /**< @internal Check DD bit of specific RX descriptor */
> > >
> > > +typedef int (*eth_tx_done_cleanup_t)(void *txq, uint32_t free_cnt);
> > > +/**< @internal Force mbufs to be from TX ring. */
> > > +
> > >  typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
> > >  	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);
> > >
> > > @@ -1467,6 +1470,7 @@ struct eth_dev_ops {
> > >  	eth_rx_disable_intr_t rx_queue_intr_disable;
> > >  	eth_tx_queue_setup_t tx_queue_setup;/**< Set up device TX queue.*/
> > >  	eth_queue_release_t tx_queue_release;/**< Release TX queue.*/
> > > +	eth_tx_done_cleanup_t tx_done_cleanup;/**< Free tx ring mbufs */
> > >  	eth_dev_led_on_t dev_led_on; /**< Turn on LED. */
> > >  	eth_dev_led_off_t dev_led_off; /**< Turn off LED. */
> > >  	flow_ctrl_get_t flow_ctrl_get; /**< Get flow control. */
> > > @@ -2943,6 +2947,58 @@ rte_eth_tx_buffer(uint8_t port_id, uint16_t queue_id,
> > >  }
> > >
> > >  /**
> > > + * Request the driver to free mbufs currently cached by the driver. The
> > > + * driver will only free the mbuf if it is no longer in use.
> > > + *
> > > + * @param port_id
> > > + *   The port identifier of the Ethernet device.
> > > + * @param queue_id
> > > + *   The index of the transmit queue through which output packets must be
> > > + *   sent.
> > > + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> > > + *   to rte_eth_dev_configure().
> > > + * @param free_cnt
> > > + *   Maximum number of packets to free. Use 0 to indicate all possible packets
> > > + *   should be freed. Note that a packet may be using multiple mbufs.
> > > + * @param buffer
> > > + *   Buffer used to collect packets to be sent. If provided, the buffer will
> > > + *   be flushed, even if the current length is less than buffer->size. Pass NULL
> > > + *   if buffer has already been flushed.
> > > + * @param sent
> > > + *   Pointer to return number of packets sent if buffer has packets to be sent.
> > > + *   If *buffer is supplied, *sent must also be supplied.
> > > + * @return
> > > + *   Failure: < 0
> > > + *     -ENODEV: Invalid interface
> > > + *     -ENOTSUP: Driver does not support function
> > > + *   Success: >= 0
> > > + *     0-n: Number of packets freed. More packets may still remain in ring that
> > > + *     are in use.
> > > + */
> > > +
> > > +static inline int
> > > +rte_eth_tx_done_cleanup(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt,
> > > +		struct rte_eth_dev_tx_buffer *buffer, uint16_t *sent)
> > > +{
> > > +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> > > +
> > > +	/* Validate Input Data. Bail if not valid or not supported. */
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > +	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_done_cleanup, -ENOTSUP);
> > > +
> > > +	/*
> > > +	 * If transmit buffer is provided and there are still packets to be
> > > +	 * sent, then send them before attempting to free pending mbufs.
> > > +	 */
> > > +	if (buffer && sent)
> > > +		*sent = rte_eth_tx_buffer_flush(port_id, queue_id, buffer);
> > > +
> > > +	/* Call driver to free pending mbufs. */
> > > +	return (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
> > > +			free_cnt);
> > > +}
> > > +
> > > +/**
> > >   * Configure a callback for buffered packets which cannot be sent
> > >   *
> > >   * Register a specific callback to be called when an attempt is made to send
> >
> > Just a thought to follow-up on Stephen's comment to further simplify this
> > API, how about not adding any new eth_dev_ops but instead defining what
> > should happen during an empty TX burst call (tx_burst() with 0 packets).
> >
> > Several PMDs already have a check for this scenario and start by cleaning up
> > completed packets anyway, they effectively partially implement this
> > definition for free already.
>
> Many PMDs start by cleaning up only when number of free entries
> drop below some point.
> Also in that case the author would have to modify (and test) all existing TX
> routines.
> So I think a separate API call seems more plausible.

Not necessarily. As I understand it, this API in its current form only
suggests that a PMD should release a few mbufs from a queue if possible,
without any guarantee; PMDs are not forced to comply. I think the threshold
you mention is a valid reason not to release them, and it would not require
any change to existing tx_burst() implementations in the meantime (only to
the documentation). This threshold could also be bypassed rather painlessly
in the "if (unlikely(nb_pkts == 0))" case that all PMDs already check for in
one way or another.

> Though I agree with the previous comment from Stephen that the last two
> parameters are redundant and would just overcomplicate things.
> Konstantin
>
> >
> > The main difference with this API would be that you wouldn't know how many
> > mbufs were freed and wouldn't collect them into an array. However most
> > applications have one mbuf pool and/or know where they come from, so they
> > can just query the pool or attempt to re-allocate from it after doing empty
> > bursts in case of starvation.
> >
> > [1] http://dpdk.org/ml/archives/dev/2016-December/052469.html

--
Adrien Mazarguil
6WIND
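
To make the empty-burst idea above more concrete, here is a rough,
self-contained sketch in plain C. It is not DPDK code and not taken from any
existing PMD: the structure, the functions (toy_txq, toy_tx_burst() and so
on) and the constants RING_SIZE and FREE_THRESHOLD are all hypothetical,
and mbufs are reduced to simple counters. It only illustrates a TX routine
that normally cleans up lazily, once free descriptors drop below a threshold,
but treats a burst of 0 packets as a hint to clean up right away.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE      8u /* hypothetical TX ring size */
#define FREE_THRESHOLD 2u /* hypothetical: clean only below this many free slots */

/* Toy TX queue; every name here is illustrative, not a DPDK structure. */
struct toy_txq {
	uint16_t nb_free;    /* descriptors available for new packets */
	uint16_t nb_done;    /* descriptors the "hardware" has finished with */
	uint32_t pool_avail; /* mbufs currently available in the toy pool */
};

/* Start with an empty ring and a pool as large as the ring. */
static struct toy_txq
toy_txq_init(void)
{
	struct toy_txq q = { RING_SIZE, 0, RING_SIZE };

	return q;
}

/* Stand-in for hardware: mark up to n in-flight descriptors as completed. */
static void
toy_hw_complete(struct toy_txq *q, uint16_t n)
{
	uint16_t in_flight = (uint16_t)(RING_SIZE - q->nb_free - q->nb_done);

	if (n > in_flight)
		n = in_flight;
	q->nb_done = (uint16_t)(q->nb_done + n);
}

/* Return completed descriptors to the ring and their mbufs to the pool. */
static uint16_t
toy_tx_cleanup(struct toy_txq *q)
{
	uint16_t freed = q->nb_done;

	q->nb_done = 0;
	q->nb_free = (uint16_t)(q->nb_free + freed);
	q->pool_avail += freed;
	assert(q->nb_free <= RING_SIZE);
	return freed;
}

/* Toy TX burst: returns the number of packets queued for transmission. */
static uint16_t
toy_tx_burst(struct toy_txq *q, uint16_t nb_pkts)
{
	/* Empty burst: treat it as a cleanup hint and bypass the threshold. */
	if (nb_pkts == 0) {
		toy_tx_cleanup(q);
		return 0;
	}
	/* Normal path: clean up lazily, only when running low on descriptors. */
	if (q->nb_free < FREE_THRESHOLD)
		toy_tx_cleanup(q);
	if (nb_pkts > q->nb_free)
		nb_pkts = q->nb_free;
	q->nb_free = (uint16_t)(q->nb_free - nb_pkts);
	q->pool_avail -= nb_pkts; /* these mbufs are now held by the ring */
	return nb_pkts;
}
```

In this toy model an application that runs short of mbufs would call
toy_tx_burst(&q, 0) and then look at pool_avail (or simply retry its
allocation), which corresponds to "query the pool or attempt to re-allocate
from it" in the quoted text. The trade-off is the same as discussed there:
the caller does not learn how many mbufs were freed.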