)
On Thu, Feb 3, 2022 at 9:31 PM Ferruh Yigit <ferruh.yi...@intel.com> wrote: > > On 1/31/2022 6:08 PM, jer...@marvell.com wrote: > > From: Jerin Jacob <jer...@marvell.com> > > > > Based on device support and use-case need, there are two different ways > > to enable PFC. The first case is the port level PFC configuration, in > > this case, rte_eth_dev_priority_flow_ctrl_set() API shall be used to > > configure the PFC, and PFC frames will be generated using based on VLAN > > TC value. > > > > The second case is the queue level PFC configuration, in this > > case, Any packet field content can be used to steer the packet to the > > specific queue using rte_flow or RSS and then use > > rte_eth_dev_priority_flow_ctrl_queue_configure() to configure the > > TC mapping on each queue. > > Based on congestion selected on the specific queue, configured TC > > shall be used to generate PFC frames. > > > > Hi Jerin, Sunil, > > Please find below minor comments, mostly syntax issues. > > > Signed-off-by: Jerin Jacob <jer...@marvell.com> > > Signed-off-by: Sunil Kumar Kori <sk...@marvell.com> > > --- > > > > v2..v1: > > - Introduce rte_eth_dev_priority_flow_ctrl_queue_info_get() to > > avoid updates to rte_eth_dev_info > > - Removed devtools/libabigail.abignore changes > > - Address the comment from Ferruh in > > http://patches.dpdk.org/project/dpdk/patch/20220113102718.3167282-1-jer...@marvell.com/ > > > > doc/guides/nics/features.rst | 7 +- > > doc/guides/rel_notes/release_22_03.rst | 6 ++ > > lib/ethdev/ethdev_driver.h | 12 ++- > > lib/ethdev/rte_ethdev.c | 132 +++++++++++++++++++++++++ > > lib/ethdev/rte_ethdev.h | 89 +++++++++++++++++ > > lib/ethdev/version.map | 4 + > > 6 files changed, 247 insertions(+), 3 deletions(-) > > > > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst > > index 27be2d2576..1cacdc883a 100644 > > --- a/doc/guides/nics/features.rst > > +++ b/doc/guides/nics/features.rst > > @@ -379,9 +379,12 @@ Flow control > > Supports configuring link flow control. > > > > * **[implements] eth_dev_ops**: ``flow_ctrl_get``, ``flow_ctrl_set``, > > - ``priority_flow_ctrl_set``. > > + ``priority_flow_ctrl_set``, ``priority_flow_ctrl_queue_info_get``, > > + ``priority_flow_ctrl_queue_configure`` > > * **[related] API**: ``rte_eth_dev_flow_ctrl_get()``, > > ``rte_eth_dev_flow_ctrl_set()``, > > - ``rte_eth_dev_priority_flow_ctrl_set()``. > > + ``rte_eth_dev_priority_flow_ctrl_set()``, > > + ``rte_eth_dev_priority_flow_ctrl_queue_info_get()``, > > + ``rte_eth_dev_priority_flow_ctrl_queue_configure()``. > > > > > > .. _nic_features_rate_limitation: > > diff --git a/doc/guides/rel_notes/release_22_03.rst > > b/doc/guides/rel_notes/release_22_03.rst > > index 3bc0630c7c..e988c104e8 100644 > > --- a/doc/guides/rel_notes/release_22_03.rst > > +++ b/doc/guides/rel_notes/release_22_03.rst > > @@ -69,6 +69,12 @@ New Features > > > > The new API ``rte_event_eth_rx_adapter_event_port_get()`` was added. > > > > +* **Added an API to enable queue based priority flow ctrl(PFC).** > > + > > + New APIs, ``rte_eth_dev_priority_flow_ctrl_queue_info_get()`` and > > + ``rte_eth_dev_priority_flow_ctrl_queue_configure()``, was added. > > + > > + > > > > > Can you please move this update before ethdev driver updates. > And no need double empty lines. Ack > > > Removed Items > > ------------- > > diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h > > index d95605a355..320a364766 100644 > > --- a/lib/ethdev/ethdev_driver.h > > +++ b/lib/ethdev/ethdev_driver.h > > @@ -533,6 +533,13 @@ typedef int (*flow_ctrl_set_t)(struct rte_eth_dev *dev, > > typedef int (*priority_flow_ctrl_set_t)(struct rte_eth_dev *dev, > > struct rte_eth_pfc_conf *pfc_conf); > > > > +/** @internal Get info for queue based PFC on an Ethernet device. */ > > +typedef int (*priority_flow_ctrl_queue_info_get_t)( > > + struct rte_eth_dev *dev, struct rte_eth_pfc_queue_info > > *pfc_queue_info); > > +/** @internal Configure queue based PFC parameter on an Ethernet device. */ > > +typedef int (*priority_flow_ctrl_queue_config_t)( > > + struct rte_eth_dev *dev, struct rte_eth_pfc_queue_conf > > *pfc_queue_conf); > > + > > Instead of ending line with opening parantesis '(', can you break the line > after > first argument, like: > > typedef int (*priority_flow_ctrl_queue_config_t)(struct rte_eth_dev *dev, > struct rte_eth_pfc_queue_conf > *pfc_queue_conf); Ack > > Same for all instances. > > > /** @internal Update RSS redirection table on an Ethernet device. */ > > typedef int (*reta_update_t)(struct rte_eth_dev *dev, > > struct rte_eth_rss_reta_entry64 *reta_conf, > > @@ -1080,7 +1087,10 @@ struct eth_dev_ops { > > flow_ctrl_set_t flow_ctrl_set; /**< Setup flow control */ > > /** Setup priority flow control */ > > priority_flow_ctrl_set_t priority_flow_ctrl_set; > > - > > + /** Priority flow control queue info get */ > > + priority_flow_ctrl_queue_info_get_t priority_flow_ctrl_queue_info_get; > > + /** Priority flow control queue configure */ > > + priority_flow_ctrl_queue_config_t priority_flow_ctrl_queue_config; > > Can you please keep empty line before next (hash) group? Ack > > > /** Set Unicast Table Array */ > > eth_uc_hash_table_set_t uc_hash_table_set; > > /** Set Unicast hash bitmap */ > > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c > > index a1d475a292..2ce38cd2c5 100644 > > --- a/lib/ethdev/rte_ethdev.c > > +++ b/lib/ethdev/rte_ethdev.c > > @@ -4022,6 +4022,138 @@ rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id, > > return -ENOTSUP; > > } > > > > +static inline int > > Not sure if there is a value to ask function to be 'inline', this is in > control > path, only static can be enough. Ack > > > +validate_rx_pause_config(struct rte_eth_dev_info *dev_info, uint8_t tc_max, > > + struct rte_eth_pfc_queue_conf *pfc_queue_conf) > > +{ > > + if ((pfc_queue_conf->mode == RTE_ETH_FC_RX_PAUSE) || > > + (pfc_queue_conf->mode == RTE_ETH_FC_FULL)) { > > We don't allign to paranthesis in dpdk coding convenion [1], it should be as: > > if ((pfc_queue_conf->mode == RTE_ETH_FC_RX_PAUSE) || > (pfc_queue_conf->mode == RTE_ETH_FC_FULL)) { > if (pfc_queue_conf->rx_pause.tx_qid >= dev_info->nb_tx_queues) { > ... > } > } Ack > > > [1] > Altough I am aware many instances sneaked in, still I think better to follow > the convention. > > > + if (pfc_queue_conf->rx_pause.tx_qid >= > > dev_info->nb_tx_queues) { > > + RTE_ETHDEV_LOG(ERR, "Tx queue not in range for Rx > > pause" > > + " (requested: %d configured: %d)\n", > > + pfc_queue_conf->rx_pause.tx_qid, > > + dev_info->nb_tx_queues); > > > Should log mention that this is related to the "priority flow Rx queue > control"? Prepended "PFC" in the log message > > > + return -EINVAL; > > + } > > + > > + if (pfc_queue_conf->rx_pause.tc >= tc_max) { > > Should we document somewhere that 'tc_max' value itself is an invalid value? Updated the documentation to following struct { uint16_t tx_qid; /**< Tx queue ID */ - uint8_t tc; /**< Traffic class as per PFC (802.1Qbb) spec */ + uint8_t tc; + /**< Traffic class as per PFC (802.1Qbb) spec. The value must be + * in the range [0, rte_eth_pfc_queue_info::tx_max - 1] + */ > > > + RTE_ETHDEV_LOG(ERR, "TC not in range for Rx pause" > > + " (requested: %d max: %d)\n", > > + pfc_queue_conf->rx_pause.tc, tc_max); > > + return -EINVAL; > > + } > > + } > > + > > + return 0; > > +} > > + > > +static inline int > > +validate_tx_pause_config(struct rte_eth_dev_info *dev_info, uint8_t tc_max, > > + struct rte_eth_pfc_queue_conf *pfc_queue_conf) > > +{ > > + if ((pfc_queue_conf->mode == RTE_ETH_FC_TX_PAUSE) || > > + (pfc_queue_conf->mode == RTE_ETH_FC_FULL)) { > > + if (pfc_queue_conf->tx_pause.rx_qid >= > > dev_info->nb_rx_queues) { > > + RTE_ETHDEV_LOG(ERR, "Rx queue not in range for Tx > > pause" > > + "(requested: %d configured: %d)\n", > > + pfc_queue_conf->tx_pause.rx_qid, > > + dev_info->nb_rx_queues); > > + return -EINVAL; > > + } > > + > > + if (pfc_queue_conf->tx_pause.tc >= tc_max) { > > + RTE_ETHDEV_LOG(ERR, "TC not in range for Tx pause" > > + "(requested: %d max: %d)\n", > > + pfc_queue_conf->tx_pause.tc, tc_max); > > + return -EINVAL; > > + } > > + } > > + > > + return 0; > > +} > > + > > +int > > +rte_eth_dev_priority_flow_ctrl_queue_info_get( > > + uint16_t port_id, struct rte_eth_pfc_queue_info *pfc_queue_info) > > +{ > > + struct rte_eth_dev *dev; > > + > > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV); > > + dev = &rte_eth_devices[port_id]; > > + > > + if (pfc_queue_info == NULL) { > > + RTE_ETHDEV_LOG(ERR, "PFC info param is NULL for port (%u)\n", > > + port_id); > > + return -EINVAL; > > + } > > + > > + if (*dev->dev_ops->priority_flow_ctrl_queue_info_get) > > + return eth_err(port_id, > > + > > (*dev->dev_ops->priority_flow_ctrl_queue_info_get)( > > + dev, pfc_queue_info)); > > + return -ENOTSUP; > > +} > > + > > +int > > +rte_eth_dev_priority_flow_ctrl_queue_configure( > > + uint16_t port_id, struct rte_eth_pfc_queue_conf *pfc_queue_conf) > > +{ > > + struct rte_eth_pfc_queue_info pfc_info; > > + struct rte_eth_dev_info dev_info; > > + struct rte_eth_dev *dev; > > + int ret; > > + > > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV); > > + dev = &rte_eth_devices[port_id]; > > + > > + if (pfc_queue_conf == NULL) { > > + RTE_ETHDEV_LOG(ERR, "PFC parameters are NULL for port (%u)\n", > > + port_id); > > + return -EINVAL; > > + } > > + > > + ret = rte_eth_dev_info_get(port_id, &dev_info); > > + if (ret != 0) > > + return ret; > > + > > + ret = rte_eth_dev_priority_flow_ctrl_queue_info_get(port_id, > > &pfc_info); > > + if (ret != 0) > > + return ret; > > + > > + if (pfc_info.capa == 0) { > > + RTE_ETHDEV_LOG(ERR, "Ethdev port %u does not support PFC\n", > > + port_id); > > + return -ENOTSUP; > > + } > > + > > + if (pfc_info.tc_max == 0) { > > + RTE_ETHDEV_LOG(ERR, > > + "Ethdev port %u does not support PFC TC values\n", > > + port_id); > > + return -ENOTSUP; > > + } > > + > > + if (pfc_info.capa & RTE_ETH_PFC_QUEUE_CAPA_RX_PAUSE) { > > + ret = validate_rx_pause_config(&dev_info, pfc_info.tc_max, > > + pfc_queue_conf); > > There is capablilty flags for RTE_ETH_PFC_QUEUE_CAPA_RX_PAUSE and > RTE_ETH_PFC_QUEUE_CAPA_TX_PAUSE > also there is config flags RTE_ETH_FC_RX_PAUSE, RTE_ETH_FC_TX_PAUSE and > RTE_ETH_FC_FULL Removed that and reused rte_eth_fc_mode -/** Device supports Rx pause for queue based PFC. */ -#define RTE_ETH_PFC_QUEUE_CAPA_RX_PAUSE RTE_BIT64(0) -/** Device supports Tx pause for queue based PFC. */ -#define RTE_ETH_PFC_QUEUE_CAPA_TX_PAUSE RTE_BIT64(1) - /** * @warning * @b EXPERIMENTAL: this API may change, or be removed, without prior notice @@ -1424,8 +1419,8 @@ struct rte_eth_pfc_queue_info { * Maximum supported traffic class as per PFC (802.1Qbb) specification. */ uint8_t tc_max; - /** PFC queue capabilities (RTE_ETH_PFC_QUEUE_CAPA_). */ - uint64_t capa; + /** PFC queue mode capabilities. */ + enum rte_eth_fc_mode mode_capa; }; > > > What should happen if driver only support RX_PAUSE but app config request only > TX_PAUSE? > As far as can see with current code it pass the validation, but should it? Yes. Good catch. Added /* Check requested mode supported or not */ if (pfc_info.mode_capa == RTE_ETH_FC_RX_PAUSE && pfc_queue_conf->mode == RTE_ETH_FC_TX_PAUSE) { RTE_ETHDEV_LOG(ERR, "PFC Tx pause unsupported for port (%d)\n", port_id); return -EINVAL; } if (pfc_info.mode_capa == RTE_ETH_FC_TX_PAUSE && pfc_queue_conf->mode == RTE_ETH_FC_RX_PAUSE) { RTE_ETHDEV_LOG(ERR, "PFC Rx pause unsupported for port (%d)\n", port_id); return -EINVAL; } > > > > + if (ret != 0) > > + return ret; > > + } > > + > > + if (pfc_info.capa & RTE_ETH_PFC_QUEUE_CAPA_TX_PAUSE) { > > + ret = validate_tx_pause_config(&dev_info, pfc_info.tc_max, > > + pfc_queue_conf); > > syntax, please don't align to paranthesis > > > + if (ret != 0) > > + return ret; > > + } > > + > > + if (*dev->dev_ops->priority_flow_ctrl_queue_config) > > + return eth_err(port_id, > > + > > (*dev->dev_ops->priority_flow_ctrl_queue_config)( > > + dev, pfc_queue_conf)); > > + return -ENOTSUP; > > +} > > + > > static int > > eth_check_reta_mask(struct rte_eth_rss_reta_entry64 *reta_conf, > > uint16_t reta_size) > > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h > > index fa299c8ad7..383ad1cdd7 100644 > > --- a/lib/ethdev/rte_ethdev.h > > +++ b/lib/ethdev/rte_ethdev.h > > @@ -1395,6 +1395,48 @@ struct rte_eth_pfc_conf { > > uint8_t priority; /**< VLAN User Priority. */ > > }; > > > > +/** Device supports Rx pause for queue based PFC. */ > > +#define RTE_ETH_PFC_QUEUE_CAPA_RX_PAUSE RTE_BIT64(0) > > +/** Device supports Tx pause for queue based PFC. */ > > +#define RTE_ETH_PFC_QUEUE_CAPA_TX_PAUSE RTE_BIT64(1) > > + > > There is already control flow mode enum 'enum rte_eth_fc_mode', those > enum items use FC as abrivation (RTE_ETH_FC_RX_PAUSE), above ones use PFC, > should they use same prefix 'RTE_ETH_FC_' for consistency? > > And overall, what for struct and functins too, is the correct abreviation > 'pfc' or 'fc', since old code has 'fc' as far as I see? Removed RTE_ETH_PFC_QUEUE_CAPA_TX_PAUSE and reused enum rte_eth_fc_mode > > > +/** > > + * @warning > > + * @b EXPERIMENTAL: this API may change, or be removed, without prior > > notice > > + * > > + * A structure used to retrieve information of queue based PFC. > > + */ > > +struct rte_eth_pfc_queue_info { > > + /** > > + * Maximum supported traffic class as per PFC (802.1Qbb) > > specification. > > Will it be redundant to say valid value should be bigger than 0? Updated the document as: @@ -1440,13 +1435,19 @@ struct rte_eth_pfc_queue_conf { struct { uint16_t tx_qid; /**< Tx queue ID */ - uint8_t tc; /**< Traffic class as per PFC (802.1Qbb) spec */ + uint8_t tc; + /**< Traffic class as per PFC (802.1Qbb) spec. The value must be + * in the range [0, rte_eth_pfc_queue_info::tx_max - 1] + */ > > > + */ > > + uint8_t tc_max; > > + /** PFC queue capabilities (RTE_ETH_PFC_QUEUE_CAPA_). */ > > Can move doxygen comments to same line if they fit to 80 char limit: > uint64_t capa; /**< PFC queue capabilities (RTE_ETH_PFC_QUEUE_CAPA_). > */ > > > + uint64_t capa; > > +}; > > + > > +/** > > + * @warning > > + * @b EXPERIMENTAL: this API may change, or be removed, without prior > > notice > > + * > > + * A structure used to configure Ethernet priority flow control parameter > > for > > + * ethdev queues. > > + */ > > +struct rte_eth_pfc_queue_conf { > > + enum rte_eth_fc_mode mode; /**< Link flow control mode */ > > + > > + struct { > > + uint16_t tx_qid; /**< Tx queue ID */ > > 'tx_qid' within 'rx_pause' struct, this seems done intentionally but just to > double > check, can you please describe here the intendent usage? Clarified the doc as @@ -1429,6 +1429,16 @@ struct rte_eth_pfc_queue_info { * * A structure used to configure Ethernet priority flow control parameter for * ethdev queues. + * + * rte_eth_pfc_queue_conf::rx_pause structure shall used to configure given + * tx_qid with corresponding tc. When ethdev device receives PFC frame with + * rte_eth_pfc_queue_conf::rx_pause::tc, traffic will be paused on + * rte_eth_pfc_queue_conf::rx_pause::tx_qid for that tc. + * + * rte_eth_pfc_queue_conf::tx_pause structure shall used to configure given + * rx_qid. When rx_qid is congested, PFC frames are generated with + * rte_eth_pfc_queue_conf::rx_pause::tc and + * rte_eth_pfc_queue_conf::rx_pause::pause_time to the peer. > > > + uint8_t tc; /**< Traffic class as per PFC (802.1Qbb) spec */ > > + } rx_pause; /* Valid when (mode == FC_RX_PAUSE || mode == FC_FULL) */ > > + > > + struct { > > + uint16_t pause_time; /**< Pause quota in the Pause frame */ > > + uint16_t rx_qid; /**< Rx queue ID */ > > + uint8_t tc; /**< Traffic class as per PFC (802.1Qbb) spec */ > > + } tx_pause; /* Valid when (mode == FC_TX_PAUSE || mode == FC_FULL) */ > > +}; > > + > > /** > > * Tunnel type for device-specific classifier configuration. > > * @see rte_eth_udp_tunnel > > @@ -4144,6 +4186,53 @@ int rte_eth_dev_priority_flow_ctrl_set(uint16_t > > port_id, > > int rte_eth_dev_mac_addr_add(uint16_t port_id, struct rte_ether_addr > > *mac_addr, > > uint32_t pool); > > > > +/** > > + * @warning > > + * @b EXPERIMENTAL: this API may change without prior notice. > > + * > > + * Retrieve the information for queue based PFC. > > + * > > + * @param port_id > > + * The port identifier of the Ethernet device. > > + * @param pfc_queue_info > > + * A pointer to a structure of type *rte_eth_pfc_queue_info* to be > > filled with > > + * the information about queue based PFC. > > + * @return > > + * - (0) if successful. > > + * - (-ENOTSUP) if support for priority_flow_ctrl_queue_info_get does > > not exist. > > + * - (-ENODEV) if *port_id* invalid. > > + * - (-EINVAL) if bad parameter. > > + */ > > +__rte_experimental > > +int rte_eth_dev_priority_flow_ctrl_queue_info_get(uint16_t port_id, > > + struct rte_eth_pfc_queue_info > > *pfc_queue_info); > > +/** > > + * @warning > > + * @b EXPERIMENTAL: this API may change without prior notice. > > + * > > + * Configure the queue based priority flow control for a given queue > > + * for Ethernet device. > > + * > > + * @note When an ethdev port switches to queue based PFC mode, the > > + * unconfigured queues shall be configured by the driver with > > + * default values such as lower priority value for TC etc. > > + * > > May be good to document it has dependency to 'rte_eth_dev_info_get()' API? There is no dependency to rte_eth_dev_info_get() in the spec. > > > + * @param port_id > > + * The port identifier of the Ethernet device. > > + * @param pfc_queue_conf > > + * The pointer to the structure of the priority flow control parameters > > + * for the queue. > > + * @return > > + * - (0) if successful. > > + * - (-ENOTSUP) if hardware doesn't support queue based PFC mode. > > + * - (-ENODEV) if *port_id* invalid. > > + * - (-EINVAL) if bad parameter > > + * - (-EIO) if flow control setup queue failure > > + */ > > +__rte_experimental > > +int rte_eth_dev_priority_flow_ctrl_queue_configure(uint16_t port_id, > > + struct rte_eth_pfc_queue_conf *pfc_queue_conf); > > + > > /** > > * Remove a MAC address from the internal array of addresses. > > * > > diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map > > index c2fb0669a4..49523ebc45 100644 > > --- a/lib/ethdev/version.map > > +++ b/lib/ethdev/version.map > > @@ -256,6 +256,10 @@ EXPERIMENTAL { > > rte_flow_flex_item_create; > > rte_flow_flex_item_release; > > rte_flow_pick_transfer_proxy; > > + > > + # added in 22.03 > > + rte_eth_dev_priority_flow_ctrl_queue_configure; > > + rte_eth_dev_priority_flow_ctrl_queue_info_get; > > }; > > > > INTERNAL { >