On 2023/9/28 21:15, Ferruh Yigit wrote:
On 9/28/2023 8:42 AM, Jie Hai wrote:
The DPDK framework reports the queue status, which is stored in
'dev->data->tx_queue_state' and 'dev->data->rx_queue_state'.The
state is currently maintained by the drivers. Users may determine
whether a queue participates in packet forwarding based on the
state. However, not all drivers correctly report the queue status.
This may cause forwarding problems.
Since it is difficult to modify the queue status of all drivers,
we consider updating the queue status at the framework level.
Some drivers support queues for hairpin, leaving status updating
for such queues to the drivers. Some drivers support
'deferred_start'. Assume that all drivers that support
'deferred_start' can obtain the configuration through
'rte_eth_tx_queue_info_get' and 'rte_eth_rx_queue_info_get'. So
we can directly update the queue status in 'rte_eth_dev_start'
and 'rte_eth_dev_stop'.
Signed-off-by: Jie Hai <haij...@huawei.com>
---
lib/ethdev/rte_ethdev.c | 40 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 40 insertions(+)
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 0840d2b5942a..e3aaa71eba9e 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1641,6 +1641,9 @@ rte_eth_dev_start(uint16_t port_id)
struct rte_eth_dev_info dev_info;
int diag;
int ret, ret_stop;
+ uint16_t i;
+ struct rte_eth_rxq_info rxq_info;
+ struct rte_eth_txq_info txq_info;
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
dev = &rte_eth_devices[port_id];
@@ -1697,6 +1700,30 @@ rte_eth_dev_start(uint16_t port_id)
(*dev->dev_ops->link_update)(dev, 0);
}
+ for (i = 0; i < dev->data->nb_rx_queues; i++) {
+ if (rte_eth_dev_is_rx_hairpin_queue(dev, i))
+ continue;
+
+ memset(&rxq_info, 0, sizeof(rxq_info));
+ ret = rte_eth_rx_queue_info_get(port_id, i, &rxq_info);
+ if (ret == 0 && rxq_info.conf.rx_deferred_start != 0)
+ dev->data->rx_queue_state[i] =
RTE_ETH_QUEUE_STATE_STOPPED;
+ else
+ dev->data->rx_queue_state[i] =
RTE_ETH_QUEUE_STATE_STARTED;
+ }
+
+ for (i = 0; i < dev->data->nb_tx_queues; i++) {
+ if (rte_eth_dev_is_tx_hairpin_queue(dev, i))
+ continue;
+
+ memset(&txq_info, 0, sizeof(txq_info));
+ ret = rte_eth_tx_queue_info_get(port_id, i, &txq_info);
+ if (ret == 0 && txq_info.conf.tx_deferred_start != 0)
+ dev->data->tx_queue_state[i] =
RTE_ETH_QUEUE_STATE_STOPPED;
+ else
+ dev->data->tx_queue_state[i] =
RTE_ETH_QUEUE_STATE_STARTED;
+ }
+
Hi Jie,
I am not sure about adding queue_info_get() calls a dependency to
start(), since start() is a mandatory API, I am concerned about possible
side affects.
Like if rte_eth_rx_queue_info_get() fails,Yes, unless we let
rte_eth_rx|tx_queue_info_get a mandatory API.
Or event though deferred_start is set, can application first call
rx_queue_start(), later start(), like:
start()
rx_queue_start()
stop()
rx_queue_start()
start()
start() --> deferred_start is confugured, the queue is stopped
rx_queue_start() --> the queue is started
stop() --> all queues are stopped
rx_queue_start() --> not supported, port should starts first
start()
If we change the order as:
start()
rx_queue_start()
stop()
start()
The last status of the queue for different drivers may be different.
Most drivers starts all queues except the queue setting deferred_start.
For sfc driver, all queues are started and the status of the queue
setting deferred_start will be reported as stopped, which is not correct.
That's the point, drivers have their own private processing for
different sequences of deferred_start, start|stop and queue_start|stop.
Or even applications sets deferred_start, PMD be ignoring without
reflecting that to conf.rx_deferred_start, etc...
Supppose the app sets the deferred_start,
whether the driver verifies the support the configuration or not,
whether the driver issues the configuration to the hardware or not,
whether the driver saves and reports the configuration or not,
Different driver implementations vary.
For the above three cases, pay attention to the following:
1) Y-Y-Y That's what we need.
2) Y-Y-N That's what we should add.
3) N-N-Y This will lead to wrong information(e.g.ice_rxtx.c mlx5_txq.c
mlx5_rxq.c)
4) N-N-N That's what we need, too.
Anyway, intention was to move common task, setting queue state, to the
ethdev layer, but because of the deferred_start, in rte_eth_dev_start()
we don't really know the queue state.
We can try to move deferred state information to the ethdev, but that is
similar to setting queue state in the driver, that is why perhaps better
to leave setting state to drivers, as done in the first version of the set.
@Jie, @David, what do you think?
I support V1. Whoever changes the queue status is responsible for
reporting the queue status.
That will maintains the specificity of the drive treatment.
And, @Jie, can you also check the rx_queue_start() & tx_queue_start()
dev_ops of the drivers sets the queue state?
If missing on some drivers, that needs to be added too.
I have checked already, all drivers support rx_queue_start() &
tx_queue_start() set the queue state.
/* expose selection of PMD fast-path functions */
eth_dev_fp_ops_setup(rte_eth_fp_ops + port_id, dev);
@@ -1708,6 +1735,7 @@ int
rte_eth_dev_stop(uint16_t port_id)
{
struct rte_eth_dev *dev;
+ uint16_t i;
int ret;
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
@@ -1731,6 +1759,18 @@ rte_eth_dev_stop(uint16_t port_id)
dev->data->dev_started = 0;
rte_ethdev_trace_stop(port_id, ret);
+ for (i = 0; i < dev->data->nb_rx_queues; i++) {
+ if (rte_eth_dev_is_rx_hairpin_queue(dev, i))
+ continue;
+ dev->data->rx_queue_state[i] = RTE_ETH_QUEUE_STATE_STOPPED;
+ }
+
+ for (i = 0; i < dev->data->nb_tx_queues; i++) {
+ if (rte_eth_dev_is_tx_hairpin_queue(dev, i))
+ continue;
+ dev->data->tx_queue_state[i] = RTE_ETH_QUEUE_STATE_STOPPED;
+ }
+
return ret;
}