On 10/11/21 7:25 PM, Ananyev, Konstantin wrote:
At queue configure stage always allocate space for the maximum possible
number (RTE_MAX_QUEUES_PER_PORT) of queue pointers.
That will allow 'fast' inline functions (eth_rx_burst, etc.) to refer
to internal queue data pointers without extra checking of the current
number of configured queues.
That would help in future to hide rte_eth_dev and related structures.
It means that from now on, each ethdev port will always consume:
((2*sizeof(uintptr_t))* RTE_MAX_QUEUES_PER_PORT)
bytes of memory for its queue pointers.
With RTE_MAX_QUEUES_PER_PORT==1024 (default value) it is 16KB per port.
Signed-off-by: Konstantin Ananyev <konstantin.anan...@intel.com>
---
lib/ethdev/rte_ethdev.c | 36 +++++++++---------------------------
1 file changed, 9 insertions(+), 27 deletions(-)
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index ed37f8871b..c8abda6dd7 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -897,7 +897,8 @@ eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
if (dev->data->rx_queues == NULL && nb_queues != 0) { /* first time configuration */
dev->data->rx_queues = rte_zmalloc("ethdev->rx_queues",
- sizeof(dev->data->rx_queues[0]) * nb_queues,
+ sizeof(dev->data->rx_queues[0]) *
+ RTE_MAX_QUEUES_PER_PORT,
RTE_CACHE_LINE_SIZE);
Looking at it I have a few questions:
1. Why is the nb_queues == 0 case kept as an exception? Yes,
strictly speaking it is not a problem introduced by this patch:
DPDK will still segfault (in a non-debug build) if I
allocate Tx queues only but call rte_eth_rx_burst().
eth_dev_rx_queue_config(.., nb_queues=0) is used in a few places to clean
things up.
No, as far as I know. For a Tx-only application (e.g. a traffic generator)
it is 100% legal to configure with tx_queues=X, rx_queues=0.
The same holds for an Rx-only application (e.g. packet capture).
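For illustration, a minimal Tx-only bring-up using only public ethdev
calls could look like this (port id, descriptor count, and the simplified
error handling are arbitrary choices for the example):

    #include <rte_ethdev.h>

    /* Tx-only port setup (e.g. for a traffic generator):
     * nb_rx_queue == 0 is passed to rte_eth_dev_configure(). */
    static int
    setup_tx_only_port(uint16_t port_id)
    {
            struct rte_eth_conf conf = {0};
            int ret;

            /* 0 Rx queues, 1 Tx queue. */
            ret = rte_eth_dev_configure(port_id, 0, 1, &conf);
            if (ret != 0)
                    return ret;

            /* 512 Tx descriptors, default Tx queue configuration. */
            ret = rte_eth_tx_queue_setup(port_id, 0, 512,
                            rte_eth_dev_socket_id(port_id), NULL);
            if (ret != 0)
                    return ret;

            return rte_eth_dev_start(port_id);
    }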
After reading the patch description I thought that
we're trying to address it.
We do, though I can't see how we can address it in this patch.
Though it is a good idea: I think I can add an extra check in
eth_dev_fp_ops_setup()
(or around it) and set up the Rx function pointers only when dev->data->rx_queues != NULL.
Same for Tx.
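Something like this rough sketch (the helper name and fields follow this
series' work-in-progress fast-path ops structure, so the final shape may
well differ):

    static void
    eth_dev_fp_ops_setup(struct rte_eth_fp_ops *fpo,
                    const struct rte_eth_dev *dev)
    {
            /* Install the real Rx burst callback only if Rx queues were
             * actually allocated; otherwise keep the dummy one that
             * fails gracefully instead of dereferencing NULL. */
            if (dev->data->rx_queues != NULL)
                    fpo->rx_pkt_burst = dev->rx_pkt_burst;

            /* Same guard for the Tx side. */
            if (dev->data->tx_queues != NULL)
                    fpo->tx_pkt_burst = dev->tx_pkt_burst;
    }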
You don't need to care about these pointers if these arrays are
always allocated. See (3) below.
2. Why do we need to allocate memory dynamically?
Can we just make rx_queues an array of appropriate size?
Pavan already asked the same question.
My answer to him:
Yep, we can, and yes, it will simplify this piece of code.
The main reason I decided not to make this change now is that
it will change the layout of the rte_eth_dev_data structure.
In this series I tried to minimize (or avoid) changes in rte_eth_dev and
rte_eth_dev_data
as much as possible, to avoid any unforeseen performance and functional impacts.
If we manage to make rte_eth_dev and rte_eth_dev_data private, we can in
the future
consider that and other changes to the rte_eth_dev and rte_eth_dev_data layouts
without worrying about ABI breakage.
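For clarity, the alternative discussed above would look roughly like this
(surrounding rte_eth_dev_data fields elided):

    struct rte_eth_dev_data {
            /* ... */
            /* Fixed-size arrays instead of the current 'void **'
             * pointers; this is exactly the layout change this series
             * avoids. */
            void *rx_queues[RTE_MAX_QUEUES_PER_PORT];
            void *tx_queues[RTE_MAX_QUEUES_PER_PORT];
            /* ... */
    };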
Thanks a lot. Makes sense.
Maybe wasting 512K unconditionally is too much.
3. If wasting 512K is too much, I'd consider moving the
allocation to eth_dev_get().
I don't understand where 512KB came from.
32 ports * 1024 queues * 2 types * 8 bytes per pointer = 512KB,
if we allocate as in (2) above.
each ethdev port will always consume:
((2*sizeof(uintptr_t))* RTE_MAX_QUEUES_PER_PORT)
bytes of memory for its queue pointers.
With RTE_MAX_QUEUES_PER_PORT==1024 (default value) it is 16KB per port.
IMHO it would be a bit nicer if the queue pointer arrays were allocated
on device get, since the size is fixed. It is just a suggestion; if you
disagree, feel free to drop it.
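Roughly like this (a sketch only: the existing body of eth_dev_get() and
all error handling are elided):

    static struct rte_eth_dev *
    eth_dev_get(uint16_t port_id)
    {
            struct rte_eth_dev *eth_dev = &rte_eth_devices[port_id];

            /* ... existing slot initialization ... */

            /* Allocate both fixed-size queue pointer arrays up front,
             * so queue configure never has to (re)allocate them. */
            eth_dev->data->rx_queues = rte_zmalloc("ethdev->rx_queues",
                            sizeof(eth_dev->data->rx_queues[0]) *
                            RTE_MAX_QUEUES_PER_PORT,
                            RTE_CACHE_LINE_SIZE);
            eth_dev->data->tx_queues = rte_zmalloc("ethdev->tx_queues",
                            sizeof(eth_dev->data->tx_queues[0]) *
                            RTE_MAX_QUEUES_PER_PORT,
                            RTE_CACHE_LINE_SIZE);

            return eth_dev;
    }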
if (dev->data->rx_queues == NULL) {
dev->data->nb_rx_queues = 0;
@@ -908,21 +909,11 @@ eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
rxq = dev->data->rx_queues;
- for (i = nb_queues; i < old_nb_queues; i++)
+ for (i = nb_queues; i < old_nb_queues; i++) {
(*dev->dev_ops->rx_queue_release)(rxq[i]);
- rxq = rte_realloc(rxq, sizeof(rxq[0]) * nb_queues,
- RTE_CACHE_LINE_SIZE);
- if (rxq == NULL)
- return -(ENOMEM);
- if (nb_queues > old_nb_queues) {
- uint16_t new_qs = nb_queues - old_nb_queues;
-
- memset(rxq + old_nb_queues, 0,
- sizeof(rxq[0]) * new_qs);
+ rxq[i] = NULL;
It looks like the patch should be rebased on top of
next-net main because of queue release patches.
[snip]