There is none; if you send a burst of 9 packets, each coming from a
different mempool, the first one will be dropped.
It's worse than just a drop: without debug enabled the error completion
is ignored, the wqe_pi is taken from an invalid field, and that leads
to bogus mbuf frees (elts_tail is not valid).
AFAICT, it is the driver's responsibility to guarantee that it never
deregisters a memory region that has in-flight send operations posted,
otherwise the send operation *will* complete with a local protection
error. Is that taken care of?
Up to now we have assumed that users know their application's needs and
can increase this cache size accordingly through the configuration
item. With that done, the limit and the guarantee hold.
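(For reference, and if I remember the build knob correctly, this is the
compile-time entry in config/common_base; a default of 8 is what makes
a burst touching 9 mempools problematic:

    CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8
)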
That is an undocumented assumption.
Another question: why is the MR cache maintained per TX queue and not
per device? If the application starts N TX queues, then a single
mempool will be registered N times instead of just once. Having lots
of MR instances will pollute the device ICMC pretty badly. Am I missing
something?
Having this cache per device needs a lock on the device structure while
threads are sending packets.
Not sure why it needs a lock at all. It *may* need RCU protection or
an rwlock, if at all.
Such locks cost cycles, which is why the cache is per queue. Another
point: having several mempools per device is common, whereas having
several mempools per queue is not. For those two reasons, it seems
logical to keep this cache per queue.
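For concreteness, here is roughly how I picture the per-queue lookup;
it is a hedged sketch, not the actual mlx5 code, and every name in it
(txq_mr_cache, mr_cache_lookup, register_mempool, MR_CACHE_SIZE) is
invented for illustration. Because the table is private to one TX
queue, the fast-path lookup needs no lock:

#include <stdint.h>
#include <stdio.h>

#define MR_CACHE_SIZE 8 /* assumed default, cf. the config item above */

struct mr_cache_entry {
    const void *mempool; /* stand-in for struct rte_mempool * */
    uint32_t lkey;
};

struct txq_mr_cache {
    struct mr_cache_entry entry[MR_CACHE_SIZE];
    unsigned int nb; /* number of valid entries */
};

/* Stand-in for registering a mempool with the device (ibv_reg_mr()). */
static uint32_t register_mempool(const void *mp)
{
    return (uint32_t)(uintptr_t)mp; /* fake lkey, good enough here */
}

/* Return the lkey for mp, registering it on a cache miss.  When the
 * table is full the oldest entry is evicted, which is exactly where
 * the "more mempools in flight than cache slots" problem bites. */
static uint32_t mr_cache_lookup(struct txq_mr_cache *c, const void *mp)
{
    unsigned int i;

    for (i = 0; i < c->nb; i++)
        if (c->entry[i].mempool == mp)
            return c->entry[i].lkey;
    if (c->nb == MR_CACHE_SIZE) {
        /* Evict the oldest entry; in the real driver this is also
         * where a still-in-flight MR could get deregistered. */
        for (i = 1; i < MR_CACHE_SIZE; i++)
            c->entry[i - 1] = c->entry[i];
        c->nb--;
    }
    c->entry[c->nb].mempool = mp;
    c->entry[c->nb].lkey = register_mempool(mp);
    return c->entry[c->nb++].lkey;
}

int main(void)
{
    struct txq_mr_cache cache = { .nb = 0 };
    int mp[9]; /* nine distinct "mempools", one more than the cache holds */
    int i;

    for (i = 0; i < 9; i++)
        printf("mempool %d -> lkey 0x%x\n", i,
               mr_cache_lookup(&cache, &mp[i]));
    return 0;
}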
I am currently reworking this part of the code to improve it using
reference counters instead. The cache will remain for performance
purposes. This will fix the issues you are pointing out.
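To make the reference-counter idea concrete, a minimal sketch of the
design I have in mind (my own naming, assumed shape, not the final
code): MRs live in a per-device list with a refcount, each TX queue
takes a reference when it caches the lkey and drops it when the entry
is evicted or the queue is released, and the MR is only deregistered
when the count reaches zero. The lock is on the control path only; the
per-queue cache on the data path stays lock-free as today:

#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

struct dev_mr {
    const void *mempool;   /* stand-in for struct rte_mempool * */
    uint32_t lkey;
    unsigned int refcnt;   /* one reference per TX queue caching it */
    struct dev_mr *next;
};

struct dev_mr_table {
    pthread_mutex_t lock;  /* control path only, never in tx burst */
    struct dev_mr *head;
};

/* Slow path, called on a TX queue cache miss: register the mempool
 * once per device, take a reference for the queue caching the lkey. */
static uint32_t dev_mr_get(struct dev_mr_table *t, const void *mp)
{
    struct dev_mr *mr;
    uint32_t lkey;

    pthread_mutex_lock(&t->lock);
    for (mr = t->head; mr != NULL; mr = mr->next)
        if (mr->mempool == mp)
            break;
    if (mr == NULL) {
        mr = calloc(1, sizeof(*mr));
        if (mr == NULL) {
            pthread_mutex_unlock(&t->lock);
            return 0; /* no lkey; caller would have to drop the packet */
        }
        mr->mempool = mp;
        mr->lkey = (uint32_t)(uintptr_t)mp; /* fake ibv_reg_mr() result */
        mr->next = t->head;
        t->head = mr;
    }
    mr->refcnt++;
    lkey = mr->lkey;
    pthread_mutex_unlock(&t->lock);
    return lkey;
}

/* Called when a queue evicts a cache entry or is released: the MR is
 * deregistered only when no queue references it anymore, so one queue
 * can no longer pull an MR out from under the others. */
static void dev_mr_put(struct dev_mr_table *t, const void *mp)
{
    struct dev_mr **prev, *mr;

    pthread_mutex_lock(&t->lock);
    for (prev = &t->head; (mr = *prev) != NULL; prev = &mr->next)
        if (mr->mempool == mp)
            break;
    if (mr != NULL && --mr->refcnt == 0) {
        *prev = mr->next;
        /* ibv_dereg_mr() would go here in the real driver. */
        free(mr);
    }
    pthread_mutex_unlock(&t->lock);
}

int main(void)
{
    struct dev_mr_table tbl = { .head = NULL };
    int mp;

    pthread_mutex_init(&tbl.lock, NULL);
    dev_mr_get(&tbl, &mp);  /* first queue caches the mempool */
    dev_mr_get(&tbl, &mp);  /* second queue reuses the same MR */
    dev_mr_put(&tbl, &mp);
    dev_mr_put(&tbl, &mp);  /* last reference dropped, MR freed */
    pthread_mutex_destroy(&tbl.lock);
    return 0;
}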
AFAICT, this whole caching mechanism is just working around the fact
that mlx5 allocates resources on top of the existing verbs interface.
I think it should work like any other PMD, i.e. use the mbuf physical
addresses.
The mlx5 device (like all other RDMA devices) has a global DMA lkey
that spans the entire physical address space. Just about all the
kernel drivers heavily use this lkey. IMO, the mlx5_pmd driver should
be able to query the kernel for this lkey and ask the kernel to create
the QP with a privilege level that allows posting send/recv operations
with that lkey. Then mlx5_pmd becomes like the other drivers, working
with physical addresses instead of working around memory registration
sub-optimally.
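To show what the TX fast path could look like under that model, here is
a purely hypothetical sketch: neither query_reserved_lkey() nor a
privileged-QP creation flag exists in today's verbs interface; that is
precisely the kernel-side addition being argued for, and the function
is stubbed only so the sketch compiles:

#include <stdint.h>
#include <stdio.h>

struct sge {             /* stand-in for the WQE's scatter/gather entry */
    uint64_t addr;       /* bus/physical address of the packet data */
    uint32_t length;
    uint32_t lkey;
};

/* Hypothetical call: ask the kernel for the device-wide DMA lkey that
 * spans the whole physical address space (kernel ULPs already use it).
 * Stubbed here; a real version would be a new kernel/verbs interface. */
static uint32_t query_reserved_lkey(void *device)
{
    (void)device;
    return 0xffffu; /* made-up value */
}

/* With such an lkey the per-mempool MR cache disappears: the TX routine
 * only needs the packet's physical address, e.g. rte_pktmbuf_mtophys(). */
static void fill_tx_sge(struct sge *sge, uint64_t pkt_phys_addr,
                        uint32_t pkt_len, uint32_t reserved_lkey)
{
    sge->addr = pkt_phys_addr;
    sge->length = pkt_len;
    sge->lkey = reserved_lkey; /* same key for every packet and mempool */
}

int main(void)
{
    struct sge sge;
    uint32_t lkey = query_reserved_lkey(NULL);

    fill_tx_sge(&sge, 0x12345000ull, 64, lkey);
    printf("addr=0x%llx len=%u lkey=0x%x\n",
           (unsigned long long)sge.addr, sge.length, sge.lkey);
    return 0;
}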
And while we're on the subject, what is the plan for detaching mlx5_pmd
from its MLNX_OFED dependency? Mellanox has been doing a good job
upstreaming the needed features (rdma-core). CC'ing Leon (who is
co-maintaining the user-space RDMA tree).