On 2024-05-07 18:00, Morten Brørup wrote:
From: Stephen Hemminger [mailto:step...@networkplumber.org]
Sent: Tuesday, 7 May 2024 16.51
I would prefer that the SW statistics be handled generically by ethdev
layers and used by all such drivers.
I agree.
Please note that maintaining counters in the ethdev layer might cause more
cache misses than maintaining them in the hot parts of the individual drivers'
data structures, so it's not all that simple. ;-)
Until then, let's find a short-term solution, viable to implement across all
software NIC drivers without API/ABI breakage.
The most complete version of SW stats now is in the virtio driver.
It looks like the virtio PMD maintains the counters; they are not retrieved
from the host.
Considering a DPDK application running as a virtual machine (guest) on a host
server...
If the host is unable to put a packet onto the guest's virtio RX queue - like
when a HW NIC is out of RX descriptors - is it counted somewhere visible to the
guest?
Similarly, if the guest is unable to put a packet onto its virtio TX queue, is
it counted somewhere visible to the host?
If reset needs to be reliable (debatable), then it needs to be done without
atomics.
Let's modify that slightly: Without performance degradation in the fast path.
I'm not sure that all atomic operations are slow.
Relaxed atomic loads from, and stores to, naturally aligned addresses of up to
at least 64 bits are free on ARM and x86_64.
"For free" is not entirely true, since both C11 relaxed stores and
stores through volatile may prevent vectorization in GCC. I don't see
why, but in practice that seems to be the case. It is very much a
corner case, though.
Also, as mentioned before, C11 atomic store effectively has volatile
semantics, which in turn may prevent some compiler optimizations.
On 32-bit x86, 64-bit atomic stores use xmm registers, but those are
going to be used anyway, since you'll have a 64-bit add.
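
To make the relaxed load/store point concrete, here is a minimal sketch of a
single-writer counter whose fast path uses no atomic read-modify-write. The
struct and function names are hypothetical, not an existing DPDK API, and the
__atomic builtins are the GCC/Clang ones. The reset-by-offset scheme is only
one possible approach and an assumption on my part: it assumes one data-path
writer per counter and a single control thread doing reads and resets.

```c
#include <stdint.h>

/* Hypothetical SW counter; names are illustrative, not an existing DPDK API. */
struct sw_counter {
	_Alignas(8) uint64_t count;  /* stored only by the owning data-path lcore */
	_Alignas(8) uint64_t offset; /* stored only by the control thread on reset */
};

/* Fast path, single writer: relaxed load, add, relaxed store.
 * This is not an atomic read-modify-write, so it is only correct
 * when exactly one thread ever calls it for a given counter. */
static inline void
sw_counter_add(struct sw_counter *c, uint64_t n)
{
	uint64_t v = __atomic_load_n(&c->count, __ATOMIC_RELAXED);

	__atomic_store_n(&c->count, v + n, __ATOMIC_RELAXED);
}

/* Control path: value accumulated since the last reset. */
static inline uint64_t
sw_counter_read(struct sw_counter *c)
{
	return __atomic_load_n(&c->count, __ATOMIC_RELAXED) -
	       __atomic_load_n(&c->offset, __ATOMIC_RELAXED);
}

/* Control path: reset by recording the current count as the new offset.
 * The fast-path field is never written from here, so no update is lost. */
static inline void
sw_counter_reset(struct sw_counter *c)
{
	uint64_t v = __atomic_load_n(&c->count, __ATOMIC_RELAXED);

	__atomic_store_n(&c->offset, v, __ATOMIC_RELAXED);
}
```

With this layout, a reset never races with the data path's store to count,
which is what would make it reliable without adding cost to the fast path.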
But you are right that it needs to be done without _Atomic counters; they seem
to be slow.
_Atomic is not slower than atomics without _Atomic, when you actually
need atomic operations.
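
One way to read this: the cost comes from the operation you ask for, not from
the _Atomic qualifier itself. A small sketch, assuming a hypothetical counter
named pkts (not taken from any driver), showing both forms on the same
_Atomic object:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical counter declared with the C11 _Atomic qualifier. */
static _Atomic uint64_t pkts;

/* Implicit form: ++ on an _Atomic object is a sequentially consistent
 * atomic read-modify-write. Needed only if several threads increment. */
static inline void
pkts_inc_rmw(void)
{
	pkts++;
}

/* Explicit form for a single writer: relaxed load plus relaxed store,
 * with no atomic read-modify-write involved. */
static inline void
pkts_inc_single_writer(void)
{
	uint64_t v = atomic_load_explicit(&pkts, memory_order_relaxed);

	atomic_store_explicit(&pkts, v + 1, memory_order_relaxed);
}

/* Reader side: a relaxed load is enough for a statistics counter. */
static inline uint64_t
pkts_read(void)
{
	return atomic_load_explicit(&pkts, memory_order_relaxed);
}
```

If the first form is what was measured, the slowness would be explained by
the read-modify-write, which would cost the same with or without _Atomic.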