On 4/28/2024 4:11 PM, Mattias Rönnblom wrote:
On 2024-04-26 16:38, Ferruh Yigit wrote:
For stats reset, use an offset instead of zeroing out the actual stats
values; get_stats() displays the diff between stats and offset.
This way stats are only updated in the datapath and the offset is only
updated in the stats reset function. This makes the stats reset function
more reliable.
As stats are only written by a single thread, we can remove the
'volatile' qualifier, which should improve performance in the datapath.
volatile wouldn't help you if you had multiple writers, so that can't be
the reason for its removal. It would be more accurate to say it should
be replaced with atomic updates. If you don't use volatile and don't use
atomics, you have to consider whether the compiler can reach the
conclusion that it does not need to store the counter value for future
use *for that thread*; otherwise the store does not actually need to
occur. Since DPDK statistics tend to work, it's pretty obvious that
current compilers tend not to reach this conclusion.
Thanks Mattias for clarifying why we need volatile or atomics even with
single writer.
If this should be done 100% properly, the update operation should be a
non-atomic load, non-atomic add, and an atomic store. Similarly, for the
reset, the offset store should be atomic.
ack
Considering the state of the rest of the DPDK code base, I think a
non-atomic, non-volatile solution is also fine.
Yes, this seems to work in practice, but it is probably better to follow
the above suggestion.
(That said, I think we're better off just deprecating stats reset
altogether, and returning -ENOTSUP here meanwhile.)
As long as reset is reliable (here I mean it resets the stats on every
call) and doesn't impact datapath performance, I am in favor of
continuing with it. Returning not supported won't bring more benefit to
users.
Signed-off-by: Ferruh Yigit <ferruh.yi...@amd.com>
---
Cc: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
Cc: Stephen Hemminger <step...@networkplumber.org>
Cc: Morten Brørup <m...@smartsharesystems.com>
This update was triggered by a mailing list discussion [1].
[1]
https://inbox.dpdk.org/dev/3b2cf48e-2293-4226-b6cd-5f4dd3969...@lysator.liu.se/
v2:
* Remove wrapping check for stats
---
drivers/net/af_packet/rte_eth_af_packet.c | 66 ++++++++++++++---------
1 file changed, 41 insertions(+), 25 deletions(-)
diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
index 397a32db5886..10c8e1e50139 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -51,8 +51,10 @@ struct pkt_rx_queue {
uint16_t in_port;
uint8_t vlan_strip;
- volatile unsigned long rx_pkts;
- volatile unsigned long rx_bytes;
+ uint64_t rx_pkts;
+ uint64_t rx_bytes;
+ uint64_t rx_pkts_offset;
+ uint64_t rx_bytes_offset;
I suggest you introduce a separate struct for reset-able counters. It'll
make things cleaner, and you can sneak in atomics without too much
atomics-related bloat.
struct counter
{
uint64_t count;
uint64_t offset;
};
/../
struct counter rx_pkts;
struct counter rx_bytes;
/../
static uint64_t
counter_value(struct counter *counter)
{
	uint64_t count = __atomic_load_n(&counter->count, __ATOMIC_RELAXED);
	uint64_t offset = __atomic_load_n(&counter->offset, __ATOMIC_RELAXED);

	return count - offset;
}
static void
counter_reset(struct counter *counter)
{
	uint64_t count = __atomic_load_n(&counter->count, __ATOMIC_RELAXED);

	__atomic_store_n(&counter->offset, count, __ATOMIC_RELAXED);
}
static void
counter_add(struct counter *counter, uint64_t operand)
{
	__atomic_store_n(&counter->count, counter->count + operand, __ATOMIC_RELAXED);
}
Ack for separate struct for reset-able counters.
You'd have to port this to <rte_stdatomic.h> calls, which prevents
non-atomic loads from RTE_ATOMIC()s. The non-atomic reads above must be
replaced with explicit relaxed atomic loads. Otherwise, if you just
use "counter->count", that would be an atomic load with sequential
consistency memory order on C11 atomics-based builds, which would result
in a barrier, at least on weakly ordered machines (e.g., ARM).
I am not sure I understand the above.
As the load and add will be non-atomic, why not access them directly, like:
`uint64_t count = counter->count;`