On 5/8/2024 4:23 PM, Stephen Hemminger wrote:
> On Wed, 8 May 2024 09:19:02 +0200
> Mattias Rönnblom <hof...@lysator.liu.se> wrote:
> 
>> On 2024-05-04 00:00, Stephen Hemminger wrote:
>>> On Fri, 3 May 2024 16:45:47 +0100
>>> Ferruh Yigit <ferruh.yi...@amd.com> wrote:
>>>   
>>>> For stats reset, use an offset instead of zeroing out actual stats values,
>>>> get_stats() displays diff between stats and offset.
>>>> This way stats only updated in datapath and offset only updated in stats
>>>> reset function. This makes stats reset function more reliable.
>>>>
>>>> As stats only written by single thread, we can remove 'volatile' qualifier
>>>> which should improve the performance in datapath.
>>>>
>>>> While updating around, 'igb_stats' parameter renamed as 'stats'.
>>>>
>>>> Signed-off-by: Ferruh Yigit <ferruh.yi...@amd.com>
>>>> ---
>>>> Cc: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
>>>> Cc: Stephen Hemminger <step...@networkplumber.org>
>>>> Cc: Morten Brørup <m...@smartsharesystems.com>
>>>>
>>>> This update triggered by mail list discussion [1].
>>>>
>>>> [1]
>>>> https://inbox.dpdk.org/dev/3b2cf48e-2293-4226-b6cd-5f4dd3969...@lysator.liu.se/
>>>>   
>>>
>>>
>>> NAK
>>>
>>> I did not hear a good argument why atomic or volatile was necessary in the 
>>> first place.
>>> Why?
>>>   
>>
>> On the reader side, loads should be atomic.
>> On the writer side, stores should be atomic.
>>
>> Updates (stores) should actually occur in a timely manner. The complete 
>> read-modify-write cycle need not be atomic, since we only have a single 
>> writer. All this for the per-lcore counter case.
>>
>> If load or store tearing occurs, the counter values may occasionally 
>> take totally bogus values. I think that should be avoided. Especially 
>> since it will likely come at a very reasonable cost.
>>
>>  From what it seems to me, load or store tearing may well occur. GCC may 
>> generate two 32-bit stores for a program-level 64-bit store on 32-bit 
>> x86. If you have constant and immediate-data store instructions, 
>> constant writes may also be end up teared. The kernel documentation has 
>> some example of this. Add LTO, it's not necessarily going to be all that 
>> clear what is storing-a-constant and what is not.
>>
>> Maybe you care a little less if statistics are occasionally broken, or 
>> some transient, inconsistent state, but generally they should work, and 
>> they should never have some totally bogus values. So, statistics aren't 
>> snow flakes, mostly just business as usual.
>>
>> We can't both have a culture that promotes C11-style parallel 
>> programming, or, at the extreme, push the C11 APIs as-is, and the say 
>> "and btw you don't have to care about the standard when it comes to 
>> statistics".
>>
>> We could adopt the Linux kernel's rules, programming model, and APIs 
>> (ignoring legal issues). That would be very old school, maybe somewhat 
>> over-engineered for our purpose, include a fair amount of inline 
>> assembler, and also and may well depend on GCC or GCC-like compilers, 
>> just like what I believe the kernel does.
>>
>> We could use something in-between, heavily inspired by C11 but still 
>> with an opportunity to work around compiler issues, library issues, and
>> extend the API for our use case.
>>
>> I agree we shouldn't have to mark statistics _Atomic, or RTE_ATOMIC(), 
>> rte_atomic64_t, or rte_sometimes_atomic_and_sometimes_not64_t. Just 
>> keeping the usual C integer types seems like a better option to me.
>>
>>> Why is this driver special (a snowflake) compared to all the other drivers 
>>> doing software
>>> statistics (tap, virtio, xdp, ring, memif, netvsc, vmware)?  
>>
>> If a broken piece of code has been copied around, one place is going to 
>> be the first to be fixed.
> 
> 
> I dislike when any driver does something completely different than valid 
> precedent.
> No other driver in DPDK, Vpp, FreeBSD, Linux (and probably Windows) uses 
> atomic for
> updating statistics. We even got performance benefit at MS from removing 
> atomic
> increment of staistics in internal layers.
> 
> The idea of load tearing is crazy talk of integral types. It would break so 
> many things.
> It is the kind of stupid compiler thing that would send Linus on a rant and 
> get
> the GCC compiler writers in trouble. 
> 
> The DPDK has always favored performance over strict safety guard rails 
> everywhere.
> Switching to making every statistic an atomic operation is not in the spirit 
> of
> what is required. There is no strict guarantee necessary here.
> 

I kind of agree with Stephen.

Thanks Mattias, Morten & Stephen, it was informative discussion. But for
*SW drivers* stats update and reset is not core functionality and I
think we can be OK to get hit on corner cases, instead of
over-engineering or making code more complex.

I am for putting priority as following (from high to low):
- Datapath performance
- Stats get accuracy
- Stats reset accuracy

With the restriction that stat reset requires forwarding to stop, we can
even drop 'offset' logic.
And I am not sure if it is a real requirement that stats reset should be
supported during forwarding, although I can see it is convenient.
If we get this requirement in the future, we can focus on a solution.


As action,
I am planning to send a new version of this RFC that only removes the
'volatile' qualifier.
In next step we can remove atomic updates and volatile stat counters
from more SW drivers.

Thanks,
ferruh

Reply via email to