On Fri, Jul 18, 2025 at 01:48:17PM -0700, Elliott Mitchell wrote:
>On Wed, Jul 16, 2025 at 11:31:06AM -0700, Elliott Mitchell wrote:
>> On Wed, Jul 16, 2025 at 07:47:48AM +0000, Anthoine Bourgeois wrote:
>> > On Tue, Jul 15, 2025 at 12:19:34PM -0700, Elliott Mitchell wrote:
>> > >
>> > >I tend to follow Debian, so kernel 6.1.140 and 4.17.6.  What may be
>> > >more notable is AMD processor.
>> > >
>> > >When initially reported, it was reported as being more severe on systems
>> > >with AMD processors.  I've been wondering about the reason(s) behind
>> > >that.
>> >
>> > AMD processors could make a huge difference. On Ryzen, this patch could
>> > almost double the bandwidth and on Epyc close to nothing with low
>> > frequency models, there is another bottleneck here I guess.
>> > On which one do you test?
>> >
>> > Do you know there is also a workaround on AMD processors about remapping
>> > grant tables as WriteBack?
>> > Upstream patch is 22650d605462 from XenServer.
>> > The test package for XCP-ng with the patch:
>> > https://xcp-ng.org/forum/topic/10943/network-traffic-performance-on-amd-processors
>> >
>>
>> Why are you jumping onto mostly unrelated issues when the current bug is
>> unfinished?
>>
>> Spurious events continue to be observed on the network backend.  Spurious
>> events are also being observed on block and PCI backends.  You identified
>> one cause, but others remain.
>>
>> (I'm hoping the next one affects all the back/front ends; the PCI backend
>> is a bigger issue for me)
>>
>> Should add, one VM being observed with these issue(s) is using 6.12.38.
>
>For reference, the following:
>
>for d in /sys/devices/{pci,vbd,vif}-*[0-9]-*[0-9]/xenbus
>do      if [ -f "$d/spurious_events" ]
>        then    read s < "$d/spurious_events"
>        else    s=0
>        fi
>        if [ "$s" -gt 0 ]
>        then    printf "problem %s: %d\\n" "$d/spurious_events" "$s"
>        else    printf "clean: %s\\n" "$d/spurious_events"
>        fi
>done
>
>Flags all passthrough and virtual devices.  Even though there is a
>reduction with virtual network devices, that is only a 10% reduction.
>Most of the problem remains even though there is progress.
>
>I was mentioning an AMD processor since the initial report stated the
>problem was more severe with AMD processor machines.
>
>This is likely a driver design issue.  Most pieces of hardware, telling
>the hardware to process an empty queue is quite cheap.  Perhaps minor
>energy loss, but most hardware isn't (yet) too worried about being
>attacked.
>
>Passthrough and virtual devices are quite unusual in there being a
>concern over attacks.  There could be major design flaws due to the
>front-ends being designed similar to normal drivers.
>

Hmm, you are checking the spurious events on the backend.
Sorry, I should have been more specific: this patch only mitigates the
spurious events on the frontend.

I will take a look at the backend.

Regards,
Anthoine


Anthoine Bourgeois | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

