On Fri, Jul 18, 2025 at 01:48:17PM -0700, Elliott Mitchell wrote:
>On Wed, Jul 16, 2025 at 11:31:06AM -0700, Elliott Mitchell wrote:
>> On Wed, Jul 16, 2025 at 07:47:48AM +0000, Anthoine Bourgeois wrote:
>> > On Tue, Jul 15, 2025 at 12:19:34PM -0700, Elliott Mitchell wrote:
>> > >
>> > >I tend to follow Debian, so kernel 6.1.140 and 4.17.6. What may be
>> > >more notable is AMD processor.
>> > >
>> > >When initially reported, it was reported as being more severe on systems
>> > >with AMD processors. I've been wondering about the reason(s) behind
>> > >that.
>> >
>> > AMD processors could make a huge difference. On Ryzen, this patch could
>> > almost double the bandwidth and on Epyc close to nothing with low
>> > frequency models, there is another bottleneck here I guess.
>> > On which one do you test?
>> >
>> > Do you know there is also a workaround on AMD processors about remapping
>> > grant tables as WriteBack?
>> > Upstream patch is 22650d605462 from XenServer.
>> > The test package for XCP-ng with the patch:
>> > https://xcp-ng.org/forum/topic/10943/network-traffic-performance-on-amd-processors
>> >
>>
>> Why are you jumping onto mostly unrelated issues when the current bug is
>> unfinished?
>>
>> Spurious events continue to be observed on the network backend. Spurious
>> events are also being observed on block and PCI backends. You identified
>> one cause, but others remain.
>>
>> (I'm hoping the next one effects all the back/front ends; the PCI backend
>> is a bigger issue for me)
>>
>> Should add, one VM being observed with these issue(s) is using 6.12.38.
>
>For reference, the following:
>
>for d in /sys/devices/{pci,vbd,vif}-*[0-9]-*[0-9]/xenbus
>do      if [ -f "$d/spurious_events" ]
>        then    read s < "$d/spurious_events"
>        else    s=0
>        fi
>        if [ "$s" -gt 0 ]
>        then    printf "problem %s: %d\\n" "$d/spurious_events" "$s"
>        else    printf "clean: %s\\n" "$d/spurious_events"
>        fi
>done
>
>Flags all passthrough and virtual devices. Even though there is a
>reduction with virtual network devices, that is only a 10% reduction.
>Most of the problem remains even though there is progress.
>
>I was mentioning an AMD processor since the initial report stated the
>problem was more severe with AMD processor machines.
>
>This is likely a driver design issue. Most pieces of hardware, telling
>the hardware to process an empty queue is quite cheap. Perhaps minor
>energy loss, but most hardware isn't (yet) too worried about being
>attacked.
>
>Passthrough and virtual devices are quite unusual in there being a
>concern over attacks. There could be major design flaws due to the
>front-ends being designed similar to normal drivers.
>

Hmm, you are checking the spurious events on the backend.
Sorry, I should have been more specific: this patch only mitigates the
spurious events on the frontend.
I will take a look at the backend.

Regards,
Anthoine

Anthoine Bourgeois | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech
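
Since the quoted loop only matches backend device names, a frontend-side check from inside a guest could look roughly like the sketch below. It is a minimal sketch, not a tested tool. It assumes frontend devices appear as <type>-<id> (for example vif-0) under /sys/devices with the same xenbus/spurious_events attribute, and that any name with two hyphen-separated fields is a backend. The function name check_frontends and its optional root argument are hypothetical, added only so the loop can be pointed at a test directory.

```sh
#!/bin/sh
# Sketch: report spurious_events for xenbus frontend devices.
# Assumption: frontends are named <type>-<id>, backends <type>-<domid>-<id>.
# check_frontends and its optional root argument are hypothetical helpers.
check_frontends() {
	root="${1:-/sys/devices}"
	for d in "$root"/*/xenbus
	do	[ -d "$d" ] || continue		# also skips an unmatched glob
		name="${d%/xenbus}"
		name="${name##*/}"
		case "$name" in
		*-*-*)	continue ;;		# backend-style name, skip it
		esac
		if [ -r "$d/spurious_events" ]
		then	read -r s < "$d/spurious_events"
		else	s=0
		fi
		if [ "$s" -gt 0 ]
		then	printf 'problem %s: %d\n' "$name" "$s"
		else	printf 'clean: %s\n' "$name"
		fi
	done
}
```

Calling check_frontends with no argument inside the guest walks /sys/devices. Comparing its output before and after applying the patch would show whether the frontend counters actually drop, independently of what the backend reports.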