On 15/03/2024 3:13 pm, Roger Pau Monné wrote: > On Fri, Mar 15, 2024 at 12:13:22PM +0000, Andrew Cooper wrote: >> The use of __clear_bit() forces dmask to be spilled to the stack, and >> interferes with the compiler heuristcs for some upcoming improvements to the >> ffs() code generation. >> >> First, shrink dmask to just the active vectors by making out the upper bits. >> This replaces the "i < msi->vectors" part of the loop condition. >> >> Next, use a simple while() loop with "clear bottom bit" expressed in plane C, >> which affords the optimiser a far better understanding of what the loop is >> doing. >> >> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com> >> --- >> CC: Roger Pau Monné <roger....@citrix.com> >> >> Noticed when looking at the ffs() code gen improvements. >> >> Any suggestion on how to test this? test_vcpi doesn't seem to check anything >> here. I think I've got the boundary conditions for msi->vectors right, but >> I'd be lying if I said I was certain... > There's no easy way to test this because it relies on having a PCI > device underneath. test_vpci just checks the logic to add & remove > handlers, but doesn't get remotely close as to attempting to provide > some kind of dummy environment for pass through to be sanity tested. > > I should look into it. > >> bloat-o-meter reports: >> >> add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-28 (-28) >> Function old new delta >> mask_write 142 114 -28 >> >> which is a consequence of the compiler having a much better idea of what's >> going on in the loop. There's more to come with the ffs() improvements too. >> --- >> xen/drivers/vpci/msi.c | 10 ++++++---- >> 1 file changed, 6 insertions(+), 4 deletions(-) >> >> diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c >> index d3aa5df08941..30adcf7df05d 100644 >> --- a/xen/drivers/vpci/msi.c >> +++ b/xen/drivers/vpci/msi.c >> @@ -169,13 +169,15 @@ static void cf_check mask_write( >> >> if ( msi->enabled ) >> { >> - unsigned int i; >> + /* Skip changes to vectors which aren't enabled. */ >> + dmask &= (~0U >> (32 - msi->vectors)); > Do we need to ASSERT that msi->vectors <= 32 in order to avoid > theoretical UB?
I don't think so. Things have gone catastrophically wrong elsewhere to get here with 64 or 128. All this does is stop calling the set-mask callback for disabled vectors. ~Andrew