On 15/03/2024 3:13 pm, Roger Pau Monné wrote:
> On Fri, Mar 15, 2024 at 12:13:22PM +0000, Andrew Cooper wrote:
>> The use of __clear_bit() forces dmask to be spilled to the stack, and
>> interferes with the compiler heuristcs for some upcoming improvements to the
>> ffs() code generation.
>>
>> First, shrink dmask to just the active vectors by making out the upper bits.
>> This replaces the "i < msi->vectors" part of the loop condition.
>>
>> Next, use a simple while() loop with "clear bottom bit" expressed in plane C,
>> which affords the optimiser a far better understanding of what the loop is
>> doing.
>>
>> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
>> ---
>> CC: Roger Pau Monné <roger....@citrix.com>
>>
>> Noticed when looking at the ffs() code gen improvements.
>>
>> Any suggestion on how to test this?  test_vcpi doesn't seem to check anything
>> here.  I think I've got the boundary conditions for msi->vectors right, but
>> I'd be lying if I said I was certain...
> There's no easy way to test this because it relies on having a PCI
> device underneath.  test_vpci just checks the logic to add & remove
> handlers, but doesn't get remotely close as to attempting to provide
> some kind of dummy environment for pass through to be sanity tested.
>
> I should look into it.
>
>> bloat-o-meter reports:
>>
>>   add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-28 (-28)
>>   Function                                     old     new   delta
>>   mask_write                                   142     114     -28
>>
>> which is a consequence of the compiler having a much better idea of what's
>> going on in the loop.  There's more to come with the ffs() improvements too.
>> ---
>>  xen/drivers/vpci/msi.c | 10 ++++++----
>>  1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
>> index d3aa5df08941..30adcf7df05d 100644
>> --- a/xen/drivers/vpci/msi.c
>> +++ b/xen/drivers/vpci/msi.c
>> @@ -169,13 +169,15 @@ static void cf_check mask_write(
>>  
>>      if ( msi->enabled )
>>      {
>> -        unsigned int i;
>> +        /* Skip changes to vectors which aren't enabled. */
>> +        dmask &= (~0U >> (32 - msi->vectors));
> Do we need to ASSERT that msi->vectors <= 32 in order to avoid
> theoretical UB?

I don't think so.  Things have gone catastrophically wrong elsewhere to
get here with 64 or 128.

All this does is stop calling the set-mask callback for disabled vectors.

~Andrew

Reply via email to