On 05/03/2025 11:36 am, Jan Beulich wrote:
> On 05.03.2025 00:22, Andrew Cooper wrote:
>> There are two issues. First, pi_test_and_clear_on() pulls the cache-line to
>> the CPU and dirties it even if there's nothing outstanding, but the final
>> bitmap_for_each() is O(256) when O(8) would do, and would avoid multiple
>> atomic updates to the same IRR word.
>>
>> Rewrite it from scratch, explaining what's going on at each step.
>>
>> Bloat-o-meter reports 177 -> 145 (net -32), but real improvement is the
>> removal of calls to __find_{first,next}_bit() hidden behind
>> bitmap_for_each().
> Nit: As said in reply to v2, there are no underscores on the two find
> functions bitmap_for_each() uses.

I did change the commit message assuming you were right, but the disassembly
never lies. What bitmap_for_each() uses are very much not functions in x86.

>> No functional change.
>>
>> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
>> ---
>> CC: Jan Beulich <jbeul...@suse.com>
>> CC: Roger Pau Monné <roger....@citrix.com>
>>
>> https://gitlab.com/xen-project/people/andyhhp/xen/-/pipelines/1699791518
>>
>> v3:
>>  * Fix IRR scatter address calculation
>>  * Spelling/Grammar improvements
> The description starting with "There are two issues" I fear it still
> doesn't become quite clear what the 2nd issue is. I can only assume it's
> the use of bitmap_for_each() that now goes away.
>
> Preferably with this tweaked a little further
> Reviewed-by: Jan Beulich <jbeul...@suse.com>

Oh. ", but the final" can turn into ", and second,". That should make it
clearer.

Thanks,

~Andrew
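
For anyone following the thread without the patch to hand, the two points in
the commit message can be sketched roughly as below. This is not the patch
itself: the struct and function names (pi_desc_sketch, irr_sketch,
sync_pir_to_irr) are hypothetical, C11 atomics stand in for Xen's own
primitives, and the PIR is modelled as eight 32-bit words. The only layout
detail assumed is that consecutive IRR registers sit 16 bytes apart.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS  256
#define NR_WORDS    (NR_VECTORS / 32)  /* 8 words cover all 256 vectors */
#define IRR_STRIDE  4                  /* 16 bytes between IRR registers, in uint32_t slots */

/* Hypothetical stand-in for a posted-interrupt descriptor. */
struct pi_desc_sketch {
    _Atomic uint32_t pir[NR_WORDS];    /* pending vectors, one bit each */
    atomic_bool on;                    /* "outstanding notification" flag */
};

/* Hypothetical stand-in for the scattered IRR registers. */
struct irr_sketch {
    _Atomic uint32_t reg[NR_WORDS * IRR_STRIDE];
};

/* Returns the number of atomic updates made to IRR words. */
static unsigned int sync_pir_to_irr(struct pi_desc_sketch *pi,
                                    struct irr_sketch *irr)
{
    unsigned int i, updates = 0;

    assert(pi && irr);

    /*
     * Issue 1: read the flag first.  A plain load does not dirty the
     * cache line, so the common "nothing outstanding" case stays cheap.
     */
    if (!atomic_load(&pi->on))
        return 0;
    atomic_store(&pi->on, false);

    /*
     * Issue 2: walk 8 words rather than 256 bits, and fold each
     * non-zero word into its IRR register with one atomic OR.
     */
    for (i = 0; i < NR_WORDS; i++) {
        uint32_t pending = atomic_exchange(&pi->pir[i], 0);

        if (!pending)
            continue;

        atomic_fetch_or(&irr->reg[i * IRR_STRIDE], pending);
        updates++;
    }

    return updates;
}
```

With several vectors pending in the same 32-bit word, this does one atomic
update for that word, where a per-bit loop would do one per vector.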