On 27/08/2024 2:57 pm, Andrew Cooper wrote:
> There are two issues.  First, pi_test_and_clear_on() pulls the cache-line to
> the CPU and dirties it even if there's nothing outstanding.  Second, the final
> for_each_set_bit() is O(256) when O(8) would do, and would also avoid multiple
> atomic updates to the same IRR word.
>
> Rewrite it from scratch, explaining what's going on at each step.
>
> Bloat-o-meter reports 177 -> 145 (net -32), but the better aspect is the
> removal of calls to __find_{first,next}_bit() hidden behind for_each_set_bit().
>
> No functional change.
>
> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
> ---
> CC: Jan Beulich <jbeul...@suse.com>
> CC: Roger Pau Monné <roger....@citrix.com>
> CC: Stefano Stabellini <sstabell...@kernel.org>
> CC: Julien Grall <jul...@xen.org>
> CC: Volodymyr Babchuk <volodymyr_babc...@epam.com>
> CC: Bertrand Marquis <bertrand.marq...@arm.com>
> CC: Michal Orzel <michal.or...@amd.com>
> CC: Oleksii Kurochko <oleksii.kuroc...@gmail.com>
>
> The main purpose of this is to get rid of bitmap_for_each().
>
> v2:
>  * Extend the comments

FWIW, Gitlab CI has gained one reliable failure for this series (which
includes the hweight series too, because of how I've got my branch
arranged).

It is a timeout (domU not reporting in after boot), and as it is
specific to the AlderLake runner, it's very likely to be this patch.

I guess I need to triple-check the IRR scatter logic...
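
For anyone following along, here is a rough standalone sketch of the shape
described in the commit message: a read-only check of ON first, then a
word-wise scatter of PIR into IRR.  It is not the code from the patch.  The
names (pi_sketch, PI_ON, sync_pir_to_irr_sketch) and the flat irr[] array are
made up for illustration, it uses C11 atomics rather than Xen's primitives,
and it ignores the 16-byte stride of the IRR registers in the real vLAPIC page.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define PIR_WORDS 8u  /* 256 vectors / 32 bits per word */
#define PI_ON     1u  /* hypothetical layout: bit 0 of control is ON */

/* Hypothetical stand-in for a posted-interrupt descriptor. */
struct pi_sketch {
    _Atomic uint32_t pir[PIR_WORDS];
    _Atomic uint32_t control;
};

/*
 * Move pending vectors from PIR into a flat IRR array.
 * Returns the number of IRR words that were updated.
 */
static unsigned int sync_pir_to_irr_sketch(struct pi_sketch *pi,
                                           _Atomic uint32_t *irr)
{
    unsigned int i, touched = 0;

    assert(pi && irr);

    /* Read-only check: with nothing outstanding, the line is never written. */
    if ( !(atomic_load(&pi->control) & PI_ON) )
        return 0;

    /* Clear ON before draining PIR, so a later post raises it again. */
    atomic_fetch_and(&pi->control, ~PI_ON);

    for ( i = 0; i < PIR_WORDS; i++ )
    {
        uint32_t pend;

        /* Skip empty words without writing to them. */
        if ( !atomic_load(&pi->pir[i]) )
            continue;

        /* Take the whole word at once, then one atomic OR per IRR word. */
        pend = atomic_exchange(&pi->pir[i], 0);
        if ( pend )
        {
            atomic_fetch_or(&irr[i], pend);
            touched++;
        }
    }

    return touched;
}
```

The loop body runs at most 8 times and issues at most one atomic OR per IRR
word, instead of one per set bit.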

~Andrew
