... which is better optimised for scalar values, rather than using the arbitrary-sized bitmap helpers.
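
As a rough illustration of why the scalar form is cheaper, here is a minimal
sketch of iterating over the set bits of a 32-bit value held in a register.
It is not the Xen macro: collect_set_bits() is a hypothetical helper invented
for this example, and __builtin_ctz() assumes a GCC- or Clang-compatible
compiler.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not Xen code: store the indices of the set bits of
 * a 32-bit scalar in out[], lowest bit first, and return how many were
 * stored.  The value stays in a register throughout; nothing is spilled
 * to the stack to form a bitmap.
 */
static unsigned int collect_set_bits(uint32_t val, unsigned int out[32])
{
    unsigned int n = 0;

    while ( val )
    {
        /* Index of the lowest set bit (GCC/Clang builtin; val != 0 here). */
        out[n++] = (unsigned int)__builtin_ctz(val);

        /* Clear the lowest set bit. */
        val &= val - 1;
    }

    return n;
}
```

Each iteration costs one count-trailing-zeros and one and-with-decrement,
with no out-of-line call, whereas a generic bitmap walker has to take the
address of the value and call a find-next-bit routine.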

For ARM32:

  add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-16 (-16)
  Function                                     old     new   delta
  vgic_set_irqs_pending                        284     268     -16

including removing calls to _find_{first,next}_bit_le(), and two stack-spilled
words too.

For ARM64:

  add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-40 (-40)
  Function                                     old     new   delta
  vgic_set_irqs_pending                        268     228     -40

including removing three calls to find_next_bit().

Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
---
CC: Stefano Stabellini <sstabell...@kernel.org>
CC: Julien Grall <jul...@xen.org>
CC: Volodymyr Babchuk <volodymyr_babc...@epam.com>
CC: Bertrand Marquis <bertrand.marq...@arm.com>
CC: Michal Orzel <michal.or...@amd.com>

TODO: These were debug builds, and I need to redo the analysis with release
builds.  Also extend to the other examples.
---
 xen/arch/arm/vgic.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 57519e834d78..c060676aee78 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -421,15 +421,13 @@ void vgic_enable_irqs(struct vcpu *v, uint32_t r, unsigned int n)
 
 void vgic_set_irqs_pending(struct vcpu *v, uint32_t r, unsigned int rank)
 {
-    const unsigned long mask = r;
-    unsigned int i;
     /* The first rank is always per-vCPU */
     bool private = rank == 0;
 
     /* LPIs will never be set pending via this function */
     ASSERT(!is_lpi(32 * rank + 31));
 
-    bitmap_for_each( i, &mask, 32 )
+    for_each_set_bit ( i, r )
     {
         unsigned int irq = i + 32 * rank;
 
-- 
2.39.2