https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92135
Bug ID: 92135
Summary: Implement popcountsi expansion for arm
Product: gcc
Version: 9.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ktkachov at gcc dot gnu.org
Target Milestone: ---
Target: arm*

On aarch64 we can do SImode and DImode popcount operations using the AdvSIMD
CNT instruction. As the comment in aarch64.md says:

;; Pop count be done via the "CNT" instruction in AdvSIMD.
;;
;; MOV v.1d, x0
;; CNT v1.8b, v.8b
;; ADDV b2, v1.8b
;; MOV w0, v2.b[0]

We should be able to do a similar thing on arm, using the VCNT instruction.
This just needs implementing as an expansion in arm.md.

int
foocnt (int a)
{
  return __builtin_popcount (a);
}

is the trivial testcase.

Haven't thought too much about the DImode case, but perhaps that can also be
accelerated in similar ways.
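For reference, here is a minimal scalar C sketch of what the CNT/ADDV sequence quoted above computes: the input is placed in the low lanes of a 64-bit vector, the set bits are counted independently in each byte lane, and the lanes are summed. The names popcount_byte and popcount_via_byte_lanes are hypothetical and exist only for this illustration; this is a model of the data flow, not the proposed arm.md expansion. On arm the horizontal sum would presumably need a different instruction than ADDV (for example a chain of pairwise adds), which is an assumption here rather than something the report states.

```c
#include <stdint.h>

/* Hypothetical helper: population count of one byte, modelling what a
   single 8-bit lane of CNT (or VCNT.8 on arm) produces.  */
static uint8_t
popcount_byte (uint8_t b)
{
  b = (uint8_t) (b - ((b >> 1) & 0x55));          /* 2-bit partial sums */
  b = (uint8_t) ((b & 0x33) + ((b >> 2) & 0x33)); /* 4-bit partial sums */
  return (uint8_t) ((b + (b >> 4)) & 0x0F);       /* final 8-bit count  */
}

/* Hypothetical helper: SImode popcount modelled on the vector sequence.
   Step 1: move the scalar into the low half of an 8-byte vector.
   Step 2: per-lane byte popcount.
   Step 3: horizontal add across all lanes.  */
static int
popcount_via_byte_lanes (uint32_t x)
{
  uint8_t lanes[8] = { 0 };  /* upper four lanes stay zero for SImode */
  int sum = 0;

  for (int i = 0; i < 4; i++)
    lanes[i] = (uint8_t) (x >> (8 * i));

  for (int i = 0; i < 8; i++)
    sum += popcount_byte (lanes[i]);

  return sum;
}
```

A DImode variant of this model would simply fill all eight lanes from a 64-bit input; the per-lane count and the horizontal sum stay the same, which is why the same approach looks plausible for that case too.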