The result of a POPCOUNT operation in RTL should have the same mode as its operand. This patch corrects the specification of popcount in the nvptx backend, splitting the current generic define_insn into two: one for popcountsi2 and the other for popcountdi2 (the latter with an explicit truncate to SImode).
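As an illustration only (not part of the patch), the two RTL shapes can be modelled in C roughly as follows. The helper names popcount_si and popcount_di_truncated are hypothetical, and the sketch assumes a compiler that provides the GCC builtins __builtin_popcount and __builtin_popcountll. It shows why the explicit truncate loses no information: the bit count of a 64-bit value is at most 64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of (popcount:SI x): operand and result are both
   32 bits wide, so no conversion is needed.  */
static uint32_t popcount_si(uint32_t x)
{
  return (uint32_t) __builtin_popcount(x);
}

/* Hypothetical model of (truncate:SI (popcount:DI x)): the count is
   formed in the operand's 64-bit width, then narrowed to 32 bits.  */
static uint32_t popcount_di_truncated(uint64_t x)
{
  uint64_t wide = (uint64_t) __builtin_popcountll(x);  /* popcount:DI */
  assert(wide <= 64);          /* the count always fits in 32 bits */
  return (uint32_t) wide;      /* truncate:SI */
}
```

For example, popcount_di_truncated(UINT64_MAX) yields 64 and popcount_si(0xF0F0F0F0u) yields 16.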

This patch has been tested on nvptx-none (hosted on x86_64-pc-linux-gnu) with make and make -k check with no new failures. This functionality is already tested by gcc.target/nvptx/popc-[123].c. Ok for mainline?


2023-01-09  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
        * config/nvptx/nvptx.md (popcount<mode>2): Split into...
        (popcountsi2): define_insn handling SImode popcount.
        (popcountdi2): define_insn handling DImode popcount, with an
        explicit truncate:SI to produce an SImode result.


Thanks in advance,
Roger
--

diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 740c4de..461540e 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -658,11 +658,18 @@
   DONE;
 })
 
-(define_insn "popcount<mode>2"
+(define_insn "popcountsi2"
   [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
-        (popcount:SI (match_operand:SDIM 1 "nvptx_register_operand" "R")))]
+        (popcount:SI (match_operand:SI 1 "nvptx_register_operand" "R")))]
   ""
-  "%.\\tpopc.b%T1\\t%0, %1;")
+  "%.\\tpopc.b32\\t%0, %1;")
+
+(define_insn "popcountdi2"
+  [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
+        (truncate:SI
+          (popcount:DI (match_operand:DI 1 "nvptx_register_operand" "R"))))]
+  ""
+  "%.\\tpopc.b64\\t%0, %1;")
 
 ;; Multiplication variants