The result of a POPCOUNT operation in RTL should have the same mode
as its operand.  This corrects the specification of popcount in
the nvptx backend, splitting the current generic define_insn into
two, one for popcountsi2 and the other for popcountdi2 (the latter
with an explicit truncate).

This patch has been tested on nvptx-none (hosted on x86_64-pc-linux-gnu)
with make and make -k check with no new failures.  This functionality is
already tested by gcc.target/nvptx/popc-[123].c.  Ok for mainline?


2023-01-09  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
        * config/nvptx/nvptx.md (popcount<mode>2): Split into...
        (popcountsi2): define_insn handling SImode popcount.
        (popcountdi2): define_insn handling DImode popcount, with an
        explicit truncate:SI to produce an SImode result.

Thanks in advance,
Roger
--

diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 740c4de..461540e 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -658,11 +658,18 @@
   DONE;
 })
 
-(define_insn "popcount<mode>2"
+(define_insn "popcountsi2"
   [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
-       (popcount:SI (match_operand:SDIM 1 "nvptx_register_operand" "R")))]
+       (popcount:SI (match_operand:SI 1 "nvptx_register_operand" "R")))]
   ""
-  "%.\\tpopc.b%T1\\t%0, %1;")
+  "%.\\tpopc.b32\\t%0, %1;")
+
+(define_insn "popcountdi2"
+  [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
+       (truncate:SI
+         (popcount:DI (match_operand:DI 1 "nvptx_register_operand" "R"))))]
+  ""
+  "%.\\tpopc.b64\\t%0, %1;")
 
 ;; Multiplication variants
 

Reply via email to