This improves the code for a switch statement on targets that sign-extend
function arguments, such as RISC-V.  Given a simple testcase

extern void asdf(int);
void foo(int x) {
  switch (x) {
  case 0: asdf(10); break;
  case 1: asdf(11); break;
  case 2: asdf(12); break;
  case 3: asdf(13); break;
  case 4: asdf(14); break;
  }
}

Compiled for a 64-bit target, we get the following for the tablejump:

        li      a5,4
        bgtu    a0,a5,.L1
        slli    a0,a0,32
        lui     a5,%hi(.L4)
        addi    a5,a5,%lo(.L4)
        srli    a0,a0,30
        add     a0,a0,a5
        lw      a5,0(a0)
        jr      a5

There is some unnecessary shifting here.  a0 (x) gets shifted left by 32 then
shifted right by 30 to zero-extend it and multiply by 4 for the table index.
However, after the unsigned greater than branch, we know the value is between
0 and 4.  We also know that a 32-bit int is passed as a 64-bit sign-extended
long for this target.  Thus we get exactly the same value if we sign-extend
instead of zero-extend, and the code is one instruction shorter: a slli by 2
instead of the slli 32/srli 30 pair.

The following patch implements this optimization.  It checks for a range that
does not have the sign bit set and an index value that is already
sign-extended, and then does a sign extend instead of a zero extend.

This has been tested with riscv{32,64}-{elf,linux} builds and testsuite runs.
There were no regressions.  It was also tested with an x86_64-linux build and
testsuite run.

Ok?

Jim

        gcc/
        * expr.c (do_tablejump): When converting index to Pmode, if we have a
        sign extended promoted subreg, and the range does not have the sign bit
        set, then do a sign extend.
---
 gcc/expr.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/gcc/expr.c b/gcc/expr.c
index 9dd0e60d24d..919e20a22f7 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11782,11 +11782,26 @@ do_tablejump (rtx index, machine_mode mode, rtx range, rtx table_label,
     emit_cmp_and_jump_insns (index, range, GTU, NULL_RTX, mode, 1,
                             default_label, default_probability);
 
-
   /* If index is in range, it must fit in Pmode.
      Convert to Pmode so we can index with it.  */
   if (mode != Pmode)
-    index = convert_to_mode (Pmode, index, 1);
+    {
+      unsigned int width;
+
+      /* We know the value of INDEX is between 0 and RANGE.  If we have a
+        sign-extended subreg, and RANGE does not have the sign bit set, then
+        we have a value that is valid for both sign and zero extension.  In
+        this case, we get better code if we sign extend.  */
+      if (GET_CODE (index) == SUBREG
+         && SUBREG_PROMOTED_VAR_P (index)
+         && SUBREG_PROMOTED_SIGNED_P (index)
+         && ((width = GET_MODE_PRECISION (as_a <scalar_int_mode> (mode)))
+             <= HOST_BITS_PER_WIDE_INT)
+         && ! (INTVAL (range) & (HOST_WIDE_INT_1U << (width - 1))))
+       index = convert_to_mode (Pmode, index, 0);
+      else
+       index = convert_to_mode (Pmode, index, 1);
+    }
 
   /* Don't let a MEM slip through, because then INDEX that comes
      out of PIC_CASE_VECTOR_ADDRESS won't be a valid address,
-- 
2.14.1
