This improves the code for a switch statement on targets that sign-extend function arguments, such as RISC-V. Given a simple testcase
extern void asdf(int);

void foo(int x)
{
  switch (x)
    {
    case 0: asdf(10); break;
    case 1: asdf(11); break;
    case 2: asdf(12); break;
    case 3: asdf(13); break;
    case 4: asdf(14); break;
    }
}

Compiled for a 64-bit target, we get for the tablejump

	li	a5,4
	bgtu	a0,a5,.L1
	slli	a0,a0,32
	lui	a5,%hi(.L4)
	addi	a5,a5,%lo(.L4)
	srli	a0,a0,30
	add	a0,a0,a5
	lw	a5,0(a0)
	jr	a5

There is some unnecessary shifting here.  a0 (x) gets shifted left by 32 and
then shifted right by 30 to zero-extend it and multiply by 4 for the table
index.  However, after the unsigned greater-than branch, we know the value is
between 0 and 4.  We also know that a 32-bit int is passed as a 64-bit
sign-extended long for this target.  Thus we get exactly the same value if we
sign-extend instead of zero-extend, and the code is one instruction shorter:
a slli by 2 instead of the slli 32/srli 30 pair.

The following patch implements this optimization.  It checks for a range that
does not have the sign bit set, and an index value that is already sign
extended, and then does a sign extend instead of a zero extend.

This has been tested with riscv{32,64}-{elf,linux} builds and testsuite runs.
There were no regressions.  It was also tested with an x86_64-linux build and
testsuite run.

Ok?

Jim

	gcc/
	* expr.c (do_tablejump): When converting index to Pmode, if we have
	a sign extended promoted subreg, and the range does not have the
	sign bit set, then do a sign extend.
---
 gcc/expr.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/gcc/expr.c b/gcc/expr.c
index 9dd0e60d24d..919e20a22f7 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11782,11 +11782,26 @@ do_tablejump (rtx index, machine_mode mode, rtx range, rtx table_label,
   emit_cmp_and_jump_insns (index, range, GTU, NULL_RTX, mode, 1,
 			   default_label, default_probability);
 
-
   /* If index is in range, it must fit in Pmode.
      Convert to Pmode so we can index with it.  */
   if (mode != Pmode)
-    index = convert_to_mode (Pmode, index, 1);
+    {
+      unsigned int width;
+
+      /* We know the value of INDEX is between 0 and RANGE.  If we have a
+	 sign-extended subreg, and RANGE does not have the sign bit set, then
+	 we have a value that is valid for both sign and zero extension.  In
+	 this case, we get better code if we sign extend.  */
+      if (GET_CODE (index) == SUBREG
+	  && SUBREG_PROMOTED_VAR_P (index)
+	  && SUBREG_PROMOTED_SIGNED_P (index)
+	  && ((width = GET_MODE_PRECISION (as_a <scalar_int_mode> (mode)))
+	      <= HOST_BITS_PER_WIDE_INT)
+	  && ! (INTVAL (range) & (HOST_WIDE_INT_1U << (width - 1))))
+	index = convert_to_mode (Pmode, index, 0);
+      else
+	index = convert_to_mode (Pmode, index, 1);
+    }
 
   /* Don't let a MEM slip through, because then INDEX that comes out of
      PIC_CASE_VECTOR_ADDRESS won't be a valid address,
-- 
2.14.1
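
P.S. For comparison, with the patch applied the tablejump sequence should look
roughly like the sketch below.  This is hand-edited from the listing above
based on the description (the single slli by 2 replacing the slli 32/srli 30
pair), so the exact register allocation and scheduling may differ:

	li	a5,4
	bgtu	a0,a5,.L1
	slli	a0,a0,2
	lui	a5,%hi(.L4)
	addi	a5,a5,%lo(.L4)
	add	a0,a0,a5
	lw	a5,0(a0)
	jr	a5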
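
And to illustrate the underlying reasoning outside the compiler, here is a
small standalone C check (not part of the patch; the variable names are made
up for the example): once the index is known to be in [0, range] and range
does not have the sign bit set, sign extension and zero extension of the index
to the wider type produce the same bits.

#include <assert.h>
#include <stdint.h>

int
main (void)
{
  const int32_t range = 4;	/* Sign bit of range is clear.  */
  for (int32_t index = 0; index <= range; index++)
    {
      int64_t sext = (int64_t) index;		    /* Sign extension.  */
      int64_t zext = (int64_t) (uint32_t) index;    /* Zero extension.  */
      assert (sext == zext);
    }
  return 0;
}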