This patch improves the RTL that the middle-end generates for testing
signed overflow following a widening multiplication.  During this
expansion the middle-end generates a truncation which can get used
multiple times.  Placing this intermediate value in a pseudo register
reduces the amount of code generated on platforms where this truncation
requires an explicit instruction.
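
As a rough illustration of the check being expanded, here is a sketch in
plain C, not the GCC source.  The function name smul32_overflows and the
fixed 32/64-bit widths are assumptions made for the example.  The
truncated low part is computed once and then used both to derive the
sign word and as the result, which is the reuse this patch makes
explicit with a pseudo.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, for illustration only: detect signed overflow of
   a 32x32 multiplication using a 64-bit widening product.  */
static bool
smul32_overflows (int32_t a, int32_t b, int32_t *out)
{
  int64_t wide = (int64_t) a * (int64_t) b;

  /* Truncate once and keep the value; it is needed twice below.  */
  uint32_t lo = (uint32_t) (uint64_t) wide;
  uint32_t hi = (uint32_t) ((uint64_t) wide >> 32);

  /* Equivalent of an arithmetic right shift of LO by 31 bits.  */
  uint32_t signword = (lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;

  /* Second use of LO: it is also the (possibly wrapped) result.  */
  memcpy (out, &lo, sizeof lo);

  /* No overflow iff the high part is the sign extension of LO.  */
  return hi != signword;
}
```

On a target where the truncation costs an instruction, recomputing LO at
each use would correspond to the duplicated conversion shown below.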

This simple call to force_reg eliminates 368 lines of the -S output
from testsuite/c-c++-common/torture/builtin-arith-overflow-1.c on
nvptx-none.  An example difference is in t120_1smul, where the following
7-instruction sequence, in which the 1st and 6th instructions perform
the same truncation:

< cvt.u32.u64     %r31, %r28;           <- truncate %r28
< shr.s32 %r30, %r31, 31;
< cvt.u32.u64     %r32, %r29;
< setp.eq.u32     %r33, %r30, %r32;
< selp.u32        %r24, 0, 1, %r33;
< cvt.u32.u64     %r25, %r28;           <- truncate %r28
< setp.eq.u32     %r34, %r24, 0;

is now generated as a 4-instruction sequence without duplication:

> cvt.u32.u64     %r30, %r28;
> shr.s32 %r31, %r30, 31;
> cvt.u32.u64     %r32, %r29;
> setp.eq.u32     %r33, %r31, %r32;

On x86_64-pc-linux-gnu, where SUBREGs are free, this patch generates
exactly the same builtin-arith-overflow-1.s as before.

This patch has been tested on both x86_64-pc-linux-gnu with
"make bootstrap" and nvptx-none with "make", with no new
testsuite regressions on either platform.
Ok for mainline?


2020-07-06  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog:
        * internal-fn.c (expand_mul_overflow): When checking for signed
        overflow from a widening multiplication, we access the truncated
        lowpart RES twice, so keep this value in a pseudo register.


Thanks in advance,
Roger
--
Roger Sayle
NextMove Software
Cambridge, UK

diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 0be2eb4..d1bd6cc 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -1627,6 +1627,9 @@ expand_mul_overflow (location_t loc, tree lhs, tree arg0, tree arg1,
                                     profile_probability::very_likely ());
          else
            {
+             /* RES is used more than once, place it in a pseudo.  */
+             res = force_reg (mode, res);
+
              rtx signbit = expand_shift (RSHIFT_EXPR, mode, res, prec - 1,
                                          NULL_RTX, 0);
              /* RES is low half of the double width result, HIPART
