On 1/14/22 10:54, Roger Sayle wrote:

Now that the middle-end MULT_HIGHPART_EXPR pieces are in place, this
patch adds support for nvptx's mul.hi.s64 and mul.hi.u64 instructions,
as previously reviewed (provisionally pre-approved) back in August 2020:
https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551373.html
Since then a few things have changed, so this patch uses the new
SMUL_HIGHPART and UMUL_HIGHPART RTX expressions, but the test cases
remain the same.  Like the x86_64 backend, this patch retains the
"trunc" forms of these instructions, because the RTL optimizers
(notably combine) may still generate them.
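
For readers unfamiliar with highpart multiplication, here is a minimal
C sketch of the kind of source that these patterns are meant to match.
The function names are hypothetical and this is not copied from the
new test cases; it assumes GCC's __int128 extension and GCC's
arithmetic right shift of negative values.

```c
#include <stdint.h>

/* Signed highpart: the upper 64 bits of the full 128-bit product.
   Assumes GCC's __int128 and arithmetic right shift of negatives.  */
int64_t smul_hi64(int64_t x, int64_t y)
{
  return (int64_t)(((__int128)x * y) >> 64);
}

/* Unsigned highpart: the same idea with an unsigned 128-bit product.  */
uint64_t umul_hi64(uint64_t x, uint64_t y)
{
  return (uint64_t)(((unsigned __int128)x * y) >> 64);
}
```

On nvptx one would expect each function body to reduce to a single
mul.hi.s64 or mul.hi.u64, though the exact output depends on the
optimization level.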

Given that we're rapidly approaching stage 4, I also took the liberty
of including support in nvptx.md for a few other instructions.  With
the new 64-bit highpart multiplication instructions added above, we
can now provide a define_expand for efficient 64-bit (to 128-bit)
widening multiplications.  This patch also adds support for nvptx's
testp.infinite instruction (for implementing __builtin_isinf) and
the not.pred instruction.
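
To illustrate why the highpart instructions enable the widening
multiply, the sketch below (hypothetical names, unsigned case only,
assuming GCC's unsigned __int128) builds the 128-bit product from a
low-half multiply and a high-half multiply and is intended to agree
with the direct widening form.

```c
#include <stdint.h>

typedef unsigned __int128 u128;

/* Direct 64-bit to 128-bit widening multiply.  */
u128 umul_wide64(uint64_t x, uint64_t y)
{
  return (u128)x * y;
}

/* The same product assembled from its two 64-bit halves.  */
u128 umul_wide64_split(uint64_t x, uint64_t y)
{
  uint64_t lo = x * y;                            /* low half, wraps mod 2^64 */
  uint64_t hi = (uint64_t)(((u128)x * y) >> 64);  /* high half */
  return ((u128)hi << 64) | lo;
}
```

The signed case is analogous, with the high half computed by a signed
highpart multiply.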

As an example of the code generation improvements, the function
int foo(double x) { return __builtin_isinf(x); }
previously generated with -O2:

                 mov.f64 %r26, %ar0;
                 abs.f64 %r28, %r26;
                 setp.leu.f64    %r31, %r28, 0d7fefffffffffffff;
                 selp.u32        %r30, 1, 0, %r31;
                 mov.u32 %r29, %r30;
                 cvt.u16.u8      %r35, %r29;
                 mov.u16 %r33, %r35;
                 xor.b16 %r32, %r33, 1;
                 cvt.u32.u16     %r34, %r32;
                 cvt.u32.u8      %value, %r34;

and with this patch now generates:

                 mov.f64 %r23, %ar0;
                 testp.infinite.f64      %r24, %r23;
                 selp.u32        %value, 1, 0, %r24;
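
As a rough reading of the two sequences in C terms (a sketch with
hypothetical function names, not code from the patch): the old output
compares the absolute value against 0d7fefffffffffffff, i.e. DBL_MAX,
with an "unordered or less-or-equal" test and then inverts the result,
which is equivalent to the ordered comparison in the second function
below.

```c
#include <float.h>

/* What the source asks for; normalized to 0 or 1.  */
int isinf_builtin(double x)
{
  return __builtin_isinf(x) != 0;
}

/* The comparison the old sequence effectively performed:
   true only when |x| exceeds DBL_MAX, and false for NaN because
   an ordered ">" comparison with NaN is false.  */
int isinf_compare(double x)
{
  return __builtin_fabs(x) > DBL_MAX;
}
```

Both functions should agree for every input, including NaN; the patch
simply lets the backend express the test with one testp.infinite
instruction.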

This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
(including newlib) with make and make -k check, with no new failures.
Ok for mainline?



LGTM, applied.

Thanks,
- Tom

2022-01-14  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
        * config/nvptx/nvptx.md (UNSPEC_ISINF): New UNSPEC.
        (one_cmplbi2): New define_insn for not.pred.
        (mulditi3): New define_expand for signed widening multiply.
        (umulditi3): New define_expand for unsigned widening multiply.
        (smul<mode>3_highpart): New define_insn for signed highpart mult.
        (umul<mode>3_highpart): New define_insn for unsigned highpart mult.
        (*smulhi3_highpart_2): Renamed from smulhi3_highpart.
        (*smulsi3_highpart_2): Renamed from smulsi3_highpart.
        (*umulhi3_highpart_2): Renamed from umulhi3_highpart.
        (*umulsi3_highpart_2): Renamed from umulsi3_highpart.
        (*setcc<mode>_from_not_bi): New define_insn.
        (*setcc_isinf<mode>): New define_insn for testp.infinite.
        (isinf<mode>2): New define_expand.

gcc/testsuite/ChangeLog
        * gcc.target/nvptx/mul-hi64.c: New test case.
        * gcc.target/nvptx/umul-hi64.c: New test case.
        * gcc.target/nvptx/mul-wide64.c: New test case.
        * gcc.target/nvptx/umul-wide64.c: New test case.
        * gcc.target/nvptx/isinf.c: New test case.


Thanks in advance,
Roger
--
