https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97005

--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tom de Vries <vr...@gcc.gnu.org>:

https://gcc.gnu.org/g:5b2d679bbbcc2b976c6e228ba63afdf67c33164e

commit r12-7170-g5b2d679bbbcc2b976c6e228ba63afdf67c33164e
Author: Tom de Vries <tdevr...@suse.de>
Date:   Mon Feb 7 14:12:34 2022 +0100

    [nvptx] Workaround sub.u16 driver JIT bug

    There's a nvidia driver JIT bug that mishandles this code (minimized from
    builtin-arith-overflow-15.c):
    ...
    int main (void) {
      signed char r;
      unsigned char y = (unsigned char) 0x80;
      if (__builtin_sub_overflow ((unsigned char)0, (unsigned char)y, &r))
        __builtin_abort ();
      return 0;
    }
    ...
    which at ptx level minimizes to:
    ...
      mov.u16 r22, 0x0080;
      st.local.u16 [frame_var],r22;
      ld.local.u16 r32,[frame_var];
      sub.u16 r33,0x0000,r32;
      cvt.u32.u16 r35,r33;
    ...
    where we expect r35 == 0x0000ff80 but get instead 0xffffff80, and where using
    nvptx-none-run -O0 fixes the problem.  [ See also
    https://github.com/vries/nvidia-bugs/tree/master/builtin-arith-overflow-15 . ]

    Try to workaround the bug by using sub.s16 instead of sub.u16.

    Tested on nvptx.

    gcc/ChangeLog:

    2022-02-07  Tom de Vries  <tdevr...@suse.de>

            PR target/97005
            * config/nvptx/nvptx.md (define_insn "sub<mode>3"): Workaround
            driver JIT bug by using sub.s16 instead of sub.u16.
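
The arithmetic in the commit message above can be checked on the host. The following is a minimal sketch in plain C, not code from the GCC tree: the function names are hypothetical, and reading the wrong value as a sign extension is only an interpretation of the reported 0xffffff80, not something the commit states. The first function models sub.u16 followed by cvt.u32.u16 (16-bit wraparound, then zero extension); the second reproduces the observed value by sign-extending the same 16-bit difference.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "sub.u16" followed by "cvt.u32.u16":
   subtract with 16-bit wraparound, then zero-extend to 32 bits.  */
uint32_t sub_u16_then_zext (uint16_t a, uint16_t b)
{
  /* a and b promote to int; the cast keeps only the low 16 bits.  */
  uint16_t diff = (uint16_t) (a - b);
  return (uint32_t) diff;
}

/* Hypothetical model of the observed wrong result: the same 16-bit
   difference, but widened as if it were a signed 16-bit value.  */
uint32_t sub_u16_then_sext (uint16_t a, uint16_t b)
{
  uint16_t diff = (uint16_t) (a - b);
  /* Manual sign extension, avoiding implementation-defined casts.  */
  int32_t wide = (diff & 0x8000u) ? (int32_t) diff - 0x10000
                                  : (int32_t) diff;
  return (uint32_t) wide;
}

/* Self-check using the operands from the PTX snippet (0x0000 - 0x0080).  */
void check_example (void)
{
  assert (sub_u16_then_zext (0x0000, 0x0080) == 0x0000ff80u);
  assert (sub_u16_then_sext (0x0000, 0x0080) == 0xffffff80u);
}
```

Calling check_example () from a test harness passes both assertions. The low 16 bits are the same in both models, which is consistent with the workaround: assuming plain two's-complement wraparound, signed and unsigned 16-bit subtraction yield the same bit pattern, so switching to sub.s16 should not change the intended result.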
