https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108803

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |rsandifo at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I'd say this is a bug in expand_doubleword_shift_condmove or thereabouts.
aarch64 with !TARGET_SIMD is a !SHIFT_COUNT_TRUNCATED target, shift_mask is 0,
and HAVE_conditional_move is non-zero.

Now, if the shift count is constant at expansion time, we just select one of
expand_superword_shift or expand_subword_shift depending on the exact value,
and in both cases the shift count ought to be in the [0, BITS_PER_WORD - 1]
range. Similarly, if !HAVE_conditional_move and the shift count is
non-constant, we do the same except that we select one at runtime, so again
the chosen shift count should be in [0, BITS_PER_WORD - 1] at runtime.

But for expand_doubleword_shift_condmove with shift_mask 0, op1 is in
[0, 2 * BITS_PER_WORD - 1] and we pass op1 and superword_op1, where the latter
is op1 - BITS_PER_WORD. So, in expand_doubleword_shift_condmove, subword_op1
is in the [0, 2 * BITS_PER_WORD - 1] range and superword_op1 is in the
[-BITS_PER_WORD, BITS_PER_WORD - 1] range. The routine just emits both
expand_superword_shift and expand_subword_shift and selects one of the
results using a conditional move. But that means one of the two shifts
necessarily has an out-of-range count: either subword_op1 is in
[BITS_PER_WORD, 2 * BITS_PER_WORD - 1] (i.e. too large), or superword_op1 is
in [-BITS_PER_WORD, -1] (i.e. negative). Don't we need to mask those counts
in that case (both)?

Now, in the testcase __builtin_add_overflow_p is actually evaluated to
constant 0 only during expansion (or later?). In this particular case I
wonder why we haven't optimized it earlier, because for any unsigned addends
the addition is in the [0, 2 * UINT_MAX - 2] range and so fits well into the
signed __int128 range; something to be looked at for GCC 14. But I fear it is
exactly this constant, discovered only during RTL, that we later propagate
into op1 - BITS_PER_WORD, and thus we do one of the shifts with count -64.

CCing Richard as the author of that code from 2004.
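
To make the range argument concrete, here is a minimal C sketch. It is not
GCC's expander code: the type dword and the functions in_range,
count_exactly_one_out_of_range and shl_condmove_masked are hypothetical names
made up for illustration, and BITS_PER_WORD is assumed to be 64. It models
"compute both shifts, then select" for a left shift, checks that for every
count in [0, 2 * BITS_PER_WORD - 1] exactly one of the two unmasked counts is
out of range, and shows what masking both counts (the question raised above)
would look like.

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_WORD 64   /* assumption: 64-bit words, as on aarch64 */

/* Hypothetical model of a double-word value as two single words. */
typedef struct { uint64_t lo, hi; } dword;

/* 1 if COUNT is a valid single-word shift count, else 0. */
static int in_range(int count)
{
  return count >= 0 && count < BITS_PER_WORD;
}

/* For each n in [0, 2 * BITS_PER_WORD - 1], form the two unmasked counts
   (subword: n, superword: n - BITS_PER_WORD) and count how often exactly
   one of them is out of range.  */
static int count_exactly_one_out_of_range(void)
{
  int hits = 0;
  for (int n = 0; n < 2 * BITS_PER_WORD; n++) {
    int subword_op1 = n;
    int superword_op1 = n - BITS_PER_WORD;
    if (in_range(subword_op1) != in_range(superword_op1))
      hits++;
  }
  return hits;
}

/* Double-word left shift by n in [0, 2 * BITS_PER_WORD - 1] that always
   evaluates both candidate results and then selects one, with BOTH counts
   masked so that neither single-word shift is out of range.  */
static dword shl_condmove_masked(dword x, int n)
{
  int subword_op1 = n & (BITS_PER_WORD - 1);
  int superword_op1 =
    (int) ((unsigned) (n - BITS_PER_WORD) & (BITS_PER_WORD - 1));
  assert(in_range(subword_op1) && in_range(superword_op1));

  /* Subword candidate (correct when n < BITS_PER_WORD).  The bits carried
     from lo into hi are x.lo >> (BITS_PER_WORD - count); that is split into
     two shifts so a count of 0 never produces a shift by BITS_PER_WORD.  */
  dword sub;
  uint64_t carries = (x.lo >> 1) >> (BITS_PER_WORD - 1 - subword_op1);
  sub.lo = x.lo << subword_op1;
  sub.hi = (x.hi << subword_op1) | carries;

  /* Superword candidate (correct when n >= BITS_PER_WORD).  */
  dword sup;
  sup.lo = 0;
  sup.hi = x.lo << superword_op1;

  /* Stand-in for the conditional move between the two candidates.  */
  return n < BITS_PER_WORD ? sub : sup;
}
```

Calling shl_condmove_masked with a count of 64 exercises the case from the
report: unmasked, the subword candidate would shift by 64, and for counts
below 64 the superword candidate would shift by a negative amount. With both
counts masked, the candidate that is discarded still uses a valid count.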