On Thu, 10 Apr 2025, Jakub Jelinek wrote: > Hi! > > The following testcase is miscompiled I believe starting with > PR112941 r14-6742. That commit fixed the bitint-55.c testcase. > The m_first initialization for such conversion initializes 2 SSA_NAMEs, > one is PHI result on the loop (m_data[save_data_cnt]) and the other > (m_data[save_data_cnt+1]) is the argument of that PHI from the latch > edge initialized somewhere in the loop. Both of these are used to > propagate sign extension (i.e. either 0 or all ones limb) from the > iteration with the sign bit of a narrower type to following iterations. > The bitint-55.c testcase was ICEing with invalid SSA forms as it was > using unconditionally the PHI argument SSA_NAME even in places which > weren't dominated by that. And the code which was touched is about > handling constant idx, so if e.g. there are nested casts and the > outer one does conditional code based on index comparison with > a particular constant index. > In the following testcase there are 2 nested casts, one from signed > _BitInt(129) to unsigned _BitInt(255) and the outer from unsigned > _BitInt(255) to unsigned _BitInt(256). The m_upward_2limbs case which > is used for handling mergeable arithmetics (like +-|&^ and casts etc.) > one loop iteration handles 2 limbs, the first half the even ones, the > second half the odd ones. > And for these 2 conversions, the special one for the inner conversion > on x86_64 is with index 2 where the sign bit of _BitInt(129) is present, > while for the outer one index 3 where we need to mask off the most > significant bit. > The r15-6742 change started using m_data[save_data_cnt] for all constant > indexes if it is still inside of the loop (and it is sign extension). > But that doesn't work correctly for the case where the inner conversion > produces the sign extension limb in the loop for an even index and > the outer conversion needs to special case the immediately next conversion, > because in that case using the PHI result will see still 0 there rather > than the updated value from the handling of previous limb. > So the following patch special cases this and uses the other SSA_NAME. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
OK. Thanks, Richard. > Commented IL, trying to lower > _1 = (unsigned _BitInt(255)) y_4(D); > _2 = (unsigned _BitInt(256)) _1; > _3 = _2 + x_5(D); > <retval> = _3; > we were emitting > <bb 3> [local count: 1073741824]: > # _8 = PHI <0(2), _9(12)> // This is the limb index > # _10 = PHI <0(2), _11(12)> // Sign extension limb from inner cast (0 or > ~0UL) > # _22 = PHI <0(2), _23(12)> // Overflow bit from addition of previous limb > if (_8 <= 2) > goto <bb 4>; [80.00%] > else > goto <bb 7>; [20.00%] > > <bb 4> [local count: 1073741824]: > if (_8 == 2) > goto <bb 6>; [20.00%] > else > goto <bb 5>; [80.00%] > > <bb 5> [local count: 1073741824]: > _12 = VIEW_CONVERT_EXPR<unsigned long[3]>(y)[_8]; // Full limbs in y > goto <bb 7>; [100.00%] > > <bb 6> [local count: 214748360]: > _13 = MEM <unsigned long> [(_BitInt(129) *)&y + 16B]; // y[2] which > _14 = (<unnamed-signed:1>) _13; // needs to be > _15 = (unsigned long) _14; // sign extended > _16 = (signed long) _15; // to full > _17 = _16 >> 63; // limb > _18 = (unsigned long) _17; > > <bb 7> [local count: 1073741824]: > # _19 = PHI <_12(5), _10(3), _15(6)> // Limb to add for result of > casts > # _20 = PHI <0(5), _10(3), _18(6)> // Sign extension limb from previous > limb > _11 = _20; // PHI _10 argument above > _21 = VIEW_CONVERT_EXPR<unsigned long[4]>(x)[_8]; > _24 = .UADDC (_19, _21, _22); > _25 = IMAGPART_EXPR <_24>; > _26 = REALPART_EXPR <_24>; > VIEW_CONVERT_EXPR<unsigned long[4]>(<retval>)[_8] = _26; > _27 = _8 + 1; > if (_27 == 3) // For the outer cast limb 3 is special > goto <bb 11>; [20.00%] > else > goto <bb 8>; [80.00%] > > <bb 8> [local count: 1073741824]: > if (_27 < 2) > goto <bb 9>; [80.00%] > else > goto <bb 10>; [20.00%] > > <bb 9> [local count: 1073741824]: > _28 = VIEW_CONVERT_EXPR<unsigned long[3]>(y)[_27]; // These are used in > full > > <bb 10> [local count: 1073741824]: > # _29 = PHI <_28(9), _11(8)> > goto <bb 12>; [100.00%] > > <bb 11> [local count: 214748360]: > // And HERE is the actual bug. Using _10 for idx 3 will mean it is always > // zero there and doesn't contain the _18 value propagated to it. > // It should be > // _30 = (<unnamed-unsigned:63>) _11; > // Now if the outer conversion had special iteration say 5, we could > // have used _10 fine here, by that time it already propagates through > // the PHI. > _30 = (<unnamed-unsigned:63>) _10; > _31 = (unsigned long) _30; > > <bb 12> [local count: 1073741824]: > # _32 = PHI <_29(10), _31(11)> > _33 = VIEW_CONVERT_EXPR<unsigned long[4]>(x)[_27]; > _34 = .UADDC (_32, _33, _25); > _23 = IMAGPART_EXPR <_34>; > _35 = REALPART_EXPR <_34>; > VIEW_CONVERT_EXPR<unsigned long[4]>(<retval>)[_27] = _35; > _9 = _8 + 2; > if (_9 != 4) > goto <bb 3>; [0.05%] > else > goto <bb 13>; [99.95%] > > 2025-04-10 Jakub Jelinek <ja...@redhat.com> > > PR tree-optimization/119707 > * gimple-lower-bitint.cc (bitint_large_huge::handle_cast): Only use > m_data[save_data_cnt] instead of m_data[save_data_cnt + 1] if > idx is odd and equal to low + 1. Remember tree_to_uhwi (idx) in > a temporary instead of calling the function multiple times. > > * gcc.dg/torture/bitint-76.c: New test. > > --- gcc/gimple-lower-bitint.cc.jj 2025-04-08 14:08:51.000000000 +0200 > +++ gcc/gimple-lower-bitint.cc 2025-04-10 16:18:53.315303307 +0200 > @@ -1547,14 +1547,15 @@ bitint_large_huge::handle_cast (tree lhs > } > else > { > - if (tree_to_uhwi (idx) < low) > + unsigned tidx = tree_to_uhwi (idx); > + if (tidx < low) > { > t = handle_operand (rhs1, idx); > if (m_first) > m_data[save_data_cnt + 2] > = build_int_cst (NULL_TREE, m_data_cnt); > } > - else if (tree_to_uhwi (idx) < high) > + else if (tidx < high) > { > t = handle_operand (rhs1, size_int (low)); > if (m_first) > @@ -1587,7 +1588,9 @@ bitint_large_huge::handle_cast (tree lhs > m_data_cnt = tree_to_uhwi (m_data[save_data_cnt + 2]); > if (TYPE_UNSIGNED (rhs_type)) > t = build_zero_cst (m_limb_type); > - else if (m_bb && m_data[save_data_cnt]) > + else if (m_bb > + && m_data[save_data_cnt] > + && ((tidx & 1) == 0 || tidx != low + 1)) > t = m_data[save_data_cnt]; > else > t = m_data[save_data_cnt + 1]; > --- gcc/testsuite/gcc.dg/torture/bitint-76.c.jj 2025-04-10 > 16:24:01.677066011 +0200 > +++ gcc/testsuite/gcc.dg/torture/bitint-76.c 2025-04-10 16:23:35.743422373 > +0200 > @@ -0,0 +1,19 @@ > +/* PR tree-optimization/119707 */ > +/* { dg-do run { target bitint } } */ > + > +#if __BITINT_MAXWIDTH__ >= 256 > +__attribute__((noipa)) unsigned _BitInt(256) > +foo (unsigned _BitInt(256) x, _BitInt(129) y) > +{ > + return x + (unsigned _BitInt(255)) y; > +} > +#endif > + > +int > +main () > +{ > +#if __BITINT_MAXWIDTH__ >= 256 > + if (foo (0, -1) != > 0x7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffuwb) > + __builtin_abort (); > +#endif > +} > > Jakub > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)