Hi! As the following testcase shows, expand_divmod stopped emitting int128 signed divisions by positive small (fitting into hwi) power of two constants in my r242690 aka PR78416 fix, where I've added next to EXACT_POWER_OF_2_OR_ZERO_P uses a check that either the bitsize is smaller or equal to hwi, or the value is positive (because otherwise the value is not a power of two, has say 65 bits set and 63 bits clear). In this particular spot I've been changing: else if (EXACT_POWER_OF_2_OR_ZERO_P (d) + && (size <= HOST_BITS_PER_WIDE_INT || d >= 0) && (rem_flag ? smod_pow2_cheap (speed, compute_mode) : sdiv_pow2_cheap (speed, compute_mode)) /* We assume that cheap metric is true if the optab has an expander for this mode. */ && ((optab_handler ((rem_flag ? smod_optab : sdiv_optab), compute_mode) != CODE_FOR_nothing) || (optab_handler (sdivmod_optab, compute_mode) != CODE_FOR_nothing))) ; - else if (EXACT_POWER_OF_2_OR_ZERO_P (abs_d)) + else if (EXACT_POWER_OF_2_OR_ZERO_P (abs_d) + && (size <= HOST_BITS_PER_WIDE_INT + || abs_d != (unsigned HOST_WIDE_INT) d))
The first change was correct, but I think I've failed to take into account the large additional && there and that the positive power of two values of d aren't really handled in the first else if, it is merely about not optimizing it if division or modulo is fast, and the actual optimization is only done in the second else if, where it handles both the cases of d being a positive power of two, and the case where d is negative and is not a power of two, but its negation abs_d is a positive power of two (handled by doing additional negation afterwards). The condition I've added allowed for the > 64-bit bitsizes only the cases of negative d values where their negation is a positive power of two (and disallowed the corner wrapping case of abs_d == d). This means with the above change we keep optimizing signed int128 division by e.g. -2 or -0x400000000000000 into shifts, but actually don't optimize division by 2 or 0x40000000000000. Although d and abs_d are HOST_WIDE_INT and unsigned HOST_WIDE_INT, the d >= cases are always good even for int128, the higher bits are all zeros and abs_d is the same as d. For d < 0 and d != HOST_WIDE_INT_MIN it is also ok, d is lots of sign bits followed by 63 arbitrary bits, but the absolute value of that is still a number with the msb bit in hwi clear and in wider precision all bits above it clear too. So the only problematic case is d equal to HOST_WIDE_INT_MIN, where we are divising or doing modulo by (signed __int128) 0xffffffffffffffff8000000000000000, and its negation is still the same value when expressed in CONST_INT or HOST_WIDE_INT, but the actual negated value should be (signed __int128) 0x8000000000000000ULL. So, this patch punts for that single special case which we don't handle properly (so it will be expanded as __divti3 likely), and allows again the positive power of two d values. Sorry for introducing this regression. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and release branches? 2018-07-24 Jakub Jelinek <ja...@redhat.com> PR middle-end/86627 * expmed.c (expand_divmod): Punt if d == HOST_WIDE_INT_MIN and size > HOST_BITS_PER_WIDE_INT. For size > HOST_BITS_PER_WIDE_INT and abs_d == d, do the power of two handling if profitable. * gcc.target/i386/pr86627.c: New test. --- gcc/expmed.c.jj 2018-07-16 23:24:29.000000000 +0200 +++ gcc/expmed.c 2018-07-23 12:22:05.835272680 +0200 @@ -4480,6 +4480,11 @@ expand_divmod (int rem_flag, enum tree_c HOST_WIDE_INT d = INTVAL (op1); unsigned HOST_WIDE_INT abs_d; + /* Not prepared to handle division/remainder by + 0xffffffffffffffff8000000000000000 etc. */ + if (d == HOST_WIDE_INT_MIN && size > HOST_BITS_PER_WIDE_INT) + break; + /* Since d might be INT_MIN, we have to cast to unsigned HOST_WIDE_INT before negating to avoid undefined signed overflow. */ @@ -4522,9 +4527,7 @@ expand_divmod (int rem_flag, enum tree_c || (optab_handler (sdivmod_optab, int_mode) != CODE_FOR_nothing))) ; - else if (EXACT_POWER_OF_2_OR_ZERO_P (abs_d) - && (size <= HOST_BITS_PER_WIDE_INT - || abs_d != (unsigned HOST_WIDE_INT) d)) + else if (EXACT_POWER_OF_2_OR_ZERO_P (abs_d)) { if (rem_flag) { --- gcc/testsuite/gcc.target/i386/pr86627.c.jj 2018-07-23 13:13:36.126544676 +0200 +++ gcc/testsuite/gcc.target/i386/pr86627.c 2018-07-23 13:13:12.186512895 +0200 @@ -0,0 +1,28 @@ +/* PR middle-end/86627 */ +/* { dg-do compile { target int128 } } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler-not "call\[^\n\r]*__divti3" } } */ + +__int128_t +f1 (__int128_t a) +{ + return a / 2; +} + +__int128_t +f2 (__int128_t a) +{ + return a / -2; +} + +__int128_t +f3 (__int128_t a) +{ + return a / 0x4000000000000000LL; +} + +__int128_t +f4 (__int128_t a) +{ + return a / -0x4000000000000000LL; +} Jakub