On 6/30/20 12:46 PM, Ilya Leoshkevich wrote:
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547236.html
>
> This is the implementation of Jakub's suggestion: allow
> __builtin_constant_p () after IPA, but fold it into 0. Smoke test
> passed on s390x-redhat-linux, full regtest and bootstrap are running on
> x86_64-redhat-linux.
>
> ---
>
> Linux Kernel (specifically, drivers/leds/trigger/ledtrig-cpu.c) build
> with GCC 10 fails on s390 with "impossible constraint".
>
> The problem is that jump threading makes __builtin_constant_p () lie
> when it splits a path containing a non-constant expression in a way
> that on each of the resulting paths this expression is constant.
>
> Fix by disallowing __builtin_constant_p () on threading paths before
> IPA and fold it into 0 after IPA.
>
> gcc/ChangeLog:
>
> 2020-06-30 Ilya Leoshkevich <i...@linux.ibm.com>
>
> * tree-ssa-threadbackward.c (thread_jumps::m_allow_bcp_p): New
> member.
> (thread_jumps::profitable_jump_thread_path): Do not allow
> __builtin_constant_p () on threading paths unless m_allow_bcp_p
> is set.
> (thread_jumps::find_jump_threads_backwards): Set m_allow_bcp_p.
> (pass_thread_jumps::execute): Allow __builtin_constant_p () on
> threading paths after IPA.
> (pass_early_thread_jumps::execute): Do not allow
> __builtin_constant_p () on threading paths before IPA.
> * tree-ssa-threadupdate.c (duplicate_thread_path): Fold
> __builtin_constant_p () on threading paths into 0.
>
> gcc/testsuite/ChangeLog:
>
> 2020-06-30 Ilya Leoshkevich <i...@linux.ibm.com>
>
> * gcc.target/s390/builtin-constant-p-threading.c: New test.
So I'm finally getting back to this. Thanks for your patience.
It's a nasty little problem, and I suspect there's actually some deeper
issues here. While I'd like to claim its a bad use of b_c_p, I don't
think I can reasonably make that argument.
So what we have is a b_c_p at the start of an if-else chain. Subsequent
tests on the "true" arm of the the b_c_p test may throw us off the
constant path (because the constants are out of range). Once all the
tests are passed (it's constant and the constant is in range) the true
arm's terminal block has a special asm that requires a constant
argument. In the case where we get to the terminal block on the true
arm, the argument to the b_c_p is used as the constant argument to the
special asm.
At first glace jump threading seems to be doing the right thing. Except
that we end up with two paths to that terminal block with the special
asm, one for each of the two constant arguments to the b_c_p call.
Naturally since that same value is used in the asm, we have to introduce
a PHI to select between them at the head of the terminal block. Now
the argument in the asm is no longer constant and boom we fail.
I briefly pondered if we should only throttle when the argument to the
b_c_p is not used elsewhere. But I think that just hides the problem
and with a little work I could probably extend the testcase to still
fail in that scenario.
I also briefly pondered if we should isolate the terminal block as well
(essentially creating one for each unique PHI argument). We'd likely
only need to do that when there's an ASM in the terminal block, but that
likely just papers over the problem as well since the ASM could be in a
successor of the terminal block.
I haven't thought real deeply about it, but I wouldn't be surprised if
there's other passes that can trigger similar problems. Aggressive
cross-jumping would be the most obvious, but some of the hosting/sinking
of operations past PHIs would seem potentially problematical as well.
Jakub suggestion might be the best one in this space. I don't have
anything better right now. The deeper questions about other passes
setting up similar scenarios can probably be punted, I'd expect
threading to be far and above the most common way for this to happen and
I'd be comfortable faulting in investigation of other cases if/when they
happen.
So I retract my initial objections. Let's go with the V2 patch.
jeff