On Mon, Oct 24, 2016 at 10:05 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Mon, Oct 24, 2016 at 09:55:47AM +0200, Richard Biener wrote:
>> Hum.  This means we no longer can for example apply loop versioning or
>> peeling to loops containing any of both calls.  I've not yet traced down 
>> whether
>> we can still inline functions containing such calls.
>
> I agree gimple_can_duplicate_bb_p is too strong for this.
>
>> That said, we _can_ duplicate blocks with such calls and I don't think
>> that for b_c_p we guarantee anything about it (inlines into two different
>> functions may produce true in one case and false in another).  Your
>> testcase may show an interesting "odd" case but I don't think it's
>> any different that what we regularly see or _want_ to see.
>>
>> Same for b_o_s of course.
>>
>> So I think the bug is no bug.
>
> The problem is that it breaks something fairly common.  The linux kernel
> has:
> #define ilog2(n)                                \
> (                                               \
>         __builtin_constant_p(n) ? (             \
>                 (n) < 1 ? ____ilog2_NaN() :     \
>                 (n) & (1ULL << 63) ? 63 :       \
>                 (n) & (1ULL << 62) ? 62 :       \
> ...
>                 (n) & (1ULL <<  4) ?  4 :       \
>                 (n) & (1ULL <<  3) ?  3 :       \
>                 (n) & (1ULL <<  2) ?  2 :       \
>                 (n) & (1ULL <<  1) ?  1 :       \
>                 (n) & (1ULL <<  0) ?  0 :       \
>                 ____ilog2_NaN()                 \
>                                    ) :          \
>         (sizeof(n) <= 4) ?                      \
>         __ilog2_u32(n) :                        \
>         __ilog2_u64(n)                          \
>  )
>
> and the assumption that in the b_c_p guarded branch (or block) the
> corresponding variable is really constant is just too common.
> I think either not threading if b_c_p appears there, or folding it to 0
> during the threading is what we want.

I agree we should avoid the situation for the above.  The reduced
testcase is too obfuscated to be worth fixing though and for the above
it looks rather like a missed optimization to me ... (we thread a too
small part of the path?)

Richard.

>         Jakub

Reply via email to