On Fri, 2020-06-26 at 16:04 -0600, Jeff Law wrote: > On Fri, 2020-06-26 at 23:54 +0200, Ilya Leoshkevich wrote: > > How should this work ideally? Suppose we have: > > > > static inline void foo (int i) > > { > > if (__builtin_is_constant_p (i)) > > asm volatile ("bar" :: "i" (i)) > > else > > asm volatile ("baz" :: "d" (i)); > > } > > > > First of all, this is a supported use case, right? > Yes, this is a supported case. > > > Then, the way I see it, is that at least for a while there must > > exist > > trees like the ones above, regardless of whether their asm > > arguments > > match constraints. But ultimately dead code elimination should get > > rid > > of the wrong ones before they reach RTL. > > E.g. in the example above, the non-inlined version should have > > `!__builtin_is_constant_p (i)`, so `bar` should not survive until > > RTL. The same for inlined `foo (1)` version's `baz`. > In theory yes, but there are cases where paths converge (like you've > shown) where > you may have evaluated to a constant on the paths, but it's not a > constant at the > convergence point. You have to be very careful using b_c_p like this > and it's > been a regular source of kernel bugs.
Is there something specific that a compiler user should look out for? For example, here is the kernel code, from which the test was derived: static inline void atomic_add(int i, atomic_t *v) { #ifdef CONFIG_HAVE_MARCH_Z196_FEATURES if (__builtin_constant_p(i) && (i > -129) && (i < 128)) { __atomic_add_const(i, &v->counter); return; } #endif __atomic_add(i, &v->counter); } It looks very straightforward - can there still be something wrong with its usage of b_c_p? > I'd recommend looking at the .ssa dump and walk forward from there if > the .ssa > dump looks correct. > > jeff Well, 021t.ssa already has: __attribute__((gnu_inline)) __atomic_add_const (intD.6 valD.2269, intD.6 * ptrD.2270) { intD.6 val_3(D) = valD.2269; intD.6 * ptr_2(D) = ptrD.2270; ;; basic block 2, loop depth 0, maybe hot ;; prev block 0, next block 1, flags: (NEW) ;; pred: ENTRY (FALLTHRU) # .MEM_4 = VDEF <.MEM_1(D)> __asm__ __volatile__("asi %0,%1 " : "ptr" "=Q" *ptr_2(D) : "val" "i" val_3(D), "Q" *ptr_2(D) : "memory", "cc"); # VUSE <.MEM_4> return; ;; succ: EXIT } which is, strictly speaking, not correct, because val_3(D) and valD.2269 are not constant. But as far as I understand we are willing to tolerate trees like this until a certain point. What is this point supposed to be? If I understood you right, 106t.thread1 is already too late - why is it so?