Thanks, Richard, I think I can follow your logic. That patch works for my example. BTW, I have a bug report (PR60012) for this, in case you check in the patch.
Should I also report the scalar example as a bug? It looks innocuous per se :-).

Bingfeng

-----Original Message-----
From: Richard Biener [mailto:rguent...@suse.de]
Sent: 03 February 2014 13:16
To: Bingfeng Mei
Cc: Florian Weimer; Jakub Jelinek; gcc@gcc.gnu.org
Subject: RE: No TBAA before ptr_derefs_may_alias_p?

On Mon, 3 Feb 2014, Bingfeng Mei wrote:

> For the following code, why can the load be moved before the store instruction?
> TBAA still applies even if it is an anti-dependency. Somehow alias analysis
> is implemented differently in vectorization.
>
> for
> int foo (long long *a, short *b, int n)
> {
>   *a = (long long)(n * 100);
>
>   return (*b) + 1000;
> }
> x86-64 code
>         imull   $100, %edx, %edx
>         movswl  (%rsi), %eax
>         movslq  %edx, %rdx
>         movq    %rdx, (%rdi)
>         addl    $1000, %eax
>         ret

That's a bug. Probably a wrong predicate used in the scheduler
(we've fixed many I think). -fno-schedule-insns2 fixes it.

But after some local discussion I think we can do

Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c   (revision 207417)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -235,6 +235,18 @@ vect_analyze_data_ref_dependence (struct
       || (DR_IS_READ (dra) && DR_IS_READ (drb)))
     return false;
 
+  /* Even if we have an anti-dependence then, as the vectorized loop covers at
+     least two scalar iterations, there is always also a true dependence.
+     As the vectorizer does not re-order loads and stores we can ignore
+     the anti-dependence if TBAA can disambiguate both DRs similar to the
+     case with known negative distance anti-dependences (positive
+     distance anti-dependences would violate TBAA constraints).  */
+  if (((DR_IS_READ (dra) && DR_IS_WRITE (drb))
+       || (DR_IS_WRITE (dra) && DR_IS_READ (drb)))
+      && !alias_sets_conflict_p (get_alias_set (DR_REF (dra)),
+                                 get_alias_set (DR_REF (drb))))
+    return false;
+
   /* Unknown data dependence.  */
   if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know)
     {

We agreed that dependence analysis isn't really the suitable place
to apply TBAA. To arrive at the above, the reasoning goes like so:
we need to avoid the case where loading DRA after storing DRB would
load a different value. But if DRA were to load from a place where
DRB stored to, then this would be a true dependence, and thus we can
apply TBAA to that "re-load" and argue it may not happen.

The same reasoning applies to LIM and PRE performing invariant motion
and disambiguating the load they want to hoist against a store over
the back-edge - if there were any aliasing then it wouldn't be valid.

Note that both transforms, vectorization and LIM, are careful not to
move the loads after the stores. The vectorizer can still re-order
loads and stores by means of effectively unrolling, thus

  b[i] = a[i]

becomes

  tem1 = a[i]
  tem2 = a[i+1]
  ...
  b[i] = tem1
  b[i+1] = tem2
  ...

instead of

  b[i] = a[i]
  b[i+1] = a[i+1]
  ...

so the interesting case to construct is one with differently sized a[]
and b[] (to allow one set of DRs catching the other) and try to prove
that you can't construct one that causes a[] to read from a location
that b[] stored to but the vectorizer would introduce such a false
dependence. I think that's not possible (fingers crossed ;)).

Richard.

>
> Bingfeng
> -----Original Message-----
> From: Richard Biener [mailto:rguent...@suse.de]
> Sent: 03 February 2014 10:18
> To: Florian Weimer
> Cc: Jakub Jelinek; Bingfeng Mei; gcc@gcc.gnu.org
> Subject: Re: No TBAA before ptr_derefs_may_alias_p?
>
> On Mon, 3 Feb 2014, Florian Weimer wrote:
>
> > On 02/03/2014 10:59 AM, Jakub Jelinek wrote:
> > > On Mon, Feb 03, 2014 at 09:51:01AM +0000, Bingfeng Mei wrote:
> > > > If it is just for C++ placement new, why not implement it as a
> > > > lang_hook?
> > > > Now other languages such as C have to be made conservative and produce
> > > > worse code.
> > >
> > > Even in C++ code you don't use placement new that often, so e.g. by having
> > > the placement new explicit through some special GIMPLE statement in the IL,
> > > you could e.g. just look if a particular function or loop contains any
> > > placement new stmts (cached in struct function and loop?) and use TBAA if
> > > it isn't there.
> >
> > I believe the convenience of TBAA lies in the fact that you don't have to
> > prove anything about actual program behavior if the types are sufficiently
> > distinct. If you allow local violations of that principle, the global
> > property inevitably breaks down as well.
> >
> > In any case, C code can call C++ code and vice versa, so it's difficult to
> > consider each language in isolation.
>
> As I said in the other mail, even C code can change the dynamic type of
> a storage location (via memcpy). And as soon as you require
> a look at stmts in between two refs that you ask the oracle to
> disambiguate you are doing something wrong.
>
> Richard.
>

-- 
Richard Biener <rguent...@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer