On Fri, 28 Oct 2011, Richard Guenther wrote:

> On Fri, 28 Oct 2011, Jakub Jelinek wrote:
>
> > On Fri, Oct 28, 2011 at 12:59:48PM +0200, Richard Guenther wrote:
> > > It is also because of re-use of memory via memcpy (yes, some dubious
> > > TBAA case from C, but essentially we don't want to break that). Thus
> > > we can't use TBAA on anonymous memory.
> >
> > No, IMHO we always use a ref_all mem access in that case.
> > If you meant something like:
> >
> > void
> > foo (int *intptr, float *floatptr)
> > {
> >   int i;
> >   for (i = 0; i < 256; ++i)
> >     {
> >       int tem;
> >       __builtin_memcpy (&tem, &intptr[i], sizeof (tem));
> >       floatptr[i] = (float) tem;
> >     }
> > }
> >
> > which is valid C even if intptr == floatptr, we have:
> >
> > <bb 2>:
> >
> > <bb 3>:
> >   # i_21 = PHI <i_14(4), 0(2)>
> >   # ivtmp.12_27 = PHI <ivtmp.12_26(4), 256(2)>
> >   D.2709_3 = (long unsigned int) i_21;
> >   D.2710_4 = D.2709_3 * 4;
> >   D.2711_6 = intptr_5(D) + D.2710_4;
> >   D.2712_7 = MEM[(char * {ref-all})D.2711_6];
> >   D.2713_11 = floatptr_10(D) + D.2710_4;
> >   D.2715_13 = (float) D.2712_7;
> >   *D.2713_11 = D.2715_13;
> >   i_14 = i_21 + 1;
> >   ivtmp.12_26 = ivtmp.12_27 - 1;
> >   if (ivtmp.12_26 != 0)
> >     goto <bb 4>;
> >   else
> >     goto <bb 5>;
> >
> > <bb 4>:
> >   goto <bb 3>;
> >
> > which is just fine even with TBAA.
> > And similarly for
> > void
> > bar (int *intptr, float *floatptr)
> > {
> >   int i;
> >   for (i = 0; i < 256; ++i)
> >     {
> >       float tem;
> >       tem = (float) intptr[i];
> >       __builtin_memcpy (&floatptr[i], &tem, sizeof (tem));
> >     }
> > }
> >
> > where the ref-all isn't used for load, but for store.
>
> Well, yeah. I said it's probably difficult to generate a
> C testcase. It's still valid middle-end IL (and well-defined) to have
> intptr == floatptr and MEM[(int *)..] and MEM[(float *)...].
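
For reference, a minimal sketch of the exact overlap case at the C level,
reusing foo from the quoted mail with plain memcpy in place of
__builtin_memcpy. The driver exact_overlap_ok is hypothetical and not part
of the thread. It assumes sizeof (int) == sizeof (float) and uses malloc'ed
memory, so the float stores are allowed to change the effective type of
each slot.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* foo from the quoted mail: the load goes through memcpy (ref-all),
   the store goes through a plain float access.  */
void
foo (int *intptr, float *floatptr)
{
  int i;
  for (i = 0; i < 256; ++i)
    {
      int tem;
      memcpy (&tem, &intptr[i], sizeof (tem));
      floatptr[i] = (float) tem;
    }
}

/* Hypothetical driver: calls foo with intptr == floatptr and returns 1
   if every slot then holds (float) of the int that was there before.  */
int
exact_overlap_ok (void)
{
  int i, ok = 1;
  void *buf;
  int *ip;
  float *fp;

  /* Assumption: int and float have the same size.  */
  assert (sizeof (int) == sizeof (float));

  buf = malloc (256 * sizeof (int));
  if (buf == NULL)
    return 0;

  ip = buf;
  for (i = 0; i < 256; ++i)
    ip[i] = i;

  foo (buf, buf);   /* Exact overlap: both pointers refer to the same slots.  */

  fp = buf;
  for (i = 0; i < 256; ++i)
    if (fp[i] != (float) i)
      ok = 0;

  free (buf);
  return ok;
}
```

Each iteration reads slot i before it overwrites slot i, so the result does
not depend on any earlier float store.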

Btw, only the exact overlap case is critical. For non-exact overlap like

  for (i)
    {
      float[i] = int[i-1] + int[i];
    }

you can reason that there cannot be aliasing: if you execute this loop
more than once(!) then you'd have

  float[i] = int[i-1] + int[i];
  float[i+1] = int[i] + int[i+1];
  ...

where the 2nd load from int[i] would load from float-initialized memory,
which is undefined. Thus you can assume that float != int. But that
requires a more thorough analysis that we don't do at the moment, and
knowledge that the loop will iterate at least N times (when called from
the vectorizer, N is the vectorization factor, which is at least 2).

Richard.
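
A small simulation of the non-exact overlap argument, as a sketch only. It
runs two iterations of float[i] = int[i-1] + int[i] over one shared buffer
and does every access through memcpy, so the simulation itself stays
well-defined. The helper names float_bits and second_iteration_load are
hypothetical, and the code assumes sizeof (int) == sizeof (float).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: the object representation of a float, read as int.  */
int
float_bits (float f)
{
  int bits;
  assert (sizeof (bits) == sizeof (f));
  memcpy (&bits, &f, sizeof (bits));
  return bits;
}

/* Runs float[i] = int[i-1] + int[i] for i = 1, 2 with float == int.
   Returns the value the second iteration loads for int[i-1], i.e. the
   slot that the first iteration has already overwritten with a float.  */
int
second_iteration_load (void)
{
  unsigned char buf[3 * sizeof (int)];
  int init[3] = { 1, 2, 3 };
  int i, a = 0, b;
  float f;

  memcpy (buf, init, sizeof (init));
  for (i = 1; i < 3; ++i)
    {
      memcpy (&a, buf + (i - 1) * sizeof (int), sizeof (a));  /* int[i-1] */
      memcpy (&b, buf + i * sizeof (int), sizeof (b));        /* int[i] */
      f = (float) (a + b);
      memcpy (buf + i * sizeof (float), &f, sizeof (f));      /* float[i] */
    }
  return a;
}
```

The second iteration gets the bit pattern of the float written by the first
iteration, not the original int. With plain int and float accesses instead
of memcpy, that load would be the undefined one.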