Re: TBAA
On Sun, May 16, 2021 at 8:57 AM Uecker, Martin wrote: > > > > Hi Richard, > > I noticed that GCC 11 has different behavior in the following > example relative to 10.2 with -O2. I wonder whether this > is an intentional change and if yes, what are the rules? Yes, this is a fix for the long-standing PR57359 where we failed to preserve the order of stores when applying store-motion. The basic rule is that stores may not be re-ordered based on TBAA since a store changes the dynamic type of a memory location. That notably happens for C/C++ programs re-using allocated storage and for C++ programs using placement new. It looked like a not so important bug since all other compilers fall into the same trap, nevertheless they are all wrong in doing so (and so was GCC). All work fine at -O0 of course. Richard. > Thanks! > Martin > > https://godbolt.org/z/57res7ax1 > > #include > #include > > > __attribute__((__noinline__, __weak__)) > void f(long unk, void *pa, void *pa2, void *pb, long *x) { > for (long i = 0; i < unk; ++i) { > int oldy = *(int *)pa; > *(float *)pb = 42; > *(int *)pa2 = oldy ^ x[i]; > } > } > > int main(void) { > void *p = malloc(sizeof(int)); > *(int *)p = 13; > f(1, p, p, p, (long []){ 0 }); > printf("*pa(%d)\n", *(int *)p); > }
Re: TBAA
Am Montag, den 17.05.2021, 09:08 +0200 schrieb Richard Biener: > On Sun, May 16, 2021 at 8:57 AM Uecker, Martin > wrote: > > > > > > Hi Richard, > > > > I noticed that GCC 11 has different behavior in the following > > example relative to 10.2 with -O2. I wonder whether this > > is an intentional change and if yes, what are the rules? > > Yes, this is a fix for the long-standing PR57359 where we > failed to preserve the order of stores when applying store-motion. > > The basic rule is that stores may not be re-ordered based on > TBAA since a store changes the dynamic type of a memory > location. That notably happens for C/C++ programs re-using > allocated storage and for C++ programs using placement > new. > > It looked like a not so important bug since all other compilers > fall into the same trap, nevertheless they are all wrong in > doing so (and so was GCC). All work fine at -O0 of course. > > Richard. This is excellent news! We were already thinking about how one could change the effective type rules in C, but it is not clear how this could have been done. Best, Martin > > Thanks! > > Martin > > > > https://godbolt.org/z/57res7ax1 > > > > #include > > #include > > > > > > __attribute__((__noinline__, __weak__)) > > void f(long unk, void *pa, void *pa2, void *pb, long *x) { > > for (long i = 0; i < unk; ++i) { > > int oldy = *(int *)pa; > > *(float *)pb = 42; > > *(int *)pa2 = oldy ^ x[i]; > > } > > } > > > > int main(void) { > > void *p = malloc(sizeof(int)); > > *(int *)p = 13; > > f(1, p, p, p, (long []){ 0 }); > > printf("*pa(%d)\n", *(int *)p); > > }
Re: TBAA
On Mon, May 17, 2021 at 9:32 AM Uecker, Martin wrote: > > > > Am Montag, den 17.05.2021, 09:08 +0200 schrieb Richard Biener: > > On Sun, May 16, 2021 at 8:57 AM Uecker, Martin > > wrote: > > > > > > > > > Hi Richard, > > > > > > I noticed that GCC 11 has different behavior in the following > > > example relative to 10.2 with -O2. I wonder whether this > > > is an intentional change and if yes, what are the rules? > > > > Yes, this is a fix for the long-standing PR57359 where we > > failed to preserve the order of stores when applying store-motion. > > > > The basic rule is that stores may not be re-ordered based on > > TBAA since a store changes the dynamic type of a memory > > location. That notably happens for C/C++ programs re-using > > allocated storage and for C++ programs using placement > > new. > > > > It looked like a not so important bug since all other compilers > > fall into the same trap, nevertheless they are all wrong in > > doing so (and so was GCC). All work fine at -O0 of course. > > > > Richard. > > This is excellent news! We were already thinking about > how one could change the effective type rules in C, but > it is not clear how this could have been done. Yes, indeed. It also turned out the solution found for GCC has zero performance impact in practice. The idea is simply to re-do dependent stores in program order on each loop exit. Without this "trick" fixing the bug has severe impact on SPEC (fixing as in disabling the invalid store-motion optimizations). Richard. > Best, > Martin > > > > > > Thanks! > > > Martin > > > > > > https://godbolt.org/z/57res7ax1 > > > > > > #include > > > #include > > > > > > > > > __attribute__((__noinline__, __weak__)) > > > void f(long unk, void *pa, void *pa2, void *pb, long *x) { > > > for (long i = 0; i < unk; ++i) { > > > int oldy = *(int *)pa; > > > *(float *)pb = 42; > > > *(int *)pa2 = oldy ^ x[i]; > > > } > > > } > > > > > > int main(void) { > > > void *p = malloc(sizeof(int)); > > > *(int *)p = 13; > > > f(1, p, p, p, (long []){ 0 }); > > > printf("*pa(%d)\n", *(int *)p); > > > }
Re: TBAA
Am Montag, den 17.05.2021, 09:44 +0200 schrieb Richard Biener: > On Mon, May 17, 2021 at 9:32 AM Uecker, Martin > wrote: > > > > > > Am Montag, den 17.05.2021, 09:08 +0200 schrieb Richard Biener: > > > On Sun, May 16, 2021 at 8:57 AM Uecker, Martin > > > wrote: > > > > > > > > Hi Richard, > > > > > > > > I noticed that GCC 11 has different behavior in the following > > > > example relative to 10.2 with -O2. I wonder whether this > > > > is an intentional change and if yes, what are the rules? > > > > > > Yes, this is a fix for the long-standing PR57359 where we > > > failed to preserve the order of stores when applying store- > > > motion. > > > > > > The basic rule is that stores may not be re-ordered based on > > > TBAA since a store changes the dynamic type of a memory > > > location. That notably happens for C/C++ programs re-using > > > allocated storage and for C++ programs using placement > > > new. > > > > > > It looked like a not so important bug since all other compilers > > > fall into the same trap, nevertheless they are all wrong in > > > doing so (and so was GCC). All work fine at -O0 of course. > > > > > > Richard. > > > > This is excellent news! We were already thinking about > > how one could change the effective type rules in C, but > > it is not clear how this could have been done. > > Yes, indeed. It also turned out the solution found for GCC > has zero performance impact in practice. The idea is simply > to re-do dependent stores in program order on each loop exit. > Without this "trick" fixing the bug has severe impact on SPEC > (fixing as in disabling the invalid store-motion optimizations). Nice! Was moving these stores out of a loop the only case where GCC did this incorrect store-motions? The other areas where the effective type rules appear unclear and/or not matching reality are: - effective type of aggregates when they are incrementally constructed or changed (not sure whether this is relevant for optimization in GCC) - access paths, i.e. is 'i.p' an access to 'i' or is 'a[3]' an access to 'a' with array type (assuming 'a' is an lvalue of array type before it decays and not a pointer). And if yes, when does this access happen? I think GCC relies on this. - initial sequence rule Anything else? Best, Martin > Richard. > > > Best, > > Martin > > > > > > > > > > Thanks! > > > > Martin > > > > > > > > https://godbolt.org/z/57res7ax1 > > > > > > > > #include > > > > #include > > > > > > > > > > > > __attribute__((__noinline__, __weak__)) > > > > void f(long unk, void *pa, void *pa2, void *pb, long *x) { > > > > for (long i = 0; i < unk; ++i) { > > > > int oldy = *(int *)pa; > > > > *(float *)pb = 42; > > > > *(int *)pa2 = oldy ^ x[i]; > > > > } > > > > } > > > > > > > > int main(void) { > > > > void *p = malloc(sizeof(int)); > > > > *(int *)p = 13; > > > > f(1, p, p, p, (long []){ 0 }); > > > > printf("*pa(%d)\n", *(int *)p); > > > > }
Re: TBAA
On Mon, May 17, 2021 at 1:46 PM Uecker, Martin wrote: > > > > Am Montag, den 17.05.2021, 09:44 +0200 schrieb Richard Biener: > > On Mon, May 17, 2021 at 9:32 AM Uecker, Martin > > wrote: > > > > > > > > > Am Montag, den 17.05.2021, 09:08 +0200 schrieb Richard Biener: > > > > On Sun, May 16, 2021 at 8:57 AM Uecker, Martin > > > > wrote: > > > > > > > > > > Hi Richard, > > > > > > > > > > I noticed that GCC 11 has different behavior in the following > > > > > example relative to 10.2 with -O2. I wonder whether this > > > > > is an intentional change and if yes, what are the rules? > > > > > > > > Yes, this is a fix for the long-standing PR57359 where we > > > > failed to preserve the order of stores when applying store- > > > > motion. > > > > > > > > The basic rule is that stores may not be re-ordered based on > > > > TBAA since a store changes the dynamic type of a memory > > > > location. That notably happens for C/C++ programs re-using > > > > allocated storage and for C++ programs using placement > > > > new. > > > > > > > > It looked like a not so important bug since all other compilers > > > > fall into the same trap, nevertheless they are all wrong in > > > > doing so (and so was GCC). All work fine at -O0 of course. > > > > > > > > Richard. > > > > > > This is excellent news! We were already thinking about > > > how one could change the effective type rules in C, but > > > it is not clear how this could have been done. > > > > Yes, indeed. It also turned out the solution found for GCC > > has zero performance impact in practice. The idea is simply > > to re-do dependent stores in program order on each loop exit. > > Without this "trick" fixing the bug has severe impact on SPEC > > (fixing as in disabling the invalid store-motion optimizations). > > Nice! > > Was moving these stores out of a loop the only > case where GCC did this incorrect store-motions? Those are the ones I'm aware of, yes. > The other areas where the effective type > rules appear unclear and/or not matching reality are: > > - effective type of aggregates when they are incrementally > constructed or changed (not sure whether this is relevant > for optimization in GCC) So like struct S { int i; }; (int *)p = 1; ... = *(struct S *)p; or ... = ((struct S*)p)->i; this and cases with multiple members should work fine with GCC. > - access paths, i.e. is 'i.p' an access to 'i' or is > 'a[3]' an access to 'a' with array type (assuming 'a' > is an lvalue of array type before it decays and not > a pointer). And if yes, when does this access happen? > I think GCC relies on this. Yes, GCC assumes that i.p is an access to 'i' (but not in address-taking context like &i.p). Likewise for ((struct S *)p)->i > - initial sequence rule We do not honor this, but in practice if you have struct S1 { int i; float f; }; struct S2 { int i; double g; }; then accessing the initial sequence as *(int *) works, but struct S1 s1 = {}; int i = ((struct S2 *)&s1)->i; does not because we consider it an access as 'S2' due to the previous point. If it were just an access to 'i' it would work again. > Anything else? The most recent interesting case from the C++ world is when they allowed struct S { int n; std::byte a[16]; } S s; new (s.a) T(); S s2; s2 = s; thus construct an arbitrary type into an aggregate member of std::byte (or char type I believe). With GCCs memory model copying this like s2.a = s.a would have worked (because std::byte and char alias everything). But when you do s2 = s then the accesses as 'struct S' do not TBAA conflict with say 'double' (if you make T == double). For this reason the C++ frontend now drops TBAA for all std::byte/char array containing aggregeates ... I do know this kind of stuff is used in practice in C code as well but usually C people do not use aggregate copying but instead use memcpy (which would be fine). So maybe just an anecdote but eventually such situation occurs in (future) C as well. Richard. > Best, > Martin > > > > > Richard. > > > > > Best, > > > Martin > > > > > > > > > > > > > > Thanks! > > > > > Martin > > > > > > > > > > https://godbolt.org/z/57res7ax1 > > > > > > > > > > #include > > > > > #include > > > > > > > > > > > > > > > __attribute__((__noinline__, __weak__)) > > > > > void f(long unk, void *pa, void *pa2, void *pb, long *x) { > > > > > for (long i = 0; i < unk; ++i) { > > > > > int oldy = *(int *)pa; > > > > > *(float *)pb = 42; > > > > > *(int *)pa2 = oldy ^ x[i]; > > > > > } > > > > > } > > > > > > > > > > int main(void) { > > > > > void *p = malloc(sizeof(int)); > > > > > *(int *)p = 13; > > > > > f(1, p, p, p, (long []){ 0 }); > > > > > printf("*pa(%d)\n", *(int *)p); > > > > > }
Paid link insertion.🔗 Coding.🖥️
Hello. I want to insert a link to this page https://www.gnu.org/software/gcc/projects/cxx-status.html. My link is the link to the coding homework service. How much is such a link insertion? Thank you, I am waiting for your response. -- *Best regards, * *Veronika Chuhalova* *Junior SEO Specialist* *Uvoteam Sales & Marketing Department, Kyiv*
Signedness of boolean types (and Ada)
I noticed while debugging why my "A?CST1:CST0" patch was broken for Ada, I noticed the following ranges for boolean types: # RANGE [0, 1] NONZERO 1 _14 = checks__saved_checks_tos.482_2 > 0; # RANGE [0, 255] NONZERO 1 _18 = _14 == 0; _19 = ~_18; # RANGE [0, 1] NONZERO 1 _15 = _19; if (_15 != 0) goto ; [50.00%] else goto ; [50.00%] Seems like there is a misunderstanding of TYPE_UNSIGNED for boolean types and that is what is causing the problem in the end. As the above gets optimized to be always true. My patch just happens to cause the ~_18 to be produced which is later on miscompiled. Should TYPE_UNSIGNED be always set for boolean types? I am testing the below patch to see if it fixes the problem, if we should assume TYPE_UNSIGNED is true for boolean types. If we should not assume that, then there is a problem with conversion between boolean types that have TYPE_UNSIGNED mismatched. Thanks, Andrew Pinski diff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c index 232b552a60c..4e839f3b8ab 100644 --- a/gcc/ada/gcc-interface/decl.c +++ b/gcc/ada/gcc-interface/decl.c @@ -1687,7 +1687,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition) gnu_type = make_node (is_boolean ? BOOLEAN_TYPE : ENUMERAL_TYPE); TYPE_PRECISION (gnu_type) = esize; - TYPE_UNSIGNED (gnu_type) = is_unsigned; + TYPE_UNSIGNED (gnu_type) = is_boolean ? true : is_unsigned; set_min_and_max_values_for_integral_type (gnu_type, esize, TYPE_SIGN (gnu_type)); process_attributes (&gnu_type, &attr_list, true, gnat_entity);
Re: Signedness of boolean types (and Ada)
On Mon, May 17, 2021 at 6:52 PM Andrew Pinski wrote: > > I noticed while debugging why my "A?CST1:CST0" patch was broken for > Ada, I noticed the following ranges for boolean types: > # RANGE [0, 1] NONZERO 1 > _14 = checks__saved_checks_tos.482_2 > 0; > # RANGE [0, 255] NONZERO 1 > _18 = _14 == 0; > _19 = ~_18; > # RANGE [0, 1] NONZERO 1 > _15 = _19; > if (_15 != 0) > goto ; [50.00%] > else > goto ; [50.00%] > > Seems like there is a misunderstanding of TYPE_UNSIGNED for boolean > types and that is what is causing the problem in the end. As the above > gets optimized to be always true. My patch just happens to cause the > ~_18 to be produced which is later on miscompiled. > > Should TYPE_UNSIGNED be always set for boolean types? > > I am testing the below patch to see if it fixes the problem, if we > should assume TYPE_UNSIGNED is true for boolean types. If we should > not assume that, then there is a problem with conversion between > boolean types that have TYPE_UNSIGNED mismatched. I went back and noticed this has been discussed before but it does look like the middle-end/gimple is still broken and even worse I have done: (bit_not:boolean_type_node (convert:boolean_type_node @0)) Which means I have done the conversions correctly but they are removed still. So I am still thinking we need to have a discussion about what boolean type node should be and even how boolean types should act during the gimple phase. Thanks, Andrew Pinski > > Thanks, > Andrew Pinski > > diff --git a/gcc/ada/gcc-interface/decl.c b/gcc/ada/gcc-interface/decl.c > index 232b552a60c..4e839f3b8ab 100644 > --- a/gcc/ada/gcc-interface/decl.c > +++ b/gcc/ada/gcc-interface/decl.c > @@ -1687,7 +1687,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree > gnu_expr, bool definition) > > gnu_type = make_node (is_boolean ? BOOLEAN_TYPE : ENUMERAL_TYPE); > TYPE_PRECISION (gnu_type) = esize; > - TYPE_UNSIGNED (gnu_type) = is_unsigned; > + TYPE_UNSIGNED (gnu_type) = is_boolean ? true : is_unsigned; > set_min_and_max_values_for_integral_type (gnu_type, esize, > TYPE_SIGN (gnu_type)); > process_attributes (&gnu_type, &attr_list, true, gnat_entity);