On Friday 20 January 2006 18:38, Mark Mitchell wrote: > Steven Bosscher wrote: > > On Friday 20 January 2006 18:21, Mark Mitchell wrote: > >>Richard Guenther wrote: > >>>A patch was also posted based on ideas in the audit trail. It's third > >>>incarnation at > >>>http://gcc.gnu.org/ml/gcc-patches/2006-01/msg00967.html > >>>would need a review. > >> > >>This patch uses "type_i == type_j" which is usually a mistake; are we > >>sure we don't need the usual type-equality predicate functions? > > > > Ah, you noticed that. So you answered your own question: > > If so, I'm not smart enough to understand the answer.
The patch just lacks a proper explanation. Richi left that in the PR audit trail, but it should be in cfgexpand.c. Actually the whole algorithm for the stack slot sharing is poorly explained. I already complained about that when rth commited it, but it was never fixed. > Clearly, the > comments on the patch are insufficient for me to understand what it's > doing. May we please have a version of the patch which a paragraph > explaining what the tests are accomplishing? That code in cfgexpand partitions variables that have to go onto the stack. It constructs a conflict graph for these variables. If two variables (or rather, the partitions they are in) do not conflict, they can be allocated in the same stack space. Initially, all variables are in their own partition, and the stack slot sharing algorithm uses a union-find algorithm to union partitions of variables that can live in the same stack space. If two partitions conflict, they can't be merged. Returning to the test case of the PR, we have two variables of the same union type: union setconflict { short a[20]; int b[10]; }; int foo (void) { int sum = 0; { struct A { union setconflict u; } a; short *c; c = a.u.a; asm ("": "=r" (c):"0" (c)); *c = 2; asm ("": "=r" (c):"0" (c)); sum += *c; } { struct B { union setconflict u; } a; int *c; c = a.u.b; asm ("": "=r" (c):"0" (c)); *c = 1; // This store is moved before the first "sum += *c;" asm ("": "=r" (c):"0" (c)); sum += *c; } return sum; } Initially both the variables "a" are in their own partition, but we unify their partitions so that the two variables are allocated in the same stack space. This confuses RTL alias analysis, which only sees two indirect memory references with different alias sets. It applies the strict aliasing rules and decides that it is OK for the scheduler to move around loads and stores to and from those two variables. A proper fix would somehow teach RTL alias analysis about this kind of situation, but that is beyond my hacking skills, and, I'm sure he would admit, beyond Richi's hacking stills too. I only went looking for work-arounds because the few people who understand how RTL alias analysis works decided to ignore this bug (and pretty much all other GCC 4.1 regressions...). The proposed work-around is to avoid confusing RTL alias analysis by simply not presenting it with situations where two references to the same memory can have different alias sets. The next question was: When can we cause this situation, that two variables are allocated on the stack, and we can access that memory through indirect references with different (non-conflicting) alias sets. Honza, Richi, and I have given this a lot of thought, and we believe this can _only_ happen for aggregate types if they contain a union. If that assumption is wrong, you can stop reading here ;-) If you look at cfgexpand.c, you'll find add_alias_set_conflicts: static void add_alias_set_conflicts (void) { size_t i, j, n = stack_vars_num; for (i = 0; i < n; ++i) { tree type_i = TREE_TYPE (stack_vars[i].decl); bool aggr_i = AGGREGATE_TYPE_P (type_i); for (j = 0; j < i; ++j) { tree type_j = TREE_TYPE (stack_vars[j].decl); bool aggr_j = AGGREGATE_TYPE_P (type_j); if (aggr_i != aggr_j || !objects_must_conflict_p (type_i, type_j)) add_stack_var_conflict (i, j); } } } This is _exactly_ supposed to avoid allocating two variables in the same stack space when we can access that memory through indirect references with non-conflicting alias sets. objects_must_conflict_p is supposed to guard that. objects_must_conflict_p returns true if type_i and type_j have conflicting alias sets. If it returns false, we have to assume that relying on RTL AA is unsafe, so a conflict is added to avoid stuffing the variables of type_i and type_j into the same stack space. (At least, that's my understanding, but rth should know for sure.) But if you have two stack variables of the same union type, you get type_i == type_j. For equal types, objects_must_conflict_p returns true. Therefore, we never add a conflict for variables of the same type, ever. Note that this is entirely based on the type of the variable, _not_ on the types of the fields of an aggregate type. For record types, relying on this ought to be safe because you can't access the same field through references with non-conflicting alias sets without breaking the strict aliasing rules. Any code that does this anyway is already invalid and we don't have to worry about it. However, for unions, it is not quite so easy. Here, you _can_ have references with non-conflicting alias sets to the same memory. In other words, even though the two variables of the same union already have conflicting alias sets, references to fields in the union don't, or at least not necessarily so. Therefore, we propose to avoid allocating _any_ variables with union types, or types containing unions, in the same stack space. Hope this clarifies... I think someone (hi Richi! :-) will prepare a better commented patch if you think our approach is good. Again, all of this is just under the "for lack of anything better" rule. Gr. Steven