Davin McCall <dav...@davmac.org> writes: > On 26/06/15 13:18, Francisco Jerez wrote: >> Davin McCall <dav...@davmac.org> writes: >> >>> On 26/06/15 11:08, Erik Faye-Lund wrote: >>>> On Thu, Jun 25, 2015 at 1:48 AM, Davin McCall <dav...@davmac.org> wrote: >>>>> This is an alternative to my earlier patch [1] (and it is now constructed >>>>> properly using git format-patch). >>>>> >>>>> Quick background: >>>>> There is a problem in exec_list due to it directly including a trio >>>>> of 'struct exec_node *' members to implement two overlapping sentinel >>>>> nodes. The sentinel nodes do not exist as exec_node objects and so >>>>> should not be accessed as such, according to C99 6.5 paragraph 7. >>>>> When this strict aliasing rule is violated the compiler may re-order >>>>> reads and writes in unexpected ways, such as demonstrated in another >>>>> email [2]. >>>>> >>>>> The problem only manifests if compiling without -fno-strict-aliasing. >>>>> >>>>> This patch addresses the issue by introducing some new methods for >>>>> setting the 'next' and 'prev' members of the exec_node structure, which >>>>> avoid the aliasing restrictions by way of casting the exec_node pointer >>>>> (to an exec_node-pointer-pointer) whenever it may actual point to a >>>>> sentinel node. Essentially an exec_node can be seen as an array of two >>>>> exec_node pointers, and this view is compatible with the sentinel >>>>> structure in exec_list. >>>>> >>>>> Compared to the previous patch, this patch is much less intrusive, and >>>>> does not increase the storage requirements of the exec_list structure. >>>>> >>>>> While I'm not proposing that -fno-strict-aliasing no longer be used for >>>>> Mesa builds, this patch represents a step in that direction. With this >>>>> patch applied, a working Mesa library can be built, although bugs may >>>>> be present (and could be triggered only when using particular compiler >>>>> versions and options). FWIW file sizes with and without strict aliasing: >>>>> >>>>> (gcc 4.8.4, -O3 -fomit-frame-pointer -march=corei7). >>>>> >>>>> -fno-strict-aliasing: with strict aliasing: >>>>> libGL.so 699188 699188 (no change) >>>>> *_dri.so 9575876 9563104 (-2772) >>>>> >>>>> (dri bundle includes r300, r600, kms_swrast and swrast). >>>>> >>>>> So, not a huge win, size-wise. Dave Airlie reports a 30K difference in >>>>> his r600_dri.so build however [3]. >>>>> >>>>> In terms of performance: >>>>> >>>>> (export LIBGL_ALWAYS_SOFTWARE=1; time glmark2) >>>>> >>>>> -fno-strict-aliasing: >>>>> >>>>> glmark2 Score: 244 >>>>> real 5m34.707s >>>>> user 11m36.192s >>>>> sys 0m29.596s >>>>> >>>>> with strict aliasing: >>>>> >>>>> glmark2 Score: 247 >>>>> real 5m34.438s >>>>> user 11m29.904s >>>>> sys 0m29.556s >>>>> >>>>> Again, only a very small improvement when strict aliasing is enabled. >>>>> >>>>> With the above in mind it is reasonable to question whether this patch >>>>> is worthwhile. However, it's done, and it seems to work, so I offer it >>>>> for review. >>>>> >>>>> >>>>> [1] http://lists.freedesktop.org/archives/mesa-dev/2015-June/087179.html >>>>> [2] http://lists.freedesktop.org/archives/mesa-dev/2015-June/087246.html >>>>> [3] http://lists.freedesktop.org/archives/mesa-dev/2015-June/087206.html >>>>> --- >>>>> src/glsl/list.h | 123 >>>>> ++++++++++++++++++++++++++++++++++++-------------------- >>>>> 1 file changed, 80 insertions(+), 43 deletions(-) >>>>> >>>>> diff --git a/src/glsl/list.h b/src/glsl/list.h >>>>> index 15fcd4a..cfbe5a9 100644 >>>>> --- a/src/glsl/list.h >>>>> +++ b/src/glsl/list.h >>>>> @@ -76,6 +76,12 @@ >>>>> #include "util/ralloc.h" >>>>> >>>>> struct exec_node { >>>>> + /** >>>>> + * Accessing these fields directly may be ill-advised; if the >>>>> 'exec_node' >>>>> + * is actually a sentinel node embedded in the exec_list structure, it >>>>> may >>>>> + * be a strict-aliasing violation (C99 6.5 paragraph 7). Use the >>>>> methods >>>>> + * provided instead. >>>>> + */ >>>>> struct exec_node *next; >>>>> struct exec_node *prev; >>>>> >>>>> @@ -140,35 +146,55 @@ exec_node_init(struct exec_node *n) >>>>> n->prev = NULL; >>>>> } >>>>> >>>>> +/** >>>>> + * Strict-aliasing safe method for setting the next pointer for any >>>>> + * node, including sentinel nodes. >>>>> + */ >>>>> +static inline void >>>>> +exec_node_set_next(struct exec_node *n, struct exec_node *next) >>>>> +{ >>>>> + ((struct exec_node **)n)[0] = next; >>>>> +} >>>>> + >>>>> +/** >>>>> + * Strict-aliasing safe method for setting the next pointer for any >>>>> + * node, including sentinel nodes. >>>>> + */ >>>>> +static inline void >>>>> +exec_node_set_prev(struct exec_node *n, struct exec_node *next) >>>>> +{ >>>>> + ((struct exec_node **)n)[1] = next; >>>>> +} >>>>> + >>>>> static inline const struct exec_node * >>>>> exec_node_get_next_const(const struct exec_node *n) >>>>> { >>>>> - return n->next; >>>>> + return ((const struct exec_node **)n)[0]; >>>>> } >>>> How exactly is this supposed to be strict-aliasing safe? >>>> >>>> Here's the wording from the C++14 spec: >>>> >>>> "If a program attempts to access the stored value of an object through >>>> a glvalue of other than one of the following types the behavior is >>>> undefined: >>>> * the dynamic type of the object, >>>> * a cv-qualified version of the dynamic type of the object, >>>> * a type similar (as defined in 4.4) to the dynamic type of the object, >>>> * a that is the signed or unsigned type corresponype tding to the >>>> dynamic type of the object, >>>> * a type that is the signed or unsigned type corresponding to a >>>> cv-qualified version of the dynamic type of the object, >>>> * an aggregate or union type that includes one of the aforementioned >>>> types among its elements or non-static data members (including, >>>> recursively, an element or non-static data member of a subaggregate or >>>> contained union), >>>> * a type that is a (possibly cv-qualified) base class type of the >>>> dynamic type of the object, >>>> * a char or unsigned char type." >>>> >>>> You cast an 'struct exec_node *' to 'struct exec_node **', and read >>>> it. Those are different types, and are not among the exceptions listed >>>> above... >>> >>> In for eg: >>> >>> return ((const struct exec_node **)n)[0] >>> >>> ... The stored value of 'n' is not accessed by any other type than the >>> type of n itself. This value is then cast to a different pointer type. >>> You are mistaken if you think that the cast accesses the stored value of >>> n. The other "stored value" access that it occurs in that expression is >>> to the object pointed at by the result of the cast. As the exec_node >>> structure is defined as: >>> >>> struct exec_node *next; >>> struct exec_node *prev; >>> >>> ... then the value being accessed is the 'next' member of the exec_node >>> struct, not the exec_node struct itself. I.e. store value of type >>> 'struct exec_node *' is being accessed via a glvalue of that same type. >>> >>> Putting it another way, casting some pointer p, of type 'struct >>> exec_node *', to 'struct exec node **', is effectively the same as '& >>> p->next', but without the de-reference of p (which is the potential an >>> aliasing violation that the patch avoids). >>> >>> >>> Although I don't think this is your concern, the cast is valid due to: >>> C99 6.7.2.1 p13: >>> "[...]. A pointer to a structure object, suitably converted, points to >>> its initial member (or if that member is a >>> bit-field, then to the unit in which it resides), and vice versa." >>> >>> I don't have C++14, so I've checked C++11. I'm less familiar with the >>> C++ standard text in general, so I'm not sure where exactly the >>> equivalent to the C99 is, but I am quite certain that the rules are >>> essentially the same. >>> >>> Perhaps this snippet is all that's really required: >>> >>> C++11 1.8: "Unless an object is a bit-field or a base class subobject of >>> zero size, the address of that object is the address >>> of the first byte it occupies." >>> >>> There's also 4.10p2: >>> >>> "A prvalue of type “pointer to cv T,” where T is an object type, can be >>> converted to a prvalue of type “pointer to cv void”. The result of >>> converting a “pointer to cv T” to a “pointer to cv void” points to the >>> start of the storage location where the object of type T resides [...]" >>> >>> ... which to some extent implies that maybe you should cast first to >>> "void *" then to the member type, but in practice this isn't necessary. >>> >> That may be right, but it still seems kind of difficult to enforce that >> other users of the linked list won't end up breaking the strict aliasing >> rule by accident unless we change its data structures completely >> (e.g. you still have exec_list objects pointed to by exec_node pointers >> what makes it really tempting for other users to bypass the wrappers you >> have introduced and break the rule again). > > Yes. I believe there are such cases in the code already; the patch does > not set out to correct all the aliasing violations. Such a patch would > be quite large. This was an attempt to make some progress, not resolve > the problem completely. >
Yes, of course. The question is whether it's worth the considerable amount of work you mention to fix all users of a linked list implementation that by design encourages them to violate the strict aliasing rule. The amount of effort required to migrate users to a safer linked list implementation may be of the same order of magnitude. >> Another issue is that AFAICT this introduces a new kind of UB e.g. when >> you do "((const struct exec_node **)n)[1]". The layout of the exec_node >> (or exec_list) object "n" points to is unspecified, and the array >> subscript past the end of the pointed-to object is undefined because of >> section 6.5.6 of the C99 spec regarding the additive operators: >> >> | 7 For the purposes of these operators, a pointer to an object that is >> | not an element of an array behaves the same as a pointer to the >> | first element of an array of length one with the type of the object >> | as its element type. >> | >> | 8 [..] If the result points one past the last element of the array >> | object, it shall not be used as the operand of a unary * operator >> | that is evaluated. >> >> (Array subscripting is later on defined in terms of the '+' and '*' >> operators) > > It's true that this is what the standard says, but this is one of those > cases where actual compiler implementations are more permissive than the > standard, as a matter of practicality. > > That being said, the code could be made to work without the array > subscripting by casting first to 'char *' before doing the pointer > arithmetic, and using 'offsetof' to be certain of the correct address of > the next/prev members. > That sounds a less dubious, but... >> That said, this problem is IMHO worth fixing regardless of the >> performance improvement, this linked list implementation relies on UB >> which we work around by passing a non-standard compiler flag to, AFAIK, >> only GCC and Clang, there's no guarantee that other compilers won't take >> advantage of the strict aliasing rule and miscompile this code. > > I agree. > >> We may >> just as well replace this linked list with another of the many >> implementations we have, as someone else suggested earlier. > > The main problem is that the exec_node structure allows making a > determination of whether a node is at the head or the end of the list > without requiring a reference to the list structure itself. The list > implementation in src/util/list.h is on the other hand a simple > doubly-linked list where the list head is also the list tail and neither > can be distinguished from other list nodes (for exec_node, the next > pointer will be NULL in the tail sentinel, and the prev pointer will be > NULL in the head sentinel). > Right. > The amount of effort required to use the src/util/list.h implementation > wherever the exec_list is currently used would be large, and would > probably incur a memory overhead because in various places you would > need to store a reference to the list as well as the node where you > previously only needed a reference to the node. > I'd be surprised if it incurred any measurable performance penalty not offset by the improvement from '-fstrict-aliasing', might be worth checking. > The other alternative would be to make src/util/list.h use the exec_list > implementation, and hope that there's no code that requires on the list > being circular. You'd still have the aliasing violation to contend with, > too, unless you use my patch. So, even with this as an ultimate goal, > the patch I've provided makes progress. If there was concern about the > casting hoops necessary to avoid strict aliasing, the alternative would > be to use version 1 of the patch, which split the head sentinel and tail > sentinel into two nodes. > Your first approach seemed quite reasonable IMHO. Were you able to measure any performance regression from it? Thanks. > Davin
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev