On 06/23/2015 07:01 PM, Dave Airlie wrote: > On 24 June 2015 at 11:44, Ian Romanick <i...@freedesktop.org> wrote: >> On 06/24/2015 03:59 PM, Davin McCall wrote: >>> Hi Ian, >>> >>> On 23/06/15 23:26, Ian Romanick wrote: >>>> On 06/23/2015 02:36 PM, Thomas Helland wrote: >>>>> 2015-06-24 23:05 GMT+02:00 Davin McCall <dav...@davmac.org>: >>>>>> Hi - I'm new here. >>>>>> >>>>>> I've recently started poking the Mesa codebase for little reason other >>>>>> than >>>>>> personal interest. In the "help wanted" section of the website it >>>>>> mentions >>>>>> aliasing violations as a target for newcomers to fix, so with that in >>>>>> mind >>>>>> I've attached a patch (against git head) which resolves a few of them, by >>>>>> targeting the linked list implementation (list.h) used in the GLSL >>>>>> compiler/optimizers. This change slightly increases the storage >>>>>> requirements >>>>>> for a list (adds one word) but resolves the blatant aliasing violation >>>>>> that >>>>>> was caused by the trick used to conserve that word in the first place. >>>>>> >>>>>> (I toyed with another approach - using a single sentinel node for both >>>>>> the >>>>>> head and tail of a list - but this was much more invasive, and meant that >>>>>> you could no longer check whether a particular node was a sentinel node >>>>>> unless you had a reference to the list, so I gave up and went with this >>>>>> simpler approach). >>>>>> >>>>>> The most essential change is in the 'exec_list' structure. Three fields >>>>>> 'head', 'tail' and 'tail_pred' are removed, and two separate sentinel >>>>>> nodes >>>>>> are inserted in their place. The old 'head' is replaced by >>>>>> 'head_sentinel.next', 'tail_pred' by 'tail_sentinel.prev', and tail >>>>>> (always >>>>>> NULL) by 'head_sentinel.prev' and 'tail_sentinel.next' (both always >>>>>> NULL). >>>> NAK. The datastructure is correct as-is. It has been in common use >>>> since at least 1985. See the references in the header file. >>> >>> I understand the data structure and how it is supposed to work; the >>> issue is that the trick it employs cannot be implemented in C without >>> breaking the strict aliasing rules (or at least, the current >>> implementation in Mesa breaks the strict aliasing rules). The current >>> code works correctly only with the -fno-strict-aliasing compiler option. >>> The issue is that a pair of 'exec_node *' do not constitute an exec_node >>> in the eyes of the language spec, even though exec_node is declared as >>> holding two such pointers. Consider (from src/glsl/list.h): >>> >>> static inline void >>> exec_list_make_empty(struct exec_list *list) >>> { >>> list->head = (struct exec_node *) & list->tail; >>> list->tail = NULL; >>> list->tail_pred = (struct exec_node *) & list->head; >>> } >>> >>> >>> 'list->head' is of type 'struct exec_node *', and so should point at a >>> 'struct exec_node'. In the code above it is instead co-erced to point >>> at a 'struct exec_node *' (list->tail). That in itself doesn't break the >>> alias rules, but then: >>> >>> static inline bool >>> exec_node_is_tail_sentinel(const struct exec_node *n) >>> { >>> return n->next == NULL; >>> } >>> >>> >>> In 'exec_node_is_tail_sentinel', the sentinel is not actually an >>> exec_node - it is &list->tail. So, if the parameter n does refer to the >>> sentinel, then it does not point to an actual exec_node structure. >>> However, it is de-referenced (by 'n->next') which breaks the strict >>> aliasing rules. This means that the method above can only ever return >>> false, unless it violates the aliasing rules. >>> >>> (The above method could be fixed by casting n to an 'struct exec_node >>> **' and then comparing '**n' against NULL. However, there are other >>> similar examples throughout the code that I do not think would be so >>> trivial). >>> >>> I can quote the relevant parts of the standard if necessary, but your >>> tone somewhat implies that you wouldn't even consider this patch? >> >> Please quote the spec. I have a hard time believing that the compiler >> magically decides that an exec_node* is not really an exec_node* and >> just does whatever it wants (which sounds like what you're describing). >> That sounds pretty rubbish... and I'm surprised that anything works in >> such an environment. >> >> Mesa has also had -fno-strict-aliasing for as long as I can remember, >> and this structure has only been used here for a few years. The whole >> thing just doesn't smell right. > > Oh we've always had aliasing problems this is just one, you can't > expect one person to fix them all at once.
But "With this patch, I can build a working (though perhaps not 100% bug-free) Mesa without using -fno-strict-aliasing during compilation." is a pretty strong statement that doesn't completely jibe many years of Mesa using -fno-strict-aliasing before exec_list was added. > But making headway is a good thing. > > You can't have > struct exec_list *p; > struct exec_node *p2 = (struct exec_list *)p > > And do things with p2 and hope that p will get them, because > the compiler wants to store things its doing to p in registers, > and when you go and do something in p2 it can't work out it's the > same thing, so it has to spill/reload. Which I think is different from what Davin was saying, but I may be misunderstanding the whole thing. That's why I'd like to see spec language. The part that really gets me is that this is across a function boundary... that's generally a sacred line, so I'm surprised that the compiler is allowed to disregard what it's told in that scenario. I'd also like to see assembly dumps with and without -fno-strict-aliasing of some place where this goes wrong. If you, Davin, or someone else can point out a specific function that actually does the wrong thing, I can generate assembly myself. For that matter... how the heck is the ralloc (or any memory allocator) implementation valid? > Dave. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev