On Thu, 2016-02-25 at 18:33 +0100, Torvald Riegel wrote:
> On Wed, 2016-02-24 at 13:14 +0100, Richard Biener wrote:
> > On Tue, Feb 23, 2016 at 8:38 PM, Torvald Riegel <trie...@redhat.com> wrote:
> > > I'd like to know, based on the GCC experience, how important we consider
> > > optimizations that may turn data dependencies of pointers into control
> > > dependencies.  I'm thinking about all optimizations or transformations
> > > that guess that a pointer might have a specific value, and then create
> > > (specialized) code that assumes this value and that is only executed if
> > > the pointer actually has this value.  For example:
> > >
> > > int d[2] = {23, compute_something()};
> > >
> > > int compute(int v) {
> > >   if (likely(v == 23)) return 23;
> > >   else <lots of stuff>;
> > > }
> > >
> > > int bar() {
> > >   int *p = ptr.load(memory_order_consume);
> > >   size_t reveal_that_p_is_in_d = p - d;
> > >   return compute(*p);
> > > }
> > >
> > > Could be transformed to (after inlining compute(), and specializing for
> > > the likely path):
> > >
> > > int bar() {
> > >   int *p = ptr.load(memory_order_consume);
> > >   if (p == d) return 23;
> > >   else <lots of stuff(*p)>;
> > > }
> >
> > Note that if a user writes
> >
> > if (p == d)
> >   {
> >     ... do lots of stuff via p ...
> >   }
> >
> > GCC might rewrite accesses to p as accesses to d and thus expose
> > those opportunities.  Is that a transform that isn't valid then, or is
> > the code written by the user (establishing the equivalency) to blame?
>
> In the context of this memory_order_consume proposal, this transform
> would be valid because the program has already "revealed" what value p
> has after the branch has been taken.
>
> > There's a PR where this kind of equivalency leads to unexpected (wrong?)
> > points-to results, for example.
> >
> > > Other potential examples that come to mind are de-virtualization, or
> > > feedback-directed optimizations that have observed at runtime that a
> > > certain pointer is likely to be always equal to some other pointer
> > > (e.g., if p is almost always &d[0], and specializing for that).
> >
> > Those are the cases that are quite important in practice.
>
> Could you quantify this somehow, even if it's a very rough estimate?
> I'm asking because if it's significant and widely used, then this would
> require users or compiler implementors to make a difficult trade-off
> (i.e., do you want mo_consume performance or performance through those
> other optimizations?).
>
> > > Also, it would be interesting to me to know how often we may turn data
> > > dependencies into control dependencies in cases where this doesn't
> > > affect performance significantly.
> >
> > I suppose we try to avoid that but can we ever know for sure?  Like
> > speculative devirtualization does this (with the intent that it _does_
> > matter, of course).
Due to a think-o on my behalf, I need to add that the transformations that
turn data into control dependencies are only a concern if they operate on
data that is not "constant" during the lifetime of the application; IOW,
such a transformation remains harmless as long as all modifications to the
data accessed through the original data dependency happen-before any of the
accesses that get turned into a control dependency.

In the de-virtualization case I suppose this would be the case, because the
vtables won't change.  So if the compiler turns this:

  func = p->vtable[23];

into this:

  if (p->vtable == structA.vtable)
    func = structA.vtable[23];  // or inlines func directly...

then this would not matter for the memory_order_consume load, because the
vtables wouldn't get modified concurrently.
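To make that concrete, here is a small self-contained C++ sketch of the
de-virtualization case.  All names in it (obj, structA_vtable, the global
ptr, ...) are made up for illustration, and the "devirtualized" function is
only my guess at the shape of code a speculative de-virtualization pass
could emit, not output of any actual GCC pass:

  #include <atomic>

  struct obj;
  struct vtable_t { int (*op)(const obj*); };

  struct obj {
    const vtable_t* vtable;  // set once at construction, never modified
    int payload;
  };

  static int structA_op(const obj* o) { return o->payload; }
  static const vtable_t structA_vtable = { structA_op };

  std::atomic<const obj*> ptr{nullptr};

  // As written: a data dependency runs from the consume load through
  // p and p->vtable to the indirect call target.
  int call_as_written() {
    const obj* p = ptr.load(std::memory_order_consume);
    return p->vtable->op(p);
  }

  // Roughly what speculative de-virtualization might produce: guess the
  // likely vtable, guard the guess with a comparison, and call (or inline)
  // the target directly.  On the fast path the dependency on p->vtable has
  // become a control dependency.  Per the argument above this is harmless
  // for memory_order_consume only because the vtable is "constant": its
  // initialization happens-before all of these loads.
  int call_devirtualized() {
    const obj* p = ptr.load(std::memory_order_consume);
    if (p->vtable == &structA_vtable)
      return structA_op(p);     // direct call, candidate for inlining
    return p->vtable->op(p);    // fallback keeps the indirect call
  }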