On Tue, Jul 7, 2015 at 11:23 PM, Abe <abe_skol...@yahoo.com> wrote: >> (if-conversion could directly generate masked load/stores >> of course and not use a scratch-pad at all in that case). > > > IMO that`s a great idea, but I don`t know how to do it. Hints would be > welcome. In particular, how does one "generate masked load/stores" at the > GIMPLE level?
It already does that, see predicate_mem_writes. You should definitely preserve that path (I think it does that only when versioning the loop for if-conversion, non-vectorized loops will then not be if-converted). > >> But are we correctly handling these cases in the current if conversion >> code? > > > I`m uncertain to what that is intended to refer, but I believe Sebastian > would agree that the new if converter is safer than the old one in terms of > correctness at the time of running the code being compiled. > > >> Abe's changes would seem like a step forward from a correctness standpoint > > > Not to argue, but as a point of humility: Sebastian did by far the most work > on this patch. I just modernized it and helped move it along. Even _that_ > was done with Sebastian`s help. > > >> even if they take us a step backwards from a performance standpoint. > > > For now, we have a few performance regressions, and so far we have found > that it`s non-trivial to remove all of those regressions. On what hardware did you test? > We may be better off pushing the current patch to trunk and having the > performance regressions currently introduced by the new if converter be > fixed by later patches. > Pushing to trunk gives us excellent "visibility" amongst GCC hackers, so the > code will get more "eyeballs" than if it lingers in an uncommitted patch or > in a branch. > I, for one, would love some help in fixing these performance regressions. > ;-) > > If fixing the performance regressions winds up taking too long, perhaps the > current imperfect patch could be undone on trunk just before a release is > tagged, > and then we`ll push it in again when trunk goes back to being allowed to be > unstable? According to my analysis of the data near the end of the page at > <https://gcc.gnu.org/develop.html>, we have until roughly April of 2016 to > work on not-yet-perfect patches in trunk. > > >>> So the question is whether we get more non-vectorized if-converted >>> code out of this (and thus whether we want to use --param >>> allow-store-data-races to get the old code back which is nicer to less >>> capable CPUs and probably faster than using scatter/gather or masked >>> loads/stores). > > >> I do think conditionalizing some of this on the allow-store-data-races >> makes sense. > > > I think having both the old if-converter and the new one "live on" in GCC is > nontrivial, but not impossible. I also don`t think it`s the best long-term > goal, > but only a short-term workaround. In the long run, IMO there should be only > one if converter, albeit perhaps with tweaking flags [e.g. > "-fallow-unsafe-if-conversion"]. > > >> I also wonder if we should really care about load data races (not sure >> your patch does). > > > According to a recent long discussion I had with Sebastian, our current > patch does not have the flaw I was concerned it might have in terms of loads > because: > > [1] the scratchpad is only being used to if-convert assignments to > thread-local scalars, never to globals/statics, and because... > > [2] the gimplifier is supposed to detect "the address of this scalar has > been taken" and when such is detected in the code being compiled, > it causes the scalar to no longer look like a scalar in GIMPLE so that > we are also safe from stale-data problems that could come from > corner-case code that takes the address of a thread-local variable and > gives that address to another thread [which then proceeds to > overwrite the value in the supposedly-thread-local scalar that belongs > to a different thread from the one doing the writing] > > > Regards, > > Abe > >