[Richard wrote:]
Yes, the GIMPLE if-converter should strictly be a vectorization enabler. If the "new" if-converter produces code that the vectorizer does not handle (on the target targeted) then it has done a bad job.
Understood [or at least I _think_ I did ;-)] and agreed. OTOH, there _are_ things that apparently the vectorizer _could_ do better, esp. handling scatter/gather. That improvement might be pending new vector instructions in the target ISA[s], though... ... YMMV. ;-) [Richard wrote:]
I think I fixed this bug. It was because of an inverted test. 2015-07-10 Richard Biener <rguent...@suse.de> PR tree-optimization/66823 * tree-if-conv.c (memrefs_read_or_written_unconditionally): Fix inverted predicate.
I can confirm that as of that just after that commit, the bug is gone. If I understand correctly, the bug was in the analysis leading to the decision to convert or not, such that the fix makes the old [pre-scratchpad] converter decide "don`t convert". Is that correct? Would you like me to include the DejaGNUified form of the related new test case in the patch even though this bug no longer exists in the "old" converter? Maybe it`s nice to keep the test for catching possible future regressions. [Richard wrote:]
It was known to produce store data-races with
> -ftree-loop-if-convert-stores.
That is something the scratchpad use avoids (at the expense of a lot(?) of vectorization).
TTBOMK, that "expense" is temporary, pending fixes/improvement to the new converter. IOW, I don`t think any of the relevant regressions are as a result of "essential complexity". [Richard wrote:]
First of all to fix regressions on the load if-conversion side
> you should stop using a scratchpad for loads (do you have one? > I seem to remember you do from the design description). Yes, we do have one. I respectfully disagree with the suggestion to remove all scratchpad use for loads. That use is what allows us to if-convert -- and therefor vectorize -- expressions that may result in invalid pointer dereferences for "condition is false" iterations/columns. AFAIK the old converter -- due to lack of a scratchpad -- must give up and not convert such cases, thus resulting in code of that kind not getting vectorized. As of now, the relevant "improvement" in/due-to the new converter is not always effective, i.e. it does not always result in a vectorization, due to the current implementation of the scratchpad mechanism being too generic and adding indirection that confuses the vectorizer. We intend to fix this. [Richard wrote:]
Then you should IMHO keep the old store path
> for --param allow-store-data-races=1. I have discussed this with Sebastian; he and I formed a plan to allow the following modes: * no if conversion [note: %%] * if-conversion of loads only, the old way * if-conversion of stores only, the old way * if-conversion of loads and stores, the old way * if-conversion of loads only, with scratchpads * if-conversion of stores only, with scratchpads * if-conversion of loads and stores, with scratchpads "%%": to retain as the default in trunk GCC until we have enough improvement, e.g. a cost model to decide whether or not it`s good to if-convert a particular loop Once all the known regressions in the new converter have been fixed and the numbers [e.g. from SPEC] of loops vectorized looks good, I expect/hope that it will be possible to remove the old if converter in favor of the new one. I expect/hope that we will {enable {if conversion} by default when autovectorization is enabled} once we have a cost model that prevents {if conversion that is inadvisable due to it promoting from conditional to unconditional the evaluation of [pure] expression[s] that is/are too expensive, thereby making the conversion a "pessimization" instead of a true optimization}. [Richard wrote:]
I'd like to see a comparison in the number of vectorized loops for SPEC CPU 2006 FP
> with -Ofast -ftree-loop-if-convert-stores with / without the new > if-converter (and if you like also without -ftree-loop-if-convert-stores). > You can use -fopt-info-vec and grep for the number of 'loop vectorized' cases. Please consider it added to my "TO DO" list. I`ll report back when I have results. Regards, Abe