On Wed, Jul 29, 2015 at 6:15 PM, Abe <abe_skol...@yahoo.com> wrote: > [Richard wrote:] >> >> Yes, the GIMPLE if-converter should strictly be a vectorization enabler. >> If the "new" if-converter produces code that the vectorizer does not >> handle (on the target targeted) then it has done a bad job. > > > Understood [or at least I _think_ I did ;-)] and agreed. > > OTOH, there _are_ things that apparently the vectorizer > _could_ do better, esp. handling scatter/gather. > That improvement might be pending new vector > instructions in the target ISA[s], though... > > ... YMMV. ;-) > > > [Richard wrote:] >> >> I think I fixed this bug. It was because of an inverted test. >> 2015-07-10 Richard Biener <rguent...@suse.de> >> PR tree-optimization/66823 >> * tree-if-conv.c (memrefs_read_or_written_unconditionally): >> Fix inverted predicate. > > > I can confirm that as of that just after that commit, the bug is gone. > > If I understand correctly, the bug was in the analysis leading to > the decision to convert or not, such that the fix makes the old > [pre-scratchpad] converter decide "don`t convert". Is that correct?
Yes. > Would you like me to include the DejaGNUified form of the related new test > case > in the patch even though this bug no longer exists in the "old" converter? > Maybe it`s nice to keep the test for catching possible future regressions. Yes. > > [Richard wrote:] >> >> It was known to produce store data-races with > >> -ftree-loop-if-convert-stores. >> >> That is something the scratchpad use avoids >> (at the expense of a lot(?) of vectorization). > > > TTBOMK, that "expense" is temporary, pending fixes/improvement > to the new converter. IOW, I don`t think any of the relevant > regressions are as a result of "essential complexity". I don't see how you can fix this without scatter support in the CPU. > [Richard wrote:] >> >> First of all to fix regressions on the load if-conversion side > >> you should stop using a scratchpad for loads (do you have one? >> I seem to remember you do from the design description). > > Yes, we do have one. > > I respectfully disagree with the suggestion to remove all scratchpad use for > loads. > That use is what allows us to if-convert -- and therefor vectorize -- > expressions that > may result in invalid pointer dereferences for "condition is false" > iterations/columns. > AFAIK the old converter -- due to lack of a scratchpad -- must give up and > not > convert such cases, thus resulting in code of that kind not getting > vectorized. Correct. But then you should restrict the if-conversion to targets that can do gather vectorization. > As of now, the relevant "improvement" in/due-to the new converter is not > always > effective, i.e. it does not always result in a vectorization, due to the > current implementation of the scratchpad mechanism being too generic and > adding indirection that confuses the vectorizer. We intend to fix this. This should be fixed before the patch can go in. > > [Richard wrote:] >> >> Then you should IMHO keep the old store path > >> for --param allow-store-data-races=1. > > I have discussed this with Sebastian; > he and I formed a plan to allow the following modes: > > * no if conversion [note: %%] > * if-conversion of loads only, the old way > * if-conversion of stores only, the old way > * if-conversion of loads and stores, the old way > * if-conversion of loads only, with scratchpads > * if-conversion of stores only, with scratchpads > * if-conversion of loads and stores, with scratchpads > > "%%": to retain as the default in trunk GCC until we have > enough improvement, e.g. a cost model to decide whether > or not it`s good to if-convert a particular loop There is the existing mechanism to if-convert only a copy of the loop guarded with IFN_LOOP_VECOTRIZED so the cost model is that of the vectorizer when it tries to vectorize the if-converted copy. If it doesn't vectorize that copy the non-if-converted loop is preserved, otherwise it is deleted. I've argued in the past that the GIMPLE if-converter should always go that path (thus strictly be a vectorization enabler only). > Once all the known regressions in the new converter have been > fixed and the numbers [e.g. from SPEC] of loops vectorized > looks good, I expect/hope that it will be possible to > remove the old if converter in favor of the new one. > > I expect/hope that we will {enable {if conversion} by default > when autovectorization is enabled} once we have a cost model that > prevents {if conversion that is inadvisable due to it promoting > from conditional to unconditional the evaluation of [pure] > expression[s] that is/are too expensive, thereby making the > conversion a "pessimization" instead of a true optimization}. if-conversion is already on by default if vectorizing. We only do not introduce store-data-races by default. > > [Richard wrote:] >> >> I'd like to see a comparison in the number of vectorized loops for SPEC >> CPU 2006 FP > >> with -Ofast -ftree-loop-if-convert-stores with / without the new >> if-converter (and if you like also without -ftree-loop-if-convert-stores). >> You can use -fopt-info-vec and grep for the number of 'loop vectorized' >> cases. > > Please consider it added to my "TO DO" list. > I`ll report back when I have results. Thanks, Richard. > Regards, > > Abe