[Abe wrote:]
After finishing fixing the known regressions, I intend/plan to reg-test for AArch64; after that, I think I`m going to need some community help to reg-test for other ISAs.
[Alan wrote:]
OK, I'm confused. When you write "making the new if-converter not mangle IR"...
> does "the new if-converter" mean your scratchpad fix to PR46029 Basically, yes. The code I currently have is, IMO, sufficiently-{changed/rewritten/replaced} relative to {the code currently in the file at that path in trunk} that I am referring to the pre-patch code in "tree-if-conv.c" as "the old if converter" and referring to the code in the version of the file at that path in my Git GCC checkout as "the new if converter". [Alan wrote:]
or is there some other new if-conversion phase that you are still working on and haven't posted yet?
No; as of now, Sebastian and I are not working on adding any more passes or phases: we are just trying to replace the old, potentially-unsafe, gives-up-more-easily converter with the new one that uses the scratchpad idea both to produce safer conversions and to be able to convert a greater percentage of the time. The only significant difficulty with the preceding at this time, AFAIK, is that the work-in-progress of the new converter produces {too many levels deep} of indirections, so the vectorizer gives up trying to vectorize the result[s] of the conversion, which makes the whole thing a big "fail" [not in the DejaGNU sense, although pretty close]. That`s why I would love to have some help in fixing the regressions. [Alan wrote:] > I haven't yet understood what you mean about "vectorizer-friendly" IR being mangled; is the problem that your > new […] transforms IR that can currently be if-converted by the existing phase, into IR that can't? (Example?) Not exactly, but close. "TLDR" for the following [from "vvv" to "^^^"]: "too much unnecessary indirection is being added". --- vvv --- TLDR`d above --- vvv --- Let`s pretend the programmer of the code being compiled knows all about if-conversion and vectorization and does the conversion _manually_. In other words, the C/C++/etc. code passed in to the compiler _already_ looks something like: temp0 = compute_condition(); temp1 = foo(); temp2 = bar(); result = temp0 ? temp1 : temp2; IOW, essentially IR/assembler written in C. Since the C code is lowered to GIMPLE and the '?' operator is lowered [too early IMO *] to blocks and "goto"s, the if-conversion pass no longer "sees" "result = temp0 ? temp1 : temp2;" -- instead, it sees something like: if temp0 goto <bb 4> else goto <bb 5> /* --- bb 4 --- */ result = temp1; goto <bb 6> /* --- bb 4 --- */ result = temp2; /* --- bb 6 --- */ … which the old if converter handles in such a way as to keep the vectorizer happy, but the new one does not yet. '*': another issue to discuss and something that IMO should be fixed/improved in GCC but outside of "tree-if-conv.c" The new if converter is doing something at least similar to "ooh! Ooh! The programmer wrote a valid-conversion-candidate ''if'' that I know how to convert!" and messes it up; IIRC the way in which this gets messed up is something like: _ifcvt_temp_0 = temp0; … _ifcvt_temp_1 = temp1; … _ifcvt_temp_2 = temp2; … result = _ifcvt_temp_0 ? _ifcvt_temp_1 : _ifcvt_temp_2; … at which the vectorizer turns up its nose, says "too much indirection" and/or "too much gather", and doesn`t vectorize -- for a {source code, target architecture} pair for which the code can and should be vectorized by GCC. In particular, the old vectorizer handles it well IIRC, probably b/c the old vectorizer has less indirection overhead, which it can easily "afford" since it never adds an indirection through a scratchpad. I think what is needed here is basically to reduce the indirection overhead in the new converter by making the new converter realize "oh, <foo> is already pure and thread-local, so I don`t need to copy <foo> into a temporary before using it". At least, that seems to me like _one_ way to fix this category of regressions in the new converter. Another way is to make GCC overall not convert e.g. "x ? y : z" into basic blocks etc. so early all of the time; my impressions is that GCC is not doing both of {inspecting the purity, analyzing the cost} of the expressions and only converting into basic blocks etc. when e.g. 'y'/'z'/both is/are impure and/or at least one of {'y', 'z'} is an "expensive" thing to compute. For low-cost pure expressions, e.g. "x ? y : z" should be retained as-as IMO -- i.e. not lowered into separate BBs -- for as long as possible. Ideally, it is encoded as a "COND_EXPR" and stays that way for as long as possible when there is no purity/high-computational-cost problem with either the second or the third param. In the worst case, this could involve fixing/improving several front ends, so I want to push this off into a separate subproject from the if conversion itself. --- ^^^ --- TLDR`d above --- ^^^ --- I hope the above is helpful, but since it`s from memory I don`t guarantee that it`s both 100% accurate and 100% complete. ;-) [Alan wrote:]
Then I might (only "might", sorry!) be able to help...
Great! Thanks. :-) Ideally, I/we fix the above problem -- and the rest of the regressions in the new if converter -- without any significant changes to core GCC files outside of "tree-if-conv.c". IOW, I`d like to minimize the invasiveness of this patch and get rid of {as many regressions as possible} the fixes for which lie entirely inside "tree-if-conv.c" before proceeding to fix/improve the "lowered too early" problem that I perceive current GCC trunk has in its lowering of "x ? y : z". I`m almost 100% certain that fixing/improving the lowering will require significant alterations to code in files other than "tree-if-conv.c", so even though that fix/improvement would likely fix at least one regression in the new if converter, I`d rather do it separately/later/both. In particular, I remember that "result = condition ? array1[index] : array2[maybe the same index, maybe not]" is being converted too early IMO. IOW, somewhere in GCC an array dereference is being considered as either impure, too-expensive, or both. "array[index]" in C [not in C++!: operator overloading] AFAIK is always pure and is always low-cost whenever the index expression is pure and low-cost. Regards, Abe