On Fri, May 25, 2018 at 3:06 PM Richard Sandiford <richard.sandif...@linaro.org> wrote: > > Richard Biener <richard.guent...@gmail.com> writes: > > On Fri, May 25, 2018 at 1:49 PM Richard Sandiford < > > richard.sandif...@linaro.org> wrote: > > > >> Richard Biener <richard.guent...@gmail.com> writes: > >> > On Fri, May 25, 2018 at 12:12 PM Richard Sandiford < > >> > richard.sandif...@linaro.org> wrote: > >> > > >> >> Richard Biener <richard.guent...@gmail.com> writes: > >> >> > On Wed, May 16, 2018 at 12:17 PM Richard Sandiford < > >> >> > richard.sandif...@linaro.org> wrote: > >> >> >> This patch uses IFN_COND_* to vectorise conditionally-executed, > >> >> >> potentially-trapping arithmetic, such as most floating-point > >> >> >> ops with -ftrapping-math. E.g.: > >> >> > > >> >> >> if (cond) { ... x = a + b; ... } > >> >> > > >> >> >> becomes: > >> >> > > >> >> >> ... > >> >> >> x = IFN_COND_ADD (cond, a, b); > >> >> >> ... > >> >> > > >> >> >> When this transformation is done on its own, the value of x for > >> >> >> !cond isn't important. > >> >> > > >> >> >> However, the patch also looks for the equivalent of: > >> >> > > >> >> >> y = cond ? x : a; > >> >> > > >> >> > As generated by predicate_all_scalar_phis, right? So this tries to > >> > capture > >> >> > > >> >> > if (cond) > >> >> > y = a / b; > >> >> > else > >> >> > y = a; > >> >> > > >> >> > But I think it would be more useful to represent the else value > >> > explicitely > >> >> > as extra operand given a constant like 0 is sth that will happen in > >> > practice > >> >> > and is also sth that is I think supported as a result by some > > targets. > >> >> > > >> >> > So to support the flow of current if-conversion you'd convert to > >> >> > > >> >> > IFN_COND_ADD (cond, a, b, a); // or any other reasonable "default" > >> > value > >> >> > (target controlled?) > >> >> > > >> >> > and you merge that with a single-use COND_EXPR by replacing that > >> > operand. > >> >> > > >> >> > If a target then cannot support the default value it needs to emit a > >> > blend > >> >> > afterwards. > >> > > >> >> Yeah, that's definitely better. I was a bit worried about the extra > >> >> generality, but it turns out that (on SVE) having the else argument and > >> >> getting the .md patterns to blend the input can avoid an unnecessary > >> >> move that would be difficult to remove otherwise. > >> > > >> > Btw, I've checked AVX512 and their masked operations either blend > >> > into the destination or zero the masked elements. Possibly useful > >> > for plus reductions. > > > >> OK. > > > >> >> I've committed the patches to add the else argument and the target hook > >> >> to select the preferred else value (thanks for the reviews). The patch > >> >> below now uses that preferred else value if there's no ?: or if the > >> >> else value of the ?: isn't available at the point of the conditional > >> >> operation. > >> > > >> >> This version also copes with: > >> > > >> >> if (cond) { ... x = a + b; ... } > >> >> y = !cond ? c : x; > >> > > >> >> by adding a new utility function inverse_conditions_p. > >> > > >> > interpret_as_condition_p is a bit of a layering violation in > > fold-const.c > >> > since it looks at SSA_NAME_DEF_STMT. In gimple.c we have a related > >> > thing, mainly used by forwprop - canonicalize_cond_expr_cond which is > >> > used to interpret a GENERIC folded thing as condition. > >> > > >> > Do you really need the def stmt lookup? > > > >> Yeah, since these internal functions always take a gimple value > >> as input, not a comparison code. But I can put the code elsewhere. > > > >> > Note that one item on my TODO list is to remove [VEC_]COND_EXPR from > >> > gimple in favor of using a quaternary assign stmt with tcc_comparison > > code > >> > so we'd have > >> > > >> > op0 = op1 < op2 ? op3 : op4; > >> > > >> > that means currently valid op0 = op1 < op2; would become > >> > a more explicit op0 = op1 < op2 ? true : false; > >> > > >> > That means you'd for example may need to handle _1 != 0 being > >> > feeded by _1 = _2 < _3 ? true : false; -- or simply rely on those > >> > to be combined. > >> > > >> > Sooo - maybe you feel like implementing the above? ;) > > > >> So is just requiring a gimple value in COND_EXPR and VEC_COND_EXPR > >> a non-starter? That seems neater IMO. (And probably replacing > >> them with a single code.) > > > > That's the alternative. OTOH with the above we have a single canonical > > way to compute the value of a comparison rather than two as we have now: > > _1 = _2 < _3; > > vs. > > _1 = _2 < _3 ? true : false; > > > > so I somewhat settled on that "fix" to the GENERIC expression in > > our [VEC_]COND_EXPRs. > > But the second feels as redundant as x + 0 to me :-) Booleans should > be 0 or 1 whereever they come from, whether it's directly from a > comparison, from a BIT_FIELD_REF, from a logical operation, etc. > So if we don*t fold the second to the first at the moment, that seems > like a missed opportunity.
One of the "achievements" of the re-defined COND_EXPRs would that we no longer have two ways of computing the result of a comparison, so we can't fold the second to the first anymore... > > Also targets seem to like "repeated" conditions better. > > What kinds of situation do you mean? x86 has vector ?: so seeing _1 = _4 < _5; _3 = _1 ? _6 : _7; produces vcond _4 < _5 ? -1 : 0; vcond _1 != 0 ? _6 : _7; and it's more efficient to embed _4 < _5 into the vcond even if there's more than one of them. IIRC. Targets that don't have vector cond but only vector compares likely benefit from the CSE of the condition. I guess from the CSEd IL we can solve this by doing more instruction selection on GIMPLE, side-stepping the combine into-multiple uses issue. > >> I think part of the problem with the 4-way stmt is that it becomes > >> harder to know what canonical form to use. E.g. should it be > >> canonicalised so that op4 is a constant where possible? > >> Or should it be canonicalised based on the condition? > > > > We have the same issue with COND_EXPR right now... > > > >> I've seen > >> this lead to cases in which we end up inserting opposite comparisons > >> with the current embedded VEC_COND_EXPR stuff, whereas splitting > >> them out would make it more obvious that the comparison and the > >> select are logically two separate things. > > > > True, but you'd still have the choice of inverting the condition and thus > > swapping the operands. > > > >> It would also be a bit inconsistent with IFN_COND_*, since they'd > >> still take the result of a separate comparison, and also the older > >> IFN_MASK* functions. > > > > True... it shows that IFNs while "nice" are not really a replacement > > for "proper" predicated stmts ;) > > But IFN_*s can in general be used to predicate an entire block. > I don't think we want to repeat the comparison in the block predicate > for each individual statement. > > If we're moving to doing more instruction selection in gimple > (IMO a good thing), then I don't think we'd want to bake-in a > requirement that every conditional op has to do a comparison. > E.g. if the same condition is used in two selections: > > _5 = _1 < _2 ? _3 : _4; > _8 = _1 < _2 ? _6 : _7; > > then as it stands it would be hard to remove the redundant > comparison. I suppose we could have: > > _9 = _1 < _2 ? true : false; > _5 = _9 != 0 ? _3 : _4; > _8 = _9 != 0 ? _6 : _7; > > and assume that targets can handle != 0 more efficiently than > anything else. But that makes comparisons with zero special, > whereas using a 4-op statement suggests that it shouldn't be. > It also adds another variant to the canonicalisation rules, > since there's a choice over all of: > > - the way _9 is written (true : false vs. false : true) > - whether _9 should be inverted and the inversion propagated > to the users (which like you say is a problem whatever we do) > - whether != 0 should be preferred over == 0. True. So I'm no longer conviced fixing COND_EXPR this way. Let's try going the different route of forcing out the compare then and hope for the x86 best ;) > >> >> >> in which the "then" value is the result of the > > conditionally-executed > >> >> >> operation and the "else" value is the first operand of that > > operation. > >> >> >> This "else" value is the one guaranteed by IFN_COND_* and so we can > >> >> >> replace y with x. > >> >> > > >> >> >> The patch also adds new conditional functions for multiplication > >> >> >> and division, which previously weren't needed. This enables an > >> >> >> extra fully-masked reduction (of dubious value) in > >> > gcc.dg/vect/pr53773.c. > >> >> > > >> >> >> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf > >> >> >> and x86_64-linux-gnu. OK to install? > >> >> > > >> >> > It does look like a clever way of expressing conditional execution. > > It > >> >> > seems > >> >> > "scaling" this to more operations is a bit awkward and certainly it > >> > would be > >> >> > nice to see for example the division case also prototyped for AVX512 > > to > >> > see > >> >> > if it works for anything besides ARM. Kirill CCed. > >> > > >> >> Yeah, I think the new version should be general enough for AVX512. > >> >> I've added a couple of gcc.dg/vect tests that are keyed off > >> >> vect_double_cond_arith. > >> > > >> >> Tested as before. OK to install? > >> > > >> > In the if-conversion part how can it be that you need > >> > value_available_p? Why do you delay propagating out > >> > redundant SSA names? > >> > > >> > It feels somewhat ugly possibly because if-conversion is implemented > >> > like it is right now? > > > >> I guess (although it didn't seem so ugly to me :-)). The statements > >> are converted in-place and then the blocks are stitched together, > >> and since there's no relationship between the "then" and "else" > >> values of the COND_EXPR, there's no guarantee that the "else" is > >> computed before the "then" value, or that the "then" value can > >> be delayed to the point of the COND_EXPR. Some statement reordering > >> would be necessary to handle the general case (and that might in itself > >> not be valid due to aliasing etc.). > > > > Hmm, but the ultimate "merges" are inserted for PHIs and only redundancies > > with those are handled? At least that's how I read the examples you quoted. > > Right. But say we have: > > if (cond) > a1 = b1 op1 c1; > else > a2 = b2 op2 c2; > a = PHI <a1, a2> > > and op2 is safe to speculate and op1 isn't. We can't assume that > a2's block comes before a1's, so we can't use: > > a1 = IFN_COND_FOO (cond, b1, c1, a2) > > in place of the original a1. And we also can't always defer a1 to the > join block because it might be used later in a1's block. Ah, ok. I think the patch is ok for trunk then. Thanks, Richard. > > Thanks, > Richard