On Fri, May 25, 2018 at 1:49 PM Richard Sandiford < richard.sandif...@linaro.org> wrote:
> Richard Biener <richard.guent...@gmail.com> writes: > > On Fri, May 25, 2018 at 12:12 PM Richard Sandiford < > > richard.sandif...@linaro.org> wrote: > > > >> Richard Biener <richard.guent...@gmail.com> writes: > >> > On Wed, May 16, 2018 at 12:17 PM Richard Sandiford < > >> > richard.sandif...@linaro.org> wrote: > >> >> This patch uses IFN_COND_* to vectorise conditionally-executed, > >> >> potentially-trapping arithmetic, such as most floating-point > >> >> ops with -ftrapping-math. E.g.: > >> > > >> >> if (cond) { ... x = a + b; ... } > >> > > >> >> becomes: > >> > > >> >> ... > >> >> x = IFN_COND_ADD (cond, a, b); > >> >> ... > >> > > >> >> When this transformation is done on its own, the value of x for > >> >> !cond isn't important. > >> > > >> >> However, the patch also looks for the equivalent of: > >> > > >> >> y = cond ? x : a; > >> > > >> > As generated by predicate_all_scalar_phis, right? So this tries to > > capture > >> > > >> > if (cond) > >> > y = a / b; > >> > else > >> > y = a; > >> > > >> > But I think it would be more useful to represent the else value > > explicitely > >> > as extra operand given a constant like 0 is sth that will happen in > > practice > >> > and is also sth that is I think supported as a result by some targets. > >> > > >> > So to support the flow of current if-conversion you'd convert to > >> > > >> > IFN_COND_ADD (cond, a, b, a); // or any other reasonable "default" > > value > >> > (target controlled?) > >> > > >> > and you merge that with a single-use COND_EXPR by replacing that > > operand. > >> > > >> > If a target then cannot support the default value it needs to emit a > > blend > >> > afterwards. > > > >> Yeah, that's definitely better. I was a bit worried about the extra > >> generality, but it turns out that (on SVE) having the else argument and > >> getting the .md patterns to blend the input can avoid an unnecessary > >> move that would be difficult to remove otherwise. > > > > Btw, I've checked AVX512 and their masked operations either blend > > into the destination or zero the masked elements. Possibly useful > > for plus reductions. > OK. > >> I've committed the patches to add the else argument and the target hook > >> to select the preferred else value (thanks for the reviews). The patch > >> below now uses that preferred else value if there's no ?: or if the > >> else value of the ?: isn't available at the point of the conditional > >> operation. > > > >> This version also copes with: > > > >> if (cond) { ... x = a + b; ... } > >> y = !cond ? c : x; > > > >> by adding a new utility function inverse_conditions_p. > > > > interpret_as_condition_p is a bit of a layering violation in fold-const.c > > since it looks at SSA_NAME_DEF_STMT. In gimple.c we have a related > > thing, mainly used by forwprop - canonicalize_cond_expr_cond which is > > used to interpret a GENERIC folded thing as condition. > > > > Do you really need the def stmt lookup? > Yeah, since these internal functions always take a gimple value > as input, not a comparison code. But I can put the code elsewhere. > > Note that one item on my TODO list is to remove [VEC_]COND_EXPR from > > gimple in favor of using a quaternary assign stmt with tcc_comparison code > > so we'd have > > > > op0 = op1 < op2 ? op3 : op4; > > > > that means currently valid op0 = op1 < op2; would become > > a more explicit op0 = op1 < op2 ? true : false; > > > > That means you'd for example may need to handle _1 != 0 being > > feeded by _1 = _2 < _3 ? true : false; -- or simply rely on those > > to be combined. > > > > Sooo - maybe you feel like implementing the above? ;) > So is just requiring a gimple value in COND_EXPR and VEC_COND_EXPR > a non-starter? That seems neater IMO. (And probably replacing > them with a single code.) That's the alternative. OTOH with the above we have a single canonical way to compute the value of a comparison rather than two as we have now: _1 = _2 < _3; vs. _1 = _2 < _3 ? true : false; so I somewhat settled on that "fix" to the GENERIC expression in our [VEC_]COND_EXPRs. Also targets seem to like "repeated" conditions better. > I think part of the problem with the 4-way stmt is that it becomes > harder to know what canonical form to use. E.g. should it be > canonicalised so that op4 is a constant where possible? > Or should it be canonicalised based on the condition? We have the same issue with COND_EXPR right now... > I've seen > this lead to cases in which we end up inserting opposite comparisons > with the current embedded VEC_COND_EXPR stuff, whereas splitting > them out would make it more obvious that the comparison and the > select are logically two separate things. True, but you'd still have the choice of inverting the condition and thus swapping the operands. > It would also be a bit inconsistent with IFN_COND_*, since they'd > still take the result of a separate comparison, and also the older > IFN_MASK* functions. True... it shows that IFNs while "nice" are not really a replacement for "proper" predicated stmts ;) > I guess gcond would still be a plain comparison? Yes. > >> >> in which the "then" value is the result of the conditionally-executed > >> >> operation and the "else" value is the first operand of that operation. > >> >> This "else" value is the one guaranteed by IFN_COND_* and so we can > >> >> replace y with x. > >> > > >> >> The patch also adds new conditional functions for multiplication > >> >> and division, which previously weren't needed. This enables an > >> >> extra fully-masked reduction (of dubious value) in > > gcc.dg/vect/pr53773.c. > >> > > >> >> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf > >> >> and x86_64-linux-gnu. OK to install? > >> > > >> > It does look like a clever way of expressing conditional execution. It > >> > seems > >> > "scaling" this to more operations is a bit awkward and certainly it > > would be > >> > nice to see for example the division case also prototyped for AVX512 to > > see > >> > if it works for anything besides ARM. Kirill CCed. > > > >> Yeah, I think the new version should be general enough for AVX512. > >> I've added a couple of gcc.dg/vect tests that are keyed off > >> vect_double_cond_arith. > > > >> Tested as before. OK to install? > > > > In the if-conversion part how can it be that you need > > value_available_p? Why do you delay propagating out > > redundant SSA names? > > > > It feels somewhat ugly possibly because if-conversion is implemented > > like it is right now? > I guess (although it didn't seem so ugly to me :-)). The statements > are converted in-place and then the blocks are stitched together, > and since there's no relationship between the "then" and "else" > values of the COND_EXPR, there's no guarantee that the "else" is > computed before the "then" value, or that the "then" value can > be delayed to the point of the COND_EXPR. Some statement reordering > would be necessary to handle the general case (and that might in itself > not be valid due to aliasing etc.). Hmm, but the ultimate "merges" are inserted for PHIs and only redundancies with those are handled? At least that's how I read the examples you quoted. > The delay in propagating the SSA names is to avoid removing statements > or rewriting uses in the middle of the if-conversion transformation > itself. E.g. SSA names might still be referenced in data_references > or predicates. > Doing it in ifcvt_local_dce seemed logically cleaner anyway, since it > really is a form of DCE. OK, fair enough for that part then. Richard. > Thanks, > Richard