Richard Biener <rguent...@suse.de> writes: > On Tue, 5 Jun 2018, Richard Biener wrote: > >> >> This is a prototype enforcing is_gimple_val conditions in >> [VEC_]COND_EXPRs. It shows quite some fallout - some likely >> due to necessary testsuite adjustments like gcc.dg/tree-ssa/ssa-lim-12.c >> others because some code simply doesn't handle conditions visible >> only via following SSA defs like vect_recog_mixed_size_cond_pattern. >> >> So on x86_64 there's quite some vect.exp fallout. >> >> Anyhow, to assess the effect on other targets I'm throwing this >> in here. >> >> I've hopefully nailed all ICEs (or at least makes catching them >> easy with some asserts in GIMPLE stmt building). >> >> Full bootstrap and regtest for all languages running on >> x86_64-unknown-linux-gnu.
FWIW, I ran the original version through our internal SVE benchmarks. There were quite a few ICEs from: pattern_stmt = gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL), COND_EXPR, cond_expr, trueval, build_int_cst (itype, 0)); in adjust_bool_pattern and some from loop_cand::undo_simple_reduction (which it looks like you fixed in the updated patch). But of the tests that successfully built, it looks like this is neutral (good!). I can try to fix the adjust_bool_pattern thing if you don't see it for AVX. > Testing shows a few omissions in the patch (updated patch attached > below). It also shows that expand_vec_cond_expr_p and friends > need some heavy lifting to deal with vcond* patterns which have > the comparison embedded. There's vcond_mask* which would be > more suited to the GIMPLE we have after this patch and I suspect > we can always expand > > _1 = _2 < _3; > _4 = _1 ? _5 : _66; > > as > > _1 = _2 < _3 ? { -1,...} : {0,...}; via vcond_* > _4 = _1 != 0 ? _5 : _6; via vcond_* > > at expansion time TER might come to the rescue, if there are multiple > uses of _1 then there's cost issues to honor in case _1 = _2 < _3 > can be expanded more efficiently than via such fake vcond_*. > > In any case the expander now needs to be made more smart > (as well as the pattern recog part of the vectorizer). Why not match them as internal functions, like we now do for FMA* etc.? It seems like a similar case: for FMAs we wanted to fold in negates while here we want to fold in the comparison. The fold could be handled by match.pd, gated on a suitably late predicate. (Unless the vectoriser really does have to generate it directly, in which case maybe it could also/instead be done as a pattern. That might even make the generation side simpler, since the pattern would just go through vectorizable_call.) Using one internal function per comparison type would also help to clean up the optabs interface. At the moment, targets basically have to accept any comparison type that's going, regardless of what the target can actually support. If we use internal functions that directly map to individual optabs, then we can use target-independent code to make the comparison fit. (At least for the obvious cases. Some FP comparisons might still be better emulated directly by the target.) > Jakub - I remember you weren't too happy about splitting the > conditions out of [VEC_]COND_EXPRs this came up last time? > With the alternative idea of transitioning [VEC_]COND_EXPRs to > tcc_comparisons with conditional ops aka > > _1 = _2 < _3 ? _5 : _6; > > with RHS code LT_EXPR and 4 operands what do you think of > Richards complaint about being forced to repeat _2 < _3? > A split out _2 < _3 would then look like > > _4 = _2 < _3 ? {-1,...} : {0,...}; > _1 = _4 != {0,...} ? _5 : _6; > > so "plain" LT_EXPR rhs wouldn't be supported anymore but they'd > always have explicit true and false values attached. With making > the last two ops optional we could make them implicit again of course. > We could also keep [VEC_]COND_EXPR then in 3-operand form with > a gimple-val predicate rather than a comparison. Of course that > wouldn't really simplify the IL overall... Yeah, this still doesn't seem like an improvement to me TBH. For one thing it'd be confusing for LT_EXPR to have 2 operands for generic and 4 for gimple: are there any other cases where we do that? I suppose the GIMPLE_SINGLE_RHS stuff is a special case too, but it sounded like you saw that was a wart and wanted to move away from it. Thanks, Richard