https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #3 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 18 Jun 2024, liuhongt at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
> 
> --- Comment #2 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> (In reply to Richard Biener from comment #1)
> > Btw, I had opened PR115490 with my results for this already.  Some
> > mitigation should come from optimizing ISEL expansion to vcond_mask and I'd
> > start with looking at some of the fallout from that side (note that might
> > require the backend to reject not natively implemented vec_cmp via its
> > operand 1 predicate)
> 
> w/o AVX512, vector integer comparison only supports EQ/GT; the rtx_code of
> other comparisons is transformed to those (i.e. GTU is emulated with
> us_minus + eq + negating the vector mask).
> If we restrict the predicate of operand 1, would the middle-end reject
> vectorization (or lower it to a scalar version)?

I'd suggest that we implement the "obvious" transforms like
inversion in the middle-end, but if for example unsigned compares
are not supported, the us_minus + eq + negate trick isn't on
that list.
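
For reference, the us_minus + eq + negate emulation of GTU mentioned above
can be modelled per lane in scalar C.  This is only a sketch of the idea, not
what the i386 backend emits; the names us_minus_u8 and gtu_mask_u8 are made up
for illustration, and 8-bit lanes are an arbitrary choice.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a saturating unsigned subtract (the RTL us_minus
   operation): the result clamps at 0 instead of wrapping.
   Hypothetical helper name.  */
static uint8_t us_minus_u8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* Build a per-lane mask for the unsigned compare a > b using only
   us_minus, an equality test against zero, and mask inversion.
   us_minus (a, b) == 0 holds exactly when a <= b (unsigned), so
   inverting that mask yields a > b.  Hypothetical helper name.  */
static void gtu_mask_u8(const uint8_t *a, const uint8_t *b,
                        uint8_t *mask, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t le = us_minus_u8(a[i], b[i]) == 0 ? 0xFF : 0x00;
        mask[i] = (uint8_t)~le;
    }
}
```

The result is all-ones in lanes where a > b (unsigned) and zero elsewhere,
which is the mask form a vcond_mask style select would consume.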

The main reason to restrict vec_cmp would be to avoid
a <= b ? c : d going with an unsupported vec_cmp and instead
do a > b ? d : c.  The alternative is trying to fix this
on the RTL side via combine.  I understand the non-native
compares are already expanded to a supported form and we
don't use a split after combine to make combinations to
a supported form easier?
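
To make the a <= b ? c : d case concrete, here is a scalar per-lane sketch
that uses only a greater-than compare and swaps the select arms.  The function
name select_le_via_gt and the 32-bit signed lanes are assumptions made for
illustration; the mask blend stands in for what a vcond_mask would do.

```c
#include <stddef.h>
#include <stdint.h>

/* Compute out[i] = a[i] <= b[i] ? c[i] : d[i] using only a > b.
   The compare is inverted and the select arms are swapped, so no
   LE compare is ever needed.  Hypothetical helper name.  */
static void select_le_via_gt(const int32_t *a, const int32_t *b,
                             const int32_t *c, const int32_t *d,
                             int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* All-ones where a > b, zero elsewhere.  */
        uint32_t gt = (uint32_t)0 - (uint32_t)(a[i] > b[i]);
        /* Blend: take d where the mask is set, c otherwise.  */
        out[i] = (int32_t)((gt & (uint32_t)d[i]) | (~gt & (uint32_t)c[i]));
    }
}
```

Doing this swap before expansion means the unsupported compare never has to
be emulated and later cleaned up.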

I don't have a good feeling for which approach is going to be more
maintainable here.  But for example even for the unsigned compare
"lowering" the middle-end would have range info while RTL does
not (to some extent it's available at RTL expansion time).
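
As one illustration of how range info could help (my example, not something
the existing code does): when both operands are known to have the top bit
clear, an unsigned greater-than gives the same answer as a signed one, so no
emulation is needed at all.  The names range_allows_signed_compare and
gtu_as_signed_gt are hypothetical, and the casts assume the usual
two's-complement conversion that GCC implements.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when a known upper bound on both operands guarantees the
   sign bit is clear, so signed and unsigned ordering agree.
   Hypothetical helper standing in for a range query.  */
static bool range_allows_signed_compare(uint32_t max_a, uint32_t max_b)
{
    return max_a <= (uint32_t)INT32_MAX && max_b <= (uint32_t)INT32_MAX;
}

/* Unsigned a > b implemented with a signed compare.  Only valid
   when range_allows_signed_compare holds for the operands.  */
static bool gtu_as_signed_gt(uint32_t a, uint32_t b)
{
    return (int32_t)a > (int32_t)b;
}
```

Without the range guarantee the two compares disagree as soon as exactly one
operand has its top bit set, which is why the fallback emulation exists.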
