RE: [PATCH v3] aarch64: Recognize vector permute patterns suitable for FMOV [PR100165]

quic_pzheng Fri, 25 Apr 2025 17:50:10 -0700

> Richard Sandiford <richard.sandif...@arm.com> writes:
> > I think this would also simplify the evpc detection, since the
> > requirement for using AND is the same for big-endian and
> > little-endian, namely that index I of the result must either come from
> > index I of the nonzero vector or from any element of the zero vector.
> > (What differs between big-endian and little-endian is which masks
> > correspond to FMOV.)
> 
> Or perhaps more accurately, what differs between big-endian and little-endian
> is the constant that needs to be materialised for a given permute mask.  I think
> the easiest way of handling that would be to construct an array of target_units
> (0xffs for bytes that come from the nonzero vector, 0x00s for bytes that come
> from the zero vector) and then get native_encode_rtx to convert that into a
> vector constant.  native_encode_rtx will then do the endian correction for us.
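
For readers following the thread, the check and the byte array described in the
quoted text can be sketched roughly as below. This is only an illustration in
standalone C++, not code from the patch: the function name and_mask_bytes, its
parameters, and the use of std::vector are all assumptions made for the example,
and the endian correction that the quoted text leaves to the RTL constant
machinery is not modelled here.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not a GCC function).
//   sel       - permute selector; for NELT elements, values 0..NELT-1 pick
//               from operand 0 and values NELT..2*NELT-1 pick from operand 1.
//   zero_op   - which operand (0 or 1) is known to be the all-zero vector.
//   elt_bytes - size of one vector element in bytes.
// Returns one mask byte per vector byte (0xff = keep the byte of the nonzero
// vector, 0x00 = result byte is zero), or std::nullopt if the permute cannot
// be expressed as an AND with a constant.
std::optional<std::vector<std::uint8_t>>
and_mask_bytes(const std::vector<unsigned>& sel, unsigned zero_op,
               unsigned elt_bytes)
{
  const unsigned nelt = static_cast<unsigned>(sel.size());
  std::vector<std::uint8_t> bytes;
  bytes.reserve(static_cast<std::size_t>(nelt) * elt_bytes);

  for (unsigned i = 0; i < nelt; ++i)
    {
      if (sel[i] >= 2 * nelt)
        return std::nullopt;            // selector index out of range

      const unsigned src_op = sel[i] / nelt;   // which operand is selected
      const unsigned src_idx = sel[i] % nelt;  // which element of it

      std::uint8_t unit;
      if (src_op == zero_op)
        unit = 0x00;                    // any element of the zero vector
      else if (src_idx == i)
        unit = 0xff;                    // index I of the nonzero vector
      else
        return std::nullopt;            // a real shuffle, not an AND

      bytes.insert(bytes.end(), elt_bytes, unit);
    }
  return bytes;
}
```

With four 2-byte elements and operand 1 as the zero vector, the selector
{0, 1, 4, 5} yields four 0xff bytes followed by four 0x00 bytes, while
{1, 0, 4, 5} is rejected because element 0 of the result would come from
element 1 of the nonzero vector.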


Thanks for the great feedback, Richard! I've reworked the patch accordingly.
Please let me know if you have any other comments.

[PATCH 1/3] Recognize vector permute patterns which can be interpreted as AND [PR100165]
https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681900.html

[PATCH 2/3] aarch64: Optimize AND with certain vector of immediates as FMOV [PR100165]
https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681901.html

[PATCH 3/3] aarch64: Add more vector permute tests for the FMOV optimization [PR100165]
https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681902.html

Thanks,
Pengxuan
> 
> Thanks,
> Richard
