Kyrylo Tkachov via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> Hi all,
>
> This patch expresses the intrinsics for the SRA and RSRA instructions with
> standard RTL codes rather than relying on UNSPECs.
> These instructions perform a vector shift right plus accumulate with an
> optional rounding constant addition for the RSRA variant.
> There are a number of interesting points:
>
> * The scalar-in-SIMD-registers variant for DImode SRA e.g. ssra d0, d1, #N
> is left using the UNSPECs. Expressing it as a DImode plus+shift led to all
> kinds of trouble as it started matching the existing define_insns for
> "add x0, x0, asr #N" instructions and adding the SRA form as an extra
> alternative required a significant amount of deduplication of iterators and
> things still didn't work out well. I decided not to tackle that case in
> this patch. It can be attempted later.
>
> * For the RSRA variants that add a rounding constant (1 << (shift-1)) the
> addition is notionally performed in a wider mode than the input types so that
> overflow is handled properly. In RTL this can be represented with an
> appropriate extend operation followed by a truncate back to the original
> modes.
> However for 128-bit input modes such as V4SI we don't have appropriate modes
> defined for this widening i.e. we'd need a V4DI mode to represent the
> intermediate widened result.  This patch defines such modes for
> V16HI, V8SI, V4DI, V2TI. These will come in handy in the future too as we
> have more Advanced SIMD instructions that have similar intermediate widening
> semantics.
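
A minimal per-lane sketch in C of the semantics described above, for a signed
32-bit lane. The function names are hypothetical and are not part of the patch;
the point is only that the RSRA rounding constant is added in a wider type
before shifting and truncating. It assumes GCC's behaviour for right shifts of
negative values (arithmetic) and for out-of-range unsigned-to-signed
conversion (wrapping).

```c
#include <assert.h>
#include <stdint.h>

/* SRA on one signed 32-bit lane: acc + (x >> n), for n in 1..32.
   The shift is done in 64 bits so that n == 32 is well defined in C;
   the accumulate wraps modulo 2^32, like the vector instruction.  */
static int32_t ssra_lane(int32_t acc, int32_t x, unsigned n)
{
    int32_t shifted = (int32_t)((int64_t)x >> n);
    return (int32_t)((uint32_t)acc + (uint32_t)shifted);
}

/* RSRA on one signed 32-bit lane: the rounding constant 1 << (n - 1)
   is added in a 64-bit intermediate so the addition cannot overflow,
   then the result is shifted and truncated back to 32 bits.  */
static int32_t srsra_lane(int32_t acc, int32_t x, unsigned n)
{
    int64_t wide = (int64_t)x + ((int64_t)1 << (n - 1));
    int32_t rounded = (int32_t)(wide >> n);
    return (int32_t)((uint32_t)acc + (uint32_t)rounded);
}

/* Small self-check of the sketch above.  */
static void rsra_selfcheck(void)
{
    assert(ssra_lane(10, -8, 2) == 8);   /* -8 >> 2 == -2 */
    assert(ssra_lane(0, 3, 1) == 1);     /* truncating shift */
    assert(srsra_lane(0, 3, 1) == 2);    /* (3 + 1) >> 1 */
    /* INT32_MAX + 1 would overflow in 32 bits; the wide add avoids that.  */
    assert(srsra_lane(0, INT32_MAX, 1) == 1073741824);
}
```

The last assertion is the case that motivates the wider intermediate mode: a
32-bit addition of the rounding constant would overflow before the shift.
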
>
> * The above new modes led to a problem with stor-layout.cc. The new modes only
> exist for the sake of the RTL optimisers understanding the semantics of the
> instruction but are not intended to be moved to and from registers or memory,
> assigned to types, used as TYPE_MODE or participate in auto-vectorisation.
> This is expressed in aarch64 by aarch64_classify_vector_mode returning zero
> for these new modes. However, the code in stor-layout.cc:<mode_for_vector>
> explicitly doesn't check this when picking a TYPE_MODE due to modes being made
> potentially available later through target switching (PR38240).
> This led to these modes being picked as TYPE_MODE for declarations such as:
> typedef int16_t vnx8hi __attribute__((vector_size (32))) when 256-bit
> fixed-length SVE modes are available and vector_type_mode later struggling
> to rectify this.
> This issue is addressed with the new target hook
> TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P that is intended to check if a
> vector mode can be used in any legal target attribute configuration of the
> port, as opposed to the existing TARGET_VECTOR_MODE_SUPPORTED_P that checks
> only the initial target configuration. This allows a simple adjustment in
> stor-layout.cc that still disqualifies these limited modes early on while
> allowing consideration of modes that can be turned on in the future with
> target attributes.
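
A toy model in C of the distinction drawn above, not the actual stor-layout.cc
or aarch64 code. Every identifier here (toy_mode, toy_mode_for_vector, the
table contents) is hypothetical; the two boolean fields stand in for the
answers the existing hook and the new hook would give for each mode.

```c
#include <stdbool.h>

/* Hypothetical mode list.  The widening-only mode is deliberately placed
   before the SVE-like mode so that a naive search would pick it first.  */
enum toy_mode { TOY_V8HI, TOY_V16HI_WIDEN, TOY_VNX8HI, TOY_NUM_MODES };

struct toy_mode_info {
    unsigned bytes;            /* total vector size in bytes */
    unsigned elt_bytes;        /* element size in bytes */
    bool supported_now;        /* stand-in for TARGET_VECTOR_MODE_SUPPORTED_P */
    bool supported_any_target; /* stand-in for the new ..._ANY_TARGET_P hook */
};

static const struct toy_mode_info toy_modes[TOY_NUM_MODES] = {
    [TOY_V8HI]        = { 16, 2, true,  true  },
    /* Exists only to describe RTL semantics; never a TYPE_MODE.  */
    [TOY_V16HI_WIDEN] = { 32, 2, false, false },
    /* Not enabled initially, but reachable via a target attribute.  */
    [TOY_VNX8HI]      = { 32, 2, false, true  },
};

/* Pick a mode for a vector type.  The initial configuration is not
   consulted, because target switching may enable a mode later; only
   modes that no configuration can ever use are rejected.  Returns -1
   if no mode matches.  */
static int toy_mode_for_vector(unsigned bytes, unsigned elt_bytes)
{
    for (int m = 0; m < TOY_NUM_MODES; m++) {
        const struct toy_mode_info *info = &toy_modes[m];
        if (info->bytes == bytes
            && info->elt_bytes == elt_bytes
            && info->supported_any_target)
            return m;
    }
    return -1;
}
```

In this model a 32-byte vector of 16-bit elements resolves to TOY_VNX8HI
rather than the widening-only TOY_V16HI_WIDEN, even though neither is
enabled in the initial configuration.
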
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for the non-aarch64 parts?

Yes, thanks.

Since we'd discussed this approach off-list, I wanted to leave a gap in
case others objected to it.  But I guess they would have spoken up by
now if so.

Richard
