Kyrylo Tkachov via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> Hi all,
>
> This patch expresses the intrinsics for the SRA and RSRA instructions with
> standard RTL codes rather than relying on UNSPECs.
> These instructions perform a vector shift right plus accumulate with an
> optional rounding constant addition for the RSRA variant.
> There are a number of interesting points:
>
> * The scalar-in-SIMD-registers variant for DImode SRA e.g. ssra d0, d1, #N
> is left using the UNSPECs.  Expressing it as a DImode plus+shift led to all
> kinds of trouble as it started matching the existing define_insns for
> "add x0, x0, asr #N" instructions and adding the SRA form as an extra
> alternative required a significant amount of deduplication of iterators and
> things still didn't work out well.  I decided not to tackle that case in
> this patch.  It can be attempted later.
>
> * For the RSRA variants that add a rounding constant (1 << (shift-1)) the
> addition is notionally performed in a wider mode than the input types so that
> overflow is handled properly.  In RTL this can be represented with an
> appropriate extend operation followed by a truncate back to the original
> modes.
> However for 128-bit input modes such as V4SI we don't have appropriate modes
> defined for this widening i.e. we'd need a V4DI mode to represent the
> intermediate widened result.  This patch defines such modes for
> V16HI,V8SI,V4DI,V2TI.  These will come in handy in the future too as we have
> more Advanced SIMD instructions that have similar intermediate widening
> semantics.
>
> * The above new modes led to a problem with stor-layout.cc.  The new modes only
> exist for the sake of the RTL optimisers understanding the semantics of the
> instruction but are not intended to be moved to and from registers or memory,
> assigned to types, used as TYPE_MODE or participate in auto-vectorisation.
> This is expressed in aarch64 by aarch64_classify_vector_mode returning zero
> for these new modes.  However, the code in stor-layout.cc:<mode_for_vector>
> explicitly doesn't check this when picking a TYPE_MODE due to modes being made
> potentially available later through target switching (PR38240).
> This led to these modes being picked as TYPE_MODE for declarations such as:
> typedef int16_t vnx8hi __attribute__((vector_size (32))) when 256-bit
> fixed-length SVE modes are available and vector_type_mode later struggling
> to rectify this.
> This issue is addressed with the new target hook
> TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P that is intended to check if a
> vector mode can be used in any legal target attribute configuration of the
> port, as opposed to the existing TARGET_VECTOR_MODE_SUPPORTED_P that checks
> only the initial target configuration.  This allows a simple adjustment in
> stor-layout.cc that still disqualifies these limited modes early on while
> allowing consideration of modes that can be turned on in the future with
> target attributes.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for the non-aarch64 parts?

Yes, thanks.  Since we'd discussed this approach off-list, I wanted to
leave a gap in case others objected to it.  But I guess they would have
spoken up by now if so.

Richard
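
For readers following the thread, the point about the RSRA rounding
constant being added in a wider mode can be illustrated with a scalar
model of a single signed 32-bit lane.  This is only a sketch and is not
taken from the patch: the function names ssra_lane_s32, srsra_lane_s32
and srsra_lane_s32_narrow are hypothetical, and the code assumes GCC's
behaviour of arithmetic right shifts on negative values and wrapping
unsigned-to-signed conversions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit lane of SSRA:
   acc += x >> shift, with shift in 1..32.  The shift is done on a
   64-bit value so that shift == 32 is well defined in C.  */
static int32_t ssra_lane_s32(int32_t acc, int32_t x, unsigned shift)
{
    assert(shift >= 1 && shift <= 32);
    int32_t shifted = (int32_t)((int64_t)x >> shift);
    /* Accumulate with wrap-around, as a vector add would.  */
    return (int32_t)((uint32_t)acc + (uint32_t)shifted);
}

/* Hypothetical scalar model of one 32-bit lane of SRSRA:
   acc += (x + (1 << (shift-1))) >> shift.  The rounding constant is
   added after sign-extending x to 64 bits, mirroring the
   extend/plus/shift/truncate sequence described in the mail.  */
static int32_t srsra_lane_s32(int32_t acc, int32_t x, unsigned shift)
{
    assert(shift >= 1 && shift <= 32);
    int64_t wide = (int64_t)x + ((int64_t)1 << (shift - 1));
    int32_t rounded = (int32_t)(wide >> shift);  /* always fits in 32 bits */
    return (int32_t)((uint32_t)acc + (uint32_t)rounded);
}

/* The same computation with the rounding add done in 32 bits.  The add
   wraps for large x, which is the overflow the wider mode avoids.
   Restricted to shift in 1..31 to keep the C shifts well defined.  */
static int32_t srsra_lane_s32_narrow(int32_t acc, int32_t x, unsigned shift)
{
    assert(shift >= 1 && shift <= 31);
    int32_t sum = (int32_t)((uint32_t)x + ((uint32_t)1 << (shift - 1)));
    return (int32_t)((uint32_t)acc + (uint32_t)(sum >> shift));
}
```

With x = INT32_MAX and shift = 1, the widened model gives 1073741824
while the narrow one gives -1073741824, since the 32-bit add wraps
before the shift.  For small inputs the two agree.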