On Thu, 7 Nov 2019 at 11:26, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> wrote: > > Hi all, > > This patch adds the plumbing for and an implementation of the saturation > intrinsics from ACLE [1], in particular the __ssat, __usat intrinsics. > These intrinsics set the Q sticky bit in APSR if an overflow occurred. > ACLE allows the user to read that bit (within the same function, it's not > defined across function boundaries) using the __saturation_occurred > intrinsic > and reset it using __set_saturation_occurred. > Thus, if the user cares about the Q bit they would be using a flow such as: > > __set_saturation_occurred (0); // reset the Q bit > ... > __ssat (...) // Do some calculations involving __ssat > ... > if (__saturation_occurred ()) // if Q bit set handle overflow > ... > > For the implementation this has a few implications: > * We must track the Q-setting side-effects of these instructions to make > sure > saturation reading/writing intrinsics are ordered properly. > This is done by introducing a new "apsrq" register (and associated > APSRQ_REGNUM) in a similar way to the "fake"" cc register. > > * The RTL patterns coming out of these intrinsics can have two forms: > one where they set the APSRQ_REGNUM and one where they don't. > Which one is used depends on whether the function cares about reading the Q > flag. This is detected using the TARGET_CHECK_BUILTIN_CALL hook on the > __saturation_occurred, __set_saturation_occurred occurrences. > If no Q-flag read is present in the function we'll use the simpler > non-Q-setting form to allow for more aggressive scheduling and such. > If a Q-bit read is present then the Q-setting form is emitted. > To avoid adding two patterns for each intrinsic to the MD file we make > use of define_subst to auto-generate the Q-setting forms > > * Some existing patterns already produce instructions that may clobber the > Q bit, but they don't model it (as we didn't care about that bit up till > now). > Since these patterns can be generated from straight-line C code they can > affect > the Q-bit reads from intrinsics. Therefore they have to be disabled when > a Q-bit read is present. These are mostly patterns in arm-fixed.md that are > not very common anyway, but there are also a couple of widening > multiply-accumulate patterns in arm.md that can set the Q-bit during > accumulation. > > There are more Q-setting intrinsics in ACLE, but these will be > implemented in > a more mechanical fashion once the infrastructure in this patch goes in. > > Bootstrapped and tested on arm-none-linux-gnueabihf. > > Committing to trunk. > Thanks, > Kyrill > > > 2019-11-07 Kyrylo Tkachov <kyrylo.tkac...@arm.com> > > * config/arm/aout.h (REGISTER_NAMES): Add apsrq. > * config/arm/arm.md (APSRQ_REGNUM): Define. > (add_setq): New define_subst. > (add_clobber_q_name): New define_subst_attr. > (add_clobber_q_pred): Likewise. > (maddhisi4): Change to define_expand. Split into mult and add if > ARM_Q_BIT_READ. > (arm_maddhisi4): New define_insn. > (*maddhisi4tb): Disable for ARM_Q_BIT_READ. > (*maddhisi4tt): Likewise. > (arm_ssat): New define_expand. > (arm_usat): Likewise. > (arm_get_apsr): New define_insn. > (arm_set_apsr): Likewise. > (arm_saturation_occurred): New define_expand. > (arm_set_saturation): Likewise. > (*satsi_<SAT:code>): Rename to... > (satsi_<SAT:code><add_clobber_q_name>): ... This. > (*satsi_<SAT:code>_shift): Disable for ARM_Q_BIT_READ. > * config/arm/arm.h (FIXED_REGISTERS): Mark apsrq as fixed. > (CALL_USED_REGISTERS): Mark apsrq. > (FIRST_PSEUDO_REGISTER): Update value. > (REG_ALLOC_ORDER): Add APSRQ_REGNUM. > (machine_function): Add q_bit_access. > (ARM_Q_BIT_READ): Define. > * config/arm/arm.c (TARGET_CHECK_BUILTIN_CALL): Define. > (arm_conditional_register_usage): Clear APSRQ_REGNUM from > operand_reg_set. > (arm_q_bit_access): Define. > * config/arm/arm-builtins.c: Include stringpool.h. > (arm_sat_binop_imm_qualifiers, > arm_unsigned_sat_binop_unsigned_imm_qualifiers, > arm_sat_occurred_qualifiers, arm_set_sat_qualifiers): Define. > (SAT_BINOP_UNSIGNED_IMM_QUALIFIERS, > UNSIGNED_SAT_BINOP_UNSIGNED_IMM_QUALIFIERS, SAT_OCCURRED_QUALIFIERS, > SET_SAT_QUALIFIERS): Likewise. > (arm_builtins): Define ARM_BUILTIN_SAT_IMM_CHECK. > (arm_init_acle_builtins): Initialize __builtin_sat_imm_check. > Handle 0 argument expander. > (arm_expand_acle_builtin): Handle ARM_BUILTIN_SAT_IMM_CHECK. > (arm_check_builtin_call): Define. > * config/arm/arm.md (ssmulsa3, usmulusa3, usmuluha3, > arm_ssatsihi_shift, arm_usatsihi): Disable when ARM_Q_BIT_READ. > * config/arm/arm-protos.h (arm_check_builtin_call): Declare prototype. > (arm_q_bit_access): Likewise. > * config/arm/arm_acle.h (__ssat, __usat, __ignore_saturation, > __saturation_occurred, __set_saturation_occurred): Define. > * config/arm/arm_acle_builtins.def: Define builtins for ssat, usat, > saturation_occurred, set_saturation_occurred. > * config/arm/unspecs.md (UNSPEC_Q_SET): Define. > (UNSPEC_APSR_READ): Likewise. > (VUNSPEC_APSR_WRITE): Likewise. > * config/arm/arm-fixed.md (ssadd<mode>3): Convert to define_expand. > (*arm_ssadd<mode>3): New define_insn. > (sssub<mode>3): Convert to define_expand. > (*arm_sssub<mode>3): New define_insn. > (ssmulsa3): Convert to define_expand. > (*arm_ssmulsa3): New define_insn. > (usmulusa3): Convert to define_expand. > (*arm_usmulusa3): New define_insn. > (ssmulha3): FAIL if ARM_Q_BIT_READ. > (arm_ssatsihi_shift, arm_usatsihi): Disable for ARM_Q_BIT_READ. > * config/arm/iterators.md (qaddsub_clob_q): New mode attribute. > > 2019-11-07 Kyrylo Tkachov <kyrylo.tkac...@arm.com> > > * gcc.target/arm/acle/saturation.c: New test. > * gcc.target/arm/acle/sat_no_smlatb.c: Likewise. > * lib/target-supports.exp (check_effective_target_arm_qbit_ok_nocache): > Define.. > (check_effective_target_arm_qbit_ok): Likewise. > (add_options_for_arm_qbit): Likewise. >
I Kyrill, I've noticed that the new saturation.c test fails on arm-none-linux-gnueabi --with-mode=arm --with-cpu=cortex-a9 when forcing -march=armv5t in RUNTESTFLAGS. FAIL: gcc.target/arm/acle/saturation.c -O0 scan-assembler-times ssat\t...?, #8, ...? 1 FAIL: gcc.target/arm/acle/saturation.c -O0 scan-assembler-times usat\t...?, #24, ...? 1 FAIL: gcc.target/arm/acle/saturation.c -O0 scan-assembler-times usat\t...?, #3, ...? 1 FAIL: gcc.target/arm/acle/saturation.c -O0 (test for excess errors) etc... The excess errors are: Excess errors: /gcc/testsuite/gcc.target/arm/acle/saturation.c:10:10: warning: implicit declaration of function '__ssat' [-Wimplicit-function-declaration] /gcc/testsuite/gcc.target/arm/acle/saturation.c:18:10: warning: implicit declaration of function '__usat' [-Wimplicit-function-declaration] The testcase is compiled with -march=armv5t [...] -march=armv5te [...] which seems what it is expected? When configuring with --with-mode=thumb, the test is skipped (UNSUPPORTED). Christophe