On Thu, 7 Nov 2019 at 11:26, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> wrote:
>
> Hi all,
>
> This patch adds the plumbing for and an implementation of the saturation
> intrinsics from ACLE [1], in particular the __ssat, __usat intrinsics.
> These intrinsics set the Q sticky bit in APSR if an overflow occurred.
> ACLE allows the user to read that bit (within the same function, it's not
> defined across function boundaries) using the __saturation_occurred
> intrinsic
> and reset it using __set_saturation_occurred.
> Thus, if the user cares about the Q bit they would be using a flow such as:
>
> __set_saturation_occurred (0); // reset the Q bit
> ...
> __ssat (...) // Do some calculations involving __ssat
> ...
> if (__saturation_occurred ()) // if Q bit set handle overflow
>    ...
>
> For the implementation this has a few implications:
> * We must track the Q-setting side-effects of these instructions to make
> sure
> saturation reading/writing intrinsics are ordered properly.
> This is done by introducing a new "apsrq" register (and associated
> APSRQ_REGNUM) in a similar way to the "fake"" cc register.
>
> * The RTL patterns coming out of these intrinsics can have two forms:
> one where they set the APSRQ_REGNUM and one where they don't.
> Which one is used depends on whether the function cares about reading the Q
> flag. This is detected using the TARGET_CHECK_BUILTIN_CALL hook on the
> __saturation_occurred, __set_saturation_occurred occurrences.
> If no Q-flag read is present in the function we'll use the simpler
> non-Q-setting form to allow for more aggressive scheduling and such.
> If a Q-bit read is present then the Q-setting form is emitted.
> To avoid adding two patterns for each intrinsic to the MD file we make
> use of define_subst to auto-generate the Q-setting forms
>
> * Some existing patterns already produce instructions that may clobber the
> Q bit, but they don't model it (as we didn't care about that bit up till
> now).
> Since these patterns can be generated from straight-line C code they can
> affect
> the Q-bit reads from intrinsics. Therefore they have to be disabled when
> a Q-bit read is present.  These are mostly patterns in arm-fixed.md that are
> not very common anyway, but there are also a couple of widening
> multiply-accumulate patterns in arm.md that can set the Q-bit during
> accumulation.
>
> There are more Q-setting intrinsics in ACLE, but these will be
> implemented in
> a more mechanical fashion once the infrastructure in this patch goes in.
>
> Bootstrapped and tested on arm-none-linux-gnueabihf.
>
> Committing to trunk.
> Thanks,
> Kyrill
>
>
> 2019-11-07  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>
>      * config/arm/aout.h (REGISTER_NAMES): Add apsrq.
>      * config/arm/arm.md (APSRQ_REGNUM): Define.
>      (add_setq): New define_subst.
>      (add_clobber_q_name): New define_subst_attr.
>      (add_clobber_q_pred): Likewise.
>      (maddhisi4): Change to define_expand.  Split into mult and add if
>      ARM_Q_BIT_READ.
>      (arm_maddhisi4): New define_insn.
>      (*maddhisi4tb): Disable for ARM_Q_BIT_READ.
>      (*maddhisi4tt): Likewise.
>      (arm_ssat): New define_expand.
>      (arm_usat): Likewise.
>      (arm_get_apsr): New define_insn.
>      (arm_set_apsr): Likewise.
>      (arm_saturation_occurred): New define_expand.
>      (arm_set_saturation): Likewise.
>      (*satsi_<SAT:code>): Rename to...
>      (satsi_<SAT:code><add_clobber_q_name>): ... This.
>      (*satsi_<SAT:code>_shift): Disable for ARM_Q_BIT_READ.
>      * config/arm/arm.h (FIXED_REGISTERS): Mark apsrq as fixed.
>      (CALL_USED_REGISTERS): Mark apsrq.
>      (FIRST_PSEUDO_REGISTER): Update value.
>      (REG_ALLOC_ORDER): Add APSRQ_REGNUM.
>      (machine_function): Add q_bit_access.
>      (ARM_Q_BIT_READ): Define.
>      * config/arm/arm.c (TARGET_CHECK_BUILTIN_CALL): Define.
>      (arm_conditional_register_usage): Clear APSRQ_REGNUM from
>      operand_reg_set.
>      (arm_q_bit_access): Define.
>      * config/arm/arm-builtins.c: Include stringpool.h.
>      (arm_sat_binop_imm_qualifiers,
>      arm_unsigned_sat_binop_unsigned_imm_qualifiers,
>      arm_sat_occurred_qualifiers, arm_set_sat_qualifiers): Define.
>      (SAT_BINOP_UNSIGNED_IMM_QUALIFIERS,
>      UNSIGNED_SAT_BINOP_UNSIGNED_IMM_QUALIFIERS, SAT_OCCURRED_QUALIFIERS,
>      SET_SAT_QUALIFIERS): Likewise.
>      (arm_builtins): Define ARM_BUILTIN_SAT_IMM_CHECK.
>      (arm_init_acle_builtins): Initialize __builtin_sat_imm_check.
>      Handle 0 argument expander.
>      (arm_expand_acle_builtin): Handle ARM_BUILTIN_SAT_IMM_CHECK.
>      (arm_check_builtin_call): Define.
>      * config/arm/arm.md (ssmulsa3, usmulusa3, usmuluha3,
>      arm_ssatsihi_shift, arm_usatsihi): Disable when ARM_Q_BIT_READ.
>      * config/arm/arm-protos.h (arm_check_builtin_call): Declare prototype.
>      (arm_q_bit_access): Likewise.
>      * config/arm/arm_acle.h (__ssat, __usat, __ignore_saturation,
>      __saturation_occurred, __set_saturation_occurred): Define.
>      * config/arm/arm_acle_builtins.def: Define builtins for ssat, usat,
>      saturation_occurred, set_saturation_occurred.
>      * config/arm/unspecs.md (UNSPEC_Q_SET): Define.
>      (UNSPEC_APSR_READ): Likewise.
>      (VUNSPEC_APSR_WRITE): Likewise.
>      * config/arm/arm-fixed.md (ssadd<mode>3): Convert to define_expand.
>      (*arm_ssadd<mode>3): New define_insn.
>      (sssub<mode>3): Convert to define_expand.
>      (*arm_sssub<mode>3): New define_insn.
>      (ssmulsa3): Convert to define_expand.
>      (*arm_ssmulsa3): New define_insn.
>      (usmulusa3): Convert to define_expand.
>      (*arm_usmulusa3): New define_insn.
>      (ssmulha3): FAIL if ARM_Q_BIT_READ.
>      (arm_ssatsihi_shift, arm_usatsihi): Disable for ARM_Q_BIT_READ.
>      * config/arm/iterators.md (qaddsub_clob_q): New mode attribute.
>
> 2019-11-07  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>
>      * gcc.target/arm/acle/saturation.c: New test.
>      * gcc.target/arm/acle/sat_no_smlatb.c: Likewise.
>      * lib/target-supports.exp (check_effective_target_arm_qbit_ok_nocache):
>      Define..
>      (check_effective_target_arm_qbit_ok): Likewise.
>      (add_options_for_arm_qbit): Likewise.
>

I Kyrill,

I've noticed that the new saturation.c test fails on
arm-none-linux-gnueabi --with-mode=arm --with-cpu=cortex-a9
when forcing -march=armv5t in RUNTESTFLAGS.
FAIL: gcc.target/arm/acle/saturation.c   -O0   scan-assembler-times
ssat\t...?, #8, ...? 1
FAIL: gcc.target/arm/acle/saturation.c   -O0   scan-assembler-times
usat\t...?, #24, ...? 1
FAIL:  gcc.target/arm/acle/saturation.c   -O0   scan-assembler-times
usat\t...?, #3, ...? 1
FAIL:  gcc.target/arm/acle/saturation.c   -O0  (test for excess errors)
etc...
The excess errors are:
Excess errors:
/gcc/testsuite/gcc.target/arm/acle/saturation.c:10:10: warning:
implicit declaration of function '__ssat'
[-Wimplicit-function-declaration]
/gcc/testsuite/gcc.target/arm/acle/saturation.c:18:10: warning:
implicit declaration of function '__usat'
[-Wimplicit-function-declaration]

The testcase is compiled with -march=armv5t [...] -march=armv5te [...]
which seems what it is expected?

When configuring with --with-mode=thumb, the test is skipped (UNSUPPORTED).

Christophe

Reply via email to