Hi Christophe,

On 11/12/19 10:29 AM, Christophe Lyon wrote:
On Thu, 7 Nov 2019 at 11:26, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> wrote:
Hi all,

This patch adds the plumbing for and an implementation of the saturation
intrinsics from ACLE [1], in particular the __ssat, __usat intrinsics.
These intrinsics set the Q sticky bit in APSR if an overflow occurred.
ACLE allows the user to read that bit (within the same function, it's not
defined across function boundaries) using the __saturation_occurred
intrinsic
and reset it using __set_saturation_occurred.
Thus, if the user cares about the Q bit they would be using a flow such as:

__set_saturation_occurred (0); // reset the Q bit
...
__ssat (...) // Do some calculations involving __ssat
...
if (__saturation_occurred ()) // if Q bit set handle overflow
    ...

For the implementation this has a few implications:
* We must track the Q-setting side-effects of these instructions to make
sure
saturation reading/writing intrinsics are ordered properly.
This is done by introducing a new "apsrq" register (and associated
APSRQ_REGNUM) in a similar way to the "fake"" cc register.

* The RTL patterns coming out of these intrinsics can have two forms:
one where they set the APSRQ_REGNUM and one where they don't.
Which one is used depends on whether the function cares about reading the Q
flag. This is detected using the TARGET_CHECK_BUILTIN_CALL hook on the
__saturation_occurred, __set_saturation_occurred occurrences.
If no Q-flag read is present in the function we'll use the simpler
non-Q-setting form to allow for more aggressive scheduling and such.
If a Q-bit read is present then the Q-setting form is emitted.
To avoid adding two patterns for each intrinsic to the MD file we make
use of define_subst to auto-generate the Q-setting forms

* Some existing patterns already produce instructions that may clobber the
Q bit, but they don't model it (as we didn't care about that bit up till
now).
Since these patterns can be generated from straight-line C code they can
affect
the Q-bit reads from intrinsics. Therefore they have to be disabled when
a Q-bit read is present.  These are mostly patterns in arm-fixed.md that are
not very common anyway, but there are also a couple of widening
multiply-accumulate patterns in arm.md that can set the Q-bit during
accumulation.

There are more Q-setting intrinsics in ACLE, but these will be
implemented in
a more mechanical fashion once the infrastructure in this patch goes in.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Committing to trunk.
Thanks,
Kyrill


2019-11-07  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

      * config/arm/aout.h (REGISTER_NAMES): Add apsrq.
      * config/arm/arm.md (APSRQ_REGNUM): Define.
      (add_setq): New define_subst.
      (add_clobber_q_name): New define_subst_attr.
      (add_clobber_q_pred): Likewise.
      (maddhisi4): Change to define_expand.  Split into mult and add if
      ARM_Q_BIT_READ.
      (arm_maddhisi4): New define_insn.
      (*maddhisi4tb): Disable for ARM_Q_BIT_READ.
      (*maddhisi4tt): Likewise.
      (arm_ssat): New define_expand.
      (arm_usat): Likewise.
      (arm_get_apsr): New define_insn.
      (arm_set_apsr): Likewise.
      (arm_saturation_occurred): New define_expand.
      (arm_set_saturation): Likewise.
      (*satsi_<SAT:code>): Rename to...
      (satsi_<SAT:code><add_clobber_q_name>): ... This.
      (*satsi_<SAT:code>_shift): Disable for ARM_Q_BIT_READ.
      * config/arm/arm.h (FIXED_REGISTERS): Mark apsrq as fixed.
      (CALL_USED_REGISTERS): Mark apsrq.
      (FIRST_PSEUDO_REGISTER): Update value.
      (REG_ALLOC_ORDER): Add APSRQ_REGNUM.
      (machine_function): Add q_bit_access.
      (ARM_Q_BIT_READ): Define.
      * config/arm/arm.c (TARGET_CHECK_BUILTIN_CALL): Define.
      (arm_conditional_register_usage): Clear APSRQ_REGNUM from
      operand_reg_set.
      (arm_q_bit_access): Define.
      * config/arm/arm-builtins.c: Include stringpool.h.
      (arm_sat_binop_imm_qualifiers,
      arm_unsigned_sat_binop_unsigned_imm_qualifiers,
      arm_sat_occurred_qualifiers, arm_set_sat_qualifiers): Define.
      (SAT_BINOP_UNSIGNED_IMM_QUALIFIERS,
      UNSIGNED_SAT_BINOP_UNSIGNED_IMM_QUALIFIERS, SAT_OCCURRED_QUALIFIERS,
      SET_SAT_QUALIFIERS): Likewise.
      (arm_builtins): Define ARM_BUILTIN_SAT_IMM_CHECK.
      (arm_init_acle_builtins): Initialize __builtin_sat_imm_check.
      Handle 0 argument expander.
      (arm_expand_acle_builtin): Handle ARM_BUILTIN_SAT_IMM_CHECK.
      (arm_check_builtin_call): Define.
      * config/arm/arm.md (ssmulsa3, usmulusa3, usmuluha3,
      arm_ssatsihi_shift, arm_usatsihi): Disable when ARM_Q_BIT_READ.
      * config/arm/arm-protos.h (arm_check_builtin_call): Declare prototype.
      (arm_q_bit_access): Likewise.
      * config/arm/arm_acle.h (__ssat, __usat, __ignore_saturation,
      __saturation_occurred, __set_saturation_occurred): Define.
      * config/arm/arm_acle_builtins.def: Define builtins for ssat, usat,
      saturation_occurred, set_saturation_occurred.
      * config/arm/unspecs.md (UNSPEC_Q_SET): Define.
      (UNSPEC_APSR_READ): Likewise.
      (VUNSPEC_APSR_WRITE): Likewise.
      * config/arm/arm-fixed.md (ssadd<mode>3): Convert to define_expand.
      (*arm_ssadd<mode>3): New define_insn.
      (sssub<mode>3): Convert to define_expand.
      (*arm_sssub<mode>3): New define_insn.
      (ssmulsa3): Convert to define_expand.
      (*arm_ssmulsa3): New define_insn.
      (usmulusa3): Convert to define_expand.
      (*arm_usmulusa3): New define_insn.
      (ssmulha3): FAIL if ARM_Q_BIT_READ.
      (arm_ssatsihi_shift, arm_usatsihi): Disable for ARM_Q_BIT_READ.
      * config/arm/iterators.md (qaddsub_clob_q): New mode attribute.

2019-11-07  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

      * gcc.target/arm/acle/saturation.c: New test.
      * gcc.target/arm/acle/sat_no_smlatb.c: Likewise.
      * lib/target-supports.exp (check_effective_target_arm_qbit_ok_nocache):
      Define..
      (check_effective_target_arm_qbit_ok): Likewise.
      (add_options_for_arm_qbit): Likewise.

I Kyrill,

I've noticed that the new saturation.c test fails on
arm-none-linux-gnueabi --with-mode=arm --with-cpu=cortex-a9
when forcing -march=armv5t in RUNTESTFLAGS.
FAIL: gcc.target/arm/acle/saturation.c   -O0   scan-assembler-times
ssat\t...?, #8, ...? 1
FAIL: gcc.target/arm/acle/saturation.c   -O0   scan-assembler-times
usat\t...?, #24, ...? 1
FAIL:  gcc.target/arm/acle/saturation.c   -O0   scan-assembler-times
usat\t...?, #3, ...? 1
FAIL:  gcc.target/arm/acle/saturation.c   -O0  (test for excess errors)
etc...
The excess errors are:
Excess errors:
/gcc/testsuite/gcc.target/arm/acle/saturation.c:10:10: warning:
implicit declaration of function '__ssat'
[-Wimplicit-function-declaration]
/gcc/testsuite/gcc.target/arm/acle/saturation.c:18:10: warning:
implicit declaration of function '__usat'
[-Wimplicit-function-declaration]

The testcase is compiled with -march=armv5t [...] -march=armv5te [...]
which seems what it is expected?

Ahh, it seems that TARGET_ARM_QBIT indeed requires armv5te but the ssat, usat instructions need armv6.

So the effective target check needs adjusting.

Thanks,

Kyrill



When configuring with --with-mode=thumb, the test is skipped (UNSUPPORTED).

Christophe

Reply via email to