Re: [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value

Kewen.Lin via Gcc-patches Thu, 06 Jul 2023 21:06:34 -0700

Hi Carl,

Some more minor comments are inline below on top of Peter's insightful
review comments.


on 2023/7/1 08:58, Carl Love wrote:
> 
> GCC maintainers:
> 
> Ver 2,  Went back thru the requirements and emails.  Not sure where I
> came up with the requirement for an overloaded version with double
> argument.  Removed the overloaded version with the double argument. 
> Added the macro to announce if the __builtin_set_fpscr_rn returns a
> void or a double with the FPSCR bits.  Updated the documentation file. 
> Retested on Power 8 BE/LE, Power 9 BE/LE, Power 10 LE.  Redid the test
> file.  Per request, the original test file functionality was not
> changed.  Just changed the name from test_fpscr_rn_builtin.c to 
> test_fpscr_rn_builtin_1.c.  Put new tests for the return values into a
> new test file, test_fpscr_rn_builtin_2.c.
> 
> The GLibC team requested a builtin to replace the mffscrn and
> mffscrniinline asm instructions in the GLibC code.  Previously there
> was discussion on adding builtins for the mffscrn instructions.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html
> 
> In the end, it was felt that it would be to extend the existing
> __builtin_set_fpscr_rn builtin to return a double instead of a void
> type.  The desire is that we could have the functionality of the
> mffscrn and mffscrni instructions on older ISAs.  The two instructions
> were initially added in ISA 3.0.  The __builtin_set_fpscr_rn has the
> needed functionality to set the RN field using the mffscrn and mffscrni
> instructions if ISA 3.0 is supported or fall back to using logical
> instructions to mask and set the bits for earlier ISAs.  The
> instructions return the current value of the FPSCR fields DRN, VE, OE,
> UE, ZE, XE, NI, RN bit positions then update the RN bit positions with
> the new RN value provided.
> 
> The current __builtin_set_fpscr_rn builtin has a return type of void. 
> So, changing the return type to double and returning the  FPSCR fields
> DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
> functionally equivalent of the mffscrn and mffscrni instructions.  Any
> current uses of the builtin would just ignore the return value yet any
> new uses could use the return value.  So the requirement is for the
> change to the __builtin_set_fpscr_rn builtin to be backwardly
> compatible and work for all ISAs.
> 
> The following patch changes the return type of the
>  __builtin_set_fpscr_rn builtin from void to double.  The return value
> is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE,
> XE, NI, RN bit positions when the builtin is called.  The builtin then
> updated the RN field with the new value provided as an argument to the
> builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c to
> check that the builtin returns the current value of the FPSCR fields
> and then updates the RN field.
> 
> The GLibC team has reviewed the patch to make sure it met their needs
> as a drop in replacement for the inline asm mffscr and mffscrni
> statements in the GLibC code.  T
> 
> The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10
> LE.
> 
> Please let me know if the patch is acceptable for mainline.  Thanks.
> 
>                                Carl 
> 
> 
> ----------------------------------
> rs6000, __builtin_set_fpscr_rn add retrun value
> 
> Change the return value from void to double.  The return value consists of
> the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions.  Add an
> overloaded version which accepts a double argument.
> 
> The test powerpc/test_fpscr_rn_builtin.c is updated to add tests for the
> double reterun value and the new double argument.
> 
> gcc/ChangeLog:
>       * config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn): Update
>       builtin definition return type.
>       * config/rs6000-c.cc(rs6000_target_modify_macros): Add check, define
>       __SET_FPSCR_RN_RETURNS_FPSCR__ macro.
>       * config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New
>       define_expand.
>       (rs6000_update_fpscr_rn_field): New define_expand.
>       (rs6000_set_fpscr_rn): Added    return argument.  Updated to use new
>       rs6000_get_fpscr_fields and rs6000_update_fpscr_rn_field define
>        _expands.
>       * doc/extend.texi (__builtin_set_fpscr_rn): Update description for
>       the return value and new double argument.  Add descripton for
>       __SET_FPSCR_RN_RETURNS_FPSCR__ macro.
> 
> gcc/testsuite/ChangeLog:
>       gcc.target/powerpc/test_fpscr_rn_builtin.c: Renamed to
>       test_fpscr_rn_builtin_1.c.  Added comment.
>       gcc.target/powerpc/test_fpscr_rn_builtin_2.c: New test for the
>       return value of __builtin_set_fpscr_rn builtin.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         |   2 +-
>  gcc/config/rs6000/rs6000-c.cc                 |   4 +
>  gcc/config/rs6000/rs6000.md                   |  87 +++++++---
>  gcc/doc/extend.texi                           |  26 ++-
>  ...rn_builtin.c => test_fpscr_rn_builtin_1.c} |   6 +
>  .../powerpc/test_fpscr_rn_builtin_2.c         | 153 ++++++++++++++++++
>  6 files changed, 246 insertions(+), 32 deletions(-)
>  rename gcc/testsuite/gcc.target/powerpc/{test_fpscr_rn_builtin.c => 
> test_fpscr_rn_builtin_1.c} (92%)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
> b/gcc/config/rs6000/rs6000-builtins.def
> index 289a37998b1..28788b69c7d 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -237,7 +237,7 @@
>    const __ibm128 __builtin_pack_ibm128 (double, double);
>      PACK_IF packif {ibm128}
>  
> -  void __builtin_set_fpscr_rn (const int[0,3]);
> +  double __builtin_set_fpscr_rn (const int[0,3]);
>      SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
>  
>    const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
> diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
> index 8555174d36e..8373bb66919 100644
> --- a/gcc/config/rs6000/rs6000-c.cc
> +++ b/gcc/config/rs6000/rs6000-c.cc
> @@ -604,6 +604,10 @@ rs6000_target_modify_macros (bool define_p, 
> HOST_WIDE_INT flags)
>    /* Tell the user -mrop-protect is in play.  */
>    if (rs6000_rop_protect)
>      rs6000_define_or_undefine_macro (define_p, "__ROP_PROTECT__");
> +  /* Tell the user the __builtin_set_fpscr_rn now returns the FPSCR fields
> +     in a double.  Originally the builtin returned void.  */
> +  if ((flags & OPTION_MASK_SOFT_FLOAT) == 0)
> +  rs6000_define_or_undefine_macro (define_p, 
> "__SET_FPSCR_RN_RETURNS_FPSCR__");
>  }
>  
>  void
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index b0db8ae508d..1b77a13c8a1 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -6440,8 +6440,51 @@
>    "mffscdrn %0,%1"
>    [(set_attr "type" "fp")])
>  
> +
> +(define_expand "rs6000_get_fpscr_fields"
> + [(match_operand:DF 0 "gpc_reg_operand")]
> +  "TARGET_HARD_FLOAT"
> +{
> +  /* Extract fields bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
> +     RN) from the FPSCR and return them.  */
> +  rtx tmp_df = gen_reg_rtx (DFmode);
> +  rtx tmp_di = gen_reg_rtx (DImode);
> +
> +  emit_insn (gen_rs6000_mffs (tmp_df));
> +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (0x00000007000000FFULL)));
> +  rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
> +  emit_move_insn (operands[0], tmp_rtn);
> +  DONE;
> +})
> +
> +(define_expand "rs6000_update_fpscr_rn_field"
> + [(match_operand:DI 0 "gpc_reg_operand")]
> +  "TARGET_HARD_FLOAT"
> +{
> +  /* Insert the new RN value from operands[0] into FPSCR bit [62:63].  */
> +  rtx tmp_di = gen_reg_rtx (DImode);
> +  rtx tmp_df = gen_reg_rtx (DFmode);
> +
> +  emit_insn (gen_rs6000_mffs (tmp_df));
> +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
> +  emit_insn (gen_iordi3 (tmp_di, tmp_di, operands[0]));
> +
> +  /* Need to write to field k=15.  The fields are [0:15].  Hence with
> +     L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is an
> +     8-bit field[0:7]. Need to set the bit that corresponds to the
> +     value of i that you want [0:7].  */
> +
> +  tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
> +  emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
> +  DONE;
> +})
> +
>  (define_expand "rs6000_set_fpscr_rn"
> - [(match_operand:SI 0 "reg_or_cint_operand")]
> + [(set (match_operand:DF 0 "gpc_reg_operand")
> +       (unspec_volatile:DF [(match_operand:SI 1 "reg_or_cint_operand")]
> +       UNSPECV_MFFSCDRN))]

I don't think we need this pattern to be complete, since it's always
expanded with "DONE", the pattern is useless, so it can be simpler like:

   [(use (match_operand:DF 0 "gpc_reg_operand"))
    (use (match_operand:SI 1 "reg_or_cint_operand"))]


>    "TARGET_HARD_FLOAT"
>  {
>    rtx tmp_df = gen_reg_rtx (DFmode);
> @@ -6450,25 +6493,34 @@
>       new rounding mode bits from operands[0][62:63] into FPSCR[62:63].  */
>    if (TARGET_P9_MISC)
>      {
> -      if (const_0_to_3_operand (operands[0], VOIDmode))
> -     emit_insn (gen_rs6000_mffscrni (tmp_df, operands[0]));
> +      if (const_0_to_3_operand (operands[1], VOIDmode))
> +     emit_insn (gen_rs6000_mffscrni (tmp_df, operands[1]));
>        else
>       {
> -       rtx op0 = convert_to_mode (DImode, operands[0], false);
> -       rtx src_df = simplify_gen_subreg (DFmode, op0, DImode, 0);
> +       rtx op1 = convert_to_mode (DImode, operands[1], false);
> +       rtx src_df = simplify_gen_subreg (DFmode, op1, DImode, 0);
>         emit_insn (gen_rs6000_mffscrn (tmp_df, src_df));
>       }
> -      DONE;
> +       emit_move_insn (operands[0], tmp_df);
> +       DONE;
>      }
>  
> -  if (CONST_INT_P (operands[0]))
> +  /* Emulate the behavior of the mffscrni, mffscrn instructions for earlier
> +     ISAs.  Return bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
> +     RN) from the FPSCR.  Set the RN field based on the value in operands[1].
> +  */
> +
> +  /* Get the current FPSCR fields to return.  */
> +  emit_insn (gen_rs6000_get_fpscr_fields (operands[0]));

IMHO the above define_expand rs6000_get_fpscr_fields isn't needed, just inline
it here, since it doesn't get reused somewhere else, and more importantly the
read fpscr in DImode (tmp_di from simplify_gen_subreg on tmp_df) can be shared 
with the uses in rs6000_update_fpscr_rn_field (so I'd also suggest not factoring
out the previous RN bit handlings into rs6000_update_fpscr_rn_field).

> +
> +  if (CONST_INT_P (operands[1]))
>      {
> -      if ((INTVAL (operands[0]) & 0x1) == 0x1)
> +      if ((INTVAL (operands[1]) & 0x1) == 0x1)
>       emit_insn (gen_rs6000_mtfsb1 (GEN_INT (31)));
>        else
>       emit_insn (gen_rs6000_mtfsb0 (GEN_INT (31)));
>  
> -      if ((INTVAL (operands[0]) & 0x2) == 0x2)
> +      if ((INTVAL (operands[1]) & 0x2) == 0x2)
>       emit_insn (gen_rs6000_mtfsb1 (GEN_INT (30)));
>        else
>       emit_insn (gen_rs6000_mtfsb0 (GEN_INT (30)));
> @@ -6476,24 +6528,13 @@
>    else
>      {
>        rtx tmp_rn = gen_reg_rtx (DImode);
> -      rtx tmp_di = gen_reg_rtx (DImode);
>  
>        /* Extract new RN mode from operand.  */
> -      rtx op0 = convert_to_mode (DImode, operands[0], false);
> -      emit_insn (gen_anddi3 (tmp_rn, op0, GEN_INT (3)));
> +      rtx op1 = convert_to_mode (DImode, operands[1], false);
> +      emit_insn (gen_anddi3 (tmp_rn, op1, GEN_INT (3)));
>  
>        /* Insert new RN mode into FSCPR.  */
> -      emit_insn (gen_rs6000_mffs (tmp_df));
> -      tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> -      emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
> -      emit_insn (gen_iordi3 (tmp_di, tmp_di, tmp_rn));
> -
> -      /* Need to write to field k=15.  The fields are [0:15].  Hence with
> -      L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is an
> -      8-bit field[0:7]. Need to set the bit that corresponds to the
> -      value of i that you want [0:7].  */
> -      tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
> -      emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
> +      emit_insn (gen_rs6000_update_fpscr_rn_field (tmp_rn));
>      }
>    DONE;
>  })
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index cdbd4b34a35..fee35ac40ec 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -18188,7 +18188,6 @@ double __builtin_mffs (void);
>  void __builtin_mtfsf (const int, double);
>  void __builtin_mtfsb0 (const int);
>  void __builtin_mtfsb1 (const int);
> -void __builtin_set_fpscr_rn (int);

Just update void to double here, instead of the below seperated smallexample?
As the below documentation is to describe these above functions one by
one, just moving one down looks a bit inconsistent.

>  @end smallexample
>  
>  The @code{__builtin_ppc_get_timebase} and @code{__builtin_ppc_mftb}
> @@ -18209,13 +18208,24 @@ values to selected fields of the FPSCR.  The
>  as an argument.  The valid bit range is between 0 and 31.  The builtins map 
> to
>  the @code{mtfsb0} and @code{mtfsb1} instructions which take the argument and
>  add 32.  Hence these instructions only modify the FPSCR[32:63] bits by
> -changing the specified bit to a zero or one respectively.  The
> -@code{__builtin_set_fpscr_rn} builtin allows changing both of the floating
> -point rounding mode bits.  The argument is a 2-bit value.  The argument can
> -either be a @code{const int} or stored in a variable. The builtin uses
> -the ISA 3.0
> -instruction @code{mffscrn} if available, otherwise it reads the FPSCR, masks
> -the current rounding mode bits out and OR's in the new value.
> +changing the specified bit to a zero or one respectively.
> +
> +@smallexample
> +double __builtin_set_fpscr_rn (int);
> +@end smallexample
> +
> +The @code{__builtin_set_fpscr_rn} builtin allows changing both of the 
> floating
> +point rounding mode bits and returning the various FPSCR fields before the RN
> +field is updated.  The builtin returns a double consisting of the initial 
> value
> +of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, and RN bit positions with 
> all
> +other bits set to zero. The builtin argument is a 2-bit value for the new RN
> +field value.  The argument can either be an @code{const int} or stored in a
> +variable.  Earlier versions of @code{__builtin_set_fpscr_rn} returned void.  
> A
> +@code{__SET_FPSCR_RN_RETURNS_FPSCR__} macro has been added.  If defined, then
> +the @code{__builtin_set_fpscr_rn} builtin returns the FPSCR fields.  If not
> +defined, the @code{__builtin_set_fpscr_rn} does not return a vaule.  If the
> +@option{-msoft-float} option is used, the @code{__builtin_set_fpscr_rn} 
> builtin
> +will not return a value.

__builtin_set_fpscr_rn isn't available for -msoft-float, your patch doesn't 
change
it so IMHO you don't need to note this.

>  
>  @node Basic PowerPC Built-in Functions Available on ISA 2.05
>  @subsubsection Basic PowerPC Built-in Functions Available on ISA 2.05
> diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c 
> b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
> similarity index 92%
> rename from gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c
> rename to gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
> index 04707ad8a56..d917ee19656 100644
> --- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c
> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
> @@ -1,6 +1,12 @@
>  /* { dg-do run { target { powerpc*-*-* } } } */
>  /* { dg-options "-O2 -std=c99" } */
>  
> +/* Originally the __builtin_set_fpscr_rn builtin was defined to return
> +   void.  It was later extended to return a double with the various
> +   FPSCR bits.  The extended builtin is inteded to be a drop in replacement
> +   for the original version.  This test is for the original version of the
> +   builtin and should work exactly as before.  */
> +
>  #ifdef DEBUG
>  #include <stdio.h>
>  #endif
> diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c 
> b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> new file mode 100644
> index 00000000000..62fca67c948
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> @@ -0,0 +1,153 @@
> +/* { dg-do run { target { powerpc*-*-* } } } */
> +/* { dg-options "-O2 -std=c99" } */

Nit: The previous existing test_fpscr_rn_builtin.c introduced -std=c99 but
I think it is useless for this case, so you can remove it if you like.

> +
> +/* The __builtin_set_fpscr_rn builtin was originally defined to return
> +   void.  It was to return a double with the various FPSCR bits.  This test
> +   verifies the new version returns the desired FPSCR bits.  */
> +
> +#ifdef DEBUG
> +#include <stdio.h>
> +#endif
> +
> +#define RN_MASK  0x3LL             /* RN field mask */
> +#define FIELD_MASK 0x00000007000000FFULL
> +
> +union blah {
> +  double d;
> +  unsigned long long ll;
> +} conv_val;
> +
> +void abort (void);
> +double __attribute__ ((noipa)) wrap_set_fpscr_rn (int val)
> +{
> +  return __builtin_set_fpscr_rn (val);
> +}
> +

Nit: I noticed you want this case to abort if __SET_FPSCR_RN_RETURNS_FPSCR__
doesn't get defined, but actually the compilation should fail
for compiling this function wrap_set_fpscr_rn already, so
the below __SET_FPSCR_RN_RETURNS_FPSCR__ check and abort seems 
useless.  That said, the explicit use of return value of this built-in
__builtin_set_fpscr_rn already implicitly checks it.

BR,
Kewen

Re: [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value

Reply via email to