Re: [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value

Kewen.Lin via Gcc-patches Mon, 10 Jul 2023 22:40:25 -0700

on 2023/7/11 03:18, Carl Love wrote:
> On Fri, 2023-07-07 at 12:06 +0800, Kewen.Lin wrote:
>> Hi Carl,
>>
>> Some more minor comments are inline below on top of Peter's
>> insightful
>> review comments.
>>
>> on 2023/7/1 08:58, Carl Love wrote:
>>> GCC maintainers:
>>>
>>> Ver 2,  Went back thru the requirements and emails.  Not sure where
>>> I
>>> came up with the requirement for an overloaded version with double
>>> argument.  Removed the overloaded version with the double
>>> argument. 
>>> Added the macro to announce if the __builtin_set_fpscr_rn returns a
>>> void or a double with the FPSCR bits.  Updated the documentation
>>> file. 
>>> Retested on Power 8 BE/LE, Power 9 BE/LE, Power 10 LE.  Redid the
>>> test
>>> file.  Per request, the original test file functionality was not
>>> changed.  Just changed the name from test_fpscr_rn_builtin.c to 
>>> test_fpscr_rn_builtin_1.c.  Put new tests for the return values
>>> into a
>>> new test file, test_fpscr_rn_builtin_2.c.
>>>
>>> The GLibC team requested a builtin to replace the mffscrn and
>>> mffscrniinline asm instructions in the GLibC code.  Previously
>>> there
>>> was discussion on adding builtins for the mffscrn instructions.
>>>
>>> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html
>>>
>>> In the end, it was felt that it would be to extend the existing
>>> __builtin_set_fpscr_rn builtin to return a double instead of a void
>>> type.  The desire is that we could have the functionality of the
>>> mffscrn and mffscrni instructions on older ISAs.  The two
>>> instructions
>>> were initially added in ISA 3.0.  The __builtin_set_fpscr_rn has
>>> the
>>> needed functionality to set the RN field using the mffscrn and
>>> mffscrni
>>> instructions if ISA 3.0 is supported or fall back to using logical
>>> instructions to mask and set the bits for earlier ISAs.  The
>>> instructions return the current value of the FPSCR fields DRN, VE,
>>> OE,
>>> UE, ZE, XE, NI, RN bit positions then update the RN bit positions
>>> with
>>> the new RN value provided.
>>>
>>> The current __builtin_set_fpscr_rn builtin has a return type of
>>> void. 
>>> So, changing the return type to double and returning the  FPSCR
>>> fields
>>> DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
>>> functionally equivalent of the mffscrn and mffscrni
>>> instructions.  Any
>>> current uses of the builtin would just ignore the return value yet
>>> any
>>> new uses could use the return value.  So the requirement is for the
>>> change to the __builtin_set_fpscr_rn builtin to be backwardly
>>> compatible and work for all ISAs.
>>>
>>> The following patch changes the return type of the
>>>  __builtin_set_fpscr_rn builtin from void to double.  The return
>>> value
>>> is the current value of the various FPSCR fields DRN, VE, OE, UE,
>>> ZE,
>>> XE, NI, RN bit positions when the builtin is called.  The builtin
>>> then
>>> updated the RN field with the new value provided as an argument to
>>> the
>>> builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c
>>> to
>>> check that the builtin returns the current value of the FPSCR
>>> fields
>>> and then updates the RN field.
>>>
>>> The GLibC team has reviewed the patch to make sure it met their
>>> needs
>>> as a drop in replacement for the inline asm mffscr and mffscrni
>>> statements in the GLibC code.  T
>>>
>>> The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power
>>> 10
>>> LE.
>>>
>>> Please let me know if the patch is acceptable for
>>> mainline.  Thanks.
>>>
>>>                                Carl 
>>>
>>>
>>> ----------------------------------
>>> rs6000, __builtin_set_fpscr_rn add retrun value
>>>
>>> Change the return value from void to double.  The return value
>>> consists of
>>> the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit
>>> positions.  Add an
>>> overloaded version which accepts a double argument.
>>>
>>> The test powerpc/test_fpscr_rn_builtin.c is updated to add tests
>>> for the
>>> double reterun value and the new double argument.
>>>
>>> gcc/ChangeLog:
>>>     * config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn):
>>> Update
>>>     builtin definition return type.
>>>     * config/rs6000-c.cc(rs6000_target_modify_macros): Add check,
>>> define
>>>     __SET_FPSCR_RN_RETURNS_FPSCR__ macro.
>>>     * config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New
>>>     define_expand.
>>>     (rs6000_update_fpscr_rn_field): New define_expand.
>>>     (rs6000_set_fpscr_rn): Added    return argument.  Updated
>>> to use new
>>>     rs6000_get_fpscr_fields and rs6000_update_fpscr_rn_field define
>>>      _expands.
>>>     * doc/extend.texi (__builtin_set_fpscr_rn): Update description
>>> for
>>>     the return value and new double argument.  Add descripton for
>>>     __SET_FPSCR_RN_RETURNS_FPSCR__ macro.
>>>
>>> gcc/testsuite/ChangeLog:
>>>     gcc.target/powerpc/test_fpscr_rn_builtin.c: Renamed to
>>>     test_fpscr_rn_builtin_1.c.  Added comment.
>>>     gcc.target/powerpc/test_fpscr_rn_builtin_2.c: New test for the
>>>     return value of __builtin_set_fpscr_rn builtin.
>>> ---
>>>  gcc/config/rs6000/rs6000-builtins.def         |   2 +-
>>>  gcc/config/rs6000/rs6000-c.cc                 |   4 +
>>>  gcc/config/rs6000/rs6000.md                   |  87 +++++++---
>>>  gcc/doc/extend.texi                           |  26 ++-
>>>  ...rn_builtin.c => test_fpscr_rn_builtin_1.c} |   6 +
>>>  .../powerpc/test_fpscr_rn_builtin_2.c         | 153
>>> ++++++++++++++++++
>>>  6 files changed, 246 insertions(+), 32 deletions(-)
>>>  rename gcc/testsuite/gcc.target/powerpc/{test_fpscr_rn_builtin.c
>>> => test_fpscr_rn_builtin_1.c} (92%)
>>>  create mode 100644
>>> gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
>>>
>>> diff --git a/gcc/config/rs6000/rs6000-builtins.def
>>> b/gcc/config/rs6000/rs6000-builtins.def
>>> index 289a37998b1..28788b69c7d 100644
>>> --- a/gcc/config/rs6000/rs6000-builtins.def
>>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>>> @@ -237,7 +237,7 @@
>>>    const __ibm128 __builtin_pack_ibm128 (double, double);
>>>      PACK_IF packif {ibm128}
>>>  
>>> -  void __builtin_set_fpscr_rn (const int[0,3]);
>>> +  double __builtin_set_fpscr_rn (const int[0,3]);
>>>      SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
>>>  
>>>    const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
>>> diff --git a/gcc/config/rs6000/rs6000-c.cc
>>> b/gcc/config/rs6000/rs6000-c.cc
>>> index 8555174d36e..8373bb66919 100644
>>> --- a/gcc/config/rs6000/rs6000-c.cc
>>> +++ b/gcc/config/rs6000/rs6000-c.cc
>>> @@ -604,6 +604,10 @@ rs6000_target_modify_macros (bool define_p,
>>> HOST_WIDE_INT flags)
>>>    /* Tell the user -mrop-protect is in play.  */
>>>    if (rs6000_rop_protect)
>>>      rs6000_define_or_undefine_macro (define_p, "__ROP_PROTECT__");
>>> +  /* Tell the user the __builtin_set_fpscr_rn now returns the
>>> FPSCR fields
>>> +     in a double.  Originally the builtin returned void.  */
>>> +  if ((flags & OPTION_MASK_SOFT_FLOAT) == 0)
>>> +  rs6000_define_or_undefine_macro (define_p,
>>> "__SET_FPSCR_RN_RETURNS_FPSCR__");
>>>  }
>>>  
>>>  void
>>> diff --git a/gcc/config/rs6000/rs6000.md
>>> b/gcc/config/rs6000/rs6000.md
>>> index b0db8ae508d..1b77a13c8a1 100644
>>> --- a/gcc/config/rs6000/rs6000.md
>>> +++ b/gcc/config/rs6000/rs6000.md
>>> @@ -6440,8 +6440,51 @@
>>>    "mffscdrn %0,%1"
>>>    [(set_attr "type" "fp")])
>>>  
>>> +
>>> +(define_expand "rs6000_get_fpscr_fields"
>>> + [(match_operand:DF 0 "gpc_reg_operand")]
>>> +  "TARGET_HARD_FLOAT"
>>> +{
>>> +  /* Extract fields bits 29:31 (DRN) and bits 56:63 (VE, OE, UE,
>>> ZE, XE, NI,
>>> +     RN) from the FPSCR and return them.  */
>>> +  rtx tmp_df = gen_reg_rtx (DFmode);
>>> +  rtx tmp_di = gen_reg_rtx (DImode);
>>> +
>>> +  emit_insn (gen_rs6000_mffs (tmp_df));
>>> +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
>>> +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT
>>> (0x00000007000000FFULL)));
>>> +  rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
>>> +  emit_move_insn (operands[0], tmp_rtn);
>>> +  DONE;
>>> +})
>>> +
>>> +(define_expand "rs6000_update_fpscr_rn_field"
>>> + [(match_operand:DI 0 "gpc_reg_operand")]
>>> +  "TARGET_HARD_FLOAT"
>>> +{
>>> +  /* Insert the new RN value from operands[0] into FPSCR bit
>>> [62:63].  */
>>> +  rtx tmp_di = gen_reg_rtx (DImode);
>>> +  rtx tmp_df = gen_reg_rtx (DFmode);
>>> +
>>> +  emit_insn (gen_rs6000_mffs (tmp_df));
>>> +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
>>> +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
>>> +  emit_insn (gen_iordi3 (tmp_di, tmp_di, operands[0]));
>>> +
>>> +  /* Need to write to field k=15.  The fields are [0:15].  Hence
>>> with
>>> +     L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is
>>> an
>>> +     8-bit field[0:7]. Need to set the bit that corresponds to the
>>> +     value of i that you want [0:7].  */
>>> +
>>> +  tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
>>> +  emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
>>> +  DONE;
>>> +})
>>> +
>>>  (define_expand "rs6000_set_fpscr_rn"
>>> - [(match_operand:SI 0 "reg_or_cint_operand")]
>>> + [(set (match_operand:DF 0 "gpc_reg_operand")
>>> +       (unspec_volatile:DF [(match_operand:SI 1
>>> "reg_or_cint_operand")]
>>> +       UNSPECV_MFFSCDRN))]
>>
>> I don't think we need this pattern to be complete, since it's always
>> expanded with "DONE", the pattern is useless, so it can be simpler
>> like:
>>
>>    [(use (match_operand:DF 0 "gpc_reg_operand"))
>>     (use (match_operand:SI 1 "reg_or_cint_operand"))]
> 
> OK, changed this.  Went and reviewed the use of set in the GCC internal
> documentation.  Not really sure why we can use set with the DONE.  Not
> really following your explanation.  It does seem to work.
>


It's mainly based on the documentation of define_expand:

"If the preparation falls through (invokes neither DONE nor FAIL), then
 the define_expand acts like a define_insn in that the RTL template is
 used to generate the insn.

 The RTL template is not used for matching, only for generating the initial
 insn list. **If the preparation statement always invokes DONE or FAIL, the RTL
 template may be reduced to a simple list of operands** ..."

https://gcc.gnu.org/onlinedocs/gccint/Expander-Definitions.html

BR,
Kewen

Re: [PATCH ver 2] rs6000, __builtin_set_fpscr_rn add retrun value

Reply via email to