on 2023/7/11 03:18, Carl Love wrote: > On Fri, 2023-07-07 at 12:06 +0800, Kewen.Lin wrote: >> Hi Carl, >> >> Some more minor comments are inline below on top of Peter's >> insightful >> review comments. >> >> on 2023/7/1 08:58, Carl Love wrote: >>> GCC maintainers: >>> >>> Ver 2, Went back thru the requirements and emails. Not sure where >>> I >>> came up with the requirement for an overloaded version with double >>> argument. Removed the overloaded version with the double >>> argument. >>> Added the macro to announce if the __builtin_set_fpscr_rn returns a >>> void or a double with the FPSCR bits. Updated the documentation >>> file. >>> Retested on Power 8 BE/LE, Power 9 BE/LE, Power 10 LE. Redid the >>> test >>> file. Per request, the original test file functionality was not >>> changed. Just changed the name from test_fpscr_rn_builtin.c to >>> test_fpscr_rn_builtin_1.c. Put new tests for the return values >>> into a >>> new test file, test_fpscr_rn_builtin_2.c. >>> >>> The GLibC team requested a builtin to replace the mffscrn and >>> mffscrniinline asm instructions in the GLibC code. Previously >>> there >>> was discussion on adding builtins for the mffscrn instructions. >>> >>> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html >>> >>> In the end, it was felt that it would be to extend the existing >>> __builtin_set_fpscr_rn builtin to return a double instead of a void >>> type. The desire is that we could have the functionality of the >>> mffscrn and mffscrni instructions on older ISAs. The two >>> instructions >>> were initially added in ISA 3.0. The __builtin_set_fpscr_rn has >>> the >>> needed functionality to set the RN field using the mffscrn and >>> mffscrni >>> instructions if ISA 3.0 is supported or fall back to using logical >>> instructions to mask and set the bits for earlier ISAs. The >>> instructions return the current value of the FPSCR fields DRN, VE, >>> OE, >>> UE, ZE, XE, NI, RN bit positions then update the RN bit positions >>> with >>> the new RN value provided. >>> >>> The current __builtin_set_fpscr_rn builtin has a return type of >>> void. >>> So, changing the return type to double and returning the FPSCR >>> fields >>> DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the >>> functionally equivalent of the mffscrn and mffscrni >>> instructions. Any >>> current uses of the builtin would just ignore the return value yet >>> any >>> new uses could use the return value. So the requirement is for the >>> change to the __builtin_set_fpscr_rn builtin to be backwardly >>> compatible and work for all ISAs. >>> >>> The following patch changes the return type of the >>> __builtin_set_fpscr_rn builtin from void to double. The return >>> value >>> is the current value of the various FPSCR fields DRN, VE, OE, UE, >>> ZE, >>> XE, NI, RN bit positions when the builtin is called. The builtin >>> then >>> updated the RN field with the new value provided as an argument to >>> the >>> builtin. The patch adds new testcases to test_fpscr_rn_builtin.c >>> to >>> check that the builtin returns the current value of the FPSCR >>> fields >>> and then updates the RN field. >>> >>> The GLibC team has reviewed the patch to make sure it met their >>> needs >>> as a drop in replacement for the inline asm mffscr and mffscrni >>> statements in the GLibC code. T >>> >>> The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power >>> 10 >>> LE. >>> >>> Please let me know if the patch is acceptable for >>> mainline. Thanks. >>> >>> Carl >>> >>> >>> ---------------------------------- >>> rs6000, __builtin_set_fpscr_rn add retrun value >>> >>> Change the return value from void to double. The return value >>> consists of >>> the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit >>> positions. Add an >>> overloaded version which accepts a double argument. >>> >>> The test powerpc/test_fpscr_rn_builtin.c is updated to add tests >>> for the >>> double reterun value and the new double argument. >>> >>> gcc/ChangeLog: >>> * config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn): >>> Update >>> builtin definition return type. >>> * config/rs6000-c.cc(rs6000_target_modify_macros): Add check, >>> define >>> __SET_FPSCR_RN_RETURNS_FPSCR__ macro. >>> * config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New >>> define_expand. >>> (rs6000_update_fpscr_rn_field): New define_expand. >>> (rs6000_set_fpscr_rn): Added return argument. Updated >>> to use new >>> rs6000_get_fpscr_fields and rs6000_update_fpscr_rn_field define >>> _expands. >>> * doc/extend.texi (__builtin_set_fpscr_rn): Update description >>> for >>> the return value and new double argument. Add descripton for >>> __SET_FPSCR_RN_RETURNS_FPSCR__ macro. >>> >>> gcc/testsuite/ChangeLog: >>> gcc.target/powerpc/test_fpscr_rn_builtin.c: Renamed to >>> test_fpscr_rn_builtin_1.c. Added comment. >>> gcc.target/powerpc/test_fpscr_rn_builtin_2.c: New test for the >>> return value of __builtin_set_fpscr_rn builtin. >>> --- >>> gcc/config/rs6000/rs6000-builtins.def | 2 +- >>> gcc/config/rs6000/rs6000-c.cc | 4 + >>> gcc/config/rs6000/rs6000.md | 87 +++++++--- >>> gcc/doc/extend.texi | 26 ++- >>> ...rn_builtin.c => test_fpscr_rn_builtin_1.c} | 6 + >>> .../powerpc/test_fpscr_rn_builtin_2.c | 153 >>> ++++++++++++++++++ >>> 6 files changed, 246 insertions(+), 32 deletions(-) >>> rename gcc/testsuite/gcc.target/powerpc/{test_fpscr_rn_builtin.c >>> => test_fpscr_rn_builtin_1.c} (92%) >>> create mode 100644 >>> gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c >>> >>> diff --git a/gcc/config/rs6000/rs6000-builtins.def >>> b/gcc/config/rs6000/rs6000-builtins.def >>> index 289a37998b1..28788b69c7d 100644 >>> --- a/gcc/config/rs6000/rs6000-builtins.def >>> +++ b/gcc/config/rs6000/rs6000-builtins.def >>> @@ -237,7 +237,7 @@ >>> const __ibm128 __builtin_pack_ibm128 (double, double); >>> PACK_IF packif {ibm128} >>> >>> - void __builtin_set_fpscr_rn (const int[0,3]); >>> + double __builtin_set_fpscr_rn (const int[0,3]); >>> SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft} >>> >>> const double __builtin_unpack_ibm128 (__ibm128, const int<1>); >>> diff --git a/gcc/config/rs6000/rs6000-c.cc >>> b/gcc/config/rs6000/rs6000-c.cc >>> index 8555174d36e..8373bb66919 100644 >>> --- a/gcc/config/rs6000/rs6000-c.cc >>> +++ b/gcc/config/rs6000/rs6000-c.cc >>> @@ -604,6 +604,10 @@ rs6000_target_modify_macros (bool define_p, >>> HOST_WIDE_INT flags) >>> /* Tell the user -mrop-protect is in play. */ >>> if (rs6000_rop_protect) >>> rs6000_define_or_undefine_macro (define_p, "__ROP_PROTECT__"); >>> + /* Tell the user the __builtin_set_fpscr_rn now returns the >>> FPSCR fields >>> + in a double. Originally the builtin returned void. */ >>> + if ((flags & OPTION_MASK_SOFT_FLOAT) == 0) >>> + rs6000_define_or_undefine_macro (define_p, >>> "__SET_FPSCR_RN_RETURNS_FPSCR__"); >>> } >>> >>> void >>> diff --git a/gcc/config/rs6000/rs6000.md >>> b/gcc/config/rs6000/rs6000.md >>> index b0db8ae508d..1b77a13c8a1 100644 >>> --- a/gcc/config/rs6000/rs6000.md >>> +++ b/gcc/config/rs6000/rs6000.md >>> @@ -6440,8 +6440,51 @@ >>> "mffscdrn %0,%1" >>> [(set_attr "type" "fp")]) >>> >>> + >>> +(define_expand "rs6000_get_fpscr_fields" >>> + [(match_operand:DF 0 "gpc_reg_operand")] >>> + "TARGET_HARD_FLOAT" >>> +{ >>> + /* Extract fields bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, >>> ZE, XE, NI, >>> + RN) from the FPSCR and return them. */ >>> + rtx tmp_df = gen_reg_rtx (DFmode); >>> + rtx tmp_di = gen_reg_rtx (DImode); >>> + >>> + emit_insn (gen_rs6000_mffs (tmp_df)); >>> + tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0); >>> + emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT >>> (0x00000007000000FFULL))); >>> + rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di, DImode, 0); >>> + emit_move_insn (operands[0], tmp_rtn); >>> + DONE; >>> +}) >>> + >>> +(define_expand "rs6000_update_fpscr_rn_field" >>> + [(match_operand:DI 0 "gpc_reg_operand")] >>> + "TARGET_HARD_FLOAT" >>> +{ >>> + /* Insert the new RN value from operands[0] into FPSCR bit >>> [62:63]. */ >>> + rtx tmp_di = gen_reg_rtx (DImode); >>> + rtx tmp_df = gen_reg_rtx (DFmode); >>> + >>> + emit_insn (gen_rs6000_mffs (tmp_df)); >>> + tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0); >>> + emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4))); >>> + emit_insn (gen_iordi3 (tmp_di, tmp_di, operands[0])); >>> + >>> + /* Need to write to field k=15. The fields are [0:15]. Hence >>> with >>> + L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W). FLM is >>> an >>> + 8-bit field[0:7]. Need to set the bit that corresponds to the >>> + value of i that you want [0:7]. */ >>> + >>> + tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0); >>> + emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df)); >>> + DONE; >>> +}) >>> + >>> (define_expand "rs6000_set_fpscr_rn" >>> - [(match_operand:SI 0 "reg_or_cint_operand")] >>> + [(set (match_operand:DF 0 "gpc_reg_operand") >>> + (unspec_volatile:DF [(match_operand:SI 1 >>> "reg_or_cint_operand")] >>> + UNSPECV_MFFSCDRN))] >> >> I don't think we need this pattern to be complete, since it's always >> expanded with "DONE", the pattern is useless, so it can be simpler >> like: >> >> [(use (match_operand:DF 0 "gpc_reg_operand")) >> (use (match_operand:SI 1 "reg_or_cint_operand"))] > > OK, changed this. Went and reviewed the use of set in the GCC internal > documentation. Not really sure why we can use set with the DONE. Not > really following your explanation. It does seem to work. >
It's mainly based on the documentation of define_expand: "If the preparation falls through (invokes neither DONE nor FAIL), then the define_expand acts like a define_insn in that the RTL template is used to generate the insn. The RTL template is not used for matching, only for generating the initial insn list. **If the preparation statement always invokes DONE or FAIL, the RTL template may be reduced to a simple list of operands** ..." https://gcc.gnu.org/onlinedocs/gccint/Expander-Definitions.html BR, Kewen