On 3/10/23 05:40, Jin Ma via Gcc-patches wrote:
This patch adds the 'Zfa' extension for riscv, which is based on:
https://github.com/riscv/riscv-isa-manual/commit/d74d99e22d5f68832f70982d867614e2149a3bd7
latest 'Zfa' change on the master branch of the RISC-V ISA Manual as
of this writing.
The Wiki Page (details):
https://github.com/a4lg/binutils-gdb/wiki/riscv_zfa
The binutils-gdb for 'Zfa' extension:
https://sourceware.org/pipermail/binutils/2022-September/122938.html
Implementation of zfa extension on LLVM:
https://reviews.llvm.org/rGc0947dc44109252fcc0f68a542fc6ef250d4d3a9
There are three points that need to be discussed here.
1. According to riscv-spec, "The FCVTMO D.W.D instruction was added principally
to
accelerate the processing of JavaScript Numbers.", so it seems that no
implementation
is required in the compiler.
2. The FROUND and FROUNDN instructions in this patch use related functions in
the math
library, such as round, floor, ceil, etc. Since there is no interface for
half-precision in
the math library, the instructions FROUN D.H and FROUNDN X.H have not been
implemented for
the time being. Is it necessary to add a built-in interface belonging to
riscv such as
__builtin_roundhf or __builtin_roundf16 to generate half floating point
instructions?
3. As far as I know, FMINM and FMAXM instructions correspond to C23 library
function fminimum
and fmaximum. Therefore, I have not dealt with such instructions for the
time being, but have
simply implemented the pattern of fminm<hf\sf\df>3 and fmaxm<hf\sf\df>3. Is
it necessary to
add a built-in interface belonging to riscv such as__builtin_fminm to
generate half
floating-point instructions?
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add zfa extension.
* config/riscv/constraints.md (Zf): Constrain the floating point number
that the FLI instruction can load.
* config/riscv/iterators.md (round_pattern): New.
* config/riscv/predicates.md: Predicate the floating point number that
the FLI instruction can load.
* config/riscv/riscv-opts.h (MASK_ZFA): New.
(TARGET_ZFA): New.
* config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli):
Get the index of the
floating-point number that the FLI instruction can load.
* config/riscv/riscv.cc (find_index_in_array): New.
(riscv_float_const_rtx_index_for_fli): New.
(riscv_cannot_force_const_mem): Likewise.
(riscv_const_insns): Likewise.
(riscv_legitimize_const_move): Likewise.
(riscv_split_64bit_move_p): Exclude floating point numbers that can be
loaded by FLI instructions.
(riscv_output_move): Likewise.
(riscv_memmodel_needs_release_fence): Likewise.
(riscv_print_operand): Likewise.
(riscv_secondary_memory_needed): Likewise.
* config/riscv/riscv.h (GP_REG_RTX_P): New.
* config/riscv/riscv.md (fminm<mode>3): New.
(fmaxm<mode>3): New.
(<round_pattern><ANYF:mode>2): New.
(rint<ANYF:mode>2): New.
(f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_zfa): New.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
* gcc.target/riscv/zfa-fleq-fltq.c: New test.
* gcc.target/riscv/zfa-fli-rv32.c: New test.
* gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
* gcc.target/riscv/zfa-fli-zfh.c: New test.
* gcc.target/riscv/zfa-fli.c: New test.
* gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
* gcc.target/riscv/zfa-fround-rv32.c: New test.
* gcc.target/riscv/zfa-fround.c: New test.
---
gcc/common/config/riscv/riscv-common.cc | 4 +
gcc/config/riscv/constraints.md | 7 +
gcc/config/riscv/iterators.md | 5 +
gcc/config/riscv/predicates.md | 4 +
gcc/config/riscv/riscv-opts.h | 3 +
gcc/config/riscv/riscv-protos.h | 1 +
gcc/config/riscv/riscv.cc | 168 +++++++++++++++++-
gcc/config/riscv/riscv.h | 1 +
gcc/config/riscv/riscv.md | 112 +++++++++---
.../gcc.target/riscv/zfa-fleq-fltq-rv32.c | 19 ++
.../gcc.target/riscv/zfa-fleq-fltq.c | 19 ++
gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c | 79 ++++++++
.../gcc.target/riscv/zfa-fli-zfh-rv32.c | 41 +++++
gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c | 41 +++++
gcc/testsuite/gcc.target/riscv/zfa-fli.c | 79 ++++++++
.../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c | 10 ++
.../gcc.target/riscv/zfa-fround-rv32.c | 42 +++++
gcc/testsuite/gcc.target/riscv/zfa-fround.c | 42 +++++
18 files changed, 654 insertions(+), 23 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround.c
diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 5b70ab20758..ef9b1aa9ed3 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -284,3 +284,8 @@ (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET
UNSPEC_FLE_QUIET])
(define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET
"le")])
(define_int_attr QUIET_PATTERN [(UNSPEC_FLT_QUIET "LT") (UNSPEC_FLE_QUIET
"LE")])
+(define_int_iterator ROUND [UNSPEC_ROUND UNSPEC_FLOOR UNSPEC_CEIL UNSPEC_BTRUNC UNSPEC_ROUNDEVEN UNSPEC_NEARBYINT])
+(define_int_attr round_pattern [(UNSPEC_ROUND "round") (UNSPEC_FLOOR "floor")
(UNSPEC_CEIL "ceil")
+ (UNSPEC_BTRUNC "btrunc") (UNSPEC_ROUNDEVEN "roundeven")
(UNSPEC_NEARBYINT "nearbyint")])
+(define_int_attr round_rm [(UNSPEC_ROUND "rmm") (UNSPEC_FLOOR "rdn") (UNSPEC_CEIL
"rup")
+ (UNSPEC_BTRUNC "rtz") (UNSPEC_ROUNDEVEN "rne")
(UNSPEC_NEARBYINT "dyn")])
\ No newline at end of file
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 0d9d7701c7e..b16b8d438e3 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -149,6 +149,10 @@ (define_predicate "move_operand"
case CONST_POLY_INT:
return known_eq (rtx_to_poly_int64 (op), BYTES_PER_RISCV_VECTOR);
+ case CONST_DOUBLE:
+ return const_0_operand (op, mode)
+ || (riscv_float_const_rtx_index_for_fli (op) != -1);
Formatting nit. When you have a line break like this, go ahead and wrap
the whole expression with parens and put the operator one space past the
open paren. Like this:
case CONST_DOUBLE
return (const_0_operand (op, mode)
|| riscv_float_const_rtx_index_for_fli (op) != -1);
I used spaces so that mailers don't muck up the formating. Please make
sure to use tabs insted of 8 spaces if the indent is >= 8.
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index c91fa3101aa..c81a2bf44f5 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -799,6 +799,108 @@ static int riscv_symbol_insns (enum riscv_symbol_type
type)
}
}
+/* Immediate values loaded by the FLI.S instruction in Chapter 25 of the latest RISC-V ISA
+ Manual draft. For details, please see:
+
https://github.com/riscv/riscv-isa-manual/releases/tag/draft-20221217-cb3b9d1 */
+
+unsigned HOST_WIDE_INT fli_value_hf[32] =
+{
+ 0xbc00, 0x400, 0x100, 0x200, 0x1c00, 0x2000, 0x2c00, 0x3000,
+ 0x3400, 0x3500, 0x3600, 0x3700, 0x3800, 0x3900, 0x3a00, 0x3b00,
+ 0x3c00, 0x3d00, 0x3e00, 0x3f00, 0x4000, 0x4100, 0x4200, 0x4400,
+ 0x4800, 0x4c00, 0x5800, 0x5c00, 0x7800,
+ /* Only used for filling, ensuring that 29 and 30 of HF are the same. */
+ 0x7800,
+ 0x7c00, 0x7e00,
+};
+
+unsigned HOST_WIDE_INT fli_value_sf[32] =
+{
+ 0xbf800000, 0x00800000, 0x37800000, 0x38000000, 0x3b800000, 0x3c000000,
0x3d800000, 0x3e000000,
+ 0x3e800000, 0x3ea00000, 0x3ec00000, 0x3ee00000, 0x3f000000, 0x3f200000,
0x3f400000, 0x3f600000,
+ 0x3f800000, 0x3fa00000, 0x3fc00000, 0x3fe00000, 0x40000000, 0x40200000,
0x40400000, 0x40800000,
+ 0x41000000, 0x41800000, 0x43000000, 0x43800000, 0x47000000, 0x47800000,
0x7f800000, 0x7fc00000
+};
+
+unsigned HOST_WIDE_INT fli_value_df[32] =
+{
+ 0xbff0000000000000, 0x10000000000000, 0x3ef0000000000000, 0x3f00000000000000,
+ 0x3f70000000000000, 0x3f80000000000000, 0x3fb0000000000000,
0x3fc0000000000000,
+ 0x3fd0000000000000, 0x3fd4000000000000, 0x3fd8000000000000,
0x3fdc000000000000,
+ 0x3fe0000000000000, 0x3fe4000000000000, 0x3fe8000000000000,
0x3fec000000000000,
+ 0x3ff0000000000000, 0x3ff4000000000000, 0x3ff8000000000000,
0x3ffc000000000000,
+ 0x4000000000000000, 0x4004000000000000, 0x4008000000000000,
0x4010000000000000,
+ 0x4020000000000000, 0x4030000000000000, 0x4060000000000000,
0x4070000000000000,
+ 0x40e0000000000000, 0x40f0000000000000, 0x7ff0000000000000,
0x7ff8000000000000,
+};
Would it make more sense to use the newer floating point literal
formats? I like them marginally more than hext constants because many
tools can consume that floating point literal syntax trivially.
+
+/* Find the index of TARGET in ARRAY, and return -1 if not found. */
"Return the index of TARGET in ARRAY, return -1 if not found. */
+
+/* Return index of the FLI instruction table if rtx X is an immediate constant
that
+ can be moved using a single FLI instruction in zfa extension. -1 otherwise.
*/
+
+int
+riscv_float_const_rtx_index_for_fli (rtx x)
+{
+ machine_mode mode = GET_MODE (x);
+
+ if (!TARGET_ZFA || mode == VOIDmode
+ || !CONST_DOUBLE_P(x)
+ || (mode == HFmode && !TARGET_ZFH)
+ || (mode == SFmode && !TARGET_HARD_FLOAT)
+ || (mode == DFmode && !TARGET_DOUBLE_FLOAT))
+ return -1;
Bring the "|| mode == VOIDmode" down on its own line.
@@ -1191,6 +1296,8 @@ riscv_const_insns (rtx x)
}
case CONST_DOUBLE:
+ if (riscv_float_const_rtx_index_for_fli (x) != -1)
+ return 4;
Should this be returning 1? That's the case when we find the constant
in the table and have the FLI instruction.
/* Allow FPR <-> FPR and FPR <-> MEM moves, and permit the special case
of zeroing an FPR with FCVT.D.W. */
if (TARGET_DOUBLE_FLOAT
&& ((FP_REG_RTX_P (src) && FP_REG_RTX_P (dest))
|| (FP_REG_RTX_P (dest) && MEM_P (src))
|| (FP_REG_RTX_P (src) && MEM_P (dest))
+ || (TARGET_ZFA
+ && ((FP_REG_RTX_P (dest) && GP_REG_RTX_P (src))
+ || (FP_REG_RTX_P (src) && GP_REG_RTX_P (dest))))
|| (FP_REG_RTX_P (dest) && src == CONST0_RTX (GET_MODE (src)))))
return false;
The formatting here seems off. In particular the last || line you added
seems like it might need to be indented further. It's also possible the
mailer has mucked up. So double check and fix is it needs adjustment.
In the .md file changes, several of the patterns use UNSPECs. If
possible we should try to avoid those when there are reasonable mappings
to GCC's RTL codes. Can you please double-check to see if any of the
cases you're adding can be properly handled by the existing RTL codes?
Overall it looks pretty close.
Jeff