On 9/19/23 02:46, Jin Ma wrote:
This patch adds the 'Zfbfmin' extension for riscv, which is based on spec of
bfloat16:
https://github.com/riscv/riscv-bfloat16/commit/5578e34e15a44e9ad13246072a29f51274b4d999
The 'Zfbfmin' extension of binutils-gdb (REVIEW ONLY):
https://sourceware.org/pipermail/binutils/2023-August/128773.html
The 'Zfbfmin' extension of qemu:
https://github.com/qemu/qemu/commit/5d1270caac2ef7b8c887d4cb5a2444ba6d237516
Because the binutils does not yet support the 'Zfbfmin' extension, test case
zfbfmin_convert_run.c is invalidated with '#if 0' and '#endif'.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add 'Zfbfmin' extension.
* config/riscv/riscv-opts.h (MASK_ZFBFMIN): New.
(TARGET_ZFBFMIN): New.
* config/riscv/riscv.cc (riscv_output_move): Enable FMV.X.H, and FMV.H.X
for 'Zfbfmin' extension.
(riscv_excess_precision): Likewise.
* config/riscv/riscv.md (truncsfbf2): New.
(extendbfsf2): New.
(*mov<mode>_hardfloat): Support for BFmode.
(*mov<mode>_softfloat): Disable for BFmode when 'Zfbfmin' extension is
enabled.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zfbfmin_arithmetic.c: New test.
* gcc.target/riscv/zfbfmin_call.c: New test.
* gcc.target/riscv/zfbfmin_comparisons.c: New test.
* gcc.target/riscv/zfbfmin_convert.c: New test.
* gcc.target/riscv/zfbfmin_convert_run.c: New test.
* gcc.target/riscv/zfbfmin_fsh_and_flh.c: New test.
So as with 1/2 in this series, it can't go into the trunk until the
relevant spec reaches a frozen state.
+/* { dg-final { scan-assembler-times "fcvt.s.bf16" 14 } } */
+/* { dg-final { scan-assembler-times "fcvt.bf16.s" 10 } } */
So I think these have the potential to run afoul of unexpected matching
of LTO bits. Joern has an approach to tackle this problem that was
recently pushed into the tree:
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631485.html
The gist is to wrap the assembly instruction inside a {\m \M} construct,
so the pattern only matches at word boundaries. So concretely:
> +/* { dg-final { scan-assembler-times {\mfcvt.s.bf16\M} 14 } } */
> +/* { dg-final { scan-assembler-times {\mfcvt.bf16.s\M} 10 } } */
Similarly for the other new tests where you actually match an instruction.
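
As a rough illustration of why the anchors matter, here is a minimal
Python sketch (not DejaGnu itself). It assumes Python's \b is a close
enough stand-in for Tcl's \m (start of word) and \M (end of word), and
the assembly text is made up for the example; the .ascii line mimics
stray output that merely contains the mnemonic.

```python
import re

# Hypothetical assembler output, invented for illustration only.
# The last line stands in for unrelated text that happens to embed
# the mnemonic inside a longer token.
asm = "\n".join([
    "\tfcvt.s.bf16\tfa0,fa0",
    "\tfcvt.bf16.s\tfa0,fa0",
    "\t.ascii\t\"xfcvt.s.bf16y\"",
])

# Unanchored pattern, as in the tests as posted.
loose = re.compile(r"fcvt.s.bf16")

# Anchored pattern; \b approximates Tcl's \m ... \M here.
anchored = re.compile(r"\bfcvt.s.bf16\b")

loose_count = len(loose.findall(asm))
anchored_count = len(anchored.findall(asm))

print(loose_count, anchored_count)
```

The unanchored pattern also counts the occurrence embedded in the longer
token, while the anchored one counts only the real instruction, which is
what a scan-assembler-times count is meant to check.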
Overall it looks pretty good.
Jeff