Re: [RFC 1/2] RISC-V: Add support for _Bfloat16.

Jeff Law Fri, 10 Nov 2023 13:21:37 -0800



On 10/25/23 04:15, Jin Ma wrote:

+;; The conversion of DF to BF needs to be done with SF if there is a
+;; chance to generate at least one instruction, otherwise just using
+;; libfunc __truncdfbf2.
+(define_expand "truncdfbf2"
+  [(set (match_operand:BF     0 "register_operand" "=f")
+       (float_truncate:BF
+           (match_operand:DF 1 "register_operand" " f")))]
+  "TARGET_DOUBLE_FLOAT || TARGET_ZDINX"
+  {
+    convert_move (operands[0],
+                 convert_modes (SFmode, DFmode, operands[1], 0), 0);
+    DONE;
+  })

So for conversions to/from BFmode, doesn't generic code take care of
this for us?  Search for convert_mode_scalar in expr.cc. That code will
utilize SFmode as an intermediate step just like your expander.   Is
there some reason that generic code is insufficient?

Similarly for the the other conversions.


As far as I can see, the function 'convert_mode_scalar' doesn't seem to be 
perfect for
dealing with the conversions to/from BFmode. It can only handle BF to HF, SF, 
DF and
SF to BF well, but the rest of the conversion without any processing, directly 
using
the libcall.

Maybe I should choose to enhance its functionality? This seems to be a
good choice, I'm not sure.My recollection was that BF could be converted 
to/from SF trivially and

if we wanted BF->DF we'd first convert to SF, then to DF.

Direct BF<->DF conversions aren't actually important from a performance
standpoint.  So it's OK if they have an extra step IMHO.


Thank you very much for your review and detailed reply. Maybe there are some 
problems with my expression
and I am a little confused about your guidance. My understanding is that you 
also think that it is reasonable to
convert through SF, right? In fact, this is what I did.

My point was that I would expect the generic code to handle theconversion and that we didn't need to handle it explicitly in the RISC-Vbackend.

Meaning that I don't think we need a define_expand for truncdfbf2,fix_truncbf<GPR:mode>2, fixuns_truncbf<GPR:mode>2, float<mode>bf2, orfloatuns<mode>bf2.


In this patch, my thoughts are as follows:

The general principle is to use the real instructions instead of libcall as 
much as possible for conversions,
while minimizing the definition of libcall(only reusing which has been defined 
by other architectures such
as aarch64). If SF can be used as a transit, it is preferred to convert to SF, 
otherwise libcall is directly used.

1. For the conversions between floating points

For BF->DF, as you said, the function 'convert_mode_scalar' in the general code 
has been well implemented,
which will be expressed as BF->SF->DF. And the generated instruction list may 
be as follows:
   'call __extendbfsf2' + 'call __extendsfdf2' (when only soft floating point 
support);
   'call __extendbfsf2' + 'fcvt.d.s'           (when (TARGET_DOUBLE_FLOAT || 
TARGET_ZDINX) is true);
   'fcvt.s.bf16'        + 'fcvt.d.s'           (when ((TARGET_DOUBLE_FLOAT || 
TARGET_ZDINX) && TARGET_ZFBFMIN) is true)

For DF->BF, if any of fcvt.s.d and fcvt.bf16.s cannot be generated, the 'call 
__truncdfbf2' is directly generated
by the function 'convert_mode_scalar'. Otherwise the new pattern(define_expand 
"truncdfbf2") is used. This
makes it possible to implement DF->BF by 'fcvt.s.d' + 'fcvt.bf16.s', which 
cannot be generated by the function
'convert_mode_scala'.

But I would have expected convert_mode_scalar to generate DF->BF byfirst truncating to SF, then to BF. If that is missing for truncation,then we should add it to convert_mode_scalar rather than expressing itas a backend expander.


2. For the conversions between integer and BF, it seems that gcc only uses 
libcall to implement it, but this is
obviously wrong. For example, the conversion BF->SI directly calls the 
unimplemented libcall __fixunsbfsi.
So I added some new pattern to handle these transformations with SF.

I would suggest these move into target independent code as well.There's no reason I'm aware of that they should be implemented entirelyin a target machine description. We're not really doing anything targetspecific in here.


jeff

Re: [RFC 1/2] RISC-V: Add support for _Bfloat16.

Reply via email to