On 10/25/23 04:15, Jin Ma wrote:
+;; The conversion of DF to BF needs to be done with SF if there is a
+;; chance to generate at least one instruction, otherwise just using
+;; libfunc __truncdfbf2.
+(define_expand "truncdfbf2"
+ [(set (match_operand:BF 0 "register_operand" "=f")
+ (float_truncate:BF
+ (match_operand:DF 1 "register_operand" " f")))]
+ "TARGET_DOUBLE_FLOAT || TARGET_ZDINX"
+ {
+ convert_move (operands[0],
+ convert_modes (SFmode, DFmode, operands[1], 0), 0);
+ DONE;
+ })
So for conversions to/from BFmode, doesn't generic code take care of
this for us? Search for convert_mode_scalar in expr.cc. That code will
utilize SFmode as an intermediate step just like your expander. Is
there some reason that generic code is insufficient?
Similarly for the the other conversions.
As far as I can see, the function 'convert_mode_scalar' doesn't seem to be
perfect for
dealing with the conversions to/from BFmode. It can only handle BF to HF, SF,
DF and
SF to BF well, but the rest of the conversion without any processing, directly
using
the libcall.
Maybe I should choose to enhance its functionality? This seems to be a
good choice, I'm not sure.My recollection was that BF could be converted
to/from SF trivially and
if we wanted BF->DF we'd first convert to SF, then to DF.
Direct BF<->DF conversions aren't actually important from a performance
standpoint. So it's OK if they have an extra step IMHO.
Thank you very much for your review and detailed reply. Maybe there are some
problems with my expression
and I am a little confused about your guidance. My understanding is that you
also think that it is reasonable to
convert through SF, right? In fact, this is what I did.
My point was that I would expect the generic code to handle the
conversion and that we didn't need to handle it explicitly in the RISC-V
backend.
Meaning that I don't think we need a define_expand for truncdfbf2,
fix_truncbf<GPR:mode>2, fixuns_truncbf<GPR:mode>2, float<mode>bf2, or
floatuns<mode>bf2.
In this patch, my thoughts are as follows:
The general principle is to use the real instructions instead of libcall as
much as possible for conversions,
while minimizing the definition of libcall(only reusing which has been defined
by other architectures such
as aarch64). If SF can be used as a transit, it is preferred to convert to SF,
otherwise libcall is directly used.
1. For the conversions between floating points
For BF->DF, as you said, the function 'convert_mode_scalar' in the general code
has been well implemented,
which will be expressed as BF->SF->DF. And the generated instruction list may
be as follows:
'call __extendbfsf2' + 'call __extendsfdf2' (when only soft floating point
support);
'call __extendbfsf2' + 'fcvt.d.s' (when (TARGET_DOUBLE_FLOAT ||
TARGET_ZDINX) is true);
'fcvt.s.bf16' + 'fcvt.d.s' (when ((TARGET_DOUBLE_FLOAT ||
TARGET_ZDINX) && TARGET_ZFBFMIN) is true)
For DF->BF, if any of fcvt.s.d and fcvt.bf16.s cannot be generated, the 'call
__truncdfbf2' is directly generated
by the function 'convert_mode_scalar'. Otherwise the new pattern(define_expand
"truncdfbf2") is used. This
makes it possible to implement DF->BF by 'fcvt.s.d' + 'fcvt.bf16.s', which
cannot be generated by the function
'convert_mode_scala'.
But I would have expected convert_mode_scalar to generate DF->BF by
first truncating to SF, then to BF. If that is missing for truncation,
then we should add it to convert_mode_scalar rather than expressing it
as a backend expander.
2. For the conversions between integer and BF, it seems that gcc only uses
libcall to implement it, but this is
obviously wrong. For example, the conversion BF->SI directly calls the
unimplemented libcall __fixunsbfsi.
So I added some new pattern to handle these transformations with SF.
I would suggest these move into target independent code as well.
There's no reason I'm aware of that they should be implemented entirely
in a target machine description. We're not really doing anything target
specific in here.
jeff