On Tue, May 7, 2024 at 9:46 AM Jonathan Wakely <jwakely....@gmail.com> wrote:
>
> On Tue, 7 May 2024 at 17:39, Jonathan Wakely <jwakely....@gmail.com> wrote:
> >
> > On Tue, 7 May 2024 at 17:33, Jeff Law wrote:
> > >
> > >
> > >
> > > On 5/7/24 9:36 AM, Andreas Schwab wrote:
> > > > > On May 07 2024, Jonathan Wakely wrote:
> > > >
> > > >> +#ifdef __riscv
> > > >> +    return _M_insert(__builtin_copysign((double)__f,
> > > >> +                                        (double)-__builtin_signbit(__f));
> > > >
> > > > Should this use static_cast<double>?
> >
> > Meh. It wouldn't fit in 80 columns any more with static_cast, and it
> > means exactly the same thing.
> >
> > > And it's missing a close paren.
> >
> > Now that's more important! Thanks.
>
> Also, I've just realised that signbit might return a negative value if
> the signbit is set. The spec only says it returns non-zero if the
> signbit is set.
>
> So maybe we want:
>
> #ifdef __riscv
>         const int __neg = __builtin_signbit(__f) ? -1 : 0;
>         return _M_insert(__builtin_copysign(static_cast<double>(__f),
>                                               static_cast<double>(__neg)));
> #else
>         return _M_insert(static_cast<double>(__f));
> #endif

We can avoid the signbit call altogether by taking advantage of the
fact that type-punning the float to an int, then converting that int
to a double, will produce a double with the sign of the original
value, with no exceptions raised in the process.  (I don't know
whether we're allowed to use std::bit_cast in this context, but a
type-punning memcpy would have the same effect.)

  int __i = std::bit_cast<int, float>(__f);
  return _M_insert(__builtin_copysign(static_cast<double>(__f),
                                      static_cast<double>(__i)));

Empirically, this saves 3 instructions on RV64 or 1 instruction on
RV32 (as measured on GCC 13.2.0).  Note, I'm not trying to drag-race
on performance here.  Rather, I'm trying to minimize the extent to
which this RISC-V idiosyncrasy results in static code-size bloat.
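
For anyone who wants to try the memcpy variant outside libstdc++, here is
a minimal standalone sketch.  The names widen_preserving_sign and
check_sign_preserved are hypothetical (they are not in the patch), and the
code assumes a 32-bit IEEE 754 float and a two's-complement int32_t, so
that the integer is negative exactly when the float's sign bit is set.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper, not part of libstdc++: widen a float to double and
// force the result's sign bit to match the input's, including for NaNs.
inline double widen_preserving_sign(float f)
{
  static_assert(sizeof(std::int32_t) == sizeof(float),
                "sketch assumes a 32-bit float");
  std::int32_t i;
  std::memcpy(&i, &f, sizeof i);  // type-pun: i < 0 iff f's sign bit is set
  // int -> double is exact, so the conversion raises no FP exceptions;
  // copysign uses only the sign of its second argument.
  return std::copysign(static_cast<double>(f), static_cast<double>(i));
}

// Small self-check covering signed zeros and a negative quiet NaN.
inline void check_sign_preserved()
{
  const float neg_nan =
      std::copysign(std::numeric_limits<float>::quiet_NaN(), -1.0f);
  assert(std::signbit(widen_preserving_sign(-0.0f)));
  assert(!std::signbit(widen_preserving_sign(0.0f)));
  assert(std::signbit(widen_preserving_sign(neg_nan)));
}
```

Only the sign of the punned integer matters here, so its magnitude (which
is meaningless as a number) never affects the result.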

BTW, I agree with Palmer that adding a __builtin with these semantics
seems advisable if this pattern turns out to recur frequently.
