On Fri, 3 Dec 2010 d...@freebsd.org wrote:

Synopsis: [libm] fma(3) does not respect rounding mode using extended precision

Thanks for the report! This limitation is described in the source for
fma(), and unfortunately, it is unlikely to ever change. There are
several reasons:

- We are a long way from having the necessary compiler support to make
 dynamic precision changes work as expected.
- Dynamic FPU precision changes aren't officially supported, and
 fpsetprec() has been documented as deprecated for many years.

Not really.  See my reply to the commit to the man pages.

- The only supported architecture that can have this problem due to
 dynamic precision changes is i386, and even then only for non-SSE2
 builds.

SSE2 makes little difference to this problem for i386, except that with
clang it makes it worse.  The ABI requires using the FPU for at least returning
values, and gcc keeps using the FPU for operations too.  OTOH, clang
uses SSE2 for operations.  This gives an even larger pessimization
than I expected
    (in 1 example, clang with a wrong arch (nocona instead of core2,
    since gcc doesn't support -march=core2 yet and I used the same flags
    for clang as for gcc), clang was 170/45 times slower; with
    -march=core2, it was only 139/45 times slower; with -march=i386,
    it was only 88/45 times slower.  Here -march=i386 works mainly
    by avoiding even useful SSE1 instructions.  The example
    was a float function, so it only needed SSE1.  Restoring use of
    SSE1 using -march=athlon-xp restores the slowness to 144/45.)
It also makes the precision used more unpredictable than before.  It
now depends on $CC and $CFLAGS, but float.h doesn't.  Fortunately,
i386 float.h covers some cases by defining FLT_EVAL_METHOD = -1, which
says that the FP evaluation method is indeterminate :-).  Unfortunately,
i386 float.h's definition of float_t as double becomes wrong if floats
are actually evaluated in float precision, like clang's use of SSE1
gives.
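
A rough sketch of how a program could check what a given $CC and $CFLAGS
combination reports, assuming only the C99 macro FLT_EVAL_METHOD from
float.h and the type float_t from math.h.  The helper names
eval_method_name() and float_t_consistent() are made up for this
illustration, and comparing sizes is only a coarse stand-in for comparing
types:

```c
#include <float.h>   /* FLT_EVAL_METHOD */
#include <math.h>    /* float_t */

/* Hypothetical helper: describe a FLT_EVAL_METHOD value as C99 defines it. */
static const char *
eval_method_name(int method)
{
	switch (method) {
	case -1: return "indeterminate";
	case 0:  return "each type evaluated in its own range and precision";
	case 1:  return "float and double evaluated as double";
	case 2:  return "everything evaluated as long double";
	default: return "implementation-defined";
	}
}

/*
 * Hypothetical helper: returns 1 if sizeof(float_t) matches the type that
 * C99 requires for the advertised FLT_EVAL_METHOD, 0 if it does not, and
 * -1 if the method is indeterminate or implementation-defined, in which
 * case C99 makes no promise about float_t.  Sizes are a coarse proxy:
 * on targets where double and long double have the same size, a mismatch
 * between those two cannot be detected this way.
 */
static int
float_t_consistent(void)
{
	switch (FLT_EVAL_METHOD) {
	case 0:  return sizeof(float_t) == sizeof(float);
	case 1:  return sizeof(float_t) == sizeof(double);
	case 2:  return sizeof(float_t) == sizeof(long double);
	default: return -1;
	}
}
```

Note that a header claiming -1 always gets -1 from float_t_consistent(),
so this check cannot detect the float_t mismatch described above; it only
shows what the headers claim for a particular build.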

- The cost and complexity associated with making every function in
 libm detect and adapt to dynamic precision changes is prohibitive.

Same as for dynamic rounding direction changes.  Actually, much lower
cost and complexity than for rounding direction.  For rounding direction,
it is actually useful to keep the caller's mode, and supporting this
would require making sure every step of every function works right in
every mode.  For rounding precision, we can just switch to a mode that
works for every function that needs it, and most don't need it except
for bizarre environments (like forcing single precision and calling
extended precision functions and expecting them to return any particular
precision).
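
The switch for rounding precision might look roughly like the following.
This is only a sketch, not what libm does: it assumes fpsetprec() and
FP_PE from ieeefp.h on FreeBSD/i386 and compiles to a plain call
everywhere else.  The names call_in_extended_precision() and mul_add()
are hypothetical, and mul_add() is just a stand-in for a function that
needs extended precision:

```c
#if defined(__FreeBSD__) && defined(__i386__)
#include <ieeefp.h>   /* fpsetprec(), fp_prec_t, FP_PE */
#define HAVE_FPSETPREC 1
#endif

typedef long double (*ternary_fn)(long double, long double, long double);

/* Hypothetical stand-in for a function that wants extended precision. */
static long double
mul_add(long double x, long double y, long double z)
{
	return x * y + z;
}

/*
 * Hypothetical wrapper: force the x87 rounding precision to extended for
 * the duration of one call, then restore whatever the caller had set.
 * On platforms without fpsetprec() this is just a direct call.
 */
static long double
call_in_extended_precision(ternary_fn fn, long double x, long double y,
    long double z)
{
#ifdef HAVE_FPSETPREC
	fp_prec_t saved = fpsetprec(FP_PE);	/* returns the old precision */
	long double result = fn(x, y, z);

	fpsetprec(saved);			/* restore caller's precision */
	return result;
#else
	return fn(x, y, z);
#endif
}
```

For example, call_in_extended_precision(mul_add, 2.0L, 3.0L, 1.0L)
returns 7.0L whatever precision the caller had set.  Nothing here has to
work in every mode; the function body only ever runs in one.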

I have updated the manpage for fpsetprec() to explain that changing
the FPU precision isn't supported by the compiler or libraries.

Bruce
_______________________________________________
freebsd-bugs@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"