On 11/08/2013 09:30 AM, Richard Henderson wrote:
> On 11/08/2013 09:28 AM, Richard Henderson wrote:
>> On 11/07/2013 06:31 AM, Tom Musta wrote:
>>>          }                                                           \
>>> +                                                                    \
>>> +        if (r2sp) {                                                 \
>>> +            float32 tmp32 = float64_to_float32(xt_out.fld[i],       \
>>> +                                               &env->fp_status);    \
>>> +            xt_out.fld[i] = float32_to_float64(tmp32, &env->fp_status); \
>>> +        }                                                           \
>>> +                                                                    \
>>
>> You can't get correct results for a single-precision fma from a
>> double-precision fma and merely rounding the results.
>>
>> See e.g. glibc's sysdeps/ieee754/dbl-64/s_fmaf.c.
>
> Blah, nevermind.  That would be using separate add+mul in double-precision,
> not using a double-precision fma primitive.
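[Editor's note: for readers without the glibc source to hand, the approach
in s_fmaf.c is roughly the following.  This is an illustrative sketch using
the public <fenv.h> interface rather than glibc's internal helpers, and it
omits the special cases glibc handles (NaN/infinity inputs, overflow, and
the sign of an exact zero result).]

#include <fenv.h>
#include <stdint.h>
#include <string.h>

/* The product of two binary32 values is exact in binary64 (24 + 24 = 48
   significant bits <= 53), so only the addition can round.  Do the
   addition toward zero and then force the low bit of the significand
   whenever the sum was inexact ("round to odd"); the final binary64 ->
   binary32 conversion then rounds correctly in a single step. */
static float fmaf_sketch(float x, float y, float z)
{
    double p = (double)x * (double)y;   /* exact */
    fenv_t env;
    double s;

    feholdexcept(&env);                 /* save environment, clear flags */
    fesetround(FE_TOWARDZERO);
    s = p + (double)z;                  /* truncated sum */
    if (fetestexcept(FE_INEXACT)) {
        uint64_t bits;
        memcpy(&bits, &s, sizeof(bits));
        bits |= 1;                      /* sticky low bit: round to odd */
        memcpy(&s, &bits, sizeof(bits));
    }
    feupdateenv(&env);                  /* restore mode, merge flags */
    return (float)s;                    /* the one real rounding */
}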
Hmph.  I was right the first time.  See

  http://www.exploringbinary.com/double-rounding-errors-in-floating-point-conversions/

for example inputs that suffer from double-rounding.

What's needed in each of those examples is an infinitely precise value
containing 55 significant bits.  This is easy to accomplish with fma.
Two single-precision inputs (24-bit significands) can create a product
with up to 48 significant bits.  One can append 24 more significant
bits by choosing an exponent for the addend that does not overlap the
product.  Thus one can create (almost) every intermediate result with
up to 72 consecutive bits (the exception being products that have no
pair of 24-bit factors).  I'm too lazy to decompose the examples
therein into actual single-precision inputs, but I'm certain it can be
done.

Thus you *do* need the round-to-zero plus inexact solution that glibc
uses.  (Or perform the fma in 128 bits and round once, but I think
that would be way more intrusive wrt the rest of the code, and more
expensive than necessary.)

r~
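[Editor's note: one concrete decomposition of the kind described above.
The factorization and the 2^-70 addend are the editor's example, not
from the original thread, and the demo assumes the host fmaf() is
itself correctly rounded.]

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* 2^24 + 1 = 97 * 257 * 673, so 1 + 2^-24 is exactly the product
       of two single-precision values. */
    float a = 24929.0f * 0x1p-12f;  /* (97 * 257) * 2^-12 */
    float b = 673.0f * 0x1p-12f;    /* a * b == 1 + 2^-24 exactly */
    float c = 0x1p-70f;             /* addend below the product's bits */

    /* The exact a*b + c = 1 + 2^-24 + 2^-70 lies just above the
       halfway point between 1.0f and the next float, 1 + 2^-23, so a
       single correct rounding must round up.  Rounding to binary64
       first drops the 2^-70 bit, leaving exactly the halfway value
       1 + 2^-24, which then rounds to even, i.e. down to 1.0f. */
    printf("single rounding: %a\n", fmaf(a, b, c));   /* 0x1.000002p+0 */
    printf("double rounding: %a\n",
           (float)fma((double)a, (double)b, (double)c));  /* 0x1p+0 */
    return 0;
}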