cvs commit: src/lib/msun/src s_ceil.c s_ceill.c s_floor.c s_floorl.c s_trunc.c s_truncl.c

2008-02-14 Bruce Evans
bde 2008-02-14 15:10:34 UTC FreeBSD src repository Modified files: lib/msun/src s_ceil.c s_ceill.c s_floor.c s_floorl.c s_trunc.c s_truncl.c Log: Oops, the weak reference for ceill(), floorl() and truncl() was in the wrong file. This broke

cvs commit: src/lib/msun/src s_ceil.c s_floor.c s_trunc.c

2008-02-14 Bruce Evans
bde 2008-02-15 07:01:40 UTC FreeBSD src repository Modified files: lib/msun/src s_ceil.c s_floor.c s_trunc.c Log: Sigh, the weak reference for ceill(), floorl() and truncl() was in unreachable code due to a missing include. This kept arm and powerpc broken.

cvs commit: src/lib/msun/src e_rem_pio2.c s_cos.c s_sin.c s_tan.c

2008-02-18 Bruce Evans
bde 2008-02-18 14:02:12 UTC FreeBSD src repository Modified files: lib/msun/src e_rem_pio2.c s_cos.c s_sin.c s_tan.c Log: Inline __ieee754_rem_pio2(). With gcc4-2, this gives an average optimization of about 10% for cos(x), sin(x) and tan(x) on |x| < 2**19*pi/2.

cvs commit: src/lib/msun/ld80 k_tanl.c

2008-02-18 Bruce Evans
bde 2008-02-18 14:09:41 UTC FreeBSD src repository Modified files: lib/msun/ld80 k_tanl.c Log: Fix a typo which broke k_tanl.c on !(amd64 || i386). Revision Changes Path 1.2 +1 -1 src/lib/msun/ld80/k_tanl.c

cvs commit: src/lib/msun/ld80 k_tanl.c

2008-02-18 Bruce Evans
bde 2008-02-18 15:39:52 UTC FreeBSD src repository Modified files: lib/msun/ld80 k_tanl.c Log: 2 long double constants were missing L suffixes. This helped break tanl() on !(amd64 || i386). It gave slightly worse than double precision in some cases. tanl() now p
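
The effect of a missing L suffix can be reproduced in isolation. The sketch below is only an illustration: the constant 0.1 and the names no_suffix, with_suffix and suffix_matters() are hypothetical and do not come from k_tanl.c. An unsuffixed constant is normally rounded to double before it is widened, so the extra long double bits are lost. Compilers that evaluate constants in extended precision (FLT_EVAL_METHOD != 0) may hide the difference.

```c
#include <assert.h>
#include <float.h>

/* Illustrative constant only; not one of the constants from k_tanl.c. */
static const long double no_suffix = 0.1;    /* typically rounded to double first */
static const long double with_suffix = 0.1L; /* rounded directly to long double */

/*
 * Returns 1 when the missing suffix changed the stored value.
 * When long double has the same precision as double, the two
 * constants are identical and this returns 0.
 */
static int suffix_matters(void)
{
    /* C guarantees long double is at least as precise as double. */
    assert(LDBL_MANT_DIG >= DBL_MANT_DIG);
    return no_suffix != with_suffix;
}
```

On a platform where long double is wider than double and constants are not evaluated in extended precision, suffix_matters() is expected to return 1.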

cvs commit: src/lib/msun/src k_cos.c k_sin.c

2008-02-19 Bruce Evans
bde 2008-02-19 12:54:14 UTC FreeBSD src repository Modified files: lib/msun/src k_cos.c k_sin.c Log: Rearrange the polynomial evaluation for better parallelism. This saves an average of about 8 cycles or 5% on A64 (amd64 and i386 -- more in cycles but about the s
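
As a rough illustration of what rearranging a polynomial evaluation for parallelism can mean, the sketch below compares a plain Horner chain with a split form whose two halves have no data dependency on each other. It is not the code from k_cos.c or k_sin.c: the coefficients c[] are Taylor-style placeholders and the function names are hypothetical.

```c
#include <assert.h>

/* Placeholder coefficients (1/n!-style), not msun's minimax coefficients. */
static const double c[6] = {
    1.0, -1.0 / 6, 1.0 / 120, -1.0 / 5040, 1.0 / 362880, -1.0 / 39916800
};

/* Horner form: one long dependency chain, each step waits for the previous. */
static double poly_horner(double z)
{
    return c[0] + z * (c[1] + z * (c[2] + z * (c[3] + z * (c[4] + z * c[5]))));
}

/* Split form: lo, hi and w are independent and can be computed in parallel. */
static double poly_split(double z)
{
    double w = z * z;
    double lo = c[0] + z * (c[1] + z * c[2]);
    double hi = c[3] + z * (c[4] + z * c[5]);
    return lo + z * w * hi;   /* lo + z**3 * hi */
}

/* Sanity check: both forms agree up to rounding error at z. */
static void check_forms_agree(double z)
{
    double d = poly_horner(z) - poly_split(z);
    assert(d < 1e-12 && d > -1e-12);
}
```

The split form does slightly more arithmetic but shortens the critical path, which is the usual trade-off on superscalar CPUs. The two forms can differ in the last bits because the rounding order changes.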

cvs commit: src/lib/msun/src e_rem_pio2f.c

2008-02-19 Bruce Evans
bde 2008-02-19 15:42:46 UTC FreeBSD src repository Modified files: lib/msun/src e_rem_pio2f.c Log: Merge cosmetic changes from e_rem_pio2.c 1.10 (convert to __FBSDID(); fix indentation and return type of __ieee754_rem_pio2()). Remove unused variables. Revi

cvs commit: src/lib/msun/src e_rem_pio2.c

2008-02-19 Bruce Evans
bde 2008-02-19 15:30:58 UTC FreeBSD src repository Modified files: lib/msun/src e_rem_pio2.c Log: Optimize for 3pi/4 <= |x| <= 9pi/4 in much the same way as for pi/4 <= |x| <= 3pi/4. Use the same branch ladder as for float precision. Remove the optimization for |

cvs commit: src/lib/msun/src s_rintl.c

2008-02-22 Bruce Evans
bde 2008-02-22 09:21:14 UTC FreeBSD src repository Modified files: lib/msun/src s_rintl.c Log: Fix rintl() on signaling NaNs and unsupported formats. Revision Changes Path 1.3 +3 -5 src/lib/msun/src/s_rintl.c

cvs commit: src/lib/msun/src s_rintl.c

2008-02-22 Bruce Evans
bde 2008-02-22 10:04:53 UTC FreeBSD src repository Modified files: lib/msun/src s_rintl.c Log: Optimize the fixup for +-0 by using better classification for this case and by using a table lookup to avoid a branch when this case occurs. On i386, this saves 1-4 cycl

cvs commit: src/lib/msun/src s_rintl.c

2008-02-22 Bruce Evans
bde 2008-02-22 11:59:05 UTC FreeBSD src repository Modified files: lib/msun/src s_rintl.c Log: Optimize the conversion to bits a little (by about 11 cycles or 16% on i386 (A64), 5 cycles on amd64 (A64), and 3 cycles on ia64). gcc tends to generate very bad code f

cvs commit: src/lib/msun/src math_private.h

2008-02-22 Bruce Evans
bde 2008-02-22 14:11:03 UTC FreeBSD src repository Modified files: lib/msun/src math_private.h Log: Add an irint() function in inline asm for amd64 and i386. irint() is the same as lrint() except it returns int instead of long. Though the extern lrint() is fairl

cvs commit: src/lib/msun/src e_rem_pio2.c e_rem_pio2f.c

2008-02-22 Bruce Evans
bde 2008-02-22 15:55:15 UTC FreeBSD src repository Modified files: lib/msun/src e_rem_pio2.c e_rem_pio2f.c Log: Optimize the 9pi/2 < |x| <= 2**19pi/2 case on amd64 and i386 by avoiding the double to int conversion operation which is very slow on these arches.

cvs commit: src/lib/msun/src e_rem_pio2.c

2008-02-22 Bruce Evans
bde 2008-02-22 17:26:24 UTC FreeBSD src repository Modified files: lib/msun/src e_rem_pio2.c Log: Remove the "quick check no cancellation" optimization for 9pi/2 < |x| < 32pi/2 since it is only a small or negative optimization and it gets in the way of further optim

cvs commit: src/lib/msun/src e_rem_pio2.c e_rem_pio2f.c

2008-02-22 Bruce Evans
bde 2008-02-22 18:43:23 UTC FreeBSD src repository Modified files: lib/msun/src e_rem_pio2.c e_rem_pio2f.c Log: Avoid using FP-to-integer conversion for !(amd64 || i386) too. Use the FP-to-FP method to round to an integer on all arches, and convert this to an int
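
The truncated log does not show how the FP-to-FP rounding is done. One common way to round to an integer using only floating-point arithmetic is to add and then subtract a large constant, and the sketch below assumes that approach. The constant 0x1.8p52 and the names round_fp_to_fp() and round_to_int32() are assumptions of this sketch, not taken from e_rem_pio2.c. It relies on the default round-to-nearest mode.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round x to the nearest integer value using only FP arithmetic.
 * Near 1.5 * 2**52 the spacing between doubles is exactly 1, so the
 * addition itself rounds x to an integer (ties go to even).
 * Assumes round-to-nearest mode and |x| < 2**51.
 */
static double round_fp_to_fp(double x)
{
    assert(x > -0x1p51 && x < 0x1p51);
    /* volatile stops the compiler from folding the add/subtract away
       and forces the sum to be stored with double precision. */
    volatile double t = x + 0x1.8p52;
    return t - 0x1.8p52;
}

/* The rounded value is already integral, so this cast loses nothing. */
static int32_t round_to_int32(double x)
{
    assert(x > -0x1p30 && x < 0x1p30);
    return (int32_t)round_fp_to_fp(x);
}
```

For example, round_fp_to_fp(2.5) gives 2.0 and round_fp_to_fp(3.5) gives 4.0, because ties round to even. The final cast is still an FP-to-integer conversion, but it operates on a value that is already integral.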

cvs commit: src/lib/msun/src e_rem_pio2.c e_rem_pio2f.c

2008-02-23 Bruce Evans
bde 2008-02-23 12:53:21 UTC FreeBSD src repository Modified files: lib/msun/src e_rem_pio2.c e_rem_pio2f.c Log: Optimize the 9pi/2 < |x| <= 2**19pi/2 case some more by avoiding an fabs(), a conditional branch, and sign adjustments of 3 variables for x < 0 when the

cvs commit: src/lib/msun/src e_rem_pio2.c e_rem_pio2f.c k_rem_pio2.c

2008-02-25 Bruce Evans
bde 2008-02-25 11:43:20 UTC FreeBSD src repository Modified files: lib/msun/src e_rem_pio2.c e_rem_pio2f.c k_rem_pio2.c Log: Fix some off-by-1 errors. e_rem_pio2.c: Float and double precision didn't work because init_jk[] was 1 too small. It needs to be 2 lar

cvs commit: src/lib/msun/src e_rem_pio2f.c math_private.h s_cosf.c s_sinf.c s_tanf.c

2008-02-25 Bruce Evans
bde 2008-02-25 13:33:20 UTC FreeBSD src repository Modified files: lib/msun/src e_rem_pio2f.c math_private.h s_cosf.c s_sinf.c s_tanf.c Log: Change __ieee754_rem_pio2f() to return double instead of float so that this function and its caller

cvs commit: src/lib/msun/src e_rem_pio2.c e_rem_pio2f.c

2008-02-25 Bruce Evans
bde 2008-02-25 18:28:58 UTC FreeBSD src repository Modified files: lib/msun/src e_rem_pio2.c e_rem_pio2f.c Log: Use a temporary array instead of the arg array y[] for calling __kernel_rem_pio2(). This simplifies analysis of aliasing and thus results in better cod

cvs commit: src/lib/msun/src e_rem_pio2f.c s_cosf.c s_sinf.c s_tanf.c

2008-02-25 Bruce Evans
bde 2008-02-25 22:19:17 UTC FreeBSD src repository Modified files: lib/msun/src e_rem_pio2f.c s_cosf.c s_sinf.c s_tanf.c Log: Inline __ieee754_rem_pio2f(). On amd64 (A64) and i386 (A64), this gives an average speedup of about 12 cycles or 17% for 9pi/4 < |x| <=

cvs commit: src/lib/msun/src e_rem_pio2.c e_rem_pio2f.c

2008-02-28 Bruce Evans
bde 2008-02-28 16:22:36 UTC FreeBSD src repository Modified files: lib/msun/src e_rem_pio2.c e_rem_pio2f.c Log: Fix and improve some magic numbers for the "medium size" case. e_rem_pio2.c: This case goes up to about 2**20pi/2, but the comment about it said that
