I've done some speed and accuracy comparisons between our respective Frac functions. Initially, my "SafeFrac" was marginally faster than "FracDoSkip", but I managed to optimise Thorsten's routine a little bit into the following:

  function FracSkip2(const X: ValReal): ValReal; assembler; nostackframe;
  asm
    align 16
    movq      rax,  xmm0
    shr       rax,  48
    and       ax,   $7FF0
    cmp       ax,   $4330
    jge       @@zero
    cmp       ax,   $3FE0
    jbe       @@skip
    cvttsd2si rax,  xmm0
    cvtsi2sd  xmm4, rax
    subsd     xmm0, xmm4
    ret
  @@zero:
    xorpd     xmm0, xmm0
  @@skip:
  end;

My test compared Frac, FracDoSkip, SafeFrac and what I call FracSkip2 above, which reworks the comparisons to use only 16 bits, and replaces "jmp @@skip" with "ret".

The results are as follows (as you can see... all of them are a great improvement over Frac).

Frac raises SIGFPE if plus or minus infinity is passed in, but our functions return zero. This may or may not be a desirable change.

Code sizes (alignment will round it up to the nearest 16 bytes):

  Frac       = 49 bytes
  FracDoSkip = 52 bytes
  SafeFrac   = 46 bytes
  FracSkip2  = 45 bytes

Long story short, with a few tweaks, Thorsten's routine is the fastest and also the smallest.

****

My test set was:

  DataSet: array[0..14] of Double =
    (1.5, 0, 2251799813685248, 4503599627370496, 1E300, 0.125,
     3.6415926535897932384626433832795,
     -1.5, -2251799813685248, -4503599627370496, -1E300, -0.125,
     -3.6415926535897932384626433832795,
     Infinity, NegInfinity);

For each value, it is tested as is, then DataSet[X] + 0.5, then DataSet[X] - 0.5 (best way to determine how it handles precision without it being optimised out by the compiler).

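For anyone who does not read x86-64 assembly comfortably, the exponent test in FracSkip2 can be sketched in plain Pascal roughly as below. This is only an illustration of the branch logic: PascalFracSkip and the Samples array are hypothetical names, the routine was not part of the timings, and the sample values are a small subset of the test set, run through the same "as is, +0.5, -0.5" pattern.

```pascal
program FracSkip2Sketch;
{$mode objfpc}

{ Illustrative only: a plain-Pascal rendering of the exponent test that
  FracSkip2 performs.  PascalFracSkip is a hypothetical name and was not
  one of the routines benchmarked. }
function PascalFracSkip(const X: Double): Double;
var
  Tmp: Double;
  Expo: Word;
begin
  Tmp := X;
  { Top 16 bits of the IEEE 754 double, with the sign bit and the four
    mantissa bits masked off, so only the biased exponent remains }
  Expo := Word(PQWord(@Tmp)^ shr 48) and $7FF0;
  if Expo >= $4330 then
    Result := 0.0             { |X| >= 2^52, infinity or NaN: no fraction bits }
  else if Expo <= $3FE0 then
    Result := X               { |X| < 1: X is already its own fractional part }
  else
    Result := X - Trunc(X);   { same idea as cvttsd2si / cvtsi2sd / subsd }
end;

const
  { Assumed subset of the test set, chosen for illustration }
  Samples: array[0..3] of Double = (1.5, 0.125, -1.5, 4503599627370496);
var
  I, J: Integer;
  V: Double;
begin
  for I := Low(Samples) to High(Samples) do
    for J := 0 to 2 do
    begin
      { Each value is tried as is, then +0.5, then -0.5 }
      case J of
        0: V := Samples[I];
        1: V := Samples[I] + 0.5;
      else
        V := Samples[I] - 0.5;
      end;
      WriteLn('PascalFracSkip(', V, ') = ', PascalFracSkip(V));
    end;
end.
```

The two constants are just biased exponents shifted into the top 16 bits: $4330 corresponds to 2^52, above which a Double has no fractional bits left, and $3FE0 corresponds to the range 0.5 to 1, so anything at or below it has magnitude less than 1.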
****

Frac( 1.5000000000000000E+000) = 5.0000000000000000E-001 - Pass - Time = 124.483 ns
FracDoSkip( 1.5000000000000000E+000) = 5.0000000000000000E-001 - Pass - Time = 47.525 ns
SafeFrac( 1.5000000000000000E+000) = 5.0000000000000000E-001 - Pass - Time = 32.707 ns
FracSkip2( 1.5000000000000000E+000) = 5.0000000000000000E-001 - Pass - Time = 34.904 ns

Frac( 2.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 126.170 ns
FracDoSkip( 2.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 51.210 ns
SafeFrac( 2.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 35.351 ns
FracSkip2( 2.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 33.911 ns

Frac( 1.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 125.927 ns
FracDoSkip( 1.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 49.127 ns
SafeFrac( 1.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 34.695 ns
FracSkip2( 1.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 36.139 ns

Frac( 0.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 119.800 ns
FracDoSkip( 0.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 40.316 ns
SafeFrac( 0.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 35.875 ns
FracSkip2( 0.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 34.046 ns

Frac( 5.0000000000000000E-001) = 5.0000000000000000E-001 - Pass - Time = 118.913 ns
FracDoSkip( 5.0000000000000000E-001) = 5.0000000000000000E-001 - Pass - Time = 40.183 ns
SafeFrac( 5.0000000000000000E-001) = 5.0000000000000000E-001 - Pass - Time = 36.783 ns
FracSkip2( 5.0000000000000000E-001) = 5.0000000000000000E-001 - Pass - Time = 34.976 ns

Frac(-5.0000000000000000E-001) = -5.0000000000000000E-001 - Pass - Time = 127.560 ns
FracDoSkip(-5.0000000000000000E-001) = -5.0000000000000000E-001 - Pass - Time = 41.676 ns
SafeFrac(-5.0000000000000000E-001) = -5.0000000000000000E-001 - Pass - Time = 36.577 ns
FracSkip2(-5.0000000000000000E-001) = -5.0000000000000000E-001 - Pass - Time = 34.714 ns

Frac( 2.2517998136852480E+015) = 0.0000000000000000E+000 - Pass - Time = 126.323 ns
FracDoSkip( 2.2517998136852480E+015) = 0.0000000000000000E+000 - Pass - Time = 49.108 ns
SafeFrac( 2.2517998136852480E+015) = 0.0000000000000000E+000 - Pass - Time = 35.376 ns
FracSkip2( 2.2517998136852480E+015) = 0.0000000000000000E+000 - Pass - Time = 36.373 ns

Frac( 2.2517998136852485E+015) = 5.0000000000000000E-001 - Pass - Time = 131.001 ns
FracDoSkip( 2.2517998136852485E+015) = 5.0000000000000000E-001 - Pass - Time = 54.474 ns
SafeFrac( 2.2517998136852485E+015) = 5.0000000000000000E-001 - Pass - Time = 38.834 ns
FracSkip2( 2.2517998136852485E+015) = 5.0000000000000000E-001 - Pass - Time = 37.139 ns

Frac( 2.2517998136852475E+015) = 5.0000000000000000E-001 - Pass - Time = 131.932 ns
FracDoSkip( 2.2517998136852475E+015) = 5.0000000000000000E-001 - Pass - Time = 52.214 ns
SafeFrac( 2.2517998136852475E+015) = 5.0000000000000000E-001 - Pass - Time = 37.093 ns
FracSkip2( 2.2517998136852475E+015) = 5.0000000000000000E-001 - Pass - Time = 35.674 ns

Frac( 4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 82.749 ns
FracDoSkip( 4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 38.613 ns
SafeFrac( 4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 38.575 ns
FracSkip2( 4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 33.970 ns

Frac( 4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 86.126 ns
FracDoSkip( 4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 38.434 ns
SafeFrac( 4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 38.636 ns
FracSkip2( 4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 33.747 ns

Frac( 4.5035996273704955E+015) = 5.0000000000000000E-001 - Pass - Time = 131.589 ns
FracDoSkip( 4.5035996273704955E+015) = 5.0000000000000000E-001 - Pass - Time = 53.594 ns
SafeFrac( 4.5035996273704955E+015) = 5.0000000000000000E-001 - Pass - Time = 36.617 ns
FracSkip2( 4.5035996273704955E+015) = 5.0000000000000000E-001 - Pass - Time = 36.509 ns

Frac( 1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 82.875 ns
FracDoSkip( 1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 39.008 ns
SafeFrac( 1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 39.112 ns
FracSkip2( 1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 34.195 ns

Frac( 1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 85.401 ns
FracDoSkip( 1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 38.653 ns
SafeFrac( 1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 38.655 ns
FracSkip2( 1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 34.408 ns

Frac( 1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 84.719 ns
FracDoSkip( 1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 39.174 ns
SafeFrac( 1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 38.876 ns
FracSkip2( 1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 33.570 ns

Frac( 1.2500000000000000E-001) = 1.2500000000000000E-001 - Pass - Time = 123.770 ns
FracDoSkip( 1.2500000000000000E-001) = 1.2500000000000000E-001 - Pass - Time = 41.642 ns
SafeFrac( 1.2500000000000000E-001) = 1.2500000000000000E-001 - Pass - Time = 38.704 ns
FracSkip2( 1.2500000000000000E-001) = 1.2500000000000000E-001 - Pass - Time = 35.399 ns

Frac( 6.2500000000000000E-001) = 6.2500000000000000E-001 - Pass - Time = 128.967 ns
FracDoSkip( 6.2500000000000000E-001) = 6.2500000000000000E-001 - Pass - Time = 42.082 ns
SafeFrac( 6.2500000000000000E-001) = 6.2500000000000000E-001 - Pass - Time = 38.199 ns
FracSkip2( 6.2500000000000000E-001) = 6.2500000000000000E-001 - Pass - Time = 36.072 ns

Frac(-3.7500000000000000E-001) = -3.7500000000000000E-001 - Pass - Time = 128.962 ns
FracDoSkip(-3.7500000000000000E-001) = -3.7500000000000000E-001 - Pass - Time = 40.375 ns
SafeFrac(-3.7500000000000000E-001) = -3.7500000000000000E-001 - Pass - Time = 37.153 ns
FracSkip2(-3.7500000000000000E-001) = -3.7500000000000000E-001 - Pass - Time = 34.515 ns

Frac( 3.6415926535897931E+000) = 6.4159265358979312E-001 - Pass - Time = 129.245 ns
FracDoSkip( 3.6415926535897931E+000) = 6.4159265358979312E-001 - Pass - Time = 53.440 ns
SafeFrac( 3.6415926535897931E+000) = 6.4159265358979312E-001 - Pass - Time = 38.390 ns
FracSkip2( 3.6415926535897931E+000) = 6.4159265358979312E-001 - Pass - Time = 36.963 ns

Frac( 4.1415926535897931E+000) = 1.4159265358979312E-001 - Pass - Time = 132.623 ns
FracDoSkip( 4.1415926535897931E+000) = 1.4159265358979312E-001 - Pass - Time = 52.325 ns
SafeFrac( 4.1415926535897931E+000) = 1.4159265358979312E-001 - Pass - Time = 39.016 ns
FracSkip2( 4.1415926535897931E+000) = 1.4159265358979312E-001 - Pass - Time = 36.818 ns

Frac( 3.1415926535897931E+000) = 1.4159265358979312E-001 - Pass - Time = 128.032 ns
FracDoSkip( 3.1415926535897931E+000) = 1.4159265358979312E-001 - Pass - Time = 49.834 ns
SafeFrac( 3.1415926535897931E+000) = 1.4159265358979312E-001 - Pass - Time = 37.077 ns
FracSkip2( 3.1415926535897931E+000) = 1.4159265358979312E-001 - Pass - Time = 37.099 ns

Frac(-1.5000000000000000E+000) = -5.0000000000000000E-001 - Pass - Time = 132.057 ns
FracDoSkip(-1.5000000000000000E+000) = -5.0000000000000000E-001 - Pass - Time = 53.112 ns
SafeFrac(-1.5000000000000000E+000) = -5.0000000000000000E-001 - Pass - Time = 38.287 ns
FracSkip2(-1.5000000000000000E+000) = -5.0000000000000000E-001 - Pass - Time = 36.849 ns

Frac(-1.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 130.452 ns
FracDoSkip(-1.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 51.451 ns
SafeFrac(-1.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 36.993 ns
FracSkip2(-1.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 36.110 ns

Frac(-2.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 131.912 ns
FracDoSkip(-2.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 52.946 ns
SafeFrac(-2.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 38.330 ns
FracSkip2(-2.0000000000000000E+000) = 0.0000000000000000E+000 - Pass - Time = 37.156 ns

Frac(-2.2517998136852480E+015) = 0.0000000000000000E+000 - Pass - Time = 131.354 ns
FracDoSkip(-2.2517998136852480E+015) = 0.0000000000000000E+000 - Pass - Time = 53.712 ns
SafeFrac(-2.2517998136852480E+015) = 0.0000000000000000E+000 - Pass - Time = 36.978 ns
FracSkip2(-2.2517998136852480E+015) = 0.0000000000000000E+000 - Pass - Time = 36.262 ns

Frac(-2.2517998136852475E+015) = -5.0000000000000000E-001 - Pass - Time = 127.641 ns
FracDoSkip(-2.2517998136852475E+015) = -5.0000000000000000E-001 - Pass - Time = 52.853 ns
SafeFrac(-2.2517998136852475E+015) = -5.0000000000000000E-001 - Pass - Time = 38.318 ns
FracSkip2(-2.2517998136852475E+015) = -5.0000000000000000E-001 - Pass - Time = 37.286 ns

Frac(-2.2517998136852485E+015) = -5.0000000000000000E-001 - Pass - Time = 130.918 ns
FracDoSkip(-2.2517998136852485E+015) = -5.0000000000000000E-001 - Pass - Time = 52.916 ns
SafeFrac(-2.2517998136852485E+015) = -5.0000000000000000E-001 - Pass - Time = 37.928 ns
FracSkip2(-2.2517998136852485E+015) = -5.0000000000000000E-001 - Pass - Time = 36.701 ns

Frac(-4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 82.714 ns
FracDoSkip(-4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 37.410 ns
SafeFrac(-4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 37.091 ns
FracSkip2(-4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 33.130 ns

Frac(-4.5035996273704955E+015) = -5.0000000000000000E-001 - Pass - Time = 131.699 ns
FracDoSkip(-4.5035996273704955E+015) = -5.0000000000000000E-001 - Pass - Time = 52.932 ns
SafeFrac(-4.5035996273704955E+015) = -5.0000000000000000E-001 - Pass - Time = 38.499 ns
FracSkip2(-4.5035996273704955E+015) = -5.0000000000000000E-001 - Pass - Time = 37.341 ns

Frac(-4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 85.069 ns
FracDoSkip(-4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 38.384 ns
SafeFrac(-4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 39.041 ns
FracSkip2(-4.5035996273704960E+015) = 0.0000000000000000E+000 - Pass - Time = 34.266 ns

Frac(-1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 81.913 ns
FracDoSkip(-1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 37.216 ns
SafeFrac(-1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 37.385 ns
FracSkip2(-1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 34.328 ns

Frac(-1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 85.317 ns
FracDoSkip(-1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 38.639 ns
SafeFrac(-1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 38.644 ns
FracSkip2(-1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 34.293 ns

Frac(-1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 85.878 ns
FracDoSkip(-1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 38.932 ns
SafeFrac(-1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 38.651 ns
FracSkip2(-1.0000000000000001E+300) = 0.0000000000000000E+000 - Pass - Time = 34.316 ns

Frac(-1.2500000000000000E-001) = -1.2500000000000000E-001 - Pass - Time = 128.603 ns
FracDoSkip(-1.2500000000000000E-001) = -1.2500000000000000E-001 - Pass - Time = 41.592 ns
SafeFrac(-1.2500000000000000E-001) = -1.2500000000000000E-001 - Pass - Time = 37.280 ns
FracSkip2(-1.2500000000000000E-001) = -1.2500000000000000E-001 - Pass - Time = 34.995 ns

Frac( 3.7500000000000000E-001) = 3.7500000000000000E-001 - Pass - Time = 124.473 ns
FracDoSkip( 3.7500000000000000E-001) = 3.7500000000000000E-001 - Pass - Time = 42.099 ns
SafeFrac( 3.7500000000000000E-001) = 3.7500000000000000E-001 - Pass - Time = 38.716 ns
FracSkip2( 3.7500000000000000E-001) = 3.7500000000000000E-001 - Pass - Time = 36.194 ns

Frac(-6.2500000000000000E-001) = -6.2500000000000000E-001 - Pass - Time = 129.138 ns
FracDoSkip(-6.2500000000000000E-001) = -6.2500000000000000E-001 - Pass - Time = 42.219 ns
SafeFrac(-6.2500000000000000E-001) = -6.2500000000000000E-001 - Pass - Time = 39.724 ns
FracSkip2(-6.2500000000000000E-001) = -6.2500000000000000E-001 - Pass - Time = 34.307 ns

Frac(-3.6415926535897931E+000) = -6.4159265358979312E-001 - Pass - Time = 129.833 ns
FracDoSkip(-3.6415926535897931E+000) = -6.4159265358979312E-001 - Pass - Time = 51.274 ns
SafeFrac(-3.6415926535897931E+000) = -6.4159265358979312E-001 - Pass - Time = 38.494 ns
FracSkip2(-3.6415926535897931E+000) = -6.4159265358979312E-001 - Pass - Time = 37.459 ns

Frac(-3.1415926535897931E+000) = -1.4159265358979312E-001 - Pass - Time = 132.230 ns
FracDoSkip(-3.1415926535897931E+000) = -1.4159265358979312E-001 - Pass - Time = 53.066 ns
SafeFrac(-3.1415926535897931E+000) = -1.4159265358979312E-001 - Pass - Time = 38.658 ns
FracSkip2(-3.1415926535897931E+000) = -1.4159265358979312E-001 - Pass - Time = 36.351 ns

Frac(-4.1415926535897931E+000) = -1.4159265358979312E-001 - Pass - Time = 126.783 ns
FracDoSkip(-4.1415926535897931E+000) = -1.4159265358979312E-001 - Pass - Time = 51.889 ns
SafeFrac(-4.1415926535897931E+000) = -1.4159265358979312E-001 - Pass - Time = 38.785 ns
FracSkip2(-4.1415926535897931E+000) = -1.4159265358979312E-001 - Pass - Time = 36.711 ns

Frac(+Inf) = EXCEPTION - EInvalidOp raised with message "Invalid floating point operation"
FracDoSkip(+Inf) = 0.0000000000000000E+000 - Pass - Time = 39.849 ns
SafeFrac(+Inf) = 0.0000000000000000E+000 - Pass - Time = 38.889 ns
FracSkip2(+Inf) = 0.0000000000000000E+000 - Pass - Time = 34.289 ns

Frac(+Inf) = EXCEPTION - EInvalidOp raised with message "Invalid floating point operation"
FracDoSkip(+Inf) = 0.0000000000000000E+000 - Pass - Time = 40.781 ns
SafeFrac(+Inf) = 0.0000000000000000E+000 - Pass - Time = 37.504 ns
FracSkip2(+Inf) = 0.0000000000000000E+000 - Pass - Time = 33.043 ns

Frac(+Inf) = EXCEPTION - EInvalidOp raised with message "Invalid floating point operation"
FracDoSkip(+Inf) = 0.0000000000000000E+000 - Pass - Time = 40.993 ns
SafeFrac(+Inf) = 0.0000000000000000E+000 - Pass - Time = 39.575 ns
FracSkip2(+Inf) = 0.0000000000000000E+000 - Pass - Time = 33.041 ns

Frac(-Inf) = EXCEPTION - EInvalidOp raised with message "Invalid floating point operation"
FracDoSkip(-Inf) = 0.0000000000000000E+000 - Pass - Time = 40.414 ns
SafeFrac(-Inf) = 0.0000000000000000E+000 - Pass - Time = 37.835 ns
FracSkip2(-Inf) = 0.0000000000000000E+000 - Pass - Time = 33.294 ns

Frac(-Inf) = EXCEPTION - EInvalidOp raised with message "Invalid floating point operation"
FracDoSkip(-Inf) = 0.0000000000000000E+000 - Pass - Time = 39.871 ns
SafeFrac(-Inf) = 0.0000000000000000E+000 - Pass - Time = 37.885 ns
FracSkip2(-Inf) = 0.0000000000000000E+000 - Pass - Time = 34.041 ns

Frac(-Inf) = EXCEPTION - EInvalidOp raised with message "Invalid floating point operation"
FracDoSkip(-Inf) = 0.0000000000000000E+000 - Pass - Time = 40.437 ns
SafeFrac(-Inf) = 0.0000000000000000E+000 - Pass - Time = 38.868 ns
FracSkip2(-Inf) = 0.0000000000000000E+000 - Pass - Time = 33.819 ns

****

Gareth aka. Kit
_______________________________________________ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel