On 8/14/07, William Stein <[EMAIL PROTECTED]> wrote:
> On 8/14/07, cwitty <[EMAIL PROTECTED]> wrote:
> > On Aug 14, 12:59 am, Jonathan Bober <[EMAIL PROTECTED]> wrote:
> > > This is exactly what NTL does in its quad float class. Just about every
> > > function starts and ends with a macro to adjust the fpu, resulting in
> > > around 7 extra assembly instructions. In the following code, the
> > > overhead is quite significant - it takes around 21 seconds to execute on
> > > my machine, but only about 4 seconds without the START_FIX and END_FIX.
> > > Of course, this is not necessarily any sort of accurate test, but it
> > > does indicate that this can be an expensive operation.
> >
> > Yes, changing the floating-point modes is very slow on many (all?) x86
> > processors. I believe it flushes the floating-point pipeline, which
> > takes many clock cycles.
>
> OK, how about this plan:
>
> (1) On systems with sse2, we do the option 3a (which is "If a
> processor supports sse2, then passing gcc -march=whatever -msse2
> -mfpmath=sse (maybe the -march isn't needed) will cause gcc to use
> sse registers and instructions for doubles, and these have the
> proper precision.")
>
> (2) On systems without sse2 (old slow pentium 3's) we do the START_FIX
> and END_FIX. These computers are very slow anyways, so let them suffer
> (and the suffering is *only* for code that uses quaddouble, which is
> very little code anyways).
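
To make the plan concrete, here is a rough sketch of how the two cases might be selected at compile time. It is only an illustration under stated assumptions: QD_START_FIX, QD_END_FIX and two_sum are made-up names (not the actual NTL or quaddouble macros), and the x87 branch assumes glibc's <fpu_control.h>. It relies on gcc defining __SSE2_MATH__ when doubles are computed in SSE2 registers (for example with -msse2 -mfpmath=sse), so that case pays nothing and only the x87 case touches the control word.

```c
#include <assert.h>

/* Hypothetical macro names for illustration; not the real NTL/quaddouble ones. */
#if defined(__i386__) && !defined(__SSE2_MATH__)
/* Case (2): x87 arithmetic. Assumes glibc's <fpu_control.h> is available. */
#include <fpu_control.h>
/* Save the control word and switch the x87 to 53-bit (double) precision. */
#define QD_START_FIX                                            \
    fpu_control_t qd_old_cw, qd_new_cw;                         \
    _FPU_GETCW(qd_old_cw);                                      \
    qd_new_cw = (qd_old_cw & ~_FPU_EXTENDED) | _FPU_DOUBLE;     \
    _FPU_SETCW(qd_new_cw);
/* Restore whatever precision setting the caller had. */
#define QD_END_FIX _FPU_SETCW(qd_old_cw);
#else
/* Case (1): doubles already use SSE2 (or this is not 32-bit x86): no-ops. */
#define QD_START_FIX
#define QD_END_FIX
#endif

/* Error-free addition: returns s = fl(a + b) and stores the rounding
   error in *err, so that a + b == s + *err exactly. This only holds when
   every intermediate is rounded to double, which is why the fix matters. */
static double two_sum(double a, double b, double *err)
{
    QD_START_FIX
    double s  = a + b;
    double bb = s - a;
    *err = (a - (s - bb)) + (b - bb);
    QD_END_FIX
    return s;
}

/* Small sanity check: 1e-17 is too small to change 1.0 in double
   precision, so it must come back in full as the error term. */
static void two_sum_self_check(void)
{
    double err;
    double s = two_sum(1.0, 1e-17, &err);
    assert(s == 1.0);
    assert(err == 1e-17);
}
```

Saving and restoring the control word in every function is exactly the overhead Jonathan measured, so in this sketch it is compiled in only when the compiler is not already doing double arithmetic in SSE2.
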

Since nobody objected, can somebody volunteer to implement this? :-)

William