On 8/14/07, William Stein <[EMAIL PROTECTED]> wrote:
> On 8/14/07, cwitty <[EMAIL PROTECTED]> wrote:
> > On Aug 14, 12:59 am, Jonathan Bober <[EMAIL PROTECTED]> wrote:
> > > This is exactly what NTL does in its quad float class. Just about every
> > > function starts and ends with a macro to adjust the fpu, resulting in
> > > around 7 extra assembly instructions. In the following code, the
> > > overhead is quite significant - it takes around 21 seconds to execute on
> > > my machine, but only about 4 seconds without the START_FIX and END_FIX.
> > > Of course, this is not necessarily any sort of accurate test, but it
> > > does indicate that this can be an expensive operation.
> >
> > Yes, changing the floating-point modes is very slow on many (all?) x86
> > processors.  I believe it flushes the floating-point pipeline, which
> > takes many clock cycles.
>
> OK, how about this plan:
>
> (1) On systems with sse2, we do option 3a (which is "If a processor
> supports sse2, then passing gcc -march=whatever -msse2 -mfpmath=sse
> (maybe the -march isn't needed) will cause gcc to use sse registers and
> instructions for doubles, and these have the proper precision.")
>
> (2) On systems without sse2 (old slow pentium 3's) we do the START_FIX
> and END_FIX.  These computers are very slow anyways, so let them suffer
> (and the suffering is *only* for code that uses quaddouble, which is very
> little code anyways).

Since nobody objected, can somebody volunteer to implement this? :-)
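
As a starting point, the plan could be sketched roughly as below. This is
only a sketch under stated assumptions, not NTL's or quaddouble's actual
code: the macro bodies are my own guess at what a START_FIX/END_FIX pair
looks like with GCC on 32-bit x86, and two_sum, two_sum_is_exact and
require_exact_two_sum are hypothetical names made up for illustration. It
relies on GCC predefining __SSE2_MATH__ when doubles are computed in sse
registers.

```c
#include <assert.h>

#if defined(__SSE2_MATH__)
/* Option (1): doubles are computed in SSE2 registers (for example with
   -msse2 -mfpmath=sse), which already round to double. Nothing to do. */
#  define START_FIX
#  define END_FIX
#elif defined(__GNUC__) && defined(__i386__)
/* Option (2): save the x87 control word, set the precision-control bits
   (mask 0x0300) to 53-bit mode (0x0200), and restore it afterwards. */
#  define START_FIX \
     volatile unsigned short old_cw_, new_cw_; \
     __asm__ volatile ("fnstcw %0" : "=m" (old_cw_)); \
     new_cw_ = (unsigned short)((old_cw_ & ~0x0300) | 0x0200); \
     __asm__ volatile ("fldcw %0" : : "m" (new_cw_));
#  define END_FIX \
     __asm__ volatile ("fldcw %0" : : "m" (old_cw_));
#else
/* Other targets: assumed (not verified here) to round to double already. */
#  define START_FIX
#  define END_FIX
#endif

/* Error-free addition (Knuth's TwoSum), the kind of building block that
   quad-precision code depends on. Returns s = fl(a + b) and stores the
   rounding error in *err; s + *err equals a + b exactly only if every
   intermediate result is rounded to double. */
double two_sum(double a, double b, double *err)
{
    START_FIX
    double s  = a + b;
    double bb = s - a;
    *err = (a - (s - bb)) + (b - bb);
    END_FIX
    return s;
}

/* Runtime probe: returns 1 if two_sum recovers the part of b that is lost
   when adding it to 1.0, and 0 otherwise. The operands are volatile so the
   arithmetic happens at run time instead of being folded by the compiler. */
int two_sum_is_exact(void)
{
    volatile double a = 1.0, b = 0x1p-60;
    double err;
    double s = two_sum(a, b, &err);
    return s == 1.0 && err == 0x1p-60;
}

/* Abort early if the build ended up with neither option in effect. */
void require_exact_two_sum(void)
{
    assert(two_sum_is_exact());
}
```

Calling require_exact_two_sum() once at startup would catch a build that
got neither the sse2 flags nor the fix. With the macros emptied out on an
x87-only build the probe may return 0, but that depends on how the compiler
spills registers, so it should be treated as a smoke test and not as a
guarantee.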

William

--~--~---------~--~----~------------~-------~--~----~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~----------~----~----~----~------~----~------~--~---
