Hi all,

I was looking for ways to improve the MaverickCrunch division routine on
ARM ep93xx, and noticed that there are a few other architectures that
don't have a hardware divide.

IA-64 has a "frcpa" instruction that returns an estimate of the
reciprocal of a float or double.
Likewise, RS-6000 has a "fres" that also returns an estimate of the
reciprocal of a float or double.
x86 seems to have something similar with SSE - called "rcpps" - that
also returns the estimated reciprocal.

They all seem to use FMAC / FNMAC instructions to refine that estimate
into the correct answer for x/y, through Newton-Raphson iteration.
The algorithms GCC uses for each of them differ, because the accuracy of
the reciprocal estimate differs.

http://en.wikipedia.org/wiki/N-th_root_algorithm
http://en.wikipedia.org/wiki/Multiply-accumulate

They also seem to use a similar algorithm to implement their sqrt
function...

My question is, are there any other architectures in GCC that don't have
a reciprocal estimate instruction, but have a FMAC?

I'd like to implement something similar for MaverickCrunch, using the
integer 32-bit MAC functions, but there is no reciprocal estimate
function on the MaverickCrunch.  I guess a lookup table could be
implemented, but how many entries will need to be generated, and how
accurate will it have to be to be IEEE 754 compliant (in the swdiv routine)?
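
For illustration, this is roughly what I have in mind, as a sketch
only. The 64-entry table size, the Q0.32 / Q2.30 fixed-point formats
and all the names (seed_table, init_seed_table, fixed_recip) are my own
assumptions, and I use 64-bit C arithmetic where the real thing would
use the 32-bit integer MAC. It only computes the reciprocal of a
normalized mantissa; exponent handling and IEEE 754 rounding are left
out entirely.

```c
#include <stdint.h>

#define SEED_BITS 6                        /* assumption: 64-entry table */
#define SEED_SIZE (1u << SEED_BITS)

static uint32_t seed_table[SEED_SIZE];

/* Entry i holds 1/midpoint of the i-th slice of [0.5, 1), in Q2.30.
   On a real target this would be a precomputed constant array. */
static void init_seed_table(void)
{
    for (uint32_t i = 0; i < SEED_SIZE; i++) {
        /* midpoint = denom / (4 * SEED_SIZE) */
        uint64_t denom = 2 * (uint64_t)(SEED_SIZE + i) + 1;
        seed_table[i] = (uint32_t)(((uint64_t)1 << (32 + SEED_BITS)) / denom);
    }
}

/* d is a normalized mantissa in Q0.32, i.e. a value in [0.5, 1) with
   bit 31 set. Returns an approximation of 1/d in Q2.30. */
static uint32_t fixed_recip(uint32_t d)
{
    /* Index with the SEED_BITS bits just below the leading 1. */
    uint32_t idx = (d >> (31 - SEED_BITS)) & (SEED_SIZE - 1);
    uint64_t r = seed_table[idx];          /* good to roughly 7 bits */

    /* Two Newton-Raphson steps, r = r * (2 - d*r): ~7 -> ~14 -> ~28 bits. */
    for (int i = 0; i < 2; i++) {
        uint64_t t = ((uint64_t)d * r) >> 32;   /* d*r in Q2.30 */
        uint64_t e = ((uint64_t)2 << 30) - t;   /* 2 - d*r in Q2.30 */
        r = (r * e) >> 30;                      /* back to Q2.30 */
    }
    return (uint32_t)r;
}
```

After init_seed_table(), fixed_recip(0x80000000u) (that is, 0.5) comes
out a few units below 0x80000000 (2.0 in Q2.30). Doubling the table
buys about one more bit in the seed, while each extra iteration doubles
the number of good bits, so a small table plus a couple of MACs looks
like the better trade. It does not tell me how much accuracy the final
rounding step needs.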

Also, where should I be sticking such an instruction / table?  Should I
put it in the kernel, and trap an invalid instruction?  Alternatively,
should I put it in libgcc or in glibc/uclibc?
