I'd like to implement something similar for MaverickCrunch, using the
integer 32-bit MAC functions, but there is no reciprocal estimate
function on the MaverickCrunch. I guess a lookup table could be
implemented, but how many entries will need to be generated, and how
accurate will it have to be IEEE754 compliant (in the swdiv routine)?
I think sh does something like that. It is quite a mess, as it has half
a dozen ways to implement division.
The idea is to use integer arithmetic to compute the right exponent, and
the lookup table to estimate the mantissa. I used something like this
for square root:
1) shift the entire FP number by 1 to the right (logical right shift)
2) sum 0x20000000 so that the exponent is still offset by 64
3) extract the 8 bits from 14 to 22 and look them up in a 256-entry,
32-bit table
4) sum the value (as a 32-bit integer!) with the content of the table
5) perform 2 Newton-Raphson iterations as necessary
example, 3.9921875
byte representation = 0x407F8000
shift right = 0x203FC000
sum = 0x403FC000
extract bits = 255
lookup table value = -4194312 = -0x400008
adjusted value = 16r3FFFBFF8, which is the square root
the table is simply making sure that if the rightmost 14 bits of the
mantissa is zero the return value is right. by summing the content of
the lookup table, you can of course interpolate between the values.
With a 12-bit table (i.e. 16 kilobytes instead of just one) you will
only need 1 iteration.
The algorithm will have to be adjusted for reciprocal (subtracting the
FP number from 16r7F000000 or better 16r7EFFFFFF should do the trick for
the first two steps; and since you don't shift right by one you'll use
bits 15-23).
Here is a sample program to generate the table. It's written in
Smalltalk (sorry :-P), it should not be hard to understand (but remember
that indices are 1-based). To double check, the first entries of the
table are 1 -32512 -64519 -96026.
| a int adj table |
table := ##(| table a val estim | table := Array new: 256.
0 to: 255 do: [ :i |
a := ByteArray new: 4.
"Create number"
a intAt: 1 put: (i bitShift: 15).
a at: 1 put: 64.
val := (a floatAt: 1) reciprocal.
"Perform estimation"
a intAt: 1 put: (16r7EFFFFFF - (a intAt: 1)).
estim := a intAt: 1.
"Compute delta with actual value and store it"
a floatAt: 1 put: val.
table at: i + 1 put: ((a intAt: 1) - estim)
].
table).
"Here we do the actual calculation. `self' is the number
to be reciprocated."
a := ByteArray new: 4.
a floatAt: 1 put: self.
"Perform estimation as above"
int := 16r7EFFFFFF - (a intAt: 1).
"Extract bits 15-23 and access the table."
adj := table at: ((a intAt: 1) // 32768 \\ 256) + 1.
"Sum the delta and convert from 32-bit integer to float"
a intAt: 1 put: (int + adj).
^(a floatAt: 1)
Also, where should I be sticking such an instruction / table? Should I
put it in the kernel, and trap an invalid instruction? Alternatively,
should I put it in libgcc
Yes, you could do this.
Paolo