I'd like to implement something similar for MaverickCrunch, using the
integer 32-bit MAC functions, but there is no reciprocal estimate
function on the MaverickCrunch.  I guess a lookup table could be
implemented, but how many entries will need to be generated, and how
accurate will it have to be IEEE754 compliant (in the swdiv routine)?

I think sh does something like that. It is quite a mess, as it has half a dozen ways to implement division.

The idea is to use integer arithmetic to compute the right exponent, and the lookup table to estimate the mantissa. I used something like this for square root:

1) shift the entire FP number by 1 to the right (logical right shift)
2) sum 0x20000000 so that the exponent is still offset by 64
3) extract the 8 bits from 14 to 22 and look them up in a 256-entry, 32-bit table
4) sum the value (as a 32-bit integer!) with the content of the table
5) perform 2 Newton-Raphson iterations as necessary

example, 3.9921875

byte representation = 0x407F8000
shift right = 0x203FC000
sum = 0x403FC000
extract bits = 255
lookup table value = -4194312 = -0x400008
adjusted value = 16r3FFFBFF8, which is the square root

the table is simply making sure that if the rightmost 14 bits of the mantissa is zero the return value is right. by summing the content of the lookup table, you can of course interpolate between the values.

With a 12-bit table (i.e. 16 kilobytes instead of just one) you will only need 1 iteration.

The algorithm will have to be adjusted for reciprocal (subtracting the FP number from 16r7F000000 or better 16r7EFFFFFF should do the trick for the first two steps; and since you don't shift right by one you'll use bits 15-23).

Here is a sample program to generate the table. It's written in Smalltalk (sorry :-P), it should not be hard to understand (but remember that indices are 1-based). To double check, the first entries of the table are 1 -32512 -64519 -96026.

            | a int adj table |
            table := ##(| table a val estim |  table := Array new: 256.
                0 to: 255 do: [ :i |
                   a := ByteArray new: 4.
                   "Create number"
                   a intAt: 1 put: (i bitShift: 15).
                   a at: 1 put: 64.
                   val := (a floatAt: 1) reciprocal.

                   "Perform estimation"
                   a intAt: 1 put: (16r7EFFFFFF - (a intAt: 1)).
                   estim := a intAt: 1.

                   "Compute delta with actual value and store it"
                   a floatAt: 1 put: val.
                   table at: i + 1 put: ((a intAt: 1) - estim)
                ].
                table).

            "Here we do the actual calculation. `self' is the number
             to be reciprocated."

            a := ByteArray new: 4.
            a floatAt: 1 put: self.

            "Perform estimation as above"
            int := 16r7EFFFFFF - (a intAt: 1).

            "Extract bits 15-23 and access the table."
            adj := table at: ((a intAt: 1) // 32768 \\ 256) + 1.

            "Sum the delta and convert from 32-bit integer to float"
            a intAt: 1 put: (int + adj).
            ^(a floatAt: 1)

Also, where should I be sticking such an instruction / table?  Should I
put it in the kernel, and trap an invalid instruction?  Alternatively,
should I put it in libgcc

Yes, you could do this.

Paolo

Reply via email to