I've recently written a program where taking the average of 2 floating
point numbers was a real bottleneck. I've looked into the assembly
generated by gcc -O3 and apparently gcc treats multiplication and
division by a hard-coded 2 like any other multiplication with a
constant. I think, however, that *(2^c) and /(2^c) for floating
points, where the c is known at compile-time, should be able to be
optimized with the following pseudo-code:

e = exponent bits of the number
if (e > c && e < (0b111...11)-c) {
e += c or e -= c
} else {
do regular multiplication
}

Even further optimizations may be possible, such as bitshifting the
significand when e=0. However, that would require checking for a lot
of special cases and require so many conditional jumps that it's most
likely not going to be any faster.

I'm not skilled enough with assembly to write this myself and test if
this actually performs faster than how it's implemented now. Its
performance will most likely also depend on the processor
architecture, and I could only test this code on one machine.
Therefore I ask to those who are familiar with gcc's optimization
routines to give this 2 seconds of thought, as this is probably rather
easy to implement and many programs could benefit from this.

Greets,
Jeroen

Reply via email to