https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116128

--- Comment #5 from mjr19 at cam dot ac.uk ---
I think in general using partial sums improves accuracy.

If one assumes that all of the data have the same sign and similar magnitude,
then by the time the sum is nearly complete one is adding a single element to
the sum of a large number of elements. Adding a small number to a much bigger one
loses precision, because the small addend's low-order bits fall below the rounding
granularity of the large partial sum.

For accuracy, the ideal might be to add all elements in pairs to form an array
of half the original size, and then recurse. The two operands of each addition are
then likely to be of similar magnitude. But doing this literally would be very
expensive.
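
A minimal sketch of that pairwise scheme, purely to illustrate the idea (the module
and function names, the base-case threshold of 8 and the plain-loop fallback are my
own choices, not anything gfortran generates):

module pairwise_mod
contains
  recursive function pairwise_sum(a) result(s)
    real, intent(in) :: a(:)
    real :: s
    integer :: n, i
    n = size(a)
    if (n <= 8) then
      ! Base case: a short run is summed with a plain loop
      s = 0
      do i = 1, n
        s = s + a(i)
      enddo
    else
      ! Split in half and recurse; the two partial sums being added
      ! are then of similar magnitude
      s = pairwise_sum(a(:n/2)) + pairwise_sum(a(n/2+1:))
    end if
  end function pairwise_sum
end module pairwise_mod

By contrast, the naive single-accumulator loop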

integer, parameter :: HUGE = 2**25   ! any size comfortably above 2**24
real :: a(HUGE), t
integer :: i
a = 1
t = 0
do i = 1, HUGE
  t = t + a(i)
enddo

will, for any HUGE comfortably above 2**24, end with t=16777216, because in single
precision 16777216.0+1.0 rounds back to 16777216.0: above 2**24 the spacing between
consecutive reals is greater than 1.

A sum function using sixteen partial sums might keep going until about
268435456, which is 16x better. The recursive pairs method would stay exact, for
this all-ones data, right up until the sum overflows the real type, which is very
much better.
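
For illustration only, a rough sketch of what a sixteen-partial-sum version could
look like (the module name, function name, stride-16 blocking and tail loop are my
own; the vectorizer's actual arrangement of accumulators will differ):

module sum16_mod
contains
  function sum16(a) result(s)
    real, intent(in) :: a(:)
    real :: s, partial(16)
    integer :: i, n, nfull
    n = size(a)
    nfull = n - mod(n, 16)   ! elements covered by complete blocks of 16
    partial = 0
    ! Each accumulator only ever sees every 16th element, so it grows
    ! 16x more slowly than a single accumulator and saturates later.
    do i = 1, nfull, 16
      partial = partial + a(i:i+15)
    enddo
    s = sum(partial)
    ! Tail elements that did not fill a block of 16
    do i = nfull + 1, n
      s = s + a(i)
    enddo
  end function sum16
end module sum16_mod

With all-ones data each accumulator saturates only at 2**24, so the total stays
accurate up to roughly 16*2**24 = 268435456, as above.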

However, if the data are such that the sum is expected to be approximately
zero, then these methods are all about equally good. A lot depends on what one
thinks the values in a typical array being summed will look like.

As for using a library function, there might be some overhead in the call itself,
and then in deciding at run time which instruction set to use, whereas the inline
compiled code may have fewer instruction sets to choose between. For the product
code one would want SSE, AVX, AVX+FMA and AVX-512 paths in a library before
worrying about detailed tuning for specific CPUs. I tend to be biased towards
inline code for simple things, although I admit that there are good arguments on
both sides. Inline code also wins when something about the array size is known at
compile time, e.g. so that checks for very small arrays can be skipped.
