Re: Complex multiplication in gcc

Gabriel Paubert Mon, 17 Jul 2017 11:33:01 -0700

On Mon, Jul 17, 2017 at 10:51:21AM -0600, Sean McAllister wrote:
> When generating code for a simple inner loop (instantiated with
> std::complex<float>)
> 
> template <typename cx>
> void __attribute__((noinline)) benchcore(const cx* __restrict__ aa,
> const cx* __restrict__ bb, const cx* __restrict__ cc, cx* __restrict__
> dd, cx uu, cx vv, size_t nn) {
>     for (ssize_t ii=0; ii < nn; ii++) {
>         dd[ii] = (
>             aa[ii]*uu +
>             bb[ii]*vv +
>             cc[ii]
>         );
>     }
> }
> 
> g++ generates the following assembly code (g++ 7.1.0) (compiled with:
> g++ -I. test.cc -O3 -ggdb3 -o test)


[snipped]
> 
> The interesting part is the two calls to __mulsc3, which the docs
> indicate computes complex multiplication according to Annex G of the
> C99 standard.  This leads me to two questions.
> 
> First, disassembling __mulsc3 doesn't seem to contain anything:
> 
> (gdb) disassemble __mulsc3
> Dump of assembler code for function __mulsc3@plt:
>    0x0000000000400aa0 <+0>: jmpq   *0x2035d2(%rip)        # 0x604078
>    0x0000000000400aa6 <+6>: pushq  $0xc
>    0x0000000000400aab <+11>: jmpq   0x4009d0
> End of assembler dump.
> 
> What's the cause of this?

Because you are disassembling the PLT stub (note the __mulsc3@plt in the
dump header), which only jumps to the real function. That function is
provided by libgcc (on my computer the exact location is
/lib/x86_64-linux-gnu/libgcc_s.so.1).

> 
> Second, since I don't think I'll convince anyone to generate
> non-standard conforming code by default, could the default performance
> of complex multiplication be enhanced significantly by performing the
> isnan() checks required by Annex G and only calling the function to
> fix the results if they fail?  That would move the function call
> overhead out of the critical path at least.
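
For illustration, here is a minimal sketch of the scheme described in the
quoted paragraph above, written by hand for std::complex<float>. The name
mul_fast_path is hypothetical (it is not part of GCC or libstdc++), and the
sketch assumes the ordinary operator* is the Annex G conforming multiply
(the __mulsc3 call seen in the generated code). It computes the naive
product inline and only falls back to the library routine when both parts
of the result are NaN, which is the condition Annex G uses to trigger its
recovery step.

```cpp
#include <cmath>
#include <complex>

// Hypothetical helper, for illustration only: naive product on the fast
// path, library multiply only when the naive result is (NaN, NaN).
inline std::complex<float> mul_fast_path(std::complex<float> x,
                                         std::complex<float> y) {
    const float a = x.real(), b = x.imag();
    const float c = y.real(), d = y.imag();

    // Textbook formula: (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    const float re = a * c - b * d;
    const float im = a * d + b * c;

    // Annex G only repairs the result when both parts are NaN, so this
    // is the only case that needs the out-of-line call.
    if (std::isnan(re) && std::isnan(im)) {
        return x * y;  // slow path: assumed to be the conforming multiply
    }
    return {re, im};
}
```

In the benchmark loop this would replace aa[ii]*uu and bb[ii]*vv with
mul_fast_path(aa[ii], uu) and mul_fast_path(bb[ii], vv). Whether the
compiler then vectorizes the loop is a separate question that this sketch
does not answer.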

        Gabriel
