x86_64 extended the sse2 movnti instruction to support 64-bit integer registers
as well.  i don't see any builtin for generating this... nor is there an
intrinsic listed in the intel manuals or apparently in icc 10.0.023 header
files either.  the natural name would be _mm_stream_si64 -- which fortunately
does not conflict with _mm_stream_pi, the mmx 64-bit store version.

on a whim i tried this:

void foo(unsigned long *d, unsigned long v)
{
  __builtin_ia32_movntq((unsigned long long *)d, v);
}

which results in this code:

0000000000000000 <foo>:
   0:   48 89 74 24 f8          mov    %rsi,0xfffffffffffffff8(%rsp)
   5:   0f 6f 44 24 f8          movq   0xfffffffffffffff8(%rsp),%mm0
   a:   0f e7 07                movntq %mm0,(%rdi)
   d:   c3                      retq

perhaps this builtin could be overloaded to generate the "movnti %rsi,(%rdi)"
directly instead of shuffling through the mmx reg file?

(note if i throw in -mtune=core2 it eliminates the trip through the stack)
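
for comparison, here is a minimal sketch of the store i'm after, written with gcc extended inline asm rather than a builtin. the names stream_store_ul and foo_nti are made up for illustration (they are not gcc builtins or intel intrinsics), and the sketch assumes an LP64 target where unsigned long is 64 bits. on anything that isn't x86_64 it falls back to an ordinary store.

```c
/* hypothetical helper: not a gcc builtin or an intel intrinsic */
static inline void stream_store_ul(unsigned long *d, unsigned long v)
{
#if defined(__x86_64__) && defined(__GNUC__)
    /* non-temporal store straight from a general-purpose register;
       the assembler picks the 64-bit form from the register operand */
    __asm__ __volatile__("movnti %1, %0" : "=m"(*d) : "r"(v));
#else
    /* assumption: other targets just get a plain store */
    *d = v;
#endif
}

/* same shape as foo() above, but without the trip through the mmx regs */
void foo_nti(unsigned long *d, unsigned long v)
{
    stream_store_ul(d, v);
}
```

movnti stores are weakly ordered, so a caller that hands the data to another thread would still want an sfence after the last store.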

-dean

/home/odo/gcc/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc/configure --prefix=/home/odo/gcc --disable-multilib
--disable-biarch x86_64-unknown-linux-gnu --enable-languages=c
Thread model: posix
gcc version 4.3.0 20071029 (experimental) (GCC)


-- 
           Summary: streaming 64-bit integer stores
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: dean at arctic dot org
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33944
