On Fri, Jul 01, 2011 at 11:45:16AM -0700, Martin Thuresson wrote:
> In recent versions of GCC I have seen a transformation of inline
> assembly that I'd like to confirm is valid.
> 
> The code in question can be found in mplayer/mp3lib/dct64_sse.c
> 
>           "movaps    %0, %%xmm0\n\t"
>           "shufps    $27, %%xmm0, %%xmm0\n\t"
>           "movaps    %1, %%xmm5\n\t"
>           "movaps    %%xmm5, %%xmm6\n\t"
>           :
>           :"m"(*costab), "m"(*nnnn)
> 
> where nnnn is
>  static const int nnnn[4] __attribute__((aligned(16))) = { 1 << 31, 1 << 31, 1 << 31, 1 << 31 };
> 
> GCC turns this into:
>      "movaps    %0, %%xmm0
>       shufps    $27, %%xmm0, %%xmm0
>       movaps    %1, %%xmm5
>       movaps    %%xmm5, %%xmm6
>       " :  : "m" costab_mmx[24], *"m" -2147483648*);
> 
> The new constant might end up in an unaligned address causing the
> program to segfault on Intel platforms.

But you told GCC the operand is int-sized memory, while the asm reads
four times as much data and expects 16-byte alignment.
So you want "m"(*(__m128d *)nnnn)  (or "m"(*(__m128i *)nnnn)).
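
A minimal sketch of that suggestion, assuming GCC or Clang on x86-64.
load_mask is a hypothetical helper (not from dct64_sse.c), and INT_MIN
stands in for 1 << 31 to keep the initializer well defined. Giving the
operand a 16-byte type tells the compiler both the size and the
alignment the asm relies on:

```c
#include <emmintrin.h>  /* __m128i, _mm_storeu_si128 (SSE2) */
#include <limits.h>     /* INT_MIN */

/* Same bit pattern as the original { 1 << 31, ... } sign mask. */
static const int nnnn[4] __attribute__((aligned(16))) =
    { INT_MIN, INT_MIN, INT_MIN, INT_MIN };

/* Hypothetical helper: load the mask with movaps and copy it to out. */
void load_mask(int out[4])
{
    __m128i v;

    /* The operand is typed as a whole 16-byte vector, so whatever the
       compiler substitutes for it must be 16 bytes and 16-byte aligned. */
    __asm__ ("movaps %1, %0"
             : "=x" (v)
             : "m" (*(const __m128i *) nnnn));

    _mm_storeu_si128((__m128i *) out, v);
}
```

With "m"(*nnnn) instead, the operand is a single int and the compiler
is free to treat it as one.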

Your "fixed" version with "p" is not correct either, because
it doesn't tell gcc that the asm reads the value stored at that memory
address. E.g. given
int foo (void)
{
  int p, q;
  p = 5;
  asm ("movl %a1, %0" : "=r" (q) : "p" (&p));
  return q;
}
gcc might very well eliminate the p = 5; store.
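
For contrast, a sketch of two ways to make the read visible to the
compiler, again assuming GCC or Clang on x86. The names foo_mem and
foo_clobber are made up for illustration:

```c
/* Variant 1: pass the object itself as an "m" input operand, so the
   compiler knows the asm reads p and must keep the store. */
int foo_mem(void)
{
    int p, q;
    p = 5;
    __asm__ ("movl %1, %0" : "=r" (q) : "m" (p));
    return q;
}

/* Variant 2: pass the address in a register and add a "memory"
   clobber, which says the asm may access memory that is not listed
   among its operands. */
int foo_clobber(void)
{
    int p, q;
    p = 5;
    __asm__ ("movl (%1), %0" : "=r" (q) : "r" (&p) : "memory");
    return q;
}
```

The "m" operand is the narrower of the two, since it only names the
object that is actually read; the "memory" clobber is a blunter tool
and can inhibit more optimization around the asm.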

        Jakub
