On Fri, Nov 8, 2013 at 6:51 PM, Hendrik Greving
<hendrik.greving.in...@gmail.com> wrote:
> That didn't do it. What was the rationale w.r.t. the relation
> between the vectorized sequence and/or the alignment (I think these
> things are actually 2 separate things..) and the common block?!

We cannot adjust the alignment of a common block as we don't know
which common block the linker will pick in the end.  We can (and do)
adjust the alignment of global variables though.  And C++ defaults
to -fno-common.
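
To make the distinction concrete, here is a minimal sketch in C. It is not
the original testcase: LENGTH = 10000 is an assumption, and the h_* arrays
and the helper names are hypothetical, introduced only for illustration.
Uninitialized file-scope arrays are tentative definitions, which a C compiler
running with -fcommon may emit as common symbols; the explicitly aligned
arrays correspond to the "__attribute__ ((aligned (64)))" workaround from
the original post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LENGTH 10000  /* assumed value; the thread does not state it */

/* Tentative definitions. Under -fcommon these can become common symbols,
 * so the compiler cannot rely on raising their alignment itself. */
int g_a[LENGTH];
int g_b[LENGTH];
int g_c[LENGTH];

/* Hypothetical counterparts with the alignment requested explicitly,
 * as in the workaround mentioned in the original post. */
int h_a[LENGTH] __attribute__((aligned(64)));
int h_b[LENGTH] __attribute__((aligned(64)));
int h_c[LENGTH] __attribute__((aligned(64)));

/* Returns nonzero when p is a multiple of n bytes. */
int is_aligned(const void *p, size_t n)
{
    return ((uintptr_t)p % n) == 0;
}

/* The loop from the original post, operating on the g_* arrays. */
void add_common(void)
{
    int i;
    for (i = 0; i < LENGTH - 1; i++) {
        g_c[i] = g_a[i] + g_b[i];
    }
}

/* Same loop on the explicitly aligned arrays. */
void add_aligned(void)
{
    int i;
    for (i = 0; i < LENGTH - 1; i++) {
        h_c[i] = h_a[i] + h_b[i];
    }
}

/* The explicit attribute guarantees 64-byte alignment for h_*. */
void check_explicit_alignment(void)
{
    assert(is_aligned(h_a, 64));
    assert(is_aligned(h_b, 64));
    assert(is_aligned(h_c, 64));
}
```

Compiling this as C with -O3 and the same target options, once with -fcommon
and once with -fno-common, and comparing the assembly for add_common and
add_aligned should show whether the unaligned load/store sequence goes away.
Note that GCC 10 and later default to -fno-common for C as well.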

In general when asking optimization questions it helps to provide
a testcase that can be compiled - otherwise you just provoke
random guesses (like mine) ;)

Richard.

> Hendrik
>
> On Fri, Nov 8, 2013 at 9:44 AM, Richard Biener
> <richard.guent...@gmail.com> wrote:
>> Hendrik Greving <hendrik.greving.in...@gmail.com> wrote:
>>>The code for a simple loop like
>>>
>>>for (i = 0; i < LENGTH-1; i++) {
>>>        g_c[i] = g_a[i] + g_b[i];
>>>}
>>>
>>>looks good for g++ (4.9.0 20131028 (experimental)) (-O3 core-avx2)
>>>
>>>.L2:
>>>vmovdqa g_a(%rax), %ymm0 # 26 *movv8si_internal/2 [length = 8]
>>>vpaddd g_b(%rax), %ymm0, %ymm0 # 27 *addv8si3/2 [length = 8]
>>>addq $32, %rax # 29 *adddi_1/1 [length = 4]
>>>vmovaps %ymm0, g_c-32(%rax) # 28 *movv8si_internal/3 [length = 8]
>>>cmpq $39968, %rax # 31 *cmpdi_1/1 [length = 6]
>>>jne .L2 # 32 *jcc_1 [length = 2]
>>>
>>>but for gcc, I'm getting
>>>
>>>.L4:
>>>vmovdqu (%rsi,%rax), %xmm0 # 156 sse2_loaddquv16qi [length = 5]
>>>vinserti128 $0x1, 16(%rsi,%rax), %ymm0, %ymm0 # 157 avx_vec_concatv32qi/1 [length = 8]
>>>addl $1, %edx # 161 *addsi_1/1 [length = 3]
>>>vpaddd (%rdi,%rax), %ymm0, %ymm0 # 158 *addv8si3/2 [length = 5]
>>>vmovups %xmm0, (%rcx,%rax) # 412 *movv16qi_internal/3 [length = 5]
>>>vextracti128 $0x1, %ymm0, 16(%rcx,%rax) # 160 vec_extract_hi_v32qi/2 [length = 8]
>>>addq $32, %rax # 162 *adddi_1/1 [length = 4]
>>>cmpl $1248, %edx # 164 *cmpsi_1/1 [length = 6]
>>>jbe .L4 # 165 *jcc_1 [length = 2]
>>>
>>>unless I add "__attribute__ ((aligned (64)))" to g_a, g_b, g_c.
>>>
>>>2 questions: Does C have different alignment requirements/specs than
>>>C++ (I don't think so)?
>>
>> Try -fno-common
>>
>> Richard.
>>
>>  But if so, why does gcc not just align the
>>>arrays (they are in the same module in my example...)? Leaving aside the
>>>alignment question, why not just do avx2 (ymm) moves as g++ does?
>>>
>>>Guess my question is, is this a bug or a feature?
>>>
>>>Thanks,
>>>Regards,
>>>Hendrik
>>
>>
