On Fri, Nov 8, 2013 at 6:51 PM, Hendrik Greving <hendrik.greving.in...@gmail.com> wrote:
> That didn't do it. What was the rationale w.r.t. the relation
> between the vectorized sequence and/or the alignment (I think these
> things are actually 2 separate things..) and the common block?!

We cannot adjust the alignment of a common block as we don't know
which common block the linker will pick in the end. We can (and do)
adjust the alignment of global variables though. And C++ defaults
to -fno-common.

In general when asking optimization questions it helps to provide
a testcase that can be compiled - otherwise you just provoke
random guesses (like mine) ;)

Richard.

> Hendrik
>
> On Fri, Nov 8, 2013 at 9:44 AM, Richard Biener
> <richard.guent...@gmail.com> wrote:
>> Hendrik Greving <hendrik.greving.in...@gmail.com> wrote:
>>>The code for a simple loop like
>>>
>>>for (i = 0; i < LENGTH-1; i++) {
>>>  g_c[i] = g_a[i] + g_b[i];
>>>}
>>>
>>>looks good for g++ (4.9.0 20131028 (experimental)) (-O3 core-avx2)
>>>
>>>.L2:
>>>vmovdqa g_a(%rax), %ymm0 # 26 *movv8si_internal/2 [length = 8]
>>>vpaddd g_b(%rax), %ymm0, %ymm0 # 27 *addv8si3/2 [length = 8]
>>>addq $32, %rax # 29 *adddi_1/1 [length = 4]
>>>vmovaps %ymm0, g_c-32(%rax) # 28 *movv8si_internal/3 [length = 8]
>>>cmpq $39968, %rax # 31 *cmpdi_1/1 [length = 6]
>>>jne .L2 # 32 *jcc_1 [length = 2]
>>>
>>>but for gcc, I'm getting
>>>
>>>.L4:
>>>vmovdqu (%rsi,%rax), %xmm0 # 156 sse2_loaddquv16qi [length = 5]
>>>vinserti128 $0x1, 16(%rsi,%rax), %ymm0, %ymm0 # 157 avx_vec_concatv32qi/1 [length = 8]
>>>addl $1, %edx # 161 *addsi_1/1 [length = 3]
>>>vpaddd (%rdi,%rax), %ymm0, %ymm0 # 158 *addv8si3/2 [length = 5]
>>>vmovups %xmm0, (%rcx,%rax) # 412 *movv16qi_internal/3 [length = 5]
>>>vextracti128 $0x1, %ymm0, 16(%rcx,%rax) # 160 vec_extract_hi_v32qi/2 [length = 8]
>>>addq $32, %rax # 162 *adddi_1/1 [length = 4]
>>>cmpl $1248, %edx # 164 *cmpsi_1/1 [length = 6]
>>>jbe .L4 # 165 *jcc_1 [length = 2]
>>>
>>>unless I add "__attribute__ ((aligned (64)));" g_a, g_b, g_c.
>>>
>>>2 questions: Does C have different alignment requirements/specs than
>>>C++ (I don't think so)?
>>
>> Try -fno-common
>>
>> Richard.
>>
>>>But if so, why does gcc not just align the
>>>arrays (they are in the same module in my example...)? Let aside the
>>>alignment question, why not just do avx2 (ymm) moves as g++ does?
>>>
>>>Guess my question is, is this a bug or a feature?
>>>
>>>Thanks,
>>>Regards,
>>>Hendrik
>>
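
For reference, a compilable testcase of the kind asked for above might
look roughly like the sketch below. It is a reconstruction under stated
assumptions, not the original poster's file: the file name, LENGTH =
10000, the int element type, and the helpers is_aligned() and
self_check() are all hypothetical. Only the loop and the use of
__attribute__ ((aligned (...))) on g_a, g_b and g_c come from the
thread. The sketch uses 32 bytes (one ymm register) rather than the 64
mentioned in the post.

```c
/* testcase.c: hypothetical reduction of the loop discussed in this thread.
   LENGTH and the element type are assumptions; the post shows neither. */
#include <assert.h>
#include <stdint.h>

#define LENGTH 10000

/* Uninitialized file-scope arrays are tentative definitions in C and may
   be placed in common blocks when compiling with -fcommon.  The explicit
   attribute requests the alignment directly, as in the workaround from
   the original post. */
int g_a[LENGTH] __attribute__ ((aligned (32)));
int g_b[LENGTH] __attribute__ ((aligned (32)));
int g_c[LENGTH] __attribute__ ((aligned (32)));

/* Returns nonzero if p is aligned to n bytes (n must be a power of two). */
int is_aligned (const void *p, uintptr_t n)
{
  return ((uintptr_t) p & (n - 1)) == 0;
}

/* The loop from the original post. */
void add_arrays (void)
{
  int i;
  for (i = 0; i < LENGTH - 1; i++)
    g_c[i] = g_a[i] + g_b[i];
}

/* Hypothetical sanity check: verifies the requested alignment and that
   the loop computes an element-wise sum. */
void self_check (void)
{
  assert (is_aligned (g_a, 32));
  assert (is_aligned (g_b, 32));
  assert (is_aligned (g_c, 32));
  g_a[0] = 1;
  g_b[0] = 2;
  add_arrays ();
  assert (g_c[0] == 3);
}
```

To compare code generation, one could remove the attributes and compile
with something like "gcc -O3 -march=core-avx2 -S testcase.c", once with
-fcommon and once with -fno-common, then look at the loads and stores
in add_arrays. The -march spelling is a guess at what "core-avx2" in
the post refers to.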