Ok, thanks, that explains it... Apparently x86 splits the vector moves
into two in
ix86_expand_vector_move_misalign -> ix86_avx256_split_vector_move_misalign.
But I wanted to mention that e.g. icc, despite also putting g_a, g_b,
g_c into .comm, still generates a full-width AVX2 vmovdqu using ymm...

Examples:

foo.c:

#include <stdio.h>
#include <stdint.h>
#include "foo.h"

int g_a[LENGTH];
int g_b[LENGTH];
int g_c[LENGTH];
void
foo()
{
    int i;
    for (i = 0; i < LENGTH; i++) {
        g_c[i] = g_a[i] + g_b[i];
    }
}

icc:
icc/13.1.3/bin/icc -S -O3 -march=core-avx2 foo.c -v -save-temps -vec-report=2

gcc:
gcc -S -O3 -march=core-avx2 foo.c -ftree-vectorizer-verbose=1 -dp -v -da
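(Roughly, the inner-loop difference looks like the following -- register
numbers and offsets are illustrative only, not copied from actual
compiler output. icc keeps the 256-bit unaligned access; gcc's split
path does the load/store in two 128-bit halves:)

```
# icc-style: single 256-bit unaligned moves
vmovdqu  g_a(%rax), %ymm0
vpaddd   g_b(%rax), %ymm0, %ymm0
vmovdqu  %ymm0, g_c(%rax)

# gcc-style split (ix86_avx256_split_vector_move_misalign):
vmovdqu    g_a(%rax), %xmm0          # low 128 bits
vinserti128 $1, g_a+16(%rax), %ymm0, %ymm0   # high 128 bits
...
vmovdqu     %xmm0, g_c(%rax)         # low 128 bits
vextracti128 $1, %ymm0, g_c+16(%rax) # high 128 bits
```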



On Mon, Nov 11, 2013 at 1:31 PM, David Edelsohn <dje....@gmail.com> wrote:
> On Mon, Nov 11, 2013 at 3:56 PM, Richard Henderson <r...@redhat.com> wrote:
>
>>> I suppose targets without .bss section support should not switch
>>> (that is, targets not defining BSS_SECTION_ASM_OP or
>>> ASM_OUTPUT_ALIGNED_BSS).
>>
>> Good point.  I don't expect that we have many of those left, but
>> if any do still exist...
>
> AIX XCOFF, although it probably can be changed to explicitly use a BSS 
> section.
>
> - David
