https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113358

            Bug ID: 113358
           Summary: OpenMP inhibits vectorization
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: thomas.koopman at ru dot nl
  Target Milestone: ---

Created attachment 57054
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57054&action=edit
The three different versions as well as preprocessed output.

I have attached three programs that compute an array accel of points in R^3
from a 3d array of positions in R^3, where accel[i] is
sum_j ||positions[i] - positions[j]||^2.
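For readers without the attachment, the formula above can be sketched as follows. This is only an illustration of the stated sum, not the contents of seq.c: the function name accel_seq, the flat x,y,z storage layout, the use of double, and the choice of one scalar result per point are all assumptions.

```c
#include <stddef.h>

/* Hypothetical sketch, not the attached seq.c.
 * Assumed layout: n points stored as x0,y0,z0,x1,y1,z1,...
 * accel[i] = sum over j of ||positions[i] - positions[j]||^2. */
void accel_seq(const double *positions, double *accel, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        double sum = 0.0;
        /* Inner reduction over all points j; this is the loop one
         * would expect the vectorizer to pick up. */
        for (size_t j = 0; j < n; j++) {
            double dx = positions[3 * i + 0] - positions[3 * j + 0];
            double dy = positions[3 * i + 1] - positions[3 * j + 1];
            double dz = positions[3 * i + 2] - positions[3 * j + 2];
            sum += dx * dx + dy * dy + dz * dz;
        }
        accel[i] = sum;
    }
}
```

Calling accel_seq on the three points (0,0,0), (1,0,0) and (0,2,0) fills accel with the per-point sums of squared distances.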

These are seq.c, which is the most basic; omp.c, which parallelises the outer
loop with OpenMP; and block.c, which uses the blocking optimisation. The first
version vectorizes as expected, but the other two do not: objdump -d omp.o |
grep ymm prints nothing.
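As a rough idea of what parallelising the outer loop with OpenMP can look like, here is a minimal sketch. It is not the attached omp.c and is not claimed to reproduce the missed vectorization: the function name accel_omp, the flat x,y,z layout, and the use of double are assumptions.

```c
#include <stddef.h>

/* Hypothetical sketch, not the attached omp.c.
 * Assumed layout: n points stored as x0,y0,z0,x1,y1,z1,...
 * The outer loop over i is shared among threads; each iteration
 * writes only accel[i], so no synchronisation is needed. */
void accel_omp(const double *positions, double *accel, size_t n)
{
    /* Ignored (the code then runs sequentially) unless compiled
     * with -fopenmp. */
    #pragma omp parallel for
    for (size_t i = 0; i < n; i++) {
        double sum = 0.0;
        for (size_t j = 0; j < n; j++) {
            double dx = positions[3 * i + 0] - positions[3 * j + 0];
            double dy = positions[3 * i + 1] - positions[3 * j + 1];
            double dz = positions[3 * i + 2] - positions[3 * j + 2];
            sum += dx * dx + dy * dy + dz * dz;
        }
        accel[i] = sum;
    }
}
```

The loop body is the same as in a sequential version; only the pragma on the outer loop differs.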

They are compiled with

gcc -c seq.c -Ofast -mavx2 -mfma -save-temps -Wall -Wextra -o seq.o -lm

gcc -c omp.c -Ofast -mavx2 -mfma -save-temps -Wall -Wextra -o omp.o -lm -fopenmp

gcc -c block.c -Ofast -mavx2 -mfma -save-temps -Wall -Wextra -o block.o -lm

and gcc -v gives the following:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/thomas/.local/libexec/gcc/x86_64-pc-linux-gnu/13.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ./configure --prefix=/home/thomas/.local
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (GCC)
