https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113358
            Bug ID: 113358
           Summary: OpenMP inhibits vectorization
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: thomas.koopman at ru dot nl
  Target Milestone: ---

Created attachment 57054
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57054&action=edit
The three different versions as well as preprocessed output.

I attached three programs that compute an array accel of points in R^3 from a
3d array of positions in R^3. accel[i] is
sum_j ||positions[i] - positions[j]||^2. These are seq.c which is the most
basic, omp.c which parallelises the outer loop with OpenMP and block.c which
uses the blocking optimisation.

The first version vectorizes as expected, but the other two do not.
objdump -d omp.o | grep ymm shows up empty.

They are compiled with

gcc -c seq.c -Ofast -mavx2 -mfma -save-temps -Wall -Wextra -o seq.o -lm
gcc -c omp.c -Ofast -mavx2 -mfma -save-temps -Wall -Wextra -o omp.o -lm -fopenmp
gcc -c block.c -Ofast -mavx2 -mfma -save-temps -Wall -Wextra -o block.o -lm

and gcc -v gives the following.

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/thomas/.local/libexec/gcc/x86_64-pc-linux-gnu/13.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ./configure --prefix=/home/thomas/.local
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (GCC)
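
For readers without the attachment, here is a minimal sketch of the kind of
kernel described above. It is not the attached seq.c or omp.c and has not been
checked to reproduce the missed vectorization. The function name accel_sketch,
the flat x,y,z layout of pos, and the scalar accel output are all assumptions
made for illustration.

```c
#include <stddef.h>

/* Hypothetical sketch, not the attached code.
 * Assumes pos holds n points stored as x0,y0,z0,x1,y1,z1,...
 * and that accel[i] = sum_j ||pos_i - pos_j||^2 is a scalar per point. */
void accel_sketch(size_t n, const double *restrict pos, double *restrict accel)
{
    /* With -fopenmp the outer loop is split across threads, as omp.c is
     * described to do; without it the pragma is skipped and the loop is
     * sequential, as in seq.c. */
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (size_t i = 0; i < n; i++) {
        double sum = 0.0;
        /* Inner reduction over all points: the loop one would expect
         * the vectorizer to handle. */
        for (size_t j = 0; j < n; j++) {
            double dx = pos[3 * i + 0] - pos[3 * j + 0];
            double dy = pos[3 * i + 1] - pos[3 * j + 1];
            double dz = pos[3 * i + 2] - pos[3 * j + 2];
            sum += dx * dx + dy * dy + dz * dz;
        }
        accel[i] = sum;
    }
}
```

Such a file can be compiled with the same flags as above, with and without
-fopenmp, and the two object files compared with objdump -d | grep ymm.
Adding -fopt-info-vec-missed makes GCC report the loops it did not vectorize.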