Renato Golin wrote:
> On 15 February 2014 19:26, Jakub Jelinek <ja...@redhat.com> wrote:
>> GCC supports #pragma GCC ivdep/#pragma simd/#pragma omp simd, the last one
>> can be used without rest of OpenMP by using -fopenmp-simd switch.
> Does the simd/omp have control over the tree vectorizer? Or are they
> just flags for the omp implementation?
As '#pragma omp simd' doesn't generate any threads and doesn't call the
OpenMP run-time library (libgomp), I would claim that it only controls
the tree vectorizer. (Hence, -fopenmp-simd was added as it permits this
control without enabling thread parallelization or dependence on libgomp
or libpthread.)
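
To illustrate the point above, here is a minimal sketch of a loop annotated
with '#pragma omp simd'. The function name saxpy and the build line are my
own assumptions for illustration, not something taken from GCC's sources or
documentation. Built with -fopenmp-simd, the pragma is honoured without
creating threads or linking libgomp; built without it, the pragma is
ignored and the loop still computes the same result.

```c
#include <stddef.h>

/* Hypothetical example; saxpy is an illustrative name.
   Assumed build line: gcc -O2 -fopenmp-simd -c saxpy.c */
void saxpy(size_t n, float a, const float *x, float *y)
{
    /* Ask the vectorizer to treat this loop as a SIMD loop.
       No threads are created and no OpenMP run-time call is made. */
    #pragma omp simd
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Whether the loop actually ends up vectorized still depends on the target and
on the cost model.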
Compiler vendors (and users) have differing views on whether the SIMD
pragmas should only give the compiler a hint or completely override the
compiler's heuristics. In the case of the Intel compiler, the user rules;
in the case of GCC, the pragma only influences the heuristics unless one
explicitly passes -fsimd-cost-model=unlimited (cf. also -Wopenmp-simd).
[Remark regarding '#pragma simd': I believe that pragma is only active
with -fcilkplus.]
>> I don't see why we would need more ways to do the same thing.
> Me neither! That's what I'm trying to avoid.
> Do you guys use those pragmas for everything related to the
> vectorizer? I found that the Intel pragmas (not just simd and omp) are
> pretty good fit to most of our needed functionality.
> Does GCC use Intel pragmas to control the vectorizer? Would be good to
> know how you guys did it, so that we can follow the same pattern.
As written by Jakub, only OpenMP's SIMD (requires: -fopenmp or
-fopenmp-simd), Cilk Plus's SIMD (-fcilkplus) and '#pragma GCC ivdep'
(always enabled) are supported.
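
For comparison, a minimal sketch of '#pragma GCC ivdep', which needs no
extra switch. With it, the programmer asserts that the loop has no
loop-carried dependences that would prevent vectorization. The function name
add_arrays is an assumption made up for this example.

```c
/* Hypothetical example; add_arrays is an illustrative name.
   The pragma is the programmer's promise that a, b and c do not
   overlap in a way that creates loop-carried dependences. */
void add_arrays(int n, int *a, const int *b, const int *c)
{
#pragma GCC ivdep
    for (int i = 0; i < n; ++i)
        a[i] = b[i] + c[i];
}
```

If the promise is wrong (for instance, a aliases b at an offset), the
vectorized result may differ from the scalar one, so the pragma should only
be used when the caller can guarantee it.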
As a user, I found Intel's pragmas interesting, but in the end I regarded
OpenMP's SIMD directives/pragmas as sufficient.
> Can GCC vectorize lexical blocks as well? Or just loops?
According to http://gcc.gnu.org/projects/tree-ssa/vectorization.html,
basic-block vectorization (SLP) has been supported since 2009.
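
As a rough sketch of what basic-block (SLP) vectorization targets: straight-
line code with no loop, where adjacent loads, operations and stores can be
packed into vector instructions. The function name add4 is made up for this
example, and I am assuming a build with -O3 or -ftree-slp-vectorize; whether
the block is actually vectorized depends on the GCC version and the target.

```c
/* Hypothetical example; add4 is an illustrative name.
   Four independent, adjacent element-wise additions in one basic block.
   'restrict' tells the compiler that out does not alias a or b. */
void add4(int *restrict out, const int *restrict a, const int *restrict b)
{
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
    out[2] = a[2] + b[2];
    out[3] = a[3] + b[3];
}
```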
Tobias