> -----Original Message-----
> From: Jakub Jelinek <ja...@redhat.com>
> Sent: 02 March 2022 15:25
> To: Stubbs, Andrew <andrew_stu...@mentor.com>
> Cc: gcc@gcc.gnu.org
> Subject: Re: OpenMP auto-simd
> 
> On Wed, Mar 02, 2022 at 03:12:30PM +0000, Stubbs, Andrew wrote:
> > Has anyone ever considered having GCC add the "simd" clause to offload (or
> > regular) loop nests automatically?
> >
> > For example, something like "-fomp-auto-simd" would transform "distribute
> > parallel" to "distribute parallel simd" automatically. Loop nests that
> > already contain "simd" clauses or directives would remain unchanged, most
> > likely.
> 
> I'm afraid we can't do that, at least not always.  The simd has various
> restrictions on what can appear inside of the body, etc. and we shouldn't
> reject valid code just because we decided to add simd automatically (even if
> the user asked for those through an option).
> So, it could be done only if we would do analysis that it is safe to do
> that.

In the general case there are undoubtedly issues, but I think the restrictions
listed in the OpenMP specification ought to be detectable, at least for inline
code. Is there one that is too hard to check, at least during the early passes?
I anticipate that version 1.0 wouldn't add the directive to regions that
include function calls (unless they are declared "simd" explicitly, perhaps),
although that would be nice to have later.
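
To make the distinction concrete, here is a rough sketch of the two kinds of
loop I have in mind. The function names are made up for illustration, and
"-fomp-auto-simd" remains a hypothetical option, not something GCC has today.
The first loop body is straight-line arithmetic with no calls, so it looks
like a safe candidate; the second contains a "critical" construct, which is
valid inside "parallel for" but is not among the constructs allowed inside a
simd region, so an automatic transformation would have to leave it alone.

```c
#include <stddef.h>

/* Candidate for an automatically added "simd" (hypothetical option):
   the body is straight-line arithmetic, with no function calls and no
   nested OpenMP constructs.  */
void saxpy(size_t n, float a, const float *x, float *y)
{
  #pragma omp target teams distribute parallel for \
          map(to: x[0:n]) map(tofrom: y[0:n])
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}

/* Not a candidate: "critical" is fine inside "parallel for", but it may
   not be encountered inside a simd region, so adding "simd" here would
   turn valid code into invalid code.  */
long count_negative(size_t n, const float *x)
{
  long count = 0;

  #pragma omp parallel for shared(count)
  for (size_t i = 0; i < n; i++)
    {
      if (x[i] < 0.0f)
        {
          #pragma omp critical
          count++;
        }
    }

  return count;
}
```

Both functions give the same results with or without OpenMP enabled; the only
question is whether the compiler could prove, early enough, that the first
loop meets the simd restrictions and the second does not.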

For AMD GCN it's always safe to set the "force_vectorize" flag on any given
loop (it's just the same as setting -ftree-vectorize for the whole program),
because the vectorizer will simply fail quietly later if it can't handle the
loop. For NVPTX this might be a bigger issue.

Is this really such a lost cause?

Thanks

Andrew

P.S. Ideally we'd do this the same way that other toolchains do it, such that 
omp_get_thread_num returns a number in the range 0..1023 rather than 0..15 
(AMD) or 0..32 (NVPTX) as we do now, but I think that's just impossible with 
the current implementation.
