> -----Original Message-----
> From: Jakub Jelinek <ja...@redhat.com>
> Sent: 02 March 2022 15:25
> To: Stubbs, Andrew <andrew_stu...@mentor.com>
> Cc: gcc@gcc.gnu.org
> Subject: Re: OpenMP auto-simd
>
> On Wed, Mar 02, 2022 at 03:12:30PM +0000, Stubbs, Andrew wrote:
> > Has anyone ever considered having GCC add the "simd" clause to offload (or
> > regular) loop nests automatically?
> >
> > For example, something like "-fomp-auto-simd" would transform "distribute
> > parallel" to "distribute parallel simd" automatically. Loop nests that
> > already contain "simd" clauses or directives would remain unchanged, most
> > likely.
>
> I'm afraid we can't do that, at least not always. The simd has various
> restrictions on what can appear inside of the body, etc. and we shouldn't
> reject valid code just because we decided to add simd automatically (even if
> the user asked for those through an option).
> So, it could be done only if we would do analysis that it is safe to do
> that.

In the general case there are undoubtedly issues, but I think the restrictions listed in the OpenMP document ought to be detectable, for at least the inline code. Is there one that is too hard, at least during the early passes?

I anticipate that version 1.0 wouldn't add the directive to regions that include function calls (unless declared "simd" explicitly, perhaps), although that would be nice to have later.

For AMD GCN it's always safe to set the "force_vectorize" flag on any given loop (it's just the same as setting -ftree-vectorize for the whole program) given that the vectorizer will simply quietly fail later. For NVPTX this might be a bigger issue.

Is this really such a lost cause?

Thanks

Andrew

P.S. Ideally we'd do this the same way that other toolchains do it, such that omp_get_thread_num returns a number in the range 0..1023 rather than 0..15 (AMD) or 0..32 (NVPTX) as we do now, but I think that's just impossible with the current implementation.
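
To make the proposal concrete, here is a minimal sketch in C of what the suggested option would amount to at the source level. It is an illustration only: "-fomp-auto-simd" is the option proposed in this thread, not an existing GCC flag, and the function names (saxpy_plain, saxpy_simd, count_above) are made up for the example. The first two functions show the same loop before and after the automatic rewrite. The third shows the kind of loop the analysis would have to leave alone, because a "critical" construct is not allowed inside a simd region.

```c
#include <assert.h>

/* As written by the user: no simd on the combined construct. */
void saxpy_plain(int n, float a, const float *x, float *y)
{
    assert(n >= 0);
#pragma omp target teams distribute parallel for \
    map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* What the proposed (hypothetical) option would treat the loop above as:
   the same construct with "simd" appended. The body has no OpenMP
   constructs and no function calls, so adding simd keeps it valid. */
void saxpy_simd(int n, float a, const float *x, float *y)
{
    assert(n >= 0);
#pragma omp target teams distribute parallel for simd \
    map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* A loop that must stay unchanged: the body contains a "critical"
   construct, which may not appear inside a simd region, so adding simd
   automatically would turn valid code into invalid code. */
int count_above(int n, const float *x, float threshold)
{
    int count = 0;
    assert(n >= 0);
#pragma omp parallel for shared(count)
    for (int i = 0; i < n; i++) {
        if (x[i] > threshold) {
#pragma omp critical
            count++;
        }
    }
    return count;
}
```

Built without -fopenmp the pragmas are ignored and all three functions run serially, so the two saxpy variants can be compared directly. With -fopenmp and no offload target configured, the target regions should fall back to the host.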