On Thu, 22 Oct 2015 10:05:30 +0200
Richard Biener <richard.guent...@gmail.com> wrote:

> On Thu, Oct 22, 2015 at 9:59 AM, Jakub Jelinek <ja...@redhat.com>
> wrote:
> > On Thu, Oct 22, 2015 at 09:49:29AM +0200, Richard Biener wrote:  
> >> >> Jakub, IYR I originally had IFN_FORK and IFN_JOIN as such
> >> >> distinct internal fns.  This replaces that scheme.
> >> >>
> >> >> ok?  
> >> >
> >> > Hmm, I'd just have used gimple_has_volatile_ops on the call?
> >> > That should have the
> >> > desired effects.  
> >>
> >> That is, whatever new IFNs you need are ok, but special-casing
> >> them is not necessary if you properly mark the calls as volatile.  
> >
> > I don't see gimple_has_volatile_ops used in tracer.c or
> > tree-ssa-threadedge.c.  Setting gimple_has_volatile_ops on those
> > IFNs is fine, but I think they are even stronger than that.  
> 
> Hmm, indeed.  Now I fail to see how the implemented property
> "preserves the CFG looping structure".  And I would have expected
> can_copy_bbs_p to be adjusted instead (catching more cases and the
> threading and tracer case as well).
> 
> As far as I can see nothing would prevent dissolving the loop by
> completely unrolling it for example.  Or deleting it because it has no
> side-effects.
> 
> So you'd need to be more precise as to what properties you are trying
> to preserve by placing a single stmt somewhere.

FWIW an earlier, abandoned attempt at solving the same problem was
discussed in the following thread, continuing through June:

  https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02647.html

Though the details of lowering OpenACC constructs have changed with
Nathan's current patches, the underlying problem remains the same: PTX
requires certain operations (bar.sync) to be executed uniformly by all
threads in a CTA.  IIUC this affects "JOIN" points across all
workers/vectors in a gang in particular.  This is generic code, though,
and other targets (GPUs especially) may have similar restrictions.
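
To make the concern concrete, here is a toy model (mine, in Python; it
is not taken from the patches, and every name in it is made up).  It
assumes the hazard is a pass such as the tracer or jump threading
duplicating the block that holds the JOIN into both arms of a divergent
branch, so that threads of one CTA arrive at different barrier
instructions:

```python
# Toy model only: all names are hypothetical, nothing here is GCC or PTX API.

def join_site_original(cond):
    """Both arms of the branch fall through to one shared JOIN (site 0)."""
    return 0


def join_site_duplicated(cond):
    """After block duplication each arm carries its own copy of the JOIN."""
    return 1 if cond else 2


def executed_uniformly(join_site, conds):
    """True if every modelled thread reaches the same barrier site."""
    return len({join_site(c) for c in conds}) == 1


# One entry per thread: the value of the branch condition in that thread.
cta = [True, False, True, False]

print(executed_uniformly(join_site_original, cta))
print(executed_uniformly(join_site_duplicated, cta))
```

For a single thread the duplicated form behaves exactly like the
original, which is presumably why the copy looks legal to the
optimizers; it only goes wrong once threads with differing conditions
have to meet at the same barrier.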

HTH,

Julian
