On Mon, Nov 5, 2018 at 7:29 AM Jan Hubicka <hubi...@ucw.cz> wrote:
>
> > On 11/5/18 7:21 AM, Jan Hubicka wrote:
> > >>
> > >> Did you mean "the nearest common dominator"?
> > >
> > > If the nearest common dominator appears in the loop while all uses are
> > > out of loops, this will result in suboptimal xor placement.
> > > In this case you want to split edges out of the loop.
> > >
> > > In general this is what the LCM framework will do for you if the problem
> > > is modelled in a similar way as in mode_switching.  At function entry the
> > > mode is "no zero register needed" and all conversions need mode "zero
> > > register needed".  Mode switching should then make the correct placement
> > > decisions (reaching the minimal number of executions of xor).
> > >
> > > Jeff, what is your opinion on the approach taken by the patch?
> > > It seems like a special case of a more general issue, but I do not see
> > > a very elegant way to solve it, at least in the GCC 9 horizon, so if
> > > the placement is correct we can probably go either with a new pass or
> > > with making this part of mode switching (which is run by the x86
> > > backend anyway).
> > So I haven't followed this discussion at all, but I did touch on this
> > issue a month or two ago with a target patch that was trying to avoid
> > the partial stalls.
> >
> > My assumption is that we're trying to find one or more places to
> > initialize the upper half of an avx register so as to avoid a partial
> > register stall at existing sites that set the upper half.
> >
> > This sounds like a classic PRE/LCM style problem (of which mode
> > switching is just another variant).  A common-dominator approach is
> > closer to a classic GCSE and is going to result in more initializations
> > at sub-optimal points than a PRE/LCM style.
>
> Yes, it is the usual code placement problem. It is a special case because
> the zero register is not modified by the conversion (we just need to have
> zero somewhere).  So basically we do not have kills of the zero except
> for the entry block.
>

Do you have a testcase to show that the nearest common dominator
in the loop, while all uses are out of loops, leads to suboptimal xor
placement?

-- 
H.J.
