[Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd

vmakarov at gcc dot gnu.org Fri, 16 Oct 2015 05:45:01 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609


--- Comment #8 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #7)
> (In reply to Richard Biener from comment #4)
> > (In reply to Uroš Bizjak from comment #3)
> > > The doc says:
> > > 
> > >           When used as an lvalue, 'subreg' is a word-based accessor.
> > >           Storing to a 'subreg' modifies all the words of REG that
> > >           overlap the 'subreg', but it leaves the other words of REG
> > >           alone.
> > 
> > But UNITS_PER_WORD is 8 so (subreg:DF (TI)) should leave the upper half
> > of the TImode register unchanged.
> 
> Indeed, and -m32 creates correct code. So, it is register allocator that
> fails.
> 
> Reconfirmed as rtl-optimization problem.

This is quite an interesting PR; it reveals a long-standing latent bug in GCC.

Basically, before LRA we have:

    2: r90:DF=xmm0:DF
      REG_DEAD xmm0:DF
    3: NOTE_INSN_FUNCTION_BEG
    6: r89:TI=[`reg']
    7: r89:TI#0=r90:DF
      REG_DEAD r90:DF
    8: [`reg']=r89:TI#0

Both LRA and the reload pass produce:

    6: xmm1:TI=[`reg']
    7: xmm1:DF=xmm0:DF
    8: [`reg']=xmm1:V2DF

LRA and reload do not perform any transformation except replacing the subreg
of the hard register in insn #7.  After that, insn #6 is removed as dead by
subsequent optimizations, because insn #7 now looks like a set of the whole of
xmm1.  To avoid removing insn #6, we need to keep the subreg until the final
pass:

    7: xmm1:TI#0=xmm0:DF
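
To make the difference concrete, here is a rough toy model in plain C, not GCC
code.  It treats the 128-bit register as two 64-bit words (UNITS_PER_WORD is 8)
and compares the sequence with the subreg kept against the sequence after
insn #6 has been deleted.  The names reg128, store_low_subreg, store_whole_reg
and POISON are made up for this illustration, and POISON only stands in for
whatever stale value happens to be in the upper half of xmm1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit register as two 64-bit words. */
typedef struct { uint64_t word[2]; } reg128;

/* Made-up marker for "stale contents of the upper half of xmm1". */
#define POISON UINT64_C(0xDEADBEEFDEADBEEF)

/* Return the bit pattern of a double. */
uint64_t bits_of(double d)
{
    uint64_t u;
    assert(sizeof u == sizeof d);
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Insn 7 with the subreg kept (xmm1:TI#0 = xmm0:DF):
   only word 0 is modified, word 1 is left alone. */
void store_low_subreg(reg128 *r, double value)
{
    r->word[0] = bits_of(value);
}

/* Insn 7 with the subreg dropped (xmm1:DF = xmm0:DF), as later passes
   see it: a set of the whole register, so the old upper word is not
   considered live.  The model overwrites it with POISON. */
void store_whole_reg(reg128 *r, double value)
{
    r->word[0] = bits_of(value);
    r->word[1] = POISON;
}

/* Insns 6, 7, 8 with the subreg kept: load, partial store, store back. */
reg128 sequence_with_subreg(reg128 mem, double b)
{
    reg128 r = mem;              /* insn 6: xmm1:TI = [reg]  */
    store_low_subreg(&r, b);     /* insn 7                   */
    return r;                    /* insn 8: [reg] = xmm1     */
}

/* The same sequence after insn 6 has been removed as dead. */
reg128 sequence_without_subreg(reg128 mem, double b)
{
    reg128 r;
    (void)mem;                   /* insn 6 deleted: memory is never loaded */
    store_whole_reg(&r, b);      /* insn 7                                 */
    return r;                    /* insn 8                                 */
}
```

In this model sequence_with_subreg returns the original upper word unchanged,
while sequence_without_subreg returns POISON there, which corresponds to the
wrong code reported in this PR.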

Why do LRA and reload remove subregs of hard registers?  Because some
subsequent optimizations cannot handle them.

For the last two days I've been struggling to find a solution that involves
only LRA (removing subregs of hard regs only partially), but so far without
success.

In any case, even if I find such a solution in LRA, it will need extensive
testing on other targets, so it will probably be ready next week at best.

[Bug rtl-optimization/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
