[Bug rtl-optimization/67124] [6 Regression] wrong code at -O1, -O2 and -O3 on x86_64-linux-gnu

vmakarov at gcc dot gnu.org Wed, 07 Oct 2015 11:07:18 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67124


--- Comment #9 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to rsand...@gcc.gnu.org from comment #6)
> (In reply to Uroš Bizjak from comment #5)
> > Wrong expansion, adding CC.
> 
> The expand code looks OK to me.  Assigning to one DImode word
> of a TImode isn't supposed to change the other half.
> 
Agree.

The GCC internals documentation says:

"          When used as an lvalue, 'subreg' is a word-based accessor.
          Storing to a 'subreg' modifies all the words of REG that
          overlap the 'subreg', but it leaves the other words of REG
          alone.

          When storing to a normal 'subreg' that is smaller than a word,
          the other bits of the referenced word are usually left in an
          undefined state.  This laxity makes it easier to generate
          efficient code for such instructions.  To represent an
          instruction that preserves all the bits outside of those in
          the 'subreg', use 'strict_low_part' or 'zero_extract' around
          the 'subreg'.
"

Since the subreg itself is a word, the rest of the reg should be preserved.
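
To make the documented rule concrete, here is a minimal sketch in plain C.
It is only a model, not GCC code: the type ti_reg and the function
store_word_subreg are hypothetical names, and a TImode pseudo is assumed to
be just two 64-bit (DImode) words.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a TImode pseudo: two 64-bit (DImode) words.
   word[0] is the low half, word[1] is the high half. */
typedef struct {
    uint64_t word[2];
} ti_reg;

/* Model of a store to a word-sized subreg of the TImode value, i.e.
   something like (set (subreg:DI (reg:TI r) N) value).  Per the quoted
   documentation, only the word that overlaps the subreg is modified;
   the other word is left alone. */
static void store_word_subreg(ti_reg *r, unsigned word_index, uint64_t value)
{
    assert(word_index < 2);      /* a TImode value has exactly two words */
    r->word[word_index] = value; /* the other word is not touched */
}
```

With this model, storing to word 0 must leave word 1 exactly as it was,
which is the behaviour the expander relies on.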

> I think the problem is in LRA.  It tries to reload the low half
> of the TImode as follows:
> 
>       Creating newreg=104, assigning class NO_REGS to secondary r104
>    51: r104:DI=r103:DI
>     Inserting the sec. move after:
>    52: r90:TI#0=r104:DI
> 
> then allocates it as an xmm<-mem move:
> 
>          Choosing alt 14 in insn 52:  (0) *v  (1) m {*movdi_internal}
>


> That isn't right because the move won't preserve the high half
> of the xmm register.  It would need to be a strict_lowpart to do that.

This is the same kind of word subreg move, so the rest of the reg bits should
be preserved, as you wrote about the expander above.

They are not preserved because the move is implemented by movq, which zeroes
the remaining bits.  I guess using movq is wrong here.
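
The mismatch can be sketched with a small C model.  This is an illustration
only, not what GCC or the hardware literally executes: xmm_model,
model_movq_load, model_word_store and high_half_preserved are hypothetical
names.  The one hardware fact assumed is that a movq load from memory into
an xmm register zeroes bits 64..127 of the destination.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit xmm register as two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} xmm_model;

typedef void (*store_fn)(xmm_model *dst, const uint64_t *src);

/* Model of "movq xmm, m64": loads the low 64 bits and zeroes the rest. */
static void model_movq_load(xmm_model *dst, const uint64_t *src)
{
    dst->lo = *src;
    dst->hi = 0;
}

/* Model of what the RTL word subreg store asks for: only the low
   word changes, the high word keeps its old value. */
static void model_word_store(xmm_model *dst, const uint64_t *src)
{
    dst->lo = *src;
}

/* Returns 1 if the given store writes the low half and leaves the
   high half intact, 0 otherwise. */
static int high_half_preserved(store_fn store)
{
    assert(store != 0);
    xmm_model reg = { 0x1111111111111111ULL, 0x2222222222222222ULL };
    uint64_t mem = 0xdeadbeefULL;
    store(&reg, &mem);
    return reg.lo == mem && reg.hi == 0x2222222222222222ULL;
}
```

In this model high_half_preserved(model_word_store) holds while
high_half_preserved(model_movq_load) does not, which is the difference
between what insn 52 means in RTL and what the chosen alternative does.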

I am in a difficult position.  Redefining movdi_internal could have a big,
unpredictable effect on a major platform.  I could implement this using
strict_low_part for such cases, but that could also have a big effect on
other targets.


(In reply to Uroš Bizjak from comment #7)
> (In reply to rsand...@gcc.gnu.org from comment #6)
>  
> > The expand code looks OK to me.  Assigning to one DImode word
> > of a TImode isn't supposed to change the other half.
> > 
> > I think the problem is in LRA.  It tries to reload the low half
> > of the TImode as follows:
> 
> Thanks for your analysis!
> 
> Reconfirmed as RA problem, adding another CC.

By the way, I've checked what reload would do in this case.  It generates the
same (wrong) code as LRA.  So if this is an LRA bug, the same bug is present
in the original reload.
