On Wed, Oct 21, 2020 at 01:27:42PM +1030, Alan Modra wrote:
> On Tue, Oct 20, 2020 at 01:55:56PM -0500, Segher Boessenkool wrote:
> > On Thu, Oct 08, 2020 at 09:27:54AM +1030, Alan Modra wrote:
> > > The existing "case AND" in this function is not sufficient for
> > > optabs.c:avoid_expensive_constant usage, where the AND is passed in
> > > outer_code.  We'd like to cost AND of rs6000_is_valid_and_mask
> > > or rs6000_is_valid_2insn_and variety there, so that those masks aren't
> > > seen as expensive (ie. better to load to a reg then AND).
> > > 
> > >   * config/rs6000/rs6000.c (rs6000_rtx_costs): Combine CONST_INT
> > >   AND handling with IOR/XOR.  Move costing for AND with
> > >   rs6000_is_valid_and_mask or rs6000_is_valid_2insn_and to
> > >   CONST_INT.
> > 
> > Sorry this took so long to review :-(
> > 
> > On 64-bit BE this leads to *bigger* code, and closer observation shows
> > that some common sequences degrade on all configs.  This seems to mostly
> > be about "andc" (and its dot form).  It wasn't costed properly before,
> > but after your patch, a single instruction is replaced by three.
> > 
> > Could you look into this?
> 
> ~/build/gcc-alan/gcc$ for z in *.o; do if test `objdump -dr $z | grep andc | 
> wc -l` != `objdump -dr ../../gcc/gcc/$z | grep andc | wc -l`; then echo $z; 
> fi; done
> gimplify.o
> insn-emit.o
> insn-opinit.o
> insn-recog.o
> rs6000-string.o
> 
> All of these are exactly the case I talked about in
> https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553919.html

For a kernel build (my testcase) it happens more often.

> "Sometimes correct insn cost leads to unexpected results.  For
> example:
> 
> extern unsigned bar (void);
> unsigned
> f1 (unsigned a)
> {
>   if ((a & 0x01000200) == 0x01000200)
>     return bar ();
>   return 0;
> }
> 
> emits for a & 0x01000200
>  (set (reg) (and (reg) (const_int 0x01000200)))
> at expand time (two rlwinm insns) rather than the older
>  (set (reg) (const_int 0x01000200))
>  (set (reg) (and (reg) (reg)))

And that is bad.  Why on earth does expand "optimise" this?  It should
not, it hinders various *real* optimisations!

> which is three insns.  However, since 0x01000200 is needed later the
> older code after optimisation is smaller."
> 
> Things have changed slightly since I wrote the above, with the two
> rlwinm insns being emitted at expand time, so you see
>  (set (reg) (and (reg) (const_int 0xffffffffff0003ff)))
>  (set (reg) (and (reg) (const_int 0x0000000001fffe00)))

It has done that for many years?

> but of course that doesn't change anything regarding the cost of
> "a & 0x01000200".

Yeah.  But the problem is that cost that are "better", "closer to
reality", sometimes result in worse results :-(

Anyway:

+              || (outer_code == AND
+                  && rs6000_is_valid_2insn_and (x, mode)))
        {
          *total = COSTS_N_INSNS (1);
          return true;

It should return COSTS_N_INSNS (2) for that?

Testing with that now.


Segher

Reply via email to