On Wed, Mar 16, 2011 at 01:19:30PM +0100, Paolo Bonzini wrote:
> Jakub's behavior is expected, since the number of active local
> stores should be 1 with the earlier patch in that testcase.
> However, similar artificial testcases can probably be constructed
> that still trigger the ill behavior.

Note the original testcase wasn't so artificial, some customer was actually
using it to create very large functions, presumably to test linker behavior
or whatever.  In the original there were even more stores in each function,
many of those functions and several sources with them, so it took many hours
in DSE.

Here is an artificial testcase which shows that just the first patch isn't
enough:

/* PR rtl-optimization/48141 */
/* { dg-do compile } */
/* { dg-options "-O -fno-tree-fre" } */

#define A(n) volatile int n = 0;
#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) 
A(n##8) A(n##9)
#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) 
B(n##8) B(n##9)
#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) 
C(n##8) C(n##9)
#define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) 
D(n##8) D(n##9)
#define F(n) E(n##0) E(n##1) E(n##2) E(n##3) E(n##4) E(n##5) E(n##6) E(n##7) 
E(n##8) E(n##9)

int
foo (void)
{
  F(i)
  return 0;
}

with both patches in -O -fno-tree-fre --param=max-dse-active-local-stores=10
this takes 16.68 seconds to compile, out of which 0.46s is spent in
DSE1+DSE2.
With -O -fno-tree-fre --param=max-dse-active-local-stores=10000000 it
compiles in in 165.80s, DSE1+DSE2 time is 149.55s.  -fno-tree-fre is
here because that pass (in particular alias stmt walking from fre) is
terribly expensive as well on this testcase.

        Jakub

Reply via email to