> That's interesting.  Your placement at
> 
>           NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
>           NEXT_PASS (pass_phiopt, true /* early_p */);
> +         NEXT_PASS (pass_sccp);
> 
> and
> 
>        NEXT_PASS (pass_tsan);
>        NEXT_PASS (pass_dse, true /* use DR analysis */);
>        NEXT_PASS (pass_dce);
> +      NEXT_PASS (pass_sccp);
> 
> isn't immediately after the "best" existing pass we have to
> remove dead PHIs which is pass_cd_dce.  phiopt might leave
> dead PHIs around and the second instance runs long after the
> last CD-DCE.
> 
> So I wonder if your pass just detects unnecessary PHIs we'd have
> removed by other means and what survives until RTL expansion is
> what we should count?
> 
> Can you adjust your original early placement to right after
> the cd-dce pass and for the late placement turn the dce pass
> before it into cd-dce and re-do your measurements?

So I changed the early placement to this:

          NEXT_PASS (pass_dse);
          NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
          NEXT_PASS (pass_sccp);
          NEXT_PASS (pass_phiopt, true /* early_p */);
          NEXT_PASS (pass_tail_recursion);

and the late placement to this:

      NEXT_PASS (pass_dse, true /* use DR analysis */);
      NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
      NEXT_PASS (pass_sccp);
      /* Pass group that runs when 1) enabled, 2) there are loops

and got these results:

500.perlbench_r
Started with (1) 30318
Ended with (1) 26219
Removed PHI % (1) 13.52002110957187149600
Started with (2) 39043
Ended with (2) 38941
Removed PHI % (2) .26125041620777092000

502.gcc_r
Started with (1) 148361
Ended with (1) 140464
Removed PHI % (1) 5.32282742769326170700
Started with (2) 216209
Ended with (2) 215367
Removed PHI % (2) .38943799749316633500

505.mcf_r
Started with (1) 342
Ended with (1) 304
Removed PHI % (1) 11.11111111111111111200
Started with (2) 437    
Ended with (2) 433    
Removed PHI % (2) .91533180778032036700    
     
523.xalancbmk_r    
Started with (1) 62995    
Ended with (1) 58289     
Removed PHI % (1) 7.47043416144138423700    
Started with (2) 134026    
Ended with (2) 133193    
Removed PHI % (2) .62152119737961291100    
                      
531.deepsjeng_r    
Started with (1) 1402    
Ended with (1) 1264    
Removed PHI % (1) 9.84308131241084165500    
Started with (2) 1928    
Ended with (2) 1920    
Removed PHI % (2) .41493775933609958600    
    
541.leela_r    
Started with (1) 3398    
Ended with (1) 3060    
Removed PHI % (1) 9.94702766333137139500    
Started with (2) 4473    
Ended with (2) 4453    
Removed PHI % (2) .44712720769058797300    

557.xz_r
Started with (1) 47
Ended with (1) 44
Removed PHI % (1) 6.38297872340425532000
Started with (2) 43
Ended with (2) 43
Removed PHI % (2) 0

These measurements don't differ much from the previous ones. It seems to me
that phiopt does leave some redundant PHIs behind, but the vast majority of the
eliminated PHIs are generated by earlier passes, and cd_dce isn't able to get
rid of them.
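
For reference, the "Removed PHI %" figures can be reproduced from the
"Started with" and "Ended with" counts. This is only a sketch; the function
name is made up for illustration, and it assumes the percentage is taken
relative to the starting count, which matches the numbers listed above:

```python
def removed_phi_percent(started, ended):
    """Percentage of PHIs removed, relative to the starting count.

    Hypothetical helper, not part of the patch or of the measurement script.
    """
    if started == 0:
        # Nothing to remove; avoid dividing by zero.
        return 0.0
    return (started - ended) / started * 100.0


# 505.mcf_r, early instance (1): 342 PHIs before, 304 after.
print(removed_phi_percent(342, 304))
# 557.xz_r, late instance (2): 43 PHIs before and after, so nothing removed.
print(removed_phi_percent(43, 43))
```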

It may be worth noting that most of the eliminated PHIs are actually trivial
PHIs. I consider a PHI to be trivial if it only references itself or one other
SSA name.
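
To make that definition concrete, here is a minimal sketch of the check on a
toy model, where a PHI is just its result name plus a list of argument names.
The function and the SSA names are invented for illustration; this is not the
code from the patch and it does not use GCC's GIMPLE API:

```python
def is_trivial_phi(result, args):
    """Return True if the PHI only references itself or one other SSA name.

    Toy model: `result` is the name the PHI defines, `args` are the names
    of its arguments. Both are plain strings here (an assumption).
    """
    # Drop self-references, then count the distinct operands that remain.
    others = set(args) - {result}
    return len(others) <= 1


# x_3 = PHI <x_1, x_3>: a self-reference plus one other name, so trivial.
print(is_trivial_phi("x_3", ["x_1", "x_3"]))
# y_4 = PHI <y_1, y_2>: two distinct other names, so not trivial.
print(is_trivial_phi("y_4", ["y_1", "y_2"]))
```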

Here is a comparison of the newest measurements (sccp after cd_dce) with the
previous ones (sccp after phiopt and dce):

500.perlbench_r
 
Started with (1-PREV) 30287
Started with (1-NEW) 30318
 
Ended with (1-PREV) 26188
Ended with (1-NEW) 26219
 
Removed PHI % (1-PREV) 13.53385941162875161000
Removed PHI % (1-NEW) 13.52002110957187149600
 
Started with (2-PREV) 38005
Started with (2-NEW) 39043
 
Ended with (2-PREV) 37897
Ended with (2-NEW) 38941
 
Removed PHI % (2-PREV) .28417313511380081600
Removed PHI % (2-NEW) .26125041620777092000
 
502.gcc_r
 
Started with (1-PREV) 148187
Started with (1-NEW) 148361
 
Ended with (1-PREV) 140292
Ended with (1-NEW) 140464
 
Removed PHI % (1-PREV) 5.32772780338356266100
Removed PHI % (1-NEW) 5.32282742769326170700
                      
Started with (2-PREV) 211479
Started with (2-NEW) 216209
 
Ended with (2-PREV) 210635
Ended with (2-NEW) 215367
 
Removed PHI % (2-PREV) .39909399987705635100
Removed PHI % (2-NEW) .38943799749316633500


Filip K


P.S. I made a small mistake in the previous email and didn't compute the
benchmark speedup percentages correctly. Here are the corrected results. The
correct percentages are a little bit smaller but very similar. There is still a
~2% speedup on 505.mcf_r and 541.leela_r.

500.perlbench_r
Without SCCP: 244.151807s
With SCCP: 242.448438s
-0.6976679881791663%

502.gcc_r
Without SCCP: 211.029606s
With SCCP: 211.614523s
+0.27717295742853737%

505.mcf_r
Without SCCP: 298.782621s
With SCCP: 291.671468s
-2.380042378703145%

523.xalancbmk_r
Without SCCP: 189.940639s
With SCCP: 189.876261s
-0.03389374719330334%

531.deepsjeng_r
Without SCCP: 250.63648s
With SCCP: 250.988624s
+0.14049989849840747%

541.leela_r
Without SCCP: 346.066278s
With SCCP: 339.692987s
-1.8416388435281157%
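
The percentages above are consistent with the runtime change taken relative to
the run without SCCP, so a negative value means the SCCP build was faster. A
small sketch of that calculation follows; the function name is hypothetical
and the formula is inferred from the listed numbers:

```python
def runtime_change_percent(without_sccp_s, with_sccp_s):
    """Relative runtime change in percent, using the run without SCCP
    as the baseline. Negative means the run with SCCP was faster.

    Hypothetical helper; the formula is inferred from the results above.
    """
    return (with_sccp_s - without_sccp_s) / without_sccp_s * 100.0


# 505.mcf_r: 298.782621s without SCCP, 291.671468s with SCCP.
print(runtime_change_percent(298.782621, 291.671468))
# 502.gcc_r: 211.029606s without SCCP, 211.614523s with SCCP (slightly slower).
print(runtime_change_percent(211.029606, 211.614523))
```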
