On Fri, Mar 20, 2015 at 9:50 AM, Andreas Krebbel <kreb...@linux.vnet.ibm.com> wrote:
> On 03/18/2015 12:04 PM, Richard Biener wrote:
>> On Tue, Mar 17, 2015 at 7:29 PM, Jeff Law <l...@redhat.com> wrote:
>>> On 03/17/2015 02:17 AM, Andreas Krebbel wrote:
>>>>
>>>>
>>>> Just to have some numbers I did run a -j1 GCC bootstrap twice with and
>>>> without the patch on x86_64.
>>>> Best results for both are:
>>>>
>>>> clean: 21459s
>>>> patched: 21314s
>>>>
>>>> There rather appears to be a trend towards reduced compile time perhaps
>>>> due to the reduced number of INSNs to be processed in the RTL passes
>>>> between the two ifcvt runs (loop optimization, combine, fwprop,
>>>> dse,...)?!
>>>>
>>>> I also tried to measure the testsuite runs but the results show a big
>>>> variance. So what I have right now does not qualify as a benchmark.
>>>
>>> And reality is it's getting harder and harder to benchmark this kind of
>>> thing with turbo modes and such. A single run isn't sufficient unless
>>> you've locked the box into a particular cpu frequency.
>>
>> For the particular patch I wonder if you really need to change all
>> three if-conversion pass instances or if changing the one before
>> combine (pass_rtl_ifcvt, thus rest_of_handle_if_conversion) is enough.
> Right. For this particular case it would be good enough to do it only in the
> first ifcvt run. But perhaps there are cases where later passes get confused
> by the leftovers from ifcvt?

Well, "perhaps" ...

>> That already runs an unconditional (huh...) cleanup_cfg (0) at the end
>> which could be changed so that DCE is performed (CLEANUP_EXPENSIVE,
>> runs delete_trivially_dead_insns).
>>
>> At least that makes the patch smaller and its impact restricted to
>> one of the three ifcvt passes.
> This does not seem to work. The DCE run by cleanup_cfg only deals with dead
> pseudos. In my case it is a set to the CC hard reg which becomes dead.

I see.

>> OTOH ifcvt performs a DCE at its start (to be not confused by dead
>> instructions I guess), so why doesn't combine do that as well
>> (oh, it does!?)?
> When reaching combine the LR solution is clean and therefore also DCE isn't
> performed. This is because ifcvt disables running DCE with LR. So LR is
> always clean after ifcvt although there are still dead insns left.

Ok...

>> And maybe _that_ DCE can be removed as if_convert ()
>> already performs a DF_LR_RUN_DCE on the first pass.
> You mean removing the DCE run in combine? That one probably can go away then
> given that the passes running between ifcvt and combine (loop and fwprop)
> get rid of dead insns properly.

So I think the best solution is still removing the dead code manually in ifcvt.
Then there is also pass_ud_rtl_dce run before the combine pass - if that
isn't enough because of clean solutions or whatever then that should be
strengthened - it was most definitely designed to fix the issue you see.

Richard.

> -Andreas-
>
>>
>> Richard.
>>
>>> jeff
>>>>
>>>>
>>>
>>
>
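
For readers following the thread, here is a toy model of the point about cleanup_cfg's DCE only handling dead pseudos. It is a sketch and not GCC code: the `Insn` type, the `dce` function, the `HARD_REGS` set and the register names are all hypothetical, and it only covers a single straight-line block. It shows how a pass that refuses to delete sets of hard registers leaves a dead CC set behind, while a liveness-based pass that treats all registers alike removes it.

```python
from typing import NamedTuple


class Insn(NamedTuple):
    dest: str     # register written by the instruction
    srcs: tuple   # registers read by the instruction


# Hypothetical: a single hard condition-code register; everything else is a pseudo.
HARD_REGS = {"cc"}


def dce(insns, live_out, pseudos_only):
    """Scan one straight-line block backwards and drop insns whose dest is dead.

    With pseudos_only=True, dead sets of hard registers are kept, which
    mimics (in a very simplified way) a DCE that only looks at pseudos.
    """
    live = set(live_out)
    kept = []
    for insn in reversed(insns):
        dead = insn.dest not in live
        if dead and not (pseudos_only and insn.dest in HARD_REGS):
            continue  # delete the dead instruction
        # The instruction stays: its dest is defined here, its sources become live.
        live.discard(insn.dest)
        live.update(insn.srcs)
        kept.append(insn)
    kept.reverse()
    return kept


block = [
    Insn("cc", ("p1", "p2")),  # compare setting CC; its only user is assumed gone
    Insn("p3", ("p1",)),       # dead set of a pseudo
    Insn("p4", ("p1", "p2")),  # value that is live out of the block
]

after_pseudo_dce = dce(block, live_out={"p4"}, pseudos_only=True)
after_full_dce = dce(block, live_out={"p4"}, pseudos_only=False)
```

In this model the pseudo-only variant removes the `p3` set but keeps the dead `cc` set, whereas the full variant leaves only the `p4` instruction.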