Thanks Honza!

I have committed the changes (lookahead for the default setting):
http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=204442
I will add lookahead value 8 for -O3 after experimenting with it.

Regards
Ganesh

-----Original Message-----
From: Jan Hubicka [mailto:hubi...@ucw.cz]
Sent: Wednesday, October 30, 2013 1:54 AM
To: Richard Biener
Cc: Jan Hubicka; Gopalasubramanian, Ganesh; gcc-patches@gcc.gnu.org; Uros Bizjak (ubiz...@gmail.com); H.J. Lu (hjl.to...@gmail.com)
Subject: Re: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips

> On Fri, 25 Oct 2013, Jan Hubicka wrote:
>
> > > > OK, so it is about 2%. Did you try if you need lookahead even in the
> > > > early pass (before reload)? My guess would be so, but if not, it could
> > > > cut the cost to half. For -Ofast/-O3 it looks reasonable to me, but we
> > > > will need to announce it on the ML. For other settings I think we
> > > > need to work on more improvements or cut the expenses.
> > >
> > > Yes, it is required before reload.
> > >
> > > I have another idea which can be pondered upon. Currently, can we enable
> > > lookahead with the value 4 (pre reload) for default? This will
> > > exponentially cut the cost of build time.
> > > I have done some measurements on the build time of some benchmarks
> > > (mentioned below) with lookahead value 4. The 2% increase in build time
> > > with value 8 is now almost gone.
> > >
> > >                  dfa4     no_lookahead
> > >
> > > perlbench   -    191s     193s
> > > bzip2       -    19s      19s
> > > gcc         -    429s     429s
> > > mcf         -    3s       3s
> > > gobmk       -    116s     115s
> > > hmmer       -    60s      60s
> > > sjeng       -    18s      17s
> > > libquantum  -    6s       6s
> > > h264ref     -    107s     107s
> > > omnetpp     -    128s     128s
> > > astar       -    7s       7s
> > > bwaves      -    5s       5s
> > > gamess      -    1964s    1957s
> > > milc        -    18s      18s
> > > GemsFDTD    -    273s     272s
> > >
> > > Lookahead value 4 also helps because the modified decoder model in
> > > bdver3.md is only two cycles deep (though in hardware it is actually 4
> > > cycles deep). This means that we can look another two levels deep for
> > > a better schedule.
> > > GemsFDTD still retains the performance boost of around 6-7% with value 4.
> > >
> > > Let me know your thoughts.
> >
> > This seems reasonable. I would go for lookahead of 4 for now and 8
> > for -Ofast and we can tune things based on the experience with this setting
> > incrementally.
> > Uros, Richard, what do you think?
>
> Well, certainly -O3 not -Ofast.

Yes, enabling 4 by default and 8 at -O3 seems fine to me.

Honza

>
> Richard.