On Mon, Aug 7, 2017 at 11:10 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Fri, Jun 23, 2017 at 12:04 PM, Richard Biener
> <richard.guent...@gmail.com> wrote:
>> On Fri, Jun 23, 2017 at 10:47 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>>> On Fri, Jun 23, 2017 at 6:04 AM, Jeff Law <l...@redhat.com> wrote:
>>>> On 06/07/2017 02:07 AM, Bin.Cheng wrote:
>>>>> On Tue, Jun 6, 2017 at 6:47 PM, Jeff Law <l...@redhat.com> wrote:
>>>>>> On 06/02/2017 05:52 AM, Bin Cheng wrote:
>>>>>>> Hi,
>>>>>>> This patch enables -ftree-loop-distribution by default at -O3 and above
>>>>>>> optimization levels.
>>>>>>> Bootstrap and test at O2/O3 on x86_64 and AArch64. Is it OK?
>>>>>>>
>>>>>>> Note I don't have a strong opinion here and am fine with it being
>>>>>>> either accepted or rejected.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> bin
>>>>>>> 2017-05-31  Bin Cheng  <bin.ch...@arm.com>
>>>>>>>
>>>>>>>     * opts.c (default_options_table): Enable
>>>>>>>     OPT_ftree_loop_distribution
>>>>>>>     for -O3 and above levels.
>>>>>> I think the question is how does this generally impact the performance
>>>>>> of the generated code and to a lesser degree compile-time.
>>>>>>
>>>>>> Do you have any performance data?
>>>>> Hi Jeff,
>>>>> At this stage of the patch, only hmmer is impacted, and it is clearly
>>>>> improved in my local run of spec2006 for x86_64 and AArch64. In the
>>>>> long term, loop distribution is also a prerequisite transformation to
>>>>> handle bwaves (at least). For these two impacted cases, it helps to
>>>>> resolve the gap against ICC. I didn't check the compile-time
>>>>> slowdown; we can restrict it to problems with a small partition
>>>>> number if that's an issue.
>>>> Just a note. I know you've iterated further with Richi -- I'm not
>>>> objecting to the patch, nor was I ready to approve.
>>>>
>>>> Are you and Richi happy with this as-is or are you looking to submit
>>>> something newer based on the conversation the two of you have had?
>>> Hi Jeff,
>>> The patch series is updated in various ways according to review
>>> comments, for example, it restricts compilation time by checking
>>> the number of data references against MAX_DATAREFS_FOR_DATADEPS as
>>> well as restores the data dependence cache. There are still two
>>> missing parts I'd like to do as followup patches: one is loop nest
>>> distribution and the other is a data-locality cost model (at least)
>>> for small cases. Now Richi approved most patches except the last
>>> major one, but I still need another iteration for some (approved)
>>> patches in order to fix mistakes/typos introduced when I was
>>> separating the patch.
>>
>> The patch is ok after the approved parts of the ldist series have been
>> committed.
>> Note your patch lacks updates to invoke.texi (what options are enabled at
>> -O3).
>> Please adjust that before committing.
> Hi All,
> Given that the loop distribution patches have been merged for a while
> and a couple of issues have been fixed, I am submitting an updated
> patch to enable the pass by default at O3/above levels.
> Bootstrap and test on x86_64 and AArch64 ongoing. Hmmer still can be
> improved. Is it OK if there are no failures?

Ok.

Thanks,
Richard.

> Thanks,
> bin
> 2017-08-07  Bin Cheng  <bin.ch...@arm.com>
>
>     * doc/invoke.texi: Document -ftree-loop-distribution for O3.
>     * opts.c (default_options_table): Add OPT_ftree_loop_distribution.
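
For readers unfamiliar with the transformation being enabled, here is a minimal hand-written sketch of what loop distribution means in general. It is not taken from the patch or from GCC's test suite: the function names fused, distributed and results_match are made up for illustration, and the split is done by hand. Whether the pass actually splits a particular loop depends on its dependence analysis and cost model.

```c
#include <stddef.h>
#include <string.h>

/* Before: one loop body mixes two independent pieces of work.
   Assumption: a, b and c do not overlap, and b has room for n + 1 elements. */
void fused(int *restrict a, int *restrict b, const int *restrict c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = 0;                /* no dependence between iterations */
        b[i + 1] = b[i] + c[i];  /* each iteration needs the previous b */
    }
}

/* After: the same work split by hand into two partitions, the way a
   loop distribution pass might split it (illustrative only). */
void distributed(int *restrict a, int *restrict b, const int *restrict c, size_t n)
{
    memset(a, 0, n * sizeof a[0]);   /* first partition: plain zero fill */
    for (size_t i = 0; i < n; i++)   /* second partition: serial recurrence */
        b[i + 1] = b[i] + c[i];
}

/* Hypothetical helper: returns 1 when both versions leave identical
   contents in a and b for the same inputs. */
int results_match(void)
{
    enum { N = 64 };
    int a1[N], a2[N], b1[N + 1], b2[N + 1], c[N];

    for (int i = 0; i < N; i++) {
        a1[i] = a2[i] = i + 1;
        c[i] = 3 * i - 7;
    }
    memset(b1, 0, sizeof b1);
    memset(b2, 0, sizeof b2);
    b1[0] = b2[0] = 1;

    fused(a1, b1, c, N);
    distributed(a2, b2, c, N);

    return memcmp(a1, a2, sizeof a1) == 0 && memcmp(b1, b2, sizeof b1) == 0;
}
```

The point of the split is that the two statements have different dependence behaviour: the zero fill has no cross-iteration dependence, while the recurrence on b is inherently serial. Keeping them in one loop ties the first to the constraints of the second.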