This patch makes decide_unroll_constant_iterations honor the macro TARGET_LOOP_UNROLL_ADJUST when choosing the unroll factor for loops with a constant number of iterations. The macro is already checked for runtime iterations (decide_unroll_runtime_iterations) and for "stupid" unrolling (decide_unroll_stupid).
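To illustrate the ordering the patch introduces (clamp to the maximum unroll parameter first, then let the target adjust the result), here is a minimal standalone sketch. It is not GCC code: struct loop_sketch, unroll_adjust_fn, decide_unroll_factor and example_adjust are hypothetical names, and the policy in example_adjust is invented purely for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for GCC's struct loop; it holds only what
   this sketch needs.  */
struct loop_sketch
{
  unsigned ninsns;
};

/* Hypothetical hook type, modelled on how targetm.loop_unroll_adjust
   is called in the patch: it takes the proposed factor and the loop,
   and returns the factor to use.  */
typedef unsigned (*unroll_adjust_fn) (unsigned nunroll,
                                      struct loop_sketch *loop);

/* Clamp NUNROLL to MAX_UNROLL_TIMES, then give the target hook (if
   any) the final say, in the same order as the patched code.  */
static unsigned
decide_unroll_factor (unsigned nunroll, unsigned max_unroll_times,
                      unroll_adjust_fn adjust, struct loop_sketch *loop)
{
  if (nunroll > max_unroll_times)
    nunroll = max_unroll_times;

  if (adjust)
    nunroll = adjust (nunroll, loop);

  return nunroll;
}

/* Invented example policy (an assumption, not any real target's
   heuristic): cap the factor at 2 for loops above 16 insns.  */
static unsigned
example_adjust (unsigned nunroll, struct loop_sketch *loop)
{
  if (loop->ninsns > 16 && nunroll > 2)
    return 2;
  return nunroll;
}
```

With a 32-insn loop, a proposed factor of 8 and a maximum of 4, the sketch yields 4 when no hook is installed and 2 when example_adjust is passed. A hook that returns 1 or less would then make the existing "Skip big loops" check give up on unrolling.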
Bootstrap and tests pass.

I would like to know your comments before committing.

Regards
Ganesh

2013-11-28  Ganesh Gopalasubramanian  <ganesh.gopalasubraman...@amd.com>

	* loop-unroll.c (decide_unroll_constant_iterations): Check macro
	TARGET_LOOP_UNROLL_ADJUST while deciding unroll factor.

diff --git a/gcc/loop-unroll.c b/gcc/loop-unroll.c
index 9c87167..557915f 100644
--- a/gcc/loop-unroll.c
+++ b/gcc/loop-unroll.c
@@ -664,6 +664,9 @@ decide_unroll_constant_iterations (struct loop *loop, int flags)
   if (nunroll > (unsigned) PARAM_VALUE (PARAM_MAX_UNROLL_TIMES))
     nunroll = PARAM_VALUE (PARAM_MAX_UNROLL_TIMES);
 
+  if (targetm.loop_unroll_adjust)
+    nunroll = targetm.loop_unroll_adjust (nunroll, loop);
+
   /* Skip big loops.  */
   if (nunroll <= 1)
     {

-----Original Message-----
From: Uros Bizjak [mailto:ubiz...@gmail.com]
Sent: Friday, November 22, 2013 1:46 PM
To: Gopalasubramanian, Ganesh
Cc: gcc-patches@gcc.gnu.org; Richard Guenther <richard.guent...@gmail.com> (richard.guent...@gmail.com); borntrae...@de.ibm.com; H.J. Lu (hjl.to...@gmail.com); Jakub Jelinek (ja...@redhat.com)
Subject: Re: [RFC] [PATCH, i386] Adjust unroll factor for bdver3 and bdver4

On Wed, Nov 20, 2013 at 7:26 PM, Gopalasubramanian, Ganesh <ganesh.gopalasubraman...@amd.com> wrote:

> Steamroller processors contain a loop predictor and a loop buffer, which may
> make unrolling small loops less important.
> When unrolling small loops for steamroller, making the unrolled loop fit in
> the loop buffer should be a priority.
>
> This patch uses a heuristic approach (number of memory references) to decide
> the unrolling factor for small loops.
> This patch has some noise in SPEC 2006 results.
>
> Bootstrapping passes.
>
> I would like to know your comments before committing.

Please split the patch into target-dependent and target-independent parts, and get the target-independent part reviewed first.