On the PowerPC, switch statements can be expensive, and we would like to be able to tune the threshold at which the compiler stops generating a chain of if statements and uses a jump table instead (different processors within the PowerPC family have different limits). This patch adds a PowerPC tuning option to control this threshold.
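To make the trade-off concrete, here is a rough sketch in C of the two strategies, written out by hand for the same three-case switch used in the new tests. The function names are hypothetical and are not part of the patch, and the code only illustrates the shape of each strategy, not the instructions GCC actually emits.

```c
/* Illustrative only: hand-written C equivalents of the two strategies.
   All names here are hypothetical and do not appear in the patch.  */

/* Strategy 1: a chain of compares and conditional branches.  */
long
dispatch_with_branches (long a, long b)
{
  if (a == 0)
    return -b;
  if (a == 1)
    return ~b;
  if (a == 2)
    return b + 1;
  return b & 9;			/* default case */
}

static long op_neg (long b) { return -b; }
static long op_not (long b) { return ~b; }
static long op_inc (long b) { return b + 1; }

/* Strategy 2: one bounds check, then an indirect jump through a table.  */
long
dispatch_with_table (long a, long b)
{
  static long (*const table[]) (long) = { op_neg, op_not, op_inc };

  /* Out-of-range values take the default case.  */
  if (a < 0 || a >= (long) (sizeof table / sizeof table[0]))
    return b & 9;

  return table[a] (b);
}
```

Both functions compute the same result; for example, dispatch_with_branches (1, 5) and dispatch_with_table (1, 5) both return ~5, which is -6. The table form pays a fixed cost for the bounds check and the indirect branch, so it tends to win only once there are enough cases, which is the cutoff the new option exposes.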
I've done bootstraps and make checks with no regressions.  Is this ok to apply to the trunk?  At this time, I am not changing the default value (4).  With the option, I've seen a few SPEC 2006 benchmarks run faster, and a few run slower.

[gcc]
2011-06-29  Michael Meissner  <meiss...@linux.vnet.ibm.com>

	* config/rs6000/rs6000.opt (-mcase-values-threshold): New switch.
	* config/rs6000/rs6000.c (TARGET_CASE_VALUES_THRESHOLD): New
	target hook to override the choice of when to use a jump table
	vs. if statements, based on -mcase-values-threshold=<n>.
	* doc/invoke.texi (RS/6000 and PowerPC Options): Document
	-mcase-values-threshold.

[gcc/testsuite]
2011-06-29  Michael Meissner  <meiss...@linux.vnet.ibm.com>

	* gcc.target/powerpc/ppc-switch-1.c: New test for
	-mcase-values-threshold.
	* gcc.target/powerpc/ppc-switch-2.c: Ditto.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com	fax +1 (978) 399-6899

Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt	(revision 175662)
+++ gcc/config/rs6000/rs6000.opt	(working copy)
@@ -521,4 +521,7 @@ mxilinx-fpu
 Target Var(rs6000_xilinx_fpu) Save
 Specify Xilinx FPU.
 
-
+mcase-values-threshold=
+Target Report Var(rs6000_case_values_threshold_num) Init(4) RejectNegative Joined UInteger Save
+Specify the smallest number of different values for which it is best to use a
+jump-table instead of a tree of conditional branches (default, 4).
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 175662)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -1210,6 +1210,7 @@ static void rs6000_function_specific_pri
 					     struct cl_target_option *);
 static bool rs6000_can_inline_p (tree, tree);
 static void rs6000_set_current_function (tree);
+static unsigned int rs6000_case_values_threshold (void);
 
 
 /* Default register names.  */
@@ -1617,6 +1618,9 @@ static const struct attribute_spec rs600
 #undef TARGET_LEGITIMATE_CONSTANT_P
 #define TARGET_LEGITIMATE_CONSTANT_P rs6000_legitimate_constant_p
 
+#undef TARGET_CASE_VALUES_THRESHOLD
+#define TARGET_CASE_VALUES_THRESHOLD rs6000_case_values_threshold
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 
@@ -26834,6 +26838,18 @@ rs6000_libcall_value (enum machine_mode
   return gen_rtx_REG (mode, regno);
 }
 
+/* If the machine does not have a case insn that compares the bounds,
+   this means extra overhead for dispatch tables, which raises the
+   threshold for using them.  */
+
+static unsigned int
+rs6000_case_values_threshold (void)
+{
+  if (rs6000_case_values_threshold_num)
+    return rs6000_case_values_threshold_num;
+
+  return default_case_values_threshold ();
+}
 
 /* Given FROM and TO register numbers, say whether this elimination is allowed.
    Frame pointer elimination is automatically handled.
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 175662)
+++ gcc/doc/invoke.texi	(working copy)
@@ -807,7 +807,7 @@ See RS/6000 and PowerPC Options.
 -msdata=@var{opt}  -mvxworks  -G @var{num}  -pthread @gol
 -mrecip -mrecip=@var{opt} -mno-recip -mrecip-precision @gol
 -mno-recip-precision @gol
--mveclibabi=@var{type} -mfriz -mno-friz}
+-mveclibabi=@var{type} -mfriz -mno-friz -mcase-values-threshold=@var{n}}
 
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles  -m32bit-doubles  -fpu  -nofpu@gol
@@ -16320,6 +16320,11 @@ Generate (do not generate) the @code{fri
 rounding a floating point value to 64-bit integer and back to floating
 point.  The @code{friz} instruction does not return the same value if
 the floating point number is too large to fit in an integer.
+
+@item -mcase-values-threshold=@var{n}
+Specify the smallest number of different values for which it is best to
+use a jump-table instead of a tree of conditional branches.  The
+default for @option{-mcase-values-threshold} is 4.
 @end table
 
 @node RX Options
Index: gcc/testsuite/gcc.target/powerpc/ppc-switch-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ppc-switch-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ppc-switch-1.c	(revision 0)
@@ -0,0 +1,26 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-options "-O2 -mcase-values-threshold=2" } */
+/* { dg-final { scan-assembler "mtctr" } } */
+/* { dg-final { scan-assembler "bctr" } } */
+
+/* Force using a dispatch table even though by default we would generate
+   ifs.  */
+
+extern long call (long);
+
+long
+test_switch (long a, long b)
+{
+  long c;
+
+  switch (a)
+    {
+    case 0: c = -b; break;
+    case 1: c = ~b; break;
+    case 2: c = b+1; break;
+    default: c = b & 9; break;
+    }
+
+  return call (c) + 1;
+}
Index: gcc/testsuite/gcc.target/powerpc/ppc-switch-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ppc-switch-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ppc-switch-2.c	(revision 0)
@@ -0,0 +1,32 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-options "-O2 -mcase-values-threshold=20" } */
+/* { dg-final { scan-assembler-not "mtctr" } } */
+/* { dg-final { scan-assembler-not "bctr" } } */
+
+/* Force using if tests, instead of a dispatch table.  */
+
+extern long call (long);
+
+long
+test_switch (long a, long b)
+{
+  long c;
+
+  switch (a)
+    {
+    case 0: c = -b; break;
+    case 1: c = ~b; break;
+    case 2: c = b+1; break;
+    case 3: c = b-2; break;
+    case 4: c = b*3; break;
+    case 5: c = b/4; break;
+    case 6: c = b<<5; break;
+    case 7: c = b>>6; break;
+    case 8: c = b|7; break;
+    case 9: c = b^8; break;
+    default: c = b&9; break;
+    }
+
+  return call (c) + 1;
+}