Hi, We noticed some performance gains if we are not using jump over some simple switch statements. Here is the idea: Check whether the switch statement can be expanded with conditional instructions. In that case jump tables should be avoided since some branch instructions can be eliminated in further passes (replaced by conditional execution).
For example: switch (i) { case 1: sum += 1; case 2: sum += 3; case 3: sum += 5; case 4: sum += 10; } Using jump tables the following code will be generated (ARM assembly): ldrcc pc, [pc, r0, lsl #2] b .L5 .L0: .word L1 .word L2 .word L3 .word L4 .L1: add r3, #1 .L2: add r3, #4 .L3: add r3, #5 .L4: add r3, #10 .L5 Although this code has a constant complexity it can be improved by the conditional execution to avoid implicit branching: cmp r0,1 addeq r3, #1 cmp r0,2 addeq r3, #4 cmp r0,3 addeq r3, #5 cmp r0,4 addeq r3, #10 Although the assembly below requires more assembly instructions to be executed, it doesn't violate the CPU pipeline (since no branching is performed). The original version of patch for was developed by Alexey Kravets. I measured some performance improvements/regressions using spec 2000 int benchmark on Samsumg's exynos 5250. Here is the result: before: Base Base Base Peak Peak Peak Benchmarks Ref Time Run Time Ratio Ref Time Run Time Ratio ------------ -------- -------- -------- -------- -------- -------- 164.gzip 1400 287 487* 1400 288 485* 175.vpr 1400 376 373* 1400 374 374* 176.gcc 1100 121 912* 1100 118 933* 181.mcf 1800 242 743* 1800 251 718* 186.crafty 1000 159 628* 1000 165 608* 197.parser 1800 347 518* 1800 329 547* 252.eon 1300 960 135* 1300 960 135* 253.perlbmk 1800 214 842* 1800 212 848* 254.gap 1100 138 797* 1100 136 806* 255.vortex 1900 253 750* 1900 255 744* 256.bzip2 1500 237 632* 1500 230 653* 300.twolf X X SPECint_base2000 561 SPECint2000 563 After: 164.gzip 1400 286 490 * 1400 288 486 * 175.vpr 1400 213 656 * 1400 215 650 * 176.gcc 1100 119 923 * 1100 118 933 * 181.mcf 1800 247 730 * 1800 251 717 * 186.crafty 1000 145 688 * 1000 150 664 * 197.parser 1800 296 608 * 1800 275 654 * 252.eon X X 253.perlbmk 1800 206 872 * 1800 211 853 * 254.gap 1100 133 825 * 1100 131 838 * 255.vortex 1900 241 789 * 1900 239 797 * 256.bzip2 1500 235 638 * 1500 226 663 * 300.twolf X X The error in 252.eon was due to incorrect setup. Also "if (count > 3*PARAM_VALUE (PARAM_SWITCH_JUMP_TABLES_BB_OPS_LIMIT))" does not look correct, and probably it is better to move this code in the earlier stage just before the gimple expand and keep preference expand state (jump-tables or not) for every switch statement to avoid dealing with the RTL altogether. thanks, Dinar.
switch.patch
Description: Binary data