http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59371
            Bug ID: 59371
           Summary: Performance regression in GCC 4.8 and later versions.
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: sje at gcc dot gnu.org
            Target: mips*-*-*

If I compile this program with -O2 on MIPS:

int foo(int *p, unsigned short c)
{
  signed short i;
  int x = 0;
  for (i = 0; i < c; i++) {
    x = x + *p;
    p++;
  }
  return x;
}

With GCC 4.7.* or earlier I get loop code that looks like:

$L3:
        lw      $5,0($4)
        addiu   $3,$3,1
        seh     $3,$3
        addu    $2,$2,$5
        bne     $3,$6,$L3
        addiu   $4,$4,4

With GCC 4.8 and later I get:

$L3:
        lw      $7,0($4)
        addiu   $3,$3,1
        seh     $3,$3
        slt     $6,$3,$5
        addu    $2,$2,$7
        bne     $6,$0,$L3
        addiu   $4,$4,4

This loop has one more instruction in it and is slower. A version of this bug
appears in EEMBC 1.1. If I change the loop index to be unsigned then I get the
better code but I can't change the benchmark I am testing so I am trying to
figure out what changed in GCC and how to generate the faster code.
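
For reference, the unsigned-index variant mentioned above might look like the sketch below. The name foo_unsigned is hypothetical and not part of the benchmark; foo is repeated from the report so the two can be compared side by side. The two functions should agree only while c fits in a signed short: for larger c, the signed index wraps on GCC (the narrowing conversion is implementation-defined) and the original loop runs past the intended bound.

```c
/* Loop from the report: signed short index compared against an
   unsigned short bound. */
int foo(int *p, unsigned short c)
{
  signed short i;
  int x = 0;
  for (i = 0; i < c; i++) {
    x = x + *p;
    p++;
  }
  return x;
}

/* Hypothetical variant (not in the benchmark): the same loop with an
   unsigned short index, which the report says yields the shorter loop.
   Assumed equivalent to foo only for c <= SHRT_MAX. */
int foo_unsigned(int *p, unsigned short c)
{
  unsigned short i;
  int x = 0;
  for (i = 0; i < c; i++) {
    x = x + *p;
    p++;
  }
  return x;
}
```

Compiling both with -O2 -S on a MIPS target and comparing the loop bodies is one way to check whether the extra slt is still emitted for the signed version.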