Loops with a small, bounded number of iterations are unrolled too aggressively.
They should be fully peeled instead. For example, compile the following
function with ``-O3 -funroll-loops'':
void short_loop(int* dest, int* src, int count) {
    // same happens for assert(count <= 4) and if(count > 4) exit(-1)
    if(count > 4)
        count = 4;
    for(int i=0; i < count; i++)
        dest[i] = src[i];
}
The assembly output (for i686-pc-cygwin) is an 8x Duff's device. Because count
never exceeds 4, 75% of that code can never execute (translated back to C++
here for readability):
void short_loop(int* dest, int* src, int count) {
    // same happens for assert(count <= 4) and if(count > 4) exit(-1)
    if(count > 4)
        count = 4;
    int mod = count % 8;
    switch(mod) {
    case 7:
        // loop body
        count--;
    case 6:
        // loop body
        count--;
    case 5:
        // loop body
        count--;
    case 4:
        // loop body
        count--;
    case 3:
        // loop body
        count--;
    case 2:
        // loop body
        count--;
    case 1:
        // loop body
        count--;
    default:
        for(int i=0; i < count; i+=8)
            // 8x unrolled loop body
    }
}
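As a rough check of the dead-code claim, the sketch below models only the
control flow of the translated code above. The helper names (clamped,
remainder_entry, main_loop_remaining, check_dead_code) are hypothetical and
exist only for this illustration; they are not anything GCC emits.

```cpp
#include <cassert>

// Hypothetical helper: clamp count the same way short_loop does.
int clamped(int count) { return count > 4 ? 4 : count; }

// Which case of the remainder switch is entered (0 or negative means "default").
int remainder_entry(int count) { return clamped(count) % 8; }

// Value of count when the 8x unrolled main loop is reached: entering the
// switch at case k decrements count k times on the way down to "default".
int main_loop_remaining(int count) {
    int c = clamped(count);
    int mod = c % 8;
    return mod > 0 ? c - mod : c;
}

// Walk a range of inputs and confirm that cases 5..7 are never entered
// and that the main loop never has any iterations left to run.
void check_dead_code() {
    for (int n = -20; n <= 20; ++n) {
        assert(remainder_entry(n) <= 4);
        assert(main_loop_remaining(n) <= 0);
    }
}
```

Counting copies of the loop body in the translation above gives 7 in the
switch plus 8 in the unrolled loop, 15 in total, of which only the 4 under
cases 4 through 1 are reachable. That is roughly in line with the 75% figure.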
Less than 25% of that code is actually needed:
void short_loop(int* dest, int* src, int count) {
    // same happens for assert(count <= 4) and if(count > 4) exit(-1)
    if(count > 4)
        count = 4;
    switch(count) {
    case 4:
        // loop body
    case 3:
        // loop body
    case 2:
        // loop body
    case 1:
        // loop body
    default:
        break;
    }
}
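For concreteness, here is a minimal hand-written sketch of that peeled form
with the loop body filled in, next to the original loop. The names
short_loop_peeled, same_result and self_check are assumptions made for this
example. It shows the shape the optimizer could produce, not what any GCC
version actually generates.

```cpp
#include <cassert>

// The loop as written in the report.
void short_loop(int* dest, int* src, int count) {
    if (count > 4)
        count = 4;
    for (int i = 0; i < count; i++)
        dest[i] = src[i];
}

// Hand-peeled equivalent: at most four copies, no loop and no remainder logic.
void short_loop_peeled(int* dest, int* src, int count) {
    if (count > 4)
        count = 4;
    switch (count) {
    case 4: dest[3] = src[3]; // fall through
    case 3: dest[2] = src[2]; // fall through
    case 2: dest[1] = src[1]; // fall through
    case 1: dest[0] = src[0]; // fall through
    default: break;
    }
}

// Returns true if both versions leave the destination in the same state.
bool same_result(int count) {
    int src[6] = {1, 2, 3, 4, 5, 6};
    int a[6] = {0};
    int b[6] = {0};
    short_loop(a, src, count);
    short_loop_peeled(b, src, count);
    for (int i = 0; i < 6; i++)
        if (a[i] != b[i])
            return false;
    return true;
}

// Compare the two versions for negative, in-range and clamped counts.
void self_check() {
    for (int n = -1; n <= 6; n++)
        assert(same_result(n));
}
```

The peeled version copies elements in descending index order, which is fine
here because each iteration of the original loop is independent of the others.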
--
Summary: Loop unrolling does not exploit VRP for loop bound
Product: gcc
Version: 4.2.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: scovich at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32073