I have some code of the form

const int primes[] = {7,11,13,17,19};
const int nprimes = sizeof(primes)/sizeof(primes[0]);

and later an inmost loop of the form

bool happy = true;
for (int i=0; i<nprimes && happy; i++)
{
  int j = dat%primes[i];
  if (filter[i][j]==1) happy=false;
}

If I look at the assembly code generated on x86_64 with 

g++ -O9 -funroll-all-loops

I have a sequence of explicit 'idiv %r13d' instructions, where %r13d is
initialised to 13 at the start of the function and never changed
thereafter.

On Core2, idiv by a constant is much slower than the
multiply-by-reciprocal sequence which gcc generates when it recognises
that it's doing division by a constant, so the program speeds up by a
factor of three if I replace the loop by

if (filter[0][dat%7]==1) happy=false;
if (happy && filter[1][dat%11]==1) happy=false;
if (happy && filter[2][dat%13]==1) happy=false;
if (happy && filter[3][dat%17]==1) happy=false;
if (happy && filter[4][dat%19]==1) happy=false;

and the generated assembly code contains no divide instructions.
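
For illustration, a multiply-by-reciprocal remainder for the divisor 13
might look like the sketch below. It assumes an unsigned 32-bit input, and
mod13 is a hypothetical name; the sequence gcc actually emits for a signed
int differs in detail, so this only shows the idea.

```cpp
#include <cstdint>

// Sketch only: remainder modulo 13 without a divide instruction.
// Assumes a 32-bit unsigned input; mod13 is a hypothetical name.
inline std::uint32_t mod13(std::uint32_t n)
{
  // 0x4EC4EC4F == ceil(2^34 / 13). The rounding error is small enough
  // that (n * magic) >> 34 equals n / 13 for every 32-bit n.
  const std::uint64_t magic = 0x4EC4EC4FULL;
  std::uint32_t q = static_cast<std::uint32_t>((n * magic) >> 34);
  return n - q * 13u;   // remainder = n - 13 * quotient
}
```

The cost is one multiply, one shift, and a multiply-subtract, in place of
the idiv.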

Is there any combination of -f options that causes division-by-constant
to be recognised after the loop-unrolling stage?
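
Failing such an option, one possible source-level workaround is to make
each divisor a template argument, so that every dat%P is a division by a
compile-time constant without writing the five lines out by hand. The
sketch below assumes a filter table of 5 rows by 19 columns of unsigned
char and a non-negative dat; hits and is_happy are hypothetical names.
Whether gcc emits divide-free code for it would need checking in the
assembly output.

```cpp
// Assumed stand-in for the real table: one row per prime, one column
// per possible remainder (19 is the largest modulus).
static unsigned char filter[5][19];

// P is a template argument, so dat % P has a compile-time constant divisor.
template <int Index, int P>
inline bool hits(int dat)
{
  return filter[Index][dat % P] == 1;
}

// Short-circuit || keeps the early exit of the original loop.
inline bool is_happy(int dat)
{
  return !(hits<0, 7>(dat) || hits<1, 11>(dat) || hits<2, 13>(dat) ||
           hits<3, 17>(dat) || hits<4, 19>(dat));
}
```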

Tom
