This is more of a question than a bug report, so I'm trying to send it to the list rather than filing a bugzilla issue.
I think it's quite common to write for- and while-loops where the condition is always initially true. A simple example might be

  double
  average (const double *a, size_t n)
  {
    double sum;
    size_t i;
    assert (n > 0);
    for (i = 0, sum = 0; i < n; i++)
      sum += a[i];
    return sum / n;
  }

The programmer could do the micro-optimization of rewriting it as a do-while loop instead. It would be nice if gcc could infer that the condition is initially true, and convert to a do-while loop automatically. Converting to a do-while loop should produce slightly better code, omitting the typical initial jump to the condition check at the end of the loop. It would also make analysis of where variables are written more accurate, which is my main concern at the moment.

My questions are:

1. Does gcc attempt to do this optimization?
2. If it does, how often does it succeed on loops in real programs?
3. Can I help the compiler to do that inference?

The code I had some trouble with is at https://git.lysator.liu.se/nettle/nettle/blob/master/ecc-mod.c. A simplified version with only the interesting code path would be

  void
  ecc_mod (mp_size_t mn, mp_size_t bn, mp_limb_t *rp)
  {
    mp_limb_t hi;
    mp_size_t sn = mn - bn;
    mp_size_t rn = 2*mn;

    assert (bn < mn);

    while (rn >= 2 * mn - bn)
      {
        rn -= sn;
        ... code which sets hi ...
      }
    ... code which uses hi ...
  }

The intention is that the loop will run at least once, but nevertheless, in this context rewriting it as a do-while would make the code uglier, imo. At first glance it looks obvious that the initial value rn = 2*mn makes the condition rn >= 2*mn - bn true. All values are unsigned, and the assert backs up that mn - bn won't underflow. This is also used with pretty small mn, so that 2*mn won't overflow, but unfortunately, that's not obvious to the compiler. In theory, the function could be called with, e.g., mn == (1 << 63) and bn == 1 on a 64-bit machine, and then the initial loop condition would be 0 >= ~0, which is false.
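For comparison, here is a sketch of the hand-rotated version of the average example above (my rewrite, not code from nettle): the do-while form makes explicit to the compiler what the assert is meant to tell it, namely that the body executes at least once, so there is no need for an entry jump to the condition check.

  #include <assert.h>
  #include <stddef.h>

  /* Hand-rotated do-while version: the condition is only tested
     after the body has run once, which is valid because the
     assert guarantees n > 0 on entry. */
  double
  average (const double *a, size_t n)
  {
    double sum = 0;
    size_t i = 0;
    assert (n > 0);
    do
      sum += a[i];
    while (++i < n);
    return sum / n;
  }

This is exactly the transformation I'd like gcc to do for me, since the for-loop form reads more naturally.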
So overflow cases seem to rule out the optimization.

I've seen gcc warn about hi being used uninitialized in this function (but only on sparc, and on a machine I no longer use, so I'm afraid I can't provide details). I've also seen warnings from the clang static analyzer (which I guess makes different tradeoffs than gcc -Wall when it comes to false positives).

I would guess it's quite common that conditions which are always true for relevant inputs may be false due to overflow with extreme inputs, which in the case of unsigned variables doesn't leave the compiler any "undefined behaviour" freedom to exploit.

What's the best way to promise the compiler that there will be no overflows involving mn? I could try adding an assert to rule out overflow, like

  rn = 2*mn;
  assert (rn > mn);

but that looks a bit unobvious to humans, and possibly unobvious to the compiler too.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.