On Fri, Nov 08, 2013 at 08:31:38PM -0500, Geert Bosch wrote:
> 
> On Oct 29, 2013, at 05:41, Richard Biener <richard.guent...@gmail.com> wrote:
> 
> > For reference those
> > (http://clang.llvm.org/docs/LanguageExtensions.html) look like
> > 
> >  if (__builtin_umul_overflow(x, y, &result))
> >    return kErrorCodeHackers;
> > 
> > which should be reasonably easy to support in GCC (if you factor out
> > generating best code and just aim at compatibility).  Code-generation
> > will be somewhat pessimized by providing the multiplication result
> > via memory, but that's an implementation detail.
> 
> I've done the overflow checking in Gigi (Ada front end). Benchmarking
> real world large Ada programs (where every integer operation is checked,
> including array index computations etc.), I found the performance cost 
> *very* small (less than 1% on typical code, never more than 2%). There
> is a bit more cost in code size, but that is mostly due to the fact that
> we want to generate error messages with correct line number information
> without relying on backtraces.
>
Overhead is mostly from additonal branches that are not taken. We need
more accurate measure of cache effects than code size, for example
looking to increased number icache hits which will not count code that
is never executed.
> The rest of the run time checks in Ada (especially index checks and range
> checks) were far more costly (more on the order of 10-15%, but very
> variable depending on code style).
> 
> A few things helped to make the cost small: the biggest one is that
> typically on of the operands is known to be negative or positive.
> Gigi will use Ada type information, and Natural or Positive integer
> variables are very common.  So, if you compute X + C with C positive, 
> you can write the conditional expression as:

On x64 efect of this analysis is small, processor does overflow detection for 
you.

> (if X < Integer'Last - C then X + C else raise Constraint_Error)
> 

> On my x86-64 this generates something like:
> __ada_add:
> 00    cmpl    $0x7fffffff,%edi
> 06    je      0x0000000c
> 08    leal    0x01(%rdi),%eax
> 0b    ret
> 0c    leaq    0x0000000d(%rip),%rdi
> 13    pushq   %rax
> 14    movl    $0x00000003,%esi
> 19    xorl    %eax,%eax
> 1b    callq   ___gnat_rcheck_CE_Overflow_Check
>

This has redundant compare instruction that cost a cycle and 6 bytes.
You can just write

   0:   83 c7 01                add    $0x1,%edi
   3:   71 03                   jno    0x8
   5:   89 f8                   mov    %edi,%eax
   7:   c3                      retq   
   8:   48 8d 3d 0d 00 00 00    lea    0xd(%rip),%rdi
   f:   50                      push   %rax
  10:   be 03 00 00 00          mov    $0x3,%esi
  15:   31 c0                   xor    %eax,%eax
  17:   e8 00 00 00 00          callq  0x1c

When you know that one operand is positive or you deal with unsigned
then you could replace jno with jnc which is bit faster on sandy bridge
processors and later as add, jnc pair is macro-fused but add jno is not.
 
> While this may look like a lot, these operations are expanded
> inline, and only the first three are on the normal execution
> path. As the exception raise is a No_Return subprogram, it will
> be moved to the end of the file. The jumps will both statically
> and dynamically be treated as not-taken, and have very little cost.
> 
> Additionally, the comparison is visible for the optimizers, in
> effect giving more value range information which can be used
> for optimizing away further checks. The drawback of using any
> "special" new operations is that we loose that aspect.
> 
> For the less common case in which neither operand has a known
> sign, widening to 64-bits is the straightforward solution. For Ada,
> we have a mode where we do this kind of widening for entire
> expressions, so we only have to check on the final assignment.
> The semantics here are that you'd get the mathematically correct
> result, even if there was an intermediate overflow. The drawback
> of this approach is that an overflow check may not fail, but
> that suppressing the checks removes the widening and causes
> wrong answers.
> 
>   -Geert
> 

Reply via email to