On Thu, 10 Dec 2020, Lucas de Almeida via Gcc wrote:

> Hello,
> when performing (int64_t) foo / (int32_t) bar in gcc under x86, a call to
> __divdi3 is always output, even though it seems the use of the idiv
> instruction could be faster.
> This seems to remain even under -Ofast and other available options.
> 
> To illustrate, this godbolt link: https://godbolt.org/z/hq4GKb
> With code
> 
> #include <stdint.h>
> int32_t d(int64_t a, int32_t b) {
>     return a / b;
> }
> 
> Compiles to
> 
> d(long long, int):
>         sub     esp, 12
>         mov     eax, DWORD PTR [esp+24]
>         cdq
>         push    edx
>         push    eax
>         push    DWORD PTR [esp+28]
>         push    DWORD PTR [esp+28]
>         call    __divdi3
>         add     esp, 28
>         ret
> 
> Why is this?

C evaluation rules for this are such that first 'b' is extended to int64_t,
the division is done in int64_t, and its result is truncated to int32_t in
an implementation-defined manner. Thus, it must always produce a value,
except if (b == 0 || b == -1 && a == INT64_MIN), in which case division
causes undefined behavior.

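The x86 'idiv' instruction, however, raises a divide error (#DE) if the
quotient does not fit in the destination register. For the 64-by-32-bit form
(edx:eax divided by a 32-bit operand) that register is the 32-bit eax, so
e.g. dividing INT64_MAX by 1 would trap instead of yielding a value that is
then truncated.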
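
To make the difference concrete, here is a minimal sketch in C. The helper
name quotient_fits_int32 is made up for illustration; it is not something
GCC emits or provides. It only spells out the condition under which the
64-bit quotient would fit in 32 bits, which is the case where a single
64-by-32 'idiv' would not fault.

```c
#include <stdbool.h>
#include <stdint.h>

/* The function from the original question: 'b' is converted to int64_t,
 * the division is done in 64 bits, and the result is converted to int32_t
 * in an implementation-defined manner. */
static int32_t d(int64_t a, int32_t b)
{
    return a / b;
}

/* Hypothetical helper (illustration only): returns true when the 64-bit
 * quotient a / b is well defined in C and also fits in int32_t. */
static bool quotient_fits_int32(int64_t a, int32_t b)
{
    /* These are the cases where the C division itself is undefined. */
    if (b == 0 || (b == -1 && a == INT64_MIN))
        return false;

    int64_t q = a / b;  /* b is converted to int64_t before dividing */
    return q >= INT32_MIN && q <= INT32_MAX;
}
```

For INT64_MAX / 1 the helper returns false: d() must still return a
(truncated) value there, while a bare 'idiv' would raise a divide error.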

Alexander
