Hi @ll,

I don't use GCC, so I don't know whether there's a benchmark
for __udivmodti4() and/or __udivmoddi4() for AMD64 and i386
processors.

If you have one: get my "slow" __udivmodti4() from
<https://skanthak.homepage.t-online.de/integer.html#as-1>
and run the benchmark, then my fast __udivmodti4() from
<https://skanthak.homepage.t-online.de/integer.html#as-2>
and repeat.
The "slow" __udivmodti4() should be slightly faster than your
current implementation for AMD64, while the fast one should be
almost an order of magnitude faster; see
<https://skanthak.homepage.t-online.de/integer.html#summary>
for my numbers.
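
In case no such benchmark exists, here is a minimal sketch of one in C.
It is not the harness behind the numbers above. It assumes an AMD64
target, a compiler with the unsigned __int128 extension (GCC or Clang),
and the usual runtime prototype for __udivmodti4(), which returns the
quotient and stores the remainder through a pointer. The helper names
(next_u64, random_operands, check_udivmodti4, bench_udivmodti4) are
hypothetical, and the operand distribution is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef unsigned __int128 u128;

/* Runtime routine under test: returns n / d and stores n % d in *rem.
   Prototype assumed to match the one used by libgcc and compiler-rt. */
extern u128 __udivmodti4(u128 n, u128 d, u128 *rem);

/* xorshift64 generator (hypothetical helper): reproducible operands. */
static uint64_t next_u64(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return *state = x;
}

/* Builds a random dividend and a nonzero divisor of random magnitude. */
static void random_operands(uint64_t *state, u128 *n, u128 *d)
{
    u128 hi = next_u64(state);
    u128 lo = next_u64(state);
    *n = (hi << 64) | lo;
    hi = next_u64(state);
    lo = next_u64(state);
    *d = ((hi << 64) | lo) >> (next_u64(state) % 128);
    if (*d == 0)
        *d = 1;
}

/* Counts results that violate n == q * d + r with r < d. */
static size_t check_udivmodti4(size_t count, uint64_t seed)
{
    uint64_t state = seed ? seed : 1;   /* xorshift state must be nonzero */
    size_t failures = 0;

    for (size_t i = 0; i < count; i++) {
        u128 n, d, q, r;
        random_operands(&state, &n, &d);
        q = __udivmodti4(n, d, &r);
        if (r >= d || q * d + r != n)
            failures++;
    }
    return failures;
}

/* Times `count` calls and returns the CPU seconds used. The checksum
   folds every result so the calls cannot be optimised away. The time
   includes the cost of generating the operands. */
static double bench_udivmodti4(size_t count, uint64_t seed, uint64_t *checksum)
{
    uint64_t state = seed ? seed : 1;
    uint64_t sum = 0;
    clock_t start = clock();

    for (size_t i = 0; i < count; i++) {
        u128 n, d, q, r;
        random_operands(&state, &n, &d);
        assert(d != 0);
        q = __udivmodti4(n, d, &r);
        sum += (uint64_t)q ^ (uint64_t)r;
    }
    *checksum = sum;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call check_udivmodti4() first and bench_udivmodti4() afterwards, with
the same seed for every implementation you compare. Linking an object
file that defines __udivmodti4() into the test program should make the
linker use that definition instead of the one from the static libgcc;
I have not verified this for every toolchain.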

And while you're there, also benchmark __udivmoddi4() from
<https://skanthak.homepage.t-online.de/integer.html#as-3>,
__umoddi3() from
<https://skanthak.homepage.t-online.de/integer.html#as-4>,
__moddi3() from
<https://skanthak.homepage.t-online.de/integer.html#as-5>,
as well as (after trivial editing) __udivdi3() from
<https://skanthak.homepage.t-online.de/integer.html#ml-1>
and __divdi3() from
<https://skanthak.homepage.t-online.de/integer.html#ml-2>.

regards
Stefan
