On 6/7/24 2:30 AM, Jonathan Wakely wrote:
> On Fri, 7 Jun 2024 at 09:57, Sam James <s...@gentoo.org> wrote:
>>
>> Stephen Face <shpf...@gmail.com> writes:
>>
>>> This patch is to optimize the runtime execution of gcd. Mathematically,
>>> it computes with the same algorithm as before, but subtractions and
>>> branches are rearranged to encourage generation of code that can use
>>> flags from the subtractions for conditional moves. Additionally, most
>>> pairs of integers are coprime, so this patch also includes a check for
>>> one of the integers to be equal to 1, and then it will exit the loop
>>> early in this case.
>>
>> Is it worth filing a bug for the missed optimisation? You shouldn't have
>> to write things in a specific order. Thanks.

This patch is not a pure reordering. It has the added comparison with 1.
Also, the argument to the trailing zero count is now the result of the
subtraction before the comparison of m and n. This shortens the chain of
dependencies, which can help the loop run faster.

> 
> Yes, I think a bug report would be good. But 20%-60% decreases in run
> time seems significant enough that we should take the libstdc++ patch
> now, rather than wait for a possible compiler fix to come later.
> 
> Stephen, could you please confirm whether you have a copyright
> assignment in place for GCC, and if not whether you would be will to
> complete one, or alternatively contribute this under the DCO terms:
> https://gcc.gnu.org/dco.html
> Thanks!

I do not have a copyright assignment in place for GCC. I can certify to
the DCO terms for this contribution.

    Signed-off-by: Stephen Face <shpf...@gmail.com>

> 
> 
>>
>>>
>>> libstdc++-v3/ChangeLog:
>>>
>>>       * include/std/numeric(__gcd): Optimize.
>>> ---
>>> I have tested this on x86_64-linux and aarch64-linux. I have tested the
>>> timing with random distributions of small inputs and large inputs on a
>>> couple of machines with -O2 and found decreases in execution time from
>>> 20% to 60% depending on the machine and distribution of inputs.
>>>
>>>  libstdc++-v3/include/std/numeric | 21 +++++++++++----------
>>>  1 file changed, 11 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/libstdc++-v3/include/std/numeric 
>>> b/libstdc++-v3/include/std/numeric
>>> index c912db4a519..3c9e8387a0e 100644
>>> --- a/libstdc++-v3/include/std/numeric
>>> +++ b/libstdc++-v3/include/std/numeric
>>> @@ -148,19 +148,20 @@ namespace __detail
>>>
>>>        while (true)
>>>       {
>>> -       if (__m > __n)
>>> -         {
>>> -           _Tp __tmp = __m;
>>> -           __m = __n;
>>> -           __n = __tmp;
>>> -         }
>>> +       _Tp __m_minus_n = __m - __n;
>>> +       if (__m_minus_n == 0)
>>> +         return __m << __k;
>>>
>>> -       __n -= __m;
>>> +       _Tp __next_m = __m < __n ? __m : __n;
>>>
>>> -       if (__n == 0)
>>> -         return __m << __k;
>>> +       if (__next_m == 1)
>>> +         return __next_m << __k;
>>> +
>>> +       _Tp __n_minus_m = __n - __m;
>>> +       __n = __n < __m ? __m_minus_n : __n_minus_m;
>>> +       __m = __next_m;
>>>
>>> -       __n >>= std::__countr_zero(__n);
>>> +       __n >>= std::__countr_zero(__m_minus_n);
>>>       }
>>>      }
>>>  } // namespace __detail
>>
> 

Reply via email to