Re: [PATCH] audo-detect and use -moutline-atomics compilation flag for aarch64

Zidenberg, Tsahi Sun, 06 Sep 2020 15:13:19 -0700

Hello!

First, I apologize for taking so long to answer. This e-mail regrettably got
lost in my inbox.

On 24/07/2020, 4:17, "Andres Freund" <and...@anarazel.de> wrote:

    > What does "not significantly affected" exactly mean? Could you post the
    > raw numbers?

The following tests show benchmark behavior on an m6g.8xlarge instance
(32 cores, with LSE support) and an a1.4xlarge instance (16 cores, no LSE
support), with and without the patch, based on PostgreSQL 12.4.
The tests are pgbench select-only/simple-update and sysbench
read-only/write-only.

instance/build        select-only   simple-update   read-only   write-only
m6g.8xlarge/vanilla        482130           56275      273327        33364
m6g.8xlarge/patch          493748           59681      262702        33024
a1.4xlarge/vanilla          82437           13978       62489         2928
a1.4xlarge/patch            79499           13932       62796         2945

Results obviously change with OS, parameters, etc. I have attempted to ensure
a fair comparison, but I don't think these numbers should be taken as absolute.
As reference points, here are an m6g instance compiled with the -march=native
flag and an m5 (x86) instance:

instance/build        select-only   simple-update   read-only   write-only
m6g.8xlarge/native         522771           60354      261366        33582
m5.8xlarge                 362908           58732      147730        32750

    > I'm a bit concerned that the additional conditional
    > branches on platforms without non ll/sc atomics could hurt noticably.

As can be seen in the a1 results, the difference for CPUs with no LSE atomic
support is small. Locks have one branch added, which is always taken the same
way and is thus easy to predict.

    > I'm surprised that read-only didn't benefit - with ll/sc that ought to
    > have pretty high contention on a few lwlocks.

These results show only about a 6% performance increase in simple-update, and
very close performance in the other tests, most of which could be attributed
to benchmark jitter.
These results from "well behaved" benchmarks do not show the full importance
of using outline-atomics. In some experiments with other values and larger
systems, I have observed a collapse in performance, including in read-only
tests, which was caused by continuously failing to commit stxr instructions.
In such cases, outline-atomics improved performance by more than a factor
of 2. These cases are not always easy to replicate.
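
As a quick check of the percentages discussed here, the relative change
between the vanilla and patched builds can be recomputed from the tables in
this message. The following is only a minimal Python sketch: the numbers are
copied from the tables, and the names BENCHMARKS, RESULTS, pct_change and
summarize are illustrative, not part of the patch or of any benchmark tool.

```python
# Throughput figures copied from the tables in this message.
# Tuple order follows BENCHMARKS.
BENCHMARKS = ("select-only", "simple-update", "read-only", "write-only")

RESULTS = {
    "m6g.8xlarge": {
        "vanilla": (482130, 56275, 273327, 33364),
        "patch": (493748, 59681, 262702, 33024),
    },
    "a1.4xlarge": {
        "vanilla": (82437, 13978, 62489, 2928),
        "patch": (79499, 13932, 62796, 2945),
    },
}


def pct_change(baseline, patched):
    """Relative change of patched vs. baseline, in percent."""
    return (patched - baseline) / baseline * 100.0


def summarize(results=RESULTS):
    """Map each instance to {benchmark name: percent change with the patch}."""
    summary = {}
    for instance, runs in results.items():
        summary[instance] = {
            name: pct_change(base, patched)
            for name, base, patched in zip(
                BENCHMARKS, runs["vanilla"], runs["patch"]
            )
        }
    return summary


if __name__ == "__main__":
    for instance, changes in summarize().items():
        for name, change in changes.items():
            print(f"{instance:12s} {name:14s} {change:+.1f}%")
```

A single run per configuration says nothing about variance, so small
differences in this output should be read with the jitter caveat in mind.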

Thank you, and sorry again for the delay.
Tsahi Zidenberg
