** Description changed:

  tl;dr: making openssl LTO-safe is too much work, upstream doesn't see
  it as a goal and doesn't test it, and LTO probably brings no
  performance gain for this package.
  
  Openssl is an old project and the codebase wasn't written with aliasing
  rules in mind. There are several reports of issues related to LTO. The
  openssl technical committee says "currently we're not going to fix all
  the strict aliasing and other LTO problems" and "Fixes raised in pull
  requests will be considered."; in other words: if you find a violation,
  we'll merge your fixes but we're not going to dedicate time to fixing
  them ourselves.
  
  We don't have specific reports on launchpad at the moment but there has
  been at least one issue experienced by the FIPS: the compiler decided a
  0-filled array could be removed and proceeded to do so. In addition to
  that, compilers are only pushing this further and further. Issues are
  impossible to predict and even security updates could trigger issues.
  
  Gentoo prevents usage of LTO for openssl and has some links related to this at
  https://gitweb.gentoo.org/repo/gentoo.git/tree/dev-libs/openssl/openssl-3.2.1-r1.ebuild#n131 :
  - https://github.com/llvm/llvm-project/issues/55255
  - https://github.com/openssl/openssl/issues/12247
  - https://github.com/openssl/openssl/issues/18225
  - https://github.com/openssl/openssl/issues/18663
  - https://github.com/openssl/openssl/issues/18663#issuecomment-1181478057
  
  Gentoo also prevents usage of -fstrict-aliasing and always sets
  -fno-strict-aliasing. I don't plan to do the same, at least for now and
  for Noble, since I don't have time to investigate more changes.
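
  As a rough illustration of what "preventing LTO" means at the flag level,
  the sketch below drops LTO-related options from a compiler flag string and
  optionally appends -fno-strict-aliasing. The helper name strip_lto_flags
  and the example flag strings are hypothetical; this is not Gentoo's actual
  eclass code nor the Ubuntu packaging change. It is written for POSIX
  sh/bash; under zsh the unquoted $1 would need to be ${=1} to be split into
  words.

```shell
# Hypothetical helper (an assumption for illustration only).
# $1: flag string, $2: optional "no-strict-aliasing" to append that flag.
# The body runs in a subshell so its variables do not leak to the caller.
strip_lto_flags() (
  out=""
  for flag in $1; do
    case "$flag" in
      # Drop options that enable or tune LTO.
      -flto|-flto=*|-ffat-lto-objects|-fno-fat-lto-objects|-fuse-linker-plugin) ;;
      # Keep everything else, space-separated.
      *) out="${out:+$out }$flag" ;;
    esac
  done
  if [ "${2:-}" = "no-strict-aliasing" ]; then
    out="${out:+$out }-fno-strict-aliasing"
  fi
  printf '%s\n' "$out"
)

strip_lto_flags "-O2 -flto=auto -ffat-lto-objects -g"
strip_lto_flags "-O2 -flto -g" no-strict-aliasing
```

  The first call keeps only -O2 and -g; the second also appends
  -fno-strict-aliasing, which is the extra step Gentoo takes and that is
  not planned here.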
  
  Performance shouldn't be impacted much if at all:
  - crypto algorithms are implemented in ASM (funnily, using C implementations
    can trigger issues because these got miscompiled)
  - the rest of the openssl codebase probably doesn't benefit from LTO because
    source files match codepaths quite well
  - at the moment, openssl performance for servers is bad due to
    algorithmic/architectural issues, not micro-optimizations and these
    wouldn't be noticed
  - if LTO-compliance was doable and thought to be useful by upstream, they
    would have certainly pushed that forward, especially in the wake of
    openssl 3.0's performance issues.
  
  Code size increases by a few percent except for libcrypto which gets
  17% larger. The corresponding .deb file increases by only 2.6%.
  
  I ran "openssl speed" with a long benchmark time in order to get good
  results (there is a variation of several percent with the default
- times). I then scripted a diff which output is shown below (hopefully it
- will display fine...); entries within 2% are not displayed. Also note
+ times). I then scripted a diff whose output is shown below; "....."
+ means the difference is within 2%, which is the vast majority. Also note
  that some important ciphers are not present due to how openssl speed
  works. Small aes-*-cbc are negatively impacted, by up to -10%, but that
  would be -50% if you compared "software" and "hardware"
  implementations; the results would be reversed at anything but the
  smallest data sizes, and the fact that you want to use hardware
  implementations as much as possible means that you also want to avoid
  places where LTO could have an effect.
  
  type              16 bytes  64 bytes  256 bytes  1024 bytes  8192 bytes  16384 bytes
  md5               .....     .....     .....      .....       .....       .....
  sha1              .....     .....     .....      .....       .....       .....
  rmd160            .....     .....     .....      .....       .....       .....
  sha256            +2.3%     .....     .....      .....       .....       .....
  sha512            .....     .....     .....      .....       .....       .....
  hmac(md5)         .....     .....     .....      .....       .....       .....
  des-ede3          .....     .....     .....      .....       .....       .....
  aes-128-cbc       -10.0%    .....     .....      .....       .....       .....
  aes-192-cbc       -7.6%     .....     .....      .....       .....       .....
  aes-256-cbc       -5.2%     .....     .....      .....       .....       .....
  camellia-128-cbc  .....     .....     .....      .....       .....       .....
  camellia-192-cbc  .....     .....     .....      .....       .....       .....
  camellia-256-cbc  .....     .....     .....      .....       .....       .....
  ghash             .....     .....     +21.2%     -27.3%      +30.5%      +39.3%
  rand              -2.8%     -2.9%     -2.9%      -2.8%       .....       .....
                    sign       verify     sign/s  verify/s
  rsa 512 bits      0.000031s  0.000002s  -2.7%   .....
  rsa 1024 bits     .....      0.000005s  .....   .....
  rsa 2048 bits     +2.4%      0.000015s  -2.3%   .....
  rsa 3072 bits     .....      0.000032s  .....   .....
  rsa 4096 bits     .....      .....      .....   .....
  rsa 7680 bits     .....      .....      30.2    .....
  rsa 15360 bits    .....      .....      5.9     .....
                    sign       verify     sign/s  verify/s
  dsa 512 bits      +4.8%      0.000024s  -3.9%   .....
  dsa 1024 bits     +2.5%      -3.3%      .....   +2.4%
  dsa 2048 bits     .....      .....      .....   +2.0%
                                    sign     verify   sign/s  verify/s
  160 bits ecdsa (secp160r1)        +100.0%  +100.0%  .....   -2.2%
  192 bits ecdsa (nistp192)         0.0002s  0.0002s  -3.6%   -3.3%
  224 bits ecdsa (nistp224)         0.0000s  0.0001s  .....   .....
  256 bits ecdsa (nistp256)         0.0000s  0.0001s  .....   .....
  384 bits ecdsa (nistp384)         +14.3%   0.0006s  -3.2%   .....
  521 bits ecdsa (nistp521)         0.0002s  0.0005s  .....   .....
  163 bits ecdsa (nistk163)         0.0002s  0.0003s  -3.2%   -3.0%
  233 bits ecdsa (nistk233)         0.0002s  +25.0%   .....   -2.2%
  283 bits ecdsa (nistk283)         0.0004s  0.0008s  .....   -3.5%
  409 bits ecdsa (nistk409)         0.0007s  0.0013s  -2.1%   -2.0%
  571 bits ecdsa (nistk571)         0.0015s  0.0029s  .....   .....
  163 bits ecdsa (nistb163)         0.0002s  0.0003s  .....   .....
  233 bits ecdsa (nistb233)         0.0002s  0.0005s  .....   .....
  283 bits ecdsa (nistb283)         0.0004s  0.0008s  -2.4%   -2.7%
  409 bits ecdsa (nistb409)         0.0007s  +7.7%    -2.5%   -3.5%
  571 bits ecdsa (nistb571)         0.0016s  0.0031s  .....   .....
  256 bits ecdsa (brainpoolP256r1)  0.0003s  0.0003s  -2.5%   .....
  256 bits ecdsa (brainpoolP256t1)  0.0003s  0.0003s  -2.9%   -3.2%
  384 bits ecdsa (brainpoolP384r1)  +14.3%   0.0007s  -2.9%   .....
  384 bits ecdsa (brainpoolP384t1)  +14.3%   0.0006s  -2.9%   -2.0%
  512 bits ecdsa (brainpoolP512r1)  0.0011s  0.0009s  -2.8%   -3.1%
  512 bits ecdsa (brainpoolP512t1)  +10.0%   +12.5%   -3.4%   -4.5%
                                    op       op/s
  160 bits ecdh (secp160r1)         0.0001s  -5.8%
  192 bits ecdh (nistp192)          0.0002s  -7.4%
  224 bits ecdh (nistp224)          0.0001s  .....
  256 bits ecdh (nistp256)          0.0000s  .....
  384 bits ecdh (nistp384)          0.0007s  -4.0%
  521 bits ecdh (nistp521)          0.0003s  -4.1%
  163 bits ecdh (nistk163)          0.0002s  -4.6%
  233 bits ecdh (nistk233)          0.0002s  -4.7%
  283 bits ecdh (nistk283)          0.0004s  -2.9%
  409 bits ecdh (nistk409)          0.0006s  -3.6%
  571 bits ecdh (nistk571)          0.0014s  .....
  163 bits ecdh (nistb163)          0.0002s  .....
  233 bits ecdh (nistb233)          0.0002s  .....
  283 bits ecdh (nistb283)          0.0004s  -2.5%
  409 bits ecdh (nistb409)          +16.7%   -3.2%
  571 bits ecdh (nistb571)          0.0015s  .....
  256 bits ecdh (brainpoolP256r1)   0.0003s  -3.9%
  256 bits ecdh (brainpoolP256t1)   0.0003s  -4.9%
  384 bits ecdh (brainpoolP384r1)   0.0007s  -3.7%
  384 bits ecdh (brainpoolP384t1)   0.0007s  -3.9%
  512 bits ecdh (brainpoolP512r1)   0.0010s  .....
  512 bits ecdh (brainpoolP512t1)   0.0010s  -2.1%
  253 bits ecdh (X25519)            0.0000s  .....
  448 bits ecdh (X448)              0.0002s  .....
                                    sign     verify   sign/s  verify/s
  253 bits EdDSA (Ed25519)          0.0000s  0.0001s  .....   .....
  456 bits EdDSA (Ed448)            0.0002s  0.0002s  .....   .....
                                    sign     verify   sign/s  verify/s
  256 bits SM2 (CurveSM2)           0.0003s  0.0003s  -2.9%   -3.2%
                                    op       op/s
  2048 bits ffdh                    0.0002s  .....
  3072 bits ffdh                    0.0006s  -2.4%
  4096 bits ffdh                    0.0013s  .....
  6144 bits ffdh                    0.0029s  .....
  8192 bits ffdh                    .....    .....
  
- 
- PS: I used a ZSH script for that (because bash cannot do floating point 
arithmetic operations) which is below, using two files "speed-lto" and 
"speed-no-lto":
+ PS: I used a ZSH script for that (because bash cannot do floating point
+ arithmetic operations) which is below, using two files "speed-lto" and
+ "speed-no-lto":
  
  a=speed-lto; b=speed-no-lto; l=$(wc -l speed-lto | cut -f1 -d' ')
  exec 3<$a; exec 4<$b
  for i in $(seq 1 $l); do
    read -A -u 3 c; read -A -u 4 d    # split each line into fields
    for j in $(seq 1 ${#c}); do
      x="${c[$j]}"; y="${d[$j]}"
      if [[ "$x" == "$y" ]]; then printf '%s ' "$x"
      else xm=$(echo "$x" | tr -dc '0-9'); ym=$(echo "$y" | tr -dc '0-9')    # digits only
        p=$(((100. * (ym - xm)) / xm))
        if (( p > 2 || p < -2)); then printf '%+0.1f%% ' "$p"; else printf '..... '; fi
      fi
    done; printf '\n'
  done | column -t
  exec 3>&-; exec 4>&-
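
  The per-field comparison at the heart of the script above can also be
  sketched in POSIX shell by delegating the floating-point arithmetic to
  awk. As in the script, the first argument is the value from "speed-lto"
  and the second the value from "speed-no-lto", so a positive result means
  the no-LTO value is the larger one. The function name rel_diff, the
  optional threshold argument and the "n/a" output for a zero baseline are
  assumptions of this sketch. It also relies on awk's numeric-prefix
  conversion instead of stripping non-digits with tr as the original does.

```shell
# Hypothetical helper: relative difference of the no-LTO value against the
# LTO value, in percent; prints "....." when within the threshold.
# $1: LTO value, $2: no-LTO value, $3: optional threshold in percent (default 2)
rel_diff() (
  awk -v x="$1" -v y="$2" -v t="${3:-2}" 'BEGIN {
    x += 0; y += 0                      # numeric prefix only: "0.0002s" -> 0.0002
    if (x == 0) { print "n/a"; exit }   # avoid dividing by zero
    p = 100 * (y - x) / x
    if (p > t || p < -t) printf "%+.1f%%\n", p
    else print "....."
  }'
)

rel_diff 1000 1100        # +10.0%
rel_diff 1000 1010        # .....
rel_diff 0.0001s 0.0002s  # +100.0%
```

  The last call suggests one possible reading of the large percentages in
  the time columns of the table: with only four decimals printed, a change
  in the last digit of a very small time already shows up as a large
  relative difference.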

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2058017

Title:
  openssl is not LTO-safe
