[Verification XENIAL]

# i386
- Significant performance increase using the xenial-proposed/i386 package inside a 32-bit LXD container on a Ryzen CPU with Intel SHA Extension capability.
- Same performance (as expected) using the xenial-proposed/i386 package on a non SHA Extension Intel CPU (i7-6770HQ).

# amd64
- Significant performance increase using the xenial-proposed/amd64 package on a Ryzen CPU with Intel SHA Extension capability.
- Same performance (as expected) using the xenial-proposed/amd64 package on a non SHA Extension Intel CPU (i7-6770HQ).

Note: Unfortunately, neither I nor my colleagues have access to an Intel CPU with SHA Extension capability. Ideally, someone with access to one would test it. Otherwise, I think it is safe to rely on the upstream author of the patch, who confirmed it works as expected on an Intel CPU with SHA Extension capability.

Reference: https://github.com/openssl/openssl/issues/2848
"...Myself I tested on Intel processors, yes, with/without...."

==
* Test xenial/i386 on a 32-bit LXD container using a non SHA Extension Intel CPU:
--
ii libssl1.0.0:i386 1.0.2g-1ubuntu4.6 i386 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu4.6 i386 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 12391058 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 8934411 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 5048901 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 1893157 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 301374 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: cc -I. -I..
-I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type          16 bytes      64 bytes     256 bytes    1024 bytes    8192 bytes
sha1         66085.64k    190600.77k    430839.55k    646197.59k    822951.94k

# time openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5

real 0m15.518s
user 0m14.428s
sys 0m1.084s

==
* Test xenial-proposed/i386 on a 32-bit LXD container using a non SHA Extension Intel CPU:
--
ii libssl1.0.0:i386 1.0.2g-1ubuntu4.7 i386 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu4.7 i386 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 12451389 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 8913173 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 5037978 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 1904530 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 303177 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: cc -I. -I..
-I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type          16 bytes      64 bytes     256 bytes    1024 bytes    8192 bytes
sha1         66407.41k    190147.69k    429907.46k    650079.57k    827875.33k

# time openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5

real 0m15.259s
user 0m14.372s
sys 0m0.884s

==
* Test xenial/i386 on a 32-bit LXD container using a Ryzen CPU:
--
ii libssl1.0.0:i386 1.0.2g-1ubuntu4.6 i386 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu4.6 i386 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 11833291 sha1's in 2.98s
Doing sha1 for 3s on 64 size blocks: 9305964 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 5679556 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 2285214 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 345908 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: cc -I. -I..
-I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type          16 bytes      64 bytes     256 bytes    1024 bytes    8192 bytes
sha1         63534.45k    198527.23k    484655.45k    780019.71k    944559.45k

# time openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5

real 0m15.768s
user 0m14.536s
sys 0m1.224s

==
* Test xenial-proposed/i386 on a 32-bit LXD container using a Ryzen CPU:
--
ii libssl1.0.0:i386 1.0.2g-1ubuntu4.7 i386 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu4.7 i386 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 14893525 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 12927665 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 9115331 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 4153241 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 682211 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: cc -I. -I..
-I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type          16 bytes      64 bytes     256 bytes    1024 bytes    8192 bytes
sha1         79432.13k    275790.19k    777841.58k   1417639.59k   1862890.84k

# time openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5

real 0m3.650s
user 0m3.004s
sys 0m0.644s

==
* Test xenial/amd64 on an Intel CPU (64-bit) without the Intel SHA Extension:
--
ii libssl1.0.0:amd64 1.0.2g-1ubuntu4.6 amd64 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu4.6 amd64 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 16131936 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 11366181 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 6534703 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 2442789 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 357145 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: cc -I. -I..
-I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type          16 bytes      64 bytes     256 bytes    1024 bytes    8192 bytes
sha1         86036.99k    242478.53k    557627.99k    833805.31k    975243.95k

# time openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5

real 0m12.574s
user 0m11.832s
sys 0m0.740s

==
* Test xenial-proposed/amd64 on an Intel CPU (64-bit) without the Intel SHA Extension:
--
ii libssl1.0.0:amd64 1.0.2g-1ubuntu4.7 amd64 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu4.7 amd64 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 15937653 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 11304094 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 6501379 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 2441543 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 357137 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: cc -I. -I..
-I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type          16 bytes      64 bytes     256 bytes    1024 bytes    8192 bytes
sha1         85000.82k    241154.01k    554784.34k    833380.01k    975222.10k

# time openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5

real 0m12.376s
user 0m11.812s
sys 0m0.560s

==
* Test xenial/amd64 on a Ryzen CPU:
--
ii libssl1.0.0:amd64 1.0.2g-1ubuntu4.6 amd64 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu4.6 amd64 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 17131254 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 12106212 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 6704314 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 2441523 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 352205 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: cc -I. -I..
-I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type          16 bytes      64 bytes     256 bytes    1024 bytes    8192 bytes
sha1         91366.69k    258265.86k    572101.46k    833373.18k    961754.45k

# time openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5

real 0m13.664s
user 0m12.448s
sys 0m1.208s

==
* Test xenial-proposed/amd64 on a Ryzen CPU:
--
ii libssl1.0.0:amd64 1.0.2g-1ubuntu4.7 amd64 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu4.7 amd64 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 25297696 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 19825090 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 12025484 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 4665262 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 694700 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: cc -I. -I..
-I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type          16 bytes      64 bytes     256 bytes    1024 bytes    8192 bytes
sha1        134921.05k    422935.25k   1026174.63k   1592409.43k   1896994.13k

# time openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5

real 0m3.579s
user 0m2.940s
sys 0m0.636s
==

** Tags added: verification-done-xenial verification-done-zesty
** Tags removed: sts verification-needed

** Tags added: ua

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to openssl in Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

Status in openssl package in Ubuntu:
  Fix Released
Status in openssl source package in Xenial:
  Fix Committed
Status in openssl source package in Yakkety:
  Fix Committed
Status in openssl source package in Zesty:
  Fix Committed
Status in openssl source package in Artful:
  Fix Released

Bug description:

[Impact]

* Context:

AMD added support in their processors for SHA Extensions[1] (CPU flag: sha_ni[2]) starting with the Ryzen[3] CPU. Note that Ryzen CPUs come in 64-bit only (confirmed with an AMD representative).

The current OpenSSL version on Ryzen still calls the SSSE3 SHA routine; as a result, a number of extensions are effectively masked on Ryzen and no improvement is seen.

[1] /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 23
model : 1
model name : AMD Ryzen 5 1600 Six-Core Processor
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

[2] - sha_ni: SHA1/SHA256 Instruction Extensions

[3] - https://en.wikipedia.org/wiki/Ryzen
...
All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.
...

* Program to perform the CPUID check:

Reference : https://software.intel.com/en-us/articles/intel-sha-extensions
...
Availability of the Intel® SHA Extensions on a particular processor can be determined by checking the SHA CPUID bit in CPUID.(EAX=07H, ECX=0):EBX.SHA [bit 29]. The following C function, using inline assembly, performs the CPUID check:
--
int CheckForIntelShaExtensions() {
   int a, b, c, d;

   // Look for CPUID.7.0.EBX[29]
   // EAX = 7, ECX = 0
   a = 7;
   c = 0;

   asm volatile ("cpuid"
        :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
        :"a"(a), "c"(c)
       );

   // Intel® SHA Extensions feature bit is EBX[29]
   return ((b >> 29) & 1);
}
--

On a CPU with sha_ni the function returns "1". Otherwise it returns "0".

[Test Case]

* Reproducible with the Xenial/Zesty/Artful releases.

* Generate a checksum of a big file (e.g.
5GB file) with openssl

$ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

real 0m12.835s
user 0m12.344s
sys 0m0.484s

* Openssl speed

$ openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 9969152 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 8019164 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 5254219 sha1's in 2.99s
Doing sha1 for 3s on 1024 size blocks: 2217067 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 347842 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -O3 -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type          16 bytes      64 bytes     256 bytes    1024 bytes    8192 bytes
sha1         53168.81k    171075.50k    449859.55k    756758.87k    949840.55k

Performance is clearly better with the patch, which takes advantage of the SHA extension. (See the Regression Potential section for results with the patch.)

[Regression Potential]

* Note : IRC discussion with infinity :
https://bugs.launchpad.net/ubuntu/xenial/+source/openssl/+bug/1674399/comments/8

* Note from IRC discussion with apw and rbasak :
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/comments/2

* It basically allows OpenSSL to take advantage of the SHA extension (mostly performance-wise) now that new AMD CPUs are starting to have the capability.

* The code checks the CPUID bit to determine whether the SHA instructions are available or not.

* The maintainer's comment confirms that he tested successfully on Intel with and without the SHA extension.

Reference: https://github.com/openssl/openssl/issues/2848
"I don't have access to Ryzen system, so I didn't test it explicitly on Ryzen. Reporter did confirm it tough. Myself I tested on Intel processors, yes, with/without."

* LP reporter comment :

I, slashd, have tested on a Ryzen system (and on a non-Ryzen AMD system) and on a non-SHA Intel CPU.

It reveals a significant performance increase on Ryzen due to the SHA extension :
(Note that performance remains the same on CPUs without the SHA extension (AMD/Intel), as expected, since they cannot benefit from it.)

[Tested on a Ryzen CPU]

# Generated a checksum of a big file (e.g. 5GB file) with openssl
$ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

real 0m3.471s
user 0m2.956s
sys 0m0.516s

# Openssl speed
$ openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 12081890 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 11563950 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 8375101 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 3987643 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 678036 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -O3 -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type          16 bytes      64 bytes     256 bytes    1024 bytes    8192 bytes
sha1         64436.75k    246697.60k    714675.29k   1361115.48k   1851490.30k

[Other Info]

* Debian Bug : https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=861145

* Upstream PR : https://github.com/openssl/openssl/issues/2848

* Upstream Repository : https://github.com/openssl/openssl.git

* Upstream Commits :

1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
## This fix moves extended feature detection past basic feature detection where it belongs.

f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.
## This commit for x86_64cpuid.pl addressed the problem, but messed up processor vendor detection.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/+subscriptions

--
Mailing list: https://launchpad.net/~touch-packages
Post to : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help : https://help.launchpad.net/ListHelp