Hi, I am using the speed option to compare the performance of the AES implementations in 0.9.8y and 1.0.0k. When I run 'openssl speed aes' on my MacBook, 1.0.0k is noticeably slower than 0.9.8y in every aes-cbc test. I have tried to understand where this degradation comes from, but have been unable to find an explanation. Why is the assembly code different between these implementations? Could someone explain what improvements were made from 0.9.8 to 1.0.0? Thanks in advance, and sorry if this topic has been discussed before.
Best regards,
Rich Browne

Results below:

$ uname -a
Darwin Richards-MacBook-Pro.local 11.4.2 Darwin Kernel Version 11.4.2: Thu Aug 23 16:25:48 PDT 2012; root:xnu-1699.32.7~1/RELEASE_X86_64 x86_64

Some cpuinfo:

hw.l1icachesize: 32768
hw.ncpu: 4
hw.byteorder: 1234
hw.memsize: 8589934592
hw.activecpu: 4
hw.physicalcpu: 2
hw.physicalcpu_max: 2
hw.logicalcpu: 4
hw.logicalcpu_max: 4
hw.cputype: 7
hw.cpusubtype: 4
hw.cpu64bit_capable: 1
hw.cpufamily: 1418770316

OpenSSL 0.9.8y 5 Feb 2013:

$ ./bin/openssl speed aes
To get the most accurate results, try to run this
program when this computer is idle.
Doing aes-128 cbc for 3s on 16 size blocks: 24341207 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 64 size blocks: 6459438 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 256 size blocks: 1625911 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 1024 size blocks: 407034 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 8192 size blocks: 51238 aes-128 cbc's in 3.00s
Doing aes-192 cbc for 3s on 16 size blocks: 21384352 aes-192 cbc's in 3.00s
Doing aes-192 cbc for 3s on 64 size blocks: 5603765 aes-192 cbc's in 3.00s
Doing aes-192 cbc for 3s on 256 size blocks: 1408397 aes-192 cbc's in 3.00s
Doing aes-192 cbc for 3s on 1024 size blocks: 352863 aes-192 cbc's in 3.00s
Doing aes-192 cbc for 3s on 8192 size blocks: 44576 aes-192 cbc's in 3.00s
Doing aes-256 cbc for 3s on 16 size blocks: 19058834 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 4986321 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 256 size blocks: 1247155 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 312811 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 8192 size blocks: 39294 aes-256 cbc's in 3.00s
OpenSSL 0.9.8y 5 Feb 2013
built on: Thu Sep 26 09:49:59 MDT 2013
options:bn(64,32) md2(int) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: cc -fPIC -fno-common -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -arch i386 -O3 -fomit-frame-pointer -DL_ENDIAN
available timing options: TIMEB USE_TOD HZ=100 [sysconf value]
timing function used: getrusage
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc     129953.75k   137809.11k   138743.11k   138949.79k   139944.31k
aes-192 cbc     114080.57k   119557.19k   120174.20k   120434.19k   121724.92k
aes-256 cbc     101763.87k   106413.94k   106416.44k   106807.61k   107295.24k

OpenSSL 1.0.0k 5 Feb 2013:

$ ./bin/openssl speed aes
Doing aes-128 cbc for 3s on 16 size blocks: 16818323 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 64 size blocks: 4774704 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 256 size blocks: 1227720 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 1024 size blocks: 311395 aes-128 cbc's in 3.00s
Doing aes-128 cbc for 3s on 8192 size blocks: 38973 aes-128 cbc's in 3.00s
Doing aes-192 cbc for 3s on 16 size blocks: 14279927 aes-192 cbc's in 3.00s
Doing aes-192 cbc for 3s on 64 size blocks: 3982701 aes-192 cbc's in 3.00s
Doing aes-192 cbc for 3s on 256 size blocks: 1020458 aes-192 cbc's in 3.00s
Doing aes-192 cbc for 3s on 1024 size blocks: 258339 aes-192 cbc's in 3.00s
Doing aes-192 cbc for 3s on 8192 size blocks: 32413 aes-192 cbc's in 3.00s
Doing aes-256 cbc for 3s on 16 size blocks: 12429408 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 3435291 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 256 size blocks: 875766 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 221659 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 8192 size blocks: 27630 aes-256 cbc's in 3.00s
OpenSSL 1.0.0k 5 Feb 2013
built on: Thu Sep 26 09:55:50 MDT 2013
options:bn(64,32) rc4(4x,int) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: cc -fPIC -fno-common -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -arch i386 -O3 -fomit-frame-pointer -DL_ENDIAN -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DWHIRLPOOL_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc      89697.72k   101860.35k   104765.44k   106289.49k   106422.27k
aes-192 cbc      76159.61k    84964.29k    87079.08k    88179.71k    88509.10k
aes-256 cbc      66290.18k    73286.21k    74732.03k    75659.61k    75448.32k
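
To put a number on the gap, here is a small Python sketch that computes the percentage drop in throughput from 0.9.8y to 1.0.0k for each cipher and block size. The figures are copied from the two summary tables in this message; the names `V098Y`, `V100K`, `percent_drop` and `drops` are made up for this illustration and are not part of OpenSSL. It only quantifies the difference and says nothing about its cause.

```python
# Throughput in 1000s of bytes per second, copied from the two
# 'openssl speed aes' summary tables in this message.
SIZES = [16, 64, 256, 1024, 8192]

V098Y = {
    "aes-128 cbc": [129953.75, 137809.11, 138743.11, 138949.79, 139944.31],
    "aes-192 cbc": [114080.57, 119557.19, 120174.20, 120434.19, 121724.92],
    "aes-256 cbc": [101763.87, 106413.94, 106416.44, 106807.61, 107295.24],
}

V100K = {
    "aes-128 cbc": [89697.72, 101860.35, 104765.44, 106289.49, 106422.27],
    "aes-192 cbc": [76159.61, 84964.29, 87079.08, 88179.71, 88509.10],
    "aes-256 cbc": [66290.18, 73286.21, 74732.03, 75659.61, 75448.32],
}


def percent_drop(old, new):
    """Return how much lower `new` is than `old`, as a percentage of `old`."""
    return (old - new) / old * 100.0


def drops(old_table, new_table):
    """Map each cipher to its per-block-size percentage drop."""
    return {
        cipher: [percent_drop(o, n) for o, n in zip(old_table[cipher], new_table[cipher])]
        for cipher in old_table
    }


if __name__ == "__main__":
    print("type         " + "".join(f"{s:>9}B" for s in SIZES))
    for cipher, row in drops(V098Y, V100K).items():
        print(f"{cipher:<13}" + "".join(f"{d:>9.1f}%" for d in row))
```

On these figures the drop is largest for 16-byte blocks and smallest for the larger block sizes, and it is present for all three key lengths.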
