Hi,

since e-mail signing/DKIM (RFC 4871) is becoming more and more important, and our production servers all run stable S10u7 with openssl 0.9.7d, which we don't want to mix with 0.9.8* for obvious reasons, I decided to pull the pk11_RSA_{sign|verify} stuff out of the openssl pkcs11 contrib and bolt it onto libdkim, so that it has [rsa-]sha256 support as well. To get an idea, see http://iws.cs.uni-magdeburg.de/~elkner/dkim/dkim.c.patch

To decide whether it is worth pulling out the digest stuff too, or just going with the digest lib (much less maintenance burden), I made some "benchmarks", i.e. running the t-{verify|sign}perf of libdkim (sorry for the wide format):

#   OS       CPU              openssl          MHz  sign    %  cor%  verify    %  cor%
1)  snv_115  Opteron 254      0.9.8a          2813   394  100   100    3901  100   100
9)  snv_110  UltraSPARC-IIIi  0.9.8a          1503   130   33    62    4832  124   232
7)  snv_110  UltraSPARC-IV+   0.9.8a          1500   135   34    64    5870  150   282
8)  snv_110  UltraSPARC-IIIi  0.9.8a/pkcs     1503   329   84   157    5249  135   252
6)  snv_110  UltraSPARC-IV+   0.9.8a/pkcs     1500   363   92   173    5523  142   266
4)  s10u7    DC Opteron 285   0.9.7d/pkcs/md  2593   874  222   241   16317  418   454
3)  snv_115  Opteron 254      0.9.8a/pkcs     2813  1210  307   307    6000  154   154
2)  snv_115  Opteron 254      0.9.7d/pkcs/md  2813  1332  338   338    6250  160   160
5)  s10u7    UltraSPARC-IIIi  0.9.7d/pkcs/md  1503  3363  854  1598   19028  488   913

sign is the rsa-sha256 signing speed in msgs/sec, % is relative to test 1), and cor% is the % relative to 1) but scaled to the CPU clock (cycles/s), i.e. % multiplied by the clock of 1) and divided by the clock of the machine under test. Analogously, verify is the rsa-sha256 verifying speed in msgs/sec ...

More details (hw, results, cc options used) are available via http://iws.cs.uni-magdeburg.de/~elkner/dkim/dkimbench.txt

Actually, I got what I really hadn't expected:

1) I thought that an x86 64bit machine with ~ 2x the clock rate of a sparc machine should be at least as fast as the sparc machine
2) I thought that openssl would automatically choose the pkcs11 engine on Solaris when it is available
3) even with pkcs11 enabled, Nevada performance compared to S10u7 is disappointing: for signing it is about a factor of 10, for verifying about a factor of 4!

So questions:

1) Is x86 pkcs11 not yet optimized very well? Or is it simply a hardware limitation which prevents further optimizations?

2) Isn't it possible to have Solaris openssl automatically load the pkcs11 engine and set it as default?
I mean, a lot of SW uses openssl today, but the only one I know of that lets you choose the engine to use is apache httpd2+ ...

3) As for the test and HW, I don't think a V240 is that much different from a Blade-1500 (silver). The only difference I can see which could IMHO matter is that on machines 4) and 5) the provider is set to /usr/lib/security/$ISA/pkcs11_softtoken_extra.so, probably a relic from S10u3 (SUNWcry) times ... If this is really the key to the problem, why is this lib not part of Nevada?

Last but not least, an implementation question: I'm not a pkcs11 expert, and the documentation on openssl/pkcs11 is missing one part: how to clean up? The documentation says that one should call C_Finalize(NULL) when the application is done with the crypto stuff, but also states that a library should not call it, because it may have side effects (which ones, and which ones if one doesn't call it?). So openssl/contrib/crypto/engine/hw_pk11.c::pk11_finish(ENGINE *e) (called by EVP_cleanup(), I guess) doesn't call C_Finalize, but sets the pFuncList back to NULL. So calling C_Finalize before EVP_cleanup() is IMHO not ok, but calling it after EVP_cleanup() would be wrong as well. Saving a pointer to the function list and calling it later (e.g. see: http://iws.cs.uni-magdeburg.de/~elkner/dkim/dkim-crypto.c.patch) seems to be wrong too, because of the pkcs11 DSO unload/cleanup. So what is the right thing to do here?

Regards,
jel.
--
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 12768
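
For anyone who wants to recheck the benchmark table: the % and cor% columns can be reproduced with a few lines of Python. This is only a sketch. The function names pct and cor_pct are made up for illustration, and the numbers are taken from tests 1), 4) and 9) above.

```python
# Reference row: test 1) snv_115, Opteron 254, openssl 0.9.8a
REF_MHZ = 2813       # CPU clock in MHz
REF_SIGN = 394       # rsa-sha256 signing speed, msgs/sec
REF_VERIFY = 3901    # rsa-sha256 verifying speed, msgs/sec


def pct(rate, ref_rate):
    """Throughput relative to the reference row, in percent (the % column)."""
    return 100.0 * rate / ref_rate


def cor_pct(rate, mhz, ref_rate, ref_mhz=REF_MHZ):
    """Relative throughput scaled by the clock ratio (the cor% column)."""
    return pct(rate, ref_rate) * ref_mhz / mhz


if __name__ == "__main__":
    # (label, MHz, sign msgs/sec, verify msgs/sec), copied from the table
    rows = [
        ("4) s10u7, DC Opteron 285, 0.9.7d/pkcs/md", 2593, 874, 16317),
        ("9) snv_110, UltraSPARC-IIIi, 0.9.8a", 1503, 130, 4832),
    ]
    for label, mhz, sign, verify in rows:
        print(label,
              round(pct(sign, REF_SIGN)),
              round(cor_pct(sign, mhz, REF_SIGN)),
              round(pct(verify, REF_VERIFY)),
              round(cor_pct(verify, mhz, REF_VERIFY)))
```

Note that cor% only normalizes for clock rate; it says nothing about differences in cache, memory or core count between the machines.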