In my last email I ran the benchmark on Fedora 21 (big mistake). Here are the results when running it back on the 28/56 core RHEL 6.5 server showing contention in a different place (fips). Is there a reason it would be calling into the FIPS code to get a lock of FIPS is not enabled?
Contention log: [Apr 24 21:35:09.731] Server {0x7f5529ccd700} ERROR: contention for lock - total contention: 70662 waiting: 16 file: fips.c line: 471 type: 40 [Apr 24 21:35:09.732] Server {0x7f55299ca700} ERROR: contention for lock - total contention: 71605 waiting: 2 file: fips.c line: 107 type: 39 Line in in fips that see the contention the most: [bcall@bm77 trafficserver]$ grep fips.c diags.log | cut -f 19 -d ' ' | sort | uniq -c | sort -rn 875379 471 288338 107 275472 127 Different lock types and their contention frequency (multiple benchmarks): 875379 40 < - in fips.c 563810 39 <- also in fips.c 440518 1 44145 18 17447 2 11356 10 1377 19 687 12 532 25 302 33 239 9 58 16 40 21 32 11 19 3 1 14 Code to diable FIPS and log: #ifdef OPENSSL_FIPS int mode = FIPS_mode(); Debug("ssl", "FIPS_mode: %d", mode); mode = FIPS_mode(); if (mode) { FIPS_mode_set(0); Debug("ssl", "FIPS_mode: %d", mode); } #endif [Apr 24 21:43:45.860] Server {0x7f7628146800} DEBUG: (ssl) FIPS_mode: 0 -Bryan > On Apr 24, 2015, at 10:56 AM, John Foley <fol...@cisco.com > <mailto:fol...@cisco.com>> wrote: > > When you create the private key given to SSL_CTX, be sure to instantiate a > private key for each thread as well. If you share the same key across all > threads, this can introduce contention. > > Here are the type 1 & 2 locks: > > # define CRYPTO_LOCK_ERR 1 > # define CRYPTO_LOCK_EX_DATA 2 > > Have you checked the OpenSSL error log to see what errors are being generated? > > Also, if you disable FIPS, does the thread contention still occur? > > > > On 04/24/2015 01:36 PM, Bryan Call wrote: >> We are using a single SSL_CTX across all the threads, so I will create >> multiple SSL_CTX per thread. I implemented dynlock callbacks yesterday, but >> I didn’t see them being used in the tests I did. I also added atomic >> counters to see what type of locks are having contention (see logs below): >> >> Most of the lock contention seems to be around type 2 and type 1 locks when >> calling our lock callback. In the FIPS/POST and SSL_CTX recommendations, in >> your previous email, what types of locks would those help with? >> >> Log output: >> [Apr 24 10:21:24.339] Server {0xb310000} ERROR: <SSLUtils.cc >> <http://sslutils.cc/>:178 (SSL_locking_callback)> contention for lock - >> total contention: 3938 waiting: 1 file: >> /SourceCache/OpenSSL098/OpenSSL098-52.20.2/src/crypto/ex_data.c line: 496 >> type: 2 >> [Apr 24 10:21:24.754] Server {0xb10a000} ERROR: <SSLUtils.cc >> <http://sslutils.cc/>:178 (SSL_locking_callback)> contention for lock - >> total contention: 1350 waiting: 1 file: >> /SourceCache/OpenSSL098/OpenSSL098-52.20.2/src/crypto/err/err_def.c line: >> 343 type: 1 >> >> >> -Bryan >> >> >> >> >>> On Apr 23, 2015, at 4:46 PM, John Foley <fol...@cisco.com >>> <mailto:fol...@cisco.com>> wrote: >>> >>> Looking at your call stack, it appears you're testing with FIPS mode >>> enabled. There's a lock in FIPS_selftest_failed() that's used to ensure >>> the crypto algorithms inside the FIPS boundary are not utilized until after >>> the POST has completed and passed. Unfortunately this lock remains >>> installed for the lifetime of the process. We've seen thread contention in >>> this area when using OpenSSL. One work-around is to modify your handler >>> that you pass to SSL_locking_callback() to not lock when the lock ID is >>> CRYPTO_LOCK_FIPS. But you would only want to disable this particular lock >>> ID after the POST has completed. >>> >>> If the above doesn't help, another item to check is whether you're >>> instantiating an SSL_CTX for each thread. I've seen some developers use a >>> single SSL_CTX for multiple threads. This results in a single instance of >>> the server cert and private key being shared across all the threads. This >>> results in thread contention on the asymmetric key operations associated >>> with the cert and private key (e.g. RSA). Be sure that each thread >>> instantiates it's own SSL_CTX and certificate/private key. >>> >>> Lastly, add some debugs to count the locks by lock ID in your lock handler. >>> There are about 40 different lock IDs defined in crypto/crypto.h. You'll >>> need to identify which lock is causing the thread contention. Once you >>> know that, then you can look in the code to see where that lock is used and >>> troubleshoot from there. >>> >>> >>> >>> On 04/23/2015 07:23 PM, Bryan Call wrote: >>>> I was running a benchmark on a 28 core (56 hyper-threaded) server that is >>>> running 84 threads in the process and I am seeing a lot of lock >>>> contention. I saw a lot of lock contention in the calls to >>>> SSL_get_error() ssl3_accept(). I am running RHEL 6.5 and >>>> openssl-1.0.1e-30.el6_6.7.x86_64. We have used SSL_locking_callback() to >>>> setup pthread mutex locks, but haven’t setup dynlocks yet. Would dynlocks >>>> help in this scenario? >>>> >>>> >>>> Output from perf: >>>> Samples: 1M of event 'cycles', Event count (approx.): 602783506611 >>>> - 48.04% [ET_NET 0] [kernel.kallsyms] [k] _spin_lock ◆ >>>> - _spin_lock ▒ >>>> + 48.05% futex_wait_setup ▒ >>>> + 47.79% futex_wake ▒ >>>> + 1.20% tcp_v4_rcv ▒ >>>> + 0.98% task_rq_lock ▒ >>>> + 0.64% double_lock_balance ▒ >>>> + 3.94% [ET_NET 0] libcrypto.so.1.0.1e [.] bn_mul_mont ▒ >>>> + 1.93% [ET_NET 0] libcrypto.so.1.0.1e [.] BN_usub ▒ >>>> + 1.80% [ET_NET 0] libcrypto.so.1.0.1e [.] 0x00000000000e5200 ▒ >>>> + 1.49% [ET_NET 0] libcrypto.so.1.0.1e [.] sha256_block_data_order ▒ >>>> + 0.97% [ET_NET 0] libcrypto.so.1.0.1e [.] gcm_ghash_clmul ▒ >>>> + 0.85% [ET_NET 0] libpthread-2.12.so [.] pthread_mutex_lock ▒ >>>> + 0.84% [ET_NET 0] libcrypto.so.1.0.1e [.] aesni_ctr32_encrypt_blocks ▒ >>>> + 0.74% [ET_NET 0] libc-2.12.so [.] malloc ▒ >>>> + 0.72% [ET_NET 0] libc-2.12.so [.] memcpy ▒ >>>> + 0.57% [ET_NET 0] libcrypto.so.1.0.1e [.] BN_mod_mul_montgomery ▒ >>>> + 0.51% [ET_NET 0] libpthread-2.12.so [.] pthread_mutex_unlock >>>> >>>> pstack of the process (only a couple of the interesting threads): >>>> Thread 1 (Thread 0x7f1fba151800 (LWP 1106)): >>>> #0 0x00007f1fbaaeb264 in __lll_lock_wait () from /lib64/libpthread.so.0 >>>> #1 0x00007f1fbaae6508 in _L_lock_854 () from /lib64/libpthread.so.0 >>>> #2 0x00007f1fbaae63d7 in pthread_mutex_lock () from /lib64/libpthread.so.0 >>>> #3 0x000000000072822f in SSL_locking_callback(int, int, char const*, int) >>>> () >>>> #4 0x00000031d2ae7f03 in ?? () from /usr/lib64/libcrypto.so.10 >>>> #5 0x00000031d2ae7d8c in ?? () from /usr/lib64/libcrypto.so.10 >>>> #6 0x00000031d2ae80ab in ERR_get_state () from /usr/lib64/libcrypto.so.10 >>>> #7 0x00000031d2ae8557 in ?? () from /usr/lib64/libcrypto.so.10 >>>> #8 0x00000031d2e424a9 in SSL_get_error () from /usr/lib64/libssl.so.10 >>>> #9 0x00000000007249ef in SSLNetVConnection::load_buffer_and_write(long, >>>> long&, long&, MIOBufferAccessor&, int&) () >>>> #10 0x00000000007389df in write_to_net_io(NetHandler*, >>>> UnixNetVConnection*, EThread*) () >>>> #11 0x0000000000738523 in write_to_net(NetHandler*, UnixNetVConnection*, >>>> EThread*) () >>>> #12 0x0000000000731922 in NetHandler::mainNetEvent(int, Event*) () >>>> #13 0x00000000004f6d48 in Continuation::handleEvent(int, void*) () >>>> #14 0x000000000075877a in EThread::process_event(Event*, int) () >>>> #15 0x0000000000758c84 in EThread::ex >>>> >>>> Thread 56 (Thread 0x7f1fb0ecd700 (LWP 1172)): >>>> #0 0x00007f1fbaae77d9 in pthread_mutex_unlock () from >>>> /lib64/libpthread.so.0 >>>> #1 0x0000000000728265 in SSL_locking_callback(int, int, char const*, int) >>>> () >>>> #2 0x00000031d2b57402 in ?? () from /usr/lib64/libcrypto.so.10 >>>> #3 0x00000031d2b574bd in FIPS_selftest_failed () from >>>> /usr/lib64/libcrypto.so.10 >>>> #4 0x00000031d2ae9f1c in EVP_DigestInit_ex () from >>>> /usr/lib64/libcrypto.so.10 >>>> #5 0x00000031d2a7977e in HMAC_Init_ex () from /usr/lib64/libcrypto.so.10 >>>> #6 0x00000031d2a79c29 in ?? () from /usr/lib64/libcrypto.so.10 >>>> #7 0x00000031d2af7a42 in EVP_PKEY_CTX_ctrl () from >>>> /usr/lib64/libcrypto.so.10 >>>> #8 0x00000031d2ae9ffb in EVP_DigestInit_ex () from >>>> /usr/lib64/libcrypto.so.10 >>>> #9 0x00000031d2af952a in ?? () from /usr/lib64/libcrypto.so.10 >>>> #10 0x00000031d2e34cf2 in tls1_PRF () from /usr/lib64/libssl.so.10 >>>> #11 0x00000031d2e35796 in tls1_final_finish_mac () from >>>> /usr/lib64/libssl.so.10 >>>> #12 0x00000031d2e2db2e in ssl3_get_message () from /usr/lib64/libssl.so.10 >>>> #13 0x00000031d2e1dc3c in ssl3_get_cert_verify () from >>>> /usr/lib64/libssl.so.10 >>>> #14 0x00000031d2e220d6 in ssl3_accept () from /usr/lib64/libssl.so.10 >>>> #15 0x0000000000725287 in SSLNetVConnection::sslServerHandShakeEvent(int&) >>>> () >>>> #16 0x00000000007251b3 in SSLNetVConnection::sslStartHandShake(int, int&) >>>> () >>>> #17 0x0000000000723df3 in SSLNetVConnection::net_read_io(NetHandler*, >>>> EThread*) () >>>> #18 0x0000000000731852 in NetHandler::mainNetEvent(int, Event*) () >>>> #19 0x00000000004f6d48 in Continuation::handleEvent(int, void*) () >>>> #20 0x000000000075877a in EThread::process_event(Event*, int) () >>>> #21 0x0000000000758c84 in EThread::execute() () >>>> #22 0x0000000000757cf8 in spawn_thread_internal(void*) () >>>> #23 0x00007f1fbaae49d1 in start_thread () from /lib64/libpthread.so.0 >>>> #24 0x00000030ff0e88fd in clone () from /lib64/libc.so.6 >>>> >>>> -Bryan >>>> >>>> >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> openssl-users mailing list >>>> To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-users >>>> <https://mta.openssl.org/mailman/listinfo/openssl-users> >>> >>> _______________________________________________ >>> openssl-users mailing list >>> To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-users >>> <https://mta.openssl.org/mailman/listinfo/openssl-users> >> >> >> >> _______________________________________________ >> openssl-users mailing list >> To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-users >> <https://mta.openssl.org/mailman/listinfo/openssl-users> > > _______________________________________________ > openssl-users mailing list > To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-users > <https://mta.openssl.org/mailman/listinfo/openssl-users>
_______________________________________________ openssl-users mailing list To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-users