On Thu, 1 Sep 2022 17:28:40 GMT, zzambers <d...@openjdk.org> wrote: > There is a race condition in JDK's SessionManager, which can lead to random > exceptions. > > **Exception:** > > javax.net.ssl.SSLException: Internal error: close session with active objects > at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:133) > at > java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:371) > at > java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:314) > at > java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:309) > at > java.base/sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1707) > at > java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1080) > at > java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:971) > at SSLSocketServer.serverLoop(SSLSocketServer.java:133) > at SSLSocketServer$1.run(SSLSocketServer.java:75) > at java.base/java.lang.Thread.run(Thread.java:833) > Caused by: java.security.ProviderException: Internal error: close session > with active objects > at > jdk.crypto.cryptoki/sun.security.pkcs11.Session.close(Session.java:127) > at > jdk.crypto.cryptoki/sun.security.pkcs11.Session.close(Session.java:114) > at > jdk.crypto.cryptoki/sun.security.pkcs11.SessionManager.closeSession(SessionManager.java:237) > at > jdk.crypto.cryptoki/sun.security.pkcs11.SessionManager$Pool.release(SessionManager.java:270) > at > jdk.crypto.cryptoki/sun.security.pkcs11.SessionManager.demoteObjSession(SessionManager.java:210) > at > jdk.crypto.cryptoki/sun.security.pkcs11.Session.removeObject(Session.java:101) > at > jdk.crypto.cryptoki/sun.security.pkcs11.SessionKeyRef.updateNativeKey(P11Key.java:1396) > at > jdk.crypto.cryptoki/sun.security.pkcs11.SessionKeyRef.removeNativeKey(P11Key.java:1377) > at > jdk.crypto.cryptoki/sun.security.pkcs11.NativeKeyHolder.releaseKeyID(P11Key.java:1329) > at > jdk.crypto.cryptoki/sun.security.pkcs11.P11Key.releaseKeyID(P11Key.java:156) > at > jdk.crypto.cryptoki/sun.security.pkcs11.P11AEADCipher.reset(P11AEADCipher.java:529) > at > jdk.crypto.cryptoki/sun.security.pkcs11.P11AEADCipher.ensureInitialized(P11AEADCipher.java:436) > at > jdk.crypto.cryptoki/sun.security.pkcs11.P11AEADCipher.implDoFinal(P11AEADCipher.java:732) > at > jdk.crypto.cryptoki/sun.security.pkcs11.P11AEADCipher.engineDoFinal(P11AEADCipher.java:624) > at java.base/javax.crypto.Cipher.doFinal(Cipher.java:2500) > at > java.base/sun.security.ssl.SSLCipher$T12GcmReadCipherGenerator$GcmReadCipher.decrypt(SSLCipher.java:1659) > at > java.base/sun.security.ssl.SSLSocketInputRecord.decodeInputRecord(SSLSocketInputRecord.java:260) > at > java.base/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:181) > at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111) > at > java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1508) > at > java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1479) > at > java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1064) > ... 4 more > > > **Reproducibility:** > I started getting this exception quite reliably on JDK17 on my machine with > one particular test setup using ssl-tests testsuite. Unfortunately setup > itself needed some RH specific patches and also ability to reproduce depends > on other factors such number keys in keystore, machine where testing was > performed... I tried a bit to create some reproducer, but I couldn't find a > way to easily reproduce this issue :( > > **Problem:** > SunPKCS11 provider does session pooling. This is done in SessionManager [1] > (one per SunPKCS11 provider). Released sessions are kept by SessionManager > for a while, for reuse (in limited number). This however is a bit complicated > as some sessions can own objects (e.g. keys). So there are actually 2 pools. > One for sessions with objects ("objSessions") and one for sessions without > objects ("opSessions"). This is because sessions without objects, which are > not being used, can be safely closed (SessionManager only keeps around > limited amount of these), while sessions with objects cannot be safely closed > (until all objects are removed from them). Session manager has methods for > getting Session for given purpose (object creation or just doing other > operations), prioritizing appropriate pool. Each session has counter (called > "createdObjects") to track how many objects it owns. When session is being > returned to pool this counter is checked and session is placed to appropriate > pool. Also whe n counter for some Session in "objSessions" pool reaches zero it is moved ("demoted") to "opSessions" pool. > > And here comes complicated part. As far as I understand it, > Session.addObject() [2] (which increases "createdObjects" counter) is always > being called by thread "holding" session which owns the created object. (That > is: thread gets a session, uses it to create an object and calls > Session.addObject() on that session to increase the counter, before returning > the session to pool. See e.g.: [3]) However this is not true for > Session.removeObject() [4]. (That is: thread gets session, which is not > necessary the same one owning object being removed, performs object removal, > but then calls Session.removeObject() on session which owned that object. See > e.g.: [5]) That is Session.removeObject() can be called on Session which is > in "objSessions" pool or which is being used be other thread. (object removal > can happen as result of releasing key, either explicitly or as result of GC > etc..). > > And finally, there is a problem in code handling object removal from a > session. Session.removeObject() [4] first checks if "createdObjects" counter > reached zero. If so, it calls SessionManager.demoteObjSession(this) [6], > which attempts to remove Session from objSessions pool, if session is > successfully removed from there, meaning no other thread "holds" this > session, session is put to opSessions pool, if not (meaning other thread > "holds" it), method just returns, since that other thread puts this session > to appropriate pool, when it is done with it by calling > SessionManager.releaseSession(session). > > There is race condition here. Consider following scenario: > > // Thread T1 runs: > Session.removeObject() // [4] > createdObjects.decrementAndGet() // returns zero > > // Thread T2 steps in (operating on the same session instance): > Session.addObject() // increases "createdObjects" counter [2] > SessionManager.releaseSession(session) // releases session to objSessions pool > > // Thread T1 continues: > SessionManager.demoteObjSession(this) // [6] > objSessions.remove(session) // returns true > opSessions.release(session) // puts session (with objects!) to opSessions > pool > // if opSessions is already full, close of session with objects is attempted > throwing Exception.. > > > **Fix:** > SessionManager.demoteObjSession [6] method was changed, so that check for > objects is done once again if session was successfully removed from > "objSessions" pool (now that it is out of pool and other threads should not > be adding objects to it). Based on this check session is either released to > "opSessions" pool or returned to "objSessions" pool. This can be achieved by > calling releaseSession(session) instead of opSessions.release(session). > > **Testing:** > jdk_security tests passed for me locally with this change. > I have also tested this change on top of custom JDK17 build which allows > scenario, where I can reproduce this issue. Problem got fixed. > > [1] > https://github.com/openjdk/jdk/blob/9444a081cc9873caa7b5c6a78df0d1aecda6e4f1/src/jdk.crypto.cryptoki/share/classes/sun/security/pkcs11/SessionManager.java > [2] > https://github.com/openjdk/jdk/blob/9444a081cc9873caa7b5c6a78df0d1aecda6e4f1/src/jdk.crypto.cryptoki/share/classes/sun/security/pkcs11/Session.java#L93 > [3] > https://github.com/openjdk/jdk/blob/9444a081cc9873caa7b5c6a78df0d1aecda6e4f1/src/jdk.crypto.cryptoki/share/classes/sun/security/pkcs11/P11Key.java#L1339 > [4] > https://github.com/openjdk/jdk/blob/9444a081cc9873caa7b5c6a78df0d1aecda6e4f1/src/jdk.crypto.cryptoki/share/classes/sun/security/pkcs11/Session.java#L98 > [5] > https://github.com/openjdk/jdk/blob/9444a081cc9873caa7b5c6a78df0d1aecda6e4f1/src/jdk.crypto.cryptoki/share/classes/sun/security/pkcs11/P11Key.java#L1360 > [6] > https://github.com/openjdk/jdk/blob/9444a081cc9873caa7b5c6a78df0d1aecda6e4f1/src/jdk.crypto.cryptoki/share/classes/sun/security/pkcs11/SessionManager.java#L195
This pull request has now been integrated. Changeset: 1e031e6a Author: Zdenek Zambersky <zzamb...@redhat.com> Committer: Valerie Peng <valer...@openjdk.org> URL: https://git.openjdk.org/jdk/commit/1e031e6a5886fba3009d8e5fa62416fe15a901b6 Stats: 7 lines in 1 file changed: 5 ins; 0 del; 2 mod 8293232: Fix race condition in pkcs11 SessionManager Reviewed-by: valeriep ------------- PR: https://git.openjdk.org/jdk/pull/10125