QiuYucheng2003 opened a new issue, #12771:
URL: https://github.com/apache/ignite/issues/12771

   Description
   Summary:
   In org.apache.ignite.spi.encryption.keystore.KeystoreEncryptionSpi, a static 
final ThreadLocal<Cipher> aesWithPadding caches Cipher instances for AES 
encryption. Nothing in the codebase manages the lifecycle of this ThreadLocal: 
aesWithPadding.remove() is never called, not even during the SPI stop phase 
(spiStop()).
   
   Root Cause:
   When a worker thread executes an encryption operation, a Cipher instance is 
placed into the thread's ThreadLocalMap. Because the key (aesWithPadding) is 
static final, it stays strongly reachable for as long as KeystoreEncryptionSpi 
is loaded, so the map entry is never expunged and the worker thread keeps a 
strong reference to the Cipher instance (and its associated JCA Provider 
classes) for the rest of the thread's life.
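
The retention described above can be reproduced in isolation. The following is a minimal sketch, not Ignite code: ThreadLocalRetentionDemo, CACHE and run() are hypothetical names, and a plain Object stands in for the cached Cipher. It shows that a value created by one task is still attached to the pooled thread when the next task runs, and that only an explicit remove() on that same thread drops it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalRetentionDemo {
    // Hypothetical stand-in for the static ThreadLocal<Cipher> in the report.
    private static final ThreadLocal<Object> CACHE = ThreadLocal.withInitial(Object::new);

    /** Returns { valueSurvivedAcrossTasks, valueReplacedAfterRemove }. */
    static boolean[] run() {
        // A single pooled thread, so every task sees the same ThreadLocalMap.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Object> read = CACHE::get;
            Callable<Object> removeThenRead = () -> {
                CACHE.remove();      // clears only the calling thread's entry
                return CACHE.get();  // re-runs the initializer
            };

            Object first = pool.submit(read).get();
            Object second = pool.submit(read).get();
            Object third = pool.submit(removeThenRead).get();

            return new boolean[] { first == second, second != third };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("survived across tasks: " + r[0]);
        System.out.println("replaced after remove(): " + r[1]);
    }
}
```

In a container, the pooled thread typically outlives the application that submitted the task, which is why the retained value matters there.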
   
   Impact (Critical for Embedded Mode):
   Apache Ignite is frequently deployed in embedded mode within Web Containers 
(e.g., Tomcat) or microservices.
   
   If the host application's worker threads are used for these operations, the 
ThreadLocal attaches the Cipher to long-lived threads.
   
   This creates a strong reference chain (Worker Thread -> ThreadLocalMap -> 
Cipher/Provider -> WebappClassLoader) that pins the application's classloader 
in memory.
   
   Upon application undeployment or hot-redeployment, the old classloader 
cannot be garbage collected. Repeated redeployments eventually exhaust 
Metaspace and fail with java.lang.OutOfMemoryError: Metaspace.
   
   Code Snippet
   Location: KeystoreEncryptionSpi.java
   // Definition: Static ThreadLocal without boundary management
   private static final ThreadLocal<Cipher> aesWithPadding = ThreadLocal.withInitial(() -> {
       try {
           return Cipher.getInstance(AES_WITH_PADDING);
       } catch (NoSuchAlgorithmException | NoSuchPaddingException e) {
           throw new IgniteException(e);
       }
   });
   (Note: aesWithPadding.remove() is never called in doEncryption() or 
spiStop())
   
   Expected Behavior
   The SPI should ensure that thread-local resources are cleanly removed to 
prevent memory and classloader leaks.
   
   Proposed Fix:
   
   1. Targeted Cleanup: If the ThreadLocal must remain static for performance, 
expose a cleanup hook that each worker thread can run (ThreadLocal.remove() 
clears only the calling thread's entry), or ensure aesWithPadding.remove() is 
called in a try-finally block after the cipher operation is complete (if 
thread pooling allows).
   
   2. Instance-level caching: Consider changing the static ThreadLocal to an 
instance-level ThreadLocal or a lightweight object pool bound to the 
KeystoreEncryptionSpi instance lifecycle, ensuring it can be cleared during 
spiStop().
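
Both proposals can be sketched roughly as follows. This is an illustration under stated assumptions, not a patch: PerCallCleanup and CipherPool are hypothetical classes that do not exist in Ignite, and the transformation string "AES/CBC/PKCS5Padding" is assumed, since the report only names the constant AES_WITH_PADDING. Wiring stop() into spiStop() is likewise only suggested.

```java
import java.security.GeneralSecurityException;
import java.util.concurrent.ConcurrentLinkedQueue;
import javax.crypto.Cipher;

/** Option 1 (hypothetical): keep the static ThreadLocal, remove after each use. */
class PerCallCleanup {
    // Assumed value; the report only shows the constant name AES_WITH_PADDING.
    static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";

    private static final ThreadLocal<Cipher> CIPHER =
        ThreadLocal.withInitial(PerCallCleanup::newCipher);

    static Cipher newCipher() {
        try {
            return Cipher.getInstance(TRANSFORMATION);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static String useThenRemove() {
        try {
            return CIPHER.get().getAlgorithm(); // placeholder for the real cipher work
        } finally {
            CIPHER.remove(); // nothing stays attached to the calling thread
        }
    }
}

/** Option 2 (hypothetical): a small pool owned by one SPI instance. */
class CipherPool {
    private final ConcurrentLinkedQueue<Cipher> idle = new ConcurrentLinkedQueue<>();
    private volatile boolean stopped;

    Cipher borrow() {
        Cipher c = idle.poll();
        return c != null ? c : PerCallCleanup.newCipher();
    }

    void release(Cipher c) {
        idle.offer(c);
        if (stopped)
            idle.clear(); // a release that races with stop() must not repopulate the pool
    }

    /** Intended to be called from the owner's stop hook, e.g. spiStop(). */
    void stop() {
        stopped = true;
        idle.clear();
    }

    int idleCount() {
        return idle.size();
    }
}
```

Option 1 gives up most of the caching benefit, because the Cipher is recreated on the next call from the same thread. Option 2 keeps reuse but ties every cached Cipher to an object that is released when the SPI instance stops, so no reference remains on threads the SPI does not own.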

