ziqiangliang opened a new issue, #7378:
URL: https://github.com/apache/gravitino/issues/7378

   ### Version
   
   main branch
   
   ### Describe what's wrong
   
   ### Summary
   
   In `org.apache.gravitino.storage.relational.utils.SessionUtils`, the methods:
   
   * `doWithoutCommitAndFetchResult`
   * `doWithoutCommit`
   
   **do not properly close** the underlying `SqlSession`, resulting in JDBC 
**connection leaks** when used repeatedly in production environments.
   
   ### Affected Methods
   
   ```java
   public static <T, R> R doWithoutCommitAndFetchResult(Class<T> mapperClazz, 
Function<T, R> func) {
     T mapper = SqlSessions.getMapper(mapperClazz); // opens session
     return func.apply(mapper); // does NOT close session
   }
   
   public static <T> void doWithoutCommit(Class<T> mapperClazz, Consumer<T> 
consumer) {
     T mapper = SqlSessions.getMapper(mapperClazz); // opens session
     consumer.accept(mapper); // does NOT close session
   }
   ```
   
   These methods obtain a `SqlSession` via `SqlSessions.getMapper()`, which 
internally opens a session and stores it in a `ThreadLocal`, but **they never 
call `commitAndCloseSqlSession()` or `rollbackAndCloseSqlSession()`**, nor do 
they use try-with-resources.
   
   As a result, **connections are never released back to the pool** if the 
caller forgets to manually close them—which violates the encapsulation these 
helper methods are meant to provide.
   
   ### Why This Is a Problem
   
   In long-running applications:
   
   * Connections accumulate in the pool in an active but idle state.
   * Eventually, all connections are used up.
   * DBCP cannot reclaim them since they’re not marked abandoned.
   * Leads to errors like:
   
     ```
     CommunicationsException: The last packet successfully received from the 
server was ...
     ```
   
   ### Comparison with `doWithCommit()`
   
   The `doWithCommit` method is correctly implemented:
   
   ```java
   public static <T> void doWithCommit(Class<T> mapperClazz, Consumer<T> 
consumer) {
     try (SqlSession session = SqlSessions.getSqlSession()) {
       try {
         T mapper = SqlSessions.getMapper(mapperClazz);
         consumer.accept(mapper);
         SqlSessions.commitAndCloseSqlSession();
       } catch (Throwable t) {
         SqlSessions.rollbackAndCloseSqlSession();
         throw t;
       }
     }
   }
   ```
   
   It guarantees session cleanup and avoids leaks.
   
   ### Suggested Fix
   
   Update both `doWithoutCommit` and `doWithoutCommitAndFetchResult` to ensure 
proper resource handling. For example:
   
   ```java
   public static <T, R> R doWithoutCommitAndFetchResult(Class<T> mapperClazz, 
Function<T, R> func) {
     try (SqlSession session = SqlSessions.getSqlSession()) {
       try {
         T mapper = SqlSessions.getMapper(mapperClazz);
         return func.apply(mapper);
       } finally {
         SqlSessions.closeSqlSession(); // or use commit/rollback if necessary
       }
     }
   }
   ```
   
   Or possibly use the same pattern as in `doWithCommit`, depending on your 
desired transactional guarantees.
   
   ### Impact
   
   * Prevents JDBC connection leaks
   * Ensures proper lifecycle management of SqlSession
   * Restores pool stability under load
   
   
   ### Error message and/or stacktrace
   
   ```
   ### Error querying database.  Cause: 
com.mysql.cj.jdbc.exceptions.CommunicationsException: The last packet 
successfully received from the server was 37,288,357 milliseconds ago. The last 
packet sent successfully to the server was 37,288,357 milliseconds ago. is 
longer than the server configured value of 'wait_timeout'. You should consider 
either expiring and/or testing connection validity before use in your 
application, increasing the server configured values for client timeouts, or 
using the Connector/J connection property 'autoReconnect=true' to avoid this 
problem
   ```
   
   
   ### How to reproduce
   
   version: **0.8.1-incubating-r1**
   ### reproduce way
   
   1. Start the Gravitino service with a JDBC backend and a limited connection 
pool (e.g., DBCP2).
   2. Call the REST API GET 
`/metalakes/{metalake}/objects/{type}/{fullName}/tags` repeatedly. This 
triggers `SessionUtils.doWithoutCommitAndFetchResult(...)`.
   3. Observe:
        Even after stopping all traffic, the connection pool’s **active count 
remains > 0 indefinitely, indicating leaked sessions**.
    
   4. This can be confirmed by inspecting metrics via DBCP2 or through 
Arthas/JMX.
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to