Always-prog opened a new pull request, #33300:
URL: https://github.com/apache/superset/pull/33300

   <!---
   Please write the PR title following the conventions at 
https://www.conventionalcommits.org/en/v1.0.0/
   Example:
   fix(dashboard): load charts correctly
   -->
   
   ### SUMMARY
   In Superset's current implementation, when fetching data for dashboards or 
charts, the metadata database (ex. - PostgreSQL) session remains open 
throughout the entire data source query execution. This leads to accumulation 
of numerous idle sessions in idle in transaction state, particularly under 
heavy user load or during long-running chart queries.
   
   The common workarounds for this issue involve either is increasing 
PostgreSQL's max_connections to unreasonably high values (e.g., 1000+), or 
throttling workload through Celery task limits (when using Global Async 
Queries), which negatively impacts Superset's performance.
   
   ### BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
   Look at the count and  duration of "idle in transaction" sessions in DBeaver 
chart
   
   # Before
   
   
https://github.com/user-attachments/assets/7c385785-e395-4283-bc46-db245203fb91
   
   
   # After
   
https://github.com/user-attachments/assets/d2c46bc2-a928-4c14-a762-e3952fbbf610
   
   
   In my PR I've modified the session handling logic:
    - Close the session before executing chart data query, releasing the 
metadata database connection
    - Reopen the session after query completion with loading previously removed 
SQLAlchemy objects 
   
   
   ### TESTING INSTRUCTIONS
   1. Create a Session Monitoring Chart in DBeaver.
   Use this SQL to track session states in your metadata database:
   ```sql
   SELECT
     COUNT(*) FILTER (WHERE state = 'active') AS active_sessions,
     COUNT(*) FILTER (WHERE state = 'idle') AS idle_sessions,
     COUNT(*) FILTER (WHERE state = 'idle in transaction') AS 
idle_in_transaction
   FROM pg_stat_activity
   WHERE usename <> 'postgres' and datname = 'superset';
   ```
   3. Simulate a Slow Query in Superset using next virtual dataset:
   ```sql
   SELECT *
   FROM generate_series(1, 50000000) a
   CROSS JOIN generate_series(1, 50000000) b
   LIMIT 1000000000;
   ```
   4. Open this chart in Superset.
   5. While it loads, observe the session metrics in DBeaver
   
   <!--- Required! What steps can be taken to manually verify the changes? -->
   
   ### ADDITIONAL INFORMATION
   <!--- Check any relevant boxes with "x" -->
   <!--- HINT: Include "Fixes #nnn" if you are fixing an existing issue -->
   - [x] Has associated issue: #27531
   - [ ] Required feature flags:
   - [ ] Changes UI
   - [ ] Includes DB Migration (follow approval process in 
[SIP-59](https://github.com/apache/superset/issues/13351))
     - [ ] Migration is atomic, supports rollback & is backwards-compatible
     - [ ] Confirm DB migration upgrade and downgrade tested
     - [ ] Runtime estimates and downtime expectations provided
   - [ ] Introduces new feature or API
   - [ ] Removes existing feature or API
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to