github-actions[bot] opened a new pull request, #62996:
URL: https://github.com/apache/airflow/pull/62996

   * perf: use load_only() in eager_load_dag_run_for_validation to reduce data fetched
   
   The get_dag_runs API endpoint was slow on large deployments because
   eager_load_dag_run_for_validation() used selectinload on task_instances and
   task_instances_histories without restricting which columns were fetched.
   This caused SQLAlchemy to load every column, including heavyweight ones
   (executor_config with pickled data, hostname, rendered fields, etc.), for
   every task instance across every DAG run in the result page, even though only
   dag_version_id is needed to traverse the association proxy to DagVersion.
   
   Add load_only(TaskInstance.dag_version_id) and
   load_only(TaskInstanceHistory.dag_version_id) to the selectinload chains so
   the SELECT for task instances fetches only the identity columns and the FK
   needed to resolve the dag_version relationship, significantly reducing the
   volume of data transferred from the database on busy deployments.
   
   Fixes #62025
   
   * Fix static checks
   
   ---------
   (cherry picked from commit 13af96b80868ef91ca623d35afcd76003bfbda90)
   
   Co-authored-by: Lakshmi Sravya <[email protected]>
   Co-authored-by: pierrejeambrun <[email protected]>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
