jason810496 opened a new pull request, #63910:
URL: https://github.com/apache/airflow/pull/63910

   
   * closes: #63545
   
   ## Why
   
   As pointed out in 
https://github.com/apache/airflow/pull/63625#pullrequestreview-3967954962, the 
original migration (and also the PR [[migration-0102] Change logic migration 
data of postgresql](https://github.com/apache/airflow/pull/63625)) will hold 
the `ACCESS EXCLUSIVE` lock that blocks **ALL the concurrent READ** and 
**WRITE**, For `TaskInstance` and `TaskInstanceHistory` tables at production 
scale, even 4 minutes of total blocking is unacceptable. (e.g. downgrade from 
3.2.0 to 3.1.8 due to other feature's error but encounter 8mins downtime)
   
   
   ## What
   
   Instead we:
   
   1. ADD `external_executor_id_new` COLUMN (instant, ~ms level, brief ACCESS 
EXCLUSIVE)
   2. Backfill in batches
       - it _might_ take longer than a naive `ALTER TABLE task_instance ALTER 
COLUMN ... TYPE VARCHAR(250)` DDL
       - but it's only a ROW EXCLUSIVE lock
       - so **with this approach, it still allow all concurrent reads/writes 
for the rows that are not in the current batch.**
   3. DROP `external_executor_id` + RENAME `external_executor_id_new` to 
`external_executor_id` in same TX (instant, ~ms level, brief ACCESS EXCLUSIVE)
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [x] Yes (please specify the tool below) Discussing the locking behavior 
with Claude, Gemini
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to