jason810496 opened a new pull request, #63910: URL: https://github.com/apache/airflow/pull/63910
* closes: #63545 ## Why As pointed out in https://github.com/apache/airflow/pull/63625#pullrequestreview-3967954962, the original migration (and also the PR [[migration-0102] Change logic migration data of postgresql](https://github.com/apache/airflow/pull/63625)) will hold the `ACCESS EXCLUSIVE` lock that blocks **ALL the concurrent READ** and **WRITE**, For `TaskInstance` and `TaskInstanceHistory` tables at production scale, even 4 minutes of total blocking is unacceptable. (e.g. downgrade from 3.2.0 to 3.1.8 due to other feature's error but encounter 8mins downtime) ## What Instead we: 1. ADD `external_executor_id_new` COLUMN (instant, ~ms level, brief ACCESS EXCLUSIVE) 2. Backfill in batches - it _might_ take longer than a naive `ALTER TABLE task_instance ALTER COLUMN ... TYPE VARCHAR(250)` DDL - but it's only a ROW EXCLUSIVE lock - so **with this approach, it still allow all concurrent reads/writes for the rows that are not in the current batch.** 3. DROP `external_executor_id` + RENAME `external_executor_id_new` to `external_executor_id` in same TX (instant, ~ms level, brief ACCESS EXCLUSIVE) --- ##### Was generative AI tooling used to co-author this PR? - [x] Yes (please specify the tool below) Discussing the locking behavior with Claude, Gemini -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
