kaxil commented on PR #60330: URL: https://github.com/apache/airflow/pull/60330#issuecomment-4087499991
@dabla @yennysu The error you're seeing (`"reason":"not_found","message":"Task Instance not found"`) is a separate bug from the one this PR fixes. This PR fixes the `try_number` double-increment race, where two schedulers both schedule the same TI.

Your error is a 404, not a 409, and it is caused by UUID reassignment during orphan adoption: when a scheduler crashes, another scheduler resets the orphaned TI via `prepare_db_for_next_try()`, which assigns a new UUID. The worker that is still running keeps heartbeating with the old UUID and gets a 404. We're tracking that as a separate issue.

The way the errors scale with scheduler count matches this theory, since more schedulers mean more adoption cycles.
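To make the sequence concrete, here is a toy in-memory model of it. This is only a sketch, not Airflow's actual code: `ToyTaskStore` and its methods are hypothetical names, and the status codes are simplified (in particular, treating "row exists but is not running" as a 409 is an assumption made for illustration).

```python
import uuid


class ToyTaskStore:
    """Toy stand-in for the task-instance table (hypothetical, not Airflow code)."""

    def __init__(self):
        self.rows = {}  # TI id -> {"state": ..., "try_number": ...}

    def create(self, try_number=1):
        """Register a running TI and return the id the worker will heartbeat with."""
        ti_id = uuid.uuid4()
        self.rows[ti_id] = {"state": "running", "try_number": try_number}
        return ti_id

    def reset_orphan(self, old_id):
        """Mimic the reset described above: the TI is re-keyed under a new UUID."""
        row = self.rows.pop(old_id)
        new_id = uuid.uuid4()
        self.rows[new_id] = {"state": "scheduled", "try_number": row["try_number"]}
        return new_id

    def heartbeat(self, ti_id):
        """Return an HTTP-like status code for a worker heartbeat."""
        row = self.rows.get(ti_id)
        if row is None:
            return 404  # the id no longer exists: "Task Instance not found"
        if row["state"] != "running":
            return 409  # assumption: the id exists but its state conflicts
        return 200


store = ToyTaskStore()
old_id = store.create()
before = store.heartbeat(old_id)      # 200: worker and DB agree on the id

new_id = store.reset_orphan(old_id)   # another scheduler adopts the "orphan"
after_old = store.heartbeat(old_id)   # 404: worker still uses the stale id
after_new = store.heartbeat(new_id)   # 409: the row exists but is not running
```

The point of the sketch is that the 404 comes from the lookup key changing underneath a live worker, which is a different failure from two schedulers racing on the same row.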
