kaxil commented on PR #60330:
URL: https://github.com/apache/airflow/pull/60330#issuecomment-4087499991

   @dabla @yennysu The error you're seeing 
(`"reason":"not_found","message":"Task Instance not found"`) is a separate bug 
from what this PR fixes. This PR fixes the try_number double-increment race 
(where two schedulers both schedule the same TI). Your error is a 404, not a 
409, caused by UUID reassignment during orphan adoption: when a scheduler 
crashes, another scheduler resets the orphaned TI via 
`prepare_db_for_next_try()`, which assigns a new UUID. The worker, which is 
still running, keeps sending heartbeats with the old UUID and gets a 404.
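
   To make the 404 vs 409 distinction concrete, here is a minimal toy sketch. It is not Airflow code: `ToyTaskStore`, `reset_orphan`, and `heartbeat` are hypothetical names, and the only behaviour borrowed from the description above is that the reset re-keys the TI under a new UUID. The status codes are simplified assumptions.

```python
import uuid


class ToyTaskStore:
    """In-memory stand-in for the task instance table (hypothetical, not Airflow's model)."""

    def __init__(self):
        self.rows = {}  # ti_id -> state

    def create_running(self):
        """Register a task instance that a worker is currently executing."""
        ti_id = str(uuid.uuid4())
        self.rows[ti_id] = "running"
        return ti_id

    def reset_orphan(self, old_id):
        """Assumed adoption reset: drop the old key and re-insert under a fresh UUID."""
        self.rows.pop(old_id)
        new_id = str(uuid.uuid4())
        self.rows[new_id] = "scheduled"
        return new_id

    def heartbeat(self, ti_id):
        """Return an HTTP-like status code for a worker heartbeat."""
        state = self.rows.get(ti_id)
        if state is None:
            return 404  # id unknown: "Task Instance not found"
        if state != "running":
            return 409  # id known, but its state conflicts with a heartbeat
        return 200


store = ToyTaskStore()
old_id = store.create_running()
before_reset = store.heartbeat(old_id)   # worker heartbeats normally

new_id = store.reset_orphan(old_id)      # another scheduler adopts and resets the TI
after_reset = store.heartbeat(old_id)    # worker still holds the old UUID
```

   In this sketch the worker's heartbeat fails with a 404 only because the id it holds no longer exists; a 409 would require the id to still resolve to a row in a conflicting state.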
   
   We're tracking that as a separate issue. The error scaling with scheduler 
count is consistent with this explanation, since more schedulers mean more 
adoption cycles.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
