dstandish commented on code in PR #42082:
URL: https://github.com/apache/airflow/pull/42082#discussion_r1798697709


##########
airflow/models/taskinstance.py:
##########
@@ -1760,16 +1758,16 @@ def _handle_reschedule(
     ti.end_date = timezone.utcnow()
     ti.set_duration()
 
-    # Lock DAG run to be sure not to get into a deadlock situation when trying 
to insert
-    # TaskReschedule which apparently also creates lock on corresponding 
DagRun entity
-    with_row_locks(
-        session.query(DagRun).filter_by(
-            dag_id=ti.dag_id,
-            run_id=ti.run_id,
-        ),
-        session=session,
-    ).one()
-    # Log reschedule request
+    # set state
+    ti.state = TaskInstanceState.UP_FOR_RESCHEDULE
+
+    ti.clear_next_method_args()
+
+    session.merge(ti)
+    session.commit()

Review Comment:
   @potiuk yes, you are correct, this lock was added to address a report of 
deadlock _(added by you, incidentally :) )_. and i think user was mysql.   but 
it would have worked the same way on postgres.  postgres FKs work the same way. 
 when you insert it has to lock the row with the PK that you are referencing so 
you don't end up with a reference to a row that does not exist.
   
   postgres is unique in that it provides a way to say "i'm not going to delete 
or change the pk of this row" which means the insert can go through knowing the 
row won't disappear.  that's the purpose of this lock mode.
   
   but anyway, so yes in a sense this PR is "postgres only" in that we only 
change the lock used for concurrency management.  but since mysql doesn't have 
that mode, i had to still ensure we protect against deadlock in mysql.  and the 
way i did that was to introduce a commit immediately before inserting 
taskreschedule.  this will avoid deadlock in mysql because there will not be 
any other operation in the transaction where the TR record is inserted.  and 
since there will be nothing else in the tran, then a deadlock will be 
impossible.  i do not see any downside to committing before.  this means that 
with mysql, in contrast with postgres, it will still need to wait until there's 
nothing taking a FOR UPDATE lock on the record (postgres won't have to wait), 
but at least it won't _also_ have deadlock risk in addition to waiting; and it 
won't have to also do its own FOR UPDATE lock of the TI.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to