Review of MDEV-34857: Implement --slave-abort-blocking-timeout

If a slave replicating an event has waited for more than
@@slave_abort_blocking_timeout for a conflicting metadata lock held by a
non-replication thread, the blocking query is killed to allow replication to
proceed and not be blocked indefinitely by a user query.

<cut>

diff --git a/sql/mdl.cc b/sql/mdl.cc
index faccd1c9476..9845718e165 100644
--- a/sql/mdl.cc
+++ b/sql/mdl.cc

<cut>

@@ -2397,14 +2398,39 @@ MDL_context::acquire_lock(MDL_request
*mdl_request, double lock_wait_timeout)

   find_deadlock();

-  struct timespec abs_timeout, abs_shortwait;
+  struct timespec abs_timeout, abs_shortwait, abs_abort_blocking_timeout;
+  bool abort_blocking_enabled= false;
+  double abort_blocking_timeout= slave_abort_blocking_timeout;

Add a comment here. Something like

/*
  This code implements aborting of queries on the slaves that stops
  replication from continuing.
*/

+  if (abort_blocking_timeout < lock_wait_timeout &&
+      m_owner->get_thd()->rgi_slave)
+  {
+    set_timespec_nsec(abs_abort_blocking_timeout,
+                      (ulonglong)(abort_blocking_timeout * 1000000000ULL));
+    abort_blocking_enabled= true;
+  }
   set_timespec_nsec(abs_timeout,
                     (ulonglong)(lock_wait_timeout * 1000000000ULL));
-  set_timespec(abs_shortwait, 1);
   wait_status= MDL_wait::EMPTY;

-  while (cmp_timespec(abs_shortwait, abs_timeout) <= 0)
+  for (;;)
   {
+    bool abort_blocking= false;
+    set_timespec(abs_shortwait, 1);
+    if (abort_blocking_enabled &&
+        cmp_timespec(abs_shortwait, abs_abort_blocking_timeout) >= 0)
+    {
+      /*
+        If a slave DDL has waited for --slave-abort-select-timeout, then notify
+        any blocking SELECT once before continuing to wait until the full
+        timeout.
+      */
+      abs_shortwait= abs_abort_blocking_timeout;
+      abort_blocking= true;
+      abort_blocking_enabled= false;
+    }
+    if (cmp_timespec(abs_shortwait, abs_timeout) > 0)
+      break;

Is the above right last compare right?
Assuming that cmp_timespec(abs_shortwait, abs_abort_blocking_timeout) > 0
then the above will not break and we will not signal other threads to break even
if the previous code expectrs us to abort blocking.
(I think the cmp_timespec() return test should be the same in both cases)

Shouldn't the code be:

  for (;;)
  {
    bool abort_blocking= false;
    set_timespec(abs_shortwait, 1);
    if (abort_blocking_enabled &&
        cmp_timespec(abs_shortwait, abs_abort_blocking_timeout) >= 0)
    {
      /*
        If a slave DDL has waited for --slave-abort-select-timeout, then notify
        any blocking SELECT once before continuing to wait until the full
        timeout.
      */
      abort_blocking= true;
      abort_blocking_enabled= false;
    }
    else if (cmp_timespec(abs_shortwait, abs_timeout) > 0)
      break;

Or have I missed something?

     /* abs_timeout is far away. Wait a short while and notify locks. */
     wait_status= m_wait.timed_wait(m_owner, &abs_shortwait, FALSE,
                                    mdl_request->key.get_wait_state_name());
@@ -2425,9 +2451,8 @@ MDL_context::acquire_lock(MDL_request
*mdl_request, double lock_wait_timeout)

     mysql_prlock_wrlock(&lock->m_rwlock);
     if (lock->needs_notification(ticket))
-      lock->notify_conflicting_locks(this);
+      lock->notify_conflicting_locks(this, abort_blocking);
     mysql_prlock_unlock(&lock->m_rwlock);
-    set_timespec(abs_shortwait, 1);

I like the new code in that we are not setting abs_shortwait two times anymore.
(Using a while loop in the is case was sub optimal.

Ok to push after the above change or if you have a good argument why
the current code is correct
_______________________________________________
commits mailing list -- commits@lists.mariadb.org
To unsubscribe send an email to commits-le...@lists.mariadb.org

Reply via email to