On 1/1/26 8:52 AM, Thomas Koenig wrote:
Hi Jerry,

The attached patch fixes this by using TRYLOCK to see if the UNIT is already in use before proceeding with the I/O.  Regression tested on x86_64-linux-gnu.

The idea was triggered by Thomas in the PR.

OK for mainline?

Hmm... thinking about the logic, I think that

   if (TRYLOCK(&p->lock))
     {
       if (p->child_dtio == 0)
         {
           generate_error (&dtp->common, LIBERROR_RECURSIVE_IO, NULL);
           return;
         }
       UNLOCK(&p->lock);
     }


Reviewing what trylock does (courtesy of Gemini):

"Unlike a standard "lock" operation, which pauses your program until the mutex is available, a "trylock" is non-blocking. It checks the status of the mutex and returns immediately.

If the mutex is free: The function locks it and returns a success code (usually 0).

If the mutex is already held: The function does not wait. It immediately returns an error code (usually EBUSY). "
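
To make the two return values concrete, here is a minimal sketch using plain POSIX threads. It assumes TRYLOCK behaves like pthread_mutex_trylock; the helper names trylock_when_free and trylock_when_held are hypothetical and exist only for this illustration.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: try to lock a mutex that nobody holds.  */
int
trylock_when_free (void)
{
  pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
  int rc = pthread_mutex_trylock (&m);  /* mutex is free: we acquire it */
  if (rc == 0)
    pthread_mutex_unlock (&m);          /* we own it, so unlocking is valid */
  pthread_mutex_destroy (&m);
  return rc;
}

/* Hypothetical helper: try to lock a mutex that is already held.  */
int
trylock_when_held (void)
{
  pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
  pthread_mutex_lock (&m);              /* take the lock first */
  int rc = pthread_mutex_trylock (&m);  /* already held: returns at once */
  pthread_mutex_unlock (&m);            /* release the lock taken above */
  pthread_mutex_destroy (&m);
  return rc;
}
```

With a default mutex, the first helper should return 0 and the second should return EBUSY without blocking.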

So, if trylock returns a 0 which is boolean false, it won't error out and we have the lock so we immediately unlock. The value of child_dtio does not matter in this case.

If the trylock finds the lock is already taken, it returns a non-zero value which is boolean true and if this is not child_dtio (parent) we throw the error and never hit the unlock.

If child_dtio > 0 (child), we don't error and we fall through to the unlock. The unlock will release the mutex if it is locked; however, if it is not locked, the behavior can be undefined. So what I think we need to do is the following:

if ( && (p->child_dtio == 0))
  {
    if (TRYLOCK(&p->lock))
      {
         generate_error (&dtp->common, LIBERROR_RECURSIVE_IO, NULL);
         return;
      }
    UNLOCK(&p->lock);
  }

This will avoid potential undefined behavior on various systems. I will make this change and test. If it passes, OK for mainline?
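
For reference, a standalone sketch of the reordered check, written against plain pthreads rather than the library's own macros. The struct unit_sketch and the function recursive_io_detected are hypothetical stand-ins, and only the child_dtio test from the condition is modeled; the point is just that the unlock is reached only on the path where the trylock succeeded.

```c
#include <pthread.h>

/* Hypothetical stand-in for the unit structure; only the two fields
   discussed above are modeled.  */
struct unit_sketch
{
  pthread_mutex_t lock;
  int child_dtio;
};

/* Returns 1 if the unit is busy and we are not in child DTIO
   (the recursive I/O case), otherwise 0.  */
int
recursive_io_detected (struct unit_sketch *p)
{
  if (p->child_dtio == 0)
    {
      if (pthread_mutex_trylock (&p->lock) != 0)
        return 1;  /* lock is held elsewhere; we never acquired it */
      pthread_mutex_unlock (&p->lock);  /* acquired just above, safe to release */
    }
  return 0;  /* child DTIO or unit free: the lock is left as we found it */
}
```

In the child case the mutex is not touched at all, so there is no unlock of a mutex the caller does not hold.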

--- snip ---
