jerpelea opened a new pull request, #9719:
URL: https://github.com/apache/nuttx/pull/9719

   ## Summary
   If cancellation points are enabled, then the following logic is activated in 
sem_wait().  This causes ECANCELED to be returned every time that sem_wait() is 
called while a cancellation is pending.
   
       int sem_wait(FAR sem_t *sem)
       {
         ...
   
         /* sem_wait() is a cancellation point */
   
         if (enter_cancellation_point())
           {
       #ifdef CONFIG_CANCELLATION_POINTS
             /* If there is a pending cancellation, then do not perform
              * the wait.  Exit now with ECANCELED. */
   
             errcode = ECANCELED;
             goto errout_with_cancelpt;
       #endif
           }
         ...
   
   Normally this works fine.  sem_wait() is the OS API called by the 
application and will cancel the thread just before it returns to the 
application.  Since it is a cancellation point, it should never be called from 
within the OS.
   
   There is one perverse case where sem_wait() may be nested within 
another cancellation point.  If open() is called, it will attempt to lock a VFS 
data structure and will eventually call nxmutex_lock().  nxmutex_lock() waits 
on a semaphore:
   
      int nxmutex_lock(FAR mutex_t *mutex)
      {
        ...
   
        for (; ; )
          {
            /* Take the semaphore (perhaps waiting) */
   
            ret = _SEM_WAIT(&mutex->sem);
            if (ret >= 0)
              {
                mutex->holder = _SCHED_GETTID();
                break;
              }
   
            ret = _SEM_ERRVAL(ret);
            if (ret != -EINTR && ret != -ECANCELED)
              {
                break;
              }
          }
        ...
      }
   
   In the FLAT build, _SEM_WAIT expands to sem_wait().  That is the error 
in the logic:  it should always expand to nxsem_wait(), because sem_wait() is 
a cancellation point and should never be called from within the OS or the C 
library internally.
   
   The failure occurs because sem_wait() is nested:  its cancellation point 
logic returns -ECANCELED (via _SEM_ERRVAL) so that the error can reach the 
outermost cancellation point, which is open() in this case.  Because 
nxmutex_lock() retries the wait whenever it sees -ECANCELED, and sem_wait() 
returns it on every call, an infinite loop occurs in nxmutex_lock().
   
   The correct behavior in this case is to call nxsem_wait() instead of 
sem_wait().  nxsem_wait() is identical to sem_wait() except that it is not a 
cancellation point.  It will return -ECANCELED if the thread is canceled, but 
only once, so no infinite loop results.
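
The difference between the two behaviors can be illustrated with a small stand-alone model. This is only a sketch and not NuttX code: `struct fake_task`, `fake_sem_wait()`, `fake_nxsem_wait()` and `retry_lock()` are hypothetical names invented for the illustration, and the retry loop is capped at `max_attempts` so that the "infinite loop" case can be observed without hanging. It also assumes that the wait succeeds once the cancellation has been reported.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-thread state; not the NuttX TCB. */

struct fake_task
{
  bool cancel_pending;   /* A cancellation has been requested */
  bool cancel_reported;  /* -ECANCELED was already returned once */
  int  sem_count;        /* Simplified semaphore count */
};

typedef int (*wait_fn)(struct fake_task *task);

/* Models sem_wait() nested inside another cancellation point:
 * -ECANCELED is reported on every call while cancellation is pending.
 */

static int fake_sem_wait(struct fake_task *task)
{
  if (task->cancel_pending)
    {
      return -ECANCELED;
    }

  task->sem_count--;
  return 0;
}

/* Models nxsem_wait() as described above: -ECANCELED is reported only
 * once.  That the next call then takes the semaphore is an assumption
 * of this model.
 */

static int fake_nxsem_wait(struct fake_task *task)
{
  if (task->cancel_pending && !task->cancel_reported)
    {
      task->cancel_reported = true;
      return -ECANCELED;
    }

  task->sem_count--;
  return 0;
}

/* Same retry policy as the nxmutex_lock() loop quoted earlier, but
 * bounded.  Returns the number of attempts on success, 0 if the cap
 * was reached (the real loop would spin forever), or a negated errno
 * for any other error.
 */

static int retry_lock(struct fake_task *task, wait_fn wait,
                      int max_attempts)
{
  int attempts = 0;

  assert(max_attempts > 0);

  for (; ; )
    {
      int ret;

      if (attempts == max_attempts)
        {
          return 0;
        }

      attempts++;
      ret = wait(task);
      if (ret >= 0)
        {
          return attempts;
        }

      if (ret != -EINTR && ret != -ECANCELED)
        {
          return ret;
        }
    }
}
```

With a pending cancellation, `retry_lock()` driven by `fake_sem_wait()` exhausts its attempt budget, while the same loop driven by `fake_nxsem_wait()` sees -ECANCELED once and succeeds on the second attempt.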
   
   In addition, an nxsem_wait() system call was added to support the call from 
nxmutex_lock().
   
   This resolves Issue #9695
   
   ## Impact
   RELEASE
   
   ## Testing
   NONE
   

