Hello,

Nice investigation!

Zhaoming Luo, le jeu. 27 févr. 2025 10:37:36 +0800, a ecrit:
> It means that pthread_hurd_cond_timedwait_np() successes but
> current->signal is still 1. I assume current->signal will only be
> modified to 1 when pthread_hurd_cond_timedwait_np() returns EINTR. Maybe
> we should set current->signal to 0 when the return from
> pthread_hurd_cond_timedwait_np() is 0?

Actually it seems like the right way indeed, with some more fix.

Clearing it to 0 is done by prepare_current() which is done on RPC
entry, so that the linux code knows that we are not interrupted so
far. That pthread_hurd_cond_timedwait_np call is the only place inside
pfinet where pfinet sleeps, and thus where we care for seeing EINTR,
which we then need to remember and propagate through the linux stack
down to the RPC stub.

While the pthread_hurd_cond_timedwait_np call is sleeping, the
current_contents structure is used for other RPCs, so current->signal
may be cleared & set while managing other RPCs. That's actually why
interruptible_sleep_on_timeout takes care of saving current->isroot and
current->next_wait to restore them after taking the global lock again.
Since current->signal may written by something else, we have to put
something in it, always.

Also, I'm thinking that if we previously got interrupted, we shouldn't
ever try to wait again until exiting from the RPC: an interrupt means we
really want to try hard to finish the RPC. So I'd say we want:

  current->next_wait = 0;

  if (current->signal)
    /* We already got interrupted previously, keep interrupting the
       RPC.  */
    err = EINTR;
  else
    {
      /* This is the only place where we sleep within an RPC and release the 
global
         lock while serving it.  */
      err = pthread_hurd_cond_timedwait_np(c, &global_lock, tsp);

      if (err == EINTR)
        current->signal = 1;        /* We got cancelled, mark it for Linux code 
to bail out.  */
      else
        current->signal = 0;
    }

  current->isroot = isroot;

even if the linux code is not supposed to call schedule() again if
signal_pending() != 0, it's probably safer to avoid the question.

Could you check that it works fine?

Samuel

Reply via email to