On Tue, 2024-10-08 at 11:17 -0700, Richard Henderson wrote:
> On 10/5/24 13:35, Ilya Leoshkevich wrote:
> > > How can we handle the long-running syscalls?
> > > Just waiting sounds unsatisfying.
> > > Sending a reserved host signal may alter the guest's behaviour if a
> > > syscall like pause() is interrupted.
> > > What do you think about SIGSTOP-ping the "in_syscall" threads?
> > > A quick experiment shows that it should be completely invisible to
> > > the guest - the following program continues to run after
> > > SIGSTOP/SIGCONT:
> > > 
> > > #include <sys/syscall.h>
> > > #include <unistd.h>
> > > int main(void) { syscall(__NR_pause); }
> > 
> > Hmm, no, that won't work: SIGSTOP would stop all threads.
> > 
> > So I wonder if reserving a host signal for interrupting "in_syscall"
> > threads would be an acceptable tradeoff?
> 
> Could work, yes.  We already steal SIGRTMIN for guest abort (to
> distinguish from host abort), and remap guest __SIGRTMIN to host
> SIGRTMIN+1.  Grabbing SIGRTMIN+1 should work ok, modulo the existing
> problem of presenting the guest with an incomplete set of signals.
> 
> I've wondered from time to time about multiplexing signals in this
> space, but I think that runs afoul of having a consistent mapping for
> interprocess signaling.
> 
> 
> r~

I tried to think through how this would work in conjunction with
start_exclusive(), and there is one problem I don't see a good solution
for. Maybe you will have an idea.

The way I'm thinking of implementing this is as follows:

- Reserve the host's SIGRTMIN+1 and tweak host_signal_handler() to do
  nothing for this signal.

- In gdb_try_stop(), call start_exclusive(). After it returns, some
  threads will be parked in exclusive_idle(). Some other threads will
  be on their way to getting parked, and this needs to actually happen
  before gdb_try_stop() can proceed. For example, the ones that are
  executing handle_pending_signal() may change memory and CPU state.
  IIUC start_exclusive() will not wait for them, because they are not
  "running". I think a global counter protected by qemu_cpu_list_lock
  and paired with a new condition variable should be enough for this.

- Threads executing long-running syscalls will need to be interrupted
  by SIGRTMIN+1. These syscalls will return -EINTR and will need
  to be manually restarted so as not to disturb poorly written guests.
  This needs to happen only if there are no pending guest signals.

- Here is a minor problem: how to identify threads which need to be
  signalled? in_syscall may not be enough. But maybe signalling all
  threads won't hurt too much. The parked ones won't notice anyway.

- But here is the major problem: what if we signal a thread just before
  it starts executing a long-running syscall? Such a thread will be
  stuck and we'll need to signal it again. But how do we determine that
  this needs to be done?

  An obvious solution is to signal all threads in a loop with a 0.1s
  delay until the counter reaches n_threads. But it's quite ugly.

  Ideally SIGRTMIN+1 should be blocked most of the time. Then we should
  identify all places where long-running syscalls may be invoked and
  unblock SIGRTMIN+1 atomically with executing them. But I'm not aware
  of such a mechanism (I have an extremely vague recollection that
  someone managed to abuse rseq for this, but we shouldn't be relying
  on rseq being available anyway).
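
To make the scheme above a bit more concrete, here is a rough standalone
sketch of the "ugly" variant using plain pthreads. It is not QEMU code,
and every name in it (STOP_SIG, lock, n_parked, stop_requested,
stop_all_demo() and so on) is made up for illustration: "lock" merely
stands in for qemu_cpu_list_lock, pause() stands in for any long-running
guest syscall, and guest signals are left out entirely. The handler is
empty and installed without SA_RESTART, so the syscall returns EINTR and
the thread decides by itself whether to park or to restart it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define STOP_SIG    (SIGRTMIN + 1)  /* assumption: the reserved host signal */
#define MAX_THREADS 64

/* Hypothetical stand-ins for qemu_cpu_list_lock and the new condvar. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t parked_cond = PTHREAD_COND_INITIALIZER;
static pthread_cond_t resume_cond = PTHREAD_COND_INITIALIZER;
static int n_parked;            /* the global counter */
static bool stop_requested;

/* Does nothing on purpose: its only job is to make syscalls return EINTR. */
static void stop_sig_handler(int sig) { (void)sig; }

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        /*
         * Race window: a STOP_SIG handled right here is lost, and pause()
         * below blocks until the stopper signals this thread again.
         */
        pause();                        /* the "long-running syscall" */
        pthread_mutex_lock(&lock);
        if (stop_requested) {
            n_parked++;                 /* park and tell the stopper */
            pthread_cond_signal(&parked_cond);
            while (stop_requested) {
                pthread_cond_wait(&resume_cond, &lock);
            }
            pthread_mutex_unlock(&lock);
            return NULL;                /* the demo ends after one stop */
        }
        pthread_mutex_unlock(&lock);
        /* No stop requested: loop around, i.e. restart the syscall. */
    }
}

/* Starts n_threads workers, stops them all, returns the parked count. */
int stop_all_demo(int n_threads)
{
    pthread_t tids[MAX_THREADS];
    struct sigaction sa;
    int i, result;

    assert(n_threads > 0 && n_threads <= MAX_THREADS);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = stop_sig_handler;   /* note: no SA_RESTART */
    sigemptyset(&sa.sa_mask);
    sigaction(STOP_SIG, &sa, NULL);

    for (i = 0; i < n_threads; i++) {
        int err = pthread_create(&tids[i], NULL, worker, NULL);
        assert(err == 0);
    }

    pthread_mutex_lock(&lock);
    n_parked = 0;
    stop_requested = true;
    while (n_parked < n_threads) {
        struct timespec deadline;

        /* Signal everyone; already parked threads will not notice. */
        for (i = 0; i < n_threads; i++) {
            pthread_kill(tids[i], STOP_SIG);
        }
        /* Wait up to 0.1s for the counter, then signal again. */
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_nsec += 100 * 1000 * 1000;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_sec++;
            deadline.tv_nsec -= 1000000000L;
        }
        while (n_parked < n_threads &&
               pthread_cond_timedwait(&parked_cond, &lock,
                                      &deadline) != ETIMEDOUT) {
        }
    }
    result = n_parked;
    stop_requested = false;             /* let the workers go */
    pthread_cond_broadcast(&resume_cond);
    pthread_mutex_unlock(&lock);

    for (i = 0; i < n_threads; i++) {
        pthread_join(tids[i], NULL);
    }
    return result;
}
```

Calling stop_all_demo(4) should return 4 once all four workers have
parked. The periodic re-signalling is the only thing that rescues a
thread that took the signal just before entering pause(), which is
exactly the part I find ugly.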
