On Tue, Feb 14, 2023 at 05:23:08PM +0100, Markus Armbruster wrote: > Daniel P. Berrangé <berra...@redhat.com> writes: > > > On Tue, Feb 14, 2023 at 05:36:32PM +0400, Marc-André Lureau wrote: > >> Hi > >> > >> On Tue, Feb 14, 2023 at 5:34 PM Markus Armbruster <arm...@redhat.com> > >> wrote: > >> > > >> > marcandre.lur...@redhat.com writes: > >> > > >> > > From: Marc-André Lureau <marcandre.lur...@redhat.com> > >> > > > >> > > As per comment, presumably to avoid syscall in critical section. > >> > > > >> > > Fixes: 0210c3b39bef08 ("monitor: Use LOCK_GUARD macros") > >> > > Signed-off-by: Marc-André Lureau <marcandre.lur...@redhat.com> > >> > > --- > >> > > monitor/fds.c | 4 +++- > >> > > 1 file changed, 3 insertions(+), 1 deletion(-) > >> > > > >> > > diff --git a/monitor/fds.c b/monitor/fds.c > >> > > index 26b39a0ce6..03c5e97c35 100644 > >> > > --- a/monitor/fds.c > >> > > +++ b/monitor/fds.c > >> > > @@ -80,7 +80,7 @@ void qmp_getfd(const char *fdname, Error **errp) > >> > > return; > >> > > } > >> > > > >> > > - QEMU_LOCK_GUARD(&cur_mon->mon_lock); > >> > > + qemu_mutex_lock(&cur_mon->mon_lock); > >> > > QLIST_FOREACH(monfd, &cur_mon->fds, next) { > >> > > if (strcmp(monfd->name, fdname) != 0) { > >> > > continue; > >> > > @@ -88,6 +88,7 @@ void qmp_getfd(const char *fdname, Error **errp) > >> > > > >> > > tmp_fd = monfd->fd; > >> > > monfd->fd = fd; > >> > > + qemu_mutex_unlock(&cur_mon->mon_lock); > >> > > /* Make sure close() is outside critical section */ > >> > > close(tmp_fd); > >> > > return; > >> > > @@ -98,6 +99,7 @@ void qmp_getfd(const char *fdname, Error **errp) > >> > > monfd->fd = fd; > >> > > > >> > > QLIST_INSERT_HEAD(&cur_mon->fds, monfd, next); > >> > > + qemu_mutex_unlock(&cur_mon->mon_lock); > >> > > } > >> > > > >> > > void qmp_closefd(const char *fdname, Error **errp) > >> > > >> > This confused me. I think I understand now, but let's double-check. > >> > > >> > You're reverting commit 0210c3b39bef08 for qmp_getfd() because it > >> > extended the criticial section beyond the close(), invalidating the > >> > comment. Correct? > >> > >> Correct > >> > >> > Did it actually break anything? > >> > >> Not that I know of (David admitted over IRC that this was not intended) > > > > Conceptually the only risk here is that 'close()' blocks for a > > prolonged period of time, which prevents another thread from > > acquiring the mutex. > > > > First, the chances of close() blocking are incredibly low for > > socket FDs which have not yet been used to transmit data. It > > would require a malicious mgmt app to pass an unexpected FD > > type that could block but that's quite hard, and we consider > > the QMP client be a trusted entity anyway. > > > > As for another thread blocking on the mutex I'm not convinced > > that'll happen either. The FD set is scoped to the current > > monitor. Almost certainly the FD is going to be consumed by > > a later QMP device-add/object-add command, in the same thread. > > Processing of that later QMP command will be delayed regardless > > of whether the close is inside or outside the critical section. > > > > AFAICT keeping close() oujtside the critical section serves > > no purpose and we could just stick with the lock guard and > > delete the comment. > > Makes sense to me. > > There's another one in monitor_add_fd(). > > Both are from Peter's commit 9409fc05fe2 "monitor: protect mon->fds with > mon_lock". Peter, do you remember why you took the trouble to keep > close() outside the critical section? I know it's been a while...
IIRC the whole purpose of keeping close() out of the mutex section was to make sure the mutex won't take for too long in any possible way since the mutex will be held too in the monitor iothread (which will service the out-of-band commands), at that time we figured the close() has a chance of getting blocked (even if unlikely!). So to me it still makes sense to keep the close() out of the mutex section, unless the monitor code changed in the past few years on that, and sorry in advance if I didn't really follow what's happening.. What's the major beneift if we move it into the critical section? We can use the lock guard, but IMHO that's for making programming convenient only, we should not pay for it if there's an unwanted functional difference. In this case of close() I think it introduces back the possiblity of having a very slow close() - I'd bet it happen only if there's a remote socket connection to the QMP server and with unreliable network, but I really can't really tell. I think I used to discuss this with Dave. I'm wondering whether I should have used a userspace spinlock, that sounds even more proper for this case, but that's slightly off topic. It's just that if the original goal of "trying our best to make sure out-of-band monitor channels is always responsive" doesn't change, hence IMHO the comment on the lock should still be valid to me. Thanks, -- Peter Xu