Hi, 10.04.2019, 16:58, "Dr. David Alan Gilbert" <dgilb...@redhat.com>: > * Yury Kotov (yury-ko...@yandex-team.ru) wrote: >> Currently such case is possible for incoming: >> QMP: add-fd (fdset = 0, fd of some file): >> adds fd to fdset 0 and returns QEMU's fd (e.g. 33) >> QMP: migrate-incoming (uri = "fd:33"): fd is stored in QIOChannel *ioc >> ... >> Incoming migration completes and unrefs ioc -> close(33) >> QMP: remove-fd (fdset = 0, fd = 33): >> removes fd from fdset 0 and qemu_close() -> close(33) => double close > > Well spotted! That would very rarely cause a problem, but is a race. >
Actually, it hits for incoming case on 2 of 50 VMs on our production... >> For outgoing migration the case is the same but getfd instead of add-fd. >> Fix it by duping client's fd. >> >> Signed-off-by: Yury Kotov <yury-ko...@yandex-team.ru> > > Note your patch hit a problem building on windows; I don't think we have > a qemu_dup for windows. > Oh, I see... I'll add an ifdef for this in v2. > However, I think this problem is wider than just migration. > For example, I see that dump.c also uses monitor_get_fd, and it's > dump_cleanup also does a close on the fd. So I guess it hits the same > problem? > Also, qmp.c in qmp_add_client does a close on the fd in some error cases > (I've not followed the normal case). > > So perhaps all the users of monitor_get_fd are hitting this problem. > > Should we be doing the dup in monitor_get_fd? > Hmm, it sounds reasonable but Windows's case will remain broken. And also using fd from fdset without qemu_open will remain broken. Another way to fix them: 1) Add searching of monitor's fds and not duped fdset's fds to qemu_close 2) Replace broken closes to qemu_close Regards, Yury Kotov > Dave > >> --- >> migration/fd.c | 39 +++++++++++++++++++++++++++++++-------- >> migration/trace-events | 4 ++-- >> 2 files changed, 33 insertions(+), 10 deletions(-) >> >> diff --git a/migration/fd.c b/migration/fd.c >> index a7c13df4ad..c9ff07ac41 100644 >> --- a/migration/fd.c >> +++ b/migration/fd.c >> @@ -15,6 +15,7 @@ >> */ >> >> #include "qemu/osdep.h" >> +#include "qapi/error.h" >> #include "channel.h" >> #include "fd.h" >> #include "migration.h" >> @@ -26,15 +27,27 @@ >> void fd_start_outgoing_migration(MigrationState *s, const char *fdname, >> Error **errp) >> { >> QIOChannel *ioc; >> - int fd = monitor_get_fd(cur_mon, fdname, errp); >> + int fd, dup_fd; >> + >> + fd = monitor_get_fd(cur_mon, fdname, errp); >> if (fd == -1) { >> return; >> } >> >> - trace_migration_fd_outgoing(fd); >> - ioc = qio_channel_new_fd(fd, errp); >> + /* fd is previously created by qmp command 'getfd', >> + * so client is responsible to close it. Dup it to save original value from >> + * QIOChannel's destructor */ >> + dup_fd = qemu_dup(fd); >> + if (dup_fd == -1) { >> + error_setg(errp, "Cannot dup fd %s: %s (%d)", fdname, strerror(errno), >> + errno); >> + return; >> + } >> + >> + trace_migration_fd_outgoing(fd, dup_fd); >> + ioc = qio_channel_new_fd(dup_fd, errp); >> if (!ioc) { >> - close(fd); >> + close(dup_fd); >> return; >> } >> >> @@ -55,14 +68,24 @@ static gboolean fd_accept_incoming_migration(QIOChannel >> *ioc, >> void fd_start_incoming_migration(const char *infd, Error **errp) >> { >> QIOChannel *ioc; >> - int fd; >> + int fd, dup_fd; >> >> fd = strtol(infd, NULL, 0); >> - trace_migration_fd_incoming(fd); >> >> - ioc = qio_channel_new_fd(fd, errp); >> + /* fd is previously created by qmp command 'add-fd' or something else, >> + * so client is responsible to close it. Dup it to save original value from >> + * QIOChannel's destructor */ >> + dup_fd = qemu_dup(fd); >> + if (dup_fd == -1) { >> + error_setg(errp, "Cannot dup fd %d: %s (%d)", fd, strerror(errno), >> + errno); >> + return; >> + } >> + >> + trace_migration_fd_incoming(fd, dup_fd); >> + ioc = qio_channel_new_fd(dup_fd, errp); >> if (!ioc) { >> - close(fd); >> + close(dup_fd); >> return; >> } >> >> diff --git a/migration/trace-events b/migration/trace-events >> index de2e136e57..d2d30a6b3c 100644 >> --- a/migration/trace-events >> +++ b/migration/trace-events >> @@ -258,8 +258,8 @@ migration_exec_outgoing(const char *cmd) "cmd=%s" >> migration_exec_incoming(const char *cmd) "cmd=%s" >> >> # fd.c >> -migration_fd_outgoing(int fd) "fd=%d" >> -migration_fd_incoming(int fd) "fd=%d" >> +migration_fd_outgoing(int fd, int dup_fd) "fd=%d dup_fd=%d" >> +migration_fd_incoming(int fd, int dup_fd) "fd=%d dup_fd=%d" >> >> # socket.c >> migration_socket_incoming_accepted(void) "" >> -- >> 2.21.0 > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK