Hi, Just to clarify. I see two possible solutions:
1) Since the migration code doesn't receive fd, it isn't responsible for closing it. So, it may be better to use migrate_fd_param for both incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must close the fd themselves. But existing clients will have a leak. 2) If we don't duplicate fd, then at least we should remove fd from the corresponding list. Therefore, the solution is to fix qemu_close to find the list and remove fd from it. But qemu_close is currently consistent with qemu_open (which opens/dups fd), so adding additional logic might not be a very good idea. I don't see any other solution, but I might miss something. What do you think? Regards, Yury 11.04.2019, 15:58, "Yury Kotov" <yury-ko...@yandex-team.ru>: > 11.04.2019, 15:39, "Daniel P. Berrangé" <berra...@redhat.com>: >> On Thu, Apr 11, 2019 at 03:31:43PM +0300, Yury Kotov wrote: >>> 11.04.2019, 15:04, "Daniel P. Berrangé" <berra...@redhat.com>: >>> > On Wed, Apr 10, 2019 at 12:26:52PM +0300, Yury Kotov wrote: >>> >> Currently such case is possible for incoming: >>> >> QMP: add-fd (fdset = 0, fd of some file): >>> >> adds fd to fdset 0 and returns QEMU's fd (e.g. 33) >>> >> QMP: migrate-incoming (uri = "fd:33"): fd is stored in QIOChannel *ioc >>> >> ... >>> >> Incoming migration completes and unrefs ioc -> close(33) >>> >> QMP: remove-fd (fdset = 0, fd = 33): >>> >> removes fd from fdset 0 and qemu_close() -> close(33) => double >>> close >>> > >>> > IMHO this is user error. >>> > >>> > You've given the FD to QEMU and told it to use it for migration. >>> > >>> > Once you've done that you should not be calling remove-fd. >>> > >>> > remove-fd should only be used in some error code path. ie if you >>> > have called add-fd, and then some failure means you've decided you >>> > can't invoke 'migrate-incoming'. Now you should call "remove-fd". >>> > >>> > AFAIK, we have never documented that 'add-fd' must be paired >>> > with 'remove-fd'. >>> > >>> > IOW, I think this "fix" is in fact not a fix - it is instead >>> > changing the semantics of when "remove-fd" should be used. >>> > >>> >>> I think it might be user's error but fd is still in cur_mon->fds (in >>> getfd case) >>> or in mon_fdsets (in add-fd case). So if fd is still exists in the lists >>> why >>> user can't close/remove it? >>> >>> So, there are 2 valid options: >>> 1) fd is still valid after migration (dup) >> >> If we do this then existing mgmt apps using QEMU who don't expect to need >> to call remove-fd are going to be leaking FDs forever. > > Ok, so what do you think about modifying qemu_close to remove fd from these > two > lists? Currently it tryes to remove fd only from mon_fdsets->dup_fds. > > Regards, > Yury > >>> 2) fd is closed but also removed from the appropriate list >> >> monitor_get_fd currently leaves the FD on the list. >> >> if all current users of that API are closing the FD >> themselves, then we could change that method to remove >> it from the list. >> >> Either way the requirements in this area are pooly documented >> both from QEMU's internal POV and from mgmt app public QMP pov. >> >> Regards, >> Daniel >> -- >> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| >> |: https://libvirt.org -o- https://fstop138.berrange.com :| >> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|