On Mon, Jul 18, 2016 at 08:41:32AM +0000, Gonglei (Arei) wrote: > Hi Daniel & Paolo, > > Commit 9894dc0c "char: convert from GIOChannel to QIOChannel", > about the below code segment: > > static bool qemu_chr_open_socket_fd(CharDriverState *chr, Error **errp) > { > TCPCharDriver *s = chr->opaque; > - int fd; > + QIOChannelSocket *sioc = qio_channel_socket_new(); > > if (s->is_listen) { > - fd = socket_listen(s->addr, errp); > + if (qio_channel_socket_listen_sync(sioc, s->addr, errp) < 0) { > + goto fail; > + } > + qemu_chr_finish_socket_connection(chr, sioc); > } else if (s->reconnect_time) { > - fd = socket_connect(s->addr, errp, qemu_chr_socket_connected, chr); > - return fd >= 0; > + qio_channel_socket_connect_async(sioc, s->addr, > + qemu_chr_socket_connected, > + chr, NULL); > } else { > - fd = socket_connect(s->addr, errp, NULL, NULL); > - } > - if (fd < 0) { > - return false; > + if (qio_channel_socket_connect_sync(sioc, s->addr, errp) < 0) { > + goto fail; > + } > + qemu_chr_finish_socket_connection(chr, sioc); > } > > - qemu_chr_finish_socket_connection(chr, fd); > return true; > + > + fail: > + object_unref(OBJECT(sioc)); > + return false; > } > > Why did you change socket_connect() to qio_channel_socket_connect_async > but not qio_channel_socket_connect_sync when s->reconnect_time is ture? > Thanks. > I can't get the reason from the commit message.
When the socket_connect() function is called with a non-null value for the NonBlockingConnectHandler parameter, it'll put the socket into non-blocking mode before doing the connection. Therefore socket_connect() method is no longer guaranteed to actually connect to the socket - it may simply start the connection and leave an event handler to finish it. This avoids blocking the caller waiting for the connectionb to finish which may take non-negligible time with TCP sockets. The qio_channel_socket_connect_sync() method will always block until completion of the connect, so we use qio_channel_socket_connect_async which always does the connection attempt in the background. The old code which may or may not finish the connect() in the background was a recipe for bugs because it forced callers to have 2 completely separate codepaths for the same API call and chances are only one of those code paths would be tested by the developers. With the code code we're guaranteed to always complete in the background, so the callers only have 1 codepath to worry about and so they'll be forced to ensure that codepath works. > PS: We encountered a problem that when config vhost-user-net reconneticon > function, > the virtio_net_get_features() didn't get a correct value after this changed, > and it was > fixed when we change the async to sync. It seems that you were mostly lucky with the old code in that presumably socket_connect() would complete the connection fully. There was never any guarantee of this though - it was perfectly feasible that socket_connect() would finish the connect in the background. So IMHO the vhost user code was already buggy if it doesn't cope with the connect finishing in the background. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|