When the 'reconnect' option is given for a client connection, the qmp_chardev_open_socket_client method will run an asynchronous connection attempt. The QIOChannel socket executes this is a single use background thread, so the connection will succeed immediately (assuming the server is listening). The chardev, however, won't get the result from this background thread until the main loop starts running and processes idle callbacks.
Thus when tcp_chr_wait_connected is run s->ioc will be NULL, and the state will still be TCP_CHARDEV_STATE_DISCONNECTED, but there will already be an established connection that will be associated with the chardev by the pending idle callback. tcp_chr_wait_connected doesn't see this and so attempts to establish another connection synchronously. If the server allows multiple connections this is unhelpful but not a fatal problem as the duplicate connection will get ignored by the tcp_chr_new_client method when it sees the state is already connected. If the server only supports a single connection, however, the tcp_chr_wait_connected method will hang forever because the server will not accept its synchronous connection attempt until the first connection is closed. To deal with this we must ensure that qmp_chardev_open_socket_client does not actually start the asynchronous connection attempt. Instead it should schedule a timer with 0ms expiry time, which will only be processed once the main loop starts running. The tcp_chr_wait_connected method can now safely do a synchronous connection attempt without creating a race condition. When the timer expires it will see that a connection has already been established and take no further action. Signed-off-by: Daniel P. Berrangé <berra...@redhat.com> --- chardev/char-socket.c | 30 ++++++++++++++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-) diff --git a/chardev/char-socket.c b/chardev/char-socket.c index 7e98a95bbd..07942d7a1b 100644 --- a/chardev/char-socket.c +++ b/chardev/char-socket.c @@ -965,7 +965,25 @@ static int tcp_chr_wait_connected(Chardev *chr, Error **errp) } } - while (!s->ioc) { + /* + * We expect state to be as follows: + * + * - server + * - wait -> CONNECTED + * - nowait -> DISCONNECTED + * - client + * - reconnect == 0 -> CONNECTED + * - reconnect != 0 -> DISCONNECTED + * + */ + if (s->state == TCP_CHARDEV_STATE_CONNECTING) { + error_setg(errp, + "Unexpected 'connecting' state when waiting for " + "connection during early startup"); + return -1; + } + + while (s->state != TCP_CHARDEV_STATE_CONNECTED) { if (s->is_listen) { tcp_chr_accept_server_sync(chr); } else { @@ -1106,7 +1124,15 @@ static int qmp_chardev_open_socket_client(Chardev *chr, if (reconnect > 0) { s->reconnect_time = reconnect; - tcp_chr_connect_client_async(chr); + /* + * We must not start the socket connect attempt until the main + * loop is running, otherwise qemu_chr_wait_connect will not be + * able to take over connection establishment during startup + */ + s->reconnect_timer = qemu_chr_timeout_add_ms(chr, + 0, + socket_reconnect_timeout, + chr); return 0; } else { return tcp_chr_connect_client_sync(chr, errp); -- 2.20.1