From: "Denis V. Lunev" <[email protected]>

When the host initiates an AF_VSOCK connect() to a guest that has not
yet loaded the virtio-vsock transport (i.e. still booting), the caller
blocks for VSOCK_DEFAULT_CONNECT_TIMEOUT.

A caller that wants to know if the guest is up yet instead of waiting
could theoretically tune SO_VM_SOCKETS_CONNECT_TIMEOUT, but it's tricky
to find the right timeout, if not impossible: there's no way to
distinguish "guest won't reply because it's not up yet" vs "guest is up
and tried to reply, but was too slow".

Furthermore, this delay is pointless, because connect() fails either way:
- If the guest doesn't initialize within this timeout, connect()
  returns ETIMEDOUT.
- If the guest **does** initialize, it'll reply with RST immediately,
  because there won't be a listener on the port yet; connect() returns
  ECONNRESET.

That's also inconsistent with the behavior at other initialization
stages: if a connection is attempted when the guest driver is already
loaded, but nothing is listening yet, we return ECONNRESET immediately
without waiting.

Fix this by checking the RX virtqueue backend in
vhost_transport_send_pkt() before queuing. If it's NULL, return
-EHOSTUNREACH immediately.

Callers that used to get ETIMEDOUT will now usually get EHOSTUNREACH.

Signed-off-by: Denis V. Lunev <[email protected]>
Co-developed-by: Polina Vishneva <[email protected]>
Signed-off-by: Polina Vishneva <[email protected]>
---
v2:
- ECONNREFUSED -> EHOSTUNREACH.
- Use vhost_vq_get_backend() instead of raw .private_data access.
- Removed READ_ONCE().
- Wrapped the condition with unlikely().
- Updated the comment and the commit message.

v1: https://lore.kernel.org/netdev/[email protected]
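
For reference, a minimal userspace sketch of how a host-side caller might
interpret the connect() results described above. The enum, the function
names and the errno-to-state mapping are hypothetical and only restate the
commit message; the same errno values can also have other causes, so treat
this as an illustration rather than a definitive probe.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/vm_sockets.h>

/* Hypothetical names, for illustration only. */
enum guest_state {
	GUEST_LISTENING,	/* connect() succeeded */
	GUEST_NOT_READY,	/* EHOSTUNREACH: fast-fail added by this patch */
	GUEST_NO_LISTENER,	/* ECONNRESET: guest is up and replied with RST */
	GUEST_NO_REPLY,		/* ETIMEDOUT: no reply within the connect timeout */
	GUEST_UNKNOWN,		/* anything else */
};

/* Map the errno of a failed connect() (0 on success) to a guest state. */
enum guest_state classify_connect_errno(int err)
{
	switch (err) {
	case 0:
		return GUEST_LISTENING;
	case EHOSTUNREACH:
		return GUEST_NOT_READY;
	case ECONNRESET:
		return GUEST_NO_LISTENER;
	case ETIMEDOUT:
		return GUEST_NO_REPLY;
	default:
		return GUEST_UNKNOWN;
	}
}

/* Try one AF_VSOCK stream connection to cid:port and classify the result. */
enum guest_state probe_guest(unsigned int cid, unsigned int port)
{
	struct sockaddr_vm addr = {
		.svm_family = AF_VSOCK,
		.svm_cid = cid,
		.svm_port = port,
	};
	int fd, err = 0;

	fd = socket(AF_VSOCK, SOCK_STREAM, 0);
	if (fd < 0)
		return GUEST_UNKNOWN;

	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		err = errno;

	close(fd);
	return classify_connect_errno(err);
}
```

With the patch applied, a caller polling probe_guest() would be expected to
see GUEST_NOT_READY without delay while the guest is still booting, instead
of blocking until GUEST_NO_REPLY.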

 drivers/vhost/vsock.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 1d8ec6bed53e..9aaab6bb8061 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -302,6 +302,22 @@ vhost_transport_send_pkt(struct sk_buff *skb, struct net *net)
                return -ENODEV;
        }
 
+       /* Fast-fail if the guest hasn't enabled the RX vq yet. Queuing the packet
+        * and making the caller wait is pointless: even if the guest manages to init
+        * within the timeout, it'll immediately reply with RST, because there's no
+        * listener on the port yet.
+        *
+        * vhost_vq_get_backend() without vq->mutex is acceptable here: locking
+        * the mutex would be too expensive in this hot path, and we already have
+        * all the outcomes covered: if the backend becomes NULL right after the check,
+        * vhost_transport_do_send_pkt() will check it under the mutex anyway.
+        */
+       if (unlikely(!data_race(vhost_vq_get_backend(&vsock->vqs[VSOCK_VQ_RX])))) {
+               rcu_read_unlock();
+               kfree_skb(skb);
+               return -EHOSTUNREACH;
+       }
+
        if (virtio_vsock_skb_reply(skb))
                atomic_inc(&vsock->queued_replies);
 

base-commit: 8ab992f815d6736b5c7a6f5fd7bfe7bc106bb3dc
-- 
2.53.0

