On Thu, May 14, 2026 at 06:45:00PM +0200, Stefano Garzarella wrote: > On Thu, 14 May 2026 at 17:16, Michael S. Tsirkin <[email protected]> wrote: > > > > On Thu, May 14, 2026 at 04:57:16PM +0200, Stefano Garzarella wrote: > > > On Wed, May 13, 2026 at 12:54:16PM +0200, Stefano Garzarella wrote: > > > > From: Stefano Garzarella <[email protected]> > > > > > > > > When there is no more space to queue an incoming packet, the packet is > > > > silently dropped. This causes data loss without any notification to > > > > either peer, since there is no retransmission. > > > > > > > > Under normal circumstances, this should never happen. However, it could > > > > happen if the other peer doesn't respect the credit, or if the skb > > > > overhead, which we recently began to take into account with commit > > > > 059b7dbd20a6 ("vsock/virtio: fix potential unbounded skb queue"), > > > > is too high. > > > > > > > > Fix this by resetting the connection and setting the local socket error > > > > to ENOBUFS when virtio_transport_recv_enqueue() can no longer queue a > > > > packet, so both peers are explicitly notified of the failure rather than > > > > silently losing data. > > > > > > > > Fixes: ae6fcfbf5f03 ("vsock/virtio: discard packets if credit is not > > > > respected") > > > > Signed-off-by: Stefano Garzarella <[email protected]> > > > > --- > > > > net/vmw_vsock/virtio_transport_common.c | 19 ++++++++++++++----- > > > > 1 file changed, 14 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/net/vmw_vsock/virtio_transport_common.c > > > > b/net/vmw_vsock/virtio_transport_common.c > > > > index 989cc252d3d3..4a4ac69d1ad1 100644 > > > > --- a/net/vmw_vsock/virtio_transport_common.c > > > > +++ b/net/vmw_vsock/virtio_transport_common.c > > > > @@ -1350,7 +1350,7 @@ virtio_transport_recv_connecting(struct sock *sk, > > > > return err; > > > > } > > > > > > > > -static void > > > > +static bool > > > > virtio_transport_recv_enqueue(struct vsock_sock *vsk, > > > > struct sk_buff *skb) > > > > { > > > > @@ -1365,10 +1365,8 @@ virtio_transport_recv_enqueue(struct vsock_sock > > > > *vsk, > > > > spin_lock_bh(&vvs->rx_lock); > > > > > > > > can_enqueue = virtio_transport_inc_rx_pkt(vvs, len); > > > > - if (!can_enqueue) { > > > > - free_pkt = true; > > > > + if (!can_enqueue) > > > > goto out; > > > > - } > > > > > > > > if (le32_to_cpu(hdr->flags) & VIRTIO_VSOCK_SEQ_EOM) > > > > vvs->msg_count++; > > > > @@ -1408,6 +1406,8 @@ virtio_transport_recv_enqueue(struct vsock_sock > > > > *vsk, > > > > spin_unlock_bh(&vvs->rx_lock); > > > > if (free_pkt) > > > > kfree_skb(skb); > > > > + > > > > + return can_enqueue; > > > > } > > > > > > > > static int > > > > @@ -1420,7 +1420,16 @@ virtio_transport_recv_connected(struct sock *sk, > > > > > > > > switch (le16_to_cpu(hdr->op)) { > > > > case VIRTIO_VSOCK_OP_RW: > > > > - virtio_transport_recv_enqueue(vsk, skb); > > > > + if (!virtio_transport_recv_enqueue(vsk, skb)) { > > > > + /* There is no more space to queue the packet, so > > > > let's > > > > + * close the connection; otherwise, we'll lose data. > > > > + */ > > > > + (void)virtio_transport_reset(vsk, skb); > > > > + sk->sk_state = TCP_CLOSE; > > > > + sk->sk_err = ENOBUFS; > > > > + sk_error_report(sk); > > > > > > sashiko reported some issues related to setting TCP_CLOSE state and not > > > removing the socket from the connect table: > > > https://sashiko.dev/#/patchset/20260513105417.56761-1-sgarzare%40redhat.com > > > > > > I'll change this by calling virtio_transport_do_close() and > > > vsock_remove_sock() in the next version. > > > > > > Stefano > > > > > > > + break; > > > > + } > > > > vsock_data_ready(sk); > > > > return err; > > > > case VIRTIO_VSOCK_OP_CREDIT_REQUEST: > > > > -- > > > > 2.54.0 > > > > > > > > > > And so the bag of hacks grows. I feel this is energy not well spent. > > Please, let us fix this properly *first*. And then worry about how to > > backport. Maybe it will not be so terrible to backport after all. > > > > TBH I don't think this is an hack, but an issue we should fix in any case. > Regarding the second patch, I see your point, but it's a big change > that worries me. I'd like some more time to fix it properly without > rushing. Staying calm without realizing that userspace is broken like > we are now without this series :-( > > That said, evaluating further, I think we have a similar issue also > with STREAM on the host side where the skb usually doesn't free space, > so we need a merge strategy also there. > > So, I'd like to have time to fix both definitely. If you have time and > want to go ahead, please do. > > Thanks, > Stefano
Well my patch was a start, we just need a strategy how to avoid copying everything, right? -- MST

