Re: [RFC v4 0/5] Add packed virtqueue to shadow virtqueue

Eugenio Perez Martin Mon, 10 Feb 2025 23:59:11 -0800

On Mon, Feb 10, 2025 at 5:25 PM Sahil Siddiq <icegambi...@gmail.com> wrote:
>
> Hi,
>
> On 2/10/25 7:53 PM, Eugenio Perez Martin wrote:
> > On Mon, Feb 10, 2025 at 11:58 AM Sahil Siddiq <icegambi...@gmail.com> wrote:
> >> On 2/6/25 8:47 PM, Sahil Siddiq wrote:
> >>> On 2/6/25 12:42 PM, Eugenio Perez Martin wrote:
> >>>> On Thu, Feb 6, 2025 at 6:26 AM Sahil Siddiq <icegambi...@gmail.com> 
> >>>> wrote:
> >>>>> On 2/4/25 11:45 PM, Eugenio Perez Martin wrote:
> >>>>>> PS: Please note that you can check packed_vq SVQ implementation
> >>>>>> already without CVQ, as these features are totally orthogonal :).
> >>>>>>
> >>>>>
> >>>>> Right. Now that I can ping with the ctrl features turned off, I think
> >>>>> this should take precedence. There's another issue specific to the
> >>>>> packed virtqueue case. It causes the kernel to crash. I have been
> >>>>> investigating this and the situation here looks very similar to what's
> >>>>> explained in Jason Wang's mail [2]. My plan of action is to apply his
> >>>>> changes in L2's kernel and check if that resolves the problem.
> >>>>>
> >>>>> The details of the crash can be found in this mail [3].
> >>>>>
> >>>>
> >>>> If you're testing this series without changes, I think that is caused
> >>>> by not implementing the packed version of vhost_svq_get_buf.
> >>>>
> >>>> https://lists.nongnu.org/archive/html/qemu-devel/2024-12/msg01902.html
> >>>>
> >>>
> >>> Oh, apologies, I think I had misunderstood your response in the linked 
> >>> mail.
> >>> Until now, I thought they were unrelated. In that case, I'll implement the
> >>> packed version of vhost_svq_get_buf. Hopefully that fixes it :).
> >>>
> >>
> >> I noticed one thing while testing some of the changes that I have made.
> >> I haven't finished making the relevant changes to all the functions which
> >> will have to handle split and packed vq differently. L2's kernel crashes
> >> when I launch L0-QEMU with ctrl_vq=on,ctrl_rx=on.
> >
> > Interesting, is a similar crash than this? (NULL ptr deference on
> > virtnet_set_features)?
> >
> > https://issues.redhat.com/browse/RHEL-391
> I am not able to access this bug report (even with a Red Hat account). It
> says it may have been deleted or I don't have the permission to view it.
>
> It's hard to tell if this is the same issue. I don't think it is the same
> issue though since I don't see any such indication in the logs. The kernel
> throws the following:
>
> [   23.047503] virtio_net virtio1: output.0:id 0 is not a head!


This is a common error when modifying code of the dataplane, it is
unlikely to do deep changes and not see this error :). It indicates
that your code is marking the descriptor id 0 as used when the guest
didn't make it available.

If this is happening in control virtqueue, I'd check if the code is
setting the flags as used in ring[1] when it shouldn't. But my bet is
that the rx queue is the wrong one.

> [   49.173243] watchdog: BUG: soft lockup - CPU#1 stuck for 25s! 
> [NetworkManager:782]
> [   49.174167] Modules linked in: rfkill intel_rapl_msr intel_rapl_common 
> intel_uncore_frequency_common intel_pmc_core intel_vsec pmt_telemetry 
> pmt_class kvg
> [   49.188258] CPU: 1 PID: 782 Comm: NetworkManager Not tainted 
> 6.8.7-200.fc39.x86_64 #1
> [   49.193196] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
> rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> [   49.193196] RIP: 0010:virtqueue_get_buf+0x0/0x20
>

Two possibilities about this part:
a) You're spending "too long" in the debugger in QEMU. From the kernel
POV, the function virtqueue_get_buf is taking too long to complete so
it detects it as a lockup. You can check this scenario by not running
QEMU under GDB or disabling all breakpoints. You can ignore this
message if you don't find the error this way. If you still see the
message, goto possibility b.

b) The kernel has a bug that makes it softlockup in virtqueue_get_buf.
The kernel should not soft lockup even if your changes were malicious
:(, so it is something to be fixed. If you have the time, can you test
with the latest upstream kernel?

> Maybe I was incorrect in stating that the kernel crashes. It's more like
> the kernel is stuck in a loop (according to these blog posts on soft
> lockup [1][2]).
>
> In the above trace, RIP is in virtqueue_get_buf() [3]. This is what
> calls virtqueue_get_buf_ctx_packed() [4] which throws the error.
>
> What I don't understand is why vq->packed.desc_state[id].data [5] is
> NULL when the control features are turned on, but doesn't seem to be
> NULL when the control features are turned off.
>

Due to the net subsystem lock, CVQ handling is not as robust / secure
against this error as the dataplane queues. There is an ongoing effort
to make it more robust, so maybe this is something to fix in that
line.

Can you put the whole backtrace that prints the kernel?

> >> However, when I start
> >> L0-QEMU with ctrl_vq=off,ctrl_rx=off,ctrl_vlan=off,ctrl_mac_addr=off, L2's
> >> kernel boots successfully. Tracing L2-QEMU also confirms that the packed
> >> feature is enabled. With all the ctrl features disabled, I think pinging
> >> will also be possible once I finish implementing the packed versions of
> >> the other functions.
> >>
> >
> > Good!
> >
> >> There's another thing that I am confused about regarding the current
> >> implementation (in the master branch).
> >>
> >> In hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_vring_write_descs() [1],
> >> svq->free_head saves the descriptor in the specified format using
> >> "le16_to_cpu" (line 171).
> >
> > Good catch, this should be le16_to_cpu actually. But code wise is the
> > same, so we have no visible error. Do you want to send a patch to fix
> > it?
> >
> Sorry, I am still a little confused here. Did you mean cpu_to_le16
> by any chance? Based on what I have understood, if it is to be used
> by the host machine, then it should be cpu_to_le16.
>
> I can send a patch once this is clear, or can even integrate it in
> this patch series since this patch series refactors that function
> anyway.
>

Ok, I don't know how I read the function to answer you that :(. Let me
start from scratch,

In line 171, we're copying data from QEMU internals, that are not in
the guest memory, to other QEMU internals. So no cpu_to_le* or
le*to_cpu is needed.

> >> On the other hand, the value of i is stored
> >> in the native endianness using "cpu_to_le16" (line 168). If "i" is to be
> >> stored in the native endianness (little endian in this case), then
> >> should svq->free_head first be converted to little endian before being
> >> assigned to "i" at the start of the function (line 142)?
> >>
> >

No endianness conversion is needed here for the same reason, all is
internal to QEMU and not intended to be seen by the guest.

> > This part is correct in the code, as it is used by the host, not
> > written to the guest or read from the guest. So no conversion is
> > needed.
>
> Understood.
>
> Thanks,
> Sahil
>
> [1] https://www.suse.com/support/kb/doc/?id=000018705
> [2] https://softlockup.com/SystemAdministration/Linux/Kernel/softlockup/
> [3] 
> https://github.com/torvalds/linux/blob/master/drivers/virtio/virtio_ring.c#L2545
> [4] 
> https://github.com/torvalds/linux/blob/master/drivers/virtio/virtio_ring.c#L1727
> [5] 
> https://github.com/torvalds/linux/blob/master/drivers/virtio/virtio_ring.c#L1762
>

Re: [RFC v4 0/5] Add packed virtqueue to shadow virtqueue

Reply via email to