Hi,
It's been a while since my last email, so I thought I would send an update
to keep you in the loop.
I have been comparing SVQ's mechanism for split and packed vqs, hoping to
find something that points to the source of the issue.
One thing worth noting is that when I test with kernel version 6.8.5, the
crashes are far more frequent. With kernel version 6.15.0-rc3+, the issue
is much harder to reproduce.
On 5/15/25 11:49 AM, Eugenio Perez Martin wrote:
On Wed, May 14, 2025 at 8:22 AM Sahil Siddiq <icegambi...@gmail.com> wrote:
On 4/16/25 12:50 PM, Eugenio Perez Martin wrote:
On Mon, Apr 14, 2025 at 11:20 AM Sahil Siddiq <icegambi...@gmail.com> wrote:
On 3/26/25 1:05 PM, Eugenio Perez Martin wrote:
On Mon, Mar 24, 2025 at 2:59 PM Sahil Siddiq <icegambi...@gmail.com> wrote:
I managed to fix a few issues while testing this patch series.
There is still one issue that I am unable to resolve. I thought
I would send this patch series for review in case I have missed
something.
The issue is that this patch series does not work every time. When it
works, I am able to ping L0 from L2 and vice versa via the packed SVQ.
When this doesn't work, both VMs throw a "Destination Host
Unreachable" error. This is sometimes (not always) accompanied
by the following kernel error (thrown by L2-kernel):
virtio_net virtio1: output.0:id 1 is not a head!
The TX queue seems to be problematic. More on this below.
Sometimes RX also results in this crash, but it seems to be less frequent.
The RX vq (vq_idx=0) was filled while L2 was booting. When the ctrl vq is
disabled, I am not sure what is responsible for filling the data plane vqs
during boot.
The nested guest's driver fills the rx queue at startup. After that,
that nested guest kicks and SVQ receives the descriptors. It copies
the descriptors to the shadow virtqueue and then kicks L0 QEMU.
Understood.
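Just to check my mental model of that path, here is a rough sketch. The
types and svq_* helper names below are placeholders I made up for
illustration, not the actual QEMU code (as I understand it, the real path
goes through virtqueue_pop(), vhost_svq_add()/vhost_svq_add_packed() and
vhost_svq_kick()):

/* Purely illustrative; placeholders, not the actual QEMU code. */
struct shadow_vq;
struct guest_chain;

struct guest_chain *svq_pop_guest_avail(struct shadow_vq *svq);
void svq_copy_chain_to_shadow_vring(struct shadow_vq *svq,
                                    struct guest_chain *chain);
void svq_kick_device(struct shadow_vq *svq);

/* What happens on a guest kick, as I understand it: */
void svq_handle_guest_kick(struct shadow_vq *svq)
{
    struct guest_chain *chain;

    /* 1. The nested guest's driver has filled its avail ring (e.g. the RX
     *    buffers it posts at startup) and kicked us.                      */
    while ((chain = svq_pop_guest_avail(svq)) != NULL) {
        /* 2. The chain is copied into the shadow vring, and the guest
         *    element is remembered under the id used for it.             */
        svq_copy_chain_to_shadow_vring(svq, chain);
    }

    /* 3. L0 QEMU / the vDPA device is notified that buffers are available. */
    svq_kick_device(svq);
}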
The other vq (vq_idx=1) is not filled completely before the issue is
hit.
I have been noting down the numbers and here is an example:
295 descriptors were added individually to the queues, i.e., there were no
chains (vhost_svq_add_packed)
|_ 256 additions in vq_idx = 0, all with unique ids
|---- 27 descriptors (ids 0 through 26) were received later from the
device (vhost_svq_get_buf_packed)
|_ 39 additions in vq_idx = 1
|_ 13 descriptors had id = 0
|_ 26 descriptors had id = 1
|---- All descriptors were received at some point from the device
(vhost_svq_get_buf_packed)
There was one case in which vq_idx=0 had wrapped around. I verified
that flags were set appropriately during the wrap (avail and used flags
were flipped as expected).
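For completeness, the rule I verified against is the packed-ring ownership
rule from the virtio 1.1 spec. A small self-contained sketch of it (my own
helper names; the bit values are the ones behind
VRING_PACKED_DESC_F_AVAIL/USED):

#include <stdbool.h>
#include <stdint.h>

/* Flag bits in the packed descriptor "flags" field (bits 7 and 15). */
#define DESC_F_AVAIL  (1u << 7)
#define DESC_F_USED   (1u << 15)

/* Flags the driver side writes when making a descriptor available:
 * AVAIL must equal the avail wrap counter, USED must be its inverse.
 * When the ring wraps, the counter flips and so do both bits, which is
 * the behaviour I checked in the logs above. */
static uint16_t avail_flags(bool avail_wrap_counter)
{
    return (avail_wrap_counter ? DESC_F_AVAIL : 0) |
           (avail_wrap_counter ? 0 : DESC_F_USED);
}

/* A descriptor is used (from the driver's point of view) when both bits
 * match the driver's used wrap counter. */
static bool desc_is_used(uint16_t flags, bool used_wrap_counter)
{
    bool avail = flags & DESC_F_AVAIL;
    bool used  = flags & DESC_F_USED;

    return avail == used && used == used_wrap_counter;
}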
OK, sounds like you're able to hit the issue before filling the queue. I'd
go for debugging notifications for this one, then. More on this below.
=====
The next common situation where this issue is hit is during startup.
Before L2 can finish booting successfully, this error is thrown:
virtio_net virtio1: output.0:id 0 is not a head!
258 descriptors were added individually to the queues during startup (there
were no chains) (vhost_svq_add_packed)
|_ 256 additions in vq_idx = 0, all with unique ids
|---- None of them were received by the device (vhost_svq_get_buf_packed)
|_ 2 additions in vq_idx = 1
|_ id = 0 in index 0
|_ id = 1 in index 1
|---- Both descriptors were received at some point during startup from
the device (vhost_svq_get_buf_packed)
=====
Another case is after several seconds of pinging L0 from L2.
[ 99.034114] virtio_net virtio1: output.0:id 0 is not a head!
So the L2 guest sees a descriptor it has not made available previously.
This can be caused by SVQ returning the same descriptor twice, or by not
filling the id or flags properly. It can also be caused by not protecting
the write ordering in the ring, but I don't see anything obviously wrong
by looking at the code.
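Regarding the write-ordering point: the rule I have been assuming is that
addr/len/id must be written before the flags that hand the descriptor
over, with a write barrier in between, since the flags are what the other
side polls. A minimal sketch of that ordering (illustrative names; the
fence stands in for smp_wmb() in QEMU and the kernel):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative packed descriptor layout (fields as in the virtio spec). */
struct packed_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    uint16_t flags;
};

#define DESC_F_AVAIL  (1u << 7)
#define DESC_F_USED   (1u << 15)

/* Hand one descriptor over to the other side. The store to "flags" is what
 * transfers ownership, so addr/len/id must be globally visible before it;
 * otherwise the peer can observe valid-looking flags paired with a stale
 * id, which is the kind of thing that could show up as "id X is not a
 * head". */
static void publish_desc(struct packed_desc *desc, uint64_t addr,
                         uint32_t len, uint16_t id, bool avail_wrap_counter)
{
    desc->addr = addr;
    desc->len  = len;
    desc->id   = id;

    /* Write barrier; the equivalent of smp_wmb(). */
    atomic_thread_fence(memory_order_release);

    desc->flags = (avail_wrap_counter ? DESC_F_AVAIL : 0) |
                  (avail_wrap_counter ? 0 : DESC_F_USED);
}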
366 descriptors were added individually to the queues, i.e., there were no
chains (vhost_svq_add_packed)
|_ 289 additions in vq_idx = 0, wrap-around was observed with avail and used
flags inverted for 33 descriptors
| |---- 40 descriptors (ids 0 through 39) were received from the device
(vhost_svq_get_buf_packed)
|_ 77 additions in vq_idx = 1
|_ 76 descriptors had id = 0
|_ 1 descriptor had id = 1
|---- all 77 descriptors were received at some point from the device
(vhost_svq_get_buf_packed)
"id is not a head" is being thrown because vq->packed.desc_state[id].data
doesn't exist for the corresponding id in Linux [1]. But QEMU seems to have
stored some data for this id via vhost_svq_add() [2]. Linux sets the value
of vq->packed.desc_state[id].data in its version of virtqueue_add_packed() [3].
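For context, the kernel-side check that produces the message looks roughly
like this (paraphrased from drivers/virtio/virtio_ring.c, not a verbatim
copy):

/* Paraphrasing the relevant check in virtqueue_get_buf_ctx_packed(): */
id = le16_to_cpu(vq->packed.vring.desc[last_used].id);

if (unlikely(!vq->packed.desc_state[id].data)) {
        /* This is the "id X is not a head!" we keep hitting: the driver
         * has no outstanding buffer recorded under this id. */
        BAD_RING(vq, "id %u is not a head!\n", id);
        return NULL;
}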
Let's keep debugging further. Can you trace the ids that the L2 kernel
makes available, and then the ones that it uses? At the same time, can you
trace the ids that the svq sees in vhost_svq_get_buf and the ones that it
flushes? This allows us to check the set of available descriptors at any
given time.
In the linux kernel, I am printing which descriptor is received in which
queue in drivers/virtio/virtio_ring.c:virtqueue_get_buf_ctx_packed() [1].
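The print is essentially the following (a reconstruction of my local debug
patch; the exact statement may differ slightly):

/* Added locally in virtqueue_get_buf_ctx_packed(), right after the id is
 * read from the used descriptor (sketch; exact form may differ): */
id = le16_to_cpu(vq->packed.vring.desc[last_used].id);
pr_info("%s -> id: %u\n", vq->vq.name, id);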
I see the following lines getting printed for the TX queue:
[ 192.101591] output.0 -> id: 0
[ 213.737417] output.0 -> id: 0
[ 213.738714] output.0 -> id: 1
[ 213.740093] output.0 -> id: 0
[ 213.741521] virtio_net virtio1: output.0:id 0 is not a head!
I find it peculiar that this is the first descriptor with id 1. Was there
any other descriptor with id 1 previously? Does it fail consistently with
id 1?
Yes, the descriptor with id 1 was used previously in TX. The failing id
varies between test runs; it has failed with other ids as well. In one
test run, it failed with id 17. I think there's an off-by-one bug here: it
crashes when it receives id 'x - 1' instead of 'x'.
You should have descriptors with id 1 and higher in the rx queue, and the
code should not be able to tell the difference, so it seems weird that it
fails with tx. But who knows :).
Oh, I thought it would be able to differentiate between them since it knows
which vq->idx it's coming from.
I think there's something off in the way "free_head", "last_used_idx" and
"desc_next" values are calculated in vhost_svq_get_buf_packed() [1].
In the latest test run, QEMU sent ids 0 through 28 to L2. L2 received them
in order up to id 8. At that point it received id 7 again for some reason
and then crashed.
L2:
[ 1641.129218] (prepare_packed) output.0 -> needs_kick: 1
[ 1641.130621] (notify) output.0 -> function will return true
[ 1641.132022] output.0 -> id: 0
[ 1739.502358] input.0 -> id: 0
[ 1739.503003] input.0 -> id: 1
[ 1739.562024] input.0 -> id: 2
[ 1739.578682] input.0 -> id: 3
[ 1739.661913] input.0 -> id: 4
[ 1739.828796] input.0 -> id: 5
[ 1739.829789] input.0 -> id: 6
[ 1740.078757] input.0 -> id: 7
[ 1740.079749] input.0 -> id: 8
[ 1740.080382] input.0 -> id: 7 <----Received 7 again
[ 1740.081614] virtio_net virtio1: input.0:id 7 is not a head!
QEMU logs (vhost_svq_get_buf_packed):
------
size : svq->vring.num
len : svq->vring_packed.vring.desc[last_used].len
id : svq->vring_packed.vring.desc[last_used].id
num : svq->desc_state[id].ndescs
last_used_chain : Result of vhost_svq_last_desc_of_chain(svq, num, id) [2]
free_head : svq->free_head
last_used : (last_used_idx & ~(1 << VRING_PACKED_EVENT_F_WRAP_CTR)) + num
used_wrap_counter : !!(last_used_idx & (1 << VRING_PACKED_EVENT_F_WRAP_CTR))
------
size: 256, len: 102, id: 0, vq idx: 0
id: 0, last_used_chain: 0, free_head: 0, vq idx: 0
num: 1, free_head: 0, id: 0, last_used: 1, used_wrap_counter: 1, vq idx: 0
------
size: 256, len: 74, id: 1, vq idx: 0
id: 1, last_used_chain: 1, free_head: 0, vq idx: 0
num: 1, free_head: 1, id: 1, last_used: 2, used_wrap_counter: 1, vq idx: 0
------
size: 256, len: 102, id: 2, vq idx: 0
id: 2, last_used_chain: 2, free_head: 1, vq idx: 0
num: 1, free_head: 2, id: 2, last_used: 3, used_wrap_counter: 1, vq idx: 0
------
size: 256, len: 82, id: 3, vq idx: 0
id: 3, last_used_chain: 3, free_head: 2, vq idx: 0
num: 1, free_head: 3, id: 3, last_used: 4, used_wrap_counter: 1, vq idx: 0
------
size: 256, len: 74, id: 4, vq idx: 0
id: 4, last_used_chain: 4, free_head: 3, vq idx: 0
num: 1, free_head: 4, id: 4, last_used: 5, used_wrap_counter: 1, vq idx: 0
------
size: 256, len: 82, id: 5, vq idx: 0
id: 5, last_used_chain: 5, free_head: 4, vq idx: 0
num: 1, free_head: 5, id: 5, last_used: 6, used_wrap_counter: 1, vq idx: 0
------
size: 256, len: 104, id: 6, vq idx: 0
id: 6, last_used_chain: 6, free_head: 5, vq idx: 0
num: 1, free_head: 6, id: 6, last_used: 7, used_wrap_counter: 1, vq idx: 0
------
size: 256, len: 82, id: 7, vq idx: 0
id: 7, last_used_chain: 7, free_head: 6, vq idx: 0
num: 1, free_head: 7, id: 7, last_used: 8, used_wrap_counter: 1, vq idx: 0
------
size: 256, len: 104, id: 8, vq idx: 0
id: 8, last_used_chain: 8, free_head: 7, vq idx: 0
num: 1, free_head: 8, id: 8, last_used: 9, used_wrap_counter: 1, vq idx: 0
------
size: 256, len: 98, id: 9, vq idx: 0
id: 9, last_used_chain: 9, free_head: 8, vq idx: 0
num: 1, free_head: 9, id: 9, last_used: 10, used_wrap_counter: 1, vq idx: 0
------
size: 256, len: 104, id: 10, vq idx: 0
id: 10, last_used_chain: 10, free_head: 9, vq idx: 0
num: 1, free_head: 10, id: 10, last_used: 11, used_wrap_counter: 1, vq idx: 0
I have a few more ideas of what to do. I'll let you know if I find something
else.
Thanks,
Sahil
[1]
https://github.com/valdaarhun/qemu/blob/packed_vq/hw/virtio/vhost-shadow-virtqueue.c#L687
[2]
https://github.com/valdaarhun/qemu/blob/packed_vq/hw/virtio/vhost-shadow-virtqueue.c#L629