On Mon, Jun 26, 2023 at 10:26 AM Hawkins Jiawei <yin31...@gmail.com> wrote: > > On 2023/6/25 18:48, Eugenio Perez Martin wrote: > > On Thu, Jun 22, 2023 at 3:07 AM Hawkins Jiawei <yin31...@gmail.com> wrote: > >> > >> According to the VirtIO standard, "Since there are no guarantees, > >> it can use a hash filter or silently switch to > >> allmulti or promiscuous mode if it is given too many addresses." > >> To achive this, QEMU ignores MAC addresses and marks `mac_table.x_overflow` > >> in the device internal state in virtio_net_handle_mac() > >> if the guest sets more than `MAC_TABLE_ENTRIES` MAC addresses > >> for the filter table. > >> > >> However, the problem is that QEMU never marks the `mac_table.x_overflow` > >> for the vdpa device internal state when the guest sets more than > >> `MAC_TABLE_ENTRIES` MAC addresses. > >> > >> To be more specific, currently QEMU offers a buffer size of > >> vhost_vdpa_net_cvq_cmd_len() for CVQ commands, which represents the size of > >> VIRTIO_NET_CTRL_MAC_TABLE_SET command with a maximum `MAC_TABLE_ENTRIES` > >> MAC addresses. > >> > >> Consequently, if the guest sets more than `MAC_TABLE_ENTRIES` MAC > >> addresses, > >> QEMU truncates the CVQ command data and copies this incomplete command > >> into the out buffer. In this situation, virtio_net_handle_mac() fails the > >> integrity check and returns VIRTIO_NET_ERR instead of marking > >> `mac_table.x_overflow` and returning VIRTIO_NET_OK, since the copied > >> CVQ command in the buffer is incomplete and flawed. > >> > >> This patch solves this problem by increasing the buffer size to > >> vhost_vdpa_net_cvq_cmd_page_len(), which represents the size of the buffer > >> that is allocated and mmaped. Therefore, everything should work correctly > >> as long as the guest sets fewer than `(vhost_vdpa_net_cvq_cmd_page_len() - > >> sizeof(struct virtio_net_ctrl_hdr) > >> - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses. > >> > >> Considering the highly unlikely scenario for the guest setting more than > >> that number of MAC addresses for the filter table, this patch should > >> work fine for the majority of cases. If there is a need for more than thoes > >> entries, we can increase the value for vhost_vdpa_net_cvq_cmd_page_len() > >> in the future, mapping more than one page for command output. > >> > >> Fixes: 7a7f87e94c ("vdpa: Move command buffers map to start of net device") > >> Signed-off-by: Hawkins Jiawei <yin31...@gmail.com> > >> --- > >> net/vhost-vdpa.c | 11 ++++++++++- > >> 1 file changed, 10 insertions(+), 1 deletion(-) > >> > >> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c > >> index 5a72204899..ecfa8852b5 100644 > >> --- a/net/vhost-vdpa.c > >> +++ b/net/vhost-vdpa.c > >> @@ -784,9 +784,18 @@ static int > >> vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq, > >> }; > >> ssize_t dev_written = -EINVAL; > >> > >> + /* > >> + * This code truncates the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command > >> + * and prevents QEMU from marking `mac_table.x_overflow` in the device > >> + * internal state in virtio_net_handle_mac() if the guest sets more > >> than > >> + * `(vhost_vdpa_net_cvq_cmd_page_len() - sizeof(struct > >> virtio_net_ctrl_hdr) > >> + * - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC > >> addresses for > >> + * filter table. > >> + * However, this situation is considered rare, so it is acceptable. > > > > I think we can also fix this situation. If it fits in one page, we can > > still send the same page to the device and virtio_net_handle_ctrl_iov. > > If it does not fit in one page, we can clear all mac filters with > > VIRTIO_NET_CTRL_RX_ALLUNI and / or VIRTIO_NET_CTRL_RX_ALLMULTI. > > Hi Eugenio, > > Thank you for your review. > > However, it appears that the approach may not resolve the situation. > > When the CVQ command exceeds one page, > vhost_vdpa_net_handle_ctrl_avail() truncates the command, resulting in a > malformed SVQ command being received by the hardware, which in turn > triggers an error acknowledgement to QEMU. >
If that happens we can sanitize the copied cmd buffer. Let's start with an overflowed unicast table first. To configure the device we need to transform the command to VIRTIO_NET_CTRL_RX_ALLUNI, as we cannot copy all the table to the out cmd page. If that succeeds, we can continue to register that in the VirtIONet struct. We can copy the first MAC_TABLE_ENTRIES + 1 entries and set entries = (MAC_TABLE_ENTRIES + 1), and then use that buffer to call virtio_net_handle_ctrl_iov. That sets VirtIONet.uni_overflow = true and VirtIONet.all_uni = false. We need to handle the multicast addresses in a similar way, but we can find cases where only multicast addresses overflow. In future series we can improve the situation: * Allocating bigger out buffers so more mac addresses fit in it. * Mapping also guest pages in CVQ, so the real device is able to read the full table but VirtIONet only stores the first MAC_TABLE_ENTRIES or .uni_overflow = 1. But I think it should be enough at this point. Thanks! > So if CVQ command exceeds one page, vhost_vdpa_net_handle_ctrl_avail() > should not update the device internal state because the SVQ command > fails.(Please correct me if I am wrong) > > It appears that my commit message and comments did not take this into > account. I will refactor them in the v2 patch.. > > Thanks! > > > > > > Thanks! > > > >> + */ > >> out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0, > >> s->cvq_cmd_out_buffer, > >> - vhost_vdpa_net_cvq_cmd_len()); > >> + vhost_vdpa_net_cvq_cmd_page_len()); > >> if (*(uint8_t *)s->cvq_cmd_out_buffer == VIRTIO_NET_CTRL_ANNOUNCE) { > >> /* > >> * Guest announce capability is emulated by qemu, so don't > >> forward to > >> -- > >> 2.25.1 > >> > > >