On 03/29/2018 09:01 AM, Tan, Jianfeng wrote:
Hi Maxime and Victor,
-----Original Message-----
From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
Sent: Tuesday, December 5, 2017 10:28 PM
To: Yuanhan Liu; Tan, Jianfeng; Victor Kaplansky
Cc: dev@dpdk.org; sta...@dpdk.org; Yang, Yi Y
Subject: Re: [PATCH] vhost: fix segfault as handle set_mem_table message
On 12/05/2017 03:19 PM, Yuanhan Liu wrote:
On Tue, Nov 28, 2017 at 01:09:29PM +0100, Maxime Coquelin wrote:
On 11/15/2017 12:41 PM, Jianfeng Tan wrote:
In a running VM, operations (like device attach/detach) will
trigger the QEMU to resend set_mem_table to vhost-user backend.
DPDK vhost-user handles this message rudely by unmap all existing
regions and map new ones. This might lead to segfault if there
is pmd thread just trying to touch those unmapped memory regions.
But for most cases, except VM memory hotplug,
FYI, Victor is working on implementing a lock-less protection mechanism
to prevent crashes in such cases. It is intended first to protect
log_base in case of multiqueue + live-migration, but would solve thi
issue too.
Bring this issue back for discussion.
Reported by SPDK guys, even with per-queue lock, they could still run into
crash as of memory hot plug or unplug.
In detail, you add the lock in an internal struct, vhost_queue which is,
unfortunately, not visible to the external datapath, like vhost-scsi in SPDK.
Yes, I agree current solution is not enough
The memory hot plug and unplug would be the main issue from SPDK side so far.
For this specific issue, I think we can continue this patch to filter out the
changed regions, and keep unchanged regions not remapped.
But I know that the per-vq lock is not only to resolve the memory table issue,
some other vhost-user messages could also lead to problems? If yes, shall we
take a step back, to think about how to solve this kind of issues for backends,
like vhost-scsi.
Right, any message that can change the device or virtqueue states can be
problematic.
Thoughts?
In another thread, SPDK people proposed to destroy and re-create the
device for every message. I think this is not acceptable.
I proposed an alternative that I think would work:
- external backends & applications implements the .vring_state_changed()
callback. On disable it stops processing the rings and only return
once all descriptor buffers are processed. On enable, they resume the
rings processing.
- In vhost lib, we implement vhost_vring_pause and vhost_vring_resume
functions. In pause function, we save device state (enable or
disable) in a variable, and if the ring is enabled at device level it
calls .vring_state_changed() with disabled state. In resume, it checks
the vring state in the device and call .vring_state_changed() with
enable state if it was enabled in device state.
So, I think that would work but I hadn't had a clear reply from SPDK
people proving it wouldn't.
They asked we revert Victor's patch, but I don't see the need as it does
not hurt SPDK (but doesn't protect anything for them I agree), while it
really fixes real issues with internal Net backend.
What do you think of my proposal? Do you see other alternative?
Thanks,
Maxime
Thanks,
Jianfeng
QEMU still sends the
set_mem_table message even the memory regions are not changed as
QEMU vhost-user filters out those not backed by file (fd > 0).
To fix this case, we add a check in the handler to see if the
memory regions are really changed; if not, we just keep old memory
regions.
Fixes: 8f972312b8f4 ("vhost: support vhost-user")
CC: sta...@dpdk.org
CC: Yuanhan Liu <y...@fridaylinux.org>
CC: Maxime Coquelin <maxime.coque...@redhat.com>
Reported-by: Yang Zhang <zy107...@alibaba-inc.com>
Reported-by: Xin Long <longxin...@alibaba-inc.com>
Signed-off-by: Yi Yang <yi.y.y...@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng....@intel.com>
---
lib/librte_vhost/vhost_user.c | 33
+++++++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)
Reviewed-by: Maxime Coquelin <maxime.coque...@redhat.com>
Applied to dpdk-next-virtio.
Thanks.
--yliu
Maxime