On 3/29/2018 3:35 PM, Maxime Coquelin wrote:
On 03/29/2018 09:01 AM, Tan, Jianfeng wrote:
Hi Maxime and Victor,
-----Original Message-----
From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
Sent: Tuesday, December 5, 2017 10:28 PM
To: Yuanhan Liu; Tan, Jianfeng; Victor Kaplansky
Cc: dev@dpdk.org; sta...@dpdk.org; Yang, Yi Y
Subject: Re: [PATCH] vhost: fix segfault as handle set_mem_table
message
On 12/05/2017 03:19 PM, Yuanhan Liu wrote:
On Tue, Nov 28, 2017 at 01:09:29PM +0100, Maxime Coquelin wrote:
On 11/15/2017 12:41 PM, Jianfeng Tan wrote:
In a running VM, operations (like device attach/detach) will trigger QEMU
to resend the set_mem_table message to the vhost-user backend. DPDK
vhost-user handles this message rudely by unmapping all existing regions
and mapping the new ones. This might lead to a segfault if a PMD thread
is just then touching one of those unmapped memory regions.
But in most cases, except VM memory hotplug, the memory regions do not
actually change, so unchanged regions need not be remapped.
FYI, Victor is working on implementing a lock-less protection mechanism
to prevent crashes in such cases. It is intended first to protect
log_base in case of multiqueue + live-migration, but would solve this
issue too.
Bringing this issue back up for discussion.
As reported by the SPDK guys, even with the per-queue lock, they could
still run into a crash on memory hot plug or unplug.
In detail, you add the lock in an internal struct, vhost_virtqueue, which
is, unfortunately, not visible to external datapaths such as vhost-scsi
in SPDK.
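For example, a vhost-scsi style backend typically fetches the raw vring via
rte_vhost_get_vhost_vring() and then polls it from its own thread, so it never
passes through the library code paths where that internal lock is taken. A
rough sketch (the polling body is illustrative only):

#include <stdint.h>
#include <rte_vhost.h>

/* Sketch: an external backend polling a vring directly. Nothing in the
 * vhost library can serialize this against a set_mem_table remap, because
 * the per-virtqueue lock lives in the internal vhost_virtqueue struct. */
static void
external_backend_poll(int vid, uint16_t vring_idx)
{
	struct rte_vhost_vring vring;

	if (rte_vhost_get_vhost_vring(vid, vring_idx, &vring) != 0)
		return;

	/* Direct access to the ring mapped in guest memory; if the
	 * message handler unmaps the regions right now, this crashes. */
	uint16_t avail_idx = vring.avail->idx;
	(void)avail_idx;
}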
Yes, I agree the current solution is not enough.
Memory hot plug and unplug would be the main issue from the SPDK side so
far. For this specific issue, I think we can continue with this patch to
filter out the changed regions and leave unchanged regions mapped as they
are.
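Roughly, that filtering boils down to comparing each region announced in the
new message against what is already mapped, e.g. (a sketch only; the helper
name is made up, the fields come from the public struct rte_vhost_mem_region):

#include <stdbool.h>
#include <rte_vhost.h>

/* Sketch: report whether a region announced in a new
 * VHOST_USER_SET_MEM_TABLE message is identical to one that is already
 * mapped, so its existing mmap() can be kept and only genuinely changed
 * regions get unmapped/remapped. */
static bool
mem_region_unchanged(const struct rte_vhost_mem_region *old_reg,
		     const struct rte_vhost_mem_region *new_reg)
{
	return old_reg->guest_phys_addr == new_reg->guest_phys_addr &&
	       old_reg->guest_user_addr == new_reg->guest_user_addr &&
	       old_reg->size == new_reg->size;
}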
How do you think we should move forward on this specific memory issue? I
think it can proceed in parallel with the general mechanism below.
But I know the per-vq lock is not only meant to resolve the memory table
issue; some other vhost-user messages could also lead to problems, right?
If so, shall we take a step back and think about how to solve this kind
of issue for backends like vhost-scsi?
Right, any message that can change the device or virtqueue states can be
problematic.
Thoughts?
In another thread,
Apologies for starting another thread.
SPDK people proposed to destroy and re-create the
device for every message. I think this is not acceptable.
At first impression it sounds a bit overkill and error-prone.
I proposed an alternative that I think would work:
- external backends & applications implement the .vring_state_changed()
callback. On disable, they stop processing the rings and only return
once all in-flight descriptor buffers are processed. On enable, they
resume ring processing.
OK, this gives external backends & applications a chance to sync (via a
lock or an event) with their polling threads, just as destroy_device()
works today.
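From the application side I would expect something along these lines (a
sketch only; ring_enabled[] and wait_for_inflight_done() stand for whatever
synchronization the application already has with its polling thread):

#include <stdbool.h>
#include <rte_vhost.h>

#define APP_MAX_QUEUES 128	/* illustrative limit */

static volatile bool ring_enabled[APP_MAX_QUEUES];

/* Placeholder: block until the polling thread has completed every
 * descriptor it had already picked up on this queue. */
void wait_for_inflight_done(int vid, uint16_t queue_id);

static int
app_vring_state_changed(int vid, uint16_t queue_id, int enable)
{
	if (queue_id >= APP_MAX_QUEUES)
		return -1;

	if (enable) {
		/* Resume: the polling thread may touch the ring again. */
		ring_enabled[queue_id] = true;
	} else {
		/* Stop picking up new descriptors ... */
		ring_enabled[queue_id] = false;
		/* ... and only return once in-flight buffers are done,
		 * so the message handler can safely change the device. */
		wait_for_inflight_done(vid, queue_id);
	}
	return 0;
}

/* Registered as usual with rte_vhost_driver_callback_register(). */
static const struct vhost_device_ops app_ops = {
	.vring_state_changed = app_vring_state_changed,
	/* .new_device, .destroy_device, ... as before */
};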
- In the vhost lib, we implement vhost_vring_pause and vhost_vring_resume
functions. In the pause function, we save the ring state (enabled or
disabled) in a variable, and if the ring is enabled at device level we
call .vring_state_changed() with the disabled state. In resume, we check
the saved vring state and call .vring_state_changed() with the enabled
state if the ring was enabled at device level.
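In pseudo-C it would be something like the following (a sketch only: neither
function exists yet, and the two structs are minimal stand-ins for the
library's internal device/virtqueue state):

#include <stdbool.h>
#include <stdint.h>
#include <rte_vhost.h>

/* Minimal stand-ins for the internal device and virtqueue state. */
struct sketch_virtqueue {
	bool enabled;			/* current state at device level */
	bool enabled_before_pause;	/* saved by vhost_vring_pause() */
};

struct sketch_device {
	int vid;
	const struct vhost_device_ops *notify_ops;
};

static void
vhost_vring_pause(struct sketch_device *dev, struct sketch_virtqueue *vq,
		  uint16_t index)
{
	/* Remember the state the ring had at device level ... */
	vq->enabled_before_pause = vq->enabled;

	/* ... and if it was live, ask the backend to quiesce it.  The
	 * callback only returns once in-flight buffers are processed. */
	if (vq->enabled && dev->notify_ops->vring_state_changed)
		dev->notify_ops->vring_state_changed(dev->vid, index, 0);
}

static void
vhost_vring_resume(struct sketch_device *dev, struct sketch_virtqueue *vq,
		   uint16_t index)
{
	/* Re-enable only the rings that were enabled at device level. */
	if (vq->enabled_before_pause && dev->notify_ops->vring_state_changed)
		dev->notify_ops->vring_state_changed(dev->vid, index, 1);
}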
So I think that would work, but I haven't had a clear reply from the SPDK
people proving it wouldn't.
They asked us to revert Victor's patch, but I don't see the need, as it
does not hurt SPDK (though I agree it doesn't protect anything for them),
while it really fixes real issues with the internal net backend.
I agree with you. As SPDK has no chance to take the lock, it cannot be
affected. I think what people (including myself) are really against is
adding a lock in the DPDK PMD. But we have not found a better way so far.
What do you think of my proposal? Do you see other alternative?
Sounds like a feasible way. Let's wait and see what the SPDK guys think.
Thanks,
Jianfeng
Thanks,
Maxime
Thanks,
Jianfeng