Re: [PATCH v4 07/14] vfio-user: run vfio-user context

Jag Raman Tue, 11 Jan 2022 06:08:53 -0800

> On Jan 11, 2022, at 4:36 AM, Stefan Hajnoczi <stefa...@redhat.com> wrote:
> 
> On Mon, Jan 10, 2022 at 05:56:25PM +0000, John Levon wrote:
>> On Thu, Jan 06, 2022 at 01:35:32PM +0000, Stefan Hajnoczi wrote:
>> 
>>>>>>> +static void vfu_object_attach_ctx(void *opaque)
>>>>>>> +{
>>>>>>> +    VfuObject *o = opaque;
>>>>>>> +    GPollFD pfds[1];
>>>>>>> +    int ret;
>>>>>>> +
>>>>>>> +    qemu_set_fd_handler(o->vfu_poll_fd, NULL, NULL, NULL);
>>>>>>> +
>>>>>>> +    pfds[0].fd = o->vfu_poll_fd;
>>>>>>> +    pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
>>>>>>> +
>>>>>>> +retry_attach:
>>>>>>> +    ret = vfu_attach_ctx(o->vfu_ctx);
>>>>>>> +    if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
>>>>>>> +        qemu_poll_ns(pfds, 1, 500 * (int64_t)SCALE_MS);
>>>>>>> +        goto retry_attach;
>>>>>> 
>>>>>> This can block the thread indefinitely. Other events like monitor
>>>>>> commands are not handled in this loop. Please make this asynchronous
>>>>>> (set an fd handler and return from this function so we can try again
>>>>>> later).
>>>>>> 
>>>>>> The vfu_attach_ctx() implementation synchronously negotiates the
>>>>>> vfio-user connection :(. That's a shame because even if accept(2) is
>>>>>> handled asynchronously, the negotiation can still block. It would be
>>>>>> cleanest to have a fully async libvfio-user's vfu_attach_ctx() API to
>>>>>> avoid blocking. Is that possible?
>>>>> 
>>>>> Thanos / John,
>>>>> 
>>>>>    Any thoughts on this?
>>>> 
>>>> I'm discussing this with John and FYI there are other places where 
>>>> libvfio-user can block, e.g. sending a response or receiving a command. Is 
>>>> it just the negotiation you want it to be asynchronous or _all_ 
>>>> libvfio-user operations? Making libvfio-user fully asynchronous might 
>>>> require a substantial API rewrite.
>>> 
>>> I see at least two reasons for a fully async API:
>>> 
>>> 1. The program wants to handle other events (e.g. a management REST API)
>>>   from the same event loop thread that invokes libvfio-user. If
>>>   libvfio-user blocks then the other events cannot be handled within a
>>>   reasonable time frame.
>>> 
>>>   The workaround for this is to use multi-threading and ignore the
>>>   event-driven architecture implied by vfu_get_poll_fd().
>>> 
>>> 2. The program handles multiple clients that do not trust each other.
>>>   This could be a software-defined network switch or storage appliance.
>>>   A malicious client can cause a denial-of-service by making a
>>>   libvfio-user call block.
>>> 
>>>   Again, the program needs separate threads instead of an event loop to
>>>   work around this.
>> 
>> Hi Stefan, we're certainly aware of the rationale. Ultimately we just haven't
>> yet found resources to fix this: in practice, we don't really hit problems 
>> from
>> the parts that are still synchronous. Of course, that's not a good argument 
>> when
>> it comes to your second issue, and indeed the library is not currently 
>> suitable
>> for multiple untrusted clients.
>> 
>> For Jag but: patches are welcome. But it's not just negotiate(): all sorts of
>> things are potentially synchronous - for example, it's expected that the 
>> region
>> read/write callbacks are done synchronously. Any other client reply, or
>> server->client message, is also synchronous.
>> 
>> It's partly why we haven't yet stabilised the API.
> 
> From the QEMU side it's okay to have blocking code in this experimental
> feature if users are aware of the limitation (e.g. the monitor is
> unresponsive if vfio-user blocks) and multiple untrusted clients are not
> supported. The QEMU code should also make its limitations clear so
> readers are aware of them and know the author chose this approach
> intentionally rather than due to an oversight.
> 
> Jag: Please mention this in the documentation and add a comment to
> vfu_object_attach_ctx() explaining that this code currently blocks.

Sure, will do.

Thank you!
--
Jag

> 
> I think in the long run it will be necessary to address it, e.g. in the
> multi-client case. That can either be done with threads or by making
> libvfio-user fully async.
> 
> Stefan
Re: [PATCH v4 07/14] vfio-user: run vfio-user context

Reply via email to