Claudio Jeker <cje...@diehard.n-r-g.com> writes:

> On Thu, Aug 03, 2023 at 07:01:51PM -0400, Dave Voutila wrote:
>>
>> Claudio Jeker <cje...@diehard.n-r-g.com> writes:
>>
>> > On Thu, Aug 03, 2023 at 04:20:47PM -0400, Dave Voutila wrote:
>> >> Found this while working on some virtio stuff. My original
>> >> implementation as part of the multi-process redesign didn't handle if
>> >> imsg_flush failed due to resource shortages (EAGAIN), so adopt the
>> >> do/while idiom used by other daemons like iked(8).
>> >>
>> >> Manifests with errors from the vm process looking like:
>> >>
>> >>   virtio_pci_io: imsg_flush (write)
>> >>   vmd: pci i/o access function failed
>> >>
>> >> ok?
>> >
>> > There are way to many imsg_flush() calls in this code. I don't think this
>> > is designed the right way. imsg_flush() should almost never be used.
>> > Busy loop on EAGAIN will bite you further down the road.
>> >
>>
>> Bite me how?
>
> We had massive EAGAIN spins in relayd. Busy loop on EAGAIN is almost
> always wrong. A possible fix is to poll on the fd until it is writable --
> while this does not burn CPU for nothing it still does not allow any other
> event from happening either. Have a look at what atomicio.c does you may
> need to add something similar for imsg_flush().
>

Ah. In this case, it's the vcpu thread that doesn't participate in an
event loop via libevent.

>> This is all for the synchronous imsg channel to the virtio device
>> process, not the async one. What's going to actually perform the flush
>> for me if not imsg_flush?
>
> This is where I'm a bit confused. There is no such concept of a
> synchronous imsg channel. imsg are by nature asynchronous. You pass
> messages beween processes (especially in a libevent driven daemon).
> It seems this code requires some kind of synchronisation but blocking
> all processing during that time seems like a very big hammer.
>

Right. It's the vcpu thread. Reading a register on the virtio device
needs to block until the value is produced, unless I'm missing something
here.

> I don't fully grasp vmd but looking at the code and seeing that atomicio
> is mixed with libevent is a red flag. Also why is the synchronous channel
> set non-blocking? If it is synchronous it should most probably be blocking
> (which would also solve the EAGAIN issue).
>

atomicio is used after exec and before the child process can
initialize its libevent loop and imsg channels. Afterwards, it's all
imsg.

So I guess that's the part that's a bit messy? The vm process's vcpu
thread sends imsgs "synchronously", but the device process will process
and respond in an asynchronous way using libevent.

I'm going to revisit this a bit more closely and see how it can be
reworked.

Reply via email to