Re: [Qemu-devel] [PATCH v3 4/9] monitor: no need to save need_resume

Markus Armbruster Wed, 29 Aug 2018 23:46:46 -0700

Peter Xu <pet...@redhat.com> writes:

> On Wed, Aug 29, 2018 at 02:40:39PM +0200, Markus Armbruster wrote:
>> Peter Xu <pet...@redhat.com> writes:
>> 
>> > On Wed, Aug 29, 2018 at 10:55:51AM +0200, Markus Armbruster wrote:
>> >> Peter Xu <pet...@redhat.com> writes:
>> >> 
>> >> > On Tue, Aug 28, 2018 at 05:46:29PM +0200, Markus Armbruster wrote:
>> >> >> Peter Xu <pet...@redhat.com> writes:
>> >> >> 
>> >> >> > On Sat, Aug 25, 2018 at 03:57:19PM +0200, Marc-André Lureau wrote:
>> >> >> >> There is no need for per-command need_resume granularity, it should
>> >> >> >> resume after running an non-oob command on oob-disabled monitor.
>> >> >> >> 
>> >> >> >> Signed-off-by: Marc-André Lureau <marcandre.lur...@redhat.com>
>> >> >> >> Reviewed-by: Markus Armbruster <arm...@redhat.com>
>> >> >> >
>> >> >> > Note that this series/patch still conflict with the "enable
>> >> >> > out-of-band by default" series.
>> >> >> >
>> >> >> >   [PATCH v6 00/13] monitor: enable OOB by default
>> >> >> 
>> >> >> Yes.
>> >> >> 
>> >> >> > I'm not against this patch to be merged since it has its r-b, but I
>> >> >> > feel like we'd better judge on whether we still like the response
>> >> >> > queue first, in case one day we'll need to add these things back.
>> >> >> 
>> >> >> Let's not worry about things that may or may not happen at some
>> >> >> indeterminate time in the future.
>> >> >
>> >> > It might not be that "far future"...  Please see below.
>> >> >
>> >> >> 
>> >> >> However:
>> >> >> 
>> >> >> > When there could be functional changes around the code path I would
>> >> >> > think we'd better keep the cleanup patches postponed a bit until 
>> >> >> > those
>> >> >> > functional changes are settled.  For now the functional part is 
>> >> >> > decide
>> >> >> > how to fix up the rest of out-of-band issues (my proposal is in the
>> >> >> > series above which should solve everything that is related to
>> >> >> > out-of-band to be fixed; if there is more, I'll continue to work on
>> >> >> > it), whether we should enable it by default for 3.1 (my answer
>> >> >> > is... yes...), and what to do with it.
>> >> >> 
>> >> >> I agree the important job is to finish OOB.
>> >> >> 
>> >> >> Sometimes, it's better to clean up first.  Sometimes, it's not.
>> >> >> 
>> >> >> Right now, the response queue is a useless complication, and
>> >> >> Marc-André's PATCH 3+4 get rid of it.  Lovely.  I understand this
>> >> >> conflicts with your OOB work.  The question is whether your work
>> >> >> fundamentally needs the response queue or not.
>> >> >
>> >> > Just to clarify a bit... I prefer to keep the response queue not
>> >> > because it's conflicting my existing work but because I think we might
>> >> > get use of it even in the near future.  I stated it here on the
>> >> > possibility that we might use the response queue to solve the
>> >> > unlimited monitor out_buf issue here:
>> >> >
>> >> > https://patchwork.kernel.org/patch/10511471/#22110771
>> >> >
>> >> > Quotes:
>> >> >
>> >> >         ...
>> >> >         Yeah actually this reminded me about the fact that we are
>> >> >         still using unlimited buffer size for the out_buf.  IMHO we
>> >> >         should make it a limited size after 3.0, then AFAICT if
>> >> >         without current qmp response queue we'll need to introduce
>> >> >         some similar thing to cache responses then when the out_buf is
>> >> >         full.
>> >> >
>> >> >         IMHO the response queue looks more like the correct place that
>> >> >         we should do the flow control, and the out_buf could be
>> >> >         majorly used to do the JSON->string convertion (so basically
>> >> >         IMHO the out_buf limitation should be the size of the maximum
>> >> >         JSON object that we'll have).
>> >> >         ...
>> >> >
>> >> > Let's imagine what we need if we want to limit the out_buf: (1) To
>> >> > limit the out_buf, we need somewhere to cache the responses that we
>> >> > want to put into out_buf but we can't when the out_buf is full -
>> >> > that's mostly the response queue now.  (2) Since we will need to queue
>> >> > these items onto out_buf when out_buf is not full, we'll possibly need
>> >> > something like a bottom half to kickoff when out_buf is able to handle
>> >> > more data - that's mostly the bottom half of the response queue.
>> >> >
>> >> > AFAIU the rest to do is very possible only that we set a limit to the
>> >> > out_buf but only if response queue is there...
>> >> 
>> >> Limiting outbuf only to have the same data pile up earlier in the
>> >> pipeline doesn't seem helpful.  We need to throttle production of
>> >> output.  A simple way to do that is throttling input.  See below.
>> >> 
>> >> > I didn't really work on the out_buf since I didn't want to further
>> >> > expand the out-of-band work (since it's already going far away before
>> >> > it settles down first...), and after all the out_buf issue is nothing
>> >> > related to the out-of-band work and it's there for a long time.
>> >> > However that's the major reason that I might prefer to keep the queue
>> >> > now.
>> >> >
>> >> > [1]
>> >> 
>> >> Let's review what we have.
>> >> 
>> >> QMP flow control is about limiting the amount of data QEMU buffers on
>> >> behalf of a QMP client.
>> >> 
>> >> This is about robustness, not security.  There are countless ways a QMP
>> >> client can make QEMU use more memory.  We want to cope with accidents,
>> >> not stop attacks.
>> >> 
>> >> A common and simple way to do flow control is to throttle receiving.
>> >> Unless the transport buffers way too much (which would be insane), the
>> >> peer will soon notice, and can adapt.
>> >> 
>> >> QMP input flows from the character device to commands.  It is converted
>> >> from text to QObject along the way.  Output flows from commands to the
>> >> character device.  It is converted from QObject to text along the way.
>> >> 
>> >> When the client sends input faster than QEMU can execute them, flow
>> >> control should kick in to stop adding more input to the pileup.  When
>> >> QEMU produces output faster than the client can receive it, flow control
>> >> should kick in to stop adding more output to the pileup.  We can do that
>> >> indirectly, by stopping input.
>> >> 
>> >> Input buffering:
>> >> 
>> >> * The chardev buffers 4KiB (I think).  Small enough to be ignored.
>> >> 
>> >> * The JSON parser buffers one partial JSON value as a token list, up to
>> >>   a limit in the order of 64MiB.  Weasel words "in the order", because
>> >>   it measures memory consumption only indirectly.  The limit is
>> >>   ridicilously generous for a control plane purpose like QMP.
>> >> 
>> >> * When the partial JSON value becomes complete, the JSON parser converts
>> >>   it to a QObject, then frees the token list.
>> >> 
>> >> * Without OOB, the QMP core buffers one complete QObject (the command)
>> >>   until we're done with the command.  The JSON parser's buffer should be
>> >>   empty then, because the QMP core suspends reading from the char device
>> >>   while it deals with a command.
>> >> 
>> >> * With OOB, the QMP core buffers up to 8 in-band commands and one
>> >>   out-of-band command.
>> >> 
>> >>   When the in-band command buffer is full, we currently drop further
>> >>   in-band commands, but that's a bad idea, and we're going to suspend
>> >>   reading from the char device instead.  Once we do, the JSON parser's
>> >>   buffer should be empty when the in-band command buffer is full.  The
>> >>   remainder of my analysis assumes we suspend rather than drop.
>> >> 
>> >> Output buffering:
>> >> 
>> >> * Traditionally, the QMP core buffers one QObject while it converts it
>> >>   to text.  It buffers an unlimited amount of text in mon->outbuf.
>> >> 
>> >> * Adding a request queue doesn't by itself change how much data is
>> >>   buffered, only when it's converted from QObject to text.  If the
>> >>   bottom half doing the conversion runs quickly, nothing changes.  If it
>> >>   gets delayed for some reason, QObjects can pile up in the response
>> >>   queue before they get converted and moved to mon->outbuf.
>> >> 
>> >> Summary of flow control:
>> >> 
>> >> * We limit input buffers and stop reading input when the limit is
>> >>   exceeded.  This stops input piling up.
>> >> 
>> >> * We do not limit output at all.
>> >> 
>> >> Ideally, we'd keep track of combined input and output buffer size, and
>> >> throttle input to keep it under control.  But separate limits for
>> >> individual buffers could be good enough, and might be simpler.
>> >
>> > (thanks for the summary)
>> >
>> >> 
>> >> >> If your OOB work happens to be coded for the response queue, but the
>> >> >> problem could also be solved without the response queue, then the OOB
>> >> >> job doesn't fundamentally need the response queue.
>> >> >
>> >> > Yes, I think the OOB work itself does not need the response queue now.
>> >> 
>> >> Understood.
>> >> 
>> >> >> Unless that's the case, getting rid of the response queue is 
>> >> >> unnecessary
>> >> >> churn.
>> >> >> 
>> >> >> If it is the case, we still need to consider effort.  Which order is
>> >> >> less total work?  Which order gets us to the goal faster?
>> >> >> 
>> >> >> Can you guys agree on answers to these questions, or do you need me to
>> >> >> answer them?
>> >> >> 
>> >> >> Restating the questions:
>> >> >> 
>> >> >> 1. Can you think of a way to do what Peter's OOB series does, but
>> >> >> without the response queue?
>> >> >> 
>> >> >> 2. If you can, what's easier / cheaper / faster:
>> >> >> 
>> >> >>    a. Merge Marc-André's patches to get rid of response queue, rewrite
>> >> >>       OOB series without response queue on top.
>> >> >> 
>> >> >>    b. Merge Peter's OOB series with response queue, rewrite patches to
>> >> >>       get rid of response queue on top.
>> >> >
>> >> > Let's have a quick look at above [1], if it's not a good reason (or
>> >> > even it's still unclear) then let's drop the queue.  It'll be
>> >> > perfectly fine I rebase my work upon Marc-André's.  After all the only
>> >> > reason to keep the response queue for me is to save time (for anyone
>> >> > who will be working with monitors).  If we spend too much time on
>> >> > judging whether we should keep the queue (we've already spent some)
>> >> > then it's already a waste of time...  It does not worth it IMO.
>> >> 
>> >> Time spent on coming up with a high-level plan for flow control is time
>> >> well spent.
>> >> 
>> >> Sometimes, you have to explore and experiment before you can come up
>> >> with a high-level plan that has a reasonable chance of working.  No
>> >> license to skip the "think before you hack" step entirely.
>> >> 
>> >> Here's the simplest (and possibly naive) plan I can think of: if
>> >> mon->outbuf exceeds a limit, suspend reading input.  No response queue.
>> >> Would it have a reasonable chance of working?
>> >
>> > Hmm I think it works...  Let's assume:
>> >
>> > - M: threshold size for outbuf
>> > - A: size of current outbuf
>> > - B: size of a new response message (and assume A+B>M, so the flow
>> >      control will trigger)
>> >
>> > I think that queue is not a must if we don't restrict the buffer that
>> > much - for example we can just queue the JSON object into outbuf when
>> > we receive the new message with size B (after queuing, we might get
>> > A+B>M, then it's slightly bigger than the limit threshold), now we
>> > suspend the monitor.
>> 
>> Yes.
>> 
>> M is a threshold for suspending the monitor, not a hard size limit.
>> 
>> Since the response QObjects has to go *somewhere*, we can just as well
>> stuff it into mon->outbuf.
>> 
>> The actual mon-outbuf size A may exceed the threshold M, but only while
>> the monitor is suspended.
>> 
>> > If we want to have a very strict buffer size limitation for outbuf so
>> > the outbuf never use more than M, we can't just queue it since it will
>> > overflow, then we need to stop the input and cache the object
>> > somewhere (e.g., the response queue).
>> 
>> Yes, but that doesn't actually limit memory use: instead of mon->outbuf
>> growing without bound, we now have response queue growing without bound.
>> Actual memory use changes only if one of the two representations QObject
>> and text is more compact.
>> 
>> Unscientific experiment: I instrumented qobject_to_json() to measure
>> size of its QObject input and text output (patch appended).  For
>> query-qmp-schema's output, I measure ~9.8MiB for QObject, and ~119KiB
>> for text:
>> 
>>     ### qobject_to_json(0x556653e7ac30)
>>     ###    #objs   #bytes QType
>>     ###        0        0 none
>>     ###      568        0 qnull
>>     ###        0        0 qnum
>>     ###    11876   540253 qstring
>>     ###     2301  9677496 qdict
>>     ###      509    86344 qlist
>>     ###        2       48 qbool
>>     ### 0x556653e7ac30 uses 10304141 vs. 121759
>> 
>> Most of the QObject's memory is occupied by QDicts.  The way they're
>> implemented is wasteful (known issue).  But even with a magical QDict
>> implementation that uses no memory at all, the QObject would still use
>> five times the memory of text.
>> 
>> My experiment suggests that to conserve memory, we should convert
>> responses to text right away.
>
> From memory saving POV, yes.  But keeping them as objects bring us
> flexibility and isolation, e.g., we can simply drop one event as we
> want (rather than dropping the whole outbuf), or suddenly we want to
> insert a new key due to some reason.  But these requirements have no
> real usage so far.  So I admit it's possibly me that have thought too
> much and I decided to not struggle. :)


YAGNI :)

Letting out-of-band responses (and perhaps certain events) overtake
in-band responses in the response queue would be kind of cool.  To be
actually useful, the response queue had to be non-empty.  Your code
empties it into mon->outbuf at the first opportunity.  That would have
to be changed.  Refilling mon->outbuf only when it gets empty would be
suboptimal, because it can split up writes to the underlying character
device.  Still, the response queue would be non-empty only when the QMP
client is slow to receive output.  Does that happen in practice?  My
point is: the additional complexity would be real, but the gains seem
dubious.

> (I never consider perf for QMP yet... so actually spending more memory
>  is not a problem to me; however possibility to use up the memory is
>  of course another thing...)
>
> Now I only hope we won't have other bug triggered due to start using
> the multi-threading of output buffer after we remove the queue.

That'll be Marc-André's fault then, for assuring us the "QMP response is
thread safe, and chardev write path is thread safe" right in his commit
message of PATCH 3 ;)

>> > So I think now I agree with you that the response queue is not
>> > required if we think the first solution is okay for us.
>> 
>> Cool.  Can you explore that and post patches?
>
> I think what I can do here is rebasing my work after Marc-Andre's
> patches merged.  Please let me know if there's anything else I can do.
>
> Thanks,

Sounds like a plan.

Re: [Qemu-devel] [PATCH v3 4/9] monitor: no need to save need_resume

Reply via email to