On Wed, Jun 22, 2022 at 11:33:32AM +0200, Jan Beulich wrote:
> On 22.06.2022 11:09, Roger Pau Monné wrote:
> > On Wed, Jun 22, 2022 at 10:04:19AM +0200, Jan Beulich wrote:
> >> On 16.06.2022 13:31, Roger Pau Monné wrote:
> >>> On Tue, Jun 14, 2022 at 11:45:54AM +0200, Jan Beulich wrote:
> >>>> On 14.06.2022 11:38, Roger Pau Monné wrote:
> >>>>> On Tue, Jun 14, 2022 at 11:13:07AM +0200, Jan Beulich wrote:
> >>>>>> On 14.06.2022 10:32, Roger Pau Monné wrote:
> >>>>>>> On Tue, Jun 14, 2022 at 10:10:03AM +0200, Jan Beulich wrote:
> >>>>>>>> On 14.06.2022 08:52, Roger Pau Monné wrote:
> >>>>>>>>> On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
> >>>>>>>>>> On 13.06.2022 14:32, Roger Pau Monné wrote:
> >>>>>>>>>>> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
> >>>>>>>>>>>> On 13.06.2022 11:04, Roger Pau Monné wrote:
> >>>>>>>>>>>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
> >>>>>>>>>>>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
> >>>>>>>>>>>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
> >>>>>>>>>>>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
> >>>>>>>>>>>>>>>>> Prevent dropping console output from the hardware domain, 
> >>>>>>>>>>>>>>>>> since it's
> >>>>>>>>>>>>>>>>> likely important to have all the output if the boot fails 
> >>>>>>>>>>>>>>>>> without
> >>>>>>>>>>>>>>>>> having to resort to sync_console (which also affects the 
> >>>>>>>>>>>>>>>>> output from
> >>>>>>>>>>>>>>>>> other guests).
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Do so by pairing the console_serial_puts() with
> >>>>>>>>>>>>>>>>> serial_{start,end}_log_everything(), so that no output is 
> >>>>>>>>>>>>>>>>> dropped.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> While I can see the goal, why would Dom0 output be 
> >>>>>>>>>>>>>>>> (effectively) more
> >>>>>>>>>>>>>>>> important than Xen's own one (which isn't "forced")? And 
> >>>>>>>>>>>>>>>> with this
> >>>>>>>>>>>>>>>> aiming at boot output only, wouldn't you want to stop the 
> >>>>>>>>>>>>>>>> overriding
> >>>>>>>>>>>>>>>> once boot has completed (of which, if I'm not mistaken, we 
> >>>>>>>>>>>>>>>> don't
> >>>>>>>>>>>>>>>> really have any signal coming from Dom0)? And even during 
> >>>>>>>>>>>>>>>> boot I'm
> >>>>>>>>>>>>>>>> not convinced we'd want to let through everything, but 
> >>>>>>>>>>>>>>>> perhaps just
> >>>>>>>>>>>>>>>> Dom0's kernel messages?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I normally use sync_console on all the boxes I'm doing dev 
> >>>>>>>>>>>>>>> work, so
> >>>>>>>>>>>>>>> this request is something that come up internally.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Didn't realize Xen output wasn't forced, since we already 
> >>>>>>>>>>>>>>> have rate
> >>>>>>>>>>>>>>> limiting based on log levels I was assuming that 
> >>>>>>>>>>>>>>> non-ratelimited
> >>>>>>>>>>>>>>> messages wouldn't be dropped.  But yes, I agree that Xen 
> >>>>>>>>>>>>>>> (non-guest
> >>>>>>>>>>>>>>> triggered) output shouldn't be rate limited either.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Which would raise the question of why we have log levels for 
> >>>>>>>>>>>>>> non-guest
> >>>>>>>>>>>>>> messages.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hm, maybe I'm confused, but I don't see a direct relation 
> >>>>>>>>>>>>> between log
> >>>>>>>>>>>>> levels and rate limiting.  If I set log level to WARNING I would
> >>>>>>>>>>>>> expect to not loose _any_ non-guest log messages with level 
> >>>>>>>>>>>>> WARNING or
> >>>>>>>>>>>>> above.  It's still useful to have log levels for non-guest 
> >>>>>>>>>>>>> messages,
> >>>>>>>>>>>>> since user might want to filter out DEBUG non-guest messages for
> >>>>>>>>>>>>> example.
> >>>>>>>>>>>>
> >>>>>>>>>>>> It was me who was confused, because of the two log-everything 
> >>>>>>>>>>>> variants
> >>>>>>>>>>>> we have (console and serial). You're right that your change is 
> >>>>>>>>>>>> unrelated
> >>>>>>>>>>>> to log levels. However, when there are e.g. many warnings or 
> >>>>>>>>>>>> when an
> >>>>>>>>>>>> admin has lowered the log level, what you (would) do is 
> >>>>>>>>>>>> effectively
> >>>>>>>>>>>> force sync_console mode transiently (for a subset of messages, 
> >>>>>>>>>>>> but
> >>>>>>>>>>>> that's secondary, especially because the "forced" output would 
> >>>>>>>>>>>> still
> >>>>>>>>>>>> be waiting for earlier output to make it out).
> >>>>>>>>>>>
> >>>>>>>>>>> Right, it would have to wait for any previous output on the 
> >>>>>>>>>>> buffer to
> >>>>>>>>>>> go out first.  In any case we can guarantee that no more output 
> >>>>>>>>>>> will
> >>>>>>>>>>> be added to the buffer while Xen waits for it to be flushed.
> >>>>>>>>>>>
> >>>>>>>>>>> So for the hardware domain it might make sense to wait for the TX
> >>>>>>>>>>> buffers to be half empty (the current tx_quench logic) by 
> >>>>>>>>>>> preempting
> >>>>>>>>>>> the hypercall.  That however could cause issues if guests manage 
> >>>>>>>>>>> to
> >>>>>>>>>>> keep filling the buffer while the hardware domain is being 
> >>>>>>>>>>> preempted.
> >>>>>>>>>>>
> >>>>>>>>>>> Alternatively we could always reserve half of the buffer for the
> >>>>>>>>>>> hardware domain, and allow it to be preempted while waiting for 
> >>>>>>>>>>> space
> >>>>>>>>>>> (since it's guaranteed non hardware domains won't be able to 
> >>>>>>>>>>> steal the
> >>>>>>>>>>> allocation from the hardware domain).
> >>>>>>>>>>
> >>>>>>>>>> Getting complicated it seems. I have to admit that I wonder 
> >>>>>>>>>> whether we
> >>>>>>>>>> wouldn't be better off leaving the current logic as is.
> >>>>>>>>>
> >>>>>>>>> Another possible solution (more like a band aid) is to increase the
> >>>>>>>>> buffer size from 4 pages to 8 or 16.  That would likely allow to 
> >>>>>>>>> cope
> >>>>>>>>> fine with the high throughput of boot messages.
> >>>>>>>>
> >>>>>>>> You mean the buffer whose size is controlled by serial_tx_buffer?
> >>>>>>>
> >>>>>>> Yes.
> >>>>>>>
> >>>>>>>> On
> >>>>>>>> large systems one may want to simply make use of the command line
> >>>>>>>> option then; I don't think the built-in default needs changing. Or
> >>>>>>>> if so, then perhaps not statically at build time, but taking into
> >>>>>>>> account system properties (like CPU count).
> >>>>>>>
> >>>>>>> So how about we use:
> >>>>>>>
> >>>>>>> min(16384, ROUNDUP(1024 * num_possible_cpus(), 4096))
> >>>>>>
> >>>>>> That would _reduce_ size on small systems, wouldn't it? Originally
> >>>>>> you were after increasing the default size. But if you had meant
> >>>>>> max(), then I'd fear on very large systems this may grow a little
> >>>>>> too large.
> >>>>>
> >>>>> See previous followup about my mistake of using min() instead of
> >>>>> max().
> >>>>>
> >>>>> On a system with 512 CPUs that would be 512KB, I don't think that's a
> >>>>> lot of memory, specially taking into account that a system with 512
> >>>>> CPUs should have a matching amount of memory I would expect.
> >>>>>
> >>>>> It's true however that I very much doubt we would fill a 512K buffer,
> >>>>> so limiting to 64K might be a sensible starting point?
> >>>>
> >>>> Yeah, 64k could be a value to compromise on. What total size of
> >>>> output have you observed to trigger the making of this patch? Xen
> >>>> alone doesn't even manage to fill 16k on most of my systems ...
> >>>
> >>> I've tried on one of the affected systems now, it's a 8 CPU Kaby Lake
> >>> at 3,5GHz, and manages to fill the buffer while booting Linux.
> >>>
> >>> My proposed formula won't fix this use case, so what about just
> >>> bumping the buffer to 32K by default, which does fix it?
> >>
> >> As said, suitably explained I could also agree with going to 64k. The
> >> question though is in how far 32k, 64k, or ...
> >>
> >>> Or alternatively use the proposed formula, but adjust the buffer to be
> >>> between [32K,64K].
> >>
> >> ... this formula would cover a wide range of contemporary systems.
> >> Without such I can't really see what good a bump would do, as then
> >> many people may still find themselves in need of using the command
> >> line option to put in place a larger buffer.
> > 
> > I'm afraid I don't know how to make progress with this.
> > 
> > The current value is clearly too low for at least one of my systems.
> > I don't think it's feasible for me to propose a value or formula that
> > I can confirm will be suitable for all systems, hence I would suggest
> > increasing the buffer value to 32K as that does fix the problem on
> > that specific system (without claiming it's a value that would suit
> > all setups).
> > 
> > I agree that many people could still find themselves in the need of
> > using the command line option, but I can assure that new buffer value
> > would fix the issue on at least one system, which should be enough as
> > a justification.
> 
> I'm afraid I view this differently. Dealing with individual systems is
> imo not a reason to change a default, when there is a command line
> option to adjust the value in question. And when, at the same time,
> the higher default might cause waste of resources on at least on other
> system. As said before, I'm not going to object to bumping to 32k or
> even 64k, provided this has wider benefit and limited downsides. But
> with a justification of "this fixes one system" I'm not going to ack
> (but also not nak) such a change.

Sorry, I certainly have a different view on this, as I think we should
aim to provide sane defaults that allow for proper functioning of Xen,
unless it turns out those defaults could cause issues on other
systems.  In this case I don't see how bumping the default console
ring from 16K to 32K is going to cause issues elsewhere.

Will prepare a patch to do that and send to the list, in case anyone
else would like to Ack it.

Thanks, Roger.

Reply via email to