On Wed, 21 Jan 2026, Jan Beulich wrote:
> On 21.01.2026 01:07, Stefano Stabellini wrote:
> > --- a/xen/drivers/char/console.c
> > +++ b/xen/drivers/char/console.c
> > @@ -521,6 +521,8 @@ struct domain *console_get_domain(void)
> > {
> > struct domain *d;
> >
> > + nrspin_lock_irq(&console_lock);
> > +
> > if ( console_rx == 0 )
> > return NULL;
> >
> > @@ -540,6 +542,8 @@ void console_put_domain(struct domain *d)
> > {
> > if ( d )
> > rcu_unlock_domain(d);
> > +
> > + nrspin_unlock_irq(&console_lock);
> > }
>
> Hmm, I'd much prefer if we could avoid this. The functions aren't even
> static, and new uses could easily appear. Such a locking model, even
> disabling IRQs, feels pretty dangerous. (If it was to be kept, prominent
> comments would need adding imo. However, for now I'm not going to make
> any effort to verify this is actually safe, on the assumption that this
> will go away again.)
It is totally fine to get rid of it and revert back to explicit locks
outside of the console_get_domain and console_put_domain functions as it
was done in v4: https://marc.info/?l=xen-devel&m=176886821718712
However, the locked regions would still need to extended, more on this
below.
> > @@ -596,8 +604,19 @@ static void __serial_rx(char c)
> >
> > d = console_get_domain();
> > if ( !d )
> > + {
> > + console_put_domain(d);
> > return;
> > + }
> >
> > +#ifdef CONFIG_SBSA_VUART_CONSOLE
> > + /* Prioritize vpl011 if enabled for this domain */
> > + if ( d->arch.vpl011.base_addr )
> > + {
> > + /* Deliver input to the emulated UART. */
> > + rc = vpl011_rx_char_xen(d, c);
> > + } else
>
> Nit: Style.
>
> > +#endif
> > if ( is_hardware_domain(d) || IS_ENABLED(CONFIG_DOM0LESS_BOOT) )
> > {
> > /*
> > @@ -613,11 +632,6 @@ static void __serial_rx(char c)
> > */
> > send_guest_domain_virq(d, VIRQ_CONSOLE);
> > }
> > -#ifdef CONFIG_SBSA_VUART_CONSOLE
> > - else
> > - /* Deliver input to the emulated UART. */
> > - rc = vpl011_rx_char_xen(d, c);
> > -#endif
>
> I don't understand this movement, and iirc it also wasn't there in v3.
> There's no explanation in the description, unless I'm overlooking the
> crucial few words.
This chunk fixes an unrelated bug on ARM. We need to move the
CONFIG_SBSA_VUART_CONSOLE check earlier otherwise this patch will never
be taken when IS_ENABLED(CONFIG_DOM0LESS_BOOT).
I wrote a note in the changelog here:
https://marc.info/?l=xen-devel&m=176886821718712
- if vpl011 is enabled, send characters to it (retains current behavior
on ARM)
I'll be clearer in the next commit message.
> > @@ -741,17 +756,23 @@ static long
> > guest_console_write(XEN_GUEST_HANDLE_PARAM(char) buffer,
> > if ( copy_from_guest(kbuf, buffer, kcount) )
> > return -EFAULT;
> >
> > - if ( is_hardware_domain(cd) )
> > + input = console_get_domain();
> > + if ( input && cd == input )
> > {
> > + if ( cd->pbuf_idx )
> > + {
> > + console_send(cd->pbuf, cd->pbuf_idx, flags);
> > + cd->pbuf_idx = 0;
> > + }
> > /* Use direct console output as it could be interactive */
> > - nrspin_lock_irq(&console_lock);
> > console_send(kbuf, kcount, flags);
> > - nrspin_unlock_irq(&console_lock);
> > + console_put_domain(input);
> > }
> > else
> > {
> > char *kin = kbuf, *kout = kbuf, c;
> >
> > + console_put_domain(input);
> > /* Strip non-printable characters */
> > do
> > {
>
> The folding of locking into console_{get,put}_domain() again results in overly
> long windows where the "put" is still outstanding. As said before, the current
> domain can't go away behind your back.
I wrote in the commit message: "Add the console_lock around
serial_rx_prod and serial_rx_cons modifications to protect it against
concurrent writes to it. Also protect against changes of domain with
focus while reading from the serial."
Following your past review comments, I switched to using the
console_lock (folded into console_get/put_domain, but it can be
separate) to protect both serial_rx_prod/serial_rx_cons accesses and
also console_rx accesses.
> > @@ -813,6 +835,13 @@ long do_console_io(
> > if ( count > INT_MAX )
> > break;
> >
> > + d = console_get_domain();
> > + if ( d != current->domain )
> > + {
> > + console_put_domain(d);
> > + return 0;
> > + }
> > +
> > rc = 0;
> > while ( (serial_rx_cons != serial_rx_prod) && (rc < count) )
> > {
> > @@ -822,14 +851,23 @@ long do_console_io(
> > len = SERIAL_RX_SIZE - idx;
> > if ( (rc + len) > count )
> > len = count - rc;
> > +
> > + console_put_domain(d);
> > if ( copy_to_guest_offset(buffer, rc, &serial_rx_ring[idx],
> > len) )
> > {
> > rc = -EFAULT;
> > break;
> > }
> > + d = console_get_domain();
> > + if ( d != current->domain )
> > + {
> > + console_put_domain(d);
> > + break;
> > + }
> > rc += len;
> > serial_rx_cons += len;
> > }
> > + console_put_domain(d);
> > break;
>
> This is pretty horrible, don't you agree? Demonstrated by the fact that you
> look to have encoded a double put: The 2nd to last one conflicts with the
> one right after the loop. Whereas the earlier "break" has no reference at
> all, but still takes the path with the "put" right after the loop. At the
> same time it also looks wrong to simply drop that last "put".
Yes I agree it is horrible. I did a deep review of all locking scenarios
and moved console_lock out of console_get/put_domain.
I sent out an update, expanding the commit message to explain the
locking, and re-testing all scenarios on both ARM and x86.
https://marc.info/?l=xen-devel&m=176904847332141
There were at 2-3 locking issues in this version of the patch and they
have all being resolved now.