On 09/08/17 11:58, Bhupinder Thakur wrote:
Hi Julien,
Hi Bhupinder,
Thanks for the testing.
On 8 August 2017 at 21:29, Julien Grall <julien.gr...@arm.com> wrote:
Hi Bhupinder,
I gave it another try and I have a couple of comments.
Booting Linux with earlycon enabled takes quite a while. I can see the
characters coming slower than on the minitel. It seems to be a bit better
after switching off the bootconsole. Overall, Linux takes ~20 times longer
to boot with pl011 than with the HVC console.
I do agree that pl011 is emulated and therefore you have to trap after each
character. But 20 times sounds far too much.
I think this slowness could be due to rate limiting of the pl011 events
in xenconsole. Currently, the rate limit is
set to 30 events per 200 msecs (see RATE_LIMIT_ALLOWANCE/RATE_LIMIT_PERIOD).
I increased the rate limit to 600 events (30 * 20) per 200 msecs. With
this change,
I see that the find command runs faster and more smoothly.
Earlier, the find output was jerky.
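To put those numbers in perspective, here is a small sketch of the arithmetic. It only uses the two values quoted above; the helper name events_per_second is made up for illustration and is not part of xenconsoled. It also assumes the worst case where each character costs one event, which may not hold in general.

```c
/* Values quoted in this thread; the _MS suffix is added here for clarity. */
#define RATE_LIMIT_ALLOWANCE  30u   /* events allowed per period */
#define RATE_LIMIT_PERIOD_MS  200u  /* period length in milliseconds */

/*
 * Hypothetical helper: upper bound on events per second for a given
 * allowance and period. Not a function from the Xen tree.
 */
static unsigned int events_per_second(unsigned int allowance,
                                      unsigned int period_ms)
{
    return allowance * 1000u / period_ms;
}
```

With the current settings, events_per_second(RATE_LIMIT_ALLOWANCE, RATE_LIMIT_PERIOD_MS) gives 150, and with the allowance multiplied by 20 it gives 3000. If one character really does cost one event, the current limit would cap output at about 150 characters per second.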
I think there might be another solution avoiding increasing the rate limit.
If you look at the earlycon code for pl011 in Linux:
static void pl011_putc(struct uart_port *port, int c)
{
        while (readl(port->membase + UART01x_FR) & UART01x_FR_TXFF)
                cpu_relax();
        if (port->iotype == UPIO_MEM32)
                writel(c, port->membase + UART01x_DR);
        else
                writeb(c, port->membase + UART01x_DR);
        while (readl(port->membase + UART01x_FR) & UART01x_FR_BUSY)
                cpu_relax();
}
Linux will wait for the UART to be idle before sending a new character.
Now looking at the vpl011 emulation, the busy bit is set when a new
character is queued (see vpl011_write_data). This bit is only cleared
when the console daemon raises an event and the queue is empty (see
vpl011_data_avail).
This means that for earlycon, you will need a round trip Guest -> Xen ->
Dom0 -> Xen -> Guest for every single character. This is a bit
counterproductive, and the rate limit makes it even worse.
I would take a different approach on the BUSY bit. We can consider the
queue between Xen and xenconsoled as being outside of the UART. If the
character is queued, then job done. I think this would improve
performance quite a lot.
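To illustrate the difference, here is a toy model of the two BUSY policies. This is not the vpl011 code: every name (toy_uart, toy_write_data, toy_data_avail, toy_round_trips), the ring size, and the assumption that the backend drains the whole queue on each round trip are made up for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 16u /* hypothetical ring size, chosen for the sketch */

struct toy_uart {
    unsigned char ring[RING_SIZE]; /* stands in for the OUT ring */
    size_t prod, cons;             /* free-running producer/consumer indexes */
    bool busy;                     /* models UART01x_FR_BUSY */
    bool txff;                     /* models UART01x_FR_TXFF */
    bool busy_until_drained;       /* true: current policy, false: proposed */
};

static size_t toy_queued(const struct toy_uart *u)
{
    return u->prod - u->cons;
}

/* Guest writes one character. Returns false if the character is dropped. */
static bool toy_write_data(struct toy_uart *u, unsigned char c)
{
    if (toy_queued(u) == RING_SIZE)
        return false; /* ring full: the character is lost */
    u->ring[u->prod++ % RING_SIZE] = c;
    u->txff = (toy_queued(u) == RING_SIZE);
    /* Proposed policy: once queued, the UART is considered idle again. */
    u->busy = u->busy_until_drained;
    return true;
}

/* Backend consumed up to n characters and raised an event. */
static void toy_data_avail(struct toy_uart *u, size_t n)
{
    if (n > toy_queued(u))
        n = toy_queued(u);
    u->cons += n;
    u->txff = (toy_queued(u) == RING_SIZE);
    if (toy_queued(u) == 0)
        u->busy = false;
}

/*
 * Mimic the earlycon putc loop: wait for !TXFF, write, wait for !BUSY.
 * Each wait is resolved by one backend round trip that drains the queue.
 * Returns the number of round trips needed to print the string.
 */
static unsigned int toy_round_trips(struct toy_uart *u, const char *s)
{
    unsigned int trips = 0;

    for (; *s != '\0'; s++) {
        while (u->txff) {
            toy_data_avail(u, toy_queued(u));
            trips++;
        }
        toy_write_data(u, (unsigned char)*s);
        while (u->busy) {
            toy_data_avail(u, toy_queued(u));
            trips++;
        }
    }
    return trips;
}
```

In this model, printing a 10-character string with busy_until_drained set costs one round trip per character, while with it cleared the guest only has to wait once the ring fills up. The real numbers depend on the actual ring size and on how fast xenconsoled drains it.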
Also, I would append a new patch at the end of the series rather than
modify patch #1. This would avoid the need for more review :).
After that I tried to stress the emulation a bit with "find ." to get a lot
of output. And I noticed a lot of messages similar to the one below on the
Xen console:
d6v0 vpl011: Unexpected OUT ring buffer full
Along with that, characters were eaten, resulting in a nonsensical log.
A bit above the printk printing this message, there is a comment saying:
/*
 * It is expected that the ring is not full when this function is called
 * as the guest is expected to write to the data register only when the
 * TXFF flag is not set.
 * In case the guest does write even when the TXFF flag is set then the
 * data will be silently dropped.
 */
I am quite surprised that Linux is not looking at the TXFF flag. So this
needs some investigation.
I ran 'find' but could not reproduce the issue.
Sorry, I forgot to specify that you need to run find in a directory with
a lot of files. A good solution would be:
find /
Cheers,
--
Julien Grall