@Gerd

> Do you also see the slowdown without the GPU in a otherwise identical
> guest configuration?

No. Without the GPUs, the entire boot process takes less than 30 seconds, 
both with and without the dynamic mmio window size patch ( 
https://github.com/tianocore/edk2/commit/ecb778d0ac62560aa172786ba19521f27bc3f650
 ).

> Looks quite high to me.  What amount of guest memory we are talking
> about?

It is a pretty large memory allocation (over 900GB), so I'm not surprised that 
the initial allocation during `virsh start` takes a while when PCIe devices are 
passed through, since that allocation has to happen at init time. `virsh start` 
also takes the same amount of time with or without the dynamic mmio window size 
patch, but its time does scale with the amount of memory allocated. (I expect 
that, given that the time-consuming part is just that memory allocation.)

> More details would be helpful indeed.  Is that a general overall
> slowdown?  Is it some specific part which takes alot of time?

The part of the kernel boot that I highlighted in 
https://edk2.groups.io/g/devel/attachment/120801/2/this-part-takes-2-3-minutes.txt
 (which I think is PCIe device initialization and BAR assignment) is the part 
that seems slower than it should be. Each section of that log starting with 
"acpiphp: Slot <slot> registered" takes roughly 15 seconds, so the whole 
section adds up to a few minutes. That part does not scale with the amount of 
memory allocated, only with the number of GPUs passed through. (In this log, I 
had 4 GPUs attached, IIRC.)
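
For anyone who wants to measure those per-slot sections in their own dmesg 
output, here is a rough Python sketch. It assumes the usual 
`[ seconds.micros] message` timestamp prefix and treats every 
"acpiphp: Slot ... registered" line as the start of a section. The function 
name and the sample lines are made up for illustration; they are not taken 
from my log.

```python
import re

# Assumed dmesg layout: "[  seconds.micros] message".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")
# Marker that starts each section; the slot label is matched loosely.
SLOT_RE = re.compile(r"acpiphp: Slot \S+ registered")


def slot_section_durations(lines):
    """Return (marker message, seconds until the next marker) per section.

    The last section has no following marker, so it is measured up to the
    last timestamped line in the input.
    """
    markers = []
    last_ts = None
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines without a timestamp prefix
        ts, msg = float(m.group(1)), m.group(2)
        last_ts = ts
        if SLOT_RE.search(msg):
            markers.append((ts, msg))

    durations = []
    for i, (ts, msg) in enumerate(markers):
        end = markers[i + 1][0] if i + 1 < len(markers) else last_ts
        durations.append((msg, round(end - ts, 6)))
    return durations


# Hypothetical sample lines, only to show the expected input shape.
sample = [
    "[   10.000000] acpiphp: Slot [1] registered",
    "[   24.500000] placeholder line inside the first section",
    "[   25.000000] acpiphp: Slot [2] registered",
    "[   40.250000] placeholder line ending the second section",
]

for msg, secs in slot_section_durations(sample):
    print(f"{secs:8.3f}s  {msg}")
```

To run it on a real log, pass the lines of a saved dmesg dump (for example 
`open("dmesg.txt").read().splitlines()`) instead of `sample`.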

Without the dynamic mmio window size patch, if I set my guest kernel to use 
`pci=nocrs pci=realloc`, this boot slowdown disappears and I am able to use the 
GPU with some conditions (details below).

@xpahos:

> This patch adds functionality that automatically adjusts the MMIO size based 
> on the number of physical bits. As a starting point, I would try running an 
> old build of OVMF and running grep on ‘rejected’ to make sure that no GPUs 
> were taken out of service while OVMF was running.

I haven't looked for this in OVMF debug output, but what you say here seems 
realistic, given that my VMs without the dynamic mmio window size patch throw 
many errors like this during guest kernel boot:
[    4.650955] pci 0000:00:01.5: BAR 15: no space for [mem size 0x3000000000 64bit pref]
[    4.651700] pci 0000:00:01.5: BAR 15: failed to assign [mem size 0x3000000000 64bit pref]

Subsequently, the GPUs are not usable in the VMs (although the PCI devices are 
still present). So it would make sense if the fast boot time in those versions 
is simply because the kernel "gives up" on all of those BARs right away, 
before the slow path starts. The only part that confuses me is why this 
section ( 
https://edk2.groups.io/g/devel/attachment/120801/2/this-part-takes-2-3-minutes.txt
 ) is not slow when I use a version of OVMF with the dynamic mmio window 
size patch reverted but with `pci=realloc pci=nocrs` set on my guest kernel. 
Under those circumstances, I have a fast boot time and my passed-through GPUs 
work, although I do still see some output like this during Linux boot:
[    4.592009] pci 0000:06:00.0: can't claim BAR 0 [mem 0xffffffffff000000-0xffffffffffffffff 64bit pref]: no compatible bridge window
[    4.593477] pci 0000:06:00.0: can't claim BAR 2 [mem 0xffffffe000000000-0xffffffffffffffff 64bit pref]: no compatible bridge window
[    4.593817] pci 0000:06:00.0: can't claim BAR 4 [mem 0xfffffffffe000000-0xffffffffffffffff 64bit pref]: no compatible bridge window
and sometimes loading the Nvidia driver introduces some brief lockups 
( https://pastebin.ubuntu.com/p/J3TH3S7Xhd/ ).
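
To make the sizes in those kernel messages easier to read, here is a small 
Python sketch that pulls the BAR size out of either message form 
(`[mem size 0x...]` or `[mem 0xSTART-0xEND]`). The helper names are my own, 
and the regexes only cover the two formats quoted in this message.

```python
import re

# "[mem size 0x3000000000 64bit pref]" form
SIZE_RE = re.compile(r"\[mem size (0x[0-9a-fA-F]+)")
# "[mem 0xSTART-0xEND 64bit pref]" form
RANGE_RE = re.compile(r"\[mem (0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)")


def bar_size_bytes(line):
    """Return the BAR size in bytes from a kernel PCI log line, or None."""
    m = SIZE_RE.search(line)
    if m:
        return int(m.group(1), 16)
    m = RANGE_RE.search(line)
    if m:
        start, end = int(m.group(1), 16), int(m.group(2), 16)
        return end - start + 1  # the printed range is inclusive
    return None


def human(n):
    """Format a byte count using binary units."""
    for unit, shift in (("GiB", 30), ("MiB", 20), ("KiB", 10)):
        if n >= 1 << shift:
            return f"{n / (1 << shift):g} {unit}"
    return f"{n} B"


# Lines copied from the log excerpts in this message (timestamps dropped).
lines = [
    "pci 0000:00:01.5: BAR 15: no space for [mem size 0x3000000000 64bit pref]",
    "pci 0000:06:00.0: can't claim BAR 0 [mem 0xffffffffff000000-0xffffffffffffffff 64bit pref]",
    "pci 0000:06:00.0: can't claim BAR 2 [mem 0xffffffe000000000-0xffffffffffffffff 64bit pref]",
    "pci 0000:06:00.0: can't claim BAR 4 [mem 0xfffffffffe000000-0xffffffffffffffff 64bit pref]",
]

for line in lines:
    print(human(bar_size_bytes(line)), "<-", line)
```

By this arithmetic the bridge window that fails to assign (BAR 15) is 192 GiB, 
and the three GPU BARs are 16 MiB, 128 GiB and 32 MiB respectively.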

> But the linux kernel also takes a long time to initialise NVIDIA GPU using 
> SeaBIOS

This is good to know... given this and the above, I'm starting to wonder if it 
might actually be a kernel issue...

