On Tue, 22 Feb 2022 11:00:55 +0000 Joao Martins <joao.m.mart...@oracle.com> wrote:
> On 2/21/22 15:28, Joao Martins wrote:
> > On 2/21/22 06:58, Igor Mammedov wrote:
> >> On Fri, 18 Feb 2022 17:12:21 +0000
> >> Joao Martins <joao.m.mart...@oracle.com> wrote:
> >>
> >>> On 2/14/22 15:31, Igor Mammedov wrote:
> >>>> On Mon, 14 Feb 2022 15:05:00 +0000
> >>>> Joao Martins <joao.m.mart...@oracle.com> wrote:
> >>>>> On 2/14/22 14:53, Igor Mammedov wrote:
> >>>>>> On Mon, 7 Feb 2022 20:24:20 +0000
> >>>>>> Joao Martins <joao.m.mart...@oracle.com> wrote:
> >>>>>>> +{
> >>>>>>> +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> >>>>>>> +    X86MachineState *x86ms = X86_MACHINE(pcms);
> >>>>>>> +    ram_addr_t device_mem_size = 0;
> >>>>>>> +    uint32_t eax, vendor[3];
> >>>>>>> +
> >>>>>>> +    host_cpuid(0x0, 0, &eax, &vendor[0], &vendor[2], &vendor[1]);
> >>>>>>> +    if (!IS_AMD_VENDOR(vendor)) {
> >>>>>>> +        return;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    if (pcmc->has_reserved_memory &&
> >>>>>>> +       (machine->ram_size < machine->maxram_size)) {
> >>>>>>> +        device_mem_size = machine->maxram_size - machine->ram_size;
> >>>>>>> +    }
> >>>>>>> +
> >>>>>>> +    if ((x86ms->above_4g_mem_start + x86ms->above_4g_mem_size +
> >>>>>>> +         device_mem_size) < AMD_HT_START) {
> >>>>>>
> >>>>> And I was at two minds on this one, whether to advertise *always*
> >>>>> the 1T hole, regardless of relocation. Or account the size
> >>>>> we advertise for the pci64 hole and make that part of the equation
> >>>>> above. Although that has the flaw that the firmware at admin request
> >>>>> may pick some ludicrous number (limited by maxphysaddr).
> >>>>
> >>>> at this point we have only pci64 hole size (machine property),
> >>>> so I'd include that in equation to make firmware assign
> >>>> pci64 aperture above HT range.
> >>>>
> >>>> as for checking maxphysaddr, we can only check 'default' PCI hole
> >>>> range at this stage (i.e. 1Gb aligned hole size after all possible RAM)
> >>>> and hard error on it.
> >>>>
> >>>
> >>> Igor, in the context of your comment above, I'll be introducing another
> >>> preparatory patch that adds up pci_hole64_size to pc_memory_init() such
> >>> that all used/max physaddr space checks are consolidated in
> >>> pc_memory_init().
> >>>
> >>> To that end, the changes mainly involve moving the pcihost qdev creation
> >>> to be before pc_memory_init(). Q35 just needs a 2-line order change.
> >>> i440fx needs slightly more of a dance to extract that from i440fx_init()
> >>> and also because most i440fx state is private (hence the new helper for
> >>> size). But the actual initialization of the i440fx/q35 pci host is still
> >>> after pc_memory_init(); it is just to extract pci_hole64_size from the
> >>> object + user passed args (-global etc).
> >>
> >> Shuffling init order looks too intrusive and in practice
> >> quite risky.
> >
> > Yeah, it is an intrusive change sadly. Although, why would you consider it
> > risky (curious)? We are "only" moving this:
> >
> >     qdev_new(host_type);
> >
> > ... located at the very top of i440fx_init() and called at the top of
> > q35_host initialization to be instead before pc_memory_init(). And that
> > means that an instance of an object gets made and its properties
> > initialized, i.e. @instance_init of q35 and i440fx and its properties.
> > I don't see a particular dependence in PC code to tell that this
> > would affect its surrounding parts.

I don't see anything wrong here as well
(but I'm probably overcautious, since more often than not
changing initialization order has broken things in non-obvious ways)

> >
> > The actual pcihost-related initialization is still kept entirely unchanged.
> >
> >> How about moving maxphysaddr check to pc_machine_done() instead?
> >> (this way you won't have to move pcihost around)
> >>
> > I can move it. But there will be a slight disconnect between what
> > pc_memory_init() checks against "max used address" between ... dictating
> > if the 4G mem start should change to 1T or not ... and when the phys-bits
> > check is actually made which includes the pci hole64.
> >
> > For example, we create a guest with maxram 1009G (for which the 4G mem
> > start would be left unchanged) and then the pci_hole64 likely gets
> > assigned across the rest until 1023G (i.e. across the HT region). Here it
> > would need an extra check and fail if pci_hole64 crosses the HT region.
> > Whereby if it was added in pc_memory_init() then we could just relocate
> > to 1T and the guest creation would still proceed.
> >
> Actually, on a second thought, not passing pci_hole64_size to
> pc_memory_init(), and instead introducing the check in pc_machine_done()
> to include pci_hole64_size, looks like a half-step :( because otherwise
> the user needs to play games on how much it should include as
> -m|-object-memory* number to force it to relocate to 1T and avoid the
> guest creation failure.
>
> So either we:
>
> 1) consider pci_hole64_size in pc_memory_init() and move default
> pci-hole64-size (sort of what I was proposing in this last exchange)

ok, let's go with this approach

>
> 2) keep the maxphysaddr check as is (but generic in pc_memory_init()
> and disregarding pci-hole64-size) and advertise the actual 1T reserved hole
> (regardless of above-4G relocation) letting BIOS consider reserved regions
> when picking hole64-start.
>
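
For reference, option 1) could be sketched roughly as below. This is only
an illustration under stated assumptions, not the actual patch: struct
mem_layout, max_used_gpa() and relocate_if_crossing_ht() are made-up names,
and the two constants assume the HT reserved range is the last 12G below 1T.
The comparison mirrors the "< AMD_HT_START" test in the quoted hunk, with
the 1G-aligned pci hole64 added to the sum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Assumption: the HT reserved range is the last 12G below 1T. */
#define AMD_HT_START        (1012 * GiB)
#define AMD_ABOVE_1TB_START (1024 * GiB)

/* Hypothetical stand-in for the machine state fields used by the check. */
struct mem_layout {
    uint64_t above_4g_mem_start;
    uint64_t above_4g_mem_size;
    uint64_t device_mem_size;   /* maxram_size - ram_size, or 0 */
    uint64_t pci_hole64_size;   /* machine property / -global value */
};

/* Round v up to the next multiple of align. */
static uint64_t round_up(uint64_t v, uint64_t align)
{
    assert(align != 0);
    return (v + align - 1) / align * align;
}

/*
 * End of the guest physical address space in use: all possible RAM,
 * followed by the default 1G-aligned pci hole64.
 */
static uint64_t max_used_gpa(const struct mem_layout *m)
{
    uint64_t ram_end = m->above_4g_mem_start + m->above_4g_mem_size +
                       m->device_mem_size;

    return round_up(ram_end, GiB) + m->pci_hole64_size;
}

/*
 * Move the above-4G RAM start to 1T when RAM plus pci hole64 would
 * reach the HT range. Returns true if the layout was relocated.
 */
static bool relocate_if_crossing_ht(struct mem_layout *m)
{
    if (max_used_gpa(m) < AMD_HT_START) {
        return false;
    }

    m->above_4g_mem_start = AMD_ABOVE_1TB_START;
    return true;
}
```

With the 1009G example from above (RAM ending at 1009G), a zero-sized
hole64 leaves the layout alone, while a 16G hole64 pushes the end past the
HT start and triggers the relocation to 1T, so guest creation can proceed
instead of failing later.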