Re: [BUG] x2apic broken with current AMD hardware
> >> I was taking a look at the BIOS manual for this motherboard and noticed
> >> a mention of a "Local APIC Mode" setting. Four values are listed:
> >> "Compatibility", "xAPIC", "x2APIC", and "Auto".
>
> This does appear to be an improvement. With this the system boots if
> the "Local APIC Mode" setting is "auto". As you may have guessed,
> "(XEN) Switched to APIC driver x2apic_phys".
>
> When I tried setting "Local APIC Mode" to "x2APIC" though things didn't
> go so well.

I confirm what Elliott said; I just tested the patch on my computer (Ryzen 7000 series).

Before, I was using the workaround "x2apic=false" in the Xen boot options.

After applying the patch, when "Local APIC Mode" is set to "auto" (or at least not "x2APIC"), it works without needing the "x2apic=false" workaround.

If "Local APIC Mode" is set to "x2APIC", then it gets stuck waiting for NVMe information, as without the patch and without the "x2apic=false" workaround.
[Patch v1] Bug fix - Integer overflow when cpu frequency > u32 max value.
xen/arch/x86/time.c: Bug fix - Integer overflow when cpu frequency > u32 max value.

What I was trying to do: I was trying to install QubesOS on my new computer (AMD Zen 4 processor). Guest VMs were unusably slow / unusable.

What the issue is: the CPU frequency reported to Linux guests in HVM and PVH mode is wrong, and it causes issues with the TSC clocksource (for example).

Why this patch solves my issue: the root cause is that "d->arch.tsc_khz" is an unsigned 32-bit integer storing the CPU frequency in kHz. It gets multiplied by 1000, so if the CPU frequency is over ~4,294 MHz (the u32 max value), the multiplication overflows. I am solving the issue by adding an explicit cast to u64 to avoid the overflow.
---
 xen/arch/x86/time.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index b01acd390d..7c77ec8902 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -2585,7 +2585,7 @@ int tsc_set_info(struct domain *d,
     case TSC_MODE_ALWAYS_EMULATE:
         d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
         d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
-        set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000);
+        set_time_scale(&d->arch.vtsc_to_ns, (u64)d->arch.tsc_khz * 1000);

         /*
          * In default mode use native TSC if the host has safe TSC and
--
2.38.1
[PATCH v2] Bug fix - Integer overflow when cpu frequency > u32 max value.
xen/x86: prevent overflow with high frequency TSCs

Promote tsc_khz to a 64-bit type before multiplying by 1000 to avoid an
'overflow before widen' bug. Otherwise just above 4.294GHz the value will
overflow. Processors with clocks this high are now in production and
require this to work correctly.

Signed-off-by: Neowutran
---
Changed since v1:
 * smaller commit message
 * using uint64_t instead of u64
 * added Signed-off-by tag
---
 xen/arch/x86/time.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index b01acd390d..c71e79e019 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -2585,7 +2585,7 @@ int tsc_set_info(struct domain *d,
     case TSC_MODE_ALWAYS_EMULATE:
         d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
         d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
-        set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000);
+        set_time_scale(&d->arch.vtsc_to_ns, (uint64_t)d->arch.tsc_khz * 1000);

         /*
          * In default mode use native TSC if the host has safe TSC and
--
2.38.1
Re: [PATCH v2] Bug fix - Integer overflow when cpu frequency > u32 max value.
On 2022-12-19 09:12, Jan Beulich wrote:
> On 18.12.2022 22:47, Elliott Mitchell wrote:
> > On Sun, Dec 18, 2022 at 01:14:07PM +0100, Neowutran wrote:
> >> xen/x86: prevent overflow with high frequency TSCs
> >>
> >> Promote tsc_khz to a 64-bit type before multiplying by 1000 to avoid an
> >> 'overflow before widen' bug.
> >> Otherwise just above 4.294GHz the value will overflow.
> >> Processors with clocks this high are now in production and require this
> >> to work correctly.
> >>
> >> Signed-off-by: Neowutran
> >
> > Needing a bit of word-wrapping, but that can be adjusted during commit to
> > the Xen tree.
>
> Right - also the first line of the body really wants to be the title.
> I'd be happy to make edits while committing, but as said in reply to
> v1 I also would prefer to suffix the literal "1000" instead of adding
> a cast. I'd also be happy to make that adjustment (including to the
> description), but I'd prefer to do so with your agreement.
>
> Jan

No problem for me. If you prefer, I can also do a v3 later today.
Re: [PATCH] x86/x2apic: introduce a mixed physical/cluster mode
On 2023-11-07 11:11, Elliott Mitchell wrote:
> On Mon, Oct 30, 2023 at 04:27:22PM +0100, Roger Pau Monné wrote:
> > On Mon, Oct 30, 2023 at 07:50:27AM -0700, Elliott Mitchell wrote:
> > > On Tue, Oct 24, 2023 at 03:51:50PM +0200, Roger Pau Monne wrote:
> > > > diff --git a/xen/arch/x86/genapic/x2apic.c b/xen/arch/x86/genapic/x2apic.c
> > > > index 707deef98c27..15632cc7332e 100644
> > > > --- a/xen/arch/x86/genapic/x2apic.c
> > > > +++ b/xen/arch/x86/genapic/x2apic.c
> > > > @@ -220,38 +239,56 @@ static struct notifier_block x2apic_cpu_nfb = {
> > > >  static int8_t __initdata x2apic_phys = -1;
> > > >  boolean_param("x2apic_phys", x2apic_phys);
> > > >
> > > > +enum {
> > > > +    unset, physical, cluster, mixed
> > > > +} static __initdata x2apic_mode = unset;
> > > > +
> > > > +static int __init parse_x2apic_mode(const char *s)
> > > > +{
> > > > +    if ( !cmdline_strcmp(s, "physical") )
> > > > +        x2apic_mode = physical;
> > > > +    else if ( !cmdline_strcmp(s, "cluster") )
> > > > +        x2apic_mode = cluster;
> > > > +    else if ( !cmdline_strcmp(s, "mixed") )
> > > > +        x2apic_mode = mixed;
> > > > +    else
> > > > +        return EINVAL;
> > > > +
> > > > +    return 0;
> > > > +}
> > > > +custom_param("x2apic-mode", parse_x2apic_mode);
> > > > +
> > > >  const struct genapic *__init apic_x2apic_probe(void)
> > > >  {
> > > > -    if ( x2apic_phys < 0 )
> > > > +    /* x2apic-mode option has preference over x2apic_phys. */
> > > > +    if ( x2apic_phys >= 0 && x2apic_mode == unset )
> > > > +        x2apic_mode = x2apic_phys ? physical : cluster;
> > > > +
> > > > +    if ( x2apic_mode == unset )
> > > >      {
> > > > -        /*
> > > > -         * Force physical mode if there's no (full) interrupt remapping support:
> > > > -         * The ID in clustered mode requires a 32 bit destination field due to
> > > > -         * the usage of the high 16 bits to hold the cluster ID.
> > > > -         */
> > > > -        x2apic_phys = iommu_intremap != iommu_intremap_full ||
> > > > -                      (acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL) ||
> > > > -                      IS_ENABLED(CONFIG_X2APIC_PHYSICAL);
> > > > -    }
> > > > -    else if ( !x2apic_phys )
> > > > -        switch ( iommu_intremap )
> > > > +        if ( acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL )
> > > >          {
> > >
> > > Could this explain the issues with recent AMD processors/motherboards?
> > >
> > > Mainly the firmware had been setting this flag, but Xen was previously
> > > ignoring it?
> >
> > No, not unless you pass {no-}x2apic_phys={false,0} on the Xen command
> > line to force logical (clustered) destination mode.
> >
> > > As such Xen had been attempting to use cluster mode on an
> > > x2APIC where that mode was broken for physical interrupts?
> >
> > No, not really, x2apic_phys was already forced to true if
> > acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL is set on the FADT (I
> > just delete that line in this same chunk and move it here).
>
> Okay, that was from a quick look at the patch. Given the symptoms and
> workaround with recent AMD motherboards, this looked suspicious.
>
> In that case it might be a bug in what AMD is providing to motherboard
> manufacturers. Mainly this bit MUST be set, but AMD's implementation
> leaves it unset.
>
> Could also be if the setup is done correctly the bit can be cleared, but
> multiple motherboard manufacturers are breaking this. Perhaps the steps
> are fragile and AMD needed to provide better guidance.
>
> Neowutran, are you still setup to and interested in doing
> experimentation/testing with Xen on your AMD computer? Would you be up
> for trying the patch here:
>
> https://lore.kernel.org/xen-devel/20231106142739.19650-1-roger@citrix.com/raw
>
> I have a suspicion this *might* fix the x2APIC issue everyone has been
> seeing.
>
> How plausible would it be to release this as a bugfix/workaround on 4.17?
> I'm expecting a "no", but figured I should ask given how widespread the
> issue is.
>
> --
> (\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
>  \BS ( | ehem+sigmsg@m5p.com PGP 87145445 | ) /
>   \_CS\ | _ -O #include O- _ | / _/
> 8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

I just applied the patch on my setup
(https://lore.kernel.org/xen-devel/20231106142739.19650-1-roger@citrix.com/raw).
It seems to fix the x2APIC issue I was having.

I only did some quick tests: I tried all the different values in my BIOS for
the x2APIC setting. Now the system successfully boots in all cases, without
needing a manual override of the x2apic_phys/x2apic_mode parameter on the
boot command line.
Re: [PATCH] x86/x2apic: introduce a mixed physical/cluster mode
On 2023-11-18 11:11, Andrew Cooper wrote:
> On 18/11/2023 3:04 am, Elliott Mitchell wrote:
> > On Fri, Nov 17, 2023 at 11:12:37AM +0100, Neowutran wrote:
> >> On 2023-11-07 11:11, Elliott Mitchell wrote:
> >>> On Mon, Oct 30, 2023 at 04:27:22PM +0100, Roger Pau Monné wrote:
> >>>> On Mon, Oct 30, 2023 at 07:50:27AM -0700, Elliott Mitchell wrote:
> >>>>> On Tue, Oct 24, 2023 at 03:51:50PM +0200, Roger Pau Monne wrote:
> >>>>>> diff --git a/xen/arch/x86/genapic/x2apic.c b/xen/arch/x86/genapic/x2apic.c
> >>>>>> index 707deef98c27..15632cc7332e 100644
> >>>>>> --- a/xen/arch/x86/genapic/x2apic.c
> >>>>>> +++ b/xen/arch/x86/genapic/x2apic.c
> >>>>>> @@ -220,38 +239,56 @@ static struct notifier_block x2apic_cpu_nfb = {
> >>>>>>  static int8_t __initdata x2apic_phys = -1;
> >>>>>>  boolean_param("x2apic_phys", x2apic_phys);
> >>>>>>
> >>>>>> +enum {
> >>>>>> +    unset, physical, cluster, mixed
> >>>>>> +} static __initdata x2apic_mode = unset;
> >>>>>> +
> >>>>>> +static int __init parse_x2apic_mode(const char *s)
> >>>>>> +{
> >>>>>> +    if ( !cmdline_strcmp(s, "physical") )
> >>>>>> +        x2apic_mode = physical;
> >>>>>> +    else if ( !cmdline_strcmp(s, "cluster") )
> >>>>>> +        x2apic_mode = cluster;
> >>>>>> +    else if ( !cmdline_strcmp(s, "mixed") )
> >>>>>> +        x2apic_mode = mixed;
> >>>>>> +    else
> >>>>>> +        return EINVAL;
> >>>>>> +
> >>>>>> +    return 0;
> >>>>>> +}
> >>>>>> +custom_param("x2apic-mode", parse_x2apic_mode);
> >>>>>> +
> >>>>>>  const struct genapic *__init apic_x2apic_probe(void)
> >>>>>>  {
> >>>>>> -    if ( x2apic_phys < 0 )
> >>>>>> +    /* x2apic-mode option has preference over x2apic_phys. */
> >>>>>> +    if ( x2apic_phys >= 0 && x2apic_mode == unset )
> >>>>>> +        x2apic_mode = x2apic_phys ? physical : cluster;
> >>>>>> +
> >>>>>> +    if ( x2apic_mode == unset )
> >>>>>>      {
> >>>>>> -        /*
> >>>>>> -         * Force physical mode if there's no (full) interrupt remapping support:
> >>>>>> -         * The ID in clustered mode requires a 32 bit destination field due to
> >>>>>> -         * the usage of the high 16 bits to hold the cluster ID.
> >>>>>> -         */
> >>>>>> -        x2apic_phys = iommu_intremap != iommu_intremap_full ||
> >>>>>> -                      (acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL) ||
> >>>>>> -                      IS_ENABLED(CONFIG_X2APIC_PHYSICAL);
> >>>>>> -    }
> >>>>>> -    else if ( !x2apic_phys )
> >>>>>> -        switch ( iommu_intremap )
> >>>>>> +        if ( acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL )
> >>>>>>          {
> >>>>> Could this explain the issues with recent AMD processors/motherboards?
> >>>>>
> >>>>> Mainly the firmware had been setting this flag, but Xen was previously
> >>>>> ignoring it?
> >>>> No, not unless you pass {no-}x2apic_phys={false,0} on the Xen command
> >>>> line to force logical (clustered) destination mode.
> >>>>
> >>>>> As such Xen had been attempting to use cluster mode on an
> >>>>> x2APIC where that mode was broken for physical interrupts?
> >>>> No, not really, x2apic_phys was already forced to true if
> >>>> acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL is set on the FADT (I
> >>>> just delete that line in this same chunk and move it here).
> >>> Okay, that was from a quick look at the patch. Given the symptoms and
> >>> workaround with recent AMD motherboards, this looked suspicious.
> >>>
> >>> In that case
Re: [PATCH] x86/x2apic: introduce a mixed physical/cluster mode
Hello,

Thanks a lot for all the details and explanations! :)

On 2023-11-27 11:11, Andrew Cooper wrote:
> On 24/11/2023 7:54 pm, Neowutran wrote:
> > Hi,
> > I did some more tests and research; indeed, this patch improved/solved
> > my specific case.
> >
> > Starting point:
> >
> > I am using Xen version 4.17.2 (exactly this source:
> > https://github.com/QubesOS/qubes-vmm-xen).
> > In the BIOS (an Asus motherboard), I configured the "local apic"
> > parameter to "X2APIC".
> > For Xen, I did not set the parameter "x2apic-mode" nor the parameter
> > "x2apic_phys".
> >
> > Case 1:
> > I tried to boot just like that; result: the system is unusably slow.
> >
> > Case 2:
> > Then, I applied a backport of the patch
> > https://lore.kernel.org/xen-devel/20231106142739.19650-1-roger@citrix.com/raw
> > to the original Xen version of QubesOS and I recompiled.
> > (https://github.com/neowutran/qubes-vmm-xen/blob/x2apic3/X2APIC.patch)
> > Result: it works, the system is usable.
> >
> > Case 3:
> > Then, I applied the patch
> > https://github.com/xen-project/xen/commit/26a449ce32cef33f2cb50602be19fcc0c4223ba9
> > to the original Xen version of QubesOS and I recompiled.
> > (https://github.com/neowutran/qubes-vmm-xen/blob/x2apic4/X2APIC.patch)
> > Result: the system is unusably slow.
> >
> > In "Case 2", the value returned by the function "apic_x2apic_probe" is
> > "&apic_x2apic_mixed".
> > In "Case 3", the value returned by the function "apic_x2apic_probe" is
> > "&apic_x2apic_cluster".
> >
> > ---
> > If you want / need, details for the function "apic_x2apic_probe":
> >
> > Known "input" values:
> >
> > "CONFIG_X2APIC_PHYSICAL" is not defined
> > "iommu_intremap == iommu_intremap_off" = false
> > "acpi_gbl_FADT.flags & ACPI_FADT_APIC_PHYSICAL" -> 0
> > "acpi_gbl_FADT.flags" = 247205 (in decimal)
> > "CONFIG_X2APIC_MIXED" is defined, because it is the default choice
> > "x2apic_mode" = 0
> > "x2apic_phys" = -1
> >
> > Trace log (I added some "printk" calls to trace what was going on):
> >
> > Case 2:
> > (XEN) NEOWUTRAN: X2APIC_MODE: 0
> > (XEN) NEOWUTRAN: X2APIC_PHYS: -1
> > (XEN) NEOWUTRAN: acpi_gbl_FADT.flags: 247205
> > (XEN) NEOWUTRAN IOMMU_INTREMAP: different
> > (XEN) Neowutran: PASSE 2
> > (XEN) Neowutran: PASSE 4
> > (XEN) NEOWUTRAN: X2APIC_MODE: 3
> > (XEN) Neowutran: PASSE 7
> > (XEN) NEOWUTRAN: X2APIC_MODE: 3
> > (XEN) NEOWUTRAN: X2APIC_PHYS: -1
> > (XEN) NEOWUTRAN: acpi_gbl_FADT.flags: 247205
> > (XEN) NEOWUTRAN IOMMU_INTREMAP: different
> >
> > Case 3:
> > (XEN) NEOWUTRAN2: X2APIC_PHYS: -1
> > (XEN) NEOWUTRAN2: acpi_gbl_FADT.flags: 247205
> > (XEN) NEOWUTRAN2 IOMMU_INTREMAP: different
> > (XEN) Neowutran2: Passe 1
> > (XEN) NEOWUTRAN2: X2APIC_PHYS: 0
> > (XEN) Neowutran2: Passe 6
> > (XEN) Neowutran2: Passe 7
> > (XEN) NEOWUTRAN2: X2APIC_PHYS: 0
> > (XEN) NEOWUTRAN2: acpi_gbl_FADT.flags: 247205
> > (XEN) NEOWUTRAN2 IOMMU_INTREMAP: different
> > (XEN) Neowutran2: Passe 2
> > (XEN) Neowutran2: Passe 4
> > (XEN) Neowutran2: Passe 7
> >
> > If you require the full logs, I could publish the full logs somewhere.
> >
> > (However, I do not understand if the root issue is a buggy motherboard,
> > a bug in Xen, or if the parameter "X2APIC_PHYSICAL" should have been
> > set by the QubesOS project, or something else.)
>
> Hello,
>
> Thank you for the analysis.
>
> For your base version of QubesOS Xen, was that 4.13.2-5? I can't see
> any APIC changes in the patchqueue, and I believe all relevant bugfixes
> are in 4.17.2, but I'd just like to confirm.

I was using the qubes-vmm-xen release "4.17.2-5" that uses Xen version
"4.17.2". I don't see custom modifications for the APIC in the patches
applied to Xen by QubesOS.

> First, by "unusably slow", other than the speed, did everything else
> appear to operate adequately? Any cha
hvmloader - pci - relevance of qemu-traditional workaround
Hello,

In QubesOS I had the following issue
(https://github.com/QubesOS/qubes-issues/issues/4321): when I create an HVM
and pass through a GPU (PCI device), it seems to trash the memory of the
guest and either prevents the HVM from starting or prevents the GPU from
being used.

Looking more at the details, it is a memory corruption issue caused by the
hvmloader of Xen. This specific issue is well documented in the Xen source
code here:
https://github.com/xen-project/xen/blob/master/tools/firmware/hvmloader/pci.c#L105

"
    /*
     * Do we allow hvmloader to relocate guest memory in order to
     * increase the size of the lowmem MMIO hole?  Defaulting to 1
     * here will mean that non-libxl toolstacks (including xend and
     * home-grown ones) means that those using qemu-xen will still
     * experience the memory relocation bug described below; but it
     * also means that those using qemu-traditional will *not*
     * experience any change; and it also means that there is a
     * work-around for those using qemu-xen, namely switching to
     * qemu-traditional.
     *
     * If we defaulted to 0, and failing to resize the hole caused any
     * problems with qemu-traditional, then there is no work-around.
     *
     * Since xend can only use qemu-traditional, I think this is the
     * option that will have the least impact.
     */
    bool allow_memory_relocate = 1;
"

and the corruption happens here:
https://github.com/xen-project/xen/blob/master/tools/firmware/hvmloader/pci.c#L355

"
    /*
     * At the moment qemu-xen can't deal with relocated memory regions.
     * It's too close to the release to make a proper fix; for now,
     * only allow the MMIO hole to grow large enough to move guest memory
     * if we're running qemu-traditional.  Items that don't fit will be
     * relocated into the 64-bit address space.
     *
     * This loop now does the following:
     * - If allow_memory_relocate, increase the MMIO hole until it's
     *   big enough, or until it's 2GiB
     * - If !allow_memory_relocate, increase the MMIO hole until it's
     *   big enough, or until it's 2GiB, or until it overlaps guest
     *   memory
     */
    while ( (mmio_total > (pci_mem_end - pci_mem_start)) &&
            ((pci_mem_start << 1) != 0) &&
            (allow_memory_relocate ||
             (((pci_mem_start << 1) >> PAGE_SHIFT) >=
              hvm_info->low_mem_pgend)) )
        pci_mem_start <<= 1;
"

I wrote a small patch
(https://github.com/QubesOS/qubes-vmm-xen/pull/172/files) to force the value
of "allow_memory_relocate" to be 0. It fixed the issue I was having and I
saw no drawback, so I am suggesting this patch to the QubesOS project.

But I have some questions:

- In the Xen source code, the default value of "allow_memory_relocate" has
  been set to 1 for qemu-traditional support. As of today, should this
  default value still be 1?
- Has something changed regarding "At the moment qemu-xen can't deal with
  relocated memory regions."? / Should something else be modified?

Many thanks,
Neowutran
hvmloader - allow_memory_relocate overlaps
I am wondering if the variable "allow_memory_relocate" is still relevant
today and if its default value is still relevant. Should it be defined to 0
by default instead of 1 (it seems to be a workaround for qemu-traditional,
so maybe an outdated default value)?

Code extract: tools/firmware/hvmloader/pci.c

"
    /*
     * Do we allow hvmloader to relocate guest memory in order to
     * increase the size of the lowmem MMIO hole?  Defaulting to 1
     * here will mean that non-libxl toolstacks (including xend and
     * home-grown ones) means that those using qemu-xen will still
     * experience the memory relocation bug described below; but it
     * also means that those using qemu-traditional will *not*
     * experience any change; and it also means that there is a
     * work-around for those using qemu-xen, namely switching to
     * qemu-traditional.
     *
     * If we defaulted to 0, and failing to resize the hole caused any
     * problems with qemu-traditional, then there is no work-around.
     *
     * Since xend can only use qemu-traditional, I think this is the
     * option that will have the least impact.
     */
    bool allow_memory_relocate = 1;
"

"
    /*
     * At the moment qemu-xen can't deal with relocated memory regions.
     * It's too close to the release to make a proper fix; for now,
     * only allow the MMIO hole to grow large enough to move guest memory
     * if we're running qemu-traditional.  Items that don't fit will be
     * relocated into the 64-bit address space.
     *
     * This loop now does the following:
     * - If allow_memory_relocate, increase the MMIO hole until it's
     *   big enough, or until it's 2GiB
     * - If !allow_memory_relocate, increase the MMIO hole until it's
     *   big enough, or until it's 2GiB, or until it overlaps guest
     *   memory
     */
    while ( (mmio_total > (pci_mem_end - pci_mem_start)) &&
            ((pci_mem_start << 1) != 0) &&
            (allow_memory_relocate ||
             (((pci_mem_start << 1) >> PAGE_SHIFT) >=
              hvm_info->low_mem_pgend)) )
        pci_mem_start <<= 1;
"

The issue it causes is documented in the source code: guest memory can be
trashed by the hvmloader. Due to this issue, it is impossible to pass
through a large PCI device to an HVM with a lot of RAM
(https://github.com/QubesOS/qubes-issues/issues/4321).
(Forcing "allow_memory_relocate" to be 0 seems to solve the linked issue.)

Thanks,
Neowutran
Re: hvmloader - allow_memory_relocate overlaps
On 2023-12-19 17:12, Jan Beulich wrote:
> On 16.12.2023 08:01, Neowutran wrote:
> > I am wondering if the variable "allow_memory_relocate" is still
> > relevant today and if its default value is still relevant.
> > Should it be defined to 0 by default instead of 1 (it seems to be a
> > workaround for qemu-traditional, so maybe an outdated default value ?) ?
>
> So are you saying you use qemu-trad?

No, I am using "qemu_xen" (from xenstore-read -> 'device-model = "qemu_xen"').

> Otherwise isn't libxl suppressing this behavior anyway?

If by "isn't libxl suppressing this behavior" you mean "is libxl setting
the value of 'allow_memory_relocate'", then the answer is no.

Following this lead, I checked in which code paths "allow_memory_relocate"
could be defined. It is only defined in one code path: in the file
"tools/libs/light/libxl_dm.c", in the function
"void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss)":

'''
// ...
if (b_info->type == LIBXL_DOMAIN_TYPE_HVM) {
    // ...
    libxl__xs_printf(gc, XBT_NULL,
                     GCSPRINTF("%s/hvmloader/allow-memory-relocate", path),
                     "%d",
                     b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL &&
                     !libxl__vnuma_configured(b_info));
// ...
'''

However, on QubesOS (my system), "local_dm" is never used; "stub_dm" is
always used. In the function
"void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)",
the value of "allow_memory_relocate" is never defined.

I tried to patch the code to define "allow_memory_relocate" in
"libxl__spawn_stub_dm":

'''
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -2431,6 +2431,10 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
                          libxl__xs_get_dompath(gc, guest_domid)),
                          "%s",
                          libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
+        libxl__xs_printf(gc, XBT_NULL,
+                         libxl__sprintf(gc, "%s/hvmloader/allow-memory-relocate", libxl__xs_get_dompath(gc, guest_domid)),
+                         "%d",
+                         0);
     }
     ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
     if (ret<0) {
'''

And it is indeed another way to solve my issue. However, I do not
understand the Xen hypervisor enough to know if "allow-memory-relocate"
should have been defined in "libxl__spawn_stub_dm" or if the real issue is
somewhere else.

> Or did that code go stale?
>
> > Code extract:
> >
> > tools/firmware/hvmloader/pci.c
> > "
> >     /*
> >      * Do we allow hvmloader to relocate guest memory in order to
> >      * increase the size of the lowmem MMIO hole?  Defaulting to 1
> >      * here will mean that non-libxl toolstacks (including xend and
> >      * home-grown ones) means that those using qemu-xen will still
> >      * experience the memory relocation bug described below; but it
> >      * also means that those using qemu-traditional will *not*
> >      * experience any change; and it also means that there is a
> >      * work-around for those using qemu-xen, namely switching to
> >      * qemu-traditional.
> >      *
> >      * If we defaulted to 0, and failing to resize the hole caused any
> >      * problems with qemu-traditional, then there is no work-around.
> >      *
> >      * Since xend can only use qemu-traditional, I think this is the
> >      * option that will have the least impact.
> >      */
> >     bool allow_memory_relocate = 1;
> > "
> >
> > "
> >     /*
> >      * At the moment qemu-xen can't deal with relocated memory regions.
> >      * It's too close to the release to make a proper fix; for now,
> >      * only allow the MMIO hole to grow large enough to move guest memory
> >      * if we're running qemu-traditional.  Items that don't fit will be
> >      * relocated into the 64-bit address space.
> >      *
> >      * This loop now does the following:
> >      * - If allow_memory_relocate, increase the MMIO hole until it's
> >      *   big enough, or until it's 2GiB
> >      * - If !allow_memory_relocate, increase the MMIO hole until i
Re: hvmloader - allow_memory_relocate overlaps
> > '''
> >
> > And the associated result is:
> >
> > (d22) NEOWUTRAN pci.c: pci_mem_end: -67108864
> > (d22) NEOWUTRAN pci.c: pci_mem_start: -268435456
> > (d22) NEOWUTRAN pci.c: allow_memory_relocate: 0
> > (d22) NEOWUTRAN pci.c: hvm_info->low_mem_pgend: 983040
> > (d22) NEOWUTRAN pci.c: HVM_LOADER would have tried to relocate guest memory
> > (d22) NEOWUTRAN pci.c: pci_mem_start: -268435456
> >
> > So if "allow_memory_relocate" was not defined to 0, the hvmloader
> > would have tried to overlap guest memory, and it seems that is
> > something that qemu_xen cannot handle.
>
> Well, memory addresses printed in decimal are pretty hard to work with.
> But I'd like to ask anyway that you supply all of the log messages for
> such a guest starting, to be able to correlate what you've added with
> other items also logged.

Full logs with my patch to set allow-memory-relocate
(https://github.com/neowutran/qubes-vmm-xen/commit/819705bc346cad14836fd523195ad2b0445330ac):
https://pastebin.com/9kQgvraK
(GPU passthrough works; hvmloader doesn't overlap guest memory.)

Full logs without my patch to set allow-memory-relocate
(https://github.com/neowutran/qubes-vmm-xen/blob/allowmemoryrelocate/ALLOWMEMORYRELOCATE.patch):
https://pastebin.com/gQGg55WZ
(GPU passthrough doesn't work; hvmloader overlaps guest memory.)

> Jan

Thanks,
Neowutran
Re: [PATCH] libxl: Disable relocating memory for qemu-xen in stubdomain too
On 2023-12-27 03:12, Marek Marczykowski-Górecki wrote:
> According to comments (and experiments) qemu-xen cannot handle memory
> relocation done by hvmloader. The code was already disabled when running
> qemu-xen in dom0 (see libxl__spawn_local_dm()), but it was missed when
> adding qemu-xen support to stubdomain. Adjust libxl__spawn_stub_dm() to
> be consistent in this regard.
>
> Reported-by: Neowutran
> Signed-off-by: Marek Marczykowski-Górecki
> ---
>  tools/libs/light/libxl_dm.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
> index 14b593110f7c..ed620a9d8e14 100644
> --- a/tools/libs/light/libxl_dm.c
> +++ b/tools/libs/light/libxl_dm.c
> @@ -2432,6 +2432,16 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
>                           "%s",
>                           libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
>      }
> +    /* Disable relocating memory to make the MMIO hole larger
> +     * unless we're running qemu-traditional and vNUMA is not
> +     * configured. */
> +    libxl__xs_printf(gc, XBT_NULL,
> +                     libxl__sprintf(gc, "%s/hvmloader/allow-memory-relocate",
> +                                    libxl__xs_get_dompath(gc, guest_domid)),
> +                     "%d",
> +                     guest_config->b_info.device_model_version
> +                        == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL &&
> +                     !libxl__vnuma_configured(&guest_config->b_info));
>      ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
>      if (ret<0) {
>          LOGED(ERROR, guest_domid, "setting target domain %d -> %d",
> --
> 2.41.0

Seems to work as expected.

Thanks,
Neowutran
Re: hvmloader - allow_memory_relocate overlaps
On 2024-02-07 16:02, Jan Beulich wrote:
> On 04.01.2024 14:16, Jan Beulich wrote:
> > On 22.12.2023 16:49, Neowutran wrote:
> >> Full logs without my patch to set allow-memory-relocate
> >> (https://github.com/neowutran/qubes-vmm-xen/blob/allowmemoryrelocate/ALLOWMEMORYRELOCATE.patch)
> >> https://pastebin.com/gQGg55WZ
> >> (GPU passthrough doesn't work, hvmloader overlaps with guest memory)
> >
> > So there are oddities, but I can't spot any overlaps. What's odd is that
> > the two blocks already above 4Gb are accounted for (and later relocated)
> > when calculating total MMIO size. BARs of size 2Gb and more shouldn't be
> > accounted for at all when deciding whether low RAM needs relocating, as
> > those can't live below 4Gb anyway. I vaguely recall pointing this out
> > years ago, but it was thought we'd get away for a fair while. What's
> > further odd is where the two blocks are moved to: F80 moves (down)
> > to C0, while the smaller FC0 moves further up to FC8.
> >
> > I'll try to get to addressing at least the first oddity; if I can figure
> > out why the second one occurs, I may try to address that as well.
>
> Could you give the patch below a try? I don't have a device with large
> enough a BAR that I could sensibly pass through to a guest, so I was
> only able to build-test the change.

Hi, and thanks. I just tested it; it indeed works well when the GPU has a
BAR larger than 1GB.

Setup: I removed my patch (

--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -2431,6 +2431,10 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
                          libxl__xs_get_dompath(gc, guest_domid)),
                          "%s",
                          libxl_bios_type_to_string(guest_config->b_info.u.hvm.bios));
+        libxl__xs_printf(gc, XBT_NULL,
+                         libxl__sprintf(gc, "%s/hvmloader/allow-memory-relocate", libxl__xs_get_dompath(gc, guest_domid)),
+                         "%d",
+                         0);
     }
     ret = xc_domain_set_target(ctx->xch, dm_domid, guest_domid);
     if (ret<0) {

) and applied your suggested "skip huge BARs" patch.

My GPU: an NVIDIA 4080.

When the option "Resizable BAR support" is activated in my BIOS, the BAR1
size of my GPU is reported as 16GB. With this patch, GPU passthrough works.

When the option "Resizable BAR support" is deactivated in my BIOS, the BAR1
size of my GPU is reported as 256MB. With this patch, GPU passthrough
doesn't work (same crash as before).

(Note: the option "Resizable BAR support" may or may not exist depending on
the motherboard model; on some it is 'hardcoded' to activated, on some it is
'hardcoded' to deactivated.)

> Jan
>
> hvmloader/PCI: skip huge BARs in certain calculations
>
> BARs of size 2Gb and up can't possibly fit below 4Gb: Both the bottom of
> the lower 2Gb range and the top of the higher 2Gb range have special
> purpose. Don't even have them influence whether to (perhaps) relocate
> low RAM.
>
> Reported-by: Neowutran
> Signed-off-by: Jan Beulich
> ---
> If we wanted to fit e.g. multiple 1Gb BARs, it would likely be prudent
> to similarly avoid low RAM relocation in the first place. Yet accounting
> for things differently depending on how many large BARs there are would
> require more intrusive code changes.
>
> That said, I'm open to further lowering of the threshold. That'll
> require different justification then, though.
>
> --- a/tools/firmware/hvmloader/pci.c
> +++ b/tools/firmware/hvmloader/pci.c
> @@ -33,6 +33,13 @@ uint32_t pci_mem_start = HVM_BELOW_4G_MM
>  const uint32_t pci_mem_end = RESERVED_MEMBASE;
>  uint64_t pci_hi_mem_start = 0, pci_hi_mem_end = 0;
>
> +/*
> + * BARs larger than this value are put in 64-bit space unconditionally. That
> + * is, such BARs also don't play into the determination of how big the lowmem
> + * MMIO hole needs to be.
> + */
> +#define HUGE_BAR_THRESH GB(1)
> +
>  enum virtual_vga virtual_vga = VGA_none;
>  unsigned long igd_opregion_pgbase = 0;
>
> @@ -286,9 +293,11 @@ void pci_setup(void)
>          bars[i].bar_reg = bar_reg;
>          bars[i].bar_sz = bar_sz;
>
> -        if ( ((bar_data & PCI_BASE_ADDRESS_SPACE) ==
> -              PCI_BASE_ADDRESS_SPACE_MEMORY) ||
> -             (bar_reg == PCI_ROM_ADDRESS) )
> +        if ( is_64bar && bar_sz > HUGE_BAR_THRESH )
> +            bar64_relocate = 1;
> +        else