[qemu-mainline test] 171676: tolerable FAIL - PUSHED

2022-07-19 Thread osstest service owner
flight 171676 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171676/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 171648
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 171648
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171648
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 171648
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 171648
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 171648
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 171648
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 171648
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-xl-vhd 15 saverestore-support-check fail never pass
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail never pass

version targeted for testing:
 qemuu                782378973121addeb11b13fd12a6ac2e69faa33f
baseline version:
 qemuu                0ebf76aae58324b8f7bf6af798696687f5f4c2a9

Last test of basis   171648  2022-07-16 03:15:54 Z    3 days
Failing since        171670  2022-07-18 15:38:37 Z    0 days    2 attempts
Testing same since   171676  2022-07-18 22:40:16 Z    0 days    1 attempts


People who touched revisions under test:

RE: [PATCH v3 1/3] x86/vmx: implement VMExit based guest Bus Lock detection

2022-07-19 Thread Tian, Kevin
> From: Roger Pau Monne 
> Sent: Friday, July 1, 2022 9:17 PM
> 
> @@ -4065,6 +4065,11 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
> 
>      if ( unlikely(exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) )
>          return vmx_failed_vmentry(exit_reason, regs);

Add a blank line.

> +    if ( unlikely(exit_reason & VMX_EXIT_REASONS_BUS_LOCK) )
> +    {
> +        perfc_incr(buslock);
> +        exit_reason &= ~VMX_EXIT_REASONS_BUS_LOCK;
> +    }
> 
>      if ( v->arch.hvm.vmx.vmx_realmode )
>      {
> @@ -4561,6 +4566,15 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>          vmx_handle_descriptor_access(exit_reason);
>          break;
> 
> +    case EXIT_REASON_BUS_LOCK:
> +        /*
> +         * Nothing to do: just taking a vmexit should be enough of a pause to
> +         * prevent a VM from crippling the host with bus locks.  Note
> +         * EXIT_REASON_BUS_LOCK will always have bit 26 set in exit_reason, and
> +         * hence the perf counter is already increased.
> +         */
> +        break;
> +

Would it be helpful from a diagnostic angle to throw out a warning,
once per culprit domain?


RE: [PATCH v3 1/3] x86/vmx: implement VMExit based guest Bus Lock detection

2022-07-19 Thread Tian, Kevin
> From: Roger Pau Monné 
> Sent: Monday, July 4, 2022 6:07 PM
> 
> On Mon, Jul 04, 2022 at 11:27:37AM +0200, Jan Beulich wrote:
> > On 01.07.2022 15:16, Roger Pau Monne wrote:
> > > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > > @@ -4065,6 +4065,11 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
> > >
> > >      if ( unlikely(exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) )
> > >          return vmx_failed_vmentry(exit_reason, regs);
> > > +    if ( unlikely(exit_reason & VMX_EXIT_REASONS_BUS_LOCK) )
> > > +    {
> > > +        perfc_incr(buslock);
> > > +        exit_reason &= ~VMX_EXIT_REASONS_BUS_LOCK;
> > > +    }
> >
> > To cover for the flag bit, don't you also need to mask it off in
> > nvmx_idtv_handling()? Or (didn't go into detail with checking whether
> > there aren't any counter indications) pass the exit reason there from
> > vmx_vmexit_handler(), instead of re-reading it from the VMCS?
> 
> This seems to be an existing issue with nvmx_idtv_handling(), as it
> should use just the low 16 bits to check against the VM Exit reason
> codes.
> 
> I can send a pre-patch to fix it and could pass the exit reason from
> vmx_vmexit_handler(), but I would still need to cast to uint16_t for
> comparing against exit reason codes, as there's a jump into the 'out'
> label before VMX_EXIT_REASONS_BUS_LOCK is masked out.

Or just mask out the bit in an earlier place, which then also
covers nvmx_n2_vmexit_handler() below? There are a few other
gotos and returns before the point where that bit is currently
masked out. Having bus locks counted even in those failure paths
is also not a bad thing imho...

> 
> I think there's a similar issue with nvmx_n2_vmexit_handler() that
> doesn't cast the value to uint16_t and is called before
> VMX_EXIT_REASONS_BUS_LOCK is removed from exit reason.
> 




RE: [PATCH v3 2/3] x86/vmx: introduce helper to set VMX_INTR_SHADOW_NMI

2022-07-19 Thread Tian, Kevin
> From: Roger Pau Monne 
> Sent: Friday, July 1, 2022 9:17 PM
> 
> @@ -225,6 +225,9 @@ static inline void pi_clear_sn(struct pi_desc *pi_desc)
> 
>  /*
>   * Interruption-information format
> + *
> + * Note INTR_INFO_NMI_UNBLOCKED_BY_IRET is also used with Exit Qualification
> + * field under some circumstances.
>   */
>  #define INTR_INFO_VECTOR_MASK   0xff    /* 7:0 */
>  #define INTR_INFO_INTR_TYPE_MASK    0x700   /* 10:8 */

Overall this is good. But I wonder whether the readability would be slightly
better by defining a dedicated flag bit for the exit qualification, with a clear
comment on which events it makes sense to...


Re: [RFC PATCH 0/2] Yocto Gitlab CI support

2022-07-19 Thread Bertrand Marquis
Hi Christopher,

> On 19 Jul 2022, at 05:34, Christopher Clark  
> wrote:
> 
> On Thu, Jul 14, 2022 at 3:10 AM Bertrand Marquis
>  wrote:
>> 
>> This patch series is a first attempt to check if we could use Yocto in
>> gitlab ci to build and run xen on qemu for arm, arm64 and x86.
> 
> Hi Bertrand, thanks for posting this.
> 
> I'm still making my way through it, and should be able to speak more
> to the OE/Yocto aspects than the Xen automation integration but at
> first pass, I think that this is work in the right direction.
> A few quick early points:
> - The build-yocto.sh script is clear to understand, which is helpful.

Thanks

> - The layers that you have selected to include in the build are good.
> Might be worth considering using openembedded-core, which is poky's
> upstream, but I think either is a valid choice.

That was how I did it first, but packing them in one call reduces the
number of intermediate images during creation. If having more is OK,
I can split this in v2.

>- listing the layers one-per-line in the script might make it
> easier to patch in additional layers downstream, if needed
> - The target image of 'xen-image-minimal' is the right start; it would
> be nice to be able to pass that as an input from the dockerfile to
> allow for using this with other images.

Using a different image might also trigger other changes (for example
if you want to build xen-guest-image-minimal then you do not need the
same features).
Anyway I will add a parameter to build-yocto.sh to do it.

> - Possibly worth mentioning somewhere in the series description that
> this introduces coverage for x86-64 but not 32-bit x86 guests - it's
> the right choice given that this is just booting to a dom0.

I will add something saying that it does not cover booting guests (yet!),
but the 32-bit guest issue is also true for arm64, so mentioning it only
for x86 would be weird.

Cheers
Bertrand

> 
> Christopher
> 
>> The first patch is creating a container with all elements required to
>> build Yocto, a checkout of the yocto layers required and a helper
>> script to build and run xen on qemu with yocto.
>> 
>> The second patch is creating containers with a first build of yocto done
>> so that subsequent builds with those containers would only rebuild what
>> was changed and take the rest from the cache.
>> 
>> This is mainly for discussion and sharing as there are still some
>> issues/problem to solve:
>> - building the qemu* containers can take several hours depending on the
>>  network bandwidth and computing power of the machine where those are
>>  created
>> - produced containers containing the cache have a size between 8 and
>>  12GB depending on the architecture. We might need to store the build
>>  cache somewhere else to reduce the size. If we choose to have one
>>  single image, the needed size is around 20GB and we need up to 40GB
>>  during the build, which is why I split them.
>> - during the build and run, we use a bit more then 20GB of disk which is
>>  over the allowed size in gitlab
>> 
>> Once all problems passed, this can be used to build and run dom0 on qemu
>> with a modified Xen on the 3 archs in less than 10 minutes.
>> 
>> Bertrand Marquis (2):
>>  automation: Add elements for Yocto test and run
>>  automation: Add yocto containers with cache
>> 
>> automation/build/Makefile |   2 +
>> automation/build/yocto/build-yocto.sh | 322 ++
>> .../build/yocto/kirkstone-qemuarm.dockerfile  |  28 ++
>> .../yocto/kirkstone-qemuarm64.dockerfile  |  28 ++
>> .../yocto/kirkstone-qemux86-64.dockerfile |  28 ++
>> automation/build/yocto/kirkstone.dockerfile   |  98 ++
>> 6 files changed, 506 insertions(+)
>> create mode 100755 automation/build/yocto/build-yocto.sh
>> create mode 100644 automation/build/yocto/kirkstone-qemuarm.dockerfile
>> create mode 100644 automation/build/yocto/kirkstone-qemuarm64.dockerfile
>> create mode 100644 automation/build/yocto/kirkstone-qemux86-64.dockerfile
>> create mode 100644 automation/build/yocto/kirkstone.dockerfile
>> 
>> --
>> 2.25.1
>> 
>> 




Re: [PATCH v2 0/4] tools/xenstore: add some new features to the documentation

2022-07-19 Thread Julien Grall

Hi Jan,

On 19/07/2022 06:58, Jan Beulich wrote:

On 18.07.2022 18:28, Julien Grall wrote:

On 18/07/2022 17:12, Jan Beulich wrote:

On 27.05.2022 09:24, Juergen Gross wrote:

As you committed, I would be OK if this is addressed in a follow-up
series. But this *must* be addressed by the time 4.17 is released
because otherwise we will commit ourselves to a broken interface. @Henry,
please add this in the blocker list.


If you hadn't answered, I would have reverted these right away this
morning, in particular to remove the (now wrong) feature bit part
(patches 2 and 3 have dropped their feature bit additions in v2).
If you nevertheless think an incremental update is going to be okay,
I'll leave things alone. But personally I think this mistake of mine
would better be corrected immediately.


I wasn't arguing against a revert and it looks like Juergen is away for 
the next 2 weeks. So if you prefer to correct the mistake now, then 
please revert it.


Cheers,

--
Julien Grall



RE: [PATCH v3 3/3] x86/vmx: implement Notify VM Exit

2022-07-19 Thread Tian, Kevin
> From: Roger Pau Monne 
> Sent: Friday, July 1, 2022 9:17 PM
> @@ -4589,6 +4601,22 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
>           */
>          break;
> 
> +    case EXIT_REASON_NOTIFY:
> +        __vmread(EXIT_QUALIFICATION, &exit_qualification);
> +
> +        if ( exit_qualification & NOTIFY_VM_CONTEXT_INVALID )
> +        {

if ( unlikely() )

Apart from that:

Reviewed-by: Kevin Tian 


Re: [PATCH v1 01/18] kconfig: allow configuration of maximum modules

2022-07-19 Thread Jan Beulich
On 06.07.2022 23:04, Daniel P. Smith wrote:
> --- a/xen/arch/Kconfig
> +++ b/xen/arch/Kconfig
> @@ -17,3 +17,15 @@ config NR_CPUS
> For CPU cores which support Simultaneous Multi-Threading or similar
> technologies, this the number of logical threads which Xen will
> support.
> +
> +config NR_BOOTMODS
> + int "Maximum number of boot modules that a loader can pass"
> + range 1 32768
> + default "8" if X86
> + default "32" if ARM

Any reason for the larger default on Arm, irrespective of dom0less
actually being in use? (I'm actually surprised I can't spot a Kconfig
option controlling inclusion of dom0less. The default here imo isn't
supposed to depend on the architecture, but on whether dom0less is
supported. That way if another arch gained dom0less support, the
higher default would apply to it without needing further adjustment.)

> --- a/xen/arch/x86/efi/efi-boot.h
> +++ b/xen/arch/x86/efi/efi-boot.h
> @@ -18,7 +18,7 @@ static multiboot_info_t __initdata mbi = {
>   * The array size needs to be one larger than the number of modules we
>   * support - see __start_xen().
>   */
> -static module_t __initdata mb_modules[5];
> +static module_t __initdata mb_modules[CONFIG_NR_BOOTMODS + 1];

If the build admin selected 1, I'm pretty sure next to nothing would work.
I think you want max(5, CONFIG_NR_BOOTMODS) or
max(4, CONFIG_NR_BOOTMODS) + 1 here and ...

> --- a/xen/arch/x86/guest/xen/pvh-boot.c
> +++ b/xen/arch/x86/guest/xen/pvh-boot.c
> @@ -32,7 +32,7 @@ bool __initdata pvh_boot;
>  uint32_t __initdata pvh_start_info_pa;
>  
>  static multiboot_info_t __initdata pvh_mbi;
> -static module_t __initdata pvh_mbi_mods[8];
> +static module_t __initdata pvh_mbi_mods[CONFIG_NR_BOOTMOD + 1];

... max(8, CONFIG_NR_BOOTMODS) here (albeit the 8 may have room for
lowering - I don't recall why 8 was chosen rather than going with
the minimum possible value covering all module kinds known at that
time).

Jan



Re: [PATCH v1 06/18] fdt: make fdt handling reusable across arch

2022-07-19 Thread Jan Beulich
On 06.07.2022 23:04, Daniel P. Smith wrote:
> This refactors reusable code from Arm's bootfdt.c and device-tree.h that is
> general fdt handling code.  The Kconfig parameter CORE_DEVICE_TREE is
> introduced for when the ability of parsing DTB files is needed by a capability
> such as hyperlaunch.
> 
> Signed-off-by: Daniel P. Smith 
> Reviewed-by: Christopher Clark 
> ---
>  xen/arch/arm/bootfdt.c| 115 +
>  xen/common/Kconfig|   4 ++
>  xen/common/Makefile   |   3 +-
>  xen/common/fdt.c  | 131 ++
>  xen/include/xen/device_tree.h |  50 +
>  xen/include/xen/fdt.h |  79 
>  6 files changed, 218 insertions(+), 164 deletions(-)
>  create mode 100644 xen/common/fdt.c
>  create mode 100644 xen/include/xen/fdt.h

I think this wants to be accompanied by an update to ./MAINTAINERS,
so maintainership doesn't silently transition to THE REST.

I further think that the moved code would want to have its style adjusted
to match present guidelines - I've noticed a number of u<N> uses which
should be uint<N>_t. I didn't look closely to see whether other style
violations are also retained in the moved code.

Jan



[xen-unstable-smoke test] 171682: tolerable all pass - PUSHED

2022-07-19 Thread osstest service owner
flight 171682 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171682/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass

version targeted for testing:
 xen  9723507daf2120131410c91980d4e4d9b0d0aa90
baseline version:
 xen  0e60f1d9d1970cae49ee9d03f5759f44afc1fdee

Last test of basis   171673  2022-07-18 19:00:24 Z    0 days
Testing same since   171682  2022-07-19 07:06:21 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   0e60f1d9d1..9723507daf  9723507daf2120131410c91980d4e4d9b0d0aa90 -> smoke



Re: [PATCH v2 4/4] vpci: include xen/vmap.h to fix build on ARM

2022-07-19 Thread Volodymyr Babchuk


Hello Jan,

Jan Beulich  writes:

> On 18.07.2022 23:15, Volodymyr Babchuk wrote:
>> Patch b4f211606011 ("vpci/msix: fix PBA accesses") introduced call to
>> iounmap(), but not added corresponding include.
>> 
>> Fixes: b4f211606011 ("vpci/msix: fix PBA accesses")
>
> I don't think there's any active issue with the "missing" include:
> That's only a problem once Arm has vPCI code enabled? In which
> case I don't think a Fixes: tag is warranted.
>

Fair enough. May I ask the committer to drop this tag?

>> Signed-off-by: Volodymyr Babchuk 
>
> With Roger away and on the basis that I'm sure we won't mind the
> change:
> Acked-by: Jan Beulich 

Thank you,

-- 
Volodymyr Babchuk at EPAM


Re: [PATCH v2 4/4] vpci: include xen/vmap.h to fix build on ARM

2022-07-19 Thread Jan Beulich
On 19.07.2022 12:32, Volodymyr Babchuk wrote:
> Jan Beulich  writes:
> 
>> On 18.07.2022 23:15, Volodymyr Babchuk wrote:
>>> Patch b4f211606011 ("vpci/msix: fix PBA accesses") introduced call to
>>> iounmap(), but not added corresponding include.
>>>
>>> Fixes: b4f211606011 ("vpci/msix: fix PBA accesses")
>>
>> I don't think there's any active issue with the "missing" include:
>> That's only a problem once Arm has vPCI code enabled? In which
>> case I don't think a Fixes: tag is warranted.
> 
> Fair enough. May I ask the committer to drop this tag?

I had taken respective note already, in case I end up committing this.
But this is the last patch of the series, so I can only guess whether
it might be okay to go in ahead of the other three patches.

Jan



[xen-unstable test] 171678: tolerable FAIL - PUSHED

2022-07-19 Thread osstest service owner
flight 171678 xen-unstable real [real]
flight 171684 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/171678/
http://logs.test-lab.xenproject.org/osstest/logs/171684/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-credit1 22 guest-start/debian.repeat fail pass in 171684-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 171672
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 171672
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171672
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 171672
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 171672
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 171672
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 171672
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 171672
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 171672
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 171672
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 171672
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 171672
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass

[libvirt test] 171680: regressions - FAIL

2022-07-19 Thread osstest service owner
flight 171680 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171680/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-libvirt   6 libvirt-build fail REGR. vs. 151777
 build-amd64-libvirt   6 libvirt-build fail REGR. vs. 151777
 build-i386-libvirt    6 libvirt-build fail REGR. vs. 151777
 build-armhf-libvirt   6 libvirt-build fail REGR. vs. 151777

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  9e8601c46482664a543e35545a921c96ad9adbc6
baseline version:
 libvirt  2c846fa6bcc11929c9fb857a22430fb9945654ad

Last test of basis   151777  2020-07-10 04:19:19 Z  739 days
Failing since        151818  2020-07-11 04:18:52 Z  738 days  720 attempts
Testing same since   171680  2022-07-19 04:20:27 Z    0 days    1 attempts


People who touched revisions under test:
Adolfo Jayme Barrientos 
  Aleksandr Alekseev 
  Aleksei Zakharov 
  Amneesh Singh 
  Andika Triwidada 
  Andrea Bolognani 
  Andrew Melnychenko 
  Ani Sinha 
  Balázs Meskó 
  Barrett Schonefeld 
  Bastian Germann 
  Bastien Orivel 
  BiaoXiang Ye 
  Bihong Yu 
  Binfeng Wu 
  Bjoern Walk 
  Boris Fiuczynski 
  Brad Laue 
  Brian Turek 
  Bruno Haible 
  Chris Mayo 
  Christian Borntraeger 
  Christian Ehrhardt 
  Christian Kirbach 
  Christian Schoenebeck 
  Christophe Fergeau 
  Claudio Fontana 
  Cole Robinson 
  Collin Walling 
  Cornelia Huck 
  Cédric Bosdonnat 
  Côme Borsoi 
  Daniel Henrique Barboza 
  Daniel Letai 
  Daniel P. Berrange 
  Daniel P. Berrangé 
  David Michael 
  Didik Supriadi 
  dinglimin 
  Divya Garg 
  Dmitrii Shcherbakov 
  Dmytro Linkin 
  Eiichi Tsukata 
  Emilio Herrera 
  Eric Farman 
  Erik Skultety 
  Fabian Affolter 
  Fabian Freyer 
  Fabiano Fidêncio 
  Fangge Jin 
  Farhan Ali 
  Fedora Weblate Translation 
  Florian Schmidt 
  Franck Ridel 
  Gavi Teitz 
  gongwei 
  Guoyi Tu
  Göran Uddeborg 
  Halil Pasic 
  Han Han 
  Hao Wang 
  Haonan Wang 
  Hela Basa 
  Helmut Grohne 
  Hiroki Narukawa 
  Hyman Huang(黄勇) 
  Ian Wienand 
  Ioanna Alifieraki 
  Ivan Teterevkov 
  Jakob Meng 
  Jamie Strandboge 
  Jamie Strandboge 
  Jan Kuparinen 
  jason lee 
  Jean-Baptiste Holcroft 
  Jia Zhou 
  Jianan Gao 
  Jim Fehlig 
  Jin Yan 
  Jing Qi 
  Jinsheng Zhang 
  Jiri Denemark 
  Joachim Falk 
  John Ferlan 
  John Levon 
  John Levon 
  Jonathan Watt 
  Jonathon Jongsma 
  Julio Faracco 
  Justin Gatzen 
  Ján Tomko 
  Kashyap Chamarthy 
  Kevin Locke 
  Kim InSoo 
  Koichi Murase 
  Kristina Hanicova 
  Laine Stump 
  Laszlo Ersek 
  Lee Yarwood 
  Lei Yang 
  Lena Voytek 
  Liang Yan 
  Liang Yan 
  Liao Pingfang 
  Lin Ma 
  Lin Ma 
  Lin Ma 
  Liu Yiding 
  Lubomir Rintel 
  Luke Yue 
  Luyao Zhong 
  luzhipeng 
  Marc Hartmayer 
  Marc-André Lureau 
  Marek Marczykowski-Górecki 
  Mark Mielke 
  Markus Schade 
  Martin Kletzander 
  Martin Pitt 
  Masayoshi Mizuma 
  Matej Cepl 
  Matt Coleman 
  Matt Coleman 
  Mauro Matteo Cascella 
  Max Goodhart 
  Maxim Nestratov 
  Meina Li 
  Michal Privoznik 
  Michał Smyk 
  Milo Casagrande 
  Moshe Levi 
  Moteen Shah 
  Moteen Shah 
  Muha Aliss 
  Nathan 
  Neal Gompa 
  Nick Chevsky 
  Nick Shyrokovskiy 
  Nickys Music Group 
  Nico Pache 
  Nicolas Lécureuil 
  Nicolas Lécureuil 
  Nikolay Shirokovskiy 
  Nikolay Shirokovskiy 
  Nikolay Shirokovskiy 
  Niteesh Dubey 
  Olaf Hering 
  Olesya Gerasimenko 
  Or Ozeri 
  Orion Poplawski 
  Pany 
  Paolo Bonzini 
  Patrick Magauran 
  Paulo de Rezende Pinatti 
  Pavel H

Fwd: Ping²: [PATCH] x86: enable interrupts around dump_execstate()

2022-07-19 Thread Jan Beulich
Henry,

 Forwarded Message 
Subject: Ping²: [PATCH] x86: enable interrupts around dump_execstate()
Date: Tue, 5 Jul 2022 18:19:38 +0200
From: Jan Beulich 
To: Andrew Cooper , Roger Pau Monné 

CC: Wei Liu , xen-devel@lists.xenproject.org 


>On 11.01.2022 11:08, Jan Beulich wrote:
>> On 16.12.2021 14:33, Jan Beulich wrote:
>>> On 16.12.2021 12:54, Andrew Cooper wrote:
 On 13/12/2021 15:12, Jan Beulich wrote:
> show_hvm_stack() requires interrupts to be enabled to avoid triggering
> the consistency check in check_lock() for the p2m lock. To do so in
> spurious_interrupt() requires adding reentrancy protection / handling
> there.
>
> Fixes: adb715db698b ("x86/HVM: also dump stacks from 
> show_execution_state()")
> Signed-off-by: Jan Beulich 
>> 
>> There's a bug here which we need to deal with one way or another.
>> May I please ask for a response to the issues pointed out with
>> what you said in your earlier reply?
>
>I sincerely hope we won't ship another major version with this
>issue unfixed. The only option beyond applying this patch that I'm
>aware of is to revert the commit pointed at by Fixes:, which imo
>would be a shame (moving us further away from proper PVH support,
>including Dom0).

Perhaps another item for the list of things needing resolution for
the release.

Jan



Re: [PATCH] x86: enable interrupts around dump_execstate()

2022-07-19 Thread Andrew Cooper
On 16/12/2021 13:33, Jan Beulich wrote:
> On 16.12.2021 12:54, Andrew Cooper wrote:
>> On 13/12/2021 15:12, Jan Beulich wrote:
>>> show_hvm_stack() requires interrupts to be enabled to avoid triggering
>>> the consistency check in check_lock() for the p2m lock. To do so in
>>> spurious_interrupt() requires adding reentrancy protection / handling
>>> there.
>>>
>>> Fixes: adb715db698b ("x86/HVM: also dump stacks from 
>>> show_execution_state()")
>>> Signed-off-by: Jan Beulich 
>>> ---
>>> The obvious (but imo undesirable) alternative is to suppress the call to
>>> show_hvm_stack() when interrupts are disabled.
>> show_execution_state() needs to work in any context including the #DF
>> handler,
> Why? There's no show_execution_state() on that path.

Yes there is - it's reachable from any BUG().

It's also reachable on the NMI path via fatal_trap().

Talking of, didn't you say you'd found an unexplained deadlock with NMI
handling... ?

>
>> and
>>
>>     /*
>>  * Stop interleaving prevention: The necessary P2M lookups
>>  * involve locking, which has to occur with IRQs enabled.
>>  */
>>     console_unlock_recursive_irqrestore(flags);
>>     
>>     show_hvm_stack(curr, regs);
>>
>> is looking distinctly dodgy...
> Well, yes, it does.

Because it is.

You cannot enable interrupts here, because you have no clue if it is safe
to do so.

What you are doing is creating yet another instance of the broken
pattern we already have with shutdown trying to move itself to CPU0,
that occasionally explodes in the middle of a context switch.

Furthermore, show_execution_state() is already broken for any path
where interrupts are already disabled, including but not limited to the
one you've found here.

adb715db698bc8ec3b88c24eb88b21e9da5b6c07 is broken and needs reverting.

No amount of playing games with irqs here is going to improve things.

~Andrew


Re: [PATCH] x86: enable interrupts around dump_execstate()

2022-07-19 Thread Jan Beulich
On 19.07.2022 13:22, Andrew Cooper wrote:
> On 16/12/2021 13:33, Jan Beulich wrote:
>> On 16.12.2021 12:54, Andrew Cooper wrote:
>>> On 13/12/2021 15:12, Jan Beulich wrote:
 show_hvm_stack() requires interrupts to be enabled to avoid triggering
 the consistency check in check_lock() for the p2m lock. To do so in
 spurious_interrupt() requires adding reentrancy protection / handling
 there.

 Fixes: adb715db698b ("x86/HVM: also dump stacks from 
 show_execution_state()")
 Signed-off-by: Jan Beulich 
 ---
 The obvious (but imo undesirable) alternative is to suppress the call to
 show_hvm_stack() when interrupts are disabled.
>>> show_execution_state() needs to work in any context including the #DF
>>> handler,
>> Why? There's no show_execution_state() on that path.
> 
> Yes there is - it's reachable from any BUG().

"That path" was really referring to you mentioning #DF.

> It's also reachable on the NMI path via fatal_trap().
> 
> Talking of, didn't you say you'd found an unexplained deadlock with NMI
> handling... ?

Entirely unrelated to this, but yes.

>>> and
>>>
>>>     /*
>>>  * Stop interleaving prevention: The necessary P2M lookups
>>>  * involve locking, which has to occur with IRQs enabled.
>>>  */
>>>     console_unlock_recursive_irqrestore(flags);
>>>     
>>>     show_hvm_stack(curr, regs);
>>>
>>> is looking distinctly dodgy...
>> Well, yes, it does.
> 
> Because it is.
> 
> You cannot enable interrupts here, because you have no clue if it is safe
> to do so.

We're not enabling interrupts here (if "here" is referring to the
quoted piece of code), we're merely restoring them. When they were
off before, they will continue to be off. (In that light calling
show_hvm_stack() is then still wrong in that case.)

If, otoh, you're talking about what the patch is doing, then
we're in an IRQ handler, so context outside of the IRQ must have
had IRQs enabled.

> What you are doing is creating yet another instance of the broken
> pattern we already have with shutdown trying to move itself to CPU0,
> that occasionally explodes in the middle of a context switch.
> 
> Furthermore, show_execution_state() is already broken for any path
> where interrupts are already disabled, including but not limited to the
> one you've found here.
> 
> adb715db698bc8ec3b88c24eb88b21e9da5b6c07 is broken and needs reverting.

Well, okay - but what's the plan then to achieve the intended
functionality?

The suggested alternative with the patch submission (to skip
show_hvm_stack() when IRQs are off) is probably necessary anyway
due to above observation (if we wouldn't outright revert), but
won't get us very far.

Jan



RE: Ping²: [PATCH] x86: enable interrupts around dump_execstate()

2022-07-19 Thread Henry Wang
Hi Jan,

> -Original Message-
> From: Jan Beulich 
> Henry,
> 
>  Forwarded Message 
> Subject: Ping²: [PATCH] x86: enable interrupts around dump_execstate()
> 
> >On 11.01.2022 11:08, Jan Beulich wrote:
> >> On 16.12.2021 14:33, Jan Beulich wrote:
> >>> On 16.12.2021 12:54, Andrew Cooper wrote:
>  On 13/12/2021 15:12, Jan Beulich wrote:
> > show_hvm_stack() requires interrupts to be enabled to avoid
> triggering
> > the consistency check in check_lock() for the p2m lock. To do so in
> > spurious_interrupt() requires adding reentrancy protection / handling
> > there.
> >
> > Fixes: adb715db698b ("x86/HVM: also dump stacks from
> show_execution_state()")
> > Signed-off-by: Jan Beulich 
> >>
> >> There's a bug here which we need to deal with one way or another.
> >> May I please ask for a response to the issues pointed out with
> >> what you said in your earlier reply?
> >
> >I sincerely hope we won't ship another major version with this
> >issue unfixed. The only option beyond applying this patch that I'm
> >aware of is to revert the commit pointed at by Fixes:, which imo
> >would be a shame (moving us further away from proper PVH support,
> >including Dom0).
> 
> perhaps another item for the list of things needing resolution for
> the release.

Many thanks for this information! I can see this thread is quite old and
probably even predates me becoming the release manager, so thanks for your
effort in finding this :))

Yes of course, I've added this series to my blockers list and I will start to
track it so that we can have proper resolution for the 4.17 release.

Kind regards,
Henry

> 
> Jan


[PATCH v2] x86/PV: issue branch prediction barrier when switching 64-bit guest to kernel mode

2022-07-19 Thread Jan Beulich
Since both kernel and user mode run in ring 3, they run in the same
"predictor mode". While the kernel could take care of this itself, doing
so would be yet another item distinguishing PV from native. Additionally
we're in a much better position to issue the barrier command, and we can
save a #GP (for privileged instruction emulation) this way.

To allow recovering performance, introduce a new VM assist allowing the
guest kernel to suppress this barrier.

Signed-off-by: Jan Beulich 
---
v2: Leverage entry-IBPB. Add VM assist. Re-base.
---
I'm not entirely happy with re-using opt_ibpb_ctxt_switch here (it's a
mode switch after all, but v1 used opt_ibpb here), but it also didn't
seem very reasonable to introduce yet another command line option. The
only feasible alternative I would see is to check the CPUID bits directly.

--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -757,7 +757,8 @@ static inline void pv_inject_sw_interrup
  * but we can't make such requests fail all of the sudden.
  */
 #define PV64_VM_ASSIST_MASK (PV32_VM_ASSIST_MASK  | \
- (1UL << VMASST_TYPE_m2p_strict))
+ (1UL << VMASST_TYPE_m2p_strict)  | \
+ (1UL << VMASST_TYPE_mode_switch_no_ibpb))
 #define HVM_VM_ASSIST_MASK  (1UL << VMASST_TYPE_runstate_update_flag)
 
 #define arch_vm_assist_valid_mask(d) \
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -467,7 +467,15 @@ void toggle_guest_mode(struct vcpu *v)
 if ( v->arch.flags & TF_kernel_mode )
 v->arch.pv.gs_base_kernel = gs_base;
 else
+{
 v->arch.pv.gs_base_user = gs_base;
+
+if ( opt_ibpb_ctxt_switch &&
+ !(d->arch.spec_ctrl_flags & SCF_entry_ibpb) &&
+ !VM_ASSIST(d, mode_switch_no_ibpb) )
+wrmsrl(MSR_PRED_CMD, PRED_CMD_IBPB);
+}
+
 asm volatile ( "swapgs" );
 
 _toggle_guest_pt(v);
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -571,6 +571,16 @@ DEFINE_XEN_GUEST_HANDLE(mmuext_op_t);
  */
 #define VMASST_TYPE_m2p_strict   32
 
+/*
+ * x86-64 guests: Suppress IBPB on guest-user to guest-kernel mode switch.
+ *
+ * By default (on affected and capable hardware) as a safety measure Xen,
+ * to cover for the fact that guest-kernel and guest-user modes are both
+ * running in ring 3 (and hence share prediction context), would issue a
+ * barrier for user->kernel mode switches of PV guests.
+ */
+#define VMASST_TYPE_mode_switch_no_ibpb  33
+
 #if __XEN_INTERFACE_VERSION__ < 0x00040600
 #define MAX_VMASST_TYPE  3
 #endif



[PATCH] x86emul: add memory operand low bits checks for ENQCMD{,S}

2022-07-19 Thread Jan Beulich
Already ISE rev 044 added text to this effect; rev 045 further dropped
leftover earlier text indicating the contrary:
- ENQCMD requires the low 32 bits of the memory operand to be clear,
- ENQCMDS requires bits 20...30 of the memory operand to be clear.

Signed-off-by: Jan Beulich 
---
I'm a little reluctant to add a Fixes: tag here, because at the time
the code was written the behavior was matching what was documented.

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -10499,6 +10499,7 @@ x86_emulate(
 goto done;
 if ( vex.pfx == vex_f2 ) /* enqcmd */
 {
+generate_exception_if(mmvalp->data32[0], EXC_GP, 0);
 fail_if(!ops->read_msr);
 if ( (rc = ops->read_msr(MSR_PASID, &msr_val,
  ctxt)) != X86EMUL_OKAY )
@@ -10506,7 +10507,8 @@ x86_emulate(
 generate_exception_if(!(msr_val & PASID_VALID), EXC_GP, 0);
 mmvalp->data32[0] = MASK_EXTR(msr_val, PASID_PASID_MASK);
 }
-mmvalp->data32[0] &= ~0x7ff0;
+else
+generate_exception_if(mmvalp->data32[0] & 0x7ff0, EXC_GP, 0);
 state->blk = blk_enqcmd;
 if ( (rc = ops->blk(x86_seg_es, src.val, mmvalp, 64, &_regs.eflags,
 state, ctxt)) != X86EMUL_OKAY )



Re: [PATCH v1 02/18] introduction of generalized boot info

2022-07-19 Thread Jan Beulich
On 06.07.2022 23:04, Daniel P. Smith wrote:
> --- /dev/null
> +++ b/xen/arch/x86/include/asm/bootinfo.h
> @@ -0,0 +1,48 @@
> +#ifndef __ARCH_X86_BOOTINFO_H__
> +#define __ARCH_X86_BOOTINFO_H__
> +
> +/* unused for x86 */
> +struct arch_bootstring { };
> +
> +struct __packed arch_bootmodule {
> +#define BOOTMOD_FLAG_X86_RELOCATED  1U << 0

Such macro expansions need parenthesizing.

> +uint32_t flags;
> +uint32_t headroom;
> +};

Since you're not following any external spec, on top of what Julien
said about the __packed attribute I'd also like to point out that
in many cases here there's no need to use fixed-width types.

> +struct __packed arch_boot_info {
> +uint32_t flags;
> +#define BOOTINFO_FLAG_X86_MEMLIMITS  1U << 0
> +#define BOOTINFO_FLAG_X86_BOOTDEV1U << 1
> +#define BOOTINFO_FLAG_X86_CMDLINE1U << 2
> +#define BOOTINFO_FLAG_X86_MODULES1U << 3
> +#define BOOTINFO_FLAG_X86_AOUT_SYMS  1U << 4
> +#define BOOTINFO_FLAG_X86_ELF_SYMS   1U << 5
> +#define BOOTINFO_FLAG_X86_MEMMAP 1U << 6
> +#define BOOTINFO_FLAG_X86_DRIVES 1U << 7
> +#define BOOTINFO_FLAG_X86_BIOSCONFIG 1U << 8
> +#define BOOTINFO_FLAG_X86_LOADERNAME 1U << 9
> +#define BOOTINFO_FLAG_X86_APM1U << 10
> +
> +bool xen_guest;

As an example of this: with just the header files being introduced
here, it is not really possible to figure out what these fields are to
be used for, and hence whether they're legitimately represented here.

> +char *boot_loader_name;
> +char *kextra;

const?

Jan



Re: [PATCH 0/3] x86: make pat and mtrr independent from each other

2022-07-19 Thread Chuck Zmudzinski
On 7/18/2022 7:32 AM, Chuck Zmudzinski wrote:
> On 7/17/2022 3:55 AM, Thorsten Leemhuis wrote:
> > Hi Juergen!
> >
> > On 15.07.22 16:25, Juergen Gross wrote:
> > > Today PAT can't be used without MTRR being available, unless MTRR is at
> > > least configured via CONFIG_MTRR and the system is running as Xen PV
> > > guest. In this case PAT is automatically available via the hypervisor,
> > > but the PAT MSR can't be modified by the kernel and MTRR is disabled.
> > > 
> > > As an additional complexity the availability of PAT can't be queried
> > > via pat_enabled() in the Xen PV case, as the lack of MTRR will set PAT
> > > to be disabled. This leads to some drivers believing that not all cache
> > > modes are available, resulting in failures or degraded functionality.
> > > 
> > > The same applies to a kernel built with no MTRR support: it won't
> > > allow to use the PAT MSR, even if there is no technical reason for
> > > that, other than setting up PAT on all cpus the same way (which is a
> > > requirement of the processor's cache management) is relying on some
> > > MTRR specific code.
> > > 
> > > Fix all of that by:
> > > 
> > > - moving the function needed by PAT from MTRR specific code one level
> > >   up
> > > - adding a PAT indirection layer supporting the 3 cases "no or disabled
> > >   PAT", "PAT under kernel control", and "PAT under Xen control"
> > > - removing the dependency of PAT on MTRR
> >
> > Thx for working on this. If you need to respin these patches for one
> > reason or another, could you do me a favor and add proper 'Link:' tags
> > pointing to all reports about this issue? e.g. like this:
> >
> >  Link: https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
> >
> > These tags are considered important by Linus[1] and others, as they
> > allow anyone to look into the backstory weeks or years from now. That is
> > why they should be placed in cases like this, as
> > Documentation/process/submitting-patches.rst and
> > Documentation/process/5.Posting.rst explain in more detail. I care
> > personally, because these tags make my regression tracking efforts a
> > whole lot easier, as they allow my tracking bot 'regzbot' to
> > automatically connect reports with patches posted or committed to fix
> > tracked regressions.
> >
> > [1] see for example:
> > https://lore.kernel.org/all/CAHk-=wjMmSZzMJ3Xnskdg4+GGz=5p5p+gsyyfbth0f-dgvd...@mail.gmail.com/
> > https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nowvkvzjpm3vfu1zobp37fwd_h9iad...@mail.gmail.com/
> > https://lore.kernel.org/all/CAHk-=wjxzafG-=j8ot30s7upn4rhbs6tx-uvfz5rme+l5_d...@mail.gmail.com/
> >
> > Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> >
>
> I echo Thorsten's thx for starting on this now instead of waiting until
> September which I think is when Juergen said he could start working
> on this last week. I agree with Thorsten that Link tags are needed.
> Since multiple patches have been proposed to fix this regression,
> perhaps a Link to each proposed patch, and a note that
> the original report identified a specific commit which when reverted
> also fixes it. IMO, this is all part of the backstory Thorsten refers to.
>
> It looks like with this approach, a fix will not be coming real soon,
> and Borislav Petkov also discouraged me from testing this
> patch set until I receive a ping telling me it is ready for testing,
> which seems to confirm that this regression will not be fixed
> very soon. Please correct me if I am wrong about how long
> it will take to fix it with this approach.
>
> Also, is there any guarantee this approach is endorsed by
> all the maintainers who will need to sign-off, especially
> Linus? I say this because some of the discussion on the
> earlier proposed patches makes me doubt this. I am especially
> referring to this discussion:
>
> https://lore.kernel.org/lkml/4c8c9d4c-1c6b-8e9f-fa47-918a64898...@leemhuis.info/
>
> and also, here:
>
> https://lore.kernel.org/lkml/ysrjx%2fu1xn8rq...@zn.tnic/
>
> where Borislav Petkov argues that Linux should not be
> patched at all to fix this regression but instead the fix
> should come by patching the Xen hypervisor.
>
> So I have several questions, presuming at least the fix is going
> to be delayed for some time, and also presuming this approach
> is not yet an approach that has the blessing of the maintainers
> who will need to sign-off:
>
> 1. Can you estimate when the patch series will be ready for
> testing and suitable for a prepatch or RC release?
>
> 2. Can you estimate when the patch series will be ready to be
> merged into the mainline release? Is there any hope it will be
> fixed before the next longterm release hosted on kernel.org?
>
> 3. Since a fix is likely not coming soon, can you explain
> why the commit that was mentioned in the original
> report cannot be reverted as a temporary solution while
> we wait for the full fix to come later? I can say that
> reverting that commit (It was a commit affecting
> drm/i915) 

Re: [PATCH v1 03/18] x86: adopt new boot info structures

2022-07-19 Thread Jan Beulich
On 06.07.2022 23:04, Daniel P. Smith wrote:
> This commit replaces the use of the multiboot v1 structures starting
> at __start_xen(). The majority of this commit is converting the fields
> being accessed for the startup calculations. While adapting the ucode
> boot module location logic, this code was refactored to reduce some
> of the unnecessary complexity.

Things like this or ...

> --- a/xen/arch/x86/bzimage.c
> +++ b/xen/arch/x86/bzimage.c
> @@ -69,10 +69,8 @@ static __init int bzimage_check(struct setup_header *hdr, 
> unsigned long len)
>  return 1;
>  }
>  
> -static unsigned long __initdata orig_image_len;
> -
> -unsigned long __init bzimage_headroom(void *image_start,
> -  unsigned long image_length)
> +unsigned long __init bzimage_headroom(
> +void *image_start, unsigned long image_length)
>  {
>  struct setup_header *hdr = (struct setup_header *)image_start;
>  int err;
> @@ -91,7 +89,6 @@ unsigned long __init bzimage_headroom(void *image_start,
>  if ( elf_is_elfbinary(image_start, image_length) )
>  return 0;
>  
> -orig_image_len = image_length;
>  headroom = output_length(image_start, image_length);
>  if (gzip_check(image_start, image_length))
>  {
> @@ -104,12 +101,15 @@ unsigned long __init bzimage_headroom(void *image_start,
>  return headroom;
>  }
>  
> -int __init bzimage_parse(void *image_base, void **image_start,
> - unsigned long *image_len)
> +int __init bzimage_parse(
> +void *image_base, void **image_start, unsigned int headroom,
> +unsigned long *image_len)
>  {
>  struct setup_header *hdr = (struct setup_header *)(*image_start);
>  int err = bzimage_check(hdr, *image_len);
> -unsigned long output_len;
> +unsigned long output_len, orig_image_len;
> +
> +orig_image_len = *image_len - headroom;
>  
>  if ( err < 0 )
>  return err;
> @@ -125,7 +125,7 @@ int __init bzimage_parse(void *image_base, void 
> **image_start,
>  
>  BUG_ON(!(image_base < *image_start));
>  
> -output_len = output_length(*image_start, orig_image_len);
> +output_len = output_length(*image_start, *image_len);
>  
>  if ( (err = perform_gunzip(image_base, *image_start, orig_image_len)) > 
> 0 )
>  err = decompress(*image_start, orig_image_len, image_base);

... whatever the deal is here want factoring out. Also you want to avoid
making formatting changes (like in the function headers here) in an
already large patch, when you don't otherwise touch the functions. I'm
not even convinced the formatting changes are desirable here, so I'd
like to ask that even on code you do touch for other reasons you do so
only if the existing layout ends up really awkward.

I have not looked in any further detail at this patch, sorry. Together
with my comment on the earlier patch I conclude that it might be best
if you moved things to the new representation field by field (or set of
related fields), introducing the new fields in the abstraction struct
as they are being made use of.

Jan



Re: [PATCH v1 05/18] x86: refactor xen cmdline into general framework

2022-07-19 Thread Jan Beulich
On 06.07.2022 23:04, Daniel P. Smith wrote:
> --- a/xen/include/xen/bootinfo.h
> +++ b/xen/include/xen/bootinfo.h
> @@ -53,6 +53,17 @@ struct __packed boot_info {
>  
>  extern struct boot_info *boot_info;
>  
> +static inline char *bootinfo_prepare_cmdline(struct boot_info *bi)
> +{
> +bi->cmdline = arch_bootinfo_prepare_cmdline(bi->cmdline, bi->arch);
> +
> +if ( *bi->cmdline == ' ' )
> +printk(XENLOG_WARNING "%s: leading whitespace left on cmdline\n",
> +   __func__);

Just a remark and a question on this one: I don't view the use of
__func__ here (and in fact in many other cases as well) as very
useful. And why do we need such a warning all of a sudden in the
first place?

Jan



Re: [PATCH v1 08/18] kconfig: introduce domain builder config option

2022-07-19 Thread Jan Beulich
On 06.07.2022 23:04, Daniel P. Smith wrote:
> --- /dev/null
> +++ b/xen/common/domain-builder/Kconfig
> @@ -0,0 +1,15 @@
> +
> +menu "Domain Builder Features"
> +
> +config BUILDER_FDT
> + bool "Domain builder device tree (UNSUPPORTED)" if UNSUPPORTED
> + select CORE_DEVICE_TREE
> + ---help---

Nit: No new ---help--- please anymore.

> +   Enables the ability to configure the domain builder using a
> +   flattened device tree.

Is this about both Dom0 and DomU? Especially if not, this wants making
explicit. But perhaps even if so it wants saying, for the avoidance of
doubt.

Jan



[PATCH 1/1] OvmfPkg/XenPvBlkDxe: Fix memory barrier macro

2022-07-19 Thread Anthony PERARD
From: Anthony PERARD 

The macro "xen_mb()" needs to be a full memory barrier, that is, it
needs to also prevent stores from being reordered after loads, which an
x86 CPU can do (as I understand from reading [1]). So this patch makes
use of the "mfence" instruction.

Currently, there's a good chance that OvmfXen hangs in
XenPvBlockSync(), in an infinite loop, waiting for the last request to
be consumed by the backend. On the other hand, the backend doesn't know
there is a new request and doesn't do anything. This is because there
are two ways the backend looks for requests: either it's working on one
and uses RING_FINAL_CHECK_FOR_REQUESTS(), or it's waiting for an
event/notification. So the frontend (OvmfXen) doesn't need to send
a notification if the backend is already working; checking whether a
notification is needed is done by RING_PUSH_REQUESTS_AND_CHECK_NOTIFY().

That last macro is where the order of store vs load is important: the
macro first stores the fact that there's a new request, then loads the
value of the last event the backend has processed to check whether an
asynchronous notification is needed. If that store and load are
reordered, OvmfXen could take the wrong decision of not sending a
notification, and both sides just wait.

To fix this, we need to tell the CPU to not reorder stores after loads.

The AArch64 implementation of MemoryFence() uses "dmb sy", which seems
to prevent any reordering.

[1] https://en.wikipedia.org/wiki/Memory_ordering#Runtime_memory_ordering

Signed-off-by: Anthony PERARD 
---

I'm not sure what the correct implementation on MSFT would be:
_ReadWriteBarrier() seems to be only a compiler barrier, and I don't
know whether _mm_mfence() is just "mfence" or whether it acts as a
compiler barrier as well.

Cc: Ard Biesheuvel 
Cc: Jiewen Yao 
Cc: Jordan Justen 
Cc: Gerd Hoffmann 
Cc: Julien Grall 
---
 OvmfPkg/XenPvBlkDxe/XenPvBlkDxe.inf  |  8 ++
 OvmfPkg/XenPvBlkDxe/FullMemoryFence.h| 27 
 OvmfPkg/XenPvBlkDxe/XenPvBlkDxe.h|  3 ++-
 OvmfPkg/XenPvBlkDxe/X86GccFullMemoryFence.c  | 20 +++
 OvmfPkg/XenPvBlkDxe/X86MsftFullMemoryFence.c | 22 
 5 files changed, 79 insertions(+), 1 deletion(-)
 create mode 100644 OvmfPkg/XenPvBlkDxe/FullMemoryFence.h
 create mode 100644 OvmfPkg/XenPvBlkDxe/X86GccFullMemoryFence.c
 create mode 100644 OvmfPkg/XenPvBlkDxe/X86MsftFullMemoryFence.c

diff --git a/OvmfPkg/XenPvBlkDxe/XenPvBlkDxe.inf 
b/OvmfPkg/XenPvBlkDxe/XenPvBlkDxe.inf
index 5dd8e8be1183..dc91865265c1 100644
--- a/OvmfPkg/XenPvBlkDxe/XenPvBlkDxe.inf
+++ b/OvmfPkg/XenPvBlkDxe/XenPvBlkDxe.inf
@@ -30,9 +30,17 @@ [Sources]
   ComponentName.c
   ComponentName.h
   DriverBinding.h
+  FullMemoryFence.h
   XenPvBlkDxe.c
   XenPvBlkDxe.h
 
+[Sources.IA32]
+  X86GccFullMemoryFence.c | GCC
+  X86MsftFullMemoryFence.c | MSFT
+
+[Sources.X64]
+  X86GccFullMemoryFence.c | GCC
+  X86MsftFullMemoryFence.c | MSFT
 
 [LibraryClasses]
   UefiDriverEntryPoint
diff --git a/OvmfPkg/XenPvBlkDxe/FullMemoryFence.h 
b/OvmfPkg/XenPvBlkDxe/FullMemoryFence.h
new file mode 100644
index ..e3d1df3d0e9d
--- /dev/null
+++ b/OvmfPkg/XenPvBlkDxe/FullMemoryFence.h
@@ -0,0 +1,27 @@
+/** @file
+  Copyright (C) 2022, Citrix Ltd.
+
+  SPDX-License-Identifier: BSD-2-Clause-Patent
+**/
+
+#if defined (MDE_CPU_IA32) || defined (MDE_CPU_X64)
+
+//
+// Like MemoryFence() but prevent stores from being reordered with loads by
+// the CPU on X64.
+//
+VOID
+EFIAPI
+FullMemoryFence (
+  VOID
+  );
+
+#else
+
+//
+// Only implement FullMemoryFence() on X86 as MemoryFence() is probably
+// fine on other platforms.
+//
+#define FullMemoryFence()  MemoryFence()
+
+#endif
diff --git a/OvmfPkg/XenPvBlkDxe/XenPvBlkDxe.h 
b/OvmfPkg/XenPvBlkDxe/XenPvBlkDxe.h
index 350b7bd309c0..67ee1899e9a8 100644
--- a/OvmfPkg/XenPvBlkDxe/XenPvBlkDxe.h
+++ b/OvmfPkg/XenPvBlkDxe/XenPvBlkDxe.h
@@ -11,8 +11,9 @@
 #define __EFI_XEN_PV_BLK_DXE_H__
 
 #include 
+#include "FullMemoryFence.h"
 
-#define xen_mb()   MemoryFence()
+#define xen_mb()   FullMemoryFence()
 #define xen_rmb()  MemoryFence()
 #define xen_wmb()  MemoryFence()
 
diff --git a/OvmfPkg/XenPvBlkDxe/X86GccFullMemoryFence.c 
b/OvmfPkg/XenPvBlkDxe/X86GccFullMemoryFence.c
new file mode 100644
index ..92d107def470
--- /dev/null
+++ b/OvmfPkg/XenPvBlkDxe/X86GccFullMemoryFence.c
@@ -0,0 +1,20 @@
+/** @file
+  Copyright (C) 2022, Citrix Ltd.
+
+  SPDX-License-Identifier: BSD-2-Clause-Patent
+**/
+
+#include "FullMemoryFence.h"
+
+//
+// Like MemoryFence() but prevent stores from being reordered with loads by
+// the CPU on X64.
+//
+VOID
+EFIAPI
+FullMemoryFence (
+  VOID
+  )
+{
+  __asm__ __volatile__ ("mfence":::"memory");
+}
diff --git a/OvmfPkg/XenPvBlkDxe/X86MsftFullMemoryFence.c 
b/OvmfPkg/XenPvBlkDxe/X86MsftFullMemoryFence.c
new file mode 100644
index ..fcb08f7601cd
--- /dev/null
+++ b/OvmfPkg/XenPvBlkDxe/X86MsftFullMemoryFence.c
@@ -0,0 +1,22 @@
+/** @file
+  Copyright (C) 2022, Citrix Ltd.
+
+  SPDX-License-Identifier: BSD-2-Clau

Re: [PATCH 1/1] OvmfPkg/XenPvBlkDxe: Fix memory barrier macro

2022-07-19 Thread Andrew Cooper
On 19/07/2022 14:52, Anthony Perard wrote:
> diff --git a/OvmfPkg/XenPvBlkDxe/XenPvBlkDxe.h 
> b/OvmfPkg/XenPvBlkDxe/XenPvBlkDxe.h
> index 350b7bd309c0..67ee1899e9a8 100644
> --- a/OvmfPkg/XenPvBlkDxe/XenPvBlkDxe.h
> +++ b/OvmfPkg/XenPvBlkDxe/XenPvBlkDxe.h
> @@ -11,8 +11,9 @@
>  #define __EFI_XEN_PV_BLK_DXE_H__
>  
>  #include 
> +#include "FullMemoryFence.h"
>  
> -#define xen_mb()   MemoryFence()
> +#define xen_mb()   FullMemoryFence()
>  #define xen_rmb()  MemoryFence()
>  #define xen_wmb()  MemoryFence()

Ok, so the old MemoryFence() is definitely bogus here.

However, it doesn't need to be an mfence instruction.  All that is
needed is smp_mb(), which these days is

asm volatile ( "lock addl $0, -4(%%rsp)" ::: "memory" )

because that has the required read/write ordering properties without the
extra serialising property that mfence has.

Furthermore, ...

>  
> diff --git a/OvmfPkg/XenPvBlkDxe/X86GccFullMemoryFence.c 
> b/OvmfPkg/XenPvBlkDxe/X86GccFullMemoryFence.c
> new file mode 100644
> index ..92d107def470
> --- /dev/null
> +++ b/OvmfPkg/XenPvBlkDxe/X86GccFullMemoryFence.c
> @@ -0,0 +1,20 @@
> +/** @file
> +  Copyright (C) 2022, Citrix Ltd.
> +
> +  SPDX-License-Identifier: BSD-2-Clause-Patent
> +**/
> +
> +#include "FullMemoryFence.h"
> +
> +//
> +// Like MemoryFence() but prevent stores from being reordered with loads by
> +// the CPU on X64.
> +//
> +VOID
> +EFIAPI
> +FullMemoryFence (
> +  VOID
> +  )
> +{
> +  __asm__ __volatile__ ("mfence":::"memory");
> +}

... stuff like this needs to come from a single core location, and not
opencoded for each driver.

~Andrew


Re: Ping: [PATCH] x86/PAT: have pat_enabled() properly reflect state when running on e.g. Xen

2022-07-19 Thread Chuck Zmudzinski
On 7/14/2022 6:45 PM, Chuck Zmudzinski wrote:
> On 7/14/2022 6:33 PM, Chuck Zmudzinski wrote:
> > On 7/14/2022 1:17 PM, Chuck Zmudzinski wrote:
> > > On 7/5/22 6:57 AM, Thorsten Leemhuis wrote:
> > > > [CCing tglx, mingo, Boris and Juergen]
> > > >
> > > > On 04.07.22 14:26, Jan Beulich wrote:
> > > > > On 04.07.2022 13:58, Thorsten Leemhuis wrote:
> > > > >> On 25.05.22 10:55, Jan Beulich wrote:
> > > > >>> On 28.04.2022 16:50, Jan Beulich wrote:
> > > >  The latest with commit bdd8b6c98239 ("drm/i915: replace 
> > > >  X86_FEATURE_PAT
> > > >  with pat_enabled()") pat_enabled() returning false (because of PAT
> > > >  initialization being suppressed in the absence of MTRRs being 
> > > >  announced
> > > >  to be available) has become a problem: The i915 driver now fails to
> > > >  initialize when running PV on Xen (i915_gem_object_pin_map() is 
> > > >  where I
> > > >  located the induced failure), and its error handling is flaky 
> > > >  enough to
> > > >  (at least sometimes) result in a hung system.
> > > > 
> > > >  Yet even beyond that problem the keying of the use of WC mappings 
> > > >  to
> > > >  pat_enabled() (see arch_can_pci_mmap_wc()) means that in particular
> > > >  graphics frame buffer accesses would have been quite a bit less
> > > >  performant than possible.
> > > > 
> > > >  Arrange for the function to return true in such environments, 
> > > >  without
> > > >  undermining the rest of PAT MSR management logic considering PAT 
> > > >  to be
> > > >  disabled: Specifically, no writes to the PAT MSR should occur.
> > > > 
> > > >  For the new boolean to live in .init.data, init_cache_modes() also 
> > > >  needs
> > > >  moving to .init.text (where it could/should have lived already 
> > > >  before).
> > > > 
> > > >  Signed-off-by: Jan Beulich 
> > > > >>>
> > > > >>> The Linux kernel regression tracker is pestering me because things 
> > > > >>> are
> > > > >>> taking so long (effectively quoting him), and alternative proposals
> > > > >>> made so far look to have more severe downsides.
> > > > >>
> > > > >> Has any progress been made with this patch? It afaics is meant to fix
> > > > >> this regression, which ideally should have been fixed weeks ago (btw:
> > > > >> adding a "Link:" tag pointing to it would be good):
> > > > >> https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/
> > > > >>
> > > > >> According to Juergen it's still needed:
> > > > >> https://lore.kernel.org/lkml/c5515533-29a9-9e91-5a36-45f00f25b...@suse.com/
> > > > >>
> > > > >> Or was a different solution found to fix that regression?
> > > > > 
> > > > > No progress and no alternatives I'm aware of.
> > > >
> > > > Getting closer to the point where I need to bring this to Linus
> > > > attention. I hope this mail can help avoiding this.
> > > >
> > > > Jan, I didn't follow this closely, but do you have any idea why Dave,
> > > > Luto, and Peter are ignoring this? Is reverting bdd8b6c98239 a option to
> > > > get the regression fixed? Would a repost maybe help getting this rolling
> > > > again?
> > >
> > > Hi, Thorsten,
> > >
> > > Here is a link to the hardware probe of my system which exhibits
> > > a system hang before fully booting with bdd8b6c98239. Without
> > > bdd8b6c98239, the problem is gone:
> > >
> > > https://linux-hardware.org/?probe=32e615b538
> > >
> > > Keep in mind this problem is not seen with bdd8b6c98239
> > > on the bare metal, but only when running as a traditional Dom0
> > PV type guest on Xen. I don't see the problem on Xen HVM
> > > DomU, and I have not tested it on Xen PVH DomU, Xen PV DomU,
> > > or the experimental Xen PVH Dom0.
> >
> > Update: On affected hardware, you do not need to run in a
> > Xen PV Dom0 to see the regression caused by bdd8b6c98239.
> >
> > All you need to do is run, on the bare metal, on the affected
> > hardware, with the Linux kernel nopat boot option.
> >
> > Jan mentions in his commit message the function in the i915
> > driver that was touched by bdd8b6c98239 and that causes this
> > regression. That is, any Intel IGD that needs to execute the
> > function that Jan mentions in the commit message of his
> > proposed patch when the i915 driver is setting up the graphics
> > engine will most likely be hardware that is affected. My Intel
> > IGD was marketed as HD Graphics 4600, I think.
> >
> > So find an a system with these hardware characteristics, and
> > try running, with the nopat option, the Linux kernel, with
> > and without bdd8b6c98239. You will see the regression I
> > am experiencing, I predict.
>
> This raises a disturbing question: The commit message of
> bdd8b6c98239 mentions the nopat option. It does not specify what
> effect the commit was supposed to have on systems
> with the nopat option, but the actual effect on the system,
> both with the seldom-used nopat option and in Xen PV Dom0,
> is a nasty regre

[xen-unstable-smoke test] 171685: tolerable all pass - PUSHED

2022-07-19 Thread osstest service owner
flight 171685 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171685/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  e500b6b8d07f87593a9d0e3a391456ef4ac5ee34
baseline version:
 xen  9723507daf2120131410c91980d4e4d9b0d0aa90

Last test of basis   171682  2022-07-19 07:06:21 Z0 days
Testing same since   171685  2022-07-19 11:03:41 Z0 days1 attempts


People who touched revisions under test:
  Elliott Mitchell 
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   9723507daf..e500b6b8d0  e500b6b8d07f87593a9d0e3a391456ef4ac5ee34 -> smoke



Re: [PATCH 3/3] x86: decouple pat and mtrr handling

2022-07-19 Thread Borislav Petkov
On Fri, Jul 15, 2022 at 04:25:49PM +0200, Juergen Gross wrote:
> Today PAT is usable only with MTRR being active, with some nasty tweaks
> to make PAT usable when running as Xen PV guest, which doesn't support
> MTRR.
> 
> The reason for this coupling is, that both, PAT MSR changes and MTRR
> changes, require a similar sequence and so full PAT support was added
> using the already available MTRR handling.
> 
> Xen PV PAT handling can work without MTRR, as it just needs to consume
> the PAT MSR setting done by the hypervisor without the ability and need
> to change it. This in turn has resulted in a convoluted initialization
> sequence and wrong decisions regarding cache mode availability due to
> misguiding PAT availability flags.
> 
> Fix all of that by allowing to use PAT without MTRR and by adding an
> environment dependent PAT init function.

Aha, there's the explanation I was looking for.

> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 0a1bd14f7966..3edfb779dab5 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -2408,8 +2408,8 @@ void __init cache_bp_init(void)
>  {
>   if (IS_ENABLED(CONFIG_MTRR))
>   mtrr_bp_init();
> - else
> - pat_disable("PAT support disabled because CONFIG_MTRR is 
> disabled in the kernel.");
> +
> + pat_cpu_init();
>  }
>  
>  void cache_ap_init(void)
> @@ -2417,7 +2417,8 @@ void cache_ap_init(void)
>   if (cache_aps_delayed_init)
>   return;
>  
> - mtrr_ap_init();
> + if (!mtrr_ap_init())
> + pat_ap_init_nomtrr();
>  }

So I'm reading this as: if it couldn't init AP's MTRRs, init its PAT.

But currently, the code sets the MTRRs for the delayed case or when the
CPU is not online by doing ->set_all and in there it sets first MTRRs
and then PAT.

I think the code above should simply try the two things, one after the
other, independently from one another.
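The independent-init flow suggested here could be sketched roughly as follows. This is a minimal stand-alone sketch, not kernel code: `cpu_has_mtrr`/`cpu_has_pat` and the `*_ap_init()` stubs are placeholder names for illustration, not the actual kernel interfaces.

```c
#include <stdbool.h>
#include <string.h>

/* Sketch: treat MTRR and PAT AP init as independent, capability-gated
 * steps instead of chaining PAT init off the MTRR path. */
static bool cpu_has_mtrr = true;   /* assumption: detected earlier at boot */
static bool cpu_has_pat  = true;

static char init_seq[8];           /* records the order of operations */

static void mtrr_ap_init(void) { strcat(init_seq, "M"); }
static void pat_ap_init(void)  { strcat(init_seq, "P"); }

static void cache_ap_init(void)
{
    init_seq[0] = '\0';
    /* Keep the ordering the rendezvous handler relies on -- MTRRs
     * first, then PAT -- but with no dependency between the two. */
    if (cpu_has_mtrr)
        mtrr_ap_init();
    if (cpu_has_pat)
        pat_ap_init();
}
```

With this shape, a platform that lacks MTRR (e.g. a Xen PV guest) simply skips the MTRR step and still gets its PAT init, which is the decoupling the series is after.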

And I see you've added another stomp machine call for PAT only.

Now, what I think the design of all this should be, is:

you have a bunch of things you need to do at each point:

* cache_ap_init

* cache_aps_init

* ...

Now, in each those, you look at whether PAT or MTRR is supported and you
do only those which are supported.

Also, the rendezvous handler should do:

if MTRR:
do MTRR specific stuff

if PAT:
do PAT specific stuff

This way you have clean definitions of what needs to happen when and you
also do *only* the things that the platform supports, by keeping the
proper order of operations - I believe MTRRs first and then PAT.

This way we'll get rid of that crazy maze of who calls what and when.

But first we need to define those points where stuff needs to happen and
then for each point define what stuff needs to happen.

How does that sound?

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette



Re: [PATCH v1 01/18] kconfig: allow configuration of maximum modules

2022-07-19 Thread Daniel P. Smith


On 7/15/22 15:16, Julien Grall wrote:
> Hi Daniel,
> 
> On 06/07/2022 22:04, Daniel P. Smith wrote:
>> For x86 the number of allowable multiboot modules varies between the
>> different
>> entry points, non-efi boot, pvh boot, and efi boot. In the case of
>> both Arm and
>> x86 this value is fixed to values based on generalized assumptions. With
>> hyperlaunch for x86 and dom0less on Arm, use of static sizes results
>> in large
>> allocations compiled into the hypervisor that will go unused by many
>> use cases.
>>
>> This commit introduces a Kconfig variable that is set with sane
>> defaults based
>> on configuration selection. This variable is in turned used as the
>> array size
>> for the cases where a static allocated array of boot modules is declared.
>>
>> Signed-off-by: Daniel P. Smith 
>> Reviewed-by: Christopher Clark 
> 
> I am not entirely sure where this reviewed-by is coming from. Is this
> from internal review?

Yes.

> If yes, my recommendation would be to provide the reviewed-by on the
> mailing list. Ideally, the review should also be done in the open, but I
> understand some company wish to do a fully internal review first.

Since this capability is being jointly developed by Christopher and me,
with myself as the author of the code, Christopher reviewed the code as
the co-developer. He did so as a second pair of eyes for any obvious
mistakes and to concur that the implementation was in line with the
approach the two of us architected. Perhaps a SoB line might be more
appropriate than an R-b line.

> At least from a committer perspective, this helps me to know whether the
> reviewed-by still apply. An example would be if you send a v2, I would
> not be able to know whether Christoffer still agreed on the change.

If an SoB line is more appropriate, then on the next version I can
switch it.

>> ---
>>   xen/arch/Kconfig  | 12 
>>   xen/arch/arm/include/asm/setup.h  |  5 +++--
>>   xen/arch/x86/efi/efi-boot.h   |  2 +-
>>   xen/arch/x86/guest/xen/pvh-boot.c |  2 +-
>>   xen/arch/x86/setup.c  |  4 ++--
>>   5 files changed, 19 insertions(+), 6 deletions(-)
>>
>> diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
>> index f16eb0df43..24139057be 100644
>> --- a/xen/arch/Kconfig
>> +++ b/xen/arch/Kconfig
>> @@ -17,3 +17,15 @@ config NR_CPUS
>>     For CPU cores which support Simultaneous Multi-Threading or
>> similar
>>     technologies, this the number of logical threads which Xen will
>>     support.
>> +
>> +config NR_BOOTMODS
>> +    int "Maximum number of boot modules that a loader can pass"
>> +    range 1 32768
>> +    default "8" if X86
>> +    default "32" if ARM
>> +    help
>> +  Controls the build-time size of various arrays allocated for
>> +  parsing the boot modules passed by a loader when starting Xen.
>> +
>> +  This is of particular interest when using Xen's hypervisor domain
>> +  capabilities such as dom0less.
>> diff --git a/xen/arch/arm/include/asm/setup.h
>> b/xen/arch/arm/include/asm/setup.h
>> index 2bb01ecfa8..312a3e4209 100644
>> --- a/xen/arch/arm/include/asm/setup.h
>> +++ b/xen/arch/arm/include/asm/setup.h
>> @@ -10,7 +10,8 @@
>>     #define NR_MEM_BANKS 256
>>   -#define MAX_MODULES 32 /* Current maximum useful modules */
>> +/* Current maximum useful modules */
>> +#define MAX_MODULES CONFIG_NR_BOOTMODS
>>     typedef enum {
>>   BOOTMOD_XEN,
>> @@ -38,7 +39,7 @@ struct meminfo {
>>    * The domU flag is set for kernels and ramdisks of "xen,domain" nodes.
>>    * The purpose of the domU flag is to avoid getting confused in
>>    * kernel_probe, where we try to guess which is the dom0 kernel and
>> - * initrd to be compatible with all versions of the multiboot spec.
>> + * initrd to be compatible with all versions of the multiboot spec.
> 
> In general, I much prefer if coding style changes are done separately
> because it helps the review (I don't have to stare at the line to figure
> out what changed).

Actually, on a past review of another series I got dinged on this, and I
did try to get most of them out of this series. This is just a straggler
that I missed. I will clean up on next revision.

> I am not going to force this here. However, the strict minimum is to
> mention the change in the commit message.
> 
>>    */
>>   #define BOOTMOD_MAX_CMDLINE 1024
>>   struct bootmodule {
>> diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
>> index 6e65b569b0..4e1a799749 100644
>> --- a/xen/arch/x86/efi/efi-boot.h
>> +++ b/xen/arch/x86/efi/efi-boot.h
>> @@ -18,7 +18,7 @@ static multiboot_info_t __initdata mbi = {
>>    * The array size needs to be one larger than the number of modules we
>>    * support - see __start_xen().
>>    */
>> -static module_t __initdata mb_modules[5];
>> +static module_t __initdata mb_modules[CONFIG_NR_BOOTMODS + 1];
> 
> Please explain in the commit message why the number of modules was
> bumped from 5 to 9.

The number of modules was inconsistent 

[linux-linus test] 171681: tolerable FAIL - PUSHED

2022-07-19 Thread osstest service owner
flight 171681 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171681/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds18 guest-start/debian.repeat fail REGR. vs. 171664

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 171664
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 171664
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171664
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 171664
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 171664
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 171664
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 171664
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 171664
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 linux    ca85855bdcae8f84f1512e88b4c75009ea17ea2f
baseline version:
 linux    ff6992735ade75aae3e35d16b17da1008d753d28

Last test of basis   171664  2022-07-17 21:11:10 Z1 days
Failing since171674  2022-07-18 19:40:02 Z0 days2 attempts
Testing same since   171681  2022-07-19 04:40:23 Z0 days1 attempts


People who touched revisions under test:
  Andy Shevchenko 
  Bartosz Golaszewski 
  Dipen Patel 
  Jason Gunthorpe 
  Linus Torvalds 
  Mustafa Ismail 
  Shiraz Saleem 
  Thierry Reding 

jobs:
 build-amd64-xsm   

Re: [PATCH v1 01/18] kconfig: allow configuration of maximum modules

2022-07-19 Thread Daniel P. Smith
On 7/19/22 05:32, Jan Beulich wrote:
> On 06.07.2022 23:04, Daniel P. Smith wrote:
>> --- a/xen/arch/Kconfig
>> +++ b/xen/arch/Kconfig
>> @@ -17,3 +17,15 @@ config NR_CPUS
>>For CPU cores which support Simultaneous Multi-Threading or similar
>>technologies, this the number of logical threads which Xen will
>>support.
>> +
>> +config NR_BOOTMODS
>> +int "Maximum number of boot modules that a loader can pass"
>> +range 1 32768
>> +default "8" if X86
>> +default "32" if ARM
> 
> Any reason for the larger default on Arm, irrespective of dom0less
> actually being in use? (I'm actually surprised I can't spot a Kconfig
> option controlling inclusion of dom0less. The default here imo isn't
> supposed to depend on the architecture, but on whether dom0less is
> supported. That way if another arch gained dom0less support, the
> higher default would apply to it without needing further adjustment.)

Yes, multidomain construction is always on for Arm and the only
configurable behaviour is a command line parameter to enforce that dom0 is not
created. As for the default, it was selected based on the largest value
used in the locations replaced by the Kconfig variable. Since there was
a significant difference between Arm and x86, I did not feel it was
appropriate to reduce/increase either, since it drives multiple static
array allocations for x86.

I have no attachment to any specific value, so I will freely adjust to
whatever consensus the community might come to.

>> --- a/xen/arch/x86/efi/efi-boot.h
>> +++ b/xen/arch/x86/efi/efi-boot.h
>> @@ -18,7 +18,7 @@ static multiboot_info_t __initdata mbi = {
>>   * The array size needs to be one larger than the number of modules we
>>   * support - see __start_xen().
>>   */
>> -static module_t __initdata mb_modules[5];
>> +static module_t __initdata mb_modules[CONFIG_NR_BOOTMODS + 1];
> 
> If the build admin selected 1, I'm pretty sure about nothing would work.
> I think you want max(5, CONFIG_NR_BOOTMODS) or
> max(4, CONFIG_NR_BOOTMODS) + 1 here and ...

Actually, I reasoned this out and 1 is in fact a valid value. It would
mean Xen + Dom0 Linux kernel with embedded initramfs with no externally
loaded XSM policy and no boot time microcode patching. This is a working
configuration, but open to debate if it is a desirable configuration.
The question is whether it is desired to block someone from building
such a configuration, or any number between 1 and 4. If the answer is
yes, then why not just set the lower bound of the range in the Kconfig
file instead of having to maintain a hard-coded lower bound in a max
macro across multiple locations?
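For illustration, the MAX()-style clamp Jan suggests would look roughly like this. This is a stand-alone sketch: module_t is stubbed, and the concrete lower bound (4 modules plus the sentinel) is the one discussed for the EFI entry point, not a settled value.

```c
/* Stand-alone illustration of the clamp approach: the use site never
 * drops below what __start_xen() needs, even if CONFIG_NR_BOOTMODS is
 * configured smaller. module_t is stubbed here for the example. */
typedef struct { unsigned long mod_start, mod_end; } module_t;

#define CONFIG_NR_BOOTMODS 1           /* assume the admin picked the minimum */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

static module_t mb_modules[MAX(4, CONFIG_NR_BOOTMODS) + 1];

static unsigned int mb_modules_capacity(void)
{
    return sizeof(mb_modules) / sizeof(mb_modules[0]);
}
```

The alternative raised above -- raising the `range` lower bound in Kconfig -- would make this clamp unnecessary, at the cost of forbidding the minimal 1-module configuration.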

>> --- a/xen/arch/x86/guest/xen/pvh-boot.c
>> +++ b/xen/arch/x86/guest/xen/pvh-boot.c
>> @@ -32,7 +32,7 @@ bool __initdata pvh_boot;
>>  uint32_t __initdata pvh_start_info_pa;
>>  
>>  static multiboot_info_t __initdata pvh_mbi;
>> -static module_t __initdata pvh_mbi_mods[8];
>> +static module_t __initdata pvh_mbi_mods[CONFIG_NR_BOOTMOD + 1];
> 
> ... max(8, CONFIG_NR_BOOTMODS) here (albeit the 8 may have room for
> lowering - I don't recall why 8 was chosen rather than going with
> the minimum possible value covering all module kinds known at that
> time).

This is what drove the default for x86 in Kconfig to be 8. I thought it
was excessive but assumed there was some reason for the value. And see
my comment above about whether it should be max({n},CONFIG_NR_BOOTMOD) vs
range {n}..32768.



RE: [PATCH v1 00/18] Hyperlaunch

2022-07-19 Thread Smith, Jackson
Hi Daniel,

> -Original Message-
> Subject: [PATCH v1 00/18] Hyperlaunch

With the adjustments that I suggested in other messages, this patch builds and 
boots for me on x86 (including a device tree with a domU). I will continue to 
poke around and see if I discover any other rough edges.

One strange behavior I see is that xen fails to start the Dom0 kernel on a warm 
reboot. I'm using qemu_system_x86 with the KVM backend to test out the patch. 
After starting qemu, xen will boot correctly only once. If I attempt to reboot 
the virtual system (through the 'reboot' command in dom0 or the 'system_reset' 
qemu monitor command) without exiting/starting a new qemu process on the host 
machine, xen panics while booting after printing this:

(XEN) *** Building Dom0 ***
(XEN) Dom0 has maximum 856 PIRQs
(XEN) *** Constructing a PV Dom0 ***
(XEN) ELF: not an ELF binary
(XEN)
(XEN) 
(XEN) Panic on CPU 0:
(XEN) Could not construct domain 0
(XEN) 

This happens with the BUILDER_FDT config option on and off, and regardless of 
what dtb (if any) I pass to xen. I don't see this behavior if I switch back to 
xen's master branch.

Hopefully that explanation made sense. Let me know if I can provide any further 
information about my setup.

Thanks,
Jackson

Also, I apologize that my last messages included a digital signature. Should be 
fixed now.



[PATCH] xen/mem_sharing: support forks with active vPMU state

2022-07-19 Thread Tamas K Lengyel
Currently the vPMU state from a parent isn't copied to VM forks. To enable the
vPMU state to be copied to a fork VM we export certain vPMU functions. First,
the vPMU context needs to be allocated for the fork if the parent has one. For
this we introduce vpmu->allocate_context, which has previously only been called
when the guest enables the PMU on itself. Furthermore, we export
vpmu_save_force so that the PMU context can be saved on-demand even if no
context switch took place on the parent's CPU yet. Additionally, we make sure
all relevant configuration MSRs are saved in the vPMU context so the copy is
complete and the fork starts with the same PMU config as the parent.
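The copy step described above can be sketched roughly as follows. These are illustrative stubs only: the real patch operates on struct vpmu_struct through the exported helpers, and the names below are not Xen's actual types.

```c
#include <stdlib.h>
#include <string.h>
#include <stddef.h>

/* Minimal stand-in for the fork-time vPMU context copy: allocate the
 * fork's context lazily if the parent has one (mirroring
 * vpmu->allocate_context), then copy the parent's saved state. */
struct vpmu_stub {
    void  *context;
    size_t context_size;
};

static int vpmu_copy_to_fork(const struct vpmu_stub *parent,
                             struct vpmu_stub *fork)
{
    if (!parent->context)
        return 0;                  /* parent has no vPMU: nothing to do */
    if (!fork->context) {
        fork->context = malloc(parent->context_size);
        if (!fork->context)
            return -1;
        fork->context_size = parent->context_size;
    }
    memcpy(fork->context, parent->context, parent->context_size);
    return 0;
}
```

This is also why the patch records context_size/priv_context_size in the vPMU structure: without the sizes, a byte-for-byte copy of the opaque context is not possible.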

Signed-off-by: Tamas K Lengyel 
---
 xen/arch/x86/cpu/vpmu.c | 12 -
 xen/arch/x86/cpu/vpmu_intel.c   | 16 +++
 xen/arch/x86/include/asm/vpmu.h |  5 
 xen/arch/x86/mm/mem_sharing.c   | 48 +
 4 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
index d2c03a1104..2b5d64a60d 100644
--- a/xen/arch/x86/cpu/vpmu.c
+++ b/xen/arch/x86/cpu/vpmu.c
@@ -336,7 +336,7 @@ void vpmu_do_interrupt(struct cpu_user_regs *regs)
 #endif
 }
 
-static void cf_check vpmu_save_force(void *arg)
+void cf_check vpmu_save_force(void *arg)
 {
 struct vcpu *v = arg;
 struct vpmu_struct *vpmu = vcpu_vpmu(v);
@@ -529,6 +529,16 @@ void vpmu_initialise(struct vcpu *v)
 put_vpmu(v);
 }
 
+void vpmu_allocate_context(struct vcpu *v)
+{
+struct vpmu_struct *vpmu = vcpu_vpmu(v);
+
+if ( vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) )
+return;
+
+alternative_call(vpmu_ops.allocate_context, v);
+}
+
 static void cf_check vpmu_clear_last(void *arg)
 {
 if ( this_cpu(last_vcpu) == arg )
diff --git a/xen/arch/x86/cpu/vpmu_intel.c b/xen/arch/x86/cpu/vpmu_intel.c
index 8612f46973..31dc0ee14b 100644
--- a/xen/arch/x86/cpu/vpmu_intel.c
+++ b/xen/arch/x86/cpu/vpmu_intel.c
@@ -282,10 +282,17 @@ static inline void __core2_vpmu_save(struct vcpu *v)
 for ( i = 0; i < fixed_pmc_cnt; i++ )
 rdmsrl(MSR_CORE_PERF_FIXED_CTR0 + i, fixed_counters[i]);
 for ( i = 0; i < arch_pmc_cnt; i++ )
+{
 rdmsrl(MSR_IA32_PERFCTR0 + i, xen_pmu_cntr_pair[i].counter);
+rdmsrl(MSR_P6_EVNTSEL(i), xen_pmu_cntr_pair[i].control);
+}
 
 if ( !is_hvm_vcpu(v) )
 rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS, core2_vpmu_cxt->global_status);
+/* Save MSR to private context to make it fork-friendly */
+else if ( mem_sharing_enabled(v->domain) )
+vmx_read_guest_msr(v, MSR_CORE_PERF_GLOBAL_CTRL,
+   &core2_vpmu_cxt->global_ctrl);
 }
 
 static int cf_check core2_vpmu_save(struct vcpu *v, bool to_guest)
@@ -346,6 +353,10 @@ static inline void __core2_vpmu_load(struct vcpu *v)
 core2_vpmu_cxt->global_ovf_ctrl = 0;
 wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, core2_vpmu_cxt->global_ctrl);
 }
+/* Restore MSR from context when used with a fork */
+else if ( mem_sharing_is_fork(v->domain) )
+vmx_write_guest_msr(v, MSR_CORE_PERF_GLOBAL_CTRL,
+core2_vpmu_cxt->global_ctrl);
 }
 
 static int core2_vpmu_verify(struct vcpu *v)
@@ -474,7 +485,11 @@ static int core2_vpmu_alloc_resource(struct vcpu *v)
 sizeof(uint64_t) * fixed_pmc_cnt;
 
 vpmu->context = core2_vpmu_cxt;
+vpmu->context_size = sizeof(struct xen_pmu_intel_ctxt) +
+ fixed_pmc_cnt * sizeof(uint64_t) +
+ arch_pmc_cnt * sizeof(struct xen_pmu_cntr_pair);
 vpmu->priv_context = p;
+vpmu->priv_context_size = sizeof(uint64_t);
 
 if ( !has_vlapic(v->domain) )
 {
@@ -882,6 +897,7 @@ static int cf_check vmx_vpmu_initialise(struct vcpu *v)
 
 static const struct arch_vpmu_ops __initconst_cf_clobber core2_vpmu_ops = {
 .initialise = vmx_vpmu_initialise,
+.allocate_context = core2_vpmu_alloc_resource,
 .do_wrmsr = core2_vpmu_do_wrmsr,
 .do_rdmsr = core2_vpmu_do_rdmsr,
 .do_interrupt = core2_vpmu_do_interrupt,
diff --git a/xen/arch/x86/include/asm/vpmu.h b/xen/arch/x86/include/asm/vpmu.h
index e5709bd44a..14d0939247 100644
--- a/xen/arch/x86/include/asm/vpmu.h
+++ b/xen/arch/x86/include/asm/vpmu.h
@@ -40,6 +40,7 @@
 /* Arch specific operations shared by all vpmus */
 struct arch_vpmu_ops {
 int (*initialise)(struct vcpu *v);
+int (*allocate_context)(struct vcpu *v);
 int (*do_wrmsr)(unsigned int msr, uint64_t msr_content);
 int (*do_rdmsr)(unsigned int msr, uint64_t *msr_content);
 int (*do_interrupt)(struct cpu_user_regs *regs);
@@ -59,6 +60,8 @@ struct vpmu_struct {
 u32 hw_lapic_lvtpc;
 void *context;  /* May be shared with PV guest */
 void *priv_context; /* hypervisor-only */
+size_t context_size;
+size_t priv_context_size;
 struct xen_pmu_data *xenpmu_data;
 spinlock_t vpmu_lock;
 };
@@ -106,8 +109,10 @@ void vpmu_lvtpc_update(

[PATCH V7 01/11] xen/pci: arm: add stub for is_memory_hole

2022-07-19 Thread Oleksandr Tyshchenko
From: Oleksandr Andrushchenko 

Add a stub for is_memory_hole which is required for PCI passthrough
on Arm.

Signed-off-by: Oleksandr Andrushchenko 
---
OT: It looks like the discussion got stuck. As I understand it, this
patch is not immediately needed in the context of the current series,
as PCI passthrough is not enabled on Arm at the moment. So the patch
could be added later on, but it is needed to allow PCI passthrough
to be built on Arm for those who want to test it.

Copy here some context provided by Julien:

Here a summary of the discussion (+ some my follow-up thoughts):

is_memory_hole() was recently introduced on x86 (see commit 75cc460a1b8c
"xen/pci: detect when BARs are not suitably positioned") to check
whether the BARs are positioned outside of a valid memory range. This was
introduced to work around quirky firmware.

In theory, this could also happen on Arm. In practice, this may not
happen but it sounds better to sanity check that the BAR contains
"valid" I/O range.

On x86, this is implemented by checking the region is not described in
the e820. IIUC, on Arm, the BARs have to be positioned in pre-defined
ranges. So I think it would be possible to implement is_memory_hole() by
going through the list of hostbridges and check the ranges.

But first, I'd like to confirm my understanding with Rahul, and others.

If we were going to go this route, I would also rename the function to
better match what it is doing (i.e. it checks the BAR is correctly
placed). As a potential optimization/hardening for Arm, we could pass
the hostbridge so we don't have to walk all of them.
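
A rough sketch of the hostbridge-walk idea might look like this. All types and the window range below are assumptions for illustration only, not Xen code, and would need Rahul's confirmation of how BAR ranges are actually constrained on Arm.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Illustrative: walk the host bridges' MMIO windows and check that the
 * candidate BAR range falls entirely inside one of them. */
struct mmio_window { uint64_t start, end; };     /* inclusive range */

static const struct mmio_window bridge_windows[] = {
    { 0x10000000, 0x3fffffff },    /* assumed host bridge window */
};

static bool bar_in_bridge_window(uint64_t start, uint64_t end)
{
    for (size_t i = 0;
         i < sizeof(bridge_windows) / sizeof(bridge_windows[0]); i++)
        if (start >= bridge_windows[i].start && end <= bridge_windows[i].end)
            return true;
    return false;
}
```

Passing the specific hostbridge, as suggested above, would reduce this to a check against that bridge's windows alone.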
---
 xen/arch/arm/mm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 009b8cd9ef..bb34b97eb5 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1708,6 +1708,12 @@ unsigned long get_upper_mfn_bound(void)
 return max_page - 1;
 }
 
+bool is_memory_hole(mfn_t start, mfn_t end)
+{
+/* TODO: this needs to be properly implemented. */
+return true;
+}
+
 /*
  * Local variables:
  * mode: C
-- 
2.25.1




[PATCH V7 03/11] vpci/header: implement guest BAR register handlers

2022-07-19 Thread Oleksandr Tyshchenko
From: Oleksandr Andrushchenko 

Add relevant vpci register handlers when assigning PCI device to a domain
and remove those when de-assigning. This allows having different
handlers for different domains, e.g. hwdom and other guests.

Emulate guest BAR register values: this allows creating a guest view
of the registers and emulates size and properties probe as it is done
during PCI device enumeration by the guest.

All empty, IO and ROM BARs for guests are emulated by returning 0 on
reads and ignoring writes: these BARs are special in this respect, as
their lower bits have special meaning, so returning the default ~0 on
read may confuse the guest OS.

Memory decoding is initially disabled when used by guests in order to
prevent the BAR being placed on top of a RAM region.
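
The page-offset invariant enforced by the guest BAR write handler amounts to the following check. This is a stand-alone sketch assuming the usual 4K page size, not the patch's exact code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch: since second-stage translation works at page granularity, the
 * guest-chosen BAR address must keep the same sub-page offset as the
 * host physical address. 4K pages assumed for illustration. */
#define PAGE_SIZE  4096ULL
#define PAGE_MASK  (~(PAGE_SIZE - 1))
#define PCI_BASE_ADDRESS_MEM_MASK  (~0xfULL)

static bool bar_offset_matches(uint64_t guest_reg, uint64_t host_addr)
{
    /* Mask off the BAR's low type bits before comparing sub-page offsets. */
    return (guest_reg & (~PAGE_MASK & PCI_BASE_ADDRESS_MEM_MASK)) ==
           (host_addr & ~PAGE_MASK);
}
```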

Signed-off-by: Oleksandr Andrushchenko 
---
Since v6:
- unify the writing of the PCI_COMMAND register on the
  error path into a label
- do not introduce bar_ignore_access helper and open code
- s/guest_bar_ignore_read/empty_bar_read
- update error message in guest_bar_write
- only setup empty_bar_read for IO if !x86
- OT: rebased
- OT: add cf_check specifier to guest_bar_(write)read() and empty_bar_read()
Since v5:
- make sure that the guest set address has the same page offset
  as the physical address on the host
- remove guest_rom_{read|write} as those just implement the default
  behaviour of the registers not being handled
- adjusted comment for struct vpci.addr field
- add guest handlers for BARs which are not handled and will otherwise
  return ~0 on read and ignore writes. The BARs are special with this
  respect as their lower bits have special meaning, so returning ~0
  doesn't seem to be right
Since v4:
- updated commit message
- s/guest_addr/guest_reg
Since v3:
- squashed two patches: dynamic add/remove handlers and guest BAR
  handler implementation
- fix guest BAR read of the high part of a 64bit BAR (Roger)
- add error handling to vpci_assign_device
- s/dom%pd/%pd
- blank line before return
Since v2:
- remove unneeded ifdefs for CONFIG_HAS_VPCI_GUEST_SUPPORT as more code
  has been eliminated from being built on x86
Since v1:
 - constify struct pci_dev where possible
 - do not open code is_system_domain()
 - simplify some code
 - use gdprintk + error code instead of gprintk
 - gate vpci_bar_{add|remove}_handlers with CONFIG_HAS_VPCI_GUEST_SUPPORT,
   so these do not get compiled for x86
 - removed unneeded is_system_domain check
 - re-work guest read/write to be much simpler and do more work on write
   than read which is expected to be called more frequently
 - removed one too obvious comment
---
 xen/drivers/vpci/header.c | 151 +++---
 xen/include/xen/vpci.h|   3 +
 2 files changed, 126 insertions(+), 28 deletions(-)

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index e0461b1139..9fbbdc3500 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -412,6 +412,71 @@ static void cf_check bar_write(
 pci_conf_write32(pdev->sbdf, reg, val);
 }
 
+static void cf_check guest_bar_write(
+const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
+{
+struct vpci_bar *bar = data;
+bool hi = false;
+uint64_t guest_reg = bar->guest_reg;
+
+if ( bar->type == VPCI_BAR_MEM64_HI )
+{
+ASSERT(reg > PCI_BASE_ADDRESS_0);
+bar--;
+hi = true;
+}
+else
+{
+val &= PCI_BASE_ADDRESS_MEM_MASK;
+val |= bar->type == VPCI_BAR_MEM32 ? PCI_BASE_ADDRESS_MEM_TYPE_32
+   : PCI_BASE_ADDRESS_MEM_TYPE_64;
+val |= bar->prefetchable ? PCI_BASE_ADDRESS_MEM_PREFETCH : 0;
+}
+
+guest_reg &= ~(0xull << (hi ? 32 : 0));
+guest_reg |= (uint64_t)val << (hi ? 32 : 0);
+
+guest_reg &= ~(bar->size - 1) | ~PCI_BASE_ADDRESS_MEM_MASK;
+
+/*
+ * Make sure that the guest set address has the same page offset
+ * as the physical address on the host or otherwise things won't work as
+ * expected.
+ */
+if ( (guest_reg & (~PAGE_MASK & PCI_BASE_ADDRESS_MEM_MASK)) !=
+ (bar->addr & ~PAGE_MASK) )
+{
+gprintk(XENLOG_WARNING,
+"%pp: ignored BAR %zu write attempting to change page offset\n",
+&pdev->sbdf, bar - pdev->vpci->header.bars + hi);
+return;
+}
+
+bar->guest_reg = guest_reg;
+}
+
+static uint32_t cf_check guest_bar_read(
+const struct pci_dev *pdev, unsigned int reg, void *data)
+{
+const struct vpci_bar *bar = data;
+bool hi = false;
+
+if ( bar->type == VPCI_BAR_MEM64_HI )
+{
+ASSERT(reg > PCI_BASE_ADDRESS_0);
+bar--;
+hi = true;
+}
+
+return bar->guest_reg >> (hi ? 32 : 0);
+}
+
+static uint32_t cf_check empty_bar_read(
+const struct pci_dev *pdev, unsigned int reg, void *data)
+{
+return 0;
+}
+
 static void cf_check rom_write(
 const struct pci_dev *pdev, unsigned int reg, uint32

[PATCH V7 04/11] rangeset: add RANGESETF_no_print flag

2022-07-19 Thread Oleksandr Tyshchenko
From: Oleksandr Andrushchenko 

There are range sets which should not be printed, so introduce a flag
which allows marking those as such. Implement relevant logic to skip
such entries while printing.

While at it also simplify the definition of the flags by directly
defining those without helpers.

Suggested-by: Jan Beulich 
Signed-off-by: Oleksandr Andrushchenko 
Reviewed-by: Jan Beulich 
---
Since v5:
- comment indentation (Jan)
Since v1:
- update BUG_ON with new flag
- simplify the definition of the flags
---
 xen/common/rangeset.c  | 5 -
 xen/include/xen/rangeset.h | 5 +++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/xen/common/rangeset.c b/xen/common/rangeset.c
index a6ef264046..f8b909d016 100644
--- a/xen/common/rangeset.c
+++ b/xen/common/rangeset.c
@@ -433,7 +433,7 @@ struct rangeset *rangeset_new(
 INIT_LIST_HEAD(&r->range_list);
 r->nr_ranges = -1;
 
-BUG_ON(flags & ~RANGESETF_prettyprint_hex);
+BUG_ON(flags & ~(RANGESETF_prettyprint_hex | RANGESETF_no_print));
 r->flags = flags;
 
 safe_strcpy(r->name, name ?: "(no name)");
@@ -575,6 +575,9 @@ void rangeset_domain_printk(
 
 list_for_each_entry ( r, &d->rangesets, rangeset_list )
 {
+if ( r->flags & RANGESETF_no_print )
+continue;
+
 printk("");
 rangeset_printk(r);
 printk("\n");
diff --git a/xen/include/xen/rangeset.h b/xen/include/xen/rangeset.h
index 135f33f606..f7c69394d6 100644
--- a/xen/include/xen/rangeset.h
+++ b/xen/include/xen/rangeset.h
@@ -49,8 +49,9 @@ void rangeset_limit(
 
 /* Flags for passing to rangeset_new(). */
  /* Pretty-print range limits in hexadecimal. */
-#define _RANGESETF_prettyprint_hex 0
-#define RANGESETF_prettyprint_hex  (1U << _RANGESETF_prettyprint_hex)
+#define RANGESETF_prettyprint_hex   (1U << 0)
+ /* Do not print entries marked with this flag. */
+#define RANGESETF_no_print  (1U << 1)
 
 bool_t __must_check rangeset_is_empty(
 const struct rangeset *r);
-- 
2.25.1




[PATCH V7 02/11] vpci: add hooks for PCI device assign/de-assign

2022-07-19 Thread Oleksandr Tyshchenko
From: Oleksandr Andrushchenko 

When a PCI device gets assigned/de-assigned, some work on the vPCI side needs
to be done for that device. Introduce a pair of hooks so vPCI can handle
that.

Signed-off-by: Oleksandr Andrushchenko 
---
Since v6:
- do not pass struct domain to vpci_{assign|deassign}_device as
  pdev->domain can be used
- do not leave the device assigned (pdev->domain == new domain) in case
  vpci_assign_device fails: try to de-assign and if this also fails, then
  crash the domain
- re-work according to the new locking scheme (ASSERTs)
- OT: rebased
Since v5:
- do not split code into run_vpci_init
- do not check for is_system_domain in vpci_{de}assign_device
- do not use vpci_remove_device_handlers_locked and re-allocate
  pdev->vpci completely
- make vpci_deassign_device void
Since v4:
 - de-assign vPCI from the previous domain on device assignment
 - do not remove handlers in vpci_assign_device as those must not
   exist at that point
Since v3:
 - remove toolstack roll-back description from the commit message
   as error are to be handled with proper cleanup in Xen itself
 - remove __must_check
 - remove redundant rc check while assigning devices
 - fix redundant CONFIG_HAS_VPCI check for CONFIG_HAS_VPCI_GUEST_SUPPORT
 - use REGISTER_VPCI_INIT machinery to run required steps on device
   init/assign: add run_vpci_init helper
Since v2:
- define CONFIG_HAS_VPCI_GUEST_SUPPORT so dead code is not compiled
  for x86
Since v1:
 - constify struct pci_dev where possible
 - do not open code is_system_domain()
 - extended the commit message
---
 xen/drivers/Kconfig   |  4 
 xen/drivers/passthrough/pci.c | 11 +++
 xen/drivers/vpci/vpci.c   | 31 +++
 xen/include/xen/vpci.h| 15 +++
 4 files changed, 61 insertions(+)

diff --git a/xen/drivers/Kconfig b/xen/drivers/Kconfig
index db94393f47..780490cf8e 100644
--- a/xen/drivers/Kconfig
+++ b/xen/drivers/Kconfig
@@ -15,4 +15,8 @@ source "drivers/video/Kconfig"
 config HAS_VPCI
bool
 
+config HAS_VPCI_GUEST_SUPPORT
+   bool
+   depends on HAS_VPCI
+
 endmenu
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index f93922acc8..56af1dbd97 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1019,6 +1019,8 @@ static int deassign_device(struct domain *d, uint16_t seg, uint8_t bus,
 if ( ret )
 goto out;
 
+vpci_deassign_device(pdev);
+
 if ( pdev->domain == hardware_domain  )
 pdev->quarantine = false;
 
@@ -1558,6 +1560,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
 {
 const struct domain_iommu *hd = dom_iommu(d);
 struct pci_dev *pdev;
+uint8_t old_devfn;
 int rc = 0;
 
 if ( !is_iommu_enabled(d) )
@@ -1577,6 +1580,8 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
 if ( pdev->broken && d != hardware_domain && d != dom_io )
 goto done;
 
+vpci_deassign_device(pdev);
+
 rc = pdev_msix_assign(d, pdev);
 if ( rc )
 goto done;
@@ -1594,6 +1599,8 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
   pci_to_dev(pdev), flag)) )
 goto done;
 
+old_devfn = devfn;
+
 for ( ; pdev->phantom_stride; rc = 0 )
 {
 devfn += pdev->phantom_stride;
@@ -1603,6 +1610,10 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
 pci_to_dev(pdev), flag);
 }
 
+rc = vpci_assign_device(pdev);
+if ( rc && deassign_device(d, seg, bus, old_devfn) )
+domain_crash(d);
+
  done:
 if ( rc )
 printk(XENLOG_G_WARNING "%pd: assign (%pp) failed (%d)\n",
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 674c9b347d..d187901422 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -92,6 +92,37 @@ int vpci_add_handlers(struct pci_dev *pdev)
 
 return rc;
 }
+
+#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
+/* Notify vPCI that device is assigned to guest. */
+int vpci_assign_device(struct pci_dev *pdev)
+{
+int rc;
+
+ASSERT(pcidevs_write_locked());
+
+if ( !has_vpci(pdev->domain) )
+return 0;
+
+rc = vpci_add_handlers(pdev);
+if ( rc )
+vpci_deassign_device(pdev);
+
+return rc;
+}
+
+/* Notify vPCI that device is de-assigned from guest. */
+void vpci_deassign_device(struct pci_dev *pdev)
+{
+ASSERT(pcidevs_write_locked());
+
+if ( !has_vpci(pdev->domain) )
+return;
+
+vpci_remove_device(pdev);
+}
+#endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
+
 #endif /* __XEN__ */
 
 static int vpci_register_cmp(const struct vpci_register *r1,
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 7ab39839ff..e5501b9207 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -254,6 +254,21 @@ static inline bool __must_check vpci_process_pending(struct vc

[PATCH V7 05/11] vpci/header: handle p2m range sets per BAR

2022-07-19 Thread Oleksandr Tyshchenko
From: Oleksandr Andrushchenko 

Instead of handling a single range set that contains all the memory
regions of all the BARs and ROM, have them per BAR.
As the range sets are now created when a PCI device is added and destroyed
when it is removed, make them named and accounted.

Note that rangesets were chosen here despite there being only up to
3 separate ranges in each set (typically just 1): a rangeset per BAR
was chosen for ease of implementation and re-usability of existing code.

This is in preparation of making non-identity mappings in p2m for the MMIOs.

Signed-off-by: Oleksandr Andrushchenko 
---
Since v6:
- update according to the new locking scheme
- remove odd fail label in modify_bars
- OT: rebased
Since v5:
- fix comments
- move rangeset allocation to init_bars and only allocate
  for MAPPABLE BARs
- check for overlap with the already setup BAR ranges
Since v4:
- use named range sets for BARs (Jan)
- changes required by the new locking scheme
- updated commit message (Jan)
Since v3:
- re-work vpci_cancel_pending accordingly to the per-BAR handling
- s/num_mem_ranges/map_pending and s/uint8_t/bool
- ASSERT(bar->mem) in modify_bars
- create and destroy the rangesets on add/remove
---
 xen/drivers/vpci/header.c | 241 +++---
 xen/drivers/vpci/vpci.c   |   5 +
 xen/include/xen/vpci.h|   3 +-
 3 files changed, 182 insertions(+), 67 deletions(-)

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 9fbbdc3500..f14ff11882 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -131,64 +131,106 @@ static void modify_decoding(const struct pci_dev *pdev, uint16_t cmd,
 
 bool vpci_process_pending(struct vcpu *v)
 {
-if ( v->vpci.mem )
+struct pci_dev *pdev = v->vpci.pdev;
+
+if ( !pdev )
+return false;
+
+pcidevs_read_lock();
+
+if ( v->vpci.map_pending )
 {
 struct map_data data = {
 .d = v->domain,
 .map = v->vpci.cmd & PCI_COMMAND_MEMORY,
 };
-int rc = rangeset_consume_ranges(v->vpci.mem, map_range, &data);
-
-if ( rc == -ERESTART )
-return true;
-
-pcidevs_read_lock();
-spin_lock(&v->vpci.pdev->vpci->lock);
-/* Disable memory decoding unconditionally on failure. */
-modify_decoding(v->vpci.pdev,
-rc ? v->vpci.cmd & ~PCI_COMMAND_MEMORY : v->vpci.cmd,
-!rc && v->vpci.rom_only);
-spin_unlock(&v->vpci.pdev->vpci->lock);
-pcidevs_read_unlock();
-
-rangeset_destroy(v->vpci.mem);
-v->vpci.mem = NULL;
-if ( rc )
+struct vpci_header *header = &pdev->vpci->header;
+unsigned int i;
+
+for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
 {
-/*
- * FIXME: in case of failure remove the device from the domain.
- * Note that there might still be leftover mappings. While this is
- * safe for Dom0, for DomUs the domain will likely need to be
- * killed in order to avoid leaking stale p2m mappings on
- * failure.
- */
-pcidevs_write_lock();
-vpci_remove_device(v->vpci.pdev);
-pcidevs_write_unlock();
+struct vpci_bar *bar = &header->bars[i];
+int rc;
+
+if ( rangeset_is_empty(bar->mem) )
+continue;
+
+rc = rangeset_consume_ranges(bar->mem, map_range, &data);
+
+if ( rc == -ERESTART )
+{
+pcidevs_read_unlock();
+return true;
+}
+
+spin_lock(&pdev->vpci->lock);
+/* Disable memory decoding unconditionally on failure. */
+modify_decoding(pdev, rc ? v->vpci.cmd & ~PCI_COMMAND_MEMORY :
+   v->vpci.cmd, !rc && v->vpci.rom_only);
+spin_unlock(&pdev->vpci->lock);
+
+if ( rc )
+{
+/*
+ * FIXME: in case of failure remove the device from the domain.
+ * Note that there might still be leftover mappings. While this
+ * is safe for Dom0, for DomUs the domain needs to be killed in
+ * order to avoid leaking stale p2m mappings on failure.
+ */
+v->vpci.map_pending = false;
+pcidevs_read_unlock();
+
+if ( is_hardware_domain(v->domain) )
+{
+pcidevs_write_lock();
+vpci_remove_device(v->vpci.pdev);
+pcidevs_write_unlock();
+}
+else
+domain_crash(v->domain);
+
+return false;
+}
 }
+
+v->vpci.map_pending = false;
 }
 
+pcidevs_read_unlock();
+
 return false;
 }
 
 static int __init apply_map(struct domain *d, const struct pci_dev *pdev,
- 

[PATCH V7 00/11] PCI devices passthrough on Arm, part 3

2022-07-19 Thread Oleksandr Tyshchenko
From: Oleksandr Tyshchenko 

Hi, all!

You can find previous discussion at [1].

1. This patch series is focusing on vPCI and adds support for non-identity
PCI BAR mappings which is required while passing through a PCI device to
a guest. The highlights are:

- Add relevant vpci register handlers when assigning PCI device to a domain
  and remove those when de-assigning. This allows having different
  handlers for different domains, e.g. hwdom and other guests.

- Emulate guest BAR register values based on physical BAR values.
  This allows creating a guest view of the registers and emulates
  size and properties probe as it is done during PCI device enumeration by
  the guest.

- Instead of handling a single range set, that contains all the memory
  regions of all the BARs and ROM, have them per BAR.

- Take into account guest's BAR view and program its p2m accordingly:
  gfn is guest's view of the BAR and mfn is the physical BAR value as set
  up by the host bridge in the hardware domain.
  This way hardware domain sees physical BAR values and guest sees
  emulated ones.
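The size-and-properties probe emulated for the guest BARs follows the standard PCI scheme: write all 1s to the BAR, read it back, mask off the low flag bits, and the size is the two's complement of what remains. A minimal sketch for a 32-bit memory BAR (the constant mirrors PCI_BASE_ADDRESS_MEM_MASK; the function name is hypothetical, this is an illustration rather than the series' code):

```c
#include <assert.h>
#include <stdint.h>

#define MEM_MASK 0xfffffff0U /* address bits of a 32-bit memory BAR */

/* Given what the device returns after an all-1s write, compute the BAR
 * size: the device hardwires the low address bits to zero, so the size
 * is the two's complement of the masked readback. */
static uint32_t bar_size_from_probe(uint32_t readback)
{
    return ~(readback & MEM_MASK) + 1;
}
```

A 1 MiB BAR answers the probe with 0xfff00000 in its address bits, an 8 KiB BAR with 0xffffe000.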

2. The series also adds support for virtual PCI bus topology for guests:
 - We emulate a single host bridge for the guest, so segment is always 0.
 - The implementation is limited to 32 devices which are allowed on
   a single PCI bus.
 - The virtual bus number is set to 0, so virtual devices are seen
   as embedded endpoints behind the root complex.

3. The series has been updated due to the new PCI (vPCI) locking scheme
implemented in the prereq series, which is also under review now [2].

4. For unprivileged guests vpci_{read|write} has been re-worked
to not pass through accesses to registers not explicitly handled
by the corresponding vPCI handlers: without that, passthrough
to guests is completely unsafe as Xen would allow them full access to
the registers. During development this can be reverted for debugging purposes.

!!! OT: please note, Oleksandr Andrushchenko, who is the author of all this work,
has managed to address almost all review comments given for v6 and pushed the
updated version to GitHub (23.02.22).
So after receiving his agreement I just picked it up and did the following
before pushing V7:
- rebased on the recent staging (resolving a few conflicts)
- updated according to the recent changes (added cf_check specifiers where
  appropriate, etc.) and performed minor adjustments
- made sure that both the current and prereq series [2] didn't break x86 by
  testing PVH Dom0 (vPCI) and PV Dom0 + HVM DomU (PCI passthrough to DomU)
  using QEMU
- my colleague Volodymyr Babchuk (who was involved in the prereq series)
  rechecked that both series work on Arm using real HW

You can also find the series at [3].

[1] https://lore.kernel.org/xen-devel/20220204063459.680961-1-andr2...@gmail.com/
[2] https://lore.kernel.org/xen-devel/20220718211521.664729-1-volodymyr_babc...@epam.com/
[3] https://github.com/otyshchenko1/xen/commits/vpci7

Oleksandr Andrushchenko (11):
  xen/pci: arm: add stub for is_memory_hole
  vpci: add hooks for PCI device assign/de-assign
  vpci/header: implement guest BAR register handlers
  rangeset: add RANGESETF_no_print flag
  vpci/header: handle p2m range sets per BAR
  vpci/header: program p2m with guest BAR view
  vpci/header: emulate PCI_COMMAND register for guests
  vpci/header: reset the command register when adding devices
  vpci: add initial support for virtual PCI bus topology
  xen/arm: translate virtual PCI bus topology for guests
  xen/arm: account IO handlers for emulated PCI MSI-X

 xen/arch/arm/mm.c |   6 +
 xen/arch/arm/vpci.c   |  31 ++-
 xen/common/rangeset.c |   5 +-
 xen/drivers/Kconfig   |   4 +
 xen/drivers/passthrough/pci.c |  11 +
 xen/drivers/vpci/header.c | 458 ++
 xen/drivers/vpci/msi.c|   4 +
 xen/drivers/vpci/msix.c   |   4 +
 xen/drivers/vpci/vpci.c   | 130 ++
 xen/include/xen/rangeset.h|   5 +-
 xen/include/xen/sched.h   |   8 +
 xen/include/xen/vpci.h|  42 +++-
 12 files changed, 604 insertions(+), 104 deletions(-)

-- 
2.25.1




[PATCH V7 08/11] vpci/header: reset the command register when adding devices

2022-07-19 Thread Oleksandr Tyshchenko
From: Oleksandr Andrushchenko 

Reset the command register when assigning a PCI device to a guest:
according to the PCI spec the PCI_COMMAND register is typically all 0's
after reset, but this might not be true for the guest as it needs
to respect host's settings.
For that reason, do not write 0 to the PCI_COMMAND register directly,
but go through the corresponding emulation layer (cmd_write), which
will take care about the actual bits written.

Signed-off-by: Oleksandr Andrushchenko 
---
Since v6:
- use cmd_write directly without introducing emulate_cmd_reg
- update commit message with more description on all 0's in PCI_COMMAND
Since v5:
- updated commit message
Since v1:
 - do not write 0 to the command register, but respect host settings.
---
 xen/drivers/vpci/header.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 2ce69a63a2..1be9775dda 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -701,6 +701,10 @@ static int cf_check init_bars(struct pci_dev *pdev)
  */
 ASSERT(header->guest_cmd == 0);
 
+/* Reset the command register for guests. */
+if ( !is_hwdom )
+cmd_write(pdev, PCI_COMMAND, 0, header);
+
 /* Setup a handler for the command register. */
 rc = vpci_add_register(pdev->vpci, cmd_read, cmd_write, PCI_COMMAND,
2, header);
-- 
2.25.1




[PATCH V7 07/11] vpci/header: emulate PCI_COMMAND register for guests

2022-07-19 Thread Oleksandr Tyshchenko
From: Oleksandr Andrushchenko 

Xen and/or Dom0 may have put values in PCI_COMMAND which they expect
to remain unaltered. The PCI_COMMAND_SERR bit is a good example: while the
guest's view of this will want to be zero initially, the host having set
it to 1 may not easily be overwritten with 0, or else we'd effectively
be giving the guest control of the bit. Thus, the PCI_COMMAND register needs
proper emulation in order to honor the host's settings.

There are examples of emulators [1], [2] which already deal with PCI_COMMAND
register emulation, and it seems that at most they care about the INTx bit
(besides the IO/memory enable and bus master bits, which are written through).
This could be because properly emulating the PCI_COMMAND register requires
knowledge of the whole PCI topology, e.g. whether a setting in a device's
command register is consistent with its upstream port, etc.
This makes me think that because of this complexity others just ignore it.
Nor do I think this can easily be done in the Xen case.

According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0", section "6.2.2
Device Control" the reset state of the command register is typically 0,
so when assigning a PCI device use 0 as the initial state for the guest's view
of the command register.

For now our emulation only makes sure INTx is set according to the host
requirements, i.e. depending on MSI/MSI-X enabled state.

This implementation and the decision to only emulate INTx bit for now
is based on the previous discussion at [3].
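The INTx rule the patch implements in cmd_write() can be distilled to a few lines. This is a hedged sketch, not the Xen code: `effective_cmd` is a hypothetical name, and the real handler additionally records the guest's view in header->guest_cmd and funnels Dom0 writes through separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CMD_INTX_DISABLE 0x400 /* same value as PCI_COMMAND_INTX_DISABLE */

/* Whatever INTx state the guest writes, INTx stays disabled on the
 * hardware while MSI or MSI-X is enabled for the device. */
static uint16_t effective_cmd(uint16_t guest_cmd, bool msi_enabled)
{
    if ( msi_enabled )
        guest_cmd |= CMD_INTX_DISABLE;

    return guest_cmd;
}
```

So a guest clearing the INTx-disable bit while MSI is active still ends up with INTx disabled in hardware.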

[1] https://github.com/qemu/qemu/blob/master/hw/xen/xen_pt_config_init.c#L310
[2] 
https://github.com/projectacrn/acrn-hypervisor/blob/master/hypervisor/hw/pci.c#L336
[3] 
https://patchwork.kernel.org/project/xen-devel/patch/20210903100831.177748-9-andr2...@gmail.com/

Signed-off-by: Oleksandr Andrushchenko 
---
Since v6:
- fold guest's logic into cmd_write
- implement cmd_read, so we can report emulated INTx state to guests
- introduce header->guest_cmd to hold the emulated state of the
  PCI_COMMAND register for guests
- OT: rebased
- OT: add cf_check specifier to cmd_read()
Since v5:
- add additional check for MSI-X enabled while altering INTX bit
- make sure INTx disabled while guests enable MSI/MSI-X
Since v3:
- gate more code on CONFIG_HAS_MSI
- removed logic for the case when MSI/MSI-X not enabled
---
 xen/drivers/vpci/header.c | 38 +-
 xen/drivers/vpci/msi.c|  4 
 xen/drivers/vpci/msix.c   |  4 
 xen/include/xen/vpci.h|  3 +++
 4 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 4e6547a54d..2ce69a63a2 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -443,11 +443,27 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
 return 0;
 }
 
+/* TODO: Add proper emulation for all bits of the command register. */
 static void cf_check cmd_write(
 const struct pci_dev *pdev, unsigned int reg, uint32_t cmd, void *data)
 {
 uint16_t current_cmd = pci_conf_read16(pdev->sbdf, reg);
 
+if ( !is_hardware_domain(pdev->domain) )
+{
+struct vpci_header *header = data;
+
+header->guest_cmd = cmd;
+#ifdef CONFIG_HAS_PCI_MSI
+if ( pdev->vpci->msi->enabled || pdev->vpci->msix->enabled )
+/*
+ * Guest wants to enable INTx, but it can't be enabled
+ * if MSI/MSI-X enabled.
+ */
+cmd |= PCI_COMMAND_INTX_DISABLE;
+#endif
+}
+
 /*
  * Let Dom0 play with all the bits directly except for the memory
  * decoding one.
@@ -464,6 +480,19 @@ static void cf_check cmd_write(
 pci_conf_write16(pdev->sbdf, reg, cmd);
 }
 
+static uint32_t cf_check cmd_read(
+const struct pci_dev *pdev, unsigned int reg, void *data)
+{
+if ( !is_hardware_domain(pdev->domain) )
+{
+struct vpci_header *header = data;
+
+return header->guest_cmd;
+}
+
+return pci_conf_read16(pdev->sbdf, reg);
+}
+
 static void cf_check bar_write(
 const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
 {
@@ -665,8 +694,15 @@ static int cf_check init_bars(struct pci_dev *pdev)
 return -EOPNOTSUPP;
 }
 
+/*
+ * According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0", section "6.2.2
+ * Device Control" the reset state of the command register is
+ * typically all 0's, so this is used as initial value for the guests.
+ */
+ASSERT(header->guest_cmd == 0);
+
 /* Setup a handler for the command register. */
-rc = vpci_add_register(pdev->vpci, vpci_hw_read16, cmd_write, PCI_COMMAND,
+rc = vpci_add_register(pdev->vpci, cmd_read, cmd_write, PCI_COMMAND,
2, header);
 if ( rc )
 return rc;
diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
index d864f740cf..c8c495e2d7 100644
--- a/xen/drivers/vpci/msi.c
+++ b/xen/drivers/vpci/msi.c
@@ -70,6 +70,10 @@ static void cf_

[PATCH V7 11/11] xen/arm: account IO handlers for emulated PCI MSI-X

2022-07-19 Thread Oleksandr Tyshchenko
From: Oleksandr Andrushchenko 

At the moment, we always allocate an extra 16 slots for IO handlers
(see MAX_IO_HANDLER). So while adding IO trap handlers for the emulated
MSI-X registers we need to explicitly state that we have additional IO
handlers, so that they are accounted for.

Signed-off-by: Oleksandr Andrushchenko 
Acked-by: Julien Grall 
---
Cc: Julien Grall 
Cc: Stefano Stabellini 
---
This actually moved here from the part 2 of the prep work for PCI
passthrough on Arm as it seems to be the proper place for it.

Since v5:
- optimize with IS_ENABLED(CONFIG_HAS_PCI_MSI) since VPCI_MAX_VIRT_DEV is
  defined unconditionally
New in v5
---
 xen/arch/arm/vpci.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
index 84b2b068a0..c5902cb9d3 100644
--- a/xen/arch/arm/vpci.c
+++ b/xen/arch/arm/vpci.c
@@ -131,6 +131,8 @@ static int vpci_get_num_handlers_cb(struct domain *d,
 
 unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
 {
+unsigned int count;
+
 if ( !has_vpci(d) )
 return 0;
 
@@ -151,7 +153,17 @@ unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
  * For guests each host bridge requires one region to cover the
  * configuration space. At the moment, we only expose a single host bridge.
  */
-return 1;
+count = 1;
+
+/*
+ * There's a single MSI-X MMIO handler that deals with both PBA
+ * and MSI-X tables per each PCI device being passed through.
+ * Maximum number of emulated virtual devices is VPCI_MAX_VIRT_DEV.
+ */
+if ( IS_ENABLED(CONFIG_HAS_PCI_MSI) )
+count += VPCI_MAX_VIRT_DEV;
+
+return count;
 }
 
 /*
-- 
2.25.1




[PATCH V7 10/11] xen/arm: translate virtual PCI bus topology for guests

2022-07-19 Thread Oleksandr Tyshchenko
From: Oleksandr Andrushchenko 

There are three originators for the PCI configuration space access:
1. The domain that owns physical host bridge: MMIO handlers are
there so we can update vPCI register handlers with the values
written by the hardware domain, e.g. physical view of the registers
vs guest's view on the configuration space.
2. Guest access to the passed through PCI devices: we need to properly
map virtual bus topology to the physical one, e.g. pass the configuration
space access to the corresponding physical devices.
3. Emulated host PCI bridge access. It doesn't exist in the physical
topology, e.g. it can't be mapped to some physical host bridge.
So, all access to the host bridge itself needs to be trapped and
emulated.
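The translation in case 2 boils down to a lookup that rewrites the SBDF in place. A minimal sketch under stated assumptions: the table of `struct vdev` pairs is a hypothetical stand-in for the domain's pdev list, which vpci_translate_virtual_device() walks under the pcidevs lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Each passed-through device pairs the guest-visible SBDF with the
 * physical one. */
struct vdev {
    uint32_t guest_sbdf;
    uint32_t host_sbdf;
};

/* On a match, replace the guest SBDF with the physical one and report
 * success; a failed lookup means the access targets nothing real. */
static bool translate_sbdf(const struct vdev *tbl, size_t n, uint32_t *sbdf)
{
    size_t i;

    for ( i = 0; i < n; i++ )
        if ( tbl[i].guest_sbdf == *sbdf )
        {
            *sbdf = tbl[i].host_sbdf;
            return true;
        }

    return false;
}
```

In the MMIO handlers a failed translation yields all-1s reads and dropped writes, matching what real hardware does for absent devices.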

Signed-off-by: Oleksandr Andrushchenko 
---
Since v6:
- add pcidevs locking to vpci_translate_virtual_device
- update wrt to the new locking scheme
Since v5:
- add vpci_translate_virtual_device for #ifndef CONFIG_HAS_VPCI_GUEST_SUPPORT
  case to simplify ifdefery
- add ASSERT(!is_hardware_domain(d)); to vpci_translate_virtual_device
- reset output register on failed virtual SBDF translation
Since v4:
- indentation fixes
- constify struct domain
- updated commit message
- updates to the new locking scheme (pdev->vpci_lock)
Since v3:
- revisit locking
- move code to vpci.c
Since v2:
 - pass struct domain instead of struct vcpu
 - constify arguments where possible
 - gate relevant code with CONFIG_HAS_VPCI_GUEST_SUPPORT
New in v2
---
 xen/arch/arm/vpci.c | 17 +
 xen/drivers/vpci/vpci.c | 26 ++
 xen/include/xen/vpci.h  |  7 +++
 3 files changed, 50 insertions(+)

diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
index a9fc5817f9..84b2b068a0 100644
--- a/xen/arch/arm/vpci.c
+++ b/xen/arch/arm/vpci.c
@@ -41,6 +41,16 @@ static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
 /* data is needed to prevent a pointer cast on 32bit */
 unsigned long data;
 
+/*
+ * For the passed through devices we need to map their virtual SBDF
+ * to the physical PCI device being passed through.
+ */
+if ( !bridge && !vpci_translate_virtual_device(v->domain, &sbdf) )
+{
+*r = ~0ul;
+return 1;
+}
+
 if ( vpci_ecam_read(sbdf, ECAM_REG_OFFSET(info->gpa),
 1U << info->dabt.size, &data) )
 {
@@ -59,6 +69,13 @@ static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
 struct pci_host_bridge *bridge = p;
 pci_sbdf_t sbdf = vpci_sbdf_from_gpa(bridge, info->gpa);
 
+/*
+ * For the passed through devices we need to map their virtual SBDF
+ * to the physical PCI device being passed through.
+ */
+if ( !bridge && !vpci_translate_virtual_device(v->domain, &sbdf) )
+return 1;
+
 return vpci_ecam_write(sbdf, ECAM_REG_OFFSET(info->gpa),
1U << info->dabt.size, r);
 }
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index d4601ecf9b..fc2c51dc3e 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -158,6 +158,32 @@ static void vpci_remove_virtual_device(const struct pci_dev *pdev)
 }
 }
 
+/*
+ * Find the physical device which is mapped to the virtual device
+ * and translate virtual SBDF to the physical one.
+ */
+bool vpci_translate_virtual_device(struct domain *d, pci_sbdf_t *sbdf)
+{
+struct pci_dev *pdev;
+
+ASSERT(!is_hardware_domain(d));
+
+pcidevs_read_lock();
+for_each_pdev( d, pdev )
+{
+if ( pdev->vpci && (pdev->vpci->guest_sbdf.sbdf == sbdf->sbdf) )
+{
+/* Replace guest SBDF with the physical one. */
+*sbdf = pdev->sbdf;
+pcidevs_read_unlock();
+return true;
+}
+}
+
+pcidevs_read_unlock();
+return false;
+}
+
 /* Notify vPCI that device is assigned to guest. */
 int vpci_assign_device(struct pci_dev *pdev)
 {
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index cc14b0086d..5749d8da78 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -276,6 +276,7 @@ static inline bool __must_check vpci_process_pending(struct vcpu *v)
 /* Notify vPCI that device is assigned/de-assigned to/from guest. */
 int vpci_assign_device(struct pci_dev *pdev);
 void vpci_deassign_device(struct pci_dev *pdev);
+bool vpci_translate_virtual_device(struct domain *d, pci_sbdf_t *sbdf);
 #else
 static inline int vpci_assign_device(struct pci_dev *pdev)
 {
@@ -285,6 +286,12 @@ static inline int vpci_assign_device(struct pci_dev *pdev)
 static inline void vpci_deassign_device(struct pci_dev *pdev)
 {
 };
+
+static inline bool vpci_translate_virtual_device(struct domain *d,
+ pci_sbdf_t *sbdf)
+{
+return false;
+}
 #endif
 
 #endif
-- 
2.25.1




[PATCH V7 09/11] vpci: add initial support for virtual PCI bus topology

2022-07-19 Thread Oleksandr Tyshchenko
From: Oleksandr Andrushchenko 

Assign an SBDF to the PCI devices being passed through with bus 0.
The resulting topology is one where PCIe devices reside on bus 0 of the
root complex itself (embedded endpoints).
This implementation is limited to the 32 devices which are allowed on
a single PCI bus.

Please note that at the moment only function 0 of a multifunction
device can be passed through.
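The 32-slot limit comes from the per-domain slot bitmap the patch adds (vpci_dev_assigned_map). The allocation step in add_virtual_device() can be sketched as below; `alloc_slot` is a hypothetical name, and the real code uses find_first_zero_bit()/__set_bit() on a DECLARE_BITMAP rather than a bare word.

```c
#include <assert.h>

#define MAX_VIRT_DEV 32 /* PCI_SLOT(~0) + 1: slots on a single bus */

/* Find the first free slot in a 32-bit map, mark it used and return it,
 * or return -1 (-ENOSPC in the real code) when the virtual bus is full. */
static int alloc_slot(unsigned int *map)
{
    unsigned int slot;

    for ( slot = 0; slot < MAX_VIRT_DEV; slot++ )
        if ( !(*map & (1U << slot)) )
        {
            *map |= 1U << slot;
            return (int)slot;
        }

    return -1;
}
```

The allocated slot then becomes the device number in the guest SBDF via PCI_DEVFN(slot, 0), with segment and bus both 0.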

Signed-off-by: Oleksandr Andrushchenko 
---
Since v6:
- re-work wrt new locking scheme
- OT: add ASSERT(pcidevs_write_locked()); to add_virtual_device()
Since v5:
- s/vpci_add_virtual_device/add_virtual_device and make it static
- call add_virtual_device from vpci_assign_device and do not use
  REGISTER_VPCI_INIT machinery
- add pcidevs_locked ASSERT
- use DECLARE_BITMAP for vpci_dev_assigned_map
Since v4:
- moved and re-worked guest sbdf initializers
- s/set_bit/__set_bit
- s/clear_bit/__clear_bit
- minor comment fix s/Virtual/Guest/
- added VPCI_MAX_VIRT_DEV constant (PCI_SLOT(~0) + 1) which will be used
  later for counting the number of MMIO handlers required for a guest
  (Julien)
Since v3:
 - make use of VPCI_INIT
 - moved all new code to vpci.c which belongs to it
 - changed open-coded 31 to PCI_SLOT(~0)
 - added comments and code to reject multifunction devices with
   functions other than 0
 - updated comment about vpci_dev_next and made it unsigned int
 - implement roll back in case of error while assigning/deassigning devices
 - s/dom%pd/%pd
Since v2:
 - remove casts that are (a) malformed and (b) unnecessary
 - add new line for better readability
 - remove CONFIG_HAS_VPCI_GUEST_SUPPORT ifdef's as the relevant vPCI
functions are now completely gated with this config
 - gate common code with CONFIG_HAS_VPCI_GUEST_SUPPORT
New in v2
---
 xen/drivers/vpci/vpci.c | 70 -
 xen/include/xen/sched.h |  8 +
 xen/include/xen/vpci.h  | 11 +++
 3 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index f683346285..d4601ecf9b 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -84,6 +84,9 @@ int vpci_add_handlers(struct pci_dev *pdev)
 
 INIT_LIST_HEAD(&pdev->vpci->handlers);
 spin_lock_init(&pdev->vpci->lock);
+#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
+pdev->vpci->guest_sbdf.sbdf = ~0;
+#endif
 
 for ( i = 0; i < NUM_VPCI_INIT; i++ )
 {
@@ -99,6 +102,62 @@ int vpci_add_handlers(struct pci_dev *pdev)
 }
 
 #ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
+static int add_virtual_device(struct pci_dev *pdev)
+{
+struct domain *d = pdev->domain;
+pci_sbdf_t sbdf = { 0 };
+unsigned long new_dev_number;
+
+if ( is_hardware_domain(d) )
+return 0;
+
+ASSERT(pcidevs_write_locked());
+
+/*
+ * Each PCI bus supports 32 devices/slots at max or up to 256 when
+ * there are multi-function ones which are not yet supported.
+ */
+if ( pdev->info.is_extfn )
+{
+gdprintk(XENLOG_ERR, "%pp: only function 0 passthrough supported\n",
+ &pdev->sbdf);
+return -EOPNOTSUPP;
+}
+
+new_dev_number = find_first_zero_bit(d->vpci_dev_assigned_map,
+ VPCI_MAX_VIRT_DEV);
+if ( new_dev_number >= VPCI_MAX_VIRT_DEV )
+return -ENOSPC;
+
+__set_bit(new_dev_number, &d->vpci_dev_assigned_map);
+
+/*
+ * Both segment and bus number are 0:
+ *  - we emulate a single host bridge for the guest, e.g. segment 0
+ *  - with bus 0 the virtual devices are seen as embedded
+ *endpoints behind the root complex
+ *
+ * TODO: add support for multi-function devices.
+ */
+sbdf.devfn = PCI_DEVFN(new_dev_number, 0);
+pdev->vpci->guest_sbdf = sbdf;
+
+return 0;
+
+}
+
+static void vpci_remove_virtual_device(const struct pci_dev *pdev)
+{
+ASSERT(pcidevs_write_locked());
+
+if ( pdev->vpci )
+{
+__clear_bit(pdev->vpci->guest_sbdf.dev,
+&pdev->domain->vpci_dev_assigned_map);
+pdev->vpci->guest_sbdf.sbdf = ~0;
+}
+}
+
 /* Notify vPCI that device is assigned to guest. */
 int vpci_assign_device(struct pci_dev *pdev)
 {
@@ -111,8 +170,16 @@ int vpci_assign_device(struct pci_dev *pdev)
 
 rc = vpci_add_handlers(pdev);
 if ( rc )
-vpci_deassign_device(pdev);
+goto fail;
+
+rc = add_virtual_device(pdev);
+if ( rc )
+goto fail;
+
+return 0;
 
+ fail:
+vpci_deassign_device(pdev);
 return rc;
 }
 
@@ -124,6 +191,7 @@ void vpci_deassign_device(struct pci_dev *pdev)
 if ( !has_vpci(pdev->domain) )
 return;
 
+vpci_remove_virtual_device(pdev);
 vpci_remove_device(pdev);
 }
 #endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b9515eb497..a2848a5740 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -457,6 +457,14 @@ struct domain
 
 #ifdef CONFIG_HAS_PCI
 struct list_h

[PATCH V7 06/11] vpci/header: program p2m with guest BAR view

2022-07-19 Thread Oleksandr Tyshchenko
From: Oleksandr Andrushchenko 

Take into account the guest's BAR view and program its p2m accordingly:
the gfn is the guest's view of the BAR and the mfn is the physical BAR
value as set up by the PCI bus driver in the hardware domain.
This way the hardware domain sees physical BAR values and the guest sees
emulated ones.

Signed-off-by: Oleksandr Andrushchenko 
---
Since v5:
- remove debug print in map_range callback
- remove "identity" from the debug print
Since v4:
- moved start_{gfn|mfn} calculation into map_range
- pass vpci_bar in the map_data instead of start_{gfn|mfn}
- s/guest_addr/guest_reg
Since v3:
- updated comment (Roger)
- removed gfn_add(map->start_gfn, rc); which is wrong
- use v->domain instead of v->vpci.pdev->domain
- removed odd e.g. in comment
- s/d%d/%pd in altered code
- use gdprintk for map/unmap logs
Since v2:
- improve readability for data.start_gfn and restructure ?: construct
Since v1:
 - s/MSI/MSI-X in comments
---
 xen/drivers/vpci/header.c | 24 
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index f14ff11882..4e6547a54d 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -30,6 +30,7 @@
 
 struct map_data {
 struct domain *d;
+const struct vpci_bar *bar;
 bool map;
 };
 
@@ -41,8 +42,21 @@ static int cf_check map_range(
 
 for ( ; ; )
 {
+/* Start address of the BAR as seen by the guest. */
+gfn_t start_gfn = _gfn(PFN_DOWN(is_hardware_domain(map->d)
+? map->bar->addr
+: map->bar->guest_reg));
+/* Physical start address of the BAR. */
+mfn_t start_mfn = _mfn(PFN_DOWN(map->bar->addr));
 unsigned long size = e - s + 1;
 
+/*
+ * Ranges to be mapped don't always start at the BAR start address, as
+ * there can be holes or partially consumed ranges. Account for the
+ * offset of the current address from the BAR start.
+ */
+start_gfn = gfn_add(start_gfn, s - mfn_x(start_mfn));
+
 /*
  * ARM TODOs:
  * - On ARM whether the memory is prefetchable or not should be passed
@@ -52,8 +66,8 @@ static int cf_check map_range(
  * - {un}map_mmio_regions doesn't support preemption.
  */
 
-rc = map->map ? map_mmio_regions(map->d, _gfn(s), size, _mfn(s))
-  : unmap_mmio_regions(map->d, _gfn(s), size, _mfn(s));
+rc = map->map ? map_mmio_regions(map->d, start_gfn, size, _mfn(s))
+  : unmap_mmio_regions(map->d, start_gfn, size, _mfn(s));
 if ( rc == 0 )
 {
 *c += size;
@@ -62,8 +76,8 @@ static int cf_check map_range(
 if ( rc < 0 )
 {
 printk(XENLOG_G_WARNING
-   "Failed to identity %smap [%lx, %lx] for d%d: %d\n",
-   map->map ? "" : "un", s, e, map->d->domain_id, rc);
+   "Failed to %smap [%lx, %lx] for %pd: %d\n",
+   map->map ? "" : "un", s, e, map->d, rc);
 break;
 }
 ASSERT(rc < size);
@@ -155,6 +169,7 @@ bool vpci_process_pending(struct vcpu *v)
 if ( rangeset_is_empty(bar->mem) )
 continue;
 
+data.bar = bar;
 rc = rangeset_consume_ranges(bar->mem, map_range, &data);
 
 if ( rc == -ERESTART )
@@ -218,6 +233,7 @@ static int __init apply_map(struct domain *d, const struct pci_dev *pdev,
 if ( rangeset_is_empty(bar->mem) )
 continue;
 
+data.bar = bar;
 while ( (rc = rangeset_consume_ranges(bar->mem, map_range,
   &data)) == -ERESTART )
 {
-- 
2.25.1




[xen-unstable-smoke test] 171688: tolerable all pass - PUSHED

2022-07-19 Thread osstest service owner
flight 171688 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171688/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  c16a9eda77b2089206d5bc39ab6488c3793e11bf
baseline version:
 xen  e500b6b8d07f87593a9d0e3a391456ef4ac5ee34

Last test of basis   171685  2022-07-19 11:03:41 Z0 days
Testing same since   171688  2022-07-19 15:01:54 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   e500b6b8d0..c16a9eda77  c16a9eda77b2089206d5bc39ab6488c3793e11bf -> smoke



[seabios test] 171687: tolerable FAIL - PUSHED

2022-07-19 Thread osstest service owner
flight 171687 seabios real [real]
flight 171690 seabios real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/171687/
http://logs.test-lab.xenproject.org/osstest/logs/171690/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 12 debian-hvm-install fail pass 
in 171690-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 170031
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 170031
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 170031
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 170031
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 170031
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass

version targeted for testing:
 seabios  46de2eec93bffa0706e6229c0da2919763c8eb04
baseline version:
 seabios  dc88f9b72df52b22c35b127b80c487e0b6fca4af

Last test of basis   170031  2022-05-03 08:44:11 Z   77 days
Testing same since   171687  2022-07-19 11:10:32 Z0 days1 attempts


People who touched revisions under test:
  Gerd Hoffmann 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm fail
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  pass
 test-amd64-amd64-qemuu-nested-amdfail
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-amd64-qemuu-freebsd11-amd64   pass
 test-amd64-amd64-qemuu-freebsd12-amd64   pass
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-ws16-amd64 fail
 test-amd64-i386-xl-qemuu-ws16-amd64  fail
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrictpass
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict pass
 test-amd64-amd64-qemuu-nested-intel  pass
 test-amd64-i386-qemuu-rhel6hvm-intel pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/seabios.git
   dc88f9b..46de2ee  46de2eec93bffa0706e6229c0da2919763c8eb04 -> 
xen-tested-master



Re: [PATCH] xen/mem_sharing: support forks with active vPMU state

2022-07-19 Thread Tamas K Lengyel
On Tue, Jul 19, 2022 at 2:23 PM Andrew Cooper  wrote:
>
> On 19/07/2022 18:18, Tamas K Lengyel wrote:
> > diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
> > index d2c03a1104..2b5d64a60d 100644
> > --- a/xen/arch/x86/cpu/vpmu.c
> > +++ b/xen/arch/x86/cpu/vpmu.c
> > @@ -529,6 +529,16 @@ void vpmu_initialise(struct vcpu *v)
> >  put_vpmu(v);
> >  }
> >
> > +void vpmu_allocate_context(struct vcpu *v)
> > +{
> > +struct vpmu_struct *vpmu = vcpu_vpmu(v);
> > +
> > +if ( vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) )
> > +return;
> > +
> > +alternative_call(vpmu_ops.allocate_context, v);
>
> You need to fill in an AMD pointer, or make this conditional.
>
> All alternatives have NULL pointers turned into UDs.
>
> Should be a two-liner on the AMD side.

There is no AMD caller to this so I'll just make it conditional to
ensure it's non-NULL.

>
> > diff --git a/xen/arch/x86/cpu/vpmu_intel.c b/xen/arch/x86/cpu/vpmu_intel.c
> > index 8612f46973..31dc0ee14b 100644
> > --- a/xen/arch/x86/cpu/vpmu_intel.c
> > +++ b/xen/arch/x86/cpu/vpmu_intel.c
> >  static int core2_vpmu_verify(struct vcpu *v)
> > @@ -474,7 +485,11 @@ static int core2_vpmu_alloc_resource(struct vcpu *v)
> >  sizeof(uint64_t) * fixed_pmc_cnt;
> >
> >  vpmu->context = core2_vpmu_cxt;
> > +vpmu->context_size = sizeof(struct xen_pmu_intel_ctxt) +
> > + fixed_pmc_cnt * sizeof(uint64_t) +
> > + arch_pmc_cnt * sizeof(struct xen_pmu_cntr_pair);
>
> This wants deduplicating with the earlier calculation, surely?

Sure.

>
> >  vpmu->priv_context = p;
> > +vpmu->priv_context_size = sizeof(uint64_t);
> >
> >  if ( !has_vlapic(v->domain) )
> >  {
> > @@ -882,6 +897,7 @@ static int cf_check vmx_vpmu_initialise(struct vcpu *v)
> >
> >  static const struct arch_vpmu_ops __initconst_cf_clobber core2_vpmu_ops = {
> >  .initialise = vmx_vpmu_initialise,
> > +.allocate_context = core2_vpmu_alloc_resource,
>
> core2_vpmu_alloc_resource() needs to gain a cf_check to not explode on
> TGL/SPR.
>
> >  .do_wrmsr = core2_vpmu_do_wrmsr,
> >  .do_rdmsr = core2_vpmu_do_rdmsr,
> >  .do_interrupt = core2_vpmu_do_interrupt,
> > diff --git a/xen/arch/x86/include/asm/vpmu.h b/xen/arch/x86/include/asm/vpmu.h
> > index e5709bd44a..14d0939247 100644
> > --- a/xen/arch/x86/include/asm/vpmu.h
> > +++ b/xen/arch/x86/include/asm/vpmu.h
> > @@ -106,8 +109,10 @@ void vpmu_lvtpc_update(uint32_t val);
> >  int vpmu_do_msr(unsigned int msr, uint64_t *msr_content, bool is_write);
> >  void vpmu_do_interrupt(struct cpu_user_regs *regs);
> >  void vpmu_initialise(struct vcpu *v);
> > +void vpmu_allocate_context(struct vcpu *v);
> >  void vpmu_destroy(struct vcpu *v);
> >  void vpmu_save(struct vcpu *v);
> > +void vpmu_save_force(void *arg);
>
> Needs the cf_check to compile.
>
> >  int vpmu_load(struct vcpu *v, bool_t from_guest);
> >  void vpmu_dump(struct vcpu *v);
> >
> > diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
> > index 8f9d9ed9a9..39cd03abf7 100644
> > --- a/xen/arch/x86/mm/mem_sharing.c
> > +++ b/xen/arch/x86/mm/mem_sharing.c
> > @@ -1653,6 +1653,50 @@ static void copy_vcpu_nonreg_state(struct vcpu *d_vcpu, struct vcpu *cd_vcpu)
> >  hvm_set_nonreg_state(cd_vcpu, &nrs);
> >  }
> >
> > +static int copy_vpmu(struct vcpu *d_vcpu, struct vcpu *cd_vcpu)
> > +{
> > +struct vpmu_struct *d_vpmu = vcpu_vpmu(d_vcpu);
> > +struct vpmu_struct *cd_vpmu = vcpu_vpmu(cd_vcpu);
> > +
> > +if ( !vpmu_are_all_set(d_vpmu, VPMU_INITIALIZED | VPMU_CONTEXT_ALLOCATED) )
> > +return 0;
> > +if ( !vpmu_is_set(cd_vpmu, VPMU_CONTEXT_ALLOCATED) )
> > +{
> > +vpmu_allocate_context(cd_vcpu);
> > +if ( !vpmu_is_set(cd_vpmu, VPMU_CONTEXT_ALLOCATED) )
> > +return -ENOMEM;
>
> vpmu_allocate_context() already checks VPMU_CONTEXT_ALLOCATED.  But
> isn't the double check here redundant?

True, I could drop the top level check here.

>
> The subsequent check looks like you want to pass the hook's return value
> up through vpmu_allocate_context().
>
> (And if you feel like turning it from bool-as-int to something more
> sane, say -errno, that would also be great.)

Yea, I wanted to avoid having to rework the currently backwards
meaning of the returned int values of vpmu functions. That's why I
double-check that the allocation worked instead. If I did what you
recommend, it would be the only vpmu function that doesn't return void
and the only callback that returns error codes instead of boolean
success/failure. I'd rather keep the code self-consistent in vpmu and
just live with this arguably odd-looking logic here.

Tamas



[ovmf test] 171689: all pass - PUSHED

2022-07-19 Thread osstest service owner
flight 171689 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171689/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 19a87683654a4969a9f86a3d02199c253c789970
baseline version:
 ovmf f0064ac3afa28e1aa3b6b9c22c6cf422a4bb8771

Last test of basis   171679  2022-07-19 03:13:39 Z0 days
Testing same since   171689  2022-07-19 16:41:40 Z0 days1 attempts


People who touched revisions under test:
  Jeff Brasen 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   f0064ac3af..19a8768365  19a87683654a4969a9f86a3d02199c253c789970 -> 
xen-tested-master



Re: [PATCH] xen/privcmd: prevent integer overflow on 32 bit systems

2022-07-19 Thread kernel test robot
Hi Dan,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on xen-tip/linux-next]
[also build test WARNING on linus/master v5.19-rc7 next-20220719]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Dan-Carpenter/xen-privcmd-prevent-integer-overflow-on-32-bit-systems/20220715-162307
base:   https://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git linux-next
config: x86_64-randconfig-a005 
(https://download.01.org/0day-ci/archive/20220720/202207200236.geswjpck-...@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 
fa0c7639e91fa1cd0cf2ff0445a1634a90fe850a)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/intel-lab-lkp/linux/commit/ea22ebd83753c7181043e69251b78f0be73675ad
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review 
Dan-Carpenter/xen-privcmd-prevent-integer-overflow-on-32-bit-systems/20220715-162307
git checkout ea22ebd83753c7181043e69251b78f0be73675ad
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 
O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/acpi/ drivers/ata/ drivers/rtc/ 
drivers/thermal/intel/ drivers/xen/

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

>> drivers/xen/privcmd.c:459:13: warning: result of comparison of constant 
>> 2305843009213693951 with expression of type 'unsigned int' is always false 
>> [-Wtautological-constant-out-of-range-compare]
   if (m.num > SIZE_MAX / sizeof(*m.arr))
   ~ ^ ~
   drivers/xen/privcmd.c:469:13: warning: result of comparison of constant 
2305843009213693951 with expression of type 'unsigned int' is always false 
[-Wtautological-constant-out-of-range-compare]
   if (m.num > SIZE_MAX / sizeof(*m.arr))
   ~ ^ ~
   2 warnings generated.


vim +459 drivers/xen/privcmd.c

   441  
   442  static long privcmd_ioctl_mmap_batch(
   443  struct file *file, void __user *udata, int version)
   444  {
   445  struct privcmd_data *data = file->private_data;
   446  int ret;
   447  struct privcmd_mmapbatch_v2 m;
   448  struct mm_struct *mm = current->mm;
   449  struct vm_area_struct *vma;
   450  unsigned long nr_pages;
   451  LIST_HEAD(pagelist);
   452  struct mmap_batch_state state;
   453  
   454  switch (version) {
   455  case 1:
   456  if (copy_from_user(&m, udata, sizeof(struct privcmd_mmapbatch)))
   457  return -EFAULT;
   458  /* Returns per-frame error in m.arr. */
 > 459  if (m.num > SIZE_MAX / sizeof(*m.arr))
   460  return -EINVAL;
   461  m.err = NULL;
   462  if (!access_ok(m.arr, m.num * sizeof(*m.arr)))
   463  return -EFAULT;
   464  break;
   465  case 2:
   466  if (copy_from_user(&m, udata, sizeof(struct privcmd_mmapbatch_v2)))
   467  return -EFAULT;
   468  /* Returns per-frame error code in m.err. */
   469  if (m.num > SIZE_MAX / sizeof(*m.arr))
   470  return -EINVAL;
   471  if (!access_ok(m.err, m.num * (sizeof(*m.err))))
   472  return -EFAULT;
   473  break;
   474  default:
   475  return -EINVAL;
   476  }
   477  
   478  /* If restriction is in place, check the domid matches */
   479  if (data->domid != DOMID_INVALID && data->domid != m.dom)
   480  return -EPERM;
   481  
   482  nr_pages = DIV_ROUND_UP(m.num, XEN_PFN_PER_PAGE);
   483  if ((m.num <= 0) || (nr_pages > (LONG_MAX >> PAGE_SHIFT)))
   484  return -EINVAL;
   485  
   486  ret = gather_array(&pagelist, m.num, sizeof(xen_pfn_t), m.arr);
   487  
   488  if (ret)
   489  goto out;
   490  if (list_empty(&pagelist)) {
   491  ret = -EINVAL;
   492  goto out;
   493  }
   494  
   495  if (version == 2) {
   496  

[PATCH] x86: Expose more MSR_ARCH_CAPS to hwdom

2022-07-19 Thread Jason Andryuk
commit e46474278a0e ("x86/intel: Expose MSR_ARCH_CAPS to dom0") started
exposing MSR_ARCH_CAPS to dom0.  More bits in MSR_ARCH_CAPS have since
been defined, but they haven't been exposed.  Update the list to allow
them through.

As one example, this allows a Linux Dom0 to know that it has the
appropriate microcode via FB_CLEAR.  Notably, with the updated
microcode, dom0's
/sys/devices/system/cpu/vulnerabilities/mmio_stale_data changes from:
"Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state
unknown"
to:
"Mitigation: Clear CPU buffers; SMT Host state unknown"

This exposes the MMIO Stale Data and Intel Branch History Injection
(BHI) controls as well as the page size change MCE issue bit.

Fixes: commit 2ebe8fe9b7e0 ("x86/spec-ctrl: Enumeration for MMIO Stale Data controls")
Fixes: commit cea9ae062295 ("x86/spec-ctrl: Enumeration for new Intel BHI controls")
Fixes: commit 59e89cdabc71 ("x86/vtx: Disable executable EPT superpages to work around CVE-2018-12207")

Signed-off-by: Jason Andryuk 
---
This is the broader replacement for "x86: Add MMIO Stale Data arch_caps
to hardware domain".

It wasn't discussed previously, but ARCH_CAPS_IF_PSCHANGE_MC_NO is added
as well.

This patch can't be directly backported because cea9ae062295 was not
backported.

 xen/arch/x86/msr.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 6206529162..170f041793 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -72,7 +72,9 @@ static void __init calculate_host_policy(void)
 mp->arch_caps.raw &=
 (ARCH_CAPS_RDCL_NO | ARCH_CAPS_IBRS_ALL | ARCH_CAPS_RSBA |
  ARCH_CAPS_SKIP_L1DFL | ARCH_CAPS_SSB_NO | ARCH_CAPS_MDS_NO |
- ARCH_CAPS_IF_PSCHANGE_MC_NO | ARCH_CAPS_TSX_CTRL | ARCH_CAPS_TAA_NO);
+ ARCH_CAPS_IF_PSCHANGE_MC_NO | ARCH_CAPS_TSX_CTRL | ARCH_CAPS_TAA_NO |
+ ARCH_CAPS_SBDR_SSDP_NO | ARCH_CAPS_FBSDP_NO | ARCH_CAPS_PSDP_NO |
+ ARCH_CAPS_FB_CLEAR | ARCH_CAPS_RRSBA | ARCH_CAPS_BHI_NO);
 }
 
 static void __init calculate_pv_max_policy(void)
@@ -161,7 +163,10 @@ int init_domain_msr_policy(struct domain *d)
 
 mp->arch_caps.raw = val &
 (ARCH_CAPS_RDCL_NO | ARCH_CAPS_IBRS_ALL | ARCH_CAPS_RSBA |
- ARCH_CAPS_SSB_NO | ARCH_CAPS_MDS_NO | ARCH_CAPS_TAA_NO);
+ ARCH_CAPS_SSB_NO | ARCH_CAPS_MDS_NO | ARCH_CAPS_IF_PSCHANGE_MC_NO |
+ ARCH_CAPS_TAA_NO | ARCH_CAPS_SBDR_SSDP_NO | ARCH_CAPS_FBSDP_NO |
+ ARCH_CAPS_PSDP_NO | ARCH_CAPS_FB_CLEAR | ARCH_CAPS_RRSBA |
+ ARCH_CAPS_BHI_NO);
 }
 
 d->arch.msr = mp;
-- 
2.36.1




Re: [PATCH] x86emul: add memory operand low bits checks for ENQCMD{,S}

2022-07-19 Thread Andrew Cooper
On 19/07/2022 13:56, Jan Beulich wrote:
> Already ISE rev 044 added text to this effect; rev 045 further dropped
> leftover earlier text indicating the contrary:
> - ENQCMD requires the low 32 bits of the memory operand to be clear,
> - ENQCMDS requires bits 20...30 of the memory operand to be clear.
>
> Signed-off-by: Jan Beulich 
> ---
> I'm a little reluctant to add a Fixes: tag here, because at the time
> the code was written the behavior was matching what was documented.

It needs a tag, because this is fixing a problem in a previous patch,
and in principle wants backporting to 4.14.

It doesn't matter the cause of the error, and "Intel changed their
documentation" is pretty good as far as excuses go.

As far as the change goes, that does seem to match the latest docs.

Acked-by: Andrew Cooper 


Re: [PATCH] x86: Expose more MSR_ARCH_CAPS to hwdom

2022-07-19 Thread Andrew Cooper
On 19/07/2022 21:08, Jason Andryuk wrote:
> commit e46474278a0e ("x86/intel: Expose MSR_ARCH_CAPS to dom0") started
> exposing MSR_ARCH_CAPS to dom0.  More bits in MSR_ARCH_CAPS have since
> been defined, but they haven't been exposed.  Update the list to allow
> them through.
>
> As one example, this allows a linux Dom0 to know that it has the
> appropriate microcode via FB_CLEAR.  Notably, and with the updated
> microcode, this changes dom0's
> /sys/devices/system/cpu/vulnerabilities/mmio_stale_data changes from:
> "Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state
> unknown"
> to:
> "Mitigation: Clear CPU buffers; SMT Host state unknown"
>
> This exposes the MMIO Stale Data and Intel Branch History Injection
> (BHI) controls as well as the page size change MCE issue bit.
>
> Fixes: commit 2ebe8fe9b7e0 ("x86/spec-ctrl: Enumeration for MMIO Stale Data 
> controls")
> Fixes: commit cea9ae062295 ("x86/spec-ctrl: Enumeration for new Intel BHI 
> controls")
> Fixes: commit 59e89cdabc71 ("x86/vtx: Disable executable EPT superpages to 
> work around CVE-2018-12207")
>
> Signed-off-by: Jason Andryuk 
> ---
> This is the broader replacement for "x86: Add MMIO Stale Data arch_caps
> to hardware domain".
>
> It wasn't discussed previously, but ARCH_CAPS_IF_PSCHANGE_MC_NO is added
> as well.

I deliberately excluded IF_PSCHANGE_MC_NO because it wasn't relevant. 
But I suppose Linux is looking for it anyway?

IF_PSCHANGE_MC_NO is the mouthful meaning "the frontend doesn't have a
strop when it takes an assist and finds that the iTLB mapping has changed". 
It's only interesting to hypervisors looking after an EPT guest, which
means that it's only interesting to expose to HAP guests with nested
virt.  Except we disable mitigations for nested virt because there's a
bug in the nHAP code which I didn't have time to figure out, and none of
this is remotely security supported to start with.

In principle, TAA_NO's visibility should be dependent on the visibility
of RTM, but given this is all a pile of hacks anyway, I'm not sure how
much I care at this point.

~Andrew


[qemu-mainline test] 171683: tolerable FAIL - PUSHED

2022-07-19 Thread osstest service owner
flight 171683 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171683/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 14 guest-start  fail REGR. vs. 171676

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 171676
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 171676
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171676
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 171676
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 171676
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 171676
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 171676
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 171676
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 qemuub8bb9bbf4695b89bbdca702a054db0a7a2c8ff2b
baseline version:
 qemuu782378973121addeb11b13fd12a6ac2e69faa33f

Last test of basis   171676  2022-07-18 22:40:16 Z0 days
Testing same since   171683  2022-07-19 09:07:08 Z0 days1 attempts


People who touched revisions under test:
  Cédric Le Goater 
  Daniel Henrique Barboza 
  Fabiano Rosas 
  Jason A

[ovmf test] 171691: all pass - PUSHED

2022-07-19 Thread osstest service owner
flight 171691 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171691/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 671b0cea510ad6de02ee9d6dbdf8f9bbb881f35d
baseline version:
 ovmf 19a87683654a4969a9f86a3d02199c253c789970

Last test of basis   171689  2022-07-19 16:41:40 Z0 days
Testing same since   171691  2022-07-19 19:12:59 Z0 days1 attempts


People who touched revisions under test:
  Saloni Kasbekar 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   19a8768365..671b0cea51  671b0cea510ad6de02ee9d6dbdf8f9bbb881f35d -> 
xen-tested-master



[xen-unstable test] 171686: regressions - FAIL

2022-07-19 Thread osstest service owner
flight 171686 xen-unstable real [real]
flight 171692 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/171686/
http://logs.test-lab.xenproject.org/osstest/logs/171692/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail REGR. 
vs. 171678

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 171678
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 171678
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171678
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 171678
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 171678
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 171678
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 171678
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 171678
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 171678
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 171678
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 171678
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 171678
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-suppo

Re: Ryzen 6000 (Mobile)

2022-07-19 Thread Dylanger Daly
> I'd focus on the booting issues first. And I guess you can take a video
> of that (assuming that a single screenshot likely isn't going to be
> enough), possibly with "vga=keep" in place (albeit that introduces
> extra slowness)?
>
> There's also the option of using an EHCI debug port for the serial
> console, but this requires (a) a special cable and (b) the system
> designers not having inserted any hubs between the controller and the
> connector.

Do you know if it's possible to have `console=vga vga=keep` and specify a 
secondary monitor? It would be very useful if I could have Xen log to a 
secondary monitor. In any case, I'll record a video today. I can't seem to get 
anything useful out of /var/log/xen/console/hypervisor.log; I assume this log 
file isn't written on a 'live' basis.

I would assume AMD has disabled any sort of debugging/NIDnT/CCD. Surprisingly 
or unsurprisingly, it's easier to debug Chromebooks with their CCD USB-C cables.

> Ok, these sound like two different things. One is dom0 failing to boot, and 
> one is the hang/reset when starting the VMs.
> Lets start with the dom0 problem first. The link you provide suggests a 
> credit2 bug. Does dom0 boot if you pass `sched=credit` on the command line, 
> in place of `dom0_max_vcpus=1 dom0_vcpus_pin` ?

Yes, this is correct. I think the first problem is an AMD 6000 series CPU 
issue, as others have reported this same first issue: 
https://github.com/QubesOS/qubes-issues/issues/7570 (having to add 
`dom0_max_vcpus=1 dom0_vcpus_pin`).

I believe the second issue could be platform-specific: a UEFI option relating 
to the scheduler, or something else causing the device to hang. Anecdotally, 
others with the same-ish CPU aren't having this issue, so it could be specific 
to the Lenovo Yoga Slim 7 Pro X (Gen 7).

Issue #1 seems to be common with newer AMD Ryzen Mobile CPUs.
Issue #2 seems to be Lenovo-specific. I've tried limiting other domUs to 1 
vCPU to no avail; I haven't tried pinning a vCPU to a domU yet.

Unfortunately, I tried adding `sched=credit` in place of the pinning config and 
dom0 didn't come up to ask for a LUKS password. Dom0 does indeed boot; it just 
doesn't make it past the early stage of kernel setup.

Cheers, Dylanger
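For readers following along, the two Xen command-line variants being compared in this thread might look like the following in a GRUB entry (a hedged sketch; the multiboot path and surrounding options are illustrative, not taken from the poster's actual setup):

```
# Variant 1: the pinning workaround reported for Ryzen 6000 mobile parts
multiboot2 /xen.gz console=vga vga=keep dom0_max_vcpus=1 dom0_vcpus_pin

# Variant 2: the suggested experiment, swapping in the credit scheduler
multiboot2 /xen.gz console=vga vga=keep sched=credit
```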

[PATCH AUTOSEL 5.18 20/54] objtool: Update Retpoline validation

2022-07-19 Thread Sasha Levin
From: Peter Zijlstra 

[ Upstream commit 9bb2ec608a209018080ca262f771e6a9ff203b6f ]

Update retpoline validation with the new CONFIG_RETPOLINE requirement of
not having bare naked RET instructions.

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Borislav Petkov 
Reviewed-by: Josh Poimboeuf 
Signed-off-by: Borislav Petkov 
Signed-off-by: Sasha Levin 
---
 arch/x86/include/asm/nospec-branch.h |  6 ++
 arch/x86/mm/mem_encrypt_boot.S   |  2 ++
 arch/x86/xen/xen-head.S  |  1 +
 tools/objtool/check.c| 19 +--
 4 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h 
b/arch/x86/include/asm/nospec-branch.h
index 92290b4f1c96..a790109f9337 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -75,6 +75,12 @@
.popsection
 .endm
 
+/*
+ * (ab)use RETPOLINE_SAFE on RET to annotate away 'bare' RET instructions
+ * vs RETBleed validation.
+ */
+#define ANNOTATE_UNRET_SAFE ANNOTATE_RETPOLINE_SAFE
+
 /*
  * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple
  * indirect jmp/call which may be susceptible to the Spectre variant 2
diff --git a/arch/x86/mm/mem_encrypt_boot.S b/arch/x86/mm/mem_encrypt_boot.S
index d94dea450fa6..9de3d900bc92 100644
--- a/arch/x86/mm/mem_encrypt_boot.S
+++ b/arch/x86/mm/mem_encrypt_boot.S
@@ -66,6 +66,7 @@ SYM_FUNC_START(sme_encrypt_execute)
pop %rbp
 
/* Offset to __x86_return_thunk would be wrong here */
+   ANNOTATE_UNRET_SAFE
ret
int3
 SYM_FUNC_END(sme_encrypt_execute)
@@ -154,6 +155,7 @@ SYM_FUNC_START(__enc_copy)
pop %r15
 
/* Offset to __x86_return_thunk would be wrong here */
+   ANNOTATE_UNRET_SAFE
ret
int3
 .L__enc_copy_end:
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 3a2cd93bf059..fa884fc73e07 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -26,6 +26,7 @@ SYM_CODE_START(hypercall_page)
.rept (PAGE_SIZE / 32)
UNWIND_HINT_FUNC
ANNOTATE_NOENDBR
+   ANNOTATE_UNRET_SAFE
ret
/*
 * Xen will write the hypercall page, and sort out ENDBR.
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 204519704f3b..2daa0dce199b 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2097,8 +2097,9 @@ static int read_retpoline_hints(struct objtool_file *file)
}
 
if (insn->type != INSN_JUMP_DYNAMIC &&
-   insn->type != INSN_CALL_DYNAMIC) {
-   WARN_FUNC("retpoline_safe hint not an indirect 
jump/call",
+   insn->type != INSN_CALL_DYNAMIC &&
+   insn->type != INSN_RETURN) {
+   WARN_FUNC("retpoline_safe hint not an indirect 
jump/call/ret",
  insn->sec, insn->offset);
return -1;
}
@@ -3631,7 +3632,8 @@ static int validate_retpoline(struct objtool_file *file)
 
for_each_insn(file, insn) {
if (insn->type != INSN_JUMP_DYNAMIC &&
-   insn->type != INSN_CALL_DYNAMIC)
+   insn->type != INSN_CALL_DYNAMIC &&
+   insn->type != INSN_RETURN)
continue;
 
if (insn->retpoline_safe)
@@ -3646,9 +3648,14 @@ static int validate_retpoline(struct objtool_file *file)
if (!strcmp(insn->sec->name, ".init.text") && !module)
continue;
 
-   WARN_FUNC("indirect %s found in RETPOLINE build",
- insn->sec, insn->offset,
- insn->type == INSN_JUMP_DYNAMIC ? "jump" : "call");
+   if (insn->type == INSN_RETURN) {
+   WARN_FUNC("'naked' return found in RETPOLINE build",
+ insn->sec, insn->offset);
+   } else {
+   WARN_FUNC("indirect %s found in RETPOLINE build",
+ insn->sec, insn->offset,
+ insn->type == INSN_JUMP_DYNAMIC ? "jump" : 
"call");
+   }
 
warnings++;
}
-- 
2.35.1




[PATCH AUTOSEL 5.18 21/54] x86/xen: Rename SYS* entry points

2022-07-19 Thread Sasha Levin
From: Peter Zijlstra 

[ Upstream commit b75b7f8ef1148be1b9321ffc2f6c19238904b438 ]

Native SYS{CALL,ENTER} entry points are called
entry_SYS{CALL,ENTER}_{64,compat}; make sure the Xen versions are
named consistently.

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Borislav Petkov 
Reviewed-by: Josh Poimboeuf 
Signed-off-by: Borislav Petkov 
Signed-off-by: Sasha Levin 
---
 arch/x86/xen/setup.c   |  6 +++---
 arch/x86/xen/xen-asm.S | 20 ++--
 arch/x86/xen/xen-ops.h |  6 +++---
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 81aa46f770c5..cfa99e8f054b 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -918,7 +918,7 @@ void xen_enable_sysenter(void)
if (!boot_cpu_has(sysenter_feature))
return;
 
-   ret = register_callback(CALLBACKTYPE_sysenter, xen_sysenter_target);
+   ret = register_callback(CALLBACKTYPE_sysenter, 
xen_entry_SYSENTER_compat);
if(ret != 0)
setup_clear_cpu_cap(sysenter_feature);
 }
@@ -927,7 +927,7 @@ void xen_enable_syscall(void)
 {
int ret;
 
-   ret = register_callback(CALLBACKTYPE_syscall, xen_syscall_target);
+   ret = register_callback(CALLBACKTYPE_syscall, xen_entry_SYSCALL_64);
if (ret != 0) {
printk(KERN_ERR "Failed to set syscall callback: %d\n", ret);
/* Pretty fatal; 64-bit userspace has no other
@@ -936,7 +936,7 @@ void xen_enable_syscall(void)
 
if (boot_cpu_has(X86_FEATURE_SYSCALL32)) {
ret = register_callback(CALLBACKTYPE_syscall32,
-   xen_syscall32_target);
+   xen_entry_SYSCALL_compat);
if (ret != 0)
setup_clear_cpu_cap(X86_FEATURE_SYSCALL32);
}
diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S
index caa9bc2fa100..6bf9d45b9178 100644
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -234,7 +234,7 @@ SYM_CODE_END(xenpv_restore_regs_and_return_to_usermode)
  */
 
 /* Normal 64-bit system call target */
-SYM_CODE_START(xen_syscall_target)
+SYM_CODE_START(xen_entry_SYSCALL_64)
UNWIND_HINT_EMPTY
ENDBR
popq %rcx
@@ -249,12 +249,12 @@ SYM_CODE_START(xen_syscall_target)
movq $__USER_CS, 1*8(%rsp)
 
jmp entry_SYSCALL_64_after_hwframe
-SYM_CODE_END(xen_syscall_target)
+SYM_CODE_END(xen_entry_SYSCALL_64)
 
 #ifdef CONFIG_IA32_EMULATION
 
 /* 32-bit compat syscall target */
-SYM_CODE_START(xen_syscall32_target)
+SYM_CODE_START(xen_entry_SYSCALL_compat)
UNWIND_HINT_EMPTY
ENDBR
popq %rcx
@@ -269,10 +269,10 @@ SYM_CODE_START(xen_syscall32_target)
movq $__USER32_CS, 1*8(%rsp)
 
jmp entry_SYSCALL_compat_after_hwframe
-SYM_CODE_END(xen_syscall32_target)
+SYM_CODE_END(xen_entry_SYSCALL_compat)
 
 /* 32-bit compat sysenter target */
-SYM_CODE_START(xen_sysenter_target)
+SYM_CODE_START(xen_entry_SYSENTER_compat)
UNWIND_HINT_EMPTY
ENDBR
/*
@@ -291,19 +291,19 @@ SYM_CODE_START(xen_sysenter_target)
movq $__USER32_CS, 1*8(%rsp)
 
jmp entry_SYSENTER_compat_after_hwframe
-SYM_CODE_END(xen_sysenter_target)
+SYM_CODE_END(xen_entry_SYSENTER_compat)
 
 #else /* !CONFIG_IA32_EMULATION */
 
-SYM_CODE_START(xen_syscall32_target)
-SYM_CODE_START(xen_sysenter_target)
+SYM_CODE_START(xen_entry_SYSCALL_compat)
+SYM_CODE_START(xen_entry_SYSENTER_compat)
UNWIND_HINT_EMPTY
ENDBR
lea 16(%rsp), %rsp  /* strip %rcx, %r11 */
mov $-ENOSYS, %rax
pushq $0
jmp hypercall_iret
-SYM_CODE_END(xen_sysenter_target)
-SYM_CODE_END(xen_syscall32_target)
+SYM_CODE_END(xen_entry_SYSENTER_compat)
+SYM_CODE_END(xen_entry_SYSCALL_compat)
 
 #endif /* CONFIG_IA32_EMULATION */
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index fd0fec6e92f4..9a8bb972193d 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -10,10 +10,10 @@
 /* These are code, but not functions.  Defined in entry.S */
 extern const char xen_failsafe_callback[];
 
-void xen_sysenter_target(void);
+void xen_entry_SYSENTER_compat(void);
 #ifdef CONFIG_X86_64
-void xen_syscall_target(void);
-void xen_syscall32_target(void);
+void xen_entry_SYSCALL_64(void);
+void xen_entry_SYSCALL_compat(void);
 #endif
 
 extern void *xen_initial_gdt;
-- 
2.35.1




Re: [PATCH 3/3] x86: decouple pat and mtrr handling

2022-07-19 Thread Chuck Zmudzinski
On 7/15/22 10:25 AM, Juergen Gross wrote:
> Today PAT is usable only with MTRR being active, with some nasty tweaks
> to make PAT usable when running as Xen PV guest, which doesn't support
> MTRR.
>
> The reason for this coupling is that both PAT MSR changes and MTRR
> changes require a similar sequence, and so full PAT support was added
> using the already available MTRR handling.
>
> Xen PV PAT handling can work without MTRR, as it just needs to consume
> the PAT MSR setting done by the hypervisor without the ability and need
> to change it. This in turn has resulted in a convoluted initialization
> sequence and wrong decisions regarding cache mode availability due to
> misguiding PAT availability flags.
>
> Fix all of that by allowing PAT to be used without MTRR and by adding an
> environment dependent PAT init function.
>
> Cc:  # 5.17
> Fixes: bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")
> Signed-off-by: Juergen Gross 
> ---
...
> diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
> index d5ef64ddd35e..3d4bc27ffebb 100644
> --- a/arch/x86/mm/pat/memtype.c
> +++ b/arch/x86/mm/pat/memtype.c
> ...
>  
> +void pat_init_noset(void)
> +{
> + pat_bp_enabled = true;
> + init_cache_modes();
> +}

This is what should fix the regression caused by commit
bdd8b6c98239 ("drm/i915: replace X86_FEATURE_PAT
with pat_enabled()"). Thanks for including this.

This function might need a better name. Does noset
refer to the fact that when we use this function, we do
not set or write to the PAT MSR? Maybe it should be
pat_init_noset_msr. Is Xen PV Dom0 the only case when
this function will be called, or is it also called for
unprivileged Xen PV domains? Then maybe it should be
named pat_init_xen_pv_dom0, or just pat_init_xen_pv if
it is also used with unprivileged Xen PV domains. Or, if
you want to keep the name pat_init_noset, maybe it
should be preceded by a comment clearly explaining that
this function is currently only for the Xen PV (and/or Xen
PV Dom0) case, where we don't write to the PAT MSR but
still want to report PAT as enabled.

Chuck



[PATCH AUTOSEL 5.15 16/42] x86/xen: Rename SYS* entry points

2022-07-19 Thread Sasha Levin
From: Peter Zijlstra 

[ Upstream commit b75b7f8ef1148be1b9321ffc2f6c19238904b438 ]

Native SYS{CALL,ENTER} entry points are called
entry_SYS{CALL,ENTER}_{64,compat}; make sure the Xen versions are
named consistently.

Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Borislav Petkov 
Reviewed-by: Josh Poimboeuf 
Signed-off-by: Borislav Petkov 
Signed-off-by: Sasha Levin 
---
 arch/x86/xen/setup.c   |  6 +++---
 arch/x86/xen/xen-asm.S | 20 ++--
 arch/x86/xen/xen-ops.h |  6 +++---
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 8bfc10330107..1f80dd3a2dd4 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -922,7 +922,7 @@ void xen_enable_sysenter(void)
if (!boot_cpu_has(sysenter_feature))
return;
 
-   ret = register_callback(CALLBACKTYPE_sysenter, xen_sysenter_target);
+   ret = register_callback(CALLBACKTYPE_sysenter, 
xen_entry_SYSENTER_compat);
if(ret != 0)
setup_clear_cpu_cap(sysenter_feature);
 }
@@ -931,7 +931,7 @@ void xen_enable_syscall(void)
 {
int ret;
 
-   ret = register_callback(CALLBACKTYPE_syscall, xen_syscall_target);
+   ret = register_callback(CALLBACKTYPE_syscall, xen_entry_SYSCALL_64);
if (ret != 0) {
printk(KERN_ERR "Failed to set syscall callback: %d\n", ret);
/* Pretty fatal; 64-bit userspace has no other
@@ -940,7 +940,7 @@ void xen_enable_syscall(void)
 
if (boot_cpu_has(X86_FEATURE_SYSCALL32)) {
ret = register_callback(CALLBACKTYPE_syscall32,
-   xen_syscall32_target);
+   xen_entry_SYSCALL_compat);
if (ret != 0)
setup_clear_cpu_cap(X86_FEATURE_SYSCALL32);
}
diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S
index 962d30ea01a2..2cf22624012c 100644
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -227,7 +227,7 @@ SYM_CODE_END(xenpv_restore_regs_and_return_to_usermode)
  */
 
 /* Normal 64-bit system call target */
-SYM_CODE_START(xen_syscall_target)
+SYM_CODE_START(xen_entry_SYSCALL_64)
UNWIND_HINT_EMPTY
popq %rcx
popq %r11
@@ -241,12 +241,12 @@ SYM_CODE_START(xen_syscall_target)
movq $__USER_CS, 1*8(%rsp)
 
jmp entry_SYSCALL_64_after_hwframe
-SYM_CODE_END(xen_syscall_target)
+SYM_CODE_END(xen_entry_SYSCALL_64)
 
 #ifdef CONFIG_IA32_EMULATION
 
 /* 32-bit compat syscall target */
-SYM_CODE_START(xen_syscall32_target)
+SYM_CODE_START(xen_entry_SYSCALL_compat)
UNWIND_HINT_EMPTY
popq %rcx
popq %r11
@@ -260,10 +260,10 @@ SYM_CODE_START(xen_syscall32_target)
movq $__USER32_CS, 1*8(%rsp)
 
jmp entry_SYSCALL_compat_after_hwframe
-SYM_CODE_END(xen_syscall32_target)
+SYM_CODE_END(xen_entry_SYSCALL_compat)
 
 /* 32-bit compat sysenter target */
-SYM_CODE_START(xen_sysenter_target)
+SYM_CODE_START(xen_entry_SYSENTER_compat)
UNWIND_HINT_EMPTY
/*
 * NB: Xen is polite and clears TF from EFLAGS for us.  This means
@@ -281,18 +281,18 @@ SYM_CODE_START(xen_sysenter_target)
movq $__USER32_CS, 1*8(%rsp)
 
jmp entry_SYSENTER_compat_after_hwframe
-SYM_CODE_END(xen_sysenter_target)
+SYM_CODE_END(xen_entry_SYSENTER_compat)
 
 #else /* !CONFIG_IA32_EMULATION */
 
-SYM_CODE_START(xen_syscall32_target)
-SYM_CODE_START(xen_sysenter_target)
+SYM_CODE_START(xen_entry_SYSCALL_compat)
+SYM_CODE_START(xen_entry_SYSENTER_compat)
UNWIND_HINT_EMPTY
lea 16(%rsp), %rsp  /* strip %rcx, %r11 */
mov $-ENOSYS, %rax
pushq $0
jmp hypercall_iret
-SYM_CODE_END(xen_sysenter_target)
-SYM_CODE_END(xen_syscall32_target)
+SYM_CODE_END(xen_entry_SYSENTER_compat)
+SYM_CODE_END(xen_entry_SYSCALL_compat)
 
 #endif /* CONFIG_IA32_EMULATION */
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index 8bc8b72a205d..16aed4b12129 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -10,10 +10,10 @@
 /* These are code, but not functions.  Defined in entry.S */
 extern const char xen_failsafe_callback[];
 
-void xen_sysenter_target(void);
+void xen_entry_SYSENTER_compat(void);
 #ifdef CONFIG_X86_64
-void xen_syscall_target(void);
-void xen_syscall32_target(void);
+void xen_entry_SYSCALL_64(void);
+void xen_entry_SYSCALL_compat(void);
 #endif
 
 extern void *xen_initial_gdt;
-- 
2.35.1




RE: [PATCH V11.1 1/3] libxl: Add support for Virtio disk configuration

2022-07-19 Thread Jiamei Xie
Hi Oleksandr,

We have tested it on arm64 with " disk = [ 'phy:/usr/share/guests/disk.img0, 
xvda1, backendtype=standalone, specification=virtio']". It works ok.

Tested-by: Jiamei Xie 
Tested-by: Henry Wang 

Best wishes
Jiamei Xie


> -Original Message-
> From: Xen-devel  On Behalf Of
> Oleksandr Tyshchenko
> Sent: Sunday, July 17, 2022 12:38 AM
> To: xen-devel@lists.xenproject.org
> Cc: Oleksandr Tyshchenko ; Wei Liu
> ; Anthony PERARD ; George
> Dunlap ; Nick Rosbrook
> ; Juergen Gross ; Stefano
> Stabellini ; Julien Grall ; Volodymyr
> Babchuk ; Bertrand Marquis
> 
> Subject: [PATCH V11.1 1/3] libxl: Add support for Virtio disk configuration
> 
> From: Oleksandr Tyshchenko 
> 
> This patch adds basic support for configuring and assisting virtio-mmio
> based virtio-disk backend (emulator) which is intended to run out of
> Qemu and could be run in any domain.
> Although the Virtio block device is quite different from traditional
> Xen PV block device (vbd) from the toolstack's point of view:
>  - as the frontend is virtio-blk, which is not a Xenbus driver, nothing
>written to Xenstore is fetched by the frontend currently ("vdev"
>is not passed to the frontend). But this might need to be revised
>in future, so frontend data might be written to Xenstore in order to
>support hotplugging virtio devices or passing the backend domain id
>on arch where the device-tree is not available.
>  - the ring-ref/event-channel are not used for the backend<->frontend
>communication, the proposed IPC for Virtio is IOREQ/DM
> it is still a "block device" and ought to be integrated in existing
> "disk" handling. So, re-use (and adapt) "disk" parsing/configuration
> logic to deal with Virtio devices as well.
> 
> For the immediate purpose and an ability to extend that support for
> other use-cases in future (Qemu, virtio-pci, etc) perform the following
> actions:
> - Add new disk backend type (LIBXL_DISK_BACKEND_STANDALONE) and
> reflect
>   that in the configuration
> - Introduce new disk "specification" and "transport" fields to struct
>   libxl_device_disk. Both are written to the Xenstore. The transport
>   field is only used for the specification "virtio" and it assumes
>   only "mmio" value for now.
> - Introduce new "specification" option with "xen" communication
>   protocol being default value.
> - Add new device kind (LIBXL__DEVICE_KIND_VIRTIO_DISK) as current
>   one (LIBXL__DEVICE_KIND_VBD) doesn't fit into Virtio disk model
> 
> An example of domain configuration for Virtio disk:
> disk = [ 'phy:/dev/mmcblk0p3, xvda1, backendtype=standalone,
> specification=virtio']
> 
> Nothing has changed for default Xen disk configuration.
> 
> Please note, this patch is not enough for virtio-disk to work
> on Xen (Arm), as for every Virtio device (including disk) we need
> to allocate Virtio MMIO params (IRQ and memory region) and pass
> them to the backend, also update Guest device-tree. The subsequent
> patch will add these missing bits. For the current patch,
> the default "irq" and "base" are just written to the Xenstore.
> This is not an ideal splitting, but this way we avoid breaking
> the bisectability.
> 
> Signed-off-by: Oleksandr Tyshchenko 
> ---
> Changes RFC -> V1:
>- no changes
> 
> Changes V1 -> V2:
>- rebase according to the new location of libxl_virtio_disk.c
> 
> Changes V2 -> V3:
>- no changes
> 
> Changes V3 -> V4:
>- rebase according to the new argument for DEFINE_DEVICE_TYPE_STRUCT
> 
> Changes V4 -> V5:
>- split the changes, change the order of the patches
>- update patch description
>- don't introduce new "vdisk" configuration option with own parsing logic,
>  re-use Xen PV block "disk" parsing/configuration logic for the 
> virtio-disk
>- introduce "virtio" flag and document it's usage
>- add LIBXL_HAVE_DEVICE_DISK_VIRTIO
>- update libxlu_disk_l.[ch]
>- drop num_disks variable/MAX_VIRTIO_DISKS
>- drop Wei's T-b
> 
> Changes V5 -> V6:
>- rebase on current staging
>- use "%"PRIu64 instead of %lu for disk->base in device_disk_add()
>- update *.gen.go files
> 
> Changes V6 -> V7:
>- rebase on current staging
>- update *.gen.go files and libxlu_disk_l.[ch] files
>- update patch description
>- rework significantly to support more flexible configuration
>  and have more generic basic implementation for being able to extend
>  that for other use-cases (virtio-pci, qemu, etc).
> 
> Changes V7 -> V8:
>- update *.gen.go files and libxlu_disk_l.[ch] files
>- update patch description and comments in the code
>- use "specification" config option instead of "protocol"
>- update libxl_types.idl and code according to new fields
>  in libxl_device_disk
> 
> Changes V8 -> V9:
>- update (and harden) checks in libxl__device_disk_setdefault(),
>  return error in case of incorrect settings of specification
>  and transport
>- remove both asserts in device_disk_add()
>  
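A fuller guest configuration using the new options might look like the following (a hedged, illustrative sketch; the name, memory, vcpus, and kernel path are placeholders, while the disk line matches the example from the patch description):

```
# Illustrative xl config for a guest with a virtio-mmio disk backend.
name = "guest0"
memory = 1024
vcpus = 2
kernel = "/path/to/Image"
disk = [ 'phy:/dev/mmcblk0p3, xvda1, backendtype=standalone, specification=virtio' ]
```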

Re: Panic on CPU 0: FATAL TRAP: vec 7, #NM[0000]

2022-07-19 Thread ChrisD
So, you think it's a problem with fc36?
 Original message 
From: Andrew Cooper 
Date: 7/18/22 6:25 PM (GMT-05:00)
To: ch...@dalessio.org, xen-devel@lists.xenproject.org
Cc: Jan Beulich , Michael Young 
Subject: Re: Panic on CPU 0: FATAL TRAP: vec 7, #NM[0000]

On 18/07/2022 22:31, ch...@dalessio.org wrote:
> I am trying to run Xen-4.16.1-4.fc36 on Fedora 36 on a brand new Lenovo
> ThinkStation p620, but I keep getting the following error booting the
> Xen kernel.
>
> Panic on CPU 0:
> FATAL TRAP: vec 7, #NM[0000]
>
> Version info:
> Name        : xen
> Version     : 4.16.1
> Release     : 4.fc36

So https://koji.fedoraproject.org/koji/buildinfo?buildID=1991182 should
be the binary build in use, and looking at the debug syms, it really
does have:

82d040439c80 :
...
82d04043a00c:   0f 6e c2                movd      %edx,%mm0
82d04043a00f:   0f 62 c0                punpckldq %mm0,%mm0
82d04043a012:   49 89 87 c0 00 00 00    mov       %rax,0xc0(%r15)
82d04043a019:   41 0f 7f 87 d0 00 00    movq      %mm0,0xd0(%r15)
82d04043a020:   00

So hardware is correct - this build of Xen is nonsense.

The binary is also full of .annobin_ stuff, which appears to be some kind
of GCC plugin for watermarking.

Michael: Any idea what's going on here?  Something has caused GCC to
emit some MMX logic, which is ultimately why things exploded, but this
probably means that some of the build CFLAGS got dropped.

Thanks,
~Andrew

[qemu-mainline test] 171693: tolerable FAIL - PUSHED

2022-07-19 Thread osstest service owner
flight 171693 qemu-mainline real [real]
flight 171698 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/171693/
http://logs.test-lab.xenproject.org/osstest/logs/171698/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-vhd7 xen-install fail pass in 171698-retest
 test-arm64-arm64-libvirt-raw 13 guest-start fail pass in 171698-retest

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail in 171698 never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail in 171698 never 
pass
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 171683
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 171683
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171683
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 171683
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 171683
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 171683
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 171683
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 171683
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 qemuu d48125de38f48a61d6423ef6a01156d6dff9ee2c
baseline version:
 qemuu b8bb9bbf

[PATCH v9 0/8] populate/unpopulate memory when domain on static allocation

2022-07-19 Thread Penny Zheng
Today, when a domain unpopulates memory at runtime, the pages are always
handed over to the heap allocator. This is a problem if the domain is a
static domain: pages used as guest RAM for a static domain shall always be
reserved to this domain only, and not be used for any other purpose, so
they shall never go back to the heap allocator.

This patch series intends to fix this issue by adding pages to the new list
resv_page_list after taking them off the "normal" list when unpopulating
memory, and retrieving pages from that reserved list (resv_page_list) when
populating memory.

---
v9 changes:
- move free_domheap_page into else-condition
- considering scrubbing static pages, domain dying case and opt_scrub_domheap
both do not apply to static pages.
- as unowned static pages don't make themselves to free_domstatic_page
at the moment, remove else-condition and add ASSERT(d) at the top of the
function
- remove macro helper put_static_page, and just expand its code inside
free_domstatic_page
- Use ASSERT_ALLOC_CONTEXT() in acquire_reserved_page
- Add free_staticmem_pages to undo prepare_staticmem_pages when
assign_domstatic_pages fails
- Remove redundant static in error message
---
v8 changes:
- introduce new helper free_domstatic_page
- let put_page call free_domstatic_page for static page, when last ref
drops
- #define PGC_static zero when !CONFIG_STATIC_MEMORY, as it is used
outside page_alloc.c
- #ifdef-ary around is_domain_using_staticmem() is not needed anymore
- order as a parameter is not needed here, as all staticmem operations are
limited to order-0 regions
- move d->page_alloc_lock after operation on d->resv_page_list
- As concurrent free/allocate could modify the resv_page_list, we still
need the lock
---
v7 changes:
- protect free_staticmem_pages with heap_lock to match its reverse function
acquire_staticmem_pages
- IS_ENABLED(CONFIG_STATIC_MEMORY) would not be needed anymore
- add page on the rsv_page_list *after* it has been freed
- remove the lock, since we add the page to rsv_page_list after it has
been totally freed.
---
v6 changes:
- rename PGC_staticmem to PGC_static
- remove #ifdef around function declaration
- use domain instead of sub-systems
- move non-zero is_domain_using_staticmem() from ARM header to common
header
- move PGC_static !CONFIG_STATIC_MEMORY definition to common header
- drop the lock before returning
---
v5 changes:
- introduce three new commits
- In order to avoid stub functions, we #define PGC_staticmem to non-zero only
when CONFIG_STATIC_MEMORY
- use "unlikely()" around pg->count_info & PGC_staticmem
- remove pointless "if", since mark_page_free() is going to set count_info
to PGC_state_free and by consequence clear PGC_staticmem
- move #define PGC_staticmem 0 to mm.h
- guard "is_domain_using_staticmem" under CONFIG_STATIC_MEMORY
- #define is_domain_using_staticmem zero if undefined
- extract common codes for assigning pages into a helper assign_domstatic_pages
- refine commit message
- remove stub function acquire_reserved_page
- Alloc/free of memory can happen concurrently. So access to rsv_page_list
needs to be protected with a spinlock
---
v4 changes:
- commit message refinement
- miss dropping __init in acquire_domstatic_pages
- add the page back to the reserved list in case of error
- remove redundant printk
- refine log message and make it warn level
- guard "is_domain_using_staticmem" under CONFIG_STATIC_MEMORY
- #define is_domain_using_staticmem zero if undefined
---
v3 changes:
- fix possible racy issue in free_staticmem_pages()
- introduce a stub free_staticmem_pages() for the !CONFIG_STATIC_MEMORY case
- move the change to free_heap_pages() to cover other potential call sites
- change fixed width type uint32_t to unsigned int
- change "flags" to a more descriptive name "cdf"
- change name from "is_domain_static()" to "is_domain_using_staticmem"
- have page_list_del() just once out of the if()
- remove resv_pages counter
- make arch_free_heap_page be an expression, not a compound statement.
- move #ifndef is_domain_using_staticmem to the common header file
- remove #ifdef CONFIG_STATIC_MEMORY-ary
- remove meaningless page_to_mfn(page) in error log
---
v2 changes:
- let "flags" live in the struct domain. So other arch can take
advantage of it in the future
- change name from "is_domain_on_static_allocation" to "is_domain_static()"
- put reserved pages on resv_page_list after having taken them off
the "normal" list
- introduce acquire_reserved_page to retrieve reserved pages from
resv_page_list
- forbid non-zero-order requests in populate_physmap
- let is_domain_static return ((void)(d), false) on x86
- fix coding style

Penny Zheng (8):
  xen/arm: rename PGC_reserved to PGC_static
  xen: do not free reserved memory into heap
  xen: do not merge reserved pages in free_heap_pages()
  xen: add field "flags" to cover all internal CDF_XXX
  xen/arm: introduce CDF_staticmem
  xen/arm: unpopulate memory when domain is static
  xen: introduce prepare_staticmem_pages
  xen: retrieve reserved pages on populate_physmap

[PATCH v9 1/8] xen/arm: rename PGC_reserved to PGC_static

2022-07-19 Thread Penny Zheng
PGC_reserved is ambiguous, as the name does not tell what the pages are
reserved for, so this commit renames PGC_reserved to PGC_static, which
clearly indicates that the page is reserved for static memory.

Signed-off-by: Penny Zheng 
Acked-by: Jan Beulich 
Acked-by: Julien Grall 
---
v8 changes:
- no change
---
v7 changes:
- no change
---
v6 changes:
- rename PGC_staticmem to PGC_static
---
v5 changes:
- new commit
---
 xen/arch/arm/include/asm/mm.h |  6 +++---
 xen/common/page_alloc.c   | 22 +++---
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index c4bc3cd1e5..8b2481c1f3 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -108,9 +108,9 @@ struct page_info
   /* Page is Xen heap? */
 #define _PGC_xen_heap PG_shift(2)
 #define PGC_xen_heap  PG_mask(1, 2)
-  /* Page is reserved */
-#define _PGC_reserved PG_shift(3)
-#define PGC_reserved  PG_mask(1, 3)
+  /* Page is static memory */
+#define _PGC_staticPG_shift(3)
+#define PGC_static PG_mask(1, 3)
 /* ... */
 /* Page is broken? */
 #define _PGC_broken   PG_shift(7)
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index fe0e15429a..ed56379b96 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -151,8 +151,8 @@
 #define p2m_pod_offline_or_broken_replace(pg) BUG_ON(pg != NULL)
 #endif
 
-#ifndef PGC_reserved
-#define PGC_reserved 0
+#ifndef PGC_static
+#define PGC_static 0
 #endif
 
 /*
@@ -2286,7 +2286,7 @@ int assign_pages(
 
 for ( i = 0; i < nr; i++ )
 {
-ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_reserved)));
+ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static)));
 if ( pg[i].count_info & PGC_extra )
 extra_pages++;
 }
@@ -2346,7 +2346,7 @@ int assign_pages(
 page_set_owner(&pg[i], d);
 smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
 pg[i].count_info =
-(pg[i].count_info & (PGC_extra | PGC_reserved)) | PGC_allocated | 
1;
+(pg[i].count_info & (PGC_extra | PGC_static)) | PGC_allocated | 1;
 
 page_list_add_tail(&pg[i], page_to_list(d, &pg[i]));
 }
@@ -2652,8 +2652,8 @@ void __init free_staticmem_pages(struct page_info *pg, 
unsigned long nr_mfns,
 scrub_one_page(pg);
 }
 
-/* In case initializing page of static memory, mark it PGC_reserved. */
-pg[i].count_info |= PGC_reserved;
+/* In case initializing page of static memory, mark it PGC_static. */
+pg[i].count_info |= PGC_static;
 }
 }
 
@@ -2682,8 +2682,8 @@ static struct page_info * __init 
acquire_staticmem_pages(mfn_t smfn,
 
 for ( i = 0; i < nr_mfns; i++ )
 {
-/* The page should be reserved and not yet allocated. */
-if ( pg[i].count_info != (PGC_state_free | PGC_reserved) )
+/* The page should be static and not yet allocated. */
+if ( pg[i].count_info != (PGC_state_free | PGC_static) )
 {
 printk(XENLOG_ERR
"pg[%lu] Static MFN %"PRI_mfn" c=%#lx t=%#x\n",
@@ -2697,10 +2697,10 @@ static struct page_info * __init 
acquire_staticmem_pages(mfn_t smfn,
 &tlbflush_timestamp);
 
 /*
- * Preserve flag PGC_reserved and change page state
+ * Preserve flag PGC_static and change page state
  * to PGC_state_inuse.
  */
-pg[i].count_info = PGC_reserved | PGC_state_inuse;
+pg[i].count_info = PGC_static | PGC_state_inuse;
 /* Initialise fields which have other uses for free pages. */
 pg[i].u.inuse.type_info = 0;
 page_set_owner(&pg[i], NULL);
@@ -2722,7 +2722,7 @@ static struct page_info * __init 
acquire_staticmem_pages(mfn_t smfn,
 
  out_err:
 while ( i-- )
-pg[i].count_info = PGC_reserved | PGC_state_free;
+pg[i].count_info = PGC_static | PGC_state_free;
 
 spin_unlock(&heap_lock);
 
-- 
2.25.1




[PATCH v9 3/8] xen: do not merge reserved pages in free_heap_pages()

2022-07-19 Thread Penny Zheng
The code in free_heap_pages() will try to merge pages with the
successor/predecessor if the pages are suitably aligned. So if the
reserved pages are right next to the pages given to the heap allocator,
free_heap_pages() will merge them, accidentally giving the reserved
pages to the heap allocator as a result.

To avoid the above scenario, this commit updates free_heap_pages() to
check whether the predecessor and/or successor has PGC_static set when
trying to merge the about-to-be-freed chunk with the predecessor and/or
successor.

Suggested-by: Julien Grall 
Signed-off-by: Penny Zheng 
Reviewed-by: Jan Beulich 
Reviewed-by: Julien Grall 
---
v9 changes:
- no change
---
v8 changes:
- no change
---
v7 changes:
- no change
---
v6 changes:
- adapt to PGC_static
---
v5 changes:
- change PGC_reserved to adapt to PGC_staticmem
---
v4 changes:
- no changes
---
v3 changes:
- no changes
---
v2 changes:
- new commit
---
 xen/common/page_alloc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index a12622e921..45bd88a685 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1475,6 +1475,7 @@ static void free_heap_pages(
 /* Merge with predecessor block? */
 if ( !mfn_valid(page_to_mfn(predecessor)) ||
  !page_state_is(predecessor, free) ||
+ (predecessor->count_info & PGC_static) ||
  (PFN_ORDER(predecessor) != order) ||
  (phys_to_nid(page_to_maddr(predecessor)) != node) )
 break;
@@ -1498,6 +1499,7 @@ static void free_heap_pages(
 /* Merge with successor block? */
 if ( !mfn_valid(page_to_mfn(successor)) ||
  !page_state_is(successor, free) ||
+ (successor->count_info & PGC_static) ||
  (PFN_ORDER(successor) != order) ||
  (phys_to_nid(page_to_maddr(successor)) != node) )
 break;
-- 
2.25.1




[PATCH v9 2/8] xen: do not free reserved memory into heap

2022-07-19 Thread Penny Zheng
Pages used as guest RAM for a static domain shall be reserved to that
domain only.
So, to prevent reserved pages from being used for any other purpose,
users shall not free them back to the heap, even when the last ref gets
dropped.

This commit introduces a new helper, free_domstatic_page, to free static
pages at runtime; free_staticmem_pages will be called by it at runtime,
so let's drop its __init flag.

Signed-off-by: Penny Zheng 
---
v9 changes:
- move free_domheap_page into else-condition
- considering scrubbing static pages, domain dying case and opt_scrub_domheap
both do not apply to static pages.
- as unowned static pages don't make their way to free_domstatic_page
at the moment, remove else-condition and add ASSERT(d) at the top of the
function
---
v8 changes:
- introduce new helper free_domstatic_page
- let put_page call free_domstatic_page for static page, when last ref
drops
- #define PGC_static zero when !CONFIG_STATIC_MEMORY, as it is used
outside page_alloc.c
---
v7 changes:
- protect free_staticmem_pages with heap_lock to match its reverse function
acquire_staticmem_pages
---
v6 changes:
- adapt to PGC_static
- remove #ifdef around function declaration
---
v5 changes:
- In order to avoid stub functions, we #define PGC_staticmem to non-zero only
when CONFIG_STATIC_MEMORY
- use "unlikely()" around pg->count_info & PGC_staticmem
- remove pointless "if", since mark_page_free() is going to set count_info
to PGC_state_free and by consequence clear PGC_staticmem
- move #define PGC_staticmem 0 to mm.h
---
v4 changes:
- no changes
---
v3 changes:
- fix possible racy issue in free_staticmem_pages()
- introduce a stub free_staticmem_pages() for the !CONFIG_STATIC_MEMORY case
- move the change to free_heap_pages() to cover other potential call sites
- fix the indentation
---
v2 changes:
- new commit
---
---
 xen/arch/arm/include/asm/mm.h |  4 +++-
 xen/arch/arm/mm.c |  5 -
 xen/common/page_alloc.c   | 37 ---
 xen/include/xen/mm.h  |  7 +--
 4 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 8b2481c1f3..f1640bbda4 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -108,9 +108,11 @@ struct page_info
   /* Page is Xen heap? */
 #define _PGC_xen_heap PG_shift(2)
 #define PGC_xen_heap  PG_mask(1, 2)
-  /* Page is static memory */
+#ifdef CONFIG_STATIC_MEMORY
+/* Page is static memory */
 #define _PGC_staticPG_shift(3)
 #define PGC_static PG_mask(1, 3)
+#endif
 /* ... */
 /* Page is broken? */
 #define _PGC_broken   PG_shift(7)
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 009b8cd9ef..9132fb9472 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1622,7 +1622,10 @@ void put_page(struct page_info *page)
 
 if ( unlikely((nx & PGC_count_mask) == 0) )
 {
-free_domheap_page(page);
+if ( unlikely(nx & PGC_static) )
+free_domstatic_page(page);
+else
+free_domheap_page(page);
 }
 }
 
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index ed56379b96..a12622e921 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -151,10 +151,6 @@
 #define p2m_pod_offline_or_broken_replace(pg) BUG_ON(pg != NULL)
 #endif
 
-#ifndef PGC_static
-#define PGC_static 0
-#endif
-
 /*
  * Comma-separated list of hexadecimal page numbers containing bad bytes.
  * e.g. 'badpage=0x3f45,0x8a321'.
@@ -2636,12 +2632,14 @@ struct domain *get_pg_owner(domid_t domid)
 
 #ifdef CONFIG_STATIC_MEMORY
 /* Equivalent of free_heap_pages to free nr_mfns pages of static memory. */
-void __init free_staticmem_pages(struct page_info *pg, unsigned long nr_mfns,
- bool need_scrub)
+void free_staticmem_pages(struct page_info *pg, unsigned long nr_mfns,
+  bool need_scrub)
 {
 mfn_t mfn = page_to_mfn(pg);
 unsigned long i;
 
+spin_lock(&heap_lock);
+
 for ( i = 0; i < nr_mfns; i++ )
 {
 mark_page_free(&pg[i], mfn_add(mfn, i));
@@ -2652,9 +2650,34 @@ void __init free_staticmem_pages(struct page_info *pg, 
unsigned long nr_mfns,
 scrub_one_page(pg);
 }
 
-/* In case initializing page of static memory, mark it PGC_static. */
 pg[i].count_info |= PGC_static;
 }
+
+spin_unlock(&heap_lock);
+}
+
+void free_domstatic_page(struct page_info *page)
+{
+struct domain *d = page_get_owner(page);
+bool drop_dom_ref;
+
+ASSERT(d);
+
+ASSERT_ALLOC_CONTEXT();
+
+/* NB. May recursively lock from relinquish_memory(). */
+spin_lock_recursive(&d->page_alloc_lock);
+
+arch_free_heap_page(d, page);
+
+drop_dom_ref = !domain_adjust_tot_pages(d, -1);
+
+spin_unlock_recursive(&d->page_alloc_lock);
+
+free_staticmem_pages(page, 1, scrub_debug);
+
+if ( drop_dom_ref )
+put_domain(d);
 }
 
 /*
diff --git a/xen/include/xen/mm.h b/xen/in

[PATCH v9 4/8] xen: add field "flags" to cover all internal CDF_XXX

2022-07-19 Thread Penny Zheng
With more and more CDF_xxx internal flags coming in, and to save space,
this commit introduces a new field "cdf" in struct domain to store the
CDF_* internal flags directly.

Another CDF_xxx flag will be introduced in the next patch.

Signed-off-by: Penny Zheng 
Acked-by: Julien Grall 
---
v9 changes:
- no change
---
v8 changes:
- no change
---
v7 changes:
- no change
---
v6 changes:
- no change
---
v5 changes:
- no change
---
v4 changes:
- no change
---
v3 changes:
- change fixed width type uint32_t to unsigned int
- change "flags" to a more descriptive name "cdf"
---
v2 changes:
- let "flags" live in the struct domain. So other arch can take
advantage of it in the future
- fix coding style
---
 xen/arch/arm/domain.c | 2 --
 xen/arch/arm/include/asm/domain.h | 3 +--
 xen/common/domain.c   | 3 +++
 xen/include/xen/sched.h   | 3 +++
 4 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 2f8eaab7b5..4722988ee7 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -709,8 +709,6 @@ int arch_domain_create(struct domain *d,
 ioreq_domain_init(d);
 #endif
 
-d->arch.directmap = flags & CDF_directmap;
-
 /* p2m_init relies on some value initialized by the IOMMU subsystem */
 if ( (rc = iommu_domain_init(d, config->iommu_opts)) != 0 )
 goto fail;
diff --git a/xen/arch/arm/include/asm/domain.h 
b/xen/arch/arm/include/asm/domain.h
index cd9ce19b4b..26a8348eed 100644
--- a/xen/arch/arm/include/asm/domain.h
+++ b/xen/arch/arm/include/asm/domain.h
@@ -29,7 +29,7 @@ enum domain_type {
 #define is_64bit_domain(d) (0)
 #endif
 
-#define is_domain_direct_mapped(d) (d)->arch.directmap
+#define is_domain_direct_mapped(d) ((d)->cdf & CDF_directmap)
 
 /*
  * Is the domain using the host memory layout?
@@ -104,7 +104,6 @@ struct arch_domain
 void *tee;
 #endif
 
-bool directmap;
 }  __cacheline_aligned;
 
 struct arch_vcpu
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 618410e3b2..7062393e37 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -567,6 +567,9 @@ struct domain *domain_create(domid_t domid,
 /* Sort out our idea of is_system_domain(). */
 d->domain_id = domid;
 
+/* Holding CDF_* internal flags. */
+d->cdf = flags;
+
 /* Debug sanity. */
 ASSERT(is_system_domain(d) ? config == NULL : config != NULL);
 
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b9515eb497..98e8001c89 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -596,6 +596,9 @@ struct domain
 struct ioreq_server *server[MAX_NR_IOREQ_SERVERS];
 } ioreq_server;
 #endif
+
+/* Holding CDF_* constant. Internal flags for domain creation. */
+unsigned int cdf;
 };
 
 static inline struct page_list_head *page_to_list(
-- 
2.25.1




[PATCH v9 5/8] xen/arm: introduce CDF_staticmem

2022-07-19 Thread Penny Zheng
In order to have an easy and quick way to find out whether a domain's
memory is statically configured, this commit introduces a new flag,
CDF_staticmem, and a new helper, is_domain_using_staticmem(), to tell.

Signed-off-by: Penny Zheng 
Acked-by: Julien Grall 
Acked-by: Jan Beulich 
---
v9 changes:
- no change
---
v8 changes:
- #ifdef-ary around is_domain_using_staticmem() is not needed anymore
---
v7 changes:
- IS_ENABLED(CONFIG_STATIC_MEMORY) would not be needed anymore
---
v6 changes:
- move non-zero is_domain_using_staticmem() from ARM header to common
header
---
v5 changes:
- guard "is_domain_using_staticmem" under CONFIG_STATIC_MEMORY
- #define is_domain_using_staticmem zero if undefined
---
v4 changes:
- no changes
---
v3 changes:
- change name from "is_domain_static()" to "is_domain_using_staticmem"
---
v2 changes:
- change name from "is_domain_on_static_allocation" to "is_domain_static()"
---
 xen/arch/arm/domain_build.c | 5 -
 xen/include/xen/domain.h| 8 
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 3fd1186b53..b76a84e8f5 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -3287,9 +3287,12 @@ void __init create_domUs(void)
 if ( !dt_device_is_compatible(node, "xen,domain") )
 continue;
 
+if ( dt_find_property(node, "xen,static-mem", NULL) )
+flags |= CDF_staticmem;
+
 if ( dt_property_read_bool(node, "direct-map") )
 {
-if ( !IS_ENABLED(CONFIG_STATIC_MEMORY) || !dt_find_property(node, 
"xen,static-mem", NULL) )
+if ( !(flags & CDF_staticmem) )
 panic("direct-map is not valid for domain %s without static 
allocation.\n",
   dt_node_name(node));
 
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 628b14b086..2c8116afba 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -35,6 +35,14 @@ void arch_get_domain_info(const struct domain *d,
 /* Should domain memory be directly mapped? */
 #define CDF_directmap(1U << 1)
 #endif
+/* Is domain memory on static allocation? */
+#ifdef CONFIG_STATIC_MEMORY
+#define CDF_staticmem(1U << 2)
+#else
+#define CDF_staticmem0
+#endif
+
+#define is_domain_using_staticmem(d) ((d)->cdf & CDF_staticmem)
 
 /*
  * Arch-specifics.
-- 
2.25.1




[PATCH v9 6/8] xen/arm: unpopulate memory when domain is static

2022-07-19 Thread Penny Zheng
Today, when a domain unpopulates memory at runtime, it always hands the
memory back to the heap allocator. This is a problem if the domain is
static.

Pages used as guest RAM for a static domain shall be reserved to only
this domain and not be used for any other purpose, so they shall never
go back to the heap allocator.

This commit puts reserved pages on the new list resv_page_list, only
after having taken them off the "normal" list, when the last ref is
dropped.

Signed-off-by: Penny Zheng 
---
v9 change:
- remove macro helper put_static_page, and just expand its code inside
free_domstatic_page
---
v8 changes:
- adapt this patch for newly introduced free_domstatic_page
- order as a parameter is not needed here, as all staticmem operations are
limited to order-0 regions
- move d->page_alloc_lock after operation on d->resv_page_list
---
v7 changes:
- Add page on the rsv_page_list *after* it has been freed
---
v6 changes:
- refine in-code comment
- move PGC_static !CONFIG_STATIC_MEMORY definition to common header
---
v5 changes:
- adapt this patch for PGC_staticmem
---
v4 changes:
- no changes
---
v3 changes:
- have page_list_del() just once out of the if()
- remove resv_pages counter
- make arch_free_heap_page be an expression, not a compound statement.
---
v2 changes:
- put reserved pages on resv_page_list after having taken them off
the "normal" list
---
 xen/common/domain.c | 4 
 xen/common/page_alloc.c | 8 ++--
 xen/include/xen/sched.h | 3 +++
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 7062393e37..c23f449451 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -604,6 +604,10 @@ struct domain *domain_create(domid_t domid,
 INIT_PAGE_LIST_HEAD(&d->page_list);
 INIT_PAGE_LIST_HEAD(&d->extra_page_list);
 INIT_PAGE_LIST_HEAD(&d->xenpage_list);
+#ifdef CONFIG_STATIC_MEMORY
+INIT_PAGE_LIST_HEAD(&d->resv_page_list);
+#endif
+
 
 spin_lock_init(&d->node_affinity_lock);
 d->node_affinity = NODE_MASK_ALL;
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 45bd88a685..a568be55e3 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2674,10 +2674,14 @@ void free_domstatic_page(struct page_info *page)
 
 drop_dom_ref = !domain_adjust_tot_pages(d, -1);
 
-spin_unlock_recursive(&d->page_alloc_lock);
-
 free_staticmem_pages(page, 1, scrub_debug);
 
+/* Add page on the resv_page_list *after* it has been freed. */
+if ( !drop_dom_ref )
+page_list_add_tail(page, &d->resv_page_list);
+
+spin_unlock_recursive(&d->page_alloc_lock);
+
 if ( drop_dom_ref )
 put_domain(d);
 }
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 98e8001c89..d4fbd3dea7 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -381,6 +381,9 @@ struct domain
 struct page_list_head page_list;  /* linked list */
 struct page_list_head extra_page_list; /* linked list (size extra_pages) */
 struct page_list_head xenpage_list; /* linked list (size xenheap_pages) */
+#ifdef CONFIG_STATIC_MEMORY
+struct page_list_head resv_page_list; /* linked list */
+#endif
 
 /*
  * This field should only be directly accessed by domain_adjust_tot_pages()
-- 
2.25.1




[PATCH v9 7/8] xen: introduce prepare_staticmem_pages

2022-07-19 Thread Penny Zheng
Later, we want to use acquire_domstatic_pages() to populate memory for a
static domain at runtime; however, it does a lot of pointless work
(checking mfn_valid(), scrubbing the free part, cleaning the cache...)
considering that we already know the page is valid and belongs to the
guest.

This commit splits acquire_staticmem_pages() in two parts, and
introduces prepare_staticmem_pages to bypass all the "pointless work".

Signed-off-by: Penny Zheng 
Acked-by: Jan Beulich 
Acked-by: Julien Grall 
---
v8 changes:
- no change
---
v7 changes:
- no change
---
v6 changes:
- adapt to PGC_static
---
v5 changes:
- new commit
---
 xen/common/page_alloc.c | 61 -
 1 file changed, 36 insertions(+), 25 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index a568be55e3..9e150946f9 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2686,26 +2686,13 @@ void free_domstatic_page(struct page_info *page)
 put_domain(d);
 }
 
-/*
- * Acquire nr_mfns contiguous reserved pages, starting at #smfn, of
- * static memory.
- * This function needs to be reworked if used outside of boot.
- */
-static struct page_info * __init acquire_staticmem_pages(mfn_t smfn,
- unsigned long nr_mfns,
- unsigned int memflags)
+static bool __init prepare_staticmem_pages(struct page_info *pg,
+   unsigned long nr_mfns,
+   unsigned int memflags)
 {
 bool need_tlbflush = false;
 uint32_t tlbflush_timestamp = 0;
 unsigned long i;
-struct page_info *pg;
-
-ASSERT(nr_mfns);
-for ( i = 0; i < nr_mfns; i++ )
-if ( !mfn_valid(mfn_add(smfn, i)) )
-return NULL;
-
-pg = mfn_to_page(smfn);
 
 spin_lock(&heap_lock);
 
@@ -2716,7 +2703,7 @@ static struct page_info * __init 
acquire_staticmem_pages(mfn_t smfn,
 {
 printk(XENLOG_ERR
"pg[%lu] Static MFN %"PRI_mfn" c=%#lx t=%#x\n",
-   i, mfn_x(smfn) + i,
+   i, mfn_x(page_to_mfn(pg)) + i,
pg[i].count_info, pg[i].tlbflush_timestamp);
 goto out_err;
 }
@@ -2740,6 +2727,38 @@ static struct page_info * __init 
acquire_staticmem_pages(mfn_t smfn,
 if ( need_tlbflush )
 filtered_flush_tlb_mask(tlbflush_timestamp);
 
+return true;
+
+ out_err:
+while ( i-- )
+pg[i].count_info = PGC_static | PGC_state_free;
+
+spin_unlock(&heap_lock);
+
+return false;
+}
+
+/*
+ * Acquire nr_mfns contiguous reserved pages, starting at #smfn, of
+ * static memory.
+ * This function needs to be reworked if used outside of boot.
+ */
+static struct page_info * __init acquire_staticmem_pages(mfn_t smfn,
+ unsigned long nr_mfns,
+ unsigned int memflags)
+{
+unsigned long i;
+struct page_info *pg;
+
+ASSERT(nr_mfns);
+for ( i = 0; i < nr_mfns; i++ )
+if ( !mfn_valid(mfn_add(smfn, i)) )
+return NULL;
+
+pg = mfn_to_page(smfn);
+if ( !prepare_staticmem_pages(pg, nr_mfns, memflags) )
+return NULL;
+
 /*
  * Ensure cache and RAM are consistent for platforms where the guest
  * can control its own visibility of/through the cache.
@@ -2748,14 +2767,6 @@ static struct page_info * __init 
acquire_staticmem_pages(mfn_t smfn,
 flush_page_to_ram(mfn_x(smfn) + i, !(memflags & MEMF_no_icache_flush));
 
 return pg;
-
- out_err:
-while ( i-- )
-pg[i].count_info = PGC_static | PGC_state_free;
-
-spin_unlock(&heap_lock);
-
-return NULL;
 }
 
 /*
-- 
2.25.1




[PATCH v9 8/8] xen: retrieve reserved pages on populate_physmap

2022-07-19 Thread Penny Zheng
When a static domain populates memory through populate_physmap at
runtime, it shall retrieve reserved pages from resv_page_list to make
sure that guest RAM remains restricted to the statically configured
memory regions.
This commit introduces a new helper, acquire_reserved_page, to make
this work.

Signed-off-by: Penny Zheng 
---
v9 changes:
- Use ASSERT_ALLOC_CONTEXT() in acquire_reserved_page
- Add free_staticmem_pages to undo prepare_staticmem_pages when
assign_domstatic_pages fails
- Remove redundant static in error message
---
v8 changes:
- As concurrent free/allocate could modify the resv_page_list, we still
need the lock
---
v7 changes:
- remove the lock, since we add the page to rsv_page_list after it has
been totally freed.
---
v6 changes:
- drop the lock before returning
---
v5 changes:
- extract common codes for assigning pages into a helper assign_domstatic_pages
- refine commit message
- remove stub function acquire_reserved_page
- Alloc/free of memory can happen concurrently. So access to rsv_page_list
needs to be protected with a spinlock
---
v4 changes:
- miss dropping __init in acquire_domstatic_pages
- add the page back to the reserved list in case of error
- remove redundant printk
- refine log message and make it warn level
---
v3 changes:
- move is_domain_using_staticmem to the common header file
- remove #ifdef CONFIG_STATIC_MEMORY-ary
- remove meaningless page_to_mfn(page) in error log
---
v2 changes:
- introduce acquire_reserved_page to retrieve reserved pages from
resv_page_list
- forbid non-zero-order requests in populate_physmap
- let is_domain_static return ((void)(d), false) on x86
---
 xen/common/memory.c | 23 +
 xen/common/page_alloc.c | 74 +++--
 xen/include/xen/mm.h|  1 +
 3 files changed, 81 insertions(+), 17 deletions(-)

diff --git a/xen/common/memory.c b/xen/common/memory.c
index f6f794914d..d486ebd8b9 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -245,6 +245,29 @@ static void populate_physmap(struct memop_args *a)
 
 mfn = _mfn(gpfn);
 }
+else if ( is_domain_using_staticmem(d) )
+{
+/*
+ * No easy way to guarantee the retrieved pages are contiguous,
+ * so forbid non-zero-order requests here.
+ */
+if ( a->extent_order != 0 )
+{
+gdprintk(XENLOG_WARNING,
+ "Cannot allocate static order-%u pages for %pd\n",
+ a->extent_order, d);
+goto out;
+}
+
+mfn = acquire_reserved_page(d, a->memflags);
+if ( mfn_eq(mfn, INVALID_MFN) )
+{
+gdprintk(XENLOG_WARNING,
+ "%pd: failed to retrieve a reserved page\n",
+ d);
+goto out;
+}
+}
 else
 {
 page = alloc_domheap_pages(d, a->extent_order, a->memflags);
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 9e150946f9..3414189432 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2686,9 +2686,8 @@ void free_domstatic_page(struct page_info *page)
 put_domain(d);
 }
 
-static bool __init prepare_staticmem_pages(struct page_info *pg,
-   unsigned long nr_mfns,
-   unsigned int memflags)
+static bool prepare_staticmem_pages(struct page_info *pg, unsigned long 
nr_mfns,
+unsigned int memflags)
 {
 bool need_tlbflush = false;
 uint32_t tlbflush_timestamp = 0;
@@ -2769,21 +2768,9 @@ static struct page_info * __init 
acquire_staticmem_pages(mfn_t smfn,
 return pg;
 }
 
-/*
- * Acquire nr_mfns contiguous pages, starting at #smfn, of static memory,
- * then assign them to one specific domain #d.
- */
-int __init acquire_domstatic_pages(struct domain *d, mfn_t smfn,
-   unsigned int nr_mfns, unsigned int memflags)
+static int assign_domstatic_pages(struct domain *d, struct page_info *pg,
+  unsigned int nr_mfns, unsigned int memflags)
 {
-struct page_info *pg;
-
-ASSERT_ALLOC_CONTEXT();
-
-pg = acquire_staticmem_pages(smfn, nr_mfns, memflags);
-if ( !pg )
-return -ENOENT;
-
 if ( !d || (memflags & (MEMF_no_owner | MEMF_no_refcount)) )
 {
 /*
@@ -2802,6 +2789,59 @@ int __init acquire_domstatic_pages(struct domain *d, 
mfn_t smfn,
 
 return 0;
 }
+
+/*
+ * Acquire nr_mfns contiguous pages, starting at #smfn, of static memory,
+ * then assign them to one specific domain #d.
+ */
+int __init acquire_domstatic_pages(struct domain *d, mfn_t smfn,
+   unsigned int nr_mfns, unsigned int memflags)
+{
+struct page_

[xen-unstable test] 171695: tolerable FAIL - PUSHED

2022-07-19 Thread osstest service owner
flight 171695 xen-unstable real [real]
flight 171701 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/171695/
http://logs.test-lab.xenproject.org/osstest/logs/171701/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-pair11 xen-install/dst_host fail pass in 171701-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 171678
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 171678
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 171678
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 171678
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 171678
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 171678
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 171678
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 171678
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 171678
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 171678
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 171678
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 171678
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-seattle 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass

[libvirt test] 171700: regressions - FAIL

2022-07-19 Thread osstest service owner
flight 171700 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/171700/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-libvirt   6 libvirt-build fail REGR. vs. 151777
 build-amd64-libvirt   6 libvirt-build fail REGR. vs. 151777
 build-i386-libvirt    6 libvirt-build fail REGR. vs. 151777
 build-armhf-libvirt   6 libvirt-build fail REGR. vs. 151777

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  f52dbac93f146e5a80d5d27b2db1d3549d6580ef
baseline version:
 libvirt  2c846fa6bcc11929c9fb857a22430fb9945654ad

Last test of basis   151777  2020-07-10 04:19:19 Z  740 days
Failing since        151818  2020-07-11 04:18:52 Z  739 days  721 attempts
Testing same since   171700  2022-07-20 04:18:56 Z    0 days    1 attempts


People who touched revisions under test:
Adolfo Jayme Barrientos 
  Aleksandr Alekseev 
  Aleksei Zakharov 
  Amneesh Singh 
  Andika Triwidada 
  Andrea Bolognani 
  Andrew Melnychenko 
  Ani Sinha 
  Balázs Meskó 
  Barrett Schonefeld 
  Bastian Germann 
  Bastien Orivel 
  BiaoXiang Ye 
  Bihong Yu 
  Binfeng Wu 
  Bjoern Walk 
  Boris Fiuczynski 
  Brad Laue 
  Brian Turek 
  Bruno Haible 
  Chris Mayo 
  Christian Borntraeger 
  Christian Ehrhardt 
  Christian Kirbach 
  Christian Schoenebeck 
  Christophe Fergeau 
  Claudio Fontana 
  Cole Robinson 
  Collin Walling 
  Cornelia Huck 
  Cédric Bosdonnat 
  Côme Borsoi 
  Daniel Henrique Barboza 
  Daniel Letai 
  Daniel P. Berrange 
  Daniel P. Berrangé 
  David Michael 
  Didik Supriadi 
  dinglimin 
  Divya Garg 
  Dmitrii Shcherbakov 
  Dmytro Linkin 
  Eiichi Tsukata 
  Emilio Herrera 
  Eric Farman 
  Erik Skultety 
  Fabian Affolter 
  Fabian Freyer 
  Fabiano Fidêncio 
  Fangge Jin 
  Farhan Ali 
  Fedora Weblate Translation 
  Florian Schmidt 
  Franck Ridel 
  Gavi Teitz 
  gongwei 
  Guoyi Tu
  Göran Uddeborg 
  Halil Pasic 
  Han Han 
  Hao Wang 
  Haonan Wang 
  Hela Basa 
  Helmut Grohne 
  Hiroki Narukawa 
  Hyman Huang(黄勇) 
  Ian Wienand 
  Ioanna Alifieraki 
  Ivan Teterevkov 
  Jakob Meng 
  Jamie Strandboge 
  Jamie Strandboge 
  Jan Kuparinen 
  jason lee 
  Jean-Baptiste Holcroft 
  Jia Zhou 
  Jianan Gao 
  Jim Fehlig 
  Jin Yan 
  Jing Qi 
  Jinsheng Zhang 
  Jiri Denemark 
  Joachim Falk 
  John Ferlan 
  John Levon 
  John Levon 
  Jonathan Watt 
  Jonathon Jongsma 
  Julio Faracco 
  Justin Gatzen 
  Ján Tomko 
  Kashyap Chamarthy 
  Kevin Locke 
  Kim InSoo 
  Koichi Murase 
  Kristina Hanicova 
  Laine Stump 
  Laszlo Ersek 
  Lee Yarwood 
  Lei Yang 
  Lena Voytek 
  Liang Yan 
  Liang Yan 
  Liao Pingfang 
  Lin Ma 
  Lin Ma 
  Lin Ma 
  Liu Yiding 
  Lubomir Rintel 
  Luke Yue 
  Luyao Zhong 
  luzhipeng 
  Marc Hartmayer 
  Marc-André Lureau 
  Marek Marczykowski-Górecki 
  Mark Mielke 
  Markus Schade 
  Martin Kletzander 
  Martin Pitt 
  Masayoshi Mizuma 
  Matej Cepl 
  Matt Coleman 
  Matt Coleman 
  Mauro Matteo Cascella 
  Max Goodhart 
  Maxim Nestratov 
  Meina Li 
  Michal Privoznik 
  Michał Smyk 
  Milo Casagrande 
  Moshe Levi 
  Moteen Shah 
  Moteen Shah 
  Muha Aliss 
  Nathan 
  Neal Gompa 
  Nick Chevsky 
  Nick Shyrokovskiy 
  Nickys Music Group 
  Nico Pache 
  Nicolas Lécureuil 
  Nicolas Lécureuil 
  Nikolay Shirokovskiy 
  Nikolay Shirokovskiy 
  Nikolay Shirokovskiy 
  Niteesh Dubey 
  Olaf Hering 
  Olesya Gerasimenko 
  Or Ozeri 
  Orion Poplawski 
  Pany 
  Paolo Bonzini 
  Patrick Magauran 
  Paulo de Rezende Pinatti 
  Pavel H