[Xen-devel] FYI: GSoC 2017: Xen Project has been accepted as a mentor organization!

2017-02-28 Thread Lars Kurth
Hi everyone,

We have been accepted into GSoC this year. This means we will likely see more
interest in projects than in the past, and could do with:
a) more people willing to mentor
b) more projects on https://wiki.xenproject.org/wiki/2017-Summer-Internships


The current list of projects and mentors is at:
Mirage OS: http://canopy.mirage.io/Projects
(note that for the projects in this list, it is not possible to identify the
mentors)
All other projects: https://wiki.xenproject.org/wiki/Outreach_Program_Projects
 
  
Best Regards
Lars

> Begin forwarded message:
> 
> From: Google Summer of Code 
> Subject: GSoC 2017: Xen Project has been accepted as a mentor organization!
> Date: 27 February 2017 17:00:44 GMT
> To: larskurth...@gmail.com
> 
> 
> Congratulations! Xen Project has been selected as a Google Summer of Code 
> 2017 mentor organization.
> 
> You can now invite mentors and update your organization profile.
> 
> Please click here to visit your dashboard: 
> https://summerofcode.withgoogle.com/dashboard/ 
> This email was sent to lars.kurth@gmail.com.
> 
> You are receiving this email because of your participation in Google Summer 
> of Code 2017. 
> https://summerofcode.withgoogle.com 
> To leave the program and stop receiving all emails, you can go to your
> profile and request deletion of your program profile.
> 
> For any questions, please contact gsoc-supp...@google.com. Replies to this 
> message go to an unmonitored mailbox.
> 
> © 2017 Google Inc., 1600 Amphitheatre Parkway, Mountain View, CA 94043, USA
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable bisection] complete test-amd64-i386-freebsd10-amd64

2017-02-28 Thread osstest service owner
branch xen-unstable
xenbranch xen-unstable
job test-amd64-i386-freebsd10-amd64
testid xen-boot

Tree: linux git://xenbits.xen.org/linux-pvops.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

  Bug is in tree:  xen git://xenbits.xen.org/xen.git
  Bug introduced:  c5b9805bc1f79319ae342c65fcc201a15a47
  Bug not present: b199c44afa3a0d18d0e968e78a590eb9e69e20ad
  Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/106232/


  commit c5b9805bc1f79319ae342c65fcc201a15a47
  Author: Daniel Kiper 
  Date:   Wed Feb 22 14:38:06 2017 +0100
  
  efi: create new early memory allocator
  
  There is a problem with place_string() which is used as early memory
  allocator. It gets memory chunks starting from start symbol and goes
  down. Sadly this does not work when Xen is loaded using multiboot2
  protocol because then the start lives on 1 MiB address and we should
  not allocate a memory from below of it. So, I tried to use mem_lower
  address calculated by GRUB2. However, this solution works only on some
  machines. There are machines in the wild (e.g. Dell PowerEdge R820)
  which uses first ~640 KiB for boot services code or data... :-(((
  Hence, we need new memory allocator for Xen EFI boot code which is
  quite simple and generic and could be used by place_string() and
  efi_arch_allocate_mmap_buffer(). I think about following solutions:
  
  1) We could use native EFI allocation functions (e.g. AllocatePool()
 or AllocatePages()) to get memory chunk. However, later (somewhere
 in __start_xen()) we must copy its contents to safe place or reserve
 it in e820 memory map and map it in Xen virtual address space. This
 means that the code referring to Xen command line, loaded modules and
 EFI memory map, mostly in __start_xen(), will be further complicated
 and diverge from legacy BIOS cases. Additionally, both former things
 have to be placed below 4 GiB because their addresses are stored in
 multiboot_info_t structure which has 32-bit relevant members.
  
  2) We may allocate memory area statically somewhere in Xen code which
 could be used as memory pool for early dynamic allocations. Looks
 quite simple. Additionally, it would not depend on EFI at all and
 could be used on legacy BIOS platforms if we need it. However, we
 must carefully choose size of this pool. We do not want increase Xen
 binary size too much and waste too much memory but also we must fit
 at least memory map on x86 EFI platforms. As I saw on small machine,
 e.g. IBM System x3550 M2 with 8 GiB RAM, memory map may contain more
 than 200 entries. Every entry on x86-64 platform is 40 bytes in size.
 So, it means that we need more than 8 KiB for EFI memory map only.
      Additionally, if we use this memory pool for Xen and modules command
      line storage (it would be used when xen.efi is executed as EFI
      application) then we should add, I think, about 1 KiB. In this case,
      to be on safe side, we should assume at least 64 KiB pool for early
      memory allocations. Which is about 4 times of our earlier
      calculations. However, during discussion on Xen-devel Jan Beulich
      suggested that just in case we should use 1 MiB memory pool like it
      is in original place_string() implementation. So, let's use 1 MiB as
      it was proposed. If we think that we should not waste unallocated
      memory in the pool on running system then we can mark this region as
      __initdata and move all required data to dynamically allocated
      places somewhere in __start_xen().
  
   2a) We could put memory pool into .bss.page_aligned section. Then
       allocate memory chunks starting from the lowest address. After init
       phase we can free unused portion of the memory pool as in case of
       .init.text or .init.data sections. This way we do not need to
       allocate any space in image file and freeing of unused area in the
       memory pool is very simple.
  
  Now #2a solution is implemented because it is quite simple and requires
  limited number of changes, especially in __start_xen().
  
  New allocator is quite generic and can be used on ARM platforms too.
  Though it is not enabled on ARM yet due to lack of some prereq.
  List of them is placed before ebmalloc code.
  
  Signed-off-by: Daniel Kiper 
  Acked-by: Jan Beulich 
  Acked-by: Julien Grall 
  Reviewed-by: Doug Goldstein 
  Tested-by: Doug Goldstein 


For bisection revision-tuple graph see:
http://logs.test-lab.xenproject.org/osste

Re: [Xen-devel] [xen-unstable bisection] complete test-amd64-i386-freebsd10-amd64

2017-02-28 Thread Roger Pau Monné
Hello,

It seems that your changes are causing issues when booting a 32-bit Dom0:
there are several IOMMU faults that prevent Dom0 from booting at all.
AFAICT this only happens when using a 32-bit Dom0. The bisector has pointed
at this change several times, so it looks like the culprit.

See for example:

http://logs.test-lab.xenproject.org/osstest/logs/106186/

This is the serial log of the box failing to boot:

http://logs.test-lab.xenproject.org/osstest/logs/106186/test-amd64-i386-migrupgrade/serial-chardonnay0.log.0

Search for "[VT-D]DMAR:[DMA Read] Request device [:01:00.0] fault addr
7cd3f000, iommu reg = 82c00021b000" to get to the first IOMMU fault.

Roger.
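
(For anyone triaging a saved copy of such a log, a grep along these lines
pulls out the DMAR faults. The first sample line below is the fault quoted
above; the second is arbitrary filler, not taken from the real log:)

```shell
# Count VT-d DMAR fault lines in a saved serial log.
cat > serial.log <<'EOF'
(XEN) [VT-D]DMAR:[DMA Read] Request device [:01:00.0] fault addr 7cd3f000, iommu reg = 82c00021b000
(XEN) some unrelated boot message
EOF
grep -c 'VT-D.*fault addr' serial.log   # prints 1
```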

On Tue, Feb 28, 2017 at 09:08:40AM +, osstest service owner wrote:
> branch xen-unstable
> xenbranch xen-unstable
> job test-amd64-i386-freebsd10-amd64
> testid xen-boot
> 
> Tree: linux git://xenbits.xen.org/linux-pvops.git
> Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
> Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
> Tree: qemuu git://xenbits.xen.org/qemu-xen.git
> Tree: xen git://xenbits.xen.org/xen.git
> 
> *** Found and reproduced problem changeset ***
> 
>   Bug is in tree:  xen git://xenbits.xen.org/xen.git
>   Bug introduced:  c5b9805bc1f79319ae342c65fcc201a15a47
>   Bug not present: b199c44afa3a0d18d0e968e78a590eb9e69e20ad
>   Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/106232/
> 
> 
>   commit c5b9805bc1f79319ae342c65fcc201a15a47
>   Author: Daniel Kiper 
>   Date:   Wed Feb 22 14:38:06 2017 +0100
>   
>   efi: create new early memory allocator
>   
>   There is a problem with place_string() which is used as early memory
>   allocator. It gets memory chunks starting from start symbol and goes
>   down. Sadly this does not work when Xen is loaded using multiboot2
>   protocol because then the start lives on 1 MiB address and we should
>   not allocate a memory from below of it. So, I tried to use mem_lower
>   address calculated by GRUB2. However, this solution works only on some
>   machines. There are machines in the wild (e.g. Dell PowerEdge R820)
>   which uses first ~640 KiB for boot services code or data... :-(((
>   Hence, we need new memory allocator for Xen EFI boot code which is
>   quite simple and generic and could be used by place_string() and
>   efi_arch_allocate_mmap_buffer(). I think about following solutions:
>   
>   1) We could use native EFI allocation functions (e.g. AllocatePool()
>  or AllocatePages()) to get memory chunk. However, later (somewhere
>  in __start_xen()) we must copy its contents to safe place or reserve
>  it in e820 memory map and map it in Xen virtual address space. This
>  means that the code referring to Xen command line, loaded modules and
>  EFI memory map, mostly in __start_xen(), will be further complicated
>  and diverge from legacy BIOS cases. Additionally, both former things
>  have to be placed below 4 GiB because their addresses are stored in
>  multiboot_info_t structure which has 32-bit relevant members.
>   
>   2) We may allocate memory area statically somewhere in Xen code which
>  could be used as memory pool for early dynamic allocations. Looks
>  quite simple. Additionally, it would not depend on EFI at all and
>  could be used on legacy BIOS platforms if we need it. However, we
>  must carefully choose size of this pool. We do not want increase Xen
>  binary size too much and waste too much memory but also we must fit
>  at least memory map on x86 EFI platforms. As I saw on small machine,
>  e.g. IBM System x3550 M2 with 8 GiB RAM, memory map may contain more
>  than 200 entries. Every entry on x86-64 platform is 40 bytes in size.
>  So, it means that we need more than 8 KiB for EFI memory map only.
>      Additionally, if we use this memory pool for Xen and modules command
>      line storage (it would be used when xen.efi is executed as EFI
>      application) then we should add, I think, about 1 KiB. In this case,
>      to be on safe side, we should assume at least 64 KiB pool for early
>      memory allocations. Which is about 4 times of our earlier
>      calculations. However, during discussion on Xen-devel Jan Beulich
>      suggested that just in case we should use 1 MiB memory pool like it
>      is in original place_string() implementation. So, let's use 1 MiB
>      as it was proposed. If we think that we should not waste unallocated
>      memory in the pool on running system then we can mark this region
>      as __initdata and move all required data to dynamically allocated
>      places somewhere in __start_xen().
>   
>   2a) We could put memory pool into .bss.page_aligned section. Then allocate

Re: [Xen-devel] [PATCH 10/10] x86/cpuid: Always enable faulting for the control domain

2017-02-28 Thread Jan Beulich
>>> On 27.02.17 at 16:10,  wrote:
> On 22/02/17 10:10, Jan Beulich wrote:
> On 22.02.17 at 11:00,  wrote:
>>> On 22/02/17 09:23, Jan Beulich wrote:
>>> On 20.02.17 at 12:00,  wrote:
> The domain builder in libxc no longer depends on leaked CPUID information 
> to
> properly construct HVM domains.  Remove the control domain exclusion.
 Am I missing some intermediate step? As long as there's a raw
 CPUID invocation in xc_cpuid_x86.c (which is still there in staging
 and I don't recall this series removing it) it at least _feels_ unsafe.
>>> Strictly speaking, the domain builder part of this was completed after
>>> my xsave adjustments.  All the guest-type-dependent information now
>>> comes from non-cpuid sources in libxc, or Xen ignores the toolstack
>>> values and recalculates information itself.
>>>
>>> However, until the Intel leaves were complete, dom0 had a hard time
>>> booting with this change, as there was no toolstack-provided policy and
>>> no leakage from hardware.
>> So what are the CPUID uses in libxc then needed for at this point?
>> Could they be removed in a prereq patch to make clear all needed
>> information is now being obtained via hypercalls?
> 
> I'd prefer to defer that work.  The next chunk of CPUID work is going to
> be redesigning and reimplementing the hypervisor/libxc interface, and
> all cpuid() calls in libxc will fall out there, but it's not a trivial
> set of changes to make.

With that, could you live with deferring the patch here until then?
I ask because ...

 Also the change here then results in Dom0 observing different
 behavior between faulting-capable and faulting-incapable hosts.
 I'm not convinced this is desirable.
>>> I disagree.  Avoiding the leakage is very desirable moving forwards.
>>>
>>> Other side effects are that it makes PV and PVH dom0 functionally
>>> identical WRT CPUID, and PV userspace (which, unlikely the kernel, tends
>>> not to be Xen-aware) sees sensible information.
>> I can see the upsides too, hence the "I'm not convinced" ...
> 
> So is that an ack or a nack?  I am afraid that this isn't very helpful.

... I understand this isn't helpful, yet no, at this point it's neither
an ack nor a nack.

Jan




Re: [Xen-devel] RFC/PATCH: xen: race during domain destruction [Re: [xen-4.7-testing test] 105948: regressions - FAIL]

2017-02-28 Thread Jan Beulich
>>> On 27.02.17 at 16:18,  wrote:
> I'm therefore doing that now: I ask for backport of:
> 
>  f3d47501db2b7bb8dfd6a3c9710b7aff4b1fc55b
>  xen: fix a (latent) cpupool-related race during domain destroy
> 
> to 4.7.

Thanks for working this out! Applied to 4.7-staging.

Jan




Re: [Xen-devel] [RFC PATCH] mm, hotplug: get rid of auto_online_blocks

2017-02-28 Thread Heiko Carstens
On Mon, Feb 27, 2017 at 04:43:04PM +0100, Michal Hocko wrote:
> On Mon 27-02-17 12:25:10, Heiko Carstens wrote:
> > On Mon, Feb 27, 2017 at 11:02:09AM +0100, Vitaly Kuznetsov wrote:
> > > A couple of other thoughts:
> > > 1) Having all newly added memory online ASAP is probably what people
> > > want for all virtual machines.
> > 
> > This is not true for s390. On s390 we have "standby" memory that a guest
> > sees and potentially may use if it sets it online. Every guest that sets
> > memory offline contributes to the hypervisor's standby memory pool, while
> > onlining standby memory takes memory away from the standby pool.
> > 
> > The use-case is that a system administrator knows in advance the maximum
> > size a guest will ever have and also defines how much memory should be
> > used at boot time. The difference is standby memory.
> > 
> > Auto-onlining of standby memory is the last thing we want.
> > 
> > > Unfortunately, we have additional complexity with memory zones
> > > (ZONE_NORMAL, ZONE_MOVABLE) and in some cases manual intervention is
> > > required. Especially, when further unplug is expected.
> > 
> > This also is a reason why auto-onlining doesn't seem to be the best way.
> 
> Can you imagine any situation when somebody actually might want to have
> this knob enabled? From what I understand it doesn't seem to be the
> case.

I can only speak for s390, and at least here I think auto-online is always
wrong, especially if you consider the added complexity that you may want to
online memory sometimes to ZONE_NORMAL and sometimes to ZONE_MOVABLE.




Re: [Xen-devel] [PATCH v4] xen/arm: warn if dom0_mem is not specified

2017-02-28 Thread Wei Liu
On Mon, Feb 27, 2017 at 03:14:07PM -0800, Stefano Stabellini wrote:
> The default dom0_mem is 128M, which is not sufficient to boot a
> Ubuntu-based Dom0. It is not clear what a better default value could be.
> 
> Instead, loudly warn the user when dom0_mem is unspecified and wait 3
> seconds. Then use 512M.
> 
> Update the docs to specify that dom0_mem is required on ARM. (The
> current xen-command-line document does not actually reflect the current
> behavior of dom0_mem on ARM correctly.)
> 
> Signed-off-by: Stefano Stabellini 

Reviewed-by: Wei Liu 



Re: [Xen-devel] [PATCH] xl: move some helper functions to xl_utils.c

2017-02-28 Thread Ian Jackson
Wei Liu writes ("[PATCH] xl: move some helper functions to xl_utils.c"):
> Move some commonly used functions to a new file.

Acked-by: Ian Jackson 



[Xen-devel] [PATCH 1/4] interface: avoid redefinition of __XEN_INTERFACE_VERSION__

2017-02-28 Thread Juergen Gross
In the stubdom environment __XEN_INTERFACE_VERSION__ is sometimes defined
on the command line of the build invocation. This conflicts with
xen-compat.h defining it unconditionally if __XEN__ or __XEN_TOOLS__
is set.

Just use #undef in this case to avoid the resulting warning.

Signed-off-by: Juergen Gross 
---
 xen/include/public/xen-compat.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/include/public/xen-compat.h b/xen/include/public/xen-compat.h
index b673653..9453439 100644
--- a/xen/include/public/xen-compat.h
+++ b/xen/include/public/xen-compat.h
@@ -31,6 +31,7 @@
 
 #if defined(__XEN__) || defined(__XEN_TOOLS__)
 /* Xen is built with matching headers and implements the latest interface. */
+#undef __XEN_INTERFACE_VERSION__
 #define __XEN_INTERFACE_VERSION__ __XEN_LATEST_INTERFACE_VERSION__
 #elif !defined(__XEN_INTERFACE_VERSION__)
 /* Guests which do not specify a version get the legacy interface. */
-- 
2.10.2




[Xen-devel] [PATCH 2/4] tools: add pkg-config file for libxc

2017-02-28 Thread Juergen Gross
When configuring the build of qemu, the configure script builds
various test programs to determine the exact version of libxencontrol.

Instead of a trial-and-error approach needing updates for nearly each
new version of Xen, just provide xencontrol.pc to be used via
pkg-config.

In the end we need two different variants of that file: one for the
target system where eventually someone wants to build qemu, and one
for the local system to be used for building qemu as part of the Xen
build process.

The local variant is created in a dedicated directory in order to be
able to collect more pkg-config files used for building tools there.

Signed-off-by: Juergen Gross 
---
 .gitignore   |  3 +++
 stubdom/Makefile |  2 ++
 tools/Makefile   |  3 ++-
 tools/Rules.mk   | 13 +
 tools/libxc/Makefile | 22 +-
 tools/libxc/xencontrol.pc.in |  9 +
 6 files changed, 50 insertions(+), 2 deletions(-)
 create mode 100644 tools/libxc/xencontrol.pc.in

diff --git a/.gitignore b/.gitignore
index 557b38e..bd9ac53 100644
--- a/.gitignore
+++ b/.gitignore
@@ -79,6 +79,7 @@ stubdom/newlib-1.*
 stubdom/newlib-x86*
 stubdom/ocaml-*
 stubdom/pciutils-*
+stubdom/pkg-config/*
 stubdom/polarssl-*
 stubdom/stubdompath.sh
 stubdom/tpm_emulator-*
@@ -179,6 +180,7 @@ tools/include/xen/*
 tools/include/xen-xsm/*
 tools/include/xen-foreign/*.(c|h|size)
 tools/include/xen-foreign/checker
+tools/libxc/*.pc
 tools/libxl/_libxl.api-for-check
 tools/libxl/*.api-ok
 tools/libxl/*.pc
@@ -204,6 +206,7 @@ tools/misc/xen-hvmctx
 tools/misc/xenlockprof
 tools/misc/lowmemd
 tools/misc/xencov
+tools/pkg-config/*
 tools/xentrace/xenalyze
 tools/pygrub/build/*
 tools/python/build/*
diff --git a/stubdom/Makefile b/stubdom/Makefile
index 39b81c9..c6458e8 100644
--- a/stubdom/Makefile
+++ b/stubdom/Makefile
@@ -318,6 +318,7 @@ define do_links
   cd $(dir $@); \
   ln -sf $(dir $<)include/*.h include/; \
   ln -sf $(dir $<)*.[ch] .; \
+  ln -sf $(dir $<)*.pc.in .; \
   ln -sf $(dir $<)Makefile .
   touch $@
 endef
@@ -623,6 +624,7 @@ clean:
rm -fr grub-$(XEN_TARGET_ARCH)
rm -f $(STUBDOMPATH)
rm -f *-minios-config.mk
+   rm -fr pkg-config
	[ ! -e libs-$(XEN_TARGET_ARCH)/toollog/Makefile ] || $(MAKE) DESTDIR= -C libs-$(XEN_TARGET_ARCH)/toollog clean
	[ ! -e libs-$(XEN_TARGET_ARCH)/evtchn/Makefile ] || $(MAKE) DESTDIR= -C libs-$(XEN_TARGET_ARCH)/evtchn clean
	[ ! -e libs-$(XEN_TARGET_ARCH)/gnttab/Makefile ] || $(MAKE) DESTDIR= -C libs-$(XEN_TARGET_ARCH)/gnttab clean
diff --git a/tools/Makefile b/tools/Makefile
index 68633a4..9548ab4 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -111,9 +111,10 @@ uninstall:
 
 .PHONY: clean
 clean: subdirs-clean
+   rm -rf pkg-config
 
 .PHONY: distclean
-distclean: subdirs-distclean
+distclean: subdirs-distclean clean
rm -rf qemu-xen-traditional-dir qemu-xen-traditional-dir-remote
rm -rf qemu-xen-dir qemu-xen-dir-remote
rm -rf ../config/Tools.mk config.h config.log config.status \
diff --git a/tools/Rules.mk b/tools/Rules.mk
index 52bdd1a..e676c6b 100644
--- a/tools/Rules.mk
+++ b/tools/Rules.mk
@@ -245,3 +245,16 @@ ifeq (,$(findstring clean,$(MAKECMDGOALS)))
 $(XEN_ROOT)/config/Tools.mk:
	$(error You have to run ./configure before building or installing the tools)
 endif
+
+$(PKG_CONFIG_DIR)/%.pc: %.pc.in Makefile
+   mkdir -p $(PKG_CONFIG_DIR)
+   @sed -e 's!@@version@@!$(PKG_CONFIG_VERSION)!g' \
+-e 's!@@prefix@@!$(PKG_CONFIG_PREFIX)!g' \
+-e 's!@@incdir@@!$(PKG_CONFIG_INCDIR)!g' \
+-e 's!@@libdir@@!$(PKG_CONFIG_LIBDIR)!g' < $< > $@
+
+%.pc: %.pc.in Makefile
+   @sed -e 's!@@version@@!$(PKG_CONFIG_VERSION)!g' \
+-e 's!@@prefix@@!$(PKG_CONFIG_PREFIX)!g' \
+-e 's!@@incdir@@!$(PKG_CONFIG_INCDIR)!g' \
+-e 's!@@libdir@@!$(PKG_CONFIG_LIBDIR)!g' < $< > $@
diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index da689c4..a161ba7 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -1,4 +1,5 @@
 XEN_ROOT = $(CURDIR)/../..
+PKG_CONFIG_DIR = ../pkg-config
 include $(XEN_ROOT)/tools/Rules.mk
 
 MAJOR= 4.9
@@ -159,6 +160,22 @@ endif
 $(CTRL_LIB_OBJS) $(GUEST_LIB_OBJS) \
 $(CTRL_PIC_OBJS) $(GUEST_PIC_OBJS): xc_private.h
 
+PKG_CONFIG := xencontrol.pc
+PKG_CONFIG_VERSION := $(MAJOR).$(MINOR)
+
+ifneq ($(CONFIG_LIBXC_MINIOS),y)
+PKG_CONFIG_INST := $(PKG_CONFIG)
+$(PKG_CONFIG_INST): PKG_CONFIG_PREFIX = $(prefix)
+$(PKG_CONFIG_INST): PKG_CONFIG_INCDIR = $(includedir)
+$(PKG_CONFIG_INST): PKG_CONFIG_LIBDIR = $(libdir)
+endif
+
+PKG_CONFIG_LOCAL := $(foreach pc,$(PKG_CONFIG),$(PKG_CONFIG_DIR)/$(pc))
+
+$(PKG_CONFIG_LOCAL): PKG_CONFIG_PREFIX = $(XEN_ROOT)
+$(PKG_CONFIG_LOCAL): PKG_CONFIG_INCDIR = $(XEN_LIBXC)/include
+$(PKG_CONFIG_LOCAL): PKG_CONFIG_LIBDIR = $(CURDIR)
+
 .PHONY: all
 all: build
 
@@ -167,12 +184,13 @@ build:
$(MAKE) libs
 
 .PHONY: 
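
For illustration, here is what the new machinery produces end to end. The
contents of xencontrol.pc.in are not included in this (truncated) posting,
so the template below is only a guess at a typical pkg-config template; the
sed invocation mirrors the rule added to tools/Rules.mk above:

```shell
# Hypothetical xencontrol.pc.in -- the real template is not shown here.
cat > xencontrol.pc.in <<'EOF'
prefix=@@prefix@@
includedir=@@incdir@@
libdir=@@libdir@@

Name: Xencontrol
Description: The Xen control library
Version: @@version@@
Cflags: -I${includedir}
Libs: -L${libdir} -lxenctrl
EOF

# Apply the same substitutions the tools/Rules.mk rule performs:
sed -e 's!@@version@@!4.9!g' \
    -e 's!@@prefix@@!/usr/local!g' \
    -e 's!@@incdir@@!/usr/local/include!g' \
    -e 's!@@libdir@@!/usr/local/lib!g' < xencontrol.pc.in > xencontrol.pc

grep '^Version:' xencontrol.pc   # prints: Version: 4.9
```

With PKG_CONFIG_PATH pointing at the directory holding the generated file,
qemu's configure can then ask `pkg-config --modversion xencontrol` instead
of compiling test programs (which is what patch 4/4 arranges).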

[Xen-devel] [PATCH 0/4] build enhancements related to qemu

2017-02-28 Thread Juergen Gross
Last week I started an effort to use upstream qemu as a replacement
for qemu-trad in ioemu stubdom. This small series addresses some
problems I met in this effort. I believe all modifications are worth
doing regardless whether qemu upstream stubdom will become a reality
or not.

Juergen Gross (4):
  interface: avoid redefinition of __XEN_INTERFACE_VERSION__
  tools: add pkg-config file for libxc
  tools: use a dedicated build directory for qemu
  tools: set pkg-config path when configuring qemu

 .gitignore  |  4 
 stubdom/Makefile|  2 ++
 tools/Makefile  | 17 ++---
 tools/Rules.mk  | 13 +
 tools/libxc/Makefile| 22 +-
 tools/libxc/xencontrol.pc.in|  9 +
 xen/include/public/xen-compat.h |  1 +
 7 files changed, 60 insertions(+), 8 deletions(-)
 create mode 100644 tools/libxc/xencontrol.pc.in

-- 
2.10.2




[Xen-devel] [PATCH 4/4] tools: set pkg-config path when configuring qemu

2017-02-28 Thread Juergen Gross
When calling configure for qemu provide the local pkg-config directory
in order to let the configure process find the libxenctrl version.

Signed-off-by: Juergen Gross 
---
 tools/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/Makefile b/tools/Makefile
index 99080ab..d6f8ce1 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -255,6 +255,7 @@ subdir-all-qemu-xen-dir: qemu-xen-dir-find
else \
enable_trace_backend='' ; \
fi ; \
+   PKG_CONFIG_PATH=$(XEN_ROOT)/tools/pkg-config \
$$source/configure --enable-xen --target-list=i386-softmmu \
$(QEMU_XEN_ENABLE_DEBUG) \
$$enable_trace_backend \
-- 
2.10.2




[Xen-devel] [PATCH 3/4] tools: use a dedicated build directory for qemu

2017-02-28 Thread Juergen Gross
Instead of using the downloaded git tree as target directory for the
qemu build create a dedicated directory for that purpose.

Signed-off-by: Juergen Gross 
---
 .gitignore |  1 +
 tools/Makefile | 13 +++--
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/.gitignore b/.gitignore
index bd9ac53..a4937f9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -207,6 +207,7 @@ tools/misc/xenlockprof
 tools/misc/lowmemd
 tools/misc/xencov
 tools/pkg-config/*
+tools/qemu-xen-build
 tools/xentrace/xenalyze
 tools/pygrub/build/*
 tools/python/build/*
diff --git a/tools/Makefile b/tools/Makefile
index 9548ab4..99080ab 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -116,7 +116,7 @@ clean: subdirs-clean
 .PHONY: distclean
 distclean: subdirs-distclean clean
rm -rf qemu-xen-traditional-dir qemu-xen-traditional-dir-remote
-   rm -rf qemu-xen-dir qemu-xen-dir-remote
+   rm -rf qemu-xen-dir qemu-xen-dir-remote qemu-xen-build
rm -rf ../config/Tools.mk config.h config.log config.status \
config.cache autom4te.cache
 
@@ -244,9 +244,10 @@ subdir-all-qemu-xen-dir: qemu-xen-dir-find
if test -d $(QEMU_UPSTREAM_LOC) ; then \
source=$(QEMU_UPSTREAM_LOC); \
else \
-   source=.; \
+   source=$(XEN_ROOT)/tools/qemu-xen-dir; \
fi; \
-   cd qemu-xen-dir; \
+   mkdir -p qemu-xen-build; \
+   cd qemu-xen-build; \
if $$source/scripts/tracetool.py --check-backend --backend log ; then \
enable_trace_backend='--enable-trace-backend=log'; \
	elif $$source/scripts/tracetool.py --check-backend --backend stderr ; then \
@@ -299,12 +300,12 @@ subdir-all-qemu-xen-dir: qemu-xen-dir-find
$(MAKE) all
 
 subdir-install-qemu-xen-dir: subdir-all-qemu-xen-dir
-   cd qemu-xen-dir; \
+   cd qemu-xen-build; \
$(MAKE) install
 
 subdir-clean-qemu-xen-dir:
-   set -e; if test -d qemu-xen-dir/.; then \
-   $(MAKE) -C qemu-xen-dir clean; \
+   set -e; if test -d qemu-xen-build/.; then \
+   $(MAKE) -C qemu-xen-build clean; \
fi
 
 subdir-clean-debugger/gdbsx subdir-distclean-debugger/gdbsx: .phony
-- 
2.10.2




Re: [Xen-devel] [PATCH 1/4] interface: avoid redefinition of __XEN_INTERFACE_VERSION__

2017-02-28 Thread Andrew Cooper
On 28/02/17 10:34, Juergen Gross wrote:
> In the stubdom environment __XEN_INTERFACE_VERSION__ is sometimes defined
> on the command line of the build invocation. This conflicts with
> xen-compat.h defining it unconditionally if __XEN__ or __XEN_TOOLS__
> is set.
>
> Just use #undef in this case to avoid the resulting warning.
>
> Signed-off-by: Juergen Gross 

Reviewed-by: Andrew Cooper 

> ---
>  xen/include/public/xen-compat.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/xen/include/public/xen-compat.h b/xen/include/public/xen-compat.h
> index b673653..9453439 100644
> --- a/xen/include/public/xen-compat.h
> +++ b/xen/include/public/xen-compat.h
> @@ -31,6 +31,7 @@
>  
>  #if defined(__XEN__) || defined(__XEN_TOOLS__)
>  /* Xen is built with matching headers and implements the latest interface. */
> +#undef __XEN_INTERFACE_VERSION__
>  #define __XEN_INTERFACE_VERSION__ __XEN_LATEST_INTERFACE_VERSION__
>  #elif !defined(__XEN_INTERFACE_VERSION__)
>  /* Guests which do not specify a version get the legacy interface. */




Re: [Xen-devel] [PATCH v8 00/24] Enable L2 Cache Allocation Technology & Refactor psr.c

2017-02-28 Thread Roger Pau Monné
On Wed, Feb 15, 2017 at 04:49:15PM +0800, Yi Sun wrote:
> Hi all,
> 
> We plan to bring a new PSR (Platform Shared Resource) feature called
> Intel L2 Cache Allocation Technology (L2 CAT) to Xen.
> 
> Besides the L2 CAT implementation, we refactor psr.c to make it more
> flexible to add new features and fulfill the principle "open for extension
> but closed for modification". We abstract the general operations of all

Hello,

I see that you use the term "open for extension but closed for modification"
here and in other patches. Could you please clarify what this means? "Closed
for modification" doesn't seem to make much sense for an open source project,
where all the code is always open for modification.

Thanks, Roger.



[Xen-devel] [qemu-mainline baseline-only test] 68618: tolerable trouble: blocked/broken/fail/pass

2017-02-28 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 68618 qemu-mainline real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/68618/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-checkfail   like 68616
 test-armhf-armhf-libvirt 13 saverestore-support-checkfail   like 68616
 test-armhf-armhf-libvirt-raw 12 saverestore-support-checkfail   like 68616
 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail like 68616
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  9 windows-installfail like 68616

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64   2 hosts-allocate   broken never pass
 build-arm64-pvops 2 hosts-allocate   broken never pass
 build-arm64-xsm   2 hosts-allocate   broken never pass
 build-arm64   3 capture-logs broken never pass
 build-arm64-pvops 3 capture-logs broken never pass
 build-arm64-xsm   3 capture-logs broken never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-xl-midway   12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-midway   13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass

version targeted for testing:
 qemuu8f2d7c341184a95d05476ea3c45dbae2b9ddbe51
baseline version:
 qemuu28f997a82cb509bf4775d4006b368e1bde8b7bdd

Last test of basis68616  2017-02-26 17:44:12 Z1 days
Testing same since68618  2017-02-28 04:46:25 Z0 days1 attempts


People who touched revisions under test:
  Artyom Tarasenko 
  Daniel P. Berrange 
  Greg Kurz 
  Jeff Cody 
  John Snow 
  Kevin Wolf 
  Li Qiang 
  Nir Soffer 
  Paolo Bonzini 
  Paul Burton 
  Peter Lieven 
  Peter Maydell 
  Prasad J Pandit 
  Samuel Thibault 
  tianqing 
  Yongbok Kim 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  

Re: [Xen-devel] [PATCH 4/4] tools: set pkg-config path when configuring qemu

2017-02-28 Thread Ian Jackson
Juergen Gross writes ("[PATCH 4/4] tools: set pkg-config path when configuring 
qemu"):
> When calling configure for qemu provide the local pkg-config directory
> in order to let the configure process find the libxenctrl version.

Is PKG_CONFIG_PATH used _in addition to_ the built-in paths, or does
it replace them ?

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/4] interface: avoid redefinition of __XEN_INTERFACE_VERSION__

2017-02-28 Thread Ian Jackson
Juergen Gross writes ("[PATCH 1/4] interface: avoid redefinition of 
__XEN_INTERFACE_VERSION__"):
> In stubdom environment __XEN_INTERFACE_VERSION__ is sometimes defined
> on the command line of the build instruction. This conflicts with
> xen-compat.h defining it unconditionally if __XEN__ or __XEN_TOOLS__
> is set.

Aren't the two definitions the same ?  If so, surely there should be
no warning ?  If they are not the same, then surely we need to think
about which one is right ?

Sorry if the answers to this are obvious.

Thanks,
Ian.



Re: [Xen-devel] [PATCH 4/4] tools: set pkg-config path when configuring qemu

2017-02-28 Thread Juergen Gross
On 28/02/17 12:08, Ian Jackson wrote:
> Juergen Gross writes ("[PATCH 4/4] tools: set pkg-config path when 
> configuring qemu"):
>> When calling configure for qemu provide the local pkg-config directory
>> in order to let the configure process find the libxenctrl version.
> 
> Is PKG_CONFIG_PATH used _in addition to_ the built-in paths, or does
> it replace them ?

It is in addition.


Juergen




Re: [Xen-devel] [PATCH 1/4] interface: avoid redefinition of __XEN_INTERFACE_VERSION__

2017-02-28 Thread Jan Beulich
>>> On 28.02.17 at 11:34,  wrote:
> In stubdom environment __XEN_INTERFACE_VERSION__ is sometimes defined
> on the command line of the build instruction. This conflicts with
> xen-compat.h defining it unconditionally if __XEN__ or __XEN_TOOLS__
> is set.

Then that's what wants fixing. In fact it's questionable whether
__XEN_TOOLS__ (or even __XEN__) getting defined there is
appropriate.

> Just use #undef in this case to avoid the resulting warning.

I think the lack of a warning in case of a collision is worse here.
People should simply not define both the version symbol and
either of __XEN__ or __XEN_TOOLS__.

Jan




Re: [Xen-devel] [PATCH 2/4] tools: add pkg-config file for libxc

2017-02-28 Thread Ian Jackson
Juergen Gross writes ("[PATCH 2/4] tools: add pkg-config file for libxc"):
> When configuring the build of qemu the configure script is building
> various test programs to determine the exact version of libxencontrol.
> 
> Instead of a trial-and-error approach needing updates for nearly each
> new version of Xen just provide xencontrol.pc to be used via
> pkg-config.
> 
> In the end we need two different variants of that file: one for the
> target system where eventually someone wants to build qemu, and one
> for the local system to be used for building qemu as part of the Xen
> build process.

I've not seen this done elsewhere, but I can see why it's attractive.
I worry though that we're breaking new ground.  Did you come up with
this idea yourself ?  Are you aware of other projects that do
something similar ?

Ian.
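For reference, a xencontrol.pc of the sort being proposed might look like the following sketch (hypothetical paths and version; the file actually shipped with the series may differ):

```
prefix=/usr/local
includedir=${prefix}/include
libdir=${prefix}/lib

Name: xencontrol
Description: Xen xencontrol (libxc) library
Version: 4.9.0
Cflags: -I${includedir}
Libs: -L${libdir} -lxenctrl
```

qemu's configure could then discover the version with something like `pkg-config --modversion xencontrol`, with PKG_CONFIG_PATH pointing at the in-tree copy for the local-build case; per the follow-up elsewhere in this thread, that variable adds to the built-in search paths rather than replacing them.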



Re: [Xen-devel] [PATCH 3/4] tools: use a dedicated build directory for qemu

2017-02-28 Thread Ian Jackson
Juergen Gross writes ("[PATCH 3/4] tools: use a dedicated build directory for 
qemu"):
> Instead of using the downloaded git tree as target directory for the
> qemu build create a dedicated directory for that purpose.

... why ?

Ian.



Re: [Xen-devel] [PATCH 4/4] tools: set pkg-config path when configuring qemu

2017-02-28 Thread Ian Jackson
Juergen Gross writes ("Re: [PATCH 4/4] tools: set pkg-config path when 
configuring qemu"):
> On 28/02/17 12:08, Ian Jackson wrote:
> > Is PKG_CONFIG_PATH used _in addition to_ the built-in paths, or does
> > it replace them ?
> 
> It is in addition.

Good.  Thanks.

Acked-by: Ian Jackson 



[Xen-devel] [libvirt test] 106226: tolerable FAIL - PUSHED

2017-02-28 Thread osstest service owner
flight 106226 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106226/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt 13 saverestore-support-checkfail  like 106101
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-checkfail  like 106101
 test-armhf-armhf-libvirt-raw 12 saverestore-support-checkfail  like 106101

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 build-arm64   5 xen-buildfail   never pass
 build-arm64-xsm   5 xen-buildfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 build-arm64-pvops 5 kernel-build fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  ca1f38545750d597c75c9773723c716483b03e5c
baseline version:
 libvirt  38a8489c01787146215847ba6a84a5b2c5799f1f

Last test of basis   106101  2017-02-25 04:20:45 Z3 days
Testing same since   106226  2017-02-28 04:20:43 Z0 days1 attempts


People who touched revisions under test:
  John Ferlan 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  fail
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  fail
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  blocked 
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopsfail
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-arm64-arm64-libvirt-xsm blocked 
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-arm64-arm64-libvirt blocked 
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair pass
 test-arm64-arm64-libvirt-qcow2   blocked 
 test-armhf-armhf-libvirt-raw pass
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at

Re: [Xen-devel] [xen-unstable bisection] complete test-amd64-i386-freebsd10-amd64

2017-02-28 Thread Jan Beulich
>>> On 28.02.17 at 10:20,  wrote:
> It seems that your changes are causing issues when booting a 32bit Dom0, it
> seems there are several IOMMU faults that prevent Dom0 from booting at all.
> AFAICT this only happens when using a 32bit Dom0. The bisector has pointed
> several times at this change, so it looks like the culprit.
> 
> See for example:
> 
> http://logs.test-lab.xenproject.org/osstest/logs/106186/ 
> 
> This is the serial log of the box failing to boot:
> 
> http://logs.test-lab.xenproject.org/osstest/logs/106186/test-amd64-i386-migrupgrade/serial-chardonnay0.log.0
> 
> Search for "[VT-D]DMAR:[DMA Read] Request device [:01:00.0] fault addr
> 7cd3f000, iommu reg = 82c00021b000" to get to the first IOMMU fault.

And I think this is due to xen_in_range() (used by
vtd_set_hwdom_mapping()) not having got adjusted to cover the
freed part of .bss. Oddly enough the AMD equivalent _still_ does
not use it, but instead blindly maps everything.

Along those lines I would then also expect tboot to be broken, due
to its treatment of the [__bss_start,__bss_end) range.

Andrew, it looks like your b4cd59fea0 ("x86: reorder .data and
.init when linking") did also introduce breakage here, as
[_stext,__init_begin) no longer covers .data.

Jan




[Xen-devel] [PATCH] efi/boot: Avoid memory corruption when freeing ebmalloc area

2017-02-28 Thread Andrew Cooper
Don't free the final page containing a partial allocation.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

NB: This was from code inspection.  It is not whatever is causing the bisector
to finger this commit.
---
 xen/common/efi/boot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index b6cbdad..f89163c 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -148,7 +148,7 @@ static void __init __maybe_unused free_ebmalloc_unused_mem(void)
 {
 unsigned long start, end;
 
-start = (unsigned long)ebmalloc_mem + PAGE_ALIGN(ebmalloc_allocated);
+start = (unsigned long)ebmalloc_mem + ROUNDUP(ebmalloc_allocated, PAGE_SIZE);
 end = (unsigned long)ebmalloc_mem + sizeof(ebmalloc_mem);
 
 destroy_xen_mappings(start, end);
-- 
2.1.4




[Xen-devel] [PATCH v3 1/7] xen: credit2: make accessor helpers inline functions instead of macros

2017-02-28 Thread Dario Faggioli
There isn't any particular reason for the accessor helpers
to be macros, so turn them into 'static inline'-s, which are
better.

Note that it is necessary to move the function definitions
below the structure declarations.

No functional change intended.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
Cc: Andrew Cooper 
Cc: Jan Beulich 
---
Changes from v2:
* plain 'inline' instead of 'always_inline', as requested during review;
* 'unsigned int' instead of just 'unsigned', as requested during review;
* constified more, as suggested during review;
* killed pointless parentheses, as suggested during review.
---
 xen/common/sched_credit2.c |  153 +---
 1 file changed, 86 insertions(+), 67 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index b12d038..939c37b 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -208,18 +208,6 @@ static unsigned int __read_mostly opt_migrate_resist = 500;
 integer_param("sched_credit2_migrate_resist", opt_migrate_resist);
 
 /*
- * Useful macros
- */
-#define CSCHED2_PRIV(_ops)   \
-((struct csched2_private *)((_ops)->sched_data))
-#define CSCHED2_VCPU(_vcpu)  ((struct csched2_vcpu *) (_vcpu)->sched_priv)
-#define CSCHED2_DOM(_dom)((struct csched2_dom *) (_dom)->sched_priv)
-/* CPU to runq_id macro */
-#define c2r(_ops, _cpu) (CSCHED2_PRIV(_ops)->runq_map[(_cpu)])
-/* CPU to runqueue struct macro */
-#define RQD(_ops, _cpu) (&CSCHED2_PRIV(_ops)->rqd[c2r(_ops, _cpu)])
-
-/*
  * Load tracking and load balancing
  *
  * Load history of runqueues and vcpus is accounted for by using an
@@ -440,6 +428,37 @@ struct csched2_dom {
 };
 
 /*
+ * Accessor helpers functions.
+ */
+static inline struct csched2_private *csched2_priv(const struct scheduler *ops)
+{
+return ops->sched_data;
+}
+
+static inline struct csched2_vcpu *csched2_vcpu(const struct vcpu *v)
+{
+return v->sched_priv;
+}
+
+static inline struct csched2_dom *csched2_dom(const struct domain *d)
+{
+return d->sched_priv;
+}
+
+/* CPU to runq_id macro */
+static inline int c2r(const struct scheduler *ops, unsigned int cpu)
+{
+return csched2_priv(ops)->runq_map[(cpu)];
+}
+
+/* CPU to runqueue struct macro */
+static inline struct csched2_runqueue_data *c2rqd(const struct scheduler *ops,
+  unsigned int cpu)
+{
+return &csched2_priv(ops)->rqd[c2r(ops, cpu)];
+}
+
+/*
  * Hyperthreading (SMT) support.
  *
  * We use a special per-runq mask (smt_idle) and update it according to the
@@ -693,7 +712,7 @@ static void
 __update_runq_load(const struct scheduler *ops,
   struct csched2_runqueue_data *rqd, int change, s_time_t now)
 {
-struct csched2_private *prv = CSCHED2_PRIV(ops);
+struct csched2_private *prv = csched2_priv(ops);
 s_time_t delta, load = rqd->load;
 unsigned int P, W;
 
@@ -780,7 +799,7 @@ static void
 __update_svc_load(const struct scheduler *ops,
   struct csched2_vcpu *svc, int change, s_time_t now)
 {
-struct csched2_private *prv = CSCHED2_PRIV(ops);
+struct csched2_private *prv = csched2_priv(ops);
 s_time_t delta, vcpu_load;
 unsigned int P, W;
 
@@ -877,7 +896,7 @@ static void
 runq_insert(const struct scheduler *ops, struct csched2_vcpu *svc)
 {
 unsigned int cpu = svc->vcpu->processor;
-struct list_head * runq = &RQD(ops, cpu)->runq;
+struct list_head * runq = &c2rqd(ops, cpu)->runq;
 int pos = 0;
 
 ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
@@ -935,7 +954,7 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
 int i, ipid = -1;
 s_time_t lowest = (1<<30);
 unsigned int cpu = new->vcpu->processor;
-struct csched2_runqueue_data *rqd = RQD(ops, cpu);
+struct csched2_runqueue_data *rqd = c2rqd(ops, cpu);
 cpumask_t mask;
 struct csched2_vcpu * cur;
 
@@ -1006,7 +1025,7 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
 cpumask_and(&mask, &mask, cpumask_scratch_cpu(cpu));
 if ( __cpumask_test_and_clear_cpu(cpu, &mask) )
 {
-cur = CSCHED2_VCPU(curr_on_cpu(cpu));
+cur = csched2_vcpu(curr_on_cpu(cpu));
 burn_credits(rqd, cur, now);
 
 if ( cur->credit < new->credit )
@@ -1022,7 +1041,7 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
 /* Already looked at this one above */
 ASSERT(i != cpu);
 
-cur = CSCHED2_VCPU(curr_on_cpu(i));
+cur = csched2_vcpu(curr_on_cpu(i));
 
 /*
  * Even if the cpu is not in rqd->idle, it may be running the
@@ -1095,7 +1114,7 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
 static void reset_credit(const struct scheduler *ops, int cpu, s_time_t now,
  struct csched2_vcpu *snext)
 {
-struct csched

[Xen-devel] [PATCH v3 0/7] xen: credit2: improve style, and tracing; fix two bugs

2017-02-28 Thread Dario Faggioli
Hello

This is v3 of the still uncommitted patches of this series. I believe I have
either responded to or addressed all the review comments. See the individual
changelogs for more details.

Previous versions are here:
 v2 https://lists.xen.org/archives/html/xen-devel/2017-02/msg01027.html
 v1 https://lists.xen.org/archives/html/xen-devel/2017-01/msg02837.html

The patches which actually fixes the behavioral issues have become, in this
series, patch 4 and patch 5.

Patches that already have all the needed acks to go in are marked with '*' in
the series summary below.

There is a git branch here:
 git://xenbits.xen.org/people/dariof/xen.git rel/sched/credit2-style-tracing-accounting-v3
 https://travis-ci.org/fdario/xen/builds/206143142

Thanks and Regards,
Dario
---
Dario Faggioli (7):
   xen: credit2: make accessor helpers inline functions instead of macros
 * xen: credit2: tidy up functions names by removing leading '__'.
   xen: credit2: group the runq manipulating functions.
 * xen: credit2: always mark a tickled pCPU as... tickled!
 * xen: credit2: don't miss accounting while doing a credit reset.
 * xen/tools: tracing: trace (Credit2) runq traversal.
    xen/tools: tracing: Report next slice time when continuing as well as switching

 tools/xentrace/formats |4 
 tools/xentrace/xenalyze.c  |   32 ++
 xen/common/sched_credit2.c |  714 +++-
 xen/common/schedule.c  |4 
 xen/include/public/trace.h |1 
 5 files changed, 415 insertions(+), 340 deletions(-)
--
(Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



[Xen-devel] [PATCH v3 3/7] xen: credit2: group the runq manipulating functions.

2017-02-28 Thread Dario Faggioli
So that they're all close among each other, and
also near to the comment describing the runqueue
organization (which is also moved).

No functional change intended.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Anshul Makkar 
---
Changes from v2:
* don't move the 'credit2_runqueue' option parsing code, as suggested during
  review;
---
 xen/common/sched_credit2.c |  408 ++--
 1 file changed, 204 insertions(+), 204 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index c00dcbf..b0ec5f8 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -566,7 +566,7 @@ static int get_fallback_cpu(struct csched2_vcpu *svc)
 
 /*
  * Time-to-credit, credit-to-time.
- * 
+ *
  * We keep track of the "residual" time to make sure that frequent short
  * schedules still get accounted for in the end.
  *
@@ -587,7 +587,7 @@ static s_time_t c2t(struct csched2_runqueue_data *rqd, s_time_t credit, struct c
 }
 
 /*
- * Runqueue related code
+ * Runqueue related code.
  */
 
 static inline int vcpu_on_runq(struct csched2_vcpu *svc)
@@ -600,6 +600,208 @@ static inline struct csched2_vcpu * runq_elem(struct list_head *elem)
 return list_entry(elem, struct csched2_vcpu, runq_elem);
 }
 
+static void activate_runqueue(struct csched2_private *prv, int rqi)
+{
+struct csched2_runqueue_data *rqd;
+
+rqd = prv->rqd + rqi;
+
+BUG_ON(!cpumask_empty(&rqd->active));
+
+rqd->max_weight = 1;
+rqd->id = rqi;
+INIT_LIST_HEAD(&rqd->svc);
+INIT_LIST_HEAD(&rqd->runq);
+spin_lock_init(&rqd->lock);
+
+__cpumask_set_cpu(rqi, &prv->active_queues);
+}
+
+static void deactivate_runqueue(struct csched2_private *prv, int rqi)
+{
+struct csched2_runqueue_data *rqd;
+
+rqd = prv->rqd + rqi;
+
+BUG_ON(!cpumask_empty(&rqd->active));
+
+rqd->id = -1;
+
+__cpumask_clear_cpu(rqi, &prv->active_queues);
+}
+
+static inline bool_t same_node(unsigned int cpua, unsigned int cpub)
+{
+return cpu_to_node(cpua) == cpu_to_node(cpub);
+}
+
+static inline bool_t same_socket(unsigned int cpua, unsigned int cpub)
+{
+return cpu_to_socket(cpua) == cpu_to_socket(cpub);
+}
+
+static inline bool_t same_core(unsigned int cpua, unsigned int cpub)
+{
+return same_socket(cpua, cpub) &&
+   cpu_to_core(cpua) == cpu_to_core(cpub);
+}
+
+static unsigned int
+cpu_to_runqueue(struct csched2_private *prv, unsigned int cpu)
+{
+struct csched2_runqueue_data *rqd;
+unsigned int rqi;
+
+for ( rqi = 0; rqi < nr_cpu_ids; rqi++ )
+{
+unsigned int peer_cpu;
+
+/*
+ * As soon as we come across an uninitialized runqueue, use it.
+ * In fact, either:
+ *  - we are initializing the first cpu, and we assign it to
+ *runqueue 0. This is handy, especially if we are dealing
+ *with the boot cpu (if credit2 is the default scheduler),
+ *as we would not be able to use cpu_to_socket() and similar
+ *helpers anyway (the results of which are not reliable yet);
+ *  - we have gone through all the active runqueues, and have not
+ *found anyone whose cpus' topology matches the one we are
+ *dealing with, so activating a new runqueue is what we want.
+ */
+if ( prv->rqd[rqi].id == -1 )
+break;
+
+rqd = prv->rqd + rqi;
+BUG_ON(cpumask_empty(&rqd->active));
+
+peer_cpu = cpumask_first(&rqd->active);
+BUG_ON(cpu_to_socket(cpu) == XEN_INVALID_SOCKET_ID ||
+   cpu_to_socket(peer_cpu) == XEN_INVALID_SOCKET_ID);
+
+if ( opt_runqueue == OPT_RUNQUEUE_ALL ||
+ (opt_runqueue == OPT_RUNQUEUE_CORE && same_core(peer_cpu, cpu)) ||
+ (opt_runqueue == OPT_RUNQUEUE_SOCKET && same_socket(peer_cpu, cpu)) ||
+ (opt_runqueue == OPT_RUNQUEUE_NODE && same_node(peer_cpu, cpu)) )
+break;
+}
+
+/* We really expect to be able to assign each cpu to a runqueue. */
+BUG_ON(rqi >= nr_cpu_ids);
+
+return rqi;
+}
+
+/* Find the domain with the highest weight. */
+static void update_max_weight(struct csched2_runqueue_data *rqd, int new_weight,
+  int old_weight)
+{
+/* Try to avoid brute-force search:
+ * - If new_weight is larger, max_weight <- new_weight
+ * - If old_weight != max_weight, someone else is still max_weight
+ *   (No action required)
+ * - If old_weight == max_weight, brute-force search for max weight
+ */
+if ( new_weight > rqd->max_weight )
+{
+rqd->max_weight = new_weight;
+SCHED_STAT_CRANK(upd_max_weight_quick);
+}
+else if ( old_weight == rqd->max_weight )
+{
+struct list_head *iter;
+int max_weight = 1;
+
+list_for_each( iter, &rqd->svc )
+{
+struct csched2_vcpu * svc = list_entry(iter, struct csched2_vcpu, rqd_elem);
+
+   

[Xen-devel] [PATCH v3 2/7] xen: credit2: tidy up functions names by removing leading '__'.

2017-02-28 Thread Dario Faggioli
There is no reason for having pretty much all of the
functions whose names begin with double underscores
('__') to actually look like that.

In fact, that is misleading and makes the code hard
to read and understand. So, remove the '__'-s.

The only two that we keep are __runq_assign() and
__runq_deassign() (although they're converted to a
single underscore). In fact, in those cases, it is
indeed useful to have those sort of a "raw" variants.

In case of __runq_insert(), which is only called
once, by runq_insert(), merge the two functions.

No functional change intended.

Signed-off-by: Dario Faggioli 
Acked-by: George Dunlap 
---
Cc: Anshul Makkar 
---
Changes from v2:
* made 'runq_elem()' inline as well, as suggested during review.
---
 xen/common/sched_credit2.c |  114 +++-
 1 file changed, 49 insertions(+), 65 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 939c37b..c00dcbf 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -220,7 +220,7 @@ integer_param("sched_credit2_migrate_resist", opt_migrate_resist);
  * shift all time samples to the right.
  *
  * The details of the formulas used for load tracking are explained close to
- * __update_runq_load(). Let's just say here that, with full nanosecond time
+ * update_runq_load(). Let's just say here that, with full nanosecond time
  * granularity, a 30 bits wide 'decaying window' is ~1 second long.
  *
  * We want to consider the following equations:
@@ -232,7 +232,7 @@ integer_param("sched_credit2_migrate_resist", opt_migrate_resist);
  * Q-format fixed point arithmetic and load is the instantaneous load of a
  * runqueue, which basically is the number of runnable vcpus there are on the
  * runqueue (for the meaning of the other terms, look at the doc comment to
- *  __update_runq_load()).
+ *  update_runq_load()).
  *
  *  So, again, with full nanosecond granularity, and 1 second window, we have:
  *
@@ -590,14 +590,12 @@ static s_time_t c2t(struct csched2_runqueue_data *rqd, s_time_t credit, struct c
  * Runqueue related code
  */
 
-static /*inline*/ int
-__vcpu_on_runq(struct csched2_vcpu *svc)
+static inline int vcpu_on_runq(struct csched2_vcpu *svc)
 {
 return !list_empty(&svc->runq_elem);
 }
 
-static /*inline*/ struct csched2_vcpu *
-__runq_elem(struct list_head *elem)
+static inline struct csched2_vcpu * runq_elem(struct list_head *elem)
 {
 return list_entry(elem, struct csched2_vcpu, runq_elem);
 }
@@ -709,8 +707,8 @@ __runq_elem(struct list_head *elem)
  * Which, in both cases, is what we expect.
  */
 static void
-__update_runq_load(const struct scheduler *ops,
-  struct csched2_runqueue_data *rqd, int change, s_time_t now)
+update_runq_load(const struct scheduler *ops,
+ struct csched2_runqueue_data *rqd, int change, s_time_t now)
 {
 struct csched2_private *prv = csched2_priv(ops);
 s_time_t delta, load = rqd->load;
@@ -796,8 +794,8 @@ __update_runq_load(const struct scheduler *ops,
 }
 
 static void
-__update_svc_load(const struct scheduler *ops,
-  struct csched2_vcpu *svc, int change, s_time_t now)
+update_svc_load(const struct scheduler *ops,
+struct csched2_vcpu *svc, int change, s_time_t now)
 {
 struct csched2_private *prv = csched2_priv(ops);
 s_time_t delta, vcpu_load;
@@ -861,17 +859,24 @@ update_load(const struct scheduler *ops,
 {
 trace_var(TRC_CSCHED2_UPDATE_LOAD, 1, 0,  NULL);
 
-__update_runq_load(ops, rqd, change, now);
+update_runq_load(ops, rqd, change, now);
 if ( svc )
-__update_svc_load(ops, svc, change, now);
+update_svc_load(ops, svc, change, now);
 }
 
-static int
-__runq_insert(struct list_head *runq, struct csched2_vcpu *svc)
+static void
+runq_insert(const struct scheduler *ops, struct csched2_vcpu *svc)
 {
 struct list_head *iter;
+unsigned int cpu = svc->vcpu->processor;
+struct list_head * runq = &c2rqd(ops, cpu)->runq;
 int pos = 0;
 
+ASSERT(spin_is_locked(per_cpu(schedule_data, cpu).schedule_lock));
+
+ASSERT(!vcpu_on_runq(svc));
+ASSERT(c2r(ops, cpu) == c2r(ops, svc->vcpu->processor));
+
 ASSERT(&svc->rqd->runq == runq);
 ASSERT(!is_idle_vcpu(svc->vcpu));
 ASSERT(!svc->vcpu->is_running);
@@ -879,33 +884,15 @@ __runq_insert(struct list_head *runq, struct csched2_vcpu *svc)
 
 list_for_each( iter, runq )
 {
-struct csched2_vcpu * iter_svc = __runq_elem(iter);
+struct csched2_vcpu * iter_svc = runq_elem(iter);
 
 if ( svc->credit > iter_svc->credit )
 break;
 
 pos++;
 }
-
 list_add_tail(&svc->runq_elem, iter);
 
-return pos;
-}
-
-static void
-runq_insert(const struct scheduler *ops, struct csched2_vcpu *svc)
-{
-unsigned int cpu = svc->vcpu->processor;
-struct list_head * runq = &c2rqd(ops, cpu)->runq;
-int pos = 0;
-
-ASSERT(spin_is_locked(per_cpu(schedule_data, 

[Xen-devel] [PATCH v3 4/7] xen: credit2: always mark a tickled pCPU as... tickled!

2017-02-28 Thread Dario Faggioli
In fact, whether or not a pCPU has been tickled, and is
therefore about to re-schedule, is something we look at
and base decisions on in various places.

So, let's make sure that we do that basing on accurate
information.

While there, also tweak a little bit smt_idle_mask_clear()
(used for implementing SMT support), so that it only alter
the relevant cpumask when there is the actual need for this.
(This is only for reduced overhead, behavior remains the
same).

Signed-off-by: Dario Faggioli 
Reviewed-by: George Dunlap 
---
Cc: Anshul Makkar 
---
Changes from v2:
* fixed a bug I found myself in runq_tickle().
---
 xen/common/sched_credit2.c |   26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index b0ec5f8..feb0f83 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -523,12 +523,15 @@ void smt_idle_mask_set(unsigned int cpu, const cpumask_t *idlers,
 }
 
 /*
- * Clear the bits of all the siblings of cpu from mask.
+ * Clear the bits of all the siblings of cpu from mask (if necessary).
  */
 static inline
 void smt_idle_mask_clear(unsigned int cpu, cpumask_t *mask)
 {
-cpumask_andnot(mask, mask, per_cpu(cpu_sibling_mask, cpu));
+const cpumask_t *cpu_siblings = per_cpu(cpu_sibling_mask, cpu);
+
+if ( cpumask_subset(cpu_siblings, mask) )
+cpumask_andnot(mask, mask, per_cpu(cpu_sibling_mask, cpu));
 }
 
 /*
@@ -1118,6 +1121,14 @@ static inline void runq_remove(struct csched2_vcpu *svc)
 
 void burn_credits(struct csched2_runqueue_data *rqd, struct csched2_vcpu *, s_time_t);
 
+static inline void
+tickle_cpu(unsigned int cpu, struct csched2_runqueue_data *rqd)
+{
+__cpumask_set_cpu(cpu, &rqd->tickled);
+smt_idle_mask_clear(cpu, &rqd->smt_idle);
+cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
+}
+
 /*
  * Check what processor it is best to 'wake', for picking up a vcpu that has
  * just been put (back) in the runqueue. Logic is as follows:
@@ -1285,9 +1296,8 @@ runq_tickle(const struct scheduler *ops, struct csched2_vcpu *new, s_time_t now)
 sizeof(d),
 (unsigned char *)&d);
 }
-__cpumask_set_cpu(ipid, &rqd->tickled);
-smt_idle_mask_clear(ipid, &rqd->smt_idle);
-cpu_raise_softirq(ipid, SCHEDULE_SOFTIRQ);
+
+tickle_cpu(ipid, rqd);
 
 if ( unlikely(new->tickled_cpu != -1) )
 SCHED_STAT_CRANK(tickled_cpu_overwritten);
@@ -1489,7 +1499,9 @@ csched2_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
 SCHED_STAT_CRANK(vcpu_sleep);
 
 if ( curr_on_cpu(vc->processor) == vc )
-cpu_raise_softirq(vc->processor, SCHEDULE_SOFTIRQ);
+{
+tickle_cpu(vc->processor, svc->rqd);
+}
 else if ( vcpu_on_runq(svc) )
 {
 ASSERT(svc->rqd == c2rqd(ops, vc->processor));
@@ -1812,8 +1824,8 @@ static void migrate(const struct scheduler *ops,
 svc->migrate_rqd = trqd;
 __set_bit(_VPF_migrating, &svc->vcpu->pause_flags);
 __set_bit(__CSFLAG_runq_migrate_request, &svc->flags);
-cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
 SCHED_STAT_CRANK(migrate_requested);
+tickle_cpu(cpu, svc->rqd);
 }
 else
 {




[Xen-devel] [PATCH v3 5/7] xen: credit2: don't miss accounting while doing a credit reset.

2017-02-28 Thread Dario Faggioli
A credit reset basically means going through all the
vCPUs of a runqueue and altering their credits, as a
consequence of a 'scheduling epoch' having come to an
end.

Blocked or runnable vCPUs are fine, all the credits
they've spent running so far have been accounted to
them when they were scheduled out.

But if a vCPU is running on a pCPU, when a reset event
occurs (on another pCPU), that does not get properly
accounted. Let's therefore begin to do so, for better
accuracy and fairness.

In fact, after this patch, we see this in a trace:

 csched2:schedule cpu 10, rq# 1, busy, not tickled
 csched2:burn_credits d1v5, credit = 9998353, delta = 202996
 runstate_continue d1v5 running->running
 ...
 csched2:schedule cpu 12, rq# 1, busy, not tickled
 csched2:burn_credits d1v6, credit = -1327, delta = 544
 csched2:reset_credits d0v13, credit_start = 1050, credit_end = 1050, 
mult = 1
 csched2:reset_credits d0v14, credit_start = 1050, credit_end = 1050, 
mult = 1
 csched2:reset_credits d0v7, credit_start = 1050, credit_end = 1050, 
mult = 1
 csched2:burn_credits d1v5, credit = 201805, delta = 9796548
 csched2:reset_credits d1v5, credit_start = 201805, credit_end = 10201805, mult 
= 1
 csched2:burn_credits d1v6, credit = -1327, delta = 0
 csched2:reset_credits d1v6, credit_start = -1327, credit_end = 9998673, mult = 
1

Which shows how d1v5 actually executed for ~9.796 ms,
on pCPU 10, when reset_credit() is executed, on pCPU
12, because of d1v6's credits going below 0.

Without this patch, these 9.796ms are not accounted
to anyone. With this patch, d1v5 is charged for them,
and its credits drop by 9796548, down to 201805.

And this is important, as it means that it will
begin the new epoch with 10201805 credits, instead
of 1050 (which it would have had, before this patch).

Basically, we were forgetting one round of accounting
in epoch x, for the vCPUs that are running at the time
the epoch ends. And this meant favouring those same
vCPUs a little bit, in epoch x+1, giving them the
chance to execute for longer than their fair share.

Signed-off-by: Dario Faggioli 
Reviewed-by: George Dunlap 
---
Cc: Anshul Makkar 
---
 xen/common/sched_credit2.c |   14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index feb0f83..66b7f96 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1337,18 +1337,28 @@ static void reset_credit(const struct scheduler *ops, int cpu, s_time_t now,
 
 list_for_each( iter, &rqd->svc )
 {
+unsigned int svc_cpu;
 struct csched2_vcpu * svc;
 int start_credit;
 
 svc = list_entry(iter, struct csched2_vcpu, rqd_elem);
+svc_cpu = svc->vcpu->processor;
 
 ASSERT(!is_idle_vcpu(svc->vcpu));
 ASSERT(svc->rqd == rqd);
 
+/*
+ * If svc is running, it is our responsibility to make sure, here,
+ * that the credit it has spent so far get accounted.
+ */
+if ( svc->vcpu == curr_on_cpu(svc_cpu) )
+burn_credits(rqd, svc, now);
+
 start_credit = svc->credit;
 
-/* And add INIT * m, avoiding integer multiplication in the
- * common case. */
+/*
+ * Add INIT * m, avoiding integer multiplication in the common case.
+ */
 if ( likely(m==1) )
 svc->credit += CSCHED2_CREDIT_INIT;
 else




[Xen-devel] [PATCH v3 6/7] xen/tools: tracing: trace (Credit2) runq traversal.

2017-02-28 Thread Dario Faggioli
When traversing a Credit2 runqueue to select the
best candidate vCPU to be run next, show in the
trace which vCPUs we consider.

A bit verbose, but quite useful, considering that
we may end up looking at, but then discarding, one
or more vCPUs. This will help us understand which ones
are skipped and why.

Also, add how many credits the chosen vCPU has
(in the TRC_CSCHED2_RUNQ_CANDIDATE record). And,
while there, fix a bug in tools/xentrace/formats
(still in the output of TRC_CSCHED2_RUNQ_CANDIDATE).

Signed-off-by: Dario Faggioli 
Acked-by: George Dunlap 
---
 tools/xentrace/formats |3 ++-
 tools/xentrace/xenalyze.c  |   15 +--
 xen/common/sched_credit2.c |   15 +++
 3 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/tools/xentrace/formats b/tools/xentrace/formats
index db89f92..72c0b24 100644
--- a/tools/xentrace/formats
+++ b/tools/xentrace/formats
@@ -65,9 +65,10 @@
 0x00022210  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:load_check [ lrq_id[16]:orq_id[16] = 0x%(1)08x, delta = %(2)d ]
 0x00022211  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:load_balance   [ l_bavgload = 0x%(2)08x%(1)08x, o_bavgload = 0x%(4)08x%(3)08x, lrq_id[16]:orq_id[16] = 0x%(5)08x ]
 0x00022212  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:pick_cpu   [ b_avgload = 0x%(2)08x%(1)08x, dom:vcpu = 0x%(3)08x, rq_id[16]:new_cpu[16] = %(4)d ]
-0x00022213  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:runq_candidate [ dom:vcpu = 0x%(1)08x, skipped_vcpus = %(2)d tickled_cpu = %(3)d ]
+0x00022213  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:runq_candidate [ dom:vcpu = 0x%(1)08x, credit = %(4)d, skipped_vcpus = %(3)d, tickled_cpu = %(2)d ]
 0x00022214  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:schedule   [ rq:cpu = 0x%(1)08x, tasklet[8]:idle[8]:smt_idle[8]:tickled[8] = %(2)08x ]
 0x00022215  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:ratelimit  [ dom:vcpu = 0x%(1)08x, runtime = %(2)d ]
+0x00022216  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched2:runq_cand_chk  [ dom:vcpu = 0x%(1)08x ]
 
 0x00022801  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  rtds:tickle[ cpu = %(1)d ]
 0x00022802  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  rtds:runq_pick [ dom:vcpu = 0x%(1)08x, cur_deadline = 0x%(3)08x%(2)08x, cur_budget = 0x%(5)08x%(4)08x ]
diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c
index a90da20..2678e2a 100644
--- a/tools/xentrace/xenalyze.c
+++ b/tools/xentrace/xenalyze.c
@@ -7825,12 +7825,13 @@ void sched_process(struct pcpu_info *p)
 struct {
 unsigned vcpuid:16, domid:16;
 unsigned tickled_cpu, skipped;
+int credit;
 } *r = (typeof(r))ri->d;
 
-printf(" %s csched2:runq_candidate d%uv%u, "
+printf(" %s csched2:runq_candidate d%uv%u, credit = %d, "
"%u vcpus skipped, ",
ri->dump_header, r->domid, r->vcpuid,
-   r->skipped);
+   r->credit, r->skipped);
 if (r->tickled_cpu == (unsigned)-1)
 printf("no cpu was tickled\n");
 else
@@ -7864,6 +7865,16 @@ void sched_process(struct pcpu_info *p)
r->runtime / 1000, r->runtime % 1000);
 }
 break;
+case TRC_SCHED_CLASS_EVT(CSCHED2, 23): /* RUNQ_CAND_CHECK  */
+if(opt.dump_all) {
+struct {
+unsigned int vcpuid:16, domid:16;
+} *r = (typeof(r))ri->d;
+
+printf(" %s csched2:runq_cand_check d%uv%u\n",
+   ri->dump_header, r->domid, r->vcpuid);
+}
+break;
 /* RTDS (TRC_RTDS_xxx) */
 case TRC_SCHED_CLASS_EVT(RTDS, 1): /* TICKLE   */
 if(opt.dump_all) {
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 66b7f96..af457c1 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -56,6 +56,7 @@
 #define TRC_CSCHED2_RUNQ_CANDIDATE   TRC_SCHED_CLASS_EVT(CSCHED2, 20)
 #define TRC_CSCHED2_SCHEDULE TRC_SCHED_CLASS_EVT(CSCHED2, 21)
 #define TRC_CSCHED2_RATELIMITTRC_SCHED_CLASS_EVT(CSCHED2, 22)
+#define TRC_CSCHED2_RUNQ_CAND_CHECK  TRC_SCHED_CLASS_EVT(CSCHED2, 23)
 
 /*
  * WARNING: This is still in an experimental phase.  Status and work can be found at the
@@ -2494,6 +2495,18 @@ runq_candidate(struct csched2_runqueue_data *rqd,
 {
 struct csched2_vcpu * svc = list_entry(iter, struct csched2_vcpu, runq_elem);
 
+if ( unlikely(tb_init_done) )
+{
+struct {
+unsigned vcpu:16, dom:16;
+} d;
+d.dom = svc->vcpu->domain->domain_id;
+d.vcpu = svc->vcpu->vcpu_id;
+__trace_var(TRC_CSCHED2_RUNQ_CAND_CHECK, 1,
+sizeof(d),
+(unsigned char *)&d);
+}
+
 /* Only consi

[Xen-devel] [PATCH v3 7/7] xen/tools: tracing: Report next slice time when continuing as well as switching

2017-02-28 Thread Dario Faggioli
We record trace information about the next timeslice when
switching to a different vcpu, but not when continuing to
run the same vcpu:

 csched2:schedule cpu 9, rq# 1, idle, SMT idle, tickled
 csched2:runq_candidate d0v3, 0 vcpus skipped, cpu 9 was tickled
 sched_switch prev d32767v9, run for 991.186us
 sched_switch next d0v3, was runnable for 2.515us, next slice 1.0us
 sched_switch prev d32767v9 next d0v3  
 runstate_change d32767v9 running->runnable
 ...
 csched2:schedule cpu 2, rq# 0, busy, not tickled
 csched2:burn_credits d1v5, credit = 9996950, delta = 502913
 csched2:runq_candidate d1v5, 0 vcpus skipped, no cpu was tickled
 runstate_continue d1v5 running->running
 ?

This information is quite useful; so add a trace including
that information on the 'continue_running' path as well,
like this:

 csched2:schedule cpu 1, rq# 0, busy, not tickled
 csched2:burn_credits d0v8, credit = 9998645, delta = 12104
 csched2:runq_candidate d0v8, credit = 9998645, 0 vcpus skipped, no cpu was 
tickled
 sched_switch continue d0v8, run for 1125.820us, next slice 9998.645us
 runstate_continue d0v8 running->running ^

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
---
Changes from v2:
* reworded the changelog, as suggested during review.
---
 tools/xentrace/formats |1 +
 tools/xentrace/xenalyze.c  |   17 +
 xen/common/schedule.c  |4 
 xen/include/public/trace.h |1 +
 4 files changed, 23 insertions(+)

diff --git a/tools/xentrace/formats b/tools/xentrace/formats
index 72c0b24..a055231 100644
--- a/tools/xentrace/formats
+++ b/tools/xentrace/formats
@@ -35,6 +35,7 @@
 0x0002800e  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  switch_infprev[ dom:vcpu = 0x%(1)04x%(2)04x, runtime = %(3)d ]
 0x0002800f  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  switch_infnext[ new_dom:vcpu = 0x%(1)04x%(2)04x, time = %(3)d, r_time = %(4)d ]
 0x00028010  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  domain_shutdown_code [ dom:vcpu = 0x%(1)04x%(2)04x, reason = 0x%(3)08x ]
+0x00028011  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  switch_infcont[ dom:vcpu = 0x%(1)04x%(2)04x, runtime = %(3)d, r_time = %(4)d ]
 
 0x00022001  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched:sched_tasklet
 0x00022002  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  csched:account_start [ dom:vcpu = 0x%(1)04x%(2)04x, active = %(3)d ]
diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c
index 2678e2a..68ffcc2 100644
--- a/tools/xentrace/xenalyze.c
+++ b/tools/xentrace/xenalyze.c
@@ -7528,6 +7528,23 @@ void sched_process(struct pcpu_info *p)
 printf("\n");
 }
 break;
+case TRC_SCHED_SWITCH_INFCONT:
+if(opt.dump_all)
+{
+struct {
+unsigned int domid, vcpuid, rsince;
+int slice;
+} *r = (typeof(r))ri->d;
+
+printf(" %s sched_switch continue d%uv%u, run for %u.%uus",
+   ri->dump_header, r->domid, r->vcpuid,
+   r->rsince / 1000, r->rsince % 1000);
+if ( r->slice > 0 )
+printf(", next slice %u.%uus", r->slice / 1000,
+   r->slice % 1000);
+printf("\n");
+}
+break;
 case TRC_SCHED_CTL:
 case TRC_SCHED_S_TIMER_FN:
 case TRC_SCHED_T_TIMER_FN:
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 36bbd94..223a120 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -1397,6 +1397,10 @@ static void schedule(void)
 if ( unlikely(prev == next) )
 {
 pcpu_schedule_unlock_irq(lock, cpu);
+TRACE_4D(TRC_SCHED_SWITCH_INFCONT,
+ next->domain->domain_id, next->vcpu_id,
+ now - prev->runstate.state_entry_time,
+ next_slice.time);
 trace_continue_running(next);
 return continue_running(prev);
 }
diff --git a/xen/include/public/trace.h b/xen/include/public/trace.h
index 5ef9c37..7f2e891 100644
--- a/xen/include/public/trace.h
+++ b/xen/include/public/trace.h
@@ -115,6 +115,7 @@
 #define TRC_SCHED_SWITCH_INFPREV (TRC_SCHED_VERBOSE + 14)
 #define TRC_SCHED_SWITCH_INFNEXT (TRC_SCHED_VERBOSE + 15)
 #define TRC_SCHED_SHUTDOWN_CODE  (TRC_SCHED_VERBOSE + 16)
+#define TRC_SCHED_SWITCH_INFCONT (TRC_SCHED_VERBOSE + 17)
 
 #define TRC_DOM0_DOM_ADD (TRC_DOM0_DOMOPS + 1)
 #define TRC_DOM0_DOM_REM (TRC_DOM0_DOMOPS + 2)




Re: [Xen-devel] [xen-unstable bisection] complete test-amd64-i386-freebsd10-amd64

2017-02-28 Thread Andrew Cooper
On 28/02/17 11:37, Jan Beulich wrote:
 On 28.02.17 at 10:20,  wrote:
>> It seems that your changes are causing issues when booting a 32bit Dom0, it
>> seems there are several IOMMU faults that prevent Dom0 from booting at all.
>> AFAICT this only happens when using a 32bit Dom0. The bisector has pointed
>> several times at this change, so it looks like the culprit.
>>
>> See for example:
>>
>> http://logs.test-lab.xenproject.org/osstest/logs/106186/ 
>>
>> This is the serial log of the box failing to boot:
>>
>> http://logs.test-lab.xenproject.org/osstest/logs/106186/test-amd64-i386-migrupgrade/serial-chardonnay0.log.0
>>
>> Search for "[VT-D]DMAR:[DMA Read] Request device [:01:00.0] fault addr 7cd3f000, iommu reg = 82c00021b000" to get to the first IOMMU fault.
> And I think this is due to xen_in_range() (used by
> vtd_set_hwdom_mapping()) not having got adjusted to cover the
> freed part of .bss. Oddly enough the AMD equivalent _still_ does
> not use it, but instead blindly maps everything.
>
> Along those lines I would then also expect tboot to be broken, due
> to its treatment of the [__bss_start,__bss_end) range.
>
> Andrew, it looks like your b4cd59fea0 ("x86: reorder .data and
> .init when linking") did also introduce breakage here, as
> [_stext,__init_begin) no longer covers .data.

Hmm.  So it seems.  Let me put together a patch.

~Andrew



Re: [Xen-devel] [PATCH v8 03/24] x86: refactor psr: implement main data structures.

2017-02-28 Thread Roger Pau Monné
On Wed, Feb 15, 2017 at 04:49:18PM +0800, Yi Sun wrote:
> To construct an extensible framework, we need to analyze the PSR
> features, abstract the common and the feature-specific parts, and
> encapsulate them into different data structures.
> 
> By analyzing the PSR features, we can get the map below.
>              +------+------+------+
>        ----->| Dom0 | Dom1 | ...  |
>        |     +------+------+------+
>        |        |
>        | Dom ID | cos_id of domain
>        |        V
>        |  +-------------------------------------------------------------
> User --+->| PSR
>           |    +--------------+---------------+---+
> Socket ID ---->| Socket0 Info | Socket 1 Info |...|
>           |    +--------------+---------------+---+
>           |        |   cos_id=0    cos_id=1   ...
>           |        |              +-----------+-----------+-------+
>           |        |->Ref    :    | ref 0     | ref 1     | ...   |
>           |        |              +-----------+-----------+-------+
>           |        |->L3 CAT :    | cos 0     | cos 1     | ...   |
>           |        |              +-----------+-----------+-------+
>           |        |->L2 CAT :    | cos 0     | cos 1     | ...   |
>           |        |              +-----------+-----------+-------+
>           |        |              +-----------+-----------+-----------+-----------+-------+
>           |        |->CDP    :    | cos0 code | cos0 data | cos1 code | cos1 data | ...   |
>           |                       +-----------+-----------+-----------+-----------+-------+
>           +-------------------------------------------------------------
> 
> So, we need to define a socket info data structure, 'struct
> psr_socket_info', to manage per-socket information. It contains a
> reference count array indexed by COS ID and a list of all enabled
> features. Every entry of the reference count array records how many
> domains are using the COS registers of that COS ID. For example, when
> L3 CAT and L2 CAT are enabled and Dom1 uses the COS_ID=1 registers of
> both features to save CBM values, we get the table below.
>         +-------+-------+-------+-----+
>         | COS 0 | COS 1 | COS 2 | ... |
>         +-------+-------+-------+-----+
> L3 CAT  | 0x7ff | 0x1ff | ...   | ... |
>         +-------+-------+-------+-----+
> L2 CAT  | 0xff  | 0xff  | ...   | ... |
>         +-------+-------+-------+-----+
> 
> If Dom2 has the same CBM values, it can reuse those COS_ID=1
> registers. That means both Dom1 and Dom2 use the same COS registers
> (ID=1) to save the same L3/L2 values. So, the value of ref[1] is 2,
> which means 2 domains are using COS_ID 1.
> 
> To manage a feature, we need to define a feature node data structure,
> 'struct feat_node', holding the feature's specific HW info, its
> callback functions (all of a feature's specific behaviors are
> encapsulated into these callbacks), and an array of all of this
> feature's COS register values.
> 
> CDP is a special feature which uses two entries of the array for one
> COS ID; so the number of CDP COS registers is half that of L3 CAT.
> E.g. if L3 CAT has 16 COS registers, then CDP has 8 COS registers when
> it is enabled. CDP uses the COS registers array as below.
> 
>                          +-----------+-----------+-----------+-----------+-----+
> CDP cos_reg_val[] index: |     0     |     1     |     2     |     3     | ... |
>                          +-----------+-----------+-----------+-----------+-----+
>                   value: | cos0 code | cos0 data | cos1 code | cos1 data | ... |
>                          +-----------+-----------+-----------+-----------+-----+
> 
> For more details, please refer to the SDM and to the patches
> implementing 'get value' and 'set value'.

I would recommend that you merge this with a patch that actually makes use of
these structures, or else it's hard to review their usage IMHO.

> Signed-off-by: Yi Sun 
> Reviewed-by: Konrad Rzeszutek Wilk 
> ---
>  xen/arch/x86/psr.c | 108 -
>  1 file changed, 107 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 96a8589..5acd9ca 100644
> -

Re: [Xen-devel] [PATCH v2 08/10] x86/SVM: Add interrupt management code via AVIC

2017-02-28 Thread George Dunlap
On 02/01/17 17:45, Andrew Cooper wrote:
> On 31/12/2016 05:45, Suravee Suthikulpanit wrote:
>> Enabling AVIC implicitly disables the V_IRQ, V_INTR_PRIO, V_IGN_TPR,
>> and V_INTR_VECTOR fields in the VMCB Control Word. Therefore, this patch
>> introduces new interrupt injection code via AVIC backing page.
>>
>> Signed-off-by: Suravee Suthikulpanit 
>> Cc: Konrad Rzeszutek Wilk 
>> Cc: Jan Beulich 
>> Cc: Boris Ostrovsky 
>> ---
>>  xen/arch/x86/hvm/svm/avic.c| 28 
>>  xen/arch/x86/hvm/svm/intr.c|  4 
>>  xen/arch/x86/hvm/svm/svm.c | 12 ++--
>>  xen/include/asm-x86/hvm/svm/avic.h |  1 +
>>  4 files changed, 43 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/svm/avic.c b/xen/arch/x86/hvm/svm/avic.c
>> index 6351c8e..faa5e45 100644
>> --- a/xen/arch/x86/hvm/svm/avic.c
>> +++ b/xen/arch/x86/hvm/svm/avic.c
>> @@ -636,6 +636,34 @@ void svm_avic_vmexit_do_noaccel(struct cpu_user_regs *regs)
>>  return;
>>  }
>>  
>> +void svm_avic_deliver_posted_intr(struct vcpu *v, u8 vec)
>> +{
>> +struct vlapic *vlapic = vcpu_vlapic(v);
>> +
>> +/* Fallback to use non-AVIC if vcpu is not enabled with AVIC. */
>> +if ( !svm_avic_vcpu_enabled(v) )
>> +{
>> +if ( !vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
>> +vcpu_kick(v);
>> +return;
>> +}
>> +
>> +if ( !(guest_cpu_user_regs()->eflags & X86_EFLAGS_IF) )
>> +return;
> 
> Won't this discard the interrupt?
> 
>> +
>> +if ( vlapic_test_and_set_vector(vec, &vlapic->regs->data[APIC_IRR]) )
>> +return;
>> +
>> +/*
>> + * If vcpu is running on another cpu, hit the doorbell to signal
>> + * it to process interrupt. Otherwise, kick it.
>> + */
>> +if ( v->is_running && (v != current) )
>> +wrmsrl(AVIC_DOORBELL, cpu_data[v->processor].apicid);
> 
> Hmm - my gut feeling is that this is racy without holding the scheduler
> lock for the target pcpu.  Nothing (I am aware of) excludes ->is_running
> and ->processor changing behind our back.
> 
> CC'ing George and Dario for their input.

I'm not sure how AVIC_DOORBELL works (haven't looked at the whole
series) -- the vcpu_kick() path accesses both is_running and
v->processor without locks; but that's because any schedule event which
may change those values will also check to see whether there is a
pending event to be delivered.  In theory the same could apply to this
mechanism, but it would take some careful thinking (in particular,
understanding the "NB's" in vcpu_kick() to see if and how they apply).

 -George



Re: [Xen-devel] [PATCH] efi/boot: Avoid memory corruption when freeing ebmalloc area

2017-02-28 Thread Jan Beulich
>>> On 28.02.17 at 12:44,  wrote:
> --- a/xen/common/efi/boot.c
> +++ b/xen/common/efi/boot.c
> @@ -148,7 +148,7 @@ static void __init __maybe_unused free_ebmalloc_unused_mem(void)
>  {
>  unsigned long start, end;
>  
> -start = (unsigned long)ebmalloc_mem + PAGE_ALIGN(ebmalloc_allocated);
> +start = (unsigned long)ebmalloc_mem + ROUNDUP(ebmalloc_allocated, PAGE_SIZE);

With

#define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)

I don't see any behavioral change with your adjustment.

Jan




Re: [Xen-devel] [PATCH 1/4] interface: avoid redefinition of __XEN_INTERFACE_VERSION__

2017-02-28 Thread Juergen Gross
On 28/02/17 12:11, Jan Beulich wrote:
 On 28.02.17 at 11:34,  wrote:
>> In stubdom environment __XEN_INTERFACE_VERSION__ is sometimes defined
>> on the command line of the build instruction. This conflicts with
>> xen-compat.h defining it unconditionally if __XEN__ or __XEN_TOOLS__
>> is set.
> 
> Then that's what wants fixing. In fact it's questionable whether
> __XEN_TOOLS__ (or even __XEN__) getting defined there is
> appropriate.

There are multiple libraries from the tools directory being compiled
for stubdoms.

>> Just use #undef in this case to avoid the resulting warning.
> 
> I think the lack of a warning in case of a collision is worse here.
> People should simply not define both the version symbol and
> either of __XEN__ or __XEN_TOOLS__.

Would you be okay with:

#if defined(__XEN_INTERFACE_VERSION__)
  #if __XEN_INTERFACE_VERSION__ != __XEN_LATEST_INTERFACE_VERSION__
#error ...
  #endif
#else
  #define __XEN_INTERFACE_VERSION__ __XEN_LATEST_INTERFACE_VERSION__
#endif


Juergen



[Xen-devel] [ovmf test] 106234: regressions - FAIL

2017-02-28 Thread osstest service owner
flight 106234 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106234/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 105963
 test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 105963

version targeted for testing:
 ovmf 31abcf1dc773c7663363d599b8e34bfe58b7a0e5
baseline version:
 ovmf e0307a7dad02aa8c0cd8b3b0b9edce8ddb3fef2e

Last test of basis   105963  2017-02-21 21:43:31 Z6 days
Failing since105980  2017-02-22 10:03:53 Z6 days   16 attempts
Testing same since   106234  2017-02-28 08:14:59 Z0 days1 attempts


People who touched revisions under test:
  Anthony PERARD 
  Ard Biesheuvel 
  Chen A Chen 
  Fu Siyuan 
  Hao Wu 
  Hegde Nagaraj P 
  Jeff Fan 
  Jiaxin Wu 
  Jiewen Yao 
  Laszlo Ersek 
  Paolo Bonzini 
  Qin Long 
  Ruiyu Ni 
  Star Zeng 
  Wu Jiaxin 
  Yonghong Zhu 
  Zhang Lubo 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 1355 lines long.)



[Xen-devel] [distros-debian-snapshot test] 68619: regressions - trouble: blocked/broken/fail/pass

2017-02-28 Thread Platform Team regression test user
flight 68619 distros-debian-snapshot real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/68619/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-i386-daily-netboot-pvgrub 9 debian-di-install fail REGR. vs. 68586
 test-amd64-amd64-i386-daily-netboot-pygrub 9 debian-di-install fail REGR. vs. 68586
 test-amd64-amd64-amd64-daily-netboot-pvgrub 9 debian-di-install fail REGR. vs. 68586

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-armhf-daily-netboot-pygrub 9 debian-di-install fail like 68586
 test-amd64-i386-amd64-daily-netboot-pygrub 9 debian-di-install fail like 68586
 test-amd64-i386-i386-weekly-netinst-pygrub 9 debian-di-install fail like 68586
 test-amd64-i386-amd64-weekly-netinst-pygrub 9 debian-di-install fail like 68586
 test-amd64-amd64-i386-weekly-netinst-pygrub 9 debian-di-install fail like 68586
 test-amd64-amd64-amd64-weekly-netinst-pygrub 9 debian-di-install fail like 68586

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-armhf-daily-netboot-pygrub  1 build-check(1)  blocked n/a
 build-arm64-pvops 2 hosts-allocate   broken never pass
 build-arm64   2 hosts-allocate   broken never pass
 build-arm64-pvops 3 capture-logs broken never pass
 build-arm64   3 capture-logs broken never pass

baseline version:
 flight   68586

jobs:
 build-amd64  pass
 build-arm64  broken  
 build-armhf  pass
 build-i386   pass
 build-amd64-pvopspass
 build-arm64-pvopsbroken  
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-amd64-daily-netboot-pvgrub  fail
 test-amd64-i386-i386-daily-netboot-pvgrubfail
 test-amd64-i386-amd64-daily-netboot-pygrub   fail
 test-arm64-arm64-armhf-daily-netboot-pygrub  blocked 
 test-armhf-armhf-armhf-daily-netboot-pygrub  fail
 test-amd64-amd64-i386-daily-netboot-pygrub   fail
 test-amd64-amd64-amd64-current-netinst-pygrubpass
 test-amd64-i386-amd64-current-netinst-pygrub pass
 test-amd64-amd64-i386-current-netinst-pygrub pass
 test-amd64-i386-i386-current-netinst-pygrub  pass
 test-amd64-amd64-amd64-weekly-netinst-pygrub fail
 test-amd64-i386-amd64-weekly-netinst-pygrub  fail
 test-amd64-amd64-i386-weekly-netinst-pygrub  fail
 test-amd64-i386-i386-weekly-netinst-pygrub   fail



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.




Re: [Xen-devel] [PATCH 2/4] tools: add pkg-config file for libxc

2017-02-28 Thread Juergen Gross
On 28/02/17 12:13, Ian Jackson wrote:
> Juergen Gross writes ("[PATCH 2/4] tools: add pkg-config file for libxc"):
>> When configuring the build of qemu the configure script is building
>> various test programs to determine the exact version of libxencontrol.
>>
>> Instead of a try and error approach needing updates for nearly each
>> new version of Xen just provide xencontrol.pc to be used via
>> pkg-config.
>>
>> In the end we need two different variants of that file: one for the
>> target system where eventually someone wants to build qemu, and one
>> for the local system to be used for building qemu as part of the Xen
>> build process.
> 
> I've not seen this done elsewhere, but I can see why it's attractive.

Thanks.

> I worry though that we're breaking new ground.  Did you come up with
> this idea yourself ?

Yes.

> Are you aware of other projects that do something similar ?

No.

TBH: did you have a look at qemu's configure script and how Xen support
is handled there? I'm really astonished they accept such hackery.

And I don't see why it would be wrong to use the same basic mechanism
for our internal build as for any "normal" build outside of the Xen
build environment.

The next step would be to add the other Xen libraries and their
dependencies in order to get rid of all the additional include and
library directories when calling configure for qemu.
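For readers not familiar with the mechanism: a xencontrol.pc along these lines would let qemu's configure replace its trial-compile version probes with a single pkg-config query. All paths and the version number below are purely illustrative:

```
# Hypothetical xencontrol.pc -- all values below are illustrative
prefix=/usr/local
exec_prefix=${prefix}
libdir=${exec_prefix}/lib
includedir=${prefix}/include

Name: xencontrol
Description: Xen xc interface library
Version: 4.9.0
Cflags: -I${includedir}
Libs: -L${libdir} -lxenctrl
```

qemu's configure could then run e.g. `pkg-config --modversion xencontrol` and `pkg-config --cflags --libs xencontrol` instead of compiling test programs.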


Juergen




Re: [Xen-devel] [PATCH v8 06/24] x86: refactor psr: implement get hw info flow.

2017-02-28 Thread Roger Pau Monné
On Wed, Feb 15, 2017 at 04:49:21PM +0800, Yi Sun wrote:
> This patch implements get HW info flow including L3 CAT callback
> function.
> 
> It also changes sysctl interface to make it more general.
> 
> With this patch, 'psr-hwinfo' can work for L3 CAT.
> 
> Signed-off-by: Yi Sun 
> Reviewed-by: Konrad Rzeszutek Wilk 
> ---
>  xen/arch/x86/psr.c| 75 +--
>  xen/arch/x86/sysctl.c | 14 +
>  xen/include/asm-x86/psr.h | 19 +++-
>  3 files changed, 93 insertions(+), 15 deletions(-)
> 
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 798c614..8af59d9 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -84,6 +84,7 @@ enum psr_feat_type {
>  PSR_SOCKET_L3_CAT = 0,
>  PSR_SOCKET_L3_CDP,
>  PSR_SOCKET_L2_CAT,
> +PSR_SOCKET_UNKNOWN = 0x,
>  };
>  
>  /* CAT/CDP HW info data structure. */
> @@ -112,6 +113,9 @@ struct feat_node;
>  struct feat_ops {
>  /* get_cos_max is used to get feature's cos_max. */
>  unsigned int (*get_cos_max)(const struct feat_node *feat);
> +/* get_feat_info is used to get feature HW info. */
> +bool (*get_feat_info)(const struct feat_node *feat,
> +  uint32_t data[], unsigned int array_len);
>  };
>  
>  /*
> @@ -182,6 +186,24 @@ static void free_feature(struct psr_socket_info *info)
>  }
>  }
>  
> +static enum psr_feat_type psr_cbm_type_to_feat_type(enum cbm_type type)
> +{
> +enum psr_feat_type feat_type;
> +
> +/* Judge if feature is enabled. */
> +switch ( type )
> +{
> +case PSR_CBM_TYPE_L3:
> +feat_type = PSR_SOCKET_L3_CAT;
> +break;
> +default:
> +feat_type = PSR_SOCKET_UNKNOWN;
> +break;
> +}
> +
> +return feat_type;
> +}
> +
>  /* L3 CAT functions implementation. */
>  static void l3_cat_init_feature(struct cpuid_leaf regs,
>  struct feat_node *feat,
> @@ -225,8 +247,22 @@ static unsigned int l3_cat_get_cos_max(const struct feat_node *feat)
>  return feat->info.l3_cat_info.cos_max;
>  }
>  
> +static bool l3_cat_get_feat_info(const struct feat_node *feat,
> + uint32_t data[], unsigned int array_len)
> +{
> +if ( !data || 3 > array_len )

Shouldn't this be array_len != 3? And I would rather prefer this to be set in a
define: it's used here and in XEN_SYSCTL_PSR_CAT_get_l3_info. Maybe you
should add its define below the PSR_FLAG define?

#define CAT_L3_FEAT_SIZE 3

Or some better worded name.

[...]
>  int psr_get_l3_cbm(struct domain *d, unsigned int socket,
> diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
> index b8c30d4..e340baa 100644
> --- a/xen/arch/x86/sysctl.c
> +++ b/xen/arch/x86/sysctl.c
> @@ -176,15 +176,19 @@ long arch_do_sysctl(
>  switch ( sysctl->u.psr_cat_op.cmd )
>  {
>  case XEN_SYSCTL_PSR_CAT_get_l3_info:
> -ret = psr_get_cat_l3_info(sysctl->u.psr_cat_op.target,
> -  &sysctl->u.psr_cat_op.u.l3_info.cbm_len,
> -  &sysctl->u.psr_cat_op.u.l3_info.cos_max,
> -  &sysctl->u.psr_cat_op.u.l3_info.flags);
> +{
> +uint32_t data[3];

Space between variable declaration and code.

> +ret = psr_get_info(sysctl->u.psr_cat_op.target,
> +   PSR_CBM_TYPE_L3, data, 3);

Last parameter should be ARRAY_SIZE(data) IMHO.

Roger.



Re: [Xen-devel] [PATCH 3/4] tools: use a dedicated build directory for qemu

2017-02-28 Thread Juergen Gross
On 28/02/17 12:13, Ian Jackson wrote:
> Juergen Gross writes ("[PATCH 3/4] tools: use a dedicated build directory for 
> qemu"):
>> Instead of using the downloaded git tree as target directory for the
>> qemu build create a dedicated directory for that purpose.
> 
> ... why ?

Hmm, yes, I should have added the reason to the commit description:

This way I can use the same source directory of qemu for my effort to
configure and build qemu upstream in a stubdom environment.


Juergen




[Xen-devel] Retrieving the precise guest start time via libxc or similar?

2017-02-28 Thread Razvan Cojocaru
Hello,

xc_domain_getinfo() gets us info->cpu_time, which, if I understand
correctly, is the guest uptime.

Is there an officially sanctioned way of retrieving the _exact_ guest
start time? Obviously we can compute an approximation by doing now -
info->cpu_time, but we'd like to be able to uniquely identify an OS boot
-> shutdown cycle, and the difference method is not guaranteed to always
yield the same result for the same running OS.


Thanks,
Razvan



Re: [Xen-devel] [PATCH 1/4] interface: avoid redefinition of __XEN_INTERFACE_VERSION__

2017-02-28 Thread Juergen Gross
On 28/02/17 12:10, Ian Jackson wrote:
> Juergen Gross writes ("[PATCH 1/4] interface: avoid redefinition of 
> __XEN_INTERFACE_VERSION__"):
>> In stubdom environment __XEN_INTERFACE_VERSION__ is sometimes defined
>> on the command line of the build instruction. This conflicts with
>> xen-compat.h defining it unconditionally if __XEN__ or __XEN_TOOLS__
>> is set.
> 
> Aren't the two definitions the same ?  If so, surely there should be
> no warning ?  If they are not the same, then surely we need to think
> about which one is right ?

In the end they are the same.

I don't know why the warning is issued. Maybe because one value was
specified via the command line.

> Sorry if the answers to this are obvious.

I think they aren't.


Juergen



[Xen-devel] [PATCH v4 00/17] x86emul: MMX/SSEn support

2017-02-28 Thread Jan Beulich
This includes support for their AVX counterparts as well as a few
later SSE additions (basically covering the entire 0f-prefixed opcode
space, but not the 0f38 and 0f3a ones, nor 3dnow).

 1: support most memory accessing MMX/SSE{,2,3} insns
 2: support MMX/SSE{,2,3} moves
 3: support MMX/SSE/SSE2 converts
 4: support {,V}{,U}COMIS{S,D}
 5: support MMX/SSE{,2,4a} insns with only register operands
 6: support {,V}{LD,ST}MXCSR
 7: support {,V}MOVNTDQA
 8: test coverage for SSE/SSE2 insns
 9: honor MMXEXT feature flag
10: add tables for 0f38 and 0f3a extension space
11: support SSSE3 insns
12: support SSE4.1 insns
13: support SSE4.2 insns
14: test coverage for SSE3/SSSE3/SSE4* insns

Partly RFC from here on, as testing code is still mostly missing,
although I'm unsure whether it makes sense to cover each and every
individual instruction.

15: support PCLMULQDQ
16: support AESNI insns
17: support SHA insns

Signed-off-by: Jan Beulich 
---
v4: New patch 14. Fixes to other patches see there.






Re: [Xen-devel] [PATCH v8 07/24] x86: refactor psr: implement get value flow.

2017-02-28 Thread Roger Pau Monné
On Wed, Feb 15, 2017 at 04:49:22PM +0800, Yi Sun wrote:
> This patch implements get value flow including L3 CAT callback
> function.
> 
> It also changes domctl interface to make it more general.
> 
> With this patch, 'psr-cat-show' can work for L3 CAT but not for
> L3 code/data which is implemented in patch "x86: refactor psr:
> implement get value flow for CDP.".
> 
> Signed-off-by: Yi Sun 
> Reviewed-by: Konrad Rzeszutek Wilk 
> ---
>  xen/arch/x86/domctl.c | 18 +-
>  xen/arch/x86/psr.c| 43 ++-
>  xen/include/asm-x86/psr.h |  4 ++--
>  3 files changed, 49 insertions(+), 16 deletions(-)
[...]
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 8af59d9..c1afd36 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -116,6 +116,9 @@ struct feat_ops {
>  /* get_feat_info is used to get feature HW info. */
>  bool (*get_feat_info)(const struct feat_node *feat,
>uint32_t data[], unsigned int array_len);
> +/* get_val is used to get feature COS register value. */
> +bool (*get_val)(const struct feat_node *feat, unsigned int cos,
> +enum cbm_type type, uint64_t *val);
>  };
>  
>  /*
> @@ -260,9 +263,22 @@ static bool l3_cat_get_feat_info(const struct feat_node 
> *feat,
>  return true;
>  }
>  
> +static bool l3_cat_get_val(const struct feat_node *feat, unsigned int cos,
> +   enum cbm_type type, uint64_t *val)
> +{
> +if ( cos > feat->info.l3_cat_info.cos_max )
> +/* Use default value. */
> +cos = 0;

I don't know much, but shouldn't this return false instead of silently
defaulting to 0? This doesn't seem to be what the caller expects.

> +
> +*val = feat->cos_reg_val[cos];
> +
> +return true;
> +}
> +
>  static const struct feat_ops l3_cat_ops = {
>  .get_cos_max = l3_cat_get_cos_max,
>  .get_feat_info = l3_cat_get_feat_info,
> +.get_val = l3_cat_get_val,
>  };
>  
>  static void __init parse_psr_bool(char *s, char *value, char *feature,
> @@ -482,12 +498,14 @@ static struct psr_socket_info *get_socket_info(unsigned 
> int socket)
>  return socket_info + socket;
>  }
>  
> -int psr_get_info(unsigned int socket, enum cbm_type type,
> - uint32_t data[], unsigned int array_len)
> +static int psr_get(unsigned int socket, enum cbm_type type,
> +   uint32_t data[], unsigned int array_len,
> +   struct domain *d, uint64_t *val)
>  {
>  const struct psr_socket_info *info = get_socket_info(socket);
>  const struct feat_node *feat;
>  enum psr_feat_type feat_type;
> +unsigned int cos;
>  
>  if ( IS_ERR(info) )
>  return PTR_ERR(info);
> @@ -498,6 +516,15 @@ int psr_get_info(unsigned int socket, enum cbm_type type,
>  if ( feat->feature != feat_type )
>  continue;
>  
> +if ( d )
> +{
> +cos = d->arch.psr_cos_ids[socket];
> +if ( feat->ops.get_val(feat, cos, type, val) )
> +return 0;
> +else

No need for the "else" branch here.

Roger.



Re: [Xen-devel] [PATCH 1/4] interface: avoid redefinition of __XEN_INTERFACE_VERSION__

2017-02-28 Thread Jan Beulich
>>> On 28.02.17 at 13:24,  wrote:
> On 28/02/17 12:11, Jan Beulich wrote:
> On 28.02.17 at 11:34,  wrote:
>>> In stubdom environment __XEN_INTERFACE_VERSION__ is sometimes defined
>>> on the command line of the build instruction. This conflicts with
>>> xen-compat.h defining it unconditionally if __XEN__ or __XEN_TOOLS__
>>> is set.
>> 
>> Then that's what wants fixing. In fact it's questionable whether
>> __XEN_TOOLS__ (or even __XEN__) getting defined there is
>> appropriate.
> 
> There are multiple libraries from the tools directory being compiled
> for stubdoms.

Each of which should get appropriate settings.

>>> Just use #undef in this case to avoid the resulting warning.
>> 
>> I think the lack of a warning in case of a collision is worse here.
>> People should simply not define both the version symbol and
>> either of __XEN__ or __XEN_TOOLS__.
> 
> Would you be okay with:
> 
> #if defined(__XEN_INTERFACE_VERSION__)
>   #if __XEN_INTERFACE_VERSION__ != __XEN_LATEST_INTERFACE_VERSION__
> #error ...
>   #endif
> #else
>   #define __XEN_INTERFACE_VERSION__ __XEN_LATEST_INTERFACE_VERSION__
> #endif

Well - see Ian's reply. If the values match (granted textually rather
than by value), there should be no compiler warning in the first
place.

Jan




[Xen-devel] [PATCH v4 02/17] x86emul: support MMX/SSE{,2,3} moves

2017-02-28 Thread Jan Beulich
Previously supported insns are being converted to the new model, and
several new ones are being added.

To keep the stub handling reasonably simple, integrate SET_SSE_PREFIX()
into copy_REX_VEX(), at once switching the stubs to use an empty REX
prefix instead of a double DS: one (no byte registers are being
accessed, so an empty REX prefix has no effect), except (of course) for
the 32-bit test harness build.

Signed-off-by: Jan Beulich 
---
v4: Re-base.
v3: Re-base. Introduce more labels to reduce redundant code.
v2: Don't clear TwoOp for vmov{l,h}p{s,d} to memory. Move re-setting of
TwoOp into VEX-specific code paths where possible. Special case
{,v}maskmov{q,dqu} in stub invocation. Move {,v}movq code block to
proper position. Add zero-mask {,v}maskmov{q,dqu} tests.

--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -1566,6 +1566,29 @@ int main(int argc, char **argv)
 else
 printf("skipped\n");
 
+printf("%-40s", "Testing movq 32(%ecx),%xmm1...");
+if ( stack_exec && cpu_has_sse2 )
+{
+decl_insn(movq_from_mem2);
+
+asm volatile ( "pcmpeqb %%xmm1, %%xmm1\n"
+   put_insn(movq_from_mem2, "movq 32(%0), %%xmm1")
+   :: "c" (NULL) );
+
+set_insn(movq_from_mem2);
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(movq_from_mem2) )
+goto fail;
+asm ( "pcmpgtb %%xmm0, %%xmm0\n\t"
+  "pcmpeqb %%xmm1, %%xmm0\n\t"
+  "pmovmskb %%xmm0, %0" : "=r" (rc) );
+if ( rc != 0x )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
 printf("%-40s", "Testing vmovq %xmm1,32(%edx)...");
 if ( stack_exec && cpu_has_avx )
 {
@@ -1590,6 +1613,29 @@ int main(int argc, char **argv)
 else
 printf("skipped\n");
 
+printf("%-40s", "Testing vmovq 32(%edx),%xmm0...");
+if ( stack_exec && cpu_has_avx )
+{
+decl_insn(vmovq_from_mem);
+
+asm volatile ( "pcmpeqb %%xmm0, %%xmm0\n"
+   put_insn(vmovq_from_mem, "vmovq 32(%0), %%xmm0")
+   :: "d" (NULL) );
+
+set_insn(vmovq_from_mem);
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(vmovq_from_mem) )
+goto fail;
+asm ( "pcmpgtb %%xmm1, %%xmm1\n\t"
+  "pcmpeqb %%xmm0, %%xmm1\n\t"
+  "pmovmskb %%xmm1, %0" : "=r" (rc) );
+if ( rc != 0x )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
 printf("%-40s", "Testing movdqu %xmm2,(%ecx)...");
 if ( stack_exec && cpu_has_sse2 )
 {
@@ -1821,6 +1867,33 @@ int main(int argc, char **argv)
 else
 printf("skipped\n");
 
+printf("%-40s", "Testing movd 32(%ecx),%mm4...");
+if ( stack_exec && cpu_has_mmx )
+{
+decl_insn(movd_from_mem);
+
+asm volatile ( "pcmpgtb %%mm4, %%mm4\n"
+   put_insn(movd_from_mem, "movd 32(%0), %%mm4")
+   :: "c" (NULL) );
+
+set_insn(movd_from_mem);
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(movd_from_mem) )
+goto fail;
+asm ( "pxor %%mm2,%%mm2\n\t"
+  "pcmpeqb %%mm4, %%mm2\n\t"
+  "pmovmskb %%mm2, %0" : "=r" (rc) );
+if ( rc != 0xf0 )
+goto fail;
+asm ( "pcmpeqb %%mm4, %%mm3\n\t"
+  "pmovmskb %%mm3, %0" : "=r" (rc) );
+if ( rc != 0x0f )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
 printf("%-40s", "Testing movd %xmm2,32(%edx)...");
 if ( stack_exec && cpu_has_sse2 )
 {
@@ -1845,6 +1918,34 @@ int main(int argc, char **argv)
 else
 printf("skipped\n");
 
+printf("%-40s", "Testing movd 32(%edx),%xmm3...");
+if ( stack_exec && cpu_has_sse2 )
+{
+decl_insn(movd_from_mem2);
+
+asm volatile ( "pcmpeqb %%xmm3, %%xmm3\n"
+   put_insn(movd_from_mem2, "movd 32(%0), %%xmm3")
+   :: "d" (NULL) );
+
+set_insn(movd_from_mem2);
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(movd_from_mem2) )
+goto fail;
+asm ( "pxor %%xmm1,%%xmm1\n\t"
+  "pcmpeqb %%xmm3, %%xmm1\n\t"
+  "pmovmskb %%xmm1, %0" : "=r" (rc) );
+if ( rc != 0xfff0 )
+goto fail;
+asm ( "pcmpeqb %%xmm2, %%xmm2\n\t"
+  "pcmpeqb %%xmm3, %%xmm2\n\t"
+  "pmovmskb %%xmm2, %0" : "=r" (rc) );
+if ( rc != 0x000f )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
 printf("%-40s", "Testing vmovd %xmm1,32(%ecx)...");
 if ( stack_exec && cpu_has_avx )
 {
@@ -1869,6 +1970,34 @@ 

[Xen-devel] [PATCH v4 01/17] x86emul: support most memory accessing MMX/SSE{,2,3} insns

2017-02-28 Thread Jan Beulich
e 0x0f-escape
space with memory operands. Not covered here are irregular moves,
converts, and {,U}COMIS{S,D} (modifying EFLAGS).

Note that the distinction between simd_*_fp isn't strictly needed, but
I've kept them as separate entries since in an earlier version I needed
them to be separate, and we may well find it useful down the road to
have that distinction.

Also take the opportunity to adjust the vmovdqu test case the new
LDDQU one here has been cloned from: to zero a ymm register we don't
need to jump through hoops, as 128-bit AVX insns zero the upper portion
of the destination register, and the disabled AVX2 code used the wrong
YMM register.

Signed-off-by: Jan Beulich 
---
v4: Add blank lines to enum simd_opsize. Re-base.
v3: Correct {,v}addsubp{s,d} comments (no 'h' in mnemonic).
Consistently generate #UD when VEX.l is disallowed. Ignore VEX.l
for scalar insns. Re-base. Introduce more labels to reduce
redundant code. Add fic.exn_raised constraint in invoke_stub() use.
v2: Correct SSE2 p{max,min}{ub,sw} case labels. Correct MMX
ps{ll,r{a,l}} and MMX punpckh{bw,wd,dq} operand sizes. Correct
zapping of TwoOp in x86_decode_twobyte() (and vmovs{s,d} handling
as a result). Also decode pshuf{h,l}w. Correct v{rcp,rsqrt}ss and
vsqrts{s,d} comments (they allow memory operands).

--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -1665,12 +1665,7 @@ int main(int argc, char **argv)
 {
 decl_insn(vmovdqu_from_mem);
 
-#if 0 /* Don't use AVX2 instructions for now */
-asm volatile ( "vpcmpgtb %%ymm4, %%ymm4, %%ymm4\n"
-#else
-asm volatile ( "vpcmpgtb %%xmm4, %%xmm4, %%xmm4\n\t"
-   "vinsertf128 $1, %%xmm4, %%ymm4, %%ymm4\n"
-#endif
+asm volatile ( "vpxor %%xmm4, %%xmm4, %%xmm4\n"
put_insn(vmovdqu_from_mem, "vmovdqu (%0), %%ymm4")
:: "d" (NULL) );
 
@@ -1684,7 +1679,7 @@ int main(int argc, char **argv)
 #if 0 /* Don't use AVX2 instructions for now */
 asm ( "vpcmpeqb %%ymm2, %%ymm2, %%ymm2\n\t"
   "vpcmpeqb %%ymm4, %%ymm2, %%ymm0\n\t"
-  "vpmovmskb %%ymm1, %0" : "=r" (rc) );
+  "vpmovmskb %%ymm0, %0" : "=r" (rc) );
 #else
 asm ( "vextractf128 $1, %%ymm4, %%xmm3\n\t"
   "vpcmpeqb %%xmm2, %%xmm2, %%xmm2\n\t"
@@ -2092,6 +2087,67 @@ int main(int argc, char **argv)
 printf("skipped\n");
 #endif
 
+printf("%-40s", "Testing lddqu 4(%edx),%xmm4...");
+if ( stack_exec && cpu_has_sse3 )
+{
+decl_insn(lddqu);
+
+asm volatile ( "pcmpgtb %%xmm4, %%xmm4\n"
+   put_insn(lddqu, "lddqu 4(%0), %%xmm4")
+   :: "d" (NULL) );
+
+set_insn(lddqu);
+memset(res, 0x55, 64);
+memset(res + 1, 0xff, 16);
+regs.edx = (unsigned long)res;
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(lddqu) )
+goto fail;
+asm ( "pcmpeqb %%xmm2, %%xmm2\n\t"
+  "pcmpeqb %%xmm4, %%xmm2\n\t"
+  "pmovmskb %%xmm2, %0" : "=r" (rc) );
+if ( rc != 0x )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
+printf("%-40s", "Testing vlddqu (%ecx),%ymm4...");
+if ( stack_exec && cpu_has_avx )
+{
+decl_insn(vlddqu);
+
+asm volatile ( "vpxor %%xmm4, %%xmm4, %%xmm4\n"
+   put_insn(vlddqu, "vlddqu (%0), %%ymm4")
+   :: "c" (NULL) );
+
+set_insn(vlddqu);
+memset(res + 1, 0xff, 32);
+regs.ecx = (unsigned long)(res + 1);
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(vlddqu) )
+goto fail;
+#if 0 /* Don't use AVX2 instructions for now */
+asm ( "vpcmpeqb %%ymm2, %%ymm2, %%ymm2\n\t"
+  "vpcmpeqb %%ymm4, %%ymm2, %%ymm0\n\t"
+  "vpmovmskb %%ymm0, %0" : "=r" (rc) );
+#else
+asm ( "vextractf128 $1, %%ymm4, %%xmm3\n\t"
+  "vpcmpeqb %%xmm2, %%xmm2, %%xmm2\n\t"
+  "vpcmpeqb %%xmm4, %%xmm2, %%xmm0\n\t"
+  "vpcmpeqb %%xmm3, %%xmm2, %%xmm1\n\t"
+  "vpmovmskb %%xmm0, %0\n\t"
+  "vpmovmskb %%xmm1, %1" : "=r" (rc), "=r" (i) );
+rc |= i << 16;
+#endif
+if ( ~rc )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
 #undef decl_insn
 #undef put_insn
 #undef set_insn
--- a/tools/tests/x86_emulator/x86_emulate.h
+++ b/tools/tests/x86_emulator/x86_emulate.h
@@ -80,6 +80,12 @@ static inline uint64_t xgetbv(uint32_t x
 (res.d & (1U << 26)) != 0; \
 })
 
+#define cpu_has_sse3 ({ \
+struct cpuid_leaf res; \
+emul_test_cpuid(1, 0, &res, NULL); \
+(res.c & (1U << 0)) != 0; \
+})
+
 #define cpu_has_popcnt ({ \
 struct cpuid_leaf res; \
 emul_test_cpuid(1, 0, &res, NULL);

[Xen-devel] [PATCH v4 03/17] x86emul: support MMX/SSE/SSE2 converts

2017-02-28 Thread Jan Beulich
Note that unlike most scalar instructions, vcvt{,t}s{s,d}2si do #UD
when VEX.l is set on at least some Intel models. To be on the safe
side, implement the most restrictive mode here for now when emulating
an Intel CPU, and simply clear the bit when emulating an AMD one.

Signed-off-by: Jan Beulich 
---
v3: Ignore VEX.l for scalar insns other than vcvt{,t}s{s,d}2si.
Introduce more labels to reduce redundant code. Add fic.exn_raised
constraint to relevant invoke_stub() uses.
v2: Don't pointlessly set TwoOp for cvtpi2p{s,d} and cvt{,t}p{s,d}2pi.
Set Mov for all converts (with follow-on adjustments to case
labels). Consistently generate #UD when VEX.l is disallowed. Don't
check VEX. for vcvtsi2s{s,d}.

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -251,9 +251,10 @@ static const struct {
 [0x22 ... 0x23] = { DstImplicit|SrcMem|ModRM },
 [0x28] = { DstImplicit|SrcMem|ModRM|Mov, simd_packed_fp },
 [0x29] = { DstMem|SrcImplicit|ModRM|Mov, simd_packed_fp },
-[0x2a] = { ImplicitOps|ModRM },
+[0x2a] = { DstImplicit|SrcMem|ModRM|Mov, simd_other },
 [0x2b] = { DstMem|SrcImplicit|ModRM|Mov, simd_any_fp },
-[0x2c ... 0x2f] = { ImplicitOps|ModRM },
+[0x2c ... 0x2d] = { DstImplicit|SrcMem|ModRM|Mov, simd_other },
+[0x2e ... 0x2f] = { ImplicitOps|ModRM },
 [0x30 ... 0x35] = { ImplicitOps },
 [0x37] = { ImplicitOps },
 [0x38] = { DstReg|SrcMem|ModRM },
@@ -264,7 +265,7 @@ static const struct {
 [0x52 ... 0x53] = { DstImplicit|SrcMem|ModRM|TwoOp, simd_single_fp },
 [0x54 ... 0x57] = { DstImplicit|SrcMem|ModRM, simd_packed_fp },
 [0x58 ... 0x59] = { DstImplicit|SrcMem|ModRM, simd_any_fp },
-[0x5a ... 0x5b] = { ModRM },
+[0x5a ... 0x5b] = { DstImplicit|SrcMem|ModRM|Mov, simd_other },
 [0x5c ... 0x5f] = { DstImplicit|SrcMem|ModRM, simd_any_fp },
 [0x60 ... 0x62] = { DstImplicit|SrcMem|ModRM, simd_other },
 [0x63 ... 0x67] = { DstImplicit|SrcMem|ModRM, simd_packed_int },
@@ -327,7 +328,7 @@ static const struct {
 [0xe0] = { DstImplicit|SrcMem|ModRM, simd_packed_int },
 [0xe1 ... 0xe2] = { DstImplicit|SrcMem|ModRM, simd_other },
 [0xe3 ... 0xe5] = { DstImplicit|SrcMem|ModRM, simd_packed_int },
-[0xe6] = { ModRM },
+[0xe6] = { DstImplicit|SrcMem|ModRM|Mov, simd_other },
 [0xe7] = { DstMem|SrcImplicit|ModRM|Mov, simd_packed_int },
 [0xe8 ... 0xef] = { DstImplicit|SrcMem|ModRM, simd_packed_int },
 [0xf0] = { DstImplicit|SrcMem|ModRM|Mov, simd_other },
@@ -5372,6 +5373,101 @@ x86_emulate(
 goto done;
 break;
 
+case X86EMUL_OPC_66(0x0f, 0x2a):   /* cvtpi2pd mm/m64,xmm */
+if ( ea.type == OP_REG )
+{
+case X86EMUL_OPC(0x0f, 0x2a):  /* cvtpi2ps mm/m64,xmm */
+CASE_SIMD_PACKED_FP(, 0x0f, 0x2c): /* cvttp{s,d}2pi xmm/mem,mm */
+CASE_SIMD_PACKED_FP(, 0x0f, 0x2d): /* cvtp{s,d}2pi xmm/mem,mm */
+host_and_vcpu_must_have(mmx);
+}
+op_bytes = (b & 4) && (vex.pfx & VEX_PREFIX_DOUBLE_MASK) ? 16 : 8;
+goto simd_0f_fp;
+
+CASE_SIMD_SCALAR_FP(, 0x0f, 0x2a): /* cvtsi2s{s,d} r/m,xmm */
+CASE_SIMD_SCALAR_FP(_VEX, 0x0f, 0x2a): /* vcvtsi2s{s,d} r/m,xmm,xmm */
+if ( vex.opcx == vex_none )
+{
+if ( vex.pfx & VEX_PREFIX_DOUBLE_MASK )
+vcpu_must_have(sse2);
+else
+vcpu_must_have(sse);
+get_fpu(X86EMUL_FPU_xmm, &fic);
+}
+else
+{
+host_and_vcpu_must_have(avx);
+get_fpu(X86EMUL_FPU_ymm, &fic);
+}
+
+if ( ea.type == OP_MEM )
+{
+rc = read_ulong(ea.mem.seg, ea.mem.off, &src.val,
+rex_prefix & REX_W ? 8 : 4, ctxt, ops);
+if ( rc != X86EMUL_OKAY )
+goto done;
+}
+else
+src.val = rex_prefix & REX_W ? *ea.reg : (uint32_t)*ea.reg;
+
+state->simd_size = simd_none;
+goto simd_0f_rm;
+
+CASE_SIMD_SCALAR_FP(, 0x0f, 0x2c): /* cvtts{s,d}2si xmm/mem,reg */
+CASE_SIMD_SCALAR_FP(_VEX, 0x0f, 0x2c): /* vcvtts{s,d}2si xmm/mem,reg */
+CASE_SIMD_SCALAR_FP(, 0x0f, 0x2d): /* cvts{s,d}2si xmm/mem,reg */
+CASE_SIMD_SCALAR_FP(_VEX, 0x0f, 0x2d): /* vcvts{s,d}2si xmm/mem,reg */
+if ( vex.opcx == vex_none )
+{
+if ( vex.pfx & VEX_PREFIX_DOUBLE_MASK )
+vcpu_must_have(sse2);
+else
+vcpu_must_have(sse);
+get_fpu(X86EMUL_FPU_xmm, &fic);
+}
+else
+{
+if ( ctxt->vendor == X86_VENDOR_AMD )
+vex.l = 0;
+generate_exception_if(vex.l, EXC_UD);
+host_and_vcpu_must_have(avx);
+get_fpu(X86EMUL_FPU_ymm, &fic);
+}
+
+opc = init_prefixes(stub);
+opc[0] = b;
+/* Convert GPR destination to %rAX and memory operand to (%rC

[Xen-devel] [PATCH v4 04/17] x86emul: support {,V}{,U}COMIS{S,D}

2017-02-28 Thread Jan Beulich
Signed-off-by: Jan Beulich 
---
v4: Add missing copy_REX_VEX().
v3: Ignore VEX.l. Add fic.exn_raised constraint to invoke_stub() use.
v2: Add missing RET to stub. Generate #UD (instead of simply failing)
when VEX.l is disallowed.

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -254,7 +254,7 @@ static const struct {
 [0x2a] = { DstImplicit|SrcMem|ModRM|Mov, simd_other },
 [0x2b] = { DstMem|SrcImplicit|ModRM|Mov, simd_any_fp },
 [0x2c ... 0x2d] = { DstImplicit|SrcMem|ModRM|Mov, simd_other },
-[0x2e ... 0x2f] = { ImplicitOps|ModRM },
+[0x2e ... 0x2f] = { ImplicitOps|ModRM|TwoOp },
 [0x30 ... 0x35] = { ImplicitOps },
 [0x37] = { ImplicitOps },
 [0x38] = { DstReg|SrcMem|ModRM },
@@ -5468,6 +5468,55 @@ x86_emulate(
 state->simd_size = simd_none;
 break;
 
+CASE_SIMD_PACKED_FP(, 0x0f, 0x2e): /* ucomis{s,d} xmm/mem,xmm */
+CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x2e): /* vucomis{s,d} xmm/mem,xmm */
+CASE_SIMD_PACKED_FP(, 0x0f, 0x2f): /* comis{s,d} xmm/mem,xmm */
+CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x2f): /* vcomis{s,d} xmm/mem,xmm */
+if ( vex.opcx == vex_none )
+{
+if ( vex.pfx )
+vcpu_must_have(sse2);
+else
+vcpu_must_have(sse);
+get_fpu(X86EMUL_FPU_xmm, &fic);
+}
+else
+{
+host_and_vcpu_must_have(avx);
+get_fpu(X86EMUL_FPU_ymm, &fic);
+}
+
+opc = init_prefixes(stub);
+opc[0] = b;
+opc[1] = modrm;
+if ( ea.type == OP_MEM )
+{
+rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, vex.pfx ? 8 : 4,
+   ctxt);
+if ( rc != X86EMUL_OKAY )
+goto done;
+
+/* Convert memory operand to (%rAX). */
+rex_prefix &= ~REX_B;
+vex.b = 1;
+opc[1] &= 0x38;
+}
+fic.insn_bytes = PFX_BYTES + 2;
+opc[2] = 0xc3;
+
+copy_REX_VEX(opc, rex_prefix, vex);
+invoke_stub(_PRE_EFLAGS("[eflags]", "[mask]", "[tmp]"),
+_POST_EFLAGS("[eflags]", "[mask]", "[tmp]"),
+[eflags] "+g" (_regs._eflags),
+[tmp] "=&r" (cr4 /* dummy */), "+m" (*mmvalp),
+"+m" (fic.exn_raised)
+: [func] "rm" (stub.func), "a" (mmvalp),
+  [mask] "i" (EFLAGS_MASK));
+
+put_stub(stub);
+put_fpu(&fic);
+break;
+
 case X86EMUL_OPC(0x0f, 0x30): /* wrmsr */
 generate_exception_if(!mode_ring0(), EXC_GP, 0);
 fail_if(ops->write_msr == NULL);




[Xen-devel] [PATCH v4 05/17] x86emul: support MMX/SSE{,2,4a} insns with only register operands

2017-02-28 Thread Jan Beulich
This involves fixing a decode bug: VEX encoded insns aren't necessarily
followed by a ModR/M byte.

Signed-off-by: Jan Beulich 
---
v4: Add missing setting of op_bytes to insertq (register form)
handling.
v3: Simplify handling of extrq/insertq register forms. Use simd_0f_xmm
label.
v2: Correct {,v}pextrw operand descriptor.

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -274,10 +274,11 @@ static const struct {
 [0x6e] = { DstImplicit|SrcMem|ModRM|Mov },
 [0x6f] = { DstImplicit|SrcMem|ModRM|Mov, simd_packed_int },
 [0x70] = { SrcImmByte|ModRM|TwoOp, simd_other },
-[0x71 ... 0x73] = { SrcImmByte|ModRM },
+[0x71 ... 0x73] = { DstImplicit|SrcImmByte|ModRM },
 [0x74 ... 0x76] = { DstImplicit|SrcMem|ModRM, simd_packed_int },
 [0x77] = { DstImplicit|SrcNone },
-[0x78 ... 0x79] = { ModRM },
+[0x78] = { ImplicitOps|ModRM },
+[0x79] = { DstReg|SrcMem|ModRM, simd_packed_int },
 [0x7c ... 0x7d] = { DstImplicit|SrcMem|ModRM, simd_other },
 [0x7e] = { DstMem|SrcImplicit|ModRM|Mov },
 [0x7f] = { DstMem|SrcImplicit|ModRM|Mov, simd_packed_int },
@@ -315,7 +316,7 @@ static const struct {
 [0xc2] = { DstImplicit|SrcImmByte|ModRM, simd_any_fp },
 [0xc3] = { DstMem|SrcReg|ModRM|Mov },
 [0xc4] = { DstReg|SrcImmByte|ModRM, simd_packed_int },
-[0xc5] = { SrcImmByte|ModRM },
+[0xc5] = { DstReg|SrcImmByte|ModRM|Mov },
 [0xc6] = { DstImplicit|SrcImmByte|ModRM, simd_packed_fp },
 [0xc7] = { ImplicitOps|ModRM },
 [0xc8 ... 0xcf] = { ImplicitOps },
@@ -2505,12 +2506,21 @@ x86_decode(
 
 opcode |= b | MASK_INSR(vex.pfx, X86EMUL_OPC_PFX_MASK);
 
+if ( !(d & ModRM) )
+{
+modrm_reg = modrm_rm = modrm_mod = modrm = 0;
+break;
+}
+
 modrm = insn_fetch_type(uint8_t);
 modrm_mod = (modrm & 0xc0) >> 6;
 
 break;
 }
+}
 
+if ( d & ModRM )
+{
 modrm_reg = ((rex_prefix & 4) << 1) | ((modrm & 0x38) >> 3);
 modrm_rm  = modrm & 0x07;
 
@@ -5658,6 +5668,18 @@ x86_emulate(
 CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x50): /* vmovmskp{s,d} {x,y}mm,reg */
 CASE_SIMD_PACKED_INT(0x0f, 0xd7):  /* pmovmskb {,x}mm,reg */
 case X86EMUL_OPC_VEX_66(0x0f, 0xd7):   /* vpmovmskb {x,y}mm,reg */
+opc = init_prefixes(stub);
+opc[0] = b;
+/* Convert GPR destination to %rAX. */
+rex_prefix &= ~REX_R;
+vex.r = 1;
+if ( !mode_64bit() )
+vex.w = 0;
+opc[1] = modrm & 0xc7;
+fic.insn_bytes = PFX_BYTES + 2;
+simd_0f_to_gpr:
+opc[fic.insn_bytes - PFX_BYTES] = 0xc3;
+
 generate_exception_if(ea.type != OP_REG, EXC_UD);
 
 if ( vex.opcx == vex_none )
@@ -5685,17 +5707,6 @@ x86_emulate(
 get_fpu(X86EMUL_FPU_ymm, &fic);
 }
 
-opc = init_prefixes(stub);
-opc[0] = b;
-/* Convert GPR destination to %rAX. */
-rex_prefix &= ~REX_R;
-vex.r = 1;
-if ( !mode_64bit() )
-vex.w = 0;
-opc[1] = modrm & 0xc7;
-fic.insn_bytes = PFX_BYTES + 2;
-opc[2] = 0xc3;
-
 copy_REX_VEX(opc, rex_prefix, vex);
 invoke_stub("", "", "=a" (dst.val) : [dummy] "i" (0));
 
@@ -5954,6 +5965,132 @@ x86_emulate(
 fic.insn_bytes = PFX_BYTES + 3;
 break;
 
+CASE_SIMD_PACKED_INT(0x0f, 0x71):/* Grp12 */
+case X86EMUL_OPC_VEX_66(0x0f, 0x71):
+CASE_SIMD_PACKED_INT(0x0f, 0x72):/* Grp13 */
+case X86EMUL_OPC_VEX_66(0x0f, 0x72):
+switch ( modrm_reg & 7 )
+{
+case 2: /* psrl{w,d} $imm8,{,x}mm */
+/* vpsrl{w,d} $imm8,{x,y}mm,{x,y}mm */
+case 4: /* psra{w,d} $imm8,{,x}mm */
+/* vpsra{w,d} $imm8,{x,y}mm,{x,y}mm */
+case 6: /* psll{w,d} $imm8,{,x}mm */
+/* vpsll{w,d} $imm8,{x,y}mm,{x,y}mm */
+break;
+default:
+goto cannot_emulate;
+}
+simd_0f_shift_imm:
+generate_exception_if(ea.type != OP_REG, EXC_UD);
+
+if ( vex.opcx != vex_none )
+{
+if ( vex.l )
+host_and_vcpu_must_have(avx2);
+else
+host_and_vcpu_must_have(avx);
+get_fpu(X86EMUL_FPU_ymm, &fic);
+}
+else if ( vex.pfx )
+{
+vcpu_must_have(sse2);
+get_fpu(X86EMUL_FPU_xmm, &fic);
+}
+else
+{
+host_and_vcpu_must_have(mmx);
+get_fpu(X86EMUL_FPU_mmx, &fic);
+}
+
+opc = init_prefixes(stub);
+opc[0] = b;
+opc[1] = modrm;
+opc[2] = imm1;
+fic.insn_bytes = PFX_BYTES + 3;
+simd_0f_reg_only:
+opc[fic.insn_bytes - PFX_BYTES] = 0xc3;
+
+copy_REX_VEX(opc, rex_prefix, vex);
+invoke_stub("", "", [dummy_out] "=g" (cr4)

Re: [Xen-devel] [PATCH 2/4] tools: add pkg-config file for libxc

2017-02-28 Thread Wei Liu
On Tue, Feb 28, 2017 at 01:32:06PM +0100, Juergen Gross wrote:
> And I don't see why it would be wrong to use the same basic mechanism
> for our internal build as for any "normal" build outside of the Xen
> build environment.
> 

+1 for this.



[Xen-devel] [PATCH v4 06/17] x86emul: support {,V}{LD,ST}MXCSR

2017-02-28 Thread Jan Beulich
Signed-off-by: Jan Beulich 
---
v4: Drop the host_and_ part from the AVX checks.
v3: Re-base.

--- a/tools/fuzz/x86_instruction_emulator/x86-insn-emulator-fuzzer.c
+++ b/tools/fuzz/x86_instruction_emulator/x86-insn-emulator-fuzzer.c
@@ -660,7 +660,7 @@ int LLVMFuzzerTestOneInput(const uint8_t
 };
 int rc;
 
-stack_exec = emul_test_make_stack_executable();
+stack_exec = emul_test_init();
 if ( !stack_exec )
 {
 printf("Warning: Stack could not be made executable (%d).\n", errno);
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -219,7 +219,7 @@ int main(int argc, char **argv)
 }
 instr = (char *)res + 0x100;
 
-stack_exec = emul_test_make_stack_executable();
+stack_exec = emul_test_init();
 
 if ( !stack_exec )
 printf("Warning: Stack could not be made executable (%d).\n", errno);
@@ -2395,6 +2395,87 @@ int main(int argc, char **argv)
 goto fail;
 printf("okay\n");
 }
+else
+printf("skipped\n");
+
+printf("%-40s", "Testing stmxcsr (%edx)...");
+if ( cpu_has_sse )
+{
+decl_insn(stmxcsr);
+
+asm volatile ( put_insn(stmxcsr, "stmxcsr (%0)") :: "d" (NULL) );
+
+res[0] = 0x12345678;
+res[1] = 0x87654321;
+asm ( "stmxcsr %0" : "=m" (res[2]) );
+set_insn(stmxcsr);
+regs.edx = (unsigned long)res;
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(stmxcsr) ||
+ res[0] != res[2] || res[1] != 0x87654321 )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
+printf("%-40s", "Testing ldmxcsr 4(%ecx)...");
+if ( cpu_has_sse )
+{
+decl_insn(ldmxcsr);
+
+asm volatile ( put_insn(ldmxcsr, "ldmxcsr 4(%0)") :: "c" (NULL) );
+
+set_insn(ldmxcsr);
+res[1] = mxcsr_mask;
+regs.ecx = (unsigned long)res;
+rc = x86_emulate(&ctxt, &emulops);
+asm ( "stmxcsr %0; ldmxcsr %1" : "=m" (res[0]) : "m" (res[2]) );
+if ( rc != X86EMUL_OKAY || !check_eip(ldmxcsr) ||
+ res[0] != mxcsr_mask )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
+printf("%-40s", "Testing vstmxcsr (%ecx)...");
+if ( cpu_has_avx )
+{
+decl_insn(vstmxcsr);
+
+asm volatile ( put_insn(vstmxcsr, "vstmxcsr (%0)") :: "c" (NULL) );
+
+res[0] = 0x12345678;
+res[1] = 0x87654321;
+set_insn(vstmxcsr);
+regs.ecx = (unsigned long)res;
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(vstmxcsr) ||
+ res[0] != res[2] || res[1] != 0x87654321 )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
+printf("%-40s", "Testing vldmxcsr 4(%edx)...");
+if ( cpu_has_avx )
+{
+decl_insn(vldmxcsr);
+
+asm volatile ( put_insn(vldmxcsr, "vldmxcsr 4(%0)") :: "d" (NULL) );
+
+set_insn(vldmxcsr);
+res[1] = mxcsr_mask;
+regs.edx = (unsigned long)res;
+rc = x86_emulate(&ctxt, &emulops);
+asm ( "stmxcsr %0; ldmxcsr %1" : "=m" (res[0]) : "m" (res[2]) );
+if ( rc != X86EMUL_OKAY || !check_eip(vldmxcsr) ||
+ res[0] != mxcsr_mask )
+goto fail;
+printf("okay\n");
+}
 else
 printf("skipped\n");
 
--- a/tools/tests/x86_emulator/x86_emulate.c
+++ b/tools/tests/x86_emulator/x86_emulate.c
@@ -22,10 +22,29 @@
 #define get_stub(stb) ((void *)((stb).addr = (uintptr_t)(stb).buf))
 #define put_stub(stb)
 
-bool emul_test_make_stack_executable(void)
+uint32_t mxcsr_mask = 0xffbf;
+
+bool emul_test_init(void)
 {
 unsigned long sp;
 
+if ( cpu_has_fxsr )
+{
+static union __attribute__((__aligned__(16))) {
+char x[464];
+struct {
+uint32_t other[6];
+uint32_t mxcsr;
+uint32_t mxcsr_mask;
+/* ... */
+};
+} fxs;
+
+asm ( "fxsave %0" : "=m" (fxs) );
+if ( fxs.mxcsr_mask )
+mxcsr_mask = fxs.mxcsr_mask;
+}
+
 /*
  * Mark the entire stack executable so that the stub executions
  * don't fault
--- a/tools/tests/x86_emulator/x86_emulate.h
+++ b/tools/tests/x86_emulator/x86_emulate.h
@@ -42,8 +42,10 @@
 
 #define is_canonical_address(x) (((int64_t)(x) >> 47) == ((int64_t)(x) >> 63))
 
+extern uint32_t mxcsr_mask;
+
 #define MMAP_SZ 16384
-bool emul_test_make_stack_executable(void);
+bool emul_test_init(void);
 
 #include "x86_emulate/x86_emulate.h"
 
@@ -68,6 +70,12 @@ static inline uint64_t xgetbv(uint32_t x
 (res.d & (1U << 23)) != 0; \
 })
 
+#define cpu_has_fxsr ({ \
+struct cpuid_leaf res; \
+emul_test_cpuid(1, 0, &res, NULL); \
+(res.d & (1U << 24)) != 0; \
+})
+
 #define cpu_has_sse ({ \
 struct cpuid_le

[Xen-devel] [PATCH v4 07/17] x86emul: support {,V}MOVNTDQA

2017-02-28 Thread Jan Beulich
... as the only post-SSE2 move insn.

Signed-off-by: Jan Beulich 
---
v3: Re-base.
v2: Re-base.

--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -2398,6 +2398,74 @@ int main(int argc, char **argv)
 else
 printf("skipped\n");
 
+printf("%-40s", "Testing movntdqa 16(%edx),%xmm4...");
+if ( stack_exec && cpu_has_sse4_1 )
+{
+decl_insn(movntdqa);
+
+asm volatile ( "pcmpgtb %%xmm4, %%xmm4\n"
+   put_insn(movntdqa, "movntdqa 16(%0), %%xmm4")
+   :: "d" (NULL) );
+
+set_insn(movntdqa);
+memset(res, 0x55, 64);
+memset(res + 4, 0xff, 16);
+regs.edx = (unsigned long)res;
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(movntdqa) )
+goto fail;
+asm ( "pcmpeqb %%xmm2, %%xmm2\n\t"
+  "pcmpeqb %%xmm4, %%xmm2\n\t"
+  "pmovmskb %%xmm2, %0" : "=r" (rc) );
+if ( rc != 0xffff )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
+printf("%-40s", "Testing vmovntdqa (%ecx),%ymm4...");
+if ( stack_exec && cpu_has_avx2 )
+{
+decl_insn(vmovntdqa);
+
+#if 0 /* Don't use AVX2 instructions for now */
+asm volatile ( "vpxor %%ymm4, %%ymm4, %%ymm4\n"
+   put_insn(vmovntdqa, "vmovntdqa (%0), %%ymm4")
+   :: "c" (NULL) );
+#else
+asm volatile ( "vpxor %xmm4, %xmm4, %xmm4\n"
+   put_insn(vmovntdqa,
+".byte 0xc4, 0xe2, 0x7d, 0x2a, 0x21") );
+#endif
+
+set_insn(vmovntdqa);
+memset(res, 0x55, 96);
+memset(res + 8, 0xff, 32);
+regs.ecx = (unsigned long)(res + 8);
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(vmovntdqa) )
+goto fail;
+#if 0 /* Don't use AVX2 instructions for now */
+asm ( "vpcmpeqb %%ymm2, %%ymm2, %%ymm2\n\t"
+  "vpcmpeqb %%ymm4, %%ymm2, %%ymm0\n\t"
+  "vpmovmskb %%ymm0, %0" : "=r" (rc) );
+#else
+asm ( "vextractf128 $1, %%ymm4, %%xmm3\n\t"
+  "vpcmpeqb %%xmm2, %%xmm2, %%xmm2\n\t"
+  "vpcmpeqb %%xmm4, %%xmm2, %%xmm0\n\t"
+  "vpcmpeqb %%xmm3, %%xmm2, %%xmm1\n\t"
+  "vpmovmskb %%xmm0, %0\n\t"
+  "vpmovmskb %%xmm1, %1" : "=r" (rc), "=r" (i) );
+rc |= i << 16;
+#endif
+if ( ~rc )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
 printf("%-40s", "Testing stmxcsr (%edx)...");
 if ( cpu_has_sse )
 {
--- a/tools/tests/x86_emulator/x86_emulate.h
+++ b/tools/tests/x86_emulator/x86_emulate.h
@@ -94,6 +94,12 @@ static inline uint64_t xgetbv(uint32_t x
 (res.c & (1U << 0)) != 0; \
 })
 
+#define cpu_has_sse4_1 ({ \
+struct cpuid_leaf res; \
+emul_test_cpuid(1, 0, &res, NULL); \
+(res.c & (1U << 19)) != 0; \
+})
+
 #define cpu_has_popcnt ({ \
 struct cpuid_leaf res; \
 emul_test_cpuid(1, 0, &res, NULL); \
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1399,6 +1399,7 @@ static bool vcpu_has(
 #define vcpu_has_sse2()vcpu_has( 1, EDX, 26, ctxt, ops)
 #define vcpu_has_sse3()vcpu_has( 1, ECX,  0, ctxt, ops)
 #define vcpu_has_cx16()vcpu_has( 1, ECX, 13, ctxt, ops)
+#define vcpu_has_sse4_1()  vcpu_has( 1, ECX, 19, ctxt, ops)
 #define vcpu_has_sse4_2()  vcpu_has( 1, ECX, 20, ctxt, ops)
 #define vcpu_has_movbe()   vcpu_has( 1, ECX, 22, ctxt, ops)
 #define vcpu_has_popcnt()  vcpu_has( 1, ECX, 23, ctxt, ops)
@@ -5919,6 +5920,7 @@ x86_emulate(
 case X86EMUL_OPC_VEX_66(0x0f, 0x7f): /* vmovdqa {x,y}mm,{x,y}mm/m128 */
 case X86EMUL_OPC_F3(0x0f, 0x7f): /* movdqu xmm,xmm/m128 */
 case X86EMUL_OPC_VEX_F3(0x0f, 0x7f): /* vmovdqu {x,y}mm,{x,y}mm/mem */
+movdqa:
 d |= TwoOp;
 op_bytes = 16 << vex.l;
 if ( vex.opcx != vex_none )
@@ -6814,6 +6816,23 @@ x86_emulate(
 sfence = true;
 break;
 
+case X86EMUL_OPC_66(0x0f38, 0x2a): /* movntdqa m128,xmm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x2a): /* vmovntdqa mem,{x,y}mm */
+generate_exception_if(ea.type != OP_MEM, EXC_UD);
+/* Ignore the non-temporal hint for now, using movdqa instead. */
+asm volatile ( "mfence" ::: "memory" );
+b = 0x6f;
+if ( vex.opcx == vex_none )
+vcpu_must_have(sse4_1);
+else
+{
+vex.opcx = vex_0f;
+if ( vex.l )
+vcpu_must_have(avx2);
+}
+state->simd_size = simd_packed_int;
+goto movdqa;
+
 case X86EMUL_OPC(0x0f38, 0xf0): /* movbe m,r */
 case X86EMUL_OPC(0x0f38, 0xf1): /* movbe r,m */
 vcpu_must_have(movbe);



[Xen-devel] [PATCH v4 08/17] x86emul: test coverage for SSE/SSE2 insns

2017-02-28 Thread Jan Beulich
... and their AVX equivalents. Note that a few instructions aren't
covered (yet), but those all fall into common pattern groups, so I
would hope that for now we can do with what is there.

MMX insns aren't being covered at all, as they're not easy to deal
with: The compiler refuses to emit such for other than uses of built-in
functions.

The current way of testing AVX insns is meant to be temporary only:
Once we fully support that feature, the present tests should rather be
replaced than full ones simply added.

Signed-off-by: Jan Beulich 
Acked-by: Andrew Cooper 
---
v4: Put spaces around ##. Parenthesize uses of macro parameters. Fix
indentation for a few preprocessor directives.
v2: New.

--- a/tools/tests/x86_emulator/Makefile
+++ b/tools/tests/x86_emulator/Makefile
@@ -11,11 +11,36 @@ all: $(TARGET)
 run: $(TARGET)
./$(TARGET)
 
-TESTCASES := blowfish
+TESTCASES := blowfish simd
 
 blowfish-cflags := ""
 blowfish-cflags-x86_32 := "-mno-accumulate-outgoing-args -Dstatic="
 
+sse-vecs := 16
+sse-ints :=
+sse-flts := 4
+sse2-vecs := $(sse-vecs)
+sse2-ints := 1 2 4 8
+sse2-flts := 4 8
+
+# When converting SSE to AVX, have the compiler avoid XMM0 to widen
+# coverage of the VEX.vvvv checks in the emulator.
+sse2avx := -ffixed-xmm0 -Wa,-msse2avx
+
+simd-cflags := $(foreach flavor,sse sse2, \
+ $(foreach vec,$($(flavor)-vecs), \
+   $(foreach int,$($(flavor)-ints), \
+ "-D$(flavor)_$(vec)i$(int) -m$(flavor) -O2 -DVEC_SIZE=$(vec) -DINT_SIZE=$(int)" \
+ "-D$(flavor)_$(vec)u$(int) -m$(flavor) -O2 -DVEC_SIZE=$(vec) -DUINT_SIZE=$(int)" \
+ "-D$(flavor)_avx_$(vec)i$(int) -m$(flavor) $(sse2avx) -O2 -DVEC_SIZE=$(vec) -DINT_SIZE=$(int)" \
+ "-D$(flavor)_avx_$(vec)u$(int) -m$(flavor) $(sse2avx) -O2 -DVEC_SIZE=$(vec) -DUINT_SIZE=$(int)") \
+   $(foreach flt,$($(flavor)-flts), \
+ "-D$(flavor)_$(vec)f$(flt) -m$(flavor) -O2 -DVEC_SIZE=$(vec) -DFLOAT_SIZE=$(flt)" \
+ "-D$(flavor)_avx_$(vec)f$(flt) -m$(flavor) $(sse2avx) -O2 -DVEC_SIZE=$(vec) -DFLOAT_SIZE=$(flt)")) \
+ $(foreach flt,$($(flavor)-flts), \
+   "-D$(flavor)_f$(flt) -m$(flavor) -mfpmath=sse -O2 -DFLOAT_SIZE=$(flt)" \
+   "-D$(flavor)_avx_f$(flt) -m$(flavor) -mfpmath=sse $(sse2avx) -O2 -DFLOAT_SIZE=$(flt)"))
+
 $(addsuffix .h,$(TESTCASES)): %.h: %.c testcase.mk Makefile
rm -f $@.new $*.bin
	$(foreach arch,$(filter-out $(XEN_COMPILE_ARCH),x86_32) $(XEN_COMPILE_ARCH), \
--- /dev/null
+++ b/tools/tests/x86_emulator/simd.c
@@ -0,0 +1,450 @@
+#include 
+
+asm (
+"\t.text\n"
+"\t.globl _start\n"
+"_start:\n"
+#if defined(__i386__) && VEC_SIZE == 16
+"\tpush %ebp\n"
+"\tmov %esp,%ebp\n"
+"\tand $~0xf,%esp\n"
+"\tcall simd_test\n"
+"\tleave\n"
+"\tret"
+#else
+"\tjmp simd_test"
+#endif
+);
+
+typedef
+#if defined(INT_SIZE)
+# define ELEM_SIZE INT_SIZE
+signed int
+# if INT_SIZE == 1
+#  define MODE QI
+# elif INT_SIZE == 2
+#  define MODE HI
+# elif INT_SIZE == 4
+#  define MODE SI
+# elif INT_SIZE == 8
+#  define MODE DI
+# endif
+#elif defined(UINT_SIZE)
+# define ELEM_SIZE UINT_SIZE
+unsigned int
+# if UINT_SIZE == 1
+#  define MODE QI
+# elif UINT_SIZE == 2
+#  define MODE HI
+# elif UINT_SIZE == 4
+#  define MODE SI
+# elif UINT_SIZE == 8
+#  define MODE DI
+# endif
+#elif defined(FLOAT_SIZE)
+float
+# define ELEM_SIZE FLOAT_SIZE
+# if FLOAT_SIZE == 4
+#  define MODE SF
+# elif FLOAT_SIZE == 8
+#  define MODE DF
+# endif
+#endif
+#ifndef VEC_SIZE
+# define VEC_SIZE ELEM_SIZE
+#endif
+__attribute__((mode(MODE), vector_size(VEC_SIZE))) vec_t;
+
+#define ELEM_COUNT (VEC_SIZE / ELEM_SIZE)
+
+typedef unsigned int __attribute__((mode(QI), vector_size(VEC_SIZE))) byte_vec_t;
+
+/* Various builtins want plain char / int / long long vector types ... */
+typedef char __attribute__((vector_size(VEC_SIZE))) vqi_t;
+typedef short __attribute__((vector_size(VEC_SIZE))) vhi_t;
+typedef int __attribute__((vector_size(VEC_SIZE))) vsi_t;
+#if VEC_SIZE >= 8
+typedef long long __attribute__((vector_size(VEC_SIZE))) vdi_t;
+#endif
+
+#if VEC_SIZE == 8 && defined(__SSE__)
+# define to_bool(cmp) (__builtin_ia32_pmovmskb(cmp) == 0xff)
+#elif VEC_SIZE == 16
+# if defined(__SSE__) && ELEM_SIZE == 4
+#  define to_bool(cmp) (__builtin_ia32_movmskps(cmp) == 0xf)
+# elif defined(__SSE2__)
+#  if ELEM_SIZE == 8
+#   define to_bool(cmp) (__builtin_ia32_movmskpd(cmp) == 3)
+#  else
#   define to_bool(cmp) (__builtin_ia32_pmovmskb128(cmp) == 0xffff)
+#  endif
+# endif
+#endif
+
+#ifndef to_bool
+static inline bool _to_bool(byte_vec_t bv)
+{
+unsigned int i;
+
+for ( i = 0; i < VEC_SIZE; ++i )
+if ( bv[i] != 0xff )
+return false;
+
+return true;
+}
+# define to_bool(cmp) _to_bool((byte_vec_t)(cmp))
+#endif
+
+#if VEC_SIZE == FLOAT_SIZE
+# define to_int(x) ((vec_t){ (int)(x)[0

[Xen-devel] [PATCH v4 09/17] x86emul: honor MMXEXT feature flag

2017-02-28 Thread Jan Beulich
This being a strict (MMX register only) subset of SSE, we can simply
adjust the respective checks while making the new predicate look at
both flags.

Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1405,6 +1405,8 @@ static bool vcpu_has(
 #define vcpu_has_popcnt()  vcpu_has( 1, ECX, 23, ctxt, ops)
 #define vcpu_has_avx() vcpu_has( 1, ECX, 28, ctxt, ops)
 #define vcpu_has_rdrand()  vcpu_has( 1, ECX, 30, ctxt, ops)
+#define vcpu_has_mmxext() (vcpu_has(0x80000001, EDX, 22, ctxt, ops) || \
+   vcpu_has_sse())
#define vcpu_has_lahf_lm() vcpu_has(0x80000001, ECX,  0, ctxt, ops)
#define vcpu_has_cr8_legacy()  vcpu_has(0x80000001, ECX,  4, ctxt, ops)
#define vcpu_has_lzcnt()   vcpu_has(0x80000001, ECX,  5, ctxt, ops)
@@ -5707,8 +5709,12 @@ x86_emulate(
 else
 {
 if ( b != 0x50 )
+{
 host_and_vcpu_must_have(mmx);
-vcpu_must_have(sse);
+vcpu_must_have(mmxext);
+}
+else
+vcpu_must_have(sse);
 }
 if ( b == 0x50 || (vex.pfx & VEX_PREFIX_DOUBLE_MASK) )
 get_fpu(X86EMUL_FPU_xmm, &fic);
@@ -5966,7 +5972,7 @@ x86_emulate(
 else
 {
 host_and_vcpu_must_have(mmx);
-vcpu_must_have(sse);
+vcpu_must_have(mmxext);
 get_fpu(X86EMUL_FPU_mmx, &fic);
 }
 simd_0f_imm8:
@@ -6252,7 +6258,7 @@ x86_emulate(
 if ( modrm_mod == 3 ) /* sfence */
 {
 generate_exception_if(vex.pfx, EXC_UD);
-vcpu_must_have(sse);
+vcpu_must_have(mmxext);
 asm volatile ( "sfence" ::: "memory" );
 break;
 }
@@ -6736,7 +6742,7 @@ x86_emulate(
 case X86EMUL_OPC(0x0f, 0xe3):/* pavgw mm/m64,mm */
 case X86EMUL_OPC(0x0f, 0xe4):/* pmulhuw mm/m64,mm */
 case X86EMUL_OPC(0x0f, 0xf6):/* psadbw mm/m64,mm */
-vcpu_must_have(sse);
+vcpu_must_have(mmxext);
 goto simd_0f_mmx;
 
 case X86EMUL_OPC_66(0x0f, 0xe6):   /* cvttpd2dq xmm/mem,xmm */
@@ -6767,7 +6773,7 @@ x86_emulate(
 else
 {
 host_and_vcpu_must_have(mmx);
-vcpu_must_have(sse);
+vcpu_must_have(mmxext);
 get_fpu(X86EMUL_FPU_mmx, &fic);
 }
 




[Xen-devel] [PATCH v4 10/17] x86emul: add tables for 0f38 and 0f3a extension space

2017-02-28 Thread Jan Beulich
Convert the few opcodes supported so far.

Signed-off-by: Jan Beulich 
---
v3: New.

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -43,6 +43,8 @@
 #define SrcMask (7<<3)
 /* Generic ModRM decode. */
 #define ModRM   (1<<6)
+/* vSIB addressing mode (0f38 extension opcodes only), aliasing ModRM. */
+#define vSIB(1<<6)
 /* Destination is only written; never read. */
 #define Mov (1<<7)
 /* VEX/EVEX (SIMD only): 2nd source operand unused (must be all ones) */
@@ -340,6 +342,28 @@ static const struct {
 [0xff] = { ModRM }
 };
 
+static const struct {
+uint8_t simd_size:5;
+uint8_t to_memory:1;
+uint8_t two_op:1;
+uint8_t vsib:1;
+} ext0f38_table[256] = {
+[0x2a] = { .simd_size = simd_packed_int, .two_op = 1 },
+[0xf0] = { .two_op = 1 },
+[0xf1] = { .to_memory = 1, .two_op = 1 },
+[0xf2 ... 0xf3] = {},
+[0xf5 ... 0xf7] = {},
+};
+
+static const struct {
+uint8_t simd_size:5;
+uint8_t to_memory:1;
+uint8_t two_op:1;
+uint8_t four_op:1;
+} ext0f3a_table[256] = {
+[0xf0] = {},
+};
+
 static const opcode_desc_t xop_table[] = {
 DstReg|SrcImmByte|ModRM,
 DstReg|SrcMem|ModRM,
@@ -2129,7 +2153,7 @@ x86_decode_onebyte(
 /* fall through */
 case 3: /* call (far, absolute indirect) */
 case 5: /* jmp (far, absolute indirect) */
-state->desc = DstNone | SrcMem | ModRM | Mov;
+state->desc = DstNone | SrcMem | Mov;
 break;
 }
 break;
@@ -2199,7 +2223,7 @@ x86_decode_twobyte(
 if ( vex.pfx == vex_f3 ) /* movq xmm/m64,xmm */
 {
 case X86EMUL_OPC_VEX_F3(0, 0x7e): /* vmovq xmm/m64,xmm */
-state->desc = DstImplicit | SrcMem | ModRM | Mov;
+state->desc = DstImplicit | SrcMem | Mov;
 state->simd_size = simd_other;
 /* Avoid the state->desc adjustment below. */
 return X86EMUL_OKAY;
@@ -2213,12 +2237,12 @@ x86_decode_twobyte(
 switch ( modrm_reg & 7 )
 {
 case 2: /* {,v}ldmxcsr */
-state->desc = DstImplicit | SrcMem | ModRM | Mov;
+state->desc = DstImplicit | SrcMem | Mov;
 op_bytes = 4;
 break;
 
 case 3: /* {,v}stmxcsr */
-state->desc = DstMem | SrcImplicit | ModRM | Mov;
+state->desc = DstMem | SrcImplicit | Mov;
 op_bytes = 4;
 break;
 }
@@ -2239,7 +2263,7 @@ x86_decode_twobyte(
 ctxt->opcode |= MASK_INSR(vex.pfx, X86EMUL_OPC_PFX_MASK);
 /* fall through */
 case X86EMUL_OPC_VEX_66(0, 0xc4): /* vpinsrw */
-state->desc = DstReg | SrcMem16 | ModRM;
+state->desc = DstReg | SrcMem16;
 break;
 }
 
@@ -2275,8 +2299,8 @@ x86_decode_0f38(
 break;
 
 case 0xf1: /* movbe / crc32 */
-if ( !repne_prefix() )
-state->desc = (state->desc & ~(DstMask | SrcMask)) | DstMem | SrcReg | Mov;
+if ( repne_prefix() )
+state->desc = DstReg | SrcMem;
 if ( rep_prefix() )
 ctxt->opcode |= MASK_INSR(vex.pfx, X86EMUL_OPC_PFX_MASK);
 break;
@@ -2527,10 +2551,7 @@ x86_decode(
 opcode |= b | MASK_INSR(vex.pfx, X86EMUL_OPC_PFX_MASK);
 
 if ( !(d & ModRM) )
-{
-modrm_reg = modrm_rm = modrm_mod = modrm = 0;
 break;
-}
 
 modrm = insn_fetch_type(uint8_t);
 modrm_mod = (modrm & 0xc0) >> 6;
@@ -2541,6 +2562,8 @@ x86_decode(
 
 if ( d & ModRM )
 {
+d &= ~ModRM;
+#undef ModRM /* Only its aliases are valid to use from here on. */
 modrm_reg = ((rex_prefix & 4) << 1) | ((modrm & 0x38) >> 3);
 modrm_rm  = modrm & 0x07;
 
@@ -2550,8 +2573,9 @@ x86_decode(
  * normally be only addition/removal of SrcImm/SrcImm16, so their
  * fetching can be taken care of by the common code below.
  */
-if ( ext == ext_none )
+switch ( ext )
 {
+case ext_none:
 switch ( b )
 {
 case 0xf6 ... 0xf7: /* Grp3 */
@@ -2577,6 +2601,25 @@ x86_decode(
 }
 break;
 }
+break;
+
+case vex_0f38:
+d = ext0f38_table[b].to_memory ? DstMem | SrcReg
+   : DstReg | SrcMem;
+if ( ext0f38_table[b].two_op )
+d |= TwoOp;
+if ( ext0f38_table[b].vsib )
+d |= vSIB;
+state->simd_size = ext0f38_table[b].simd_size;
+break;
+
+case vex_0f3a:
+/*
+ * Cannot update d here yet, as the immediate operand still
+ * needs fetching.
+ */
+default:
+break;
 }
 
 if ( modrm_mod == 3 )
@@ -2587,6 +2630,7 @@ x86_decode(
 else

[Xen-devel] [PATCH v4 11/17] x86emul: support SSSE3 insns

2017-02-28 Thread Jan Beulich
... and their AVX equivalents.

Signed-off-by: Jan Beulich 
---
v3: New.

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -348,6 +348,8 @@ static const struct {
 uint8_t two_op:1;
 uint8_t vsib:1;
 } ext0f38_table[256] = {
+[0x00 ... 0x0b] = { .simd_size = simd_packed_int },
+[0x1c ... 0x1e] = { .simd_size = simd_packed_int, .two_op = 1 },
 [0x2a] = { .simd_size = simd_packed_int, .two_op = 1 },
 [0xf0] = { .two_op = 1 },
 [0xf1] = { .to_memory = 1, .two_op = 1 },
@@ -361,6 +363,7 @@ static const struct {
 uint8_t two_op:1;
 uint8_t four_op:1;
 } ext0f3a_table[256] = {
+[0x0f] = { .simd_size = simd_packed_int },
 [0xf0] = {},
 };
 
@@ -1422,6 +1425,7 @@ static bool vcpu_has(
 #define vcpu_has_sse() vcpu_has( 1, EDX, 25, ctxt, ops)
 #define vcpu_has_sse2()vcpu_has( 1, EDX, 26, ctxt, ops)
 #define vcpu_has_sse3()vcpu_has( 1, ECX,  0, ctxt, ops)
+#define vcpu_has_ssse3()   vcpu_has( 1, ECX,  9, ctxt, ops)
 #define vcpu_has_cx16()vcpu_has( 1, ECX, 13, ctxt, ops)
 #define vcpu_has_sse4_1()  vcpu_has( 1, ECX, 19, ctxt, ops)
 #define vcpu_has_sse4_2()  vcpu_has( 1, ECX, 20, ctxt, ops)
@@ -5916,6 +5920,21 @@ x86_emulate(
 simd_0f_int:
 if ( vex.opcx != vex_none )
 {
+case X86EMUL_OPC_VEX_66(0x0f38, 0x00): /* vpshufb {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x01): /* vphaddw {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x02): /* vphaddd {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x03): /* vphaddsw {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x04): /* vpmaddubsw {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x05): /* vphsubw {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x06): /* vphsubd {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x07): /* vphsubsw {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x08): /* vpsignb {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x09): /* vpsignw {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x0a): /* vpsignd {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x0b): /* vpmulhrsw {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x1c): /* vpabsb {x,y}mm/mem,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x1d): /* vpabsw {x,y}mm/mem,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x1e): /* vpabsd {x,y}mm/mem,{x,y}mm */
 if ( !vex.l )
 goto simd_0f_avx;
 host_and_vcpu_must_have(avx2);
@@ -6011,6 +6030,7 @@ x86_emulate(
 simd_0f_int_imm8:
 if ( vex.opcx != vex_none )
 {
+case X86EMUL_OPC_VEX_66(0x0f3a, 0x0f): /* vpalignr $imm8,{x,y}mm/mem,{x,y}mm,{x,y}mm */
 if ( vex.l )
 host_and_vcpu_must_have(avx2);
 else
@@ -6879,6 +6899,58 @@ x86_emulate(
 sfence = true;
 break;
 
+case X86EMUL_OPC(0x0f38, 0x00):/* pshufb mm/m64,mm */
+case X86EMUL_OPC_66(0x0f38, 0x00): /* pshufb xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0x01):/* phaddw mm/m64,mm */
+case X86EMUL_OPC_66(0x0f38, 0x01): /* phaddw xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0x02):/* phaddd mm/m64,mm */
+case X86EMUL_OPC_66(0x0f38, 0x02): /* phaddd xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0x03):/* phaddsw mm/m64,mm */
+case X86EMUL_OPC_66(0x0f38, 0x03): /* phaddsw xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0x04):/* pmaddubsw mm/m64,mm */
+case X86EMUL_OPC_66(0x0f38, 0x04): /* pmaddubsw xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0x05):/* phsubw mm/m64,mm */
+case X86EMUL_OPC_66(0x0f38, 0x05): /* phsubw xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0x06):/* phsubd mm/m64,mm */
+case X86EMUL_OPC_66(0x0f38, 0x06): /* phsubd xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0x07):/* phsubsw mm/m64,mm */
+case X86EMUL_OPC_66(0x0f38, 0x07): /* phsubsw xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0x08):/* psignb mm/m64,mm */
+case X86EMUL_OPC_66(0x0f38, 0x08): /* psignb xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0x09):/* psignw mm/m64,mm */
+case X86EMUL_OPC_66(0x0f38, 0x09): /* psignw xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0x0a):/* psignd mm/m64,mm */
+case X86EMUL_OPC_66(0x0f38, 0x0a): /* psignd xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0x0b):/* pmulhrsw mm/m64,mm */
+case X86EMUL_OPC_66(0x0f38, 0x0b): /* pmulhrsw xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0x1c):/* pabsb mm/m64,mm */
+case X86EMUL_OPC_66(0x0f38, 0x1c): /* pabsb xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0x1d):/* pabsw mm/m64,mm */
+case X86EMUL_OPC_66(0x0f38, 0x1d): /* pabsw xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0x1e):   

[Xen-devel] [PATCH v4 12/17] x86emul: support SSE4.1 insns

2017-02-28 Thread Jan Beulich
... and their AVX equivalents.

Signed-off-by: Jan Beulich 
---
v4: Or in ByteOp for {,v}pinsrb instead of assigning it (in
x86_decode_0f3a()). Correct case label for ptest. Add missing
copy_REX_VEX() to {,v}ptest handling. Add missing immediate bytes
to {,v}pextr* etc handling. dppd requires vex.l clear. Use
consistent approach for stub setup in code paths shared between
VEX- and non-VEX-encoded insns.
v3: New.

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -219,6 +219,13 @@ enum simd_opsize {
  */
 simd_single_fp,
 
+/*
+ * Scalar floating point:
+ * - 32 bits with low opcode bit clear (scalar single)
+ * - 64 bits with low opcode bit set (scalar double)
+ */
+simd_scalar_fp,
+
 /* Operand size encoded in non-standard way. */
 simd_other
 };
@@ -349,21 +356,45 @@ static const struct {
 uint8_t vsib:1;
 } ext0f38_table[256] = {
 [0x00 ... 0x0b] = { .simd_size = simd_packed_int },
+[0x10] = { .simd_size = simd_packed_int },
+[0x14 ... 0x15] = { .simd_size = simd_packed_fp },
+[0x17] = { .simd_size = simd_packed_int, .two_op = 1 },
 [0x1c ... 0x1e] = { .simd_size = simd_packed_int, .two_op = 1 },
+[0x20 ... 0x25] = { .simd_size = simd_other, .two_op = 1 },
+[0x28 ... 0x29] = { .simd_size = simd_packed_int },
 [0x2a] = { .simd_size = simd_packed_int, .two_op = 1 },
+[0x2b] = { .simd_size = simd_packed_int },
+[0x30 ... 0x35] = { .simd_size = simd_other, .two_op = 1 },
+[0x38 ... 0x3f] = { .simd_size = simd_packed_int },
+[0x40] = { .simd_size = simd_packed_int },
+[0x41] = { .simd_size = simd_packed_int, .two_op = 1 },
 [0xf0] = { .two_op = 1 },
 [0xf1] = { .to_memory = 1, .two_op = 1 },
 [0xf2 ... 0xf3] = {},
 [0xf5 ... 0xf7] = {},
 };
 
+/* Shift values between src and dst sizes of pmov{s,z}x{b,w,d}{w,d,q}. */
+static const uint8_t pmov_convert_delta[] = { 1, 2, 3, 1, 2, 1 };
+
 static const struct {
 uint8_t simd_size:5;
 uint8_t to_memory:1;
 uint8_t two_op:1;
 uint8_t four_op:1;
 } ext0f3a_table[256] = {
-[0x0f] = { .simd_size = simd_packed_int },
+[0x08 ... 0x09] = { .simd_size = simd_packed_fp, .two_op = 1 },
+[0x0a ... 0x0b] = { .simd_size = simd_scalar_fp },
+[0x0c ... 0x0d] = { .simd_size = simd_packed_fp },
+[0x0e ... 0x0f] = { .simd_size = simd_packed_int },
+[0x14 ... 0x17] = { .simd_size = simd_none, .to_memory = 1, .two_op = 1 },
+[0x20] = { .simd_size = simd_none },
+[0x21] = { .simd_size = simd_other },
+[0x22] = { .simd_size = simd_none },
+[0x40 ... 0x41] = { .simd_size = simd_packed_fp },
+[0x42] = { .simd_size = simd_packed_int },
+[0x4a ... 0x4b] = { .simd_size = simd_packed_fp, .four_op = 1 },
+[0x4c] = { .simd_size = simd_packed_int, .four_op = 1 },
 [0xf0] = {},
 };
 
@@ -2314,6 +2345,33 @@ x86_decode_0f38(
 }
 
 static int
+x86_decode_0f3a(
+struct x86_emulate_state *state,
+struct x86_emulate_ctxt *ctxt,
+const struct x86_emulate_ops *ops)
+{
+if ( !vex.opcx )
+ctxt->opcode |= MASK_INSR(vex.pfx, X86EMUL_OPC_PFX_MASK);
+
+switch ( ctxt->opcode & X86EMUL_OPC_MASK )
+{
+case X86EMUL_OPC_66(0, 0x20): /* pinsrb */
+case X86EMUL_OPC_VEX_66(0, 0x20): /* vpinsrb */
+state->desc = DstImplicit | SrcMem;
+if ( modrm_mod != 3 )
+state->desc |= ByteOp;
+break;
+
+case X86EMUL_OPC_66(0, 0x22): /* pinsr{d,q} */
+case X86EMUL_OPC_VEX_66(0, 0x22): /* vpinsr{d,q} */
+state->desc = DstImplicit | SrcMem;
+break;
+}
+
+return X86EMUL_OKAY;
+}
+
+static int
 x86_decode(
 struct x86_emulate_state *state,
 struct x86_emulate_ctxt *ctxt,
@@ -2801,8 +2859,7 @@ x86_decode(
 imm1 &= 0x7f;
 state->desc = d;
 state->simd_size = ext0f3a_table[b].simd_size;
-if ( !vex.opcx )
-ctxt->opcode |= MASK_INSR(vex.pfx, X86EMUL_OPC_PFX_MASK);
+rc = x86_decode_0f3a(state, ctxt, ops);
 break;
 
 case ext_8f08:
@@ -2866,6 +2923,10 @@ x86_decode(
 }
 break;
 
+case simd_scalar_fp:
+op_bytes = 4 << (ctxt->opcode & 1);
+break;
+
 default:
 op_bytes = 0;
 break;
@@ -5935,6 +5996,18 @@ x86_emulate(
 case X86EMUL_OPC_VEX_66(0x0f38, 0x1c): /* vpabsb {x,y}mm/mem,{x,y}mm */
 case X86EMUL_OPC_VEX_66(0x0f38, 0x1d): /* vpabsw {x,y}mm/mem,{x,y}mm */
 case X86EMUL_OPC_VEX_66(0x0f38, 0x1e): /* vpabsd {x,y}mm/mem,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x28): /* vpmuldq {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x29): /* vpcmpeqq {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x2b): /* vpackusdw {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x38): /* vpminsb {x,y}mm/mem,{x,y}mm,{x,y}mm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0x39): /* vpminsd {x,y}mm/mem,{x,y}

[Xen-devel] [PATCH v4 13/17] x86emul: support SSE4.2 insns

2017-02-28 Thread Jan Beulich
... and their AVX equivalents.

Signed-off-by: Jan Beulich 
---
v3: New.

--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -2542,6 +2542,149 @@ int main(int argc, char **argv)
 else
 printf("skipped\n");
 
+printf("%-40s", "Testing pcmpestri $0x1a,(%ecx),%xmm2...");
+if ( stack_exec && cpu_has_sse4_2 )
+{
+decl_insn(pcmpestri);
+
+memcpy(res, "abcdefgh\0\1\2\3\4\5\6\7", 16);
+asm volatile ( "movq %0, %%xmm2\n"
+   put_insn(pcmpestri, "pcmpestri $0b00011010, (%1), %%xmm2")
+   :: "m" (res[0]), "c" (NULL) );
+
+set_insn(pcmpestri);
+regs.eax = regs.edx = 12;
+regs.ecx = (unsigned long)res;
+regs.eflags = X86_EFLAGS_PF | X86_EFLAGS_AF |
+  X86_EFLAGS_IF | X86_EFLAGS_OF;
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(pcmpestri) ||
+ regs.ecx != 9 ||
+ (regs.eflags & X86_EFLAGS_ARITH_MASK) !=
+ (X86_EFLAGS_CF | X86_EFLAGS_ZF | X86_EFLAGS_SF) )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
+printf("%-40s", "Testing pcmpestrm $0x5a,(%ecx),%xmm2...");
+if ( stack_exec && cpu_has_sse4_2 )
+{
+decl_insn(pcmpestrm);
+
+asm volatile ( "movq %0, %%xmm2\n"
+   put_insn(pcmpestrm, "pcmpestrm $0b01011010, (%1), %%xmm2")
+   :: "m" (res[0]), "c" (NULL) );
+
+set_insn(pcmpestrm);
+regs.ecx = (unsigned long)res;
+regs.eflags = X86_EFLAGS_PF | X86_EFLAGS_AF |
+  X86_EFLAGS_IF | X86_EFLAGS_OF;
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(pcmpestrm) )
+goto fail;
+asm ( "pmovmskb %%xmm0, %0" : "=r" (rc) );
+if ( rc != 0x0e00 ||
+ (regs.eflags & X86_EFLAGS_ARITH_MASK) !=
+ (X86_EFLAGS_CF | X86_EFLAGS_ZF | X86_EFLAGS_SF) )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
+printf("%-40s", "Testing pcmpistri $0x1a,(%ecx),%xmm2...");
+if ( stack_exec && cpu_has_sse4_2 )
+{
+decl_insn(pcmpistri);
+
+asm volatile ( "movq %0, %%xmm2\n"
+   put_insn(pcmpistri, "pcmpistri $0b00011010, (%1), %%xmm2")
+   :: "m" (res[0]), "c" (NULL) );
+
+set_insn(pcmpistri);
+regs.eflags = X86_EFLAGS_CF | X86_EFLAGS_PF | X86_EFLAGS_AF |
+  X86_EFLAGS_IF | X86_EFLAGS_OF;
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(pcmpistri) ||
+ regs.ecx != 16 ||
+ (regs.eflags & X86_EFLAGS_ARITH_MASK) !=
+ (X86_EFLAGS_ZF | X86_EFLAGS_SF) )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
+printf("%-40s", "Testing pcmpistrm $0x4a,(%ecx),%xmm2...");
+if ( stack_exec && cpu_has_sse4_2 )
+{
+decl_insn(pcmpistrm);
+
+asm volatile ( "movq %0, %%xmm2\n"
+   put_insn(pcmpistrm, "pcmpistrm $0b01001010, (%1), %%xmm2")
+   :: "m" (res[0]), "c" (NULL) );
+
+set_insn(pcmpistrm);
+regs.ecx = (unsigned long)res;
+regs.eflags = X86_EFLAGS_PF | X86_EFLAGS_AF | X86_EFLAGS_IF;
+rc = x86_emulate(&ctxt, &emulops);
+if ( rc != X86EMUL_OKAY || !check_eip(pcmpistrm) )
+goto fail;
+asm ( "pmovmskb %%xmm0, %0" : "=r" (rc) );
+if ( rc != 0xffff ||
+(regs.eflags & X86_EFLAGS_ARITH_MASK) !=
+(X86_EFLAGS_CF | X86_EFLAGS_ZF | X86_EFLAGS_SF | X86_EFLAGS_OF) )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
+printf("%-40s", "Testing vpcmpestri $0x7a,(%esi),%xmm2...");
+if ( stack_exec && cpu_has_avx )
+{
+decl_insn(vpcmpestri);
+
+#ifdef __x86_64__
+/*
+ * gas up to at least 2.27 doesn't honor explicit "rex.w" for
+ * VEX/EVEX encoded instructions, and also doesn't provide any
+ * other means to control VEX.W.
+ */
+asm volatile ( "movq %0, %%xmm2\n"
+   put_insn(vpcmpestri,
+".byte 0xC4, 0xE3, 0xF9, 0x61, 0x16, 0x7A")
+   :: "m" (res[0]) );
+#else
+asm volatile ( "movq %0, %%xmm2\n"
+   put_insn(vpcmpestri,
+"vpcmpestri $0b0010, (%1), %%xmm2")
+   :: "m" (res[0]), "S" (NULL) );
+#endif
+
+set_insn(vpcmpestri);
+#ifdef __x86_64__
+regs.rax = ~0U + 1UL;
+regs.rcx = ~0UL;
+#else
+regs.eax = 0x7fff;
+#endif
+regs.esi = (unsigned long)res;
+regs.eflags = X86_EFLAGS_PF | X86_EFLAGS_AF | X86_EFLAGS_SF |
+  

[Xen-devel] [PATCH v4 14/17] x86emul: test coverage for SSE3/SSSE3/SSE4* insns

2017-02-28 Thread Jan Beulich
... and their AVX equivalents. Note that a few instructions aren't
covered (yet), but those all fall into common pattern groups, so I
would hope that for now we can do with what is there.

Just like for SSE/SSE2, MMX insns aren't being covered at all, as
they're not easy to deal with: The compiler refuses to emit such for
other than uses of built-in functions.

Signed-off-by: Jan Beulich 
---
v4: New.

--- a/tools/tests/x86_emulator/Makefile
+++ b/tools/tests/x86_emulator/Makefile
@@ -22,24 +22,31 @@ sse-flts := 4
 sse2-vecs := $(sse-vecs)
 sse2-ints := 1 2 4 8
 sse2-flts := 4 8
+sse4-vecs := $(sse2-vecs)
+sse4-ints := $(sse2-ints)
+sse4-flts := $(sse2-flts)
 
 # When converting SSE to AVX, have the compiler avoid XMM0 to widen
-# coverage of the VEX.vvvv checks in the emulator.
+# coverage of the VEX.vvvv checks in the emulator. We must not do this,
+# coverage of the VEX. checks in the emulator. We must not do this,
+# however, for SSE4.1 and later, as there are instructions with XMM0 as
+# an implicit operand.
+sse2avx-sse  := -ffixed-xmm0 -Wa,-msse2avx
+sse2avx-sse2 := $(sse2avx-sse)
+sse2avx-sse4 := -Wa,-msse2avx
 
-simd-cflags := $(foreach flavor,sse sse2, \
+simd-cflags := $(foreach flavor,sse sse2 sse4, \
  $(foreach vec,$($(flavor)-vecs), \
$(foreach int,$($(flavor)-ints), \
  "-D$(flavor)_$(vec)i$(int) -m$(flavor) -O2 -DVEC_SIZE=$(vec) -DINT_SIZE=$(int)" \
  "-D$(flavor)_$(vec)u$(int) -m$(flavor) -O2 -DVEC_SIZE=$(vec) -DUINT_SIZE=$(int)" \
- "-D$(flavor)_avx_$(vec)i$(int) -m$(flavor) $(sse2avx) -O2 -DVEC_SIZE=$(vec) -DINT_SIZE=$(int)" \
- "-D$(flavor)_avx_$(vec)u$(int) -m$(flavor) $(sse2avx) -O2 -DVEC_SIZE=$(vec) -DUINT_SIZE=$(int)") \
+ "-D$(flavor)_avx_$(vec)i$(int) -m$(flavor) $(sse2avx-$(flavor)) -O2 -DVEC_SIZE=$(vec) -DINT_SIZE=$(int)" \
+ "-D$(flavor)_avx_$(vec)u$(int) -m$(flavor) $(sse2avx-$(flavor)) -O2 -DVEC_SIZE=$(vec) -DUINT_SIZE=$(int)") \
$(foreach flt,$($(flavor)-flts), \
  "-D$(flavor)_$(vec)f$(flt) -m$(flavor) -O2 -DVEC_SIZE=$(vec) -DFLOAT_SIZE=$(flt)" \
- "-D$(flavor)_avx_$(vec)f$(flt) -m$(flavor) $(sse2avx) -O2 -DVEC_SIZE=$(vec) -DFLOAT_SIZE=$(flt)")) \
+ "-D$(flavor)_avx_$(vec)f$(flt) -m$(flavor) $(sse2avx-$(flavor)) -O2 -DVEC_SIZE=$(vec) -DFLOAT_SIZE=$(flt)")) \
  $(foreach flt,$($(flavor)-flts), \
    "-D$(flavor)_f$(flt) -m$(flavor) -mfpmath=sse -O2 -DFLOAT_SIZE=$(flt)" \
-   "-D$(flavor)_avx_f$(flt) -m$(flavor) -mfpmath=sse $(sse2avx) -O2 -DFLOAT_SIZE=$(flt)"))
+   "-D$(flavor)_avx_f$(flt) -m$(flavor) -mfpmath=sse $(sse2avx-$(flavor)) -O2 -DFLOAT_SIZE=$(flt)"))
 
 $(addsuffix .h,$(TESTCASES)): %.h: %.c testcase.mk Makefile
rm -f $@.new $*.bin
--- a/tools/tests/x86_emulator/simd.c
+++ b/tools/tests/x86_emulator/simd.c
@@ -70,7 +70,9 @@ typedef long long __attribute__((vector_
 #if VEC_SIZE == 8 && defined(__SSE__)
 # define to_bool(cmp) (__builtin_ia32_pmovmskb(cmp) == 0xff)
 #elif VEC_SIZE == 16
-# if defined(__SSE__) && ELEM_SIZE == 4
+# if defined(__SSE4_1__)
+#  define to_bool(cmp) __builtin_ia32_ptestc128(cmp, (vdi_t){} == 0)
+# elif defined(__SSE__) && ELEM_SIZE == 4
 #  define to_bool(cmp) (__builtin_ia32_movmskps(cmp) == 0xf)
 # elif defined(__SSE2__)
 #  if ELEM_SIZE == 8
@@ -182,9 +184,122 @@ static inline bool _to_bool(byte_vec_t b
 __builtin_ia32_maskmovdqu((vqi_t)(y), ~m_, d_); \
 })
 #endif
+#if VEC_SIZE == 16 && defined(__SSE3__)
+# if FLOAT_SIZE == 4
+#  define addsub(x, y) __builtin_ia32_addsubps(x, y)
+#  define dup_hi(x) __builtin_ia32_movshdup(x)
+#  define dup_lo(x) __builtin_ia32_movsldup(x)
+#  define hadd(x, y) __builtin_ia32_haddps(x, y)
+#  define hsub(x, y) __builtin_ia32_hsubps(x, y)
+# elif FLOAT_SIZE == 8
+#  define addsub(x, y) __builtin_ia32_addsubpd(x, y)
+#  define dup_lo(x) ({ \
+double __attribute__((vector_size(16))) r_; \
+asm ( "movddup %1,%0" : "=x" (r_) : "m" ((x)[0]) ); \
+r_; \
+})
+#  define hadd(x, y) __builtin_ia32_haddpd(x, y)
+#  define hsub(x, y) __builtin_ia32_hsubpd(x, y)
+# endif
+#endif
+#if VEC_SIZE == 16 && defined(__SSSE3__)
+# if INT_SIZE == 1
+#  define abs(x) ((vec_t)__builtin_ia32_pabsb128((vqi_t)(x)))
+# elif INT_SIZE == 2
+#  define abs(x) __builtin_ia32_pabsw128(x)
+# elif INT_SIZE == 4
+#  define abs(x) __builtin_ia32_pabsd128(x)
+# endif
+# if INT_SIZE == 1 || UINT_SIZE == 1
+#  define copysignz(x, y) ((vec_t)__builtin_ia32_psignb128((vqi_t)(x), 
(vqi_t)(y)))
+#  define swap(x) ((vec_t)__builtin_ia32_pshufb128((vqi_t)(x), (vqi_t)(inv - 
1)))
+#  define rotr(x, n) ((vec_t)__builtin_ia32_palignr128((vdi_t)(x), (vdi_t)(x), 
(n) * 8))
+# elif INT_SIZE == 2 || UINT_SIZE == 2
+#  define copysignz(x, y) ((vec_t)__builtin_ia32_psignw128((vhi_t)(x), 
(vhi_t)(y)))
+#  define hadd(x, y) ((vec_t)__builtin_ia32_phaddw128((v

[Xen-devel] [PATCH v4 15/17] x86emul: support PCLMULQDQ

2017-02-28 Thread Jan Beulich
... and its AVX equivalent.

Signed-off-by: Jan Beulich 
---
v3: New.

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -393,6 +393,7 @@ static const struct {
 [0x22] = { .simd_size = simd_none },
 [0x40 ... 0x41] = { .simd_size = simd_packed_fp },
 [0x42] = { .simd_size = simd_packed_int },
+[0x44] = { .simd_size = simd_packed_int },
 [0x4a ... 0x4b] = { .simd_size = simd_packed_fp, .four_op = 1 },
 [0x4c] = { .simd_size = simd_packed_int, .four_op = 1 },
 [0x60 ... 0x63] = { .simd_size = simd_packed_int, .two_op = 1 },
@@ -1457,6 +1458,7 @@ static bool vcpu_has(
 #define vcpu_has_sse() vcpu_has( 1, EDX, 25, ctxt, ops)
 #define vcpu_has_sse2()vcpu_has( 1, EDX, 26, ctxt, ops)
 #define vcpu_has_sse3()vcpu_has( 1, ECX,  0, ctxt, ops)
+#define vcpu_has_pclmulqdq()   vcpu_has( 1, ECX,  1, ctxt, ops)
 #define vcpu_has_ssse3()   vcpu_has( 1, ECX,  9, ctxt, ops)
 #define vcpu_has_cx16()vcpu_has( 1, ECX, 13, ctxt, ops)
 #define vcpu_has_sse4_1()  vcpu_has( 1, ECX, 19, ctxt, ops)
@@ -7434,6 +7436,14 @@ x86_emulate(
 generate_exception_if(vex.l, EXC_UD);
 goto simd_0f_imm8_avx;
 
+case X86EMUL_OPC_66(0x0f3a, 0x44): /* pclmulqdq $imm8,xmm/m128,xmm */
+case X86EMUL_OPC_VEX_66(0x0f3a, 0x44): /* vpclmulqdq 
$imm8,xmm/m128,xmm,xmm */
+host_and_vcpu_must_have(pclmulqdq);
+if ( vex.opcx == vex_none )
+goto simd_0f3a_common;
+generate_exception_if(vex.l, EXC_UD);
+goto simd_0f_imm8_avx;
+
 case X86EMUL_OPC_VEX_66(0x0f3a, 0x4a): /* vblendvps 
{x,y}mm,{x,y}mm/mem,{x,y}mm,{x,y}mm */
 case X86EMUL_OPC_VEX_66(0x0f3a, 0x4b): /* vblendvpd 
{x,y}mm,{x,y}mm/mem,{x,y}mm,{x,y}mm */
 generate_exception_if(vex.w, EXC_UD);
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -42,6 +42,7 @@
 #define cpu_has_ssse3  boot_cpu_has(X86_FEATURE_SSSE3)
 #define cpu_has_sse4_1 boot_cpu_has(X86_FEATURE_SSE4_1)
 #define cpu_has_sse4_2 boot_cpu_has(X86_FEATURE_SSE4_2)
+#define cpu_has_pclmulqdq  boot_cpu_has(X86_FEATURE_PCLMULQDQ)
 #define cpu_has_popcnt boot_cpu_has(X86_FEATURE_POPCNT)
 #define cpu_has_htt boot_cpu_has(X86_FEATURE_HTT)
 #define cpu_has_nx boot_cpu_has(X86_FEATURE_NX)



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [PATCH v4 16/17] x86emul: support AESNI insns

2017-02-28 Thread Jan Beulich
... and their AVX equivalents.

Signed-off-by: Jan Beulich 
---
v3: New.

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -368,6 +368,8 @@ static const struct {
 [0x37 ... 0x3f] = { .simd_size = simd_packed_int },
 [0x40] = { .simd_size = simd_packed_int },
 [0x41] = { .simd_size = simd_packed_int, .two_op = 1 },
+[0xdb] = { .simd_size = simd_packed_int, .two_op = 1 },
+[0xdc ... 0xdf] = { .simd_size = simd_packed_int },
 [0xf0] = { .two_op = 1 },
 [0xf1] = { .to_memory = 1, .two_op = 1 },
 [0xf2 ... 0xf3] = {},
@@ -397,6 +399,7 @@ static const struct {
 [0x4a ... 0x4b] = { .simd_size = simd_packed_fp, .four_op = 1 },
 [0x4c] = { .simd_size = simd_packed_int, .four_op = 1 },
 [0x60 ... 0x63] = { .simd_size = simd_packed_int, .two_op = 1 },
+[0xdf] = { .simd_size = simd_packed_int, .two_op = 1 },
 [0xf0] = {},
 };
 
@@ -1465,6 +1468,7 @@ static bool vcpu_has(
 #define vcpu_has_sse4_2()  vcpu_has( 1, ECX, 20, ctxt, ops)
 #define vcpu_has_movbe()   vcpu_has( 1, ECX, 22, ctxt, ops)
 #define vcpu_has_popcnt()  vcpu_has( 1, ECX, 23, ctxt, ops)
+#define vcpu_has_aesni()   vcpu_has( 1, ECX, 25, ctxt, ops)
 #define vcpu_has_avx() vcpu_has( 1, ECX, 28, ctxt, ops)
 #define vcpu_has_rdrand()  vcpu_has( 1, ECX, 30, ctxt, ops)
 #define vcpu_has_mmxext() (vcpu_has(0x8001, EDX, 22, ctxt, ops) || \
@@ -7155,6 +7159,22 @@ x86_emulate(
 host_and_vcpu_must_have(sse4_2);
 goto simd_0f38_common;
 
+case X86EMUL_OPC_66(0x0f38, 0xdb): /* aesimc xmm/m128,xmm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0xdb): /* vaesimc xmm/m128,xmm */
+case X86EMUL_OPC_66(0x0f38, 0xdc): /* aesenc xmm/m128,xmm,xmm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0xdc): /* vaesenc xmm/m128,xmm,xmm */
+case X86EMUL_OPC_66(0x0f38, 0xdd): /* aesenclast xmm/m128,xmm,xmm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0xdd): /* vaesenclast xmm/m128,xmm,xmm */
+case X86EMUL_OPC_66(0x0f38, 0xde): /* aesdec xmm/m128,xmm,xmm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0xde): /* vaesdec xmm/m128,xmm,xmm */
+case X86EMUL_OPC_66(0x0f38, 0xdf): /* aesdeclast xmm/m128,xmm,xmm */
+case X86EMUL_OPC_VEX_66(0x0f38, 0xdf): /* vaesdeclast xmm/m128,xmm,xmm */
+host_and_vcpu_must_have(aesni);
+if ( vex.opcx == vex_none )
+goto simd_0f38_common;
+generate_exception_if(vex.l, EXC_UD);
+goto simd_0f_avx;
+
 case X86EMUL_OPC(0x0f38, 0xf0): /* movbe m,r */
 case X86EMUL_OPC(0x0f38, 0xf1): /* movbe r,m */
 vcpu_must_have(movbe);
@@ -7510,6 +7530,14 @@ x86_emulate(
 dst.type = OP_NONE;
 break;
 
+case X86EMUL_OPC_66(0x0f3a, 0xdf): /* aeskeygenassist 
$imm8,xmm/m128,xmm */
+case X86EMUL_OPC_VEX_66(0x0f3a, 0xdf): /* vaeskeygenassist 
$imm8,xmm/m128,xmm */
+host_and_vcpu_must_have(aesni);
+if ( vex.opcx == vex_none )
+goto simd_0f3a_common;
+generate_exception_if(vex.l, EXC_UD);
+goto simd_0f_imm8_avx;
+
 case X86EMUL_OPC_VEX_F2(0x0f3a, 0xf0): /* rorx imm,r/m,r */
 vcpu_must_have(bmi2);
 generate_exception_if(vex.l || vex.reg != 0xf, EXC_UD);
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -44,6 +44,7 @@
 #define cpu_has_sse4_2 boot_cpu_has(X86_FEATURE_SSE4_2)
 #define cpu_has_pclmulqdq  boot_cpu_has(X86_FEATURE_PCLMULQDQ)
 #define cpu_has_popcnt boot_cpu_has(X86_FEATURE_POPCNT)
+#define cpu_has_aesni  boot_cpu_has(X86_FEATURE_AESNI)
 #define cpu_has_htt boot_cpu_has(X86_FEATURE_HTT)
 #define cpu_has_nx boot_cpu_has(X86_FEATURE_NX)
 #define cpu_has_clflush boot_cpu_has(X86_FEATURE_CLFLUSH)




[Xen-devel] [PATCH v4 17/17] x86emul: support SHA insns

2017-02-28 Thread Jan Beulich
Signed-off-by: Jan Beulich 
---
v3: New.

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -368,6 +368,7 @@ static const struct {
 [0x37 ... 0x3f] = { .simd_size = simd_packed_int },
 [0x40] = { .simd_size = simd_packed_int },
 [0x41] = { .simd_size = simd_packed_int, .two_op = 1 },
+[0xc8 ... 0xcd] = { .simd_size = simd_other },
 [0xdb] = { .simd_size = simd_packed_int, .two_op = 1 },
 [0xdc ... 0xdf] = { .simd_size = simd_packed_int },
 [0xf0] = { .two_op = 1 },
@@ -399,6 +400,7 @@ static const struct {
 [0x4a ... 0x4b] = { .simd_size = simd_packed_fp, .four_op = 1 },
 [0x4c] = { .simd_size = simd_packed_int, .four_op = 1 },
 [0x60 ... 0x63] = { .simd_size = simd_packed_int, .two_op = 1 },
+[0xcc] = { .simd_size = simd_other },
 [0xdf] = { .simd_size = simd_packed_int, .two_op = 1 },
 [0xf0] = {},
 };
@@ -1490,6 +1492,7 @@ static bool vcpu_has(
 #define vcpu_has_smap()vcpu_has( 7, EBX, 20, ctxt, ops)
 #define vcpu_has_clflushopt()  vcpu_has( 7, EBX, 23, ctxt, ops)
 #define vcpu_has_clwb()vcpu_has( 7, EBX, 24, ctxt, ops)
+#define vcpu_has_sha() vcpu_has( 7, EBX, 29, ctxt, ops)
 #define vcpu_has_rdpid()   vcpu_has( 7, ECX, 22, ctxt, ops)
 
 #define vcpu_must_have(feat) \
@@ -7159,6 +7162,16 @@ x86_emulate(
 host_and_vcpu_must_have(sse4_2);
 goto simd_0f38_common;
 
+case X86EMUL_OPC(0x0f38, 0xc8): /* sha1nexte xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0xc9): /* sha1msg1 xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0xca): /* sha1msg2 xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0xcb): /* sha256rnds2 XMM0,xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0xcc): /* sha256msg1 xmm/m128,xmm */
+case X86EMUL_OPC(0x0f38, 0xcd): /* sha256msg2 xmm/m128,xmm */
+host_and_vcpu_must_have(sha);
+op_bytes = 16;
+goto simd_0f38_common;
+
 case X86EMUL_OPC_66(0x0f38, 0xdb): /* aesimc xmm/m128,xmm */
 case X86EMUL_OPC_VEX_66(0x0f38, 0xdb): /* vaesimc xmm/m128,xmm */
 case X86EMUL_OPC_66(0x0f38, 0xdc): /* aesenc xmm/m128,xmm,xmm */
@@ -7530,6 +7543,11 @@ x86_emulate(
 dst.type = OP_NONE;
 break;
 
+case X86EMUL_OPC(0x0f3a, 0xcc): /* sha1rnds4 $imm8,xmm/m128,xmm */
+host_and_vcpu_must_have(sha);
+op_bytes = 16;
+goto simd_0f3a_common;
+
 case X86EMUL_OPC_66(0x0f3a, 0xdf): /* aeskeygenassist 
$imm8,xmm/m128,xmm */
 case X86EMUL_OPC_VEX_66(0x0f3a, 0xdf): /* vaeskeygenassist 
$imm8,xmm/m128,xmm */
 host_and_vcpu_must_have(aesni);
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -85,6 +85,7 @@
 #define cpu_has_sse4a  boot_cpu_has(X86_FEATURE_SSE4A)
 #define cpu_has_tbm boot_cpu_has(X86_FEATURE_TBM)
 #define cpu_has_itsc   boot_cpu_has(X86_FEATURE_ITSC)
+#define cpu_has_sha boot_cpu_has(X86_FEATURE_SHA)
 
 enum _cache_type {
 CACHE_TYPE_NULL = 0,




Re: [Xen-devel] [PATCH 2/4] tools: add pkg-config file for libxc

2017-02-28 Thread Ian Jackson
Juergen Gross writes ("Re: [PATCH 2/4] tools: add pkg-config file for libxc"):
> On 28/02/17 12:13, Ian Jackson wrote:
> > I worry though that we're breaking new ground.  Did you come up with
> > this idea yourself ?
> 
> Yes.

How exciting.

> > Are you aware of other projects that do something similar ?
> 
> No.

Well, anyway, my first impression is that this is a neat idea.  Would
you mind if I slept on it another day ?  If I can't think of a reason
not to do this by tomorrow I will ack your patch :-).

> TBH: did you have a look at qemu's configure script and how Xen support
> is handled there? I'm really astonished they accept such hackery.

I dread to look.

> And I don't see why it would be wrong to use the same basic mechanism
> for our internal build as for any "normal" build outside of the Xen
> build environment.
> 
> The next step would be to add the other Xen libraries and their
> dependencies in order to get rid of all the additional include and
> library directories when calling configure for qemu.

Indeed.

Ian.



Re: [Xen-devel] [PATCH 1/4] interface: avoid redefinition of __XEN_INTERFACE_VERSION__

2017-02-28 Thread Juergen Gross
On 28/02/17 13:46, Jan Beulich wrote:
 On 28.02.17 at 13:24,  wrote:
>> On 28/02/17 12:11, Jan Beulich wrote:
>> On 28.02.17 at 11:34,  wrote:
 In stubdom environment __XEN_INTERFACE_VERSION__ is sometimes defined
 on the command line of the build instruction. This conflicts with
 xen-compat.h defining it unconditionally if __XEN__ or __XEN_TOOLS__
 is set.
>>>
>>> Then that's what wants fixing. In fact it's questionable whether
>>> __XEN_TOOLS__ (or even __XEN__) getting defined there is
>>> appropriate.
>>
>> There are multiple libraries from the tools directory being compiled
>> for stubdoms.
> 
> Each of which should get appropriate settings.
> 
 Just use #undef in this case to avoid the resulting warning.
>>>
>>> I think the lack of a warning in case of a collision is worse here.
>>> People should simply not define both the version symbol and
>>> either of __XEN__ or __XEN_TOOLS__.
>>
>> Would you be okay with:
>>
>> #if defined(__XEN_INTERFACE_VERSION__)
>>   #if __XEN_INTERFACE_VERSION__ != __XEN_LATEST_INTERFACE_VERSION__
>> #error ...
>>   #endif
>> #else
>>   #define __XEN_INTERFACE_VERSION__ __XEN_LATEST_INTERFACE_VERSION__
>> #endif
> 
> Well - see Ian's reply. If the values match (granted textually rather
> than by value), there should be no compiler warning in the first
> place.

Hmm, maybe this is the problem: the value from the command line is
(textually) __XEN_LATEST_INTERFACE_VERSION__ while the value from the
#define is the _value_ of __XEN_LATEST_INTERFACE_VERSION__ due to the
pre-processor having replaced it already.

In case this makes sense, my suggestion seems to be appropriate, no?


Juergen




Re: [Xen-devel] [PATCH 1/4] interface: avoid redefinition of __XEN_INTERFACE_VERSION__

2017-02-28 Thread Ian Jackson
Juergen Gross writes ("Re: [Xen-devel] [PATCH 1/4] interface: avoid 
redefinition of __XEN_INTERFACE_VERSION__"):
> Hmm, maybe this is the problem: the value from the command line is
> (textually) __XEN_LATEST_INTERFACE_VERSION__ while the value from the
> #define is the _value_ of __XEN_LATEST_INTERFACE_VERSION__ due to the
> pre-processor having replaced it already.
> 
> In case this makes sense, my suggestion seems to be appropriate, no?

Maybe.  Another possibility would be to contrive to #define
__XEN_INTERFACE_VERSION__ before __XEN_LATEST_INTERFACE_VERSION__.
Then the second substitution would occur later.

Ian.



Re: [Xen-devel] [PATCH 3/4] tools: use a dedicated build directory for qemu

2017-02-28 Thread Ian Jackson
Juergen Gross writes ("Re: [PATCH 3/4] tools: use a dedicated build directory 
for qemu"):
> On 28/02/17 12:13, Ian Jackson wrote:
> > ... why ?
> 
> Hmm, yes, I should have added the reason to the commit description:
> 
> This way I can use the same source directory of qemu for my effort to
> configure and build qemu upstream in a stubdom environment.

Great.  Can you put that (or something like it) in the commit message ?

With that change,

Acked-by: Ian Jackson 



Re: [Xen-devel] [PATCH 3/4] tools: use a dedicated build directory for qemu

2017-02-28 Thread Juergen Gross
On 28/02/17 14:14, Ian Jackson wrote:
> Juergen Gross writes ("Re: [PATCH 3/4] tools: use a dedicated build directory 
> for qemu"):
>> On 28/02/17 12:13, Ian Jackson wrote:
>>> ... why ?
>>
>> Hmm, yes, I should have added the reason to the commit description:
>>
>> This way I can use the same source directory of qemu for my effort to
>> configure and build qemu upstream in a stubdom environment.
> 
> Great.  Can you put that (or something like it) in the commit message ?

Yes, I think I can manage that. :-)

> 
> With that change,
> 
> Acked-by: Ian Jackson 
> 

Thanks,

Juergen



[Xen-devel] [xen-unstable test] 106218: regressions - trouble: blocked/broken/fail/pass

2017-02-28 Thread osstest service owner
flight 106218 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106218/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-debianhvm-amd64 8 leak-check/basis(8) fail REGR. vs. 
105933
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 6 xen-boot fail REGR. vs. 
105933
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  6 xen-boot fail REGR. vs. 105933
 test-amd64-i386-libvirt   8 leak-check/basis(8)  fail REGR. vs. 105933
 test-amd64-i386-xl-qemut-winxpsp3  6 xen-bootfail REGR. vs. 105933
 test-amd64-i386-xl-raw6 xen-boot fail REGR. vs. 105933
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 6 xen-boot fail REGR. vs. 
105933
 test-amd64-i386-freebsd10-amd64  6 xen-boot  fail REGR. vs. 105933
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  6 xen-boot fail REGR. vs. 105933
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  6 xen-boot fail REGR. vs. 105933
 test-amd64-i386-xl6 xen-boot fail REGR. vs. 105933
 test-amd64-i386-xl-qemuu-winxpsp3  8 leak-check/basis(8) fail REGR. vs. 105933
 test-amd64-i386-pair  9 xen-boot/src_hostfail REGR. vs. 105933
 test-amd64-i386-pair 10 xen-boot/dst_hostfail REGR. vs. 105933
 test-amd64-i386-xl-qemuu-ovmf-amd64  6 xen-boot  fail REGR. vs. 105933
 test-amd64-i386-migrupgrade  10 xen-boot/dst_hostfail REGR. vs. 105933
 test-amd64-i386-freebsd10-i386  9 freebsd-installfail REGR. vs. 105933
 test-amd64-i386-libvirt-pair  9 xen-boot/src_hostfail REGR. vs. 105933
 test-amd64-i386-libvirt-pair 10 xen-boot/dst_hostfail REGR. vs. 105933
 test-amd64-i386-qemut-rhel6hvm-intel  6 xen-boot fail REGR. vs. 105933
 test-armhf-armhf-libvirt  9 debian-install   fail REGR. vs. 105933

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-xsm 13 saverestore-support-checkfail  like 105933
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stopfail like 105933
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 105933
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 105933
 test-armhf-armhf-libvirt-raw 12 saverestore-support-checkfail  like 105933
 test-amd64-amd64-xl-rtds  9 debian-install   fail  like 105933
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 105933

Tests which did not succeed, but are not blocking:
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-rtds  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 build-arm64   5 xen-buildfail   never pass
 build-arm64-xsm   5 xen-buildfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 build-arm64-pvops 5 kernel-build fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check

[Xen-devel] [PATCH 0/8] x86: switch away from temporary 32-bit register names

2017-02-28 Thread Jan Beulich
This is only part of the necessary changes. Some needed to be
dropped due to code having changed recently, and the biggest
missing part is the adjustment of the insn emulator, where I'd
prefer to do this work only after the non-RFC parts of
https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg03474.html
have gone in (in order to avoid having to ping-pong re-base
that and this series).

1: re-introduce non-underscore prefixed 32-bit register names
2: switch away from temporary 32-bit register names
3: HVM: switch away from temporary 32-bit register names
4: HVMemul: switch away from temporary 32-bit register names
5: mm: switch away from temporary 32-bit register names
6: SVM: switch away from temporary 32-bit register names
7: Viridian: switch away from temporary 32-bit register names
8: VMX: switch away from temporary 32-bit register names

Signed-off-by: Jan Beulich 




Re: [Xen-devel] Next Xen ARM community call

2017-02-28 Thread Julien Grall

Hi,

I haven't seen any complaints about the time suggested by Stefano. So the
Community call will be at 5pm UTC tomorrow (Wednesday 1st March).


Regarding the agenda, the topics I have so far are:
- PV protocols
- IRQ latency

Do we have any other topics to discuss?

For joining the call, please use either:

Call +44 1223 406065 (Local dial in)
and enter the access code below followed by the # key.
Participant code: 4915191

Mobile Auto Dial:
VoIP: voip://+441223406065;4915191#
iOS devices: +44 1223 406065,4915191 and press #
Other devices: +44 1223 406065x4915191#

Additional Calling Information:

UK +44 1142828002
US CA +1 4085761502
US TX +1 5123141073
JP +81 453455355
DE +49 8945604050
NO +47 73187518
SE +46 46313131
FR +33 497235101
TW +886 35657119
HU +36 13275600
IE +353 91337900

Toll Free

UK 0800 1412084
US +1 8668801148
CN +86 4006782367
IN 0008009868365
IN +918049282778
TW 08000 22065
HU 0680981587
IE 1800800022
KF +972732558877

Regards,


On 17/02/17 15:00, Julien Grall wrote:

Hi Stefano,

On 16/02/17 18:40, Stefano Stabellini wrote:

On Thu, 16 Feb 2017, Julien Grall wrote:

Hello,

The last two community calls went really good and I am suggesting to
have a
new one on Wednesday 1st March at 4pm UTC. Any opinions?


Is it possible to change the time to 5pm?


I am fine with either time.





Also, do you have any specific topic you would like to talk during
the next
call?


I would like to discuss progress on PV protocols and IRQ latency.


I will add them in the agenda.

Cheers,



--
Julien Grall



Re: [Xen-devel] [PATCH 1/4] interface: avoid redefinition of __XEN_INTERFACE_VERSION__

2017-02-28 Thread Jan Beulich
>>> On 28.02.17 at 14:10,  wrote:
> On 28/02/17 13:46, Jan Beulich wrote:
> On 28.02.17 at 13:24,  wrote:
>>> On 28/02/17 12:11, Jan Beulich wrote:
>>> On 28.02.17 at 11:34,  wrote:
> In stubdom environment __XEN_INTERFACE_VERSION__ is sometimes defined
> on the command line of the build instruction. This conflicts with
> xen-compat.h defining it unconditionally if __XEN__ or __XEN_TOOLS__
> is set.

 Then that's what wants fixing. In fact it's questionable whether
 __XEN_TOOLS__ (or even __XEN__) getting defined there is
 appropriate.
>>>
>>> There are multiple libraries from the tools directory being compiled
>>> for stubdoms.
>> 
>> Each of which should get appropriate settings.
>> 
> Just use #undef in this case to avoid the resulting warning.

 I think the lack of a warning in case of a collision is worse here.
 People should simply not define both the version symbol and
 either of __XEN__ or __XEN_TOOLS__.
>>>
>>> Would you be okay with:
>>>
>>> #if defined(__XEN_INTERFACE_VERSION__)
>>>   #if __XEN_INTERFACE_VERSION__ != __XEN_LATEST_INTERFACE_VERSION__
>>> #error ...
>>>   #endif
>>> #else
>>>   #define __XEN_INTERFACE_VERSION__ __XEN_LATEST_INTERFACE_VERSION__
>>> #endif
>> 
>> Well - see Ian's reply. If the values match (granted textually rather
>> than by value), there should be no compiler warning in the first
>> place.
> 
> Hmm, maybe this is the problem: the value from the command line is
> (textually) __XEN_LATEST_INTERFACE_VERSION__ while the value from the
> #define is the _value_ of __XEN_LATEST_INTERFACE_VERSION__ due to the
> pre-processor having replaced it already.

No, that replacement happens after having expanded
__XEN_INTERFACE_VERSION__ at use sites, not when
#define-ing it.

Jan




[Xen-devel] [PATCH 1/8] x86: re-introduce non-underscore prefixed 32-bit register names

2017-02-28 Thread Jan Beulich
For a transitional period (until we've managed to replace all
underscore prefixed instances), allow both names to co-exist.

Signed-off-by: Jan Beulich 

--- a/xen/include/public/arch-x86/xen-x86_64.h
+++ b/xen/include/public/arch-x86/xen-x86_64.h
@@ -134,7 +134,7 @@ struct iret_context {
 /* Anonymous unions include all permissible names (e.g., al/ah/ax/eax/rax). */
 #define __DECL_REG_LOHI(which) union { \
 uint64_t r ## which ## x; \
-uint32_t _e ## which ## x; \
+uint32_t e ## which ## x, _e ## which ## x; \
 uint16_t which ## x; \
 struct { \
 uint8_t which ## l; \
@@ -143,13 +143,13 @@ struct iret_context {
 }
 #define __DECL_REG_LO8(name) union { \
 uint64_t r ## name; \
-uint32_t _e ## name; \
+uint32_t e ## name, _e ## name; \
 uint16_t name; \
 uint8_t name ## l; \
 }
 #define __DECL_REG_LO16(name) union { \
 uint64_t r ## name; \
-uint32_t _e ## name; \
+uint32_t e ## name, _e ## name; \
 uint16_t name; \
 }
 #define __DECL_REG_HI(num) union { \
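A minimal sketch of the layout these macros produce: on x86 (little endian) both the reintroduced plain name and the transitional underscore-prefixed name view the same low 32 bits of the 64-bit register. The macro below is a simplified stand-in (8-bit sub-registers omitted), not the exact Xen definition:

```c
#include <assert.h>
#include <stdint.h>

#define DECL_REG_LOHI(which) union {                \
    uint64_t r ## which ## x;                       \
    uint32_t e ## which ## x, _e ## which ## x;     \
    uint16_t which ## x;                            \
}

struct regs {
    DECL_REG_LOHI(a);   /* C11 anonymous union: members are accessed directly */
};

static void demo(void)
{
    struct regs r;

    r.rax = 0x1122334455667788ULL;
    assert(r.eax  == 0x55667788u);  /* low 32 bits on little-endian x86 */
    assert(r._eax == r.eax);        /* old and new names alias the same storage */
    assert(r.ax   == 0x7788u);
}
```

This is what lets the tree migrate callers from `_eax` to `eax` incrementally: both names stay valid until the last underscore-prefixed user is gone.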





[Xen-devel] [PATCH 2/8] x86: switch away from temporary 32-bit register names

2017-02-28 Thread Jan Beulich
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1015,11 +1015,11 @@ int arch_set_info_guest(
 init_int80_direct_trap(v);
 
 /* IOPL privileges are virtualised. */
-v->arch.pv_vcpu.iopl = v->arch.user_regs._eflags & X86_EFLAGS_IOPL;
-v->arch.user_regs._eflags &= ~X86_EFLAGS_IOPL;
+v->arch.pv_vcpu.iopl = v->arch.user_regs.eflags & X86_EFLAGS_IOPL;
+v->arch.user_regs.eflags &= ~X86_EFLAGS_IOPL;
 
 /* Ensure real hardware interrupts are enabled. */
-v->arch.user_regs._eflags |= X86_EFLAGS_IF;
+v->arch.user_regs.eflags |= X86_EFLAGS_IF;
 
 if ( !v->is_initialised )
 {
@@ -1776,14 +1776,14 @@ static void load_segments(struct vcpu *n
 if ( !ring_1(regs) )
 {
 ret  = put_user(regs->ss,   esp-1);
-ret |= put_user(regs->_esp, esp-2);
+ret |= put_user(regs->esp,  esp-2);
 esp -= 2;
 }
 
 if ( ret |
  put_user(rflags,  esp-1) |
  put_user(cs_and_mask, esp-2) |
- put_user(regs->_eip,  esp-3) |
+ put_user(regs->eip,   esp-3) |
  put_user(uregs->gs,   esp-4) |
  put_user(uregs->fs,   esp-5) |
  put_user(uregs->es,   esp-6) |
@@ -1798,12 +1798,12 @@ static void load_segments(struct vcpu *n
 vcpu_info(n, evtchn_upcall_mask) = 1;
 
 regs->entry_vector |= TRAP_syscall;
-regs->_eflags  &= ~(X86_EFLAGS_VM|X86_EFLAGS_RF|X86_EFLAGS_NT|
+regs->eflags   &= ~(X86_EFLAGS_VM|X86_EFLAGS_RF|X86_EFLAGS_NT|
 X86_EFLAGS_IOPL|X86_EFLAGS_TF);
 regs->ss= FLAT_COMPAT_KERNEL_SS;
-regs->_esp  = (unsigned long)(esp-7);
+regs->esp   = (unsigned long)(esp-7);
 regs->cs= FLAT_COMPAT_KERNEL_CS;
-regs->_eip  = pv->failsafe_callback_eip;
+regs->eip   = pv->failsafe_callback_eip;
 return;
 }
 
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -1667,7 +1667,7 @@ int __init construct_dom0(
 regs->rip = parms.virt_entry;
 regs->rsp = vstack_end;
 regs->rsi = vstartinfo_start;
-regs->_eflags = X86_EFLAGS_IF;
+regs->eflags = X86_EFLAGS_IF;
 
 #ifdef CONFIG_SHADOW_PAGING
 if ( opt_dom0_shadow )
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1587,8 +1587,8 @@ void arch_get_info_guest(struct vcpu *v,
 }
 
 /* IOPL privileges are virtualised: merge back into returned eflags. */
-BUG_ON((c(user_regs._eflags) & X86_EFLAGS_IOPL) != 0);
-c(user_regs._eflags |= v->arch.pv_vcpu.iopl);
+BUG_ON((c(user_regs.eflags) & X86_EFLAGS_IOPL) != 0);
+c(user_regs.eflags |= v->arch.pv_vcpu.iopl);
 
 if ( !compat )
 {
--- a/xen/arch/x86/gdbstub.c
+++ b/xen/arch/x86/gdbstub.c
@@ -68,14 +68,14 @@ gdb_arch_resume(struct cpu_user_regs *re
 if ( addr != -1UL )
 regs->rip = addr;
 
-regs->_eflags &= ~X86_EFLAGS_TF;
+regs->eflags &= ~X86_EFLAGS_TF;
 
 /* Set eflags.RF to ensure we do not re-enter. */
-regs->_eflags |= X86_EFLAGS_RF;
+regs->eflags |= X86_EFLAGS_RF;
 
 /* Set the trap flag if we are single stepping. */
 if ( type == GDB_STEP )
-regs->_eflags |= X86_EFLAGS_TF;
+regs->eflags |= X86_EFLAGS_TF;
 }
 
 /*
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -626,7 +626,7 @@ void fatal_trap(const struct cpu_user_re
 panic("FATAL TRAP: vector = %d (%s)\n"
   "[error_code=%04x] %s",
   trapnr, trapstr(trapnr), regs->error_code,
-  (regs->_eflags & X86_EFLAGS_IF) ? "" : ", IN INTERRUPT CONTEXT");
+  (regs->eflags & X86_EFLAGS_IF) ? "" : ", IN INTERRUPT CONTEXT");
 }
 
 void pv_inject_event(const struct x86_event *event)
@@ -703,8 +703,8 @@ static inline void do_guest_trap(unsigne
 static void instruction_done(struct cpu_user_regs *regs, unsigned long rip)
 {
 regs->rip = rip;
-regs->_eflags &= ~X86_EFLAGS_RF;
-if ( regs->_eflags & X86_EFLAGS_TF )
+regs->eflags &= ~X86_EFLAGS_RF;
+if ( regs->eflags & X86_EFLAGS_TF )
 {
 current->arch.debugreg[6] |= DR_STEP | DR_STATUS_RESERVED_ONE;
 do_guest_trap(TRAP_debug, regs);
@@ -1070,7 +1070,7 @@ static int emulate_forced_invalid_op(str
 
 eip += sizeof(instr);
 
-guest_cpuid(current, regs->_eax, regs->_ecx, &res);
+guest_cpuid(current, regs->eax, regs->ecx, &res);
 
 regs->rax = res.a;
 regs->rbx = res.b;
@@ -1395,7 +1395,7 @@ leaf:
  *   - Page fault in kernel mode
  */
 if ( (cr4 & X86_CR4_SMAP) && !(error_code & PFEC_user_mode) &&
- (((regs->cs & 3) == 3) || !(regs->_eflags & X86_EFLAGS_AC)) )
+ (((regs->cs & 3) == 3) || !(regs->eflags & X86_EFLAGS_AC)) )
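The IOPL hunks in this patch follow one pattern: the guest's IOPL is stripped out of the live eflags and kept aside in `pv_vcpu.iopl`, then merged back when guest state is read out. A hedged sketch of that save/merge pair (the helper names here are invented, not Xen's):

```c
#include <assert.h>
#include <stdint.h>

#define X86_EFLAGS_IOPL (3u << 12)   /* EFLAGS bits 12-13 */

/* Strip the IOPL field from the live eflags and return it for
 * safe-keeping, as arch_set_info_guest does. */
static uint32_t virtualise_iopl(uint32_t *eflags)
{
    uint32_t iopl = *eflags & X86_EFLAGS_IOPL;

    *eflags &= ~X86_EFLAGS_IOPL;
    return iopl;
}

/* Merge the saved IOPL back into the eflags reported to the tools,
 * as arch_get_info_guest does. */
static uint32_t merge_iopl(uint32_t eflags, uint32_t saved_iopl)
{
    assert(!(eflags & X86_EFLAGS_IOPL));  /* live copy must have it masked */
    return eflags | saved_iopl;
}
```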

[Xen-devel] [PATCH 3/8] x86/HVM: switch away from temporary 32-bit register names

2017-02-28 Thread Jan Beulich
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2968,20 +2968,20 @@ void hvm_task_switch(
 if ( rc != HVMCOPY_okay )
 goto out;
 
-eflags = regs->_eflags;
+eflags = regs->eflags;
 if ( taskswitch_reason == TSW_iret )
 eflags &= ~X86_EFLAGS_NT;
 
-tss.eip= regs->_eip;
+tss.eip= regs->eip;
 tss.eflags = eflags;
-tss.eax= regs->_eax;
-tss.ecx= regs->_ecx;
-tss.edx= regs->_edx;
-tss.ebx= regs->_ebx;
-tss.esp= regs->_esp;
-tss.ebp= regs->_ebp;
-tss.esi= regs->_esi;
-tss.edi= regs->_edi;
+tss.eax= regs->eax;
+tss.ecx= regs->ecx;
+tss.edx= regs->edx;
+tss.ebx= regs->ebx;
+tss.esp= regs->esp;
+tss.ebp= regs->ebp;
+tss.esi= regs->esi;
+tss.edi= regs->edi;
 
 hvm_get_segment_register(v, x86_seg_es, &segr);
 tss.es = segr.sel;
@@ -3047,7 +3047,7 @@ void hvm_task_switch(
 
 if ( taskswitch_reason == TSW_call_or_int )
 {
-regs->_eflags |= X86_EFLAGS_NT;
+regs->eflags |= X86_EFLAGS_NT;
 tss.back_link = prev_tr.sel;
 
rc = hvm_copy_to_guest_linear(tr.base + offsetof(typeof(tss), back_link),
@@ -3084,7 +3084,7 @@ void hvm_task_switch(
 opsz = segr.attr.fields.db ? 4 : 2;
 hvm_get_segment_register(v, x86_seg_ss, &segr);
 if ( segr.attr.fields.db )
-sp = regs->_esp -= opsz;
+sp = regs->esp -= opsz;
 else
 sp = regs->sp -= opsz;
 if ( hvm_virtual_to_linear_addr(x86_seg_ss, &segr, sp, opsz,
@@ -3370,7 +3370,7 @@ void hvm_rdtsc_intercept(struct cpu_user
 {
 msr_split(regs, _hvm_rdtsc_intercept());
 
-HVMTRACE_2D(RDTSC, regs->_eax, regs->_edx);
+HVMTRACE_2D(RDTSC, regs->eax, regs->edx);
 }
 
 int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
@@ -3684,11 +3684,11 @@ void hvm_ud_intercept(struct cpu_user_re
  (memcmp(sig, "\xf\xbxen", sizeof(sig)) == 0) )
 {
 regs->rip += sizeof(sig);
-regs->_eflags &= ~X86_EFLAGS_RF;
+regs->eflags &= ~X86_EFLAGS_RF;
 
 /* Zero the upper 32 bits of %rip if not in 64bit mode. */
 if ( !(hvm_long_mode_enabled(cur) && cs->attr.fields.l) )
-regs->rip = regs->_eip;
+regs->rip = regs->eip;
 
 add_taint(TAINT_HVM_FEP);
 
@@ -3732,7 +3732,7 @@ enum hvm_intblk hvm_interrupt_blocked(st
 }
 
 if ( (intack.source != hvm_intsrc_nmi) &&
- !(guest_cpu_user_regs()->_eflags & X86_EFLAGS_IF) )
+ !(guest_cpu_user_regs()->eflags & X86_EFLAGS_IF) )
 return hvm_intblk_rflags_ie;
 
 intr_shadow = hvm_funcs.get_interrupt_shadow(v);
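The RDTSC and MSR paths above use Xen's msr_split()/msr_fold() helpers, which implement the usual edx:eax convention for passing 64-bit values through 32-bit register halves. A simplified sketch (the struct and names are stand-ins; the real helpers operate on cpu_user_regs):

```c
#include <assert.h>
#include <stdint.h>

struct fake_regs { uint64_t rax, rdx; };   /* stand-in for cpu_user_regs */

/* Return a 64-bit value to the guest in edx:eax. */
static void msr_split_sketch(struct fake_regs *regs, uint64_t val)
{
    regs->rax = (uint32_t)val;   /* low half; upper bits of rax cleared */
    regs->rdx = val >> 32;       /* high half */
}

/* Reassemble a 64-bit value from the guest's edx:eax. */
static uint64_t msr_fold_sketch(const struct fake_regs *regs)
{
    return ((uint64_t)(uint32_t)regs->rdx << 32) | (uint32_t)regs->rax;
}
```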




[Xen-devel] [PATCH 4/8] x86/HVMemul: switch away from temporary 32-bit register names

2017-02-28 Thread Jan Beulich
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -442,7 +442,7 @@ static int hvmemul_linear_to_phys(
 }
 
 /* Reverse mode if this is a backwards multi-iteration string operation. */
-reverse = (hvmemul_ctxt->ctxt.regs->_eflags & X86_EFLAGS_DF) && (*reps > 1);
+reverse = (hvmemul_ctxt->ctxt.regs->eflags & X86_EFLAGS_DF) && (*reps > 1);
 
 if ( reverse && ((PAGE_SIZE - offset) < bytes_per_rep) )
 {
@@ -539,7 +539,7 @@ static int hvmemul_virtual_to_linear(
 if ( IS_ERR(reg) )
 return -PTR_ERR(reg);
 
-if ( (hvmemul_ctxt->ctxt.regs->_eflags & X86_EFLAGS_DF) && (*reps > 1) )
+if ( (hvmemul_ctxt->ctxt.regs->eflags & X86_EFLAGS_DF) && (*reps > 1) )
 {
 /*
  * x86_emulate() clips the repetition count to ensure we don't wrap
@@ -1085,7 +1085,7 @@ static int hvmemul_rep_ins(
 return X86EMUL_UNHANDLEABLE;
 
 return hvmemul_do_pio_addr(src_port, reps, bytes_per_rep, IOREQ_READ,
-   !!(ctxt->regs->_eflags & X86_EFLAGS_DF), gpa);
+   !!(ctxt->regs->eflags & X86_EFLAGS_DF), gpa);
 }
 
 static int hvmemul_rep_outs_set_context(
@@ -1154,7 +1154,7 @@ static int hvmemul_rep_outs(
 return X86EMUL_UNHANDLEABLE;
 
 return hvmemul_do_pio_addr(dst_port, reps, bytes_per_rep, IOREQ_WRITE,
-   !!(ctxt->regs->_eflags & X86_EFLAGS_DF), gpa);
+   !!(ctxt->regs->eflags & X86_EFLAGS_DF), gpa);
 }
 
 static int hvmemul_rep_movs(
@@ -1173,7 +1173,7 @@ static int hvmemul_rep_movs(
 paddr_t sgpa, dgpa;
 uint32_t pfec = PFEC_page_present;
 p2m_type_t sp2mt, dp2mt;
-int rc, df = !!(ctxt->regs->_eflags & X86_EFLAGS_DF);
+int rc, df = !!(ctxt->regs->eflags & X86_EFLAGS_DF);
 char *buf;
 
 rc = hvmemul_virtual_to_linear(
@@ -1327,7 +1327,7 @@ static int hvmemul_rep_stos(
 unsigned long addr, bytes;
 paddr_t gpa;
 p2m_type_t p2mt;
-bool_t df = !!(ctxt->regs->_eflags & X86_EFLAGS_DF);
+bool_t df = !!(ctxt->regs->eflags & X86_EFLAGS_DF);
 int rc = hvmemul_virtual_to_linear(seg, offset, bytes_per_rep, reps,
hvm_access_write, hvmemul_ctxt, &addr);
 
@@ -1775,7 +1775,7 @@ static int _hvm_emulate_one(struct hvm_e
 if ( hvmemul_ctxt->ctxt.retire.hlt &&
  !hvm_local_events_need_delivery(curr) )
 {
-hvm_hlt(regs->_eflags);
+hvm_hlt(regs->eflags);
 }
 
 return X86EMUL_OKAY;
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -136,7 +136,7 @@ bool handle_pio(uint16_t port, unsigned
 ASSERT((size - 1) < 4 && size != 3);
 
 if ( dir == IOREQ_WRITE )
-data = guest_cpu_user_regs()->_eax;
+data = guest_cpu_user_regs()->eax;
 
 rc = hvmemul_do_pio_buffer(port, size, dir, &data);
 




[Xen-devel] [PATCH 5/8] x86/mm: switch away from temporary 32-bit register names

2017-02-28 Thread Jan Beulich
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -196,7 +196,7 @@ guest_walk_tables(struct vcpu *v, struct
  *   - Page fault in kernel mode
  */
 smap = hvm_smap_enabled(v) &&
-   ((hvm_get_cpl(v) == 3) || !(regs->_eflags & X86_EFLAGS_AC));
+   ((hvm_get_cpl(v) == 3) || !(regs->eflags & X86_EFLAGS_AC));
 break;
 case SMAP_CHECK_ENABLED:
 smap = hvm_smap_enabled(v);
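The condition in this hunk encodes when the SMAP violation check applies; pulled out as a stand-alone predicate it reads as below. This is a sketch for illustration, not the real Xen helper:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_AC (1u << 18)

/* SMAP blocks supervisor-mode accesses to user pages.  The check
 * applies when SMAP is enabled and either the CPL is 3 (EFLAGS.AC is
 * ignored for implicit supervisor accesses made at CPL 3) or
 * EFLAGS.AC is clear. */
static bool smap_check_applies(bool smap_enabled, unsigned int cpl,
                               uint32_t eflags)
{
    return smap_enabled && (cpl == 3 || !(eflags & X86_EFLAGS_AC));
}
```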





Re: [Xen-devel] Next Xen ARM community call

2017-02-28 Thread Artem Mygaiev
Hi Julien

We would also like to discuss a topic raised recently by Volodymyr Babchuk -
SMC/HVC handling in Xen.

ARTEM MYGAIEV 
Director, Technology Solutions
 
Office: +380 44 390 5457 x 65570   Cell: +380 67 921 1131  
Email: artem_myga...@epam.com 
Kyiv, Ukraine (GMT+3)   epam.com 
 

-Original Message-
From: Julien Grall [mailto:julien.gr...@arm.com] 
Sent: 28 February 2017 15:30
To: Stefano Stabellini 
Cc: n...@arm.com; xen-devel ;
lars.ku...@citrix.com; Steve Capper ; Edgar E.
Iglesias ; Artem Mygaiev
; Alex Agizim ;
anastassios.na...@onapp.com
Subject: Re: Next Xen ARM community call

Hi,

I haven't seen any complaints about the time suggested by Stefano. So the
Community call will be at 5pm UTC tomorrow (Wednesday 1st March).

Regarding the agenda, the topics I have so far are:
 - PV protocols
 - IRQ latency

Do we have any other topics to discuss?

For joining the call, please use either:

Call+44 1223 406065 (Local dial in)
and enter the access code below followed by # key.
Participant code: 4915191

Mobile Auto Dial:
 VoIP: voip://+441223406065;4915191#
 iOS devices: +44 1223 406065,4915191 and press #
 Other devices: +44 1223 406065x4915191#

Additional Calling Information:

UK +44 1142828002
US CA +1 4085761502
US TX +1 5123141073
JP +81 453455355
DE +49 8945604050
NO +47 73187518
SE +46 46313131
FR +33 497235101
TW +886 35657119
HU +36 13275600
IE +353 91337900

Toll Free

UK 0800 1412084
US +1 8668801148
CN +86 4006782367
IN 0008009868365
IN +918049282778
TW 08000 22065
HU 0680981587
IE 1800800022
KF +972732558877

Regards,


On 17/02/17 15:00, Julien Grall wrote:
> Hi Stefano,
>
> On 16/02/17 18:40, Stefano Stabellini wrote:
>> On Thu, 16 Feb 2017, Julien Grall wrote:
>>> Hello,
>>>
>>> The last two community calls went really good and I am suggesting to 
>>> have a new one on Wednesday 1st March at 4pm UTC. Any opinions?
>>
>> Is it possible to change the time to 5pm?
>
> I am fine with either time.
>
>>
>>
>>> Also, do you have any specific topic you would like to talk during 
>>> the next call?
>>
>> I would like to discuss progress on PV protocols and IRQ latency.
>
> I will add them in the agenda.
>
> Cheers,
>

--
Julien Grall




[Xen-devel] [PATCH 6/8] x86/SVM: switch away from temporary 32-bit register names

2017-02-28 Thread Jan Beulich
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/hvm/svm/nestedsvm.c
+++ b/xen/arch/x86/hvm/svm/nestedsvm.c
@@ -975,7 +975,7 @@ nsvm_vmcb_guest_intercepts_exitcode(stru
 break;
 ns_vmcb = nv->nv_vvmcx;
 vmexits = nsvm_vmcb_guest_intercepts_msr(svm->ns_cached_msrpm,
-regs->_ecx, ns_vmcb->exitinfo1 != 0);
+regs->ecx, ns_vmcb->exitinfo1 != 0);
 if (vmexits == NESTEDHVM_VMEXIT_HOST)
 return 0;
 break;
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -111,11 +111,11 @@ void __update_guest_eip(struct cpu_user_
 ASSERT(regs == guest_cpu_user_regs());
 
 regs->rip += inst_len;
-regs->_eflags &= ~X86_EFLAGS_RF;
+regs->eflags &= ~X86_EFLAGS_RF;
 
 curr->arch.hvm_svm.vmcb->interrupt_shadow = 0;
 
-if ( regs->_eflags & X86_EFLAGS_TF )
+if ( regs->eflags & X86_EFLAGS_TF )
 hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
 }
 
@@ -515,7 +515,7 @@ static int svm_guest_x86_mode(struct vcp
 
 if ( unlikely(!(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE)) )
 return 0;
-if ( unlikely(guest_cpu_user_regs()->_eflags & X86_EFLAGS_VM) )
+if ( unlikely(guest_cpu_user_regs()->eflags & X86_EFLAGS_VM) )
 return 1;
 if ( hvm_long_mode_enabled(v) && likely(vmcb->cs.attr.fields.l) )
 return 8;
@@ -1223,7 +1223,7 @@ static void svm_inject_event(const struc
 switch ( _event.vector )
 {
 case TRAP_debug:
-if ( regs->_eflags & X86_EFLAGS_TF )
+if ( regs->eflags & X86_EFLAGS_TF )
 {
 __restore_debug_registers(vmcb, curr);
 vmcb_set_dr6(vmcb, vmcb_get_dr6(vmcb) | 0x4000);
@@ -1313,7 +1313,7 @@ static void svm_inject_event(const struc
  */
 if ( !((vmcb->_efer & EFER_LMA) && vmcb->cs.attr.fields.l) )
 {
-regs->rip = regs->_eip;
+regs->rip = regs->eip;
 vmcb->nextrip = (uint32_t)vmcb->nextrip;
 }
 
@@ -1595,8 +1595,8 @@ static void svm_vmexit_do_cpuid(struct c
 if ( (inst_len = __get_instruction_length(curr, INSTR_CPUID)) == 0 )
 return;
 
-guest_cpuid(curr, regs->_eax, regs->_ecx, &res);
-HVMTRACE_5D(CPUID, regs->_eax, res.a, res.b, res.c, res.d);
+guest_cpuid(curr, regs->eax, regs->ecx, &res);
+HVMTRACE_5D(CPUID, regs->eax, res.a, res.b, res.c, res.d);
 
 regs->rax = res.a;
 regs->rbx = res.b;
@@ -1973,12 +1973,12 @@ static void svm_do_msr_access(struct cpu
 {
 uint64_t msr_content = 0;
 
-rc = hvm_msr_read_intercept(regs->_ecx, &msr_content);
+rc = hvm_msr_read_intercept(regs->ecx, &msr_content);
 if ( rc == X86EMUL_OKAY )
 msr_split(regs, msr_content);
 }
 else
-rc = hvm_msr_write_intercept(regs->_ecx, msr_fold(regs), 1);
+rc = hvm_msr_write_intercept(regs->ecx, msr_fold(regs), 1);
 
 if ( rc == X86EMUL_OKAY )
 __update_guest_eip(regs, inst_len);
@@ -1993,7 +1993,7 @@ static void svm_vmexit_do_hlt(struct vmc
 return;
 __update_guest_eip(regs, inst_len);
 
-hvm_hlt(regs->_eflags);
+hvm_hlt(regs->eflags);
 }
 
 static void svm_vmexit_do_rdtsc(struct cpu_user_regs *regs)
@@ -2338,11 +2338,11 @@ void svm_vmexit_handler(struct cpu_user_
 if ( hvm_long_mode_enabled(v) )
 HVMTRACE_ND(VMEXIT64, vcpu_guestmode ? TRC_HVM_NESTEDFLAG : 0,
 1/*cycles*/, 3, exit_reason,
-regs->_eip, regs->rip >> 32, 0, 0, 0);
+regs->eip, regs->rip >> 32, 0, 0, 0);
 else
 HVMTRACE_ND(VMEXIT, vcpu_guestmode ? TRC_HVM_NESTEDFLAG : 0,
 1/*cycles*/, 2, exit_reason,
-regs->_eip, 0, 0, 0, 0);
+regs->eip, 0, 0, 0, 0);
 
 if ( vcpu_guestmode ) {
 enum nestedhvm_vmexits nsret;
@@ -2621,7 +2621,7 @@ void svm_vmexit_handler(struct cpu_user_
 case VMEXIT_INVLPGA:
 if ( (inst_len = __get_instruction_length(v, INSTR_INVLPGA)) == 0 )
 break;
-svm_invlpga_intercept(v, regs->rax, regs->_ecx);
+svm_invlpga_intercept(v, regs->rax, regs->ecx);
 __update_guest_eip(regs, inst_len);
 break;
 
@@ -2629,7 +2629,7 @@ void svm_vmexit_handler(struct cpu_user_
 if ( (inst_len = __get_instruction_length(v, INSTR_VMCALL)) == 0 )
 break;
 BUG_ON(vcpu_guestmode);
-HVMTRACE_1D(VMMCALL, regs->_eax);
+HVMTRACE_1D(VMMCALL, regs->eax);
 
 if ( hvm_hypercall(regs) == HVM_HCALL_completed )
 __update_guest_eip(regs, inst_len);
@@ -2687,7 +2687,7 @@ void svm_vmexit_handler(struct cpu_user_
 if ( vmcb_get_cpl(vmcb) )
 hvm_inject_hw_exception(TRAP_gp_fault, 0);
 else if ( (inst_len = __get_instruction_length(v, INSTR_XSETBV)) &&
-  hvm_handle_xsetbv(regs->_ecx, msr_fold(regs)) == 0 )
+  hvm_handle_xsetbv(regs->ecx, msr_fold(regs)) == 0 )
__update_guest_eip(regs, inst_len);

[Xen-devel] [PATCH 5/8] x86/Viridian: switch away from temporary 32-bit register names

2017-02-28 Thread Jan Beulich
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/hvm/viridian.c
+++ b/xen/arch/x86/hvm/viridian.c
@@ -666,9 +666,9 @@ int viridian_hypercall(struct cpu_user_r
 output_params_gpa = regs->r8;
 break;
 case 4:
-input.raw = (regs->rdx << 32) | regs->_eax;
-input_params_gpa = (regs->rbx << 32) | regs->_ecx;
-output_params_gpa = (regs->rdi << 32) | regs->_esi;
+input.raw = (regs->rdx << 32) | regs->eax;
+input_params_gpa = (regs->rbx << 32) | regs->ecx;
+output_params_gpa = (regs->rdi << 32) | regs->esi;
 break;
 default:
 goto out;
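In the 32-bit (case 4) Viridian calling convention each 64-bit hypercall parameter arrives split across a register pair, so the handler folds the pairs back together. A sketch of one such fold; in a 32-bit guest the upper halves of the 64-bit registers are zero, which is what makes the plain shift in the hunk above safe:

```c
#include <assert.h>
#include <stdint.h>

/* Reassemble one 64-bit Viridian hypercall parameter from the
 * high/low register pair of a 32-bit guest. */
static uint64_t fold_param(uint64_t hi_reg, uint64_t lo_reg)
{
    return (hi_reg << 32) | (uint32_t)lo_reg;
}
```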





[Xen-devel] [PATCH 8/8] x86/VMX: switch away from temporary 32-bit register names

2017-02-28 Thread Jan Beulich
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -72,7 +72,7 @@ static void realmode_deliver_exception(
 
 /* We can't test hvmemul_ctxt->ctxt.sp_size: it may not be initialised. */
 if ( hvmemul_ctxt->seg_reg[x86_seg_ss].attr.fields.db )
-pstk = regs->_esp -= 6;
+pstk = regs->esp -= 6;
 else
 pstk = regs->sp -= 6;
 
@@ -82,7 +82,7 @@ static void realmode_deliver_exception(
 csr->sel  = cs_eip >> 16;
 csr->base = (uint32_t)csr->sel << 4;
 regs->ip = (uint16_t)cs_eip;
-regs->_eflags &= ~(X86_EFLAGS_TF | X86_EFLAGS_IF | X86_EFLAGS_RF);
+regs->eflags &= ~(X86_EFLAGS_TF | X86_EFLAGS_IF | X86_EFLAGS_RF);
 
 /* Exception delivery clears STI and MOV-SS blocking. */
 if ( hvmemul_ctxt->intr_shadow &
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -607,7 +607,7 @@ int vmx_guest_x86_mode(struct vcpu *v)
 
 if ( unlikely(!(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE)) )
 return 0;
-if ( unlikely(guest_cpu_user_regs()->_eflags & X86_EFLAGS_VM) )
+if ( unlikely(guest_cpu_user_regs()->eflags & X86_EFLAGS_VM) )
 return 1;
 __vmread(GUEST_CS_AR_BYTES, &cs_ar_bytes);
 if ( hvm_long_mode_enabled(v) &&
@@ -1753,7 +1753,7 @@ static void vmx_inject_event(const struc
 switch ( _event.vector | -(_event.type == X86_EVENTTYPE_SW_INTERRUPT) )
 {
 case TRAP_debug:
-if ( guest_cpu_user_regs()->_eflags & X86_EFLAGS_TF )
+if ( guest_cpu_user_regs()->eflags & X86_EFLAGS_TF )
 {
 __restore_debug_registers(curr);
 write_debugreg(6, read_debugreg(6) | DR_STEP);
@@ -1853,7 +1853,7 @@ static void vmx_set_info_guest(struct vc
  */
 __vmread(GUEST_INTERRUPTIBILITY_INFO, &intr_shadow);
 if ( v->domain->debugger_attached &&
- (v->arch.user_regs._eflags & X86_EFLAGS_TF) &&
+ (v->arch.user_regs.eflags & X86_EFLAGS_TF) &&
  (intr_shadow & VMX_INTR_SHADOW_STI) )
 {
 intr_shadow &= ~VMX_INTR_SHADOW_STI;
@@ -2092,8 +2092,8 @@ static int vmx_vcpu_emulate_vmfunc(const
 struct vcpu *curr = current;
 
 if ( !cpu_has_vmx_vmfunc && altp2m_active(curr->domain) &&
- regs->_eax == 0 &&
- p2m_switch_vcpu_altp2m_by_id(curr, regs->_ecx) )
+ regs->eax == 0 &&
+ p2m_switch_vcpu_altp2m_by_id(curr, regs->ecx) )
 rc = X86EMUL_OKAY;
 
 return rc;
@@ -2416,7 +2416,7 @@ void update_guest_eip(void)
 unsigned long x;
 
 regs->rip += get_instruction_length(); /* Safe: callers audited */
-regs->_eflags &= ~X86_EFLAGS_RF;
+regs->eflags &= ~X86_EFLAGS_RF;
 
 __vmread(GUEST_INTERRUPTIBILITY_INFO, &x);
 if ( x & (VMX_INTR_SHADOW_STI | VMX_INTR_SHADOW_MOV_SS) )
@@ -2425,7 +2425,7 @@ void update_guest_eip(void)
 __vmwrite(GUEST_INTERRUPTIBILITY_INFO, x);
 }
 
-if ( regs->_eflags & X86_EFLAGS_TF )
+if ( regs->eflags & X86_EFLAGS_TF )
 hvm_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
 }
 
@@ -2446,7 +2446,7 @@ static void vmx_fpu_dirty_intercept(void
 static int vmx_do_cpuid(struct cpu_user_regs *regs)
 {
 struct vcpu *curr = current;
-uint32_t leaf = regs->_eax, subleaf = regs->_ecx;
+uint32_t leaf = regs->eax, subleaf = regs->ecx;
 struct cpuid_leaf res;
 
 if ( hvm_check_cpuid_faulting(current) )
@@ -3204,8 +3204,8 @@ void vmx_enter_realmode(struct cpu_user_
 /* Adjust RFLAGS to enter virtual 8086 mode with IOPL == 3.  Since
  * we have CR4.VME == 1 and our own TSS with an empty interrupt
  * redirection bitmap, all software INTs will be handled by vm86 */
-v->arch.hvm_vmx.vm86_saved_eflags = regs->_eflags;
-regs->_eflags |= (X86_EFLAGS_VM | X86_EFLAGS_IOPL);
+v->arch.hvm_vmx.vm86_saved_eflags = regs->eflags;
+regs->eflags |= (X86_EFLAGS_VM | X86_EFLAGS_IOPL);
 }
 
 static int vmx_handle_eoi_write(void)
@@ -3347,10 +3347,10 @@ void vmx_vmexit_handler(struct cpu_user_
 
 if ( hvm_long_mode_enabled(v) )
 HVMTRACE_ND(VMEXIT64, 0, 1/*cycles*/, 3, exit_reason,
-regs->_eip, regs->rip >> 32, 0, 0, 0);
+regs->eip, regs->rip >> 32, 0, 0, 0);
 else
 HVMTRACE_ND(VMEXIT, 0, 1/*cycles*/, 2, exit_reason,
-regs->_eip, 0, 0, 0, 0);
+regs->eip, 0, 0, 0, 0);
 
 perfc_incra(vmexits, exit_reason);
 
@@ -3435,8 +3435,8 @@ void vmx_vmexit_handler(struct cpu_user_
 if ( v->arch.hvm_vmx.vmx_realmode )
 {
 /* Put RFLAGS back the way the guest wants it */
-regs->_eflags &= ~(X86_EFLAGS_VM | X86_EFLAGS_IOPL);
-regs->_eflags |= (v->arch.hvm_vmx.vm86_saved_eflags & X86_EFLAGS_IOPL);
+regs->eflags &= ~(X86_EFLAGS_VM | X86_EFLAGS_IOPL);
+regs->eflags |= (v->arch.hvm_vmx.vm86_saved_eflags & X86_EFLAGS_IOPL);
 
 /* Unless this exit was for an interrupt, we've hit something
 * vm86 can't handle.  Try again, using the emulator. */

Re: [Xen-devel] [PATCH 4/8] x86/HVMemul: switch away from temporary 32-bit register names

2017-02-28 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 28 February 2017 13:37
> To: xen-devel 
> Cc: Andrew Cooper ; Paul Durrant
> ; George Dunlap 
> Subject: [PATCH 4/8] x86/HVMemul: switch away from temporary 32-bit
> register names
> 
> Signed-off-by: Jan Beulich 
> 

Reviewed-by: Paul Durrant 

> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -442,7 +442,7 @@ static int hvmemul_linear_to_phys(
>  }
> 
>  /* Reverse mode if this is a backwards multi-iteration string operation. 
> */
> -reverse = (hvmemul_ctxt->ctxt.regs->_eflags & X86_EFLAGS_DF) &&
> (*reps > 1);
> +reverse = (hvmemul_ctxt->ctxt.regs->eflags & X86_EFLAGS_DF) &&
> (*reps > 1);
> 
>  if ( reverse && ((PAGE_SIZE - offset) < bytes_per_rep) )
>  {
> @@ -539,7 +539,7 @@ static int hvmemul_virtual_to_linear(
>  if ( IS_ERR(reg) )
>  return -PTR_ERR(reg);
> 
> -if ( (hvmemul_ctxt->ctxt.regs->_eflags & X86_EFLAGS_DF) && (*reps > 1)
> )
> +if ( (hvmemul_ctxt->ctxt.regs->eflags & X86_EFLAGS_DF) && (*reps > 1) )
>  {
>  /*
>   * x86_emulate() clips the repetition count to ensure we don't wrap
> @@ -1085,7 +1085,7 @@ static int hvmemul_rep_ins(
>  return X86EMUL_UNHANDLEABLE;
> 
>  return hvmemul_do_pio_addr(src_port, reps, bytes_per_rep,
> IOREQ_READ,
> -   !!(ctxt->regs->_eflags & X86_EFLAGS_DF), gpa);
> +   !!(ctxt->regs->eflags & X86_EFLAGS_DF), gpa);
>  }
> 
>  static int hvmemul_rep_outs_set_context(
> @@ -1154,7 +1154,7 @@ static int hvmemul_rep_outs(
>  return X86EMUL_UNHANDLEABLE;
> 
>  return hvmemul_do_pio_addr(dst_port, reps, bytes_per_rep,
> IOREQ_WRITE,
> -   !!(ctxt->regs->_eflags & X86_EFLAGS_DF), gpa);
> +   !!(ctxt->regs->eflags & X86_EFLAGS_DF), gpa);
>  }
> 
>  static int hvmemul_rep_movs(
> @@ -1173,7 +1173,7 @@ static int hvmemul_rep_movs(
>  paddr_t sgpa, dgpa;
>  uint32_t pfec = PFEC_page_present;
>  p2m_type_t sp2mt, dp2mt;
> -int rc, df = !!(ctxt->regs->_eflags & X86_EFLAGS_DF);
> +int rc, df = !!(ctxt->regs->eflags & X86_EFLAGS_DF);
>  char *buf;
> 
>  rc = hvmemul_virtual_to_linear(
> @@ -1327,7 +1327,7 @@ static int hvmemul_rep_stos(
>  unsigned long addr, bytes;
>  paddr_t gpa;
>  p2m_type_t p2mt;
> -bool_t df = !!(ctxt->regs->_eflags & X86_EFLAGS_DF);
> +bool_t df = !!(ctxt->regs->eflags & X86_EFLAGS_DF);
>  int rc = hvmemul_virtual_to_linear(seg, offset, bytes_per_rep, reps,
> hvm_access_write, hvmemul_ctxt, 
> &addr);
> 
> @@ -1775,7 +1775,7 @@ static int _hvm_emulate_one(struct hvm_e
>  if ( hvmemul_ctxt->ctxt.retire.hlt &&
>   !hvm_local_events_need_delivery(curr) )
>  {
> -hvm_hlt(regs->_eflags);
> +hvm_hlt(regs->eflags);
>  }
> 
>  return X86EMUL_OKAY;
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -136,7 +136,7 @@ bool handle_pio(uint16_t port, unsigned
>  ASSERT((size - 1) < 4 && size != 3);
> 
>  if ( dir == IOREQ_WRITE )
> -data = guest_cpu_user_regs()->_eax;
> +data = guest_cpu_user_regs()->eax;
> 
>  rc = hvmemul_do_pio_buffer(port, size, dir, &data);
> 
> 
> 




Re: [Xen-devel] [PATCH 0/8] x86: switch away from temporary 32-bit register names

2017-02-28 Thread Andrew Cooper
On 28/02/17 13:27, Jan Beulich wrote:
> This is only part of the necessary changes. Some needed to be
> dropped due to code having changed recently, and the biggest
> missing part is the adjustment of the insn emulator, when I'd
> prefer to do this work only after the non-RFC parts of
> https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg03474.html
> have gone in (in order to avoid having to ping-pong re-base
> that and this series).
>
> 1: re-introduce non-underscore prefixed 32-bit register names
> 2: switch away from temporary 32-bit register names
> 3: HVM: switch away from temporary 32-bit register names
> 4: HVMemul: switch away from temporary 32-bit register names
> 5: mm: switch away from temporary 32-bit register names
> 6: SVM: switch away from temporary 32-bit register names
> 7: Viridian: switch away from temporary 32-bit register names
> 8: VMX: switch away from temporary 32-bit register names
>
> Signed-off-by: Jan Beulich 
>

Your Viridian patch is labelled 7 here, but 5 in the email.  I guess
that is just an oversight?

All Reviewed-by: Andrew Cooper 



Re: [Xen-devel] [PATCH 5/8] x86/Viridian: switch away from temporary 32-bit register names

2017-02-28 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 28 February 2017 13:39
> To: xen-devel 
> Cc: Andrew Cooper ; Paul Durrant
> ; George Dunlap 
> Subject: [PATCH 5/8] x86/Viridian: switch away from temporary 32-bit
> register names
> 
> Signed-off-by: Jan Beulich 
> 

Reviewed-by: Paul Durrant 

> --- a/xen/arch/x86/hvm/viridian.c
> +++ b/xen/arch/x86/hvm/viridian.c
> @@ -666,9 +666,9 @@ int viridian_hypercall(struct cpu_user_r
>  output_params_gpa = regs->r8;
>  break;
>  case 4:
> -input.raw = (regs->rdx << 32) | regs->_eax;
> -input_params_gpa = (regs->rbx << 32) | regs->_ecx;
> -output_params_gpa = (regs->rdi << 32) | regs->_esi;
> +input.raw = (regs->rdx << 32) | regs->eax;
> +input_params_gpa = (regs->rbx << 32) | regs->ecx;
> +output_params_gpa = (regs->rdi << 32) | regs->esi;
>  break;
>  default:
>  goto out;
> 
> 




Re: [Xen-devel] [ARM] SMC (and HVC) handling in hypervisor

2017-02-28 Thread Julien Grall



On 11/02/17 00:14, Volodymyr Babchuk wrote:

Hello,


Hi Volodymyr,


This e-mail is a sort of follow-up to two threads: [1] (my thread
about TEE interaction) and [2] (Edgar's thread regarding handling SMC
calls in platform_hvc). I want to discuss a broader topic here.

Obviously, there is a growing number of SMC users, and the current state of
SMC handling in Xen satisfies nobody. My team wants to handle SMCs in a
secure way, Xilinx wants to forward some calls directly to the Secure
Monitor while allowing others to be handled in userspace, etc.

My proposal is to gather all requirements for SMC (and HVC) handling
in one place (e.g. in this mail thread). Once we have a clear
picture of what we want, we will be able to develop a solution
that satisfies us all. At least, I hope so :)

I also want to point out that there is an ARM document called "SMC Calling
Convention" [3]. According to it, any aarch64 hypervisor "must
implement the Standard Secure and Hypervisor Service calls". At the
moment Xen does not conform to this.

So, let's get started with the requirements:
0. There is not much difference between SMC and HVC handling (at least
according to SMCCC).
1. Hypervisor should at least provide its own UUID and version when
called via SMC/HVC.


Do we need to reserve the UUID for Xen?


2. Hypervisor should forward some calls from dom0 directly to Secure
Monitor (Xilinx use case)
3. Hypervisor should virtualize PSCI calls, CPU service calls, ARM
architecture service calls, etc.
4. Hypervisor should handle TEE calls in a secure way (e.g. no
untrusted handlers in Dom0 userspace).
5. Hypervisor should support multiple TEEs (at least at compilation time).
6. Hypervisor should do this as fast as possible (DRM playback use case).
7. All domains (including dom0) should be handled in the same way.


+1 here. Same path for all domains means less code, and it will get
tested more.


Cheers,

--
Julien Grall



Re: [Xen-devel] [PATCH 1/4] interface: avoid redefinition of __XEN_INTERFACE_VERSION__

2017-02-28 Thread Juergen Gross
On 28/02/17 14:13, Ian Jackson wrote:
> Juergen Gross writes ("Re: [Xen-devel] [PATCH 1/4] interface: avoid 
> redefinition of __XEN_INTERFACE_VERSION__"):
>> Hmm, maybe this is the problem: the value from the command line is
>> (textually) __XEN_LATEST_INTERFACE_VERSION__ while the value from the
>> #define is the _value_ of __XEN_LATEST_INTERFACE_VERSION__ due to the
>> pre-processor having replaced it already.
>>
>> In case this makes sense, my suggestion seems to be appropriate, no?
> 
> Maybe.  Another possibility would be to contrive to #define
> __XEN_INTERFACE_VERSION__ before __XEN_LATEST_INTERFACE_VERSION__.
> Then the second substitution would occur later.

Please drop this patch.

Seems there are paths where __XEN_INTERFACE_VERSION__ is being set to
a different value.

The main problem seems to be that xenctrl.h #define's __XEN_TOOLS__
in order to get all structures of #include'd headers, but it isn't
#undef'ining it at the end.

I believe all stubdoms using xenctrl.h in their main app should
explicitly set __XEN_INTERFACE_VERSION__ in their config file to
__XEN_LATEST_INTERFACE_VERSION__.

I'll send a patch doing this.


Juergen




Re: [Xen-devel] [PATCH 6/8] x86/SVM: switch away from temporary 32-bit register names

2017-02-28 Thread Boris Ostrovsky
On 02/28/2017 08:38 AM, Jan Beulich wrote:
> Signed-off-by: Jan Beulich 

Reviewed-by: Boris Ostrovsky 



[Xen-devel] [xen-unstable-smoke test] 106258: tolerable trouble: broken/fail/pass - PUSHED

2017-02-28 Thread osstest service owner
flight 106258 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106258/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64   5 xen-buildfail   never pass
 build-arm64-pvops 5 kernel-build fail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass

version targeted for testing:
 xen  c8b5cae4ad086ab12a06b602fe7bc26a548fbcac
baseline version:
 xen  12f687bf28e23fa662bb518311c4ec71e5b39ab8

Last test of basis   106208  2017-02-27 17:01:58 Z0 days
Testing same since   106258  2017-02-28 12:01:11 Z0 days1 attempts


People who touched revisions under test:
  Ian Jackson 
  Juergen Gross 
  Wei Liu 

jobs:
 build-amd64  pass
 build-arm64  fail
 build-armhf  pass
 build-amd64-libvirt  pass
 build-arm64-pvopsfail
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  broken  
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=c8b5cae4ad086ab12a06b602fe7bc26a548fbcac
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
c8b5cae4ad086ab12a06b602fe7bc26a548fbcac
+ branch=xen-unstable-smoke
+ revision=c8b5cae4ad086ab12a06b602fe7bc26a548fbcac
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.8-testing
+ '[' xc8b5cae4ad086ab12a06b602fe7bc26a548fbcac = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/xtf.git
++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git
++ : git://xenbits.xen.org/xtf.git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : git
++ : git://xenbits.xen.org/osstest/rumprun.git
++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git

Re: [Xen-devel] [PATCH v3 3/7] xen: credit2: group the runq manipulating functions.

2017-02-28 Thread Andrew Cooper
On 28/02/17 11:52, Dario Faggioli wrote:
> +static inline bool_t same_node(unsigned int cpua, unsigned int cpub)

s/bool_t/bool/g

>
> +
> +if ( unlikely(tb_init_done) )
> +{
> +struct {
> +unsigned rqi:16, max_weight:16;

More commonly known as uint16_t :)

~Andrew



Re: [Xen-devel] [PATCH v8 08/24] x86: refactor psr: set value: implement framework.

2017-02-28 Thread Roger Pau Monné
On Wed, Feb 15, 2017 at 04:49:23PM +0800, Yi Sun wrote:
> As set value flow is the most complicated one in psr, it will be
> divided to some patches to make things clearer. This patch
> implements the set value framework to show a whole picture firstly.
> 
> It also changes domctl interface to make it more general.
> 
> To make the set value flow be general and can support multiple features
> at same time, it includes below steps:
> 1. Get COS ID of current domain using.
> 2. Assemble a value array to store all features' current values,
>replacing the value of the feature being set with the new input
>value.
> 3. Find if there is already a COS ID on which all features'
>values are same as the array. Then, we can reuse this COS
>ID.
> 4. If that fails, pick an available COS ID. Only a COS ID whose ref
>is 0 or 1 can be picked.
> 5. Write all features MSRs according to the COS ID.
> 6. Update ref according to COS ID.
> 7. Save the COS ID into current domain's psr_cos_ids[socket] so that we
>can know which COS the domain is using on the socket.
> 
> So, some functions are abstracted and the callback functions will be
> implemented in next patches.
> 
> Here is an example to understand the process. The CPU supports
> two features, e.g. L3 CAT and L2 CAT. The user wants to set L3 CAT
> of Dom1 to 0x1ff.
> 1. Get the old_cos of Dom1 which is 0. L3 CAT is the first
> element of feature list. The COS registers values are below at
> this time.
> ---
> | COS 0 | COS 1 | COS 2 | ... |
> ---
> L3 CAT  | 0x7ff | ...   | ...   | ... |
> ---
> L2 CAT  | 0xff  | ...   | ...   | ... |
> ---
> 
> 2. Assemble The value array to be:
> val[0]: 0x1ff
> val[1]: 0xff
> 
> 3. It cannot find a matching COS.
> 
> 4. Allocate COS 1 to store the value set.
> 
> 5. Write the COS 1 registers. The COS registers values are
> changed to below now.
> ---
> | COS 0 | COS 1 | COS 2 | ... |
> ---
> L3 CAT  | 0x7ff | 0x1ff | ...   | ... |
> ---
> L2 CAT  | 0xff  | 0xff  | ...   | ... |
> ---
> 
> 6. The ref[1] is increased to 1 because Dom1 is using it now.
> 
> 7. Save 1 to Dom1's psr_cos_ids[socket].
> 
> Then, user wants to set L3 CAT of Dom2 to 0x1ff too. The old_cos
> of Dom2 is 0 too. Repeat above flow.
> 
> The val array assembled is:
> val[0]: 0x1ff
> val[1]: 0xff
> 
> So, it can find a matching COS, COS 1. Then, it can reuse COS 1
> for Dom2.
> 
> The ref[1] is increased to 2 now because both Dom1 and Dom2 are
> using this COS ID. Set 1 to Dom2's psr_cos_ids[socket].
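The seven steps above read like the following sketch (illustrative Python mirroring the description, not the patch's C; `-ENOENT` becomes an exception):

```python
def psr_set_val(cos_reg_val, ref, dom_cos, dom, feat_idx, new_val):
    """Sketch of steps 1-7.  cos_reg_val[cos] holds all features' values
    for that COS (None = never written), ref[cos] is its use count, and
    dom_cos[dom] is the COS the domain currently uses."""
    old_cos = dom_cos[dom]                        # step 1
    vals = list(cos_reg_val[old_cos])             # step 2: current values,
    vals[feat_idx] = new_val                      # with the new one patched in

    for cos, stored in enumerate(cos_reg_val):    # step 3: reuse a match
        if stored == vals:
            break
    else:
        candidates = [c for c in range(1, len(ref))   # step 4: ref 0, or 1
                      if ref[c] == 0 or (c == old_cos and ref[c] == 1)]
        cos = candidates[0]                       # would be -ENOENT if empty
        cos_reg_val[cos] = vals                   # step 5: write the MSRs

    ref[cos] += 1                                 # step 6: update refcounts
    ref[old_cos] -= 1
    dom_cos[dom] = cos                            # step 7: remember the COS
    return cos


# The worked example from the commit message: L3 CAT default 0x7ff, L2 0xff.
regs = [[0x7ff, 0xff], None, None]
ref = [2, 0, 0]                                   # Dom1 and Dom2 both on COS 0
dom_cos = {1: 0, 2: 0}
assert psr_set_val(regs, ref, dom_cos, 1, 0, 0x1ff) == 1  # allocates COS 1
assert psr_set_val(regs, ref, dom_cos, 2, 0, 0x1ff) == 1  # reuses COS 1
assert ref == [0, 2, 0]
```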
> 
> Signed-off-by: Yi Sun 
> ---
>  xen/arch/x86/domctl.c |  18 ++---
>  xen/arch/x86/psr.c| 202 
> +-
>  xen/include/asm-x86/psr.h |   4 +-
>  3 files changed, 210 insertions(+), 14 deletions(-)
[...]
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index c1afd36..d414b5e 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -546,18 +546,214 @@ int psr_get_val(struct domain *d, unsigned int socket,
>  return psr_get(socket, type, NULL, 0, d, val);
>  }
>  
> -int psr_set_l3_cbm(struct domain *d, unsigned int socket,
> -   uint64_t cbm, enum cbm_type type)
> +/* Set value functions */
> +static unsigned int get_cos_num(const struct psr_socket_info *info)
>  {
>  return 0;
>  }
>  
> +static int assemble_val_array(uint64_t *val,
> +  uint32_t array_len,
> +  const struct psr_socket_info *info,
> +  unsigned int old_cos)
> +{
> +return -EINVAL;
> +}
> +
> +static int set_new_val_to_array(uint64_t *val,
> +uint32_t array_len,
> +const struct psr_socket_info *info,
> +enum psr_feat_type feat_type,
> +enum cbm_type type,
> +uint64_t m)
> +{
> +return -EINVAL;
> +}
> +
> +static int find_cos(const uint64_t *val, uint32_t array_len,
> +enum psr_feat_type feat_type,
> +const struct psr_socket_info *info)
> +{
ASSERT(spin_is_locked(info->ref_lock));
> +return -ENOENT;
> +}
> +
> +static int pick_avail_cos(const struct psr_socket_info *info,
> +  const uint64_t *val, uint32_t array_len,
> +  unsigned int old_cos,
> +  enum psr_feat_type feat_type)
> +{
ASSERT(spin_is_locked(info->ref_lock));
> +return -ENOENT;
> +}
> +
> +static int write_psr_msr(unsigned int socket, unsigned int cos,
> + const uint64_t *val)
> +{
ASSERT(spin_is_locked(info->ref_lock));

Re: [Xen-devel] [PATCH 0/8] x86: switch away from temporary 32-bit register names

2017-02-28 Thread Jan Beulich
>>> On 28.02.17 at 14:47,  wrote:
> On 28/02/17 13:27, Jan Beulich wrote:
>> This is only part of the necessary changes. Some needed to be
>> dropped due to code having changed recently, and the biggest
>> missing part is the adjustment of the insn emulator, when I'd
>> prefer to do this work only after the non-RFC parts of
>> https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg03474.html 
>> have gone in (in order to avoid having to ping-pong re-base
>> that and this series).
>>
>> 1: re-introduce non-underscore prefixed 32-bit register names
>> 2: switch away from temporary 32-bit register names
>> 3: HVM: switch away from temporary 32-bit register names
>> 4: HVMemul: switch away from temporary 32-bit register names
>> 5: mm: switch away from temporary 32-bit register names
>> 6: SVM: switch away from temporary 32-bit register names
>> 7: Viridian: switch away from temporary 32-bit register names
>> 8: VMX: switch away from temporary 32-bit register names
>>
>> Signed-off-by: Jan Beulich 
>>
> 
> Your Viridian patch is labelled 7 here, but 5 in the email.  I guess
> that is just an oversight?

Indeed - I don't know how that happened.

> All Reviewed-by: Andrew Cooper 

Thanks, Jan




[Xen-devel] [xen-4.4-testing test] 106220: regressions - FAIL

2017-02-28 Thread osstest service owner
flight 106220 xen-4.4-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/106220/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xend-qemut-winxpsp3 15 guest-localmigrate/x10 fail in 106051 
REGR. vs. 105835

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemut-win7-amd64 3 host-install(3) broken in 106018 pass in 
106220
 test-amd64-amd64-xl-qemuu-ovmf-amd64 3 host-install(3) broken in 106018 pass 
in 106220
 test-amd64-i386-xl-qemuu-ovmf-amd64 3 host-install(3) broken in 106018 pass in 
106220
 test-amd64-i386-xl   3 host-install(3) broken in 106018 pass in 106220
 test-amd64-amd64-pygrub  3 host-install(3) broken in 106018 pass in 106220
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 3 host-install(3) broken in 106018 
pass in 106220
 test-amd64-i386-freebsd10-amd64 3 host-install(3) broken in 106051 pass in 
106220
 test-armhf-armhf-xl-multivcpu 16 guest-start.2   fail in 106051 pass in 106130
 test-armhf-armhf-xl-credit2   6 xen-boot fail in 106130 pass in 106220
 test-armhf-armhf-xl-multivcpu 11 guest-start fail in 106183 pass in 106198
 test-armhf-armhf-xl-credit2 15 guest-start/debian.repeat fail in 106183 pass 
in 106220
 test-armhf-armhf-xl-multivcpu 15 guest-start/debian.repeat fail in 106198 pass 
in 106130
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail pass in 106018
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail pass in 106051
 test-amd64-i386-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail pass in 
106183
 test-armhf-armhf-xl-multivcpu  9 debian-installfail pass in 106198
 test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail pass in 
106198

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop   fail in 106183 like 105835
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop  fail in 106198 like 105835
 test-amd64-i386-xend-qemut-winxpsp3  9 windows-installfail like 105815
 test-xtf-amd64-amd64-2   16 xtf/test-pv32pae-selftestfail  like 105835
 test-xtf-amd64-amd64-4   16 xtf/test-pv32pae-selftestfail  like 105835
 test-xtf-amd64-amd64-3   20 xtf/test-hvm32-invlpg~shadow fail  like 105835
 test-xtf-amd64-amd64-3 33 xtf/test-hvm32pae-invlpg~shadow fail like 105835
 test-xtf-amd64-amd64-3   44 xtf/test-hvm64-invlpg~shadow fail  like 105835
 test-xtf-amd64-amd64-2   54 leak-check/check fail  like 105835
 test-xtf-amd64-amd64-4   54 leak-check/check fail  like 105835
 test-xtf-amd64-amd64-1   54 leak-check/check fail  like 105835
 test-xtf-amd64-amd64-5   54 leak-check/check fail  like 105835
 test-xtf-amd64-amd64-3   54 leak-check/check fail  like 105835

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumprun-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-rumprun-i386  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail in 106198 never 
pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail in 106198 
never pass
 test-xtf-amd64-amd64-2   10 xtf-fep  fail   never pass
 test-xtf-amd64-amd64-4   10 xtf-fep  fail   never pass
 test-xtf-amd64-amd64-2   18 xtf/test-hvm32-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-4   18 xtf/test-hvm32-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-1   10 xtf-fep  fail   never pass
 test-xtf-amd64-amd64-1   16 xtf/test-pv32pae-selftestfail   never pass
 test-xtf-amd64-amd64-1   18 xtf/test-hvm32-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-2 31 xtf/test-hvm32pae-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-4 31 xtf/test-hvm32pae-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-2 37 xtf/test-hvm32pse-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-4 37 xtf/test-hvm32pse-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-2   41 xtf/test-hvm64-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-1 31 xtf/test-hvm32pae-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-4   41 xtf/test-hvm64-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-1 37 xtf/test-hvm32pse-cpuid-faulting fail never pass
 test-xtf-amd64-amd64-1   41 xtf/test-hvm64-cpuid-faulting fail  never pass
 test-xtf-amd64-amd64-5   10 xtf-fep  fail   never pass
 build-amd64-rumprun   7 xen-buildfail   never pass
 build-i386-rumprun7 xen-buildfail   never pass
 test-xtf-amd64-amd64-3   10 xtf-fep  fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-xtf-amd64-amd64-2   53 xtf/test-hvm64-xsa-195   

Re: [Xen-devel] [PATCH v3 3/7] xen: credit2: group the runq manipulating functions.

2017-02-28 Thread Dario Faggioli
On Tue, 2017-02-28 at 13:55 +, Andrew Cooper wrote:
> On 28/02/17 11:52, Dario Faggioli wrote:
> > 
> > +static inline bool_t same_node(unsigned int cpua, unsigned int
> > cpub)
> 
> s/bool_t/bool/g
> 
Oh.. Yes, you're right. Sorry!

> > +
> > +if ( unlikely(tb_init_done) )
> > +{
> > +struct {
> > +unsigned rqi:16, max_weight:16;
> 
> More commonly known as uint16_t :)
> 
Yeah, I know. :-)

But tracing code in Credit2 is done like above everywhere, and while I
see and agree on your point, I feel more comfortable in following suit.

And anyway, I'm considering a follow-up cleanup where I'll get rid of
all these 'if (tracing){...}' blocks, and substitute them with inline
functions, and I can certainly do the type change there as well.

Thanks and Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



Re: [Xen-devel] [PATCH v8 13/24] x86: refactor psr: implement CPU init and free flow for CDP.

2017-02-28 Thread Roger Pau Monné
On Wed, Feb 15, 2017 at 04:49:28PM +0800, Yi Sun wrote:
> This patch implements the CPU init and free flow for CDP including L3 CDP
> initialization callback function.
> 
> Signed-off-by: Yi Sun 
> ---
>  xen/arch/x86/psr.c | 104 
> +
>  1 file changed, 98 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 82bb8fe..4c08779 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -97,6 +97,7 @@ struct psr_cat_hw_info {
>  struct feat_hw_info {
>  union {
>  struct psr_cat_hw_info l3_cat_info;
> +struct psr_cat_hw_info l3_cdp_info;
>  };
>  };
>  
> @@ -195,6 +196,22 @@ struct feat_node {
>  struct list_head list;
>  };
>  
> +/*
> + * get_data - get DATA COS register value from input COS ID.
> + * @feat:the feature list entry.
> + * @cos: the COS ID.
> + */
> +#define get_cdp_data(feat, cos)  \
> +( feat->cos_reg_val[cos * 2] )

This should be:

((feat)->cos_reg_val[(cos) * 2])

And the same treatment should be applied to the macro below.

> +
> +/*
> + * get_cdp_code - get CODE COS register value from input COS ID.
> + * @feat:the feature list entry.
> + * @cos: the COS ID.
> + */
> +#define get_cdp_code(feat, cos)  \
> +( feat->cos_reg_val[cos * 2 + 1] )
> +
>  struct psr_assoc {
>  uint64_t val;
>  uint64_t cos_mask;
> @@ -217,6 +234,7 @@ static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
>   * cpu_add_remove_lock spinlock.
>   */
>  static struct feat_node *feat_l3_cat;
> +static struct feat_node *feat_l3_cdp;
>  
>  /* Common functions. */
>  static void free_feature(struct psr_socket_info *info)
> @@ -457,6 +475,63 @@ static const struct feat_ops l3_cat_ops = {
>  .write_msr = l3_cat_write_msr,
>  };
>  
> +/* L3 CDP functions implementation. */
> +static void l3_cdp_init_feature(struct cpuid_leaf regs,
> +struct feat_node *feat,
> +struct psr_socket_info *info)
> +{
> +struct psr_cat_hw_info l3_cdp = { };
> +unsigned int socket;
> +uint64_t val;
> +
> +/* No valid value so do not enable feature. */
> +if ( !regs.a || !regs.d )
> +return;
> +
> +l3_cdp.cbm_len = (regs.a & CAT_CBM_LEN_MASK) + 1;
> +/* Cut half of cos_max when CDP is enabled. */
> +l3_cdp.cos_max = min(opt_cos_max, regs.d & CAT_COS_MAX_MASK) >> 1;
> +
> +/* cos=0 is reserved as default cbm(all ones). */
> +get_cdp_code(feat, 0) =
> + (1ull << l3_cdp.cbm_len) - 1;

I think that all those ull suffixes should be turned into uint64_t casts,
because that's the type that you are actually using. Or else just use ul, which
is the same and shorter.

Roger.



Re: [Xen-devel] [PATCH v8 14/24] x86: refactor psr: implement get hw info flow for CDP.

2017-02-28 Thread Roger Pau Monné
On Wed, Feb 15, 2017 at 04:49:29PM +0800, Yi Sun wrote:
> This patch implements get HW info flow for CDP including L3 CDP callback
> function.
> 
> It also changes sysctl function to make it work for CDP.
> 
> With this patch, 'psr-hwinfo' can work for L3 CDP.
> 
> Signed-off-by: Yi Sun 
> ---
>  xen/arch/x86/psr.c| 18 ++
>  xen/arch/x86/sysctl.c | 24 +---
>  2 files changed, 39 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 4c08779..72c9888 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -270,6 +270,10 @@ static enum psr_feat_type psr_cbm_type_to_feat_type(enum 
> cbm_type type)
>  case PSR_CBM_TYPE_L3:
>  feat_type = PSR_SOCKET_L3_CAT;
>  break;
> +case PSR_CBM_TYPE_L3_DATA:
> +case PSR_CBM_TYPE_L3_CODE:
> +feat_type = PSR_SOCKET_L3_CDP;
> +break;
>  default:
>  feat_type = PSR_SOCKET_UNKNOWN;
>  break;
> @@ -528,8 +532,22 @@ static unsigned int l3_cdp_get_cos_max(const struct 
> feat_node *feat)
>  return feat->info.l3_cdp_info.cos_max;
>  }
>  
> +static bool l3_cdp_get_feat_info(const struct feat_node *feat,
> + uint32_t data[], uint32_t array_len)
> +{
> +if ( !data || 3 > array_len )

array_len != 3?

> +return false;
> +
> +data[CBM_LEN] = feat->info.l3_cdp_info.cbm_len;
> +data[COS_MAX] = feat->info.l3_cdp_info.cos_max;
> +data[PSR_FLAG] |= XEN_SYSCTL_PSR_CAT_L3_CDP;
> +
> +return true;
> +}
> +
>  struct feat_ops l3_cdp_ops = {
>  .get_cos_max = l3_cdp_get_cos_max,
> +.get_feat_info = l3_cdp_get_feat_info,
>  };
>  
>  static void __init parse_psr_bool(char *s, char *value, char *feature,
> diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
> index e340baa..568bfe9 100644
> --- a/xen/arch/x86/sysctl.c
> +++ b/xen/arch/x86/sysctl.c
> @@ -181,9 +181,27 @@ long arch_do_sysctl(
>  ret = psr_get_info(sysctl->u.psr_cat_op.target,
> PSR_CBM_TYPE_L3, data, 3);
>  
> -sysctl->u.psr_cat_op.u.l3_info.cbm_len = data[CBM_LEN];
> -sysctl->u.psr_cat_op.u.l3_info.cos_max = data[COS_MAX];
> -sysctl->u.psr_cat_op.u.l3_info.flags   = data[PSR_FLAG];
> +if ( !ret )
> +{
> +sysctl->u.psr_cat_op.u.l3_info.cbm_len = data[CBM_LEN];
> +sysctl->u.psr_cat_op.u.l3_info.cos_max = data[COS_MAX];
> +sysctl->u.psr_cat_op.u.l3_info.flags   = data[PSR_FLAG];
> +} else {

Coding style, should be:

}
else
{

Roger.



Re: [Xen-devel] [PATCH v8 15/24] x86: refactor psr: implement get value flow for CDP.

2017-02-28 Thread Roger Pau Monné
On Wed, Feb 15, 2017 at 04:49:30PM +0800, Yi Sun wrote:
> This patch implements L3 CDP get value callback function.
> 
> With this patch, 'psr-cat-show' can work for L3 CDP.
> 
> Signed-off-by: Yi Sun 
> ---
>  xen/arch/x86/psr.c | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index 72c9888..72ed923 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -545,9 +545,25 @@ static bool l3_cdp_get_feat_info(const struct feat_node 
> *feat,
>  return true;
>  }
>  
> +static bool l3_cdp_get_val(const struct feat_node *feat, unsigned int cos,
> +   enum cbm_type type, uint64_t *val)
> +{
> +if ( cos > feat->info.l3_cdp_info.cos_max )
> +/* Use default value. */
> +cos = 0;

As with the other get_val, I think that this should return false.

Roger.


