date:20200817

Re: [PATCH] efi: discover ESRT table on Xen PV too

2020-08-17 Thread Ard Biesheuvel

Hi Marek,

On Sun, 16 Aug 2020 at 02:20, Marek Marczykowski-Górecki wrote:
>
> In case of Xen PV dom0, Xen passes along info about system tables (see
> arch/x86/xen/efi.c), but not the memory map from EFI. This makes sense,
> as Xen is responsible for managing the physical memory address space.
> In this case, it doesn't make sense to make use of the ESRT table
> conditional on the availability of the EFI memory map, as the Linux
> kernel isn't responsible for it. Skip this part on Xen PV (let Xen do
> the right thing if it deems necessary) and use the ESRT table normally.
>
> This is a requirement for using fwupd in PV dom0 to update UEFI using
> capsules.
>
> Signed-off-by: Marek Marczykowski-Górecki 
> ---
>  drivers/firmware/efi/esrt.c | 47 -
>  1 file changed, 25 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/firmware/efi/esrt.c b/drivers/firmware/efi/esrt.c
> index d5915272141f..5c49f2aaa4b1 100644
> --- a/drivers/firmware/efi/esrt.c
> +++ b/drivers/firmware/efi/esrt.c
> @@ -245,36 +245,38 @@ void __init efi_esrt_init(void)
> int rc;
> phys_addr_t end;
>
> -   if (!efi_enabled(EFI_MEMMAP))
> +   if (!efi_enabled(EFI_MEMMAP) && !efi_enabled(EFI_PARAVIRT))
> return;
>
> pr_debug("esrt-init: loading.\n");
> if (!esrt_table_exists())
> return;
>
> -   rc = efi_mem_desc_lookup(efi.esrt, &md);
> -   if (rc < 0 ||
> -   (!(md.attribute & EFI_MEMORY_RUNTIME) &&
> -md.type != EFI_BOOT_SERVICES_DATA &&
> -md.type != EFI_RUNTIME_SERVICES_DATA)) {
> -   pr_warn("ESRT header is not in the memory map.\n");
> -   return;
> -   }
> +   if (efi_enabled(EFI_MEMMAP)) {
> +   rc = efi_mem_desc_lookup(efi.esrt, &md);
> +   if (rc < 0 ||
> +   (!(md.attribute & EFI_MEMORY_RUNTIME) &&
> +md.type != EFI_BOOT_SERVICES_DATA &&
> +md.type != EFI_RUNTIME_SERVICES_DATA)) {
> +   pr_warn("ESRT header is not in the memory map.\n");
> +   return;
> +   }
>
> -   max = efi_mem_desc_end(&md);
> -   if (max < efi.esrt) {
> -   pr_err("EFI memory descriptor is invalid. (esrt: %p max: %p)\n",
> -  (void *)efi.esrt, (void *)max);
> -   return;
> -   }
> +   max = efi_mem_desc_end(&md);
> +   if (max < efi.esrt) {
> +   pr_err("EFI memory descriptor is invalid. (esrt: %p max: %p)\n",
> +  (void *)efi.esrt, (void *)max);
> +   return;
> +   }
>
> -   size = sizeof(*esrt);
> -   max -= efi.esrt;
> +   size = sizeof(*esrt);
> +   max -= efi.esrt;
>
> -   if (max < size) {
> -   pr_err("ESRT header doesn't fit on single memory map entry. (size: %zu max: %zu)\n",
> -  size, max);
> -   return;
> +   if (max < size) {
> +   pr_err("ESRT header doesn't fit on single memory map entry. (size: %zu max: %zu)\n",
> +  size, max);
> +   return;
> +   }
> }
>
> va = early_memremap(efi.esrt, size);
> @@ -331,7 +333,8 @@ void __init efi_esrt_init(void)
>
> end = esrt_data + size;
> pr_info("Reserving ESRT space from %pa to %pa.\n", &esrt_data, &end);
> -   if (md.type == EFI_BOOT_SERVICES_DATA)
> +
> +   if (efi_enabled(EFI_MEMMAP) && md.type == EFI_BOOT_SERVICES_DATA)
> efi_mem_reserve(esrt_data, esrt_data_size);
>

This does not look correct to me. Why doesn't the region need to be
reserved on a Xen boot? The OS may overwrite it otherwise.


> pr_debug("esrt-init: loaded.\n");
> --
> 2.25.4
>
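
To make the reservation concern above concrete, here is a minimal sketch
of the two conditions the patch changes. It is not the kernel code: the
sketch_ names and the enum are stand-ins invented for illustration, and
efi_enabled() and md.type are only modelled as plain arguments. Under
those assumptions, a boot with EFI_PARAVIRT set but no EFI_MEMMAP gets
past the early return, yet never satisfies the condition that guards
efi_mem_reserve().

```c
#include <stdbool.h>

/* Stand-in for the EFI memory descriptor types the patch tests (hypothetical). */
enum sketch_mem_type {
    SKETCH_BOOT_SERVICES_DATA,
    SKETCH_RUNTIME_SERVICES_DATA,
    SKETCH_OTHER,
};

/*
 * Models the patched early-return check: initialisation continues if
 * either the EFI memory map or the paravirt flag is available.
 */
static bool sketch_esrt_init_proceeds(bool have_memmap, bool paravirt)
{
    return have_memmap || paravirt;
}

/*
 * Models the patched reservation condition. Without a memory map the
 * descriptor type is unknown, so the second argument is ignored then.
 */
static bool sketch_esrt_is_reserved(bool have_memmap, enum sketch_mem_type type)
{
    return have_memmap && type == SKETCH_BOOT_SERVICES_DATA;
}
```

With have_memmap false and paravirt true, the first helper returns true
and the second returns false for every type, which is the case the
review question is about.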

Re: [PATCH] xen: Introduce cmpxchg64() and guest_cmpxchg64()

2020-08-17 Thread Julien Grall


Hi,

On 17/08/2020 10:24, Roger Pau Monné wrote:

> On Sat, Aug 15, 2020 at 06:21:43PM +0100, Julien Grall wrote:
>> From: Julien Grall
>>
>> The IOREQ code is using cmpxchg() with 64-bit value. At the moment, this
>> is x86 code, but there is plan to make it common.
>>
>> To cater 32-bit arch, introduce two new helpers to deal with 64-bit
>> cmpxchg.
>>
>> The Arm 32-bit implementation of cmpxchg64() is based on the __cmpxchg64
>> in Linux v5.8 (arch/arm/include/asm/cmpxchg.h).
>>
>> Signed-off-by: Julien Grall
>> Cc: Oleksandr Tyshchenko
>> ---
>> diff --git a/xen/include/asm-x86/guest_atomics.h b/xen/include/asm-x86/guest_atomics.h
>> index 029417c8ffc1..f4de9d3631ff 100644
>> --- a/xen/include/asm-x86/guest_atomics.h
>> +++ b/xen/include/asm-x86/guest_atomics.h
>> @@ -20,6 +20,8 @@
>>   ((void)(d), test_and_change_bit(nr, p))
>>
>>   #define guest_cmpxchg(d, ptr, o, n) ((void)(d), cmpxchg(ptr, o, n))
>>
>> +#define guest_cmpxchg64(d, ptr, o, n) ((void)(d), cmpxchg64(ptr, o, n))
>> +
>>
>>   #endif /* _X86_GUEST_ATOMICS_H */
>>
>>   /*
>> diff --git a/xen/include/asm-x86/x86_64/system.h b/xen/include/asm-x86/x86_64/system.h
>> index f471859c19cc..c1b16105e9f2 100644
>> --- a/xen/include/asm-x86/x86_64/system.h
>> +++ b/xen/include/asm-x86/x86_64/system.h
>> @@ -5,6 +5,8 @@
>>   ((__typeof__(*(ptr)))__cmpxchg((ptr),(unsigned long)(o),\
>>  (unsigned long)(n),sizeof(*(ptr))))
>>
>> +#define cmpxchg64(ptr, o, n) cmpxchg(ptr, o, n)
>
> Why do you need to introduce an explicitly sized version of cmpxchg
> for 64bit values?
>
> There's no cmpxchg{8,16,32}, so I would expect cmpxchg64 to just be
> handled by cmpxchg detecting the size of the parameter passed to the
> function.

That works quite well for 64-bit arches. However, for 32-bit, you would
need to take some detour so 32-bit and 64-bit can cohabit (you cannot
simply replace unsigned long with uint64_t).

I couldn't come up with a way to do it. Do you have any suggestion?

Cheers,

--
Julien Grall
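
The 32-bit difficulty described in the reply can be sketched as follows.
This is an illustration only, not the Xen or Linux implementation: the
sketch_ names are hypothetical, and the helper relies on the GCC/Clang
builtin __atomic_compare_exchange_n (on some 32-bit targets that may
need libatomic). The second function models what happens when a 64-bit
operand is funnelled through a 32-bit unsigned long, as a
size-dispatching cmpxchg() on a 32-bit arch would do with its casts.

```c
#include <stdint.h>

/*
 * Hypothetical helper: compare *ptr with old_val and, if equal, store
 * new_val. Returns the value that was found at *ptr, which is the
 * convention the cmpxchg() family follows.
 */
static inline uint64_t sketch_cmpxchg64(uint64_t *ptr, uint64_t old_val,
                                        uint64_t new_val)
{
    uint64_t expected = old_val;

    /* On failure the builtin writes the current value into 'expected'. */
    __atomic_compare_exchange_n(ptr, &expected, new_val, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}

/*
 * Models a 64-bit operand being cast to a 32-bit unsigned long: the
 * upper 32 bits are silently dropped.
 */
static inline uint64_t sketch_via_32bit_long(uint64_t v)
{
    return (uint64_t)(uint32_t)v;
}
```

A dedicated 64-bit entry point avoids the narrowing cast altogether,
which is one way to read the motivation for a separate cmpxchg64().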

[PATCH II v2 09/17] tools/misc: replace PAGE_SIZE with XC_PAGE_SIZE in xen-mfndump.c

2020-08-17 Thread Juergen Gross

The definition of PAGE_SIZE comes from xc_private.h, which shouldn't be
used by xen-mfndump.c. Replace PAGE_SIZE by XC_PAGE_SIZE, as
xc_private.h contains:

#define PAGE_SIZE XC_PAGE_SIZE

For the same reason PAGE_SHIFT_X86 needs to be replaced with
XC_PAGE_SHIFT.

Signed-off-by: Juergen Gross 
---
 tools/misc/xen-mfndump.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/tools/misc/xen-mfndump.c b/tools/misc/xen-mfndump.c
index cb15d08c7e..92bc954ce0 100644
--- a/tools/misc/xen-mfndump.c
+++ b/tools/misc/xen-mfndump.c
@@ -207,7 +207,7 @@ int dump_ptes_func(int argc, char *argv[])
 goto out;
 }
 
-page = xc_map_foreign_range(xch, domid, PAGE_SIZE, PROT_READ,
+page = xc_map_foreign_range(xch, domid, XC_PAGE_SIZE, PROT_READ,
 minfo.p2m_table[pfn]);
 if ( !page )
 {
@@ -216,7 +216,7 @@ int dump_ptes_func(int argc, char *argv[])
 goto out;
 }
 
-pte_num = PAGE_SIZE / 8;
+pte_num = XC_PAGE_SIZE / 8;
 
 printf(" --- Dumping %d PTEs for domain %d ---\n", pte_num, domid);
 printf(" Guest Width: %u, PT Levels: %u P2M size: = %lu\n",
@@ -252,7 +252,7 @@ int dump_ptes_func(int argc, char *argv[])
 
  out:
 if ( page )
-munmap(page, PAGE_SIZE);
+munmap(page, XC_PAGE_SIZE);
 xc_unmap_domain_meminfo(xch, &minfo);
 munmap(m2p_table, M2P_SIZE(max_mfn));
 return rc;
@@ -290,7 +290,7 @@ int lookup_pte_func(int argc, char *argv[])
 return -1;
 }
 
-pte_num = PAGE_SIZE / 8;
+pte_num = XC_PAGE_SIZE / 8;
 
 printf(" --- Lookig for PTEs mapping mfn 0x%lx for domain %d ---\n",
mfn, domid);
@@ -302,7 +302,7 @@ int lookup_pte_func(int argc, char *argv[])
 if ( !(minfo.pfn_type[i] & XEN_DOMCTL_PFINFO_LTABTYPE_MASK) )
 continue;
 
-page = xc_map_foreign_range(xch, domid, PAGE_SIZE, PROT_READ,
+page = xc_map_foreign_range(xch, domid, XC_PAGE_SIZE, PROT_READ,
 minfo.p2m_table[i]);
 if ( !page )
 continue;
@@ -312,15 +312,15 @@ int lookup_pte_func(int argc, char *argv[])
 uint64_t pte = ((const uint64_t*)page)[j];
 
 #define __MADDR_BITS_X86  ((minfo.guest_width == 8) ? 52 : 44)
-#define __MFN_MASK_X86((1ULL << (__MADDR_BITS_X86 - PAGE_SHIFT_X86)) - 1)
-if ( ((pte >> PAGE_SHIFT_X86) & __MFN_MASK_X86) == mfn)
+#define __MFN_MASK_X86((1ULL << (__MADDR_BITS_X86 - XC_PAGE_SHIFT)) - 1)
+if ( ((pte >> XC_PAGE_SHIFT) & __MFN_MASK_X86) == mfn)
 printf("  0x%lx <-- [0x%lx][%lu]: 0x%"PRIx64"\n",
mfn, minfo.p2m_table[i], j, pte);
 #undef __MADDR_BITS_X86
 #undef __MFN_MASK_X8
 }
 
-munmap(page, PAGE_SIZE);
+munmap(page, XC_PAGE_SIZE);
 page = NULL;
 }
 
@@ -355,8 +355,8 @@ int memcmp_mfns_func(int argc, char *argv[])
 return -1;
 }
 
-page1 = xc_map_foreign_range(xch, domid1, PAGE_SIZE, PROT_READ, mfn1);
-page2 = xc_map_foreign_range(xch, domid2, PAGE_SIZE, PROT_READ, mfn2);
+page1 = xc_map_foreign_range(xch, domid1, XC_PAGE_SIZE, PROT_READ, mfn1);
+page2 = xc_map_foreign_range(xch, domid2, XC_PAGE_SIZE, PROT_READ, mfn2);
 if ( !page1 || !page2 )
 {
 ERROR("Failed to map either 0x%lx[dom %d] or 0x%lx[dom %d]\n",
@@ -368,13 +368,13 @@ int memcmp_mfns_func(int argc, char *argv[])
 printf(" --- Comparing the content of 2 MFNs ---\n");
 printf(" 1: 0x%lx[dom %d], 2: 0x%lx[dom %d]\n",
mfn1, domid1, mfn2, domid2);
-printf("  memcpy(1, 2) = %d\n", memcmp(page1, page2, PAGE_SIZE));
+printf("  memcpy(1, 2) = %d\n", memcmp(page1, page2, XC_PAGE_SIZE));
 
  out:
 if ( page1 )
-munmap(page1, PAGE_SIZE);
+munmap(page1, XC_PAGE_SIZE);
 if ( page2 )
-munmap(page2, PAGE_SIZE);
+munmap(page2, XC_PAGE_SIZE);
 return rc;
 }
 
-- 
2.26.2
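
The page-size arithmetic touched by the patch above can be sketched in
isolation. The SKETCH_ constants are local stand-ins assumed to mirror
XC_PAGE_SHIFT and XC_PAGE_SIZE for 4 KiB pages; they are defined here
only so the example compiles without the Xen headers, and the function
names are hypothetical.

```c
#include <stdint.h>

/* Assumed to mirror XC_PAGE_SHIFT / XC_PAGE_SIZE (4 KiB pages). */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Number of 8-byte PTEs in one page, as in "pte_num = XC_PAGE_SIZE / 8". */
static unsigned long sketch_pte_count(void)
{
    return SKETCH_PAGE_SIZE / 8;
}

/*
 * Extract the MFN from a PTE the way lookup_pte_func() does: shift out
 * the page offset bits, then mask to the machine address width
 * (52 bits for 8-byte guests, 44 bits otherwise).
 */
static uint64_t sketch_pte_to_mfn(uint64_t pte, unsigned int guest_width)
{
    unsigned int maddr_bits = (guest_width == 8) ? 52 : 44;
    uint64_t mfn_mask = (1ULL << (maddr_bits - SKETCH_PAGE_SHIFT)) - 1;

    return (pte >> SKETCH_PAGE_SHIFT) & mfn_mask;
}
```

The mask also strips flag bits above the address field, such as bit 63
of a 64-bit PTE.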

[PATCH II v2 04/17] tools/python: drop libxenguest from setup.py

2020-08-17 Thread Juergen Gross

There is not a single wrapper for a libxenguest function defined.
So drop libxenguest from tools/python/setup.py.

Signed-off-by: Juergen Gross 
---
 tools/python/setup.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/python/setup.py b/tools/python/setup.py
index 8faf1c0ddc..44696b3998 100644
--- a/tools/python/setup.py
+++ b/tools/python/setup.py
@@ -21,8 +21,8 @@ xc = Extension("xc",
   PATH_LIBXC + "/include",
   "xen/lowlevel/xc" ],
library_dirs   = [ PATH_LIBXC ],
-   libraries  = [ "xenctrl", "xenguest" ],
-   depends= [ PATH_LIBXC + "/libxenctrl.so", PATH_LIBXC + "/libxenguest.so" ],
+   libraries  = [ "xenctrl" ],
+   depends= [ PATH_LIBXC + "/libxenctrl.so" ],
extra_link_args= [ "-Wl,-rpath-link="+PATH_LIBXENTOOLLOG ],
sources= [ "xen/lowlevel/xc/xc.c" ])
 
-- 
2.26.2

[PATCH II v2 12/17] tools/libxc: move xc_[un]map_domain_meminfo() into new source xg_domain.c

2020-08-17 Thread Juergen Gross

Move xc_[un]map_domain_meminfo() functions to new source xg_domain.c as
they are defined in include/xenguest.h and should be in libxenguest.

Signed-off-by: Juergen Gross 
---
 tools/libxc/Makefile|   4 +-
 tools/libxc/xc_domain.c | 126 -
 tools/libxc/xg_domain.c | 149 
 3 files changed, 152 insertions(+), 127 deletions(-)
 create mode 100644 tools/libxc/xg_domain.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index c1e41a8ee9..f3f1edc07b 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -52,7 +52,9 @@ CTRL_SRCS-y   += xc_gnttab_compat.c
 CTRL_SRCS-y   += xc_devicemodel_compat.c
 
 GUEST_SRCS-y :=
-GUEST_SRCS-y += xg_private.c xc_suspend.c
+GUEST_SRCS-y += xg_private.c
+GUEST_SRCS-y += xg_domain.c
+GUEST_SRCS-y += xc_suspend.c
 ifeq ($(CONFIG_MIGRATE),y)
 GUEST_SRCS-y += xc_sr_common.c
 GUEST_SRCS-$(CONFIG_X86) += xc_sr_common_x86.c
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 71829c2bce..fbc22c4df6 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -1892,132 +1892,6 @@ int xc_domain_unbind_pt_spi_irq(xc_interface *xch,
 PT_IRQ_TYPE_SPI, 0, 0, 0, 0, spi));
 }
 
-int xc_unmap_domain_meminfo(xc_interface *xch, struct xc_domain_meminfo *minfo)
-{
-struct domain_info_context _di = { .guest_width = minfo->guest_width,
-   .p2m_size = minfo->p2m_size};
-struct domain_info_context *dinfo = &_di;
-
-free(minfo->pfn_type);
-if ( minfo->p2m_table )
-munmap(minfo->p2m_table, P2M_FL_ENTRIES * PAGE_SIZE);
-minfo->p2m_table = NULL;
-
-return 0;
-}
-
-int xc_map_domain_meminfo(xc_interface *xch, uint32_t domid,
-  struct xc_domain_meminfo *minfo)
-{
-struct domain_info_context _di;
-struct domain_info_context *dinfo = &_di;
-
-xc_dominfo_t info;
-shared_info_any_t *live_shinfo;
-xen_capabilities_info_t xen_caps = "";
-int i;
-
-/* Only be initialized once */
-if ( minfo->pfn_type || minfo->p2m_table )
-{
-errno = EINVAL;
-return -1;
-}
-
-if ( xc_domain_getinfo(xch, domid, 1, &info) != 1 )
-{
-PERROR("Could not get domain info");
-return -1;
-}
-
-if ( xc_domain_get_guest_width(xch, domid, &minfo->guest_width) )
-{
-PERROR("Could not get domain address size");
-return -1;
-}
-_di.guest_width = minfo->guest_width;
-
-/* Get page table levels (see get_platform_info() in xg_save_restore.h */
-if ( xc_version(xch, XENVER_capabilities, &xen_caps) )
-{
-PERROR("Could not get Xen capabilities (for page table levels)");
-return -1;
-}
-if ( strstr(xen_caps, "xen-3.0-x86_64") )
-/* Depends on whether it's a compat 32-on-64 guest */
-minfo->pt_levels = ( (minfo->guest_width == 8) ? 4 : 3 );
-else if ( strstr(xen_caps, "xen-3.0-x86_32p") )
-minfo->pt_levels = 3;
-else if ( strstr(xen_caps, "xen-3.0-x86_32") )
-minfo->pt_levels = 2;
-else
-{
-errno = EFAULT;
-return -1;
-}
-
-/* We need the shared info page for mapping the P2M */
-live_shinfo = xc_map_foreign_range(xch, domid, PAGE_SIZE, PROT_READ,
-   info.shared_info_frame);
-if ( !live_shinfo )
-{
-PERROR("Could not map the shared info frame (MFN 0x%lx)",
-   info.shared_info_frame);
-return -1;
-}
-
-if ( xc_core_arch_map_p2m_writable(xch, minfo->guest_width, &info,
-   live_shinfo, &minfo->p2m_table,
-   &minfo->p2m_size) )
-{
-PERROR("Could not map the P2M table");
-munmap(live_shinfo, PAGE_SIZE);
-return -1;
-}
-munmap(live_shinfo, PAGE_SIZE);
-_di.p2m_size = minfo->p2m_size;
-
-/* Make space and prepare for getting the PFN types */
-minfo->pfn_type = calloc(sizeof(*minfo->pfn_type), minfo->p2m_size);
-if ( !minfo->pfn_type )
-{
-PERROR("Could not allocate memory for the PFN types");
-goto failed;
-}
-for ( i = 0; i < minfo->p2m_size; i++ )
-minfo->pfn_type[i] = xc_pfn_to_mfn(i, minfo->p2m_table,
-   minfo->guest_width);
-
-/* Retrieve PFN types in batches */
-for ( i = 0; i < minfo->p2m_size ; i+=1024 )
-{
-int count = ((minfo->p2m_size - i ) > 1024 ) ?
-1024: (minfo->p2m_size - i);
-
-if ( xc_get_pfn_type_batch(xch, domid, count, minfo->pfn_type + i) )
-{
-PERROR("Could not get %d-eth batch of PFN types", (i+1)/1024);
-goto failed;
-}
-}
-
-return 0;
-
-failed:
-if ( minfo->pfn_type )
-{
-free(minfo->pfn_type);
-minfo->pfn_type = NULL;
-}
-if ( minfo->p2m_table )
-

[PATCH II v2 03/17] tools: tweak tools/libs/libs.mk for being able to support libxenctrl

2020-08-17 Thread Juergen Gross

tools/libs/libs.mk needs to be modified to support building
libxenctrl, as the pkg-config file of that library does not follow
the same conventions as those of the other libraries.

So add support for specifying PKG_CONFIG before including libs.mk.

In order to make life easier for unstable libraries like libxenctrl,
set MAJOR and MINOR automatically to the Xen version and 0 when not
specified. This removes the need to bump the versions of unstable
libraries when switching to a new Xen version.

As all libraries built via libs.mk require a map file, generate a dummy
one in case none exists. This again helps to avoid the need to bump
the library version in the map file of an unstable library in case it
is exporting all symbols.

The clean target is missing the removal of _paths.h.

Finally drop the foreach loop when setting PKG_CONFIG_LOCAL, as there
is always only one element in PKG_CONFIG.

Signed-off-by: Juergen Gross 
---
 tools/libs/libs.mk | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/tools/libs/libs.mk b/tools/libs/libs.mk
index 19efc5e743..8b1ca2aa62 100644
--- a/tools/libs/libs.mk
+++ b/tools/libs/libs.mk
@@ -1,10 +1,13 @@
 # Common Makefile for building a lib.
 #
 # Variables taken as input:
-#   MAJOR:   major version of lib
-#   MINOR:   minor version of lib
+#   PKG_CONFIG: name of pkg-config file (xen$(LIBNAME).pc if empty)
+#   MAJOR:   major version of lib (Xen version if empty)
+#   MINOR:   minor version of lib (0 if empty)
 
 LIBNAME := $(notdir $(CURDIR))
+MAJOR ?= $(shell $(XEN_ROOT)/version.sh $(XEN_ROOT)/xen/Makefile)
+MINOR ?= 0
 
 SHLIB_LDFLAGS += -Wl,--version-script=libxen$(LIBNAME).map
 
@@ -22,7 +25,7 @@ ifneq ($(nosharedlibs),y)
 LIB += libxen$(LIBNAME).so
 endif
 
-PKG_CONFIG := xen$(LIBNAME).pc
+PKG_CONFIG ?= xen$(LIBNAME).pc
 PKG_CONFIG_VERSION := $(MAJOR).$(MINOR)
 
 ifneq ($(CONFIG_LIBXC_MINIOS),y)
@@ -32,7 +35,7 @@ $(PKG_CONFIG_INST): PKG_CONFIG_INCDIR = $(includedir)
 $(PKG_CONFIG_INST): PKG_CONFIG_LIBDIR = $(libdir)
 endif
 
-PKG_CONFIG_LOCAL := $(foreach pc,$(PKG_CONFIG),$(PKG_CONFIG_DIR)/$(pc))
+PKG_CONFIG_LOCAL := $(PKG_CONFIG_DIR)/$(PKG_CONFIG)
 
 LIBHEADER ?= xen$(LIBNAME).h
 LIBHEADERS = $(foreach h, $(LIBHEADER), include/$(h))
@@ -45,7 +48,7 @@ $(PKG_CONFIG_LOCAL): PKG_CONFIG_LIBDIR = $(CURDIR)
 all: build
 
 .PHONY: build
-build: libs
+build: libs libxen$(LIBNAME).map
 
 .PHONY: libs
 libs: headers.chk $(LIB) $(PKG_CONFIG_INST) $(PKG_CONFIG_LOCAL)
@@ -64,6 +67,9 @@ endif
 
 headers.chk: $(LIBHEADERSGLOB) $(AUTOINCS)
 
+libxen$(LIBNAME).map:
+   echo 'VERS_$(MAJOR).$(MINOR) { global: *; };' >$@
+
 $(LIBHEADERSGLOB): $(LIBHEADERS)
for i in $(realpath $(LIBHEADERS)); do ln -sf $$i $(XEN_ROOT)/tools/include; done
 
@@ -87,7 +93,7 @@ install: build
$(SYMLINK_SHLIB) libxen$(LIBNAME).so.$(MAJOR).$(MINOR) $(DESTDIR)$(libdir)/libxen$(LIBNAME).so.$(MAJOR)
$(SYMLINK_SHLIB) libxen$(LIBNAME).so.$(MAJOR) $(DESTDIR)$(libdir)/libxen$(LIBNAME).so
for i in $(LIBHEADERS); do $(INSTALL_DATA) $$i $(DESTDIR)$(includedir); done
-   $(INSTALL_DATA) xen$(LIBNAME).pc $(DESTDIR)$(PKG_INSTALLDIR)
+   $(INSTALL_DATA) $(PKG_CONFIG) $(DESTDIR)$(PKG_INSTALLDIR)
 
 .PHONY: uninstall
 uninstall:
@@ -107,8 +113,9 @@ clean:
rm -rf *.rpm $(LIB) *~ $(DEPS_RM) $(LIB_OBJS) $(PIC_OBJS)
rm -f libxen$(LIBNAME).so.$(MAJOR).$(MINOR) libxen$(LIBNAME).so.$(MAJOR)
rm -f headers.chk
-   rm -f xen$(LIBNAME).pc
+   rm -f $(PKG_CONFIG)
rm -f $(LIBHEADERSGLOB)
+   rm -f _paths.h
 
 .PHONY: distclean
 distclean: clean
-- 
2.26.2

[PATCH II v2 06/17] tools: don't assume libxenguest and libxenctrl to be in same directory

2020-08-17 Thread Juergen Gross

Quite a few places in the Makefiles assume that libxenguest and
libxenctrl are built in the same directory via a single Makefile.

Drop this assumption by specifying the dependencies and path variables
for both libraries correctly.

Signed-off-by: Juergen Gross 
---
 tools/Rules.mk   | 7 +++
 tools/libxl/Makefile | 2 +-
 tools/misc/Makefile  | 1 +
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/Rules.mk b/tools/Rules.mk
index b36818bcaa..35d237bba6 100644
--- a/tools/Rules.mk
+++ b/tools/Rules.mk
@@ -31,8 +31,7 @@ LIBS_LIBS += hypfs
 USELIBS_hypfs := toollog toolcore call
 
 XEN_libxenctrl = $(XEN_ROOT)/tools/libxc
-# Currently libxenguest lives in the same directory as libxenctrl
-XEN_libxenguest= $(XEN_libxenctrl)
+XEN_libxenguest= $(XEN_ROOT)/tools/libxc
 XEN_libxenlight= $(XEN_ROOT)/tools/libxl
 # Currently libxlutil lives in the same directory as libxenlight
 XEN_libxlutil  = $(XEN_libxenlight)
@@ -132,7 +131,7 @@ LDLIBS_libxenguest = $(SHDEPS_libxenguest) $(XEN_libxenguest)/libxenguest$(libex
 SHLIB_libxenguest  = $(SHDEPS_libxenguest) -Wl,-rpath-link=$(XEN_libxenguest)
 
 CFLAGS_libxenstore = -I$(XEN_libxenstore)/include $(CFLAGS_xeninclude)
-SHDEPS_libxenstore = $(SHLIB_libxentoolcore)
+SHDEPS_libxenstore = $(SHLIB_libxentoolcore) $(SHLIB_libxenctrl)
 LDLIBS_libxenstore = $(SHDEPS_libxenstore) $(XEN_libxenstore)/libxenstore$(libextension)
 SHLIB_libxenstore  = $(SHDEPS_libxenstore) -Wl,-rpath-link=$(XEN_libxenstore)
 ifeq ($(CONFIG_Linux),y)
@@ -159,7 +158,7 @@ CFLAGS += -O2 -fomit-frame-pointer
 endif
 
 CFLAGS_libxenlight = -I$(XEN_libxenlight) $(CFLAGS_libxenctrl) $(CFLAGS_xeninclude)
-SHDEPS_libxenlight = $(SHLIB_libxenctrl) $(SHLIB_libxenstore) $(SHLIB_libxenhypfs)
+SHDEPS_libxenlight = $(SHLIB_libxenctrl) $(SHLIB_libxenstore) $(SHLIB_libxenhypfs) $(SHLIB_libxenguest)
 LDLIBS_libxenlight = $(SHDEPS_libxenlight) $(XEN_libxenlight)/libxenlight$(libextension)
 SHLIB_libxenlight  = $(SHDEPS_libxenlight) -Wl,-rpath-link=$(XEN_libxenlight)
 
diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 0e8dfc6193..65f3968947 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -188,7 +188,7 @@ libxl_dom.o: CFLAGS += -I$(XEN_ROOT)/tools  # include libacpi/x86.h
 libxl_x86_acpi.o: CFLAGS += -I$(XEN_ROOT)/tools
 
 SAVE_HELPER_OBJS = libxl_save_helper.o _libxl_save_msgs_helper.o
-$(SAVE_HELPER_OBJS): CFLAGS += $(CFLAGS_libxenctrl) $(CFLAGS_libxenevtchn)
+$(SAVE_HELPER_OBJS): CFLAGS += $(CFLAGS_libxenctrl) $(CFLAGS_libxenevtchn) $(CFLAGS_libxenguest)
 
 PKG_CONFIG = xenlight.pc xlutil.pc
 PKG_CONFIG_VERSION := $(MAJOR).$(MINOR)
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index 9fdb13597f..e7e74db85f 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -6,6 +6,7 @@ CFLAGS += -Werror
 CFLAGS += -include $(XEN_ROOT)/tools/config.h
 CFLAGS += $(CFLAGS_libxenevtchn)
 CFLAGS += $(CFLAGS_libxenctrl)
+CFLAGS += $(CFLAGS_libxenguest)
 CFLAGS += $(CFLAGS_xeninclude)
 CFLAGS += $(CFLAGS_libxenstore)
 
-- 
2.26.2

[PATCH II v2 10/17] tools/misc: drop all libxc internals from xen-mfndump.c

2020-08-17 Thread Juergen Gross

The last libxc internal used by xen-mfndump.c is the ERROR() macro.
Add a simple definition for that macro to xen-mfndump.c and replace
the libxc private header includes by official ones.

Signed-off-by: Juergen Gross 
---
 tools/misc/Makefile  |  2 --
 tools/misc/xen-mfndump.c | 13 +
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index 2a7f2ec42d..7d37f297a9 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -99,8 +99,6 @@ xen-hptool: xen-hptool.o
 
 xenhypfs.o: CFLAGS += $(CFLAGS_libxenhypfs)
 
-# xen-mfndump incorrectly uses libxc internals
-xen-mfndump.o: CFLAGS += -I$(XEN_ROOT)/tools/libxc $(CFLAGS_libxencall)
 xen-mfndump: xen-mfndump.o
$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenevtchn) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(APPEND_LDFLAGS)
 
diff --git a/tools/misc/xen-mfndump.c b/tools/misc/xen-mfndump.c
index 92bc954ce0..62121bd241 100644
--- a/tools/misc/xen-mfndump.c
+++ b/tools/misc/xen-mfndump.c
@@ -1,15 +1,20 @@
-#define XC_WANT_COMPAT_MAP_FOREIGN_API
-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
 #include 
 #include 
 
+#define XC_WANT_COMPAT_MAP_FOREIGN_API
+#include 
+#include 
+
 #include 
 
 #define M2P_SIZE(_m)ROUNDUP(((_m) * sizeof(xen_pfn_t)), 21)
 #define is_mapped(pfn_type) (!((pfn_type) & 0x8000UL))
 
+#define ERROR(msg, args...) fprintf(stderr, msg, ## args)
+
 static xc_interface *xch;
 
 int help_func(int argc, char *argv[])
-- 
2.26.2
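
The ERROR() definition added above uses the GNU named-variadic macro
form. A minimal sketch of how that form behaves is given below. The
SKETCH_ERROR name and the sketch_report() caller are hypothetical, and
the example assumes a compiler that accepts the GNU "args..." and
"## args" extensions (GCC and Clang do in their default modes).

```c
#include <stdio.h>

/*
 * Same shape as the ERROR() macro in the patch: "args..." names the
 * variadic part, and "## args" drops the preceding comma when no extra
 * arguments are supplied, so a bare message still compiles.
 */
#define SKETCH_ERROR(msg, args...) fprintf(stderr, msg, ## args)

/* Hypothetical caller using both forms; returns the bytes written. */
static int sketch_report(int domid)
{
    int n = SKETCH_ERROR("Failed to map page\n");

    n += SKETCH_ERROR("Failed to map page for dom %d\n", domid);
    return n;
}
```

Because the macro expands to a plain fprintf() call, it evaluates to
fprintf()'s return value and writes only to stderr.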

[PATCH II v2 01/17] stubdom: add correct dependencies for Xen libraries

2020-08-17 Thread Juergen Gross

The stubdom Makefile is missing several dependencies between Xen
libraries. Add them.

Signed-off-by: Juergen Gross 
---
 stubdom/Makefile | 5 +
 1 file changed, 5 insertions(+)

diff --git a/stubdom/Makefile b/stubdom/Makefile
index af8cde41b9..6fcecadeb9 100644
--- a/stubdom/Makefile
+++ b/stubdom/Makefile
@@ -405,6 +405,7 @@ libs-$(XEN_TARGET_ARCH)/toollog/libxentoollog.a: mk-headers-$(XEN_TARGET_ARCH) $
 
 .PHONY: libxenevtchn
 libxenevtchn: libs-$(XEN_TARGET_ARCH)/evtchn/libxenevtchn.a
+libs-$(XEN_TARGET_ARCH)/evtchn/libxenevtchn.a: libxentoolcore
 libs-$(XEN_TARGET_ARCH)/evtchn/libxenevtchn.a: mk-headers-$(XEN_TARGET_ARCH) $(NEWLIB_STAMPFILE)
CPPFLAGS="$(TARGET_CPPFLAGS)" CFLAGS="$(TARGET_CFLAGS)" $(MAKE) DESTDIR= -C libs-$(XEN_TARGET_ARCH)/evtchn
 
@@ -414,6 +415,7 @@ libs-$(XEN_TARGET_ARCH)/evtchn/libxenevtchn.a: mk-headers-$(XEN_TARGET_ARCH) $(N
 
 .PHONY: libxengnttab
 libxengnttab: libs-$(XEN_TARGET_ARCH)/gnttab/libxengnttab.a
+libs-$(XEN_TARGET_ARCH)/gnttab/libxengnttab.a: libxentoolcore libxentoollog
 libs-$(XEN_TARGET_ARCH)/gnttab/libxengnttab.a: mk-headers-$(XEN_TARGET_ARCH) $(NEWLIB_STAMPFILE)
CPPFLAGS="$(TARGET_CPPFLAGS)" CFLAGS="$(TARGET_CFLAGS)" $(MAKE) DESTDIR= -C libs-$(XEN_TARGET_ARCH)/gnttab
 
@@ -423,6 +425,7 @@ libs-$(XEN_TARGET_ARCH)/gnttab/libxengnttab.a: mk-headers-$(XEN_TARGET_ARCH) $(N
 
 .PHONY: libxencall
 libxencall: libs-$(XEN_TARGET_ARCH)/call/libxencall.a
+libs-$(XEN_TARGET_ARCH)/call/libxencall.a: libxentoolcore
 libs-$(XEN_TARGET_ARCH)/call/libxencall.a: mk-headers-$(XEN_TARGET_ARCH) $(NEWLIB_STAMPFILE)
CPPFLAGS="$(TARGET_CPPFLAGS)" CFLAGS="$(TARGET_CFLAGS)" $(MAKE) DESTDIR= -C libs-$(XEN_TARGET_ARCH)/call
 
@@ -432,6 +435,7 @@ libs-$(XEN_TARGET_ARCH)/call/libxencall.a: mk-headers-$(XEN_TARGET_ARCH) $(NEWLI
 
 .PHONY: libxenforeignmemory
 libxenforeignmemory: libs-$(XEN_TARGET_ARCH)/foreignmemory/libxenforeignmemory.a
+libs-$(XEN_TARGET_ARCH)/foreignmemory/libxenforeignmemory.a: libxentoolcore
 libs-$(XEN_TARGET_ARCH)/foreignmemory/libxenforeignmemory.a: mk-headers-$(XEN_TARGET_ARCH) $(NEWLIB_STAMPFILE)
CPPFLAGS="$(TARGET_CPPFLAGS)" CFLAGS="$(TARGET_CFLAGS)" $(MAKE) DESTDIR= -C libs-$(XEN_TARGET_ARCH)/foreignmemory
 
@@ -441,6 +445,7 @@ libs-$(XEN_TARGET_ARCH)/foreignmemory/libxenforeignmemory.a: mk-headers-$(XEN_TA
 
 .PHONY: libxendevicemodel
 libxendevicemodel: libs-$(XEN_TARGET_ARCH)/devicemodel/libxendevicemodel.a
+libs-$(XEN_TARGET_ARCH)/devicemodel/libxendevicemodel.a: libxentoolcore libxentoollog libxencall
 libs-$(XEN_TARGET_ARCH)/devicemodel/libxendevicemodel.a: mk-headers-$(XEN_TARGET_ARCH) $(NEWLIB_STAMPFILE)
CPPFLAGS="$(TARGET_CPPFLAGS)" CFLAGS="$(TARGET_CFLAGS)" $(MAKE) DESTDIR= -C libs-$(XEN_TARGET_ARCH)/devicemodel
 
-- 
2.26.2

[PATCH II v2 02/17] tools: drop explicit path specifications for qemu build

2020-08-17 Thread Juergen Gross

For more than three years now, qemu has been able to set the needed
include and library paths for the Xen libraries via pkg-config.

So drop the specification of those paths in tools/Makefile. This will
make it possible to move libxenctrl away from tools/libxc, as qemu's
configure script has special treatment of this path.

Signed-off-by: Juergen Gross 
---
 tools/Makefile | 26 +-
 1 file changed, 1 insertion(+), 25 deletions(-)

diff --git a/tools/Makefile b/tools/Makefile
index 198b239edc..7c9f9fc900 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -245,32 +245,8 @@ subdir-all-qemu-xen-dir: qemu-xen-dir-find
-DXC_WANT_COMPAT_GNTTAB_API=1 \
-DXC_WANT_COMPAT_MAP_FOREIGN_API=1 \
-DXC_WANT_COMPAT_DEVICEMODEL_API=1 \
-   -I$(XEN_ROOT)/tools/include \
-   -I$(XEN_ROOT)/tools/libs/toolcore/include \
-   -I$(XEN_ROOT)/tools/libs/toollog/include \
-   -I$(XEN_ROOT)/tools/libs/evtchn/include \
-   -I$(XEN_ROOT)/tools/libs/gnttab/include \
-   -I$(XEN_ROOT)/tools/libs/foreignmemory/include \
-   -I$(XEN_ROOT)/tools/libs/devicemodel/include \
-   -I$(XEN_ROOT)/tools/libxc/include \
-   -I$(XEN_ROOT)/tools/xenstore/include \
-   -I$(XEN_ROOT)/tools/xenstore/compat/include \
$(EXTRA_CFLAGS_QEMU_XEN)" \
-   --extra-ldflags="-L$(XEN_ROOT)/tools/libxc \
-   -L$(XEN_ROOT)/tools/xenstore \
-   -L$(XEN_ROOT)/tools/libs/toolcore \
-   -L$(XEN_ROOT)/tools/libs/evtchn \
-   -L$(XEN_ROOT)/tools/libs/gnttab \
-   -L$(XEN_ROOT)/tools/libs/foreignmemory \
-   -L$(XEN_ROOT)/tools/libs/devicemodel \
-   -Wl,-rpath-link=$(XEN_ROOT)/tools/libs/toolcore \
-   -Wl,-rpath-link=$(XEN_ROOT)/tools/libs/toollog \
-   -Wl,-rpath-link=$(XEN_ROOT)/tools/libs/evtchn \
-   -Wl,-rpath-link=$(XEN_ROOT)/tools/libs/gnttab \
-   -Wl,-rpath-link=$(XEN_ROOT)/tools/libs/call \
-   -Wl,-rpath-link=$(XEN_ROOT)/tools/libs/foreignmemory \
-   -Wl,-rpath-link=$(XEN_ROOT)/tools/libs/devicemodel \
-   $(QEMU_UPSTREAM_RPATH)" \
+   --extra-ldflags="$(QEMU_UPSTREAM_RPATH)" \
--bindir=$(LIBEXEC_BIN) \
--datadir=$(SHAREDIR)/qemu-xen \
--localstatedir=$(localstatedir) \
-- 
2.26.2

[PATCH II v2 14/17] tools/libxc: rename libxenguest internal headers

2020-08-17 Thread Juergen Gross

Rename the header files private to libxenguest from xc_*.h to xg_*.h.

Signed-off-by: Juergen Gross 
---
 tools/libxc/xg_dom_bzimageloader.c  | 2 +-
 tools/libxc/{xc_dom_decompress.h => xg_dom_decompress.h}| 2 +-
 tools/libxc/xg_dom_decompress_lz4.c | 2 +-
 tools/libxc/xg_dom_decompress_unsafe.c  | 2 +-
 .../{xc_dom_decompress_unsafe.h => xg_dom_decompress_unsafe.h}  | 0
 tools/libxc/xg_dom_decompress_unsafe_bzip2.c| 2 +-
 tools/libxc/xg_dom_decompress_unsafe_lzma.c | 2 +-
 tools/libxc/xg_dom_decompress_unsafe_lzo1x.c| 2 +-
 tools/libxc/xg_dom_decompress_unsafe_xz.c   | 2 +-
 tools/libxc/xg_sr_common.c  | 2 +-
 tools/libxc/{xc_sr_common.h => xg_sr_common.h}  | 2 +-
 tools/libxc/xg_sr_common_x86.c  | 2 +-
 tools/libxc/{xc_sr_common_x86.h => xg_sr_common_x86.h}  | 2 +-
 tools/libxc/xg_sr_common_x86_pv.c   | 2 +-
 tools/libxc/{xc_sr_common_x86_pv.h => xg_sr_common_x86_pv.h}| 2 +-
 tools/libxc/xg_sr_restore.c | 2 +-
 tools/libxc/xg_sr_restore_x86_hvm.c | 2 +-
 tools/libxc/xg_sr_restore_x86_pv.c  | 2 +-
 tools/libxc/xg_sr_save.c| 2 +-
 tools/libxc/xg_sr_save_x86_hvm.c| 2 +-
 tools/libxc/xg_sr_save_x86_pv.c | 2 +-
 tools/libxc/{xc_sr_stream_format.h => xg_sr_stream_format.h}| 0
 22 files changed, 20 insertions(+), 20 deletions(-)
 rename tools/libxc/{xc_dom_decompress.h => xg_dom_decompress.h} (77%)
 rename tools/libxc/{xc_dom_decompress_unsafe.h => xg_dom_decompress_unsafe.h} (100%)
 rename tools/libxc/{xc_sr_common.h => xg_sr_common.h} (99%)
 rename tools/libxc/{xc_sr_common_x86.h => xg_sr_common_x86.h} (98%)
 rename tools/libxc/{xc_sr_common_x86_pv.h => xg_sr_common_x86_pv.h} (98%)
 rename tools/libxc/{xc_sr_stream_format.h => xg_sr_stream_format.h} (100%)

diff --git a/tools/libxc/xg_dom_bzimageloader.c b/tools/libxc/xg_dom_bzimageloader.c
index a7d70cc7c6..f959a77602 100644
--- a/tools/libxc/xg_dom_bzimageloader.c
+++ b/tools/libxc/xg_dom_bzimageloader.c
@@ -32,7 +32,7 @@
 #include 
 
 #include "xg_private.h"
-#include "xc_dom_decompress.h"
+#include "xg_dom_decompress.h"
 
 #include 
 
diff --git a/tools/libxc/xc_dom_decompress.h b/tools/libxc/xg_dom_decompress.h
similarity index 77%
rename from tools/libxc/xc_dom_decompress.h
rename to tools/libxc/xg_dom_decompress.h
index 42cefa3f0e..d9a21cf297 100644
--- a/tools/libxc/xc_dom_decompress.h
+++ b/tools/libxc/xg_dom_decompress.h
@@ -1,7 +1,7 @@
 #ifndef __MINIOS__
 # include "xc_dom.h"
 #else
-# include "xc_dom_decompress_unsafe.h"
+# include "xg_dom_decompress_unsafe.h"
 #endif
 
 int xc_try_lz4_decode(struct xc_dom_image *dom, void **blob, size_t *size);
diff --git a/tools/libxc/xg_dom_decompress_lz4.c b/tools/libxc/xg_dom_decompress_lz4.c
index b6a33f27a8..97ba620d86 100644
--- a/tools/libxc/xg_dom_decompress_lz4.c
+++ b/tools/libxc/xg_dom_decompress_lz4.c
@@ -4,7 +4,7 @@
 #include 
 
 #include "xg_private.h"
-#include "xc_dom_decompress.h"
+#include "xg_dom_decompress.h"
 
 #define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
 
diff --git a/tools/libxc/xg_dom_decompress_unsafe.c b/tools/libxc/xg_dom_decompress_unsafe.c
index 164e35558f..21d964787d 100644
--- a/tools/libxc/xg_dom_decompress_unsafe.c
+++ b/tools/libxc/xg_dom_decompress_unsafe.c
@@ -3,7 +3,7 @@
 #include 
 
 #include "xg_private.h"
-#include "xc_dom_decompress_unsafe.h"
+#include "xg_dom_decompress_unsafe.h"
 
 static struct xc_dom_image *unsafe_dom;
 static unsigned char *output_blob;
diff --git a/tools/libxc/xc_dom_decompress_unsafe.h b/tools/libxc/xg_dom_decompress_unsafe.h
similarity index 100%
rename from tools/libxc/xc_dom_decompress_unsafe.h
rename to tools/libxc/xg_dom_decompress_unsafe.h
diff --git a/tools/libxc/xg_dom_decompress_unsafe_bzip2.c b/tools/libxc/xg_dom_decompress_unsafe_bzip2.c
index 4dcabe4061..9d3709e6cc 100644
--- a/tools/libxc/xg_dom_decompress_unsafe_bzip2.c
+++ b/tools/libxc/xg_dom_decompress_unsafe_bzip2.c
@@ -3,7 +3,7 @@
 #include 
 
 #include "xg_private.h"
-#include "xc_dom_decompress_unsafe.h"
+#include "xg_dom_decompress_unsafe.h"
 
 #include "../../xen/common/bunzip2.c"
 
diff --git a/tools/libxc/xg_dom_decompress_unsafe_lzma.c b/tools/libxc/xg_dom_decompress_unsafe_lzma.c
index 4ee8cdbab1..5d178f0c43 100644
--- a/tools/libxc/xg_dom_decompress_unsafe_lzma.c
+++ b/tools/libxc/xg_dom_decompress_unsafe_lzma.c
@@ -3,7 +3,7 @@
 #include 
 
 #include "xg_private.h"
-#include "xc_dom_decompress_unsafe.h"
+#include "xg_dom_decompress_unsafe.h"
 
 #include "../../xen/common/unlzma.c"
 
diff --git a/tools/libxc/xg_dom_decompress_unsafe_lzo1x.c b/tools/libxc/xg_dom_decompress_unsaf

[PATCH II v2 05/17] tools: fix pkg-config file for libxenguest

2020-08-17 Thread Juergen Gross

The pkg-config file for libxenguest is missing the private dependency
on libxenctrl.

Signed-off-by: Juergen Gross 
---
 tools/libxc/xenguest.pc.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/libxc/xenguest.pc.in b/tools/libxc/xenguest.pc.in
index 225ac0b9c8..6b43b67e63 100644
--- a/tools/libxc/xenguest.pc.in
+++ b/tools/libxc/xenguest.pc.in
@@ -7,4 +7,4 @@ Description: The Xenguest library for Xen hypervisor
 Version: @@version@@
 Cflags: -I${includedir}
 Libs: @@libsflag@@${libdir} -lxenguest
-Requires.private: xentoollog,xencall,xenforeignmemory,xenevtchn
+Requires.private: xentoollog,xencall,xenforeignmemory,xenevtchn,xencontrol
-- 
2.26.2

[PATCH II v2 11/17] tools/libxc: remove unused headers xc_efi.h and xc_elf.h

2020-08-17 Thread Juergen Gross

Remove xc_efi.h and xc_elf.h as they aren't used anywhere.

Signed-off-by: Juergen Gross 
---
 tools/libxc/xc_efi.h | 158 ---
 tools/libxc/xc_elf.h |  16 -
 2 files changed, 174 deletions(-)
 delete mode 100644 tools/libxc/xc_efi.h
 delete mode 100644 tools/libxc/xc_elf.h

diff --git a/tools/libxc/xc_efi.h b/tools/libxc/xc_efi.h
deleted file mode 100644
index dbe105be8f..00
--- a/tools/libxc/xc_efi.h
+++ /dev/null
@@ -1,158 +0,0 @@
-/*
- * Extensible Firmware Interface
- * Based on 'Extensible Firmware Interface Specification' version 0.9, April 30, 1999
- *
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation;
- * version 2.1 of the License.
- *
- * This library is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with this library; If not, see .
- *
- * Copyright (C) 1999 VA Linux Systems
- * Copyright (C) 1999 Walt Drummond 
- * Copyright (C) 1999, 2002-2003 Hewlett-Packard Co.
- *  David Mosberger-Tang 
- *  Stephane Eranian 
- */
-
-#ifndef XC_EFI_H
-#define XC_EFI_H
-
-/* definitions from xen/include/asm-ia64/linux-xen/linux/efi.h */
-
-typedef struct {
-uint8_t b[16];
-} efi_guid_t;
-
-#define EFI_GUID(a,b,c,d0,d1,d2,d3,d4,d5,d6,d7) \
-((efi_guid_t) \
-{{ (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
-  (b) & 0xff, ((b) >> 8) & 0xff, \
-  (c) & 0xff, ((c) >> 8) & 0xff, \
-  (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }})
-
-/*
- * Generic EFI table header
- */
-typedef struct {
-   uint64_t signature;
-   uint32_t revision;
-   uint32_t headersize;
-   uint32_t crc32;
-   uint32_t reserved;
-} efi_table_hdr_t;
-
-/*
- * Memory map descriptor:
- */
-
-/* Memory types: */
-#define EFI_RESERVED_TYPE0
-#define EFI_LOADER_CODE  1
-#define EFI_LOADER_DATA  2
-#define EFI_BOOT_SERVICES_CODE   3
-#define EFI_BOOT_SERVICES_DATA   4
-#define EFI_RUNTIME_SERVICES_CODE5
-#define EFI_RUNTIME_SERVICES_DATA6
-#define EFI_CONVENTIONAL_MEMORY  7
-#define EFI_UNUSABLE_MEMORY  8
-#define EFI_ACPI_RECLAIM_MEMORY  9
-#define EFI_ACPI_MEMORY_NVS 10
-#define EFI_MEMORY_MAPPED_IO11
-#define EFI_MEMORY_MAPPED_IO_PORT_SPACE 12
-#define EFI_PAL_CODE13
-#define EFI_MAX_MEMORY_TYPE 14
-
-/* Attribute values: */
-#define EFI_MEMORY_UC   ((uint64_t)0x0001ULL)/* uncached */
-#define EFI_MEMORY_WC   ((uint64_t)0x0002ULL)/* write-coalescing */
-#define EFI_MEMORY_WT   ((uint64_t)0x0004ULL)/* write-through */
-#define EFI_MEMORY_WB   ((uint64_t)0x0008ULL)/* write-back */
-#define EFI_MEMORY_WP   ((uint64_t)0x1000ULL)/* write-protect */
-#define EFI_MEMORY_RP   ((uint64_t)0x2000ULL)/* read-protect */
-#define EFI_MEMORY_XP   ((uint64_t)0x4000ULL)/* execute-protect */
-#define EFI_MEMORY_RUNTIME  ((uint64_t)0x8000ULL)/* range requires runtime mapping */
-#define EFI_MEMORY_DESCRIPTOR_VERSION   1
-
-#define EFI_PAGE_SHIFT  12
-
-/*
- * For current x86 implementations of EFI, there is
- * additional padding in the mem descriptors.  This is not
- * the case in ia64.  Need to have this fixed in the f/w.
- */
-typedef struct {
-uint32_t type;
-uint32_t pad;
-uint64_t phys_addr;
-uint64_t virt_addr;
-uint64_t num_pages;
-uint64_t attribute;
-#if defined (__i386__)
-uint64_t pad1;
-#endif
-} efi_memory_desc_t;
-
-/*
- * EFI Runtime Services table
- */
-#define EFI_RUNTIME_SERVICES_SIGNATURE ((uint64_t)0x5652453544e5552ULL)
-#define EFI_RUNTIME_SERVICES_REVISION  0x0001
-
-typedef struct {
-   efi_table_hdr_t hdr;
-   unsigned long get_time;
-   unsigned long set_time;
-   unsigned long get_wakeup_time;
-   unsigned long set_wakeup_time;
-   unsigned long set_virtual_address_map;
-   unsigned long convert_pointer;
-   unsigned long get_variable;
-   unsigned long get_next_variable;
-   unsigned long set_variable;
-   unsigned long get_next_high_mono_count;
-   unsigned long reset_system;
-} efi_runtime_services_t;
-
-/*
- *  EFI Configuration Table and GUID definitions
- */
-#define NULL_GUID \
-EFI_GUID(  0x, 0x, 0x, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 
0x00, 0x00 )
-#define ACPI_20_TABLE_GUID\
-EFI

[PATCH II v2 16/17] tools/libxc: untangle libxenctrl from libxenguest

2020-08-17 Thread Juergen Gross

Sources of libxenctrl and libxenguest are completely entangled. In
practice libxenguest is a user of libxenctrl, so don't let any source
of libxenctrl include xg_private.h.

This can be achieved by moving all definitions used by libxenctrl from
xg_private.h to xc_private.h.

Export xenctrl_dom.h as it will now be included by other public
headers.

Signed-off-by: Juergen Gross 
---
 tools/libxc/Makefile  |  3 ++-
 tools/libxc/include/xenctrl_dom.h | 10 +++--
 tools/libxc/include/xenguest.h|  8 ++-
 tools/libxc/xc_core.c |  5 +++--
 tools/libxc/xc_core.h |  2 +-
 tools/libxc/xc_core_arm.c |  2 +-
 tools/libxc/xc_core_x86.c |  6 ++
 tools/libxc/xc_domain.c   |  3 +--
 tools/libxc/xc_hcall_buf.c|  1 -
 tools/libxc/xc_private.c  |  1 -
 tools/libxc/xc_private.h  | 36 +++
 tools/libxc/xc_resume.c   |  2 --
 tools/libxc/xg_private.h  | 22 ---
 tools/libxc/xg_save_restore.h |  9 
 14 files changed, 56 insertions(+), 54 deletions(-)

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index e1b2c24106..1e4065f87c 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -193,7 +193,7 @@ install: build
$(INSTALL_DATA) libxenctrl.a $(DESTDIR)$(libdir)
$(SYMLINK_SHLIB) libxenctrl.so.$(MAJOR).$(MINOR) $(DESTDIR)$(libdir)/libxenctrl.so.$(MAJOR)
$(SYMLINK_SHLIB) libxenctrl.so.$(MAJOR) $(DESTDIR)$(libdir)/libxenctrl.so
-   $(INSTALL_DATA) include/xenctrl.h include/xenctrl_compat.h $(DESTDIR)$(includedir)
+   $(INSTALL_DATA) include/xenctrl.h include/xenctrl_compat.h include/xenctrl_dom.h $(DESTDIR)$(includedir)
$(INSTALL_SHLIB) libxenguest.so.$(MAJOR).$(MINOR) $(DESTDIR)$(libdir)
$(INSTALL_DATA) libxenguest.a $(DESTDIR)$(libdir)
$(SYMLINK_SHLIB) libxenguest.so.$(MAJOR).$(MINOR) $(DESTDIR)$(libdir)/libxenguest.so.$(MAJOR)
@@ -213,6 +213,7 @@ uninstall:
rm -f $(DESTDIR)$(PKG_INSTALLDIR)/xencontrol.pc
rm -f $(DESTDIR)$(includedir)/xenctrl.h
rm -f $(DESTDIR)$(includedir)/xenctrl_compat.h
+   rm -f $(DESTDIR)$(includedir)/xenctrl_dom.h
rm -f $(DESTDIR)$(libdir)/libxenctrl.so
rm -f $(DESTDIR)$(libdir)/libxenctrl.so.$(MAJOR)
rm -f $(DESTDIR)$(libdir)/libxenctrl.so.$(MAJOR).$(MINOR)
diff --git a/tools/libxc/include/xenctrl_dom.h b/tools/libxc/include/xenctrl_dom.h
index 52a4d6c8c0..40b85b7755 100644
--- a/tools/libxc/include/xenctrl_dom.h
+++ b/tools/libxc/include/xenctrl_dom.h
@@ -17,9 +17,7 @@
 #define _XC_DOM_H
 
 #include 
-#include 
 
-#define INVALID_PFN ((xen_pfn_t)-1)
 #define X86_HVM_NR_SPECIAL_PAGES8
 #define X86_HVM_END_SPECIAL_REGION  0xff000u
 #define XG_MAX_MODULES 2
@@ -38,6 +36,12 @@ struct xc_dom_seg {
 xen_pfn_t pages;
 };
 
+struct xc_hvm_firmware_module {
+uint8_t  *data;
+uint32_t  length;
+uint64_t  guest_addr_out;
+};
+
 struct xc_dom_mem {
 struct xc_dom_mem *next;
 void *ptr;
@@ -255,6 +259,8 @@ struct xc_dom_arch {
 int (*setup_pgtables) (struct xc_dom_image * dom);
 
 /* arch-specific data structs setup */
+/* in Mini-OS environment start_info might be a macro, avoid collision. */
+#undef start_info
 int (*start_info) (struct xc_dom_image * dom);
 int (*shared_info) (struct xc_dom_image * dom, void *shared_info);
 int (*vcpu) (struct xc_dom_image * dom);
diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 7a12d21ff2..4643384790 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -22,6 +22,8 @@
 #ifndef XENGUEST_H
 #define XENGUEST_H
 
+#include 
+
 #define XC_NUMA_NO_NODE   (~0U)
 
 #define XCFLAGS_LIVE  (1 << 0)
@@ -249,12 +251,6 @@ int xc_linux_build(xc_interface *xch,
unsigned int console_evtchn,
unsigned long *console_mfn);
 
-struct xc_hvm_firmware_module {
-uint8_t  *data;
-uint32_t  length;
-uint64_t  guest_addr_out;
-};
-
 /*
  * Sets *lockfd to -1.
  * Has deallocated everything even on error.
diff --git a/tools/libxc/xc_core.c b/tools/libxc/xc_core.c
index 7df1fccd62..e8c6fb96f9 100644
--- a/tools/libxc/xc_core.c
+++ b/tools/libxc/xc_core.c
@@ -60,12 +60,13 @@
  *
  */
 
-#include "xg_private.h"
+#include "xc_private.h"
 #include "xc_core.h"
-#include "xenctrl_dom.h"
 #include 
 #include 
 
+#include 
+
 /* number of pages to write at a time */
 #define DUMP_INCREMENT (4 * 1024)
 
diff --git a/tools/libxc/xc_core.h b/tools/libxc/xc_core.h
index ed7ed53ca5..36fb755da2 100644
--- a/tools/libxc/xc_core.h
+++ b/tools/libxc/xc_core.h
@@ -21,7 +21,7 @@
 #define XC_CORE_H
 
 #include "xen/version.h"
-#include "xg_private.h"
+#include "xc_private.h"
 #include "xen/libelf/elfstructs.h"
 
 /* section names */
diff --git a/tools/libxc/xc_core_arm.c b/tools/libxc/xc_core_arm.c
index c3c492c971..7b587b4cc5 100644
--- a/tools/libxc/xc_core_arm.c
+++ b

[PATCH II v2 13/17] tools/libxc: rename all libxenguest sources to xg_*

2020-08-17 Thread Juergen Gross

Some sources of libxenguest are named xg_*.c and some xc_*.c. Rename
the xc_*.c files to xg_*.c.

Signed-off-by: Juergen Gross 
---
 tools/libxc/Makefile  | 59 ++-
 .../libxc/{xc_cpuid_x86.c => xg_cpuid_x86.c}  |  0
 tools/libxc/{xc_dom_arm.c => xg_dom_arm.c}|  0
 ...imageloader.c => xg_dom_armzimageloader.c} |  0
 ...{xc_dom_binloader.c => xg_dom_binloader.c} |  0
 tools/libxc/{xc_dom_boot.c => xg_dom_boot.c}  |  0
 ...bzimageloader.c => xg_dom_bzimageloader.c} |  0
 ...m_compat_linux.c => xg_dom_compat_linux.c} |  0
 tools/libxc/{xc_dom_core.c => xg_dom_core.c}  |  0
 ...compress_lz4.c => xg_dom_decompress_lz4.c} |  0
 ...ss_unsafe.c => xg_dom_decompress_unsafe.c} |  0
 ...ip2.c => xg_dom_decompress_unsafe_bzip2.c} |  0
 ...lzma.c => xg_dom_decompress_unsafe_lzma.c} |  0
 ...o1x.c => xg_dom_decompress_unsafe_lzo1x.c} |  0
 ...afe_xz.c => xg_dom_decompress_unsafe_xz.c} |  0
 ...{xc_dom_elfloader.c => xg_dom_elfloader.c} |  0
 ...{xc_dom_hvmloader.c => xg_dom_hvmloader.c} |  0
 tools/libxc/{xc_dom_x86.c => xg_dom_x86.c}|  0
 .../libxc/{xc_nomigrate.c => xg_nomigrate.c}  |  0
 .../{xc_offline_page.c => xg_offline_page.c}  |  0
 .../libxc/{xc_sr_common.c => xg_sr_common.c}  |  0
 ...{xc_sr_common_x86.c => xg_sr_common_x86.c} |  0
 ..._common_x86_pv.c => xg_sr_common_x86_pv.c} |  0
 .../{xc_sr_restore.c => xg_sr_restore.c}  |  0
 ...tore_x86_hvm.c => xg_sr_restore_x86_hvm.c} |  0
 ...estore_x86_pv.c => xg_sr_restore_x86_pv.c} |  0
 tools/libxc/{xc_sr_save.c => xg_sr_save.c}|  0
 ...sr_save_x86_hvm.c => xg_sr_save_x86_hvm.c} |  0
 ...c_sr_save_x86_pv.c => xg_sr_save_x86_pv.c} |  0
 tools/libxc/{xc_suspend.c => xg_suspend.c}|  0
 30 files changed, 30 insertions(+), 29 deletions(-)
 rename tools/libxc/{xc_cpuid_x86.c => xg_cpuid_x86.c} (100%)
 rename tools/libxc/{xc_dom_arm.c => xg_dom_arm.c} (100%)
 rename tools/libxc/{xc_dom_armzimageloader.c => xg_dom_armzimageloader.c} (100%)
 rename tools/libxc/{xc_dom_binloader.c => xg_dom_binloader.c} (100%)
 rename tools/libxc/{xc_dom_boot.c => xg_dom_boot.c} (100%)
 rename tools/libxc/{xc_dom_bzimageloader.c => xg_dom_bzimageloader.c} (100%)
 rename tools/libxc/{xc_dom_compat_linux.c => xg_dom_compat_linux.c} (100%)
 rename tools/libxc/{xc_dom_core.c => xg_dom_core.c} (100%)
 rename tools/libxc/{xc_dom_decompress_lz4.c => xg_dom_decompress_lz4.c} (100%)
 rename tools/libxc/{xc_dom_decompress_unsafe.c => xg_dom_decompress_unsafe.c} (100%)
 rename tools/libxc/{xc_dom_decompress_unsafe_bzip2.c => xg_dom_decompress_unsafe_bzip2.c} (100%)
 rename tools/libxc/{xc_dom_decompress_unsafe_lzma.c => xg_dom_decompress_unsafe_lzma.c} (100%)
 rename tools/libxc/{xc_dom_decompress_unsafe_lzo1x.c => xg_dom_decompress_unsafe_lzo1x.c} (100%)
 rename tools/libxc/{xc_dom_decompress_unsafe_xz.c => xg_dom_decompress_unsafe_xz.c} (100%)
 rename tools/libxc/{xc_dom_elfloader.c => xg_dom_elfloader.c} (100%)
 rename tools/libxc/{xc_dom_hvmloader.c => xg_dom_hvmloader.c} (100%)
 rename tools/libxc/{xc_dom_x86.c => xg_dom_x86.c} (100%)
 rename tools/libxc/{xc_nomigrate.c => xg_nomigrate.c} (100%)
 rename tools/libxc/{xc_offline_page.c => xg_offline_page.c} (100%)
 rename tools/libxc/{xc_sr_common.c => xg_sr_common.c} (100%)
 rename tools/libxc/{xc_sr_common_x86.c => xg_sr_common_x86.c} (100%)
 rename tools/libxc/{xc_sr_common_x86_pv.c => xg_sr_common_x86_pv.c} (100%)
 rename tools/libxc/{xc_sr_restore.c => xg_sr_restore.c} (100%)
 rename tools/libxc/{xc_sr_restore_x86_hvm.c => xg_sr_restore_x86_hvm.c} (100%)
 rename tools/libxc/{xc_sr_restore_x86_pv.c => xg_sr_restore_x86_pv.c} (100%)
 rename tools/libxc/{xc_sr_save.c => xg_sr_save.c} (100%)
 rename tools/libxc/{xc_sr_save_x86_hvm.c => xg_sr_save_x86_hvm.c} (100%)
 rename tools/libxc/{xc_sr_save_x86_pv.c => xg_sr_save_x86_pv.c} (100%)
 rename tools/libxc/{xc_suspend.c => xg_suspend.c} (100%)

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index f3f1edc07b..e1b2c24106 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -54,20 +54,20 @@ CTRL_SRCS-y   += xc_devicemodel_compat.c
 GUEST_SRCS-y :=
 GUEST_SRCS-y += xg_private.c
 GUEST_SRCS-y += xg_domain.c
-GUEST_SRCS-y += xc_suspend.c
+GUEST_SRCS-y += xg_suspend.c
 ifeq ($(CONFIG_MIGRATE),y)
-GUEST_SRCS-y += xc_sr_common.c
-GUEST_SRCS-$(CONFIG_X86) += xc_sr_common_x86.c
-GUEST_SRCS-$(CONFIG_X86) += xc_sr_common_x86_pv.c
-GUEST_SRCS-$(CONFIG_X86) += xc_sr_restore_x86_pv.c
-GUEST_SRCS-$(CONFIG_X86) += xc_sr_restore_x86_hvm.c
-GUEST_SRCS-$(CONFIG_X86) += xc_sr_save_x86_pv.c
-GUEST_SRCS-$(CONFIG_X86) += xc_sr_save_x86_hvm.c
-GUEST_SRCS-y += xc_sr_restore.c
-GUEST_SRCS-y += xc_sr_save.c
-GUEST_SRCS-y += xc_offline_page.c
+GUEST_SRCS-y += xg_sr_common.c
+GUEST_SRCS-$(CONFIG_X86) += xg_sr_common_x86.c
+GUEST_SRCS-$(CONFIG_X86) += xg_sr_common_x86_pv.c
+GUEST_SRCS-$(CONFIG_X86) += xg_sr_restore_x86_pv.c
+GUEST_SRCS-$(CONFIG_X86) += xg_sr_restore_x86_hvm.c
+GUEST_SRCS-$(CONFIG_X86) += xg_sr_save_x86_pv.c
+GUEST_SRCS-$(CON

[PATCH II v2 07/17] tools/misc: don't use libxenctrl internals from xen-hptool

2020-08-17 Thread Juergen Gross

xen-hptool is including private headers from tools/libxc without any
need. Switch it to use official headers only.

Signed-off-by: Juergen Gross 
---
 tools/misc/Makefile | 2 --
 tools/misc/xen-hptool.c | 8 +---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index e7e74db85f..2a7f2ec42d 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -94,8 +94,6 @@ xenhypfs: xenhypfs.o
 xenlockprof: xenlockprof.o
$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
-# xen-hptool incorrectly uses libxc internals
-xen-hptool.o: CFLAGS += -I$(XEN_ROOT)/tools/libxc $(CFLAGS_libxencall)
 xen-hptool: xen-hptool.o
$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenevtchn) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(LDLIBS_libxenstore) $(APPEND_LDFLAGS)
 
diff --git a/tools/misc/xen-hptool.c b/tools/misc/xen-hptool.c
index 6e27d9cf43..7f17f24942 100644
--- a/tools/misc/xen-hptool.c
+++ b/tools/misc/xen-hptool.c
@@ -1,9 +1,11 @@
+#include 
+#include 
+#include 
 #include 
 #include 
-#include 
-#include 
+#include 
 #include 
-#include 
+#include 
 
 static xc_interface *xch;
 
-- 
2.26.2

[PATCH II v2 08/17] tools/misc: don't include xg_save_restore.h from xen-mfndump.c

2020-08-17 Thread Juergen Gross

xen-mfndump.c is including the libxc private header xg_save_restore.h.
Avoid that by moving the definition of is_mapped() to xen-mfndump.c
(it is used there only) and by duplicating the definition of
M2P_SIZE() in xen-mfndump.c.

Signed-off-by: Juergen Gross 
---
 tools/libxc/xg_save_restore.h | 4 
 tools/misc/xen-mfndump.c  | 5 -
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/libxc/xg_save_restore.h b/tools/libxc/xg_save_restore.h
index 303081df0d..b904296997 100644
--- a/tools/libxc/xg_save_restore.h
+++ b/tools/libxc/xg_save_restore.h
@@ -109,10 +109,6 @@ static inline int get_platform_info(xc_interface *xch, uint32_t dom,
 #define M2P_SIZE(_m)ROUNDUP(((_m) * sizeof(xen_pfn_t)), M2P_SHIFT)
 #define M2P_CHUNKS(_m)  (M2P_SIZE((_m)) >> M2P_SHIFT)
 
-/* Returns TRUE if the PFN is currently mapped */
-#define is_mapped(pfn_type) (!((pfn_type) & 0x8000UL))
-
-
 #define GET_FIELD(_p, _f, _w) (((_w) == 8) ? ((_p)->x64._f) : ((_p)->x32._f))
 
 #define SET_FIELD(_p, _f, _v, _w) do {  \
diff --git a/tools/misc/xen-mfndump.c b/tools/misc/xen-mfndump.c
index 858bd0e26b..cb15d08c7e 100644
--- a/tools/misc/xen-mfndump.c
+++ b/tools/misc/xen-mfndump.c
@@ -5,7 +5,10 @@
 #include 
 #include 
 
-#include "xg_save_restore.h"
+#include 
+
+#define M2P_SIZE(_m)ROUNDUP(((_m) * sizeof(xen_pfn_t)), 21)
+#define is_mapped(pfn_type) (!((pfn_type) & 0x8000UL))
 
 static xc_interface *xch;
 
-- 
2.26.2
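
For illustration only (not part of the patch): a minimal sketch of what the
duplicated M2P_SIZE() definition computes, namely the size in bytes of a table
with one entry per machine frame, rounded up to a multiple of 2^21 bytes
(2 MiB). The SKETCH_ROUNDUP() helper, the sketch_pfn_t typedef (a 64-bit
stand-in for xen_pfn_t) and the sketch_m2p_size() wrapper are assumptions made
for this example; the patch relies on the ROUNDUP() and xen_pfn_t definitions
that the tool already gets from its headers. A 64-bit unsigned long is assumed.

```c
#include <stdint.h>

/* Assumption: stand-in for xen_pfn_t, taken to be 64 bits wide here. */
typedef uint64_t sketch_pfn_t;

/* Assumption: round _x up to the next multiple of 2^_w. */
#define SKETCH_ROUNDUP(_x, _w) \
    (((unsigned long)(_x) + (1UL << (_w)) - 1) & ~((1UL << (_w)) - 1))

/* Same shape as the macro added to xen-mfndump.c, with the shift fixed at 21. */
#define SKETCH_M2P_SIZE(_m) \
    SKETCH_ROUNDUP(((_m) * sizeof(sketch_pfn_t)), 21)

/* Hypothetical wrapper so the macro can be called like a function. */
static unsigned long sketch_m2p_size(unsigned long max_mfn)
{
    return SKETCH_M2P_SIZE(max_mfn);
}
```

With 8-byte entries, 262144 frames fill exactly one 2 MiB chunk, and one more
frame pushes the result to the next 2 MiB boundary.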

[PATCH II v2 00/17] move libxenctrl to tools/libs directory

2020-08-17 Thread Juergen Gross

This is part 2 of the series moving more libraries under tools/libs.
It is based on part 1 and does the needed cleanup work and moving for
libxenctrl into tools/libs/ctrl.

Please note that patch 17 ("tools: move libxenctrl below tools/libs")
needs the related qemu-trad patch applied in order not to break the
build:

https://lists.xen.org/archives/html/xen-devel/2020-07/msg00617.html

Changes in V2:
- split the original series into multiple parts, this being part 2
- split the original 3 patches into now 17 in order to make review
  easier
- fixed several bugs and addressed the few comments I received

Juergen Gross (17):
  stubdom: add correct dependencies for Xen libraries
  tools: drop explicit path specifications for qemu build
  tools: tweak tools/libs/libs.mk for being able to support libxenctrl
  tools/python: drop libxenguest from setup.py
  tools: fix pkg-config file for libxenguest
  tools: don't assume libxenguest and libxenctrl to be in same directory
  tools/misc: don't use libxenctrl internals from xen-hptool
  tools/misc: don't include xg_save_restore.h from xen-mfndump.c
  tools/misc: replace PAGE_SIZE with XC_PAGE_SIZE in xen-mfndump.c
  tools/misc: drop all libxc internals from xen-mfndump.c
  tools/libxc: remove unused headers xc_efi.h and xc_elf.h
  tools/libxc: move xc_[un]map_domain_meminfo() into new source
xg_domain.c
  tools/libxc: rename all libxenguest sources to xg_*
  tools/libxc: rename libxenguest internal headers
  tools/misc: rename xc_dom.h to xenctrl_dom.h
  tools/libxc: untangle libxenctrl from libxenguest
  tools: move libxenctrl below tools/libs

 .gitignore|   8 +
 MAINTAINERS   |   2 +-
 stubdom/Makefile  |  28 ++-
 stubdom/grub/kexec.c  |   2 +-
 stubdom/mini-os.mk|   2 +-
 tools/Makefile|  26 +--
 tools/Rules.mk|  17 +-
 tools/helpers/init-xenstore-domain.c  |   2 +-
 tools/libs/Makefile   |   1 +
 tools/libs/ctrl/Makefile  |  68 
 tools/{libxc => libs/ctrl}/include/xenctrl.h  |   0
 .../ctrl}/include/xenctrl_compat.h|   0
 .../ctrl/include/xenctrl_dom.h}   |  10 +-
 tools/{libxc => libs/ctrl}/xc_altp2m.c|   0
 tools/{libxc => libs/ctrl}/xc_arinc653.c  |   0
 tools/{libxc => libs/ctrl}/xc_bitops.h|   0
 tools/{libxc => libs/ctrl}/xc_core.c  |   5 +-
 tools/{libxc => libs/ctrl}/xc_core.h  |   2 +-
 tools/{libxc => libs/ctrl}/xc_core_arm.c  |   2 +-
 tools/{libxc => libs/ctrl}/xc_core_arm.h  |   0
 tools/{libxc => libs/ctrl}/xc_core_x86.c  |   6 +-
 tools/{libxc => libs/ctrl}/xc_core_x86.h  |   0
 tools/{libxc => libs/ctrl}/xc_cpu_hotplug.c   |   0
 tools/{libxc => libs/ctrl}/xc_cpupool.c   |   0
 tools/{libxc => libs/ctrl}/xc_csched.c|   0
 tools/{libxc => libs/ctrl}/xc_csched2.c   |   0
 .../ctrl}/xc_devicemodel_compat.c |   0
 tools/{libxc => libs/ctrl}/xc_domain.c| 129 +-
 tools/{libxc => libs/ctrl}/xc_evtchn.c|   0
 tools/{libxc => libs/ctrl}/xc_evtchn_compat.c |   0
 tools/{libxc => libs/ctrl}/xc_flask.c |   0
 .../{libxc => libs/ctrl}/xc_foreign_memory.c  |   0
 tools/{libxc => libs/ctrl}/xc_freebsd.c   |   0
 tools/{libxc => libs/ctrl}/xc_gnttab.c|   0
 tools/{libxc => libs/ctrl}/xc_gnttab_compat.c |   0
 tools/{libxc => libs/ctrl}/xc_hcall_buf.c |   1 -
 tools/{libxc => libs/ctrl}/xc_kexec.c |   0
 tools/{libxc => libs/ctrl}/xc_linux.c |   0
 tools/{libxc => libs/ctrl}/xc_mem_access.c|   0
 tools/{libxc => libs/ctrl}/xc_mem_paging.c|   0
 tools/{libxc => libs/ctrl}/xc_memshr.c|   0
 tools/{libxc => libs/ctrl}/xc_minios.c|   0
 tools/{libxc => libs/ctrl}/xc_misc.c  |   0
 tools/{libxc => libs/ctrl}/xc_monitor.c   |   0
 tools/{libxc => libs/ctrl}/xc_msr_x86.h   |   0
 tools/{libxc => libs/ctrl}/xc_netbsd.c|   0
 tools/{libxc => libs/ctrl}/xc_pagetab.c   |   0
 tools/{libxc => libs/ctrl}/xc_physdev.c   |   0
 tools/{libxc => libs/ctrl}/xc_pm.c|   0
 tools/{libxc => libs/ctrl}/xc_private.c   |   3 +-
 tools/{libxc => libs/ctrl}/xc_private.h   |  36 
 tools/{libxc => libs/ctrl}/xc_psr.c   |   0
 tools/{libxc => libs/ctrl}/xc_resource.c  |   0
 tools/{libxc => libs/ctrl}/xc_resume.c|   2 -
 tools/{libxc => libs/ctrl}/xc_rt.c|   0
 tools/{libxc => libs/ctrl}/xc_solaris.c   |   0
 tools/{libxc => libs/ctrl}/xc_tbuf.c  |   0
 tools/{libxc => libs/ctrl}/xc_vm_event.c  |   0
 tools/{libxc => libs/ctrl}/xencontrol.pc.in   |   0
 tools/libs/libs.mk|  21 ++-
 tools/libxc/Makefile  | 159 +-
 tools/libxc/include/xenguest.h|

Re: [PATCH] mini-os: fix do_map_frames() for pvh

2020-08-17 Thread Wei Liu

On Sat, Aug 15, 2020 at 11:43:21PM +0200, Samuel Thibault wrote:
> Juergen Gross, on Sat, 15 Aug 2020 13:12:57 +0200, wrote:
> > In the PVH case, do_map_frames() fails to increment the virtual
> > address. This leads to writing only the first page table entry multiple
> > times.
> > 
> > Signed-off-by: Juergen Gross 
> 
> Reviewed-by: Samuel Thibault 

Applied.
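
For readers following along, the class of bug described in the quoted commit
message can be sketched as below. This is a toy model and not the Mini-OS
code: the single-level table sketch_pt[], the constants and the function
sketch_map_frames() are all hypothetical names invented for the example. The
point is only that the virtual address has to advance by one page per mapped
frame, otherwise every iteration rewrites the same entry.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumption: 4 KiB pages */
#define SKETCH_NR_PTES   8       /* size of the toy page table */

/* Hypothetical single-level page table, indexed by virtual page number. */
static uint64_t sketch_pt[SKETCH_NR_PTES];

/*
 * Map n frames starting at virtual address va.
 * Returns the number of entries actually written.
 */
static size_t sketch_map_frames(unsigned long va, const uint64_t *mfns,
                                size_t n)
{
    size_t done = 0;

    while (done < n) {
        size_t idx = va / SKETCH_PAGE_SIZE;

        if (idx >= SKETCH_NR_PTES)
            break;                 /* ran off the end of the toy table */

        sketch_pt[idx] = mfns[done];
        done++;
        va += SKETCH_PAGE_SIZE;    /* without this, only entry 0 is ever written */
    }

    return done;
}
```

Dropping the last statement in the loop reproduces the reported symptom in
this model: the first entry is overwritten n times and the others stay empty.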

[PATCH II v2 15/17] tools/misc: rename xc_dom.h to xenctrl_dom.h

2020-08-17 Thread Juergen Gross

To be able to disentangle the libxenctrl and libxenguest headers,
xc_dom.h will need to be public. Prepare for that by renaming xc_dom.h
to xenctrl_dom.h.

Signed-off-by: Juergen Gross 
---
 stubdom/grub/kexec.c| 2 +-
 tools/helpers/init-xenstore-domain.c| 2 +-
 tools/libxc/include/{xc_dom.h => xenctrl_dom.h} | 0
 tools/libxc/xc_core.c   | 2 +-
 tools/libxc/xc_private.c| 2 +-
 tools/libxc/xg_dom_arm.c| 2 +-
 tools/libxc/xg_dom_armzimageloader.c| 2 +-
 tools/libxc/xg_dom_binloader.c  | 2 +-
 tools/libxc/xg_dom_boot.c   | 2 +-
 tools/libxc/xg_dom_compat_linux.c   | 2 +-
 tools/libxc/xg_dom_core.c   | 2 +-
 tools/libxc/xg_dom_decompress.h | 2 +-
 tools/libxc/xg_dom_decompress_unsafe.h  | 2 +-
 tools/libxc/xg_dom_elfloader.c  | 2 +-
 tools/libxc/xg_dom_hvmloader.c  | 2 +-
 tools/libxc/xg_dom_x86.c| 2 +-
 tools/libxc/xg_offline_page.c   | 2 +-
 tools/libxc/xg_sr_common.h  | 2 +-
 tools/libxl/libxl_arm.c | 2 +-
 tools/libxl/libxl_arm.h | 2 +-
 tools/libxl/libxl_create.c  | 2 +-
 tools/libxl/libxl_dm.c  | 2 +-
 tools/libxl/libxl_dom.c | 2 +-
 tools/libxl/libxl_internal.h| 2 +-
 tools/libxl/libxl_vnuma.c   | 2 +-
 tools/libxl/libxl_x86.c | 2 +-
 tools/libxl/libxl_x86_acpi.c| 2 +-
 tools/python/xen/lowlevel/xc/xc.c   | 2 +-
 tools/xcutils/readnotes.c   | 2 +-
 29 files changed, 28 insertions(+), 28 deletions(-)
 rename tools/libxc/include/{xc_dom.h => xenctrl_dom.h} (100%)

diff --git a/stubdom/grub/kexec.c b/stubdom/grub/kexec.c
index 0e68b969a2..24001220a9 100644
--- a/stubdom/grub/kexec.c
+++ b/stubdom/grub/kexec.c
@@ -20,7 +20,7 @@
 #include 
 
 #include 
-#include 
+#include 
 
 #include 
 #include 
diff --git a/tools/helpers/init-xenstore-domain.c b/tools/helpers/init-xenstore-domain.c
index 4ce8299c3c..5bdb48dc80 100644
--- a/tools/helpers/init-xenstore-domain.c
+++ b/tools/helpers/init-xenstore-domain.c
@@ -8,7 +8,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xenctrl_dom.h
similarity index 100%
rename from tools/libxc/include/xc_dom.h
rename to tools/libxc/include/xenctrl_dom.h
diff --git a/tools/libxc/xc_core.c b/tools/libxc/xc_core.c
index 2ee1d205b4..7df1fccd62 100644
--- a/tools/libxc/xc_core.c
+++ b/tools/libxc/xc_core.c
@@ -62,7 +62,7 @@
 
 #include "xg_private.h"
 #include "xc_core.h"
-#include "xc_dom.h"
+#include "xenctrl_dom.h"
 #include 
 #include 
 
diff --git a/tools/libxc/xc_private.c b/tools/libxc/xc_private.c
index 90974d572e..6ecdf6953f 100644
--- a/tools/libxc/xc_private.c
+++ b/tools/libxc/xc_private.c
@@ -19,7 +19,7 @@
 
 #include "xc_private.h"
 #include "xg_private.h"
-#include "xc_dom.h"
+#include "xenctrl_dom.h"
 #include 
 #include 
 #include 
diff --git a/tools/libxc/xg_dom_arm.c b/tools/libxc/xg_dom_arm.c
index 931404c222..3f66f1d890 100644
--- a/tools/libxc/xg_dom_arm.c
+++ b/tools/libxc/xg_dom_arm.c
@@ -24,7 +24,7 @@
 #include 
 
 #include "xg_private.h"
-#include "xc_dom.h"
+#include "xenctrl_dom.h"
 
 #define NR_MAGIC_PAGES 4
 #define CONSOLE_PFN_OFFSET 0
diff --git a/tools/libxc/xg_dom_armzimageloader.c b/tools/libxc/xg_dom_armzimageloader.c
index 0df8c2a4b1..4246c8e5fa 100644
--- a/tools/libxc/xg_dom_armzimageloader.c
+++ b/tools/libxc/xg_dom_armzimageloader.c
@@ -25,7 +25,7 @@
 #include 
 
 #include "xg_private.h"
-#include "xc_dom.h"
+#include "xenctrl_dom.h"
 
 #include  /* XXX ntohl is not the right function... */
 
diff --git a/tools/libxc/xg_dom_binloader.c b/tools/libxc/xg_dom_binloader.c
index d6f7f2a500..870a921427 100644
--- a/tools/libxc/xg_dom_binloader.c
+++ b/tools/libxc/xg_dom_binloader.c
@@ -83,7 +83,7 @@
 #include 
 
 #include "xg_private.h"
-#include "xc_dom.h"
+#include "xenctrl_dom.h"
 
 #define round_pgup(_p)(((_p)+(PAGE_SIZE_X86-1))&PAGE_MASK_X86)
 #define round_pgdown(_p)  ((_p)&PAGE_MASK_X86)
diff --git a/tools/libxc/xg_dom_boot.c b/tools/libxc/xg_dom_boot.c
index bb599b33ba..1e31e92244 100644
--- a/tools/libxc/xg_dom_boot.c
+++ b/tools/libxc/xg_dom_boot.c
@@ -31,7 +31,7 @@
 #include 
 
 #include "xg_private.h"
-#include "xc_dom.h"
+#include "xenctrl_dom.h"
 #include "xc_core.h"
 #include 
 #include 
diff --git a/tools/libxc/xg_dom_compat_linux.c b/tools/libxc/xg_dom_compat_linux.c
index b3d43feed9..b645f0b14b 100644
--- a/tools/libxc/xg_dom_compat_linux.c
+++ b/tools/libxc/xg_dom_compat_linux.c
@@ -30,7 +30,7 @@
 
 #include "xenctrl.h"
 #include "xg_private.h"
-#include "xc_dom.h"
+#include "xenctrl_dom.h"
 
 /

Re: [PATCH] mini-os: correct memory access rights for pvh mode

2020-08-17 Thread Wei Liu

On Sat, Aug 15, 2020 at 11:40:02PM +0200, Samuel Thibault wrote:
> Juergen Gross, on Sat, 15 Aug 2020 13:15:57 +0200, wrote:
> > When running as a PVH guest the memory access rights are not set
> > correctly: _PAGE_USER should not be set and CR0.WP should be set.
> > Especially CR0.WP is important in order to let the allocate on
> > demand feature work, as it requires a page fault when writing to a
> > read-only page.
> > 
> > Signed-off-by: Juergen Gross 
> 
> Reviewed-by: Samuel Thibault 

Applied.
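
As background to the quoted commit message, here is a small model of why
CR0.WP matters for allocate-on-demand. It is a sketch, not Mini-OS code: the
PTE_* names and supervisor_write_faults() are invented for the example, and it
deliberately ignores everything except the present bit, the writable bit and
CR0.WP. On x86, a supervisor-mode write to a present read-only page only
raises a page fault when CR0.WP is set; with CR0.WP clear the write silently
succeeds, so a handler that populates pages on the first write never runs.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_CR0_WP   (1UL << 16)  /* CR0 write-protect bit */
#define PTE_PRESENT  (1UL << 0)   /* page is present */
#define PTE_RW       (1UL << 1)   /* page is writable */
#define PTE_USER     (1UL << 2)   /* page is accessible from user mode */

/*
 * Hypothetical helper: does a supervisor-mode write through this
 * page table entry raise a page fault?
 */
static bool supervisor_write_faults(unsigned long cr0, uint64_t pte)
{
    if (!(pte & PTE_PRESENT))
        return true;                   /* not-present pages always fault */

    if (pte & PTE_RW)
        return false;                  /* writable page, no fault */

    return (cr0 & X86_CR0_WP) != 0;    /* read-only: faults only with WP set */
}
```

In this model a read-only, present entry faults on write only when the cr0
argument has X86_CR0_WP set, which is the behaviour the patch relies on.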

[PATCH II v2 17/17] tools: move libxenctrl below tools/libs

2020-08-17 Thread Juergen Gross

Today tools/libxc needs to be built after tools/libs, as libxenctrl
depends on some libraries in tools/libs. This in turn blocks moving
other libraries that depend on libxenctrl below tools/libs.

So carve out libxenctrl from tools/libxc and move it into
tools/libs/ctrl.

Signed-off-by: Juergen Gross 
---
 .gitignore|  8 ++
 MAINTAINERS   |  2 +-
 stubdom/Makefile  | 23 -
 stubdom/mini-os.mk|  2 +-
 tools/Rules.mk| 10 +-
 tools/libs/Makefile   |  1 +
 tools/libs/ctrl/Makefile  | 68 +
 tools/{libxc => libs/ctrl}/include/xenctrl.h  |  0
 .../ctrl}/include/xenctrl_compat.h|  0
 .../ctrl}/include/xenctrl_dom.h   |  0
 tools/{libxc => libs/ctrl}/xc_altp2m.c|  0
 tools/{libxc => libs/ctrl}/xc_arinc653.c  |  0
 tools/{libxc => libs/ctrl}/xc_bitops.h|  0
 tools/{libxc => libs/ctrl}/xc_core.c  |  0
 tools/{libxc => libs/ctrl}/xc_core.h  |  0
 tools/{libxc => libs/ctrl}/xc_core_arm.c  |  0
 tools/{libxc => libs/ctrl}/xc_core_arm.h  |  0
 tools/{libxc => libs/ctrl}/xc_core_x86.c  |  0
 tools/{libxc => libs/ctrl}/xc_core_x86.h  |  0
 tools/{libxc => libs/ctrl}/xc_cpu_hotplug.c   |  0
 tools/{libxc => libs/ctrl}/xc_cpupool.c   |  0
 tools/{libxc => libs/ctrl}/xc_csched.c|  0
 tools/{libxc => libs/ctrl}/xc_csched2.c   |  0
 .../ctrl}/xc_devicemodel_compat.c |  0
 tools/{libxc => libs/ctrl}/xc_domain.c|  0
 tools/{libxc => libs/ctrl}/xc_evtchn.c|  0
 tools/{libxc => libs/ctrl}/xc_evtchn_compat.c |  0
 tools/{libxc => libs/ctrl}/xc_flask.c |  0
 .../{libxc => libs/ctrl}/xc_foreign_memory.c  |  0
 tools/{libxc => libs/ctrl}/xc_freebsd.c   |  0
 tools/{libxc => libs/ctrl}/xc_gnttab.c|  0
 tools/{libxc => libs/ctrl}/xc_gnttab_compat.c |  0
 tools/{libxc => libs/ctrl}/xc_hcall_buf.c |  0
 tools/{libxc => libs/ctrl}/xc_kexec.c |  0
 tools/{libxc => libs/ctrl}/xc_linux.c |  0
 tools/{libxc => libs/ctrl}/xc_mem_access.c|  0
 tools/{libxc => libs/ctrl}/xc_mem_paging.c|  0
 tools/{libxc => libs/ctrl}/xc_memshr.c|  0
 tools/{libxc => libs/ctrl}/xc_minios.c|  0
 tools/{libxc => libs/ctrl}/xc_misc.c  |  0
 tools/{libxc => libs/ctrl}/xc_monitor.c   |  0
 tools/{libxc => libs/ctrl}/xc_msr_x86.h   |  0
 tools/{libxc => libs/ctrl}/xc_netbsd.c|  0
 tools/{libxc => libs/ctrl}/xc_pagetab.c   |  0
 tools/{libxc => libs/ctrl}/xc_physdev.c   |  0
 tools/{libxc => libs/ctrl}/xc_pm.c|  0
 tools/{libxc => libs/ctrl}/xc_private.c   |  0
 tools/{libxc => libs/ctrl}/xc_private.h   |  0
 tools/{libxc => libs/ctrl}/xc_psr.c   |  0
 tools/{libxc => libs/ctrl}/xc_resource.c  |  0
 tools/{libxc => libs/ctrl}/xc_resume.c|  0
 tools/{libxc => libs/ctrl}/xc_rt.c|  0
 tools/{libxc => libs/ctrl}/xc_solaris.c   |  0
 tools/{libxc => libs/ctrl}/xc_tbuf.c  |  0
 tools/{libxc => libs/ctrl}/xc_vm_event.c  |  0
 tools/{libxc => libs/ctrl}/xencontrol.pc.in   |  0
 tools/libxc/Makefile  | 99 +++
 tools/python/Makefile |  2 +-
 tools/python/setup.py |  8 +-
 59 files changed, 117 insertions(+), 106 deletions(-)
 create mode 100644 tools/libs/ctrl/Makefile
 rename tools/{libxc => libs/ctrl}/include/xenctrl.h (100%)
 rename tools/{libxc => libs/ctrl}/include/xenctrl_compat.h (100%)
 rename tools/{libxc => libs/ctrl}/include/xenctrl_dom.h (100%)
 rename tools/{libxc => libs/ctrl}/xc_altp2m.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_arinc653.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_bitops.h (100%)
 rename tools/{libxc => libs/ctrl}/xc_core.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_core.h (100%)
 rename tools/{libxc => libs/ctrl}/xc_core_arm.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_core_arm.h (100%)
 rename tools/{libxc => libs/ctrl}/xc_core_x86.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_core_x86.h (100%)
 rename tools/{libxc => libs/ctrl}/xc_cpu_hotplug.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_cpupool.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_csched.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_csched2.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_devicemodel_compat.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_domain.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_evtchn.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_evtchn_compat.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_flask.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_foreign_memory.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_freebsd.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_gnttab.c (100%)
 rename tools/{libxc => libs/ctrl}/xc_gnttab_compat.c (100%)
 rename tools/{libxc => libs/ctrl}/x

Re: [PATCH v1 4/6] tools/ocaml/xenstored: drop select based

2020-08-17 Thread Wei Liu

The subject line seems to be cut off half way.

"Drop select based $SOMETHING"?

Wei.

Re: [PATCH II v2 01/17] stubdom: add correct dependencies for Xen libraries

2020-08-17 Thread Samuel Thibault

Hello,

Juergen Gross wrote on Mon, 17 Aug 2020 11:49:06 +0200:
> The stubdom Makefile is missing several dependencies between Xen
> libraries. Add them.

> @@ -405,6 +405,7 @@ libs-$(XEN_TARGET_ARCH)/toollog/libxentoollog.a: 
> mk-headers-$(XEN_TARGET_ARCH) $
>  
>  .PHONY: libxenevtchn
>  libxenevtchn: libs-$(XEN_TARGET_ARCH)/evtchn/libxenevtchn.a
> +libs-$(XEN_TARGET_ARCH)/evtchn/libxenevtchn.a: libxentoolcore

I see

evtchn/Makefile:USELIBS  := toollog toolcore

So it'd actually need libxentoollog as well?

> @@ -423,6 +425,7 @@ libs-$(XEN_TARGET_ARCH)/gnttab/libxengnttab.a: 
> mk-headers-$(XEN_TARGET_ARCH) $(N
>  
>  .PHONY: libxencall
>  libxencall: libs-$(XEN_TARGET_ARCH)/call/libxencall.a
> +libs-$(XEN_TARGET_ARCH)/call/libxencall.a: libxentoolcore

Same with

call/Makefile:USELIBS  := toollog toolcore

?

> @@ -432,6 +435,7 @@ libs-$(XEN_TARGET_ARCH)/call/libxencall.a: 
> mk-headers-$(XEN_TARGET_ARCH) $(NEWLI
>  
>  .PHONY: libxenforeignmemory
>  libxenforeignmemory: 
> libs-$(XEN_TARGET_ARCH)/foreignmemory/libxenforeignmemory.a
> +libs-$(XEN_TARGET_ARCH)/foreignmemory/libxenforeignmemory.a: libxentoolcore

Same with 

foreignmemory/Makefile:USELIBS  := toollog toolcore

?

Possibly they are actually already coming from somewhere by
transitivity, but it'd probably be better to just make sure we match
Makefiles' USELIBS.

Samuel

Re: [PATCH II v2 17/17] tools: move libxenctrl below tools/libs

2020-08-17 Thread Samuel Thibault

Juergen Gross wrote on Mon, 17 Aug 2020 11:49:22 +0200:
> diff --git a/stubdom/Makefile b/stubdom/Makefile
> index 6fcecadeb9..440adc2eb4 100644
> --- a/stubdom/Makefile
> +++ b/stubdom/Makefile

> diff --git a/stubdom/mini-os.mk b/stubdom/mini-os.mk
> index 32528bb91f..b1387df3f8 100644
> --- a/stubdom/mini-os.mk
> +++ b/stubdom/mini-os.mk

For these,

Reviewed-by: Samuel Thibault

Re: [PATCH] xen: Introduce cmpxchg64() and guest_cmpxchg64()

2020-08-17 Thread Julien Grall


Hi,

On 17/08/2020 11:33, Roger Pau Monné wrote:

On Mon, Aug 17, 2020 at 10:42:54AM +0100, Julien Grall wrote:

Hi,

On 17/08/2020 10:24, Roger Pau Monné wrote:

On Sat, Aug 15, 2020 at 06:21:43PM +0100, Julien Grall wrote:

From: Julien Grall 

The IOREQ code is using cmpxchg() with a 64-bit value. At the moment, this
is x86 code, but there is a plan to make it common.

To cater for 32-bit arches, introduce two new helpers to deal with 64-bit
cmpxchg.

The Arm 32-bit implementation of cmpxchg64() is based on the __cmpxchg64
in Linux v5.8 (arch/arm/include/asm/cmpxchg.h).

Signed-off-by: Julien Grall 
Cc: Oleksandr Tyshchenko 
---
diff --git a/xen/include/asm-x86/guest_atomics.h 
b/xen/include/asm-x86/guest_atomics.h
index 029417c8ffc1..f4de9d3631ff 100644
--- a/xen/include/asm-x86/guest_atomics.h
+++ b/xen/include/asm-x86/guest_atomics.h
@@ -20,6 +20,8 @@
   ((void)(d), test_and_change_bit(nr, p))
   #define guest_cmpxchg(d, ptr, o, n) ((void)(d), cmpxchg(ptr, o, n))
+#define guest_cmpxchg64(d, ptr, o, n) ((void)(d), cmpxchg64(ptr, o, n))
+
   #endif /* _X86_GUEST_ATOMICS_H */
   /*
diff --git a/xen/include/asm-x86/x86_64/system.h 
b/xen/include/asm-x86/x86_64/system.h
index f471859c19cc..c1b16105e9f2 100644
--- a/xen/include/asm-x86/x86_64/system.h
+++ b/xen/include/asm-x86/x86_64/system.h
@@ -5,6 +5,8 @@
   ((__typeof__(*(ptr)))__cmpxchg((ptr),(unsigned long)(o),\
  (unsigned long)(n),sizeof(*(ptr))))
+#define cmpxchg64(ptr, o, n) cmpxchg(ptr, o, n)


Why do you need to introduce an explicitly sized version of cmpxchg
for 64bit values?

There's no cmpxchg{8,16,32}, so I would expect cmpxchg64 to just be
handled by cmpxchg detecting the size of the parameter passed to the
function.

That works quite well for 64-bit arches. However, for 32-bit, you would need
to take some detour so 32-bit and 64-bit can cohabit (you cannot simply
replace unsigned long with uint64_t).


Oh, I see. Switching __cmpxchg on Arm 32 to use unsigned long long or
uint64_t would be bad, as you would then need two registers to pass
the value to the function, or push it on the stack?


We have only 4 registers (r0 - r3) available for the arguments. With a 
64-bit value, each value takes 2 registers, so some arguments will end 
up being pushed on the stack.

This is assuming the compiler is not clever enough to see we are only 
using the bottom 32 bits with some cmpxchg.




Maybe do something like:

#define cmpxchg(ptr,o,n) ({                                             \
    typeof(*(ptr)) tmp;                                                 \
                                                                        \
    switch ( sizeof(*(ptr)) )                                           \
    {                                                                   \
    case 8:                                                             \
        tmp = __cmpxchg_mb64((ptr), (uint64_t)(o),                      \
                             (uint64_t)(n), sizeof(*(ptr)));            \
        break;                                                          \
    default:                                                            \
        tmp = __cmpxchg_mb((ptr), (unsigned long)(o),                   \
                           (unsigned long)(n), sizeof(*(ptr)));         \
        break;                                                          \
    }                                                                   \
    tmp;                                                                \
})



Unfortunately this can't compile if o and n are pointers because the 
compiler will complain about the cast to uint64_t.


We would also need a cast when assigning to tmp because tmp may not be a 
scalar type. This would lead to the same compiler issue.


The only way I could see to make it work would be to use the same trick 
as we do for {read, write}_atomic() (see asm-arm/atomic.h). We are using 
union and void pointer to prevent explicit cast.


But I am not sure whether the effort is really worth it.
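
For reference, the union trick mentioned above might look roughly like the following. This is only a sketch under stated assumptions: sketch_cmpxchg(), sketch_cmpxchg32() and sketch_cmpxchg64() are hypothetical names, and the GCC/Clang __sync builtin stands in for the arch-specific assembly helpers. It is not the Xen implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical fixed-width helpers. In Xen these would be arch-specific
 * assembly; here the GCC/Clang __sync builtin stands in for them.
 */
static inline uint32_t sketch_cmpxchg32(volatile void *p, uint32_t o,
                                        uint32_t n)
{
    return __sync_val_compare_and_swap((volatile uint32_t *)p, o, n);
}

static inline uint64_t sketch_cmpxchg64(volatile void *p, uint64_t o,
                                        uint64_t n)
{
    return __sync_val_compare_and_swap((volatile uint64_t *)p, o, n);
}

/*
 * Size-dispatching wrapper. The unions convert between the caller's type
 * (which may be a pointer) and a fixed-width integer without any explicit
 * cast of the value, so pointer operands compile on 32-bit and 64-bit.
 */
#define sketch_cmpxchg(ptr, o, n) ({                                    \
    union { __typeof__(*(ptr)) val; uint32_t u32; uint64_t u64; }       \
        old_ = { .val = (o) }, new_ = { .val = (n) }, ret_ = old_;      \
                                                                        \
    _Static_assert(sizeof(*(ptr)) == 4 || sizeof(*(ptr)) == 8,          \
                   "sketch_cmpxchg: unsupported operand size");         \
    switch ( sizeof(*(ptr)) )                                           \
    {                                                                   \
    case 8:                                                             \
        ret_.u64 = sketch_cmpxchg64((ptr), old_.u64, new_.u64);         \
        break;                                                          \
    default:                                                            \
        ret_.u32 = sketch_cmpxchg32((ptr), old_.u32, new_.u32);         \
        break;                                                          \
    }                                                                   \
    ret_.val;                                                           \
})

/* Small self-check exercising both an integer and a pointer operand. */
static inline void sketch_cmpxchg_selftest(void)
{
    uint64_t v = 5;
    int a = 0, b = 0;
    int *p = &a;

    assert(sketch_cmpxchg(&v, 5, 9) == 5 && v == 9);
    assert(sketch_cmpxchg(&p, &a, &b) == &a && p == &b);
}
```

The cost is visible even in this small form: two unions per call and GNU extensions (statement expressions, __typeof__), which is part of why the effort may not be worth it compared with a separate cmpxchg64().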

Cheers,

--
Julien Grall

Re: [PATCH II v2 01/17] stubdom: add correct dependencies for Xen libraries

2020-08-17 Thread Jürgen Groß


On 17.08.20 12:12, Samuel Thibault wrote:

Hello,

Juergen Gross wrote on Mon, 17 Aug 2020 11:49:06 +0200:

The stubdom Makefile is missing several dependencies between Xen
libraries. Add them.



@@ -405,6 +405,7 @@ libs-$(XEN_TARGET_ARCH)/toollog/libxentoollog.a: 
mk-headers-$(XEN_TARGET_ARCH) $
  
  .PHONY: libxenevtchn

  libxenevtchn: libs-$(XEN_TARGET_ARCH)/evtchn/libxenevtchn.a
+libs-$(XEN_TARGET_ARCH)/evtchn/libxenevtchn.a: libxentoolcore


I see

evtchn/Makefile:USELIBS  := toollog toolcore

So it'd actually need libxentoollog as well?


@@ -423,6 +425,7 @@ libs-$(XEN_TARGET_ARCH)/gnttab/libxengnttab.a: 
mk-headers-$(XEN_TARGET_ARCH) $(N
  
  .PHONY: libxencall

  libxencall: libs-$(XEN_TARGET_ARCH)/call/libxencall.a
+libs-$(XEN_TARGET_ARCH)/call/libxencall.a: libxentoolcore


Same with

call/Makefile:USELIBS  := toollog toolcore

?


@@ -432,6 +435,7 @@ libs-$(XEN_TARGET_ARCH)/call/libxencall.a: 
mk-headers-$(XEN_TARGET_ARCH) $(NEWLI
  
  .PHONY: libxenforeignmemory

  libxenforeignmemory: 
libs-$(XEN_TARGET_ARCH)/foreignmemory/libxenforeignmemory.a
+libs-$(XEN_TARGET_ARCH)/foreignmemory/libxenforeignmemory.a: libxentoolcore


Same with

foreignmemory/Makefile:USELIBS  := toollog toolcore

?

Possibly they are actually already coming from somewhere by
transitivity, but it'd probably be better to just make sure we match
Makefiles' USELIBS.


Yes. Thanks for catching those.

When all libraries have been switched to the USELIBS scheme I'll add a
patch using those variables in the stubdom Makefile, too. This will
avoid duplicate work when adding new libs or changing dependencies.


Juergen

Re: [PATCH I v2.1 6/6] tools: generate most contents of library make variables

2020-08-17 Thread Jürgen Groß


On 16.08.20 14:34, Juergen Gross wrote:

Library related make variables (CFLAGS_lib*, SHDEPS_lib*, LDLIBS_lib*
and SHLIB_lib*) mostly have a common pattern for their values. Generate
most of this content automatically by adding a new per-library variable
defining on which other libraries a lib is depending.

This in turn makes it possible to drop the USELIB variable from each
library Makefile.

The LIBNAME variable can be dropped, too, as it can be derived from the
directory name the library is residing in.

Signed-off-by: Juergen Gross 


Hmm, after Samuel's reply to patch II 1/17 I'm inclined to put the
USELIBS_* variables into a new tools/libs/uselibs.mk file in order to be
able to include it from stubdom/Makefile.


Juergen


---
  tools/Rules.mk| 74 +++
  tools/libs/call/Makefile  |  2 -
  tools/libs/devicemodel/Makefile   |  2 -
  tools/libs/evtchn/Makefile|  2 -
  tools/libs/foreignmemory/Makefile |  2 -
  tools/libs/gnttab/Makefile|  2 -
  tools/libs/hypfs/Makefile |  2 -
  tools/libs/libs.mk|  8 ++--
  tools/libs/toolcore/Makefile  |  1 -
  tools/libs/toollog/Makefile   |  1 -
  10 files changed, 31 insertions(+), 65 deletions(-)

diff --git a/tools/Rules.mk b/tools/Rules.mk
index 5d699cfd39..b36818bcaa 100644
--- a/tools/Rules.mk
+++ b/tools/Rules.mk
@@ -12,14 +12,24 @@ INSTALL = $(XEN_ROOT)/tools/cross-install
  LDFLAGS += $(PREPEND_LDFLAGS_XEN_TOOLS)
  
  XEN_INCLUDE= $(XEN_ROOT)/tools/include

-XEN_libxentoolcore = $(XEN_ROOT)/tools/libs/toolcore
-XEN_libxentoollog  = $(XEN_ROOT)/tools/libs/toollog
-XEN_libxenevtchn   = $(XEN_ROOT)/tools/libs/evtchn
-XEN_libxengnttab   = $(XEN_ROOT)/tools/libs/gnttab
-XEN_libxencall = $(XEN_ROOT)/tools/libs/call
-XEN_libxenforeignmemory = $(XEN_ROOT)/tools/libs/foreignmemory
-XEN_libxendevicemodel = $(XEN_ROOT)/tools/libs/devicemodel
-XEN_libxenhypfs= $(XEN_ROOT)/tools/libs/hypfs
+
+LIBS_LIBS += toolcore
+USELIBS_toolcore :=
+LIBS_LIBS += toollog
+USELIBS_toollog :=
+LIBS_LIBS += evtchn
+USELIBS_evtchn := toollog toolcore
+LIBS_LIBS += gnttab
+USELIBS_gnttab := toollog toolcore
+LIBS_LIBS += call
+USELIBS_call := toollog toolcore
+LIBS_LIBS += foreignmemory
+USELIBS_foreignmemory := toollog toolcore
+LIBS_LIBS += devicemodel
+USELIBS_devicemodel := toollog toolcore call
+LIBS_LIBS += hypfs
+USELIBS_hypfs := toollog toolcore call
+
  XEN_libxenctrl = $(XEN_ROOT)/tools/libxc
  # Currently libxenguest lives in the same directory as libxenctrl
  XEN_libxenguest= $(XEN_libxenctrl)
@@ -99,45 +109,15 @@ endif
  # Consumers of libfoo should not directly use $(SHDEPS_libfoo) or
  # $(SHLIB_libfoo)
  
-CFLAGS_libxentoollog = -I$(XEN_libxentoollog)/include $(CFLAGS_xeninclude)

-SHDEPS_libxentoollog =
-LDLIBS_libxentoollog = $(SHDEPS_libxentoollog) 
$(XEN_libxentoollog)/libxentoollog$(libextension)
-SHLIB_libxentoollog  = $(SHDEPS_libxentoollog) 
-Wl,-rpath-link=$(XEN_libxentoollog)
-
-CFLAGS_libxentoolcore = -I$(XEN_libxentoolcore)/include $(CFLAGS_xeninclude)
-SHDEPS_libxentoolcore =
-LDLIBS_libxentoolcore = $(SHDEPS_libxentoolcore) 
$(XEN_libxentoolcore)/libxentoolcore$(libextension)
-SHLIB_libxentoolcore  = $(SHDEPS_libxentoolcore) 
-Wl,-rpath-link=$(XEN_libxentoolcore)
-
-CFLAGS_libxenevtchn = -I$(XEN_libxenevtchn)/include $(CFLAGS_xeninclude)
-SHDEPS_libxenevtchn = $(SHLIB_libxentoolcore)
-LDLIBS_libxenevtchn = $(SHDEPS_libxenevtchn) 
$(XEN_libxenevtchn)/libxenevtchn$(libextension)
-SHLIB_libxenevtchn  = $(SHDEPS_libxenevtchn) 
-Wl,-rpath-link=$(XEN_libxenevtchn)
-
-CFLAGS_libxengnttab = -I$(XEN_libxengnttab)/include $(CFLAGS_xeninclude)
-SHDEPS_libxengnttab = $(SHLIB_libxentoollog) $(SHLIB_libxentoolcore)
-LDLIBS_libxengnttab = $(SHDEPS_libxengnttab) 
$(XEN_libxengnttab)/libxengnttab$(libextension)
-SHLIB_libxengnttab  = $(SHDEPS_libxengnttab) 
-Wl,-rpath-link=$(XEN_libxengnttab)
-
-CFLAGS_libxencall = -I$(XEN_libxencall)/include $(CFLAGS_xeninclude)
-SHDEPS_libxencall = $(SHLIB_libxentoolcore)
-LDLIBS_libxencall = $(SHDEPS_libxencall) 
$(XEN_libxencall)/libxencall$(libextension)
-SHLIB_libxencall  = $(SHDEPS_libxencall) -Wl,-rpath-link=$(XEN_libxencall)
-
-CFLAGS_libxenforeignmemory = -I$(XEN_libxenforeignmemory)/include 
$(CFLAGS_xeninclude)
-SHDEPS_libxenforeignmemory = $(SHLIB_libxentoolcore)
-LDLIBS_libxenforeignmemory = $(SHDEPS_libxenforeignmemory) 
$(XEN_libxenforeignmemory)/libxenforeignmemory$(libextension)
-SHLIB_libxenforeignmemory  = $(SHDEPS_libxenforeignmemory) 
-Wl,-rpath-link=$(XEN_libxenforeignmemory)
-
-CFLAGS_libxendevicemodel = -I$(XEN_libxendevicemodel)/include 
$(CFLAGS_xeninclude)
-SHDEPS_libxendevicemodel = $(SHLIB_libxentoollog) $(SHLIB_libxentoolcore) 
$(SHLIB_libxencall)
-LDLIBS_libxendevicemodel = $(SHDEPS_libxendevicemodel) 
$(XEN_libxendevicemodel)/libxendevicemodel$(libextension)
-SHLIB_libxendevicemodel  = $(SHDEPS_libxendevicemodel) 
-Wl,-rpath-link=$(XEN_libxendevicemodel)
-
-CFLAGS_libxenhypfs = -I$(XE

Re: [PATCH] efi: discover ESRT table on Xen PV too

2020-08-17 Thread Roger Pau Monné

On Sun, Aug 16, 2020 at 02:19:49AM +0200, Marek Marczykowski-Górecki wrote:
> In case of Xen PV dom0, Xen passes along info about system tables (see
> arch/x86/xen/efi.c), but not the memory map from EFI.

I think that's because the memory map returned by
XENMEM_machine_memory_map is in e820 form, and doesn't contain the
required information about the EFI regions due to the translation done
by efi_arch_process_memory_map in Xen?

> This makes sense
> as it is Xen responsible for managing physical memory address space.
> In this case, it doesn't make sense to condition using ESRT table on
> availability of EFI memory map, as it isn't Linux kernel responsible for
> it.

PV dom0 is kind of special in that regard as it can create mappings to
(almost) any MMIO regions, and hence can change its memory map
substantially.

> Skip this part on Xen PV (let Xen do the right thing if it deems
> necessary) and use ESRT table normally.

Maybe it would be better to introduce a new hypercall (or add a
parameter to XENMEM_machine_memory_map) in order to be able to fetch
the EFI memory map?

That should allow a PV dom0 to check the ESRT is correct and thus not
diverge from bare metal.

> 
> This is a requirement for using fwupd in PV dom0 to update UEFI using
> capsules.
> 
> Signed-off-by: Marek Marczykowski-Górecki 
> ---
>  drivers/firmware/efi/esrt.c | 47 -
>  1 file changed, 25 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/firmware/efi/esrt.c b/drivers/firmware/efi/esrt.c
> index d5915272141f..5c49f2aaa4b1 100644
> --- a/drivers/firmware/efi/esrt.c
> +++ b/drivers/firmware/efi/esrt.c
> @@ -245,36 +245,38 @@ void __init efi_esrt_init(void)
>   int rc;
>   phys_addr_t end;
>  
> - if (!efi_enabled(EFI_MEMMAP))
> + if (!efi_enabled(EFI_MEMMAP) && !efi_enabled(EFI_PARAVIRT))
>   return;
>  
>   pr_debug("esrt-init: loading.\n");
>   if (!esrt_table_exists())
>   return;
>  
> - rc = efi_mem_desc_lookup(efi.esrt, &md);
> - if (rc < 0 ||
> - (!(md.attribute & EFI_MEMORY_RUNTIME) &&
> -  md.type != EFI_BOOT_SERVICES_DATA &&
> -  md.type != EFI_RUNTIME_SERVICES_DATA)) {
> - pr_warn("ESRT header is not in the memory map.\n");
> - return;
> - }
> + if (efi_enabled(EFI_MEMMAP)) {
> + rc = efi_mem_desc_lookup(efi.esrt, &md);
> + if (rc < 0 ||
> + (!(md.attribute & EFI_MEMORY_RUNTIME) &&
> +  md.type != EFI_BOOT_SERVICES_DATA &&
> +  md.type != EFI_RUNTIME_SERVICES_DATA)) {
> + pr_warn("ESRT header is not in the memory map.\n");
> + return;
> + }

Here you blindly trust the data in the ESRT in the PV case, without
checking it matches the regions on the memory map, which could lead to
errors if the ESRT turns out to be wrong.
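
To illustrate the kind of check that is being skipped, here is a minimal sketch of validating that a table lies inside a memory map region. It assumes a simplified descriptor: struct sketch_mem_desc and the sketch_* functions are hypothetical names, not the kernel's efi_mem_desc_lookup() / efi_mem_desc_end() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an EFI memory descriptor. */
struct sketch_mem_desc {
    uint64_t phys_addr;   /* start of the region */
    uint64_t num_pages;   /* length in 4 KiB pages */
};

#define SKETCH_PAGE_SHIFT 12

/* Exclusive end address of the region described by md. */
static inline uint64_t sketch_mem_desc_end(const struct sketch_mem_desc *md)
{
    return md->phys_addr + (md->num_pages << SKETCH_PAGE_SHIFT);
}

/*
 * Return true if [table, table + size) lies entirely inside md.
 * The comparison is arranged so that table + size is never computed,
 * which avoids overflow for large values.
 */
static inline bool sketch_table_in_region(const struct sketch_mem_desc *md,
                                          uint64_t table, uint64_t size)
{
    uint64_t end = sketch_mem_desc_end(md);

    if ( table < md->phys_addr || table >= end )
        return false;

    return size <= end - table;
}

/* Small self-check: a two-page region starting at 0x1000. */
static inline void sketch_table_in_region_selftest(void)
{
    struct sketch_mem_desc md = { .phys_addr = 0x1000, .num_pages = 2 };

    assert(sketch_table_in_region(&md, 0x1000, 0x2000));
    assert(!sketch_table_in_region(&md, 0x2ff0, 0x11));
}
```

Without an EFI memory map in the PV case there is no descriptor to run such a check against, which is the gap being pointed out.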

Thanks, Roger.

Re: [PATCH] xen: Introduce cmpxchg64() and guest_cmpxchg64()

2020-08-17 Thread Julien Grall





On 17/08/2020 12:50, Roger Pau Monné wrote:

On Mon, Aug 17, 2020 at 12:05:54PM +0100, Julien Grall wrote:

Hi,

On 17/08/2020 11:33, Roger Pau Monné wrote:

On Mon, Aug 17, 2020 at 10:42:54AM +0100, Julien Grall wrote:

Hi,

On 17/08/2020 10:24, Roger Pau Monné wrote:

On Sat, Aug 15, 2020 at 06:21:43PM +0100, Julien Grall wrote:

From: Julien Grall 

The IOREQ code is using cmpxchg() with a 64-bit value. At the moment, this
is x86 code, but there is a plan to make it common.

To cater for 32-bit arches, introduce two new helpers to deal with 64-bit
cmpxchg.

The Arm 32-bit implementation of cmpxchg64() is based on the __cmpxchg64
in Linux v5.8 (arch/arm/include/asm/cmpxchg.h).

Signed-off-by: Julien Grall 
Cc: Oleksandr Tyshchenko 
---
diff --git a/xen/include/asm-x86/guest_atomics.h 
b/xen/include/asm-x86/guest_atomics.h
index 029417c8ffc1..f4de9d3631ff 100644
--- a/xen/include/asm-x86/guest_atomics.h
+++ b/xen/include/asm-x86/guest_atomics.h
@@ -20,6 +20,8 @@
((void)(d), test_and_change_bit(nr, p))
#define guest_cmpxchg(d, ptr, o, n) ((void)(d), cmpxchg(ptr, o, n))
+#define guest_cmpxchg64(d, ptr, o, n) ((void)(d), cmpxchg64(ptr, o, n))
+
#endif /* _X86_GUEST_ATOMICS_H */
/*
diff --git a/xen/include/asm-x86/x86_64/system.h 
b/xen/include/asm-x86/x86_64/system.h
index f471859c19cc..c1b16105e9f2 100644
--- a/xen/include/asm-x86/x86_64/system.h
+++ b/xen/include/asm-x86/x86_64/system.h
@@ -5,6 +5,8 @@
((__typeof__(*(ptr)))__cmpxchg((ptr),(unsigned long)(o),\
   (unsigned long)(n),sizeof(*(ptr))))
+#define cmpxchg64(ptr, o, n) cmpxchg(ptr, o, n)


Why do you need to introduce an explicitly sized version of cmpxchg
for 64bit values?

There's no cmpxchg{8,16,32}, so I would expect cmpxchg64 to just be
handled by cmpxchg detecting the size of the parameter passed to the
function.

That works quite well for 64-bit arches. However, for 32-bit, you would need
to take some detour so 32-bit and 64-bit can cohabit (you cannot simply
replace unsigned long with uint64_t).


Oh, I see. Switching __cmpxchg on Arm 32 to use unsigned long long or
uint64_t would be bad, as you would then need two registers to pass
the value to the function, or push it on the stack?


We have only 4 registers (r0 - r3) available for the arguments. With a 64-bit
value, each value takes 2 registers, so some arguments will end up being
pushed on the stack.

This is assuming the compiler is not clever enough to see we are only using
the bottom 32 bits with some cmpxchg.



Maybe do something like:

#define cmpxchg(ptr,o,n) ({                                             \
    typeof(*(ptr)) tmp;                                                 \
                                                                        \
    switch ( sizeof(*(ptr)) )                                           \
    {                                                                   \
    case 8:                                                             \
        tmp = __cmpxchg_mb64((ptr), (uint64_t)(o),                      \
                             (uint64_t)(n), sizeof(*(ptr)));            \
        break;                                                          \
    default:                                                            \
        tmp = __cmpxchg_mb((ptr), (unsigned long)(o),                   \
                           (unsigned long)(n), sizeof(*(ptr)));         \
        break;                                                          \
    }                                                                   \
    tmp;                                                                \
})



Unfortunately this can't compile if o and n are pointers because the
compiler will complain about the cast to uint64_t.


Right, we would have to cast to unsigned long first and then to
uint64_t, which is not very nice.


If you use (uint64_t)(unsigned long) in the 64-bit case, then you would 
lose the top 32 bits. So cmpxchg() wouldn't work as expected.
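
The truncation can be shown with a few lines of C. This is a sketch only: sketch_ulong32_t is a hypothetical typedef standing in for unsigned long on a 32-bit architecture, so the effect is visible on any host.

```c
#include <assert.h>
#include <stdint.h>

/* uint32_t plays the role of 'unsigned long' on a 32-bit architecture. */
typedef uint32_t sketch_ulong32_t;

/* Models the double cast (uint64_t)(unsigned long)v on such a target. */
static inline uint64_t sketch_double_cast(uint64_t v)
{
    return (uint64_t)(sketch_ulong32_t)v;
}

/* The top 32 bits are dropped; values that fit in 32 bits are unchanged. */
static inline void sketch_double_cast_selftest(void)
{
    assert(sketch_double_cast(0x1122334455667788ULL) == 0x55667788ULL);
    assert(sketch_double_cast(0x00000000ffffffffULL) == 0xffffffffULL);
}
```

A compare-and-exchange fed such values would compare and store only the low half of a 64-bit quantity.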






We would also need a cast when assigning to tmp because tmp may not be a
scalar type. This would lead to the same compiler issue.


Yes, we would have to do a bunch of casts.


I don't think there is a way to solve this using just casts.




The only way I could see to make it work would be to use the same trick as
we do for {read, write}_atomic() (see asm-arm/atomic.h). We are using union
and void pointer to prevent explicit cast.


I'm mostly worried about common code having assumed that cmpxchg
does also handle 64bit sized parameters, and thus failing to use
cmpxchg64 when required. I assume this is not much of a deal as then
the Arm 32 build would fail, so it should be fairly easy to catch
those.

FWIW, this is not very different from the existing approach. If one were to 
use cmpxchg() with a 64-bit value, it would fail to compile.


Furthermore, there is no guarantee that a new 32-bit arch would have 
64-bit atomic operations. For instance, not all 3

Re: [PATCH] xen/x86: irq: Avoid a TOCTOU race in pirq_spin_lock_irq_desc()

2020-08-17 Thread Julien Grall


Hi,

On 17/08/2020 13:46, Roger Pau Monné wrote:

On Fri, Aug 14, 2020 at 08:25:28PM +0100, Julien Grall wrote:

Hi Andrew,

Sorry for the late answer.

On 23/07/2020 14:59, Andrew Cooper wrote:

On 23/07/2020 14:22, Julien Grall wrote:

Hi Jan,

On 23/07/2020 12:23, Jan Beulich wrote:

On 22.07.2020 18:53, Julien Grall wrote:

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1187,7 +1187,7 @@ struct irq_desc *pirq_spin_lock_irq_desc(
      for ( ; ; )
    {
-    int irq = pirq->arch.irq;
+    int irq = read_atomic(&pirq->arch.irq);


There we go - I'd be fine this way, but I'm pretty sure Andrew
would want this to be ACCESS_ONCE(). So I guess now is the time
to settle which one to prefer in new code (or which criteria
there are to prefer one over the other).


I would prefer if we have a single way to force the compiler to do a
single access (read/write).


Unlikely to happen, I'd expect.

But I would really like to get rid of (or at least rename)
read_atomic()/write_atomic() specifically because they've got nothing to
do with atomic_t's and the set of functionality whose namespace they share.


Would you be happy if I rename both to READ_ONCE() and WRITE_ONCE()? I would
also suggest moving their implementation to a new header, asm/lib.h.


Maybe {READ/WRITE}_SINGLE (to note those should be implemented using a
single instruction)?


The asm volatile statement contains only one instruction, but this 
doesn't mean the helper will generate a single instruction.


You may have other instructions to get the registers ready for the access.



ACCESS_ONCE (which also has the _ONCE suffix) IIRC could be
implemented using several instructions, and hence doesn't seem right
that they all have the _ONCE suffix.


The goal here is the same, we want to access the variable *only* once.

May I ask why we would want to expose the difference to the user?
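
As a rough illustration of the two flavours under discussion, here is a sketch with hypothetical names. SKETCH_ACCESS_ONCE() uses a volatile cast, and sketch_read_once() uses the GCC/Clang __atomic_load_n builtin as a stand-in for the inline assembly behind read_atomic(); neither is the Xen definition.

```c
#include <assert.h>

/* Volatile-cast flavour, in the style of ACCESS_ONCE(). */
#define SKETCH_ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Stand-in for read_atomic(): a relaxed builtin load instead of asm. */
#define sketch_read_once(p) __atomic_load_n((p), __ATOMIC_RELAXED)

struct sketch_pirq {
    int irq;   /* may be changed concurrently by another CPU */
};

/*
 * Check-then-use on a single snapshot. Reading pirq->irq twice (once for
 * the check, once for the use) would be the TOCTOU pattern to avoid.
 */
static inline int sketch_snapshot_irq(struct sketch_pirq *pirq)
{
    int irq = sketch_read_once(&pirq->irq);

    if ( irq <= 0 )
        return -1;

    return irq;
}

/* Small self-check of both helpers on a local object. */
static inline void sketch_snapshot_irq_selftest(void)
{
    struct sketch_pirq p = { .irq = 7 };

    assert(sketch_snapshot_irq(&p) == 7);
    assert(SKETCH_ACCESS_ONCE(p.irq) == 7);
}
```

From the caller's side both helpers look the same, which is the point of the question above: the difference lies only in what the compiler is allowed to emit.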

Cheers,

--
Julien Grall

Re: [PATCH] xen: Introduce cmpxchg64() and guest_cmpxchg64()

2020-08-17 Thread Roger Pau Monné

On Sat, Aug 15, 2020 at 06:21:43PM +0100, Julien Grall wrote:
> From: Julien Grall 
> 
> The IOREQ code is using cmpxchg() with a 64-bit value. At the moment, this
> is x86 code, but there is a plan to make it common.
> 
> To cater for 32-bit arches, introduce two new helpers to deal with 64-bit
> cmpxchg.
> 
> The Arm 32-bit implementation of cmpxchg64() is based on the __cmpxchg64
> in Linux v5.8 (arch/arm/include/asm/cmpxchg.h).
> 
> Signed-off-by: Julien Grall 
> Cc: Oleksandr Tyshchenko 
> ---
> diff --git a/xen/include/asm-x86/guest_atomics.h 
> b/xen/include/asm-x86/guest_atomics.h
> index 029417c8ffc1..f4de9d3631ff 100644
> --- a/xen/include/asm-x86/guest_atomics.h
> +++ b/xen/include/asm-x86/guest_atomics.h
> @@ -20,6 +20,8 @@
>  ((void)(d), test_and_change_bit(nr, p))
>  
>  #define guest_cmpxchg(d, ptr, o, n) ((void)(d), cmpxchg(ptr, o, n))
> +#define guest_cmpxchg64(d, ptr, o, n) ((void)(d), cmpxchg64(ptr, o, n))
> +
>  
>  #endif /* _X86_GUEST_ATOMICS_H */
>  /*
> diff --git a/xen/include/asm-x86/x86_64/system.h 
> b/xen/include/asm-x86/x86_64/system.h
> index f471859c19cc..c1b16105e9f2 100644
> --- a/xen/include/asm-x86/x86_64/system.h
> +++ b/xen/include/asm-x86/x86_64/system.h
> @@ -5,6 +5,8 @@
>  ((__typeof__(*(ptr)))__cmpxchg((ptr),(unsigned long)(o),\
>   (unsigned long)(n),sizeof(*(ptr))))
>  
> +#define cmpxchg64(ptr, o, n) cmpxchg(ptr, o, n)

Why do you need to introduce an explicitly sized version of cmpxchg
for 64bit values?

There's no cmpxchg{8,16,32}, so I would expect cmpxchg64 to just be
handled by cmpxchg detecting the size of the parameter passed to the
function. I think it's worth adding to the commit message why such
differentiated helper is needed.

Thanks, Roger.

Re: [PATCH v1 4/6] tools/ocaml/xenstored: drop select based

2020-08-17 Thread Edwin Torok

On Mon, 2020-08-17 at 09:59 +, Wei Liu wrote:
> The subject line seems to be cut off half way.
> 
> "Drop select based $SOMETHING"?

$SOMETHING = socket watching, I'll fix the message in V2.

> 
> Wei.
>

Re: [PATCH v1 0/6] tools/ocaml/xenstored: simplify code

2020-08-17 Thread Christian Lindig

I am going to look at this in more detail. In general, all of these are welcome 
changes. The main problem with select/poll is the emulation of select behaviour, 
which creates a lot of lists and consequently memory garbage at high frequency. 
This change does not yet address that, but dropping select paves the way to 
a more efficient implementation.

From: Edwin Torok
Sent: 14 August 2020 23:11
To: xen-devel@lists.xenproject.org
Cc: Edwin Torok; Christian Lindig; David Scott; Ian Jackson; Wei Liu
Subject: [PATCH v1 0/6] tools/ocaml/xenstored: simplify code

Fix warnings, and delete some obsolete code.
oxenstored contained a hand-rolled GC to perform hash-consing:
this can be done with a lot fewer lines of code by using the built-in Weak
module.

The choice of data structures for trees/tries is not very efficient: they are
just lists. Using a map improves lookup and deletion complexity, and replaces
hand-rolled recursion with higher-level library calls.

There is a lot more that could be done to optimize socket polling:
an epoll backend with a poll fallback, but an API structured around event-based
polling would be better. But first let's drop the legacy select-based code: I
think every modern *nix should have a working poll(3) by now.

This is a draft series, in need of more testing.

Edwin Török (6):
  tools/ocaml/libs/xc: Fix ambiguous documentation comment
  tools/ocaml/xenstored: fix deprecation warning
  tools/ocaml/xenstored: replace hand rolled GC with weak GC references
  tools/ocaml/xenstored: drop select based
  tools/ocaml/xenstored: use more efficient node trees
  tools/ocaml/xenstored: use more efficient tries

 tools/ocaml/libs/xc/xenctrl.mli   |  2 +
 tools/ocaml/xenstored/connection.ml   |  3 -
 tools/ocaml/xenstored/connections.ml  |  2 +-
 tools/ocaml/xenstored/disk.ml |  2 +-
 tools/ocaml/xenstored/history.ml  | 14 
 tools/ocaml/xenstored/parse_arg.ml|  7 +-
 tools/ocaml/xenstored/{select.ml => poll.ml}  | 14 +---
 .../ocaml/xenstored/{select.mli => poll.mli}  | 12 +---
 tools/ocaml/xenstored/store.ml| 49 ++---
 tools/ocaml/xenstored/symbol.ml   | 70 +--
 tools/ocaml/xenstored/symbol.mli  | 22 ++
 tools/ocaml/xenstored/trie.ml | 61 +++-
 tools/ocaml/xenstored/trie.mli| 26 +++
 tools/ocaml/xenstored/xenstored.ml| 20 +-
 14 files changed, 98 insertions(+), 206 deletions(-)
 rename tools/ocaml/xenstored/{select.ml => poll.ml} (85%)
 rename tools/ocaml/xenstored/{select.mli => poll.mli} (58%)

--
2.25.1

Re: [PATCH] xen: Introduce cmpxchg64() and guest_cmpxchg64()

2020-08-17 Thread Roger Pau Monné

On Mon, Aug 17, 2020 at 10:42:54AM +0100, Julien Grall wrote:
> Hi,
> 
> On 17/08/2020 10:24, Roger Pau Monné wrote:
> > On Sat, Aug 15, 2020 at 06:21:43PM +0100, Julien Grall wrote:
> > > From: Julien Grall 
> > > 
> > > The IOREQ code is using cmpxchg() with a 64-bit value. At the moment, this
> > > is x86 code, but there is a plan to make it common.
> > > 
> > > To cater for 32-bit arches, introduce two new helpers to deal with 64-bit
> > > cmpxchg.
> > > 
> > > The Arm 32-bit implementation of cmpxchg64() is based on the __cmpxchg64
> > > in Linux v5.8 (arch/arm/include/asm/cmpxchg.h).
> > > 
> > > Signed-off-by: Julien Grall 
> > > Cc: Oleksandr Tyshchenko 
> > > ---
> > > diff --git a/xen/include/asm-x86/guest_atomics.h 
> > > b/xen/include/asm-x86/guest_atomics.h
> > > index 029417c8ffc1..f4de9d3631ff 100644
> > > --- a/xen/include/asm-x86/guest_atomics.h
> > > +++ b/xen/include/asm-x86/guest_atomics.h
> > > @@ -20,6 +20,8 @@
> > >   ((void)(d), test_and_change_bit(nr, p))
> > >   #define guest_cmpxchg(d, ptr, o, n) ((void)(d), cmpxchg(ptr, o, n))
> > > +#define guest_cmpxchg64(d, ptr, o, n) ((void)(d), cmpxchg64(ptr, o, n))
> > > +
> > >   #endif /* _X86_GUEST_ATOMICS_H */
> > >   /*
> > > diff --git a/xen/include/asm-x86/x86_64/system.h 
> > > b/xen/include/asm-x86/x86_64/system.h
> > > index f471859c19cc..c1b16105e9f2 100644
> > > --- a/xen/include/asm-x86/x86_64/system.h
> > > +++ b/xen/include/asm-x86/x86_64/system.h
> > > @@ -5,6 +5,8 @@
> > >   ((__typeof__(*(ptr)))__cmpxchg((ptr),(unsigned long)(o),            \
> > >                                  (unsigned long)(n),sizeof(*(ptr))))
> > > +#define cmpxchg64(ptr, o, n) cmpxchg(ptr, o, n)
> > 
> > Why do you need to introduce an explicitly sized version of cmpxchg
> > for 64bit values?
> > 
> > There's no cmpxchg{8,16,32}, so I would expect cmpxchg64 to just be
> > handled by cmpxchg detecting the size of the parameter passed to the
> > function.
> That works quite well for 64-bit arches. However, for 32-bit, you would need
> to take some detour so 32-bit and 64-bit can cohabit (you cannot simply
> replace unsigned long with uint64_t).

Oh, I see. Switching __cmpxchg on Arm 32 to use unsigned long long or
uint64_t would be bad, as you would then need two registers to pass
the value to the function, or push it on the stack?

Maybe do something like:

#define cmpxchg(ptr,o,n) ({                                             \
    typeof(*(ptr)) tmp;                                                 \
                                                                        \
    switch ( sizeof(*(ptr)) )                                           \
    {                                                                   \
    case 8:                                                             \
        tmp = __cmpxchg_mb64((ptr), (uint64_t)(o),                      \
                             (uint64_t)(n), sizeof(*(ptr)));            \
        break;                                                          \
    default:                                                            \
        tmp = __cmpxchg_mb((ptr), (unsigned long)(o),                   \
                           (unsigned long)(n), sizeof(*(ptr)));         \
        break;                                                          \
    }                                                                   \
    tmp;                                                                \
})

Roger.

Re: [RFC PATCH V1 04/12] xen/arm: Introduce arch specific bits for IOREQ/DM features

2020-08-17 Thread Oleksandr




On 15.08.20 20:56, Julien Grall wrote:

Hi Julien.


Hi,

On 04/08/2020 15:01, Julien Grall wrote:

On 04/08/2020 08:49, Paul Durrant wrote:

diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
index 931404c..b5fc066 100644
--- a/tools/libxc/xc_dom_arm.c
+++ b/tools/libxc/xc_dom_arm.c
@@ -26,11 +26,19 @@
  #include "xg_private.h"
  #include "xc_dom.h"

-#define NR_MAGIC_PAGES 4
+
  #define CONSOLE_PFN_OFFSET 0
  #define XENSTORE_PFN_OFFSET 1
  #define MEMACCESS_PFN_OFFSET 2
  #define VUART_PFN_OFFSET 3
+#define IOREQ_SERVER_PFN_OFFSET 4
+
+#define NR_IOREQ_SERVER_PAGES 8
+#define NR_MAGIC_PAGES (4 + NR_IOREQ_SERVER_PAGES)
+
+#define GUEST_MAGIC_BASE_PFN (GUEST_MAGIC_BASE >> XC_PAGE_SHIFT)
+
+#define special_pfn(x)  (GUEST_MAGIC_BASE_PFN + (x))


Why introduce 'magic pages' for Arm? It's quite a horrible hack that 
we have begun to do away with by adding resource mapping.


This would require us to mandate at least Linux 4.17 in a domain that 
will run an IOREQ server. If we don't mandate this, the minimum 
version would be 4.10 where DM OP was introduced.


Because of XSA-300, we could technically not safely run an IOREQ 
server with existing Linux. So it is probably OK to enforce the use 
of the acquire interface.

One more thing. We are using atomic operations on the IOREQ pages. As 
our implementation is based on LL/SC instructions so far, we have a 
mitigation in place to prevent a domain from DoSing Xen. However, this 
relies on the page being mapped in a single domain at a time.


AFAICT, with the legacy interface, the pages will be mapped in both 
the target and the emulator. So this would defeat the mitigation we 
have in place.


Because the legacy interface is relying on foreign mapping, the page 
has to be mapped in the target P2M. It might be possible to restrict 
the access for the target by setting the p2m bits r, w to 0. This 
would still allow the foreign mapping to work as we only check the p2m 
type during mapping.


Anyway, I think we agreed that we want to avoid introducing the 
legacy interface. But I wanted to answer just for completeness and 
keep a record of potential pitfalls with the legacy interface on Arm.

ok, the HVMOP plumbing on Arm will be dropped for non-RFC series. It 
seems that xenforeignmemory_map_resource() does what is needed. Of 
course, the corresponding Linux patch to support 
IOCTL_PRIVCMD_MMAP_RESOURCE was cherry-picked for that purpose (I am 
currently using v4.14).


Thank you.


--
Regards,

Oleksandr Tyshchenko

Re: [PATCH] xen/x86: irq: Avoid a TOCTOU race in pirq_spin_lock_irq_desc()

2020-08-17 Thread Julien Grall





On 17/08/2020 15:01, Roger Pau Monné wrote:

On Mon, Aug 17, 2020 at 02:14:01PM +0100, Julien Grall wrote:

Hi,

On 17/08/2020 13:46, Roger Pau Monné wrote:

On Fri, Aug 14, 2020 at 08:25:28PM +0100, Julien Grall wrote:

Hi Andrew,

Sorry for the late answer.

On 23/07/2020 14:59, Andrew Cooper wrote:

On 23/07/2020 14:22, Julien Grall wrote:

Hi Jan,

On 23/07/2020 12:23, Jan Beulich wrote:

On 22.07.2020 18:53, Julien Grall wrote:

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1187,7 +1187,7 @@ struct irq_desc *pirq_spin_lock_irq_desc(
       for ( ; ; )
     {
-    int irq = pirq->arch.irq;
+    int irq = read_atomic(&pirq->arch.irq);


There we go - I'd be fine this way, but I'm pretty sure Andrew
would want this to be ACCESS_ONCE(). So I guess now is the time
to settle which one to prefer in new code (or which criteria
there are to prefer one over the other).


I would prefer if we have a single way to force the compiler to do a
single access (read/write).


Unlikely to happen, I'd expect.

But I would really like to get rid of (or at least rename)
read_atomic()/write_atomic() specifically because they've got nothing to
do with atomic_t's and the set of functionality whose namespace they share.


Would you be happy if I rename both to READ_ONCE() and WRITE_ONCE()? I would
also suggest moving their implementation to a new header asm/lib.h.


Maybe {READ/WRITE}_SINGLE (to note those should be implemented using a
single instruction)?


The asm volatile statement contains only one instruction, but this doesn't
mean the helper will generate a single instruction.


Well, the access should be done using a single instruction, which is
what we care about when using these helpers.


You may have other instructions to get the registers ready for the access.



ACCESS_ONCE (which also has the _ONCE suffix) IIRC could be
implemented using several instructions, and hence doesn't seem right
that they all have the _ONCE suffix.


The goal here is the same, we want to access the variable *only* once.


Right, but this is not guaranteed by the current implementation of
ACCESS_ONCE AFAICT, as the compiler *might* split the access into two
(or more) instructions, and hence won't be an atomic access anymore?

From my understanding, at least on GCC/Clang, ACCESS_ONCE() should be 
atomic if you are using an aligned address and a size no larger than a 
register.





May I ask why we would want to expose the difference to the user?


I'm not saying we should, but naming them using the _ONCE suffix seems
misleading IMO, as they have different guarantees than what
ACCESS_ONCE currently provides.


Let's leave aside how ACCESS_ONCE() is implemented for a moment.

If ACCESS_ONCE() doesn't guarantee atomicity, then it means you may read a 
mix of the old and new value. This would most likely break quite a few 
of the users because the result wouldn't be coherent.


Do you have a place in mind where the non-atomicity would be useful?

Cheers,

--
Julien Grall
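
The single-access concern discussed in this thread can be sketched in a few lines of C. This is only an illustration under stated assumptions: ACCESS_ONCE_SKETCH, struct pirq_sketch and checked_irq() are hypothetical names, not Xen's ACCESS_ONCE(), read_atomic() or pirq_spin_lock_irq_desc(), and __typeof__ is a GCC/Clang extension. The volatile access asks the compiler to read the field exactly once, so the value that is checked is also the value that is used. Whether that read ends up as one instruction still depends on the compiler, the alignment and the access size, which is the point under debate.

```c
#include <assert.h>

/* Hypothetical stand-in for an ACCESS_ONCE()-style helper: the volatile
 * qualifier stops the compiler from re-reading or caching the object. */
#define ACCESS_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical, heavily reduced stand-in for a pirq structure. */
struct pirq_sketch {
    int irq;
};

/* Take one snapshot of the field, validate it, and return that same
 * snapshot. Returns -1 when the snapshot is not a usable IRQ number. */
static int checked_irq(struct pirq_sketch *pirq)
{
    int irq;

    assert(pirq);
    irq = ACCESS_ONCE_SKETCH(pirq->irq);
    if (irq <= 0)
        return -1;
    return irq; /* same value that passed the check above */
}
```

Without the forced single read, a compiler would be free to load pirq->irq once for the check and again for the use, which is the TOCTOU window the patch subject refers to.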

Re: [RFC PATCH V1 07/12] A collection of tweaks to be able to run emulator in driver domain

2020-08-17 Thread Oleksandr




On 16.08.20 18:36, Julien Grall wrote:

Hi Julien.




On 14/08/2020 17:30, Oleksandr wrote:


Hello all.



-----Original Message-----
From: Jan Beulich
Sent: 05 August 2020 17:20
To: Oleksandr Tyshchenko; Paul Durrant
Cc: xen-devel@lists.xenproject.org; Oleksandr Tyshchenko; Andrew Cooper;
George Dunlap; Ian Jackson; Julien Grall; Stefano Stabellini; Wei Liu;
Daniel De Graaf
Subject: Re: [RFC PATCH V1 07/12] A collection of tweaks to be 
able to run emulator in driver domain


On 03.08.2020 20:21, Oleksandr Tyshchenko wrote:

From: Oleksandr Tyshchenko 

Trying to run emulator in driver domain I ran into various issues
mostly policy-related. So this patch tries to resolve them all,
probably in a hackish way. I would like to get feedback on how
to implement them properly, as having an emulator in a driver domain
is a completely valid use-case.

  From going over the comments I can only derive you want to run
an emulator in a driver domain, which doesn't really make sense
to me. A driver domain has a different purpose after all. If
instead you mean it to be run in just some other domain (which
also isn't the domain controlling the target), then there may
be more infrastructure changes needed.

Paul - was/is your standalone ioreq server (demu?) able to run
in other than the domain controlling a guest?

Not something I've done yet, but it was always part of the idea so 
that we could e.g. pass through a device to a dedicated domain and 
then run multiple demu instances there to virtualize it for many 
domUs. (I'm thinking here of a device that is not SR-IOV and hence 
would need some bespoke emulation code to share it out). That 
dedicated domain would be termed the 'driver domain' simply 
because it is running the device driver for the h/w that underpins 
the emulation.

I may abuse "driver domain" terminology, but indeed in our use-case we
pass through a set of H/W devices to a dedicated domain which is running
the device drivers for that H/W. Our target system comprises a thin
Dom0 (without H/W devices at all), DomD (which owns most of the H/W
devices) and DomU which runs on virtual devices. This patch tries to
make changes on the Xen side to be able to run a standalone ioreq server
(emulator) in that dedicated (driver?) domain.

Okay, in which case I'm fine with the term. I simply wasn't aware of the
targeted scenario, sorry.



May I kindly ask for pointers on how to *properly* resolve the 
various policy-related issues described in that patch? Without having 
them resolved, it wouldn't be possible to run a standalone IOREQ server 
in a driver domain.


You could already do that by writing your own XSM policy. Did you 
explore it? If so, may I ask why this wouldn't be suitable?


Also, I would like to emphasize that because of XSA-295 (Unlimited Arm 
Atomics Operations), you can only run emulators in a trusted domain on Arm.


There would be more work to do if you wanted to run them in a 
non-trusted environment.


Thank you for the explanation. Yes, we consider the driver domain a 
trusted domain; there is no plan to run an emulator in non-trusted domains. 
Indeed, it is worth trying to write our own policy which will cover our use 
case (with an emulator in the driver domain) rather than tweaking Xen's default one.



--
Regards,

Oleksandr Tyshchenko

Re: [PATCH v2 1/7] x86/EFI: sanitize build logic

2020-08-17 Thread Jan Beulich

On 10.08.2020 16:38, Andrew Cooper wrote:
> On 07/08/2020 17:33, Andrew Cooper wrote:
>> On 07/08/2020 12:32, Jan Beulich wrote:
>>> With changes done over time and as far as linking goes, the only special
>>> thing about building with EFI support enabled is the need for the dummy
>>> relocations object for xen.gz uniformly in all build stages. All other
>>> efi/*.o can be consumed from the built_in*.o files.
>>>
>>> In efi/Makefile, besides moving relocs-dummy.o to "extra", also properly
>>> split between obj-y and obj-bin-y.
>>>
>>> Signed-off-by: Jan Beulich 
>> Acked-by: Andrew Cooper 
>>
>> I'd prefer to see this all in Kconfig, but this is a clear improvement
>> in its own right.
> 
> Actually, it breaks the build with LIVEPATCH enabled.
> 
> make[2]: *** No rule to make target 'efi/buildid.o', needed by
> '/local/security/xen.git/xen/xen.efi'.  Stop.
> make[2]: *** Waiting for unfinished jobs
> Makefile:355: recipe for target '/local/security/xen.git/xen/xen' failed

There must be more to it than just "with LIVEPATCH enabled", as I definitely
tested a LIVEPATCH-enabled config. I'll see if I can figure out what's wrong
without further details (after my now prolonged "vacation"), but I may need
to come back asking for further detail.

Jan

Re: [RFC PATCH V1 05/12] hvm/dm: Introduce xendevicemodel_set_irq_level DM op

2020-08-17 Thread Jan Beulich

On 07.08.2020 23:50, Stefano Stabellini wrote:
> On Fri, 7 Aug 2020, Jan Beulich wrote:
>> On 07.08.2020 01:49, Stefano Stabellini wrote:
>>> On Thu, 6 Aug 2020, Julien Grall wrote:
>>>> On 06/08/2020 01:37, Stefano Stabellini wrote:
>>>>> On Wed, 5 Aug 2020, Julien Grall wrote:
>>>>>> On 05/08/2020 00:22, Stefano Stabellini wrote:
>>>>>>> On Mon, 3 Aug 2020, Oleksandr Tyshchenko wrote:
>>>>>>>> From: Oleksandr Tyshchenko
>>>>>>>>
>>>>>>>> This patch adds ability to the device emulator to notify otherend
>>>>>>>> (some entity running in the guest) using a SPI and implements Arm
>>>>>>>> specific bits for it. Proposed interface allows emulator to set
>>>>>>>> the logical level of one of a domain's IRQ lines.
>>>>>>>>
>>>>>>>> Please note, this is a split/cleanup of Julien's PoC:
>>>>>>>> "Add support for Guest IO forwarding to a device emulator"
>>>>>>>>
>>>>>>>> Signed-off-by: Julien Grall
>>>>>>>> Signed-off-by: Oleksandr Tyshchenko
>>>>>>>> ---
>>>>>>>>  tools/libs/devicemodel/core.c                   | 18 ++
>>>>>>>>  tools/libs/devicemodel/include/xendevicemodel.h |  4
>>>>>>>>  tools/libs/devicemodel/libxendevicemodel.map    |  1 +
>>>>>>>>  xen/arch/arm/dm.c                               | 22 +-
>>>>>>>>  xen/common/hvm/dm.c                             |  1 +
>>>>>>>>  xen/include/public/hvm/dm_op.h                  | 15 +++
>>>>>>>>  6 files changed, 60 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/tools/libs/devicemodel/core.c b/tools/libs/devicemodel/core.c
>>>>>>>> index 4d40639..30bd79f 100644
>>>>>>>> --- a/tools/libs/devicemodel/core.c
>>>>>>>> +++ b/tools/libs/devicemodel/core.c
>>>>>>>> @@ -430,6 +430,24 @@ int xendevicemodel_set_isa_irq_level(
>>>>>>>>      return xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
>>>>>>>>  }
>>>>>>>> +int xendevicemodel_set_irq_level(
>>>>>>>> +    xendevicemodel_handle *dmod, domid_t domid, uint32_t irq,
>>>>>>>> +    unsigned int level)
>>>>>>>
>>>>>>> It is a pity that having xen_dm_op_set_pci_intx_level and
>>>>>>> xen_dm_op_set_isa_irq_level already we need to add a third one, but from
>>>>>>> the names alone I don't think we can reuse either of them.
>>>>>>
>>>>>> The problem is not the name...
>>>>>>
>>>>>>>
>>>>>>> It is very similar to set_isa_irq_level. We could almost rename
>>>>>>> xendevicemodel_set_isa_irq_level to xendevicemodel_set_irq_level or,
>>>>>>> better, just add an alias to it so that xendevicemodel_set_irq_level is
>>>>>>> implemented by calling xendevicemodel_set_isa_irq_level. Honestly I am
>>>>>>> not sure if it is worth doing it though. Any other opinions?
>>>>>>
>>>>>> ... the problem is the interrupt field is only 8-bit. So we would only be
>>>>>> able to cover IRQ 0 - 255.
>>>>>
>>>>> Argh, that's not going to work :-(  I wasn't sure if it was a good idea
>>>>> anyway.
>>>>>
>>>>>
>>>>>> It is not entirely clear how the existing subop could be extended without
>>>>>> breaking existing callers.
>>>>>>
>>>>>>> But I think we should plan for not needing two calls (one to set level
>>>>>>> to 1, and one to set it to 0):
>>>>>>> https://marc.info/?l=xen-devel&m=159535112027405
>>>>>>
>>>>>> I am not sure I understand your suggestion here. Are you suggesting to
>>>>>> remove the 'level' parameter?
>>>>>
>>>>> My hope was to make it optional to call the hypercall with level = 0,
>>>>> not necessarily to remove 'level' from the struct.
>>>>
>>>> From my understanding, the hypercall is meant to represent the status of the
>>>> line between the device and the interrupt controller (either low or high).
>>>>
>>>> This is then up to the interrupt controller to decide when the interrupt is
>>>> going to be fired:
>>>>   - For edge interrupt, this will fire when the line moves from low to high
>>>>     (or vice versa).
>>>>   - For level interrupt, this will fire when the line is high (assuming level
>>>>     trigger high) and will keep firing until the device decides to lower the
>>>>     line.
>>>>
>>>> For a device, it is common to keep the line high until an OS wrote to a
>>>> specific register.
>>>>
>>>> Furthermore, technically, the guest OS is in charge of configuring how an
>>>> interrupt is triggered. Admittedly this information is part of the DT, but
>>>> nothing prevents a guest from changing it.
>>>>
>>>> As a side note, we have a workaround in Xen for some buggy DT (see the arch
>>>> timer) exposing the wrong trigger type.
>>>>
>>>> Because of that, I don't really see a way to make it optional. Maybe you have
>>>> something different in mind?
>>>
>>> For level, we need the level parameter. For edge, we are only interested
>>> in the "edge", right?
>>
>> I don't think so, unless Arm has special restrictions. Edges can be
>> both rising and falling ones.
> 
> And the same is true for level interrupts too: they could be active-low
> or active-high.
> 
> 
> Instead of modelling the state of the line, wh

Re: [RFC PATCH 1/2] libxl: add Function class to IDL

2020-08-17 Thread Nick Rosbrook

On Fri, Aug 14, 2020 at 11:52:33AM +0100, Anthony PERARD wrote:
> On Mon, Jul 27, 2020 at 09:26:32AM -0400, Nick Rosbrook wrote:
> > Add Function and CtxFunction classes to idl.py to allow generator
> > scripts to generate wrappers which are repetitive and straightforward
> > when written by hand. Examples of such functions are the
> > device_add/remove functions.
> > 
> > To start, a Function has attributes for namespace, name, parameters,
> > return type, and an indication if the return value should be interpreted as
> > a status code. The CtxFunction class extends this by indicating that a
> > libxl_ctx is a required parameter, and can optionally be an async
> > function.
> > 
> > Also, add logic to idl.parse to return the list of functions found in an
> > IDL file. For now, have users of idl.py -- i.e. libxl/gentypes.py and
> > golang/xenlight/gengotypes.py -- ignore the list of functions returned.
> > 
> > Signed-off-by: Nick Rosbrook 
> > ---
> >  
> > +class Function(object):
> > +    """
> > +    A general description of a function signature.
> > +
> > +    Attributes:
> > +      name (str): name of the function, excluding namespace.
> > +      params (list of (str,Type)): list of function parameters.
> > +      return_type (Type): the Type (if any), returned by the function.
> > +      return_is_status (bool): Indicates that the return value should be
> > +                               interpreted as an error/status code.
> 
> Can we get away without `return_is_status`? Couldn't we try to have
> return_type=libxl_error to indicate that return is a kind of status?
> 
Yes, I think that is much better.

> > +    """
> > +class CtxFunction(Function):
> > +    """
> > +    A function that requires a libxl_ctx.
> > +
> > +    Attributes:
> > +      is_asyncop (bool): indicates that the function accepts a
> > +                         libxl_asyncop_how parameter.
> 
> While CtxFunction can be a function that takes `libxl_ctx` as first
> parameter, I don't think `is_asyncop` can be used. We can't know if
> `ao_how` will be last or not. For some function, `ao_how` is second to
> last. So, I guess `ao_how` might need to be listed in `params`
> 
> What do you think?

That's a good point. Do you think it would make sense to add `Builtin`
definitions to libxl_types.idl for `libxl_asyncop_how`,
`libxl_asyncprogress_how`, etc.? That way the generation scripts could
work with those types more easily. But, I guess since those definitions
aren't known until parse time we couldn't use them in the
`DeviceFunction` class definition (but maybe that's not a big deal).

Thank you for the feedback.

-NR

Re: [PATCH 05/14] kernel-doc: public/features.h

2020-08-17 Thread Jan Beulich

On 07.08.2020 23:52, Stefano Stabellini wrote:
> On Fri, 7 Aug 2020, Jan Beulich wrote:
>> On 07.08.2020 01:49, Stefano Stabellini wrote:
>>> @@ -41,19 +41,25 @@
>>>   * XENFEAT_dom0 MUST be set if the guest is to be booted as dom0,
>>>   */
>>>  
>>> -/*
>>> - * If set, the guest does not need to write-protect its pagetables, and can
>>> - * update them via direct writes.
>>> +/**
>>> + * DOC: XENFEAT_writable_page_tables
>>> + *
>>> + * If set, the guest does not need to write-protect its pagetables, and
>>> + * can update them via direct writes.
>>>   */
>>>  #define XENFEAT_writable_page_tables   0
>>
>> I dislike such redundancy (and it's more noticeable here than with
>> the struct-s). Is there really no way for the tool to find the
>> right item, all the more so as in the cover letter you say that you
>> even need to get the placement right, i.e. there can't be e.g.
>> intervening #define-s?
> 
> Let me clarify that the right placement (nothing between the comment and
> the following structure) is important for structs, typedefs, etc., but
> not for "DOC". DOC is freeform and doesn't have to be followed by
> anything specifically.
> 
> 
> In regards to the redundancy, there is only another option, that I
> didn't choose because it leads to worse documents being generated.
> However, they are still readable, so if the agreement is to use the
> other format, I would be OK with it.
> 
> 
> The other format is the keyword "macro" (this one would have to have the
> right placement, straight on top of the #define):
> 
> /**
>  * macro XENFEAT_writable_page_tables
>  *
>  * If set, the guest does not need to write-protect its pagetables, and
>  * can update them via direct writes.
>  */
> 
> 
> Which could be further simplified to:
> 
> /**
>  * macro
>  *
>  * If set, the guest does not need to write-protect its pagetables, and
>  * can update them via direct writes.
>  */
> 
> 
> In terms of redundancy, that's the best we can do.
> 
> The reason why I say it is not optimal is that with DOC the pseudo-html
> generated via sphinx is:
> 
> ---
> * XENFEAT_writable_page_tables *
> 
> If set, the guest does not need to write-protect its pagetables, and
> can update them via direct writes.
> ---
> 
> While with macro, two () parentheses get added to the title, and also an
> empty "Parameters" section gets added, like this:
> 
> ---
> * XENFEAT_writable_page_tables() *
> 
> ** Parameters **
> 
> ** Description **
> 
> If set, the guest does not need to write-protect its pagetables, and
> can update them via direct writes.
> ---
> 
> 
> I think it could be confusing to the user: it looks like a macro with
> parameters, which is not what we want.

Agreed, so ...

> For that reason, I think we should stick with "DOC" for now.

... if there are no (better) alternatives we'll have to live with the
redundancy.

Jan

Re: [PATCH 08/14] kernel-doc: public/memory.h

2020-08-17 Thread Jan Beulich

On 07.08.2020 23:51, Stefano Stabellini wrote:
> On Fri, 7 Aug 2020, Jan Beulich wrote:
>> On 07.08.2020 01:49, Stefano Stabellini wrote:
>>> From: Stefano Stabellini 
>>>
>>> Convert in-code comments to kernel-doc format wherever possible.
>>>
>>> Signed-off-by: Stefano Stabellini 
>>> ---
>>>  xen/include/public/memory.h | 232 
>>>  1 file changed, 155 insertions(+), 77 deletions(-)
>>>
>>> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
>>> index 21057ed78e..4c57ed213c 100644
>>> --- a/xen/include/public/memory.h
>>> +++ b/xen/include/public/memory.h
>>> @@ -30,7 +30,9 @@
>>>  #include "xen.h"
>>>  #include "physdev.h"
>>>  
>>> -/*
>>> +/**
>>> + * DOC: XENMEM_increase_reservation and XENMEM_decrease_reservation
>>> + *
>>>   * Increase or decrease the specified domain's memory reservation. Returns the
>>>   * number of extents successfully allocated or freed.
>>>   * arg == addr of struct xen_memory_reservation.
>>> @@ -40,29 +42,37 @@
>>>  #define XENMEM_populate_physmap 6
>>>  
>>>  #if __XEN_INTERFACE_VERSION__ >= 0x00030209
>>> -/*
>>> - * Maximum # bits addressable by the user of the allocated region (e.g., I/O
>>> - * devices often have a 32-bit limitation even in 64-bit systems). If zero
>>> - * then the user has no addressing restriction. This field is not used by
>>> - * XENMEM_decrease_reservation.
>>> +/**
>>> + * DOC: XENMEMF_*
>>> + *
>>> + * - XENMEMF_address_bits, XENMEMF_get_address_bits:
>>> + *   Maximum # bits addressable by the user of the allocated region
>>> + *   (e.g., I/O devices often have a 32-bit limitation even in 64-bit
>>> + *   systems). If zero then the user has no addressing restriction. This
>>> + *   field is not used by XENMEM_decrease_reservation.
>>> + * - XENMEMF_node, XENMEMF_get_node: NUMA node to allocate from
>>> + * - XENMEMF_populate_on_demand: Flag to populate physmap with populate-on-demand entries
>>> + * - XENMEMF_exact_node_request, XENMEMF_exact_node: Flag to request allocation only from the node specified
>>
>> Nit: overly long line
> 
> I'll fix
> 
> 
>>> + * - XENMEMF_vnode: Flag to indicate the node specified is virtual node
>>>   */
>>>  #define XENMEMF_address_bits(x) (x)
>>>  #define XENMEMF_get_address_bits(x) ((x) & 0xffu)
>>> -/* NUMA node to allocate from. */
>>>  #define XENMEMF_node(x) (((x) + 1) << 8)
>>>  #define XENMEMF_get_node(x) ((((x) >> 8) - 1) & 0xffu)
>>> -/* Flag to populate physmap with populate-on-demand entries */
>>>  #define XENMEMF_populate_on_demand (1<<16)
>>> -/* Flag to request allocation only from the node specified */
>>>  #define XENMEMF_exact_node_request  (1<<17)
>>>  #define XENMEMF_exact_node(n) (XENMEMF_node(n) | XENMEMF_exact_node_request)
>>> -/* Flag to indicate the node specified is virtual node */
>>>  #define XENMEMF_vnode  (1<<18)
>>>  #endif
>>>  
>>> +/**
>>> + * struct xen_memory_reservation
>>> + */
>>>  struct xen_memory_reservation {
>>>  
>>> -/*
>>> +/**
>>> + * @extent_start:
>>> + *
>>
>> Take the opportunity and drop the stray blank line?
>  
> Sure
> 
> 
>>> @@ -200,90 +236,115 @@ DEFINE_XEN_GUEST_HANDLE(xen_machphys_mfn_list_t);
>>>   */
>>>  #define XENMEM_machphys_compat_mfn_list 25
>>>  
>>> -/*
>>> +#define XENMEM_machphys_mapping 12
>>> +/**
>>> + * struct xen_machphys_mapping - XENMEM_machphys_mapping
>>> + *
>>>   * Returns the location in virtual address space of the machine_to_phys
>>>   * mapping table. Architectures which do not have a m2p table, or which do not
>>>   * map it by default into guest address space, do not implement this command.
>>>   * arg == addr of xen_machphys_mapping_t.
>>>   */
>>> -#define XENMEM_machphys_mapping 12
>>>  struct xen_machphys_mapping {
>>> +/** @v_start: Start virtual address */
>>>  xen_ulong_t v_start, v_end; /* Start and end virtual addresses.   */
>>> -xen_ulong_t max_mfn;/* Maximum MFN that can be looked up. */
>>> +/** @v_end: End virtual addresses */
>>> +xen_ulong_t v_end;
>>> +/** @max_mfn: Maximum MFN that can be looked up */
>>> +xen_ulong_t max_mfn;
>>>  };
>>>  typedef struct xen_machphys_mapping xen_machphys_mapping_t;
>>>  DEFINE_XEN_GUEST_HANDLE(xen_machphys_mapping_t);
>>>  
>>> -/* Source mapping space. */
>>> +/**
>>> + * DOC: Source mapping space.
>>> + *
>>> + * - XENMAPSPACE_shared_info:  shared info page
>>> + * - XENMAPSPACE_grant_table:  grant table page
>>> + * - XENMAPSPACE_gmfn: GMFN
>>> + * - XENMAPSPACE_gmfn_range:   GMFN range, XENMEM_add_to_physmap only.
>>> + * - XENMAPSPACE_gmfn_foreign: GMFN from another dom,
>>> + * XENMEM_add_to_physmap_batch only.
>>> + * - XENMAPSPACE_dev_mmio: device mmio region ARM only; the region is 
>>> mapped
>>> + * in Stage-2 using the Normal 
>>> MemoryInner/Outer
>>> + * Write-Back Cacheable memo

Re: [PATCH 4/4] EFI: free unused boot mem in at least some cases

2020-08-17 Thread Jan Beulich

On 10.08.2020 19:09, Andrew Cooper wrote:
> On 06/08/2020 10:06, Jan Beulich wrote:
>> Address at least the primary reason why 52bba67f8b87 ("efi/boot: Don't
>> free ebmalloc area at all") was put in place: Make xen_in_range() aware
>> of the freed range. This is in particular relevant for EFI-enabled
>> builds not actually running on EFI, as the entire range will be unused
>> in this case.
>>
>> Signed-off-by: Jan Beulich 
>> ---
>> The remaining issue could be addressed too, by making the area 2M in
>> size and 2M-aligned.
> 
> This memory range is only used for relocating the (synthesized?)
> multiboot strings, is it not?

No. Afaict it has nothing to do with multiboot strings. There are
exactly two uses afaics - in place_string() and in
efi_arch_allocate_mmap_buffer(). The former is used to record
command line pieces, e.g. that parsed from the config file, while
the latter is what allocates the memory for the EFI memory map.

> I'm not actually convinced that this is a sensible tradeoff.
> 
> For one, you've broken setup.c's:
> 
>     /* This needs to remain in sync with xen_in_range(). */
>     reserve_e820_ram(&boot_e820, __pa(_stext), __pa(__2M_rwdata_end));
> 
> which covers the runtime aspect of what xen_in_range() covers during boot.

Hmm, I did specifically look at that and thought it wouldn't need
changing. But now that you point it out (again), it looks like I
was wrong.

> I think the better course of action is to go with David Woodhouse's work
> to not relocate the trampoline until later on boot (if even necessary),
> at which point both of the custom allocators can disappear.

Well, in the light of my response above I'd like to express that
I can't see how David's work would make this allocator go away.

Jan
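
Making a range check aware of a freed sub-range, which is what the patch description says it does for xen_in_range(), could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not Xen's actual code: struct range_sketch, range_contains() and image_in_range() are hypothetical names, and the half-open [start, end) convention is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open physical address range: [start, end). */
struct range_sketch {
    uint64_t start;
    uint64_t end;
};

/* True when addr lies inside the half-open range r. */
static bool range_contains(const struct range_sketch *r, uint64_t addr)
{
    return addr >= r->start && addr < r->end;
}

/* True when addr belongs to the image but not to the sub-range that has
 * been handed back to the allocator. The freed range must lie entirely
 * within the image. */
static bool image_in_range(const struct range_sketch *image,
                           const struct range_sketch *freed,
                           uint64_t addr)
{
    assert(freed->start >= image->start && freed->end <= image->end);

    if (!range_contains(image, addr))
        return false;
    return !range_contains(freed, addr);
}
```

Any other code that describes the same image range, such as the reserve_e820_ram() call quoted above, would have to account for the same hole to stay in sync with a check like this.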

Re: [RFC PATCH 2/2] libxl: prototype libxl_device_nic_add/remove with IDL

2020-08-17 Thread Nick Rosbrook

On Fri, Aug 14, 2020 at 11:57:47AM +0100, Anthony PERARD wrote:
> On Mon, Jul 27, 2020 at 09:26:33AM -0400, Nick Rosbrook wrote:
> > Add a DeviceFunction class and describe prototypes for
> > libxl_device_nic_add/remove in libxl_types.idl.
> > 
> > Signed-off-by: Nick Rosbrook 
> > --
> > This is mostly to serve as an example of how the first patch would be
> > used for function support in the IDL.
> > ---
> >  tools/libxl/idl.py  | 8 
> >  tools/libxl/libxl_types.idl | 6 ++
> >  2 files changed, 14 insertions(+)
> > 
> > diff --git a/tools/libxl/idl.py b/tools/libxl/idl.py
> > index 1839871f86..15085af8c7 100644
> > --- a/tools/libxl/idl.py
> > +++ b/tools/libxl/idl.py
> > @@ -386,6 +386,14 @@ class CtxFunction(Function):
> >  
> >          Function.__init__(self, name, params, return_type, return_is_status)
> >  
> > +class DeviceFunction(CtxFunction):
> > +""" A function that modifies a device. """
> 
> I guess that is meant to be used by all functions generated with the C macros
> LIBXL_DEFINE_DEVICE_ADD() and LIBXL_DEFINE_DEVICE_REMOVE(), isn't it?

Yes, I think this could be used in place of those macros.

> I wonder if we could get away with the type of device ("nic") and the
> type of the parameter (`libxl_device_nic`) and have DeviceFunction be
> a generator for both `add` and `remove` functions (and `destroy`).

We could do that, but I think for clarity it might be valuable to
explicitly define each of them. Actually, as I look at this patch again
I wonder if it would be better to define `Device{Add,Remove,Destroy}`
class definitions?

> Also there are functions like libxl_devid_to_device_nic(); aren't those
> of type DeviceFunction as well? But they don't take any `ao_how`.
> 
> There is also `libxl_device_nic_list{,_free}`, but it is to handle a
> list of libxl_device_*, so it could be kind of related to DeviceFunction, but
> not quite. But maybe I'm going too far :-).

I think this gives another good reason to define more specific `Device*`
classes, rather than a broad `DeviceFunction` class. What do you think?

Thanks,

-NR

Re: [PATCH] xen: Introduce cmpxchg64() and guest_cmpxchg64()

2020-08-17 Thread Roger Pau Monné

On Mon, Aug 17, 2020 at 12:05:54PM +0100, Julien Grall wrote:
> Hi,
> 
> On 17/08/2020 11:33, Roger Pau Monné wrote:
> > On Mon, Aug 17, 2020 at 10:42:54AM +0100, Julien Grall wrote:
> > > Hi,
> > > 
> > > On 17/08/2020 10:24, Roger Pau Monné wrote:
> > > > On Sat, Aug 15, 2020 at 06:21:43PM +0100, Julien Grall wrote:
> > > > > From: Julien Grall 
> > > > > 
> > > > > > The IOREQ code is using cmpxchg() with 64-bit value. At the moment, this
> > > > > > is x86 code, but there is plan to make it common.
> > > > > > 
> > > > > > To cater 32-bit arch, introduce two new helpers to deal with 64-bit
> > > > > > cmpxchg.
> > > > > > 
> > > > > > The Arm 32-bit implementation of cmpxchg64() is based on the __cmpxchg64
> > > > > > in Linux v5.8 (arch/arm/include/asm/cmpxchg.h).
> > > > > 
> > > > > Signed-off-by: Julien Grall 
> > > > > Cc: Oleksandr Tyshchenko 
> > > > > ---
> > > > > > diff --git a/xen/include/asm-x86/guest_atomics.h b/xen/include/asm-x86/guest_atomics.h
> > > > > > index 029417c8ffc1..f4de9d3631ff 100644
> > > > > > --- a/xen/include/asm-x86/guest_atomics.h
> > > > > > +++ b/xen/include/asm-x86/guest_atomics.h
> > > > > > @@ -20,6 +20,8 @@
> > > > > >      ((void)(d), test_and_change_bit(nr, p))
> > > > > >  #define guest_cmpxchg(d, ptr, o, n) ((void)(d), cmpxchg(ptr, o, n))
> > > > > > +#define guest_cmpxchg64(d, ptr, o, n) ((void)(d), cmpxchg64(ptr, o, n))
> > > > > > +
> > > > > >  #endif /* _X86_GUEST_ATOMICS_H */
> > > > > >  /*
> > > > > > diff --git a/xen/include/asm-x86/x86_64/system.h b/xen/include/asm-x86/x86_64/system.h
> > > > > > index f471859c19cc..c1b16105e9f2 100644
> > > > > > --- a/xen/include/asm-x86/x86_64/system.h
> > > > > > +++ b/xen/include/asm-x86/x86_64/system.h
> > > > > > @@ -5,6 +5,8 @@
> > > > > >      ((__typeof__(*(ptr)))__cmpxchg((ptr),(unsigned long)(o), \
> > > > > >                                     (unsigned long)(n),sizeof(*(ptr))))
> > > > > > +#define cmpxchg64(ptr, o, n) cmpxchg(ptr, o, n)
> > > > 
> > > > Why do you need to introduce an explicitly sized version of cmpxchg
> > > > for 64bit values?
> > > > 
> > > > There's no cmpxchg{8,16,32}, so I would expect cmpxchg64 to just be
> > > > handled by cmpxchg detecting the size of the parameter passed to the
> > > > function.
> > > > That works quite well for 64-bit arches. However, for 32-bit, you would need
> > > > to take some detour so 32-bit and 64-bit can cohabit (you cannot simply
> > > > replace unsigned long with uint64_t).
> > 
> > Oh, I see. Switching __cmpxchg on Arm 32 to use unsigned long long or
> > uint64_t would be bad, as you would then need two registers to pass
> > the value to the function, or push it on the stack?
> 
> We have only 4 registers (r0 - r3) available for the arguments. With a 64-bit
> value, each one will be using 2 registers, so some will end up being pushed
> on the stack.
> 
> This is assuming the compiler is not clever enough to see we are only using
> the bottom 32-bit with some cmpxchg.
> 
> > 
> > Maybe do something like:
> > 
> > #define cmpxchg(ptr,o,n) ({                                 \
> >     typeof(*(ptr)) tmp;                                     \
> >                                                             \
> >     switch ( sizeof(*(ptr)) )                               \
> >     {                                                       \
> >     case 8:                                                 \
> >         tmp = __cmpxchg_mb64((ptr), (uint64_t)(o),          \
> >                 (uint64_t)(n), sizeof(*(ptr)));             \
> >         break;                                              \
> >     default:                                                \
> >         tmp = __cmpxchg_mb((ptr), (unsigned long)(o),       \
> >                 (unsigned long)(n), sizeof(*(ptr)));        \
> >         break;                                              \
> >     }                                                       \
> >     tmp;                                                    \
> > })
> 
> 
> Unfortunately this can't compile if o and n are pointers because the
> compiler will complain about the cast to uint64_t.

Right, we would have to cast to unsigned long first and then to
uint64_t, which is not very nice.

> 
> We would also need a cast when assigning to tmp because tmp may not be a
> scalar type. This would lead to the same compiler issue.

Yes, we would have to do a bunch of casts.

> The only way I could see to make it work would be to use the same trick as
> we do for {read, write}_atomic() (see asm-arm/atomic.h). We are using union
> and void pointer to prevent explicit cast.

I'm mostly worried about common code having assumed that cmpxchg
does also handle 64bit sized parameters, and thus failing to use
cmpxchg64 when required. I assume this is not much of a deal as then
the Arm 3

Re: [PATCH] xen/x86: irq: Avoid a TOCTOU race in pirq_spin_lock_irq_desc()

2020-08-17 Thread Julien Grall





On 17/08/2020 16:03, Roger Pau Monné wrote:

On Mon, Aug 17, 2020 at 03:39:52PM +0100, Julien Grall wrote:



On 17/08/2020 15:01, Roger Pau Monné wrote:

On Mon, Aug 17, 2020 at 02:14:01PM +0100, Julien Grall wrote:

Hi,

On 17/08/2020 13:46, Roger Pau Monné wrote:

On Fri, Aug 14, 2020 at 08:25:28PM +0100, Julien Grall wrote:

Hi Andrew,

Sorry for the late answer.

On 23/07/2020 14:59, Andrew Cooper wrote:

On 23/07/2020 14:22, Julien Grall wrote:

Hi Jan,

On 23/07/2020 12:23, Jan Beulich wrote:

On 22.07.2020 18:53, Julien Grall wrote:

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1187,7 +1187,7 @@ struct irq_desc *pirq_spin_lock_irq_desc(
    for ( ; ; )
  {
-    int irq = pirq->arch.irq;
+    int irq = read_atomic(&pirq->arch.irq);


There we go - I'd be fine this way, but I'm pretty sure Andrew
would want this to be ACCESS_ONCE(). So I guess now is the time
to settle which one to prefer in new code (or which criteria
there are to prefer one over the other).


I would prefer if we have a single way to force the compiler to do a
single access (read/write).


Unlikely to happen, I'd expect.

But I would really like to get rid of (or at least rename)
read_atomic()/write_atomic() specifically because they've got nothing to
do with atomic_t's and the set of functionality whose namespace they share.


Would you be happy if I rename both to READ_ONCE() and WRITE_ONCE()? I would
also suggest to move them implementation in a new header asm/lib.h.


Maybe {READ/WRITE}_SINGLE (to note those should be implemented using a
single instruction)?


The asm volatile statement contains only one instruction, but this doesn't
mean the helper will generate a single instruction.


Well, the access should be done using a single instruction, which is
what we care about when using this helpers.


You may have other instructions to get the registers ready for the access.



ACCESS_ONCE (which also has the _ONCE suffix) IIRC could be
implemented using several instructions, and hence doesn't seem right
that they all have the _ONCE suffix.


The goal here is the same, we want to access the variable *only* once.


Right, but this is not guaranteed by the current implementation of
ACCESS_ONCE AFAICT, as the compiler *might* split the access into two
(or more) instructions, and hence won't be an atomic access anymore?

From my understanding, at least on GCC/Clang, ACCESS_ONCE() should be atomic
if you are using an aligned address and the size is smaller than a register size.


Yes, any sane compiler shouldn't split such access, but this is not
guaranteed by the current code in ACCESS_ONCE.

To be sure, your concern here is not about GCC/Clang but other
compilers. Am I correct?


We already have a collection of compiler specific macros in compiler.h. 
So how about we classify this macro as a compiler specific one? (See 
more below).







May I ask why we would want to expose the difference to the user?


I'm not saying we should, but naming them using the _ONCE suffix seems
misleading IMO, as they have different guarantees than what
ACCESS_ONCE currently provides.


Let's leave aside how ACCESS_ONCE() is implemented for a moment.

If ACCESS_ONCE() doesn't guarantee atomicity, then it means you may read a mix
of the old and new value. This would most likely break quite a few of the
users because the result wouldn't be coherent.

Do you have a place in mind where the non-atomicity would be useful?


Not that I'm aware, I think they could all be safely switched to use
the atomic variants

There is concern that read_atomic() and write_atomic() prevent the compiler
from doing certain optimizations. Andrew gave the example of:


ACCESS_ONCE(...) |= ...



In fact I wouldn't be surprised if users of ACCESS_ONCE break if the
access was split into multiple instructions.

My comment was to note that just renaming the atomic read/write
helpers to use the _ONCE suffix is IMO weird as they offer different
properties than ACCESS_ONCE, and hence might confuse users. Just
looking at READ_ONCE users could assume all _ONCE helpers would
guarantee atomicity, which is not the case.


Our implementation of ACCESS_ONCE() is very similar to what Linux used 
to have. There READ_ONCE()/WRITE_ONCE() are also using the same principles.


From my understanding, you can safely assume the access will be atomic 
if the following conditions are met:

- The address is correctly aligned
- The size is no larger than the machine word size

I would agree this may not be correct on all the existing compilers. But 
this macro could easily be re-implemented if we add support for a 
compiler with different guarantee.


Therefore, I fail to see why we can't use the same guarantee in Xen.

Cheers,

--
Julien Grall
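
The two conditions listed above can be illustrated with a minimal sketch. It assumes a GCC/Clang-style compiler; ACCESS_ONCE_SKETCH, read_int_once() and snapshot_is_positive() are hypothetical names made up for this illustration and are not Xen's actual macros. The volatile cast only asks the compiler to emit the access as written; whether that access is a single instruction still depends on alignment, size and the compiler, as discussed in the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for an ACCESS_ONCE()-style macro: the volatile
 * cast asks the compiler to perform the access exactly as written rather
 * than caching, repeating or eliding it. __typeof__ is a GCC/Clang
 * extension.
 */
#define ACCESS_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

/*
 * Read an int once. The asserts spell out the two conditions from the
 * message: a naturally aligned address and a size no larger than a
 * machine word (approximated here by the size of a pointer).
 */
static inline int read_int_once(int *p)
{
    assert(((uintptr_t)p % sizeof(*p)) == 0);
    assert(sizeof(*p) <= sizeof(void *));
    return ACCESS_ONCE_SKETCH(*p);
}

/* Take one snapshot, then test the snapshot instead of re-reading. */
static inline int snapshot_is_positive(int *shared_irq)
{
    int irq = read_int_once(shared_irq);

    return irq > 0;
}
```

Working on the local snapshot is what avoids a check-then-use race: every later use sees the same value, even if the shared variable changes in the meantime.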

Re: [PATCH 3/4] build: also check for empty .bss.* in .o -> .init.o conversion

2020-08-17 Thread Jan Beulich

On 10.08.2020 19:51, Andrew Cooper wrote:
> On 07/08/2020 16:40, Jan Beulich wrote:
>> On 07.08.2020 17:12, Andrew Cooper wrote:
>>> On 07/08/2020 11:56, Jan Beulich wrote:
>>>> On 06.08.2020 18:16, Andrew Cooper wrote:
>>>>> On 06/08/2020 10:05, Jan Beulich wrote:
>>>>> Can't we remove all of this by having CONFIG_XEN_PE expressed/selectable
>>>>> properly in Kconfig, and gathering all the objects normally, rather than
>>>>> bodging all of common/efi/ through arch/efi/ ?
>>>> _If_ we settle on Kconfig to be allowed to check compiler (and linker)
>>>> features, then yes. This continues to be a pending topic though, so
>>>> the switch can't be made like this at this point in time. (It could be
>>>> made a Kconfig item now - which, when enabled, implies the assertion
>>>> that a capable tool chain is in use.)
>>> I am still of the opinion that nothing needs discussing, but you are
>>> obviously not.
>>>
>>> Please raise this as a topic and let's discuss it, because it has a
>>> meaningful impact on a large number of pending series.
>> Preferably I would have put this on this month's community meeting
>> agenda, but I'll be ooo next week, so that's not going to help, I'm
>> afraid. I guess I should put it up in email form when I'm back,
>> albeit I wasn't thinking it should need to be me to start the
>> discussion. Instead my view was that such a discussion should (have
>> been, now after-the-fact) be started by whoever wants to introduce
>> a new feature.
> 
> It would have been better to raise a concern/objection before you
> committed the feature.

I did, and I committed the whole lot because of not wanting to block
the many improvements over this one aspect I disagree with. Recall
me asking what happens if the compiler (or any part of the tool chain)
gets upgraded (or, possibly worse, downgraded) between two
(incremental) builds?

> It was a very clear intent of upgrading Kconfig and switching to Kbuild,
> to clean up the total and chronic mess we call a build system.  It has
> been discussed multiple times in person, and on xen-devel, without
> apparent objection at the time.

The change to Kbuild was discussed. The use (and, depending on how one
views it, abuse) of Kconfig to determine tool chain capabilities wasn't,
iirc. At least not in a way that it would have been noticeable to me.

> The state of 4.14 and later is that we have the feature, and it is
> already in use, with a lot more use expected to continue fixing the
> build system.

If I'm not mistaken I did make my ack on the first use of the new
behavior (in your CET series) dependent upon a subsequent discussion
(that should have occurred up front), again in an attempt to get
certain things taken care of for 4.14. This was, again iirc, in turn
referring to the earlier ack on Anthony's series, which was given in
a similarly conditional manner. (But I may be mis-remembering.)
Therefore ...

> You are currently blocking work to fix aspects of the build system based
> on a dislike of this feature, *and* expecting someone else to justify
> why using this feature as intended is ok in the first place.

... I'm pretty puzzled: Am I now being told that I shouldn't have
made the compromises, and rather should have blocked things earlier
on? I.e. is my attempt to show reasonable behavior now being turned
back into an argument against me? If so, I can certainly draw the
obvious conclusions from that, for the future.

> I do not consider that a reasonable expectation of how to proceed.
> 
> If you wish to undo what was a deliberate intention of the
> Kconfig/Kbuild work, then it is you who must start the conversation on
> why we should revert the improvements.

If I hadn't voiced my reservations long before, this _may_ indeed be
a valid position to take. But given all that had been said already
before any of this went in, I don't think it is. Anyway - despite me
not thinking it should be me (and hence it not having happened so
far), I intend (as said) to start a discussion. To be honest, I'll
be curious to see how it'll go, both in terms of number of
responses received and in terms of everyone honoring the fact that
it should _not_ matter that the logic in question was already
committed.

Jan

Re: [PATCH 3/3] x86: don't override INVALID_M2P_ENTRY with SHARED_M2P_ENTRY

2020-08-17 Thread Jan Beulich

On 10.08.2020 18:42, Andrew Cooper wrote:
> On 06/08/2020 10:29, Jan Beulich wrote:
>> While in most cases code ahead of the invocation of set_gpfn_from_mfn()
>> deals with shared pages, at least in set_typed_p2m_entry() I can't spot
>> such handling (it's entirely possible there's code missing there). Let's
>> try to play safe and add an extra check.
>>
>> Signed-off-by: Jan Beulich 
>>
>> --- a/xen/include/asm-x86/mm.h
>> +++ b/xen/include/asm-x86/mm.h
>> @@ -525,9 +525,14 @@ extern const unsigned int *const compat_
>>  #endif /* CONFIG_PV32 */
>>  
>>  #define _set_gpfn_from_mfn(mfn, pfn) ({\
>> -struct domain *d = page_get_owner(mfn_to_page(_mfn(mfn))); \
>> -unsigned long entry = (d && (d == dom_cow)) ?  \
>> -SHARED_M2P_ENTRY : (pfn);  \
>> +unsigned long entry = (pfn);   \
>> +if ( entry != INVALID_M2P_ENTRY )  \
>> +{  \
>> +const struct domain *d;\
>> +d = page_get_owner(mfn_to_page(_mfn(mfn)));\
>> +if ( d && (d == dom_cow) ) \
>> +entry = SHARED_M2P_ENTRY;  \
>> +}  \
>>  set_compat_m2p(mfn, (unsigned int)(entry));\
>>  machine_to_phys_mapping[mfn] = (entry);\
>>  })
>>
> 
> Hmm - we already have a lot of callers, and this is already too
> complicated to be a define.

I did consider moving this into an out-of-line function, yes.

> We have x86 which uses M2P, and ARM which doesn't.  We have two more
> architectures on the way which probably won't want M2P, and certainly
> won't in the beginning.
> 
> Can we introduce CONFIG_M2P which is selected by x86, rename this
> infrastructure to set_m2p() or something, provide a no-op fallback in
> common code, and move this implementation into x86/mm.c ?

We can, sure. Question is whether this isn't more scope creep than
is acceptable considering the purpose of this change.

> In particular, silently clobbering pfn to SHARED_M2P_ENTRY is rude
> behaviour.  It would be better to ASSERT() the right one is passed in,
> which also simplifies release builds.

Now this is, irrespective of me agreeing with the point you make,
a change I'm not going to make: There's no way I could guarantee
I wouldn't break mem-sharing. A change like this can imo only
possibly be done by someone actively working on and with
mem-sharing.

Jan
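
The entry selection performed by the patched macro can be sketched as a plain function, which is roughly what moving it out of line would look like. This is only an illustration under stated assumptions: the sentinel values below and the name m2p_entry_for() are hypothetical, and the ownership lookup (page_get_owner() compared against dom_cow) is reduced to a boolean parameter so the sketch stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sentinel values; the real constants live in Xen's headers. */
#define INVALID_M2P_ENTRY_SKETCH (~0UL)
#define SHARED_M2P_ENTRY_SKETCH  (~0UL - 1UL)

/*
 * Pick the value to store in the M2P table for a page.
 * owned_by_dom_cow stands in for the page_get_owner() == dom_cow check.
 */
static inline unsigned long m2p_entry_for(unsigned long pfn,
                                          bool owned_by_dom_cow)
{
    unsigned long entry = pfn;

    /* An explicit invalidation is never overridden by the shared marker. */
    if ( entry != INVALID_M2P_ENTRY_SKETCH && owned_by_dom_cow )
        entry = SHARED_M2P_ENTRY_SKETCH;

    return entry;
}
```

The ASSERT()-based alternative suggested in the review would instead require callers to pass the right value themselves; the sketch keeps the override because that is what the patch under discussion does.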

Re: [RFC] efi/boot: Unified Xen executable for UEFI Secure Boot support

2020-08-17 Thread Jan Beulich

On 11.08.2020 16:47, Trammell Hudson wrote:
> On Friday, August 7, 2020 2:23 PM, Jan Beulich  wrote:
>> On 06.08.2020 16:15, Trammell Hudson wrote:
>>> --- /dev/null
>>> +++ b/xen/scripts/unify-xen
>>> @@ -0,0 +1,89 @@
>>> +#!/bin/bash
>>> +# Build a "unified Xen" image.
>>> +# Usage
>>> +# unify xen.efi xen.cfg bzimage initrd [xsm [ucode]]
>>> [...]
>>
>> With all these hard coded size restrictions I take it this still is
>> just an example, not something that is to eventually get committed.
> 
> I'm wondering if for the initial merge if it is better to include just
> the objcopy command line to show how to do it in the documentation, similar
> to how systemd-boot documents it, rather than providing a tool.  At a later
> time a more correct unify script could be merged.

Sounds like a reasonable approach.

> Updated patch follows:

Going forward, may I ask that you please send new versions of the patch(es)
instead of inlining them in your replies?

Jan

Re: [PATCH] xen/x86: irq: Avoid a TOCTOU race in pirq_spin_lock_irq_desc()

2020-08-17 Thread Roger Pau Monné

On Fri, Aug 14, 2020 at 08:25:28PM +0100, Julien Grall wrote:
> Hi Andrew,
> 
> Sorry for the late answer.
> 
> On 23/07/2020 14:59, Andrew Cooper wrote:
> > On 23/07/2020 14:22, Julien Grall wrote:
> > > Hi Jan,
> > > 
> > > On 23/07/2020 12:23, Jan Beulich wrote:
> > > > On 22.07.2020 18:53, Julien Grall wrote:
> > > > > --- a/xen/arch/x86/irq.c
> > > > > +++ b/xen/arch/x86/irq.c
> > > > > @@ -1187,7 +1187,7 @@ struct irq_desc *pirq_spin_lock_irq_desc(
> > > > >      for ( ; ; )
> > > > >    {
> > > > > -    int irq = pirq->arch.irq;
> > > > > +    int irq = read_atomic(&pirq->arch.irq);
> > > > 
> > > > There we go - I'd be fine this way, but I'm pretty sure Andrew
> > > > would want this to be ACCESS_ONCE(). So I guess now is the time
> > > > to settle which one to prefer in new code (or which criteria
> > > > there are to prefer one over the other).
> > > 
> > > I would prefer if we have a single way to force the compiler to do a
> > > single access (read/write).
> > 
> > Unlikely to happen, I'd expect.
> > 
> > But I would really like to get rid of (or at least rename)
> > read_atomic()/write_atomic() specifically because they've got nothing to
> > do with atomic_t's and the set of functionality who's namespace they share.
> 
> Would you be happy if I rename both to READ_ONCE() and WRITE_ONCE()? I would
> also suggest to move them implementation in a new header asm/lib.h.

Maybe {READ/WRITE}_SINGLE (to note those should be implemented using a
single instruction)?

ACCESS_ONCE (which also has the _ONCE suffix) IIRC could be
implemented using several instructions, and hence doesn't seem right
that they all have the _ONCE suffix.

Roger.

Re: [PATCH v1 5/6] tools/ocaml/xenstored: use more efficient node trees

2020-08-17 Thread Christian Lindig

+let compare a b =
+  if equal a b then 0
+  else -(String.compare a b)

I think this bit could use an inline comment why the sort order is reversed. 
This could be also simplified to -(String.compare a b) because this goes to the 
internal (polymorphic) compare implemented in C which does a physical 
equivalence check first.

-- C


From: Edwin Torok
Sent: 14 August 2020 23:14
To: xen-devel@lists.xenproject.org
Cc: Edwin Torok; Christian Lindig; David Scott; Ian Jackson; Wei Liu
Subject: [PATCH v1 5/6] tools/ocaml/xenstored: use more efficient node trees

This changes the output of xenstore-ls to be sorted.
Previously the keys were listed in the order in which they were inserted
in.
docs/misc/xenstore.txt doesn't specify in what order keys are listed.

Map.update is used to retain semantics with replace_child:
only an existing child is replaced, if it wasn't part of the original
map we don't add it.
Similarly exception behaviour is retained for del_childname and related
functions.

Entries are stored in reverse sort order, so that upon Map.fold the
constructed list is sorted in ascending order and there is no need for a
List.rev.

Signed-off-by: Edwin Török 
---
 tools/ocaml/xenstored/store.ml   | 46 +++-
 tools/ocaml/xenstored/symbol.ml  |  4 +++
 tools/ocaml/xenstored/symbol.mli |  3 +++
 3 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/tools/ocaml/xenstored/store.ml b/tools/ocaml/xenstored/store.ml
index 45659a23ee..d9dfa36045 100644
--- a/tools/ocaml/xenstored/store.ml
+++ b/tools/ocaml/xenstored/store.ml
@@ -16,17 +16,19 @@
  *)
 open Stdext

+module SymbolMap = Map.Make(Symbol)
+
 module Node = struct

 type t = {
name: Symbol.t;
perms: Perms.Node.t;
value: string;
-   children: t list;
+   children: t SymbolMap.t;
 }

 let create _name _perms _value =
-   { name = Symbol.of_string _name; perms = _perms; value = _value; children = []; }
+   { name = Symbol.of_string _name; perms = _perms; value = _value; children = SymbolMap.empty; }

 let get_owner node = Perms.Node.get_owner node.perms
 let get_children node = node.children
@@ -42,38 +44,34 @@ let set_value node nvalue =
 let set_perms node nperms = { node with perms = nperms }

 let add_child node child =
-   { node with children = child :: node.children }
+   let children = SymbolMap.add child.name child node.children in
+   { node with children }

 let exists node childname =
let childname = Symbol.of_string childname in
-   List.exists (fun n -> Symbol.equal n.name childname) node.children
+   SymbolMap.mem childname node.children

 let find node childname =
let childname = Symbol.of_string childname in
-   List.find (fun n -> Symbol.equal n.name childname) node.children
+   SymbolMap.find childname node.children

 let replace_child node child nchild =
-   (* this is the on-steroid version of the filter one-replace one *)
-   let rec replace_one_in_list l =
-   match l with
-   | []   -> []
-   | h :: tl when Symbol.equal h.name child.name -> nchild :: tl
-   | h :: tl  -> h :: replace_one_in_list tl
-   in
-   { node with children = (replace_one_in_list node.children) }
+   { node with
+ children = SymbolMap.update child.name
+(function None -> None | Some _ -> Some nchild)
+node.children
+   }

 let del_childname node childname =
let sym = Symbol.of_string childname in
-   let rec delete_one_in_list l =
-   match l with
-   | []-> raise Not_found
-   | h :: tl when Symbol.equal h.name sym -> tl
-   | h :: tl   -> h :: delete_one_in_list tl
-   in
-   { node with children = (delete_one_in_list node.children) }
+   { node with children =
+   SymbolMap.update sym
+ (function None -> raise Not_found | Some _ -> None)
+ node.children
+   }

 let del_all_children node =
-   { node with children = [] }
+   { node with children = SymbolMap.empty }

 (* check if the current node can be accessed by the current connection with rperm permissions *)
 let check_perm node connection request =
@@ -87,7 +85,7 @@ let check_owner node connection =
raise Define.Permission_denied;
end

-let rec recurse fct node = fct node; List.iter (recurse fct) node.children
+let rec recurse fct node = fct node; SymbolMap.iter (fun _ -> recurse fct) node.children

 let unpack node = (Symbol.to_string node.name, node.perms, node.value)

@@ -321,7 +319,7 @@ let ls store perm path =
Node.check_perm cnode perm Perms.READ;
cnode.Node.children in
Path.ap

Re: [PATCH v1 0/6] tools/ocaml/xenstored: simplify code

2020-08-17 Thread Christian Lindig

From: Edwin Torok
Sent: 14 August 2020 23:11
To: xen-devel@lists.xenproject.org
Cc: Edwin Torok; Christian Lindig; David Scott; Ian Jackson; Wei Liu
Subject: [PATCH v1 0/6] tools/ocaml/xenstored: simplify code

Fix warnings, and delete some obsolete code.
oxenstored contained a hand-rolled GC to perform hash-consing:
this can be done with a lot fewer lines of code by using the built-in Weak
module.

The choice of data structures for trees/tries is not very efficient: they are
just lists. Using a map improves lookup and deletion complexity, and replaces
hand-rolled recursion with higher-level library calls.

There is a lot more that could be done to optimize socket polling:
an epoll backend with a poll fallback, but an API structured around event-based
polling would be better. But first let's drop the legacy select-based code: I
think every modern *nix should have a working poll(3) by now.

This is a draft series, in need of more testing.

Edwin Török (6):
  tools/ocaml/libs/xc: Fix ambiguous documentation comment
  tools/ocaml/xenstored: fix deprecation warning
  tools/ocaml/xenstored: replace hand rolled GC with weak GC references
  tools/ocaml/xenstored: drop select based
  tools/ocaml/xenstored: use more efficient node trees
  tools/ocaml/xenstored: use more efficient tries

 tools/ocaml/libs/xc/xenctrl.mli   |  2 +
 tools/ocaml/xenstored/connection.ml   |  3 -
 tools/ocaml/xenstored/connections.ml  |  2 +-
 tools/ocaml/xenstored/disk.ml |  2 +-
 tools/ocaml/xenstored/history.ml  | 14 
 tools/ocaml/xenstored/parse_arg.ml|  7 +-
 tools/ocaml/xenstored/{select.ml => poll.ml}  | 14 +---
 .../ocaml/xenstored/{select.mli => poll.mli}  | 12 +---
 tools/ocaml/xenstored/store.ml| 49 ++---
 tools/ocaml/xenstored/symbol.ml   | 70 +--
 tools/ocaml/xenstored/symbol.mli  | 22 ++
 tools/ocaml/xenstored/trie.ml | 61 +++-
 tools/ocaml/xenstored/trie.mli| 26 +++
 tools/ocaml/xenstored/xenstored.ml| 20 +-
 14 files changed, 98 insertions(+), 206 deletions(-)
 rename tools/ocaml/xenstored/{select.ml => poll.ml} (85%)
 rename tools/ocaml/xenstored/{select.mli => poll.mli} (58%)

--
2.25.1

This all looks good - I left a small comment on one of the patches and I agree 
that this needs testing. I also wonder about compatibility with earlier OCaml 
releases that we support but I see no real obstacles.

-- 
Acked-by: Christian Lindig

CI loop working group

2020-08-17 Thread George Dunlap

As a brief summary, here is what we discussed at the XenSummit design session
on CI:

# What is needed to run CI on patches posted to the list:

1. Get a patch series. Determine if it's for Xen or not.
2. Determine the base branch (staging, staging-4.13, staging-4.12, etc)
   ("for-4.14" after a branch) (actually, probably everything should go into
   staging first)
3. Apply that patch series to a branch.
4. Push to a git repo (while developing push it somewhere other than the main
   repo)
5. Let CI run
6. Add step at the end of the CI run to comment on the ML list (ideally reply
   to series on list)
7. Have an opt-out flag.

patchew already pushes to github.com, so we just need to get it to push to 
gitlab.  So the plan is:

# Plan

- Fix current CI loop build failure
- Get an account for patchew on gitlab
- Reconfigure patchew.org to push there instead
- Reconfigure patchew.org to reply to mailing list w/ result

We’d also discussed dropping the “test every commit” script.

Andy said he would volunteer to chase this.

At the most recent community call, Andy recommended we form a working group 
with regular meetings to make sure things keep moving forward.  Shall we say 
biweekly?  Any preferences for meeting time / venue?

I also propose we enable issue tracking on gitlab.com/xen-project, at least for 
project members, to collect and track this sort of thing.  Any objections?

 -George

Re: [PATCH] xen/x86: irq: Avoid a TOCTOU race in pirq_spin_lock_irq_desc()

2020-08-17 Thread Julien Grall





On 17/08/2020 18:33, Roger Pau Monné wrote:

On Mon, Aug 17, 2020 at 04:53:51PM +0100, Julien Grall wrote:



On 17/08/2020 16:03, Roger Pau Monné wrote:

On Mon, Aug 17, 2020 at 03:39:52PM +0100, Julien Grall wrote:



On 17/08/2020 15:01, Roger Pau Monné wrote:

On Mon, Aug 17, 2020 at 02:14:01PM +0100, Julien Grall wrote:

Hi,

On 17/08/2020 13:46, Roger Pau Monné wrote:

On Fri, Aug 14, 2020 at 08:25:28PM +0100, Julien Grall wrote:

Hi Andrew,

Sorry for the late answer.

On 23/07/2020 14:59, Andrew Cooper wrote:

On 23/07/2020 14:22, Julien Grall wrote:

Hi Jan,

On 23/07/2020 12:23, Jan Beulich wrote:

On 22.07.2020 18:53, Julien Grall wrote:

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1187,7 +1187,7 @@ struct irq_desc *pirq_spin_lock_irq_desc(
     for ( ; ; )
   {
-    int irq = pirq->arch.irq;
+    int irq = read_atomic(&pirq->arch.irq);


There we go - I'd be fine this way, but I'm pretty sure Andrew
would want this to be ACCESS_ONCE(). So I guess now is the time
to settle which one to prefer in new code (or which criteria
there are to prefer one over the other).


I would prefer if we have a single way to force the compiler to do a
single access (read/write).


Unlikely to happen, I'd expect.

But I would really like to get rid of (or at least rename)
read_atomic()/write_atomic() specifically because they've got nothing to
do with atomic_t's and the set of functionality who's namespace they share.


Would you be happy if I rename both to READ_ONCE() and WRITE_ONCE()? I would
also suggest to move them implementation in a new header asm/lib.h.


Maybe {READ/WRITE}_SINGLE (to note those should be implemented using a
single instruction)?


The asm volatile statement contains only one instruction, but this doesn't
mean the helper will generate a single instruction.


Well, the access should be done using a single instruction, which is
what we care about when using this helpers.


You may have other instructions to get the registers ready for the access.



ACCESS_ONCE (which also has the _ONCE suffix) IIRC could be
implemented using several instructions, and hence doesn't seem right
that they all have the _ONCE suffix.


The goal here is the same, we want to access the variable *only* once.


Right, but this is not guaranteed by the current implementation of
ACCESS_ONCE AFAICT, as the compiler *might* split the access into two
(or more) instructions, and hence won't be an atomic access anymore?

From my understanding, at least on GCC/Clang, ACCESS_ONCE() should be atomic
if you are using an aligned address and the size is smaller than a register size.


Yes, any sane compiler shouldn't split such access, but this is not
guaranteed by the current code in ACCESS_ONCE.

To be sure, your concern here is not about GCC/Clang but other compilers. Am
I correct?


Or about the existing ones switching behavior, which is again quite
unlikely I would like to assume.


The main goal of the macro is to mark places which require the variable
to be accessed once. So, in the unlikely event this happens, it would
be easy to modify the implementation.





We already have a collection of compiler specific macros in compiler.h. So
how about we classify this macro as a compiler specific one? (See more
below).






May I ask why we would want to expose the difference to the user?


I'm not saying we should, but naming them using the _ONCE suffix seems
misleading IMO, as they have different guarantees than what
ACCESS_ONCE currently provides.


Let's leave aside how ACCESS_ONCE() is implemented for a moment.

If ACCESS_ONCE() doesn't guarantee atomicity, then it means you may read a mix
of the old and new value. This would most likely break quite a few of the
users because the result wouldn't be coherent.

Do you have a place in mind where the non-atomicity would be useful?


Not that I'm aware, I think they could all be safely switched to use
the atomic variants

There is concern that read_atomic() and write_atomic() prevent the compiler
from doing certain optimizations. Andrew gave the example of:

ACCESS_ONCE(...) |= ...


I'm not sure how that will behave when used with a compile-time known
value that's smaller than the size of the destination. Could the
compiler optimize this as a partial read/write if only the lower byte
is modified for example?


Here is what Andrew wrote in a previous answer:

"Which a sufficiently clever compiler could convert to a single `or 
$val, ptr` instruction on x86, while read_atomic()/write_atomic() would 
force it to be `mov ptr, %reg; or $val, %reg; mov %reg, ptr`."


On Arm, a RMW (read-modify-write) operation will still not be atomic as it
would require 3 instructions.
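
The two forms being compared can be written side by side. This is a rough sketch assuming a GCC/Clang-style compiler; ACCESS_ONCE_SKETCH, read_once_sketch() and write_once_sketch() are hypothetical stand-ins, implemented here with plain volatile accesses rather than Xen's asm-based helpers. Neither form is an atomic read-modify-write; the difference is only how much freedom the compiler has when choosing instructions.

```c
#include <assert.h>

/* Hypothetical ACCESS_ONCE()-style macro (volatile cast, GCC/Clang). */
#define ACCESS_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical single-access helpers standing in for read/write_atomic(). */
static inline unsigned int read_once_sketch(const unsigned int *p)
{
    return *(const volatile unsigned int *)p;
}

static inline void write_once_sketch(unsigned int *p, unsigned int v)
{
    *(volatile unsigned int *)p = v;
}

/*
 * Form 1: compound assignment through a volatile lvalue. A compiler may
 * be able to turn this into one memory-operand instruction on x86.
 */
static inline void set_bits_compound(unsigned int *flags, unsigned int mask)
{
    ACCESS_ONCE_SKETCH(*flags) |= mask;
}

/* Form 2: explicit load, modify, store through the single-access helpers. */
static inline void set_bits_split(unsigned int *flags, unsigned int mask)
{
    unsigned int tmp = read_once_sketch(flags);

    tmp |= mask;
    write_once_sketch(flags, tmp);
}
```

Both functions give the same result when nothing else touches the variable; a concurrent writer can still be lost in either case, so code that needs an atomic update would have to use a locked or exclusive operation instead.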








In fact I wouldn't be surprised if users of ACCESS_ONCE break if the
access was split into multiple instructions.

My comment was to note that just renaming the atomic read/write
helpers to use the _ONCE suffix is IMO weird as they offer different
properties than ACCESS_ONCE, and hence might confuse users. Just

Re: [PATCH] xen/x86: irq: Avoid a TOCTOU race in pirq_spin_lock_irq_desc()

2020-08-17 Thread Roger Pau Monné

On Mon, Aug 17, 2020 at 02:14:01PM +0100, Julien Grall wrote:
> Hi,
> 
> On 17/08/2020 13:46, Roger Pau Monné wrote:
> > On Fri, Aug 14, 2020 at 08:25:28PM +0100, Julien Grall wrote:
> > > Hi Andrew,
> > > 
> > > Sorry for the late answer.
> > > 
> > > On 23/07/2020 14:59, Andrew Cooper wrote:
> > > > On 23/07/2020 14:22, Julien Grall wrote:
> > > > > Hi Jan,
> > > > > 
> > > > > On 23/07/2020 12:23, Jan Beulich wrote:
> > > > > > On 22.07.2020 18:53, Julien Grall wrote:
> > > > > > > --- a/xen/arch/x86/irq.c
> > > > > > > +++ b/xen/arch/x86/irq.c
> > > > > > > @@ -1187,7 +1187,7 @@ struct irq_desc *pirq_spin_lock_irq_desc(
> > > > > > >       for ( ; ; )
> > > > > > >     {
> > > > > > > -    int irq = pirq->arch.irq;
> > > > > > > +    int irq = read_atomic(&pirq->arch.irq);
> > > > > > 
> > > > > > There we go - I'd be fine this way, but I'm pretty sure Andrew
> > > > > > would want this to be ACCESS_ONCE(). So I guess now is the time
> > > > > > to settle which one to prefer in new code (or which criteria
> > > > > > there are to prefer one over the other).
> > > > > 
> > > > > I would prefer if we have a single way to force the compiler to do a
> > > > > single access (read/write).
> > > > 
> > > > Unlikely to happen, I'd expect.
> > > > 
> > > > But I would really like to get rid of (or at least rename)
> > > > read_atomic()/write_atomic() specifically because they've got nothing to
> > > > do with atomic_t's and the set of functionality who's namespace they 
> > > > share.
> > > 
> > > Would you be happy if I rename both to READ_ONCE() and WRITE_ONCE()? I 
> > > would
> > > also suggest to move them implementation in a new header asm/lib.h.
> > 
> > Maybe {READ/WRITE}_SINGLE (to note those should be implemented using a
> > single instruction)?
> 
> The asm volatile statement contains only one instruction, but this doesn't
> mean the helper will generate a single instruction.

Well, the access should be done using a single instruction, which is
what we care about when using this helpers.

> You may have other instructions to get the registers ready for the access.
> 
> > 
> > ACCESS_ONCE (which also has the _ONCE suffix) IIRC could be
> > implemented using several instructions, and hence doesn't seem right
> > that they all have the _ONCE suffix.
> 
> The goal here is the same, we want to access the variable *only* once.

Right, but this is not guaranteed by the current implementation of
ACCESS_ONCE AFAICT, as the compiler *might* split the access into two
(or more) instructions, and hence won't be an atomic access anymore?

> May I ask why we would want to expose the difference to the user?

I'm not saying we should, but naming them using the _ONCE suffix seems
misleading IMO, as they have different guarantees than what
ACCESS_ONCE currently provides.

Thanks, Roger.

Re: [PATCH] xen: Introduce cmpxchg64() and guest_cmpxchg64()

2020-08-17 Thread Roger Pau Monné

On Mon, Aug 17, 2020 at 02:03:23PM +0100, Julien Grall wrote:
> 
> 
> On 17/08/2020 12:50, Roger Pau Monné wrote:
> > On Mon, Aug 17, 2020 at 12:05:54PM +0100, Julien Grall wrote:
> > > Hi,
> > > 
> > > On 17/08/2020 11:33, Roger Pau Monné wrote:
> > > > On Mon, Aug 17, 2020 at 10:42:54AM +0100, Julien Grall wrote:
> > > > > Hi,
> > > > > 
> > > > > On 17/08/2020 10:24, Roger Pau Monné wrote:
> > > > > > On Sat, Aug 15, 2020 at 06:21:43PM +0100, Julien Grall wrote:
> > > > > > > From: Julien Grall 
> > > > > > > 
> > > > > > > > The IOREQ code is using cmpxchg() with 64-bit value. At the moment,
> > > > > > > > this is x86 code, but there is plan to make it common.
> > > > > > > > 
> > > > > > > > To cater 32-bit arch, introduce two new helpers to deal with 64-bit
> > > > > > > > cmpxchg.
> > > > > > > > 
> > > > > > > > The Arm 32-bit implementation of cmpxchg64() is based on the
> > > > > > > > __cmpxchg64 in Linux v5.8 (arch/arm/include/asm/cmpxchg.h).
> > > > > > > 
> > > > > > > Signed-off-by: Julien Grall 
> > > > > > > Cc: Oleksandr Tyshchenko 
> > > > > > > ---
> > > > > > > diff --git a/xen/include/asm-x86/guest_atomics.h 
> > > > > > > b/xen/include/asm-x86/guest_atomics.h
> > > > > > > index 029417c8ffc1..f4de9d3631ff 100644
> > > > > > > --- a/xen/include/asm-x86/guest_atomics.h
> > > > > > > +++ b/xen/include/asm-x86/guest_atomics.h
> > > > > > > @@ -20,6 +20,8 @@
> > > > > > > ((void)(d), test_and_change_bit(nr, p))
> > > > > > > > #define guest_cmpxchg(d, ptr, o, n) ((void)(d), cmpxchg(ptr, o, n))
> > > > > > > > +#define guest_cmpxchg64(d, ptr, o, n) ((void)(d), cmpxchg64(ptr, o, n))
> > > > > > > +
> > > > > > > #endif /* _X86_GUEST_ATOMICS_H */
> > > > > > > /*
> > > > > > > diff --git a/xen/include/asm-x86/x86_64/system.h 
> > > > > > > b/xen/include/asm-x86/x86_64/system.h
> > > > > > > index f471859c19cc..c1b16105e9f2 100644
> > > > > > > --- a/xen/include/asm-x86/x86_64/system.h
> > > > > > > +++ b/xen/include/asm-x86/x86_64/system.h
> > > > > > > @@ -5,6 +5,8 @@
> > > > > > > > ((__typeof__(*(ptr)))__cmpxchg((ptr),(unsigned long)(o),            \
> > > > > > > >                                (unsigned long)(n),sizeof(*(ptr))))
> > > > > > > +#define cmpxchg64(ptr, o, n) cmpxchg(ptr, o, n)
> > > > > > 
> > > > > > Why do you need to introduce an explicitly sized version of cmpxchg
> > > > > > for 64bit values?
> > > > > > 
> > > > > > There's no cmpxchg{8,16,32}, so I would expect cmpxchg64 to just be
> > > > > > handled by cmpxchg detecting the size of the parameter passed to the
> > > > > > function.
> > > > > That works quite well for 64-bit arches. However, for 32-bit, you 
> > > > > would need
> > > > > to take some detour so 32-bit and 64-bit can cohabit (you cannot 
> > > > > simply
> > > > > replace unsigned long with uint64_t).
> > > > 
> > > > Oh, I see. Switching __cmpxchg on Arm 32 to use unsigned long long or
> > > > uint64_t would be bad, as you would then need two registers to pass
> > > > the value to the function, or push it on the stack?
> > > 
> > > > We have only 4 registers (r0 - r3) available for the arguments. With a
> > > > 64-bit value, we will be using 2 registers, so some arguments will end
> > > > up being pushed on the stack.
> > > 
> > > This is assuming the compiler is not clever enough to see we are only 
> > > using
> > > the bottom 32-bit with some cmpxchg.
> > > 
> > > > 
> > > > Maybe do something like:
> > > > 
> > > > > #define cmpxchg(ptr,o,n) ({                                 \
> > > > >     typeof(*(ptr)) tmp;                                     \
> > > > >                                                             \
> > > > >     switch ( sizeof(*(ptr)) )                               \
> > > > >     {                                                       \
> > > > >     case 8:                                                 \
> > > > >         tmp = __cmpxchg_mb64((ptr), (uint64_t)(o),          \
> > > > >                 (uint64_t)(n), sizeof(*(ptr)));             \
> > > > >         break;                                              \
> > > > >     default:                                                \
> > > > >         tmp = __cmpxchg_mb((ptr), (unsigned long)(o),       \
> > > > >                 (unsigned long)(n), sizeof(*(ptr)));        \
> > > > >         break;                                              \
> > > > >     }                                                       \
> > > > >     tmp;                                                    \
> > > > > })
> > > 
> > > 
> > > Unfortunately this can't compile if o and n are pointers because th

Re: [PATCH] xen/x86: irq: Avoid a TOCTOU race in pirq_spin_lock_irq_desc()

2020-08-17 Thread Roger Pau Monné

On Mon, Aug 17, 2020 at 03:39:52PM +0100, Julien Grall wrote:
> 
> 
> On 17/08/2020 15:01, Roger Pau Monné wrote:
> > On Mon, Aug 17, 2020 at 02:14:01PM +0100, Julien Grall wrote:
> > > Hi,
> > > 
> > > On 17/08/2020 13:46, Roger Pau Monné wrote:
> > > > On Fri, Aug 14, 2020 at 08:25:28PM +0100, Julien Grall wrote:
> > > > > Hi Andrew,
> > > > > 
> > > > > Sorry for the late answer.
> > > > > 
> > > > > On 23/07/2020 14:59, Andrew Cooper wrote:
> > > > > > On 23/07/2020 14:22, Julien Grall wrote:
> > > > > > > Hi Jan,
> > > > > > > 
> > > > > > > On 23/07/2020 12:23, Jan Beulich wrote:
> > > > > > > > On 22.07.2020 18:53, Julien Grall wrote:
> > > > > > > > > --- a/xen/arch/x86/irq.c
> > > > > > > > > +++ b/xen/arch/x86/irq.c
> > > > > > > > > @@ -1187,7 +1187,7 @@ struct irq_desc 
> > > > > > > > > *pirq_spin_lock_irq_desc(
> > > > > > > > >    for ( ; ; )
> > > > > > > > >  {
> > > > > > > > > -    int irq = pirq->arch.irq;
> > > > > > > > > +    int irq = read_atomic(&pirq->arch.irq);
> > > > > > > > 
> > > > > > > > There we go - I'd be fine this way, but I'm pretty sure Andrew
> > > > > > > > would want this to be ACCESS_ONCE(). So I guess now is the time
> > > > > > > > to settle which one to prefer in new code (or which criteria
> > > > > > > > there are to prefer one over the other).
> > > > > > > 
> > > > > > > I would prefer if we have a single way to force the compiler to 
> > > > > > > do a
> > > > > > > single access (read/write).
> > > > > > 
> > > > > > Unlikely to happen, I'd expect.
> > > > > > 
> > > > > > But I would really like to get rid of (or at least rename)
> > > > > > read_atomic()/write_atomic() specifically because they've got 
> > > > > > nothing to
> > > > > > do with atomic_t's and the set of functionality who's namespace 
> > > > > > they share.
> > > > > 
> > > > > Would you be happy if I rename both to READ_ONCE() and WRITE_ONCE()? 
> > > > > I would
> > > > > also suggest to move them implementation in a new header asm/lib.h.
> > > > 
> > > > Maybe {READ/WRITE}_SINGLE (to note those should be implemented using a
> > > > single instruction)?
> > > 
> > > The asm volatile statement contains only one instruction, but this doesn't
> > > mean the helper will generate a single instruction.
> > 
> > Well, the access should be done using a single instruction, which is
> > what we care about when using this helpers.
> > 
> > > You may have other instructions to get the registers ready for the access.
> > > 
> > > > 
> > > > ACCESS_ONCE (which also has the _ONCE suffix) IIRC could be
> > > > implemented using several instructions, and hence doesn't seem right
> > > > that they all have the _ONCE suffix.
> > > 
> > > The goal here is the same, we want to access the variable *only* once.
> > 
> > Right, but this is not guaranteed by the current implementation of
> > ACCESS_ONCE AFAICT, as the compiler *might* split the access into two
> > (or more) instructions, and hence won't be an atomic access anymore?
> From my understanding, at least on GCC/Clang, ACCESS_ONCE() should be atomic
> if you are using aligned address and the size smaller than a register size.

Yes, any sane compiler shouldn't split such access, but this is not
guaranteed by the current code in ACCESS_ONCE.

> > 
> > > May I ask why we would want to expose the difference to the user?
> > 
> > I'm not saying we should, but naming them using the _ONCE suffix seems
> > misleading IMO, as they have different guarantees than what
> > ACCESS_ONCE currently provides.
> 
> Lets leave aside how ACCESS_ONCE() is implemented for a moment.
> 
> If ACCESS_ONCE() doesn't guarantee atomicy, then it means you may read a mix
> of the old and new value. This would most likely break quite a few of the
> users because the result wouldn't be coherent.
> 
> Do you have place in mind where the non-atomicity would be useful?

Not that I'm aware of. I think they could all be safely switched to use
the atomic variants.

In fact I wouldn't be surprised if users of ACCESS_ONCE break if the
access was split into multiple instructions.

My comment was to note that just renaming the atomic read/write
helpers to use the _ONCE suffix is IMO weird, as they offer different
properties than ACCESS_ONCE, and hence might confuse users. Someone just
looking at READ_ONCE could assume all _ONCE helpers
guarantee atomicity, which is not the case.

Thanks, Roger.
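
The distinction drawn above can be sketched in C. This is only an
illustration, assuming GCC or Clang: SKETCH_ACCESS_ONCE,
sketch_read_single() and struct sketch_pirq are hypothetical names, and
the __atomic_load_n builtin is used as a stand-in for an asm-based
read_atomic(), not as Xen's actual implementation.

```c
/* Sketch only: all names here are hypothetical, not Xen's helpers. */

/*
 * Volatile-cast idiom: the compiler has to perform the access exactly
 * once, but nothing in the macro itself promises that the access is
 * emitted as a single instruction.
 */
#define SKETCH_ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/*
 * Stand-in for an atomic read helper: the GCC/Clang builtin asks for one
 * untorn load of the int.
 */
static inline int sketch_read_single(int *p)
{
    return __atomic_load_n(p, __ATOMIC_RELAXED);
}

struct sketch_pirq {
    int irq;
};

/*
 * Take one snapshot of the field and use only the local copy afterwards,
 * so the check and the later use cannot observe different values.
 * Returns -1 when no IRQ is bound.
 */
static int sketch_snapshot_irq(struct sketch_pirq *pirq)
{
    int irq = sketch_read_single(&pirq->irq);

    if ( irq <= 0 )
        return -1;

    return irq;
}
```

Both forms avoid re-reading the field between the check and the use; they
differ only in whether a single untorn load is explicitly requested.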

[PATCH 1/8] x86/vmx: handle writes to MISC_ENABLE MSR

2020-08-17 Thread Roger Pau Monne

Such handling consists of checking that no bits have been changed from
the read value: if that's the case, silently drop the write; otherwise
inject a fault.

At least Windows guests will expect to write to the MISC_ENABLE MSR
with the same value that's been read from it.

Signed-off-by: Roger Pau Monné 
---
 xen/arch/x86/hvm/vmx/vmx.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index eb54aadfba..fbfb31af05 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3166,7 +3166,7 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
 
 switch ( msr )
 {
-uint64_t rsvd;
+uint64_t rsvd, tmp;
 
 case MSR_IA32_SYSENTER_CS:
 __vmwrite(GUEST_SYSENTER_CS, msr_content);
@@ -3304,6 +3304,13 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
 /* None of these MSRs are writeable. */
 goto gp_fault;
 
+case MSR_IA32_MISC_ENABLE:
+/* Silently drop writes that don't change the reported value. */
+if ( vmx_msr_read_intercept(msr, &tmp) != X86EMUL_OKAY ||
+ tmp != msr_content )
+goto gp_fault;
+break;
+
 case MSR_P6_PERFCTR(0)...MSR_P6_PERFCTR(7):
 case MSR_P6_EVNTSEL(0)...MSR_P6_EVNTSEL(7):
 case MSR_CORE_PERF_FIXED_CTR0...MSR_CORE_PERF_FIXED_CTR2:
-- 
2.28.0
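
The rule described in the commit message above can be modelled in a few
lines of C. This is a sketch under assumptions, not the hypervisor code:
sketch_wrmsr_same_value() and enum sketch_result are hypothetical names,
and "reported" stands for whatever value the guest would read back.

```c
#include <stdint.h>

/* Hypothetical result codes standing in for "write accepted" / "#GP". */
enum sketch_result {
    SKETCH_OKAY,
    SKETCH_GP_FAULT,
};

/*
 * Model of "drop the write if nothing changes, fault otherwise":
 * 'reported' is the value a read of the MSR returns to the guest,
 * 'written' is the value the guest tries to write.
 */
static enum sketch_result sketch_wrmsr_same_value(uint64_t reported,
                                                  uint64_t written)
{
    if ( written != reported )
        return SKETCH_GP_FAULT;   /* the guest tried to change a bit */

    return SKETCH_OKAY;           /* nothing to do: silently dropped */
}
```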

[PATCH 8/8] x86/hvm: Disallow access to unknown MSRs

2020-08-17 Thread Roger Pau Monne

From: Andrew Cooper 

Change the catch-all behavior for MSRs not explicitly handled. Instead
of allowing full read access to the MSR space and silently dropping
writes, return an exception when the MSR is not explicitly handled.

Signed-off-by: Andrew Cooper 
---
 xen/arch/x86/hvm/svm/svm.c |  8 
 xen/arch/x86/hvm/vmx/vmx.c | 11 ---
 2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 671cdcb724..076fa67138 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1959,6 +1959,7 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 break;
 }
 
+gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", msr);
 goto gpf;
 }
 
@@ -2140,10 +2141,9 @@ static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
 break;
 
 default:
-/* Match up with the RDMSR side; ultimately this should go away. */
-if ( rdmsr_safe(msr, msr_content) == 0 )
-break;
-
+gdprintk(XENLOG_WARNING,
+ "WRMSR 0x%08x val 0x%016"PRIx64" unimplemented\n",
+ msr, msr_content);
 goto gpf;
 }
 
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index fbfb31af05..800066da7d 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3024,9 +3024,7 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 break;
 }
 
-if ( rdmsr_safe(msr, *msr_content) == 0 )
-break;
-
+gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", msr);
 goto gp_fault;
 }
 
@@ -3329,10 +3327,9 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
  is_last_branch_msr(msr) )
 break;
 
-/* Match up with the RDMSR side; ultimately this should go away. */
-if ( rdmsr_safe(msr, msr_content) == 0 )
-break;
-
+gdprintk(XENLOG_WARNING,
+ "WRMSR 0x%08x val 0x%016"PRIx64" unimplemented\n",
+ msr, msr_content);
 goto gp_fault;
 }
 
-- 
2.28.0

[PATCH 7/8] x86/pv: disallow access to unknown MSRs

2020-08-17 Thread Roger Pau Monne

Change the catch-all behavior for MSRs not explicitly handled. Instead
of allowing full read access to the MSR space and silently dropping
writes, return an exception when the MSR is not explicitly handled.

Signed-off-by: Roger Pau Monné 
---
 xen/arch/x86/pv/emul-priv-op.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 76c878b677..fcbcf5a6c2 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -976,9 +976,10 @@ static int read_msr(unsigned int reg, uint64_t *val,
 }
 /* fall through */
 default:
+gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", reg);
+break;
+
 normal:
-/* Everyone can read the MSR space. */
-/* gdprintk(XENLOG_WARNING, "Domain attempted RDMSR %08x\n", reg); */
 if ( rdmsr_safe(reg, *val) )
 break;
 return X86EMUL_OKAY;
@@ -1143,14 +1144,15 @@ static int write_msr(unsigned int reg, uint64_t val,
 }
 /* fall through */
 default:
-if ( rdmsr_safe(reg, temp) )
-break;
+gdprintk(XENLOG_WARNING,
+ "WRMSR 0x%08x val 0x%016"PRIx64" unimplemented\n",
+ reg, val);
+break;
 
-if ( val != temp )
 invalid:
-gdprintk(XENLOG_WARNING,
- "Domain attempted WRMSR %08x from 0x%016"PRIx64" to 
0x%016"PRIx64"\n",
- reg, temp, val);
+gdprintk(XENLOG_WARNING,
+ "Domain attempted WRMSR %08x from 0x%016"PRIx64" to 
0x%016"PRIx64"\n",
+ reg, temp, val);
 return X86EMUL_OKAY;
 }
 
-- 
2.28.0

[PATCH 4/8] x86/pv: handle reads to the PAT MSR

2020-08-17 Thread Roger Pau Monne

The value in the PAT MSR is part of the ABI between Xen and PV guests,
and there's no reason to not allow a PV guest to read it.

Signed-off-by: Roger Pau Monné 
---
 xen/arch/x86/pv/emul-priv-op.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index fd3cbfaebc..ff87c7d769 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -900,6 +900,10 @@ static int read_msr(unsigned int reg, uint64_t *val,
 *val = guest_efer(currd);
 return X86EMUL_OKAY;
 
+case MSR_IA32_CR_PAT:
+*val = XEN_MSR_PAT;
+return X86EMUL_OKAY;
+
 case MSR_K7_FID_VID_CTL:
 case MSR_K7_FID_VID_STATUS:
 case MSR_K8_PSTATE_LIMIT:
-- 
2.28.0

[PATCH 6/8] x86/pv: allow reading FEATURE_CONTROL MSR

2020-08-17 Thread Roger Pau Monne

Linux PV guests will attempt to read the FEATURE_CONTROL MSR, report
no features enabled or available, and that the MSR is already locked.

Signed-off-by: Roger Pau Monné 
---
 xen/arch/x86/pv/emul-priv-op.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 554a95ae8d..76c878b677 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -879,6 +879,10 @@ static int read_msr(unsigned int reg, uint64_t *val,
 *val |= APIC_BASE_BSP;
 return X86EMUL_OKAY;
 
+case MSR_IA32_FEATURE_CONTROL:
+*val = IA32_FEATURE_CONTROL_LOCK;
+return X86EMUL_OKAY;
+
 case MSR_FS_BASE:
 if ( is_pv_32bit_domain(currd) )
 break;
-- 
2.28.0

[PATCH 2/8] x86/svm: silently drop writes to SYSCFG and related MSRs

2020-08-17 Thread Roger Pau Monne

The SYSCFG, TOP_MEM1 and TOP_MEM2 MSRs are currently exposed to guests
and writes are silently discarded. Make this explicit in the SVM code
now, and just return 0 when attempting to read any of the MSRs, while
continuing to silently drop writes.

Signed-off-by: Roger Pau Monné 
---
 xen/arch/x86/hvm/svm/svm.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index ca3bbfcbb3..671cdcb724 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1917,6 +1917,13 @@ static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 goto gpf;
 break;
 
+case MSR_K8_TOP_MEM1:
+case MSR_K8_TOP_MEM2:
+case MSR_K8_SYSCFG:
+/* Return all 0s. */
+*msr_content = 0;
+break;
+
 case MSR_K8_VM_CR:
 *msr_content = 0;
 break;
@@ -2094,6 +2101,12 @@ static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
 goto gpf;
 break;
 
+case MSR_K8_TOP_MEM1:
+case MSR_K8_TOP_MEM2:
+case MSR_K8_SYSCFG:
+/* Drop writes. */
+break;
+
 case MSR_K8_VM_CR:
 /* ignore write. handle all bits as read-only. */
 break;
-- 
2.28.0

[PATCH 0/8] x86: switch default MSR behavior

2020-08-17 Thread Roger Pau Monne

Hello,

The current series attempts to change the current MSR default handling
behavior, which is to silently drop writes to writable MSRs, and allow
reading any MSR not explicitly handled.

After this series, access to MSRs not explicitly handled will trigger a
#GP fault. I've tested this series with osstest and it doesn't introduce
any regressions, at least on the boxes selected for testing:

http://logs.test-lab.xenproject.org/osstest/logs/152602/

Thanks, Roger.

Andrew Cooper (1):
  x86/hvm: Disallow access to unknown MSRs

Roger Pau Monne (7):
  x86/vmx: handle writes to MISC_ENABLE MSR
  x86/svm: silently drop writes to SYSCFG and related MSRs
  x86/pv: handle writes to the EFER MSR
  x86/pv: handle reads to the PAT MSR
  x86/pv: allow reading APIC_BASE MSR
  x86/pv: allow reading FEATURE_CONTROL MSR
  x86/pv: disallow access to unknown MSRs

 xen/arch/x86/hvm/svm/svm.c | 21 +--
 xen/arch/x86/hvm/vmx/vmx.c | 20 ++
 xen/arch/x86/pv/emul-priv-op.c | 68 +-
 3 files changed, 79 insertions(+), 30 deletions(-)

-- 
2.28.0

[PATCH 5/8] x86/pv: allow reading APIC_BASE MSR

2020-08-17 Thread Roger Pau Monne

Linux PV guests will attempt to read the APIC_BASE MSR, so just report
a default value to make Linux happy.

Signed-off-by: Roger Pau Monné 
---
 xen/arch/x86/pv/emul-priv-op.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index ff87c7d769..554a95ae8d 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -872,6 +872,13 @@ static int read_msr(unsigned int reg, uint64_t *val,
 
 switch ( reg )
 {
+case MSR_APIC_BASE:
+/* Linux PV guests will attempt to read APIC_BASE. */
+*val = APIC_BASE_ENABLE | APIC_DEFAULT_PHYS_BASE;
+if ( !curr->vcpu_id )
+*val |= APIC_BASE_BSP;
+return X86EMUL_OKAY;
+
 case MSR_FS_BASE:
 if ( is_pv_32bit_domain(currd) )
 break;
-- 
2.28.0

[PATCH 3/8] x86/pv: handle writes to the EFER MSR

2020-08-17 Thread Roger Pau Monne

Silently drop writes to the EFER MSR for PV guests if the value is not
changed from what is being reported. Current PV Linux will attempt
to write to the MSR with the same value that's been read, and raising
a fault will result in a guest crash.

As part of this work introduce a helper to easily get the EFER value
reported to guests.

Signed-off-by: Roger Pau Monné 
---
 xen/arch/x86/pv/emul-priv-op.c | 35 --
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index efeb2a727e..fd3cbfaebc 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -837,6 +837,23 @@ static inline bool is_cpufreq_controller(const struct domain *d)
 is_hardware_domain(d));
 }
 
+static uint64_t guest_efer(const struct domain *d)
+{
+uint64_t val;
+
+/* Hide unknown bits, and unconditionally hide SVME from guests. */
+val = read_efer() & EFER_KNOWN_MASK & ~EFER_SVME;
+/*
+ * Hide the 64-bit features from 32-bit guests.  SCE has
+ * vendor-dependent behaviour.
+ */
+if ( is_pv_32bit_domain(d) )
+val &= ~(EFER_LME | EFER_LMA |
+ (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL
+  ? EFER_SCE : 0));
+return val;
+}
+
 static int read_msr(unsigned int reg, uint64_t *val,
 struct x86_emulate_ctxt *ctxt)
 {
@@ -880,16 +897,7 @@ static int read_msr(unsigned int reg, uint64_t *val,
 return X86EMUL_OKAY;
 
 case MSR_EFER:
-/* Hide unknown bits, and unconditionally hide SVME from guests. */
-*val = read_efer() & EFER_KNOWN_MASK & ~EFER_SVME;
-/*
- * Hide the 64-bit features from 32-bit guests.  SCE has
- * vendor-dependent behaviour.
- */
-if ( is_pv_32bit_domain(currd) )
-*val &= ~(EFER_LME | EFER_LMA |
-  (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL
-   ? EFER_SCE : 0));
+*val = guest_efer(currd);
 return X86EMUL_OKAY;
 
 case MSR_K7_FID_VID_CTL:
@@ -1005,6 +1013,13 @@ static int write_msr(unsigned int reg, uint64_t val,
 curr->arch.pv.gs_base_user = val;
 return X86EMUL_OKAY;
 
+case MSR_EFER:
+/* Silently drop writes that don't change the reported value. */
+temp = guest_efer(currd);
+if ( val != temp )
+goto invalid;
+return X86EMUL_OKAY;
+
 case MSR_K7_FID_VID_STATUS:
 case MSR_K7_FID_VID_CTL:
 case MSR_K8_PSTATE_LIMIT:
-- 
2.28.0

Re: Planned osstest outage, around 17th August

2020-08-17 Thread Ian Jackson

Ian Jackson writes ("Planned osstest outage, around 17th August"):
> osstest's infrastructure hosts need upgrading to Debian "buster" (aka
> Debian "stable").  We are planning to do this on Monday the 17th of
> August.
> 
> This will involve telling osstest to start draining its queues some
> time around the 15th of August.  If all goes well, it will be back in
> operation late on the 17th.  But it is possible that difficulties will
> arise, in which case it might be out of operation, or operating in a
> degraded way, for perhaps the rest of that week.

Some of the upgrades have encountered difficulties.  Nothing
insurmountable, but it is taking longer than expected.  We hope to be
done some time tomorrow.  In the meantime osstest is still offline.

Thanks,
Ian.

Re: [PATCH] xen/x86: irq: Avoid a TOCTOU race in pirq_spin_lock_irq_desc()

2020-08-17 Thread Roger Pau Monné

On Mon, Aug 17, 2020 at 04:53:51PM +0100, Julien Grall wrote:
> 
> 
> On 17/08/2020 16:03, Roger Pau Monné wrote:
> > On Mon, Aug 17, 2020 at 03:39:52PM +0100, Julien Grall wrote:
> > > 
> > > 
> > > On 17/08/2020 15:01, Roger Pau Monné wrote:
> > > > On Mon, Aug 17, 2020 at 02:14:01PM +0100, Julien Grall wrote:
> > > > > Hi,
> > > > > 
> > > > > On 17/08/2020 13:46, Roger Pau Monné wrote:
> > > > > > On Fri, Aug 14, 2020 at 08:25:28PM +0100, Julien Grall wrote:
> > > > > > > Hi Andrew,
> > > > > > > 
> > > > > > > Sorry for the late answer.
> > > > > > > 
> > > > > > > On 23/07/2020 14:59, Andrew Cooper wrote:
> > > > > > > > On 23/07/2020 14:22, Julien Grall wrote:
> > > > > > > > > Hi Jan,
> > > > > > > > > 
> > > > > > > > > On 23/07/2020 12:23, Jan Beulich wrote:
> > > > > > > > > > On 22.07.2020 18:53, Julien Grall wrote:
> > > > > > > > > > > --- a/xen/arch/x86/irq.c
> > > > > > > > > > > +++ b/xen/arch/x86/irq.c
> > > > > > > > > > > @@ -1187,7 +1187,7 @@ struct irq_desc 
> > > > > > > > > > > *pirq_spin_lock_irq_desc(
> > > > > > > > > > >     for ( ; ; )
> > > > > > > > > > >   {
> > > > > > > > > > > -    int irq = pirq->arch.irq;
> > > > > > > > > > > +    int irq = read_atomic(&pirq->arch.irq);
> > > > > > > > > > 
> > > > > > > > > > There we go - I'd be fine this way, but I'm pretty sure 
> > > > > > > > > > Andrew
> > > > > > > > > > would want this to be ACCESS_ONCE(). So I guess now is the 
> > > > > > > > > > time
> > > > > > > > > > to settle which one to prefer in new code (or which criteria
> > > > > > > > > > there are to prefer one over the other).
> > > > > > > > > 
> > > > > > > > > I would prefer if we have a single way to force the compiler 
> > > > > > > > > to do a
> > > > > > > > > single access (read/write).
> > > > > > > > 
> > > > > > > > Unlikely to happen, I'd expect.
> > > > > > > > 
> > > > > > > > But I would really like to get rid of (or at least rename)
> > > > > > > > read_atomic()/write_atomic() specifically because they've got 
> > > > > > > > nothing to
> > > > > > > > do with atomic_t's and the set of functionality who's namespace 
> > > > > > > > they share.
> > > > > > > 
> > > > > > > Would you be happy if I rename both to READ_ONCE() and 
> > > > > > > WRITE_ONCE()? I would
> > > > > > > also suggest to move them implementation in a new header 
> > > > > > > asm/lib.h.
> > > > > > 
> > > > > > Maybe {READ/WRITE}_SINGLE (to note those should be implemented 
> > > > > > using a
> > > > > > single instruction)?
> > > > > 
> > > > > The asm volatile statement contains only one instruction, but this 
> > > > > doesn't
> > > > > mean the helper will generate a single instruction.
> > > > 
> > > > Well, the access should be done using a single instruction, which is
> > > > what we care about when using this helpers.
> > > > 
> > > > > You may have other instructions to get the registers ready for the 
> > > > > access.
> > > > > 
> > > > > > 
> > > > > > ACCESS_ONCE (which also has the _ONCE suffix) IIRC could be
> > > > > > implemented using several instructions, and hence doesn't seem right
> > > > > > that they all have the _ONCE suffix.
> > > > > 
> > > > > The goal here is the same, we want to access the variable *only* once.
> > > > 
> > > > Right, but this is not guaranteed by the current implementation of
> > > > ACCESS_ONCE AFAICT, as the compiler *might* split the access into two
> > > > (or more) instructions, and hence won't be an atomic access anymore?
> > >  From my understanding, at least on GCC/Clang, ACCESS_ONCE() should be 
> > > atomic
> > > if you are using aligned address and the size smaller than a register 
> > > size.
> > 
> > Yes, any sane compiler shouldn't split such access, but this is not
> > guaranteed by the current code in ACCESS_ONCE.
> To be sure, your concern here is not about GCC/Clang but other compilers. Am
> I correct?

Or about the existing ones switching behavior, which is again quite
unlikely I would like to assume.

> We already have a collection of compiler specific macros in compiler.h. So
> how about we classify this macro as a compiler specific one? (See more
> below).
> 
> > 
> > > > 
> > > > > May I ask why we would want to expose the difference to the user?
> > > > 
> > > > I'm not saying we should, but naming them using the _ONCE suffix seems
> > > > misleading IMO, as they have different guarantees than what
> > > > ACCESS_ONCE currently provides.
> > > 
> > > Lets leave aside how ACCESS_ONCE() is implemented for a moment.
> > > 
> > > If ACCESS_ONCE() doesn't guarantee atomicy, then it means you may read a 
> > > mix
> > > of the old and new value. This would most likely break quite a few of the
> > > users because the result wouldn't be coherent.
> > > 
> > > Do you have place in mind where the non-atomicity would be useful?
> > 
> > Not that I'm aware, I think they could all be safely switched to use
> > the atomic variants
> There is concern that read_atomic(), write_atomic() prevent t

[OSSTEST PATCH 1/2] Tcl: Use tclsh8.6

2020-08-17 Thread Ian Jackson

This is needed to run on buster.

I have checked that tclsh8.6 and TclX work on osstest.test-lab.  TclX
seems to be provided by tcl8.4 but works with tcl8.6 (at least on
buster).

Deployment note: hosts running earlier Debian (including
osstest.xs.citrite.net, the Citrix Cambridge instance), may need
OSSTEST_DAEMON_TCLSH=tclsh8.4 or similar in ~/.xen-osstest/settings.

Signed-off-by: Ian Jackson 
---
 README| 2 +-
 mg-transient-task | 2 +-
 ms-ownerdaemon| 2 +-
 ms-queuedaemon| 2 +-
 ms-reportuptime   | 2 +-
 sg-execute-flight | 2 +-
 sg-run-job| 2 +-
 7 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/README b/README
index 91793795..2804ecf3 100644
--- a/README
+++ b/README
@@ -291,7 +291,7 @@ To run osstest in standalone mode:
 
  - You need to install
  sqlite3
- tcl8.5 tclx8.4 libsqlite3-tcl
+ tcl8.6 tclx8.4 libsqlite3-tcl
  libdbi-perl libdbd-sqlite3-perl
  pax rsync
  curl
diff --git a/mg-transient-task b/mg-transient-task
index ce5180ff..d707ce76 100755
--- a/mg-transient-task
+++ b/mg-transient-task
@@ -1,4 +1,4 @@
-#!/usr/bin/tclsh8.5
+#!/usr/bin/tclsh8.6
 # -*- Tcl -*- 
 # usage: ./mg-transient-task PROGRAM [ARGS...]
 
diff --git a/ms-ownerdaemon b/ms-ownerdaemon
index bf0b5952..4c33e93a 100755
--- a/ms-ownerdaemon
+++ b/ms-ownerdaemon
@@ -1,4 +1,4 @@
-#!/usr/bin/tclsh8.5
+#!/usr/bin/tclsh8.6
 # -*- Tcl -*- 
 # usage: ./ms-ownerdaemon  ... | logger
 
diff --git a/ms-queuedaemon b/ms-queuedaemon
index f02abf37..a3a009ca 100755
--- a/ms-queuedaemon
+++ b/ms-queuedaemon
@@ -1,4 +1,4 @@
-#!/usr/bin/tclsh8.5
+#!/usr/bin/tclsh8.6
 # -*- Tcl -*- 
 # usage: ./ms-queuedaemon  ... | logger
 
diff --git a/ms-reportuptime b/ms-reportuptime
index 804e563d..bcf79054 100755
--- a/ms-reportuptime
+++ b/ms-reportuptime
@@ -1,4 +1,4 @@
-#!/usr/bin/tclsh8.5
+#!/usr/bin/tclsh8.6
 # -*- Tcl -*- 
 # usage: ./ms-reportuptime
 
diff --git a/sg-execute-flight b/sg-execute-flight
index 02f63316..1b002cdd 100755
--- a/sg-execute-flight
+++ b/sg-execute-flight
@@ -1,4 +1,4 @@
-#!/usr/bin/tclsh8.5
+#!/usr/bin/tclsh8.6
 # -*- Tcl -*- 
 # usage: ./sg-execute-flight FLIGHT BLESSING
 
diff --git a/sg-run-job b/sg-run-job
index aa7953ac..df3d08d0 100755
--- a/sg-run-job
+++ b/sg-run-job
@@ -1,4 +1,4 @@
-#!/usr/bin/tclsh8.5
+#!/usr/bin/tclsh8.6
 # -*- Tcl -*-
 
 # This is part of "osstest", an automated testing framework for Xen.
-- 
2.11.0

[OSSTEST PATCH 2/2] tcl: JobDB: Do not require particular Pgtcl version

2020-08-17 Thread Ian Jackson

This just serves to complicate upgrades.

Signed-off-by: Ian Jackson 
---
 tcl/JobDB-Executive.tcl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tcl/JobDB-Executive.tcl b/tcl/JobDB-Executive.tcl
index 56b61825..29c82821 100644
--- a/tcl/JobDB-Executive.tcl
+++ b/tcl/JobDB-Executive.tcl
@@ -15,7 +15,7 @@
 # You should have received a copy of the GNU Affero General Public License
 # along with this program.  If not, see .
 
-package require Pgtcl 1.5
+package require Pgtcl
 
 namespace eval jobdb {
 
-- 
2.11.0

[PATCH v2 1/6] tools/ocaml/libs/xc: Fix ambiguous documentation comment

2020-08-17 Thread Edwin Török

Signed-off-by: Edwin Török 
---
 tools/ocaml/libs/xc/xenctrl.mli | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index 26ec7e59b1..f7f6ec570d 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -132,8 +132,10 @@ external interface_close : handle -> unit = "stub_xc_interface_close"
  * interface_open and interface_close or with_intf although mixing both
  * is possible *)
 val with_intf : (handle -> 'a) -> 'a
+
 (** [get_handle] returns the global handle used by [with_intf] *)
 val get_handle: unit -> handle option
+
 (** [close handle] closes the handle maintained by [with_intf]. This
  * should only be closed before process exit. It must not be called from
  * a function called directly or indirectly by with_intf as this
-- 
2.25.1

[PATCH v2 4/6] tools/ocaml/xenstored: drop select based socket watching

2020-08-17 Thread Edwin Török

Poll has been the default since 2014; I think we can safely say by now
that poll() works and we don't need to fall back to select().

This will allow fixing up the way we call poll to be more efficient
(and pave the way for introducing epoll support):
currently poll wraps the select API, which is inefficient.

Signed-off-by: Edwin Török 
---
Changed since v1:
 * fix commit title
---
 tools/ocaml/xenstored/Makefile | 12 ++--
 tools/ocaml/xenstored/parse_arg.ml |  7 ++-
 tools/ocaml/xenstored/{select.ml => poll.ml}   | 14 ++
 tools/ocaml/xenstored/{select.mli => poll.mli} | 12 ++--
 tools/ocaml/xenstored/xenstored.ml |  4 +---
 5 files changed, 13 insertions(+), 36 deletions(-)
 rename tools/ocaml/xenstored/{select.ml => poll.ml} (85%)
 rename tools/ocaml/xenstored/{select.mli => poll.mli} (58%)

diff --git a/tools/ocaml/xenstored/Makefile b/tools/ocaml/xenstored/Makefile
index 68d35c483a..692a62584e 100644
--- a/tools/ocaml/xenstored/Makefile
+++ b/tools/ocaml/xenstored/Makefile
@@ -18,12 +18,12 @@ OCAMLINCLUDE += \
-I $(OCAML_TOPLEVEL)/libs/xc \
-I $(OCAML_TOPLEVEL)/libs/eventchn
 
-LIBS = syslog.cma syslog.cmxa select.cma select.cmxa
+LIBS = syslog.cma syslog.cmxa poll.cma poll.cmxa
 syslog_OBJS = syslog
 syslog_C_OBJS = syslog_stubs
-select_OBJS = select
-select_C_OBJS = select_stubs
-OCAML_LIBRARY = syslog select
+poll_OBJS = poll
+poll_C_OBJS = select_stubs
+OCAML_LIBRARY = syslog poll
 
 LIBS += systemd.cma systemd.cmxa
 systemd_OBJS = systemd
@@ -58,13 +58,13 @@ OBJS = paths \
process \
xenstored
 
-INTF = symbol.cmi trie.cmi syslog.cmi systemd.cmi select.cmi
+INTF = symbol.cmi trie.cmi syslog.cmi systemd.cmi poll.cmi
 
 XENSTOREDLIBS = \
unix.cmxa \
-ccopt -L -ccopt . syslog.cmxa \
-ccopt -L -ccopt . systemd.cmxa \
-   -ccopt -L -ccopt . select.cmxa \
+   -ccopt -L -ccopt . poll.cmxa \
-ccopt -L -ccopt $(OCAML_TOPLEVEL)/libs/mmap 
$(OCAML_TOPLEVEL)/libs/mmap/xenmmap.cmxa \
-ccopt -L -ccopt $(OCAML_TOPLEVEL)/libs/eventchn 
$(OCAML_TOPLEVEL)/libs/eventchn/xeneventchn.cmxa \
-ccopt -L -ccopt $(OCAML_TOPLEVEL)/libs/xc 
$(OCAML_TOPLEVEL)/libs/xc/xenctrl.cmxa \
diff --git a/tools/ocaml/xenstored/parse_arg.ml b/tools/ocaml/xenstored/parse_arg.ml
index 1803c3eda0..2c4b5a8528 100644
--- a/tools/ocaml/xenstored/parse_arg.ml
+++ b/tools/ocaml/xenstored/parse_arg.ml
@@ -25,7 +25,6 @@ type config =
tracefile: string option; (* old xenstored compatibility *)
restart: bool;
disable_socket: bool;
-   use_select: bool;
 }
 
 let do_argv =
@@ -37,7 +36,7 @@ let do_argv =
and config_file = ref ""
and restart = ref false
and disable_socket = ref false
-   and use_select = ref false in
+   in
 
let speclist =
[ ("--no-domain-init", Arg.Unit (fun () -> domain_init := 
false),
@@ -54,9 +53,8 @@ let do_argv =
  ("-T", Arg.Set_string tracefile, ""); (* for compatibility *)
  ("--restart", Arg.Set restart, "Read database on starting");
  ("--disable-socket", Arg.Unit (fun () -> disable_socket := 
true), "Disable socket");
- ("--use-select", Arg.Unit (fun () -> use_select := true), 
"Use select instead of poll"); (* for backward compatibility and testing *)
] in
-   let usage_msg = "usage : xenstored [--config-file ] 
[--no-domain-init] [--help] [--no-fork] [--reraise-top-level] [--restart] 
[--disable-socket] [--use-select]" in
+   let usage_msg = "usage : xenstored [--config-file ] 
[--no-domain-init] [--help] [--no-fork] [--reraise-top-level] [--restart] 
[--disable-socket]" in
Arg.parse speclist (fun _ -> ()) usage_msg;
{
domain_init = !domain_init;
@@ -68,5 +66,4 @@ let do_argv =
tracefile = if !tracefile <> "" then Some !tracefile else None;
restart = !restart;
disable_socket = !disable_socket;
-   use_select = !use_select;
}
diff --git a/tools/ocaml/xenstored/select.ml b/tools/ocaml/xenstored/poll.ml
similarity index 85%
rename from tools/ocaml/xenstored/select.ml
rename to tools/ocaml/xenstored/poll.ml
index 0455e163e3..26f8620dfc 100644
--- a/tools/ocaml/xenstored/select.ml
+++ b/tools/ocaml/xenstored/poll.ml
@@ -63,15 +63,5 @@ let poll_select in_fds out_fds exc_fds timeout =
 (if event.except then fd :: x else x))
a r
 
-(* If the use_poll function is not called at all, we default to the original 
Unix.select behavior *)
-let select_fun = ref Unix.select
-
-let use_poll yes =
-   let sel_fun, max_fd =
-   if yes then poll_select, get_sys_fs_nr_open ()
-   else Unix.select, 1024 in
-   select_fun := sel_fun;
-   set_fd_limit max_fd
-
-let select in_fds out_fds exc_fds timeout =
-   (!select_fun) in_fds out

[PATCH v2 3/6] tools/ocaml/xenstored: replace hand rolled GC with weak GC references

2020-08-17 Thread Edwin Török

The code here is attempting to reduce memory usage by sharing common
substrings in the tree: it replaces strings with ints, and keeps a
string->int map that gets manually garbage collected using a hand-rolled
mark and sweep algorithm.

This is unnecessary: OCaml already has a mark-and-sweep Garbage
Collector runtime, and sharing of common strings in tree nodes
can be achieved through Weak references: if the string hasn't been seen
yet it gets added to the Weak reference table, and if it has we use the
entry from the table instead, thus storing a string only once.
When the string is no longer referenced OCaml's GC will drop it from the
weak table: there is no need to manually do a mark-and-sweep, or to tell
OCaml when to drop it.
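
A minimal sketch of the idea, assuming a table built with the standard
Weak.Make functor. The names WeakTable and tbl match what symbol.ml uses
later in this series; `intern` is a hypothetical helper name and this is
not the patch's actual code:

```ocaml
(* Weak hash set of strings: entries vanish once nothing else refers to them. *)
module WeakTable = Weak.Make (struct
  type t = string
  let equal (a : string) b = a = b
  let hash = Hashtbl.hash
end)

let tbl = WeakTable.create 1024

(* [intern] is a hypothetical name. It returns the copy already stored in
   the table if there is one, otherwise it stores [s] and returns it. *)
let intern (s : string) : string = WeakTable.merge tbl s

let () =
  let a = intern (String.concat "/" [ "local"; "domain" ]) in
  let b = intern (String.concat "/" [ "local"; "domain" ]) in
  (* Two distinct allocations collapse to one physical string. *)
  assert (a == b)
```

Because interned strings are physically shared, equality between two
symbols can be tested with `==`.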

Signed-off-by: Edwin Török 
---
 tools/ocaml/xenstored/connection.ml |  3 --
 tools/ocaml/xenstored/history.ml| 14 --
 tools/ocaml/xenstored/store.ml  | 11 ++---
 tools/ocaml/xenstored/symbol.ml | 68 ++---
 tools/ocaml/xenstored/symbol.mli| 21 ++---
 tools/ocaml/xenstored/xenstored.ml  | 16 +--
 6 files changed, 24 insertions(+), 109 deletions(-)

diff --git a/tools/ocaml/xenstored/connection.ml 
b/tools/ocaml/xenstored/connection.ml
index 24750ada43..aa6dd95501 100644
--- a/tools/ocaml/xenstored/connection.ml
+++ b/tools/ocaml/xenstored/connection.ml
@@ -271,9 +271,6 @@ let has_more_work con =
 
 let incr_ops con = con.stat_nb_ops <- con.stat_nb_ops + 1
 
-let mark_symbols con =
-   Hashtbl.iter (fun _ t -> Store.mark_symbols (Transaction.get_store t)) 
con.transactions
-
 let stats con =
Hashtbl.length con.watches, con.stat_nb_ops
 
diff --git a/tools/ocaml/xenstored/history.ml b/tools/ocaml/xenstored/history.ml
index f39565bff5..029802bd15 100644
--- a/tools/ocaml/xenstored/history.ml
+++ b/tools/ocaml/xenstored/history.ml
@@ -22,20 +22,6 @@ type history_record = {
 
 let history : history_record list ref = ref []
 
-(* Called from periodic_ops to ensure we don't discard symbols that are still 
needed. *)
-(* There is scope for optimisation here, since in consecutive commits one 
commit's `after`
- * is the same thing as the next commit's `before`, but not all commits in 
history are
- * consecutive. *)
-let mark_symbols () =
-   (* There are gaps where dom0's commits are missing. Otherwise we could 
assume that
-* each element's `before` is the same thing as the next element's 
`after`
-* since the next element is the previous commit *)
-   List.iter (fun hist_rec ->
-   Store.mark_symbols hist_rec.before;
-   Store.mark_symbols hist_rec.after;
-   )
-   !history
-
 (* Keep only enough commit-history to protect the running transactions that we 
are still tracking *)
 (* There is scope for optimisation here, replacing List.filter with something 
more efficient,
  * probably on a different list-like structure. *)
diff --git a/tools/ocaml/xenstored/store.ml b/tools/ocaml/xenstored/store.ml
index f299ec6461..45659a23ee 100644
--- a/tools/ocaml/xenstored/store.ml
+++ b/tools/ocaml/xenstored/store.ml
@@ -46,18 +46,18 @@ let add_child node child =
 
 let exists node childname =
let childname = Symbol.of_string childname in
-   List.exists (fun n -> n.name = childname) node.children
+   List.exists (fun n -> Symbol.equal n.name childname) node.children
 
 let find node childname =
let childname = Symbol.of_string childname in
-   List.find (fun n -> n.name = childname) node.children
+   List.find (fun n -> Symbol.equal n.name childname) node.children
 
 let replace_child node child nchild =
(* this is the on-steroid version of the filter one-replace one *)
let rec replace_one_in_list l =
match l with
| []   -> []
-   | h :: tl when h.name = child.name -> nchild :: tl
+   | h :: tl when Symbol.equal h.name child.name -> nchild :: tl
| h :: tl  -> h :: replace_one_in_list 
tl
in
{ node with children = (replace_one_in_list node.children) }
@@ -67,7 +67,7 @@ let del_childname node childname =
let rec delete_one_in_list l =
match l with
| []-> raise Not_found
-   | h :: tl when h.name = sym -> tl
+   | h :: tl when Symbol.equal h.name sym -> tl
| h :: tl   -> h :: delete_one_in_list tl
in
{ node with children = (delete_one_in_list node.children) }
@@ -463,9 +463,6 @@ let copy store = {
quota = Quota.copy store.quota;
 }
 
-let mark_symbols store =
-   Node.recurse (fun node -> Symbol.mark_as_used node.Node.name) store.root
-
 let incr_transaction_coalesce store =
store.stat_transaction_coalesce <- store.stat_transaction_coalesce + 1
 let incr_transaction_abort store =
diff --git a/tools/

[PATCH v2 6/6] tools/ocaml/xenstored: use more efficient tries

2020-08-17 Thread Edwin Török

No functional change, just an optimization.

Signed-off-by: Edwin Török 
---
Changed since v1:
 * fix missing 'set_node' in 'set' that got lost in conversion to map
 * simplify 'compare' function
---
 tools/ocaml/xenstored/connections.ml |  2 +-
 tools/ocaml/xenstored/symbol.ml  |  6 +--
 tools/ocaml/xenstored/trie.ml| 59 
 tools/ocaml/xenstored/trie.mli   | 26 ++--
 4 files changed, 43 insertions(+), 50 deletions(-)

diff --git a/tools/ocaml/xenstored/connections.ml 
b/tools/ocaml/xenstored/connections.ml
index f02ef6b526..4983c7370b 100644
--- a/tools/ocaml/xenstored/connections.ml
+++ b/tools/ocaml/xenstored/connections.ml
@@ -21,7 +21,7 @@ type t = {
anonymous: (Unix.file_descr, Connection.t) Hashtbl.t;
domains: (int, Connection.t) Hashtbl.t;
ports: (Xeneventchn.t, Connection.t) Hashtbl.t;
-   mutable watches: (string, Connection.watch list) Trie.t;
+   mutable watches: Connection.watch list Trie.t;
 }
 
 let create () = {
diff --git a/tools/ocaml/xenstored/symbol.ml b/tools/ocaml/xenstored/symbol.ml
index 2697915623..85b3f265de 100644
--- a/tools/ocaml/xenstored/symbol.ml
+++ b/tools/ocaml/xenstored/symbol.ml
@@ -31,9 +31,9 @@ let equal a b =
   (* compare using physical equality, both members have to be part of the 
above weak table *)
   a == b
 
-let compare a b =
-  if equal a b then 0
-  else -(String.compare a b)
+(* the sort order is reversed here, so that Map.fold constructs a list
+   in ascending order *)
+let compare a b = String.compare b a
 
 let stats () =
   let len, entries, _, _, _, _ = WeakTable.stats tbl in
diff --git a/tools/ocaml/xenstored/trie.ml b/tools/ocaml/xenstored/trie.ml
index dc42535092..5b4831cf02 100644
--- a/tools/ocaml/xenstored/trie.ml
+++ b/tools/ocaml/xenstored/trie.ml
@@ -13,24 +13,26 @@
  * GNU Lesser General Public License for more details.
  *)
 
+module StringMap = Map.Make(String)
+
 module Node =
 struct
-   type ('a,'b) t =  {
-   key: 'a;
-   value: 'b option;
-   children: ('a,'b) t list;
+   type 'a t =  {
+   key: string;
+   value: 'a option;
+   children: 'a t StringMap.t;
}
 
let _create key value = {
key = key;
value = Some value;
-   children = [];
+   children = StringMap.empty;
}
 
let empty key = {
key = key;
value = None;
-   children = []
+   children = StringMap.empty;
}
 
let _get_key node = node.key
@@ -47,41 +49,31 @@ struct
{ node with children = children }
 
let _add_child node child =
-   { node with children = child :: node.children }
+   { node with children = StringMap.add child.key child 
node.children }
 end
 
-type ('a,'b) t = ('a,'b) Node.t list
+type 'a t = 'a Node.t StringMap.t
 
 let mem_node nodes key =
-   List.exists (fun n -> n.Node.key = key) nodes
+   StringMap.mem key nodes
 
 let find_node nodes key =
-   List.find (fun n -> n.Node.key = key) nodes
+   StringMap.find key nodes
 
 let replace_node nodes key node =
-   let rec aux = function
-   | []-> []
-   | h :: tl when h.Node.key = key -> node :: tl
-   | h :: tl   -> h :: aux tl
-   in
-   aux nodes
+   StringMap.update key (function None -> None | Some _ -> Some node) nodes
 
 let remove_node nodes key =
-   let rec aux = function
-   | []-> raise Not_found
-   | h :: tl when h.Node.key = key -> tl
-   | h :: tl   -> h :: aux tl
-   in
-   aux nodes
+   StringMap.update key (function None -> raise Not_found | Some _ -> 
None) nodes
 
-let create () = []
+let create () = StringMap.empty
 
 let rec iter f tree =
-   let aux node =
-   f node.Node.key node.Node.value;
+   let aux key node =
+   f key node.Node.value;
iter f node.Node.children
in
-   List.iter aux tree
+   StringMap.iter aux tree
 
 let rec map f tree =
let aux node =
@@ -92,13 +84,14 @@ let rec map f tree =
in
{ node with Node.value = value; Node.children = map f 
node.Node.children }
in
-   List.filter (fun n -> n.Node.value <> None || n.Node.children <> []) 
(List.map aux tree)
+   tree |> StringMap.map aux
+   |> StringMap.filter (fun _ n -> n.Node.value <> None || not 
(StringMap.is_empty n.Node.children) )
 
 let rec fold f tree acc =
-   let aux accu node =
-   fold f node.Node.children (f node.Node.key node.Node.value accu)
+   let aux key node accu =
+   fold f node.Node.children (f key node.Node.value accu)
in
-   List.fold_left aux acc tree
+

[PATCH v2 5/6] tools/ocaml/xenstored: use more efficient node trees

2020-08-17 Thread Edwin Török

This changes the output of xenstore-ls to be sorted.
Previously the keys were listed in the order in which they were inserted.
docs/misc/xenstore.txt doesn't specify in what order keys are listed.

Map.update is used to retain the semantics of replace_child:
only an existing child is replaced; if it wasn't part of the original
map, we don't add it.
Similarly, exception behaviour is retained for del_childname and related
functions.
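
The two Map.update patterns can be sketched in isolation as follows. This
uses a plain string-keyed map rather than SymbolMap, and the helper names
`replace_existing` and `remove_existing` are made up for illustration
(Map.update needs OCaml 4.06 or later):

```ocaml
module StringMap = Map.Make (String)

(* Replace the binding for [key] only if one already exists. *)
let replace_existing key v m =
  StringMap.update key (function None -> None | Some _ -> Some v) m

(* Remove [key]; raise Not_found if there was no such binding. *)
let remove_existing key m =
  StringMap.update key (function None -> raise Not_found | Some _ -> None) m

let () =
  let m = StringMap.singleton "a" 1 in
  assert (StringMap.find "a" (replace_existing "a" 2 m) = 2);
  (* An absent key is not added. *)
  assert (not (StringMap.mem "b" (replace_existing "b" 2 m)));
  assert (StringMap.is_empty (remove_existing "a" m))
```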

Entries are stored in reverse sort order, so that upon Map.fold the
constructed list is sorted in ascending order and there is no need for a
List.rev.
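
As a small illustration of why the reversed order avoids the List.rev
(RevString, RevMap and `keys` are illustrative names, not part of the
patch):

```ocaml
(* Keys compare in reverse, like a compare with its arguments swapped. *)
module RevString = struct
  type t = string
  let compare a b = String.compare b a
end

module RevMap = Map.Make (RevString)

(* Map.fold visits keys in increasing order of [compare], which here means
   descending string order; consing each key then builds an ascending list. *)
let keys m = RevMap.fold (fun k _ acc -> k :: acc) m []

let () =
  let m = RevMap.(empty |> add "b" () |> add "c" () |> add "a" ()) in
  assert (keys m = [ "a"; "b"; "c" ])
```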

Signed-off-by: Edwin Török 
---
 tools/ocaml/xenstored/store.ml   | 46 +++-
 tools/ocaml/xenstored/symbol.ml  |  4 +++
 tools/ocaml/xenstored/symbol.mli |  3 +++
 3 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/tools/ocaml/xenstored/store.ml b/tools/ocaml/xenstored/store.ml
index 45659a23ee..d9dfa36045 100644
--- a/tools/ocaml/xenstored/store.ml
+++ b/tools/ocaml/xenstored/store.ml
@@ -16,17 +16,19 @@
  *)
 open Stdext
 
+module SymbolMap = Map.Make(Symbol)
+
 module Node = struct
 
 type t = {
name: Symbol.t;
perms: Perms.Node.t;
value: string;
-   children: t list;
+   children: t SymbolMap.t;
 }
 
 let create _name _perms _value =
-   { name = Symbol.of_string _name; perms = _perms; value = _value; 
children = []; }
+   { name = Symbol.of_string _name; perms = _perms; value = _value; 
children = SymbolMap.empty; }
 
 let get_owner node = Perms.Node.get_owner node.perms
 let get_children node = node.children
@@ -42,38 +44,34 @@ let set_value node nvalue =
 let set_perms node nperms = { node with perms = nperms }
 
 let add_child node child =
-   { node with children = child :: node.children }
+   let children = SymbolMap.add child.name child node.children in
+   { node with children }
 
 let exists node childname =
let childname = Symbol.of_string childname in
-   List.exists (fun n -> Symbol.equal n.name childname) node.children
+   SymbolMap.mem childname node.children
 
 let find node childname =
let childname = Symbol.of_string childname in
-   List.find (fun n -> Symbol.equal n.name childname) node.children
+   SymbolMap.find childname node.children
 
 let replace_child node child nchild =
-   (* this is the on-steroid version of the filter one-replace one *)
-   let rec replace_one_in_list l =
-   match l with
-   | []   -> []
-   | h :: tl when Symbol.equal h.name child.name -> nchild :: tl
-   | h :: tl  -> h :: replace_one_in_list 
tl
-   in
-   { node with children = (replace_one_in_list node.children) }
+   { node with
+ children = SymbolMap.update child.name
+(function None -> None | Some _ -> Some nchild)
+node.children
+   }
 
 let del_childname node childname =
let sym = Symbol.of_string childname in
-   let rec delete_one_in_list l =
-   match l with
-   | []-> raise Not_found
-   | h :: tl when Symbol.equal h.name sym -> tl
-   | h :: tl   -> h :: delete_one_in_list tl
-   in
-   { node with children = (delete_one_in_list node.children) }
+   { node with children =
+   SymbolMap.update sym
+ (function None -> raise Not_found | Some _ -> None)
+ node.children
+   }
 
 let del_all_children node =
-   { node with children = [] }
+   { node with children = SymbolMap.empty }
 
 (* check if the current node can be accessed by the current connection with 
rperm permissions *)
 let check_perm node connection request =
@@ -87,7 +85,7 @@ let check_owner node connection =
raise Define.Permission_denied;
end
 
-let rec recurse fct node = fct node; List.iter (recurse fct) node.children
+let rec recurse fct node = fct node; SymbolMap.iter (fun _ -> recurse fct) 
node.children
 
 let unpack node = (Symbol.to_string node.name, node.perms, node.value)
 
@@ -321,7 +319,7 @@ let ls store perm path =
Node.check_perm cnode perm Perms.READ;
cnode.Node.children in
Path.apply store.root path do_ls in
-   List.rev (List.map (fun n -> Symbol.to_string n.Node.name) children)
+   SymbolMap.fold (fun k _ accu -> Symbol.to_string k :: accu) children []
 
 let getperms store perm path =
if path = [] then
@@ -350,7 +348,7 @@ let traversal root_node f =
let rec _traversal path node =
f path node;
let node_path = Path.of_path_and_name path (Symbol.to_string 
node.Node.name) in
-   List.iter (_traversal node_path) node.Node.children
+   SymbolMap.iter (fun _ -> _traversal node_path)

[PATCH v2 0/6] tools/ocaml/xenstored: simplify code

2020-08-17 Thread Edwin Török

Fix warnings, and delete some obsolete code.
oxenstored contained a hand-rolled GC to perform hash-consing:
this can be done with a lot fewer lines of code by using the built-in Weak
module.

The choice of data structures for trees/tries is not very efficient: they are
just lists. Using a map improves lookup and deletion complexity, and replaces
hand-rolled recursion with higher-level library calls.

There is a lot more that could be done to optimize socket polling:
an epoll backend with a poll fallback, but an API structured around event-based
polling would be better. But first let's drop the legacy select-based code: I
think every modern *nix should have a working poll(3) by now.

Changes since v1:
  * passed some testing
  * fix commit title on 'drop select based'
  * fix missing 'set_node' in 'set' that got lost in conversion to map
  * simplify 'compare' function

Edwin Török (6):
  tools/ocaml/libs/xc: Fix ambiguous documentation comment
  tools/ocaml/xenstored: fix deprecation warning
  tools/ocaml/xenstored: replace hand rolled GC with weak GC references
  tools/ocaml/xenstored: drop select based socket watching
  tools/ocaml/xenstored: use more efficient node trees
  tools/ocaml/xenstored: use more efficient tries

 tools/ocaml/libs/xc/xenctrl.mli   |  2 +
 tools/ocaml/xenstored/Makefile| 12 ++--
 tools/ocaml/xenstored/connection.ml   |  3 -
 tools/ocaml/xenstored/connections.ml  |  2 +-
 tools/ocaml/xenstored/disk.ml |  2 +-
 tools/ocaml/xenstored/history.ml  | 14 
 tools/ocaml/xenstored/parse_arg.ml|  7 +-
 tools/ocaml/xenstored/{select.ml => poll.ml}  | 14 +---
 .../ocaml/xenstored/{select.mli => poll.mli}  | 12 +---
 tools/ocaml/xenstored/store.ml| 49 ++---
 tools/ocaml/xenstored/symbol.ml   | 70 +--
 tools/ocaml/xenstored/symbol.mli  | 22 ++
 tools/ocaml/xenstored/trie.ml | 59 +++-
 tools/ocaml/xenstored/trie.mli| 26 +++
 tools/ocaml/xenstored/xenstored.ml| 20 +-
 15 files changed, 103 insertions(+), 211 deletions(-)
 rename tools/ocaml/xenstored/{select.ml => poll.ml} (85%)
 rename tools/ocaml/xenstored/{select.mli => poll.mli} (58%)

-- 
2.25.1

[PATCH v2 2/6] tools/ocaml/xenstored: fix deprecation warning

2020-08-17 Thread Edwin Török

```
File "xenstored/disk.ml", line 33, characters 9-23:
33 |     let c = Char.lowercase c in
                 ^^^^^^^^^^^^^^
(alert deprecated): Stdlib.Char.lowercase
Use Char.lowercase_ascii instead.
```

Signed-off-by: Edwin Török 
---
 tools/ocaml/xenstored/disk.ml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/ocaml/xenstored/disk.ml b/tools/ocaml/xenstored/disk.ml
index 4739967b61..1ca0e2a95e 100644
--- a/tools/ocaml/xenstored/disk.ml
+++ b/tools/ocaml/xenstored/disk.ml
@@ -30,7 +30,7 @@ let undec c =
| _  -> raise (Failure "undecify")
 
 let unhex c =
-   let c = Char.lowercase c in
+   let c = Char.lowercase_ascii c in
match c with
| '0' .. '9' -> (Char.code c) - (Char.code '0')
| 'a' .. 'f' -> (Char.code c) - (Char.code 'a') + 10
-- 
2.25.1

[PATCH v3 4/6] tools/ocaml/xenstored: drop select based socket watching

2020-08-17 Thread Edwin Török

Poll has been the default since 2014; I think we can safely say by now
that poll() works and we don't need to fall back to select().

This will allow fixing up the way we call poll to be more efficient
(and pave the way for introducing epoll support):
currently poll wraps the select API, which is inefficient.

Signed-off-by: Edwin Török 
---
Changed since v1:
 * fix commit title
---
 tools/ocaml/xenstored/Makefile | 12 ++--
 tools/ocaml/xenstored/parse_arg.ml |  7 ++-
 tools/ocaml/xenstored/{select.ml => poll.ml}   | 14 ++
 tools/ocaml/xenstored/{select.mli => poll.mli} | 12 ++--
 tools/ocaml/xenstored/xenstored.ml |  4 +---
 5 files changed, 13 insertions(+), 36 deletions(-)
 rename tools/ocaml/xenstored/{select.ml => poll.ml} (85%)
 rename tools/ocaml/xenstored/{select.mli => poll.mli} (58%)

diff --git a/tools/ocaml/xenstored/Makefile b/tools/ocaml/xenstored/Makefile
index 68d35c483a..692a62584e 100644
--- a/tools/ocaml/xenstored/Makefile
+++ b/tools/ocaml/xenstored/Makefile
@@ -18,12 +18,12 @@ OCAMLINCLUDE += \
-I $(OCAML_TOPLEVEL)/libs/xc \
-I $(OCAML_TOPLEVEL)/libs/eventchn
 
-LIBS = syslog.cma syslog.cmxa select.cma select.cmxa
+LIBS = syslog.cma syslog.cmxa poll.cma poll.cmxa
 syslog_OBJS = syslog
 syslog_C_OBJS = syslog_stubs
-select_OBJS = select
-select_C_OBJS = select_stubs
-OCAML_LIBRARY = syslog select
+poll_OBJS = poll
+poll_C_OBJS = select_stubs
+OCAML_LIBRARY = syslog poll
 
 LIBS += systemd.cma systemd.cmxa
 systemd_OBJS = systemd
@@ -58,13 +58,13 @@ OBJS = paths \
process \
xenstored
 
-INTF = symbol.cmi trie.cmi syslog.cmi systemd.cmi select.cmi
+INTF = symbol.cmi trie.cmi syslog.cmi systemd.cmi poll.cmi
 
 XENSTOREDLIBS = \
unix.cmxa \
-ccopt -L -ccopt . syslog.cmxa \
-ccopt -L -ccopt . systemd.cmxa \
-   -ccopt -L -ccopt . select.cmxa \
+   -ccopt -L -ccopt . poll.cmxa \
-ccopt -L -ccopt $(OCAML_TOPLEVEL)/libs/mmap 
$(OCAML_TOPLEVEL)/libs/mmap/xenmmap.cmxa \
-ccopt -L -ccopt $(OCAML_TOPLEVEL)/libs/eventchn 
$(OCAML_TOPLEVEL)/libs/eventchn/xeneventchn.cmxa \
-ccopt -L -ccopt $(OCAML_TOPLEVEL)/libs/xc 
$(OCAML_TOPLEVEL)/libs/xc/xenctrl.cmxa \
diff --git a/tools/ocaml/xenstored/parse_arg.ml 
b/tools/ocaml/xenstored/parse_arg.ml
index 1803c3eda0..2c4b5a8528 100644
--- a/tools/ocaml/xenstored/parse_arg.ml
+++ b/tools/ocaml/xenstored/parse_arg.ml
@@ -25,7 +25,6 @@ type config =
tracefile: string option; (* old xenstored compatibility *)
restart: bool;
disable_socket: bool;
-   use_select: bool;
 }
 
 let do_argv =
@@ -37,7 +36,7 @@ let do_argv =
and config_file = ref ""
and restart = ref false
and disable_socket = ref false
-   and use_select = ref false in
+   in
 
let speclist =
[ ("--no-domain-init", Arg.Unit (fun () -> domain_init := 
false),
@@ -54,9 +53,8 @@ let do_argv =
  ("-T", Arg.Set_string tracefile, ""); (* for compatibility *)
  ("--restart", Arg.Set restart, "Read database on starting");
  ("--disable-socket", Arg.Unit (fun () -> disable_socket := 
true), "Disable socket");
- ("--use-select", Arg.Unit (fun () -> use_select := true), 
"Use select instead of poll"); (* for backward compatibility and testing *)
] in
-   let usage_msg = "usage : xenstored [--config-file ] 
[--no-domain-init] [--help] [--no-fork] [--reraise-top-level] [--restart] 
[--disable-socket] [--use-select]" in
+   let usage_msg = "usage : xenstored [--config-file ] 
[--no-domain-init] [--help] [--no-fork] [--reraise-top-level] [--restart] 
[--disable-socket]" in
Arg.parse speclist (fun _ -> ()) usage_msg;
{
domain_init = !domain_init;
@@ -68,5 +66,4 @@ let do_argv =
tracefile = if !tracefile <> "" then Some !tracefile else None;
restart = !restart;
disable_socket = !disable_socket;
-   use_select = !use_select;
}
diff --git a/tools/ocaml/xenstored/select.ml b/tools/ocaml/xenstored/poll.ml
similarity index 85%
rename from tools/ocaml/xenstored/select.ml
rename to tools/ocaml/xenstored/poll.ml
index 0455e163e3..26f8620dfc 100644
--- a/tools/ocaml/xenstored/select.ml
+++ b/tools/ocaml/xenstored/poll.ml
@@ -63,15 +63,5 @@ let poll_select in_fds out_fds exc_fds timeout =
 (if event.except then fd :: x else x))
a r
 
-(* If the use_poll function is not called at all, we default to the original 
Unix.select behavior *)
-let select_fun = ref Unix.select
-
-let use_poll yes =
-   let sel_fun, max_fd =
-   if yes then poll_select, get_sys_fs_nr_open ()
-   else Unix.select, 1024 in
-   select_fun := sel_fun;
-   set_fd_limit max_fd
-
-let select in_fds out_fds exc_fds timeout =
-   (!select_fun) in_fds out

[PATCH v3 6/6] tools/ocaml/xenstored: use more efficient tries

2020-08-17 Thread Edwin Török

No functional change, just an optimization.

Signed-off-by: Edwin Török 
---
Changed since v1:
 * fix missing 'set_node' in 'set' that got lost in conversion to map
 * simplify 'compare' function
---
 tools/ocaml/xenstored/connections.ml |  2 +-
 tools/ocaml/xenstored/symbol.ml  |  6 +--
 tools/ocaml/xenstored/trie.ml| 59 
 tools/ocaml/xenstored/trie.mli   | 26 ++--
 4 files changed, 43 insertions(+), 50 deletions(-)

diff --git a/tools/ocaml/xenstored/connections.ml 
b/tools/ocaml/xenstored/connections.ml
index f02ef6b526..4983c7370b 100644
--- a/tools/ocaml/xenstored/connections.ml
+++ b/tools/ocaml/xenstored/connections.ml
@@ -21,7 +21,7 @@ type t = {
anonymous: (Unix.file_descr, Connection.t) Hashtbl.t;
domains: (int, Connection.t) Hashtbl.t;
ports: (Xeneventchn.t, Connection.t) Hashtbl.t;
-   mutable watches: (string, Connection.watch list) Trie.t;
+   mutable watches: Connection.watch list Trie.t;
 }
 
 let create () = {
diff --git a/tools/ocaml/xenstored/symbol.ml b/tools/ocaml/xenstored/symbol.ml
index 2697915623..85b3f265de 100644
--- a/tools/ocaml/xenstored/symbol.ml
+++ b/tools/ocaml/xenstored/symbol.ml
@@ -31,9 +31,9 @@ let equal a b =
   (* compare using physical equality, both members have to be part of the 
above weak table *)
   a == b
 
-let compare a b =
-  if equal a b then 0
-  else -(String.compare a b)
+(* the sort order is reversed here, so that Map.fold constructs a list
+   in ascending order *)
+let compare a b = String.compare b a
 
 let stats () =
   let len, entries, _, _, _, _ = WeakTable.stats tbl in
diff --git a/tools/ocaml/xenstored/trie.ml b/tools/ocaml/xenstored/trie.ml
index dc42535092..5b4831cf02 100644
--- a/tools/ocaml/xenstored/trie.ml
+++ b/tools/ocaml/xenstored/trie.ml
@@ -13,24 +13,26 @@
  * GNU Lesser General Public License for more details.
  *)
 
+module StringMap = Map.Make(String)
+
 module Node =
 struct
-   type ('a,'b) t =  {
-   key: 'a;
-   value: 'b option;
-   children: ('a,'b) t list;
+   type 'a t =  {
+   key: string;
+   value: 'a option;
+   children: 'a t StringMap.t;
}
 
let _create key value = {
key = key;
value = Some value;
-   children = [];
+   children = StringMap.empty;
}
 
let empty key = {
key = key;
value = None;
-   children = []
+   children = StringMap.empty;
}
 
let _get_key node = node.key
@@ -47,41 +49,31 @@ struct
{ node with children = children }
 
let _add_child node child =
-   { node with children = child :: node.children }
+   { node with children = StringMap.add child.key child 
node.children }
 end
 
-type ('a,'b) t = ('a,'b) Node.t list
+type 'a t = 'a Node.t StringMap.t
 
 let mem_node nodes key =
-   List.exists (fun n -> n.Node.key = key) nodes
+   StringMap.mem key nodes
 
 let find_node nodes key =
-   List.find (fun n -> n.Node.key = key) nodes
+   StringMap.find key nodes
 
 let replace_node nodes key node =
-   let rec aux = function
-   | []-> []
-   | h :: tl when h.Node.key = key -> node :: tl
-   | h :: tl   -> h :: aux tl
-   in
-   aux nodes
+   StringMap.update key (function None -> None | Some _ -> Some node) nodes
 
 let remove_node nodes key =
-   let rec aux = function
-   | []-> raise Not_found
-   | h :: tl when h.Node.key = key -> tl
-   | h :: tl   -> h :: aux tl
-   in
-   aux nodes
+   StringMap.update key (function None -> raise Not_found | Some _ -> 
None) nodes
 
-let create () = []
+let create () = StringMap.empty
 
 let rec iter f tree =
-   let aux node =
-   f node.Node.key node.Node.value;
+   let aux key node =
+   f key node.Node.value;
iter f node.Node.children
in
-   List.iter aux tree
+   StringMap.iter aux tree
 
 let rec map f tree =
let aux node =
@@ -92,13 +84,14 @@ let rec map f tree =
in
{ node with Node.value = value; Node.children = map f 
node.Node.children }
in
-   List.filter (fun n -> n.Node.value <> None || n.Node.children <> []) 
(List.map aux tree)
+   tree |> StringMap.map aux
+   |> StringMap.filter (fun _ n -> n.Node.value <> None || not 
(StringMap.is_empty n.Node.children) )
 
 let rec fold f tree acc =
-   let aux accu node =
-   fold f node.Node.children (f node.Node.key node.Node.value accu)
+   let aux key node accu =
+   fold f node.Node.children (f key node.Node.value accu)
in
-   List.fold_left aux acc tree
+

[PATCH v3 0/6] tools/ocaml/xenstored: simplify code

2020-08-17 Thread Edwin Török

Fix warnings, and delete some obsolete code.
oxenstored contained a hand-rolled GC to perform hash-consing:
this can be done with a lot fewer lines of code by using the built-in Weak
module.

The choice of data structures for trees/tries is not very efficient: they are
just lists. Using a map improves lookup and deletion complexity, and replaces
hand-rolled recursion with higher-level library calls.
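
To make the list-versus-map point concrete, here is a rough sketch of a
trie whose levels are maps. It is simplified compared to trie.ml (no key
field in the node, and `set`/`find` are written from scratch here), so
treat the names and shape as assumptions:

```ocaml
module StringMap = Map.Make (String)

(* One trie level: an optional value plus children indexed by path component. *)
type 'a node = { value : 'a option; children : 'a node StringMap.t }

let empty = { value = None; children = StringMap.empty }

(* Store [v] at [path], creating intermediate nodes as needed. *)
let rec set node path v =
  match path with
  | [] -> { node with value = Some v }
  | key :: rest ->
    let child =
      match StringMap.find_opt key node.children with
      | Some c -> c
      | None -> empty
    in
    { node with children = StringMap.add key (set child rest v) node.children }

(* Look up [path]; each step is a logarithmic map lookup, not a list scan. *)
let rec find node path =
  match path with
  | [] -> node.value
  | key :: rest ->
    (match StringMap.find_opt key node.children with
     | Some c -> find c rest
     | None -> None)

let () =
  let t = set empty [ "local"; "domain"; "0" ] "dom0" in
  assert (find t [ "local"; "domain"; "0" ] = Some "dom0");
  assert (find t [ "local"; "domain" ] = None)
```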

There is a lot more that could be done to optimize socket polling:
an epoll backend with a poll fallback, but an API structured around event-based
polling would be better. But first let's drop the legacy select-based code: I
think every modern *nix should have a working poll(3) by now.

This is a draft series, in need of more testing.
Changes since v1:
* fix bug where a 'set_node' call was missed
* simplify 'compare' code
* fix commit title for 'drop select based'
* passed some testing

Please ignore V2, something went wrong and V2 was nearly identical to V1,
not matching what I had in my git tree.

Edwin Török (6):
  tools/ocaml/libs/xc: Fix ambiguous documentation comment
  tools/ocaml/xenstored: fix deprecation warning
  tools/ocaml/xenstored: replace hand rolled GC with weak GC references
  tools/ocaml/xenstored: drop select based socket watching
  tools/ocaml/xenstored: use more efficient node trees
  tools/ocaml/xenstored: use more efficient tries

 tools/ocaml/libs/xc/xenctrl.mli   |  2 +
 tools/ocaml/xenstored/Makefile| 12 ++--
 tools/ocaml/xenstored/connection.ml   |  3 -
 tools/ocaml/xenstored/connections.ml  |  2 +-
 tools/ocaml/xenstored/disk.ml |  2 +-
 tools/ocaml/xenstored/history.ml  | 14 
 tools/ocaml/xenstored/parse_arg.ml|  7 +-
 tools/ocaml/xenstored/{select.ml => poll.ml}  | 14 +---
 .../ocaml/xenstored/{select.mli => poll.mli}  | 12 +---
 tools/ocaml/xenstored/store.ml| 49 ++---
 tools/ocaml/xenstored/symbol.ml   | 70 +--
 tools/ocaml/xenstored/symbol.mli  | 22 ++
 tools/ocaml/xenstored/trie.ml | 59 +++-
 tools/ocaml/xenstored/trie.mli| 26 +++
 tools/ocaml/xenstored/xenstored.ml| 20 +-
 15 files changed, 103 insertions(+), 211 deletions(-)
 rename tools/ocaml/xenstored/{select.ml => poll.ml} (85%)
 rename tools/ocaml/xenstored/{select.mli => poll.mli} (58%)

-- 
2.25.1

[PATCH v3 2/6] tools/ocaml/xenstored: fix deprecation warning

2020-08-17 Thread Edwin Török

```
File "xenstored/disk.ml", line 33, characters 9-23:
33 |     let c = Char.lowercase c in
                 ^^^^^^^^^^^^^^
(alert deprecated): Stdlib.Char.lowercase
Use Char.lowercase_ascii instead.
```
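
For reference, a standalone sketch of the function after the change. The
final catch-all case is an assumption, since the hunk below does not show
it:

```ocaml
(* Convert one hexadecimal digit to its numeric value. *)
let unhex c =
  match Char.lowercase_ascii c with
  | '0' .. '9' as d -> Char.code d - Char.code '0'
  | 'a' .. 'f' as d -> Char.code d - Char.code 'a' + 10
  | _ -> failwith "unhex" (* assumed error case, not visible in the hunk *)

let () =
  assert (unhex 'F' = 15);
  assert (unhex '7' = 7)
```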

Signed-off-by: Edwin Török 
---
 tools/ocaml/xenstored/disk.ml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/ocaml/xenstored/disk.ml b/tools/ocaml/xenstored/disk.ml
index 4739967b61..1ca0e2a95e 100644
--- a/tools/ocaml/xenstored/disk.ml
+++ b/tools/ocaml/xenstored/disk.ml
@@ -30,7 +30,7 @@ let undec c =
| _  -> raise (Failure "undecify")
 
 let unhex c =
-   let c = Char.lowercase c in
+   let c = Char.lowercase_ascii c in
match c with
| '0' .. '9' -> (Char.code c) - (Char.code '0')
| 'a' .. 'f' -> (Char.code c) - (Char.code 'a') + 10
-- 
2.25.1

[PATCH v3 1/6] tools/ocaml/libs/xc: Fix ambiguous documentation comment

2020-08-17 Thread Edwin Török

Signed-off-by: Edwin Török 
---
 tools/ocaml/libs/xc/xenctrl.mli | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index 26ec7e59b1..f7f6ec570d 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -132,8 +132,10 @@ external interface_close : handle -> unit = 
"stub_xc_interface_close"
  * interface_open and interface_close or with_intf although mixing both
  * is possible *)
 val with_intf : (handle -> 'a) -> 'a
+
 (** [get_handle] returns the global handle used by [with_intf] *)
 val get_handle: unit -> handle option
+
 (** [close handle] closes the handle maintained by [with_intf]. This
  * should only be closed before process exit. It must not be called from
  * a function called directly or indirectly by with_intf as this
-- 
2.25.1

Re: [PATCH v1 5/6] tools/ocaml/xenstored: use more efficient node trees

2020-08-17 Thread Edwin Torok

On Mon, 2020-08-17 at 14:52 +0200, Christian Lindig wrote:
> +let compare a b =
> +  if equal a b then 0
> +  else -(String.compare a b)
> 
> I think this bit could use an inline comment why the sort order is
> reversed. This could be also simplified to -(String.compare a b)
> because this goes to the internal (polymorphic) compare implemented
> in C which does a physical equivalence check first.

Good point, I've dropped the equal, and instead of negating the compare
I swapped its arguments.
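
A quick way to convince oneself that the two forms order strings the same
way (the helper names `negated`, `swapped` and `sign` are just for this
sketch):

```ocaml
(* Reduce a comparison result to -1, 0 or 1. *)
let sign n = compare n 0

let negated a b = -(String.compare a b)
let swapped a b = String.compare b a

let () =
  List.iter
    (fun (a, b) -> assert (sign (negated a b) = sign (swapped a b)))
    [ ("a", "b"); ("b", "a"); ("a", "a"); ("", "a"); ("ab", "a") ]
```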

See V3 of the patch (ignore V2, for some reason it looked nearly
identical to V1, not matching what I had in my git tree,
perhaps git-format-patch didn't overwrite the patches?).

Best regards,
--Edwin

> 
> -- C
> 
> 
> From: Edwin Torok
> Sent: 14 August 2020 23:14
> To: xen-devel@lists.xenproject.org
> Cc: Edwin Torok; Christian Lindig; David Scott; Ian Jackson; Wei Liu
> Subject: [PATCH v1 5/6] tools/ocaml/xenstored: use more efficient
> node trees
> 
> This changes the output of xenstore-ls to be sorted.
> Previously the keys were listed in the order in which they were
> inserted
> in.
> docs/misc/xenstore.txt doesn't specify in what order keys are listed.
> 
> Map.update is used to retain semantics with replace_child:
> only an existing child is replaced, if it wasn't part of the original
> map we don't add it.
> Similarly exception behaviour is retained for del_childname and
> related
> functions.
> 
> Entries are stored in reverse sort order, so that upon Map.fold the
> constructed list is sorted in ascending order and there is no need
> for a
> List.rev.
> 
> Signed-off-by: Edwin Török 
> ---
>  tools/ocaml/xenstored/store.ml   | 46 +++---
> --
>  tools/ocaml/xenstored/symbol.ml  |  4 +++
>  tools/ocaml/xenstored/symbol.mli |  3 +++
>  3 files changed, 29 insertions(+), 24 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/store.ml
> b/tools/ocaml/xenstored/store.ml
> index 45659a23ee..d9dfa36045 100644
> --- a/tools/ocaml/xenstored/store.ml
> +++ b/tools/ocaml/xenstored/store.ml
> @@ -16,17 +16,19 @@
>   *)
>  open Stdext
> 
> +module SymbolMap = Map.Make(Symbol)
> +
>  module Node = struct
> 
>  type t = {
> name: Symbol.t;
> perms: Perms.Node.t;
> value: string;
> -   children: t list;
> +   children: t SymbolMap.t;
>  }
> 
>  let create _name _perms _value =
> -   { name = Symbol.of_string _name; perms = _perms; value =
> _value; children = []; }
> +   { name = Symbol.of_string _name; perms = _perms; value =
> _value; children = SymbolMap.empty; }
> 
>  let get_owner node = Perms.Node.get_owner node.perms
>  let get_children node = node.children
> @@ -42,38 +44,34 @@ let set_value node nvalue =
>  let set_perms node nperms = { node with perms = nperms }
> 
>  let add_child node child =
> -   { node with children = child :: node.children }
> +   let children = SymbolMap.add child.name child node.children
> in
> +   { node with children }
> 
>  let exists node childname =
> let childname = Symbol.of_string childname in
> -   List.exists (fun n -> Symbol.equal n.name childname)
> node.children
> +   SymbolMap.mem childname node.children
> 
>  let find node childname =
> let childname = Symbol.of_string childname in
> -   List.find (fun n -> Symbol.equal n.name childname)
> node.children
> +   SymbolMap.find childname node.children
> 
>  let replace_child node child nchild =
> -   (* this is the on-steroid version of the filter one-replace
> one *)
> -   let rec replace_one_in_list l =
> -   match l with
> -   | []   -> []
> -   | h :: tl when Symbol.equal h.name child.name ->
> nchild :: tl
> -   | h :: tl  -> h ::
> replace_one_in_list tl
> -   in
> -   { node with children = (replace_one_in_list node.children) }
> +   { node with
> + children = SymbolMap.update child.name
> +(function None -> None | Some _ -> Some nchild)
> +node.children
> +   }
> 
>  let del_childname node childname =
> let sym = Symbol.of_string childname in
> -   let rec delete_one_in_list l =
> -   match l with
> -   | []-> raise Not_found
> -   | h :: tl when Symbol.equal h.name sym -> tl
> -   | h :: tl   -> h ::
> delete_one_in_list tl
> -   in
> -   { node with children = (delete_one_in_list node.children) }
> +   { node with children =
> +   SymbolMap.update sym
> + (function None -> raise Not_found | Some _ -> None)
> + node.children
> +   }
> 
>  let del_all_children node =
> -   { node with children = [] }
> +   { node with children = SymbolMap.empty }
> 
>  (* check if the current node can be accessed by the current
> connection

[PATCH v3 5/6] tools/ocaml/xenstored: use more efficient node trees

2020-08-17 Thread Edwin Török

This changes the output of xenstore-ls to be sorted.
Previously the keys were listed in the order in which they were inserted.
docs/misc/xenstore.txt doesn't specify in what order keys are listed.

Map.update is used to retain the semantics of replace_child:
only an existing child is replaced; if it wasn't part of the original
map we don't add it.
Similarly exception behaviour is retained for del_childname and related
functions.

Entries are stored in reverse sort order, so that upon Map.fold the
constructed list is sorted in ascending order and there is no need for a
List.rev.

Signed-off-by: Edwin Török 
---
 tools/ocaml/xenstored/store.ml   | 46 +++-
 tools/ocaml/xenstored/symbol.ml  |  4 +++
 tools/ocaml/xenstored/symbol.mli |  3 +++
 3 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/tools/ocaml/xenstored/store.ml b/tools/ocaml/xenstored/store.ml
index 45659a23ee..d9dfa36045 100644
--- a/tools/ocaml/xenstored/store.ml
+++ b/tools/ocaml/xenstored/store.ml
@@ -16,17 +16,19 @@
  *)
 open Stdext
 
+module SymbolMap = Map.Make(Symbol)
+
 module Node = struct
 
 type t = {
name: Symbol.t;
perms: Perms.Node.t;
value: string;
-   children: t list;
+   children: t SymbolMap.t;
 }
 
 let create _name _perms _value =
-   { name = Symbol.of_string _name; perms = _perms; value = _value; children = []; }
+   { name = Symbol.of_string _name; perms = _perms; value = _value; children = SymbolMap.empty; }
 
 let get_owner node = Perms.Node.get_owner node.perms
 let get_children node = node.children
@@ -42,38 +44,34 @@ let set_value node nvalue =
 let set_perms node nperms = { node with perms = nperms }
 
 let add_child node child =
-   { node with children = child :: node.children }
+   let children = SymbolMap.add child.name child node.children in
+   { node with children }
 
 let exists node childname =
let childname = Symbol.of_string childname in
-   List.exists (fun n -> Symbol.equal n.name childname) node.children
+   SymbolMap.mem childname node.children
 
 let find node childname =
let childname = Symbol.of_string childname in
-   List.find (fun n -> Symbol.equal n.name childname) node.children
+   SymbolMap.find childname node.children
 
 let replace_child node child nchild =
-   (* this is the on-steroid version of the filter one-replace one *)
-   let rec replace_one_in_list l =
-   match l with
-   | []   -> []
-   | h :: tl when Symbol.equal h.name child.name -> nchild :: tl
-   | h :: tl  -> h :: replace_one_in_list tl
-   in
-   { node with children = (replace_one_in_list node.children) }
+   { node with
+ children = SymbolMap.update child.name
+(function None -> None | Some _ -> Some nchild)
+node.children
+   }
 
 let del_childname node childname =
let sym = Symbol.of_string childname in
-   let rec delete_one_in_list l =
-   match l with
-   | []-> raise Not_found
-   | h :: tl when Symbol.equal h.name sym -> tl
-   | h :: tl   -> h :: delete_one_in_list tl
-   in
-   { node with children = (delete_one_in_list node.children) }
+   { node with children =
+   SymbolMap.update sym
+ (function None -> raise Not_found | Some _ -> None)
+ node.children
+   }
 
 let del_all_children node =
-   { node with children = [] }
+   { node with children = SymbolMap.empty }
 
 (* check if the current node can be accessed by the current connection with rperm permissions *)
 let check_perm node connection request =
@@ -87,7 +85,7 @@ let check_owner node connection =
raise Define.Permission_denied;
end
 
-let rec recurse fct node = fct node; List.iter (recurse fct) node.children
+let rec recurse fct node = fct node; SymbolMap.iter (fun _ -> recurse fct) node.children
 
 let unpack node = (Symbol.to_string node.name, node.perms, node.value)
 
@@ -321,7 +319,7 @@ let ls store perm path =
Node.check_perm cnode perm Perms.READ;
cnode.Node.children in
Path.apply store.root path do_ls in
-   List.rev (List.map (fun n -> Symbol.to_string n.Node.name) children)
+   SymbolMap.fold (fun k _ accu -> Symbol.to_string k :: accu) children []
 
 let getperms store perm path =
if path = [] then
@@ -350,7 +348,7 @@ let traversal root_node f =
let rec _traversal path node =
f path node;
let node_path = Path.of_path_and_name path (Symbol.to_string node.Node.name) in
-   List.iter (_traversal node_path) node.Node.children
+   SymbolMap.iter (fun _ -> _traversal node_path)

[PATCH v3 3/6] tools/ocaml/xenstored: replace hand rolled GC with weak GC references

2020-08-17 Thread Edwin Török

The code here is attempting to reduce memory usage by sharing common
substrings in the tree: it replaces strings with ints, and keeps a
string->int map that gets manually garbage collected using a hand-rolled
mark and sweep algorithm.

This is unnecessary: OCaml already has a mark-and-sweep Garbage
Collector runtime, and sharing of common strings in tree nodes
can be achieved through Weak references: if the string hasn't been seen
yet it gets added to the Weak reference table, and if it has we use the
entry from the table instead, thus storing a string only once.
When the string is no longer referenced OCaml's GC will drop it from the
weak table: there is no need to manually do a mark-and-sweep, or to tell
OCaml when to drop it.

Signed-off-by: Edwin Török 
---
 tools/ocaml/xenstored/connection.ml |  3 --
 tools/ocaml/xenstored/history.ml| 14 --
 tools/ocaml/xenstored/store.ml  | 11 ++---
 tools/ocaml/xenstored/symbol.ml | 68 ++---
 tools/ocaml/xenstored/symbol.mli| 21 ++---
 tools/ocaml/xenstored/xenstored.ml  | 16 +--
 6 files changed, 24 insertions(+), 109 deletions(-)

diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
index 24750ada43..aa6dd95501 100644
--- a/tools/ocaml/xenstored/connection.ml
+++ b/tools/ocaml/xenstored/connection.ml
@@ -271,9 +271,6 @@ let has_more_work con =
 
 let incr_ops con = con.stat_nb_ops <- con.stat_nb_ops + 1
 
-let mark_symbols con =
-   Hashtbl.iter (fun _ t -> Store.mark_symbols (Transaction.get_store t)) con.transactions
-
 let stats con =
Hashtbl.length con.watches, con.stat_nb_ops
 
diff --git a/tools/ocaml/xenstored/history.ml b/tools/ocaml/xenstored/history.ml
index f39565bff5..029802bd15 100644
--- a/tools/ocaml/xenstored/history.ml
+++ b/tools/ocaml/xenstored/history.ml
@@ -22,20 +22,6 @@ type history_record = {
 
 let history : history_record list ref = ref []
 
-(* Called from periodic_ops to ensure we don't discard symbols that are still needed. *)
-(* There is scope for optimisation here, since in consecutive commits one commit's `after`
- * is the same thing as the next commit's `before`, but not all commits in history are
- * consecutive. *)
-let mark_symbols () =
-   (* There are gaps where dom0's commits are missing. Otherwise we could assume that
-* each element's `before` is the same thing as the next element's `after`
-* since the next element is the previous commit *)
-   List.iter (fun hist_rec ->
-   Store.mark_symbols hist_rec.before;
-   Store.mark_symbols hist_rec.after;
-   )
-   !history
-
 (* Keep only enough commit-history to protect the running transactions that we are still tracking *)
 (* There is scope for optimisation here, replacing List.filter with something more efficient,
  * probably on a different list-like structure. *)
diff --git a/tools/ocaml/xenstored/store.ml b/tools/ocaml/xenstored/store.ml
index f299ec6461..45659a23ee 100644
--- a/tools/ocaml/xenstored/store.ml
+++ b/tools/ocaml/xenstored/store.ml
@@ -46,18 +46,18 @@ let add_child node child =
 
 let exists node childname =
let childname = Symbol.of_string childname in
-   List.exists (fun n -> n.name = childname) node.children
+   List.exists (fun n -> Symbol.equal n.name childname) node.children
 
 let find node childname =
let childname = Symbol.of_string childname in
-   List.find (fun n -> n.name = childname) node.children
+   List.find (fun n -> Symbol.equal n.name childname) node.children
 
 let replace_child node child nchild =
(* this is the on-steroid version of the filter one-replace one *)
let rec replace_one_in_list l =
match l with
| []   -> []
-   | h :: tl when h.name = child.name -> nchild :: tl
+   | h :: tl when Symbol.equal h.name child.name -> nchild :: tl
| h :: tl  -> h :: replace_one_in_list tl
in
{ node with children = (replace_one_in_list node.children) }
@@ -67,7 +67,7 @@ let del_childname node childname =
let rec delete_one_in_list l =
match l with
| []-> raise Not_found
-   | h :: tl when h.name = sym -> tl
+   | h :: tl when Symbol.equal h.name sym -> tl
| h :: tl   -> h :: delete_one_in_list tl
in
{ node with children = (delete_one_in_list node.children) }
@@ -463,9 +463,6 @@ let copy store = {
quota = Quota.copy store.quota;
 }
 
-let mark_symbols store =
-   Node.recurse (fun node -> Symbol.mark_as_used node.Node.name) store.root
-
 let incr_transaction_coalesce store =
store.stat_transaction_coalesce <- store.stat_transaction_coalesce + 1
 let incr_transaction_abort store =
diff --git a/tools/

Re: [RFC PATCH V1 05/12] hvm/dm: Introduce xendevicemodel_set_irq_level DM op

2020-08-17 Thread Stefano Stabellini

On Mon, 17 Aug 2020, Jan Beulich wrote:
> On 07.08.2020 23:50, Stefano Stabellini wrote:
> > On Fri, 7 Aug 2020, Jan Beulich wrote:
> >> On 07.08.2020 01:49, Stefano Stabellini wrote:
> >>> On Thu, 6 Aug 2020, Julien Grall wrote:
>  On 06/08/2020 01:37, Stefano Stabellini wrote:
> > On Wed, 5 Aug 2020, Julien Grall wrote:
> >> On 05/08/2020 00:22, Stefano Stabellini wrote:
> >>> On Mon, 3 Aug 2020, Oleksandr Tyshchenko wrote:
>  From: Oleksandr Tyshchenko 
> 
>  This patch adds ability to the device emulator to notify otherend
>  (some entity running in the guest) using a SPI and implements Arm
>  specific bits for it. Proposed interface allows emulator to set
>  the logical level of a one of a domain's IRQ lines.
> 
>  Please note, this is a split/cleanup of Julien's PoC:
>  "Add support for Guest IO forwarding to a device emulator"
> 
>  Signed-off-by: Julien Grall 
>  Signed-off-by: Oleksandr Tyshchenko 
>  ---
> tools/libs/devicemodel/core.c   | 18
>  ++
> tools/libs/devicemodel/include/xendevicemodel.h |  4 
> tools/libs/devicemodel/libxendevicemodel.map|  1 +
> xen/arch/arm/dm.c   | 22
>  +-
> xen/common/hvm/dm.c |  1 +
> xen/include/public/hvm/dm_op.h  | 15
>  +++
> 6 files changed, 60 insertions(+), 1 deletion(-)
> 
>  diff --git a/tools/libs/devicemodel/core.c
>  b/tools/libs/devicemodel/core.c
>  index 4d40639..30bd79f 100644
>  --- a/tools/libs/devicemodel/core.c
>  +++ b/tools/libs/devicemodel/core.c
>  @@ -430,6 +430,24 @@ int xendevicemodel_set_isa_irq_level(
> return xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
> }
> +int xendevicemodel_set_irq_level(
>  +xendevicemodel_handle *dmod, domid_t domid, uint32_t irq,
>  +unsigned int level)
> >>>
> >>> It is a pity that having xen_dm_op_set_pci_intx_level and
> >>> xen_dm_op_set_isa_irq_level already we need to add a third one, but 
> >>> from
> >>> the names alone I don't think we can reuse either of them.
> >>
> >> The problem is not the name...
> >>
> >>>
> >>> It is very similar to set_isa_irq_level. We could almost rename
> >>> xendevicemodel_set_isa_irq_level to xendevicemodel_set_irq_level or,
> >>> better, just add an alias to it so that xendevicemodel_set_irq_level 
> >>> is
> >>> implemented by calling xendevicemodel_set_isa_irq_level. Honestly I am
> >>> not sure if it is worth doing it though. Any other opinions?
> >>
> >> ... the problem is the interrupt field is only 8-bit. So we would only 
> >> be
> >> able
> >> to cover IRQ 0 - 255.
> >
> > Argh, that's not going to work :-(  I wasn't sure if it was a good idea
> > anyway.
> >
> >
> >> It is not entirely clear how the existing subop could be extended 
> >> without
> >> breaking existing callers.
> >>
> >>> But I think we should plan for not needing two calls (one to set level
> >>> to 1, and one to set it to 0):
> >>> https://marc.info/?l=xen-devel&m=159535112027405
> >>
> >> I am not sure to understand your suggestion here? Are you suggesting to
> >> remove
> >> the 'level' parameter?
> >
> > My hope was to make it optional to call the hypercall with level = 0,
> > not necessarily to remove 'level' from the struct.
> 
>  From my understanding, the hypercall is meant to represent the status of 
>  the
>  line between the device and the interrupt controller (either low or 
>  high).
> 
>  This is then up to the interrupt controller to decide when the interrupt 
>  is
>  going to be fired:
>    - For edge interrupt, this will fire when the line move from low to 
>  high (or
>  vice versa).
>    - For level interrupt, this will fire when line is high (assuming level
>  trigger high) and will keeping firing until the device decided to lower 
>  the
>  line.
> 
>  For a device, it is common to keep the line high until an OS wrote to a
>  specific register.
> 
>  Furthermore, technically, the guest OS is in charge to configure how an
>  interrupt is triggered. Admittely this information is part of the DT, but
>  nothing prevent a guest to change it.
> 
>  As side note, we have a workaround in Xen for some buggy DT (see the arch
>  timer) exposing the wrong trigger type.
> 
>  Because of that, I don't really see a way to make optional. Maybe you 
>  have
>  something different in mind?
> >>>
> >>> For level, we need the level

Re: [PATCH] xen: Introduce cmpxchg64() and guest_cmpxchg64()

2020-08-17 Thread Stefano Stabellini

On Sat, 15 Aug 2020, Julien Grall wrote:
> From: Julien Grall 
> 
> The IOREQ code is using cmpxchg() with 64-bit value. At the moment, this
> is x86 code, but there is plan to make it common.
> 
> To cater 32-bit arch, introduce two new helpers to deal with 64-bit
> cmpxchg.
> 
> The Arm 32-bit implementation of cmpxchg64() is based on the __cmpxchg64
> in Linux v5.8 (arch/arm/include/asm/cmpxchg.h).
> 
> Signed-off-by: Julien Grall 
> Cc: Oleksandr Tyshchenko 
> ---
>  xen/include/asm-arm/arm32/cmpxchg.h | 68 +
>  xen/include/asm-arm/arm64/cmpxchg.h |  5 +++
>  xen/include/asm-arm/guest_atomics.h | 22 ++
>  xen/include/asm-x86/guest_atomics.h |  2 +
>  xen/include/asm-x86/x86_64/system.h |  2 +
>  5 files changed, 99 insertions(+)
> 
> diff --git a/xen/include/asm-arm/arm32/cmpxchg.h 
> b/xen/include/asm-arm/arm32/cmpxchg.h
> index 0770f272ee99..5e2fa6ee38a0 100644
> --- a/xen/include/asm-arm/arm32/cmpxchg.h
> +++ b/xen/include/asm-arm/arm32/cmpxchg.h
> @@ -87,6 +87,38 @@ __CMPXCHG_CASE(b, 1)
>  __CMPXCHG_CASE(h, 2)
>  __CMPXCHG_CASE( , 4)
>  
> +static inline bool __cmpxchg_case_8(volatile uint64_t *ptr,
> + uint64_t *old,
> + uint64_t new,
> + bool timeout,
> + unsigned int max_try)
> +{
> + uint64_t oldval;
> + uint64_t res;
> +
> + do {
> + asm volatile(
> + "   ldrexd  %1, %H1, [%3]\n"
> + "   teq %1, %4\n"
> + "   teqeq   %H1, %H4\n"
> + "   movne   %0, #0\n"
> + "   movne   %H0, #0\n"
> + "   bne 2f\n"
> + "   strexd  %0, %5, %H5, [%3]\n"
> + "   teq %0, #0\n"

Apologies if I am misreading this code, but this last "teq" instruction
doesn't seem to be useful?


> + "2:"
> + : "=&r" (res), "=&r" (oldval), "+Qo" (*ptr)
  ^ not used ?


> + : "r" (ptr), "r" (*old), "r" (new)
> + : "memory", "cc");
> + if (!res)
> + break;
> + } while (!timeout || ((--max_try) > 0));
> +
> + *old = oldval;
> +
> + return !res;
> +}
> +
>  static always_inline bool __int_cmpxchg(volatile void *ptr, unsigned long 
> *old,
>   unsigned long new, int size,
>   bool timeout, unsigned int max_try)
> @@ -156,6 +188,30 @@ static always_inline bool __cmpxchg_mb_timeout(volatile 
> void *ptr,
>   return ret;
>  }
>  
> +/*
> + * The helper may fail to update the memory if the action takes too long.
> + *
> + * @old: On call the value pointed contains the expected old value. It will 
> be
> + * updated to the actual old value.
> + * @max_try: Maximum number of iterations
> + *
> + * The helper will return true when the update has succeeded (i.e no
> + * timeout) and false if the update has failed.
> + */
> +static always_inline bool __cmpxchg64_mb_timeout(volatile uint64_t *ptr,
> +  uint64_t *old,
> +  uint64_t new,
> +  unsigned int max_try)
> +{
> + bool ret;
> +
> + smp_mb();
> + ret = __cmpxchg_case_8(ptr, old, new, true, max_try);
> + smp_mb();
> +
> + return ret;
> +}
> +
>  #define cmpxchg(ptr,o,n) \
>   ((__typeof__(*(ptr)))__cmpxchg_mb((ptr),\
> (unsigned long)(o),   \
> @@ -167,6 +223,18 @@ static always_inline bool __cmpxchg_mb_timeout(volatile 
> void *ptr,
>  (unsigned long)(o),  \
>  (unsigned long)(n),  \
>  sizeof(*(ptr))))
> +
> +static inline uint64_t cmpxchg64(volatile uint64_t *ptr,
> +  uint64_t old,
> +  uint64_t new)
> +{
> + smp_mb();

Looking at the existing code, I noticed that we don't have a
corresponding smp_mb(); in this position. Is it needed here because of
the 64bit-ness?
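
For reference while reading the quoted helpers, here is a minimal userspace
sketch of the contract their comments describe: *old holds the expected value
on entry and the observed value on return, and the caller learns whether the
store happened. It uses C11 <stdatomic.h> instead of the LDREXD/STREXD loop in
the patch, the names cmpxchg64_try_sketch() and cmpxchg64_sketch() are
hypothetical, and the default seq_cst ordering merely stands in for the
smp_mb() pair. It does not settle whether the leading barrier is required on
arm32.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for __cmpxchg_case_8(): on entry *old is the
 * expected value, on return it is the value that was found in memory.
 * Returns true only if new_val was stored.
 */
static inline bool cmpxchg64_try_sketch(_Atomic uint64_t *ptr,
                                        uint64_t *old,
                                        uint64_t new_val)
{
    assert(ptr != NULL && old != NULL);
    /* On failure C11 writes the observed value back into *old. */
    return atomic_compare_exchange_strong(ptr, old, new_val);
}

/*
 * Hypothetical stand-in for cmpxchg64(): always returns the value that
 * was in memory before the attempt, whether or not the swap happened.
 */
static inline uint64_t cmpxchg64_sketch(_Atomic uint64_t *ptr,
                                        uint64_t old,
                                        uint64_t new_val)
{
    cmpxchg64_try_sketch(ptr, &old, new_val);
    return old;
}
```

A caller detects success by comparing the return value of cmpxchg64_sketch()
with the expected value it passed in, which is the same convention the
existing cmpxchg() macro follows.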


> + if (!__cmpxchg_case_8(ptr, &old, new, false, 0))
> + ASSERT_UNREACHABLE();
> +
> + return old;
> +}
> +
>  #endif
>  /*
>   * Local variables:
> diff --git a/xen/include/asm-arm/arm64/cmpxchg.h 
> b/xen/include/asm-arm/arm64/cmpxchg.h
> index fc5c60f0bd74..de9cd0ee2b07 100644
> --- a/xen/include/asm-arm/arm64/cmpxchg.h
> +++ b/xen/include/asm-arm/arm64/cmpxchg.h
> @@ -187,6 +187,11 @@ static always_inline bool __cmpxchg_mb_timeout(volatile 
> void *ptr,
>   __ret; \
>  })
>  
> +#define cmpxchg64(ptr, o, n) cmpxchg(ptr, o, n)
> +
> +#define __cmpx

Re: [PATCH 05/14] kernel-doc: public/features.h

2020-08-17 Thread Stefano Stabellini

On Mon, 17 Aug 2020, Jan Beulich wrote:
> On 07.08.2020 23:52, Stefano Stabellini wrote:
> > On Fri, 7 Aug 2020, Jan Beulich wrote:
> >> On 07.08.2020 01:49, Stefano Stabellini wrote:
> >>> @@ -41,19 +41,25 @@
> >>>   * XENFEAT_dom0 MUST be set if the guest is to be booted as dom0,
> >>>   */
> >>>  
> >>> -/*
> >>> - * If set, the guest does not need to write-protect its pagetables, and 
> >>> can
> >>> - * update them via direct writes.
> >>> +/**
> >>> + * DOC: XENFEAT_writable_page_tables
> >>> + *
> >>> + * If set, the guest does not need to write-protect its pagetables, and
> >>> + * can update them via direct writes.
> >>>   */
> >>>  #define XENFEAT_writable_page_tables   0
> >>
> >> I dislike such redundancy (and it's more noticable here than with
> >> the struct-s). Is there really no way for the tool to find the
> >> right item, the more that in the cover letter you say that you
> >> even need to get the placement right, i.e. there can't be e.g.
> >> intervening #define-s?
> > 
> > Let me clarify that the right placement (nothing between the comment and
> > the following structure) is important for structs, typedefs, etc., but
> > not for "DOC". DOC is freeform and doesn't have to be followed by
> > anything specifically.
> > 
> > 
> > In regards to the redundancy, there is only another option, that I
> > didn't choose because it leads to worse documents being generated.
> > However, they are still readable, so if the agreement is to use the
> > other format, I would be OK with it.
> > 
> > 
> > The other format is the keyword "macro" (this one would have to have the
> > right placement, straight on top of the #define):
> > 
> > /**
> >  * macro XENFEAT_writable_page_tables
> >  *
> >  * If set, the guest does not need to write-protect its pagetables, and
> >  * can update them via direct writes.
> >  */
> > 
> > 
> > Which could be further simplified to:
> > 
> > /**
> >  * macro
> >  *
> >  * If set, the guest does not need to write-protect its pagetables, and
> >  * can update them via direct writes.
> >  */
> > 
> > 
> > In terms of redundancy, that's the best we can do.
> > 
> > The reason why I say it is not optimal is that with DOC the pleudo-html
> > generated via sphinx is:
> > 
> > ---
> > * XENFEAT_writable_page_tables *
> > 
> > If set, the guest does not need to write-protect its pagetables, and
> > can update them via direct writes.
> > ---
> > 
> > While with macro, two () parenthesis gets added to the title, and also an
> > empty "Parameters" section gets added, like this:
> > 
> > ---
> > * XENFEAT_writable_page_tables() *
> > 
> > ** Parameters **
> > 
> > ** Description **
> > 
> > If set, the guest does not need to write-protect its pagetables, and
> > can update them via direct writes.
> > ---
> > 
> > 
> > I think it could be confusing to the user: it looks like a macro with
> > parameters, which is not what we want.
> 
> Agreed, so ...
> 
> > For that reason, I think we should stick with "DOC" for now.
> 
> ... if there are no (better) alternatives we'll have to live with the
> redundancy.

Thanks Jan. I would prefer to get this series in as is (with the other
minor changes we discussed) as basic enablement for kernel-doc. I
volunteer to have a look into this issue and try to come up with a
better alternative afterward.

Re: [PATCH 08/14] kernel-doc: public/memory.h

2020-08-17 Thread Stefano Stabellini

On Mon, 17 Aug 2020, Jan Beulich wrote:
> On 07.08.2020 23:51, Stefano Stabellini wrote:
> > On Fri, 7 Aug 2020, Jan Beulich wrote:
> >> On 07.08.2020 01:49, Stefano Stabellini wrote:
> >>> From: Stefano Stabellini 
> >>>
> >>> Convert in-code comments to kernel-doc format wherever possible.
> >>>
> >>> Signed-off-by: Stefano Stabellini 
> >>> ---
> >>>  xen/include/public/memory.h | 232 
> >>>  1 file changed, 155 insertions(+), 77 deletions(-)
> >>>
> >>> diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
> >>> index 21057ed78e..4c57ed213c 100644
> >>> --- a/xen/include/public/memory.h
> >>> +++ b/xen/include/public/memory.h
> >>> @@ -30,7 +30,9 @@
> >>>  #include "xen.h"
> >>>  #include "physdev.h"
> >>>  
> >>> -/*
> >>> +/**
> >>> + * DOC: XENMEM_increase_reservation and XENMEM_decrease_reservation
> >>> + *
> >>>   * Increase or decrease the specified domain's memory reservation. 
> >>> Returns the
> >>>   * number of extents successfully allocated or freed.
> >>>   * arg == addr of struct xen_memory_reservation.
> >>> @@ -40,29 +42,37 @@
> >>>  #define XENMEM_populate_physmap 6
> >>>  
> >>>  #if __XEN_INTERFACE_VERSION__ >= 0x00030209
> >>> -/*
> >>> - * Maximum # bits addressable by the user of the allocated region (e.g., 
> >>> I/O
> >>> - * devices often have a 32-bit limitation even in 64-bit systems). If 
> >>> zero
> >>> - * then the user has no addressing restriction. This field is not used by
> >>> - * XENMEM_decrease_reservation.
> >>> +/**
> >>> + * DOC: XENMEMF_*
> >>> + *
> >>> + * - XENMEMF_address_bits, XENMEMF_get_address_bits:
> >>> + *   Maximum # bits addressable by the user of the allocated region
> >>> + *   (e.g., I/O devices often have a 32-bit limitation even in 64-bit
> >>> + *   systems). If zero then the user has no addressing restriction. 
> >>> This
> >>> + *   field is not used by XENMEM_decrease_reservation.
> >>> + * - XENMEMF_node, XENMEMF_get_node: NUMA node to allocate from
> >>> + * - XENMEMF_populate_on_demand: Flag to populate physmap with 
> >>> populate-on-demand entries
> >>> + * - XENMEMF_exact_node_request, XENMEMF_exact_node: Flag to request 
> >>> allocation only from the node specified
> >>
> >> Nit: overly long line
> > 
> > I'll fix
> > 
> > 
> >>> + * - XENMEMF_vnode: Flag to indicate the node specified is virtual node
> >>>   */
> >>>  #define XENMEMF_address_bits(x) (x)
> >>>  #define XENMEMF_get_address_bits(x) ((x) & 0xffu)
> >>> -/* NUMA node to allocate from. */
> >>>  #define XENMEMF_node(x) (((x) + 1) << 8)
> >>>  #define XENMEMF_get_node(x) ((((x) >> 8) - 1) & 0xffu)
> >>> -/* Flag to populate physmap with populate-on-demand entries */
> >>>  #define XENMEMF_populate_on_demand (1<<16)
> >>> -/* Flag to request allocation only from the node specified */
> >>>  #define XENMEMF_exact_node_request  (1<<17)
> >>>  #define XENMEMF_exact_node(n) (XENMEMF_node(n) | 
> >>> XENMEMF_exact_node_request)
> >>> -/* Flag to indicate the node specified is virtual node */
> >>>  #define XENMEMF_vnode  (1<<18)
> >>>  #endif
> >>>  
> >>> +/**
> >>> + * struct xen_memory_reservation
> >>> + */
> >>>  struct xen_memory_reservation {
> >>>  
> >>> -/*
> >>> +/**
> >>> + * @extent_start:
> >>> + *
> >>
> >> Take the opportunity and drop the stray blank line?
> >  
> > Sure
> > 
> > 
> >>> @@ -200,90 +236,115 @@ DEFINE_XEN_GUEST_HANDLE(xen_machphys_mfn_list_t);
> >>>   */
> >>>  #define XENMEM_machphys_compat_mfn_list 25
> >>>  
> >>> -/*
> >>> +#define XENMEM_machphys_mapping 12
> >>> +/**
> >>> + * struct xen_machphys_mapping - XENMEM_machphys_mapping
> >>> + *
> >>>   * Returns the location in virtual address space of the machine_to_phys
> >>>   * mapping table. Architectures which do not have a m2p table, or which 
> >>> do not
> >>>   * map it by default into guest address space, do not implement this 
> >>> command.
> >>>   * arg == addr of xen_machphys_mapping_t.
> >>>   */
> >>> -#define XENMEM_machphys_mapping 12
> >>>  struct xen_machphys_mapping {
> >>> +/** @v_start: Start virtual address */
> >>>  xen_ulong_t v_start, v_end; /* Start and end virtual addresses.   */
> >>> -xen_ulong_t max_mfn;/* Maximum MFN that can be looked up. */
> >>> +/** @v_end: End virtual addresses */
> >>> +xen_ulong_t v_end;
> >>> +/** @max_mfn: Maximum MFN that can be looked up */
> >>> +xen_ulong_t max_mfn;
> >>>  };
> >>>  typedef struct xen_machphys_mapping xen_machphys_mapping_t;
> >>>  DEFINE_XEN_GUEST_HANDLE(xen_machphys_mapping_t);
> >>>  
> >>> -/* Source mapping space. */
> >>> +/**
> >>> + * DOC: Source mapping space.
> >>> + *
> >>> + * - XENMAPSPACE_shared_info:  shared info page
> >>> + * - XENMAPSPACE_grant_table:  grant table page
> >>> + * - XENMAPSPACE_gmfn: GMFN
> >>> + * - XENMAPSPACE_gmfn_range:   GMFN range, XENMEM_add_to_physmap only.
> >>> + * - XENMAPSPACE_gmfn_foreign: GMFN from another dom,
> >>>

[PATCH] xen/arm: Missing N1/A76/A75 FP registers in vCPU context switch

2020-08-17 Thread Wei Chen

Xen has cpu_has_fp/cpu_has_simd to detect whether the CPU supports
FP/SIMD or not. But currently, these two macros only treat a value of 0
in ID_AA64PFR0_EL1.FP/SIMD as FP/SIMD being implemented. For CPUs
that support FP/SIMD together with half-precision floating-point, the
ID_AA64PFR0_EL1.FP/SIMD fields are 1. Xen treats such CPUs as having
no FP/SIMD support, so vfp_save/restore_state does not take effect.

Unfortunately, Neoverse N1 and Cortex-A76/A75 are CPUs that support
FP/SIMD and half-precision floating-point. Their ID_AA64PFR0_EL1.FP/SIMD
fields are 1 (see Arm ARM DDI0487F.b, D13.2.64). As a result, on
N1/A76/A75 platforms, Xen never saves or restores the floating-point
registers. If different vCPUs run on the same pCPU, the floating-point
registers will be corrupted randomly.

This patch fixes Xen on these new cores.
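
The change in condition can be illustrated with a small standalone sketch.
This is not Xen code: pfr0_field(), sketch_has_fp() and the other names are
hypothetical, and the raw ID_AA64PFR0_EL1 value is decoded by hand instead of
going through boot_cpu_feature64(). It assumes the architectural encoding in
which 0 means implemented, 1 means implemented with half-precision support,
and 0xf means not implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ID_AA64PFR0_EL1 layout: FP is bits [19:16], AdvSIMD is bits [23:20]. */
#define PFR0_FP_SHIFT   16
#define PFR0_SIMD_SHIFT 20
#define PFR0_FIELD_MASK 0xfu

/* Hypothetical helper: extract one 4-bit ID field from a raw value. */
static inline unsigned int pfr0_field(uint64_t pfr0, unsigned int shift)
{
    assert(shift <= 60);
    return (unsigned int)((pfr0 >> shift) & PFR0_FIELD_MASK);
}

/* Pre-patch condition: only the value 0 counts as "FP present". */
static inline bool sketch_has_fp_old(uint64_t pfr0)
{
    return pfr0_field(pfr0, PFR0_FP_SHIFT) == 0;
}

/* Post-patch condition: 0 and 1 both count, 0xf (absent) does not. */
static inline bool sketch_has_fp(uint64_t pfr0)
{
    return pfr0_field(pfr0, PFR0_FP_SHIFT) <= 1;
}

static inline bool sketch_has_simd(uint64_t pfr0)
{
    return pfr0_field(pfr0, PFR0_SIMD_SHIFT) <= 1;
}
```

With a register value whose FP and AdvSIMD fields are both 1, the old check
reports no FP while the new one reports FP and SIMD present. A value with
both fields set to 0xf is still rejected because the field is compared as an
unsigned quantity.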

Signed-off-by: Wei Chen 
---
 xen/include/asm-arm/cpufeature.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h
index 674beb0353..588089e5ae 100644
--- a/xen/include/asm-arm/cpufeature.h
+++ b/xen/include/asm-arm/cpufeature.h
@@ -13,8 +13,8 @@
 #define cpu_has_el2_64    (boot_cpu_feature64(el2) >= 1)
 #define cpu_has_el3_32    (boot_cpu_feature64(el3) == 2)
 #define cpu_has_el3_64    (boot_cpu_feature64(el3) >= 1)
-#define cpu_has_fp        (boot_cpu_feature64(fp) == 0)
-#define cpu_has_simd      (boot_cpu_feature64(simd) == 0)
+#define cpu_has_fp        (boot_cpu_feature64(fp) <= 1)
+#define cpu_has_simd      (boot_cpu_feature64(simd) <= 1)
 #define cpu_has_gicv3     (boot_cpu_feature64(gic) == 1)
 #endif

--
2.17.1


[RFC PATCH] xen/gntdev.c: Convert get_user_pages() to pin_user_pages()

2020-08-17 Thread Souptick Joarder

In 2019, we introduced pin_user_pages*() and now we are converting
get_user_pages*() to the new API as appropriate. See [1] and [2]
for more information. This is case 5 as per document [1].

[1] Documentation/core-api/pin_user_pages.rst

[2] "Explicit pinning of user-space pages":
https://lwn.net/Articles/807108/

Signed-off-by: Souptick Joarder 
Cc: John Hubbard 
---
 drivers/xen/gntdev.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 64a9025a..e480509 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -730,7 +730,7 @@ static int gntdev_get_page(struct gntdev_copy_batch *batch, void __user *virt,
unsigned long xen_pfn;
int ret;
 
-   ret = get_user_pages_fast(addr, 1, writeable ? FOLL_WRITE : 0, &page);
+   ret = pin_user_pages_fast(addr, 1, writeable ? FOLL_WRITE : 0, &page);
if (ret < 0)
return ret;
 
@@ -744,10 +744,7 @@ static int gntdev_get_page(struct gntdev_copy_batch *batch, void __user *virt,
 
 static void gntdev_put_pages(struct gntdev_copy_batch *batch)
 {
-   unsigned int i;
-
-   for (i = 0; i < batch->nr_pages; i++)
-   put_page(batch->pages[i]);
+   unpin_user_pages(batch->pages, batch->nr_pages);
batch->nr_pages = 0;
 }
 
-- 
1.9.1
