date:20170731

Re: [Xen-devel] [PATCH] VT-d: don't panic/warn on iommu=no-igfx

2017-07-31 Thread Tian, Kevin

> From: Rusty Bird [mailto:rustyb...@openmailbox.org]
> Sent: Thursday, July 27, 2017 7:54 PM
> 
> When operating on an Intel graphics device, iommu_enable_translation()
> panicked (force_iommu==1) or warned (force_iommu==0) about the BIOS if
> is_igd_vt_enabled_quirk() returned 0. That's good if the actual BIOS
> problem has been detected. But since commit 1463411, returning 0 could
> also happen if the user simply passed "iommu=no-igfx", in which case
> bailing out _without_ the panic/warning would be more appropriate.
> 
> The panic broke the combination "iommu=force,no-igfx", and also the case
> where "iommu=no-igfx" is passed but force_iommu=1 is set automatically
> by x2apic_bsp_setup().
> 
> Move the iommu_igfx check from is_igd_vt_enabled_quirk() into its only
> caller iommu_enable_translation(), and tweak the logic.
> 
> Signed-off-by: Rusty Bird 
> ---
> 
> Notes:
> Best viewed with "git show --ignore-space-change --function-context"
> 
>  xen/drivers/passthrough/vtd/iommu.c  | 18 --
>  xen/drivers/passthrough/vtd/quirks.c |  3 ---
>  2 files changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index 19328f6..2849ea1 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -747,14 +747,20 @@ static void iommu_enable_translation(struct
> acpi_drhd_unit *drhd)
>  unsigned long flags;
>  struct iommu *iommu = drhd->iommu;
> 
> -if ( is_igd_drhd(drhd) && !is_igd_vt_enabled_quirk() )
> +if ( is_igd_drhd(drhd) )
>  {
> -if ( force_iommu )
> -panic("BIOS did not enable IGD for VT properly, crash Xen for 
> security
> purpose");
> +if ( !iommu_igfx )
> +return;

A message might be also helpful here so user can confirm its
boot option takes effect...

> 
> -printk(XENLOG_WARNING VTDPREFIX
> -   "BIOS did not enable IGD for VT properly.  Disabling IGD VT-d
> engine.\n");
> -return;
> +if ( !is_igd_vt_enabled_quirk() )
> +{
> +if ( force_iommu )
> +panic("BIOS did not enable IGD for VT properly, crash Xen for
> security purpose");
> +
> +printk(XENLOG_WARNING VTDPREFIX
> +   "BIOS did not enable IGD for VT properly.  Disabling IGD 
> VT-d
> engine.\n");
> +return;
> +}
>  }
> 
>  /* apply platform specific errata workarounds */
> diff --git a/xen/drivers/passthrough/vtd/quirks.c
> b/xen/drivers/passthrough/vtd/quirks.c
> index 91f96ac..5bbbd96 100644
> --- a/xen/drivers/passthrough/vtd/quirks.c
> +++ b/xen/drivers/passthrough/vtd/quirks.c
> @@ -70,9 +70,6 @@ int is_igd_vt_enabled_quirk(void)
>  {
>  u16 ggc;
> 
> -if ( !iommu_igfx )
> -return 0;
> -
>  if ( !IS_ILK(ioh_id) )
>  return 1;
> 
> --
> 2.9.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2] xen: Implement hypercall for tracing of program counters

2017-07-31 Thread Julien Grall


Hi,

On 07/28/2017 04:37 PM, Wei Liu wrote:

On Wed, Jul 26, 2017 at 12:43:45PM +0200, Felix Schmoll wrote:

+d->tracing_buffer = NULL;
+
+if ( copy_to_guest(buf, temp, d->tracing_buffer_pos) )
+ret = -EFAULT;
+
+xfree(temp);
+
+ret = d->tracing_buffer_pos;
+break;
+}
+
+default:
+ret = -ENOSYS;


EINVAL


Should not it be -EOPNOTSUPP to match return error when CONFIG_TRACE_PC 
is not?


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] PV drivers and zero copying

2017-07-31 Thread Oleksandr Andrushchenko


Hi, all!

The aim of this mail is to highlight and discuss possible approaches to
implementing zero copying for PV drivers. Rationale behind this is that 
there
are use-cases when drivers operate with big shared buffers, e.g. 
display, when

memory copying from front’s buffer into back’s one may significantly hit
performance of the system (for example, for para-virtual display running 
at full

HD resolution at 60Hz it is approximately 475MB/sec).

Assumptions (which actually fit ARM platforms, but can be extended to other
platforms as well): Dom0 is a 1:1 mapped privileged domain, runs backend
driver/software DomU is an unprivileged domain without 1:1 memory 
mapping, runs

frontend driver

Buffer origin: while implementing zero copying the buffer allocation can 
happen

either on DomU’s end or Dom0’s one depending on the use-case and HW
capabilities/availability: When DomU allocates: It cannot guarantee physical
memory continuity of the buffers allocated Dom0’s HW *can* handle 
non-contiguous
memory buffers allocated by DomU for memory operations (DMA, for 
example), e.g.
either with IOMMU help or by any other means (HW block’s own MMU).  When 
Dom0

allocates as it is mapped 1:1 it can allocate physically contiguous memory
Dom0’s HW *cannot* handle non-contiguous memory buffers allocated by 
DomU for

memory operations by any means.

1 Sharing with granted references
==

1-1 Buffer allocated @DomU
--
@DomU
alloc_xenballooned_pages(nr_pages, pages);
cur_ref = gnttab_claim_grant_reference(&priv_gref_head);
gnttab_grant_foreign_access_ref(cur_ref, otherend_id, ...);

@Dom0
alloc_xenballooned_pages(nr_pages, pages);
gnttab_set_map_op(&map_ops[i], addr, GNTMAP_host_map | 
GNTMAP_device_map,

grefs[i], otherend_id);
gnttab_map_refs(map_ops, NULL, pages, nr_pages);

1-2 Buffer allocated @Dom0
--
@Dom0

dma_alloc_wc(dev, size, &dev_addr, GFP_KERNEL | __GFP_NOWARN);
HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
cur_ref = gnttab_claim_grant_reference(&priv_gref_head);
gnttab_grant_foreign_access_ref(cur_ref, otherend_id, ...);

@Dom0
alloc_xenballooned_pages(nr_pages, pages);
gnttab_set_map_op(&map_ops[i], addr, GNTMAP_host_map, grefs[i], 
otherend_id);

gnttab_map_refs(map_ops, NULL, pages, nr_pages);

2 Sharing with page transfers (GNTTABOP_transfer)
==
FIXME: This use-case seems to be only needed while allocating physically
contiguous buffers at Dom0. For the reverse path 1-1 method can be used.

This approach relies on GNTTABOP_transfer API: “transfer  to a 
foreign

domain. The foreign domain has previously registered its interest in the
transfer via ”, for full documentation see [1]. The process of
transferring pages is explained by Christopher Clark at [2] and is 
available as
implementation at [3], [4]. The relevant logic is in: 
xen/common/grant_table.c :

gnttab_transfer.

Basic workflow explained to me by Christopher:
- The mfn starts as owned by the sending domain, and that domain removes any
  mappings of it from its page tables. Xen will enforce that the 
reference count

must be low enough for the transfer to succeed.
- The receiving domain indicates interest for receiving a page by writing an
  entry in its grant table.
- You'll need to communicate the grant ref from the receiver to the 
sender (eg.

  via xenstore or another existing channel)
- The sending domain invokes the hypercall, with the grant ref from the
  receiving domain.
- The sending domain notifies the receiving domain somehow that the 
transfer has

  completed. (eg. send an event or via xenstore)
- Once the transfer has completed, the receiving domain will need to map the
  newly assigned page.
- Note: For the transfer, the receiving domain must have enough headroom to
  receive the new page, which means it must not have allocated all of 
its memory
quota already prior to the transfer. Typically this can be ensured by 
freeing

enough memory back to Xen before writing the grant ref.

3 Sharing with page exchange (XENMEM_exchange)
==

This API was pointed to me by Stefano Stabellini as one of the possible 
ways to
achieve zero copying and share physically contiguous buffers. It is used 
by x86
SWIOTLB code (xen_create_contiguous_region, [5]), but as per my 
understanding
this API cannot be used on ARM as of now [6].  Conclusion: not an option 
for ARM

at the moment

Comparison for display use-case
===

1 Number of grant references used
1-1 grant references: nr_pages
1-2 GNTTABOP_transfer: nr_pages
1-3 XENMEM_exchange: not an option

2 Effect of DomU crash on Dom0 (its mapped pages)
2-1 grant references: pages can be unmapped by Dom0, Dom0 is fully 
recovered
2-2 GNTTABOP_transfer: pages will be returned to the Hypervisor, lost 
for Dom0

2-3 XENMEM

Re: [Xen-devel] [RFC PATCH 00/15] RFC: SGX virtualization design and draft patches

2017-07-31 Thread Huang, Kai


Hi Wei,

Thanks for your comments. Please see my reply below.

On 7/29/2017 1:40 AM, Wei Liu wrote:

On Tue, Jul 18, 2017 at 08:22:55PM +1200, Huang, Kai wrote:

Hi Wei,

Thank you very much for comments. Please see my reply below.

On 7/17/2017 9:16 PM, Wei Liu wrote:

Hi Kai

Thanks for this nice write-up.

Some comments and questions below.

On Sun, Jul 09, 2017 at 08:03:10PM +1200, Kai Huang wrote:

Hi all,


[...]

2. SGX Virtualization Design

2.1 High Level Toolstack Changes:

2.1.1 New 'epc' parameter

EPC is limited resource. In order to use EPC efficiently among all domains,
when creating guest, administrator should be able to specify domain's virtual
EPC size. And admin
alao should be able to get all domain's virtual EPC size.

For this purpose, a new 'epc = ' parameter is added to XL configuration
file. This parameter specifies guest's virtual EPC size. The EPC base address
will be calculated by toolstack internally, according to guest's memory size,
MMIO size, etc. 'epc' is MB in unit and any 1MB aligned value will be accepted.

2.1.2 New XL commands (?)

Administrator should be able to get physical EPC size, and all domain's virtual
EPC size. For this purpose, we can introduce 2 additional commands:

  # xl sgxinfo

Which will print out physical EPC size, and other SGX info (such as SGX1, SGX2,
etc) if necessary.

  # xl sgxlist 

Which will print out particular domain's virtual EPC size, or list all virtual
EPC sizes for all supported domains.

Alternatively, we can also extend existing XL commands by adding new option

  # xl info -sgx

Which will print out physical EPC size along with other physinfo. And

  # xl list  -sgx

Which will print out domain's virtual EPC size.

Comments?



Can a guest have multiple EPC? If so, the proposed parameter is not good
enough.


According to SDM a machine may have multiple EPC, but it may have doesn't
mean it must have. EPC is typically reserved by BIOS as Processor Reserved
Memory (PRM), and in my understanding, client machine  doesn't need to have
multiple EPC. Currently, I don't see why we need to expose multiple EPC to
guest. Even physical machine reports multiple EPC, exposing one EPC to guest
is enough. Currently SGX should not be supported with virtual NUMA
simultaneously for a single domain.



When you say "is enough", do you mean Intel doesn't recommend users to
use more than one? I don't think from reading this doc precludes using
more then one technically.


No I don't think Intel would make such recommendation. For real hardware 
yes it's possible there are multiple EPC sections, but for client or 
single socket server machine, typically there will be only one EPC. In 
case of VM, I don't see there's any benefit of exposing multiple EPCs to 
guest, except the vNUMA case. My thinking is although SDM doesn't 
preclude using more than one EPC but for VM there's no need to use more 
than one.






Can a guest with EPC enabled be migrated? The answer to this question
can lead to multiple other questions.


See the last section of my design. I saw you've already seen it. :)



Another question, is EPC going to be backed by normal memory? This is
related to memory accounting of the guest.


Although SDM says typically EPC is allocated by BIOS as PRM, but I think we
can just treat EPC as PRM, so I believe yes, physically EPC is backed by
normal memory. But EPC is reported as reserved memory in e820 table.



Is EPC going to be modeled as a device or another type of memory? This
is related to how we manage it in the toolstack.


I think we'd better to treat EPC as another type of memory. I am not sure
whether it should be modeled as device, as on real machine, EPC is also
exposed in ACPI table via "INT0E0C" device under \_SB (however it is not
modeled as PCIE device for sure).



Finally why do you not allow the users to specify the base address?


I don't see any reason why user needs to specify base address. If we do,
then specify what address? On real machine, BIOS set the base address, and
for VM, I think toolstack/Xen should do this.


We can expose an option for user to control that if they want to and at
the same time provide the logic to calculate the base address
internally. I'm not sure if that's going to be very useful, but I'm not
convinced it is entirely useless either.

Thinking a bit more we can always extend the syntax and API to support
that if need be, so I'm fine with not providing such mechanism at early
stage.


Yeah I think we can extend if needed in the future. Thanks Wei.

Thanks,
-Kai





___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Paul Durrant

> -Original Message-
[snip]
> Comparison for display use-case
> ===
> 
> 1 Number of grant references used
> 1-1 grant references: nr_pages
> 1-2 GNTTABOP_transfer: nr_pages
> 1-3 XENMEM_exchange: not an option
> 
> 2 Effect of DomU crash on Dom0 (its mapped pages)
> 2-1 grant references: pages can be unmapped by Dom0, Dom0 is fully
> recovered
> 2-2 GNTTABOP_transfer: pages will be returned to the Hypervisor, lost
> for Dom0
> 2-3 XENMEM_exchange: not an option
> 
> 3 Security issues from sharing Dom0 pages to DomU
> 1-1 grant references: none
> 1-2 GNTTABOP_transfer: none
> 1-3 XENMEM_exchange: not an option
> 
> At the moment approach 1 with granted references seems to be a winner for
> sharing buffers both ways, e.g. Dom0 -> DomU and DomU -> Dom0.
> 
> Conclusion
> ==
> 
> I would like to get some feedback from the community on which approach
> is more
> suitable for sharing large buffers and to have a clear vision on cons
> and pros
> of each one: please feel free to add other metrics I missed and correct
> the ones
> I commented on.  I would appreciate help on comparing approaches 2 and 3
> as I
> have little knowledge of these APIs (2 seems to be addressed by
> Christopher, and
> 3 seems to be relevant to what Konrad/Stefano do WRT SWIOTLB).
> 

Hi,

  I once implemented a scheme where network frontends used memory granted from 
backends and this hit quite a few problems:

- If domU is allowed to grant map memory from dom0, there is not currently a 
way to forcibly take it back (so I don't think you're quite correct in 2-1 
above... but I may have missed something). Hence the domU can hold dom0's 
memory to ransom. (In the network case this was avoided by using grant table v2 
'copy-only' grants).
- If you end up having to grant buffers which do not originate in dom0 (i.e. 
they were grant mapped from another domU) then this creates similar problems 
with one domU holding another domU's memory to ransom, even when using 
copy-only grants. I don’t think this would be an issue in your use-case.
- Currently the default grant table size is 32 pages and it may not take that 
many guests using a protocol where dom0 grants memory to domU to exhaust dom0's 
grant table (depending on how many grants-per-domU the protocol allows). If 
you're intending to grant large buffers then you may need quite a few grants 
(since they are per-4k-chunk) to do this, so you might run into this limit.

  Cheers,

Paul
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [linux-3.18 test] 112387: trouble: blocked/broken/fail/pass

2017-07-31 Thread osstest service owner

flight 112387 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112387/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm   2 hosts-allocate broken REGR. vs. 112102
 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 112102
 build-arm64   2 hosts-allocate broken REGR. vs. 112102
 build-arm64-pvops 3 capture-logs   broken REGR. vs. 112102

Tests which are failing intermittently (not blocking):
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot   fail pass in 112378

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64-xsm   3 capture-logs  broken blocked in 112102
 build-arm64   3 capture-logs  broken blocked in 112102
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop  fail blocked in 112102
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop   fail blocked in 112102
 test-amd64-i386-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail in 112378 
blocked in 112102
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop  fail in 112378 like 112085
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 112085
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 112102
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 112102
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail like 112102
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 112102
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 112102
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore   fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore   fail never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass

version targeted

[Xen-devel] [PATCH v2] VT-d: don't panic/warn on iommu=no-igfx

2017-07-31 Thread Rusty Bird

When operating on an Intel graphics device, iommu_enable_translation()
panicked (force_iommu==1) or warned (force_iommu==0) about the BIOS if
is_igd_vt_enabled_quirk() returned 0. That's good if the actual BIOS
problem has been detected. But since commit 1463411, returning 0 could
also happen if the user simply passed "iommu=no-igfx", in which case
bailing out with an info message (instead of a panic/warning) would be
more appropriate.

The panic broke the combination "iommu=force,no-igfx", and also the case
where "iommu=no-igfx" is passed but force_iommu=1 is set automatically
by x2apic_bsp_setup().

Move the iommu_igfx check from is_igd_vt_enabled_quirk() into its only
caller iommu_enable_translation(), and tweak the logic.

Signed-off-by: Rusty Bird 
---

Notes:
Changed since v1: print info message when iommu_igfx==0
Best viewed with "git show --ignore-space-change --function-context"

 xen/drivers/passthrough/vtd/iommu.c  | 22 --
 xen/drivers/passthrough/vtd/quirks.c |  3 ---
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index 19328f6..daaed0a 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -747,14 +747,24 @@ static void iommu_enable_translation(struct 
acpi_drhd_unit *drhd)
 unsigned long flags;
 struct iommu *iommu = drhd->iommu;
 
-if ( is_igd_drhd(drhd) && !is_igd_vt_enabled_quirk() ) 
+if ( is_igd_drhd(drhd) )
 {
-if ( force_iommu )
-panic("BIOS did not enable IGD for VT properly, crash Xen for 
security purpose");
+if ( !iommu_igfx )
+{
+printk(XENLOG_INFO VTDPREFIX
+   "Passed iommu=no-igfx option.  Disabling IGD VT-d 
engine.\n");
+return;
+}
 
-printk(XENLOG_WARNING VTDPREFIX
-   "BIOS did not enable IGD for VT properly.  Disabling IGD VT-d 
engine.\n");
-return;
+if ( !is_igd_vt_enabled_quirk() )
+{
+if ( force_iommu )
+panic("BIOS did not enable IGD for VT properly, crash Xen for 
security purpose");
+
+printk(XENLOG_WARNING VTDPREFIX
+   "BIOS did not enable IGD for VT properly.  Disabling IGD 
VT-d engine.\n");
+return;
+}
 }
 
 /* apply platform specific errata workarounds */
diff --git a/xen/drivers/passthrough/vtd/quirks.c 
b/xen/drivers/passthrough/vtd/quirks.c
index 91f96ac..5bbbd96 100644
--- a/xen/drivers/passthrough/vtd/quirks.c
+++ b/xen/drivers/passthrough/vtd/quirks.c
@@ -70,9 +70,6 @@ int is_igd_vt_enabled_quirk(void)
 {
 u16 ggc;
 
-if ( !iommu_igfx )
-return 0;
-
 if ( !IS_ILK(ioh_id) )
 return 1;
 
-- 
2.9.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] VT-d: don't panic/warn on iommu=no-igfx

2017-07-31 Thread Rusty Bird

Tian, Kevin:
> > From: Rusty Bird [mailto:rustyb...@openmailbox.org]
> > +if ( !iommu_igfx )
> > +return;
> 
> A message might be also helpful here so user can confirm its
> boot option takes effect...

Done, see v2 patch. Thanks for the feedback!

Rusty


signature.asc
Description: PGP signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Oleksandr Andrushchenko


Hi, Paul!


On 07/31/2017 12:03 PM, Paul Durrant wrote:

-Original Message-

[snip]

Comparison for display use-case
===

1 Number of grant references used
1-1 grant references: nr_pages
1-2 GNTTABOP_transfer: nr_pages
1-3 XENMEM_exchange: not an option

2 Effect of DomU crash on Dom0 (its mapped pages)
2-1 grant references: pages can be unmapped by Dom0, Dom0 is fully
recovered
2-2 GNTTABOP_transfer: pages will be returned to the Hypervisor, lost
for Dom0
2-3 XENMEM_exchange: not an option

3 Security issues from sharing Dom0 pages to DomU
1-1 grant references: none
1-2 GNTTABOP_transfer: none
1-3 XENMEM_exchange: not an option

At the moment approach 1 with granted references seems to be a winner for
sharing buffers both ways, e.g. Dom0 -> DomU and DomU -> Dom0.

Conclusion
==

I would like to get some feedback from the community on which approach
is more
suitable for sharing large buffers and to have a clear vision on cons
and pros
of each one: please feel free to add other metrics I missed and correct
the ones
I commented on.  I would appreciate help on comparing approaches 2 and 3
as I
have little knowledge of these APIs (2 seems to be addressed by
Christopher, and
3 seems to be relevant to what Konrad/Stefano do WRT SWIOTLB).


Hi,

   I once implemented a scheme where network frontends used memory granted from 
backends and this hit quite a few problems:

- If domU is allowed to grant map memory from dom0, there is not currently a 
way to forcibly take it back (so I don't think you're quite correct in 2-1 
above... but I may have missed something).

This is where I was not 100% confident, so you seem to be right here.

  Hence the domU can hold dom0's memory to ransom.

Yes, this is the problem

  (In the network case this was avoided by using grant table v2 'copy-only' 
grants).
- If you end up having to grant buffers which do not originate in dom0 (i.e. 
they were grant mapped from another domU) then this creates similar problems 
with one domU holding another domU's memory to ransom, even when using 
copy-only grants. I don’t think this would be an issue in your use-case.
In our use-cases we will only share from Dom0 to DomU in case Dom0 is 
1:1 mapped

Otherwise, HW should be capable of dealing with non-contiguous memory

- Currently the default grant table size is 32 pages and it may not take that 
many guests using a protocol where dom0 grants memory to domU to exhaust dom0's 
grant table

Yes, I know about this limitation, thank you

  (depending on how many grants-per-domU the protocol allows). If you're 
intending to grant large buffers then you may need quite a few grants (since 
they are per-4k-chunk) to do this, so you might run into this limit.

   Cheers,

 Paul

Thank you for comments,
so in both cases (with grant refs or memory transfer) there is no way to 
cleanly
recover from DomU crash (either grants cannot be returned or memory goes 
to the

hypervisor, not Dom0).
Is this correct?

Thank you,
Oleksandr

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [distros-debian-sid test] 71921: tolerable trouble: blocked/broken/fail/pass

2017-07-31 Thread Platform Team regression test user

flight 71921 distros-debian-sid real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/71921/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-armhf-sid-netboot-pygrub  1 build-check(1)blocked n/a
 build-arm64-pvops 2 hosts-allocate   broken like 71732
 build-arm64   2 hosts-allocate   broken like 71732
 build-arm64-pvops 3 capture-logs broken like 71732
 build-arm64   3 capture-logs broken like 71732
 test-amd64-i386-i386-sid-netboot-pvgrub 11 guest-start   fail blocked in 71732
 test-amd64-amd64-amd64-sid-netboot-pvgrub 11 guest-start fail blocked in 71732
 test-armhf-armhf-armhf-sid-netboot-pygrub 10 debian-di-install fail blocked in 
71732

baseline version:
 flight   71732

jobs:
 build-amd64  pass
 build-arm64  broken  
 build-armhf  pass
 build-i386   pass
 build-amd64-pvopspass
 build-arm64-pvopsbroken  
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-amd64-sid-netboot-pvgrubfail
 test-amd64-i386-i386-sid-netboot-pvgrub  fail
 test-amd64-i386-amd64-sid-netboot-pygrub pass
 test-arm64-arm64-armhf-sid-netboot-pygrubblocked 
 test-armhf-armhf-armhf-sid-netboot-pygrubfail
 test-amd64-amd64-i386-sid-netboot-pygrub pass



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Julien Grall


(+ Joao)

On 31/07/17 09:34, Oleksandr Andrushchenko wrote:

Hi, all!


Hi Oleksandr,


The aim of this mail is to highlight and discuss possible approaches to
implementing zero copying for PV drivers. Rationale behind this is that
there
are use-cases when drivers operate with big shared buffers, e.g.
display, when
memory copying from front’s buffer into back’s one may significantly hit
performance of the system (for example, for para-virtual display running
at full
HD resolution at 60Hz it is approximately 475MB/sec).

Assumptions (which actually fit ARM platforms, but can be extended to other
platforms as well): Dom0 is a 1:1 mapped privileged domain, runs backend
driver/software DomU is an unprivileged domain without 1:1 memory
mapping, runs
frontend driver


I would rather avoid to stick with this assumption on ARM. This was only 
meant to be a workaround for platform without IOMMU (see [1]) and we 
will get into trouble when using IOMMU.


For instance, there are no requirement to have the IOMMU supporting as 
many as address bits than the processor. So 1:1 mapping here will not be 
an option.




Buffer origin: while implementing zero copying the buffer allocation can
happen
either on DomU’s end or Dom0’s one depending on the use-case and HW
capabilities/availability: When DomU allocates: It cannot guarantee
physical
memory continuity of the buffers allocated Dom0’s HW *can* handle
non-contiguous
memory buffers allocated by DomU for memory operations (DMA, for
example), e.g.
either with IOMMU help or by any other means (HW block’s own MMU).  When
Dom0
allocates as it is mapped 1:1 it can allocate physically contiguous memory
Dom0’s HW *cannot* handle non-contiguous memory buffers allocated by
DomU for
memory operations by any means.


I am not sure to follow this. How zero copy is related to 1:1 mapping? 
Is it because you have hardware that does not support scatter/gather IO 
or IOMMU?




1 Sharing with granted references
==

1-1 Buffer allocated @DomU
--
@DomU
alloc_xenballooned_pages(nr_pages, pages);
cur_ref = gnttab_claim_grant_reference(&priv_gref_head);
gnttab_grant_foreign_access_ref(cur_ref, otherend_id, ...);

@Dom0
alloc_xenballooned_pages(nr_pages, pages);
gnttab_set_map_op(&map_ops[i], addr, GNTMAP_host_map |
GNTMAP_device_map,
grefs[i], otherend_id);
gnttab_map_refs(map_ops, NULL, pages, nr_pages);

1-2 Buffer allocated @Dom0
--
@Dom0

dma_alloc_wc(dev, size, &dev_addr, GFP_KERNEL | __GFP_NOWARN);
HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
cur_ref = gnttab_claim_grant_reference(&priv_gref_head);
gnttab_grant_foreign_access_ref(cur_ref, otherend_id, ...);

@Dom0
alloc_xenballooned_pages(nr_pages, pages);
gnttab_set_map_op(&map_ops[i], addr, GNTMAP_host_map, grefs[i],
otherend_id);
gnttab_map_refs(map_ops, NULL, pages, nr_pages);

2 Sharing with page transfers (GNTTABOP_transfer)
==
FIXME: This use-case seems to be only needed while allocating physically
contiguous buffers at Dom0. For the reverse path 1-1 method can be used.

This approach relies on GNTTABOP_transfer API: “transfer  to a
foreign
domain. The foreign domain has previously registered its interest in the
transfer via ”, for full documentation see [1]. The process of
transferring pages is explained by Christopher Clark at [2] and is
available as
implementation at [3], [4]. The relevant logic is in:
xen/common/grant_table.c :
gnttab_transfer.

Basic workflow explained to me by Christopher:
- The mfn starts as owned by the sending domain, and that domain removes
any
  mappings of it from its page tables. Xen will enforce that the
reference count
must be low enough for the transfer to succeed.
- The receiving domain indicates interest for receiving a page by
writing an
  entry in its grant table.
- You'll need to communicate the grant ref from the receiver to the
sender (eg.
  via xenstore or another existing channel)
- The sending domain invokes the hypercall, with the grant ref from the
  receiving domain.
- The sending domain notifies the receiving domain somehow that the
transfer has
  completed. (eg. send an event or via xenstore)
- Once the transfer has completed, the receiving domain will need to map
the
  newly assigned page.
- Note: For the transfer, the receiving domain must have enough headroom to
  receive the new page, which means it must not have allocated all of
its memory
quota already prior to the transfer. Typically this can be ensured by
freeing
enough memory back to Xen before writing the grant ref.

3 Sharing with page exchange (XENMEM_exchange)
==

This API was pointed to me by Stefano Stabellini as one of the possible
ways to
achieve zero copying and share physically contiguous buffers. It is used
by x86
SWIOTLB code (xen_create_contiguous_region,

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Andrew Cooper

On 31/07/17 10:03, Paul Durrant wrote:
>> -Original Message-
> [snip]
>> Comparison for display use-case
>> ===
>>
>> 1 Number of grant references used
>> 1-1 grant references: nr_pages
>> 1-2 GNTTABOP_transfer: nr_pages
>> 1-3 XENMEM_exchange: not an option
>>
>> 2 Effect of DomU crash on Dom0 (its mapped pages)
>> 2-1 grant references: pages can be unmapped by Dom0, Dom0 is fully
>> recovered
>> 2-2 GNTTABOP_transfer: pages will be returned to the Hypervisor, lost
>> for Dom0
>> 2-3 XENMEM_exchange: not an option
>>
>> 3 Security issues from sharing Dom0 pages to DomU
>> 1-1 grant references: none
>> 1-2 GNTTABOP_transfer: none
>> 1-3 XENMEM_exchange: not an option
>>
>> At the moment approach 1 with granted references seems to be a winner for
>> sharing buffers both ways, e.g. Dom0 -> DomU and DomU -> Dom0.
>>
>> Conclusion
>> ==
>>
>> I would like to get some feedback from the community on which approach
>> is more
>> suitable for sharing large buffers and to have a clear vision on cons
>> and pros
>> of each one: please feel free to add other metrics I missed and correct
>> the ones
>> I commented on.  I would appreciate help on comparing approaches 2 and 3
>> as I
>> have little knowledge of these APIs (2 seems to be addressed by
>> Christopher, and
>> 3 seems to be relevant to what Konrad/Stefano do WRT SWIOTLB).
>>
> Hi,
>
>   I once implemented a scheme where network frontends used memory granted 
> from backends and this hit quite a few problems:
>
> - If domU is allowed to grant map memory from dom0, there is not currently a 
> way to forcibly take it back (so I don't think you're quite correct in 2-1 
> above... but I may have missed something). Hence the domU can hold dom0's 
> memory to ransom. (In the network case this was avoided by using grant table 
> v2 'copy-only' grants).
> - If you end up having to grant buffers which do not originate in dom0 (i.e. 
> they were grant mapped from another domU) then this creates similar problems 
> with one domU holding another domU's memory to ransom, even when using 
> copy-only grants. I don’t think this would be an issue in your use-case.
> - Currently the default grant table size is 32 pages and it may not take that 
> many guests using a protocol where dom0 grants memory to domU to exhaust 
> dom0's grant table (depending on how many grants-per-domU the protocol 
> allows). If you're intending to grant large buffers then you may need quite a 
> few grants (since they are per-4k-chunk) to do this, so you might run into 
> this limit.

To follow up on what Paul said, google for XenServer Receive Side Copy. 
The issue is that it looks very attractive (from an offloading things
out of dom0 perspective), and does work at small scale, but the failure
cases are far more tricky than we imagined, and you run dom0 out of
grants very quickly, which puts a hard upper bound on scalability.  It
is high on the list of "worst mistakes we put into production".

It is not safe at all for dom0 to grant frames to domU without dom0
having a mechanism to revoke the grant.  (There was work looking into
this in the past, but it suffered from a lack of free time.)

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Oleksandr Andrushchenko


On 07/31/2017 12:31 PM, Andrew Cooper wrote:

On 31/07/17 10:03, Paul Durrant wrote:

-Original Message-

[snip]

Comparison for display use-case
===

1 Number of grant references used
1-1 grant references: nr_pages
1-2 GNTTABOP_transfer: nr_pages
1-3 XENMEM_exchange: not an option

2 Effect of DomU crash on Dom0 (its mapped pages)
2-1 grant references: pages can be unmapped by Dom0, Dom0 is fully
recovered
2-2 GNTTABOP_transfer: pages will be returned to the Hypervisor, lost
for Dom0
2-3 XENMEM_exchange: not an option

3 Security issues from sharing Dom0 pages to DomU
1-1 grant references: none
1-2 GNTTABOP_transfer: none
1-3 XENMEM_exchange: not an option

At the moment approach 1 with granted references seems to be a winner for
sharing buffers both ways, e.g. Dom0 -> DomU and DomU -> Dom0.

Conclusion
==

I would like to get some feedback from the community on which approach
is more
suitable for sharing large buffers and to have a clear vision on cons
and pros
of each one: please feel free to add other metrics I missed and correct
the ones
I commented on.  I would appreciate help on comparing approaches 2 and 3
as I
have little knowledge of these APIs (2 seems to be addressed by
Christopher, and
3 seems to be relevant to what Konrad/Stefano do WRT SWIOTLB).


Hi,

   I once implemented a scheme where network frontends used memory granted from 
backends and this hit quite a few problems:

- If domU is allowed to grant map memory from dom0, there is not currently a 
way to forcibly take it back (so I don't think you're quite correct in 2-1 
above... but I may have missed something). Hence the domU can hold dom0's 
memory to ransom. (In the network case this was avoided by using grant table v2 
'copy-only' grants).
- If you end up having to grant buffers which do not originate in dom0 (i.e. 
they were grant mapped from another domU) then this creates similar problems 
with one domU holding another domU's memory to ransom, even when using 
copy-only grants. I don’t think this would be an issue in your use-case.
- Currently the default grant table size is 32 pages and it may not take that 
many guests using a protocol where dom0 grants memory to domU to exhaust dom0's 
grant table (depending on how many grants-per-domU the protocol allows). If 
you're intending to grant large buffers then you may need quite a few grants 
(since they are per-4k-chunk) to do this, so you might run into this limit.

To follow up on what Paul said, google for XenServer Receive Side Copy.

I will, thank you

The issue is that it looks very attractive (from an offloading things
out of dom0 perspective), and does work at small scale, but the failure
cases are far more tricky than we imagined, and you run dom0 out of
grants very quickly, which puts a hard upper bound on scalability.  It
is high on the list of "worst mistakes we put into production".

It is not safe at all for dom0 to grant frames to domU without dom0
having a mechanism to revoke the grant.

Other than that, are there any other concerns, e.g. from
security POV?

  (There was work looking into
this in the past, but it suffered from a lack of free time.)

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Thank you,
Oleksandr

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Oleksandr Andrushchenko


Hi, Julien!


On 07/31/2017 12:24 PM, Julien Grall wrote:

(+ Joao)

On 31/07/17 09:34, Oleksandr Andrushchenko wrote:

Hi, all!


Hi Oleksandr,


The aim of this mail is to highlight and discuss possible approaches to
implementing zero copying for PV drivers. Rationale behind this is that
there
are use-cases when drivers operate with big shared buffers, e.g.
display, when
memory copying from front’s buffer into back’s one may significantly hit
performance of the system (for example, for para-virtual display running
at full
HD resolution at 60Hz it is approximately 475MB/sec).

Assumptions (which actually fit ARM platforms, but can be extended to 
other

platforms as well): Dom0 is a 1:1 mapped privileged domain, runs backend
driver/software DomU is an unprivileged domain without 1:1 memory
mapping, runs
frontend driver


I would rather avoid to stick with this assumption on ARM. This was 
only meant to be a workaround for platform without IOMMU (see [1]) and 
we will get into trouble when using IOMMU.

You are correct, thank you


For instance, there are no requirement to have the IOMMU supporting as 
many as address bits than the processor. So 1:1 mapping here will not 
be an option.




Buffer origin: while implementing zero copying the buffer allocation can
happen
either on DomU’s end or Dom0’s one depending on the use-case and HW
capabilities/availability: When DomU allocates: It cannot guarantee
physical
memory continuity of the buffers allocated Dom0’s HW *can* handle
non-contiguous
memory buffers allocated by DomU for memory operations (DMA, for
example), e.g.
either with IOMMU help or by any other means (HW block’s own MMU).  When
Dom0
allocates as it is mapped 1:1 it can allocate physically contiguous 
memory

Dom0’s HW *cannot* handle non-contiguous memory buffers allocated by
DomU for
memory operations by any means.


I am not sure to follow this. How zero copy is related to 1:1 mapping? 
Is it because you have hardware that does not support scatter/gather 
IO or IOMMU?

yes, you got it right




1 Sharing with granted references
==

1-1 Buffer allocated @DomU
--
@DomU
alloc_xenballooned_pages(nr_pages, pages);
cur_ref = gnttab_claim_grant_reference(&priv_gref_head);
gnttab_grant_foreign_access_ref(cur_ref, otherend_id, ...);

@Dom0
alloc_xenballooned_pages(nr_pages, pages);
gnttab_set_map_op(&map_ops[i], addr, GNTMAP_host_map |
GNTMAP_device_map,
grefs[i], otherend_id);
gnttab_map_refs(map_ops, NULL, pages, nr_pages);

1-2 Buffer allocated @Dom0
--
@Dom0
 PV MMU support as seen in the balloon driver, the difference is 
that

 pages are explicitly allocated to be used for DMA>
dma_alloc_wc(dev, size, &dev_addr, GFP_KERNEL | __GFP_NOWARN);
HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
cur_ref = gnttab_claim_grant_reference(&priv_gref_head);
gnttab_grant_foreign_access_ref(cur_ref, otherend_id, ...);

@Dom0
alloc_xenballooned_pages(nr_pages, pages);
gnttab_set_map_op(&map_ops[i], addr, GNTMAP_host_map, grefs[i],
otherend_id);
gnttab_map_refs(map_ops, NULL, pages, nr_pages);

2 Sharing with page transfers (GNTTABOP_transfer)
==
FIXME: This use-case seems to be only needed while allocating physically
contiguous buffers at Dom0. For the reverse path 1-1 method can be used.

This approach relies on GNTTABOP_transfer API: “transfer  to a
foreign
domain. The foreign domain has previously registered its interest in the
transfer via ”, for full documentation see [1]. The 
process of

transferring pages is explained by Christopher Clark at [2] and is
available as
implementation at [3], [4]. The relevant logic is in:
xen/common/grant_table.c :
gnttab_transfer.

Basic workflow explained to me by Christopher:
- The mfn starts as owned by the sending domain, and that domain removes
any
  mappings of it from its page tables. Xen will enforce that the
reference count
must be low enough for the transfer to succeed.
- The receiving domain indicates interest for receiving a page by
writing an
  entry in its grant table.
- You'll need to communicate the grant ref from the receiver to the
sender (eg.
  via xenstore or another existing channel)
- The sending domain invokes the hypercall, with the grant ref from the
  receiving domain.
- The sending domain notifies the receiving domain somehow that the
transfer has
  completed. (eg. send an event or via xenstore)
- Once the transfer has completed, the receiving domain will need to map
the
  newly assigned page.
- Note: For the transfer, the receiving domain must have enough 
headroom to

  receive the new page, which means it must not have allocated all of
its memory
quota already prior to the transfer. Typically this can be ensured by
freeing
enough memory back to Xen before writing the grant ref.

3 Sharing with page exchange (XENMEM_exchange)
==

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Julien Grall




On 31/07/17 10:46, Oleksandr Andrushchenko wrote:

Hi, Julien!


On 07/31/2017 12:24 PM, Julien Grall wrote:

(+ Joao)

On 31/07/17 09:34, Oleksandr Andrushchenko wrote:

Hi, all!


Hi Oleksandr,


The aim of this mail is to highlight and discuss possible approaches to
implementing zero copying for PV drivers. Rationale behind this is that
there
are use-cases when drivers operate with big shared buffers, e.g.
display, when
memory copying from front’s buffer into back’s one may significantly hit
performance of the system (for example, for para-virtual display running
at full
HD resolution at 60Hz it is approximately 475MB/sec).

Assumptions (which actually fit ARM platforms, but can be extended to
other
platforms as well): Dom0 is a 1:1 mapped privileged domain, runs backend
driver/software DomU is an unprivileged domain without 1:1 memory
mapping, runs
frontend driver


I would rather avoid to stick with this assumption on ARM. This was
only meant to be a workaround for platform without IOMMU (see [1]) and
we will get into trouble when using IOMMU.

You are correct, thank you


For instance, there are no requirement to have the IOMMU supporting as
many as address bits than the processor. So 1:1 mapping here will not
be an option.



Buffer origin: while implementing zero copying the buffer allocation can
happen
either on DomU’s end or Dom0’s one depending on the use-case and HW
capabilities/availability: When DomU allocates: It cannot guarantee
physical
memory continuity of the buffers allocated Dom0’s HW *can* handle
non-contiguous
memory buffers allocated by DomU for memory operations (DMA, for
example), e.g.
either with IOMMU help or by any other means (HW block’s own MMU).  When
Dom0
allocates as it is mapped 1:1 it can allocate physically contiguous
memory
Dom0’s HW *cannot* handle non-contiguous memory buffers allocated by
DomU for
memory operations by any means.


I am not sure to follow this. How zero copy is related to 1:1 mapping?
Is it because you have hardware that does not support scatter/gather
IO or IOMMU?

yes, you got it right


Do you have any example of hardware? What are the performance you 
require with them?


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Oleksandr Andrushchenko


On 07/31/2017 12:47 PM, Julien Grall wrote:



On 31/07/17 10:46, Oleksandr Andrushchenko wrote:

Hi, Julien!


On 07/31/2017 12:24 PM, Julien Grall wrote:

(+ Joao)

On 31/07/17 09:34, Oleksandr Andrushchenko wrote:

Hi, all!


Hi Oleksandr,

The aim of this mail is to highlight and discuss possible 
approaches to
implementing zero copying for PV drivers. Rationale behind this is 
that

there
are use-cases when drivers operate with big shared buffers, e.g.
display, when
memory copying from front’s buffer into back’s one may 
significantly hit
performance of the system (for example, for para-virtual display 
running

at full
HD resolution at 60Hz it is approximately 475MB/sec).

Assumptions (which actually fit ARM platforms, but can be extended to
other
platforms as well): Dom0 is a 1:1 mapped privileged domain, runs 
backend

driver/software DomU is an unprivileged domain without 1:1 memory
mapping, runs
frontend driver


I would rather avoid to stick with this assumption on ARM. This was
only meant to be a workaround for platform without IOMMU (see [1]) and
we will get into trouble when using IOMMU.

You are correct, thank you


For instance, there are no requirement to have the IOMMU supporting as
many as address bits than the processor. So 1:1 mapping here will not
be an option.



Buffer origin: while implementing zero copying the buffer 
allocation can

happen
either on DomU’s end or Dom0’s one depending on the use-case and HW
capabilities/availability: When DomU allocates: It cannot guarantee
physical
memory continuity of the buffers allocated Dom0’s HW *can* handle
non-contiguous
memory buffers allocated by DomU for memory operations (DMA, for
example), e.g.
either with IOMMU help or by any other means (HW block’s own MMU).  
When

Dom0
allocates as it is mapped 1:1 it can allocate physically contiguous
memory
Dom0’s HW *cannot* handle non-contiguous memory buffers allocated by
DomU for
memory operations by any means.


I am not sure to follow this. How zero copy is related to 1:1 mapping?
Is it because you have hardware that does not support scatter/gather
IO or IOMMU?

yes, you got it right


Do you have any example of hardware? What are the performance you 
require with them?



Currently our target is Renesas R-Car Gen3
At the moment I don't have clean requirements, but
ideally, PV driver introduces 0% performance drop
Some time soon I will have numbers on running display/GPU
with and without zero-copy - will keep updated

Cheers,




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] xen/link: Move .data.rel.ro sections into .rodata for final link

2017-07-31 Thread Jan Beulich

>>> Andrew Cooper  07/30/17 2:50 PM >>>
>On 30/07/17 07:16, Jan Beulich wrote:
> David Woodhouse  07/20/17 5:22 PM >>>
>>> This includes stuff lke the hypercall tables which we really want
>>> to be read-only. And they were going into .data.read-mostly.
>> Yes, we'd like them to be read-only, but what if EFI properly assigned r/o
>> permissions to the .rodata section when loading xen.efi? We'd then be
>> unable to apply relocations when switching from 1:1 to virtual mappings
>> (see efi_arch_relocate_image()).
>
>Ah yes.  I'd overlooked that point when considering the ramifications of
>this change.
>
>efi_arch_relocate_image() should probably do the same as what we do with
>livepatching, and temporarily clear CR0.WP for the duration of the patching.

Yes, we could do that, but with some care - we should no play with CR0.WP
prior to ExitBootServices(), so we would need to avoid actually writing out
relocated values for that first pass even in the 64-bit reloc case.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2] xen: Implement hypercall for tracing of program counters

2017-07-31 Thread Wei Liu

On Mon, Jul 31, 2017 at 09:22:35AM +0100, Julien Grall wrote:
> Hi,
> 
> On 07/28/2017 04:37 PM, Wei Liu wrote:
> > On Wed, Jul 26, 2017 at 12:43:45PM +0200, Felix Schmoll wrote:
> > > +d->tracing_buffer = NULL;
> > > +
> > > +if ( copy_to_guest(buf, temp, d->tracing_buffer_pos) )
> > > +ret = -EFAULT;
> > > +
> > > +xfree(temp);
> > > +
> > > +ret = d->tracing_buffer_pos;
> > > +break;
> > > +}
> > > +
> > > +default:
> > > +ret = -ENOSYS;
> > 
> > EINVAL
> 
> Should not it be -EOPNOTSUPP to match return error when CONFIG_TRACE_PC is
> not?

AIUI EOPNOTSUPP means "This is a valid operation but I am not configured
to support it" while EINVAL means "This is an invalid value
(operation)".

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] xen/link: Move .data.rel.ro sections into .rodata for final link

2017-07-31 Thread Jan Beulich

>>> David Woodhouse  07/31/17 1:19 AM >>>
>On Sun, 2017-07-30 at 13:50 +0100, Andrew Cooper wrote:
>> On 30/07/17 07:16, Jan Beulich wrote:
>> > > > > David Woodhouse  07/20/17 5:22 PM >>>
>> > > This includes stuff lke the hypercall tables which we really want
>> > > to be read-only. And they were going into .data.read-mostly.
>> > Yes, we'd like them to be read-only, but what if EFI properly assigned r/o
>> > permissions to the .rodata section when loading xen.efi? We'd then be
>> > unable to apply relocations when switching from 1:1 to virtual mappings
>> > (see efi_arch_relocate_image()).
>> Ah yes.  I'd overlooked that point when considering the ramifications of
>> this change.
>> 
>> efi_arch_relocate_image() should probably do the same as what we do with
>> livepatching, and temporarily clear CR0.WP for the duration of the patching.
>
>Hm, efi/mkreloc.c was already emitting relocations in the .rodata
>section before this change. Are you saying that was already broken?

Are there any such relocations? The compiler shouldn't emit data needing
relocation to .rodata, so if at all such might live in assembly code. But yes,
if there are any, things would have been latently broken even before.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 extra 00/11] x86: refactor mm.c: page APIs and hypercalls

2017-07-31 Thread Jan Beulich

>>> Wei Liu  07/30/17 5:43 PM >>>
>Note that in the stubs I choose to return EINVAL but maybe we should just BUG()
>there because those paths aren't supposed to be taken when !CONFIG_PV. And I'm
>sure common code will BUG_ON() or BUG() sooner or later. Thoughts?

BUG() - yes, please.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Julien Grall




On 31/07/17 10:52, Oleksandr Andrushchenko wrote:

On 07/31/2017 12:47 PM, Julien Grall wrote:

On 31/07/17 10:46, Oleksandr Andrushchenko wrote:
Do you have any example of hardware? What are the performance you
require with them?


Currently our target is Renesas R-Car Gen3
At the moment I don't have clean requirements, but
ideally, PV driver introduces 0% performance drop
Some time soon I will have numbers on running display/GPU
with and without zero-copy - will keep updated


PV driver with 0% performance drop sounds a stretch target. But this is 
does not answer to my question. Do you have any hardware that does not 
support scatter/gather or not protected by an IOMMU that will be 
interfaced with PV drivers?


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] Regression PCI passthrough from 4.5.5 to 4.6.0-rc1

2017-07-31 Thread Jan Beulich

>>> "Andreas Kinzler"  07/17/17 6:32 PM >>>
>>> Jan, I still have access to the hardware so perhaps we can finally solve
>>> this problem.
>> Feel free to go ahead; I'll be on vacation for the next three weeks.
>
>Perhaps we can shortcut debugging a bit because I looked through the  
>patches of XenServer 7.2 and found the attached patch. Now I tried it and  
>it seems to solve all the problems. Does that patch look good to you, too?

Iirc the patch had even been submitted once, and rejected as being not
generally correct (i.e. it cures a symptom rather than the cause). What
we'd need to know is the order of actions the guest takes which ought to
result in the vector getting unmasked, but doesn't in reality.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2] xen: Implement hypercall for tracing of program counters

2017-07-31 Thread Julien Grall




On 31/07/17 10:56, Wei Liu wrote:

On Mon, Jul 31, 2017 at 09:22:35AM +0100, Julien Grall wrote:

Hi,

On 07/28/2017 04:37 PM, Wei Liu wrote:

On Wed, Jul 26, 2017 at 12:43:45PM +0200, Felix Schmoll wrote:

+d->tracing_buffer = NULL;
+
+if ( copy_to_guest(buf, temp, d->tracing_buffer_pos) )
+ret = -EFAULT;
+
+xfree(temp);
+
+ret = d->tracing_buffer_pos;
+break;
+}
+
+default:
+ret = -ENOSYS;


EINVAL


Should not it be -EOPNOTSUPP to match return error when CONFIG_TRACE_PC is
not?


AIUI EOPNOTSUPP means "This is a valid operation but I am not configured
to support it" while EINVAL means "This is an invalid value
(operation)".


Fair enough. However, you impose the caller to check -EINVAL and 
-EOPNOTSUPP in order to know if an operation can be done.


I first thought all the other hypercalls use -EOPNOTSUPP, but in face 
they use -ENOSYS. It would be to be consistent with the rest rather than 
reinventing our own.


--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] xen/link: Move .data.rel.ro sections into .rodata for final link

2017-07-31 Thread David Woodhouse

On Mon, 2017-07-31 at 03:57 -0600, Jan Beulich wrote:
> Are there any such relocations? The compiler shouldn't emit data needing
> relocation to .rodata, so if at all such might live in assembly code. But yes,
> if there are any, things would have been latently broken even before.


 $ git diff 33a0b4fe90f1ef1a104dd454c931bb46d417ffca^
diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 93ead6e..aa25dd9 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -194,7 +194,7 @@ $(TARGET).efi: prelink-efi.o $(note_file) efi.lds 
efi/relocs-dummy.o $(BASEDIR)/
if $(guard) false; then rm -f $@; echo 'EFI support disabled'; \
else $(NM) -pa --format=sysv $(@D)/$(@F) \
| $(BASEDIR)/tools/symbols --xensyms --sysv --sort 
>$(@D)/$(@F).map; fi
-   rm -f $(@D)/.$(@F).[0-9]*
+   #rm -f $(@D)/.$(@F).[0-9]*
 
 efi/boot.init.o efi/runtime.o efi/compat.o efi/buildid.o: 
$(BASEDIR)/arch/x86/efi/built_in.o
 efi/boot.init.o efi/runtime.o efi/compat.o efi/buildid.o: ;
diff --git a/xen/arch/x86/efi/mkreloc.c b/xen/arch/x86/efi/mkreloc.c
index bddcce0..55d14a7 100644
--- a/xen/arch/x86/efi/mkreloc.c
+++ b/xen/arch/x86/efi/mkreloc.c
@@ -346,6 +346,7 @@ int main(int argc, char *argv[])
  memcmp(sec1[i].name, ".lockpro", sizeof(sec1[i].name)) == 0 )
 continue;
 
+   printf("# section %.*s\n", (int)sizeof(sec1[i].name), sec1[i].name);
 if ( !sec1[i].rva )
 {
 fprintf(stderr, "Can't handle section %u with zero RVA\n", i);
 $ grep -A36 rodata .xen.efi.0r.S
# section .rodata
.equ rva_0020_relocs, 0x0c
.balign 4
.long 0x41b000, rva_0041b000_relocs
.word (10 << 12) | 0x920
.word (10 << 12) | 0x928
.word (10 << 12) | 0x930
.word (10 << 12) | 0x938
.word (10 << 12) | 0x940
.word (10 << 12) | 0x948
.word (10 << 12) | 0x950
.word (10 << 12) | 0x958
.word (10 << 12) | 0x960
.word (10 << 12) | 0x968
.word (10 << 12) | 0x970
.word (10 << 12) | 0x978
.word (10 << 12) | 0x980
.word (10 << 12) | 0x988
.word (10 << 12) | 0x990
.word (10 << 12) | 0x998
.word (10 << 12) | 0x9a0
.word (10 << 12) | 0x9a8
.word (10 << 12) | 0x9b0
.word (10 << 12) | 0x9b8
.word (10 << 12) | 0x9c0
.word (10 << 12) | 0x9c8
.word (10 << 12) | 0x9d0
.word (10 << 12) | 0x9d8
.word (10 << 12) | 0x9e0
.word (10 << 12) | 0x9e8
.word (10 << 12) | 0x9f0
.word (10 << 12) | 0x9f8
.word (10 << 12) | 0xa00
.word (10 << 12) | 0xa08
.word (10 << 12) | 0xa10
.word (10 << 12) | 0xa18
# section .init.te


smime.p7s
Description: S/MIME cryptographic signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2] xen: Implement hypercall for tracing of program counters

2017-07-31 Thread Wei Liu

On Mon, Jul 31, 2017 at 11:13:33AM +0100, Julien Grall wrote:
> 
> 
> On 31/07/17 10:56, Wei Liu wrote:
> > On Mon, Jul 31, 2017 at 09:22:35AM +0100, Julien Grall wrote:
> > > Hi,
> > > 
> > > On 07/28/2017 04:37 PM, Wei Liu wrote:
> > > > On Wed, Jul 26, 2017 at 12:43:45PM +0200, Felix Schmoll wrote:
> > > > > +d->tracing_buffer = NULL;
> > > > > +
> > > > > +if ( copy_to_guest(buf, temp, d->tracing_buffer_pos) )
> > > > > +ret = -EFAULT;
> > > > > +
> > > > > +xfree(temp);
> > > > > +
> > > > > +ret = d->tracing_buffer_pos;
> > > > > +break;
> > > > > +}
> > > > > +
> > > > > +default:
> > > > > +ret = -ENOSYS;
> > > > 
> > > > EINVAL
> > > 
> > > Should not it be -EOPNOTSUPP to match return error when CONFIG_TRACE_PC is
> > > not?
> > 
> > AIUI EOPNOTSUPP means "This is a valid operation but I am not configured
> > to support it" while EINVAL means "This is an invalid value
> > (operation)".
> 
> Fair enough. However, you impose the caller to check -EINVAL and -EOPNOTSUPP
> in order to know if an operation can be done.
> 
> I first thought all the other hypercalls use -EOPNOTSUPP, but in face they
> use -ENOSYS. It would be to be consistent with the rest rather than
> reinventing our own.
> 

Oh, if that is already an established convention, I think using ENOSYS
is fine. Felix, please keep it as ENOSYS.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Oleksandr Andrushchenko




On 07/31/2017 01:04 PM, Julien Grall wrote:



On 31/07/17 10:52, Oleksandr Andrushchenko wrote:

On 07/31/2017 12:47 PM, Julien Grall wrote:

On 31/07/17 10:46, Oleksandr Andrushchenko wrote:
Do you have any example of hardware? What are the performance you
require with them?


Currently our target is Renesas R-Car Gen3
At the moment I don't have clean requirements, but
ideally, PV driver introduces 0% performance drop
Some time soon I will have numbers on running display/GPU
with and without zero-copy - will keep updated


PV driver with 0% performance drop sounds a stretch target.

It is, but we should always get as close as possible
But this is does not answer to my question. Do you have any hardware 
that does not support scatter/gather

AFAIK display driver which doesn't support scatter-gather on our platform
(BSP based on 4.9 kernel, rcar_du uses DRM CMA - DRM contiguous memory 
allocator)

Anyways, for pages above 4GB even scatter-gather will not help
devices with 32-bit DMA

or not protected by an IOMMU that will be interfaced with PV drivers?


As per my understanding, IOMMU is solely owned by the hypervisor now
and there is no API to tell Xen from Dom0 to setup IOMMU for such
a buffer (pages), so display HW can do DMA with that buffer.
Thus, Dom0 has no means to do that work and make PV driver produce
buffers which can be used by the real HW driver without bounce buffering.

Cheers,




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] xen/link: Move .data.rel.ro sections into .rodata for final link

2017-07-31 Thread David Woodhouse

On Sun, 2017-07-30 at 00:16 -0600, Jan Beulich wrote:
> > 
> > > 
> > > > 
> > > > David Woodhouse  07/20/17 5:22 PM >>>
> > This includes stuff lke the hypercall tables which we really want
> > to be read-only. And they were going into .data.read-mostly.
> Yes, we'd like them to be read-only, but what if EFI properly assigned r/o
> permissions to the .rodata section when loading xen.efi? We'd then be
> unable to apply relocations when switching from 1:1 to virtual mappings
> (see efi_arch_relocate_image()).


FWIW it does look like TianoCore has gained the ability to mark
sections as read-only, in January of this year:
https://github.com/tianocore/edk2/commit/d0e92aad46

It doesn't actually seem to be complete — even with subsequent fixes
since that commit, it doesn't look like it catches the case of data
sections without EFI_IMAGE_SCN_MEM_WRITE, such as .rodata. 

And even if/when that gets fixed you'll note that the protection is
deliberately torn down in ExitBootServices(), specifically for the case
you're concerned about below — because you'll need to do the
relocations.

So I don't think there should be a problem here.

smime.p7s
Description: S/MIME cryptographic signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Joao Martins

Hey Oleksandr,

On 07/31/2017 09:34 AM, Oleksandr Andrushchenko wrote:
> Hi, all!
> 
[snip]
> 
> Comparison for display use-case
> ===
> 
> 1 Number of grant references used
> 1-1 grant references: nr_pages
> 1-2 GNTTABOP_transfer: nr_pages
> 1-3 XENMEM_exchange: not an option
> 
> 2 Effect of DomU crash on Dom0 (its mapped pages)
> 2-1 grant references: pages can be unmapped by Dom0, Dom0 is fully 
> recovered
> 2-2 GNTTABOP_transfer: pages will be returned to the Hypervisor, lost 
> for Dom0
> 2-3 XENMEM_exchange: not an option
> 
> 3 Security issues from sharing Dom0 pages to DomU
> 1-1 grant references: none
> 1-2 GNTTABOP_transfer: none
> 1-3 XENMEM_exchange: not an option
> 
> At the moment approach 1 with granted references seems to be a winner for
> sharing buffers both ways, e.g. Dom0 -> DomU and DomU -> Dom0.
> 
> Conclusion
> ==
> 
> I would like to get some feedback from the community on which approach 
> is more
> suitable for sharing large buffers and to have a clear vision on cons 
> and pros
> of each one: please feel free to add other metrics I missed and correct 
> the ones
> I commented on.  I would appreciate help on comparing approaches 2 and 3 
> as I
> have little knowledge of these APIs (2 seems to be addressed by 
> Christopher, and
> 3 seems to be relevant to what Konrad/Stefano do WRT SWIOTLB).

Depending on your performance/memory requirements - there could be another
option which is to keep the guest mapped on Domain-0 (what was discussed with
Zerogrant session[0][1] that will be formally proposed in the next month or so).
But that would only solve the grant maps/unmaps/copies done on Domain-0 (given
the numbers you pasted a bit ago, you might not really need to go to such 
extents)

[0]
http://schd.ws/hosted_files/xendeveloperanddesignsummit2017/05/zerogrant_spec.pdf
[1]
http://schd.ws/hosted_files/xendeveloperanddesignsummit2017/a8/zerogrant_slides.pdf

For the buffers allocated on Dom0 and safely grant buffers from Dom0 to DomU
(which I am not so sure it is possible today :(), maybe a "contract" from DomU
provide a set of transferable pages that Dom0 holds on for each Dom-0 gref
provided to the guest (and assuming this is only a handful couple of guests as
grant table is not that big). IIUC, From what you pasted above on "Buffer
allocated @Dom0" sounds like Domain-0 could quickly ran out of pages/OOM (and
grants), if you're guest is misbehaving/buggy or malicious; *also* domain-0
grant table is a rather finite/small resource (even though you can override the
number of frames in the arguments).

Joao

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Joao Martins

On 07/31/2017 11:37 AM, Oleksandr Andrushchenko wrote:
> On 07/31/2017 01:04 PM, Julien Grall wrote:
>> On 31/07/17 10:52, Oleksandr Andrushchenko wrote:
>>> On 07/31/2017 12:47 PM, Julien Grall wrote:
 On 31/07/17 10:46, Oleksandr Andrushchenko wrote:
 Do you have any example of hardware? What are the performance you
 require with them?

>>> Currently our target is Renesas R-Car Gen3
>>> At the moment I don't have clean requirements, but
>>> ideally, PV driver introduces 0% performance drop
>>> Some time soon I will have numbers on running display/GPU
>>> with and without zero-copy - will keep updated
>>
>> PV driver with 0% performance drop sounds a stretch target.
> It is, but we should always get as close as possible
>> But this is does not answer to my question. Do you have any hardware 
>> that does not support scatter/gather
> AFAIK display driver which doesn't support scatter-gather on our platform
> (BSP based on 4.9 kernel, rcar_du uses DRM CMA - DRM contiguous memory 
> allocator)
> Anyways, for pages above 4GB even scatter-gather will not help
> devices with 32-bit DMA
>> or not protected by an IOMMU that will be interfaced with PV drivers?
>>
> As per my understanding, IOMMU is solely owned by the hypervisor now
> and there is no API to tell Xen from Dom0 to setup IOMMU for such
> a buffer (pages), so display HW can do DMA with that buffer.
> Thus, Dom0 has no means to do that work and make PV driver produce
> buffers which can be used by the real HW driver without bounce buffering.

Sounds like it is addressed by PV-IOMMU[0] which I think it will be resurrected
in the coming months as per the design session last hackaton.

[0] https://schd.ws/hosted_files/xendeveloperanddesignsummit2017/91/PV-IOMMU.txt

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2] xen: Implement hypercall for tracing of program counters

2017-07-31 Thread Ian Jackson

Wei Liu writes ("Re: [Xen-devel] [PATCH v2] xen: Implement hypercall for 
tracing of program counters"):
> On Mon, Jul 31, 2017 at 09:22:35AM +0100, Julien Grall wrote:
..
> > Should not it be -EOPNOTSUPP to match return error when CONFIG_TRACE_PC is
> > not?
> 
> AIUI EOPNOTSUPP means "This is a valid operation but I am not configured
> to support it" while EINVAL means "This is an invalid value
> (operation)".

EOPNOTSUPP means "someone somewhere might think this is valid, but I
don't understand it".  It can be used, for example, for unknown
operation numbers.

"ENOSYS" is used in exactly the same situation, but where the enum
whose value is not understood is precisely the syscall number.  In the
context of the hypervisor I think ENOSYS is used for "unknown
hypercall number".  I haven't checked whether it is used for "unknown
operation number" but I suspect that the hypervisor users EOPNOTSUPP
for that.

It would be sensible if hypervisor maintainers were to write this
stuff down somewhere.

EINVAL means "I definitely know that this is invalid".  It should
rarely be used for an unknown value of an enum, since enums can gain
new values in future implementations.  It can be used for "value out
of range" or "I understand this combination of parameters, but it is
not meaningful".  It should not be used for "I understand this
combination of parameters, and I do not implement it, even though in
principle the combination might somehow be implemented in the future".

In this particular case I suspect that EOPNOTSUPP is right.  EINVAL is
clearly wrong.

While looking at the original patch, I saw this:

> +if ( !d )
> +return -EINVAL; /* invalid domain */

Is this conventional ?  EINVAL is a remarkably unhelpful error code
for this case.  IMO the hypervisor ought to have a dedicated error
code for "referenced domain does not exist".  But maybe it doesn't.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [PATCH 3/3] docs: hook up process/ to build system

2017-07-31 Thread Wei Liu

Signed-off-by: Wei Liu 
---
 docs/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/Makefile b/docs/Makefile
index 942247202a..6743fa3744 100644
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -17,7 +17,7 @@ MARKDOWNSRC-y := $(sort $(shell find misc -name '*.markdown' 
-print))
 
 TXTSRC-y := $(sort $(shell find misc -name '*.txt' -print))
 
-PANDOCSRC-y := $(sort $(shell find features/ misc/ specs/ -name '*.pandoc' 
-print))
+PANDOCSRC-y := $(sort $(shell find process/ features/ misc/ specs/ -name 
'*.pandoc' -print))
 
 # Documentation targets
 DOC_MAN1 := $(patsubst man/%.pod.1,man1/%.1,$(MAN1SRC-y)) \
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [PATCH 0/3] Docs: consolidate release related documents

2017-07-31 Thread Wei Liu

Wei Liu (3):
  docs: consolidate release related documents
  docs: add xen-release-management.pandoc
  docs: hook up process/ to build system

 docs/Makefile  |   2 +-
 {misc => docs/process}/branching-checklist.txt |   0
 {misc => docs/process}/release-checklist.txt   |   0
 docs/process/xen-release-management.pandoc | 594 +
 4 files changed, 595 insertions(+), 1 deletion(-)
 rename {misc => docs/process}/branching-checklist.txt (100%)
 rename {misc => docs/process}/release-checklist.txt (100%)
 create mode 100644 docs/process/xen-release-management.pandoc

-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [PATCH 1/3] docs: consolidate release related documents

2017-07-31 Thread Wei Liu

Move the existing docs from misc to docs/process.

Signed-off-by: Wei Liu 
---
 {misc => docs/process}/branching-checklist.txt | 0
 {misc => docs/process}/release-checklist.txt   | 0
 2 files changed, 0 insertions(+), 0 deletions(-)
 rename {misc => docs/process}/branching-checklist.txt (100%)
 rename {misc => docs/process}/release-checklist.txt (100%)

diff --git a/misc/branching-checklist.txt b/docs/process/branching-checklist.txt
similarity index 100%
rename from misc/branching-checklist.txt
rename to docs/process/branching-checklist.txt
diff --git a/misc/release-checklist.txt b/docs/process/release-checklist.txt
similarity index 100%
rename from misc/release-checklist.txt
rename to docs/process/release-checklist.txt
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [PATCH 2/3] docs: add xen-release-management.pandoc

2017-07-31 Thread Wei Liu

A document for the release manager.

Signed-off-by: Wei Liu 
---
 docs/process/xen-release-management.pandoc | 594 +
 1 file changed, 594 insertions(+)
 create mode 100644 docs/process/xen-release-management.pandoc

diff --git a/docs/process/xen-release-management.pandoc 
b/docs/process/xen-release-management.pandoc
new file mode 100644
index 00..afdf559429
--- /dev/null
+++ b/docs/process/xen-release-management.pandoc
@@ -0,0 +1,594 @@
+% Xen Release Management
+% Wei Liu <>
+% Revision 1
+
+# Motivation
+
+Over the years we have had different people signing up as the Release Manager
+of Xen. It would be rather wasteful if every new Release Manager has to go over
+everything and tripped over by the same mistakes again and again.
+
+This file intends to document the process of managing a Xen release. It is
+mainly written for Release Manager, but other roles (contributors,
+maintainers and committers) are also encouraged to read this document, so
+that they can have an idea what to expect from the Release Manager.
+
+# Xen release cycle
+
+The Xen hypervisor project now releases twice a year, at the beginning of
+June and the beginning of December. The actual release date depends on a lot
+of factors.
+
+We can roughly divide one release into two periods. The development period
+and the freeze period. The former is 4 months long and the latter is about 2
+months long.
+
+During development period, contributors submit patches to be reviewed and
+committed into xen.git. All feature patches must be committed before a date,
+which is normally called the "cut-off date", after which the freeze period
+starts. There will be a date before which all patches that wish to be merged
+for the release should be posted -- it is normally called the "last posting
+date" and it is normally two weeks before the "cut-off date".
+
+During freeze period, the tree is closed for new features. Only bug fixes are
+accepted. This period can be shorter or longer than 2 months. If it ends up
+longer than 2 months, it eats into the next development period.
+
+Here is a conjured up example (use ```cal 2017``` to get an idea):
+
+* Development period: 2017 June 11 - 2017 September 29
+* the "cut-off date" is 2017 September 29
+* the "last posting date" is 2017 September 15
+* Freeze period: 2017 October 2 - 2017 December 7
+* the anticipated release date is 2017 December 7
+
+# The different roles in a Xen release
+
+## Release Manager
+
+A trusted developer in the community that owns the release process. The major
+goal of the Release Manager is to make sure a Xen release has high quality
+and doesn't slip too much.
+
+The Release Manager will not see much workload during development period, but
+expects to see increasing workload during the freeze period until the final
+release. He or she is expected to keep track of issues, arrange RCs,
+negotiate with relevant stakeholders, balance the need from various parties
+and make difficult decisions when necessary.
+
+The Release Manager essentially owns xen-unstable branch during the freeze
+period. The Committers will act on the wishes of the Release Manager during
+that time.
+
+## Maintainers
+
+A group of trusted developers who are responsible for certain components in
+xen.git. They are expected to respond to patches / questions with regard to
+their components in a timely manner, especially during the freeze period.
+
+## Committers
+
+A group of trusted maintainers who can commit to xen.git. During the
+development window they normally push things as they see fit. During the
+freeze period they transfer xen-unstable branch ownership and act on the
+wishes of the Release Manager. That normally means they need to have an
+Release Ack in order to push a patch.
+
+## Contributors
+
+Contributors are also expected to respond quickly to any issues regarding the
+code they submitted during development period. Failing that, the Release
+Manager might decide to revert the changes, declare feature unsupported or
+take any action he / she deems appropriate.
+
+## The Security Team
+
+The Security Team operates independently. The visibility might be rather
+limited due to the sensitive nature of security work. The best action the
+Release Manager can take is to set aside some time for potential security
+issues to be fixed.
+
+## The Release Technician
+
+The Release Technician is the person who tags various trees, prepares tarball
+etc. He or she acts on the wishes of the Release Manager. Please make sure
+the communication is as clear as it can be.
+
+## The Community Manager
+
+The Community Manager owns xenproject.org infrastructure. He or she is
+responsible for updating various web archives, updating wiki pages and
+coordinating with the PR Personnel.
+
+## The PR Personnel
+
+They are responsible for coordinating with external reporters to publish Xen
+release announcement. The Release Manager should be absolutely sure the
+release is going

Re: [Xen-devel] [XenSummit 2017] Build tools follow up

2017-07-31 Thread Oleksandr Andrushchenko


ping

On 07/17/2017 03:47 PM, Oleksandr Andrushchenko wrote:

Hi, all!

This is a follow-up on Xen distribution build systems we saw at the
summit and invitation for sharing thoughts and ways we build our 
images and
distros. I would like to specifically ask OpenXT project to reply with 
the
description of their build system so we have clear picture of what is 
being
developed and used around. And of course if there are other approaches 
to do the

same you are welcome to share those as well.

On our side at EPAM we have developed a distribution which is called 
xt-distro

(xt stays for Xen troops [0]) with the following goals in mind:
1. Make it possible to easily build different images as a single 
distribution

   - Separate images for different domains should be combined into a
 distribution, e.g. set of images/artifacts required to run them as a
 system
   - Xt-distro allows doing so as easy as running bitbake with the 
predefined

 target, e.g. “bitbake xt-image”
2. Easily deal with different BSPs for different platforms using a 
unified way

   with little or no EPAM specific code/scripts
   - We use bitbake and stripped version of Poky/meta-openembedded, so 
there are
 little in-house extensions we added to deal with Yocto based BSPs 
[1], [2]
   - We use bitbake’s well defined scripting language [3], so no yet 
another

 language to learn
   - We use Google repo tool to maintain sets of meta layers as 
manifest files [4]

3. Ability to easily collect component revisions, so any build can be
   reproduced if need be
   - We save commit IDs of every component of the build including 
versions of the

 meta layers
4. Ability to easily customize and tune builds, e.g. URIs, 
versions/commitIDs,

   branches used
   - This is purely done in bitbake’s recipe language
5. Make patching process easy
   - With our extensions you can use bblayers.conf, local.conf, 
patches which

 are part of a meta layer which is usually a manual step
6. Code reuse is a must, e.g. the same set of software must be easily 
built for
   different platforms without copying build scripts of the existing 
components
   - With bitbake’s meta layers we define generic recipes, e.g. 
suitable for all

 platforms [5] (think of it as a library) and tuning meta layers [6]
 (we use product concept here) which define specific 
versions/revisions of the

 components, apply specific patches and add/remove software
7. Cross compilation must be an easy task to do
   - This is easily achieved with Yocto builds and usually not a 
problem for other

 build systems as well (with SDKs)
   - Allows building for x86/ARM and other architectures as well
8. There must be a simple way to share build artifacts of different 
domains between
   each other, for example, DomU’s kernel should be a part of Dom0’s 
rootfs, so

   xl/libxl can access it
   - This is achieved with bitbake’s recipes, so no EPAM extensions
9. Possibility to easily use different build systems for different 
components of

   the system.
   - Bitbake allows you building makefile based projects, CMake etc. 
out of the box,

 so adding firmware, stubdoms made easy
10. Development support
   - We use SDKs built by bitbake so during everyday development you 
are packed with
 all that is needed to re-create build environment (with both host 
and target

 environments, e.g. x86 host tools and ARM cross-compiler)
   - Speed up builds for developers within organization, e.g. we reuse 
downloads and

 build cache between all of the developer’s machines on our network
11. Make the build system suitable for build automation
- We have created a python script which can easily be used with
  Jenkins or even cron [7]

For example, with xt-distro our development product delivers: Xen 
image and
policy file, Dom0 kernel and dtb images (Dom0 + DomU), Dom0 rootfs 
image (with

embedded kernels and configuration for DomU), DomU rootfs image. This
effectively means, that the build system can provide you with the 
complete set

of images to run your product/distribution.

We are looking forward for any kind of feedback and will be glad to 
collaborate

on the above.

Thank you,
Xen-troops at EPAM

[0] https://github.com/xen-troops/xt-distro
[1] https://git.yoctoproject.org/cgit/cgit.cgi/poky
[2] http://cgit.openembedded.org/meta-openembedded
[3] 
http://www.yoctoproject.org/docs/2.3/bitbake-user-manual/bitbake-user-manual.html

[4] https://gerrit.googlesource.com/git-repo
[5] https://github.com/xen-troops/meta-xt-images
[6] https://github.com/xen-troops/meta-xt-prod-devel
[7] https://github.com/xen-troops/build-scripts




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Oleksandr Andrushchenko


Hi, Joao!

On 07/31/2017 02:03 PM, Joao Martins wrote:

Hey Oleksandr,

On 07/31/2017 09:34 AM, Oleksandr Andrushchenko wrote:

Hi, all!


[snip]

Comparison for display use-case
===

1 Number of grant references used
1-1 grant references: nr_pages
1-2 GNTTABOP_transfer: nr_pages
1-3 XENMEM_exchange: not an option

2 Effect of DomU crash on Dom0 (its mapped pages)
2-1 grant references: pages can be unmapped by Dom0, Dom0 is fully
recovered
2-2 GNTTABOP_transfer: pages will be returned to the Hypervisor, lost
for Dom0
2-3 XENMEM_exchange: not an option

3 Security issues from sharing Dom0 pages to DomU
1-1 grant references: none
1-2 GNTTABOP_transfer: none
1-3 XENMEM_exchange: not an option

At the moment approach 1 with granted references seems to be a winner for
sharing buffers both ways, e.g. Dom0 -> DomU and DomU -> Dom0.

Conclusion
==

I would like to get some feedback from the community on which approach
is more
suitable for sharing large buffers and to have a clear vision on cons
and pros
of each one: please feel free to add other metrics I missed and correct
the ones
I commented on.  I would appreciate help on comparing approaches 2 and 3
as I
have little knowledge of these APIs (2 seems to be addressed by
Christopher, and
3 seems to be relevant to what Konrad/Stefano do WRT SWIOTLB).

Depending on your performance/memory requirements - there could be another
option which is to keep the guest mapped on Domain-0 (what was discussed with
Zerogrant session[0][1] that will be formally proposed in the next month or so).

Unfortunately I missed that session during the Summit
due to overlapping sessions

But that would only solve the grant maps/unmaps/copies done on Domain-0 (given
the numbers you pasted a bit ago, you might not really need to go to such 
extents)

[0]
http://schd.ws/hosted_files/xendeveloperanddesignsummit2017/05/zerogrant_spec.pdf
[1]
http://schd.ws/hosted_files/xendeveloperanddesignsummit2017/a8/zerogrant_slides.pdf

I will read these, thank you for the links

For the buffers allocated on Dom0 and safely grant buffers from Dom0 to DomU
(which I am not so sure it is possible today :()

We have this working in our setup for display (we have implemented
z-copy with grant references already)

, maybe a "contract" from DomU
provide a set of transferable pages that Dom0 holds on for each Dom-0 gref
provided to the guest (and assuming this is only a handful couple of guests as
grant table is not that big).

It is an option

  IIUC, From what you pasted above on "Buffer
allocated @Dom0" sounds like Domain-0 could quickly ran out of pages/OOM (and
grants), if you're guest is misbehaving/buggy or malicious; *also* domain-0
grant table is a rather finite/small resource (even though you can override the
number of frames in the arguments).

Well, you are right. But, we are focusing on embedded appliances,
so those systems we use are not that "dynamic" with that respect.
Namely: we have fixed number of domains and their functionality
is well known, so we can do rather precise assumption on resource
usage.

Joao

Thank you,
Oleksandr

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Joao Martins

On 07/31/2017 12:41 PM, Oleksandr Andrushchenko wrote:
> Hi, Joao!
> 
> On 07/31/2017 02:03 PM, Joao Martins wrote:
>> Hey Oleksandr,
>>
>> On 07/31/2017 09:34 AM, Oleksandr Andrushchenko wrote:
>>> Hi, all!
>>>
>> [snip]
>>> Comparison for display use-case
>>> ===
>>>
>>> 1 Number of grant references used
>>> 1-1 grant references: nr_pages
>>> 1-2 GNTTABOP_transfer: nr_pages
>>> 1-3 XENMEM_exchange: not an option
>>>
>>> 2 Effect of DomU crash on Dom0 (its mapped pages)
>>> 2-1 grant references: pages can be unmapped by Dom0, Dom0 is fully
>>> recovered
>>> 2-2 GNTTABOP_transfer: pages will be returned to the Hypervisor, lost
>>> for Dom0
>>> 2-3 XENMEM_exchange: not an option
>>>
>>> 3 Security issues from sharing Dom0 pages to DomU
>>> 1-1 grant references: none
>>> 1-2 GNTTABOP_transfer: none
>>> 1-3 XENMEM_exchange: not an option
>>>
>>> At the moment approach 1 with granted references seems to be a winner for
>>> sharing buffers both ways, e.g. Dom0 -> DomU and DomU -> Dom0.
>>>
>>> Conclusion
>>> ==
>>>
>>> I would like to get some feedback from the community on which approach
>>> is more
>>> suitable for sharing large buffers and to have a clear vision on cons
>>> and pros
>>> of each one: please feel free to add other metrics I missed and correct
>>> the ones
>>> I commented on.  I would appreciate help on comparing approaches 2 and 3
>>> as I
>>> have little knowledge of these APIs (2 seems to be addressed by
>>> Christopher, and
>>> 3 seems to be relevant to what Konrad/Stefano do WRT SWIOTLB).
>> Depending on your performance/memory requirements - there could be another
>> option which is to keep the guest mapped on Domain-0 (what was discussed with
>> Zerogrant session[0][1] that will be formally proposed in the next month or 
>> so).
> Unfortunately I missed that session during the Summit
> due to overlapping sessions

Hmm - Zerocopy Rx (Dom0 -> DomU) would indeed be an interesting topic to bring 
up.

>> But that would only solve the grant maps/unmaps/copies done on Domain-0 
>> (given
>> the numbers you pasted a bit ago, you might not really need to go to such 
>> extents)
>>
>> [0]
>> http://schd.ws/hosted_files/xendeveloperanddesignsummit2017/05/zerogrant_spec.pdf
>> [1]
>> http://schd.ws/hosted_files/xendeveloperanddesignsummit2017/a8/zerogrant_slides.pdf
> I will read these, thank you for the links
>> For the buffers allocated on Dom0 and safely grant buffers from Dom0 to DomU
>> (which I am not so sure it is possible today :()
> We have this working in our setup for display (we have implemented
> z-copy with grant references already)

Allow me to clarify :) I meant "possible to do it in a safely manner", IOW,
regarding what I mentioned below in following paragraphs. But your answer below
clarifies on that aspect.

>> , maybe a "contract" from DomU
>> provide a set of transferable pages that Dom0 holds on for each Dom-0 gref
>> provided to the guest (and assuming this is only a handful couple of guests 
>> as
>> grant table is not that big).
> It is an option
>>
>>   IIUC, From what you pasted above on "Buffer
>> allocated @Dom0" sounds like Domain-0 could quickly ran out of pages/OOM (and
>> grants), if you're guest is misbehaving/buggy or malicious; *also* domain-0
>> grant table is a rather finite/small resource (even though you can override 
>> the
>> number of frames in the arguments).
> Well, you are right. But, we are focusing on embedded appliances,
> so those systems we use are not that "dynamic" with that respect.
> Namely: we have fixed number of domains and their functionality
> is well known, so we can do rather precise assumption on resource
> usage.

Interesting! So here I presume backend trusts the frontend.

Cheers,
Joao

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2] xen: Implement hypercall for tracing of program counters

2017-07-31 Thread Paul Durrant

> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Ian
> Jackson
> Sent: 31 July 2017 12:14
> To: Wei Liu 
> Cc: sstabell...@kernel.org; Felix Schmoll ;
> Andrew Cooper ; Julien Grall
> ; jbeul...@suse.com; xen-
> de...@lists.xenproject.org
> Subject: Re: [Xen-devel] [PATCH v2] xen: Implement hypercall for tracing of
> program counters
> 
> Wei Liu writes ("Re: [Xen-devel] [PATCH v2] xen: Implement hypercall for
> tracing of program counters"):
> > On Mon, Jul 31, 2017 at 09:22:35AM +0100, Julien Grall wrote:
> ..
> > > Should not it be -EOPNOTSUPP to match return error when
> CONFIG_TRACE_PC is
> > > not?
> >
> > AIUI EOPNOTSUPP means "This is a valid operation but I am not configured
> > to support it" while EINVAL means "This is an invalid value
> > (operation)".
> 
> EOPNOTSUPP means "someone somewhere might think this is valid, but I
> don't understand it".  It can be used, for example, for unknown
> operation numbers.
> 
> "ENOSYS" is used in exactly the same situation, but where the enum
> whose value is not understood is precisely the syscall number.  In the
> context of the hypervisor I think ENOSYS is used for "unknown
> hypercall number".  I haven't checked whether it is used for "unknown
> operation number" but I suspect that the hypervisor users EOPNOTSUPP
> for that.

Nope. It's ENOSYS for that case too (certainly for hvm and memory ops... which 
is what I checked).

  Paul

> 
> It would be sensible if hypervisor maintainers were to write this
> stuff down somewhere.
> 
> EINVAL means "I definitely know that this is invalid".  It should
> rarely be used for an unknown value of an enum, since enums can gain
> new values in future implementations.  It can be used for "value out
> of range" or "I understand this combination of parameters, but it is
> not meaningful".  It should not be used for "I understand this
> combination of parameters, and I do not implement it, even though in
> principle the combination might somehow be implemented in the future".
> 
> In this particular case I suspect that EOPNOTSUPP is right.  EINVAL is
> clearly wrong.
> 
> While looking at the original patch, I saw this:
> 
> > +if ( !d )
> > +return -EINVAL; /* invalid domain */
> 
> Is this conventional ?  EINVAL is a remarkably unhelpful error code
> for this case.  IMO the hypervisor ought to have a dedicated error
> code for "referenced domain does not exist".  But maybe it doesn't.
> 
> Ian.
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 00/13] "Non-shared" IOMMU support on ARM

2017-07-31 Thread Oleksandr Tyshchenko

Hi, Kevin

On Mon, Jul 31, 2017 at 8:57 AM, Tian, Kevin  wrote:
>> From: Oleksandr Tyshchenko
>> Sent: Wednesday, July 26, 2017 1:27 AM
>>
>> From: Oleksandr Tyshchenko 
>>
>> Hi, all.
>>
>> The purpose of this patch series is to create a base for porting
>> any "Non-shared" IOMMUs to Xen on ARM. Saying "Non-shared" IOMMU I
>> mean
>> the IOMMU that can't share the page table with the CPU.
>
> Is "non-shared" IOMMU a standard terminology in ARM side? I quickly
> searched to find it mostly used in this thread...
I don't think that it is a standard terminology.

>
> On the other hand, all IOMMUs support a basic DMA remapping
> mechanism with page table not shared with CPU. Then some IOMMUs
> may optional support Shared Virtual Memory (SVM) through page
> sharing with CPU. Then I'm not sure why need to highlight the
> "non-shared" manner in this thread, instead of just saying
> IPMMU-VMSA support...
I wouldn't use "IPMMU-VMSA support" in this thread since it may be any
other IOMMUs which can't share page table
with CPU because of format incompatibilities.
I needed something short to describe such IOMMUs, but, If title
"non-shared" IOMMU sounds confusing
I won't use it anymore. Do you have something in mind?

>
>> Primarily, we are interested in IPMMU-VMSA and I hope that it will be the
>> first candidate.
>> It is VMSA-compatible IOMMU that integrated in the newest Renesas R-Car
>> Gen3 SoCs (ARM).
>> I am about to push IPMMU-VMSA support in a while.
>>
>> With regard to the patch series, it was rebased on Xen 4.9.0 release and
>> tested on Renesas R-Car Gen3
>> H3/M3 based boards with applied IPMMU-VMSA support:
>> - Patches 1 and 3 have Julien's Rb.
>> - Patch 2 has Jan's Rb but only for x86 and generic parts.
>> - Patch 4 has Julien's Ab.
>> - Patches 5,6,9,10 were slightly reworked.
>> - Patch 7 was significantly reworked. The previous patch -> iommu: Split
>> iommu_hwdom_init() into arch specific parts
>> - Patches 8,11,12,13 are new.
>>
>> Not really sure about x86-related changes since I had no possibility to 
>> check.
>> So, compile-tested on x86.
>>
>> You can find current patch series here:
>> repo: https://github.com/otyshchenko1/xen.git branch:
>> non_shared_iommu_v2
>>
>> Previous patch series here:
>> [PATCH v1 00/10] "Non-shared" IOMMU support on ARM
>> https://www.mail-archive.com/xen-devel@lists.xen.org/msg107532.html
>>
>> [RFC PATCH 0/9] "Non-shared" IOMMU support on ARM
>> https://www.mail-archive.com/xen-devel@lists.xen.org/msg100468.html
>>
>> Thank you.
>>
>> Oleksandr Tyshchenko (13):
>>   xen/device-tree: Add dt_count_phandle_with_args helper
>>   iommu: Add extra order argument to the IOMMU APIs and platform
>> callbacks
>>   xen/arm: p2m: Add helper to convert p2m type to IOMMU flags
>>   xen/arm: p2m: Update IOMMU mapping whenever possible if page table is
>> not shared
>>   iommu/arm: Re-define iommu_use_hap_pt(d) as iommu_hap_pt_share
>>   iommu: Add extra use_iommu argument to iommu_domain_init()
>>   iommu: Make decision about needing IOMMU for hardware domains in
>> advance
>>   iommu/arm: Misc fixes for arch specific part
>>   xen/arm: Add use_iommu flag to xen_arch_domainconfig
>>   xen/arm: domain_build: Don't expose IOMMU specific properties to the
>> guest
>>   iommu/arm: smmu: Squash map_pages/unmap_pages with
>> map_page/unmap_page
>>   [RFC] iommu: VT-d: Squash map_pages/unmap_pages with
>> map_page/unmap_page
>>   [RFC] iommu: AMD-Vi: Squash map_pages/unmap_pages with
>> map_page/unmap_page
>>
>>  tools/libxl/libxl_arm.c   |   8 +
>>  xen/arch/arm/domain.c |   2 +-
>>  xen/arch/arm/domain_build.c   |  10 ++
>>  xen/arch/arm/p2m.c|  10 +-
>>  xen/arch/x86/domain.c |   2 +-
>>  xen/arch/x86/mm.c |  11 +-
>>  xen/arch/x86/mm/p2m-ept.c |  21 +--
>>  xen/arch/x86/mm/p2m-pt.c  |  26 +---
>>  xen/arch/x86/mm/p2m.c |  38 +
>>  xen/arch/x86/x86_64/mm.c  |   5 +-
>>  xen/common/device_tree.c  |   7 +
>>  xen/common/grant_table.c  |  10 +-
>>  xen/drivers/passthrough/amd/iommu_map.c   | 212 +++
>> ---
>>  xen/drivers/passthrough/amd/pci_amd_iommu.c   |  10 +-
>>  xen/drivers/passthrough/arm/iommu.c   |   7 +-
>>  xen/drivers/passthrough/arm/smmu.c|  23 ++-
>>  xen/drivers/passthrough/iommu.c   |  73 -
>>  xen/drivers/passthrough/vtd/iommu.c   | 116 +-
>>  xen/drivers/passthrough/vtd/x86/vtd.c |   4 +-
>>  xen/drivers/passthrough/x86/iommu.c   |   6 +-
>>  xen/include/asm-arm/iommu.h   |   4 +-
>>  xen/include/asm-arm/p2m.h |  34 +
>>  xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   8 +-
>>  xen/include/public/arch-arm.h |   5 +
>>  xen/

[Xen-devel] [linux-4.9 test] 112388: trouble: blocked/broken/fail/pass

2017-07-31 Thread osstest service owner

flight 112388 linux-4.9 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112388/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm   2 hosts-allocate broken REGR. vs. 112193
 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 112193
 build-arm64   2 hosts-allocate broken REGR. vs. 112193

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-credit2  17 guest-start.2fail in 112379 pass in 112373
 test-amd64-i386-rumprun-i386 17 rumprun-demo-xenstorels/xenstorels.repeat fail 
in 112379 pass in 112388
 test-armhf-armhf-xl-credit2  16 guest-start/debian.repeat  fail pass in 112379
 test-armhf-armhf-xl-multivcpu 19 leak-check/check  fail pass in 112379
 test-amd64-amd64-qemuu-nested-intel 19 leak-check/check/l1 fail pass in 112379

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64-pvops 3 capture-logs  broken blocked in 112193
 build-arm64   3 capture-logs  broken blocked in 112193
 build-arm64-xsm   3 capture-logs broken never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail blocked in 
112193
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail in 112373 
like 112117
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop  fail in 112379 like 112086
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail in 112379 
like 112117
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail in 112379 
like 112193
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 112086
 test-amd64-amd64-xl-qemut-win7-amd64 18 guest-start/win.repeat fail like 112193
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeatfail  like 112193
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 112193
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 112193
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore   fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore   fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-suppor

Re: [Xen-devel] [PATCH v2] xen: Implement hypercall for tracing of program counters

2017-07-31 Thread Andrew Cooper

On 31/07/17 12:58, Paul Durrant wrote:
>> -Original Message-
>> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Ian
>> Jackson
>> Sent: 31 July 2017 12:14
>> To: Wei Liu 
>> Cc: sstabell...@kernel.org; Felix Schmoll ;
>> Andrew Cooper ; Julien Grall
>> ; jbeul...@suse.com; xen-
>> de...@lists.xenproject.org
>> Subject: Re: [Xen-devel] [PATCH v2] xen: Implement hypercall for tracing of
>> program counters
>>
>> Wei Liu writes ("Re: [Xen-devel] [PATCH v2] xen: Implement hypercall for
>> tracing of program counters"):
>>> On Mon, Jul 31, 2017 at 09:22:35AM +0100, Julien Grall wrote:
>> ..
 Should not it be -EOPNOTSUPP to match return error when
>> CONFIG_TRACE_PC is
 not?
>>> AIUI EOPNOTSUPP means "This is a valid operation but I am not configured
>>> to support it" while EINVAL means "This is an invalid value
>>> (operation)".
>> EOPNOTSUPP means "someone somewhere might think this is valid, but I
>> don't understand it".  It can be used, for example, for unknown
>> operation numbers.
>>
>> "ENOSYS" is used in exactly the same situation, but where the enum
>> whose value is not understood is precisely the syscall number.  In the
>> context of the hypervisor I think ENOSYS is used for "unknown
>> hypercall number".  I haven't checked whether it is used for "unknown
>> operation number" but I suspect that the hypervisor users EOPNOTSUPP
>> for that.
> Nope. It's ENOSYS for that case too (certainly for hvm and memory ops... 
> which is what I checked).

History has been poor to us.  Quite a few of the ENOSYS should be
EOPNOTSUPP, but we can't change them for ABI reasons.

>
>   Paul
>
>> It would be sensible if hypervisor maintainers were to write this
>> stuff down somewhere.
>>
>> EINVAL means "I definitely know that this is invalid".  It should
>> rarely be used for an unknown value of an enum, since enums can gain
>> new values in future implementations.  It can be used for "value out
>> of range" or "I understand this combination of parameters, but it is
>> not meaningful".  It should not be used for "I understand this
>> combination of parameters, and I do not implement it, even though in
>> principle the combination might somehow be implemented in the future".
>>
>> In this particular case I suspect that EOPNOTSUPP is right.  EINVAL is
>> clearly wrong.
>>
>> While looking at the original patch, I saw this:
>>
>>> +if ( !d )
>>> +return -EINVAL; /* invalid domain */
>> Is this conventional ?  EINVAL is a remarkably unhelpful error code
>> for this case.  IMO the hypervisor ought to have a dedicated error
>> code for "referenced domain does not exist".  But maybe it doesn't.

ESRCH is "no such domain".  We also use ENOENT for "no such vcpu".

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Oleksandr Andrushchenko


On 07/31/2017 02:58 PM, Joao Martins wrote:

On 07/31/2017 12:41 PM, Oleksandr Andrushchenko wrote:

Hi, Joao!

On 07/31/2017 02:03 PM, Joao Martins wrote:

Hey Oleksandr,

On 07/31/2017 09:34 AM, Oleksandr Andrushchenko wrote:

Hi, all!


[snip]

Comparison for display use-case
===

1 Number of grant references used
1-1 grant references: nr_pages
1-2 GNTTABOP_transfer: nr_pages
1-3 XENMEM_exchange: not an option

2 Effect of DomU crash on Dom0 (its mapped pages)
2-1 grant references: pages can be unmapped by Dom0, Dom0 is fully
recovered
2-2 GNTTABOP_transfer: pages will be returned to the Hypervisor, lost
for Dom0
2-3 XENMEM_exchange: not an option

3 Security issues from sharing Dom0 pages to DomU
1-1 grant references: none
1-2 GNTTABOP_transfer: none
1-3 XENMEM_exchange: not an option

At the moment approach 1 with granted references seems to be a winner for
sharing buffers both ways, e.g. Dom0 -> DomU and DomU -> Dom0.

Conclusion
==

I would like to get some feedback from the community on which approach
is more
suitable for sharing large buffers and to have a clear vision on cons
and pros
of each one: please feel free to add other metrics I missed and correct
the ones
I commented on.  I would appreciate help on comparing approaches 2 and 3
as I
have little knowledge of these APIs (2 seems to be addressed by
Christopher, and
3 seems to be relevant to what Konrad/Stefano do WRT SWIOTLB).

Depending on your performance/memory requirements - there could be another
option which is to keep the guest mapped on Domain-0 (what was discussed with
Zerogrant session[0][1] that will be formally proposed in the next month or so).

Unfortunately I missed that session during the Summit
due to overlapping sessions

Hmm - Zerocopy Rx (Dom0 -> DomU) would indeed be an interesting topic to bring 
up.


it is, especially for the systems which require physically contiguous
buffers

But that would only solve the grant maps/unmaps/copies done on Domain-0 (given
the numbers you pasted a bit ago, you might not really need to go to such 
extents)

[0]
http://schd.ws/hosted_files/xendeveloperanddesignsummit2017/05/zerogrant_spec.pdf
[1]
http://schd.ws/hosted_files/xendeveloperanddesignsummit2017/a8/zerogrant_slides.pdf

I will read these, thank you for the links

For the buffers allocated on Dom0 and safely grant buffers from Dom0 to DomU
(which I am not so sure it is possible today :()

We have this working in our setup for display (we have implemented
z-copy with grant references already)

Allow me to clarify :) I meant "possible to do it in a safely manner", IOW,
regarding what I mentioned below in following paragraphs. But your answer below
clarifies on that aspect.

good :)

, maybe a "contract" from DomU
provide a set of transferable pages that Dom0 holds on for each Dom-0 gref
provided to the guest (and assuming this is only a handful couple of guests as
grant table is not that big).

It is an option

   IIUC, From what you pasted above on "Buffer
allocated @Dom0" sounds like Domain-0 could quickly ran out of pages/OOM (and
grants), if you're guest is misbehaving/buggy or malicious; *also* domain-0
grant table is a rather finite/small resource (even though you can override the
number of frames in the arguments).

Well, you are right. But, we are focusing on embedded appliances,
so those systems we use are not that "dynamic" with that respect.
Namely: we have fixed number of domains and their functionality
is well known, so we can do rather precise assumption on resource
usage.

Interesting! So here I presume backend trusts the frontend.

yes, this is the case. What is more backend can make decision
on if to allow buffer allocation or reject the request

Cheers,
Joao



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] xen/link: Move .data.rel.ro sections into .rodata for final link

2017-07-31 Thread Jan Beulich

>>> David Woodhouse  07/31/17 1:02 PM >>>
>On Sun, 2017-07-30 at 00:16 -0600, Jan Beulich wrote:
>> > > > David Woodhouse  07/20/17 5:22 PM >>>
>> > This includes stuff lke the hypercall tables which we really want
>> > to be read-only. And they were going into .data.read-mostly.
>> Yes, we'd like them to be read-only, but what if EFI properly assigned r/o
>> permissions to the .rodata section when loading xen.efi? We'd then be
>> unable to apply relocations when switching from 1:1 to virtual mappings
>> (see efi_arch_relocate_image()).
>
>
>FWIW it does look like TianoCore has gained the ability to mark
>sections as read-only, in January of this year:
>https://github.com/tianocore/edk2/commit/d0e92aad46
>
>It doesn't actually seem to be complete — even with subsequent fixes
>since that commit, it doesn't look like it catches the case of data
>sections without EFI_IMAGE_SCN_MEM_WRITE, such as .rodata. 
>
>And even if/when that gets fixed you'll note that the protection is
>deliberately torn down in ExitBootServices(), specifically for the case
>you're concerned about below — because you'll need to do the
>relocations.

As said in an earlier reply, a first pass over relocations is being done
long before the call to ExitBootServices(). A minimal adjustment to
efi_arch_relocate_image() will be needed anyway, afaict.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] PVH VCPU hotplug support v7?

2017-07-31 Thread Ross Lagerwall


Hi Boris,

I've modified your PVH VCPU hotplug support v6 patch series [1] to 
support HVM guests running _with_ a device model for XenServer's 
purposes. This is useful because it moves the vCPU hotplug handling out 
of QEMU and allows it to mostly be shared with PVH. It will also allow 
unplugging vCPUs (libxl currently only does cpu-add for upstream qemu).


Are you still planning on continuing with that patch series since your 
commit to Linux [2]?


[1] 
https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00060.html


[2]:
commit 2a7197f02dddf1f9cee300bd12512375ed56524a
Author: Boris Ostrovsky 
Date:   Mon Feb 6 10:58:05 2017 -0500

xen/pvh: Enable CPU hotplug

PVH guests don't (yet) receive ACPI hotplug interrupts and therefore
need to monitor xenstore for CPU hotplug event.

Signed-off-by: Boris Ostrovsky 
Reviewed-by: Juergen Gross 

Thanks,
--
Ross Lagerwall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2] xen: Implement hypercall for tracing of program counters

2017-07-31 Thread Jan Beulich

>>> Andrew Cooper  07/31/17 2:01 PM >>>
>On 31/07/17 12:58, Paul Durrant wrote:
>> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Ian
>> Jackson
>> Sent: 31 July 2017 12:14
>>> Wei Liu writes ("Re: [Xen-devel] [PATCH v2] xen: Implement hypercall for
>>> tracing of program counters"):
 On Mon, Jul 31, 2017 at 09:22:35AM +0100, Julien Grall wrote:
> Should not it be -EOPNOTSUPP to match return error when
>>> CONFIG_TRACE_PC is
> not?
 AIUI EOPNOTSUPP means "This is a valid operation but I am not configured
 to support it" while EINVAL means "This is an invalid value
 (operation)".
>>> EOPNOTSUPP means "someone somewhere might think this is valid, but I
>>> don't understand it".  It can be used, for example, for unknown
>>> operation numbers.
>>>
>>> "ENOSYS" is used in exactly the same situation, but where the enum
>>> whose value is not understood is precisely the syscall number.  In the
>>> context of the hypervisor I think ENOSYS is used for "unknown
>>> hypercall number".  I haven't checked whether it is used for "unknown
>>> operation number" but I suspect that the hypervisor users EOPNOTSUPP
>>> for that.
>> Nope. It's ENOSYS for that case too (certainly for hvm and memory ops... 
>> which is what I checked).
>
>History has been poor to us.  Quite a few of the ENOSYS should be
>EOPNOTSUPP, but we can't change them for ABI reasons.

Right, but the result here ought to be that we at least don't introduce any
new bogus uses of ENOSYS or EINVAL.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 1/5] livepatch: Tighten alignment checks.

2017-07-31 Thread Jan Beulich

>>> Konrad Rzeszutek Wilk  07/26/17 9:50 PM >>>
>--- a/xen/common/livepatch_elf.c
>+++ b/xen/common/livepatch_elf.c
>@@ -86,6 +86,19 @@ static int elf_resolve_sections(struct livepatch_elf *elf, 
>const void *data)
>delta < sizeof(Elf_Ehdr) ? "at ELF header" : "is past end");
>return -EINVAL;
>}
>+else if ( sec[i].sec->sh_addralign > 1 &&

As said before, to me this check looks confusing. I'd recommend to only check
for the field to be non-zero.

>+  sec[i].sec->sh_addr % sec[i].sec->sh_addralign )
>+{
>+dprintk(XENLOG_ERR, LIVEPATCH "%s: Section [%u] addr 
>(%#"PRIxElfAddr") is not aligned properly (%#"PRIxElfAddr")\n",
>+elf->name, i, sec[i].sec->sh_addr, 
>sec[i].sec->sh_addralign);
>+return -EINVAL;
>+}
>+else if ( sec[i].sec->sh_addralign > 1 && sec[i].sec->sh_addralign % 
>2 )

What use is this one? Do you perhaps mean to check that the alignment is
a power of 2? In that case a single check of sh_addralign & (sh_addralign - 1)
against zero would be what you want.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 0/3] Docs: consolidate release related documents

2017-07-31 Thread Ian Jackson

Wei Liu writes ("[PATCH 0/3] Docs: consolidate release related documents"):
> Wei Liu (3):
>   docs: consolidate release related documents
>   docs: add xen-release-management.pandoc
>   docs: hook up process/ to build system

FWIW,

Acked-by: Ian Jackson 

However, AFAICT the mean reason this doesn't process
release-checklist.txt and branching-checklist.txt into files on the
website is that you only have it process pandocs.

But I don't think those files are ever going to want to be made into
web pages.  Whereas it is possible that we will grow other process
documents in .txt form which do, and no provision is made for them.

I wonder if it would be better to rename those to files to a different
directory.  The reason I invented misc/ was precisely to avoid people
(users and developers) tripping over them thinking they might be
useful...

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 2/5] livepatch: Include sizes when an mismatch occurs

2017-07-31 Thread Jan Beulich

>>> Konrad Rzeszutek Wilk  07/26/17 9:48 PM >>>
>--- a/xen/common/livepatch.c
>+++ b/xen/common/livepatch.c
>@@ -457,6 +457,24 @@ static int secure_payload(struct payload *payload, struct 
>livepatch_elf *elf)
>return rc;
>}
 >
>+static int check_section(const struct livepatch_elf *elf,
>+ const struct livepatch_elf_sec *sec,
>+ const size_t sz, bool zero_ok)

I guess you want to drop the const here (or else for consistency add one to the
last parameter). As to the last parameter - I doubt its usefulness: There's one
place where you pass false, and that place looks bogus. I don't see anything
wrong with an empty .ex_table section.

>+{
>+if ( !elf || !sec )
>+return -EINVAL;

None of the callers actually uses the return value. Perhaps the function should
return bool?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 3/5] xen/livepatch/ARM32: Don't load and crash on livepatches loaded with wrong alignment.

2017-07-31 Thread Jan Beulich

>>> Konrad Rzeszutek Wilk  07/26/17 9:50 PM >>>
>--- a/docs/misc/livepatch.markdown
>+++ b/docs/misc/livepatch.markdown
>@@ -279,6 +279,10 @@ It may also have some architecture-specific sections. For 
>example:
>* Exception tables.
>* Relocations for each of these sections.
 >
>+Note that on ARM 32 the sections SHOULD be four byte aligned. Otherwise
>+we risk hitting Data Abort exception as un-aligned manipulation of data is
>+prohibited on ARM 32.

This (and hence the rest of the patch) is not in line with the outcome of the
earlier discussion we had. Nothing is wrong with a section having smaller
alignment, as long as there are no 32-bit (or wider, but I don't think there
are any such) relocations against such a section. And even if there were, I
think it should rather be the code doing the relocations needing to cope, as
I don't think the ARM ELF ABI imposes any such restriction.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PVH VCPU hotplug support v7?

2017-07-31 Thread Boris Ostrovsky

On 07/31/2017 09:20 AM, Ross Lagerwall wrote:
> Hi Boris,
>
> I've modified your PVH VCPU hotplug support v6 patch series [1] to
> support HVM guests running _with_ a device model for XenServer's
> purposes. This is useful because it moves the vCPU hotplug handling
> out of QEMU and allows it to mostly be shared with PVH. It will also
> allow unplugging vCPUs (libxl currently only does cpu-add for upstream
> qemu).
>
> Are you still planning on continuing with that patch series since your
> commit to Linux [2]?

This series has been put on hold until we figure out what to do with
hotplug for PVH dom0. (The problem was the "dual" view by dom0 of APCI
CPU namespace --- on hotplug event dom0 has to somehow figure out
whether the event was due to (dis)appearance of a physical or virtual CPU).

I don't think this has been dealt with yet (copying Roger).

-boris


>
> [1]
> https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00060.html
>
> [2]:
> commit 2a7197f02dddf1f9cee300bd12512375ed56524a
> Author: Boris Ostrovsky 
> Date:   Mon Feb 6 10:58:05 2017 -0500
>
> xen/pvh: Enable CPU hotplug
>
> PVH guests don't (yet) receive ACPI hotplug interrupts and therefore
> need to monitor xenstore for CPU hotplug event.
>
> Signed-off-by: Boris Ostrovsky 
> Reviewed-by: Juergen Gross 
>
> Thanks,


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] xen/events: Fix interrupt lost during irq_disable and irq_enable

2017-07-31 Thread Boris Ostrovsky

On 07/29/2017 12:59 PM, Liu Shuo wrote:
> Here is a device has xen-pirq-MSI interrupt. Dom0 might lost interrupt
> during driver irq_disable/irq_enable. Here is the scenario,
>  1. irq_disable -> disable_dynirq -> mask_evtchn(irq channel)
>  2. dev interrupt raised by HW and Xen mark its evtchn as pending
>  3. irq_enable -> startup_pirq -> eoi_pirq ->
> clear_evtchn(channel of irq) -> clear pending status
>  4. consume_one_event process the irq event without pending bit assert
> which result in interrupt lost once
>  5. No HW interrupt raising anymore.
>
> Now use enable_dynirq for enable_pirq of xen_pirq_chip to remove
> eoi_pirq when irq_enable.
>
> Signed-off-by: Liu Shuo 

Reviewed-by: Boris Ostrovsky 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 4/5] alternative/x86/arm32: Align altinstructions (and altinstr_replacement) sections.

2017-07-31 Thread Jan Beulich

>>> Konrad Rzeszutek Wilk  07/26/17 9:50 PM >>>
>On x86 the bloat-o-meter detects that with this change the file shrinks:
>add/remove: 1/0 grow/shrink: 0/2 up/down: 156/-367 (-211)
>function old new   delta
>get_page_from_gfn  - 156+156
>do_mmu_update   45784569  -9
>do_mmuext_op56045246-358
>Total: Before=3170439, After=3170228, chg -0.01%

This looks unexpected, and hence I'd like to ask for an explanation. If
anything I'd expect the image size to grow (slightly).

>--- a/xen/include/asm-x86/alternative.h
>+++ b/xen/include/asm-x86/alternative.h
>@@ -56,10 +56,12 @@ extern void alternative_instructions(void);
 >
>#define ALTERNATIVE_N(newinstr, feature, number)   \
>".pushsection .altinstructions,\"a\"\n"\
>+  ".p2align 2\n"  \

Can't this then be accompanied by dropping the (over-)alignment
done in xen.lds.S?

>ALTINSTR_ENTRY(feature, number)\
>".section .discard,\"a\",@progbits\n"  \
>DISCARD_ENTRY(number)  \
>".section .altinstr_replacement, \"ax\"\n" \
>+  ".p2align 2\n"  \

This surely isn't needed on x86.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 5/5] livepatch: Declare live patching as a supported feature

2017-07-31 Thread Jan Beulich

>>> Konrad Rzeszutek Wilk  07/26/17 9:48 PM >>>
>From: Ross Lagerwall 
>
>See docs/features/livepatch.pandoc for the details.
>
>Signed-off-by: Ross Lagerwall 
>Signed-off-by: Konrad Rzeszutek Wilk 

Reviewed-by: Jan Beulich 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v14 09/23] x86: refactor psr: L3 CAT: set value: assemble features value array.

2017-07-31 Thread Jan Beulich

>>> Yi Sun  07/15/17 2:47 AM >>>
>@@ -619,6 +710,46 @@ static int insert_val_into_array(uint32_t val[],
>enum cbm_type type,
>uint32_t new_val)
>{
>+const struct feat_node *feat;
>+const struct feat_props *props;
>+unsigned int i;
>+int ret;
>+
>+ASSERT(feat_type < FEAT_TYPE_NUM);
>+
>+ret = skip_prior_features(&array_len, feat_type);
>+if ( ret < 0 )
>+return ret;
>+else
>+val += ret;

Please avoid such pointless "else".

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v14 12/23] x86: refactor psr: L3 CAT: set value: implement write msr flow.

2017-07-31 Thread Jan Beulich

>>> Yi Sun  07/15/17 2:48 AM >>>
>static int write_psr_msrs(unsigned int socket, unsigned int cos,
>const uint32_t val[], unsigned int array_len,
>enum psr_feat_type feat_type)
>{
>-return -ENOENT;
>+int ret;
>+struct psr_socket_info *info = get_socket_info(socket);
>+struct cos_write_info data =
>+{
>+.cos = cos,
>+.feature = info->features[feat_type],
>+.props = feat_props[feat_type],
>+};
>+
>+if ( cos > info->features[feat_type]->cos_max )
>+return -EINVAL;
>+
>+/* Skip to the feature's value head. */
>+ret = skip_prior_features(&array_len, feat_type);
>+if ( ret < 0 )
>+return ret;
>+else
>+val += ret;

With this (again pointless) else removed,
Reviewed-by: Jan Beulich 

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 05/12] x86/domctl: Handle ACPI access from domctl

2017-07-31 Thread Ross Lagerwall


On 01/03/2017 02:04 PM, Boris Ostrovsky wrote:

Signed-off-by: Boris Ostrovsky 
---
Changes in v6:
* Adjustments to to patch 4 changes.
* Added a spinlock for VCPU map access
* Return an error on guest trying to write VCPU map


snip

-static int acpi_cpumap_access_common(struct domain *d, bool is_write,
- unsigned int port,
+static int acpi_cpumap_access_common(struct domain *d, bool is_guest_access,
+ bool is_write, unsigned int port,
   unsigned int bytes, uint32_t *val)
  {
  unsigned int first_byte = port - XEN_ACPI_CPU_MAP;
+int rc = X86EMUL_OKAY;

  BUILD_BUG_ON(XEN_ACPI_CPU_MAP + XEN_ACPI_CPU_MAP_LEN
   > ACPI_GPE0_BLK_ADDRESS_V1);

+spin_lock(&d->arch.hvm_domain.acpi_lock);
+
  if ( !is_write )
  {
  uint32_t mask = (bytes < 4) ? ~0U << (bytes * 8) : 0;
@@ -32,23 +37,61 @@ static int acpi_cpumap_access_common(struct domain *d, bool 
is_write,
  memcpy(val, (uint8_t *)d->avail_vcpus + first_byte,
 min(bytes, ((d->max_vcpus + 7) / 8) - first_byte));
  }
+else if ( !is_guest_access )
+memcpy((uint8_t *)d->avail_vcpus + first_byte, val,
+   min(bytes, ((d->max_vcpus + 7) / 8) - first_byte));
  else
  /* Guests do not write CPU map */
-return X86EMUL_UNHANDLEABLE;
+rc = X86EMUL_UNHANDLEABLE;

-return X86EMUL_OKAY;
+spin_unlock(&d->arch.hvm_domain.acpi_lock);
+
+return rc;
  }

  int hvm_acpi_domctl_access(struct domain *d,
 const struct xen_domctl_acpi_access *access)
  {
-return -ENOSYS;
+unsigned int bytes, i;
+uint32_t val = 0;
+uint8_t *ptr = (uint8_t *)&val;
+int rc;
+bool is_write = (access->rw == XEN_DOMCTL_ACPI_WRITE) ? true : false;
+
+if ( has_acpi_dm_ff(d) )
+return -EOPNOTSUPP;
+
+if ( access->space_id != XEN_ACPI_SYSTEM_IO )
+return -EINVAL;
+
+if ( !((access->address >= XEN_ACPI_CPU_MAP) &&
+   (access->address < XEN_ACPI_CPU_MAP + XEN_ACPI_CPU_MAP_LEN)) )
+return -ENODEV;
+
+for ( i = 0; i < access->width; i += sizeof(val) )
+{
+bytes = (access->width - i > sizeof(val)) ?
+sizeof(val) : access->width - i;
+
+if ( is_write && copy_from_guest_offset(ptr, access->val, i, bytes) )
+return -EFAULT;
+
+rc = acpi_cpumap_access_common(d, false, is_write,
+   access->address, bytes, &val);


While I'm looking at this code...
This doesn't work if access->width > sizeof(val) (4 bytes). The same 
value (access->address) is always passed into acpi_cpumap_access_common 
for 'port' and this is used as an offset into the avail_cpus array. So 
the offset is unchanged and only the first 4 bytes of avail_cpus ever 
gets changed.


--
Ross Lagerwall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PVH VCPU hotplug support v7?

2017-07-31 Thread Andrew Cooper

On 31/07/17 14:55, Boris Ostrovsky wrote:
> On 07/31/2017 09:20 AM, Ross Lagerwall wrote:
>> Hi Boris,
>>
>> I've modified your PVH VCPU hotplug support v6 patch series [1] to
>> support HVM guests running _with_ a device model for XenServer's
>> purposes. This is useful because it moves the vCPU hotplug handling
>> out of QEMU and allows it to mostly be shared with PVH. It will also
>> allow unplugging vCPUs (libxl currently only does cpu-add for upstream
>> qemu).
>>
>> Are you still planning on continuing with that patch series since your
>> commit to Linux [2]?
> This series has been put on hold until we figure out what to do with
> hotplug for PVH dom0. (The problem was the "dual" view by dom0 of APCI
> CPU namespace --- on hotplug event dom0 has to somehow figure out
> whether the event was due to (dis)appearance of a physical or virtual CPU).
>
> I don't think this has been dealt with yet (copying Roger).

From the point of view of unblocking several pieces of work, it would be
fine for this logic to be behind an emulation flag, just like LAPIC/etc.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [qemu-mainline test] 112389: regressions - trouble: blocked/broken/fail/pass

2017-07-31 Thread osstest service owner

flight 112389 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112389/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 111765
 build-arm64-xsm   2 hosts-allocate broken REGR. vs. 111765
 build-arm64   2 hosts-allocate broken REGR. vs. 111765
 test-amd64-i386-xl-qemuu-ws16-amd64 10 windows-install   fail REGR. vs. 111765
 test-amd64-i386-xl-qemuu-win7-amd64 10 windows-install   fail REGR. vs. 111765
 test-amd64-amd64-xl-qemuu-win7-amd64 10 windows-install  fail REGR. vs. 111765

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl  10 debian-install fail pass in 112380
 test-armhf-armhf-xl-rtds 12 guest-startfail pass in 112380

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64   3 capture-logs  broken blocked in 111765
 build-arm64-pvops 3 capture-logs  broken blocked in 111765
 build-arm64-xsm   3 capture-logs  broken blocked in 111765
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail in 112380 like 
111765
 test-armhf-armhf-xl 13 migrate-support-check fail in 112380 never pass
 test-armhf-armhf-xl 14 saverestore-support-check fail in 112380 never pass
 test-armhf-armhf-xl-rtds13 migrate-support-check fail in 112380 never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail in 112380 never pass
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 111765
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 111765
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 111765
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 111765
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass

version targeted for testing:
 qemuua588c4985eff363154d65aee8607d0a4601655f7
baseline version:
 qemuu31fe1c414501047cbb91b695bdccc0068496dcf6

Last test of basis   111765  2017-07-13 10:20:16 Z   18 days
Failing since111790  2017-07-14 04:20:46 Z   17 days   25 attempts
Testing same since   112366  2017-07-28 18:48:13 Z2 days4 attempts


People who touched revisions under test:
  Alex BennÃ©e 
  Alex Williamson 
  Alexander Graf 
  Alexey G 
  Alexey Gerasimenko 
  Alexey Kardashevskiy 
  A

Re: [Xen-devel] [PATCH v14 13/23] x86: refactor psr: CDP: implement CPU init flow.

2017-07-31 Thread Jan Beulich

>>> Yi Sun  07/15/17 2:48 AM >>>
>@@ -272,7 +312,8 @@ static int cat_init_feature(const struct cpuid_leaf *regs,
>if ( !opt_cpu_info )
>return 0;
 >
>-printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
>+printk(XENLOG_INFO "%s: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
>+   ((type == FEAT_TYPE_L3_CDP) ? "CDP" : "L3 CAT"),

Why is this not "L3 CDP" when the enumerator includes L3?

>@@ -1283,10 +1344,22 @@ static void psr_cpu_init(void)
>feat = feat_l3;
>feat_l3 = NULL;
 >
>-if ( !cat_init_feature(®s, feat, info, FEAT_TYPE_L3_CAT) )
>-feat_props[FEAT_TYPE_L3_CAT] = &l3_cat_props;
>+if ( (regs.c & PSR_CAT_CDP_CAPABILITY) && (opt_psr & PSR_CDP) )
>+{
>+if ( !cat_init_feature(®s, feat, info, FEAT_TYPE_L3_CDP) )
>+feat_props[FEAT_TYPE_L3_CDP] = &l3_cdp_props;
>+else
>+/* If CDP init fails, try to work as L3 CAT. */
>+goto l3_cat_init;
>+}
>else
>-feat_l3 = feat;
>+{
>+ l3_cat_init:

I'd really like to ask to re-structure this slightly so that you won't
need goto and a label here. As said before, goto-s are sort of okay
for making complicated error paths readable, but they should be
avoided in almost all other cases.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v14 16/23] x86: L2 CAT: implement CPU init flow.

2017-07-31 Thread Jan Beulich

>>> Yi Sun  07/15/17 2:49 AM >>>
>@@ -273,6 +275,12 @@ static int cat_init_feature(const struct cpuid_leaf *regs,
>struct psr_socket_info *info,
>enum psr_feat_type type)
>{
>+const char * const cat_feat_name[FEAT_TYPE_NUM] = {

Strictly speaking the blank after the star is wrong here.

>+"L3 CAT",
>+"CDP",
>+"L2 CAT",
>+};

Please use designated initializers here.

With at least this second aspect taken care of
Reviewed-by: Jan Beulich 

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PVH VCPU hotplug support v7?

2017-07-31 Thread Boris Ostrovsky

On 07/31/2017 10:12 AM, Andrew Cooper wrote:
> On 31/07/17 14:55, Boris Ostrovsky wrote:
>> On 07/31/2017 09:20 AM, Ross Lagerwall wrote:
>>> Hi Boris,
>>>
>>> I've modified your PVH VCPU hotplug support v6 patch series [1] to
>>> support HVM guests running _with_ a device model for XenServer's
>>> purposes. This is useful because it moves the vCPU hotplug handling
>>> out of QEMU and allows it to mostly be shared with PVH. It will also
>>> allow unplugging vCPUs (libxl currently only does cpu-add for upstream
>>> qemu).
>>>
>>> Are you still planning on continuing with that patch series since your
>>> commit to Linux [2]?
>> This series has been put on hold until we figure out what to do with
>> hotplug for PVH dom0. (The problem was the "dual" view by dom0 of APCI
>> CPU namespace --- on hotplug event dom0 has to somehow figure out
>> whether the event was due to (dis)appearance of a physical or virtual CPU).
>>
>> I don't think this has been dealt with yet (copying Roger).
> From the point of view of unblocking several pieces of work, it would be
> fine for this logic to be behind an emulation flag, just like LAPIC/etc.

The (I think) last message discussing this series was

https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00143.html

Are you suggesting extracting pieces that would move hotplug support for
HVM guests from qemu to hypervisor/toolstack but leave all PVH-specific
code out? (The feature flag is part of this series --- 
https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00059.html)

-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC v2 02/12] Rename PSR sysctl/domctl interfaces and xsm policy to make them be general.

2017-07-31 Thread Wei Liu

Normally there is no need to have period at the end of the subject line.

On Thu, Jul 20, 2017 at 04:49:03PM +0800, Yi Sun wrote:
> This patch renames PSR sysctl/domctl interfaces and related xsm policy to
> make them be general for all resource allocation features but not only
> for CAT. Then, we can resuse the interfaces for all allocation features.
> 

It would be useful to list what is changed to what. Afaict "cat" is
changed to "alloc".

This patch mostly looks fine cod-wise. The only thing I want to point
out is that you should bump the version number for both domctl and
sysctl.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC v2 04/12] x86: implement data structure and CPU init flow for MBA.

2017-07-31 Thread Wei Liu

On Thu, Jul 20, 2017 at 04:49:05PM +0800, Yi Sun wrote:
> This patch implements main data structures of MBA.
> 
> Like CAT features, MBA HW info has cos_max which means the max cos
> registers number, and thrtl_max which means the max throttle value
> (delay value). It also has a flag to represent if the throttle
> value is linear or not.
> 
> One COS register of MBA stores a throttle value for one or more
> domains. The throttle value means the transaction time between L2
> cache and next level memory to be delayed.
> 
> This patch also implements init flow for MBA and register stub
> callback functions.
> 
> Signed-off-by: Yi Sun 
> ---
>  xen/arch/x86/psr.c  | 130 
> 
>  xen/include/asm-x86/msr-index.h |   1 +
>  xen/include/asm-x86/psr.h   |   2 +
>  3 files changed, 109 insertions(+), 24 deletions(-)
> 
> diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> index d1d854f..d1ea5a4 100644
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -27,13 +27,16 @@
>   * - CMT Cache Monitoring Technology
>   * - COS/CLOSClass of Service. Also mean COS registers.
>   * - COS_MAX Max number of COS for the feature (minus 1)
> + * - MBA Memory Bandwidth Allocation
>   * - MSRsMachine Specific Registers
>   * - PSR Intel Platform Shared Resource
> + * - THRTL_MAX   Max throttle value (delay value) of MBA
>   */
>  
>  #define PSR_CMT(1<<0)
>  #define PSR_CAT(1<<1)
>  #define PSR_CDP(1<<2)
> +#define PSR_MBA(1<<3)

These should really be (1u << X) -- please use unsigned value and add
spaces around "<<".

Can you please submit a patch to fix them first?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC v2 09/12] tools: implement the new get hw info interface suitable to all psr allocation features.

2017-07-31 Thread Wei Liu

On Thu, Jul 20, 2017 at 04:49:10PM +0800, Yi Sun wrote:
> This patch implements a new get hw info interface suitable for all psr 
> allocation
> features and the whole flow. It also enables MBA support in tools to get MBA
> HW info.
> 
> Signed-off-by: Yi Sun 
> ---
>  tools/libxc/include/xenctrl.h |  30 +++-
>  tools/libxc/xc_psr.c  |  46 +
>  tools/libxl/libxl_psr.c   | 155 
> +++---
>  tools/xl/xl_cmdtable.c|   3 +
>  tools/xl/xl_psr.c |  45 +++-
>  5 files changed, 238 insertions(+), 41 deletions(-)
> 
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index 2248900..0b0ec31 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -2460,6 +2460,31 @@ enum xc_psr_cat_type {
>  };
>  typedef enum xc_psr_cat_type xc_psr_cat_type;
>  
> +enum xc_psr_feat_type {
> +XC_PSR_FEAT_UNKNOWN= 0,
> +XC_PSR_FEAT_CAT_L3 = 1,
> +XC_PSR_FEAT_CAT_L2 = 2,
> +XC_PSR_FEAT_MBA= 3,

Pointless initialisers.

> +};
> +typedef enum xc_psr_feat_type xc_psr_feat_type;
> +
> +struct xc_psr_hw_info {
> +union {
> +struct {
> +uint32_t cos_max;
> +uint32_t cbm_len;
> +bool cdp_enabled;
> +} xc_cat_info;
> +
> +struct {
> +uint32_t cos_max;
> +uint32_t thrtl_max;
> +bool linear;
> +} xc_mba_info;
> +} u;
> +};
> +typedef struct xc_psr_hw_info xc_psr_hw_info;
[...]
> index 8319301..43b84b6 100644
> --- a/tools/libxl/libxl_psr.c
> +++ b/tools/libxl/libxl_psr.c
> @@ -361,47 +361,49 @@ int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t 
> domid,
>  return rc;
>  }
>  
> +static inline int libxl_psr_hw_info_to_libxl_psr_cat_info(

libxl__ -> two underscores for internal functions.

> +  libxl_psr_feat_type type, libxl_psr_hw_info *hw_info,
> +  libxl_psr_cat_info *cat_info)
> +{
> +if (type != LIBXL_PSR_FEAT_TYPE_CAT_INFO)
> +return -1;

ERROR_INVAL;

> +
> +static inline int libxc__psr_hw_info_to_libxl_psr_hw_info(
> +  libxl_psr_feat_type type, xc_psr_hw_info *xc_hw_info,
> +  libxl_psr_hw_info *xl_hw_info)
> +{
> +switch (type) {
> +case LIBXL_PSR_FEAT_TYPE_CAT_INFO:
> +xl_hw_info->u.cat_info.cos_max = xc_hw_info->u.xc_cat_info.cos_max;
> +xl_hw_info->u.cat_info.cbm_len = xc_hw_info->u.xc_cat_info.cbm_len;
> +xl_hw_info->u.cat_info.cdp_enabled =
> +
> xc_hw_info->u.xc_cat_info.cdp_enabled;
> +break;
> +case LIBXL_PSR_FEAT_TYPE_MBA_INFO:
> +xl_hw_info->u.mba_info.cos_max = xc_hw_info->u.xc_mba_info.cos_max;
> +xl_hw_info->u.mba_info.thrtl_max = 
> xc_hw_info->u.xc_mba_info.thrtl_max;
> +xl_hw_info->u.mba_info.linear = xc_hw_info->u.xc_mba_info.linear;
> +break;
> +default:
> +return -1;

ERROR_INVAL

> +}
> +
> +return 0;
> +}
> +
>  int libxl_psr_get_hw_info(libxl_ctx *ctx, libxl_psr_hw_info **info,
>int *nr, libxl_psr_feat_type type, int lvl)
>  {
> -return EXIT_FAILURE;
> +GC_INIT(ctx);
> +int rc;
> +int i = 0, socketid, nr_sockets;
> +libxl_bitmap socketmap;
> +libxl_psr_hw_info *ptr;
> +xc_psr_feat_type xc_type;
> +xc_psr_hw_info hw_info;
> +
> +libxl_bitmap_init(&socketmap);
> +
> +if ( type == LIBXL_PSR_FEAT_TYPE_CAT_INFO && lvl != 3 && lvl != 2) {

Extraneous space.

>  
>  void libxl_psr_hw_info_list_free(libxl_psr_hw_info *list, int nr)
>  {
> +int i;

unsigned int

> +
> +for (i = 0; i < nr; i++)
> +libxl_psr_hw_info_dispose(&list[i]);
> +free(list);
>  }
>  
>  /*
> diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
> index 2c71a9f..14a02d4 100644
> --- a/tools/xl/xl_cmdtable.c
> +++ b/tools/xl/xl_cmdtable.c
> @@ -524,6 +524,9 @@ struct cmd_spec cmd_table[] = {
>"[options]",
>"-m, --cmt   Show Cache Monitoring Technology (CMT) hardware 
> info\n"
>"-a, --cat   Show Cache Allocation Technology (CAT) hardware 
> info\n"
> +#ifdef LIBXL_HAVE_PSR_MBA
> +  "-b, --mba   Show Memory Bandwidth Allocation (MBA) hardware 
> info\n"
> +#endif

You don't need to test the macro. xl always has all features available.

Same comment applies to the rest of this patch.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC v2 03/12] x86: rename 'cbm_type' to 'psr_val_type' to make it general.

2017-07-31 Thread Wei Liu

On Thu, Jul 20, 2017 at 04:49:04PM +0800, Yi Sun wrote:
> This patch renames 'cbm_type' to 'psr_val_type' to make it be general.
> Then, we can reuse this for all psr allocation features.
> 
> Signed-off-by: Yi Sun 

The code LGTM.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC v2 08/12] tools: create general interfaces to support psr allocation features.

2017-07-31 Thread Wei Liu

On Thu, Jul 20, 2017 at 04:49:09PM +0800, Yi Sun wrote:
[...]
> +
> +#ifdef LIBXL_HAVE_PSR_MBA
> +/*
> + * Function to set a domain's value. It operates on a single or multiple
> + * target(s) defined in 'target_map'. 'target_map' specifies all the sockets
> + * to be operated on.
> + */
> +int libxl_psr_set_val(libxl_ctx *ctx, uint32_t domid,
> +  libxl_psr_cbm_type type, libxl_bitmap *target_map,
> +  uint64_t val);
> +/*
> + * Function to get a domain's cbm. It operates on a single 'target'.
> + * 'target' specifies which socket to be operated on.
> + */
> +int libxl_psr_get_val(libxl_ctx *ctx, uint32_t domid,
> +  libxl_psr_cbm_type type, uint32_t target,

There is no need for target to be uint32_t right? Unsigned int should
work too?

> +  uint64_t *val);
> +/*
> + * On success, the function returns an array of elements in 'info',
> + * and the length in 'nr'.
> + */
> +int libxl_psr_get_hw_info(libxl_ctx *ctx, libxl_psr_hw_info **info,
> +  int *nr, libxl_psr_feat_type type, int lvl);
> +void libxl_psr_hw_info_list_free(libxl_psr_hw_info *list, int nr);

nr should be unsigned int.

> +#endif /* LIBXL_HAVE_PSR_MBA */
> +#endif /* LIBXL_HAVE_PSR_CAT */
>  
>  /* misc */
>  
> diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
> index f55ba1e..8319301 100644
> --- a/tools/libxl/libxl_psr.c
> +++ b/tools/libxl/libxl_psr.c
> @@ -425,6 +425,30 @@ void libxl_psr_cat_info_list_free(libxl_psr_cat_info 
> *list, int nr)
>  free(list);
>  }
>  
> +int libxl_psr_set_val(libxl_ctx *ctx, uint32_t domid,
> +  libxl_psr_cbm_type type, libxl_bitmap *target_map,
> +  uint64_t val)
> +{
> +return EXIT_FAILURE;

ERROR_FAIL here.

> +
> +libxl_psr_hw_info = Struct("psr_hw_info", [
> +("id", uint32),
> +("u", KeyedUnion(None, libxl_psr_feat_type, "type",
> +  [("cat_info", Struct(None, [
> + ("cos_max", uint32),
> + ("cbm_len", uint32),
> + ("cdp_enabled", bool),
> +   ])),
> +   ("mba_info", Struct(None, [
> + ("cos_max", uint32),
> + ("thrtl_max",   uint32),
> + ("linear",  bool),
> +   ])),
> +  ]))

If this is output only please mark it as dir=DIR_OUT.

> +])
> -- 
> 1.9.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC v2 10/12] tools: implemet new get value interface suitable for all psr allocation features.

2017-07-31 Thread Wei Liu

On Thu, Jul 20, 2017 at 04:49:11PM +0800, Yi Sun wrote:
> This patch implements a new get value interface in tools suitable for all psr
> allocation features and the whole flow. It also enables MBA support in tools
> to get MBA value.

This suggests this patch can be at least broken into two?

> 
> Signed-off-by: Yi Sun 
> ---
>  tools/libxc/include/xenctrl.h |  13 +-
>  tools/libxc/xc_psr.c  |  11 +-
>  tools/libxl/libxl_psr.c   |  61 ++
>  tools/xl/xl.h |   3 +
>  tools/xl/xl_cmdtable.c|   9 +-
>  tools/xl/xl_psr.c | 275 
> ++
>  6 files changed, 236 insertions(+), 136 deletions(-)
> 
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index 0b0ec31..def18f5 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -2452,13 +2452,14 @@ enum xc_psr_cmt_type {
>  };
>  typedef enum xc_psr_cmt_type xc_psr_cmt_type;
>  
> -enum xc_psr_cat_type {
> +enum xc_psr_val_type {
>  XC_PSR_CAT_L3_CBM  = 1,
>  XC_PSR_CAT_L3_CBM_CODE = 2,
>  XC_PSR_CAT_L3_CBM_DATA = 3,
>  XC_PSR_CAT_L2_CBM  = 4,
> +XC_PSR_MBA_THRTL   = 5,
>  };
> -typedef enum xc_psr_cat_type xc_psr_cat_type;
> +typedef enum xc_psr_val_type xc_psr_val_type;

Changing the name of the type should be done in a separate patch.

The rest of this patch mixes renaming and functional change which is
rather difficult to review I'm afraid.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 06/13] libxl: change p9 to use generec add function

2017-07-31 Thread Wei Liu

On Sun, Jul 30, 2017 at 09:42:09PM +0300, Oleksandr Grytsov wrote:
> On Fri, Jul 28, 2017 at 7:23 PM, Wei Liu  wrote:
> > On Fri, Jul 28, 2017 at 03:11:34PM +0100, Wei Liu wrote:
> >> On Tue, Jul 18, 2017 at 05:25:23PM +0300, Oleksandr Grytsov wrote:
> >> [...]
> >> >  /* Waits for the passed device to reach state XenbusStateInitWait.
> >> >   * This is not really useful by itself, but is important when executing
> >> >   * hotplug scripts, since we need to be sure the device is in the 
> >> > correct
> >> > @@ -3565,6 +3559,7 @@ extern const struct libxl_device_type 
> >> > libxl__usbctrl_devtype;
> >> >  extern const struct libxl_device_type libxl__usbdev_devtype;
> >> >  extern const struct libxl_device_type libxl__pcidev_devtype;
> >> >  extern const struct libxl_device_type libxl__vdispl_devtype;
> >> > +extern const struct libxl_device_type libxl__p9_devtype;
> >> >
> >> >  extern const struct libxl_device_type *device_type_tbl[];
> >> >
> >> > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> >> > index 25563cf..96dbaed 100644
> >> > --- a/tools/libxl/libxl_types.idl
> >> > +++ b/tools/libxl/libxl_types.idl
> >> > @@ -804,7 +804,7 @@ libxl_domain_config = Struct("domain_config", [
> >> >  ("vfbs", Array(libxl_device_vfb, "num_vfbs")),
> >> >  ("vkbs", Array(libxl_device_vkb, "num_vkbs")),
> >> >  ("vtpms", Array(libxl_device_vtpm, "num_vtpms")),
> >> > -("p9", Array(libxl_device_p9, "num_p9s")),
> >> > +("p9s", Array(libxl_device_p9, "num_p9s")),
> >>
> >> Oh, no, please don't do this. We can't change the name of the fields.
> >>
> >> There is already on irregular device type -- the PCI device. I suppose
> >> you probably need another hook somewhere. And please convert PCI devices
> >> if you can.
> >
> > OK, going through the code I think we need to come to a conclusion if we
> > want an extra callback to handle the irregular device names first
> > because that's likely to affect the code of the framework in previous
> > patch.
> 
> Actually creating new callback to handle irregular device name looks
> not so good.
> There is the pattern which all namings should follow. May be it has to
> be documented

The normal pattern is DEVTYPEs.

> somewhere. p9 was added recently we can ask the author to review this rename.

Once it is released we can't change it, of course unless we deem it
unstable. I'm two minded here. P9 was released in 4.9, which was only a
few months old.

But we definitely can't change the PCI type. It has been around since
forever. And there is provision in code to deal with that.

> From other side this rename touches only internals changes: no changes
> in config file
> or CLI interface.
> 

As said, the framework need to be ready to deal with PCI anyway.

What sort of issues do you foresee here?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 1/2] libxl: Implement the handler to handle unrecoverable AER errors.

2017-07-31 Thread Wei Liu

On Fri, Jul 28, 2017 at 10:15:40AM -0700, Venu Busireddy wrote:
> On 2017-07-28 16:58:13 +0100, Wei Liu wrote:
> > On Wed, Jul 26, 2017 at 07:16:38PM -0500, Venu Busireddy wrote:
> > > Implement the callback function to handle unrecoverable AER errors, and
> > > also the public APIs that can be used to register/unregister the handler.
> > > When an AER error occurs, the handler will forcibly remove the erring
> > > PCIe device from the guest.
> > > 
> > > Signed-off-by: Venu Busireddy 
> > > ---
> > >  tools/libxl/libxl_event.h |  2 ++
> > >  tools/libxl/libxl_pci.c   | 85 
> > > +++
> > >  2 files changed, 87 insertions(+)
> > > 
> > 
> > Please also add a LIBXL_HAVE macro to libxl.h. There are plenty of
> > examples there.
> 
> I assume you meant, for example, something like:
> 
> /* LIBXL_HAVE_UNREG_AER_EVENTS_HANDLER
>  *
>  * If it is defined, libxl has a library function called
>  * libxl_unreg_aer_events_handler.
>  */
> #define LIBXL_HAVE_UNREG_AER_EVENTS_HANDLER 1
> 
> If so, I will add them in the next revision.
> 
> > > +
> > > +int libxl_reg_aer_events_handler(libxl_ctx *ctx, uint32_t domid)
> > > +{
> > > +int rc = 0;
> > > +char *be_path;
> > > +GC_INIT(ctx);
> > > +
> > > +aer_watch.domid = domid;
> > > +be_path = 
> > > GCSPRINTF("/local/domain/0/backend/pci/%u/0/aerFailedSBDF", domid);
> > 
> > I think the best thing to do is you get the domid using
> > libxl__get_domid. Try not to hard-code 0.
> > 
> > Same for your callback function. And there are quite a few 0's that I'm
> > not sure what they stand for.
> 
> All those 0's are saying dom0's domid. Isn't dom0's domid always 0?
> If that is not the case, I can use libxl__get_domid(). Please let me
> know.

Don't assume the code will always run in dom0, so use get_domid is best.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v7 00/14] arm/mem_access: Walk guest page tables in SW if mem_access is active

2017-07-31 Thread Julien Grall



On 18/07/17 13:24, Sergej Proskurin wrote:
> Hi all,

Hi,

> 
> The function p2m_mem_access_check_and_get_page is called from the function
> get_page_from_gva if mem_access is active and the hardware-aided translation 
> of
> the given guest virtual address (gva) into machine address fails. That is, if
> the stage-2 translation tables constrain access to the guests's page tables,
> hardware-assisted translation will fail. The idea of the function
> p2m_mem_access_check_and_get_page is thus to translate the given gva and check
> the requested access rights in software. However, as the current 
> implementation
> of p2m_mem_access_check_and_get_page makes use of the hardware-aided gva to 
> ipa
> translation, the translation might also fail because of reasons stated above
> and will become equally relevant for the altp2m implementation on ARM.  As
> such, we provide a software guest translation table walk to address the above
> mentioned issue.
> 
> The current version of the implementation supports translation of both the
> short-descriptor as well as the long-descriptor translation table format on
> ARMv7 and ARMv8 (AArch32/AArch64).
> 
> This revised version incorporates the comments of the previous patch series. 
> In
> this patch version we refine the definition of PAGE_SIZE_GRAN and
> PAGE_MASK_GRAN. In particular, we use PAGE_SIZE_GRAN to define PAGE_MASK_GRAN
> and thus avoid these defines to have a differing type. We also changed the
> previously introduced macro BITS_PER_LONG_LONG to BITS_PER_LLONG. Further
> changes comprise minor adjustments in comments and renaming of macros and
> function parameters. Some additional changes comprising code readability and
> correct type usage have been made and stated in the individual commits.
> 
> The following patch series can be found on Github[0].

I tried this series today with the change [1] in Xen to check the translation
is valid. However, I got a failure when booting non-LPAE arm32 Dom0:

(XEN) Loading kernel from boot module @ 80008000
(XEN) Allocating 1:1 mappings totalling 512MB for dom0:
(XEN) BANK[0] 0x00a000-0x00c000 (512MB)
(XEN) Grant table range: 0x00ffe0-0x00ffe6a000
(XEN) Loading zImage from 80008000 to a780-a7f50e28
(XEN) Allocating PPI 16 for event channel interrupt
(XEN) Loading dom0 DTB to 0xa800-0xa8001f8e
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) guest_walk_tables: gva 0xffeff018 pipa 0x1c090018
(XEN) access_guest_memory_by_ipa: gpa 0xa0207ff8
(XEN) access_guest_memory_by_ipa: gpa 0xa13aebfc
(XEN) d0: guestcopy: failed to get table entry.
(XEN) Xen BUG at traps.c:2737
(XEN) [ Xen-4.10-unstable  arm32  debug=y   Not tainted ]
(XEN) CPU:0
(XEN) PC: 00264dc0 do_trap_guest_sync+0x161c/0x1804
(XEN) CPSR:   a05a MODE:Hypervisor
(XEN)  R0: ffea R1:  R2:  R3: 004a
(XEN)  R4: 93830007 R5: 47fcff58 R6: 93830007 R7: 0007
(XEN)  R8: 1c09 R9:  R10: R11:47fcff54 R12:ffea
(XEN) HYP: SP: 47fcfee4 LR: 00258dec
(XEN) 
(XEN)   VTCR_EL2: 80003558
(XEN)  VTTBR_EL2: 00010008f3ffc000
(XEN) 
(XEN)  SCTLR_EL2: 30cd187f
(XEN)HCR_EL2: 0038663f
(XEN)  TTBR0_EL2: fff02000
(XEN) 
(XEN)ESR_EL2: 
(XEN)  HPFAR_EL2: 001c0900
(XEN)  HDFAR: ffeff018
(XEN)  HIFAR: 
(XEN) 
(XEN) Xen stack trace from sp=47fcfee4:
(XEN) 47fcff34 00256008 47fcfefc 47fcfefc 20da 0004 47fd48f4
(XEN)002d5ef0 0004 002d1f00 0004  002d1f00 c163f740 93830007
(XEN)ffeff018 1c090018  47fcff44 c15e70ac 005b c15e70ac c074400c
(XEN)0031  c0743ff8 47fcff58 00268ce0 c15e70ac 005b 0031
(XEN)ffeff000 c15e70ac 005b c15e70ac c074400c 0031  c0743ff8
(XEN) 001f   c074401c 21d3 93830007 
(XEN)c161cac0 c161cac0 c1501de0 c0735640 c161cacc c161cacc c161cad8 c161cad8
(XEN)     c161cae4 c161cae4 41d3
(XEN)     dfdfdfcf cfdfdfdf
(XEN) Xen call trace:
(XEN)[<00264dc0>] do_trap_guest_sync+0x161c/0x1804 (PC)
(XEN)[<00258dec>] access_guest_memory_by_ipa+0x25c/0x284 (LR)
(XEN)[<00268ce0>] entry.o#return_from_trap+0/0x4
(XEN) 
(XEN) 
(XEN) 
(XEN) Panic on CPU 0:
(XEN) Xen BUG at traps.c:2737
(XEN) 
(XEN) 
(XEN) Reboot in five seconds...

The IPA 0xa13aebfc is not valid for the domain.

Cheers,

[1]
diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
index 4ee07fcea3..89c5ebf3cf 100644
--- a/xen/arch/arm/guestcopy.c
+++ b/xen/arch/arm/guestcopy.c
@@ -139,6 +139,8 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t 
gpa, void *buf,
 return -EINVAL;
 }
 
+printk("%s: gpa 0x%llx\n", __FUNCTION__, gpa);
+
 page

Re: [Xen-devel] [PATCH v5 1/8] mm: Place unscrubbed pages at the end of pagelist

2017-07-31 Thread Jan Beulich

>>> Boris Ostrovsky  07/23/17 4:01 AM >>>
>On 06/27/2017 01:06 PM, Jan Beulich wrote:
> Boris Ostrovsky  06/22/17 8:55 PM >>>
>>> +{
>>> +if ( pg < first_dirty_pg )
>>> +first_dirty = (first_dirty_pg - pg) / sizeof(*pg);
>> 
>> Pointer subtraction already includes the involved division. 
>
>
>Yes, this was a mistake.
>
>> Otoh I wonder
>> if you couldn't get away without pointer comparison/subtraction here
>> altogether.
>
>
>Without comparison I can only assume that first_dirty is zero (i.e. the 
>whole buddy is potentially dirty). Is there something else I could do?

I was thinking of tracking indexes instead of pointers. But maybe that
would more hamper readability of the overall result than help it.
 
>>> @@ -892,8 +934,25 @@ static int reserve_offlined_page(struct page_info 
>>> *head)
>>>   {
>>>   merge:
>>>   /* We don't consider merging outside the head_order. */
>>> -page_list_add_tail(cur_head, &heap(node, zone, cur_order));
>>> -PFN_ORDER(cur_head) = cur_order;
>>> +
>>> +/* See if any of the pages indeed need scrubbing. */
>>> +if ( first_dirty_pg && (cur_head + (1 << cur_order) > 
>>> first_dirty_pg) )
>>> +{
>>> +if ( cur_head < first_dirty_pg )
>>> +i = (first_dirty_pg - cur_head) / 
>>> sizeof(*cur_head);
>
>I assume the same comment as above applies here.

Of course. I usually avoid repeating the same comment, except maybe
when reviewing patches of first time contributors.

>>> +else
>>> +i = 0;
>>> +
>>> +for ( ; i < (1 << cur_order); i++ )
>>> +if ( test_bit(_PGC_need_scrub,
>>> +  &cur_head[i].count_info) )
>>> +{
>>> +first_dirty = i;
>>> +break;
>>> +}
>> 
>> Perhaps worth having ASSERT(first_dirty != INVALID_DIRTY_IDX) here? Or are
>> there cases where ->u.free.first_dirty of a page may be wrong?
>
>
>When we merge in free_heap_pages we don't clear first_dirty of the 
>successor buddy (at some point I did have this done but you questioned 
>whether it was needed and I dropped it).

Hmm, this indeed answers my question, but doesn't help (me) understanding
whether the suggested ASSERT() could be wrong.

>>> --- a/xen/include/asm-x86/mm.h
>>> +++ b/xen/include/asm-x86/mm.h
>>> @@ -88,7 +88,15 @@ struct page_info
>>>   /* Page is on a free list: ((count_info & PGC_count_mask) == 0). 
>>> */
>>>   struct {
>>>   /* Do TLBs need flushing for safety before next page use? */
>>> -bool_t need_tlbflush;
>>> +unsigned long need_tlbflush:1;
>>> +
>>> +/*
>>> + * Index of the first *possibly* unscrubbed page in the buddy.
>>> + * One more than maximum possible order (MAX_ORDER+1) to
>> 
>> Why +1 here and hence ...
>
>Don't we have MAX_ORDER+1 orders?

So here there might be a simple misunderstanding: I understand the
parenthesized MAX_ORDER+1 to represent "maximum possible
order", i.e. excluding the "one more than", not the least because of
the ...

>> + * accommodate INVALID_DIRTY_IDX.
>> + */
>> +#define INVALID_DIRTY_IDX (-1UL & (((1UL<> +unsigned long first_dirty:MAX_ORDER + 2;

+2 here.

>> ... why +2 instead of +1? And isn't the expression INVALID_DIRTY_IDX wrongly
>> parenthesized (apart from lacking blanks around the shift operator)? I'd
>> expect you want a value with MAX_ORDER+1 set bits, i.e.
>> (1UL << (MAX_ORDER + 1)) - 1. ANDing with -1UL seems quite pointless too.
>
>Yes to parentheses and AND. Should be (1UL << (MAX_ORDER + 2)) - 1

I.e. I would still expect it to be (1UL << (MAX_ORDER + 1)) - 1
here.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 05/12] x86/domctl: Handle ACPI access from domctl

2017-07-31 Thread Boris Ostrovsky

On 07/31/2017 10:14 AM, Ross Lagerwall wrote:
> On 01/03/2017 02:04 PM, Boris Ostrovsky wrote:
>> Signed-off-by: Boris Ostrovsky 
>> ---
>> Changes in v6:
>> * Adjustments to to patch 4 changes.
>> * Added a spinlock for VCPU map access
>> * Return an error on guest trying to write VCPU map
>>
> snip
>> -static int acpi_cpumap_access_common(struct domain *d, bool is_write,
>> - unsigned int port,
>> +static int acpi_cpumap_access_common(struct domain *d, bool
>> is_guest_access,
>> + bool is_write, unsigned int port,
>>unsigned int bytes, uint32_t
>> *val)
>>   {
>>   unsigned int first_byte = port - XEN_ACPI_CPU_MAP;
>> +int rc = X86EMUL_OKAY;
>>
>>   BUILD_BUG_ON(XEN_ACPI_CPU_MAP + XEN_ACPI_CPU_MAP_LEN
>>> ACPI_GPE0_BLK_ADDRESS_V1);
>>
>> +spin_lock(&d->arch.hvm_domain.acpi_lock);
>> +
>>   if ( !is_write )
>>   {
>>   uint32_t mask = (bytes < 4) ? ~0U << (bytes * 8) : 0;
>> @@ -32,23 +37,61 @@ static int acpi_cpumap_access_common(struct
>> domain *d, bool is_write,
>>   memcpy(val, (uint8_t *)d->avail_vcpus + first_byte,
>>  min(bytes, ((d->max_vcpus + 7) / 8) - first_byte));
>>   }
>> +else if ( !is_guest_access )
>> +memcpy((uint8_t *)d->avail_vcpus + first_byte, val,
>> +   min(bytes, ((d->max_vcpus + 7) / 8) - first_byte));
>>   else
>>   /* Guests do not write CPU map */
>> -return X86EMUL_UNHANDLEABLE;
>> +rc = X86EMUL_UNHANDLEABLE;
>>
>> -return X86EMUL_OKAY;
>> +spin_unlock(&d->arch.hvm_domain.acpi_lock);
>> +
>> +return rc;
>>   }
>>
>>   int hvm_acpi_domctl_access(struct domain *d,
>>  const struct xen_domctl_acpi_access
>> *access)
>>   {
>> -return -ENOSYS;
>> +unsigned int bytes, i;
>> +uint32_t val = 0;
>> +uint8_t *ptr = (uint8_t *)&val;
>> +int rc;
>> +bool is_write = (access->rw == XEN_DOMCTL_ACPI_WRITE) ? true :
>> false;
>> +
>> +if ( has_acpi_dm_ff(d) )
>> +return -EOPNOTSUPP;
>> +
>> +if ( access->space_id != XEN_ACPI_SYSTEM_IO )
>> +return -EINVAL;
>> +
>> +if ( !((access->address >= XEN_ACPI_CPU_MAP) &&
>> +   (access->address < XEN_ACPI_CPU_MAP +
>> XEN_ACPI_CPU_MAP_LEN)) )
>> +return -ENODEV;
>> +
>> +for ( i = 0; i < access->width; i += sizeof(val) )
>> +{
>> +bytes = (access->width - i > sizeof(val)) ?
>> +sizeof(val) : access->width - i;
>> +
>> +if ( is_write && copy_from_guest_offset(ptr, access->val, i,
>> bytes) )
>> +return -EFAULT;
>> +
>> +rc = acpi_cpumap_access_common(d, false, is_write,
>> +   access->address, bytes, &val);
>
> While I'm looking at this code...
> This doesn't work if access->width > sizeof(val) (4 bytes). The same
> value (access->address) is always passed into
> acpi_cpumap_access_common for 'port' and this is used as an offset
> into the avail_cpus array. So the offset is unchanged and only the
> first 4 bytes of avail_cpus ever gets changed.

I'd have to go back to the series (haven't looked at it since it was
posted back in January) but I think I enforce somewhere size of the
access to fit into 4 bytes. And if not then you are right.


-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 1/2] libxl: Implement the handler to handle unrecoverable AER errors.

2017-07-31 Thread Wei Liu

On Fri, Jul 28, 2017 at 04:56:56PM -0700, Venu Busireddy wrote:
> On 2017-07-28 17:39:52 +0100, Ian Jackson wrote:
> > Venu Busireddy writes ("[PATCH v2 1/2] libxl: Implement the handler to 
> > handle unrecoverable AER errors."):
> > > Implement the callback function to handle unrecoverable AER errors, and
> > > also the public APIs that can be used to register/unregister the handler.
> > > When an AER error occurs, the handler will forcibly remove the erring
> > > PCIe device from the guest.
> > 
> > Why is this only sometimes the right thing to do ?  On what basis
> > might a user choose ?
> 
> This is not an "only sometimes" thing. User doesn't choose it. We always
> want to watch for AER errors.
> 
> > If this is always the right thing to do then maybe we need to recast
> > this as a general "please run monitoring for this domain" call ?
> 
> What does "recast" imply? Which "call" are you referring to?
> 
> > > +int libxl_reg_aer_events_handler(libxl_ctx *, uint32_t) 
> > > LIBXL_EXTERNAL_CALLERS_ONLY;
> > > +void libxl_unreg_aer_events_handler(libxl_ctx *, uint32_t) 
> > > LIBXL_EXTERNAL_CALLERS_ONLY;
> > 
> > I think these names are very unintuitive.  They describe the
> > implementation, not the effect.
> 
> These names can be changed to anything you want. Please suggest any names
> of your choice, and I will change them. That ensures that we don't spend
> more review revisions in fine tuning those names.
> 
> > The API seems awkward.  Inside libxl, events are only processed while
> > the application is inside libxl.  So for these functions to be
> > effective, the calling application must arrange to be running the
> > libxl event loop.  This should be documented, at least.
> 
> I am afraid I don't follow you. This scheme is tested and it works. So, I
> do not follow you when you say, "...for these functions to be effective,..."

Libxl has an internal event loop. The code as-is registers a watch which
runs on the internal event loop. The event loop's event is only
processed when the process enters libxl (calls libxl functions).

Imagine a toolstack which doesn't use libxl's internal event loop -- for
example libvirt has its own event loop, or a toolstack which only calls
the register function then nothing else. Your API would not work for
those cases.

Ian, please correct me if I'm wrong.

> 
> > What happens if more than one process calls this at once ?
> 
> Each call is handled within a separate 'xl' process's context. I don't
> see a problem there. Do you have anything specific in mind?
> 

It is possible that both xl processes see its watch fires and try to
write to the same node at the same time.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 3/8] mm: Scrub pages in alloc_heap_pages() if needed

2017-07-31 Thread Jan Beulich

>>> Boris Ostrovsky  07/23/17 4:07 AM >>>
>On 06/27/2017 02:00 PM, Jan Beulich wrote:
> Boris Ostrovsky  06/22/17 8:55 PM >>>
>>> @@ -862,10 +879,19 @@ static struct page_info *alloc_heap_pages(
>>>   if ( d != NULL )
>>>   d->last_alloc_node = node;
>>>   
>>> +need_scrub = !!first_dirty_pg && !(memflags & MEMF_no_scrub);
>> 
>> No need for !! here. But I wonder whether that part of the check is really
>> useful anyway, considering the sole use ...
>> 
>>>   for ( i = 0; i < (1 << order); i++ )
>>>   {
>>>   /* Reference count must continuously be zero for free pages. */
>>> -BUG_ON(pg[i].count_info != PGC_state_free);
>>> +BUG_ON((pg[i].count_info & ~PGC_need_scrub) != PGC_state_free);
>>> +
>>> +if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
>>> +{
>>> +if ( need_scrub )
>>> +scrub_one_page(&pg[i]);
>> 
>> ... here. If it isn't, I think the local variable isn't warranted either.
>> If you agree, the thus adjusted patch can have
>> Reviewed-by: Jan Beulich 
>> (otherwise I'll wait with it to understand the reason first).
>
>first_dirty_pg is indeed unnecessary but I think local variable is 
>useful to avoid ANDing memflags inside the loop on each iteration 
>(unless you think compiler is smart enough to realize that memflags is 
>not changing).

I don't understand: At least on x86 I'd expect the compiler to use a
single TEST if you used memflags inside the loop, whereas the local
variable would likely be a single CMP inside the loop plus setup code
outside of it.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file

2017-07-31 Thread Wei Liu

On Mon, Jul 31, 2017 at 01:09:04AM +0800, Zhongze Liu wrote:
> 
>   @cache_policyCan only appear when @role = master.
>The stage-2 cacheability/shareability attributes of the
>shared memory area. Currently, only two policies are
>supported:
>  * ARM_normal: Only applicable to ARM guests. This
>would mean Inner and Outer Write-Back
>Cacheable, and Inner Shareable.
>  * x86_normal: Only applicable to x86 HVM guests. This
>would mean Write-Back Cacheable.

This question might have been asked before, I'm sorry for not being
able to follow closely previous discussions...

Why are these opaque names chosen instead of just writing "wb" "wc" or
whatever?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 4/8] mm: Scrub memory from idle loop

2017-07-31 Thread Jan Beulich

>>> Boris Ostrovsky  07/23/17 4:14 AM >>>
>>> @@ -1050,17 +1120,42 @@ static void scrub_free_pages(unsigned int node)
>>> -if ( node_need_scrub[node] == 0 )
>>> -return;
>>> +if ( preempt || (node_need_scrub[node] == 0) )
>>> +goto out;
>>>   }
>>>   } while ( order-- != 0 );
>>>   }
>>> +
>>> + out:
>>> +spin_unlock(&heap_lock);
>>> +node_clear(node, node_scrubbing);
>>> +return softirq_pending(cpu) || (node_to_scrub(false) != NUMA_NO_NODE);
>> 
>> While I can see why you use it here, the softirq_pending() looks sort of
>> misplaced: While invoking it twice in the caller will look a little odd too,
>> I still think that's where the check belongs.
>
>
>scrub_free_pages is called from idle loop as
>
>else if ( !softirq_pending(cpu) && !scrub_free_pages() )
>pm_idle();
>
>so softirq_pending() is unnecessary here.
>
>(Not sure why you are saying it would be invoked twice)

That was sort of implicit - the caller would want to become


else if ( !softirq_pending(cpu) && !scrub_free_pages() && 
!softirq_pending(cpu) )
pm_idle();

to account for the fact that a softirq may become pending while scrubbing.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 0/3] Docs: consolidate release related documents

2017-07-31 Thread Wei Liu

On Mon, Jul 31, 2017 at 02:51:21PM +0100, Ian Jackson wrote:
> Wei Liu writes ("[PATCH 0/3] Docs: consolidate release related documents"):
> > Wei Liu (3):
> >   docs: consolidate release related documents
> >   docs: add xen-release-management.pandoc
> >   docs: hook up process/ to build system
> 
> FWIW,
> 
> Acked-by: Ian Jackson 
> 
> However, AFAICT the mean reason this doesn't process
> release-checklist.txt and branching-checklist.txt into files on the
> website is that you only have it process pandocs.
> 

Correct, that's deliberate.

> But I don't think those files are ever going to want to be made into
> web pages.  Whereas it is possible that we will grow other process
> documents in .txt form which do, and no provision is made for them.
> 
> I wonder if it would be better to rename those to files to a different
> directory.  The reason I invented misc/ was precisely to avoid people
> (users and developers) tripping over them thinking they might be
> useful...
> 

Lars (?) said there should be a "process" directory, hence I put
everything that's relevant here.

I'm open to suggestions about directory structure. Maybe having a
technician-checklist directory under process?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PVH VCPU hotplug support v7?

2017-07-31 Thread Ross Lagerwall


On 07/31/2017 03:29 PM, Boris Ostrovsky wrote:

On 07/31/2017 10:12 AM, Andrew Cooper wrote:

On 31/07/17 14:55, Boris Ostrovsky wrote:

On 07/31/2017 09:20 AM, Ross Lagerwall wrote:

Hi Boris,

I've modified your PVH VCPU hotplug support v6 patch series [1] to
support HVM guests running _with_ a device model for XenServer's
purposes. This is useful because it moves the vCPU hotplug handling
out of QEMU and allows it to mostly be shared with PVH. It will also
allow unplugging vCPUs (libxl currently only does cpu-add for upstream
qemu).

Are you still planning on continuing with that patch series since your
commit to Linux [2]?

This series has been put on hold until we figure out what to do with
hotplug for PVH dom0. (The problem was the "dual" view by dom0 of APCI
CPU namespace --- on hotplug event dom0 has to somehow figure out
whether the event was due to (dis)appearance of a physical or virtual CPU).

I don't think this has been dealt with yet (copying Roger).

 From the point of view of unblocking several pieces of work, it would be
fine for this logic to be behind an emulation flag, just like LAPIC/etc.


The (I think) last message discussing this series was

https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00143.html

Are you suggesting extracting pieces that would move hotplug support for
HVM guests from qemu to hypervisor/toolstack but leave all PVH-specific
code out? (The feature flag is part of this series ---
https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00059.html)



I think(?) Andrew was suggesting to have an emulation flag such that 
hotplug support is moved into the hypervisor for HVM guests _and_ PVH 
guests except for PVH dom0.


I don't know what work this unblocks that he was referring to.

--
Ross Lagerwall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 1/2] libxl: use xen-blkback for 'vbd' disk types by default

2017-07-31 Thread Wei Liu

On Fri, Jul 28, 2017 at 06:42:13PM +0200, Marek Marczykowski-Górecki wrote:
> This will allow later to make HVM domain without qemu in dom0 (in
> addition to the one in stubdomain).
> 
> Signed-off-by: Marek Marczykowski-Górecki 
> 
> ---
> This is extracted from v1 of "libxl: do not start dom0 qemu for
> stubdomain when not needed".
> 
> Signed-off-by: Marek Marczykowski-Górecki 
> ---
>  tools/libxl/libxl_disk.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/libxl/libxl_disk.c b/tools/libxl/libxl_disk.c
> index 63de75c..7842d9b 100644
> --- a/tools/libxl/libxl_disk.c
> +++ b/tools/libxl/libxl_disk.c
> @@ -56,10 +56,12 @@ static void disk_eject_xswatch_callback(libxl__egc *egc, 
> libxl__ev_xswatch *w,
>  "/local/domain/%d/backend/%" TOSTRING(BACKEND_STRING_SIZE)
> "[a-z]/%*d/%*d",
> &disk->backend_domid, backend_type);
> -if (!strcmp(backend_type, "tap") || !strcmp(backend_type, "vbd")) {
> +if (!strcmp(backend_type, "tap")) {
>  disk->backend = LIBXL_DISK_BACKEND_TAP;
>  } else if (!strcmp(backend_type, "qdisk")) {
>  disk->backend = LIBXL_DISK_BACKEND_QDISK;
> +} else if (!strcmp(backend_type, "vbd")) {
> +disk->backend = LIBXL_DISK_BACKEND_PHY;

Wait, it only occurred to me until now this patch is changing
disk_eject_xswatch_callback.

Is this a bug fix? How is it possible for the backend_type to be "vbd"
when there isn't such thing in libxl_types.idl?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] xen/link: Move .data.rel.ro sections into .rodata for final link

2017-07-31 Thread David Woodhouse

On Mon, 2017-07-31 at 07:15 -0600, Jan Beulich wrote:
> > > > David Woodhouse  07/31/17 1:02 PM >>>
> > On Sun, 2017-07-30 at 00:16 -0600, Jan Beulich wrote:
> > > > > > David Woodhouse  07/20/17 5:22 PM >>>
> > > > This includes stuff lke the hypercall tables which we really want
> > > > to be read-only. And they were going into .data.read-mostly.
> > >
> > > Yes, we'd like them to be read-only, but what if EFI properly assigned r/o
> > > permissions to the .rodata section when loading xen.efi? We'd then be
> > > unable to apply relocations when switching from 1:1 to virtual mappings
> > > (see efi_arch_relocate_image()).
> > 
> > FWIW it does look like TianoCore has gained the ability to mark
> > sections as read-only, in January of this year:
> > https://github.com/tianocore/edk2/commit/d0e92aad46
> > 
> > It doesn't actually seem to be complete — even with subsequent fixes
> > since that commit, it doesn't look like it catches the case of data
> > sections without EFI_IMAGE_SCN_MEM_WRITE, such as .rodata. 
> > 
> > And even if/when that gets fixed you'll note that the protection is
> > deliberately torn down in ExitBootServices(), specifically for the case
> > you're concerned about below — because you'll need to do the
> > relocations.
>
> As said in an earlier reply, a first pass over relocations is being done
> long before the call to ExitBootServices(). A minimal adjustment to
> efi_arch_relocate_image() will be needed anyway, afaict.

Ah, right. I think more "implied" than "said" but I understand now. :)

At least, I understand what you're saying... I have no bloody clue
what's actually going on though.

There is a first call to efi_arch_relocate_image(0) before the
ExitBootServices() call. However I'm missing something because I can't
see how that call achieves anything *other* than to trigger the fault
we're concerned about.

There are three types of relocations — PE_BASE_RELOC_ABS,
PE_BASE_RELOC_HIGHLOW, and PE_BASE_RELOC_DIR64.

The first (ABS) doesn't seem to do anything, ever. And is never emitted
by mkrelocs.c. 

The second (HIGHLOW) does nothing if (!delta).

The third (DIR64) simply adds 'delta' to the target address. We could
potentially stop it faulting on that pointless '*addr += 0' by doing
this...

--- a/xen/arch/x86/efi/efi-boot.h
+++ b/xen/arch/x86/efi/efi-boot.h
@@ -87,7 +87,8 @@ static void __init efi_arch_relocate_image(unsigned long 
delta)
 case PE_BASE_RELOC_DIR64:
 if ( in_page_tables(addr) )
 blexit(L"Unexpected relocation type");
-*(u64 *)addr += delta;
+if ( delta )
+*(u64 *)addr += delta;
 break;
 default:
 blexit(L"Unsupported relocation type");

... but then again, if the whole function is really doing nothing at
all when invoked with delta==0 then perhaps it would just be easier to
remove the first pass altogether. I feel sure I'm missing something,
but what? Is it still supposed to be adding xen_phys_start in the
PE_BASE_RELOC_HIGHLOW case even when delta==0? Because it isn't...

Either way, this is still broken before my patch though, right?
Especially if UEFI learns to do it for non-executable sections, but
AFAICT even before that.

These are the sections I see the PE section headers of a local build:

 Name     Characteristics   Relocations
.text    0xe0d00020 (WRX)    ✓
.rodata  0x40600040 ( R )    ✓
.buildid 0x40300040 ( R )
.init.te 0x60500020 ( RX)    ✓
.init.da 0xc0d00040 (WR )    ✓
.data.re 0xc0800040 (WR )    ✓
.data    0xc0d00040 (WR )    ✓
.bss     0xc180 (WR )
.reloc   0x42300040 ( R )
.pad     0xc0300080 (WR )

So there are (again, before my patch) relocations in .init.da(ta) and
.rodata sections which UEFI *might* start marking read-only, and also
in .init.te(xt) which is R+X and could be marked read-only today.

And the .init.te(xt) relocations include PE_BASE_RELOC_DIR64
relocations, which *would* cause a fault in the !delta case.

(All the relocations in .rodata both before and after my patch are also
PE_BASE_RELOC_DIR64, FWIW)

smime.p7s
Description: S/MIME cryptographic signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 1/2] libxl: use xen-blkback for 'vbd' disk types by default

2017-07-31 Thread Wei Liu

On Mon, Jul 31, 2017 at 04:56:04PM +0100, Wei Liu wrote:
> On Fri, Jul 28, 2017 at 06:42:13PM +0200, Marek Marczykowski-Górecki wrote:
> > This will allow later to make HVM domain without qemu in dom0 (in
> > addition to the one in stubdomain).
> > 
> > Signed-off-by: Marek Marczykowski-Górecki 
> > 
> > ---
> > This is extracted from v1 of "libxl: do not start dom0 qemu for
> > stubdomain when not needed".
> > 
> > Signed-off-by: Marek Marczykowski-Górecki 
> > ---
> >  tools/libxl/libxl_disk.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tools/libxl/libxl_disk.c b/tools/libxl/libxl_disk.c
> > index 63de75c..7842d9b 100644
> > --- a/tools/libxl/libxl_disk.c
> > +++ b/tools/libxl/libxl_disk.c
> > @@ -56,10 +56,12 @@ static void disk_eject_xswatch_callback(libxl__egc 
> > *egc, libxl__ev_xswatch *w,
> >  "/local/domain/%d/backend/%" TOSTRING(BACKEND_STRING_SIZE)
> > "[a-z]/%*d/%*d",
> > &disk->backend_domid, backend_type);
> > -if (!strcmp(backend_type, "tap") || !strcmp(backend_type, "vbd")) {
> > +if (!strcmp(backend_type, "tap")) {
> >  disk->backend = LIBXL_DISK_BACKEND_TAP;
> >  } else if (!strcmp(backend_type, "qdisk")) {
> >  disk->backend = LIBXL_DISK_BACKEND_QDISK;
> > +} else if (!strcmp(backend_type, "vbd")) {
> > +disk->backend = LIBXL_DISK_BACKEND_PHY;
> 
> Wait, it only occurred to me until now this patch is changing
> disk_eject_xswatch_callback.
> 
> Is this a bug fix? How is it possible for the backend_type to be "vbd"
> when there isn't such thing in libxl_types.idl?

Oh, I'm an idiot. That's read from xenstore path. I think this patch is
correct. But I still tend to think this is a bug fix. How do you
discover this problem? Has disk eject ever worked for you?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 1/8] mm: Place unscrubbed pages at the end of pagelist

2017-07-31 Thread Boris Ostrovsky

On 07/31/2017 10:45 AM, Jan Beulich wrote:
 Boris Ostrovsky  07/23/17 4:01 AM >>>
>> On 06/27/2017 01:06 PM, Jan Beulich wrote:
>> Boris Ostrovsky  06/22/17 8:55 PM >>>
 +{
 +if ( pg < first_dirty_pg )
 +first_dirty = (first_dirty_pg - pg) / sizeof(*pg);
>>> Pointer subtraction already includes the involved division. 
>>
>> Yes, this was a mistake.
>>
>>> Otoh I wonder
>>> if you couldn't get away without pointer comparison/subtraction here
>>> altogether.
>>
>> Without comparison I can only assume that first_dirty is zero (i.e. the 
>> whole buddy is potentially dirty). Is there something else I could do?
> I was thinking of tracking indexes instead of pointers. But maybe that
> would more hamper readability of the overall result than help it.

I'll try to see how it looks.

>  
 +else
 +i = 0;
 +
 +for ( ; i < (1 << cur_order); i++ )
 +if ( test_bit(_PGC_need_scrub,
 +  &cur_head[i].count_info) )
 +{
 +first_dirty = i;
 +break;
 +}
>>> Perhaps worth having ASSERT(first_dirty != INVALID_DIRTY_IDX) here? Or are
>>> there cases where ->u.free.first_dirty of a page may be wrong?
>>
>> When we merge in free_heap_pages we don't clear first_dirty of the 
>> successor buddy (at some point I did have this done but you questioned 
>> whether it was needed and I dropped it).
> Hmm, this indeed answers my question, but doesn't help (me) understanding
> whether the suggested ASSERT() could be wrong.

Oh, I see what you were asking --- ASSERT() *after* the loop, to make
sure we indeed found the first dirty page. Yes, I will add it.

>
 --- a/xen/include/asm-x86/mm.h
 +++ b/xen/include/asm-x86/mm.h
 @@ -88,7 +88,15 @@ struct page_info
   /* Page is on a free list: ((count_info & PGC_count_mask) == 0). 
 */
   struct {
   /* Do TLBs need flushing for safety before next page use? */
 -bool_t need_tlbflush;
 +unsigned long need_tlbflush:1;
 +
 +/*
 + * Index of the first *possibly* unscrubbed page in the buddy.
 + * One more than maximum possible order (MAX_ORDER+1) to
>>> Why +1 here and hence ...
>> Don't we have MAX_ORDER+1 orders?
> So here there might be a simple misunderstanding: I understand the
> parenthesized MAX_ORDER+1 to represent "maximum possible
> order", i.e. excluding the "one more than", not the least because of
> the ...
>
>>> + * accommodate INVALID_DIRTY_IDX.
>>> + */
>>> +#define INVALID_DIRTY_IDX (-1UL & (((1UL<>> +unsigned long first_dirty:MAX_ORDER + 2;
> +2 here.
>
>>> ... why +2 instead of +1? And isn't the expression INVALID_DIRTY_IDX wrongly
>>> parenthesized (apart from lacking blanks around the shift operator)? I'd
>>> expect you want a value with MAX_ORDER+1 set bits, i.e.
>>> (1UL << (MAX_ORDER + 1)) - 1. ANDing with -1UL seems quite pointless too.
>> Yes to parentheses and AND. Should be (1UL << (MAX_ORDER + 2)) - 1
> I.e. I would still expect it to be (1UL << (MAX_ORDER + 1)) - 1
> here.


Sorry, I still don't get it.

Say, MAX_ORDER is 1. Since this implies that indexes 0, 1, 2 and 3 are
all valid (because we can have up to 2^(MAX_ORDER+1) pages), don't we
need 3 bits to indicate an invalid index?

-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 3/5] xen/livepatch/ARM32: Don't load and crash on livepatches loaded with wrong alignment.

2017-07-31 Thread Konrad Rzeszutek Wilk

On Mon, Jul 31, 2017 at 07:55:34AM -0600, Jan Beulich wrote:
> >>> Konrad Rzeszutek Wilk  07/26/17 9:50 PM >>>
> >--- a/docs/misc/livepatch.markdown
> >+++ b/docs/misc/livepatch.markdown
> >@@ -279,6 +279,10 @@ It may also have some architecture-specific sections. 
> >For example:
> >* Exception tables.
> >* Relocations for each of these sections.
>  >
> >+Note that on ARM 32 the sections SHOULD be four byte aligned. Otherwise
> >+we risk hitting Data Abort exception as un-aligned manipulation of data is
> >+prohibited on ARM 32.
> 
> This (and hence the rest of the patch) is not in line with the outcome of the
> earlier discussion we had. Nothing is wrong with a section having smaller
> alignment, as long as there are no 32-bit (or wider, but I don't think there
> are any such) relocations against such a section. And even if there were, I
> think it should rather be the code doing the relocations needing to cope, as
> I don't think the ARM ELF ABI imposes any such restriction.

The idea behind this patch is to give advance warnings. Akin to what
2ff229643b739e2fd0cd0536ee9fca506cfa92f8
"xen/livepatch: Don't crash on encountering STN_UNDEF relocations" did.

The other patches in this series fix the alignment issues.

The ARM ELF ABI 
(http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044f/IHI0044F_aaelf.pdf)

says:

4.3.5 Section Alignment
There is no minimum alignment required for a section. However, sections 
containing thumb code must be at least
16-bit aligned and sections containing ARM code must be at least 32-bit aligned.
Platform standards may set a limit on the maximum alignment that they can 
guarantee (normally the page size).

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 3/8] mm: Scrub pages in alloc_heap_pages() if needed

2017-07-31 Thread Boris Ostrovsky

On 07/31/2017 11:16 AM, Jan Beulich wrote:
 Boris Ostrovsky  07/23/17 4:07 AM >>>
>> On 06/27/2017 02:00 PM, Jan Beulich wrote:
>> Boris Ostrovsky  06/22/17 8:55 PM >>>
 @@ -862,10 +879,19 @@ static struct page_info *alloc_heap_pages(
   if ( d != NULL )
   d->last_alloc_node = node;
   
 +need_scrub = !!first_dirty_pg && !(memflags & MEMF_no_scrub);
>>> No need for !! here. But I wonder whether that part of the check is really
>>> useful anyway, considering the sole use ...
>>>
   for ( i = 0; i < (1 << order); i++ )
   {
   /* Reference count must continuously be zero for free pages. */
 -BUG_ON(pg[i].count_info != PGC_state_free);
 +BUG_ON((pg[i].count_info & ~PGC_need_scrub) != PGC_state_free);
 +
 +if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
 +{
 +if ( need_scrub )
 +scrub_one_page(&pg[i]);
>>> ... here. If it isn't, I think the local variable isn't warranted either.
>>> If you agree, the thus adjusted patch can have
>>> Reviewed-by: Jan Beulich 
>>> (otherwise I'll wait with it to understand the reason first).
>> first_dirty_pg is indeed unnecessary but I think local variable is 
>> useful to avoid ANDing memflags inside the loop on each iteration 
>> (unless you think compiler is smart enough to realize that memflags is 
>> not changing).
> I don't understand: At least on x86 I'd expect the compiler to use a
> single TEST if you used memflags inside the loop, whereas the local
> variable would likely be a single CMP inside the loop plus setup code
> outside of it.

OK, I haven't considered that you don't actually need to AND and then
CMP. Then yes, local variable is unnecessary.

-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] xen:arm earlyprintk configuration for Hikey 960 boards

2017-07-31 Thread Konrad Rzeszutek Wilk

On Thu, Jul 27, 2017 at 10:52:40AM +0100, Julien Grall wrote:
> Hi Konrad,
> 
> On 27/07/17 02:18, Konrad Rzeszutek Wilk wrote:
> > On Wed, Jul 26, 2017 at 05:59:15PM +0100, Julien Grall wrote:
> > > Hi Konrad,
> > > 
> > > On 26/07/17 17:54, Konrad Rzeszutek Wilk wrote:
> > > > Introduce an earlyprintk configuration of Hikey 960 boards.
> > 
> > ..snip..
> > > 
> > > Would it be possible to update the wiki page on the hikey [1] with your
> > > latest finding?
> > 
> > I added a whole new web-page:
> > https://wiki.xenproject.org/wiki/HiKey960
> > 
> > As the HiKey != HiKey960
> 
> Oh :). Fine then. Would you be up to do testing on this platform during RC?

Yes.
> 
> If you can you add your name in [1]. This would help user to know the board
> has been tested with latest release.
> 
> > 
> > [Also updated the landing page 
> > https://wiki.xenproject.org/wiki/Xen_ARM_with_Virtualization_Extensions
> > to point to this new one]
> > 
> 
> Cheers,
> 
> [1] https://wiki.xenproject.org/wiki/Xen_ARM_Manual_Smoke_Test/Results
> 
> -- 
> Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 4/8] mm: Scrub memory from idle loop

2017-07-31 Thread Boris Ostrovsky

On 07/31/2017 11:20 AM, Jan Beulich wrote:
 Boris Ostrovsky  07/23/17 4:14 AM >>>
 @@ -1050,17 +1120,42 @@ static void scrub_free_pages(unsigned int node)
 -if ( node_need_scrub[node] == 0 )
 -return;
 +if ( preempt || (node_need_scrub[node] == 0) )
 +goto out;
   }
   } while ( order-- != 0 );
   }
 +
 + out:
 +spin_unlock(&heap_lock);
 +node_clear(node, node_scrubbing);
 +return softirq_pending(cpu) || (node_to_scrub(false) != NUMA_NO_NODE);
>>> While I can see why you use it here, the softirq_pending() looks sort of
>>> misplaced: While invoking it twice in the caller will look a little odd too,
>>> I still think that's where the check belongs.
>>
>> scrub_free_pages is called from idle loop as
>>
>> else if ( !softirq_pending(cpu) && !scrub_free_pages() )
>> pm_idle();
>>
>> so softirq_pending() is unnecessary here.
>>
>> (Not sure why you are saying it would be invoked twice)
> That was sort of implicit - the caller would want to become
>
>
> else if ( !softirq_pending(cpu) && !scrub_free_pages() && 
> !softirq_pending(cpu) )
> pm_idle();
>
> to account for the fact that a softirq may become pending while scrubbing.

That would look really odd IMO.

Would

else if ( !softirq_pending(cpu) )
if ( !scrub_free_pages() && !softirq_pending(cpu) )
   pm_idle();

or 

else if ( !softirq_pending(cpu) && !scrub_free_pages() )
if ( !softirq_pending(cpu) )
pm_idle();

 
be better? (I'd prefer the first)

-boris



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] PVH VCPU hotplug support v7?

2017-07-31 Thread Boris Ostrovsky

On 07/31/2017 11:36 AM, Ross Lagerwall wrote:
> On 07/31/2017 03:29 PM, Boris Ostrovsky wrote:
>> On 07/31/2017 10:12 AM, Andrew Cooper wrote:
>>> On 31/07/17 14:55, Boris Ostrovsky wrote:
 On 07/31/2017 09:20 AM, Ross Lagerwall wrote:
> Hi Boris,
>
> I've modified your PVH VCPU hotplug support v6 patch series [1] to
> support HVM guests running _with_ a device model for XenServer's
> purposes. This is useful because it moves the vCPU hotplug handling
> out of QEMU and allows it to mostly be shared with PVH. It will also
> allow unplugging vCPUs (libxl currently only does cpu-add for
> upstream
> qemu).
>
> Are you still planning on continuing with that patch series since
> your
> commit to Linux [2]?
 This series has been put on hold until we figure out what to do with
 hotplug for PVH dom0. (The problem was the "dual" view by dom0 of APCI
 CPU namespace --- on hotplug event dom0 has to somehow figure out
 whether the event was due to (dis)appearance of a physical or
 virtual CPU).

 I don't think this has been dealt with yet (copying Roger).
>>>  From the point of view of unblocking several pieces of work, it
>>> would be
>>> fine for this logic to be behind an emulation flag, just like
>>> LAPIC/etc.
>>
>> The (I think) last message discussing this series was
>>
>> https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00143.html
>>
>>
>> Are you suggesting extracting pieces that would move hotplug support for
>> HVM guests from qemu to hypervisor/toolstack but leave all PVH-specific
>> code out? (The feature flag is part of this series ---
>> https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00059.html)
>>
>>
>
> I think(?) Andrew was suggesting to have an emulation flag such that
> hotplug support is moved into the hypervisor for HVM guests _and_ PVH
> guests except for PVH dom0.

That (different handling for PVH dom0 vs. domU) was exactly what Jan was
objecting to. (I'll add him too).

-boris

>
> I don't know what work this unblocks that he was referring to.
>


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 2/2] libxl: do not start dom0 qemu for stubdomain when not needed

2017-07-31 Thread Wei Liu

On Fri, Jul 28, 2017 at 06:42:14PM +0200, Marek Marczykowski-Górecki wrote:
> Do not setup vfb+vkb when no access method was configured. Then check if
> qemu is really needed.
> 
> The only not configurable thing forcing qemu running in dom0 after this
> change are consoles used to save/restore. But even in that case, there
> is much smaller part of qemu exposed.
> 
> Signed-off-by: Marek Marczykowski-Górecki 

Acked-by: Wei Liu 

> +/* will be changed back to LIBXL__CONSOLE_BACKEND_IOEMU if 
> qemu
> + * will be in use */

Hmm... I don't think there is requirement in CODING_STYLE for
multiple-line comment, so there are quite a few styles in use. But
looking at libxl code the prevailing style seems to be:

   /*
*  
*  
*/

I will make the adjustment while committing.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 2/2] libxl: do not start dom0 qemu for stubdomain when not needed

2017-07-31 Thread Ian Jackson

Wei Liu writes ("Re: [PATCH v2 2/2] libxl: do not start dom0 qemu for 
stubdomain when not needed"):
> Hmm... I don't think there is requirement in CODING_STYLE for
> multiple-line comment, so there are quite a few styles in use. But
> looking at libxl code the prevailing style seems to be:
> 
>/*
> *  
> *  
> */
> 
> I will make the adjustment while committing.

Personally I think this is well into the kind of territory that is not
worth arguing over.  Even a mix of styles here is fine.  (I think we
should mandate the initial `*' on continuation lines.)

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [libvirt test] 112392: trouble: blocked/broken/pass

2017-07-31 Thread osstest service owner

flight 112392 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112392/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 112276
 build-arm64-xsm   2 hosts-allocate broken REGR. vs. 112276
 build-arm64   2 hosts-allocate broken REGR. vs. 112276

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 build-arm64   3 capture-logs  broken blocked in 112276
 build-arm64-xsm   3 capture-logs  broken blocked in 112276
 build-arm64-pvops 3 capture-logs  broken blocked in 112276
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 112276
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 112276
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 112276
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  772a6e36a5cbed4992315c4a8325947020f7ec70
baseline version:
 libvirt  f7237d63e8f02f3689f9b63b413fae7d4221faa9

Last test of basis   112276  2017-07-25 04:21:09 Z6 days
Failing since112310  2017-07-26 04:21:38 Z5 days6 attempts
Testing same since   112370  2017-07-29 04:23:27 Z2 days3 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Daniel P. Berrange 
  Erik Skultety 
  John Ferlan 
  JÃ¡n Tomko 
  Martin Kletzander 
  Michal Privoznik 
  Nitesh Konkar 
  Nitesh Konkar 
  Pavel Hrdina 
  Peter Krempa 
  Scott Garfinkle 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  broken  
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  broken  
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  blocked 
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopsbroken  
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-arm64-arm64-libvirt-xsm blocked 
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-arm64-arm64-libvirt blocked 
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair pass
 test-arm64-arm64-libvirt-qcow2   blocked 
 test-armhf-armhf-libvirt-raw pass
 test-amd64-amd64-libvirt-vhd

Re: [Xen-devel] [PATCH] xen:arm earlyprintk configuration for Hikey 960 boards

2017-07-31 Thread Julien Grall




On 31/07/17 17:11, Konrad Rzeszutek Wilk wrote:

On Thu, Jul 27, 2017 at 10:52:40AM +0100, Julien Grall wrote:

Hi Konrad,

On 27/07/17 02:18, Konrad Rzeszutek Wilk wrote:

On Wed, Jul 26, 2017 at 05:59:15PM +0100, Julien Grall wrote:

Hi Konrad,

On 26/07/17 17:54, Konrad Rzeszutek Wilk wrote:

Introduce an earlyprintk configuration of Hikey 960 boards.


..snip..


Would it be possible to update the wiki page on the hikey [1] with your
latest finding?


I added a whole new web-page:
https://wiki.xenproject.org/wiki/HiKey960

As the HiKey != HiKey960


Oh :). Fine then. Would you be up to do testing on this platform during RC?


Yes.


Thank you!



If you can you add your name in [1]. This would help user to know the board
has been tested with latest release.



[Also updated the landing page 
https://wiki.xenproject.org/wiki/Xen_ARM_with_Virtualization_Extensions
to point to this new one]



Cheers,

[1] https://wiki.xenproject.org/wiki/Xen_ARM_Manual_Smoke_Test/Results

--
Julien Grall


--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [XTF PATCH] build: append -fno-pic to CFLAGS

2017-07-31 Thread Wei Liu

It appears that Stretch's gcc has this on by default, which causes the
generating of several get_pc_thunk's, which breaks xsa-192 test.

Signed-off-by: Wei Liu 
---
 build/common.mk | 1 +
 1 file changed, 1 insertion(+)

diff --git a/build/common.mk b/build/common.mk
index f1de800..4ce6abf 100644
--- a/build/common.mk
+++ b/build/common.mk
@@ -28,6 +28,7 @@ COMMON_CFLAGS += -fno-common -fno-asynchronous-unwind-tables 
-fno-strict-aliasin
 COMMON_CFLAGS += -fno-stack-protector -ffreestanding
 COMMON_CFLAGS += -mno-red-zone -mno-sse
 COMMON_CFLAGS += -Wno-unused-parameter -Winline
+COMMON_CFLAGS += -fno-pic
 
 COMMON_AFLAGS-x86_32 := -m32
 COMMON_AFLAGS-x86_64 := -m64
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [XTF PATCH] build: append -fno-pic to CFLAGS

2017-07-31 Thread Andrew Cooper

On 31/07/17 18:20, Wei Liu wrote:
> It appears that Stretch's gcc has this on by default, which causes the
> generating of several get_pc_thunk's, which breaks xsa-192 test.
>
> Signed-off-by: Wei Liu 

Reviewed and committed.  Thanks.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [linux-linus test] 112390: regressions - trouble: blocked/broken/fail/pass

2017-07-31 Thread osstest service owner

flight 112390 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112390/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64   2 hosts-allocate broken REGR. vs. 110515
 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 110515
 build-arm64-xsm   2 hosts-allocate broken REGR. vs. 110515
 test-amd64-amd64-examine  7 reboot   fail REGR. vs. 110515
 test-amd64-amd64-libvirt-pair 10 xen-boot/src_host   fail REGR. vs. 110515
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 
110515
 test-amd64-amd64-rumprun-amd64  7 xen-boot   fail REGR. vs. 110515
 test-amd64-amd64-libvirt-pair 11 xen-boot/dst_host   fail REGR. vs. 110515
 test-amd64-amd64-qemuu-nested-intel  7 xen-boot  fail REGR. vs. 110515
 test-amd64-amd64-xl-qemut-debianhvm-amd64  7 xen-bootfail REGR. vs. 110515
 test-amd64-amd64-xl-pvh-intel  7 xen-bootfail REGR. vs. 110515
 test-amd64-amd64-pygrub   7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. 
vs. 110515
 test-amd64-amd64-libvirt  7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-xl-qcow2 7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-i386-pvgrub  7 xen-boot fail REGR. vs. 110515
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 
110515
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 
110515
 test-amd64-amd64-amd64-pvgrub  7 xen-bootfail REGR. vs. 110515
 test-amd64-amd64-libvirt-vhd  7 xen-boot fail REGR. vs. 110515

Tests which did not succeed, but are not blocking:
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64-xsm   3 capture-logs  broken blocked in 110515
 build-arm64-pvops 3 capture-logs  broken blocked in 110515
 build-arm64   3 capture-logs  broken blocked in 110515
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 110515
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeatfail  like 110515
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 110515
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 110515
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 110515
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 110515
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 110515
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore   fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore   fail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armh

[Xen-devel] [PATCH v1 1/1] xen/arm: Disable PCIe on the ZynqMP

2017-07-31 Thread Edgar E. Iglesias

From: "Edgar E. Iglesias" 

Disable PCIe on the ZynqMP. Xen does not yet know how to map the
controller and dom0 fails to boot with the node enabled.

Signed-off-by: Edgar E. Iglesias 
---
 xen/arch/arm/platforms/xilinx-zynqmp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/arm/platforms/xilinx-zynqmp.c 
b/xen/arch/arm/platforms/xilinx-zynqmp.c
index 2adee91..822287b 100644
--- a/xen/arch/arm/platforms/xilinx-zynqmp.c
+++ b/xen/arch/arm/platforms/xilinx-zynqmp.c
@@ -29,6 +29,7 @@ static const struct dt_device_match zynqmp_blacklist_dev[] 
__initconst =
 {
 /* Power management is not yet supported.  */
 DT_MATCH_COMPATIBLE("xlnx,zynqmp-pm"),
+DT_MATCH_COMPATIBLE("xlnx,nwl-pcie-2.11"),
 { /* sentinel */ },
 };
 
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [PATCH v1 0/1] xen/arm: zynqmp: Disable PCIe

2017-07-31 Thread Edgar E. Iglesias

From: "Edgar E. Iglesias" 

Hi,

We're seeing panics in dom0 with PCIe enabled due to what seems
to be wrongly created mappings by Xen. With older kernels we
didn't see the panics but PCIe wasn't functional in dom0.

This disables the PCIe nodes on the ZynqMP until Xen/ARM gets
more PCIe support.

Can we get this into stable-4.8 and 4.9?

Cheers,
Edgar

Edgar E. Iglesias (1):
  xen/arm: Disable PCIe on the ZynqMP

 xen/arch/arm/platforms/xilinx-zynqmp.c | 1 +
 1 file changed, 1 insertion(+)

-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] DESIGN v2: CPUID part 3

2017-07-31 Thread Konrad Rzeszutek Wilk

On Wed, Jul 05, 2017 at 02:22:00PM +0100, Joao Martins wrote:
> On 07/05/2017 12:16 PM, Andrew Cooper wrote:
> > On 05/07/17 10:46, Joao Martins wrote:
> >> Hey Andrew,
> >>
> >> On 07/04/2017 03:55 PM, Andrew Cooper wrote:
> >>> Presented herewith is the a plan for the final part of CPUID work, which
> >>> primarily covers better Xen/Toolstack interaction for configuring the 
> >>> guests
> >>> CPUID policy.
> >>>
> >> Really nice write up, a few comments below.
> >>
> >>> A PDF version of this document is available from:
> >>>
> >>> http://xenbits.xen.org/people/andrewcoop/cpuid-part-3-rev2.pdf
> >>>
> >>> Changes from v1:
> >>>  * Clarification of the interaction of emulated features
> >>>  * More information about the difference between max and default 
> >>> featuresets.
> >>>
> >>> ~Andrew
> >>>
> >>> -8<-
> >>> % CPUID Handling (part 3)
> >>> % Revision 2
> >>>
> 
> [snip]
> 
> >>> # Proposal
> >>>
> >>> First and foremost, split the current **max\_policy** notion into separate
> >>> **max** and **default** policies.  This allows for the provision of 
> >>> features
> >>> which are unused by default, but may be opted in to, both at the 
> >>> hypervisor
> >>> level and the toolstack level.
> >>>
> >>> At the hypervisor level, **max** constitutes all the features Xen can use 
> >>> on
> >>> the current hardware, while **default** is the subset thereof which are
> >>> supported features, the features which the user has explicitly opted in 
> >>> to,
> >>> and excluding any features the user has explicitly opted out of.
> >>>
> >>> A new `cpuid=` command line option shall be introduced, whose internals 
> >>> are
> >>> generated automatically from the featureset ABI.  This means that all 
> >>> features
> >>> added to `include/public/arch-x86/cpufeatureset.h` automatically gain 
> >>> command
> >>> line control.  (RFC: The same top level option can probably be used for
> >>> non-feature CPUID data control, although I can't currently think of any 
> >>> cases
> >>> where this would be used Also find a sensible way to express 'available 
> >>> but
> >>> not to be used by Xen', as per the current `smep` and `smap` options.)
> >>>
> >>>
> >>> At the guest level, the **max** policy is conceptually unchanged.  It
> >>> constitutes all the features Xen is willing to offer to each type of 
> >>> guest on
> >>> the current hardware (including emulated features).  However, it shall 
> >>> instead
> >>> be derived from Xen's **default** host policy.  This is to ensure that
> >>> experimental hypervisor features must be opted in to at the Xen level 
> >>> before
> >>> they can be opted in to at the toolstack level.
> >>>
> >>> The guests **default** policy is then derived from its **max**.  This is
> >>> because there are some features which should always be explicitly opted 
> >>> in to
> >>> by the toolstack, such as emulated features which come with a security
> >>> trade-off, or for non-architectural features which may differ in
> >>> implementation in heterogeneous environments.
> >>>
> >>> All global policies (Xen and guest, max and default) shall be made 
> >>> available
> >>> to the toolstack, in a manner similar to the existing
> >>> _XEN\_SYSCTL\_get\_cpu\_featureset_ mechanism.  This allows decisions to 
> >>> be
> >>> taken which include all CPUID data, not just the feature bitmaps.
> >>>
> >>> New _XEN\_DOMCTL\_{get,set}\_cpuid\_policy_ hypercalls will be introduced,
> >>> which allows the toolstack to query and set the cpuid policy for a 
> >>> specific
> >>> domain.  It shall supersede _XEN\_DOMCTL\_set\_cpuid_, and shall fail if 
> >>> Xen
> >>> is unhappy with any aspect of the policy during auditing.  This provides
> >>> feedback to the user that a chosen combination will not work, rather than 
> >>> the
> >>> guest booting in an unexpected state.
> >>>
> >>> When a domain is initially created, the appropriate guests **default** 
> >>> policy
> >>> is duplicated for use.  When auditing, Xen shall audit the toolstacks
> >>> requested policy against the guests **max** policy.  This allows 
> >>> experimental
> >>> features or non-migration-safe features to be opted in to, without those
> >>> features being imposed upon all guests automatically.
> >>>
> >>> A guests CPUID policy shall be immutable after construction.  This better
> >>> matches real hardware, and simplifies the logic in Xen to translate policy
> >>> alterations into configuration changes.
> >>>
> >> This appears to be a suitable abstraction even for higher level toolstacks
> >> (libxl). At least I can imagine libvirt fetching the PV/HVM max policy, and
> >> compare them between different servers when user computes the guest cpu 
> >> config
> >> (the normalized one) and use the common denominator as the guest policy.
> >> Probably higher level toolstack could even use these said policies 
> >> constructs
> >> and built the idea of models such that the user could easily choose one 
> >> for a
> >> pool of hosts with different

[Xen-devel] [PATCH] x86/apic/x2apic: Share IRQ vector between cluster members only when no cpumask is specified

2017-07-31 Thread Boris Ostrovsky

We have limited number (slightly under NR_DYNAMIC_VECTORS=192) of IRQ
vectors that are available to each processor. Currently, when x2apic
cluster mode is used (which is default), each vector is shared among
all processors in the cluster. With many IRQs (as is the case on systems
with multiple SR-IOV cards) and few clusters (e.g. single socket)
there is a good chance that we will run out of vectors.

This patch tries to decrease vector sharing between processors by
assigning vector to a single processor if the assignment request (via
__assign_irq_vector()) comes without explicitly specifying which
processors are expected to share the interrupt. This typically happens
during boot time (or possibly PCI hotplug) when create_irq(NUMA_NO_NODE)
is called. When __assign_irq_vector() is called from
set_desc_affinity() which provides sharing mask, vector sharing will
continue to be performed, as before.

This patch to some extent mirrors Linux commit d872818dbbee
("x86/apic/x2apic: Use multiple cluster members for the irq destination
only with the explicit affinity").

Note that this change still does not guarantee that we never run out of
vectors. For example, on a single core system we will be effectively
back to the single cluster/socket case of original code.

Signed-off-by: Boris Ostrovsky 
---
 xen/arch/x86/acpi/boot.c | 2 +-
 xen/arch/x86/apic.c  | 2 +-
 xen/arch/x86/genapic/delivery.c  | 4 ++--
 xen/arch/x86/genapic/x2apic.c| 9 +++--
 xen/arch/x86/io_apic.c   | 8 
 xen/arch/x86/irq.c   | 4 ++--
 xen/arch/x86/mpparse.c   | 2 +-
 xen/arch/x86/msi.c   | 2 +-
 xen/include/asm-x86/genapic.h| 9 ++---
 xen/include/asm-x86/mach-generic/mach_apic.h | 9 +
 10 files changed, 30 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c
index 8e6c96d..c16b14a 100644
--- a/xen/arch/x86/acpi/boot.c
+++ b/xen/arch/x86/acpi/boot.c
@@ -645,7 +645,7 @@ static void __init acpi_process_madt(void)
acpi_ioapic = true;
 
smp_found_config = true;
-   clustered_apic_check();
+   CLUSTERED_APIC_CHECK();
}
}
if (error == -EINVAL) {
diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
index 851a6cc..a0e1798 100644
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -544,7 +544,7 @@ void setup_local_APIC(void)
  * an APIC.  See e.g. "AP-388 82489DX User's Manual" (Intel
  * document number 292116).  So here it goes...
  */
-init_apic_ldr();
+INIT_APIC_LDR();
 
 /*
  * Set Task Priority to reject any interrupts below FIRST_DYNAMIC_VECTOR.
diff --git a/xen/arch/x86/genapic/delivery.c b/xen/arch/x86/genapic/delivery.c
index ced92a1..d71c01c 100644
--- a/xen/arch/x86/genapic/delivery.c
+++ b/xen/arch/x86/genapic/delivery.c
@@ -30,7 +30,7 @@ void __init clustered_apic_check_flat(void)
printk("Enabling APIC mode:  Flat.  Using %d I/O APICs\n", nr_ioapics);
 }
 
-const cpumask_t *vector_allocation_cpumask_flat(int cpu)
+const cpumask_t *vector_allocation_cpumask_flat(int cpu, const cpumask_t 
*cpumask)
 {
return &cpu_online_map;
 } 
@@ -58,7 +58,7 @@ void __init clustered_apic_check_phys(void)
printk("Enabling APIC mode:  Phys.  Using %d I/O APICs\n", nr_ioapics);
 }
 
-const cpumask_t *vector_allocation_cpumask_phys(int cpu)
+const cpumask_t *vector_allocation_cpumask_phys(int cpu, const cpumask_t 
*cpumask)
 {
return cpumask_of(cpu);
 }
diff --git a/xen/arch/x86/genapic/x2apic.c b/xen/arch/x86/genapic/x2apic.c
index 5fffb31..c0b97c9 100644
--- a/xen/arch/x86/genapic/x2apic.c
+++ b/xen/arch/x86/genapic/x2apic.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static DEFINE_PER_CPU_READ_MOSTLY(u32, cpu_2_logical_apicid);
 static DEFINE_PER_CPU_READ_MOSTLY(cpumask_t *, cluster_cpus);
@@ -72,9 +73,13 @@ static void __init clustered_apic_check_x2apic(void)
 {
 }
 
-static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu)
+static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu,
+const cpumask_t *cpumask)
 {
-return per_cpu(cluster_cpus, cpu);
+if ( cpumask != TARGET_CPUS )
+return per_cpu(cluster_cpus, cpu);
+else
+return cpumask_of(cpu);
 }
 
 static unsigned int cpu_mask_to_apicid_x2apic_cluster(const cpumask_t *cpumask)
diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index 2838f6b..3eefcfc 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1038,7 +1038,7 @@ static void __init setup_IO_APIC_irqs(void)
 disable_8259A_irq(irq_to_desc(irq));
 
 desc = irq_to_desc(irq);
-SET_DEST(entry, logical, cpu_mask_to_apicid(TARGET_CPUS))

[Xen-devel] [xen-unstable test] 112391: regressions - trouble: blocked/broken/fail/pass

2017-07-31 Thread osstest service owner

flight 112391 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112391/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-xsm   2 hosts-allocate broken REGR. vs. 112286
 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 112286
 build-arm64   2 hosts-allocate broken REGR. vs. 112286
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 
112286

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-rumprun-amd64 17 rumprun-demo-xenstorels/xenstorels.repeat 
fail pass in 112384
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail pass in 
112384
 test-armhf-armhf-xl-rtds 12 guest-startfail pass in 112384

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 build-arm64-xsm   3 capture-logs  broken blocked in 112286
 build-arm64-pvops 3 capture-logs  broken blocked in 112286
 build-arm64   3 capture-logs  broken blocked in 112286
 test-amd64-amd64-xl-qemut-win7-amd64 18 guest-start/win.repeat fail in 112384 
blocked in 112286
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail in 112384 blocked in 
112286
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail in 112384 like 
112274
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop   fail in 112384 like 112286
 test-armhf-armhf-xl-rtds13 migrate-support-check fail in 112384 never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail in 112384 never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 112286
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail like 112286
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 112286
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 112286
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 112286
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 112286
 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore   fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore   fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pas

Re: [Xen-devel] PV drivers and zero copying

2017-07-31 Thread Stefano Stabellini

On Mon, 31 Jul 2017, Oleksandr Andrushchenko wrote:
> 3 Sharing with page exchange (XENMEM_exchange)
> ==
> 
> This API was pointed to me by Stefano Stabellini as one of the possible ways
> to
> achieve zero copying and share physically contiguous buffers. It is used by
> x86
> SWIOTLB code (xen_create_contiguous_region, [5]), but as per my understanding
> this API cannot be used on ARM as of now [6].  Conclusion: not an option for
> ARM
> at the moment

Let me elaborate on this. The purpose of XENMEM_exchange is to exchange
a number of memory pages with an equal number of contiguous memory
pages, possibly even under 4G. The original purpose of the hypercall was
to get DMA-able memory.

So far, it has only been used by Dom0 on x86. Dom0 on ARM doesn't need
it because it is mapped 1:1 by default and device assignment is not
allowed without an IOMMU. However it should work on ARM too, as the
implementation is all common code in Xen. Also, looking at the
implementation (xen/common/memory.c:memory_exchange) it would seem that
it can be called from a DomU too (but I have never tried).

Thus, if you have a platform without IOMMU and you disabled the IOMMU
checks in Xen to assign a device to a DomU anyway, then you could use
this hypercall from DomU to get memory under 4G to be used for DMA with
this device.

As far as I can tell XENMEM_exchange could help in the design of
zero-copy PV protocols only to address this specific use case:

- you have a frontend in DomU and a backend in Dom0
- pages shared by DomU get mapped in Dom0 and potentially used for DMA
- the device has under 4G DMA restrictions

Normally Dom0 maps a DomU page, then at the time of using the mapped
page for DMA it checks whether it is suitable for DMA (under 4G if the
device requires so). If it is not, Dom0 uses a bounce buffer borrowed
from the swiotlb. Obviously this introduces one or two memcpys.

Instead, if DomU calls XENMEM_exchange to get memory under 4G, and
shares one of the pages with Dom0 via PV frontends, then Dom0 wouldn't
have to use a bounce buffer to do DMA to this page.

Does it make sense?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

1 2 >

1 - 100 of 151 matches

Mail list logo