Re: [Xen-devel] [PATCH v2 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server

2016-04-07 Thread Yu, Zhang

Thanks for your advice and good questions. :)

On 4/7/2016 1:13 AM, George Dunlap wrote:

On Thu, Mar 31, 2016 at 11:53 AM, Yu Zhang  wrote:

A new HVMOP - HVMOP_map_mem_type_to_ioreq_server, is added to
let one ioreq server claim/disclaim its responsibility for the
handling of guest pages with p2m type p2m_ioreq_server. Users
of this HVMOP can specify whether the p2m_ioreq_server is supposed
to handle write accesses or read ones or both in a parameter named
flags. For now, we only support one ioreq server for this p2m type,
so once an ioreq server has claimed its ownership, subsequent calls
of the HVMOP_map_mem_type_to_ioreq_server will fail. Users can also
disclaim the ownership of guest ram pages with this p2m type, by
triggering this new HVMOP, with ioreq server id set to the current
owner's and flags parameter set to 0.

For now, both HVMOP_map_mem_type_to_ioreq_server and p2m_ioreq_server
are only supported for HVMs with HAP enabled.

Note that the flags parameter (if not 0) of this HVMOP only indicates
which kinds of memory accesses are to be forwarded to an ioreq server.
It affects the access rights of guest ram pages, but the two are not
the same. Due to hardware limitations, if only write operations are
to be forwarded, reads will be performed at full speed, with
no hypervisor intervention. But if reads are to be forwarded to
an ioreq server, writes will inevitably be trapped into the hypervisor,
which means a significant performance impact.

Also note that this HVMOP_map_mem_type_to_ioreq_server will not
change the p2m type of any guest ram page, until HVMOP_set_mem_type
is triggered. So normally the steps should be the backend driver
first claims its ownership of guest ram pages with p2m_ioreq_server
type, and then sets the memory type to p2m_ioreq_server for specified
guest ram pages.

Signed-off-by: Paul Durrant 
Signed-off-by: Yu Zhang 
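
A minimal sketch of the claim/disclaim semantics described in the commit message above, assuming a single owner slot per p2m type. The struct, flag names and error codes are hypothetical and are not taken from the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values and owner marker, chosen for this sketch only. */
#define ACCESS_READ   (1u << 0)
#define ACCESS_WRITE  (1u << 1)
#define NO_OWNER      (-1)

struct mem_type_owner {
    int server_id;   /* NO_OWNER while nobody has claimed the type */
    uint32_t flags;  /* which accesses are forwarded to the owner */
};

/*
 * flags != 0: claim the type for server_id; fails if it is already owned.
 * flags == 0: disclaim; only the current owner may do so.
 */
static int map_mem_type(struct mem_type_owner *o, int server_id, uint32_t flags)
{
    assert(o != NULL && server_id >= 0);

    if (flags & ~(ACCESS_READ | ACCESS_WRITE))
        return -EINVAL;

    if (flags != 0) {
        if (o->server_id != NO_OWNER)
            return -EBUSY;          /* only one ioreq server per type */
        o->server_id = server_id;
        o->flags = flags;
        return 0;
    }

    if (o->server_id != server_id)
        return -EINVAL;             /* caller is not the current owner */
    o->server_id = NO_OWNER;
    o->flags = 0;
    return 0;
}
```

With an owner slot initialised to NO_OWNER, a first claim succeeds, a second claim by another server fails, and a call with flags set to 0 by the owner releases the slot again.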


And again, review of this patch was significantly delayed because you
didn't provide any description of the changes you made between v1 and
v2 or why.


Sorry about the inconvenience, will change in next version.



Overall looks good.  Just a few questions...


+static int hvmop_map_mem_type_to_ioreq_server(
+XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
+{
+xen_hvm_map_mem_type_to_ioreq_server_t op;
+struct domain *d;
+int rc;
+
+if ( copy_from_guest(&op, uop, 1) )
+return -EFAULT;
+
+rc = rcu_lock_remote_domain_by_id(op.domid, &d);
+if ( rc != 0 )
+return rc;
+
+rc = -EINVAL;
+if ( !is_hvm_domain(d) )
+goto out;
+
+/* For now, only support for HAP enabled hvm */
+if ( !hap_enabled(d) )
+goto out;


So before I suggested that this be restricted to HAP because you were
using p2m_memory_type_changed(), which was only implemented on EPT.
But since then you've switched that code to use
p2m_change_entry_type_global() instead, which is implemented by both;
and you implement the type in p2m_type_to_flags().  Is there any
reason to keep this restriction?



Yes. And this is a change which was not explained clearly. Sorry.

Reason I chose p2m_change_entry_type_global() instead:
p2m_memory_type_changed() will only trigger the resynchronization of
the EPT memory types in resolve_misconfig(). Yet it is the p2m type we
want to be recalculated, hence p2m_change_entry_type_global().

Reason I am restricting the code to HAP mode:
Well, I guess p2m_change_entry_type_global() was only called by HAP code
like hap_[en|dis]able_log_dirty() etc., which were registered during
hap_domain_init(). As for shadow mode, it uses sh_[en|dis]able_log_dirty(),
which do not use p2m_change_entry_type_global().

Since my intention is to resync the outdated p2m_ioreq_server pages
back to p2m_ram_rw, calling p2m_change_entry_type_global() directly should
be much more convenient (and correct) for me than inventing another
wrapper to cover both HAP and shadow mode (which xengt does not use
for now).
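
A toy model of the lazy resync idea discussed here, assuming a generation counter per table. This is only an illustration of "mark now, recalculate on next access"; it is not how Xen implements p2m_change_entry_type_global(), and all names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

enum p2m_type { RAM_RW, IOREQ_SERVER };

#define NR_ENTRIES 8

struct p2m_model {
    enum p2m_type type[NR_ENTRIES];
    unsigned int seen[NR_ENTRIES];  /* generation at which the entry was last valid */
    unsigned int generation;        /* bumped by each global type change */
};

/* Set one entry explicitly; it is valid for the current generation. */
static void set_type(struct p2m_model *p, size_t gfn, enum p2m_type t)
{
    assert(gfn < NR_ENTRIES);
    p->type[gfn] = t;
    p->seen[gfn] = p->generation;
}

/* O(1): only marks the table as stale; no entry is touched here. */
static void change_type_global(struct p2m_model *p)
{
    p->generation++;
}

/* Stale IOREQ_SERVER entries are reset to RAM_RW on their next lookup. */
static enum p2m_type lookup(struct p2m_model *p, size_t gfn)
{
    assert(gfn < NR_ENTRIES);
    if (p->seen[gfn] != p->generation) {
        if (p->type[gfn] == IOREQ_SERVER)
            p->type[gfn] = RAM_RW;
        p->seen[gfn] = p->generation;
    }
    return p->type[gfn];
}
```

The point of the model is that the global call returns immediately and the stored type of an entry only changes when that entry is next looked up.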



+/*
+ * Each time we map/unmap an ioreq server to/from p2m_ioreq_server,
+ * we mark the p2m table to be recalculated, so that gfns which were
+ * previously marked with p2m_ioreq_server can be resynced.
+ */
+p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);


This comment doesn't seem to be accurate (or if it is it's a bit
confusing).  Would it be more accurate to say something like the
following:

"Each time we map/unmap an ioreq server to/from p2m_ioreq_server, we
reset all memory currently marked p2m_ioreq_server to p2m_ram_rw."


Well, I agree this comment is not quite accurate. As you said in your
comment, the purpose of calling p2m_change_entry_type_global() here is to
"reset all memory currently marked p2m_ioreq_server to p2m_ram_rw". But
the recalculation is asynchronous. So how about:

"Each time we map/unmap an ioreq server to/from p2m_ioreq_server, we
mark the p2m table to be recalculated, so all memory currently marked
p2m_ioreq_server can be reset

Re: [Xen-devel] [PATCH v9 2/3] VT-d: wrap a _sync version for all VT-d flush interfaces

2016-04-07 Thread Xu, Quan
On April 05, 2016 5:35pm, Jan Beulich  wrote:
> >>> On 01.04.16 at 16:47,  wrote:
> > The dev_invalidate_iotlb() scans ats_devices list to flush ATS
> > devices, and the invalidate_sync() is put after dev_invalidate_iotlb()
> > to synchronize with hardware for flush status. If we assign multiple
> > ATS devices to a domain, the flush status is about all these multiple
> > ATS devices. Once flush timeout expires, we couldn't find out which
> > one is the buggy ATS device.
> 
> Is that true? Or is that just a limitation of our implementation?
> 

IMO, both.
I hope the VT-d maintainers can help me double-check it.

> > Then, The invalidate_sync() variant (We need to pass down the device's
> > SBDF to hide the ATS device) is put within dev_invalidate_iotlb() to
> > synchronize for the flush status one by one.
> 
> I don't think this is stating current state of things. So ...
> 
> > If flush timeout expires,
> > we could find out the buggy ATS device and hide it. However, for other
> > VT-d flush interfaces, the invalidate_sync() is still put after at present.
> > This is inconsistent.
> 
> ... taken together, what is inconsistent here needs to be described better,
> as well as what it is you do to eliminate the inconsistency. Note that you
> should not refer back to (or imply knowledge of) the previous discussion on
> the earlier version. If any of that discussion is useful here, you need to
> summarize it instead.
> 

I will continue to summarize it and send out later.

> > --- a/xen/drivers/passthrough/vtd/extern.h
> > +++ b/xen/drivers/passthrough/vtd/extern.h
> > @@ -61,6 +61,8 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
> >
> >  int qinval_device_iotlb(struct iommu *iommu,
> >  u32 max_invs_pend, u16 sid, u16 size, u64 addr);
> > +int qinval_device_iotlb_sync(struct iommu *iommu, u32 max_invs_pend,
> > + u16 sid, u16 size, u64 addr);
> 
> So are then both functions needed to be externally accessible?
> That would seem contrary to the last paragraph of the patch description.
> 

I was aware of this. I'd better make qinval_device_iotlb() static in the
next version, v10.

[...]

> > +static int queue_invalidate_context_sync(struct iommu *iommu,
> 
> __must_check?
> 

Agreed.

[...]

> > +{
> > +queue_invalidate_context(iommu, did, source_id,
> > + function_mask, granu);
> > +
> > +return invalidate_sync(iommu);
> > +}
> 
> Further down you replace the only call to queue_invalidate_context() - why
> keep both functions instead of just making the existing one do the sync?
> (That would then likely also apply to qinval_device_iotlb() and others below.)
> 

It is optional.
I think:
1. In the long term, we may not need a _sync version.
2. At least, the current wrapper looks good to me, e.g. queue_invalidate_context()
is for the Context-cache Invalidate Descriptor, and invalidate_sync() is for
the Invalidation Wait Descriptor. It is much clearer.
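
A minimal sketch of the wrapper pattern under discussion: one function queues a descriptor, another waits for completion, and the _sync variant composes the two. The queue struct and function names are hypothetical stand-ins, not the VT-d code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an invalidation queue. */
struct inv_queue {
    int queued;      /* descriptors submitted but not yet waited for */
    int wait_error;  /* status the next wait will report (0 = success) */
};

/* Queue a context-cache invalidation descriptor; does not wait. */
static void queue_context_inv(struct inv_queue *q)
{
    assert(q != NULL);
    q->queued++;
}

/* Queue a wait descriptor and report whether the hardware completed it. */
static int inv_sync(struct inv_queue *q)
{
    assert(q != NULL);
    if (q->wait_error)
        return q->wait_error;
    q->queued = 0;
    return 0;
}

/* The _sync variant: submit, then wait, returning the wait status. */
static int queue_context_inv_sync(struct inv_queue *q)
{
    queue_context_inv(q);
    return inv_sync(q);
}
```

Folding the wait into the existing function, as the review suggests, would remove the third function at the cost of no longer being able to queue without waiting.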

> > @@ -338,23 +365,24 @@ static int flush_iotlb_qi(
> >
> >  if ( qi_ctrl->qinval_maddr != 0 )
> >  {
> > -int rc;
> > -
> >  /* use queued invalidation */
> >  if (cap_write_drain(iommu->cap))
> >  dw = 1;
> >  if (cap_read_drain(iommu->cap))
> >  dr = 1;
> >  /* Need to conside the ih bit later */
> > -queue_invalidate_iotlb(iommu,
> > -   type >> DMA_TLB_FLUSH_GRANU_OFFSET, dr,
> > -   dw, did, size_order, 0, addr);
> > +ret = queue_invalidate_iotlb_sync(iommu,
> > +  type >> DMA_TLB_FLUSH_GRANU_OFFSET, dr, dw, did,
> > +  size_order, 0, addr);
> > +
> > +/* TODO: Timeout error handling to be added later */
> 
> As of today queue_invalidate_wait() panics, so this comment is not very
> helpful, as there is no timeout that could possibly be detected here.
> 

Okay, I will drop it.


> > +if ( ret )
> > +return ret;
> > +
> >  if ( flush_dev_iotlb )
> >  ret = dev_invalidate_iotlb(iommu, did, addr, size_order, type);
> > -rc = invalidate_sync(iommu);
> > -if ( !ret )
> > -ret = rc;
> >  }
> 
> I think leaving the existing logic as is would be better - best effort
> invalidation even when an error has occurred.
> 
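
The "best effort" logic referred to above (run every step, remember the first error) can be sketched as follows. The step functions are hypothetical placeholders used only to exercise the pattern:

```c
#include <assert.h>

static int calls;  /* counts how many steps actually ran */

/* Hypothetical steps: one succeeds, two fail with different codes. */
static int step_ok(void)    { calls++; return 0; }
static int step_fail(void)  { calls++; return -5; }
static int step_fail2(void) { calls++; return -7; }

/* Run every step even if an earlier one failed; report the first error. */
static int flush_best_effort(int (*const steps[])(void), int nr_steps)
{
    int ret = 0;

    assert(nr_steps >= 0);
    for (int i = 0; i < nr_steps; i++) {
        int rc = steps[i]();
        if (!ret)
            ret = rc;   /* same shape as "if ( !ret ) ret = rc;" */
    }
    return ret;
}
```

By contrast, returning as soon as the first step fails skips the remaining invalidations entirely.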

I have an open question:
As the VT-d spec (Queued Invalidation Ordering Considerations) says,
 1. If the Fence (FN) flag is 1 in an inv_wait_dsc, hardware must execute
descriptors following the inv_wait_dsc only after the wait command is completed.
 2. When a Device-TLB invalidation timeout is detected, hardware must not
complete any pending inv_wait_dsc commands.
In the current code, the Fence (FN) flag is always 1.
If a Device-TLB invalidation timeout is detected, this additional inv_wait_dsc
is not completed.
__iiuc__,
the newly arriving descriptors in that queue _might_ not be executed any more,
waiting for this addition

Re: [Xen-devel] Running Xen on Nvidia Jetson-TK1

2016-04-07 Thread Dushyant Behl
Hello,

On Fri, Apr 1, 2016 at 3:34 PM, Julien Grall  wrote:
> Hello Dushyant,
>
> On 29/03/16 21:56, Dushyant Behl wrote:
>>
>> On Wed, Mar 30, 2016 at 12:31 AM, Julien Grall 
>> wrote:
>>>
>>> On 24/03/16 11:05, Dushyant Behl wrote:
>
>
>> (XEN) DOM0: [0.00] irq: no irq domain found for
>> /interrupt-controller !
>> (XEN) DOM0: [0.00] irq: no irq domain found for
>> /interrupt-controller !
>> (XEN) DOM0: [0.00] irq: no irq domain found for
>> /interrupt-controller !
>> (XEN) DOM0: [0.00] arch_timer: No interrupt available, giving up
>
>
> It looks to me like Xen is not recreating the device-tree correctly. I
> would look into the kernel to find what is expected.

This looks like a possible bug (or some missing feature) in Xen's
device tree creation which could take some time to handle, so if I
can be of any more help to you with this issue, please let me know.

[I've cc'ed Ian Campbell in this mail (Sorry for cc'ing you explicitly)]

Ian,

Actually, I want to run Xen on the Tegra Jetson board for a project
of mine, but currently Linux-4.1 is failing as dom0 because it's not
able to receive interrupts from the arch_timer.
This link contains the dom0 failure boot log -
http://lists.xenproject.org/archives/html/xen-devel/2016-03/msg03715.html

In your patch for *Hacky* support for Jetson-TK1 you said that you
were able to run guests on the Jetson-TK1 board with Xen. Can I know
which kernel version you used as dom0 (and possibly domU guests)?

Thanks and Regards,
Dushyant Behl

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [linux-3.16 test] 89242: regressions - FAIL

2016-04-07 Thread osstest service owner
flight 89242 linux-3.16 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/89242/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-libvirt-pair 21 guest-migrate/src_host/dst_host fail REGR. vs. 85048

Tests which are failing intermittently (not blocking):
 test-amd64-i386-pair          3 host-install/src_host(3)  broken in 88768 pass in 89242
 test-amd64-i386-xl            3 host-install(3)           broken in 88768 pass in 89242
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 15 guest-localmigrate/x10 fail in 88768 pass in 89242
 test-armhf-armhf-xl-xsm       9 debian-install            fail in 88768 pass in 89242
 test-armhf-armhf-xl-credit2  15 guest-start/debian.repeat fail pass in 88768

Regressions which are regarded as allowable (not blocking):
 build-i386-rumpuserxen                6 xen-build              fail  like 85048
 build-amd64-rumpuserxen               6 xen-build              fail  like 85048
 test-amd64-amd64-xl-credit2          17 guest-localmigrate/x10 fail  like 85048
 test-amd64-i386-xl-qemut-win7-amd64  16 guest-stop             fail  like 85048
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop             fail  like 85048
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop             fail  like 85048
 test-amd64-i386-xl-qemuu-win7-amd64  16 guest-stop             fail  like 85048
 test-armhf-armhf-xl-rtds             11 guest-start            fail  like 85048

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386    1 build-check(1)             blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)             blocked  n/a
 test-amd64-amd64-xl-pvh-intel      11 guest-start                fail  never pass
 test-amd64-amd64-xl-pvh-amd        11 guest-start                fail  never pass
 test-amd64-amd64-xl-multivcpu      17 guest-localmigrate/x10     fail  never pass
 test-armhf-armhf-libvirt           14 guest-saverestore          fail  never pass
 test-armhf-armhf-libvirt           12 migrate-support-check      fail  never pass
 test-armhf-armhf-libvirt-qcow2     11 migrate-support-check      fail  never pass
 test-armhf-armhf-libvirt-qcow2     13 guest-saverestore          fail  never pass
 test-amd64-i386-libvirt-xsm        12 migrate-support-check      fail  never pass
 test-amd64-amd64-xl-rtds           17 guest-localmigrate/x10     fail  never pass
 test-armhf-armhf-libvirt-raw       13 guest-saverestore          fail  never pass
 test-armhf-armhf-libvirt-raw       11 migrate-support-check      fail  never pass
 test-amd64-amd64-libvirt           12 migrate-support-check      fail  never pass
 test-armhf-armhf-libvirt-xsm       12 migrate-support-check      fail  never pass
 test-armhf-armhf-libvirt-xsm       14 guest-saverestore          fail  never pass
 test-amd64-amd64-qemuu-nested-amd  16 debian-hvm-install/l1/l2   fail  never pass
 test-amd64-amd64-libvirt-xsm       12 migrate-support-check      fail  never pass
 test-amd64-i386-libvirt            12 migrate-support-check      fail  never pass
 test-armhf-armhf-xl-cubietruck     12 migrate-support-check      fail  never pass
 test-armhf-armhf-xl-cubietruck     13 saverestore-support-check  fail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2        13 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-credit2        12 migrate-support-check      fail  never pass
 test-armhf-armhf-xl-arndale        12 migrate-support-check      fail  never pass
 test-armhf-armhf-xl-arndale        13 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu      13 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu      12 migrate-support-check      fail  never pass
 test-armhf-armhf-xl                12 migrate-support-check      fail  never pass
 test-armhf-armhf-xl                13 saverestore-support-check  fail  never pass
 test-amd64-amd64-libvirt-vhd       11 migrate-support-check      fail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-xsm            13 saverestore-support-check  fail  never pass
 test-armhf-armhf-xl-xsm            12 migrate-support-check      fail  never pass
 test-armhf-armhf-xl-vhd            11 migrate-support-check      fail  never pass
 test-armhf-armhf-xl-vhd            12 saverestore-support-check  fail  never pass

version targeted for testing:
 linux                3a96c6601b6fc47baa6d296f9111ba7be4dad6fc
baseline version:
 linux                7f2a8840d127c8d5c59a5d79235e1205aba2e102

Last test of basis    85048  2016-03-02 10:56:10 Z   35 days
Testing same since    87897  2016-03-29 14:28:05 Z    8 days    9 attempts


People who touched revisions under test:
  Akshay Bhat 
  Al Viro 
  Alex Deucher 
  Alex Williamson 
  Alexander Deucher 
  Amir Vadai 
  Andreas Schwab 
  Andrey Skvortsov 
  Andy Lutomirski 
  An

Re: [Xen-devel] [PATCH 2/2] libxl: Do not leak data on error path from libxl__read_sysfs_file_contents

2016-04-07 Thread Chun Yan Liu


>>> On 4/4/2016 at 11:10 PM, in message
<1459782600-16073-2-git-send-email-ian.jack...@eu.citrix.com>, Ian Jackson
 wrote: 
> Bug introduced in bc023ecd 
> "libxl_utils: add internal function to read sysfs file contents" 
>  
> CID: 1358108 
> Signed-off-by: Ian Jackson  
> CC: cover...@xenproject.org 
> CC: Chunyan Liu  
> --- 
>  tools/libxl/libxl_utils.c |1 + 
>  1 file changed, 1 insertion(+) 
>  
> diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c 
> index ceb8825..bd58a52 100644 
> --- a/tools/libxl/libxl_utils.c 
> +++ b/tools/libxl/libxl_utils.c 
> @@ -466,6 +466,7 @@ int libxl__read_sysfs_file_contents(libxl__gc *gc, const char *filename,
>  e = errno; 
>  assert(e != ENOENT); 
>  if (f) fclose(f); 
> +free(data); 

'data' is malloced with 'gc', it'll be freed by GC_FREE. Do we need to free
it here?

Chunyan
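
To illustrate the concern raised here: if a buffer is tracked by a garbage-collecting context, it is released once when the context is torn down, so an additional explicit free() would release it twice. The tiny collector below is a hypothetical model and not libxl's gc; whether 'data' in the patched function is actually gc-tracked is exactly the open question in this mail:

```c
#include <assert.h>
#include <stdlib.h>

#define GC_MAX 16

/* Hypothetical allocation tracker, loosely in the spirit of a libxl gc. */
struct gc {
    void *ptrs[GC_MAX];
    int count;
};

/* Allocate and remember the pointer so the collector can free it later. */
static void *gc_malloc(struct gc *gc, size_t size)
{
    void *p;

    assert(gc != NULL && gc->count < GC_MAX);
    p = malloc(size);
    if (p)
        gc->ptrs[gc->count++] = p;
    return p;
}

/* Free everything that was tracked; returns how many buffers were freed. */
static int gc_free_all(struct gc *gc)
{
    int n = gc->count;

    for (int i = 0; i < n; i++)
        free(gc->ptrs[i]);
    gc->count = 0;
    return n;
}
```

In this model, code that obtained a pointer from gc_malloc() must not pass it to free() itself, because gc_free_all() will do so later.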

>  return e; 
>  } 
>   
 





Re: [Xen-devel] [PATCH 1/2] libxl: Set rc on failure of usbdev_busaddr_to_busid

2016-04-07 Thread Chun Yan Liu
Thanks, Ian!

>>> On 4/4/2016 at 11:09 PM, in message
<1459782600-16073-1-git-send-email-ian.jack...@eu.citrix.com>, Ian Jackson
 wrote: 
> We must set rc before using `goto out'. 
>  
> Bug introduced in bf7628f0 "libxl: add pvusb API". 
>  
> CID: 1358113 
> Signed-off-by: Ian Jackson  
> CC: cover...@xenproject.org 
> CC: Simon Cao  
> CC: George Dunlap  
> CC: Chunyan Liu  
> --- 
>  tools/libxl/libxl_pvusb.c |1 + 
>  1 file changed, 1 insertion(+) 
>  
> diff --git a/tools/libxl/libxl_pvusb.c b/tools/libxl/libxl_pvusb.c 
> index 5f92628..6f53317 100644 
> --- a/tools/libxl/libxl_pvusb.c 
> +++ b/tools/libxl/libxl_pvusb.c 
> @@ -905,6 +905,7 @@ static int libxl__device_usbdev_add_xenstore(libxl__gc *gc, uint32_t domid,
>  usbdev->u.hostdev.hostaddr); 
>  if (!busid) { 
>  LOG(DEBUG, "Fail to get busid of usb device"); 
> +rc = ERROR_FAIL; 
>  goto out; 
>  } 
>   
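
A minimal sketch of the bug class fixed by this patch: with a shared "out:" label, rc still holds 0 from an earlier successful step unless the failing branch sets it. The helper functions and the error value are placeholders for this sketch, not libxl code:

```c
#include <assert.h>
#include <stddef.h>

#define ERROR_FAIL (-3)   /* placeholder value for this sketch */

/* Hypothetical earlier step that succeeds and leaves rc == 0. */
static int prepare(void) { return 0; }

/* Hypothetical lookup: only bus 1, address 2 is known. */
static const char *lookup_busid(int bus, int addr)
{
    return (bus == 1 && addr == 2) ? "1-2" : NULL;
}

static int add_device(int bus, int addr, int *cleanups)
{
    int rc;
    const char *busid;

    assert(cleanups != NULL);

    rc = prepare();
    if (rc) goto out;

    busid = lookup_busid(bus, addr);
    if (!busid) {
        rc = ERROR_FAIL;   /* without this line, out: returns the stale 0 */
        goto out;
    }

    rc = 0;
out:
    (*cleanups)++;         /* shared cleanup runs on every path */
    return rc;
}
```

Calling add_device() with an unknown bus/address pair returns ERROR_FAIL only because the failing branch assigns rc before jumping.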
 





[Xen-devel] [PATCH] x86/emulate: Check current->arch.vm_event in hvmemul_virtual_to_linear()

2016-04-07 Thread Razvan Cojocaru
Theoretically it is possible for mem_access_emulate_each_rep to be
true even when current->arch.vm_event == NULL, so add an extra
check to hvmemul_virtual_to_linear().

Signed-off-by: Razvan Cojocaru 
---
 xen/arch/x86/hvm/emulate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index f5ab5bc..91413d2 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -514,7 +514,7 @@ static int hvmemul_virtual_to_linear(
  * vm_event being triggered for repeated writes to a whole page.
  */
 if ( unlikely(current->domain->arch.mem_access_emulate_each_rep) &&
- current->arch.vm_event->emulate_flags != 0 )
+ current->arch.vm_event && current->arch.vm_event->emulate_flags != 0 )
max_reps = 1;
 
 /*
-- 
1.9.1
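
The added check relies on the short-circuit evaluation of && in C: the pointer is tested before it is dereferenced. A self-contained sketch with hypothetical stand-in structs (not the Xen definitions):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-vCPU state touched by the patch. */
struct vm_event { unsigned int emulate_flags; };
struct vcpu_model { struct vm_event *vm_event; };

/* Returns 1 when repetitions should be limited to one per emulation. */
static int limit_reps(int emulate_each_rep, const struct vcpu_model *v)
{
    assert(v != NULL);
    /*
     * && evaluates left to right and stops at the first false operand,
     * so vm_event is only dereferenced once it is known to be non-NULL.
     */
    return emulate_each_rep && v->vm_event &&
           v->vm_event->emulate_flags != 0;
}
```

With vm_event set to NULL the function returns 0 instead of faulting, which is the behaviour the extra operand in the patch is meant to guarantee.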




Re: [Xen-devel] Redundant lstats in libxl_pvusb.c

2016-04-07 Thread Chun Yan Liu


>>> On 4/4/2016 at 11:07 PM, in message
<22274.33583.712655.413...@mariner.uk.xensource.com>, Ian Jackson
 wrote: 
> In libxl_usb.c, usbintf_get_drvpath calls stat(2) on the driver sysfs 
> path, and then realpath on the same path. 

It's true. This could be done by calling realpath only. Will correct.

>  
> And bind_usbintf calls stat(2) on the driver directory path, and then 
> open(2) on a file in that directory. 

It's not true. It calls stat(2) on a file in driver path (driver/interface),
and open(2) on another file in that driver path (driver/bind).

Chunyan
>  
> It seems to me that in both cases, libxl could simply directly access 
> the target object.  Ie, it could always call realpath, and always call 
> open.  Appropriate error handling would deal with the cases currently 
> dealt with by the stat. 
>  
> Am I wrong about this ? 
>  
> I'm prompted to look at this by Coverity, Coverity thinks that this 
> stat-then-realpath, or stat-then-open, might be a TOCTOU security 
> problem.  I think it's wrong, but it would be nice to tidy up the code 
> and eliminate these complaints. 
>  
> If I am right, I'd appreciate patch(es).  They should mention 
> CID: 1358112 
> CID: 1358111 
> for these two functions, respectively. 
>  
> Thanks, 
> Ian. 
>  
> > *** CID 1358112:  Security best practices violations  (TOCTOU) 
> > /tools/libxl/libxl_pvusb.c: 995 in usbintf_get_drvpath() 
> > 989 spath = GCSPRINTF(SYSFS_USB_DEV "/%s/driver", intf); 
> > 990  
> > 991 r = lstat(spath, &st); 
> > 992 if (r == 0) { 
> > 993 /* Find the canonical path to the driver. */ 
> > 994 dp = libxl__zalloc(gc, PATH_MAX); 
> > >>> CID 1358112:  Security best practices violations  (TOCTOU) 
> > >>> Calling function "realpath" that uses "spath" after a check function.
> > >>> This can cause a time-of-check, time-of-use race condition. 
> > 995 dp = realpath(spath, dp); 
> > 996 if (!dp) { 
> > 997 LOGE(ERROR, "get realpath failed: '%s'", spath); 
> > 998 return ERROR_FAIL; 
> > 999 } 
> > 1000 } else if (errno == ENOENT) { 
>  
> > *** CID 1358111:  Security best practices violations  (TOCTOU) 
> > /tools/libxl/libxl_pvusb.c: 1061 in bind_usbintf() 
> > 1055 return 0; 
> > 1056 if (r < 0 && errno != ENOENT) 
> > 1057 return ERROR_FAIL; 
> > 1058  
> > 1059 path = GCSPRINTF("%s/bind", drvpath); 
> > 1060  
> > >>> CID 1358111:  Security best practices violations  (TOCTOU) 
> > >>> Calling function "open" that uses "path" after a check function. 
> > >>> This can cause a time-of-check, time-of-use race condition. 
> > 1061 fd = open(path, O_WRONLY); 
> > 1062 if (fd < 0) { 
> > 1063 LOGE(ERROR, "open file failed: '%s'", path); 
> > 1064 rc = ERROR_FAIL; 
> > 1065 goto out; 
> > 1066 } 
>  
>  
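
A sketch of the suggestion above: skip the stat() and let open() itself report whether the target exists, treating ENOENT as "not present". The function name and return convention are hypothetical, and the code assumes a POSIX system:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Open path for writing without a preceding stat().
 * Returns 1 and stores the descriptor in *fd_out if the file exists,
 * 0 if it does not exist, and -1 on any other error.
 */
static int open_if_present(const char *path, int *fd_out)
{
    int fd;

    assert(path != NULL && fd_out != NULL);

    fd = open(path, O_WRONLY);
    if (fd >= 0) {
        *fd_out = fd;
        return 1;
    }
    return errno == ENOENT ? 0 : -1;
}
```

Because there is only one system call that names the path, there is no window between a check and a use for the path to change in.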





Re: [Xen-devel] [PATCH v4 02/14] x86/xen: use X86_SUBARCH_XEN for PV guest boots

2016-04-07 Thread David Vrabel
On 07/04/16 01:06, Luis R. Rodriguez wrote:
> The use of subarch should have no current effect on Xen
> PV guests, as such this should have no current functional
> effects.

Reviewed-by: David Vrabel 

David



[Xen-devel] [PATCH 4/4] fix a pvusb type check

2016-04-07 Thread Chunyan Liu
Missing a check of controller type.

Signed-off-by: Chunyan Liu 
CC: Simon Cao 
CC: George Dunlap 
CC: Ian Jackson 
CC: Juergen Gross 
---
This affects Juergen's qusb patch too. Fix that together. This patch
could be applied on top of Juergen's qusb backend patch.

 tools/libxl/libxl_pvusb.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/tools/libxl/libxl_pvusb.c b/tools/libxl/libxl_pvusb.c
index 02d3e55..6447639 100644
--- a/tools/libxl/libxl_pvusb.c
+++ b/tools/libxl/libxl_pvusb.c
@@ -811,6 +811,13 @@ static int libxl__device_usbdev_setdefault(libxl__gc *gc,
 }
 }
 
+if (usbctrl->type != LIBXL_USBCTRL_TYPE_PV &&
+usbctrl->type != LIBXL_USBCTRL_TYPE_QUSB) {
+LOG(ERROR, "Unsupported USB controller type");
+rc = ERROR_FAIL;
+goto out;
+}
+
 rc = libxl__device_usbctrl_add_xenstore(gc, domid, usbctrl,
 update_json);
 if (rc) goto out;
-- 
2.1.4




[Xen-devel] [PATCH 0/4] pvusb fixes

2016-04-07 Thread Chunyan Liu
Patch series of pvusb fixes.

Chunyan Liu (4):
  a fix in libxl_device_usbdev_list
  correct libxl_write_exactly sizeof
  cleanup redundant lstat in libxl_pvusb.c
  fix a pvusb type check

 tools/libxl/libxl_pvusb.c | 38 +-
 1 file changed, 17 insertions(+), 21 deletions(-)

-- 
2.1.4




[Xen-devel] [PATCH 3/4] cleanup redundant lstat in libxl_pvusb.c

2016-04-07 Thread Chunyan Liu
CID: 1358112

Signed-off-by: Chunyan Liu 
CC: Simon Cao 
CC: George Dunlap 
CC: Ian Jackson 
---
 tools/libxl/libxl_pvusb.c | 21 +
 1 file changed, 5 insertions(+), 16 deletions(-)

diff --git a/tools/libxl/libxl_pvusb.c b/tools/libxl/libxl_pvusb.c
index 45117cf..02d3e55 100644
--- a/tools/libxl/libxl_pvusb.c
+++ b/tools/libxl/libxl_pvusb.c
@@ -983,25 +983,14 @@ static char *usbdev_busid_from_ctrlport(libxl__gc *gc, uint32_t domid,
 static int usbintf_get_drvpath(libxl__gc *gc, const char *intf, char **drvpath)
 {
 char *spath, *dp = NULL;
-struct stat st;
-int r;
 
 spath = GCSPRINTF(SYSFS_USB_DEV "/%s/driver", intf);
 
-r = lstat(spath, &st);
-if (r == 0) {
-/* Find the canonical path to the driver. */
-dp = libxl__zalloc(gc, PATH_MAX);
-dp = realpath(spath, dp);
-if (!dp) {
-LOGE(ERROR, "get realpath failed: '%s'", spath);
-return ERROR_FAIL;
-}
-} else if (errno == ENOENT) {
-/* driver path doesn't exist */
-dp = NULL;
-} else {
-LOGE(ERROR, "lstat failed: '%s'", spath);
+/* Find the canonical path to the driver. */
+dp = libxl__zalloc(gc, PATH_MAX);
+dp = realpath(spath, dp);
+if (!dp && errno != ENOENT) {
+LOGE(ERROR, "get realpath failed: '%s'", spath);
 return ERROR_FAIL;
 }
 
-- 
2.1.4




[Xen-devel] [PATCH 2/4] correct libxl_write_exactly sizeof

2016-04-07 Thread Chunyan Liu
sizeof is wrongly used in the libxl_write_exactly calls: intf is a
pointer, so sizeof(intf) is the pointer size. Use strlen instead.

CID: 1358110
CID: 1358109

Signed-off-by: Chunyan Liu 
CC: Simon Cao 
CC: George Dunlap 
CC: Ian Jackson 
---
 tools/libxl/libxl_pvusb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_pvusb.c b/tools/libxl/libxl_pvusb.c
index 04e41b4..45117cf 100644
--- a/tools/libxl/libxl_pvusb.c
+++ b/tools/libxl/libxl_pvusb.c
@@ -1025,7 +1025,7 @@ static int unbind_usbintf(libxl__gc *gc, const char *intf)
 goto out;
 }
 
-if (libxl_write_exactly(CTX, fd, intf, sizeof(intf), path, intf)) {
+if (libxl_write_exactly(CTX, fd, intf, strlen(intf), path, intf)) {
 rc = ERROR_FAIL;
 goto out;
 }
@@ -1065,7 +1065,7 @@ static int bind_usbintf(libxl__gc *gc, const char *intf, const char *drvpath)
 goto out;
 }
 
-if (libxl_write_exactly(CTX, fd, intf, sizeof(intf), path, intf)) {
+if (libxl_write_exactly(CTX, fd, intf, strlen(intf), path, intf)) {
 rc = ERROR_FAIL;
 goto out;
 }
-- 
2.1.4
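
A short illustration of why the patch above replaces sizeof with strlen: when applied to a `const char *` parameter, sizeof yields the size of the pointer, not the length of the string it points to. The helper names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* intf is a pointer here, so sizeof gives the pointer size. */
static size_t len_via_sizeof(const char *intf)
{
    return sizeof(intf);
}

/* strlen counts the characters up to the terminating NUL. */
static size_t len_via_strlen(const char *intf)
{
    assert(intf != NULL);
    return strlen(intf);
}
```

For any interface name, len_via_sizeof() returns the same value regardless of the string, which is why the original calls wrote the wrong number of bytes.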




[Xen-devel] [PATCH 1/4] a fix in libxl_device_usbdev_list

2016-04-07 Thread Chunyan Liu
While testing the libvirt pvusb functionality, I found an rc check
error in libxl_device_usbdev_list. Correct it. This function
is not used by xl.

Signed-off-by: Chunyan Liu 
CC: Simon Cao 
CC: George Dunlap 
CC: Ian Jackson 
---
 tools/libxl/libxl_pvusb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl_pvusb.c b/tools/libxl/libxl_pvusb.c
index 5f92628..04e41b4 100644
--- a/tools/libxl/libxl_pvusb.c
+++ b/tools/libxl/libxl_pvusb.c
@@ -701,13 +701,13 @@ libxl_device_usbdev_list(libxl_ctx *ctx, uint32_t domid, int *num)
 usbctrls = libxl__xs_directory(gc, XBT_NULL, path, &nc);
 
 for (i = 0; i < nc; i++) {
-int r, nd = 0;
+int rc, nd = 0;
 libxl_device_usbdev *tmp = NULL;
 
-r = libxl__device_usbdev_list_for_usbctrl(gc, domid,
+rc = libxl__device_usbdev_list_for_usbctrl(gc, domid,
   atoi(usbctrls[i]),
   &tmp, &nd);
-if (!r || !nd) continue;
+if (rc || !nd) continue;
 
 usbdevs = libxl__realloc(NOGC, usbdevs,
  sizeof(*usbdevs) * (*num + nd));
-- 
2.1.4
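
The corrected check follows the "0 means success" return convention: a controller is skipped when the call failed (rc != 0) or when it has no devices. A sketch with a hypothetical lister standing in for libxl__device_usbdev_list_for_usbctrl():

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-controller lister: 0 on success, negative on failure. */
static int list_for_ctrl(int ctrl, int *nd)
{
    assert(nd != NULL);
    if (ctrl < 0) {
        *nd = 0;
        return -1;
    }
    *nd = ctrl;   /* pretend controller N has N devices */
    return 0;
}

/* Sum the devices of all controllers, skipping failures and empty ones. */
static int count_devices(const int *ctrls, int nc)
{
    int total = 0;

    for (int i = 0; i < nc; i++) {
        int rc, nd = 0;

        rc = list_for_ctrl(ctrls[i], &nd);
        if (rc || !nd) continue;   /* the pre-patch "!r" skipped successes */
        total += nd;
    }
    return total;
}
```

With the inverted test from before the patch, every successful call would have been skipped and the resulting list would always have been empty.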




Re: [Xen-devel] [PATCH v4 04/14] x86/rtc: replace paravirt rtc check with platform legacy quirk

2016-04-07 Thread David Vrabel
On 07/04/16 01:06, Luis R. Rodriguez wrote:
> We have 4 types of x86 platforms that disable RTC:
> 
>   * Intel MID
>   * Lguest - uses paravirt
>   * Xen dom-U - uses paravirt
>   * x86 on legacy systems annotated with an ACPI legacy flag
> 
> We can consolidate all of these into a platform specific legacy
> quirk set early in boot through i386_start_kernel() and through
> x86_64_start_reservations(). This deals with the RTC quirks which
> we can rely on through the hardware subarch, the ACPI check can
> be dealt with separately.

Xen parts:

Reviewed-by: David Vrabel 

David



Re: [Xen-devel] [PATCH v4 06/14] x86/init: use a platform legacy quirk for ebda

2016-04-07 Thread David Vrabel
On 07/04/16 01:06, Luis R. Rodriguez wrote:
> 
> --- a/arch/x86/kernel/platform-quirks.c
> +++ b/arch/x86/kernel/platform-quirks.c
> @@ -7,8 +7,12 @@
>  void __init x86_early_init_platform_quirks(void)
>  {
>   x86_platform.legacy.rtc = 1;
> + x86_platform.legacy.ebda_search = 0;

You should make the default the setting for regular PC hardware, as you
have done for the .rtc bit.

David
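
A sketch of the arrangement suggested here: the defaults describe regular PC hardware, and platforms that lack a feature switch it off. The enum, struct and the choice of which platform clears which bit are assumptions made for illustration, not the kernel's definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subarch identifiers for this sketch. */
enum subarch { SUBARCH_PC, SUBARCH_XEN };

struct legacy_features {
    int rtc;
    int ebda_search;
};

static void early_init_quirks(struct legacy_features *l, enum subarch s)
{
    assert(l != NULL);

    /* Defaults describe regular PC hardware... */
    l->rtc = 1;
    l->ebda_search = 1;

    /* ...and other platforms switch off what they do not provide. */
    switch (s) {
    case SUBARCH_PC:
        break;
    case SUBARCH_XEN:
        l->rtc = 0;
        l->ebda_search = 0;
        break;
    }
}
```

Keeping both fields on the same "default is PC" footing means a new platform only has to list what it lacks.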



Re: [Xen-devel] [PATCH v4 11/14] pnpbios: replace paravirt_enabled() check with legacy device check

2016-04-07 Thread David Vrabel
On 07/04/16 01:06, Luis R. Rodriguez wrote:
> Since we are removing paravirt_enabled() replace it with a
> logical equivalent. Even though PNPBIOS is x86 specific we
> add an arch-specific type call, which can be implemented by
> any architecture to show how other legacy attribute devices
> can later be also checked for with other ACPI legacy attribute
> flags.
> 
> This implicates the first ACPI 5.2.9.3 IA-PC Boot Architecture
> ACPI_FADT_LEGACY_DEVICES flag device, and shows how to add more.
[...]
> +struct x86_legacy_devices {
> + int pnpbios;
> +};

It's not clear why pnpbios needs a new structure and why this structure
of devices does not have the bit for the rtc device.

David



Re: [Xen-devel] [PATCH 4/4] fix a pvusb type check

2016-04-07 Thread Juergen Gross
On 07/04/16 11:40, Chunyan Liu wrote:
> Missing a check of controller type.
> 
> Signed-off-by: Chunyan Liu 
> CC: Simon Cao 
> CC: George Dunlap 
> CC: Ian Jackson 
> CC: Juergen Gross 

Reviewed-by: Juergen Gross 

> ---
> This affects Juergen's qusb patch too. Fix that together. This patch
> could be applied on top of Juergen's qusb backend patch.
> 
>  tools/libxl/libxl_pvusb.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/tools/libxl/libxl_pvusb.c b/tools/libxl/libxl_pvusb.c
> index 02d3e55..6447639 100644
> --- a/tools/libxl/libxl_pvusb.c
> +++ b/tools/libxl/libxl_pvusb.c
> @@ -811,6 +811,13 @@ static int libxl__device_usbdev_setdefault(libxl__gc *gc,
>  }
>  }
>  
> +if (usbctrl->type != LIBXL_USBCTRL_TYPE_PV &&
> +usbctrl->type != LIBXL_USBCTRL_TYPE_QUSB) {
> +LOG(ERROR, "Unsupported USB controller type");
> +rc = ERROR_FAIL;
> +goto out;
> +}
> +
>  rc = libxl__device_usbctrl_add_xenstore(gc, domid, usbctrl,
>  update_json);
>  if (rc) goto out;
> 




Re: [Xen-devel] [PATCH v3] xen/arm: map_dev_mmio_region: printk should be ratelimited

2016-04-07 Thread Julien Grall

Hi Shannon,

On 07/04/16 07:28, Shannon Zhao wrote:

From: Shannon Zhao 

The function map_dev_mmio_region is used in a hypercall. Therefore all
printks should be ratelimited to avoid a malicious guest flooding the
console.

Signed-off-by: Shannon Zhao 
Reviewed-by: Konrad Rzeszutek Wilk 


Acked-by: Julien Grall 

Regards,


---
v3: update commit message
---
  xen/arch/arm/p2m.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 0011708..db21433 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1284,7 +1284,7 @@ int map_dev_mmio_region(struct domain *d,
  res = map_mmio_regions(d, start_gfn, nr, mfn);
  if ( res < 0 )
  {
-printk(XENLOG_ERR "Unable to map [%#lx - %#lx] in Dom%d\n",
+printk(XENLOG_G_ERR "Unable to map [%#lx - %#lx] in Dom%d\n",
 start_gfn, start_gfn + nr - 1, d->domain_id);
  return res;
  }



--
Julien Grall



[Xen-devel] [RFC PATCH] Data integrity extension support for xen-block

2016-04-07 Thread Bob Liu
* What is the data integrity extension, and why is it needed?
Modern filesystems feature checksumming of data and metadata to protect against
data corruption.  However, the detection of the corruption is done at read time
which could potentially be months after the data was written.  At that point the
original data that the application tried to write is most likely lost.

The solution in Linux is the data integrity framework which enables protection
information to be pinned to I/Os and sent to/received from controllers that
support it. struct bio has been extended with a pointer to a struct bip which
in turn contains the integrity metadata. The bip is essentially a trimmed down
bio with a bio_vec and some housekeeping.

* Issues when xen-block gets involved.
xen-blkfront only transmits the normal data of struct bio while the integrity
metadata buffer(struct bio_integrity_payload in each bio) is ignored.

* Proposal of transmitting bio integrity payload.
Add an extra request following the normal data request; this extra request
contains the integrity payload.
xen-blkback will then reconstruct a new bio from both the received normal data
and the integrity metadata.
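
A minimal sketch of how a backend might walk such a ring, assuming each flagged request is immediately followed by exactly one extra slot. struct ring_slot and count_requests are hypothetical and much simpler than the real blkif structures; only the flag value comes from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLKIF_OP_EXTRA_FLAG 0x80 /* value taken from the patch */

/* Hypothetical, simplified ring slot; the real blkif structures differ. */
struct ring_slot {
    uint8_t operation; /* BLKIF_OP_* possibly OR-ed with BLKIF_OP_EXTRA_FLAG */
    uint64_t id;
};

/*
 * Walk n slots and count normal requests and extra requests.
 * Returns 0 on success, -1 if a flagged request is the last slot
 * (its extra request is missing).
 */
static int count_requests(const struct ring_slot *slots, size_t n,
                          size_t *normal, size_t *extra)
{
    size_t i = 0;

    assert(slots != NULL || n == 0);
    *normal = 0;
    *extra = 0;

    while (i < n) {
        (*normal)++;
        if (slots[i].operation & BLKIF_OP_EXTRA_FLAG) {
            if (i + 1 >= n)
                return -1; /* flagged request without its extra slot */
            (*extra)++;
            i += 2; /* consume the request and its extra slot */
        } else {
            i += 1;
        }
    }
    return 0;
}
```

The point of the sketch is that the consumer must treat a request and its extra slot as one unit, so a flagged request at the very end of the available slots has to be treated as incomplete.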

Any better ideas are welcome, thank you!

[1] http://lwn.net/Articles/280023/
[2] https://www.kernel.org/doc/Documentation/block/data-integrity.txt

Signed-off-by: Bob Liu 
---
 xen/include/public/io/blkif.h |   50 +
 1 file changed, 50 insertions(+)

diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
index 99f0326..3d8d39f 100644
--- a/xen/include/public/io/blkif.h
+++ b/xen/include/public/io/blkif.h
@@ -635,6 +635,28 @@
 #define BLKIF_OP_INDIRECT  6
 
 /*
+ * Recognized only if "feature-extra-request" is present in backend xenbus info.
+ * A request with BLKIF_OP_EXTRA_FLAG indicates an extra request is followed
+ * in the shared ring buffer.
+ *
+ * By this way, extra data like bio integrity payload can be transmitted from
+ * frontend to backend.
+ *
+ * The 'wire' format is like:
+ *  Request 1: xen_blkif_request
+ * [Request 2: xen_blkif_extra_request](only if request 1 has BLKIF_OP_EXTRA_FLAG)
+ *  Request 3: xen_blkif_request
+ *  Request 4: xen_blkif_request
+ * [Request 5: xen_blkif_extra_request](only if request 4 has BLKIF_OP_EXTRA_FLAG)
+ *  ...
+ *  Request N: xen_blkif_request
+ *
+ * If a backend does not recognize BLKIF_OP_EXTRA_FLAG, it should *not* create the
+ * "feature-extra-request" node!
+ */
+#define BLKIF_OP_EXTRA_FLAG (0x80)
+
+/*
  * Maximum scatter/gather segments per request.
  * This is carefully chosen so that sizeof(blkif_ring_t) <= PAGE_SIZE.
  * NB. This could be 12 if the ring indexes weren't stored in the same page.
@@ -703,6 +725,34 @@ struct blkif_request_indirect {
 };
 typedef struct blkif_request_indirect blkif_request_indirect_t;
 
+enum blkif_extra_request_type {
+   BLKIF_EXTRA_TYPE_DIX = 1,   /* Data integrity extension payload.  */
+};
+
+struct bio_integrity_req {
+   /*
+* Grant mapping for transmitting bio integrity payload to backend.
+*/
+   grant_ref_t *gref;
+   unsigned int nr_grefs;
+   unsigned int len;
+};
+
+/*
+ * Extra request, must follow a normal-request and a normal-request can
+ * only be followed by one extra request.
+ */
+struct blkif_request_extra {
+   uint8_t type;   /* BLKIF_EXTRA_TYPE_* */
+   uint16_t _pad1;
+#ifndef CONFIG_X86_32
+   uint32_t _pad2; /* offsetof(blkif_...,u.extra.id) == 8 */
+#endif
+   uint64_t id;
+   struct bio_integrity_req bi_req;
+} __attribute__((__packed__));
+typedef struct blkif_request_extra blkif_request_extra_t;
+
 struct blkif_response {
 uint64_t id;  /* copied from request */
 uint8_t operation;   /* copied from request */
-- 
1.7.10.4




[Xen-devel] [linux-3.18 test] 89247: regressions - trouble: blocked/broken/fail/pass

2016-04-07 Thread osstest service owner
flight 89247 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/89247/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf   4 host-build-prep   fail REGR. vs. 86513
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 15 guest-localmigrate/x10 fail REGR. vs. 86513
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail REGR. vs. 86513
 test-amd64-i386-libvirt-pair 21 guest-migrate/src_host/dst_host fail REGR. vs. 86513
 test-amd64-i386-xl-qemut-win7-amd64 20 leak-check/check   fail REGR. vs. 86513

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail REGR. vs. 86513
 build-amd64-rumpuserxen   6 xen-build fail   like 86513
 build-i386-rumpuserxen    6 xen-build fail   like 86513
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 86513
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 86513

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 14 guest-saverestore fail  never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass

version targeted for testing:
 linux b36eba9b4dd4344ed51b8f644049aeac606ccff2
baseline version:
 linux d439e869d612dd7a338ac75a4afc3646a5e67370

Last test of basis 86513  2016-03-17 21:21:40 Z   20 days
Testing same since 89247  2016-04-06 22:15:59 Z    0 days    1 attempts


People who touched revisions under test:
  Alex Deucher 
  Avery Pennarun 
  Catalin Marinas 
  Chris Bainbridge 
  Emmanuel Grumbach 
  Felix Fietkau 
  Grygorii Strashko 
  James Hogan 
  Johannes Berg 
  Jouni Malinen 
  Konstantin Khlebnikov 
  Liad Kaufman 
  Lokesh Vutla 
  Marc Kleine-Budde 
  Mark Brown 
  Markos Chandras 
  Maximilain Schneider 
  Maximilian Schneider 
  Michael S. Tsirkin 
  Miklos Szeredi 
  Mugunthan V N 
  Nicholas Bellinger 
  Paolo Bonzini 
  Paul Mackerras 
  Paul Walmsley 
  Radim Krčmář 
  Ralf Baechle 
  Rui Wang 
  Sasha Levin 
  Steven Rostedt (Red Hat) 
  Steven Rostedt 
  Sudeep Holla 
  Sudip Mukherjee 
  Takashi Iwai 
  Tony Lindgren 
  Will Deacon 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  broken  
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  blocked 
 build-i386-libvirt   pass
 build-amd64-pvops pass
 build-armhf-pvops pass
 build-i386-pvops  

Re: [Xen-devel] [PATCH v10 06/17] Xen: ARM: Add support for mapping platform device mmio

2016-04-07 Thread Julien Grall

Hi Shannon,

On 07/04/16 02:37, Shannon Zhao wrote:



On 2016/4/6 20:16, Julien Grall wrote:

+gpfns[j] = XEN_PFN_DOWN(r->start) + j;
+idxs[j] = XEN_PFN_DOWN(r->start) + j;
+}
+
+xatp.domid = DOMID_SELF;
+xatp.size = nr;
+xatp.space = XENMAPSPACE_dev_mmio;
+
+set_xen_guest_handle(xatp.gpfns, gpfns);
+set_xen_guest_handle(xatp.idxs, idxs);
+set_xen_guest_handle(xatp.errs, errs);
+
+rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap_range, &xatp);
+kfree(gpfns);
+kfree(idxs);
+kfree(errs);
+if (rc)
+return rc;


Shouldn't we redo the mapping if the hypercall fails?

Hmm, why? If it fails again when we redo the mapping, what should we do
then? Redo again?


Because the device MMIO region is left half mapped in DOM0 address space.

After having another look at your patch, if an error occurs, the 
notifier will still return NOTIFY_OK. This will lead to random data 
aborts when the driver accesses the MMIO regions, as the device will 
be considered fully functional.


However, even if the notifier returns NOTIFY_BAD, the function device_add 
doesn't care about the return value of blocking_notifier_call_chain. I 
think this needs to be fixed.



I think if it fails at the first time it will always fail no matter how
many times we do.


I was speaking about the mappings that succeeded. They are unlikely to 
fail during removal. If they ever fail, you can just ignore the error.
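
To make the suggestion concrete, here is a rough sketch of the rollback pattern. It maps frame by frame for clarity, whereas the real code issues one batched hypercall; map_fn, unmap_fn, toy_map and toy_unmap are hypothetical stand-ins, not kernel or Xen APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callbacks standing in for the real map/unmap operations. */
typedef int (*map_fn)(unsigned long gfn);
typedef int (*unmap_fn)(unsigned long gfn);

/*
 * Map nr frames starting at start. If one mapping fails, undo the ones
 * that already succeeded and return the error, so the region is never
 * left half mapped.
 */
static int map_region_or_rollback(unsigned long start, size_t nr,
                                  map_fn map, unmap_fn unmap)
{
    size_t i;
    int rc = 0;

    assert(map != NULL && unmap != NULL);

    for (i = 0; i < nr; i++) {
        rc = map(start + i);
        if (rc)
            break;
    }

    if (rc) {
        /* Unmap in reverse order; errors here are deliberately ignored. */
        while (i-- > 0)
            (void)unmap(start + i);
    }

    return rc;
}

/* Toy backend used to exercise the helper: frame FAIL_GFN cannot be mapped. */
#define FAIL_GFN 3UL
static unsigned int mapped_count;

static int toy_map(unsigned long gfn)
{
    if (gfn == FAIL_GFN)
        return -1;
    mapped_count++;
    return 0;
}

static int toy_unmap(unsigned long gfn)
{
    (void)gfn;
    mapped_count--;
    return 0;
}
```

The caller then sees either a fully mapped region or none of it, and can report the failure instead of treating the device as functional.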


Regards,

--
Julien Grall



Re: [Xen-devel] [PATCH v10 00/17] Add ACPI support for Xen Dom0 on ARM64

2016-04-07 Thread Julien Grall

On 07/04/16 02:39, Shannon Zhao wrote:

Hi Julien,


Hi Shannon,


On 2016/4/6 19:32, Julien Grall wrote:

Hi Shannon,

On 01/04/2016 16:48, Shannon Zhao wrote:

This patch set adds ACPI support for Xen Dom0 on ARM64. The relevant Xen
ACPI on ARM64 design document can be found at [1].

This patch set adds a new FDT node "uefi" under /hypervisor to pass UEFI
information. Introduce a bus notifier of AMBA and Platform bus to map
the newly added device's MMIO space. Make Xen domain use
xlated_setup_gnttab_pages to setup grant table and a new hypercall to
get event-channel irq.

Regarding the initialization flow of Linux kernel, it needs to move
xen_early_init() before efi_init(). Then xen_early_init() will check
whether it runs on Xen through the /hypervisor node and efi_init() will
call a new function fdt_find_xen_uefi_params(), to parse those
xen,uefi-* parameters just like the existing efi_get_fdt_params().

And in arm64_enable_runtime_services() it will check whether it runs on
Xen and call another new function xen_efi_runtime_setup() to setup
runtime service instead of efi_native_runtime_setup(). The
xen_efi_runtime_setup() will assign the runtime function pointers with
the functions of drivers/xen/efi.c.

And since we pass a /hypervisor node and a /chosen node to Dom0, it
needs to check whether the DTS only contains a /hypervisor node and a
/chosen node in acpi_boot_table_init().

Patches are tested on FVP base model. They can be fetched from [2].


I have tested this series and Linux is booting up to the prompt:

Tested-by: Julien Grall 

Thanks a lot. There are several patches on which you haven't commented.
So I assume you will review them. If so, I'll wait and update
this series later.


I don't have any comments on those patches. You can go ahead and update 
the patch series.


Regards,

--
Julien Grall



Re: [Xen-devel] [PATCH v1] libxc: fix uninitialized variable when changing rtds scheduling parameters

2016-04-07 Thread Ian Jackson
Andrew Cooper writes ("Re: [Xen-devel] [PATCH v1] libxc: fix uninitialized 
variable when changing rtds scheduling parameters"):
> On 06/04/16 21:30, Chong Li wrote:
> > Commit 046c2b503a89d21b41e4d555a9f75d02af00dbc6 introduces a build
> > failure: in some cases (e.g., num_vcpus <=0),
> > xc_sched_rtds_vcpu_get/set returns an uninitialized variable.
> 
> LGTM.
> 
> Reviewed-by: Andrew Cooper 

Committed-by: Ian Jackson 



[Xen-devel] [ovmf test] 89251: regressions - FAIL

2016-04-07 Thread osstest service owner
flight 89251 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/89251/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail REGR. vs. 65543
 test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 65543

version targeted for testing:
 ovmf eccc28bfcb91b5d72adf173d7404652c7aa63c26
baseline version:
 ovmf 5ac96e3a28dd26eabee421919f67fa7c443a47f1

Last test of basis 65543  2015-12-08 08:45:15 Z  121 days
Failing since      65593  2015-12-08 23:44:51 Z  120 days  139 attempts
Testing same since 89251  2016-04-06 22:17:55 Z    0 days    1 attempts


People who touched revisions under test:
  "Samer El-Haj-Mahmoud" 
  "Wu, Hao A" 
  "Yao, Jiewen" 
  Alcantara, Paulo 
  Anbazhagan Baraneedharan 
  Andrew Fish 
  Ard Biesheuvel 
  Arthur Crippa Burigo 
  Cecil Sheng 
  Chao Zhang 
  Chao Zhang
  Charles Duffy 
  Cinnamon Shia 
  Cohen, Eugene 
  Dandan Bi 
  Daocheng Bu 
  Daryl McDaniel 
  David Woodhouse 
  Derek Lin 
  edk2 dev 
  edk2-devel 
  Eric Dong 
  Eric Dong 
  Eugene Cohen 
  Evan Lloyd 
  Feng Tian 
  Fu Siyuan 
  Gabriel Somlo 
  Gary Ching-Pang Lin 
  Gary Lin 
  Ghazi Belaam 
  Hao Wu 
  Haojian Zhuang 
  Hess Chen 
  Heyi Guo 
  Jaben Carsey 
  James Bottomley 
  Jeff Fan 
  Jeremy Linton 
  Jiaxin Wu 
  jiewen yao 
  Jim Dailey 
  jim_dai...@dell.com 
  Jordan Justen 
  Juliano Ciocari 
  Karyne Mayer 
  Larry Hauch 
  Laszlo Ersek 
  Leahy, Leroy P
  Leahy, Leroy P 
  Lee Leahy 
  Leekha Shaveta 
  Leendert van Doorn 
  Leif Lindholm 
  Leo Duran 
  Liming Gao 
  Mark Rutland 
  Marvin Haeuser 
  Marvin Häuser 
  Michael Kinney 
  Michael LeMay 
  Michael Thomas 
  Michał Zegan 
  Ni, Ruiyu 
  Ni, Ruiyu 
  Paolo Bonzini 
  Paulo Alcantara 
  Paulo Alcantara Cavalcanti 
  Peter Kirmeier 
  Qin Long 
  Qiu Shumin 
  Rodrigo Dias Correa 
  Ruiyu Ni 
  Ryan Harkin 
  Samer El-Haj-Mahmoud 
  Samer El-Haj-Mahmoud 
  Sami Mujawar 
  Shivamurthy Shastri 
  Star Zeng 
  Supreeth Venkatesh 
  Tapan Shah 
  Thomas Palmer 
  Tian, Feng 
  Vladislav Vovchenko 
  Yao Jiewen 
  Yao, Jiewen 
  Ye Ting 
  Yonghong Zhu 
  Zeng, Star 
  Zhang Lubo 
  Zhang, Chao B 
  Zhang, Lubo 
  Zhangfei Gao 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 16809 lines long.)



Re: [Xen-devel] Redundant lstats in libxl_pvusb.c

2016-04-07 Thread Ian Jackson
Chun Yan Liu writes ("Re: Redundant lstats in libxl_pvusb.c"):
> <22274.33583.712655.413...@mariner.uk.xensource.com>, Ian Jackson
>  wrote: 
> > In libxl_usb.c, usbintf_get_drvpath calls stat(2) on the driver sysfs 
> > path, and then realpath on the same path. 
> 
> It's true. This could be done by calling realpath only. Will correct.

Thanks.

> > And bind_usbintf calls stat(2) on the driver directory path, and then 
> > open(2) on a file in that directory. 
> 
> It's not true. It calls stat(2) on a file in driver path (driver/interface),
> and open(2) on another file in that driver path (driver/bind).

I have read the function again and you are right.

Coverity said:

> > > >>> CID 1358111: Security best practices violations (TOCTOU)
> > > >>> Calling function "open" that uses "path" after a check
> > > >>> function. This can cause a time-of-check, time-of-use
> > > >>> race condition.

But it seems that it is confused by the reuse of the path variable.
I think this is arguably a bug in Coverity.

But, evidently, the same reuse confused me too.  Maybe we should turn
`path' into two variables, `intf_path' and `bind_path' ?  What do you
think ?

Thanks,
Ian.
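
For the first point (dropping the stat(2) before realpath), a minimal sketch of a realpath-only check could look like the following. resolve_driver_path is a hypothetical helper, not the libxl function, and it uses plain malloc-based realpath rather than libxl's gc allocator.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Resolve link in a single call, with no separate stat() beforehand.
 * Returns 0 and stores a malloc()ed path in *resolved on success,
 * 1 if the path does not exist, -1 on any other error.
 */
static int resolve_driver_path(const char *link, char **resolved)
{
    assert(link != NULL && resolved != NULL);

    /* POSIX.1-2008: a NULL second argument makes realpath() allocate. */
    *resolved = realpath(link, NULL);
    if (*resolved != NULL)
        return 0;                     /* exists and resolved */
    return errno == ENOENT ? 1 : -1;  /* 1: no such path, -1: other error */
}
```

Because existence is learned from the same call that resolves the path, there is no window between a check and a use.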



Re: [Xen-devel] Redundant lstats in libxl_pvusb.c

2016-04-07 Thread Chun Yan Liu


>>> On 4/7/2016 at 06:43 PM, in message
<22278.14817.248393.423...@mariner.uk.xensource.com>, Ian Jackson
 wrote: 
> Chun Yan Liu writes ("Re: Redundant lstats in libxl_pvusb.c"): 
> > <22274.33583.712655.413...@mariner.uk.xensource.com>, Ian Jackson 
> >  wrote:  
> > > In libxl_usb.c, usbintf_get_drvpath calls stat(2) on the driver sysfs  
> > > path, and then realpath on the same path.  
> >  
> > It's true. This could be done by calling realpath only. Will correct. 
>  
> Thanks. 
>  
> > > And bind_usbintf calls stat(2) on the driver directory path, and then  
> > > open(2) on a file in that directory.  
> >  
> > It's not true. It calls stat(2) on a file in driver path (driver/interface), 
> > and open(2) on another file in that driver path (driver/bind). 
>  
> I have read the function again and you are right. 
>  
> Coverity said: 
>  
> > > > >>> CID 1358111: Security best practices violations (TOCTOU) 
> > > > >>> Calling function "open" that uses "path" after a check 
> > > > >>> function. This can cause a time-of-check, time-of-use 
> > > > >>> race condition. 
>  
> But it seems that it is confused by the reuse of the path variable. 
> I think this is arguably a bug in Coverity. 
>  
> But, evidently, the same reuse confused me too.  Maybe we should turn 
> `path' into two variables, `intf_path' and `bind_path' ?  What do you 
> think ? 

Yeah, it's probably better to split it into 'intf_path' and 'bind_path'; I'll
update. But it's unavoidable that some temporary variables will be reused many
times.

Chunyan

>  
> Thanks, 
> Ian. 
>  
>  





[Xen-devel] [for-4.7] xen/arm64: correctly emulate the {w, x}zr registers

2016-04-07 Thread Julien Grall
On AArch64, encoding 31 for an R in the HSR is used to represent
either {w,x}sp or {w,x}zr (See C1.2.4 in ARM DDI 0487A.d) depending on
how the register field is interpreted by the instruction.

All the instructions trapped by Xen (either via a sysreg access or
data abort) interpret encoding 31 as {w,x}zr. Therefore we don't have
to worry about the possibility that a trap could refer to sp or about
decoding the instruction.

For example AArch64 LDR and STR can have zr in the source/target
register, but never sp. sp can be present in the destination
pointer (i.e. "[sp]"), but that would be represented by the value of
FAR_EL2, not in the HSR.

For AArch32 it is possible for a LDR to target the PC, but this would
not result in a valid ISS in the HSR register. However this could only
occur if loading or storing the PC to MMIO, which we simply choose not
to support for now.

Finally, features such as xenaccess can lead to us trapping on
arbitrary instructions accessing RAM and not just for MMIO. However in
many such cases HSR.ISS is not valid and in general features such as
xenaccess do not rely on the nature of the specific instruction, they
resolve the fault (via information found elsewhere e.g. FAR_EL2)
without needing to know anything about the instruction which triggered
the trap.

The register zr represents the zero register, i.e. it always reads
as 0 and writes to it are ignored. To properly handle this property,
2 new helpers, {get,set}_user_reg, have been introduced to read/write a
value from/to a register. All the calls to select_user_reg have been
replaced by these 2 helpers.
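
The intended semantics can be sketched as follows. This is an illustration only, not the code in the patch: struct user_regs and reg_val_t are hypothetical simplifications of struct cpu_user_regs and register_t, and only the AArch64 case is shown.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t reg_val_t; /* stand-in for Xen's register_t */

/* Hypothetical, simplified register file: x0..x30 only. */
struct user_regs {
    reg_val_t x[31];
};

/* Read register idx; encoding 31 is the zero register and reads as 0. */
static reg_val_t get_user_reg(const struct user_regs *regs, unsigned int idx)
{
    assert(idx <= 31);
    if (idx == 31)
        return 0;
    return regs->x[idx];
}

/* Write register idx; writes to encoding 31 are discarded. */
static void set_user_reg(struct user_regs *regs, unsigned int idx,
                         reg_val_t val)
{
    assert(idx <= 31);
    if (idx == 31)
        return;
    regs->x[idx] = val;
}
```

Going through accessors rather than handing out a pointer to the register slot is what makes the special case possible: there is no storage to point at for encoding 31.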

Furthermore, the code to emulate encoding 31 in select_user_reg has been
dropped because it was invalid. For an AArch64 context, the encoding is
used for sp or zr. For an AArch32 context, the ISS won't be valid for data
aborts from AArch32 using r15 (i.e. pc) as source/destination (See D7-1881
ARM DDI 0487A.d, note the validity is more restrictive than on ARMv7).
It's also not possible to use r15 in co-processor instructions.

This patch fixes MMIO registers and sysregs being set to a random value
(actually PC) instead of zero by something like:

*((volatile int*)reg) = 0;

Compilers tend to generate "str wzr, [xx]" here.

[ian: added BUG_ON to select_user_reg and clarified bits of the commit message]
Reported-by: Marc Zyngier 
Signed-off-by: Julien Grall 
Signed-off-by: Ian Campbell 
Reviewed-by: Stefano Stabellini 

---

Stefano, let me know whether the new helpers correspond to the change you
requested (see [1]).

This patch is a bug fix for Xen 4.7. Without it, a MMIO register and
sysreg will be set to a random value (actually PC) when the zero
register is used.

I'm not sure if we should consider this patch to be backported to Xen
4.6 and Xen 4.5. It depends on other patches and it would require some
rework to backport it alone.

[1] http://lists.xenproject.org/archives/html/xen-devel/2016-02/msg03100.html
---
 xen/arch/arm/io.c  |  34 
 xen/arch/arm/traps.c   | 126 ++---
 xen/arch/arm/vgic-v3.c |   3 +-
 xen/arch/arm/vtimer.c  |  59 -
 xen/include/asm-arm/regs.h |   7 +--
 5 files changed, 158 insertions(+), 71 deletions(-)

diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c
index 7e29943..0156755 100644
--- a/xen/arch/arm/io.c
+++ b/xen/arch/arm/io.c
@@ -24,12 +24,19 @@
 #include 
 
 static int handle_read(const struct mmio_handler *handler, struct vcpu *v,
-   mmio_info_t *info, register_t *r)
+   mmio_info_t *info)
 {
 const struct hsr_dabt dabt = info->dabt;
+struct cpu_user_regs *regs = guest_cpu_user_regs();
+/*
+ * Initialize to zero to avoid leaking data if there is an
+ * implementation error in the emulation (such as not correctly
+ * setting r).
+ */
+register_t r = 0;
 uint8_t size = (1 << dabt.size) * 8;
 
-if ( !handler->ops->read(v, info, r, handler->priv) )
+if ( !handler->ops->read(v, info, &r, handler->priv) )
 return 0;
 
 /*
@@ -37,7 +44,7 @@ static int handle_read(const struct mmio_handler *handler, struct vcpu *v,
  * Note that we expect the read handler to have zeroed the bits
  * outside the requested access size.
  */
-if ( dabt.sign && (*r & (1UL << (size - 1))) )
+if ( dabt.sign && (r & (1UL << (size - 1))) )
 {
 /*
  * We are relying on register_t using the same as
@@ -45,21 +52,30 @@ static int handle_read(const struct mmio_handler *handler, struct vcpu *v,
  * code smaller.
  */
 BUILD_BUG_ON(sizeof(register_t) != sizeof(unsigned long));
-*r |= (~0UL) << size;
+r |= (~0UL) << size;
 }
 
+set_user_reg(regs, dabt.reg, r);
+
 return 1;
 }
 
+static int handle_write(const struct mmio_handler *handler, struct vcpu *v,
+mmio_info_t *info)
+{
+const struct hsr_dabt dabt = info->dabt;
+struct cpu_user_regs *regs = gu

[Xen-devel] [for-4.7 3/5] xen/arm: acpi: Fix SMP support when booting with ACPI

2016-04-07 Thread Julien Grall
The variable enabled_cpus is used to know the number of CPUs enabled in
the MADT.

Currently this variable is used to check the validity of the boot CPU.
It will be considered invalid when "enabled_cpus > 1".

However, this condition also means that multiple CPUs are present on the
system. So secondary CPUs will never be brought up.

The correct way to check the validity of the boot CPU is to use the
variable bootcpu_valid.

Signed-off-by: Julien Grall 
---
 xen/arch/arm/acpi/boot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/arm/acpi/boot.c b/xen/arch/arm/acpi/boot.c
index 2a71660..fd29bdc 100644
--- a/xen/arch/arm/acpi/boot.c
+++ b/xen/arch/arm/acpi/boot.c
@@ -149,7 +149,7 @@ void __init acpi_smp_init_cpus(void)
 return;
 }
 
-if ( enabled_cpus > 1 )
+if ( !bootcpu_valid )
 {
 printk("MADT missing boot CPU MPIDR, not enabling secondaries\n");
 return;
-- 
1.9.1




[Xen-devel] [for-4.7 5/5] xen/arm: acpi: Print more error messages in acpi_map_gic_cpu_interface

2016-04-07 Thread Julien Grall
It's helpful to spot any error without having to modify the hypervisor
code.

Signed-off-by: Julien Grall 
---
 xen/arch/arm/acpi/boot.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/xen/arch/arm/acpi/boot.c b/xen/arch/arm/acpi/boot.c
index 602ab39..23285f7 100644
--- a/xen/arch/arm/acpi/boot.c
+++ b/xen/arch/arm/acpi/boot.c
@@ -63,7 +63,10 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
 
 total_cpus++;
 if ( !enabled )
+{
+printk("Skipping disabled CPU entry with 0x%"PRIx64" MPIDR\n", mpidr);
 return;
+}
 
 if ( enabled_cpus >=  NR_CPUS )
 {
@@ -101,7 +104,11 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
 }
 
 if ( !acpi_psci_present() )
+{
+printk("PSCI not present, skipping CPU MPIDR 0x%"PRIx64"\n",
+   mpidr);
 return;
+}
 
 if ( (rc = arch_cpu_init(enabled_cpus, NULL)) < 0 )
 {
-- 
1.9.1




[Xen-devel] [for-4.7 4/5] xen/arm: acpi: Remove unnecessary check in acpi_map_gic_cpu_interface

2016-04-07 Thread Julien Grall
This part of the code will never be executed when the entry
corresponds to the boot CPU.

Also print an error message when arch_cpu_init has failed, rather than
returning silently.

Signed-off-by: Julien Grall 
---
 xen/arch/arm/acpi/boot.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/acpi/boot.c b/xen/arch/arm/acpi/boot.c
index fd29bdc..602ab39 100644
--- a/xen/arch/arm/acpi/boot.c
+++ b/xen/arch/arm/acpi/boot.c
@@ -51,6 +51,7 @@ static void __init
 acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
 {
 int i;
+int rc;
 u64 mpidr = processor->arm_mpidr & MPIDR_HWID_MASK;
 bool_t enabled = !!(processor->flags & ACPI_MADT_ENABLED);
 
@@ -102,16 +103,16 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
 if ( !acpi_psci_present() )
 return;
 
-/* CPU 0 was already initialized */
-if ( enabled_cpus )
+if ( (rc = arch_cpu_init(enabled_cpus, NULL)) < 0 )
 {
-if ( arch_cpu_init(enabled_cpus, NULL) < 0 )
-return;
-
-/* map the logical cpu id to cpu MPIDR */
-cpu_logical_map(enabled_cpus) = mpidr;
+printk("cpu%d: init failed (0x%"PRIx64" MPIDR): %d\n",
+   enabled_cpus, mpidr, rc);
+return;
 }
 
+/* map the logical cpu id to cpu MPIDR */
+cpu_logical_map(enabled_cpus) = mpidr;
+
 enabled_cpus++;
 }
 
-- 
1.9.1




[Xen-devel] [for-4.7 1/5] drivers/pl011: ACPI: The interrupt should always be high level triggered

2016-04-07 Thread Julien Grall
The SPCR does not specify if the interrupt is edge or level triggered.
So the configuration needs to be hardcoded in the code.

Based on the PL011 TRM (see 2.2.8 in ARM DDI 0183G), the interrupt generated
will be active high. This wording implies the interrupt should be high level
triggered. Note that a rising edge triggered interrupt would be described as
"high going edge".

Signed-off-by: Julien Grall 
---
 xen/drivers/char/pl011.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/drivers/char/pl011.c b/xen/drivers/char/pl011.c
index fa22edf..88d8488 100644
--- a/xen/drivers/char/pl011.c
+++ b/xen/drivers/char/pl011.c
@@ -327,7 +327,7 @@ static int __init pl011_acpi_uart_init(const void *data)
 }
 
 /* trigger/polarity information is not available in spcr */
-irq_set_type(spcr->interrupt, IRQ_TYPE_EDGE_BOTH);
+irq_set_type(spcr->interrupt, IRQ_TYPE_LEVEL_MASK);
 
 res = pl011_uart_init(spcr->interrupt, spcr->serial_port.address,
   PAGE_SIZE);
-- 
1.9.1




[Xen-devel] [for-4.7 2/5] xen/arm: acpi: The boot CPU does not always match the first entry in the MADT

2016-04-07 Thread Julien Grall
Since the ACPI 6.0 errata document [1], the first entry in the MADT
does not have to correspond to the boot CPU.

Introduce a new variable to know if a MADT entry matching the boot CPU
is found. Furthermore, it's not necessary to check if the MPIDR is
duplicated for the boot CPU. So the rest of the function can be skipped.
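
The idea can be sketched outside of Xen as follows. This is only an illustration under simplifying assumptions: struct gicc_entry and scan_madt are hypothetical, and the real code processes one entry per callback rather than an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a MADT GICC entry. */
struct gicc_entry {
    uint64_t mpidr;
    bool enabled;
};

/*
 * Scan all entries. Sets *bootcpu_valid if an enabled entry matches
 * boot_mpidr, wherever it appears, and returns the number of enabled
 * secondary CPUs. A duplicate boot CPU entry is simply skipped.
 */
static unsigned int scan_madt(const struct gicc_entry *e, size_t n,
                              uint64_t boot_mpidr, bool *bootcpu_valid)
{
    unsigned int secondaries = 0;
    size_t i;

    assert(bootcpu_valid != NULL);
    *bootcpu_valid = false;

    for (i = 0; i < n; i++) {
        if (!e[i].enabled)
            continue;                /* disabled entries are skipped */
        if (e[i].mpidr == boot_mpidr) {
            *bootcpu_valid = true;   /* may appear at any position */
            continue;
        }
        secondaries++;
    }
    return secondaries;
}
```

Tracking the boot CPU with its own flag keeps "was the boot CPU found" separate from "how many CPUs are enabled", which a single counter cannot express.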

[1] 1380 Unnecessary restrictions to FW vendors in ordering of GIC structures
in MADT

Signed-off-by: Julien Grall 
---
 xen/arch/arm/acpi/boot.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/acpi/boot.c b/xen/arch/arm/acpi/boot.c
index 859aa86..2a71660 100644
--- a/xen/arch/arm/acpi/boot.c
+++ b/xen/arch/arm/acpi/boot.c
@@ -37,7 +37,8 @@
 #include 
 
 /* Processors with enabled flag and sane MPIDR */
-static unsigned int enabled_cpus;
+static unsigned int enabled_cpus = 1;
+static bool __initdata bootcpu_valid;
 
 /* total number of cpus in this system */
 static unsigned int __initdata total_cpus;
@@ -71,10 +72,15 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
 }
 
 /* Check if GICC structure of boot CPU is available in the MADT */
-if ( (enabled_cpus == 0) && (cpu_logical_map(0) != mpidr) )
+if ( cpu_logical_map(0) == mpidr )
 {
-printk("Firmware bug, invalid CPU MPIDR for cpu0: 0x%"PRIx64" in MADT\n",
-   mpidr);
+if ( bootcpu_valid )
+{
+printk("Firmware bug, duplicate boot CPU MPIDR: 0x%"PRIx64" in MADT\n",
+   mpidr);
+return;
+}
+bootcpu_valid = true;
 return;
 }
 
-- 
1.9.1




[Xen-devel] [for-4.7 0/5] xen/arm: acpi: Bunch of fixes to use ACPI with SMP and PL011

2016-04-07 Thread Julien Grall
Hello,

This patch series fixes secondary bring up and the use of the PL011 UART driver
when Xen boots using ACPI.

Regards,

Cc: wei.l...@citrix.com

Julien Grall (5):
  drivers/pl011: ACPI: The interrupt should always be high level
triggered
  xen/arm: acpi: The boot CPU does not always match the first entry in
the MADT
  xen/arm: acpi: Fix SMP support when booting with ACPI
  xen/arm: acpi: Remove unnecessary check in acpi_map_gic_cpu_interface
  xen/arm: acpi: Print more error messages in acpi_map_gic_cpu_interface

 xen/arch/arm/acpi/boot.c | 38 ++
 xen/drivers/char/pl011.c |  2 +-
 2 files changed, 27 insertions(+), 13 deletions(-)

-- 
1.9.1




[Xen-devel] [PATCH] bind_usbintf: do not reuse 'path'

2016-04-07 Thread Chunyan Liu
To avoid confusion, add a new variable "intf_path" for the
driver/interface path, and let "path" refer to the driver/bind path only.

CID: 1358111

Signed-off-by: Chunyan Liu 
CC: Simon Cao 
CC: George Dunlap 
CC: Ian Jackson 
---
 tools/libxl/libxl_pvusb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl_pvusb.c b/tools/libxl/libxl_pvusb.c
index 6447639..8cf3ddc 100644
--- a/tools/libxl/libxl_pvusb.c
+++ b/tools/libxl/libxl_pvusb.c
@@ -1035,18 +1035,18 @@ out:
 
 static int bind_usbintf(libxl__gc *gc, const char *intf, const char *drvpath)
 {
-char *path;
+char *path, *intf_path;
 struct stat st;
 int fd = -1;
 int rc, r;
 
-path = GCSPRINTF("%s/%s", drvpath, intf);
+intf_path = GCSPRINTF("%s/%s", drvpath, intf);
 
 /* check through lstat, if intf already exists under drvpath,
  * it's already bound, return directly; if it doesn't exist,
  * continue to do bind work; otherwise, return error.
  */
-r = lstat(path, &st);
+r = lstat(intf_path, &st);
 if (r == 0)
 return 0;
 if (r < 0 && errno != ENOENT)
-- 
2.1.4




Re: [Xen-devel] [PATCH v2 01/11] xen: sched: make implementing .alloc_pdata optional

2016-04-07 Thread George Dunlap
On 06/04/16 18:22, Dario Faggioli wrote:
> The .alloc_pdata scheduler hook must, before this change,
> be implemented by all schedulers --even those ones that
> don't need to allocate anything.
> 
> Make it possible to just use the SCHED_OP(), like for
> the other hooks, by using ERR_PTR() and IS_ERR() for
> error reporting. This:
>  - makes NULL a variant of success;
>  - allows for errors other than ENOMEM to be properly
>communicated (if ever necessary).
> 
> This, in turn, means that schedulers not needing to
> allocate any per-pCPU data, can avoid implementing the
> hook. In fact, the artificial implementation of
> .alloc_pdata in the ARINC653 is removed (and, while there,
> nuke .free_pdata too, as it is equally useless).
> 
> Signed-off-by: Dario Faggioli 
> Reviewed-by: Meng Xu 

Acked-by: George Dunlap 
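
The error-reporting convention described in the quoted commit message can be sketched as follows. This is a minimal illustration only: the ERR_PTR(), PTR_ERR() and IS_ERR() definitions are simplified stand-ins for the helpers in xen/err.h, and demo_alloc_pdata() and demo_init_pcpu() are hypothetical names, not functions from the patch.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins for the xen/err.h helpers (assumption: errors are
 * encoded in the top MAX_ERRNO values of the address space). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)(-(long)MAX_ERRNO);
}

/* Hypothetical hook: mode 0 = nothing to allocate, 1 = allocate, 2 = fail. */
static void *demo_alloc_pdata(int mode)
{
    void *p;

    if ( mode == 0 )
        return NULL;                /* success: no per-pCPU data needed */
    if ( mode == 2 )
        return ERR_PTR(-ENOMEM);    /* failure, carrying an errno value */

    p = calloc(1, 16);
    return p ? p : ERR_PTR(-ENOMEM);
}

/* Hypothetical caller: store the result only when it is not an error, so
 * that free() can always be called safely on *sched_priv later. */
static int demo_init_pcpu(void **sched_priv, int mode)
{
    void *pdata = demo_alloc_pdata(mode);

    if ( IS_ERR(pdata) )
        return (int)PTR_ERR(pdata);

    *sched_priv = pdata;            /* may be NULL, which is still success */
    return 0;
}
```

With this convention a scheduler that needs no per-pCPU data can return NULL (or not implement the hook at all), while a real failure is distinguishable and can carry any errno value.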

> ---
> Cc: George Dunlap 
> Cc: Robert VanVossen 
> Cc: Josh Whitehead 
> Cc: Jan Beulich 
> Cc: Juergen Gross 
> ---
> Changes from v1:
>  * only update sd->sched_priv if alloc_pdata does not return
>IS_ERR, so that xfree() can always be safely called on
>sd->sched_priv itself, as requested during review;
>  * xen/err.h included in .c files that actually need it,
>instead of in sched-if.h.
> ---
>  xen/common/sched_arinc653.c |   31 ---
>  xen/common/sched_credit.c   |5 +++--
>  xen/common/sched_credit2.c  |2 +-
>  xen/common/sched_rt.c   |8 
>  xen/common/schedule.c   |   27 +--
>  5 files changed, 25 insertions(+), 48 deletions(-)
> 
> diff --git a/xen/common/sched_arinc653.c b/xen/common/sched_arinc653.c
> index 8a11a2f..b79fcdf 100644
> --- a/xen/common/sched_arinc653.c
> +++ b/xen/common/sched_arinc653.c
> @@ -456,34 +456,6 @@ a653sched_free_vdata(const struct scheduler *ops, void *priv)
>  }
>  
>  /**
> - * This function allocates scheduler-specific data for a physical CPU
> - *
> - * We do not actually make use of any per-CPU data but the hypervisor expects
> - * a non-NULL return value
> - *
> - * @param ops   Pointer to this instance of the scheduler structure
> - *
> - * @return  Pointer to the allocated data
> - */
> -static void *
> -a653sched_alloc_pdata(const struct scheduler *ops, int cpu)
> -{
> -/* return a non-NULL value to keep schedule.c happy */
> -return SCHED_PRIV(ops);
> -}
> -
> -/**
> - * This function frees scheduler-specific data for a physical CPU
> - *
> - * @param ops   Pointer to this instance of the scheduler structure
> - */
> -static void
> -a653sched_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
> -{
> -/* nop */
> -}
> -
> -/**
>   * This function allocates scheduler-specific data for a domain
>   *
>   * We do not actually make use of any per-domain data but the hypervisor
> @@ -737,9 +709,6 @@ static const struct scheduler sched_arinc653_def = {
>  .free_vdata = a653sched_free_vdata,
>  .alloc_vdata= a653sched_alloc_vdata,
>  
> -.free_pdata = a653sched_free_pdata,
> -.alloc_pdata= a653sched_alloc_pdata,
> -
>  .free_domdata   = a653sched_free_domdata,
>  .alloc_domdata  = a653sched_alloc_domdata,
>  
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index 4c4927f..63a4a63 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -23,6 +23,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  
>  /*
> @@ -532,12 +533,12 @@ csched_alloc_pdata(const struct scheduler *ops, int cpu)
>  /* Allocate per-PCPU info */
>  spc = xzalloc(struct csched_pcpu);
>  if ( spc == NULL )
> -return NULL;
> +return ERR_PTR(-ENOMEM);
>  
>  if ( !alloc_cpumask_var(&spc->balance_mask) )
>  {
>  xfree(spc);
> -return NULL;
> +return ERR_PTR(-ENOMEM);
>  }
>  
>  spin_lock_irqsave(&prv->lock, flags);
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index b8c8e40..e97d8be 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -2047,7 +2047,7 @@ csched2_alloc_pdata(const struct scheduler *ops, int cpu)
>  printk("%s: cpu %d not online yet, deferring initializatgion\n",
> __func__, cpu);
>  
> -return (void *)1;
> +return NULL;
>  }
>  
>  static void
> diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> index 321b0a5..aece318 100644
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  /*
> @@ -681,7 +682,7 @@ rt_alloc_pdata(const struct scheduler *ops, int cpu)
>  spin_unlock_irqrestore(old_lock, flags);
>  
>  if ( !alloc_cpumask_var(&_cpumask_scratch[cpu]) )
> -return NULL;
> +return ERR_PTR(-ENOMEM);
>  
>  if ( prv->repl_timer == NULL )
>  {
> @@ -689,13 +690,12 @@ rt_alloc_pdata(const struct scheduler *ops, int cpu)
>  prv->repl_timer = xzalloc(stru

Re: [Xen-devel] [PATCH v10 06/17] Xen: ARM: Add support for mapping platform device mmio

2016-04-07 Thread Shannon Zhao


On 2016/4/7 18:32, Julien Grall wrote:
> Hi Shannon,
> 
> On 07/04/16 02:37, Shannon Zhao wrote:
>>
>>
>> On 2016/4/6 20:16, Julien Grall wrote:
>>>> +gpfns[j] = XEN_PFN_DOWN(r->start) + j;
>>>> +idxs[j] = XEN_PFN_DOWN(r->start) + j;
>>>> +}
>>>> +
>>>> +xatp.domid = DOMID_SELF;
>>>> +xatp.size = nr;
>>>> +xatp.space = XENMAPSPACE_dev_mmio;
>>>> +
>>>> +set_xen_guest_handle(xatp.gpfns, gpfns);
>>>> +set_xen_guest_handle(xatp.idxs, idxs);
>>>> +set_xen_guest_handle(xatp.errs, errs);
>>>> +
>>>> +rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap_range, &xatp);
>>>> +kfree(gpfns);
>>>> +kfree(idxs);
>>>> +kfree(errs);
>>>> +if (rc)
>>>> +return rc;
>>>
>>> Shouldn't we redo the mapping if the hypercall fails?
>> Hmm, why? If it fails again when we redo the mapping, what should we do
>> then? Redo again?
> 
> Because the device MMIO region is left half mapped in DOM0 address space.
> 
> After having another look at your patch, if an error occurs, the
> notifier will still return NOTIFY_OK. This will lead to random data
> aborts when the driver accesses the MMIO regions, as the device will
> be considered fully functional.
> 
> However, even if the notifier returns NOTIFY_BAD, the function device_add
> doesn't care about the return value of blocking_notifier_call_chain. I
> think this needs to be fixed.
> 
>> I think if it fails at the first time it will always fail no matter how
>> many times we do.
> 
> I was speaking about the mappings that succeeded. They will unlikely
> fail during removal. If they ever fail you can just ignore the error.
Ok, I see. I thought you meant that it needs to map the regions again.
But what you really mean is undoing the mappings.

Thanks,
-- 
Shannon
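
The "undo the mappings that succeeded" approach agreed on above can be sketched as below. This is an illustration under stated assumptions, not the patch's code: it maps one page at a time through hypothetical map/unmap callbacks rather than the batched XENMEM_add_to_physmap_range hypercall, and the demo_* names exist only to make the example self-contained.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real map and unmap operations. */
typedef int (*map_fn)(unsigned long pfn);
typedef void (*unmap_fn)(unsigned long pfn);

/* Toy backend: records which pfns are mapped, fails on demo_fail_pfn. */
static int demo_mapped[8];
static unsigned long demo_fail_pfn = 3;

static int demo_map(unsigned long pfn)
{
    if (pfn == demo_fail_pfn)
        return -1;
    demo_mapped[pfn] = 1;
    return 0;
}

static void demo_unmap(unsigned long pfn)
{
    demo_mapped[pfn] = 0;
}

/*
 * Map every pfn in the array. On the first failure, unmap the entries that
 * were already mapped (errors during removal are ignored) and return the
 * original error, so the region is never left half mapped.
 */
static int map_range_or_undo(const unsigned long *pfns, size_t nr,
                             map_fn map, unmap_fn unmap)
{
    size_t i;

    for (i = 0; i < nr; i++) {
        int rc = map(pfns[i]);

        if (rc) {
            while (i-- > 0)
                unmap(pfns[i]);
            return rc;
        }
    }

    return 0;
}
```

The caller then only needs to propagate the error; whether the notifier chain acts on it is the separate device_add issue raised above.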




Re: [Xen-devel] [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information

2016-04-07 Thread Andrew Cooper
On 07/04/16 01:56, Jan Beulich wrote:
 Andrew Cooper  04/07/16 2:40 AM >>>
>> On 07/04/2016 01:16, Jan Beulich wrote:
>> Andrew Cooper  04/05/16 7:49 PM >>>
>>>> There is no possible way of avoiding having a whitelist somewhere, which
>>>> limits what Xen will tolerate supporting for the guest.
>>> Right, but preferably in exactly one place. And imo that ought to be
>>> info->xfeature_mask.
>> info->xfeature_mask is actually Xen's limit, as obtained from
>> XEN_DOMCTL_getvcpuextstate, so is an authoritative source of "the
>> maximum Xen will support".
>>
>> However, the guest_xfeature_mask must be generated and used as this
>> patch.  Without it, a domU will break if it migrates from a more capable
>> xstate host to a less capable host, as using info->xfeature_mask alone
>> leaks in state which should be levelled out.
>>
>> Currently upstream, heterogeneous migration of domains using xsave is
>> broken if the domain first boots on the more-capable host.
> I don't follow, I'm afraid: To me this looks like two separate things.
> One is to suitably level the guest (via its config file), and the other
> is to not allow it to use things the host doesn't support. If you want
> the guest to be migratable to a less capable host, you need to configure
> the guest accordingly instead of relying on a second instance of white
> listing.

Agreed, on all points.

But I assert that my change moves the code from being broken to working,
per the above description.

I have reworded several bits for v5 - perhaps that will make the patch
more clear.

~Andrew



[Xen-devel] [PATCH v5 02/21] xen/x86: Calculate maximum host and guest featuresets

2016-04-07 Thread Andrew Cooper
All of this information will be used by the toolstack to make informed
levelling decisions for VMs, and by Xen to sanity check toolstack-provided
information.

The split between the shadow and hap HVM masks is necessary due to the lack of
a "get cpuid policy" hypercall.  Multi-host toolstacks (i.e. not libxl)
dealing with hap and non-hap capable hosts need to be able to calculate that
migrating a shadow guest is safe.

Future planned development work will implement proper cpuid policy handling in
Xen, including a "get policy" hypercall, but until then, the difference is
made available for toolstack use via a non-stable interface.
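
As a rough illustration of how a toolstack might use such featuresets for levelling, the sketch below intersects the featuresets of two hosts and checks whether a guest's featureset fits a given host. The function names and the DEMO_FSCAPINTS length are hypothetical; only the idea of a featureset as an array of uint32_t words comes from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_FSCAPINTS 4 /* hypothetical length; the real one is FSCAPINTS */

/* The common featureset of two hosts is the bitwise AND of their words. */
static void level_featuresets(const uint32_t *a, const uint32_t *b,
                              uint32_t *out, size_t n)
{
    size_t i;

    for ( i = 0; i < n; ++i )
        out[i] = a[i] & b[i];
}

/* A guest can run on a host if it uses no feature the host lacks. */
static int featureset_is_subset(const uint32_t *guest, const uint32_t *host,
                                size_t n)
{
    size_t i;

    for ( i = 0; i < n; ++i )
        if ( guest[i] & ~host[i] )
            return 0;

    return 1;
}
```

A guest configured with the intersection can migrate in either direction, whereas one booted with the full featureset of the more capable host cannot safely move to the less capable one.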

Signed-off-by: Andrew Cooper 
Reviewed-by: Konrad Rzeszutek Wilk 
---
CC: Jan Beulich 

v3:
 * Move as much as possible into .init.
 * Fix the handling of the shared bits for the cross-vendor case.
 * Fix extended check.
v4:
 * Fix copy&paste error in calculate_hvm_featureset()
v5:
 * Expand commit message, explaining about the shadow and hap masks.
---
 xen/arch/x86/cpuid.c| 162 
 xen/arch/x86/setup.c|   3 +
 xen/include/asm-x86/cpuid.h |  17 +
 3 files changed, 182 insertions(+)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 77e008a..41439f8 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -1,14 +1,176 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 
 const uint32_t known_features[] = INIT_KNOWN_FEATURES;
 const uint32_t special_features[] = INIT_SPECIAL_FEATURES;
 
+static const uint32_t __initconst pv_featuremask[] = INIT_PV_FEATURES;
+static const uint32_t __initconst hvm_shadow_featuremask[] = INIT_HVM_SHADOW_FEATURES;
+static const uint32_t __initconst hvm_hap_featuremask[] = INIT_HVM_HAP_FEATURES;
+
+uint32_t __read_mostly raw_featureset[FSCAPINTS];
+uint32_t __read_mostly pv_featureset[FSCAPINTS];
+uint32_t __read_mostly hvm_featureset[FSCAPINTS];
+
+static void __init sanitise_featureset(uint32_t *fs)
+{
+unsigned int i;
+
+for ( i = 0; i < FSCAPINTS; ++i )
+{
+/* Clamp to known mask. */
+fs[i] &= known_features[i];
+}
+
+/*
+ * Sort out shared bits.  We are constructing a featureset which needs to
+ * be applicable to a cross-vendor case.  Intel strictly clears the common
+ * bits in e1d, while AMD strictly duplicates them.
+ *
+ * We duplicate them here to be compatible with AMD while on Intel, and
+ * rely on logic closer to the guest to make the featureset stricter if
+ * emulating Intel.
+ */
+fs[FEATURESET_e1d] = ((fs[FEATURESET_1d]  &  CPUID_COMMON_1D_FEATURES) |
+  (fs[FEATURESET_e1d] & ~CPUID_COMMON_1D_FEATURES));
+}
+
+static void __init calculate_raw_featureset(void)
+{
+unsigned int max, tmp;
+
+max = cpuid_eax(0);
+
+if ( max >= 1 )
+cpuid(0x1, &tmp, &tmp,
+  &raw_featureset[FEATURESET_1c],
+  &raw_featureset[FEATURESET_1d]);
+if ( max >= 7 )
+cpuid_count(0x7, 0, &tmp,
+&raw_featureset[FEATURESET_7b0],
+&raw_featureset[FEATURESET_7c0],
+&tmp);
+if ( max >= 0xd )
+cpuid_count(0xd, 1,
+&raw_featureset[FEATURESET_Da1],
+&tmp, &tmp, &tmp);
+
+max = cpuid_eax(0x80000000);
+if ( (max >> 16) != 0x8000 )
+return;
+
+if ( max >= 0x80000001 )
+cpuid(0x80000001, &tmp, &tmp,
+  &raw_featureset[FEATURESET_e1c],
+  &raw_featureset[FEATURESET_e1d]);
+if ( max >= 0x80000007 )
+cpuid(0x80000007, &tmp, &tmp, &tmp,
+  &raw_featureset[FEATURESET_e7d]);
+if ( max >= 0x80000008 )
+cpuid(0x80000008, &tmp,
+  &raw_featureset[FEATURESET_e8b],
+  &tmp, &tmp);
+}
+
+static void __init calculate_pv_featureset(void)
+{
+unsigned int i;
+
+for ( i = 0; i < FSCAPINTS; ++i )
+pv_featureset[i] = host_featureset[i] & pv_featuremask[i];
+
+/* Unconditionally claim to be able to set the hypervisor bit. */
+__set_bit(X86_FEATURE_HYPERVISOR, pv_featureset);
+
+/*
+ * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
+ * affect how to interpret topology information in other cpuid leaves.
+ */
+__set_bit(X86_FEATURE_HTT, pv_featureset);
+__set_bit(X86_FEATURE_X2APIC, pv_featureset);
+__set_bit(X86_FEATURE_CMP_LEGACY, pv_featureset);
+
+sanitise_featureset(pv_featureset);
+}
+
+static void __init calculate_hvm_featureset(void)
+{
+unsigned int i;
+const uint32_t *hvm_featuremask;
+
+if ( !hvm_enabled )
+return;
+
+hvm_featuremask = hvm_funcs.hap_supported ?
+hvm_hap_featuremask : hvm_shadow_featuremask;
+
+for ( i = 0; i < FSCAPINTS; ++i )
+hvm_featureset[i] = host_featureset[i] & hvm_featuremask[i];
+
+/* Unconditionally claim to be able to set the hypervisor bit. */
+__set_bit(X86_FEATURE

[Xen-devel] [PATCH v5 01/21] xen/x86: Annotate VM applicability in featureset

2016-04-07 Thread Andrew Cooper
Use attributes to specify whether a feature is applicable to be exposed to:
 1) All guests
 2) HVM guests
 3) HVM HAP guests
and, via absence of an attribute, to no guests.

There is no current need for other categories (e.g. PV-only features), and
such categories should not be introduced if possible.  These categories follow
from the fact that, with increased hardware support, a guest gets more
features to use.

These settings are derived from the existing code in {pv,hvm}_cpuid(), and
xc_cpuid_x86.c.  One notable exception is EXTAPIC which was previously
erroneously exposed to guests.  PV guests don't get to use the APIC and the
HVM APIC emulation doesn't support extended space.
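
The nesting of the three categories can be sketched as follows: every feature visible to all guests is also visible to HVM guests, and every HVM feature is also visible to HVM HAP guests. The patch derives its masks in gen-cpuid.py; the C below is only an assumed illustration of the same rule, with hypothetical type and function names and a single 32-bit word.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding of the 'A', 'S' and 'H' attribute characters. */
enum demo_attr { ATTR_NONE, ATTR_A, ATTR_S, ATTR_H };

struct demo_feature {
    unsigned int bit;       /* bit position within one 32-bit word */
    enum demo_attr attr;
};

struct demo_masks {
    uint32_t pv;            /* features marked 'A' */
    uint32_t hvm_shadow;    /* features marked 'A' or 'S' */
    uint32_t hvm_hap;       /* features marked 'A', 'S' or 'H' */
};

static struct demo_masks build_masks(const struct demo_feature *f, size_t n)
{
    struct demo_masks m = { 0, 0, 0 };
    size_t i;

    for ( i = 0; i < n; ++i )
    {
        uint32_t b = 1u << (f[i].bit & 31);

        if ( f[i].attr == ATTR_A )
            m.pv |= b;
        if ( f[i].attr == ATTR_A || f[i].attr == ATTR_S )
            m.hvm_shadow |= b;
        if ( f[i].attr != ATTR_NONE )
            m.hvm_hap |= b;
    }

    return m;
}
```

A feature with no attribute ends up in none of the masks, which matches the "no guests" case in the commit message.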

Signed-off-by: Andrew Cooper 
Reviewed-by: Konrad Rzeszutek Wilk 
---
CC: Jan Beulich 

v2:
 * Annotate features using a magic comment and autogeneration.
v3:
 * Rebase over the new namespacing changes.
 * Expand commit message.
 * Correct PSE36 to being a HAP-only feature.
v4:
 * Re-break PSE36.
 * Hide LWP from PV guests.
v5:
 * Explicitly identify that attributes are not part of the public API.
---
 xen/include/public/arch-x86/cpufeatureset.h | 190 ++--
 xen/tools/gen-cpuid.py  |  32 -
 2 files changed, 128 insertions(+), 94 deletions(-)

diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 8308972..edd9975 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -75,141 +75,147 @@ enum {
  * Attribute syntax:
  *
  * Attributes for a particular feature are provided as characters before the
- * first space in the comment immediately following the feature value.
+ * first space in the comment immediately following the feature value.  Note -
+ * none of these attributes form part of the Xen public ABI.
  *
  * Special: '!'
  *   This bit has special properties and is not a straight indication of a
  *   piece of new functionality.  Xen will handle these differently,
  *   and may override toolstack settings completely.
+ *
+ * Applicability to guests: 'A', 'S' or 'H'
+ *   'A' = All guests.
+ *   'S' = All HVM guests (not PV guests).
+ *   'H' = HVM HAP guests (not PV or HVM Shadow guests).
  */
 
 /* Intel-defined CPU features, CPUID level 0x00000001.edx, word 0 */
-XEN_CPUFEATURE(FPU,   0*32+ 0) /*   Onboard FPU */
-XEN_CPUFEATURE(VME,   0*32+ 1) /*   Virtual Mode Extensions */
-XEN_CPUFEATURE(DE,0*32+ 2) /*   Debugging Extensions */
-XEN_CPUFEATURE(PSE,   0*32+ 3) /*   Page Size Extensions */
-XEN_CPUFEATURE(TSC,   0*32+ 4) /*   Time Stamp Counter */
-XEN_CPUFEATURE(MSR,   0*32+ 5) /*   Model-Specific Registers, RDMSR, WRMSR */
-XEN_CPUFEATURE(PAE,   0*32+ 6) /*   Physical Address Extensions */
-XEN_CPUFEATURE(MCE,   0*32+ 7) /*   Machine Check Architecture */
-XEN_CPUFEATURE(CX8,   0*32+ 8) /*   CMPXCHG8 instruction */
-XEN_CPUFEATURE(APIC,  0*32+ 9) /*!  Onboard APIC */
-XEN_CPUFEATURE(SEP,   0*32+11) /*   SYSENTER/SYSEXIT */
-XEN_CPUFEATURE(MTRR,  0*32+12) /*   Memory Type Range Registers */
-XEN_CPUFEATURE(PGE,   0*32+13) /*   Page Global Enable */
-XEN_CPUFEATURE(MCA,   0*32+14) /*   Machine Check Architecture */
-XEN_CPUFEATURE(CMOV,  0*32+15) /*   CMOV instruction (FCMOVCC and FCOMI too if FPU present) */
-XEN_CPUFEATURE(PAT,   0*32+16) /*   Page Attribute Table */
-XEN_CPUFEATURE(PSE36, 0*32+17) /*   36-bit PSEs */
-XEN_CPUFEATURE(CLFLUSH,   0*32+19) /*   CLFLUSH instruction */
+XEN_CPUFEATURE(FPU,   0*32+ 0) /*A  Onboard FPU */
+XEN_CPUFEATURE(VME,   0*32+ 1) /*S  Virtual Mode Extensions */
+XEN_CPUFEATURE(DE,0*32+ 2) /*A  Debugging Extensions */
+XEN_CPUFEATURE(PSE,   0*32+ 3) /*S  Page Size Extensions */
+XEN_CPUFEATURE(TSC,   0*32+ 4) /*A  Time Stamp Counter */
+XEN_CPUFEATURE(MSR,   0*32+ 5) /*A  Model-Specific Registers, RDMSR, WRMSR */
+XEN_CPUFEATURE(PAE,   0*32+ 6) /*A  Physical Address Extensions */
+XEN_CPUFEATURE(MCE,   0*32+ 7) /*A  Machine Check Architecture */
+XEN_CPUFEATURE(CX8,   0*32+ 8) /*A  CMPXCHG8 instruction */
+XEN_CPUFEATURE(APIC,  0*32+ 9) /*!A Onboard APIC */
+XEN_CPUFEATURE(SEP,   0*32+11) /*A  SYSENTER/SYSEXIT */
+XEN_CPUFEATURE(MTRR,  0*32+12) /*S  Memory Type Range Registers */
+XEN_CPUFEATURE(PGE,   0*32+13) /*S  Page Global Enable */
+XEN_CPUFEATURE(MCA,   0*32+14) /*A  Machine Check Architecture */
+XEN_CPUFEATURE(CMOV,  0*32+15) /*A  CMOV instruction (FCMOVCC and FCOMI too if FPU present) */
+XEN_CPUFEATURE(PAT,   0*32+16) /*A  Page Attribute Table */
+XEN_CPUFEATURE(PSE36, 0*32+17) /*S  36-bit PSEs */
+XEN_CPUFEATURE(CLFLUSH,   0*32+19) /*A  CLFLUSH instruction */
 XEN_CPUFEATURE(DS,0*32+21) /*   Debug Store */
-XEN_CPUFEATURE

[Xen-devel] [PATCH v5 03/21] xen/x86: Generate deep dependencies of features

2016-04-07 Thread Andrew Cooper
Some features depend on other features.  Working out and maintaining the exact
dependency tree is complicated, so it is expressed in the automatic generation
script.

At runtime, Xen needs to disable all features which are dependent on a
feature being disabled.  Because of the flattening performed at compile time,
runtime can use a single mask to disable all eventual features.
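
The flattening idea can be sketched as below: direct dependencies are expanded once into their transitive closure, after which disabling a feature is a single mask operation. The patch does this in gen-cpuid.py over multi-word featuresets; this C version, limited to DEMO_NR features in one 32-bit word and using hypothetical names, is only an assumed illustration.

```c
#include <stdint.h>

#define DEMO_NR 8 /* hypothetical number of features in this toy example */

/*
 * direct[i] has bit j set when feature j directly depends on feature i.
 * deep[i] receives every feature that depends on i, directly or not.
 */
static void flatten_deps(const uint32_t direct[DEMO_NR],
                         uint32_t deep[DEMO_NR])
{
    unsigned int i, j;
    int changed;

    for ( i = 0; i < DEMO_NR; ++i )
        deep[i] = direct[i];

    /* Repeat until no set grows any further (a fixed point). */
    do {
        changed = 0;
        for ( i = 0; i < DEMO_NR; ++i )
            for ( j = 0; j < DEMO_NR; ++j )
                if ( (deep[i] >> j) & 1 )
                {
                    uint32_t merged = deep[i] | deep[j];

                    if ( merged != deep[i] )
                    {
                        deep[i] = merged;
                        changed = 1;
                    }
                }
    } while ( changed );
}

/* Disabling a feature clears it and all of its eventual dependents. */
static uint32_t disable_feature(uint32_t fs, unsigned int feature,
                                const uint32_t deep[DEMO_NR])
{
    return fs & ~((1u << feature) | deep[feature]);
}
```

For a chain where feature 1 depends on feature 0 and feature 2 depends on feature 1, deep[0] covers both 1 and 2, so clearing feature 0 removes all three bits in one step.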

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

v2:
 * New.
v3:
 * Vastly more research and comments.
v4:
 * Expand commit message.
 * More tweaks to the dependency tree.
 * Avoid for_each_set_bit() walking off the end of disabled_features[].
   Expanding disabled_features[] turns out to be far more simple than
   attempting to opencode for_each_set_bit()
v5:
 * Further tweaking of the SSE* dependencies.
---
 xen/arch/x86/cpuid.c|  56 
 xen/include/asm-x86/cpuid.h |   2 +
 xen/tools/gen-cpuid.py  | 153 +++-
 3 files changed, 210 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 41439f8..e1e0e44 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -11,6 +11,7 @@ const uint32_t special_features[] = INIT_SPECIAL_FEATURES;
 static const uint32_t __initconst pv_featuremask[] = INIT_PV_FEATURES;
 static const uint32_t __initconst hvm_shadow_featuremask[] = INIT_HVM_SHADOW_FEATURES;
 static const uint32_t __initconst hvm_hap_featuremask[] = INIT_HVM_HAP_FEATURES;
+static const uint32_t __initconst deep_features[] = INIT_DEEP_FEATURES;
 
 uint32_t __read_mostly raw_featureset[FSCAPINTS];
 uint32_t __read_mostly pv_featureset[FSCAPINTS];
@@ -18,12 +19,36 @@ uint32_t __read_mostly hvm_featureset[FSCAPINTS];
 
 static void __init sanitise_featureset(uint32_t *fs)
 {
+/* for_each_set_bit() uses unsigned longs.  Extend with zeroes. */
+uint32_t disabled_features[
+ROUNDUP(FSCAPINTS, sizeof(unsigned long)/sizeof(uint32_t))] = {};
 unsigned int i;
 
 for ( i = 0; i < FSCAPINTS; ++i )
 {
 /* Clamp to known mask. */
 fs[i] &= known_features[i];
+
+/*
+ * Identify which features with deep dependencies have been
+ * disabled.
+ */
+disabled_features[i] = ~fs[i] & deep_features[i];
+}
+
+for_each_set_bit(i, (void *)disabled_features,
+ sizeof(disabled_features) * 8)
+{
+const uint32_t *dfs = lookup_deep_deps(i);
+unsigned int j;
+
+ASSERT(dfs); /* deep_features[] should guarantee this. */
+
+for ( j = 0; j < FSCAPINTS; ++j )
+{
+fs[j] &= ~dfs[j];
+disabled_features[j] &= ~dfs[j];
+}
 }
 
 /*
@@ -164,6 +189,36 @@ void __init calculate_featuresets(void)
 calculate_hvm_featureset();
 }
 
+const uint32_t * __init lookup_deep_deps(uint32_t feature)
+{
+static const struct {
+uint32_t feature;
+uint32_t fs[FSCAPINTS];
+} deep_deps[] __initconst = INIT_DEEP_DEPS;
+unsigned int start = 0, end = ARRAY_SIZE(deep_deps);
+
+BUILD_BUG_ON(ARRAY_SIZE(deep_deps) != NR_DEEP_DEPS);
+
+/* Fast early exit. */
+if ( !test_bit(feature, deep_features) )
+return NULL;
+
+/* deep_deps[] is sorted.  Perform a binary search. */
+while ( start < end )
+{
+unsigned int mid = start + ((end - start) / 2);
+
+if ( deep_deps[mid].feature > feature )
+end = mid;
+else if ( deep_deps[mid].feature < feature )
+start = mid + 1;
+else
+return deep_deps[mid].fs;
+}
+
+return NULL;
+}
+
 static void __init __maybe_unused build_assertions(void)
 {
 BUILD_BUG_ON(ARRAY_SIZE(known_features) != FSCAPINTS);
@@ -171,6 +226,7 @@ static void __init __maybe_unused build_assertions(void)
 BUILD_BUG_ON(ARRAY_SIZE(pv_featuremask) != FSCAPINTS);
 BUILD_BUG_ON(ARRAY_SIZE(hvm_shadow_featuremask) != FSCAPINTS);
 BUILD_BUG_ON(ARRAY_SIZE(hvm_hap_featuremask) != FSCAPINTS);
+BUILD_BUG_ON(ARRAY_SIZE(deep_features) != FSCAPINTS);
 }
 
 /*
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 5041bcd..4725672 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -29,6 +29,8 @@ extern uint32_t hvm_featureset[FSCAPINTS];
 
 void calculate_featuresets(void);
 
+const uint32_t *lookup_deep_deps(uint32_t feature);
+
 #endif /* __ASSEMBLY__ */
 #endif /* !__X86_CPUID_H__ */
 
diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
index f971ab2..57533de 100755
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -144,6 +144,141 @@ def crunch_numbers(state):
 state.hvm_shadow = featureset_to_uint32s(state.raw_hvm_shadow, nr_entries)
 state.hvm_hap = featureset_to_uint32s(state.raw_hvm_hap, nr_entries)
 
+#
+# Feature dependency information.
+#
+# !!! WARNING !!!
+#
+# A lot of this information is derived from the written text of vendors
+# software m

[Xen-devel] [PATCH v5 08/21] x86/cpu: Sysctl and common infrastructure for levelling context switching

2016-04-07 Thread Andrew Cooper
A toolstack needs to know how much control Xen has over the visible cpuid
values in PV guests.  Provide an explicit mechanism to query what Xen is
capable of.

This interface will currently report no capabilities.  This change is
scaffolding for future patches, which will introduce detection and switching
logic, after which the interface will report hardware capabilities correctly.

Signed-off-by: Andrew Cooper 
Acked-by: Jan Beulich 
---
CC: Daniel De Graaf 

v2:
 * s/cpumasks/cpuidmasks/
v3:
 * Reintroduce XEN_SYSCTL_get_levelling_caps (requested by Joao for some
   libvirt development he has planned).
 * Rename to XEN_SYSCTL_get_cpu_levelling_caps, and rename the constants to
   match the Xen command line options.
v4:
 * Move declarations from processor.h to cpuid.h
 * API corrections for XEN_SYSCTL_get_levelling_caps
v5:
 * XSM policy pieces
---
 tools/flask/policy/policy/modules/xen/xen.te |  1 +
 xen/arch/x86/cpu/common.c|  6 ++
 xen/arch/x86/sysctl.c|  6 ++
 xen/include/asm-x86/cpufeature.h |  1 +
 xen/include/asm-x86/cpuid.h  | 32 
 xen/include/public/sysctl.h  | 23 
 xen/xsm/flask/hooks.c|  3 +++
 xen/xsm/flask/policy/access_vectors  |  2 ++
 8 files changed, 74 insertions(+)

diff --git a/tools/flask/policy/policy/modules/xen/xen.te b/tools/flask/policy/policy/modules/xen/xen.te
index 7e69ce9..c29b067 100644
--- a/tools/flask/policy/policy/modules/xen/xen.te
+++ b/tools/flask/policy/policy/modules/xen/xen.te
@@ -72,6 +72,7 @@ allow dom0_t xen_t:xen2 {
 allow dom0_t xen_t:xen2 {
 pmu_ctrl
 get_symbol
+get_cpu_levelling_caps
 };
 
 # Allow dom0 to use all XENVER_ subops and VERSION subops that have checks.
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index b5c023f..7ef75b0 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -36,6 +36,12 @@ integer_param("cpuid_mask_ext_ecx", opt_cpuid_mask_ext_ecx);
 unsigned int opt_cpuid_mask_ext_edx = ~0u;
 integer_param("cpuid_mask_ext_edx", opt_cpuid_mask_ext_edx);
 
+unsigned int __initdata expected_levelling_cap;
+unsigned int __read_mostly levelling_caps;
+
+DEFINE_PER_CPU(struct cpuidmasks, cpuidmasks);
+struct cpuidmasks __read_mostly cpuidmask_defaults;
+
 const struct cpu_dev *__read_mostly cpu_devs[X86_VENDOR_NUM] = {};
 
 unsigned int paddr_bits __read_mostly = 36;
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 58cbd70..f68cbec 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -190,6 +190,12 @@ long arch_do_sysctl(
 }
 break;
 
+case XEN_SYSCTL_get_cpu_levelling_caps:
+sysctl->u.cpu_levelling_caps.caps = levelling_caps;
+if ( __copy_field_to_guest(u_sysctl, sysctl, u.cpu_levelling_caps.caps) )
+ret = -EFAULT;
+break;
+
 default:
 ret = -ENOSYS;
 break;
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 6a08579..9a93799 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -83,6 +83,7 @@
 #define cpu_has_xsaves boot_cpu_has(X86_FEATURE_XSAVES)
 #define cpu_has_monitor boot_cpu_has(X86_FEATURE_MONITOR)
 #define cpu_has_eist   boot_cpu_has(X86_FEATURE_EIST)
+#define cpu_has_hypervisor boot_cpu_has(X86_FEATURE_HYPERVISOR)
 
 enum _cache_type {
 CACHE_TYPE_NULL = 0,
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 4725672..9a21c25 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 #define FSCAPINTS FEATURESET_NR_ENTRIES
 
@@ -18,6 +19,7 @@
 
 #ifndef __ASSEMBLY__
 #include 
+#include 
 
 extern const uint32_t known_features[FSCAPINTS];
 extern const uint32_t special_features[FSCAPINTS];
@@ -31,6 +33,36 @@ void calculate_featuresets(void);
 
 const uint32_t *lookup_deep_deps(uint32_t feature);
 
+/*
+ * Expected levelling capabilities (given cpuid vendor/family information),
+ * and levelling capabilities actually available (given MSR probing).
+ */
+#define LCAP_faulting XEN_SYSCTL_CPU_LEVELCAP_faulting
+#define LCAP_1cd  (XEN_SYSCTL_CPU_LEVELCAP_ecx |\
+   XEN_SYSCTL_CPU_LEVELCAP_edx)
+#define LCAP_e1cd (XEN_SYSCTL_CPU_LEVELCAP_extd_ecx |   \
+   XEN_SYSCTL_CPU_LEVELCAP_extd_edx)
+#define LCAP_Da1  XEN_SYSCTL_CPU_LEVELCAP_xsave_eax
+#define LCAP_6c   XEN_SYSCTL_CPU_LEVELCAP_thermal_ecx
+#define LCAP_7ab0 (XEN_SYSCTL_CPU_LEVELCAP_l7s0_eax |   \
+   XEN_SYSCTL_CPU_LEVELCAP_l7s0_ebx)
+extern unsigned int expected_levelling_cap, levelling_caps;
+
+struct cpuidmasks
+{
+uint64_t _1cd;
+uint64_t e1cd;
+uint64_t Da1;
+uint64_t _6c;
+uint64_t _7ab0;
+};
+
+/* Per CPU shadows of masking MSR values, for lazy co

[Xen-devel] [PATCH v5 07/21] x86/cpu: Move set_cpumask() calls into c_early_init()

2016-04-07 Thread Andrew Cooper
Before c/s 44e24f8567 "x86: don't call generic_identify() redundantly", the
commandline-provided masks would take effect in Xen's view of the processor
features.

As the masks got applied after the query for features, the redundant call to
generic_identify() would clobber the pre-masking feature information with the
post-masking information.

Move the set_cpumask() calls into c_early_init() so the effects of the command
line parameters take place before the main query for features in
generic_identify().

The cpuid_mask_* command line parameters now limit the entire system.
Subsequent changes will cause the mask MSRs to be context switched per-domain,
removing the need to use the command line parameters for heterogeneous
levelling purposes.
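
The ordering issue can be illustrated with a toy model in which a feature query returns the hardware bits filtered by whatever mask is currently programmed. Everything here (struct demo_cpu and the demo_* and boot_* functions) is hypothetical and merely stands in for the mask MSRs, set_cpuidmask() and generic_identify().

```c
#include <stdint.h>

/* Toy CPU: a query reports hardware features filtered by the active mask. */
struct demo_cpu {
    uint32_t hw_features;   /* what the silicon supports */
    uint32_t mask;          /* currently programmed feature mask */
    uint32_t view;          /* the hypervisor's recorded view of features */
};

static uint32_t demo_cpuid(const struct demo_cpu *c)
{
    return c->hw_features & c->mask;
}

/* Stand-in for the main feature query. */
static void demo_identify(struct demo_cpu *c)
{
    c->view = demo_cpuid(c);
}

/* Mask applied after the query: the recorded view misses the mask. */
static uint32_t boot_mask_late(uint32_t hw, uint32_t mask)
{
    struct demo_cpu c = { hw, ~0u, 0 };

    demo_identify(&c);
    c.mask = mask;

    return c.view;
}

/* Mask applied before the query: the recorded view honours the mask. */
static uint32_t boot_mask_early(uint32_t hw, uint32_t mask)
{
    struct demo_cpu c = { hw, ~0u, 0 };

    c.mask = mask;
    demo_identify(&c);

    return c.view;
}
```

With hardware features 0xF and a mask of 0x3, the late variant records 0xF while the early variant records 0x3, which is the behaviour the move into c_early_init() is meant to restore.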

Signed-off-by: Andrew Cooper 
Reviewed-by: Konrad Rzeszutek Wilk 
---
CC: Jan Beulich 

v5:
 * Tweak wording in the commit message
---
 xen/arch/x86/cpu/amd.c   |  8 ++--
 xen/arch/x86/cpu/intel.c | 34 +-
 2 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 47a38c6..5516777 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -407,6 +407,11 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
  c->cpu_core_id);
 }
 
+static void early_init_amd(struct cpuinfo_x86 *c)
+{
+   set_cpuidmask(c);
+}
+
 static void init_amd(struct cpuinfo_x86 *c)
 {
u32 l, h;
@@ -595,14 +600,13 @@ static void init_amd(struct cpuinfo_x86 *c)
if ((smp_processor_id() == 1) && !cpu_has(c, X86_FEATURE_ITSC))
disable_c1_ramping();
 
-   set_cpuidmask(c);
-
check_syscfg_dram_mod_en();
 }
 
 static const struct cpu_dev amd_cpu_dev = {
.c_vendor   = "AMD",
.c_ident= { "AuthenticAMD" },
+   .c_early_init   = early_init_amd,
.c_init = init_amd,
 };
 
diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index bdf89f6..ad22375 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -189,6 +189,23 @@ static void early_init_intel(struct cpuinfo_x86 *c)
if (boot_cpu_data.x86 == 0xF && boot_cpu_data.x86_model == 3 &&
(boot_cpu_data.x86_mask == 3 || boot_cpu_data.x86_mask == 4))
paddr_bits = 36;
+
+   if (c == &boot_cpu_data && c->x86 == 6) {
+   if (probe_intel_cpuid_faulting())
+   __set_bit(X86_FEATURE_CPUID_FAULTING,
+ c->x86_capability);
+   } else if (boot_cpu_has(X86_FEATURE_CPUID_FAULTING)) {
+   BUG_ON(!probe_intel_cpuid_faulting());
+   __set_bit(X86_FEATURE_CPUID_FAULTING, c->x86_capability);
+   }
+
+   if (!cpu_has_cpuid_faulting)
+   set_cpuidmask(c);
+   else if ((c == &boot_cpu_data) &&
+(~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx &
+   opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx &
+   opt_cpuid_mask_xsave_eax)))
+   printk("No CPUID feature masking support available\n");
 }
 
 /*
@@ -258,23 +275,6 @@ static void init_intel(struct cpuinfo_x86 *c)
detect_ht(c);
}
 
-   if (c == &boot_cpu_data && c->x86 == 6) {
-   if (probe_intel_cpuid_faulting())
-   __set_bit(X86_FEATURE_CPUID_FAULTING,
- c->x86_capability);
-   } else if (boot_cpu_has(X86_FEATURE_CPUID_FAULTING)) {
-   BUG_ON(!probe_intel_cpuid_faulting());
-   __set_bit(X86_FEATURE_CPUID_FAULTING, c->x86_capability);
-   }
-
-   if (!cpu_has_cpuid_faulting)
-   set_cpuidmask(c);
-   else if ((c == &boot_cpu_data) &&
-(~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx &
-   opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx &
-   opt_cpuid_mask_xsave_eax)))
-   printk("No CPUID feature masking support available\n");
-
/* Work around errata */
Intel_errata_workarounds(c);
 
-- 
2.1.4




[Xen-devel] [PATCH v5 05/21] xen/x86: Improve disabling of features which have dependencies

2016-04-07 Thread Andrew Cooper
APIC and XSAVE have dependent features, which also need disabling if Xen
chooses to disable a feature.

Use setup_clear_cpu_cap() rather than clear_bit(), as it takes care of
dependent features as well.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v2: Move boolean_param() adjacent to use_xsave in xstate_init()
---
 xen/arch/x86/apic.c   |  2 +-
 xen/arch/x86/cpu/common.c | 12 +++-
 xen/arch/x86/xstate.c |  6 +-
 3 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
index b9601ad..8df5bd3 100644
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -1349,7 +1349,7 @@ void pmu_apic_interrupt(struct cpu_user_regs *regs)
 int __init APIC_init_uniprocessor (void)
 {
 if (enable_local_apic < 0)
-__clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
+setup_clear_cpu_cap(X86_FEATURE_APIC);
 
 if (!smp_found_config && !cpu_has_apic) {
 skip_ioapic_setup = 1;
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 0942b44..b5c023f 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -16,9 +16,6 @@
 
 #include "cpu.h"
 
-static bool_t use_xsave = 1;
-boolean_param("xsave", use_xsave);
-
 bool_t opt_arat = 1;
 boolean_param("arat", opt_arat);
 
@@ -341,12 +338,6 @@ void identify_cpu(struct cpuinfo_x86 *c)
if (this_cpu->c_init)
this_cpu->c_init(c);
 
-/* Initialize xsave/xrstor features */
-   if ( !use_xsave )
-   __clear_bit(X86_FEATURE_XSAVE, boot_cpu_data.x86_capability);
-
-   if ( cpu_has_xsave )
-   xstate_init(c);
 
if ( !opt_pku )
setup_clear_cpu_cap(X86_FEATURE_PKU);
@@ -370,6 +361,9 @@ void identify_cpu(struct cpuinfo_x86 *c)
 
/* Now the feature flags better reflect actual CPU features! */
 
+   if ( cpu_has_xsave )
+   xstate_init(c);
+
 #ifdef NOISY_CAPS
printk(KERN_DEBUG "CPU: After all inits, caps:");
for (i = 0; i < NCAPINTS; i++)
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 8c652bc..4a6c9f6 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -534,11 +534,15 @@ unsigned int xstate_ctxt_size(u64 xcr0)
 /* Collect the information of processor's extended state */
 void xstate_init(struct cpuinfo_x86 *c)
 {
+static bool_t __initdata use_xsave = 1;
+boolean_param("xsave", use_xsave);
+
 bool_t bsp = c == &boot_cpu_data;
 u32 eax, ebx, ecx, edx;
 u64 feature_mask;
 
-if ( boot_cpu_data.cpuid_level < XSTATE_CPUID )
+if ( (bsp && !use_xsave) ||
+ boot_cpu_data.cpuid_level < XSTATE_CPUID )
 {
 BUG_ON(!bsp);
 setup_clear_cpu_cap(X86_FEATURE_XSAVE);
-- 
2.1.4




[Xen-devel] [PATCH v5 04/21] xen/x86: Clear dependent features when clearing a cpu cap

2016-04-07 Thread Andrew Cooper
When clearing a cpu cap, clear all dependent features.  This avoids having a
featureset with intermediate features disabled, but leaf features enabled.
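As a rough illustration of the idea (not the code from this patch), clearing a
feature together with its dependents can be sketched as below.  The feature
numbers, the two-word featureset and the lookup_deps() table are all
hypothetical stand-ins for boot_cpu_data.x86_capability and
lookup_deep_deps().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_WORDS    2u                  /* hypothetical featureset size */
#define NR_FEATURES (NR_WORDS * 32u)

/* Hypothetical feature numbers, for illustration only. */
#define FEAT_XSAVE  0u
#define FEAT_AVX    1u
#define FEAT_FPU    2u
#define FEAT_AVX2   33u

/* Hypothetical deep-dependency bitmap: everything that needs XSAVE. */
static const uint32_t xsave_deps[NR_WORDS] = { 1u << 1, 1u << 1 };

static uint32_t capability[NR_WORDS];
static uint32_t cleared[NR_WORDS];

/* Returns the dependency bitmap for a feature, or NULL if it has none. */
static const uint32_t *lookup_deps(unsigned int cap)
{
    return cap == FEAT_XSAVE ? xsave_deps : NULL;
}

static void clear_cap(unsigned int cap)
{
    const uint32_t *deps;
    uint32_t bit;
    unsigned int i;

    assert(cap < NR_FEATURES);
    bit = 1u << (cap % 32);

    /* Already cleared: nothing more to do. */
    if (cleared[cap / 32] & bit)
        return;

    cleared[cap / 32] |= bit;
    capability[cap / 32] &= ~bit;

    deps = lookup_deps(cap);
    if (!deps)
        return;

    /* Clear every dependent feature in one pass over the words. */
    for (i = 0; i < NR_WORDS; ++i) {
        cleared[i] |= deps[i];
        capability[i] &= ~deps[i];
    }
}
```

With these assumed tables, clearing FEAT_XSAVE also removes FEAT_AVX and
FEAT_AVX2, while an unrelated feature such as FEAT_FPU is left alone.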

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v3:
 * Style fixes.  Use __test_and_set_bit()
---
 xen/arch/x86/cpu/common.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index d302272..0942b44 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -53,8 +53,22 @@ static unsigned int cleared_caps[NCAPINTS];
 
 void __init setup_clear_cpu_cap(unsigned int cap)
 {
+   const uint32_t *dfs;
+   unsigned int i;
+
+   if (__test_and_set_bit(cap, cleared_caps))
+   return;
+
__clear_bit(cap, boot_cpu_data.x86_capability);
-   __set_bit(cap, cleared_caps);
+   dfs = lookup_deep_deps(cap);
+
+   if (!dfs)
+   return;
+
+   for (i = 0; i < FSCAPINTS; ++i) {
+   cleared_caps[i] |= dfs[i];
+   boot_cpu_data.x86_capability[i] &= ~dfs[i];
+   }
 }
 
 static void default_init(struct cpuinfo_x86 * c)
-- 
2.1.4




[Xen-devel] [PATCH v5 00/21] x86: Improvements to cpuid handling for guests

2016-04-07 Thread Andrew Cooper
This series is available in git form at:
  http://xenbits.xen.org/git-http/people/andrewcoop/xen.git levelling-v4

Notable changes from v4 are further tweaking of the SSE* dependency tree,
fixing deep-C and P state collection on some Linux dom0 kernels, fixing the
construction of HVM domains on Skylake hardware, and XSM hooks for the new
sysctls.

Most patches now have Acks/Reviews.  The remaining patches are #1-3,6-7
(x86), #8,14 (XSM), #15 (ARM), #26 (Toolstack).

The current cpuid code, both in the hypervisor and toolstack, has grown
organically for a very long time, and is flawed in many ways.  This series
focuses specifically on fixing the bits pertaining to the visible
features, and I will be fixing other areas in future work (e.g. per-core,
per-package values, auditing of incoming migration values, etc.)

These changes alter the workflow of cpuid handling as follows:

Xen boots and evaluates its current capabilities.  It uses this information to
calculate the maximum featuresets it can provide to guests, and provides this
information for toolstack consumption.  A toolstack may then calculate a safe
set of features (taking into account migratability), and set a guest's cpuid
policy.  Xen then takes care of context switching the levelling state.

In particular, this means that PV guests may have different levels while
running on the same host, an option which was not previously available.
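One plausible way for a toolstack to compute a migration-safe set, sketched
here under the assumption that "safe" means "present on every host in the
pool", is to intersect the maximum featuresets reported by each host.  The
function name, FS_WORDS and the flat array layout are illustrative and are not
part of this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FS_WORDS 2u   /* hypothetical number of 32-bit words per featureset */

/*
 * hosts is a flat array of nr_hosts featuresets, FS_WORDS words each.
 * The result keeps only the features that every host advertises.
 */
static void level_featuresets(const uint32_t *hosts, size_t nr_hosts,
                              uint32_t *out)
{
    size_t h;
    unsigned int w;

    assert(nr_hosts > 0);

    for (w = 0; w < FS_WORDS; ++w)
        out[w] = hosts[w];

    for (h = 1; h < nr_hosts; ++h)
        for (w = 0; w < FS_WORDS; ++w)
            out[w] &= hosts[h * FS_WORDS + w];
}
```

A guest built with the resulting featureset should only see features that
exist on every host considered, which is the property migration needs.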

Andrew Cooper (21):
  xen/x86: Annotate VM applicability in featureset
  xen/x86: Calculate maximum host and guest featuresets
  xen/x86: Generate deep dependencies of features
  xen/x86: Clear dependent features when clearing a cpu cap
  xen/x86: Improve disabling of features which have dependencies
  xen/x86: Improvements to in-hypervisor cpuid sanity checks
  x86/cpu: Move set_cpumask() calls into c_early_init()
  x86/cpu: Sysctl and common infrastructure for levelling context
switching
  x86/cpu: Rework AMD masking MSR setup
  x86/cpu: Rework Intel masking/faulting setup
  x86/cpu: Context switch cpuid masks and faulting state in
context_switch()
  x86/pv: Provide custom cpumasks for PV domains
  x86/domctl: Update PV domain cpumasks when setting cpuid policy
  xen+tools: Export maximum host and guest cpu featuresets via SYSCTL
  tools/libxc: Modify bitmap operations to take void pointers
  tools/libxc: Use public/featureset.h for cpuid policy generation
  tools/libxc: Expose the automatically generated cpu featuremask
information
  tools: Utility for dealing with featuresets
  tools/libxc: Wire a featureset through to cpuid policy logic
  tools/libxc: Use featuresets rather than guesswork
  tools/libxc: Calculate xstate cpuid leaf from guest information

 .gitignore   |   1 +
 tools/flask/policy/policy/modules/xen/xen.te |   2 +
 tools/libxc/Makefile |   9 +
 tools/libxc/include/xenctrl.h|  22 +-
 tools/libxc/xc_bitops.h  |  37 +-
 tools/libxc/xc_cpufeature.h  | 151 ---
 tools/libxc/xc_cpuid_x86.c   | 639 +--
 tools/libxl/libxl_cpuid.c|   2 +-
 tools/misc/Makefile  |   4 +
 tools/misc/xen-cpuid.c   | 394 +
 tools/ocaml/libs/xc/xenctrl.ml   |   3 +
 tools/ocaml/libs/xc/xenctrl.mli  |   4 +
 tools/ocaml/libs/xc/xenctrl_stubs.c  |  37 +-
 tools/python/xen/lowlevel/xc/xc.c|   2 +-
 xen/arch/x86/apic.c  |   2 +-
 xen/arch/x86/cpu/amd.c   | 292 
 xen/arch/x86/cpu/common.c|  41 +-
 xen/arch/x86/cpu/intel.c | 278 
 xen/arch/x86/cpuid.c | 218 +
 xen/arch/x86/crash.c |   3 +
 xen/arch/x86/domain.c|  18 +-
 xen/arch/x86/domctl.c| 138 ++
 xen/arch/x86/hvm/hvm.c   | 125 --
 xen/arch/x86/setup.c |   3 +
 xen/arch/x86/sysctl.c|  57 +++
 xen/arch/x86/traps.c | 254 +++
 xen/arch/x86/xstate.c|   6 +-
 xen/include/asm-x86/cpufeature.h |   4 +
 xen/include/asm-x86/cpuid.h  |  51 +++
 xen/include/asm-x86/domain.h |   2 +
 xen/include/asm-x86/processor.h  |   2 +-
 xen/include/public/arch-x86/cpufeatureset.h  | 190 
 xen/include/public/sysctl.h  |  50 +++
 xen/tools/gen-cpuid.py   | 185 +++-
 xen/xsm/flask/hooks.c|   6 +
 xen/xsm/flask/policy/access_vectors  |   4 +
 36 files changed, 2400 insertions(+), 836 deletions(-)
 delete mode 100644 tools/libxc/xc_cpufeature.h
 create mode 100644 tools/misc/xen-cpuid.c

-- 
2.1.4



[Xen-devel] [PATCH v5 09/21] x86/cpu: Rework AMD masking MSR setup

2016-04-07 Thread Andrew Cooper
This patch is best reviewed as its end result rather than as a diff, as it
rewrites almost all of the setup.

On the BSP, cpuid information is used to evaluate the potential available set
of masking MSRs, and they are unconditionally probed, filling in the
availability information and hardware defaults.

The command line parameters are then combined with the hardware defaults to
further restrict the Xen default masking level.
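The combination step can be pictured as a plain AND of the probed hardware
default with the command line value, so that an option can only remove
features and never add ones the hardware lacks.  This is a minimal sketch of
that idea; the function name and the hi/lo split of the option are
assumptions, not the code in amd.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * hw_default: value read from the masking MSR at boot.
 * opt_hi/opt_lo: command line masks for the two registers the MSR covers
 * (all-ones when the option is not given).
 */
static uint64_t restrict_default_mask(uint64_t hw_default,
                                      uint32_t opt_hi, uint32_t opt_lo)
{
    uint64_t opt = ((uint64_t)opt_hi << 32) | opt_lo;
    uint64_t result = hw_default & opt;

    /* The result never enables a bit the hardware default lacks. */
    assert((result & ~hw_default) == 0);

    return result;
}
```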

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v2:
 * Provide extra information if opt_cpu_info
 * Extra comment indicating the expected use of amd_ctxt_switch_levelling()
v3:
 * Fix the interaction of the fast-forward bits with the override MSRs.
 * Style fixups.
v5:
 * Tweak comments and style.
---
 xen/arch/x86/cpu/amd.c | 281 -
 1 file changed, 184 insertions(+), 97 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 5516777..93a8a5e 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -80,6 +80,13 @@ static inline int wrmsr_amd_safe(unsigned int msr, unsigned int lo,
return err;
 }
 
+static void wrmsr_amd(unsigned int msr, uint64_t val)
+{
+   asm volatile("wrmsr" ::
+"c" (msr), "a" ((uint32_t)val),
+"d" (val >> 32), "D" (0x9c5a203a));
+}
+
 static const struct cpuidmask {
uint16_t fam;
char rev[2];
@@ -126,126 +133,203 @@ static const struct cpuidmask *__init noinline get_cpuidmask(const char *opt)
 }
 
 /*
+ * Sets caps in expected_levelling_cap, probes for the specified mask MSR, and
+ * set caps in levelling_caps if it is found.  Processors prior to Fam 10h
+ * required a 32-bit password for masking MSRs.  Returns the default value.
+ */
+static uint64_t __init _probe_mask_msr(unsigned int msr, uint64_t caps)
+{
+   unsigned int hi, lo;
+
+   expected_levelling_cap |= caps;
+
+   if ((rdmsr_amd_safe(msr, &lo, &hi) == 0) &&
+   (wrmsr_amd_safe(msr, lo, hi) == 0))
+   levelling_caps |= caps;
+
+   return ((uint64_t)hi << 32) | lo;
+}
+
+/*
+ * Probe for the existence of the expected masking MSRs.  They might easily
+ * not be available if Xen is running virtualised.
+ */
+static void __init noinline probe_masking_msrs(void)
+{
+   const struct cpuinfo_x86 *c = &boot_cpu_data;
+
+   /*
+* First, work out which masking MSRs we should have, based on
+* revision and cpuid.
+*/
+
+   /* Fam11 doesn't support masking at all. */
+   if (c->x86 == 0x11)
+   return;
+
+   cpuidmask_defaults._1cd =
+   _probe_mask_msr(MSR_K8_FEATURE_MASK, LCAP_1cd);
+   cpuidmask_defaults.e1cd =
+   _probe_mask_msr(MSR_K8_EXT_FEATURE_MASK, LCAP_e1cd);
+
+   if (c->cpuid_level >= 7)
+   cpuidmask_defaults._7ab0 =
+   _probe_mask_msr(MSR_AMD_L7S0_FEATURE_MASK, LCAP_7ab0);
+
+   if (c->x86 == 0x15 && c->cpuid_level >= 6 && cpuid_ecx(6))
+   cpuidmask_defaults._6c =
+   _probe_mask_msr(MSR_AMD_THRM_FEATURE_MASK, LCAP_6c);
+
+   /*
+* Don't bother warning about a mismatch if virtualised.  These MSRs
+* are not architectural and almost never virtualised.
+*/
+   if ((expected_levelling_cap == levelling_caps) ||
+   cpu_has_hypervisor)
+   return;
+
+   printk(XENLOG_WARNING "Mismatch between expected (%#x) "
+  "and real (%#x) levelling caps: missing %#x\n",
+  expected_levelling_cap, levelling_caps,
+  (expected_levelling_cap ^ levelling_caps) & levelling_caps);
+   printk(XENLOG_WARNING "Fam %#x, model %#x level %#x\n",
+  c->x86, c->x86_model, c->cpuid_level);
+   printk(XENLOG_WARNING
+  "If not running virtualised, please report a bug\n");
+}
+
+/*
+ * Context switch levelling state to the next domain.  A parameter of NULL is
+ * used to context switch to the default host state (by the cpu bringup-code,
+ * crash path, etc).
+ */
+static void amd_ctxt_switch_levelling(const struct domain *nextd)
+{
+   struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
+   const struct cpuidmasks *masks = &cpuidmask_defaults;
+
+#define LAZY(cap, msr, field)  \
+   ({  \
+   if (unlikely(these_masks->field != masks->field) && \
+   ((levelling_caps & cap) == cap))\
+   {   \
+   wrmsr_amd(msr, masks->field);   \
+   these_masks->field = masks->field;  \
+   }   \
+   })
+
+   LAZY(LCAP_1cd,  MSR_K8_FEATURE_MASK,   _1cd);
+   LAZY(LCAP_e1cd, MSR_K8_EXT_FEATURE_MASK,

[Xen-devel] [PATCH v5 06/21] xen/x86: Improvements to in-hypervisor cpuid sanity checks

2016-04-07 Thread Andrew Cooper
Currently, {pv,hvm}_cpuid() has a large quantity of essentially-static logic
for modifying the features visible to a guest.  A lot of this can be subsumed
by {pv,hvm}_featuremask, which identify the features available on this
hardware which could be given to a PV or HVM guest.

This is a step in the direction of full per-domain cpuid policies, but lots
more development is needed for that.  As a result, the static checks are
simplified, but the dynamic checks need to remain for now.

As a side effect, some of the logic for special features can be improved.
OSXSAVE and OSPKE will be automatically cleared because of being absent in the
featuremask.  This allows the fast-forward logic to be simpler.

In addition, there are some corrections to the existing logic:

 * Hiding PSE36 out of PAE mode is architecturally wrong.  It turns out that
   it was a bugfix for running HyperV under Xen, which wanted to see PSE36
   even after choosing to use PAE paging.  PSE36 is not supported by shadow
   paging, so is hidden from non-HAP guests, but is still visible for HAP
   guests.  It is also leaked into non-HAP guests when the guest is already
   running in PAE mode.
 * Changing the visibility of RDTSCP based on host TSC stability or virtual
   TSC mode is bogus, so dropped.
 * When emulating Intel to a guest, the common features in e1d should be
   cleared.
 * The APIC bit in e1d (on non-Intel) is also a fast-forward from the
   APIC_BASE MSR.

As a small improvement, use compiler-visible &'s and |'s, rather than
{clear,set}_bit().
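The "clear via featuremask, then fast-forward" pattern for OSXSAVE can be
sketched as follows.  The bit positions are the architectural ones
(CPUID.1:ECX bits 26 and 27, CR4 bit 18); the function name and its
parameters are made up for the example and do not match hvm_cpuid().

```c
#include <assert.h>
#include <stdint.h>

#define FEAT_XSAVE_BIT    (1u << 26)   /* CPUID.1:ECX.XSAVE */
#define FEAT_OSXSAVE_BIT  (1u << 27)   /* CPUID.1:ECX.OSXSAVE */
#define CR4_OSXSAVE_BIT   (1u << 18)   /* CR4.OSXSAVE */

static uint32_t guest_leaf1_ecx(uint32_t hw_ecx, uint32_t featuremask_ecx,
                                uint32_t guest_cr4)
{
    uint32_t ecx;

    /* Special bits are expected to be absent from the static mask. */
    assert(!(featuremask_ecx & FEAT_OSXSAVE_BIT));

    /* The static mask clears OSXSAVE along with everything not offered. */
    ecx = hw_ecx & featuremask_ecx;

    /* Fast-forward the guest's own CR4 state back in. */
    if (guest_cr4 & CR4_OSXSAVE_BIT)
        ecx |= FEAT_OSXSAVE_BIT;

    return ecx;
}
```

Because the mask has already cleared the bit, no else-branch is needed to
clear OSXSAVE when CR4.OSXSAVE is unset.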

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

v2:
 * Reinstate some of the dynamic checks for now.  Future development work will
   instate a complete per-domain policy.
 * Fix OSXSAVE handling for PV guests.
v3:
 * Better handling of the cross-vendor case.
 * Improvements to the handling of special features.
 * Correct PSE36 to being a HAP-only feature.
 * Yet more OSXSAVE fixes for PV guests.
v4:
 * Leak PSE36 into shadow guests to fix buggy versions of Hyper-V.
 * Leak MTRR into the hardware domain to fix Xenolinux dom0.
 * Change cross-vendor 1D disabling logic.
 * Avoid reading arch.pv_vcpu for PVH guests.
v5:
 * Clarify the commit message regarding PSE36.
 * Drop redundant is_pv_domain().
 * Fix deep-C and P states with modern Linux PVOps on faulting-capable hardware.
---
 xen/arch/x86/hvm/hvm.c   | 125 ---
 xen/arch/x86/traps.c | 254 +++
 xen/include/asm-x86/cpufeature.h |   2 +
 3 files changed, 263 insertions(+), 118 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index b239f74..9ce61cc 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -72,6 +72,7 @@
 #include 
 #include 
 #include 
+#include 
 
 bool_t __read_mostly hvm_enabled;
 
@@ -3358,62 +3359,71 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
 /* Fix up VLAPIC details. */
 *ebx &= 0x00FFu;
 *ebx |= (v->vcpu_id * 2) << 24;
+
+*ecx &= hvm_featureset[FEATURESET_1c];
+*edx &= hvm_featureset[FEATURESET_1d];
+
+/* APIC exposed to guests, but Fast-forward MSR_APIC_BASE.EN back in. */
 if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
-__clear_bit(X86_FEATURE_APIC & 31, edx);
+*edx &= ~cpufeat_bit(X86_FEATURE_APIC);
 
-/* Fix up OSXSAVE. */
-if ( *ecx & cpufeat_mask(X86_FEATURE_XSAVE) &&
- (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) )
+/* OSXSAVE cleared by hvm_featureset.  Fast-forward CR4 back in. */
+if ( v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE )
 *ecx |= cpufeat_mask(X86_FEATURE_OSXSAVE);
-else
-*ecx &= ~cpufeat_mask(X86_FEATURE_OSXSAVE);
 
-/* Don't expose PCID to non-hap hvm. */
+/* Don't expose HAP-only features to non-hap guests. */
 if ( !hap_enabled(d) )
+{
 *ecx &= ~cpufeat_mask(X86_FEATURE_PCID);
 
-/* Only provide PSE36 when guest runs in 32bit PAE or in long mode */
-if ( !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
-*edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
+/*
+ * PSE36 is not supported in shadow mode.  This bit should be
+ * unilaterally cleared.
+ *
+ * However, an unspecified version of Hyper-V from 2011 refuses
+ * to start as the "cpu does not provide required hw features" if
+ * it can't see PSE36.
+ *
+ * As a workaround, leak the toolstack-provided PSE36 value into a
+ * shadow guest if the guest is already using PAE paging (and
+ * won't care about reverting back to PSE paging).  Otherwise,
+ * nobble it, so a 32bit guest doesn't get the impression that it
+ * could try to use PSE36 paging.
+ */
+if ( !(hvm_pae_enabled(v) || hvm_long_mode_

[Xen-devel] [PATCH v5 11/21] x86/cpu: Context switch cpuid masks and faulting state in context_switch()

2016-04-07 Thread Andrew Cooper
A single ctxt_switch_levelling() function pointer is provided
(defaulting to an empty nop), which is overridden in the appropriate
$VENDOR_init_levelling().

set_cpuid_faulting() is made private and included within
intel_ctxt_switch_levelling().
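The hook arrangement described above is the usual "function pointer with a
no-op default" pattern.  A stripped-down sketch follows; the names are
simplified, and the call counter exists only so the behaviour can be
observed.

```c
#include <assert.h>
#include <stddef.h>

struct domain;   /* opaque for the purposes of this sketch */

static unsigned int vendor_calls;

/* Default hook: nothing to do when no levelling support was found. */
static void default_ctxt_switch(const struct domain *next)
{
    (void)next;
}

/* Stand-in for a vendor-specific implementation. */
static void vendor_ctxt_switch(const struct domain *next)
{
    (void)next;
    ++vendor_calls;
}

static void (*ctxt_switch_hook)(const struct domain *) = default_ctxt_switch;

/* Vendor init overrides the hook only if some levelling cap was found. */
static void init_levelling(unsigned int caps)
{
    if (caps)
        ctxt_switch_hook = vendor_ctxt_switch;
}

/* Callers never need to check for NULL or for the vendor. */
static void do_context_switch(const struct domain *next)
{
    assert(ctxt_switch_hook != NULL);
    ctxt_switch_hook(next);
}
```

Keeping a real no-op function as the default means the context switch path
can call the hook unconditionally.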

One (attempted) functional change is that the faulting configuration should
not be special cased for dom0.  It turns out that the toolstack relies on the
special case (and indeed, on being a PV domain in the first place) to
correctly build HVM domains.

For now, the control domain is left as a special case, until further work can
be completed to remove the restriction.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v2:
 * Style fixes.
 * ASSERT() that faulting is available in set_cpuid_faulting().
v3:
 * Don't leave cpuid masking/faulting active for the kexec kernel.
v5:
 * Fix the building of HVM domains from hardware with faulting available.
---
 xen/arch/x86/cpu/amd.c  |  3 +++
 xen/arch/x86/cpu/common.c   |  7 +++
 xen/arch/x86/cpu/intel.c| 37 -
 xen/arch/x86/crash.c|  3 +++
 xen/arch/x86/domain.c   |  4 +---
 xen/include/asm-x86/processor.h |  2 +-
 6 files changed, 47 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 93a8a5e..3e2f4a8 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -331,6 +331,9 @@ static void __init noinline amd_init_levelling(void)
   (uint32_t)cpuidmask_defaults._7ab0,
   (uint32_t)cpuidmask_defaults._6c);
}
+
+   if (levelling_caps)
+   ctxt_switch_levelling = amd_ctxt_switch_levelling;
 }
 
 /*
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 7ef75b0..fe6eab4 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -88,6 +88,13 @@ static const struct cpu_dev default_cpu = {
 };
 static const struct cpu_dev *this_cpu = &default_cpu;
 
+static void default_ctxt_switch_levelling(const struct domain *nextd)
+{
+   /* Nop */
+}
+void (* __read_mostly ctxt_switch_levelling)(const struct domain *nextd) =
+   default_ctxt_switch_levelling;
+
 bool_t opt_cpu_info;
 boolean_param("cpuinfo", opt_cpu_info);
 
diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index 6e1fbbb..e21c32d 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -32,13 +32,15 @@ static bool_t __init probe_intel_cpuid_faulting(void)
return 1;
 }
 
-static DEFINE_PER_CPU(bool_t, cpuid_faulting_enabled);
-void set_cpuid_faulting(bool_t enable)
+static void set_cpuid_faulting(bool_t enable)
 {
+   static DEFINE_PER_CPU(bool_t, cpuid_faulting_enabled);
+   bool_t *this_enabled = &this_cpu(cpuid_faulting_enabled);
uint32_t hi, lo;
 
-   if (!cpu_has_cpuid_faulting ||
-   this_cpu(cpuid_faulting_enabled) == enable )
+   ASSERT(cpu_has_cpuid_faulting);
+
+   if (*this_enabled == enable)
return;
 
rdmsr(MSR_INTEL_MISC_FEATURES_ENABLES, lo, hi);
@@ -47,7 +49,7 @@ void set_cpuid_faulting(bool_t enable)
lo |= MSR_MISC_FEATURES_CPUID_FAULTING;
wrmsr(MSR_INTEL_MISC_FEATURES_ENABLES, lo, hi);
 
-   this_cpu(cpuid_faulting_enabled) = enable;
+   *this_enabled = enable;
 }
 
 /*
@@ -154,6 +156,28 @@ static void intel_ctxt_switch_levelling(const struct domain *nextd)
struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
const struct cpuidmasks *masks = &cpuidmask_defaults;
 
+   if (cpu_has_cpuid_faulting) {
+   /*
+* We *should* be enabling faulting for the control domain.
+*
+* Unfortunately, the domain builder (having only ever been a
+* PV guest) expects to be able to see host cpuid state in a
+* native CPUID instruction, to correctly build a CPUID policy
+* for HVM guests (notably the xstate leaves).
+*
+* This logic is fundamentally broken for HVM toolstack
+* domains, and faulting causes PV guests to behave like HVM
+* guests from their point of view.
+*
+* Future development plans will move responsibility for
+* generating the maximum full cpuid policy into Xen, at which
+* point this problem will disappear.
+*/
+   set_cpuid_faulting(nextd && is_pv_domain(nextd) &&
+  !is_control_domain(nextd));
+   return;
+   }
+
 #define LAZY(msr, field)   \
({  \
if (unlikely(these_masks->field != masks->field) && \
@@ -227,6 +251,9 @@ static void __init noinline intel_init_levelling(void)
   (uint32_t)cpuidmask_defaults

[Xen-devel] [PATCH v5 20/21] tools/libxc: Use featuresets rather than guesswork

2016-04-07 Thread Andrew Cooper
It is conceptually wrong to base a VM's featureset on the features visible to
the toolstack which happens to construct it.

Instead, the featureset used is either an explicit one passed by the
toolstack, or the default which Xen believes it can give to the guest.

Collect all the feature manipulation into a single function which adjusts the
featureset, and perform deep dependency removal.
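The selection rule above (use the toolstack's explicit featureset if one was
passed, otherwise Xen's default) can be sketched as below.  bitmaskof() and
featureword_of() mirror the macros in this patch; build_featureset() and
test_feature() are hypothetical helpers and do not exist in libxc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define bitmaskof(idx)      (1u << ((idx) & 31))
#define featureword_of(idx) ((idx) >> 5)

/* Copy the explicit featureset if given, else fall back to the default. */
static void build_featureset(const uint32_t *requested,
                             const uint32_t *xen_default,
                             uint32_t *out, size_t nr_words)
{
    const uint32_t *src = requested ? requested : xen_default;
    size_t i;

    assert(src != NULL && out != NULL);

    for (i = 0; i < nr_words; ++i)
        out[i] = src[i];
}

/* Returns 1 if feature idx is set in the featureset, 0 otherwise. */
static int test_feature(const uint32_t *fs, size_t nr_words, unsigned int idx)
{
    assert(featureword_of(idx) < nr_words);

    return !!(fs[featureword_of(idx)] & bitmaskof(idx));
}
```

Neither path consults the CPUID values visible to the domain doing the
building, which is the point of the change.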

Signed-off-by: Andrew Cooper 
Acked-by: Wei Liu 
---
CC: Ian Jackson 

v2:
 * Join several related patches together.
v3:
 * Correctly adjust HTT/CMP_LEGACY in the policy.  PV guests see host details,
   so get the host features.  HVM guests have their vcpu topology presented in
   an HTT compatible manner (even if it ends up reporting 1 cpu), so have
   CMP_LEGACY unconditionally cleared.
---
 tools/libxc/xc_cpuid_x86.c | 356 +
 1 file changed, 137 insertions(+), 219 deletions(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index a92f5e4..fc7e20a 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -21,7 +21,9 @@
 
 #include 
 #include 
+#include 
 #include "xc_private.h"
+#include "xc_bitops.h"
 #include 
 
 enum {
@@ -31,12 +33,14 @@ enum {
 #include "_xc_cpuid_autogen.h"
 
 #define bitmaskof(idx)  (1u << ((idx) & 31))
-#define clear_bit(idx, dst) ((dst) &= ~bitmaskof(idx))
-#define set_bit(idx, dst)   ((dst) |=  bitmaskof(idx))
+#define featureword_of(idx) ((idx) >> 5)
+#define clear_feature(idx, dst) ((dst) &= ~bitmaskof(idx))
+#define set_feature(idx, dst)   ((dst) |=  bitmaskof(idx))
 
 #define DEF_MAX_BASE 0x0000000du
 #define DEF_MAX_INTELEXT  0x80000008u
 #define DEF_MAX_AMDEXT0x8000001cu
+#define COMMON_1D CPUID_COMMON_1D_FEATURES
 
 int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps)
 {
@@ -322,37 +326,6 @@ static void amd_xc_cpuid_policy(xc_interface *xch,
 regs[0] = DEF_MAX_AMDEXT;
 break;
 
-case 0x80000001: {
-if ( !info->pae )
-clear_bit(X86_FEATURE_PAE, regs[3]);
-
-/* Filter all other features according to a whitelist. */
-regs[2] &= (bitmaskof(X86_FEATURE_LAHF_LM) |
-bitmaskof(X86_FEATURE_CMP_LEGACY) |
-(info->nestedhvm ? bitmaskof(X86_FEATURE_SVM) : 0) |
-bitmaskof(X86_FEATURE_CR8_LEGACY) |
-bitmaskof(X86_FEATURE_ABM) |
-bitmaskof(X86_FEATURE_SSE4A) |
-bitmaskof(X86_FEATURE_MISALIGNSSE) |
-bitmaskof(X86_FEATURE_3DNOWPREFETCH) |
-bitmaskof(X86_FEATURE_OSVW) |
-bitmaskof(X86_FEATURE_XOP) |
-bitmaskof(X86_FEATURE_LWP) |
-bitmaskof(X86_FEATURE_FMA4) |
-bitmaskof(X86_FEATURE_TBM) |
-bitmaskof(X86_FEATURE_DBEXT));
-regs[3] &= (0x0183f3ff | /* features shared with 0x00000001:EDX */
-bitmaskof(X86_FEATURE_NX) |
-bitmaskof(X86_FEATURE_LM) |
-bitmaskof(X86_FEATURE_PAGE1GB) |
-bitmaskof(X86_FEATURE_SYSCALL) |
-bitmaskof(X86_FEATURE_MMXEXT) |
-bitmaskof(X86_FEATURE_FFXSR) |
-bitmaskof(X86_FEATURE_3DNOW) |
-bitmaskof(X86_FEATURE_3DNOWEXT));
-break;
-}
-
 case 0x80000008:
 /*
  * ECX[15:12] is ApicIdCoreSize: ECX[7:0] is NumberOfCores (minus one).
@@ -399,12 +372,6 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
 {
 switch ( input[0] )
 {
-case 0x00000001:
-/* ECX[5] is availability of VMX */
-if ( info->nestedhvm )
-set_bit(X86_FEATURE_VMX, regs[2]);
-break;
-
 case 0x00000004:
 /*
  * EAX[31:26] is Maximum Cores Per Package (minus one).
@@ -420,19 +387,6 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
 regs[0] = DEF_MAX_INTELEXT;
 break;
 
-case 0x80000001: {
-/* Only a few features are advertised in Intel's 0x80000001. */
-regs[2] &= (bitmaskof(X86_FEATURE_LAHF_LM) |
-bitmaskof(X86_FEATURE_3DNOWPREFETCH) |
-bitmaskof(X86_FEATURE_ABM));
-regs[3] &= (bitmaskof(X86_FEATURE_NX) |
-bitmaskof(X86_FEATURE_LM) |
-bitmaskof(X86_FEATURE_PAGE1GB) |
-bitmaskof(X86_FEATURE_SYSCALL) |
-bitmaskof(X86_FEATURE_RDTSCP));
-break;
-}
-
 case 0x80000005:
 regs[0] = regs[1] = regs[2] = 0;
 break;
@@ -444,10 +398,6 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
 }
 }
 
-#define XSAVEOPT (1 << 0)
-#define XSAVEC  (1 << 1)
-#define XGETBV1 (1 << 2)
-#define XSAVES  (1 << 3)
 /* Configure extended state enumeration leaves (0x0000000D for xsave) */
 static void xc_cpuid_config_xsave(xc_interface *xch

[Xen-devel] [PATCH v5 10/21] x86/cpu: Rework Intel masking/faulting setup

2016-04-07 Thread Andrew Cooper
This patch is best reviewed as its end result rather than as a diff, as it
rewrites almost all of the setup.

On the BSP, cpuid information is used to evaluate the potential available set
of masking MSRs, and they are unconditionally probed, filling in the
availability information and hardware defaults.  A side effect of this is that
probe_intel_cpuid_faulting() can move to being __init.

The command line parameters are then combined with the hardware defaults to
further restrict the Xen default masking level.
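The probing step follows a read-then-write-back pattern: an MSR counts as
available only if it can be read and the same value written back without
faulting.  The sketch below imitates that against a simulated MSR space; the
fake_*() helpers, the MSR index 0x100 and the cap values are all invented for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated MSR space: index 0x100 exists, everything else faults. */
#define FAKE_MSR_PRESENT 0x100u

static uint64_t fake_msr_value = ~(uint64_t)0;

/* Both helpers return 0 on success and nonzero on a (simulated) fault. */
static int fake_rdmsr_safe(unsigned int msr, uint64_t *val)
{
    if (msr != FAKE_MSR_PRESENT)
        return -1;
    *val = fake_msr_value;
    return 0;
}

static int fake_wrmsr_safe(unsigned int msr, uint64_t val)
{
    if (msr != FAKE_MSR_PRESENT)
        return -1;
    fake_msr_value = val;
    return 0;
}

static uint64_t expected_caps, found_caps;

/*
 * Record caps as expected, probe the MSR, and either record caps as found
 * or zero the index so later code knows the MSR is unusable.  Returns the
 * hardware default, or 0 if the MSR is missing.
 */
static uint64_t probe_mask_msr(unsigned int *msr, uint64_t caps)
{
    uint64_t val = 0;

    assert(msr != NULL);
    expected_caps |= caps;

    if (fake_rdmsr_safe(*msr, &val) || fake_wrmsr_safe(*msr, val))
        *msr = 0;
    else
        found_caps |= caps;

    return val;
}
```

Comparing expected_caps with found_caps afterwards shows which masking MSRs
were anticipated from the CPU model but turned out to be missing.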

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v2:
 * Style fixes.
 * Provide extra information if opt_cpu_info.
 * Extra comment indicating the expected use of intel_ctxt_switch_levelling().
v3:
 * Style fixes.
 * Avoid printing the cpumask defaults if faulting is available.
v5:
 * Tweak comments.
---
 xen/arch/x86/cpu/intel.c | 234 ++-
 1 file changed, 149 insertions(+), 85 deletions(-)

diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index ad22375..6e1fbbb 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -18,11 +18,18 @@
 
 #define select_idle_routine(x) ((void)0)
 
-static unsigned int probe_intel_cpuid_faulting(void)
+static bool_t __init probe_intel_cpuid_faulting(void)
 {
uint64_t x;
-   return !rdmsr_safe(MSR_INTEL_PLATFORM_INFO, x) &&
-   (x & MSR_PLATFORM_INFO_CPUID_FAULTING);
+
+   if (rdmsr_safe(MSR_INTEL_PLATFORM_INFO, x) ||
+   !(x & MSR_PLATFORM_INFO_CPUID_FAULTING))
+   return 0;
+
+   expected_levelling_cap |= LCAP_faulting;
+   levelling_caps |=  LCAP_faulting;
+   __set_bit(X86_FEATURE_CPUID_FAULTING, boot_cpu_data.x86_capability);
+   return 1;
 }
 
 static DEFINE_PER_CPU(bool_t, cpuid_faulting_enabled);
@@ -44,36 +51,40 @@ void set_cpuid_faulting(bool_t enable)
 }
 
 /*
- * opt_cpuid_mask_ecx/edx: cpuid.1[ecx, edx] feature mask.
- * For example, E8400[Intel Core 2 Duo Processor series] ecx = 0x0008E3FD,
- * edx = 0xBFEBFBFF when executing CPUID.EAX = 1 normally. If you want to
- * 'rev down' to E8400, you can set these values in these Xen boot parameters.
+ * Set caps in expected_levelling_cap, probe a specific masking MSR, and set
+ * caps in levelling_caps if it is found, or clobber the MSR index if missing.
+ * If present, reads the default value into msr_val.
  */
-static void set_cpuidmask(const struct cpuinfo_x86 *c)
+static uint64_t __init _probe_mask_msr(unsigned int *msr, uint64_t caps)
 {
-   static unsigned int msr_basic, msr_ext, msr_xsave;
-   static enum { not_parsed, no_mask, set_mask } status;
-   u64 msr_val;
+   uint64_t val = 0;
 
-   if (status == no_mask)
-   return;
+   expected_levelling_cap |= caps;
 
-   if (status == set_mask)
-   goto setmask;
+   if (rdmsr_safe(*msr, val) || wrmsr_safe(*msr, val))
+   *msr = 0;
+   else
+   levelling_caps |= caps;
 
-   ASSERT((status == not_parsed) && (c == &boot_cpu_data));
-   status = no_mask;
+   return val;
+}
 
-   if (!~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx &
-  opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx &
-  opt_cpuid_mask_xsave_eax))
-   return;
+/* Indices of the masking MSRs, or 0 if unavailable. */
+static unsigned int __read_mostly msr_basic, __read_mostly msr_ext,
+   __read_mostly msr_xsave;
+
+/*
+ * Probe for the existence of the expected masking MSRs.  They might easily
+ * not be available if Xen is running virtualised.
+ */
+static void __init probe_masking_msrs(void)
+{
+   const struct cpuinfo_x86 *c = &boot_cpu_data;
+   unsigned int exp_msr_basic, exp_msr_ext, exp_msr_xsave;
 
/* Only family 6 supports this feature. */
-   if (c->x86 != 6) {
-   printk("No CPUID feature masking support available\n");
+   if (c->x86 != 6)
return;
-   }
 
switch (c->x86_model) {
case 0x17: /* Yorkfield, Wolfdale, Penryn, Harpertown(DP) */
@@ -100,59 +111,121 @@ static void set_cpuidmask(const struct cpuinfo_x86 *c)
break;
}
 
-   status = set_mask;
+   exp_msr_basic = msr_basic;
+   exp_msr_ext   = msr_ext;
+   exp_msr_xsave = msr_xsave;
 
-   if (~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx)) {
-   if (msr_basic)
-   printk("Writing CPUID feature mask ecx:edx -> %08x:%08x\n",
-  opt_cpuid_mask_ecx, opt_cpuid_mask_edx);
-   else
-   printk("No CPUID feature mask available\n");
-   }
-   else
-   msr_basic = 0;
-
-   if (~(opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx)) {
-   if (msr_ext)
-   printk("Writing CPUID extended feature mask ecx:edx -> %08x:%08x\n",
-  opt_cpuid_mask_ext_ecx, opt_cpuid_mask_ext_edx);
-   

[Xen-devel] [PATCH v5 13/21] x86/domctl: Update PV domain cpumasks when setting cpuid policy

2016-04-07 Thread Andrew Cooper
This allows PV domains with different featuresets to observe different values
from a native cpuid instruction, on supporting hardware.

It is important to leak the host view of X2APIC, HTT and CMP_LEGACY through to
guests, even though they could be hidden.  These flags affect how to interpret
other cpuid leaves which are not maskable.
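One detail worth spelling out is that the leaf 1 masking MSR packs ECX and EDX
in opposite halves on the two vendors, as the switch() in this patch shows.
The sketch below covers only that packing and the AND with the hardware
default; it deliberately omits the AMD override handling for OSXSAVE and
APIC, and the enum and function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

enum vendor { VENDOR_INTEL, VENDOR_AMD };

/* Build the leaf 1 mask for a guest from its policy ecx/edx values. */
static uint64_t leaf1_mask(enum vendor v, uint64_t hw_default,
                           uint32_t ecx, uint32_t edx)
{
    switch (v) {
    case VENDOR_INTEL:
        /* EDX in the high half, ECX in the low half. */
        return hw_default & (((uint64_t)edx << 32) | ecx);

    case VENDOR_AMD:
        /* ECX in the high half, EDX in the low half. */
        return hw_default & (((uint64_t)ecx << 32) | edx);
    }

    assert(!"unknown vendor");
    return hw_default;
}
```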

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v2:
 * Use switch() rather than if/elseif chain
 * Clamp to static PV featuremask
v3:
 * Only set a shadow cpumask if it is available in hardware.  This causes
   fewer branches in the context switch.
 * Fix interaction between fastforward bits and override MSR.
 * Fix up the cross-vendor case.
 * Fix the host view of HTT/CMP_LEGACY.
v4:
 * More comments explaining the masking MSRs behaviour.
 * s/CPU/CPUID/
 * Leak host X2APIC.
v5:
 * Fix commit message wrt X2APIC.
---
 xen/arch/x86/domctl.c| 138 +++
 xen/include/asm-x86/cpufeature.h |   1 +
 2 files changed, 139 insertions(+)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index e5180ef..cba1e37 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
 {
@@ -87,6 +88,143 @@ static void update_domain_cpuid_info(struct domain *d,
 d->arch.x86_model = (ctl->eax >> 4) & 0xf;
 if ( d->arch.x86 >= 0x6 )
 d->arch.x86_model |= (ctl->eax >> 12) & 0xf0;
+
+if ( is_pv_domain(d) && ((levelling_caps & LCAP_1cd) == LCAP_1cd) )
+{
+uint64_t mask = cpuidmask_defaults._1cd;
+uint32_t ecx = ctl->ecx & pv_featureset[FEATURESET_1c];
+uint32_t edx = ctl->edx & pv_featureset[FEATURESET_1d];
+
+/*
+ * Must expose host's HTT and X2APIC values so a guest using native
+ * CPUID can correctly interpret other leaves which cannot be
+ * masked.
+ */
+if ( cpu_has_x2apic )
+ecx |= cpufeat_mask(X86_FEATURE_X2APIC);
+if ( cpu_has_htt )
+edx |= cpufeat_mask(X86_FEATURE_HTT);
+
+switch ( boot_cpu_data.x86_vendor )
+{
+case X86_VENDOR_INTEL:
+/*
+ * Intel masking MSRs are documented as AND masks.
+ * Experimentally, they are applied before OSXSAVE and APIC
+ * are fast-forwarded from real hardware state.
+ */
+mask &= ((uint64_t)edx << 32) | ecx;
+break;
+
+case X86_VENDOR_AMD:
+mask &= ((uint64_t)ecx << 32) | edx;
+
+/*
+ * AMD masking MSRs are documented as overrides.
+ * Experimentally, fast-forwarding of the OSXSAVE and APIC
+ * bits from real hardware state only occurs if the MSR has
+ * the respective bits set.
+ */
+if ( ecx & cpufeat_mask(X86_FEATURE_XSAVE) )
+ecx = cpufeat_mask(X86_FEATURE_OSXSAVE);
+else
+ecx = 0;
+edx = cpufeat_mask(X86_FEATURE_APIC);
+
+mask |= ((uint64_t)ecx << 32) | edx;
+break;
+}
+
+d->arch.pv_domain.cpuidmasks->_1cd = mask;
+}
+break;
+
+case 6:
+if ( is_pv_domain(d) && ((levelling_caps & LCAP_6c) == LCAP_6c) )
+{
+uint64_t mask = cpuidmask_defaults._6c;
+
+if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
+mask &= (~0ULL << 32) | ctl->ecx;
+
+d->arch.pv_domain.cpuidmasks->_6c = mask;
+}
+break;
+
+case 7:
+if ( ctl->input[1] != 0 )
+break;
+
+if ( is_pv_domain(d) && ((levelling_caps & LCAP_7ab0) == LCAP_7ab0) )
+{
+uint64_t mask = cpuidmask_defaults._7ab0;
+uint32_t eax = ctl->eax;
+uint32_t ebx = ctl->ebx & pv_featureset[FEATURESET_7b0];
+
+if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
+mask &= ((uint64_t)eax << 32) | ebx;
+
+d->arch.pv_domain.cpuidmasks->_7ab0 = mask;
+}
+break;
+
+case 0xd:
+if ( ctl->input[1] != 1 )
+break;
+
+if ( is_pv_domain(d) && ((levelling_caps & LCAP_Da1) == LCAP_Da1) )
+{
+uint64_t mask = cpuidmask_defaults.Da1;
+uint32_t eax = ctl->eax & pv_featureset[FEATURESET_Da1];
+
+if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+mask &= (~0ULL << 32) | eax;
+
+d->arch.pv_domain.cpuidmasks->Da1 = mask;
+}
+break;
+
+case 0x80000001:
+if ( is_pv_domain(d) && ((levelling_caps & LCAP_e1cd) == LCAP_e1cd) )
+{
+uint64_t mask = cpuidmask_defaults.e1cd;
+  

[Xen-devel] [PATCH v5 18/21] tools: Utility for dealing with featuresets

2016-04-07 Thread Andrew Cooper
It is able to report the current featuresets, both the static masks and
dynamic featuresets from Xen, or to decode an arbitrary featureset into
`/proc/cpuinfo` style strings.
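
As a rough illustration of the decoding step, the sketch below turns one 32-bit feature word into a space-separated, `/proc/cpuinfo` style string. It is not the code from xen-cpuid.c: `decode_feature_word()` and `example_names[]` are hypothetical names, and the table only carries a few entries copied from the `str_1d[]` table in this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, partial name table for a single 32-bit feature word. */
static const char *const example_names[32] = {
    [0] = "fpu", [1] = "vme", [2] = "de",  [3] = "pse",
    [4] = "tsc", [5] = "msr", [6] = "pae", [7] = "mce",
};

/*
 * Write the names of all set bits in 'word' into 'buf', separated by
 * single spaces.  Bits without a table entry are rendered as "<bit>".
 * Returns the number of characters of complete names written.
 */
static size_t decode_feature_word(uint32_t word, const char *const names[32],
                                  char *buf, size_t len)
{
    size_t used = 0;

    assert(buf != NULL);
    if ( len )
        buf[0] = '\0';

    for ( unsigned int bit = 0; bit < 32; ++bit )
    {
        int n;

        if ( !(word & (1u << bit)) )
            continue;

        if ( names[bit] )
            n = snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", names[bit]);
        else
            n = snprintf(buf + used, len - used, "%s<%u>",
                         used ? " " : "", bit);

        /* Stop once the output no longer fits in the buffer. */
        if ( n < 0 || (size_t)n >= len - used )
            break;

        used += (size_t)n;
    }

    return used;
}
```

Calling `decode_feature_word(0x13, example_names, buf, sizeof(buf))` fills `buf` with the names for bits 0, 1 and 4.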

Signed-off-by: Andrew Cooper 
Acked-by: Wei Liu 
---
CC: Ian Jackson 

v2: No linking hackery
---
 .gitignore |   1 +
 tools/misc/Makefile|   4 +
 tools/misc/xen-cpuid.c | 394 +
 3 files changed, 399 insertions(+)
 create mode 100644 tools/misc/xen-cpuid.c

diff --git a/.gitignore b/.gitignore
index b40453e..20ffa2d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -179,6 +179,7 @@ tools/misc/cpuperf/cpuperf-perfcntr
 tools/misc/cpuperf/cpuperf-xen
 tools/misc/xc_shadow
 tools/misc/xen_cpuperf
+tools/misc/xen-cpuid
 tools/misc/xen-detect
 tools/misc/xen-tmem-list-parse
 tools/misc/xenperf
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index a2ef0ec..a94dad9 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -10,6 +10,7 @@ CFLAGS += $(CFLAGS_xeninclude)
 CFLAGS += $(CFLAGS_libxenstore)
 
 # Everything to be installed in regular bin/
+INSTALL_BIN-$(CONFIG_X86)  += xen-cpuid
 INSTALL_BIN-$(CONFIG_X86)  += xen-detect
 INSTALL_BIN+= xencons
 INSTALL_BIN+= xencov_split
@@ -68,6 +69,9 @@ clean:
 .PHONY: distclean
 distclean: clean
 
+xen-cpuid: xen-cpuid.o
+   $(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) 
$(APPEND_LDFLAGS)
+
 xen-hvmctx: xen-hvmctx.o
$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
new file mode 100644
index 0000000..608c488
--- /dev/null
+++ b/tools/misc/xen-cpuid.c
@@ -0,0 +1,394 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define ARRAY_SIZE(a) (sizeof a / sizeof *a)
+static uint32_t nr_features;
+
+static const char *str_1d[32] =
+{
+[ 0] = "fpu",  [ 1] = "vme",
+[ 2] = "de",   [ 3] = "pse",
+[ 4] = "tsc",  [ 5] = "msr",
+[ 6] = "pae",  [ 7] = "mce",
+[ 8] = "cx8",  [ 9] = "apic",
+[10] = "REZ",  [11] = "sysenter",
+[12] = "mtrr", [13] = "pge",
+[14] = "mca",  [15] = "cmov",
+[16] = "pat",  [17] = "pse36",
+[18] = "psn",  [19] = "clflush",
+[20] = "REZ",  [21] = "ds",
+[22] = "acpi", [23] = "mmx",
+[24] = "fxsr", [25] = "sse",
+[26] = "sse2", [27] = "ss",
+[28] = "htt",  [29] = "tm",
+[30] = "ia64", [31] = "pbe",
+};
+
+static const char *str_1c[32] =
+{
+[ 0] = "sse3",[ 1] = "pclmulqdq",
+[ 2] = "dtes64",  [ 3] = "monitor",
+[ 4] = "ds-cpl",  [ 5] = "vmx",
+[ 6] = "smx", [ 7] = "est",
+[ 8] = "tm2", [ 9] = "ssse3",
+[10] = "cntx-id", [11] = "sdgb",
+[12] = "fma", [13] = "cx16",
+[14] = "xtpr",[15] = "pdcm",
+[16] = "REZ", [17] = "pcid",
+[18] = "dca", [19] = "sse41",
+[20] = "sse42",   [21] = "x2apic",
+[22] = "movebe",  [23] = "popcnt",
+[24] = "tsc-dl",  [25] = "aesni",
+[26] = "xsave",   [27] = "osxsave",
+[28] = "avx", [29] = "f16c",
+[30] = "rdrnd",   [31] = "hyper",
+};
+
+static const char *str_e1d[32] =
+{
+[ 0] = "fpu",[ 1] = "vme",
+[ 2] = "de", [ 3] = "pse",
+[ 4] = "tsc",[ 5] = "msr",
+[ 6] = "pae",[ 7] = "mce",
+[ 8] = "cx8",[ 9] = "apic",
+[10] = "REZ",[11] = "syscall",
+[12] = "mtrr",   [13] = "pge",
+[14] = "mca",[15] = "cmov",
+[16] = "fcmov",  [17] = "pse36",
+[18] = "REZ",[19] = "mp",
+[20] = "nx", [21] = "REZ",
+[22] = "mmx+",   [23] = "mmx",
+[24] = "fxsr",   [25] = "fxsr+",
+[26] = "pg1g",   [27] = "rdtscp",
+[28] = "REZ",[29] = "lm",
+[30] = "3dnow+", [31] = "3dnow",
+};
+
+static const char *str_e1c[32] =
+{
+[ 0] = "lahf_lm",[ 1] = "cmp",
+[ 2] = "svm",[ 3] = "extapic",
+[ 4] = "cr8d",   [ 5] = "lzcnt",
+[ 6] = "sse4a",  [ 7] = "msse",
+[ 8] = "3dnowpf",[ 9] = "osvw",
+[10] = "ibs",[11] = "xop",
+[12] = "skinit", [13] = "wdt",
+[14] = "REZ",[15] = "lwp",
+[16] = "fma4",   [17] = "tce",
+[18] = "REZ",[19] = "nodeid",
+[20] = "REZ",[21] = "tbm",
+[22] = "topoext",[23] = "perfctr_core",
+[24] = "perfctr_nb", [25] = "REZ",
+[26] = "dbx",[27] = "perftsc",
+[28] = "pcx_l2i",[29] = "monitorx",
+
+[30 ... 31] = "REZ",
+};
+
+static const char *str_7b0[32] =
+{
+[ 0] = "fsgsbase", [ 1] = "tsc-adj",
+[ 2] = "sgx",  [ 3] = "bmi1",
+[ 4] = "hle",  [ 5] = "avx2",
+[ 6] = "REZ",  [ 7] = "smep",
+[ 8] = "bmi2", [ 9] = "erms",
+[10] = "invpcid",  [11] = "rtm",
+[12] = "pqm",  [13] = "depfpp",
+[14] = "mpx",  [15] = "pqe",
+[16] = "avx512f",  [17] = "avx512dq",
+[18] = "rdseed",   [19] = "adx",
+[20] = "smap", [21] = "avx512ifma",
+[22] = "pcomit",   [23] = "clflushopt",

[Xen-devel] [PATCH v5 12/21] x86/pv: Provide custom cpumasks for PV domains

2016-04-07 Thread Andrew Cooper
And use them in preference to cpuidmask_defaults on context switch.  HVM domains
must not be masked (to avoid interfering with cpuid calls within the guest),
so always lazily context switch to the host default.
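
The selection and lazy-update behaviour described above can be sketched in isolation as follows. This is only a model under simplifying assumptions: the `example_*` structures, `select_masks()` and `ctxt_switch_masks()` are hypothetical stand-ins, and a counter takes the place of the real MSR write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-ins for the hypervisor structures. */
struct example_masks { uint64_t _1cd; };
struct example_domain {
    bool is_pv;
    struct example_masks *masks; /* NULL when no per-domain masks exist */
};

static const struct example_masks example_defaults = { ._1cd = ~0ULL };

/* Cached view of what the current CPU has loaded, plus a write counter. */
static struct example_masks example_this_cpu = { ._1cd = ~0ULL };
static unsigned int example_msr_writes;

/* PV domains with their own masks use them; everything else uses defaults. */
static const struct example_masks *
select_masks(const struct example_domain *next)
{
    if ( next && next->is_pv && next->masks )
        return next->masks;

    return &example_defaults;
}

/* Only touch the (modelled) MSR when the wanted value actually differs. */
static void ctxt_switch_masks(const struct example_domain *next)
{
    const struct example_masks *masks = select_masks(next);

    assert(masks != NULL);
    if ( example_this_cpu._1cd != masks->_1cd )
    {
        example_this_cpu._1cd = masks->_1cd;
        ++example_msr_writes; /* stands in for an MSR write */
    }
}
```

Switching twice in a row to the same PV domain costs one modelled write, and switching to an HVM domain afterwards costs one more to restore the defaults.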

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v2:
 * s/cpumasks/cpuidmasks/
 * Use structure assignment
 * Fix error path in arch_domain_create()
v3:
 * Indentation fixes.
 * Only allocate PV cpuidmasks if the host has cpumasks to use.
---
 xen/arch/x86/cpu/amd.c   |  4 +++-
 xen/arch/x86/cpu/intel.c |  5 -
 xen/arch/x86/domain.c| 14 ++
 xen/include/asm-x86/domain.h |  2 ++
 4 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 3e2f4a8..d5afc3e 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -206,7 +206,9 @@ static void __init noinline probe_masking_msrs(void)
 static void amd_ctxt_switch_levelling(const struct domain *nextd)
 {
struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
-   const struct cpuidmasks *masks = &cpuidmask_defaults;
+   const struct cpuidmasks *masks =
+   (nextd && is_pv_domain(nextd) && nextd->arch.pv_domain.cpuidmasks)
+   ? nextd->arch.pv_domain.cpuidmasks : &cpuidmask_defaults;
 
 #define LAZY(cap, msr, field)  \
({  \
diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index e21c32d..fe4736e 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -154,7 +154,7 @@ static void __init probe_masking_msrs(void)
 static void intel_ctxt_switch_levelling(const struct domain *nextd)
 {
struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
-   const struct cpuidmasks *masks = &cpuidmask_defaults;
+   const struct cpuidmasks *masks;
 
if (cpu_has_cpuid_faulting) {
/*
@@ -178,6 +178,9 @@ static void intel_ctxt_switch_levelling(const struct domain 
*nextd)
return;
}
 
+   masks = (nextd && is_pv_domain(nextd) && nextd->arch.pv_domain.cpuidmasks)
+   ? nextd->arch.pv_domain.cpuidmasks : &cpuidmask_defaults;
+
 #define LAZY(msr, field)   \
({  \
if (unlikely(these_masks->field != masks->field) && \
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index cba77a2..a64bfdc 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -577,6 +577,14 @@ int arch_domain_create(struct domain *d, unsigned int 
domcr_flags,
 goto fail;
 clear_page(d->arch.pv_domain.gdt_ldt_l1tab);
 
+if ( levelling_caps & ~LCAP_faulting )
+{
+d->arch.pv_domain.cpuidmasks = xmalloc(struct cpuidmasks);
+if ( !d->arch.pv_domain.cpuidmasks )
+goto fail;
+*d->arch.pv_domain.cpuidmasks = cpuidmask_defaults;
+}
+
 rc = create_perdomain_mapping(d, GDT_LDT_VIRT_START,
   GDT_LDT_MBYTES << (20 - PAGE_SHIFT),
   NULL, NULL);
@@ -672,7 +680,10 @@ int arch_domain_create(struct domain *d, unsigned int 
domcr_flags,
 paging_final_teardown(d);
 free_perdomain_mappings(d);
 if ( is_pv_domain(d) )
+{
+xfree(d->arch.pv_domain.cpuidmasks);
 free_xenheap_page(d->arch.pv_domain.gdt_ldt_l1tab);
+}
 psr_domain_free(d);
 return rc;
 }
@@ -692,7 +703,10 @@ void arch_domain_destroy(struct domain *d)
 
 free_perdomain_mappings(d);
 if ( is_pv_domain(d) )
+{
 free_xenheap_page(d->arch.pv_domain.gdt_ldt_l1tab);
+xfree(d->arch.pv_domain.cpuidmasks);
+}
 
 free_xenheap_page(d->shared_info);
 cleanup_domain_irq_mapping(d);
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index de60def..90f021f 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -252,6 +252,8 @@ struct pv_domain
 
 /* map_domain_page() mapping cache. */
 struct mapcache_domain mapcache;
+
+struct cpuidmasks *cpuidmasks;
 };
 
 struct monitor_write_data {
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v5 19/21] tools/libxc: Wire a featureset through to cpuid policy logic

2016-04-07 Thread Andrew Cooper
Later changes will cause the cpuid generation logic to seed its information
from a featureset.  This patch adds the infrastructure to specify a
featureset, and will obtain the appropriate default from Xen if omitted.
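
One way to copy a caller-supplied featureset into a host-sized buffer is sketched below. It is an illustration, not the code in this patch: `seed_featureset()` is a hypothetical helper, and it assumes that "truncated set bits" means bits in words the host-sized copy would drop, so it only inspects caller words at or beyond the host length.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build a host-sized copy of a caller-supplied featureset.
 * On success, *out is a calloc()ed array of host_nr words (caller frees).
 * Returns 0, -EOPNOTSUPP if set bits would be lost, or -ENOMEM.
 */
static int seed_featureset(const uint32_t *user, unsigned int user_nr,
                           unsigned int host_nr, uint32_t **out)
{
    unsigned int i, copy_nr = user_nr < host_nr ? user_nr : host_nr;
    uint32_t *fs;

    assert(user != NULL && out != NULL);

    /* Reject input whose set bits lie beyond what the copy can hold. */
    for ( i = host_nr; i < user_nr; ++i )
        if ( user[i] != 0 )
            return -EOPNOTSUPP;

    /* Zero-filled, so a shorter caller set is padded with clear bits. */
    fs = calloc(host_nr ? host_nr : 1, sizeof(*fs));
    if ( !fs )
        return -ENOMEM;

    if ( copy_nr )
        memcpy(fs, user, copy_nr * sizeof(*fs));

    *out = fs;
    return 0;
}
```

Checking before allocating means the error path has nothing to free.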

Signed-off-by: Andrew Cooper 
Acked-by: Wei Liu 
---
CC: Ian Jackson 

v2:
 * Modify existing call rather than introducing a new one.
 * Fix up in-tree callsites.
---
 tools/libxc/include/xenctrl.h   |  4 ++-
 tools/libxc/xc_cpuid_x86.c  | 69 -
 tools/libxl/libxl_cpuid.c   |  2 +-
 tools/ocaml/libs/xc/xenctrl_stubs.c |  2 +-
 tools/python/xen/lowlevel/xc/xc.c   |  2 +-
 5 files changed, 66 insertions(+), 13 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 3715f51..f5a034a 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1985,7 +1985,9 @@ int xc_cpuid_set(xc_interface *xch,
  const char **config,
  char **config_transformed);
 int xc_cpuid_apply_policy(xc_interface *xch,
-  domid_t domid);
+  domid_t domid,
+  uint32_t *featureset,
+  unsigned int nr_features);
 void xc_cpuid_to_str(const unsigned int *regs,
  char **strs); /* some strs[] may be NULL if ENOMEM */
 int xc_mca_op(xc_interface *xch, struct xen_mc *mc);
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 0cffb36..a92f5e4 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -166,6 +166,9 @@ struct cpuid_domain_info
 bool pvh;
 uint64_t xfeature_mask;
 
+uint32_t *featureset;
+unsigned int nr_features;
+
 /* PV-only information. */
 bool pv64;
 
@@ -197,11 +200,14 @@ static void cpuid(const unsigned int *input, unsigned int 
*regs)
 }
 
 static int get_cpuid_domain_info(xc_interface *xch, domid_t domid,
- struct cpuid_domain_info *info)
+ struct cpuid_domain_info *info,
+ uint32_t *featureset,
+ unsigned int nr_features)
 {
 struct xen_domctl domctl = {};
 xc_dominfo_t di;
 unsigned int in[2] = { 0, ~0U }, regs[4];
+unsigned int i, host_nr_features = xc_get_cpu_featureset_size();
 int rc;
 
 cpuid(in, regs);
@@ -223,6 +229,23 @@ static int get_cpuid_domain_info(xc_interface *xch, 
domid_t domid,
 info->hvm = di.hvm;
 info->pvh = di.pvh;
 
+info->featureset = calloc(host_nr_features, sizeof(*info->featureset));
+if ( !info->featureset )
+return -ENOMEM;
+
+info->nr_features = host_nr_features;
+
+if ( featureset )
+{
+memcpy(info->featureset, featureset,
+   min(host_nr_features, nr_features) * sizeof(*info->featureset));
+
+/* Check for truncated set bits. */
+for ( i = nr_features; i < host_nr_features; ++i )
+if ( featureset[i] != 0 )
+return -EOPNOTSUPP;
+}
+
 /* Get xstate information. */
 domctl.cmd = XEN_DOMCTL_getvcpuextstate;
 domctl.domain = domid;
@@ -247,6 +270,14 @@ static int get_cpuid_domain_info(xc_interface *xch, 
domid_t domid,
 return rc;
 
 info->nestedhvm = !!val;
+
+if ( !featureset )
+{
+rc = xc_get_cpu_featureset(xch, XEN_SYSCTL_cpu_featureset_hvm,
+   &host_nr_features, info->featureset);
+if ( rc )
+return rc;
+}
 }
 else
 {
@@ -257,11 +288,24 @@ static int get_cpuid_domain_info(xc_interface *xch, 
domid_t domid,
 return rc;
 
 info->pv64 = (width == 8);
+
+if ( !featureset )
+{
+rc = xc_get_cpu_featureset(xch, XEN_SYSCTL_cpu_featureset_pv,
+   &host_nr_features, info->featureset);
+if ( rc )
+return rc;
+}
 }
 
 return 0;
 }
 
+static void free_cpuid_domain_info(struct cpuid_domain_info *info)
+{
+free(info->featureset);
+}
+
 static void amd_xc_cpuid_policy(xc_interface *xch,
 const struct cpuid_domain_info *info,
 const unsigned int *input, unsigned int *regs)
@@ -789,16 +833,18 @@ void xc_cpuid_to_str(const unsigned int *regs, char 
**strs)
 }
 }
 
-int xc_cpuid_apply_policy(xc_interface *xch, domid_t domid)
+int xc_cpuid_apply_policy(xc_interface *xch, domid_t domid,
+  uint32_t *featureset,
+  unsigned int nr_features)
 {
 struct cpuid_domain_info info = {};
 unsigned int input[2] = { 0, 0 }, regs[4];
 unsigned int base_max, ext_max;
 int rc;
 
-rc = get_cpuid_domain_info(xch, domid, &info);
+rc = get_cpuid_domain_info(xch, domid, &info, featureset, nr_features);
 if ( rc )
-return 

[Xen-devel] [PATCH v5 15/21] tools/libxc: Modify bitmap operations to take void pointers

2016-04-07 Thread Andrew Cooper
The type of the pointer to a bitmap is not interesting; it does not affect the
representation of the block of bits being pointed to.

Make the libxc functions consistent with those in Xen, so they can work just
as well with 'unsigned int *' based bitmaps.

As part of doing so, change the implementation to be in terms of char rather
than unsigned long.  This fixes alignment concerns with ARM.
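
A minimal sketch of byte-granular bitmap helpers follows, to show why the pointer type of the caller's bitmap stops mattering. The `ex_` prefixed names are hypothetical, and the sketch uses `unsigned char` where the patch uses `char`.

```c
#include <assert.h>
#include <stdlib.h>

/* Bytes needed to hold nr_bits bits. */
static int ex_bitmap_size(int nr_bits)
{
    assert(nr_bits >= 0);
    return (nr_bits + 7) / 8;
}

static void *ex_bitmap_alloc(int nr_bits)
{
    return calloc(1, ex_bitmap_size(nr_bits));
}

/* All accessors index bytes, so any object pointer can be passed in. */
static int ex_test_bit(int nr, const void *addr_)
{
    const unsigned char *addr = addr_;

    return (addr[nr / 8] >> (nr % 8)) & 1;
}

static void ex_set_bit(int nr, void *addr_)
{
    unsigned char *addr = addr_;

    addr[nr / 8] |= (unsigned char)(1u << (nr % 8));
}

static void ex_clear_bit(int nr, void *addr_)
{
    unsigned char *addr = addr_;

    addr[nr / 8] &= (unsigned char)~(1u << (nr % 8));
}
```

Because only single bytes are ever read or written, the helpers place no alignment requirement on the buffer beyond that of a byte.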

Signed-off-by: Andrew Cooper 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Stefano Stabellini 
CC: Julien Grall 

v2:
 * New
v3:
 * Implement in terms of char rather than unsigned long to fix alignment
   issues for ARM.
v4:
 * Fix erroneous calculation in bitmap_size()
---
 tools/libxc/xc_bitops.h | 37 -
 1 file changed, 20 insertions(+), 17 deletions(-)

diff --git a/tools/libxc/xc_bitops.h b/tools/libxc/xc_bitops.h
index cd749f4..3e7a544 100644
--- a/tools/libxc/xc_bitops.h
+++ b/tools/libxc/xc_bitops.h
@@ -6,70 +6,73 @@
 #include 
 #include 
 
+/* Needed by several includees, but no longer used for bitops. */
 #define BITS_PER_LONG (sizeof(unsigned long) * 8)
 #define ORDER_LONG (sizeof(unsigned long) == 4 ? 5 : 6)
 
-#define BITMAP_ENTRY(_nr,_bmap) ((_bmap))[(_nr)/BITS_PER_LONG]
-#define BITMAP_SHIFT(_nr) ((_nr) % BITS_PER_LONG)
+#define BITMAP_ENTRY(_nr,_bmap) ((_bmap))[(_nr) / 8]
+#define BITMAP_SHIFT(_nr) ((_nr) % 8)
 
 /* calculate required space for number of longs needed to hold nr_bits */
 static inline int bitmap_size(int nr_bits)
 {
-int nr_long, nr_bytes;
-nr_long = (nr_bits + BITS_PER_LONG - 1) >> ORDER_LONG;
-nr_bytes = nr_long * sizeof(unsigned long);
-return nr_bytes;
+return (nr_bits + 7) / 8;
 }
 
-static inline unsigned long *bitmap_alloc(int nr_bits)
+static inline void *bitmap_alloc(int nr_bits)
 {
 return calloc(1, bitmap_size(nr_bits));
 }
 
-static inline void bitmap_set(unsigned long *addr, int nr_bits)
+static inline void bitmap_set(void *addr, int nr_bits)
 {
 memset(addr, 0xff, bitmap_size(nr_bits));
 }
 
-static inline void bitmap_clear(unsigned long *addr, int nr_bits)
+static inline void bitmap_clear(void *addr, int nr_bits)
 {
 memset(addr, 0, bitmap_size(nr_bits));
 }
 
-static inline int test_bit(int nr, unsigned long *addr)
+static inline int test_bit(int nr, const void *_addr)
 {
+const char *addr = _addr;
 return (BITMAP_ENTRY(nr, addr) >> BITMAP_SHIFT(nr)) & 1;
 }
 
-static inline void clear_bit(int nr, unsigned long *addr)
+static inline void clear_bit(int nr, void *_addr)
 {
+char *addr = _addr;
 BITMAP_ENTRY(nr, addr) &= ~(1UL << BITMAP_SHIFT(nr));
 }
 
-static inline void set_bit(int nr, unsigned long *addr)
+static inline void set_bit(int nr, void *_addr)
 {
+char *addr = _addr;
 BITMAP_ENTRY(nr, addr) |= (1UL << BITMAP_SHIFT(nr));
 }
 
-static inline int test_and_clear_bit(int nr, unsigned long *addr)
+static inline int test_and_clear_bit(int nr, void *addr)
 {
 int oldbit = test_bit(nr, addr);
 clear_bit(nr, addr);
 return oldbit;
 }
 
-static inline int test_and_set_bit(int nr, unsigned long *addr)
+static inline int test_and_set_bit(int nr, void *addr)
 {
 int oldbit = test_bit(nr, addr);
 set_bit(nr, addr);
 return oldbit;
 }
 
-static inline void bitmap_or(unsigned long *dst, const unsigned long *other,
+static inline void bitmap_or(void *_dst, const void *_other,
  int nr_bits)
 {
-int i, nr_longs = (bitmap_size(nr_bits) / sizeof(unsigned long));
-for ( i = 0; i < nr_longs; ++i )
+char *dst = _dst;
+const char *other = _other;
+int i;
+for ( i = 0; i < bitmap_size(nr_bits); ++i )
 dst[i] |= other[i];
 }
 
-- 
2.1.4




[Xen-devel] [PATCH v5 14/21] xen+tools: Export maximum host and guest cpu featuresets via SYSCTL

2016-04-07 Thread Andrew Cooper
And provide stubs for toolstack use.

Signed-off-by: Andrew Cooper 
Acked-by: Wei Liu 
Acked-by: David Scott 
Acked-by: Jan Beulich 
---
CC: Tim Deegan 
CC: Daniel De Graaf 

v2:
 * Rebased to use libxencall
 * Improve hypercall documentation
v3:
 * Provide libxc implementation for XEN_SYSCTL_get_cpu_levelling_caps as well.
v4:
 * More const.
v5:
 * XSM bits.
---
 tools/flask/policy/policy/modules/xen/xen.te |  1 +
 tools/libxc/include/xenctrl.h|  4 +++
 tools/libxc/xc_cpuid_x86.c   | 41 ++
 tools/ocaml/libs/xc/xenctrl.ml   |  3 ++
 tools/ocaml/libs/xc/xenctrl.mli  |  4 +++
 tools/ocaml/libs/xc/xenctrl_stubs.c  | 35 +++
 xen/arch/x86/sysctl.c| 51 
 xen/include/public/sysctl.h  | 27 +++
 xen/xsm/flask/hooks.c|  3 ++
 xen/xsm/flask/policy/access_vectors  |  2 ++
 10 files changed, 171 insertions(+)

diff --git a/tools/flask/policy/policy/modules/xen/xen.te 
b/tools/flask/policy/policy/modules/xen/xen.te
index c29b067..a551756 100644
--- a/tools/flask/policy/policy/modules/xen/xen.te
+++ b/tools/flask/policy/policy/modules/xen/xen.te
@@ -73,6 +73,7 @@ allow dom0_t xen_t:xen2 {
 pmu_ctrl
 get_symbol
 get_cpu_levelling_caps
+get_cpu_featureset
 };
 
 # Allow dom0 to use all XENVER_ subops and VERSION subops that have checks.
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index e8cb1ec..1c865a3 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2618,6 +2618,10 @@ int xc_psr_cat_get_domain_data(xc_interface *xch, 
uint32_t domid,
 int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
uint32_t *cos_max, uint32_t *cbm_len,
bool *cdp_enabled);
+
+int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps);
+int xc_get_cpu_featureset(xc_interface *xch, uint32_t index,
+  uint32_t *nr_features, uint32_t *featureset);
 #endif
 
 /* Compat shims */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 733add4..5780397 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -33,6 +33,47 @@
 #define DEF_MAX_INTELEXT  0x80000008u
 #define DEF_MAX_AMDEXT    0x8000001cu
 
+int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps)
+{
+DECLARE_SYSCTL;
+int ret;
+
+sysctl.cmd = XEN_SYSCTL_get_cpu_levelling_caps;
+ret = do_sysctl(xch, &sysctl);
+
+if ( !ret )
+*caps = sysctl.u.cpu_levelling_caps.caps;
+
+return ret;
+}
+
+int xc_get_cpu_featureset(xc_interface *xch, uint32_t index,
+  uint32_t *nr_features, uint32_t *featureset)
+{
+DECLARE_SYSCTL;
+DECLARE_HYPERCALL_BOUNCE(featureset,
+ *nr_features * sizeof(*featureset),
+ XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+int ret;
+
+if ( xc_hypercall_bounce_pre(xch, featureset) )
+return -1;
+
+sysctl.cmd = XEN_SYSCTL_get_cpu_featureset;
+sysctl.u.cpu_featureset.index = index;
+sysctl.u.cpu_featureset.nr_features = *nr_features;
+set_xen_guest_handle(sysctl.u.cpu_featureset.features, featureset);
+
+ret = do_sysctl(xch, &sysctl);
+
+xc_hypercall_bounce_post(xch, featureset);
+
+if ( !ret )
+*nr_features = sysctl.u.cpu_featureset.nr_features;
+
+return ret;
+}
+
 struct cpuid_domain_info
 {
 enum
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index 58a53a1..75006e7 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -242,6 +242,9 @@ external version_changeset: handle -> string = 
"stub_xc_version_changeset"
 external version_capabilities: handle -> string =
   "stub_xc_version_capabilities"
 
+type featureset_index = Featureset_raw | Featureset_host | Featureset_pv | 
Featureset_hvm
+external get_cpu_featureset : handle -> featureset_index -> int64 array = 
"stub_xc_get_cpu_featureset"
+
 external watchdog : handle -> int -> int32 -> int
   = "stub_xc_watchdog"
 
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index 16443df..720e4b2 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -147,6 +147,10 @@ external version_compile_info : handle -> compile_info
 external version_changeset : handle -> string = "stub_xc_version_changeset"
 external version_capabilities : handle -> string
   = "stub_xc_version_capabilities"
+
+type featureset_index = Featureset_raw | Featureset_host | Featureset_pv | 
Featureset_hvm
+external get_cpu_featureset : handle -> featureset_index -> int64 array = 
"stub_xc_get_cpu_featureset"
+
 type core_magic = Magic_hvm | Magic_pv
 type core_header = {
   xch_magic : core_magic;
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c 
b/tools/ocaml/libs/xc

[Xen-devel] [PATCH v5 17/21] tools/libxc: Expose the automatically generated cpu featuremask information

2016-04-07 Thread Andrew Cooper
Signed-off-by: Andrew Cooper 
Acked-by: Wei Liu 
---
CC: Ian Jackson 

New in v2
---
 tools/libxc/Makefile  |  9 ++
 tools/libxc/include/xenctrl.h | 14 
 tools/libxc/xc_cpuid_x86.c| 75 +++
 3 files changed, 98 insertions(+)

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 608404f..ef02c9d 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -145,6 +145,15 @@ $(eval $(genpath-target))
 
 xc_private.h: _paths.h
 
+ifeq ($(CONFIG_X86),y)
+
+_xc_cpuid_autogen.h: $(XEN_ROOT)/xen/include/public/arch-x86/cpufeatureset.h 
$(XEN_ROOT)/xen/tools/gen-cpuid.py
+   $(PYTHON) $(XEN_ROOT)/xen/tools/gen-cpuid.py -i $^ -o $@.new
+   $(call move-if-changed,$@.new,$@)
+
+build: _xc_cpuid_autogen.h
+endif
+
 $(CTRL_LIB_OBJS) $(GUEST_LIB_OBJS) \
 $(CTRL_PIC_OBJS) $(GUEST_PIC_OBJS): xc_private.h
 
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 1c865a3..3715f51 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2622,6 +2622,20 @@ int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t 
socket,
 int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps);
 int xc_get_cpu_featureset(xc_interface *xch, uint32_t index,
   uint32_t *nr_features, uint32_t *featureset);
+
+uint32_t xc_get_cpu_featureset_size(void);
+
+enum xc_static_cpu_featuremask {
+XC_FEATUREMASK_KNOWN,
+XC_FEATUREMASK_SPECIAL,
+XC_FEATUREMASK_PV,
+XC_FEATUREMASK_HVM_SHADOW,
+XC_FEATUREMASK_HVM_HAP,
+XC_FEATUREMASK_DEEP_FEATURES,
+};
+const uint32_t *xc_get_static_cpu_featuremask(enum xc_static_cpu_featuremask);
+const uint32_t *xc_get_feature_deep_deps(uint32_t feature);
+
 #endif
 
 /* Compat shims */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index d3674db..0cffb36 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -28,6 +28,7 @@ enum {
 #define XEN_CPUFEATURE(name, value) X86_FEATURE_##name = value,
 #include 
 };
+#include "_xc_cpuid_autogen.h"
 
 #define bitmaskof(idx)  (1u << ((idx) & 31))
 #define clear_bit(idx, dst) ((dst) &= ~bitmaskof(idx))
@@ -78,6 +79,80 @@ int xc_get_cpu_featureset(xc_interface *xch, uint32_t index,
 return ret;
 }
 
+uint32_t xc_get_cpu_featureset_size(void)
+{
+return FEATURESET_NR_ENTRIES;
+}
+
+const uint32_t *xc_get_static_cpu_featuremask(
+enum xc_static_cpu_featuremask mask)
+{
+const static uint32_t known[FEATURESET_NR_ENTRIES] = INIT_KNOWN_FEATURES,
+special[FEATURESET_NR_ENTRIES] = INIT_SPECIAL_FEATURES,
+pv[FEATURESET_NR_ENTRIES] = INIT_PV_FEATURES,
+hvm_shadow[FEATURESET_NR_ENTRIES] = INIT_HVM_SHADOW_FEATURES,
+hvm_hap[FEATURESET_NR_ENTRIES] = INIT_HVM_HAP_FEATURES,
+deep_features[FEATURESET_NR_ENTRIES] = INIT_DEEP_FEATURES;
+
+XC_BUILD_BUG_ON(ARRAY_SIZE(known) != FEATURESET_NR_ENTRIES);
+XC_BUILD_BUG_ON(ARRAY_SIZE(special) != FEATURESET_NR_ENTRIES);
+XC_BUILD_BUG_ON(ARRAY_SIZE(pv) != FEATURESET_NR_ENTRIES);
+XC_BUILD_BUG_ON(ARRAY_SIZE(hvm_shadow) != FEATURESET_NR_ENTRIES);
+XC_BUILD_BUG_ON(ARRAY_SIZE(hvm_hap) != FEATURESET_NR_ENTRIES);
+XC_BUILD_BUG_ON(ARRAY_SIZE(deep_features) != FEATURESET_NR_ENTRIES);
+
+switch ( mask )
+{
+case XC_FEATUREMASK_KNOWN:
+return known;
+
+case XC_FEATUREMASK_SPECIAL:
+return special;
+
+case XC_FEATUREMASK_PV:
+return pv;
+
+case XC_FEATUREMASK_HVM_SHADOW:
+return hvm_shadow;
+
+case XC_FEATUREMASK_HVM_HAP:
+return hvm_hap;
+
+case XC_FEATUREMASK_DEEP_FEATURES:
+return deep_features;
+
+default:
+return NULL;
+}
+}
+
+const uint32_t *xc_get_feature_deep_deps(uint32_t feature)
+{
+static const struct {
+uint32_t feature;
+uint32_t fs[FEATURESET_NR_ENTRIES];
+} deep_deps[] = INIT_DEEP_DEPS;
+
+unsigned int start = 0, end = ARRAY_SIZE(deep_deps);
+
+XC_BUILD_BUG_ON(ARRAY_SIZE(deep_deps) != NR_DEEP_DEPS);
+
+/* deep_deps[] is sorted.  Perform a binary search. */
+while ( start < end )
+{
+unsigned int mid = start + ((end - start) / 2);
+
+if ( deep_deps[mid].feature > feature )
+end = mid;
+else if ( deep_deps[mid].feature < feature )
+start = mid + 1;
+else
+return deep_deps[mid].fs;
+}
+
+return NULL;
+}
+
 struct cpuid_domain_info
 {
 enum
-- 
2.1.4




[Xen-devel] [PATCH v5 21/21] tools/libxc: Calculate xstate cpuid leaf from guest information

2016-04-07 Thread Andrew Cooper
The existing logic is broken for heterogeneous migration.  By always
advertising the host maximum xstate, a migration to a less capable host always
fails as Xen cannot accommodate the xcr0_accum in the migration stream.

By calculating xstate from the feature information (which a multi-host
toolstack will have levelled appropriately), the guest will have the current
host's maximum xstate advertised, allowing for correct migration to less
capable hosts.

In addition, some further improvements and corrections:
 - don't discard the known flags in sub-leaves 2..63 ECX
 - zap sub-leaves beyond 62
 - zap all bits in leaf 1, EBX/ECX.  No XSS features are currently supported.
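
The core idea, deriving the guest's xstate mask from its feature flags and then clamping it to the host limit, can be sketched on its own as below. The bit positions are the XCR0 values used in this patch; `struct example_features` and `guest_xfeature_mask()` are hypothetical simplifications of the featureset bitmap and the libxc logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* XCR0 state-component bits, as defined in the patch. */
#define EX_XCR0_X87    (1ULL <<  0)
#define EX_XCR0_SSE    (1ULL <<  1)
#define EX_XCR0_AVX    (1ULL <<  2)
#define EX_XCR0_BNDREG (1ULL <<  3)
#define EX_XCR0_BNDCSR (1ULL <<  4)
#define EX_XCR0_PKRU   (1ULL <<  9)
#define EX_XCR0_LWP    (1ULL << 62)

/* Hypothetical flattened view of the relevant guest feature bits. */
struct example_features { bool xsave, avx, mpx, pku, lwp; };

/* Returns the xstate mask to advertise, or 0 if xsave is unavailable. */
static uint64_t guest_xfeature_mask(const struct example_features *f,
                                    uint64_t host_mask)
{
    uint64_t mask;

    assert(f != NULL);
    if ( host_mask == 0 || !f->xsave )
        return 0;

    mask = EX_XCR0_X87 | EX_XCR0_SSE;
    if ( f->avx )
        mask |= EX_XCR0_AVX;
    if ( f->mpx )
        mask |= EX_XCR0_BNDREG | EX_XCR0_BNDCSR;
    if ( f->pku )
        mask |= EX_XCR0_PKRU;
    if ( f->lwp )
        mask |= EX_XCR0_LWP;

    /* Never advertise more than the host can actually provide. */
    return mask & host_mask;
}
```

A guest with MPX selected on a host whose mask lacks the BNDREG/BNDCSR bits simply ends up without those components, rather than with a mask the host cannot honour.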

Signed-off-by: Andrew Cooper 
Signed-off-by: Jan Beulich 
---
CC: Wei Liu 
CC: Ian Jackson 

v3:
 * Reintroduce MPX adjustment (this series has been in development since
   before the introduction of MPX upstream, and it got lost in a rebase).
v4:
 * Fold further improvements from Jan.
v5:
 * Reintroduce PKRU (again, lost due to rebasing).
 * Rewrite the commit message and comments to try and better explain why I am
   deliberately removing host-specific information from the xstate calculation.
 * Reintroduce 0xFFFFFFFF masks for EAX, to avoid Coverity complaining about
   truncation on assignment.
---
 tools/libxc/xc_cpuid_x86.c | 89 ++
 1 file changed, 75 insertions(+), 14 deletions(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index fc7e20a..6d14904 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -398,54 +398,115 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
 }
 }
 
+/* XSTATE bits in XCR0. */
+#define X86_XCR0_X87    (1ULL <<  0)
+#define X86_XCR0_SSE    (1ULL <<  1)
+#define X86_XCR0_AVX    (1ULL <<  2)
+#define X86_XCR0_BNDREG (1ULL <<  3)
+#define X86_XCR0_BNDCSR (1ULL <<  4)
+#define X86_XCR0_PKRU   (1ULL <<  9)
+#define X86_XCR0_LWP    (1ULL << 62)
+
+#define X86_XSS_MASK    (0) /* No XSS states supported yet. */
+
+/* Per-component subleaf flags. */
+#define XSTATE_XSS  (1ULL <<  0)
+#define XSTATE_ALIGN64  (1ULL <<  1)
+
 /* Configure extended state enumeration leaves (0x0000000D for xsave) */
 static void xc_cpuid_config_xsave(xc_interface *xch,
   const struct cpuid_domain_info *info,
   const unsigned int *input, unsigned int 
*regs)
 {
-if ( info->xfeature_mask == 0 )
+uint64_t guest_xfeature_mask;
+
+if ( info->xfeature_mask == 0 ||
+ !test_bit(X86_FEATURE_XSAVE, info->featureset) )
 {
 regs[0] = regs[1] = regs[2] = regs[3] = 0;
 return;
 }
 
+guest_xfeature_mask = X86_XCR0_SSE | X86_XCR0_X87;
+
+if ( test_bit(X86_FEATURE_AVX, info->featureset) )
+guest_xfeature_mask |= X86_XCR0_AVX;
+
+if ( test_bit(X86_FEATURE_MPX, info->featureset) )
+guest_xfeature_mask |= X86_XCR0_BNDREG | X86_XCR0_BNDCSR;
+
+if ( test_bit(X86_FEATURE_PKU, info->featureset) )
+guest_xfeature_mask |= X86_XCR0_PKRU;
+
+if ( test_bit(X86_FEATURE_LWP, info->featureset) )
+guest_xfeature_mask |= X86_XCR0_LWP;
+
+/*
+ * In the common case, the toolstack will have queried Xen for the maximum
+ * available featureset, and guest_xfeature_mask should not be able to be
+ * calculated as being greater than the host limit, info->xfeature_mask.
+ *
+ * Nothing currently prevents a toolstack (or an optimistic user) from
+ * purposefully trying to select a larger-than-available xstate set.
+ *
+ * To avoid the domain dying with an unexpected fault, clamp the
+ * calculated mask to the host limit.  Future development work will remove
+ * this possibility, when Xen fully audits the complete cpuid policy set
+ * for a domain.
+ */
+guest_xfeature_mask &= info->xfeature_mask;
+
 switch ( input[1] )
 {
-case 0: 
+case 0:
 /* EAX: low 32bits of xfeature_enabled_mask */
-regs[0] = info->xfeature_mask & 0xFFFFFFFF;
+regs[0] = guest_xfeature_mask & 0xFFFFFFFF;
 /* EDX: high 32bits of xfeature_enabled_mask */
-regs[3] = (info->xfeature_mask >> 32) & 0xFFFFFFFF;
+regs[3] = guest_xfeature_mask >> 32;
 /* ECX: max size required by all HW features */
 {
 unsigned int _input[2] = {0xd, 0x0}, _regs[4];
 regs[2] = 0;
-for ( _input[1] = 2; _input[1] < 64; _input[1]++ )
+for ( _input[1] = 2; _input[1] <= 62; _input[1]++ )
 {
 cpuid(_input, _regs);
 if ( (_regs[0] + _regs[1]) > regs[2] )
 regs[2] = _regs[0] + _regs[1];
 }
 }
-/* EBX: max size required by enabled features. 
- * This register contains a dynamic value, which varies when a guest 
- * enables or disables XSTATE features (via xsetbv). The default size 
- * after reset 

[Xen-devel] [PATCH v5 16/21] tools/libxc: Use public/featureset.h for cpuid policy generation

2016-04-07 Thread Andrew Cooper
Rather than having a different local copy of some of the feature
definitions.

Modify the xc_cpuid_x86.c cpumask helpers to appropriately truncate the
new values.

As some of the features have been renamed in the public API, similar renames
are made here.

Signed-off-by: Andrew Cooper 
Acked-by: Wei Liu 
---
CC: Ian Jackson 

v3:
 * Adjust naming to match Xen.
---
 tools/libxc/xc_cpufeature.h | 151 
 tools/libxc/xc_cpuid_x86.c  |  37 ++-
 2 files changed, 20 insertions(+), 168 deletions(-)
 delete mode 100644 tools/libxc/xc_cpufeature.h

diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
deleted file mode 100644
index 01dbeec..000
--- a/tools/libxc/xc_cpufeature.h
+++ /dev/null
@@ -1,151 +0,0 @@
-/*
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation;
- * version 2.1 of the License.
- *
- * This library is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with this library; If not, see .
- */
-
-#ifndef __LIBXC_CPUFEATURE_H
-#define __LIBXC_CPUFEATURE_H
-
-/* Intel-defined CPU features, CPUID level 0x00000001 (edx) */
-#define X86_FEATURE_FPU  0 /* Onboard FPU */
-#define X86_FEATURE_VME  1 /* Virtual Mode Extensions */
-#define X86_FEATURE_DE   2 /* Debugging Extensions */
-#define X86_FEATURE_PSE  3 /* Page Size Extensions */
-#define X86_FEATURE_TSC  4 /* Time Stamp Counter */
-#define X86_FEATURE_MSR  5 /* Model-Specific Registers, RDMSR, WRMSR */
-#define X86_FEATURE_PAE  6 /* Physical Address Extensions */
-#define X86_FEATURE_MCE  7 /* Machine Check Architecture */
-#define X86_FEATURE_CX8  8 /* CMPXCHG8 instruction */
-#define X86_FEATURE_APIC 9 /* Onboard APIC */
-#define X86_FEATURE_SEP 11 /* SYSENTER/SYSEXIT */
-#define X86_FEATURE_MTRR 12 /* Memory Type Range Registers */
-#define X86_FEATURE_PGE 13 /* Page Global Enable */
-#define X86_FEATURE_MCA 14 /* Machine Check Architecture */
-#define X86_FEATURE_CMOV 15 /* CMOV instruction */
-#define X86_FEATURE_PAT 16 /* Page Attribute Table */
-#define X86_FEATURE_PSE36   17 /* 36-bit PSEs */
-#define X86_FEATURE_PN  18 /* Processor serial number */
-#define X86_FEATURE_CLFLSH  19 /* Supports the CLFLUSH instruction */
-#define X86_FEATURE_DS  21 /* Debug Store */
-#define X86_FEATURE_ACPI 22 /* ACPI via MSR */
-#define X86_FEATURE_MMX 23 /* Multimedia Extensions */
-#define X86_FEATURE_FXSR 24 /* FXSAVE and FXRSTOR instructions */
-#define X86_FEATURE_XMM 25 /* Streaming SIMD Extensions */
-#define X86_FEATURE_XMM2 26 /* Streaming SIMD Extensions-2 */
-#define X86_FEATURE_SELFSNOOP   27 /* CPU self snoop */
-#define X86_FEATURE_HT  28 /* Hyper-Threading */
-#define X86_FEATURE_ACC 29 /* Automatic clock control */
-#define X86_FEATURE_IA64 30 /* IA-64 processor */
-#define X86_FEATURE_PBE 31 /* Pending Break Enable */
-
-/* AMD-defined CPU features, CPUID level 0x80000001 */
-/* Don't duplicate feature flags which are redundant with Intel! */
-#define X86_FEATURE_SYSCALL 11 /* SYSCALL/SYSRET */
-#define X86_FEATURE_MP  19 /* MP Capable. */
-#define X86_FEATURE_NX  20 /* Execute Disable */
-#define X86_FEATURE_MMXEXT  22 /* AMD MMX extensions */
-#define X86_FEATURE_FFXSR   25 /* FFXSR instruction optimizations */
-#define X86_FEATURE_PAGE1GB 26 /* 1Gb large page support */
-#define X86_FEATURE_RDTSCP  27 /* RDTSCP */
-#define X86_FEATURE_LM  29 /* Long Mode (x86-64) */
-#define X86_FEATURE_3DNOWEXT 30 /* AMD 3DNow! extensions */
-#define X86_FEATURE_3DNOW   31 /* 3DNow! */
-
-/* Intel-defined CPU features, CPUID level 0x00000001 (ecx) */
-#define X86_FEATURE_XMM3 0 /* Streaming SIMD Extensions-3 */
-#define X86_FEATURE_PCLMULQDQ 1 /* Carry-less multiplication */
-#define X86_FEATURE_DTES64   2 /* 64-bit Debug Store */
-#define X86_FEATURE_MWAIT 3 /* Monitor/Mwait support */
-#define X86_FEATURE_DSCPL 4 /* CPL Qualified Debug Store */
-#define X86_FEATURE_VMXE 5 /* Virtual Machine Extensions */
-#define X86_FEATURE_SMXE 6 /* Safer Mode Extensions */
-#define X86_FEATURE_EST  7 /* Enhanced SpeedStep */
-#define X86_FEATURE_TM2  8 /* Thermal Monitor 2 */
-#define X86_FEATURE_SSSE3 9 /* Supplemental Streaming SIMD Exts-3 */
-#define X86_FEATURE_CID 10 /* Context ID */
-#define X86_FEATURE_FMA 12 /* Fus

[Xen-devel] [PATCH v11 06/17] Xen: ARM: Add support for mapping platform device mmio

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

Add a bus_notifier for platform bus devices in order to map the device
mmio regions when DOM0 boots with ACPI.

Signed-off-by: Shannon Zhao 
Acked-by: Stefano Stabellini 
---
 drivers/xen/Makefile |   1 +
 drivers/xen/arm-device.c | 153 +++
 2 files changed, 154 insertions(+)
 create mode 100644 drivers/xen/arm-device.c

diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 9b7a35c..415f286 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -9,6 +9,7 @@ CFLAGS_features.o   := $(nostackp)
 
 CFLAGS_efi.o   += -fshort-wchar
 
+dom0-$(CONFIG_ARM64) += arm-device.o
 dom0-$(CONFIG_PCI) += pci.o
 dom0-$(CONFIG_USB_SUPPORT) += dbgp.o
 dom0-$(CONFIG_XEN_ACPI) += acpi.o $(xen-pad-y)
diff --git a/drivers/xen/arm-device.c b/drivers/xen/arm-device.c
new file mode 100644
index 000..b918e8e
--- /dev/null
+++ b/drivers/xen/arm-device.c
@@ -0,0 +1,153 @@
+/*
+ * Copyright (c) 2015, Linaro Limited, Shannon Zhao
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int xen_unmap_device_mmio(const struct resource *resources,
+unsigned int count)
+{
+   unsigned int i, j, nr;
+   int rc = 0;
+   const struct resource *r;
+   struct xen_remove_from_physmap xrp;
+
+   for (i = 0; i < count; i++) {
+   r = &resources[i];
+   nr = DIV_ROUND_UP(resource_size(r), XEN_PAGE_SIZE);
+   if ((resource_type(r) != IORESOURCE_MEM) || (nr == 0))
+   continue;
+
+   for (j = 0; j < nr; j++) {
+   xrp.domid = DOMID_SELF;
+   xrp.gpfn = XEN_PFN_DOWN(r->start) + j;
+   rc = HYPERVISOR_memory_op(XENMEM_remove_from_physmap,
+ &xrp);
+   if (rc)
+   return rc;
+   }
+   }
+
+   return rc;
+}
+
+static int xen_map_device_mmio(const struct resource *resources,
+  unsigned int count)
+{
+   unsigned int i, j, nr;
+   int rc = 0;
+   const struct resource *r;
+   xen_pfn_t *gpfns;
+   xen_ulong_t *idxs;
+   int *errs;
+   struct xen_add_to_physmap_range xatp;
+
+   for (i = 0; i < count; i++) {
+   r = &resources[i];
+   nr = DIV_ROUND_UP(resource_size(r), XEN_PAGE_SIZE);
+   if ((resource_type(r) != IORESOURCE_MEM) || (nr == 0))
+   continue;
+
+   gpfns = kzalloc(sizeof(xen_pfn_t) * nr, GFP_KERNEL);
+   idxs = kzalloc(sizeof(xen_ulong_t) * nr, GFP_KERNEL);
+   errs = kzalloc(sizeof(int) * nr, GFP_KERNEL);
+   if (!gpfns || !idxs || !errs) {
+   kfree(gpfns);
+   kfree(idxs);
+   kfree(errs);
+   rc = -ENOMEM;
+   goto unmap;
+   }
+
+   for (j = 0; j < nr; j++) {
+   /*
+* The regions are always mapped 1:1 to DOM0 and this is
+* fine because the memory map for DOM0 is the same as
+* the host (except for the RAM).
+*/
+   gpfns[j] = XEN_PFN_DOWN(r->start) + j;
+   idxs[j] = XEN_PFN_DOWN(r->start) + j;
+   }
+
+   xatp.domid = DOMID_SELF;
+   xatp.size = nr;
+   xatp.space = XENMAPSPACE_dev_mmio;
+
+   set_xen_guest_handle(xatp.gpfns, gpfns);
+   set_xen_guest_handle(xatp.idxs, idxs);
+   set_xen_guest_handle(xatp.errs, errs);
+
+   rc = HYPERVISOR_memory_op(XENMEM_add_to_physmap_range, &xatp);
+   kfree(gpfns);
+   kfree(idxs);
+   kfree(errs);
+   if (rc)
+   goto unmap;
+   }
+
+   return rc;
+
+unmap:
+   xen_unmap_device_mmio(resources, i);
+   return rc;
+}
+
+static int xen_platform_notifier(struct notifier_block *nb,
+unsigned long action, void *data)
+{
+   struct platform_device *pdev = to_platform_device(data);
+   int r = 0;

[Xen-devel] [PATCH v11 00/17] Add ACPI support for Xen Dom0 on ARM64

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

This patch set adds ACPI support for Xen Dom0 on ARM64. The relevant Xen
ACPI on ARM64 design document can be found at [1].

This patch set adds a new FDT node "uefi" under /hypervisor to pass UEFI
information. It introduces bus notifiers for the AMBA and platform buses
to map the MMIO space of newly added devices. It also makes the Xen domain
use xlated_setup_gnttab_pages to set up the grant table and a new hypercall
to get the event-channel irq.

In the initialization flow of the Linux kernel, xen_early_init() needs to
move before efi_init(). xen_early_init() then checks whether the kernel
runs on Xen through the /hypervisor node, and efi_init() calls a new
function, fdt_find_xen_uefi_params(), to parse the xen,uefi-* parameters
just like the existing efi_get_fdt_params().

arm64_enable_runtime_services() checks whether the kernel runs on Xen and,
if so, calls another new function, xen_efi_runtime_setup(), to set up the
runtime services instead of efi_native_runtime_setup().
xen_efi_runtime_setup() assigns the runtime function pointers to the
functions in drivers/xen/efi.c.

And since we pass a /hypervisor node and a /chosen node to Dom0, it
needs to check whether the DTS only contains a /hypervisor node and a
/chosen node in acpi_boot_table_init().
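
The ordering described above can be mocked in user space to make the dependency visible. This is only a sketch under stated assumptions: the sketch_* functions are stand-ins that record a call trace, the real kernel functions take different arguments, and the assumption that fdt_find_xen_uefi_params() is only reached when running on Xen is mine, not something the series states.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_MAX_STEPS 8

/* Call trace filled in by the stand-in functions below. */
static const char *trace[SKETCH_MAX_STEPS];
static unsigned int nr_steps;
static bool on_xen;

static void record(const char *step)
{
    assert(nr_steps < SKETCH_MAX_STEPS);
    trace[nr_steps++] = step;
}

/* Stand-in: detects Xen through the /hypervisor node. */
static void sketch_xen_early_init(bool hypervisor_node_present)
{
    record("xen_early_init");
    on_xen = hypervisor_node_present;
}

/* Stand-in: must run after sketch_xen_early_init() so on_xen is valid. */
static void sketch_efi_init(void)
{
    record("efi_init");
    if (on_xen)
        record("fdt_find_xen_uefi_params");
}

/* Stand-in: picks the Xen or the native runtime service setup. */
static void sketch_enable_runtime_services(void)
{
    record(on_xen ? "xen_efi_runtime_setup" : "efi_native_runtime_setup");
}

static void sketch_boot(bool hypervisor_node_present)
{
    nr_steps = 0;
    sketch_xen_early_init(hypervisor_node_present);
    sketch_efi_init();
    sketch_enable_runtime_services();
}
```

Calling sketch_boot(true) records the Xen path, and sketch_boot(false) ends in the native setup. Swapping the first two calls would make efi_init see a stale on_xen value, which is the reason for the reordering.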

Patches are tested on FVP base model.

Thanks,
Shannon

[1] http://lists.xen.org/archives/html/xen-devel/2015-11/msg00488.html

Changes since v10:
* address Rafael's comments on patch 1
* undo the device mmio mappings if it fails in patch 6

Changes since v9:
* address Rafael's comments on patch 1
* check the compatible string of hypervisor node in patch 12

Changes since v8:
* rebased on v4.6-rc1
* print UART device address (PATCH 1)
* use xen_for_each_gfn (PATCH 3)
* reduce indentation by inverting the condition (PATCH 10)
* move xen_early_init() before efi_init() as well for ARM (PATCH 11)
* sync the document with Xen (PATCH 13)

Changes since v7:
* add __init prefix for acpi_get_spcr_uart_addr (PATCH 1)

Changes since v6:
* rebase on linux master
* refactor codes as acpi_get_spcr_uart_addr (PATCH 1)
* sync with Xen (patch 9)

Changes since v5:
* rebase on linux master
* use acpi_dev_resource_memory to parse the device memory info(patch 1)
* sync with Xen (patch 9)

Changes since v4:
* rebase on linux master
* move the check acpi_device_should_be_hidden into
  acpi_bus_type_and_status (patch 1)
* use existing function fdt_subnode_offset (patch 16)

Changes since v3:
* rebase on linux master
* print a warning when there is no SPCR table
* rephrase the commit message of PATCH 3
* rephrase the words of PATCH 13
* use strcmp and factor the function in PATCH 16
* Add several ACKs and RBs, thanks a lot


Changes since v2:
* Use 0 to check if it should ignore the UART
* Fix the use of page_to_xen_pfn
* Factor ACPI and DT parts in xen_guest_init
* Check "uefi" node by full path
* Fix the statement of Documentation/devicetree/bindings/arm/xen.txt

Changes since v1:
* Rebase on linux mainline and wallclock patch from Stefano
* Refactor AMBA and platform device MMIO map to one file
* Use EFI_PARAVIRT to check if it supports XEN EFI
* Refactor Xen EFI codes
* Address other comments

Shannon Zhao (17):
  Xen: ACPI: Hide UART used by Xen
  xen/grant-table: Move xlated_setup_gnttab_pages to common place
  Xen: xlate: Use page_to_xen_pfn instead of page_to_pfn
  arm/xen: Use xen_xlate_map_ballooned_pages to setup grant table
  xen: memory : Add new XENMAPSPACE type XENMAPSPACE_dev_mmio
  Xen: ARM: Add support for mapping platform device mmio
  Xen: ARM: Add support for mapping AMBA device mmio
  Xen: public/hvm: sync changes of HVM_PARAM_CALLBACK_VIA ABI from Xen
  xen/hvm/params: Add a new delivery type for event-channel in
HVM_PARAM_CALLBACK_IRQ
  arm/xen: Get event-channel irq through HVM_PARAM when booting with
ACPI
  ARM: XEN: Move xen_early_init() before efi_init()
  ARM64: ACPI: Check if it runs on Xen to enable or disable ACPI
  ARM: Xen: Document UEFI support on Xen ARM virtual platforms
  XEN: EFI: Move x86 specific codes to architecture directory
  ARM64: XEN: Add a function to initialize Xen specific UEFI runtime
services
  FDT: Add a helper to get the subnode by given name
  Xen: EFI: Parse DT parameters for Xen specific UEFI

 Documentation/devicetree/bindings/arm/xen.txt |  35 +
 arch/arm/include/asm/xen/xen-ops.h|   6 +
 arch/arm/kernel/setup.c   |   2 +-
 arch/arm/xen/Makefile |   1 +
 arch/arm/xen/efi.c|  40 ++
 arch/arm/xen/enlighten.c  | 109 ++
 arch/arm64/include/asm/xen/xen-ops.h  |   6 +
 arch/arm64/kernel/acpi.c  |  14 +-
 arch/arm64/kernel/setup.c |   2 +-
 arch/arm64/xen/Makefile   |   1 +
 arch/x86/xen/efi.c| 112 +++
 arch/x86/xen/grant-table.c|  57 +---
 drivers/acpi/scan.c

[Xen-devel] [PATCH v11 07/17] Xen: ARM: Add support for mapping AMBA device mmio

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

Add a bus_notifier for AMBA bus devices in order to map the device
mmio regions when DOM0 boots with ACPI.

Signed-off-by: Shannon Zhao 
Reviewed-by: Stefano Stabellini 
Reviewed-by: Julien Grall 
---
 drivers/xen/arm-device.c | 43 +++
 1 file changed, 43 insertions(+)

diff --git a/drivers/xen/arm-device.c b/drivers/xen/arm-device.c
index b918e8e..778acf8 100644
--- a/drivers/xen/arm-device.c
+++ b/drivers/xen/arm-device.c
@@ -151,3 +151,46 @@ static int __init register_xen_platform_notifier(void)
 }
 
 arch_initcall(register_xen_platform_notifier);
+
+#ifdef CONFIG_ARM_AMBA
+#include 
+
+static int xen_amba_notifier(struct notifier_block *nb,
+unsigned long action, void *data)
+{
+   struct amba_device *adev = to_amba_device(data);
+   int r = 0;
+
+   switch (action) {
+   case BUS_NOTIFY_ADD_DEVICE:
+   r = xen_map_device_mmio(&adev->res, 1);
+   break;
+   case BUS_NOTIFY_DEL_DEVICE:
+   r = xen_unmap_device_mmio(&adev->res, 1);
+   break;
+   default:
+   return NOTIFY_DONE;
+   }
+   if (r)
+   dev_err(&adev->dev, "AMBA: Failed to %s device %s MMIO!\n",
+   action == BUS_NOTIFY_ADD_DEVICE ? "map" :
+   (action == BUS_NOTIFY_DEL_DEVICE ? "unmap" : "?"),
+   adev->dev.init_name);
+
+   return NOTIFY_OK;
+}
+
+static struct notifier_block amba_device_nb = {
+   .notifier_call = xen_amba_notifier,
+};
+
+static int __init register_xen_amba_notifier(void)
+{
+   if (!xen_initial_domain() || acpi_disabled)
+   return 0;
+
+   return bus_register_notifier(&amba_bustype, &amba_device_nb);
+}
+
+arch_initcall(register_xen_amba_notifier);
+#endif
-- 
2.0.4
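
The AMBA notifier hands a single resource to xen_map_device_mmio(), which (in patch 06) turns it into a list of 1:1 frame numbers. A minimal user-space sketch of that arithmetic follows. The sketch_* names are hypothetical, a 4K Xen page is assumed, and the hypercall itself is left out.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_XEN_PAGE_SHIFT 12
#define SKETCH_XEN_PAGE_SIZE  (1ULL << SKETCH_XEN_PAGE_SHIFT)

/* Number of 4K Xen frames covering [start, end], with end inclusive. */
static uint64_t sketch_mmio_nr_frames(uint64_t start, uint64_t end)
{
    uint64_t size;

    assert(end >= start);
    size = end - start + 1; /* same convention as resource_size() */

    /* Round up, as DIV_ROUND_UP() does in the patch. */
    return (size + SKETCH_XEN_PAGE_SIZE - 1) / SKETCH_XEN_PAGE_SIZE;
}

/*
 * Fill gpfns[] with the frame numbers of the region. Because the region
 * is mapped 1:1, the same values would be used for idxs[]. Returns the
 * number of entries written, capped at max.
 */
static uint64_t sketch_mmio_fill_gpfns(uint64_t start, uint64_t end,
                                       uint64_t *gpfns, uint64_t max)
{
    uint64_t nr = sketch_mmio_nr_frames(start, end);
    uint64_t j;

    if (nr > max)
        nr = max;

    for (j = 0; j < nr; j++)
        gpfns[j] = (start >> SKETCH_XEN_PAGE_SHIFT) + j;

    return nr;
}
```

For a 12 KiB region starting at 0x1000 this yields frames 1, 2 and 3, which is what the inner loop of xen_map_device_mmio() stores in both gpfns[] and idxs[].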



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v11 05/17] xen: memory : Add new XENMAPSPACE type XENMAPSPACE_dev_mmio

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

Add a new type of Xen map space for Dom0 to map a device's MMIO region.

Signed-off-by: Shannon Zhao 
Reviewed-by: Julien Grall 
---
 include/xen/interface/memory.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/xen/interface/memory.h b/include/xen/interface/memory.h
index 2ecfe4f..9aa8988 100644
--- a/include/xen/interface/memory.h
+++ b/include/xen/interface/memory.h
@@ -160,6 +160,7 @@ DEFINE_GUEST_HANDLE_STRUCT(xen_machphys_mapping_t);
 #define XENMAPSPACE_gmfn_foreign 4 /* GMFN from another dom,
* XENMEM_add_to_physmap_range only.
*/
+#define XENMAPSPACE_dev_mmio 5 /* device mmio region */
 
 /*
  * Sets the GPFN at which a particular page appears in the specified guest's
-- 
2.0.4





[Xen-devel] [PATCH v11 03/17] Xen: xlate: Use page_to_xen_pfn instead of page_to_pfn

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

Make xen_xlate_map_ballooned_pages work with 64K pages. In that case
Kernel pages are 64K in size but Xen pages remain 4K in size. Xen pfns
refer to 4K pages.

Signed-off-by: Shannon Zhao 
Reviewed-by: Stefano Stabellini 
Reviewed-by: Julien Grall 
---
 drivers/xen/xlate_mmu.c | 38 +++---
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/drivers/xen/xlate_mmu.c b/drivers/xen/xlate_mmu.c
index 9692656..23f1387 100644
--- a/drivers/xen/xlate_mmu.c
+++ b/drivers/xen/xlate_mmu.c
@@ -189,6 +189,18 @@ int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma,
 }
 EXPORT_SYMBOL_GPL(xen_xlate_unmap_gfn_range);
 
+struct map_balloon_pages {
+   xen_pfn_t *pfns;
+   unsigned int idx;
+};
+
+static void setup_balloon_gfn(unsigned long gfn, void *data)
+{
+   struct map_balloon_pages *info = data;
+
+   info->pfns[info->idx++] = gfn;
+}
+
 /**
  * xen_xlate_map_ballooned_pages - map a new set of ballooned pages
  * @gfns: returns the array of corresponding GFNs
@@ -205,11 +217,13 @@ int __init xen_xlate_map_ballooned_pages(xen_pfn_t **gfns, void **virt,
struct page **pages;
xen_pfn_t *pfns;
void *vaddr;
+   struct map_balloon_pages data;
int rc;
-   unsigned int i;
+   unsigned long nr_pages;
 
BUG_ON(nr_grant_frames == 0);
-   pages = kcalloc(nr_grant_frames, sizeof(pages[0]), GFP_KERNEL);
+   nr_pages = DIV_ROUND_UP(nr_grant_frames, XEN_PFN_PER_PAGE);
+   pages = kcalloc(nr_pages, sizeof(pages[0]), GFP_KERNEL);
if (!pages)
return -ENOMEM;
 
@@ -218,22 +232,24 @@ int __init xen_xlate_map_ballooned_pages(xen_pfn_t **gfns, void **virt,
kfree(pages);
return -ENOMEM;
}
-   rc = alloc_xenballooned_pages(nr_grant_frames, pages);
+   rc = alloc_xenballooned_pages(nr_pages, pages);
if (rc) {
-   pr_warn("%s Couldn't balloon alloc %ld pfns rc:%d\n", __func__,
-   nr_grant_frames, rc);
+   pr_warn("%s Couldn't balloon alloc %ld pages rc:%d\n", __func__,
+   nr_pages, rc);
kfree(pages);
kfree(pfns);
return rc;
}
-   for (i = 0; i < nr_grant_frames; i++)
-   pfns[i] = page_to_pfn(pages[i]);
 
-   vaddr = vmap(pages, nr_grant_frames, 0, PAGE_KERNEL);
+   data.pfns = pfns;
+   data.idx = 0;
+   xen_for_each_gfn(pages, nr_grant_frames, setup_balloon_gfn, &data);
+
+   vaddr = vmap(pages, nr_pages, 0, PAGE_KERNEL);
if (!vaddr) {
-   pr_warn("%s Couldn't map %ld pfns rc:%d\n", __func__,
-   nr_grant_frames, rc);
-   free_xenballooned_pages(nr_grant_frames, pages);
+   pr_warn("%s Couldn't map %ld pages rc:%d\n", __func__,
+   nr_pages, rc);
+   free_xenballooned_pages(nr_pages, pages);
kfree(pages);
kfree(pfns);
return -ENOMEM;
-- 
2.0.4
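
The page accounting in this patch can be checked with a small user-space sketch. It assumes 64K kernel pages and 4K Xen pages, so one kernel page holds 16 Xen frames. The SKETCH_* macros and sketch_* functions are hypothetical stand-ins for XEN_PFN_PER_PAGE, the nr_pages calculation and the xen_for_each_gfn() walk.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT       16 /* assumed 64K kernel pages */
#define SKETCH_XEN_PAGE_SHIFT   12 /* Xen pages stay 4K */
#define SKETCH_XEN_PFN_PER_PAGE \
    (1UL << (SKETCH_PAGE_SHIFT - SKETCH_XEN_PAGE_SHIFT))

/* Kernel pages needed to back nr_grant_frames Xen frames (rounded up). */
static unsigned long sketch_nr_kernel_pages(unsigned long nr_grant_frames)
{
    return (nr_grant_frames + SKETCH_XEN_PFN_PER_PAGE - 1) /
           SKETCH_XEN_PFN_PER_PAGE;
}

/*
 * Walk the kernel pages (given by their kernel pfn) and emit one Xen gfn
 * per 4K frame until nr_grant_frames entries are written.
 * Returns the number of gfns written.
 */
static unsigned long sketch_fill_gfns(const unsigned long *page_pfns,
                                      unsigned long nr_grant_frames,
                                      unsigned long *gfns)
{
    unsigned long idx = 0, p, k;

    assert(page_pfns != NULL && gfns != NULL);

    for (p = 0; idx < nr_grant_frames; p++) {
        /* First Xen frame inside kernel page p. */
        unsigned long first = page_pfns[p] * SKETCH_XEN_PFN_PER_PAGE;

        for (k = 0; k < SKETCH_XEN_PFN_PER_PAGE && idx < nr_grant_frames; k++)
            gfns[idx++] = first + k;
    }

    return idx;
}
```

With these assumptions 32 grant frames need 2 kernel pages and 33 need 3. That is why the patch sizes pages[] and the vmap() call by nr_pages while pfns[] keeps one entry per grant frame.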





[Xen-devel] [PATCH v11 02/17] xen/grant-table: Move xlated_setup_gnttab_pages to common place

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

Move xlated_setup_gnttab_pages to a common place, so it can be reused by
ARM to set up the grant table.

Rename it to xen_xlate_map_ballooned_pages.

Signed-off-by: Shannon Zhao 
Reviewed-by: Stefano Stabellini 
Reviewed-by: Julien Grall 
---
 arch/x86/xen/grant-table.c | 57 +--
 drivers/xen/xlate_mmu.c| 61 ++
 include/xen/xen-ops.h  |  2 ++
 3 files changed, 69 insertions(+), 51 deletions(-)

diff --git a/arch/x86/xen/grant-table.c b/arch/x86/xen/grant-table.c
index e079500..de4144c 100644
--- a/arch/x86/xen/grant-table.c
+++ b/arch/x86/xen/grant-table.c
@@ -111,63 +111,18 @@ int arch_gnttab_init(unsigned long nr_shared)
 }
 
 #ifdef CONFIG_XEN_PVH
-#include 
 #include 
-#include 
-static int __init xlated_setup_gnttab_pages(void)
-{
-   struct page **pages;
-   xen_pfn_t *pfns;
-   void *vaddr;
-   int rc;
-   unsigned int i;
-   unsigned long nr_grant_frames = gnttab_max_grant_frames();
-
-   BUG_ON(nr_grant_frames == 0);
-   pages = kcalloc(nr_grant_frames, sizeof(pages[0]), GFP_KERNEL);
-   if (!pages)
-   return -ENOMEM;
-
-   pfns = kcalloc(nr_grant_frames, sizeof(pfns[0]), GFP_KERNEL);
-   if (!pfns) {
-   kfree(pages);
-   return -ENOMEM;
-   }
-   rc = alloc_xenballooned_pages(nr_grant_frames, pages);
-   if (rc) {
-   pr_warn("%s Couldn't balloon alloc %ld pfns rc:%d\n", __func__,
-   nr_grant_frames, rc);
-   kfree(pages);
-   kfree(pfns);
-   return rc;
-   }
-   for (i = 0; i < nr_grant_frames; i++)
-   pfns[i] = page_to_pfn(pages[i]);
-
-   vaddr = vmap(pages, nr_grant_frames, 0, PAGE_KERNEL);
-   if (!vaddr) {
-   pr_warn("%s Couldn't map %ld pfns rc:%d\n", __func__,
-   nr_grant_frames, rc);
-   free_xenballooned_pages(nr_grant_frames, pages);
-   kfree(pages);
-   kfree(pfns);
-   return -ENOMEM;
-   }
-   kfree(pages);
-
-   xen_auto_xlat_grant_frames.pfn = pfns;
-   xen_auto_xlat_grant_frames.count = nr_grant_frames;
-   xen_auto_xlat_grant_frames.vaddr = vaddr;
-
-   return 0;
-}
-
+#include 
 static int __init xen_pvh_gnttab_setup(void)
 {
if (!xen_pvh_domain())
return -ENODEV;
 
-   return xlated_setup_gnttab_pages();
+   xen_auto_xlat_grant_frames.count = gnttab_max_grant_frames();
+
+   return xen_xlate_map_ballooned_pages(&xen_auto_xlat_grant_frames.pfn,
+&xen_auto_xlat_grant_frames.vaddr,
+xen_auto_xlat_grant_frames.count);
 }
 /* Call it _before_ __gnttab_init as we need to initialize the
  * xen_auto_xlat_grant_frames first. */
diff --git a/drivers/xen/xlate_mmu.c b/drivers/xen/xlate_mmu.c
index 5063c5e..9692656 100644
--- a/drivers/xen/xlate_mmu.c
+++ b/drivers/xen/xlate_mmu.c
@@ -29,6 +29,8 @@
  */
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -37,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 
 
 typedef void (*xen_gfn_fn_t)(unsigned long gfn, void *data);
 
@@ -185,3 +188,61 @@ int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma,
return 0;
 }
 EXPORT_SYMBOL_GPL(xen_xlate_unmap_gfn_range);
+
+/**
+ * xen_xlate_map_ballooned_pages - map a new set of ballooned pages
+ * @gfns: returns the array of corresponding GFNs
+ * @virt: returns the virtual address of the mapped region
+ * @nr_grant_frames: number of GFNs
+ * @return 0 on success, error otherwise
+ *
+ * This allocates a set of ballooned pages and maps them into the
+ * kernel's address space.
+ */
+int __init xen_xlate_map_ballooned_pages(xen_pfn_t **gfns, void **virt,
+unsigned long nr_grant_frames)
+{
+   struct page **pages;
+   xen_pfn_t *pfns;
+   void *vaddr;
+   int rc;
+   unsigned int i;
+
+   BUG_ON(nr_grant_frames == 0);
+   pages = kcalloc(nr_grant_frames, sizeof(pages[0]), GFP_KERNEL);
+   if (!pages)
+   return -ENOMEM;
+
+   pfns = kcalloc(nr_grant_frames, sizeof(pfns[0]), GFP_KERNEL);
+   if (!pfns) {
+   kfree(pages);
+   return -ENOMEM;
+   }
+   rc = alloc_xenballooned_pages(nr_grant_frames, pages);
+   if (rc) {
+   pr_warn("%s Couldn't balloon alloc %ld pfns rc:%d\n", __func__,
+   nr_grant_frames, rc);
+   kfree(pages);
+   kfree(pfns);
+   return rc;
+   }
+   for (i = 0; i < nr_grant_frames; i++)
+   pfns[i] = page_to_pfn(pages[i]);
+
+   vaddr = vmap(pages, nr_grant_frames, 0, PAGE_KERNEL);
+   if (!vaddr) {
+   pr_warn("%s Couldn't map %ld pfns rc:%d\n", __func__,
+   nr_grant_frames, rc

[Xen-devel] [PATCH v11 14/17] XEN: EFI: Move x86 specific codes to architecture directory

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

Move the x86-specific code to the architecture directory and export the EFI
runtime service functions. This will be useful for initializing runtime
services on ARM later.

Signed-off-by: Shannon Zhao 
Reviewed-by: Stefano Stabellini 
---
 arch/x86/xen/efi.c| 112 
 drivers/xen/efi.c | 174 ++
 include/xen/xen-ops.h |  30 ++---
 3 files changed, 168 insertions(+), 148 deletions(-)

diff --git a/arch/x86/xen/efi.c b/arch/x86/xen/efi.c
index be14cc3..86527f1 100644
--- a/arch/x86/xen/efi.c
+++ b/arch/x86/xen/efi.c
@@ -20,10 +20,122 @@
 #include 
 #include 
 
+#include 
 #include 
+#include 
 
 #include 
 #include 
+#include 
+
+static efi_char16_t vendor[100] __initdata;
+
+static efi_system_table_t efi_systab_xen __initdata = {
+   .hdr = {
+   .signature  = EFI_SYSTEM_TABLE_SIGNATURE,
+   .revision   = 0, /* Initialized later. */
+   .headersize = 0, /* Ignored by Linux Kernel. */
+   .crc32  = 0, /* Ignored by Linux Kernel. */
+   .reserved   = 0
+   },
+   .fw_vendor  = EFI_INVALID_TABLE_ADDR, /* Initialized later. */
+   .fw_revision= 0,  /* Initialized later. */
+   .con_in_handle  = EFI_INVALID_TABLE_ADDR, /* Not used under Xen. */
+   .con_in = EFI_INVALID_TABLE_ADDR, /* Not used under Xen. */
+   .con_out_handle = EFI_INVALID_TABLE_ADDR, /* Not used under Xen. */
+   .con_out= EFI_INVALID_TABLE_ADDR, /* Not used under Xen. */
+   .stderr_handle  = EFI_INVALID_TABLE_ADDR, /* Not used under Xen. */
+   .stderr = EFI_INVALID_TABLE_ADDR, /* Not used under Xen. */
+   .runtime= (efi_runtime_services_t *)EFI_INVALID_TABLE_ADDR,
+ /* Not used under Xen. */
+   .boottime   = (efi_boot_services_t *)EFI_INVALID_TABLE_ADDR,
+ /* Not used under Xen. */
+   .nr_tables  = 0,  /* Initialized later. */
+   .tables = EFI_INVALID_TABLE_ADDR  /* Initialized later. */
+};
+
+static const struct efi efi_xen __initconst = {
+   .systab   = NULL, /* Initialized later. */
+   .runtime_version  = 0,/* Initialized later. */
+   .mps  = EFI_INVALID_TABLE_ADDR,
+   .acpi = EFI_INVALID_TABLE_ADDR,
+   .acpi20   = EFI_INVALID_TABLE_ADDR,
+   .smbios   = EFI_INVALID_TABLE_ADDR,
+   .smbios3  = EFI_INVALID_TABLE_ADDR,
+   .sal_systab   = EFI_INVALID_TABLE_ADDR,
+   .boot_info= EFI_INVALID_TABLE_ADDR,
+   .hcdp = EFI_INVALID_TABLE_ADDR,
+   .uga  = EFI_INVALID_TABLE_ADDR,
+   .uv_systab= EFI_INVALID_TABLE_ADDR,
+   .fw_vendor= EFI_INVALID_TABLE_ADDR,
+   .runtime  = EFI_INVALID_TABLE_ADDR,
+   .config_table = EFI_INVALID_TABLE_ADDR,
+   .get_time = xen_efi_get_time,
+   .set_time = xen_efi_set_time,
+   .get_wakeup_time  = xen_efi_get_wakeup_time,
+   .set_wakeup_time  = xen_efi_set_wakeup_time,
+   .get_variable = xen_efi_get_variable,
+   .get_next_variable= xen_efi_get_next_variable,
+   .set_variable = xen_efi_set_variable,
+   .query_variable_info  = xen_efi_query_variable_info,
+   .update_capsule   = xen_efi_update_capsule,
+   .query_capsule_caps   = xen_efi_query_capsule_caps,
+   .get_next_high_mono_count = xen_efi_get_next_high_mono_count,
+   .reset_system = NULL, /* Functionality provided by Xen. */
+   .set_virtual_address_map  = NULL, /* Not used under Xen. */
+   .memmap   = NULL, /* Not used under Xen. */
+   .flags= 0 /* Initialized later. */
+};
+
+static efi_system_table_t __init *xen_efi_probe(void)
+{
+   struct xen_platform_op op = {
+   .cmd = XENPF_firmware_info,
+   .u.firmware_info = {
+   .type = XEN_FW_EFI_INFO,
+   .index = XEN_FW_EFI_CONFIG_TABLE
+   }
+   };
+   union xenpf_efi_info *info = &op.u.firmware_info.u.efi_info;
+
+   if (!xen_initial_domain() || HYPERVISOR_platform_op(&op) < 0)
+   return NULL;
+
+   /* Here we know that Xen runs on EFI platform. */
+
+   efi = efi_xen;
+
+   efi_systab_xen.tables = info->cfg.addr;
+   efi_systab_xen.nr_tables = info->cfg.nent;
+
+   op.cmd = XENPF_firmware_info;
+   op.u.firmware_info.type = XEN_FW_EFI_INFO;
+   op.u.firmware_info.index = XEN_FW_EFI_VENDOR;
+   info->vendor.bufsz = sizeof(ven

[Xen-devel] [PATCH v11 04/17] arm/xen: Use xen_xlate_map_ballooned_pages to setup grant table

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

Use xen_xlate_map_ballooned_pages to set up the grant table, so that it no
longer relies on DT or ACPI to pass the start address and size of the grant
table.

Signed-off-by: Shannon Zhao 
Acked-by: Stefano Stabellini 
Reviewed-by: Julien Grall 
---
 arch/arm/xen/enlighten.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 75cd734..d94f726 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -282,18 +282,10 @@ static int __init xen_guest_init(void)
 {
struct xen_add_to_physmap xatp;
struct shared_info *shared_info_page = NULL;
-   struct resource res;
-   phys_addr_t grant_frames;
 
if (!xen_domain())
return 0;
 
-   if (of_address_to_resource(xen_node, GRANT_TABLE_PHYSADDR, &res)) {
-   pr_err("Xen grant table base address not found\n");
-   return -ENODEV;
-   }
-   grant_frames = res.start;
-
xen_events_irq = irq_of_parse_and_map(xen_node, 0);
if (!xen_events_irq) {
pr_err("Xen event channel interrupt not found\n");
@@ -328,7 +320,10 @@ static int __init xen_guest_init(void)
if (xen_vcpu_info == NULL)
return -ENOMEM;
 
-   if (gnttab_setup_auto_xlat_frames(grant_frames)) {
+   xen_auto_xlat_grant_frames.count = gnttab_max_grant_frames();
+   if (xen_xlate_map_ballooned_pages(&xen_auto_xlat_grant_frames.pfn,
+ &xen_auto_xlat_grant_frames.vaddr,
+ xen_auto_xlat_grant_frames.count)) {
free_percpu(xen_vcpu_info);
return -ENOMEM;
}
-- 
2.0.4





[Xen-devel] [PATCH v11 01/17] Xen: ACPI: Hide UART used by Xen

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

ACPI 6.0 introduces a new table, STAO, to list the devices which are used
by Xen and can't be used by Dom0. On Xen virtual platforms, the physical
UART is used by Xen, so this patch hides the UART from Dom0.

CC: "Rafael J. Wysocki"  (supporter:ACPI)
CC: Len Brown  (supporter:ACPI)
CC: linux-a...@vger.kernel.org (open list:ACPI)
Signed-off-by: Shannon Zhao 
---
 drivers/acpi/scan.c | 74 +
 1 file changed, 74 insertions(+)

diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 5f28cf7..cfc73fe 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -46,6 +46,13 @@ DEFINE_MUTEX(acpi_device_lock);
 LIST_HEAD(acpi_wakeup_device_list);
 static DEFINE_MUTEX(acpi_hp_context_lock);
 
+/*
+ * The UART device described by the SPCR table is the only object which needs
+ * special-casing. Everything else is covered by ACPI namespace paths in STAO
+ * table.
+ */
+static u64 spcr_uart_addr;
+
 struct acpi_dep_data {
struct list_head node;
acpi_handle master;
@@ -1453,6 +1460,41 @@ static int acpi_add_single_object(struct acpi_device **child,
return 0;
 }
 
+static acpi_status acpi_get_resource_memory(struct acpi_resource *ares,
+   void *context)
+{
+   struct resource *res = context;
+
+   if (acpi_dev_resource_memory(ares, res))
+   return AE_CTRL_TERMINATE;
+
+   return AE_OK;
+}
+
+static bool acpi_device_should_be_hidden(acpi_handle handle)
+{
+   acpi_status status;
+   struct resource res;
+
+   /* Check if it should ignore the UART device */
+   if (!(spcr_uart_addr && acpi_has_method(handle, METHOD_NAME__CRS)))
+   return false;
+
+   /*
+* The UART device described in SPCR table is assumed to have only one
+* memory resource present. So we only look for the first one here.
+*/
+   status = acpi_walk_resources(handle, METHOD_NAME__CRS,
+acpi_get_resource_memory, &res);
+   if (ACPI_FAILURE(status) || res.start != spcr_uart_addr)
+   return false;
+
+   acpi_handle_info(handle, "The UART device @%pa in SPCR table will be hidden\n",
+&res.start);
+
+   return true;
+}
+
 static int acpi_bus_type_and_status(acpi_handle handle, int *type,
unsigned long long *sta)
 {
@@ -1466,6 +1508,9 @@ static int acpi_bus_type_and_status(acpi_handle handle, int *type,
switch (acpi_type) {
case ACPI_TYPE_ANY: /* for ACPI_ROOT_OBJECT */
case ACPI_TYPE_DEVICE:
+   if (acpi_device_should_be_hidden(handle))
+   return -ENODEV;
+
*type = ACPI_BUS_TYPE_DEVICE;
status = acpi_bus_get_status_handle(handle, sta);
if (ACPI_FAILURE(status))
@@ -1916,9 +1961,24 @@ static int acpi_bus_scan_fixed(void)
return result < 0 ? result : 0;
 }
 
+static void __init acpi_get_spcr_uart_addr(void)
+{
+   acpi_status status;
+   struct acpi_table_spcr *spcr_ptr;
+
+   status = acpi_get_table(ACPI_SIG_SPCR, 0,
+   (struct acpi_table_header **)&spcr_ptr);
+   if (ACPI_SUCCESS(status))
+   spcr_uart_addr = spcr_ptr->serial_port.address;
+   else
+   printk(KERN_WARNING PREFIX "STAO table present, but SPCR is missing\n");
+}
+
 int __init acpi_scan_init(void)
 {
int result;
+   acpi_status status;
+   struct acpi_table_stao *stao_ptr;
 
acpi_pci_root_init();
acpi_pci_link_init();
@@ -1934,6 +1994,20 @@ int __init acpi_scan_init(void)
 
acpi_scan_add_handler(&generic_device_handler);
 
+   /*
+* If there is STAO table, check whether it needs to ignore the UART
+* device in SPCR table.
+*/
+   status = acpi_get_table(ACPI_SIG_STAO, 0,
+   (struct acpi_table_header **)&stao_ptr);
+   if (ACPI_SUCCESS(status)) {
+   if (stao_ptr->header.length > sizeof(struct acpi_table_stao))
+   printk(KERN_INFO PREFIX "STAO Name List not yet supported.\n");
+
+   if (stao_ptr->ignore_uart)
+   acpi_get_spcr_uart_addr();
+   }
+
mutex_lock(&acpi_scan_lock);
/*
 * Enumerate devices in the ACPI namespace.
-- 
2.0.4



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v11 08/17] Xen: public/hvm: sync changes of HVM_PARAM_CALLBACK_VIA ABI from Xen

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

Sync the changes of HVM_PARAM_CALLBACK_VIA ABI introduced by
Xen commit  (public/hvm: export the HVM_PARAM_CALLBACK_VIA
ABI in the API).
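
As a rough illustration of the layout this header documents (type in
val[63:56], payload in val[55:0]), a value could be composed as sketched
below. The three TYPE macros are the ones added by this patch; the helper
names callback_via(), callback_via_pci_intx() and callback_via_type() are
hypothetical and exist only for this sketch, they are not part of the Xen
or Linux headers.

```c
#include <stdint.h>

/* Type codes as defined by the patched params.h. */
#define HVM_PARAM_CALLBACK_TYPE_GSI      0
#define HVM_PARAM_CALLBACK_TYPE_PCI_INTX 1
#define HVM_PARAM_CALLBACK_TYPE_VECTOR   2

/* Hypothetical helper: type goes to val[63:56], payload to val[55:0]. */
static inline uint64_t callback_via(uint64_t type, uint64_t payload)
{
    return (type << 56) | (payload & ((UINT64_C(1) << 56) - 1));
}

/*
 * Hypothetical helper for the PCI INTx payload:
 * Domain = val[47:32], Bus = val[31:16], DevFn = val[15:8], IntX = val[1:0].
 */
static inline uint64_t callback_via_pci_intx(uint16_t domain, uint16_t bus,
                                             uint8_t devfn, uint8_t intx)
{
    uint64_t payload = ((uint64_t)domain << 32) | ((uint64_t)bus << 16) |
                       ((uint64_t)devfn << 8) | (uint64_t)(intx & 0x3);

    return callback_via(HVM_PARAM_CALLBACK_TYPE_PCI_INTX, payload);
}

/* Hypothetical helper: extract the delivery type from val[63:56]. */
static inline unsigned int callback_via_type(uint64_t val)
{
    return (unsigned int)(val >> 56);
}
```

Note that a GSI payload of 0 yields val == 0, which is exactly the aliasing
with "notifications disabled" that the new comment warns about.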

Signed-off-by: Shannon Zhao 
Acked-by: Stefano Stabellini 
---
 include/xen/interface/hvm/params.h | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/include/xen/interface/hvm/params.h b/include/xen/interface/hvm/params.h
index a6c7991..70ad208 100644
--- a/include/xen/interface/hvm/params.h
+++ b/include/xen/interface/hvm/params.h
@@ -27,16 +27,31 @@
  * Parameter space for HVMOP_{set,get}_param.
  */
 
+#define HVM_PARAM_CALLBACK_IRQ 0
 /*
  * How should CPU0 event-channel notifications be delivered?
- * val[63:56] == 0: val[55:0] is a delivery GSI (Global System Interrupt).
- * val[63:56] == 1: val[55:0] is a delivery PCI INTx line, as follows:
- *  Domain = val[47:32], Bus  = val[31:16],
- *  DevFn  = val[15: 8], IntX = val[ 1: 0]
- * val[63:56] == 2: val[7:0] is a vector number.
+ *
  * If val == 0 then CPU0 event-channel notifications are not delivered.
+ * If val != 0, val[63:56] encodes the type, as follows:
+ */
+
+#define HVM_PARAM_CALLBACK_TYPE_GSI  0
+/*
+ * val[55:0] is a delivery GSI.  GSI 0 cannot be used, as it aliases val == 0,
+ * and disables all notifications.
+ */
+
+#define HVM_PARAM_CALLBACK_TYPE_PCI_INTX 1
+/*
+ * val[55:0] is a delivery PCI INTx line:
+ * Domain = val[47:32], Bus = val[31:16] DevFn = val[15:8], IntX = val[1:0]
+ */
+
+#define HVM_PARAM_CALLBACK_TYPE_VECTOR   2
+/*
+ * val[7:0] is a vector number.  Check for XENFEAT_hvm_callback_vector to know
+ * if this delivery method is available.
  */
-#define HVM_PARAM_CALLBACK_IRQ 0
 
 #define HVM_PARAM_STORE_PFN1
 #define HVM_PARAM_STORE_EVTCHN 2
-- 
2.0.4





[Xen-devel] [PATCH v11 15/17] ARM64: XEN: Add a function to initialize Xen specific UEFI runtime services

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

When running on Xen hypervisor, runtime services are supported through
hypercall. Add a Xen specific function to initialize runtime services.

Signed-off-by: Shannon Zhao 
Reviewed-by: Stefano Stabellini 
---
 arch/arm/include/asm/xen/xen-ops.h   |  6 ++
 arch/arm/xen/Makefile|  1 +
 arch/arm/xen/efi.c   | 40 
 arch/arm64/include/asm/xen/xen-ops.h |  6 ++
 arch/arm64/xen/Makefile  |  1 +
 drivers/xen/Kconfig  |  2 +-
 6 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/include/asm/xen/xen-ops.h
 create mode 100644 arch/arm/xen/efi.c
 create mode 100644 arch/arm64/include/asm/xen/xen-ops.h

diff --git a/arch/arm/include/asm/xen/xen-ops.h b/arch/arm/include/asm/xen/xen-ops.h
new file mode 100644
index 000..ec154e7
--- /dev/null
+++ b/arch/arm/include/asm/xen/xen-ops.h
@@ -0,0 +1,6 @@
+#ifndef _ASM_XEN_OPS_H
+#define _ASM_XEN_OPS_H
+
+void xen_efi_runtime_setup(void);
+
+#endif /* _ASM_XEN_OPS_H */
diff --git a/arch/arm/xen/Makefile b/arch/arm/xen/Makefile
index 1296952..2279521 100644
--- a/arch/arm/xen/Makefile
+++ b/arch/arm/xen/Makefile
@@ -1 +1,2 @@
 obj-y  := enlighten.o hypercall.o grant-table.o p2m.o mm.o
+obj-$(CONFIG_XEN_EFI) += efi.o
diff --git a/arch/arm/xen/efi.c b/arch/arm/xen/efi.c
new file mode 100644
index 000..16db419
--- /dev/null
+++ b/arch/arm/xen/efi.c
@@ -0,0 +1,40 @@
+/*
+ * Copyright (c) 2015, Linaro Limited, Shannon Zhao
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+
+/* Set XEN EFI runtime services function pointers. Other fields of struct efi,
+ * e.g. efi.systab, will be set like normal EFI.
+ */
+void __init xen_efi_runtime_setup(void)
+{
+   efi.get_time = xen_efi_get_time;
+   efi.set_time = xen_efi_set_time;
+   efi.get_wakeup_time  = xen_efi_get_wakeup_time;
+   efi.set_wakeup_time  = xen_efi_set_wakeup_time;
+   efi.get_variable = xen_efi_get_variable;
+   efi.get_next_variable= xen_efi_get_next_variable;
+   efi.set_variable = xen_efi_set_variable;
+   efi.query_variable_info  = xen_efi_query_variable_info;
+   efi.update_capsule   = xen_efi_update_capsule;
+   efi.query_capsule_caps   = xen_efi_query_capsule_caps;
+   efi.get_next_high_mono_count = xen_efi_get_next_high_mono_count;
+   efi.reset_system = NULL; /* Functionality provided by Xen. */
+}
+EXPORT_SYMBOL_GPL(xen_efi_runtime_setup);
diff --git a/arch/arm64/include/asm/xen/xen-ops.h b/arch/arm64/include/asm/xen/xen-ops.h
new file mode 100644
index 000..ec154e7
--- /dev/null
+++ b/arch/arm64/include/asm/xen/xen-ops.h
@@ -0,0 +1,6 @@
+#ifndef _ASM_XEN_OPS_H
+#define _ASM_XEN_OPS_H
+
+void xen_efi_runtime_setup(void);
+
+#endif /* _ASM_XEN_OPS_H */
diff --git a/arch/arm64/xen/Makefile b/arch/arm64/xen/Makefile
index 74a8d87..8ff8aa9 100644
--- a/arch/arm64/xen/Makefile
+++ b/arch/arm64/xen/Makefile
@@ -1,2 +1,3 @@
 xen-arm-y  += $(addprefix ../../arm/xen/, enlighten.o grant-table.o p2m.o mm.o)
 obj-y  := xen-arm.o hypercall.o
+obj-$(CONFIG_XEN_EFI) += $(addprefix ../../arm/xen/, efi.o)
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 979a831..f15bb3b7 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -275,7 +275,7 @@ config XEN_HAVE_PVMMU
 
 config XEN_EFI
def_bool y
-   depends on X86_64 && EFI
+   depends on (ARM || ARM64 || X86_64) && EFI
 
 config XEN_AUTO_XLATE
def_bool y
-- 
2.0.4





[Xen-devel] [PATCH v11 12/17] ARM64: ACPI: Check if it runs on Xen to enable or disable ACPI

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

When a Xen domain0 boots with ACPI, the DT supplied to it contains a /chosen
and a /hypervisor node. So do not treat the /hypervisor node as a reason to
prefer DT when deciding whether to enable ACPI.
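
The resulting depth-1 check can be sketched in isolation as below. This is
only a stand-in for dt_scan_depth1_nodes(): the function name
depth1_node_blocks_acpi() is hypothetical, and the of_flat_dt_is_compatible()
lookup is replaced by a plain boolean argument so that the logic can be
compiled outside the kernel.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical stand-in for dt_scan_depth1_nodes().
 * Returns 1 if this node means the DT is "not empty", i.e. it is a depth-1
 * node other than /chosen or a Xen-compatible /hypervisor node.
 * compatible_xen stands in for of_flat_dt_is_compatible(node, "xen,xen").
 */
static int depth1_node_blocks_acpi(const char *uname, int depth,
                                   bool compatible_xen)
{
    if (depth != 1)
        return 0;

    /* /chosen never counts as real DT content. */
    if (strcmp(uname, "chosen") == 0)
        return 0;

    /* /hypervisor is ignored only when it is the Xen node. */
    if (strcmp(uname, "hypervisor") == 0 && compatible_xen)
        return 0;

    return 1;
}
```

A /hypervisor node that is not compatible with "xen,xen" still counts as DT
content, which matches the nested condition in the hunk below.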

Signed-off-by: Shannon Zhao 
Reviewed-by: Stefano Stabellini 
Acked-by: Hanjun Guo 
---
 arch/arm64/kernel/acpi.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index d1ce8e2..57ee317 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -67,10 +67,15 @@ static int __init dt_scan_depth1_nodes(unsigned long node,
 {
/*
 * Return 1 as soon as we encounter a node at depth 1 that is
-* not the /chosen node.
+* not the /chosen node, or /hypervisor node with compatible
+* string "xen,xen".
 */
-   if (depth == 1 && (strcmp(uname, "chosen") != 0))
-   return 1;
+   if (depth == 1 && (strcmp(uname, "chosen") != 0)) {
+   if (strcmp(uname, "hypervisor") != 0 ||
+   !of_flat_dt_is_compatible(node, "xen,xen"))
+   return 1;
+   }
+
return 0;
 }
 
@@ -184,7 +189,8 @@ void __init acpi_boot_table_init(void)
/*
 * Enable ACPI instead of device tree unless
 * - ACPI has been disabled explicitly (acpi=off), or
-* - the device tree is not empty (it has more than just a /chosen node)
+* - the device tree is not empty (it has more than just a /chosen node,
+*   and a /hypervisor node when running on Xen)
 *   and ACPI has not been force enabled (acpi=force)
 */
if (param_acpi_off ||
-- 
2.0.4





[Xen-devel] [PATCH v11 09/17] xen/hvm/params: Add a new delivery type for event-channel in HVM_PARAM_CALLBACK_IRQ

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

This new delivery type, which is for ARM, shares the same value as
HVM_PARAM_CALLBACK_TYPE_VECTOR, which is for x86.

val[15:8] holds the interrupt flags and val[7:0] is the PPI number.
Within the flags, bit 8 selects the trigger mode, edge (1) or level (0),
and bit 9 selects the polarity, active low (1) or active high (0).
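
For illustration only, decoding such a value could look like the sketch
below. HVM_PARAM_CALLBACK_TYPE_PPI is the macro added by this patch; the
struct ppi_callback type and the decode_ppi_callback() function are
hypothetical names used just for this example. The check that val[55:16] is
zero follows the header comment added in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

#define HVM_PARAM_CALLBACK_TYPE_PPI 2

/* Hypothetical container for the decoded fields. */
struct ppi_callback {
    unsigned int ppi;   /* val[7:0]: PPI number */
    bool edge;          /* bit 8: edge (1) or level (0) triggered */
    bool active_low;    /* bit 9: active low (1) or active high (0) */
};

/*
 * Hypothetical decoder. Returns false if the type in val[63:56] is not PPI
 * or if the bits val[55:16], which must be zero, are set.
 */
static bool decode_ppi_callback(uint64_t val, struct ppi_callback *out)
{
    if ((val >> 56) != HVM_PARAM_CALLBACK_TYPE_PPI)
        return false;
    if ((val >> 16) & ((UINT64_C(1) << 40) - 1))
        return false;

    out->ppi = (unsigned int)(val & 0xff);
    out->edge = ((val >> 8) & 1) != 0;
    out->active_low = ((val >> 9) & 1) != 0;
    return true;
}
```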

Signed-off-by: Shannon Zhao 
Acked-by: Stefano Stabellini 
Reviewed-by: Julien Grall 
---
 include/xen/interface/hvm/params.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/include/xen/interface/hvm/params.h b/include/xen/interface/hvm/params.h
index 70ad208..4d61fc5 100644
--- a/include/xen/interface/hvm/params.h
+++ b/include/xen/interface/hvm/params.h
@@ -47,11 +47,24 @@
  * Domain = val[47:32], Bus = val[31:16] DevFn = val[15:8], IntX = val[1:0]
  */
 
+#if defined(__i386__) || defined(__x86_64__)
 #define HVM_PARAM_CALLBACK_TYPE_VECTOR   2
 /*
  * val[7:0] is a vector number.  Check for XENFEAT_hvm_callback_vector to know
  * if this delivery method is available.
  */
+#elif defined(__arm__) || defined(__aarch64__)
+#define HVM_PARAM_CALLBACK_TYPE_PPI  2
+/*
+ * val[55:16] needs to be zero.
+ * val[15:8] is interrupt flag of the PPI used by event-channel:
+ *  bit 8: the PPI is edge(1) or level(0) triggered
+ *  bit 9: the PPI is active low(1) or high(0)
+ * val[7:0] is a PPI number used by event-channel.
+ * This is only used by ARM/ARM64 and masking/eoi the interrupt associated to
+ * the notification is handled by the interrupt controller.
+ */
+#endif
 
 #define HVM_PARAM_STORE_PFN1
 #define HVM_PARAM_STORE_EVTCHN 2
-- 
2.0.4





[Xen-devel] [PATCH v11 10/17] arm/xen: Get event-channel irq through HVM_PARAM when booting with ACPI

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

The kernel will get the event-channel IRQ through
HVM_PARAM_CALLBACK_IRQ.

Signed-off-by: Shannon Zhao 
Reviewed-by: Stefano Stabellini 
Reviewed-by: Julien Grall 
---
 arch/arm/xen/enlighten.c | 36 +++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index d94f726..06bd61a 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -278,6 +279,35 @@ void __init xen_early_init(void)
add_preferred_console("hvc", 0, NULL);
 }
 
+static void __init xen_acpi_guest_init(void)
+{
+#ifdef CONFIG_ACPI
+   struct xen_hvm_param a;
+   int interrupt, trigger, polarity;
+
+   a.domid = DOMID_SELF;
+   a.index = HVM_PARAM_CALLBACK_IRQ;
+
+   if (HYPERVISOR_hvm_op(HVMOP_get_param, &a)
+   || (a.value >> 56) != HVM_PARAM_CALLBACK_TYPE_PPI) {
+   xen_events_irq = 0;
+   return;
+   }
+
+   interrupt = a.value & 0xff;
+   trigger = ((a.value >> 8) & 0x1) ? ACPI_EDGE_SENSITIVE
+: ACPI_LEVEL_SENSITIVE;
+   polarity = ((a.value >> 8) & 0x2) ? ACPI_ACTIVE_LOW
+ : ACPI_ACTIVE_HIGH;
+   xen_events_irq = acpi_register_gsi(NULL, interrupt, trigger, polarity);
+#endif
+}
+
+static void __init xen_dt_guest_init(void)
+{
+   xen_events_irq = irq_of_parse_and_map(xen_node, 0);
+}
+
 static int __init xen_guest_init(void)
 {
struct xen_add_to_physmap xatp;
@@ -286,7 +316,11 @@ static int __init xen_guest_init(void)
if (!xen_domain())
return 0;
 
-   xen_events_irq = irq_of_parse_and_map(xen_node, 0);
+   if (!acpi_disabled)
+   xen_acpi_guest_init();
+   else
+   xen_dt_guest_init();
+
if (!xen_events_irq) {
pr_err("Xen event channel interrupt not found\n");
return -ENODEV;
-- 
2.0.4





[Xen-devel] [PATCH v11 16/17] FDT: Add a helper to get the subnode by given name

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

Sometimes it is necessary to check whether a given node in the FDT has a
subnode with a given name. Introduce a helper that returns the subnode if
it exists.

CC: Rob Herring 
Signed-off-by: Shannon Zhao 
Acked-by: Stefano Stabellini 
Acked-by: Rob Herring 
---
 drivers/of/fdt.c   | 13 +
 include/linux/of_fdt.h |  2 ++
 2 files changed, 15 insertions(+)

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 3349d2a..5c8b2f2 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -645,6 +645,19 @@ int __init of_scan_flat_dt(int (*it)(unsigned long node,
 }
 
 /**
+ * of_get_flat_dt_subnode_by_name - get the subnode by given name
+ *
+ * @node: the parent node
+ * @uname: the name of subnode
+ * @return offset of the subnode, or -FDT_ERR_NOTFOUND if there is none
+ */
+
+int of_get_flat_dt_subnode_by_name(unsigned long node, const char *uname)
+{
+   return fdt_subnode_offset(initial_boot_params, node, uname);
+}
+
+/**
  * of_get_flat_dt_root - find the root node in the flat blob
  */
 unsigned long __init of_get_flat_dt_root(void)
diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
index 2fbe868..2c3707e 100644
--- a/include/linux/of_fdt.h
+++ b/include/linux/of_fdt.h
@@ -52,6 +52,8 @@ extern char __dtb_end[];
 extern int of_scan_flat_dt(int (*it)(unsigned long node, const char *uname,
 int depth, void *data),
   void *data);
+extern int of_get_flat_dt_subnode_by_name(unsigned long node,
+ const char *uname);
 extern const void *of_get_flat_dt_prop(unsigned long node, const char *name,
   int *size);
 extern int of_flat_dt_is_compatible(unsigned long node, const char *name);
-- 
2.0.4





[Xen-devel] [PATCH v11 11/17] ARM: XEN: Move xen_early_init() before efi_init()

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

Move xen_early_init() before efi_init(), so that efi_init() can
initialize the Xen specific UEFI support.

Check whether the kernel runs on the Xen hypervisor by scanning the
flattened device tree.
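
The version lookup performed on the "compatible" property can be sketched on
its own as below. The function name xen_version_from_compatible() is
hypothetical; it mirrors the length and prefix test used in
fdt_find_hyper_node(), taking the raw property bytes and their total length
instead of calling of_get_flat_dt_prop().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the check in fdt_find_hyper_node().
 * s   points at the first string of the "compatible" property,
 * len is the total length of the property in bytes.
 * Returns a pointer to the version suffix (e.g. "4.3"), or NULL.
 */
static const char *xen_version_from_compatible(const char *s, int len)
{
    static const char prefix[] = "xen,xen-";
    size_t plen = strlen(prefix);

    if (s == NULL || len <= 0)
        return NULL;

    /* Same test as the patch: room for a version, and matching prefix. */
    if (plen + 3 < (size_t)len && strncmp(prefix, s, plen) == 0)
        return s + plen;

    return NULL;
}
```

With the example binding value "xen,xen-4.3", "xen,xen" this would return a
pointer to "4.3"; a property holding only "xen,xen" is too short and yields
NULL, which corresponds to the "Xen version not found" path.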

Cc: Russell King 
Signed-off-by: Shannon Zhao 
Reviewed-by: Stefano Stabellini 
Reviewed-by: Julien Grall 
---
 arch/arm/kernel/setup.c   |  2 +-
 arch/arm/xen/enlighten.c  | 56 ++-
 arch/arm64/kernel/setup.c |  2 +-
 3 files changed, 43 insertions(+), 17 deletions(-)

diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 139791e..5bc7516 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -1035,6 +1035,7 @@ void __init setup_arch(char **cmdline_p)
early_paging_init(mdesc);
 #endif
setup_dma_zone(mdesc);
+   xen_early_init();
efi_init();
sanity_check_meminfo();
arm_memblock_init(mdesc);
@@ -1051,7 +1052,6 @@ void __init setup_arch(char **cmdline_p)
 
arm_dt_init_cpu_maps();
psci_dt_init();
-   xen_early_init();
 #ifdef CONFIG_SMP
if (is_smp()) {
if (!mdesc->smp_init || !mdesc->smp_init()) {
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 06bd61a..13e3e9f 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -53,8 +54,6 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS] __initdata;
 
 static __read_mostly unsigned int xen_events_irq;
 
-static __initdata struct device_node *xen_node;
-
 int xen_remap_domain_gfn_array(struct vm_area_struct *vma,
   unsigned long addr,
   xen_pfn_t *gfn, int nr,
@@ -238,6 +237,33 @@ static irqreturn_t xen_arm_callback(int irq, void *arg)
return IRQ_HANDLED;
 }
 
+static __initdata struct {
+   const char *compat;
+   const char *prefix;
+   const char *version;
+   bool found;
+} hyper_node = {"xen,xen", "xen,xen-", NULL, false};
+
+static int __init fdt_find_hyper_node(unsigned long node, const char *uname,
+ int depth, void *data)
+{
+   const void *s = NULL;
+   int len;
+
+   if (depth != 1 || strcmp(uname, "hypervisor") != 0)
+   return 0;
+
+   if (of_flat_dt_is_compatible(node, hyper_node.compat))
+   hyper_node.found = true;
+
+   s = of_get_flat_dt_prop(node, "compatible", &len);
+   if (strlen(hyper_node.prefix) + 3  < len &&
+   !strncmp(hyper_node.prefix, s, strlen(hyper_node.prefix)))
+   hyper_node.version = s + strlen(hyper_node.prefix);
+
+   return 0;
+}
+
 /*
  * see Documentation/devicetree/bindings/arm/xen.txt for the
  * documentation of the Xen Device Tree format.
@@ -245,26 +271,18 @@ static irqreturn_t xen_arm_callback(int irq, void *arg)
 #define GRANT_TABLE_PHYSADDR 0
 void __init xen_early_init(void)
 {
-   int len;
-   const char *s = NULL;
-   const char *version = NULL;
-   const char *xen_prefix = "xen,xen-";
-
-   xen_node = of_find_compatible_node(NULL, NULL, "xen,xen");
-   if (!xen_node) {
+   of_scan_flat_dt(fdt_find_hyper_node, NULL);
+   if (!hyper_node.found) {
pr_debug("No Xen support\n");
return;
}
-   s = of_get_property(xen_node, "compatible", &len);
-   if (strlen(xen_prefix) + 3  < len &&
-   !strncmp(xen_prefix, s, strlen(xen_prefix)))
-   version = s + strlen(xen_prefix);
-   if (version == NULL) {
+
+   if (hyper_node.version == NULL) {
pr_debug("Xen version not found\n");
return;
}
 
-   pr_info("Xen %s support found\n", version);
+   pr_info("Xen %s support found\n", hyper_node.version);
 
xen_domain_type = XEN_HVM_DOMAIN;
 
@@ -305,6 +323,14 @@ static void __init xen_acpi_guest_init(void)
 
 static void __init xen_dt_guest_init(void)
 {
+   struct device_node *xen_node;
+
+   xen_node = of_find_compatible_node(NULL, NULL, "xen,xen");
+   if (!xen_node) {
+   pr_err("Xen support was detected before, but it has disappeared\n");
+   return;
+   }
+
xen_events_irq = irq_of_parse_and_map(xen_node, 0);
 }
 
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 9dc6776..7cf992f 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -320,6 +320,7 @@ void __init setup_arch(char **cmdline_p)
 */
cpu_uninstall_idmap();
 
+   xen_early_init();
efi_init();
arm64_memblock_init();
 
@@ -341,7 +342,6 @@ void __init setup_arch(char **cmdline_p)
} else {
psci_acpi_init();
}
-   xen_early_init();
 
cpu_read_bootcpu_ops();
smp_init_cpus();
-- 
2.0.4




[Xen-devel] [PATCH v11 17/17] Xen: EFI: Parse DT parameters for Xen specific UEFI

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

Add a new function that parses the DT parameters for Xen specific UEFI in
the same way as for normal UEFI, so that the existing code can be reused.

If Xen supports EFI, initialize runtime services.

CC: Matt Fleming 
Signed-off-by: Shannon Zhao 
Reviewed-by: Matt Fleming 
Reviewed-by: Stefano Stabellini 
---
 arch/arm/xen/enlighten.c   |  6 +
 drivers/firmware/efi/arm-runtime.c | 17 +-
 drivers/firmware/efi/efi.c | 45 --
 3 files changed, 56 insertions(+), 12 deletions(-)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 13e3e9f..e130562 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -261,6 +261,12 @@ static int __init fdt_find_hyper_node(unsigned long node, const char *uname,
!strncmp(hyper_node.prefix, s, strlen(hyper_node.prefix)))
hyper_node.version = s + strlen(hyper_node.prefix);
 
+   if (IS_ENABLED(CONFIG_XEN_EFI)) {
+   /* Check if Xen supports EFI */
+   if (of_get_flat_dt_subnode_by_name(node, "uefi") > 0)
+   set_bit(EFI_PARAVIRT, &efi.flags);
+   }
+
return 0;
 }
 
diff --git a/drivers/firmware/efi/arm-runtime.c b/drivers/firmware/efi/arm-runtime.c
index 6ae21e4..ac609b9 100644
--- a/drivers/firmware/efi/arm-runtime.c
+++ b/drivers/firmware/efi/arm-runtime.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 extern u64 efi_system_table;
 
@@ -107,13 +108,19 @@ static int __init arm_enable_runtime_services(void)
}
set_bit(EFI_SYSTEM_TABLES, &efi.flags);
 
-   if (!efi_virtmap_init()) {
-   pr_err("No UEFI virtual mapping was installed -- runtime services will not be available\n");
-   return -ENOMEM;
+   if (IS_ENABLED(CONFIG_XEN_EFI) && efi_enabled(EFI_PARAVIRT)) {
+   /* Set up runtime services function pointers for Xen Dom0 */
+   xen_efi_runtime_setup();
+   } else {
+   if (!efi_virtmap_init()) {
+   pr_err("No UEFI virtual mapping was installed -- runtime services will not be available\n");
+   return -ENOMEM;
+   }
+
+   /* Set up runtime services function pointers */
+   efi_native_runtime_setup();
}
 
-   /* Set up runtime services function pointers */
-   efi_native_runtime_setup();
set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
 
efi.runtime_version = efi.systab->hdr.revision;
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 3a69ed5..519c096 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -469,12 +469,14 @@ device_initcall(efi_load_efivars);
FIELD_SIZEOF(struct efi_fdt_params, field) \
}
 
-static __initdata struct {
+struct params {
const char name[32];
const char propname[32];
int offset;
int size;
-} dt_params[] = {
+};
+
+static struct params fdt_params[] __initdata = {
UEFI_PARAM("System Table", "linux,uefi-system-table", system_table),
UEFI_PARAM("MemMap Address", "linux,uefi-mmap-start", mmap),
UEFI_PARAM("MemMap Size", "linux,uefi-mmap-size", mmap_size),
@@ -482,24 +484,45 @@ static __initdata struct {
UEFI_PARAM("MemMap Desc. Version", "linux,uefi-mmap-desc-ver", desc_ver)
 };
 
+static struct params xen_fdt_params[] __initdata = {
+   UEFI_PARAM("System Table", "xen,uefi-system-table", system_table),
+   UEFI_PARAM("MemMap Address", "xen,uefi-mmap-start", mmap),
+   UEFI_PARAM("MemMap Size", "xen,uefi-mmap-size", mmap_size),
+   UEFI_PARAM("MemMap Desc. Size", "xen,uefi-mmap-desc-size", desc_size),
+   UEFI_PARAM("MemMap Desc. Version", "xen,uefi-mmap-desc-ver", desc_ver)
+};
+
 struct param_info {
int found;
void *params;
+   struct params *dt_params;
+   int size;
 };
 
 static int __init fdt_find_uefi_params(unsigned long node, const char *uname,
   int depth, void *data)
 {
struct param_info *info = data;
+   struct params *dt_params = info->dt_params;
const void *prop;
void *dest;
u64 val;
-   int i, len;
+   int i, len, offset;
 
-   if (depth != 1 || strcmp(uname, "chosen") != 0)
-   return 0;
+   if (efi_enabled(EFI_PARAVIRT)) {
+   if (depth != 1 || strcmp(uname, "hypervisor") != 0)
+   return 0;
 
-   for (i = 0; i < ARRAY_SIZE(dt_params); i++) {
+   offset = of_get_flat_dt_subnode_by_name(node, "uefi");
+   if (offset < 0)
+   return 0;
+   node = offset;
+   } else {
+   if (depth != 1 || strcmp(uname, "chosen") != 0)
+   return 0;
+   }
+
+   for (i = 0; i < info->size; i++) {
prop = of_get_flat_dt_prop(node, dt_p

[Xen-devel] [PATCH v11 13/17] ARM: Xen: Document UEFI support on Xen ARM virtual platforms

2016-04-07 Thread Shannon Zhao
From: Shannon Zhao 

Add a "uefi" node under the /hypervisor node in the FDT, so that the Linux
kernel can scan it to get the UEFI information.

CC: Rob Herring 
Signed-off-by: Shannon Zhao 
Acked-by: Rob Herring 
Reviewed-by: Stefano Stabellini 
---
 Documentation/devicetree/bindings/arm/xen.txt | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/xen.txt b/Documentation/devicetree/bindings/arm/xen.txt
index 0f7b9c2..c9b9321 100644
--- a/Documentation/devicetree/bindings/arm/xen.txt
+++ b/Documentation/devicetree/bindings/arm/xen.txt
@@ -11,10 +11,32 @@ the following properties:
   memory where the grant table should be mapped to, using an
   HYPERVISOR_memory_op hypercall. The memory region is large enough to map
   the whole grant table (it is larger or equal to gnttab_max_grant_frames()).
+  This property is unnecessary when booting Dom0 using ACPI.
 
 - interrupts: the interrupt used by Xen to inject event notifications.
   A GIC node is also required.
+  This property is unnecessary when booting Dom0 using ACPI.
 
+To support UEFI on Xen ARM virtual platforms, Xen populates the FDT "uefi" node
+under /hypervisor with following parameters:
+
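+--------------------------------------------------------------------------------
+Name                      | Size   | Description
+================================================================================
+xen,uefi-system-table     | 64-bit | Guest physical address of the UEFI System
+                          |        | Table.
+--------------------------------------------------------------------------------
+xen,uefi-mmap-start       | 64-bit | Guest physical address of the UEFI memory
+                          |        | map.
+--------------------------------------------------------------------------------
+xen,uefi-mmap-size        | 32-bit | Size in bytes of the UEFI memory map
+                          |        | pointed to in previous entry.
+--------------------------------------------------------------------------------
+xen,uefi-mmap-desc-size   | 32-bit | Size in bytes of each entry in the UEFI
+                          |        | memory map.
+--------------------------------------------------------------------------------
+xen,uefi-mmap-desc-ver    | 32-bit | Version of the mmap descriptor format.
+--------------------------------------------------------------------------------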
 
 Example (assuming #address-cells = <2> and #size-cells = <2>):
 
@@ -22,4 +44,17 @@ hypervisor {
compatible = "xen,xen-4.3", "xen,xen";
reg = <0 0xb000 0 0x2>;
interrupts = <1 15 0xf08>;
+   uefi {
+   xen,uefi-system-table = <0x>;
+   xen,uefi-mmap-start = <0x>;
+   xen,uefi-mmap-size = <0x>;
+   xen,uefi-mmap-desc-size = <0x>;
+   xen,uefi-mmap-desc-ver = <0x>;
+};
 };
+
+The format and meaning of the "xen,uefi-*" parameters are similar to those in
+Documentation/arm/uefi.txt, which are provided by the regular UEFI stub. However
+they differ because they are provided by the Xen hypervisor, together with a set
+of UEFI runtime services implemented via hypercalls, see
+http://xenbits.xen.org/docs/unstable/hypercall/x86_64/include,public,platform.h.html.
-- 
2.0.4





Re: [Xen-devel] [for-4.7 3/5] xen/arm: acpi: Fix SMP support when booting with ACPI

2016-04-07 Thread Shannon Zhao


On 2016/4/7 18:59, Julien Grall wrote:
> The variable enabled_cpus is used to know the number of CPU enabled in
> the MADT.
> 
> Currently this variable is used to check the validity of the boot CPU.
> It will be considered invalid when "enabled_cpus > 1".
> 
> However, this condition also means that multiple CPUs are present on the
> system. So secondary will never be brought up.
> 
> The correct way to check the validity of the boot CPU is to use the
> variable bootcpu_valid.
> 
> Signed-off-by: Julien Grall 

Reviewed-by: Shannon Zhao 

> ---
>  xen/arch/arm/acpi/boot.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/acpi/boot.c b/xen/arch/arm/acpi/boot.c
> index 2a71660..fd29bdc 100644
> --- a/xen/arch/arm/acpi/boot.c
> +++ b/xen/arch/arm/acpi/boot.c
> @@ -149,7 +149,7 @@ void __init acpi_smp_init_cpus(void)
>  return;
>  }
>  
> -if ( enabled_cpus > 1 )
> +if ( !bootcpu_valid )
>  {
>  printk("MADT missing boot CPU MPIDR, not enabling secondaries\n");
>  return;
> 

-- 
Shannon




Re: [Xen-devel] [for-4.7 5/5] xen/arm: acpi: Print more error messages in acpi_map_gic_cpu_interface

2016-04-07 Thread Shannon Zhao


On 2016/4/7 18:59, Julien Grall wrote:
> It's helpful to spot any error without having to modify the hypervisor
> code.
> 
> Signed-off-by: Julien Grall 

Reviewed-by: Shannon Zhao 
> ---
>  xen/arch/arm/acpi/boot.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/xen/arch/arm/acpi/boot.c b/xen/arch/arm/acpi/boot.c
> index 602ab39..23285f7 100644
> --- a/xen/arch/arm/acpi/boot.c
> +++ b/xen/arch/arm/acpi/boot.c
> @@ -63,7 +63,10 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
>  
>  total_cpus++;
>  if ( !enabled )
> +{
> +printk("Skipping disabled CPU entry with 0x%"PRIx64" MPIDR\n", mpidr);
>  return;
> +}
>  
>  if ( enabled_cpus >=  NR_CPUS )
>  {
> @@ -101,7 +104,11 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
>  }
>  
>  if ( !acpi_psci_present() )
> +{
> +printk("PSCI not present, skipping CPU MPIDR 0x%"PRIx64"\n",
> +   mpidr);
>  return;
> +}
>  
>  if ( (rc = arch_cpu_init(enabled_cpus, NULL)) < 0 )
>  {
> 

-- 
Shannon




Re: [Xen-devel] [for-4.7 4/5] xen/arm: acpi: Remove unnecessary check in acpi_map_gic_cpu_interface

2016-04-07 Thread Shannon Zhao


On 2016/4/7 18:59, Julien Grall wrote:
> This part of the code will never be executed when the entry
> corresponds to the boot CPU.
> 
> Also print an error message when arch_cpu_init has failed.
> 
> Signed-off-by: Julien Grall 

Reviewed-by: Shannon Zhao 
> ---
>  xen/arch/arm/acpi/boot.c | 15 ---
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/arm/acpi/boot.c b/xen/arch/arm/acpi/boot.c
> index fd29bdc..602ab39 100644
> --- a/xen/arch/arm/acpi/boot.c
> +++ b/xen/arch/arm/acpi/boot.c
> @@ -51,6 +51,7 @@ static void __init
>  acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
>  {
>  int i;
> +int rc;
>  u64 mpidr = processor->arm_mpidr & MPIDR_HWID_MASK;
>  bool_t enabled = !!(processor->flags & ACPI_MADT_ENABLED);
>  
> @@ -102,16 +103,16 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
>  if ( !acpi_psci_present() )
>  return;
>  
> -/* CPU 0 was already initialized */
> -if ( enabled_cpus )
> +if ( (rc = arch_cpu_init(enabled_cpus, NULL)) < 0 )
>  {
> -if ( arch_cpu_init(enabled_cpus, NULL) < 0 )
> -return;
> -
> -/* map the logical cpu id to cpu MPIDR */
> -cpu_logical_map(enabled_cpus) = mpidr;
> +printk("cpu%d: init failed (0x%"PRIx64" MPIDR): %d\n",
> +   enabled_cpus, mpidr, rc);
> +return;
>  }
>  
> +/* map the logical cpu id to cpu MPIDR */
> +cpu_logical_map(enabled_cpus) = mpidr;
> +
>  enabled_cpus++;
>  }
>  
> 

-- 
Shannon




Re: [Xen-devel] [PATCH v4 01/14] x86/boot: enumerate documentation for the x86 hardware_subarch

2016-04-07 Thread Andy Shevchenko
On Wed, 2016-04-06 at 17:06 -0700, Luis R. Rodriguez wrote:
> Although hardware_subarch has been in place since the x86 boot
> protocol 2.07 it hasn't been used much. Enumerate current possible
> values to avoid misuses and help with semantics later at boot
> time should this be used further.
> 
> These enums should only ever be used by architecture x86 code,
> and all that code should be well contained and compartmentalized,
> clarify that as well.

Nitpick:

> + * @X86_SUBARCH_PC: Should be used if the hardware is enumerable using standard
> + *   PC mechanisms (PCI, ACPI) and doesn't need a special boot flow.
> + * @X86_SUBARCH_LGUEST: Used for x86 hypervisor demo, lguest
> + * @X86_SUBARCH_XEN: Used for Xen guest types which follow the PV boot path,
> + *   which start at asm startup_xen() entry point and later jump to the C
> + *   xen_start_kernel() entry point.
> + * @X86_SUBARCH_INTEL_MID: Used for Intel MID (Mobile Internet Device) platform
> + *   systems which do not have the PCI legacy interfaces.
> + * @X86_SUBARCH_CE4100: Used for Intel CE media processor (CE4100) SOC for

I think 'SoC' (without quotes) will be better.

> + *   for settop boxes and media devices, the use of a subarch for CE4100
> + *   is more of a hack...

-- 
Andy Shevchenko 
Intel Finland Oy




Re: [Xen-devel] Regarding Outreachy project on Improving CR Dashboard

2016-04-07 Thread Priya
Hello all,

Thanks for the suggestions. I have made the changes you mentioned.
I am sorry, but I could not find any errors while running

$ python3 createjson.py --mbox xen-devel-2016-03 --output new.json

command. I am not sure what is going wrong; it might be a problem with my
python3 or perceval version. I have added licensing and Python logging.
You can see the changes in my GitHub repo [1]. I will try upgrading perceval
and adding the tests in the coming days, and will post an update.

[1]: https://github.com/priya299/Dashboard


*Priya V*
Amrita University



On Thu, Apr 7, 2016 at 3:29 AM, Jesus M. Gonzalez-Barahona  wrote:

> On Wed, 2016-04-06 at 17:30 +0530, Priya wrote:
> > Hello,
> >
> > Thanks for your suggestions.
> > I have made the appropriate changes as you had mentioned.
> > It took a little time to change from python3 to python3.4 as perceval
> > supports python3.4. I have updated the changes in my github. You can
> > see my git repo [1]
> >
> > [1]:https://github.com/priya299/Dashboard
>
> Thanks a lot, Priya. Good work. Some preliminary comments, below.
>
> * When running the script on the xen-devel-2016-03 mbox, I see an
> exception raised:
>
> 
> (perceval)jgb@expisito:~/src/outreachy/Dashboard/dashboard$ python3
> createjson.py --mbox xen-devel-2016-03 --output new.json
> Traceback (most recent call last):
>   File "createjson.py", line 61, in <module>
>     create_json(args.mbox,args.output)
>   File "createjson.py", line 43, in create_json
>     if key == k['Message-ID'].strip('<>'):
> KeyError: 'Message-ID'
> 
>
> Maybe some message does not have a Message-ID field? I suggest that you
> capture this exception, print out the offending message, and go on with
> the next one. You can use the Python logging package for printing out
> this kind of information (you can see how to use it in the Perceval
> package itself). But see below.
>
> * Minor typo in the README:
>
> Instead of
>
> eg: python3.4 createjson --mbox xen-devel-2016-03 --output new.json
>
> it should be
>
> eg: python3.4 createjson.py --mbox xen-devel-2016-03 --output new.json
>
> * The files have no licensing info. If you agree, it could be GPLv3, as
> is Perceval itself. For that, it would be enough that you mimic the
> header in Perceval files in your Python files (of course, indicating
> your authorship information).
>
> * Which version of Perceval are you using? Some weeks ago, the format
> of the dictionary produced by Perceval for each message changed. Now
> the actual fields of the message are in a data subdictionary. Please
> check that: the above exception with respect to the Message-ID key
> could be because of this... Please try to make it work with master
> HEAD for Perceval (I don't expect any new major change in the next
> days/weeks, and I'll try to warn you in case one happens).
>
> * Could you please write at least one unit test for your code? You can
> see examples of the testing schema we use in the tests directory in
> Perceval, but we use vanilla unittest (the Python package for tests).
> At this stage I don't need you to produce a whole set of tests, only
> one or two to show that you know how to write unit tests, please.
>
> Saludos,
>
> Jesus.
>
> > Priya V
> > Amrita University
> > LinkedIn | GitHub | Bitbucket
> >
> --
> Bitergia: http://bitergia.com
> /me at Twitter: https://twitter.com/jgbarah
>
>
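
The two review points quoted above (log and skip a message that has no
Message-ID, and cope with the fields moving into a data subdictionary) can
be illustrated with a small sketch. It does not import Perceval; the item
layout (fields either at top level or under a 'data' key) and the helper
names message_id and index_by_message_id are assumptions made for this
example, not code from createjson.py.

```python
import logging

logger = logging.getLogger(__name__)


def message_id(item):
    """Return the bare Message-ID of one item, or None if it has none.

    Assumption: newer Perceval output nests the message fields under a
    'data' key, older output keeps them at the top level.
    """
    fields = item.get('data', item)
    raw = fields.get('Message-ID')
    if raw is None:
        return None
    # Drop surrounding whitespace first, then the angle brackets.
    return raw.strip().strip('<>')


def index_by_message_id(items):
    """Map Message-ID -> item, logging and skipping items without one."""
    index = {}
    for position, item in enumerate(items):
        mid = message_id(item)
        if mid is None:
            logger.warning("message %d has no Message-ID, skipping: %r",
                           position, item)
            continue
        index[mid] = item
    return index
```

Using dict.get() instead of indexing avoids the KeyError altogether, so the
loop can report the offending message and carry on with the next one.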


[Xen-devel] [BUG]

2016-04-07 Thread Михаил Казанцев
 "IRQ problem dom0 xen4.5-amd64"
I have a GeForce 6100PM-M2 (v3.0) motherboard with the latest BIOS and an
Athlon 64 X2 5000+ CPU.
The motherboard is based on the nForce 430 chipset.
If APIC mode is enabled in the BIOS, the Xen boot stops while trying to load
the dom0 kernel (Gentoo kernel v4.5 or Ubuntu kernel v4.2) and just shows a
black screen, but the same kernel boots normally without Xen.

If APIC mode is disabled in the BIOS, Xen boots dom0, but dom0 does not see
any disk (of course the SCSI, libata and nForce drivers are built into the
kernel with "Y") and the keyboard does not work.
Log messages that I can see on screen:
genirq: flags mismatch irq12 0080 (i8042) vs  (mce)
genirq: flags mismatch irq1 0080 (i8042) vs 0002cc00 (spinlock0)
genirq: flags mismatch irq1 0080 (rtc0) vs 0002cc00 (xen-pcpu)
genirq: flags mismatch irq5 0080 (sata_nv) vs 0002cc00 (callfunction0)




Re: [Xen-devel] [for-4.7 1/5] drivers/pl011: ACPI: The interrupt should always be high level triggered

2016-04-07 Thread Shannon Zhao


On 2016/4/7 18:59, Julien Grall wrote:
> The SPCR does not specify if the interrupt is edge or level triggered.
> So the configuration needs to be hardcoded in the code.
> 
> Based on the PL011 TRM (see 2.2.8 in ARM DDI 0183G), the interrupt generated
> will be active high. This wording implies the interrupt should be high level
> triggered.
I think active high can stand for rising edge triggered for an edge triggered
interrupt.

E.g. see "Table 5-118 Flag Definitions: Virtual Timer, EL2 timers, and
Secure & Non-Secure EL1 timers" in ACPI SPEC 6.0.

> Note that a rising edge triggered interrupt would be described as
> "high going edge".
> 
> Signed-off-by: Julien Grall 
> ---
>  xen/drivers/char/pl011.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/char/pl011.c b/xen/drivers/char/pl011.c
> index fa22edf..88d8488 100644
> --- a/xen/drivers/char/pl011.c
> +++ b/xen/drivers/char/pl011.c
> @@ -327,7 +327,7 @@ static int __init pl011_acpi_uart_init(const void *data)
>  }
>  
>  /* trigger/polarity information is not available in spcr */
> -irq_set_type(spcr->interrupt, IRQ_TYPE_EDGE_BOTH);
> +irq_set_type(spcr->interrupt, IRQ_TYPE_LEVEL_MASK);
>  
>  res = pl011_uart_init(spcr->interrupt, spcr->serial_port.address,
>PAGE_SIZE);
> 

-- 
Shannon




Re: [Xen-devel] [for-4.7 2/5] xen/arm: acpi: The boot CPU does not always match the first entry in the MADT

2016-04-07 Thread Shannon Zhao


On 2016/4/7 18:59, Julien Grall wrote:
> Since the ACPI 6.0 errata document [1], the first entry in the MADT
> does not have to correspond to the boot CPU.
> 
> Introduce a new variable to know if a MADT entry matching the boot CPU
> is found. Furthermore, it's not necessary to check if the MPIDR is
> duplicated for the boot CPU. So the rest of the function can be skipped.
> 
> [1] 1380 Unnecessary restrictions to FW vendors in ordering of GIC structures
> in MADT
> 
> Signed-off-by: Julien Grall 

Reviewed-by: Shannon Zhao 
> ---
>  xen/arch/arm/acpi/boot.c | 14 ++
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/arm/acpi/boot.c b/xen/arch/arm/acpi/boot.c
> index 859aa86..2a71660 100644
> --- a/xen/arch/arm/acpi/boot.c
> +++ b/xen/arch/arm/acpi/boot.c
> @@ -37,7 +37,8 @@
>  #include 
>  
>  /* Processors with enabled flag and sane MPIDR */
> -static unsigned int enabled_cpus;
> +static unsigned int enabled_cpus = 1;
> +static bool __initdata bootcpu_valid;
>  
>  /* total number of cpus in this system */
>  static unsigned int __initdata total_cpus;
> @@ -71,10 +72,15 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
>  }
>  
>  /* Check if GICC structure of boot CPU is available in the MADT */
> -if ( (enabled_cpus == 0) && (cpu_logical_map(0) != mpidr) )
> +if ( cpu_logical_map(0) == mpidr )
>  {
> -printk("Firmware bug, invalid CPU MPIDR for cpu0: 0x%"PRIx64" in MADT\n",
> -   mpidr);
> +if ( bootcpu_valid )
> +{
> +printk("Firmware bug, duplicate boot CPU MPIDR: 0x%"PRIx64" in MADT\n",
> +   mpidr);
> +return;
> +}
> +bootcpu_valid = true;
>  return;
>  }
>  
> 
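
To make the new control flow in the quoted patch easier to follow, here is
a rough Python model of it. It is only a sketch: count_enabled_cpus is a
hypothetical name, the MADT is reduced to a plain list of MPIDR values, and
the handling of duplicate secondary entries is an assumption about the part
of the function that the hunk does not show.

```python
def count_enabled_cpus(madt_mpidrs, boot_mpidr):
    """Return (enabled_cpus, bootcpu_valid) for a list of MADT MPIDRs."""
    enabled_cpus = 1        # the boot CPU is counted up front
    bootcpu_valid = False   # set once a MADT entry matches the boot CPU
    seen = set()

    for mpidr in madt_mpidrs:
        if mpidr == boot_mpidr:
            if bootcpu_valid:
                # Second entry for the boot CPU: firmware bug, ignore it.
                continue
            bootcpu_valid = True
            # No duplicate check or extra counting needed for the boot CPU.
            continue

        if mpidr in seen:
            # Assumed behaviour: duplicated secondary entries are skipped.
            continue
        seen.add(mpidr)
        enabled_cpus += 1

    return enabled_cpus, bootcpu_valid
```

The point of the model is that the position of the boot CPU entry in the
list no longer matters; only whether it appears at all is recorded.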

-- 
Shannon




Re: [Xen-devel] [PATCH v4 04/14] x86/rtc: replace paravirt rtc check with platform legacy quirk

2016-04-07 Thread Boris Ostrovsky

On 04/06/2016 08:06 PM, Luis R. Rodriguez wrote:

We have 4 types of x86 platforms that disable RTC:

   * Intel MID
   * Lguest - uses paravirt
   * Xen dom-U - uses paravirt
   * x86 on legacy systems annotated with an ACPI legacy flag

We can consolidate all of these into a platform specific legacy
quirk set early in boot through i386_start_kernel() and through
x86_64_start_reservations(). This deals with the RTC quirks which
we can rely on through the hardware subarch, the ACPI check can
be dealt with separately.
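
As an illustration of the idea described above, the following Python model
shows a quirk set chosen once, early in boot, from the hardware subarch.
It is a sketch and not the contents of platform-quirks.c: the names
Subarch, LegacyFeatures and early_platform_quirks are made up for the
example, the subarch numbers are assumed to follow the x86 boot protocol
(PC=0, LGUEST=1, XEN=2, INTEL_MID=3, CE4100=4), and the ACPI legacy flag is
deliberately left out because it is handled separately.

```python
from dataclasses import dataclass
from enum import IntEnum


class Subarch(IntEnum):
    # Assumed to mirror the hardware_subarch values of the x86 boot protocol.
    PC = 0
    LGUEST = 1
    XEN = 2
    INTEL_MID = 3
    CE4100 = 4


@dataclass
class LegacyFeatures:
    rtc: bool = True  # True when a CMOS real-time clock is expected


def early_platform_quirks(subarch):
    """Return the legacy feature set for one boot, decided by subarch only."""
    legacy = LegacyFeatures(rtc=True)
    # The platforms listed in the commit message that disable the RTC.
    if subarch in (Subarch.XEN, Subarch.LGUEST, Subarch.INTEL_MID):
        legacy.rtc = False
    return legacy
```

Code that used to ask "is this paravirt, and does it support an RTC?" would
then only need to look at the rtc flag.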

v2: split the subarch check from the ACPI check, clarify
 on the ACPI change commit log why ordering works

Suggested-by: Ingo Molnar 
Signed-off-by: Luis R. Rodriguez 
---
  arch/x86/Makefile |  1 +
  arch/x86/include/asm/paravirt.h   |  6 --
  arch/x86/include/asm/paravirt_types.h |  5 -
  arch/x86/include/asm/processor.h  |  1 -
  arch/x86/include/asm/x86_init.h   | 13 +
  arch/x86/kernel/Makefile  |  6 +-
  arch/x86/kernel/head32.c  |  2 ++
  arch/x86/kernel/head64.c  |  1 +
  arch/x86/kernel/platform-quirks.c | 18 ++
  arch/x86/kernel/rtc.c |  7 ++-
  arch/x86/lguest/boot.c|  1 -
  arch/x86/xen/enlighten.c  |  3 ---
  12 files changed, 42 insertions(+), 22 deletions(-)
  create mode 100644 arch/x86/kernel/platform-quirks.c

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 4086abca0b32..f9ed8a7ce2b6 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -209,6 +209,7 @@ endif
  head-y := arch/x86/kernel/head_$(BITS).o
  head-y += arch/x86/kernel/head$(BITS).o
  head-y += arch/x86/kernel/head.o
+head-y += arch/x86/kernel/platform-quirks.o
  
  libs-y  += arch/x86/lib/
  
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h

index 601f1b8f9961..6c7a4a192032 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -20,12 +20,6 @@ static inline int paravirt_enabled(void)
return pv_info.paravirt_enabled;
  }
  
-static inline int paravirt_has_feature(unsigned int feature)

-{
-   WARN_ON_ONCE(!pv_info.paravirt_enabled);
-   return (pv_info.features & feature);
-}
-
  static inline void load_sp0(struct tss_struct *tss,
 struct thread_struct *thread)
  {
diff --git a/arch/x86/include/asm/paravirt_types.h 
b/arch/x86/include/asm/paravirt_types.h
index e8c2326478c8..6acc1b26cf40 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -70,14 +70,9 @@ struct pv_info {
  #endif
  
  	int paravirt_enabled;

-   unsigned int features;/* valid only if paravirt_enabled is set */
const char *name;
  };
  
-#define paravirt_has(x) paravirt_has_feature(PV_SUPPORTED_##x)

-/* Supported features */
-#define PV_SUPPORTED_RTC(1<<0)
-
  struct pv_init_ops {
/*
 * Patch may replace one of the defined code sequences with
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 9264476f3d57..0c70c7daa6b8 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -474,7 +474,6 @@ static inline unsigned long current_top_of_stack(void)
  #else
  #define __cpuid   native_cpuid
  #define paravirt_enabled()0
-#define paravirt_has(x)0
  
  static inline void load_sp0(struct tss_struct *tss,

struct thread_struct *thread)
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 1ae89a2721d6..27d5c3fe5198 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -142,6 +142,15 @@ struct x86_cpuinit_ops {
  struct timespec;
  
  /**

+ * struct x86_legacy_features - legacy x86 features
+ *
+ * @rtc: this device has a CMOS real-time clock present
+ */
+struct x86_legacy_features {
+   int rtc;
+};
+
+/**
   * struct x86_platform_ops - platform specific runtime functions
   * @calibrate_tsc:calibrate TSC
   * @get_wallclock:get time from HW clock like RTC etc.
@@ -152,6 +161,7 @@ struct timespec;
   * @save_sched_clock_state:   save state for sched_clock() on suspend
   * @restore_sched_clock_state:restore state for sched_clock() on 
resume
   * @apic_post_init:   adjust apic if neeeded
+ * @legacy:legacy features
   */
  struct x86_platform_ops {
unsigned long (*calibrate_tsc)(void);
@@ -165,6 +175,7 @@ struct x86_platform_ops {
void (*save_sched_clock_state)(void);
void (*restore_sched_clock_state)(void);
void (*apic_post_init)(void);
+   struct x86_legacy_features legacy;
  };
  
  struct pci_dev;

@@ -186,6 +197,8 @@ extern struct x86_cpuinit_ops x86_cpuinit;
  extern struct x86_platform_ops x86_platform;
  extern struct x86_msi_ops x86_msi;
  extern struct x86_io_apic_ops x86_io_apic_ops;
+
+extern voi

Re: [Xen-devel] [PATCH v5 21/21] tools/libxc: Calculate xstate cpuid leaf from guest information

2016-04-07 Thread Wei Liu
On Thu, Apr 07, 2016 at 12:57:26PM +0100, Andrew Cooper wrote:
> The existing logic is broken for heterogeneous migration.  By always
> advertising the host maximum xstate, a migration to a less capable host always
> fails as Xen cannot accommodate the xcr0_accum in the migration stream.
> 
> By calculating xstate from the feature information (which a multi-host
> toolstack will have levelled appropriately), the guest will have the current
> host's maximum xstate advertised, allowing for correct migration to less
> capable hosts.
> 
> In addition, some further improvements and corrections:
>  - don't discard the known flags in sub-leaves 2..63 ECX
>  - zap sub-leaves beyond 62
>  - zap all bits in leaf 1, EBX/ECX.  No XSS features are currently supported.
> 
> Signed-off-by: Andrew Cooper 
> Signed-off-by: Jan Beulich 

Since this is mostly x86 code and signed off by two x86 maintainers, I'm
just rubber-stamping this patch:

Acked-by: Wei Liu 



Re: [Xen-devel] [PATCH v5 15/21] tools/libxc: Modify bitmap operations to take void pointers

2016-04-07 Thread Wei Liu
On Thu, Apr 07, 2016 at 12:57:20PM +0100, Andrew Cooper wrote:
> The type of the pointer to a bitmap is not interesting; it does not affect the
> representation of the block of bits being pointed to.
> 
> Make the libxc functions consistent with those in Xen, so they can work just
> as well with 'unsigned int *' based bitmaps.
> 
> As part of doing so, change the implementation to be in terms of char rather
> than unsigned long.  This fixes alignment concerns with ARM.
> 
> Signed-off-by: Andrew Cooper 
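
For readers unfamiliar with the approach in the quoted commit message, a
bitmap handled one byte at a time can be sketched as below. This is a
Python model, not the libxc code: the helper names are invented for the
example, and a bytearray stands in for the untyped block of memory that the
C functions receive through a void pointer.

```python
def bitmap_size(nr_bits):
    """Number of bytes needed to hold nr_bits bits."""
    return (nr_bits + 7) // 8


def set_bit(bitmap, bit):
    """Set one bit, touching only the byte that contains it."""
    bitmap[bit // 8] |= 1 << (bit % 8)


def clear_bit(bitmap, bit):
    """Clear one bit, touching only the byte that contains it."""
    bitmap[bit // 8] &= ~(1 << (bit % 8)) & 0xFF


def bit_is_set(bitmap, bit):
    """Return True if the given bit is set."""
    return bool(bitmap[bit // 8] & (1 << (bit % 8)))
```

Because every access is a single byte, the buffer needs no alignment beyond
one byte, which is the ARM concern the commit message refers to.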

The code looks fine to me.

We've given enough time for ARM folks to object so:

Acked-by: Wei Liu 

We can always fix this on ARM if it appears to be broken.

Wei.



Re: [Xen-devel] [PATCH v4 08/14] apm32: remove paravirt_enabled() use

2016-04-07 Thread Boris Ostrovsky

On 04/06/2016 08:06 PM, Luis R. Rodriguez wrote:
> There is already a check for apm_info.bios == 0, the
> apm_info.bios is set from the boot_params.apm_bios_info.
> Both Xen and lguest, which are also the only ones that set
> paravirt_enabled to true, never set the apm_bios.info. The
> Xen folks are sure force disable to 0 is not needed,

Because apm_info lives in .bss (which we recently made sure is cleared
on Xen PV). May be worth mentioning in the commit message so that we
don't forget why this is not needed.

I think you also have this statement in other patches.

-boris

> we
> recently forced disabled this on lguest. With this in place
> the paravirt_enabled() check is simply not needed anymore.
>
> Signed-off-by: Luis R. Rodriguez
> ---
>   arch/x86/kernel/apm_32.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/apm_32.c b/arch/x86/kernel/apm_32.c
> index 9307f182fe30..c7364bd633e1 100644
> --- a/arch/x86/kernel/apm_32.c
> +++ b/arch/x86/kernel/apm_32.c
> @@ -2267,7 +2267,7 @@ static int __init apm_init(void)
>
>       dmi_check_system(apm_dmi_table);
>
> -     if (apm_info.bios.version == 0 || paravirt_enabled() || machine_is_olpc()) {
> +     if (apm_info.bios.version == 0 || machine_is_olpc()) {
>               printk(KERN_INFO "apm: BIOS not found.\n");
>               return -ENODEV;
>       }





Re: [Xen-devel] [PATCH v4 00/14] x86: remove paravirt_enabled

2016-04-07 Thread Juergen Gross
On 07/04/16 02:06, Luis R. Rodriguez wrote:
> Now that Andy's ASM paravirt_enabled() use is merged all we need is to address
> the rest of the C code uses. This completes that work by providing proper
> semantics for platform legacy settings and quirks as suggested by Ingo. This
> in turn can also be extended later for the benefit of further processing of ACPI
> 5.2.9.3 IA-PC Boot Architecture flags, which we currently don't take much
> advantage of.  For instance the ACPI_FADT_NO_VGA can later be leveraged by
> bare metal x86 *and* HVMLite, as HVMLite seems to plan to set this.
> 
> Also, hpa has noted both Intel MID and CE4100 can make use of disabling
> pnpbios, we can do that separately after this, but it should now be a
> trivial change, given this quirk stuff is all generic now.
> 
> This patch set has been tested by 0-day, except for the last patch; for some
> reason the branch that included that patch has had testing delayed for quite a
> while now, but I can't think of anything there that should break anything.
> 
> I've also just run time tested this on bare metal only so far.

FWIW: Xen dom0 is booting with the patches applied.


Juergen

> 
> Luis R. Rodriguez (14):
>   x86/boot: enumerate documentation for the x86 hardware_subarch
>   x86/xen: use X86_SUBARCH_XEN for PV guest boots
>   tools/lguest: make lguest launcher use X86_SUBARCH_LGUEST explicitly
>   x86/rtc: replace paravirt rtc check with platform legacy quirk
>   x86, ACPI: move ACPI_FADT_NO_CMOS_RTC check to ACPI boot code
>   x86/init: use a platform legacy quirk for ebda
>   tools/lguest: force disable tboot and apm
>   apm32: remove paravirt_enabled() use
>   x86/tboot: remove paravirt_enabled()
>   x86/cpu/intel: remove not needed paravirt_enabled() for f00f work around
>   pnpbios: replace paravirt_enabled() check with legacy device check
>   x86, ACPI: parse ACPI_FADT_LEGACY_DEVICES
>   x86/init: rename ebda code file
>   x86/paravirt: remove paravirt_enabled()
> 
>  arch/x86/Makefile |  3 ++-
>  arch/x86/include/asm/paravirt.h   | 11 -
>  arch/x86/include/asm/paravirt_types.h |  6 -
>  arch/x86/include/asm/processor.h  |  2 --
>  arch/x86/include/asm/x86_init.h   | 42 +++
>  arch/x86/include/uapi/asm/bootparam.h | 36 +-
>  arch/x86/kernel/Makefile  |  6 -
>  arch/x86/kernel/acpi/boot.c   |  9 
>  arch/x86/kernel/apm_32.c  |  2 +-
>  arch/x86/kernel/cpu/intel.c   |  2 +-
>  arch/x86/kernel/{head.c => ebda.c}|  2 +-
>  arch/x86/kernel/head32.c  |  2 ++
>  arch/x86/kernel/head64.c  |  1 +
>  arch/x86/kernel/kvm.c |  8 ---
>  arch/x86/kernel/paravirt.c|  1 -
>  arch/x86/kernel/platform-quirks.c | 32 ++
>  arch/x86/kernel/rtc.c | 15 ++---
>  arch/x86/kernel/tboot.c   |  6 -
>  arch/x86/lguest/boot.c|  3 ---
>  arch/x86/xen/enlighten.c  |  5 +
>  drivers/pnp/pnpbios/core.c|  3 ++-
>  include/linux/pnp.h   |  2 ++
>  tools/lguest/lguest.c | 10 +++--
>  23 files changed, 146 insertions(+), 63 deletions(-)
>  rename arch/x86/kernel/{head.c => ebda.c} (98%)
>  create mode 100644 arch/x86/kernel/platform-quirks.c
> 




[Xen-devel] [xen-unstable-smoke test] 89344: tolerable FAIL - PUSHED

2016-04-07 Thread osstest service owner
flight 89344 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/89344/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 build-amd64-libvirt       5 libvirt-build               fail     like 89250

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)              blocked  n/a
 test-armhf-armhf-xl      12 migrate-support-check       fail     never pass
 test-armhf-armhf-xl      13 saverestore-support-check   fail     never pass

version targeted for testing:
 xen  b5836f71e186c891231def53b415b3f340306613
baseline version:
 xen  8fdf2fc46b7118214035edddc720fb87522511b9

Last test of basis    89250  2016-04-06 22:18:14 Z    0 days
Testing same since    89344  2016-04-07 11:12:33 Z    0 days    1 attempts


People who touched revisions under test:
  Chong Li 
  Chong Li 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  fail
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=b5836f71e186c891231def53b415b3f340306613
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke b5836f71e186c891231def53b415b3f340306613
+ branch=xen-unstable-smoke
+ revision=b5836f71e186c891231def53b415b3f340306613
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
++++ getconfig Repos
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.6-testing
+ '[' xb5836f71e186c891231def53b415b3f340306613 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src '[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
++++ getconfig GitCacheProxy
++++ perl -e '
use Osstest;
readglobalconfig();
print $c{"GitCach

Re: [Xen-devel] [for-4.7 1/5] drivers/pl011: ACPI: The interrupt should always be high level triggered

2016-04-07 Thread Julien Grall

Hi Shannon,

Thank you for the review.

On 07/04/16 13:30, Shannon Zhao wrote:
> On 2016/4/7 18:59, Julien Grall wrote:
>> The SPCR does not specify if the interrupt is edge or level triggered.
>> So the configuration needs to be hardcoded in the code.
>>
>> Based on the PL011 TRM (see 2.2.8 in ARM DDI 0183G), the interrupt generated
>> will be active high. This wording implies the interrupt should be high level
>> triggered.
>
> I think active high can stand for rising edge triggered for an edge triggered
> interrupt.
>
> E.g. see "Table 5-118 Flag Definitions: Virtual Timer, EL2 timers, and
> Secure & Non-Secure EL1 timers" in ACPI SPEC 6.0.

I've spoken with multiple people about the wording and the consensus is
that "active high" would imply high level triggered. So it's very ambiguous.

However, the PL011 always uses a high level triggered interrupt. You can look
at the device tree bindings such as the one for the foundation model.

Also, the SBSA (section 4.3.2 in ARM-DEN-0029 v2.3) states that the PL011
is implemented with a level triggered interrupt.

Note, I wasn't able to get the serial console working on my platform
with an edge triggered interrupt.

Regards,

--
Julien Grall



Re: [Xen-devel] [for-4.7 1/5] drivers/pl011: ACPI: The interrupt should always be high level triggered

2016-04-07 Thread Shannon Zhao
On 2016-04-07 21:41, Julien Grall wrote:
> 
> On 07/04/16 13:30, Shannon Zhao wrote:
>>
>>
>> On 2016/4/7 18:59, Julien Grall wrote:
>>> The SPCR does not specify if the interrupt is edge or level triggered.
>>> So the configuration needs to be hardcoded in the code.
>>>
>>> Based on the PL011 TRM (see 2.2.8 in ARM DDI 0183G), the interrupt
>>> generated
>>> will be active high. This wording implies the interrupt should be
>>> high level
>>> triggered.
>> I think active high can stand for rising edge triggered for an edge triggered
>> interrupt.
>>
>> E.g. see "Table 5-118 Flag Definitions: Virtual Timer, EL2 timers, and
>> Secure & Non-Secure EL1 timers" in ACPI SPEC 6.0.
> 
> I've spoken with multiple people about the wording and the consensus is
> that "active high" would imply high level triggered. So it's very ambiguous.
> 
> However, the PL011 always uses a high level triggered interrupt. You can look
> at the device tree bindings such as the one for the foundation model.
> 
> Also, the SBSA (section 4.3.2 in ARM-DEN-0029 v2.3) states that the PL011
> is implemented with a level triggered interrupt.
> 
> Note, I wasn't able to get the serial console working on my platform
> with an edge triggered interrupt.

So how about IRQ_TYPE_LEVEL_HIGH instead of IRQ_TYPE_LEVEL_MASK?
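
To spell out the difference between the two constants, here is a small
Python model of the trigger-type bitmask. The numeric values are an
assumption: they follow the definitions in Linux's include/linux/irq.h and
were not checked against the Xen tree. The describe helper is made up for
this example.

```python
# Assumed values, following Linux's include/linux/irq.h.
IRQ_TYPE_EDGE_RISING = 0x1
IRQ_TYPE_EDGE_FALLING = 0x2
IRQ_TYPE_EDGE_BOTH = IRQ_TYPE_EDGE_FALLING | IRQ_TYPE_EDGE_RISING
IRQ_TYPE_LEVEL_HIGH = 0x4
IRQ_TYPE_LEVEL_LOW = 0x8
IRQ_TYPE_LEVEL_MASK = IRQ_TYPE_LEVEL_LOW | IRQ_TYPE_LEVEL_HIGH


def describe(irq_type):
    """List the individual trigger modes contained in irq_type."""
    names = []
    if irq_type & IRQ_TYPE_EDGE_RISING:
        names.append("edge-rising")
    if irq_type & IRQ_TYPE_EDGE_FALLING:
        names.append("edge-falling")
    if irq_type & IRQ_TYPE_LEVEL_HIGH:
        names.append("level-high")
    if irq_type & IRQ_TYPE_LEVEL_LOW:
        names.append("level-low")
    return names
```

Under these assumed values IRQ_TYPE_LEVEL_MASK selects both level
polarities at once, while IRQ_TYPE_LEVEL_HIGH names the single
configuration the commit message argues for.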

Thanks,
-- 
Shannon



Re: [Xen-devel] [for-4.7 1/5] drivers/pl011: ACPI: The interrupt should always be high level triggered

2016-04-07 Thread Julien Grall



On 07/04/16 14:57, Shannon Zhao wrote:
> On 2016-04-07 21:41, Julien Grall wrote:
>> On 07/04/16 13:30, Shannon Zhao wrote:
>>> On 2016/4/7 18:59, Julien Grall wrote:
>>>> The SPCR does not specify if the interrupt is edge or level triggered.
>>>> So the configuration needs to be hardcoded in the code.
>>>>
>>>> Based on the PL011 TRM (see 2.2.8 in ARM DDI 0183G), the interrupt generated
>>>> will be active high. This wording implies the interrupt should be high level
>>>> triggered.
>>>
>>> I think active high can stand for rising edge triggered for an edge triggered
>>> interrupt.
>>>
>>> E.g. see "Table 5-118 Flag Definitions: Virtual Timer, EL2 timers, and
>>> Secure & Non-Secure EL1 timers" in ACPI SPEC 6.0.
>>
>> I've spoken with multiple people about the wording and the consensus is
>> that "active high" would imply high level triggered. So it's very ambiguous.
>>
>> However, the PL011 always uses a high level triggered interrupt. You can look
>> at the device tree bindings such as the one for the foundation model.
>>
>> Also, the SBSA (section 4.3.2 in ARM-DEN-0029 v2.3) states that the PL011
>> is implemented with a level triggered interrupt.
>>
>> Note, I wasn't able to get the serial console working on my platform
>> with an edge triggered interrupt.
>
> So how about IRQ_TYPE_LEVEL_HIGH instead of IRQ_TYPE_LEVEL_MASK?

Good point. I will likely resend only this patch and update the commit
message too.


Regards,

--
Julien Grall



[Xen-devel] [PATCH 1/2] pygrub: Ignore GRUB2 if statements

2016-04-07 Thread Ross Lagerwall
SLES 12's default GRUB config has the following code before any entries:
if [ -n "$extra_cmdline" ]; then
  submenu "Bootable snapshot #$snapshot_num" {
    menuentry "If OK, run 'snapper rollback' and reboot." { true; }
  }
fi

This prevents pygrub from booting using the default entry. Since I'm not
aware of any distro GRUB config which puts useful entries within
conditionals, ignore them.
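
The skipping logic can be tried in isolation with the sketch below. It is a
simplified stand-alone model, not GrubConf.py itself: top_level_lines is a
hypothetical helper that only returns the lines the parser would go on to
look at, and like the patch it does not track nesting, so a nested "if"
would end the skipped region at the first "fi".

```python
def top_level_lines(buf):
    """Return the config lines that are outside function and if blocks."""
    kept = []
    in_function = False
    in_if = False
    for raw in buf.split("\n"):
        l = raw.strip()
        # Skip blank lines and comments.
        if not l or l.startswith('#'):
            continue
        if l.startswith('function'):
            in_function = True
            continue
        elif l.startswith('if'):
            in_if = True
            continue
        if in_function or in_if:
            # Inside a skipped block: only look for its terminator.
            if l.startswith('}'):
                in_function = False
            elif l.startswith('fi'):
                in_if = False
            continue
        kept.append(l)
    return kept
```

With the SLES 12 snippet above followed by a regular menuentry, only the
regular entry's lines are returned, so the snapshot submenu no longer
shadows the default entry.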

Signed-off-by: Ross Lagerwall 
---
 tools/pygrub/src/GrubConf.py | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/pygrub/src/GrubConf.py b/tools/pygrub/src/GrubConf.py
index dc810d5..cf9aa8b 100644
--- a/tools/pygrub/src/GrubConf.py
+++ b/tools/pygrub/src/GrubConf.py
@@ -373,6 +373,7 @@ class Grub2ConfigFile(_GrubConfigFile):
 lines = buf.split("\n")
 
 in_function = False
+in_if = False
 img = None
 title = ""
 menu_level=0
@@ -389,9 +390,14 @@ class Grub2ConfigFile(_GrubConfigFile):
 if l.startswith('function'):
 in_function = True
 continue
-if in_function:
+elif l.startswith('if'):
+in_if = True
+continue
+if in_function or in_if:
 if l.startswith('}'):
 in_function = False
+elif l.startswith('fi'):
+in_if = False
 continue
 
 # new image
-- 
2.4.3




Re: [Xen-devel] [PATCH v6 02/24] xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op

2016-04-07 Thread Andrew Cooper
On 07/04/16 04:49, Konrad Rzeszutek Wilk wrote:
> +static int xsplice_upload(xen_sysctl_xsplice_upload_t *upload)
> +{
> +struct payload *data = NULL, *found;
> +char n[XEN_XSPLICE_NAME_SIZE];
> +int rc;
> +
> +rc = verify_payload(upload, n);
> +if ( rc )
> +return rc;
> +
> +spin_lock(&payload_lock);
> +
> +found = find_payload(n);
> +if ( found && !IS_ERR(found) /* Found. */ )
> +{
> +rc = -EEXIST;
> +goto out;
> +}
> +
> +if ( IS_ERR(found) )
> +{
> +rc = PTR_ERR(found);
> +goto out;
> +}

This logic chain can be simplified to

    if ( IS_ERR(found) )
    {
        rc = PTR_ERR(found);
        goto out;
    }
    else if ( found )
    {
        rc = -EEXIST;
        goto out;
    }


> +static void xsplice_printall(unsigned char key)
> +{
> +struct payload *data;

    printk("'%c' pressed - Dumping all xsplice patches\n", key);

to match other keyhandlers, and give some context to a bunch of lines
starting " name=...".

> +
> +if ( !spin_trylock(&payload_lock) )
> +{
> +printk("Lock held. Try again.\n");
> +return;
> +}
> +
> +list_for_each_entry ( data, &payload_list, list )
> +printk(" name=%s state=%s(%d)\n", data->name,
> +   state2str(data->state), data->state);
> +
> +spin_unlock(&payload_lock);
> +}
> +

> diff --git a/xen/include/xen/xsplice.h b/xen/include/xen/xsplice.h
> new file mode 100644
> index 000..5c84851
> --- /dev/null
> +++ b/xen/include/xen/xsplice.h
> @@ -0,0 +1,35 @@
> +/*
> + * Copyright (c) 2016 Oracle and/or its affiliates. All rights reserved.
> + *
> + */
> +
> +#ifndef __XEN_XSPLICE_H__
> +#define __XEN_XSPLICE_H__
> +
> +struct xen_sysctl_xsplice_op;
> +
> +#ifdef CONFIG_XSPLICE
> +
> +int xsplice_op(struct xen_sysctl_xsplice_op *);
> +
> +#else
> +
> +#include  /* For -EOPNOTSUPP */
> +static inline int xsplice_op(struct xen_sysctl_xsplice_op *op)
> +{
> +return -EOPNOTSUPP;

-ENOSYS, as this disables all xsplice functionality, and matches the
existing behaviour for missing SYSCTL_ ops.

> +}
> +
> +#endif /* CONFIG_XSPLICE */
> +
> +#endif /* __XEN_XSPLICE_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
> index 1eaec58..3ef0441 100644
> --- a/xen/xsm/flask/hooks.c
> +++ b/xen/xsm/flask/hooks.c
> @@ -808,6 +808,12 @@ static int flask_sysctl(int cmd)
>  case XEN_SYSCTL_tmem_op:
>  return domain_has_xen(current->domain, XEN__TMEM_CONTROL);
>  
> +#ifdef CONFIG_XSPLICE
> +case XEN_SYSCTL_xsplice_op:
> +return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
> +XEN2__XSPLICE_OP, NULL);
> +#endif

The case statement should not be conditional.  Otherwise, a toolstack
issuing an xsplice_op against a hypervisor with xsplice compiled out
will hit the "Unknown op" below.

Given that XEN2__XSPLICE_OP unconditionally exists, I would just drop
the #ifdef's completely, and accept that if this permissions check ends
up passing, the actual xsplice_op handler will fail.

No major problems, so with these fixed, Reviewed-by: Andrew Cooper


> +
>  default:
>  printk("flask_sysctl: Unknown op %d\n", cmd);
>  return -EPERM;




Re: [Xen-devel] [PATCH v2 04/11] xen: sched: close potential races when switching scheduler to CPUs

2016-04-07 Thread George Dunlap
On 06/04/16 18:23, Dario Faggioli wrote:
> In short, the point is making sure that the actual switch
> of scheduler and the remapping of the scheduler's runqueue
> lock occur in the same critical section, protected by the
> "old" scheduler's lock (and not, e.g., in the free_pdata
> hook, as it is now for Credit2 and RTDS).
> 
> Not doing so is (at least) racy. For instance,
> if we switch cpu X from Credit2 to Credit, we do:
> 
>  schedule_cpu_switch(x, csched2 --> csched):
>    //scheduler[x] is csched2
>    //schedule_lock[x] is csched2_lock
>    csched_alloc_pdata(x)
>    csched_init_pdata(x)
>    pcpu_schedule_lock(x) --> takes csched2_lock
>    scheduler[X] = csched
>    pcpu_schedule_unlock(x) --> unlocks csched2_lock
>    [1]
>    csched2_free_pdata(x)
>      pcpu_schedule_lock(x) --> takes csched2_lock
>      schedule_lock[x] = csched_lock
>      spin_unlock(csched2_lock)
> 
> While, if we switch cpu X from Credit to Credit2, we do:
> 
>  schedule_cpu_switch(X, csched --> csched2):
>    //scheduler[x] is csched
>    //schedule_lock[x] is csched_lock
>    csched2_alloc_pdata(x)
>    csched2_init_pdata(x)
>      pcpu_schedule_lock(x) --> takes csched_lock
>      schedule_lock[x] = csched2_lock
>      spin_unlock(csched_lock)
>    [2]
>    pcpu_schedule_lock(x) --> takes csched2_lock
>    scheduler[X] = csched2
>    pcpu_schedule_unlock(x) --> unlocks csched2_lock
>    csched_free_pdata(x)
> 
> And if we switch cpu X from RTDS to Credit2, we do:
> 
>  schedule_cpu_switch(X, RTDS --> csched2):
>    //scheduler[x] is rtds
>    //schedule_lock[x] is rtds_lock
>    csched2_alloc_pdata(x)
>    csched2_init_pdata(x)
>      pcpu_schedule_lock(x) --> takes rtds_lock
>      schedule_lock[x] = csched2_lock
>      spin_unlock(rtds_lock)
>    pcpu_schedule_lock(x) --> takes csched2_lock
>    scheduler[x] = csched2
>    pcpu_schedule_unlock(x) --> unlocks csched2_lock
>    rtds_free_pdata(x)
>      spin_lock(rtds_lock)
>      ASSERT(schedule_lock[x] == rtds_lock) [3]
>      schedule_lock[x] = DEFAULT_SCHEDULE_LOCK [4]
>      spin_unlock(rtds_lock)
> 
> So, the first problem is that, if anything related to
> scheduling and involving the CPU happens at [1] or [2], we:
>  - take csched2_lock,
>  - operate on Credit1 functions and data structures,
> which is no good!
> 
> The second problem is that the ASSERT at [3] triggers, and
> the third is that, at [4], we screw up the lock remapping we've
> done for ourselves in csched2_init_pdata()!
> 
> The first problem arises because there is a window during
> which the lock is already the new one, but the scheduler is
> still the old one. The other two arise because we let schedulers
> mess with the lock (re)mapping done by others.
> 
> This patch, therefore, introduces a new hook in the scheduler
> interface, called switch_sched, meant to be used when
> switching scheduler on a CPU, and implements it for the
> various schedulers (those that need it, i.e., all except ARINC653),
> so that things are done in the proper order and under the
> protection of the best suited (set of) lock(s). It is
> necessary to add the hook (as opposed to keeping things
> in generic code), because different schedulers may have
> different locking schemes.
> 
> Signed-off-by: Dario Faggioli 

Hey Dario! Everything here looks good, except for one thing: the
scheduler lock for the arinc653 scheduler. :-)  What happens now if you
assign a cpu to credit2, and then assign it to arinc653?  Since arinc
doesn't implement the switch_sched() functionality, the per-cpu
scheduler lock will still point to the credit2 lock, won't it?

That will *work*, although it will add unnecessary contention to the
credit2 lock, until the lock goes away, at which point
vcpu_schedule_lock*() will essentially be using a wild pointer.
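
To make the concern concrete, here is a toy model, not Xen code. All toy_* names are made up for illustration, and the model assumes that only schedulers providing the hook remap the per-cpu lock pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock object; only its address matters in this model. */
struct toy_lock { int unused; };

/* Hypothetical model of a scheduler: the switch hook is optional. */
struct toy_sched {
    const char *name;
    struct toy_lock *lock;     /* lock the cpu should use under it */
    bool has_switch_sched;     /* false models arinc653 here */
};

/* Per-cpu state: which scheduler runs here, which lock guards it. */
struct toy_cpu {
    const struct toy_sched *sched;
    struct toy_lock *schedule_lock;
};

static void toy_cpu_switch(struct toy_cpu *c, const struct toy_sched *next)
{
    /* Only schedulers implementing the hook remap the per-cpu lock. */
    if ( next->has_switch_sched )
        c->schedule_lock = next->lock;
    c->sched = next;
}

/* True when the cpu's lock pointer refers to another scheduler's lock. */
static bool toy_lock_is_foreign(const struct toy_cpu *c)
{
    return c->schedule_lock != c->sched->lock;
}
```

Switching a toy cpu to a "credit2" entry and then to an "arinc653" entry without the hook leaves schedule_lock pointing at the credit2 lock, which is the stale pointer described above.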

 -George

> ---
> Cc: George Dunlap 
> Cc: Meng Xu 
> Cc: Tianyang Chen 
> ---
> Changes from v1:
> 
> new patch, basically, coming from squashing what were
> 4 patches in v1. In any case, with respect to those 4
> patches:
>  - runqueue lock is back being taken in schedule_cpu_switch(),
>    as suggested during review;
>  - add barriers for making sure all initialization is done
>    when the new lock is assigned, as suggested during review;
>  - add comments and ASSERT-s about how and why the adopted
>    locking scheme is safe, as suggested during review.
> ---
>  xen/common/sched_credit.c  |   44 
>  xen/common/sched_credit2.c |   81 +---
>  xen/common/sched_rt.c  |   45 +---
>  xen/common/schedule.c  |   41 +-
>  xen/include/xen/sched-if.h |3 ++
>  5 files changed, 172 insertions(+), 42 deletions(-)
> 
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index 96a245d..540d515 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -578,12 +578,55 @@ csched_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
>  {

Re: [Xen-devel] [PATCH v2 08/11] xen: sched: allow for choosing credit2 runqueues configuration at boot

2016-04-07 Thread George Dunlap
On 07/04/16 06:04, Juergen Gross wrote:
> On 06/04/16 19:23, Dario Faggioli wrote:
>> In fact, credit2 uses CPU topology to decide how to arrange
>> its internal runqueues. Before this change, only 'one runqueue
>> per socket' was allowed. However, experiments have shown that,
>> for instance, having one runqueue per physical core improves
>> performance, especially in case hyperthreading is available.
>>
>> In general, it makes sense to allow users to pick one runqueue
>> arrangement at boot time, so that:
>>  - more experiments can be easily performed to even better
>>assess and improve performance;
>>  - one can select the best configuration for his specific
>>use case and/or hardware.
>>
>> This patch enables the above.
>>
>> Note that, for correctly arranging runqueues to be per-core,
>> just checking cpu_to_core() on the host CPUs is not enough.
>> In fact, cores (and hyperthreads) on different sockets, can
>> have the same core (and thread) IDs! We, therefore, need to
>> check whether the full topology of two CPUs matches, for
>> them to be put in the same runqueue.
>>
>> Note also that the default for credit2 (although not
>> functional) has so far been per-socket runqueue. This patch
>> leaves things that way, to avoid mixing policy and technical
>> changes.
>>
>> Finally, it would be a nice feature to be able to select
>> a particular runqueue arrangement, even when creating a
>> Credit2 cpupool. This is left as future work.
>>
>> Signed-off-by: Dario Faggioli 
>> Signed-off-by: Uma Sharma 
> 
> With the one comment below addressed:
> 
> Reviewed-by: Juergen Gross 
> 
>> ---
>> Cc: George Dunlap 
>> Cc: Uma Sharma 
>> Cc: Juergen Gross 
>> ---
>> Changes from v1:
>>  * fix bug in parameter parsing, and start using strcmp()
>>    for that, as requested during review.
>> ---
>>  docs/misc/xen-command-line.markdown |   19 +
>>  xen/common/sched_credit2.c  |   76 +--
>>  2 files changed, 90 insertions(+), 5 deletions(-)
>>
> 
> ...
> 
>> @@ -2006,7 +2067,10 @@ cpu_to_runqueue(struct csched2_private *prv, unsigned int cpu)
>>  BUG_ON(cpu_to_socket(cpu) == XEN_INVALID_SOCKET_ID ||
>> cpu_to_socket(peer_cpu) == XEN_INVALID_SOCKET_ID);
>>  
>> -if ( cpu_to_socket(cpumask_first(&rqd->active)) == cpu_to_socket(cpu) )
>> +if ( opt_runqueue == OPT_RUNQUEUE_ALL ||
>> + (opt_runqueue == OPT_RUNQUEUE_CORE && same_core(peer_cpu, cpu)) ||
>> + (opt_runqueue == OPT_RUNQUEUE_SOCKET && same_socket(peer_cpu, cpu)) ||
>> + (opt_runqueue == OPT_RUNQUEUE_NODE && same_node(peer_cpu, cpu)) )
>>  break;
>>  }
>>  
>> @@ -2170,6 +2234,8 @@ csched2_init(struct scheduler *ops)
>>  printk(" load_window_shift: %d\n", opt_load_window_shift);
>>  printk(" underload_balance_tolerance: %d\n", opt_underload_balance_tolerance);
>>  printk(" overload_balance_tolerance: %d\n", opt_overload_balance_tolerance);
>> +printk(" runqueues arrangement: per-%s\n",
>> +   opt_runqueue == OPT_RUNQUEUE_CORE ? "core" : "socket");
> 
> I asked this before: shouldn't the options "node" and "all" be
> respected here, too?

Dario, would it make sense to put the string names ("core", "socket",
&c) in an array, then have both parse_credit2_runqueue() iterate over
the array to find the appropriate numeric value, and have this use the
array to convert from the numeric value to a string?
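
Something along these lines, as a rough sketch only. The enum values and the rq_* names are hypothetical and need not match the patch's OPT_RUNQUEUE_* definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical values; the patch's real OPT_RUNQUEUE_* may differ. */
enum { RQ_CORE, RQ_SOCKET, RQ_NODE, RQ_ALL, RQ_COUNT };

/* Single table mapping each numeric option to its string name. */
static const char *const rq_names[RQ_COUNT] = {
    [RQ_CORE]   = "core",
    [RQ_SOCKET] = "socket",
    [RQ_NODE]   = "node",
    [RQ_ALL]    = "all",
};

/* Parsing direction: name -> value, or -1 if the name is unknown. */
static int rq_parse(const char *s)
{
    for ( int i = 0; i < RQ_COUNT; i++ )
        if ( !strcmp(s, rq_names[i]) )
            return i;
    return -1;
}

/* Printing direction: value -> name, e.g. for the boot-time message. */
static const char *rq_name(int v)
{
    return (v >= 0 && v < RQ_COUNT) ? rq_names[v] : "unknown";
}
```

The printk() would then take rq_name(opt_runqueue) and cover "node" and "all" without a growing chain of conditionals.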

 -George



Re: [Xen-devel] [PATCH v4 08/14] hvmloader: Locate the BIOS blob

2016-04-07 Thread Anthony PERARD
On Tue, Apr 05, 2016 at 06:59:03AM -0600, Jan Beulich wrote:
> >>> On 14.03.16 at 18:55,  wrote:
> > --- a/tools/firmware/hvmloader/hvmloader.c
> > +++ b/tools/firmware/hvmloader/hvmloader.c
> > @@ -292,8 +322,16 @@ int main(void)
> >  }
> >  
> >  printf("Loading %s ...\n", bios->name);
> > -if ( bios->bios_load )
> > -bios->bios_load(bios);
> > +bios_module = get_module_entry(hvm_start_info, "bios");
> 
> Isn't "bios" a bit vague, as there could be multiple (system, video,
> add-on card)?

Maybe a bit. This can also be used to load OVMF, which is not technically a
BIOS. I guess a better name would be "firmware" or "system firmware".

On the other hand, the name "bios" is already used in a few places: in
hvmloader and in libxl (to choose between seabios/ovmf).

-- 
Anthony PERARD


