Re: [Xen-devel] [PATCH 13/25] argo: implement the register op

2018-12-21 Thread Christopher Clark
On Thu, Dec 20, 2018 at 11:28 PM Jan Beulich  wrote:
>
> >>> On 21.12.18 at 02:25,  wrote:
> > On Thu, Dec 20, 2018 at 12:29 AM Jan Beulich  wrote:
> >>
> >> >>> On 20.12.18 at 06:29,  wrote:
> >> > On Wed, Dec 12, 2018 at 1:48 AM Jan Beulich  wrote:
> >> >>
> >> >> > +static int
> >> >> > +argo_find_ring_mfns(struct domain *d, struct argo_ring_info 
> >> >> > *ring_info,
> >> >> > +uint32_t npage, 
> >> >> > XEN_GUEST_HANDLE_PARAM(argo_pfn_t) pfn_hnd,
> >> >> > +uint32_t len)
> >> >> > +{
> >> >> > +int i;
> >> >> > +int ret = 0;
> >> >> > +
> >> >> > +if ( (npage << PAGE_SHIFT) < len )
> >> >> > +return -EINVAL;
> >> >> > +
> >> >> > +if ( ring_info->mfns )
> >> >> > +{
> >> >> > +/*
> >> >> > + * Ring already existed. Check if it's the same ring,
> >> >> > + * i.e. same number of pages and all translated gpfns still
> >> >> > + * translating to the same mfns
> >> >> > + */
> >> >>
> >> >> This comment makes me wonder whether the translations are
> >> >> permitted to change at other times. If so I'm not sure what
> >> >> value verification here has. If not, this probably would want to
> >> >> be debugging-only code.
> >> >
> >> > My understanding is that the gfn->mfn translation is not necessarily 
> >> > stable
> >> > across entry and exit from host power state S4, suspend to disk.
>
> Now I'm afraid there's some confusion here: Originally you've
> said "host".
>
> >> How would that be? It's not stable across guest migration (or
> >> its non-live save/restore equivalent),
> >
> > Right, that's clear.
> >
> >> but how would things change across S3?
> >
> > I don't think that they do change in that case.
> >
> > From studying the code involved above, a related item: the guest runs the 
> > same
> > suspend and resume kernel code before entering into/exiting from either 
> > guest
> > S3 or S4, so the guest kernel resume code needs to re-register the rings, to
> > cover the case where it is coming up in an environment where they were 
> > dropped
> > - so that's what it does.
> >
> > This relates to the code section above: if guest entry to S3 is aborted at 
> > the
> > final step (eg. error or platform refuses, eg. maybe a physical device
> > interaction with passthrough) then the hypervisor has not torn down the 
> > rings,
> > the guest remains running within the same domain, and the guest resume logic
> > runs, which runs through re-registration for all its rings. The check in the
> > logic above allows the existing ring mappings within the hypervisor to be
> > preserved.
>
> Yet now you suddenly talk about guest S3.

Well, the context is that you did just ask about S3, without
specifying host or guest. Host S3 doesn't involve much at all, so I
went and studied the code in both the Linux driver and the hypervisor
to determine what it does in the case of guest S3, and then replied
with the above since it is relevant to the code in question. I hope I
was clear about referring to guest S3 above in my last reply.

That logic aims to make ring registration idempotent, to avoid the
teardown of established mappings of the ring pages in the case where
doing so isn't needed.

> >> And there's no support for S4 (and I can't see it appearing any time soon).
> >
> > OK. oh well.
>
> Considering the original "host" context, my response here was
> relating to host S4. Guest S4 ought to be functional (as being
> mostly a guest kernel function anyway).

ack.

Christopher

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 13/25] argo: implement the register op

2018-12-21 Thread Jan Beulich
>>> On 21.12.18 at 09:16,  wrote:
> On Thu, Dec 20, 2018 at 11:28 PM Jan Beulich  wrote:
>>
>> >>> On 21.12.18 at 02:25,  wrote:
>> > On Thu, Dec 20, 2018 at 12:29 AM Jan Beulich  wrote:
>> >>
>> >> >>> On 20.12.18 at 06:29,  wrote:
>> >> > On Wed, Dec 12, 2018 at 1:48 AM Jan Beulich  wrote:
>> >> >>
>> >> >> > +static int
>> >> >> > +argo_find_ring_mfns(struct domain *d, struct argo_ring_info 
>> >> >> > *ring_info,
>> >> >> > +uint32_t npage, 
>> >> >> > XEN_GUEST_HANDLE_PARAM(argo_pfn_t) pfn_hnd,
>> >> >> > +uint32_t len)
>> >> >> > +{
>> >> >> > +int i;
>> >> >> > +int ret = 0;
>> >> >> > +
>> >> >> > +if ( (npage << PAGE_SHIFT) < len )
>> >> >> > +return -EINVAL;
>> >> >> > +
>> >> >> > +if ( ring_info->mfns )
>> >> >> > +{
>> >> >> > +/*
>> >> >> > + * Ring already existed. Check if it's the same ring,
>> >> >> > + * i.e. same number of pages and all translated gpfns still
>> >> >> > + * translating to the same mfns
>> >> >> > + */
>> >> >>
>> >> >> This comment makes me wonder whether the translations are
>> >> >> permitted to change at other times. If so I'm not sure what
>> >> >> value verification here has. If not, this probably would want to
>> >> >> be debugging-only code.
>> >> >
>> >> > My understanding is that the gfn->mfn translation is not necessarily 
>> >> > stable
>> >> > across entry and exit from host power state S4, suspend to disk.

Note this ^^^ (and see below).

>> Now I'm afraid there's some confusion here: Originally you've
>> said "host".
>>
>> >> How would that be? It's not stable across guest migration (or
>> >> its non-live save/restore equivalent),
>> >
>> > Right, that's clear.
>> >
>> >> but how would things change across S3?
>> >
>> > I don't think that they do change in that case.
>> >
>> > From studying the code involved above, a related item: the guest runs the 
>> > same
>> > suspend and resume kernel code before entering into/exiting from either 
>> > guest
>> > S3 or S4, so the guest kernel resume code needs to re-register the rings, 
>> > to
>> > cover the case where it is coming up in an environment where they were 
>> > dropped
>> > - so that's what it does.
>> >
>> > This relates to the code section above: if guest entry to S3 is aborted at 
>> > the
>> > final step (eg. error or platform refuses, eg. maybe a physical device
>> > interaction with passthrough) then the hypervisor has not torn down the 
>> > rings,
>> > the guest remains running within the same domain, and the guest resume 
>> > logic
>> > runs, which runs through re-registration for all its rings. The check in 
>> > the
>> > logic above allows the existing ring mappings within the hypervisor to be
>> > preserved.
>>
>> Yet now you suddenly talk about guest S3.
> 
> Well, the context is that you did just ask about S3, without
> specifying host or guest.

I'm sorry to be picky, but no, I don't think I did. You did expicitly
say "host", making me further think only about that case.

> Host S3 doesn't involve much at all, so I
> went and studied the code in both the Linux driver and the hypervisor
> to determine what it does in the case of guest S3, and then replied
> with the above since it is relevant to the code in question. I hope I
> was clear about referring to guest S3 above in my last reply.
> 
> That logic aims to make ring registration idempotent, to avoid the
> teardown of established mappings of the ring pages in the case where
> doing so isn't needed.

You treat complexity in the kernel for complexity in the hypervisor.
I'm not sure this is appropriate, as I can't judge how much more
difficult it would be for the guest to look after itself. But let's look
at both cases again:
- For guest S3, afaik, the domain doesn't change, and hence
  memory assignment remains the same. No re-registration
  necessary then afaict.
- For guest S4, aiui, the domain gets destroyed and a new one
  built upon resume. Re-registration would be needed, but due
  to the domain re-construction no leftovers ought to exist in
  Xen.
Hence to me it would seem more natural to have the guest deal
with the situation, and have no extra logic for this in Xen. You
want the guest to re-register anyway, yet simply avoiding to
do so in the S3 case ought to be a single (or very few)
conditional(s), i.e. not a whole lot of complexity.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v2 1/8] xen/page_alloc: Move get_pg_owner()/put_pg_owner() from x86 to common code

2018-12-21 Thread Jan Beulich
>>> On 20.12.18 at 20:23,  wrote:
> From: Benjamin Sanda 
> 
> get_pg_owner() and put_pg_owner() will be necessary in a follow-up
> commit to support xentrace on Arm. So move the helper to common code.
> 
> Signed-off-by: Benjamin Sanda 
> [julien: Rework commit title / turn put_pg_owner to a macro]

Nit: It's an inline function now.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] drm/xen-front: Make shmem backed display buffer coherent

2018-12-21 Thread Oleksandr Andrushchenko

On 12/20/18 8:38 PM, Christoph Hellwig wrote:

On Thu, Dec 20, 2018 at 07:35:15PM +0100, Daniel Vetter wrote:

Err, with streaming DMA buffer sharing is trivial.  The coherent DMA
allocator is what causes all kinds of horrible hacks that can't actually
work on various platforms.

Hm, I thought the streaming dma api is the one that causes bounce
buffers and all that fun. If you're unlucky at least.

Yes it may.  But even if that happens everything will actually work,
just slower.  While the dma coherent API is simply broken.

But if you don't want bounce buffering you need to use the dma
noncoherent allocator as proposed here:


https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031982.html

which combines allocating memory that doesn't need to be bounce
buffered with a sharing scheme that can actually work.

So, the bottom line will be: I can use DMA API for what I need, but:
1. I need to remove GFP_USER
2. No need for DMA32 (so no chance for bouncing to step in)
3. I may need to check if mapping and unmapping of the buffer
at once will also help, e.g. no need to have the buffer mapped until
it is destroyed
Did I get it all right?

Thank you,
Oleksandr

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v2 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn

2018-12-21 Thread Jan Beulich
>>> On 20.12.18 at 20:23,  wrote:
> No functional change intended.
> 
> Only reasonable clean-ups are done in this patch. The rest will use _gfn
> for the time being.
> 
> Signed-off-by: Julien Grall 

Acked-by: Jan Beulich 



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [linux-4.19 test] 131447: regressions - FAIL

2018-12-21 Thread osstest service owner
flight 131447 linux-4.19 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131447/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-shadow7 xen-boot fail REGR. vs. 129313
 test-amd64-amd64-pair10 xen-boot/src_hostfail REGR. vs. 129313
 test-amd64-amd64-pair11 xen-boot/dst_hostfail REGR. vs. 129313
 test-amd64-amd64-libvirt-pair 10 xen-boot/src_host   fail REGR. vs. 129313
 test-amd64-amd64-libvirt-pair 11 xen-boot/dst_host   fail REGR. vs. 129313
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. 
vs. 129313
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 129313
 test-amd64-amd64-i386-pvgrub  7 xen-boot fail REGR. vs. 129313
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-boot fail REGR. 
vs. 129313
 test-amd64-i386-examine   8 reboot   fail REGR. vs. 129313
 test-amd64-amd64-amd64-pvgrub  7 xen-bootfail REGR. vs. 129313
 test-amd64-i386-xl-raw7 xen-boot fail REGR. vs. 129313
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-boot fail REGR. vs. 129313
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  7 xen-boot fail REGR. vs. 129313
 test-amd64-i386-pair 10 xen-boot/src_hostfail REGR. vs. 129313
 test-amd64-i386-pair 11 xen-boot/dst_hostfail REGR. vs. 129313
 test-amd64-i386-libvirt-pair 10 xen-boot/src_hostfail REGR. vs. 129313
 test-amd64-i386-libvirt-pair 11 xen-boot/dst_hostfail REGR. vs. 129313
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot fail REGR. vs. 129313
 test-amd64-amd64-examine  8 reboot   fail REGR. vs. 129313
 test-amd64-i386-qemut-rhel6hvm-amd 12 guest-start/redhat.repeat fail REGR. vs. 
129313

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qcow2 19 guest-start/debian.repeat fail in 131425 pass in 
131447
 test-armhf-armhf-xl-arndale   5 host-ping-check-native fail pass in 131425

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-arndale 13 migrate-support-check fail in 131425 never pass
 test-armhf-armhf-xl-arndale 14 saverestore-support-check fail in 131425 never 
pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  7 xen-boot   fail never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail 

Re: [Xen-devel] [PATCH v4 0/2] misc safety certification fixes

2018-12-21 Thread Jan Beulich
>>> On 20.12.18 at 18:26,  wrote:
> On 11/12/18 11:06 PM, Stefano Stabellini wrote:
> Discussing with Stefano today, he is aiming to get this series for Xen 
> 4.12. I will be away until the x86/common code freeze.
> 
> I agree with him that I will waive my ack if it gets reviewed by any 
> committers.

Well, discussion on patch 2 was abandoned rather than finished
afaict, which means Stefano either lost interest or is meaning to
submit v5 with the comments addressed.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 0/3] x86/mm-locks: add a bias to control domain lock levels

2018-12-21 Thread Roger Pau Monne
Hello,

The following series attempts to fix a mm lock level issue that prevents
using paging_log_dirty_op from a paging Dom0 (like a PVH Dom0). The
discussion that lead to this series can be found at:

https://lists.xenproject.org/archives/html/xen-devel/2018-12/msg01197.html

Roger Pau Monne (3):
  x86/mm-locks: remove trailing whitespace
  x86/mm-locks: convert some macros to inline functions
  x86/mm-locks: apply a bias to lock levels for control domain

 xen/arch/x86/mm/mm-locks.h | 217 ++---
 xen/arch/x86/mm/p2m-pod.c  |   5 +-
 2 files changed, 130 insertions(+), 92 deletions(-)

-- 
2.20.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 2/3] x86/mm-locks: convert some macros to inline functions

2018-12-21 Thread Roger Pau Monne
And rename to have only one prefix underscore where applicable.

No functional change.

Signed-off-by: Roger Pau Monné 
Reviewed-by: George Dunlap 
---
Cc: George Dunlap 
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Wei Liu 
---
 xen/arch/x86/mm/mm-locks.h | 96 --
 1 file changed, 51 insertions(+), 45 deletions(-)

diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
index 64b8775a6d..d3497713e9 100644
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -29,7 +29,6 @@
 
 /* Per-CPU variable for enforcing the lock ordering */
 DECLARE_PER_CPU(int, mm_lock_level);
-#define __get_lock_level()  (this_cpu(mm_lock_level))
 
 DECLARE_PERCPU_RWLOCK_GLOBAL(p2m_percpu_rwlock);
 
@@ -46,43 +45,47 @@ static inline int mm_locked_by_me(mm_lock_t *l)
 return (l->lock.recurse_cpu == current->processor);
 }
 
+static inline int _get_lock_level(void)
+{
+return this_cpu(mm_lock_level);
+}
+
 /*
  * If you see this crash, the numbers printed are order levels defined
  * in this file.
  */
-#define __check_lock_level(l)   \
-do {\
-if ( unlikely(__get_lock_level() > (l)) )   \
-{   \
-printk("mm locking order violation: %i > %i\n", \
-   __get_lock_level(), (l));\
-BUG();  \
-}   \
-} while(0)
-
-#define __set_lock_level(l) \
-do {\
-__get_lock_level() = (l);   \
-} while(0)
+static inline void _check_lock_level(int l)
+{
+if ( unlikely(_get_lock_level() > l) )
+{
+printk("mm locking order violation: %i > %i\n", _get_lock_level(), l);
+BUG();
+}
+}
+
+static inline void _set_lock_level(int l)
+{
+this_cpu(mm_lock_level) = l;
+}
 
 static inline void _mm_lock(mm_lock_t *l, const char *func, int level, int rec)
 {
 if ( !((mm_locked_by_me(l)) && rec) )
-__check_lock_level(level);
+_check_lock_level(level);
 spin_lock_recursive(&l->lock);
 if ( l->lock.recurse_cnt == 1 )
 {
 l->locker_function = func;
-l->unlock_level = __get_lock_level();
+l->unlock_level = _get_lock_level();
 }
 else if ( (unlikely(!rec)) )
 panic("mm lock already held by %s\n", l->locker_function);
-__set_lock_level(level);
+_set_lock_level(level);
 }
 
 static inline void _mm_enforce_order_lock_pre(int level)
 {
-__check_lock_level(level);
+_check_lock_level(level);
 }
 
 static inline void _mm_enforce_order_lock_post(int level, int *unlock_level,
@@ -92,12 +95,12 @@ static inline void _mm_enforce_order_lock_post(int level, 
int *unlock_level,
 {
 if ( (*recurse_count)++ == 0 )
 {
-*unlock_level = __get_lock_level();
+*unlock_level = _get_lock_level();
 }
 } else {
-*unlock_level = __get_lock_level();
+*unlock_level = _get_lock_level();
 }
-__set_lock_level(level);
+_set_lock_level(level);
 }
 
 
@@ -118,12 +121,12 @@ static inline void _mm_write_lock(mm_rwlock_t *l, const 
char *func, int level)
 {
 if ( !mm_write_locked_by_me(l) )
 {
-__check_lock_level(level);
+_check_lock_level(level);
 percpu_write_lock(p2m_percpu_rwlock, &l->lock);
 l->locker = get_processor_id();
 l->locker_function = func;
-l->unlock_level = __get_lock_level();
-__set_lock_level(level);
+l->unlock_level = _get_lock_level();
+_set_lock_level(level);
 }
 l->recurse_count++;
 }
@@ -134,13 +137,13 @@ static inline void mm_write_unlock(mm_rwlock_t *l)
 return;
 l->locker = -1;
 l->locker_function = "nobody";
-__set_lock_level(l->unlock_level);
+_set_lock_level(l->unlock_level);
 percpu_write_unlock(p2m_percpu_rwlock, &l->lock);
 }
 
 static inline void _mm_read_lock(mm_rwlock_t *l, int level)
 {
-__check_lock_level(level);
+_check_lock_level(level);
 percpu_read_lock(p2m_percpu_rwlock, &l->lock);
 /* There's nowhere to store the per-CPU unlock level so we can't
  * set the lock level. */
@@ -181,7 +184,7 @@ static inline void mm_unlock(mm_lock_t *l)
 if ( l->lock.recurse_cnt == 1 )
 {
 l->locker_function = "nobody";
-__set_lock_level(l->unlock_level);
+_set_lock_level(l->unlock_level);
 }
 spin_unlock_recursive(&l->lock);
 }
@@ -194,10 +197,10 @@ static inline void mm_enforce_order_unlock(int 
unlock_level,
 BUG_ON(*recurse_count == 0);
 if ( (*recurse_count)-- == 1 )
 {
-__set_lock_level(unlock_level);
+_set_lock_level(unlock_level);
 }
 } else {
-__set_lock_level(unlock_level);
+_set_lock_level(unlock_level);
 }
 }
 
@@ -287,21 +290,24 @@ declare_

[Xen-devel] [PATCH v2 1/3] x86/mm-locks: remove trailing whitespace

2018-12-21 Thread Roger Pau Monne
No functional change.

Signed-off-by: Roger Pau Monné 
Reviewed-by: George Dunlap 
---
Cc: George Dunlap 
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Wei Liu 
---
 xen/arch/x86/mm/mm-locks.h | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
index 95295b62d2..64b8775a6d 100644
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -3,11 +3,11 @@
  *
  * Spinlocks used by the code in arch/x86/mm.
  *
- * Copyright (c) 2011 Citrix Systems, inc. 
+ * Copyright (c) 2011 Citrix Systems, inc.
  * Copyright (c) 2007 Advanced Micro Devices (Wei Huang)
  * Copyright (c) 2006-2007 XenSource Inc.
  * Copyright (c) 2006 Michael A Fetterman
- * 
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation; either version 2 of the License, or
@@ -41,7 +41,7 @@ static inline void mm_lock_init(mm_lock_t *l)
 l->unlock_level = 0;
 }
 
-static inline int mm_locked_by_me(mm_lock_t *l) 
+static inline int mm_locked_by_me(mm_lock_t *l)
 {
 return (l->lock.recurse_cpu == current->processor);
 }
@@ -67,7 +67,7 @@ do {\
 
 static inline void _mm_lock(mm_lock_t *l, const char *func, int level, int rec)
 {
-if ( !((mm_locked_by_me(l)) && rec) ) 
+if ( !((mm_locked_by_me(l)) && rec) )
 __check_lock_level(level);
 spin_lock_recursive(&l->lock);
 if ( l->lock.recurse_cnt == 1 )
@@ -186,7 +186,7 @@ static inline void mm_unlock(mm_lock_t *l)
 spin_unlock_recursive(&l->lock);
 }
 
-static inline void mm_enforce_order_unlock(int unlock_level, 
+static inline void mm_enforce_order_unlock(int unlock_level,
 unsigned short *recurse_count)
 {
 if ( recurse_count )
@@ -310,7 +310,7 @@ declare_mm_rwlock(altp2m);
 #define gfn_locked_by_me(p,g) p2m_locked_by_me(p)
 
 /* PoD lock (per-p2m-table)
- * 
+ *
  * Protects private PoD data structs: entry and cache
  * counts, page lists, sweep parameters. */
 
@@ -322,7 +322,7 @@ declare_mm_lock(pod)
 
 /* Page alloc lock (per-domain)
  *
- * This is an external lock, not represented by an mm_lock_t. However, 
+ * This is an external lock, not represented by an mm_lock_t. However,
  * pod code uses it in conjunction with the p2m lock, and expecting
  * the ordering which we enforce here.
  * The lock is not recursive. */
@@ -338,13 +338,13 @@ declare_mm_order_constraint(page_alloc)
  * For shadow pagetables, this lock protects
  *   - all changes to shadow page table pages
  *   - the shadow hash table
- *   - the shadow page allocator 
+ *   - the shadow page allocator
  *   - all changes to guest page table pages
  *   - all changes to the page_info->tlbflush_timestamp
- *   - the page_info->count fields on shadow pages 
- * 
- * For HAP, it protects the NPT/EPT tables and mode changes. 
- * 
+ *   - the page_info->count fields on shadow pages
+ *
+ * For HAP, it protects the NPT/EPT tables and mode changes.
+ *
  * It also protects the log-dirty bitmap from concurrent accesses (and
  * teardowns, etc). */
 
-- 
2.20.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 3/3] x86/mm-locks: apply a bias to lock levels for control domain

2018-12-21 Thread Roger Pau Monne
paging_log_dirty_op function takes mm locks from a subject domain and
then attempts to perform copy to operations against the caller domain
in order to copy the result of the hypercall into the caller provided
buffer.

This works fine when the caller is a non-paging domain, but triggers a
lock order panic when the caller is a paging domain due to the fact
that at the point where the copy to operation is performed the subject
domain paging lock is locked, and the copy operation requires
locking the caller p2m lock which has a lower level.

Fix this limitation by adding a bias to the level of control domain mm
locks, so that the lower control domain mm lock always has a level
greater than the higher unprivileged domain lock level. This allows
locking the subject domain mm locks and then locking the control
domain mm locks, while keeping the same lock ordering and the changes
mostly confined to mm-locks.h.

Note that so far only this flow (locking a subject domain locks and
then the control domain ones) has been identified, but not all
possible code paths have been inspected. Hence this solution attempts
to be a non-intrusive fix for the problem at hand, without discarding
further changes in the future if other valid code paths are found that
require more complex lock level ordering.

Signed-off-by: Roger Pau Monné 
---
Cc: George Dunlap 
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: Tim Deegan 
---
Changes since v1:
 - Boost only control domain mm lock levels instead of the caller.
---
 xen/arch/x86/mm/mm-locks.h | 119 +++--
 xen/arch/x86/mm/p2m-pod.c  |   5 +-
 2 files changed, 78 insertions(+), 46 deletions(-)

diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
index d3497713e9..d6c073dc5c 100644
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -50,15 +50,35 @@ static inline int _get_lock_level(void)
 return this_cpu(mm_lock_level);
 }
 
+#define MM_LOCK_ORDER_MAX64
+/*
+ * Return the lock level taking the domain bias into account. If the domain is
+ * privileged a bias of MM_LOCK_ORDER_MAX is applied to the lock level, so that
+ * mm locks that belong to a control domain can be acquired after having
+ * acquired mm locks of an unprivileged domain.
+ *
+ * This is required in order to use some hypercalls from a paging domain that
+ * take locks of a subject domain and then attempt to copy data to/from the
+ * caller domain.
+ */
+static inline int _lock_level(const struct domain *d, int l)
+{
+ASSERT(l <= MM_LOCK_ORDER_MAX);
+
+return l + (d && is_control_domain(d) ? MM_LOCK_ORDER_MAX : 0);
+}
+
 /*
  * If you see this crash, the numbers printed are order levels defined
  * in this file.
  */
-static inline void _check_lock_level(int l)
+static inline void _check_lock_level(const struct domain *d, int l)
 {
-if ( unlikely(_get_lock_level() > l) )
+int lvl = _lock_level(d, l);
+
+if ( unlikely(_get_lock_level() > lvl) )
 {
-printk("mm locking order violation: %i > %i\n", _get_lock_level(), l);
+printk("mm locking order violation: %i > %i\n", _get_lock_level(), 
lvl);
 BUG();
 }
 }
@@ -68,10 +88,11 @@ static inline void _set_lock_level(int l)
 this_cpu(mm_lock_level) = l;
 }
 
-static inline void _mm_lock(mm_lock_t *l, const char *func, int level, int rec)
+static inline void _mm_lock(const struct domain *d, mm_lock_t *l,
+const char *func, int level, int rec)
 {
 if ( !((mm_locked_by_me(l)) && rec) )
-_check_lock_level(level);
+_check_lock_level(d, level);
 spin_lock_recursive(&l->lock);
 if ( l->lock.recurse_cnt == 1 )
 {
@@ -80,16 +101,17 @@ static inline void _mm_lock(mm_lock_t *l, const char 
*func, int level, int rec)
 }
 else if ( (unlikely(!rec)) )
 panic("mm lock already held by %s\n", l->locker_function);
-_set_lock_level(level);
+_set_lock_level(_lock_level(d, level));
 }
 
-static inline void _mm_enforce_order_lock_pre(int level)
+static inline void _mm_enforce_order_lock_pre(const struct domain *d, int 
level)
 {
-_check_lock_level(level);
+_check_lock_level(d, level);
 }
 
-static inline void _mm_enforce_order_lock_post(int level, int *unlock_level,
-unsigned short *recurse_count)
+static inline void _mm_enforce_order_lock_post(const struct domain *d, int 
level,
+   int *unlock_level,
+   unsigned short *recurse_count)
 {
 if ( recurse_count )
 {
@@ -100,7 +122,7 @@ static inline void _mm_enforce_order_lock_post(int level, 
int *unlock_level,
 } else {
 *unlock_level = _get_lock_level();
 }
-_set_lock_level(level);
+_set_lock_level(_lock_level(d, level));
 }
 
 
@@ -117,16 +139,17 @@ static inline int mm_write_locked_by_me(mm_rwlock_t *l)
 return (l->locker == get_processor_id());

[Xen-devel] [xen-unstable-smoke test] 131486: tolerable all pass - PUSHED

2018-12-21 Thread osstest service owner
flight 131486 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131486/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  396d8d5418ea908a5ef88e7d7a9f22c70ada44c2
baseline version:
 xen  7183e86a29c3fe15078eb0b8c11d3e556c22effa

Last test of basis   131445  2018-12-19 14:00:43 Z1 days
Testing same since   131486  2018-12-21 08:00:41 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 
  Roger Pau Monné 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   7183e86a29..396d8d5418  396d8d5418ea908a5ef88e7d7a9f22c70ada44c2 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 0/3] x86/mm-locks: add a bias to control domain lock levels

2018-12-21 Thread Jan Beulich
>>> On 21.12.18 at 10:41,  wrote:
> Hello,
> 
> The following series attempts to fix a mm lock level issue that prevents
> using paging_log_dirty_op from a paging Dom0 (like a PVH Dom0). The
> discussion that lead to this series can be found at:
> 
> https://lists.xenproject.org/archives/html/xen-devel/2018-12/msg01197.html 
> 
> Roger Pau Monne (3):
>   x86/mm-locks: remove trailing whitespace
>   x86/mm-locks: convert some macros to inline functions

You could have checked the tree before sending - I've committed
these two in the morning.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [ovmf test] 131470: regressions - FAIL

2018-12-21 Thread osstest service owner
flight 131470 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131470/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i3866 xen-buildfail REGR. vs. 129475
 build-i386-xsm6 xen-buildfail REGR. vs. 129475
 build-amd64-xsm   6 xen-buildfail REGR. vs. 129475
 build-amd64   6 xen-buildfail REGR. vs. 129475

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a

version targeted for testing:
 ovmf 5a9b3eb8e5fbbc0a49f80630039d58712aacfab8
baseline version:
 ovmf 5ae3184d8c59f7bbb84bad482df6b8020ba58188

Last test of basis   129475  2018-11-05 21:13:11 Z   45 days
Failing since129526  2018-11-06 20:49:26 Z   44 days  162 attempts
Testing same since   131470  2018-12-20 21:57:41 Z0 days1 attempts


People who touched revisions under test:
  Achin Gupta 
  Ard Biesheuvel 
  Bob Feng 
  bob.c.f...@intel.com 
  BobCF 
  Chasel Chiu 
  Chasel, Chiu 
  Chen A Chen 
  Dandan Bi 
  David Wei 
  Derek Lin 
  Eric Dong 
  Feng, Bob C 
  Fu Siyuan 
  Gary Lin 
  Hao Wu 
  Jaben Carsey 
  Jeff Brasen 
  Jian J Wang 
  Jiaxin Wu 
  Jiewen Yao 
  Laszlo Ersek 
  Leif Lindholm 
  Liming Gao 
  Liu Yu 
  Marc Zyngier 
  Marcin Wojtas 
  Ming Huang 
  Pedroa Liu 
  Ruiyu Ni 
  shenglei 
  Shenglei Zhang 
  Star Zeng 
  Sughosh Ganu 
  Sumit Garg 
  Sun, Zailiang 
  Thomas Abraham 
  Ting Ye 
  Tomasz Michalec 
  Vijayenthiran Subramaniam 
  Vladimir Olovyannikov 
  Wang BinX A 
  Wu Jiaxin 
  Ye Ting 
  Yonghong Zhu 
  yuchenlin 
  Zailiang Sun 
  Zhang, Chao B 
  Zhao, ZhiqiangX 
  Zhiju.Fan 
  zhijufan 
  ZhiqiangX Zhao 
  zwei4 

jobs:
 build-amd64-xsm  fail
 build-i386-xsm   fail
 build-amd64  fail
 build-i386   fail
 build-amd64-libvirt  blocked 
 build-i386-libvirt   blocked 
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 4265 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 03/15] x86/cpu/vpmu: Add Hygon Dhyana support for vPMU

2018-12-21 Thread Pu Wen
On 2018/12/20 22:25, Boris Ostrovsky wrote:
...
>> diff --git a/xen/arch/x86/cpu/vpmu_amd.c b/xen/arch/x86/cpu/vpmu_amd.c
>> index 5efc39b..e9f0a5c 100644
>> --- a/xen/arch/x86/cpu/vpmu_amd.c
>> +++ b/xen/arch/x86/cpu/vpmu_amd.c
>> @@ -554,6 +554,8 @@ int __init amd_vpmu_init(void)
>>   case 0x12:
>>   case 0x14:
>>   case 0x16:
>> +case 0x17:
>> +case 0x18:
>
>
> This also enables VPMU support for Zen which goes beyond what the
> commit message claims to do.

Sorry for the not clear commit message. Will add modification description
in the commit message and make the changes complete.

On the other hand, since current Xen vPMU still not support Zen. so in
this patch we enable 0x17 support. If this modification is not preferred,
will remove AMD Xen 0x17 support in next version.

>
> Also, why are you choosing to use legacy MSRs (and you did the same in
> Linux)? Doesn't Zen (which you are saying is similar to Hygon) support
> c001_020X bank?

In Linux, the Xen PMU driver use the default branch cases, which also use
the legacy MSRs way. So we choose to follow legacy MSRs here in Dhyana
cases.

Since both of Zen and Dhyana support C001_020X MSRs. If use the C001_020X
is preferred, we will try to modify the related codes and create a patch.
Also the Linux Xen PMU driver may need to be updated to use these MSRs.

-- 
Regards,
Pu Wen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 1/3] xen/pt: fix some pass-thru devices don't work across reboot

2018-12-21 Thread Jan Beulich
>>> On 20.12.18 at 16:29,  wrote:
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -1514,6 +1514,55 @@ static int assign_device(struct domain *d, u16 seg, u8 
> bus, u8 devfn, u32 flag)
>  return rc;
>  }
>  
> +/*
> + * Unmap established mappings between domain's pirq and device's MSI.
> + * These mappings were set up by qemu/guest and are expected to be
> + * destroyed when changing the device's ownership.
> + */
> +static void pci_unmap_msi(struct pci_dev *pdev)
> +{
> +struct msi_desc *entry;
> +struct domain *d = pdev->domain;
> +
> +ASSERT(pcidevs_locked());
> +
> +if ( !d )
> +return;

Why? deassign_device() (the only caller) ought to guarantee this,
due to its use of pci_get_pdev_by_domain(). I think this simply
wants to be another ASSERT(), if anything at all.

> +spin_lock(&d->event_lock);
> +while ( (entry = list_first_entry_or_null(&pdev->msi_list,
> +  struct msi_desc, list)) != 
> NULL )
> +{
> +struct pirq *info;
> +int pirq = 0;
> +unsigned int nr = entry->msi_attrib.type != PCI_CAP_ID_MSIX
> +  ? entry->msi.nvec : 1;
> +
> +while ( nr -- )

Stray blank.

> +{
> +struct hvm_pirq_dpci *pirq_dpci;
> +
> +pirq = domain_irq_to_pirq(d, entry[nr].irq);
> +WARN_ON(pirq < 0);
> +if ( pirq <= 0 )
> +continue;
> +
> +info = pirq_info(d, pirq);
> +if ( !info )
> +continue;
> +
> +pirq_dpci = pirq_dpci(info);
> +if ( pirq_dpci &&
> + (pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI) &&
> + (pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI) )
> +pt_irq_destroy_bind_msi(d, info);
> +}
> +if ( pirq > 0 )
> +unmap_domain_pirq(d, pirq);

Can you guarantee that this function won't fail? Because if it
does, I think you might end up in an infinite loop, because the
entry doesn't always get removed from the list in error cases.
Maybe unmap_domain_pirq() needs a "force" mode added,
perhaps indirectly by way of passing "entry" into it (all other
callers would pass NULL).

But then again I'm still not fully convinced that a hypervisor
change is the right course of action here in the first place. It
would be better if the hypervisor had to just verify that all
IRQ mappings are gone, or else fail the de-assignment of the
device.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 3/3] xen/pt: initialize 'warned' field of arch_msix properly

2018-12-21 Thread Jan Beulich
>>> On 20.12.18 at 16:29,  wrote:
> --- a/xen/include/asm-x86/msi.h
> +++ b/xen/include/asm-x86/msi.h
> @@ -252,5 +252,10 @@ void guest_mask_msi_irq(struct irq_desc *, bool mask);
>  void ack_nonmaskable_msi_irq(struct irq_desc *);
>  void end_nonmaskable_msi_irq(struct irq_desc *, u8 vector);
>  void set_msi_affinity(struct irq_desc *, const cpumask_t *);
> +static inline void init_arch_msix(struct arch_msix *msix)
> +{
> +spin_lock_init(&msix->table_lock);
> +msix->warned = DOMID_INVALID;
> +}

I think this would better sit next to the structure definition,
i.e. a few lines up. In any event a separating blank line
needs adding. With that
Reviewed-by: Jan Beulich 

Additionally perhaps arch_init_msix() would better fit our
general naming of arch-specific functions.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 01/15] x86/cpu: Create Hygon Dhyana architecture support file

2018-12-21 Thread Andrew Cooper
On 20/12/2018 13:12, Pu Wen wrote:
> diff --git a/xen/arch/x86/cpu/hygon.c b/xen/arch/x86/cpu/hygon.c
> new file mode 100644
> index 000..0728b4a
> --- /dev/null
> +++ b/xen/arch/x86/cpu/hygon.c
> @@ -0,0 +1,296 @@
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "cpu.h"
> +
> +static unsigned int __initdata opt_cpuid_mask_l7s0_eax = ~0u;
> +integer_param("cpuid_mask_l7s0_eax", opt_cpuid_mask_l7s0_eax);
> +static unsigned int __initdata opt_cpuid_mask_l7s0_ebx = ~0u;
> +integer_param("cpuid_mask_l7s0_ebx", opt_cpuid_mask_l7s0_ebx);

These should be moved from the AMD specific code into the common cpu
code (alongside the other masks) rather than duplicated here.

> +
> +static inline int rdmsr_hygon_safe(unsigned int msr, unsigned int *lo,
> +  unsigned int *hi)
> +{
> + int err;
> +
> + asm volatile("1: rdmsr\n2:\n"
> +  ".section .fixup,\"ax\"\n"
> +  "3: movl %6,%2\n"
> +  "   jmp 2b\n"
> +  ".previous\n"
> +  _ASM_EXTABLE(1b, 3b)
> +  : "=a" (*lo), "=d" (*hi), "=r" (err)
> +  : "c" (msr), "D" (0x9c5a203a), "2" (0), "i" (-EFAULT));

These rdmsr/wrmsr helpers with a password in %edi are only used in the
K8 processors.  Since Hygon is a Zen derivative, you shouldn't need any
of these.

> +
> + return err;
> +}
> +
> +static inline int wrmsr_hygon_safe(unsigned int msr, unsigned int lo,
> +  unsigned int hi)
> +{
> + int err;
> +
> + asm volatile("1: wrmsr\n2:\n"
> +  ".section .fixup,\"ax\"\n"
> +  "3: movl %6,%0\n"
> +  "   jmp 2b\n"
> +  ".previous\n"
> +  _ASM_EXTABLE(1b, 3b)
> +  : "=r" (err)
> +  : "c" (msr), "a" (lo), "d" (hi), "D" (0x9c5a203a),
> +"0" (0), "i" (-EFAULT));
> +
> + return err;
> +}
> +
> +static void wrmsr_hygon(unsigned int msr, uint64_t val)
> +{
> + asm volatile("wrmsr" ::
> +  "c" (msr), "a" ((uint32_t)val),
> +  "d" (val >> 32), "D" (0x9c5a203a));
> +}
> +
> +/*
> + * Sets caps in expected_levelling_cap, probes for the specified mask MSR, 
> and
> + * set caps in levelling_caps if it is found.  Returns the default value.
> + */
> +static uint64_t __init _probe_mask_msr(unsigned int msr, uint64_t caps)
> +{
> + unsigned int hi, lo;
> +
> + expected_levelling_cap |= caps;
> +
> + if ((rdmsr_hygon_safe(msr, &lo, &hi) == 0) &&
> + (wrmsr_hygon_safe(msr, lo, hi) == 0))
> + levelling_caps |= caps;
> +
> + return ((uint64_t)hi << 32) | lo;
> +}
> +
> +/* Probe for the existance of the expected masking MSRs. */
> +static void __init noinline probe_masking_msrs(void)
> +{
> + const struct cpuinfo_x86 *c = &boot_cpu_data;
> +
> + /* Work out which masking MSRs we should have. */
> + cpuidmask_defaults._1cd =
> + _probe_mask_msr(MSR_K8_FEATURE_MASK, LCAP_1cd);
> + cpuidmask_defaults.e1cd =
> + _probe_mask_msr(MSR_K8_EXT_FEATURE_MASK, LCAP_e1cd);
> + if (c->cpuid_level >= 7)
> + cpuidmask_defaults._7ab0 =
> + _probe_mask_msr(MSR_AMD_L7S0_FEATURE_MASK, LCAP_7ab0);
> +}
> +
> +/*
> + * Context switch CPUID masking state to the next domain.  Only called if
> + * CPUID Faulting isn't available, but masking MSRs have been detected.  A
> + * parameter of NULL is used to context switch to the default host state (by
> + * the cpu bringup-code, crash path, etc).
> + */
> +static void hygon_ctxt_switch_masking(const struct vcpu *next)
> +{
> + struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
> + const struct domain *nextd = next ? next->domain : NULL;
> + const struct cpuidmasks *masks =
> + (nextd && is_pv_domain(nextd) && nextd->arch.pv.cpuidmasks)
> + ? nextd->arch.pv.cpuidmasks : &cpuidmask_defaults;
> +
> + if ((levelling_caps & LCAP_1cd) == LCAP_1cd) {
> + uint64_t val = masks->_1cd;
> +
> + /*
> +  * OSXSAVE defaults to 1, which causes fast-forwarding of
> +  * Xen's real setting.  Clobber it if disabled by the guest
> +  * kernel.
> +  */
> + if (next && is_pv_vcpu(next) && !is_idle_vcpu(next) &&
> + !(next->arch.pv.ctrlreg[4] & X86_CR4_OSXSAVE))
> + val &= ~((uint64_t)cpufeat_mask(X86_FEATURE_OSXSAVE) << 
> 32);
> +
> + if (unlikely(these_masks->_1cd != val)) {
> + wrmsr_hygon(MSR_K8_FEATURE_MASK, val);
> + these_masks->_1cd = val;
> + }
> + }
> +
> +#define LAZY(cap, msr, field)
> \
> + ({  \
> + if (unlikely(these_masks->field != masks->field) && \

Re: [Xen-devel] [PATCH v4 1/3] xen/pt: fix some pass-thru devices don't work across reboot

2018-12-21 Thread Roger Pau Monné
On Fri, Dec 21, 2018 at 03:13:50AM -0700, Jan Beulich wrote:
> But then again I'm still not fully convinced that a hypervisor
> change is the right course of action here in the first place. It
> would be better if the hypervisor had to just verify that all
> IRQ mappings are gone, or else fail the de-assignment of the
> device.

The only component (except the hypervisor) that knows about such
assignments is QEMU, and in the case of a QEMU crash the host would be
left with a device that cannot be de-assigned, because the information
about the PIRQ bindings in lost due to the QEMU crash.

IMO Xen needs to be capable of cleaning any bindings and mappings done
by the toolstack or the device model in order to be able to correctly
recover from a device model or toolstack crash.

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 1/3] xen/pt: fix some pass-thru devices don't work across reboot

2018-12-21 Thread Jan Beulich
>>> On 21.12.18 at 11:26,  wrote:
> On Fri, Dec 21, 2018 at 03:13:50AM -0700, Jan Beulich wrote:
>> But then again I'm still not fully convinced that a hypervisor
>> change is the right course of action here in the first place. It
>> would be better if the hypervisor had to just verify that all
>> IRQ mappings are gone, or else fail the de-assignment of the
>> device.
> 
> The only component (except the hypervisor) that knows about such
> assignments is QEMU, and in the case of a QEMU crash the host would be
> left with a device that cannot be de-assigned, because the information
> about the PIRQ bindings in lost due to the QEMU crash.
> 
> IMO Xen needs to be capable of cleaning any bindings and mappings done
> by the toolstack or the device model in order to be able to correctly
> recover from a device model or toolstack crash.

But possibly with tool stack assistance: Rather than doing it (in a
potentially fragile way, as per my other comments) as an integral
part of deassign-device, it could be a separate domctl to be
issued first. Or otherwise failure here ought to lead to failure of
deassign-device, rather than (e.g.) an infinite loop.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 0/2] misc safety certification fixes

2018-12-21 Thread Julien Grall

Hi,

On 21/12/2018 09:27, Jan Beulich wrote:

On 20.12.18 at 18:26,  wrote:

On 11/12/18 11:06 PM, Stefano Stabellini wrote:
Discussing with Stefano today, he is aiming to get this series for Xen
4.12. I will be away until the x86/common code freeze.

I agree with him that I will waive my ack if it gets reviewed by any
committers.


Well, discussion on patch 2 was abandoned rather than finished
afaict, which means Stefano either lost interest or is meaning to
submit v5 with the comments addressed.


He is planning to send a new version while I am on holidays. My request applies 
for any new version of this providing it gets reviewed by any committers.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 3/3] xen/pt: initialize 'warned' field of arch_msix properly

2018-12-21 Thread Roger Pau Monné
On Fri, Dec 21, 2018 at 03:17:30AM -0700, Jan Beulich wrote:
> >>> On 20.12.18 at 16:29,  wrote:
> > --- a/xen/include/asm-x86/msi.h
> > +++ b/xen/include/asm-x86/msi.h
> > @@ -252,5 +252,10 @@ void guest_mask_msi_irq(struct irq_desc *, bool mask);
> >  void ack_nonmaskable_msi_irq(struct irq_desc *);
> >  void end_nonmaskable_msi_irq(struct irq_desc *, u8 vector);
> >  void set_msi_affinity(struct irq_desc *, const cpumask_t *);
> > +static inline void init_arch_msix(struct arch_msix *msix)
> > +{
> > +spin_lock_init(&msix->table_lock);
> > +msix->warned = DOMID_INVALID;
> > +}
> 
> I think this would better sit next to the structure definition,
> i.e. a few lines up. In any event a separating blank line
> needs adding. With that
> Reviewed-by: Jan Beulich 
> 
> Additionally perhaps arch_init_msix() would better fit our
> general naming of arch-specific functions.

With Jan's comments addresses you can also add my

Reviewed-by: Roger Pau Monné 

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v2 2/8] xen/arm: p2m: Introduce p2m_get_page_from_gfn

2018-12-21 Thread Julien Grall

Hi Stefano,

On 20/12/2018 22:56, Stefano Stabellini wrote:

On Thu, 20 Dec 2018, Julien Grall wrote:

In a follow-up patches, we will need to handle get_page_from_gfn

  ^ remove a


I have removed the "es" from "patches" and keep the "a" instead.



Aside from that:

Reviewed-by: Stefano Stabellini 


Thank you for the review!

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [linux-4.4 test] 131448: tolerable FAIL - PUSHED

2018-12-21 Thread osstest service owner
flight 131448 linux-4.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131448/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl  16 guest-start/debian.repeatfail  like 131384
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail never pass
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass

version targeted for testing:
 linuxd3c67a52a66ba2d44bcf1b8262609148c7c73113
baseline version:
 linux640f85865ca658ae07d485693a3d452bdbbadaba

Last test of basis   131384  2018-12-17 08:40:03 Z4 days
Testing same since   131426  2018-12-18 08:18:16 Z3 days2 attempts


People who touched revisions under test:
  Aaro Koskinen 
  Al Viro 
  Alexei Starovoitov 
  Alexei Starovoitov 
  Anders Roxell 
  Andrew Bowers 
  Andrew Morton 
  Arnd Bergmann 
  Ashok Raj 
  Ben Hutchings 
  Borislav Petkov 
  Catalin Marinas 
  Christian König 
  Christoph Hellwig 
  Christoph Paasch 
  Colin Ian King 
  Dan Williams 
  Daniel Borkmann 
  Dave Airlie 
  David Howells 
  David Matlack 
  David S. Miller 
  David Sterba 
  David Woodhouse 
  Eric Dumazet 
  Felipe Balbi 
  Filipe Manana 
  Florian Fainelli 
  Greg Kroah-Hartman 
  Guenter Roeck 
  Heiner Kallweit 
  Hillf Danton 
  Huacai Chen 
  Ingo Molnar 
  Janusz Krzysztofik 
  Jarkko Nikula 
  Jeff Kirsher 
  Jens Axboe 
  Jes

Re: [Xen-devel] [PATCH for-4.12 v2 4/8] xen/arm: Add support for read-only foreign mappings

2018-12-21 Thread Julien Grall

Hi,

On 20/12/2018 23:35, Stefano Stabellini wrote:

On Thu, 20 Dec 2018, Julien Grall wrote:

(Sorry for the formatting)

On Thu, 20 Dec 2018 at 23:08, Stefano Stabellini  wrote:
   On Thu, 20 Dec 2018, Julien Grall wrote:
   > Current, foreign mappings can only be read-write. A follow-up patch 
will
   > extend foreign mapping for Xen backend memory (via XEN_DOMID), some of
   > that memory should only be read accessible for the mapping domain.
   >
   > Introduce a new p2m_type to cater read-only foreign mappings. For now,
   > the decision between the two foreign mapping type is based on the type
   > of the guest page mapped.
   >
   > Signed-off-by: Julien Grall 
   > Reviewed-by: Andrii Anisov 
   >
   > ---
   >
   >     Changes in v2:
   >         - Add Andrii's reviewed-by
   > ---
   >  xen/arch/arm/mm.c         |  2 +-
   >  xen/arch/arm/p2m.c        |  1 +
   >  xen/include/asm-arm/p2m.h | 42 
+++---
   >  3 files changed, 41 insertions(+), 4 deletions(-)
   >
   > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
   > index 7193d83b44..58f7e54640 100644
   > --- a/xen/arch/arm/mm.c
   > +++ b/xen/arch/arm/mm.c
   > @@ -1267,7 +1267,7 @@ int xenmem_add_to_physmap_one(
   >          }
   >
   >          mfn = page_to_mfn(page);
   > -        t = p2m_map_foreign_rw;
   > +        t = (p2mt == p2m_ram_rw) ? p2m_map_foreign_rw : 
p2m_map_foreign_ro;

   I know there is a p2m_is_ram check close by, but I think it would still
   be better to do:

     if (p2mt == p2m_ram_rw)
       t = p2m_map_foreign_rw;
     else if (p2mt == p2m_ram_ro)
       t = p2m_map_foreign_ro;
     else
       error

   to avoid cases where p2mt is something completely different
   (p2m_mmio_direct_dev for instance) and t gets set to p2m_map_foreign_ro
   by mistake.


The case you suggest is impossible. You can only be here if you manage to get a 
reference on the page (e.g p2m_foreign or
p2m_ram).
The check above remove the foreign types. But if you ever get here there are 
not much harm done as it would be read-only.

The extras 5 lines of code are just not worth it.


I realize the case is impossible today, it was for clarity and for
future proof-ness. You can reduce line code count by combining it with
the p2m_is_ram check above:

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 49d7a76..01ae2cc 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1259,7 +1259,9 @@ int xenmem_add_to_physmap_one(
  return -EINVAL;
  }
  
-if ( !p2m_is_ram(p2mt) )

+if ( p2m_is_ram(p2mt) )
+t = (p2mt == p2m_ram_rw) ? p2m_map_foreign_rw : p2m_map_foreign_ro;
+else
  {
  put_page(page);
  put_pg_owner(od);
@@ -1267,7 +1269,6 @@ int xenmem_add_to_physmap_one(
  }
  
  mfn = page_to_mfn(page);

-t = (p2mt == p2m_ram_rw) ? p2m_map_foreign_rw : p2m_map_foreign_ro;
  
  put_pg_owner(od);

  break;



That's a better solution. I will update the patch.

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v2 1/8] xen/page_alloc: Move get_pg_owner()/put_pg_owner() from x86 to common code

2018-12-21 Thread Julien Grall

Hi,

On 20/12/2018 22:59, Andrew Cooper wrote:

On 20/12/2018 22:53, Stefano Stabellini wrote:



diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 2c6509e3a0..edb93b8ada 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2509,6 +2510,43 @@ static __init int register_heap_trigger(void)
  }
  __initcall(register_heap_trigger);
  
+struct domain *get_pg_owner(domid_t domid)

+{
+struct domain *pg_owner = NULL, *curr = current->domain;
+
+if ( likely(domid == DOMID_SELF) )
+{
+pg_owner = rcu_lock_current_domain();
+goto out;
+}
+
+if ( unlikely(domid == curr->domain_id) )
+{
+gdprintk(XENLOG_WARNING, "Cannot specify itself as foreign domain\n");
+goto out;
+}
+
+switch ( domid )
+{
+case DOMID_IO:
+pg_owner = rcu_lock_domain(dom_io);
+break;


Newline.


+case DOMID_XEN:
+pg_owner = rcu_lock_domain(dom_xen);
+break;


Newline.


+default:
+if ( (pg_owner = rcu_lock_domain_by_id(domid)) == NULL )
+{
+gdprintk(XENLOG_WARNING, "Unknown domain d%d\n", domid);
+break;
+}
+break;


if ( (pg_owner = rcu_lock_domain_by_id(domid)) == NULL )
     gdprintk(XENLOG_WARNING, "Unknown domain d%d\n", domid);

break;

All trivial, so Reviewed-by: Andrew Cooper 
and please fix on commit.


Thank you! I have now committed this patch.

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v2 4/8] xen/arm: Add support for read-only foreign mappings

2018-12-21 Thread Julien Grall

Hi,

On 20/12/2018 19:23, Julien Grall wrote:

Current, foreign mappings can only be read-write. A follow-up patch will
extend foreign mapping for Xen backend memory (via XEN_DOMID), some of
that memory should only be read accessible for the mapping domain.

Introduce a new p2m_type to cater read-only foreign mappings. For now,
the decision between the two foreign mapping type is based on the type
of the guest page mapped.

Signed-off-by: Julien Grall 
Reviewed-by: Andrii Anisov 

---

 Changes in v2:
 - Add Andrii's reviewed-by
---
  xen/arch/arm/mm.c |  2 +-
  xen/arch/arm/p2m.c|  1 +
  xen/include/asm-arm/p2m.h | 42 +++---
  3 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 7193d83b44..58f7e54640 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1267,7 +1267,7 @@ int xenmem_add_to_physmap_one(
  }
  
  mfn = page_to_mfn(page);

-t = p2m_map_foreign_rw;
+t = (p2mt == p2m_ram_rw) ? p2m_map_foreign_rw : p2m_map_foreign_ro;
  
  rcu_unlock_domain(od);

  break;
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index e0b84a9db5..dea04ef66f 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -477,6 +477,7 @@ static void p2m_set_permission(lpae_t *e, p2m_type_t t, 
p2m_access_t a)
  break;
  
  case p2m_iommu_map_ro:

+case p2m_map_foreign_ro:
  case p2m_grant_map_ro:
  case p2m_invalid:
  e->p2m.xn = 1;
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index a1aef7b793..6f2728e2bb 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -116,6 +116,7 @@ typedef enum {
  p2m_mmio_direct_nc, /* Read/write mapping of genuine MMIO area 
non-cacheable */
  p2m_mmio_direct_c,  /* Read/write mapping of genuine MMIO area cacheable 
*/
  p2m_map_foreign_rw, /* Read/write RAM pages from foreign domain */
+p2m_map_foreign_ro, /* Read-only RAM pages from foreign domain */
  p2m_grant_map_rw,   /* Read/write grant mapping */
  p2m_grant_map_ro,   /* Read-only grant mapping */
  /* The types below are only used to decide the page attribute in the P2M 
*/
@@ -135,12 +136,16 @@ typedef enum {
  #define P2M_GRANT_TYPES (p2m_to_mask(p2m_grant_map_rw) |  \
   p2m_to_mask(p2m_grant_map_ro))
  
+/* Foreign mappings types */

+#define P2M_FOREIGN_TYPES (p2m_to_mask(p2m_map_foreign_rw) | \
+   p2m_to_mask(p2m_map_foreign_ro))
+
  /* Useful predicates */
  #define p2m_is_ram(_t) (p2m_to_mask(_t) & P2M_RAM_TYPES)
-#define p2m_is_foreign(_t) (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign_rw))
+#define p2m_is_foreign(_t) (p2m_to_mask(_t) & P2M_FOREIGN_TYPES)
  #define p2m_is_any_ram(_t) (p2m_to_mask(_t) &   \
  (P2M_RAM_TYPES | P2M_GRANT_TYPES |  \
- p2m_to_mask(p2m_map_foreign_rw)))
+ P2M_FOREIGN_TYPES))
  
  /* All common type definitions should live ahead of this inclusion. */

  #ifdef _XEN_P2M_COMMON_H
@@ -295,7 +300,38 @@ struct page_info *p2m_get_page_from_gfn(struct domain *d, 
gfn_t gfn,
  static inline struct page_info *get_page_from_gfn(
  struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q)
  {


Something has gone wrong with this patch. The chunk below should be in a 
separate patch. I will split this patch in two.



-return p2m_get_page_from_gfn(d, _gfn(gfn), t);
+mfn_t mfn;
+p2m_type_t _t;
+struct page_info *page;
+
+/*
+ * Special case for DOMID_XEN as it is the only domain so far that is
+ * not auto-translated.
+ */
+if ( unlikely(d != dom_xen) )
+return p2m_get_page_from_gfn(d, _gfn(gfn), t);
+
+if ( !t )
+t = &_t;
+
+*t = p2m_invalid;
+
+/*
+ * DOMID_XEN see 1-1 RAM. The p2m_type is based on the type of the
+ * page.
+ */
+mfn = _mfn(gfn);
+page = mfn_to_page(mfn);
+
+if ( !mfn_valid(mfn) || !get_page(page, d) )
+return NULL;
+
+if ( page->u.inuse.type_info & PGT_writable_page )
+*t = p2m_ram_rw;
+else
+*t = p2m_ram_ro;
+
+return page;
  }
  
  int get_page_type(struct page_info *page, unsigned long type);




Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/vtx: Improvements to ept= command line handling

2018-12-21 Thread Jan Beulich
>>> On 20.12.18 at 18:16,  wrote:
> Switch parse_ept_param() to use the parse_boolean() infrastructure for more
> consistency with related command line parameters.  Rename opt_pml_enabled to
> opt_ept_pml for consistency with opt_ept_ad, and switch it to being bool
> 
> Drop the comment leading comment for parse_ept_param().  It is stale, and just

Nit: There's one "comment" to many here.

> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -841,29 +841,37 @@ effect the inverse meaning.
>  >> Allows mapping of RuntimeServices which have no cachability attribute
>  >> set as UC.
>  
> -### ept (Intel)
> -> `= List of ( {no-}pml | {no-}ad )`
> +### ept
> +> `= List of [ ad=, pml= ]`
>  
> -Controls EPT related features.
> +> Applicability: Intel
>  
> -> Sub-options:
> -
> -> `pml`
> +Extended Page Tables are a feature of Intel's VT-x technology, whereby
> +hardware manages the virtualisation of HVM guest pagetables.  EPT was
> +introduced with the Nehalem architecture.
>  
> -> Default: `true`
> +*   The `ad` boolean controls hardware tracking of Access and Dirty bits in 
> the
> +EPT pagetables, and was first introduced in Broadwell Server.
>  
> ->> PML is a new hardware feature in Intel's Broadwell Server and further
> ->> platforms which reduces hypervisor overhead of log-dirty mechanism by
> ->> automatically recording GPAs (guest physical addresses) when guest memory
> ->> gets dirty, and therefore significantly reducing number of EPT violation
> ->> caused by write protection of guest memory, which is a necessity to
> ->> implement log-dirty mechanism before PML.
> +By default, Xen will use A/D tracking when available in hardware, except
> +on Avoton processors affected by erratum AVR41.  Explicitly choosing
> +`ad=0` will disable the use of A/D tracking on capable hardware, whereas
> +choosing `ad=1` will cause tracking to be used even on AVR41-affected
> +hardware.

Is there any reason for this special casing of the one erratum?
Earlier this week I've gone through some spec updates for other
purposes, and I've seen some rather frightening EPT A/D errata.

Anyway, this is a question unrelated to the patch here, so
Reviewed-by: Jan Beulich 

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/vtx: Improvements to ept= command line handling

2018-12-21 Thread Andrew Cooper
On 21/12/2018 11:13, Jan Beulich wrote:
 On 20.12.18 at 18:16,  wrote:
>> Switch parse_ept_param() to use the parse_boolean() infrastructure for more
>> consistency with related command line parameters.  Rename opt_pml_enabled to
>> opt_ept_pml for consistency with opt_ept_ad, and switch it to being bool
>>
>> Drop the comment leading comment for parse_ept_param().  It is stale, and 
>> just
> Nit: There's one "comment" to many here.

Oops - will fix.

>
>> --- a/docs/misc/xen-command-line.markdown
>> +++ b/docs/misc/xen-command-line.markdown
>> @@ -841,29 +841,37 @@ effect the inverse meaning.
>>  >> Allows mapping of RuntimeServices which have no cachability attribute
>>  >> set as UC.
>>  
>> -### ept (Intel)
>> -> `= List of ( {no-}pml | {no-}ad )`
>> +### ept
>> +> `= List of [ ad=, pml= ]`
>>  
>> -Controls EPT related features.
>> +> Applicability: Intel
>>  
>> -> Sub-options:
>> -
>> -> `pml`
>> +Extended Page Tables are a feature of Intel's VT-x technology, whereby
>> +hardware manages the virtualisation of HVM guest pagetables.  EPT was
>> +introduced with the Nehalem architecture.
>>  
>> -> Default: `true`
>> +*   The `ad` boolean controls hardware tracking of Access and Dirty bits in 
>> the
>> +EPT pagetables, and was first introduced in Broadwell Server.
>>  
>> ->> PML is a new hardware feature in Intel's Broadwell Server and further
>> ->> platforms which reduces hypervisor overhead of log-dirty mechanism by
>> ->> automatically recording GPAs (guest physical addresses) when guest memory
>> ->> gets dirty, and therefore significantly reducing number of EPT violation
>> ->> caused by write protection of guest memory, which is a necessity to
>> ->> implement log-dirty mechanism before PML.
>> +By default, Xen will use A/D tracking when available in hardware, except
>> +on Avoton processors affected by erratum AVR41.  Explicitly choosing
>> +`ad=0` will disable the use of A/D tracking on capable hardware, whereas
>> +choosing `ad=1` will cause tracking to be used even on AVR41-affected
>> +hardware.
> Is there any reason for this special casing of the one erratum?
> Earlier this week I've gone through some spec updates for other
> purposes, and I've seen some rather frightening EPT A/D errata.

Which, out of interest?  There are a few, particularly on Skylake, but
all the problematic ones I'm aware of are fixed in microcode.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/vtx: Improvements to ept= command line handling

2018-12-21 Thread Jan Beulich
>>> On 21.12.18 at 12:15,  wrote:
> On 21/12/2018 11:13, Jan Beulich wrote:
> On 20.12.18 at 18:16,  wrote:
>>> +By default, Xen will use A/D tracking when available in hardware, 
>>> except
>>> +on Avoton processors affected by erratum AVR41.  Explicitly choosing
>>> +`ad=0` will disable the use of A/D tracking on capable hardware, 
>>> whereas
>>> +choosing `ad=1` will cause tracking to be used even on AVR41-affected
>>> +hardware.
>> Is there any reason for this special casing of the one erratum?
>> Earlier this week I've gone through some spec updates for other
>> purposes, and I've seen some rather frightening EPT A/D errata.
> 
> Which, out of interest?  There are a few, particularly on Skylake, but
> all the problematic ones I'm aware of are fixed in microcode.

I'd have to go through them again, as I didn't pay close attention to
what was said about their status. Are we generally putting ourselves
on the position then that errata don't need working around if there's
a microcode update available? That's a possible position to take, but
not spelled out anywhere.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 1/3] xen/pt: fix some pass-thru devices don't work across reboot

2018-12-21 Thread Roger Pau Monné
On Fri, Dec 21, 2018 at 03:32:47AM -0700, Jan Beulich wrote:
> >>> On 21.12.18 at 11:26,  wrote:
> > On Fri, Dec 21, 2018 at 03:13:50AM -0700, Jan Beulich wrote:
> >> But then again I'm still not fully convinced that a hypervisor
> >> change is the right course of action here in the first place. It
> >> would be better if the hypervisor had to just verify that all
> >> IRQ mappings are gone, or else fail the de-assignment of the
> >> device.
> > 
> > The only component (except the hypervisor) that knows about such
> > assignments is QEMU, and in the case of a QEMU crash the host would be
> > left with a device that cannot be de-assigned, because the information
> > about the PIRQ bindings in lost due to the QEMU crash.
> > 
> > IMO Xen needs to be capable of cleaning any bindings and mappings done
> > by the toolstack or the device model in order to be able to correctly
> > recover from a device model or toolstack crash.
> 
> But possibly with tool stack assistance: Rather than doing it (in a
> potentially fragile way, as per my other comments) as an integral
> part of deassign-device, it could be a separate domctl to be
> issued first.

I don't have a strong opinion whether a new hypercall would be better
than just hooking this logic into the current deassign hypercall, as
long as it's robust.

> Or otherwise failure here ought to lead to failure of
> deassign-device, rather than (e.g.) an infinite loop.

Yes, I fully agree it needs to be robust, ideally the
unbinding/unmapping should always succeed, or at least don't get stuck
into an infinite loop.

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86/vtx: Improvements to ept= command line handling

2018-12-21 Thread Andrew Cooper
On 21/12/2018 11:27, Jan Beulich wrote:
 On 21.12.18 at 12:15,  wrote:
>> On 21/12/2018 11:13, Jan Beulich wrote:
>> On 20.12.18 at 18:16,  wrote:
 +By default, Xen will use A/D tracking when available in hardware, 
 except
 +on Avoton processors affected by erratum AVR41.  Explicitly choosing
 +`ad=0` will disable the use of A/D tracking on capable hardware, 
 whereas
 +choosing `ad=1` will cause tracking to be used even on AVR41-affected
 +hardware.
>>> Is there any reason for this special casing of the one erratum?
>>> Earlier this week I've gone through some spec updates for other
>>> purposes, and I've seen some rather frightening EPT A/D errata.
>> Which, out of interest?  There are a few, particularly on Skylake, but
>> all the problematic ones I'm aware of are fixed in microcode.
> I'd have to go through them again, as I didn't pay close attention to
> what was said about their status. Are we generally putting ourselves
> on the position then that errata don't need working around if there's
> a microcode update available? That's a possible position to take, but
> not spelled out anywhere.

Intel and AMD are quite clear that you should be running up-to-date
microcode.

Its conceptually similar to someone saying "I found an issue in Xen
4.8.0" and our reply being "Thats great, but its already fixed in 4.8.1
- please update".

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [qemu-mainline test] 131454: tolerable FAIL - PUSHED

2018-12-21 Thread osstest service owner
flight 131454 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131454/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 131435
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 131435
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 131435
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 131435
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 131435
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass

version targeted for testing:
 qemuub72566a4ffaddbc0c0c1f6f5ee91b42ab13ff429
baseline version:
 qemuue85c577158a2e8e252414959da9ef15c12eec63d

Last test of basis   131435  2018-12-18 23:20:18 Z2 days
Testing same since   131454  2018-12-20 05:20:07 Z1 days1 attempts


People who touched revisions under test:
  Daniel Henrique Barboza 
  Dominik Csapak 
  Eduardo Habkost 
  Emilio G. Cota 
  Laurent Vivier 
  Marc-André Lureau 
  Markus Armbruster 
  Peter Maydell 
  Roman Bolshakov 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm

Re: [Xen-devel] [PATCH for-4.12 v2 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn

2018-12-21 Thread Julien Grall

Hi Stefano,

On 20/12/2018 23:25, Stefano Stabellini wrote:

On Thu, 20 Dec 2018, Julien Grall wrote:

No functional change intended.

Only reasonable clean-ups are done in this patch. The rest will use _gfn
for the time being.

Signed-off-by: Julien Grall 


I don't have the bandwidth to review this patch before the holidays, but
it is not required for the feature to go in.


That's fine. This patch is just a cleaned-up. Thank you for review the rest of 
the series!


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] xen/dom0: Improve documentation for dom0= and dom0-iommu=

2018-12-21 Thread Roger Pau Monné
On Thu, Dec 20, 2018 at 11:40:51PM +, Andrew Cooper wrote:
> Update to the latest metadata style, and expand each of the clauses with more
> information, including applicable CONFIG_* options.
> 
> Drop the redundant comment beside parse_dom0_param(), to avoid it getting out
> of sync with the main documentation.
> 
> Signed-off-by: Andrew Cooper 

Thanks! A couple of fixes below, because the original text is actually
wrong...

> ---
> CC: Jan Beulich 
> CC: Wei Liu 
> CC: Roger Pau Monné 
> CC: Stefano Stabellini 
> CC: Julien Grall 
> 
> Please double check for correctness.  The text matches my
> understanding/reading of the code, but some of it is rather subtle going.
> 
> It occurs to me that:
> 
>  * The choice of dom0 boot mode should in part be derived from the available
>CONFIG_* options, and ELF notes advertised in the dom0 kernel.

This is indeed doable, but would require parsing the dom0 kernel
before building the domain.

> 
>  * AMD probably needs to gain an `ivmd=` to mirror `rmrr=` on the Intel side,
>because we know there are other errors in the IVRS table.

Yes, albeit using rmrr is quite cumbersome because it's mostly a
trial-and-error process until there are no more iommu faults (unless
you can get the correct rmrr command for your hardware somewhere).

> 
>  * Neither of map-{inclusive,reserved} should be active by default, even on
>Intel hardware, and we should (wherever possible) have quirks like we have
>for all other firmware screwups.  Requiring the user to diagnose/work
>around firmware problems like this is quite rude.

That would indeed be nice, but I think there are too many vendor
firmware versions to be able to correctly identify such quirks, the
more that vendors don't even list missing RMRR as erratum.

> +Controls for the dom0 IOMMU setup.
> +
> +*   The `passthrough` boolean is applicable to x86 PV dom0's only and 
> defaults
> +to false.  It controls whether the IOMMU is fully disabled for devices
> +belonging to dom0 (`passthrough=1`), or whether the IOMMU is set up with
> +an identity transform for dom0 (`passthrough=0`) to prevent dom0 from
> +DMA'ing outside of its permitted areas.
> +
> +This option is hardwired to false for x86 PVH dom0's (where a 
> non-identity
> +transform is required for dom0 to function), and is ignored for ARM.
> +
> +*   The `strict` boolean is applicable to x86 PV dom0's only and defaults to
> +false.  It controls whether dom0 can have IOMMU mappings for all domain
> +RAM in the system, or only for its allocated RAM (and grant mappings 
> etc.)
> +
> +This option is hardwired to true for x86 PVH dom0's (as RAM belonging to
> +other domains in the system don't live in a compatible address space), 
> and
> +is ignored for ARM.
> +
> +*   The `map-inclusive` boolean is applicable to x86 PV dom0's, and sets up 
> DMA
> +remapping for all non-RAM regions below 4GB except for unusable ranges.
> +
> +Typically, some devices in a system use bits of RAM for communication, 
> and
> +these areas should be listed via RMRR or IVMD entries in the APCI tables,
> +so Xen can ensure that they are identity-mapped in the IOMMU.  However,
> +some firmware makes mistakes writing its APCI tables, and this option is 
> a
> +coarse-grain workaround for those errors.
> +
> +Where possible, finer grain corrections should be made with the `rmrr=`,
> +`ivrs_hpet=` or `ivrs_ioapic=` command line options.
> +
> +This option is enabled by default on x86 Intel systems, disabled by
> +default on other x86 systems, and invalid on ARM systems.

I'm afraid the previous text was wrong. I later discovered that AMD
also had such workarounds applied by default, and unified the code,
but failed to update the documentation, sorry.

map-inclusive is enabled by default on x86 for a PV dom0. See
xen/drivers/passthrough/x86/iommu.c:215 (arch_iommu_hwdom_init).

> +
> +*   The `map-reserved` functionality is very similar to `map-inclusive`, but 
> is
> +applicable to both x86 PV and PVH dom0's, and represents a subset of the
> +correction by only mapping reserved memory regions rather than all 
> non-RAM
> +regions.
> +
> +This option is enabled by default on x86 Intel systems, disabled by
> +default on other x86 systems, and invalid on ARM systems.

map-reserved is enabled by default on x86,
xen/drivers/passthrough/x86/iommu.c:218 (arch_iommu_hwdom_init).

The text itself looks OK to me.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] xen/dom0: Improve documentation for dom0= and dom0-iommu=

2018-12-21 Thread Jan Beulich
>>> On 21.12.18 at 00:40,  wrote:
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -636,55 +636,76 @@ trace feature is only enabled in debugging builds of 
> Xen.
>  
>  Specify the bit width of the DMA heap.
>  
> -### dom0 (x86)
> -> `= List of [ pvh | shadow ]`
> +### dom0
> +> `= List of [ pvh=, shadow= ]`
>  
> -> Sub-options:
> -
> -> `pvh`
> +> Applicability: x86

Why the new tag, when everything else uses (x86) next to the
option name?

>  ### dom0-iommu
> -> `= List of [ passthrough | strict | map-inclusive ]`
> -
> -This list of booleans controls the iommu usage by Dom0:
> -
> -* `passthrough`: disables DMA remapping for Dom0. Default is `false`. Note 
> that
> -  this option is hard coded to `false` for a PVH Dom0 and any attempt to
> -  overwrite it from the command line is ignored.
> -
> -* `strict`: sets up DMA remapping only for the RAM Dom0 actually got 
> assigned.
> -  Default is `false` which means Dom0 will get mappings for all the host
> -  RAM except regions in use by Xen. Note that this option is hard coded to
> -  `true` for a PVH Dom0 and any attempt to overwrite it from the command line
> -  is ignored.
> -
> -* `map-inclusive`: sets up DMA remapping for all the non-RAM regions below 
> 4GB
> -  except for unusable ranges. Use this to work around firmware issues 
> providing
> -  incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU
> -  accesses for Dom0, with this option all pages up to 4GB, not marked as
> -  unusable in the E820 table, will get a mapping established. Note that this
> -  option is only applicable to a PV Dom0 and is enabled by default on Intel
> -  hardware.
> -
> -* `map-reserved`: sets up DMA remapping for all the reserved regions in the
> -  memory map for Dom0. Use this to work around firmware issues providing
> -  incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU
> -  accesses for Dom0, all memory regions marked as reserved in the memory map
> -  that don't overlap with any MMIO region from emulated devices will be
> -  identity mapped. This option maps a subset of the memory that would be
> -  mapped when using the `map-inclusive` option. This option is available to 
> all
> -  Dom0 modes and is enabled by default on Intel hardware.
> +> `= List of [ passthrough=, strict=, map-inclusive=,
> +>  map-reserved= ]`
> +
> +Controls for the dom0 IOMMU setup.
> +
> +*   The `passthrough` boolean is applicable to x86 PV dom0's only and 
> defaults
> +to false.  It controls whether the IOMMU is fully disabled for devices
> +belonging to dom0 (`passthrough=1`), or whether the IOMMU is set up with
> +an identity transform for dom0 (`passthrough=0`) to prevent dom0 from
> +DMA'ing outside of its permitted areas.
> +
> +This option is hardwired to false for x86 PVH dom0's (where a 
> non-identity
> +transform is required for dom0 to function), and is ignored for ARM.
> +
> +*   The `strict` boolean is applicable to x86 PV dom0's only and defaults to
> +false.  It controls whether dom0 can have IOMMU mappings for all domain
> +RAM in the system, or only for its allocated RAM (and grant mappings 
> etc.)
> +
> +This option is hardwired to true for x86 PVH dom0's (as RAM belonging to
> +other domains in the system don't live in a compatible address space), 
> and
> +is ignored for ARM.
> +
> +*   The `map-inclusive` boolean is applicable to x86 PV dom0's, and sets up 
> DMA
> +remapping for all non-RAM regions below 4GB except for unusable ranges.

I don't thinks this is PV-specific, just its default is.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/dom0: Add a dom0-iommu=none option

2018-12-21 Thread Jan Beulich
>>> On 21.12.18 at 00:40,  wrote:
> @@ -271,6 +272,26 @@ int parse_boolean(const char *name, const char *s, const 
> char *e)
>  return -1;
>  }
>  
> +int cmdline_strcmp(const char *frag, const char *name)

__init ?

> +{
> +while ( 1 )
> +{
> +int res = (*frag - *name);
> +
> +if ( res || *name == '\0' )
> +{
> +/* NUL in 'name' matching punctuation in 'frag' implies success. 
> */
> +if ( *name == '\0' && ispunct(*frag) )
> +res = 0;

Isn't ispunct() true for dashes and perhaps also underscores?
I don't think it can be this generic, the more that ...

> --- a/xen/include/xen/lib.h
> +++ b/xen/include/xen/lib.h
> @@ -79,6 +79,13 @@ int parse_bool(const char *s, const char *e);
>   */
>  int parse_boolean(const char *name, const char *s, const char *e);
>  
> +/**
> + * Very similar to strcmp(), but will declare a match if the NUL in 'name'
> + * lines up with punctuationin 'frag'.  Designed for picking exact string
> + * matches out of a comma-separated command line fragment.
> + */
> +int cmdline_strcmp(const char *frag, const char *name);

... you talk of commas only here.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 11/18] argo: implement the register op

2018-12-21 Thread Julien Grall

Hi Christopher,

On 21/12/2018 01:17, Christopher Clark wrote:

On Thu, Dec 20, 2018 at 3:20 AM Julien Grall  wrote:


Hi Christopher,

On 12/20/18 6:39 AM, Christopher Clark wrote:

Used by a domain to register a region of memory for receiving messages from
either a specified other domain, or, if specifying a wildcard, any domain.

This operation creates a mapping within Xen's private address space that
will remain resident for the lifetime of the ring. In subsequent commits,
the hypervisor will use this mapping to copy data from a sending domain into
this registered ring, making it accessible to the domain that registered the
ring to receive data.

In this code, the p2m type of the memory supplied by the guest for the ring
must be p2m_ram_rw, which is a conservative choice made to defer the need to
reason about the other p2m types with this commit.

xen_argo_page_descr_t type is introduced as a page descriptor, to convey
both the physical address of the start of the page and its granularity. The
smallest granularity page is assumed to be 4096 bytes and the lower twelve
bits of the type are used for indicate an enumerated page size.


I haven't seen any reply from you on my concern with this approach (see
[1]).

For convenience, I will duplicate the message here.


Hi Julien,

Thanks for the reminder.


If you let the user the choice of the granularity, then, I believe, you
will prevent the hypervisor to do some optimization.


OK, let's work through this then.


For instance, if the guest supplies only 4KB page but the hypervisor is
64KB. There are no way to easily map them contiguously in the hypervisor
(e.g using vmap).


Right. So with the matrix:

4K guest, 4K xen : fine.
4K guest, 64K xen : contiguous guest physical chunks or region required.
64K guest, 4K xen : weird? seems doable.


It is not weird, 64KB split nicely into 16 4KB chunk. Actually, Linux upstream 
has all the support for to run with 64KB pages on current Xen.



64K guest, 64K xen : fine (with some work).

as you note, the 4K guest, 64K hypervisor case is the one that
raises the question.


That's correct. To generalize the problem, the problem will happen whenever the 
guest page size is smaller than the Xen page size.





Is there a particular reason to allow the ring buffer to be
non-contiguous in the guest physical address?


It hasn't been a necessary restriction up to this point, and isn't
so on the platforms we're deploying on, so my preference is not to
introduce it as an additional requirement if it can be avoided. It
allows us to use vmalloc (rather than kmalloc) on Linux, which is
helpful.


vmalloc might be an issue on Arm if we request 64KB chunk of physical memory. 
Although I don't know the vmalloc implementation to be able to say whether this 
can be addressed.




There can be high turnover in ring registration for a server with
many short-lived connections. While the rings are not necessarily
large -- the default is 128K in the current Linux driver, though
clients can change what they use -- contiguous memory regions are a
more limited resource for the kernel to manage, and avoiding
pressure on that contiguous region allocator when it isn't necessary
is preferable.

We also do not want to disincentivize a server that is seeking to
improve performance from registering larger rings -- so allowing
non-contiguous regions fits with that.

I'd have to study the Linux driver further to say whether there
are stronger additional requirements that I'm not currently aware
of, but I don't know of any at the moment.


Thank you for the detailed explanation. So I think my option 1) below would suit 
you the best here.





Depending on the answer, there are different way to handle that:
1) Request the guest to allocate memory using 64KB (on Arm) chunk and
pass the base address for each chunk
2) Request the guest to allocate contiguously the buffer and pass the
base address and size


I understand that #2 would avoid the need to describe a contiguous
allocation of memory as a series of chunks; but I think #1 is the
option I would select. Do you think that would be acceptable?


1) is a good option for me. I forgot to mention the base address would need to 
be aligned to 64KB.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/dom0: Add a dom0-iommu=none option

2018-12-21 Thread Roger Pau Monné
On Thu, Dec 20, 2018 at 11:40:52PM +, Andrew Cooper wrote:
> For development purposes, it is very convenient to boot Xen as a PVH guest,
> with an XTF PV or PVH "dom0".  The edit-compile-go cycle is a matter of
> seconds, and you can resonably insert printk() debugging in places which which
> would be completely infeasible when booting fully-fledged guests.
> 
> However, the PVH dom0 path insists on having a working IOMMU, which doesn't
> exist when virtualised as a PVH guest, and isn't necessary for XTF anyway.
> 
> Introduce a developer mode to skip the IOMMU requirement.

This looks very similar to the current 'passthrough' option, maybe it
would be enough to allow PVH dom0 to use the passthrough option
provided a warning is added to
arch_iommu_check_autotranslated_hwdom?

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] xen/dom0: Improve documentation for dom0= and dom0-iommu=

2018-12-21 Thread Andrew Cooper
On 21/12/2018 12:08, Jan Beulich wrote:
 On 21.12.18 at 00:40,  wrote:
>> --- a/docs/misc/xen-command-line.markdown
>> +++ b/docs/misc/xen-command-line.markdown
>> @@ -636,55 +636,76 @@ trace feature is only enabled in debugging builds of 
>> Xen.
>>  
>>  Specify the bit width of the DMA heap.
>>  
>> -### dom0 (x86)
>> -> `= List of [ pvh | shadow ]`
>> +### dom0
>> +> `= List of [ pvh=, shadow= ]`
>>  
>> -> Sub-options:
>> -
>> -> `pvh`
>> +> Applicability: x86
> Why the new tag, when everything else uses (x86) next to the
> option name?

See the commit message of c/s a3a99df44e5405d2092ec59087681765fa4cdee7

The problem is with the generated HTML anchors when trying to cross
reference the options.

>
>>  ### dom0-iommu
>> -> `= List of [ passthrough | strict | map-inclusive ]`
>> -
>> -This list of booleans controls the iommu usage by Dom0:
>> -
>> -* `passthrough`: disables DMA remapping for Dom0. Default is `false`. Note 
>> that
>> -  this option is hard coded to `false` for a PVH Dom0 and any attempt to
>> -  overwrite it from the command line is ignored.
>> -
>> -* `strict`: sets up DMA remapping only for the RAM Dom0 actually got 
>> assigned.
>> -  Default is `false` which means Dom0 will get mappings for all the host
>> -  RAM except regions in use by Xen. Note that this option is hard coded to
>> -  `true` for a PVH Dom0 and any attempt to overwrite it from the command 
>> line
>> -  is ignored.
>> -
>> -* `map-inclusive`: sets up DMA remapping for all the non-RAM regions below 
>> 4GB
>> -  except for unusable ranges. Use this to work around firmware issues 
>> providing
>> -  incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU
>> -  accesses for Dom0, with this option all pages up to 4GB, not marked as
>> -  unusable in the E820 table, will get a mapping established. Note that this
>> -  option is only applicable to a PV Dom0 and is enabled by default on Intel
>> -  hardware.
>> -
>> -* `map-reserved`: sets up DMA remapping for all the reserved regions in the
>> -  memory map for Dom0. Use this to work around firmware issues providing
>> -  incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU
>> -  accesses for Dom0, all memory regions marked as reserved in the memory map
>> -  that don't overlap with any MMIO region from emulated devices will be
>> -  identity mapped. This option maps a subset of the memory that would be
>> -  mapped when using the `map-inclusive` option. This option is available to 
>> all
>> -  Dom0 modes and is enabled by default on Intel hardware.
>> +> `= List of [ passthrough=, strict=, map-inclusive=,
>> +>  map-reserved= ]`
>> +
>> +Controls for the dom0 IOMMU setup.
>> +
>> +*   The `passthrough` boolean is applicable to x86 PV dom0's only and 
>> defaults
>> +to false.  It controls whether the IOMMU is fully disabled for devices
>> +belonging to dom0 (`passthrough=1`), or whether the IOMMU is set up with
>> +an identity transform for dom0 (`passthrough=0`) to prevent dom0 from
>> +DMA'ing outside of its permitted areas.
>> +
>> +This option is hardwired to false for x86 PVH dom0's (where a 
>> non-identity
>> +transform is required for dom0 to function), and is ignored for ARM.
>> +
>> +*   The `strict` boolean is applicable to x86 PV dom0's only and defaults to
>> +false.  It controls whether dom0 can have IOMMU mappings for all domain
>> +RAM in the system, or only for its allocated RAM (and grant mappings 
>> etc.)
>> +
>> +This option is hardwired to true for x86 PVH dom0's (as RAM belonging to
>> +other domains in the system don't live in a compatible address space), 
>> and
>> +is ignored for ARM.
>> +
>> +*   The `map-inclusive` boolean is applicable to x86 PV dom0's, and sets up 
>> DMA
>> +remapping for all non-RAM regions below 4GB except for unusable ranges.
> I don't thinks this is PV-specific, just its default is.

>From arch_iommu_hwdom_init():

/* Inclusive mappings are enabled by default for PV. */
if ( iommu_hwdom_inclusive == -1 )
iommu_hwdom_inclusive = is_pv_domain(d);
/* Reserved IOMMU mappings are enabled by default. */
if ( iommu_hwdom_reserved == -1 )
iommu_hwdom_reserved = 1;

if ( iommu_hwdom_inclusive && !is_pv_domain(d) )
{
printk(XENLOG_WARNING
   "IOMMU inclusive mappings are only supported on PV Dom0\n");
iommu_hwdom_inclusive = 0;
}


Attempting to use this option with a PVH dom0 will cause Xen to force it
off.

~Andrew
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/dom0: Add a dom0-iommu=none option

2018-12-21 Thread Andrew Cooper
On 21/12/2018 12:13, Jan Beulich wrote:
 On 21.12.18 at 00:40,  wrote:
>> @@ -271,6 +272,26 @@ int parse_boolean(const char *name, const char *s, 
>> const char *e)
>>  return -1;
>>  }
>>  
>> +int cmdline_strcmp(const char *frag, const char *name)
> __init ?

I think there are some runtime parameters in need of some fixing as well.

>
>> +{
>> +while ( 1 )
>> +{
>> +int res = (*frag - *name);
>> +
>> +if ( res || *name == '\0' )
>> +{
>> +/* NUL in 'name' matching punctuation in 'frag' implies 
>> success. */
>> +if ( *name == '\0' && ispunct(*frag) )
>> +res = 0;
> Isn't ispunct() true for dashes and perhaps also underscores?
> I don't think it can be this generic, the more that ...
>
>> --- a/xen/include/xen/lib.h
>> +++ b/xen/include/xen/lib.h
>> @@ -79,6 +79,13 @@ int parse_bool(const char *s, const char *e);
>>   */
>>  int parse_boolean(const char *name, const char *s, const char *e);
>>  
>> +/**
>> + * Very similar to strcmp(), but will declare a match if the NUL in 'name'
>> + * lines up with punctuationin 'frag'.  Designed for picking exact string
>> + * matches out of a comma-separated command line fragment.
>> + */
>> +int cmdline_strcmp(const char *frag, const char *name);
> ... you talk of commas only here.

I actually borrowed this function from my CPUID cmdline patch.  In 99%
of cases, we only need to match = and , but we have some other
parameters such as psr= which use : for delimiters, hence the use of
ispunct().

As an alternative, I could revert back to explicitly checking the
expected punctuation.  It is not as if this is a fastpath.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/dom0: Add a dom0-iommu=none option

2018-12-21 Thread Andrew Cooper
On 21/12/2018 12:44, Roger Pau Monné wrote:
> On Thu, Dec 20, 2018 at 11:40:52PM +, Andrew Cooper wrote:
>> For development purposes, it is very convenient to boot Xen as a PVH guest,
>> with an XTF PV or PVH "dom0".  The edit-compile-go cycle is a matter of
>> seconds, and you can resonably insert printk() debugging in places which 
>> which
>> would be completely infeasible when booting fully-fledged guests.
>>
>> However, the PVH dom0 path insists on having a working IOMMU, which doesn't
>> exist when virtualised as a PVH guest, and isn't necessary for XTF anyway.
>>
>> Introduce a developer mode to skip the IOMMU requirement.
> This looks very similar to the current 'passthrough' option, maybe it
> would be enough to allow PVH dom0 to use the passthrough option
> provided a warning is added to
> arch_iommu_check_autotranslated_hwdom?

I considered that, but "dom0-iommu=passthrough" isn't an accurate
description of what is going on.  Frankly, its not correct for PV either.

TBH, dom0-iommu=none is better for both.  How about I introduce that as
the new option, and leave passthrough as a legacy alias?

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] PROBLEM: Xen paging-request boot failure since 4.19.5

2018-12-21 Thread Ken Pizzini
On Thu, Dec 20, 2018 at 11:12:40AM -0500, Boris Ostrovsky wrote:
> This is addressed by https://lkml.org/lkml/2018/12/11/266 but has not
> been merged yet.

Confirmed: applying the patch in that posting to my Gentoo
sys-kernel/gentoo-sources-4.19.10 tree results in a working
system for me.

Thanks,
--Ken Pizzini

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] xen/dom0: Improve documentation for dom0= and dom0-iommu=

2018-12-21 Thread Andrew Cooper
On 21/12/2018 12:08, Roger Pau Monné wrote:
> On Thu, Dec 20, 2018 at 11:40:51PM +, Andrew Cooper wrote:
>> Update to the latest metadata style, and expand each of the clauses with more
>> information, including applicable CONFIG_* options.
>>
>> Drop the redundant comment beside parse_dom0_param(), to avoid it getting out
>> of sync with the main documentation.
>>
>> Signed-off-by: Andrew Cooper 
> Thanks! A couple of fixes below, because the original text is actually
> wrong...

TBH, that is my default assumption every time I do work like this :)

>
>> ---
>> CC: Jan Beulich 
>> CC: Wei Liu 
>> CC: Roger Pau Monné 
>> CC: Stefano Stabellini 
>> CC: Julien Grall 
>>
>> Please double check for correctness.  The text matches my
>> understanding/reading of the code, but some of it is rather subtle going.
>>
>> It occurs to me that:
>>
>>  * The choice of dom0 boot mode should in part be derived from the available
>>CONFIG_* options, and ELF notes advertised in the dom0 kernel.
> This is indeed doable, but would require parsing the dom0 kernel
> before building the domain.

I don't see anything wrong with parsing the ELF headers ahead of
building the domain.  From the overall boot time, its just an
order-of-operations issue.

>
>>  * AMD probably needs to gain an `ivmd=` to mirror `rmrr=` on the Intel side,
>>because we know there are other errors in the IVRS table.
> Yes, albeit using rmrr is quite cumbersome because it's mostly a
> trial-and-error process until there are no more iommu faults (unless
> you can get the correct rmrr command for your hardware somewhere).
>
>>  * Neither of map-{inclusive,reserved} should be active by default, even on
>>Intel hardware, and we should (wherever possible) have quirks like we have
>>for all other firmware screwups.  Requiring the user to diagnose/work
>>around firmware problems like this is quite rude.
> That would indeed be nice, but I think there are too many vendor
> firmware versions to be able to correctly identify such quirks, the
> more that vendors don't even list missing RMRR as erratum.

I don't agree.  We already have quirks based on DMI (at the moment,
mainly for reboot overrides), and the vast majority of the offending
cases are the BMC shared mailbox, which will be in a fixed per-platform
location.

I don't expect we'll ever find and fix all quirks, but where we do find
suitable ones, we should put them into the boot code.

>
>> +Controls for the dom0 IOMMU setup.
>> +
>> +*   The `passthrough` boolean is applicable to x86 PV dom0's only and 
>> defaults
>> +to false.  It controls whether the IOMMU is fully disabled for devices
>> +belonging to dom0 (`passthrough=1`), or whether the IOMMU is set up with
>> +an identity transform for dom0 (`passthrough=0`) to prevent dom0 from
>> +DMA'ing outside of its permitted areas.
>> +
>> +This option is hardwired to false for x86 PVH dom0's (where a 
>> non-identity
>> +transform is required for dom0 to function), and is ignored for ARM.
>> +
>> +*   The `strict` boolean is applicable to x86 PV dom0's only and defaults to
>> +false.  It controls whether dom0 can have IOMMU mappings for all domain
>> +RAM in the system, or only for its allocated RAM (and grant mappings 
>> etc.)
>> +
>> +This option is hardwired to true for x86 PVH dom0's (as RAM belonging to
>> +other domains in the system don't live in a compatible address space), 
>> and
>> +is ignored for ARM.
>> +
>> +*   The `map-inclusive` boolean is applicable to x86 PV dom0's, and sets up 
>> DMA
>> +remapping for all non-RAM regions below 4GB except for unusable ranges.
>> +
>> +Typically, some devices in a system use bits of RAM for communication, 
>> and
>> +these areas should be listed via RMRR or IVMD entries in the APCI 
>> tables,
>> +so Xen can ensure that they are identity-mapped in the IOMMU.  However,
>> +some firmware makes mistakes writing its APCI tables, and this option 
>> is a
>> +coarse-grain workaround for those errors.
>> +
>> +Where possible, finer grain corrections should be made with the `rmrr=`,
>> +`ivrs_hpet=` or `ivrs_ioapic=` command line options.
>> +
>> +This option is enabled by default on x86 Intel systems, disabled by
>> +default on other x86 systems, and invalid on ARM systems.
> I'm afraid the previous text was wrong. I later discovered that AMD
> also had such workarounds applied by default, and unified the code,
> but failed to update the documentation, sorry.
>
> map-inclusive is enabled by default on x86 for a PV dom0. See
> xen/drivers/passthrough/x86/iommu.c:215 (arch_iommu_hwdom_init).
>
>> +
>> +*   The `map-reserved` functionality is very similar to `map-inclusive`, 
>> but is
>> +applicable to both x86 PV and PVH dom0's, and represents a subset of the
>> +correction by only mapping reserved memory regions rather than all 
>> non-RAM
>> +regions.
>> +
>> +This option is enabled by default o

Re: [Xen-devel] [PATCH for-4.12 v2 2/8] xen/arm: p2m: Introduce p2m_get_page_from_gfn

2018-12-21 Thread Andrew Cooper
On 20/12/2018 19:23, Julien Grall wrote:
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 2b5e43f50a..cd34149d13 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -406,6 +406,38 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t 
> *t)
>  return mfn;
>  }
>  
> +struct page_info *p2m_get_page_from_gfn(struct domain *d, gfn_t gfn,
> +p2m_type_t *t)
> +{
> +struct page_info *page;
> +p2m_type_t p2mt;
> +mfn_t mfn = p2m_lookup(d, gfn, &p2mt);
> +
> +if (t)

Spaces

> +*t = p2mt;
> +
> +if ( !p2m_is_any_ram(p2mt) )
> +return NULL;
> +
> +if ( !mfn_valid(mfn) )
> +return NULL;

Newline

> +page = mfn_to_page(mfn);
> +
> +/*
> + * get_page won't work on foreign mapping because the page doesn't
> + * belong to the current domain.
> + */
> +if ( p2m_is_foreign(p2mt) )
> +{
> +struct domain *fdom = page_get_owner_and_reference(page);
> +ASSERT(fdom != NULL);
> +ASSERT(fdom != d);
> +return page;
> +}
> +
> +return (get_page(page, d) ? page: NULL);

No need for the outer brackets.

All trivial style issues, so can be fixed on commit.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable-smoke test] 131491: tolerable all pass - PUSHED

2018-12-21 Thread osstest service owner
flight 131491 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131491/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  af80e1d9c79e3bcb392775f311d20eec54b3389b
baseline version:
 xen  396d8d5418ea908a5ef88e7d7a9f22c70ada44c2

Last test of basis   131486  2018-12-21 08:00:41 Z0 days
Testing same since   131491  2018-12-21 11:01:16 Z0 days1 attempts


People who touched revisions under test:
  Benjamin Sanda 
  Julien Grall 
  Stefano Stabellini 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   396d8d5418..af80e1d9c7  af80e1d9c79e3bcb392775f311d20eec54b3389b -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 03/15] x86/cpu/vpmu: Add Hygon Dhyana support for vPMU

2018-12-21 Thread Boris Ostrovsky
On 12/21/18 5:02 AM, Pu Wen wrote:
> On 2018/12/20 22:25, Boris Ostrovsky wrote:
> ...
>>> diff --git a/xen/arch/x86/cpu/vpmu_amd.c b/xen/arch/x86/cpu/vpmu_amd.c
>>> index 5efc39b..e9f0a5c 100644
>>> --- a/xen/arch/x86/cpu/vpmu_amd.c
>>> +++ b/xen/arch/x86/cpu/vpmu_amd.c
>>> @@ -554,6 +554,8 @@ int __init amd_vpmu_init(void)
>>>   case 0x12:
>>>   case 0x14:
>>>   case 0x16:
>>> +case 0x17:
>>> +case 0x18:
>>
>> This also enables VPMU support for Zen which goes beyond what the
>> commit message claims to do.
> Sorry for the not clear commit message. Will add modification description
> in the commit message and make the changes complete.
>
> On the other hand, since current Xen vPMU still not support Zen. so in
> this patch we enable 0x17 support. If this modification is not preferred,
> will remove AMD Xen 0x17 support in next version.

Enabling 0x17 should be fine, I just thought commit message should be
explicit about that.


>> Also, why are you choosing to use legacy MSRs (and you did the same in
>> Linux)? Doesn't Zen (which you are saying is similar to Hygon) support
>> c001_020X bank?
> In Linux, the Xen PMU driver use the default branch cases, which also use
> the legacy MSRs way. So we choose to follow legacy MSRs here in Dhyana
> cases.
>
> Since both of Zen and Dhyana support C001_020X MSRs. If use the C001_020X
> is preferred, we will try to modify the related codes and create a patch.


I don't have a Zen box available right now but from what I can see 0x17
counters are compatible with 0x15 so I think switching to C001_020X
should work. And looks like you are using those in Linux (non-Xen part) too.


> Also the Linux Xen PMU driver may need to be updated to use these MSRs.

Yes, although Linux part is used only by PV guests.

-boris


-boris

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v2 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn

2018-12-21 Thread Boris Ostrovsky
On 12/20/18 2:23 PM, Julien Grall wrote:
> No functional change intended.
>
> Only reasonable clean-ups are done in this patch. The rest will use _gfn
> for the time being.
>
> Signed-off-by: Julien Grall 

SVM bits:

Reviewed-by: Boris Ostrovsky 



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] x86/pv: Clean up cr3 handling in arch_set_info_guest()

2018-12-21 Thread Andrew Cooper
All of this code lives inside CONFIG_PV which means gfn == mfn, and the
fill_ro_mpt() calls clearly show that the value is used untranslated.

Change cr3_gfn to a suitably typed cr3_mfn, and replace get_page_from_gfn()
with a straight mfn_to_page/get_page sequence, to avoid the implication that
translation is going on.

No functional change.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 
CC: Stefano Stabellini 
CC: Julien Grall 

Julien: This should simplify your "xen: Switch parameter in get_page_from_gfn
to use typesafe gfn" patch.  In particular, I did a doubletake at
fill_ro_mpt(_mfn(gfn_x(cr3_gfn))); when reviewing it.
---
 xen/arch/x86/domain.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 32dc4253..da94ab4 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -827,8 +827,8 @@ int arch_set_info_guest(
 unsigned long flags;
 bool compat;
 #ifdef CONFIG_PV
-unsigned long cr3_gfn;
-struct page_info *cr3_page;
+mfn_t cr3_mfn;
+struct page_info *cr3_page = NULL;
 unsigned long cr4;
 int rc = 0;
 #endif
@@ -1091,12 +1091,12 @@ int arch_set_info_guest(
 set_bit(_VPF_in_reset, &v->pause_flags);
 
 if ( !compat )
-cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[3]);
+cr3_mfn = _mfn(xen_cr3_to_pfn(c.nat->ctrlreg[3]));
 else
-cr3_gfn = compat_cr3_to_pfn(c.cmp->ctrlreg[3]);
-cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC);
+cr3_mfn = _mfn(compat_cr3_to_pfn(c.cmp->ctrlreg[3]));
 
-if ( !cr3_page )
+if ( !mfn_valid(cr3_mfn) ||
+ !(cr3_page = mfn_to_page(cr3_mfn), get_page(cr3_page, d)) )
 rc = -EINVAL;
 else if ( paging_mode_refcounts(d) )
 /* nothing */;
@@ -1122,7 +1122,7 @@ int arch_set_info_guest(
 case 0:
 if ( !compat && !VM_ASSIST(d, m2p_strict) &&
  !paging_mode_refcounts(d) )
-fill_ro_mpt(_mfn(cr3_gfn));
+fill_ro_mpt(cr3_mfn);
 break;
 default:
 if ( cr3_page == current->arch.old_guest_table )
@@ -1137,10 +1137,10 @@ int arch_set_info_guest(
 v->arch.guest_table = pagetable_from_page(cr3_page);
 if ( c.nat->ctrlreg[1] )
 {
-cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[1]);
-cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC);
+cr3_mfn = _mfn(xen_cr3_to_pfn(c.nat->ctrlreg[1]));
 
-if ( !cr3_page )
+if ( !mfn_valid(cr3_mfn) ||
+ !(cr3_page = mfn_to_page(cr3_mfn), get_page(cr3_page, d)) )
 rc = -EINVAL;
 else if ( !paging_mode_refcounts(d) )
 {
@@ -1162,7 +1162,7 @@ int arch_set_info_guest(
 break;
 case 0:
 if ( VM_ASSIST(d, m2p_strict) )
-zap_ro_mpt(_mfn(cr3_gfn));
+zap_ro_mpt(cr3_mfn);
 break;
 }
 }
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v2 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn

2018-12-21 Thread Andrew Cooper
On 20/12/2018 19:23, Julien Grall wrote:
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index 32dc4253ff..b462a8513b 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -827,7 +827,7 @@ int arch_set_info_guest(
>  unsigned long flags;
>  bool compat;
>  #ifdef CONFIG_PV
> -unsigned long cr3_gfn;
> +gfn_t cr3_gfn;

I've sent an alternative patch which this patch should be rebased over,
at which point all modifications to arch_set_info_guest() should
hopefully disappear.

> diff --git a/xen/arch/x86/hvm/domain.c b/xen/arch/x86/hvm/domain.c
> index 5d5a746a25..73d2da8441 100644
> --- a/xen/arch/x86/hvm/domain.c
> +++ b/xen/arch/x86/hvm/domain.c
> @@ -297,7 +297,7 @@ int arch_set_info_hvm_guest(struct vcpu *v, const 
> vcpu_hvm_context_t *ctx)
>  {
>  /* Shadow-mode CR3 change. Check PDBR and update refcounts. */
>  struct page_info *page = get_page_from_gfn(v->domain,
> - v->arch.hvm.guest_cr[3] >> PAGE_SHIFT,
> + gaddr_to_gfn(v->arch.hvm.guest_cr[3]),
>   NULL, P2M_ALLOC);

Can you re-indent while modifying this please?

> diff --git a/xen/arch/x86/hvm/viridian/time.c 
> b/xen/arch/x86/hvm/viridian/time.c
> index 840a82b457..a718434456 100644
> --- a/xen/arch/x86/hvm/viridian/time.c
> +++ b/xen/arch/x86/hvm/viridian/time.c
> @@ -38,16 +38,16 @@ static void dump_reference_tsc(const struct domain *d)
>  
>  static void update_reference_tsc(struct domain *d, bool initialize)
>  {
> -unsigned long gmfn = d->arch.hvm.viridian.reference_tsc.fields.pfn;
> -struct page_info *page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC);
> +gfn_t gfn = _gfn(d->arch.hvm.viridian.reference_tsc.fields.pfn);
> +struct page_info *page = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC);
>  HV_REFERENCE_TSC_PAGE *p;
>  
>  if ( !page || !get_page_type(page, PGT_writable_page) )
>  {
>  if ( page )
>  put_page(page);
> -gdprintk(XENLOG_WARNING, "Bad GMFN %#"PRI_gfn" (MFN %#"PRI_mfn")\n",
> - gmfn, mfn_x(page ? page_to_mfn(page) : INVALID_MFN));
> +gdprintk(XENLOG_WARNING, "Bad GFN %#"PRI_gfn" (MFN %#"PRI_mfn")\n",

The canonical format for gfns and mfns are just %"PRI_*, without the #

Do you mind fixing this seeing as you're changing the string anyway?

> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
> index 3304921991..1efbc071c5 100644
> --- a/xen/include/asm-x86/p2m.h
> +++ b/xen/include/asm-x86/p2m.h
> @@ -491,18 +491,21 @@ struct page_info *p2m_get_page_from_gfn(struct 
> p2m_domain *p2m, gfn_t gfn,
>  p2m_query_t q);
>  
>  static inline struct page_info *get_page_from_gfn(
> -struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q)
> +struct domain *d, gfn_t gfn, p2m_type_t *t, p2m_query_t q)
>  {
>  struct page_info *page;
> +mfn_t mfn;
>  
>  if ( paging_mode_translate(d) )
> -return p2m_get_page_from_gfn(p2m_get_hostp2m(d), _gfn(gfn), t, NULL, 
> q);
> +return p2m_get_page_from_gfn(p2m_get_hostp2m(d), gfn, t, NULL, q);
>  
>  /* Non-translated guests see 1-1 RAM / MMIO mappings everywhere */
>  if ( t )
>  *t = likely(d != dom_io) ? p2m_ram_rw : p2m_mmio_direct;
> -page = mfn_to_page(_mfn(gfn));
> -return mfn_valid(_mfn(gfn)) && get_page(page, d) ? page : NULL;
> +
> +mfn = _mfn(gfn_x(gfn));
> +page = mfn_to_page(mfn);
> +return mfn_valid(mfn) && get_page(page, d) ? page : NULL;

This unfortunately propagates some bad behaviour, because it is not safe
to use mfn_to_page(mfn); before mfn_valid(mfn) succeeds.  (In practice
it works because mfn_to_page() is just pointer arithmetic.)

Pleas can you express this as:

return (mfn_valid(mfn) &&
    (page = mfn_to_page(mfn), get_page(page, d))) ? page : NULL;

which at least gets the order of operations in the correct order from
C's point of view.

Alternatively, and perhaps easier to follow:

if ( !mfn_valid(mfn) )
    return NULL;

page = mfn_to_page(mfn);

return get_page(page, d) ? page : NULL;

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] x86emul: fix test harness and fuzzer build dependencies

2018-12-21 Thread Ian Jackson
Jan Beulich writes ("Re: [Xen-devel] [PATCH] x86emul: fix test harness and 
fuzzer build dependencies"):
> On 20.12.18 at 17:05,  wrote:
>> On 20.12.18 at 16:23,  wrote:
> >> Ie, it is there to satisfy the requirement I mention above, that the
> >> dependency directory is built first.
> > 
> > Which effectively means anything underneath tools/ (and whatever
> > else subdir which depend on one of said rules) is liable to fail to
> > build without having come through this (top level) rule.

Yes.

> > But this is not a property this patch introduces or changes. It just
> > re-arranges how things get done. That is, I'd like to ask for the
> > change to be acked (or a concrete proposal be made for what
> > needs to change) _without_ fixing breakage that might be there
> > and the introduction of which you may have missed, or else I'm
> > sure you would have commented on what is now eddf9559c9.

I'm afraid I don't follow this line of argument.  The uncommitted
patch we are now discussing introduce a new -j-unsafe call, which
could cause stochastic build failures.

And AFAICT the only thing the patch-under-discussion improves is to
relax (in some situations) the requirement to run
  make -C /tools/include && make
instead of just make, in x86emul.  That requirement is the one which I
am saying is entirely normal when using recursive make so it should
not be a surprise.

AFAICT eddf9559c9 doesn't seem to have this problem, although I could
be wrong.

> >> But in that case we need to make sure that either:
> >> 
> >>  A. 1. The top-level Makefiles ensure that *a* build of
> >>tools/include completes *before* starting to enter
> >>tools/fuzz/x86_instruction_emulator.  (Which I
> >>think is the case.)
> > 
> > Yes, by said dependency on %-tools-public-headers.

Right.

> >> AND
> >> 
> >> 2. make -C tools/include is definitely completely read-only if the
> >>thing has already been built.  (This is somewhat hard to check
> >>and maintain, and would need a comment in that Makefile to
> >>ensure that this property is preserved.)
> > 
> > Isn't that a property that's supposed to hold for almost all (sub)trees,
> > i.e. it's rather the exception that an already built tree will see further
> > changes when re-built incrementally without the sources having
> > changed? That said, we have to consider such a case here, due to
> > the use of move-if-changed in xen/include/xen/lib/x86/Makefile. But
> > this still doesn't violate the fully-read-only requirement, as the rule's
> > commands won't be executed again when the dependencies are
> > older than the target (as is going to be the case after the invocation
> > of the build from the top level Makefile).
> 
> So the conclusion was wrong here. If the dependencies changed
> (as in are newer than the target) but the target file is the same as
> before, move-if-changed will hide the effect from consumers of the
> produced file, but the temporary file created gets in the way of -j
> here.

Indeed.

> There being just one invocation of the build in tools/include/ prior
> to the patch here (and hence no race afaict), would you consider
> it reasonable to make the two new invocations dependent upon
> $(MAKELEVEL), thus protecting things in the recursive case?

I don't think this is particularly nice, although I wouldn't block it.

Why is this particular inter-directory dependency unusual ?  Do we
plan to introduce similar MAKELEVEL-based invocation of dependency
directory makefiles everyhere ?

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 01/14] libxl: Enhance libxl__sendmsg_fds to deal with EINTR and EWOULDBLOCK

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 01/14] libxl: Enhance libxl__sendmsg_fds to 
deal with EINTR and EWOULDBLOCK"):
> This patch change the behavior of libxl__sendmsg_fds to retry sendmsg on
> EINTR error.
> 
> This patch also allow a caller of libxl__sendmsg_fds to deal with
> EWOULDBLOCK. The function now requires to send only 1 byte of data so
> that when dealing with non-blocking fds a EWOULDBLOCK error would mean
> that the fds haven't been sent yet. Current caller already send only 1
> byte.

Even with a blocking fd, sendmsg may in principle report a short
write.  (So the commit message should talk about short writes in
general.)

> Notes:
> v7:
> always assert datalen == 1, but only fail when sendmsg haven't send
> everything (r != datalen)
> check sendmsg return value on success as well (check for short write)

Rather than having a function which takes an argument which
mandatorily takes the value 1, how about simply deleting that
argument ?

You can do that in a followup patch if you like.  In the meantime:

Acked-by: Ian Jackson 

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 02/14] libxl_qmp: Separate QMP message generation from qmp_send_prepare

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 02/14] libxl_qmp: Separate QMP message 
generation from qmp_send_prepare"):
> .. to be able to re-use qmp_prepare_cmd with libxl__ev_qmp.
> 
> This patch also add the QMP end of command '\r\n' into the generated
> string as every caller will needs this.
> 
> There should be no functional change.

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 04/14] libxl: Add wrapper around libxl__json_object_to_json JSON

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 04/14] libxl: Add wrapper around 
libxl__json_object_to_json JSON"):
> That wrapper is going to be used to safely log a json_object, as
> libxl__json_object_to_json return NULL on error. In the error case,
> JSON() will return an invalid json string.

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 1/8] xen/arm: Only set necessary flags when initializing HCR_EL2

2018-12-21 Thread Andrii Anisov



On 28.11.18 18:49, Julien Grall wrote:

Only {A,F,I}MO are necessary to receive interrupts until a guest vCPU is
loaded.

The rest have no effect on Xen and it is better to avoid setting them.

Signed-off-by: Julien Grall 
---
  xen/arch/arm/traps.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)


Reviewed-by: Andrii Anisov 

--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 2/8] xen/arm: p2m: Provide an helper to generate the VTTBR

2018-12-21 Thread Andrii Anisov



On 28.11.18 18:49, Julien Grall wrote:

A follow-up patch will need to generate the VTTBR in a few places.

Signed-off-by: Julien Grall 
---
  xen/arch/arm/p2m.c | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)



Reviewed-by: Andrii Anisov 

--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 3/8] xen/arm: p2m: Introduce an helper to allocate the root page-table

2018-12-21 Thread Andrii Anisov



On 28.11.18 18:49, Julien Grall wrote:

A follow-up patch will require to allocate the root page-table without
having a domain in hand.

Signed-off-by: Julien Grall 


Reviewed-by: Andrii Anisov 

--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary

2018-12-21 Thread Andrii Anisov



On 28.11.18 18:49, Julien Grall wrote:

The EL1 translation regime is out-of-context when running at EL2. This
means the processor cannot speculate memory accesses using the registers
associated to that regime.

An isb() is only need if Xen is going to use the translation regime
before returning to the guest (exception returns will synchronized the
context).

Remove unecessary isb() and document the ones left.

Signed-off-by: Julien Grall 



Reviewed-by: Andrii Anisov 

--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary

2018-12-21 Thread Andrii Anisov



On 28.11.18 18:49, Julien Grall wrote:


  if ( *last_vcpu_ran != INVALID_VCPU_ID && *last_vcpu_ran != n->vcpu_id )
+{
  flush_tlb_local();
+}

BTW, missed mentioning that curly braces above are odd by coding style.

--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 7/8] xen/arm: p2m: Clean-up headers included and order them alphabetically

2018-12-21 Thread Andrii Anisov



On 28.11.18 18:49, Julien Grall wrote:

A lot of the headers are not necessary, so remove them. At the same
time, re-order them alphabetically.

Signed-off-by: Julien Grall 


Reviewed-by: Andrii Anisov 

--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 8/8] DO NOT APPLY Allow testing the new AT speculate workaround code

2018-12-21 Thread Andrii Anisov

I guess this one should not be here.


--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 8/8] DO NOT APPLY Allow testing the new AT speculate workaround code

2018-12-21 Thread Andrew Cooper
On 21/12/2018 14:44, Andrii Anisov wrote:
> I guess this one should not be here.

Posting patches like this can be useful for people trying to test the
series.

As such, it is worth posting, but the DO NOT APPLY hint is there for
people to realise that is isn't for inclusion generally.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary

2018-12-21 Thread Julien Grall



On 21/12/2018 14:43, Andrii Anisov wrote:



On 28.11.18 18:49, Julien Grall wrote:


  if ( *last_vcpu_ran != INVALID_VCPU_ID && *last_vcpu_ran != n->vcpu_id )
+    {
  flush_tlb_local();
+    }

BTW, missed mentioning that curly braces above are odd by coding style.


It is not odd, just unnecessary. I will drop them.

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 8/8] DO NOT APPLY Allow testing the new AT speculate workaround code

2018-12-21 Thread Andrii Anisov



On 21.12.18 16:47, Andrew Cooper wrote:

On 21/12/2018 14:44, Andrii Anisov wrote:

I guess this one should not be here.


Posting patches like this can be useful for people trying to test the
series.

As such, it is worth posting, but the DO NOT APPLY hint is there for
people to realise that is isn't for inclusion generally.


Thank you for the hint.

--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 4/8] xen/arm: domain_build: Don't switch to the guest P2M when copying data

2018-12-21 Thread Andrii Anisov



On 28.11.18 18:49, Julien Grall wrote:

Until recently, kernel/initrd/dtb were loaded using guest VA and
therefore requiring to restore temporarily the P2M. This reworked in a
series of commits (up to 9292086 "xen/arm: domain_build: Use
copy_to_guest_phys_flush_dcache in dtb_load") to use a guest PA.

This will also help a follow-up patch which will require
p2m_{save,restore}_state to work in pair to workaround an erratum.

Signed-off-by: Julien Grall 


Reviewed-by: Andrii Anisov 

--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 1/3] xen/pt: fix some pass-thru devices don't work across reboot

2018-12-21 Thread Chao Gao
On Fri, Dec 21, 2018 at 03:13:50AM -0700, Jan Beulich wrote:
 On 20.12.18 at 16:29,  wrote:
>> --- a/xen/drivers/passthrough/pci.c
>> +++ b/xen/drivers/passthrough/pci.c
>> @@ -1514,6 +1514,55 @@ static int assign_device(struct domain *d, u16 seg, 
>> u8 bus, u8 devfn, u32 flag)
>>  return rc;
>>  }
>>  
>> +/*
>> + * Unmap established mappings between domain's pirq and device's MSI.
>> + * These mappings were set up by qemu/guest and are expected to be
>> + * destroyed when changing the device's ownership.
>> + */
>> +static void pci_unmap_msi(struct pci_dev *pdev)
>> +{
>> +struct msi_desc *entry;
>> +struct domain *d = pdev->domain;
>> +
>> +ASSERT(pcidevs_locked());
>> +
>> +if ( !d )
>> +return;
>
>Why? deassign_device() (the only caller) ought to guarantee this,
>due to its use of pci_get_pdev_by_domain(). I think this simply
>wants to be another ASSERT(), if anything at all.
>
>> +spin_lock(&d->event_lock);
>> +while ( (entry = list_first_entry_or_null(&pdev->msi_list,
>> +  struct msi_desc, list)) != 
>> NULL )
>> +{
>> +struct pirq *info;
>> +int pirq = 0;
>> +unsigned int nr = entry->msi_attrib.type != PCI_CAP_ID_MSIX
>> +  ? entry->msi.nvec : 1;
>> +
>> +while ( nr -- )
>
>Stray blank.
>
>> +{
>> +struct hvm_pirq_dpci *pirq_dpci;
>> +
>> +pirq = domain_irq_to_pirq(d, entry[nr].irq);
>> +WARN_ON(pirq < 0);
>> +if ( pirq <= 0 )
>> +continue;
>> +
>> +info = pirq_info(d, pirq);
>> +if ( !info )
>> +continue;
>> +
>> +pirq_dpci = pirq_dpci(info);
>> +if ( pirq_dpci &&
>> + (pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI) &&
>> + (pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI) )
>> +pt_irq_destroy_bind_msi(d, info);
>> +}
>> +if ( pirq > 0 )
>> +unmap_domain_pirq(d, pirq);
>
>Can you guarantee that this function won't fail? Because if it
>does, I think you might end up in an infinite loop, because the

Considering current code doesn't deal with remaining pirq, if we
failed to unmap some pirq here (remove all entries from the msi_list
here), it wouldn't be a big issue. Hence the real issue here is a
potential infinite loop. Then we can just use
list_for_each_entry_safe(...) to traverse msi_list to avoid infinite
loop.

>entry doesn't always get removed from the list in error cases.
>Maybe unmap_domain_pirq() needs a "force" mode added,
>perhaps indirectly by way of passing "entry" into it (all other
>callers would pass NULL).

Yes, it is viable. However, for this call site, unmap_domain_pirq()
would fail to remove an entry only if xsm_unmap_domain_irq() in
unmap_domain_pirq() failed. Can we expect that xsm_unmap_domain_irq()
would always succeed there? If the answer is yes, what needed is
another assertion rather than the "force" mode. Maybe we can
forcibly remove all entries still on the list after the loop.
The benefit is we needn't change unmap_domain_pirq() and its
existing call sites.

>
>But then again I'm still not fully convinced that a hypervisor
>change is the right course of action here in the first place. It
>would be better if the hypervisor had to just verify that all
>IRQ mappings are gone, or else fail the de-assignment of the
>device.

Then in another place, we need the "force" mode. I don't think
it would bring great benefit if there is no other case (except device
hot-remove and guest shutdown) where we want to unmap all pirqs
related to a device.  

Thanks
Chao


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 7/8] xen/arm: p2m: Clean-up headers included and order them alphabetically

2018-12-21 Thread Andrii Anisov

I've missed that this patch is already merged within a different series.
Also "[Xen-devel] [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 
erratum 1165522" should be rebased.

On 21.12.18 16:44, Andrii Anisov wrote:



On 28.11.18 18:49, Julien Grall wrote:

A lot of the headers are not necessary, so remove them. At the same
time, re-order them alphabetically.

Signed-off-by: Julien Grall 


Reviewed-by: Andrii Anisov 



--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 06/14] libxl_qmp: Implementation of libxl__ev_qmp_*

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 06/14] libxl_qmp: Implementation of 
libxl__ev_qmp_*"):
> This patch implement the API libxl__ev_qmp documented in the previous
> patch, "libxl: Design of an async API to issue QMP commands to QEMU".
> 
> Since this API is to interact with QEMU via the QMP protocol, it also
> implement a QMP client. The specification for the QEMU Machine Protocol
> (QMP) can be found in the QEMU repository at:
> https://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/interop/qmp-spec.txt

Thanks.

I have only fairly minor comments now.  The biggest one remaining is
about the use of EGC_GC which I think probably wants to become
STATE_AO_GC throughout.  See below...


> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 1c7a3b22f4..056de9de2f 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -412,6 +412,19 @@ _hidden int libxl__ev_qmp_send(libxl__gc *gc, 
> libxl__ev_qmp *ev,

> +/* receive buffer */
> +char *rx_buf;

rx_buf needs a comment saying it comes from NOGC since otherwise one
would assume it came from the ao gc like the other buffers.

(Could it come from the ao gc instead?)


> +static void qmp_ev_ensure_reading_writing(libxl__gc *gc, libxl__ev_qmp *ev)
> +/* Update the state of `efd` to match the permited state */
> +{

This function is legal only in states other than disconnected.
Needs to be documented.

> +static void qmp_ev_set_state(libxl__gc *gc, libxl__ev_qmp *ev,
> + libxl__qmp_state new_state)
> +/* on entry: !broken and !disconnected */
> +{
> +switch (new_state) {
> +case qmp_state_disconnected:
> +break;
> +case qmp_state_connecting:
> +assert(ev->state == qmp_state_disconnected);
> +break;
> +case qmp_state_capability_negotiation:
> +assert(ev->state == qmp_state_connecting);
> +break;
> +case qmp_state_waiting_reply:
> +assert(ev->state == qmp_state_capability_negotiation ||
> +   ev->state == qmp_state_connected);
> +break;
> +case qmp_state_connected:
> +assert(ev->state == qmp_state_waiting_reply);
> +break;
> +}
> +
> +ev->state = new_state;

I think this function needs to update efd ?  What am I missing ?

The comment doesn't say what the output state is but naturally I
assume that it is precisely new_state, and not some transitional
mixture.  If it is intended to produce a transitional mixture that
ought to be documented.

For a concrete example: if on entry, with new_state==disconnected, we
are `connecting' then: efd will be looking for POLLIN.  But it needs
to become Idle.


> +/* Setup connection */
> +
> +static int qmp_ev_connect(libxl__gc *gc, libxl__ev_qmp *ev)
> +/* disconnected -> connecting but with `msg` free
> + * on error: broken */
> +{

This function looks fine to me.  However, earlier I wrote this:

  Contrary to the state description, this function does not
  transition rx_buf from free to used.  However, I think this
  would probably be more easily remedied by changing the
  definition of `used' to permit NULL/0/0.  You might want to use
  a different word to `used', `inuse' perhaps ?

This is still true.  That is, your state description for `connecting'
says that rx_buf is `used'.  And your description lower about what
rx_buf being `used' means says that rx_buf must be `allocated'.

I think this would probably be best resolved by writing:

  * free   used
- * rx_buf   NULL   allocated
+ * rx_buf   NULL   NULL or allocated
  * rx_buf_size  0  allocation size of `rx_buf`
  * rx_buf_used  0  <= rx_buf_size, actual data in the
  * buffer

Ie just to change the internal spec.  I am going to assume for the
rest of the review that the code is right and the internal spec will
be updated.  (I don't think it is necessary to change the descriptions
of rx_buf_size and rx_buf_used; it will be clear that the `allocation
size' of a NULL must be 0.)


> +/* QMP FD callbacks */
> +
> +static void qmp_ev_fd_callback(libxl__egc *egc, libxl__ev_fd *ev_fd,
...

> +static int qmp_ev_callback_writable(libxl__gc *gc,
> +libxl__ev_qmp *ev, int fd)
> +/* on entry: !disconnected
> + * on return, one of these state transition:
> + *   waiting_reply (with msg set) -> waiting_reply (with msg free)
> + *   tx_buf set -> same state or tx_buf free
> + *   tx_buf free -> no state change
> + * on error: broken */
> +{
...
> +assert(ev->tx_buf);
> +if (!ev->tx_buf)
> +return 0;

I think the if is vestigial.


> +while (ev->tx_buf_off < ev->tx_buf_len) {
> +r = write(fd, ev->tx_buf + ev->tx_buf_off,
> +  ev->tx_buf_len - ev->tx_buf_off);
> +if (r < 0) {
> +if (errno == EINTR)
> +continue;
> +if (errno == EWOULDB

Re: [Xen-devel] [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522

2018-12-21 Thread Andrii Anisov



On 28.11.18 18:49, Julien Grall wrote:

Early version of Cortex-A76 can end-up with corrupt TLBs if they
speculate an AT instruction while the S1/S2 system registers are in an
inconsistent state.

This can happen during guest context switch and when invalidating the
TLBs for other than the current VMID.

The workaround implemented in Xen will:
 - Use an empty stage-2 with a reserved VMID while context switching
 between 2 guests
 - Use an empty stage-2 with the VMID where TLBs need to be flushed

Signed-off-by: Julien Grall 


Reviewed-by: Andrii Anisov 

--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/dom0: Add a dom0-iommu=none option

2018-12-21 Thread Roger Pau Monné
On Fri, Dec 21, 2018 at 01:01:22PM +, Andrew Cooper wrote:
> On 21/12/2018 12:44, Roger Pau Monné wrote:
> > On Thu, Dec 20, 2018 at 11:40:52PM +, Andrew Cooper wrote:
> >> For development purposes, it is very convenient to boot Xen as a PVH guest,
> >> with an XTF PV or PVH "dom0".  The edit-compile-go cycle is a matter of
> >> seconds, and you can resonably insert printk() debugging in places which 
> >> which
> >> would be completely infeasible when booting fully-fledged guests.
> >>
> >> However, the PVH dom0 path insists on having a working IOMMU, which doesn't
> >> exist when virtualised as a PVH guest, and isn't necessary for XTF anyway.
> >>
> >> Introduce a developer mode to skip the IOMMU requirement.
> > This looks very similar to the current 'passthrough' option, maybe it
> > would be enough to allow PVH dom0 to use the passthrough option
> > provided a warning is added to
> > arch_iommu_check_autotranslated_hwdom?
> 
> I considered that, but "dom0-iommu=passthrough" isn't an accurate
> description of what is going on.  Frankly, its not correct for PV either.

And what about turning dom0-iommu into a boolean option itself, so
that you can do "dom0-iommu=false"?

I think that's more similar to other Xen command line options that
have a boolean value and/or a list of sub-options.

> TBH, dom0-iommu=none is better for both.  How about I introduce that as
> the new option, and leave passthrough as a legacy alias?

That sounds good, I would like to avoid (if possible) the
proliferation of new iommu_hwdom_* variables, because we already have
a bunch and the functionality introduced by the 'none' option looks
very similar to what 'passthrough' aims to achieve.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 07/14] libxl_exec: Add libxl__spawn_initiate_failure

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 07/14] libxl_exec: Add 
libxl__spawn_initiate_failure"):
> This function can be used by user of libxl__spawn_* when they setup a
> notification other than xenstore. The parent can already report success
> via libxl__spawn_initiate_detach(), this new function can be used for
> failure instead of waiting for the timeout.

Acked-by: Ian Jackson 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 06/14] libxl_qmp: Implementation of libxl__ev_qmp_*

2018-12-21 Thread Ian Jackson
Ian Jackson writes ("Re: [PATCH v7 06/14] libxl_qmp: Implementation of 
libxl__ev_qmp_*"):
> I have only fairly minor comments now.  The biggest one remaining is
> about the use of EGC_GC which I think probably wants to become
> STATE_AO_GC throughout.  See below...

I realise that I should state explicitly that libxl__ev_qmp_callback
must still get an egc.

The egc should be passed through all the layers, but its gc should
generally not be used.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 08/14] libxl: Add init/dispose of for libxl__domain_build_state

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 08/14] libxl: Add init/dispose of for 
libxl__domain_build_state"):
> These two new functions libxl__domain_build_state_{init,dispose} should
> be called every time a new libxl__domain_build_state comes to existance.
> 
> There seems to be two of them, one with the domain creation machinery,
> and one in the stub_dm_spawn.

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 09/14] libxl_dm: Pre-open QMP socket for QEMU

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 09/14] libxl_dm: Pre-open QMP socket for 
QEMU"):
> This patch moves the creation of the QMP unix socket from QEMU to libxl.
> But libxl doesn't rely on this yet.

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 11/14] libxl: QEMU startup sync based on QMP

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 11/14] libxl: QEMU startup sync based on 
QMP"):
> This is only activated when dm_restrict=1, as explained in a previous
> patch "libxl_dm: Pre-open QMP socket for QEMU"
> 
> Signed-off-by: Anthony PERARD 
> Reviewed-by: Roger Pau Monné 

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 12/14] libxl_qmp: Store advertised QEMU version in libxl__ev_qmp

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 12/14] libxl_qmp: Store advertised QEMU 
version in libxl__ev_qmp"):
> This will be used in a later patch.

Acked-by: Ian Jackson 

> +o = libxl__json_map_get("QMP", resp, JSON_MAP);
> +o = libxl__json_map_get("version", o, JSON_MAP);
> +o = libxl__json_map_get("qemu", o, JSON_MAP);
> +#define GRAB_VERSION(level) do { \
> +ev->qemu_version.level = libxl__json_object_get_integer( \
> +libxl__json_map_get(#level, o, JSON_INTEGER)); \
> +} while (0)
> +GRAB_VERSION(major);
> +GRAB_VERSION(minor);
> +GRAB_VERSION(micro);

Earlier I wrote:

   I would prefer the indentation to be such that the statement inside
   the macro is indented like the ones outside.

Ie like this:

  +#define GRAB_VERSION(level) do { \
  +ev->qemu_version.level = libxl__json_object_get_integer( \
  +libxl__json_map_get(#level, o, JSON_INTEGER)); \
  +} while (0)
  +GRAB_VERSION(major);

But up to you.  My ack stands either way.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] xen/dom0: Improve documentation for dom0= and dom0-iommu=

2018-12-21 Thread Roger Pau Monné
On Fri, Dec 21, 2018 at 01:13:25PM +, Andrew Cooper wrote:
> On 21/12/2018 12:08, Roger Pau Monné wrote:
> > On Thu, Dec 20, 2018 at 11:40:51PM +, Andrew Cooper wrote:
> >> Update to the latest metadata style, and expand each of the clauses with 
> >> more
> >> information, including applicable CONFIG_* options.
> >>
> >> Drop the redundant comment beside parse_dom0_param(), to avoid it getting 
> >> out
> >> of sync with the main documentation.
> >>
> >> Signed-off-by: Andrew Cooper 
> > Thanks! A couple of fixes below, because the original text is actually
> > wrong...
> 
> TBH, that is my default assumption every time I do work like this :)

In this case it's my fault :(, because I changed the code and forgot
about the docs.

> >
> >> ---
> >> CC: Jan Beulich 
> >> CC: Wei Liu 
> >> CC: Roger Pau Monné 
> >> CC: Stefano Stabellini 
> >> CC: Julien Grall 
> >>
> >> Please double check for correctness.  The text matches my
> >> understanding/reading of the code, but some of it is rather subtle going.
> >>
> >> It occurs to me that:
> >>
> >>  * The choice of dom0 boot mode should in part be derived from the 
> >> available
> >>CONFIG_* options, and ELF notes advertised in the dom0 kernel.
> > This is indeed doable, but would require parsing the dom0 kernel
> > before building the domain.
> 
> I don't see anything wrong with parsing the ELF headers ahead of
> building the domain.  From the overall boot time, its just an
> order-of-operations issue.

Oh yes, I didn't mean my comment to sound like criticism. I agree
there should be no issues in parsing the ELF earlier, or if there are
issues they should be fixed.

> >
> >>  * AMD probably needs to gain an `ivmd=` to mirror `rmrr=` on the Intel 
> >> side,
> >>because we know there are other errors in the IVRS table.
> > Yes, albeit using rmrr is quite cumbersome because it's mostly a
> > trial-and-error process until there are no more iommu faults (unless
> > you can get the correct rmrr command for your hardware somewhere).
> >
> >>  * Neither of map-{inclusive,reserved} should be active by default, even on
> >>Intel hardware, and we should (wherever possible) have quirks like we 
> >> have
> >>for all other firmware screwups.  Requiring the user to diagnose/work
> >>around firmware problems like this is quite rude.
> > That would indeed be nice, but I think there are too many vendor
> > firmware versions to be able to correctly identify such quirks, the
> > more that vendors don't even list missing RMRR as erratum.
> 
> I don't agree.  We already have quirks based on DMI (at the moment,
> mainly for reboot overrides), and the vast majority of the offending
> cases are the BMC shared mailbox, which will be in a fixed per-platform
> location.

IIRC I've only found a single box that worked without map-reserved,
and that's my NUC which has firmware from Intel. And even in that case
the USB ports weren't fully working.

I guess such quirks could be applied based on the chipset version then
if Xen realizes the firmware is either wrong or missing obvious RMRR
regions?

> I don't expect we'll ever find and fix all quirks, but where we do find
> suitable ones, we should put them into the boot code.

Sadly I agree. What I'm worried about is turning the default
map-{inclusive/reserved} to off, that's likely to make dom0 unable to
boot on a huge amount of hardware.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 13/14] libxl: Change libxl__domain_suspend_device_model() to be async

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 13/14] libxl: Change 
libxl__domain_suspend_device_model() to be async"):
> This create an extra step for the two call sites of the function.
> 
> libxl__domain_suspend_device_model() in this patch gets an extra error
> variable (there is ret and rc), but ret goes away in the next patch.

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 00/14] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_*

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 00/14] libxl: Enable save/restore/migration 
of a restricted QEMU + libxl__ev_qmp_*"):
> Patch series available in this git branch:
> https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git 
> br.libxl-ev-qmp-v

Thanks.  Sorry for being so slow to review this.  It is very nearly
ready and I hope it will make 4.12.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 14/14] libxl: Re-implement domain_suspend_device_model using libxl__ev_qmp

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 14/14] libxl: Re-implement 
domain_suspend_device_model using libxl__ev_qmp"):
> The re-implementation is done because we want to be able to send the
> file description that QEMU can use to save its state. When QEMU is
> restricted, it would not be able to write to a path.
> 
> This replace both libxl__qmp_stop() and libxl__qmp_save().
> 
> qmp_qemu_check_version() was only used by libxl__qmp_save(), so it is
> replace by a version using libxl__ev_qmp instead.
> 
> Coding style fixed in libxl__domain_suspend_device_model() for the
> return value.

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v2 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn

2018-12-21 Thread Julien Grall

Hi Andrew,

On 21/12/2018 14:14, Andrew Cooper wrote:

On 20/12/2018 19:23, Julien Grall wrote:

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 32dc4253ff..b462a8513b 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -827,7 +827,7 @@ int arch_set_info_guest(
  unsigned long flags;
  bool compat;
  #ifdef CONFIG_PV
-unsigned long cr3_gfn;
+gfn_t cr3_gfn;


I've sent an alternative patch which this patch should be rebased over,
at which point all modifications to arch_set_info_guest() should
hopefully disappear.


The rest of the series should be merged by end of today (Code freeze for Xen 
Arm). So I will resend this patch separately after my holidays.





diff --git a/xen/arch/x86/hvm/domain.c b/xen/arch/x86/hvm/domain.c
index 5d5a746a25..73d2da8441 100644
--- a/xen/arch/x86/hvm/domain.c
+++ b/xen/arch/x86/hvm/domain.c
@@ -297,7 +297,7 @@ int arch_set_info_hvm_guest(struct vcpu *v, const 
vcpu_hvm_context_t *ctx)
  {
  /* Shadow-mode CR3 change. Check PDBR and update refcounts. */
  struct page_info *page = get_page_from_gfn(v->domain,
- v->arch.hvm.guest_cr[3] >> PAGE_SHIFT,
+ gaddr_to_gfn(v->arch.hvm.guest_cr[3]),
   NULL, P2M_ALLOC);


Can you re-indent while modifying this please?


Sure.




diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c
index 840a82b457..a718434456 100644
--- a/xen/arch/x86/hvm/viridian/time.c
+++ b/xen/arch/x86/hvm/viridian/time.c
@@ -38,16 +38,16 @@ static void dump_reference_tsc(const struct domain *d)
  
  static void update_reference_tsc(struct domain *d, bool initialize)

  {
-unsigned long gmfn = d->arch.hvm.viridian.reference_tsc.fields.pfn;
-struct page_info *page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC);
+gfn_t gfn = _gfn(d->arch.hvm.viridian.reference_tsc.fields.pfn);
+struct page_info *page = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC);
  HV_REFERENCE_TSC_PAGE *p;
  
  if ( !page || !get_page_type(page, PGT_writable_page) )

  {
  if ( page )
  put_page(page);
-gdprintk(XENLOG_WARNING, "Bad GMFN %#"PRI_gfn" (MFN %#"PRI_mfn")\n",
- gmfn, mfn_x(page ? page_to_mfn(page) : INVALID_MFN));
+gdprintk(XENLOG_WARNING, "Bad GFN %#"PRI_gfn" (MFN %#"PRI_mfn")\n",


The canonical format for gfns and mfns are just %"PRI_*, without the #

Do you mind fixing this seeing as you're changing the string anyway?


Sure.




diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 3304921991..1efbc071c5 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -491,18 +491,21 @@ struct page_info *p2m_get_page_from_gfn(struct p2m_domain 
*p2m, gfn_t gfn,
  p2m_query_t q);
  
  static inline struct page_info *get_page_from_gfn(

-struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q)
+struct domain *d, gfn_t gfn, p2m_type_t *t, p2m_query_t q)
  {
  struct page_info *page;
+mfn_t mfn;
  
  if ( paging_mode_translate(d) )

-return p2m_get_page_from_gfn(p2m_get_hostp2m(d), _gfn(gfn), t, NULL, 
q);
+return p2m_get_page_from_gfn(p2m_get_hostp2m(d), gfn, t, NULL, q);
  
  /* Non-translated guests see 1-1 RAM / MMIO mappings everywhere */

  if ( t )
  *t = likely(d != dom_io) ? p2m_ram_rw : p2m_mmio_direct;
-page = mfn_to_page(_mfn(gfn));
-return mfn_valid(_mfn(gfn)) && get_page(page, d) ? page : NULL;
+
+mfn = _mfn(gfn_x(gfn));
+page = mfn_to_page(mfn);
+return mfn_valid(mfn) && get_page(page, d) ? page : NULL;


This unfortunately propagates some bad behaviour, because it is not safe
to use mfn_to_page(mfn); before mfn_valid(mfn) succeeds.  (In practice
it works because mfn_to_page() is just pointer arithmetic.)

Pleas can you express this as:

return (mfn_valid(mfn) &&
     (page = mfn_to_page(mfn), get_page(page, d))) ? page : NULL;

which at least gets the order of operations in the correct order from
C's point of view.

Alternatively, and perhaps easier to follow:

if ( !mfn_valid(mfn) )
     return NULL;

page = mfn_to_page(mfn);

return get_page(page, d) ? page : NULL;


I am happy to fix that. However, shouldn't this be handled in a separate patch? 
After all, the code is not worst than it currently is.


Cheers,



~Andrew



--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 03/15] x86/cpu/vpmu: Add Hygon Dhyana support for vPMU

2018-12-21 Thread Pu Wen
On 2018/12/21 21:34, Boris Ostrovsky wrote:
> On 12/21/18 5:02 AM, Pu Wen wrote:
>> On 2018/12/20 22:25, Boris Ostrovsky wrote:
>> ...
 diff --git a/xen/arch/x86/cpu/vpmu_amd.c b/xen/arch/x86/cpu/vpmu_amd.c
 index 5efc39b..e9f0a5c 100644
 --- a/xen/arch/x86/cpu/vpmu_amd.c
 +++ b/xen/arch/x86/cpu/vpmu_amd.c
 @@ -554,6 +554,8 @@ int __init amd_vpmu_init(void)
case 0x12:
case 0x14:
case 0x16:
 +case 0x17:
 +case 0x18:
>>>
>>> This also enables VPMU support for Zen which goes beyond what the
>>> commit message claims to do.
>> Sorry for the not clear commit message. Will add modification description
>> in the commit message and make the changes complete.
>>
>> On the other hand, since current Xen vPMU still not support Zen. so in
>> this patch we enable 0x17 support. If this modification is not preferred,
>> will remove AMD Xen 0x17 support in next version.
> 
> Enabling 0x17 should be fine, I just thought commit message should be
> explicit about that.

OK, will explicit describe the enabling of 0x17 in the commit message in next
version patch set. Thanks for the suggestion.

>>> Also, why are you choosing to use legacy MSRs (and you did the same in
>>> Linux)? Doesn't Zen (which you are saying is similar to Hygon) support
>>> c001_020X bank?
>> In Linux, the Xen PMU driver use the default branch cases, which also use
>> the legacy MSRs way. So we choose to follow legacy MSRs here in Dhyana
>> cases.
>>
>> Since both of Zen and Dhyana support C001_020X MSRs. If use the C001_020X
>> is preferred, we will try to modify the related codes and create a patch.
> 
> 
> I don't have a Zen box available right now but from what I can see 0x17
> counters are compatible with 0x15 so I think switching to C001_020X
> should work. And looks like you are using those in Linux (non-Xen part) too.
>> Also the Linux Xen PMU driver may need to be updated to use these MSRs.
> 
> Yes, although Linux part is used only by PV guests.

Yes, I have tested the MSRs of the 0x15 ones by booting a Dom0 PV guest.
It works even when the Linux Xen PMU driver use the legacy MSRs. I'll test the
PV guest by using C001_020X in the Linux Xen part tomorrow.

Thx.

-- 
Regards,
Pu Wen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 4/8] xen/arm: Make get_page_from_gfn working with DOMID_XEN

2018-12-21 Thread Julien Grall
DOMID_XEN is used to share pages beloging to the hypervisor
(e.g trace buffers). Unlike other domains, DOMID_XEN is a non-auto
translated domain and therefore does not have a P2M.

This patch adds a special case for DOMID_XEN in get_page_from_gfn. We
may want to provide "non-auto translated helpers" in the future if we
see more case.

Signed-off-by: Julien Grall 

---
Changes in v3:
- Split from "xen/arm: Add support for read-only foreign
mappings"
- Use likely rather than unlikely
- Fix typoes
---
 xen/include/asm-arm/p2m.h | 33 -
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index a03a033a05..041dea827c 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -300,7 +300,38 @@ struct page_info *p2m_get_page_from_gfn(struct domain *d, 
gfn_t gfn,
 static inline struct page_info *get_page_from_gfn(
 struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q)
 {
-return p2m_get_page_from_gfn(d, _gfn(gfn), t);
+mfn_t mfn;
+p2m_type_t _t;
+struct page_info *page;
+
+/*
+ * Special case for DOMID_XEN as it is the only domain so far that is
+ * not auto-translated.
+ */
+if ( likely(d != dom_xen) )
+return p2m_get_page_from_gfn(d, _gfn(gfn), t);
+
+if ( !t )
+t = &_t;
+
+*t = p2m_invalid;
+
+/*
+ * DOMID_XEN sees 1-1 RAM. The p2m_type is based on the type of the
+ * page.
+ */
+mfn = _mfn(gfn);
+page = mfn_to_page(mfn);
+
+if ( !mfn_valid(mfn) || !get_page(page, d) )
+return NULL;
+
+if ( page->u.inuse.type_info & PGT_writable_page )
+*t = p2m_ram_rw;
+else
+*t = p2m_ram_ro;
+
+return page;
 }
 
 int get_page_type(struct page_info *page, unsigned long type);
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 7/8] xenalyze: Build for Both ARM and x86 Platforms

2018-12-21 Thread Julien Grall
From: Benjamin Sanda 

Modified to provide building of the xenalyze binary for both ARM and
x86 platforms. The xenalyze binary is now built as part of the BIN
list for both platforms.

Signed-off-by: Benjamin Sanda 
Signed-off-by: Julien Grall 
Acked-by: Wei Liu 
Acked-by: Stefano Stabellini 

---
Changes in v3:
- Add Stefano's acked-by

Changes in v2:
- Add Wei's acked-by
---
 tools/xentrace/Makefile | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/xentrace/Makefile b/tools/xentrace/Makefile
index 0bad942bdf..9fb7fc96e7 100644
--- a/tools/xentrace/Makefile
+++ b/tools/xentrace/Makefile
@@ -9,8 +9,7 @@ LDLIBS += $(LDLIBS_libxenevtchn)
 LDLIBS += $(LDLIBS_libxenctrl)
 LDLIBS += $(ARGP_LDFLAGS)
 
-BIN-$(CONFIG_X86) = xenalyze
-BIN  = $(BIN-y)
+BIN  = xenalyze
 SBIN = xentrace xentrace_setsize
 LIBBIN   = xenctx
 SCRIPTS  = xentrace_format
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 1/8] xen/arm: p2m: Introduce p2m_get_page_from_gfn

2018-12-21 Thread Julien Grall
In a follow-up patch, we will need to handle get_page_from_gfn
differently for DOMID_XEN. To keep the code simple move the current
content in a new separate helper p2m_get_page_from_gfn.

Note the new helper is not anymore a static inline function as the helper
is quite complex.

Finally, take the opportunity to use typesafe gfn as the change is
minor.

Signed-off-by: Julien Grall 
Reviewed-by: Andrii Anisov 
Reviewed-by: Stefano Stabellini 

---
Changes in v3:
- Fix typeos
- Add Stefano's reviewed-by
- Fix coding style

Changes in v2:
- Add Andrii's reviewed-by
---
 xen/arch/arm/p2m.c| 33 +
 xen/include/asm-arm/p2m.h | 33 -
 2 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 2b5e43f50a..7ae5b29699 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -406,6 +406,39 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t 
*t)
 return mfn;
 }
 
+struct page_info *p2m_get_page_from_gfn(struct domain *d, gfn_t gfn,
+p2m_type_t *t)
+{
+struct page_info *page;
+p2m_type_t p2mt;
+mfn_t mfn = p2m_lookup(d, gfn, &p2mt);
+
+if ( t )
+*t = p2mt;
+
+if ( !p2m_is_any_ram(p2mt) )
+return NULL;
+
+if ( !mfn_valid(mfn) )
+return NULL;
+
+page = mfn_to_page(mfn);
+
+/*
+ * get_page won't work on foreign mapping because the page doesn't
+ * belong to the current domain.
+ */
+if ( p2m_is_foreign(p2mt) )
+{
+struct domain *fdom = page_get_owner_and_reference(page);
+ASSERT(fdom != NULL);
+ASSERT(fdom != d);
+return page;
+}
+
+return get_page(page, d) ? page : NULL;
+}
+
 int guest_physmap_mark_populate_on_demand(struct domain *d,
   unsigned long gfn,
   unsigned int order)
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 01cd3ee4b5..4db8e8709d 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -289,38 +289,13 @@ typedef unsigned int p2m_query_t;
 #define P2M_ALLOC(1u<<0)   /* Populate PoD and paged-out entries */
 #define P2M_UNSHARE  (1u<<1)   /* Break CoW sharing */
 
+struct page_info *p2m_get_page_from_gfn(struct domain *d, gfn_t gfn,
+p2m_type_t *t);
+
 static inline struct page_info *get_page_from_gfn(
 struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q)
 {
-struct page_info *page;
-p2m_type_t p2mt;
-mfn_t mfn = p2m_lookup(d, _gfn(gfn), &p2mt);
-
-if (t)
-*t = p2mt;
-
-if ( !p2m_is_any_ram(p2mt) )
-return NULL;
-
-if ( !mfn_valid(mfn) )
-return NULL;
-page = mfn_to_page(mfn);
-
-/*
- * get_page won't work on foreign mapping because the page doesn't
- * belong to the current domain.
- */
-if ( p2m_is_foreign(p2mt) )
-{
-struct domain *fdom = page_get_owner_and_reference(page);
-ASSERT(fdom != NULL);
-ASSERT(fdom != d);
-return page;
-}
-
-if ( !get_page(page, d) )
-return NULL;
-return page;
+return p2m_get_page_from_gfn(d, _gfn(gfn), t);
 }
 
 int get_page_type(struct page_info *page, unsigned long type);
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 0/8] xen/arm: Add xentrace support

2018-12-21 Thread Julien Grall
Hi all,

This patch series is a rework of the series sent by Benjamin Sanda in April
2016 [1]. It finally adds support for xentrace/xenanalyze on Arm.

Cheers,

[1] https://lists.xenproject.org/archives/html/xen-devel/2016-04/msg00464.html

*** BLURB HERE ***

Benjamin Sanda (2):
  xen/arm: Initialize trace buffer
  xenalyze: Build for Both ARM and x86 Platforms

Julien Grall (6):
  xen/arm: p2m: Introduce p2m_get_page_from_gfn
  xen/arm: Rename p2m_map_foreign to p2m_map_foreign_rw
  xen/arm: Add support for read-only foreign mappings
  xen/arm: Make get_page_from_gfn working with DOMID_XEN
  xen/arm: Allow a privileged domain to map foreign page from DOMID_XEN
  xen: Switch parameter in get_page_from_gfn to use typesafe gfn

 tools/xentrace/Makefile  |  3 +-
 xen/arch/arm/guestcopy.c |  2 +-
 xen/arch/arm/mm.c| 19 ++--
 xen/arch/arm/p2m.c   | 36 ++-
 xen/arch/arm/setup.c |  3 ++
 xen/arch/x86/cpu/vpmu.c  |  2 +-
 xen/arch/x86/domain.c| 12 
 xen/arch/x86/domctl.c|  6 ++--
 xen/arch/x86/hvm/dm.c|  2 +-
 xen/arch/x86/hvm/domain.c|  2 +-
 xen/arch/x86/hvm/hvm.c   |  9 +++---
 xen/arch/x86/hvm/svm/svm.c   |  8 ++---
 xen/arch/x86/hvm/viridian/time.c |  8 ++---
 xen/arch/x86/hvm/viridian/viridian.c | 16 +-
 xen/arch/x86/hvm/vmx/vmx.c   |  4 +--
 xen/arch/x86/hvm/vmx/vvmx.c  | 12 
 xen/arch/x86/mm.c| 24 ---
 xen/arch/x86/mm/p2m.c|  2 +-
 xen/arch/x86/mm/shadow/hvm.c |  6 ++--
 xen/arch/x86/physdev.c   |  3 +-
 xen/arch/x86/pv/descriptor-tables.c  |  4 +--
 xen/arch/x86/pv/emul-priv-op.c   |  6 ++--
 xen/arch/x86/pv/mm.c |  2 +-
 xen/arch/x86/traps.c | 11 +++
 xen/common/domain.c  |  2 +-
 xen/common/event_fifo.c  | 12 
 xen/common/memory.c  |  4 +--
 xen/common/tmem_xen.c|  2 +-
 xen/include/asm-arm/p2m.h| 57 +---
 xen/include/asm-x86/p2m.h| 11 ---
 30 files changed, 174 insertions(+), 116 deletions(-)

-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 3/8] xen/arm: Add support for read-only foreign mappings

2018-12-21 Thread Julien Grall
Currently, foreign mappings can only be read-write. A follow-up patch will
extend foreign mapping for Xen backend memory (via XEN_DOMID), some of
that memory should only be read accessible for the mapping domain.

Introduce a new p2m_type to cater read-only foreign mappings. For now,
the decision between the two foreign mapping type is based on the type
of the guest page mapped.

Signed-off-by: Julien Grall 

---

Cc: Andrii Anisov 

Changes in v3:
- Remove Andrii's reviewed-by
- Move out the XEN_DOMID code in a separate patch
- Make the new addition future-proof

Changes in v2:
- Add Andrii's reviewed-by
---
 xen/arch/arm/mm.c | 5 +++--
 xen/arch/arm/p2m.c| 1 +
 xen/include/asm-arm/p2m.h | 9 +++--
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 7193d83b44..3bf11eec4f 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1259,7 +1259,9 @@ int xenmem_add_to_physmap_one(
 return -EINVAL;
 }
 
-if ( !p2m_is_ram(p2mt) )
+if ( p2m_is_ram(p2mt) )
+t = (p2mt == p2m_ram_rw) ? p2m_map_foreign_rw : p2m_map_foreign_ro;
+else
 {
 put_page(page);
 rcu_unlock_domain(od);
@@ -1267,7 +1269,6 @@ int xenmem_add_to_physmap_one(
 }
 
 mfn = page_to_mfn(page);
-t = p2m_map_foreign_rw;
 
 rcu_unlock_domain(od);
 break;
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 89279fb590..1e7c91e39a 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -478,6 +478,7 @@ static void p2m_set_permission(lpae_t *e, p2m_type_t t, 
p2m_access_t a)
 break;
 
 case p2m_iommu_map_ro:
+case p2m_map_foreign_ro:
 case p2m_grant_map_ro:
 case p2m_invalid:
 e->p2m.xn = 1;
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index a1aef7b793..a03a033a05 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -116,6 +116,7 @@ typedef enum {
 p2m_mmio_direct_nc, /* Read/write mapping of genuine MMIO area 
non-cacheable */
 p2m_mmio_direct_c,  /* Read/write mapping of genuine MMIO area cacheable */
 p2m_map_foreign_rw, /* Read/write RAM pages from foreign domain */
+p2m_map_foreign_ro, /* Read-only RAM pages from foreign domain */
 p2m_grant_map_rw,   /* Read/write grant mapping */
 p2m_grant_map_ro,   /* Read-only grant mapping */
 /* The types below are only used to decide the page attribute in the P2M */
@@ -135,12 +136,16 @@ typedef enum {
 #define P2M_GRANT_TYPES (p2m_to_mask(p2m_grant_map_rw) |  \
  p2m_to_mask(p2m_grant_map_ro))
 
+/* Foreign mappings types */
+#define P2M_FOREIGN_TYPES (p2m_to_mask(p2m_map_foreign_rw) | \
+   p2m_to_mask(p2m_map_foreign_ro))
+
 /* Useful predicates */
 #define p2m_is_ram(_t) (p2m_to_mask(_t) & P2M_RAM_TYPES)
-#define p2m_is_foreign(_t) (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign_rw))
+#define p2m_is_foreign(_t) (p2m_to_mask(_t) & P2M_FOREIGN_TYPES)
 #define p2m_is_any_ram(_t) (p2m_to_mask(_t) &   \
 (P2M_RAM_TYPES | P2M_GRANT_TYPES |  \
- p2m_to_mask(p2m_map_foreign_rw)))
+ P2M_FOREIGN_TYPES))
 
 /* All common type definitions should live ahead of this inclusion. */
 #ifdef _XEN_P2M_COMMON_H
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn

2018-12-21 Thread Julien Grall
No functional change intended.

Only reasonable clean-ups are done in this patch. The rest will use _gfn
for the time being.

Signed-off-by: Julien Grall 
Acked-by: Jan Beulich 

---
Changes in v3:
- Add Jan's acked-by

Changes in v2:
- Remove >> PAGE_SHIFT in svm code
- Fix typo in the e-mail address
- Small NITs
---
 xen/arch/arm/guestcopy.c |  2 +-
 xen/arch/arm/mm.c|  2 +-
 xen/arch/x86/cpu/vpmu.c  |  2 +-
 xen/arch/x86/domain.c| 12 ++--
 xen/arch/x86/domctl.c|  6 +++---
 xen/arch/x86/hvm/dm.c|  2 +-
 xen/arch/x86/hvm/domain.c|  2 +-
 xen/arch/x86/hvm/hvm.c   |  9 +
 xen/arch/x86/hvm/svm/svm.c   |  8 
 xen/arch/x86/hvm/viridian/time.c |  8 
 xen/arch/x86/hvm/viridian/viridian.c | 16 
 xen/arch/x86/hvm/vmx/vmx.c   |  4 ++--
 xen/arch/x86/hvm/vmx/vvmx.c  | 12 ++--
 xen/arch/x86/mm.c| 24 ++--
 xen/arch/x86/mm/p2m.c|  2 +-
 xen/arch/x86/mm/shadow/hvm.c |  6 +++---
 xen/arch/x86/physdev.c   |  3 ++-
 xen/arch/x86/pv/descriptor-tables.c  |  4 ++--
 xen/arch/x86/pv/emul-priv-op.c   |  6 +++---
 xen/arch/x86/pv/mm.c |  2 +-
 xen/arch/x86/traps.c | 11 ++-
 xen/common/domain.c  |  2 +-
 xen/common/event_fifo.c  | 12 ++--
 xen/common/memory.c  |  4 ++--
 xen/common/tmem_xen.c|  2 +-
 xen/include/asm-arm/p2m.h|  6 +++---
 xen/include/asm-x86/p2m.h| 11 +++
 27 files changed, 95 insertions(+), 85 deletions(-)

diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
index 7a0f3e9d5f..55892062bb 100644
--- a/xen/arch/arm/guestcopy.c
+++ b/xen/arch/arm/guestcopy.c
@@ -37,7 +37,7 @@ static struct page_info *translate_get_page(copy_info_t info, 
uint64_t addr,
 return get_page_from_gva(info.gva.v, addr,
  write ? GV2M_WRITE : GV2M_READ);
 
-page = get_page_from_gfn(info.gpa.d, paddr_to_pfn(addr), &p2mt, P2M_ALLOC);
+page = get_page_from_gfn(info.gpa.d, gaddr_to_gfn(addr), &p2mt, P2M_ALLOC);
 
 if ( !page )
 return NULL;
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 01ae20..340a1d1548 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1252,7 +1252,7 @@ int xenmem_add_to_physmap_one(
 
 /* Take reference to the foreign domain page.
  * Reference will be released in XENMEM_remove_from_physmap */
-page = get_page_from_gfn(od, idx, &p2mt, P2M_ALLOC);
+page = get_page_from_gfn(od, _gfn(idx), &p2mt, P2M_ALLOC);
 if ( !page )
 {
 put_pg_owner(od);
diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
index 8a4f753eae..4d8f153031 100644
--- a/xen/arch/x86/cpu/vpmu.c
+++ b/xen/arch/x86/cpu/vpmu.c
@@ -607,7 +607,7 @@ static int pvpmu_init(struct domain *d, xen_pmu_params_t 
*params)
 struct vcpu *v;
 struct vpmu_struct *vpmu;
 struct page_info *page;
-uint64_t gfn = params->val;
+gfn_t gfn = _gfn(params->val);
 
 if ( (params->vcpu >= d->max_vcpus) || (d->vcpu[params->vcpu] == NULL) )
 return -EINVAL;
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 32dc4253ff..b462a8513b 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -827,7 +827,7 @@ int arch_set_info_guest(
 unsigned long flags;
 bool compat;
 #ifdef CONFIG_PV
-unsigned long cr3_gfn;
+gfn_t cr3_gfn;
 struct page_info *cr3_page;
 unsigned long cr4;
 int rc = 0;
@@ -1091,9 +1091,9 @@ int arch_set_info_guest(
 set_bit(_VPF_in_reset, &v->pause_flags);
 
 if ( !compat )
-cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[3]);
+cr3_gfn = _gfn(xen_cr3_to_pfn(c.nat->ctrlreg[3]));
 else
-cr3_gfn = compat_cr3_to_pfn(c.cmp->ctrlreg[3]);
+cr3_gfn = _gfn(compat_cr3_to_pfn(c.cmp->ctrlreg[3]));
 cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC);
 
 if ( !cr3_page )
@@ -1122,7 +1122,7 @@ int arch_set_info_guest(
 case 0:
 if ( !compat && !VM_ASSIST(d, m2p_strict) &&
  !paging_mode_refcounts(d) )
-fill_ro_mpt(_mfn(cr3_gfn));
+fill_ro_mpt(_mfn(gfn_x(cr3_gfn)));
 break;
 default:
 if ( cr3_page == current->arch.old_guest_table )
@@ -1137,7 +1137,7 @@ int arch_set_info_guest(
 v->arch.guest_table = pagetable_from_page(cr3_page);
 if ( c.nat->ctrlreg[1] )
 {
-cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[1]);
+cr3_gfn = _gfn(xen_cr3_to_pfn(c.nat->ctrlreg[1]));
 cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC);
 
 if ( !cr3_page )
@@ -1162,7 +1162,7 @@ int arch_set_info_guest(
  

[Xen-devel] [PATCH for-4.12 v3 2/8] xen/arm: Rename p2m_map_foreign to p2m_map_foreign_rw

2018-12-21 Thread Julien Grall
A follow-up patch will introduce another type of foreign mapping. Rename
the type to make clear it is only used for read-write mapping.

No functional changes intended.

Signed-off-by: Julien Grall 
Reviewed-by: Andrii Anisov 
Acked-by: Stefano Stabellini 

---
Changes in v3
- Add Stefano's acked-by

Changes in v2:
- Add Andrii's reviewed-by
---
 xen/arch/arm/mm.c | 2 +-
 xen/arch/arm/p2m.c| 2 +-
 xen/include/asm-arm/p2m.h | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index d96a6655ee..7193d83b44 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1267,7 +1267,7 @@ int xenmem_add_to_physmap_one(
 }
 
 mfn = page_to_mfn(page);
-t = p2m_map_foreign;
+t = p2m_map_foreign_rw;
 
 rcu_unlock_domain(od);
 break;
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 7ae5b29699..89279fb590 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -468,7 +468,7 @@ static void p2m_set_permission(lpae_t *e, p2m_type_t t, 
p2m_access_t a)
 break;
 
 case p2m_iommu_map_rw:
-case p2m_map_foreign:
+case p2m_map_foreign_rw:
 case p2m_grant_map_rw:
 case p2m_mmio_direct_dev:
 case p2m_mmio_direct_nc:
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 4db8e8709d..a1aef7b793 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -115,7 +115,7 @@ typedef enum {
 p2m_mmio_direct_dev,/* Read/write mapping of genuine Device MMIO area */
 p2m_mmio_direct_nc, /* Read/write mapping of genuine MMIO area 
non-cacheable */
 p2m_mmio_direct_c,  /* Read/write mapping of genuine MMIO area cacheable */
-p2m_map_foreign,/* Ram pages from foreign domain */
+p2m_map_foreign_rw, /* Read/write RAM pages from foreign domain */
 p2m_grant_map_rw,   /* Read/write grant mapping */
 p2m_grant_map_ro,   /* Read-only grant mapping */
 /* The types below are only used to decide the page attribute in the P2M */
@@ -137,10 +137,10 @@ typedef enum {
 
 /* Useful predicates */
 #define p2m_is_ram(_t) (p2m_to_mask(_t) & P2M_RAM_TYPES)
-#define p2m_is_foreign(_t) (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign))
+#define p2m_is_foreign(_t) (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign_rw))
 #define p2m_is_any_ram(_t) (p2m_to_mask(_t) &   \
 (P2M_RAM_TYPES | P2M_GRANT_TYPES |  \
- p2m_to_mask(p2m_map_foreign)))
+ p2m_to_mask(p2m_map_foreign_rw)))
 
 /* All common type definitions should live ahead of this inclusion. */
 #ifdef _XEN_P2M_COMMON_H
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 6/8] xen/arm: Initialize trace buffer

2018-12-21 Thread Julien Grall
From: Benjamin Sanda 

Now that we allow a privileged domain to map tracing buffer, initialize
them so a user can effectively trace Xen.

Signed-off-by: Benjamin Sanda 
[julien: rework commit message]
Signed-off-by: Julien Grall 
Reviewed-by: Andrii Anisov 
Acked-by: Stefano Stabellini 

---
Changes in v3:
- Add Stefano's acked-by

Changes in v2:
- Add Andrii's reviewed-by
---
 xen/arch/arm/setup.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index fb923cdf67..444857a967 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -899,6 +900,8 @@ void __init start_xen(unsigned long boot_phys_offset,
 
 heap_init_late();
 
+init_trace_bufs();
+
 init_constructors();
 
 console_endboot();
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 5/8] xen/arm: Allow a privileged domain to map foreign page from DOMID_XEN

2018-12-21 Thread Julien Grall
For auto-translated domain, the only way to map a page to itself is the
using the foreign map API. The current code does not allow mapping page from
special page (such as DOMID_XEN).

As xentrace buffers are shared using DOMID_XEN, it is not possible to use
tracing for Arm.

This could be solved by using the helper get_pg_owner(). This helper will
be able to get a reference on DOMID_XEN and therefore allow mapping for
privileged domain.

This patch replace the call to rcu_lock_domain_by_any_id() with
get_pg_owner(). For consistency, all the call to rcu_unlock_domain are
replaced by put_pg_owner().

Signed-off-by: Julien grall 
Reviewed-by: Andrii Anisov 
Reviewed-by: Stefano Stabellini 

---
Changes in v3:
- Add Stefano's reviewed-by
- Fix typoes

Changes in v2:
- Add Andrii's reviewed-by
---
 xen/arch/arm/mm.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 3bf11eec4f..01ae20 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1233,20 +1233,20 @@ int xenmem_add_to_physmap_one(
 struct domain *od;
 p2m_type_t p2mt;
 
-od = rcu_lock_domain_by_any_id(extra.foreign_domid);
+od = get_pg_owner(extra.foreign_domid);
 if ( od == NULL )
 return -ESRCH;
 
 if ( od == d )
 {
-rcu_unlock_domain(od);
+put_pg_owner(od);
 return -EINVAL;
 }
 
 rc = xsm_map_gmfn_foreign(XSM_TARGET, d, od);
 if ( rc )
 {
-rcu_unlock_domain(od);
+put_pg_owner(od);
 return rc;
 }
 
@@ -1255,7 +1255,7 @@ int xenmem_add_to_physmap_one(
 page = get_page_from_gfn(od, idx, &p2mt, P2M_ALLOC);
 if ( !page )
 {
-rcu_unlock_domain(od);
+put_pg_owner(od);
 return -EINVAL;
 }
 
@@ -1264,13 +1264,13 @@ int xenmem_add_to_physmap_one(
 else
 {
 put_page(page);
-rcu_unlock_domain(od);
+put_pg_owner(od);
 return -EINVAL;
 }
 
 mfn = page_to_mfn(page);
 
-rcu_unlock_domain(od);
+put_pg_owner(od);
 break;
 }
 case XENMAPSPACE_dev_mmio:
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v3 7/8] xenalyze: Build for Both ARM and x86 Platforms

2018-12-21 Thread Julien Grall

Hi,

On 21/12/2018 16:26, Julien Grall wrote:

From: Benjamin Sanda 

Modified to provide building of the xenalyze binary for both ARM and
x86 platforms. The xenalyze binary is now built as part of the BIN
list for both platforms.

Signed-off-by: Benjamin Sanda 
Signed-off-by: Julien Grall 
Acked-by: Wei Liu 
Acked-by: Stefano Stabellini 


I made a typo in the address e-mail. Stefano, could you fix it on commit?

Cheers,



---
 Changes in v3:
 - Add Stefano's acked-by

 Changes in v2:
 - Add Wei's acked-by
---
  tools/xentrace/Makefile | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/xentrace/Makefile b/tools/xentrace/Makefile
index 0bad942bdf..9fb7fc96e7 100644
--- a/tools/xentrace/Makefile
+++ b/tools/xentrace/Makefile
@@ -9,8 +9,7 @@ LDLIBS += $(LDLIBS_libxenevtchn)
  LDLIBS += $(LDLIBS_libxenctrl)
  LDLIBS += $(ARGP_LDFLAGS)
  
-BIN-$(CONFIG_X86) = xenalyze

-BIN  = $(BIN-y)
+BIN  = xenalyze
  SBIN = xentrace xentrace_setsize
  LIBBIN   = xenctx
  SCRIPTS  = xentrace_format



--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v3 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn

2018-12-21 Thread Julien Grall



On 21/12/2018 16:26, Julien Grall wrote:

No functional change intended.

Only reasonable clean-ups are done in this patch. The rest will use _gfn
for the time being.

Signed-off-by: Julien Grall 
Acked-by: Jan Beulich 


Please ignore this patch. I need to rebase it on Andrew's patch.

Cheers,



---
 Changes in v3:
 - Add Jan's acked-by

 Changes in v2:
 - Remove >> PAGE_SHIFT in svm code
 - Fix typo in the e-mail address
 - Small NITs
---
  xen/arch/arm/guestcopy.c |  2 +-
  xen/arch/arm/mm.c|  2 +-
  xen/arch/x86/cpu/vpmu.c  |  2 +-
  xen/arch/x86/domain.c| 12 ++--
  xen/arch/x86/domctl.c|  6 +++---
  xen/arch/x86/hvm/dm.c|  2 +-
  xen/arch/x86/hvm/domain.c|  2 +-
  xen/arch/x86/hvm/hvm.c   |  9 +
  xen/arch/x86/hvm/svm/svm.c   |  8 
  xen/arch/x86/hvm/viridian/time.c |  8 
  xen/arch/x86/hvm/viridian/viridian.c | 16 
  xen/arch/x86/hvm/vmx/vmx.c   |  4 ++--
  xen/arch/x86/hvm/vmx/vvmx.c  | 12 ++--
  xen/arch/x86/mm.c| 24 ++--
  xen/arch/x86/mm/p2m.c|  2 +-
  xen/arch/x86/mm/shadow/hvm.c |  6 +++---
  xen/arch/x86/physdev.c   |  3 ++-
  xen/arch/x86/pv/descriptor-tables.c  |  4 ++--
  xen/arch/x86/pv/emul-priv-op.c   |  6 +++---
  xen/arch/x86/pv/mm.c |  2 +-
  xen/arch/x86/traps.c | 11 ++-
  xen/common/domain.c  |  2 +-
  xen/common/event_fifo.c  | 12 ++--
  xen/common/memory.c  |  4 ++--
  xen/common/tmem_xen.c|  2 +-
  xen/include/asm-arm/p2m.h|  6 +++---
  xen/include/asm-x86/p2m.h| 11 +++
  27 files changed, 95 insertions(+), 85 deletions(-)

diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
index 7a0f3e9d5f..55892062bb 100644
--- a/xen/arch/arm/guestcopy.c
+++ b/xen/arch/arm/guestcopy.c
@@ -37,7 +37,7 @@ static struct page_info *translate_get_page(copy_info_t info, 
uint64_t addr,
  return get_page_from_gva(info.gva.v, addr,
   write ? GV2M_WRITE : GV2M_READ);
  
-page = get_page_from_gfn(info.gpa.d, paddr_to_pfn(addr), &p2mt, P2M_ALLOC);

+page = get_page_from_gfn(info.gpa.d, gaddr_to_gfn(addr), &p2mt, P2M_ALLOC);
  
  if ( !page )

  return NULL;
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 01ae20..340a1d1548 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1252,7 +1252,7 @@ int xenmem_add_to_physmap_one(
  
  /* Take reference to the foreign domain page.

   * Reference will be released in XENMEM_remove_from_physmap */
-page = get_page_from_gfn(od, idx, &p2mt, P2M_ALLOC);
+page = get_page_from_gfn(od, _gfn(idx), &p2mt, P2M_ALLOC);
  if ( !page )
  {
  put_pg_owner(od);
diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
index 8a4f753eae..4d8f153031 100644
--- a/xen/arch/x86/cpu/vpmu.c
+++ b/xen/arch/x86/cpu/vpmu.c
@@ -607,7 +607,7 @@ static int pvpmu_init(struct domain *d, xen_pmu_params_t 
*params)
  struct vcpu *v;
  struct vpmu_struct *vpmu;
  struct page_info *page;
-uint64_t gfn = params->val;
+gfn_t gfn = _gfn(params->val);
  
  if ( (params->vcpu >= d->max_vcpus) || (d->vcpu[params->vcpu] == NULL) )

  return -EINVAL;
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 32dc4253ff..b462a8513b 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -827,7 +827,7 @@ int arch_set_info_guest(
  unsigned long flags;
  bool compat;
  #ifdef CONFIG_PV
-unsigned long cr3_gfn;
+gfn_t cr3_gfn;
  struct page_info *cr3_page;
  unsigned long cr4;
  int rc = 0;
@@ -1091,9 +1091,9 @@ int arch_set_info_guest(
  set_bit(_VPF_in_reset, &v->pause_flags);
  
  if ( !compat )

-cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[3]);
+cr3_gfn = _gfn(xen_cr3_to_pfn(c.nat->ctrlreg[3]));
  else
-cr3_gfn = compat_cr3_to_pfn(c.cmp->ctrlreg[3]);
+cr3_gfn = _gfn(compat_cr3_to_pfn(c.cmp->ctrlreg[3]));
  cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC);
  
  if ( !cr3_page )

@@ -1122,7 +1122,7 @@ int arch_set_info_guest(
  case 0:
  if ( !compat && !VM_ASSIST(d, m2p_strict) &&
   !paging_mode_refcounts(d) )
-fill_ro_mpt(_mfn(cr3_gfn));
+fill_ro_mpt(_mfn(gfn_x(cr3_gfn)));
  break;
  default:
  if ( cr3_page == current->arch.old_guest_table )
@@ -1137,7 +1137,7 @@ int arch_set_info_guest(
  v->arch.guest_table = pagetable_from_page(cr3_page);
  if ( c.nat->ctrlreg[1] )
  {
-cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[1]);
+c

Re: [Xen-devel] [PATCH for-4.12 v2 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn

2018-12-21 Thread Andrew Cooper
On 21/12/2018 16:21, Julien Grall wrote:
>>> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
>>> index 3304921991..1efbc071c5 100644
>>> --- a/xen/include/asm-x86/p2m.h
>>> +++ b/xen/include/asm-x86/p2m.h
>>> @@ -491,18 +491,21 @@ struct page_info *p2m_get_page_from_gfn(struct
>>> p2m_domain *p2m, gfn_t gfn,
>>>   p2m_query_t q);
>>>     static inline struct page_info *get_page_from_gfn(
>>> -    struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q)
>>> +    struct domain *d, gfn_t gfn, p2m_type_t *t, p2m_query_t q)
>>>   {
>>>   struct page_info *page;
>>> +    mfn_t mfn;
>>>     if ( paging_mode_translate(d) )
>>> -    return p2m_get_page_from_gfn(p2m_get_hostp2m(d), _gfn(gfn),
>>> t, NULL, q);
>>> +    return p2m_get_page_from_gfn(p2m_get_hostp2m(d), gfn, t,
>>> NULL, q);
>>>     /* Non-translated guests see 1-1 RAM / MMIO mappings
>>> everywhere */
>>>   if ( t )
>>>   *t = likely(d != dom_io) ? p2m_ram_rw : p2m_mmio_direct;
>>> -    page = mfn_to_page(_mfn(gfn));
>>> -    return mfn_valid(_mfn(gfn)) && get_page(page, d) ? page : NULL;
>>> +
>>> +    mfn = _mfn(gfn_x(gfn));
>>> +    page = mfn_to_page(mfn);
>>> +    return mfn_valid(mfn) && get_page(page, d) ? page : NULL;
>>
>> This unfortunately propagates some bad behaviour, because it is not safe
>> to use mfn_to_page(mfn); before mfn_valid(mfn) succeeds.  (In practice
>> it works because mfn_to_page() is just pointer arithmetic.)
>>
>> Pleas can you express this as:
>>
>> return (mfn_valid(mfn) &&
>>  (page = mfn_to_page(mfn), get_page(page, d))) ? page : NULL;
>>
>> which at least gets the order of operations in the correct order from
>> C's point of view.
>>
>> Alternatively, and perhaps easier to follow:
>>
>> if ( !mfn_valid(mfn) )
>>  return NULL;
>>
>> page = mfn_to_page(mfn);
>>
>> return get_page(page, d) ? page : NULL;
>
> I am happy to fix that. However, shouldn't this be handled in a
> separate patch? After all, the code is not worst than it currently is.

I don't think its worthy of a separate patch.  You're touching the code
anyway, so might as well do it all in one go.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 02/11] libxl: Get rid of support for QEMU_USER_BASE (xen-qemuuser-domidNN)

2018-12-21 Thread George Dunlap
QEMU_USER_BASE allows a user to specify the UID to use when running
the devicemodel for a specific domain number.  Unfortunately, this is
not really practical: It requires nearly 32,000 entries in
/etc/passwd.  QEMU_USER_RANGE_BASE is much more practical.

Remove support for QEMU_USER_BASE.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
NB that I've chosen not to update the xl.cfg man page at this time; it
needs a lot of other updates as well, which would be easier to do all
at once at the end.

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c   | 16 
 tools/libxl/libxl_internal.h |  1 -
 2 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index bbcbc94b6c..6024d4b7b8 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -138,13 +138,6 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 return 0;
 }
 
-user = GCSPRINTF("%s%d", LIBXL_QEMU_USER_BASE, guest_domid);
-ret = userlookup_helper_getpwnam(gc, user, &user_pwbuf, 0);
-if (ret < 0)
-return ret;
-if (ret > 0)
-goto end_search;
-
 ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
  &user_pwbuf, &user_base);
 if (ret < 0)
@@ -174,15 +167,14 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 if (ret < 0)
 return ret;
 if (ret > 0) {
-LOGD(WARN, guest_domid, "Could not find user %s%d, falling back to %s",
- LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED);
+LOGD(WARN, guest_domid, "Could not find user %s, falling back to %s",
+ LIBXL_QEMU_USER_RANGE_BASE, LIBXL_QEMU_USER_SHARED);
 goto end_search;
 }
 
 LOGD(ERROR, guest_domid,
- "Could not find user %s%d or %s or range base pseudo-user %s, cannot 
restrict",
- LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED,
- LIBXL_QEMU_USER_RANGE_BASE);
+ "Could not find user %s or range base pseudo-user %s, cannot 
restrict",
+ LIBXL_QEMU_USER_SHARED, LIBXL_QEMU_USER_RANGE_BASE);
 return ERROR_INVAL;
 
 end_search:
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index c4a43bd0b7..b147f3803c 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -4387,7 +4387,6 @@ _hidden int libxl__read_sysfs_file_contents(libxl__gc *gc,
 int *datalen_r);
 
 #define LIBXL_QEMU_USER_PREFIX "xen-qemuuser"
-#define LIBXL_QEMU_USER_BASE   LIBXL_QEMU_USER_PREFIX"-domid"
 #define LIBXL_QEMU_USER_SHARED LIBXL_QEMU_USER_PREFIX"-shared"
 #define LIBXL_QEMU_USER_RANGE_BASE LIBXL_QEMU_USER_PREFIX"-range-base"
 
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 07/11] libxl: Make killing of device model asynchronous

2018-12-21 Thread George Dunlap
Or at least, give it an asynchronous interface so that we can make it
actually asynchronous in subsequent patches.

Create state structures and callback function signatures.  Add the
state structure to libxl__destroy_domid_state.  Break
libxl__destroy_domid down into two functions.

No functional change intended.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
v2:
- Note that libxl__devicemodel_destroy_cb may be called reentrantly

NB that I retain the comment before libxl__destroy_device_model, in
spite of the fact that it looks "pointless", to separate it logically
from the previous prototype.

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c   | 11 +++---
 tools/libxl/libxl_domain.c   | 40 
 tools/libxl/libxl_internal.h | 20 --
 3 files changed, 58 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 450433452d..ca59df33fe 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2696,19 +2696,24 @@ out:
 return rc;
 }
 
-int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid)
+void libxl__destroy_device_model(libxl__egc *egc,
+ libxl__destroy_devicemodel_state *ddms)
 {
+STATE_AO_GC(ddms->ao);
 int rc;
+int domid = ddms->domid;
 char *path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, "");
+
 if (!xs_rm(CTX->xsh, XBT_NULL, path))
 LOGD(ERROR, domid, "xs_rm failed for %s", path);
+
 /* We should try to destroy the device model anyway. */
 rc = kill_device_model(gc,
   GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
-
+
 libxl__qmp_cleanup(gc, domid);
 
-return rc;
+ddms->callback(egc, ddms, rc);
 }
 
 /* Return 0 if no dm needed, 1 if needed and <0 if error. */
diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
index d46b97dedf..0ce1ba1327 100644
--- a/tools/libxl/libxl_domain.c
+++ b/tools/libxl/libxl_domain.c
@@ -1008,6 +1008,10 @@ static void destroy_finish_check(libxl__egc *egc,
 }
 
 /* Callbacks for libxl__destroy_domid */
+static void dm_destroy_cb(libxl__egc *egc,
+  libxl__destroy_devicemodel_state *ddms,
+  int rc);
+
 static void devices_destroy_cb(libxl__egc *egc,
libxl__devices_remove_state *drs,
int rc);
@@ -1066,16 +1070,18 @@ void libxl__destroy_domid(libxl__egc *egc, 
libxl__destroy_domid_state *dis)
 if (rc < 0) {
 LOGEVD(ERROR, rc, domid, "xc_domain_pause failed");
 }
+
 if (dm_present) {
-if (libxl__destroy_device_model(gc, domid) < 0)
-LOGD(ERROR, domid, "libxl__destroy_device_model failed");
+dis->ddms.ao = ao;
+dis->ddms.domid = domid;
+dis->ddms.callback = dm_destroy_cb;
+
+libxl__destroy_device_model(egc, &dis->ddms);
+return;
+} else {
+dm_destroy_cb(egc, &dis->ddms, 0);
+return;
 }
-dis->drs.ao = ao;
-dis->drs.domid = domid;
-dis->drs.callback = devices_destroy_cb;
-dis->drs.force = 1;
-libxl__devices_destroy(egc, &dis->drs);
-return;
 
 out:
 assert(rc);
@@ -1083,6 +1089,24 @@ out:
 return;
 }
 
+static void dm_destroy_cb(libxl__egc *egc,
+  libxl__destroy_devicemodel_state *ddms,
+  int rc)
+{
+libxl__destroy_domid_state *dis = CONTAINER_OF(ddms, *dis, ddms);
+STATE_AO_GC(dis->ao);
+uint32_t domid = dis->domid;
+
+if (rc < 0)
+LOGD(ERROR, domid, "libxl__destroy_device_model failed");
+
+dis->drs.ao = ao;
+dis->drs.domid = domid;
+dis->drs.callback = devices_destroy_cb;
+dis->drs.force = 1;
+libxl__devices_destroy(egc, &dis->drs);
+}
+
 static void devices_destroy_cb(libxl__egc *egc,
libxl__devices_remove_state *drs,
int rc)
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b147f3803c..f9e0bf6578 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1705,8 +1705,6 @@ _hidden int 
libxl__wait_for_device_model_deprecated(libxl__gc *gc,
   void *userdata),
 void *check_callback_userdata);
 
-_hidden int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid);
-
 _hidden const libxl_vnc_info *libxl__dm_vnc(const libxl_domain_config *g_cfg);
 
 _hidden char *libxl__abs_path(libxl__gc *gc, const char *s, const char *path);
@@ -3672,6 +3670,7 @@ extern const struct libxl_device_type *device_type_tbl[];
 
 typedef struct libxl__domain_destroy_state libxl__domain_destroy_state;
 typedef struct libxl__destroy_domid_state libxl__destroy_domid_state;
+typedef struct libxl__destroy_devicemodel_state 
libxl__destroy_devicemodel_state;
 typedef struct libxl__devices_remove_

[Xen-devel] [PATCH v3 04/11] dm_depriv: Describe expected usage of device_model_user parameter

2018-12-21 Thread George Dunlap
A number of subsequent patches rely on as-yet undefined behavior for
what the `device_model_user` parameter does.  Rather than implement it
incorrectly (or randomly), or remove the feature, describe an expected
usage for the feature.  Further patches will make decisions based on
this expected usage.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
v3:
- Fix a minor typo

v2:
- Remove stale comment about device_model_user not being ready

RFC: As we'll see in a later patch, this implementation is still
incomplete: we need a `reaper` uid from which to kill uids.

CC: Ian Jackson 
CC: Wei Liu 
CC: Anthony Perard 
---
 docs/features/qemu-deprivilege.pandoc | 17 +
 tools/libxl/libxl_types.idl   |  1 -
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/docs/features/qemu-deprivilege.pandoc 
b/docs/features/qemu-deprivilege.pandoc
index f941525189..ce21a60ef7 100644
--- a/docs/features/qemu-deprivilege.pandoc
+++ b/docs/features/qemu-deprivilege.pandoc
@@ -66,6 +66,23 @@ this, create a user named `xen-qemuuser-shared`; for example:
 
 adduser --no-create-home --system xen-qemuuser-shared
 
+A final way to set up a separate process for qemus is to allocate one
+UID per VM, and set the UID in the domain config file with the
+`device_model_user` argument.  For example, suppose you have a VM
+named `c6-01`.  You might do the following:
+
+adduser --system --no-create-home --group xen-qemuuser-c6-01
+
+And then in your config file, the following line:
+
+device_model_user="xen-qemuuser-c6-01"
+
+NOTE: It is important when using `device_model_user` that EACH VM HAVE
+A SEPARATE UID, and that none of these UIDs map to root.  xl will
+throw an error a uid maps to zero, but not if multiple VMs have the
+same uid.  Multiple VMs with the same device model uid will cause
+problems.
+
 ## Domain config changes
 
 The core domain config change is to add the following line to the
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 51cf06a3a2..141c46e42a 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -495,7 +495,6 @@ libxl_domain_build_info = Struct("domain_build_info",[
 ("device_model", string),
 ("device_model_ssidref", uint32),
 ("device_model_ssid_label", string),
-# device_model_user is not ready for use yet
 ("device_model_user", string),
 
 # extra parameters pass directly to qemu, NULL terminated
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 09/11] libxl: Kill QEMU with "reaper" ruid

2018-12-21 Thread George Dunlap
Using kill(-1) to killing an untrusted dm process with the real uid
equal to the dm_uid isn't guaranteed to succeed: the process in
question may be able to kill the reaper process after the setresuid()
and before the kill().

Instead, set the real uid to the QEMU user for domain 0
(QEMU_USER_RANGE_BASE + 0).  The reaper process will still be able to
kill the dm process, but not vice versa.

This, in turn, requires locking to make sure that only one reaper
process is using that uid at a time; otherwise one reaper process may
kill the other reaper process.

Create a lockfile in RUNDIR/dm-reaper-lock, and grab the lock before
executing kill.

In the event that we can't get the lock for some reason, go ahead with
the kill using dm_uid for both real and effective UIDs.  This isn't
guaranteed to work, but it's no worse than not trying to kill the
process at all.

NB that this effectively requires admins using device_model_user to
also define xen_qemuuser_range_base; this will be addressed in
subsequent patches.

Signed-off-by: George Dunlap 
---
v3:
- Have libxl__get_reaper_uid return an error if a suitable uid is not found.
- Explicitly set reaper_uid to dm_uid if get_reaper_lock_and_uid()
  returns an error, rather than relying on reaper_uid being unchanged
  on failure.
- Open the lockfile 0644 rather than 0666.
- Remove bogus comment.

v2:
- Port over previous changes
- libxl__get_reaper_uid() won't set errno, use LOG rather than LOGE.
- Accumulate error and return for all failures
- Move flock() outside of the condition.  Also fix EINTR check (check
  errno rather than return value).
- Add a comment explaining why we return an error even if the kill()
  succeeds
- Move locking to a separate function to minimize gotos
- Refactor libxl__get_reaper_id to take a pointer for reaper_uid;
  return only success/failure.  Also return EINVAL if reaper_uid would
  resolve to 0.
- Handle "reaper_uid not found" specially; note issue with
  device_model_user feature
- Assert that final reaper_uid != 0

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c | 118 +
 1 file changed, 108 insertions(+), 10 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index d265f24864..f7c4e5eb3b 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -259,6 +259,35 @@ out:
 return rc;
 }
 
+/*
+ * Look up "reaper UID".  If present and non-root, returns 0 and sets
+ * reaper_uid.  Otherwise returns libxl-style error.
+ */
+static int libxl__get_reaper_uid(libxl__gc *gc, uid_t *reaper_uid)
+{
+struct passwd *user_base, user_pwbuf;
+int rc;
+
+rc = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
+ &user_pwbuf, &user_base);
+if (rc)
+return rc;
+
+if (!user_base) {
+LOG(WARN, "Couldn't find uid for reaper process");
+return ERROR_INVAL;
+}
+
+if (user_base->pw_uid == 0) {
+LOG(ERROR, "UID for reaper process maps to root!");
+return ERROR_INVAL;
+}
+
+*reaper_uid = user_base->pw_uid;
+
+return 0;
+}
+
 const char *libxl__domain_device_model(libxl__gc *gc,
const libxl_domain_build_info *info)
 {
@@ -2834,37 +2863,106 @@ out:
 return;
 }
 
+static int get_reaper_lock_and_uid(libxl__destroy_devicemodel_state *ddms,
+   uid_t *reaper_uid)
+{
+STATE_AO_GC(ddms->ao);
+int domid = ddms->domid;
+int r;
+const char * lockfile;
+int fd;
+
+/* Try to lock the "reaper uid" */
+lockfile = GCSPRINTF("%s/dm-reaper-lock", libxl__run_dir_path());
+
+/*
+ * NB that since we've just forked, we can't have any
+ * threads; so we don't need the libxl__carefd
+ * infrastructure here.
+ */
+fd = open(lockfile, O_RDWR|O_CREAT, 0644);
+if (fd < 0) {
+LOGED(ERROR, domid,
+  "unexpected error while trying to open lockfile %s, errno=%d",
+  lockfile, errno);
+return ERROR_FAIL;
+}
+
+/* Try to lock the file, retrying on EINTR */
+for (;;) {
+r = flock(fd, LOCK_EX);
+if (!r)
+break;
+if (errno != EINTR) {
+/* All other errno: EBADF, EINVAL, ENOLCK, EWOULDBLOCK */
+LOGED(ERROR, domid,
+  "unexpected error while trying to lock %s, fd=%d, errno=%d",
+  lockfile, fd, errno);
+return ERROR_FAIL;
+}
+}
+
+/*
+ * Get reaper_uid.  If we can't find such a uid, return an error.
+ *
+ * FIXME: This means that domain destruction will fail if
+ * device_model_user is set but QEMU_USER_RANGE_BASE doesn't exist.
+ */
+return libxl__get_reaper_uid(gc, reaper_uid);
+}
+
+
 /*
  * Destroy all processes of the given uid by setresuid to the
  * specified uid and kill(-1).  NB this MUST BE CALLED FROM A SEPARATE
- * PROCESS from the normal libxl process. 

[Xen-devel] [PATCH v3 11/11] dm_depriv: Mark `UID cleanup` as completed

2018-12-21 Thread George Dunlap
Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
---
 docs/designs/qemu-deprivilege.md | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/docs/designs/qemu-deprivilege.md b/docs/designs/qemu-deprivilege.md
index f7444a434d..81a5f5c05d 100644
--- a/docs/designs/qemu-deprivilege.md
+++ b/docs/designs/qemu-deprivilege.md
@@ -128,26 +128,6 @@ are specified; this does not apply to QEMU running as a 
Xen DM.
 
 '''Tested''': Not tested
 
-# Restrictions / improvements still to do
-
-This lists potential restrictions still to do.  It is meant to be
-listed in order of ease of implementation, with low-hanging fruit
-first.
-
-### Further RLIMITs
-
-RLIMIT_AS limits the total amount of memory; but this includes the
-virtual memory which QEMU uses as a mapcache.  xen-mapcache.c already
-fiddles with this; it would be straightforward to make it *set* the
-rlimit to what it thinks a sensible limit is.
-
-RLIMIT_NPROC limits total number of processes or threads.  QEMU uses
-threads for some devices, so this would require some thought.
-
-Other things that would take some cleverness / changes to QEMU to
-utilize due to ordering constrants:
- - RLIMIT_NOFILES (after all necessary files are opened)
-
 ### libxl UID cleanup
 
 '''Description''': Domain IDs are reused, and thus restricted UIDs are
@@ -223,6 +203,26 @@ Since this will kill all other `reaper_uid` processes as 
well, we must
 either allocate a separate `reaper_uid` per domain, or use locking to
 ensure that only one killing process is active at a time.
 
+# Restrictions / improvements still to do
+
+This lists potential restrictions still to do.  It is meant to be
+listed in order of ease of implementation, with low-hanging fruit
+first.
+
+### Further RLIMITs
+
+RLIMIT_AS limits the total amount of memory; but this includes the
+virtual memory which QEMU uses as a mapcache.  xen-mapcache.c already
+fiddles with this; it would be straightforward to make it *set* the
+rlimit to what it thinks a sensible limit is.
+
+RLIMIT_NPROC limits total number of processes or threads.  QEMU uses
+threads for some devices, so this would require some thought.
+
+Other things that would take some cleverness / changes to QEMU to
+utilize due to ordering constrants:
+ - RLIMIT_NOFILES (after all necessary files are opened)
+
 ## libxl: Treat QMP connection as untrusted
 
 '''Description''': Currently libxl talks with QEMU via QMP; but its
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 06/11] libxl: Move qmp cleanup into devicemodel destroy function

2018-12-21 Thread George Dunlap
Removing the qmp connection is logically part of the device model
destruction; having the caller destroy it is a mild layering
violation.

Move libxl__qmp_cleanup() into libxl__destroy_device_model().  This
will make it easier when we make devicemodel destruction asynchronous.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c | 9 +++--
 tools/libxl/libxl_domain.c | 2 --
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index d73bbb6b06..450433452d 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2698,12 +2698,17 @@ out:
 
 int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid)
 {
+int rc;
 char *path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, "");
 if (!xs_rm(CTX->xsh, XBT_NULL, path))
 LOGD(ERROR, domid, "xs_rm failed for %s", path);
 /* We should try to destroy the device model anyway. */
-return kill_device_model(gc,
-GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
+rc = kill_device_model(gc,
+  GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
+
+libxl__qmp_cleanup(gc, domid);
+
+return rc;
 }
 
 /* Return 0 if no dm needed, 1 if needed and <0 if error. */
diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
index 3377bba994..d46b97dedf 100644
--- a/tools/libxl/libxl_domain.c
+++ b/tools/libxl/libxl_domain.c
@@ -1069,8 +1069,6 @@ void libxl__destroy_domid(libxl__egc *egc, 
libxl__destroy_domid_state *dis)
 if (dm_present) {
 if (libxl__destroy_device_model(gc, domid) < 0)
 LOGD(ERROR, domid, "libxl__destroy_device_model failed");
-
-libxl__qmp_cleanup(gc, domid);
 }
 dis->drs.ao = ao;
 dis->drs.domid = domid;
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  1   2   >