Re: [Xen-devel] [PATCH 13/25] argo: implement the register op
On Thu, Dec 20, 2018 at 11:28 PM Jan Beulich wrote: > > >>> On 21.12.18 at 02:25, wrote: > > On Thu, Dec 20, 2018 at 12:29 AM Jan Beulich wrote: > >> > >> >>> On 20.12.18 at 06:29, wrote: > >> > On Wed, Dec 12, 2018 at 1:48 AM Jan Beulich wrote: > >> >> > >> >> > +static int > >> >> > +argo_find_ring_mfns(struct domain *d, struct argo_ring_info > >> >> > *ring_info, > >> >> > +uint32_t npage, > >> >> > XEN_GUEST_HANDLE_PARAM(argo_pfn_t) pfn_hnd, > >> >> > +uint32_t len) > >> >> > +{ > >> >> > +int i; > >> >> > +int ret = 0; > >> >> > + > >> >> > +if ( (npage << PAGE_SHIFT) < len ) > >> >> > +return -EINVAL; > >> >> > + > >> >> > +if ( ring_info->mfns ) > >> >> > +{ > >> >> > +/* > >> >> > + * Ring already existed. Check if it's the same ring, > >> >> > + * i.e. same number of pages and all translated gpfns still > >> >> > + * translating to the same mfns > >> >> > + */ > >> >> > >> >> This comment makes me wonder whether the translations are > >> >> permitted to change at other times. If so I'm not sure what > >> >> value verification here has. If not, this probably would want to > >> >> be debugging-only code. > >> > > >> > My understanding is that the gfn->mfn translation is not necessarily > >> > stable > >> > across entry and exit from host power state S4, suspend to disk. > > Now I'm afraid there's some confusion here: Originally you've > said "host". > > >> How would that be? It's not stable across guest migration (or > >> its non-live save/restore equivalent), > > > > Right, that's clear. > > > >> but how would things change across S3? > > > > I don't think that they do change in that case. > > > > From studying the code involved above, a related item: the guest runs the > > same > > suspend and resume kernel code before entering into/exiting from either > > guest > > S3 or S4, so the guest kernel resume code needs to re-register the rings, to > > cover the case where it is coming up in an environment where they were > > dropped > > - so that's what it does. > > > > This relates to the code section above: if guest entry to S3 is aborted at > > the > > final step (eg. error or platform refuses, eg. maybe a physical device > > interaction with passthrough) then the hypervisor has not torn down the > > rings, > > the guest remains running within the same domain, and the guest resume logic > > runs, which runs through re-registration for all its rings. The check in the > > logic above allows the existing ring mappings within the hypervisor to be > > preserved. > > Yet now you suddenly talk about guest S3. Well, the context is that you did just ask about S3, without specifying host or guest. Host S3 doesn't involve much at all, so I went and studied the code in both the Linux driver and the hypervisor to determine what it does in the case of guest S3, and then replied with the above since it is relevant to the code in question. I hope I was clear about referring to guest S3 above in my last reply. That logic aims to make ring registration idempotent, to avoid the teardown of established mappings of the ring pages in the case where doing so isn't needed. > >> And there's no support for S4 (and I can't see it appearing any time soon). > > > > OK. oh well. > > Considering the original "host" context, my response here was > relating to host S4. Guest S4 ought to be functional (as being > mostly a guest kernel function anyway). ack. Christopher ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 13/25] argo: implement the register op
>>> On 21.12.18 at 09:16, wrote: > On Thu, Dec 20, 2018 at 11:28 PM Jan Beulich wrote: >> >> >>> On 21.12.18 at 02:25, wrote: >> > On Thu, Dec 20, 2018 at 12:29 AM Jan Beulich wrote: >> >> >> >> >>> On 20.12.18 at 06:29, wrote: >> >> > On Wed, Dec 12, 2018 at 1:48 AM Jan Beulich wrote: >> >> >> >> >> >> > +static int >> >> >> > +argo_find_ring_mfns(struct domain *d, struct argo_ring_info >> >> >> > *ring_info, >> >> >> > +uint32_t npage, >> >> >> > XEN_GUEST_HANDLE_PARAM(argo_pfn_t) pfn_hnd, >> >> >> > +uint32_t len) >> >> >> > +{ >> >> >> > +int i; >> >> >> > +int ret = 0; >> >> >> > + >> >> >> > +if ( (npage << PAGE_SHIFT) < len ) >> >> >> > +return -EINVAL; >> >> >> > + >> >> >> > +if ( ring_info->mfns ) >> >> >> > +{ >> >> >> > +/* >> >> >> > + * Ring already existed. Check if it's the same ring, >> >> >> > + * i.e. same number of pages and all translated gpfns still >> >> >> > + * translating to the same mfns >> >> >> > + */ >> >> >> >> >> >> This comment makes me wonder whether the translations are >> >> >> permitted to change at other times. If so I'm not sure what >> >> >> value verification here has. If not, this probably would want to >> >> >> be debugging-only code. >> >> > >> >> > My understanding is that the gfn->mfn translation is not necessarily >> >> > stable >> >> > across entry and exit from host power state S4, suspend to disk. Note this ^^^ (and see below). >> Now I'm afraid there's some confusion here: Originally you've >> said "host". >> >> >> How would that be? It's not stable across guest migration (or >> >> its non-live save/restore equivalent), >> > >> > Right, that's clear. >> > >> >> but how would things change across S3? >> > >> > I don't think that they do change in that case. >> > >> > From studying the code involved above, a related item: the guest runs the >> > same >> > suspend and resume kernel code before entering into/exiting from either >> > guest >> > S3 or S4, so the guest kernel resume code needs to re-register the rings, >> > to >> > cover the case where it is coming up in an environment where they were >> > dropped >> > - so that's what it does. >> > >> > This relates to the code section above: if guest entry to S3 is aborted at >> > the >> > final step (eg. error or platform refuses, eg. maybe a physical device >> > interaction with passthrough) then the hypervisor has not torn down the >> > rings, >> > the guest remains running within the same domain, and the guest resume >> > logic >> > runs, which runs through re-registration for all its rings. The check in >> > the >> > logic above allows the existing ring mappings within the hypervisor to be >> > preserved. >> >> Yet now you suddenly talk about guest S3. > > Well, the context is that you did just ask about S3, without > specifying host or guest. I'm sorry to be picky, but no, I don't think I did. You did expicitly say "host", making me further think only about that case. > Host S3 doesn't involve much at all, so I > went and studied the code in both the Linux driver and the hypervisor > to determine what it does in the case of guest S3, and then replied > with the above since it is relevant to the code in question. I hope I > was clear about referring to guest S3 above in my last reply. > > That logic aims to make ring registration idempotent, to avoid the > teardown of established mappings of the ring pages in the case where > doing so isn't needed. You treat complexity in the kernel for complexity in the hypervisor. I'm not sure this is appropriate, as I can't judge how much more difficult it would be for the guest to look after itself. But let's look at both cases again: - For guest S3, afaik, the domain doesn't change, and hence memory assignment remains the same. No re-registration necessary then afaict. - For guest S4, aiui, the domain gets destroyed and a new one built upon resume. Re-registration would be needed, but due to the domain re-construction no leftovers ought to exist in Xen. Hence to me it would seem more natural to have the guest deal with the situation, and have no extra logic for this in Xen. You want the guest to re-register anyway, yet simply avoiding to do so in the S3 case ought to be a single (or very few) conditional(s), i.e. not a whole lot of complexity. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 v2 1/8] xen/page_alloc: Move get_pg_owner()/put_pg_owner() from x86 to common code
>>> On 20.12.18 at 20:23, wrote: > From: Benjamin Sanda > > get_pg_owner() and put_pg_owner() will be necessary in a follow-up > commit to support xentrace on Arm. So move the helper to common code. > > Signed-off-by: Benjamin Sanda > [julien: Rework commit title / turn put_pg_owner to a macro] Nit: It's an inline function now. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] drm/xen-front: Make shmem backed display buffer coherent
On 12/20/18 8:38 PM, Christoph Hellwig wrote: On Thu, Dec 20, 2018 at 07:35:15PM +0100, Daniel Vetter wrote: Err, with streaming DMA buffer sharing is trivial. The coherent DMA allocator is what causes all kinds of horrible hacks that can't actually work on various platforms. Hm, I thought the streaming dma api is the one that causes bounce buffers and all that fun. If you're unlucky at least. Yes it may. But even if that happens everything will actually work, just slower. While the dma coherent API is simply broken. But if you don't want bounce buffering you need to use the dma noncoherent allocator as proposed here: https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031982.html which combines allocating memory that doesn't need to be bounce buffered with a sharing scheme that can actually work. So, the bottom line will be: I can use DMA API for what I need, but: 1. I need to remove GFP_USER 2. No need for DMA32 (so no chance for bouncing to step in) 3. I may need to check if mapping and unmapping of the buffer at once will also help, e.g. no need to have the buffer mapped until it is destroyed Did I get it all right? Thank you, Oleksandr ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 v2 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn
>>> On 20.12.18 at 20:23, wrote: > No functional change intended. > > Only reasonable clean-ups are done in this patch. The rest will use _gfn > for the time being. > > Signed-off-by: Julien Grall Acked-by: Jan Beulich ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [linux-4.19 test] 131447: regressions - FAIL
flight 131447 linux-4.19 real [real] http://logs.test-lab.xenproject.org/osstest/logs/131447/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-shadow7 xen-boot fail REGR. vs. 129313 test-amd64-amd64-pair10 xen-boot/src_hostfail REGR. vs. 129313 test-amd64-amd64-pair11 xen-boot/dst_hostfail REGR. vs. 129313 test-amd64-amd64-libvirt-pair 10 xen-boot/src_host fail REGR. vs. 129313 test-amd64-amd64-libvirt-pair 11 xen-boot/dst_host fail REGR. vs. 129313 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 129313 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 129313 test-amd64-amd64-i386-pvgrub 7 xen-boot fail REGR. vs. 129313 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-boot fail REGR. vs. 129313 test-amd64-i386-examine 8 reboot fail REGR. vs. 129313 test-amd64-amd64-amd64-pvgrub 7 xen-bootfail REGR. vs. 129313 test-amd64-i386-xl-raw7 xen-boot fail REGR. vs. 129313 test-amd64-i386-xl-qemuu-debianhvm-amd64 7 xen-boot fail REGR. vs. 129313 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 129313 test-amd64-i386-pair 10 xen-boot/src_hostfail REGR. vs. 129313 test-amd64-i386-pair 11 xen-boot/dst_hostfail REGR. vs. 129313 test-amd64-i386-libvirt-pair 10 xen-boot/src_hostfail REGR. vs. 129313 test-amd64-i386-libvirt-pair 11 xen-boot/dst_hostfail REGR. vs. 129313 test-amd64-i386-qemut-rhel6hvm-intel 7 xen-boot fail REGR. vs. 129313 test-amd64-amd64-examine 8 reboot fail REGR. vs. 129313 test-amd64-i386-qemut-rhel6hvm-amd 12 guest-start/redhat.repeat fail REGR. vs. 129313 Tests which are failing intermittently (not blocking): test-amd64-amd64-xl-qcow2 19 guest-start/debian.repeat fail in 131425 pass in 131447 test-armhf-armhf-xl-arndale 5 host-ping-check-native fail pass in 131425 Tests which did not succeed, but are not blocking: test-armhf-armhf-xl-arndale 13 migrate-support-check fail in 131425 never pass test-armhf-armhf-xl-arndale 14 saverestore-support-check fail in 131425 never pass test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 7 xen-boot fail never pass test-amd64-i386-xl-pvshim12 guest-start fail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail never pass test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail never pass test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail
Re: [Xen-devel] [PATCH v4 0/2] misc safety certification fixes
>>> On 20.12.18 at 18:26, wrote: > On 11/12/18 11:06 PM, Stefano Stabellini wrote: > Discussing with Stefano today, he is aiming to get this series for Xen > 4.12. I will be away until the x86/common code freeze. > > I agree with him that I will waive my ack if it gets reviewed by any > committers. Well, discussion on patch 2 was abandoned rather than finished afaict, which means Stefano either lost interest or is meaning to submit v5 with the comments addressed. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 0/3] x86/mm-locks: add a bias to control domain lock levels
Hello, The following series attempts to fix a mm lock level issue that prevents using paging_log_dirty_op from a paging Dom0 (like a PVH Dom0). The discussion that lead to this series can be found at: https://lists.xenproject.org/archives/html/xen-devel/2018-12/msg01197.html Roger Pau Monne (3): x86/mm-locks: remove trailing whitespace x86/mm-locks: convert some macros to inline functions x86/mm-locks: apply a bias to lock levels for control domain xen/arch/x86/mm/mm-locks.h | 217 ++--- xen/arch/x86/mm/p2m-pod.c | 5 +- 2 files changed, 130 insertions(+), 92 deletions(-) -- 2.20.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 2/3] x86/mm-locks: convert some macros to inline functions
And rename to have only one prefix underscore where applicable. No functional change. Signed-off-by: Roger Pau Monné Reviewed-by: George Dunlap --- Cc: George Dunlap Cc: Jan Beulich Cc: Andrew Cooper Cc: Wei Liu --- xen/arch/x86/mm/mm-locks.h | 96 -- 1 file changed, 51 insertions(+), 45 deletions(-) diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h index 64b8775a6d..d3497713e9 100644 --- a/xen/arch/x86/mm/mm-locks.h +++ b/xen/arch/x86/mm/mm-locks.h @@ -29,7 +29,6 @@ /* Per-CPU variable for enforcing the lock ordering */ DECLARE_PER_CPU(int, mm_lock_level); -#define __get_lock_level() (this_cpu(mm_lock_level)) DECLARE_PERCPU_RWLOCK_GLOBAL(p2m_percpu_rwlock); @@ -46,43 +45,47 @@ static inline int mm_locked_by_me(mm_lock_t *l) return (l->lock.recurse_cpu == current->processor); } +static inline int _get_lock_level(void) +{ +return this_cpu(mm_lock_level); +} + /* * If you see this crash, the numbers printed are order levels defined * in this file. */ -#define __check_lock_level(l) \ -do {\ -if ( unlikely(__get_lock_level() > (l)) ) \ -{ \ -printk("mm locking order violation: %i > %i\n", \ - __get_lock_level(), (l));\ -BUG(); \ -} \ -} while(0) - -#define __set_lock_level(l) \ -do {\ -__get_lock_level() = (l); \ -} while(0) +static inline void _check_lock_level(int l) +{ +if ( unlikely(_get_lock_level() > l) ) +{ +printk("mm locking order violation: %i > %i\n", _get_lock_level(), l); +BUG(); +} +} + +static inline void _set_lock_level(int l) +{ +this_cpu(mm_lock_level) = l; +} static inline void _mm_lock(mm_lock_t *l, const char *func, int level, int rec) { if ( !((mm_locked_by_me(l)) && rec) ) -__check_lock_level(level); +_check_lock_level(level); spin_lock_recursive(&l->lock); if ( l->lock.recurse_cnt == 1 ) { l->locker_function = func; -l->unlock_level = __get_lock_level(); +l->unlock_level = _get_lock_level(); } else if ( (unlikely(!rec)) ) panic("mm lock already held by %s\n", l->locker_function); -__set_lock_level(level); +_set_lock_level(level); } static inline void _mm_enforce_order_lock_pre(int level) { -__check_lock_level(level); +_check_lock_level(level); } static inline void _mm_enforce_order_lock_post(int level, int *unlock_level, @@ -92,12 +95,12 @@ static inline void _mm_enforce_order_lock_post(int level, int *unlock_level, { if ( (*recurse_count)++ == 0 ) { -*unlock_level = __get_lock_level(); +*unlock_level = _get_lock_level(); } } else { -*unlock_level = __get_lock_level(); +*unlock_level = _get_lock_level(); } -__set_lock_level(level); +_set_lock_level(level); } @@ -118,12 +121,12 @@ static inline void _mm_write_lock(mm_rwlock_t *l, const char *func, int level) { if ( !mm_write_locked_by_me(l) ) { -__check_lock_level(level); +_check_lock_level(level); percpu_write_lock(p2m_percpu_rwlock, &l->lock); l->locker = get_processor_id(); l->locker_function = func; -l->unlock_level = __get_lock_level(); -__set_lock_level(level); +l->unlock_level = _get_lock_level(); +_set_lock_level(level); } l->recurse_count++; } @@ -134,13 +137,13 @@ static inline void mm_write_unlock(mm_rwlock_t *l) return; l->locker = -1; l->locker_function = "nobody"; -__set_lock_level(l->unlock_level); +_set_lock_level(l->unlock_level); percpu_write_unlock(p2m_percpu_rwlock, &l->lock); } static inline void _mm_read_lock(mm_rwlock_t *l, int level) { -__check_lock_level(level); +_check_lock_level(level); percpu_read_lock(p2m_percpu_rwlock, &l->lock); /* There's nowhere to store the per-CPU unlock level so we can't * set the lock level. */ @@ -181,7 +184,7 @@ static inline void mm_unlock(mm_lock_t *l) if ( l->lock.recurse_cnt == 1 ) { l->locker_function = "nobody"; -__set_lock_level(l->unlock_level); +_set_lock_level(l->unlock_level); } spin_unlock_recursive(&l->lock); } @@ -194,10 +197,10 @@ static inline void mm_enforce_order_unlock(int unlock_level, BUG_ON(*recurse_count == 0); if ( (*recurse_count)-- == 1 ) { -__set_lock_level(unlock_level); +_set_lock_level(unlock_level); } } else { -__set_lock_level(unlock_level); +_set_lock_level(unlock_level); } } @@ -287,21 +290,24 @@ declare_
[Xen-devel] [PATCH v2 1/3] x86/mm-locks: remove trailing whitespace
No functional change. Signed-off-by: Roger Pau Monné Reviewed-by: George Dunlap --- Cc: George Dunlap Cc: Jan Beulich Cc: Andrew Cooper Cc: Wei Liu --- xen/arch/x86/mm/mm-locks.h | 24 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h index 95295b62d2..64b8775a6d 100644 --- a/xen/arch/x86/mm/mm-locks.h +++ b/xen/arch/x86/mm/mm-locks.h @@ -3,11 +3,11 @@ * * Spinlocks used by the code in arch/x86/mm. * - * Copyright (c) 2011 Citrix Systems, inc. + * Copyright (c) 2011 Citrix Systems, inc. * Copyright (c) 2007 Advanced Micro Devices (Wei Huang) * Copyright (c) 2006-2007 XenSource Inc. * Copyright (c) 2006 Michael A Fetterman - * + * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or @@ -41,7 +41,7 @@ static inline void mm_lock_init(mm_lock_t *l) l->unlock_level = 0; } -static inline int mm_locked_by_me(mm_lock_t *l) +static inline int mm_locked_by_me(mm_lock_t *l) { return (l->lock.recurse_cpu == current->processor); } @@ -67,7 +67,7 @@ do {\ static inline void _mm_lock(mm_lock_t *l, const char *func, int level, int rec) { -if ( !((mm_locked_by_me(l)) && rec) ) +if ( !((mm_locked_by_me(l)) && rec) ) __check_lock_level(level); spin_lock_recursive(&l->lock); if ( l->lock.recurse_cnt == 1 ) @@ -186,7 +186,7 @@ static inline void mm_unlock(mm_lock_t *l) spin_unlock_recursive(&l->lock); } -static inline void mm_enforce_order_unlock(int unlock_level, +static inline void mm_enforce_order_unlock(int unlock_level, unsigned short *recurse_count) { if ( recurse_count ) @@ -310,7 +310,7 @@ declare_mm_rwlock(altp2m); #define gfn_locked_by_me(p,g) p2m_locked_by_me(p) /* PoD lock (per-p2m-table) - * + * * Protects private PoD data structs: entry and cache * counts, page lists, sweep parameters. */ @@ -322,7 +322,7 @@ declare_mm_lock(pod) /* Page alloc lock (per-domain) * - * This is an external lock, not represented by an mm_lock_t. However, + * This is an external lock, not represented by an mm_lock_t. However, * pod code uses it in conjunction with the p2m lock, and expecting * the ordering which we enforce here. * The lock is not recursive. */ @@ -338,13 +338,13 @@ declare_mm_order_constraint(page_alloc) * For shadow pagetables, this lock protects * - all changes to shadow page table pages * - the shadow hash table - * - the shadow page allocator + * - the shadow page allocator * - all changes to guest page table pages * - all changes to the page_info->tlbflush_timestamp - * - the page_info->count fields on shadow pages - * - * For HAP, it protects the NPT/EPT tables and mode changes. - * + * - the page_info->count fields on shadow pages + * + * For HAP, it protects the NPT/EPT tables and mode changes. + * * It also protects the log-dirty bitmap from concurrent accesses (and * teardowns, etc). */ -- 2.20.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 3/3] x86/mm-locks: apply a bias to lock levels for control domain
paging_log_dirty_op function takes mm locks from a subject domain and then attempts to perform copy to operations against the caller domain in order to copy the result of the hypercall into the caller provided buffer. This works fine when the caller is a non-paging domain, but triggers a lock order panic when the caller is a paging domain due to the fact that at the point where the copy to operation is performed the subject domain paging lock is locked, and the copy operation requires locking the caller p2m lock which has a lower level. Fix this limitation by adding a bias to the level of control domain mm locks, so that the lower control domain mm lock always has a level greater than the higher unprivileged domain lock level. This allows locking the subject domain mm locks and then locking the control domain mm locks, while keeping the same lock ordering and the changes mostly confined to mm-locks.h. Note that so far only this flow (locking a subject domain locks and then the control domain ones) has been identified, but not all possible code paths have been inspected. Hence this solution attempts to be a non-intrusive fix for the problem at hand, without discarding further changes in the future if other valid code paths are found that require more complex lock level ordering. Signed-off-by: Roger Pau Monné --- Cc: George Dunlap Cc: Jan Beulich Cc: Andrew Cooper Cc: Wei Liu Cc: Tim Deegan --- Changes since v1: - Boost only control domain mm lock levels instead of the caller. --- xen/arch/x86/mm/mm-locks.h | 119 +++-- xen/arch/x86/mm/p2m-pod.c | 5 +- 2 files changed, 78 insertions(+), 46 deletions(-) diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h index d3497713e9..d6c073dc5c 100644 --- a/xen/arch/x86/mm/mm-locks.h +++ b/xen/arch/x86/mm/mm-locks.h @@ -50,15 +50,35 @@ static inline int _get_lock_level(void) return this_cpu(mm_lock_level); } +#define MM_LOCK_ORDER_MAX64 +/* + * Return the lock level taking the domain bias into account. If the domain is + * privileged a bias of MM_LOCK_ORDER_MAX is applied to the lock level, so that + * mm locks that belong to a control domain can be acquired after having + * acquired mm locks of an unprivileged domain. + * + * This is required in order to use some hypercalls from a paging domain that + * take locks of a subject domain and then attempt to copy data to/from the + * caller domain. + */ +static inline int _lock_level(const struct domain *d, int l) +{ +ASSERT(l <= MM_LOCK_ORDER_MAX); + +return l + (d && is_control_domain(d) ? MM_LOCK_ORDER_MAX : 0); +} + /* * If you see this crash, the numbers printed are order levels defined * in this file. */ -static inline void _check_lock_level(int l) +static inline void _check_lock_level(const struct domain *d, int l) { -if ( unlikely(_get_lock_level() > l) ) +int lvl = _lock_level(d, l); + +if ( unlikely(_get_lock_level() > lvl) ) { -printk("mm locking order violation: %i > %i\n", _get_lock_level(), l); +printk("mm locking order violation: %i > %i\n", _get_lock_level(), lvl); BUG(); } } @@ -68,10 +88,11 @@ static inline void _set_lock_level(int l) this_cpu(mm_lock_level) = l; } -static inline void _mm_lock(mm_lock_t *l, const char *func, int level, int rec) +static inline void _mm_lock(const struct domain *d, mm_lock_t *l, +const char *func, int level, int rec) { if ( !((mm_locked_by_me(l)) && rec) ) -_check_lock_level(level); +_check_lock_level(d, level); spin_lock_recursive(&l->lock); if ( l->lock.recurse_cnt == 1 ) { @@ -80,16 +101,17 @@ static inline void _mm_lock(mm_lock_t *l, const char *func, int level, int rec) } else if ( (unlikely(!rec)) ) panic("mm lock already held by %s\n", l->locker_function); -_set_lock_level(level); +_set_lock_level(_lock_level(d, level)); } -static inline void _mm_enforce_order_lock_pre(int level) +static inline void _mm_enforce_order_lock_pre(const struct domain *d, int level) { -_check_lock_level(level); +_check_lock_level(d, level); } -static inline void _mm_enforce_order_lock_post(int level, int *unlock_level, -unsigned short *recurse_count) +static inline void _mm_enforce_order_lock_post(const struct domain *d, int level, + int *unlock_level, + unsigned short *recurse_count) { if ( recurse_count ) { @@ -100,7 +122,7 @@ static inline void _mm_enforce_order_lock_post(int level, int *unlock_level, } else { *unlock_level = _get_lock_level(); } -_set_lock_level(level); +_set_lock_level(_lock_level(d, level)); } @@ -117,16 +139,17 @@ static inline int mm_write_locked_by_me(mm_rwlock_t *l) return (l->locker == get_processor_id());
[Xen-devel] [xen-unstable-smoke test] 131486: tolerable all pass - PUSHED
flight 131486 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/131486/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass version targeted for testing: xen 396d8d5418ea908a5ef88e7d7a9f22c70ada44c2 baseline version: xen 7183e86a29c3fe15078eb0b8c11d3e556c22effa Last test of basis 131445 2018-12-19 14:00:43 Z1 days Testing same since 131486 2018-12-21 08:00:41 Z0 days1 attempts People who touched revisions under test: Andrew Cooper Jan Beulich Roger Pau Monné jobs: build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 7183e86a29..396d8d5418 396d8d5418ea908a5ef88e7d7a9f22c70ada44c2 -> smoke ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 0/3] x86/mm-locks: add a bias to control domain lock levels
>>> On 21.12.18 at 10:41, wrote: > Hello, > > The following series attempts to fix a mm lock level issue that prevents > using paging_log_dirty_op from a paging Dom0 (like a PVH Dom0). The > discussion that lead to this series can be found at: > > https://lists.xenproject.org/archives/html/xen-devel/2018-12/msg01197.html > > Roger Pau Monne (3): > x86/mm-locks: remove trailing whitespace > x86/mm-locks: convert some macros to inline functions You could have checked the tree before sending - I've committed these two in the morning. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [ovmf test] 131470: regressions - FAIL
flight 131470 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/131470/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-i3866 xen-buildfail REGR. vs. 129475 build-i386-xsm6 xen-buildfail REGR. vs. 129475 build-amd64-xsm 6 xen-buildfail REGR. vs. 129475 build-amd64 6 xen-buildfail REGR. vs. 129475 Tests which did not succeed, but are not blocking: build-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a build-i386-libvirt1 build-check(1) blocked n/a version targeted for testing: ovmf 5a9b3eb8e5fbbc0a49f80630039d58712aacfab8 baseline version: ovmf 5ae3184d8c59f7bbb84bad482df6b8020ba58188 Last test of basis 129475 2018-11-05 21:13:11 Z 45 days Failing since129526 2018-11-06 20:49:26 Z 44 days 162 attempts Testing same since 131470 2018-12-20 21:57:41 Z0 days1 attempts People who touched revisions under test: Achin Gupta Ard Biesheuvel Bob Feng bob.c.f...@intel.com BobCF Chasel Chiu Chasel, Chiu Chen A Chen Dandan Bi David Wei Derek Lin Eric Dong Feng, Bob C Fu Siyuan Gary Lin Hao Wu Jaben Carsey Jeff Brasen Jian J Wang Jiaxin Wu Jiewen Yao Laszlo Ersek Leif Lindholm Liming Gao Liu Yu Marc Zyngier Marcin Wojtas Ming Huang Pedroa Liu Ruiyu Ni shenglei Shenglei Zhang Star Zeng Sughosh Ganu Sumit Garg Sun, Zailiang Thomas Abraham Ting Ye Tomasz Michalec Vijayenthiran Subramaniam Vladimir Olovyannikov Wang BinX A Wu Jiaxin Ye Ting Yonghong Zhu yuchenlin Zailiang Sun Zhang, Chao B Zhao, ZhiqiangX Zhiju.Fan zhijufan ZhiqiangX Zhao zwei4 jobs: build-amd64-xsm fail build-i386-xsm fail build-amd64 fail build-i386 fail build-amd64-libvirt blocked build-i386-libvirt blocked build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked test-amd64-i386-xl-qemuu-ovmf-amd64 blocked sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. (No revision log; it would be 4265 lines long.) ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 03/15] x86/cpu/vpmu: Add Hygon Dhyana support for vPMU
On 2018/12/20 22:25, Boris Ostrovsky wrote: ... >> diff --git a/xen/arch/x86/cpu/vpmu_amd.c b/xen/arch/x86/cpu/vpmu_amd.c >> index 5efc39b..e9f0a5c 100644 >> --- a/xen/arch/x86/cpu/vpmu_amd.c >> +++ b/xen/arch/x86/cpu/vpmu_amd.c >> @@ -554,6 +554,8 @@ int __init amd_vpmu_init(void) >> case 0x12: >> case 0x14: >> case 0x16: >> +case 0x17: >> +case 0x18: > > > This also enables VPMU support for Zen which goes beyond what the > commit message claims to do. Sorry for the not clear commit message. Will add modification description in the commit message and make the changes complete. On the other hand, since current Xen vPMU still not support Zen. so in this patch we enable 0x17 support. If this modification is not preferred, will remove AMD Xen 0x17 support in next version. > > Also, why are you choosing to use legacy MSRs (and you did the same in > Linux)? Doesn't Zen (which you are saying is similar to Hygon) support > c001_020X bank? In Linux, the Xen PMU driver use the default branch cases, which also use the legacy MSRs way. So we choose to follow legacy MSRs here in Dhyana cases. Since both of Zen and Dhyana support C001_020X MSRs. If use the C001_020X is preferred, we will try to modify the related codes and create a patch. Also the Linux Xen PMU driver may need to be updated to use these MSRs. -- Regards, Pu Wen ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4 1/3] xen/pt: fix some pass-thru devices don't work across reboot
>>> On 20.12.18 at 16:29, wrote: > --- a/xen/drivers/passthrough/pci.c > +++ b/xen/drivers/passthrough/pci.c > @@ -1514,6 +1514,55 @@ static int assign_device(struct domain *d, u16 seg, u8 > bus, u8 devfn, u32 flag) > return rc; > } > > +/* > + * Unmap established mappings between domain's pirq and device's MSI. > + * These mappings were set up by qemu/guest and are expected to be > + * destroyed when changing the device's ownership. > + */ > +static void pci_unmap_msi(struct pci_dev *pdev) > +{ > +struct msi_desc *entry; > +struct domain *d = pdev->domain; > + > +ASSERT(pcidevs_locked()); > + > +if ( !d ) > +return; Why? deassign_device() (the only caller) ought to guarantee this, due to its use of pci_get_pdev_by_domain(). I think this simply wants to be another ASSERT(), if anything at all. > +spin_lock(&d->event_lock); > +while ( (entry = list_first_entry_or_null(&pdev->msi_list, > + struct msi_desc, list)) != > NULL ) > +{ > +struct pirq *info; > +int pirq = 0; > +unsigned int nr = entry->msi_attrib.type != PCI_CAP_ID_MSIX > + ? entry->msi.nvec : 1; > + > +while ( nr -- ) Stray blank. > +{ > +struct hvm_pirq_dpci *pirq_dpci; > + > +pirq = domain_irq_to_pirq(d, entry[nr].irq); > +WARN_ON(pirq < 0); > +if ( pirq <= 0 ) > +continue; > + > +info = pirq_info(d, pirq); > +if ( !info ) > +continue; > + > +pirq_dpci = pirq_dpci(info); > +if ( pirq_dpci && > + (pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI) && > + (pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI) ) > +pt_irq_destroy_bind_msi(d, info); > +} > +if ( pirq > 0 ) > +unmap_domain_pirq(d, pirq); Can you guarantee that this function won't fail? Because if it does, I think you might end up in an infinite loop, because the entry doesn't always get removed from the list in error cases. Maybe unmap_domain_pirq() needs a "force" mode added, perhaps indirectly by way of passing "entry" into it (all other callers would pass NULL). But then again I'm still not fully convinced that a hypervisor change is the right course of action here in the first place. It would be better if the hypervisor had to just verify that all IRQ mappings are gone, or else fail the de-assignment of the device. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4 3/3] xen/pt: initialize 'warned' field of arch_msix properly
>>> On 20.12.18 at 16:29, wrote: > --- a/xen/include/asm-x86/msi.h > +++ b/xen/include/asm-x86/msi.h > @@ -252,5 +252,10 @@ void guest_mask_msi_irq(struct irq_desc *, bool mask); > void ack_nonmaskable_msi_irq(struct irq_desc *); > void end_nonmaskable_msi_irq(struct irq_desc *, u8 vector); > void set_msi_affinity(struct irq_desc *, const cpumask_t *); > +static inline void init_arch_msix(struct arch_msix *msix) > +{ > +spin_lock_init(&msix->table_lock); > +msix->warned = DOMID_INVALID; > +} I think this would better sit next to the structure definition, i.e. a few lines up. In any event a separating blank line needs adding. With that Reviewed-by: Jan Beulich Additionally perhaps arch_init_msix() would better fit our general naming of arch-specific functions. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 01/15] x86/cpu: Create Hygon Dhyana architecture support file
On 20/12/2018 13:12, Pu Wen wrote: > diff --git a/xen/arch/x86/cpu/hygon.c b/xen/arch/x86/cpu/hygon.c > new file mode 100644 > index 000..0728b4a > --- /dev/null > +++ b/xen/arch/x86/cpu/hygon.c > @@ -0,0 +1,296 @@ > +#include > +#include > +#include > +#include > + > +#include "cpu.h" > + > +static unsigned int __initdata opt_cpuid_mask_l7s0_eax = ~0u; > +integer_param("cpuid_mask_l7s0_eax", opt_cpuid_mask_l7s0_eax); > +static unsigned int __initdata opt_cpuid_mask_l7s0_ebx = ~0u; > +integer_param("cpuid_mask_l7s0_ebx", opt_cpuid_mask_l7s0_ebx); These should be moved from the AMD specific code into the common cpu code (alongside the other masks) rather than duplicated here. > + > +static inline int rdmsr_hygon_safe(unsigned int msr, unsigned int *lo, > + unsigned int *hi) > +{ > + int err; > + > + asm volatile("1: rdmsr\n2:\n" > + ".section .fixup,\"ax\"\n" > + "3: movl %6,%2\n" > + " jmp 2b\n" > + ".previous\n" > + _ASM_EXTABLE(1b, 3b) > + : "=a" (*lo), "=d" (*hi), "=r" (err) > + : "c" (msr), "D" (0x9c5a203a), "2" (0), "i" (-EFAULT)); These rdmsr/wrmsr helpers with a password in %edi are only used in the K8 processors. Since Hygon is a Zen derivative, you shouldn't need any of these. > + > + return err; > +} > + > +static inline int wrmsr_hygon_safe(unsigned int msr, unsigned int lo, > + unsigned int hi) > +{ > + int err; > + > + asm volatile("1: wrmsr\n2:\n" > + ".section .fixup,\"ax\"\n" > + "3: movl %6,%0\n" > + " jmp 2b\n" > + ".previous\n" > + _ASM_EXTABLE(1b, 3b) > + : "=r" (err) > + : "c" (msr), "a" (lo), "d" (hi), "D" (0x9c5a203a), > +"0" (0), "i" (-EFAULT)); > + > + return err; > +} > + > +static void wrmsr_hygon(unsigned int msr, uint64_t val) > +{ > + asm volatile("wrmsr" :: > + "c" (msr), "a" ((uint32_t)val), > + "d" (val >> 32), "D" (0x9c5a203a)); > +} > + > +/* > + * Sets caps in expected_levelling_cap, probes for the specified mask MSR, > and > + * set caps in levelling_caps if it is found. Returns the default value. > + */ > +static uint64_t __init _probe_mask_msr(unsigned int msr, uint64_t caps) > +{ > + unsigned int hi, lo; > + > + expected_levelling_cap |= caps; > + > + if ((rdmsr_hygon_safe(msr, &lo, &hi) == 0) && > + (wrmsr_hygon_safe(msr, lo, hi) == 0)) > + levelling_caps |= caps; > + > + return ((uint64_t)hi << 32) | lo; > +} > + > +/* Probe for the existance of the expected masking MSRs. */ > +static void __init noinline probe_masking_msrs(void) > +{ > + const struct cpuinfo_x86 *c = &boot_cpu_data; > + > + /* Work out which masking MSRs we should have. */ > + cpuidmask_defaults._1cd = > + _probe_mask_msr(MSR_K8_FEATURE_MASK, LCAP_1cd); > + cpuidmask_defaults.e1cd = > + _probe_mask_msr(MSR_K8_EXT_FEATURE_MASK, LCAP_e1cd); > + if (c->cpuid_level >= 7) > + cpuidmask_defaults._7ab0 = > + _probe_mask_msr(MSR_AMD_L7S0_FEATURE_MASK, LCAP_7ab0); > +} > + > +/* > + * Context switch CPUID masking state to the next domain. Only called if > + * CPUID Faulting isn't available, but masking MSRs have been detected. A > + * parameter of NULL is used to context switch to the default host state (by > + * the cpu bringup-code, crash path, etc). > + */ > +static void hygon_ctxt_switch_masking(const struct vcpu *next) > +{ > + struct cpuidmasks *these_masks = &this_cpu(cpuidmasks); > + const struct domain *nextd = next ? next->domain : NULL; > + const struct cpuidmasks *masks = > + (nextd && is_pv_domain(nextd) && nextd->arch.pv.cpuidmasks) > + ? nextd->arch.pv.cpuidmasks : &cpuidmask_defaults; > + > + if ((levelling_caps & LCAP_1cd) == LCAP_1cd) { > + uint64_t val = masks->_1cd; > + > + /* > + * OSXSAVE defaults to 1, which causes fast-forwarding of > + * Xen's real setting. Clobber it if disabled by the guest > + * kernel. > + */ > + if (next && is_pv_vcpu(next) && !is_idle_vcpu(next) && > + !(next->arch.pv.ctrlreg[4] & X86_CR4_OSXSAVE)) > + val &= ~((uint64_t)cpufeat_mask(X86_FEATURE_OSXSAVE) << > 32); > + > + if (unlikely(these_masks->_1cd != val)) { > + wrmsr_hygon(MSR_K8_FEATURE_MASK, val); > + these_masks->_1cd = val; > + } > + } > + > +#define LAZY(cap, msr, field) > \ > + ({ \ > + if (unlikely(these_masks->field != masks->field) && \
Re: [Xen-devel] [PATCH v4 1/3] xen/pt: fix some pass-thru devices don't work across reboot
On Fri, Dec 21, 2018 at 03:13:50AM -0700, Jan Beulich wrote: > But then again I'm still not fully convinced that a hypervisor > change is the right course of action here in the first place. It > would be better if the hypervisor had to just verify that all > IRQ mappings are gone, or else fail the de-assignment of the > device. The only component (except the hypervisor) that knows about such assignments is QEMU, and in the case of a QEMU crash the host would be left with a device that cannot be de-assigned, because the information about the PIRQ bindings in lost due to the QEMU crash. IMO Xen needs to be capable of cleaning any bindings and mappings done by the toolstack or the device model in order to be able to correctly recover from a device model or toolstack crash. Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4 1/3] xen/pt: fix some pass-thru devices don't work across reboot
>>> On 21.12.18 at 11:26, wrote: > On Fri, Dec 21, 2018 at 03:13:50AM -0700, Jan Beulich wrote: >> But then again I'm still not fully convinced that a hypervisor >> change is the right course of action here in the first place. It >> would be better if the hypervisor had to just verify that all >> IRQ mappings are gone, or else fail the de-assignment of the >> device. > > The only component (except the hypervisor) that knows about such > assignments is QEMU, and in the case of a QEMU crash the host would be > left with a device that cannot be de-assigned, because the information > about the PIRQ bindings in lost due to the QEMU crash. > > IMO Xen needs to be capable of cleaning any bindings and mappings done > by the toolstack or the device model in order to be able to correctly > recover from a device model or toolstack crash. But possibly with tool stack assistance: Rather than doing it (in a potentially fragile way, as per my other comments) as an integral part of deassign-device, it could be a separate domctl to be issued first. Or otherwise failure here ought to lead to failure of deassign-device, rather than (e.g.) an infinite loop. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4 0/2] misc safety certification fixes
Hi, On 21/12/2018 09:27, Jan Beulich wrote: On 20.12.18 at 18:26, wrote: On 11/12/18 11:06 PM, Stefano Stabellini wrote: Discussing with Stefano today, he is aiming to get this series for Xen 4.12. I will be away until the x86/common code freeze. I agree with him that I will waive my ack if it gets reviewed by any committers. Well, discussion on patch 2 was abandoned rather than finished afaict, which means Stefano either lost interest or is meaning to submit v5 with the comments addressed. He is planning to send a new version while I am on holidays. My request applies for any new version of this providing it gets reviewed by any committers. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4 3/3] xen/pt: initialize 'warned' field of arch_msix properly
On Fri, Dec 21, 2018 at 03:17:30AM -0700, Jan Beulich wrote: > >>> On 20.12.18 at 16:29, wrote: > > --- a/xen/include/asm-x86/msi.h > > +++ b/xen/include/asm-x86/msi.h > > @@ -252,5 +252,10 @@ void guest_mask_msi_irq(struct irq_desc *, bool mask); > > void ack_nonmaskable_msi_irq(struct irq_desc *); > > void end_nonmaskable_msi_irq(struct irq_desc *, u8 vector); > > void set_msi_affinity(struct irq_desc *, const cpumask_t *); > > +static inline void init_arch_msix(struct arch_msix *msix) > > +{ > > +spin_lock_init(&msix->table_lock); > > +msix->warned = DOMID_INVALID; > > +} > > I think this would better sit next to the structure definition, > i.e. a few lines up. In any event a separating blank line > needs adding. With that > Reviewed-by: Jan Beulich > > Additionally perhaps arch_init_msix() would better fit our > general naming of arch-specific functions. With Jan's comments addresses you can also add my Reviewed-by: Roger Pau Monné Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 v2 2/8] xen/arm: p2m: Introduce p2m_get_page_from_gfn
Hi Stefano, On 20/12/2018 22:56, Stefano Stabellini wrote: On Thu, 20 Dec 2018, Julien Grall wrote: In a follow-up patches, we will need to handle get_page_from_gfn ^ remove a I have removed the "es" from "patches" and keep the "a" instead. Aside from that: Reviewed-by: Stefano Stabellini Thank you for the review! Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [linux-4.4 test] 131448: tolerable FAIL - PUSHED
flight 131448 linux-4.4 real [real] http://logs.test-lab.xenproject.org/osstest/logs/131448/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-armhf-armhf-xl 16 guest-start/debian.repeatfail like 131384 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass test-amd64-i386-xl-pvshim12 guest-start fail never pass test-amd64-amd64-xl-pvhv2-amd 12 guest-start fail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail never pass test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass version targeted for testing: linuxd3c67a52a66ba2d44bcf1b8262609148c7c73113 baseline version: linux640f85865ca658ae07d485693a3d452bdbbadaba Last test of basis 131384 2018-12-17 08:40:03 Z4 days Testing same since 131426 2018-12-18 08:18:16 Z3 days2 attempts People who touched revisions under test: Aaro Koskinen Al Viro Alexei Starovoitov Alexei Starovoitov Anders Roxell Andrew Bowers Andrew Morton Arnd Bergmann Ashok Raj Ben Hutchings Borislav Petkov Catalin Marinas Christian König Christoph Hellwig Christoph Paasch Colin Ian King Dan Williams Daniel Borkmann Dave Airlie David Howells David Matlack David S. Miller David Sterba David Woodhouse Eric Dumazet Felipe Balbi Filipe Manana Florian Fainelli Greg Kroah-Hartman Guenter Roeck Heiner Kallweit Hillf Danton Huacai Chen Ingo Molnar Janusz Krzysztofik Jarkko Nikula Jeff Kirsher Jens Axboe Jes
Re: [Xen-devel] [PATCH for-4.12 v2 4/8] xen/arm: Add support for read-only foreign mappings
Hi, On 20/12/2018 23:35, Stefano Stabellini wrote: On Thu, 20 Dec 2018, Julien Grall wrote: (Sorry for the formatting) On Thu, 20 Dec 2018 at 23:08, Stefano Stabellini wrote: On Thu, 20 Dec 2018, Julien Grall wrote: > Current, foreign mappings can only be read-write. A follow-up patch will > extend foreign mapping for Xen backend memory (via XEN_DOMID), some of > that memory should only be read accessible for the mapping domain. > > Introduce a new p2m_type to cater read-only foreign mappings. For now, > the decision between the two foreign mapping type is based on the type > of the guest page mapped. > > Signed-off-by: Julien Grall > Reviewed-by: Andrii Anisov > > --- > > Changes in v2: > - Add Andrii's reviewed-by > --- > xen/arch/arm/mm.c | 2 +- > xen/arch/arm/p2m.c | 1 + > xen/include/asm-arm/p2m.h | 42 +++--- > 3 files changed, 41 insertions(+), 4 deletions(-) > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c > index 7193d83b44..58f7e54640 100644 > --- a/xen/arch/arm/mm.c > +++ b/xen/arch/arm/mm.c > @@ -1267,7 +1267,7 @@ int xenmem_add_to_physmap_one( > } > > mfn = page_to_mfn(page); > - t = p2m_map_foreign_rw; > + t = (p2mt == p2m_ram_rw) ? p2m_map_foreign_rw : p2m_map_foreign_ro; I know there is a p2m_is_ram check close by, but I think it would still be better to do: if (p2mt == p2m_ram_rw) t = p2m_map_foreign_rw; else if (p2mt == p2m_ram_ro) t = p2m_map_foreign_ro; else error to avoid cases where p2mt is something completely different (p2m_mmio_direct_dev for instance) and t gets set to p2m_map_foreign_ro by mistake. The case you suggest is impossible. You can only be here if you manage to get a reference on the page (e.g p2m_foreign or p2m_ram). The check above remove the foreign types. But if you ever get here there are not much harm done as it would be read-only. The extras 5 lines of code are just not worth it. I realize the case is impossible today, it was for clarity and for future proof-ness. You can reduce line code count by combining it with the p2m_is_ram check above: diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index 49d7a76..01ae2cc 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -1259,7 +1259,9 @@ int xenmem_add_to_physmap_one( return -EINVAL; } -if ( !p2m_is_ram(p2mt) ) +if ( p2m_is_ram(p2mt) ) +t = (p2mt == p2m_ram_rw) ? p2m_map_foreign_rw : p2m_map_foreign_ro; +else { put_page(page); put_pg_owner(od); @@ -1267,7 +1269,6 @@ int xenmem_add_to_physmap_one( } mfn = page_to_mfn(page); -t = (p2mt == p2m_ram_rw) ? p2m_map_foreign_rw : p2m_map_foreign_ro; put_pg_owner(od); break; That's a better solution. I will update the patch. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 v2 1/8] xen/page_alloc: Move get_pg_owner()/put_pg_owner() from x86 to common code
Hi, On 20/12/2018 22:59, Andrew Cooper wrote: On 20/12/2018 22:53, Stefano Stabellini wrote: diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index 2c6509e3a0..edb93b8ada 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -2509,6 +2510,43 @@ static __init int register_heap_trigger(void) } __initcall(register_heap_trigger); +struct domain *get_pg_owner(domid_t domid) +{ +struct domain *pg_owner = NULL, *curr = current->domain; + +if ( likely(domid == DOMID_SELF) ) +{ +pg_owner = rcu_lock_current_domain(); +goto out; +} + +if ( unlikely(domid == curr->domain_id) ) +{ +gdprintk(XENLOG_WARNING, "Cannot specify itself as foreign domain\n"); +goto out; +} + +switch ( domid ) +{ +case DOMID_IO: +pg_owner = rcu_lock_domain(dom_io); +break; Newline. +case DOMID_XEN: +pg_owner = rcu_lock_domain(dom_xen); +break; Newline. +default: +if ( (pg_owner = rcu_lock_domain_by_id(domid)) == NULL ) +{ +gdprintk(XENLOG_WARNING, "Unknown domain d%d\n", domid); +break; +} +break; if ( (pg_owner = rcu_lock_domain_by_id(domid)) == NULL ) gdprintk(XENLOG_WARNING, "Unknown domain d%d\n", domid); break; All trivial, so Reviewed-by: Andrew Cooper and please fix on commit. Thank you! I have now committed this patch. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 v2 4/8] xen/arm: Add support for read-only foreign mappings
Hi, On 20/12/2018 19:23, Julien Grall wrote: Current, foreign mappings can only be read-write. A follow-up patch will extend foreign mapping for Xen backend memory (via XEN_DOMID), some of that memory should only be read accessible for the mapping domain. Introduce a new p2m_type to cater read-only foreign mappings. For now, the decision between the two foreign mapping type is based on the type of the guest page mapped. Signed-off-by: Julien Grall Reviewed-by: Andrii Anisov --- Changes in v2: - Add Andrii's reviewed-by --- xen/arch/arm/mm.c | 2 +- xen/arch/arm/p2m.c| 1 + xen/include/asm-arm/p2m.h | 42 +++--- 3 files changed, 41 insertions(+), 4 deletions(-) diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index 7193d83b44..58f7e54640 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -1267,7 +1267,7 @@ int xenmem_add_to_physmap_one( } mfn = page_to_mfn(page); -t = p2m_map_foreign_rw; +t = (p2mt == p2m_ram_rw) ? p2m_map_foreign_rw : p2m_map_foreign_ro; rcu_unlock_domain(od); break; diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c index e0b84a9db5..dea04ef66f 100644 --- a/xen/arch/arm/p2m.c +++ b/xen/arch/arm/p2m.c @@ -477,6 +477,7 @@ static void p2m_set_permission(lpae_t *e, p2m_type_t t, p2m_access_t a) break; case p2m_iommu_map_ro: +case p2m_map_foreign_ro: case p2m_grant_map_ro: case p2m_invalid: e->p2m.xn = 1; diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h index a1aef7b793..6f2728e2bb 100644 --- a/xen/include/asm-arm/p2m.h +++ b/xen/include/asm-arm/p2m.h @@ -116,6 +116,7 @@ typedef enum { p2m_mmio_direct_nc, /* Read/write mapping of genuine MMIO area non-cacheable */ p2m_mmio_direct_c, /* Read/write mapping of genuine MMIO area cacheable */ p2m_map_foreign_rw, /* Read/write RAM pages from foreign domain */ +p2m_map_foreign_ro, /* Read-only RAM pages from foreign domain */ p2m_grant_map_rw, /* Read/write grant mapping */ p2m_grant_map_ro, /* Read-only grant mapping */ /* The types below are only used to decide the page attribute in the P2M */ @@ -135,12 +136,16 @@ typedef enum { #define P2M_GRANT_TYPES (p2m_to_mask(p2m_grant_map_rw) | \ p2m_to_mask(p2m_grant_map_ro)) +/* Foreign mappings types */ +#define P2M_FOREIGN_TYPES (p2m_to_mask(p2m_map_foreign_rw) | \ + p2m_to_mask(p2m_map_foreign_ro)) + /* Useful predicates */ #define p2m_is_ram(_t) (p2m_to_mask(_t) & P2M_RAM_TYPES) -#define p2m_is_foreign(_t) (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign_rw)) +#define p2m_is_foreign(_t) (p2m_to_mask(_t) & P2M_FOREIGN_TYPES) #define p2m_is_any_ram(_t) (p2m_to_mask(_t) & \ (P2M_RAM_TYPES | P2M_GRANT_TYPES | \ - p2m_to_mask(p2m_map_foreign_rw))) + P2M_FOREIGN_TYPES)) /* All common type definitions should live ahead of this inclusion. */ #ifdef _XEN_P2M_COMMON_H @@ -295,7 +300,38 @@ struct page_info *p2m_get_page_from_gfn(struct domain *d, gfn_t gfn, static inline struct page_info *get_page_from_gfn( struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q) { Something has gone wrong with this patch. The chunk below should be in a separate patch. I will split this patch in two. -return p2m_get_page_from_gfn(d, _gfn(gfn), t); +mfn_t mfn; +p2m_type_t _t; +struct page_info *page; + +/* + * Special case for DOMID_XEN as it is the only domain so far that is + * not auto-translated. + */ +if ( unlikely(d != dom_xen) ) +return p2m_get_page_from_gfn(d, _gfn(gfn), t); + +if ( !t ) +t = &_t; + +*t = p2m_invalid; + +/* + * DOMID_XEN see 1-1 RAM. The p2m_type is based on the type of the + * page. + */ +mfn = _mfn(gfn); +page = mfn_to_page(mfn); + +if ( !mfn_valid(mfn) || !get_page(page, d) ) +return NULL; + +if ( page->u.inuse.type_info & PGT_writable_page ) +*t = p2m_ram_rw; +else +*t = p2m_ram_ro; + +return page; } int get_page_type(struct page_info *page, unsigned long type); Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/vtx: Improvements to ept= command line handling
>>> On 20.12.18 at 18:16, wrote: > Switch parse_ept_param() to use the parse_boolean() infrastructure for more > consistency with related command line parameters. Rename opt_pml_enabled to > opt_ept_pml for consistency with opt_ept_ad, and switch it to being bool > > Drop the comment leading comment for parse_ept_param(). It is stale, and just Nit: There's one "comment" to many here. > --- a/docs/misc/xen-command-line.markdown > +++ b/docs/misc/xen-command-line.markdown > @@ -841,29 +841,37 @@ effect the inverse meaning. > >> Allows mapping of RuntimeServices which have no cachability attribute > >> set as UC. > > -### ept (Intel) > -> `= List of ( {no-}pml | {no-}ad )` > +### ept > +> `= List of [ ad=, pml= ]` > > -Controls EPT related features. > +> Applicability: Intel > > -> Sub-options: > - > -> `pml` > +Extended Page Tables are a feature of Intel's VT-x technology, whereby > +hardware manages the virtualisation of HVM guest pagetables. EPT was > +introduced with the Nehalem architecture. > > -> Default: `true` > +* The `ad` boolean controls hardware tracking of Access and Dirty bits in > the > +EPT pagetables, and was first introduced in Broadwell Server. > > ->> PML is a new hardware feature in Intel's Broadwell Server and further > ->> platforms which reduces hypervisor overhead of log-dirty mechanism by > ->> automatically recording GPAs (guest physical addresses) when guest memory > ->> gets dirty, and therefore significantly reducing number of EPT violation > ->> caused by write protection of guest memory, which is a necessity to > ->> implement log-dirty mechanism before PML. > +By default, Xen will use A/D tracking when available in hardware, except > +on Avoton processors affected by erratum AVR41. Explicitly choosing > +`ad=0` will disable the use of A/D tracking on capable hardware, whereas > +choosing `ad=1` will cause tracking to be used even on AVR41-affected > +hardware. Is there any reason for this special casing of the one erratum? Earlier this week I've gone through some spec updates for other purposes, and I've seen some rather frightening EPT A/D errata. Anyway, this is a question unrelated to the patch here, so Reviewed-by: Jan Beulich Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/vtx: Improvements to ept= command line handling
On 21/12/2018 11:13, Jan Beulich wrote: On 20.12.18 at 18:16, wrote: >> Switch parse_ept_param() to use the parse_boolean() infrastructure for more >> consistency with related command line parameters. Rename opt_pml_enabled to >> opt_ept_pml for consistency with opt_ept_ad, and switch it to being bool >> >> Drop the comment leading comment for parse_ept_param(). It is stale, and >> just > Nit: There's one "comment" to many here. Oops - will fix. > >> --- a/docs/misc/xen-command-line.markdown >> +++ b/docs/misc/xen-command-line.markdown >> @@ -841,29 +841,37 @@ effect the inverse meaning. >> >> Allows mapping of RuntimeServices which have no cachability attribute >> >> set as UC. >> >> -### ept (Intel) >> -> `= List of ( {no-}pml | {no-}ad )` >> +### ept >> +> `= List of [ ad=, pml= ]` >> >> -Controls EPT related features. >> +> Applicability: Intel >> >> -> Sub-options: >> - >> -> `pml` >> +Extended Page Tables are a feature of Intel's VT-x technology, whereby >> +hardware manages the virtualisation of HVM guest pagetables. EPT was >> +introduced with the Nehalem architecture. >> >> -> Default: `true` >> +* The `ad` boolean controls hardware tracking of Access and Dirty bits in >> the >> +EPT pagetables, and was first introduced in Broadwell Server. >> >> ->> PML is a new hardware feature in Intel's Broadwell Server and further >> ->> platforms which reduces hypervisor overhead of log-dirty mechanism by >> ->> automatically recording GPAs (guest physical addresses) when guest memory >> ->> gets dirty, and therefore significantly reducing number of EPT violation >> ->> caused by write protection of guest memory, which is a necessity to >> ->> implement log-dirty mechanism before PML. >> +By default, Xen will use A/D tracking when available in hardware, except >> +on Avoton processors affected by erratum AVR41. Explicitly choosing >> +`ad=0` will disable the use of A/D tracking on capable hardware, whereas >> +choosing `ad=1` will cause tracking to be used even on AVR41-affected >> +hardware. > Is there any reason for this special casing of the one erratum? > Earlier this week I've gone through some spec updates for other > purposes, and I've seen some rather frightening EPT A/D errata. Which, out of interest? There are a few, particularly on Skylake, but all the problematic ones I'm aware of are fixed in microcode. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/vtx: Improvements to ept= command line handling
>>> On 21.12.18 at 12:15, wrote: > On 21/12/2018 11:13, Jan Beulich wrote: > On 20.12.18 at 18:16, wrote: >>> +By default, Xen will use A/D tracking when available in hardware, >>> except >>> +on Avoton processors affected by erratum AVR41. Explicitly choosing >>> +`ad=0` will disable the use of A/D tracking on capable hardware, >>> whereas >>> +choosing `ad=1` will cause tracking to be used even on AVR41-affected >>> +hardware. >> Is there any reason for this special casing of the one erratum? >> Earlier this week I've gone through some spec updates for other >> purposes, and I've seen some rather frightening EPT A/D errata. > > Which, out of interest? There are a few, particularly on Skylake, but > all the problematic ones I'm aware of are fixed in microcode. I'd have to go through them again, as I didn't pay close attention to what was said about their status. Are we generally putting ourselves on the position then that errata don't need working around if there's a microcode update available? That's a possible position to take, but not spelled out anywhere. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4 1/3] xen/pt: fix some pass-thru devices don't work across reboot
On Fri, Dec 21, 2018 at 03:32:47AM -0700, Jan Beulich wrote: > >>> On 21.12.18 at 11:26, wrote: > > On Fri, Dec 21, 2018 at 03:13:50AM -0700, Jan Beulich wrote: > >> But then again I'm still not fully convinced that a hypervisor > >> change is the right course of action here in the first place. It > >> would be better if the hypervisor had to just verify that all > >> IRQ mappings are gone, or else fail the de-assignment of the > >> device. > > > > The only component (except the hypervisor) that knows about such > > assignments is QEMU, and in the case of a QEMU crash the host would be > > left with a device that cannot be de-assigned, because the information > > about the PIRQ bindings in lost due to the QEMU crash. > > > > IMO Xen needs to be capable of cleaning any bindings and mappings done > > by the toolstack or the device model in order to be able to correctly > > recover from a device model or toolstack crash. > > But possibly with tool stack assistance: Rather than doing it (in a > potentially fragile way, as per my other comments) as an integral > part of deassign-device, it could be a separate domctl to be > issued first. I don't have a strong opinion whether a new hypercall would be better than just hooking this logic into the current deassign hypercall, as long as it's robust. > Or otherwise failure here ought to lead to failure of > deassign-device, rather than (e.g.) an infinite loop. Yes, I fully agree it needs to be robust, ideally the unbinding/unmapping should always succeed, or at least don't get stuck into an infinite loop. Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86/vtx: Improvements to ept= command line handling
On 21/12/2018 11:27, Jan Beulich wrote: On 21.12.18 at 12:15, wrote: >> On 21/12/2018 11:13, Jan Beulich wrote: >> On 20.12.18 at 18:16, wrote: +By default, Xen will use A/D tracking when available in hardware, except +on Avoton processors affected by erratum AVR41. Explicitly choosing +`ad=0` will disable the use of A/D tracking on capable hardware, whereas +choosing `ad=1` will cause tracking to be used even on AVR41-affected +hardware. >>> Is there any reason for this special casing of the one erratum? >>> Earlier this week I've gone through some spec updates for other >>> purposes, and I've seen some rather frightening EPT A/D errata. >> Which, out of interest? There are a few, particularly on Skylake, but >> all the problematic ones I'm aware of are fixed in microcode. > I'd have to go through them again, as I didn't pay close attention to > what was said about their status. Are we generally putting ourselves > on the position then that errata don't need working around if there's > a microcode update available? That's a possible position to take, but > not spelled out anywhere. Intel and AMD are quite clear that you should be running up-to-date microcode. Its conceptually similar to someone saying "I found an issue in Xen 4.8.0" and our reply being "Thats great, but its already fixed in 4.8.1 - please update". ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [qemu-mainline test] 131454: tolerable FAIL - PUSHED
flight 131454 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/131454/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 131435 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 131435 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 131435 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 131435 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 131435 test-amd64-i386-xl-pvshim12 guest-start fail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass version targeted for testing: qemuub72566a4ffaddbc0c0c1f6f5ee91b42ab13ff429 baseline version: qemuue85c577158a2e8e252414959da9ef15c12eec63d Last test of basis 131435 2018-12-18 23:20:18 Z2 days Testing same since 131454 2018-12-20 05:20:07 Z1 days1 attempts People who touched revisions under test: Daniel Henrique Barboza Dominik Csapak Eduardo Habkost Emilio G. Cota Laurent Vivier Marc-André Lureau Markus Armbruster Peter Maydell Roman Bolshakov jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-xl pass test-armhf-armhf-xl pass test-amd64-i386-xl pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass test-amd64-amd64-libvirt-xsm pass test-amd64-i386-libvirt-xsm
Re: [Xen-devel] [PATCH for-4.12 v2 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn
Hi Stefano, On 20/12/2018 23:25, Stefano Stabellini wrote: On Thu, 20 Dec 2018, Julien Grall wrote: No functional change intended. Only reasonable clean-ups are done in this patch. The rest will use _gfn for the time being. Signed-off-by: Julien Grall I don't have the bandwidth to review this patch before the holidays, but it is not required for the feature to go in. That's fine. This patch is just a cleaned-up. Thank you for review the rest of the series! Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] xen/dom0: Improve documentation for dom0= and dom0-iommu=
On Thu, Dec 20, 2018 at 11:40:51PM +, Andrew Cooper wrote: > Update to the latest metadata style, and expand each of the clauses with more > information, including applicable CONFIG_* options. > > Drop the redundant comment beside parse_dom0_param(), to avoid it getting out > of sync with the main documentation. > > Signed-off-by: Andrew Cooper Thanks! A couple of fixes below, because the original text is actually wrong... > --- > CC: Jan Beulich > CC: Wei Liu > CC: Roger Pau Monné > CC: Stefano Stabellini > CC: Julien Grall > > Please double check for correctness. The text matches my > understanding/reading of the code, but some of it is rather subtle going. > > It occurs to me that: > > * The choice of dom0 boot mode should in part be derived from the available >CONFIG_* options, and ELF notes advertised in the dom0 kernel. This is indeed doable, but would require parsing the dom0 kernel before building the domain. > > * AMD probably needs to gain an `ivmd=` to mirror `rmrr=` on the Intel side, >because we know there are other errors in the IVRS table. Yes, albeit using rmrr is quite cumbersome because it's mostly a trial-and-error process until there are no more iommu faults (unless you can get the correct rmrr command for your hardware somewhere). > > * Neither of map-{inclusive,reserved} should be active by default, even on >Intel hardware, and we should (wherever possible) have quirks like we have >for all other firmware screwups. Requiring the user to diagnose/work >around firmware problems like this is quite rude. That would indeed be nice, but I think there are too many vendor firmware versions to be able to correctly identify such quirks, the more that vendors don't even list missing RMRR as erratum. > +Controls for the dom0 IOMMU setup. > + > +* The `passthrough` boolean is applicable to x86 PV dom0's only and > defaults > +to false. It controls whether the IOMMU is fully disabled for devices > +belonging to dom0 (`passthrough=1`), or whether the IOMMU is set up with > +an identity transform for dom0 (`passthrough=0`) to prevent dom0 from > +DMA'ing outside of its permitted areas. > + > +This option is hardwired to false for x86 PVH dom0's (where a > non-identity > +transform is required for dom0 to function), and is ignored for ARM. > + > +* The `strict` boolean is applicable to x86 PV dom0's only and defaults to > +false. It controls whether dom0 can have IOMMU mappings for all domain > +RAM in the system, or only for its allocated RAM (and grant mappings > etc.) > + > +This option is hardwired to true for x86 PVH dom0's (as RAM belonging to > +other domains in the system don't live in a compatible address space), > and > +is ignored for ARM. > + > +* The `map-inclusive` boolean is applicable to x86 PV dom0's, and sets up > DMA > +remapping for all non-RAM regions below 4GB except for unusable ranges. > + > +Typically, some devices in a system use bits of RAM for communication, > and > +these areas should be listed via RMRR or IVMD entries in the APCI tables, > +so Xen can ensure that they are identity-mapped in the IOMMU. However, > +some firmware makes mistakes writing its APCI tables, and this option is > a > +coarse-grain workaround for those errors. > + > +Where possible, finer grain corrections should be made with the `rmrr=`, > +`ivrs_hpet=` or `ivrs_ioapic=` command line options. > + > +This option is enabled by default on x86 Intel systems, disabled by > +default on other x86 systems, and invalid on ARM systems. I'm afraid the previous text was wrong. I later discovered that AMD also had such workarounds applied by default, and unified the code, but failed to update the documentation, sorry. map-inclusive is enabled by default on x86 for a PV dom0. See xen/drivers/passthrough/x86/iommu.c:215 (arch_iommu_hwdom_init). > + > +* The `map-reserved` functionality is very similar to `map-inclusive`, but > is > +applicable to both x86 PV and PVH dom0's, and represents a subset of the > +correction by only mapping reserved memory regions rather than all > non-RAM > +regions. > + > +This option is enabled by default on x86 Intel systems, disabled by > +default on other x86 systems, and invalid on ARM systems. map-reserved is enabled by default on x86, xen/drivers/passthrough/x86/iommu.c:218 (arch_iommu_hwdom_init). The text itself looks OK to me. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] xen/dom0: Improve documentation for dom0= and dom0-iommu=
>>> On 21.12.18 at 00:40, wrote: > --- a/docs/misc/xen-command-line.markdown > +++ b/docs/misc/xen-command-line.markdown > @@ -636,55 +636,76 @@ trace feature is only enabled in debugging builds of > Xen. > > Specify the bit width of the DMA heap. > > -### dom0 (x86) > -> `= List of [ pvh | shadow ]` > +### dom0 > +> `= List of [ pvh=, shadow= ]` > > -> Sub-options: > - > -> `pvh` > +> Applicability: x86 Why the new tag, when everything else uses (x86) next to the option name? > ### dom0-iommu > -> `= List of [ passthrough | strict | map-inclusive ]` > - > -This list of booleans controls the iommu usage by Dom0: > - > -* `passthrough`: disables DMA remapping for Dom0. Default is `false`. Note > that > - this option is hard coded to `false` for a PVH Dom0 and any attempt to > - overwrite it from the command line is ignored. > - > -* `strict`: sets up DMA remapping only for the RAM Dom0 actually got > assigned. > - Default is `false` which means Dom0 will get mappings for all the host > - RAM except regions in use by Xen. Note that this option is hard coded to > - `true` for a PVH Dom0 and any attempt to overwrite it from the command line > - is ignored. > - > -* `map-inclusive`: sets up DMA remapping for all the non-RAM regions below > 4GB > - except for unusable ranges. Use this to work around firmware issues > providing > - incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU > - accesses for Dom0, with this option all pages up to 4GB, not marked as > - unusable in the E820 table, will get a mapping established. Note that this > - option is only applicable to a PV Dom0 and is enabled by default on Intel > - hardware. > - > -* `map-reserved`: sets up DMA remapping for all the reserved regions in the > - memory map for Dom0. Use this to work around firmware issues providing > - incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU > - accesses for Dom0, all memory regions marked as reserved in the memory map > - that don't overlap with any MMIO region from emulated devices will be > - identity mapped. This option maps a subset of the memory that would be > - mapped when using the `map-inclusive` option. This option is available to > all > - Dom0 modes and is enabled by default on Intel hardware. > +> `= List of [ passthrough=, strict=, map-inclusive=, > +> map-reserved= ]` > + > +Controls for the dom0 IOMMU setup. > + > +* The `passthrough` boolean is applicable to x86 PV dom0's only and > defaults > +to false. It controls whether the IOMMU is fully disabled for devices > +belonging to dom0 (`passthrough=1`), or whether the IOMMU is set up with > +an identity transform for dom0 (`passthrough=0`) to prevent dom0 from > +DMA'ing outside of its permitted areas. > + > +This option is hardwired to false for x86 PVH dom0's (where a > non-identity > +transform is required for dom0 to function), and is ignored for ARM. > + > +* The `strict` boolean is applicable to x86 PV dom0's only and defaults to > +false. It controls whether dom0 can have IOMMU mappings for all domain > +RAM in the system, or only for its allocated RAM (and grant mappings > etc.) > + > +This option is hardwired to true for x86 PVH dom0's (as RAM belonging to > +other domains in the system don't live in a compatible address space), > and > +is ignored for ARM. > + > +* The `map-inclusive` boolean is applicable to x86 PV dom0's, and sets up > DMA > +remapping for all non-RAM regions below 4GB except for unusable ranges. I don't thinks this is PV-specific, just its default is. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] xen/dom0: Add a dom0-iommu=none option
>>> On 21.12.18 at 00:40, wrote: > @@ -271,6 +272,26 @@ int parse_boolean(const char *name, const char *s, const > char *e) > return -1; > } > > +int cmdline_strcmp(const char *frag, const char *name) __init ? > +{ > +while ( 1 ) > +{ > +int res = (*frag - *name); > + > +if ( res || *name == '\0' ) > +{ > +/* NUL in 'name' matching punctuation in 'frag' implies success. > */ > +if ( *name == '\0' && ispunct(*frag) ) > +res = 0; Isn't ispunct() true for dashes and perhaps also underscores? I don't think it can be this generic, the more that ... > --- a/xen/include/xen/lib.h > +++ b/xen/include/xen/lib.h > @@ -79,6 +79,13 @@ int parse_bool(const char *s, const char *e); > */ > int parse_boolean(const char *name, const char *s, const char *e); > > +/** > + * Very similar to strcmp(), but will declare a match if the NUL in 'name' > + * lines up with punctuationin 'frag'. Designed for picking exact string > + * matches out of a comma-separated command line fragment. > + */ > +int cmdline_strcmp(const char *frag, const char *name); ... you talk of commas only here. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 11/18] argo: implement the register op
Hi Christopher, On 21/12/2018 01:17, Christopher Clark wrote: On Thu, Dec 20, 2018 at 3:20 AM Julien Grall wrote: Hi Christopher, On 12/20/18 6:39 AM, Christopher Clark wrote: Used by a domain to register a region of memory for receiving messages from either a specified other domain, or, if specifying a wildcard, any domain. This operation creates a mapping within Xen's private address space that will remain resident for the lifetime of the ring. In subsequent commits, the hypervisor will use this mapping to copy data from a sending domain into this registered ring, making it accessible to the domain that registered the ring to receive data. In this code, the p2m type of the memory supplied by the guest for the ring must be p2m_ram_rw, which is a conservative choice made to defer the need to reason about the other p2m types with this commit. xen_argo_page_descr_t type is introduced as a page descriptor, to convey both the physical address of the start of the page and its granularity. The smallest granularity page is assumed to be 4096 bytes and the lower twelve bits of the type are used for indicate an enumerated page size. I haven't seen any reply from you on my concern with this approach (see [1]). For convenience, I will duplicate the message here. Hi Julien, Thanks for the reminder. If you let the user the choice of the granularity, then, I believe, you will prevent the hypervisor to do some optimization. OK, let's work through this then. For instance, if the guest supplies only 4KB page but the hypervisor is 64KB. There are no way to easily map them contiguously in the hypervisor (e.g using vmap). Right. So with the matrix: 4K guest, 4K xen : fine. 4K guest, 64K xen : contiguous guest physical chunks or region required. 64K guest, 4K xen : weird? seems doable. It is not weird, 64KB split nicely into 16 4KB chunk. Actually, Linux upstream has all the support for to run with 64KB pages on current Xen. 64K guest, 64K xen : fine (with some work). as you note, the 4K guest, 64K hypervisor case is the one that raises the question. That's correct. To generalize the problem, the problem will happen whenever the guest page size is smaller than the Xen page size. Is there a particular reason to allow the ring buffer to be non-contiguous in the guest physical address? It hasn't been a necessary restriction up to this point, and isn't so on the platforms we're deploying on, so my preference is not to introduce it as an additional requirement if it can be avoided. It allows us to use vmalloc (rather than kmalloc) on Linux, which is helpful. vmalloc might be an issue on Arm if we request 64KB chunk of physical memory. Although I don't know the vmalloc implementation to be able to say whether this can be addressed. There can be high turnover in ring registration for a server with many short-lived connections. While the rings are not necessarily large -- the default is 128K in the current Linux driver, though clients can change what they use -- contiguous memory regions are a more limited resource for the kernel to manage, and avoiding pressure on that contiguous region allocator when it isn't necessary is preferable. We also do not want to disincentivize a server that is seeking to improve performance from registering larger rings -- so allowing non-contiguous regions fits with that. I'd have to study the Linux driver further to say whether there are stronger additional requirements that I'm not currently aware of, but I don't know of any at the moment. Thank you for the detailed explanation. So I think my option 1) below would suit you the best here. Depending on the answer, there are different way to handle that: 1) Request the guest to allocate memory using 64KB (on Arm) chunk and pass the base address for each chunk 2) Request the guest to allocate contiguously the buffer and pass the base address and size I understand that #2 would avoid the need to describe a contiguous allocation of memory as a series of chunks; but I think #1 is the option I would select. Do you think that would be acceptable? 1) is a good option for me. I forgot to mention the base address would need to be aligned to 64KB. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] xen/dom0: Add a dom0-iommu=none option
On Thu, Dec 20, 2018 at 11:40:52PM +, Andrew Cooper wrote: > For development purposes, it is very convenient to boot Xen as a PVH guest, > with an XTF PV or PVH "dom0". The edit-compile-go cycle is a matter of > seconds, and you can resonably insert printk() debugging in places which which > would be completely infeasible when booting fully-fledged guests. > > However, the PVH dom0 path insists on having a working IOMMU, which doesn't > exist when virtualised as a PVH guest, and isn't necessary for XTF anyway. > > Introduce a developer mode to skip the IOMMU requirement. This looks very similar to the current 'passthrough' option, maybe it would be enough to allow PVH dom0 to use the passthrough option provided a warning is added to arch_iommu_check_autotranslated_hwdom? Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] xen/dom0: Improve documentation for dom0= and dom0-iommu=
On 21/12/2018 12:08, Jan Beulich wrote: On 21.12.18 at 00:40, wrote: >> --- a/docs/misc/xen-command-line.markdown >> +++ b/docs/misc/xen-command-line.markdown >> @@ -636,55 +636,76 @@ trace feature is only enabled in debugging builds of >> Xen. >> >> Specify the bit width of the DMA heap. >> >> -### dom0 (x86) >> -> `= List of [ pvh | shadow ]` >> +### dom0 >> +> `= List of [ pvh=, shadow= ]` >> >> -> Sub-options: >> - >> -> `pvh` >> +> Applicability: x86 > Why the new tag, when everything else uses (x86) next to the > option name? See the commit message of c/s a3a99df44e5405d2092ec59087681765fa4cdee7 The problem is with the generated HTML anchors when trying to cross reference the options. > >> ### dom0-iommu >> -> `= List of [ passthrough | strict | map-inclusive ]` >> - >> -This list of booleans controls the iommu usage by Dom0: >> - >> -* `passthrough`: disables DMA remapping for Dom0. Default is `false`. Note >> that >> - this option is hard coded to `false` for a PVH Dom0 and any attempt to >> - overwrite it from the command line is ignored. >> - >> -* `strict`: sets up DMA remapping only for the RAM Dom0 actually got >> assigned. >> - Default is `false` which means Dom0 will get mappings for all the host >> - RAM except regions in use by Xen. Note that this option is hard coded to >> - `true` for a PVH Dom0 and any attempt to overwrite it from the command >> line >> - is ignored. >> - >> -* `map-inclusive`: sets up DMA remapping for all the non-RAM regions below >> 4GB >> - except for unusable ranges. Use this to work around firmware issues >> providing >> - incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU >> - accesses for Dom0, with this option all pages up to 4GB, not marked as >> - unusable in the E820 table, will get a mapping established. Note that this >> - option is only applicable to a PV Dom0 and is enabled by default on Intel >> - hardware. >> - >> -* `map-reserved`: sets up DMA remapping for all the reserved regions in the >> - memory map for Dom0. Use this to work around firmware issues providing >> - incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU >> - accesses for Dom0, all memory regions marked as reserved in the memory map >> - that don't overlap with any MMIO region from emulated devices will be >> - identity mapped. This option maps a subset of the memory that would be >> - mapped when using the `map-inclusive` option. This option is available to >> all >> - Dom0 modes and is enabled by default on Intel hardware. >> +> `= List of [ passthrough=, strict=, map-inclusive=, >> +> map-reserved= ]` >> + >> +Controls for the dom0 IOMMU setup. >> + >> +* The `passthrough` boolean is applicable to x86 PV dom0's only and >> defaults >> +to false. It controls whether the IOMMU is fully disabled for devices >> +belonging to dom0 (`passthrough=1`), or whether the IOMMU is set up with >> +an identity transform for dom0 (`passthrough=0`) to prevent dom0 from >> +DMA'ing outside of its permitted areas. >> + >> +This option is hardwired to false for x86 PVH dom0's (where a >> non-identity >> +transform is required for dom0 to function), and is ignored for ARM. >> + >> +* The `strict` boolean is applicable to x86 PV dom0's only and defaults to >> +false. It controls whether dom0 can have IOMMU mappings for all domain >> +RAM in the system, or only for its allocated RAM (and grant mappings >> etc.) >> + >> +This option is hardwired to true for x86 PVH dom0's (as RAM belonging to >> +other domains in the system don't live in a compatible address space), >> and >> +is ignored for ARM. >> + >> +* The `map-inclusive` boolean is applicable to x86 PV dom0's, and sets up >> DMA >> +remapping for all non-RAM regions below 4GB except for unusable ranges. > I don't thinks this is PV-specific, just its default is. >From arch_iommu_hwdom_init(): /* Inclusive mappings are enabled by default for PV. */ if ( iommu_hwdom_inclusive == -1 ) iommu_hwdom_inclusive = is_pv_domain(d); /* Reserved IOMMU mappings are enabled by default. */ if ( iommu_hwdom_reserved == -1 ) iommu_hwdom_reserved = 1; if ( iommu_hwdom_inclusive && !is_pv_domain(d) ) { printk(XENLOG_WARNING "IOMMU inclusive mappings are only supported on PV Dom0\n"); iommu_hwdom_inclusive = 0; } Attempting to use this option with a PVH dom0 will cause Xen to force it off. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] xen/dom0: Add a dom0-iommu=none option
On 21/12/2018 12:13, Jan Beulich wrote: On 21.12.18 at 00:40, wrote: >> @@ -271,6 +272,26 @@ int parse_boolean(const char *name, const char *s, >> const char *e) >> return -1; >> } >> >> +int cmdline_strcmp(const char *frag, const char *name) > __init ? I think there are some runtime parameters in need of some fixing as well. > >> +{ >> +while ( 1 ) >> +{ >> +int res = (*frag - *name); >> + >> +if ( res || *name == '\0' ) >> +{ >> +/* NUL in 'name' matching punctuation in 'frag' implies >> success. */ >> +if ( *name == '\0' && ispunct(*frag) ) >> +res = 0; > Isn't ispunct() true for dashes and perhaps also underscores? > I don't think it can be this generic, the more that ... > >> --- a/xen/include/xen/lib.h >> +++ b/xen/include/xen/lib.h >> @@ -79,6 +79,13 @@ int parse_bool(const char *s, const char *e); >> */ >> int parse_boolean(const char *name, const char *s, const char *e); >> >> +/** >> + * Very similar to strcmp(), but will declare a match if the NUL in 'name' >> + * lines up with punctuationin 'frag'. Designed for picking exact string >> + * matches out of a comma-separated command line fragment. >> + */ >> +int cmdline_strcmp(const char *frag, const char *name); > ... you talk of commas only here. I actually borrowed this function from my CPUID cmdline patch. In 99% of cases, we only need to match = and , but we have some other parameters such as psr= which use : for delimiters, hence the use of ispunct(). As an alternative, I could revert back to explicitly checking the expected punctuation. It is not as if this is a fastpath. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] xen/dom0: Add a dom0-iommu=none option
On 21/12/2018 12:44, Roger Pau Monné wrote: > On Thu, Dec 20, 2018 at 11:40:52PM +, Andrew Cooper wrote: >> For development purposes, it is very convenient to boot Xen as a PVH guest, >> with an XTF PV or PVH "dom0". The edit-compile-go cycle is a matter of >> seconds, and you can resonably insert printk() debugging in places which >> which >> would be completely infeasible when booting fully-fledged guests. >> >> However, the PVH dom0 path insists on having a working IOMMU, which doesn't >> exist when virtualised as a PVH guest, and isn't necessary for XTF anyway. >> >> Introduce a developer mode to skip the IOMMU requirement. > This looks very similar to the current 'passthrough' option, maybe it > would be enough to allow PVH dom0 to use the passthrough option > provided a warning is added to > arch_iommu_check_autotranslated_hwdom? I considered that, but "dom0-iommu=passthrough" isn't an accurate description of what is going on. Frankly, its not correct for PV either. TBH, dom0-iommu=none is better for both. How about I introduce that as the new option, and leave passthrough as a legacy alias? ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] PROBLEM: Xen paging-request boot failure since 4.19.5
On Thu, Dec 20, 2018 at 11:12:40AM -0500, Boris Ostrovsky wrote: > This is addressed by https://lkml.org/lkml/2018/12/11/266 but has not > been merged yet. Confirmed: applying the patch in that posting to my Gentoo sys-kernel/gentoo-sources-4.19.10 tree results in a working system for me. Thanks, --Ken Pizzini ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] xen/dom0: Improve documentation for dom0= and dom0-iommu=
On 21/12/2018 12:08, Roger Pau Monné wrote: > On Thu, Dec 20, 2018 at 11:40:51PM +, Andrew Cooper wrote: >> Update to the latest metadata style, and expand each of the clauses with more >> information, including applicable CONFIG_* options. >> >> Drop the redundant comment beside parse_dom0_param(), to avoid it getting out >> of sync with the main documentation. >> >> Signed-off-by: Andrew Cooper > Thanks! A couple of fixes below, because the original text is actually > wrong... TBH, that is my default assumption every time I do work like this :) > >> --- >> CC: Jan Beulich >> CC: Wei Liu >> CC: Roger Pau Monné >> CC: Stefano Stabellini >> CC: Julien Grall >> >> Please double check for correctness. The text matches my >> understanding/reading of the code, but some of it is rather subtle going. >> >> It occurs to me that: >> >> * The choice of dom0 boot mode should in part be derived from the available >>CONFIG_* options, and ELF notes advertised in the dom0 kernel. > This is indeed doable, but would require parsing the dom0 kernel > before building the domain. I don't see anything wrong with parsing the ELF headers ahead of building the domain. From the overall boot time, its just an order-of-operations issue. > >> * AMD probably needs to gain an `ivmd=` to mirror `rmrr=` on the Intel side, >>because we know there are other errors in the IVRS table. > Yes, albeit using rmrr is quite cumbersome because it's mostly a > trial-and-error process until there are no more iommu faults (unless > you can get the correct rmrr command for your hardware somewhere). > >> * Neither of map-{inclusive,reserved} should be active by default, even on >>Intel hardware, and we should (wherever possible) have quirks like we have >>for all other firmware screwups. Requiring the user to diagnose/work >>around firmware problems like this is quite rude. > That would indeed be nice, but I think there are too many vendor > firmware versions to be able to correctly identify such quirks, the > more that vendors don't even list missing RMRR as erratum. I don't agree. We already have quirks based on DMI (at the moment, mainly for reboot overrides), and the vast majority of the offending cases are the BMC shared mailbox, which will be in a fixed per-platform location. I don't expect we'll ever find and fix all quirks, but where we do find suitable ones, we should put them into the boot code. > >> +Controls for the dom0 IOMMU setup. >> + >> +* The `passthrough` boolean is applicable to x86 PV dom0's only and >> defaults >> +to false. It controls whether the IOMMU is fully disabled for devices >> +belonging to dom0 (`passthrough=1`), or whether the IOMMU is set up with >> +an identity transform for dom0 (`passthrough=0`) to prevent dom0 from >> +DMA'ing outside of its permitted areas. >> + >> +This option is hardwired to false for x86 PVH dom0's (where a >> non-identity >> +transform is required for dom0 to function), and is ignored for ARM. >> + >> +* The `strict` boolean is applicable to x86 PV dom0's only and defaults to >> +false. It controls whether dom0 can have IOMMU mappings for all domain >> +RAM in the system, or only for its allocated RAM (and grant mappings >> etc.) >> + >> +This option is hardwired to true for x86 PVH dom0's (as RAM belonging to >> +other domains in the system don't live in a compatible address space), >> and >> +is ignored for ARM. >> + >> +* The `map-inclusive` boolean is applicable to x86 PV dom0's, and sets up >> DMA >> +remapping for all non-RAM regions below 4GB except for unusable ranges. >> + >> +Typically, some devices in a system use bits of RAM for communication, >> and >> +these areas should be listed via RMRR or IVMD entries in the APCI >> tables, >> +so Xen can ensure that they are identity-mapped in the IOMMU. However, >> +some firmware makes mistakes writing its APCI tables, and this option >> is a >> +coarse-grain workaround for those errors. >> + >> +Where possible, finer grain corrections should be made with the `rmrr=`, >> +`ivrs_hpet=` or `ivrs_ioapic=` command line options. >> + >> +This option is enabled by default on x86 Intel systems, disabled by >> +default on other x86 systems, and invalid on ARM systems. > I'm afraid the previous text was wrong. I later discovered that AMD > also had such workarounds applied by default, and unified the code, > but failed to update the documentation, sorry. > > map-inclusive is enabled by default on x86 for a PV dom0. See > xen/drivers/passthrough/x86/iommu.c:215 (arch_iommu_hwdom_init). > >> + >> +* The `map-reserved` functionality is very similar to `map-inclusive`, >> but is >> +applicable to both x86 PV and PVH dom0's, and represents a subset of the >> +correction by only mapping reserved memory regions rather than all >> non-RAM >> +regions. >> + >> +This option is enabled by default o
Re: [Xen-devel] [PATCH for-4.12 v2 2/8] xen/arm: p2m: Introduce p2m_get_page_from_gfn
On 20/12/2018 19:23, Julien Grall wrote: > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c > index 2b5e43f50a..cd34149d13 100644 > --- a/xen/arch/arm/p2m.c > +++ b/xen/arch/arm/p2m.c > @@ -406,6 +406,38 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t > *t) > return mfn; > } > > +struct page_info *p2m_get_page_from_gfn(struct domain *d, gfn_t gfn, > +p2m_type_t *t) > +{ > +struct page_info *page; > +p2m_type_t p2mt; > +mfn_t mfn = p2m_lookup(d, gfn, &p2mt); > + > +if (t) Spaces > +*t = p2mt; > + > +if ( !p2m_is_any_ram(p2mt) ) > +return NULL; > + > +if ( !mfn_valid(mfn) ) > +return NULL; Newline > +page = mfn_to_page(mfn); > + > +/* > + * get_page won't work on foreign mapping because the page doesn't > + * belong to the current domain. > + */ > +if ( p2m_is_foreign(p2mt) ) > +{ > +struct domain *fdom = page_get_owner_and_reference(page); > +ASSERT(fdom != NULL); > +ASSERT(fdom != d); > +return page; > +} > + > +return (get_page(page, d) ? page: NULL); No need for the outer brackets. All trivial style issues, so can be fixed on commit. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [xen-unstable-smoke test] 131491: tolerable all pass - PUSHED
flight 131491 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/131491/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass version targeted for testing: xen af80e1d9c79e3bcb392775f311d20eec54b3389b baseline version: xen 396d8d5418ea908a5ef88e7d7a9f22c70ada44c2 Last test of basis 131486 2018-12-21 08:00:41 Z0 days Testing same since 131491 2018-12-21 11:01:16 Z0 days1 attempts People who touched revisions under test: Benjamin Sanda Julien Grall Stefano Stabellini jobs: build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 396d8d5418..af80e1d9c7 af80e1d9c79e3bcb392775f311d20eec54b3389b -> smoke ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 03/15] x86/cpu/vpmu: Add Hygon Dhyana support for vPMU
On 12/21/18 5:02 AM, Pu Wen wrote: > On 2018/12/20 22:25, Boris Ostrovsky wrote: > ... >>> diff --git a/xen/arch/x86/cpu/vpmu_amd.c b/xen/arch/x86/cpu/vpmu_amd.c >>> index 5efc39b..e9f0a5c 100644 >>> --- a/xen/arch/x86/cpu/vpmu_amd.c >>> +++ b/xen/arch/x86/cpu/vpmu_amd.c >>> @@ -554,6 +554,8 @@ int __init amd_vpmu_init(void) >>> case 0x12: >>> case 0x14: >>> case 0x16: >>> +case 0x17: >>> +case 0x18: >> >> This also enables VPMU support for Zen which goes beyond what the >> commit message claims to do. > Sorry for the not clear commit message. Will add modification description > in the commit message and make the changes complete. > > On the other hand, since current Xen vPMU still not support Zen. so in > this patch we enable 0x17 support. If this modification is not preferred, > will remove AMD Xen 0x17 support in next version. Enabling 0x17 should be fine, I just thought commit message should be explicit about that. >> Also, why are you choosing to use legacy MSRs (and you did the same in >> Linux)? Doesn't Zen (which you are saying is similar to Hygon) support >> c001_020X bank? > In Linux, the Xen PMU driver use the default branch cases, which also use > the legacy MSRs way. So we choose to follow legacy MSRs here in Dhyana > cases. > > Since both of Zen and Dhyana support C001_020X MSRs. If use the C001_020X > is preferred, we will try to modify the related codes and create a patch. I don't have a Zen box available right now but from what I can see 0x17 counters are compatible with 0x15 so I think switching to C001_020X should work. And looks like you are using those in Linux (non-Xen part) too. > Also the Linux Xen PMU driver may need to be updated to use these MSRs. Yes, although Linux part is used only by PV guests. -boris -boris ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 v2 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn
On 12/20/18 2:23 PM, Julien Grall wrote: > No functional change intended. > > Only reasonable clean-ups are done in this patch. The rest will use _gfn > for the time being. > > Signed-off-by: Julien Grall SVM bits: Reviewed-by: Boris Ostrovsky ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH] x86/pv: Clean up cr3 handling in arch_set_info_guest()
All of this code lives inside CONFIG_PV which means gfn == mfn, and the fill_ro_mpt() calls clearly show that the value is used untranslated. Change cr3_gfn to a suitably typed cr3_mfn, and replace get_page_from_gfn() with a straight mfn_to_page/get_page sequence, to avoid the implication that translation is going on. No functional change. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Wei Liu CC: Roger Pau Monné CC: Stefano Stabellini CC: Julien Grall Julien: This should simplify your "xen: Switch parameter in get_page_from_gfn to use typesafe gfn" patch. In particular, I did a doubletake at fill_ro_mpt(_mfn(gfn_x(cr3_gfn))); when reviewing it. --- xen/arch/x86/domain.c | 22 +++--- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 32dc4253..da94ab4 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -827,8 +827,8 @@ int arch_set_info_guest( unsigned long flags; bool compat; #ifdef CONFIG_PV -unsigned long cr3_gfn; -struct page_info *cr3_page; +mfn_t cr3_mfn; +struct page_info *cr3_page = NULL; unsigned long cr4; int rc = 0; #endif @@ -1091,12 +1091,12 @@ int arch_set_info_guest( set_bit(_VPF_in_reset, &v->pause_flags); if ( !compat ) -cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[3]); +cr3_mfn = _mfn(xen_cr3_to_pfn(c.nat->ctrlreg[3])); else -cr3_gfn = compat_cr3_to_pfn(c.cmp->ctrlreg[3]); -cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC); +cr3_mfn = _mfn(compat_cr3_to_pfn(c.cmp->ctrlreg[3])); -if ( !cr3_page ) +if ( !mfn_valid(cr3_mfn) || + !(cr3_page = mfn_to_page(cr3_mfn), get_page(cr3_page, d)) ) rc = -EINVAL; else if ( paging_mode_refcounts(d) ) /* nothing */; @@ -1122,7 +1122,7 @@ int arch_set_info_guest( case 0: if ( !compat && !VM_ASSIST(d, m2p_strict) && !paging_mode_refcounts(d) ) -fill_ro_mpt(_mfn(cr3_gfn)); +fill_ro_mpt(cr3_mfn); break; default: if ( cr3_page == current->arch.old_guest_table ) @@ -1137,10 +1137,10 @@ int arch_set_info_guest( v->arch.guest_table = pagetable_from_page(cr3_page); if ( c.nat->ctrlreg[1] ) { -cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[1]); -cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC); +cr3_mfn = _mfn(xen_cr3_to_pfn(c.nat->ctrlreg[1])); -if ( !cr3_page ) +if ( !mfn_valid(cr3_mfn) || + !(cr3_page = mfn_to_page(cr3_mfn), get_page(cr3_page, d)) ) rc = -EINVAL; else if ( !paging_mode_refcounts(d) ) { @@ -1162,7 +1162,7 @@ int arch_set_info_guest( break; case 0: if ( VM_ASSIST(d, m2p_strict) ) -zap_ro_mpt(_mfn(cr3_gfn)); +zap_ro_mpt(cr3_mfn); break; } } -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 v2 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn
On 20/12/2018 19:23, Julien Grall wrote: > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c > index 32dc4253ff..b462a8513b 100644 > --- a/xen/arch/x86/domain.c > +++ b/xen/arch/x86/domain.c > @@ -827,7 +827,7 @@ int arch_set_info_guest( > unsigned long flags; > bool compat; > #ifdef CONFIG_PV > -unsigned long cr3_gfn; > +gfn_t cr3_gfn; I've sent an alternative patch which this patch should be rebased over, at which point all modifications to arch_set_info_guest() should hopefully disappear. > diff --git a/xen/arch/x86/hvm/domain.c b/xen/arch/x86/hvm/domain.c > index 5d5a746a25..73d2da8441 100644 > --- a/xen/arch/x86/hvm/domain.c > +++ b/xen/arch/x86/hvm/domain.c > @@ -297,7 +297,7 @@ int arch_set_info_hvm_guest(struct vcpu *v, const > vcpu_hvm_context_t *ctx) > { > /* Shadow-mode CR3 change. Check PDBR and update refcounts. */ > struct page_info *page = get_page_from_gfn(v->domain, > - v->arch.hvm.guest_cr[3] >> PAGE_SHIFT, > + gaddr_to_gfn(v->arch.hvm.guest_cr[3]), > NULL, P2M_ALLOC); Can you re-indent while modifying this please? > diff --git a/xen/arch/x86/hvm/viridian/time.c > b/xen/arch/x86/hvm/viridian/time.c > index 840a82b457..a718434456 100644 > --- a/xen/arch/x86/hvm/viridian/time.c > +++ b/xen/arch/x86/hvm/viridian/time.c > @@ -38,16 +38,16 @@ static void dump_reference_tsc(const struct domain *d) > > static void update_reference_tsc(struct domain *d, bool initialize) > { > -unsigned long gmfn = d->arch.hvm.viridian.reference_tsc.fields.pfn; > -struct page_info *page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC); > +gfn_t gfn = _gfn(d->arch.hvm.viridian.reference_tsc.fields.pfn); > +struct page_info *page = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC); > HV_REFERENCE_TSC_PAGE *p; > > if ( !page || !get_page_type(page, PGT_writable_page) ) > { > if ( page ) > put_page(page); > -gdprintk(XENLOG_WARNING, "Bad GMFN %#"PRI_gfn" (MFN %#"PRI_mfn")\n", > - gmfn, mfn_x(page ? page_to_mfn(page) : INVALID_MFN)); > +gdprintk(XENLOG_WARNING, "Bad GFN %#"PRI_gfn" (MFN %#"PRI_mfn")\n", The canonical format for gfns and mfns are just %"PRI_*, without the # Do you mind fixing this seeing as you're changing the string anyway? > diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h > index 3304921991..1efbc071c5 100644 > --- a/xen/include/asm-x86/p2m.h > +++ b/xen/include/asm-x86/p2m.h > @@ -491,18 +491,21 @@ struct page_info *p2m_get_page_from_gfn(struct > p2m_domain *p2m, gfn_t gfn, > p2m_query_t q); > > static inline struct page_info *get_page_from_gfn( > -struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q) > +struct domain *d, gfn_t gfn, p2m_type_t *t, p2m_query_t q) > { > struct page_info *page; > +mfn_t mfn; > > if ( paging_mode_translate(d) ) > -return p2m_get_page_from_gfn(p2m_get_hostp2m(d), _gfn(gfn), t, NULL, > q); > +return p2m_get_page_from_gfn(p2m_get_hostp2m(d), gfn, t, NULL, q); > > /* Non-translated guests see 1-1 RAM / MMIO mappings everywhere */ > if ( t ) > *t = likely(d != dom_io) ? p2m_ram_rw : p2m_mmio_direct; > -page = mfn_to_page(_mfn(gfn)); > -return mfn_valid(_mfn(gfn)) && get_page(page, d) ? page : NULL; > + > +mfn = _mfn(gfn_x(gfn)); > +page = mfn_to_page(mfn); > +return mfn_valid(mfn) && get_page(page, d) ? page : NULL; This unfortunately propagates some bad behaviour, because it is not safe to use mfn_to_page(mfn); before mfn_valid(mfn) succeeds. (In practice it works because mfn_to_page() is just pointer arithmetic.) Pleas can you express this as: return (mfn_valid(mfn) && (page = mfn_to_page(mfn), get_page(page, d))) ? page : NULL; which at least gets the order of operations in the correct order from C's point of view. Alternatively, and perhaps easier to follow: if ( !mfn_valid(mfn) ) return NULL; page = mfn_to_page(mfn); return get_page(page, d) ? page : NULL; ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] x86emul: fix test harness and fuzzer build dependencies
Jan Beulich writes ("Re: [Xen-devel] [PATCH] x86emul: fix test harness and fuzzer build dependencies"): > On 20.12.18 at 17:05, wrote: >> On 20.12.18 at 16:23, wrote: > >> Ie, it is there to satisfy the requirement I mention above, that the > >> dependency directory is built first. > > > > Which effectively means anything underneath tools/ (and whatever > > else subdir which depend on one of said rules) is liable to fail to > > build without having come through this (top level) rule. Yes. > > But this is not a property this patch introduces or changes. It just > > re-arranges how things get done. That is, I'd like to ask for the > > change to be acked (or a concrete proposal be made for what > > needs to change) _without_ fixing breakage that might be there > > and the introduction of which you may have missed, or else I'm > > sure you would have commented on what is now eddf9559c9. I'm afraid I don't follow this line of argument. The uncommitted patch we are now discussing introduce a new -j-unsafe call, which could cause stochastic build failures. And AFAICT the only thing the patch-under-discussion improves is to relax (in some situations) the requirement to run make -C /tools/include && make instead of just make, in x86emul. That requirement is the one which I am saying is entirely normal when using recursive make so it should not be a surprise. AFAICT eddf9559c9 doesn't seem to have this problem, although I could be wrong. > >> But in that case we need to make sure that either: > >> > >> A. 1. The top-level Makefiles ensure that *a* build of > >>tools/include completes *before* starting to enter > >>tools/fuzz/x86_instruction_emulator. (Which I > >>think is the case.) > > > > Yes, by said dependency on %-tools-public-headers. Right. > >> AND > >> > >> 2. make -C tools/include is definitely completely read-only if the > >>thing has already been built. (This is somewhat hard to check > >>and maintain, and would need a comment in that Makefile to > >>ensure that this property is preserved.) > > > > Isn't that a property that's supposed to hold for almost all (sub)trees, > > i.e. it's rather the exception that an already built tree will see further > > changes when re-built incrementally without the sources having > > changed? That said, we have to consider such a case here, due to > > the use of move-if-changed in xen/include/xen/lib/x86/Makefile. But > > this still doesn't violate the fully-read-only requirement, as the rule's > > commands won't be executed again when the dependencies are > > older than the target (as is going to be the case after the invocation > > of the build from the top level Makefile). > > So the conclusion was wrong here. If the dependencies changed > (as in are newer than the target) but the target file is the same as > before, move-if-changed will hide the effect from consumers of the > produced file, but the temporary file created gets in the way of -j > here. Indeed. > There being just one invocation of the build in tools/include/ prior > to the patch here (and hence no race afaict), would you consider > it reasonable to make the two new invocations dependent upon > $(MAKELEVEL), thus protecting things in the recursive case? I don't think this is particularly nice, although I wouldn't block it. Why is this particular inter-directory dependency unusual ? Do we plan to introduce similar MAKELEVEL-based invocation of dependency directory makefiles everyhere ? Ian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v7 01/14] libxl: Enhance libxl__sendmsg_fds to deal with EINTR and EWOULDBLOCK
Anthony PERARD writes ("[PATCH v7 01/14] libxl: Enhance libxl__sendmsg_fds to deal with EINTR and EWOULDBLOCK"): > This patch change the behavior of libxl__sendmsg_fds to retry sendmsg on > EINTR error. > > This patch also allow a caller of libxl__sendmsg_fds to deal with > EWOULDBLOCK. The function now requires to send only 1 byte of data so > that when dealing with non-blocking fds a EWOULDBLOCK error would mean > that the fds haven't been sent yet. Current caller already send only 1 > byte. Even with a blocking fd, sendmsg may in principle report a short write. (So the commit message should talk about short writes in general.) > Notes: > v7: > always assert datalen == 1, but only fail when sendmsg haven't send > everything (r != datalen) > check sendmsg return value on success as well (check for short write) Rather than having a function which takes an argument which mandatorily takes the value 1, how about simply deleting that argument ? You can do that in a followup patch if you like. In the meantime: Acked-by: Ian Jackson Ian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v7 02/14] libxl_qmp: Separate QMP message generation from qmp_send_prepare
Anthony PERARD writes ("[PATCH v7 02/14] libxl_qmp: Separate QMP message generation from qmp_send_prepare"): > .. to be able to re-use qmp_prepare_cmd with libxl__ev_qmp. > > This patch also add the QMP end of command '\r\n' into the generated > string as every caller will needs this. > > There should be no functional change. Acked-by: Ian Jackson ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v7 04/14] libxl: Add wrapper around libxl__json_object_to_json JSON
Anthony PERARD writes ("[PATCH v7 04/14] libxl: Add wrapper around libxl__json_object_to_json JSON"): > That wrapper is going to be used to safely log a json_object, as > libxl__json_object_to_json return NULL on error. In the error case, > JSON() will return an invalid json string. Acked-by: Ian Jackson ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 1/8] xen/arm: Only set necessary flags when initializing HCR_EL2
On 28.11.18 18:49, Julien Grall wrote: Only {A,F,I}MO are necessary to receive interrupts until a guest vCPU is loaded. The rest have no effect on Xen and it is better to avoid setting them. Signed-off-by: Julien Grall --- xen/arch/arm/traps.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) Reviewed-by: Andrii Anisov -- Sincerely, Andrii Anisov. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 2/8] xen/arm: p2m: Provide an helper to generate the VTTBR
On 28.11.18 18:49, Julien Grall wrote: A follow-up patch will need to generate the VTTBR in a few places. Signed-off-by: Julien Grall --- xen/arch/arm/p2m.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) Reviewed-by: Andrii Anisov -- Sincerely, Andrii Anisov. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 3/8] xen/arm: p2m: Introduce an helper to allocate the root page-table
On 28.11.18 18:49, Julien Grall wrote: A follow-up patch will require to allocate the root page-table without having a domain in hand. Signed-off-by: Julien Grall Reviewed-by: Andrii Anisov -- Sincerely, Andrii Anisov. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary
On 28.11.18 18:49, Julien Grall wrote: The EL1 translation regime is out-of-context when running at EL2. This means the processor cannot speculate memory accesses using the registers associated to that regime. An isb() is only need if Xen is going to use the translation regime before returning to the guest (exception returns will synchronized the context). Remove unecessary isb() and document the ones left. Signed-off-by: Julien Grall Reviewed-by: Andrii Anisov -- Sincerely, Andrii Anisov. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary
On 28.11.18 18:49, Julien Grall wrote: if ( *last_vcpu_ran != INVALID_VCPU_ID && *last_vcpu_ran != n->vcpu_id ) +{ flush_tlb_local(); +} BTW, missed mentioning that curly braces above are odd by coding style. -- Sincerely, Andrii Anisov. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 7/8] xen/arm: p2m: Clean-up headers included and order them alphabetically
On 28.11.18 18:49, Julien Grall wrote: A lot of the headers are not necessary, so remove them. At the same time, re-order them alphabetically. Signed-off-by: Julien Grall Reviewed-by: Andrii Anisov -- Sincerely, Andrii Anisov. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 8/8] DO NOT APPLY Allow testing the new AT speculate workaround code
I guess this one should not be here. -- Sincerely, Andrii Anisov. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 8/8] DO NOT APPLY Allow testing the new AT speculate workaround code
On 21/12/2018 14:44, Andrii Anisov wrote: > I guess this one should not be here. Posting patches like this can be useful for people trying to test the series. As such, it is worth posting, but the DO NOT APPLY hint is there for people to realise that is isn't for inclusion generally. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 5/8] xen/arm: p2m: Only use isb() when it is necessary
On 21/12/2018 14:43, Andrii Anisov wrote: On 28.11.18 18:49, Julien Grall wrote: if ( *last_vcpu_ran != INVALID_VCPU_ID && *last_vcpu_ran != n->vcpu_id ) + { flush_tlb_local(); + } BTW, missed mentioning that curly braces above are odd by coding style. It is not odd, just unnecessary. I will drop them. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 8/8] DO NOT APPLY Allow testing the new AT speculate workaround code
On 21.12.18 16:47, Andrew Cooper wrote: On 21/12/2018 14:44, Andrii Anisov wrote: I guess this one should not be here. Posting patches like this can be useful for people trying to test the series. As such, it is worth posting, but the DO NOT APPLY hint is there for people to realise that is isn't for inclusion generally. Thank you for the hint. -- Sincerely, Andrii Anisov. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 4/8] xen/arm: domain_build: Don't switch to the guest P2M when copying data
On 28.11.18 18:49, Julien Grall wrote: Until recently, kernel/initrd/dtb were loaded using guest VA and therefore requiring to restore temporarily the P2M. This reworked in a series of commits (up to 9292086 "xen/arm: domain_build: Use copy_to_guest_phys_flush_dcache in dtb_load") to use a guest PA. This will also help a follow-up patch which will require p2m_{save,restore}_state to work in pair to workaround an erratum. Signed-off-by: Julien Grall Reviewed-by: Andrii Anisov -- Sincerely, Andrii Anisov. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4 1/3] xen/pt: fix some pass-thru devices don't work across reboot
On Fri, Dec 21, 2018 at 03:13:50AM -0700, Jan Beulich wrote: On 20.12.18 at 16:29, wrote: >> --- a/xen/drivers/passthrough/pci.c >> +++ b/xen/drivers/passthrough/pci.c >> @@ -1514,6 +1514,55 @@ static int assign_device(struct domain *d, u16 seg, >> u8 bus, u8 devfn, u32 flag) >> return rc; >> } >> >> +/* >> + * Unmap established mappings between domain's pirq and device's MSI. >> + * These mappings were set up by qemu/guest and are expected to be >> + * destroyed when changing the device's ownership. >> + */ >> +static void pci_unmap_msi(struct pci_dev *pdev) >> +{ >> +struct msi_desc *entry; >> +struct domain *d = pdev->domain; >> + >> +ASSERT(pcidevs_locked()); >> + >> +if ( !d ) >> +return; > >Why? deassign_device() (the only caller) ought to guarantee this, >due to its use of pci_get_pdev_by_domain(). I think this simply >wants to be another ASSERT(), if anything at all. > >> +spin_lock(&d->event_lock); >> +while ( (entry = list_first_entry_or_null(&pdev->msi_list, >> + struct msi_desc, list)) != >> NULL ) >> +{ >> +struct pirq *info; >> +int pirq = 0; >> +unsigned int nr = entry->msi_attrib.type != PCI_CAP_ID_MSIX >> + ? entry->msi.nvec : 1; >> + >> +while ( nr -- ) > >Stray blank. > >> +{ >> +struct hvm_pirq_dpci *pirq_dpci; >> + >> +pirq = domain_irq_to_pirq(d, entry[nr].irq); >> +WARN_ON(pirq < 0); >> +if ( pirq <= 0 ) >> +continue; >> + >> +info = pirq_info(d, pirq); >> +if ( !info ) >> +continue; >> + >> +pirq_dpci = pirq_dpci(info); >> +if ( pirq_dpci && >> + (pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI) && >> + (pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI) ) >> +pt_irq_destroy_bind_msi(d, info); >> +} >> +if ( pirq > 0 ) >> +unmap_domain_pirq(d, pirq); > >Can you guarantee that this function won't fail? Because if it >does, I think you might end up in an infinite loop, because the Considering current code doesn't deal with remaining pirq, if we failed to unmap some pirq here (remove all entries from the msi_list here), it wouldn't be a big issue. Hence the real issue here is a potential infinite loop. Then we can just use list_for_each_entry_safe(...) to traverse msi_list to avoid infinite loop. >entry doesn't always get removed from the list in error cases. >Maybe unmap_domain_pirq() needs a "force" mode added, >perhaps indirectly by way of passing "entry" into it (all other >callers would pass NULL). Yes, it is viable. However, for this call site, unmap_domain_pirq() would fail to remove an entry only if xsm_unmap_domain_irq() in unmap_domain_pirq() failed. Can we expect that xsm_unmap_domain_irq() would always succeed there? If the answer is yes, what needed is another assertion rather than the "force" mode. Maybe we can forcibly remove all entries still on the list after the loop. The benefit is we needn't change unmap_domain_pirq() and its existing call sites. > >But then again I'm still not fully convinced that a hypervisor >change is the right course of action here in the first place. It >would be better if the hypervisor had to just verify that all >IRQ mappings are gone, or else fail the de-assignment of the >device. Then in another place, we need the "force" mode. I don't think it would bring great benefit if there is no other case (except device hot-remove and guest shutdown) where we want to unmap all pirqs related to a device. Thanks Chao ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 7/8] xen/arm: p2m: Clean-up headers included and order them alphabetically
I've missed that this patch is already merged within a different series. Also "[Xen-devel] [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522" should be rebased. On 21.12.18 16:44, Andrii Anisov wrote: On 28.11.18 18:49, Julien Grall wrote: A lot of the headers are not necessary, so remove them. At the same time, re-order them alphabetically. Signed-off-by: Julien Grall Reviewed-by: Andrii Anisov -- Sincerely, Andrii Anisov. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v7 06/14] libxl_qmp: Implementation of libxl__ev_qmp_*
Anthony PERARD writes ("[PATCH v7 06/14] libxl_qmp: Implementation of libxl__ev_qmp_*"): > This patch implement the API libxl__ev_qmp documented in the previous > patch, "libxl: Design of an async API to issue QMP commands to QEMU". > > Since this API is to interact with QEMU via the QMP protocol, it also > implement a QMP client. The specification for the QEMU Machine Protocol > (QMP) can be found in the QEMU repository at: > https://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/interop/qmp-spec.txt Thanks. I have only fairly minor comments now. The biggest one remaining is about the use of EGC_GC which I think probably wants to become STATE_AO_GC throughout. See below... > diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h > index 1c7a3b22f4..056de9de2f 100644 > --- a/tools/libxl/libxl_internal.h > +++ b/tools/libxl/libxl_internal.h > @@ -412,6 +412,19 @@ _hidden int libxl__ev_qmp_send(libxl__gc *gc, > libxl__ev_qmp *ev, > +/* receive buffer */ > +char *rx_buf; rx_buf needs a comment saying it comes from NOGC since otherwise one would assume it came from the ao gc like the other buffers. (Could it come from the ao gc instead?) > +static void qmp_ev_ensure_reading_writing(libxl__gc *gc, libxl__ev_qmp *ev) > +/* Update the state of `efd` to match the permited state */ > +{ This function is legal only in states other than disconnected. Needs to be documented. > +static void qmp_ev_set_state(libxl__gc *gc, libxl__ev_qmp *ev, > + libxl__qmp_state new_state) > +/* on entry: !broken and !disconnected */ > +{ > +switch (new_state) { > +case qmp_state_disconnected: > +break; > +case qmp_state_connecting: > +assert(ev->state == qmp_state_disconnected); > +break; > +case qmp_state_capability_negotiation: > +assert(ev->state == qmp_state_connecting); > +break; > +case qmp_state_waiting_reply: > +assert(ev->state == qmp_state_capability_negotiation || > + ev->state == qmp_state_connected); > +break; > +case qmp_state_connected: > +assert(ev->state == qmp_state_waiting_reply); > +break; > +} > + > +ev->state = new_state; I think this function needs to update efd ? What am I missing ? The comment doesn't say what the output state is but naturally I assume that it is precisely new_state, and not some transitional mixture. If it is intended to produce a transitional mixture that ought to be documented. For a concrete example: if on entry, with new_state==disconnected, we are `connecting' then: efd will be looking for POLLIN. But it needs to become Idle. > +/* Setup connection */ > + > +static int qmp_ev_connect(libxl__gc *gc, libxl__ev_qmp *ev) > +/* disconnected -> connecting but with `msg` free > + * on error: broken */ > +{ This function looks fine to me. However, earlier I wrote this: Contrary to the state description, this function does not transition rx_buf from free to used. However, I think this would probably be more easily remedied by changing the definition of `used' to permit NULL/0/0. You might want to use a different word to `used', `inuse' perhaps ? This is still true. That is, your state description for `connecting' says that rx_buf is `used'. And your description lower about what rx_buf being `used' means says that rx_buf must be `allocated'. I think this would probably be best resolved by writing: * free used - * rx_buf NULL allocated + * rx_buf NULL NULL or allocated * rx_buf_size 0 allocation size of `rx_buf` * rx_buf_used 0 <= rx_buf_size, actual data in the * buffer Ie just to change the internal spec. I am going to assume for the rest of the review that the code is right and the internal spec will be updated. (I don't think it is necessary to change the descriptions of rx_buf_size and rx_buf_used; it will be clear that the `allocation size' of a NULL must be 0.) > +/* QMP FD callbacks */ > + > +static void qmp_ev_fd_callback(libxl__egc *egc, libxl__ev_fd *ev_fd, ... > +static int qmp_ev_callback_writable(libxl__gc *gc, > +libxl__ev_qmp *ev, int fd) > +/* on entry: !disconnected > + * on return, one of these state transition: > + * waiting_reply (with msg set) -> waiting_reply (with msg free) > + * tx_buf set -> same state or tx_buf free > + * tx_buf free -> no state change > + * on error: broken */ > +{ ... > +assert(ev->tx_buf); > +if (!ev->tx_buf) > +return 0; I think the if is vestigial. > +while (ev->tx_buf_off < ev->tx_buf_len) { > +r = write(fd, ev->tx_buf + ev->tx_buf_off, > + ev->tx_buf_len - ev->tx_buf_off); > +if (r < 0) { > +if (errno == EINTR) > +continue; > +if (errno == EWOULDB
Re: [Xen-devel] [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522
On 28.11.18 18:49, Julien Grall wrote: Early version of Cortex-A76 can end-up with corrupt TLBs if they speculate an AT instruction while the S1/S2 system registers are in an inconsistent state. This can happen during guest context switch and when invalidating the TLBs for other than the current VMID. The workaround implemented in Xen will: - Use an empty stage-2 with a reserved VMID while context switching between 2 guests - Use an empty stage-2 with the VMID where TLBs need to be flushed Signed-off-by: Julien Grall Reviewed-by: Andrii Anisov -- Sincerely, Andrii Anisov. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] xen/dom0: Add a dom0-iommu=none option
On Fri, Dec 21, 2018 at 01:01:22PM +, Andrew Cooper wrote: > On 21/12/2018 12:44, Roger Pau Monné wrote: > > On Thu, Dec 20, 2018 at 11:40:52PM +, Andrew Cooper wrote: > >> For development purposes, it is very convenient to boot Xen as a PVH guest, > >> with an XTF PV or PVH "dom0". The edit-compile-go cycle is a matter of > >> seconds, and you can resonably insert printk() debugging in places which > >> which > >> would be completely infeasible when booting fully-fledged guests. > >> > >> However, the PVH dom0 path insists on having a working IOMMU, which doesn't > >> exist when virtualised as a PVH guest, and isn't necessary for XTF anyway. > >> > >> Introduce a developer mode to skip the IOMMU requirement. > > This looks very similar to the current 'passthrough' option, maybe it > > would be enough to allow PVH dom0 to use the passthrough option > > provided a warning is added to > > arch_iommu_check_autotranslated_hwdom? > > I considered that, but "dom0-iommu=passthrough" isn't an accurate > description of what is going on. Frankly, its not correct for PV either. And what about turning dom0-iommu into a boolean option itself, so that you can do "dom0-iommu=false"? I think that's more similar to other Xen command line options that have a boolean value and/or a list of sub-options. > TBH, dom0-iommu=none is better for both. How about I introduce that as > the new option, and leave passthrough as a legacy alias? That sounds good, I would like to avoid (if possible) the proliferation of new iommu_hwdom_* variables, because we already have a bunch and the functionality introduced by the 'none' option looks very similar to what 'passthrough' aims to achieve. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v7 07/14] libxl_exec: Add libxl__spawn_initiate_failure
Anthony PERARD writes ("[PATCH v7 07/14] libxl_exec: Add libxl__spawn_initiate_failure"): > This function can be used by user of libxl__spawn_* when they setup a > notification other than xenstore. The parent can already report success > via libxl__spawn_initiate_detach(), this new function can be used for > failure instead of waiting for the timeout. Acked-by: Ian Jackson ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v7 06/14] libxl_qmp: Implementation of libxl__ev_qmp_*
Ian Jackson writes ("Re: [PATCH v7 06/14] libxl_qmp: Implementation of libxl__ev_qmp_*"): > I have only fairly minor comments now. The biggest one remaining is > about the use of EGC_GC which I think probably wants to become > STATE_AO_GC throughout. See below... I realise that I should state explicitly that libxl__ev_qmp_callback must still get an egc. The egc should be passed through all the layers, but its gc should generally not be used. Ian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v7 08/14] libxl: Add init/dispose of for libxl__domain_build_state
Anthony PERARD writes ("[PATCH v7 08/14] libxl: Add init/dispose of for libxl__domain_build_state"): > These two new functions libxl__domain_build_state_{init,dispose} should > be called every time a new libxl__domain_build_state comes to existance. > > There seems to be two of them, one with the domain creation machinery, > and one in the stub_dm_spawn. Acked-by: Ian Jackson ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v7 09/14] libxl_dm: Pre-open QMP socket for QEMU
Anthony PERARD writes ("[PATCH v7 09/14] libxl_dm: Pre-open QMP socket for QEMU"): > This patch moves the creation of the QMP unix socket from QEMU to libxl. > But libxl doesn't rely on this yet. Acked-by: Ian Jackson ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v7 11/14] libxl: QEMU startup sync based on QMP
Anthony PERARD writes ("[PATCH v7 11/14] libxl: QEMU startup sync based on QMP"): > This is only activated when dm_restrict=1, as explained in a previous > patch "libxl_dm: Pre-open QMP socket for QEMU" > > Signed-off-by: Anthony PERARD > Reviewed-by: Roger Pau Monné Acked-by: Ian Jackson ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v7 12/14] libxl_qmp: Store advertised QEMU version in libxl__ev_qmp
Anthony PERARD writes ("[PATCH v7 12/14] libxl_qmp: Store advertised QEMU version in libxl__ev_qmp"): > This will be used in a later patch. Acked-by: Ian Jackson > +o = libxl__json_map_get("QMP", resp, JSON_MAP); > +o = libxl__json_map_get("version", o, JSON_MAP); > +o = libxl__json_map_get("qemu", o, JSON_MAP); > +#define GRAB_VERSION(level) do { \ > +ev->qemu_version.level = libxl__json_object_get_integer( \ > +libxl__json_map_get(#level, o, JSON_INTEGER)); \ > +} while (0) > +GRAB_VERSION(major); > +GRAB_VERSION(minor); > +GRAB_VERSION(micro); Earlier I wrote: I would prefer the indentation to be such that the statement inside the macro is indented like the ones outside. Ie like this: +#define GRAB_VERSION(level) do { \ +ev->qemu_version.level = libxl__json_object_get_integer( \ +libxl__json_map_get(#level, o, JSON_INTEGER)); \ +} while (0) +GRAB_VERSION(major); But up to you. My ack stands either way. Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] xen/dom0: Improve documentation for dom0= and dom0-iommu=
On Fri, Dec 21, 2018 at 01:13:25PM +, Andrew Cooper wrote: > On 21/12/2018 12:08, Roger Pau Monné wrote: > > On Thu, Dec 20, 2018 at 11:40:51PM +, Andrew Cooper wrote: > >> Update to the latest metadata style, and expand each of the clauses with > >> more > >> information, including applicable CONFIG_* options. > >> > >> Drop the redundant comment beside parse_dom0_param(), to avoid it getting > >> out > >> of sync with the main documentation. > >> > >> Signed-off-by: Andrew Cooper > > Thanks! A couple of fixes below, because the original text is actually > > wrong... > > TBH, that is my default assumption every time I do work like this :) In this case it's my fault :(, because I changed the code and forgot about the docs. > > > >> --- > >> CC: Jan Beulich > >> CC: Wei Liu > >> CC: Roger Pau Monné > >> CC: Stefano Stabellini > >> CC: Julien Grall > >> > >> Please double check for correctness. The text matches my > >> understanding/reading of the code, but some of it is rather subtle going. > >> > >> It occurs to me that: > >> > >> * The choice of dom0 boot mode should in part be derived from the > >> available > >>CONFIG_* options, and ELF notes advertised in the dom0 kernel. > > This is indeed doable, but would require parsing the dom0 kernel > > before building the domain. > > I don't see anything wrong with parsing the ELF headers ahead of > building the domain. From the overall boot time, its just an > order-of-operations issue. Oh yes, I didn't mean my comment to sound like criticism. I agree there should be no issues in parsing the ELF earlier, or if there are issues they should be fixed. > > > >> * AMD probably needs to gain an `ivmd=` to mirror `rmrr=` on the Intel > >> side, > >>because we know there are other errors in the IVRS table. > > Yes, albeit using rmrr is quite cumbersome because it's mostly a > > trial-and-error process until there are no more iommu faults (unless > > you can get the correct rmrr command for your hardware somewhere). > > > >> * Neither of map-{inclusive,reserved} should be active by default, even on > >>Intel hardware, and we should (wherever possible) have quirks like we > >> have > >>for all other firmware screwups. Requiring the user to diagnose/work > >>around firmware problems like this is quite rude. > > That would indeed be nice, but I think there are too many vendor > > firmware versions to be able to correctly identify such quirks, the > > more that vendors don't even list missing RMRR as erratum. > > I don't agree. We already have quirks based on DMI (at the moment, > mainly for reboot overrides), and the vast majority of the offending > cases are the BMC shared mailbox, which will be in a fixed per-platform > location. IIRC I've only found a single box that worked without map-reserved, and that's my NUC which has firmware from Intel. And even in that case the USB ports weren't fully working. I guess such quirks could be applied based on the chipset version then if Xen realizes the firmware is either wrong or missing obvious RMRR regions? > I don't expect we'll ever find and fix all quirks, but where we do find > suitable ones, we should put them into the boot code. Sadly I agree. What I'm worried about is turning the default map-{inclusive/reserved} to off, that's likely to make dom0 unable to boot on a huge amount of hardware. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v7 13/14] libxl: Change libxl__domain_suspend_device_model() to be async
Anthony PERARD writes ("[PATCH v7 13/14] libxl: Change libxl__domain_suspend_device_model() to be async"): > This create an extra step for the two call sites of the function. > > libxl__domain_suspend_device_model() in this patch gets an extra error > variable (there is ret and rc), but ret goes away in the next patch. Acked-by: Ian Jackson ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v7 00/14] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_*
Anthony PERARD writes ("[PATCH v7 00/14] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_*"): > Patch series available in this git branch: > https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git > br.libxl-ev-qmp-v Thanks. Sorry for being so slow to review this. It is very nearly ready and I hope it will make 4.12. Ian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v7 14/14] libxl: Re-implement domain_suspend_device_model using libxl__ev_qmp
Anthony PERARD writes ("[PATCH v7 14/14] libxl: Re-implement domain_suspend_device_model using libxl__ev_qmp"): > The re-implementation is done because we want to be able to send the > file description that QEMU can use to save its state. When QEMU is > restricted, it would not be able to write to a path. > > This replace both libxl__qmp_stop() and libxl__qmp_save(). > > qmp_qemu_check_version() was only used by libxl__qmp_save(), so it is > replace by a version using libxl__ev_qmp instead. > > Coding style fixed in libxl__domain_suspend_device_model() for the > return value. Acked-by: Ian Jackson ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 v2 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn
Hi Andrew, On 21/12/2018 14:14, Andrew Cooper wrote: On 20/12/2018 19:23, Julien Grall wrote: diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 32dc4253ff..b462a8513b 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -827,7 +827,7 @@ int arch_set_info_guest( unsigned long flags; bool compat; #ifdef CONFIG_PV -unsigned long cr3_gfn; +gfn_t cr3_gfn; I've sent an alternative patch which this patch should be rebased over, at which point all modifications to arch_set_info_guest() should hopefully disappear. The rest of the series should be merged by end of today (Code freeze for Xen Arm). So I will resend this patch separately after my holidays. diff --git a/xen/arch/x86/hvm/domain.c b/xen/arch/x86/hvm/domain.c index 5d5a746a25..73d2da8441 100644 --- a/xen/arch/x86/hvm/domain.c +++ b/xen/arch/x86/hvm/domain.c @@ -297,7 +297,7 @@ int arch_set_info_hvm_guest(struct vcpu *v, const vcpu_hvm_context_t *ctx) { /* Shadow-mode CR3 change. Check PDBR and update refcounts. */ struct page_info *page = get_page_from_gfn(v->domain, - v->arch.hvm.guest_cr[3] >> PAGE_SHIFT, + gaddr_to_gfn(v->arch.hvm.guest_cr[3]), NULL, P2M_ALLOC); Can you re-indent while modifying this please? Sure. diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c index 840a82b457..a718434456 100644 --- a/xen/arch/x86/hvm/viridian/time.c +++ b/xen/arch/x86/hvm/viridian/time.c @@ -38,16 +38,16 @@ static void dump_reference_tsc(const struct domain *d) static void update_reference_tsc(struct domain *d, bool initialize) { -unsigned long gmfn = d->arch.hvm.viridian.reference_tsc.fields.pfn; -struct page_info *page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC); +gfn_t gfn = _gfn(d->arch.hvm.viridian.reference_tsc.fields.pfn); +struct page_info *page = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC); HV_REFERENCE_TSC_PAGE *p; if ( !page || !get_page_type(page, PGT_writable_page) ) { if ( page ) put_page(page); -gdprintk(XENLOG_WARNING, "Bad GMFN %#"PRI_gfn" (MFN %#"PRI_mfn")\n", - gmfn, mfn_x(page ? page_to_mfn(page) : INVALID_MFN)); +gdprintk(XENLOG_WARNING, "Bad GFN %#"PRI_gfn" (MFN %#"PRI_mfn")\n", The canonical format for gfns and mfns are just %"PRI_*, without the # Do you mind fixing this seeing as you're changing the string anyway? Sure. diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h index 3304921991..1efbc071c5 100644 --- a/xen/include/asm-x86/p2m.h +++ b/xen/include/asm-x86/p2m.h @@ -491,18 +491,21 @@ struct page_info *p2m_get_page_from_gfn(struct p2m_domain *p2m, gfn_t gfn, p2m_query_t q); static inline struct page_info *get_page_from_gfn( -struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q) +struct domain *d, gfn_t gfn, p2m_type_t *t, p2m_query_t q) { struct page_info *page; +mfn_t mfn; if ( paging_mode_translate(d) ) -return p2m_get_page_from_gfn(p2m_get_hostp2m(d), _gfn(gfn), t, NULL, q); +return p2m_get_page_from_gfn(p2m_get_hostp2m(d), gfn, t, NULL, q); /* Non-translated guests see 1-1 RAM / MMIO mappings everywhere */ if ( t ) *t = likely(d != dom_io) ? p2m_ram_rw : p2m_mmio_direct; -page = mfn_to_page(_mfn(gfn)); -return mfn_valid(_mfn(gfn)) && get_page(page, d) ? page : NULL; + +mfn = _mfn(gfn_x(gfn)); +page = mfn_to_page(mfn); +return mfn_valid(mfn) && get_page(page, d) ? page : NULL; This unfortunately propagates some bad behaviour, because it is not safe to use mfn_to_page(mfn); before mfn_valid(mfn) succeeds. (In practice it works because mfn_to_page() is just pointer arithmetic.) Pleas can you express this as: return (mfn_valid(mfn) && (page = mfn_to_page(mfn), get_page(page, d))) ? page : NULL; which at least gets the order of operations in the correct order from C's point of view. Alternatively, and perhaps easier to follow: if ( !mfn_valid(mfn) ) return NULL; page = mfn_to_page(mfn); return get_page(page, d) ? page : NULL; I am happy to fix that. However, shouldn't this be handled in a separate patch? After all, the code is not worst than it currently is. Cheers, ~Andrew -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 03/15] x86/cpu/vpmu: Add Hygon Dhyana support for vPMU
On 2018/12/21 21:34, Boris Ostrovsky wrote: > On 12/21/18 5:02 AM, Pu Wen wrote: >> On 2018/12/20 22:25, Boris Ostrovsky wrote: >> ... diff --git a/xen/arch/x86/cpu/vpmu_amd.c b/xen/arch/x86/cpu/vpmu_amd.c index 5efc39b..e9f0a5c 100644 --- a/xen/arch/x86/cpu/vpmu_amd.c +++ b/xen/arch/x86/cpu/vpmu_amd.c @@ -554,6 +554,8 @@ int __init amd_vpmu_init(void) case 0x12: case 0x14: case 0x16: +case 0x17: +case 0x18: >>> >>> This also enables VPMU support for Zen which goes beyond what the >>> commit message claims to do. >> Sorry for the not clear commit message. Will add modification description >> in the commit message and make the changes complete. >> >> On the other hand, since current Xen vPMU still not support Zen. so in >> this patch we enable 0x17 support. If this modification is not preferred, >> will remove AMD Xen 0x17 support in next version. > > Enabling 0x17 should be fine, I just thought commit message should be > explicit about that. OK, will explicit describe the enabling of 0x17 in the commit message in next version patch set. Thanks for the suggestion. >>> Also, why are you choosing to use legacy MSRs (and you did the same in >>> Linux)? Doesn't Zen (which you are saying is similar to Hygon) support >>> c001_020X bank? >> In Linux, the Xen PMU driver use the default branch cases, which also use >> the legacy MSRs way. So we choose to follow legacy MSRs here in Dhyana >> cases. >> >> Since both of Zen and Dhyana support C001_020X MSRs. If use the C001_020X >> is preferred, we will try to modify the related codes and create a patch. > > > I don't have a Zen box available right now but from what I can see 0x17 > counters are compatible with 0x15 so I think switching to C001_020X > should work. And looks like you are using those in Linux (non-Xen part) too. >> Also the Linux Xen PMU driver may need to be updated to use these MSRs. > > Yes, although Linux part is used only by PV guests. Yes, I have tested the MSRs of the 0x15 ones by booting a Dom0 PV guest. It works even when the Linux Xen PMU driver use the legacy MSRs. I'll test the PV guest by using C001_020X in the Linux Xen part tomorrow. Thx. -- Regards, Pu Wen ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH for-4.12 v3 4/8] xen/arm: Make get_page_from_gfn working with DOMID_XEN
DOMID_XEN is used to share pages beloging to the hypervisor (e.g trace buffers). Unlike other domains, DOMID_XEN is a non-auto translated domain and therefore does not have a P2M. This patch adds a special case for DOMID_XEN in get_page_from_gfn. We may want to provide "non-auto translated helpers" in the future if we see more case. Signed-off-by: Julien Grall --- Changes in v3: - Split from "xen/arm: Add support for read-only foreign mappings" - Use likely rather than unlikely - Fix typoes --- xen/include/asm-arm/p2m.h | 33 - 1 file changed, 32 insertions(+), 1 deletion(-) diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h index a03a033a05..041dea827c 100644 --- a/xen/include/asm-arm/p2m.h +++ b/xen/include/asm-arm/p2m.h @@ -300,7 +300,38 @@ struct page_info *p2m_get_page_from_gfn(struct domain *d, gfn_t gfn, static inline struct page_info *get_page_from_gfn( struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q) { -return p2m_get_page_from_gfn(d, _gfn(gfn), t); +mfn_t mfn; +p2m_type_t _t; +struct page_info *page; + +/* + * Special case for DOMID_XEN as it is the only domain so far that is + * not auto-translated. + */ +if ( likely(d != dom_xen) ) +return p2m_get_page_from_gfn(d, _gfn(gfn), t); + +if ( !t ) +t = &_t; + +*t = p2m_invalid; + +/* + * DOMID_XEN sees 1-1 RAM. The p2m_type is based on the type of the + * page. + */ +mfn = _mfn(gfn); +page = mfn_to_page(mfn); + +if ( !mfn_valid(mfn) || !get_page(page, d) ) +return NULL; + +if ( page->u.inuse.type_info & PGT_writable_page ) +*t = p2m_ram_rw; +else +*t = p2m_ram_ro; + +return page; } int get_page_type(struct page_info *page, unsigned long type); -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH for-4.12 v3 7/8] xenalyze: Build for Both ARM and x86 Platforms
From: Benjamin Sanda Modified to provide building of the xenalyze binary for both ARM and x86 platforms. The xenalyze binary is now built as part of the BIN list for both platforms. Signed-off-by: Benjamin Sanda Signed-off-by: Julien Grall Acked-by: Wei Liu Acked-by: Stefano Stabellini --- Changes in v3: - Add Stefano's acked-by Changes in v2: - Add Wei's acked-by --- tools/xentrace/Makefile | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/tools/xentrace/Makefile b/tools/xentrace/Makefile index 0bad942bdf..9fb7fc96e7 100644 --- a/tools/xentrace/Makefile +++ b/tools/xentrace/Makefile @@ -9,8 +9,7 @@ LDLIBS += $(LDLIBS_libxenevtchn) LDLIBS += $(LDLIBS_libxenctrl) LDLIBS += $(ARGP_LDFLAGS) -BIN-$(CONFIG_X86) = xenalyze -BIN = $(BIN-y) +BIN = xenalyze SBIN = xentrace xentrace_setsize LIBBIN = xenctx SCRIPTS = xentrace_format -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH for-4.12 v3 1/8] xen/arm: p2m: Introduce p2m_get_page_from_gfn
In a follow-up patch, we will need to handle get_page_from_gfn differently for DOMID_XEN. To keep the code simple move the current content in a new separate helper p2m_get_page_from_gfn. Note the new helper is not anymore a static inline function as the helper is quite complex. Finally, take the opportunity to use typesafe gfn as the change is minor. Signed-off-by: Julien Grall Reviewed-by: Andrii Anisov Reviewed-by: Stefano Stabellini --- Changes in v3: - Fix typeos - Add Stefano's reviewed-by - Fix coding style Changes in v2: - Add Andrii's reviewed-by --- xen/arch/arm/p2m.c| 33 + xen/include/asm-arm/p2m.h | 33 - 2 files changed, 37 insertions(+), 29 deletions(-) diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c index 2b5e43f50a..7ae5b29699 100644 --- a/xen/arch/arm/p2m.c +++ b/xen/arch/arm/p2m.c @@ -406,6 +406,39 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t *t) return mfn; } +struct page_info *p2m_get_page_from_gfn(struct domain *d, gfn_t gfn, +p2m_type_t *t) +{ +struct page_info *page; +p2m_type_t p2mt; +mfn_t mfn = p2m_lookup(d, gfn, &p2mt); + +if ( t ) +*t = p2mt; + +if ( !p2m_is_any_ram(p2mt) ) +return NULL; + +if ( !mfn_valid(mfn) ) +return NULL; + +page = mfn_to_page(mfn); + +/* + * get_page won't work on foreign mapping because the page doesn't + * belong to the current domain. + */ +if ( p2m_is_foreign(p2mt) ) +{ +struct domain *fdom = page_get_owner_and_reference(page); +ASSERT(fdom != NULL); +ASSERT(fdom != d); +return page; +} + +return get_page(page, d) ? page : NULL; +} + int guest_physmap_mark_populate_on_demand(struct domain *d, unsigned long gfn, unsigned int order) diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h index 01cd3ee4b5..4db8e8709d 100644 --- a/xen/include/asm-arm/p2m.h +++ b/xen/include/asm-arm/p2m.h @@ -289,38 +289,13 @@ typedef unsigned int p2m_query_t; #define P2M_ALLOC(1u<<0) /* Populate PoD and paged-out entries */ #define P2M_UNSHARE (1u<<1) /* Break CoW sharing */ +struct page_info *p2m_get_page_from_gfn(struct domain *d, gfn_t gfn, +p2m_type_t *t); + static inline struct page_info *get_page_from_gfn( struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q) { -struct page_info *page; -p2m_type_t p2mt; -mfn_t mfn = p2m_lookup(d, _gfn(gfn), &p2mt); - -if (t) -*t = p2mt; - -if ( !p2m_is_any_ram(p2mt) ) -return NULL; - -if ( !mfn_valid(mfn) ) -return NULL; -page = mfn_to_page(mfn); - -/* - * get_page won't work on foreign mapping because the page doesn't - * belong to the current domain. - */ -if ( p2m_is_foreign(p2mt) ) -{ -struct domain *fdom = page_get_owner_and_reference(page); -ASSERT(fdom != NULL); -ASSERT(fdom != d); -return page; -} - -if ( !get_page(page, d) ) -return NULL; -return page; +return p2m_get_page_from_gfn(d, _gfn(gfn), t); } int get_page_type(struct page_info *page, unsigned long type); -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH for-4.12 v3 0/8] xen/arm: Add xentrace support
Hi all, This patch series is a rework of the series sent by Benjamin Sanda in April 2016 [1]. It finally adds support for xentrace/xenanalyze on Arm. Cheers, [1] https://lists.xenproject.org/archives/html/xen-devel/2016-04/msg00464.html *** BLURB HERE *** Benjamin Sanda (2): xen/arm: Initialize trace buffer xenalyze: Build for Both ARM and x86 Platforms Julien Grall (6): xen/arm: p2m: Introduce p2m_get_page_from_gfn xen/arm: Rename p2m_map_foreign to p2m_map_foreign_rw xen/arm: Add support for read-only foreign mappings xen/arm: Make get_page_from_gfn working with DOMID_XEN xen/arm: Allow a privileged domain to map foreign page from DOMID_XEN xen: Switch parameter in get_page_from_gfn to use typesafe gfn tools/xentrace/Makefile | 3 +- xen/arch/arm/guestcopy.c | 2 +- xen/arch/arm/mm.c| 19 ++-- xen/arch/arm/p2m.c | 36 ++- xen/arch/arm/setup.c | 3 ++ xen/arch/x86/cpu/vpmu.c | 2 +- xen/arch/x86/domain.c| 12 xen/arch/x86/domctl.c| 6 ++-- xen/arch/x86/hvm/dm.c| 2 +- xen/arch/x86/hvm/domain.c| 2 +- xen/arch/x86/hvm/hvm.c | 9 +++--- xen/arch/x86/hvm/svm/svm.c | 8 ++--- xen/arch/x86/hvm/viridian/time.c | 8 ++--- xen/arch/x86/hvm/viridian/viridian.c | 16 +- xen/arch/x86/hvm/vmx/vmx.c | 4 +-- xen/arch/x86/hvm/vmx/vvmx.c | 12 xen/arch/x86/mm.c| 24 --- xen/arch/x86/mm/p2m.c| 2 +- xen/arch/x86/mm/shadow/hvm.c | 6 ++-- xen/arch/x86/physdev.c | 3 +- xen/arch/x86/pv/descriptor-tables.c | 4 +-- xen/arch/x86/pv/emul-priv-op.c | 6 ++-- xen/arch/x86/pv/mm.c | 2 +- xen/arch/x86/traps.c | 11 +++ xen/common/domain.c | 2 +- xen/common/event_fifo.c | 12 xen/common/memory.c | 4 +-- xen/common/tmem_xen.c| 2 +- xen/include/asm-arm/p2m.h| 57 +--- xen/include/asm-x86/p2m.h| 11 --- 30 files changed, 174 insertions(+), 116 deletions(-) -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH for-4.12 v3 3/8] xen/arm: Add support for read-only foreign mappings
Currently, foreign mappings can only be read-write. A follow-up patch will extend foreign mapping for Xen backend memory (via XEN_DOMID), some of that memory should only be read accessible for the mapping domain. Introduce a new p2m_type to cater read-only foreign mappings. For now, the decision between the two foreign mapping type is based on the type of the guest page mapped. Signed-off-by: Julien Grall --- Cc: Andrii Anisov Changes in v3: - Remove Andrii's reviewed-by - Move out the XEN_DOMID code in a separate patch - Make the new addition future-proof Changes in v2: - Add Andrii's reviewed-by --- xen/arch/arm/mm.c | 5 +++-- xen/arch/arm/p2m.c| 1 + xen/include/asm-arm/p2m.h | 9 +++-- 3 files changed, 11 insertions(+), 4 deletions(-) diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index 7193d83b44..3bf11eec4f 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -1259,7 +1259,9 @@ int xenmem_add_to_physmap_one( return -EINVAL; } -if ( !p2m_is_ram(p2mt) ) +if ( p2m_is_ram(p2mt) ) +t = (p2mt == p2m_ram_rw) ? p2m_map_foreign_rw : p2m_map_foreign_ro; +else { put_page(page); rcu_unlock_domain(od); @@ -1267,7 +1269,6 @@ int xenmem_add_to_physmap_one( } mfn = page_to_mfn(page); -t = p2m_map_foreign_rw; rcu_unlock_domain(od); break; diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c index 89279fb590..1e7c91e39a 100644 --- a/xen/arch/arm/p2m.c +++ b/xen/arch/arm/p2m.c @@ -478,6 +478,7 @@ static void p2m_set_permission(lpae_t *e, p2m_type_t t, p2m_access_t a) break; case p2m_iommu_map_ro: +case p2m_map_foreign_ro: case p2m_grant_map_ro: case p2m_invalid: e->p2m.xn = 1; diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h index a1aef7b793..a03a033a05 100644 --- a/xen/include/asm-arm/p2m.h +++ b/xen/include/asm-arm/p2m.h @@ -116,6 +116,7 @@ typedef enum { p2m_mmio_direct_nc, /* Read/write mapping of genuine MMIO area non-cacheable */ p2m_mmio_direct_c, /* Read/write mapping of genuine MMIO area cacheable */ p2m_map_foreign_rw, /* Read/write RAM pages from foreign domain */ +p2m_map_foreign_ro, /* Read-only RAM pages from foreign domain */ p2m_grant_map_rw, /* Read/write grant mapping */ p2m_grant_map_ro, /* Read-only grant mapping */ /* The types below are only used to decide the page attribute in the P2M */ @@ -135,12 +136,16 @@ typedef enum { #define P2M_GRANT_TYPES (p2m_to_mask(p2m_grant_map_rw) | \ p2m_to_mask(p2m_grant_map_ro)) +/* Foreign mappings types */ +#define P2M_FOREIGN_TYPES (p2m_to_mask(p2m_map_foreign_rw) | \ + p2m_to_mask(p2m_map_foreign_ro)) + /* Useful predicates */ #define p2m_is_ram(_t) (p2m_to_mask(_t) & P2M_RAM_TYPES) -#define p2m_is_foreign(_t) (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign_rw)) +#define p2m_is_foreign(_t) (p2m_to_mask(_t) & P2M_FOREIGN_TYPES) #define p2m_is_any_ram(_t) (p2m_to_mask(_t) & \ (P2M_RAM_TYPES | P2M_GRANT_TYPES | \ - p2m_to_mask(p2m_map_foreign_rw))) + P2M_FOREIGN_TYPES)) /* All common type definitions should live ahead of this inclusion. */ #ifdef _XEN_P2M_COMMON_H -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH for-4.12 v3 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn
No functional change intended. Only reasonable clean-ups are done in this patch. The rest will use _gfn for the time being. Signed-off-by: Julien Grall Acked-by: Jan Beulich --- Changes in v3: - Add Jan's acked-by Changes in v2: - Remove >> PAGE_SHIFT in svm code - Fix typo in the e-mail address - Small NITs --- xen/arch/arm/guestcopy.c | 2 +- xen/arch/arm/mm.c| 2 +- xen/arch/x86/cpu/vpmu.c | 2 +- xen/arch/x86/domain.c| 12 ++-- xen/arch/x86/domctl.c| 6 +++--- xen/arch/x86/hvm/dm.c| 2 +- xen/arch/x86/hvm/domain.c| 2 +- xen/arch/x86/hvm/hvm.c | 9 + xen/arch/x86/hvm/svm/svm.c | 8 xen/arch/x86/hvm/viridian/time.c | 8 xen/arch/x86/hvm/viridian/viridian.c | 16 xen/arch/x86/hvm/vmx/vmx.c | 4 ++-- xen/arch/x86/hvm/vmx/vvmx.c | 12 ++-- xen/arch/x86/mm.c| 24 ++-- xen/arch/x86/mm/p2m.c| 2 +- xen/arch/x86/mm/shadow/hvm.c | 6 +++--- xen/arch/x86/physdev.c | 3 ++- xen/arch/x86/pv/descriptor-tables.c | 4 ++-- xen/arch/x86/pv/emul-priv-op.c | 6 +++--- xen/arch/x86/pv/mm.c | 2 +- xen/arch/x86/traps.c | 11 ++- xen/common/domain.c | 2 +- xen/common/event_fifo.c | 12 ++-- xen/common/memory.c | 4 ++-- xen/common/tmem_xen.c| 2 +- xen/include/asm-arm/p2m.h| 6 +++--- xen/include/asm-x86/p2m.h| 11 +++ 27 files changed, 95 insertions(+), 85 deletions(-) diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c index 7a0f3e9d5f..55892062bb 100644 --- a/xen/arch/arm/guestcopy.c +++ b/xen/arch/arm/guestcopy.c @@ -37,7 +37,7 @@ static struct page_info *translate_get_page(copy_info_t info, uint64_t addr, return get_page_from_gva(info.gva.v, addr, write ? GV2M_WRITE : GV2M_READ); -page = get_page_from_gfn(info.gpa.d, paddr_to_pfn(addr), &p2mt, P2M_ALLOC); +page = get_page_from_gfn(info.gpa.d, gaddr_to_gfn(addr), &p2mt, P2M_ALLOC); if ( !page ) return NULL; diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index 01ae20..340a1d1548 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -1252,7 +1252,7 @@ int xenmem_add_to_physmap_one( /* Take reference to the foreign domain page. * Reference will be released in XENMEM_remove_from_physmap */ -page = get_page_from_gfn(od, idx, &p2mt, P2M_ALLOC); +page = get_page_from_gfn(od, _gfn(idx), &p2mt, P2M_ALLOC); if ( !page ) { put_pg_owner(od); diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c index 8a4f753eae..4d8f153031 100644 --- a/xen/arch/x86/cpu/vpmu.c +++ b/xen/arch/x86/cpu/vpmu.c @@ -607,7 +607,7 @@ static int pvpmu_init(struct domain *d, xen_pmu_params_t *params) struct vcpu *v; struct vpmu_struct *vpmu; struct page_info *page; -uint64_t gfn = params->val; +gfn_t gfn = _gfn(params->val); if ( (params->vcpu >= d->max_vcpus) || (d->vcpu[params->vcpu] == NULL) ) return -EINVAL; diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 32dc4253ff..b462a8513b 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -827,7 +827,7 @@ int arch_set_info_guest( unsigned long flags; bool compat; #ifdef CONFIG_PV -unsigned long cr3_gfn; +gfn_t cr3_gfn; struct page_info *cr3_page; unsigned long cr4; int rc = 0; @@ -1091,9 +1091,9 @@ int arch_set_info_guest( set_bit(_VPF_in_reset, &v->pause_flags); if ( !compat ) -cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[3]); +cr3_gfn = _gfn(xen_cr3_to_pfn(c.nat->ctrlreg[3])); else -cr3_gfn = compat_cr3_to_pfn(c.cmp->ctrlreg[3]); +cr3_gfn = _gfn(compat_cr3_to_pfn(c.cmp->ctrlreg[3])); cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC); if ( !cr3_page ) @@ -1122,7 +1122,7 @@ int arch_set_info_guest( case 0: if ( !compat && !VM_ASSIST(d, m2p_strict) && !paging_mode_refcounts(d) ) -fill_ro_mpt(_mfn(cr3_gfn)); +fill_ro_mpt(_mfn(gfn_x(cr3_gfn))); break; default: if ( cr3_page == current->arch.old_guest_table ) @@ -1137,7 +1137,7 @@ int arch_set_info_guest( v->arch.guest_table = pagetable_from_page(cr3_page); if ( c.nat->ctrlreg[1] ) { -cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[1]); +cr3_gfn = _gfn(xen_cr3_to_pfn(c.nat->ctrlreg[1])); cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC); if ( !cr3_page ) @@ -1162,7 +1162,7 @@ int arch_set_info_guest(
[Xen-devel] [PATCH for-4.12 v3 2/8] xen/arm: Rename p2m_map_foreign to p2m_map_foreign_rw
A follow-up patch will introduce another type of foreign mapping. Rename the type to make clear it is only used for read-write mapping. No functional changes intended. Signed-off-by: Julien Grall Reviewed-by: Andrii Anisov Acked-by: Stefano Stabellini --- Changes in v3 - Add Stefano's acked-by Changes in v2: - Add Andrii's reviewed-by --- xen/arch/arm/mm.c | 2 +- xen/arch/arm/p2m.c| 2 +- xen/include/asm-arm/p2m.h | 6 +++--- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index d96a6655ee..7193d83b44 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -1267,7 +1267,7 @@ int xenmem_add_to_physmap_one( } mfn = page_to_mfn(page); -t = p2m_map_foreign; +t = p2m_map_foreign_rw; rcu_unlock_domain(od); break; diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c index 7ae5b29699..89279fb590 100644 --- a/xen/arch/arm/p2m.c +++ b/xen/arch/arm/p2m.c @@ -468,7 +468,7 @@ static void p2m_set_permission(lpae_t *e, p2m_type_t t, p2m_access_t a) break; case p2m_iommu_map_rw: -case p2m_map_foreign: +case p2m_map_foreign_rw: case p2m_grant_map_rw: case p2m_mmio_direct_dev: case p2m_mmio_direct_nc: diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h index 4db8e8709d..a1aef7b793 100644 --- a/xen/include/asm-arm/p2m.h +++ b/xen/include/asm-arm/p2m.h @@ -115,7 +115,7 @@ typedef enum { p2m_mmio_direct_dev,/* Read/write mapping of genuine Device MMIO area */ p2m_mmio_direct_nc, /* Read/write mapping of genuine MMIO area non-cacheable */ p2m_mmio_direct_c, /* Read/write mapping of genuine MMIO area cacheable */ -p2m_map_foreign,/* Ram pages from foreign domain */ +p2m_map_foreign_rw, /* Read/write RAM pages from foreign domain */ p2m_grant_map_rw, /* Read/write grant mapping */ p2m_grant_map_ro, /* Read-only grant mapping */ /* The types below are only used to decide the page attribute in the P2M */ @@ -137,10 +137,10 @@ typedef enum { /* Useful predicates */ #define p2m_is_ram(_t) (p2m_to_mask(_t) & P2M_RAM_TYPES) -#define p2m_is_foreign(_t) (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign)) +#define p2m_is_foreign(_t) (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign_rw)) #define p2m_is_any_ram(_t) (p2m_to_mask(_t) & \ (P2M_RAM_TYPES | P2M_GRANT_TYPES | \ - p2m_to_mask(p2m_map_foreign))) + p2m_to_mask(p2m_map_foreign_rw))) /* All common type definitions should live ahead of this inclusion. */ #ifdef _XEN_P2M_COMMON_H -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH for-4.12 v3 6/8] xen/arm: Initialize trace buffer
From: Benjamin Sanda Now that we allow a privileged domain to map tracing buffer, initialize them so a user can effectively trace Xen. Signed-off-by: Benjamin Sanda [julien: rework commit message] Signed-off-by: Julien Grall Reviewed-by: Andrii Anisov Acked-by: Stefano Stabellini --- Changes in v3: - Add Stefano's acked-by Changes in v2: - Add Andrii's reviewed-by --- xen/arch/arm/setup.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c index fb923cdf67..444857a967 100644 --- a/xen/arch/arm/setup.c +++ b/xen/arch/arm/setup.c @@ -36,6 +36,7 @@ #include #include #include +#include #include #include #include @@ -899,6 +900,8 @@ void __init start_xen(unsigned long boot_phys_offset, heap_init_late(); +init_trace_bufs(); + init_constructors(); console_endboot(); -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH for-4.12 v3 5/8] xen/arm: Allow a privileged domain to map foreign page from DOMID_XEN
For auto-translated domain, the only way to map a page to itself is the using the foreign map API. The current code does not allow mapping page from special page (such as DOMID_XEN). As xentrace buffers are shared using DOMID_XEN, it is not possible to use tracing for Arm. This could be solved by using the helper get_pg_owner(). This helper will be able to get a reference on DOMID_XEN and therefore allow mapping for privileged domain. This patch replace the call to rcu_lock_domain_by_any_id() with get_pg_owner(). For consistency, all the call to rcu_unlock_domain are replaced by put_pg_owner(). Signed-off-by: Julien grall Reviewed-by: Andrii Anisov Reviewed-by: Stefano Stabellini --- Changes in v3: - Add Stefano's reviewed-by - Fix typoes Changes in v2: - Add Andrii's reviewed-by --- xen/arch/arm/mm.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index 3bf11eec4f..01ae20 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -1233,20 +1233,20 @@ int xenmem_add_to_physmap_one( struct domain *od; p2m_type_t p2mt; -od = rcu_lock_domain_by_any_id(extra.foreign_domid); +od = get_pg_owner(extra.foreign_domid); if ( od == NULL ) return -ESRCH; if ( od == d ) { -rcu_unlock_domain(od); +put_pg_owner(od); return -EINVAL; } rc = xsm_map_gmfn_foreign(XSM_TARGET, d, od); if ( rc ) { -rcu_unlock_domain(od); +put_pg_owner(od); return rc; } @@ -1255,7 +1255,7 @@ int xenmem_add_to_physmap_one( page = get_page_from_gfn(od, idx, &p2mt, P2M_ALLOC); if ( !page ) { -rcu_unlock_domain(od); +put_pg_owner(od); return -EINVAL; } @@ -1264,13 +1264,13 @@ int xenmem_add_to_physmap_one( else { put_page(page); -rcu_unlock_domain(od); +put_pg_owner(od); return -EINVAL; } mfn = page_to_mfn(page); -rcu_unlock_domain(od); +put_pg_owner(od); break; } case XENMAPSPACE_dev_mmio: -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 v3 7/8] xenalyze: Build for Both ARM and x86 Platforms
Hi, On 21/12/2018 16:26, Julien Grall wrote: From: Benjamin Sanda Modified to provide building of the xenalyze binary for both ARM and x86 platforms. The xenalyze binary is now built as part of the BIN list for both platforms. Signed-off-by: Benjamin Sanda Signed-off-by: Julien Grall Acked-by: Wei Liu Acked-by: Stefano Stabellini I made a typo in the address e-mail. Stefano, could you fix it on commit? Cheers, --- Changes in v3: - Add Stefano's acked-by Changes in v2: - Add Wei's acked-by --- tools/xentrace/Makefile | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/tools/xentrace/Makefile b/tools/xentrace/Makefile index 0bad942bdf..9fb7fc96e7 100644 --- a/tools/xentrace/Makefile +++ b/tools/xentrace/Makefile @@ -9,8 +9,7 @@ LDLIBS += $(LDLIBS_libxenevtchn) LDLIBS += $(LDLIBS_libxenctrl) LDLIBS += $(ARGP_LDFLAGS) -BIN-$(CONFIG_X86) = xenalyze -BIN = $(BIN-y) +BIN = xenalyze SBIN = xentrace xentrace_setsize LIBBIN = xenctx SCRIPTS = xentrace_format -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-4.12 v3 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn
On 21/12/2018 16:26, Julien Grall wrote: No functional change intended. Only reasonable clean-ups are done in this patch. The rest will use _gfn for the time being. Signed-off-by: Julien Grall Acked-by: Jan Beulich Please ignore this patch. I need to rebase it on Andrew's patch. Cheers, --- Changes in v3: - Add Jan's acked-by Changes in v2: - Remove >> PAGE_SHIFT in svm code - Fix typo in the e-mail address - Small NITs --- xen/arch/arm/guestcopy.c | 2 +- xen/arch/arm/mm.c| 2 +- xen/arch/x86/cpu/vpmu.c | 2 +- xen/arch/x86/domain.c| 12 ++-- xen/arch/x86/domctl.c| 6 +++--- xen/arch/x86/hvm/dm.c| 2 +- xen/arch/x86/hvm/domain.c| 2 +- xen/arch/x86/hvm/hvm.c | 9 + xen/arch/x86/hvm/svm/svm.c | 8 xen/arch/x86/hvm/viridian/time.c | 8 xen/arch/x86/hvm/viridian/viridian.c | 16 xen/arch/x86/hvm/vmx/vmx.c | 4 ++-- xen/arch/x86/hvm/vmx/vvmx.c | 12 ++-- xen/arch/x86/mm.c| 24 ++-- xen/arch/x86/mm/p2m.c| 2 +- xen/arch/x86/mm/shadow/hvm.c | 6 +++--- xen/arch/x86/physdev.c | 3 ++- xen/arch/x86/pv/descriptor-tables.c | 4 ++-- xen/arch/x86/pv/emul-priv-op.c | 6 +++--- xen/arch/x86/pv/mm.c | 2 +- xen/arch/x86/traps.c | 11 ++- xen/common/domain.c | 2 +- xen/common/event_fifo.c | 12 ++-- xen/common/memory.c | 4 ++-- xen/common/tmem_xen.c| 2 +- xen/include/asm-arm/p2m.h| 6 +++--- xen/include/asm-x86/p2m.h| 11 +++ 27 files changed, 95 insertions(+), 85 deletions(-) diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c index 7a0f3e9d5f..55892062bb 100644 --- a/xen/arch/arm/guestcopy.c +++ b/xen/arch/arm/guestcopy.c @@ -37,7 +37,7 @@ static struct page_info *translate_get_page(copy_info_t info, uint64_t addr, return get_page_from_gva(info.gva.v, addr, write ? GV2M_WRITE : GV2M_READ); -page = get_page_from_gfn(info.gpa.d, paddr_to_pfn(addr), &p2mt, P2M_ALLOC); +page = get_page_from_gfn(info.gpa.d, gaddr_to_gfn(addr), &p2mt, P2M_ALLOC); if ( !page ) return NULL; diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index 01ae20..340a1d1548 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -1252,7 +1252,7 @@ int xenmem_add_to_physmap_one( /* Take reference to the foreign domain page. * Reference will be released in XENMEM_remove_from_physmap */ -page = get_page_from_gfn(od, idx, &p2mt, P2M_ALLOC); +page = get_page_from_gfn(od, _gfn(idx), &p2mt, P2M_ALLOC); if ( !page ) { put_pg_owner(od); diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c index 8a4f753eae..4d8f153031 100644 --- a/xen/arch/x86/cpu/vpmu.c +++ b/xen/arch/x86/cpu/vpmu.c @@ -607,7 +607,7 @@ static int pvpmu_init(struct domain *d, xen_pmu_params_t *params) struct vcpu *v; struct vpmu_struct *vpmu; struct page_info *page; -uint64_t gfn = params->val; +gfn_t gfn = _gfn(params->val); if ( (params->vcpu >= d->max_vcpus) || (d->vcpu[params->vcpu] == NULL) ) return -EINVAL; diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 32dc4253ff..b462a8513b 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -827,7 +827,7 @@ int arch_set_info_guest( unsigned long flags; bool compat; #ifdef CONFIG_PV -unsigned long cr3_gfn; +gfn_t cr3_gfn; struct page_info *cr3_page; unsigned long cr4; int rc = 0; @@ -1091,9 +1091,9 @@ int arch_set_info_guest( set_bit(_VPF_in_reset, &v->pause_flags); if ( !compat ) -cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[3]); +cr3_gfn = _gfn(xen_cr3_to_pfn(c.nat->ctrlreg[3])); else -cr3_gfn = compat_cr3_to_pfn(c.cmp->ctrlreg[3]); +cr3_gfn = _gfn(compat_cr3_to_pfn(c.cmp->ctrlreg[3])); cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC); if ( !cr3_page ) @@ -1122,7 +1122,7 @@ int arch_set_info_guest( case 0: if ( !compat && !VM_ASSIST(d, m2p_strict) && !paging_mode_refcounts(d) ) -fill_ro_mpt(_mfn(cr3_gfn)); +fill_ro_mpt(_mfn(gfn_x(cr3_gfn))); break; default: if ( cr3_page == current->arch.old_guest_table ) @@ -1137,7 +1137,7 @@ int arch_set_info_guest( v->arch.guest_table = pagetable_from_page(cr3_page); if ( c.nat->ctrlreg[1] ) { -cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[1]); +c
Re: [Xen-devel] [PATCH for-4.12 v2 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn
On 21/12/2018 16:21, Julien Grall wrote: >>> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h >>> index 3304921991..1efbc071c5 100644 >>> --- a/xen/include/asm-x86/p2m.h >>> +++ b/xen/include/asm-x86/p2m.h >>> @@ -491,18 +491,21 @@ struct page_info *p2m_get_page_from_gfn(struct >>> p2m_domain *p2m, gfn_t gfn, >>> p2m_query_t q); >>> static inline struct page_info *get_page_from_gfn( >>> - struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q) >>> + struct domain *d, gfn_t gfn, p2m_type_t *t, p2m_query_t q) >>> { >>> struct page_info *page; >>> + mfn_t mfn; >>> if ( paging_mode_translate(d) ) >>> - return p2m_get_page_from_gfn(p2m_get_hostp2m(d), _gfn(gfn), >>> t, NULL, q); >>> + return p2m_get_page_from_gfn(p2m_get_hostp2m(d), gfn, t, >>> NULL, q); >>> /* Non-translated guests see 1-1 RAM / MMIO mappings >>> everywhere */ >>> if ( t ) >>> *t = likely(d != dom_io) ? p2m_ram_rw : p2m_mmio_direct; >>> - page = mfn_to_page(_mfn(gfn)); >>> - return mfn_valid(_mfn(gfn)) && get_page(page, d) ? page : NULL; >>> + >>> + mfn = _mfn(gfn_x(gfn)); >>> + page = mfn_to_page(mfn); >>> + return mfn_valid(mfn) && get_page(page, d) ? page : NULL; >> >> This unfortunately propagates some bad behaviour, because it is not safe >> to use mfn_to_page(mfn); before mfn_valid(mfn) succeeds. (In practice >> it works because mfn_to_page() is just pointer arithmetic.) >> >> Pleas can you express this as: >> >> return (mfn_valid(mfn) && >> (page = mfn_to_page(mfn), get_page(page, d))) ? page : NULL; >> >> which at least gets the order of operations in the correct order from >> C's point of view. >> >> Alternatively, and perhaps easier to follow: >> >> if ( !mfn_valid(mfn) ) >> return NULL; >> >> page = mfn_to_page(mfn); >> >> return get_page(page, d) ? page : NULL; > > I am happy to fix that. However, shouldn't this be handled in a > separate patch? After all, the code is not worst than it currently is. I don't think its worthy of a separate patch. You're touching the code anyway, so might as well do it all in one go. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3 02/11] libxl: Get rid of support for QEMU_USER_BASE (xen-qemuuser-domidNN)
QEMU_USER_BASE allows a user to specify the UID to use when running the devicemodel for a specific domain number. Unfortunately, this is not really practical: It requires nearly 32,000 entries in /etc/passwd. QEMU_USER_RANGE_BASE is much more practical. Remove support for QEMU_USER_BASE. Signed-off-by: George Dunlap Acked-by: Ian Jackson --- NB that I've chosen not to update the xl.cfg man page at this time; it needs a lot of other updates as well, which would be easier to do all at once at the end. CC: Ian Jackson CC: Wei Liu --- tools/libxl/libxl_dm.c | 16 tools/libxl/libxl_internal.h | 1 - 2 files changed, 4 insertions(+), 13 deletions(-) diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index bbcbc94b6c..6024d4b7b8 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -138,13 +138,6 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc, return 0; } -user = GCSPRINTF("%s%d", LIBXL_QEMU_USER_BASE, guest_domid); -ret = userlookup_helper_getpwnam(gc, user, &user_pwbuf, 0); -if (ret < 0) -return ret; -if (ret > 0) -goto end_search; - ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE, &user_pwbuf, &user_base); if (ret < 0) @@ -174,15 +167,14 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc, if (ret < 0) return ret; if (ret > 0) { -LOGD(WARN, guest_domid, "Could not find user %s%d, falling back to %s", - LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED); +LOGD(WARN, guest_domid, "Could not find user %s, falling back to %s", + LIBXL_QEMU_USER_RANGE_BASE, LIBXL_QEMU_USER_SHARED); goto end_search; } LOGD(ERROR, guest_domid, - "Could not find user %s%d or %s or range base pseudo-user %s, cannot restrict", - LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED, - LIBXL_QEMU_USER_RANGE_BASE); + "Could not find user %s or range base pseudo-user %s, cannot restrict", + LIBXL_QEMU_USER_SHARED, LIBXL_QEMU_USER_RANGE_BASE); return ERROR_INVAL; end_search: diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index c4a43bd0b7..b147f3803c 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -4387,7 +4387,6 @@ _hidden int libxl__read_sysfs_file_contents(libxl__gc *gc, int *datalen_r); #define LIBXL_QEMU_USER_PREFIX "xen-qemuuser" -#define LIBXL_QEMU_USER_BASE LIBXL_QEMU_USER_PREFIX"-domid" #define LIBXL_QEMU_USER_SHARED LIBXL_QEMU_USER_PREFIX"-shared" #define LIBXL_QEMU_USER_RANGE_BASE LIBXL_QEMU_USER_PREFIX"-range-base" -- 2.19.2 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3 07/11] libxl: Make killing of device model asynchronous
Or at least, give it an asynchronous interface so that we can make it actually asynchronous in subsequent patches. Create state structures and callback function signatures. Add the state structure to libxl__destroy_domid_state. Break libxl__destroy_domid down into two functions. No functional change intended. Signed-off-by: George Dunlap Acked-by: Ian Jackson --- v2: - Note that libxl__devicemodel_destroy_cb may be called reentrantly NB that I retain the comment before libxl__destroy_device_model, in spite of the fact that it looks "pointless", to separate it logically from the previous prototype. CC: Ian Jackson CC: Wei Liu --- tools/libxl/libxl_dm.c | 11 +++--- tools/libxl/libxl_domain.c | 40 tools/libxl/libxl_internal.h | 20 -- 3 files changed, 58 insertions(+), 13 deletions(-) diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index 450433452d..ca59df33fe 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -2696,19 +2696,24 @@ out: return rc; } -int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid) +void libxl__destroy_device_model(libxl__egc *egc, + libxl__destroy_devicemodel_state *ddms) { +STATE_AO_GC(ddms->ao); int rc; +int domid = ddms->domid; char *path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, ""); + if (!xs_rm(CTX->xsh, XBT_NULL, path)) LOGD(ERROR, domid, "xs_rm failed for %s", path); + /* We should try to destroy the device model anyway. */ rc = kill_device_model(gc, GCSPRINTF("/local/domain/%d/image/device-model-pid", domid)); - + libxl__qmp_cleanup(gc, domid); -return rc; +ddms->callback(egc, ddms, rc); } /* Return 0 if no dm needed, 1 if needed and <0 if error. */ diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c index d46b97dedf..0ce1ba1327 100644 --- a/tools/libxl/libxl_domain.c +++ b/tools/libxl/libxl_domain.c @@ -1008,6 +1008,10 @@ static void destroy_finish_check(libxl__egc *egc, } /* Callbacks for libxl__destroy_domid */ +static void dm_destroy_cb(libxl__egc *egc, + libxl__destroy_devicemodel_state *ddms, + int rc); + static void devices_destroy_cb(libxl__egc *egc, libxl__devices_remove_state *drs, int rc); @@ -1066,16 +1070,18 @@ void libxl__destroy_domid(libxl__egc *egc, libxl__destroy_domid_state *dis) if (rc < 0) { LOGEVD(ERROR, rc, domid, "xc_domain_pause failed"); } + if (dm_present) { -if (libxl__destroy_device_model(gc, domid) < 0) -LOGD(ERROR, domid, "libxl__destroy_device_model failed"); +dis->ddms.ao = ao; +dis->ddms.domid = domid; +dis->ddms.callback = dm_destroy_cb; + +libxl__destroy_device_model(egc, &dis->ddms); +return; +} else { +dm_destroy_cb(egc, &dis->ddms, 0); +return; } -dis->drs.ao = ao; -dis->drs.domid = domid; -dis->drs.callback = devices_destroy_cb; -dis->drs.force = 1; -libxl__devices_destroy(egc, &dis->drs); -return; out: assert(rc); @@ -1083,6 +1089,24 @@ out: return; } +static void dm_destroy_cb(libxl__egc *egc, + libxl__destroy_devicemodel_state *ddms, + int rc) +{ +libxl__destroy_domid_state *dis = CONTAINER_OF(ddms, *dis, ddms); +STATE_AO_GC(dis->ao); +uint32_t domid = dis->domid; + +if (rc < 0) +LOGD(ERROR, domid, "libxl__destroy_device_model failed"); + +dis->drs.ao = ao; +dis->drs.domid = domid; +dis->drs.callback = devices_destroy_cb; +dis->drs.force = 1; +libxl__devices_destroy(egc, &dis->drs); +} + static void devices_destroy_cb(libxl__egc *egc, libxl__devices_remove_state *drs, int rc) diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index b147f3803c..f9e0bf6578 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -1705,8 +1705,6 @@ _hidden int libxl__wait_for_device_model_deprecated(libxl__gc *gc, void *userdata), void *check_callback_userdata); -_hidden int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid); - _hidden const libxl_vnc_info *libxl__dm_vnc(const libxl_domain_config *g_cfg); _hidden char *libxl__abs_path(libxl__gc *gc, const char *s, const char *path); @@ -3672,6 +3670,7 @@ extern const struct libxl_device_type *device_type_tbl[]; typedef struct libxl__domain_destroy_state libxl__domain_destroy_state; typedef struct libxl__destroy_domid_state libxl__destroy_domid_state; +typedef struct libxl__destroy_devicemodel_state libxl__destroy_devicemodel_state; typedef struct libxl__devices_remove_
[Xen-devel] [PATCH v3 04/11] dm_depriv: Describe expected usage of device_model_user parameter
A number of subsequent patches rely on as-yet undefined behavior for what the `device_model_user` parameter does. Rather than implement it incorrectly (or randomly), or remove the feature, describe an expected usage for the feature. Further patches will make decisions based on this expected usage. Signed-off-by: George Dunlap Acked-by: Ian Jackson --- v3: - Fix a minor typo v2: - Remove stale comment about device_model_user not being ready RFC: As we'll see in a later patch, this implementation is still incomplete: we need a `reaper` uid from which to kill uids. CC: Ian Jackson CC: Wei Liu CC: Anthony Perard --- docs/features/qemu-deprivilege.pandoc | 17 + tools/libxl/libxl_types.idl | 1 - 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/docs/features/qemu-deprivilege.pandoc b/docs/features/qemu-deprivilege.pandoc index f941525189..ce21a60ef7 100644 --- a/docs/features/qemu-deprivilege.pandoc +++ b/docs/features/qemu-deprivilege.pandoc @@ -66,6 +66,23 @@ this, create a user named `xen-qemuuser-shared`; for example: adduser --no-create-home --system xen-qemuuser-shared +A final way to set up a separate process for qemus is to allocate one +UID per VM, and set the UID in the domain config file with the +`device_model_user` argument. For example, suppose you have a VM +named `c6-01`. You might do the following: + +adduser --system --no-create-home --group xen-qemuuser-c6-01 + +And then in your config file, the following line: + +device_model_user="xen-qemuuser-c6-01" + +NOTE: It is important when using `device_model_user` that EACH VM HAVE +A SEPARATE UID, and that none of these UIDs map to root. xl will +throw an error a uid maps to zero, but not if multiple VMs have the +same uid. Multiple VMs with the same device model uid will cause +problems. + ## Domain config changes The core domain config change is to add the following line to the diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 51cf06a3a2..141c46e42a 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -495,7 +495,6 @@ libxl_domain_build_info = Struct("domain_build_info",[ ("device_model", string), ("device_model_ssidref", uint32), ("device_model_ssid_label", string), -# device_model_user is not ready for use yet ("device_model_user", string), # extra parameters pass directly to qemu, NULL terminated -- 2.19.2 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3 09/11] libxl: Kill QEMU with "reaper" ruid
Using kill(-1) to killing an untrusted dm process with the real uid equal to the dm_uid isn't guaranteed to succeed: the process in question may be able to kill the reaper process after the setresuid() and before the kill(). Instead, set the real uid to the QEMU user for domain 0 (QEMU_USER_RANGE_BASE + 0). The reaper process will still be able to kill the dm process, but not vice versa. This, in turn, requires locking to make sure that only one reaper process is using that uid at a time; otherwise one reaper process may kill the other reaper process. Create a lockfile in RUNDIR/dm-reaper-lock, and grab the lock before executing kill. In the event that we can't get the lock for some reason, go ahead with the kill using dm_uid for both real and effective UIDs. This isn't guaranteed to work, but it's no worse than not trying to kill the process at all. NB that this effectively requires admins using device_model_user to also define xen_qemuuser_range_base; this will be addressed in subsequent patches. Signed-off-by: George Dunlap --- v3: - Have libxl__get_reaper_uid return an error if a suitable uid is not found. - Explicitly set reaper_uid to dm_uid if get_reaper_lock_and_uid() returns an error, rather than relying on reaper_uid being unchanged on failure. - Open the lockfile 0644 rather than 0666. - Remove bogus comment. v2: - Port over previous changes - libxl__get_reaper_uid() won't set errno, use LOG rather than LOGE. - Accumulate error and return for all failures - Move flock() outside of the condition. Also fix EINTR check (check errno rather than return value). - Add a comment explaining why we return an error even if the kill() succeeds - Move locking to a separate function to minimize gotos - Refactor libxl__get_reaper_id to take a pointer for reaper_uid; return only success/failure. Also return EINVAL if reaper_uid would resolve to 0. - Handle "reaper_uid not found" specially; note issue with device_model_user feature - Assert that final reaper_uid != 0 CC: Ian Jackson CC: Wei Liu --- tools/libxl/libxl_dm.c | 118 + 1 file changed, 108 insertions(+), 10 deletions(-) diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index d265f24864..f7c4e5eb3b 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -259,6 +259,35 @@ out: return rc; } +/* + * Look up "reaper UID". If present and non-root, returns 0 and sets + * reaper_uid. Otherwise returns libxl-style error. + */ +static int libxl__get_reaper_uid(libxl__gc *gc, uid_t *reaper_uid) +{ +struct passwd *user_base, user_pwbuf; +int rc; + +rc = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE, + &user_pwbuf, &user_base); +if (rc) +return rc; + +if (!user_base) { +LOG(WARN, "Couldn't find uid for reaper process"); +return ERROR_INVAL; +} + +if (user_base->pw_uid == 0) { +LOG(ERROR, "UID for reaper process maps to root!"); +return ERROR_INVAL; +} + +*reaper_uid = user_base->pw_uid; + +return 0; +} + const char *libxl__domain_device_model(libxl__gc *gc, const libxl_domain_build_info *info) { @@ -2834,37 +2863,106 @@ out: return; } +static int get_reaper_lock_and_uid(libxl__destroy_devicemodel_state *ddms, + uid_t *reaper_uid) +{ +STATE_AO_GC(ddms->ao); +int domid = ddms->domid; +int r; +const char * lockfile; +int fd; + +/* Try to lock the "reaper uid" */ +lockfile = GCSPRINTF("%s/dm-reaper-lock", libxl__run_dir_path()); + +/* + * NB that since we've just forked, we can't have any + * threads; so we don't need the libxl__carefd + * infrastructure here. + */ +fd = open(lockfile, O_RDWR|O_CREAT, 0644); +if (fd < 0) { +LOGED(ERROR, domid, + "unexpected error while trying to open lockfile %s, errno=%d", + lockfile, errno); +return ERROR_FAIL; +} + +/* Try to lock the file, retrying on EINTR */ +for (;;) { +r = flock(fd, LOCK_EX); +if (!r) +break; +if (errno != EINTR) { +/* All other errno: EBADF, EINVAL, ENOLCK, EWOULDBLOCK */ +LOGED(ERROR, domid, + "unexpected error while trying to lock %s, fd=%d, errno=%d", + lockfile, fd, errno); +return ERROR_FAIL; +} +} + +/* + * Get reaper_uid. If we can't find such a uid, return an error. + * + * FIXME: This means that domain destruction will fail if + * device_model_user is set but QEMU_USER_RANGE_BASE doesn't exist. + */ +return libxl__get_reaper_uid(gc, reaper_uid); +} + + /* * Destroy all processes of the given uid by setresuid to the * specified uid and kill(-1). NB this MUST BE CALLED FROM A SEPARATE - * PROCESS from the normal libxl process.
[Xen-devel] [PATCH v3 11/11] dm_depriv: Mark `UID cleanup` as completed
Signed-off-by: George Dunlap --- CC: Ian Jackson CC: Wei Liu --- docs/designs/qemu-deprivilege.md | 40 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/docs/designs/qemu-deprivilege.md b/docs/designs/qemu-deprivilege.md index f7444a434d..81a5f5c05d 100644 --- a/docs/designs/qemu-deprivilege.md +++ b/docs/designs/qemu-deprivilege.md @@ -128,26 +128,6 @@ are specified; this does not apply to QEMU running as a Xen DM. '''Tested''': Not tested -# Restrictions / improvements still to do - -This lists potential restrictions still to do. It is meant to be -listed in order of ease of implementation, with low-hanging fruit -first. - -### Further RLIMITs - -RLIMIT_AS limits the total amount of memory; but this includes the -virtual memory which QEMU uses as a mapcache. xen-mapcache.c already -fiddles with this; it would be straightforward to make it *set* the -rlimit to what it thinks a sensible limit is. - -RLIMIT_NPROC limits total number of processes or threads. QEMU uses -threads for some devices, so this would require some thought. - -Other things that would take some cleverness / changes to QEMU to -utilize due to ordering constrants: - - RLIMIT_NOFILES (after all necessary files are opened) - ### libxl UID cleanup '''Description''': Domain IDs are reused, and thus restricted UIDs are @@ -223,6 +203,26 @@ Since this will kill all other `reaper_uid` processes as well, we must either allocate a separate `reaper_uid` per domain, or use locking to ensure that only one killing process is active at a time. +# Restrictions / improvements still to do + +This lists potential restrictions still to do. It is meant to be +listed in order of ease of implementation, with low-hanging fruit +first. + +### Further RLIMITs + +RLIMIT_AS limits the total amount of memory; but this includes the +virtual memory which QEMU uses as a mapcache. xen-mapcache.c already +fiddles with this; it would be straightforward to make it *set* the +rlimit to what it thinks a sensible limit is. + +RLIMIT_NPROC limits total number of processes or threads. QEMU uses +threads for some devices, so this would require some thought. + +Other things that would take some cleverness / changes to QEMU to +utilize due to ordering constrants: + - RLIMIT_NOFILES (after all necessary files are opened) + ## libxl: Treat QMP connection as untrusted '''Description''': Currently libxl talks with QEMU via QMP; but its -- 2.19.2 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3 06/11] libxl: Move qmp cleanup into devicemodel destroy function
Removing the qmp connection is logically part of the device model destruction; having the caller destroy it is a mild layering violation. Move libxl__qmp_cleanup() into libxl__destroy_device_model(). This will make it easier when we make devicemodel destruction asynchronous. Signed-off-by: George Dunlap Acked-by: Ian Jackson --- CC: Ian Jackson CC: Wei Liu --- tools/libxl/libxl_dm.c | 9 +++-- tools/libxl/libxl_domain.c | 2 -- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index d73bbb6b06..450433452d 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -2698,12 +2698,17 @@ out: int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid) { +int rc; char *path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, ""); if (!xs_rm(CTX->xsh, XBT_NULL, path)) LOGD(ERROR, domid, "xs_rm failed for %s", path); /* We should try to destroy the device model anyway. */ -return kill_device_model(gc, -GCSPRINTF("/local/domain/%d/image/device-model-pid", domid)); +rc = kill_device_model(gc, + GCSPRINTF("/local/domain/%d/image/device-model-pid", domid)); + +libxl__qmp_cleanup(gc, domid); + +return rc; } /* Return 0 if no dm needed, 1 if needed and <0 if error. */ diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c index 3377bba994..d46b97dedf 100644 --- a/tools/libxl/libxl_domain.c +++ b/tools/libxl/libxl_domain.c @@ -1069,8 +1069,6 @@ void libxl__destroy_domid(libxl__egc *egc, libxl__destroy_domid_state *dis) if (dm_present) { if (libxl__destroy_device_model(gc, domid) < 0) LOGD(ERROR, domid, "libxl__destroy_device_model failed"); - -libxl__qmp_cleanup(gc, domid); } dis->drs.ao = ao; dis->drs.domid = domid; -- 2.19.2 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel