Re: [Xen-devel] [PATCH v6 04/12] microcode: introduce a global cache of ucode patch
>>> On 13.03.19 at 18:04, wrote:
> On 13/03/2019 10:30, Andrew Cooper wrote:
>> On 13/03/2019 07:39, Jan Beulich wrote:
>>> On 12.03.19 at 17:53, wrote:
>>>> IIRC we agreed that systems with mixed CPU versions are not
>>>> supported, hence the same microcode blob should be used to update
>>>> all the possible CPUs on the system, so a list should not be
>>>> needed here.
>>> That's not what I recall. We agreed to not store everything, because
>>> mixed-family / mixed-model systems are commonly not supported by
>>> the hw vendors. But mixed stepping systems may well be.
>>
>> The difference between Skylake and CascadeLake server CPUs is only in
>> the stepping (4 vs 6). Same for Kaby/Coffee/Whiskey/Amber lake (the
>> mobile and desktop lines have a model each, and newer generations are
>> just new steppings).
>>
>> I'll have to defer to Intel as to exactly what is supported, but when I
>> asked this before, the answer was that no heterogeneity was supported at
>> all, not even at the stepping level or platform ID level.
>
> So, as it turns out, Xen in practice only supports a single file's worth
> of microcode from /lib/firmware
>
> This is because:
> 1) That is the only behaviour dracut has, and
> 2) dracut has been the default initrd-generator in Linux distros for far
> longer than Xen has had boot time microcode loading support, and

I disagree. Pulling the microcode blob out of the initrd is only a
secondary option, as far as Xen's history is concerned. My original
early load implementation required specifying a separate module in the
grub configuration; ucode=scan support was added later by Konrad. Yet
the ucode= logic also supports the full microcode.bin blob to be
loaded. (And as to the timeline, I'm not sure SLE in particular was
switched to dracut before Xen had gained early loading support.)

That said - I agree with you that the overwhelming amount of testing
likely covers the case you mention, and nothing else. So if for all
practical purposes supporting just a single family:model:stepping
tuple is enough - fine with me. Given my other reply regarding how to
deal with load failure though, I'm not sure this would mean we can go
back to caching just a single blob.

Jan
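For illustration only: if the outcome of this discussion is to cache a
single blob per family:model:stepping signature, the cache degenerates
to something like the sketch below. All identifiers are hypothetical
and not taken from the series:

    /* Hypothetical single-slot microcode cache (illustrative only). */
    struct ucode_patch_cache {
        uint32_t sig;   /* CPUID signature: family/model/stepping */
        void *blob;     /* opaque, vendor-specific microcode image */
        size_t size;
    };

    static struct ucode_patch_cache ucode_cache;

    /* Accept a new blob only if the cache is empty or the signature
     * matches what is already cached; mixed signatures are unsupported. */
    static bool ucode_cache_store(uint32_t sig, void *blob, size_t size)
    {
        if ( ucode_cache.sig && ucode_cache.sig != sig )
            return false;
        ucode_cache.sig = sig;
        ucode_cache.blob = blob;
        ucode_cache.size = size;
        return true;
    }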
Re: [Xen-devel] [PATCH v5 09/11] viridian: add implementation of synthetic interrupt MSRs
>>> On 13.03.19 at 17:13, wrote:
>> >>> On 11.03.19 at 14:41, wrote:
>> > @@ -28,6 +29,32 @@ typedef union _HV_VP_ASSIST_PAGE
>> >  uint8_t ReservedZBytePadding[PAGE_SIZE];
>> >  } HV_VP_ASSIST_PAGE;
>> >
>> > +typedef enum HV_MESSAGE_TYPE {
>> > +    HvMessageTypeNone,
>> > +    HvMessageTimerExpired = 0x80000010,
>> > +} HV_MESSAGE_TYPE;
>> > +
>> > +typedef struct HV_MESSAGE_FLAGS {
>> > +    uint8_t MessagePending:1;
>> > +    uint8_t Reserved:7;
>> > +} HV_MESSAGE_FLAGS;
>> > +
>> > +typedef struct HV_MESSAGE_HEADER {
>> > +    HV_MESSAGE_TYPE MessageType;
>> > +    uint16_t Reserved1;
>> > +    HV_MESSAGE_FLAGS MessageFlags;
>> > +    uint8_t PayloadSize;
>> > +    uint64_t Reserved2;
>> > +} HV_MESSAGE_HEADER;
>> > +
>> > +#define HV_MESSAGE_SIZE 256
>> > +#define HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT 30
>> >
>
> Missed this one before...
>
>> Is this defined this way, or (given ...
>>
>> > +typedef struct HV_MESSAGE {
>> > +    HV_MESSAGE_HEADER Header;
>> > +    uint64_t Payload[HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT];
>> > +} HV_MESSAGE;
>>
>> ... this) isn't it rather
>>
>> #define HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT \
>>     ((HV_MESSAGE_SIZE - sizeof(HV_MESSAGE_HEADER)) / 8)
>>
>
> I need the definition for the array in the struct so that sizeof(HV_MESSAGE)
> == HV_MESSAGE_SIZE (for which there is a BUILD_BUG_ON()) later.

I don't understand this part - I'm not asking to ditch the #define.
As to the BUILD_BUG_ON() - I see now, but that's only in patch
10, and in a specific message handler. I think this would belong
here, and in the main viridian.c file.

> It's also written that way in the spec. so I'd rather leave it as-is.

Well, okay then - I was sort of expecting this to be spelled out
there in such a way. That doesn't change my overall opinion,
but I can see your point of wanting to match the spec, and you
being the maintainer I have no basis to insist anyway.

Jan
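As an aside, a minimal sketch of co-locating the compile-time size
check with the definitions, as requested above (placement is
illustrative; Xen's BUILD_BUG_ON() needs statement context, hence the
helper function):

    #define HV_MESSAGE_SIZE 256
    #define HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT 30

    typedef struct HV_MESSAGE {
        HV_MESSAGE_HEADER Header;
        uint64_t Payload[HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT];
    } HV_MESSAGE;

    /* 16-byte header + 30 * 8-byte payload = 256 bytes. */
    static void check_hv_message_layout(void)
    {
        /* Fails the build if the struct doesn't match the spec's size. */
        BUILD_BUG_ON(sizeof(HV_MESSAGE) != HV_MESSAGE_SIZE);
    }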
Re: [Xen-devel] [PATCH v2 1/4] x86: stop handling MSR_IA32_BNDCFGS save/restore in implementation code
>>> On 13.03.19 at 17:05, wrote:
>> >>> On 11.03.19 at 19:09, wrote:
>> > --- a/xen/include/asm-x86/msr.h
>> > +++ b/xen/include/asm-x86/msr.h
>> > @@ -328,7 +328,7 @@ int init_vcpu_msr_policy(struct vcpu *v);
>> >  * These functions are also used by the migration logic, so need to cope with
>> >  * being used outside of v's context.
>> >  */
>> > -int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val);
>> > +int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val);
>>
>> I find this pretty undesirable, and I'd like to at least put out
>> for discussion a means how to avoid it: Any entity being
>> passed a const struct vcpu *cv can get hold of a non-const
>> one by doing
>>
>> struct vcpu *v = cv->domain->vcpu[cv->vcpu_id];
>>
>
> Looks kind of odd, but of course it will certainly work.
>
>> Of course this shouldn't be used arbitrarily, but to hide an
>> implementation detail like that of vmx_vmcs_enter() I think
>> this could be justified. Thoughts?
>>
>
> I guess the question is at what level to do this? Probably in the hvm code
> rather than in the vmx code.

I think it should be in VMX code, as that's where the vmx_vmcs_enter()
oddity lives.

Jan
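To make the suggestion concrete, a sketch of how the const-stripping
could be hidden next to vmx_vmcs_enter() (the helper name is invented
for illustration and is not part of the patch):

    /*
     * Recover a mutable vcpu pointer from a const one, exploiting that
     * d->vcpu[] holds non-const pointers.  Kept local to the VMX code
     * so the trick doesn't spread.
     */
    static struct vcpu *vmx_mutable_vcpu(const struct vcpu *cv)
    {
        return cv->domain->vcpu[cv->vcpu_id];
    }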
Re: [Xen-devel] [PATCH for-next 7/9] xen: Convert is_xen_heap_mfn to use typesafe MFN
>>> On 13.03.19 at 18:24, wrote:
> On 13/03/2019 15:04, Jan Beulich wrote:
>> On 18.02.19 at 12:35, wrote:
>>> --- a/xen/common/page_alloc.c
>>> +++ b/xen/common/page_alloc.c
>>> @@ -2121,9 +2121,9 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe)
>>>  * Yuk! Ensure there is a one-page buffer between Xen and Dom zones, to
>>>  * prevent merging of power-of-two blocks across the zone boundary.
>>>  */
>>> -    if ( ps && !is_xen_heap_mfn(paddr_to_pfn(ps)-1) )
>>> +    if ( ps && !is_xen_heap_mfn(_mfn(paddr_to_pfn(ps)-1)) )
>>>          ps += PAGE_SIZE;
>>> -    if ( !is_xen_heap_mfn(paddr_to_pfn(pe)) )
>>> +    if ( !is_xen_heap_mfn(maddr_to_mfn(pe)) )
>>
>> Why maddr_to_mfn() here but still paddr_to_pfn() above? Oh,
>> we don't have any mfn_sub(), I see.
>
> Yes, we don't have mfn_sub() (or even gfn_sub()). I only found a couple
> of places where such helpers might be useful. I can introduce the 2
> helpers if you think it is worth it.

Well, I guess in the end I'm fine either way. It simply struck me
as odd at the first glance that you use maddr_to_mfn() in one
case but not the other.

Jan
Re: [Xen-devel] [PATCH for-next 8/9] xen: Introduce HAS_M2P config and use to protect mfn_to_gmfn call
>>> On 13.03.19 at 18:30, wrote:
> On 13/03/2019 15:20, Jan Beulich wrote:
>> On 18.02.19 at 12:35, wrote:
>>> --- a/xen/common/domctl.c
>>> +++ b/xen/common/domctl.c
>>> @@ -205,7 +205,7 @@ void getdomaininfo(struct domain *d, struct xen_domctl_getdomaininfo *info)
>>>      info->outstanding_pages = d->outstanding_pages;
>>>      info->shr_pages = atomic_read(&d->shr_pages);
>>>      info->paged_pages = atomic_read(&d->paged_pages);
>>> -    info->shared_info_frame = mfn_to_gmfn(d, virt_to_mfn(d->shared_info));
>>> +    info->shared_info_frame = gfn_x(domain_shared_info_gfn(d));
>>
>> I think this change wants to be accompanied by a warning attached
>> to the field declaration in the public header.
>
> Makes sense.
>
>>
>> But I'd also like to have the tool stack maintainers' view on making
>> this field effectively unusable for Arm.
>
> The value in shared_info_frame was plain wrong since the creation of
> Xen Arm. So this is just making the error more obvious. I don't expect
> any user of it on Arm.

Well, my request for tool stack maintainer input wasn't to put under
question that the field can't currently be used sensibly on Arm.
Instead I'm meaning to know whether it can be sensibly expected for
the tool stack to want to use the field uniformly, in which case
rather than making it more obviously not work it should be fixed
instead.

Jan
Re: [Xen-devel] [PATCH for-next 9/9] xen: Remove mfn_to_gmfn macro
>>> On 13.03.19 at 18:34, wrote:
> On 13/03/2019 15:59, Jan Beulich wrote:
>> On 13.03.19 at 16:48, wrote:
>>> On 13/03/2019 15:40, Jan Beulich wrote:
>>>> On 13.03.19 at 16:24, wrote:
>>>>> On 13/03/2019 15:22, Jan Beulich wrote:
>>>>>> On 18.02.19 at 12:36, wrote:
>>>>>>> --- a/xen/include/asm-arm/mm.h
>>>>>>> +++ b/xen/include/asm-arm/mm.h
>>>>>>> @@ -321,10 +321,8 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
>>>>>>>  #define SHARED_M2P_ENTRY (~0UL - 1UL)
>>>>>>>  #define SHARED_M2P(_e)   ((_e) == SHARED_M2P_ENTRY)
>>>>>>>
>>>>>>> -/* Xen always owns P2M on ARM */
>>>>>>> +/* We don't have a M2P on Arm */
>>>>>>>  #define set_gpfn_from_mfn(mfn, pfn) do { (void) (mfn), (void)(pfn); } while (0)
>>>>>>> -#define mfn_to_gmfn(_d, mfn) (mfn)
>>>>>> So is the plan to remove the other macro from Arm then as well?
>>>>> Do you mean mfn_to_gfn? If so it does not exist on Arm.
>>>> No, I mean the one in context above - set_gpfn_from_mfn().
>>> It is used in common code, so we would need to #ifdef the caller.
>> Hmm, right, such #ifdef-ary would be undesirable (and two out of
>> the three common code callers would need it).
>>
>>> I think it is better to provide a NOP implementation. Could be moved
>>> somewhere in the common header though. Any opinions?
>> This would perhaps be better, now that you have HAVE_M2P.
>
> Given that "having an M2P" is now an x86-specific concept, I think
> phasing set_gpfn_from_mfn()'s use out of common code is the way to go.

But what's the implication of this? There would need to be some
arch_*() hook used in the place that set_gpfn_from_mfn() is invoked
currently. But then it can as well remain set_gpfn_from_mfn() (with
the !HAVE_M2P case stubbed out properly in a common header), can't it?

Jan
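A minimal sketch of the common-header stub under discussion, assuming
the series' HAS_M2P Kconfig symbol (the exact name and placement are
still open in the thread):

    /* In a common header, e.g. xen/include/xen/mm.h (illustrative): */
    #ifndef CONFIG_HAS_M2P
    /* No M2P on this architecture: updating it is a no-op. */
    #define set_gpfn_from_mfn(mfn, pfn) do { (void)(mfn); (void)(pfn); } while (0)
    #endif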
Re: [Xen-devel] [PATCH for-next 9/9] xen: Remove mfn_to_gmfn macro
>>> On 13.03.19 at 19:41, wrote:
> On 13/03/2019 17:42, Julien Grall wrote:
>> On 13/03/2019 17:34, Andrew Cooper wrote:
>>> Given that "having an M2P" is now an x86-specific concept, I think
>>> phasing set_gpfn_from_mfn()'s use out of common code is the way to go.
>>
>> So you never expect other architectures to use the M2P?
>
> I guess that depends on how likely it is going to be that Xen gains a
> new non-HVM-like virtualisation mode on a new architecture.

Well, not quite. I don't think it would be straightforward to make
x86 select HAVE_M2P only when PV is also enabled. Hence a HVM-like
implementation may still want to maintain an M2P. In fact it is my
understanding that 64-bit Arm could easily do so (leaving aside the
question of whether the memory needed to build the tables would be
well spent this way), but it's prohibitive on 32-bit, and hence it's
easier for the Arm code overall to uniformly resort to alternative
means where such a translation is indeed needed.

Jan
Re: [Xen-devel] [PATCH] tools/cpuid: Update xen-cpuid for Coffee Lake processors
>>> On 13.03.19 at 20:37, wrote:
> --- a/tools/misc/xen-cpuid.c
> +++ b/tools/misc/xen-cpuid.c
> @@ -127,6 +127,8 @@ static const char *str_7c0[32] =
>      [14] = "avx512_vpopcntdq",
>
>      [22] = "rdpid",
> +
> +    [30] = "sgx-lc",
>  };
>
>  static const char *str_e7d[32] =

Hmm, to be honest I don't see why this needs to go in ahead of
https://lists.xenproject.org/archives/html/xen-devel/2018-12/msg01813.html
posted almost 3 months ago, and already taking care of the bit. I'm
fine to switch from underscore to dash there if that's desired, but it
looks inconsistent with pre-existing entries (as the context of your
patch also shows).

Jan
[Xen-devel] [xen-4.8-testing test] 133755: regressions - FAIL
flight 133755 xen-4.8-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/133755/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt-pair 22 guest-migrate/src_host/dst_host fail REGR. vs. 130965

Tests which are failing intermittently (not blocking):
 test-xtf-amd64-amd64-5       69 xtf/test-hvm64-xsa-278          fail in 133662 pass in 133755
 test-xtf-amd64-amd64-4       50 xtf/test-hvm64-lbr-tsx-vmentry  fail in 133680 pass in 133755
 test-amd64-i386-libvirt-pair 22 guest-migrate/src_host/dst_host fail pass in 133662
 test-xtf-amd64-amd64-2       50 xtf/test-hvm64-lbr-tsx-vmentry  fail pass in 133680

Tests which did not succeed, but are not blocking:
 test-xtf-amd64-amd64-5               50 xtf/test-hvm64-lbr-tsx-vmentry fail like 130965
 test-xtf-amd64-amd64-3               50 xtf/test-hvm64-lbr-tsx-vmentry fail like 130965
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop                     fail like 130965
 test-amd64-i386-xl-qemut-win7-amd64  17 guest-stop                     fail like 130965
 test-armhf-armhf-xl-rtds             16 guest-start/debian.repeat      fail like 130965
 test-amd64-i386-xl-qemuu-win7-amd64  17 guest-stop                     fail like 130965
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop                     fail like 130965
 test-amd64-i386-xl-qemuu-ws16-amd64  17 guest-stop                     fail like 130965
 test-amd64-amd64-xl-rtds             10 debian-install                 fail like 130965
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop                     fail like 130965
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop                     fail like 130965
 test-amd64-i386-xl-qemut-ws16-amd64  17 guest-stop                     fail like 130965
 test-amd64-amd64-libvirt             13 migrate-support-check          fail never pass
 test-amd64-amd64-libvirt-xsm         13 migrate-support-check          fail never pass
 test-amd64-i386-libvirt-xsm          13 migrate-support-check          fail never pass
 test-amd64-i386-libvirt              13 migrate-support-check          fail never pass
 test-arm64-arm64-xl                  13 migrate-support-check          fail never pass
 test-arm64-arm64-xl                  14 saverestore-support-check      fail never pass
 test-arm64-arm64-xl-credit1          13 migrate-support-check          fail never pass
 test-arm64-arm64-xl-credit1          14 saverestore-support-check      fail never pass
 test-arm64-arm64-xl-credit2          13 migrate-support-check          fail never pass
 test-arm64-arm64-xl-credit2          14 saverestore-support-check      fail never pass
 test-arm64-arm64-libvirt-xsm         13 migrate-support-check          fail never pass
 test-arm64-arm64-libvirt-xsm         14 saverestore-support-check      fail never pass
 test-arm64-arm64-xl-xsm              13 migrate-support-check          fail never pass
 test-arm64-arm64-xl-xsm              14 saverestore-support-check      fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale          13 migrate-support-check          fail never pass
 test-armhf-armhf-xl-arndale          14 saverestore-support-check      fail never pass
 test-amd64-amd64-qemuu-nested-amd    17 debian-hvm-install/l1/l2       fail never pass
 test-amd64-amd64-libvirt-vhd         12 migrate-support-check          fail never pass
 test-armhf-armhf-xl-credit1          13 migrate-support-check          fail never pass
 test-armhf-armhf-xl-credit1          14 saverestore-support-check      fail never pass
 test-armhf-armhf-xl-credit2          13 migrate-support-check          fail never pass
 test-armhf-armhf-xl-credit2          14 saverestore-support-check      fail never pass
 test-armhf-armhf-xl-multivcpu        13 migrate-support-check          fail never pass
 test-armhf-armhf-xl-multivcpu        14 saverestore-support-check      fail never pass
 test-armhf-armhf-xl-rtds             13 migrate-support-check          fail never pass
 test-armhf-armhf-xl-cubietruck       13 migrate-support-check          fail never pass
 test-armhf-armhf-xl-rtds             14 saverestore-support-check      fail never pass
 test-armhf-armhf-xl-cubietruck       14 saverestore-support-check      fail never pass
 test-armhf-armhf-libvirt             13 migrate-support-check          fail never pass
 test-armhf-armhf-libvirt             14 saverestore-support-check      fail never pass
 test-armhf-armhf-libvirt-raw         12 migrate-support-check          fail never pass
 test-armhf-armhf-libvirt-raw         13 saverestore-support-check      fail never pass
 test-armhf-armhf-xl                  13 migrate-support-check          fail never pass
 test-armhf-armhf-xl                  14 saverestore-support-check      fail never pass
 test-armhf-armhf-xl-vhd              12 migrate-support-check          fail never pass
 test-armhf-armhf-xl-vhd              13 saverestore-support-check      fail never pass
 test-amd64-i386-xl-qemut-win10-i386  10 windows-install                fail never pass
 test-amd64-i386-xl-qemuu-win10-i386  10 windows-install                fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install                fail never pass
Re: [Xen-devel] xen: Can't insert balloon page into VM userspace (WAS Re: [linux-linus bisection] complete test-arm64-arm64-xl-xsm)
On 12/03/2019 20:46, David Hildenbrand wrote:
> On 12.03.19 19:23, David Hildenbrand wrote:
>> On 12.03.19 19:02, Boris Ostrovsky wrote:
>>> On 3/12/19 1:24 PM, Andrew Cooper wrote:
>>>> On 12/03/2019 17:18, David Hildenbrand wrote:
>>>>> On 12.03.19 18:14, Matthew Wilcox wrote:
>>>>>> On Tue, Mar 12, 2019 at 05:05:39PM +, Julien Grall wrote:
>>>>>>> On 3/12/19 3:59 PM, Julien Grall wrote:
>>>>>>>> It looks like all the arm tests for the linus [1] and next [2]
>>>>>>>> trees are now failing. x86 seems to be mostly ok.
>>>>>>>>
>>>>>>>> The bisector fingered the following commit:
>>>>>>>>
>>>>>>>> commit 0ee930e6cafa048c1925893d0ca89918b2814f2c
>>>>>>>> Author: Matthew Wilcox
>>>>>>>> Date: Tue Mar 5 15:46:06 2019 -0800
>>>>>>>>
>>>>>>>>     mm/memory.c: prevent mapping typed pages to userspace
>>>>>>>>     Pages which use page_type must never be mapped to userspace
>>>>>>>>     as it would destroy their page type. Add an explicit check
>>>>>>>>     for this instead of assuming that kernel drivers always get
>>>>>>>>     this right.
>>>>>> Oh good, it found a real problem.
>>>>>>
>>>>>>> It turns out the problem is because the balloon driver will call
>>>>>>> __SetPageOffline() on allocated page. Therefore the page has a
>>>>>>> type and vm_insert_pages will deny the insertion.
>>>>>>>
>>>>>>> My knowledge is quite limited in this area. So I am not sure how
>>>>>>> we can solve the problem.
>>>>>>>
>>>>>>> I would appreciate if someone could provide input on how to fix
>>>>>>> the mapping.
>>>>>> I don't know the balloon driver, so I don't know why it was doing
>>>>>> this, but what it was doing was Wrong and has been since 2014 with:
>>>>>>
>>>>>> commit d6d86c0a7f8ddc5b38cf089222cb1d9540762dc2
>>>>>> Author: Konstantin Khlebnikov
>>>>>> Date: Thu Oct 9 15:29:27 2014 -0700
>>>>>>
>>>>>>     mm/balloon_compaction: redesign ballooned pages management
>>>>>>
>>>>>> If ballooned pages are supposed to be mapped into userspace, you
>>>>>> can't mark them as ballooned pages using the mapcount field.
>>>>>>
>>>>> Asking myself why anybody would want to map balloon inflated pages
>>>>> into user space (this just sounds plain wrong but my understanding
>>>>> of what the XEN balloon driver does might be limited), but I assume
>>>>> the easy fix would be to revert
>>>> I suspect the bug here is that the balloon driver is (ab)used for a
>>>> second purpose
>>>
>>> Yes. And its name is alloc_xenballooned_pages().
>>>
>> Haven't had a look at the code yet, but would another temporary fix be
>> to clear/set PG_offline when allocating/freeing a ballooned page?
>> (assuming here that only such pages will be mapped to user space)
>>
> I guess something like this could do the trick if I understood it
> correctly:
>
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 39b229f9e256..d37dd5bb7a8f 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -604,6 +604,7 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages)
>  	while (pgno < nr_pages) {
>  		page = balloon_retrieve(true);
>  		if (page) {
> +			__ClearPageOffline(page);
>  			pages[pgno++] = page;
>  #ifdef CONFIG_XEN_HAVE_PVMMU
>  			/*
> @@ -645,8 +646,10 @@ void free_xenballooned_pages(int nr_pages, struct page **pages)
>  	mutex_lock(&balloon_mutex);
>
>  	for (i = 0; i < nr_pages; i++) {
> -		if (pages[i])
> +		if (pages[i]) {
> +			__SetPageOffline(pages[i]);
>  			balloon_append(pages[i]);
> +		}
>  	}
>
>  	balloon_stats.target_unpopulated -= nr_pages;
>
>
> At least this way, the pages allocated (and thus eventually mapped to
> user space) would not be marked, but the other ones would remain marked
> and could be excluded by makedumptool.
>

I think this patch should do the trick. Julien, could you give it a try?

On x86 I can't reproduce your problem easily as dom0 is PV with plenty
of unpopulated pages for grant memory not suffering from the missing
"offline" bit.

Juergen
Re: [Xen-devel] [PATCH 1/2 for-4.12] xen: introduce VCPUOP_register_runstate_phys_memory_area hypercall
>>> On 05.03.19 at 14:14, wrote:
> From: Andrii Anisov
>
> The hypercall employs the same vcpu_register_runstate_memory_area
> structure for the interface, but requires the registered area to not
> cross a page boundary.
>
> Signed-off-by: Andrii Anisov

While first of all you'll need to settle the dispute with Julien, a
couple of comments on the actual changes you do in case the approach
is to be pursued.

> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -277,29 +277,33 @@ static void ctxt_switch_to(struct vcpu *n)
>  /* Update per-VCPU guest runstate shared memory area (if registered). */
>  static void update_runstate_area(struct vcpu *v)
>  {
> -    void __user *guest_handle = NULL;
> -
>      if ( guest_handle_is_null(runstate_guest(v)) )
>          return;

I think (seeing also patch 2) re-using the handle representation is a
bad idea. If there's a Xen-internal mapping use a plain pointer.

(All comments on Arm code respectively apply to the x86 version, also
[if any] in patch 2.)

> -    if ( VM_ASSIST(v->domain, runstate_update_flag) )
> +    if ( v->runstate_guest_type == RUNSTATE_VADDR )

Patch 2 shows that this wants to be switch().

>      {
> -        guest_handle = &v->runstate_guest.p->state_entry_time + 1;
> -        guest_handle--;
> -        v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
> -        __raw_copy_to_guest(guest_handle,
> -                            (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
> -        smp_wmb();
> -    }
> +        void __user *guest_handle = NULL;
> +        if ( VM_ASSIST(v->domain, runstate_update_flag) )
> +        {
> +            guest_handle = &v->runstate_guest.p->state_entry_time + 1;
> +            guest_handle--;
> +            v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
> +            __raw_copy_to_guest(guest_handle,
> +                                (void *)(&v->runstate.state_entry_time + 1) - 1,
> +                                1);
> +            smp_wmb();
> +        }
>
> -    __copy_to_guest(runstate_guest(v), &v->runstate, 1);
> +        __copy_to_guest(runstate_guest(v), &v->runstate, 1);
>
> -    if ( guest_handle )
> -    {
> -        v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
> -        smp_wmb();
> -        __raw_copy_to_guest(guest_handle,
> -                            (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
> +        if ( guest_handle )
> +        {
> +            v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
> +            smp_wmb();
> +            __raw_copy_to_guest(guest_handle,
> +                                (void *)(&v->runstate.state_entry_time + 1) - 1,
> +                                1);
> +        }
>      }

There looks to be an "else" missing here (or "default:" once changed
to switch()), which you have on the x86 side. I'm unconvinced it
should be "return true" there though (in which case having nothing
here would indeed have been correct) - ASSERT_UNREACHABLE() would
much rather seem appropriate.

> @@ -1535,6 +1536,12 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
>          break;
>      }
>
> +    case VCPUOP_register_runstate_phys_memory_area:
> +    {
> +        rc = -ENOSYS;
> +        break;
> +    }

Stray braces and bad use of -ENOSYS (should be e.g. -EOPNOTSUPP).

> --- a/xen/include/public/vcpu.h
> +++ b/xen/include/public/vcpu.h
> @@ -235,6 +235,22 @@ struct vcpu_register_time_memory_area {
>  typedef struct vcpu_register_time_memory_area vcpu_register_time_memory_area_t;
>  DEFINE_XEN_GUEST_HANDLE(vcpu_register_time_memory_area_t);
>
> +/*
> + * Register a shared memory area from which the guest may obtain its own
> + * runstate information without needing to execute a hypercall.
> + * Notes:
> + * 1. The registered address must be guest's physical address.
> + * 2. The registered runstate area should not cross page boundary.
> + * 3. Only one shared area may be registered per VCPU. The shared area is
> + *    updated by the hypervisor each time the VCPU is scheduled. Thus
> + *    runstate.state will always be RUNSTATE_running and
> + *    runstate.state_entry_time will indicate the system time at which the
> + *    VCPU was last scheduled to run.
> + * @extra_arg == pointer to vcpu_register_runstate_memory_area structure.

I don't think re-use of that structure is appropriate to represent a
physical address.

> + */
> +#define VCPUOP_register_runstate_phys_memory_area 14
> +
> +
>  #endif /* __XEN_PUBLIC_VCPU_H__ */

Stray double blank lines.

> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -163,6 +163,13 @@ struct vcpu
>      void            *sched_priv;    /* scheduler-specific data */
>
>      struct vcpu_runstate_info runstate;
> +
> +    enum {
> +        RUNSTATE_NONE = 0,
> +        RUNSTATE_PADDR = 1,
> +        RUNSTATE_VADDR = 2,
> +    } runstate_guest_type;

I can see that putting the new field here nicely groups with
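Pulling the individual review points together, the requested shape of
the Arm update path would be roughly the following sketch (case bodies
elided; RUNSTATE_* are the series' enumerators):

    static void update_runstate_area(struct vcpu *v)
    {
        switch ( v->runstate_guest_type )
        {
        case RUNSTATE_VADDR:
            /* ... existing guest-handle based copy-out ... */
            break;

        case RUNSTATE_PADDR:
            /* ... update through the Xen-internal mapping (patch 2) ... */
            break;

        default:
            ASSERT_UNREACHABLE();
            break;
        }
    }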
Re: [Xen-devel] [PATCH v5 09/11] viridian: add implementation of synthetic interrupt MSRs
On 14 March 2019 07:48, Jan Beulich wrote:
> >>> On 13.03.19 at 17:13, wrote:
> > I need the definition for the array in the struct so that
> > sizeof(HV_MESSAGE) == HV_MESSAGE_SIZE (for which there is a
> > BUILD_BUG_ON()) later.
>
> I don't understand this part - I'm not asking to ditch the #define.
> As to the BUILD_BUG_ON() - I see now, but that's only in patch
> 10, and in a specific message handler. I think this would belong
> here, and in the main viridian.c file.

Ok, I can see that it is more logical to co-locate the BUILD_BUG_ON()
with the definitions.

  Paul

> > It's also written that way in the spec. so I'd rather leave it as-is.
>
> Well, okay then - I was sort of expecting this to be spelled out
> there in such a way. That doesn't change my overall opinion,
> but I can see your point of wanting to match the spec, and you
> being the maintainer I have no basis to insist anyway.
>
> Jan
Re: [Xen-devel] [PATCH 2/2 for-4.12] xen: implement VCPUOP_register_runstate_phys_memory_area
>>> On 05.03.19 at 14:14, wrote:
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -305,6 +305,26 @@ static void update_runstate_area(struct vcpu *v)
>                                  1);
>          }
>      }
> +    else if ( v->runstate_guest_type == RUNSTATE_PADDR )
> +    {
> +        if ( VM_ASSIST(v->domain, runstate_update_flag) )
> +        {
> +            runstate_guest(v).p->state_entry_time |= XEN_RUNSTATE_UPDATE;
> +            smp_wmb();
> +            v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
> +        }
> +
> +        memcpy(runstate_guest(v).p, &v->runstate, sizeof(v->runstate));
> +
> +        if ( VM_ASSIST(v->domain, runstate_update_flag) )
> +        {
> +            runstate_guest(v).p->state_entry_time &= ~XEN_RUNSTATE_UPDATE;
> +            smp_wmb();
> +            v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
> +        }
> +    }
> +    else
> +    { /* No actions required */ }

See my respective comment on patch 1.

> @@ -1648,9 +1648,37 @@ bool update_runstate_area(struct vcpu *v)
>                                  (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
>          }
>      }
> -    else
> +    else if ( v->runstate_guest_type == RUNSTATE_PADDR )
>      {
> -        rc = true;
> +        if ( VM_ASSIST(v->domain, runstate_update_flag) )
> +        {
> +            v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
> +            if ( has_32bit_shinfo((v)->domain) )

Stray inner parentheses (at least one more instance further down).

> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -738,7 +738,14 @@ int domain_kill(struct domain *d)
>          if ( cpupool_move_domain(d, cpupool0) )
>              return -ERESTART;
>          for_each_vcpu ( d, v )
> +        {
> +            if ( v->runstate_guest_type == RUNSTATE_VADDR )
> +                set_xen_guest_handle(runstate_guest(v), NULL);
> +            else
> +                unmap_runstate_area(v);

Since this recurs further down, can't the VADDR case also be taken
care of by unmap_runstate_area()?

> @@ -1333,6 +1344,65 @@ void unmap_vcpu_info(struct vcpu *v)
>      put_page_and_type(mfn_to_page(mfn));
>  }
>
> +int map_runstate_area(struct vcpu *v,
> +                      struct vcpu_register_runstate_memory_area *area)

Neither this nor the unmap function look to be used outside this
file, so should be static. It also looks as if the second parameter
could be constified.

> +{
> +    unsigned long offset = area->addr.p & ~PAGE_MASK;
> +    gfn_t gfn = gaddr_to_gfn(area->addr.p);
> +    struct domain *d = v->domain;
> +    void *mapping;
> +    struct page_info *page;
> +    size_t size = sizeof (struct vcpu_runstate_info );

Stray blanks.

> +    ASSERT(v->runstate_guest_type == RUNSTATE_PADDR );

Another one.

> +    if ( offset > (PAGE_SIZE - size) )
> +        return -EINVAL;
> +
> +    page = get_page_from_gfn(d, gfn_x(gfn), NULL, P2M_ALLOC);

Don't you also need P2M_UNSHARE? And is the p2m type of the page
really of no interest here?

> +void unmap_runstate_area(struct vcpu *v)
> +{
> +    mfn_t mfn;
> +
> +    if ( v->runstate_guest_type != RUNSTATE_PADDR )
> +        return;
> +
> +    if ( guest_handle_is_null(runstate_guest(v)) )
> +        return;
> +
> +    mfn = _mfn(virt_to_mfn(runstate_guest(v).p));

The pointer is the result of __map_domain_page_global() - I don't
think you can legitimately use virt_to_mfn() on it, at least not on
x86; domain_page_map_to_mfn() is what you want to use here.

> @@ -1532,13 +1602,29 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
>              vcpu_runstate_get(v, &runstate);
>              __copy_to_guest(runstate_guest(v), &runstate, 1);
>          }
> -
>          break;
>      }

Undue removal of a blank line.

>      case VCPUOP_register_runstate_phys_memory_area:
>      {
> -        rc = -ENOSYS;
> +        struct vcpu_register_runstate_memory_area area;
> +
> +        rc = -EFAULT;
> +        if ( copy_from_guest(&area, arg, 1) )
> +            break;
> +
> +        unmap_runstate_area(v);
> +        v->runstate_guest_type = RUNSTATE_PADDR;
> +        rc = map_runstate_area(v, &area);
> +
> +        if ( rc )
> +        {
> +            v->runstate_guest_type = RUNSTATE_NONE;

All of this updating of type and mapping is racy, and I guess that's
also what Julien has already pointed out. As long as the mapping can
be updated by remote CPUs, you have to make sure that the consuming
side can't see partially updated values. One option _might_ be to
switch to NONE first, and to PADDR upon success (with suitable
barriers in between). The logic then of course needs mirroring to the
VADDR path.

Jan
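To make the suggested race fix concrete, a sketch of the ordering
described above (illustrative only; the series' ASSERT() inside
map_runstate_area() would need adjusting, and the barrier placement
would need careful review in the real patch):

    case VCPUOP_register_runstate_phys_memory_area:
    {
        struct vcpu_register_runstate_memory_area area;

        rc = -EFAULT;
        if ( copy_from_guest(&area, arg, 1) )
            break;

        unmap_runstate_area(v);

        /* Publish "no area" first, so a concurrent consumer never sees
         * RUNSTATE_PADDR paired with a stale or partial mapping. */
        v->runstate_guest_type = RUNSTATE_NONE;
        smp_wmb();

        rc = map_runstate_area(v, &area);
        if ( !rc )
        {
            /* Mapping fully set up; only now expose the new type. */
            smp_wmb();
            v->runstate_guest_type = RUNSTATE_PADDR;
        }

        break;
    }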
[Xen-devel] Hypervisor Debugging GDB, KDB
Dear Xen Developer mailing list,

I have been looking for detailed information about the current status
of Xen hypervisor debugging. I checked the mailing list (questions
mostly unanswered or outdated), the repositories, and several search
engines without finding a satisfying answer to my question.

Is there any way of single-stepping the Xen hypervisor as we usually
do with, e.g., the Linux kernel using gdb and kdb over a serial port?
If so, can you give me a hint where to find this information? It might
furthermore be good to make this information easily accessible for
newcomers in Xen development.

Best,
Jan Ruh
Re: [Xen-devel] [PATCH] xen-block: Replace qdict_put_obj() by qdict_put() where appropriate
On 13 March 2019 17:45, Markus Armbruster wrote:
> Patch created mechanically by rerunning:
>
> $ spatch --sp-file scripts/coccinelle/qobject.cocci \
>          --macro-file scripts/cocci-macro-file.h \
>          --dir hw/block --in-place
>
> Signed-off-by: Markus Armbruster

Acked-by: Paul Durrant

> ---
>  hw/block/xen-block.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
> index 70fc2455e8..9c722b9b95 100644
> --- a/hw/block/xen-block.c
> +++ b/hw/block/xen-block.c
> @@ -771,7 +771,7 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
>          QDict *cache_qdict = qdict_new();
>
>          qdict_put_bool(cache_qdict, "direct", true);
> -        qdict_put_obj(file_layer, "cache", QOBJECT(cache_qdict));
> +        qdict_put(file_layer, "cache", cache_qdict);
>
>          qdict_put_str(file_layer, "aio", "native");
>      }
> @@ -796,7 +796,7 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
>      qdict_put_str(driver_layer, "driver", driver);
>      g_free(driver);
>
> -    qdict_put_obj(driver_layer, "file", QOBJECT(file_layer));
> +    qdict_put(driver_layer, "file", file_layer);
>
>      g_assert(!drive->node_name);
>      drive->node_name = xen_block_blockdev_add(drive->id, driver_layer,
> --
> 2.17.2
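For context (from memory of QEMU's qdict API, so treat the exact form
as an assumption): the transformation is purely mechanical because
qdict_put() is a thin wrapper around qdict_put_obj() that performs the
QOBJECT() cast itself:

    /* Approximately as in include/qapi/qmp/qdict.h: */
    #define qdict_put(qdict, key, obj) \
        qdict_put_obj(qdict, key, QOBJECT(obj))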
[Xen-devel] [PATCH 0/2] xen: some debug trace enhancements
While doing some scheduler work I used debug trace for diagnosis of
problems during dom0 boot.

This small series is the result of adapting debug trace to my needs.

Juergen Gross (2):
  xen/debug: make debugtrace configurable via Kconfig
  xen/debug: make debugtrace more clever regarding repeating entries

 xen/Kconfig.debug          |  7 +++
 xen/drivers/char/console.c | 46 ++
 xen/include/xen/lib.h      |  3 +--
 3 files changed, 42 insertions(+), 14 deletions(-)

--
2.16.4
[Xen-devel] [PATCH 2/2] xen/debug: make debugtrace more clever regarding repeating entries
In case debugtrace is writing to memory and the last entry is repeated
don't fill up the trace buffer, but modify the count prefix to "x-y "
style instead.

Signed-off-by: Juergen Gross
---
 xen/drivers/char/console.c | 44 +---
 1 file changed, 33 insertions(+), 11 deletions(-)

diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index 41ec13ce52..38dcfc732e 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -1225,13 +1225,28 @@ void debugtrace_dump(void)
     watchdog_enable();
 }

+static void debugtrace_add_to_buf(char *buf)
+{
+    char *p;
+
+    for ( p = buf; *p != '\0'; p++ )
+    {
+        debugtrace_buf[debugtrace_prd++] = *p;
+        /* Always leave a nul byte at the end of the buffer. */
+        if ( debugtrace_prd == (debugtrace_bytes - 1) )
+            debugtrace_prd = 0;
+    }
+}
+
 void debugtrace_printk(const char *fmt, ...)
 {
-    static char buf[1024];
-    static u32 count;
+    static char buf[1024];
+    static char last_buf[1024];
+    static u32 count, last_count;
+    static unsigned int last_prd;
+    char cntbuf[24];
     va_list args;
-    char *p;
     unsigned long flags;

     if ( debugtrace_bytes == 0 )
@@ -1243,25 +1258,32 @@ void debugtrace_printk(const char *fmt, ...)

     ASSERT(debugtrace_buf[debugtrace_bytes - 1] == 0);

-    snprintf(buf, sizeof(buf), "%u ", ++count);
-
     va_start(args, fmt);
-    (void)vsnprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), fmt, args);
+    (void)vsnprintf(buf, sizeof(buf), fmt, args);
     va_end(args);

     if ( debugtrace_send_to_console )
     {
+        snprintf(cntbuf, sizeof(cntbuf), "%u ", ++count);
+        serial_puts(sercon_handle, cntbuf);
         serial_puts(sercon_handle, buf);
     }
     else
     {
-        for ( p = buf; *p != '\0'; p++ )
+        if ( strcmp(buf, last_buf) )
+        {
+            last_prd = debugtrace_prd;
+            last_count = ++count;
+            safe_strcpy(last_buf, buf);
+            snprintf(cntbuf, sizeof(cntbuf), "%u ", count);
+        }
+        else
         {
-            debugtrace_buf[debugtrace_prd++] = *p;
-            /* Always leave a nul byte at the end of the buffer. */
-            if ( debugtrace_prd == (debugtrace_bytes - 1) )
-                debugtrace_prd = 0;
+            debugtrace_prd = last_prd;
+            snprintf(cntbuf, sizeof(cntbuf), "%u-%u ", last_count, ++count);
         }
+        debugtrace_add_to_buf(cntbuf);
+        debugtrace_add_to_buf(buf);
     }

     spin_unlock_irqrestore(&debugtrace_lock, flags);
--
2.16.4
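To illustrate the intended effect with hypothetical trace contents: a
run of identical messages now collapses into a single buffer entry
whose prefix spans the first and last sequence number, instead of
occupying one buffer entry per repeat:

    1 vcpu0 scheduled
    2-7 timer fired        (six identical messages folded into one entry)
    8 vcpu0 scheduled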
[Xen-devel] [PATCH 1/2] xen/debug: make debugtrace configurable via Kconfig
Instead of having to edit include/xen/lib.h for making debugtrace
available make it configurable via Kconfig. Default is off; it is
available only in expert mode or in debug builds.

Signed-off-by: Juergen Gross
---
 xen/Kconfig.debug          | 7 +++
 xen/drivers/char/console.c | 2 +-
 xen/include/xen/lib.h      | 3 +--
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug
index 4d5d7f87cb..daacf85141 100644
--- a/xen/Kconfig.debug
+++ b/xen/Kconfig.debug
@@ -98,6 +98,13 @@ config UBSAN
 	  If unsure, say N here.

+config DEBUG_TRACE
+	bool "Debug trace support"
+	---help---
+	  Debug trace enables recording of debug trace messages which are
+	  either printed directly to the console or are printed to the
+	  console in case of a system crash.
+
 endif # DEBUG || EXPERT

 endmenu

diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index 4315588f05..41ec13ce52 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -1157,7 +1157,7 @@ int printk_ratelimit(void)
  * **
  */

-#ifdef DEBUG_TRACE_DUMP
+#ifdef CONFIG_DEBUG_TRACE

 /* Send output direct to console, or buffer it? */
 static volatile int debugtrace_send_to_console;

diff --git a/xen/include/xen/lib.h b/xen/include/xen/lib.h
index 89939f43c8..e0b7bcb6b7 100644
--- a/xen/include/xen/lib.h
+++ b/xen/include/xen/lib.h
@@ -86,8 +86,7 @@ int parse_boolean(const char *name, const char *s, const char *e);
  */
 int cmdline_strcmp(const char *frag, const char *name);

-/*#define DEBUG_TRACE_DUMP*/
-#ifdef DEBUG_TRACE_DUMP
+#ifdef CONFIG_DEBUG_TRACE
 extern void debugtrace_dump(void);
 extern void debugtrace_printk(const char *fmt, ...)
     __attribute__ ((format (printf, 1, 2)));
--
2.16.4
[Xen-devel] [freebsd-master test] 133765: regressions - FAIL
flight 133765 freebsd-master real [real]
http://logs.test-lab.xenproject.org/osstest/logs/133765/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-freebsd-again 5 host-install(5) fail REGR. vs. 133707

version targeted for testing:
 freebsd  603a99698863e28eb4970588946e212c6a7c41d7
baseline version:
 freebsd  4545fe820a83b675845f9045f06c856129b9c17c

Last test of basis   133707  2019-03-11 09:19:18 Z  3 days
Testing same since   133765  2019-03-13 09:19:31 Z  1 days  1 attempts

People who touched revisions under test:
 0mp <0...@freebsd.org>, ae, asomers, bcr, cy, dab, dim, emaste, gahr,
 gjb, hselasky, imp, jhb, kadesai, ken, kevans, kib, lidl, mav,
 mckusick, np, sjg, trasz, wosch, wulf

jobs:
 build-amd64-freebsd-again    fail
 build-amd64-freebsd          pass
 build-amd64-xen-freebsd      pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
 http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
 http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
 http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

Not pushing.

(No revision log; it would be 848 lines long.)
Re: [Xen-devel] [PATCH v5 10/11] viridian: add implementation of synthetic timers
On 13 March 2019 14:37, Paul Durrant wrote:
> [snip]
> > As to safety of this, I have two concerns:
> >
> > 1) TscSequence gets updated as a result of a guest action (an MSR
> > write). This makes it non-obvious that the loop above will get
> > exited in due course.
> >
>
> True. The domain could try to DoS this call. This could be avoided by
> doing a domain_pause() if the test continuously fails for a number of
> iterations, or maybe just one iteration.
>
> > 2) The way update_reference_tsc() deals with the two "invalid"
> > values suggests ~0 and 0 should be special cased in general. I
> > _think_ this is not necessary here, but it also seems to me as if
> > the VM ever having a way to observe either of those two values
> > would be wrong too. Shouldn't the function avoid to ever store
> > ~0 into that field, i.e. increment into a local variable, update
> > that local variable to skip the two "invalid" values, and only then
> > store into the field?
> >
> > Otoh, making it into that function being a result of an MSR write,
> > it may well be that the spec precludes the guest from reading
> > the reference page while an update was invoked from one of its
> > vCPU-s. If this was the case, then I also wouldn't have to
> > wonder any longer how this entire mechanism can be race free
> > in the first place (without a double increment like we do in the
> > pv-clock protocol).
>
> From observation, it looks like Windows initializes the reference tsc
> page before it brings secondary CPUs online and then doesn't touch the
> MSR again, so we should probably only tolerate one mismatch in
> time_now() before doing domain_pause().

Actually it occurred to me last night that I'm being completely thick
by coding it this way. The viridian code sets TscScale, not the guest,
so we don't even need to reference the HV_REFERENCE_TSC_PAGE struct.
Looking again, I'm also concerned that there's a small TOCTOU race in
testing whether the reference tsc page is valid, where the guest could
unmap it on another CPU and cause a NULL pointer deref in time_now(),
so I'll re-work this entirely.

  Paul
[Xen-devel] [PATCH 0/2] xen: some scheduler related simplifications
While working on the scheduler I needed to modify some bits which are
worth considering as stand-alone improvements IMO.

Juergen Gross (2):
  xen: introduce a cpumask with all bits set
  xen/sched: don't disable scheduler on cpus during suspend

 xen/arch/x86/io_apic.c    |   4 +-
 xen/common/cpu.c          |   4 ++
 xen/common/cpupool.c      |   2 +-
 xen/common/schedule.c     | 163 +-
 xen/include/xen/cpumask.h |   2 +
 5 files changed, 53 insertions(+), 122 deletions(-)

--
2.16.4
[Xen-devel] [PATCH 1/2] xen: introduce a cpumask with all bits set
There are several places in Xen allocating a cpumask on the stack and
setting all bits in it just to use it as an initial mask for allowing
all cpus.

Save the stack space and omit the need for runtime initialization by
defining a globally accessible cpumask_all variable.

Signed-off-by: Juergen Gross
---
 xen/arch/x86/io_apic.c    |  4 +---
 xen/common/cpu.c          |  4 ++++
 xen/common/schedule.c     | 17 +++++-----------
 xen/include/xen/cpumask.h |  2 ++
 4 files changed, 11 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index daa5e9e5ff..a5344ed727 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1881,7 +1881,6 @@ static void __init check_timer(void)
     int apic1, pin1, apic2, pin2;
     int vector, ret;
     unsigned long flags;
-    cpumask_t mask_all;

     local_irq_save(flags);

@@ -1892,8 +1891,7 @@ static void __init check_timer(void)
     vector = IRQ0_VECTOR;
     clear_irq_vector(0);

-    cpumask_setall(&mask_all);
-    if ((ret = bind_irq_vector(0, vector, &mask_all)))
+    if ((ret = bind_irq_vector(0, vector, &cpumask_all)))
         printk(KERN_ERR"..IRQ0 is not set correctly with ioapic!!!, err:%d\n", ret);

     irq_desc[0].status &= ~IRQ_DISABLED;

diff --git a/xen/common/cpu.c b/xen/common/cpu.c
index 653a56b840..836c62f97f 100644
--- a/xen/common/cpu.c
+++ b/xen/common/cpu.c
@@ -11,6 +11,10 @@
 unsigned int __read_mostly nr_cpumask_bits = BITS_TO_LONGS(NR_CPUS) * BITS_PER_LONG;
 #endif

+const cpumask_t cpumask_all = {
+    .bits[0 ... (BITS_TO_LONGS(NR_CPUS) - 1)] = ~0UL
+};
+
 /*
  * cpu_bit_bitmap[] is a special, "compressed" data structure that
  * represents all NR_CPUS bits binary values of 1<<nr
[...]
     v->processor = processor;
@@ -280,9 +277,9 @@ int sched_init_vcpu(struct vcpu *v, unsigned int processor)
      * domain-0 VCPUs, are pinned onto their respective physical CPUs.
      */
     if ( is_idle_domain(d) || d->is_pinned )
-        sched_set_affinity(v, cpumask_of(processor), &allcpus);
+        sched_set_affinity(v, cpumask_of(processor), &cpumask_all);
     else
-        sched_set_affinity(v, &allcpus, &allcpus);
+        sched_set_affinity(v, &cpumask_all, &cpumask_all);

     /* Idle VCPUs are scheduled immediately, so don't put them in runqueue. */
     if ( is_idle_domain(d) )
@@ -361,7 +358,6 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
     for_each_vcpu ( d, v )
     {
         spinlock_t *lock;
-        cpumask_t allcpus;

         vcpudata = v->sched_priv;

@@ -369,11 +365,9 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
         migrate_timer(&v->singleshot_timer, new_p);
         migrate_timer(&v->poll_timer, new_p);

-        cpumask_setall(&allcpus);
-
         lock = vcpu_schedule_lock_irq(v);

-        sched_set_affinity(v, &allcpus, &allcpus);
+        sched_set_affinity(v, &cpumask_all, &cpumask_all);

         v->processor = new_p;

         /*
@@ -812,8 +806,6 @@ int cpu_disable_scheduler(unsigned int cpu)
             if ( cpumask_empty(&online_affinity) &&
                  cpumask_test_cpu(cpu, v->cpu_hard_affinity) )
             {
-                cpumask_t allcpus;
-
                 if ( v->affinity_broken )
                 {
                     /* The vcpu is temporarily pinned, can't move it. */
@@ -831,8 +823,7 @@ int cpu_disable_scheduler(unsigned int cpu)
                 else
                     printk(XENLOG_DEBUG "Breaking affinity for %pv\n", v);

-                cpumask_setall(&allcpus);
-                sched_set_affinity(v, &allcpus, NULL);
+                sched_set_affinity(v, &cpumask_all, NULL);
             }

             if ( v->processor != cpu )

diff --git a/xen/include/xen/cpumask.h b/xen/include/xen/cpumask.h
index b4cc92a4f5..5a43438988 100644
--- a/xen/include/xen/cpumask.h
+++ b/xen/include/xen/cpumask.h
@@ -75,6 +75,8 @@
 typedef struct cpumask{ DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;

+extern const cpumask_t cpumask_all;
+
 extern unsigned int nr_cpu_ids;

 #if NR_CPUS > 4 * BITS_PER_LONG
--
2.16.4
[Xen-devel] [PATCH 2/2] xen/sched: don't disable scheduler on cpus during suspend
Today there is special handling in cpu_disable_scheduler() for suspend
by forcing all vcpus to the boot cpu. In fact there is no need for
that as during resume the vcpus are put on the correct cpus again.

So we can just omit the call of cpu_disable_scheduler() when offlining
a cpu due to suspend and on resuming we can omit taking the schedule
lock for selecting the new processor.

In restore_vcpu_affinity() we should be careful when applying
affinity as the cpu might not have come back to life. This in turn
enables us to even support affinity_broken across suspend/resume.

Signed-off-by: Juergen Gross
---
 xen/common/cpupool.c  |   2 +-
 xen/common/schedule.c | 146 ++
 2 files changed, 42 insertions(+), 106 deletions(-)

diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c
index e89bb67e71..7b5ce18426 100644
--- a/xen/common/cpupool.c
+++ b/xen/common/cpupool.c
@@ -313,7 +313,7 @@ static long cpupool_unassign_cpu_helper(void *info)
      * cpu_disable_scheduler(), and at the bottom of this function.
      */
     rcu_read_lock(&domlist_read_lock);
-    ret = cpu_disable_scheduler(cpu);
+    ret = (system_state == SYS_STATE_suspend) ? 0 : cpu_disable_scheduler(cpu);
     cpumask_set_cpu(cpu, &cpupool_free_cpus);

     /*
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 60755a631e..1ab4f182ce 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -560,33 +560,6 @@ static void vcpu_move_locked(struct vcpu *v, unsigned int new_cpu)
     v->processor = new_cpu;
 }

-/*
- * Move a vcpu from its current processor to a target new processor,
- * without asking the scheduler to do any placement. This is intended
- * for being called from special contexts, where things are quiet
- * enough that no contention is supposed to happen (i.e., during
- * shutdown or software suspend, like ACPI S3).
- */
-static void vcpu_move_nosched(struct vcpu *v, unsigned int new_cpu)
-{
-    unsigned long flags;
-    spinlock_t *lock, *new_lock;
-
-    ASSERT(system_state == SYS_STATE_suspend);
-    ASSERT(!vcpu_runnable(v) && (atomic_read(&v->pause_count) ||
-                                 atomic_read(&v->domain->pause_count)));
-
-    lock = per_cpu(schedule_data, v->processor).schedule_lock;
-    new_lock = per_cpu(schedule_data, new_cpu).schedule_lock;
-
-    sched_spin_lock_double(lock, new_lock, &flags);
-    ASSERT(new_cpu != v->processor);
-    vcpu_move_locked(v, new_cpu);
-    sched_spin_unlock_double(lock, new_lock, flags);
-
-    sched_move_irqs(v);
-}
-
 /*
  * Initiating migration
  *
@@ -735,31 +708,36 @@ void restore_vcpu_affinity(struct domain *d)

         ASSERT(!vcpu_runnable(v));

-        lock = vcpu_schedule_lock_irq(v);
-
-        if ( v->affinity_broken )
-        {
-            sched_set_affinity(v, v->cpu_hard_affinity_saved, NULL);
-            v->affinity_broken = 0;
-
-        }
-
         /*
-         * During suspend (in cpu_disable_scheduler()), we moved every vCPU
-         * to BSP (which, as of now, is pCPU 0), as a temporary measure to
-         * allow the nonboot processors to have their data structure freed
-         * and go to sleep. But nothing guardantees that the BSP is a valid
-         * pCPU for a particular domain.
+         * Re-assign the initial processor as after resume we have no
+         * guarantee the old processor has come back to life again.
          *
          * Therefore, here, before actually unpausing the domains, we should
          * set v->processor of each of their vCPUs to something that will
         * make sense for the scheduler of the cpupool in which they are in.
          */
         cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
-                    cpupool_domain_cpumask(v->domain));
-        v->processor = cpumask_any(cpumask_scratch_cpu(cpu));
+                    cpupool_domain_cpumask(d));
+        if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
+        {
+            if ( v->affinity_broken )
+            {
+                sched_set_affinity(v, v->cpu_hard_affinity_saved, NULL);
+                v->affinity_broken = 0;
+                cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
+                            cpupool_domain_cpumask(d));
+            }

-        spin_unlock_irq(lock);
+            if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
+            {
+                printk(XENLOG_DEBUG "Breaking affinity for %pv\n", v);
+                sched_set_affinity(v, &cpumask_all, NULL);
+                cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
+                            cpupool_domain_cpumask(d));
+            }
+        }
+
+        v->processor = cpumask_any(cpumask_scratch_cpu(cpu));

         lock = vcpu_schedule_lock_irq(v);
         v->processor = SCHED_OP(vcpu_scheduler(v), pick_cpu, v);
@@ -782,7 +760,6 @@ int cpu_disable_scheduler(unsigned int cpu)
     struct vcpu *v;
     struct cpupool *c;
     cpumas
Re: [Xen-devel] [PATCH 1/2] xen: introduce a cpumask with all bits set
On 14/03/2019 09:59, Juergen Gross wrote:
> There are several places in Xen allocating a cpumask on the stack and
> setting all bits in it just to use it as an initial mask for allowing
> all cpus.
>
> Save the stack space and omit the need for runtime initialization by
> defining a globally accessible cpumask_all variable.
>
> Signed-off-by: Juergen Gross

Reviewed-by: Andrew Cooper , with one minor style suggestion which can
be folded on commit.

> diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
> index daa5e9e5ff..a5344ed727 100644
> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -1892,8 +1891,7 @@ static void __init check_timer(void)
>      vector = IRQ0_VECTOR;
>      clear_irq_vector(0);
>
> -    cpumask_setall(&mask_all);
> -    if ((ret = bind_irq_vector(0, vector, &mask_all)))
> +    if ((ret = bind_irq_vector(0, vector, &cpumask_all)))

Spaces, seeing as we're modifying the line.

~Andrew
Re: [Xen-devel] [PATCH 1/2] xen: introduce a cpumask with all bits set
On 14/03/2019 11:04, Andrew Cooper wrote:
> On 14/03/2019 09:59, Juergen Gross wrote:
>> There are several places in Xen allocating a cpumask on the stack and
>> setting all bits in it just to use it as an initial mask for allowing
>> all cpus.
>>
>> Save the stack space and omit the need for runtime initialization by
>> defining a globally accessible cpumask_all variable.
>>
>> Signed-off-by: Juergen Gross
>
> Reviewed-by: Andrew Cooper , with one minor style suggestion which can
> be folded on commit.
>
>> diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
>> index daa5e9e5ff..a5344ed727 100644
>> --- a/xen/arch/x86/io_apic.c
>> +++ b/xen/arch/x86/io_apic.c
>> @@ -1892,8 +1891,7 @@ static void __init check_timer(void)
>>      vector = IRQ0_VECTOR;
>>      clear_irq_vector(0);
>>
>> -    cpumask_setall(&mask_all);
>> -    if ((ret = bind_irq_vector(0, vector, &mask_all)))
>> +    if ((ret = bind_irq_vector(0, vector, &cpumask_all)))
>
> Spaces, seeing as we're modifying the line.

You are aware this file originates from the Linux kernel? I really
don't mind either way, I just wanted to mention it...

Juergen
Re: [Xen-devel] [PATCH for-next 7/9] xen: Convert is_xen_heap_mfn to use typesafe MFN
Hi Jan,

On 3/14/19 7:52 AM, Jan Beulich wrote:
> >>> On 13.03.19 at 18:24, wrote:
>> On 13/03/2019 15:04, Jan Beulich wrote:
>>> On 18.02.19 at 12:35, wrote:
>>>> --- a/xen/common/page_alloc.c
>>>> +++ b/xen/common/page_alloc.c
>>>> @@ -2121,9 +2121,9 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe)
>>>>  * Yuk! Ensure there is a one-page buffer between Xen and Dom zones, to
>>>>  * prevent merging of power-of-two blocks across the zone boundary.
>>>>  */
>>>> -    if ( ps && !is_xen_heap_mfn(paddr_to_pfn(ps)-1) )
>>>> +    if ( ps && !is_xen_heap_mfn(_mfn(paddr_to_pfn(ps)-1)) )
>>>>          ps += PAGE_SIZE;
>>>> -    if ( !is_xen_heap_mfn(paddr_to_pfn(pe)) )
>>>> +    if ( !is_xen_heap_mfn(maddr_to_mfn(pe)) )
>>> Why maddr_to_mfn() here but still paddr_to_pfn() above? Oh,
>>> we don't have any mfn_sub(), I see.
>> Yes, we don't have mfn_sub() (or even gfn_sub()). I only found a
>> couple of places where such helpers might be useful. I can introduce
>> the 2 helpers if you think it is worth it.
> Well, I guess in the end I'm fine either way. It simply struck me
> as odd at the first glance that you use maddr_to_mfn() in one
> case but not the other.

I wanted to avoid adding mfn_x(...) on the line:

_mfn(mfn_x(maddr_to_mfn(ps)) - 1)

I will look to introduce mfn_sub()/gfn_sub().

Cheers,

--
Julien Grall
Re: [Xen-devel] [PATCH for-next 7/9] xen: Convert is_xen_heap_mfn to use typesafe MFN
On 14/03/2019 10:12, Julien Grall wrote: > Hi Jan, > > On 3/14/19 7:52 AM, Jan Beulich wrote: > On 13.03.19 at 18:24, wrote: >>> On 13/03/2019 15:04, Jan Beulich wrote: >>> On 18.02.19 at 12:35, wrote: > --- a/xen/common/page_alloc.c > +++ b/xen/common/page_alloc.c > @@ -2121,9 +2121,9 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe) > * Yuk! Ensure there is a one-page buffer between Xen and > Dom zones, to > * prevent merging of power-of-two blocks across the zone > boundary. > */ > - if ( ps && !is_xen_heap_mfn(paddr_to_pfn(ps)-1) ) > + if ( ps && !is_xen_heap_mfn(_mfn(paddr_to_pfn(ps)-1)) ) > ps += PAGE_SIZE; > - if ( !is_xen_heap_mfn(paddr_to_pfn(pe)) ) > + if ( !is_xen_heap_mfn(maddr_to_mfn(pe)) ) Why maddr_to_mfn() here but still paddr_to_pfn() above? Oh, we don't have any mfn_sub(), I see. >>> >>> Yes we don't have mfn_sub() (or even gfn_sub()). I only found a >>> couple of places >>> where such helpers might be useful. I can introduce the 2 helpers if >>> you think >>> it is worth it. >> >> Well, I guess in the end I'm fine either way. It simply struck me >> as odd at the first glance that you use maddr_to_mfn() in one >> case but not the other. > > I wanted to avoid adding mfn_x(...) on the line: > > _mfn(mfn_x(maddr_to_mfn(ps)) - 1) > > I will look to introduce mfn_sub()/gfn_sub(). You do know that mfn_add(mfn, -1) will DTRT? ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-next 7/9] xen: Convert is_xen_heap_mfn to use typesafe MFN
Hi Andrew, On 3/14/19 10:14 AM, Andrew Cooper wrote: On 14/03/2019 10:12, Julien Grall wrote: Hi Jan, On 3/14/19 7:52 AM, Jan Beulich wrote: On 13.03.19 at 18:24, wrote: On 13/03/2019 15:04, Jan Beulich wrote: On 18.02.19 at 12:35, wrote: --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -2121,9 +2121,9 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe) * Yuk! Ensure there is a one-page buffer between Xen and Dom zones, to * prevent merging of power-of-two blocks across the zone boundary. */ - if ( ps && !is_xen_heap_mfn(paddr_to_pfn(ps)-1) ) + if ( ps && !is_xen_heap_mfn(_mfn(paddr_to_pfn(ps)-1)) ) ps += PAGE_SIZE; - if ( !is_xen_heap_mfn(paddr_to_pfn(pe)) ) + if ( !is_xen_heap_mfn(maddr_to_mfn(pe)) ) Why maddr_to_mfn() here but still paddr_to_pfn() above? Oh, we don't have any mfn_sub(), I see. Yes we don't have mfn_sub() (or even gfn_sub()). I only found a couple of places where such helpers might be useful. I can introduce the 2 helpers if you think it is worth it. Well, I guess in the end I'm fine either way. It simply struck me as odd at the first glance that you use maddr_to_mfn() in one case but not the other. I wanted to avoid adding mfn_x(...) on the line: _mfn(mfn_x(maddr_to_mfn(ps)) - 1) I will look to introduce mfn_sub()/gfn_sub(). You do know that mfn_add(mfn, -1) will DTRT? It didn't occur to me until you said it. I will use that then. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
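For reference, a sketch of the typesafe arithmetic being discussed, assuming the usual mfn_t wrappers (the signature is recalled from xen/include/xen/mm.h and should be treated as an assumption):

static inline mfn_t mfn_add(mfn_t mfn, unsigned long i)
{
    /* Unsigned arithmetic wraps, so i == -1UL subtracts one frame. */
    return _mfn(mfn_x(mfn) + i);
}

/* The page_alloc.c check can then avoid the mfn_x()/_mfn() round trip: */
if ( ps && !is_xen_heap_mfn(mfn_add(maddr_to_mfn(ps), -1)) )
    ps += PAGE_SIZE;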
Re: [Xen-devel] [PATCH 1/2] xen: introduce a cpumask with all bits set
On 3/14/19 9:59 AM, Juergen Gross wrote: > There are several places in Xen allocating a cpumask on the stack and > setting all bits in it just to use it as an initial mask for allowing > all cpus. > > Save the stack space and omit the need for runtime initialization by > defining a globally accessible cpumask_all variable. > > Signed-off-by: Juergen Gross Acked-by: George Dunlap ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [libvirt test] 133757: regressions - FAIL
flight 133757 libvirt real [real] http://logs.test-lab.xenproject.org/osstest/logs/133757/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-i386-libvirt6 libvirt-buildfail REGR. vs. 133672 Tests which did not succeed, but are not blocking: test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-i386-libvirt-pair 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 133672 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 133672 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-arm64-arm64-libvirt-qcow2 12 migrate-support-checkfail never pass test-arm64-arm64-libvirt-qcow2 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass version targeted for testing: libvirt c615c14246880c91afa3fe1659dc2104447de601 baseline version: libvirt 7a05c739c26decb8ff0eef4f6c75ce3ef729532d Last test of basis 133672 2019-03-09 13:00:19 Z4 days Failing since133728 2019-03-12 04:19:12 Z2 days2 attempts Testing same since 133757 2019-03-13 07:07:24 Z1 days1 attempts People who touched revisions under test: Andrea Bolognani Eric Blake Laszlo Ersek Martin Kletzander Michal Privoznik Peter Krempa Shotaro Gotanda jobs: build-amd64-xsm pass build-arm64-xsm pass build-i386-xsm pass build-amd64 pass build-arm64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-arm64-libvirt pass build-armhf-libvirt pass build-i386-libvirt fail build-amd64-pvopspass build-arm64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmblocked test-amd64-amd64-libvirt-xsm pass test-arm64-arm64-libvirt-xsm pass test-amd64-i386-libvirt-xsm blocked test-amd64-amd64-libvirt pass test-arm64-arm64-libvirt pass test-armhf-armhf-libvirt pass test-amd64-i386-libvirt blocked test-amd64-amd64-libvirt-pairpass test-amd64-i386-libvirt-pair blocked test-arm64-arm64-libvirt-qcow2 pass test-armhf-armhf-libvirt-raw pass test-amd64-amd64-libvirt-vhd pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=o
Re: [Xen-devel] [PATCH] tools/cpuid: Update xen-cpuid for Coffee Lake processors
On 14/03/2019 08:20, Jan Beulich wrote: On 13.03.19 at 20:37, wrote: >> --- a/tools/misc/xen-cpuid.c >> +++ b/tools/misc/xen-cpuid.c >> @@ -127,6 +127,8 @@ static const char *str_7c0[32] = >> [14] = "avx512_vpopcntdq", >> >> [22] = "rdpid", >> + >> +[30] = "sgx-lc", >> }; >> >> static const char *str_e7d[32] = > Hmm, to be honest I don't see why this needs to go in ahead of > https://lists.xenproject.org/archives/html/xen-devel/2018-12/msg01813.html > posted almost 3 months ago, and already taking care of the bit. Because it's hidden at the end of a series I haven't gotten to the end of yet. But yes - it does look to be a more complete patch, so let's start there. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v7 49/49] tools: re-sync CPUID leaf 7 tables
On 19/12/2018 15:08, Jan Beulich wrote: > Bring libxl's in line with the public header, and update xen-cpuid's to > the latest information available in Intel's documentation (SDM ver 068 > and ISA extensions ver 035), with (as before) the exception on MAWAU. > > Some pre-existing strings get changed to match SDM naming. This should > be benign in xen-cpuid, and I hope it's also acceptable in libxl, where > people actually using the slightly wrong names would have to update > their guest config files. > > Signed-off-by: Jan Beulich Acked-by: Andrew Cooper ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v6 08/11] viridian: stop directly calling viridian_time_ref_count_freeze/thaw()...
...from arch_domain_shutdown/pause/unpause(). A subsequent patch will introduce an implementation of synthetic timers which will also need freeze/thaw hooks, so make the exported hooks more generic and call through to (re-named and static) time_ref_count_freeze/thaw functions. NOTE: This patch also introduces a new time_ref_count() helper to return the current counter value. This is currently only used by the MSR read handler but the synthetic timer code will also need to use it. Signed-off-by: Paul Durrant Reviewed-by: Wei Liu Acked-by: Jan Beulich --- Cc: Andrew Cooper Cc: "Roger Pau Monné" --- xen/arch/x86/domain.c | 12 ++-- xen/arch/x86/hvm/viridian/time.c | 24 +--- xen/include/asm-x86/hvm/viridian.h | 4 ++-- 3 files changed, 29 insertions(+), 11 deletions(-) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 8d579e2cf9..02afa7518e 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -657,20 +657,20 @@ void arch_domain_destroy(struct domain *d) void arch_domain_shutdown(struct domain *d) { -if ( has_viridian_time_ref_count(d) ) -viridian_time_ref_count_freeze(d); +if ( is_viridian_domain(d) ) +viridian_time_domain_freeze(d); } void arch_domain_pause(struct domain *d) { -if ( has_viridian_time_ref_count(d) ) -viridian_time_ref_count_freeze(d); +if ( is_viridian_domain(d) ) +viridian_time_domain_freeze(d); } void arch_domain_unpause(struct domain *d) { -if ( has_viridian_time_ref_count(d) ) -viridian_time_ref_count_thaw(d); +if ( is_viridian_domain(d) ) +viridian_time_domain_thaw(d); } int arch_domain_soft_reset(struct domain *d) diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c index 16fe41d411..71291d921c 100644 --- a/xen/arch/x86/hvm/viridian/time.c +++ b/xen/arch/x86/hvm/viridian/time.c @@ -91,7 +91,7 @@ static int64_t raw_trc_val(const struct domain *d) return scale_delta(tsc, &tsc_to_ns) / 100ul; } -void viridian_time_ref_count_freeze(const struct domain *d) +static void time_ref_count_freeze(const struct domain *d) { struct viridian_time_ref_count *trc = &d->arch.hvm.viridian->time_ref_count; @@ -100,7 +100,7 @@ void viridian_time_ref_count_freeze(const struct domain *d) trc->val = raw_trc_val(d) + trc->off; } -void viridian_time_ref_count_thaw(const struct domain *d) +static void time_ref_count_thaw(const struct domain *d) { struct viridian_time_ref_count *trc = &d->arch.hvm.viridian->time_ref_count; @@ -110,6 +110,24 @@ void viridian_time_ref_count_thaw(const struct domain *d) trc->off = (int64_t)trc->val - raw_trc_val(d); } +static int64_t time_ref_count(const struct domain *d) +{ +struct viridian_time_ref_count *trc = +&d->arch.hvm.viridian->time_ref_count; + +return raw_trc_val(d) + trc->off; +} + +void viridian_time_domain_freeze(const struct domain *d) +{ +time_ref_count_freeze(d); +} + +void viridian_time_domain_thaw(const struct domain *d) +{ +time_ref_count_thaw(d); +} + int viridian_time_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val) { struct domain *d = v->domain; @@ -179,7 +197,7 @@ int viridian_time_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val) printk(XENLOG_G_INFO "d%d: VIRIDIAN MSR_TIME_REF_COUNT: accessed\n", d->domain_id); -*val = raw_trc_val(d) + trc->off; +*val = time_ref_count(d); break; } diff --git a/xen/include/asm-x86/hvm/viridian.h b/xen/include/asm-x86/hvm/viridian.h index c65c044191..8146e2fc46 100644 --- a/xen/include/asm-x86/hvm/viridian.h +++ b/xen/include/asm-x86/hvm/viridian.h @@ -77,8 +77,8 @@ int guest_rdmsr_viridian(const struct vcpu *v, uint32_t idx, uint64_t *val); int 
viridian_hypercall(struct cpu_user_regs *regs); -void viridian_time_ref_count_freeze(const struct domain *d); -void viridian_time_ref_count_thaw(const struct domain *d); +void viridian_time_domain_freeze(const struct domain *d); +void viridian_time_domain_thaw(const struct domain *d); int viridian_vcpu_init(struct vcpu *v); int viridian_domain_init(struct domain *d); -- 2.20.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v6 03/11] viridian: use stack variables for viridian_vcpu and viridian_domain...
...where there is more than one dereference inside a function. This shortens the code and makes it more readable. No functional change. Signed-off-by: Paul Durrant Reviewed-by: Jan Beulich --- Cc: Andrew Cooper Cc: Wei Liu Cc: "Roger Pau Monné" v4: - New in v4 --- xen/arch/x86/hvm/viridian/synic.c| 49 xen/arch/x86/hvm/viridian/time.c | 27 --- xen/arch/x86/hvm/viridian/viridian.c | 47 +- 3 files changed, 69 insertions(+), 54 deletions(-) diff --git a/xen/arch/x86/hvm/viridian/synic.c b/xen/arch/x86/hvm/viridian/synic.c index 28eda7798c..f3d9f7ae74 100644 --- a/xen/arch/x86/hvm/viridian/synic.c +++ b/xen/arch/x86/hvm/viridian/synic.c @@ -30,7 +30,8 @@ typedef union _HV_VP_ASSIST_PAGE void viridian_apic_assist_set(const struct vcpu *v) { -HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian->vp_assist.ptr; +struct viridian_vcpu *vv = v->arch.hvm.viridian; +HV_VP_ASSIST_PAGE *ptr = vv->vp_assist.ptr; if ( !ptr ) return; @@ -40,25 +41,25 @@ void viridian_apic_assist_set(const struct vcpu *v) * wrong and the VM will most likely hang so force a crash now * to make the problem clear. */ -if ( v->arch.hvm.viridian->apic_assist_pending ) +if ( vv->apic_assist_pending ) domain_crash(v->domain); -v->arch.hvm.viridian->apic_assist_pending = true; +vv->apic_assist_pending = true; ptr->ApicAssist.no_eoi = 1; } bool viridian_apic_assist_completed(const struct vcpu *v) { -HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian->vp_assist.ptr; +struct viridian_vcpu *vv = v->arch.hvm.viridian; +HV_VP_ASSIST_PAGE *ptr = vv->vp_assist.ptr; if ( !ptr ) return false; -if ( v->arch.hvm.viridian->apic_assist_pending && - !ptr->ApicAssist.no_eoi ) +if ( vv->apic_assist_pending && !ptr->ApicAssist.no_eoi ) { /* An EOI has been avoided */ -v->arch.hvm.viridian->apic_assist_pending = false; +vv->apic_assist_pending = false; return true; } @@ -67,17 +68,20 @@ bool viridian_apic_assist_completed(const struct vcpu *v) void viridian_apic_assist_clear(const struct vcpu *v) { -HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian->vp_assist.ptr; +struct viridian_vcpu *vv = v->arch.hvm.viridian; +HV_VP_ASSIST_PAGE *ptr = vv->vp_assist.ptr; if ( !ptr ) return; ptr->ApicAssist.no_eoi = 0; -v->arch.hvm.viridian->apic_assist_pending = false; +vv->apic_assist_pending = false; } int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val) { +struct viridian_vcpu *vv = v->arch.hvm.viridian; + switch ( idx ) { case HV_X64_MSR_EOI: @@ -95,12 +99,11 @@ int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val) case HV_X64_MSR_VP_ASSIST_PAGE: /* release any previous mapping */ -viridian_unmap_guest_page(&v->arch.hvm.viridian->vp_assist); -v->arch.hvm.viridian->vp_assist.msr.raw = val; -viridian_dump_guest_page(v, "VP_ASSIST", - &v->arch.hvm.viridian->vp_assist); -if ( v->arch.hvm.viridian->vp_assist.msr.fields.enabled ) -viridian_map_guest_page(v, &v->arch.hvm.viridian->vp_assist); +viridian_unmap_guest_page(&vv->vp_assist); +vv->vp_assist.msr.raw = val; +viridian_dump_guest_page(v, "VP_ASSIST", &vv->vp_assist); +if ( vv->vp_assist.msr.fields.enabled ) +viridian_map_guest_page(v, &vv->vp_assist); break; default: @@ -146,18 +149,22 @@ int viridian_synic_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val) void viridian_synic_save_vcpu_ctxt(const struct vcpu *v, struct hvm_viridian_vcpu_context *ctxt) { -ctxt->apic_assist_pending = v->arch.hvm.viridian->apic_assist_pending; -ctxt->vp_assist_msr = v->arch.hvm.viridian->vp_assist.msr.raw; +const struct viridian_vcpu *vv = v->arch.hvm.viridian; + +ctxt->apic_assist_pending = 
vv->apic_assist_pending; +ctxt->vp_assist_msr = vv->vp_assist.msr.raw; } void viridian_synic_load_vcpu_ctxt( struct vcpu *v, const struct hvm_viridian_vcpu_context *ctxt) { -v->arch.hvm.viridian->vp_assist.msr.raw = ctxt->vp_assist_msr; -if ( v->arch.hvm.viridian->vp_assist.msr.fields.enabled ) -viridian_map_guest_page(v, &v->arch.hvm.viridian->vp_assist); +struct viridian_vcpu *vv = v->arch.hvm.viridian; + +vv->vp_assist.msr.raw = ctxt->vp_assist_msr; +if ( vv->vp_assist.msr.fields.enabled ) +viridian_map_guest_page(v, &vv->vp_assist); -v->arch.hvm.viridian->apic_assist_pending = ctxt->apic_assist_pending; +vv->apic_assist_pending = ctxt->apic_assist_pending; } /* diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c index a7e94aadf0..76f9612001 100644 --- a/xen/arch/x86/hvm/viridian/time.c +++ b/xen/arch/x86/hvm/
[Xen-devel] [PATCH v6 04/11] viridian: make 'fields' struct anonymous...
...inside viridian_page_msr and viridian_guest_os_id_msr unions. There's no need to name it and the code is shortened by not doing so. No functional change. Signed-off-by: Paul Durrant Reviewed-by: Jan Beulich --- Cc: Andrew Cooper Cc: Wei Liu Cc: "Roger Pau Monné" v4: - New in v4 --- xen/arch/x86/hvm/viridian/synic.c| 4 ++-- xen/arch/x86/hvm/viridian/time.c | 10 +- xen/arch/x86/hvm/viridian/viridian.c | 20 +--- xen/include/asm-x86/hvm/viridian.h | 4 ++-- 4 files changed, 18 insertions(+), 20 deletions(-) diff --git a/xen/arch/x86/hvm/viridian/synic.c b/xen/arch/x86/hvm/viridian/synic.c index f3d9f7ae74..05d971b365 100644 --- a/xen/arch/x86/hvm/viridian/synic.c +++ b/xen/arch/x86/hvm/viridian/synic.c @@ -102,7 +102,7 @@ int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val) viridian_unmap_guest_page(&vv->vp_assist); vv->vp_assist.msr.raw = val; viridian_dump_guest_page(v, "VP_ASSIST", &vv->vp_assist); -if ( vv->vp_assist.msr.fields.enabled ) +if ( vv->vp_assist.msr.enabled ) viridian_map_guest_page(v, &vv->vp_assist); break; @@ -161,7 +161,7 @@ void viridian_synic_load_vcpu_ctxt( struct viridian_vcpu *vv = v->arch.hvm.viridian; vv->vp_assist.msr.raw = ctxt->vp_assist_msr; -if ( vv->vp_assist.msr.fields.enabled ) +if ( vv->vp_assist.msr.enabled ) viridian_map_guest_page(v, &vv->vp_assist); vv->apic_assist_pending = ctxt->apic_assist_pending; diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c index 76f9612001..909a3fb9e3 100644 --- a/xen/arch/x86/hvm/viridian/time.c +++ b/xen/arch/x86/hvm/viridian/time.c @@ -29,16 +29,16 @@ static void dump_reference_tsc(const struct domain *d) { const union viridian_page_msr *rt = &d->arch.hvm.viridian->reference_tsc; -if ( !rt->fields.enabled ) +if ( !rt->enabled ) return; printk(XENLOG_G_INFO "d%d: VIRIDIAN REFERENCE_TSC: pfn: %lx\n", - d->domain_id, (unsigned long)rt->fields.pfn); + d->domain_id, (unsigned long)rt->pfn); } static void update_reference_tsc(struct domain *d, bool initialize) { -unsigned long gmfn = d->arch.hvm.viridian->reference_tsc.fields.pfn; +unsigned long gmfn = d->arch.hvm.viridian->reference_tsc.pfn; struct page_info *page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC); HV_REFERENCE_TSC_PAGE *p; @@ -151,7 +151,7 @@ int viridian_time_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val) vd->reference_tsc.raw = val; dump_reference_tsc(d); -if ( vd->reference_tsc.fields.enabled ) +if ( vd->reference_tsc.enabled ) update_reference_tsc(d, true); break; @@ -232,7 +232,7 @@ void viridian_time_load_domain_ctxt( vd->time_ref_count.val = ctxt->time_ref_count; vd->reference_tsc.raw = ctxt->reference_tsc; -if ( vd->reference_tsc.fields.enabled ) +if ( vd->reference_tsc.enabled ) update_reference_tsc(d, false); } diff --git a/xen/arch/x86/hvm/viridian/viridian.c b/xen/arch/x86/hvm/viridian/viridian.c index 710470fed7..1a20d68aaf 100644 --- a/xen/arch/x86/hvm/viridian/viridian.c +++ b/xen/arch/x86/hvm/viridian/viridian.c @@ -192,7 +192,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf, case 4: /* Recommended hypercall usage. 
*/ -if ( vd->guest_os_id.raw == 0 || vd->guest_os_id.fields.os < 4 ) +if ( vd->guest_os_id.raw == 0 || vd->guest_os_id.os < 4 ) break; res->a = CPUID4A_RELAX_TIMER_INT; if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush ) @@ -228,10 +228,8 @@ static void dump_guest_os_id(const struct domain *d) printk(XENLOG_G_INFO "d%d: VIRIDIAN GUEST_OS_ID: vendor: %x os: %x major: %x minor: %x sp: %x build: %x\n", - d->domain_id, - goi->fields.vendor, goi->fields.os, - goi->fields.major, goi->fields.minor, - goi->fields.service_pack, goi->fields.build_number); + d->domain_id, goi->vendor, goi->os, goi->major, goi->minor, + goi->service_pack, goi->build_number); } static void dump_hypercall(const struct domain *d) @@ -242,12 +240,12 @@ static void dump_hypercall(const struct domain *d) printk(XENLOG_G_INFO "d%d: VIRIDIAN HYPERCALL: enabled: %x pfn: %lx\n", d->domain_id, - hg->fields.enabled, (unsigned long)hg->fields.pfn); + hg->enabled, (unsigned long)hg->pfn); } static void enable_hypercall_page(struct domain *d) { -unsigned long gmfn = d->arch.hvm.viridian->hypercall_gpa.fields.pfn; +unsigned long gmfn = d->arch.hvm.viridian->hypercall_gpa.pfn; struct page_info *page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC); uint8_t *p; @@ -297,7 +295,7 @@ int guest_wrmsr_viridian(struct vcpu *v, uint32_t idx, uint64_t val) case HV_X64_MSR_HYPERCALL: vd->hy
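As a sketch of the pattern (the bit-field widths follow the usual Hyper-V overlay-page MSR layout and are an assumption here), the union now reads:

union viridian_page_msr {
    uint64_t raw;
    struct {                            /* previously named 'fields' */
        uint64_t enabled:1;
        uint64_t reserved_preserved:11;
        uint64_t pfn:48;
    };
};

so accesses shorten from msr.fields.enabled to msr.enabled with no change in layout.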
[Xen-devel] [PATCH v6 07/11] viridian: use viridian_map/unmap_guest_page() for reference tsc page
Whilst the reference tsc page does not currently need to be kept mapped after it is initially set up (or updated after migrate), the code can be simplified by using the common guest page map/unmap and dump functions. New functionality added by a subsequent patch will also require the page to be kept mapped for the lifetime of the domain. NOTE: Because the reference tsc page is per-domain rather than per-vcpu this patch also changes viridian_map_guest_page() to take a domain pointer rather than a vcpu pointer. The domain pointer cannot be const, unlike the vcpu pointer. Signed-off-by: Paul Durrant Reviewed-by: Wei Liu --- Cc: Jan Beulich Cc: Andrew Cooper Cc: "Roger Pau Monné" --- xen/arch/x86/hvm/viridian/private.h | 2 +- xen/arch/x86/hvm/viridian/synic.c| 6 ++- xen/arch/x86/hvm/viridian/time.c | 56 +--- xen/arch/x86/hvm/viridian/viridian.c | 3 +- xen/include/asm-x86/hvm/viridian.h | 2 +- 5 files changed, 25 insertions(+), 44 deletions(-) diff --git a/xen/arch/x86/hvm/viridian/private.h b/xen/arch/x86/hvm/viridian/private.h index 5078b2d2ab..96a784b840 100644 --- a/xen/arch/x86/hvm/viridian/private.h +++ b/xen/arch/x86/hvm/viridian/private.h @@ -111,7 +111,7 @@ void viridian_time_load_domain_ctxt( void viridian_dump_guest_page(const struct vcpu *v, const char *name, const struct viridian_page *vp); -void viridian_map_guest_page(const struct vcpu *v, struct viridian_page *vp); +void viridian_map_guest_page(struct domain *d, struct viridian_page *vp); void viridian_unmap_guest_page(struct viridian_page *vp); #endif /* X86_HVM_VIRIDIAN_PRIVATE_H */ diff --git a/xen/arch/x86/hvm/viridian/synic.c b/xen/arch/x86/hvm/viridian/synic.c index b8dab4b246..fb560bc162 100644 --- a/xen/arch/x86/hvm/viridian/synic.c +++ b/xen/arch/x86/hvm/viridian/synic.c @@ -81,6 +81,7 @@ void viridian_apic_assist_clear(const struct vcpu *v) int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val) { struct viridian_vcpu *vv = v->arch.hvm.viridian; +struct domain *d = v->domain; switch ( idx ) { @@ -103,7 +104,7 @@ int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val) vv->vp_assist.msr.raw = val; viridian_dump_guest_page(v, "VP_ASSIST", &vv->vp_assist); if ( vv->vp_assist.msr.enabled ) -viridian_map_guest_page(v, &vv->vp_assist); +viridian_map_guest_page(d, &vv->vp_assist); break; default: @@ -178,10 +179,11 @@ void viridian_synic_load_vcpu_ctxt( struct vcpu *v, const struct hvm_viridian_vcpu_context *ctxt) { struct viridian_vcpu *vv = v->arch.hvm.viridian; +struct domain *d = v->domain; vv->vp_assist.msr.raw = ctxt->vp_assist_msr; if ( vv->vp_assist.msr.enabled ) -viridian_map_guest_page(v, &vv->vp_assist); +viridian_map_guest_page(d, &vv->vp_assist); vv->apic_assist_pending = ctxt->apic_assist_pending; } diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c index 4399e62f54..16fe41d411 100644 --- a/xen/arch/x86/hvm/viridian/time.c +++ b/xen/arch/x86/hvm/viridian/time.c @@ -25,33 +25,10 @@ typedef struct _HV_REFERENCE_TSC_PAGE uint64_t Reserved2[509]; } HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE; -static void dump_reference_tsc(const struct domain *d) -{ -const union viridian_page_msr *rt = &d->arch.hvm.viridian->reference_tsc; - -if ( !rt->enabled ) -return; - -printk(XENLOG_G_INFO "d%d: VIRIDIAN REFERENCE_TSC: pfn: %lx\n", - d->domain_id, (unsigned long)rt->pfn); -} - static void update_reference_tsc(struct domain *d, bool initialize) { -unsigned long gmfn = d->arch.hvm.viridian->reference_tsc.pfn; -struct page_info *page = get_page_from_gfn(d, gmfn, NULL, 
P2M_ALLOC); -HV_REFERENCE_TSC_PAGE *p; - -if ( !page || !get_page_type(page, PGT_writable_page) ) -{ -if ( page ) -put_page(page); -gdprintk(XENLOG_WARNING, "Bad GMFN %#"PRI_gfn" (MFN %#"PRI_mfn")\n", - gmfn, mfn_x(page ? page_to_mfn(page) : INVALID_MFN)); -return; -} - -p = __map_domain_page(page); +const struct viridian_page *rt = &d->arch.hvm.viridian->reference_tsc; +HV_REFERENCE_TSC_PAGE *p = rt->ptr; if ( initialize ) clear_page(p); @@ -82,7 +59,7 @@ static void update_reference_tsc(struct domain *d, bool initialize) printk(XENLOG_G_INFO "d%d: VIRIDIAN REFERENCE_TSC: invalidated\n", d->domain_id); -goto out; +return; } /* @@ -100,11 +77,6 @@ static void update_reference_tsc(struct domain *d, bool initialize) if ( p->TscSequence == 0x || p->TscSequence == 0 ) /* Avoid both 'invalid' values */ p->TscSequence = 1; - - out: -unmap_domain_page(p); - -put_page_and_type(page); } static int64_t raw_trc_val(const struct domain *d) @@ -149,10 +
[Xen-devel] [PATCH v6 00/11] viridian: implement more enlightenments
This series adds three new enlightenments: - Synthetic timers, which depends on the... - Synthetic interrupt controller (or SynIC) - Synthetic cluster IPI All these enlightenments are implemented in current versions of QEMU/KVM so this series closes the gap. Paul Durrant (11): viridian: add init hooks viridian: separately allocate domain and vcpu structures viridian: use stack variables for viridian_vcpu and viridian_domain... viridian: make 'fields' struct anonymous... viridian: extend init/deinit hooks into synic and time modules viridian: add missing context save helpers into synic and time modules viridian: use viridian_map/unmap_guest_page() for reference tsc page viridian: stop directly calling viridian_time_ref_count_freeze/thaw()... viridian: add implementation of synthetic interrupt MSRs viridian: add implementation of synthetic timers viridian: add implementation of the HvSendSyntheticClusterIpi hypercall docs/man/xl.cfg.5.pod.in | 18 +- tools/libxl/libxl.h| 18 + tools/libxl/libxl_dom.c| 10 + tools/libxl/libxl_types.idl| 3 + xen/arch/x86/domain.c | 12 +- xen/arch/x86/hvm/hvm.c | 10 + xen/arch/x86/hvm/viridian/private.h| 31 +- xen/arch/x86/hvm/viridian/synic.c | 388 +-- xen/arch/x86/hvm/viridian/time.c | 512 ++--- xen/arch/x86/hvm/viridian/viridian.c | 229 +-- xen/arch/x86/hvm/vlapic.c | 31 +- xen/include/asm-x86/hvm/domain.h | 2 +- xen/include/asm-x86/hvm/hvm.h | 7 + xen/include/asm-x86/hvm/vcpu.h | 2 +- xen/include/asm-x86/hvm/viridian.h | 76 +++- xen/include/public/arch-x86/hvm/save.h | 4 + xen/include/public/hvm/params.h| 17 +- 17 files changed, 1227 insertions(+), 143 deletions(-) v4: - Add two cleanup patches (#3 and #4) and re-order #8 and #9 v3: - Add the synthetic cluster IPI patch (#11) --- Cc: Andrew Cooper Cc: George Dunlap Cc: Ian Jackson Cc: Jan Beulich Cc: Julien Grall Cc: Konrad Rzeszutek Wilk Cc: "Roger Pau Monné" Cc: Stefano Stabellini Cc: Tim Deegan Cc: Wei Liu -- 2.20.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v6 06/11] viridian: add missing context save helpers into synic and time modules
Currently the time module lacks vcpu context save helpers and the synic module lacks domain context save helpers. These helpers are not yet required but subsequent patches will require at least some of them so this patch completes the set to avoid introducing them in an ad-hoc way. Signed-off-by: Paul Durrant Reviewed-by: Wei Liu --- Cc: Jan Beulich Cc: Andrew Cooper Cc: "Roger Pau Monné" v3: - Add missing callers so that they are not added in an ad-hoc way --- xen/arch/x86/hvm/viridian/private.h | 10 ++ xen/arch/x86/hvm/viridian/synic.c| 10 ++ xen/arch/x86/hvm/viridian/time.c | 10 ++ xen/arch/x86/hvm/viridian/viridian.c | 4 4 files changed, 34 insertions(+) diff --git a/xen/arch/x86/hvm/viridian/private.h b/xen/arch/x86/hvm/viridian/private.h index 8c029f62c6..5078b2d2ab 100644 --- a/xen/arch/x86/hvm/viridian/private.h +++ b/xen/arch/x86/hvm/viridian/private.h @@ -85,6 +85,11 @@ void viridian_synic_save_vcpu_ctxt(const struct vcpu *v, void viridian_synic_load_vcpu_ctxt( struct vcpu *v, const struct hvm_viridian_vcpu_context *ctxt); +void viridian_synic_save_domain_ctxt( +const struct domain *d, struct hvm_viridian_domain_context *ctxt); +void viridian_synic_load_domain_ctxt( +struct domain *d, const struct hvm_viridian_domain_context *ctxt); + int viridian_time_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val); int viridian_time_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val); @@ -94,6 +99,11 @@ int viridian_time_domain_init(const struct domain *d); void viridian_time_vcpu_deinit(const struct vcpu *v); void viridian_time_domain_deinit(const struct domain *d); +void viridian_time_save_vcpu_ctxt( +const struct vcpu *v, struct hvm_viridian_vcpu_context *ctxt); +void viridian_time_load_vcpu_ctxt( +struct vcpu *v, const struct hvm_viridian_vcpu_context *ctxt); + void viridian_time_save_domain_ctxt( const struct domain *d, struct hvm_viridian_domain_context *ctxt); void viridian_time_load_domain_ctxt( diff --git a/xen/arch/x86/hvm/viridian/synic.c b/xen/arch/x86/hvm/viridian/synic.c index 4b00dbe1b3..b8dab4b246 100644 --- a/xen/arch/x86/hvm/viridian/synic.c +++ b/xen/arch/x86/hvm/viridian/synic.c @@ -186,6 +186,16 @@ void viridian_synic_load_vcpu_ctxt( vv->apic_assist_pending = ctxt->apic_assist_pending; } +void viridian_synic_save_domain_ctxt( +const struct domain *d, struct hvm_viridian_domain_context *ctxt) +{ +} + +void viridian_synic_load_domain_ctxt( +struct domain *d, const struct hvm_viridian_domain_context *ctxt) +{ +} + /* * Local variables: * mode: C diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c index 48aca7e0ab..4399e62f54 100644 --- a/xen/arch/x86/hvm/viridian/time.c +++ b/xen/arch/x86/hvm/viridian/time.c @@ -233,6 +233,16 @@ void viridian_time_domain_deinit(const struct domain *d) { } +void viridian_time_save_vcpu_ctxt( +const struct vcpu *v, struct hvm_viridian_vcpu_context *ctxt) +{ +} + +void viridian_time_load_vcpu_ctxt( +struct vcpu *v, const struct hvm_viridian_vcpu_context *ctxt) +{ +} + void viridian_time_save_domain_ctxt( const struct domain *d, struct hvm_viridian_domain_context *ctxt) { diff --git a/xen/arch/x86/hvm/viridian/viridian.c b/xen/arch/x86/hvm/viridian/viridian.c index f9a509d918..742a988252 100644 --- a/xen/arch/x86/hvm/viridian/viridian.c +++ b/xen/arch/x86/hvm/viridian/viridian.c @@ -707,6 +707,7 @@ static int viridian_save_domain_ctxt(struct vcpu *v, return 0; viridian_time_save_domain_ctxt(d, &ctxt); +viridian_synic_save_domain_ctxt(d, &ctxt); return (hvm_save_entry(VIRIDIAN_DOMAIN, 0, h, &ctxt) != 0); } @@ 
-723,6 +724,7 @@ static int viridian_load_domain_ctxt(struct domain *d, vd->hypercall_gpa.raw = ctxt.hypercall_gpa; vd->guest_os_id.raw = ctxt.guest_os_id; +viridian_synic_load_domain_ctxt(d, &ctxt); viridian_time_load_domain_ctxt(d, &ctxt); return 0; @@ -738,6 +740,7 @@ static int viridian_save_vcpu_ctxt(struct vcpu *v, hvm_domain_context_t *h) if ( !is_viridian_vcpu(v) ) return 0; +viridian_time_save_vcpu_ctxt(v, &ctxt); viridian_synic_save_vcpu_ctxt(v, &ctxt); return hvm_save_entry(VIRIDIAN_VCPU, v->vcpu_id, h, &ctxt); @@ -764,6 +767,7 @@ static int viridian_load_vcpu_ctxt(struct domain *d, return -EINVAL; viridian_synic_load_vcpu_ctxt(v, &ctxt); +viridian_time_load_vcpu_ctxt(v, &ctxt); return 0; } -- 2.20.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v6 09/11] viridian: add implementation of synthetic interrupt MSRs
This patch introduces an implementation of the SCONTROL, SVERSION, SIEFP, SIMP, EOM and SINT0-15 SynIC MSRs. No message source is added and, as such, nothing will yet generate a synthetic interrupt. A subsequent patch will add an implementation of synthetic timers which will need the infrastructure added by this patch to deliver expiry messages to the guest. NOTE: A 'synic' option is added to the toolstack viridian enlightenments enumeration but is deliberately not documented, as enabling these SynIC registers without a message source is only useful for debugging. Signed-off-by: Paul Durrant Acked-by: Wei Liu --- Cc: Ian Jackson Cc: Andrew Cooper Cc: George Dunlap Cc: Jan Beulich Cc: Julien Grall Cc: Konrad Rzeszutek Wilk Cc: Stefano Stabellini Cc: Tim Deegan Cc: "Roger Pau Monné" v6: - Address further comments from Jan v4: - Address comments from Jan v3: - Add the 'SintPollingModeAvailable' bit in CPUID leaf 3 --- tools/libxl/libxl.h| 6 +++ tools/libxl/libxl_dom.c| 3 + tools/libxl/libxl_types.idl| 1 + xen/arch/x86/hvm/viridian/synic.c | 253 - xen/arch/x86/hvm/viridian/viridian.c | 19 ++ xen/arch/x86/hvm/vlapic.c | 31 ++- xen/include/asm-x86/hvm/hvm.h | 3 + xen/include/asm-x86/hvm/viridian.h | 27 +++ xen/include/public/arch-x86/hvm/save.h | 2 + xen/include/public/hvm/params.h| 7 +- 10 files changed, 344 insertions(+), 8 deletions(-) diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index a38e5cdba2..a923a380d3 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -318,6 +318,12 @@ */ #define LIBXL_HAVE_VIRIDIAN_CRASH_CTL 1 +/* + * LIBXL_HAVE_VIRIDIAN_SYNIC indicates that the 'synic' value + * is present in the viridian enlightenment enumeration. + */ +#define LIBXL_HAVE_VIRIDIAN_SYNIC 1 + /* * LIBXL_HAVE_BUILDINFO_HVM_ACPI_LAPTOP_SLATE indicates that * libxl_domain_build_info has the u.hvm.acpi_laptop_slate field. 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index 6160991af3..fb758d2ac3 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -317,6 +317,9 @@ static int hvm_set_viridian_features(libxl__gc *gc, uint32_t domid, if (libxl_bitmap_test(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_CRASH_CTL)) mask |= HVMPV_crash_ctl; +if (libxl_bitmap_test(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_SYNIC)) +mask |= HVMPV_synic; + if (mask != 0 && xc_hvm_param_set(CTX->xch, domid, diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index b685ac47ac..9860bcaf5f 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -235,6 +235,7 @@ libxl_viridian_enlightenment = Enumeration("viridian_enlightenment", [ (4, "hcall_remote_tlb_flush"), (5, "apic_assist"), (6, "crash_ctl"), +(7, "synic"), ]) libxl_hdtype = Enumeration("hdtype", [ diff --git a/xen/arch/x86/hvm/viridian/synic.c b/xen/arch/x86/hvm/viridian/synic.c index fb560bc162..d00a7bfe2c 100644 --- a/xen/arch/x86/hvm/viridian/synic.c +++ b/xen/arch/x86/hvm/viridian/synic.c @@ -13,6 +13,7 @@ #include #include +#include #include "private.h" @@ -28,6 +29,37 @@ typedef union _HV_VP_ASSIST_PAGE uint8_t ReservedZBytePadding[PAGE_SIZE]; } HV_VP_ASSIST_PAGE; +typedef enum HV_MESSAGE_TYPE { +HvMessageTypeNone, +HvMessageTimerExpired = 0x8010, +} HV_MESSAGE_TYPE; + +typedef struct HV_MESSAGE_FLAGS { +uint8_t MessagePending:1; +uint8_t Reserved:7; +} HV_MESSAGE_FLAGS; + +typedef struct HV_MESSAGE_HEADER { +HV_MESSAGE_TYPE MessageType; +uint16_t Reserved1; +HV_MESSAGE_FLAGS MessageFlags; +uint8_t PayloadSize; +uint64_t Reserved2; +} HV_MESSAGE_HEADER; + +#define HV_MESSAGE_SIZE 256 +#define HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT 30 + +typedef struct HV_MESSAGE { +HV_MESSAGE_HEADER Header; +uint64_t Payload[HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT]; +} HV_MESSAGE; + +void __init __maybe_unused build_assertions(void) +{ +BUILD_BUG_ON(sizeof(HV_MESSAGE) != HV_MESSAGE_SIZE); +} + void viridian_apic_assist_set(const struct vcpu *v) { struct viridian_vcpu *vv = v->arch.hvm.viridian; @@ -83,6 +115,8 @@ int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val) struct viridian_vcpu *vv = v->arch.hvm.viridian; struct domain *d = v->domain; +ASSERT(v == current || !v->is_running); + switch ( idx ) { case HV_X64_MSR_EOI: @@ -107,6 +141,76 @@ int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val) viridian_map_guest_page(d, &vv->vp_assist); break; +case HV_X64_MSR_SCONTROL: +if ( !(viridian_feature_mask(d) & HVMPV_synic) ) +return X86EMUL_EXCEPTION; + +vv->scontrol = val; +break; + +case HV_X64_MSR_SVERSION: +return X86EMUL_EXC
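For completeness, the arithmetic behind the BUILD_BUG_ON() in build_assertions() above, assuming HV_MESSAGE_TYPE compiles to a 4-byte enum:

    sizeof(HV_MESSAGE_HEADER) = 4 + 2 + 1 + 1 + 8 = 16 bytes
    (Reserved2 lands naturally aligned at offset 8, so no padding is added)
    sizeof(HV_MESSAGE) = 16 + 30 * 8 = 256 bytes = HV_MESSAGE_SIZE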
[Xen-devel] [PATCH v6 05/11] viridian: extend init/deinit hooks into synic and time modules
This patch simply adds domain and vcpu init/deinit hooks into the synic and time modules and wires them into viridian_[domain|vcpu]_[init|deinit](). Only one of the hooks is currently needed (to unmap the 'VP Assist' page) but subsequent patches will make use of the others. NOTE: To perform the unmap of the VP Assist page, viridian_unmap_guest_page() is now directly called in the new viridian_synic_vcpu_deinit() function (which is safe even if is_viridian_vcpu() evaluates to false). This replaces the slightly hacky mechanism of faking a zero write to the HV_X64_MSR_VP_ASSIST_PAGE MSR in viridian_cpu_deinit(). Signed-off-by: Paul Durrant Reviewed-by: Jan Beulich Reviewed-by: Wei Liu --- Cc: Andrew Cooper Cc: "Roger Pau Monné" v4: - Constify vcpu and domain pointers v2: - Pay attention to sync and time init hook return values --- xen/arch/x86/hvm/viridian/private.h | 12 + xen/arch/x86/hvm/viridian/synic.c| 19 ++ xen/arch/x86/hvm/viridian/time.c | 18 ++ xen/arch/x86/hvm/viridian/viridian.c | 37 ++-- 4 files changed, 84 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/hvm/viridian/private.h b/xen/arch/x86/hvm/viridian/private.h index 46174f48cd..8c029f62c6 100644 --- a/xen/arch/x86/hvm/viridian/private.h +++ b/xen/arch/x86/hvm/viridian/private.h @@ -74,6 +74,12 @@ int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val); int viridian_synic_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val); +int viridian_synic_vcpu_init(const struct vcpu *v); +int viridian_synic_domain_init(const struct domain *d); + +void viridian_synic_vcpu_deinit(const struct vcpu *v); +void viridian_synic_domain_deinit(const struct domain *d); + void viridian_synic_save_vcpu_ctxt(const struct vcpu *v, struct hvm_viridian_vcpu_context *ctxt); void viridian_synic_load_vcpu_ctxt( @@ -82,6 +88,12 @@ void viridian_synic_load_vcpu_ctxt( int viridian_time_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val); int viridian_time_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val); +int viridian_time_vcpu_init(const struct vcpu *v); +int viridian_time_domain_init(const struct domain *d); + +void viridian_time_vcpu_deinit(const struct vcpu *v); +void viridian_time_domain_deinit(const struct domain *d); + void viridian_time_save_domain_ctxt( const struct domain *d, struct hvm_viridian_domain_context *ctxt); void viridian_time_load_domain_ctxt( diff --git a/xen/arch/x86/hvm/viridian/synic.c b/xen/arch/x86/hvm/viridian/synic.c index 05d971b365..4b00dbe1b3 100644 --- a/xen/arch/x86/hvm/viridian/synic.c +++ b/xen/arch/x86/hvm/viridian/synic.c @@ -146,6 +146,25 @@ int viridian_synic_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val) return X86EMUL_OKAY; } +int viridian_synic_vcpu_init(const struct vcpu *v) +{ +return 0; +} + +int viridian_synic_domain_init(const struct domain *d) +{ +return 0; +} + +void viridian_synic_vcpu_deinit(const struct vcpu *v) +{ +viridian_unmap_guest_page(&v->arch.hvm.viridian->vp_assist); +} + +void viridian_synic_domain_deinit(const struct domain *d) +{ +} + void viridian_synic_save_vcpu_ctxt(const struct vcpu *v, struct hvm_viridian_vcpu_context *ctxt) { diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c index 909a3fb9e3..48aca7e0ab 100644 --- a/xen/arch/x86/hvm/viridian/time.c +++ b/xen/arch/x86/hvm/viridian/time.c @@ -215,6 +215,24 @@ int viridian_time_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val) return X86EMUL_OKAY; } +int viridian_time_vcpu_init(const struct vcpu *v) +{ +return 0; +} + +int viridian_time_domain_init(const struct 
domain *d) +{ +return 0; +} + +void viridian_time_vcpu_deinit(const struct vcpu *v) +{ +} + +void viridian_time_domain_deinit(const struct domain *d) +{ +} + void viridian_time_save_domain_ctxt( const struct domain *d, struct hvm_viridian_domain_context *ctxt) { diff --git a/xen/arch/x86/hvm/viridian/viridian.c b/xen/arch/x86/hvm/viridian/viridian.c index 1a20d68aaf..f9a509d918 100644 --- a/xen/arch/x86/hvm/viridian/viridian.c +++ b/xen/arch/x86/hvm/viridian/viridian.c @@ -418,22 +418,52 @@ int guest_rdmsr_viridian(const struct vcpu *v, uint32_t idx, uint64_t *val) int viridian_vcpu_init(struct vcpu *v) { +int rc; + ASSERT(!v->arch.hvm.viridian); v->arch.hvm.viridian = xzalloc(struct viridian_vcpu); if ( !v->arch.hvm.viridian ) return -ENOMEM; +rc = viridian_synic_vcpu_init(v); +if ( rc ) +goto fail; + +rc = viridian_time_vcpu_init(v); +if ( rc ) +goto fail; + return 0; + + fail: +viridian_vcpu_deinit(v); + +return rc; } int viridian_domain_init(struct domain *d) { +int rc; + ASSERT(!d->arch.hvm.viridian); d->arch.hvm.viridian = xzalloc(struct viridian_domain); if ( !d->arch.hv
[Xen-devel] [PATCH v6 02/11] viridian: separately allocate domain and vcpu structures
Currently the viridian_domain and viridian_vcpu structures are inline in the hvm_domain and hvm_vcpu structures respectively. Subsequent patches will need to add sizable extra fields to the viridian structures which will cause the PAGE_SIZE limit of the overall vcpu structure to be exceeded. This patch, therefore, uses the new init hooks to separately allocate the structures and converts the 'viridian' fields in hvm_domain and hvm_vcpu to be pointers to these allocations. These separate allocations also allow some vcpu and domain pointers to become const. Ideally, now that they are no longer inline, the allocations of the viridian structures could be made conditional on whether the toolstack is going to configure the viridian enlightenments. However the toolstack is currently unable to convey this information to the domain creation code so such an enhancement is deferred until that becomes possible. NOTE: The patch also introduces the 'is_viridian_vcpu' macro to avoid introducing a second evaluation of 'is_viridian_domain' with an open-coded 'v->domain' argument. This macro will also be further used in a subsequent patch. Signed-off-by: Paul Durrant Reviewed-by: Wei Liu Acked-by: Jan Beulich --- Cc: Andrew Cooper Cc: "Roger Pau Monné" v4: - Const-ify some vcpu and domain pointers v2: - use XFREE() - expand commit comment to point out why allocations are unconditional --- xen/arch/x86/hvm/viridian/private.h | 2 +- xen/arch/x86/hvm/viridian/synic.c| 46 - xen/arch/x86/hvm/viridian/time.c | 38 +++--- xen/arch/x86/hvm/viridian/viridian.c | 75 ++-- xen/include/asm-x86/hvm/domain.h | 2 +- xen/include/asm-x86/hvm/hvm.h| 4 ++ xen/include/asm-x86/hvm/vcpu.h | 2 +- xen/include/asm-x86/hvm/viridian.h | 10 ++-- 8 files changed, 101 insertions(+), 78 deletions(-) diff --git a/xen/arch/x86/hvm/viridian/private.h b/xen/arch/x86/hvm/viridian/private.h index 398b22f12d..46174f48cd 100644 --- a/xen/arch/x86/hvm/viridian/private.h +++ b/xen/arch/x86/hvm/viridian/private.h @@ -89,7 +89,7 @@ void viridian_time_load_domain_ctxt( void viridian_dump_guest_page(const struct vcpu *v, const char *name, const struct viridian_page *vp); -void viridian_map_guest_page(struct vcpu *v, struct viridian_page *vp); +void viridian_map_guest_page(const struct vcpu *v, struct viridian_page *vp); void viridian_unmap_guest_page(struct viridian_page *vp); #endif /* X86_HVM_VIRIDIAN_PRIVATE_H */ diff --git a/xen/arch/x86/hvm/viridian/synic.c b/xen/arch/x86/hvm/viridian/synic.c index a6ebbbc9f5..28eda7798c 100644 --- a/xen/arch/x86/hvm/viridian/synic.c +++ b/xen/arch/x86/hvm/viridian/synic.c @@ -28,9 +28,9 @@ typedef union _HV_VP_ASSIST_PAGE uint8_t ReservedZBytePadding[PAGE_SIZE]; } HV_VP_ASSIST_PAGE; -void viridian_apic_assist_set(struct vcpu *v) +void viridian_apic_assist_set(const struct vcpu *v) { -HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian.vp_assist.ptr; +HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian->vp_assist.ptr; if ( !ptr ) return; @@ -40,40 +40,40 @@ void viridian_apic_assist_set(struct vcpu *v) * wrong and the VM will most likely hang so force a crash now * to make the problem clear. 
*/ -if ( v->arch.hvm.viridian.apic_assist_pending ) +if ( v->arch.hvm.viridian->apic_assist_pending ) domain_crash(v->domain); -v->arch.hvm.viridian.apic_assist_pending = true; +v->arch.hvm.viridian->apic_assist_pending = true; ptr->ApicAssist.no_eoi = 1; } -bool viridian_apic_assist_completed(struct vcpu *v) +bool viridian_apic_assist_completed(const struct vcpu *v) { -HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian.vp_assist.ptr; +HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian->vp_assist.ptr; if ( !ptr ) return false; -if ( v->arch.hvm.viridian.apic_assist_pending && +if ( v->arch.hvm.viridian->apic_assist_pending && !ptr->ApicAssist.no_eoi ) { /* An EOI has been avoided */ -v->arch.hvm.viridian.apic_assist_pending = false; +v->arch.hvm.viridian->apic_assist_pending = false; return true; } return false; } -void viridian_apic_assist_clear(struct vcpu *v) +void viridian_apic_assist_clear(const struct vcpu *v) { -HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian.vp_assist.ptr; +HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian->vp_assist.ptr; if ( !ptr ) return; ptr->ApicAssist.no_eoi = 0; -v->arch.hvm.viridian.apic_assist_pending = false; +v->arch.hvm.viridian->apic_assist_pending = false; } int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val) @@ -95,12 +95,12 @@ int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val) case HV_X64_MSR_VP_ASSIST_PAGE: /* release any previous mapping */ -viridian_unmap_guest_page(&v->arch.hvm.viridian
[Xen-devel] [PATCH v6 01/11] viridian: add init hooks
This patch adds domain and vcpu init hooks for viridian features. The init hooks do not yet do anything; the functionality will be added by subsequent patches. Signed-off-by: Paul Durrant Reviewed-by: Wei Liu Acked-by: Jan Beulich --- Cc: Andrew Cooper Cc: "Roger Pau Monné" v5: - Put the call to viridian_domain_deinit() back into hvm_domain_relinquish_resources() where it should be v3: - Re-instate call from domain deinit to vcpu deinit - Move deinit calls to avoid introducing new labels v2: - Remove call from domain deinit to vcpu deinit --- xen/arch/x86/hvm/hvm.c | 10 ++ xen/arch/x86/hvm/viridian/viridian.c | 10 ++ xen/include/asm-x86/hvm/viridian.h | 3 +++ 3 files changed, 23 insertions(+) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 8adbb61b57..11ce21fc08 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -666,6 +666,10 @@ int hvm_domain_initialise(struct domain *d) if ( hvm_tsc_scaling_supported ) d->arch.hvm.tsc_scaling_ratio = hvm_default_tsc_scaling_ratio; +rc = viridian_domain_init(d); +if ( rc ) +goto fail2; + rc = hvm_funcs.domain_initialise(d); if ( rc != 0 ) goto fail2; @@ -687,6 +691,7 @@ int hvm_domain_initialise(struct domain *d) hvm_destroy_cacheattr_region_list(d); destroy_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0); fail: +viridian_domain_deinit(d); return rc; } @@ -1526,6 +1531,10 @@ int hvm_vcpu_initialise(struct vcpu *v) && (rc = nestedhvm_vcpu_initialise(v)) < 0 ) /* teardown: nestedhvm_vcpu_destroy */ goto fail5; +rc = viridian_vcpu_init(v); +if ( rc ) +goto fail5; + rc = hvm_all_ioreq_servers_add_vcpu(d, v); if ( rc != 0 ) goto fail6; @@ -1553,6 +1562,7 @@ int hvm_vcpu_initialise(struct vcpu *v) fail2: hvm_vcpu_cacheattr_destroy(v); fail1: +viridian_vcpu_deinit(v); return rc; } diff --git a/xen/arch/x86/hvm/viridian/viridian.c b/xen/arch/x86/hvm/viridian/viridian.c index 425af56856..5b0eb8a8c7 100644 --- a/xen/arch/x86/hvm/viridian/viridian.c +++ b/xen/arch/x86/hvm/viridian/viridian.c @@ -417,6 +417,16 @@ int guest_rdmsr_viridian(const struct vcpu *v, uint32_t idx, uint64_t *val) return X86EMUL_OKAY; } +int viridian_vcpu_init(struct vcpu *v) +{ +return 0; +} + +int viridian_domain_init(struct domain *d) +{ +return 0; +} + void viridian_vcpu_deinit(struct vcpu *v) { viridian_synic_wrmsr(v, HV_X64_MSR_VP_ASSIST_PAGE, 0); diff --git a/xen/include/asm-x86/hvm/viridian.h b/xen/include/asm-x86/hvm/viridian.h index ec5ef8d3f9..f072838955 100644 --- a/xen/include/asm-x86/hvm/viridian.h +++ b/xen/include/asm-x86/hvm/viridian.h @@ -80,6 +80,9 @@ viridian_hypercall(struct cpu_user_regs *regs); void viridian_time_ref_count_freeze(struct domain *d); void viridian_time_ref_count_thaw(struct domain *d); +int viridian_vcpu_init(struct vcpu *v); +int viridian_domain_init(struct domain *d); + void viridian_vcpu_deinit(struct vcpu *v); void viridian_domain_deinit(struct domain *d); -- 2.20.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v6 11/11] viridian: add implementation of the HvSendSyntheticClusterIpi hypercall
This patch adds an implementation of the hypercall as documented in the specification [1], section 10.5.2. This enlightenment, as with others, is advertised by CPUID leaf 0x4004 and is under control of a new 'hcall_ipi' option in libxl. If used, this enlightenment should mean the guest only takes a single VMEXIT to issue IPIs to multiple vCPUs rather than the multiple VMEXITs that would result from using the emulated local APIC. [1] https://github.com/MicrosoftDocs/Virtualization-Documentation/raw/live/tlfs/Hypervisor%20Top%20Level%20Functional%20Specification%20v5.0C.pdf Signed-off-by: Paul Durrant Acked-by: Wei Liu Reviewed-by: Jan Beulich --- Cc: Ian Jackson Cc: Andrew Cooper Cc: George Dunlap Cc: Julien Grall Cc: Konrad Rzeszutek Wilk Cc: Stefano Stabellini Cc: Tim Deegan Cc: "Roger Pau Monné" v4: - Address comments from Jan v3: - New in v3 --- docs/man/xl.cfg.5.pod.in | 6 +++ tools/libxl/libxl.h | 6 +++ tools/libxl/libxl_dom.c | 3 ++ tools/libxl/libxl_types.idl | 1 + xen/arch/x86/hvm/viridian/viridian.c | 63 xen/include/public/hvm/params.h | 7 +++- 6 files changed, 85 insertions(+), 1 deletion(-) diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in index 355c654693..c7d70e618b 100644 --- a/docs/man/xl.cfg.5.pod.in +++ b/docs/man/xl.cfg.5.pod.in @@ -2175,6 +2175,12 @@ ticks and hence enabling this group will ensure that ticks will be consistent with use of an enlightened time source (B or B). +=item B + +This set incorporates use of a hypercall for interprocessor interrupts. +This enlightenment may improve performance of Windows guests with multiple +virtual CPUs. + =item B This is a special value that enables the default set of groups, which diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index c8f219b0d3..482499a6c0 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -330,6 +330,12 @@ */ #define LIBXL_HAVE_VIRIDIAN_STIMER 1 +/* + * LIBXL_HAVE_VIRIDIAN_HCALL_IPI indicates that the 'hcall_ipi' value + * is present in the viridian enlightenment enumeration. + */ +#define LIBXL_HAVE_VIRIDIAN_HCALL_IPI 1 + /* * LIBXL_HAVE_BUILDINFO_HVM_ACPI_LAPTOP_SLATE indicates that * libxl_domain_build_info has the u.hvm.acpi_laptop_slate field. 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index 2ee0f82ee7..879c806139 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -324,6 +324,9 @@ static int hvm_set_viridian_features(libxl__gc *gc, uint32_t domid, if (libxl_bitmap_test(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_STIMER)) mask |= HVMPV_time_ref_count | HVMPV_synic | HVMPV_stimer; +if (libxl_bitmap_test(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_HCALL_IPI)) +mask |= HVMPV_hcall_ipi; + if (mask != 0 && xc_hvm_param_set(CTX->xch, domid, diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 1cce249de4..cb4702fd7a 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -237,6 +237,7 @@ libxl_viridian_enlightenment = Enumeration("viridian_enlightenment", [ (6, "crash_ctl"), (7, "synic"), (8, "stimer"), +(9, "hcall_ipi"), ]) libxl_hdtype = Enumeration("hdtype", [ diff --git a/xen/arch/x86/hvm/viridian/viridian.c b/xen/arch/x86/hvm/viridian/viridian.c index dce648bb4e..4b06b78a27 100644 --- a/xen/arch/x86/hvm/viridian/viridian.c +++ b/xen/arch/x86/hvm/viridian/viridian.c @@ -28,6 +28,7 @@ #define HvFlushVirtualAddressSpace 0x0002 #define HvFlushVirtualAddressList 0x0003 #define HvNotifyLongSpinWait 0x0008 +#define HvSendSyntheticClusterIpi 0x000b #define HvGetPartitionId 0x0046 #define HvExtCallQueryCapabilities 0x8001 @@ -95,6 +96,7 @@ typedef union _HV_CRASH_CTL_REG_CONTENTS #define CPUID4A_HCALL_REMOTE_TLB_FLUSH (1 << 2) #define CPUID4A_MSR_BASED_APIC (1 << 3) #define CPUID4A_RELAX_TIMER_INT(1 << 5) +#define CPUID4A_SYNTHETIC_CLUSTER_IPI (1 << 10) /* Viridian CPUID leaf 6: Implementation HW features detected and in use */ #define CPUID6A_APIC_OVERLAY(1 << 0) @@ -206,6 +208,8 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf, res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH; if ( !cpu_has_vmx_apic_reg_virt ) res->a |= CPUID4A_MSR_BASED_APIC; +if ( viridian_feature_mask(d) & HVMPV_hcall_ipi ) +res->a |= CPUID4A_SYNTHETIC_CLUSTER_IPI; /* * This value is the recommended number of attempts to try to @@ -628,6 +632,65 @@ int viridian_hypercall(struct cpu_user_regs *regs) break; } +case HvSendSyntheticClusterIpi: +{ +struct vcpu *v; +uint32_t vector; +uint64_t vcpu_mask; + +status = HV_STATUS_INVALID_PARAMETER; + +/* Get input parameters. */ +if ( input.fast ) +{ +
[Xen-devel] [PATCH v6 10/11] viridian: add implementation of synthetic timers
This patch introduces an implementation of the STIMER0-15_CONFIG/COUNT MSRs and hence the first SynIC message source. The new (and documented) 'stimer' viridian enlightenment group may be specified to enable this feature. While in the neighbourhood, this patch adds a missing check for an attempt to write the time reference count MSR, which should result in an exception (but not be reported as an unimplemented MSR). NOTE: It is necessary for correct operation that timer expiration and message delivery time-stamping use the same time source as the guest. The specification is ambiguous but testing with a Windows 10 1803 guest has shown that using the partition reference counter as a source whilst the guest is using RDTSC and the reference tsc page does not work correctly. Therefore the time_now() function is used. This implements the algorithm for acquiring partition reference time that is documented in the specification. Signed-off-by: Paul Durrant Acked-by: Wei Liu --- Cc: Ian Jackson Cc: Andrew Cooper Cc: George Dunlap Cc: Jan Beulich Cc: Julien Grall Cc: Konrad Rzeszutek Wilk Cc: Stefano Stabellini Cc: Tim Deegan Cc: "Roger Pau Monné" v6: - Stop using the reference tsc page in time_now() - Address further comments from Jan v5: - Fix time_now() to read TSC as the guest would see it v4: - Address comments from Jan v3: - Re-worked missed ticks calculation --- docs/man/xl.cfg.5.pod.in | 12 +- tools/libxl/libxl.h| 6 + tools/libxl/libxl_dom.c| 4 + tools/libxl/libxl_types.idl| 1 + xen/arch/x86/hvm/viridian/private.h| 9 +- xen/arch/x86/hvm/viridian/synic.c | 55 +++- xen/arch/x86/hvm/viridian/time.c | 383 - xen/arch/x86/hvm/viridian/viridian.c | 5 + xen/include/asm-x86/hvm/viridian.h | 32 ++- xen/include/public/arch-x86/hvm/save.h | 2 + xen/include/public/hvm/params.h| 7 +- 11 files changed, 503 insertions(+), 13 deletions(-) diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in index ad81af1ed8..355c654693 100644 --- a/docs/man/xl.cfg.5.pod.in +++ b/docs/man/xl.cfg.5.pod.in @@ -2167,11 +2167,19 @@ This group incorporates the crash control MSRs. These enlightenments allow Windows to write crash information such that it can be logged by Xen. +=item B + +This set incorporates the SynIC and synthetic timer MSRs. Windows will +use synthetic timers in preference to emulated HPET for a source of +ticks and hence enabling this group will ensure that ticks will be +consistent with use of an enlightened time source (B or +B). + =item B This is a special value that enables the default set of groups, which -is currently the B, B, B, B -and B groups. +is currently the B, B, B, B, +B and B groups. =item B diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index a923a380d3..c8f219b0d3 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -324,6 +324,12 @@ */ #define LIBXL_HAVE_VIRIDIAN_SYNIC 1 +/* + * LIBXL_HAVE_VIRIDIAN_STIMER indicates that the 'stimer' value + * is present in the viridian enlightenment enumeration. + */ +#define LIBXL_HAVE_VIRIDIAN_STIMER 1 + /* * LIBXL_HAVE_BUILDINFO_HVM_ACPI_LAPTOP_SLATE indicates that * libxl_domain_build_info has the u.hvm.acpi_laptop_slate field. 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index fb758d2ac3..2ee0f82ee7 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -269,6 +269,7 @@ static int hvm_set_viridian_features(libxl__gc *gc, uint32_t domid, libxl_bitmap_set(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_TIME_REF_COUNT); libxl_bitmap_set(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_APIC_ASSIST); libxl_bitmap_set(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_CRASH_CTL); +libxl_bitmap_set(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_STIMER); } libxl_for_each_set_bit(v, info->u.hvm.viridian_enable) { @@ -320,6 +321,9 @@ static int hvm_set_viridian_features(libxl__gc *gc, uint32_t domid, if (libxl_bitmap_test(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_SYNIC)) mask |= HVMPV_synic; +if (libxl_bitmap_test(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_STIMER)) +mask |= HVMPV_time_ref_count | HVMPV_synic | HVMPV_stimer; + if (mask != 0 && xc_hvm_param_set(CTX->xch, domid, diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 9860bcaf5f..1cce249de4 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -236,6 +236,7 @@ libxl_viridian_enlightenment = Enumeration("viridian_enlightenment", [ (5, "apic_assist"), (6, "crash_ctl"), (7, "synic"), +(8, "stimer"), ]) libxl_hdtype = Enumeration("hdtype", [ diff --git a/xen/arch/x86/hvm/viridian/private.h b/xen/arch/x86/hvm/viridian/private.h index 96a784b840.
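The NOTE in the commit message above refers to the partition reference time algorithm from the Hyper-V specification. In outline, the reference time in 100ns units is ((tsc * scale) >> 64) + offset. A hedged sketch of that formula, not the patch's time_now() itself; the helper name and the use of GCC's __int128 for the wide multiply are illustrative choices:

    /* Sketch of the spec's reference-time formula; not the committed code. */
    static uint64_t ref_time_from_tsc(uint64_t tsc, uint64_t scale,
                                      int64_t offset)
    {
        /* 64x64 -> 128 bit multiply, keeping the high 64 bits. */
        return (uint64_t)(((unsigned __int128)tsc * scale) >> 64) + offset;
    }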
[Xen-devel] [xen-4.9-testing test] 133756: regressions - FAIL
flight 133756 xen-4.9-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/133756/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-libvirt-pair 22 guest-migrate/src_host/dst_host fail REGR. vs. 132889 test-amd64-amd64-libvirt-pair 22 guest-migrate/src_host/dst_host fail REGR. vs. 132889 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail REGR. vs. 132889 Tests which are failing intermittently (not blocking): test-amd64-amd64-xl-qemut-ws16-amd64 14 guest-localmigrate fail in 133712 pass in 133756 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail in 133712 pass in 133756 test-amd64-amd64-xl-qemut-win10-i386 7 xen-boot fail pass in 133712 test-amd64-amd64-xl-qemuu-ws16-amd64 16 guest-localmigrate/x10 fail pass in 133712 test-amd64-i386-xl-qemut-ws16-amd64 16 guest-localmigrate/x10 fail pass in 133712 Tests which did not succeed, but are not blocking: test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail blocked in 132889 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail blocked in 132889 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail blocked in 132889 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail in 133712 blocked in 132889 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail in 133712 like 132889 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail in 133712 like 132889 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail in 133712 like 132889 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail in 133712 like 132889 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install fail in 133712 never pass test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 132889 test-amd64-amd64-xl-rtds 10 debian-install fail like 132889 test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 13 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 14 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 14 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail never pass test-arm64-arm64-xl 13 migrate-support-checkfail never pass test-arm64-arm64-xl 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 13 
migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-amd64-a
Re: [Xen-devel] [PATCH for-next 7/9] xen: Convert is_xen_heap_mfn to use typesafe MFN
>>> On 14.03.19 at 11:14, wrote: > You do know that mfn_add(mfn, -1) will DTRT? It will, but imo it looks slightly odd in particular in e.g. for ( ...; ...; mfn_add(mfn, -1) ), hence I didn't suggest it. But I don't object either. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v13] tolerate jitter in cpu_khz calculation to avoid TSC emulation
On Wed, 13 Mar 2019 03:18:39 -0600, "Jan Beulich" wrote: > The discontinuity is still there, and so far you've failed to explain > why a discontinuity is what you want here. In v12 you asked for some data about the ranges. With new data the ranges are:

  2.20GHz:    29 kHz
  2.30GHz:   105 kHz
  2.40GHz:  3524 kHz
  2.50GHz:   114 kHz

There are 5 systems in total in the 2.3/2.4/2.5 classes that have a much higher frequency, making the total ranges 15382/6052/19179 kHz. I have to check what these runaway values mean. The total range within the ntp.drift file is -71.138..345.942 ppm for >2000 hosts. I will see if any experiments can be done on the hosts which are on the edge of a frequency change. It would be nice to know whether ntpd can cope with tsc_mode=native, for example when a domU is migrated from one such host to another. Olaf ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
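For scale: assuming the ranges above are kHz of observed spread against the nominal frequency, they translate to roughly 13, 46, 1468 and 46 ppm respectively. ntpd's frequency correction is commonly documented as capped at 500 ppm, so only the 2.40GHz outliers would exceed what ntpd can slew away. A quick check of that arithmetic (illustrative, not from the thread):

    #include <stdio.h>

    int main(void)
    {
        /* Nominal frequency vs. observed range, both in kHz, from above. */
        static const struct { double khz, range_khz; } c[] = {
            { 2200000.0,   29.0 }, { 2300000.0,  105.0 },
            { 2400000.0, 3524.0 }, { 2500000.0,  114.0 },
        };

        for ( unsigned int i = 0; i < sizeof(c) / sizeof(c[0]); ++i )
            printf("%.2fGHz: %7.1f ppm\n", c[i].khz / 1e6,
                   c[i].range_khz / c[i].khz * 1e6);

        return 0;
    }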
[Xen-devel] [PATCH v2 2/2] tools: add new xl command get-hypervisor-config
Add new subcommand "get-hypervisor-config" to xl config to print the hypervisor .config file. To be able to reuse already existing decompressing code in libxenguest xc_inflate_buffer() has to be moved to libxenguest.h. Signed-off-by: Juergen Gross --- V2: - rename subcommand to get-hypervisor-config (Wei Liu) - use goto style error handling (Wei Liu) --- docs/man/xl.1.pod.in | 5 + tools/libxc/include/xenctrl.h | 8 tools/libxc/include/xenguest.h | 13 + tools/libxc/xc_misc.c | 42 ++ tools/libxc/xg_private.h | 4 tools/libxl/libxl.c| 34 ++ tools/libxl/libxl.h| 8 tools/xl/xl.h | 1 + tools/xl/xl_cmdtable.c | 5 + tools/xl/xl_misc.c | 20 10 files changed, 136 insertions(+), 4 deletions(-) diff --git a/docs/man/xl.1.pod.in b/docs/man/xl.1.pod.in index 4310fcd818..9d20958e91 100644 --- a/docs/man/xl.1.pod.in +++ b/docs/man/xl.1.pod.in @@ -844,6 +844,11 @@ Clears Xen's message buffer. =back +=item B + +Print the software configuration file (.config) used to build the +hypervisor. + =item B [I] Print information about the Xen host in I format. When diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h index a3628e56bb..c6a203e1a4 100644 --- a/tools/libxc/include/xenctrl.h +++ b/tools/libxc/include/xenctrl.h @@ -2632,6 +2632,14 @@ int xc_livepatch_replace(xc_interface *xch, char *name, uint32_t timeout); int xc_domain_cacheflush(xc_interface *xch, uint32_t domid, xen_pfn_t start_pfn, xen_pfn_t nr_pfns); +/* + * Get gzip-ed .config from hypervisor. + * *buffer must be free()-ed by caller. + * data size is returned in `size`. + * Returns 0 on success. + */ +int xc_get_config(xc_interface *xch, char **buffer, unsigned long *size); + /* Compat shims */ #include "xenctrl_compat.h" diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h index b4b2e19619..76e87ea97c 100644 --- a/tools/libxc/include/xenguest.h +++ b/tools/libxc/include/xenguest.h @@ -310,4 +310,17 @@ xen_pfn_t *xc_map_m2p(xc_interface *xch, unsigned long max_mfn, int prot, unsigned long *mfn0); + +/** + * Decompress a gzip-ed stream. 
+ * @parm xch a handle to an open hypervisor interface + * @parm in_buf buffer holding the gzip-ed data + * @parm in_size size in bytes of the gzip-ed data + * @parm out_size where to store the gunzip-ed data length + * @return new allocated buffer holding the gunzip-ed data + */ +char *xc_inflate_buffer(xc_interface *xch, +const char *in_buf, +unsigned long in_size, +unsigned long *out_size); #endif /* XENGUEST_H */ diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c index 5e6714ae2b..83d259e46e 100644 --- a/tools/libxc/xc_misc.c +++ b/tools/libxc/xc_misc.c @@ -888,6 +888,48 @@ int xc_livepatch_replace(xc_interface *xch, char *name, uint32_t timeout) return _xc_livepatch_action(xch, name, LIVEPATCH_ACTION_REPLACE, timeout); } +int xc_get_config(xc_interface *xch, char **buffer, unsigned long *size) +{ +int rc; +DECLARE_SYSCTL; +DECLARE_HYPERCALL_BUFFER(char, buf); + +sysctl.cmd = XEN_SYSCTL_get_config; +sysctl.u.get_config.size = 0; +set_xen_guest_handle(sysctl.u.get_config.buffer, HYPERCALL_BUFFER_NULL); +rc = do_sysctl(xch, &sysctl); +if ( rc ) +return rc; + +*size = sysctl.u.get_config.size; +buf = xc_hypercall_buffer_alloc(xch, buf, *size); +if ( !buf ) +{ +errno = ENOMEM; +return -1; +} + +sysctl.cmd = XEN_SYSCTL_get_config; +sysctl.u.get_config.size = *size; +set_xen_guest_handle(sysctl.u.get_config.buffer, buf); +rc = do_sysctl(xch, &sysctl); + +if ( rc ) +goto out; + +*buffer = calloc(1, *size); +if ( !*buffer ) +{ +errno = ENOMEM; +goto out; +} +memmove(*buffer, buf, *size); + +out: +xc_hypercall_buffer_free(xch, buf); +return rc; +} + /* * Local variables: * mode: C diff --git a/tools/libxc/xg_private.h b/tools/libxc/xg_private.h index f0a4b2c616..ca85e10737 100644 --- a/tools/libxc/xg_private.h +++ b/tools/libxc/xg_private.h @@ -43,10 +43,6 @@ char *xc_read_image(xc_interface *xch, const char *filename, unsigned long *size); -char *xc_inflate_buffer(xc_interface *xch, -const char *in_buf, -unsigned long in_size, -unsigned long *out_size); unsigned long csum_page (void * page); diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index ec71574e99..e363371811 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -669,6 +669,40 @@ int libxl_set_parameters(libxl_ctx *ctx, char *param
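The two interfaces above compose naturally: fetch the gzip-ed blob with xc_get_config(), then expand it with xc_inflate_buffer(). A minimal usage sketch, not part of the patch (the helper name is made up; the libxl plumbing in this patch presumably does the equivalent):

    #include <stdlib.h>
    #include <xenctrl.h>
    #include <xenguest.h>

    /* Hypothetical helper: returns a malloc()-ed gunzip-ed .config text. */
    char *get_config_text(xc_interface *xch, unsigned long *len_r)
    {
        char *gz, *txt;
        unsigned long gz_size;

        if ( xc_get_config(xch, &gz, &gz_size) )
            return NULL;

        /* Returns a newly allocated buffer holding the gunzip-ed data. */
        txt = xc_inflate_buffer(xch, gz, gz_size, len_r);

        free(gz);          /* per the API comment, *buffer is caller-freed */

        return txt;
    }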
[Xen-devel] [PATCH v2 1/2] xen: add interface for obtaining .config from hypervisor
Add a sysctl interface for obtaining the .config file used to build the hypervisor. The mechanism is inspired by the Linux kernel's one. Signed-off-by: Juergen Gross Reviewed-by: Jan Beulich (apart from XSM changes) --- V2: - bump sysctl interface version - check pad to be zero (Wei Liu) - only copy data if buffer is large enough (Wei Liu) - add .gitignore entry at correct position (Wei Liu) - make xen_config_data_sz const (Jan Beulich) --- .gitignore | 2 ++ tools/flask/policy/modules/dom0.te | 2 +- xen/common/Makefile | 7 +++ xen/common/sysctl.c | 17 + xen/include/public/sysctl.h | 18 +- xen/include/xen/kernel.h| 3 +++ xen/tools/Makefile | 9 +++-- xen/tools/bin2c.c | 28 xen/xsm/flask/hooks.c | 3 +++ xen/xsm/flask/policy/access_vectors | 2 ++ 10 files changed, 87 insertions(+), 4 deletions(-) create mode 100644 xen/tools/bin2c.c diff --git a/.gitignore b/.gitignore index 26bc583f74..b433bce092 100644 --- a/.gitignore +++ b/.gitignore @@ -309,6 +309,7 @@ xen/arch/*/efi/boot.c xen/arch/*/efi/compat.c xen/arch/*/efi/efi.h xen/arch/*/efi/runtime.c +xen/common/config_data.c xen/include/headers*.chk xen/include/asm xen/include/asm-*/asm-offsets.h @@ -326,6 +327,7 @@ xen/test/livepatch/xen_bye_world.livepatch xen/test/livepatch/xen_hello_world.livepatch xen/test/livepatch/xen_nop.livepatch xen/test/livepatch/xen_replace_world.livepatch +xen/tools/bin2c xen/tools/kconfig/.tmp_gtkcheck xen/tools/kconfig/.tmp_qtcheck xen/tools/symbols diff --git a/tools/flask/policy/modules/dom0.te b/tools/flask/policy/modules/dom0.te index a347d664f8..b776e9f307 100644 --- a/tools/flask/policy/modules/dom0.te +++ b/tools/flask/policy/modules/dom0.te @@ -16,7 +16,7 @@ allow dom0_t xen_t:xen { allow dom0_t xen_t:xen2 { resource_op psr_cmt_op psr_alloc pmu_ctrl get_symbol get_cpu_levelling_caps get_cpu_featureset livepatch_op - coverage_op set_parameter + coverage_op set_parameter get_config }; # Allow dom0 to use all XENVER_ subops that have checks. 
diff --git a/xen/common/Makefile b/xen/common/Makefile index bca48e6e22..7d98dad478 100644 --- a/xen/common/Makefile +++ b/xen/common/Makefile @@ -1,6 +1,7 @@ obj-$(CONFIG_ARGO) += argo.o obj-y += bitmap.o obj-y += bsearch.o +obj-y += config_data.o obj-$(CONFIG_CORE_PARKING) += core_parking.o obj-y += cpu.o obj-y += cpupool.o @@ -84,3 +85,9 @@ subdir-$(CONFIG_UBSAN) += ubsan subdir-$(CONFIG_NEEDS_LIBELF) += libelf subdir-$(CONFIG_HAS_DEVICE_TREE) += libfdt + +config_data.c: ../.config + ( echo "const char xen_config_data[] ="; \ + cat $< | gzip | ../tools/bin2c; \ + echo ";"; \ + echo "const unsigned int xen_config_data_sz = sizeof(xen_config_data) - 1;" ) > $@ diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c index c0aa6bde4e..7d4329882d 100644 --- a/xen/common/sysctl.c +++ b/xen/common/sysctl.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -502,6 +503,22 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl) break; } +case XEN_SYSCTL_get_config: +{ +if ( op->u.get_config.pad ) +{ +ret = -EINVAL; +break; +} +if ( xen_config_data_sz <= op->u.get_config.size && + copy_to_guest(op->u.get_config.buffer, xen_config_data, + xen_config_data_sz) ) +ret = -EFAULT; +op->u.get_config.size = xen_config_data_sz; + +break; +} + default: ret = arch_do_sysctl(op, u_sysctl); copyback = 0; diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h index c49b4dcc99..6139321971 100644 --- a/xen/include/public/sysctl.h +++ b/xen/include/public/sysctl.h @@ -36,7 +36,7 @@ #include "physdev.h" #include "tmem.h" -#define XEN_SYSCTL_INTERFACE_VERSION 0x0012 +#define XEN_SYSCTL_INTERFACE_VERSION 0x0013 /* * Read console content from Xen buffer ring. @@ -1100,6 +1100,20 @@ typedef struct xen_sysctl_cpu_policy xen_sysctl_cpu_policy_t; DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cpu_policy_t); #endif +/* + * XEN_SYSCTL_get_config + * + * Return gzip-ed .config file + */ +struct xen_sysctl_get_config { +XEN_GUEST_HANDLE_64(char) buffer; /* IN: pointer to buffer. */ +uint32_t size; /* IN: size of buffer. */ +/* OUT: size of config data. */ +uint32_t pad; /* IN: MUST be zero. */ +}; +typedef struct xen_sysctl_get_config xen_sysctl_get_config_t; +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_get_config_t); + struct xen_sysctl { uint32_t cmd; #define XEN_SYSCTL_readconsole1 @@ -1130,6 +1144,7 @@ struct xen_sysctl { #define XEN_SYSCTL_livepatch_op
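Given the Makefile rule above, the generated xen/common/config_data.c is expected to look roughly like this (the byte values are of course illustrative):

    const char xen_config_data[] =
        "\x1f\x8b\x08\x00..."   /* gzip stream of .config, emitted by bin2c */
    ;
    const unsigned int xen_config_data_sz = sizeof(xen_config_data) - 1;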
[Xen-devel] [PATCH v2 0/2] add xl command to get hypervisor .config
Add "xl get-hypervisor-config" printing the .config used to build the currently running hypervisor. Juergen Gross (2): xen: add interface for obtaining .config from hypervisor tools: add new xl command get-hypervisor-config .gitignore | 2 ++ docs/man/xl.1.pod.in| 5 + tools/flask/policy/modules/dom0.te | 2 +- tools/libxc/include/xenctrl.h | 8 +++ tools/libxc/include/xenguest.h | 13 tools/libxc/xc_misc.c | 42 + tools/libxc/xg_private.h| 4 tools/libxl/libxl.c | 34 ++ tools/libxl/libxl.h | 8 +++ tools/xl/xl.h | 1 + tools/xl/xl_cmdtable.c | 5 + tools/xl/xl_misc.c | 20 ++ xen/common/Makefile | 7 +++ xen/common/sysctl.c | 17 +++ xen/include/public/sysctl.h | 18 +++- xen/include/xen/kernel.h| 3 +++ xen/tools/Makefile | 9 ++-- xen/tools/bin2c.c | 28 + xen/xsm/flask/hooks.c | 3 +++ xen/xsm/flask/policy/access_vectors | 2 ++ 20 files changed, 223 insertions(+), 8 deletions(-) create mode 100644 xen/tools/bin2c.c -- 2.16.4 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH for-next 7/9] xen: Convert is_xen_heap_mfn to use typesafe MFN
On 14/03/2019 11:47, Jan Beulich wrote: On 14.03.19 at 11:14, wrote: >> You do know that mfn_add(mfn, -1) will DTRT? > It will, but imo it looks slightly odd in particular in e.g. > for ( ...; ...; mfn_add(mfn, -1) ), hence I didn't suggest it. But > I don't object either. Well - in mathematics, x - 1 is identical to x + (-1), so the construct doesn't look overly odd to me. Alternatively, we could rename mfn_add to something else, but nothing comes to mind which would be as clear. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
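For concreteness: mfn_add() is a thin typesafe wrapper, and the counterpart the thread alludes to would simply mirror it. The second helper below is hypothetical and not in the tree; it is shown only to make the naming discussion tangible:

    /* Existing helper, for reference. */
    static inline mfn_t mfn_add(mfn_t mfn, unsigned long i)
    {
        return _mfn(mfn_x(mfn) + i);
    }

    /* Hypothetical mfn_sub(); not in the tree. */
    static inline mfn_t mfn_sub(mfn_t mfn, unsigned long i)
    {
        return _mfn(mfn_x(mfn) - i);
    }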
Re: [Xen-devel] [PATCH v2 2/2] tools: add new xl command get-hypervisor-config
On Thu, Mar 14, 2019 at 12:59:37PM +0100, Juergen Gross wrote: > Add new subcommand "get-hypervisor-config" to xl config to print the > hypervisor .config file. > > To be able to reuse already existing decompressing code in libxenguest > xc_inflate_buffer() has to be moved to libxenguest.h. > > Signed-off-by: Juergen Gross Acked-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 1/2] xen: add interface for obtaining .config from hypervisor
On Thu, Mar 14, 2019 at 12:59:36PM +0100, Juergen Gross wrote: > Add a sysctl interface for obtaining the .config file used to build > the hypervisor. The mechanism is inspired by the Linux kernel's one. > > Signed-off-by: Juergen Gross > Reviewed-by: Jan Beulich (apart from XSM changes) Reviewed-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
On 13/03/2019 08:02, Jan Beulich wrote: On 13.03.19 at 08:54, wrote: > On 13.03.19 at 06:02, wrote: >>> On Tue, Mar 12, 2019 at 05:07:51PM -0700, Raj, Ashok wrote: On Mon, Mar 11, 2019 at 03:57:35PM +0800, Chao Gao wrote: > +if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) ) > +ret = microcode_update_cpu(); Does ret have any useful things on if the update failed? Doesn't seem to be used before you overwrite later in collect_cpu_info()? >>> It has the reason of failure on error. Actually, there are two reasons: >>> one is no patch of newer revision, the other is we tried to update but >>> the microcode revision didn't change. I can check this return value and >>> print more informative message to admin. And furthermore, for the >>> latter, we can remove the ucode patch from caches to address Roger's >>> concern expressed in comments to patch 4 & 5. >> Btw, I'm not sure removing such ucode from the cache is appropriate: >> It may well apply elsewhere, unless there's a clear indication that the >> blob is broken. So perhaps there needs to be special casing of -EIO, >> which gets returned when the ucode rev reported by the CPU after >> the update does not match expectations. > And to go one step further, perhaps we should also store more than > just the newest variant for a given pf. If the newest fails to apply > but there is another one newer than what's on a CPU, updating to > that may work, and once that intermediate update worked, the > update to the newest version may then work too. I don't think this is sensible. Running with mismatched microcode is unsupported (for a suitable definition of mismatched). At boot, there will be 1 individual blob which is applicable to the CPUs, and gets loaded on all APs. (Possibly more than 1 blob, but remember that only multi-socket servers and top end workstations have a chance of having mixed steppings in the first place.) During late load, we will have 1 (or more) blobs provided in the late hypercall, and we will apply in a logically-atomic fashion in a rendezvous'd context. There are a few outcomes from this action. If the ucode application fails internally, the system is already lost and will crash. This is an inherent risk which people doing late loading need to be aware of (and why test workloads exist in production setups). If some cores accept the update but others don't, then we've also got serious problems and probably system instability. This is the kind of problem which needs to be detected during testing, and may require a change in application strategy, or may in practice prevent a particular update from being declared safe to late load. The expected case is that all cores accept the blob during the rendezvous. As a result, I don't see any need to store more than a single ucode version (whether this is a single blob or perhaps a set of closely related blobs for a mixed-stepping system) in the steady state (to cope with AP boot, suspend/resume, or CPU hotplug), and a second version which is the proposed-new ucode for late loading. On late load failure, we should dump enough information to work out exactly what went on, to determine how best to proceed, but the server is effectively lost to us. On late load success, the proposed new "version" replaces the current "version". And again - I reiterate the point that I think it is fine to have a simplifying assumption that we don't have mixed stepping systems to start with, presuming this is generally in line with Intel's support statement.
If in practice we find mixed stepping systems which are supported by an OEM/Intel, we can see about extending the logic. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] L1TF Patch Series v10
Dear all, This patch series attempts to mitigate the issues that have been raised in XSA-289 (https://xenbits.xen.org/xsa/advisory-289.html). To block speculative execution on Intel hardware, an lfence instruction is required to make sure that selected checks are not bypassed. Speculative out-of-bound accesses can be prevented by using the array_index_nospec macro. The major change compared to version 9 is patch 8/8, which slipped through the analysis in the first rounds. Best, Norbert Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich Ust-ID: DE 289 237 879 Eingetragen am Amtsgericht Charlottenburg HRB 149173 B ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH L1TF v10 1/8] spec: add l1tf-barrier
To control the runtime behavior on L1TF vulnerable platforms better, the command line option l1tf-barrier is introduced. This option controls whether on vulnerable x86 platforms the lfence instruction is used to prevent speculative execution from bypassing the evaluation of conditionals that are protected with the evaluate_nospec macro. Xen is already capable of identifying L1TF vulnerable hardware. However, this information cannot be used for alternative patching, as a CPU feature is required. To control alternative patching with the command line option, a new x86 feature "X86_FEATURE_SC_L1TF_VULN" is introduced. This feature is used to patch the lfence instruction into the arch_barrier_nospec_true function. The feature is enabled only if L1TF vulnerable hardware is detected and the command line option does not prevent using this feature. The status of hyperthreading is considered when deciding automatically whether to add the lfence instruction. Since platforms without hyperthreading can still be vulnerable to L1TF in case the L1 cache is not flushed properly, the additional lfence instructions are patched in if either hyperthreading is enabled, or L1 cache flushing is missing. This is part of the speculative hardening effort. Signed-off-by: Norbert Manthey Reviewed-by: Jan Beulich --- docs/misc/xen-command-line.pandoc | 14 ++ xen/arch/x86/spec_ctrl.c | 17 +++-- xen/include/asm-x86/cpufeatures.h | 1 + xen/include/asm-x86/spec_ctrl.h | 1 + 4 files changed, 27 insertions(+), 6 deletions(-) diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc --- a/docs/misc/xen-command-line.pandoc +++ b/docs/misc/xen-command-line.pandoc @@ -483,9 +483,9 @@ accounting for hardware capabilities as enumerated via CPUID. Currently accepted: -The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`, -`l1d-flush` and `ssbd` are used by default if available and applicable. They can -be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and +The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`, `l1d-flush`, +`l1tf-barrier` and `ssbd` are used by default if available and applicable. They +can be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and won't offer them to guests. ### cpuid_mask_cpu @@ -1902,7 +1902,7 @@ By default SSBD will be mitigated at runtime (i.e `ssbd=runtime`). ### spec-ctrl (x86) > `= List of [ , xen=, {pv,hvm,msr-sc,rsb}=, > bti-thunk=retpoline|lfence|jmp, {ibrs,ibpb,ssbd,eager-fpu, -> l1d-flush}= ]` +> l1d-flush,l1tf-barrier}= ]` Controls for speculative execution sidechannel mitigations. By default, Xen will pick the most appropriate mitigations based on compiled in support, @@ -1968,6 +1968,12 @@ Irrespective of Xen's setting, the feature is virtualised for HVM guests to use. By default, Xen will enable this mitigation on hardware believed to be vulnerable to L1TF. +On hardware vulnerable to L1TF, the `l1tf-barrier=` option can be used to force +or prevent Xen from protecting evaluations inside the hypervisor with a barrier +instruction to not load potentially secret information into L1 cache. By +default, Xen will enable this mitigation on hardware believed to be vulnerable +to L1TF.
+ ### sync_console > `= ` diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c --- a/xen/arch/x86/spec_ctrl.c +++ b/xen/arch/x86/spec_ctrl.c @@ -21,6 +21,7 @@ #include #include +#include #include #include #include @@ -50,6 +51,7 @@ bool __read_mostly opt_ibpb = true; bool __read_mostly opt_ssbd = false; int8_t __read_mostly opt_eager_fpu = -1; int8_t __read_mostly opt_l1d_flush = -1; +int8_t __read_mostly opt_l1tf_barrier = -1; bool __initdata bsp_delay_spec_ctrl; uint8_t __read_mostly default_xen_spec_ctrl; @@ -91,6 +93,8 @@ static int __init parse_spec_ctrl(const char *s) if ( opt_pv_l1tf_domu < 0 ) opt_pv_l1tf_domu = 0; +opt_l1tf_barrier = 0; + disable_common: opt_rsb_pv = false; opt_rsb_hvm = false; @@ -157,6 +161,8 @@ static int __init parse_spec_ctrl(const char *s) opt_eager_fpu = val; else if ( (val = parse_boolean("l1d-flush", s, ss)) >= 0 ) opt_l1d_flush = val; +else if ( (val = parse_boolean("l1tf-barrier", s, ss)) >= 0 ) +opt_l1tf_barrier = val; else rc = -EINVAL; @@ -248,7 +254,7 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps) "\n"); /* Settings for Xen's protection, irrespective of guests. */ -printk(" Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s\n", +printk(" Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s%s\n", thunk == THUNK_NONE ? "N/A" : thunk == THUNK_RETPOLINE ? "RETPOLINE" : thunk == THUNK_LFENCE
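The spec_ctrl.c hunk is cut off above. Per the commit message, the remainder selects a default and forces the synthetic feature so that alternative patching inserts the lfence. A hedged reconstruction, where the details are assumptions rather than the verbatim hunk:

    /*
     * Assumed shape of the truncated hunk: default to enabling on
     * L1TF-vulnerable hardware when SMT is on or L1D flushing is off.
     */
    if ( opt_l1tf_barrier < 0 )
        opt_l1tf_barrier = cpu_has_bug_l1tf && (opt_smt || !opt_l1d_flush);

    /* Force the synthetic cap so that alternative patching takes effect. */
    if ( opt_l1tf_barrier > 0 )
        setup_force_cpu_cap(X86_FEATURE_SC_L1TF_VULN);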
[Xen-devel] [PATCH L1TF v10 2/8] nospec: introduce evaluate_nospec
Since the disclosure of the L1TF vulnerability of Intel CPUs, loading hypervisor data into L1 cache is problematic, because when hyperthreading is used as well, a guest running on the sibling core can leak this potentially secret data. To prevent these speculative accesses, we block speculation after accessing the domain property field by adding lfence instructions. This way, the CPU continues executing and loading data only once the condition is actually evaluated. As this protection is typically used in if statements, the lfence has to come in a compatible way. Therefore, a function that returns true after an lfence instruction is introduced. To protect both branches after a conditional, an lfence instruction has to be added for the two branches. To be able to block speculation after several evaluations, the generic barrier macro block_speculation is also introduced. As the L1TF vulnerability is only present on the x86 architecture, there is no need to add protection for other architectures. Hence, the introduced functions are defined but empty. On the x86 architecture, by default, the lfence instruction is not present either. Only when an L1TF vulnerable platform is detected, the lfence instruction is patched in via alternative patching. Similarly, PV guests are protected wrt L1TF by default, so that the protection is furthermore disabled in case HVM is excluded via the build configuration. Introducing the lfence instructions catches a lot of potential leaks with a simple unintrusive code change. During performance testing, we did not notice performance effects. This is part of the speculative hardening effort. Signed-off-by: Norbert Manthey Acked-by: Julien Grall --- xen/include/asm-arm/nospec.h | 25 + xen/include/asm-x86/nospec.h | 39 +++ xen/include/xen/nospec.h | 1 + 3 files changed, 65 insertions(+) create mode 100644 xen/include/asm-arm/nospec.h create mode 100644 xen/include/asm-x86/nospec.h diff --git a/xen/include/asm-arm/nospec.h b/xen/include/asm-arm/nospec.h new file mode 100644 --- /dev/null +++ b/xen/include/asm-arm/nospec.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. */ + +#ifndef _ASM_ARM_NOSPEC_H +#define _ASM_ARM_NOSPEC_H + +static inline bool evaluate_nospec(bool condition) +{ +return condition; +} + +static inline void block_speculation(void) +{ +} + +#endif /* _ASM_ARM_NOSPEC_H */ + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ diff --git a/xen/include/asm-x86/nospec.h b/xen/include/asm-x86/nospec.h new file mode 100644 --- /dev/null +++ b/xen/include/asm-x86/nospec.h @@ -0,0 +1,39 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. */ + +#ifndef _ASM_X86_NOSPEC_H +#define _ASM_X86_NOSPEC_H + +#include + +/* Allow to insert a read memory barrier into conditionals */ +static always_inline bool barrier_nospec_true(void) +{ +#ifdef CONFIG_HVM +alternative("", "lfence", X86_FEATURE_SC_L1TF_VULN); +#endif +return true; +} + +/* Allow to protect evaluation of conditionals with respect to speculation */ +static always_inline bool evaluate_nospec(bool condition) +{ +return condition ?
barrier_nospec_true() : !barrier_nospec_true(); +} + +/* Allow to block speculative execution in generic code */ +static always_inline void block_speculation(void) +{ +barrier_nospec_true(); +} + +#endif /* _ASM_X86_NOSPEC_H */ + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h --- a/xen/include/xen/nospec.h +++ b/xen/include/xen/nospec.h @@ -8,6 +8,7 @@ #define XEN_NOSPEC_H #include +#include /** * array_index_mask_nospec() - generate a ~0 mask when index < size, 0 otherwise -- 2.7.4 Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich Ust-ID: DE 289 237 879 Eingetragen am Amtsgericht Charlottenburg HRB 149173 B ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
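The intended usage pattern for the two primitives, as the later patches in this series apply them (illustrative names, not code from the series):

    /* Fence both outcomes of a guest-influenced conditional. */
    if ( evaluate_nospec(idx < nr_entries) )
        val = entries[idx];  /* load runs only once the bound really held */

    /* Or fence once after several checks guarding a region of code. */
    if ( idx >= nr_entries || !allowed )
        return -EINVAL;
    block_speculation();     /* nothing below starts ahead of the checks */
    val = entries[idx];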
[Xen-devel] [PATCH L1TF v10 3/8] is_control_domain: block speculation
Checks of domain properties, such as is_hardware_domain or is_hvm_domain, might be bypassed by speculatively executing these instructions. A reason for bypassing these checks is that these macros access the domain structure via a pointer, and check a certain field. Since this memory access is slow, the CPU assumes a returned value and continues the execution. In case an is_control_domain check is bypassed, for example during a hypercall, data that should only be accessible by the control domain could be loaded into the cache. This is part of the speculative hardening effort. Signed-off-by: Norbert Manthey Acked-by: Jan Beulich --- xen/include/xen/sched.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -913,10 +913,10 @@ void watchdog_domain_destroy(struct domain *d); *(that is, this would not be suitable for a driver domain) * - There is never a reason to deny the hardware domain access to this */ -#define is_hardware_domain(_d) ((_d) == hardware_domain) +#define is_hardware_domain(_d) evaluate_nospec((_d) == hardware_domain) /* This check is for functionality specific to a control domain */ -#define is_control_domain(_d) ((_d)->is_privileged) +#define is_control_domain(_d) evaluate_nospec((_d)->is_privileged) #define VM_ASSIST(d, t) (test_bit(VMASST_TYPE_ ## t, &(d)->vm_assist)) -- 2.7.4 Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich Ust-ID: DE 289 237 879 Eingetragen am Amtsgericht Charlottenburg HRB 149173 B ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH L1TF v10 4/8] is_hvm/pv_domain: block speculation
When checking for being an hvm domain, or PV domain, we have to make sure that speculation cannot bypass that check, and eventually access data that should not end up in cache for the current domain type. This is part of the speculative hardening effort. Signed-off-by: Norbert Manthey Acked-by: Jan Beulich --- xen/include/xen/sched.h | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -922,7 +922,8 @@ void watchdog_domain_destroy(struct domain *d); static inline bool is_pv_domain(const struct domain *d) { -return IS_ENABLED(CONFIG_PV) ? d->guest_type == guest_type_pv : false; +return IS_ENABLED(CONFIG_PV) + ? evaluate_nospec(d->guest_type == guest_type_pv) : false; } static inline bool is_pv_vcpu(const struct vcpu *v) @@ -953,7 +954,8 @@ static inline bool is_pv_64bit_vcpu(const struct vcpu *v) #endif static inline bool is_hvm_domain(const struct domain *d) { -return IS_ENABLED(CONFIG_HVM) ? d->guest_type == guest_type_hvm : false; +return IS_ENABLED(CONFIG_HVM) + ? evaluate_nospec(d->guest_type == guest_type_hvm) : false; } static inline bool is_hvm_vcpu(const struct vcpu *v) -- 2.7.4 Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich Ust-ID: DE 289 237 879 Eingetragen am Amtsgericht Charlottenburg HRB 149173 B ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH L1TF v10 7/8] common/grant_table: block speculative out-of-bound accesses
Guests can issue grant table operations and provide guest controlled data to them. This data is also used for memory loads. To avoid speculative out-of-bound accesses, we use the array_index_nospec macro where applicable. However, there are also memory accesses that cannot be protected by a single array protection, or multiple accesses in a row. To protect these, a nospec barrier is placed between the actual range check and the access via the block_speculation macro. Speculative execution is not blocked in case one of the following properties is true: - path cannot be triggered by the guest - path does not return to the guest - path does not result in an out-of-bound access - path cannot be executed repeatedly Only the combination of the above properties allows continuous chunks of memory to actually be leaked. Therefore, we only add the penalty of protective mechanisms in case a potential speculative out-of-bound access matches all the above properties. As different versions of grant tables use structures of different size, and the status is encoded in an array for version 2, speculative execution might perform out-of-bound accesses of version 2 while the table is actually using version 1. Hence, speculation is prevented when accessing new memory based on the grant table version. In cases where no different memory locations are accessed on the code paths that follow an if statement, no protection is required. No different memory locations are accessed in the following functions after a version check: * _set_status, as the header memory layout is the same * unmap_common, as potentially touched memory locations are allocated and initialized * gnttab_grow_table, as the touched memory is the same for each branch after the conditionals * gnttab_transfer, as no memory access depends on the conditional * release_grant_for_copy, as no out-of-bound access depends on this conditional * gnttab_set_version, as in case of a version change all the memory is touched in both cases * gnttab_release_mappings, as this function is called only during domain destruction and control is not returned to the guest * mem_sharing_gref_to_gfn, as potentially dangerous memory accesses are covered by the next evaluate_nospec * gnttab_get_status_frame, as the potentially dangerous memory accesses are protected in gnttab_get_status_frame_mfn * gnttab_usage_print, as this function cannot be triggered by the guest This is part of the speculative hardening effort.
Signed-off-by: Norbert Manthey --- Notes: v10: extended commit message with explanation when to exclude comparisons minor change in gnttab_transfer due to rebase xen/common/grant_table.c | 97 +--- 1 file changed, 75 insertions(+), 22 deletions(-) diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c --- a/xen/common/grant_table.c +++ b/xen/common/grant_table.c @@ -37,6 +37,7 @@ #include #include #include +#include #include #include @@ -203,8 +204,9 @@ static inline unsigned int nr_status_frames(const struct grant_table *gt) } #define MAPTRACK_PER_PAGE (PAGE_SIZE / sizeof(struct grant_mapping)) -#define maptrack_entry(t, e) \ -((t)->maptrack[(e)/MAPTRACK_PER_PAGE][(e)%MAPTRACK_PER_PAGE]) +#define maptrack_entry(t, e) \ +((t)->maptrack[array_index_nospec(e, (t)->maptrack_limit) / \ +MAPTRACK_PER_PAGE][(e) % MAPTRACK_PER_PAGE]) static inline unsigned int nr_maptrack_frames(struct grant_table *t) @@ -226,10 +228,23 @@ nr_maptrack_frames(struct grant_table *t) static grant_entry_header_t * shared_entry_header(struct grant_table *t, grant_ref_t ref) { -if ( t->gt_version == 1 ) +switch ( t->gt_version ) +{ +case 1: +/* Returned values should be independent of speculative execution */ +block_speculation(); return (grant_entry_header_t*)&shared_entry_v1(t, ref); -else + +case 2: +/* Returned values should be independent of speculative execution */ +block_speculation(); return &shared_entry_v2(t, ref).hdr; +} + +ASSERT_UNREACHABLE(); +block_speculation(); + +return NULL; } /* Active grant entry - used for shadowing GTF_permit_access grants. */ @@ -634,14 +649,24 @@ static unsigned int nr_grant_entries(struct grant_table *gt) case 1: BUILD_BUG_ON(f2e(INITIAL_NR_GRANT_FRAMES, 1) < GNTTAB_NR_RESERVED_ENTRIES); + +/* Make sure we return a value independently of speculative execution */ +block_speculation(); return f2e(nr_grant_frames(gt), 1); + case 2: BUILD_BUG_ON(f2e(INITIAL_NR_GRANT_FRAMES, 2) < GNTTAB_NR_RESERVED_ENTRIES); + +/* Make sure we return a value independen
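The diff is truncated above; stripped of grant table specifics, the two idioms it applies throughout are (hypothetical names, shown in isolation):

    /*
     * 1: clamp the guest-supplied index so even speculative execution
     *    stays in bounds; this is what the reworked maptrack_entry() does.
     */
    idx = array_index_nospec(idx, limit);
    ent = &table[idx];

    /*
     * 2: when one clamp cannot cover several dependent accesses, fence
     *    between the architectural check and the accesses.
     */
    if ( idx >= limit )
        return -EINVAL;
    block_speculation();
    ent = &table[idx];
    st = &status[idx];      /* second, dependent access */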
[Xen-devel] [PATCH L1TF v10 6/8] x86/hvm: add nospec to hvmop param
The params array in hvm can be accessed with get and set functions. As the index is guest controlled, make sure no out-of-bound accesses can be performed. As we cannot influence how future compilers might modify the instructions that enforce the bounds, we furthermore block speculation, so that the update is visible in the architectural state. This is part of the speculative hardening effort. Signed-off-by: Norbert Manthey Acked-by: Jan Beulich --- xen/arch/x86/hvm/hvm.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -4135,6 +4135,9 @@ static int hvmop_set_param( if ( a.index >= HVM_NR_PARAMS ) return -EINVAL; +/* Make sure the above bound check is not bypassed during speculation. */ +block_speculation(); + d = rcu_lock_domain_by_any_id(a.domid); if ( d == NULL ) return -ESRCH; @@ -4401,6 +4404,9 @@ static int hvmop_get_param( if ( a.index >= HVM_NR_PARAMS ) return -EINVAL; +/* Make sure the above bound check is not bypassed during speculation. */ +block_speculation(); + d = rcu_lock_domain_by_any_id(a.domid); if ( d == NULL ) return -ESRCH; -- 2.7.4 Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich Ust-ID: DE 289 237 879 Eingetragen am Amtsgericht Charlottenburg HRB 149173 B ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH L1TF v10 5/8] common/memory: block speculative out-of-bound accesses
The get_page_from_gfn method returns a pointer to a page that belongs to a gfn. Before returning the pointer, the gfn is checked for being valid. Under speculation, these checks can be bypassed, so that the function get_page is still executed partially. Consequently, the function page_get_owner_and_reference might be executed partially as well. In this function, the computed pointer is accessed, resulting in a speculative out-of-bound address load. As the gfn can be controlled by a guest, this access is problematic. To mitigate the root cause, an lfence instruction is added via the evaluate_nospec macro. To make the protection generic, we do not introduce the lfence instruction for this single check, but add it to the mfn_valid function. This way, other potentially problematic accesses are protected as well. This is part of the speculative hardening effort. Signed-off-by: Norbert Manthey Acked-by: Jan Beulich --- xen/common/pdx.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/xen/common/pdx.c b/xen/common/pdx.c --- a/xen/common/pdx.c +++ b/xen/common/pdx.c @@ -18,6 +18,7 @@ #include #include #include +#include /* Parameters for PFN/MADDR compression. */ unsigned long __read_mostly max_pdx; @@ -33,8 +34,9 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS( bool __mfn_valid(unsigned long mfn) { -return likely(mfn < max_page) && - likely(!(mfn & pfn_hole_mask)) && +if ( unlikely(evaluate_nospec(mfn >= max_page)) ) +return false; +return likely(!(mfn & pfn_hole_mask)) && likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT, pdx_group_valid)); } -- 2.7.4 Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich Ust-ID: DE 289 237 879 Eingetragen am Amtsgericht Charlottenburg HRB 149173 B ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH L1TF v10 8/8] common/domain: block speculative out-of-bound accesses
When issuing a vcpu_op hypercall, guests have control over the vcpuid variable. In the old code, this allowed to perform speculative out-of-bound accesses. To block this, we make use of the domain_vcpu function. This is part of the speculative hardening effort. Signed-off-by: Norbert Manthey --- xen/common/domain.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/xen/common/domain.c b/xen/common/domain.c --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -1365,7 +1365,7 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) struct vcpu *v; long rc = 0; -if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL ) +if ( (v = domain_vcpu(d, vcpuid)) == NULL ) return -ENOENT; switch ( cmd ) -- 2.7.4 Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich Ust-ID: DE 289 237 879 Eingetragen am Amtsgericht Charlottenburg HRB 149173 B ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
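For reference, domain_vcpu() (added earlier in this hardening work) folds the architectural bounds check and the speculation clamp together. Roughly, modulo cosmetics:

    static inline struct vcpu *domain_vcpu(const struct domain *d,
                                           unsigned int vcpu_id)
    {
        unsigned int idx = array_index_nospec(vcpu_id, d->max_vcpus);

        return vcpu_id >= d->max_vcpus ? NULL : d->vcpu[idx];
    }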
Re: [Xen-devel] [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
On Wed, Mar 13, 2019 at 02:02:55AM -0600, Jan Beulich wrote: On 13.03.19 at 08:54, wrote: > On 13.03.19 at 06:02, wrote: >>> On Tue, Mar 12, 2019 at 05:07:51PM -0700, Raj, Ashok wrote: On Mon, Mar 11, 2019 at 03:57:35PM +0800, Chao Gao wrote: > +if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) ) > +ret = microcode_update_cpu(); Does ret have any useful things on if the update failed? Doesn't seem to be used before you overwrite later in collect_cpu_info()? >>> >>> It has the reason of failure on error. Actually, there are two reasons: >>> one is no patch of newer revision, the other is we tried to update but >>> the microcode revision didn't change. I can check this return value and >>> print more informative message to admin. And furthermore, for the >>> latter, we can remove the ucode patch from caches to address Roger's >>> concern expressed in comments to patch 4 & 5. >> >> Btw, I'm not sure removing such ucode from the cache is appropriate: >> It may well apply elsewhere, unless there's a clear indication that the >> blob is broken. Yes. Got it. Can we just assume we won't encounter the case where the ucode update succeeded on only part of the CPUs, and warn that a reboot is needed if it happens? We definitely want to tolerate some kinds of hardware misbehavior. But for such cases which are unlikely to happen, I prefer to improve this code when we meet this kind of issue. >> So perhaps there needs to be special casing of -EIO, >> which gets returned when the ucode rev reported by the CPU after >> the update does not match expectations. > >And to go one step further, perhaps we should also store more than >just the newest variant for a given pf. If the newest fails to apply >but there is another one newer than what's on a CPU, updating to >that may work, and once that intermediate update worked, the >update to the newest version may then work too. The Intel SDM doesn't mention this dependency (i.e. that applying a ucode update relies on a specific older ucode already being applied). Perhaps we can also assume we won't fall into this case. Hi Ashok, Do you know whether Intel's ucode update mechanism has such a dependency? Thanks Chao ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3 2/4] x86: move the saved value of MSR_IA32_XSS into struct vcpu_msrs
Currently the value is saved directly in struct hvm_vcpu. This patch simply co-locates it with other saved MSR values. No functional change. Signed-off-by: Paul Durrant Reviewed-by: Jan Beulich Reviewed-by: Kevin Tian --- Cc: Andrew Cooper Cc: Wei Liu Cc: "Roger Pau Monné" Cc: Jun Nakajima --- xen/arch/x86/domain.c | 2 +- xen/arch/x86/hvm/hvm.c | 4 ++-- xen/arch/x86/hvm/vmx/vmx.c | 4 ++-- xen/include/asm-x86/hvm/vcpu.h | 1 - xen/include/asm-x86/msr.h | 5 + 5 files changed, 10 insertions(+), 6 deletions(-) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 8d579e2cf9..aa38555736 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -1653,7 +1653,7 @@ static void __context_switch(void) BUG(); if ( cpu_has_xsaves && is_hvm_vcpu(n) ) -set_msr_xss(n->arch.hvm.msr_xss); +set_msr_xss(n->arch.msrs->xss.raw); } vcpu_restore_fpu_nonlazy(n, false); nd->arch.ctxt_switch->to(n); diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index e566d83f8b..dff590e658 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -3466,7 +3466,7 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content) case MSR_IA32_XSS: if ( !d->arch.cpuid->xstate.xsaves ) goto gp_fault; -*msr_content = v->arch.hvm.msr_xss; +*msr_content = v->arch.msrs->xss.raw; break; case MSR_K8_ENABLE_C1E: @@ -3612,7 +3612,7 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content, /* No XSS features currently supported for guests. */ if ( !d->arch.cpuid->xstate.xsaves || msr_content != 0 ) goto gp_fault; -v->arch.hvm.msr_xss = msr_content; +v->arch.msrs->xss.raw = msr_content; break; case MSR_AMD64_NB_CFG: diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index 6445f44c3b..4941924cf6 100644 --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -807,7 +807,7 @@ static void vmx_save_msr(struct vcpu *v, struct hvm_msr *ctxt) { if ( cpu_has_xsaves && cpu_has_vmx_xsaves ) { -ctxt->msr[ctxt->count].val = v->arch.hvm.msr_xss; +ctxt->msr[ctxt->count].val = v->arch.msrs->xss.raw; if ( ctxt->msr[ctxt->count].val ) ctxt->msr[ctxt->count++].index = MSR_IA32_XSS; } @@ -826,7 +826,7 @@ static int vmx_load_msr(struct vcpu *v, struct hvm_msr *ctxt) { case MSR_IA32_XSS: if ( cpu_has_xsaves && cpu_has_vmx_xsaves ) -v->arch.hvm.msr_xss = ctxt->msr[i].val; +v->arch.msrs->xss.raw = ctxt->msr[i].val; else err = -ENXIO; break; diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h index 6c84d5a5a6..5563d28a4e 100644 --- a/xen/include/asm-x86/hvm/vcpu.h +++ b/xen/include/asm-x86/hvm/vcpu.h @@ -176,7 +176,6 @@ struct hvm_vcpu { struct hvm_vcpu_asid n1asid; u64 msr_tsc_adjust; -u64 msr_xss; union { struct vmx_vcpu vmx; diff --git a/xen/include/asm-x86/msr.h b/xen/include/asm-x86/msr.h index a7244793bf..0d52c085f6 100644 --- a/xen/include/asm-x86/msr.h +++ b/xen/include/asm-x86/msr.h @@ -313,6 +313,11 @@ struct vcpu_msrs * values here may be stale in current context. */ uint32_t dr_mask[4]; + +/* 0x0da0 - MSR_IA32_XSS */ +struct { +uint64_t raw; +} xss; }; void init_guest_msr_policy(void); -- 2.20.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3 1/4] x86: stop handling MSR_IA32_BNDCFGS save/restore in implementation code
Saving and restoring the value of this MSR is currently handled by implementation-specific code despite it being architectural. This patch moves handling of accesses to this MSR from hvm.c into the msr.c, thus allowing the common MSR save/restore code to handle it. NOTE: Because vmx_get/set_guest_bndcfgs() call vmx_vmcs_enter(), the struct vcpu pointer passed in, and hence the vcpu pointer passed to guest_rdmsr() cannot be const. Signed-off-by: Paul Durrant --- Cc: Kevin Tian Cc: Jan Beulich Cc: Andrew Cooper Cc: Wei Liu Cc: "Roger Pau Monné" Cc: Jun Nakajima v3: - Address further comments from Jan - Dropped Kevin's R-b because of change to vmx.c v2: - Addressed comments from Jan by largely removing hunks - Keeping Kevin's R-b since remaining hunks in vmx.c are as before --- xen/arch/x86/hvm/hvm.c| 14 ++ xen/arch/x86/hvm/vmx/vmx.c| 26 ++ xen/arch/x86/msr.c| 12 xen/include/asm-x86/hvm/hvm.h | 4 ++-- 4 files changed, 22 insertions(+), 34 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 8adbb61b57..e566d83f8b 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1303,6 +1303,7 @@ static int hvm_load_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h) static const uint32_t msrs_to_send[] = { MSR_SPEC_CTRL, MSR_INTEL_MISC_FEATURES_ENABLES, +MSR_IA32_BNDCFGS, MSR_AMD64_DR0_ADDRESS_MASK, MSR_AMD64_DR1_ADDRESS_MASK, MSR_AMD64_DR2_ADDRESS_MASK, @@ -1440,6 +1441,7 @@ static int hvm_load_cpu_msrs(struct domain *d, hvm_domain_context_t *h) case MSR_SPEC_CTRL: case MSR_INTEL_MISC_FEATURES_ENABLES: +case MSR_IA32_BNDCFGS: case MSR_AMD64_DR0_ADDRESS_MASK: case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK: rc = guest_wrmsr(v, ctxt->msr[i].index, ctxt->msr[i].val); @@ -3467,12 +3469,6 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content) *msr_content = v->arch.hvm.msr_xss; break; -case MSR_IA32_BNDCFGS: -if ( !d->arch.cpuid->feat.mpx || - !hvm_get_guest_bndcfgs(v, msr_content) ) -goto gp_fault; -break; - case MSR_K8_ENABLE_C1E: case MSR_AMD64_NB_CFG: /* @@ -3619,12 +3615,6 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content, v->arch.hvm.msr_xss = msr_content; break; -case MSR_IA32_BNDCFGS: -if ( !d->arch.cpuid->feat.mpx || - !hvm_set_guest_bndcfgs(v, msr_content) ) -goto gp_fault; -break; - case MSR_AMD64_NB_CFG: /* ignore the write */ break; diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index 725dd88c13..6445f44c3b 100644 --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -805,17 +805,6 @@ static unsigned int __init vmx_init_msr(void) static void vmx_save_msr(struct vcpu *v, struct hvm_msr *ctxt) { -vmx_vmcs_enter(v); - -if ( cpu_has_mpx && cpu_has_vmx_mpx ) -{ -__vmread(GUEST_BNDCFGS, &ctxt->msr[ctxt->count].val); -if ( ctxt->msr[ctxt->count].val ) -ctxt->msr[ctxt->count++].index = MSR_IA32_BNDCFGS; -} - -vmx_vmcs_exit(v); - if ( cpu_has_xsaves && cpu_has_vmx_xsaves ) { ctxt->msr[ctxt->count].val = v->arch.hvm.msr_xss; @@ -835,14 +824,6 @@ static int vmx_load_msr(struct vcpu *v, struct hvm_msr *ctxt) { switch ( ctxt->msr[i].index ) { -case MSR_IA32_BNDCFGS: -if ( cpu_has_mpx && cpu_has_vmx_mpx && - is_canonical_address(ctxt->msr[i].val) && - !(ctxt->msr[i].val & IA32_BNDCFGS_RESERVED) ) -__vmwrite(GUEST_BNDCFGS, ctxt->msr[i].val); -else if ( ctxt->msr[i].val ) -err = -ENXIO; -break; case MSR_IA32_XSS: if ( cpu_has_xsaves && cpu_has_vmx_xsaves ) v->arch.hvm.msr_xss = ctxt->msr[i].val; @@ -1215,10 +1196,15 @@ static bool vmx_set_guest_bndcfgs(struct vcpu *v, u64 
val) return true; } -static bool vmx_get_guest_bndcfgs(struct vcpu *v, u64 *val) +static bool vmx_get_guest_bndcfgs(const struct vcpu *cv, u64 *val) { +struct vcpu *v; + ASSERT(cpu_has_mpx && cpu_has_vmx_mpx); +/* Get a non-const pointer for vmx_vmcs_enter() */ +v = cv->domain->vcpu[cv->vcpu_id]; + vmx_vmcs_enter(v); __vmread(GUEST_BNDCFGS, val); vmx_vmcs_exit(v); diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c index 4df4a59f4d..363fd63096 100644 --- a/xen/arch/x86/msr.c +++ b/xen/arch/x86/msr.c @@ -160,6 +160,12 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val) ret = guest_rdmsr_x2apic(v, msr, val); break; +case MSR_IA32_BNDCFGS: +if ( !cp->feat.mpx || !hvm_get_guest_bndcfgs(v, val) ) +goto gp_fault; + +
[Xen-devel] [PATCH v3 3/4] x86: stop handling MSR_IA32_XSS save/restore in implementation code
Saving and restoring the value of this MSR is currently handled by
implementation-specific code despite it being architectural. This patch
moves handling of accesses to this MSR from hvm.c into msr.c, thus
allowing the common MSR save/restore code to handle it.

This patch also adds proper checks of CPUID policy in the new get/set
code.

NOTE: MSR_IA32_XSS is the last MSR to be saved and restored by
      implementation-specific code. This patch therefore removes the
      (VMX) definitions of the init_msr(), save_msr() and load_msr()
      hvm_funcs, as they are no longer necessary. The declarations of
      and calls to those hvm_funcs will be cleaned up by a subsequent
      patch.

Signed-off-by: Paul Durrant
Reviewed-by: Jan Beulich
Reviewed-by: Kevin Tian
---
Cc: Andrew Cooper
Cc: Wei Liu
Cc: "Roger Pau Monné"
Cc: Jun Nakajima
---
 xen/arch/x86/hvm/hvm.c     | 15 ++--
 xen/arch/x86/hvm/vmx/vmx.c | 49 --
 xen/arch/x86/msr.c         | 18 ++
 3 files changed, 20 insertions(+), 62 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index dff590e658..deb7fb2adb 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1304,6 +1304,7 @@ static const uint32_t msrs_to_send[] = {
     MSR_SPEC_CTRL,
     MSR_INTEL_MISC_FEATURES_ENABLES,
     MSR_IA32_BNDCFGS,
+    MSR_IA32_XSS,
     MSR_AMD64_DR0_ADDRESS_MASK,
     MSR_AMD64_DR1_ADDRESS_MASK,
     MSR_AMD64_DR2_ADDRESS_MASK,
@@ -1442,6 +1443,7 @@ static int hvm_load_cpu_msrs(struct domain *d, hvm_domain_context_t *h)
         case MSR_SPEC_CTRL:
         case MSR_INTEL_MISC_FEATURES_ENABLES:
         case MSR_IA32_BNDCFGS:
+        case MSR_IA32_XSS:
         case MSR_AMD64_DR0_ADDRESS_MASK:
         case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
             rc = guest_wrmsr(v, ctxt->msr[i].index, ctxt->msr[i].val);
@@ -3463,12 +3465,6 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
                             MTRRcap_VCNT))];
         break;
 
-    case MSR_IA32_XSS:
-        if ( !d->arch.cpuid->xstate.xsaves )
-            goto gp_fault;
-        *msr_content = v->arch.msrs->xss.raw;
-        break;
-
     case MSR_K8_ENABLE_C1E:
     case MSR_AMD64_NB_CFG:
         /*
@@ -3608,13 +3604,6 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
             goto gp_fault;
         break;
 
-    case MSR_IA32_XSS:
-        /* No XSS features currently supported for guests. */
-        if ( !d->arch.cpuid->xstate.xsaves || msr_content != 0 )
-            goto gp_fault;
-        v->arch.msrs->xss.raw = msr_content;
-        break;
-
     case MSR_AMD64_NB_CFG:
         /* ignore the write */
         break;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 4941924cf6..043c4268a0 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -797,52 +797,6 @@ static int vmx_load_vmcs_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt)
     return 0;
 }
 
-static unsigned int __init vmx_init_msr(void)
-{
-    return (cpu_has_mpx && cpu_has_vmx_mpx) +
-           (cpu_has_xsaves && cpu_has_vmx_xsaves);
-}
-
-static void vmx_save_msr(struct vcpu *v, struct hvm_msr *ctxt)
-{
-    if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
-    {
-        ctxt->msr[ctxt->count].val = v->arch.msrs->xss.raw;
-        if ( ctxt->msr[ctxt->count].val )
-            ctxt->msr[ctxt->count++].index = MSR_IA32_XSS;
-    }
-}
-
-static int vmx_load_msr(struct vcpu *v, struct hvm_msr *ctxt)
-{
-    unsigned int i;
-    int err = 0;
-
-    vmx_vmcs_enter(v);
-
-    for ( i = 0; i < ctxt->count; ++i )
-    {
-        switch ( ctxt->msr[i].index )
-        {
-        case MSR_IA32_XSS:
-            if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
-                v->arch.msrs->xss.raw = ctxt->msr[i].val;
-            else
-                err = -ENXIO;
-            break;
-        default:
-            continue;
-        }
-        if ( err )
-            break;
-        ctxt->msr[i]._rsvd = 1;
-    }
-
-    vmx_vmcs_exit(v);
-
-    return err;
-}
-
 static void vmx_fpu_enter(struct vcpu *v)
 {
     vcpu_restore_fpu_lazy(v);
@@ -2284,9 +2238,6 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .vcpu_destroy         = vmx_vcpu_destroy,
     .save_cpu_ctxt        = vmx_save_vmcs_ctxt,
     .load_cpu_ctxt        = vmx_load_vmcs_ctxt,
-    .init_msr             = vmx_init_msr,
-    .save_msr             = vmx_save_msr,
-    .load_msr             = vmx_load_msr,
     .get_interrupt_shadow = vmx_get_interrupt_shadow,
     .set_interrupt_shadow = vmx_set_interrupt_shadow,
     .guest_x86_mode       = vmx_guest_x86_mode,
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 363fd63096..cfa49d7a63 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -166,6 +166,13 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
         break;
[Xen-devel] [PATCH v3 4/4] x86: remove defunct init/load/save_msr() hvm_funcs
These hvm_funcs are no longer required since no MSR values are saved or restored by implementation-specific code. Signed-off-by: Paul Durrant Reviewed-by: Jan Beulich --- Cc: Andrew Cooper Cc: Wei Liu Cc: "Roger Pau Monné" v2: - Re-instate err check on loop in hvm_load_cpu_msrs() --- xen/arch/x86/hvm/hvm.c| 29 + xen/include/asm-x86/hvm/hvm.h | 4 2 files changed, 9 insertions(+), 24 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index deb7fb2adb..f84bf11f0f 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1310,7 +1310,6 @@ static const uint32_t msrs_to_send[] = { MSR_AMD64_DR2_ADDRESS_MASK, MSR_AMD64_DR3_ADDRESS_MASK, }; -static unsigned int __read_mostly msr_count_max = ARRAY_SIZE(msrs_to_send); static int hvm_save_cpu_msrs(struct vcpu *v, hvm_domain_context_t *h) { @@ -1320,7 +1319,7 @@ static int hvm_save_cpu_msrs(struct vcpu *v, hvm_domain_context_t *h) int err; err = _hvm_init_entry(h, CPU_MSR_CODE, v->vcpu_id, - HVM_CPU_MSR_SIZE(msr_count_max)); + HVM_CPU_MSR_SIZE(ARRAY_SIZE(msrs_to_send))); if ( err ) return err; ctxt = (struct hvm_msr *)&h->data[h->cur]; @@ -1353,10 +1352,7 @@ static int hvm_save_cpu_msrs(struct vcpu *v, hvm_domain_context_t *h) ctxt->msr[ctxt->count++].val = val; } -if ( hvm_funcs.save_msr ) -hvm_funcs.save_msr(v, ctxt); - -ASSERT(ctxt->count <= msr_count_max); +ASSERT(ctxt->count <= ARRAY_SIZE(msrs_to_send)); for ( i = 0; i < ctxt->count; ++i ) ctxt->msr[i]._rsvd = 0; @@ -1431,9 +1427,6 @@ static int hvm_load_cpu_msrs(struct domain *d, hvm_domain_context_t *h) return -EOPNOTSUPP; /* Checking finished */ -if ( hvm_funcs.load_msr ) -err = hvm_funcs.load_msr(v, ctxt); - for ( i = 0; !err && i < ctxt->count; ++i ) { switch ( ctxt->msr[i].index ) @@ -1475,17 +1468,13 @@ static int __init hvm_register_CPU_save_and_restore(void) sizeof(struct hvm_save_descriptor), HVMSR_PER_VCPU); -if ( hvm_funcs.init_msr ) -msr_count_max += hvm_funcs.init_msr(); - -if ( msr_count_max ) -hvm_register_savevm(CPU_MSR_CODE, -"CPU_MSR", -hvm_save_cpu_msrs, -hvm_load_cpu_msrs, -HVM_CPU_MSR_SIZE(msr_count_max) + -sizeof(struct hvm_save_descriptor), -HVMSR_PER_VCPU); +hvm_register_savevm(CPU_MSR_CODE, +"CPU_MSR", +hvm_save_cpu_msrs, +hvm_load_cpu_msrs, +HVM_CPU_MSR_SIZE(ARRAY_SIZE(msrs_to_send)) + +sizeof(struct hvm_save_descriptor), +HVMSR_PER_VCPU); return 0; } diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h index d41ed63232..a70fdbe298 100644 --- a/xen/include/asm-x86/hvm/hvm.h +++ b/xen/include/asm-x86/hvm/hvm.h @@ -115,10 +115,6 @@ struct hvm_function_table { void (*save_cpu_ctxt)(struct vcpu *v, struct hvm_hw_cpu *ctxt); int (*load_cpu_ctxt)(struct vcpu *v, struct hvm_hw_cpu *ctxt); -unsigned int (*init_msr)(void); -void (*save_msr)(struct vcpu *, struct hvm_msr *); -int (*load_msr)(struct vcpu *, struct hvm_msr *); - /* Examine specifics of the guest state. */ unsigned int (*get_interrupt_shadow)(struct vcpu *v); void (*set_interrupt_shadow)(struct vcpu *v, unsigned int intr_shadow); -- 2.20.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v3 0/4] clean up MSR save/restore code
Paul Durrant (4): x86: stop handling MSR_IA32_BNDCFGS save/restore in implementation code x86: move the saved value of MSR_IA32_XSS into struct vcpu_msrs x86: stop handling MSR_IA32_XSS save/restore in implementation code x86: remove defunct init/load/save_msr() hvm_funcs xen/arch/x86/domain.c | 2 +- xen/arch/x86/hvm/hvm.c | 58 ++ xen/arch/x86/hvm/vmx/vmx.c | 75 +++--- xen/arch/x86/msr.c | 30 ++ xen/include/asm-x86/hvm/hvm.h | 8 +--- xen/include/asm-x86/hvm/vcpu.h | 1 - xen/include/asm-x86/msr.h | 5 +++ 7 files changed, 57 insertions(+), 122 deletions(-) --- v2: - Drop patches #2 and #6 of the v1 series Cc: Andrew Cooper Cc: Jan Beulich Cc: Jun Nakajima Cc: Kevin Tian Cc: "Roger Pau Monné" Cc: Wei Liu -- 2.20.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
>>> On 14.03.19 at 14:01, wrote:
> On Wed, Mar 13, 2019 at 02:02:55AM -0600, Jan Beulich wrote:
> On 13.03.19 at 08:54, wrote:
>> On 13.03.19 at 06:02, wrote:
On Tue, Mar 12, 2019 at 05:07:51PM -0700, Raj, Ashok wrote:
>On Mon, Mar 11, 2019 at 03:57:35PM +0800, Chao Gao wrote:
>> +        if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>> +            ret = microcode_update_cpu();
>
>Does ret have any useful things on if the update failed? Doesn't seem
>to be used before you overwrite later in collect_cpu_info()?

It has the reason of failure on error. Actually, there are two reasons:
one is that there is no patch of a newer revision; the other is that we
tried to update but the microcode revision didn't change. I can check
this return value and print a more informative message to the admin. And
furthermore, for the latter, we can remove the ucode patch from the
caches to address Roger's concern expressed in comments to patches 4 & 5.
>>>
>>> Btw, I'm not sure removing such ucode from the cache is appropriate:
>>> It may well apply elsewhere, unless there's a clear indication that the
>>> blob is broken.

> Yes. Got it. Can we just assume we won't encounter the case where a ucode
> update succeeded on only part of the CPUs, and warn that a reboot is
> needed if it happened? We definitely want to tolerate some kinds of
> hardware misbehavior. But for such cases, which are unlikely to happen,
> I prefer to improve this code when we meet this kind of issue.

Okay, for both this and ...

>>> So perhaps there needs to be special casing of -EIO,
>>> which gets returned when the ucode rev reported by the CPU after
>>> the update does not match expectations.
>>
>> And to go one step further, perhaps we should also store more than
>> just the newest variant for a given pf. If the newest fails to apply
>> but there is another one newer than what's on a CPU, updating to
>> that may work, and once that intermediate update worked, the
>> update to the newest version may then work too.

... see also Andrew's reply. As long as no-one is aware of mixed-stepping
systems out there in the wild, I'm fine with simplifying things where
possible.

> Intel's SDM doesn't mention this dependency (that applying a ucode update
> relies on a specific older ucode already being applied). Perhaps we can
> also assume we won't fall into this case.

Please see Linux's arch/x86/kernel/cpu/microcode/intel.c:is_blacklisted()
for one of the possible problematic situations here. Of course the SDM
wouldn't mention it; only the errata document for this specific model
would.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
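[Editorial note: to illustrate the multi-revision caching idea discussed
above, a cache could keep a newest-to-oldest list of blobs per CPU
signature, so a failed update can fall back to an intermediate revision.
This is only a sketch of the idea under discussion; the struct layout and
names below are made up for illustration and are not Xen's actual
microcode code.]

#include <stdint.h>
#include <stddef.h>

/* Hypothetical cache entry: one list of blobs per CPU signature. */
struct ucode_patch {
    uint32_t sig;              /* family/model/stepping signature */
    uint32_t pf;               /* platform flags this blob applies to */
    uint32_t rev;              /* microcode revision of the blob */
    const void *data;
    struct ucode_patch *next;  /* sorted newest to oldest */
};

/*
 * After the newest blob failed with -EIO, pick the newest cached blob
 * that is still newer than the CPU's current revision but older than
 * the one which failed: the "intermediate step" mentioned above.
 */
static const struct ucode_patch *
find_intermediate(const struct ucode_patch *list, uint32_t sig,
                  uint32_t pf, uint32_t cur_rev, uint32_t failed_rev)
{
    for ( ; list; list = list->next )
        if ( list->sig == sig && (list->pf & pf) &&
             list->rev > cur_rev && list->rev < failed_rev )
            return list;

    return NULL;
}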
Re: [Xen-devel] [PATCH 1/2] xen: introduce a cpumask with all bits set
>>> On 14.03.19 at 10:59, wrote:
> @@ -1892,8 +1891,7 @@ static void __init check_timer(void)
>      vector = IRQ0_VECTOR;
>      clear_irq_vector(0);
>
> -    cpumask_setall(&mask_all);
> -    if ((ret = bind_irq_vector(0, vector, &mask_all)))
> +    if ((ret = bind_irq_vector(0, vector, &cpumask_all)))

Since you replace the users of cpumask_setall(), wouldn't it be better
to also remove that function from the header?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH] xen/netfront: Remove unneeded .resume callback
From: Oleksandr Andrushchenko

Currently on driver resume we remove all the network queues and
destroy the shared Tx/Rx rings, leaving the driver in its current state
and never signaling the backend of this frontend's state change.
This leads to a number of consequences:
- when the frontend withdraws granted references to the rings etc. this
  cannot be done cleanly, as the backend still holds those references
  (it was not told to free the resources)
- it is not possible to resume driver operation, as all the
  communication means with the backend were destroyed by the frontend,
  thus making the frontend appear functional to the guest OS when it
  really is not.

Fix this by not destroying communication channels/rings on driver
resume.

Signed-off-by: Oleksandr Andrushchenko
---
 drivers/net/xen-netfront.c | 17 -
 1 file changed, 17 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index c914c24f880b..2ca162048da4 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1422,22 +1422,6 @@ static void xennet_disconnect_backend(struct netfront_info *info)
 	}
 }
 
-/**
- * We are reconnecting to the backend, due to a suspend/resume, or a backend
- * driver restart. We tear down our netif structure and recreate it, but
- * leave the device-layer structures intact so that this is transparent to the
- * rest of the kernel.
- */
-static int netfront_resume(struct xenbus_device *dev)
-{
-	struct netfront_info *info = dev_get_drvdata(&dev->dev);
-
-	dev_dbg(&dev->dev, "%s\n", dev->nodename);
-
-	xennet_disconnect_backend(info);
-	return 0;
-}
-
 static int xen_net_read_mac(struct xenbus_device *dev, u8 mac[])
 {
 	char *s, *e, *macstr;
@@ -2185,7 +2169,6 @@ static struct xenbus_driver netfront_driver = {
 	.ids = netfront_ids,
 	.probe = netfront_probe,
 	.remove = xennet_remove,
-	.resume = netfront_resume,
 	.otherend_changed = netback_changed,
 };
-- 
2.21.0

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH L1TF v10 2/8] nospec: introduce evaluate_nospec
>>> On 14.03.19 at 13:50, wrote:
> Since the L1TF vulnerability of Intel CPUs, loading hypervisor data into
> the L1 cache is problematic, because when hyperthreading is used as well,
> a guest running on the sibling core can leak this potentially secret data.
>
> To prevent these speculative accesses, we block speculation after
> accessing the domain property field by adding lfence instructions. This
> way, the CPU continues executing and loading data only once the condition
> is actually evaluated.
>
> As this protection is typically used in if statements, the lfence has to
> come in a compatible way. Therefore, a function that returns true after an
> lfence instruction is introduced. To protect both branches after a
> conditional, an lfence instruction has to be added for the two branches.
> To be able to block speculation after several evaluations, the generic
> barrier macro block_speculation is also introduced.
>
> As the L1TF vulnerability is only present on the x86 architecture, there is
> no need to add protection for other architectures. Hence, the introduced
> functions are defined but empty.
>
> On the x86 architecture, by default, the lfence instruction is not present
> either. Only when an L1TF-vulnerable platform is detected is the lfence
> instruction patched in via alternative patching. Similarly, PV guests
> are protected wrt L1TF by default, so that the protection is furthermore
> disabled in case HVM is excluded via the build configuration.
>
> Introducing the lfence instructions catches a lot of potential leaks with
> a simple unintrusive code change. During performance testing, we did not
> notice performance effects.
>
> This is part of the speculative hardening effort.
>
> Signed-off-by: Norbert Manthey
> Acked-by: Julien Grall

I did give my ack on v9, and I see no indication of changes which
may have invalidated it.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH L1TF v10 8/8] common/domain: block speculative out-of-bound accesses
>>> On 14.03.19 at 13:50, wrote:
> When issuing a vcpu_op hypercall, guests have control over the
> vcpuid variable. In the old code, this allowed speculative
> out-of-bound accesses to be performed. To block this, we make use
> of the domain_vcpu function.
>
> This is part of the speculative hardening effort.
>
> Signed-off-by: Norbert Manthey

Reviewed-by: Jan Beulich

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH L1TF v10 2/8] nospec: introduce evaluate_nospec
On 3/14/19 14:19, Jan Beulich wrote:
On 14.03.19 at 13:50, wrote:
>> Since the L1TF vulnerability of Intel CPUs, loading hypervisor data into
>> the L1 cache is problematic, because when hyperthreading is used as well,
>> a guest running on the sibling core can leak this potentially secret data.
>>
>> To prevent these speculative accesses, we block speculation after
>> accessing the domain property field by adding lfence instructions. This
>> way, the CPU continues executing and loading data only once the condition
>> is actually evaluated.
>>
>> As this protection is typically used in if statements, the lfence has to
>> come in a compatible way. Therefore, a function that returns true after an
>> lfence instruction is introduced. To protect both branches after a
>> conditional, an lfence instruction has to be added for the two branches.
>> To be able to block speculation after several evaluations, the generic
>> barrier macro block_speculation is also introduced.
>>
>> As the L1TF vulnerability is only present on the x86 architecture, there is
>> no need to add protection for other architectures. Hence, the introduced
>> functions are defined but empty.
>>
>> On the x86 architecture, by default, the lfence instruction is not present
>> either. Only when an L1TF-vulnerable platform is detected is the lfence
>> instruction patched in via alternative patching. Similarly, PV guests
>> are protected wrt L1TF by default, so that the protection is furthermore
>> disabled in case HVM is excluded via the build configuration.
>>
>> Introducing the lfence instructions catches a lot of potential leaks with
>> a simple unintrusive code change. During performance testing, we did not
>> notice performance effects.
>>
>> This is part of the speculative hardening effort.
>>
>> Signed-off-by: Norbert Manthey
>> Acked-by: Julien Grall
> I did give my ack on v9, and I see no indication of changes which
> may have invalidated it.

That is a miss on my side.

Norbert

>
> Jan
>
>

Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 1/2] xen: introduce a cpumask with all bits set
On 14/03/2019 14:12, Jan Beulich wrote:
> On 14.03.19 at 10:59, wrote:
>> @@ -1892,8 +1891,7 @@ static void __init check_timer(void)
>>      vector = IRQ0_VECTOR;
>>      clear_irq_vector(0);
>>
>> -    cpumask_setall(&mask_all);
>> -    if ((ret = bind_irq_vector(0, vector, &mask_all)))
>> +    if ((ret = bind_irq_vector(0, vector, &cpumask_all)))
>
> Since you replace the users of cpumask_setall(), wouldn't it be
> better to also remove that function from the header?

I don't replace all of them. There are still some users left where
replacing with cpumask_all would be less optimal.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
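[Editorial note: for readers following along, the idea is a single
read-only all-ones mask that callers can reference instead of building
one on the stack. A sketch of the shape, not the verbatim patch; the
initializer style is assumed:]

/* Sketch only: a constant, read-only all-ones cpumask (cpumask_t,
 * NR_CPUS and BITS_TO_LONGS as in Xen's headers). */
extern const cpumask_t cpumask_all;

const cpumask_t cpumask_all = {
    .bits = { [0 ... BITS_TO_LONGS(NR_CPUS) - 1] = ~0UL },
};

/* Callers then drop the local mask plus cpumask_setall() pair, e.g.: */
if ( (ret = bind_irq_vector(0, vector, &cpumask_all)) )
    /* error handling as before */;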
Re: [Xen-devel] [PATCH 1/2] xen/debug: make debugtrace configurable via Kconfig
>>> On 14.03.19 at 10:37, wrote: > Instead of having to edit include/xen/lib.h for making debugtrace > available make it configurable via Kconfig. > > Default is off, it is available only in expert mode or in debug builds. > > Signed-off-by: Juergen Gross Acked-by: Jan Beulich ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] xen/debug: make debugtrace more clever regarding repeating entries
>>> On 14.03.19 at 10:37, wrote: > --- a/xen/drivers/char/console.c > +++ b/xen/drivers/char/console.c > @@ -1225,13 +1225,28 @@ void debugtrace_dump(void) > watchdog_enable(); > } > > +static void debugtrace_add_to_buf(char *buf) > +{ > +char *p; > + > +for ( p = buf; *p != '\0'; p++ ) > +{ > +debugtrace_buf[debugtrace_prd++] = *p; > +/* Always leave a nul byte at the end of the buffer. */ > +if ( debugtrace_prd == (debugtrace_bytes - 1) ) > +debugtrace_prd = 0; > +} > +} > + > void debugtrace_printk(const char *fmt, ...) > { > -static charbuf[1024]; > -static u32 count; > +static char buf[1024]; > +static char last_buf[1024]; > +static u32 count, last_count; Please change to uint32_t or even better simply to unsigned int. > @@ -1243,25 +1258,32 @@ void debugtrace_printk(const char *fmt, ...) > > ASSERT(debugtrace_buf[debugtrace_bytes - 1] == 0); > > -snprintf(buf, sizeof(buf), "%u ", ++count); > - > va_start(args, fmt); > -(void)vsnprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), fmt, args); > +(void)vsnprintf(buf, sizeof(buf), fmt, args); Please take the opportunity and drop the stray cast. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
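[Editorial note: for context, the repeat-folding this patch is after
could look roughly like the following. This is a sketch reconstructed
from the hunks quoted above, using the buf/last_buf/last_count statics
and the debugtrace_add_to_buf() helper they introduce; the final patch's
exact output format may well differ.]

    va_start(args, fmt);
    vsnprintf(buf, sizeof(buf), fmt, args);
    va_end(args);

    if ( !strcmp(buf, last_buf) )
        last_count++;              /* identical to the previous entry */
    else
    {
        if ( last_count > 1 )
        {
            char cnt[36];

            /* Note how often the previous entry repeated. */
            snprintf(cnt, sizeof(cnt), "(repeated %u times)\n", last_count);
            debugtrace_add_to_buf(cnt);
        }
        memcpy(last_buf, buf, sizeof(last_buf));
        last_count = 1;
        debugtrace_add_to_buf(buf);
    }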
Re: [Xen-devel] [PATCH v3 1/4] x86: stop handling MSR_IA32_BNDCFGS save/restore in implementation code
>>> On 14.03.19 at 14:01, wrote: > @@ -1215,10 +1196,15 @@ static bool vmx_set_guest_bndcfgs(struct vcpu *v, u64 > val) > return true; > } > > -static bool vmx_get_guest_bndcfgs(struct vcpu *v, u64 *val) > +static bool vmx_get_guest_bndcfgs(const struct vcpu *cv, u64 *val) > { > +struct vcpu *v; > + > ASSERT(cpu_has_mpx && cpu_has_vmx_mpx); > > +/* Get a non-const pointer for vmx_vmcs_enter() */ > +v = cv->domain->vcpu[cv->vcpu_id]; Any chance this could be made the initializer of the variable? > --- a/xen/arch/x86/msr.c > +++ b/xen/arch/x86/msr.c > @@ -160,6 +160,12 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, > uint64_t *val) > ret = guest_rdmsr_x2apic(v, msr, val); > break; > > +case MSR_IA32_BNDCFGS: > +if ( !cp->feat.mpx || !hvm_get_guest_bndcfgs(v, val) ) > +goto gp_fault; Didn't we settle on ASSERT(is_hvm_*)? > @@ -323,6 +329,12 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t > val) > ret = guest_wrmsr_x2apic(v, msr, val); > break; > > +case MSR_IA32_BNDCFGS: > +if ( !cp->feat.mpx || !hvm_set_guest_bndcfgs(v, val) ) > +goto gp_fault; Same here? Or did Andrew tell you otherwise? Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] xen/debug: make debugtrace more clever regarding repeating entries
On 14/03/2019 14:33, Jan Beulich wrote: On 14.03.19 at 10:37, wrote: >> --- a/xen/drivers/char/console.c >> +++ b/xen/drivers/char/console.c >> @@ -1225,13 +1225,28 @@ void debugtrace_dump(void) >> watchdog_enable(); >> } >> >> +static void debugtrace_add_to_buf(char *buf) >> +{ >> +char *p; >> + >> +for ( p = buf; *p != '\0'; p++ ) >> +{ >> +debugtrace_buf[debugtrace_prd++] = *p; >> +/* Always leave a nul byte at the end of the buffer. */ >> +if ( debugtrace_prd == (debugtrace_bytes - 1) ) >> +debugtrace_prd = 0; >> +} >> +} >> + >> void debugtrace_printk(const char *fmt, ...) >> { >> -static charbuf[1024]; >> -static u32 count; >> +static char buf[1024]; >> +static char last_buf[1024]; >> +static u32 count, last_count; > > Please change to uint32_t or even better simply to unsigned int. Okay. > >> @@ -1243,25 +1258,32 @@ void debugtrace_printk(const char *fmt, ...) >> >> ASSERT(debugtrace_buf[debugtrace_bytes - 1] == 0); >> >> -snprintf(buf, sizeof(buf), "%u ", ++count); >> - >> va_start(args, fmt); >> -(void)vsnprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), fmt, >> args); >> +(void)vsnprintf(buf, sizeof(buf), fmt, args); > > Please take the opportunity and drop the stray cast. Will do. Juergen ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] xen/debug: make debugtrace more clever regarding repeating entries
On 14/03/2019 13:38, Juergen Gross wrote: > On 14/03/2019 14:33, Jan Beulich wrote: > On 14.03.19 at 10:37, wrote: >>> --- a/xen/drivers/char/console.c >>> +++ b/xen/drivers/char/console.c >>> @@ -1225,13 +1225,28 @@ void debugtrace_dump(void) >>> watchdog_enable(); >>> } >>> >>> +static void debugtrace_add_to_buf(char *buf) >>> +{ >>> +char *p; >>> + >>> +for ( p = buf; *p != '\0'; p++ ) >>> +{ >>> +debugtrace_buf[debugtrace_prd++] = *p; >>> +/* Always leave a nul byte at the end of the buffer. */ >>> +if ( debugtrace_prd == (debugtrace_bytes - 1) ) >>> +debugtrace_prd = 0; >>> +} >>> +} >>> + >>> void debugtrace_printk(const char *fmt, ...) >>> { >>> -static charbuf[1024]; >>> -static u32 count; >>> +static char buf[1024]; >>> +static char last_buf[1024]; >>> +static u32 count, last_count; >> Please change to uint32_t or even better simply to unsigned int. > Okay. > >>> @@ -1243,25 +1258,32 @@ void debugtrace_printk(const char *fmt, ...) >>> >>> ASSERT(debugtrace_buf[debugtrace_bytes - 1] == 0); >>> >>> -snprintf(buf, sizeof(buf), "%u ", ++count); >>> - >>> va_start(args, fmt); >>> -(void)vsnprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), fmt, >>> args); >>> +(void)vsnprintf(buf, sizeof(buf), fmt, args); >> Please take the opportunity and drop the stray cast. > Will do. Both can be done on commit, surely? ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v3 1/4] x86: stop handling MSR_IA32_BNDCFGS save/restore in implementation code
> -Original Message- > From: Jan Beulich [mailto:jbeul...@suse.com] > Sent: 14 March 2019 13:39 > To: Paul Durrant > Cc: Andrew Cooper ; Roger Pau Monne > ; Wei Liu > ; Jun Nakajima ; Kevin Tian > ; xen- > devel > Subject: Re: [PATCH v3 1/4] x86: stop handling MSR_IA32_BNDCFGS save/restore > in implementation code > > >>> On 14.03.19 at 14:01, wrote: > > @@ -1215,10 +1196,15 @@ static bool vmx_set_guest_bndcfgs(struct vcpu *v, > > u64 val) > > return true; > > } > > > > -static bool vmx_get_guest_bndcfgs(struct vcpu *v, u64 *val) > > +static bool vmx_get_guest_bndcfgs(const struct vcpu *cv, u64 *val) > > { > > +struct vcpu *v; > > + > > ASSERT(cpu_has_mpx && cpu_has_vmx_mpx); > > > > +/* Get a non-const pointer for vmx_vmcs_enter() */ > > +v = cv->domain->vcpu[cv->vcpu_id]; > > Any chance this could be made the initializer of the variable? I thought it looked slightly odd having a comment above the declaration, which is why I did it this way, but I can move it. > > > --- a/xen/arch/x86/msr.c > > +++ b/xen/arch/x86/msr.c > > @@ -160,6 +160,12 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, > > uint64_t *val) > > ret = guest_rdmsr_x2apic(v, msr, val); > > break; > > > > +case MSR_IA32_BNDCFGS: > > +if ( !cp->feat.mpx || !hvm_get_guest_bndcfgs(v, val) ) > > +goto gp_fault; > > Didn't we settle on ASSERT(is_hvm_*)? > > > @@ -323,6 +329,12 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t > > val) > > ret = guest_wrmsr_x2apic(v, msr, val); > > break; > > > > +case MSR_IA32_BNDCFGS: > > +if ( !cp->feat.mpx || !hvm_set_guest_bndcfgs(v, val) ) > > +goto gp_fault; > > Same here? Or did Andrew tell you otherwise? Andrew was ok with me just dropping it, but I can put in the ASSERT if you prefer. Paul > > Jan > ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] xen/netfront: Remove unneeded .resume callback
I am still confused about suspend/resume for the front drivers. For
instance, blkfront [1] does implement a full/proper reconnect on
.resume, but netfront only does this partially.

Could anyone please shed some light on the suspend/resume design?

Thank you,
Oleksandr

On 3/14/19 3:17 PM, Oleksandr Andrushchenko wrote:
From: Oleksandr Andrushchenko

Currently on driver resume we remove all the network queues and
destroy the shared Tx/Rx rings, leaving the driver in its current state
and never signaling the backend of this frontend's state change.
This leads to a number of consequences:
- when the frontend withdraws granted references to the rings etc. this
  cannot be done cleanly, as the backend still holds those references
  (it was not told to free the resources)
- it is not possible to resume driver operation, as all the
  communication means with the backend were destroyed by the frontend,
  thus making the frontend appear functional to the guest OS when it
  really is not.

Fix this by not destroying communication channels/rings on driver
resume.

Signed-off-by: Oleksandr Andrushchenko
---
 drivers/net/xen-netfront.c | 17 -
 1 file changed, 17 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index c914c24f880b..2ca162048da4 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1422,22 +1422,6 @@ static void xennet_disconnect_backend(struct netfront_info *info)
 	}
 }
 
-/**
- * We are reconnecting to the backend, due to a suspend/resume, or a backend
- * driver restart. We tear down our netif structure and recreate it, but
- * leave the device-layer structures intact so that this is transparent to the
- * rest of the kernel.
- */
-static int netfront_resume(struct xenbus_device *dev)
-{
-	struct netfront_info *info = dev_get_drvdata(&dev->dev);
-
-	dev_dbg(&dev->dev, "%s\n", dev->nodename);
-
-	xennet_disconnect_backend(info);
-	return 0;
-}
-
 static int xen_net_read_mac(struct xenbus_device *dev, u8 mac[])
 {
 	char *s, *e, *macstr;
@@ -2185,7 +2169,6 @@ static struct xenbus_driver netfront_driver = {
 	.ids = netfront_ids,
 	.probe = netfront_probe,
 	.remove = xennet_remove,
-	.resume = netfront_resume,
 	.otherend_changed = netback_changed,
 };

[1] https://elixir.bootlin.com/linux/v5.0.2/source/drivers/block/xen-blkfront.c#L2072

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
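[Editorial note: for comparison, the blkfront reconnect pattern referred
to in [1] looks roughly like the following. This is a heavily simplified
sketch, not the verbatim kernel code: the real function also recovers
in-flight requests.]

/* Simplified sketch of a "full reconnect" style .resume: tear down the
 * old rings/grants, then renegotiate with the backend through xenbus. */
static int blkfront_resume(struct xenbus_device *dev)
{
	struct blkfront_info *info = dev_get_drvdata(&dev->dev);
	int err;

	dev_dbg(&dev->dev, "blkfront_resume: %s\n", dev->nodename);

	/* Free rings and grant references held against the old backend. */
	blkif_free(info, info->connected == BLKIF_STATE_CONNECTED);

	/* Re-allocate rings and walk the xenbus handshake again. */
	err = talk_to_blkback(dev, info);

	return err;
}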
Re: [Xen-devel] [PATCH v13] tolerate jitter in cpu_khz calculation to avoid TSC emulation
Am Wed, 13 Mar 2019 03:18:39 -0600
schrieb "Jan Beulich" :

> I'm sorry, but I continue to object to this adjustment getting done
> both by default _and_ not in a per-guest manner. As said before,
> you can't demand guests to run NTP, and hence you can't expect
> them to get along with a few hundred kHz jump in observed TSC
> frequency. Whether the performance drop due to vTSC use is
> better or worse is a policy decision, which we should leave to the
> admin. Hence the feature needs to be off by default, and there
> needs to be at least a host-wide control to enable it; a per-guest
> control would be better. IOW I explicitly do not agree with the
> last sentence of the commit message.

So this seems to be the essential part that prevents moving forward.

Your claim is basically "we do not know how the workload reacts to a
frequency change".
My claim is basically "there is enough evidence that syncing with an
external clock is required if the frequency remotely matters".

I think that conflict can not be easily solved.

One way to solve it would be a knob that injects a value into the
proposed "vtsc_tolerance_khz" variable, leaves the calculation to
the host admin, and leaves the code in tsc_set_info basically as is.

Maybe "xl set-params" can be the way to change the value; that way
it can be changed globally at runtime if needed.

In staging the change would affect HVM and PVH. I never ran PVH,
so I have to assume it behaves like HVM in this regard.

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v4 4/4] x86: remove defunct init/load/save_msr() hvm_funcs
These hvm_funcs are no longer required since no MSR values are saved or restored by implementation-specific code. Signed-off-by: Paul Durrant Reviewed-by: Jan Beulich --- Cc: Andrew Cooper Cc: Wei Liu Cc: "Roger Pau Monné" v2: - Re-instate err check on loop in hvm_load_cpu_msrs() --- xen/arch/x86/hvm/hvm.c| 29 + xen/include/asm-x86/hvm/hvm.h | 4 2 files changed, 9 insertions(+), 24 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index deb7fb2adb..f84bf11f0f 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1310,7 +1310,6 @@ static const uint32_t msrs_to_send[] = { MSR_AMD64_DR2_ADDRESS_MASK, MSR_AMD64_DR3_ADDRESS_MASK, }; -static unsigned int __read_mostly msr_count_max = ARRAY_SIZE(msrs_to_send); static int hvm_save_cpu_msrs(struct vcpu *v, hvm_domain_context_t *h) { @@ -1320,7 +1319,7 @@ static int hvm_save_cpu_msrs(struct vcpu *v, hvm_domain_context_t *h) int err; err = _hvm_init_entry(h, CPU_MSR_CODE, v->vcpu_id, - HVM_CPU_MSR_SIZE(msr_count_max)); + HVM_CPU_MSR_SIZE(ARRAY_SIZE(msrs_to_send))); if ( err ) return err; ctxt = (struct hvm_msr *)&h->data[h->cur]; @@ -1353,10 +1352,7 @@ static int hvm_save_cpu_msrs(struct vcpu *v, hvm_domain_context_t *h) ctxt->msr[ctxt->count++].val = val; } -if ( hvm_funcs.save_msr ) -hvm_funcs.save_msr(v, ctxt); - -ASSERT(ctxt->count <= msr_count_max); +ASSERT(ctxt->count <= ARRAY_SIZE(msrs_to_send)); for ( i = 0; i < ctxt->count; ++i ) ctxt->msr[i]._rsvd = 0; @@ -1431,9 +1427,6 @@ static int hvm_load_cpu_msrs(struct domain *d, hvm_domain_context_t *h) return -EOPNOTSUPP; /* Checking finished */ -if ( hvm_funcs.load_msr ) -err = hvm_funcs.load_msr(v, ctxt); - for ( i = 0; !err && i < ctxt->count; ++i ) { switch ( ctxt->msr[i].index ) @@ -1475,17 +1468,13 @@ static int __init hvm_register_CPU_save_and_restore(void) sizeof(struct hvm_save_descriptor), HVMSR_PER_VCPU); -if ( hvm_funcs.init_msr ) -msr_count_max += hvm_funcs.init_msr(); - -if ( msr_count_max ) -hvm_register_savevm(CPU_MSR_CODE, -"CPU_MSR", -hvm_save_cpu_msrs, -hvm_load_cpu_msrs, -HVM_CPU_MSR_SIZE(msr_count_max) + -sizeof(struct hvm_save_descriptor), -HVMSR_PER_VCPU); +hvm_register_savevm(CPU_MSR_CODE, +"CPU_MSR", +hvm_save_cpu_msrs, +hvm_load_cpu_msrs, +HVM_CPU_MSR_SIZE(ARRAY_SIZE(msrs_to_send)) + +sizeof(struct hvm_save_descriptor), +HVMSR_PER_VCPU); return 0; } diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h index d41ed63232..a70fdbe298 100644 --- a/xen/include/asm-x86/hvm/hvm.h +++ b/xen/include/asm-x86/hvm/hvm.h @@ -115,10 +115,6 @@ struct hvm_function_table { void (*save_cpu_ctxt)(struct vcpu *v, struct hvm_hw_cpu *ctxt); int (*load_cpu_ctxt)(struct vcpu *v, struct hvm_hw_cpu *ctxt); -unsigned int (*init_msr)(void); -void (*save_msr)(struct vcpu *, struct hvm_msr *); -int (*load_msr)(struct vcpu *, struct hvm_msr *); - /* Examine specifics of the guest state. */ unsigned int (*get_interrupt_shadow)(struct vcpu *v); void (*set_interrupt_shadow)(struct vcpu *v, unsigned int intr_shadow); -- 2.20.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v4 3/4] x86: stop handling MSR_IA32_XSS save/restore in implementation code
Saving and restoring the value of this MSR is currently handled by
implementation-specific code despite it being architectural. This patch
moves handling of accesses to this MSR from hvm.c into msr.c, thus
allowing the common MSR save/restore code to handle it.

This patch also adds proper checks of CPUID policy in the new get/set
code.

NOTE: MSR_IA32_XSS is the last MSR to be saved and restored by
      implementation-specific code. This patch therefore removes the
      (VMX) definitions of the init_msr(), save_msr() and load_msr()
      hvm_funcs, as they are no longer necessary. The declarations of
      and calls to those hvm_funcs will be cleaned up by a subsequent
      patch.

Signed-off-by: Paul Durrant
Reviewed-by: Jan Beulich
Reviewed-by: Kevin Tian
---
Cc: Andrew Cooper
Cc: Wei Liu
Cc: "Roger Pau Monné"
Cc: Jun Nakajima
---
 xen/arch/x86/hvm/hvm.c     | 15 ++--
 xen/arch/x86/hvm/vmx/vmx.c | 49 --
 xen/arch/x86/msr.c         | 18 ++
 3 files changed, 20 insertions(+), 62 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index dff590e658..deb7fb2adb 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1304,6 +1304,7 @@ static const uint32_t msrs_to_send[] = {
     MSR_SPEC_CTRL,
     MSR_INTEL_MISC_FEATURES_ENABLES,
     MSR_IA32_BNDCFGS,
+    MSR_IA32_XSS,
     MSR_AMD64_DR0_ADDRESS_MASK,
     MSR_AMD64_DR1_ADDRESS_MASK,
     MSR_AMD64_DR2_ADDRESS_MASK,
@@ -1442,6 +1443,7 @@ static int hvm_load_cpu_msrs(struct domain *d, hvm_domain_context_t *h)
         case MSR_SPEC_CTRL:
         case MSR_INTEL_MISC_FEATURES_ENABLES:
         case MSR_IA32_BNDCFGS:
+        case MSR_IA32_XSS:
         case MSR_AMD64_DR0_ADDRESS_MASK:
         case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
             rc = guest_wrmsr(v, ctxt->msr[i].index, ctxt->msr[i].val);
@@ -3463,12 +3465,6 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
                             MTRRcap_VCNT))];
         break;
 
-    case MSR_IA32_XSS:
-        if ( !d->arch.cpuid->xstate.xsaves )
-            goto gp_fault;
-        *msr_content = v->arch.msrs->xss.raw;
-        break;
-
     case MSR_K8_ENABLE_C1E:
     case MSR_AMD64_NB_CFG:
         /*
@@ -3608,13 +3604,6 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
             goto gp_fault;
         break;
 
-    case MSR_IA32_XSS:
-        /* No XSS features currently supported for guests. */
-        if ( !d->arch.cpuid->xstate.xsaves || msr_content != 0 )
-            goto gp_fault;
-        v->arch.msrs->xss.raw = msr_content;
-        break;
-
     case MSR_AMD64_NB_CFG:
         /* ignore the write */
         break;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 985e5735d2..c46e05b91e 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -797,52 +797,6 @@ static int vmx_load_vmcs_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt)
     return 0;
 }
 
-static unsigned int __init vmx_init_msr(void)
-{
-    return (cpu_has_mpx && cpu_has_vmx_mpx) +
-           (cpu_has_xsaves && cpu_has_vmx_xsaves);
-}
-
-static void vmx_save_msr(struct vcpu *v, struct hvm_msr *ctxt)
-{
-    if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
-    {
-        ctxt->msr[ctxt->count].val = v->arch.msrs->xss.raw;
-        if ( ctxt->msr[ctxt->count].val )
-            ctxt->msr[ctxt->count++].index = MSR_IA32_XSS;
-    }
-}
-
-static int vmx_load_msr(struct vcpu *v, struct hvm_msr *ctxt)
-{
-    unsigned int i;
-    int err = 0;
-
-    vmx_vmcs_enter(v);
-
-    for ( i = 0; i < ctxt->count; ++i )
-    {
-        switch ( ctxt->msr[i].index )
-        {
-        case MSR_IA32_XSS:
-            if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
-                v->arch.msrs->xss.raw = ctxt->msr[i].val;
-            else
-                err = -ENXIO;
-            break;
-        default:
-            continue;
-        }
-        if ( err )
-            break;
-        ctxt->msr[i]._rsvd = 1;
-    }
-
-    vmx_vmcs_exit(v);
-
-    return err;
-}
-
 static void vmx_fpu_enter(struct vcpu *v)
 {
     vcpu_restore_fpu_lazy(v);
@@ -2282,9 +2236,6 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .vcpu_destroy         = vmx_vcpu_destroy,
     .save_cpu_ctxt        = vmx_save_vmcs_ctxt,
     .load_cpu_ctxt        = vmx_load_vmcs_ctxt,
-    .init_msr             = vmx_init_msr,
-    .save_msr             = vmx_save_msr,
-    .load_msr             = vmx_load_msr,
     .get_interrupt_shadow = vmx_get_interrupt_shadow,
     .set_interrupt_shadow = vmx_set_interrupt_shadow,
     .guest_x86_mode       = vmx_guest_x86_mode,
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 0e901d2397..4b5e000224 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -170,6 +170,13 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
         break;
[Xen-devel] [PATCH v4 1/4] x86: stop handling MSR_IA32_BNDCFGS save/restore in implementation code
Saving and restoring the value of this MSR is currently handled by
implementation-specific code despite it being architectural. This patch
moves handling of accesses to this MSR from hvm.c into msr.c, thus
allowing the common MSR save/restore code to handle it.

NOTE: Because vmx_get/set_guest_bndcfgs() call vmx_vmcs_enter(), the
      struct vcpu pointer passed in cannot be const, and hence neither
      can the vcpu pointer passed to guest_rdmsr().

Signed-off-by: Paul Durrant
---
Cc: Kevin Tian
Cc: Jan Beulich
Cc: Andrew Cooper
Cc: Wei Liu
Cc: "Roger Pau Monné"
Cc: Jun Nakajima

v4:
 - Cosmetic re-arrangements and an additional ASSERT requested by Jan

v3:
 - Address further comments from Jan
 - Dropped Kevin's R-b because of change to vmx.c

v2:
 - Addressed comments from Jan by largely removing hunks
 - Keeping Kevin's R-b since remaining hunks in vmx.c are as before
---
 xen/arch/x86/hvm/hvm.c        | 14 ++
 xen/arch/x86/hvm/vmx/vmx.c    | 24
 xen/arch/x86/msr.c            | 20
 xen/include/asm-x86/hvm/hvm.h |  4 ++--
 4 files changed, 28 insertions(+), 34 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 8adbb61b57..e566d83f8b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1303,6 +1303,7 @@ static int hvm_load_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
 static const uint32_t msrs_to_send[] = {
     MSR_SPEC_CTRL,
     MSR_INTEL_MISC_FEATURES_ENABLES,
+    MSR_IA32_BNDCFGS,
     MSR_AMD64_DR0_ADDRESS_MASK,
     MSR_AMD64_DR1_ADDRESS_MASK,
     MSR_AMD64_DR2_ADDRESS_MASK,
@@ -1440,6 +1441,7 @@ static int hvm_load_cpu_msrs(struct domain *d, hvm_domain_context_t *h)
 
         case MSR_SPEC_CTRL:
         case MSR_INTEL_MISC_FEATURES_ENABLES:
+        case MSR_IA32_BNDCFGS:
         case MSR_AMD64_DR0_ADDRESS_MASK:
         case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
             rc = guest_wrmsr(v, ctxt->msr[i].index, ctxt->msr[i].val);
@@ -3467,12 +3469,6 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
         *msr_content = v->arch.hvm.msr_xss;
         break;
 
-    case MSR_IA32_BNDCFGS:
-        if ( !d->arch.cpuid->feat.mpx ||
-             !hvm_get_guest_bndcfgs(v, msr_content) )
-            goto gp_fault;
-        break;
-
     case MSR_K8_ENABLE_C1E:
     case MSR_AMD64_NB_CFG:
         /*
@@ -3619,12 +3615,6 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
         v->arch.hvm.msr_xss = msr_content;
         break;
 
-    case MSR_IA32_BNDCFGS:
-        if ( !d->arch.cpuid->feat.mpx ||
-             !hvm_set_guest_bndcfgs(v, msr_content) )
-            goto gp_fault;
-        break;
-
     case MSR_AMD64_NB_CFG:
         /* ignore the write */
         break;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 725dd88c13..f8481d032a 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -805,17 +805,6 @@ static unsigned int __init vmx_init_msr(void)
 
 static void vmx_save_msr(struct vcpu *v, struct hvm_msr *ctxt)
 {
-    vmx_vmcs_enter(v);
-
-    if ( cpu_has_mpx && cpu_has_vmx_mpx )
-    {
-        __vmread(GUEST_BNDCFGS, &ctxt->msr[ctxt->count].val);
-        if ( ctxt->msr[ctxt->count].val )
-            ctxt->msr[ctxt->count++].index = MSR_IA32_BNDCFGS;
-    }
-
-    vmx_vmcs_exit(v);
-
     if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
     {
         ctxt->msr[ctxt->count].val = v->arch.hvm.msr_xss;
@@ -835,14 +824,6 @@ static int vmx_load_msr(struct vcpu *v, struct hvm_msr *ctxt)
     {
         switch ( ctxt->msr[i].index )
         {
-        case MSR_IA32_BNDCFGS:
-            if ( cpu_has_mpx && cpu_has_vmx_mpx &&
-                 is_canonical_address(ctxt->msr[i].val) &&
-                 !(ctxt->msr[i].val & IA32_BNDCFGS_RESERVED) )
-                __vmwrite(GUEST_BNDCFGS, ctxt->msr[i].val);
-            else if ( ctxt->msr[i].val )
-                err = -ENXIO;
-            break;
         case MSR_IA32_XSS:
             if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
                 v->arch.hvm.msr_xss = ctxt->msr[i].val;
@@ -1215,8 +1196,11 @@ static bool vmx_set_guest_bndcfgs(struct vcpu *v, u64 val)
     return true;
 }
 
-static bool vmx_get_guest_bndcfgs(struct vcpu *v, u64 *val)
+static bool vmx_get_guest_bndcfgs(const struct vcpu *cv, u64 *val)
 {
+    /* Get a non-const pointer for vmx_vmcs_enter() */
+    struct vcpu *v = cv->domain->vcpu[cv->vcpu_id];
+
     ASSERT(cpu_has_mpx && cpu_has_vmx_mpx);
 
     vmx_vmcs_enter(v);
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 4df4a59f4d..0e901d2397 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -160,6 +160,16 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
         ret = guest_rdmsr_x2apic(v, msr, val);
         break;
 
+    case MSR_IA32_BNDCFGS:
+        if ( !cp->feat.mpx )
+            goto gp_fault;
+
+        ASSERT(is_hvm_domain(
[Xen-devel] [PATCH v4 0/4] clean up MSR save/restore code
This series cleans up the MSR save/restore code by moving handling of
the remaining architectural MSRs out of implementation-specific code and
into the common MSR code, after which the per-implementation
init/load/save_msr() hvm_funcs can be removed.

Paul Durrant (4):
  x86: stop handling MSR_IA32_BNDCFGS save/restore in implementation
    code
  x86: move the saved value of MSR_IA32_XSS into struct vcpu_msrs
  x86: stop handling MSR_IA32_XSS save/restore in implementation code
  x86: remove defunct init/load/save_msr() hvm_funcs

 xen/arch/x86/domain.c          |  2 +-
 xen/arch/x86/hvm/hvm.c         | 58 ++-
 xen/arch/x86/hvm/vmx/vmx.c     | 73 ++
 xen/arch/x86/msr.c             | 38 ++
 xen/include/asm-x86/hvm/hvm.h  |  8 +---
 xen/include/asm-x86/hvm/vcpu.h |  1 -
 xen/include/asm-x86/msr.h      |  5 +++
 7 files changed, 63 insertions(+), 122 deletions(-)
---
v2:
 - Drop patches #2 and #6 of the v1 series

Cc: Andrew Cooper
Cc: Jan Beulich
Cc: Jun Nakajima
Cc: Kevin Tian
Cc: "Roger Pau Monné"
Cc: Wei Liu
-- 
2.20.1

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v4 2/4] x86: move the saved value of MSR_IA32_XSS into struct vcpu_msrs
Currently the value is saved directly in struct hvm_vcpu. This patch simply co-locates it with other saved MSR values. No functional change. Signed-off-by: Paul Durrant Reviewed-by: Jan Beulich Reviewed-by: Kevin Tian --- Cc: Andrew Cooper Cc: Wei Liu Cc: "Roger Pau Monné" Cc: Jun Nakajima --- xen/arch/x86/domain.c | 2 +- xen/arch/x86/hvm/hvm.c | 4 ++-- xen/arch/x86/hvm/vmx/vmx.c | 4 ++-- xen/include/asm-x86/hvm/vcpu.h | 1 - xen/include/asm-x86/msr.h | 5 + 5 files changed, 10 insertions(+), 6 deletions(-) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 8d579e2cf9..aa38555736 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -1653,7 +1653,7 @@ static void __context_switch(void) BUG(); if ( cpu_has_xsaves && is_hvm_vcpu(n) ) -set_msr_xss(n->arch.hvm.msr_xss); +set_msr_xss(n->arch.msrs->xss.raw); } vcpu_restore_fpu_nonlazy(n, false); nd->arch.ctxt_switch->to(n); diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index e566d83f8b..dff590e658 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -3466,7 +3466,7 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content) case MSR_IA32_XSS: if ( !d->arch.cpuid->xstate.xsaves ) goto gp_fault; -*msr_content = v->arch.hvm.msr_xss; +*msr_content = v->arch.msrs->xss.raw; break; case MSR_K8_ENABLE_C1E: @@ -3612,7 +3612,7 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content, /* No XSS features currently supported for guests. */ if ( !d->arch.cpuid->xstate.xsaves || msr_content != 0 ) goto gp_fault; -v->arch.hvm.msr_xss = msr_content; +v->arch.msrs->xss.raw = msr_content; break; case MSR_AMD64_NB_CFG: diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index f8481d032a..985e5735d2 100644 --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -807,7 +807,7 @@ static void vmx_save_msr(struct vcpu *v, struct hvm_msr *ctxt) { if ( cpu_has_xsaves && cpu_has_vmx_xsaves ) { -ctxt->msr[ctxt->count].val = v->arch.hvm.msr_xss; +ctxt->msr[ctxt->count].val = v->arch.msrs->xss.raw; if ( ctxt->msr[ctxt->count].val ) ctxt->msr[ctxt->count++].index = MSR_IA32_XSS; } @@ -826,7 +826,7 @@ static int vmx_load_msr(struct vcpu *v, struct hvm_msr *ctxt) { case MSR_IA32_XSS: if ( cpu_has_xsaves && cpu_has_vmx_xsaves ) -v->arch.hvm.msr_xss = ctxt->msr[i].val; +v->arch.msrs->xss.raw = ctxt->msr[i].val; else err = -ENXIO; break; diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h index 6c84d5a5a6..5563d28a4e 100644 --- a/xen/include/asm-x86/hvm/vcpu.h +++ b/xen/include/asm-x86/hvm/vcpu.h @@ -176,7 +176,6 @@ struct hvm_vcpu { struct hvm_vcpu_asid n1asid; u64 msr_tsc_adjust; -u64 msr_xss; union { struct vmx_vcpu vmx; diff --git a/xen/include/asm-x86/msr.h b/xen/include/asm-x86/msr.h index a7244793bf..0d52c085f6 100644 --- a/xen/include/asm-x86/msr.h +++ b/xen/include/asm-x86/msr.h @@ -313,6 +313,11 @@ struct vcpu_msrs * values here may be stale in current context. */ uint32_t dr_mask[4]; + +/* 0x0da0 - MSR_IA32_XSS */ +struct { +uint64_t raw; +} xss; }; void init_guest_msr_policy(void); -- 2.20.1 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/6] xen: Move xenstore initialization to common location
> -Original Message- > From: Jason Andryuk [mailto:jandr...@gmail.com] > Sent: 13 March 2019 18:12 > To: Paul Durrant > Cc: qemu-de...@nongnu.org; xen-devel@lists.xenproject.org; > marma...@invisiblethingslab.com; Stefano > Stabellini ; Anthony Perard > ; Paolo Bonzini > ; Richard Henderson ; Eduardo Habkost > ; > Michael S. Tsirkin ; Marcel Apfelbaum > > Subject: Re: [PATCH 2/6] xen: Move xenstore initialization to common location > > On Wed, Mar 13, 2019 at 11:01 AM Paul Durrant wrote: > > > > > -Original Message- > > > From: Jason Andryuk [mailto:jandr...@gmail.com] > > > Sent: 11 March 2019 18:02 > > > To: qemu-de...@nongnu.org > > > Cc: xen-devel@lists.xenproject.org; marma...@invisiblethingslab.com; > > > Jason Andryuk > > > ; Stefano Stabellini ; > > > Anthony Perard > > > ; Paul Durrant ; > > > Paolo Bonzini > > > ; Richard Henderson ; Eduardo > > > Habkost ; > > > Michael S. Tsirkin ; Marcel Apfelbaum > > > > > > Subject: [PATCH 2/6] xen: Move xenstore initialization to common location > > > > > > For the xen stubdom case, we'll want xenstore initialized, but we'll > > > want to skip the rest of xen_be_init. Move the initialization to > > > xen_hvm_init so we can conditionalize calling xen_be_init. > > > > > > xs_domain_open() is deprecated for xs_open(0), so make the replacement > > > as well. > > > > Can you elaborate as to why you need to do this when the code at the top of > > xen_hvm_init() already > opens xenstore for its own purposes, and AFAICT xenstore_update() is only > needed if QEMU is > implementing a PV backend? > > > > > > Hi, Paul. Thanks for reviewing. > > I think you are right, that this basically shouldn't be needed if PV > backends are disabled. This patch came out of OpenXT, where it is > needed for some out-of-tree patches. But that doesn't make it > suitable for upstreaming. > > However, while reviewing, it looks like the xen accelerator in > hw/xen/xen-common.c:xen_init() registers xen_change_state_handler(). > xen_change_state_handler() uses the global xenstore handle and will > exit(1) if NULL. I see it yes. TBH signalling state via xenstore should go away as it is incompatible with deprivileging, and I think Anthony might have some patches for that? In the meantime I suggest just doing a local xs_open(0) in that function. > I'm not sure how to get the XenIOState xenstore > handle over to the accelerator's xen_init. That would not be appropriate as the machine type may not be xenfv and hence xen_hvm_init() may not have been called. Paul > Outside of that, I think > you are correct that xenstore_update doesn't need to be run when PV > backends are disabled. > > Thanks, > Jason ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
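[Editorial note: a sketch of the local-handle approach suggested above,
on the QEMU side. xs_open()/xs_close() are real libxenstore calls; the
state-recording helper and the exact handler body are illustrative, and
error handling is elided.]

#include <xenstore.h>

/* Sketch: use a short-lived local xenstore connection in the VM state
 * change handler, instead of exiting when the global handle is NULL
 * (e.g. because PV backends are disabled). */
static void xen_change_state_handler(void *opaque, int running,
                                     RunState state)
{
    if (running) {
        struct xs_handle *local_xsh = xs_open(0);

        if (local_xsh) {
            /* Record the device model state; helper name illustrative. */
            xenstore_record_dm_state(local_xsh, "running");
            xs_close(local_xsh);
        }
    }
}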
[Xen-devel] [PATCH] build: don't mandate availability of a fetcher program
It is common that build hosts are isolated from the outside world. They
don't necessarily have wget or ftp installed. Turn the error into a
warning in configure, and point FETCHER at the `false' command if
neither wget nor ftp is available, so that any attempt to download will
result in an error.

Signed-off-by: Wei Liu
---
 m4/fetcher.m4     |  4 +++-
 stubdom/configure | 46 +-
 tools/configure   | 46 +-
 3 files changed, 93 insertions(+), 3 deletions(-)

diff --git a/m4/fetcher.m4 b/m4/fetcher.m4
index 86f33b3937..c1a72c189c 100644
--- a/m4/fetcher.m4
+++ b/m4/fetcher.m4
@@ -1,5 +1,6 @@
 AC_DEFUN([AX_CHECK_FETCHER], [
 AC_PATH_PROG([WGET],[wget], [no])
+AC_PATH_PROG([FALSE],[false], [/bin/false])
 AS_IF([test x"$WGET" != x"no"], [
     FETCHER="$WGET -c -O"
 ], [
@@ -7,7 +8,8 @@ AS_IF([test x"$WGET" != x"no"], [
     AS_IF([test x"$FTP" != x"no"], [
         FETCHER="$FTP -o"
     ], [
-        AC_MSG_ERROR([cannot find wget or ftp])
+        FETCHER="$FALSE"
+        AC_MSG_WARN([cannot find wget or ftp])
     ])
 ])
 AC_SUBST(FETCHER)
diff --git a/stubdom/configure b/stubdom/configure
index df3f763a7b..beeb8db2e1 100755
--- a/stubdom/configure
+++ b/stubdom/configure
@@ -625,6 +625,7 @@ CFLAGS
 CC
 FETCHER
 FTP
+FALSE
 WGET
 CMAKE
 extfiles
@@ -2362,6 +2363,47 @@ $as_echo "no" >&6; }
 fi
 
 
+# Extract the first word of "false", so it can be a program name with args.
+set dummy false; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_FALSE+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $FALSE in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_FALSE="$FALSE" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_FALSE="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  test -z "$ac_cv_path_FALSE" && ac_cv_path_FALSE="/bin/false"
+  ;;
+esac
+fi
+FALSE=$ac_cv_path_FALSE
+if test -n "$FALSE"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $FALSE" >&5
+$as_echo "$FALSE" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
 if test x"$WGET" != x"no"; then :
 
     FETCHER="$WGET -c -O"
@@ -2415,7 +2457,9 @@ fi
 
 else
 
-    as_fn_error $? "cannot find wget or ftp" "$LINENO" 5
+    FETCHER="$FALSE"
+    { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: cannot find wget or ftp" >&5
+$as_echo "$as_me: WARNING: cannot find wget or ftp" >&2;}
 
 fi
diff --git a/tools/configure b/tools/configure
index b66d3f6fba..92ead93335 100755
--- a/tools/configure
+++ b/tools/configure
@@ -644,6 +644,7 @@ system_aio
 zlib
 FETCHER
 FTP
+FALSE
 WGET
 pixman_LIBS
 pixman_CFLAGS
@@ -8219,6 +8220,47 @@ $as_echo "no" >&6; }
 fi
 
 
+# Extract the first word of "false", so it can be a program name with args.
+set dummy false; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_FALSE+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $FALSE in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_FALSE="$FALSE" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+    ac_cv_path_FALSE="$as_dir/$ac_word$ac_exec_ext"
+    $as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+    break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  test -z "$ac_cv_path_FALSE" && ac_cv_path_FALSE="/bin/false"
+  ;;
+esac
+fi
+FALSE=$ac_cv_path_FALSE
+if test -n "$FALSE"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $FALSE" >&5
+$as_echo "$FALSE" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
 if test x"$WGET" != x"no"; then :
 
     FETCHER="$WGET -c -O"
@@ -8272,7 +8314,9 @@ fi
 
 else
 
-    as_fn_error $? "cannot find wget or ftp" "$LINENO" 5
+    FETCHER="$FALSE"
+    { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: cannot find wget or ftp" >&5
+$as_echo "$as_me: WARNING: cannot find wget or ftp" >&2;}
 
 fi
-- 
2.11.0

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v13] tolerate jitter in cpu_khz calculation to avoid TSC emulation
On 14/03/2019 14:50, Olaf Hering wrote:
> Am Wed, 13 Mar 2019 03:18:39 -0600
> schrieb "Jan Beulich" :
>
>> I'm sorry, but I continue to object to this adjustment getting done
>> both by default _and_ not in a per-guest manner. As said before,
>> you can't demand guests to run NTP, and hence you can't expect
>> them to get along with a few hundred kHz jump in observed TSC
>> frequency. Whether the performance drop due to vTSC use is
>> better or worse is a policy decision, which we should leave to the
>> admin. Hence the feature needs to be off by default, and there
>> needs to be at least a host-wide control to enable it; a per-guest
>> control would be better. IOW I explicitly do not agree with the
>> last sentence of the commit message.
>
> So this seems to be the essential part that prevents moving forward.
>
> Your claim is basically "we do not know how the workload reacts to a
> frequency change".
> My claim is basically "there is enough evidence that syncing with an
> external clock is required if the frequency remotely matters".
>
> I think that conflict can not be easily solved.
>
> One way to solve it would be a knob that injects a value into the
> proposed "vtsc_tolerance_khz" variable, leaves the calculation to
> the host admin, and leaves the code in tsc_set_info basically as is.
>
> Maybe "xl set-params" can be the way to change the value; that way
> it can be changed globally at runtime if needed.
>
> In staging the change would affect HVM and PVH. I never ran PVH,
> so I have to assume it behaves like HVM in this regard.

I think Andrew (or Ian?) once suggested handling this whole mess in the
migration stream instead of the hypervisor.

So why don't you:

- add a domain config item for specifying the allowed jitter
- add that value to the migration struct xc_sr_rec_tsc_info (there is
  a reserved field available)
- and then modify handle_tsc_info() in tools/libxc/xc_sr_common_x86.c
  to check whether the host frequency is within the acceptable range
  and, if this is the case, put the host frequency into the gtsc_khz
  parameter of the xc_domain_set_tsc_info() call (see the sketch below)

This would be the least intrusive change allowing maximum flexibility
IMO.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
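[Editorial note: a rough sketch of what that libxc-side check might look
like. The vtsc_tolerance_khz plumbing is a hypothetical addition, and
the stream record field names are per my reading of the stream format;
xc_physinfo() and its cpu_khz field provide the host frequency.]

/* Sketch only: fold the jitter check into restore-side TSC handling. */
static int handle_tsc_info(struct xc_sr_context *ctx, struct xc_sr_record *rec)
{
    xc_interface *xch = ctx->xch;
    struct xc_sr_rec_tsc_info *tsc = rec->data;
    xc_physinfo_t physinfo;
    uint32_t khz = tsc->khz;

    if ( rec->length != sizeof(*tsc) )
        return -1;

    /* If the host frequency is within the configured tolerance, run at
     * the host frequency instead of emulating the guest's old rate. */
    if ( !xc_physinfo(xch, &physinfo) &&
         abs((int)physinfo.cpu_khz - (int)khz) <= ctx->vtsc_tolerance_khz )
        khz = physinfo.cpu_khz;

    return xc_domain_set_tsc_info(xch, ctx->domid, tsc->mode,
                                  tsc->nsec, khz, tsc->incarnation);
}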
Re: [Xen-devel] [PATCH v6 09/11] viridian: add implementation of synthetic interrupt MSRs
>>> On 14.03.19 at 12:25, wrote: > @@ -131,13 +238,69 @@ int viridian_synic_rdmsr(const struct vcpu *v, uint32_t > idx, uint64_t *val) > *val = ((uint64_t)icr2 << 32) | icr; > break; > } > + > case HV_X64_MSR_TPR: > *val = vlapic_get_reg(vcpu_vlapic(v), APIC_TASKPRI); > break; > > case HV_X64_MSR_VP_ASSIST_PAGE: > -*val = v->arch.hvm.viridian->vp_assist.msr.raw; > +*val = vv->vp_assist.msr.raw; > +break; > + > +case HV_X64_MSR_SCONTROL: > +if ( !(viridian_feature_mask(d) & HVMPV_synic) ) > +return X86EMUL_EXCEPTION; > + > +*val = vv->scontrol; > +break; > + > +case HV_X64_MSR_SVERSION: > +if ( !(viridian_feature_mask(d) & HVMPV_synic) ) > +return X86EMUL_EXCEPTION; > + > +/* > + * The specification says that the version number is 0x0001 > + * and should be in the lower 32-bits of the MSR, while the > + * upper 32-bits are reserved... but it doesn't say what they > + * should be set to. Assume everything but the bottom bit > + * should be zero. > + */ > +*val = 1ul; > +break; > + > +case HV_X64_MSR_SIEFP: > +if ( !(viridian_feature_mask(d) & HVMPV_synic) ) > +return X86EMUL_EXCEPTION; > + > +*val = vv->siefp; > +break; > + > +case HV_X64_MSR_SIMP: > +if ( !(viridian_feature_mask(d) & HVMPV_synic) ) > +return X86EMUL_EXCEPTION; > + > +*val = vv->simp.msr.raw; > +break; > + > +case HV_X64_MSR_EOM: > +if ( !(viridian_feature_mask(d) & HVMPV_synic) ) > +return X86EMUL_EXCEPTION; > + > +*val = 0; > +break; > + > +case HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15: > +{ > +unsigned int sintx = idx - HV_X64_MSR_SINT0; > +const union viridian_sint_msr *vs = > +&array_access_nospec(vv->sint, sintx); > + > +if ( !(viridian_feature_mask(d) & HVMPV_synic) ) > +return X86EMUL_EXCEPTION; > + > +*val = vs->raw; > break; Without this necessarily being a request to change, I still don't understand why you don't omit vs as a variable and simply do *val = array_access_nospec(vv->sint, sintx).raw; > @@ -149,6 +312,20 @@ int viridian_synic_rdmsr(const struct vcpu *v, uint32_t > idx, uint64_t *val) > > int viridian_synic_vcpu_init(const struct vcpu *v) > { FTR while I'm in favor of adding const wherever it is possible and makes sense, I consider it quite odd for an init function to take a pointer to const. Perhaps the deinit one would also fall into that category. > @@ -1328,9 +1340,13 @@ int vlapic_has_pending_irq(struct vcpu *v) > (irr & 0xf0) <= (isr & 0xf0) ) > { > viridian_apic_assist_clear(v); > -return -1; > +irr = -1; > } > > +out: The label still lacks proper indentation. With at least this fixed (which is fine to be done while committing if this is the only piece to change) Reviewed-by: Jan Beulich Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
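[Editorial note: concretely, the SINT read case Jan refers to could then
shrink to something like the following sketch of the suggested
simplification.]

    case HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15:
    {
        unsigned int sintx = idx - HV_X64_MSR_SINT0;

        if ( !(viridian_feature_mask(d) & HVMPV_synic) )
            return X86EMUL_EXCEPTION;

        /* No intermediate pointer variable needed. */
        *val = array_access_nospec(vv->sint, sintx).raw;
        break;
    }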
Re: [Xen-devel] xen: Can't insert balloon page into VM userspace (WAS Re: [linux-linus bisection] complete test-arm64-arm64-xl-xsm)
Hi,

On 3/14/19 8:37 AM, Juergen Gross wrote:
> On 12/03/2019 20:46, David Hildenbrand wrote:
>> On 12.03.19 19:23, David Hildenbrand wrote:
>>
>> I guess something like this could do the trick if I understood it
>> correctly:
>>
>> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
>> index 39b229f9e256..d37dd5bb7a8f 100644
>> --- a/drivers/xen/balloon.c
>> +++ b/drivers/xen/balloon.c
>> @@ -604,6 +604,7 @@ int alloc_xenballooned_pages(int nr_pages, struct
>> page **pages)
>>      while (pgno < nr_pages) {
>>          page = balloon_retrieve(true);
>>          if (page) {
>> +            __ClearPageOffline(page);
>>              pages[pgno++] = page;
>>  #ifdef CONFIG_XEN_HAVE_PVMMU
>>              /*
>> @@ -645,8 +646,10 @@ void free_xenballooned_pages(int nr_pages, struct
>> page **pages)
>>      mutex_lock(&balloon_mutex);
>>
>>      for (i = 0; i < nr_pages; i++) {
>> -        if (pages[i])
>> +        if (pages[i]) {
>> +            __SetPageOffline(pages[i]);
>>              balloon_append(pages[i]);
>> +        }
>>      }
>>
>>      balloon_stats.target_unpopulated -= nr_pages;
>>
>>
>> At least this way, the pages allocated (and thus eventually mapped to
>> user space) would not be marked, but the other ones would remain marked
>> and could be excluded by makedumptool.
>
> I think this patch should do the trick. Julien, could you give it a
> try? On x86 I can't reproduce your problem easily as dom0 is PV with
> plenty of unpopulated pages for grant memory not suffering from
> missing "offline" bit.

Sure. I managed to get the console working with the patch suggested by
David. Feel free to add my tested-by when you resend it as is.

Cheers,

--
Julien Grall

___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
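The invariant this patch establishes can be summarised in two illustrative helpers; the function names below are made up, while balloon_retrieve()/balloon_append() and the PageOffline accessors mirror drivers/xen/balloon.c:

    /* Page leaves the balloon: it may be mapped into user space, so it
     * must not carry the PG_offline marking any longer. */
    static struct page *balloon_page_for_use(void)
    {
        struct page *page = balloon_retrieve(true);

        if (page)
            __ClearPageOffline(page);
        return page;
    }

    /* Page returns to the balloon: mark it offline again so dump tools
     * can keep excluding ballooned memory. */
    static void balloon_page_done(struct page *page)
    {
        __SetPageOffline(page);
        balloon_append(page);
    }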
Re: [Xen-devel] [PATCH v13] tolerate jitter in cpu_khz calculation to avoid TSC emulation
On Thu, 14 Mar 2019 15:10:16 +0100, Juergen Gross wrote:

> This would be the least intrusive change allowing maximum flexibility
> IMO.

I think earlier attempts did alter the tooling. But that would not help
with existing domUs started on hosts that do not have that property. A
global knob would help with that.

Olaf

___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 2/2] xen/debug: make debugtrace more clever regarding repeating entries
>>> On 14.03.19 at 14:40, wrote:
> On 14/03/2019 13:38, Juergen Gross wrote:
>> On 14/03/2019 14:33, Jan Beulich wrote:
>> On 14.03.19 at 10:37, wrote:
>>>> --- a/xen/drivers/char/console.c
>>>> +++ b/xen/drivers/char/console.c
>>>> @@ -1225,13 +1225,28 @@ void debugtrace_dump(void)
>>>>      watchdog_enable();
>>>>  }
>>>>
>>>> +static void debugtrace_add_to_buf(char *buf)
>>>> +{
>>>> +    char *p;
>>>> +
>>>> +    for ( p = buf; *p != '\0'; p++ )
>>>> +    {
>>>> +        debugtrace_buf[debugtrace_prd++] = *p;
>>>> +        /* Always leave a nul byte at the end of the buffer. */
>>>> +        if ( debugtrace_prd == (debugtrace_bytes - 1) )
>>>> +            debugtrace_prd = 0;
>>>> +    }
>>>> +}
>>>> +
>>>>  void debugtrace_printk(const char *fmt, ...)
>>>>  {
>>>> -    static char    buf[1024];
>>>> -    static u32     count;
>>>> +    static char buf[1024];
>>>> +    static char last_buf[1024];
>>>> +    static u32 count, last_count;
>>> Please change to uint32_t or even better simply to unsigned int.
>> Okay.
>>>> @@ -1243,25 +1258,32 @@ void debugtrace_printk(const char *fmt, ...)
>>>>      ASSERT(debugtrace_buf[debugtrace_bytes - 1] == 0);
>>>>
>>>> -    snprintf(buf, sizeof(buf), "%u ", ++count);
>>>> -
>>>>      va_start(args, fmt);
>>>> -    (void)vsnprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), fmt, args);
>>>> +    (void)vsnprintf(buf, sizeof(buf), fmt, args);
>>> Please take the opportunity and drop the stray cast.
>> Will do.
>
> Both can be done on commit, surely?

Perhaps, albeit iirc the first would amount to more than just s/u32/.../
on the line in question.

Jan

___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
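A self-contained miniature of the repeat-compression idea under review, with both comments applied (plain unsigned int, no cast on the vsnprintf() return value); the buffer handling here is simplified and hypothetical, not the actual Xen patch:

    #include <stdarg.h>
    #include <stdio.h>
    #include <string.h>

    static char last_buf[1024];
    static unsigned int count, last_count;

    static void debugtrace_printk(const char *fmt, ...)
    {
        char buf[1024];
        va_list args;

        va_start(args, fmt);
        vsnprintf(buf, sizeof(buf), fmt, args);
        va_end(args);

        /* Identical to the previous entry: just bump the repeat count. */
        if ( !strcmp(buf, last_buf) )
        {
            last_count++;
            return;
        }

        if ( last_count > 1 )
            printf("(last message repeated %u more time(s))\n",
                   last_count - 1);

        strcpy(last_buf, buf);
        last_count = 1;
        printf("%u %s", ++count, buf);  /* fmt is expected to end in \n */
    }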
Re: [Xen-devel] xen: Can't insert balloon page into VM userspace (WAS Re: [linux-linus bisection] complete test-arm64-arm64-xl-xsm)
On 14.03.19 15:12, Julien Grall wrote:
> Hi,
>
> On 3/14/19 8:37 AM, Juergen Gross wrote:
>> On 12/03/2019 20:46, David Hildenbrand wrote:
>>> On 12.03.19 19:23, David Hildenbrand wrote:
>>>
>>> I guess something like this could do the trick if I understood it
>>> correctly:
>>>
>>> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
>>> index 39b229f9e256..d37dd5bb7a8f 100644
>>> --- a/drivers/xen/balloon.c
>>> +++ b/drivers/xen/balloon.c
>>> @@ -604,6 +604,7 @@ int alloc_xenballooned_pages(int nr_pages, struct
>>> page **pages)
>>>      while (pgno < nr_pages) {
>>>          page = balloon_retrieve(true);
>>>          if (page) {
>>> +            __ClearPageOffline(page);
>>>              pages[pgno++] = page;
>>>  #ifdef CONFIG_XEN_HAVE_PVMMU
>>>              /*
>>> @@ -645,8 +646,10 @@ void free_xenballooned_pages(int nr_pages, struct
>>> page **pages)
>>>      mutex_lock(&balloon_mutex);
>>>
>>>      for (i = 0; i < nr_pages; i++) {
>>> -        if (pages[i])
>>> +        if (pages[i]) {
>>> +            __SetPageOffline(pages[i]);
>>>              balloon_append(pages[i]);
>>> +        }
>>>      }
>>>
>>>      balloon_stats.target_unpopulated -= nr_pages;
>>>
>>>
>>> At least this way, the pages allocated (and thus eventually mapped to
>>> user space) would not be marked, but the other ones would remain marked
>>> and could be excluded by makedumptool.
>>
>> I think this patch should do the trick. Julien, could you give it a
>> try? On x86 I can't reproduce your problem easily as dom0 is PV with
>> plenty of unpopulated pages for grant memory not suffering from
>> missing "offline" bit.
>
> Sure. I managed to get the console working with the patch suggested by
> David. Feel free to add my tested-by when you resend it as is.
>

Thanks, I will send a proper patch later!

Cheers!

> Cheers,
>

--
Thanks,

David / dhildenb

___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] xen: Can't insert balloon page into VM userspace (WAS Re: [linux-linus bisection] complete test-arm64-arm64-xl-xsm)
On 14/03/2019 15:12, Julien Grall wrote:
> Hi,
>
> On 3/14/19 8:37 AM, Juergen Gross wrote:
>> On 12/03/2019 20:46, David Hildenbrand wrote:
>>> On 12.03.19 19:23, David Hildenbrand wrote:
>>>
>>> I guess something like this could do the trick if I understood it
>>> correctly:
>>>
>>> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
>>> index 39b229f9e256..d37dd5bb7a8f 100644
>>> --- a/drivers/xen/balloon.c
>>> +++ b/drivers/xen/balloon.c
>>> @@ -604,6 +604,7 @@ int alloc_xenballooned_pages(int nr_pages, struct
>>> page **pages)
>>>      while (pgno < nr_pages) {
>>>          page = balloon_retrieve(true);
>>>          if (page) {
>>> +            __ClearPageOffline(page);
>>>              pages[pgno++] = page;
>>>  #ifdef CONFIG_XEN_HAVE_PVMMU
>>>              /*
>>> @@ -645,8 +646,10 @@ void free_xenballooned_pages(int nr_pages, struct
>>> page **pages)
>>>      mutex_lock(&balloon_mutex);
>>>
>>>      for (i = 0; i < nr_pages; i++) {
>>> -        if (pages[i])
>>> +        if (pages[i]) {
>>> +            __SetPageOffline(pages[i]);
>>>              balloon_append(pages[i]);
>>> +        }
>>>      }
>>>
>>>      balloon_stats.target_unpopulated -= nr_pages;
>>>
>>>
>>> At least this way, the pages allocated (and thus eventually mapped to
>>> user space) would not be marked, but the other ones would remain marked
>>> and could be excluded by makedumptool.
>>>
>>
>> I think this patch should do the trick. Julien, could you give it a
>> try? On x86 I can't reproduce your problem easily as dom0 is PV with
>> plenty of unpopulated pages for grant memory not suffering from
>> missing "offline" bit.
>
> Sure. I managed to get the console working with the patch suggested by
> David. Feel free to add my tested-by when you resend it as is.

David, could you please send a proper patch with your Sob?


Juergen

___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] xen: Can't insert balloon page into VM userspace (WAS Re: [linux-linus bisection] complete test-arm64-arm64-xl-xsm)
On 14.03.19 15:15, Juergen Gross wrote:
> On 14/03/2019 15:12, Julien Grall wrote:
>> Hi,
>>
>> On 3/14/19 8:37 AM, Juergen Gross wrote:
>>> On 12/03/2019 20:46, David Hildenbrand wrote:
>>>> On 12.03.19 19:23, David Hildenbrand wrote:
>>>>
>>>> I guess something like this could do the trick if I understood it
>>>> correctly:
>>>>
>>>> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
>>>> index 39b229f9e256..d37dd5bb7a8f 100644
>>>> --- a/drivers/xen/balloon.c
>>>> +++ b/drivers/xen/balloon.c
>>>> @@ -604,6 +604,7 @@ int alloc_xenballooned_pages(int nr_pages, struct
>>>> page **pages)
>>>>      while (pgno < nr_pages) {
>>>>          page = balloon_retrieve(true);
>>>>          if (page) {
>>>> +            __ClearPageOffline(page);
>>>>              pages[pgno++] = page;
>>>>  #ifdef CONFIG_XEN_HAVE_PVMMU
>>>>              /*
>>>> @@ -645,8 +646,10 @@ void free_xenballooned_pages(int nr_pages, struct
>>>> page **pages)
>>>>      mutex_lock(&balloon_mutex);
>>>>
>>>>      for (i = 0; i < nr_pages; i++) {
>>>> -        if (pages[i])
>>>> +        if (pages[i]) {
>>>> +            __SetPageOffline(pages[i]);
>>>>              balloon_append(pages[i]);
>>>> +        }
>>>>      }
>>>>
>>>>      balloon_stats.target_unpopulated -= nr_pages;
>>>>
>>>>
>>>> At least this way, the pages allocated (and thus eventually mapped to
>>>> user space) would not be marked, but the other ones would remain marked
>>>> and could be excluded by makedumptool.
>>>>
>>>
>>> I think this patch should do the trick. Julien, could you give it a
>>> try? On x86 I can't reproduce your problem easily as dom0 is PV with
>>> plenty of unpopulated pages for grant memory not suffering from
>>> missing "offline" bit.
>>
>> Sure. I managed to get the console working with the patch suggested by
>> David. Feel free to add my tested-by when you resend it as is.
>
> David, could you please send a proper patch with your Sob?
>

Yes, on it :)

Cheers!

> Juergen
>

--
Thanks,

David / dhildenb

___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v13] tolerate jitter in cpu_khz calculation to avoid TSC emulation
>>> On 14.03.19 at 14:50, wrote:
> In staging the change would affect HVM and PVH. I never ran PVH,
> I have to assume it behaves like HVM in this regard.

And again you make it look as if PV wasn't also going through that same
path. Yet all that's there at the top of the function is

    if ( is_pv_domain(d) && is_hardware_domain(d) )
    {
        d->arch.vtsc = 0;
        return 0;
    }

Jan

___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
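For illustration, the kind of check being argued about would sit on this shared path and therefore apply to PV guests as well. A hypothetical helper (not actual Xen code), with vtsc_tolerance_khz settable through a global knob:

    #include <stdbool.h>
    #include <stdint.h>

    static uint32_t vtsc_tolerance_khz;  /* 0 = today's strict behaviour */

    static bool tsc_needs_emulation(uint32_t gtsc_khz, uint32_t host_khz)
    {
        int64_t delta = (int64_t)gtsc_khz - host_khz;

        if ( delta < 0 )
            delta = -delta;
        /* Emulate only if the observed jump would exceed the tolerance. */
        return delta > vtsc_tolerance_khz;
    }

Whether such a control should be off by default, and at host or per-guest granularity, is exactly the policy question raised above.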