Re: [Xen-devel] [PATCH v6 04/12] microcode: introduce a global cache of ucode patch

2019-03-14 Thread Jan Beulich
>>> On 13.03.19 at 18:04,  wrote:
> On 13/03/2019 10:30, Andrew Cooper wrote:
>> On 13/03/2019 07:39, Jan Beulich wrote:
>> On 12.03.19 at 17:53,  wrote:
 IIRC we agreed that systems with mixed CPU versions are not supported,
 hence the same microcode blob should be used to update all the
 possible CPUs on the system, so a list should not be needed here.
>>> That's not what I recall. We agreed to not store everything, because
>>> mixed-family / mixed-model systems are commonly not supported by
>>> the hw vendors. But mixed stepping systems may well be.
>> The difference between Skylake and CascadeLake server CPUs is only in
>> the stepping (4 vs 6).  Same for Kaby/Coffee/Whiskey/Amber lake (the
>> mobile and desktop lines have a model each, and newer generations are
>> just new steppings).
>>
>> I'll have to defer to Intel as to exactly what is supported, but when I
>> asked this before, the answer was that no heterogeneity was supported at
>> all, not even at the stepping level or platform ID level.
> 
> So, as it turns out, Xen in practice only supports a single file's worth
> of microcode from /lib/firmware.
> 
> This is because:
> 1) That is the only behaviour dracut has, and
> 2) dracut has been the default initrd-generator in Linux distros for far
> longer than Xen has had boot time microcode loading support, and

I disagree. Pulling the microcode blob out of the initrd is only a
secondary option, as far as Xen's history is concerned. My
original early-load implementation required specifying a separate
module in the grub configuration; ucode=scan support was added
later by Konrad. Yet the ucode= logic also allows the
full microcode.bin blob to be loaded. (And as to the timeline, I'm
not sure SLE in particular was switched to dracut before Xen
gained early loading support.)

That said - I agree with you that the overwhelming amount of
testing likely covers the case you mention, and nothing else. So
if for all practical purposes supporting just a single family:model:
stepping tuple is enough - fine with me. Given my other reply
regarding how to deal with load failure, though, I'm not sure this
would mean we can go back to caching just a single blob.
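For illustration, the trade-off being discussed can be sketched as a tiny per-signature cache, where a "single blob" policy would simply keep at most one entry. All names here, including the packed family:model:stepping key, are hypothetical and not Xen's actual microcode code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical ucode cache keyed by a packed family:model:stepping
 * signature, keeping only the newest revision per signature. */
struct ucode_patch {
    uint32_t sig;              /* packed family:model:stepping */
    uint32_t rev;              /* microcode revision */
    struct ucode_patch *next;
    /* payload would follow in a real implementation */
};

static struct ucode_patch *cache;

/* Insert or update the entry for one signature. */
static void cache_update(uint32_t sig, uint32_t rev)
{
    struct ucode_patch *p;

    for ( p = cache; p; p = p->next )
        if ( p->sig == sig )
        {
            if ( rev > p->rev )
                p->rev = rev;
            return;
        }

    p = malloc(sizeof(*p));
    if ( !p )
        return;
    p->sig = sig;
    p->rev = rev;
    p->next = cache;
    cache = p;
}

static const struct ucode_patch *cache_find(uint32_t sig)
{
    const struct ucode_patch *p;

    for ( p = cache; p; p = p->next )
        if ( p->sig == sig )
            return p;
    return NULL;
}
```

With such a list, supporting mixed steppings costs one list node per signature; restricting the system to a single tuple would reduce the list to its head entry.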

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 09/11] viridian: add implementation of synthetic interrupt MSRs

2019-03-14 Thread Jan Beulich
>>> On 13.03.19 at 17:13,  wrote:
>>  -Original Message-
>> From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On Behalf Of 
> Jan Beulich
>> Sent: 13 March 2019 13:15
>> To: Paul Durrant 
>> Cc: Stefano Stabellini ; Wei Liu 
>> ; 
> Konrad Rzeszutek Wilk
>> ; George Dunlap ; Andrew 
> Cooper
>> ; Ian Jackson ; Tim 
>> (Xen.org) 
> ; Julien
>> Grall ; xen-devel ; 
>> Roger 
> Pau Monne
>> 
>> Subject: Re: [Xen-devel] [PATCH v5 09/11] viridian: add implementation of 
> synthetic interrupt MSRs
>> 
>> >>> On 11.03.19 at 14:41,  wrote:
>> > @@ -28,6 +29,32 @@ typedef union _HV_VP_ASSIST_PAGE
>> >  uint8_t ReservedZBytePadding[PAGE_SIZE];
>> >  } HV_VP_ASSIST_PAGE;
>> >
>> > +typedef enum HV_MESSAGE_TYPE {
>> > +HvMessageTypeNone,
>> > +HvMessageTimerExpired = 0x8010,
>> > +} HV_MESSAGE_TYPE;
>> > +
>> > +typedef struct HV_MESSAGE_FLAGS {
>> > +uint8_t MessagePending:1;
>> > +uint8_t Reserved:7;
>> > +} HV_MESSAGE_FLAGS;
>> > +
>> > +typedef struct HV_MESSAGE_HEADER {
>> > +HV_MESSAGE_TYPE MessageType;
>> > +uint16_t Reserved1;
>> > +HV_MESSAGE_FLAGS MessageFlags;
>> > +uint8_t PayloadSize;
>> > +uint64_t Reserved2;
>> > +} HV_MESSAGE_HEADER;
>> > +
>> > +#define HV_MESSAGE_SIZE 256
>> > +#define HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT 30
>> 
> 
> Missed this one before...
> 
>> Is this defined this way, or (given ...
>> 
>> > +typedef struct HV_MESSAGE {
>> > +HV_MESSAGE_HEADER Header;
>> > +uint64_t Payload[HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT];
>> > +} HV_MESSAGE;
>> 
>> ... this) isn't it rather
>> 
>> #define HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT \
>> ((HV_MESSAGE_SIZE - sizeof(HV_MESSAGE_HEADER)) / 8)
>> 
> 
> I need the definition for the array in the struct so that sizeof(HV_MESSAGE) 
> == HV_MESSAGE_SIZE (for which there is a BUILD_BUG_ON()) later.

I don't understand this part - I'm not asking to ditch the #define.
As to the BUILD_BUG_ON() - I see now, but that's only in patch
10, and in a specific message handler. I think this would belong
here, and in the main viridian.c file.

> It's also written that way in the spec. so I'd rather leave it as-is.

Well, okay then - I was sort of expecting this to be spelled out
there in such a way. That doesn't change my overall opinion,
but I can see your point of wanting to match the spec, and you
being the maintainer I have no basis to insist anyway.
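The relationship under discussion can be checked at build time. Below is a hedged sketch mirroring the quoted structures, using the derived form of the payload-count #define (with corrected parenthesisation) and C11 _Static_assert standing in for BUILD_BUG_ON(); the 0x80000010 enumerator value is taken from the Hyper-V TLFS and is an assumption here, as is the exact header layout:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the quoted Hyper-V message structures (not Xen's file). */
typedef enum HV_MESSAGE_TYPE {
    HvMessageTypeNone,
    HvMessageTimerExpired = 0x80000010, /* value per the TLFS (assumed) */
} HV_MESSAGE_TYPE;

typedef struct HV_MESSAGE_FLAGS {
    uint8_t MessagePending:1;
    uint8_t Reserved:7;
} HV_MESSAGE_FLAGS;

typedef struct HV_MESSAGE_HEADER {
    HV_MESSAGE_TYPE MessageType;
    uint16_t Reserved1;
    HV_MESSAGE_FLAGS MessageFlags;
    uint8_t PayloadSize;
    uint64_t Reserved2;
} HV_MESSAGE_HEADER;

#define HV_MESSAGE_SIZE 256
/* Derived form; equals the spec constant 30 for a 16-byte header. */
#define HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT \
    ((HV_MESSAGE_SIZE - sizeof(HV_MESSAGE_HEADER)) / 8)

typedef struct HV_MESSAGE {
    HV_MESSAGE_HEADER Header;
    uint64_t Payload[HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT];
} HV_MESSAGE;

/* The BUILD_BUG_ON() equivalents, co-located with the definitions. */
_Static_assert(sizeof(HV_MESSAGE_HEADER) == 16, "header size");
_Static_assert(HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT == 30, "payload count");
_Static_assert(sizeof(HV_MESSAGE) == HV_MESSAGE_SIZE, "message size");
```

Either spelling of the count yields the same layout; the derived form merely makes the size invariant self-enforcing.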

Jan




Re: [Xen-devel] [PATCH v2 1/4] x86: stop handling MSR_IA32_BNDCFGS save/restore in implementation code

2019-03-14 Thread Jan Beulich
>>> On 13.03.19 at 17:05,  wrote:
>> From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On Behalf Of 
>> Jan Beulich
>> Sent: 13 March 2019 15:55
>> 
>> >>> On 11.03.19 at 19:09,  wrote:
>> > --- a/xen/include/asm-x86/msr.h
>> > +++ b/xen/include/asm-x86/msr.h
>> > @@ -328,7 +328,7 @@ int init_vcpu_msr_policy(struct vcpu *v);
>> >   * These functions are also used by the migration logic, so need to cope 
>> > with
>> >   * being used outside of v's context.
>> >   */
>> > -int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val);
>> > +int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val);
>> 
>> I find this pretty undesirable, and I'd like to at least put out
>> for discussion a means how to avoid it: Any entity being
>> passed a const struct vcpu *cv can get hold of a non-const
>> one by doing
>> 
>> struct vcpu *v = cv->domain->vcpu[cv->vcpu_id];
>> 
> 
> Looks kind of odd, but of course it will certainly work.
> 
>> Of course this shouldn't be used arbitrarily, but to hide an
>> implementation detail like that of vmx_vmcs_enter() I think
>> this could be justified. Thoughts?
>> 
> 
> I guess the question is at what level to do this? Probably in the hvm code 
> rather than in the vmx code.

I think it should be in VMX code, as that's where the
vmx_vmcs_enter() oddity lives.
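The deconstification trick mentioned earlier in the thread can be sketched with minimal stand-in types (these are not Xen's real struct definitions): any holder of a const struct vcpu * can recover a mutable pointer through the owning domain's vcpu array.

```c
#include <assert.h>

struct domain;

/* Minimal stand-ins for the real Xen structures. */
struct vcpu {
    struct domain *domain;
    unsigned int vcpu_id;
};

struct domain {
    struct vcpu *vcpu[2];
};

/* The pattern Jan describes: d->vcpu[] holds non-const pointers, so a
 * const vcpu pointer can be "laundered" through its own domain. */
static struct vcpu *vcpu_deconst(const struct vcpu *cv)
{
    return cv->domain->vcpu[cv->vcpu_id];
}
```

Hiding this inside the VMX code would keep guest_rdmsr()'s const-correct signature while still allowing vmx_vmcs_enter() its mutable pointer.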

Jan




Re: [Xen-devel] [PATCH for-next 7/9] xen: Convert is_xen_heap_mfn to use typesafe MFN

2019-03-14 Thread Jan Beulich
>>> On 13.03.19 at 18:24,  wrote:
> On 13/03/2019 15:04, Jan Beulich wrote:
> On 18.02.19 at 12:35,  wrote:
>>> --- a/xen/common/page_alloc.c
>>> +++ b/xen/common/page_alloc.c
>>> @@ -2121,9 +2121,9 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe)
>>>* Yuk! Ensure there is a one-page buffer between Xen and Dom zones, 
>>> to
>>>* prevent merging of power-of-two blocks across the zone boundary.
>>>*/
>>> -if ( ps && !is_xen_heap_mfn(paddr_to_pfn(ps)-1) )
>>> +if ( ps && !is_xen_heap_mfn(_mfn(paddr_to_pfn(ps)-1)) )
>>>   ps += PAGE_SIZE;
>>> -if ( !is_xen_heap_mfn(paddr_to_pfn(pe)) )
>>> +if ( !is_xen_heap_mfn(maddr_to_mfn(pe)) )
>> 
>> Why maddr_to_mfn() here but still paddr_to_pfn() above? Oh,
>> we don't have any mfn_sub(), I see.
> 
> Yes, we don't have mfn_sub() (or even gfn_sub()). I only found a couple of
> places where such helpers might be useful. I can introduce the two helpers
> if you think it is worth it.

Well, I guess in the end I'm fine either way. It simply struck me
as odd at the first glance that you use maddr_to_mfn() in one
case but not the other.
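An mfn_sub() of the kind contemplated above would be a one-liner next to Xen's existing mfn_add(); the sketch below re-declares the typesafe wrapper types for self-containment (Xen's real definitions live in its mm headers):

```c
#include <assert.h>

/* Stand-ins for Xen's typesafe MFN wrappers. */
typedef struct { unsigned long mfn; } mfn_t;

#define _mfn(n)  ((mfn_t){ .mfn = (n) })
#define mfn_x(m) ((m).mfn)

/* Xen already provides mfn_add() along these lines. */
static inline mfn_t mfn_add(mfn_t m, unsigned long n)
{
    return _mfn(mfn_x(m) + n);
}

/* The proposed helper: would let the quoted hunk use
 * is_xen_heap_mfn(mfn_sub(maddr_to_mfn(ps), 1)) instead of
 * open-coding paddr_to_pfn(ps) - 1. */
static inline mfn_t mfn_sub(mfn_t m, unsigned long n)
{
    return _mfn(mfn_x(m) - n);
}
```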

Jan




Re: [Xen-devel] [PATCH for-next 8/9] xen: Introduce HAS_M2P config and use to protect mfn_to_gmfn call

2019-03-14 Thread Jan Beulich
>>> On 13.03.19 at 18:30,  wrote:
> On 13/03/2019 15:20, Jan Beulich wrote:
> On 18.02.19 at 12:35,  wrote:
>>> --- a/xen/common/domctl.c
>>> +++ b/xen/common/domctl.c
>>> @@ -205,7 +205,7 @@ void getdomaininfo(struct domain *d, struct 
> xen_domctl_getdomaininfo *info)
>>>   info->outstanding_pages = d->outstanding_pages;
>>>   info->shr_pages = atomic_read(&d->shr_pages);
>>>   info->paged_pages   = atomic_read(&d->paged_pages);
>>> -info->shared_info_frame = mfn_to_gmfn(d, virt_to_mfn(d->shared_info));
>>> +info->shared_info_frame = gfn_x(domain_shared_info_gfn(d));
>> 
>> I think this change wants to be accompanied by a warning attached
>> to the field declaration in the public header.
> 
> Make sense.
> 
>> 
>> But I'd also like to have the tool stack maintainers' view on making
>> this field effectively unusable for Arm.
> 
> The value in shared_info_frame has been plain wrong since the creation of
> Xen Arm, so this is just making the error more obvious. I don't expect any
> user of it on Arm.

Well, my request for tool stack maintainer input wasn't to put into
question whether the field can currently be used sensibly on Arm.
Instead I meant to find out whether the tool stack can sensibly be
expected to want to use the field uniformly, in which case,
rather than making it more obviously not work, it should be fixed
instead.

Jan




Re: [Xen-devel] [PATCH for-next 9/9] xen: Remove mfn_to_gmfn macro

2019-03-14 Thread Jan Beulich
>>> On 13.03.19 at 18:34,  wrote:
> On 13/03/2019 15:59, Jan Beulich wrote:
> On 13.03.19 at 16:48,  wrote:
>>> On 13/03/2019 15:40, Jan Beulich wrote:
>>> On 13.03.19 at 16:24,  wrote:
> On 13/03/2019 15:22, Jan Beulich wrote:
> On 18.02.19 at 12:36,  wrote:
>>> --- a/xen/include/asm-arm/mm.h
>>> +++ b/xen/include/asm-arm/mm.h
>>> @@ -321,10 +321,8 @@ struct page_info *get_page_from_gva(struct vcpu 
>>> *v, 
>>> vaddr_t va,
>>>#define SHARED_M2P_ENTRY (~0UL - 1UL)
>>>#define SHARED_M2P(_e)   ((_e) == SHARED_M2P_ENTRY)
>>>
>>> -/* Xen always owns P2M on ARM */
>>> +/* We don't have a M2P on Arm */
>>>#define set_gpfn_from_mfn(mfn, pfn) do { (void) (mfn), (void)(pfn); } while (0)
>>> -#define mfn_to_gmfn(_d, mfn)  (mfn)
>> So is the plan to remove the other macro from Arm then as well?
> Do you mean mfn_to_gfn? If so it does not exist on Arm.
 No, I mean the one in context above - set_gpfn_from_mfn().
>>> It is used in common code, so we would need to #ifdef the caller.
>> Hmm, right, such #ifdef-ary would be undesirable (and two out of
>> the three common code callers would need it).
>>
>>> I think it is better to provide a NOP implementation. Could be moved
>>> somewhere in the common header though. Any opinions?
>> This would perhaps be better, now that you have HAVE_M2P.
> 
> Given that "having an M2P" is now an x86-specific concept, I think
> phasing set_gpfn_from_mfn()'s use out of common code is the way to go.

But what's the implication of this? There would need to be some
arch_*() hook used in the place that set_gpfn_from_mfn() is
invoked currently. But then it can as well remain
set_gpfn_from_mfn() (with the !HAVE_M2P stubbed out properly
in a common header), can't it?
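The stubbing Jan describes might look roughly like the following. This is purely illustrative: the config symbol name, the toy table, and its size are assumptions, not Xen's actual implementation.

```c
#include <assert.h>

#define CONFIG_HAS_M2P /* pretend this architecture maintains an M2P */

#ifdef CONFIG_HAS_M2P
/* Toy stand-in for the real machine-to-phys table. */
static unsigned long machine_to_phys_mapping[16];
#define set_gpfn_from_mfn(mfn, pfn) \
    (machine_to_phys_mapping[mfn] = (pfn))
#else
/* No M2P on this architecture (e.g. Arm): evaluate and discard the
 * arguments, so common code can keep calling this unconditionally. */
#define set_gpfn_from_mfn(mfn, pfn) do { (void)(mfn), (void)(pfn); } while (0)
#endif
```

With the stub in a common header, the three common-code call sites need no #ifdef at all, which is the point of keeping the set_gpfn_from_mfn() name.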

Jan




Re: [Xen-devel] [PATCH for-next 9/9] xen: Remove mfn_to_gmfn macro

2019-03-14 Thread Jan Beulich
>>> On 13.03.19 at 19:41,  wrote:
> On 13/03/2019 17:42, Julien Grall wrote:
>> On 13/03/2019 17:34, Andrew Cooper wrote:
>>> Given that "having an M2P" is now an x86-specific concept, I think
>>> phasing set_gpfn_from_mfn()'s use out of common code is the way to go.
>>
>> So you never expect other architecture to use the M2P?
> 
> I guess that depends on how likely it is going to be that Xen gains a
> new non-HVM-like virtualisation mode on a new architecture.

Well, not quite. I don't think it would be straightforward to make
x86 select HAVE_M2P only when PV is also enabled. Hence a HVM-
like implementation may still want to maintain M2P. In fact it is
my understanding that 64-bit Arm could easily do so (leaving aside
the question of whether the memory needed to build the tables
would be well spent this way), but it's prohibitive on 32-bit, and hence
it's easier for the Arm code overall to uniformly resort to
alternative means where such a translation is indeed needed.

Jan




Re: [Xen-devel] [PATCH] tools/cpuid: Update xen-cpuid for Coffee Lake processors

2019-03-14 Thread Jan Beulich
>>> On 13.03.19 at 20:37,  wrote:
> --- a/tools/misc/xen-cpuid.c
> +++ b/tools/misc/xen-cpuid.c
> @@ -127,6 +127,8 @@ static const char *str_7c0[32] =
>  [14] = "avx512_vpopcntdq",
>  
>  [22] = "rdpid",
> +
> +[30] = "sgx-lc",
>  };
>  
>  static const char *str_e7d[32] =

Hmm, to be honest I don't see why this needs to go in ahead of
https://lists.xenproject.org/archives/html/xen-devel/2018-12/msg01813.html
posted almost 3 months ago, which already takes care of this bit.
I'd be fine switching from underscore to dash there if that's
desired, but that looks inconsistent with pre-existing entries (as
the context of your patch also shows).

Jan




[Xen-devel] [xen-4.8-testing test] 133755: regressions - FAIL

2019-03-14 Thread osstest service owner
flight 133755 xen-4.8-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/133755/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt-pair 22 guest-migrate/src_host/dst_host fail REGR. 
vs. 130965

Tests which are failing intermittently (not blocking):
 test-xtf-amd64-amd64-5 69 xtf/test-hvm64-xsa-278 fail in 133662 pass in 133755
 test-xtf-amd64-amd64-4 50 xtf/test-hvm64-lbr-tsx-vmentry fail in 133680 pass 
in 133755
 test-amd64-i386-libvirt-pair 22 guest-migrate/src_host/dst_host fail pass in 
133662
 test-xtf-amd64-amd64-2   50 xtf/test-hvm64-lbr-tsx-vmentry fail pass in 133680

Tests which did not succeed, but are not blocking:
 test-xtf-amd64-amd64-5  50 xtf/test-hvm64-lbr-tsx-vmentry fail like 130965
 test-xtf-amd64-amd64-3  50 xtf/test-hvm64-lbr-tsx-vmentry fail like 130965
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail like 130965
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 130965
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail  like 130965
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 130965
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 130965
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 130965
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 130965
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail like 130965
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail like 130965
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail like 130965
 test-amd64-amd64-libvirt 13 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl  14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check fail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl  13 migrate-support-check fail   never pass
 test-armhf-armhf-xl  14 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-check fail   never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail never pass

Re: [Xen-devel] xen: Can't insert balloon page into VM userspace (WAS Re: [linux-linus bisection] complete test-arm64-arm64-xl-xsm)

2019-03-14 Thread Juergen Gross
On 12/03/2019 20:46, David Hildenbrand wrote:
> On 12.03.19 19:23, David Hildenbrand wrote:
>> On 12.03.19 19:02, Boris Ostrovsky wrote:
>>> On 3/12/19 1:24 PM, Andrew Cooper wrote:
 On 12/03/2019 17:18, David Hildenbrand wrote:
> On 12.03.19 18:14, Matthew Wilcox wrote:
>> On Tue, Mar 12, 2019 at 05:05:39PM +, Julien Grall wrote:
>>> On 3/12/19 3:59 PM, Julien Grall wrote:
 It looks like all the arm tests for the linus [1] and next [2] trees
 are now failing. x86 seems to be mostly ok.

 The bisector fingered the following commit:

 commit 0ee930e6cafa048c1925893d0ca89918b2814f2c
 Author: Matthew Wilcox 
 Date:   Tue Mar 5 15:46:06 2019 -0800

  mm/memory.c: prevent mapping typed pages to userspace
 Pages which use page_type must never be mapped to userspace as it would
 destroy their page type.  Add an explicit check for this instead of
 assuming that kernel drivers always get this right.
>> Oh good, it found a real problem.
>>
>>> It turns out the problem is because the balloon driver will call
>>> __SetPageOffline() on allocated page. Therefore the page has a type and
>>> vm_insert_pages will deny the insertion.
>>>
>>> My knowledge is quite limited in this area. So I am not sure how we can
>>> solve the problem.
>>>
>>> I would appreciate if someone could provide input of to fix the mapping.
>> I don't know the balloon driver, so I don't know why it was doing this,
>> but what it was doing was Wrong and has been since 2014 with:
>>
>> commit d6d86c0a7f8ddc5b38cf089222cb1d9540762dc2
>> Author: Konstantin Khlebnikov 
>> Date:   Thu Oct 9 15:29:27 2014 -0700
>>
>> mm/balloon_compaction: redesign ballooned pages management
>>
>> If ballooned pages are supposed to be mapped into userspace, you can't
>> mark them as ballooned pages using the mapcount field.
>>
> Asking myself why anybody would want to map balloon inflated pages into
> user space (this just sounds plain wrong but my understanding to what
> XEN balloon driver does might be limited), but I assume the easy fix
> would be to revert
 I suspect the bug here is that the balloon driver is (ab)used for a
 second purpose
>>>
>>> Yes. And its name is alloc_xenballooned_pages().
>>>
>>
>> Haven't had a look at the code yet, but would another temporary fix be
>> to clear/set PG_offline when allocating/freeing a ballooned page?
>> (assuming here that only such pages will be mapped to user space)
>>
> 
> I guess something like this could do the trick if I understood it correctly:
> 
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 39b229f9e256..d37dd5bb7a8f 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -604,6 +604,7 @@ int alloc_xenballooned_pages(int nr_pages, struct
> page **pages)
> while (pgno < nr_pages) {
> page = balloon_retrieve(true);
> if (page) {
> +   __ClearPageOffline(page);
> pages[pgno++] = page;
>  #ifdef CONFIG_XEN_HAVE_PVMMU
> /*
> @@ -645,8 +646,10 @@ void free_xenballooned_pages(int nr_pages, struct
> page **pages)
> mutex_lock(&balloon_mutex);
> 
> for (i = 0; i < nr_pages; i++) {
> -   if (pages[i])
> +   if (pages[i]) {
> +   __SetPageOffline(pages[i]);
> balloon_append(pages[i]);
> +   }
> }
> 
> balloon_stats.target_unpopulated -= nr_pages;
> 
> 
> At least this way, the pages allocated (and thus eventually mapped to
> user space) would not be marked, but the other ones would remain marked
> and could be excluded by makedumpfile.
> 

I think this patch should do the trick. Julien, could you give it a
try? On x86 I can't reproduce your problem easily, as dom0 is PV with
plenty of unpopulated pages for grant memory, which don't suffer from
the missing "offline" bit.


Juergen


Re: [Xen-devel] [PATCH 1/2 for-4.12] xen: introduce VCPUOP_register_runstate_phys_memory_area hypercall

2019-03-14 Thread Jan Beulich
>>> On 05.03.19 at 14:14,  wrote:
> From: Andrii Anisov 
> 
> The hypercall employs the same vcpu_register_runstate_memory_area
> structure for the interface, but requires registered area to not
> cross a page boundary.
> 
> Signed-off-by: Andrii Anisov 

While first of all you'll need to settle the dispute with Julien, a
couple of comments on the actual changes you do in case the
approach is to be pursued.

> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -277,29 +277,33 @@ static void ctxt_switch_to(struct vcpu *n)
>  /* Update per-VCPU guest runstate shared memory area (if registered). */
>  static void update_runstate_area(struct vcpu *v)
>  {
> -void __user *guest_handle = NULL;
> -
>  if ( guest_handle_is_null(runstate_guest(v)) )
>  return;

I think (seeing also patch 2) re-using the handle representation
is a bad idea. If there's a Xen-internal mapping use a plain
pointer. (All comments on Arm code respectively apply to the
x86 version, also [if any] in patch 2.)

> -if ( VM_ASSIST(v->domain, runstate_update_flag) )
> +if ( v->runstate_guest_type == RUNSTATE_VADDR )

Patch 2 shows that this wants to be switch().

>  {
> -guest_handle = &v->runstate_guest.p->state_entry_time + 1;
> -guest_handle--;
> -v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
> -__raw_copy_to_guest(guest_handle,
> -(void *)(&v->runstate.state_entry_time + 1) - 1, 
> 1);
> -smp_wmb();
> -}
> +void __user *guest_handle = NULL;
> +if ( VM_ASSIST(v->domain, runstate_update_flag) )
> +{
> +guest_handle = &v->runstate_guest.p->state_entry_time + 1;
> +guest_handle--;
> +v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
> +__raw_copy_to_guest(guest_handle,
> +(void *)(&v->runstate.state_entry_time + 1) 
> - 1,
> +1);
> +smp_wmb();
> +}
>  
> -__copy_to_guest(runstate_guest(v), &v->runstate, 1);
> +__copy_to_guest(runstate_guest(v), &v->runstate, 1);
>  
> -if ( guest_handle )
> -{
> -v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
> -smp_wmb();
> -__raw_copy_to_guest(guest_handle,
> -(void *)(&v->runstate.state_entry_time + 1) - 1, 
> 1);
> +if ( guest_handle )
> +{
> +v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
> +smp_wmb();
> +__raw_copy_to_guest(guest_handle,
> +(void *)(&v->runstate.state_entry_time + 1) 
> - 1,
> +1);
> +}
>  }

There looks to be an "else" missing here (or "default:" once changed
to switch()), which you have on the x86 side. I'm unconvinced it
should be "return true" there though (in which case having nothing
here would indeed have been correct) - ASSERT_UNREACHABLE()
would much rather seem appropriate.
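The switch() shape being asked for (including ASSERT_UNREACHABLE() in the default case rather than a silent "return true") might look roughly as below. The copy logic is stubbed to a path indicator, the types are minimal stand-ins, and ASSERT_UNREACHABLE() is reduced to a comment, so this is a sketch of the control flow only:

```c
#include <assert.h>

enum runstate_guest_type {
    RUNSTATE_NONE,
    RUNSTATE_PADDR,
    RUNSTATE_VADDR,
};

struct vcpu {
    enum runstate_guest_type runstate_guest_type;
};

/* Returns which path was taken, standing in for the real copy logic. */
static int update_runstate_area(struct vcpu *v)
{
    switch ( v->runstate_guest_type )
    {
    case RUNSTATE_VADDR:
        /* __copy_to_guest() path: area registered by virtual address. */
        return 1;

    case RUNSTATE_PADDR:
        /* Direct write through the Xen-internal mapping of the page. */
        return 2;

    default:
        /* ASSERT_UNREACHABLE() here, per the comment above. */
        return 0;
    }
}
```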

> @@ -1535,6 +1536,12 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, 
> XEN_GUEST_HANDLE_PARAM(void) arg)
>  break;
>  }
>  
> +case VCPUOP_register_runstate_phys_memory_area:
> +{
> +rc = -ENOSYS;
> +break;
> +}

Stray braces and bad use of -ENOSYS (should be e.g. -EOPNOTSUPP).

> --- a/xen/include/public/vcpu.h
> +++ b/xen/include/public/vcpu.h
> @@ -235,6 +235,22 @@ struct vcpu_register_time_memory_area {
>  typedef struct vcpu_register_time_memory_area 
> vcpu_register_time_memory_area_t;
>  DEFINE_XEN_GUEST_HANDLE(vcpu_register_time_memory_area_t);
>  
> +/*
> + * Register a shared memory area from which the guest may obtain its own
> + * runstate information without needing to execute a hypercall.
> + * Notes:
> + *  1. The registered address must be guest's physical address.
> + *  2. The registered runstate area should not cross page boundary.
> + *  3. Only one shared area may be registered per VCPU. The shared area is
> + * updated by the hypervisor each time the VCPU is scheduled. Thus
> + * runstate.state will always be RUNSTATE_running and
> + * runstate.state_entry_time will indicate the system time at which the
> + * VCPU was last scheduled to run.
> + * @extra_arg == pointer to vcpu_register_runstate_memory_area structure.

I don't think re-use of that structure is appropriate to represent
a physical address.

> + */
> +#define VCPUOP_register_runstate_phys_memory_area 14
> +
> +
>  #endif /* __XEN_PUBLIC_VCPU_H__ */

Stray double blank lines.
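A dedicated interface structure of the kind suggested above, carrying only a guest physical address instead of reusing vcpu_register_runstate_memory_area (whose union also holds a virtual address), might look like this. It is purely illustrative and not an actual Xen ABI:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical dedicated argument structure for
 * VCPUOP_register_runstate_phys_memory_area: a plain guest physical
 * address, which need not be page-aligned but must leave the runstate
 * area within one page. */
struct vcpu_register_runstate_phys_area {
    uint64_t addr;
};
```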

> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -163,6 +163,13 @@ struct vcpu
>  void*sched_priv;/* scheduler-specific data */
>  
>  struct vcpu_runstate_info runstate;
> +
> +enum {
> +RUNSTATE_NONE = 0,
> +RUNSTATE_PADDR = 1,
> +RUNSTATE_VADDR = 2,
> +} runstate_guest_type;

I can see that putting the new field here nicely groups with

Re: [Xen-devel] [PATCH v5 09/11] viridian: add implementation of synthetic interrupt MSRs

2019-03-14 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 14 March 2019 07:48
> To: Paul Durrant 
> Cc: Julien Grall ; Andrew Cooper 
> ; George Dunlap
> ; Ian Jackson ; Roger Pau 
> Monne
> ; Wei Liu ; Stefano Stabellini 
> ;
> xen-devel ; Konrad Rzeszutek Wilk 
> ; Tim
> (Xen.org) 
> Subject: RE: [Xen-devel] [PATCH v5 09/11] viridian: add implementation of 
> synthetic interrupt MSRs
> 
> >>> On 13.03.19 at 17:13,  wrote:
> >>  -Original Message-
> >> From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On Behalf 
> >> Of
> > Jan Beulich
> >> Sent: 13 March 2019 13:15
> >> To: Paul Durrant 
> >> Cc: Stefano Stabellini ; Wei Liu 
> >> ;
> > Konrad Rzeszutek Wilk
> >> ; George Dunlap ; Andrew
> > Cooper
> >> ; Ian Jackson ; Tim 
> >> (Xen.org)
> > ; Julien
> >> Grall ; xen-devel ; 
> >> Roger
> > Pau Monne
> >> 
> >> Subject: Re: [Xen-devel] [PATCH v5 09/11] viridian: add implementation of
> > synthetic interrupt MSRs
> >>
> >> >>> On 11.03.19 at 14:41,  wrote:
> >> > @@ -28,6 +29,32 @@ typedef union _HV_VP_ASSIST_PAGE
> >> >  uint8_t ReservedZBytePadding[PAGE_SIZE];
> >> >  } HV_VP_ASSIST_PAGE;
> >> >
> >> > +typedef enum HV_MESSAGE_TYPE {
> >> > +HvMessageTypeNone,
> >> > +HvMessageTimerExpired = 0x8010,
> >> > +} HV_MESSAGE_TYPE;
> >> > +
> >> > +typedef struct HV_MESSAGE_FLAGS {
> >> > +uint8_t MessagePending:1;
> >> > +uint8_t Reserved:7;
> >> > +} HV_MESSAGE_FLAGS;
> >> > +
> >> > +typedef struct HV_MESSAGE_HEADER {
> >> > +HV_MESSAGE_TYPE MessageType;
> >> > +uint16_t Reserved1;
> >> > +HV_MESSAGE_FLAGS MessageFlags;
> >> > +uint8_t PayloadSize;
> >> > +uint64_t Reserved2;
> >> > +} HV_MESSAGE_HEADER;
> >> > +
> >> > +#define HV_MESSAGE_SIZE 256
> >> > +#define HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT 30
> >>
> >
> > Missed this one before...
> >
> >> Is this defined this way, or (given ...
> >>
> >> > +typedef struct HV_MESSAGE {
> >> > +HV_MESSAGE_HEADER Header;
> >> > +uint64_t Payload[HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT];
> >> > +} HV_MESSAGE;
> >>
> >> ... this) isn't it rather
> >>
> >> #define HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT \
> >> ((HV_MESSAGE_SIZE - sizeof(HV_MESSAGE_HEADER)) / 8)
> >>
> >
> > I need the definition for the array in the struct so that sizeof(HV_MESSAGE)
> > == HV_MESSAGE_SIZE (for which there is a BUILD_BUG_ON()) later.
> 
> I don't understand this part - I'm not asking to ditch the #define.
> As to the BUILD_BUG_ON() - I see now, but that's only in patch
> 10, and in a specific message handler. I think this would belong
> here, and in the main viridian.c file.

Ok, I can see that it is more logical to co-locate the BUILD_BUG_ON() with the 
definitions.

  Paul

> 
> > It's also written that way in the spec. so I'd rather leave it as-is.
> 
> Well, okay then - I was sort of expecting this to be spelled out
> there in such a way. That doesn't change my overall opinion,
> but I can see your point of wanting to match the spec, and you
> being the maintainer I have no basis to insist anyway.
> 
> Jan
> 



Re: [Xen-devel] [PATCH 2/2 for-4.12] xen: implement VCPUOP_register_runstate_phys_memory_area

2019-03-14 Thread Jan Beulich
>>> On 05.03.19 at 14:14,  wrote:
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -305,6 +305,26 @@ static void update_runstate_area(struct vcpu *v)
>  1);
>  }
>  }
> +else if ( v->runstate_guest_type == RUNSTATE_PADDR )
> +{
> +if ( VM_ASSIST(v->domain, runstate_update_flag) )
> +{
> +runstate_guest(v).p->state_entry_time |= XEN_RUNSTATE_UPDATE;
> +smp_wmb();
> +v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
> +}
> +
> +memcpy(runstate_guest(v).p, &v->runstate, sizeof(v->runstate));
> +
> +if ( VM_ASSIST(v->domain, runstate_update_flag) )
> +{
> +runstate_guest(v).p->state_entry_time &= ~XEN_RUNSTATE_UPDATE;
> +smp_wmb();
> +v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
> +}
> +}
> +else
> +{ /* No actions required */ }

See my respective comment on patch 1.

> @@ -1648,9 +1648,37 @@ bool update_runstate_area(struct vcpu *v)
>  (void *)(&v->runstate.state_entry_time + 1) 
> - 1, 1);
>  }
>  }
> -else
> +else if ( v->runstate_guest_type == RUNSTATE_PADDR )
>  {
> -rc = true;
> +if ( VM_ASSIST(v->domain, runstate_update_flag) )
> +{
> +v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
> +if ( has_32bit_shinfo((v)->domain) )

Stray inner parentheses (at least one more instance further down).

> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -738,7 +738,14 @@ int domain_kill(struct domain *d)
>  if ( cpupool_move_domain(d, cpupool0) )
>  return -ERESTART;
>  for_each_vcpu ( d, v )
> +{
> +if ( v->runstate_guest_type == RUNSTATE_VADDR )
> +set_xen_guest_handle(runstate_guest(v), NULL);
> +else
> +unmap_runstate_area(v);

Since this recurs further down, can't the VADDR case also be
taken care of by unmap_runstate_area()?

> @@ -1333,6 +1344,65 @@ void unmap_vcpu_info(struct vcpu *v)
>  put_page_and_type(mfn_to_page(mfn));
>  }
>  
> +int map_runstate_area(struct vcpu *v,
> +  struct vcpu_register_runstate_memory_area *area)

Neither this nor the unmap function look to be used outside this
file, so should be static. It also looks as if the second parameter
could be constified.

> +{
> +unsigned long offset = area->addr.p & ~PAGE_MASK;
> +gfn_t gfn = gaddr_to_gfn(area->addr.p);
> +struct domain *d = v->domain;
> +void *mapping;
> +struct page_info *page;
> +size_t size = sizeof (struct vcpu_runstate_info );

Stray blanks.

> +ASSERT(v->runstate_guest_type == RUNSTATE_PADDR );

Another one.

> +if ( offset > (PAGE_SIZE - size) )
> +return -EINVAL;
> +
> +page = get_page_from_gfn(d, gfn_x(gfn), NULL, P2M_ALLOC);

Don't you also need P2M_UNSHARE? And is the p2m type of the
page really of no interest here?

> +void unmap_runstate_area(struct vcpu *v)
> +{
> +mfn_t mfn;
> +
> +if ( v->runstate_guest_type != RUNSTATE_PADDR )
> +return;
> +
> +if ( guest_handle_is_null(runstate_guest(v)) )
> +return;
> +
> +mfn = _mfn(virt_to_mfn(runstate_guest(v).p));

The pointer is the result of __map_domain_page_global() - I
don't think you can legitimately use virt_to_mfn() on it, at
least not on x86; domain_page_map_to_mfn() is what you
want to use here.

> @@ -1532,13 +1602,29 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, 
> XEN_GUEST_HANDLE_PARAM(void) arg)
>  vcpu_runstate_get(v, &runstate);
>  __copy_to_guest(runstate_guest(v), &runstate, 1);
>  }
> -
>  break;
>  }

Undue removal of a blank line.

>  case VCPUOP_register_runstate_phys_memory_area:
>  {
> -rc = -ENOSYS;
> +struct vcpu_register_runstate_memory_area area;
> +
> +rc = -EFAULT;
> +if ( copy_from_guest(&area, arg, 1) )
> +break;
> +
> +unmap_runstate_area(v);
> +v->runstate_guest_type = RUNSTATE_PADDR;
> +rc = map_runstate_area(v, &area);
> +
> +if ( rc )
> +{
> +v->runstate_guest_type = RUNSTATE_NONE;

All of this updating of type and mapping are racy, and I guess
that's also what Julien has already pointed out. As long as the
mapping can be updated by remote CPUs, you have to make
sure that the consuming side can't see partially updated values.
One option _might_ be to switch to NONE first, and to PADDR
upon success (with suitable barriers in between). The logic
then of course needs mirroring to the VADDR path.
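[Editor's note: to make the suggested ordering concrete, here is a minimal, hypothetical sketch. The type, the struct, and the barrier placement are illustrative only, not Xen's actual code.]

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of the ordering suggested above: switch the type to
 * NONE before touching the mapping, and publish PADDR only after the
 * new mapping is fully set up.  Names are illustrative, not Xen's. */

enum runstate_type { RUNSTATE_NONE, RUNSTATE_VADDR, RUNSTATE_PADDR };

struct vcpu_model {
    enum runstate_type type;
    void *mapping;              /* stands in for runstate_guest(v).p */
};

/* In real code a write barrier (smp_wmb()) is required between the two
 * stores so a remote consumer never sees PADDR paired with a stale
 * mapping; comments mark where the barriers belong. */
static int register_phys_area(struct vcpu_model *v, void *new_mapping)
{
    v->type = RUNSTATE_NONE;    /* consumers now skip the mapping */
    /* smp_wmb() here */
    v->mapping = new_mapping;
    if ( new_mapping == NULL )
        return -1;              /* mapping failed: stay in NONE */
    /* smp_wmb() here */
    v->type = RUNSTATE_PADDR;   /* publish only after setup is done */
    return 0;
}
```

A consumer that reads the type first (with a matching read barrier) then either sees NONE, or sees PADDR together with the fully initialized mapping.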

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] Hypervisor Debugging GDB, KDB

2019-03-14 Thread Jan Ruh
Dear Xen Developer mailing list,

I have been looking for detailed information about the current status of Xen 
hypervisor debugging. I checked the mailing list (questions mostly unanswered 
or outdated), the repositories, and several search engines without finding a 
satisfying answer to my question. Is there any way of single-stepping the Xen 
hypervisor as is usually done with, e.g., the Linux kernel using gdb and kdb 
over a serial port? If so, can you give me a hint where to find this 
information? Furthermore, it might be good to make this information easily 
accessible for newcomers to Xen development.

Best,

Jan Ruh
CONFIDENTIALITY: The contents of this e-mail are confidential and intended only 
for the above addressee(s). If you are not the intended recipient, or the 
person responsible for delivering it to the intended recipient, copying or 
delivering it to anyone else or using it in any unauthorized manner is 
prohibited and may be unlawful. If you receive this e-mail by mistake, please 
notify the sender and the systems administrator at straym...@tttech.com 
immediately.


Re: [Xen-devel] [PATCH] xen-block: Replace qdict_put_obj() by qdict_put() where appropriate

2019-03-14 Thread Paul Durrant
> -Original Message-
> From: Markus Armbruster [mailto:arm...@redhat.com]
> Sent: 13 March 2019 17:45
> To: qemu-de...@nongnu.org
> Cc: sstabell...@kernel.org; Anthony Perard ; Paul 
> Durrant
> ; xen-devel@lists.xenproject.org; 
> qemu-bl...@nongnu.org
> Subject: [PATCH] xen-block: Replace qdict_put_obj() by qdict_put() where 
> appropriate
> 
> Patch created mechanically by rerunning:
> 
> $ spatch --sp-file scripts/coccinelle/qobject.cocci \
>  --macro-file scripts/cocci-macro-file.h \
>  --dir hw/block --in-place
> 
> Signed-off-by: Markus Armbruster 

Acked-by: Paul Durrant 

> ---
>  hw/block/xen-block.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
> index 70fc2455e8..9c722b9b95 100644
> --- a/hw/block/xen-block.c
> +++ b/hw/block/xen-block.c
> @@ -771,7 +771,7 @@ static XenBlockDrive *xen_block_drive_create(const char 
> *id,
>  QDict *cache_qdict = qdict_new();
> 
>  qdict_put_bool(cache_qdict, "direct", true);
> -qdict_put_obj(file_layer, "cache", QOBJECT(cache_qdict));
> +qdict_put(file_layer, "cache", cache_qdict);
> 
>  qdict_put_str(file_layer, "aio", "native");
>  }
> @@ -796,7 +796,7 @@ static XenBlockDrive *xen_block_drive_create(const char 
> *id,
>  qdict_put_str(driver_layer, "driver", driver);
>  g_free(driver);
> 
> -qdict_put_obj(driver_layer, "file", QOBJECT(file_layer));
> +qdict_put(driver_layer, "file", file_layer);
> 
>  g_assert(!drive->node_name);
>  drive->node_name = xen_block_blockdev_add(drive->id, driver_layer,
> --
> 2.17.2



[Xen-devel] [PATCH 0/2] xen: some debug trace enhancements

2019-03-14 Thread Juergen Gross
While doing some scheduler work I used debug trace for diagnosis of
problems during dom0 boot. This small series is the result of adapting
debug trace to my needs.

Juergen Gross (2):
  xen/debug: make debugtrace configurable via Kconfig
  xen/debug: make debugtrace more clever regarding repeating entries

 xen/Kconfig.debug  |  7 +++
 xen/drivers/char/console.c | 46 ++
 xen/include/xen/lib.h  |  3 +--
 3 files changed, 42 insertions(+), 14 deletions(-)

-- 
2.16.4



[Xen-devel] [PATCH 2/2] xen/debug: make debugtrace more clever regarding repeating entries

2019-03-14 Thread Juergen Gross
In case debugtrace is writing to memory and the last entry is repeated,
don't fill up the trace buffer, but modify the count prefix to "x-y "
style instead.
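[Editor's note: a stand-alone illustration of the compression scheme, simplified from the patch below (no ring wrap-around, no locking; names are not the actual Xen ones).]

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Model of the "x-y " compression: instead of appending a repeated
 * message again, rewind the producer index to where the previous copy
 * started and re-emit it with a "first-last " count prefix. */

static char trace[512];
static size_t prd;              /* producer index, like debugtrace_prd */
static size_t last_prd;         /* start of the previous entry */
static unsigned int count, last_count;
static char last_msg[128];

void trace_add(const char *msg)
{
    char prefix[32];

    if ( strcmp(msg, last_msg) != 0 )
    {
        last_prd = prd;         /* remember where this entry starts */
        last_count = ++count;
        snprintf(prefix, sizeof(prefix), "%u ", count);
        strncpy(last_msg, msg, sizeof(last_msg) - 1);
    }
    else
    {
        prd = last_prd;         /* overwrite the previous copy */
        snprintf(prefix, sizeof(prefix), "%u-%u ", last_count, ++count);
    }
    prd += (size_t)snprintf(trace + prd, sizeof(trace) - prd, "%s%s",
                            prefix, msg);
}
```

After logging "boot" once and "tick" three times, the buffer holds "1 boot\n" followed by a single "2-4 tick\n" entry instead of three copies.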

Signed-off-by: Juergen Gross 
---
 xen/drivers/char/console.c | 44 +---
 1 file changed, 33 insertions(+), 11 deletions(-)

diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index 41ec13ce52..38dcfc732e 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -1225,13 +1225,28 @@ void debugtrace_dump(void)
 watchdog_enable();
 }
 
+static void debugtrace_add_to_buf(char *buf)
+{
+char *p;
+
+for ( p = buf; *p != '\0'; p++ )
+{
+debugtrace_buf[debugtrace_prd++] = *p;
+/* Always leave a nul byte at the end of the buffer. */
+if ( debugtrace_prd == (debugtrace_bytes - 1) )
+debugtrace_prd = 0;
+}
+}
+
 void debugtrace_printk(const char *fmt, ...)
 {
-static charbuf[1024];
-static u32 count;
+static char buf[1024];
+static char last_buf[1024];
+static u32 count, last_count;
+static unsigned int last_prd;
 
+char  cntbuf[24];
 va_list   args;
-char *p;
 unsigned long flags;
 
 if ( debugtrace_bytes == 0 )
@@ -1243,25 +1258,32 @@ void debugtrace_printk(const char *fmt, ...)
 
 ASSERT(debugtrace_buf[debugtrace_bytes - 1] == 0);
 
-snprintf(buf, sizeof(buf), "%u ", ++count);
-
 va_start(args, fmt);
-(void)vsnprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), fmt, args);
+(void)vsnprintf(buf, sizeof(buf), fmt, args);
 va_end(args);
 
 if ( debugtrace_send_to_console )
 {
+snprintf(cntbuf, sizeof(cntbuf), "%u ", ++count);
+serial_puts(sercon_handle, cntbuf);
 serial_puts(sercon_handle, buf);
 }
 else
 {
-for ( p = buf; *p != '\0'; p++ )
+if ( strcmp(buf, last_buf) )
+{
+last_prd = debugtrace_prd;
+last_count = ++count;
+safe_strcpy(last_buf, buf);
+snprintf(cntbuf, sizeof(cntbuf), "%u ", count);
+}
+else
 {
-debugtrace_buf[debugtrace_prd++] = *p;
-/* Always leave a nul byte at the end of the buffer. */
-if ( debugtrace_prd == (debugtrace_bytes - 1) )
-debugtrace_prd = 0;
+debugtrace_prd = last_prd;
+snprintf(cntbuf, sizeof(cntbuf), "%u-%u ", last_count, ++count);
 }
+debugtrace_add_to_buf(cntbuf);
+debugtrace_add_to_buf(buf);
 }
 
 spin_unlock_irqrestore(&debugtrace_lock, flags);
-- 
2.16.4



[Xen-devel] [PATCH 1/2] xen/debug: make debugtrace configurable via Kconfig

2019-03-14 Thread Juergen Gross
Instead of having to edit include/xen/lib.h to make debugtrace
available, make it configurable via Kconfig.

Default is off, it is available only in expert mode or in debug builds.

Signed-off-by: Juergen Gross 
---
 xen/Kconfig.debug  | 7 +++
 xen/drivers/char/console.c | 2 +-
 xen/include/xen/lib.h  | 3 +--
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/xen/Kconfig.debug b/xen/Kconfig.debug
index 4d5d7f87cb..daacf85141 100644
--- a/xen/Kconfig.debug
+++ b/xen/Kconfig.debug
@@ -98,6 +98,13 @@ config UBSAN
 
  If unsure, say N here.
 
+config DEBUG_TRACE
+   bool "Debug trace support"
+   ---help---
+ Debug trace enables to record debug trace messages which are printed
+ either directly to the console or are printed to console in case of
+ a system crash.
+
 endif # DEBUG || EXPERT
 
 endmenu
diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index 4315588f05..41ec13ce52 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -1157,7 +1157,7 @@ int printk_ratelimit(void)
  * **
  */
 
-#ifdef DEBUG_TRACE_DUMP
+#ifdef CONFIG_DEBUG_TRACE
 
 /* Send output direct to console, or buffer it? */
 static volatile int debugtrace_send_to_console;
diff --git a/xen/include/xen/lib.h b/xen/include/xen/lib.h
index 89939f43c8..e0b7bcb6b7 100644
--- a/xen/include/xen/lib.h
+++ b/xen/include/xen/lib.h
@@ -86,8 +86,7 @@ int parse_boolean(const char *name, const char *s, const char 
*e);
  */
 int cmdline_strcmp(const char *frag, const char *name);
 
-/*#define DEBUG_TRACE_DUMP*/
-#ifdef DEBUG_TRACE_DUMP
+#ifdef CONFIG_DEBUG_TRACE
 extern void debugtrace_dump(void);
 extern void debugtrace_printk(const char *fmt, ...)
 __attribute__ ((format (printf, 1, 2)));
-- 
2.16.4



[Xen-devel] [freebsd-master test] 133765: regressions - FAIL

2019-03-14 Thread osstest service owner
flight 133765 freebsd-master real [real]
http://logs.test-lab.xenproject.org/osstest/logs/133765/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-freebsd-again 5 host-install(5)  fail REGR. vs. 133707

version targeted for testing:
 freebsd  603a99698863e28eb4970588946e212c6a7c41d7
baseline version:
 freebsd  4545fe820a83b675845f9045f06c856129b9c17c

Last test of basis   133707  2019-03-11 09:19:18 Z3 days
Testing same since   133765  2019-03-13 09:19:31 Z1 days1 attempts


People who touched revisions under test:
  0mp <0...@freebsd.org>
  ae 
  asomers 
  bcr 
  cy 
  dab 
  dim 
  emaste 
  gahr 
  gjb 
  hselasky 
  imp 
  jhb 
  kadesai 
  ken 
  kevans 
  kib 
  lidl 
  mav 
  mckusick 
  np 
  sjg 
  trasz 
  wosch 
  wulf 

jobs:
 build-amd64-freebsd-againfail
 build-amd64-freebsd  pass
 build-amd64-xen-freebsd  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 848 lines long.)


Re: [Xen-devel] [PATCH v5 10/11] viridian: add implementation of synthetic timers

2019-03-14 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On Behalf Of 
> Paul Durrant
> Sent: 13 March 2019 14:37
> To: 'Jan Beulich' 
> Cc: Stefano Stabellini ; Wei Liu 
> ; Konrad Rzeszutek Wilk
> ; Andrew Cooper ; Tim 
> (Xen.org) ;
> George Dunlap ; Julien Grall 
> ; xen-devel  de...@lists.xenproject.org>; Ian Jackson ; Roger Pau 
> Monne
> 
> Subject: Re: [Xen-devel] [PATCH v5 10/11] viridian: add implementation of 
> synthetic timers
> 
[snip]
> 
> > As to safety of this, I have two concerns:
> >
> > 1) TscSequence gets updated as a result of a guest action (an MSR
> > write). This makes it non-obvious that the loop above will get
> > exited in due course.
> >
> 
> True. The domain could try to DoS this call. This could be avoided by doing a 
> domain_pause() if the test continuously fails for a number of iterations, or 
> maybe just one iteration.
> 
> > 2) The way update_reference_tsc() deals with the two "invalid"
> > values suggests ~0 and 0 should be special cased in general. I
> > _think_ this is not necessary here, but it also seems to me as if
> > the VM ever having a way to observe either of those two values
> > would be wrong too. Shouldn't the function avoid to ever store
> > ~0 into that field, i.e. increment into a local variable, update
> > that local variable to skip the two "invalid" values, and only then
> > store into the field?
> >
> > Otoh, making it into that function being a result of an MSR write,
> > it may welll be that the spec precludes the guest from reading
> > the reference page while an update was invoked from one of its
> > vCPU-s. If this was the case, then I also wouldn't have to
> > wonder any longer how this entire mechanism can be race free
> > in the first place (without a double increment like we do in the
> > pv-clock protocol).
> 
> From observation, it looks like Windows initializes the reference tsc page 
> before it brings secondary
> CPUs online and then doesn't touch the MSR again, so we should probably only 
> tolerate one mismatch in
> time_now() before doing domain_pause().

Actually it occurred to me last night that I'm being completely thick by coding 
it this way. The viridian code sets TscScale, not the guest, so we don't even 
need to reference the HV_REFERENCE_TSC_PAGE struct. Looking again, I'm also 
concerned that there's a small TOCTOU race in testing whether the reference tsc 
page is valid where the guest could unmap it on another CPU and cause a NULL 
pointer deref in time_now(), so I'll re-work this entirely.
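[Editor's note: a hedged sketch of the bounded seqcount-retry read being discussed, i.e. retrying while the sequence changes under us but giving up after a few iterations so a guest flipping the MSR cannot livelock the hypervisor. All names, the struct layout, and the retry limit are illustrative.]

```c
#include <assert.h>
#include <stdint.h>

/* Minimal stand-in for the reference TSC page's sequence field. */
struct ref_tsc_page {
    volatile uint32_t seq;      /* like TscSequence */
    uint64_t scale;             /* like TscScale */
};

static int read_scale(const struct ref_tsc_page *p, uint64_t *out)
{
    unsigned int tries;

    for ( tries = 0; tries < 4; tries++ )
    {
        uint32_t seq = p->seq;

        if ( seq == 0 || seq == ~0u )   /* "invalid" sequence values */
            return -1;
        *out = p->scale;
        /* smp_rmb() would go here in real code */
        if ( p->seq == seq )            /* unchanged: read is coherent */
            return 0;
    }
    return -1;  /* too many retries: caller pauses the domain, etc. */
}
```

The bound turns a guest-controlled retry loop into a failure path the hypervisor can handle deliberately, rather than spinning indefinitely.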

  Paul


[Xen-devel] [PATCH 0/2] xen: some scheduler related simplifications

2019-03-14 Thread Juergen Gross
While working on the scheduler I needed to modify some bits which are
worth considering as stand-alone improvements IMO.

Juergen Gross (2):
  xen: introduce a cpumask with all bits set
  xen/sched: don't disable scheduler on cpus during suspend

 xen/arch/x86/io_apic.c|   4 +-
 xen/common/cpu.c  |   4 ++
 xen/common/cpupool.c  |   2 +-
 xen/common/schedule.c | 163 +-
 xen/include/xen/cpumask.h |   2 +
 5 files changed, 53 insertions(+), 122 deletions(-)

-- 
2.16.4



[Xen-devel] [PATCH 1/2] xen: introduce a cpumask with all bits set

2019-03-14 Thread Juergen Gross
There are several places in Xen allocating a cpumask on the stack and
setting all bits in it just to use it as an initial mask for allowing
all cpus.

Save the stack space and omit the need for runtime initialization by
defining a globally accessible cpumask_all variable.
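[Editor's note: the constant relies on GCC's range-designated initializers; here is a stand-alone illustration, where NR_CPUS, the struct, and the test helper are stand-ins for Xen's types.]

```c
#include <assert.h>

#define NR_CPUS 256
#define BITS_PER_LONG (sizeof(unsigned long) * 8)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

typedef struct { unsigned long bits[BITS_TO_LONGS(NR_CPUS)]; } cpumask_t;

/* Lives in .rodata: no stack allocation and no runtime
 * cpumask_setall() needed.  The "[0 ... N]" range designator is a GNU
 * extension (GCC/clang), which Xen already relies on. */
static const cpumask_t cpumask_all = {
    .bits[0 ... (BITS_TO_LONGS(NR_CPUS) - 1)] = ~0UL
};

static int cpu_test(const cpumask_t *m, unsigned int cpu)
{
    return (m->bits[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1;
}
```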

Signed-off-by: Juergen Gross 
---
 xen/arch/x86/io_apic.c|  4 +---
 xen/common/cpu.c  |  4 
 xen/common/schedule.c | 17 -
 xen/include/xen/cpumask.h |  2 ++
 4 files changed, 11 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index daa5e9e5ff..a5344ed727 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1881,7 +1881,6 @@ static void __init check_timer(void)
 int apic1, pin1, apic2, pin2;
 int vector, ret;
 unsigned long flags;
-cpumask_t mask_all;
 
 local_irq_save(flags);
 
@@ -1892,8 +1891,7 @@ static void __init check_timer(void)
 vector = IRQ0_VECTOR;
 clear_irq_vector(0);
 
-cpumask_setall(&mask_all);
-if ((ret = bind_irq_vector(0, vector, &mask_all)))
+if ((ret = bind_irq_vector(0, vector, &cpumask_all)))
 printk(KERN_ERR"..IRQ0 is not set correctly with ioapic!!!, err:%d\n", 
ret);
 
 irq_desc[0].status &= ~IRQ_DISABLED;
diff --git a/xen/common/cpu.c b/xen/common/cpu.c
index 653a56b840..836c62f97f 100644
--- a/xen/common/cpu.c
+++ b/xen/common/cpu.c
@@ -11,6 +11,10 @@ unsigned int __read_mostly nr_cpumask_bits
 = BITS_TO_LONGS(NR_CPUS) * BITS_PER_LONG;
 #endif
 
+const cpumask_t cpumask_all = {
+.bits[0 ... (BITS_TO_LONGS(NR_CPUS) - 1)] = ~0UL
+};
+
 /*
  * cpu_bit_bitmap[] is a special, "compressed" data structure that
 * represents all NR_CPUS bits binary values of 1<<nr.
[...]
 v->processor = processor;
 
@@ -280,9 +277,9 @@ int sched_init_vcpu(struct vcpu *v, unsigned int processor)
  * domain-0 VCPUs, are pinned onto their respective physical CPUs.
  */
 if ( is_idle_domain(d) || d->is_pinned )
-sched_set_affinity(v, cpumask_of(processor), &allcpus);
+sched_set_affinity(v, cpumask_of(processor), &cpumask_all);
 else
-sched_set_affinity(v, &allcpus, &allcpus);
+sched_set_affinity(v, &cpumask_all, &cpumask_all);
 
 /* Idle VCPUs are scheduled immediately, so don't put them in runqueue. */
 if ( is_idle_domain(d) )
@@ -361,7 +358,6 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
 for_each_vcpu ( d, v )
 {
 spinlock_t *lock;
-cpumask_t allcpus;
 
 vcpudata = v->sched_priv;
 
@@ -369,11 +365,9 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
 migrate_timer(&v->singleshot_timer, new_p);
 migrate_timer(&v->poll_timer, new_p);
 
-cpumask_setall(&allcpus);
-
 lock = vcpu_schedule_lock_irq(v);
 
-sched_set_affinity(v, &allcpus, &allcpus);
+sched_set_affinity(v, &cpumask_all, &cpumask_all);
 
 v->processor = new_p;
 /*
@@ -812,8 +806,6 @@ int cpu_disable_scheduler(unsigned int cpu)
 if ( cpumask_empty(&online_affinity) &&
  cpumask_test_cpu(cpu, v->cpu_hard_affinity) )
 {
-cpumask_t allcpus;
-
 if ( v->affinity_broken )
 {
 /* The vcpu is temporarily pinned, can't move it. */
@@ -831,8 +823,7 @@ int cpu_disable_scheduler(unsigned int cpu)
 else
 printk(XENLOG_DEBUG "Breaking affinity for %pv\n", v);
 
-cpumask_setall(&allcpus);
-sched_set_affinity(v, &allcpus, NULL);
+sched_set_affinity(v, &cpumask_all, NULL);
 }
 
 if ( v->processor != cpu )
diff --git a/xen/include/xen/cpumask.h b/xen/include/xen/cpumask.h
index b4cc92a4f5..5a43438988 100644
--- a/xen/include/xen/cpumask.h
+++ b/xen/include/xen/cpumask.h
@@ -75,6 +75,8 @@
 
 typedef struct cpumask{ DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
 
+extern const cpumask_t cpumask_all;
+
 extern unsigned int nr_cpu_ids;
 
 #if NR_CPUS > 4 * BITS_PER_LONG
-- 
2.16.4



[Xen-devel] [PATCH 2/2] xen/sched: don't disable scheduler on cpus during suspend

2019-03-14 Thread Juergen Gross
Today there is special handling in cpu_disable_scheduler() for suspend
by forcing all vcpus to the boot cpu. In fact there is no need for that
as during resume the vcpus are put on the correct cpus again.

So we can just omit the call of cpu_disable_scheduler() when offlining
a cpu due to suspend and on resuming we can omit taking the schedule
lock for selecting the new processor.

In restore_vcpu_affinity() we should be careful when applying affinity
as the cpu might not have come back to life. This in turn enables us
to even support affinity_broken across suspend/resume.

Signed-off-by: Juergen Gross 
---
 xen/common/cpupool.c  |   2 +-
 xen/common/schedule.c | 146 ++
 2 files changed, 42 insertions(+), 106 deletions(-)

diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c
index e89bb67e71..7b5ce18426 100644
--- a/xen/common/cpupool.c
+++ b/xen/common/cpupool.c
@@ -313,7 +313,7 @@ static long cpupool_unassign_cpu_helper(void *info)
  * cpu_disable_scheduler(), and at the bottom of this function.
  */
 rcu_read_lock(&domlist_read_lock);
-ret = cpu_disable_scheduler(cpu);
+ret = (system_state == SYS_STATE_suspend) ? 0 : cpu_disable_scheduler(cpu);
 cpumask_set_cpu(cpu, &cpupool_free_cpus);
 
 /*
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 60755a631e..1ab4f182ce 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -560,33 +560,6 @@ static void vcpu_move_locked(struct vcpu *v, unsigned int 
new_cpu)
 v->processor = new_cpu;
 }
 
-/*
- * Move a vcpu from its current processor to a target new processor,
- * without asking the scheduler to do any placement. This is intended
- * for being called from special contexts, where things are quiet
- * enough that no contention is supposed to happen (i.e., during
- * shutdown or software suspend, like ACPI S3).
- */
-static void vcpu_move_nosched(struct vcpu *v, unsigned int new_cpu)
-{
-unsigned long flags;
-spinlock_t *lock, *new_lock;
-
-ASSERT(system_state == SYS_STATE_suspend);
-ASSERT(!vcpu_runnable(v) && (atomic_read(&v->pause_count) ||
- atomic_read(&v->domain->pause_count)));
-
-lock = per_cpu(schedule_data, v->processor).schedule_lock;
-new_lock = per_cpu(schedule_data, new_cpu).schedule_lock;
-
-sched_spin_lock_double(lock, new_lock, &flags);
-ASSERT(new_cpu != v->processor);
-vcpu_move_locked(v, new_cpu);
-sched_spin_unlock_double(lock, new_lock, flags);
-
-sched_move_irqs(v);
-}
-
 /*
  * Initiating migration
  *
@@ -735,31 +708,36 @@ void restore_vcpu_affinity(struct domain *d)
 
 ASSERT(!vcpu_runnable(v));
 
-lock = vcpu_schedule_lock_irq(v);
-
-if ( v->affinity_broken )
-{
-sched_set_affinity(v, v->cpu_hard_affinity_saved, NULL);
-v->affinity_broken = 0;
-
-}
-
 /*
- * During suspend (in cpu_disable_scheduler()), we moved every vCPU
- * to BSP (which, as of now, is pCPU 0), as a temporary measure to
- * allow the nonboot processors to have their data structure freed
- * and go to sleep. But nothing guardantees that the BSP is a valid
- * pCPU for a particular domain.
+ * Re-assign the initial processor as after resume we have no
+ * guarantee the old processor has come back to life again.
  *
  * Therefore, here, before actually unpausing the domains, we should
  * set v->processor of each of their vCPUs to something that will
  * make sense for the scheduler of the cpupool in which they are in.
  */
 cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
-cpupool_domain_cpumask(v->domain));
-v->processor = cpumask_any(cpumask_scratch_cpu(cpu));
+cpupool_domain_cpumask(d));
+if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
+{
+if ( v->affinity_broken )
+{
+sched_set_affinity(v, v->cpu_hard_affinity_saved, NULL);
+v->affinity_broken = 0;
+cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
+cpupool_domain_cpumask(d));
+}
 
-spin_unlock_irq(lock);
+if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
+{
+printk(XENLOG_DEBUG "Breaking affinity for %pv\n", v);
+sched_set_affinity(v, &cpumask_all, NULL);
+cpumask_and(cpumask_scratch_cpu(cpu), v->cpu_hard_affinity,
+cpupool_domain_cpumask(d));
+}
+}
+
+v->processor = cpumask_any(cpumask_scratch_cpu(cpu));
 
 lock = vcpu_schedule_lock_irq(v);
 v->processor = SCHED_OP(vcpu_scheduler(v), pick_cpu, v);
@@ -782,7 +760,6 @@ int cpu_disable_scheduler(unsigned int cpu)
 struct vcpu *v;
 struct cpupool *c;
 cpumas

Re: [Xen-devel] [PATCH 1/2] xen: introduce a cpumask with all bits set

2019-03-14 Thread Andrew Cooper
On 14/03/2019 09:59, Juergen Gross wrote:
> There are several places in Xen allocating a cpumask on the stack and
> setting all bits in it just to use it as an initial mask for allowing
> all cpus.
>
> Save the stack space and omit the need for runtime initialization by
> defining a globally accessible cpumask_all variable.
>
> Signed-off-by: Juergen Gross 

Reviewed-by: Andrew Cooper , with one minor
style suggestions which can be folded on commit.

> diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
> index daa5e9e5ff..a5344ed727 100644
> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -1892,8 +1891,7 @@ static void __init check_timer(void)
>  vector = IRQ0_VECTOR;
>  clear_irq_vector(0);
>  
> -cpumask_setall(&mask_all);
> -if ((ret = bind_irq_vector(0, vector, &mask_all)))
> +if ((ret = bind_irq_vector(0, vector, &cpumask_all)))

Spaces, seeing as we're modifying the line.

~Andrew


Re: [Xen-devel] [PATCH 1/2] xen: introduce a cpumask with all bits set

2019-03-14 Thread Juergen Gross
On 14/03/2019 11:04, Andrew Cooper wrote:
> On 14/03/2019 09:59, Juergen Gross wrote:
>> There are several places in Xen allocating a cpumask on the stack and
>> setting all bits in it just to use it as an initial mask for allowing
>> all cpus.
>>
>> Save the stack space and omit the need for runtime initialization by
>> defining a globally accessible cpumask_all variable.
>>
>> Signed-off-by: Juergen Gross 
> 
> Reviewed-by: Andrew Cooper , with one minor
> style suggestions which can be folded on commit.
> 
>> diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
>> index daa5e9e5ff..a5344ed727 100644
>> --- a/xen/arch/x86/io_apic.c
>> +++ b/xen/arch/x86/io_apic.c
>> @@ -1892,8 +1891,7 @@ static void __init check_timer(void)
>>  vector = IRQ0_VECTOR;
>>  clear_irq_vector(0);
>>  
>> -cpumask_setall(&mask_all);
>> -if ((ret = bind_irq_vector(0, vector, &mask_all)))
>> +if ((ret = bind_irq_vector(0, vector, &cpumask_all)))
> 
> Spaces, seeing as we're modifying the line.

You are aware this file originates from the Linux kernel?

I really don't mind either way, I just wanted to mention it...


Juergen


Re: [Xen-devel] [PATCH for-next 7/9] xen: Convert is_xen_heap_mfn to use typesafe MFN

2019-03-14 Thread Julien Grall

Hi Jan,

On 3/14/19 7:52 AM, Jan Beulich wrote:

On 13.03.19 at 18:24,  wrote:

On 13/03/2019 15:04, Jan Beulich wrote:

On 18.02.19 at 12:35,  wrote:

--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2121,9 +2121,9 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe)
* Yuk! Ensure there is a one-page buffer between Xen and Dom zones, to
* prevent merging of power-of-two blocks across the zone boundary.
*/
-if ( ps && !is_xen_heap_mfn(paddr_to_pfn(ps)-1) )
+if ( ps && !is_xen_heap_mfn(_mfn(paddr_to_pfn(ps)-1)) )
   ps += PAGE_SIZE;
-if ( !is_xen_heap_mfn(paddr_to_pfn(pe)) )
+if ( !is_xen_heap_mfn(maddr_to_mfn(pe)) )


Why maddr_to_mfn() here but still paddr_to_pfn() above? Oh,
we don't have any mfn_sub(), I see.


Yes we don't have mfn_sub() (or even gfn_sub()). I only found a couple of places
where such helpers might be useful. I can introduce the 2 helpers if you think
it is worth it.


Well, I guess in the end I'm fine either way. It simply struck me
as odd at the first glance that you use maddr_to_mfn() in one
case but not the other.


I wanted to avoid adding mfn_x(...) on the line:

_mfn(mfn_x(maddr_to_mfn(ps)) - 1)

I will look to introduce mfn_sub()/gfn_sub().

Cheers,

--
Julien Grall


Re: [Xen-devel] [PATCH for-next 7/9] xen: Convert is_xen_heap_mfn to use typesafe MFN

2019-03-14 Thread Andrew Cooper
On 14/03/2019 10:12, Julien Grall wrote:
> Hi Jan,
>
> On 3/14/19 7:52 AM, Jan Beulich wrote:
> On 13.03.19 at 18:24,  wrote:
>>> On 13/03/2019 15:04, Jan Beulich wrote:
>>> On 18.02.19 at 12:35,  wrote:
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -2121,9 +2121,9 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe)
>     * Yuk! Ensure there is a one-page buffer between Xen and
> Dom zones, to
>     * prevent merging of power-of-two blocks across the zone
> boundary.
>     */
> -    if ( ps && !is_xen_heap_mfn(paddr_to_pfn(ps)-1) )
> +    if ( ps && !is_xen_heap_mfn(_mfn(paddr_to_pfn(ps)-1)) )
>    ps += PAGE_SIZE;
> -    if ( !is_xen_heap_mfn(paddr_to_pfn(pe)) )
> +    if ( !is_xen_heap_mfn(maddr_to_mfn(pe)) )

 Why maddr_to_mfn() here but still paddr_to_pfn() above? Oh,
 we don't have any mfn_sub(), I see.
>>>
>>> Yes we don't have mfn_sub() (or even gfn_sub()). I only found a
>>> couple of places
>>> where such helpers might be useful. I can introduce the 2 helpers if
>>> you think
>>> it is worth it.
>>
>> Well, I guess in the end I'm fine either way. It simply struck me
>> as odd at the first glance that you use maddr_to_mfn() in one
>> case but not the other.
>
> I wanted to avoid adding mfn_x(...) on the line:
>
> _mfn(mfn_x(maddr_to_mfn(ps)) - 1)
>
> I will look to introduce mfn_sub()/gfn_sub().

You do know that mfn_add(mfn, -1) will DTRT?
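[Editor's note: a simplified model of why mfn_add(mfn, -1) does the right thing; this mimics, but is not, Xen's actual typesafe MFN wrappers.]

```c
#include <assert.h>

/* The offset is added to an unsigned frame number; -1 converted to
 * unsigned long becomes ULONG_MAX, and the addition wraps modulo 2^N,
 * which is exactly "subtract one". */

typedef struct { unsigned long m; } mfn_t;

static mfn_t _mfn(unsigned long m) { mfn_t x = { m }; return x; }
static unsigned long mfn_x(mfn_t m) { return m.m; }

static mfn_t mfn_add(mfn_t m, unsigned long offset)
{
    return _mfn(mfn_x(m) + offset);     /* -1 wraps to subtract one */
}
```

So a dedicated mfn_sub()/gfn_sub() is a readability nicety rather than a functional necessity.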

~Andrew


Re: [Xen-devel] [PATCH for-next 7/9] xen: Convert is_xen_heap_mfn to use typesafe MFN

2019-03-14 Thread Julien Grall

Hi Andrew,

On 3/14/19 10:14 AM, Andrew Cooper wrote:

On 14/03/2019 10:12, Julien Grall wrote:

Hi Jan,

On 3/14/19 7:52 AM, Jan Beulich wrote:

On 13.03.19 at 18:24,  wrote:

On 13/03/2019 15:04, Jan Beulich wrote:

On 18.02.19 at 12:35,  wrote:

--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2121,9 +2121,9 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe)
     * Yuk! Ensure there is a one-page buffer between Xen and
Dom zones, to
     * prevent merging of power-of-two blocks across the zone
boundary.
     */
-    if ( ps && !is_xen_heap_mfn(paddr_to_pfn(ps)-1) )
+    if ( ps && !is_xen_heap_mfn(_mfn(paddr_to_pfn(ps)-1)) )
    ps += PAGE_SIZE;
-    if ( !is_xen_heap_mfn(paddr_to_pfn(pe)) )
+    if ( !is_xen_heap_mfn(maddr_to_mfn(pe)) )


Why maddr_to_mfn() here but still paddr_to_pfn() above? Oh,
we don't have any mfn_sub(), I see.


Yes we don't have mfn_sub() (or even gfn_sub()). I only found a
couple of places
where such helpers might be useful. I can introduce the 2 helpers if
you think
it is worth it.


Well, I guess in the end I'm fine either way. It simply struck me
as odd at the first glance that you use maddr_to_mfn() in one
case but not the other.


I wanted to avoid adding mfn_x(...) on the line:

_mfn(mfn_x(maddr_to_mfn(ps)) - 1)

I will look to introduce mfn_sub()/gfn_sub().


You do know that mfn_add(mfn, -1) will DTRT?


It didn't occur to me until you said it. I will use that then.

Cheers,

--
Julien Grall


Re: [Xen-devel] [PATCH 1/2] xen: introduce a cpumask with all bits set

2019-03-14 Thread George Dunlap
On 3/14/19 9:59 AM, Juergen Gross wrote:
> There are several places in Xen allocating a cpumask on the stack and
> setting all bits in it just to use it as an initial mask for allowing
> all cpus.
> 
> Save the stack space and omit the need for runtime initialization by
> defining a globally accessible cpumask_all variable.
> 
> Signed-off-by: Juergen Gross 

Acked-by: George Dunlap 


[Xen-devel] [libvirt test] 133757: regressions - FAIL

2019-03-14 Thread osstest service owner
flight 133757 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/133757/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt 6 libvirt-build  fail REGR. vs. 133672

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt 14 saverestore-support-check  fail  like 133672
 test-armhf-armhf-libvirt-raw 13 saverestore-support-check  fail  like 133672
 test-arm64-arm64-libvirt-xsm 13 migrate-support-check  fail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-check  fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-check  fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check  fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt-qcow2 12 migrate-support-check  fail  never pass
 test-arm64-arm64-libvirt-qcow2 13 saverestore-support-check  fail  never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check  fail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-check  fail   never pass
 test-arm64-arm64-libvirt 13 migrate-support-check  fail   never pass
 test-arm64-arm64-libvirt 14 saverestore-support-check  fail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-check  fail   never pass

version targeted for testing:
 libvirt  c615c14246880c91afa3fe1659dc2104447de601
baseline version:
 libvirt  7a05c739c26decb8ff0eef4f6c75ce3ef729532d

Last test of basis   133672  2019-03-09 13:00:19 Z    4 days
Failing since        133728  2019-03-12 04:19:12 Z    2 days    2 attempts
Testing same since   133757  2019-03-13 07:07:24 Z    1 days    1 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Eric Blake 
  Laszlo Ersek 
  Martin Kletzander 
  Michal Privoznik 
  Peter Krempa 
  Shotaro Gotanda 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   fail
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmblocked 
 test-amd64-amd64-libvirt-xsm pass
 test-arm64-arm64-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  blocked 
 test-amd64-amd64-libvirt pass
 test-arm64-arm64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  blocked 
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair blocked 
 test-arm64-arm64-libvirt-qcow2   pass
 test-armhf-armhf-libvirt-raw pass
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=o

Re: [Xen-devel] [PATCH] tools/cpuid: Update xen-cpuid for Coffee Lake processors

2019-03-14 Thread Andrew Cooper
On 14/03/2019 08:20, Jan Beulich wrote:
 On 13.03.19 at 20:37,  wrote:
>> --- a/tools/misc/xen-cpuid.c
>> +++ b/tools/misc/xen-cpuid.c
>> @@ -127,6 +127,8 @@ static const char *str_7c0[32] =
>>  [14] = "avx512_vpopcntdq",
>>  
>>  [22] = "rdpid",
>> +
>> +[30] = "sgx-lc",
>>  };
>>  
>>  static const char *str_e7d[32] =
> Hmm, to be honest I don't see why this needs to go in ahead of
> https://lists.xenproject.org/archives/html/xen-devel/2018-12/msg01813.html
> posted almost 3 months ago, and already taking care of the bit.

Because it's hidden at the end of a series I haven't gotten to the end of
yet.  But yes - it does look to be a more complete patch, so let's start
there.

~Andrew


Re: [Xen-devel] [PATCH v7 49/49] tools: re-sync CPUID leaf 7 tables

2019-03-14 Thread Andrew Cooper
On 19/12/2018 15:08, Jan Beulich wrote:
> Bring libxl's in line with the public header, and update xen-cpuid's to
> the latest information available in Intel's documentation (SDM ver 068
> and ISA extensions ver 035), with (as before) the exception on MAWAU.
>
> Some pre-existing strings get changed to match SDM naming. This should
> be benign in xen-cpuid, and I hope it's also acceptable in libxl, where
> people actually using the slightly wrong names would have to update
> their guest config files.
>
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper 


[Xen-devel] [PATCH v6 08/11] viridian: stop directly calling viridian_time_ref_count_freeze/thaw()...

2019-03-14 Thread Paul Durrant
...from arch_domain_shutdown/pause/unpause().

A subsequent patch will introduce an implementation of synthetic timers
which will also need freeze/thaw hooks, so make the exported hooks more
generic and call through to (re-named and static) time_ref_count_freeze/thaw
functions.

NOTE: This patch also introduces a new time_ref_count() helper to return
  the current counter value. This is currently only used by the MSR
  read handler but the synthetic timer code will also need to use it.

Signed-off-by: Paul Durrant 
Reviewed-by: Wei Liu 
Acked-by: Jan Beulich 
---
Cc: Andrew Cooper 
Cc: "Roger Pau Monné" 
---
 xen/arch/x86/domain.c  | 12 ++--
 xen/arch/x86/hvm/viridian/time.c   | 24 +---
 xen/include/asm-x86/hvm/viridian.h |  4 ++--
 3 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 8d579e2cf9..02afa7518e 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -657,20 +657,20 @@ void arch_domain_destroy(struct domain *d)
 
 void arch_domain_shutdown(struct domain *d)
 {
-if ( has_viridian_time_ref_count(d) )
-viridian_time_ref_count_freeze(d);
+if ( is_viridian_domain(d) )
+viridian_time_domain_freeze(d);
 }
 
 void arch_domain_pause(struct domain *d)
 {
-if ( has_viridian_time_ref_count(d) )
-viridian_time_ref_count_freeze(d);
+if ( is_viridian_domain(d) )
+viridian_time_domain_freeze(d);
 }
 
 void arch_domain_unpause(struct domain *d)
 {
-if ( has_viridian_time_ref_count(d) )
-viridian_time_ref_count_thaw(d);
+if ( is_viridian_domain(d) )
+viridian_time_domain_thaw(d);
 }
 
 int arch_domain_soft_reset(struct domain *d)
diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c
index 16fe41d411..71291d921c 100644
--- a/xen/arch/x86/hvm/viridian/time.c
+++ b/xen/arch/x86/hvm/viridian/time.c
@@ -91,7 +91,7 @@ static int64_t raw_trc_val(const struct domain *d)
 return scale_delta(tsc, &tsc_to_ns) / 100ul;
 }
 
-void viridian_time_ref_count_freeze(const struct domain *d)
+static void time_ref_count_freeze(const struct domain *d)
 {
 struct viridian_time_ref_count *trc =
 &d->arch.hvm.viridian->time_ref_count;
@@ -100,7 +100,7 @@ void viridian_time_ref_count_freeze(const struct domain *d)
 trc->val = raw_trc_val(d) + trc->off;
 }
 
-void viridian_time_ref_count_thaw(const struct domain *d)
+static void time_ref_count_thaw(const struct domain *d)
 {
 struct viridian_time_ref_count *trc =
 &d->arch.hvm.viridian->time_ref_count;
@@ -110,6 +110,24 @@ void viridian_time_ref_count_thaw(const struct domain *d)
 trc->off = (int64_t)trc->val - raw_trc_val(d);
 }
 
+static int64_t time_ref_count(const struct domain *d)
+{
+struct viridian_time_ref_count *trc =
+&d->arch.hvm.viridian->time_ref_count;
+
+return raw_trc_val(d) + trc->off;
+}
+
+void viridian_time_domain_freeze(const struct domain *d)
+{
+time_ref_count_freeze(d);
+}
+
+void viridian_time_domain_thaw(const struct domain *d)
+{
+time_ref_count_thaw(d);
+}
+
 int viridian_time_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val)
 {
 struct domain *d = v->domain;
@@ -179,7 +197,7 @@ int viridian_time_rdmsr(const struct vcpu *v, uint32_t idx, 
uint64_t *val)
 printk(XENLOG_G_INFO "d%d: VIRIDIAN MSR_TIME_REF_COUNT: 
accessed\n",
d->domain_id);
 
-*val = raw_trc_val(d) + trc->off;
+*val = time_ref_count(d);
 break;
 }
 
diff --git a/xen/include/asm-x86/hvm/viridian.h 
b/xen/include/asm-x86/hvm/viridian.h
index c65c044191..8146e2fc46 100644
--- a/xen/include/asm-x86/hvm/viridian.h
+++ b/xen/include/asm-x86/hvm/viridian.h
@@ -77,8 +77,8 @@ int guest_rdmsr_viridian(const struct vcpu *v, uint32_t idx, 
uint64_t *val);
 int
 viridian_hypercall(struct cpu_user_regs *regs);
 
-void viridian_time_ref_count_freeze(const struct domain *d);
-void viridian_time_ref_count_thaw(const struct domain *d);
+void viridian_time_domain_freeze(const struct domain *d);
+void viridian_time_domain_thaw(const struct domain *d);
 
 int viridian_vcpu_init(struct vcpu *v);
 int viridian_domain_init(struct domain *d);
-- 
2.20.1



[Xen-devel] [PATCH v6 03/11] viridian: use stack variables for viridian_vcpu and viridian_domain...

2019-03-14 Thread Paul Durrant
...where there is more than one dereference inside a function.

This shortens the code and makes it more readable. No functional change.

Signed-off-by: Paul Durrant 
Reviewed-by: Jan Beulich 
---
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 

v4:
 - New in v4
---
 xen/arch/x86/hvm/viridian/synic.c| 49 
 xen/arch/x86/hvm/viridian/time.c | 27 ---
 xen/arch/x86/hvm/viridian/viridian.c | 47 +-
 3 files changed, 69 insertions(+), 54 deletions(-)

diff --git a/xen/arch/x86/hvm/viridian/synic.c 
b/xen/arch/x86/hvm/viridian/synic.c
index 28eda7798c..f3d9f7ae74 100644
--- a/xen/arch/x86/hvm/viridian/synic.c
+++ b/xen/arch/x86/hvm/viridian/synic.c
@@ -30,7 +30,8 @@ typedef union _HV_VP_ASSIST_PAGE
 
 void viridian_apic_assist_set(const struct vcpu *v)
 {
-HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian->vp_assist.ptr;
+struct viridian_vcpu *vv = v->arch.hvm.viridian;
+HV_VP_ASSIST_PAGE *ptr = vv->vp_assist.ptr;
 
 if ( !ptr )
 return;
@@ -40,25 +41,25 @@ void viridian_apic_assist_set(const struct vcpu *v)
  * wrong and the VM will most likely hang so force a crash now
  * to make the problem clear.
  */
-if ( v->arch.hvm.viridian->apic_assist_pending )
+if ( vv->apic_assist_pending )
 domain_crash(v->domain);
 
-v->arch.hvm.viridian->apic_assist_pending = true;
+vv->apic_assist_pending = true;
 ptr->ApicAssist.no_eoi = 1;
 }
 
 bool viridian_apic_assist_completed(const struct vcpu *v)
 {
-HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian->vp_assist.ptr;
+struct viridian_vcpu *vv = v->arch.hvm.viridian;
+HV_VP_ASSIST_PAGE *ptr = vv->vp_assist.ptr;
 
 if ( !ptr )
 return false;
 
-if ( v->arch.hvm.viridian->apic_assist_pending &&
- !ptr->ApicAssist.no_eoi )
+if ( vv->apic_assist_pending && !ptr->ApicAssist.no_eoi )
 {
 /* An EOI has been avoided */
-v->arch.hvm.viridian->apic_assist_pending = false;
+vv->apic_assist_pending = false;
 return true;
 }
 
@@ -67,17 +68,20 @@ bool viridian_apic_assist_completed(const struct vcpu *v)
 
 void viridian_apic_assist_clear(const struct vcpu *v)
 {
-HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian->vp_assist.ptr;
+struct viridian_vcpu *vv = v->arch.hvm.viridian;
+HV_VP_ASSIST_PAGE *ptr = vv->vp_assist.ptr;
 
 if ( !ptr )
 return;
 
 ptr->ApicAssist.no_eoi = 0;
-v->arch.hvm.viridian->apic_assist_pending = false;
+vv->apic_assist_pending = false;
 }
 
 int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val)
 {
+struct viridian_vcpu *vv = v->arch.hvm.viridian;
+
 switch ( idx )
 {
 case HV_X64_MSR_EOI:
@@ -95,12 +99,11 @@ int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, 
uint64_t val)
 
 case HV_X64_MSR_VP_ASSIST_PAGE:
 /* release any previous mapping */
-viridian_unmap_guest_page(&v->arch.hvm.viridian->vp_assist);
-v->arch.hvm.viridian->vp_assist.msr.raw = val;
-viridian_dump_guest_page(v, "VP_ASSIST",
- &v->arch.hvm.viridian->vp_assist);
-if ( v->arch.hvm.viridian->vp_assist.msr.fields.enabled )
-viridian_map_guest_page(v, &v->arch.hvm.viridian->vp_assist);
+viridian_unmap_guest_page(&vv->vp_assist);
+vv->vp_assist.msr.raw = val;
+viridian_dump_guest_page(v, "VP_ASSIST", &vv->vp_assist);
+if ( vv->vp_assist.msr.fields.enabled )
+viridian_map_guest_page(v, &vv->vp_assist);
 break;
 
 default:
@@ -146,18 +149,22 @@ int viridian_synic_rdmsr(const struct vcpu *v, uint32_t 
idx, uint64_t *val)
 void viridian_synic_save_vcpu_ctxt(const struct vcpu *v,
struct hvm_viridian_vcpu_context *ctxt)
 {
-ctxt->apic_assist_pending = v->arch.hvm.viridian->apic_assist_pending;
-ctxt->vp_assist_msr = v->arch.hvm.viridian->vp_assist.msr.raw;
+const struct viridian_vcpu *vv = v->arch.hvm.viridian;
+
+ctxt->apic_assist_pending = vv->apic_assist_pending;
+ctxt->vp_assist_msr = vv->vp_assist.msr.raw;
 }
 
 void viridian_synic_load_vcpu_ctxt(
 struct vcpu *v, const struct hvm_viridian_vcpu_context *ctxt)
 {
-v->arch.hvm.viridian->vp_assist.msr.raw = ctxt->vp_assist_msr;
-if ( v->arch.hvm.viridian->vp_assist.msr.fields.enabled )
-viridian_map_guest_page(v, &v->arch.hvm.viridian->vp_assist);
+struct viridian_vcpu *vv = v->arch.hvm.viridian;
+
+vv->vp_assist.msr.raw = ctxt->vp_assist_msr;
+if ( vv->vp_assist.msr.fields.enabled )
+viridian_map_guest_page(v, &vv->vp_assist);
 
-v->arch.hvm.viridian->apic_assist_pending = ctxt->apic_assist_pending;
+vv->apic_assist_pending = ctxt->apic_assist_pending;
 }
 
 /*
diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c
index a7e94aadf0..76f9612001 100644
--- a/xen/arch/x86/hvm/viridian/time.c
+++ b/xen/arch/x86/hvm/

[Xen-devel] [PATCH v6 04/11] viridian: make 'fields' struct anonymous...

2019-03-14 Thread Paul Durrant
...inside viridian_page_msr and viridian_guest_os_id_msr unions.

There's no need to name it and the code is shortened by not doing so.
No functional change.

Signed-off-by: Paul Durrant 
Reviewed-by: Jan Beulich 
---
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 

v4:
 - New in v4
---
 xen/arch/x86/hvm/viridian/synic.c|  4 ++--
 xen/arch/x86/hvm/viridian/time.c | 10 +-
 xen/arch/x86/hvm/viridian/viridian.c | 20 +---
 xen/include/asm-x86/hvm/viridian.h   |  4 ++--
 4 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/hvm/viridian/synic.c 
b/xen/arch/x86/hvm/viridian/synic.c
index f3d9f7ae74..05d971b365 100644
--- a/xen/arch/x86/hvm/viridian/synic.c
+++ b/xen/arch/x86/hvm/viridian/synic.c
@@ -102,7 +102,7 @@ int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, 
uint64_t val)
 viridian_unmap_guest_page(&vv->vp_assist);
 vv->vp_assist.msr.raw = val;
 viridian_dump_guest_page(v, "VP_ASSIST", &vv->vp_assist);
-if ( vv->vp_assist.msr.fields.enabled )
+if ( vv->vp_assist.msr.enabled )
 viridian_map_guest_page(v, &vv->vp_assist);
 break;
 
@@ -161,7 +161,7 @@ void viridian_synic_load_vcpu_ctxt(
 struct viridian_vcpu *vv = v->arch.hvm.viridian;
 
 vv->vp_assist.msr.raw = ctxt->vp_assist_msr;
-if ( vv->vp_assist.msr.fields.enabled )
+if ( vv->vp_assist.msr.enabled )
 viridian_map_guest_page(v, &vv->vp_assist);
 
 vv->apic_assist_pending = ctxt->apic_assist_pending;
diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c
index 76f9612001..909a3fb9e3 100644
--- a/xen/arch/x86/hvm/viridian/time.c
+++ b/xen/arch/x86/hvm/viridian/time.c
@@ -29,16 +29,16 @@ static void dump_reference_tsc(const struct domain *d)
 {
 const union viridian_page_msr *rt = &d->arch.hvm.viridian->reference_tsc;
 
-if ( !rt->fields.enabled )
+if ( !rt->enabled )
 return;
 
 printk(XENLOG_G_INFO "d%d: VIRIDIAN REFERENCE_TSC: pfn: %lx\n",
-   d->domain_id, (unsigned long)rt->fields.pfn);
+   d->domain_id, (unsigned long)rt->pfn);
 }
 
 static void update_reference_tsc(struct domain *d, bool initialize)
 {
-unsigned long gmfn = d->arch.hvm.viridian->reference_tsc.fields.pfn;
+unsigned long gmfn = d->arch.hvm.viridian->reference_tsc.pfn;
 struct page_info *page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC);
 HV_REFERENCE_TSC_PAGE *p;
 
@@ -151,7 +151,7 @@ int viridian_time_wrmsr(struct vcpu *v, uint32_t idx, 
uint64_t val)
 
 vd->reference_tsc.raw = val;
 dump_reference_tsc(d);
-if ( vd->reference_tsc.fields.enabled )
+if ( vd->reference_tsc.enabled )
 update_reference_tsc(d, true);
 break;
 
@@ -232,7 +232,7 @@ void viridian_time_load_domain_ctxt(
 vd->time_ref_count.val = ctxt->time_ref_count;
 vd->reference_tsc.raw = ctxt->reference_tsc;
 
-if ( vd->reference_tsc.fields.enabled )
+if ( vd->reference_tsc.enabled )
 update_reference_tsc(d, false);
 }
 
diff --git a/xen/arch/x86/hvm/viridian/viridian.c 
b/xen/arch/x86/hvm/viridian/viridian.c
index 710470fed7..1a20d68aaf 100644
--- a/xen/arch/x86/hvm/viridian/viridian.c
+++ b/xen/arch/x86/hvm/viridian/viridian.c
@@ -192,7 +192,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t 
leaf,
 
 case 4:
 /* Recommended hypercall usage. */
-if ( vd->guest_os_id.raw == 0 || vd->guest_os_id.fields.os < 4 )
+if ( vd->guest_os_id.raw == 0 || vd->guest_os_id.os < 4 )
 break;
 res->a = CPUID4A_RELAX_TIMER_INT;
 if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
@@ -228,10 +228,8 @@ static void dump_guest_os_id(const struct domain *d)
 
 printk(XENLOG_G_INFO
"d%d: VIRIDIAN GUEST_OS_ID: vendor: %x os: %x major: %x minor: %x 
sp: %x build: %x\n",
-   d->domain_id,
-   goi->fields.vendor, goi->fields.os,
-   goi->fields.major, goi->fields.minor,
-   goi->fields.service_pack, goi->fields.build_number);
+   d->domain_id, goi->vendor, goi->os, goi->major, goi->minor,
+   goi->service_pack, goi->build_number);
 }
 
 static void dump_hypercall(const struct domain *d)
@@ -242,12 +240,12 @@ static void dump_hypercall(const struct domain *d)
 
 printk(XENLOG_G_INFO "d%d: VIRIDIAN HYPERCALL: enabled: %x pfn: %lx\n",
d->domain_id,
-   hg->fields.enabled, (unsigned long)hg->fields.pfn);
+   hg->enabled, (unsigned long)hg->pfn);
 }
 
 static void enable_hypercall_page(struct domain *d)
 {
-unsigned long gmfn = d->arch.hvm.viridian->hypercall_gpa.fields.pfn;
+unsigned long gmfn = d->arch.hvm.viridian->hypercall_gpa.pfn;
 struct page_info *page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC);
 uint8_t *p;
 
@@ -297,7 +295,7 @@ int guest_wrmsr_viridian(struct vcpu *v, uint32_t idx, 
uint64_t val)
 case HV_X64_MSR_HYPERCALL:
 vd->hy

[Xen-devel] [PATCH v6 07/11] viridian: use viridian_map/unmap_guest_page() for reference tsc page

2019-03-14 Thread Paul Durrant
Whilst the reference tsc page does not currently need to be kept mapped
after it is initially set up (or updated after migrate), the code can
be simplified by using the common guest page map/unmap and dump functions.
New functionality added by a subsequent patch will also require the page to
be kept mapped for the lifetime of the domain.

NOTE: Because the reference tsc page is per-domain rather than per-vcpu
  this patch also changes viridian_map_guest_page() to take a domain
  pointer rather than a vcpu pointer. The domain pointer cannot be
  const, unlike the vcpu pointer.

Signed-off-by: Paul Durrant 
Reviewed-by: Wei Liu 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: "Roger Pau Monné" 
---
 xen/arch/x86/hvm/viridian/private.h  |  2 +-
 xen/arch/x86/hvm/viridian/synic.c|  6 ++-
 xen/arch/x86/hvm/viridian/time.c | 56 +---
 xen/arch/x86/hvm/viridian/viridian.c |  3 +-
 xen/include/asm-x86/hvm/viridian.h   |  2 +-
 5 files changed, 25 insertions(+), 44 deletions(-)

diff --git a/xen/arch/x86/hvm/viridian/private.h 
b/xen/arch/x86/hvm/viridian/private.h
index 5078b2d2ab..96a784b840 100644
--- a/xen/arch/x86/hvm/viridian/private.h
+++ b/xen/arch/x86/hvm/viridian/private.h
@@ -111,7 +111,7 @@ void viridian_time_load_domain_ctxt(
 
 void viridian_dump_guest_page(const struct vcpu *v, const char *name,
   const struct viridian_page *vp);
-void viridian_map_guest_page(const struct vcpu *v, struct viridian_page *vp);
+void viridian_map_guest_page(struct domain *d, struct viridian_page *vp);
 void viridian_unmap_guest_page(struct viridian_page *vp);
 
 #endif /* X86_HVM_VIRIDIAN_PRIVATE_H */
diff --git a/xen/arch/x86/hvm/viridian/synic.c 
b/xen/arch/x86/hvm/viridian/synic.c
index b8dab4b246..fb560bc162 100644
--- a/xen/arch/x86/hvm/viridian/synic.c
+++ b/xen/arch/x86/hvm/viridian/synic.c
@@ -81,6 +81,7 @@ void viridian_apic_assist_clear(const struct vcpu *v)
 int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val)
 {
 struct viridian_vcpu *vv = v->arch.hvm.viridian;
+struct domain *d = v->domain;
 
 switch ( idx )
 {
@@ -103,7 +104,7 @@ int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, 
uint64_t val)
 vv->vp_assist.msr.raw = val;
 viridian_dump_guest_page(v, "VP_ASSIST", &vv->vp_assist);
 if ( vv->vp_assist.msr.enabled )
-viridian_map_guest_page(v, &vv->vp_assist);
+viridian_map_guest_page(d, &vv->vp_assist);
 break;
 
 default:
@@ -178,10 +179,11 @@ void viridian_synic_load_vcpu_ctxt(
 struct vcpu *v, const struct hvm_viridian_vcpu_context *ctxt)
 {
 struct viridian_vcpu *vv = v->arch.hvm.viridian;
+struct domain *d = v->domain;
 
 vv->vp_assist.msr.raw = ctxt->vp_assist_msr;
 if ( vv->vp_assist.msr.enabled )
-viridian_map_guest_page(v, &vv->vp_assist);
+viridian_map_guest_page(d, &vv->vp_assist);
 
 vv->apic_assist_pending = ctxt->apic_assist_pending;
 }
diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c
index 4399e62f54..16fe41d411 100644
--- a/xen/arch/x86/hvm/viridian/time.c
+++ b/xen/arch/x86/hvm/viridian/time.c
@@ -25,33 +25,10 @@ typedef struct _HV_REFERENCE_TSC_PAGE
 uint64_t Reserved2[509];
 } HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE;
 
-static void dump_reference_tsc(const struct domain *d)
-{
-const union viridian_page_msr *rt = &d->arch.hvm.viridian->reference_tsc;
-
-if ( !rt->enabled )
-return;
-
-printk(XENLOG_G_INFO "d%d: VIRIDIAN REFERENCE_TSC: pfn: %lx\n",
-   d->domain_id, (unsigned long)rt->pfn);
-}
-
 static void update_reference_tsc(struct domain *d, bool initialize)
 {
-unsigned long gmfn = d->arch.hvm.viridian->reference_tsc.pfn;
-struct page_info *page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC);
-HV_REFERENCE_TSC_PAGE *p;
-
-if ( !page || !get_page_type(page, PGT_writable_page) )
-{
-if ( page )
-put_page(page);
-gdprintk(XENLOG_WARNING, "Bad GMFN %#"PRI_gfn" (MFN %#"PRI_mfn")\n",
- gmfn, mfn_x(page ? page_to_mfn(page) : INVALID_MFN));
-return;
-}
-
-p = __map_domain_page(page);
+const struct viridian_page *rt = &d->arch.hvm.viridian->reference_tsc;
+HV_REFERENCE_TSC_PAGE *p = rt->ptr;
 
 if ( initialize )
 clear_page(p);
@@ -82,7 +59,7 @@ static void update_reference_tsc(struct domain *d, bool 
initialize)
 
 printk(XENLOG_G_INFO "d%d: VIRIDIAN REFERENCE_TSC: invalidated\n",
d->domain_id);
-goto out;
+return;
 }
 
 /*
@@ -100,11 +77,6 @@ static void update_reference_tsc(struct domain *d, bool 
initialize)
 if ( p->TscSequence == 0x ||
  p->TscSequence == 0 ) /* Avoid both 'invalid' values */
 p->TscSequence = 1;
-
- out:
-unmap_domain_page(p);
-
-put_page_and_type(page);
 }
 
 static int64_t raw_trc_val(const struct domain *d)
@@ -149,10 +

[Xen-devel] [PATCH v6 00/11] viridian: implement more enlightenments

2019-03-14 Thread Paul Durrant
This series adds three new enlightenments:

- Synthetic timers, which depends on the...
- Synthetic interrupt controller (or SynIC)
- Synthetic cluster IPI

All these enlightenments are implemented in current versions of QEMU/KVM
so this series closes the gap.

Paul Durrant (11):
  viridian: add init hooks
  viridian: separately allocate domain and vcpu structures
  viridian: use stack variables for viridian_vcpu and viridian_domain...
  viridian: make 'fields' struct anonymous...
  viridian: extend init/deinit hooks into synic and time modules
  viridian: add missing context save helpers into synic and time modules
  viridian: use viridian_map/unmap_guest_page() for reference tsc page
  viridian: stop directly calling
viridian_time_ref_count_freeze/thaw()...
  viridian: add implementation of synthetic interrupt MSRs
  viridian: add implementation of synthetic timers
  viridian: add implementation of the HvSendSyntheticClusterIpi
hypercall

 docs/man/xl.cfg.5.pod.in   |  18 +-
 tools/libxl/libxl.h|  18 +
 tools/libxl/libxl_dom.c|  10 +
 tools/libxl/libxl_types.idl|   3 +
 xen/arch/x86/domain.c  |  12 +-
 xen/arch/x86/hvm/hvm.c |  10 +
 xen/arch/x86/hvm/viridian/private.h|  31 +-
 xen/arch/x86/hvm/viridian/synic.c  | 388 +--
 xen/arch/x86/hvm/viridian/time.c   | 512 ++---
 xen/arch/x86/hvm/viridian/viridian.c   | 229 +--
 xen/arch/x86/hvm/vlapic.c  |  31 +-
 xen/include/asm-x86/hvm/domain.h   |   2 +-
 xen/include/asm-x86/hvm/hvm.h  |   7 +
 xen/include/asm-x86/hvm/vcpu.h |   2 +-
 xen/include/asm-x86/hvm/viridian.h |  76 +++-
 xen/include/public/arch-x86/hvm/save.h |   4 +
 xen/include/public/hvm/params.h|  17 +-
 17 files changed, 1227 insertions(+), 143 deletions(-)

v4:
 - Add two cleanup patches (#3 and #4) and re-order #8 and #9

v3:
 - Add the synthetic cluster IPI patch (#11)

---
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: "Roger Pau Monné" 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 
-- 
2.20.1



[Xen-devel] [PATCH v6 06/11] viridian: add missing context save helpers into synic and time modules

2019-03-14 Thread Paul Durrant
Currently the time module lacks vcpu context save helpers and the synic
module lacks domain context save helpers. These helpers are not yet
required, but subsequent patches will require at least some of them, so
this patch completes the set to avoid introducing them in an ad-hoc way.

Signed-off-by: Paul Durrant 
Reviewed-by: Wei Liu 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: "Roger Pau Monné" 

v3:
 - Add missing callers so that they are not added in an ad-hoc way
---
 xen/arch/x86/hvm/viridian/private.h  | 10 ++
 xen/arch/x86/hvm/viridian/synic.c| 10 ++
 xen/arch/x86/hvm/viridian/time.c | 10 ++
 xen/arch/x86/hvm/viridian/viridian.c |  4 
 4 files changed, 34 insertions(+)

diff --git a/xen/arch/x86/hvm/viridian/private.h 
b/xen/arch/x86/hvm/viridian/private.h
index 8c029f62c6..5078b2d2ab 100644
--- a/xen/arch/x86/hvm/viridian/private.h
+++ b/xen/arch/x86/hvm/viridian/private.h
@@ -85,6 +85,11 @@ void viridian_synic_save_vcpu_ctxt(const struct vcpu *v,
 void viridian_synic_load_vcpu_ctxt(
 struct vcpu *v, const struct hvm_viridian_vcpu_context *ctxt);
 
+void viridian_synic_save_domain_ctxt(
+const struct domain *d, struct hvm_viridian_domain_context *ctxt);
+void viridian_synic_load_domain_ctxt(
+struct domain *d, const struct hvm_viridian_domain_context *ctxt);
+
 int viridian_time_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val);
 int viridian_time_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val);
 
@@ -94,6 +99,11 @@ int viridian_time_domain_init(const struct domain *d);
 void viridian_time_vcpu_deinit(const struct vcpu *v);
 void viridian_time_domain_deinit(const struct domain *d);
 
+void viridian_time_save_vcpu_ctxt(
+const struct vcpu *v, struct hvm_viridian_vcpu_context *ctxt);
+void viridian_time_load_vcpu_ctxt(
+struct vcpu *v, const struct hvm_viridian_vcpu_context *ctxt);
+
 void viridian_time_save_domain_ctxt(
 const struct domain *d, struct hvm_viridian_domain_context *ctxt);
 void viridian_time_load_domain_ctxt(
diff --git a/xen/arch/x86/hvm/viridian/synic.c 
b/xen/arch/x86/hvm/viridian/synic.c
index 4b00dbe1b3..b8dab4b246 100644
--- a/xen/arch/x86/hvm/viridian/synic.c
+++ b/xen/arch/x86/hvm/viridian/synic.c
@@ -186,6 +186,16 @@ void viridian_synic_load_vcpu_ctxt(
 vv->apic_assist_pending = ctxt->apic_assist_pending;
 }
 
+void viridian_synic_save_domain_ctxt(
+const struct domain *d, struct hvm_viridian_domain_context *ctxt)
+{
+}
+
+void viridian_synic_load_domain_ctxt(
+struct domain *d, const struct hvm_viridian_domain_context *ctxt)
+{
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c
index 48aca7e0ab..4399e62f54 100644
--- a/xen/arch/x86/hvm/viridian/time.c
+++ b/xen/arch/x86/hvm/viridian/time.c
@@ -233,6 +233,16 @@ void viridian_time_domain_deinit(const struct domain *d)
 {
 }
 
+void viridian_time_save_vcpu_ctxt(
+const struct vcpu *v, struct hvm_viridian_vcpu_context *ctxt)
+{
+}
+
+void viridian_time_load_vcpu_ctxt(
+struct vcpu *v, const struct hvm_viridian_vcpu_context *ctxt)
+{
+}
+
 void viridian_time_save_domain_ctxt(
 const struct domain *d, struct hvm_viridian_domain_context *ctxt)
 {
diff --git a/xen/arch/x86/hvm/viridian/viridian.c 
b/xen/arch/x86/hvm/viridian/viridian.c
index f9a509d918..742a988252 100644
--- a/xen/arch/x86/hvm/viridian/viridian.c
+++ b/xen/arch/x86/hvm/viridian/viridian.c
@@ -707,6 +707,7 @@ static int viridian_save_domain_ctxt(struct vcpu *v,
 return 0;
 
 viridian_time_save_domain_ctxt(d, &ctxt);
+viridian_synic_save_domain_ctxt(d, &ctxt);
 
 return (hvm_save_entry(VIRIDIAN_DOMAIN, 0, h, &ctxt) != 0);
 }
@@ -723,6 +724,7 @@ static int viridian_load_domain_ctxt(struct domain *d,
 vd->hypercall_gpa.raw = ctxt.hypercall_gpa;
 vd->guest_os_id.raw = ctxt.guest_os_id;
 
+viridian_synic_load_domain_ctxt(d, &ctxt);
 viridian_time_load_domain_ctxt(d, &ctxt);
 
 return 0;
@@ -738,6 +740,7 @@ static int viridian_save_vcpu_ctxt(struct vcpu *v, 
hvm_domain_context_t *h)
 if ( !is_viridian_vcpu(v) )
 return 0;
 
+viridian_time_save_vcpu_ctxt(v, &ctxt);
 viridian_synic_save_vcpu_ctxt(v, &ctxt);
 
 return hvm_save_entry(VIRIDIAN_VCPU, v->vcpu_id, h, &ctxt);
@@ -764,6 +767,7 @@ static int viridian_load_vcpu_ctxt(struct domain *d,
 return -EINVAL;
 
 viridian_synic_load_vcpu_ctxt(v, &ctxt);
+viridian_time_load_vcpu_ctxt(v, &ctxt);
 
 return 0;
 }
-- 
2.20.1



[Xen-devel] [PATCH v6 09/11] viridian: add implementation of synthetic interrupt MSRs

2019-03-14 Thread Paul Durrant
This patch introduces an implementation of the SCONTROL, SVERSION, SIEFP,
SIMP, EOM and SINT0-15 SynIC MSRs. No message source is added and, as such,
nothing will yet generate a synthetic interrupt. A subsequent patch will
add an implementation of synthetic timers which will need the infrastructure
added by this patch to deliver expiry messages to the guest.

NOTE: A 'synic' option is added to the toolstack viridian enlightenments
  enumeration but is deliberately not documented as enabling these
  SynIC registers without a message source is only useful for
  debugging.

Signed-off-by: Paul Durrant 
Acked-by: Wei Liu 
---
Cc: Ian Jackson 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Jan Beulich 
Cc: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: "Roger Pau Monné" 

v6:
 - Address further comments from Jan

v4:
 - Address comments from Jan

v3:
 - Add the 'SintPollingModeAvailable' bit in CPUID leaf 3
---
 tools/libxl/libxl.h|   6 +
 tools/libxl/libxl_dom.c|   3 +
 tools/libxl/libxl_types.idl|   1 +
 xen/arch/x86/hvm/viridian/synic.c  | 253 -
 xen/arch/x86/hvm/viridian/viridian.c   |  19 ++
 xen/arch/x86/hvm/vlapic.c  |  31 ++-
 xen/include/asm-x86/hvm/hvm.h  |   3 +
 xen/include/asm-x86/hvm/viridian.h |  27 +++
 xen/include/public/arch-x86/hvm/save.h |   2 +
 xen/include/public/hvm/params.h|   7 +-
 10 files changed, 344 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index a38e5cdba2..a923a380d3 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -318,6 +318,12 @@
  */
 #define LIBXL_HAVE_VIRIDIAN_CRASH_CTL 1
 
+/*
+ * LIBXL_HAVE_VIRIDIAN_SYNIC indicates that the 'synic' value
+ * is present in the viridian enlightenment enumeration.
+ */
+#define LIBXL_HAVE_VIRIDIAN_SYNIC 1
+
 /*
  * LIBXL_HAVE_BUILDINFO_HVM_ACPI_LAPTOP_SLATE indicates that
  * libxl_domain_build_info has the u.hvm.acpi_laptop_slate field.
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 6160991af3..fb758d2ac3 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -317,6 +317,9 @@ static int hvm_set_viridian_features(libxl__gc *gc, 
uint32_t domid,
 if (libxl_bitmap_test(&enlightenments, 
LIBXL_VIRIDIAN_ENLIGHTENMENT_CRASH_CTL))
 mask |= HVMPV_crash_ctl;
 
+if (libxl_bitmap_test(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_SYNIC))
+mask |= HVMPV_synic;
+
 if (mask != 0 &&
 xc_hvm_param_set(CTX->xch,
  domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index b685ac47ac..9860bcaf5f 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -235,6 +235,7 @@ libxl_viridian_enlightenment = 
Enumeration("viridian_enlightenment", [
 (4, "hcall_remote_tlb_flush"),
 (5, "apic_assist"),
 (6, "crash_ctl"),
+(7, "synic"),
 ])
 
 libxl_hdtype = Enumeration("hdtype", [
diff --git a/xen/arch/x86/hvm/viridian/synic.c 
b/xen/arch/x86/hvm/viridian/synic.c
index fb560bc162..d00a7bfe2c 100644
--- a/xen/arch/x86/hvm/viridian/synic.c
+++ b/xen/arch/x86/hvm/viridian/synic.c
@@ -13,6 +13,7 @@
 
 #include 
 #include 
+#include 
 
 #include "private.h"
 
@@ -28,6 +29,37 @@ typedef union _HV_VP_ASSIST_PAGE
 uint8_t ReservedZBytePadding[PAGE_SIZE];
 } HV_VP_ASSIST_PAGE;
 
+typedef enum HV_MESSAGE_TYPE {
+HvMessageTypeNone,
+HvMessageTimerExpired = 0x8010,
+} HV_MESSAGE_TYPE;
+
+typedef struct HV_MESSAGE_FLAGS {
+uint8_t MessagePending:1;
+uint8_t Reserved:7;
+} HV_MESSAGE_FLAGS;
+
+typedef struct HV_MESSAGE_HEADER {
+HV_MESSAGE_TYPE MessageType;
+uint16_t Reserved1;
+HV_MESSAGE_FLAGS MessageFlags;
+uint8_t PayloadSize;
+uint64_t Reserved2;
+} HV_MESSAGE_HEADER;
+
+#define HV_MESSAGE_SIZE 256
+#define HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT 30
+
+typedef struct HV_MESSAGE {
+HV_MESSAGE_HEADER Header;
+uint64_t Payload[HV_MESSAGE_MAX_PAYLOAD_QWORD_COUNT];
+} HV_MESSAGE;
+
+void __init __maybe_unused build_assertions(void)
+{
+BUILD_BUG_ON(sizeof(HV_MESSAGE) != HV_MESSAGE_SIZE);
+}
+
 void viridian_apic_assist_set(const struct vcpu *v)
 {
 struct viridian_vcpu *vv = v->arch.hvm.viridian;
@@ -83,6 +115,8 @@ int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, 
uint64_t val)
 struct viridian_vcpu *vv = v->arch.hvm.viridian;
 struct domain *d = v->domain;
 
+ASSERT(v == current || !v->is_running);
+
 switch ( idx )
 {
 case HV_X64_MSR_EOI:
@@ -107,6 +141,76 @@ int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, 
uint64_t val)
 viridian_map_guest_page(d, &vv->vp_assist);
 break;
 
+case HV_X64_MSR_SCONTROL:
+if ( !(viridian_feature_mask(d) & HVMPV_synic) )
+return X86EMUL_EXCEPTION;
+
+vv->scontrol = val;
+break;
+
+case HV_X64_MSR_SVERSION:
+return X86EMUL_EXC

[Xen-devel] [PATCH v6 05/11] viridian: extend init/deinit hooks into synic and time modules

2019-03-14 Thread Paul Durrant
This patch simply adds domain and vcpu init/deinit hooks into the synic
and time modules and wires them into viridian_[domain|vcpu]_[init|deinit]().
Only one of the hooks is currently needed (to unmap the 'VP Assist' page)
but subsequent patches will make use of the others.

NOTE: To perform the unmap of the VP Assist page,
  viridian_unmap_guest_page() is now directly called in the new
  viridian_synic_vcpu_deinit() function (which is safe even if
  is_viridian_vcpu() evaluates to false). This replaces the slightly
  hacky mechanism of faking a zero write to the
  HV_X64_MSR_VP_ASSIST_PAGE MSR in viridian_cpu_deinit().
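The wiring described above follows Xen's usual init/deinit pattern: if any subsystem init fails, the caller unwinds via a deinit that must be safe on partially initialised state (exactly why viridian_unmap_guest_page() must cope with a page that was never mapped). A toy standalone sketch of that shape, with invented subsystem names standing in for the synic and time modules:

```c
#include <stdlib.h>
#include <string.h>

/* Toy per-vcpu state standing in for struct viridian_vcpu. */
struct toy_vcpu {
    void *synic_page;   /* stands in for the VP Assist mapping */
    int   time_ready;
};

/* Deinit is idempotent and safe on partially set-up state. */
void toy_vcpu_deinit(struct toy_vcpu *v)
{
    free(v->synic_page);
    v->synic_page = NULL;
    v->time_ready = 0;
}

static int toy_synic_init(struct toy_vcpu *v, int fail)
{
    if (fail)
        return -1;
    v->synic_page = calloc(1, 64);
    return v->synic_page ? 0 : -1;
}

static int toy_time_init(struct toy_vcpu *v, int fail)
{
    if (fail)
        return -1;
    v->time_ready = 1;
    return 0;
}

/* Mirrors the goto-fail shape viridian_vcpu_init() takes in this series. */
int toy_vcpu_init(struct toy_vcpu *v, int fail_synic, int fail_time)
{
    int rc;

    memset(v, 0, sizeof(*v));

    rc = toy_synic_init(v, fail_synic);
    if (rc)
        goto fail;

    rc = toy_time_init(v, fail_time);
    if (rc)
        goto fail;

    return 0;

 fail:
    toy_vcpu_deinit(v);   /* unwind whatever did succeed */
    return rc;
}
```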

Signed-off-by: Paul Durrant 
Reviewed-by: Jan Beulich 
Reviewed-by: Wei Liu 
---
Cc: Andrew Cooper 
Cc: "Roger Pau Monné" 

v4:
 - Constify vcpu and domain pointers

v2:
 - Pay attention to sync and time init hook return values
---
 xen/arch/x86/hvm/viridian/private.h  | 12 +
 xen/arch/x86/hvm/viridian/synic.c| 19 ++
 xen/arch/x86/hvm/viridian/time.c | 18 ++
 xen/arch/x86/hvm/viridian/viridian.c | 37 ++--
 4 files changed, 84 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/viridian/private.h 
b/xen/arch/x86/hvm/viridian/private.h
index 46174f48cd..8c029f62c6 100644
--- a/xen/arch/x86/hvm/viridian/private.h
+++ b/xen/arch/x86/hvm/viridian/private.h
@@ -74,6 +74,12 @@
 int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val);
 int viridian_synic_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val);
 
+int viridian_synic_vcpu_init(const struct vcpu *v);
+int viridian_synic_domain_init(const struct domain *d);
+
+void viridian_synic_vcpu_deinit(const struct vcpu *v);
+void viridian_synic_domain_deinit(const struct domain *d);
+
 void viridian_synic_save_vcpu_ctxt(const struct vcpu *v,
struct hvm_viridian_vcpu_context *ctxt);
 void viridian_synic_load_vcpu_ctxt(
@@ -82,6 +88,12 @@ void viridian_synic_load_vcpu_ctxt(
 int viridian_time_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val);
 int viridian_time_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val);
 
+int viridian_time_vcpu_init(const struct vcpu *v);
+int viridian_time_domain_init(const struct domain *d);
+
+void viridian_time_vcpu_deinit(const struct vcpu *v);
+void viridian_time_domain_deinit(const struct domain *d);
+
 void viridian_time_save_domain_ctxt(
 const struct domain *d, struct hvm_viridian_domain_context *ctxt);
 void viridian_time_load_domain_ctxt(
diff --git a/xen/arch/x86/hvm/viridian/synic.c 
b/xen/arch/x86/hvm/viridian/synic.c
index 05d971b365..4b00dbe1b3 100644
--- a/xen/arch/x86/hvm/viridian/synic.c
+++ b/xen/arch/x86/hvm/viridian/synic.c
@@ -146,6 +146,25 @@ int viridian_synic_rdmsr(const struct vcpu *v, uint32_t 
idx, uint64_t *val)
 return X86EMUL_OKAY;
 }
 
+int viridian_synic_vcpu_init(const struct vcpu *v)
+{
+return 0;
+}
+
+int viridian_synic_domain_init(const struct domain *d)
+{
+return 0;
+}
+
+void viridian_synic_vcpu_deinit(const struct vcpu *v)
+{
+viridian_unmap_guest_page(&v->arch.hvm.viridian->vp_assist);
+}
+
+void viridian_synic_domain_deinit(const struct domain *d)
+{
+}
+
 void viridian_synic_save_vcpu_ctxt(const struct vcpu *v,
struct hvm_viridian_vcpu_context *ctxt)
 {
diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c
index 909a3fb9e3..48aca7e0ab 100644
--- a/xen/arch/x86/hvm/viridian/time.c
+++ b/xen/arch/x86/hvm/viridian/time.c
@@ -215,6 +215,24 @@ int viridian_time_rdmsr(const struct vcpu *v, uint32_t 
idx, uint64_t *val)
 return X86EMUL_OKAY;
 }
 
+int viridian_time_vcpu_init(const struct vcpu *v)
+{
+return 0;
+}
+
+int viridian_time_domain_init(const struct domain *d)
+{
+return 0;
+}
+
+void viridian_time_vcpu_deinit(const struct vcpu *v)
+{
+}
+
+void viridian_time_domain_deinit(const struct domain *d)
+{
+}
+
 void viridian_time_save_domain_ctxt(
 const struct domain *d, struct hvm_viridian_domain_context *ctxt)
 {
diff --git a/xen/arch/x86/hvm/viridian/viridian.c 
b/xen/arch/x86/hvm/viridian/viridian.c
index 1a20d68aaf..f9a509d918 100644
--- a/xen/arch/x86/hvm/viridian/viridian.c
+++ b/xen/arch/x86/hvm/viridian/viridian.c
@@ -418,22 +418,52 @@ int guest_rdmsr_viridian(const struct vcpu *v, uint32_t 
idx, uint64_t *val)
 
 int viridian_vcpu_init(struct vcpu *v)
 {
+int rc;
+
 ASSERT(!v->arch.hvm.viridian);
 v->arch.hvm.viridian = xzalloc(struct viridian_vcpu);
 if ( !v->arch.hvm.viridian )
 return -ENOMEM;
 
+rc = viridian_synic_vcpu_init(v);
+if ( rc )
+goto fail;
+
+rc = viridian_time_vcpu_init(v);
+if ( rc )
+goto fail;
+
 return 0;
+
+ fail:
+viridian_vcpu_deinit(v);
+
+return rc;
 }
 
 int viridian_domain_init(struct domain *d)
 {
+int rc;
+
 ASSERT(!d->arch.hvm.viridian);
 d->arch.hvm.viridian = xzalloc(struct viridian_domain);
 if ( !d->arch.hv

[Xen-devel] [PATCH v6 02/11] viridian: separately allocate domain and vcpu structures

2019-03-14 Thread Paul Durrant
Currently the viridian_domain and viridian_vcpu structures are inline in
the hvm_domain and hvm_vcpu structures respectively. Subsequent patches
will need to add sizable extra fields to the viridian structures which
will cause the PAGE_SIZE limit of the overall vcpu structure to be
exceeded. This patch, therefore, uses the new init hooks to separately
allocate the structures and converts the 'viridian' fields in hvm_domain
and hvm_vcpu to be pointers to these allocations. These separate allocations
also allow some vcpu and domain pointers to become const.

Ideally, now that they are no longer inline, the allocations of the
viridian structures could be made conditional on whether the toolstack
is going to configure the viridian enlightenments. However the toolstack
is currently unable to convey this information to the domain creation code
so such an enhancement is deferred until that becomes possible.

NOTE: The patch also introduces the 'is_viridian_vcpu' macro to avoid
  introducing a second evaluation of 'is_viridian_domain' with an
  open-coded 'v->domain' argument. This macro will be used again
  in a subsequent patch.
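The conversion can be pictured with a toy structure: the embedded field becomes a pointer, init allocates zeroed storage, and deinit frees it and clears the pointer so a second deinit stays harmless (xzalloc()/XFREE() in Xen; plain calloc()/free() here). An illustrative sketch, not the Xen code:

```c
#include <stdlib.h>

struct viridian_state { long scratch[4]; };  /* placeholder payload */

/* Before: struct hvm_vcpu { struct viridian_state viridian; ... };
 * After:  the field is a pointer, allocated separately so the
 * containing structure stays within its PAGE_SIZE budget. */
struct toy_hvm_vcpu {
    struct viridian_state *viridian;
};

int toy_viridian_vcpu_init(struct toy_hvm_vcpu *v)
{
    v->viridian = calloc(1, sizeof(*v->viridian)); /* xzalloc() in Xen */
    return v->viridian ? 0 : -1;                   /* -ENOMEM in Xen */
}

/* XFREE() frees and NULLs the pointer in one go; emulating that here
 * keeps a double deinit safe. */
void toy_viridian_vcpu_deinit(struct toy_hvm_vcpu *v)
{
    free(v->viridian);
    v->viridian = NULL;
}
```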

Signed-off-by: Paul Durrant 
Reviewed-by: Wei Liu 
Acked-by: Jan Beulich 
---
Cc: Andrew Cooper 
Cc: "Roger Pau Monné" 

v4:
 - Const-ify some vcpu and domain pointers

v2:
 - use XFREE()
 - expand commit comment to point out why allocations are unconditional
---
 xen/arch/x86/hvm/viridian/private.h  |  2 +-
 xen/arch/x86/hvm/viridian/synic.c| 46 -
 xen/arch/x86/hvm/viridian/time.c | 38 +++---
 xen/arch/x86/hvm/viridian/viridian.c | 75 ++--
 xen/include/asm-x86/hvm/domain.h |  2 +-
 xen/include/asm-x86/hvm/hvm.h|  4 ++
 xen/include/asm-x86/hvm/vcpu.h   |  2 +-
 xen/include/asm-x86/hvm/viridian.h   | 10 ++--
 8 files changed, 101 insertions(+), 78 deletions(-)

diff --git a/xen/arch/x86/hvm/viridian/private.h 
b/xen/arch/x86/hvm/viridian/private.h
index 398b22f12d..46174f48cd 100644
--- a/xen/arch/x86/hvm/viridian/private.h
+++ b/xen/arch/x86/hvm/viridian/private.h
@@ -89,7 +89,7 @@ void viridian_time_load_domain_ctxt(
 
 void viridian_dump_guest_page(const struct vcpu *v, const char *name,
   const struct viridian_page *vp);
-void viridian_map_guest_page(struct vcpu *v, struct viridian_page *vp);
+void viridian_map_guest_page(const struct vcpu *v, struct viridian_page *vp);
 void viridian_unmap_guest_page(struct viridian_page *vp);
 
 #endif /* X86_HVM_VIRIDIAN_PRIVATE_H */
diff --git a/xen/arch/x86/hvm/viridian/synic.c 
b/xen/arch/x86/hvm/viridian/synic.c
index a6ebbbc9f5..28eda7798c 100644
--- a/xen/arch/x86/hvm/viridian/synic.c
+++ b/xen/arch/x86/hvm/viridian/synic.c
@@ -28,9 +28,9 @@ typedef union _HV_VP_ASSIST_PAGE
 uint8_t ReservedZBytePadding[PAGE_SIZE];
 } HV_VP_ASSIST_PAGE;
 
-void viridian_apic_assist_set(struct vcpu *v)
+void viridian_apic_assist_set(const struct vcpu *v)
 {
-HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian.vp_assist.ptr;
+HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian->vp_assist.ptr;
 
 if ( !ptr )
 return;
@@ -40,40 +40,40 @@ void viridian_apic_assist_set(struct vcpu *v)
  * wrong and the VM will most likely hang so force a crash now
  * to make the problem clear.
  */
-if ( v->arch.hvm.viridian.apic_assist_pending )
+if ( v->arch.hvm.viridian->apic_assist_pending )
 domain_crash(v->domain);
 
-v->arch.hvm.viridian.apic_assist_pending = true;
+v->arch.hvm.viridian->apic_assist_pending = true;
 ptr->ApicAssist.no_eoi = 1;
 }
 
-bool viridian_apic_assist_completed(struct vcpu *v)
+bool viridian_apic_assist_completed(const struct vcpu *v)
 {
-HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian.vp_assist.ptr;
+HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian->vp_assist.ptr;
 
 if ( !ptr )
 return false;
 
-if ( v->arch.hvm.viridian.apic_assist_pending &&
+if ( v->arch.hvm.viridian->apic_assist_pending &&
  !ptr->ApicAssist.no_eoi )
 {
 /* An EOI has been avoided */
-v->arch.hvm.viridian.apic_assist_pending = false;
+v->arch.hvm.viridian->apic_assist_pending = false;
 return true;
 }
 
 return false;
 }
 
-void viridian_apic_assist_clear(struct vcpu *v)
+void viridian_apic_assist_clear(const struct vcpu *v)
 {
-HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian.vp_assist.ptr;
+HV_VP_ASSIST_PAGE *ptr = v->arch.hvm.viridian->vp_assist.ptr;
 
 if ( !ptr )
 return;
 
 ptr->ApicAssist.no_eoi = 0;
-v->arch.hvm.viridian.apic_assist_pending = false;
+v->arch.hvm.viridian->apic_assist_pending = false;
 }
 
 int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val)
@@ -95,12 +95,12 @@ int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, 
uint64_t val)
 
 case HV_X64_MSR_VP_ASSIST_PAGE:
 /* release any previous mapping */
-viridian_unmap_guest_page(&v->arch.hvm.viridian

[Xen-devel] [PATCH v6 01/11] viridian: add init hooks

2019-03-14 Thread Paul Durrant
This patch adds domain and vcpu init hooks for viridian features. The init
hooks do not yet do anything; the functionality will be added by
subsequent patches.

Signed-off-by: Paul Durrant 
Reviewed-by: Wei Liu 
Acked-by: Jan Beulich 
---
Cc: Andrew Cooper 
Cc: "Roger Pau Monné" 

v5:
 - Put the call to viridian_domain_deinit() back into
   hvm_domain_relinquish_resources() where it should be

v3:
 - Re-instate call from domain deinit to vcpu deinit
 - Move deinit calls to avoid introducing new labels

v2:
 - Remove call from domain deinit to vcpu deinit
---
 xen/arch/x86/hvm/hvm.c   | 10 ++
 xen/arch/x86/hvm/viridian/viridian.c | 10 ++
 xen/include/asm-x86/hvm/viridian.h   |  3 +++
 3 files changed, 23 insertions(+)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 8adbb61b57..11ce21fc08 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -666,6 +666,10 @@ int hvm_domain_initialise(struct domain *d)
 if ( hvm_tsc_scaling_supported )
 d->arch.hvm.tsc_scaling_ratio = hvm_default_tsc_scaling_ratio;
 
+rc = viridian_domain_init(d);
+if ( rc )
+goto fail2;
+
 rc = hvm_funcs.domain_initialise(d);
 if ( rc != 0 )
 goto fail2;
@@ -687,6 +691,7 @@ int hvm_domain_initialise(struct domain *d)
 hvm_destroy_cacheattr_region_list(d);
 destroy_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0);
  fail:
+viridian_domain_deinit(d);
 return rc;
 }
 
@@ -1526,6 +1531,10 @@ int hvm_vcpu_initialise(struct vcpu *v)
  && (rc = nestedhvm_vcpu_initialise(v)) < 0 ) /* teardown: 
nestedhvm_vcpu_destroy */
 goto fail5;
 
+rc = viridian_vcpu_init(v);
+if ( rc )
+goto fail5;
+
 rc = hvm_all_ioreq_servers_add_vcpu(d, v);
 if ( rc != 0 )
 goto fail6;
@@ -1553,6 +1562,7 @@ int hvm_vcpu_initialise(struct vcpu *v)
  fail2:
 hvm_vcpu_cacheattr_destroy(v);
  fail1:
+viridian_vcpu_deinit(v);
 return rc;
 }
 
diff --git a/xen/arch/x86/hvm/viridian/viridian.c 
b/xen/arch/x86/hvm/viridian/viridian.c
index 425af56856..5b0eb8a8c7 100644
--- a/xen/arch/x86/hvm/viridian/viridian.c
+++ b/xen/arch/x86/hvm/viridian/viridian.c
@@ -417,6 +417,16 @@ int guest_rdmsr_viridian(const struct vcpu *v, uint32_t 
idx, uint64_t *val)
 return X86EMUL_OKAY;
 }
 
+int viridian_vcpu_init(struct vcpu *v)
+{
+return 0;
+}
+
+int viridian_domain_init(struct domain *d)
+{
+return 0;
+}
+
 void viridian_vcpu_deinit(struct vcpu *v)
 {
 viridian_synic_wrmsr(v, HV_X64_MSR_VP_ASSIST_PAGE, 0);
diff --git a/xen/include/asm-x86/hvm/viridian.h 
b/xen/include/asm-x86/hvm/viridian.h
index ec5ef8d3f9..f072838955 100644
--- a/xen/include/asm-x86/hvm/viridian.h
+++ b/xen/include/asm-x86/hvm/viridian.h
@@ -80,6 +80,9 @@ viridian_hypercall(struct cpu_user_regs *regs);
 void viridian_time_ref_count_freeze(struct domain *d);
 void viridian_time_ref_count_thaw(struct domain *d);
 
+int viridian_vcpu_init(struct vcpu *v);
+int viridian_domain_init(struct domain *d);
+
 void viridian_vcpu_deinit(struct vcpu *v);
 void viridian_domain_deinit(struct domain *d);
 
-- 
2.20.1



[Xen-devel] [PATCH v6 11/11] viridian: add implementation of the HvSendSyntheticClusterIpi hypercall

2019-03-14 Thread Paul Durrant
This patch adds an implementation of the hypercall as documented in the
specification [1], section 10.5.2. This enlightenment, as with others, is
advertised by CPUID leaf 0x4004 and is under control of a new
'hcall_ipi' option in libxl.

If used, this enlightenment should mean the guest only takes a single VMEXIT
to issue IPIs to multiple vCPUs rather than the multiple VMEXITs that would
result from using the emulated local APIC.

[1] 
https://github.com/MicrosoftDocs/Virtualization-Documentation/raw/live/tlfs/Hypervisor%20Top%20Level%20Functional%20Specification%20v5.0C.pdf
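The saving comes from handling one guest-supplied 64-bit vCPU mask per VMEXIT rather than taking an exit per interrupt. A hedged sketch of the mask walk an implementation must perform — the helper name and the exact validation are illustrative (the TLFS reserves vectors below 0x10, and any set bit beyond the partition's vCPU count is invalid):

```c
#include <stdint.h>

/* Walk a 64-bit vCPU mask, recording each addressed vCPU id into
 * 'ids' (may be NULL). Returns the number of IPIs that would be
 * sent, or -1 for an invalid vector or out-of-range vCPU. */
int cluster_ipi(uint64_t vcpu_mask, unsigned int vector,
                unsigned int nr_vcpus, unsigned int *ids)
{
    int count = 0;

    if (vector < 0x10 || vector > 0xff)  /* reserved vector range */
        return -1;

    while (vcpu_mask) {
        unsigned int id = __builtin_ctzll(vcpu_mask); /* lowest set bit */

        if (id >= nr_vcpus)
            return -1;
        if (ids)
            ids[count] = id;
        count++;
        vcpu_mask &= vcpu_mask - 1;  /* clear the bit just handled */
    }
    return count;
}
```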

Signed-off-by: Paul Durrant 
Acked-by: Wei Liu 
Reviewed-by: Jan Beulich 
---
Cc: Ian Jackson 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: "Roger Pau Monné" 

v4:
 - Address comments from Jan

v3:
 - New in v3
---
 docs/man/xl.cfg.5.pod.in |  6 +++
 tools/libxl/libxl.h  |  6 +++
 tools/libxl/libxl_dom.c  |  3 ++
 tools/libxl/libxl_types.idl  |  1 +
 xen/arch/x86/hvm/viridian/viridian.c | 63 
 xen/include/public/hvm/params.h  |  7 +++-
 6 files changed, 85 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index 355c654693..c7d70e618b 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -2175,6 +2175,12 @@ ticks and hence enabling this group will ensure that 
ticks will be
 consistent with use of an enlightened time source (B or
 B).
 
+=item B
+
+This set incorporates use of a hypercall for interprocessor interrupts.
+This enlightenment may improve performance of Windows guests with multiple
+virtual CPUs.
+
 =item B
 
 This is a special value that enables the default set of groups, which
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index c8f219b0d3..482499a6c0 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -330,6 +330,12 @@
  */
 #define LIBXL_HAVE_VIRIDIAN_STIMER 1
 
+/*
+ * LIBXL_HAVE_VIRIDIAN_HCALL_IPI indicates that the 'hcall_ipi' value
+ * is present in the viridian enlightenment enumeration.
+ */
+#define LIBXL_HAVE_VIRIDIAN_HCALL_IPI 1
+
 /*
  * LIBXL_HAVE_BUILDINFO_HVM_ACPI_LAPTOP_SLATE indicates that
  * libxl_domain_build_info has the u.hvm.acpi_laptop_slate field.
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 2ee0f82ee7..879c806139 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -324,6 +324,9 @@ static int hvm_set_viridian_features(libxl__gc *gc, 
uint32_t domid,
 if (libxl_bitmap_test(&enlightenments, 
LIBXL_VIRIDIAN_ENLIGHTENMENT_STIMER))
 mask |= HVMPV_time_ref_count | HVMPV_synic | HVMPV_stimer;
 
+if (libxl_bitmap_test(&enlightenments, 
LIBXL_VIRIDIAN_ENLIGHTENMENT_HCALL_IPI))
+mask |= HVMPV_hcall_ipi;
+
 if (mask != 0 &&
 xc_hvm_param_set(CTX->xch,
  domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 1cce249de4..cb4702fd7a 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -237,6 +237,7 @@ libxl_viridian_enlightenment = 
Enumeration("viridian_enlightenment", [
 (6, "crash_ctl"),
 (7, "synic"),
 (8, "stimer"),
+(9, "hcall_ipi"),
 ])
 
 libxl_hdtype = Enumeration("hdtype", [
diff --git a/xen/arch/x86/hvm/viridian/viridian.c 
b/xen/arch/x86/hvm/viridian/viridian.c
index dce648bb4e..4b06b78a27 100644
--- a/xen/arch/x86/hvm/viridian/viridian.c
+++ b/xen/arch/x86/hvm/viridian/viridian.c
@@ -28,6 +28,7 @@
 #define HvFlushVirtualAddressSpace 0x0002
 #define HvFlushVirtualAddressList  0x0003
 #define HvNotifyLongSpinWait   0x0008
+#define HvSendSyntheticClusterIpi  0x000b
 #define HvGetPartitionId   0x0046
 #define HvExtCallQueryCapabilities 0x8001
 
@@ -95,6 +96,7 @@ typedef union _HV_CRASH_CTL_REG_CONTENTS
 #define CPUID4A_HCALL_REMOTE_TLB_FLUSH (1 << 2)
 #define CPUID4A_MSR_BASED_APIC (1 << 3)
 #define CPUID4A_RELAX_TIMER_INT(1 << 5)
+#define CPUID4A_SYNTHETIC_CLUSTER_IPI  (1 << 10)
 
 /* Viridian CPUID leaf 6: Implementation HW features detected and in use */
 #define CPUID6A_APIC_OVERLAY(1 << 0)
@@ -206,6 +208,8 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t 
leaf,
 res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
 if ( !cpu_has_vmx_apic_reg_virt )
 res->a |= CPUID4A_MSR_BASED_APIC;
+if ( viridian_feature_mask(d) & HVMPV_hcall_ipi )
+res->a |= CPUID4A_SYNTHETIC_CLUSTER_IPI;
 
 /*
  * This value is the recommended number of attempts to try to
@@ -628,6 +632,65 @@ int viridian_hypercall(struct cpu_user_regs *regs)
 break;
 }
 
+case HvSendSyntheticClusterIpi:
+{
+struct vcpu *v;
+uint32_t vector;
+uint64_t vcpu_mask;
+
+status = HV_STATUS_INVALID_PARAMETER;
+
+/* Get input parameters. */
+if ( input.fast )
+{
+

[Xen-devel] [PATCH v6 10/11] viridian: add implementation of synthetic timers

2019-03-14 Thread Paul Durrant
This patch introduces an implementation of the STIMER0-15_CONFIG/COUNT MSRs
and hence the first SynIC message source.

The new (and documented) 'stimer' viridian enlightenment group may be
specified to enable this feature.

While in the neighbourhood, this patch adds a missing check for an
attempt to write the time reference count MSR, which should result in an
exception (but not be reported as an unimplemented MSR).

NOTE: It is necessary for correct operation that timer expiration and
  message delivery time-stamping use the same time source as the guest.
  The specification is ambiguous but testing with a Windows 10 1803
  guest has shown that using the partition reference counter as a
  source whilst the guest is using RDTSC and the reference tsc page
  does not work correctly. Therefore the time_now() function is used.
  This implements the algorithm for acquiring partition reference time
  that is documented in the specification.
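The "re-worked missed ticks calculation" mentioned in the changelog has a simple shape: for a periodic timer, any expirations that fell behind the current time are folded into a single expiry strictly in the future, in whole periods from the original start. A standalone sketch of that arithmetic (function name invented; times in the TLFS's 100ns reference units):

```c
#include <stdint.h>

/* For a periodic timer armed at 'start' with the given 'period',
 * return the next expiry after 'now', skipping any missed ticks in
 * whole periods so the timer stays phase-aligned with 'start'.
 * Illustrative only -- not the Xen implementation. */
uint64_t next_expiration(uint64_t start, uint64_t period, uint64_t now)
{
    uint64_t expiration = start + period;

    if (expiration <= now) {
        /* Missed one or more ticks: jump past 'now'. */
        uint64_t missed = (now - expiration) / period + 1;

        expiration += missed * period;
    }
    return expiration;
}
```

Whatever clock supplies 'now' must be the one the guest sees — the point the NOTE above makes about using time_now() for both expiry and message time-stamping.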

Signed-off-by: Paul Durrant 
Acked-by: Wei Liu 
---
Cc: Ian Jackson 
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Jan Beulich 
Cc: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: "Roger Pau Monné" 

v6:
 - Stop using the reference tsc page in time_now()
 - Address further comments from Jan

v5:
 - Fix time_now() to read TSC as the guest would see it

v4:
 - Address comments from Jan

v3:
 - Re-worked missed ticks calculation
---
 docs/man/xl.cfg.5.pod.in   |  12 +-
 tools/libxl/libxl.h|   6 +
 tools/libxl/libxl_dom.c|   4 +
 tools/libxl/libxl_types.idl|   1 +
 xen/arch/x86/hvm/viridian/private.h|   9 +-
 xen/arch/x86/hvm/viridian/synic.c  |  55 +++-
 xen/arch/x86/hvm/viridian/time.c   | 383 -
 xen/arch/x86/hvm/viridian/viridian.c   |   5 +
 xen/include/asm-x86/hvm/viridian.h |  32 ++-
 xen/include/public/arch-x86/hvm/save.h |   2 +
 xen/include/public/hvm/params.h|   7 +-
 11 files changed, 503 insertions(+), 13 deletions(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index ad81af1ed8..355c654693 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -2167,11 +2167,19 @@ This group incorporates the crash control MSRs. These 
enlightenments
 allow Windows to write crash information such that it can be logged
 by Xen.
 
+=item B
+
+This set incorporates the SynIC and synthetic timer MSRs. Windows will
+use synthetic timers in preference to emulated HPET for a source of
+ticks and hence enabling this group will ensure that ticks will be
+consistent with use of an enlightened time source (B or
+B).
+
 =item B
 
 This is a special value that enables the default set of groups, which
-is currently the B, B, B, B
-and B groups.
+is currently the B, B, B, B,
+B and B groups.
 
 =item B
 
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index a923a380d3..c8f219b0d3 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -324,6 +324,12 @@
  */
 #define LIBXL_HAVE_VIRIDIAN_SYNIC 1
 
+/*
+ * LIBXL_HAVE_VIRIDIAN_STIMER indicates that the 'stimer' value
+ * is present in the viridian enlightenment enumeration.
+ */
+#define LIBXL_HAVE_VIRIDIAN_STIMER 1
+
 /*
  * LIBXL_HAVE_BUILDINFO_HVM_ACPI_LAPTOP_SLATE indicates that
  * libxl_domain_build_info has the u.hvm.acpi_laptop_slate field.
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index fb758d2ac3..2ee0f82ee7 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -269,6 +269,7 @@ static int hvm_set_viridian_features(libxl__gc *gc, 
uint32_t domid,
 libxl_bitmap_set(&enlightenments, 
LIBXL_VIRIDIAN_ENLIGHTENMENT_TIME_REF_COUNT);
 libxl_bitmap_set(&enlightenments, 
LIBXL_VIRIDIAN_ENLIGHTENMENT_APIC_ASSIST);
 libxl_bitmap_set(&enlightenments, 
LIBXL_VIRIDIAN_ENLIGHTENMENT_CRASH_CTL);
+libxl_bitmap_set(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_STIMER);
 }
 
 libxl_for_each_set_bit(v, info->u.hvm.viridian_enable) {
@@ -320,6 +321,9 @@ static int hvm_set_viridian_features(libxl__gc *gc, 
uint32_t domid,
 if (libxl_bitmap_test(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_SYNIC))
 mask |= HVMPV_synic;
 
+if (libxl_bitmap_test(&enlightenments, 
LIBXL_VIRIDIAN_ENLIGHTENMENT_STIMER))
+mask |= HVMPV_time_ref_count | HVMPV_synic | HVMPV_stimer;
+
 if (mask != 0 &&
 xc_hvm_param_set(CTX->xch,
  domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 9860bcaf5f..1cce249de4 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -236,6 +236,7 @@ libxl_viridian_enlightenment = 
Enumeration("viridian_enlightenment", [
 (5, "apic_assist"),
 (6, "crash_ctl"),
 (7, "synic"),
+(8, "stimer"),
 ])
 
 libxl_hdtype = Enumeration("hdtype", [
diff --git a/xen/arch/x86/hvm/viridian/private.h 
b/xen/arch/x86/hvm/viridian/private.h
index 96a784b840.

[Xen-devel] [xen-4.9-testing test] 133756: regressions - FAIL

2019-03-14 Thread osstest service owner
flight 133756 xen-4.9-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/133756/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-libvirt-pair 22 guest-migrate/src_host/dst_host fail REGR. vs. 
132889
 test-amd64-amd64-libvirt-pair 22 guest-migrate/src_host/dst_host fail REGR. 
vs. 132889
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop   fail REGR. vs. 132889

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemut-ws16-amd64 14 guest-localmigrate fail in 133712 pass 
in 133756
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail in 133712 
pass in 133756
 test-amd64-amd64-xl-qemut-win10-i386  7 xen-boot   fail pass in 133712
 test-amd64-amd64-xl-qemuu-ws16-amd64 16 guest-localmigrate/x10 fail pass in 
133712
 test-amd64-i386-xl-qemut-ws16-amd64 16 guest-localmigrate/x10 fail pass in 
133712

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop   fail blocked in 132889
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop  fail blocked in 132889
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop   fail blocked in 132889
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail in 133712 blocked 
in 132889
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail in 133712 
like 132889
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail in 133712 
like 132889
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop  fail in 133712 like 132889
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop   fail in 133712 like 132889
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-install fail in 133712 never 
pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 132889
 test-amd64-amd64-xl-rtds 10 debian-install   fail  like 132889
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-amd64-a

Re: [Xen-devel] [PATCH for-next 7/9] xen: Convert is_xen_heap_mfn to use typesafe MFN

2019-03-14 Thread Jan Beulich
>>> On 14.03.19 at 11:14,  wrote:
> You do know that mfn_add(mfn, -1) will DTRT?

It will, but imo it looks slightly odd in particular in e.g.
for ( ...; ...; mfn_add(mfn, -1) ), hence I didn't suggest it. But
I don't object either.
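The subtlety behind the exchange: a typesafe mfn_add() returns a new value rather than mutating its argument, so as a bare for-loop update expression `mfn_add(mfn, -1)` would be a no-op — the result has to be assigned back, which is what makes it read oddly. A minimal sketch of such a wrapper (in the spirit of Xen's typesafe MFN, not the actual definitions):

```c
/* Typesafe frame number: a struct wrapper defeats accidental mixing
 * of MFNs with plain integers. */
typedef struct { unsigned long mfn; } mfn_t;

static mfn_t _mfn(unsigned long m) { mfn_t v = { m }; return v; }
static unsigned long mfn_x(mfn_t m) { return m.mfn; }

/* Returns a NEW value; the caller must assign it back. */
static mfn_t mfn_add(mfn_t m, long delta)
{
    return _mfn(mfn_x(m) + delta);
}

/* Count frames in [start, start + n) walking downwards; note the
 * explicit 'm = mfn_add(m, -1)' in the update expression. */
unsigned int walk_down(mfn_t start, unsigned int n)
{
    unsigned int seen = 0;
    mfn_t m;

    for ( m = mfn_add(start, n - 1); mfn_x(m) >= mfn_x(start);
          m = mfn_add(m, -1) )
    {
        seen++;
        if ( !mfn_x(m) )   /* avoid wrapping below frame 0 */
            break;
    }
    return seen;
}
```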

Jan




Re: [Xen-devel] [PATCH v13] tolerate jitter in cpu_khz calculation to avoid TSC emulation

2019-03-14 Thread Olaf Hering
On Wed, 13 Mar 2019 03:18:39 -0600, "Jan Beulich"  wrote:

> The discontinuity is still there, and so far you've failed to explain
> why a discontinuity is what you want here.

In v12 you asked for some data about the ranges. With new data the ranges are:

2.20GHz 29kHz
2.30GHz 105kHz
2.40GHz 3524kHz
2.50GHz 114kHz

There are 5 systems in total in the 2.3/2.4/2.5 classes that have a much higher 
frequency, making the total ranges 15382/6052/19179 kHz respectively.
I have to check what these runaway values mean.

The total range within the ntp.drift file is -71.138..345.942 for >2000 hosts.

I will see if any experiments can be done on the hosts which are on the edge of 
a frequency change. Would be nice to know if ntpd can cope with tsc_mode=native 
for example when a domU is migrated from one to the other.
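The question the series keeps circling can be stated compactly: when is a migrated domU's recorded TSC frequency "close enough" to the destination host's cpu_khz that native TSC can be used without emulation? A hypothetical tolerance check — the threshold and helper are illustrative, not what the patch actually does:

```c
#include <stdint.h>
#include <stdbool.h>

/* Accept 'incoming_khz' as matching 'host_khz' if the difference is
 * within tol_ppm parts per million of the host frequency. For scale:
 * the 29 kHz of jitter seen above at 2.2 GHz is about 13 ppm, well
 * inside what ntpd can slew, whereas the runaway values are not. */
bool tsc_khz_compatible(uint64_t host_khz, uint64_t incoming_khz,
                        uint64_t tol_ppm)
{
    uint64_t diff = host_khz > incoming_khz ? host_khz - incoming_khz
                                            : incoming_khz - host_khz;

    return diff * 1000000ULL <= host_khz * tol_ppm;
}
```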

Olaf



[Xen-devel] [PATCH v2 2/2] tools: add new xl command get-hypervisor-config

2019-03-14 Thread Juergen Gross
Add a new xl subcommand "get-hypervisor-config" to print the
hypervisor .config file.

To be able to reuse already existing decompressing code in libxenguest
xc_inflate_buffer() has to be moved to libxenguest.h.

Signed-off-by: Juergen Gross 
---
V2:
- rename subcommand to get-hypervisor-config (Wei Liu)
- use goto style error handling (Wei Liu)
---
 docs/man/xl.1.pod.in   |  5 +
 tools/libxc/include/xenctrl.h  |  8 
 tools/libxc/include/xenguest.h | 13 +
 tools/libxc/xc_misc.c  | 42 ++
 tools/libxc/xg_private.h   |  4 
 tools/libxl/libxl.c| 34 ++
 tools/libxl/libxl.h|  8 
 tools/xl/xl.h  |  1 +
 tools/xl/xl_cmdtable.c |  5 +
 tools/xl/xl_misc.c | 20 
 10 files changed, 136 insertions(+), 4 deletions(-)

diff --git a/docs/man/xl.1.pod.in b/docs/man/xl.1.pod.in
index 4310fcd818..9d20958e91 100644
--- a/docs/man/xl.1.pod.in
+++ b/docs/man/xl.1.pod.in
@@ -844,6 +844,11 @@ Clears Xen's message buffer.
 
 =back
 
+=item B<get-hypervisor-config>
+
+Print the software configuration file (.config) used to build the
+hypervisor.
+
 =item B [I]
 
 Print information about the Xen host in I format.  When
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index a3628e56bb..c6a203e1a4 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2632,6 +2632,14 @@ int xc_livepatch_replace(xc_interface *xch, char *name, uint32_t timeout);
 int xc_domain_cacheflush(xc_interface *xch, uint32_t domid,
  xen_pfn_t start_pfn, xen_pfn_t nr_pfns);
 
+/*
+ * Get gzip-ed .config from hypervisor.
+ * *buffer must be free()-ed by caller.
+ * data size is returned in `size`.
+ * Returns 0 on success.
+ */
+int xc_get_config(xc_interface *xch, char **buffer, unsigned long *size);
+
 /* Compat shims */
 #include "xenctrl_compat.h"
 
diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index b4b2e19619..76e87ea97c 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -310,4 +310,17 @@ xen_pfn_t *xc_map_m2p(xc_interface *xch,
   unsigned long max_mfn,
   int prot,
   unsigned long *mfn0);
+
+/**
+ * Decompress a gzip-ed stream.
+ * @parm xch a handle to an open hypervisor interface
+ * @parm in_buf buffer holding the gzip-ed data
+ * @parm in_size size in bytes of the gzip-ed data
+ * @parm out_size where to store the gunzip-ed data length
+ * @return new allocated buffer holding the gunzip-ed data
+ */
+char *xc_inflate_buffer(xc_interface *xch,
+const char *in_buf,
+unsigned long in_size,
+unsigned long *out_size);
 #endif /* XENGUEST_H */
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index 5e6714ae2b..83d259e46e 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -888,6 +888,48 @@ int xc_livepatch_replace(xc_interface *xch, char *name, uint32_t timeout)
 return _xc_livepatch_action(xch, name, LIVEPATCH_ACTION_REPLACE, timeout);
 }
 
+int xc_get_config(xc_interface *xch, char **buffer, unsigned long *size)
+{
+int rc;
+DECLARE_SYSCTL;
+DECLARE_HYPERCALL_BUFFER(char, buf);
+
+sysctl.cmd = XEN_SYSCTL_get_config;
+sysctl.u.get_config.size = 0;
+set_xen_guest_handle(sysctl.u.get_config.buffer, HYPERCALL_BUFFER_NULL);
+rc = do_sysctl(xch, &sysctl);
+if ( rc )
+return rc;
+
+*size = sysctl.u.get_config.size;
+buf = xc_hypercall_buffer_alloc(xch, buf, *size);
+if ( !buf )
+{
+errno = ENOMEM;
+return -1;
+}
+
+sysctl.cmd = XEN_SYSCTL_get_config;
+sysctl.u.get_config.size = *size;
+set_xen_guest_handle(sysctl.u.get_config.buffer, buf);
+rc = do_sysctl(xch, &sysctl);
+
+if ( rc )
+goto out;
+
+*buffer = calloc(1, *size);
+if ( !*buffer )
+{
+errno = ENOMEM;
+goto out;
+}
+memmove(*buffer, buf, *size);
+
+out:
+xc_hypercall_buffer_free(xch, buf);
+return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxc/xg_private.h b/tools/libxc/xg_private.h
index f0a4b2c616..ca85e10737 100644
--- a/tools/libxc/xg_private.h
+++ b/tools/libxc/xg_private.h
@@ -43,10 +43,6 @@
 
 char *xc_read_image(xc_interface *xch,
 const char *filename, unsigned long *size);
-char *xc_inflate_buffer(xc_interface *xch,
-const char *in_buf,
-unsigned long in_size,
-unsigned long *out_size);
 
 unsigned long csum_page (void * page);
 
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index ec71574e99..e363371811 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -669,6 +669,40 @@ int libxl_set_parameters(libxl_ctx *ctx, char *param

[Xen-devel] [PATCH v2 1/2] xen: add interface for obtaining .config from hypervisor

2019-03-14 Thread Juergen Gross
Add a sysctl interface for obtaining the .config file used to build
the hypervisor. The mechanism is inspired by the Linux kernel's one.

Signed-off-by: Juergen Gross 
Reviewed-by: Jan Beulich  (apart from XSM changes)
---
V2:
- bump sysctl interface version
- check pad to be zero (Wei Liu)
- only copy data if buffer is large enough (Wei Liu)
- add .gitignore entry at correct position (Wei Liu)
- make xen_config_data_sz const (Jan Beulich)
---
 .gitignore  |  2 ++
 tools/flask/policy/modules/dom0.te  |  2 +-
 xen/common/Makefile |  7 +++
 xen/common/sysctl.c | 17 +
 xen/include/public/sysctl.h | 18 +-
 xen/include/xen/kernel.h|  3 +++
 xen/tools/Makefile  |  9 +++--
 xen/tools/bin2c.c   | 28 
 xen/xsm/flask/hooks.c   |  3 +++
 xen/xsm/flask/policy/access_vectors |  2 ++
 10 files changed, 87 insertions(+), 4 deletions(-)
 create mode 100644 xen/tools/bin2c.c

diff --git a/.gitignore b/.gitignore
index 26bc583f74..b433bce092 100644
--- a/.gitignore
+++ b/.gitignore
@@ -309,6 +309,7 @@ xen/arch/*/efi/boot.c
 xen/arch/*/efi/compat.c
 xen/arch/*/efi/efi.h
 xen/arch/*/efi/runtime.c
+xen/common/config_data.c
 xen/include/headers*.chk
 xen/include/asm
 xen/include/asm-*/asm-offsets.h
@@ -326,6 +327,7 @@ xen/test/livepatch/xen_bye_world.livepatch
 xen/test/livepatch/xen_hello_world.livepatch
 xen/test/livepatch/xen_nop.livepatch
 xen/test/livepatch/xen_replace_world.livepatch
+xen/tools/bin2c
 xen/tools/kconfig/.tmp_gtkcheck
 xen/tools/kconfig/.tmp_qtcheck
 xen/tools/symbols
diff --git a/tools/flask/policy/modules/dom0.te b/tools/flask/policy/modules/dom0.te
index a347d664f8..b776e9f307 100644
--- a/tools/flask/policy/modules/dom0.te
+++ b/tools/flask/policy/modules/dom0.te
@@ -16,7 +16,7 @@ allow dom0_t xen_t:xen {
 allow dom0_t xen_t:xen2 {
resource_op psr_cmt_op psr_alloc pmu_ctrl get_symbol
get_cpu_levelling_caps get_cpu_featureset livepatch_op
-   coverage_op set_parameter
+   coverage_op set_parameter get_config
 };
 
 # Allow dom0 to use all XENVER_ subops that have checks.
diff --git a/xen/common/Makefile b/xen/common/Makefile
index bca48e6e22..7d98dad478 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -1,6 +1,7 @@
 obj-$(CONFIG_ARGO) += argo.o
 obj-y += bitmap.o
 obj-y += bsearch.o
+obj-y += config_data.o
 obj-$(CONFIG_CORE_PARKING) += core_parking.o
 obj-y += cpu.o
 obj-y += cpupool.o
@@ -84,3 +85,9 @@ subdir-$(CONFIG_UBSAN) += ubsan
 
 subdir-$(CONFIG_NEEDS_LIBELF) += libelf
 subdir-$(CONFIG_HAS_DEVICE_TREE) += libfdt
+
+config_data.c: ../.config
+	( echo "const char xen_config_data[] ="; \
+	  cat $< | gzip | ../tools/bin2c; \
+	  echo ";"; \
+	  echo "const unsigned int xen_config_data_sz = sizeof(xen_config_data) - 1;" ) > $@
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index c0aa6bde4e..7d4329882d 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -502,6 +503,22 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
 break;
 }
 
+case XEN_SYSCTL_get_config:
+{
+if ( op->u.get_config.pad )
+{
+ret = -EINVAL;
+break;
+}
+if ( xen_config_data_sz <= op->u.get_config.size &&
+ copy_to_guest(op->u.get_config.buffer, xen_config_data,
+   xen_config_data_sz) )
+ret = -EFAULT;
+op->u.get_config.size = xen_config_data_sz;
+
+break;
+}
+
 default:
 ret = arch_do_sysctl(op, u_sysctl);
 copyback = 0;
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index c49b4dcc99..6139321971 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -36,7 +36,7 @@
 #include "physdev.h"
 #include "tmem.h"
 
-#define XEN_SYSCTL_INTERFACE_VERSION 0x0012
+#define XEN_SYSCTL_INTERFACE_VERSION 0x0013
 
 /*
  * Read console content from Xen buffer ring.
@@ -1100,6 +1100,20 @@ typedef struct xen_sysctl_cpu_policy xen_sysctl_cpu_policy_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cpu_policy_t);
 #endif
 
+/*
+ * XEN_SYSCTL_get_config
+ *
+ * Return gzip-ed .config file
+ */
+struct xen_sysctl_get_config {
+XEN_GUEST_HANDLE_64(char) buffer;   /* IN: pointer to buffer. */
+uint32_t size;  /* IN: size of buffer. */
+/* OUT: size of config data. */
+uint32_t pad;   /* IN: MUST be zero. */
+};
+typedef struct xen_sysctl_get_config xen_sysctl_get_config_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_get_config_t);
+
 struct xen_sysctl {
 uint32_t cmd;
#define XEN_SYSCTL_readconsole 1
@@ -1130,6 +1144,7 @@ struct xen_sysctl {
 #define XEN_SYSCTL_livepatch_op   

[Xen-devel] [PATCH v2 0/2] add xl command to get hypervisor .config

2019-03-14 Thread Juergen Gross
Add "xl get-hypervisor-config" printing the .config used to build the
currently running hypervisor.

Juergen Gross (2):
  xen: add interface for obtaining .config from hypervisor
  tools: add new xl command get-hypervisor-config

 .gitignore  |  2 ++
 docs/man/xl.1.pod.in|  5 +
 tools/flask/policy/modules/dom0.te  |  2 +-
 tools/libxc/include/xenctrl.h   |  8 +++
 tools/libxc/include/xenguest.h  | 13 
 tools/libxc/xc_misc.c   | 42 +
 tools/libxc/xg_private.h|  4 
 tools/libxl/libxl.c | 34 ++
 tools/libxl/libxl.h |  8 +++
 tools/xl/xl.h   |  1 +
 tools/xl/xl_cmdtable.c  |  5 +
 tools/xl/xl_misc.c  | 20 ++
 xen/common/Makefile |  7 +++
 xen/common/sysctl.c | 17 +++
 xen/include/public/sysctl.h | 18 +++-
 xen/include/xen/kernel.h|  3 +++
 xen/tools/Makefile  |  9 ++--
 xen/tools/bin2c.c   | 28 +
 xen/xsm/flask/hooks.c   |  3 +++
 xen/xsm/flask/policy/access_vectors |  2 ++
 20 files changed, 223 insertions(+), 8 deletions(-)
 create mode 100644 xen/tools/bin2c.c

-- 
2.16.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-next 7/9] xen: Convert is_xen_heap_mfn to use typesafe MFN

2019-03-14 Thread Andrew Cooper
On 14/03/2019 11:47, Jan Beulich wrote:
 On 14.03.19 at 11:14,  wrote:
>> You do know that mfn_add(mfn, -1) will DTRT?
> It will, but imo it looks slightly odd in particular in e.g.
> for ( ...; ...; mfn_add(mfn, -1) ), hence I didn't suggest it. But
> I don't object either.

Well - in mathematics, x - 1 is identical to x + (-1), so the construct
doesn't look overly odd to me.

Alternatively, we could rename mfn_add to something else, but nothing
comes to mind which would be as clear.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/2] tools: add new xl command get-hypervisor-config

2019-03-14 Thread Wei Liu
On Thu, Mar 14, 2019 at 12:59:37PM +0100, Juergen Gross wrote:
> Add a new subcommand "get-hypervisor-config" to xl to print the
> hypervisor .config file.
> 
> To be able to reuse already existing decompressing code in libxenguest
> xc_inflate_buffer() has to be moved to libxenguest.h.
> 
> Signed-off-by: Juergen Gross 

Acked-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 1/2] xen: add interface for obtaining .config from hypervisor

2019-03-14 Thread Wei Liu
On Thu, Mar 14, 2019 at 12:59:36PM +0100, Juergen Gross wrote:
> Add a sysctl interface for obtaining the .config file used to build
> the hypervisor. The mechanism is inspired by the Linux kernel's one.
> 
> Signed-off-by: Juergen Gross 
> Reviewed-by: Jan Beulich  (apart from XSM changes)

Reviewed-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading

2019-03-14 Thread Andrew Cooper
On 13/03/2019 08:02, Jan Beulich wrote:
 On 13.03.19 at 08:54,  wrote:
> On 13.03.19 at 06:02,  wrote:
>>> On Tue, Mar 12, 2019 at 05:07:51PM -0700, Raj, Ashok wrote:
 On Mon, Mar 11, 2019 at 03:57:35PM +0800, Chao Gao wrote:
> +if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> +ret = microcode_update_cpu();
 Does ret have any useful things on if the update failed? Doesn't seem 
 to be used before you overwrite later in collect_cpu_info()?
>>> It has the reason of failure on error. Actually, there are two reasons:
>>> one is no patch of newer revision, the other is we tried to update but
>>> the microcode revision didn't change. I can check this return value and
>>> print more informative message to admin. And furthermore, for the
>>> latter, we can remove the ucode patch from caches to address Roger's
>>> concern expressed in comments to patch 4 & 5.
>> Btw, I'm not sure removing such ucode from the cache is appropriate:
>> It may well apply elsewhere, unless there's a clear indication that the
>> blob is broken. So perhaps there needs to be special casing of -EIO,
>> which gets returned when the ucode rev reported by the CPU after
>> the update does not match expectations.
> And to go one step further, perhaps we should also store more than
> just the newest variant for a given pf. If the newest fails to apply
> but there is another one newer than what's on a CPU, updating to
> that may work, and once that intermediate update worked, the
> update to the newest version may then work too.

I don't think this is sensible.

Running with mismatched microcode is unsupported (for a suitable
definition of mismatched).

At boot, there will be 1 individual blob which is applicable to the
CPUs, and gets loaded on all APs.  (Possibly more than 1 blob, but
remember that only multi-socket servers and top end workstations have a
chance of having mixed steppings in the first place.)

During late load, we will have 1 (or more) blobs provided in the late
hypercall, and we will apply in a logically-atomic fashion in a
rendezvous'd context.

There are a few outcomes from this action.  If the ucode application
fails internally, the system is already lost and will crash.  This is an
inherent risk which people doing late loading need to be aware of (and
why test workloads exist in production setups).

If some cores accept the update but others don't, then we've also got
serious problems and probably system instability.  This is the kind of
problem which needs to be detected during testing, and may require a
change in application strategy, or may in practice prevent a particular
from being declared safe to late load.

The expected case is that all cores accept the blob during the rendezvous. 

As a result, I don't see any need to store more than a single ucode
version (whether this is a single blob or perhaps a set of closely
related blobs for a mixed-stepping system) in the steady state (to cope
with AP boot, suspend/resume, or CPU hotplug), and a second version
which is the proposed-new ucode for late loading.

On late load failure, we should dump enough information to work out
exactly what went on, to determine how best to proceed, but the server
is effectively lost to us.  On late load success, the proposed new
"version" replaces the current "version".

And again - I reiterate the point that I think it is fine to have a
simplifying assumption that we don't have mixed stepping systems to
start with, presuming this is generally in line with Intel's support
statement.  If in practice we find mixed stepping systems which are
supported by an OEM/Intel, we can see about extending the logic.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] L1TF Patch Series v10

2019-03-14 Thread Norbert Manthey
Dear all,

This patch series attempts to mitigate the issue that have been raised in the
XSA-289 (https://xenbits.xen.org/xsa/advisory-289.html). To block speculative
execution on Intel hardware, an lfence instruction is required to make sure
that selected checks are not bypassed. Speculative out-of-bound accesses can
be prevented by using the array_index_nospec macro.

The major change compared to version 9 is patch 8/8, which slipped through the
analysis in the first rounds.

Best,
Norbert




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH L1TF v10 1/8] spec: add l1tf-barrier

2019-03-14 Thread Norbert Manthey
To control the runtime behavior on L1TF vulnerable platforms better, the
command line option l1tf-barrier is introduced. This option controls
whether on vulnerable x86 platforms the lfence instruction is used to
prevent speculative execution from bypassing the evaluation of
conditionals that are protected with the evaluate_nospec macro.

By now, Xen is capable of identifying L1TF vulnerable hardware. However,
this information cannot be used for alternative patching, as a CPU feature
is required. To control alternative patching with the command line option,
a new x86 feature "X86_FEATURE_SC_L1TF_VULN" is introduced. This feature
is used to patch the lfence instruction into the arch_barrier_nospec_true
function. The feature is enabled only if L1TF vulnerable hardware is
detected and the command line option does not prevent using this feature.

The status of hyperthreading is considered when automatically enabling
adding the lfence instruction. Since platforms without hyperthreading can
still be vulnerable to L1TF in case the L1 cache is not flushed properly,
the additional lfence instructions are patched in if either hyperthreading
is enabled, or L1 cache flushing is missing.

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey 
Reviewed-by: Jan Beulich 

---
 docs/misc/xen-command-line.pandoc | 14 ++
 xen/arch/x86/spec_ctrl.c  | 17 +++--
 xen/include/asm-x86/cpufeatures.h |  1 +
 xen/include/asm-x86/spec_ctrl.h   |  1 +
 4 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -483,9 +483,9 @@ accounting for hardware capabilities as enumerated via CPUID.
 
 Currently accepted:
 
-The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`,
-`l1d-flush` and `ssbd` are used by default if available and applicable.  They can
-be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and
+The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`, `l1d-flush`,
+`l1tf-barrier` and `ssbd` are used by default if available and applicable.  They
+can be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and
 won't offer them to guests.
 
 ### cpuid_mask_cpu
@@ -1902,7 +1902,7 @@ By default SSBD will be mitigated at runtime (i.e `ssbd=runtime`).
 ### spec-ctrl (x86)
 > `= List of [ , xen=, {pv,hvm,msr-sc,rsb}=,
 >  bti-thunk=retpoline|lfence|jmp, {ibrs,ibpb,ssbd,eager-fpu,
->  l1d-flush}= ]`
+>  l1d-flush,l1tf-barrier}= ]`
 
 Controls for speculative execution sidechannel mitigations.  By default, Xen
 will pick the most appropriate mitigations based on compiled in support,
@@ -1968,6 +1968,12 @@ Irrespective of Xen's setting, the feature is virtualised for HVM guests to
 use.  By default, Xen will enable this mitigation on hardware believed to be
 vulnerable to L1TF.
 
+On hardware vulnerable to L1TF, the `l1tf-barrier=` option can be used to force
+or prevent Xen from protecting evaluations inside the hypervisor with a barrier
+instruction to not load potentially secret information into L1 cache.  By
+default, Xen will enable this mitigation on hardware believed to be vulnerable
+to L1TF.
+
 ### sync_console
 > `= `
 
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -50,6 +51,7 @@ bool __read_mostly opt_ibpb = true;
 bool __read_mostly opt_ssbd = false;
 int8_t __read_mostly opt_eager_fpu = -1;
 int8_t __read_mostly opt_l1d_flush = -1;
+int8_t __read_mostly opt_l1tf_barrier = -1;
 
 bool __initdata bsp_delay_spec_ctrl;
 uint8_t __read_mostly default_xen_spec_ctrl;
@@ -91,6 +93,8 @@ static int __init parse_spec_ctrl(const char *s)
 if ( opt_pv_l1tf_domu < 0 )
 opt_pv_l1tf_domu = 0;
 
+opt_l1tf_barrier = 0;
+
 disable_common:
 opt_rsb_pv = false;
 opt_rsb_hvm = false;
@@ -157,6 +161,8 @@ static int __init parse_spec_ctrl(const char *s)
 opt_eager_fpu = val;
 else if ( (val = parse_boolean("l1d-flush", s, ss)) >= 0 )
 opt_l1d_flush = val;
+else if ( (val = parse_boolean("l1tf-barrier", s, ss)) >= 0 )
+opt_l1tf_barrier = val;
 else
 rc = -EINVAL;
 
@@ -248,7 +254,7 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps)
"\n");
 
 /* Settings for Xen's protection, irrespective of guests. */
-printk("  Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s\n",
+printk("  Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s%s\n",
thunk == THUNK_NONE  ? "N/A" :
thunk == THUNK_RETPOLINE ? "RETPOLINE" :
thunk == THUNK_LFENCE

[Xen-devel] [PATCH L1TF v10 2/8] nospec: introduce evaluate_nospec

2019-03-14 Thread Norbert Manthey
Since the L1TF vulnerability of Intel CPUs, loading hypervisor data into
L1 cache is problematic, because when hyperthreading is used as well, a
guest running on the sibling core can leak this potentially secret data.

To prevent these speculative accesses, we block speculation after
accessing the domain property field by adding lfence instructions. This
way, the CPU continues executing and loading data only once the condition
is actually evaluated.

As this protection is typically used in if statements, the lfence has to
come in a compatible way. Therefore, a function that returns true after an
lfence instruction is introduced. To protect both branches after a
conditional, an lfence instruction has to be added for the two branches.
To be able to block speculation after several evaluations, the generic
barrier macro block_speculation is also introduced.

As the L1TF vulnerability is only present on the x86 architecture, there is
no need to add protection for other architectures. Hence, the introduced
functions are defined but empty.

On the x86 architecture, by default, the lfence instruction is not present
either. Only when a L1TF vulnerable platform is detected, the lfence
instruction is patched in via alternative patching. Similarly, PV guests
are protected wrt L1TF by default, so that the protection is furthermore
disabled in case HVM is excluded via the build configuration.

Introducing the lfence instructions catches a lot of potential leaks with
a simple unintrusive code change. During performance testing, we did not
notice performance effects.

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey 
Acked-by: Julien Grall 
---
 xen/include/asm-arm/nospec.h | 25 +
 xen/include/asm-x86/nospec.h | 39 +++
 xen/include/xen/nospec.h |  1 +
 3 files changed, 65 insertions(+)
 create mode 100644 xen/include/asm-arm/nospec.h
 create mode 100644 xen/include/asm-x86/nospec.h

diff --git a/xen/include/asm-arm/nospec.h b/xen/include/asm-arm/nospec.h
new file mode 100644
--- /dev/null
+++ b/xen/include/asm-arm/nospec.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. */
+
+#ifndef _ASM_ARM_NOSPEC_H
+#define _ASM_ARM_NOSPEC_H
+
+static inline bool evaluate_nospec(bool condition)
+{
+return condition;
+}
+
+static inline void block_speculation(void)
+{
+}
+
+#endif /* _ASM_ARM_NOSPEC_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/nospec.h b/xen/include/asm-x86/nospec.h
new file mode 100644
--- /dev/null
+++ b/xen/include/asm-x86/nospec.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. */
+
+#ifndef _ASM_X86_NOSPEC_H
+#define _ASM_X86_NOSPEC_H
+
+#include 
+
+/* Allow to insert a read memory barrier into conditionals */
+static always_inline bool barrier_nospec_true(void)
+{
+#ifdef CONFIG_HVM
+alternative("", "lfence", X86_FEATURE_SC_L1TF_VULN);
+#endif
+return true;
+}
+
+/* Allow to protect evaluation of conditionals with respect to speculation */
+static always_inline bool evaluate_nospec(bool condition)
+{
+return condition ? barrier_nospec_true() : !barrier_nospec_true();
+}
+
+/* Allow to block speculative execution in generic code */
+static always_inline void block_speculation(void)
+{
+barrier_nospec_true();
+}
+
+#endif /* _ASM_X86_NOSPEC_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
--- a/xen/include/xen/nospec.h
+++ b/xen/include/xen/nospec.h
@@ -8,6 +8,7 @@
 #define XEN_NOSPEC_H
 
 #include 
+#include 
 
 /**
  * array_index_mask_nospec() - generate a ~0 mask when index < size, 0 
otherwise
-- 
2.7.4




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH L1TF v10 3/8] is_control_domain: block speculation

2019-03-14 Thread Norbert Manthey
Checks of domain properties, such as is_hardware_domain or is_hvm_domain,
might be bypassed by speculatively executing these instructions. A reason
for bypassing these checks is that these macros access the domain
structure via a pointer, and check a certain field. Since this memory
access is slow, the CPU assumes a returned value and continues the
execution.

In case an is_control_domain check is bypassed, for example during a
hypercall, data that should only be accessible by the control domain could
be loaded into the cache.

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey 
Acked-by: Jan Beulich 

---
 xen/include/xen/sched.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -913,10 +913,10 @@ void watchdog_domain_destroy(struct domain *d);
  *(that is, this would not be suitable for a driver domain)
  *  - There is never a reason to deny the hardware domain access to this
  */
-#define is_hardware_domain(_d) ((_d) == hardware_domain)
+#define is_hardware_domain(_d) evaluate_nospec((_d) == hardware_domain)
 
 /* This check is for functionality specific to a control domain */
-#define is_control_domain(_d) ((_d)->is_privileged)
+#define is_control_domain(_d) evaluate_nospec((_d)->is_privileged)
 
 #define VM_ASSIST(d, t) (test_bit(VMASST_TYPE_ ## t, &(d)->vm_assist))
 
-- 
2.7.4




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH L1TF v10 4/8] is_hvm/pv_domain: block speculation

2019-03-14 Thread Norbert Manthey
When checking for being an hvm domain, or PV domain, we have to make
sure that speculation cannot bypass that check, and eventually access
data that should not end up in cache for the current domain type.

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey 
Acked-by: Jan Beulich 

---
 xen/include/xen/sched.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -922,7 +922,8 @@ void watchdog_domain_destroy(struct domain *d);
 
 static inline bool is_pv_domain(const struct domain *d)
 {
-return IS_ENABLED(CONFIG_PV) ? d->guest_type == guest_type_pv : false;
+return IS_ENABLED(CONFIG_PV)
+   ? evaluate_nospec(d->guest_type == guest_type_pv) : false;
 }
 
 static inline bool is_pv_vcpu(const struct vcpu *v)
@@ -953,7 +954,8 @@ static inline bool is_pv_64bit_vcpu(const struct vcpu *v)
 #endif
 static inline bool is_hvm_domain(const struct domain *d)
 {
-return IS_ENABLED(CONFIG_HVM) ? d->guest_type == guest_type_hvm : false;
+return IS_ENABLED(CONFIG_HVM)
+   ? evaluate_nospec(d->guest_type == guest_type_hvm) : false;
 }
 
 static inline bool is_hvm_vcpu(const struct vcpu *v)
-- 
2.7.4




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH L1TF v10 7/8] common/grant_table: block speculative out-of-bound accesses

2019-03-14 Thread Norbert Manthey
Guests can issue grant table operations and provide guest controlled
data to them. This data is also used for memory loads. To avoid
speculative out-of-bound accesses, we use the array_index_nospec macro
where applicable. However, there are also memory accesses that cannot
be protected by a single array protection, or multiple accesses in a
row. To protect these, a nospec barrier is placed between the actual
range check and the access via the block_speculation macro.

Speculative execution is not blocked in case one of the following
properties is true:
 - path cannot be triggered by the guest
 - path does not return to the guest
 - path does not result in an out-of-bound access
 - path cannot be executed repeatedly
Only the combination of the above properties allows to actually leak
continuous chunks of memory. Therefore, we only add the penalty of
protective mechanisms in case a potential speculative out-of-bound
access matches all the above properties.

As different versions of grant tables use structures of different size,
and the status is encoded in an array for version 2, speculative
execution might perform out-of-bound accesses of version 2 while
the table is actually using version 1. Hence, speculation is prevented
when accessing new memory based on the grant table version. In cases,
where no different memory locations are accessed on the code path that
follow an if statement, no protection is required. No different memory
locations are accessed in the following functions after a version check:

 * _set_status, as the header memory layout is the same
 * unmap_common, as potentially touched memory locations are allocated
 and initialized
 * gnttab_grow_table, as the touched memory is the same for each
branch after the conditionals
 * gnttab_transfer, as no memory access depends on the conditional
 * release_grant_for_copy, as no out-of-bound access depends on this
conditional
 * gnttab_set_version, as in case of a version change all the memory is
touched in both cases
 * gnttab_release_mappings, as this function is called only during domain
destruction and control is not returned to the guest
 * mem_sharing_gref_to_gfn, as potential dangerous memory accesses are
covered by the next evaluate_nospec
 * gnttab_get_status_frame, as the potential dangerous memory accesses
are protected in gnttab_get_status_frame_mfn
 * gnttab_usage_print, as this function cannot be triggered by the guest

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey 

---

Notes:
  v10: extended commit message with explanation when to exclude comparisons
   minor change in gnttab_transfer due to rebase

 xen/common/grant_table.c | 97 +---
 1 file changed, 75 insertions(+), 22 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -203,8 +204,9 @@ static inline unsigned int nr_status_frames(const struct grant_table *gt)
 }
 
 #define MAPTRACK_PER_PAGE (PAGE_SIZE / sizeof(struct grant_mapping))
-#define maptrack_entry(t, e) \
-((t)->maptrack[(e)/MAPTRACK_PER_PAGE][(e)%MAPTRACK_PER_PAGE])
+#define maptrack_entry(t, e)                                             \
+    ((t)->maptrack[array_index_nospec(e, (t)->maptrack_limit) /          \
+                   MAPTRACK_PER_PAGE][(e) % MAPTRACK_PER_PAGE])
 
 static inline unsigned int
 nr_maptrack_frames(struct grant_table *t)
@@ -226,10 +228,23 @@ nr_maptrack_frames(struct grant_table *t)
 static grant_entry_header_t *
 shared_entry_header(struct grant_table *t, grant_ref_t ref)
 {
-if ( t->gt_version == 1 )
+switch ( t->gt_version )
+{
+case 1:
+/* Returned values should be independent of speculative execution */
+block_speculation();
 return (grant_entry_header_t*)&shared_entry_v1(t, ref);
-else
+
+case 2:
+/* Returned values should be independent of speculative execution */
+block_speculation();
 return &shared_entry_v2(t, ref).hdr;
+}
+
+ASSERT_UNREACHABLE();
+block_speculation();
+
+return NULL;
 }
 
 /* Active grant entry - used for shadowing GTF_permit_access grants. */
@@ -634,14 +649,24 @@ static unsigned int nr_grant_entries(struct grant_table *gt)
 case 1:
 BUILD_BUG_ON(f2e(INITIAL_NR_GRANT_FRAMES, 1) <
  GNTTAB_NR_RESERVED_ENTRIES);
+
+/* Make sure we return a value independently of speculative execution */
+block_speculation();
 return f2e(nr_grant_frames(gt), 1);
+
 case 2:
 BUILD_BUG_ON(f2e(INITIAL_NR_GRANT_FRAMES, 2) <
  GNTTAB_NR_RESERVED_ENTRIES);
+
+/* Make sure we return a value independen

[Xen-devel] [PATCH L1TF v10 6/8] x86/hvm: add nospec to hvmop param

2019-03-14 Thread Norbert Manthey
The params array in HVM can be accessed with get and set functions.
As the index is guest controlled, make sure no out-of-bound accesses
can be performed.

As we cannot influence how future compilers might modify the
instructions that enforce the bounds, we furthermore block speculation,
so that the update is visible in the architectural state.

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey 
Acked-by: Jan Beulich 

---
 xen/arch/x86/hvm/hvm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4135,6 +4135,9 @@ static int hvmop_set_param(
 if ( a.index >= HVM_NR_PARAMS )
 return -EINVAL;
 
+/* Make sure the above bound check is not bypassed during speculation. */
+block_speculation();
+
 d = rcu_lock_domain_by_any_id(a.domid);
 if ( d == NULL )
 return -ESRCH;
@@ -4401,6 +4404,9 @@ static int hvmop_get_param(
 if ( a.index >= HVM_NR_PARAMS )
 return -EINVAL;
 
+/* Make sure the above bound check is not bypassed during speculation. */
+block_speculation();
+
 d = rcu_lock_domain_by_any_id(a.domid);
 if ( d == NULL )
 return -ESRCH;
-- 
2.7.4




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH L1TF v10 5/8] common/memory: block speculative out-of-bound accesses

2019-03-14 Thread Norbert Manthey
The get_page_from_gfn method returns a pointer to a page that belongs
to a gfn. Before returning the pointer, the gfn is checked for being
valid. Under speculation, these checks can be bypassed, so that
the function get_page is still executed partially. Consequently, the
function page_get_owner_and_reference might be executed partially as
well. In this function, the computed pointer is accessed, resulting in
a speculative out-of-bound address load. As the gfn can be controlled by
a guest, this access is problematic.

To mitigate the root cause, an lfence instruction is added via the
evaluate_nospec macro. To make the protection generic, we do not
introduce the lfence instruction for this single check, but add it to
the mfn_valid function. This way, other potentially problematic accesses
are protected as well.

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey 
Acked-by: Jan Beulich 

---
 xen/common/pdx.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/common/pdx.c b/xen/common/pdx.c
--- a/xen/common/pdx.c
+++ b/xen/common/pdx.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Parameters for PFN/MADDR compression. */
 unsigned long __read_mostly max_pdx;
@@ -33,8 +34,9 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
 
 bool __mfn_valid(unsigned long mfn)
 {
-return likely(mfn < max_page) &&
-   likely(!(mfn & pfn_hole_mask)) &&
+if ( unlikely(evaluate_nospec(mfn >= max_page)) )
+return false;
+return likely(!(mfn & pfn_hole_mask)) &&
likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
pdx_group_valid));
 }
-- 
2.7.4





[Xen-devel] [PATCH L1TF v10 8/8] common/domain: block speculative out-of-bound accesses

2019-03-14 Thread Norbert Manthey
When issuing a vcpu_op hypercall, guests have control over the
vcpuid variable. In the old code, this allowed guests to perform
speculative out-of-bound accesses. To block this, we make use
of the domain_vcpu function.

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey 
---
 xen/common/domain.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1365,7 +1365,7 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, 
XEN_GUEST_HANDLE_PARAM(void) arg)
 struct vcpu *v;
 long rc = 0;
 
-if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL )
+if ( (v = domain_vcpu(d, vcpuid)) == NULL )
 return -ENOENT;
 
 switch ( cmd )
-- 
2.7.4





Re: [Xen-devel] [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading

2019-03-14 Thread Chao Gao
On Wed, Mar 13, 2019 at 02:02:55AM -0600, Jan Beulich wrote:
 On 13.03.19 at 08:54,  wrote:
> On 13.03.19 at 06:02,  wrote:
>>> On Tue, Mar 12, 2019 at 05:07:51PM -0700, Raj, Ashok wrote:
On Mon, Mar 11, 2019 at 03:57:35PM +0800, Chao Gao wrote:
> +if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> +ret = microcode_update_cpu();

Does ret have any useful things on if the update failed? Doesn't seem 
to be used before you overwrite later in collect_cpu_info()?
>>> 
>>> It has the reason of failure on error. Actually, there are two reasons:
>>> one is no patch of newer revision, the other is we tried to update but
>>> the microcode revision didn't change. I can check this return value and
>>> print more informative message to admin. And furthermore, for the
>>> latter, we can remove the ucode patch from caches to address Roger's
>>> concern expressed in comments to patch 4 & 5.
>> 
>> Btw, I'm not sure removing such ucode from the cache is appropriate:
>> It may well apply elsewhere, unless there's a clear indication that the
>> blob is broken.

Yes. Got it. Can we just assume we won't encounter the case where a ucode
update succeeded only on part of the CPUs, and warn that a reboot is needed
if it happens? We definitely want to tolerate some kinds of hardware
misbehavior. But for such cases, which are unlikely to happen, I prefer to
improve this code when we meet this kind of issue.

>> So perhaps there needs to be special casing of -EIO,
>> which gets returned when the ucode rev reported by the CPU after
>> the update does not match expectations.
>
>And to go one step further, perhaps we should also store more than
>just the newest variant for a given pf. If the newest fails to apply
>but there is another one newer than what's on a CPU, updating to
>that may work, and once that intermediate update worked, the
>update to the newest version may then work too.

The Intel SDM doesn't mention this dependency (that applying a ucode update
relies on a specific older ucode having been applied). Perhaps we can also
assume we won't fall into this case.

Hi Ashok,

Do you know whether Intel's ucode update mechanism has such dependency?

Thanks
Chao



[Xen-devel] [PATCH v3 2/4] x86: move the saved value of MSR_IA32_XSS into struct vcpu_msrs

2019-03-14 Thread Paul Durrant
Currently the value is saved directly in struct hvm_vcpu. This patch simply
co-locates it with other saved MSR values. No functional change.

Signed-off-by: Paul Durrant 
Reviewed-by: Jan Beulich 
Reviewed-by: Kevin Tian 
---
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 
Cc: Jun Nakajima 
---
 xen/arch/x86/domain.c  | 2 +-
 xen/arch/x86/hvm/hvm.c | 4 ++--
 xen/arch/x86/hvm/vmx/vmx.c | 4 ++--
 xen/include/asm-x86/hvm/vcpu.h | 1 -
 xen/include/asm-x86/msr.h  | 5 +
 5 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 8d579e2cf9..aa38555736 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1653,7 +1653,7 @@ static void __context_switch(void)
 BUG();
 
 if ( cpu_has_xsaves && is_hvm_vcpu(n) )
-set_msr_xss(n->arch.hvm.msr_xss);
+set_msr_xss(n->arch.msrs->xss.raw);
 }
 vcpu_restore_fpu_nonlazy(n, false);
 nd->arch.ctxt_switch->to(n);
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index e566d83f8b..dff590e658 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3466,7 +3466,7 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t 
*msr_content)
 case MSR_IA32_XSS:
 if ( !d->arch.cpuid->xstate.xsaves )
 goto gp_fault;
-*msr_content = v->arch.hvm.msr_xss;
+*msr_content = v->arch.msrs->xss.raw;
 break;
 
 case MSR_K8_ENABLE_C1E:
@@ -3612,7 +3612,7 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t 
msr_content,
 /* No XSS features currently supported for guests. */
 if ( !d->arch.cpuid->xstate.xsaves || msr_content != 0 )
 goto gp_fault;
-v->arch.hvm.msr_xss = msr_content;
+v->arch.msrs->xss.raw = msr_content;
 break;
 
 case MSR_AMD64_NB_CFG:
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 6445f44c3b..4941924cf6 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -807,7 +807,7 @@ static void vmx_save_msr(struct vcpu *v, struct hvm_msr 
*ctxt)
 {
 if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
 {
-ctxt->msr[ctxt->count].val = v->arch.hvm.msr_xss;
+ctxt->msr[ctxt->count].val = v->arch.msrs->xss.raw;
 if ( ctxt->msr[ctxt->count].val )
 ctxt->msr[ctxt->count++].index = MSR_IA32_XSS;
 }
@@ -826,7 +826,7 @@ static int vmx_load_msr(struct vcpu *v, struct hvm_msr 
*ctxt)
 {
 case MSR_IA32_XSS:
 if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
-v->arch.hvm.msr_xss = ctxt->msr[i].val;
+v->arch.msrs->xss.raw = ctxt->msr[i].val;
 else
 err = -ENXIO;
 break;
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 6c84d5a5a6..5563d28a4e 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -176,7 +176,6 @@ struct hvm_vcpu {
 struct hvm_vcpu_asid n1asid;
 
 u64 msr_tsc_adjust;
-u64 msr_xss;
 
 union {
 struct vmx_vcpu vmx;
diff --git a/xen/include/asm-x86/msr.h b/xen/include/asm-x86/msr.h
index a7244793bf..0d52c085f6 100644
--- a/xen/include/asm-x86/msr.h
+++ b/xen/include/asm-x86/msr.h
@@ -313,6 +313,11 @@ struct vcpu_msrs
  * values here may be stale in current context.
  */
 uint32_t dr_mask[4];
+
+/* 0x0da0 - MSR_IA32_XSS */
+struct {
+uint64_t raw;
+} xss;
 };
 
 void init_guest_msr_policy(void);
-- 
2.20.1



[Xen-devel] [PATCH v3 1/4] x86: stop handling MSR_IA32_BNDCFGS save/restore in implementation code

2019-03-14 Thread Paul Durrant
Saving and restoring the value of this MSR is currently handled by
implementation-specific code despite it being architectural. This patch
moves handling of accesses to this MSR from hvm.c into the msr.c, thus
allowing the common MSR save/restore code to handle it.

NOTE: Because vmx_get/set_guest_bndcfgs() call vmx_vmcs_enter(), the
  struct vcpu pointer passed in, and hence the vcpu pointer passed to
  guest_rdmsr() cannot be const.

Signed-off-by: Paul Durrant 
---
Cc: Kevin Tian 
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 
Cc: Jun Nakajima 

v3:
 - Address further comments from Jan
 - Dropped Kevin's R-b because of change to vmx.c

v2:
 - Addressed comments from Jan by largely removing hunks
 - Keeping Kevin's R-b since remaining hunks in vmx.c are as before
---
 xen/arch/x86/hvm/hvm.c| 14 ++
 xen/arch/x86/hvm/vmx/vmx.c| 26 ++
 xen/arch/x86/msr.c| 12 
 xen/include/asm-x86/hvm/hvm.h |  4 ++--
 4 files changed, 22 insertions(+), 34 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 8adbb61b57..e566d83f8b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1303,6 +1303,7 @@ static int hvm_load_cpu_xsave_states(struct domain *d, 
hvm_domain_context_t *h)
 static const uint32_t msrs_to_send[] = {
 MSR_SPEC_CTRL,
 MSR_INTEL_MISC_FEATURES_ENABLES,
+MSR_IA32_BNDCFGS,
 MSR_AMD64_DR0_ADDRESS_MASK,
 MSR_AMD64_DR1_ADDRESS_MASK,
 MSR_AMD64_DR2_ADDRESS_MASK,
@@ -1440,6 +1441,7 @@ static int hvm_load_cpu_msrs(struct domain *d, 
hvm_domain_context_t *h)
 
 case MSR_SPEC_CTRL:
 case MSR_INTEL_MISC_FEATURES_ENABLES:
+case MSR_IA32_BNDCFGS:
 case MSR_AMD64_DR0_ADDRESS_MASK:
 case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
 rc = guest_wrmsr(v, ctxt->msr[i].index, ctxt->msr[i].val);
@@ -3467,12 +3469,6 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t 
*msr_content)
 *msr_content = v->arch.hvm.msr_xss;
 break;
 
-case MSR_IA32_BNDCFGS:
-if ( !d->arch.cpuid->feat.mpx ||
- !hvm_get_guest_bndcfgs(v, msr_content) )
-goto gp_fault;
-break;
-
 case MSR_K8_ENABLE_C1E:
 case MSR_AMD64_NB_CFG:
  /*
@@ -3619,12 +3615,6 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t 
msr_content,
 v->arch.hvm.msr_xss = msr_content;
 break;
 
-case MSR_IA32_BNDCFGS:
-if ( !d->arch.cpuid->feat.mpx ||
- !hvm_set_guest_bndcfgs(v, msr_content) )
-goto gp_fault;
-break;
-
 case MSR_AMD64_NB_CFG:
 /* ignore the write */
 break;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 725dd88c13..6445f44c3b 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -805,17 +805,6 @@ static unsigned int __init vmx_init_msr(void)
 
 static void vmx_save_msr(struct vcpu *v, struct hvm_msr *ctxt)
 {
-vmx_vmcs_enter(v);
-
-if ( cpu_has_mpx && cpu_has_vmx_mpx )
-{
-__vmread(GUEST_BNDCFGS, &ctxt->msr[ctxt->count].val);
-if ( ctxt->msr[ctxt->count].val )
-ctxt->msr[ctxt->count++].index = MSR_IA32_BNDCFGS;
-}
-
-vmx_vmcs_exit(v);
-
 if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
 {
 ctxt->msr[ctxt->count].val = v->arch.hvm.msr_xss;
@@ -835,14 +824,6 @@ static int vmx_load_msr(struct vcpu *v, struct hvm_msr 
*ctxt)
 {
 switch ( ctxt->msr[i].index )
 {
-case MSR_IA32_BNDCFGS:
-if ( cpu_has_mpx && cpu_has_vmx_mpx &&
- is_canonical_address(ctxt->msr[i].val) &&
- !(ctxt->msr[i].val & IA32_BNDCFGS_RESERVED) )
-__vmwrite(GUEST_BNDCFGS, ctxt->msr[i].val);
-else if ( ctxt->msr[i].val )
-err = -ENXIO;
-break;
 case MSR_IA32_XSS:
 if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
 v->arch.hvm.msr_xss = ctxt->msr[i].val;
@@ -1215,10 +1196,15 @@ static bool vmx_set_guest_bndcfgs(struct vcpu *v, u64 
val)
 return true;
 }
 
-static bool vmx_get_guest_bndcfgs(struct vcpu *v, u64 *val)
+static bool vmx_get_guest_bndcfgs(const struct vcpu *cv, u64 *val)
 {
+struct vcpu *v;
+
 ASSERT(cpu_has_mpx && cpu_has_vmx_mpx);
 
+/* Get a non-const pointer for vmx_vmcs_enter() */
+v = cv->domain->vcpu[cv->vcpu_id];
+
 vmx_vmcs_enter(v);
 __vmread(GUEST_BNDCFGS, val);
 vmx_vmcs_exit(v);
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 4df4a59f4d..363fd63096 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -160,6 +160,12 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
uint64_t *val)
 ret = guest_rdmsr_x2apic(v, msr, val);
 break;
 
+case MSR_IA32_BNDCFGS:
+if ( !cp->feat.mpx || !hvm_get_guest_bndcfgs(v, val) )
+goto gp_fault;
+
+   

[Xen-devel] [PATCH v3 3/4] x86: stop handling MSR_IA32_XSS save/restore in implementation code

2019-03-14 Thread Paul Durrant
Saving and restoring the value of this MSR is currently handled by
implementation-specific code despite it being architectural. This patch
moves handling of accesses to this MSR from hvm.c into the msr.c, thus
allowing the common MSR save/restore code to handle it.

This patch also adds proper checks of CPUID policy in the new get/set code.

NOTE: MSR_IA32_XSS is the last MSR to be saved and restored by
  implementation-specific code. This patch therefore removes the
  (VMX) definitions and of the init_msr(), save_msr() and
  load_msr() hvm_funcs, as they are no longer necessary. The
  declarations of and calls to those hvm_funcs will be cleaned up
  by a subsequent patch.

Signed-off-by: Paul Durrant 
Reviewed-by: Jan Beulich 
Reviewed-by: Kevin Tian 
---
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 
Cc: Jun Nakajima 
---
 xen/arch/x86/hvm/hvm.c | 15 ++--
 xen/arch/x86/hvm/vmx/vmx.c | 49 --
 xen/arch/x86/msr.c | 18 ++
 3 files changed, 20 insertions(+), 62 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index dff590e658..deb7fb2adb 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1304,6 +1304,7 @@ static const uint32_t msrs_to_send[] = {
 MSR_SPEC_CTRL,
 MSR_INTEL_MISC_FEATURES_ENABLES,
 MSR_IA32_BNDCFGS,
+MSR_IA32_XSS,
 MSR_AMD64_DR0_ADDRESS_MASK,
 MSR_AMD64_DR1_ADDRESS_MASK,
 MSR_AMD64_DR2_ADDRESS_MASK,
@@ -1442,6 +1443,7 @@ static int hvm_load_cpu_msrs(struct domain *d, 
hvm_domain_context_t *h)
 case MSR_SPEC_CTRL:
 case MSR_INTEL_MISC_FEATURES_ENABLES:
 case MSR_IA32_BNDCFGS:
+case MSR_IA32_XSS:
 case MSR_AMD64_DR0_ADDRESS_MASK:
 case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
 rc = guest_wrmsr(v, ctxt->msr[i].index, ctxt->msr[i].val);
@@ -3463,12 +3465,6 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t 
*msr_content)
 MTRRcap_VCNT))];
 break;
 
-case MSR_IA32_XSS:
-if ( !d->arch.cpuid->xstate.xsaves )
-goto gp_fault;
-*msr_content = v->arch.msrs->xss.raw;
-break;
-
 case MSR_K8_ENABLE_C1E:
 case MSR_AMD64_NB_CFG:
  /*
@@ -3608,13 +3604,6 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t 
msr_content,
 goto gp_fault;
 break;
 
-case MSR_IA32_XSS:
-/* No XSS features currently supported for guests. */
-if ( !d->arch.cpuid->xstate.xsaves || msr_content != 0 )
-goto gp_fault;
-v->arch.msrs->xss.raw = msr_content;
-break;
-
 case MSR_AMD64_NB_CFG:
 /* ignore the write */
 break;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 4941924cf6..043c4268a0 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -797,52 +797,6 @@ static int vmx_load_vmcs_ctxt(struct vcpu *v, struct 
hvm_hw_cpu *ctxt)
 return 0;
 }
 
-static unsigned int __init vmx_init_msr(void)
-{
-return (cpu_has_mpx && cpu_has_vmx_mpx) +
-   (cpu_has_xsaves && cpu_has_vmx_xsaves);
-}
-
-static void vmx_save_msr(struct vcpu *v, struct hvm_msr *ctxt)
-{
-if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
-{
-ctxt->msr[ctxt->count].val = v->arch.msrs->xss.raw;
-if ( ctxt->msr[ctxt->count].val )
-ctxt->msr[ctxt->count++].index = MSR_IA32_XSS;
-}
-}
-
-static int vmx_load_msr(struct vcpu *v, struct hvm_msr *ctxt)
-{
-unsigned int i;
-int err = 0;
-
-vmx_vmcs_enter(v);
-
-for ( i = 0; i < ctxt->count; ++i )
-{
-switch ( ctxt->msr[i].index )
-{
-case MSR_IA32_XSS:
-if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
-v->arch.msrs->xss.raw = ctxt->msr[i].val;
-else
-err = -ENXIO;
-break;
-default:
-continue;
-}
-if ( err )
-break;
-ctxt->msr[i]._rsvd = 1;
-}
-
-vmx_vmcs_exit(v);
-
-return err;
-}
-
 static void vmx_fpu_enter(struct vcpu *v)
 {
 vcpu_restore_fpu_lazy(v);
@@ -2284,9 +2238,6 @@ static struct hvm_function_table __initdata 
vmx_function_table = {
 .vcpu_destroy = vmx_vcpu_destroy,
 .save_cpu_ctxt= vmx_save_vmcs_ctxt,
 .load_cpu_ctxt= vmx_load_vmcs_ctxt,
-.init_msr = vmx_init_msr,
-.save_msr = vmx_save_msr,
-.load_msr = vmx_load_msr,
 .get_interrupt_shadow = vmx_get_interrupt_shadow,
 .set_interrupt_shadow = vmx_set_interrupt_shadow,
 .guest_x86_mode   = vmx_guest_x86_mode,
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 363fd63096..cfa49d7a63 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -166,6 +166,13 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
uint64_t *val)
 
 break;
 

[Xen-devel] [PATCH v3 4/4] x86: remove defunct init/load/save_msr() hvm_funcs

2019-03-14 Thread Paul Durrant
These hvm_funcs are no longer required since no MSR values are saved or
restored by implementation-specific code.

Signed-off-by: Paul Durrant 
Reviewed-by: Jan Beulich 
---
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 

v2:
 - Re-instate err check on loop in hvm_load_cpu_msrs()
---
 xen/arch/x86/hvm/hvm.c| 29 +
 xen/include/asm-x86/hvm/hvm.h |  4 
 2 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index deb7fb2adb..f84bf11f0f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1310,7 +1310,6 @@ static const uint32_t msrs_to_send[] = {
 MSR_AMD64_DR2_ADDRESS_MASK,
 MSR_AMD64_DR3_ADDRESS_MASK,
 };
-static unsigned int __read_mostly msr_count_max = ARRAY_SIZE(msrs_to_send);
 
 static int hvm_save_cpu_msrs(struct vcpu *v, hvm_domain_context_t *h)
 {
@@ -1320,7 +1319,7 @@ static int hvm_save_cpu_msrs(struct vcpu *v, 
hvm_domain_context_t *h)
 int err;
 
 err = _hvm_init_entry(h, CPU_MSR_CODE, v->vcpu_id,
- HVM_CPU_MSR_SIZE(msr_count_max));
+ HVM_CPU_MSR_SIZE(ARRAY_SIZE(msrs_to_send)));
 if ( err )
 return err;
 ctxt = (struct hvm_msr *)&h->data[h->cur];
@@ -1353,10 +1352,7 @@ static int hvm_save_cpu_msrs(struct vcpu *v, 
hvm_domain_context_t *h)
 ctxt->msr[ctxt->count++].val = val;
 }
 
-if ( hvm_funcs.save_msr )
-hvm_funcs.save_msr(v, ctxt);
-
-ASSERT(ctxt->count <= msr_count_max);
+ASSERT(ctxt->count <= ARRAY_SIZE(msrs_to_send));
 
 for ( i = 0; i < ctxt->count; ++i )
 ctxt->msr[i]._rsvd = 0;
@@ -1431,9 +1427,6 @@ static int hvm_load_cpu_msrs(struct domain *d, 
hvm_domain_context_t *h)
 return -EOPNOTSUPP;
 /* Checking finished */
 
-if ( hvm_funcs.load_msr )
-err = hvm_funcs.load_msr(v, ctxt);
-
 for ( i = 0; !err && i < ctxt->count; ++i )
 {
 switch ( ctxt->msr[i].index )
@@ -1475,17 +1468,13 @@ static int __init 
hvm_register_CPU_save_and_restore(void)
 sizeof(struct hvm_save_descriptor),
 HVMSR_PER_VCPU);
 
-if ( hvm_funcs.init_msr )
-msr_count_max += hvm_funcs.init_msr();
-
-if ( msr_count_max )
-hvm_register_savevm(CPU_MSR_CODE,
-"CPU_MSR",
-hvm_save_cpu_msrs,
-hvm_load_cpu_msrs,
-HVM_CPU_MSR_SIZE(msr_count_max) +
-sizeof(struct hvm_save_descriptor),
-HVMSR_PER_VCPU);
+hvm_register_savevm(CPU_MSR_CODE,
+"CPU_MSR",
+hvm_save_cpu_msrs,
+hvm_load_cpu_msrs,
+HVM_CPU_MSR_SIZE(ARRAY_SIZE(msrs_to_send)) +
+sizeof(struct hvm_save_descriptor),
+HVMSR_PER_VCPU);
 
 return 0;
 }
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index d41ed63232..a70fdbe298 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -115,10 +115,6 @@ struct hvm_function_table {
 void (*save_cpu_ctxt)(struct vcpu *v, struct hvm_hw_cpu *ctxt);
 int (*load_cpu_ctxt)(struct vcpu *v, struct hvm_hw_cpu *ctxt);
 
-unsigned int (*init_msr)(void);
-void (*save_msr)(struct vcpu *, struct hvm_msr *);
-int (*load_msr)(struct vcpu *, struct hvm_msr *);
-
 /* Examine specifics of the guest state. */
 unsigned int (*get_interrupt_shadow)(struct vcpu *v);
 void (*set_interrupt_shadow)(struct vcpu *v, unsigned int intr_shadow);
-- 
2.20.1



[Xen-devel] [PATCH v3 0/4] clean up MSR save/restore code

2019-03-14 Thread Paul Durrant
Paul Durrant (4):
  x86: stop handling MSR_IA32_BNDCFGS save/restore in implementation
code
  x86: move the saved value of MSR_IA32_XSS into struct vcpu_msrs
  x86: stop handling MSR_IA32_XSS save/restore in implementation code
  x86: remove defunct init/load/save_msr() hvm_funcs

 xen/arch/x86/domain.c  |  2 +-
 xen/arch/x86/hvm/hvm.c | 58 ++
 xen/arch/x86/hvm/vmx/vmx.c | 75 +++---
 xen/arch/x86/msr.c | 30 ++
 xen/include/asm-x86/hvm/hvm.h  |  8 +---
 xen/include/asm-x86/hvm/vcpu.h |  1 -
 xen/include/asm-x86/msr.h  |  5 +++
 7 files changed, 57 insertions(+), 122 deletions(-)
---
v2:
 - Drop patches #2 and #6 of the v1 series

Cc: Andrew Cooper 
Cc: Jan Beulich 
Cc: Jun Nakajima 
Cc: Kevin Tian 
Cc: "Roger Pau Monné" 
Cc: Wei Liu 
-- 
2.20.1



Re: [Xen-devel] [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading

2019-03-14 Thread Jan Beulich
>>> On 14.03.19 at 14:01,  wrote:
> On Wed, Mar 13, 2019 at 02:02:55AM -0600, Jan Beulich wrote:
> On 13.03.19 at 08:54,  wrote:
>> On 13.03.19 at 06:02,  wrote:
 On Tue, Mar 12, 2019 at 05:07:51PM -0700, Raj, Ashok wrote:
>On Mon, Mar 11, 2019 at 03:57:35PM +0800, Chao Gao wrote:
>> +if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>> +ret = microcode_update_cpu();
>
>Does ret have any useful things on if the update failed? Doesn't seem 
>to be used before you overwrite later in collect_cpu_info()?
 
 It has the reason of failure on error. Actually, there are two reasons:
 one is no patch of newer revision, the other is we tried to update but
 the microcode revision didn't change. I can check this return value and
 print more informative message to admin. And furthermore, for the
 latter, we can remove the ucode patch from caches to address Roger's
 concern expressed in comments to patch 4 & 5.
>>> 
>>> Btw, I'm not sure removing such ucode from the cache is appropriate:
>>> It may well apply elsewhere, unless there's a clear indication that the
>>> blob is broken.
> 
> Yes. Got it. Can we just assume we won't encounter that ucode update
> succeeded only on part of cpus and warn a reboot is needed if it happened?
> We definitely want to tolerate some kinds of hardware misbehavior. But
> for such cases which are unlikely to happen, I prefer to improve this code
> when we meet this kind of issue.

Okay, for both this and ...

>>> So perhaps there needs to be special casing of -EIO,
>>> which gets returned when the ucode rev reported by the CPU after
>>> the update does not match expectations.
>>
>>And to go one step further, perhaps we should also store more than
>>just the newest variant for a given pf. If the newest fails to apply
>>but there is another one newer than what's on a CPU, updating to
>>that may work, and once that intermediate update worked, the
>>update to the newest version may then work too.

... see also Andrew's reply. As long as no-one is aware of mixed-
stepping systems out there in the wild, I'm fine with simplifying
things where possible.

> Intel SDM doesn't mention this dependency (to apply an ucode relies on a
> specific old ucode applied). Perhaps we can also assume we won't fall
> into this case.

Please see Linux'es
arch/x86/kernel/cpu/microcode/intel.c:is_blacklisted()
for one of the possible problematic situations here. Of course
the SDM wouldn't mention it, only the errata document for this
specific model would.

Jan




Re: [Xen-devel] [PATCH 1/2] xen: introduce a cpumask with all bits set

2019-03-14 Thread Jan Beulich
>>> On 14.03.19 at 10:59,  wrote:
> @@ -1892,8 +1891,7 @@ static void __init check_timer(void)
>  vector = IRQ0_VECTOR;
>  clear_irq_vector(0);
>  
> -cpumask_setall(&mask_all);
> -if ((ret = bind_irq_vector(0, vector, &mask_all)))
> +if ((ret = bind_irq_vector(0, vector, &cpumask_all)))

Since you replace the users of cpumask_setall(), wouldn't it be
better to also remove that function from the header?

Jan




[Xen-devel] [PATCH] xen/netfront: Remove unneeded .resume callback

2019-03-14 Thread Oleksandr Andrushchenko
From: Oleksandr Andrushchenko 

Currently on driver resume we remove all the network queues and
destroy shared Tx/Rx rings leaving the driver in its current state
and never signaling the backend of this frontend's state change.
This leads to a number of consequences:
- when the frontend withdraws granted references to the rings etc., this
  cannot be done cleanly, as the backend still holds them (it was not
  told to free the resources)
- it is not possible to resume driver operation, as all the means of
  communication with the backend were destroyed by the frontend, thus
  making the frontend appear functional to the guest OS while it
  actually is not.

Fix this by not destroying communication channels/rings on driver
resume.

Signed-off-by: Oleksandr Andrushchenko 
---
 drivers/net/xen-netfront.c | 17 -
 1 file changed, 17 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index c914c24f880b..2ca162048da4 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1422,22 +1422,6 @@ static void xennet_disconnect_backend(struct 
netfront_info *info)
}
 }
 
-/**
- * We are reconnecting to the backend, due to a suspend/resume, or a backend
- * driver restart.  We tear down our netif structure and recreate it, but
- * leave the device-layer structures intact so that this is transparent to the
- * rest of the kernel.
- */
-static int netfront_resume(struct xenbus_device *dev)
-{
-   struct netfront_info *info = dev_get_drvdata(&dev->dev);
-
-   dev_dbg(&dev->dev, "%s\n", dev->nodename);
-
-   xennet_disconnect_backend(info);
-   return 0;
-}
-
 static int xen_net_read_mac(struct xenbus_device *dev, u8 mac[])
 {
char *s, *e, *macstr;
@@ -2185,7 +2169,6 @@ static struct xenbus_driver netfront_driver = {
.ids = netfront_ids,
.probe = netfront_probe,
.remove = xennet_remove,
-   .resume = netfront_resume,
.otherend_changed = netback_changed,
 };
 
-- 
2.21.0



Re: [Xen-devel] [PATCH L1TF v10 2/8] nospec: introduce evaluate_nospec

2019-03-14 Thread Jan Beulich
>>> On 14.03.19 at 13:50,  wrote:
> Since the L1TF vulnerability of Intel CPUs, loading hypervisor data into
> L1 cache is problematic, because when hyperthreading is used as well, a
> guest running on the sibling core can leak this potentially secret data.
> 
> To prevent these speculative accesses, we block speculation after
> accessing the domain property field by adding lfence instructions. This
> way, the CPU continues executing and loading data only once the condition
> is actually evaluated.
> 
> As this protection is typically used in if statements, the lfence has to
> come in a compatible way. Therefore, a function that returns true after an
> lfence instruction is introduced. To protect both branches after a
> conditional, an lfence instruction has to be added for the two branches.
> To be able to block speculation after several evaluations, the generic
> barrier macro block_speculation is also introduced.
> 
> As the L1TF vulnerability is only present on the x86 architecture, there is
> no need to add protection for other architectures. Hence, the introduced
> functions are defined but empty.
> 
> On the x86 architecture, by default, the lfence instruction is not present
> either. Only when a L1TF vulnerable platform is detected, the lfence
> instruction is patched in via alternative patching. Similarly, PV guests
> are protected wrt L1TF by default, so that the protection is furthermore
> disabled in case HVM is excluded via the build configuration.
> 
> Introducing the lfence instructions catches a lot of potential leaks with
> a simple unintrusive code change. During performance testing, we did not
> notice performance effects.
> 
> This is part of the speculative hardening effort.
> 
> Signed-off-by: Norbert Manthey 
> Acked-by: Julien Grall 

I did give my ack on v9, and I see no indication of changes which
may have invalidated it.

Jan
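For readers following the commit message above, the mechanism it describes can be sketched roughly as below. This is an illustrative user-space model, not Xen's implementation: the real evaluate_nospec() lives in Xen's nospec.h and only patches the lfence in via alternatives on vulnerable hardware.

```c
#include <assert.h>    /* only for the self-checks */
#include <stdbool.h>

/*
 * Sketch of the lfence-based barrier: the fence retires only once the
 * preceding condition has actually been evaluated, so execution cannot
 * proceed down either branch speculatively.
 */
static inline bool barrier_nospec_true(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__ ("lfence" ::: "memory");
#endif
    return true;
}

/* Evaluate a condition with both resulting branches fenced. */
#define evaluate_nospec(cond) \
    ((cond) ? barrier_nospec_true() : !barrier_nospec_true())

/* Generic barrier usable after several evaluations. */
#define block_speculation() ((void)barrier_nospec_true())
```

Call sites then read `if ( evaluate_nospec(is_control_domain(d)) )`, fencing both the taken and the not-taken path.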




Re: [Xen-devel] [PATCH L1TF v10 8/8] common/domain: block speculative out-of-bound accesses

2019-03-14 Thread Jan Beulich
>>> On 14.03.19 at 13:50,  wrote:
> When issuing a vcpu_op hypercall, guests have control over the
> vcpuid variable. In the old code, this allowed to perform
> speculative out-of-bound accesses. To block this, we make use
> of the domain_vcpu function.
> 
> This is part of the speculative hardening effort.
> 
> Signed-off-by: Norbert Manthey 

Reviewed-by: Jan Beulich 
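For illustration, the pattern being applied here — clamping a guest-controlled vcpuid before it indexes the vcpu array — can be modelled as follows. The mask formulation is the textbook branchless one (Linux's generic array_index_mask_nospec is equivalent); Xen's actual domain_vcpu() uses the hypervisor's own nospec helpers, and the struct layouts below are simplified stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/*
 * All-ones when index < size, 0 otherwise, computed without a branch
 * the CPU could mispredict.  Relies on arithmetic right shift of a
 * negative value, as GCC provides.
 */
static inline unsigned long
array_index_mask_nospec(unsigned long index, unsigned long size)
{
    return ~(long)(index | (size - 1UL - index)) >> (sizeof(long) * 8 - 1);
}

/* Clamp an index to 0 under speculation when it is out of range. */
#define array_index_nospec(index, size) \
    ((index) & array_index_mask_nospec(index, size))

struct vcpu { unsigned int vcpu_id; };
struct domain {
    unsigned int max_vcpus;
    struct vcpu **vcpu;
};

/*
 * Sketch of a domain_vcpu()-style accessor: NULL for out-of-range ids,
 * and an index that stays in bounds even under speculation.
 */
static struct vcpu *domain_vcpu(const struct domain *d, unsigned int id)
{
    unsigned int idx = array_index_nospec(id, d->max_vcpus);

    return id >= d->max_vcpus ? NULL : d->vcpu[idx];
}
```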




Re: [Xen-devel] [PATCH L1TF v10 2/8] nospec: introduce evaluate_nospec

2019-03-14 Thread Norbert Manthey
On 3/14/19 14:19, Jan Beulich wrote:
> >>> On 14.03.19 at 13:50,  wrote:
>> Since the L1TF vulnerability of Intel CPUs, loading hypervisor data into
>> L1 cache is problematic, because when hyperthreading is used as well, a
>> guest running on the sibling core can leak this potentially secret data.
>>
>> To prevent these speculative accesses, we block speculation after
>> accessing the domain property field by adding lfence instructions. This
>> way, the CPU continues executing and loading data only once the condition
>> is actually evaluated.
>>
>> As this protection is typically used in if statements, the lfence has to
>> come in a compatible way. Therefore, a function that returns true after an
>> lfence instruction is introduced. To protect both branches after a
>> conditional, an lfence instruction has to be added for the two branches.
>> To be able to block speculation after several evaluations, the generic
>> barrier macro block_speculation is also introduced.
>>
>> As the L1TF vulnerability is only present on the x86 architecture, there is
>> no need to add protection for other architectures. Hence, the introduced
>> functions are defined but empty.
>>
>> On the x86 architecture, by default, the lfence instruction is not present
>> either. Only when a L1TF vulnerable platform is detected, the lfence
>> instruction is patched in via alternative patching. Similarly, PV guests
>> are protected wrt L1TF by default, so that the protection is furthermore
>> disabled in case HVM is excluded via the build configuration.
>>
>> Introducing the lfence instructions catches a lot of potential leaks with
>> a simple unintrusive code change. During performance testing, we did not
>> notice performance effects.
>>
>> This is part of the speculative hardening effort.
>>
>> Signed-off-by: Norbert Manthey 
>> Acked-by: Julien Grall 
> I did give my ack on v9, and I see no indication of changes which
> may have invalidated it.

That is a miss on my side.

Norbert

>
> Jan
>
>



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B


Re: [Xen-devel] [PATCH 1/2] xen: introduce a cpumask with all bits set

2019-03-14 Thread Juergen Gross
On 14/03/2019 14:12, Jan Beulich wrote:
> >>> On 14.03.19 at 10:59,  wrote:
>> @@ -1892,8 +1891,7 @@ static void __init check_timer(void)
>>  vector = IRQ0_VECTOR;
>>  clear_irq_vector(0);
>>  
>> -cpumask_setall(&mask_all);
>> -if ((ret = bind_irq_vector(0, vector, &mask_all)))
>> +if ((ret = bind_irq_vector(0, vector, &cpumask_all)))
> 
> Since you replace the users of cpumask_setall(), wouldn't it be
> better to also remove that function from the header?

I don't replace all of them. There are still some users left where
replacing with cpumask_all would be less optimal.


Juergen
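As a side note, the construct under discussion can be modelled in plain C like this — a sketch with a made-up NR_CPUS and mask layout rather than Xen's definitions; the point is that one shared const object replaces repeated cpumask_setall() calls on stack-local masks:

```c
#include <assert.h>

#define NR_CPUS 64
#define BITS_PER_LONG (sizeof(unsigned long) * 8)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

typedef struct { unsigned long bits[BITS_TO_LONGS(NR_CPUS)]; } cpumask_t;

/* GCC range designators fill every word with all-ones at build time. */
static const cpumask_t cpumask_all = {
    .bits = { [0 ... BITS_TO_LONGS(NR_CPUS) - 1] = ~0UL },
};
```

Callers that previously did `cpumask_setall(&mask_all)` on a local variable can then simply pass `&cpumask_all`.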


Re: [Xen-devel] [PATCH 1/2] xen/debug: make debugtrace configurable via Kconfig

2019-03-14 Thread Jan Beulich
>>> On 14.03.19 at 10:37,  wrote:
> Instead of having to edit include/xen/lib.h for making debugtrace
> available make it configurable via Kconfig.
> 
> Default is off, it is available only in expert mode or in debug builds.
> 
> Signed-off-by: Juergen Gross 

Acked-by: Jan Beulich 




Re: [Xen-devel] [PATCH 2/2] xen/debug: make debugtrace more clever regarding repeating entries

2019-03-14 Thread Jan Beulich
>>> On 14.03.19 at 10:37,  wrote:
> --- a/xen/drivers/char/console.c
> +++ b/xen/drivers/char/console.c
> @@ -1225,13 +1225,28 @@ void debugtrace_dump(void)
>  watchdog_enable();
>  }
>  
> +static void debugtrace_add_to_buf(char *buf)
> +{
> +char *p;
> +
> +for ( p = buf; *p != '\0'; p++ )
> +{
> +debugtrace_buf[debugtrace_prd++] = *p;
> +/* Always leave a nul byte at the end of the buffer. */
> +if ( debugtrace_prd == (debugtrace_bytes - 1) )
> +debugtrace_prd = 0;
> +}
> +}
> +
>  void debugtrace_printk(const char *fmt, ...)
>  {
> -static charbuf[1024];
> -static u32 count;
> +static char buf[1024];
> +static char last_buf[1024];
> +static u32 count, last_count;

Please change to uint32_t or even better simply to unsigned int.

> @@ -1243,25 +1258,32 @@ void debugtrace_printk(const char *fmt, ...)
>  
>  ASSERT(debugtrace_buf[debugtrace_bytes - 1] == 0);
>  
> -snprintf(buf, sizeof(buf), "%u ", ++count);
> -
>  va_start(args, fmt);
> -(void)vsnprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), fmt, args);
> +(void)vsnprintf(buf, sizeof(buf), fmt, args);

Please take the opportunity and drop the stray cast.

Jan
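The idea behind the patch — collapsing consecutive identical trace entries into a repeat count — can be modelled as below. This is a simplified user-space sketch, not the patch itself: the real code feeds Xen's circular debugtrace buffer and prefixes entries with a running count.

```c
#include <assert.h>   /* used by the self-checks */
#include <stdio.h>
#include <string.h>

static char buf[1024], last_buf[1024];
static unsigned int last_count;
static char out[4096];                  /* stands in for debugtrace_buf */

static void debugtrace_add_to_buf(const char *s)
{
    strncat(out, s, sizeof(out) - strlen(out) - 1);
}

/*
 * Simplified model of the repeat-collapsing debugtrace_printk():
 * identical consecutive messages are only counted; when a different
 * message arrives, the pending repeat count is flushed first.
 */
static void debugtrace_log(const char *msg)
{
    snprintf(buf, sizeof(buf), "%s", msg);

    if ( strcmp(buf, last_buf) )
    {
        if ( last_count > 1 )
        {
            char rep[32];

            snprintf(rep, sizeof(rep), "%u x: ", last_count);
            debugtrace_add_to_buf(rep);
            debugtrace_add_to_buf(last_buf);
        }
        strcpy(last_buf, buf);
        last_count = 1;
        debugtrace_add_to_buf(buf);
    }
    else
        last_count++;
}
```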




Re: [Xen-devel] [PATCH v3 1/4] x86: stop handling MSR_IA32_BNDCFGS save/restore in implementation code

2019-03-14 Thread Jan Beulich
>>> On 14.03.19 at 14:01,  wrote:
> @@ -1215,10 +1196,15 @@ static bool vmx_set_guest_bndcfgs(struct vcpu *v, u64 
> val)
>  return true;
>  }
>  
> -static bool vmx_get_guest_bndcfgs(struct vcpu *v, u64 *val)
> +static bool vmx_get_guest_bndcfgs(const struct vcpu *cv, u64 *val)
>  {
> +struct vcpu *v;
> +
>  ASSERT(cpu_has_mpx && cpu_has_vmx_mpx);
>  
> +/* Get a non-const pointer for vmx_vmcs_enter() */
> +v = cv->domain->vcpu[cv->vcpu_id];

Any chance this could be made the initializer of the variable?

> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -160,6 +160,12 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
> uint64_t *val)
>  ret = guest_rdmsr_x2apic(v, msr, val);
>  break;
>  
> +case MSR_IA32_BNDCFGS:
> +if ( !cp->feat.mpx || !hvm_get_guest_bndcfgs(v, val) )
> +goto gp_fault;

Didn't we settle on ASSERT(is_hvm_*)?

> @@ -323,6 +329,12 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t 
> val)
>  ret = guest_wrmsr_x2apic(v, msr, val);
>  break;
>  
> +case MSR_IA32_BNDCFGS:
> +if ( !cp->feat.mpx || !hvm_set_guest_bndcfgs(v, val) )
> +goto gp_fault;

Same here? Or did Andrew tell you otherwise?

Jan




Re: [Xen-devel] [PATCH 2/2] xen/debug: make debugtrace more clever regarding repeating entries

2019-03-14 Thread Juergen Gross
On 14/03/2019 14:33, Jan Beulich wrote:
> >>> On 14.03.19 at 10:37,  wrote:
>> --- a/xen/drivers/char/console.c
>> +++ b/xen/drivers/char/console.c
>> @@ -1225,13 +1225,28 @@ void debugtrace_dump(void)
>>  watchdog_enable();
>>  }
>>  
>> +static void debugtrace_add_to_buf(char *buf)
>> +{
>> +char *p;
>> +
>> +for ( p = buf; *p != '\0'; p++ )
>> +{
>> +debugtrace_buf[debugtrace_prd++] = *p;
>> +/* Always leave a nul byte at the end of the buffer. */
>> +if ( debugtrace_prd == (debugtrace_bytes - 1) )
>> +debugtrace_prd = 0;
>> +}
>> +}
>> +
>>  void debugtrace_printk(const char *fmt, ...)
>>  {
>> -static charbuf[1024];
>> -static u32 count;
>> +static char buf[1024];
>> +static char last_buf[1024];
>> +static u32 count, last_count;
> 
> Please change to uint32_t or even better simply to unsigned int.

Okay.

> 
>> @@ -1243,25 +1258,32 @@ void debugtrace_printk(const char *fmt, ...)
>>  
>>  ASSERT(debugtrace_buf[debugtrace_bytes - 1] == 0);
>>  
>> -snprintf(buf, sizeof(buf), "%u ", ++count);
>> -
>>  va_start(args, fmt);
>> -(void)vsnprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), fmt, 
>> args);
>> +(void)vsnprintf(buf, sizeof(buf), fmt, args);
> 
> Please take the opportunity and drop the stray cast.

Will do.


Juergen


Re: [Xen-devel] [PATCH 2/2] xen/debug: make debugtrace more clever regarding repeating entries

2019-03-14 Thread Andrew Cooper
On 14/03/2019 13:38, Juergen Gross wrote:
> On 14/03/2019 14:33, Jan Beulich wrote:
>> >>> On 14.03.19 at 10:37,  wrote:
>>> --- a/xen/drivers/char/console.c
>>> +++ b/xen/drivers/char/console.c
>>> @@ -1225,13 +1225,28 @@ void debugtrace_dump(void)
>>>  watchdog_enable();
>>>  }
>>>  
>>> +static void debugtrace_add_to_buf(char *buf)
>>> +{
>>> +char *p;
>>> +
>>> +for ( p = buf; *p != '\0'; p++ )
>>> +{
>>> +debugtrace_buf[debugtrace_prd++] = *p;
>>> +/* Always leave a nul byte at the end of the buffer. */
>>> +if ( debugtrace_prd == (debugtrace_bytes - 1) )
>>> +debugtrace_prd = 0;
>>> +}
>>> +}
>>> +
>>>  void debugtrace_printk(const char *fmt, ...)
>>>  {
>>> -static charbuf[1024];
>>> -static u32 count;
>>> +static char buf[1024];
>>> +static char last_buf[1024];
>>> +static u32 count, last_count;
>> Please change to uint32_t or even better simply to unsigned int.
> Okay.
>
>>> @@ -1243,25 +1258,32 @@ void debugtrace_printk(const char *fmt, ...)
>>>  
>>>  ASSERT(debugtrace_buf[debugtrace_bytes - 1] == 0);
>>>  
>>> -snprintf(buf, sizeof(buf), "%u ", ++count);
>>> -
>>>  va_start(args, fmt);
>>> -(void)vsnprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), fmt, 
>>> args);
>>> +(void)vsnprintf(buf, sizeof(buf), fmt, args);
>> Please take the opportunity and drop the stray cast.
> Will do.

Both can be done on commit, surely?

~Andrew


Re: [Xen-devel] [PATCH v3 1/4] x86: stop handling MSR_IA32_BNDCFGS save/restore in implementation code

2019-03-14 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 14 March 2019 13:39
> To: Paul Durrant 
> Cc: Andrew Cooper ; Roger Pau Monne 
> ; Wei Liu
> ; Jun Nakajima ; Kevin Tian 
> ; xen-
> devel 
> Subject: Re: [PATCH v3 1/4] x86: stop handling MSR_IA32_BNDCFGS save/restore 
> in implementation code
> 
> >>> On 14.03.19 at 14:01,  wrote:
> > @@ -1215,10 +1196,15 @@ static bool vmx_set_guest_bndcfgs(struct vcpu *v, 
> > u64 val)
> >  return true;
> >  }
> >
> > -static bool vmx_get_guest_bndcfgs(struct vcpu *v, u64 *val)
> > +static bool vmx_get_guest_bndcfgs(const struct vcpu *cv, u64 *val)
> >  {
> > +struct vcpu *v;
> > +
> >  ASSERT(cpu_has_mpx && cpu_has_vmx_mpx);
> >
> > +/* Get a non-const pointer for vmx_vmcs_enter() */
> > +v = cv->domain->vcpu[cv->vcpu_id];
> 
> Any chance this could be made the initializer of the variable?

I thought it looked slightly odd having a comment above the declaration, which 
is why I did it this way, but I can move it.

> 
> > --- a/xen/arch/x86/msr.c
> > +++ b/xen/arch/x86/msr.c
> > @@ -160,6 +160,12 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
> > uint64_t *val)
> >  ret = guest_rdmsr_x2apic(v, msr, val);
> >  break;
> >
> > +case MSR_IA32_BNDCFGS:
> > +if ( !cp->feat.mpx || !hvm_get_guest_bndcfgs(v, val) )
> > +goto gp_fault;
> 
> Didn't we settle on ASSERT(is_hvm_*)?
> 
> > @@ -323,6 +329,12 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t 
> > val)
> >  ret = guest_wrmsr_x2apic(v, msr, val);
> >  break;
> >
> > +case MSR_IA32_BNDCFGS:
> > +if ( !cp->feat.mpx || !hvm_set_guest_bndcfgs(v, val) )
> > +goto gp_fault;
> 
> Same here? Or did Andrew tell you otherwise?

Andrew was ok with me just dropping it, but I can put in the ASSERT if you 
prefer.

  Paul

> 
> Jan
> 



Re: [Xen-devel] [PATCH] xen/netfront: Remove unneeded .resume callback

2019-03-14 Thread Oleksandr Andrushchenko

I am still confused about suspend/resume for the front drivers,
for instance, block front [1] does implement full/proper reconnect
on .resume, but net front only does this partially.

Could anyone please shed some light on suspend/resume design?

Thank you,
Oleksandr

On 3/14/19 3:17 PM, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko 

Currently on driver resume we remove all the network queues and
destroy shared Tx/Rx rings leaving the driver in its current state
and never signaling the backend of this frontend's state change.
This leads to a number of consequences:
- when frontend withdraws granted references to the rings etc. it cannot
   be cleanly done as the backend still holds those (it was not told to
   free the resources)
- it is not possible to resume driver operation as all the communication
   means with the backend were destroyed by the frontend, thus
   making the frontend appear functional to the guest OS, when it
   really is not.

Fix this by not destroying communication channels/rings on driver
resume.

Signed-off-by: Oleksandr Andrushchenko 
---
  drivers/net/xen-netfront.c | 17 -
  1 file changed, 17 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index c914c24f880b..2ca162048da4 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1422,22 +1422,6 @@ static void xennet_disconnect_backend(struct netfront_info *info)
}
  }
  
-/**
- * We are reconnecting to the backend, due to a suspend/resume, or a backend
- * driver restart.  We tear down our netif structure and recreate it, but
- * leave the device-layer structures intact so that this is transparent to the
- * rest of the kernel.
- */
-static int netfront_resume(struct xenbus_device *dev)
-{
-   struct netfront_info *info = dev_get_drvdata(&dev->dev);
-
-   dev_dbg(&dev->dev, "%s\n", dev->nodename);
-
-   xennet_disconnect_backend(info);
-   return 0;
-}
-
  static int xen_net_read_mac(struct xenbus_device *dev, u8 mac[])
  {
char *s, *e, *macstr;
@@ -2185,7 +2169,6 @@ static struct xenbus_driver netfront_driver = {
.ids = netfront_ids,
.probe = netfront_probe,
.remove = xennet_remove,
-   .resume = netfront_resume,
.otherend_changed = netback_changed,
  };
  


[1] 
https://elixir.bootlin.com/linux/v5.0.2/source/drivers/block/xen-blkfront.c#L2072



Re: [Xen-devel] [PATCH v13] tolerate jitter in cpu_khz calculation to avoid TSC emulation

2019-03-14 Thread Olaf Hering
On Wed, 13 Mar 2019 03:18:39 -0600,
"Jan Beulich" wrote:

> I'm sorry, but I continue to object to this adjustment getting done
> both by default _and_ not in a per-guest manner. As said before,
> you can't demand guests to run NTP, and hence you can't expect
> them to get along with a few hundred kHz jump in observed TSC
> frequency. Whether the performance drop due to vTSC use is
> better or worse is a policy decision, which we should leave to the
> admin. Hence the feature needs to be off by default, and there
> needs to be at least a host-wide control to enable it; a per-guest
> control would be better. IOW I explicitly do not agree with the
> last sentence of the commit message.

So this seems the be the essential part that prevents moving forward.

Your claim is basically that "we do not know how the workload reacts
to a frequency change".
My claim is basically "there is enough evidence that syncing with an
external clock is required if the frequency matters at all".

I think that conflict can not be easily solved.

One way to solve it would be a knob that injects a value into the
proposed "vtsc_tolerance_khz" variable, leave the calculation to
the host admin, and leave code in tsc_set_info basically as is.

Maybe "xl set-params" can be the way to change the value, that way
it can be changed globally at runtime if needed.

In staging the change would affect HVM and PVH. I never ran PVH,
I have to assume it behaves like HVM in this regard.


Olaf
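In effect, the proposed knob would reduce the decision to a comparison like the one sketched below. This is illustrative only; the name vtsc_tolerance_khz follows the patch discussion and is not a committed interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch: a guest migrated in with frequency gtsc_khz runs without TSC
 * emulation (vtsc) if the host frequency is within an admin-set
 * tolerance; otherwise the TSC must be emulated at the saved rate.
 */
static bool need_vtsc(uint32_t gtsc_khz, uint32_t host_khz,
                      uint32_t vtsc_tolerance_khz)
{
    uint32_t delta = gtsc_khz > host_khz ? gtsc_khz - host_khz
                                         : host_khz - gtsc_khz;

    return delta > vtsc_tolerance_khz;
}
```

With a tolerance of zero (the default Jan argues for), any jitter in the measured frequency forces emulation, exactly as today.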



[Xen-devel] [PATCH v4 4/4] x86: remove defunct init/load/save_msr() hvm_funcs

2019-03-14 Thread Paul Durrant
These hvm_funcs are no longer required since no MSR values are saved or
restored by implementation-specific code.

Signed-off-by: Paul Durrant 
Reviewed-by: Jan Beulich 
---
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 

v2:
 - Re-instate err check on loop in hvm_load_cpu_msrs()
---
 xen/arch/x86/hvm/hvm.c| 29 +
 xen/include/asm-x86/hvm/hvm.h |  4 
 2 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index deb7fb2adb..f84bf11f0f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1310,7 +1310,6 @@ static const uint32_t msrs_to_send[] = {
 MSR_AMD64_DR2_ADDRESS_MASK,
 MSR_AMD64_DR3_ADDRESS_MASK,
 };
-static unsigned int __read_mostly msr_count_max = ARRAY_SIZE(msrs_to_send);
 
 static int hvm_save_cpu_msrs(struct vcpu *v, hvm_domain_context_t *h)
 {
@@ -1320,7 +1319,7 @@ static int hvm_save_cpu_msrs(struct vcpu *v, 
hvm_domain_context_t *h)
 int err;
 
 err = _hvm_init_entry(h, CPU_MSR_CODE, v->vcpu_id,
- HVM_CPU_MSR_SIZE(msr_count_max));
+ HVM_CPU_MSR_SIZE(ARRAY_SIZE(msrs_to_send)));
 if ( err )
 return err;
 ctxt = (struct hvm_msr *)&h->data[h->cur];
@@ -1353,10 +1352,7 @@ static int hvm_save_cpu_msrs(struct vcpu *v, 
hvm_domain_context_t *h)
 ctxt->msr[ctxt->count++].val = val;
 }
 
-if ( hvm_funcs.save_msr )
-hvm_funcs.save_msr(v, ctxt);
-
-ASSERT(ctxt->count <= msr_count_max);
+ASSERT(ctxt->count <= ARRAY_SIZE(msrs_to_send));
 
 for ( i = 0; i < ctxt->count; ++i )
 ctxt->msr[i]._rsvd = 0;
@@ -1431,9 +1427,6 @@ static int hvm_load_cpu_msrs(struct domain *d, 
hvm_domain_context_t *h)
 return -EOPNOTSUPP;
 /* Checking finished */
 
-if ( hvm_funcs.load_msr )
-err = hvm_funcs.load_msr(v, ctxt);
-
 for ( i = 0; !err && i < ctxt->count; ++i )
 {
 switch ( ctxt->msr[i].index )
@@ -1475,17 +1468,13 @@ static int __init 
hvm_register_CPU_save_and_restore(void)
 sizeof(struct hvm_save_descriptor),
 HVMSR_PER_VCPU);
 
-if ( hvm_funcs.init_msr )
-msr_count_max += hvm_funcs.init_msr();
-
-if ( msr_count_max )
-hvm_register_savevm(CPU_MSR_CODE,
-"CPU_MSR",
-hvm_save_cpu_msrs,
-hvm_load_cpu_msrs,
-HVM_CPU_MSR_SIZE(msr_count_max) +
-sizeof(struct hvm_save_descriptor),
-HVMSR_PER_VCPU);
+hvm_register_savevm(CPU_MSR_CODE,
+"CPU_MSR",
+hvm_save_cpu_msrs,
+hvm_load_cpu_msrs,
+HVM_CPU_MSR_SIZE(ARRAY_SIZE(msrs_to_send)) +
+sizeof(struct hvm_save_descriptor),
+HVMSR_PER_VCPU);
 
 return 0;
 }
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index d41ed63232..a70fdbe298 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -115,10 +115,6 @@ struct hvm_function_table {
 void (*save_cpu_ctxt)(struct vcpu *v, struct hvm_hw_cpu *ctxt);
 int (*load_cpu_ctxt)(struct vcpu *v, struct hvm_hw_cpu *ctxt);
 
-unsigned int (*init_msr)(void);
-void (*save_msr)(struct vcpu *, struct hvm_msr *);
-int (*load_msr)(struct vcpu *, struct hvm_msr *);
-
 /* Examine specifics of the guest state. */
 unsigned int (*get_interrupt_shadow)(struct vcpu *v);
 void (*set_interrupt_shadow)(struct vcpu *v, unsigned int intr_shadow);
-- 
2.20.1



[Xen-devel] [PATCH v4 3/4] x86: stop handling MSR_IA32_XSS save/restore in implementation code

2019-03-14 Thread Paul Durrant
Saving and restoring the value of this MSR is currently handled by
implementation-specific code despite it being architectural. This patch
moves handling of accesses to this MSR from hvm.c into the msr.c, thus
allowing the common MSR save/restore code to handle it.

This patch also adds proper checks of CPUID policy in the new get/set code.

NOTE: MSR_IA32_XSS is the last MSR to be saved and restored by
  implementation-specific code. This patch therefore removes the
  (VMX) definitions and of the init_msr(), save_msr() and
  load_msr() hvm_funcs, as they are no longer necessary. The
  declarations of and calls to those hvm_funcs will be cleaned up
  by a subsequent patch.

Signed-off-by: Paul Durrant 
Reviewed-by: Jan Beulich 
Reviewed-by: Kevin Tian 
---
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 
Cc: Jun Nakajima 
---
 xen/arch/x86/hvm/hvm.c | 15 ++--
 xen/arch/x86/hvm/vmx/vmx.c | 49 --
 xen/arch/x86/msr.c | 18 ++
 3 files changed, 20 insertions(+), 62 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index dff590e658..deb7fb2adb 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1304,6 +1304,7 @@ static const uint32_t msrs_to_send[] = {
 MSR_SPEC_CTRL,
 MSR_INTEL_MISC_FEATURES_ENABLES,
 MSR_IA32_BNDCFGS,
+MSR_IA32_XSS,
 MSR_AMD64_DR0_ADDRESS_MASK,
 MSR_AMD64_DR1_ADDRESS_MASK,
 MSR_AMD64_DR2_ADDRESS_MASK,
@@ -1442,6 +1443,7 @@ static int hvm_load_cpu_msrs(struct domain *d, 
hvm_domain_context_t *h)
 case MSR_SPEC_CTRL:
 case MSR_INTEL_MISC_FEATURES_ENABLES:
 case MSR_IA32_BNDCFGS:
+case MSR_IA32_XSS:
 case MSR_AMD64_DR0_ADDRESS_MASK:
 case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
 rc = guest_wrmsr(v, ctxt->msr[i].index, ctxt->msr[i].val);
@@ -3463,12 +3465,6 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t 
*msr_content)
 MTRRcap_VCNT))];
 break;
 
-case MSR_IA32_XSS:
-if ( !d->arch.cpuid->xstate.xsaves )
-goto gp_fault;
-*msr_content = v->arch.msrs->xss.raw;
-break;
-
 case MSR_K8_ENABLE_C1E:
 case MSR_AMD64_NB_CFG:
  /*
@@ -3608,13 +3604,6 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t 
msr_content,
 goto gp_fault;
 break;
 
-case MSR_IA32_XSS:
-/* No XSS features currently supported for guests. */
-if ( !d->arch.cpuid->xstate.xsaves || msr_content != 0 )
-goto gp_fault;
-v->arch.msrs->xss.raw = msr_content;
-break;
-
 case MSR_AMD64_NB_CFG:
 /* ignore the write */
 break;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 985e5735d2..c46e05b91e 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -797,52 +797,6 @@ static int vmx_load_vmcs_ctxt(struct vcpu *v, struct 
hvm_hw_cpu *ctxt)
 return 0;
 }
 
-static unsigned int __init vmx_init_msr(void)
-{
-return (cpu_has_mpx && cpu_has_vmx_mpx) +
-   (cpu_has_xsaves && cpu_has_vmx_xsaves);
-}
-
-static void vmx_save_msr(struct vcpu *v, struct hvm_msr *ctxt)
-{
-if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
-{
-ctxt->msr[ctxt->count].val = v->arch.msrs->xss.raw;
-if ( ctxt->msr[ctxt->count].val )
-ctxt->msr[ctxt->count++].index = MSR_IA32_XSS;
-}
-}
-
-static int vmx_load_msr(struct vcpu *v, struct hvm_msr *ctxt)
-{
-unsigned int i;
-int err = 0;
-
-vmx_vmcs_enter(v);
-
-for ( i = 0; i < ctxt->count; ++i )
-{
-switch ( ctxt->msr[i].index )
-{
-case MSR_IA32_XSS:
-if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
-v->arch.msrs->xss.raw = ctxt->msr[i].val;
-else
-err = -ENXIO;
-break;
-default:
-continue;
-}
-if ( err )
-break;
-ctxt->msr[i]._rsvd = 1;
-}
-
-vmx_vmcs_exit(v);
-
-return err;
-}
-
 static void vmx_fpu_enter(struct vcpu *v)
 {
 vcpu_restore_fpu_lazy(v);
@@ -2282,9 +2236,6 @@ static struct hvm_function_table __initdata 
vmx_function_table = {
 .vcpu_destroy = vmx_vcpu_destroy,
 .save_cpu_ctxt= vmx_save_vmcs_ctxt,
 .load_cpu_ctxt= vmx_load_vmcs_ctxt,
-.init_msr = vmx_init_msr,
-.save_msr = vmx_save_msr,
-.load_msr = vmx_load_msr,
 .get_interrupt_shadow = vmx_get_interrupt_shadow,
 .set_interrupt_shadow = vmx_set_interrupt_shadow,
 .guest_x86_mode   = vmx_guest_x86_mode,
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 0e901d2397..4b5e000224 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -170,6 +170,13 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
uint64_t *val)
 
 break;
 

[Xen-devel] [PATCH v4 1/4] x86: stop handling MSR_IA32_BNDCFGS save/restore in implementation code

2019-03-14 Thread Paul Durrant
Saving and restoring the value of this MSR is currently handled by
implementation-specific code despite it being architectural. This patch
moves handling of accesses to this MSR from hvm.c into the msr.c, thus
allowing the common MSR save/restore code to handle it.

NOTE: Because vmx_get/set_guest_bndcfgs() call vmx_vmcs_enter(), the
  struct vcpu pointer passed in, and hence the vcpu pointer passed to
  guest_rdmsr() cannot be const.

Signed-off-by: Paul Durrant 
---
Cc: Kevin Tian 
Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 
Cc: Jun Nakajima 

v4:
 - Cosmetic re-arrangements and an additional ASSERT requested by Jan

v3:
 - Address further comments from Jan
 - Dropped Kevin's R-b because of change to vmx.c

v2:
 - Addressed comments from Jan by largely removing hunks
 - Keeping Kevin's R-b since remaining hunks in vmx.c are as before
---
 xen/arch/x86/hvm/hvm.c| 14 ++
 xen/arch/x86/hvm/vmx/vmx.c| 24 
 xen/arch/x86/msr.c| 20 
 xen/include/asm-x86/hvm/hvm.h |  4 ++--
 4 files changed, 28 insertions(+), 34 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 8adbb61b57..e566d83f8b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1303,6 +1303,7 @@ static int hvm_load_cpu_xsave_states(struct domain *d, 
hvm_domain_context_t *h)
 static const uint32_t msrs_to_send[] = {
 MSR_SPEC_CTRL,
 MSR_INTEL_MISC_FEATURES_ENABLES,
+MSR_IA32_BNDCFGS,
 MSR_AMD64_DR0_ADDRESS_MASK,
 MSR_AMD64_DR1_ADDRESS_MASK,
 MSR_AMD64_DR2_ADDRESS_MASK,
@@ -1440,6 +1441,7 @@ static int hvm_load_cpu_msrs(struct domain *d, 
hvm_domain_context_t *h)
 
 case MSR_SPEC_CTRL:
 case MSR_INTEL_MISC_FEATURES_ENABLES:
+case MSR_IA32_BNDCFGS:
 case MSR_AMD64_DR0_ADDRESS_MASK:
 case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
 rc = guest_wrmsr(v, ctxt->msr[i].index, ctxt->msr[i].val);
@@ -3467,12 +3469,6 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t 
*msr_content)
 *msr_content = v->arch.hvm.msr_xss;
 break;
 
-case MSR_IA32_BNDCFGS:
-if ( !d->arch.cpuid->feat.mpx ||
- !hvm_get_guest_bndcfgs(v, msr_content) )
-goto gp_fault;
-break;
-
 case MSR_K8_ENABLE_C1E:
 case MSR_AMD64_NB_CFG:
  /*
@@ -3619,12 +3615,6 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t 
msr_content,
 v->arch.hvm.msr_xss = msr_content;
 break;
 
-case MSR_IA32_BNDCFGS:
-if ( !d->arch.cpuid->feat.mpx ||
- !hvm_set_guest_bndcfgs(v, msr_content) )
-goto gp_fault;
-break;
-
 case MSR_AMD64_NB_CFG:
 /* ignore the write */
 break;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 725dd88c13..f8481d032a 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -805,17 +805,6 @@ static unsigned int __init vmx_init_msr(void)
 
 static void vmx_save_msr(struct vcpu *v, struct hvm_msr *ctxt)
 {
-vmx_vmcs_enter(v);
-
-if ( cpu_has_mpx && cpu_has_vmx_mpx )
-{
-__vmread(GUEST_BNDCFGS, &ctxt->msr[ctxt->count].val);
-if ( ctxt->msr[ctxt->count].val )
-ctxt->msr[ctxt->count++].index = MSR_IA32_BNDCFGS;
-}
-
-vmx_vmcs_exit(v);
-
 if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
 {
 ctxt->msr[ctxt->count].val = v->arch.hvm.msr_xss;
@@ -835,14 +824,6 @@ static int vmx_load_msr(struct vcpu *v, struct hvm_msr 
*ctxt)
 {
 switch ( ctxt->msr[i].index )
 {
-case MSR_IA32_BNDCFGS:
-if ( cpu_has_mpx && cpu_has_vmx_mpx &&
- is_canonical_address(ctxt->msr[i].val) &&
- !(ctxt->msr[i].val & IA32_BNDCFGS_RESERVED) )
-__vmwrite(GUEST_BNDCFGS, ctxt->msr[i].val);
-else if ( ctxt->msr[i].val )
-err = -ENXIO;
-break;
 case MSR_IA32_XSS:
 if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
 v->arch.hvm.msr_xss = ctxt->msr[i].val;
@@ -1215,8 +1196,11 @@ static bool vmx_set_guest_bndcfgs(struct vcpu *v, u64 
val)
 return true;
 }
 
-static bool vmx_get_guest_bndcfgs(struct vcpu *v, u64 *val)
+static bool vmx_get_guest_bndcfgs(const struct vcpu *cv, u64 *val)
 {
+/* Get a non-const pointer for vmx_vmcs_enter() */
+struct vcpu *v = cv->domain->vcpu[cv->vcpu_id];
+
 ASSERT(cpu_has_mpx && cpu_has_vmx_mpx);
 
 vmx_vmcs_enter(v);
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 4df4a59f4d..0e901d2397 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -160,6 +160,16 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
uint64_t *val)
 ret = guest_rdmsr_x2apic(v, msr, val);
 break;
 
+case MSR_IA32_BNDCFGS:
+if ( !cp->feat.mpx )
+goto gp_fault;
+
+ASSERT(is_hvm_domain(

[Xen-devel] [PATCH v4 0/4] clean up MSR save/restore code

2019-03-14 Thread Paul Durrant
*** BLURB HERE ***

Paul Durrant (4):
  x86: stop handling MSR_IA32_BNDCFGS save/restore in implementation
code
  x86: move the saved value of MSR_IA32_XSS into struct vcpu_msrs
  x86: stop handling MSR_IA32_XSS save/restore in implementation code
  x86: remove defunct init/load/save_msr() hvm_funcs

 xen/arch/x86/domain.c  |  2 +-
 xen/arch/x86/hvm/hvm.c | 58 ++-
 xen/arch/x86/hvm/vmx/vmx.c | 73 ++
 xen/arch/x86/msr.c | 38 ++
 xen/include/asm-x86/hvm/hvm.h  |  8 +---
 xen/include/asm-x86/hvm/vcpu.h |  1 -
 xen/include/asm-x86/msr.h  |  5 +++
 7 files changed, 63 insertions(+), 122 deletions(-)
---
v2:
 - Drop patches #2 and #6 of the v1 series

Cc: Andrew Cooper 
Cc: Jan Beulich 
Cc: Jun Nakajima 
Cc: Kevin Tian 
Cc: "Roger Pau Monné" 
Cc: Wei Liu 
-- 
2.20.1



[Xen-devel] [PATCH v4 2/4] x86: move the saved value of MSR_IA32_XSS into struct vcpu_msrs

2019-03-14 Thread Paul Durrant
Currently the value is saved directly in struct hvm_vcpu. This patch simply
co-locates it with other saved MSR values. No functional change.

Signed-off-by: Paul Durrant 
Reviewed-by: Jan Beulich 
Reviewed-by: Kevin Tian 
---
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: "Roger Pau Monné" 
Cc: Jun Nakajima 
---
 xen/arch/x86/domain.c  | 2 +-
 xen/arch/x86/hvm/hvm.c | 4 ++--
 xen/arch/x86/hvm/vmx/vmx.c | 4 ++--
 xen/include/asm-x86/hvm/vcpu.h | 1 -
 xen/include/asm-x86/msr.h  | 5 +
 5 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 8d579e2cf9..aa38555736 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1653,7 +1653,7 @@ static void __context_switch(void)
 BUG();
 
 if ( cpu_has_xsaves && is_hvm_vcpu(n) )
-set_msr_xss(n->arch.hvm.msr_xss);
+set_msr_xss(n->arch.msrs->xss.raw);
 }
 vcpu_restore_fpu_nonlazy(n, false);
 nd->arch.ctxt_switch->to(n);
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index e566d83f8b..dff590e658 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3466,7 +3466,7 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t 
*msr_content)
 case MSR_IA32_XSS:
 if ( !d->arch.cpuid->xstate.xsaves )
 goto gp_fault;
-*msr_content = v->arch.hvm.msr_xss;
+*msr_content = v->arch.msrs->xss.raw;
 break;
 
 case MSR_K8_ENABLE_C1E:
@@ -3612,7 +3612,7 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t 
msr_content,
 /* No XSS features currently supported for guests. */
 if ( !d->arch.cpuid->xstate.xsaves || msr_content != 0 )
 goto gp_fault;
-v->arch.hvm.msr_xss = msr_content;
+v->arch.msrs->xss.raw = msr_content;
 break;
 
 case MSR_AMD64_NB_CFG:
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index f8481d032a..985e5735d2 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -807,7 +807,7 @@ static void vmx_save_msr(struct vcpu *v, struct hvm_msr 
*ctxt)
 {
 if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
 {
-ctxt->msr[ctxt->count].val = v->arch.hvm.msr_xss;
+ctxt->msr[ctxt->count].val = v->arch.msrs->xss.raw;
 if ( ctxt->msr[ctxt->count].val )
 ctxt->msr[ctxt->count++].index = MSR_IA32_XSS;
 }
@@ -826,7 +826,7 @@ static int vmx_load_msr(struct vcpu *v, struct hvm_msr 
*ctxt)
 {
 case MSR_IA32_XSS:
 if ( cpu_has_xsaves && cpu_has_vmx_xsaves )
-v->arch.hvm.msr_xss = ctxt->msr[i].val;
+v->arch.msrs->xss.raw = ctxt->msr[i].val;
 else
 err = -ENXIO;
 break;
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 6c84d5a5a6..5563d28a4e 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -176,7 +176,6 @@ struct hvm_vcpu {
 struct hvm_vcpu_asid n1asid;
 
 u64 msr_tsc_adjust;
-u64 msr_xss;
 
 union {
 struct vmx_vcpu vmx;
diff --git a/xen/include/asm-x86/msr.h b/xen/include/asm-x86/msr.h
index a7244793bf..0d52c085f6 100644
--- a/xen/include/asm-x86/msr.h
+++ b/xen/include/asm-x86/msr.h
@@ -313,6 +313,11 @@ struct vcpu_msrs
  * values here may be stale in current context.
  */
 uint32_t dr_mask[4];
+
+/* 0x0da0 - MSR_IA32_XSS */
+struct {
+uint64_t raw;
+} xss;
 };
 
 void init_guest_msr_policy(void);
-- 
2.20.1



Re: [Xen-devel] [PATCH 2/6] xen: Move xenstore initialization to common location

2019-03-14 Thread Paul Durrant
> -Original Message-
> From: Jason Andryuk [mailto:jandr...@gmail.com]
> Sent: 13 March 2019 18:12
> To: Paul Durrant 
> Cc: qemu-de...@nongnu.org; xen-devel@lists.xenproject.org; 
> marma...@invisiblethingslab.com; Stefano
> Stabellini ; Anthony Perard 
> ; Paolo Bonzini
> ; Richard Henderson ; Eduardo Habkost 
> ;
> Michael S. Tsirkin ; Marcel Apfelbaum 
> 
> Subject: Re: [PATCH 2/6] xen: Move xenstore initialization to common location
> 
> On Wed, Mar 13, 2019 at 11:01 AM Paul Durrant  wrote:
> >
> > > -Original Message-
> > > From: Jason Andryuk [mailto:jandr...@gmail.com]
> > > Sent: 11 March 2019 18:02
> > > To: qemu-de...@nongnu.org
> > > Cc: xen-devel@lists.xenproject.org; marma...@invisiblethingslab.com; 
> > > Jason Andryuk
> > > ; Stefano Stabellini ; 
> > > Anthony Perard
> > > ; Paul Durrant ; 
> > > Paolo Bonzini
> > > ; Richard Henderson ; Eduardo 
> > > Habkost ;
> > > Michael S. Tsirkin ; Marcel Apfelbaum 
> > > 
> > > Subject: [PATCH 2/6] xen: Move xenstore initialization to common location
> > >
> > > For the xen stubdom case, we'll want xenstore initialized, but we'll
> > > want to skip the rest of xen_be_init.  Move the initialization to
> > > xen_hvm_init so we can conditionalize calling xen_be_init.
> > >
> > > xs_domain_open() is deprecated for xs_open(0), so make the replacement
> > > as well.
> >
> > Can you elaborate as to why you need to do this when the code at the top of 
> > xen_hvm_init() already
> opens xenstore for its own purposes, and AFAICT xenstore_update() is only 
> needed if QEMU is
> implementing a PV backend?
> >
> >
> 
> Hi, Paul.  Thanks for reviewing.
> 
> I think you are right, that this basically shouldn't be needed if PV
> backends are disabled.  This patch came out of OpenXT, where it is
> needed for some out-of-tree patches.  But that doesn't make it
> suitable for upstreaming.
> 
> However, while reviewing, it looks like the xen accelerator in
> hw/xen/xen-common.c:xen_init() registers xen_change_state_handler().
> xen_change_state_handler() uses the global xenstore handle and will
> exit(1) if NULL.

I see it yes. TBH signalling state via xenstore should go away as it is 
incompatible with deprivileging, and I think Anthony might have some patches 
for that? In the meantime I suggest just doing a local xs_open(0) in that 
function.

> I'm not sure how to get the XenIOState xenstore
> handle over to the accelerator's xen_init.

That would not be appropriate as the machine type may not be xenfv and hence 
xen_hvm_init() may not have been called.

  Paul

>  Outside of that, I think
> you are correct that xenstore_update doesn't need to be run when PV
> backends are disabled.
> 
> Thanks,
> Jason

[Xen-devel] [PATCH] build: don't mandate availability of a fetcher program

2019-03-14 Thread Wei Liu
It is common that build hosts are isolated from outside world. They
don't necessarily have wget or ftp installed.

Turn the error into warning in configure. And point FETCHER to `false'
command if neither wget nor ftp is available, so any attempt to
download will result in error.
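
The intended behaviour can be sketched in plain shell (illustrative only; the real logic lives in m4/fetcher.m4 and the generated configure scripts):

```shell
#!/bin/sh
# Sketch of the configure-time fallback described above: prefer wget,
# then ftp, otherwise point FETCHER at `false` so a later download
# attempt fails loudly instead of configure aborting up front.
if command -v wget >/dev/null 2>&1; then
    FETCHER="wget -c -O"
elif command -v ftp >/dev/null 2>&1; then
    FETCHER="ftp -o"
else
    FETCHER="false"
    echo "WARNING: cannot find wget or ftp" >&2
fi
echo "FETCHER=$FETCHER"
```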

Signed-off-by: Wei Liu 
---
 m4/fetcher.m4 |  4 +++-
 stubdom/configure | 46 +-
 tools/configure   | 46 +-
 3 files changed, 93 insertions(+), 3 deletions(-)

diff --git a/m4/fetcher.m4 b/m4/fetcher.m4
index 86f33b3937..c1a72c189c 100644
--- a/m4/fetcher.m4
+++ b/m4/fetcher.m4
@@ -1,5 +1,6 @@
 AC_DEFUN([AX_CHECK_FETCHER], [
 AC_PATH_PROG([WGET],[wget], [no])
+AC_PATH_PROG([FALSE],[false], [/bin/false])
 AS_IF([test x"$WGET" != x"no"], [
 FETCHER="$WGET -c -O"
 ], [
@@ -7,7 +8,8 @@ AS_IF([test x"$WGET" != x"no"], [
 AS_IF([test x"$FTP" != x"no"], [
 FETCHER="$FTP -o"
 ], [
-AC_MSG_ERROR([cannot find wget or ftp])
+FETCHER="$FALSE"
+AC_MSG_WARN([cannot find wget or ftp])
 ])
 ])
 AC_SUBST(FETCHER)
diff --git a/stubdom/configure b/stubdom/configure
index df3f763a7b..beeb8db2e1 100755
--- a/stubdom/configure
+++ b/stubdom/configure
@@ -625,6 +625,7 @@ CFLAGS
 CC
 FETCHER
 FTP
+FALSE
 WGET
 CMAKE
 extfiles
@@ -2362,6 +2363,47 @@ $as_echo "no" >&6; }
 fi
 
 
+# Extract the first word of "false", so it can be a program name with args.
+set dummy false; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_FALSE+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $FALSE in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_FALSE="$FALSE" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+ac_cv_path_FALSE="$as_dir/$ac_word$ac_exec_ext"
+$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" 
>&5
+break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  test -z "$ac_cv_path_FALSE" && ac_cv_path_FALSE="/bin/false"
+  ;;
+esac
+fi
+FALSE=$ac_cv_path_FALSE
+if test -n "$FALSE"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $FALSE" >&5
+$as_echo "$FALSE" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
 if test x"$WGET" != x"no"; then :
 
 FETCHER="$WGET -c -O"
@@ -2415,7 +2457,9 @@ fi
 
 else
 
-as_fn_error $? "cannot find wget or ftp" "$LINENO" 5
+FETCHER="$FALSE"
+{ $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: cannot find wget or 
ftp" >&5
+$as_echo "$as_me: WARNING: cannot find wget or ftp" >&2;}
 
 fi
 
diff --git a/tools/configure b/tools/configure
index b66d3f6fba..92ead93335 100755
--- a/tools/configure
+++ b/tools/configure
@@ -644,6 +644,7 @@ system_aio
 zlib
 FETCHER
 FTP
+FALSE
 WGET
 pixman_LIBS
 pixman_CFLAGS
@@ -8219,6 +8220,47 @@ $as_echo "no" >&6; }
 fi
 
 
+# Extract the first word of "false", so it can be a program name with args.
+set dummy false; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_FALSE+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $FALSE in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_FALSE="$FALSE" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+ac_cv_path_FALSE="$as_dir/$ac_word$ac_exec_ext"
+$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" 
>&5
+break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  test -z "$ac_cv_path_FALSE" && ac_cv_path_FALSE="/bin/false"
+  ;;
+esac
+fi
+FALSE=$ac_cv_path_FALSE
+if test -n "$FALSE"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $FALSE" >&5
+$as_echo "$FALSE" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
 if test x"$WGET" != x"no"; then :
 
 FETCHER="$WGET -c -O"
@@ -8272,7 +8314,9 @@ fi
 
 else
 
-as_fn_error $? "cannot find wget or ftp" "$LINENO" 5
+FETCHER="$FALSE"
+{ $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: cannot find wget or 
ftp" >&5
+$as_echo "$as_me: WARNING: cannot find wget or ftp" >&2;}
 
 fi
 
-- 
2.11.0



Re: [Xen-devel] [PATCH v13] tolerate jitter in cpu_khz calculation to avoid TSC emulation

2019-03-14 Thread Juergen Gross
On 14/03/2019 14:50, Olaf Hering wrote:
> On Wed, 13 Mar 2019 03:18:39 -0600,
> "Jan Beulich"  wrote:
> 
>> I'm sorry, but I continue to object to this adjustment getting done
>> both by default _and_ not in a per-guest manner. As said before,
>> you can't demand guests to run NTP, and hence you can't expect
>> them to get along with a few hundred kHz jump in observed TSC
>> frequency. Whether the performance drop due to vTSC use is
>> better or worse is a policy decision, which we should leave to the
>> admin. Hence the feature needs to be off by default, and there
>> needs to be at least a host-wide control to enable it; a per-guest
>> control would be better. IOW I explicitly do not agree with the
>> last sentence of the commit message.
> 
> So this seems to be the essential part that prevents moving forward.
> 
> Your claim is basically that "we do not know how the workload reacts
> to frequency change".
> My claim is basically "there is enough evidence that syncing with
> external clock is required if the frequency remotely matters".
> 
> I think that conflict can not be easily solved.
> 
> One way to solve it would be a knob that injects a value into the
> proposed "vtsc_tolerance_khz" variable, leave the calculation to
> the host admin, and leave code in tsc_set_info basically as is.
> 
> Maybe "xl set-params" can be the way to change the value, that way
> it can be changed globally at runtime if needed.
> 
> In staging the change would affect HVM and PVH. I never ran PVH,
> I have to assume it behaves like HVM in this regard.

I think Andrew (or Ian?) once suggested to handle this whole mess in the
migration stream instead of the hypervisor.

So why don't you:

- add a domain config item for specifying the allowed jitter
- add that value to the migration struct xc_sr_rec_tsc_info (there is a
  reserved field available)
- and then modify handle_tsc_info() in tools/libxc/xc_sr_common_x86.c to
  test the host frequency to be in the acceptable range and if this is
  the case put the host frequency into the gtsc_khz parameter of the
  xc_domain_set_tsc_info() call

This would be the least intrusive change allowing maximum flexibility
IMO.


Juergen


Re: [Xen-devel] [PATCH v6 09/11] viridian: add implementation of synthetic interrupt MSRs

2019-03-14 Thread Jan Beulich
>>> On 14.03.19 at 12:25,  wrote:
> @@ -131,13 +238,69 @@ int viridian_synic_rdmsr(const struct vcpu *v, uint32_t 
> idx, uint64_t *val)
>  *val = ((uint64_t)icr2 << 32) | icr;
>  break;
>  }
> +
>  case HV_X64_MSR_TPR:
>  *val = vlapic_get_reg(vcpu_vlapic(v), APIC_TASKPRI);
>  break;
>  
>  case HV_X64_MSR_VP_ASSIST_PAGE:
> -*val = v->arch.hvm.viridian->vp_assist.msr.raw;
> +*val = vv->vp_assist.msr.raw;
> +break;
> +
> +case HV_X64_MSR_SCONTROL:
> +if ( !(viridian_feature_mask(d) & HVMPV_synic) )
> +return X86EMUL_EXCEPTION;
> +
> +*val = vv->scontrol;
> +break;
> +
> +case HV_X64_MSR_SVERSION:
> +if ( !(viridian_feature_mask(d) & HVMPV_synic) )
> +return X86EMUL_EXCEPTION;
> +
> +/*
> + * The specification says that the version number is 0x0001
> + * and should be in the lower 32-bits of the MSR, while the
> + * upper 32-bits are reserved... but it doesn't say what they
> + * should be set to. Assume everything but the bottom bit
> + * should be zero.
> + */
> +*val = 1ul;
> +break;
> +
> +case HV_X64_MSR_SIEFP:
> +if ( !(viridian_feature_mask(d) & HVMPV_synic) )
> +return X86EMUL_EXCEPTION;
> +
> +*val = vv->siefp;
> +break;
> +
> +case HV_X64_MSR_SIMP:
> +if ( !(viridian_feature_mask(d) & HVMPV_synic) )
> +return X86EMUL_EXCEPTION;
> +
> +*val = vv->simp.msr.raw;
> +break;
> +
> +case HV_X64_MSR_EOM:
> +if ( !(viridian_feature_mask(d) & HVMPV_synic) )
> +return X86EMUL_EXCEPTION;
> +
> +*val = 0;
> +break;
> +
> +case HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15:
> +{
> +unsigned int sintx = idx - HV_X64_MSR_SINT0;
> +const union viridian_sint_msr *vs =
> +&array_access_nospec(vv->sint, sintx);
> +
> +if ( !(viridian_feature_mask(d) & HVMPV_synic) )
> +return X86EMUL_EXCEPTION;
> +
> +*val = vs->raw;
>  break;

Without this necessarily being a request to change, I still don't
understand why you don't omit vs as a variable and simply do

*val = array_access_nospec(vv->sint, sintx).raw;

> @@ -149,6 +312,20 @@ int viridian_synic_rdmsr(const struct vcpu *v, uint32_t 
> idx, uint64_t *val)
>  
>  int viridian_synic_vcpu_init(const struct vcpu *v)
>  {

FTR while I'm in favor of adding const wherever it is possible
and makes sense, I consider it quite odd for an init function
to take a pointer to const. Perhaps the deinit one would also
fall into that category.

> @@ -1328,9 +1340,13 @@ int vlapic_has_pending_irq(struct vcpu *v)
>   (irr & 0xf0) <= (isr & 0xf0) )
>  {
>  viridian_apic_assist_clear(v);
> -return -1;
> +irr = -1;
>  }
>  
> +out:

The label still lacks proper indentation. With at least this fixed (which
is fine to be done while committing if this is the only piece to change)
Reviewed-by: Jan Beulich 

Jan




Re: [Xen-devel] xen: Can't insert balloon page into VM userspace (WAS Re: [linux-linus bisection] complete test-arm64-arm64-xl-xsm)

2019-03-14 Thread Julien Grall

Hi,

On 3/14/19 8:37 AM, Juergen Gross wrote:
> On 12/03/2019 20:46, David Hildenbrand wrote:
>> On 12.03.19 19:23, David Hildenbrand wrote:
>>
>> I guess something like this could do the trick if I understood it correctly:
>>
>> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
>> index 39b229f9e256..d37dd5bb7a8f 100644
>> --- a/drivers/xen/balloon.c
>> +++ b/drivers/xen/balloon.c
>> @@ -604,6 +604,7 @@ int alloc_xenballooned_pages(int nr_pages, struct
>> page **pages)
>>  while (pgno < nr_pages) {
>>  page = balloon_retrieve(true);
>>  if (page) {
>> +   __ClearPageOffline(page);
>>  pages[pgno++] = page;
>>   #ifdef CONFIG_XEN_HAVE_PVMMU
>>  /*
>> @@ -645,8 +646,10 @@ void free_xenballooned_pages(int nr_pages, struct
>> page **pages)
>>  mutex_lock(&balloon_mutex);
>>
>>  for (i = 0; i < nr_pages; i++) {
>> -   if (pages[i])
>> +   if (pages[i]) {
>> +   __SetPageOffline(pages[i]);
>>  balloon_append(pages[i]);
>> +   }
>>  }
>>
>>  balloon_stats.target_unpopulated -= nr_pages;
>>
>>
>> At least this way, the pages allocated (and thus eventually mapped to
>> user space) would not be marked, but the other ones would remain marked
>> and could be excluded by makedumptool.
>>
>
> I think this patch should do the trick. Julien, could you give it a
> try? On x86 I can't reproduce your problem easily as dom0 is PV with
> plenty of unpopulated pages for grant memory not suffering from
> missing "offline" bit.

Sure. I managed to get the console working with the patch suggested by
David. Feel free to add my tested-by when you resend it as is.


Cheers,

--
Julien Grall


Re: [Xen-devel] [PATCH v13] tolerate jitter in cpu_khz calculation to avoid TSC emulation

2019-03-14 Thread Olaf Hering
On Thu, 14 Mar 2019 15:10:16 +0100,
Juergen Gross  wrote:

> This would be the least intrusive change allowing maximum flexibility
> IMO.

I think earlier attempts did alter the tooling.
But that would not help with existing domUs started on hosts that do not have 
that property.
A global knob would help with that.


Olaf



Re: [Xen-devel] [PATCH 2/2] xen/debug: make debugtrace more clever regarding repeating entries

2019-03-14 Thread Jan Beulich
>>> On 14.03.19 at 14:40,  wrote:
> On 14/03/2019 13:38, Juergen Gross wrote:
>> On 14/03/2019 14:33, Jan Beulich wrote:
>> On 14.03.19 at 10:37,  wrote:
 --- a/xen/drivers/char/console.c
 +++ b/xen/drivers/char/console.c
 @@ -1225,13 +1225,28 @@ void debugtrace_dump(void)
  watchdog_enable();
  }
  
 +static void debugtrace_add_to_buf(char *buf)
 +{
 +char *p;
 +
 +for ( p = buf; *p != '\0'; p++ )
 +{
 +debugtrace_buf[debugtrace_prd++] = *p;
 +/* Always leave a nul byte at the end of the buffer. */
 +if ( debugtrace_prd == (debugtrace_bytes - 1) )
 +debugtrace_prd = 0;
 +}
 +}
 +
  void debugtrace_printk(const char *fmt, ...)
  {
 -static charbuf[1024];
 -static u32 count;
 +static char buf[1024];
 +static char last_buf[1024];
 +static u32 count, last_count;
>>> Please change to uint32_t or even better simply to unsigned int.
>> Okay.
>>
 @@ -1243,25 +1258,32 @@ void debugtrace_printk(const char *fmt, ...)
  
  ASSERT(debugtrace_buf[debugtrace_bytes - 1] == 0);
  
 -snprintf(buf, sizeof(buf), "%u ", ++count);
 -
  va_start(args, fmt);
 -(void)vsnprintf(buf + strlen(buf), sizeof(buf) - strlen(buf), fmt, 
 args);
 +(void)vsnprintf(buf, sizeof(buf), fmt, args);
>>> Please take the opportunity and drop the stray cast.
>> Will do.
> 
> Both can be done on commit, surely?

Perhaps, albeit iirc the first would amount to more than just s/u32/.../
on the line in question.

Jan




Re: [Xen-devel] xen: Can't insert balloon page into VM userspace (WAS Re: [linux-linus bisection] complete test-arm64-arm64-xl-xsm)

2019-03-14 Thread David Hildenbrand
On 14.03.19 15:12, Julien Grall wrote:
> Hi,
> 
> On 3/14/19 8:37 AM, Juergen Gross wrote:
>> On 12/03/2019 20:46, David Hildenbrand wrote:
>>> On 12.03.19 19:23, David Hildenbrand wrote:
>>>
>>> I guess something like this could do the trick if I understood it correctly:
>>>
>>> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
>>> index 39b229f9e256..d37dd5bb7a8f 100644
>>> --- a/drivers/xen/balloon.c
>>> +++ b/drivers/xen/balloon.c
>>> @@ -604,6 +604,7 @@ int alloc_xenballooned_pages(int nr_pages, struct
>>> page **pages)
>>>  while (pgno < nr_pages) {
>>>  page = balloon_retrieve(true);
>>>  if (page) {
>>> +   __ClearPageOffline(page);
>>>  pages[pgno++] = page;
>>>   #ifdef CONFIG_XEN_HAVE_PVMMU
>>>  /*
>>> @@ -645,8 +646,10 @@ void free_xenballooned_pages(int nr_pages, struct
>>> page **pages)
>>>  mutex_lock(&balloon_mutex);
>>>
>>>  for (i = 0; i < nr_pages; i++) {
>>> -   if (pages[i])
>>> +   if (pages[i]) {
>>> +   __SetPageOffline(pages[i]);
>>>  balloon_append(pages[i]);
>>> +   }
>>>  }
>>>
>>>  balloon_stats.target_unpopulated -= nr_pages;
>>>
>>>
>>> At least this way, the pages allocated (and thus eventually mapped to
>>> user space) would not be marked, but the other ones would remain marked
>>> and could be excluded by makedumptool.
>>>
>>
>> I think this patch should do the trick. Julien, could you give it a
>> try? On x86 I can't reproduce your problem easily as dom0 is PV with
>> plenty of unpopulated pages for grant memory not suffering from
>> missing "offline" bit.
> 
> Sure. I managed to get the console working with the patch suggested by 
> David. Feel free to add my tested-by if when you resend it as is.
> 

Thanks, I will send as proper patch later!

Cheers!

> Cheers,
> 


-- 

Thanks,

David / dhildenb


Re: [Xen-devel] xen: Can't insert balloon page into VM userspace (WAS Re: [linux-linus bisection] complete test-arm64-arm64-xl-xsm)

2019-03-14 Thread Juergen Gross
On 14/03/2019 15:12, Julien Grall wrote:
> Hi,
> 
> On 3/14/19 8:37 AM, Juergen Gross wrote:
>> On 12/03/2019 20:46, David Hildenbrand wrote:
>>> On 12.03.19 19:23, David Hildenbrand wrote:
>>>
>>> I guess something like this could do the trick if I understood it
>>> correctly:
>>>
>>> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
>>> index 39b229f9e256..d37dd5bb7a8f 100644
>>> --- a/drivers/xen/balloon.c
>>> +++ b/drivers/xen/balloon.c
>>> @@ -604,6 +604,7 @@ int alloc_xenballooned_pages(int nr_pages, struct
>>> page **pages)
>>>  while (pgno < nr_pages) {
>>>  page = balloon_retrieve(true);
>>>  if (page) {
>>> +   __ClearPageOffline(page);
>>>  pages[pgno++] = page;
>>>   #ifdef CONFIG_XEN_HAVE_PVMMU
>>>  /*
>>> @@ -645,8 +646,10 @@ void free_xenballooned_pages(int nr_pages, struct
>>> page **pages)
>>>  mutex_lock(&balloon_mutex);
>>>
>>>  for (i = 0; i < nr_pages; i++) {
>>> -   if (pages[i])
>>> +   if (pages[i]) {
>>> +   __SetPageOffline(pages[i]);
>>>  balloon_append(pages[i]);
>>> +   }
>>>  }
>>>
>>>  balloon_stats.target_unpopulated -= nr_pages;
>>>
>>>
>>> At least this way, the pages allocated (and thus eventually mapped to
>>> user space) would not be marked, but the other ones would remain marked
>>> and could be excluded by makedumptool.
>>>
>>
>> I think this patch should do the trick. Julien, could you give it a
>> try? On x86 I can't reproduce your problem easily as dom0 is PV with
>> plenty of unpopulated pages for grant memory not suffering from
>> missing "offline" bit.
> 
> Sure. I managed to get the console working with the patch suggested by
> David. Feel free to add my tested-by if when you resend it as is.

David, could you please send a proper patch with your Sob?


Juergen


Re: [Xen-devel] xen: Can't insert balloon page into VM userspace (WAS Re: [linux-linus bisection] complete test-arm64-arm64-xl-xsm)

2019-03-14 Thread David Hildenbrand
On 14.03.19 15:15, Juergen Gross wrote:
> On 14/03/2019 15:12, Julien Grall wrote:
>> Hi,
>>
>> On 3/14/19 8:37 AM, Juergen Gross wrote:
>>> On 12/03/2019 20:46, David Hildenbrand wrote:
 On 12.03.19 19:23, David Hildenbrand wrote:

 I guess something like this could do the trick if I understood it
 correctly:

 diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
 index 39b229f9e256..d37dd5bb7a8f 100644
 --- a/drivers/xen/balloon.c
 +++ b/drivers/xen/balloon.c
 @@ -604,6 +604,7 @@ int alloc_xenballooned_pages(int nr_pages, struct
 page **pages)
  while (pgno < nr_pages) {
  page = balloon_retrieve(true);
  if (page) {
 +   __ClearPageOffline(page);
  pages[pgno++] = page;
   #ifdef CONFIG_XEN_HAVE_PVMMU
  /*
 @@ -645,8 +646,10 @@ void free_xenballooned_pages(int nr_pages, struct
 page **pages)
  mutex_lock(&balloon_mutex);

  for (i = 0; i < nr_pages; i++) {
 -   if (pages[i])
 +   if (pages[i]) {
 +   __SetPageOffline(pages[i]);
  balloon_append(pages[i]);
 +   }
  }

  balloon_stats.target_unpopulated -= nr_pages;


 At least this way, the pages allocated (and thus eventually mapped to
 user space) would not be marked, but the other ones would remain marked
 and could be excluded by makedumptool.

>>>
>>> I think this patch should do the trick. Julien, could you give it a
>>> try? On x86 I can't reproduce your problem easily as dom0 is PV with
>>> plenty of unpopulated pages for grant memory not suffering from
>>> missing "offline" bit.
>>
>> Sure. I managed to get the console working with the patch suggested by
>> David. Feel free to add my tested-by if when you resend it as is.
> 
> David, could you please send a proper patch with your Sob?
> 

Yes, on it :)

Cheers!

> 
> Juergen
> 


-- 

Thanks,

David / dhildenb


Re: [Xen-devel] [PATCH v13] tolerate jitter in cpu_khz calculation to avoid TSC emulation

2019-03-14 Thread Jan Beulich
>>> On 14.03.19 at 14:50,  wrote:
> In staging the change would affect HVM and PVH. I never ran PVH,
> I have to assume it behaves like HVM in this regard.

And again you make it look as if PV wasn't also going through that
same path. Yet all that's there at the top of the function is

if ( is_pv_domain(d) && is_hardware_domain(d) )
{
d->arch.vtsc = 0;
return 0;
}

Jan



