date:20151222

>>> On 22.12.15 at 08:40,  wrote:
> Maybe, there are still some misunderstanding about your expectation.
> Let me summarize it here.
> 
> After Quan's patch-set, there are two types of error code:
> - -EOPNOTSUPP
> Now we only support and use software way to synchronize the invalidation,
> if someone calls queue_invalidate_wait() and passes sw with 0, then
> -EOPNOTSUPP is returned (Though this cannot happen in real world, since 
> queue_invalidate_wait() is called only in one place and 1 is passed in to 
> 'sw').
> So I am not sure what should we do for this return value, if we really get
> that return value, it means the flush is not actually executed, so the iommu
> state is incorrect, the data is inconsistent. Do you think what should we do
> for this case?

Since seeing this error would be a software bug, BUG() or ASSERT()
are fine to handle this specific case, if need be.

> - -ETIMEDOUT
> For this case, Quan has elaborate a lot, IIUIC, the main gap between you
> and Quan is you think the error code should be propagated to the up caller,
> while in Quan's implementation, he deals with this error in 
> invalidate_timeout()
> and device_tlb_invalidate_timeout(), hence no need to propagated it to
> up called, since the handling policy will crash the domain, so we don't think
> propagated error code is needed. Even we propagate it, the up caller still
> doesn't need to do anything for it.

"Handling" an error by e.g. domain_crash() doesn't mean you don't
need to also modify (or at the very least inspect) callers: They may
continue doing things _assuming_ success. Of course you don't
need to domain_crash() at all layers. But errors from lower layers
should, at least in most ordinary cases, lead to higher layers bailing
instead of continuing.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [qemu-upstream-4.2-testing test] 66816: regressions - FAIL

flight 66816 qemu-upstream-4.2-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/66816/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i3865 xen-build fail REGR. vs. 62044
 build-amd64   5 xen-build fail REGR. vs. 62044

Tests which did not succeed, but are not blocking:
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-i386-i386-xl-qemuu-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xend-qemuu-winxpsp3  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a

version targeted for testing:
 qemuu5081fc1c773d2a83ec7a867f030323b8b6956cd1
baseline version:
 qemuuc17e602ae64fb24405ebd256679ba9a6f5819086

Last test of basis62044  2015-09-15 15:06:42 Z   97 days
Testing same since66542  2015-12-18 16:37:10 Z3 days4 attempts


People who touched revisions under test:
  Stefano Stabellini 

jobs:
 build-amd64  fail
 build-i386   fail
 build-amd64-libvirt  blocked 
 build-i386-libvirt   blocked 
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-i386-qemuu-rhel6hvm-amd   blocked 
 test-amd64-amd64-xl-qemuu-debianhvm-amd64blocked 
 test-amd64-i386-xl-qemuu-debianhvm-amd64 blocked 
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 
 test-amd64-amd64-xl-qemuu-win7-amd64 blocked 
 test-amd64-i386-xl-qemuu-win7-amd64  blocked 
 test-amd64-i386-qemuu-rhel6hvm-intel blocked 
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 blocked 
 test-amd64-i386-xend-qemuu-winxpsp3  blocked 
 test-amd64-amd64-xl-qemuu-winxpsp3   blocked 
 test-i386-i386-xl-qemuu-winxpsp3 blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 5081fc1c773d2a83ec7a867f030323b8b6956cd1
Author: Stefano Stabellini 
Date:   Fri Dec 18 15:45:14 2015 +

xenfb: avoid reading twice the same fields from the shared page

Reading twice the same field could give the guest an attack of
opportunity. In the case of event->type, gcc could compile the switch
statement into a jump table, effectively ending up reading the type
field multiple times.

This is part of XSA-155.

Signed-off-by: Stefano Stabellini 

commit 74fab2ef4c0ba42af477e9e445c9883cc45cf9e6
Author: Stefano Stabellini 
Date:   Fri Dec 18 15:44:58 2015 +

xen/blkif: Avoid double access to src->nr_segments

src is stored in shared memory and src->nr_segments is dereferenced
twice at the end of the function.  If a compiler decides to compile this
into two separate memory accesses then the size limitation could be
bypassed.

Fix it by removing the double access to src->nr_segments.

This is part of XSA-155.

Signed-off-by: Stefano Stabellini

Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue

>On 22.12.2015 at 4:01pm  wrote:
> >>> On 22.12.15 at 08:40,  wrote:
> > Maybe, there are still some misunderstanding about your expectation.
> > Let me summarize it here.
> >
> > After Quan's patch-set, there are two types of error code:
> > - -EOPNOTSUPP
> > Now we only support and use software way to synchronize the
> > invalidation, if someone calls queue_invalidate_wait() and passes sw
> > with 0, then -EOPNOTSUPP is returned (Though this cannot happen in
> > real world, since
> > queue_invalidate_wait() is called only in one place and 1 is passed in to 
> > 'sw').
> > So I am not sure what should we do for this return value, if we really
> > get that return value, it means the flush is not actually executed, so
> > the iommu state is incorrect, the data is inconsistent. Do you think
> > what should we do for this case?
> 
> Since seeing this error would be a software bug, BUG() or ASSERT() are fine to
> handle this specific case, if need be.
> 
> > - -ETIMEDOUT
> > For this case, Quan has elaborate a lot, IIUIC, the main gap between
> > you and Quan is you think the error code should be propagated to the
> > up caller, while in Quan's implementation, he deals with this error in
> > invalidate_timeout()
> > and device_tlb_invalidate_timeout(), hence no need to propagated it to
> > up called, since the handling policy will crash the domain, so we
> > don't think propagated error code is needed. Even we propagate it, the
> > up caller still doesn't need to do anything for it.
> 
> "Handling" an error by e.g. domain_crash() doesn't mean you don't need to also
> modify (or at the very least inspect) callers: They may continue doing things
> _assuming_ success. Of course you don't need to domain_crash() at all layers.
> But errors from lower layers should, at least in most ordinary cases, lead to
> higher layers bailing instead of continuing.
> 

For Device-TLB flush error, I think we need to propagated error code.
For IEC/iotlb/context flush error, if panic is acceptable, we can ignore the 
propagated error code. BTW, it is very challenge / tricky to handle all
Of error, and some error is unrecoverable. As mentioned, it looks like 
rewriting Xen hypervisor.

For -EOPNOTSUPP, we can print warning message. If it supports interrupt method, 
we can return 0 in queue_invalidate_wait().

Feng, thanks for your update!

-Quan











___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue

2015-12-22 Thread Wu, Feng



> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Tuesday, December 22, 2015 4:01 PM
> To: Wu, Feng ; Xu, Quan 
> Cc: 'andrew.coop...@citrix.com' ;
> 'george.dun...@eu.citrix.com' ; Nakajima, Jun
> ; Tian, Kevin ; 'xen-
> de...@lists.xen.org' ; 'k...@xen.org' ;
> 't...@xen.org' 
> Subject: RE: [Xen-devel] [PATCH v3 0/2] VT-d flush issue
> 
> >>> On 22.12.15 at 08:40,  wrote:
> > Maybe, there are still some misunderstanding about your expectation.
> > Let me summarize it here.
> >
> > After Quan's patch-set, there are two types of error code:
> > - -EOPNOTSUPP
> > Now we only support and use software way to synchronize the invalidation,
> > if someone calls queue_invalidate_wait() and passes sw with 0, then
> > -EOPNOTSUPP is returned (Though this cannot happen in real world, since
> > queue_invalidate_wait() is called only in one place and 1 is passed in to 
> > 'sw').
> > So I am not sure what should we do for this return value, if we really get
> > that return value, it means the flush is not actually executed, so the iommu
> > state is incorrect, the data is inconsistent. Do you think what should we do
> > for this case?
> 
> Since seeing this error would be a software bug, BUG() or ASSERT()
> are fine to handle this specific case, if need be.
> 
> > - -ETIMEDOUT
> > For this case, Quan has elaborate a lot, IIUIC, the main gap between you
> > and Quan is you think the error code should be propagated to the up caller,
> > while in Quan's implementation, he deals with this error in
> > invalidate_timeout()
> > and device_tlb_invalidate_timeout(), hence no need to propagated it to
> > up called, since the handling policy will crash the domain, so we don't 
> > think
> > propagated error code is needed. Even we propagate it, the up caller still
> > doesn't need to do anything for it.
> 
> "Handling" an error by e.g. domain_crash() doesn't mean you don't
> need to also modify (or at the very least inspect) callers: They may
> continue doing things _assuming_ success. Of course you don't
> need to domain_crash() at all layers. But errors from lower layers
> should, at least in most ordinary cases, lead to higher layers bailing
> instead of continuing.

So there are two questions:
1. Just confirmed with Quan, for IEC/context/iotlb flush timeout, the policy
will be Xen panic. Do you agree with this policy? If you do, we don't need to
pass the error code to the caller, right? If you don't, we may need more
discussion about how the handle this case first before anything else.
2. For Device iotlb flush, the current policy is to crash the domain, in this 
case
we need to check case by case whether the caller need to handle something
if timeout is encountered, right? If needed, we need do something there, if
not needed, we can just let it be.

Thanks,
Feng

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables

2015-12-22 Thread Han, Huaitong

On Mon, 2015-12-21 at 08:32 -0700, Jan Beulich wrote:
> > > > On 21.12.15 at 08:21,  wrote:
> > --- a/xen/arch/x86/mm/guest_walk.c
> > +++ b/xen/arch/x86/mm/guest_walk.c
> > @@ -90,6 +90,55 @@ static uint32_t set_ad_bits(void *guest_p, void
> > *walk_p, int set_dirty)
> >  return 0;
> >  }
> >  
> > +#if GUEST_PAGING_LEVELS >= CONFIG_PAGING_LEVELS
> 
> GUEST_PAGING_LEVELS >= 4 (just like further down)
The code is modified according Andrew's comments:
"
  This is a latent linking bug for the future 
  when 5 levels comes along.

  It will probably be best to use the same trick
  as gw_page_flags to compile it once but use it
  multiple times.
"

> 
> > +bool_t leaf_pte_pkeys_check(struct vcpu *vcpu,
> > +uint32_t pfec, uint32_t pte_pkey)
> > +{
> > +unsigned int pkru = 0;
> > +bool_t pkru_ad, pkru_wd;
> > +
> 
> Stray blank line.
> 
> > +bool_t pf = !!(pfec & PFEC_page_present);
> > +bool_t uf = !!(pfec & PFEC_user_mode);
> > +bool_t wf = !!(pfec & PFEC_write_access);
> > +bool_t ff = !!(pfec & PFEC_insn_fetch);
> > +bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
> > +
> > +/* When page is present,  PFEC_prot_key is always checked */
> > +if ( !pf || is_pv_vcpu(vcpu) )
> > +return 0;
> 
> I think for a function called "check" together with how its callers
> use
> it the return value meaning is inverted here. 
I will change the function name to "pkey_page_fault".

> Also the comment seems
> inverted wrt the actual check (and is missing a full stop). And
> doesn't
> key 0 have static meaning, in which case you could bail early (and
> namely avoid the expensive RDPKRU further down)?
Key 0 have no static meaning, the default key maybe different in
different OS.
> 
> > +/*
> > + * PKU:  additional mechanism by which the paging controls
> > + * access to user-mode addresses based on the value in the
> > + * PKRU register. A fault is considered as a PKU violation if
> > all
> > + * of the following conditions are ture:
> > + * 1.CR4_PKE=1.
> > + * 2.EFER_LMA=1.
> > + * 3.page is present with no reserved bit violations.
> > + * 4.the access is not an instruction fetch.
> > + * 5.the access is to a user page.
> > + * 6.PKRU.AD=1
> > + *   or The access is a data write and PKRU.WD=1
> > + *and either CR0.WP=1 or it is a user access.
> > + */
> > +if ( !hvm_pku_enabled(vcpu) ||
> > +!hvm_long_mode_enabled(vcpu) || rsvdf || ff )
> 
> Where's the "user page" check? Also - indentation.
it should be:
if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) ||
rsvdf || ff || !(pte_flags & _PAGE_USER) )
> 
> > +return 0;
> > +
> > +pkru = read_pkru();
> > +if ( unlikely(pkru) )
> > +{
> > +pkru_ad = read_pkru_ad(pkru, pte_pkey);
> > +pkru_wd = read_pkru_wd(pkru, pte_pkey);
> > +/* Condition 6 */
> > +if ( pkru_ad || (pkru_wd && wf && (hvm_wp_enabled(vcpu) ||
> > uf)))
> 
> Ah, uf is being checked here. But according to the comment it could
> (and should, again to avoid the RDPKRU) move up.
> 
> > @@ -270,6 +324,12 @@ guest_walk_tables(struct vcpu *v, struct
> > p2m_domain *p2m,
> >  
> >  pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v); 
> >  
> > +#if GUEST_PAGING_LEVELS >= 4
> > +pkey = guest_l2e_get_pkey(gw->l2e);
> > +if ( pse2M && leaf_pte_pkeys_check(v, pfec, pkey) )
> > +rc |= _PAGE_PKEY_BITS;
> > +#endif
> 
> I think the #ifdef isn't really needed here, if you moved the one
> around leaf_pte_pkeys_check() into that function, and if you
> perhaps also dropped the "pkey" local variable.
guest_l2e_get_pkey has different macro depend on GUEST_PAGING_LEVELS
too, and I think it's better to keep it.
> 
> Jan
> 
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue

>>> On 22.12.15 at 09:10,  wrote:

> 
>> -Original Message-
>> From: Jan Beulich [mailto:jbeul...@suse.com]
>> Sent: Tuesday, December 22, 2015 4:01 PM
>> To: Wu, Feng ; Xu, Quan 
>> Cc: 'andrew.coop...@citrix.com' ;
>> 'george.dun...@eu.citrix.com' ; Nakajima, Jun
>> ; Tian, Kevin ; 'xen-
>> de...@lists.xen.org' ; 'k...@xen.org' 
>> ;
>> 't...@xen.org' 
>> Subject: RE: [Xen-devel] [PATCH v3 0/2] VT-d flush issue
>> 
>> >>> On 22.12.15 at 08:40,  wrote:
>> > Maybe, there are still some misunderstanding about your expectation.
>> > Let me summarize it here.
>> >
>> > After Quan's patch-set, there are two types of error code:
>> > - -EOPNOTSUPP
>> > Now we only support and use software way to synchronize the invalidation,
>> > if someone calls queue_invalidate_wait() and passes sw with 0, then
>> > -EOPNOTSUPP is returned (Though this cannot happen in real world, since
>> > queue_invalidate_wait() is called only in one place and 1 is passed in to 
> 'sw').
>> > So I am not sure what should we do for this return value, if we really get
>> > that return value, it means the flush is not actually executed, so the 
> iommu
>> > state is incorrect, the data is inconsistent. Do you think what should we 
> do
>> > for this case?
>> 
>> Since seeing this error would be a software bug, BUG() or ASSERT()
>> are fine to handle this specific case, if need be.
>> 
>> > - -ETIMEDOUT
>> > For this case, Quan has elaborate a lot, IIUIC, the main gap between you
>> > and Quan is you think the error code should be propagated to the up caller,
>> > while in Quan's implementation, he deals with this error in
>> > invalidate_timeout()
>> > and device_tlb_invalidate_timeout(), hence no need to propagated it to
>> > up called, since the handling policy will crash the domain, so we don't 
> think
>> > propagated error code is needed. Even we propagate it, the up caller still
>> > doesn't need to do anything for it.
>> 
>> "Handling" an error by e.g. domain_crash() doesn't mean you don't
>> need to also modify (or at the very least inspect) callers: They may
>> continue doing things _assuming_ success. Of course you don't
>> need to domain_crash() at all layers. But errors from lower layers
>> should, at least in most ordinary cases, lead to higher layers bailing
>> instead of continuing.
> 
> So there are two questions:
> 1. Just confirmed with Quan, for IEC/context/iotlb flush timeout, the policy
> will be Xen panic. Do you agree with this policy? If you do, we don't need 
> to
> pass the error code to the caller, right? If you don't, we may need more
> discussion about how the handle this case first before anything else.

Just to repeat several earlier statements: BUG() and panic() ought
to be avoided wherever possible (i.e. including here).

> 2. For Device iotlb flush, the current policy is to crash the domain, in 
> this case
> we need to check case by case whether the caller need to handle something
> if timeout is encountered, right? If needed, we need do something there, if
> not needed, we can just let it be.

"Just let it be" is wrong (and is what led to the bad state the code is
in right now in this regard). I gave advice for the direction before:
Add __must_check. This will make the compiler point out to you
where changes are needed.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue

>>> On 22.12.15 at 09:10,  wrote:
> For Device-TLB flush error, I think we need to propagated error code.
> For IEC/iotlb/context flush error, if panic is acceptable, we can ignore the 
> propagated error code. BTW, it is very challenge / tricky to handle all
> Of error, and some error is unrecoverable. As mentioned, it looks like 
> rewriting Xen hypervisor.

We're moving in circles. In particular I don't believe this last sentence.
Re-writing many parts of the hypervisor would be required is you
were to return the error to the guest (which technically isn't possible
in many case afaict). Having crashed the guest, you don't need to be
concerned about unrolling previous (partially completed) operations,
so I don't think it's this much of a rewrite.

And just to be clear - I hope you recall that the current approach
taken to the flush issue is already a compromise far away from
where we initially meant the code to move, and it was always
clear that the bad error handling state the code is in is going to
get in the way of this being a simple fix. It's sad that the people
originally having introduced that code can't be held responsible
for fixing this up, but that's a situation we find ourselves in all
the time.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling

>>> On 22.12.15 at 03:54,  wrote:
> On Mon, 2015-12-21 at 08:07 -0700, Jan Beulich wrote:
>> > > > On 21.12.15 at 08:21,  wrote:
>> > @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input, unsigned
>> > int *eax, unsigned int *ebx,
>> >  /* Don't expose INVPCID to non-hap hvm. */
>> >  if ( (count == 0) && !hap_enabled(d) )
>> >  *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
>> > +
>> > +/* X86_FEATURE_PKU is not yet implemented for shadow
>> > paging. */
>> > +if ( (count == 0) && !hap_enabled(d) )
>> > +*ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
>> 
>> I'm still missing the xsave dependency here.
> Xsave dependency deletion becasue we use RDPKRU to get PKRU register
> value instead of XSAVE now.

What the hypervisor does doesn't matter here. The question is
whether from an architectural standpoint XSAVE is a prerequsite.
If it is, then you need to clear PKU when _guest_ XSAVE is clear.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue

> On December 22, 2015 4:27pm,  wrote:
> >>> On 22.12.15 at 09:10,  wrote:
> > For Device-TLB flush error, I think we need to propagated error code.
> > For IEC/iotlb/context flush error, if panic is acceptable, we can
> > ignore the propagated error code. BTW, it is very challenge / tricky
> > to handle all Of error, and some error is unrecoverable. As mentioned,
> > it looks like rewriting Xen hypervisor.
> 
> We're moving in circles. In particular I don't believe this last sentence.
> Re-writing many parts of the hypervisor would be required is you were to 
> return
> the error to the guest (which technically isn't possible in many case afaict).
> Having crashed the guest, you don't need to be concerned about unrolling
> previous (partially completed) operations, so I don't think it's this much of 
> a
> rewrite.
> 
> And just to be clear - I hope you recall that the current approach taken to 
> the
> flush issue is already a compromise far away from where we initially meant the
> code to move, and it was always clear that the bad error handling state the
> code is in is going to get in the way of this being a simple fix. It's sad 
> that the
> people originally having introduced that code can't be held responsible for 
> fixing
> this up, but that's a situation we find ourselves in all the time.
> 
> Jan

Jan,
Let's finish our discussion. I accept your idea. But I need to separate it into 
3 patch set (It is complicated for me, sometime it makes me crash.):
   Patch set 1:  Device-TLB/iotlb flush error. (send out this week)
   Patch set 2:  context flush error. (need 2 ~ 3 weeks)
   Patch set 3:  iec flush error. (need 3 ~ 4 weeks)

If it is acceptable, we can discuss in detail one by one..

Thanks
-Quan





___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables

>>> On 22.12.15 at 09:12,  wrote:
> On Mon, 2015-12-21 at 08:32 -0700, Jan Beulich wrote:
>> > > > On 21.12.15 at 08:21,  wrote:
>> > --- a/xen/arch/x86/mm/guest_walk.c
>> > +++ b/xen/arch/x86/mm/guest_walk.c
>> > @@ -90,6 +90,55 @@ static uint32_t set_ad_bits(void *guest_p, void
>> > *walk_p, int set_dirty)
>> >  return 0;
>> >  }
>> >  
>> > +#if GUEST_PAGING_LEVELS >= CONFIG_PAGING_LEVELS
>> 
>> GUEST_PAGING_LEVELS >= 4 (just like further down)
> The code is modified according Andrew's comments:
> "
>   This is a latent linking bug for the future 
>   when 5 levels comes along.
> 
>   It will probably be best to use the same trick
>   as gw_page_flags to compile it once but use it
>   multiple times.
> "

Okay, understood. But then you should follow _that_ model (using
== instead of >=) instead of inventing a 3rd variant. Similar things
should be done in similar ways so that they can be easily recognized
being similar.

Otoh the function isn't being called from code other than
GUEST_PAGING_LEVELS == 4, so at present it could be static,
which would take care of the latent linking issue Andrew referred to.

>> > +bool_t pf = !!(pfec & PFEC_page_present);
>> > +bool_t uf = !!(pfec & PFEC_user_mode);
>> > +bool_t wf = !!(pfec & PFEC_write_access);
>> > +bool_t ff = !!(pfec & PFEC_insn_fetch);
>> > +bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
>> > +
>> > +/* When page is present,  PFEC_prot_key is always checked */
>> > +if ( !pf || is_pv_vcpu(vcpu) )
>> > +return 0;
>> 
>> I think for a function called "check" together with how its callers
>> use
>> it the return value meaning is inverted here. 
> I will change the function name to "pkey_page_fault".

Perhaps just pkey_fault() then, since it's clear there's no other fault
possible here besides a page fault?

>> Also the comment seems
>> inverted wrt the actual check (and is missing a full stop). And
>> doesn't
>> key 0 have static meaning, in which case you could bail early (and
>> namely avoid the expensive RDPKRU further down)?
> Key 0 have no static meaning, the default key maybe different in
> different OS.

Okay - I thought I had seen something like this mentioned in
another sub-thread.

>> > @@ -270,6 +324,12 @@ guest_walk_tables(struct vcpu *v, struct
>> > p2m_domain *p2m,
>> >  
>> >  pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v); 
>> >  
>> > +#if GUEST_PAGING_LEVELS >= 4
>> > +pkey = guest_l2e_get_pkey(gw->l2e);
>> > +if ( pse2M && leaf_pte_pkeys_check(v, pfec, pkey) )
>> > +rc |= _PAGE_PKEY_BITS;
>> > +#endif
>> 
>> I think the #ifdef isn't really needed here, if you moved the one
>> around leaf_pte_pkeys_check() into that function, and if you
>> perhaps also dropped the "pkey" local variable.
> guest_l2e_get_pkey has different macro depend on GUEST_PAGING_LEVELS
> too, and I think it's better to keep it.

I didn't suggest to drop guest_l2e_get_pkey(). I'd like to see the
#ifdef-s go away if at all possible, to help code readability.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [xen-4.3-testing test] 66779: regressions - FAIL

flight 66779 xen-4.3-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/66779/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   5 xen-build fail REGR. vs. 65650
 build-amd64-prev  5 xen-build fail REGR. vs. 65650
 build-i386-prev   5 xen-build fail REGR. vs. 65650
 build-i3865 xen-build fail REGR. vs. 65650

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 build-i386-rumpuserxen1 build-check(1)   blocked  n/a
 build-amd64-rumpuserxen   1 build-check(1)   blocked  n/a
 test-amd64-amd64-pv   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-i386-migrupgrade   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-migrupgrade  1 build-check(1)   blocked  n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-amd64-amd64-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-i386-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-i386-pv1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-qemut-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-raw1 build-check(1)   blocked  n/a
 test-amd64-amd64-pygrub   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemut-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-qemut-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-amd64-xl-qemut-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-qcow2 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-i386-xend-qemut-winxpsp3  1 build-check(1)  blocked n/a
 test-armhf-armhf-xl-multivcpu  6 xen-boot fail  never pass
 test-armhf-armhf-libvirt-qcow2  6 xen-boot fail never pass
 test-armhf-armhf-xl-vhd   6 xen-boot fail   never pass
 test-armhf-armhf-libvirt  6 xen-boot fail   never pass
 test-armhf-armhf-xl-credit2   6 xen-boot fail   never pass
 test-armhf-armhf-xl-cubietruck  6 xen-boot fail never pass
 test-armhf-armhf-libvirt-raw  6 xen-boot fail   never pass
 test-armhf-armhf-xl   6 xen-boot fail   never pass
 test-armhf-armhf-xl-arndale   6 xen-boot fail   never pass

version targeted for testing:
 xen  0574a773238f0ecce3873dc525082f6e16ac655b
baseline version:
 xen  54dd84062d4e0f0298508826fc06990415da431e

Last test of basis65650  2015-12-10 03:5

[Xen-devel] [qemu-upstream-4.3-testing test] 66831: regressions - FAIL

flight 66831 qemu-upstream-4.3-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/66831/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i3865 xen-build fail REGR. vs. 62112
 build-amd64   5 xen-build fail REGR. vs. 62112

Tests which did not succeed, but are not blocking:
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-amd64-amd64-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-raw1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-i386-pv1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-amd64-pv   1 build-check(1)   blocked  n/a
 test-amd64-amd64-i386-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qcow2 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-pygrub   1 build-check(1)   blocked  n/a

version targeted for testing:
 qemuu6e184363e64a0610c35ca231bfc73cea56eb02f3
baseline version:
 qemuub188780861662e8cf1847ec562799b32bb44f05e

Last test of basis62112  2015-09-18 04:07:28 Z   95 days
Testing same since66538  2015-12-18 16:12:07 Z3 days4 attempts


People who touched revisions under test:
  Stefano Stabellini 

jobs:
 build-amd64  fail
 build-i386   fail
 build-amd64-libvirt  blocked 
 build-i386-libvirt   blocked 
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl  blocked 
 test-amd64-i386-xl   blocked 
 test-amd64-i386-qemuu-rhel6hvm-amd   blocked 
 test-amd64-amd64-xl-qemuu-debianhvm-amd64blocked 
 test-amd64-i386-xl-qemuu-debianhvm-amd64 blocked 
 test-amd64-i386-freebsd10-amd64  blocked 
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 
 test-amd64-amd64-xl-qemuu-win7-amd64 blocked 
 test-amd64-i386-xl-qemuu-win7-amd64  blocked 
 test-amd64-amd64-xl-credit2  blocked 
 test-amd64-i386-freebsd10-i386   blocked 
 test-amd64-i386-qemuu-rhel6hvm-intel blocked 
 test-amd64-amd64-libvirt blocked 
 test-amd64-i386-libvirt  blocked 
 test-amd64-amd64-xl-multivcpublocked 
 test-amd64-amd64-pairblocked 
 test-amd64-i386-pair blocked 
 test-amd64-amd64-pv  blocked 
 test-amd64-i386-pv

Re: [Xen-devel] Xen-devel Digest, Vol 130, Issue 521

2015-12-22 Thread Sander Eikelenboom

On 2015-12-21 23:58, xen-devel-requ...@lists.xen.org wrote:

Send Xen-devel mailing list submissions to
xen-devel@lists.xen.org

To subscribe or unsubscribe via the World Wide Web, visit
http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel
or, via email, send a message with subject or body 'help' to
xen-devel-requ...@lists.xen.org

You can reach the person managing the list at
xen-devel-ow...@lists.xen.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Xen-devel digest..."

Today's Topics:

1. hvc console not working on 4.4.0 guest (M A Young)

I am testing Fedora rawhide in a xen pv guest and I get the error
[9.809550] genirq: Flags mismatch irq 8. (hvc_console) vs.
(rtc0)
[9.809714] hvc_open: request_irq failed with rc -16.
during the boot and the hvc console doesn't respond when the boot
finishes. I have noticed this I think since the update through various
linux 4.4.0 kernels up to the current 4.4.0-0.rc5.git3.1.fc24.x86_64
one.
I do get an hvc console if I use the Fedora 23 kernel
4.2.6-300.fc23.x86_64

from before the update.
Is this a known problem? I have attached the boot log up to the above
error.

Michael Young

Hi Michael,

Yes it's a known problem, the patch for it was just accepted into the
tip tree:

https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=d8c98a1d1488747625ad6044d423406e17e99b7a

So it will land in 4.4-final, the 2 patches causing this regression were
unfortunately also already backported
to the stable trees (despite objection), but hopefully the fix wil also
hit stable in the next round.

--
Sander

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] x86/mmuext: Unify okay/rc error handling in do_mmuext_op()

>>> On 21.12.15 at 18:16,  wrote:
> c/s 506db90 "x86/HVM: merge HVM and PVH hypercall tables" introduced a path
> whereby 'okay' was used uninitialised, with broke compilation on CentOS 7.

It appeared to be used uninitialized, but wasn't in fact (i.e. the
outcome - the value rc gets set to - didn't depend on the value
due to

if ( unlikely(!okay) && !rc )
rc = -EINVAL;

being equivalent to

if ( !rc && unlikely(!okay) )
rc = -EINVAL;

(no side effects for the expressions on either side of the &&).
I'll re-word accordingly upon committing, to not give the false
impression of there having been other than a cosmetic problem.

> Splitting the error handling like this is fragile and unnecessary.  Drop the
> okay variable entirely and just use rc directly, substituting rc = -EINVAL/0
> for okay = 0/1.
> 
> In addition, two error messages are updated to print rc, and some stray
> whitespace is dropped.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] x86/mmuext: Unify okay/rc error handling in do_mmuext_op()

On 22/12/2015 08:57, Jan Beulich wrote:
 On 21.12.15 at 18:16,  wrote:
>> c/s 506db90 "x86/HVM: merge HVM and PVH hypercall tables" introduced a path
>> whereby 'okay' was used uninitialised, with broke compilation on CentOS 7.
> It appeared to be used uninitialized, but wasn't in fact (i.e. the
> outcome - the value rc gets set to - didn't depend on the value
> due to
>
> if ( unlikely(!okay) && !rc )
> rc = -EINVAL;
>
> being equivalent to
>
> if ( !rc && unlikely(!okay) )
> rc = -EINVAL;
>
> (no side effects for the expressions on either side of the &&).
> I'll re-word accordingly upon committing, to not give the false
> impression of there having been other than a cosmetic problem.

There is a real problem.  Because the compiler is able to prove that
okay is genuinely read uninitialised in one case, the rules concerning
undefined behaviour permit it to do anything it wishes, including
omitting this if statement.

As far as practical problems go however, it is the build breakage which
is relevant, and it breaks because of a -Werror=maybe-uninitialised.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue

>>> On 22.12.15 at 09:43,  wrote:
> Let's finish our discussion. I accept your idea. But I need to separate it 
> into 3 patch set (It is complicated for me, sometime it makes me crash.):
>Patch set 1:  Device-TLB/iotlb flush error. (send out this week)
>Patch set 2:  context flush error. (need 2 ~ 3 weeks)
>Patch set 3:  iec flush error. (need 3 ~ 4 weeks)
> 
> If it is acceptable, we can discuss in detail one by one..

Splitting up is of course acceptable.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue

> On 22.12.2015 at 5:09pm,  wrote:
> >>> On 22.12.15 at 09:43,  wrote:
> > Let's finish our discussion. I accept your idea. But I need to
> > separate it into 3 patch set (It is complicated for me, sometime it makes me
> crash.):
> >Patch set 1:  Device-TLB/iotlb flush error. (send out this week)
> >Patch set 2:  context flush error. (need 2 ~ 3 weeks)
> >Patch set 3:  iec flush error. (need 3 ~ 4 weeks)
> >
> > If it is acceptable, we can discuss in detail one by one..
> 
> Splitting up is of course acceptable.
> 

Thanks Jan. I am getting started to Patch set 1. 

Quan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling

2015-12-22 Thread Han, Huaitong

On Tue, 2015-12-22 at 01:30 -0700, Jan Beulich wrote:
> > > > On 22.12.15 at 03:54,  wrote:
> > On Mon, 2015-12-21 at 08:07 -0700, Jan Beulich wrote:
> > > > > > On 21.12.15 at 08:21,  wrote:
> > > > @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input,
> > > > unsigned
> > > > int *eax, unsigned int *ebx,
> > > >  /* Don't expose INVPCID to non-hap hvm. */
> > > >  if ( (count == 0) && !hap_enabled(d) )
> > > >  *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> > > > +
> > > > +/* X86_FEATURE_PKU is not yet implemented for shadow
> > > > paging. */
> > > > +if ( (count == 0) && !hap_enabled(d) )
> > > > +*ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
> > > 
> > > I'm still missing the xsave dependency here.
> > Xsave dependency deletion becasue we use RDPKRU to get PKRU
> > register
> > value instead of XSAVE now.
> 
> What the hypervisor does doesn't matter here. The question is
> whether from an architectural standpoint XSAVE is a prerequsite.
> If it is, then you need to clear PKU when _guest_ XSAVE is clear.
No, XSAVE is not a prerequsite.
> 
> Jan
> 
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [linux-3.14 test] 65633: regressions - FAIL

2015-12-22 Thread Xiao Guangrong




On 12/16/2015 09:32 AM, Robert Hu wrote:

On Fri, 2015-12-11 at 12:01 +, Ian Campbell wrote:

On Fri, 2015-12-11 at 11:48 +0800, Robert Hu wrote:

On Fri, 2015-12-11 at 01:16 +, osstest service owner wrote:

flight 65633 linux-3.14 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/65633/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
  test-amd64-i386-rumpuserxen-i386 10 guest-start   fail REGR.
vs. 64562

[trim...]
Hi Ian,

Why does it still fails there and even marked 'never pass' now?


This is the test of the linux-3.14 branch, not the xen-unstable branch
which was failing before.

Once the revert passes through the xen-unstable push gate then the linux-
3.14 branch (and most other branches) will pick up that change.

I don't know why the nested test case has never passed on the 3.14 branch,
someone would have to investigate if they think that is a problem.


Guangrong will take a look at this along with the nested failure issue
on Xen side.



Yes. I am still working on it and please be patient, i need
some time to build up the knowledge and figure the root case out.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] libxc: Defer initialization of start_page for HVM guests

2015-12-22 Thread Roger Pau Monné

El 22/12/15 a les 0.45, Boris Ostrovsky ha escrit:
> With commit 8c45adec18e0 ("libxc: create unmapped initrd in domain
> builder if supported") location of ramdisk may not be available to
> HVMlite guests by the time alloc_magic_pages_hvm() is invoked if the
> guest supports unmapped initrd.
> 
> Since the only operation related to allocating magic pages in that
> routine is allocation of HVMlite start info we can move everything
> else to a later point such as xc_dom_arch.start_info().
> 
> Signed-off-by: Boris Ostrovsky 
> ---
> I am not sure xc_dom_arch.start_info() is the right place neither since we
> still are doing things that have nothing to do with start_info.
>
>  tools/libxc/xc_dom_x86.c |   45 ++---
>  1 files changed, 30 insertions(+), 15 deletions(-)
> 
> diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
> index 3960875..f64079e 100644
> --- a/tools/libxc/xc_dom_x86.c
> +++ b/tools/libxc/xc_dom_x86.c
> @@ -585,6 +585,32 @@ static void build_hvm_info(void *hvm_info_page, struct 
> xc_dom_image *dom)
>  
>  static int alloc_magic_pages_hvm(struct xc_dom_image *dom)
>  {
> +struct xc_dom_seg seg;
> +size_t start_info_size = sizeof(struct hvm_start_info);
> +
> +if ( dom->device_model )
> +return 0;
> +
> +if ( dom->cmdline )
> +start_info_size += ROUNDUP(strlen(dom->cmdline) + 1, 8);
> +if ( dom->ramdisk_blob )
> +/* Limited to one module. */
> +start_info_size += sizeof( struct hvm_modlist_entry);
 ^ extra space.
> +
> +if ( xc_dom_alloc_segment(dom, &seg, "HVMlite start info", 0,
> +  start_info_size) )
> +{
> +DOMPRINTF("Unable to reserve memory for the start info");
> +return -1;
> +}
> +
> +dom->start_info_pfn = seg.pfn;
> +
> +return 0;
> +}
> +
> +static int start_info_hvm(struct xc_dom_image *dom)
> +{
>  unsigned long i;
>  void *hvm_info_page;
>  uint32_t *ident_pt, domid = dom->guest_domid;
> @@ -636,7 +662,6 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom)
>  
>  if ( !dom->device_model )
>  {
> -struct xc_dom_seg seg;
>  struct hvm_start_info *start_info;
>  char *cmdline;
>  struct hvm_modlist_entry *modlist;
> @@ -653,17 +678,9 @@ static int alloc_magic_pages_hvm(struct xc_dom_image 
> *dom)
>  if ( dom->ramdisk_blob )
>  start_info_size += sizeof(*modlist); /* Limited to one module. */
>  
> -rc = xc_dom_alloc_segment(dom, &seg, "HVMlite start info", 0,
> -  start_info_size);
> -if ( rc != 0 )
> -{
> -DOMPRINTF("Unable to reserve memory for the start info");
> -goto out;
> -}
> -
>  start_page = xc_map_foreign_range(xch, domid, start_info_size,

IMHO, I would stash start_info_size somewhere inside xc_dom_image in
order to avoid calculating it twice.

Also, why is everything done inside of alloc_magic_pages_hvm moved to
start_info_hvm? AFAICT we should only need to move the code that fills
the hvm_start_info struct, but the rest of the code already present in
alloc_magic_pages_hvm could be left as-is.

Roger.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling

>>> On 22.12.15 at 10:25,  wrote:
> On Tue, 2015-12-22 at 01:30 -0700, Jan Beulich wrote:
>> > > > On 22.12.15 at 03:54,  wrote:
>> > On Mon, 2015-12-21 at 08:07 -0700, Jan Beulich wrote:
>> > > > > > On 21.12.15 at 08:21,  wrote:
>> > > > @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input,
>> > > > unsigned
>> > > > int *eax, unsigned int *ebx,
>> > > >  /* Don't expose INVPCID to non-hap hvm. */
>> > > >  if ( (count == 0) && !hap_enabled(d) )
>> > > >  *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
>> > > > +
>> > > > +/* X86_FEATURE_PKU is not yet implemented for shadow
>> > > > paging. */
>> > > > +if ( (count == 0) && !hap_enabled(d) )
>> > > > +*ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
>> > > 
>> > > I'm still missing the xsave dependency here.
>> > Xsave dependency deletion becasue we use RDPKRU to get PKRU
>> > register
>> > value instead of XSAVE now.
>> 
>> What the hypervisor does doesn't matter here. The question is
>> whether from an architectural standpoint XSAVE is a prerequsite.
>> If it is, then you need to clear PKU when _guest_ XSAVE is clear.
> No, XSAVE is not a prerequsite.

I.e. OSes are expected to deal with both cases (PKU w/ XSAVE and
PKU w/o XSAVE)? Sounds like a recipe for trouble, but well - okay.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] x86/mmuext: Unify okay/rc error handling in do_mmuext_op()

On 21/12/2015 17:50, Boris Ostrovsky wrote:
> On 12/21/2015 12:16 PM, Andrew Cooper wrote:
>> c/s 506db90 "x86/HVM: merge HVM and PVH hypercall tables" introduced
>> a path
>> whereby 'okay' was used uninitialised, with broke compilation on
>> CentOS 7.
>>
>> Splitting the error handling like this is fragile and unnecessary. 
>> Drop the
>> okay variable entirely and just use rc directly, substituting rc =
>> -EINVAL/0
>> for okay = 0/1.
>>
>> In addition, two error messages are updated to print rc, and some stray
>> whitespace is dropped.
>>
>> Signed-off-by: Andrew Cooper 
>> ---
>> CC: Jan Beulich 
>> CC: Boris Ostrovsky 
>
> Reviewed-by: Boris Ostrovsky 
>
> with a couple of style nits:
>
>> @@ -3258,14 +3254,14 @@ long do_mmuext_op(
>> break;
>>   }
>> -
>> +
>>   case MMUEXT_TLB_FLUSH_LOCAL:
>>   if ( likely(d == pg_owner) )
>>   flush_tlb_local();
>>   else
>>   rc = -EPERM;
>>   break;
>> -
>> +
>>   case MMUEXT_INVLPG_LOCAL:
>>   if ( unlikely(d != pg_owner) )
>>   rc = -EPERM;
>
> These look like stray changes.

They are specifically the stray bits of whitespace referred to in the
commit message.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] hvc console not working on 4.4.0 guest

2015-12-22 Thread David Vrabel

On 21/12/15 22:57, M A Young wrote:
> I am testing Fedora rawhide in a xen pv guest and I get the error
> [9.809550] genirq: Flags mismatch irq 8.  (hvc_console) vs. 
>  (rtc0)
> [9.809714] hvc_open: request_irq failed with rc -16.
> during the boot and the hvc console doesn't respond when the boot 
> finishes. I have noticed this I think since the update through various 
> linux 4.4.0 kernels up to the current 4.4.0-0.rc5.git3.1.fc24.x86_64 one.
> I do get an hvc console if I use the Fedora 23 kernel 4.2.6-300.fc23.x86_64
> from before the update.
> Is this a known problem? I have attached the boot log up to the above 
> error. 

Fixed by "x86: Xen PV guests don't have the rtc_cmos platform device"
which hasn't yet been applied.

As a workaround try using a 2 VCPU guest.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue

> On 22.12.2015 at 5:09pm,  wrote:
> >>> On 22.12.15 at 09:43,  wrote:
> > Let's finish our discussion. I accept your idea. But I need to
> > separate it into 3 patch set (It is complicated for me, sometime it makes me
> crash.):
> >Patch set 1:  Device-TLB/iotlb flush error. (send out this week)
> >Patch set 2:  context flush error. (need 2 ~ 3 weeks)
> >Patch set 3:  iec flush error. (need 3 ~ 4 weeks)
> >
> > If it is acceptable, we can discuss in detail one by one..
> 
> Splitting up is of course acceptable.

Jan,
   Just update the combination,

   Patch set 1:  Device-TLB flush error. (send out this week)
   Patch set 2:  context/iotlb flush error. (need 2 ~ 3 weeks)
   Patch set 3:  iec flush error. (need 3 ~ 4 weeks)

-Quan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH V5 0/6] x86/hvm: pkeys, add memory protection-key support

Changes in v5:
*Add static for opt_pku.
*Update commit message for some patches.
*Add condition 5:the access is to a user page to pkey_fault, and simplify #ifdef
for guest_walk_tables patch.
*Don't write XSTATE_PKRU to PV's xcr0.
*count == 0 is combined in hvm_cpuid function.

Changes in v4:
*Delete gva2gfn patch, and when page is present, PFEC_prot_key is always 
checked.
*Use RDPKRU instead of xsave_read because RDPKRU does cost less.
*Squash pkeys patch and pkru patch to guest_walk_tables patch.

Changes in v3:
*Get CPUID:ospke depend on guest cpuid instead of host hardware capable,
and Move cpuid patch to the last of patches.
*Move opt_pku to cpu/common.c.
*Use MASK_EXTR for get_pte_pkeys.
*Add quoting for pkru macro, and use static inline pkru_read functions.
*Rebase get_xsave_addr for updated codes, and add uncompressed format support
for xsave state.
*Use fpu_xsave instead of vcpu_save_fpu, and adjust the code style for
leaf_pte_pkeys_check.
*Add parentheses for PFEC_prot_key of gva2gfn funcitons.

Changes in v2:
*Rebase all patches in staging branch
*Disable X86_CR4_PKE on hypervisor, and delete pkru_read/write functions, and
use xsave state read to get pkru value.
*Delete the patch that adds pkeys support for do_page_fault.
*Add pkeys support for gva2gfn so that setting _PAGE_PK_BIT in the return
value can get propagated to the guest correctly.

The protection-key feature provides an additional mechanism by which IA-32e
paging controls access to usermode addresses.

Hardware support for protection keys for user pages is enumerated with CPUID
feature flag CPUID.7.0.ECX[3]:PKU. Software support is CPUID.7.0.ECX[4]:OSPKE
with the setting of CR4.PKE(bit 22).

When CR4.PKE = 1, every linear address is associated with the 4-bit protection
key located in bits 62:59 of the paging-structure entry that mapped the page
containing the linear address. The PKRU register determines, for each
protection key, whether user-mode addresses with that protection key may be
read or written.

The PKRU register (protection key rights for user pages) is a 32-bit register
with the following format: for each i (0 ≤ i ≤ 15), PKRU[2i] is the
access-disable bit for protection key i (ADi); PKRU[2i+1] is the write-disable
bit for protection key i (WDi).

Software can use the RDPKRU and WRPKRU instructions with ECX = 0 to read and
write PKRU. In addition, the PKRU register is XSAVE-managed state and can thus
be read and written by instructions in the XSAVE feature set.

PFEC.PK (bit 5) is defined as protection key violations.

The specification of Protection Keys can be found at SDM (4.6.2, volume 3)
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf.

Huaitong Han (6):
  x86/hvm: pkeys, add the flag to enable Memory Protection Keys
  x86/hvm: pkeys, add pkeys support when setting CR4
  x86/hvm: pkeys, disable pkeys for guests in non-paging mode
  x86/hvm: pkeys, add pkeys support for guest_walk_tables
  x86/hvm: pkeys, add xstate support for pkeys
  x86/hvm: pkeys, add pkeys support for cpuid handling

 docs/misc/xen-command-line.markdown | 10 ++
 tools/libxc/xc_cpufeature.h |  2 ++
 tools/libxc/xc_cpuid_x86.c  |  6 ++--
 xen/arch/x86/cpu/common.c   | 10 +-
 xen/arch/x86/hvm/hvm.c  | 41 
 xen/arch/x86/hvm/vmx/vmx.c  | 11 ---
 xen/arch/x86/mm/guest_walk.c| 64 +
 xen/arch/x86/mm/hap/guest_walk.c|  3 ++
 xen/arch/x86/xstate.c   |  3 +-
 xen/include/asm-x86/cpufeature.h|  6 +++-
 xen/include/asm-x86/guest_pt.h  | 12 +++
 xen/include/asm-x86/hvm/hvm.h   |  2 ++
 xen/include/asm-x86/page.h  |  5 +++
 xen/include/asm-x86/processor.h | 39 ++
 xen/include/asm-x86/x86_64/page.h   | 12 +++
 xen/include/asm-x86/xstate.h|  4 ++-
 16 files changed, 205 insertions(+), 25 deletions(-)

-- 
2.4.3


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH V5 2/6] x86/hvm: pkeys, add pkeys support when setting CR4

CR4.PKE(bit 22) enables support for the RDPKRU/WRPKRU instructions to access 
PKRU and
the protection keys check (a page fault trigger).

This patch adds X86_CR4_PKE to hvm_cr4_guest_reserved_bits so that CR4 check 
works
before setting.

Signed-off-by: Huaitong Han 
Reviewed-by: Andrew Cooper 
---
 xen/arch/x86/hvm/hvm.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index db0aeba..59916ed 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1924,6 +1924,7 @@ static unsigned long hvm_cr4_guest_reserved_bits(const 
struct vcpu *v,
 leaf1_edx = 
boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_VME)];
 leaf1_ecx = 
boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_PCID)];
 leaf7_0_ebx = 
boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)];
+leaf7_0_ecx = 
boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_PKU)];
 }
 
 return ~(unsigned long)
@@ -1959,7 +1960,9 @@ static unsigned long hvm_cr4_guest_reserved_bits(const 
struct vcpu *v,
  (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMEP) ?
   X86_CR4_SMEP : 0) |
  (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMAP) ?
-  X86_CR4_SMAP : 0));
+  X86_CR4_SMAP : 0) |
+  (leaf7_0_ecx & cpufeat_mask(X86_FEATURE_PKU) ?
+  X86_CR4_PKE : 0));
 }
 
 static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
-- 
2.4.3


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH V5 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys

This patch adds the flag("pku") to enable Memory Protection Keys, and updates
the markdown.

Signed-off-by: Huaitong Han 
Reviewed-by: Andrew Cooper 
---
 docs/misc/xen-command-line.markdown | 10 ++
 xen/arch/x86/cpu/common.c   | 10 +-
 xen/include/asm-x86/cpufeature.h|  6 +-
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index c103894..36ecf80 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1177,6 +1177,16 @@ This option can be specified more than once (up to 8 
times at present).
 ### ple\_window
 > `= `
 
+### pku
+> `= `
+
+> Default: `true`
+
+Flag to enable Memory Protection Keys.
+
+The protection-key feature provides an additional mechanism by which IA-32e
+paging controls access to usermode addresses.
+
 ### psr (Intel)
 > `= List of ( cmt: | rmid_max: | cat: | 
 > cos_max: | cdp: )`
 
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 310ec85..a018855 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -22,6 +22,10 @@ boolean_param("xsave", use_xsave);
 bool_t opt_arat = 1;
 boolean_param("arat", opt_arat);
 
+/* pku: Flag to enable Memory Protection Keys (default on). */
+static bool_t opt_pku = 1;
+boolean_param("pku", opt_pku);
+
 unsigned int opt_cpuid_mask_ecx = ~0u;
 integer_param("cpuid_mask_ecx", opt_cpuid_mask_ecx);
 unsigned int opt_cpuid_mask_edx = ~0u;
@@ -270,7 +274,8 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 
*c)
if ( c->cpuid_level >= 0x0007 )
cpuid_count(0x0007, 0, &tmp,

&c->x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)],
-   &tmp, &tmp);
+   &c->x86_capability[cpufeat_word(X86_FEATURE_PKU)],
+   &tmp);
 }
 
 /*
@@ -323,6 +328,9 @@ void __cpuinit identify_cpu(struct cpuinfo_x86 *c)
if ( cpu_has_xsave )
xstate_init(c);
 
+   if ( !opt_pku )
+   setup_clear_cpu_cap(X86_FEATURE_PKU);
+
/*
 * The vendor-specific functions might have changed features.  Now
 * we do "generic changes."
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index af127cf..ef96514 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -11,7 +11,7 @@
 
 #include 
 
-#define NCAPINTS   8   /* N 32-bit words worth of info */
+#define NCAPINTS   9   /* N 32-bit words worth of info */
 
 /* Intel-defined CPU features, CPUID level 0x0001 (edx), word 0 */
 #define X86_FEATURE_FPU(0*32+ 0) /* Onboard FPU */
@@ -163,6 +163,10 @@
 #define X86_FEATURE_ADX(7*32+19) /* ADCX, ADOX instructions */
 #define X86_FEATURE_SMAP   (7*32+20) /* Supervisor Mode Access Prevention 
*/
 
+/* Intel-defined CPU features, CPUID level 0x0007:0 (ecx), word 8 */
+#define X86_FEATURE_PKU(8*32+ 3) /* Protection Keys for Userspace */
+#define X86_FEATURE_OSPKE  (8*32+ 4) /* OS Protection Keys Enable */
+
 #define cpufeat_word(idx)  ((idx) / 32)
 #define cpufeat_bit(idx)   ((idx) % 32)
 #define cpufeat_mask(idx)  (_AC(1, U) << cpufeat_bit(idx))
-- 
2.4.3


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH V5 5/6] x86/hvm: pkeys, add xstate support for pkeys

This patch adds xstate support for pkeys.

Signed-off-by: Huaitong Han 
Reviewed-by: Andrew Cooper 
---
 xen/arch/x86/xstate.c| 3 ++-
 xen/include/asm-x86/xstate.h | 4 +++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index b65da38..baa4b58 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -487,7 +487,8 @@ void xstate_init(struct cpuinfo_x86 *c)
  * Set CR4_OSXSAVE and run "cpuid" to get xsave_cntxt_size.
  */
 set_in_cr4(X86_CR4_OSXSAVE);
-if ( !set_xcr0(feature_mask) )
+/* PKU is disabled on PV mode. */
+if ( !set_xcr0(feature_mask & ~XSTATE_PKRU) )
 BUG();
 
 if ( bsp )
diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index 12d939b..f7c41ba 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -34,13 +34,15 @@
 #define XSTATE_OPMASK  (1ULL << 5)
 #define XSTATE_ZMM (1ULL << 6)
 #define XSTATE_HI_ZMM  (1ULL << 7)
+#define XSTATE_PKRU(1ULL << 9)
 #define XSTATE_LWP (1ULL << 62) /* AMD lightweight profiling */
 #define XSTATE_FP_SSE  (XSTATE_FP | XSTATE_SSE)
 #define XCNTXT_MASK(XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | \
 XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY)
 
 #define XSTATE_ALL (~(1ULL << 63))
-#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR)
+#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR | \
+XSTATE_PKRU)
 #define XSTATE_LAZY(XSTATE_ALL & ~XSTATE_NONLAZY)
 #define XSTATE_COMPACTION_ENABLED  (1ULL << 63)
 
-- 
2.4.3


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables

Protection keys define a new 4-bit protection key field(PKEY) in bits 62:59 of
leaf entries of the page tables.

PKRU register defines 32 bits, there are 16 domains and 2 attribute bits per
domain in pkru, for each i (0 ≤ i ≤ 15), PKRU[2i] is the access-disable bit for
protection key i (ADi); PKRU[2i+1] is the write-disable bit for protection key
i (WDi). PKEY is index to a defined domain.

A fault is considered as a PKU violation if all of the following conditions are
ture:
1.CR4_PKE=1.
2.EFER_LMA=1.
3.Page is present with no reserved bit violations.
4.The access is not an instruction fetch.
5.The access is to a user page.
6.PKRU.AD=1
or The access is a data write and PKRU.WD=1
and either CR0.WP=1 or it is a user access.

Signed-off-by: Huaitong Han 
---
 xen/arch/x86/mm/guest_walk.c  | 64 +++
 xen/arch/x86/mm/hap/guest_walk.c  |  3 ++
 xen/include/asm-x86/guest_pt.h| 12 
 xen/include/asm-x86/hvm/hvm.h |  2 ++
 xen/include/asm-x86/page.h|  5 +++
 xen/include/asm-x86/processor.h   | 39 
 xen/include/asm-x86/x86_64/page.h | 12 
 7 files changed, 137 insertions(+)

diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 18d1acf..9cdd607 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -90,6 +90,57 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int 
set_dirty)
 return 0;
 }
 
+extern bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
+uint32_t pte_flags, uint32_t pte_pkey);
+#if GUEST_PAGING_LEVELS == CONFIG_PAGING_LEVELS
+bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
+uint32_t pte_flags, uint32_t pte_pkey)
+{
+unsigned int pkru = 0;
+bool_t pkru_ad, pkru_wd;
+
+bool_t pf = !!(pfec & PFEC_page_present);
+bool_t uf = !!(pfec & PFEC_user_mode);
+bool_t wf = !!(pfec & PFEC_write_access);
+bool_t ff = !!(pfec & PFEC_insn_fetch);
+bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
+
+/* When page isn't present,  PKEY isn't checked. */
+if ( !pf || is_pv_vcpu(vcpu) )
+return 0;
+
+/*
+ * PKU:  additional mechanism by which the paging controls
+ * access to user-mode addresses based on the value in the
+ * PKRU register. A fault is considered as a PKU violation if all
+ * of the following conditions are ture:
+ * 1.CR4_PKE=1.
+ * 2.EFER_LMA=1.
+ * 3.page is present with no reserved bit violations.
+ * 4.the access is not an instruction fetch.
+ * 5.the access is to a user page.
+ * 6.PKRU.AD=1
+ *   or The access is a data write and PKRU.WD=1
+ *and either CR0.WP=1 or it is a user access.
+ */
+if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) ||
+rsvdf || ff || !(pte_flags & _PAGE_USER) )
+return 0;
+
+pkru = read_pkru();
+if ( unlikely(pkru) )
+{
+pkru_ad = read_pkru_ad(pkru, pte_pkey);
+pkru_wd = read_pkru_wd(pkru, pte_pkey);
+/* Condition 6 */
+if ( pkru_ad || (pkru_wd && wf && (hvm_wp_enabled(vcpu) || uf)))
+return 1;
+}
+
+return 0;
+}
+#endif
+
 /* Walk the guest pagetables, after the manner of a hardware walker. */
 /* Because the walk is essentially random, it can cause a deadlock 
  * warning in the p2m locking code. Highly unlikely this is an actual
@@ -107,6 +158,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
 guest_l3e_t *l3p = NULL;
 guest_l4e_t *l4p;
 #endif
+unsigned int pkey;
 uint32_t gflags, mflags, iflags, rc = 0;
 bool_t smep = 0, smap = 0;
 bool_t pse1G = 0, pse2M = 0;
@@ -190,6 +242,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
 goto out;
 /* Get the l3e and check its flags*/
 gw->l3e = l3p[guest_l3_table_offset(va)];
+pkey = guest_l3e_get_pkey(gw->l3e);
 gflags = guest_l3e_get_flags(gw->l3e) ^ iflags;
 if ( !(gflags & _PAGE_PRESENT) ) {
 rc |= _PAGE_PRESENT;
@@ -199,6 +252,9 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
 
 pse1G = (gflags & _PAGE_PSE) && guest_supports_1G_superpages(v); 
 
+if ( pse1G && pkey_fault(v, pfec, gflags, pkey) )
+rc |= _PAGE_PKEY_BITS;
+
 if ( pse1G )
 {
 /* Generate a fake l1 table entry so callers don't all 
@@ -270,6 +326,10 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
 
 pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v); 
 
+pkey = guest_l2e_get_pkey(gw->l2e);
+if ( pse2M && pkey_fault(v, pfec, gflags, pkey) )
+rc |= _PAGE_PKEY_BITS;
+
 if ( pse2M )
 {
 /* Special case: this guest VA is in a PSE superpage, so there's
@@ -330,6 +390,10 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
 goto out;
 }
 rc |= ((gflags & mflags) ^ mflags);
+
+pkey = guest_l1e_get_pkey(gw->l1e);
+if ( pkey_fault(v, pfec, gfla

[Xen-devel] [PATCH V5 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling

This patch adds pkeys support for cpuid handing.

Pkeys hardware support is CPUID.7.0.ECX[3]:PKU. software support is
CPUID.7.0.ECX[4]:OSPKE and it reflects the support setting of CR4.PKE.

Signed-off-by: Huaitong Han 
---
 tools/libxc/xc_cpufeature.h |  2 ++
 tools/libxc/xc_cpuid_x86.c  |  6 --
 xen/arch/x86/hvm/hvm.c  | 36 +++-
 3 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
index c3ddc80..f6a9778 100644
--- a/tools/libxc/xc_cpufeature.h
+++ b/tools/libxc/xc_cpufeature.h
@@ -141,5 +141,7 @@
 #define X86_FEATURE_ADX 19 /* ADCX, ADOX instructions */
 #define X86_FEATURE_SMAP20 /* Supervisor Mode Access Protection */
 
+/* Intel-defined CPU features, CPUID level 0x0007:0 (ecx) */
+#define X86_FEATURE_PKU 3
 
 #endif /* __LIBXC_CPUFEATURE_H */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 8882c01..1ce979b 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -427,9 +427,11 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
 bitmaskof(X86_FEATURE_ADX)  |
 bitmaskof(X86_FEATURE_SMAP) |
 bitmaskof(X86_FEATURE_FSGSBASE));
+regs[2] &= bitmaskof(X86_FEATURE_PKU);
 } else
-regs[1] = 0;
-regs[0] = regs[2] = regs[3] = 0;
+regs[1] = regs[2] = 0;
+
+regs[0] = regs[3] = 0;
 break;
 
 case 0x000d:
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 59916ed..076313b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4572,7 +4572,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, 
unsigned int *ebx,
 __clear_bit(X86_FEATURE_APIC & 31, edx);
 
 /* Fix up OSXSAVE. */
-if ( cpu_has_xsave )
+if ( *ecx & cpufeat_mask(X86_FEATURE_XSAVE) )
 *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) ?
  cpufeat_mask(X86_FEATURE_OSXSAVE) : 0;
 
@@ -4585,21 +4585,31 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, 
unsigned int *ebx,
 *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
 break;
 case 0x7:
-if ( (count == 0) && !cpu_has_smep )
-*ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
+if ( count == 0 )
+{
+if ( !cpu_has_smep )
+*ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
 
-if ( (count == 0) && !cpu_has_smap )
-*ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
+if ( !cpu_has_smap )
+*ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
 
-/* Don't expose MPX to hvm when VMX support is not available */
-if ( (count == 0) &&
- (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
-  !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS)) )
-*ebx &= ~cpufeat_mask(X86_FEATURE_MPX);
+/* Don't expose MPX to hvm when VMX support is not available */
+if (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
+  !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS))
+*ebx &= ~cpufeat_mask(X86_FEATURE_MPX);
 
-/* Don't expose INVPCID to non-hap hvm. */
-if ( (count == 0) && !hap_enabled(d) )
-*ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
+if ( !hap_enabled(d) )
+{
+/* Don't expose INVPCID to non-hap hvm. */
+*ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
+/* X86_FEATURE_PKU is not yet implemented for shadow paging. */
+*ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
+}
+
+if ( *ecx & cpufeat_mask(X86_FEATURE_PKU))
+*ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ?
+ cpufeat_mask(X86_FEATURE_OSPKE) : 0;
+}
 break;
 case 0xb:
 /* Fix the x2APIC identifier. */
-- 
2.4.3


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH V5 3/6] x86/hvm: pkeys, disable pkeys for guests in non-paging mode

This patch disables pkeys for guest in non-paging mode, However XEN always uses
paging mode to emulate guest non-paging mode, To emulate this behavior, pkeys
needs to be manually disabled when guest switches to non-paging mode.

Signed-off-by: Huaitong Han 
Reviewed-by: Andrew Cooper 
---
 xen/arch/x86/hvm/vmx/vmx.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 2581e97..7123912 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1380,12 +1380,13 @@ static void vmx_update_guest_cr(struct vcpu *v, 
unsigned int cr)
 if ( !hvm_paging_enabled(v) )
 {
 /*
- * SMEP/SMAP is disabled if CPU is in non-paging mode in hardware.
- * However Xen always uses paging mode to emulate guest non-paging
- * mode. To emulate this behavior, SMEP/SMAP needs to be manually
- * disabled when guest VCPU is in non-paging mode.
+ * SMEP/SMAP/PKU is disabled if CPU is in non-paging mode in
+ * hardware. However Xen always uses paging mode to emulate guest
+ * non-paging mode. To emulate this behavior, SMEP/SMAP/PKU needs
+ * to be manually disabled when guest VCPU is in non-paging mode.
  */
-v->arch.hvm_vcpu.hw_cr[4] &= ~(X86_CR4_SMEP | X86_CR4_SMAP);
+v->arch.hvm_vcpu.hw_cr[4] &=
+~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
 }
 __vmwrite(GUEST_CR4, v->arch.hvm_vcpu.hw_cr[4]);
 break;
-- 
2.4.3


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables

On 22/12/15 10:30, Huaitong Han wrote:
> --- a/xen/include/asm-x86/processor.h
> +++ b/xen/include/asm-x86/processor.h
> @@ -373,6 +373,45 @@ static always_inline void clear_in_cr4 (unsigned long 
> mask)
>  write_cr4(read_cr4() & ~mask);
>  }
>  
> +static inline unsigned int read_pkru(void)
> +{
> +unsigned int pkru;
> +
> +/*
> + * _PAGE_PKEY_BITS have a conflict with _PAGE_GNTTAB used by PV guests,
> + * so that X86_CR4_PKE is disable on hypervisor, RDPKRU instruction can
> + * be used with temporarily setting CR4.PKE.
> + */
> +set_in_cr4(X86_CR4_PKE);
> +asm volatile (".byte 0x0f,0x01,0xee"
> +: "=a" (pkru) : "c" (0) : "dx");
> +clear_in_cr4(X86_CR4_PKE);

You can't use set/clear_in_cr4 here, as it modifies the global
mmu_cr4_features variable.

I would recommend

unsigned long cr4 = read_cr4();
write_cr4(cr4 | X86_CR4_PKE);
...
write_cr4(cr4);

instead.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/4] x86: misc printk() adjustments

On 21/12/15 14:48, Jan Beulich wrote:
 On 21.12.15 at 15:41,  wrote:
>> On 21/12/15 14:34, Jan Beulich wrote:
>>> --- a/xen/arch/x86/io_apic.c
>>> +++ b/xen/arch/x86/io_apic.c
>>> @@ -2310,13 +2310,14 @@ int ioapic_guest_read(unsigned long phys
>>>  return 0;
>>>  }
>>>  
>>> -#define WARN_BOGUS_WRITE(f, a...)   \
>>> -dprintk(XENLOG_INFO, "\n%s: "\
>>> -"apic=%d, pin=%d, irq=%d\n" \
>>> -"%s: new_entry=%08x\n"  \
>>> -"%s: " f, __FUNCTION__, apic, pin, irq,\
>>> -__FUNCTION__, *(u32 *)&rte,   \
>>> -__FUNCTION__ , ##a )
>>> +#define WARN_BOGUS_WRITE(f, a...)   \
>>> +dprintk(XENLOG_INFO, "\n"   \
>>> +XENLOG_INFO "%s: apic=%d, pin=%d, irq=%d\n" \
>>> +XENLOG_INFO "%s: new_entry=%08x\n"  \
>>> +XENLOG_INFO "%s: " f "\n",  \
>>> +__func__, apic, pin, irq,   \
>>> +__func__, *(u32 *)&rte, \
>>> +__func__, ##a )
>>>  
>>>  int ioapic_guest_write(unsigned long physbase, unsigned int reg, u32 val)
>>>  {
>>> @@ -2388,7 +2389,7 @@ int ioapic_guest_write(unsigned long phy
>>>  rte.vector = desc->arch.vector;
>>>  if ( *(u32*)&rte != ret )
>>>  WARN_BOGUS_WRITE("old_entry=%08x pirq=%d\n%s: "
>>> - "Attempt to modify IO-APIC pin for in-use 
>>> IRQ!\n",
>>> + "Attempt to modify IO-APIC pin for in-use 
>>> IRQ!",
>>>   ret, pirq, __FUNCTION__);
>> Given that this is the sole user of WARN_BOGUS_WRITE(), I would
>> recommend folding it all together in a simple dprintk(), and remove some
>> of the redundant information, and fixing the resulting message to take
>> up fewer lines.
> If you feel strongly about this I could certainly do so, but I did
> consider that option and decided against it to retain the option
> of using the macro (again?) in a second place.

I don't forsee it being useful to use again.

Either way, Reviewed-by: Andrew Cooper 

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [xen-4.6-testing test] 66794: regressions - FAIL

flight 66794 xen-4.6-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/66794/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail REGR. vs. 65639
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail 
REGR. vs. 65639
 test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 65639

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-vhd  15 guest-start.2fail blocked in 65639
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 65639

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-checkfail   never pass

version targeted for testing:
 xen  cdd96b9be5a57ae7795750c3ae83ea6783030688
baseline version:
 xen  850bcd0f42e4912b6605a2caa42d5c466ed58bd1

Last test of basis65639  2015-12-09 21:22:40 Z   12 days
Failing since 66394  2015-12-15 18:17:32 Z6 days6 attempts
Testing same since66537  2015-12-18 16:12:08 Z3 days3 attempts


People who touched revisions under test:
  Boris Ostrovsky 
  Brendan Gregg 
  Daniel Kiper 
  Dario Faggioli 
  David Vrabel 
  Ed Swierk 
  Haozhong Zhang 
  Ian Campbell 
  Ian Jackson 
  Jan Beulich 
  Kevin Tian 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-prev pass
 build-i386-prev  pass
 build-amd64-pvopspass

Re: [Xen-devel] [PATCH] tools: make flask utils build unconditional

On 22/12/15 04:46, Doug Goldstein wrote:
> The flask utilities only have dependencies on libxc so there's no
> downside to always building it. Distros and projects based on Xen can
> put these utilities into a different package and not install them for
> everyone. Prior to this change FLASK_ENABLE needs to be a top level
> variable however after this change FLASK_ENABLE only affects xen/.
>
> Signed-off-by: Doug Goldstein 

CC'ing Daniel as this is a flask related change.

FWIW, Reviewed-by: Andrew Cooper 

> ---
>  tools/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/Makefile b/tools/Makefile
> index 820ca40..2f773fd 100644
> --- a/tools/Makefile
> +++ b/tools/Makefile
> @@ -4,7 +4,7 @@ include $(XEN_ROOT)/tools/Rules.mk
>  SUBDIRS-y :=
>  SUBDIRS-y += include
>  SUBDIRS-y += libxc
> -SUBDIRS-$(FLASK_ENABLE) += flask
> +SUBDIRS-y += flask
>  SUBDIRS-y += xenstore
>  SUBDIRS-y += misc
>  SUBDIRS-y += examples


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [xen-unstable-smoke test] 66936: tolerable all pass - PUSHED

flight 66936 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/66936/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass

version targeted for testing:
 xen  8e4d18e113c9d914a591c7bd8159c990122f50dc
baseline version:
 xen  bf925a9f1254391749f569c1b8fc606036340488

Last test of basis66868  2015-12-21 18:08:31 Z0 days
Testing same since66936  2015-12-22 10:01:41 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Feng Wu 
  Jan Beulich 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=8e4d18e113c9d914a591c7bd8159c990122f50dc
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
8e4d18e113c9d914a591c7bd8159c990122f50dc
+ branch=xen-unstable-smoke
+ revision=8e4d18e113c9d914a591c7bd8159c990122f50dc
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-unstable
+ '[' x8e4d18e113c9d914a591c7bd8159c990122f50dc = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig GitCacheProxy
 perl -e '
use Osstest;
readglobalconfig();
print $c{"GitCacheProxy"} or die $!;
'
+++ local cache=git://cache:9419/
+++ '[' xgit://cache:9419/ '!=' x ']'
+++ echo 
'git://c

Re: [Xen-devel] [PATCHv5 1/3] rwlock: Add per-cpu reader-writer lock infrastructure

On 18/12/15 16:08, Malcolm Crossley wrote:
> Per-cpu read-write locks allow for the fast path read case to have
> low overhead by only setting/clearing a per-cpu variable for using
> the read lock. The per-cpu read fast path also avoids locked
> compare swap operations which can be particularly slow on coherent
> multi-socket systems, particularly if there is heavy usage of the
> read lock itself.
> 
> The per-cpu reader-writer lock uses a local variable to control
> the read lock fast path. This allows a writer to disable the fast
> path and ensures the readers switch to using the underlying
> read-write lock implementation instead of the per-cpu variable.
> 
> Once the writer has taken the write lock and disabled the fast path,
> it must poll the per-cpu variable for all CPU's which have entered
> the critical section for the specific read-write lock the writer is
> attempting to take. This design allows for a single per-cpu variable
> to be used for read/write locks belonging to seperate data structures.
> If a two or more different per-cpu read lock(s) are taken
> simultaneously then the per-cpu data structure is not used and the
> implementation takes the read lock of the underlying read-write lock,
> this behaviour is equivalent to the slow path in terms of performance.
> The per-cpu rwlock is not recursion safe for taking the per-cpu
> read lock because there is no recursion count variable, this is
> functionally equivalent to standard spin locks.
> 
> Slow path readers which are unblocked, set the per-cpu variable and
> drop the read lock. This simplifies the implementation and allows
> for fairness in the underlying read-write lock to be taken
> advantage of.
> 
> There is more overhead on the per-cpu write lock path due to checking
> each CPUs fast path per-cpu variable but this overhead is likely be
> hidden by the required delay of waiting for readers to exit the
> critical section. The loop is optimised to only iterate over
> the per-cpu data of active readers of the rwlock. The cpumask_t for
> tracking the active readers is stored in a single per-cpu data
> location and thus the write lock is not pre-emption safe. Therefore
> the per-cpu write lock can only be used with interrupts disabled.
> 
> Signed-off-by: Malcolm Crossley 
> --
> Changes since v4:
> - Use static inlines for the percpu_owner check
> 
> Changes since v3:
> - Fix the ASSERTS for the percpu_owner check
> 
> Changes since v2:
> - Remove stray hard tabs
> - Added owner field to percpu_rwlock_t for debug builds. This allows
>   for ASSERTs to be added which verify the correct global percpu
>   structure is used for the local initialised percpu_rwlock lock
> 
> Changes since v1:
> - Removed restriction on taking two or more different percpu rw
> locks simultaneously
> - Moved fast-path/slow-path barrier to be per lock instead of global
> - Created seperate percpu_rwlock_t type and added macros to
> initialise new type
> - Added helper macros for using the percpu rwlock itself
> - Moved active_readers cpumask off the stack and into a percpu
> variable
> ---
>  xen/common/spinlock.c|  45 +
>  xen/include/asm-arm/percpu.h |   5 ++
>  xen/include/asm-x86/percpu.h |   6 +++
>  xen/include/xen/percpu.h |   4 ++
>  xen/include/xen/spinlock.h   | 115 
> +++
>  5 files changed, 175 insertions(+)
> 
> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
> index 7f89694..901c2b2 100644
> --- a/xen/common/spinlock.c
> +++ b/xen/common/spinlock.c
> @@ -10,6 +10,8 @@
>  #include 
>  #include 
>  
> +static DEFINE_PER_CPU(cpumask_t, percpu_rwlock_readers);
> +
>  #ifndef NDEBUG
>  
>  static atomic_t spin_debug __read_mostly = ATOMIC_INIT(0);
> @@ -492,6 +494,49 @@ int _rw_is_write_locked(rwlock_t *lock)
>  return (lock->lock == RW_WRITE_FLAG); /* writer in critical section? */
>  }
>  
> +void _percpu_write_lock(percpu_rwlock_t **per_cpudata,
> +percpu_rwlock_t *percpu_rwlock)
> +{
> +unsigned int cpu;
> +cpumask_t *rwlock_readers = &this_cpu(percpu_rwlock_readers);
> +
> +/* Validate the correct per_cpudata variable has been provided. */
> +_percpu_rwlock_owner_check(per_cpudata, percpu_rwlock);
> +
> +/* 
> + * First take the write lock to protect against other writers or slow 
> + * path readers.
> + */
> +write_lock(&percpu_rwlock->rwlock);
> +
> +/* Now set the global variable so that readers start using read_lock. */
> +percpu_rwlock->writer_activating = 1;
> +smp_mb();
> +
> +/* Using a per cpu cpumask is only safe if there is no nesting. */
> +ASSERT(!in_irq());
> +cpumask_copy(rwlock_readers, &cpu_online_map);
> +
> +/* Check if there are any percpu readers in progress on this rwlock. */
> +for ( ; ; )
> +{
> +for_each_cpu(cpu, rwlock_readers)
> +{
> +/* 
> + * Remove any percpu readers not contending on this rwlock
> +

[Xen-devel] [seabios baseline-only test] 38549: tolerable FAIL

2015-12-22 Thread Platform Team regression test user

This run is configured for baseline tests only.

flight 38549 seabios real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/38549/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemuu-winxpsp3  9 windows-install   fail baseline untested

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass

version targeted for testing:
 seabios  76327b9f32a009245c215f4a3c5d58a01b5310ae
baseline version:
 seabios  948f3c9b5f74f265bd171ee367a632a8f7c81f68

Last test of basis38383  2015-11-29 07:20:01 Z   23 days
Testing same since38549  2015-12-22 02:02:51 Z0 days1 attempts


People who touched revisions under test:
  Marcel Apfelbaum 
  Michael S. Tsirkin 
  Stefan Berger 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-qemuu-nested-amdfail
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-amd64-amd64-qemuu-nested-intel  pass
 test-amd64-i386-qemuu-rhel6hvm-intel pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 pass
 test-amd64-amd64-xl-qemuu-winxpsp3   fail
 test-amd64-i386-xl-qemuu-winxpsp3pass



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


commit 76327b9f32a009245c215f4a3c5d58a01b5310ae
Author: Marcel Apfelbaum 
Date:   Mon Dec 7 14:05:14 2015 +0200

fw/pci: do not automatically allocate IO region for PCIe bridges

PCIe downstream ports (Root Ports and switches Downstream Ports) appear
to firmware as PCI-PCI bridges and a 4K IO space is allocated for them
even if there is no device behind them requesting IO space,
all that for hotplug purpose.

However, PCIe devices can work without IO, so there is no need
to allocate IO space for hotplug.

Acked-by: Michael S. Tsirkin 
Signed-off-by: Marcel Apfelbaum 

commit 320df85010401c6d3ee189a54c63b937d1dcade9
Author: Stefan Berger 
Date:   Mon Nov 30 11:14:19 2015 -0500

tpm: Add a menu for TPM configuration

This patch adds an new menu entry to the main menu. This menu item enables
the user to enter a TPM control menu which allows control of those aspects
of the TPM's state that can only be controlled while in the firmware
and while physical presence can be asserted.

If the machine has a TPM, the boot menu will look as follows, with
the new menu item accessible by pressing the 't' key.

Select boot device:

1. ata0-1: QEMU HARDDISK ATA-7 Hard-Disk (6144 MiBytes)
2. Legacy option rom
3. iPXE (PCI 00:03.0)

t. TPM Menu

Upon pressing t the TPM submenu will be shown:

The Trusted Platform Module (TPM) is a hardware device in this machine.
I

Re: [Xen-devel] [PATCHv5 3/3] p2m: convert p2m rwlock to percpu rwlock

On 18/12/15 16:08, Malcolm Crossley wrote:
> The per domain p2m read lock suffers from significant contention when
> performance multi-queue block or network IO due to the parallel
> grant map/unmaps/copies occuring on the DomU's p2m.
> 
> On multi-socket systems, the contention results in the locked compare swap
> operation failing frequently which results in a tight loop of retries of the
> compare swap operation. As the coherency fabric can only support a specific
> rate of compare swap operations for a particular data location then taking
> the read lock itself becomes a bottleneck for p2m operations.
> 
> Percpu rwlock p2m performance with the same configuration is approximately
> 64 gbit/s vs the 48 gbit/s with grant table percpu rwlocks only.
> 
> Oprofile was used to determine the initial overhead of the read-write locks
> and to confirm the overhead was dramatically reduced by the percpu rwlocks.
> 
> Note: altp2m users will not achieve a gain if they take an altp2m read lock
> simultaneously with the main p2m lock.
> 
> Signed-off-by: Malcolm Crossley 

Looks good, thanks:

Reviewed-by: George Dunlap 

If you end up switching to always using the per-cpu pointer stored in
the percpu_rwlock struct, you can retain the Reviewed-by for those changes.

 -George


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCHv6 2/2] x86/ept: defer the invalidation until the p2m lock is released

On 18/12/15 13:50, David Vrabel wrote:
> Holding the p2m lock while calling ept_sync_domain() is very expensive
> since it does a on_selected_cpus() call.  IPIs on many socket machines
> can be very slows and on_selected_cpus() is serialized.
> 
> It is safe to defer the invalidate until the p2m lock is released
> except for two cases:
> 
> 1. When freeing a page table page (since partial translations may be
>cached).
> 2. When reclaiming a zero page as part of PoD.
> 
> For these cases, add p2m_tlb_flush_sync() calls which will immediately
> perform the invalidate before the page is freed or reclaimed.

There are at least two other places in the PoD code where the "remove ->
check -> add to cache -> unlock" pattern exist; and it looks to me like
there are other places where races might occur (e.g.,
p2m_paging_evict(), which does remove -> scrub -> put -> unlock;
p2m_altp2m_propagate_change(), which does remove -> put -> unlock).

Part of me wonders whether, rather than making callers that need it
remember to do a flush, it might be better to explicitly pass in
P2M_FLUSH or P2M_CAN_DEFER when calling p2m_set_entry, just to make
people think about the fact that the p2m change may not actually take
effect until later.  Any thoughts on that?

Comments on the current approach inline.

> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index c094320..43c7f1b 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -263,6 +263,7 @@ static void ept_free_entry(struct p2m_domain *p2m, 
> ept_entry_t *ept_entry, int l
>  unmap_domain_page(epte);
>  }
>  
> +p2m_tlb_flush_sync(p2m);
>  p2m_free_ptp(p2m, mfn_to_page(ept_entry->mfn));

It's probably worth a comment here pointing out that even if this
function is called several times (e.g., if you replace a load of 4k
entries with a 1G entry), the actual flush will only happen the first time.

> +static void ept_flush_and_unlock(struct p2m_domain *p2m, bool_t unlock)
> +{
> +p2m->need_flush = 0;
> +if ( unlock )
> +mm_write_unlock(&p2m->lock);
> +ept_sync_domain_mask(p2m, p2m->domain->domain_dirty_cpumask);
>  }

Having a function called "flush_and_unlock", with a boolean as to
whether to unlock or not, just seems a bit wonky.

Wouldn't it make more sense to have the hook just named "flush_sync()",
and move the unlocking out in the generic p2m code (where you already
have the check for need_flush)?

> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
> index fa46dd9..9c394c2 100644
> --- a/xen/include/asm-x86/p2m.h
> +++ b/xen/include/asm-x86/p2m.h
> @@ -261,6 +261,10 @@ struct p2m_domain {
>unsigned long gfn, l1_pgentry_t *p,
>l1_pgentry_t new, unsigned int 
> level);
>  long   (*audit_p2m)(struct p2m_domain *p2m);
> +void   (*flush_and_unlock)(struct p2m_domain *p2m, bool_t 
> unlock);
> +
> +unsigned int defer_flush;
> +bool_t need_flush;

It's probably worth a comment that at the moment calling
flush_and_unlock() is gated on need_flush; so it's OK not to implement
flush_and_unlock() as long as you never set need_flush.

Thanks,
 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Xen Security Advisory 155 (CVE-2015-8550) - paravirtualized drivers incautious about shared memory

MiniOS for QEMU stubdom has frontends, such as mini-os/blkfront.c and
mini-os/netfront.c, not backends.

Cheers,

Stefano


On Mon, 21 Dec 2015, Eric Shelton wrote:
> Seeing as "All OSes providing PV backends are susceptible," doesn't this 
> include MiniOS for QEMU stubdom as well? 
> Are there patches available for mini-os/blkfront.c, mini-os/netfront.c, and 
> mini-os/pcifront.c?  I didn't see
> anything for this.
> Best,
> Eric
>
> On Thu, Dec 17, 2015 at 1:36 PM, Xen.org security team  
> wrote:
>
>   - Topal: Output generated on Tue Dec 22 12:23:44 GMT 2015 - 
> Topal: GPG output starts - gpg:
>   no valid OpenPGP data found. gpg: processing message failed: eof - 
> Topal: GPG output ends -
>   - Topal: Original message starts - -BEGIN PGP SIGNED 
> MESSAGE-
>   Hash: SHA1
>
>               Xen Security Advisory CVE-2015-8550 / XSA-155
>                                 version 6
>
>       paravirtualized drivers incautious about shared memory contents
>
>   UPDATES IN VERSION 6
>   
>
>   Correct CREDITS section.
>
>   ISSUE DESCRIPTION
>   =
>
>   The compiler can emit optimizations in the PV backend drivers which
>   can lead to double fetch vulnerabilities. Specifically the shared
>   memory between the frontend and backend can be fetched twice (during
>   which time the frontend can alter the contents) possibly leading to
>   arbitrary code execution in backend.
>
>   IMPACT
>   ==
>
>   Malicious guest administrators can cause denial of service.  If driver
>   domains are not in use, the impact can be a host crash, or privilege 
> escalation.
>
>   VULNERABLE SYSTEMS
>   ==
>
>   Systems running PV or HVM guests are vulnerable.
>
>   ARM and x86 systems are vulnerable.
>
>   All OSes providing PV backends are susceptible, this includes
>   Linux and NetBSD. By default the Linux distributions compile kernels
>   with optimizations.
>
>   MITIGATION
>   ==
>
>   There is no mitigation.
>
>   CREDITS
>   ===
>
>   This issue was discovered by Felix Wilhelm (ERNW Research, KIT /
>   Operating Systems Group).
>
>   RESOLUTION
>   ==
>
>   Applying the appropriate attached patches should fix the problem for
>   PV backends.  Note only that PV backends are fixed; PV frontend
>   patches will be developed and released (publicly) after the embargo
>   date.
>
>   Please note that there is a bug in some versions of gcc,
>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 which can cause the
>   construct used in RING_COPY_REQUEST() to be ineffective in some
>   circumstances. We have determined that this is only the case when the
>   structure being copied consists purely of bitfields. The Xen PV
>   protocols updated here do not use bitfields in this way and therefore
>   these patches are not subject to that bug. However authors of third
>   party PV protocols should take this into consideration.
>
>   Linux v4.4:
>   xsa155-linux-xsa155-0001-xen-Add-RING_COPY_REQUEST.patch
>   
> xsa155-linux-xsa155-0002-xen-netback-don-t-use-last-request-to-determine-mini.patch
>   
> xsa155-linux-xsa155-0003-xen-netback-use-RING_COPY_REQUEST-throughout.patch
>   
> xsa155-linux-xsa155-0004-xen-blkback-only-read-request-operation-from-shared-.patch
>   
> xsa155-linux-xsa155-0005-xen-blkback-read-from-indirect-descriptors-only-once.patch
>   xsa155-linux-xsa155-0006-xen-scsiback-safely-copy-requests.patch
>   
> xsa155-linux-xsa155-0007-xen-pciback-Save-xen_pci_op-commands-before-processi.patch
>   Linux v4.[0,1,2,3]
>   All the above patches except #5 will apply, please use:
>   
> xsa155-linux43-0005-xen-blkback-read-from-indirect-descriptors-only-once.patch
>   Linux v3.19:
>   All the above patches except #5 and #6 will apply, please use:
>   
> xsa155-linux43-0005-xen-blkback-read-from-indirect-descriptors-only-once.patch
>   xsa155-linux319-0006-xen-scsiback-safely-copy-requests.patch
>
>   qemu-xen:
>   xsa155-qemu-qdisk-double-access.patch
>   xsa155-qemu-xenfb.patch
>
>   qemu-traditional:
>   xsa155-qemut-qdisk-double-access.patch
>   xsa155-qemut-xenfb.patch
>
>   NetBSD 7.0:
>   xsa155-netbsd-xsa155-0001-netbsd-xen-Add-RING_COPY_REQUEST.patch
>   
> xsa155-netbsd-xsa155-0002-netbsd-netback-Use-RING_COPY_REQUEST-instead-of-RING.patch
>   
> xsa155-netbsd-xsa155-0003-netbsd-ring-Add-barrier-to-provide-an-compiler-barri.patch
>   
> xsa155-netbsd-xsa155-0004-netbsd-block-only-read-request-operation-from-shared.patch
>   
> xsa155-netbsd-xsa155-0005-netbsd-pciback-Operate-on-local-version-of-xen_pci_o.patch
>
>   xen:
>   xsa155-xen-0001-xen-Add-RING_COPY_REQUEST.patch
>   xsa155-xen-000

Re: [Xen-devel] [PATCH V5 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys

>>> On 22.12.15 at 11:30,  wrote:
> This patch adds the flag("pku") to enable Memory Protection Keys, and updates
> the markdown.
> 
> Signed-off-by: Huaitong Han 
> Reviewed-by: Andrew Cooper 

Please avoid resending patches which got applied already.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] build: save generated config in /boot

On 12/21/15 6:53 AM, Jan Beulich wrote:
 On 21.12.15 at 13:40,  wrote:
>> On 21/12/15 12:11, Jan Beulich wrote:
>> On 18.12.15 at 22:35,  wrote:
 Since we now support changing Xen options with Kconfig, we should save
 the configuration that was used to build up Xen. This will save it in
 /boot alongside the installed xen.gz and call it
 xen-$(FULLVERSION).config

 Suggested-by: Ian Campbell 
 Signed-off-by: Doug Goldstein 
 ---
  xen/Makefile | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/xen/Makefile b/xen/Makefile
 index 9023863..460b977 100644
 --- a/xen/Makefile
 +++ b/xen/Makefile
 @@ -58,6 +58,7 @@ _install: $(TARGET)$(CONFIG_XEN_INSTALL_SUFFIX)
ln -f -s $(T)-$(XEN_FULLVERSION)$(Z) $(D)$(BOOT_DIR)/$(T)$(Z)
[ -d "$(D)$(DEBUG_DIR)" ] || $(INSTALL_DIR) $(D)$(DEBUG_DIR)
$(INSTALL_DATA) $(TARGET)-syms 
 $(D)$(DEBUG_DIR)/$(T)-syms-$(XEN_FULLVERSION)
 +  $(INSTALL_DATA) $(KCONFIG_CONFIG) 
>> $(D)$(BOOT_DIR)/$(T)-$(XEN_FULLVERSION).config
>>> Was it really suggested to put this into /boot? It has no business
>>> being there...
>>
>> A typical Linux has:
>>
>> andrewcoop@andrewcoop:/local/xen.git/xen$ ls -lA /boot/
>> total 21044
>> -rw-r--r--  1 root root   157726 Dec 15 15:40 config-3.16.0-4-amd64
>> drwxr-xr-x  5 root root 4096 Dec 18 07:53 grub
>> -rw-r--r--  1 root root 15535038 Dec 18 07:52 initrd.img-3.16.0-4-amd64
>> drwx--  2 root root16384 Oct 19 11:11 lost+found
>> -rw-r--r--  1 root root  2676277 Dec 15 15:40 System.map-3.16.0-4-amd64
>> -rw-r--r--  1 root root  3118928 Dec 15 15:37 vmlinuz-3.16.0-4-amd64
>>
>> which at the very least is consistent between Debian and RHEL derivatives.
>>
>> IMO, doing the same for Xen is sensible.
> 
> I'm afraid I have to disagree - just because Linux does things a
> certain way doesn't mean that the only (sensible) way. Imo /boot
> should hold exclusively stuff needed for booting. Remember how
> we moved xen-syms out of there not so long ago? You could have
> objected to that change too, considering that Linux puts
> System.map and sometimes also the uncompressed vmlinux there.
> Yet I think it was a correct move, and the change here should
> follow that model instead of Linux'es.
> 
> Jan
> 

I don't think you're correct here Jan. Looking at most distros I have at
hand (I'll admit I don't have SuSE available to me). The uncompressed
kernel is not in /boot. In fact all of them use /usr/lib/debug so Xen
did follow Linux here.

-- 
Doug Goldstein



signature.asc
Description: OpenPGP digital signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [qemu-mainline test] 66848: regressions - FAIL

flight 66848 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/66848/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i3865 xen-build fail REGR. vs. 66433
 build-i386-xsm5 xen-build fail REGR. vs. 66433
 build-amd64-xsm   5 xen-build fail REGR. vs. 66433
 build-amd64   5 xen-build fail REGR. vs. 66433
 build-armhf-xsm   5 xen-build fail REGR. vs. 66433
 build-armhf   5 xen-build fail REGR. vs. 66433

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-intel  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl1 build-check(1)   blocked  n/a
 test-amd64-amd64-amd64-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-pygrub   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-amd64-qemuu-nested-amd  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-armhf-armhf-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-raw1 build-check(1)   blocked  n/a
 test-amd64-amd64-qemuu-nested-intel  1 build-check(1)  blocked n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qcow2 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-amd64-i386-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1)blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)

Re: [Xen-devel] [PATCH] build: save generated config in /boot

On 12/21/15 9:35 AM, Jan Beulich wrote:
 On 21.12.15 at 16:20,  wrote:
>> On 12/21/15 8:51 AM, Jan Beulich wrote:
>> On 21.12.15 at 15:35,  wrote:
 On 12/21/15 6:11 AM, Jan Beulich wrote:
 On 18.12.15 at 22:35,  wrote:
>> Since we now support changing Xen options with Kconfig, we should save
>> the configuration that was used to build up Xen. This will save it in
>> /boot alongside the installed xen.gz and call it
>> xen-$(FULLVERSION).config
>>
>> Suggested-by: Ian Campbell 
>> Signed-off-by: Doug Goldstein 
>> ---
>>  xen/Makefile | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/xen/Makefile b/xen/Makefile
>> index 9023863..460b977 100644
>> --- a/xen/Makefile
>> +++ b/xen/Makefile
>> @@ -58,6 +58,7 @@ _install: $(TARGET)$(CONFIG_XEN_INSTALL_SUFFIX)
>>  ln -f -s $(T)-$(XEN_FULLVERSION)$(Z) $(D)$(BOOT_DIR)/$(T)$(Z)
>>  [ -d "$(D)$(DEBUG_DIR)" ] || $(INSTALL_DIR) $(D)$(DEBUG_DIR)
>>  $(INSTALL_DATA) $(TARGET)-syms 
>> $(D)$(DEBUG_DIR)/$(T)-syms-$(XEN_FULLVERSION)
>> +$(INSTALL_DATA) $(KCONFIG_CONFIG) 
 $(D)$(BOOT_DIR)/$(T)-$(XEN_FULLVERSION).config
>
> Was it really suggested to put this into /boot? It has no business
> being there...

 Yes. By multiple people. Ian Campbell was the first person to suggest it
 in that location.
>>>
>>> Okay, so I've looked it up, and no, he didn't. He just gave this as one
>>> possibility:
>>>
>>> "It occurred to me this morning that we probably ought to stash the .config
>>>  somewhere on install in such a way that it can be associated with the Xen
>>>  binary (i.e. with the same full suffix as the binary itself, not the
>>>  abridged symlink names), maybe as $(BOOT_DIR)/$(T)-
>>>  $(XEN_FULLVERSION).config?"
>>>
>>> But yes, I'm sorry for not noticing this as an undesirable place right
>>> away.
>>
>> Ok well I'm at a loss here because the quote clearly shows him
>> suggesting that location. Do you have a suggested location because so
>> far I've just got a no from you on the only suggested location.
> 
> Match the xen-syms location?
> 
> Jan
> 

I guess I fail to grasp the rationale behind not putting it in /boot.

-- 
Doug Goldstein



signature.asc
Description: OpenPGP digital signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] build: save generated config in /boot

>>> On 22.12.15 at 13:45,  wrote:
> On 12/21/15 6:53 AM, Jan Beulich wrote:
> On 21.12.15 at 13:40,  wrote:
>>> On 21/12/15 12:11, Jan Beulich wrote:
>>> On 18.12.15 at 22:35,  wrote:
> Since we now support changing Xen options with Kconfig, we should save
> the configuration that was used to build up Xen. This will save it in
> /boot alongside the installed xen.gz and call it
> xen-$(FULLVERSION).config
>
> Suggested-by: Ian Campbell 
> Signed-off-by: Doug Goldstein 
> ---
>  xen/Makefile | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/xen/Makefile b/xen/Makefile
> index 9023863..460b977 100644
> --- a/xen/Makefile
> +++ b/xen/Makefile
> @@ -58,6 +58,7 @@ _install: $(TARGET)$(CONFIG_XEN_INSTALL_SUFFIX)
>   ln -f -s $(T)-$(XEN_FULLVERSION)$(Z) $(D)$(BOOT_DIR)/$(T)$(Z)
>   [ -d "$(D)$(DEBUG_DIR)" ] || $(INSTALL_DIR) $(D)$(DEBUG_DIR)
>   $(INSTALL_DATA) $(TARGET)-syms 
> $(D)$(DEBUG_DIR)/$(T)-syms-$(XEN_FULLVERSION)
> + $(INSTALL_DATA) $(KCONFIG_CONFIG) 
>>> $(D)$(BOOT_DIR)/$(T)-$(XEN_FULLVERSION).config
 Was it really suggested to put this into /boot? It has no business
 being there...
>>>
>>> A typical Linux has:
>>>
>>> andrewcoop@andrewcoop:/local/xen.git/xen$ ls -lA /boot/
>>> total 21044
>>> -rw-r--r--  1 root root   157726 Dec 15 15:40 config-3.16.0-4-amd64
>>> drwxr-xr-x  5 root root 4096 Dec 18 07:53 grub
>>> -rw-r--r--  1 root root 15535038 Dec 18 07:52 initrd.img-3.16.0-4-amd64
>>> drwx--  2 root root16384 Oct 19 11:11 lost+found
>>> -rw-r--r--  1 root root  2676277 Dec 15 15:40 System.map-3.16.0-4-amd64
>>> -rw-r--r--  1 root root  3118928 Dec 15 15:37 vmlinuz-3.16.0-4-amd64
>>>
>>> which at the very least is consistent between Debian and RHEL derivatives.
>>>
>>> IMO, doing the same for Xen is sensible.
>> 
>> I'm afraid I have to disagree - just because Linux does things a
>> certain way doesn't mean that the only (sensible) way. Imo /boot
>> should hold exclusively stuff needed for booting. Remember how
>> we moved xen-syms out of there not so long ago? You could have
>> objected to that change too, considering that Linux puts
>> System.map and sometimes also the uncompressed vmlinux there.
>> Yet I think it was a correct move, and the change here should
>> follow that model instead of Linux'es.
> 
> I don't think you're correct here Jan. Looking at most distros I have at
> hand (I'll admit I don't have SuSE available to me). The uncompressed
> kernel is not in /boot.

Note how I said "sometimes"? And I suppose you will agree that
System.map is unnecessary for booting, despite it getting put in
/boot quite commonly as it looks.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] build: save generated config in /boot

>>> On 22.12.15 at 13:45,  wrote:
> On 12/21/15 9:35 AM, Jan Beulich wrote:
> On 21.12.15 at 16:20,  wrote:
>>> On 12/21/15 8:51 AM, Jan Beulich wrote:
>>> On 21.12.15 at 15:35,  wrote:
> On 12/21/15 6:11 AM, Jan Beulich wrote:
> On 18.12.15 at 22:35,  wrote:
>>> Since we now support changing Xen options with Kconfig, we should save
>>> the configuration that was used to build up Xen. This will save it in
>>> /boot alongside the installed xen.gz and call it
>>> xen-$(FULLVERSION).config
>>>
>>> Suggested-by: Ian Campbell 
>>> Signed-off-by: Doug Goldstein 
>>> ---
>>>  xen/Makefile | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/xen/Makefile b/xen/Makefile
>>> index 9023863..460b977 100644
>>> --- a/xen/Makefile
>>> +++ b/xen/Makefile
>>> @@ -58,6 +58,7 @@ _install: $(TARGET)$(CONFIG_XEN_INSTALL_SUFFIX)
>>> ln -f -s $(T)-$(XEN_FULLVERSION)$(Z) $(D)$(BOOT_DIR)/$(T)$(Z)
>>> [ -d "$(D)$(DEBUG_DIR)" ] || $(INSTALL_DIR) $(D)$(DEBUG_DIR)
>>> $(INSTALL_DATA) $(TARGET)-syms 
>>> $(D)$(DEBUG_DIR)/$(T)-syms-$(XEN_FULLVERSION)
>>> +   $(INSTALL_DATA) $(KCONFIG_CONFIG) 
> $(D)$(BOOT_DIR)/$(T)-$(XEN_FULLVERSION).config
>>
>> Was it really suggested to put this into /boot? It has no business
>> being there...
>
> Yes. By multiple people. Ian Campbell was the first person to suggest it
> in that location.

 Okay, so I've looked it up, and no, he didn't. He just gave this as one
 possibility:

 "It occurred to me this morning that we probably ought to stash the .config
  somewhere on install in such a way that it can be associated with the Xen
  binary (i.e. with the same full suffix as the binary itself, not the
  abridged symlink names), maybe as $(BOOT_DIR)/$(T)-
  $(XEN_FULLVERSION).config?"

 But yes, I'm sorry for not noticing this as an undesirable place right
 away.
>>>
>>> Ok well I'm at a loss here because the quote clearly shows him
>>> suggesting that location. Do you have a suggested location because so
>>> far I've just got a no from you on the only suggested location.
>> 
>> Match the xen-syms location?
> 
> I guess I fail to grasp the rationale behind not putting it in /boot.

It's the other way around really - you'd have to provide a reason
(other than "Linux does so too") for putting it in /boot.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables

>>> On 22.12.15 at 11:30,  wrote:

I dislike having to repeat this: Please trim your Cc lists.

> --- a/xen/arch/x86/mm/guest_walk.c
> +++ b/xen/arch/x86/mm/guest_walk.c
> @@ -90,6 +90,57 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, 
> int set_dirty)
>  return 0;
>  }
>  
> +extern bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
> +uint32_t pte_flags, uint32_t pte_pkey);
> +#if GUEST_PAGING_LEVELS == CONFIG_PAGING_LEVELS
> +bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
> +uint32_t pte_flags, uint32_t pte_pkey)
> +{

See my comments on the previous version. Please avoid sending new
versions without having addressed all comments on the previous one
(verbally or by code changes). Having done the suggested change
just partially (by removing the #ifdef-s from the call sites) you now
do the key check universally, and things remain correct just because
of the long mode check in the middle of the function.

> +unsigned int pkru = 0;
> +bool_t pkru_ad, pkru_wd;
> +
> +bool_t pf = !!(pfec & PFEC_page_present);

There's still this stray blank line above (and I continue to wonder
whether you really need all these boolean variables many of which
get used just once).

> +bool_t uf = !!(pfec & PFEC_user_mode);
> +bool_t wf = !!(pfec & PFEC_write_access);
> +bool_t ff = !!(pfec & PFEC_insn_fetch);
> +bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
> +
> +/* When page isn't present,  PKEY isn't checked. */
> +if ( !pf || is_pv_vcpu(vcpu) )
> +return 0;
> +
> +/*
> + * PKU:  additional mechanism by which the paging controls
> + * access to user-mode addresses based on the value in the
> + * PKRU register. A fault is considered as a PKU violation if all
> + * of the following conditions are ture:
> + * 1.CR4_PKE=1.
> + * 2.EFER_LMA=1.
> + * 3.page is present with no reserved bit violations.
> + * 4.the access is not an instruction fetch.
> + * 5.the access is to a user page.
> + * 6.PKRU.AD=1
> + *   or The access is a data write and PKRU.WD=1
> + *and either CR0.WP=1 or it is a user access.
> + */
> +if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) ||
> +rsvdf || ff || !(pte_flags & _PAGE_USER) )

Indentation (I think this also was broken already in the previous
version).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V5 5/6] x86/hvm: pkeys, add xstate support for pkeys

>>> On 22.12.15 at 11:30,  wrote:
> --- a/xen/include/asm-x86/xstate.h
> +++ b/xen/include/asm-x86/xstate.h
> @@ -34,13 +34,15 @@
>  #define XSTATE_OPMASK  (1ULL << 5)
>  #define XSTATE_ZMM (1ULL << 6)
>  #define XSTATE_HI_ZMM  (1ULL << 7)
> +#define XSTATE_PKRU(1ULL << 9)
>  #define XSTATE_LWP (1ULL << 62) /* AMD lightweight profiling */
>  #define XSTATE_FP_SSE  (XSTATE_FP | XSTATE_SSE)
>  #define XCNTXT_MASK(XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK 
> | \
>  XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY)
>  
>  #define XSTATE_ALL (~(1ULL << 63))
> -#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR)
> +#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR | \
> +XSTATE_PKRU)

Same thing here - I don't recall any reply to my inquiry regarding this
change.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V5 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling

>>> On 22.12.15 at 11:30,  wrote:
> This patch adds pkeys support for cpuid handing.
> 
> Pkeys hardware support is CPUID.7.0.ECX[3]:PKU. software support is
> CPUID.7.0.ECX[4]:OSPKE and it reflects the support setting of CR4.PKE.
> 
> Signed-off-by: Huaitong Han 
> ---
>  tools/libxc/xc_cpufeature.h |  2 ++
>  tools/libxc/xc_cpuid_x86.c  |  6 --
>  xen/arch/x86/hvm/hvm.c  | 36 +++-
>  3 files changed, 29 insertions(+), 15 deletions(-)
> 
> diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
> index c3ddc80..f6a9778 100644
> --- a/tools/libxc/xc_cpufeature.h
> +++ b/tools/libxc/xc_cpufeature.h
> @@ -141,5 +141,7 @@
>  #define X86_FEATURE_ADX 19 /* ADCX, ADOX instructions */
>  #define X86_FEATURE_SMAP20 /* Supervisor Mode Access Protection */
>  
> +/* Intel-defined CPU features, CPUID level 0x0007:0 (ecx) */
> +#define X86_FEATURE_PKU 3
>  
>  #endif /* __LIBXC_CPUFEATURE_H */
> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> index 8882c01..1ce979b 100644
> --- a/tools/libxc/xc_cpuid_x86.c
> +++ b/tools/libxc/xc_cpuid_x86.c
> @@ -427,9 +427,11 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
>  bitmaskof(X86_FEATURE_ADX)  |
>  bitmaskof(X86_FEATURE_SMAP) |
>  bitmaskof(X86_FEATURE_FSGSBASE));
> +regs[2] &= bitmaskof(X86_FEATURE_PKU);
>  } else
> -regs[1] = 0;
> -regs[0] = regs[2] = regs[3] = 0;
> +regs[1] = regs[2] = 0;
> +
> +regs[0] = regs[3] = 0;
>  break;
>  
>  case 0x000d:
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 59916ed..076313b 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4572,7 +4572,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, 
> unsigned int *ebx,
>  __clear_bit(X86_FEATURE_APIC & 31, edx);
>  
>  /* Fix up OSXSAVE. */
> -if ( cpu_has_xsave )
> +if ( *ecx & cpufeat_mask(X86_FEATURE_XSAVE) )
>  *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) ?
>   cpufeat_mask(X86_FEATURE_OSXSAVE) : 0;
>  
> @@ -4585,21 +4585,31 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, 
> unsigned int *ebx,
>  *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
>  break;
>  case 0x7:
> -if ( (count == 0) && !cpu_has_smep )
> -*ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
> +if ( count == 0 )
> +{
> +if ( !cpu_has_smep )
> +*ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
>  
> -if ( (count == 0) && !cpu_has_smap )
> -*ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
> +if ( !cpu_has_smap )
> +*ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
>  
> -/* Don't expose MPX to hvm when VMX support is not available */
> -if ( (count == 0) &&
> - (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
> -  !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS)) )
> -*ebx &= ~cpufeat_mask(X86_FEATURE_MPX);
> +/* Don't expose MPX to hvm when VMX support is not available */

Please fix the comment style if you touch this anyway.

> +if (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
> +  !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS))
> +*ebx &= ~cpufeat_mask(X86_FEATURE_MPX);
>  
> -/* Don't expose INVPCID to non-hap hvm. */
> -if ( (count == 0) && !hap_enabled(d) )
> -*ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> +if ( !hap_enabled(d) )
> +{
> +/* Don't expose INVPCID to non-hap hvm. */
> +*ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> +/* X86_FEATURE_PKU is not yet implemented for shadow paging. 
> */
> +*ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
> +}
> +
> +if ( *ecx & cpufeat_mask(X86_FEATURE_PKU))
> +*ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ?
> + cpufeat_mask(X86_FEATURE_OSPKE) : 0;

Bogus use of ?: considering that you or in zero in its "false" case.
Just extend the if() condition.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [distros-debian-snapshot test] 38550: tolerable FAIL

2015-12-22 Thread Platform Team regression test user

flight 38550 distros-debian-snapshot real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/38550/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-armhf-daily-netboot-pygrub 9 debian-di-install fail like 38518
 test-amd64-i386-i386-daily-netboot-pvgrub  9 debian-di-install fail like 38518
 test-amd64-i386-amd64-daily-netboot-pygrub 9 debian-di-install fail like 38518
 test-amd64-amd64-i386-daily-netboot-pygrub 9 debian-di-install fail like 38518
 test-amd64-amd64-amd64-daily-netboot-pvgrub 9 debian-di-install fail like 38518

baseline version:
 flight   38518

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-amd64-daily-netboot-pvgrub  fail
 test-amd64-i386-i386-daily-netboot-pvgrubfail
 test-amd64-i386-amd64-daily-netboot-pygrub   fail
 test-armhf-armhf-armhf-daily-netboot-pygrub  fail
 test-amd64-amd64-i386-daily-netboot-pygrub   fail
 test-amd64-amd64-amd64-current-netinst-pygrubpass
 test-amd64-i386-amd64-current-netinst-pygrub pass
 test-amd64-amd64-i386-current-netinst-pygrub pass
 test-amd64-i386-i386-current-netinst-pygrub  pass
 test-amd64-amd64-amd64-weekly-netinst-pygrub pass
 test-amd64-i386-amd64-weekly-netinst-pygrub  pass
 test-amd64-amd64-i386-weekly-netinst-pygrub  pass
 test-amd64-i386-i386-weekly-netinst-pygrub   pass



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Xen Security Advisory 155 (CVE-2015-8550) - paravirtualized drivers incautious about shared memory

2015-12-22 Thread Samuel Thibault

Stefano Stabellini, on Tue 22 Dec 2015 12:24:35 +, wrote:
> MiniOS for QEMU stubdom has frontends, such as mini-os/blkfront.c and
> mini-os/netfront.c, not backends.

There is one backend, tpmback.  It however doesn't use a ring.

Samuel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 13/13] xen-hvm: Mark inappropriate error handling FIXME

On Thu, 17 Dec 2015, Markus Armbruster wrote:
> Cc: Stefano Stabellini 
> Cc: xen-de...@lists.xensource.com
> Signed-off-by: Markus Armbruster 
> ---
>  xen-hvm.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/xen-hvm.c b/xen-hvm.c
> index 3d78a0c..2a93390 100644
> --- a/xen-hvm.c
> +++ b/xen-hvm.c
> @@ -240,6 +240,7 @@ static void xen_ram_init(PCMachineState *pcms,
>  
>  void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
>  {
> +/* FIXME caller ram_block_add() wants error_setg() on failure */
>  unsigned long nr_pfn;
>  xen_pfn_t *pfn_list;
>  int i;
> @@ -1192,6 +1193,12 @@ static void xen_wakeup_notifier(Notifier *notifier, 
> void *data)
>  int xen_hvm_init(PCMachineState *pcms,
>   MemoryRegion **ram_memory)
>  {
> +/*
> + * FIXME Returns -1 without cleaning up on some errors (harmless
> + * as long as the caller exit()s on error), dies with hw_error()

Could you please stop the comment here and just replace hw_error() with
either return -1 or exit(1) within xen_hvm_init?


> + * on others.  hw_error() isn't approprate here.  Should probably
> + * simply exit() on all errors.
> + */
>  int i, rc;
>  xen_pfn_t ioreq_pfn;
>  xen_pfn_t bufioreq_pfn;
> -- 
> 2.4.3
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCHv1 4/4] spinlock: add fair read-write locks

>>> On 18.12.15 at 15:09,  wrote:
> +/**
> + * rspin_until_writer_unlock - inc reader count & spin until writer is gone

Stale comment - the function doesn't increment anything.

Also throughout the file, with Linux coding style converted to Xen
style, comment style should be made Xen-like too.

> +/*
> + * Readers come here when they cannot get the lock without waiting
> + */
> +if ( unlikely(in_irq()) )
> +{
> +/*
> + * Readers in interrupt context will spin until the lock is
> + * available without waiting in the queue.
> + */
> +smp_rmb();
> +cnts = atomic_read(&lock->cnts);
> +rspin_until_writer_unlock(lock, cnts);
> +return;
> +}

I can't immediately see the reason for this re-introduction of
unfairness - can you say a word on this, or perhaps extend the
comment?

> +/**
> + * queue_write_lock_slowpath - acquire write lock of a queue rwlock
> + * @lock : Pointer to queue rwlock structure
> + */
> +void queue_write_lock_slowpath(rwlock_t *lock)
>  {
> -_read_unlock(lock);
> -local_irq_enable();
> -}
> +u32 cnts;
>  
> -void _read_unlock_irqrestore(rwlock_t *lock, unsigned long flags)
> -{
> -_read_unlock(lock);
> -local_irq_restore(flags);
> -}
> +/* Put the writer into the wait queue */
> +spin_lock(&lock->lock);
>  
> -void _write_lock(rwlock_t *lock)
> -{
> -uint32_t x;
> +/* Try to acquire the lock directly if no reader is present */
> +if ( !atomic_read(&lock->cnts) &&
> + (atomic_cmpxchg(&lock->cnts, 0, _QW_LOCKED) == 0) )
> +goto unlock;
>  
> -check_lock(&lock->debug);
> -do {
> -while ( (x = lock->lock) & RW_WRITE_FLAG )
> -cpu_relax();
> -} while ( cmpxchg(&lock->lock, x, x|RW_WRITE_FLAG) != x );
> -while ( x != 0 )
> +/*
> + * Set the waiting flag to notify readers that a writer is pending,
> + * or wait for a previous writer to go away.
> + */
> +for (;;)
>  {
> -cpu_relax();
> -x = lock->lock & ~RW_WRITE_FLAG;
> -}
> -preempt_disable();
> -}
> -
> -void _write_lock_irq(rwlock_t *lock)
> -{
> -uint32_t x;
> +cnts = atomic_read(&lock->cnts);
> +if ( !(cnts & _QW_WMASK) &&
> + (atomic_cmpxchg(&lock->cnts, cnts,
> + cnts | _QW_WAITING) == cnts) )
> +break;

Considering that (at least on x86) cmpxchg is relatively expensive,
and that atomic OR would be carried out by some cmpxchg-like
mechanism on most other arches anyway, can't this be an atomic
OR, followed by a read to check for another active writer?

> +unlock:
> +spin_unlock(&lock->lock);

Labels indented by at least one space please.

Also - are you using any of the static functions in spinlock.c? If not,
putting the rwlock code in a new rwlock.c would help review quite a
bit, since code removal and code addition would then be separate
rather than intermixed.

> +/*
> + * Writer states & reader shift and bias
> + */
> +#define_QW_WAITING  1   /* A writer is waiting */
> +#define_QW_LOCKED   0xff/* A writer holds the lock */
> +#define_QW_WMASK0xff/* Writer mask */
> +#define_QR_SHIFT8   /* Reader count shift  */
> +#define_QR_BIAS (1U << _QR_SHIFT)

Is there a particular reason for the 8-bit writer mask (a 2-bit one
would seem to suffice)?

> +static inline int _write_trylock(rwlock_t *lock)
> +{
> +u32 cnts;
> +
> +cnts = atomic_read(&lock->cnts);
> +if ( unlikely(cnts) )
> +return 0;
> +
> +return likely(atomic_cmpxchg(&lock->cnts,
> + cnts, cnts | _QW_LOCKED) == cnts);

The | is pointless here considering that cnts is zero.

> +static inline void _write_unlock(rwlock_t *lock)
> +{
> +/*
> + * If the writer field is atomic, it can be cleared directly.
> + * Otherwise, an atomic subtraction will be used to clear it.
> + */
> +atomic_sub(_QW_LOCKED, &lock->cnts);
> +}

Ah, I guess the comment here is the explanation for the 8-bit
write mask.

> +static inline int _rw_is_write_locked(rwlock_t *lock)
> +{
> +return atomic_read(&lock->cnts) & _QW_WMASK;
> +}

This returns true for write-locked or writer-waiting - is this intended?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [xen-4.4-testing test] 66583: regressions - FAIL

On Mon, 21 Dec 2015, Jan Beulich wrote:
> >>> On 20.12.15 at 13:25,  wrote:
> > flight 66583 xen-4.4-testing real [real]
> > http://logs.test-lab.xenproject.org/osstest/logs/66583/ 
> > 
> > Regressions :-(
> > 
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  build-amd64-prev  5 xen-build fail REGR. vs. 
> > 66458
> >  build-i386-prev   5 xen-build fail REGR. vs. 
> > 66458
> 
> Ian, Stefano,
> 
> is one of you two looking into this?
> 
> block-qcow.c: In function 'get_cluster_offset':
> block-qcow.c:444:3: error: 'tmp_ptr' may be used uninitialized in this 
> function [-Werror=maybe-uninitialized]
>memcpy(tmp_ptr, l1_ptr, 4096);
>^
> block-qcow.c:619:7: error: 'tmp_ptr2' may be used uninitialized in this 
> function [-Werror=maybe-uninitialized]
>if (write(s->fd, tmp_ptr2, 4096) != 4096) {
>^
> cc1: all warnings being treated as errors
> /home/osstest/build.66583.build-amd64-prev/xen/tools/blktap/drivers/../../../tools/Rules.mk:89:
>  recipe for target 'block-qcow.o' failed
> make[5]: *** [block-qcow.o] Error 1

What does build-amd64-prev means? I cannot find the explanation of the
test anywhere.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCHv6 2/2] x86/ept: defer the invalidation until the p2m lock is released

On 22/12/15 12:23, George Dunlap wrote:
> On 18/12/15 13:50, David Vrabel wrote:
>> Holding the p2m lock while calling ept_sync_domain() is very expensive
>> since it does a on_selected_cpus() call.  IPIs on many socket machines
>> can be very slows and on_selected_cpus() is serialized.
>>
>> It is safe to defer the invalidate until the p2m lock is released
>> except for two cases:
>>
>> 1. When freeing a page table page (since partial translations may be
>>cached).
>> 2. When reclaiming a zero page as part of PoD.
>>
>> For these cases, add p2m_tlb_flush_sync() calls which will immediately
>> perform the invalidate before the page is freed or reclaimed.
> There are at least two other places in the PoD code where the "remove ->
> check -> add to cache -> unlock" pattern exist; and it looks to me like
> there are other places where races might occur (e.g.,
> p2m_paging_evict(), which does remove -> scrub -> put -> unlock;
> p2m_altp2m_propagate_change(), which does remove -> put -> unlock).
>
> Part of me wonders whether, rather than making callers that need it
> remember to do a flush, it might be better to explicitly pass in
> P2M_FLUSH or P2M_CAN_DEFER when calling p2m_set_entry, just to make
> people think about the fact that the p2m change may not actually take
> effect until later.  Any thoughts on that?
>
> Comments on the current approach inline.
>
>> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
>> index c094320..43c7f1b 100644
>> --- a/xen/arch/x86/mm/p2m-ept.c
>> +++ b/xen/arch/x86/mm/p2m-ept.c
>> @@ -263,6 +263,7 @@ static void ept_free_entry(struct p2m_domain *p2m, 
>> ept_entry_t *ept_entry, int l
>>  unmap_domain_page(epte);
>>  }
>>  
>> +p2m_tlb_flush_sync(p2m);
>>  p2m_free_ptp(p2m, mfn_to_page(ept_entry->mfn));
> It's probably worth a comment here pointing out that even if this
> function is called several times (e.g., if you replace a load of 4k
> entries with a 1G entry), the actual flush will only happen the first time.
>
>> +static void ept_flush_and_unlock(struct p2m_domain *p2m, bool_t unlock)
>> +{
>> +p2m->need_flush = 0;
>> +if ( unlock )
>> +mm_write_unlock(&p2m->lock);
>> +ept_sync_domain_mask(p2m, p2m->domain->domain_dirty_cpumask);
>>  }
> Having a function called "flush_and_unlock", with a boolean as to
> whether to unlock or not, just seems a bit wonky.
>
> Wouldn't it make more sense to have the hook just named "flush_sync()",
> and move the unlocking out in the generic p2m code (where you already
> have the check for need_flush)?
>
>> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
>> index fa46dd9..9c394c2 100644
>> --- a/xen/include/asm-x86/p2m.h
>> +++ b/xen/include/asm-x86/p2m.h
>> @@ -261,6 +261,10 @@ struct p2m_domain {
>>unsigned long gfn, l1_pgentry_t 
>> *p,
>>l1_pgentry_t new, unsigned int 
>> level);
>>  long   (*audit_p2m)(struct p2m_domain *p2m);
>> +void   (*flush_and_unlock)(struct p2m_domain *p2m, bool_t 
>> unlock);
>> +
>> +unsigned int defer_flush;
>> +bool_t need_flush;
> It's probably worth a comment that at the moment calling
> flush_and_unlock() is gated on need_flush; so it's OK not to implement
> flush_and_unlock() as long as you never set need_flush.

This is just one small accident (in code elsewhere) away from a code
injection vulnerability.

Either we should require that all function pointers are filled in, or
BUG() if the pointer is missing when we attempt to use it.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] libxc: Defer initialization of start_page for HVM guests

2015-12-22 Thread Boris Ostrovsky


On 12/22/2015 04:42 AM, Roger Pau Monné wrote:

El 22/12/15 a les 0.45, Boris Ostrovsky ha escrit:

With commit 8c45adec18e0 ("libxc: create unmapped initrd in domain
builder if supported") location of ramdisk may not be available to
HVMlite guests by the time alloc_magic_pages_hvm() is invoked if the
guest supports unmapped initrd.

Since the only operation related to allocating magic pages in that
routine is allocation of HVMlite start info we can move everything
else to a later point such as xc_dom_arch.start_info().

Signed-off-by: Boris Ostrovsky 
---
I am not sure xc_dom_arch.start_info() is the right place neither since we
still are doing things that have nothing to do with start_info.

  tools/libxc/xc_dom_x86.c |   45 ++---
  1 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 3960875..f64079e 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -585,6 +585,32 @@ static void build_hvm_info(void *hvm_info_page, struct 
xc_dom_image *dom)
  
  static int alloc_magic_pages_hvm(struct xc_dom_image *dom)

  {
+struct xc_dom_seg seg;
+size_t start_info_size = sizeof(struct hvm_start_info);
+
+if ( dom->device_model )
+return 0;
+
+if ( dom->cmdline )
+start_info_size += ROUNDUP(strlen(dom->cmdline) + 1, 8);
+if ( dom->ramdisk_blob )
+/* Limited to one module. */
+start_info_size += sizeof( struct hvm_modlist_entry);

  ^ extra space.

+
+if ( xc_dom_alloc_segment(dom, &seg, "HVMlite start info", 0,
+  start_info_size) )
+{
+DOMPRINTF("Unable to reserve memory for the start info");
+return -1;
+}
+
+dom->start_info_pfn = seg.pfn;
+
+return 0;
+}
+
+static int start_info_hvm(struct xc_dom_image *dom)
+{
  unsigned long i;
  void *hvm_info_page;
  uint32_t *ident_pt, domid = dom->guest_domid;
@@ -636,7 +662,6 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom)
  
  if ( !dom->device_model )

  {
-struct xc_dom_seg seg;
  struct hvm_start_info *start_info;
  char *cmdline;
  struct hvm_modlist_entry *modlist;
@@ -653,17 +678,9 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom)
  if ( dom->ramdisk_blob )
  start_info_size += sizeof(*modlist); /* Limited to one module. */
  
-rc = xc_dom_alloc_segment(dom, &seg, "HVMlite start info", 0,

-  start_info_size);
-if ( rc != 0 )
-{
-DOMPRINTF("Unable to reserve memory for the start info");
-goto out;
-}
-
  start_page = xc_map_foreign_range(xch, domid, start_info_size,

IMHO, I would stash start_info_size somewhere inside xc_dom_image in
order to avoid calculating it twice.


Yes, I should do this.



Also, why is everything done inside of alloc_magic_pages_hvm moved to
start_info_hvm? AFAICT we should only need to move the code that fills
the hvm_start_info struct, but the rest of the code already present in
alloc_magic_pages_hvm could be left as-is.


That's what I was referring to in the commit message (and in the comment 
below it): most, if not all, of the stuff that alloc_magic_pages_hvm() 
currently does really has nothing to do with magic pages allocation. 
E.g. hvm_params settting, special pages initialization, etc. So I 
figured I could move all of this to a later point. But, as I said in the 
comment, I am not convinced that start_info() is the right place either.



-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH RESEND v4] arm: remove !CPU_V6 and !GENERIC_ATOMIC64 build dependencies for XEN

Hello Russell,

Would you please consider this patch for the next merge window? It
is a very old patch (March 2014) which has been left over.

See some of the original threads here:

http://marc.info/?l=linux-arm-kernel&m=138920414402610&w=2
http://marc.info/?l=linux-arm-kernel&m=139031200118436&w=2

Many thanks,

Stefano

---

Remove !GENERIC_ATOMIC64 build dependency:
- introduce xen_atomic64_xchg
- use it to implement xchg_xen_ulong

Remove !CPU_V6 build dependency:
- introduce __cmpxchg8 and __cmpxchg16, compiled even ifdef
  CONFIG_CPU_V6
- implement sync_cmpxchg using __cmpxchg8 and __cmpxchg16

Signed-off-by: Stefano Stabellini 
Reviewed-by: Will Deacon 
Acked-by: Ian Campbell 
CC: a...@arndb.de
CC: will.dea...@arm.com
CC: catalin.mari...@arm.com
CC: linux-arm-ker...@lists.infradead.org
CC: linux-ker...@vger.kernel.org
CC: xen-de...@lists.xenproject.org

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 34e1569..e09ed94 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1807,8 +1807,7 @@ config XEN_DOM0
 config XEN
bool "Xen guest support on ARM"
depends on ARM && AEABI && OF
-   depends on CPU_V7 && !CPU_V6
-   depends on !GENERIC_ATOMIC64
+   depends on CPU_V7
depends on MMU
select ARCH_DMA_ADDR_T_64BIT
select ARM_PSCI
diff --git a/arch/arm/include/asm/cmpxchg.h b/arch/arm/include/asm/cmpxchg.h
index 97882f9..3b342ec 100644
--- a/arch/arm/include/asm/cmpxchg.h
+++ b/arch/arm/include/asm/cmpxchg.h
@@ -152,6 +152,44 @@ extern void __bad_cmpxchg(volatile void *ptr, int size);
  * cmpxchg only support 32-bits operands on ARMv6.
  */
 
+static inline unsigned long __cmpxchg8(volatile void *ptr, unsigned long old,
+ unsigned long new)
+{
+   unsigned long oldval, res;
+
+   do {
+   asm volatile("@ __cmpxchg8\n"
+   "   ldrexb  %1, [%2]\n"
+   "   mov %0, #0\n"
+   "   teq %1, %3\n"
+   "   strexbeq %0, %4, [%2]\n"
+   : "=&r" (res), "=&r" (oldval)
+   : "r" (ptr), "Ir" (old), "r" (new)
+   : "memory", "cc");
+   } while (res);
+
+   return oldval;
+}
+
+static inline unsigned long __cmpxchg16(volatile void *ptr, unsigned long old,
+ unsigned long new)
+{
+   unsigned long oldval, res;
+
+   do {
+   asm volatile("@ __cmpxchg16\n"
+   "   ldrexh  %1, [%2]\n"
+   "   mov %0, #0\n"
+   "   teq %1, %3\n"
+   "   strexheq %0, %4, [%2]\n"
+   : "=&r" (res), "=&r" (oldval)
+   : "r" (ptr), "Ir" (old), "r" (new)
+   : "memory", "cc");
+   } while (res);
+
+   return oldval;
+}
+
 static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
  unsigned long new, int size)
 {
@@ -162,28 +200,10 @@ static inline unsigned long __cmpxchg(volatile void *ptr, 
unsigned long old,
switch (size) {
 #ifndef CONFIG_CPU_V6  /* min ARCH >= ARMv6K */
case 1:
-   do {
-   asm volatile("@ __cmpxchg1\n"
-   "   ldrexb  %1, [%2]\n"
-   "   mov %0, #0\n"
-   "   teq %1, %3\n"
-   "   strexbeq %0, %4, [%2]\n"
-   : "=&r" (res), "=&r" (oldval)
-   : "r" (ptr), "Ir" (old), "r" (new)
-   : "memory", "cc");
-   } while (res);
+   oldval = __cmpxchg8(ptr, old, new);
break;
case 2:
-   do {
-   asm volatile("@ __cmpxchg1\n"
-   "   ldrexh  %1, [%2]\n"
-   "   mov %0, #0\n"
-   "   teq %1, %3\n"
-   "   strexheq %0, %4, [%2]\n"
-   : "=&r" (res), "=&r" (oldval)
-   : "r" (ptr), "Ir" (old), "r" (new)
-   : "memory", "cc");
-   } while (res);
+   oldval = __cmpxchg16(ptr, old, new);
break;
 #endif
case 4:
diff --git a/arch/arm/include/asm/sync_bitops.h 
b/arch/arm/include/asm/sync_bitops.h
index 9732b8e..4103319 100644
--- a/arch/arm/include/asm/sync_bitops.h
+++ b/arch/arm/include/asm/sync_bitops.h
@@ -20,7 +20,29 @@
 #define sync_test_and_clear_bit(nr, p) _test_and_clear_bit(nr, p)
 #define sync_test_and_change_bit(nr, p)_test_and_change_bit(nr, p)
 #define sync_test_bit(nr, addr)test_bit(nr, addr)
-#define sync_cmpxchg   cmpxchg
 
+static inline unsigned long sync_cmpxchg(volatile void *ptr,
+

Re: [Xen-devel] [PATCHv6 2/2] x86/ept: defer the invalidation until the p2m lock is released

2015-12-22 Thread David Vrabel

On 22/12/15 14:01, Andrew Cooper wrote:
> On 22/12/15 12:23, George Dunlap wrote:
>> On 18/12/15 13:50, David Vrabel wrote:
>>> Holding the p2m lock while calling ept_sync_domain() is very expensive
>>> since it does a on_selected_cpus() call.  IPIs on many socket machines
>>> can be very slows and on_selected_cpus() is serialized.
>>>
>>> It is safe to defer the invalidate until the p2m lock is released
>>> except for two cases:
>>>
>>> 1. When freeing a page table page (since partial translations may be
>>>cached).
>>> 2. When reclaiming a zero page as part of PoD.
>>>
>>> For these cases, add p2m_tlb_flush_sync() calls which will immediately
>>> perform the invalidate before the page is freed or reclaimed.
>> There are at least two other places in the PoD code where the "remove ->
>> check -> add to cache -> unlock" pattern exist; and it looks to me like
>> there are other places where races might occur (e.g.,
>> p2m_paging_evict(), which does remove -> scrub -> put -> unlock;
>> p2m_altp2m_propagate_change(), which does remove -> put -> unlock).
>>
>> Part of me wonders whether, rather than making callers that need it
>> remember to do a flush, it might be better to explicitly pass in
>> P2M_FLUSH or P2M_CAN_DEFER when calling p2m_set_entry, just to make
>> people think about the fact that the p2m change may not actually take
>> effect until later.  Any thoughts on that?
>>
>> Comments on the current approach inline.
>>
>>> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
>>> index c094320..43c7f1b 100644
>>> --- a/xen/arch/x86/mm/p2m-ept.c
>>> +++ b/xen/arch/x86/mm/p2m-ept.c
>>> @@ -263,6 +263,7 @@ static void ept_free_entry(struct p2m_domain *p2m, 
>>> ept_entry_t *ept_entry, int l
>>>  unmap_domain_page(epte);
>>>  }
>>>  
>>> +p2m_tlb_flush_sync(p2m);
>>>  p2m_free_ptp(p2m, mfn_to_page(ept_entry->mfn));
>> It's probably worth a comment here pointing out that even if this
>> function is called several times (e.g., if you replace a load of 4k
>> entries with a 1G entry), the actual flush will only happen the first time.
>>
>>> +static void ept_flush_and_unlock(struct p2m_domain *p2m, bool_t unlock)
>>> +{
>>> +p2m->need_flush = 0;
>>> +if ( unlock )
>>> +mm_write_unlock(&p2m->lock);
>>> +ept_sync_domain_mask(p2m, p2m->domain->domain_dirty_cpumask);
>>>  }
>> Having a function called "flush_and_unlock", with a boolean as to
>> whether to unlock or not, just seems a bit wonky.
>>
>> Wouldn't it make more sense to have the hook just named "flush_sync()",
>> and move the unlocking out in the generic p2m code (where you already
>> have the check for need_flush)?
>>
>>> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
>>> index fa46dd9..9c394c2 100644
>>> --- a/xen/include/asm-x86/p2m.h
>>> +++ b/xen/include/asm-x86/p2m.h
>>> @@ -261,6 +261,10 @@ struct p2m_domain {
>>>unsigned long gfn, l1_pgentry_t 
>>> *p,
>>>l1_pgentry_t new, unsigned int 
>>> level);
>>>  long   (*audit_p2m)(struct p2m_domain *p2m);
>>> +void   (*flush_and_unlock)(struct p2m_domain *p2m, bool_t 
>>> unlock);
>>> +
>>> +unsigned int defer_flush;
>>> +bool_t need_flush;
>> It's probably worth a comment that at the moment calling
>> flush_and_unlock() is gated on need_flush; so it's OK not to implement
>> flush_and_unlock() as long as you never set need_flush.
> 
> This is just one small accident (in code elsewhere) away from a code
> injection vulnerability.
> 
> Either we should require that all function pointers are filled in, or
> BUG() if the pointer is missing when we attempt to use it.

Jan asked for the call to be conditional on need_flush and to not test
flush_and_unlock.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [xen-4.4-testing test] 66583: regressions - FAIL

>>> On 22.12.15 at 14:55,  wrote:
> On Mon, 21 Dec 2015, Jan Beulich wrote:
>> >>> On 20.12.15 at 13:25,  wrote:
>> > flight 66583 xen-4.4-testing real [real]
>> > http://logs.test-lab.xenproject.org/osstest/logs/66583/ 
>> > 
>> > Regressions :-(
>> > 
>> > Tests which did not succeed and are blocking,
>> > including tests which could not be run:
>> >  build-amd64-prev  5 xen-build fail REGR. vs. 
> 66458
>> >  build-i386-prev   5 xen-build fail REGR. vs. 
> 66458
>> 
>> Ian, Stefano,
>> 
>> is one of you two looking into this?
>> 
>> block-qcow.c: In function 'get_cluster_offset':
>> block-qcow.c:444:3: error: 'tmp_ptr' may be used uninitialized in this 
> function [-Werror=maybe-uninitialized]
>>memcpy(tmp_ptr, l1_ptr, 4096);
>>^
>> block-qcow.c:619:7: error: 'tmp_ptr2' may be used uninitialized in this 
> function [-Werror=maybe-uninitialized]
>>if (write(s->fd, tmp_ptr2, 4096) != 4096) {
>>^
>> cc1: all warnings being treated as errors
>> 
> /home/osstest/build.66583.build-amd64-prev/xen/tools/blktap/drivers/../../../to
> ols/Rules.mk:89: recipe for target 'block-qcow.o' failed
>> make[5]: *** [block-qcow.o] Error 1
> 
> What does build-amd64-prev means? I cannot find the explanation of the
> test anywhere.

Iiuc it's the build pf the previous version, needed to test n-1 -> n
migration. I.e. (in line with the respective 4.3 failure) it indicates
that 4.3's qemu doesn't build.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] libxc: Defer initialization of start_page for HVM guests

2015-12-22 Thread Roger Pau Monné

El 22/12/15 a les 15.12, Boris Ostrovsky ha escrit:
> On 12/22/2015 04:42 AM, Roger Pau Monné wrote:
>> Also, why is everything done inside of alloc_magic_pages_hvm moved to
>> start_info_hvm? AFAICT we should only need to move the code that fills
>> the hvm_start_info struct, but the rest of the code already present in
>> alloc_magic_pages_hvm could be left as-is.
> 
> That's what I was referring to in the commit message (and in the comment
> below it): most, if not all, of the stuff that alloc_magic_pages_hvm()
> currently does really has nothing to do with magic pages allocation.
> E.g. hvm_params settting, special pages initialization, etc. So I
> figured I could move all of this to a later point. But, as I said in the
> comment, I am not convinced that start_info() is the right place either.

Some of it has to do with magic pages IMHO (although I have to admit
this is probably a question of taste), for example it makes sense from
my PoV to allocate the hvm_info_page, the special pages (xenstore,
console...), the ioreq server pages and possibly the EPT identity map.
Moving the start_info stuff into it's own function also makes sense.

Roger.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [xen-4.4-testing test] 66583: regressions - FAIL

On Tue, 22 Dec 2015, Jan Beulich wrote:
> >>> On 22.12.15 at 14:55,  wrote:
> > On Mon, 21 Dec 2015, Jan Beulich wrote:
> >> >>> On 20.12.15 at 13:25,  wrote:
> >> > flight 66583 xen-4.4-testing real [real]
> >> > http://logs.test-lab.xenproject.org/osstest/logs/66583/ 
> >> > 
> >> > Regressions :-(
> >> > 
> >> > Tests which did not succeed and are blocking,
> >> > including tests which could not be run:
> >> >  build-amd64-prev  5 xen-build fail REGR. 
> >> > vs. 
> > 66458
> >> >  build-i386-prev   5 xen-build fail REGR. 
> >> > vs. 
> > 66458
> >> 
> >> Ian, Stefano,
> >> 
> >> is one of you two looking into this?
> >> 
> >> block-qcow.c: In function 'get_cluster_offset':
> >> block-qcow.c:444:3: error: 'tmp_ptr' may be used uninitialized in this 
> > function [-Werror=maybe-uninitialized]
> >>memcpy(tmp_ptr, l1_ptr, 4096);
> >>^
> >> block-qcow.c:619:7: error: 'tmp_ptr2' may be used uninitialized in this 
> > function [-Werror=maybe-uninitialized]
> >>if (write(s->fd, tmp_ptr2, 4096) != 4096) {
> >>^
> >> cc1: all warnings being treated as errors
> >> 
> > /home/osstest/build.66583.build-amd64-prev/xen/tools/blktap/drivers/../../../to
> > ols/Rules.mk:89: recipe for target 'block-qcow.o' failed
> >> make[5]: *** [block-qcow.o] Error 1
> > 
> > What does build-amd64-prev means? I cannot find the explanation of the
> > test anywhere.
> 
> Iiuc it's the build pf the previous version, needed to test n-1 -> n
> migration. I.e. (in line with the respective 4.3 failure) it indicates
> that 4.3's qemu doesn't build.

Thanks for the explanation, it was very helpful.

>From the logs what is not building is actually tools/blktap, not QEMU
though, but I cannot reproduce the failure.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH RESEND v4] arm: remove !CPU_V6 and !GENERIC_ATOMIC64 build dependencies for XEN

2015-12-22 Thread Russell King - ARM Linux

On Tue, Dec 22, 2015 at 02:17:17PM +, Stefano Stabellini wrote:
> Hello Russell,
> 
> Would you please consider this patch for the next merge window? It
> is a very old patch (March 2014) which has been left over.

This patch has some obvious problems...

> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 34e1569..e09ed94 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1807,8 +1807,7 @@ config XEN_DOM0
>  config XEN
>   bool "Xen guest support on ARM"
>   depends on ARM && AEABI && OF
> - depends on CPU_V7 && !CPU_V6
> - depends on !GENERIC_ATOMIC64
> + depends on CPU_V7

How sure are we that this won't cause regressions?  How much testing
has been done with these changed dependencies?

> diff --git a/arch/arm/include/asm/sync_bitops.h 
> b/arch/arm/include/asm/sync_bitops.h
> index 9732b8e..4103319 100644
> --- a/arch/arm/include/asm/sync_bitops.h
> +++ b/arch/arm/include/asm/sync_bitops.h
> @@ -20,7 +20,29 @@
>  #define sync_test_and_clear_bit(nr, p)   _test_and_clear_bit(nr, p)
>  #define sync_test_and_change_bit(nr, p)  _test_and_change_bit(nr, p)
>  #define sync_test_bit(nr, addr)  test_bit(nr, addr)
> -#define sync_cmpxchg cmpxchg
>  
> +static inline unsigned long sync_cmpxchg(volatile void *ptr,
> + 
>  unsigned long old,
> + 
>  unsigned long new)
> +{
> + unsigned long oldval;
> + int size = sizeof(*(ptr));

This is buggy.  You're doing sizeof(void) here, which on GCC will always
be 1:

   A consequence of this is that `sizeof' is also allowed on `void' and
  on function types, and returns 1.

> +
> + smp_mb();
> + switch (size) {
> + case 1:
> + oldval = __cmpxchg8(ptr, old, new);
> + break;
> + case 2:
> + oldval = __cmpxchg16(ptr, old, new);
> + break;

The ldrexb/ldrexh instructions are not available on ARMv6 CPUs.  __xchg()
and friends avoided these for ARMv6 CPUs, but this does not.

I'd expect anything that uses sync_cmpxchg() will fail to build when a
kernel including ARMv6 is attempted.

So... I don't think this patch is ready.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] build: save generated config in /boot

On 22/12/15 12:52, Jan Beulich wrote:
 On 22.12.15 at 13:45,  wrote:
>> On 12/21/15 9:35 AM, Jan Beulich wrote:
>> On 21.12.15 at 16:20,  wrote:
 On 12/21/15 8:51 AM, Jan Beulich wrote:
 On 21.12.15 at 15:35,  wrote:
>> On 12/21/15 6:11 AM, Jan Beulich wrote:
>> On 18.12.15 at 22:35,  wrote:
 Since we now support changing Xen options with Kconfig, we should save
 the configuration that was used to build up Xen. This will save it in
 /boot alongside the installed xen.gz and call it
 xen-$(FULLVERSION).config

 Suggested-by: Ian Campbell 
 Signed-off-by: Doug Goldstein 
 ---
  xen/Makefile | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/xen/Makefile b/xen/Makefile
 index 9023863..460b977 100644
 --- a/xen/Makefile
 +++ b/xen/Makefile
 @@ -58,6 +58,7 @@ _install: $(TARGET)$(CONFIG_XEN_INSTALL_SUFFIX)
ln -f -s $(T)-$(XEN_FULLVERSION)$(Z) $(D)$(BOOT_DIR)/$(T)$(Z)
[ -d "$(D)$(DEBUG_DIR)" ] || $(INSTALL_DIR) $(D)$(DEBUG_DIR)
$(INSTALL_DATA) $(TARGET)-syms 
 $(D)$(DEBUG_DIR)/$(T)-syms-$(XEN_FULLVERSION)
 +  $(INSTALL_DATA) $(KCONFIG_CONFIG) 
>> $(D)$(BOOT_DIR)/$(T)-$(XEN_FULLVERSION).config
>>> Was it really suggested to put this into /boot? It has no business
>>> being there...
>> Yes. By multiple people. Ian Campbell was the first person to suggest it
>> in that location.
> Okay, so I've looked it up, and no, he didn't. He just gave this as one
> possibility:
>
> "It occurred to me this morning that we probably ought to stash the 
> .config
>  somewhere on install in such a way that it can be associated with the Xen
>  binary (i.e. with the same full suffix as the binary itself, not the
>  abridged symlink names), maybe as $(BOOT_DIR)/$(T)-
>  $(XEN_FULLVERSION).config?"
>
> But yes, I'm sorry for not noticing this as an undesirable place right
> away.
 Ok well I'm at a loss here because the quote clearly shows him
 suggesting that location. Do you have a suggested location because so
 far I've just got a no from you on the only suggested location.
>>> Match the xen-syms location?
>> I guess I fail to grasp the rationale behind not putting it in /boot.
> It's the other way around really - you'd have to provide a reason
> (other than "Linux does so too") for putting it in /boot.

I disagree.

Xen being consistent with Linux in this regard is in the best interest
of the users of Xen, as they end up finding similar information in
similar places.

The onus is on you to provide a reason why we should deliberately do
something different.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] libxc: Defer initialization of start_page for HVM guests

2015-12-22 Thread Boris Ostrovsky


On 12/22/2015 09:36 AM, Roger Pau Monné wrote:

El 22/12/15 a les 15.12, Boris Ostrovsky ha escrit:

On 12/22/2015 04:42 AM, Roger Pau Monné wrote:

Also, why is everything done inside of alloc_magic_pages_hvm moved to
start_info_hvm? AFAICT we should only need to move the code that fills
the hvm_start_info struct, but the rest of the code already present in
alloc_magic_pages_hvm could be left as-is.

That's what I was referring to in the commit message (and in the comment
below it): most, if not all, of the stuff that alloc_magic_pages_hvm()
currently does really has nothing to do with magic pages allocation.
E.g. hvm_params settting, special pages initialization, etc. So I
figured I could move all of this to a later point. But, as I said in the
comment, I am not convinced that start_info() is the right place either.

Some of it has to do with magic pages IMHO (although I have to admit
this is probably a question of taste), for example it makes sense from
my PoV to allocate the hvm_info_page, the special pages (xenstore,
console...), the ioreq server pages and possibly the EPT identity map.
Moving the start_info stuff into it's own function also makes sense.


Since there are many ways to separate this I'll wait for maintainers to 
express their preference.


(I don't know who is around now and I will disappear too on Friday until 
January so this may have to wait until then).


-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCHv6 2/2] x86/ept: defer the invalidation until the p2m lock is released

On 22/12/15 14:20, David Vrabel wrote:
> On 22/12/15 14:01, Andrew Cooper wrote:
>> On 22/12/15 12:23, George Dunlap wrote:
>>> On 18/12/15 13:50, David Vrabel wrote:
 Holding the p2m lock while calling ept_sync_domain() is very expensive
 since it does a on_selected_cpus() call.  IPIs on many socket machines
 can be very slows and on_selected_cpus() is serialized.

 It is safe to defer the invalidate until the p2m lock is released
 except for two cases:

 1. When freeing a page table page (since partial translations may be
cached).
 2. When reclaiming a zero page as part of PoD.

 For these cases, add p2m_tlb_flush_sync() calls which will immediately
 perform the invalidate before the page is freed or reclaimed.
>>> There are at least two other places in the PoD code where the "remove ->
>>> check -> add to cache -> unlock" pattern exist; and it looks to me like
>>> there are other places where races might occur (e.g.,
>>> p2m_paging_evict(), which does remove -> scrub -> put -> unlock;
>>> p2m_altp2m_propagate_change(), which does remove -> put -> unlock).
>>>
>>> Part of me wonders whether, rather than making callers that need it
>>> remember to do a flush, it might be better to explicitly pass in
>>> P2M_FLUSH or P2M_CAN_DEFER when calling p2m_set_entry, just to make
>>> people think about the fact that the p2m change may not actually take
>>> effect until later.  Any thoughts on that?
>>>
>>> Comments on the current approach inline.
>>>
 diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
 index c094320..43c7f1b 100644
 --- a/xen/arch/x86/mm/p2m-ept.c
 +++ b/xen/arch/x86/mm/p2m-ept.c
 @@ -263,6 +263,7 @@ static void ept_free_entry(struct p2m_domain *p2m, 
 ept_entry_t *ept_entry, int l
  unmap_domain_page(epte);
  }
  
 +p2m_tlb_flush_sync(p2m);
  p2m_free_ptp(p2m, mfn_to_page(ept_entry->mfn));
>>> It's probably worth a comment here pointing out that even if this
>>> function is called several times (e.g., if you replace a load of 4k
>>> entries with a 1G entry), the actual flush will only happen the first time.
>>>
 +static void ept_flush_and_unlock(struct p2m_domain *p2m, bool_t unlock)
 +{
 +p2m->need_flush = 0;
 +if ( unlock )
 +mm_write_unlock(&p2m->lock);
 +ept_sync_domain_mask(p2m, p2m->domain->domain_dirty_cpumask);
  }
>>> Having a function called "flush_and_unlock", with a boolean as to
>>> whether to unlock or not, just seems a bit wonky.
>>>
>>> Wouldn't it make more sense to have the hook just named "flush_sync()",
>>> and move the unlocking out in the generic p2m code (where you already
>>> have the check for need_flush)?
>>>
 diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
 index fa46dd9..9c394c2 100644
 --- a/xen/include/asm-x86/p2m.h
 +++ b/xen/include/asm-x86/p2m.h
 @@ -261,6 +261,10 @@ struct p2m_domain {
unsigned long gfn, l1_pgentry_t 
 *p,
l1_pgentry_t new, unsigned int 
 level);
  long   (*audit_p2m)(struct p2m_domain *p2m);
 +void   (*flush_and_unlock)(struct p2m_domain *p2m, bool_t 
 unlock);
 +
 +unsigned int defer_flush;
 +bool_t need_flush;
>>> It's probably worth a comment that at the moment calling
>>> flush_and_unlock() is gated on need_flush; so it's OK not to implement
>>> flush_and_unlock() as long as you never set need_flush.
>>
>> This is just one small accident (in code elsewhere) away from a code
>> injection vulnerability.
>>
>> Either we should require that all function pointers are filled in, or
>> BUG() if the pointer is missing when we attempt to use it.
> 
> Jan asked for the call to be conditional on need_flush and to not test
> flush_and_unlock.

Then perhaps the other paging modes should point this to a function
which calls BUG()?  Or perhaps a noop -- no point in crashing a machine
in production because you don't need to actually do a flush.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [linux-next test] 66804: regressions - trouble: blocked/broken/fail/pass

flight 66804 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/66804/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-pvops  5 kernel-build  fail REGR. vs. 66614
 build-amd64-pvops 5 kernel-build  fail REGR. vs. 66614

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt  3 host-install(3) broken REGR. vs. 66614
 test-armhf-armhf-xl-rtds  6 xen-boot fail   like 66614
 test-armhf-armhf-xl-arndale   6 xen-boot fail   like 66614
 test-armhf-armhf-xl-vhd   6 xen-boot fail   like 66614
 test-armhf-armhf-xl-multivcpu  6 xen-boot fail  like 66614
 test-armhf-armhf-xl-credit2   6 xen-boot fail   like 66614
 test-armhf-armhf-libvirt-xsm  6 xen-boot fail   like 66614
 test-armhf-armhf-xl-cubietruck  6 xen-boot fail like 66614
 test-armhf-armhf-libvirt-raw  6 xen-boot fail   like 66614
 test-armhf-armhf-xl   6 xen-boot fail   like 66614
 test-armhf-armhf-xl-xsm   6 xen-boot fail   like 66614
 test-armhf-armhf-libvirt-qcow2  6 xen-boot fail like 66614

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked 
n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-pvh-amd   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-intel  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-amd64-pygrub   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-i386-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked 
n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  1 build-check(1)blocked n/a
 test-amd64-i386-qemut-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-amd64-amd64-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1)blocked n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-qemuu-nested-intel  1 build-check(1)  blocked n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qcow2 1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-qemut-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-raw1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-qemuu-nested-amd  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-pair 1 build-check(1)   block

Re: [Xen-devel] Xen Security Advisory 155 (CVE-2015-8550) - paravirtualized drivers incautious about shared memory

2015-12-22 Thread Eric Shelton

The XSA mentions that "PV frontend patches will be developed and
released (publicly) after the embargo date."  Has anything been done
towards this that should also be incorporated into MiniOS?  On a
system utilizing a "driver domain," where a backend is running on a
domain that is considered unprivileged and untrusted (such as the
example described in http://wiki.xenproject.org/wiki/Driver_Domain),
it seems XSA-155-style double fetch vulnerabilities in the frontends
are also a potential security concern, and should be eliminated.
However, perhaps that does not include pcifront, since pciback would
always be running in dom0.

Eric

On Tue, Dec 22, 2015 at 7:24 AM, Stefano Stabellini
 wrote:
> MiniOS for QEMU stubdom has frontends, such as mini-os/blkfront.c and
> mini-os/netfront.c, not backends.
>
> Cheers,
>
> Stefano
>
>
> On Mon, 21 Dec 2015, Eric Shelton wrote:
>> Seeing as "All OSes providing PV backends are susceptible," doesn't this 
>> include MiniOS for QEMU stubdom as well?
>> Are there patches available for mini-os/blkfront.c, mini-os/netfront.c, and 
>> mini-os/pcifront.c?  I didn't see
>> anything for this.
>> Best,
>> Eric
>>
>> On Thu, Dec 17, 2015 at 1:36 PM, Xen.org security team  
>> wrote:
>>
>>   - Topal: Output generated on Tue Dec 22 12:23:44 GMT 2015 - 
>> Topal: GPG output starts - gpg:
>>   no valid OpenPGP data found. gpg: processing message failed: eof - 
>> Topal: GPG output ends -
>>   - Topal: Original message starts - -BEGIN PGP SIGNED 
>> MESSAGE-
>>   Hash: SHA1
>>
>>   Xen Security Advisory CVE-2015-8550 / XSA-155
>> version 6
>>
>>   paravirtualized drivers incautious about shared memory contents
>>
>>   UPDATES IN VERSION 6
>>   
>>
>>   Correct CREDITS section.
>>
>>   ISSUE DESCRIPTION
>>   =
>>
>>   The compiler can emit optimizations in the PV backend drivers which
>>   can lead to double fetch vulnerabilities. Specifically the shared
>>   memory between the frontend and backend can be fetched twice (during
>>   which time the frontend can alter the contents) possibly leading to
>>   arbitrary code execution in backend.
>>
>>   IMPACT
>>   ==
>>
>>   Malicious guest administrators can cause denial of service.  If driver
>>   domains are not in use, the impact can be a host crash, or privilege 
>> escalation.
>>
>>   VULNERABLE SYSTEMS
>>   ==
>>
>>   Systems running PV or HVM guests are vulnerable.
>>
>>   ARM and x86 systems are vulnerable.
>>
>>   All OSes providing PV backends are susceptible, this includes
>>   Linux and NetBSD. By default the Linux distributions compile kernels
>>   with optimizations.
>>
>>   MITIGATION
>>   ==
>>
>>   There is no mitigation.
>>
>>   CREDITS
>>   ===
>>
>>   This issue was discovered by Felix Wilhelm (ERNW Research, KIT /
>>   Operating Systems Group).
>>
>>   RESOLUTION
>>   ==
>>
>>   Applying the appropriate attached patches should fix the problem for
>>   PV backends.  Note only that PV backends are fixed; PV frontend
>>   patches will be developed and released (publicly) after the embargo
>>   date.
>>
>>   Please note that there is a bug in some versions of gcc,
>>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 which can cause the
>>   construct used in RING_COPY_REQUEST() to be ineffective in some
>>   circumstances. We have determined that this is only the case when the
>>   structure being copied consists purely of bitfields. The Xen PV
>>   protocols updated here do not use bitfields in this way and therefore
>>   these patches are not subject to that bug. However authors of third
>>   party PV protocols should take this into consideration.
>>
>>   Linux v4.4:
>>   xsa155-linux-xsa155-0001-xen-Add-RING_COPY_REQUEST.patch
>>   
>> xsa155-linux-xsa155-0002-xen-netback-don-t-use-last-request-to-determine-mini.patch
>>   
>> xsa155-linux-xsa155-0003-xen-netback-use-RING_COPY_REQUEST-throughout.patch
>>   
>> xsa155-linux-xsa155-0004-xen-blkback-only-read-request-operation-from-shared-.patch
>>   
>> xsa155-linux-xsa155-0005-xen-blkback-read-from-indirect-descriptors-only-once.patch
>>   xsa155-linux-xsa155-0006-xen-scsiback-safely-copy-requests.patch
>>   
>> xsa155-linux-xsa155-0007-xen-pciback-Save-xen_pci_op-commands-before-processi.patch
>>   Linux v4.[0,1,2,3]
>>   All the above patches except #5 will apply, please use:
>>   
>> xsa155-linux43-0005-xen-blkback-read-from-indirect-descriptors-only-once.patch
>>   Linux v3.19:
>>   All the above patches except #5 and #6 will apply, please use:
>>   
>> xsa155-linux43-0005-xen-blkback-read-from-indirect-descriptors-only-once.patch
>>   xsa155-linux319-0006-

Re: [Xen-devel] [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables

On 22/12/15 13:02, Jan Beulich wrote:
 On 22.12.15 at 11:30,  wrote:
> 
> I dislike having to repeat this: Please trim your Cc lists.
> 
>> --- a/xen/arch/x86/mm/guest_walk.c
>> +++ b/xen/arch/x86/mm/guest_walk.c
>> @@ -90,6 +90,57 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, 
>> int set_dirty)
>>  return 0;
>>  }
>>  
>> +extern bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
>> +uint32_t pte_flags, uint32_t pte_pkey);
>> +#if GUEST_PAGING_LEVELS == CONFIG_PAGING_LEVELS
>> +bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
>> +uint32_t pte_flags, uint32_t pte_pkey)
>> +{
> 
> See my comments on the previous version. Please avoid sending new
> versions without having addressed all comments on the previous one
> (verbally or by code changes). Having done the suggested change
> just partially (by removing the #ifdef-s from the call sites) you now
> do the key check universally, and things remain correct just because
> of the long mode check in the middle of the function.
> 
>> +unsigned int pkru = 0;
>> +bool_t pkru_ad, pkru_wd;
>> +
>> +bool_t pf = !!(pfec & PFEC_page_present);
> 
> There's still this stray blank line above (and I continue to wonder
> whether you really need all these boolean variables many of which
> get used just once).

I suspect the "stray blank line" was added for readability.

But I agree that I'd prefer not to use local boolean variables, and just
to put the flag checking inline.

> 
>> +bool_t uf = !!(pfec & PFEC_user_mode);
>> +bool_t wf = !!(pfec & PFEC_write_access);
>> +bool_t ff = !!(pfec & PFEC_insn_fetch);
>> +bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
>> +
>> +/* When page isn't present,  PKEY isn't checked. */
>> +if ( !pf || is_pv_vcpu(vcpu) )
>> +return 0;
>> +
>> +/*
>> + * PKU:  additional mechanism by which the paging controls
>> + * access to user-mode addresses based on the value in the
>> + * PKRU register. A fault is considered as a PKU violation if all
>> + * of the following conditions are ture:

*true

>> + * 1.CR4_PKE=1.
>> + * 2.EFER_LMA=1.
>> + * 3.page is present with no reserved bit violations.
>> + * 4.the access is not an instruction fetch.
>> + * 5.the access is to a user page.
>> + * 6.PKRU.AD=1
>> + *   or The access is a data write and PKRU.WD=1
>> + *and either CR0.WP=1 or it is a user access.
>> + */
>> +if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) ||
>> +rsvdf || ff || !(pte_flags & _PAGE_USER) )

And I think you might as well make this one line per condition,
something like this:

if ( is_pv_vcpu(vcpu) ||
 !hvm_pku_enabled(vcpu) ||
 !hvm_long_mode_enabled(vcpu)
 !(pfec & PFEC_page_present) ||
 (pfec & (PFEC_insn_fetch|PFEC_reserved_bit)) ||
 !(pte_flags & _PAGE_USER) )
return 0;

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables

On 22/12/15 10:30, Huaitong Han wrote:
> Protection keys define a new 4-bit protection key field(PKEY) in bits 62:59 of
> leaf entries of the page tables.
> 
> PKRU register defines 32 bits, there are 16 domains and 2 attribute bits per
> domain in pkru, for each i (0 ≤ i ≤ 15), PKRU[2i] is the access-disable bit 
> for
> protection key i (ADi); PKRU[2i+1] is the write-disable bit for protection key
> i (WDi). PKEY is index to a defined domain.
> 
> A fault is considered as a PKU violation if all of the following conditions 
> are
> ture:
> 1.CR4_PKE=1.
> 2.EFER_LMA=1.
> 3.Page is present with no reserved bit violations.
> 4.The access is not an instruction fetch.
> 5.The access is to a user page.
> 6.PKRU.AD=1
> or The access is a data write and PKRU.WD=1
> and either CR0.WP=1 or it is a user access.

Thanks, I think this architecture is much better.  More comments inline.

> diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
> index 18d1acf..9cdd607 100644
> --- a/xen/arch/x86/mm/guest_walk.c
> +++ b/xen/arch/x86/mm/guest_walk.c
> @@ -90,6 +90,57 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, 
> int set_dirty)
>  return 0;
>  }
>  
> +extern bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
> +uint32_t pte_flags, uint32_t pte_pkey);
> +#if GUEST_PAGING_LEVELS == CONFIG_PAGING_LEVELS
> +bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
> +uint32_t pte_flags, uint32_t pte_pkey)
> +{
> +unsigned int pkru = 0;
> +bool_t pkru_ad, pkru_wd;
> +
> +bool_t pf = !!(pfec & PFEC_page_present);
> +bool_t uf = !!(pfec & PFEC_user_mode);
> +bool_t wf = !!(pfec & PFEC_write_access);
> +bool_t ff = !!(pfec & PFEC_insn_fetch);
> +bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
> +
> +/* When page isn't present,  PKEY isn't checked. */
> +if ( !pf || is_pv_vcpu(vcpu) )
> +return 0;
> +
> +/*
> + * PKU:  additional mechanism by which the paging controls
> + * access to user-mode addresses based on the value in the
> + * PKRU register. A fault is considered as a PKU violation if all
> + * of the following conditions are ture:
> + * 1.CR4_PKE=1.
> + * 2.EFER_LMA=1.
> + * 3.page is present with no reserved bit violations.
> + * 4.the access is not an instruction fetch.
> + * 5.the access is to a user page.
> + * 6.PKRU.AD=1
> + *   or The access is a data write and PKRU.WD=1
> + *and either CR0.WP=1 or it is a user access.
> + */
> +if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) ||
> +rsvdf || ff || !(pte_flags & _PAGE_USER) )
> +return 0;

See my comment in response to Jan's comment.

> @@ -199,6 +252,9 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>  
>  pse1G = (gflags & _PAGE_PSE) && guest_supports_1G_superpages(v); 
>  
> +if ( pse1G && pkey_fault(v, pfec, gflags, pkey) )
> +rc |= _PAGE_PKEY_BITS;
> +
>  if ( pse1G )

So it looks like you're only doing a pkey check on the 'leaf' ptes, is
that right?

In which case we have a lot of duplication here -- duplicate pse1G /
pse2M checks, and exact copies of the call to pkey_fault().  Would it
make sense to move the pkey_fault() check down below the set_ad label?

 -George


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH] retain the configuration that Xen was built with

This should retain the .config file from the Kconfig process so that we
know how this build of Xen was configured.

Signed-off-by: Doug Goldstein 
---
 ts-xen-build | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ts-xen-build b/ts-xen-build
index b02e737..80b1faa 100755
--- a/ts-xen-build
+++ b/ts-xen-build
@@ -192,6 +192,7 @@ sub stash () {
 }
 built_stash_file($ho, $builddir, "xen-syms", "xen/xen/xen-syms", 1);
 built_stash_file($ho, $builddir, "xen-config", "xen/.config", 1);
+built_stash_file($ho, $builddir, "xen-hv-config", "xen/xen/.config", 1);
 built_stash_file($ho, $builddir, "seabios-config",
 "xen/tools/firmware/seabios-dir-remote/.config", 1);
 built_compress_stashed("xen-syms");
-- 
2.4.10


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH RESEND v4] arm: remove !CPU_V6 and !GENERIC_ATOMIC64 build dependencies for XEN

On Tue, 22 Dec 2015, Russell King - ARM Linux wrote:
> On Tue, Dec 22, 2015 at 02:17:17PM +, Stefano Stabellini wrote:
> > Hello Russell,
> > 
> > Would you please consider this patch for the next merge window? It
> > is a very old patch (March 2014) which has been left over.
> 
> This patch has some obvious problems...
> 
> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > index 34e1569..e09ed94 100644
> > --- a/arch/arm/Kconfig
> > +++ b/arch/arm/Kconfig
> > @@ -1807,8 +1807,7 @@ config XEN_DOM0
> >  config XEN
> > bool "Xen guest support on ARM"
> > depends on ARM && AEABI && OF
> > -   depends on CPU_V7 && !CPU_V6
> > -   depends on !GENERIC_ATOMIC64
> > +   depends on CPU_V7
> 
> How sure are we that this won't cause regressions?  How much testing
> has been done with these changed dependencies?

I did build-test a range of combinations of those options, sorry for not
spotting the error below.


> > diff --git a/arch/arm/include/asm/sync_bitops.h 
> > b/arch/arm/include/asm/sync_bitops.h
> > index 9732b8e..4103319 100644
> > --- a/arch/arm/include/asm/sync_bitops.h
> > +++ b/arch/arm/include/asm/sync_bitops.h
> > @@ -20,7 +20,29 @@
> >  #define sync_test_and_clear_bit(nr, p) _test_and_clear_bit(nr, p)
> >  #define sync_test_and_change_bit(nr, p)_test_and_change_bit(nr, p)
> >  #define sync_test_bit(nr, addr)test_bit(nr, addr)
> > -#define sync_cmpxchg   cmpxchg
> >  
> > +static inline unsigned long sync_cmpxchg(volatile void *ptr,
> > +   
> >  unsigned long old,
> > +   
> >  unsigned long new)
> > +{
> > +   unsigned long oldval;
> > +   int size = sizeof(*(ptr));
> 
> This is buggy.  You're doing sizeof(void) here, which on GCC will always
> be 1:
> 
>A consequence of this is that `sizeof' is also allowed on `void' and
>   on function types, and returns 1.

You are right, thank you very much for looking at the patch and finding
this bug. I think the issue can be solved with something like:

+#define sync_cmpxchg(ptr, o, n) ({ \
+   (__typeof(*ptr))__sync_cmpxchg((ptr),   \
+   (unsigned long)(o), \
+   (unsigned long)(n), \
+   sizeof(*(ptr)));\
+})
 
+static inline unsigned long __sync_cmpxchg(volatile void *ptr,
+   
 unsigned long old,
+   
 unsigned long new,
+   
 int size)
+{


> > +
> > +   smp_mb();
> > +   switch (size) {
> > +   case 1:
> > +   oldval = __cmpxchg8(ptr, old, new);
> > +   break;
> > +   case 2:
> > +   oldval = __cmpxchg16(ptr, old, new);
> > +   break;
> 
> The ldrexb/ldrexh instructions are not available on ARMv6 CPUs.  __xchg()
> and friends avoided these for ARMv6 CPUs, but this does not.
> 
> I'd expect anything that uses sync_cmpxchg() will fail to build when a
> kernel including ARMv6 is attempted.

The code builds, but of course it could not actually run on CPU_V6. But
the use case for me is building drivers/xen/grant-table.c, which can
only run on CPU_V7 anyway, so it is not a problem.

Is that acceptable for you? If not, do you have any other suggestions?


> So... I don't think this patch is ready.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] retain the configuration that Xen was built with

On 12/22/15 9:44 AM, Doug Goldstein wrote:
> This should retain the .config file from the Kconfig process so that we
> know how this build of Xen was configured.
> 
> Signed-off-by: Doug Goldstein 
> ---
>  ts-xen-build | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/ts-xen-build b/ts-xen-build
> index b02e737..80b1faa 100755
> --- a/ts-xen-build
> +++ b/ts-xen-build
> @@ -192,6 +192,7 @@ sub stash () {
>  }
>  built_stash_file($ho, $builddir, "xen-syms", "xen/xen/xen-syms", 1);
>  built_stash_file($ho, $builddir, "xen-config", "xen/.config", 1);
> +built_stash_file($ho, $builddir, "xen-hv-config", "xen/xen/.config", 1);
>  built_stash_file($ho, $builddir, "seabios-config",
>"xen/tools/firmware/seabios-dir-remote/.config", 1);
>  built_compress_stashed("xen-syms");
> 


This is for osstest. I forgot to include that in the subject.

-- 
Doug Goldstein



signature.asc
Description: OpenPGP digital signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH RESEND v4] arm: remove !CPU_V6 and !GENERIC_ATOMIC64 build dependencies for XEN

2015-12-22 Thread Russell King - ARM Linux

On Tue, Dec 22, 2015 at 03:45:19PM +, Stefano Stabellini wrote:
> The code builds, but of course it could not actually run on CPU_V6. But
> the use case for me is building drivers/xen/grant-table.c, which can
> only run on CPU_V7 anyway, so it is not a problem.

What I'm concerned about in this case is if you try to use instructions
which are not supported by the CPU, even in assembly, the assembler
will barf.

So, if we're building an ARMv6 + ARMv7 kernel, and you try to use
LDREXB in generic code, even though you may intend it to only be
executed on ARMv7, the assembler will error out.

We used to be able to work around that by passing -Wa,-march=armv7
to the compiler for specific files, which would then tell the assembler
to accept ARMv7 instructions, but modern GCC output .arch pseudo-ops
which override the command line.  So now the only way to do it is to
override the assemblers selected archtecture using .arch pseudo-ops
in the assembly code.

So, what I'm saying is that dropping the !ARMv6 dependency is not as
simple as it would seem, and I'm nervous about including any such
patches this close to Christmas and the merge window.

I've recently been through a run of "apply this patch, no it's wrong,
drop it, apply the next version, no it's still wrong, drop it again."
So, today is the last day I'm accepting _known_ good patches into my
tree for the 4.5 merge window.  This is not yet a known good patch
because it hasn't been through several iterations of testing - and I
don't want to be adding and removing patches from my tree over
Christmas.

I'd rather leave my tree in a known good state before Christmas so
that those who want to fiddle with the kernel over the holiday break
can do so without having the hassle of trees containing partially
tested patches.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] build: save generated config in /boot

>>> On 22.12.15 at 15:46,  wrote:
> On 22/12/15 12:52, Jan Beulich wrote:
> On 22.12.15 at 13:45,  wrote:
>>> On 12/21/15 9:35 AM, Jan Beulich wrote:
>>> On 21.12.15 at 16:20,  wrote:
> On 12/21/15 8:51 AM, Jan Beulich wrote:
> On 21.12.15 at 15:35,  wrote:
>>> On 12/21/15 6:11 AM, Jan Beulich wrote:
>>> On 18.12.15 at 22:35,  wrote:
> Since we now support changing Xen options with Kconfig, we should save
> the configuration that was used to build up Xen. This will save it in
> /boot alongside the installed xen.gz and call it
> xen-$(FULLVERSION).config
>
> Suggested-by: Ian Campbell 
> Signed-off-by: Doug Goldstein 
> ---
>  xen/Makefile | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/xen/Makefile b/xen/Makefile
> index 9023863..460b977 100644
> --- a/xen/Makefile
> +++ b/xen/Makefile
> @@ -58,6 +58,7 @@ _install: $(TARGET)$(CONFIG_XEN_INSTALL_SUFFIX)
>   ln -f -s $(T)-$(XEN_FULLVERSION)$(Z) $(D)$(BOOT_DIR)/$(T)$(Z)
>   [ -d "$(D)$(DEBUG_DIR)" ] || $(INSTALL_DIR) $(D)$(DEBUG_DIR)
>   $(INSTALL_DATA) $(TARGET)-syms 
> $(D)$(DEBUG_DIR)/$(T)-syms-$(XEN_FULLVERSION)
> + $(INSTALL_DATA) $(KCONFIG_CONFIG) 
>>> $(D)$(BOOT_DIR)/$(T)-$(XEN_FULLVERSION).config
 Was it really suggested to put this into /boot? It has no business
 being there...
>>> Yes. By multiple people. Ian Campbell was the first person to suggest it
>>> in that location.
>> Okay, so I've looked it up, and no, he didn't. He just gave this as one
>> possibility:
>>
>> "It occurred to me this morning that we probably ought to stash the 
>> .config
>>  somewhere on install in such a way that it can be associated with the 
>> Xen
>>  binary (i.e. with the same full suffix as the binary itself, not the
>>  abridged symlink names), maybe as $(BOOT_DIR)/$(T)-
>>  $(XEN_FULLVERSION).config?"
>>
>> But yes, I'm sorry for not noticing this as an undesirable place right
>> away.
> Ok well I'm at a loss here because the quote clearly shows him
> suggesting that location. Do you have a suggested location because so
> far I've just got a no from you on the only suggested location.
 Match the xen-syms location?
>>> I guess I fail to grasp the rationale behind not putting it in /boot.
>> It's the other way around really - you'd have to provide a reason
>> (other than "Linux does so too") for putting it in /boot.
> 
> I disagree.
> 
> Xen being consistent with Linux in this regard is in the best interest
> of the users of Xen, as they end up finding similar information in
> similar places.
> 
> The onus is on you to provide a reason why we should deliberately do
> something different.

I'm sorry, but no - why would we slavishly follow what Linux does, no
matter whether it makes sense?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] build: save generated config in /boot

On 12/22/15 9:59 AM, Jan Beulich wrote:
 On 22.12.15 at 15:46,  wrote:
>> On 22/12/15 12:52, Jan Beulich wrote:
>> On 22.12.15 at 13:45,  wrote:
 On 12/21/15 9:35 AM, Jan Beulich wrote:
 On 21.12.15 at 16:20,  wrote:
>> On 12/21/15 8:51 AM, Jan Beulich wrote:
>> On 21.12.15 at 15:35,  wrote:
 On 12/21/15 6:11 AM, Jan Beulich wrote:
 On 18.12.15 at 22:35,  wrote:
>> Since we now support changing Xen options with Kconfig, we should 
>> save
>> the configuration that was used to build up Xen. This will save it in
>> /boot alongside the installed xen.gz and call it
>> xen-$(FULLVERSION).config
>>
>> Suggested-by: Ian Campbell 
>> Signed-off-by: Doug Goldstein 
>> ---
>>  xen/Makefile | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/xen/Makefile b/xen/Makefile
>> index 9023863..460b977 100644
>> --- a/xen/Makefile
>> +++ b/xen/Makefile
>> @@ -58,6 +58,7 @@ _install: $(TARGET)$(CONFIG_XEN_INSTALL_SUFFIX)
>>  ln -f -s $(T)-$(XEN_FULLVERSION)$(Z) $(D)$(BOOT_DIR)/$(T)$(Z)
>>  [ -d "$(D)$(DEBUG_DIR)" ] || $(INSTALL_DIR) $(D)$(DEBUG_DIR)
>>  $(INSTALL_DATA) $(TARGET)-syms 
>> $(D)$(DEBUG_DIR)/$(T)-syms-$(XEN_FULLVERSION)
>> +$(INSTALL_DATA) $(KCONFIG_CONFIG) 
 $(D)$(BOOT_DIR)/$(T)-$(XEN_FULLVERSION).config
> Was it really suggested to put this into /boot? It has no business
> being there...
 Yes. By multiple people. Ian Campbell was the first person to suggest 
 it
 in that location.
>>> Okay, so I've looked it up, and no, he didn't. He just gave this as one
>>> possibility:
>>>
>>> "It occurred to me this morning that we probably ought to stash the 
>>> .config
>>>  somewhere on install in such a way that it can be associated with the 
>>> Xen
>>>  binary (i.e. with the same full suffix as the binary itself, not the
>>>  abridged symlink names), maybe as $(BOOT_DIR)/$(T)-
>>>  $(XEN_FULLVERSION).config?"
>>>
>>> But yes, I'm sorry for not noticing this as an undesirable place right
>>> away.
>> Ok well I'm at a loss here because the quote clearly shows him
>> suggesting that location. Do you have a suggested location because so
>> far I've just got a no from you on the only suggested location.
> Match the xen-syms location?
 I guess I fail to grasp the rationale behind not putting it in /boot.
>>> It's the other way around really - you'd have to provide a reason
>>> (other than "Linux does so too") for putting it in /boot.
>>
>> I disagree.
>>
>> Xen being consistent with Linux in this regard is in the best interest
>> of the users of Xen, as they end up finding similar information in
>> similar places.
>>
>> The onus is on you to provide a reason why we should deliberately do
>> something different.
> 
> I'm sorry, but no - why would we slavishly follow what Linux does, no
> matter whether it makes sense?
> 
> Jan
> 

How does it not make sense in this case? That's what Andrew and I are
asking you to explain.

-- 
Doug Goldstein



signature.asc
Description: OpenPGP digital signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH RESEND v4] arm: remove !CPU_V6 and !GENERIC_ATOMIC64 build dependencies for XEN

On Tue, 22 Dec 2015, Russell King - ARM Linux wrote:
> On Tue, Dec 22, 2015 at 03:45:19PM +, Stefano Stabellini wrote:
> > The code builds, but of course it could not actually run on CPU_V6. But
> > the use case for me is building drivers/xen/grant-table.c, which can
> > only run on CPU_V7 anyway, so it is not a problem.
> 
> What I'm concerned about in this case is if you try to use instructions
> which are not supported by the CPU, even in assembly, the assembler
> will barf.
> 
> So, if we're building an ARMv6 + ARMv7 kernel, and you try to use
> LDREXB in generic code, even though you may intend it to only be
> executed on ARMv7, the assembler will error out.
> 
> We used to be able to work around that by passing -Wa,-march=armv7
> to the compiler for specific files, which would then tell the assembler
> to accept ARMv7 instructions, but modern GCC output .arch pseudo-ops
> which override the command line.  So now the only way to do it is to
> override the assemblers selected archtecture using .arch pseudo-ops
> in the assembly code.
> 
> So, what I'm saying is that dropping the !ARMv6 dependency is not as
> simple as it would seem, and I'm nervous about including any such
> patches this close to Christmas and the merge window.
 
Sure, that's understandable.

Merge window aside, do you think that for the future it might be better
to change approach and write a Xen specific implementation of
sync_cmpxchg?  Like this older patch tried to do

http://marc.info/?l=linux-arm-kernel&m=138436406724990&w=2

?

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] build: save generated config in /boot

>>> On 22.12.15 at 17:02,  wrote:
> How does it not make sense in this case? That's what Andrew and I are
> asking you to explain.

But I already explained it: The file isn't needed for booting.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC 0/3] xen/arm: Support XENMEM_exchange

>>> On 17.12.15 at 17:31,  wrote:
> I've looked at reworking XENMEM_exchange to avoid mfn_to_gmfn. My idea would
> be to allocate a temporary array to store the GFN between the two loops.
> However, the array would be quite large (the max default is 18 on ARM),
> so I don't know if it's acceptable.

This would quite clearly not be acceptable, yet I don't have an
alternative suggestion other then implementing M2P on ARM, which
you have already named as not an option.

Otoh I'm not sure what two loops you talk about - I find exactly
one use of mfn_to_gmfn() in memory_exchange(), and latching
the GFN earlier would be wrong (as it might change in parallel to
your processing).

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC 2/3] xen/common: memory: Add support for direct mapped domain in XEMEM_exchange

>>> On 17.12.15 at 17:31,  wrote:
> Direct mapped domain needs to retrieve the exact same underlying
> physical page when the region is re-populated.

In which case - what sense does it make for such a domain to
exchange memory (the primary purpose of which is to get
_different_ memory populated at the given GFN (range))?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC 3/3] xen/common: memory: Move steal_page in common code

>>> On 17.12.15 at 17:31,  wrote:
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -1786,6 +1786,57 @@ int assign_pages(
>  return -1;
>  }
>  
> +int steal_page(
> +struct domain *d, struct page_info *page, unsigned int memflags)
> +{

I think it would better go into common/memory.c - there's no
allocation involved here.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PULL 0/4] xen-2015-12-22

The following changes since commit c3626ca7df027dabf0568284360a23faf18f0884:

  Update version for v2.5.0-rc3 release (2015-12-07 17:47:40 +)

are available in the git repository at:

  git://xenbits.xen.org/people/sstabellini/qemu-dm.git tags/xen-2015-12-22

for you to fetch changes up to fc3e493bc8e96ef4bf7ae4f035f43cb39382c936:

  xen_disk: treat "vhd" as "vpc" (2015-12-11 17:02:37 +)


Xen 2015/12/22


Jan Beulich (3):
  xen/MSI-X: latch MSI-X table writes
  xen/MSI-X: really enforce alignment
  xen/pass-through: correctly deal with RW1C bits

Stefano Stabellini (1):
  xen_disk: treat "vhd" as "vpc"

 hw/block/xen_disk.c |3 ++
 hw/xen/xen_pt.h |6 ++-
 hw/xen/xen_pt_config_init.c |   40 +++-
 hw/xen/xen_pt_msi.c |   86 ---
 4 files changed, 60 insertions(+), 75 deletions(-)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PULL 1/4] xen/MSI-X: latch MSI-X table writes

From: Jan Beulich 

The remaining log message in pci_msix_write() is wrong, as there guest
behavior may only appear to be wrong: For one, the old logic didn't
take the mask-all bit into account. And then this shouldn't depend on
host device state (i.e. the host may have masked the entry without the
guest having done so). Plus these writes shouldn't be dropped even when
an entry gets unmasked. Instead, if they can't be made take effect
right away, they should take effect on the next unmasking or enabling
operation - the specification explicitly describes such caching
behavior.

Signed-off-by: Jan Beulich 
Signed-off-by: Stefano Stabellini 
Reviewed-by: Stefano Stabellini 
---
 hw/xen/xen_pt.h |4 +--
 hw/xen/xen_pt_config_init.c |2 ++
 hw/xen/xen_pt_msi.c |   74 ---
 3 files changed, 32 insertions(+), 48 deletions(-)

diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index c545280..4f922f4 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -187,13 +187,13 @@ typedef struct XenPTMSIXEntry {
 int pirq;
 uint64_t addr;
 uint32_t data;
-uint32_t vector_ctrl;
+uint32_t latch[4];
 bool updated; /* indicate whether MSI ADDR or DATA is updated */
-bool warned;  /* avoid issuing (bogus) warning more than once */
 } XenPTMSIXEntry;
 typedef struct XenPTMSIX {
 uint32_t ctrl_offset;
 bool enabled;
+bool maskall;
 int total_entries;
 int bar_index;
 uint64_t table_base;
diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 3d8686d..4fd7206 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1499,6 +1499,8 @@ static int 
xen_pt_msixctrl_reg_write(XenPCIPassthroughState *s,
 xen_pt_msix_disable(s);
 }
 
+s->msix->maskall = *val & PCI_MSIX_FLAGS_MASKALL;
+
 debug_msix_enabled_old = s->msix->enabled;
 s->msix->enabled = !!(*val & PCI_MSIX_FLAGS_ENABLE);
 if (s->msix->enabled != debug_msix_enabled_old) {
diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
index 82de2bc..e10cd2a 100644
--- a/hw/xen/xen_pt_msi.c
+++ b/hw/xen/xen_pt_msi.c
@@ -25,6 +25,7 @@
 #define XEN_PT_GFLAGSSHIFT_DELIV_MODE 12
 #define XEN_PT_GFLAGSSHIFT_TRG_MODE   15
 
+#define latch(fld) latch[PCI_MSIX_ENTRY_##fld / sizeof(uint32_t)]
 
 /*
  * Helpers
@@ -314,7 +315,8 @@ static int msix_set_enable(XenPCIPassthroughState *s, bool 
enabled)
enabled);
 }
 
-static int xen_pt_msix_update_one(XenPCIPassthroughState *s, int entry_nr)
+static int xen_pt_msix_update_one(XenPCIPassthroughState *s, int entry_nr,
+  uint32_t vec_ctrl)
 {
 XenPTMSIXEntry *entry = NULL;
 int pirq;
@@ -332,6 +334,19 @@ static int xen_pt_msix_update_one(XenPCIPassthroughState 
*s, int entry_nr)
 
 pirq = entry->pirq;
 
+/*
+ * Update the entry addr and data to the latest values only when the
+ * entry is masked or they are all masked, as required by the spec.
+ * Addr and data changes while the MSI-X entry is unmasked get deferred
+ * until the next masked -> unmasked transition.
+ */
+if (pirq == XEN_PT_UNASSIGNED_PIRQ || s->msix->maskall ||
+(vec_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT)) {
+entry->addr = entry->latch(LOWER_ADDR) |
+  ((uint64_t)entry->latch(UPPER_ADDR) << 32);
+entry->data = entry->latch(DATA);
+}
+
 rc = msi_msix_setup(s, entry->addr, entry->data, &pirq, true, entry_nr,
 entry->pirq == XEN_PT_UNASSIGNED_PIRQ);
 if (rc) {
@@ -357,7 +372,7 @@ int xen_pt_msix_update(XenPCIPassthroughState *s)
 int i;
 
 for (i = 0; i < msix->total_entries; i++) {
-xen_pt_msix_update_one(s, i);
+xen_pt_msix_update_one(s, i, msix->msix_entry[i].latch(VECTOR_CTRL));
 }
 
 return 0;
@@ -406,35 +421,15 @@ int xen_pt_msix_update_remap(XenPCIPassthroughState *s, 
int bar_index)
 
 static uint32_t get_entry_value(XenPTMSIXEntry *e, int offset)
 {
-switch (offset) {
-case PCI_MSIX_ENTRY_LOWER_ADDR:
-return e->addr & UINT32_MAX;
-case PCI_MSIX_ENTRY_UPPER_ADDR:
-return e->addr >> 32;
-case PCI_MSIX_ENTRY_DATA:
-return e->data;
-case PCI_MSIX_ENTRY_VECTOR_CTRL:
-return e->vector_ctrl;
-default:
-return 0;
-}
+return !(offset % sizeof(*e->latch))
+   ? e->latch[offset / sizeof(*e->latch)] : 0;
 }
 
 static void set_entry_value(XenPTMSIXEntry *e, int offset, uint32_t val)
 {
-switch (offset) {
-case PCI_MSIX_ENTRY_LOWER_ADDR:
-e->addr = (e->addr & ((uint64_t)UINT32_MAX << 32)) | val;
-break;
-case PCI_MSIX_ENTRY_UPPER_ADDR:
-e->addr = (uint64_t)val << 32 | (e->addr & UINT32_MAX);
-break;
-case PCI_MSIX_ENTRY_DATA:
-e->data = val;
-break;
-case PCI_MSIX_ENTRY_VECTOR_CTRL:
-e->vector_ctrl = val;
-break;
+if (!(offset % sizeof(*e

[Xen-devel] [PULL 2/4] xen/MSI-X: really enforce alignment

From: Jan Beulich 

The way the generic infrastructure works the intention of not allowing
unaligned accesses can't be achieved by simply setting .unaligned to
false. The benefit is that we can now replace the conditionals in
{get,set}_entry_value() by assert()-s.

Signed-off-by: Jan Beulich 
Signed-off-by: Stefano Stabellini 
Reviewed-by: Stefano Stabellini 
---
 hw/xen/xen_pt_msi.c |   22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
index e10cd2a..302a310 100644
--- a/hw/xen/xen_pt_msi.c
+++ b/hw/xen/xen_pt_msi.c
@@ -421,16 +421,14 @@ int xen_pt_msix_update_remap(XenPCIPassthroughState *s, 
int bar_index)
 
 static uint32_t get_entry_value(XenPTMSIXEntry *e, int offset)
 {
-return !(offset % sizeof(*e->latch))
-   ? e->latch[offset / sizeof(*e->latch)] : 0;
+assert(!(offset % sizeof(*e->latch)));
+return e->latch[offset / sizeof(*e->latch)];
 }
 
 static void set_entry_value(XenPTMSIXEntry *e, int offset, uint32_t val)
 {
-if (!(offset % sizeof(*e->latch)))
-{
-e->latch[offset / sizeof(*e->latch)] = val;
-}
+assert(!(offset % sizeof(*e->latch)));
+e->latch[offset / sizeof(*e->latch)] = val;
 }
 
 static void pci_msix_write(void *opaque, hwaddr addr,
@@ -494,6 +492,12 @@ static uint64_t pci_msix_read(void *opaque, hwaddr addr,
 }
 }
 
+static bool pci_msix_accepts(void *opaque, hwaddr addr,
+ unsigned size, bool is_write)
+{
+return !(addr & (size - 1));
+}
+
 static const MemoryRegionOps pci_msix_ops = {
 .read = pci_msix_read,
 .write = pci_msix_write,
@@ -502,7 +506,13 @@ static const MemoryRegionOps pci_msix_ops = {
 .min_access_size = 4,
 .max_access_size = 4,
 .unaligned = false,
+.accepts = pci_msix_accepts
 },
+.impl = {
+.min_access_size = 4,
+.max_access_size = 4,
+.unaligned = false
+}
 };
 
 int xen_pt_msix_init(XenPCIPassthroughState *s, uint32_t base)
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PULL 3/4] xen/pass-through: correctly deal with RW1C bits

From: Jan Beulich 

Introduce yet another mask for them, so that the generic routine can
handle them, at once rendering xen_pt_pmcsr_reg_write() superfluous.

Signed-off-by: Jan Beulich 
Signed-off-by: Stefano Stabellini 
Reviewed-by: Stefano Stabellini 
---
 hw/xen/xen_pt.h |2 ++
 hw/xen/xen_pt_config_init.c |   38 --
 2 files changed, 14 insertions(+), 26 deletions(-)

diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index 4f922f4..3749711 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -113,6 +113,8 @@ struct XenPTRegInfo {
 uint32_t res_mask;
 /* reg read only field mask (ON:RO/ROS, OFF:other) */
 uint32_t ro_mask;
+/* reg read/write-1-clear field mask (ON:RW1C/RW1CS, OFF:other) */
+uint32_t rw1c_mask;
 /* reg emulate field mask (ON:emu, OFF:passthrough) */
 uint32_t emu_mask;
 xen_pt_conf_reg_init init;
diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 4fd7206..185a698 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -179,7 +179,8 @@ static int xen_pt_byte_reg_write(XenPCIPassthroughState *s, 
XenPTReg *cfg_entry,
 *data = XEN_PT_MERGE_VALUE(*val, *data, writable_mask);
 
 /* create value for writing to I/O device register */
-*val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask);
+*val = XEN_PT_MERGE_VALUE(*val, dev_value & ~reg->rw1c_mask,
+  throughable_mask);
 
 return 0;
 }
@@ -197,7 +198,8 @@ static int xen_pt_word_reg_write(XenPCIPassthroughState *s, 
XenPTReg *cfg_entry,
 *data = XEN_PT_MERGE_VALUE(*val, *data, writable_mask);
 
 /* create value for writing to I/O device register */
-*val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask);
+*val = XEN_PT_MERGE_VALUE(*val, dev_value & ~reg->rw1c_mask,
+  throughable_mask);
 
 return 0;
 }
@@ -215,7 +217,8 @@ static int xen_pt_long_reg_write(XenPCIPassthroughState *s, 
XenPTReg *cfg_entry,
 *data = XEN_PT_MERGE_VALUE(*val, *data, writable_mask);
 
 /* create value for writing to I/O device register */
-*val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask);
+*val = XEN_PT_MERGE_VALUE(*val, dev_value & ~reg->rw1c_mask,
+  throughable_mask);
 
 return 0;
 }
@@ -633,6 +636,7 @@ static XenPTRegInfo xen_pt_emu_reg_header0[] = {
 .init_val   = 0x,
 .res_mask   = 0x0007,
 .ro_mask= 0x06F8,
+.rw1c_mask  = 0xF900,
 .emu_mask   = 0x0010,
 .init   = xen_pt_status_reg_init,
 .u.w.read   = xen_pt_word_reg_read,
@@ -944,6 +948,7 @@ static XenPTRegInfo xen_pt_emu_reg_pcie[] = {
 .size   = 2,
 .res_mask   = 0xFFC0,
 .ro_mask= 0x0030,
+.rw1c_mask  = 0x000F,
 .init   = xen_pt_common_reg_init,
 .u.w.read   = xen_pt_word_reg_read,
 .u.w.write  = xen_pt_word_reg_write,
@@ -964,6 +969,7 @@ static XenPTRegInfo xen_pt_emu_reg_pcie[] = {
 .offset = PCI_EXP_LNKSTA,
 .size   = 2,
 .ro_mask= 0x3FFF,
+.rw1c_mask  = 0xC000,
 .init   = xen_pt_common_reg_init,
 .u.w.read   = xen_pt_word_reg_read,
 .u.w.write  = xen_pt_word_reg_write,
@@ -1000,27 +1006,6 @@ static XenPTRegInfo xen_pt_emu_reg_pcie[] = {
  * Power Management Capability
  */
 
-/* write Power Management Control/Status register */
-static int xen_pt_pmcsr_reg_write(XenPCIPassthroughState *s,
-  XenPTReg *cfg_entry, uint16_t *val,
-  uint16_t dev_value, uint16_t valid_mask)
-{
-XenPTRegInfo *reg = cfg_entry->reg;
-uint16_t writable_mask = 0;
-uint16_t throughable_mask = get_throughable_mask(s, reg, valid_mask);
-uint16_t *data = cfg_entry->ptr.half_word;
-
-/* modify emulate register */
-writable_mask = reg->emu_mask & ~reg->ro_mask & valid_mask;
-*data = XEN_PT_MERGE_VALUE(*val, *data, writable_mask);
-
-/* create value for writing to I/O device register */
-*val = XEN_PT_MERGE_VALUE(*val, dev_value & ~PCI_PM_CTRL_PME_STATUS,
-  throughable_mask);
-
-return 0;
-}
-
 /* Power Management Capability reg static information table */
 static XenPTRegInfo xen_pt_emu_reg_pm[] = {
 /* Next Pointer reg */
@@ -1051,11 +1036,12 @@ static XenPTRegInfo xen_pt_emu_reg_pm[] = {
 .size   = 2,
 .init_val   = 0x0008,
 .res_mask   = 0x00F0,
-.ro_mask= 0xE10C,
+.ro_mask= 0x610C,
+.rw1c_mask  = 0x8000,
 .emu_mask   = 0x810B,
 .init   = xen_pt_common_reg_init,
 .u.w.read   = xen_pt_word_reg_read,
-.u.w.write  = xen_pt_pmcsr_reg_write,
+.u.w.write  = xen_pt_word_reg_write,
 },
 {
 .size = 0,
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-dev

[Xen-devel] [PULL 4/4] xen_disk: treat "vhd" as "vpc"

The Xen toolstack uses "vhd" to specify a disk in VHD format, however
the name of the driver in QEMU is "vpc". Replace "vhd" with "vpc", so
that QEMU can find the right driver to use for it.

Signed-off-by: Stefano Stabellini 
---
 hw/block/xen_disk.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 8146650..a48e726 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -825,6 +825,9 @@ static int blk_init(struct XenDevice *xendev)
 if (!strcmp("aio", blkdev->fileproto)) {
 blkdev->fileproto = "raw";
 }
+if (!strcmp("vhd", blkdev->fileproto)) {
+blkdev->fileproto = "vpc";
+}
 if (blkdev->mode == NULL) {
 blkdev->mode = xenstore_read_be_str(&blkdev->xendev, "mode");
 }
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH] x86/VPMU: Check more carefully which bits are allowed to be written to MSRs

2015-12-22 Thread Boris Ostrovsky

Current Intel VPMU emulation needs to perform more checks when writing
PMU MSRs on guest's behalf:
* MSR_CORE_PERF_GLOBAL_CTRL is not checked at all
* MSR_CORE_PERF_FIXED_CTR_CTRL has more reserved bits in PMU version 2
* MSR_CORE_PERF_GLOBAL_OVF_CTRL's bit 61 is allowed on versions greater
* than 2.

We can also use precomputed mask in core2_vpmu_do_interrupt().

Signed-off-by: Boris Ostrovsky 
---
 xen/arch/x86/cpu/vpmu_intel.c |   18 --
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/cpu/vpmu_intel.c b/xen/arch/x86/cpu/vpmu_intel.c
index 3eff1ae..8dc0b48 100644
--- a/xen/arch/x86/cpu/vpmu_intel.c
+++ b/xen/arch/x86/cpu/vpmu_intel.c
@@ -87,7 +87,7 @@ static unsigned int __read_mostly arch_pmc_cnt, fixed_pmc_cnt;
 /* Masks used for testing whether and MSR is valid */
 #define ARCH_CTRL_MASK  (~((1ull << 32) - 1) | (1ull << 21))
 static uint64_t __read_mostly fixed_ctrl_mask, fixed_counters_mask;
-static uint64_t __read_mostly global_ovf_ctrl_mask;
+static uint64_t __read_mostly global_ovf_ctrl_mask, global_ctrl_mask;
 
 /* Total size of PMU registers block (copied to/from PV(H) guest) */
 static unsigned int __read_mostly regs_sz;
@@ -392,6 +392,8 @@ static int core2_vpmu_verify(struct vcpu *v)
 
 if ( core2_vpmu_cxt->global_ovf_ctrl & global_ovf_ctrl_mask )
 return -EINVAL;
+if ( core2_vpmu_cxt->global_ctrl & global_ctrl_mask )
+return -EINVAL;
 
 fixed_ctrl = core2_vpmu_cxt->fixed_ctrl;
 if ( fixed_ctrl & fixed_ctrl_mask )
@@ -627,6 +629,8 @@ static int core2_vpmu_do_wrmsr(unsigned int msr, uint64_t 
msr_content,
 gdprintk(XENLOG_WARNING, "Guest setting of DTS is ignored.\n");
 return 0;
 case MSR_CORE_PERF_GLOBAL_CTRL:
+if ( msr_content & global_ctrl_mask )
+return -EINVAL;
 core2_vpmu_cxt->global_ctrl = msr_content;
 break;
 case MSR_CORE_PERF_FIXED_CTR_CTRL:
@@ -820,7 +824,7 @@ static int core2_vpmu_do_interrupt(struct cpu_user_regs 
*regs)
 if ( is_pmc_quirk )
 handle_pmc_quirk(msr_content);
 core2_vpmu_cxt->global_status |= msr_content;
-msr_content = 0xC007 | ((1 << arch_pmc_cnt) - 1);
+msr_content = ~global_ovf_ctrl_mask;
 wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, msr_content);
 }
 else
@@ -1001,10 +1005,20 @@ int __init core2_vpmu_init(void)
 full_width_write = (caps >> 13) & 1;
 
 fixed_ctrl_mask = ~((1ull << (fixed_pmc_cnt * FIXED_CTR_CTRL_BITS)) - 1);
+if ( version == 2 )
+fixed_ctrl_mask |= 0x444;
 fixed_counters_mask = ~((1ull << core2_get_bitwidth_fix_count()) - 1);
+global_ctrl_mask = ~1ULL << fixed_pmc_cnt) - 1) << 32) |
+ ((1ULL << arch_pmc_cnt) - 1));
 global_ovf_ctrl_mask = ~(0xC000 |
  (((1ULL << fixed_pmc_cnt) - 1) << 32) |
  ((1ULL << arch_pmc_cnt) - 1));
+if ( version > 2)
+/*
+ * Even though we don't support Uncore counters guests should be
+ * able to clear all available overflows.
+ */
+global_ovf_ctrl_mask &= ~(1ULL << 61);
 
 regs_sz = (sizeof(struct xen_pmu_intel_ctxt) - regs_off) +
   sizeof(uint64_t) * fixed_pmc_cnt +
-- 
1.7.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH RFC 01/31] xen/public: Export featureset information in the public API

>>> On 16.12.15 at 22:24,  wrote:
> +/* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC001, FSMAX+2 */
> +#define X86_FEATURE_XSTORE   ((FSMAX+1)*32+ 2) /* on-CPU RNG present (xstore 
> insn) */
> +#define X86_FEATURE_XSTORE_EN((FSMAX+1)*32+ 3) /* on-CPU RNG enabled 
> */
> +#define X86_FEATURE_XCRYPT   ((FSMAX+1)*32+ 6) /* on-CPU crypto (xcrypt 
> insn) */
> +#define X86_FEATURE_XCRYPT_EN((FSMAX+1)*32+ 7) /* on-CPU crypto 
> enabled */
> +#define X86_FEATURE_ACE2 ((FSMAX+1)*32+ 8) /* Advanced Cryptography 
> Engine v2 */
> +#define X86_FEATURE_ACE2_EN  ((FSMAX+1)*32+ 9) /* ACE v2 enabled */
> +#define X86_FEATURE_PHE  ((FSMAX+1)*32+10) /* PadLock Hash 
> Engine */
> +#define X86_FEATURE_PHE_EN   ((FSMAX+1)*32+11) /* PHE enabled */
> +#define X86_FEATURE_PMM  ((FSMAX+1)*32+12) /* PadLock Montgomery 
> Multiplier */
> +#define X86_FEATURE_PMM_EN   ((FSMAX+1)*32+13) /* PMM enabled */

None of these get consumed anywhere - if you already touch this
code, just drop all of them.

> +/*
> + * CPUID leaf shorthand:
> + * - optional 'e', Extended (0x8xxx)
> + * - leaf, uppercase hex
> + * - register, lowercase
> + * - optional subleaf, uppercase hex
> + */
> +#define XEN_FEATURESET_1d 0 /* 0x0001.edx  */
> +#define XEN_FEATURESET_1c 1 /* 0x0001.ecx  */
> +#define XEN_FEATURESET_e1d2 /* 0x8001.edx  */
> +#define XEN_FEATURESET_e1c3 /* 0x8001.ecx  */
> +#define XEN_FEATURESET_Da14 /* 0x000d:1.eax*/
> +#define XEN_FEATURESET_7b05 /* 0x0007:0.ebx*/

This ends up being pretty cryptic.

> +#define XEN_NR_FEATURESET_ENTRIES (XEN_FEATURESET_7b0 + 1)

This shouldn't be exposed, as it will necessarily change sooner or later.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH RFC 02/31] tools/libxc: Use public/featureset.h for cpuid policy generation

>>> On 16.12.15 at 22:24,  wrote:
> --- a/xen/include/public/arch-x86/featureset.h
> +++ b/xen/include/public/arch-x86/featureset.h
> @@ -163,6 +163,7 @@
>  
>  /* Intel-defined CPU features, CPUID level 0x0007:0.ebx, word 5 */
>  #define X86_FEATURE_FSGSBASE  ( 5*32+ 0) /* {RD,WR}{FS,GS}BASE 
> instructions */
> +#define X86_FEATURE_TSC_ADJUST( 5*32+ 1) /* TSC_ADJUST MSR available */

This would probably better go into patch 1.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH RFC 03/31] xen/x86: Store antifeatures inverted in a featureset

>>> On 16.12.15 at 22:24,  wrote:
> --- /dev/null
> +++ b/xen/arch/x86/cpuid/cpuid-private.h
> @@ -0,0 +1,27 @@
> +#ifdef __XEN__
> +
> +#include 
> +
> +#include 
> +
> +#else
> +
> +# error TODO for userspace

I suppose your intentions with this will become apparent in later
patches?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v2] memory-hotplug: add automatic onlining policy for the newly added memory

2015-12-22 Thread Vitaly Kuznetsov

Currently, all newly added memory blocks remain in 'offline' state unless
someone onlines them, some linux distributions carry special udev rules
like:

SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"

to make this happen automatically. This is not a great solution for virtual
machines where memory hotplug is being used to address high memory pressure
situations as such onlining is slow and a userspace process doing this
(udev) has a chance of being killed by the OOM killer as it will probably
require to allocate some memory.

Introduce default policy for the newly added memory blocks in
/sys/devices/system/memory/hotplug_autoonline file with two possible
values: "offline" which preserves the current behavior and "online" which
causes all newly added memory blocks to go online as soon as they're added.
The default is "online" when MEMORY_HOTPLUG_AUTOONLINE kernel config option
is selected.

Cc: Jonathan Corbet 
Cc: Greg Kroah-Hartman 
Cc: Daniel Kiper 
Cc: Dan Williams 
Cc: Tang Chen 
Cc: David Vrabel 
Cc: David Rientjes 
Cc: Andrew Morton 
Cc: Naoya Horiguchi 
Cc: Xishi Qiu 
Cc: Mel Gorman 
Cc: "K. Y. Srinivasan" 
Cc: Igor Mammedov 
Cc: Kay Sievers 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Signed-off-by: Vitaly Kuznetsov 
---
- Changes since 'v1':
  Add 'online' parameter to add_memory_resource() as it is being used by
  xen ballon driver and it adds "empty" memory pages [David Vrabel].
  (I don't completely understand what prevents manual onlining in this
   case as we still have all newly added blocks in sysfs ... this is the
   discussion point.)

- Changes since 'RFC':
  It seems nobody is strongly opposed to the idea, thus non-RFC.
  Change memhp_autoonline to bool, we support only MMOP_ONLINE_KEEP
  and MMOP_OFFLINE for the auto-onlining policy, eliminate 'unknown'
  from show_memhp_autoonline(). [Daniel Kiper]
  Put everything under CONFIG_MEMORY_HOTPLUG_AUTOONLINE, enable the
  feature by default (when the config option is selected) and add
  kernel parameter (nomemhp_autoonline) to disable the functionality
  upon boot when needed.

- RFC:
  I was able to find previous attempts to fix the issue, e.g.:
  http://marc.info/?l=linux-kernel&m=137425951924598&w=2
  http://marc.info/?l=linux-acpi&m=127186488905382
  but I'm not completely sure why it didn't work out and the solution
  I suggest is not 'smart enough', thus 'RFC'.
---
 Documentation/kernel-parameters.txt |  2 ++
 Documentation/memory-hotplug.txt| 26 --
 drivers/base/memory.c   | 36 
 drivers/xen/balloon.c   |  2 +-
 include/linux/memory_hotplug.h  |  6 +-
 mm/Kconfig  |  9 +
 mm/memory_hotplug.c | 25 +++--
 7 files changed, 96 insertions(+), 10 deletions(-)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index 742f69d..652efe1 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2537,6 +2537,8 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
shutdown the other cpus.  Instead use the REBOOT_VECTOR
irq.
 
+   nomemhp_autoonline  Don't automatically online newly added memory.
+
nomoduleDisable module load
 
nopat   [X86] Disable PAT (page attribute table extension of
diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index ce2cfcf..041efac 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -111,8 +111,9 @@ To use memory hotplug feature, kernel must be compiled with 
following
 config options.
 
 - For all memory hotplug
-Memory model -> Sparse Memory  (CONFIG_SPARSEMEM)
-Allow for memory hot-add   (CONFIG_MEMORY_HOTPLUG)
+Memory model -> Sparse Memory (CONFIG_SPARSEMEM)
+Allow for memory hot-add  (CONFIG_MEMORY_HOTPLUG)
+Automatically online hot-added memory (CONFIG_MEMORY_HOTPLUG_AUTOONLINE)
 
 - To enable memory removal, the followings are also necessary
 Allow for memory hot remove(CONFIG_MEMORY_HOTREMOVE)
@@ -254,12 +255,25 @@ If the memory block is online, you'll read "online".
 If the memory block is offline, you'll read "offline".
 
 
-5.2. How to online memory
+5.2. Memory onlining
 
-Even if the memory is hot-added, it is not at ready-to-use state.
-For using newly added memory, you have to "online" the memory block.
+When the memory is hot-added, the kernel decides whether or not to "online"
+it according to the policy which can be read from "hotplug_autoonline" file
+(requires CONFIG_MEMORY_HOTPLUG_AUTOONLINE):
 
-For onlining, you have to write "online" to the memory block's state file as:
+% cat /sys/devices/system/memory/hotplug_autoonline
+
+The default is "online" which means the newly added memory will

Re: [Xen-devel] [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables

On 22/12/15 10:30, Huaitong Han wrote:
> Protection keys define a new 4-bit protection key field(PKEY) in bits 62:59 of
> leaf entries of the page tables.
>
> PKRU register defines 32 bits, there are 16 domains and 2 attribute bits per
> domain in pkru, for each i (0 ≤ i ≤ 15), PKRU[2i] is the access-disable bit 
> for
> protection key i (ADi); PKRU[2i+1] is the write-disable bit for protection key
> i (WDi). PKEY is index to a defined domain.
>
> A fault is considered as a PKU violation if all of the following conditions 
> are
> ture:
> 1.CR4_PKE=1.
> 2.EFER_LMA=1.
> 3.Page is present with no reserved bit violations.
> 4.The access is not an instruction fetch.
> 5.The access is to a user page.
> 6.PKRU.AD=1
> or The access is a data write and PKRU.WD=1
> and either CR0.WP=1 or it is a user access.

One comment you didn't address from v3, however:

"At the moment PFEC_insn_fetch is only set in
hvm_fetch_from_guest_virt() if hvm_nx_enabled() or hvm_smep_enabled()
are true.  Which means that if you *don't* have nx or smep enabled, then
the patch series as is will fault on instruction fetches when it
shouldn't.  (I don't remember anyone mentioning nx or smep being enabled
as a prerequisite for pkeys.)"

I think realistically the only way to address this is to start making
the clean separation between "pfec in" and "pfec out" I mentioned in the
previous discussion.

I've coded up the attached patch, but only compile-tested it.  Can you
give it a look to see if you think it is correct, test it, include it in
your next patch series?

Thanks,
 -George

From b8ec8e14c670215587a4e74c3aaec8dab6f26a8c Mon Sep 17 00:00:00 2001
From: George Dunlap 
Date: Tue, 22 Dec 2015 16:17:22 +
Subject: [PATCH] xen/mm: Clean up pfec handling in gva_to_gfn

At the moment, the pfec argument to gva_to_gfn has two functions:

* To inform guest_walk what kind of access is happenind

* As a value to pass back into the guest in the event of a fault.

Unfortunately this is not quite treated consistently: the hvm_fetch_*
function will "pre-clear" the PFEC_insn_fetch flag before calling
gva_to_gfn; meaning guest_walk doesn't actually know whether a given
access is an instruction fetch or not.  This works now, but will cause
issues when pkeys are introduced, since guest_walk will need to know
whether an access is an instruction fetch even if it doesn't return
PFEC_insn_fetch.

Fix this by making a clean separation for in and out functionalities
of the pfec argument:

1. Always pass in the access type to gva_to_gfn

2. Filter out inappropriate access flags before returning from gva_to_gfn.

(The PFEC_insn_fetch flag should only be passed to the guest if either NX or
SMEP is enabled.  See Intel 64 Developer's Manual, Volume 3, Section 4.7.)

Signed-off-by: George Dunlap 
---
 xen/arch/x86/hvm/hvm.c   |  8 ++--
 xen/arch/x86/mm/hap/guest_walk.c | 10 +-
 xen/arch/x86/mm/shadow/multi.c   |  6 ++
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 08cef1f..aa1f64f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4423,11 +4423,9 @@ enum hvm_copy_result hvm_copy_from_guest_virt(
 enum hvm_copy_result hvm_fetch_from_guest_virt(
 void *buf, unsigned long vaddr, int size, uint32_t pfec)
 {
-if ( hvm_nx_enabled(current) || hvm_smep_enabled(current) )
-pfec |= PFEC_insn_fetch;
 return __hvm_copy(buf, vaddr, size,
   HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
-  PFEC_page_present | pfec);
+  PFEC_page_present | PFEC_insn_fetch | pfec);
 }

 enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
@@ -4449,11 +4447,9 @@ enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
 enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
 void *buf, unsigned long vaddr, int size, uint32_t pfec)
 {
-if ( hvm_nx_enabled(current) || hvm_smep_enabled(current) )
-pfec |= PFEC_insn_fetch;
 return __hvm_copy(buf, vaddr, size,
   HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
-  PFEC_page_present | pfec);
+  PFEC_page_present | PFEC_insn_fetch | pfec);
 }

 unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len)
diff --git a/xen/arch/x86/mm/hap/guest_walk.c b/xen/arch/x86/mm/hap/guest_walk.c
index 11c1b35..2dce111 100644
--- a/xen/arch/x86/mm/hap/guest_walk.c
+++ b/xen/arch/x86/mm/hap/guest_walk.c
@@ -82,7 +82,7 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
 if ( !top_page )
 {
 pfec[0] &= ~PFEC_page_present;
-return INVALID_GFN;
+goto out_tweak_pfec;
 }
 top_mfn = _mfn(page_to_mfn(top_page));

@@ -136,6 +136,14 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
 if ( missing & _PAGE_SHARED )
 pfec[0] = PFEC_page_shared;

+out_tweak_pfec:
+/* 
+ * Intel 64 Volume 3, Section 4.7:

Re: [Xen-devel] [PATCH RFC 04/31] xen/x86: Mask out unknown features from Xen's capabilities

>>> On 16.12.15 at 22:24,  wrote:
> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -324,6 +324,7 @@ void __cpuinit identify_cpu(struct cpuinfo_x86 *c)
>* we do "generic changes."
>*/
>   for (i = 0; i < XEN_NR_FEATURESET_ENTRIES; ++i) {
> + c->x86_capability[i] &= known_features[i];
>   c->x86_capability[i] ^= inverted_features[i];
>   }

Assert that no unknown features slipped into the inverted ones?
But see also below.

> --- a/xen/arch/x86/cpuid/cpuid-private.h
> +++ b/xen/arch/x86/cpuid/cpuid-private.h
> @@ -10,6 +10,32 @@
>  
>  #endif
>  
> +/* Mask of bits which are shared between 1d and e1d. */
> +#define SHARED_1d   \
> +(cpufeat_mask(X86_FEATURE_FPU)   |  \
> + cpufeat_mask(X86_FEATURE_VME)   |  \
> + cpufeat_mask(X86_FEATURE_DE)|  \
> + cpufeat_mask(X86_FEATURE_PSE)   |  \
> + cpufeat_mask(X86_FEATURE_TSC)   |  \
> + cpufeat_mask(X86_FEATURE_MSR)   |  \
> + cpufeat_mask(X86_FEATURE_PAE)   |  \
> + cpufeat_mask(X86_FEATURE_MCE)   |  \
> + cpufeat_mask(X86_FEATURE_CX8)   |  \
> + cpufeat_mask(X86_FEATURE_APIC)  |  \
> + cpufeat_mask(X86_FEATURE_MTRR)  |  \
> + cpufeat_mask(X86_FEATURE_PGE)   |  \
> + cpufeat_mask(X86_FEATURE_MCA)   |  \
> + cpufeat_mask(X86_FEATURE_CMOV)  |  \
> + cpufeat_mask(X86_FEATURE_PAT)   |  \
> + cpufeat_mask(X86_FEATURE_PSE36) |  \
> + cpufeat_mask(X86_FEATURE_MMX)   |  \
> + cpufeat_mask(X86_FEATURE_FXSR))

These are shared for AMD, but zero in the extended leaf for Intel.

> --- a/xen/arch/x86/cpuid/cpuid.c
> +++ b/xen/arch/x86/cpuid/cpuid.c
> @@ -1,5 +1,110 @@
>  #include "cpuid-private.h"
>  
> +const uint32_t known_features[XEN_NR_FEATURESET_ENTRIES] =
> +{
> +[cpufeat_word(X86_FEATURE_FPU)] = (SHARED_1d   |
> +   cpufeat_mask(X86_FEATURE_SEP)   |

I can see how you try to remove fixed relationship, but using
FPU in the array index when no FPU appear in the list is at
least confusing.

Looking at the rest, adding a new feature will become quite a bit
more involved. I think we need some better abstraction here, e.g.
a mechanism similar to the one I used in public/errno.h or the one
Linux uses to populate its syscall tables or /proc/cpuinfo's feature
list to allow multiple inclusion of a single list of definitions where all
properties of each individual feature are maintained on one line.

The tables in this file would then be derived from this (perhaps
via further steps of machine processing).

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH RFC 01/31] xen/public: Export featureset information in the public API

On 22/12/15 16:28, Jan Beulich wrote:
 On 16.12.15 at 22:24,  wrote:
>> +/* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC001, FSMAX+2 
>> */
>> +#define X86_FEATURE_XSTORE  ((FSMAX+1)*32+ 2) /* on-CPU RNG present (xstore 
>> insn) */
>> +#define X86_FEATURE_XSTORE_EN   ((FSMAX+1)*32+ 3) /* on-CPU RNG enabled 
>> */
>> +#define X86_FEATURE_XCRYPT  ((FSMAX+1)*32+ 6) /* on-CPU crypto (xcrypt 
>> insn) */
>> +#define X86_FEATURE_XCRYPT_EN   ((FSMAX+1)*32+ 7) /* on-CPU crypto 
>> enabled */
>> +#define X86_FEATURE_ACE2((FSMAX+1)*32+ 8) /* Advanced Cryptography 
>> Engine v2 */
>> +#define X86_FEATURE_ACE2_EN ((FSMAX+1)*32+ 9) /* ACE v2 enabled */
>> +#define X86_FEATURE_PHE ((FSMAX+1)*32+10) /* PadLock Hash 
>> Engine */
>> +#define X86_FEATURE_PHE_EN  ((FSMAX+1)*32+11) /* PHE enabled */
>> +#define X86_FEATURE_PMM ((FSMAX+1)*32+12) /* PadLock Montgomery 
>> Multiplier */
>> +#define X86_FEATURE_PMM_EN  ((FSMAX+1)*32+13) /* PMM enabled */
> None of these get consumed anywhere - if you already touch this
> code, just drop all of them.

X86_FEATURE_XSTORE gets used in xen/arch/x86/cpu/centaur.c to stash word
0xC001, but nothing actually reads the stashed values.

I could do a precursor patch which drops the stashing of this
information, which will result in NCAPINTS getting shorter by the end of
the series.

>
>> +/*
>> + * CPUID leaf shorthand:
>> + * - optional 'e', Extended (0x8xxx)
>> + * - leaf, uppercase hex
>> + * - register, lowercase
>> + * - optional subleaf, uppercase hex
>> + */
>> +#define XEN_FEATURESET_1d 0 /* 0x0001.edx  */
>> +#define XEN_FEATURESET_1c 1 /* 0x0001.ecx  */
>> +#define XEN_FEATURESET_e1d2 /* 0x8001.edx  */
>> +#define XEN_FEATURESET_e1c3 /* 0x8001.ecx  */
>> +#define XEN_FEATURESET_Da14 /* 0x000d:1.eax*/
>> +#define XEN_FEATURESET_7b05 /* 0x0007:0.ebx*/
> This ends up being pretty cryptic.

Perhaps at a first glance, but there are so many uses that a shorthand
really is needed.  See especially the MSR masking patches towards the
end of the series.

>
>> +#define XEN_NR_FEATURESET_ENTRIES (XEN_FEATURESET_7b0 + 1)
> This shouldn't be exposed, as it will necessarily change sooner or later.

Hmm yes - I think I can alter where this lives, although libxc does need
to be able to get this value.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH RFC XEN v1 09/14] xen: arm: Save and restore GIC state.

On Thu, 17 Dec 2015, Ian Campbell wrote:
> > > +/* Save PPI and SGI states (per-CPU) */
> > > +spin_lock(&v->arch.vgic.lock);
> >
> > vgic_lock
>
> Ah, yes, this function was originally in save.c so didn't have access, but
> now it is in the right place I should use all the correct functions.
>
> > Is the domain already paused at this point? If so, maybe we could get
> > away without locking?
>
> Maybe we could think about it hard enough to rationalise that, but it seems
> safer and easier to just follow the locking rules regardless.
>
> > > +int vgic_irq_save_core(struct vcpu *v, unsigned irq, struct hvm_hw_irq
> > > *s)
> > > +{
> > > +struct pending_irq *p = irq_to_pending(v, irq);
> > > +struct vgic_irq_rank *rank = vgic_rank_irq(v, p->irq);
> > > +
> > > +const bool enabled = test_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
> > > +const bool queued = test_bit(GIC_IRQ_GUEST_QUEUED, &p->status);
> > > +const bool visible = test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status);
> > > +
> > > +spin_lock(&rank->lock);
> >
> > vgic_rank_lock. Maybe we need to take the vgic lock because there might
> > be incoming hardware interrupts for the domain but I don't think we need
> > the rank lock, do we?
>
> As above I think it is simpler and safer to just follow the usual rules
> rather than to special case things.

I don't think we take the rank lock with the vgic lock held anywhere in
the code though. It would be good to avoid nested locking unless
necessary.


> > +{
> > > +/*
> > > + * This uses the current IPRIORITYR, which may differ from
> > > + * when the IRQ was actually made pending. h/w spec probably
> > > + * allows this,  check
> > > + */
> >
> > From this VM point of view the IPRIORITYR is s->priority. I would remove
> > the comment.
>
> If you have an IRQ with priority 55 which fires and then while it is
> pending the priority is changed to 75 then I'm not sure what impact that
> has on the priority which the hardware considers the already pending
> interrupt to have.
>
> To make it more complex consider if there was a second interrupt IRQ
> with priority 65, if the handler for that was the one changed IRQ's
> priority should it expect to get immediately preempted by IRQ
>
> The comment is there because I'm not sure what behaviour the spec requires.

The spec says:

It is IMPLEMENTATION DEFINED whether changing the value of a priority
field changes the priority of an active interrupt.

We could infer that changing the priority of pending interrupts is
supposed to be supported.

I would only write:

   * This uses the current IPRIORITYR, which may differ from
   * when the IRQ was actually made pending.


> In terms of our code without the save restore I think the pending IRQ
> would remain at priority 55 and the guest would get IRQ when IRQ
> EOIs. Whereas upon restore we would reraise it with priority 75 and IRQ
> would appear to preempt IRQ, which would be an interesting difference in
> behaviour from the PoV of the guest.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH RFC 05/31] xen/x86: Collect more CPUID feature words

>>> On 16.12.15 at 22:24,  wrote:
> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -481,7 +481,7 @@ static void __devinit init_amd(struct cpuinfo_x86 *c)
>  
>   if (c->extended_cpuid_level >= 0x8007) {
>   c->x86_power = cpuid_edx(0x8007);
> - if (c->x86_power & (1<<8)) {
> + if (c->x86_power & cpufeat_mask(X86_FEATURE_ITSC)) {

generic_identify() has already run by the time we get here, so no
need to re-read cpuid_edx() (interestingly enough you leverage
this in init_intel()).

> --- a/xen/arch/x86/cpuid/cpuid.c
> +++ b/xen/arch/x86/cpuid/cpuid.c
> @@ -103,6 +103,12 @@ const uint32_t known_features[XEN_NR_FEATURESET_ENTRIES] 
> =
>  cpufeat_mask(X86_FEATURE_RDSEED) 
> |
>  cpufeat_mask(X86_FEATURE_ADX)
> |
>  cpufeat_mask(X86_FEATURE_SMAP)),
> +
> +[cpufeat_word(X86_FEATURE_PREFETCHWT1)] = 
> (cpufeat_mask(X86_FEATURE_PREFETCHWT1)),
> +
> +[cpufeat_word(X86_FEATURE_ITSC)] = (cpufeat_mask(X86_FEATURE_ITSC)),
> +
> +[cpufeat_word(X86_FEATURE_CLZERO)] = (cpufeat_mask(X86_FEATURE_CLZERO)),
>  };

Are these indeed the only ones in their groups that can be safely
marked "known"?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH RFC 06/31] xen/x86: Infrastructure to calculate guest featuresets

>>> On 16.12.15 at 22:24,  wrote:
>  xen/arch/x86/Makefile   |  1 +
>  xen/arch/x86/cpuid.c| 23 +++
>  xen/arch/x86/setup.c|  3 +++
>  xen/include/asm-x86/cpuid.h | 22 ++
>  4 files changed, 49 insertions(+)
>  create mode 100644 xen/arch/x86/cpuid.c

You already introduced xen/arch/x86/cpuid/ in patch 3 - why don't
you put this file there, as e.g. guest.c?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 2/8] xen: arm: fix typo in the description of struct pending_irq->desc

On Tue, 10 Nov 2015, Ian Campbell wrote:
> s/it/if/ makes more sense.
> 
> Signed-off-by: Ian Campbell 

Acked-by: Stefano Stabellini 


>  xen/include/asm-arm/vgic.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index cb51a9e..7d580cc 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -66,7 +66,7 @@ struct pending_irq
>  #define GIC_IRQ_GUEST_ENABLED  3
>  #define GIC_IRQ_GUEST_MIGRATING   4
>  unsigned long status;
> -struct irq_desc *desc; /* only set it the irq corresponds to a physical 
> irq */
> +struct irq_desc *desc; /* only set if the irq corresponds to a physical 
> irq */
>  unsigned int irq;
>  #define GIC_INVALID_LR ~(uint8_t)0
>  uint8_t lr;
> -- 
> 2.1.4
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 1/8] xen: arm: fix indendation of struct vtimer

On Tue, 10 Nov 2015, Ian Campbell wrote:
> Signed-off-by: Ian Campbell 

Acked-by: Stefano Stabellini 


>  xen/include/asm-arm/domain.h | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index e7e40da..c56f06e 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -35,11 +35,11 @@ extern int dom0_11_mapping;
>  #define is_domain_direct_mapped(d) ((d) == hardware_domain && 
> dom0_11_mapping)
>  
>  struct vtimer {
> -struct vcpu *v;
> -int irq;
> -struct timer timer;
> -uint32_t ctl;
> -uint64_t cval;
> +struct vcpu *v;
> +int irq;
> +struct timer timer;
> +uint32_t ctl;
> +uint64_t cval;
>  };
>  
>  struct arch_domain
> -- 
> 2.1.4
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH RFC 07/31] xen/x86: Export host featureset via SYSCTL