Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue
>>> On 22.12.15 at 08:40, wrote: > Maybe, there are still some misunderstanding about your expectation. > Let me summarize it here. > > After Quan's patch-set, there are two types of error code: > - -EOPNOTSUPP > Now we only support and use software way to synchronize the invalidation, > if someone calls queue_invalidate_wait() and passes sw with 0, then > -EOPNOTSUPP is returned (Though this cannot happen in real world, since > queue_invalidate_wait() is called only in one place and 1 is passed in to > 'sw'). > So I am not sure what should we do for this return value, if we really get > that return value, it means the flush is not actually executed, so the iommu > state is incorrect, the data is inconsistent. Do you think what should we do > for this case? Since seeing this error would be a software bug, BUG() or ASSERT() are fine to handle this specific case, if need be. > - -ETIMEDOUT > For this case, Quan has elaborate a lot, IIUIC, the main gap between you > and Quan is you think the error code should be propagated to the up caller, > while in Quan's implementation, he deals with this error in > invalidate_timeout() > and device_tlb_invalidate_timeout(), hence no need to propagated it to > up called, since the handling policy will crash the domain, so we don't think > propagated error code is needed. Even we propagate it, the up caller still > doesn't need to do anything for it. "Handling" an error by e.g. domain_crash() doesn't mean you don't need to also modify (or at the very least inspect) callers: They may continue doing things _assuming_ success. Of course you don't need to domain_crash() at all layers. But errors from lower layers should, at least in most ordinary cases, lead to higher layers bailing instead of continuing. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [qemu-upstream-4.2-testing test] 66816: regressions - FAIL
flight 66816 qemu-upstream-4.2-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/66816/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-i3865 xen-build fail REGR. vs. 62044 build-amd64 5 xen-build fail REGR. vs. 62044 Tests which did not succeed, but are not blocking: build-i386-libvirt1 build-check(1) blocked n/a test-i386-i386-xl-qemuu-winxpsp3 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1)blocked n/a test-amd64-amd64-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-winxpsp3 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-intel 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-amd 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked n/a test-amd64-i386-xend-qemuu-winxpsp3 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a build-amd64-libvirt 1 build-check(1) blocked n/a version targeted for testing: qemuu5081fc1c773d2a83ec7a867f030323b8b6956cd1 baseline version: qemuuc17e602ae64fb24405ebd256679ba9a6f5819086 Last test of basis62044 2015-09-15 15:06:42 Z 97 days Testing same since66542 2015-12-18 16:37:10 Z3 days4 attempts People who touched revisions under test: Stefano Stabellini jobs: build-amd64 fail build-i386 fail build-amd64-libvirt blocked build-i386-libvirt blocked build-amd64-pvopspass build-i386-pvops pass test-amd64-i386-qemuu-rhel6hvm-amd blocked test-amd64-amd64-xl-qemuu-debianhvm-amd64blocked test-amd64-i386-xl-qemuu-debianhvm-amd64 blocked test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked test-amd64-i386-xl-qemuu-ovmf-amd64 blocked test-amd64-amd64-xl-qemuu-win7-amd64 blocked test-amd64-i386-xl-qemuu-win7-amd64 blocked test-amd64-i386-qemuu-rhel6hvm-intel blocked test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 blocked test-amd64-i386-xend-qemuu-winxpsp3 blocked test-amd64-amd64-xl-qemuu-winxpsp3 blocked test-i386-i386-xl-qemuu-winxpsp3 blocked sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. commit 5081fc1c773d2a83ec7a867f030323b8b6956cd1 Author: Stefano Stabellini Date: Fri Dec 18 15:45:14 2015 + xenfb: avoid reading twice the same fields from the shared page Reading twice the same field could give the guest an attack of opportunity. In the case of event->type, gcc could compile the switch statement into a jump table, effectively ending up reading the type field multiple times. This is part of XSA-155. Signed-off-by: Stefano Stabellini commit 74fab2ef4c0ba42af477e9e445c9883cc45cf9e6 Author: Stefano Stabellini Date: Fri Dec 18 15:44:58 2015 + xen/blkif: Avoid double access to src->nr_segments src is stored in shared memory and src->nr_segments is dereferenced twice at the end of the function. If a compiler decides to compile this into two separate memory accesses then the size limitation could be bypassed. Fix it by removing the double access to src->nr_segments. This is part of XSA-155. Signed-off-by: Stefano Stabellini
Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue
>On 22.12.2015 at 4:01pm wrote: > >>> On 22.12.15 at 08:40, wrote: > > Maybe, there are still some misunderstanding about your expectation. > > Let me summarize it here. > > > > After Quan's patch-set, there are two types of error code: > > - -EOPNOTSUPP > > Now we only support and use software way to synchronize the > > invalidation, if someone calls queue_invalidate_wait() and passes sw > > with 0, then -EOPNOTSUPP is returned (Though this cannot happen in > > real world, since > > queue_invalidate_wait() is called only in one place and 1 is passed in to > > 'sw'). > > So I am not sure what should we do for this return value, if we really > > get that return value, it means the flush is not actually executed, so > > the iommu state is incorrect, the data is inconsistent. Do you think > > what should we do for this case? > > Since seeing this error would be a software bug, BUG() or ASSERT() are fine to > handle this specific case, if need be. > > > - -ETIMEDOUT > > For this case, Quan has elaborate a lot, IIUIC, the main gap between > > you and Quan is you think the error code should be propagated to the > > up caller, while in Quan's implementation, he deals with this error in > > invalidate_timeout() > > and device_tlb_invalidate_timeout(), hence no need to propagated it to > > up called, since the handling policy will crash the domain, so we > > don't think propagated error code is needed. Even we propagate it, the > > up caller still doesn't need to do anything for it. > > "Handling" an error by e.g. domain_crash() doesn't mean you don't need to also > modify (or at the very least inspect) callers: They may continue doing things > _assuming_ success. Of course you don't need to domain_crash() at all layers. > But errors from lower layers should, at least in most ordinary cases, lead to > higher layers bailing instead of continuing. > For Device-TLB flush error, I think we need to propagated error code. For IEC/iotlb/context flush error, if panic is acceptable, we can ignore the propagated error code. BTW, it is very challenge / tricky to handle all Of error, and some error is unrecoverable. As mentioned, it looks like rewriting Xen hypervisor. For -EOPNOTSUPP, we can print warning message. If it supports interrupt method, we can return 0 in queue_invalidate_wait(). Feng, thanks for your update! -Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue
> -Original Message- > From: Jan Beulich [mailto:jbeul...@suse.com] > Sent: Tuesday, December 22, 2015 4:01 PM > To: Wu, Feng ; Xu, Quan > Cc: 'andrew.coop...@citrix.com' ; > 'george.dun...@eu.citrix.com' ; Nakajima, Jun > ; Tian, Kevin ; 'xen- > de...@lists.xen.org' ; 'k...@xen.org' ; > 't...@xen.org' > Subject: RE: [Xen-devel] [PATCH v3 0/2] VT-d flush issue > > >>> On 22.12.15 at 08:40, wrote: > > Maybe, there are still some misunderstanding about your expectation. > > Let me summarize it here. > > > > After Quan's patch-set, there are two types of error code: > > - -EOPNOTSUPP > > Now we only support and use software way to synchronize the invalidation, > > if someone calls queue_invalidate_wait() and passes sw with 0, then > > -EOPNOTSUPP is returned (Though this cannot happen in real world, since > > queue_invalidate_wait() is called only in one place and 1 is passed in to > > 'sw'). > > So I am not sure what should we do for this return value, if we really get > > that return value, it means the flush is not actually executed, so the iommu > > state is incorrect, the data is inconsistent. Do you think what should we do > > for this case? > > Since seeing this error would be a software bug, BUG() or ASSERT() > are fine to handle this specific case, if need be. > > > - -ETIMEDOUT > > For this case, Quan has elaborate a lot, IIUIC, the main gap between you > > and Quan is you think the error code should be propagated to the up caller, > > while in Quan's implementation, he deals with this error in > > invalidate_timeout() > > and device_tlb_invalidate_timeout(), hence no need to propagated it to > > up called, since the handling policy will crash the domain, so we don't > > think > > propagated error code is needed. Even we propagate it, the up caller still > > doesn't need to do anything for it. > > "Handling" an error by e.g. domain_crash() doesn't mean you don't > need to also modify (or at the very least inspect) callers: They may > continue doing things _assuming_ success. Of course you don't > need to domain_crash() at all layers. But errors from lower layers > should, at least in most ordinary cases, lead to higher layers bailing > instead of continuing. So there are two questions: 1. Just confirmed with Quan, for IEC/context/iotlb flush timeout, the policy will be Xen panic. Do you agree with this policy? If you do, we don't need to pass the error code to the caller, right? If you don't, we may need more discussion about how the handle this case first before anything else. 2. For Device iotlb flush, the current policy is to crash the domain, in this case we need to check case by case whether the caller need to handle something if timeout is encountered, right? If needed, we need do something there, if not needed, we can just let it be. Thanks, Feng ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
On Mon, 2015-12-21 at 08:32 -0700, Jan Beulich wrote: > > > > On 21.12.15 at 08:21, wrote: > > --- a/xen/arch/x86/mm/guest_walk.c > > +++ b/xen/arch/x86/mm/guest_walk.c > > @@ -90,6 +90,55 @@ static uint32_t set_ad_bits(void *guest_p, void > > *walk_p, int set_dirty) > > return 0; > > } > > > > +#if GUEST_PAGING_LEVELS >= CONFIG_PAGING_LEVELS > > GUEST_PAGING_LEVELS >= 4 (just like further down) The code is modified according Andrew's comments: " This is a latent linking bug for the future when 5 levels comes along. It will probably be best to use the same trick as gw_page_flags to compile it once but use it multiple times. " > > > +bool_t leaf_pte_pkeys_check(struct vcpu *vcpu, > > +uint32_t pfec, uint32_t pte_pkey) > > +{ > > +unsigned int pkru = 0; > > +bool_t pkru_ad, pkru_wd; > > + > > Stray blank line. > > > +bool_t pf = !!(pfec & PFEC_page_present); > > +bool_t uf = !!(pfec & PFEC_user_mode); > > +bool_t wf = !!(pfec & PFEC_write_access); > > +bool_t ff = !!(pfec & PFEC_insn_fetch); > > +bool_t rsvdf = !!(pfec & PFEC_reserved_bit); > > + > > +/* When page is present, PFEC_prot_key is always checked */ > > +if ( !pf || is_pv_vcpu(vcpu) ) > > +return 0; > > I think for a function called "check" together with how its callers > use > it the return value meaning is inverted here. I will change the function name to "pkey_page_fault". > Also the comment seems > inverted wrt the actual check (and is missing a full stop). And > doesn't > key 0 have static meaning, in which case you could bail early (and > namely avoid the expensive RDPKRU further down)? Key 0 have no static meaning, the default key maybe different in different OS. > > > +/* > > + * PKU: additional mechanism by which the paging controls > > + * access to user-mode addresses based on the value in the > > + * PKRU register. A fault is considered as a PKU violation if > > all > > + * of the following conditions are ture: > > + * 1.CR4_PKE=1. > > + * 2.EFER_LMA=1. > > + * 3.page is present with no reserved bit violations. > > + * 4.the access is not an instruction fetch. > > + * 5.the access is to a user page. > > + * 6.PKRU.AD=1 > > + * or The access is a data write and PKRU.WD=1 > > + *and either CR0.WP=1 or it is a user access. > > + */ > > +if ( !hvm_pku_enabled(vcpu) || > > +!hvm_long_mode_enabled(vcpu) || rsvdf || ff ) > > Where's the "user page" check? Also - indentation. it should be: if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) || rsvdf || ff || !(pte_flags & _PAGE_USER) ) > > > +return 0; > > + > > +pkru = read_pkru(); > > +if ( unlikely(pkru) ) > > +{ > > +pkru_ad = read_pkru_ad(pkru, pte_pkey); > > +pkru_wd = read_pkru_wd(pkru, pte_pkey); > > +/* Condition 6 */ > > +if ( pkru_ad || (pkru_wd && wf && (hvm_wp_enabled(vcpu) || > > uf))) > > Ah, uf is being checked here. But according to the comment it could > (and should, again to avoid the RDPKRU) move up. > > > @@ -270,6 +324,12 @@ guest_walk_tables(struct vcpu *v, struct > > p2m_domain *p2m, > > > > pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v); > > > > +#if GUEST_PAGING_LEVELS >= 4 > > +pkey = guest_l2e_get_pkey(gw->l2e); > > +if ( pse2M && leaf_pte_pkeys_check(v, pfec, pkey) ) > > +rc |= _PAGE_PKEY_BITS; > > +#endif > > I think the #ifdef isn't really needed here, if you moved the one > around leaf_pte_pkeys_check() into that function, and if you > perhaps also dropped the "pkey" local variable. guest_l2e_get_pkey has different macro depend on GUEST_PAGING_LEVELS too, and I think it's better to keep it. > > Jan > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue
>>> On 22.12.15 at 09:10, wrote: > >> -Original Message- >> From: Jan Beulich [mailto:jbeul...@suse.com] >> Sent: Tuesday, December 22, 2015 4:01 PM >> To: Wu, Feng ; Xu, Quan >> Cc: 'andrew.coop...@citrix.com' ; >> 'george.dun...@eu.citrix.com' ; Nakajima, Jun >> ; Tian, Kevin ; 'xen- >> de...@lists.xen.org' ; 'k...@xen.org' >> ; >> 't...@xen.org' >> Subject: RE: [Xen-devel] [PATCH v3 0/2] VT-d flush issue >> >> >>> On 22.12.15 at 08:40, wrote: >> > Maybe, there are still some misunderstanding about your expectation. >> > Let me summarize it here. >> > >> > After Quan's patch-set, there are two types of error code: >> > - -EOPNOTSUPP >> > Now we only support and use software way to synchronize the invalidation, >> > if someone calls queue_invalidate_wait() and passes sw with 0, then >> > -EOPNOTSUPP is returned (Though this cannot happen in real world, since >> > queue_invalidate_wait() is called only in one place and 1 is passed in to > 'sw'). >> > So I am not sure what should we do for this return value, if we really get >> > that return value, it means the flush is not actually executed, so the > iommu >> > state is incorrect, the data is inconsistent. Do you think what should we > do >> > for this case? >> >> Since seeing this error would be a software bug, BUG() or ASSERT() >> are fine to handle this specific case, if need be. >> >> > - -ETIMEDOUT >> > For this case, Quan has elaborate a lot, IIUIC, the main gap between you >> > and Quan is you think the error code should be propagated to the up caller, >> > while in Quan's implementation, he deals with this error in >> > invalidate_timeout() >> > and device_tlb_invalidate_timeout(), hence no need to propagated it to >> > up called, since the handling policy will crash the domain, so we don't > think >> > propagated error code is needed. Even we propagate it, the up caller still >> > doesn't need to do anything for it. >> >> "Handling" an error by e.g. domain_crash() doesn't mean you don't >> need to also modify (or at the very least inspect) callers: They may >> continue doing things _assuming_ success. Of course you don't >> need to domain_crash() at all layers. But errors from lower layers >> should, at least in most ordinary cases, lead to higher layers bailing >> instead of continuing. > > So there are two questions: > 1. Just confirmed with Quan, for IEC/context/iotlb flush timeout, the policy > will be Xen panic. Do you agree with this policy? If you do, we don't need > to > pass the error code to the caller, right? If you don't, we may need more > discussion about how the handle this case first before anything else. Just to repeat several earlier statements: BUG() and panic() ought to be avoided wherever possible (i.e. including here). > 2. For Device iotlb flush, the current policy is to crash the domain, in > this case > we need to check case by case whether the caller need to handle something > if timeout is encountered, right? If needed, we need do something there, if > not needed, we can just let it be. "Just let it be" is wrong (and is what led to the bad state the code is in right now in this regard). I gave advice for the direction before: Add __must_check. This will make the compiler point out to you where changes are needed. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue
>>> On 22.12.15 at 09:10, wrote: > For Device-TLB flush error, I think we need to propagated error code. > For IEC/iotlb/context flush error, if panic is acceptable, we can ignore the > propagated error code. BTW, it is very challenge / tricky to handle all > Of error, and some error is unrecoverable. As mentioned, it looks like > rewriting Xen hypervisor. We're moving in circles. In particular I don't believe this last sentence. Re-writing many parts of the hypervisor would be required is you were to return the error to the guest (which technically isn't possible in many case afaict). Having crashed the guest, you don't need to be concerned about unrolling previous (partially completed) operations, so I don't think it's this much of a rewrite. And just to be clear - I hope you recall that the current approach taken to the flush issue is already a compromise far away from where we initially meant the code to move, and it was always clear that the bad error handling state the code is in is going to get in the way of this being a simple fix. It's sad that the people originally having introduced that code can't be held responsible for fixing this up, but that's a situation we find ourselves in all the time. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
>>> On 22.12.15 at 03:54, wrote: > On Mon, 2015-12-21 at 08:07 -0700, Jan Beulich wrote: >> > > > On 21.12.15 at 08:21, wrote: >> > @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input, unsigned >> > int *eax, unsigned int *ebx, >> > /* Don't expose INVPCID to non-hap hvm. */ >> > if ( (count == 0) && !hap_enabled(d) ) >> > *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID); >> > + >> > +/* X86_FEATURE_PKU is not yet implemented for shadow >> > paging. */ >> > +if ( (count == 0) && !hap_enabled(d) ) >> > +*ecx &= ~cpufeat_mask(X86_FEATURE_PKU); >> >> I'm still missing the xsave dependency here. > Xsave dependency deletion becasue we use RDPKRU to get PKRU register > value instead of XSAVE now. What the hypervisor does doesn't matter here. The question is whether from an architectural standpoint XSAVE is a prerequsite. If it is, then you need to clear PKU when _guest_ XSAVE is clear. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue
> On December 22, 2015 4:27pm, wrote: > >>> On 22.12.15 at 09:10, wrote: > > For Device-TLB flush error, I think we need to propagated error code. > > For IEC/iotlb/context flush error, if panic is acceptable, we can > > ignore the propagated error code. BTW, it is very challenge / tricky > > to handle all Of error, and some error is unrecoverable. As mentioned, > > it looks like rewriting Xen hypervisor. > > We're moving in circles. In particular I don't believe this last sentence. > Re-writing many parts of the hypervisor would be required is you were to > return > the error to the guest (which technically isn't possible in many case afaict). > Having crashed the guest, you don't need to be concerned about unrolling > previous (partially completed) operations, so I don't think it's this much of > a > rewrite. > > And just to be clear - I hope you recall that the current approach taken to > the > flush issue is already a compromise far away from where we initially meant the > code to move, and it was always clear that the bad error handling state the > code is in is going to get in the way of this being a simple fix. It's sad > that the > people originally having introduced that code can't be held responsible for > fixing > this up, but that's a situation we find ourselves in all the time. > > Jan Jan, Let's finish our discussion. I accept your idea. But I need to separate it into 3 patch set (It is complicated for me, sometime it makes me crash.): Patch set 1: Device-TLB/iotlb flush error. (send out this week) Patch set 2: context flush error. (need 2 ~ 3 weeks) Patch set 3: iec flush error. (need 3 ~ 4 weeks) If it is acceptable, we can discuss in detail one by one.. Thanks -Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
>>> On 22.12.15 at 09:12, wrote: > On Mon, 2015-12-21 at 08:32 -0700, Jan Beulich wrote: >> > > > On 21.12.15 at 08:21, wrote: >> > --- a/xen/arch/x86/mm/guest_walk.c >> > +++ b/xen/arch/x86/mm/guest_walk.c >> > @@ -90,6 +90,55 @@ static uint32_t set_ad_bits(void *guest_p, void >> > *walk_p, int set_dirty) >> > return 0; >> > } >> > >> > +#if GUEST_PAGING_LEVELS >= CONFIG_PAGING_LEVELS >> >> GUEST_PAGING_LEVELS >= 4 (just like further down) > The code is modified according Andrew's comments: > " > This is a latent linking bug for the future > when 5 levels comes along. > > It will probably be best to use the same trick > as gw_page_flags to compile it once but use it > multiple times. > " Okay, understood. But then you should follow _that_ model (using == instead of >=) instead of inventing a 3rd variant. Similar things should be done in similar ways so that they can be easily recognized being similar. Otoh the function isn't being called from code other than GUEST_PAGING_LEVELS == 4, so at present it could be static, which would take care of the latent linking issue Andrew referred to. >> > +bool_t pf = !!(pfec & PFEC_page_present); >> > +bool_t uf = !!(pfec & PFEC_user_mode); >> > +bool_t wf = !!(pfec & PFEC_write_access); >> > +bool_t ff = !!(pfec & PFEC_insn_fetch); >> > +bool_t rsvdf = !!(pfec & PFEC_reserved_bit); >> > + >> > +/* When page is present, PFEC_prot_key is always checked */ >> > +if ( !pf || is_pv_vcpu(vcpu) ) >> > +return 0; >> >> I think for a function called "check" together with how its callers >> use >> it the return value meaning is inverted here. > I will change the function name to "pkey_page_fault". Perhaps just pkey_fault() then, since it's clear there's no other fault possible here besides a page fault? >> Also the comment seems >> inverted wrt the actual check (and is missing a full stop). And >> doesn't >> key 0 have static meaning, in which case you could bail early (and >> namely avoid the expensive RDPKRU further down)? > Key 0 have no static meaning, the default key maybe different in > different OS. Okay - I thought I had seen something like this mentioned in another sub-thread. >> > @@ -270,6 +324,12 @@ guest_walk_tables(struct vcpu *v, struct >> > p2m_domain *p2m, >> > >> > pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v); >> > >> > +#if GUEST_PAGING_LEVELS >= 4 >> > +pkey = guest_l2e_get_pkey(gw->l2e); >> > +if ( pse2M && leaf_pte_pkeys_check(v, pfec, pkey) ) >> > +rc |= _PAGE_PKEY_BITS; >> > +#endif >> >> I think the #ifdef isn't really needed here, if you moved the one >> around leaf_pte_pkeys_check() into that function, and if you >> perhaps also dropped the "pkey" local variable. > guest_l2e_get_pkey has different macro depend on GUEST_PAGING_LEVELS > too, and I think it's better to keep it. I didn't suggest to drop guest_l2e_get_pkey(). I'd like to see the #ifdef-s go away if at all possible, to help code readability. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [xen-4.3-testing test] 66779: regressions - FAIL
flight 66779 xen-4.3-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/66779/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64 5 xen-build fail REGR. vs. 65650 build-amd64-prev 5 xen-build fail REGR. vs. 65650 build-i386-prev 5 xen-build fail REGR. vs. 65650 build-i3865 xen-build fail REGR. vs. 65650 Tests which did not succeed, but are not blocking: test-amd64-amd64-rumpuserxen-amd64 1 build-check(1) blocked n/a test-amd64-i386-rumpuserxen-i386 1 build-check(1) blocked n/a build-i386-libvirt1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a build-amd64-libvirt 1 build-check(1) blocked n/a build-i386-rumpuserxen1 build-check(1) blocked n/a build-amd64-rumpuserxen 1 build-check(1) blocked n/a test-amd64-amd64-pv 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-xl-multivcpu 1 build-check(1) blocked n/a test-amd64-i386-migrupgrade 1 build-check(1) blocked n/a test-amd64-amd64-xl 1 build-check(1) blocked n/a test-amd64-amd64-migrupgrade 1 build-check(1) blocked n/a test-amd64-i386-pair 1 build-check(1) blocked n/a test-amd64-amd64-pair 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-intel 1 build-check(1) blocked n/a test-amd64-amd64-amd64-pvgrub 1 build-check(1) blocked n/a test-amd64-amd64-i386-pvgrub 1 build-check(1) blocked n/a test-amd64-amd64-xl-credit2 1 build-check(1) blocked n/a test-amd64-i386-pv1 build-check(1) blocked n/a test-amd64-i386-freebsd10-i386 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked n/a test-amd64-i386-qemut-rhel6hvm-amd 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-raw1 build-check(1) blocked n/a test-amd64-amd64-pygrub 1 build-check(1) blocked n/a test-amd64-i386-xl1 build-check(1) blocked n/a test-amd64-amd64-libvirt-vhd 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemut-win7-amd64 1 build-check(1) blocked n/a test-amd64-i386-qemut-rhel6hvm-intel 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-amd 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemut-debianhvm-amd64 1 build-check(1)blocked n/a test-amd64-amd64-xl-qemut-winxpsp3 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-winxpsp3-vcpus1 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-debianhvm-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-winxpsp3 1 build-check(1) blocked n/a test-amd64-amd64-xl-qcow2 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1)blocked n/a test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 1 build-check(1) blocked n/a test-amd64-i386-xend-qemut-winxpsp3 1 build-check(1) blocked n/a test-armhf-armhf-xl-multivcpu 6 xen-boot fail never pass test-armhf-armhf-libvirt-qcow2 6 xen-boot fail never pass test-armhf-armhf-xl-vhd 6 xen-boot fail never pass test-armhf-armhf-libvirt 6 xen-boot fail never pass test-armhf-armhf-xl-credit2 6 xen-boot fail never pass test-armhf-armhf-xl-cubietruck 6 xen-boot fail never pass test-armhf-armhf-libvirt-raw 6 xen-boot fail never pass test-armhf-armhf-xl 6 xen-boot fail never pass test-armhf-armhf-xl-arndale 6 xen-boot fail never pass version targeted for testing: xen 0574a773238f0ecce3873dc525082f6e16ac655b baseline version: xen 54dd84062d4e0f0298508826fc06990415da431e Last test of basis65650 2015-12-10 03:5
[Xen-devel] [qemu-upstream-4.3-testing test] 66831: regressions - FAIL
flight 66831 qemu-upstream-4.3-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/66831/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-i3865 xen-build fail REGR. vs. 62112 build-amd64 5 xen-build fail REGR. vs. 62112 Tests which did not succeed, but are not blocking: build-i386-libvirt1 build-check(1) blocked n/a test-amd64-amd64-libvirt-vhd 1 build-check(1) blocked n/a build-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 1 build-check(1) blocked n/a test-amd64-amd64-amd64-pvgrub 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-raw1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-amd 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-pair 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1)blocked n/a test-amd64-i386-freebsd10-i386 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-xl-multivcpu 1 build-check(1) blocked n/a test-amd64-i386-pv1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-intel 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-amd64 1 build-check(1) blocked n/a test-amd64-i386-pair 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-winxpsp3 1 build-check(1) blocked n/a test-amd64-amd64-pv 1 build-check(1) blocked n/a test-amd64-amd64-i386-pvgrub 1 build-check(1) blocked n/a test-amd64-amd64-xl-qcow2 1 build-check(1) blocked n/a test-amd64-amd64-xl-credit2 1 build-check(1) blocked n/a test-amd64-amd64-xl 1 build-check(1) blocked n/a test-amd64-amd64-pygrub 1 build-check(1) blocked n/a version targeted for testing: qemuu6e184363e64a0610c35ca231bfc73cea56eb02f3 baseline version: qemuub188780861662e8cf1847ec562799b32bb44f05e Last test of basis62112 2015-09-18 04:07:28 Z 95 days Testing same since66538 2015-12-18 16:12:07 Z3 days4 attempts People who touched revisions under test: Stefano Stabellini jobs: build-amd64 fail build-i386 fail build-amd64-libvirt blocked build-i386-libvirt blocked build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl blocked test-amd64-i386-xl blocked test-amd64-i386-qemuu-rhel6hvm-amd blocked test-amd64-amd64-xl-qemuu-debianhvm-amd64blocked test-amd64-i386-xl-qemuu-debianhvm-amd64 blocked test-amd64-i386-freebsd10-amd64 blocked test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked test-amd64-i386-xl-qemuu-ovmf-amd64 blocked test-amd64-amd64-xl-qemuu-win7-amd64 blocked test-amd64-i386-xl-qemuu-win7-amd64 blocked test-amd64-amd64-xl-credit2 blocked test-amd64-i386-freebsd10-i386 blocked test-amd64-i386-qemuu-rhel6hvm-intel blocked test-amd64-amd64-libvirt blocked test-amd64-i386-libvirt blocked test-amd64-amd64-xl-multivcpublocked test-amd64-amd64-pairblocked test-amd64-i386-pair blocked test-amd64-amd64-pv blocked test-amd64-i386-pv
Re: [Xen-devel] Xen-devel Digest, Vol 130, Issue 521
On 2015-12-21 23:58, xen-devel-requ...@lists.xen.org wrote: Send Xen-devel mailing list submissions to xen-devel@lists.xen.org To subscribe or unsubscribe via the World Wide Web, visit http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel or, via email, send a message with subject or body 'help' to xen-devel-requ...@lists.xen.org You can reach the person managing the list at xen-devel-ow...@lists.xen.org When replying, please edit your Subject line so it is more specific than "Re: Contents of Xen-devel digest..." Today's Topics: 1. hvc console not working on 4.4.0 guest (M A Young) I am testing Fedora rawhide in a xen pv guest and I get the error [9.809550] genirq: Flags mismatch irq 8. (hvc_console) vs. (rtc0) [9.809714] hvc_open: request_irq failed with rc -16. during the boot and the hvc console doesn't respond when the boot finishes. I have noticed this I think since the update through various linux 4.4.0 kernels up to the current 4.4.0-0.rc5.git3.1.fc24.x86_64 one. I do get an hvc console if I use the Fedora 23 kernel 4.2.6-300.fc23.x86_64 from before the update. Is this a known problem? I have attached the boot log up to the above error. Michael Young Hi Michael, Yes it's a known problem, the patch for it was just accepted into the tip tree: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=d8c98a1d1488747625ad6044d423406e17e99b7a So it will land in 4.4-final, the 2 patches causing this regression were unfortunately also already backported to the stable trees (despite objection), but hopefully the fix wil also hit stable in the next round. -- Sander ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/mmuext: Unify okay/rc error handling in do_mmuext_op()
>>> On 21.12.15 at 18:16, wrote: > c/s 506db90 "x86/HVM: merge HVM and PVH hypercall tables" introduced a path > whereby 'okay' was used uninitialised, with broke compilation on CentOS 7. It appeared to be used uninitialized, but wasn't in fact (i.e. the outcome - the value rc gets set to - didn't depend on the value due to if ( unlikely(!okay) && !rc ) rc = -EINVAL; being equivalent to if ( !rc && unlikely(!okay) ) rc = -EINVAL; (no side effects for the expressions on either side of the &&). I'll re-word accordingly upon committing, to not give the false impression of there having been other than a cosmetic problem. > Splitting the error handling like this is fragile and unnecessary. Drop the > okay variable entirely and just use rc directly, substituting rc = -EINVAL/0 > for okay = 0/1. > > In addition, two error messages are updated to print rc, and some stray > whitespace is dropped. > > Signed-off-by: Andrew Cooper Reviewed-by: Jan Beulich ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/mmuext: Unify okay/rc error handling in do_mmuext_op()
On 22/12/2015 08:57, Jan Beulich wrote: On 21.12.15 at 18:16, wrote: >> c/s 506db90 "x86/HVM: merge HVM and PVH hypercall tables" introduced a path >> whereby 'okay' was used uninitialised, with broke compilation on CentOS 7. > It appeared to be used uninitialized, but wasn't in fact (i.e. the > outcome - the value rc gets set to - didn't depend on the value > due to > > if ( unlikely(!okay) && !rc ) > rc = -EINVAL; > > being equivalent to > > if ( !rc && unlikely(!okay) ) > rc = -EINVAL; > > (no side effects for the expressions on either side of the &&). > I'll re-word accordingly upon committing, to not give the false > impression of there having been other than a cosmetic problem. There is a real problem. Because the compiler is able to prove that okay is genuinely read uninitialised in one case, the rules concerning undefined behaviour permit it to do anything it wishes, including omitting this if statement. As far as practical problems go however, it is the build breakage which is relevant, and it breaks because of a -Werror=maybe-uninitialised. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue
>>> On 22.12.15 at 09:43, wrote: > Let's finish our discussion. I accept your idea. But I need to separate it > into 3 patch set (It is complicated for me, sometime it makes me crash.): >Patch set 1: Device-TLB/iotlb flush error. (send out this week) >Patch set 2: context flush error. (need 2 ~ 3 weeks) >Patch set 3: iec flush error. (need 3 ~ 4 weeks) > > If it is acceptable, we can discuss in detail one by one.. Splitting up is of course acceptable. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue
> On 22.12.2015 at 5:09pm, wrote: > >>> On 22.12.15 at 09:43, wrote: > > Let's finish our discussion. I accept your idea. But I need to > > separate it into 3 patch set (It is complicated for me, sometime it makes me > crash.): > >Patch set 1: Device-TLB/iotlb flush error. (send out this week) > >Patch set 2: context flush error. (need 2 ~ 3 weeks) > >Patch set 3: iec flush error. (need 3 ~ 4 weeks) > > > > If it is acceptable, we can discuss in detail one by one.. > > Splitting up is of course acceptable. > Thanks Jan. I am getting started to Patch set 1. Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
On Tue, 2015-12-22 at 01:30 -0700, Jan Beulich wrote: > > > > On 22.12.15 at 03:54, wrote: > > On Mon, 2015-12-21 at 08:07 -0700, Jan Beulich wrote: > > > > > > On 21.12.15 at 08:21, wrote: > > > > @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input, > > > > unsigned > > > > int *eax, unsigned int *ebx, > > > > /* Don't expose INVPCID to non-hap hvm. */ > > > > if ( (count == 0) && !hap_enabled(d) ) > > > > *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID); > > > > + > > > > +/* X86_FEATURE_PKU is not yet implemented for shadow > > > > paging. */ > > > > +if ( (count == 0) && !hap_enabled(d) ) > > > > +*ecx &= ~cpufeat_mask(X86_FEATURE_PKU); > > > > > > I'm still missing the xsave dependency here. > > Xsave dependency deletion becasue we use RDPKRU to get PKRU > > register > > value instead of XSAVE now. > > What the hypervisor does doesn't matter here. The question is > whether from an architectural standpoint XSAVE is a prerequsite. > If it is, then you need to clear PKU when _guest_ XSAVE is clear. No, XSAVE is not a prerequsite. > > Jan > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [linux-3.14 test] 65633: regressions - FAIL
On 12/16/2015 09:32 AM, Robert Hu wrote: On Fri, 2015-12-11 at 12:01 +, Ian Campbell wrote: On Fri, 2015-12-11 at 11:48 +0800, Robert Hu wrote: On Fri, 2015-12-11 at 01:16 +, osstest service owner wrote: flight 65633 linux-3.14 real [real] http://logs.test-lab.xenproject.org/osstest/logs/65633/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-rumpuserxen-i386 10 guest-start fail REGR. vs. 64562 [trim...] Hi Ian, Why does it still fails there and even marked 'never pass' now? This is the test of the linux-3.14 branch, not the xen-unstable branch which was failing before. Once the revert passes through the xen-unstable push gate then the linux- 3.14 branch (and most other branches) will pick up that change. I don't know why the nested test case has never passed on the 3.14 branch, someone would have to investigate if they think that is a problem. Guangrong will take a look at this along with the nested failure issue on Xen side. Yes. I am still working on it and please be patient, i need some time to build up the knowledge and figure the root case out. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] libxc: Defer initialization of start_page for HVM guests
El 22/12/15 a les 0.45, Boris Ostrovsky ha escrit: > With commit 8c45adec18e0 ("libxc: create unmapped initrd in domain > builder if supported") location of ramdisk may not be available to > HVMlite guests by the time alloc_magic_pages_hvm() is invoked if the > guest supports unmapped initrd. > > Since the only operation related to allocating magic pages in that > routine is allocation of HVMlite start info we can move everything > else to a later point such as xc_dom_arch.start_info(). > > Signed-off-by: Boris Ostrovsky > --- > I am not sure xc_dom_arch.start_info() is the right place neither since we > still are doing things that have nothing to do with start_info. > > tools/libxc/xc_dom_x86.c | 45 ++--- > 1 files changed, 30 insertions(+), 15 deletions(-) > > diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c > index 3960875..f64079e 100644 > --- a/tools/libxc/xc_dom_x86.c > +++ b/tools/libxc/xc_dom_x86.c > @@ -585,6 +585,32 @@ static void build_hvm_info(void *hvm_info_page, struct > xc_dom_image *dom) > > static int alloc_magic_pages_hvm(struct xc_dom_image *dom) > { > +struct xc_dom_seg seg; > +size_t start_info_size = sizeof(struct hvm_start_info); > + > +if ( dom->device_model ) > +return 0; > + > +if ( dom->cmdline ) > +start_info_size += ROUNDUP(strlen(dom->cmdline) + 1, 8); > +if ( dom->ramdisk_blob ) > +/* Limited to one module. */ > +start_info_size += sizeof( struct hvm_modlist_entry); ^ extra space. > + > +if ( xc_dom_alloc_segment(dom, &seg, "HVMlite start info", 0, > + start_info_size) ) > +{ > +DOMPRINTF("Unable to reserve memory for the start info"); > +return -1; > +} > + > +dom->start_info_pfn = seg.pfn; > + > +return 0; > +} > + > +static int start_info_hvm(struct xc_dom_image *dom) > +{ > unsigned long i; > void *hvm_info_page; > uint32_t *ident_pt, domid = dom->guest_domid; > @@ -636,7 +662,6 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom) > > if ( !dom->device_model ) > { > -struct xc_dom_seg seg; > struct hvm_start_info *start_info; > char *cmdline; > struct hvm_modlist_entry *modlist; > @@ -653,17 +678,9 @@ static int alloc_magic_pages_hvm(struct xc_dom_image > *dom) > if ( dom->ramdisk_blob ) > start_info_size += sizeof(*modlist); /* Limited to one module. */ > > -rc = xc_dom_alloc_segment(dom, &seg, "HVMlite start info", 0, > - start_info_size); > -if ( rc != 0 ) > -{ > -DOMPRINTF("Unable to reserve memory for the start info"); > -goto out; > -} > - > start_page = xc_map_foreign_range(xch, domid, start_info_size, IMHO, I would stash start_info_size somewhere inside xc_dom_image in order to avoid calculating it twice. Also, why is everything done inside of alloc_magic_pages_hvm moved to start_info_hvm? AFAICT we should only need to move the code that fills the hvm_start_info struct, but the rest of the code already present in alloc_magic_pages_hvm could be left as-is. Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
>>> On 22.12.15 at 10:25, wrote: > On Tue, 2015-12-22 at 01:30 -0700, Jan Beulich wrote: >> > > > On 22.12.15 at 03:54, wrote: >> > On Mon, 2015-12-21 at 08:07 -0700, Jan Beulich wrote: >> > > > > > On 21.12.15 at 08:21, wrote: >> > > > @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input, >> > > > unsigned >> > > > int *eax, unsigned int *ebx, >> > > > /* Don't expose INVPCID to non-hap hvm. */ >> > > > if ( (count == 0) && !hap_enabled(d) ) >> > > > *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID); >> > > > + >> > > > +/* X86_FEATURE_PKU is not yet implemented for shadow >> > > > paging. */ >> > > > +if ( (count == 0) && !hap_enabled(d) ) >> > > > +*ecx &= ~cpufeat_mask(X86_FEATURE_PKU); >> > > >> > > I'm still missing the xsave dependency here. >> > Xsave dependency deletion becasue we use RDPKRU to get PKRU >> > register >> > value instead of XSAVE now. >> >> What the hypervisor does doesn't matter here. The question is >> whether from an architectural standpoint XSAVE is a prerequsite. >> If it is, then you need to clear PKU when _guest_ XSAVE is clear. > No, XSAVE is not a prerequsite. I.e. OSes are expected to deal with both cases (PKU w/ XSAVE and PKU w/o XSAVE)? Sounds like a recipe for trouble, but well - okay. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/mmuext: Unify okay/rc error handling in do_mmuext_op()
On 21/12/2015 17:50, Boris Ostrovsky wrote: > On 12/21/2015 12:16 PM, Andrew Cooper wrote: >> c/s 506db90 "x86/HVM: merge HVM and PVH hypercall tables" introduced >> a path >> whereby 'okay' was used uninitialised, with broke compilation on >> CentOS 7. >> >> Splitting the error handling like this is fragile and unnecessary. >> Drop the >> okay variable entirely and just use rc directly, substituting rc = >> -EINVAL/0 >> for okay = 0/1. >> >> In addition, two error messages are updated to print rc, and some stray >> whitespace is dropped. >> >> Signed-off-by: Andrew Cooper >> --- >> CC: Jan Beulich >> CC: Boris Ostrovsky > > Reviewed-by: Boris Ostrovsky > > with a couple of style nits: > >> @@ -3258,14 +3254,14 @@ long do_mmuext_op( >> break; >> } >> - >> + >> case MMUEXT_TLB_FLUSH_LOCAL: >> if ( likely(d == pg_owner) ) >> flush_tlb_local(); >> else >> rc = -EPERM; >> break; >> - >> + >> case MMUEXT_INVLPG_LOCAL: >> if ( unlikely(d != pg_owner) ) >> rc = -EPERM; > > These look like stray changes. They are specifically the stray bits of whitespace referred to in the commit message. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] hvc console not working on 4.4.0 guest
On 21/12/15 22:57, M A Young wrote: > I am testing Fedora rawhide in a xen pv guest and I get the error > [9.809550] genirq: Flags mismatch irq 8. (hvc_console) vs. > (rtc0) > [9.809714] hvc_open: request_irq failed with rc -16. > during the boot and the hvc console doesn't respond when the boot > finishes. I have noticed this I think since the update through various > linux 4.4.0 kernels up to the current 4.4.0-0.rc5.git3.1.fc24.x86_64 one. > I do get an hvc console if I use the Fedora 23 kernel 4.2.6-300.fc23.x86_64 > from before the update. > Is this a known problem? I have attached the boot log up to the above > error. Fixed by "x86: Xen PV guests don't have the rtc_cmos platform device" which hasn't yet been applied. As a workaround try using a 2 VCPU guest. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue
> On 22.12.2015 at 5:09pm, wrote: > >>> On 22.12.15 at 09:43, wrote: > > Let's finish our discussion. I accept your idea. But I need to > > separate it into 3 patch set (It is complicated for me, sometime it makes me > crash.): > >Patch set 1: Device-TLB/iotlb flush error. (send out this week) > >Patch set 2: context flush error. (need 2 ~ 3 weeks) > >Patch set 3: iec flush error. (need 3 ~ 4 weeks) > > > > If it is acceptable, we can discuss in detail one by one.. > > Splitting up is of course acceptable. Jan, Just update the combination, Patch set 1: Device-TLB flush error. (send out this week) Patch set 2: context/iotlb flush error. (need 2 ~ 3 weeks) Patch set 3: iec flush error. (need 3 ~ 4 weeks) -Quan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH V5 0/6] x86/hvm: pkeys, add memory protection-key support
Changes in v5: *Add static for opt_pku. *Update commit message for some patches. *Add condition 5:the access is to a user page to pkey_fault, and simplify #ifdef for guest_walk_tables patch. *Don't write XSTATE_PKRU to PV's xcr0. *count == 0 is combined in hvm_cpuid function. Changes in v4: *Delete gva2gfn patch, and when page is present, PFEC_prot_key is always checked. *Use RDPKRU instead of xsave_read because RDPKRU does cost less. *Squash pkeys patch and pkru patch to guest_walk_tables patch. Changes in v3: *Get CPUID:ospke depend on guest cpuid instead of host hardware capable, and Move cpuid patch to the last of patches. *Move opt_pku to cpu/common.c. *Use MASK_EXTR for get_pte_pkeys. *Add quoting for pkru macro, and use static inline pkru_read functions. *Rebase get_xsave_addr for updated codes, and add uncompressed format support for xsave state. *Use fpu_xsave instead of vcpu_save_fpu, and adjust the code style for leaf_pte_pkeys_check. *Add parentheses for PFEC_prot_key of gva2gfn funcitons. Changes in v2: *Rebase all patches in staging branch *Disable X86_CR4_PKE on hypervisor, and delete pkru_read/write functions, and use xsave state read to get pkru value. *Delete the patch that adds pkeys support for do_page_fault. *Add pkeys support for gva2gfn so that setting _PAGE_PK_BIT in the return value can get propagated to the guest correctly. The protection-key feature provides an additional mechanism by which IA-32e paging controls access to usermode addresses. Hardware support for protection keys for user pages is enumerated with CPUID feature flag CPUID.7.0.ECX[3]:PKU. Software support is CPUID.7.0.ECX[4]:OSPKE with the setting of CR4.PKE(bit 22). When CR4.PKE = 1, every linear address is associated with the 4-bit protection key located in bits 62:59 of the paging-structure entry that mapped the page containing the linear address. The PKRU register determines, for each protection key, whether user-mode addresses with that protection key may be read or written. The PKRU register (protection key rights for user pages) is a 32-bit register with the following format: for each i (0 ≤ i ≤ 15), PKRU[2i] is the access-disable bit for protection key i (ADi); PKRU[2i+1] is the write-disable bit for protection key i (WDi). Software can use the RDPKRU and WRPKRU instructions with ECX = 0 to read and write PKRU. In addition, the PKRU register is XSAVE-managed state and can thus be read and written by instructions in the XSAVE feature set. PFEC.PK (bit 5) is defined as protection key violations. The specification of Protection Keys can be found at SDM (4.6.2, volume 3) http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf. Huaitong Han (6): x86/hvm: pkeys, add the flag to enable Memory Protection Keys x86/hvm: pkeys, add pkeys support when setting CR4 x86/hvm: pkeys, disable pkeys for guests in non-paging mode x86/hvm: pkeys, add pkeys support for guest_walk_tables x86/hvm: pkeys, add xstate support for pkeys x86/hvm: pkeys, add pkeys support for cpuid handling docs/misc/xen-command-line.markdown | 10 ++ tools/libxc/xc_cpufeature.h | 2 ++ tools/libxc/xc_cpuid_x86.c | 6 ++-- xen/arch/x86/cpu/common.c | 10 +- xen/arch/x86/hvm/hvm.c | 41 xen/arch/x86/hvm/vmx/vmx.c | 11 --- xen/arch/x86/mm/guest_walk.c| 64 + xen/arch/x86/mm/hap/guest_walk.c| 3 ++ xen/arch/x86/xstate.c | 3 +- xen/include/asm-x86/cpufeature.h| 6 +++- xen/include/asm-x86/guest_pt.h | 12 +++ xen/include/asm-x86/hvm/hvm.h | 2 ++ xen/include/asm-x86/page.h | 5 +++ xen/include/asm-x86/processor.h | 39 ++ xen/include/asm-x86/x86_64/page.h | 12 +++ xen/include/asm-x86/xstate.h| 4 ++- 16 files changed, 205 insertions(+), 25 deletions(-) -- 2.4.3 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH V5 2/6] x86/hvm: pkeys, add pkeys support when setting CR4
CR4.PKE(bit 22) enables support for the RDPKRU/WRPKRU instructions to access PKRU and the protection keys check (a page fault trigger). This patch adds X86_CR4_PKE to hvm_cr4_guest_reserved_bits so that CR4 check works before setting. Signed-off-by: Huaitong Han Reviewed-by: Andrew Cooper --- xen/arch/x86/hvm/hvm.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index db0aeba..59916ed 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1924,6 +1924,7 @@ static unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v, leaf1_edx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_VME)]; leaf1_ecx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_PCID)]; leaf7_0_ebx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)]; +leaf7_0_ecx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_PKU)]; } return ~(unsigned long) @@ -1959,7 +1960,9 @@ static unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v, (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMEP) ? X86_CR4_SMEP : 0) | (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMAP) ? - X86_CR4_SMAP : 0)); + X86_CR4_SMAP : 0) | + (leaf7_0_ecx & cpufeat_mask(X86_FEATURE_PKU) ? + X86_CR4_PKE : 0)); } static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h) -- 2.4.3 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH V5 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys
This patch adds the flag("pku") to enable Memory Protection Keys, and updates the markdown. Signed-off-by: Huaitong Han Reviewed-by: Andrew Cooper --- docs/misc/xen-command-line.markdown | 10 ++ xen/arch/x86/cpu/common.c | 10 +- xen/include/asm-x86/cpufeature.h| 6 +- 3 files changed, 24 insertions(+), 2 deletions(-) diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown index c103894..36ecf80 100644 --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -1177,6 +1177,16 @@ This option can be specified more than once (up to 8 times at present). ### ple\_window > `= ` +### pku +> `= ` + +> Default: `true` + +Flag to enable Memory Protection Keys. + +The protection-key feature provides an additional mechanism by which IA-32e +paging controls access to usermode addresses. + ### psr (Intel) > `= List of ( cmt: | rmid_max: | cat: | > cos_max: | cdp: )` diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c index 310ec85..a018855 100644 --- a/xen/arch/x86/cpu/common.c +++ b/xen/arch/x86/cpu/common.c @@ -22,6 +22,10 @@ boolean_param("xsave", use_xsave); bool_t opt_arat = 1; boolean_param("arat", opt_arat); +/* pku: Flag to enable Memory Protection Keys (default on). */ +static bool_t opt_pku = 1; +boolean_param("pku", opt_pku); + unsigned int opt_cpuid_mask_ecx = ~0u; integer_param("cpuid_mask_ecx", opt_cpuid_mask_ecx); unsigned int opt_cpuid_mask_edx = ~0u; @@ -270,7 +274,8 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 *c) if ( c->cpuid_level >= 0x0007 ) cpuid_count(0x0007, 0, &tmp, &c->x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)], - &tmp, &tmp); + &c->x86_capability[cpufeat_word(X86_FEATURE_PKU)], + &tmp); } /* @@ -323,6 +328,9 @@ void __cpuinit identify_cpu(struct cpuinfo_x86 *c) if ( cpu_has_xsave ) xstate_init(c); + if ( !opt_pku ) + setup_clear_cpu_cap(X86_FEATURE_PKU); + /* * The vendor-specific functions might have changed features. Now * we do "generic changes." diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h index af127cf..ef96514 100644 --- a/xen/include/asm-x86/cpufeature.h +++ b/xen/include/asm-x86/cpufeature.h @@ -11,7 +11,7 @@ #include -#define NCAPINTS 8 /* N 32-bit words worth of info */ +#define NCAPINTS 9 /* N 32-bit words worth of info */ /* Intel-defined CPU features, CPUID level 0x0001 (edx), word 0 */ #define X86_FEATURE_FPU(0*32+ 0) /* Onboard FPU */ @@ -163,6 +163,10 @@ #define X86_FEATURE_ADX(7*32+19) /* ADCX, ADOX instructions */ #define X86_FEATURE_SMAP (7*32+20) /* Supervisor Mode Access Prevention */ +/* Intel-defined CPU features, CPUID level 0x0007:0 (ecx), word 8 */ +#define X86_FEATURE_PKU(8*32+ 3) /* Protection Keys for Userspace */ +#define X86_FEATURE_OSPKE (8*32+ 4) /* OS Protection Keys Enable */ + #define cpufeat_word(idx) ((idx) / 32) #define cpufeat_bit(idx) ((idx) % 32) #define cpufeat_mask(idx) (_AC(1, U) << cpufeat_bit(idx)) -- 2.4.3 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH V5 5/6] x86/hvm: pkeys, add xstate support for pkeys
This patch adds xstate support for pkeys. Signed-off-by: Huaitong Han Reviewed-by: Andrew Cooper --- xen/arch/x86/xstate.c| 3 ++- xen/include/asm-x86/xstate.h | 4 +++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index b65da38..baa4b58 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -487,7 +487,8 @@ void xstate_init(struct cpuinfo_x86 *c) * Set CR4_OSXSAVE and run "cpuid" to get xsave_cntxt_size. */ set_in_cr4(X86_CR4_OSXSAVE); -if ( !set_xcr0(feature_mask) ) +/* PKU is disabled on PV mode. */ +if ( !set_xcr0(feature_mask & ~XSTATE_PKRU) ) BUG(); if ( bsp ) diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h index 12d939b..f7c41ba 100644 --- a/xen/include/asm-x86/xstate.h +++ b/xen/include/asm-x86/xstate.h @@ -34,13 +34,15 @@ #define XSTATE_OPMASK (1ULL << 5) #define XSTATE_ZMM (1ULL << 6) #define XSTATE_HI_ZMM (1ULL << 7) +#define XSTATE_PKRU(1ULL << 9) #define XSTATE_LWP (1ULL << 62) /* AMD lightweight profiling */ #define XSTATE_FP_SSE (XSTATE_FP | XSTATE_SSE) #define XCNTXT_MASK(XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | \ XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY) #define XSTATE_ALL (~(1ULL << 63)) -#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR) +#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR | \ +XSTATE_PKRU) #define XSTATE_LAZY(XSTATE_ALL & ~XSTATE_NONLAZY) #define XSTATE_COMPACTION_ENABLED (1ULL << 63) -- 2.4.3 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
Protection keys define a new 4-bit protection key field(PKEY) in bits 62:59 of leaf entries of the page tables. PKRU register defines 32 bits, there are 16 domains and 2 attribute bits per domain in pkru, for each i (0 ≤ i ≤ 15), PKRU[2i] is the access-disable bit for protection key i (ADi); PKRU[2i+1] is the write-disable bit for protection key i (WDi). PKEY is index to a defined domain. A fault is considered as a PKU violation if all of the following conditions are ture: 1.CR4_PKE=1. 2.EFER_LMA=1. 3.Page is present with no reserved bit violations. 4.The access is not an instruction fetch. 5.The access is to a user page. 6.PKRU.AD=1 or The access is a data write and PKRU.WD=1 and either CR0.WP=1 or it is a user access. Signed-off-by: Huaitong Han --- xen/arch/x86/mm/guest_walk.c | 64 +++ xen/arch/x86/mm/hap/guest_walk.c | 3 ++ xen/include/asm-x86/guest_pt.h| 12 xen/include/asm-x86/hvm/hvm.h | 2 ++ xen/include/asm-x86/page.h| 5 +++ xen/include/asm-x86/processor.h | 39 xen/include/asm-x86/x86_64/page.h | 12 7 files changed, 137 insertions(+) diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c index 18d1acf..9cdd607 100644 --- a/xen/arch/x86/mm/guest_walk.c +++ b/xen/arch/x86/mm/guest_walk.c @@ -90,6 +90,57 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty) return 0; } +extern bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec, +uint32_t pte_flags, uint32_t pte_pkey); +#if GUEST_PAGING_LEVELS == CONFIG_PAGING_LEVELS +bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec, +uint32_t pte_flags, uint32_t pte_pkey) +{ +unsigned int pkru = 0; +bool_t pkru_ad, pkru_wd; + +bool_t pf = !!(pfec & PFEC_page_present); +bool_t uf = !!(pfec & PFEC_user_mode); +bool_t wf = !!(pfec & PFEC_write_access); +bool_t ff = !!(pfec & PFEC_insn_fetch); +bool_t rsvdf = !!(pfec & PFEC_reserved_bit); + +/* When page isn't present, PKEY isn't checked. */ +if ( !pf || is_pv_vcpu(vcpu) ) +return 0; + +/* + * PKU: additional mechanism by which the paging controls + * access to user-mode addresses based on the value in the + * PKRU register. A fault is considered as a PKU violation if all + * of the following conditions are ture: + * 1.CR4_PKE=1. + * 2.EFER_LMA=1. + * 3.page is present with no reserved bit violations. + * 4.the access is not an instruction fetch. + * 5.the access is to a user page. + * 6.PKRU.AD=1 + * or The access is a data write and PKRU.WD=1 + *and either CR0.WP=1 or it is a user access. + */ +if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) || +rsvdf || ff || !(pte_flags & _PAGE_USER) ) +return 0; + +pkru = read_pkru(); +if ( unlikely(pkru) ) +{ +pkru_ad = read_pkru_ad(pkru, pte_pkey); +pkru_wd = read_pkru_wd(pkru, pte_pkey); +/* Condition 6 */ +if ( pkru_ad || (pkru_wd && wf && (hvm_wp_enabled(vcpu) || uf))) +return 1; +} + +return 0; +} +#endif + /* Walk the guest pagetables, after the manner of a hardware walker. */ /* Because the walk is essentially random, it can cause a deadlock * warning in the p2m locking code. Highly unlikely this is an actual @@ -107,6 +158,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m, guest_l3e_t *l3p = NULL; guest_l4e_t *l4p; #endif +unsigned int pkey; uint32_t gflags, mflags, iflags, rc = 0; bool_t smep = 0, smap = 0; bool_t pse1G = 0, pse2M = 0; @@ -190,6 +242,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m, goto out; /* Get the l3e and check its flags*/ gw->l3e = l3p[guest_l3_table_offset(va)]; +pkey = guest_l3e_get_pkey(gw->l3e); gflags = guest_l3e_get_flags(gw->l3e) ^ iflags; if ( !(gflags & _PAGE_PRESENT) ) { rc |= _PAGE_PRESENT; @@ -199,6 +252,9 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m, pse1G = (gflags & _PAGE_PSE) && guest_supports_1G_superpages(v); +if ( pse1G && pkey_fault(v, pfec, gflags, pkey) ) +rc |= _PAGE_PKEY_BITS; + if ( pse1G ) { /* Generate a fake l1 table entry so callers don't all @@ -270,6 +326,10 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m, pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v); +pkey = guest_l2e_get_pkey(gw->l2e); +if ( pse2M && pkey_fault(v, pfec, gflags, pkey) ) +rc |= _PAGE_PKEY_BITS; + if ( pse2M ) { /* Special case: this guest VA is in a PSE superpage, so there's @@ -330,6 +390,10 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m, goto out; } rc |= ((gflags & mflags) ^ mflags); + +pkey = guest_l1e_get_pkey(gw->l1e); +if ( pkey_fault(v, pfec, gfla
[Xen-devel] [PATCH V5 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
This patch adds pkeys support for cpuid handing. Pkeys hardware support is CPUID.7.0.ECX[3]:PKU. software support is CPUID.7.0.ECX[4]:OSPKE and it reflects the support setting of CR4.PKE. Signed-off-by: Huaitong Han --- tools/libxc/xc_cpufeature.h | 2 ++ tools/libxc/xc_cpuid_x86.c | 6 -- xen/arch/x86/hvm/hvm.c | 36 +++- 3 files changed, 29 insertions(+), 15 deletions(-) diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h index c3ddc80..f6a9778 100644 --- a/tools/libxc/xc_cpufeature.h +++ b/tools/libxc/xc_cpufeature.h @@ -141,5 +141,7 @@ #define X86_FEATURE_ADX 19 /* ADCX, ADOX instructions */ #define X86_FEATURE_SMAP20 /* Supervisor Mode Access Protection */ +/* Intel-defined CPU features, CPUID level 0x0007:0 (ecx) */ +#define X86_FEATURE_PKU 3 #endif /* __LIBXC_CPUFEATURE_H */ diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c index 8882c01..1ce979b 100644 --- a/tools/libxc/xc_cpuid_x86.c +++ b/tools/libxc/xc_cpuid_x86.c @@ -427,9 +427,11 @@ static void xc_cpuid_hvm_policy(xc_interface *xch, bitmaskof(X86_FEATURE_ADX) | bitmaskof(X86_FEATURE_SMAP) | bitmaskof(X86_FEATURE_FSGSBASE)); +regs[2] &= bitmaskof(X86_FEATURE_PKU); } else -regs[1] = 0; -regs[0] = regs[2] = regs[3] = 0; +regs[1] = regs[2] = 0; + +regs[0] = regs[3] = 0; break; case 0x000d: diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 59916ed..076313b 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -4572,7 +4572,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx, __clear_bit(X86_FEATURE_APIC & 31, edx); /* Fix up OSXSAVE. */ -if ( cpu_has_xsave ) +if ( *ecx & cpufeat_mask(X86_FEATURE_XSAVE) ) *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) ? cpufeat_mask(X86_FEATURE_OSXSAVE) : 0; @@ -4585,21 +4585,31 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx, *edx &= ~cpufeat_mask(X86_FEATURE_PSE36); break; case 0x7: -if ( (count == 0) && !cpu_has_smep ) -*ebx &= ~cpufeat_mask(X86_FEATURE_SMEP); +if ( count == 0 ) +{ +if ( !cpu_has_smep ) +*ebx &= ~cpufeat_mask(X86_FEATURE_SMEP); -if ( (count == 0) && !cpu_has_smap ) -*ebx &= ~cpufeat_mask(X86_FEATURE_SMAP); +if ( !cpu_has_smap ) +*ebx &= ~cpufeat_mask(X86_FEATURE_SMAP); -/* Don't expose MPX to hvm when VMX support is not available */ -if ( (count == 0) && - (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) || - !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS)) ) -*ebx &= ~cpufeat_mask(X86_FEATURE_MPX); +/* Don't expose MPX to hvm when VMX support is not available */ +if (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) || + !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS)) +*ebx &= ~cpufeat_mask(X86_FEATURE_MPX); -/* Don't expose INVPCID to non-hap hvm. */ -if ( (count == 0) && !hap_enabled(d) ) -*ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID); +if ( !hap_enabled(d) ) +{ +/* Don't expose INVPCID to non-hap hvm. */ +*ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID); +/* X86_FEATURE_PKU is not yet implemented for shadow paging. */ +*ecx &= ~cpufeat_mask(X86_FEATURE_PKU); +} + +if ( *ecx & cpufeat_mask(X86_FEATURE_PKU)) +*ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ? + cpufeat_mask(X86_FEATURE_OSPKE) : 0; +} break; case 0xb: /* Fix the x2APIC identifier. */ -- 2.4.3 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH V5 3/6] x86/hvm: pkeys, disable pkeys for guests in non-paging mode
This patch disables pkeys for guest in non-paging mode, However XEN always uses paging mode to emulate guest non-paging mode, To emulate this behavior, pkeys needs to be manually disabled when guest switches to non-paging mode. Signed-off-by: Huaitong Han Reviewed-by: Andrew Cooper --- xen/arch/x86/hvm/vmx/vmx.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index 2581e97..7123912 100644 --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -1380,12 +1380,13 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr) if ( !hvm_paging_enabled(v) ) { /* - * SMEP/SMAP is disabled if CPU is in non-paging mode in hardware. - * However Xen always uses paging mode to emulate guest non-paging - * mode. To emulate this behavior, SMEP/SMAP needs to be manually - * disabled when guest VCPU is in non-paging mode. + * SMEP/SMAP/PKU is disabled if CPU is in non-paging mode in + * hardware. However Xen always uses paging mode to emulate guest + * non-paging mode. To emulate this behavior, SMEP/SMAP/PKU needs + * to be manually disabled when guest VCPU is in non-paging mode. */ -v->arch.hvm_vcpu.hw_cr[4] &= ~(X86_CR4_SMEP | X86_CR4_SMAP); +v->arch.hvm_vcpu.hw_cr[4] &= +~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE); } __vmwrite(GUEST_CR4, v->arch.hvm_vcpu.hw_cr[4]); break; -- 2.4.3 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
On 22/12/15 10:30, Huaitong Han wrote: > --- a/xen/include/asm-x86/processor.h > +++ b/xen/include/asm-x86/processor.h > @@ -373,6 +373,45 @@ static always_inline void clear_in_cr4 (unsigned long > mask) > write_cr4(read_cr4() & ~mask); > } > > +static inline unsigned int read_pkru(void) > +{ > +unsigned int pkru; > + > +/* > + * _PAGE_PKEY_BITS have a conflict with _PAGE_GNTTAB used by PV guests, > + * so that X86_CR4_PKE is disable on hypervisor, RDPKRU instruction can > + * be used with temporarily setting CR4.PKE. > + */ > +set_in_cr4(X86_CR4_PKE); > +asm volatile (".byte 0x0f,0x01,0xee" > +: "=a" (pkru) : "c" (0) : "dx"); > +clear_in_cr4(X86_CR4_PKE); You can't use set/clear_in_cr4 here, as it modifies the global mmu_cr4_features variable. I would recommend unsigned long cr4 = read_cr4(); write_cr4(cr4 | X86_CR4_PKE); ... write_cr4(cr4); instead. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/4] x86: misc printk() adjustments
On 21/12/15 14:48, Jan Beulich wrote: On 21.12.15 at 15:41, wrote: >> On 21/12/15 14:34, Jan Beulich wrote: >>> --- a/xen/arch/x86/io_apic.c >>> +++ b/xen/arch/x86/io_apic.c >>> @@ -2310,13 +2310,14 @@ int ioapic_guest_read(unsigned long phys >>> return 0; >>> } >>> >>> -#define WARN_BOGUS_WRITE(f, a...) \ >>> -dprintk(XENLOG_INFO, "\n%s: "\ >>> -"apic=%d, pin=%d, irq=%d\n" \ >>> -"%s: new_entry=%08x\n" \ >>> -"%s: " f, __FUNCTION__, apic, pin, irq,\ >>> -__FUNCTION__, *(u32 *)&rte, \ >>> -__FUNCTION__ , ##a ) >>> +#define WARN_BOGUS_WRITE(f, a...) \ >>> +dprintk(XENLOG_INFO, "\n" \ >>> +XENLOG_INFO "%s: apic=%d, pin=%d, irq=%d\n" \ >>> +XENLOG_INFO "%s: new_entry=%08x\n" \ >>> +XENLOG_INFO "%s: " f "\n", \ >>> +__func__, apic, pin, irq, \ >>> +__func__, *(u32 *)&rte, \ >>> +__func__, ##a ) >>> >>> int ioapic_guest_write(unsigned long physbase, unsigned int reg, u32 val) >>> { >>> @@ -2388,7 +2389,7 @@ int ioapic_guest_write(unsigned long phy >>> rte.vector = desc->arch.vector; >>> if ( *(u32*)&rte != ret ) >>> WARN_BOGUS_WRITE("old_entry=%08x pirq=%d\n%s: " >>> - "Attempt to modify IO-APIC pin for in-use >>> IRQ!\n", >>> + "Attempt to modify IO-APIC pin for in-use >>> IRQ!", >>> ret, pirq, __FUNCTION__); >> Given that this is the sole user of WARN_BOGUS_WRITE(), I would >> recommend folding it all together in a simple dprintk(), and remove some >> of the redundant information, and fixing the resulting message to take >> up fewer lines. > If you feel strongly about this I could certainly do so, but I did > consider that option and decided against it to retain the option > of using the macro (again?) in a second place. I don't forsee it being useful to use again. Either way, Reviewed-by: Andrew Cooper ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [xen-4.6-testing test] 66794: regressions - FAIL
flight 66794 xen-4.6-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/66794/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 65639 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail REGR. vs. 65639 test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 65639 Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-vhd 15 guest-start.2fail blocked in 65639 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 65639 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass version targeted for testing: xen cdd96b9be5a57ae7795750c3ae83ea6783030688 baseline version: xen 850bcd0f42e4912b6605a2caa42d5c466ed58bd1 Last test of basis65639 2015-12-09 21:22:40 Z 12 days Failing since 66394 2015-12-15 18:17:32 Z6 days6 attempts Testing same since66537 2015-12-18 16:12:08 Z3 days3 attempts People who touched revisions under test: Boris Ostrovsky Brendan Gregg Daniel Kiper Dario Faggioli David Vrabel Ed Swierk Haozhong Zhang Ian Campbell Ian Jackson Jan Beulich Kevin Tian jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-prev pass build-i386-prev pass build-amd64-pvopspass
Re: [Xen-devel] [PATCH] tools: make flask utils build unconditional
On 22/12/15 04:46, Doug Goldstein wrote: > The flask utilities only have dependencies on libxc so there's no > downside to always building it. Distros and projects based on Xen can > put these utilities into a different package and not install them for > everyone. Prior to this change FLASK_ENABLE needs to be a top level > variable however after this change FLASK_ENABLE only affects xen/. > > Signed-off-by: Doug Goldstein CC'ing Daniel as this is a flask related change. FWIW, Reviewed-by: Andrew Cooper > --- > tools/Makefile | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/tools/Makefile b/tools/Makefile > index 820ca40..2f773fd 100644 > --- a/tools/Makefile > +++ b/tools/Makefile > @@ -4,7 +4,7 @@ include $(XEN_ROOT)/tools/Rules.mk > SUBDIRS-y := > SUBDIRS-y += include > SUBDIRS-y += libxc > -SUBDIRS-$(FLASK_ENABLE) += flask > +SUBDIRS-y += flask > SUBDIRS-y += xenstore > SUBDIRS-y += misc > SUBDIRS-y += examples ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable-smoke test] 66936: tolerable all pass - PUSHED
flight 66936 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/66936/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass version targeted for testing: xen 8e4d18e113c9d914a591c7bd8159c990122f50dc baseline version: xen bf925a9f1254391749f569c1b8fc606036340488 Last test of basis66868 2015-12-21 18:08:31 Z0 days Testing same since66936 2015-12-22 10:01:41 Z0 days1 attempts People who touched revisions under test: Andrew Cooper Feng Wu Jan Beulich jobs: build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=xen-unstable-smoke + revision=8e4d18e113c9d914a591c7bd8159c990122f50dc + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 8e4d18e113c9d914a591c7bd8159c990122f50dc + branch=xen-unstable-smoke + revision=8e4d18e113c9d914a591c7bd8159c990122f50dc + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=xen + xenbranch=xen-unstable-smoke + qemuubranch=qemu-upstream-unstable + '[' xxen = xlinux ']' + linuxbranch= + '[' xqemu-upstream-unstable = x ']' + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable-smoke + prevxenbranch=xen-unstable + '[' x8e4d18e113c9d914a591c7bd8159c990122f50dc = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/rumpuser-xen.git ++ : git ++ : git://xenbits.xen.org/rumpuser-xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git +++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src +++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src +++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src '[fetch=try]' +++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src +++ local 'options=[fetch=try]' getconfig GitCacheProxy perl -e ' use Osstest; readglobalconfig(); print $c{"GitCacheProxy"} or die $!; ' +++ local cache=git://cache:9419/ +++ '[' xgit://cache:9419/ '!=' x ']' +++ echo 'git://c
Re: [Xen-devel] [PATCHv5 1/3] rwlock: Add per-cpu reader-writer lock infrastructure
On 18/12/15 16:08, Malcolm Crossley wrote: > Per-cpu read-write locks allow for the fast path read case to have > low overhead by only setting/clearing a per-cpu variable for using > the read lock. The per-cpu read fast path also avoids locked > compare swap operations which can be particularly slow on coherent > multi-socket systems, particularly if there is heavy usage of the > read lock itself. > > The per-cpu reader-writer lock uses a local variable to control > the read lock fast path. This allows a writer to disable the fast > path and ensures the readers switch to using the underlying > read-write lock implementation instead of the per-cpu variable. > > Once the writer has taken the write lock and disabled the fast path, > it must poll the per-cpu variable for all CPU's which have entered > the critical section for the specific read-write lock the writer is > attempting to take. This design allows for a single per-cpu variable > to be used for read/write locks belonging to seperate data structures. > If a two or more different per-cpu read lock(s) are taken > simultaneously then the per-cpu data structure is not used and the > implementation takes the read lock of the underlying read-write lock, > this behaviour is equivalent to the slow path in terms of performance. > The per-cpu rwlock is not recursion safe for taking the per-cpu > read lock because there is no recursion count variable, this is > functionally equivalent to standard spin locks. > > Slow path readers which are unblocked, set the per-cpu variable and > drop the read lock. This simplifies the implementation and allows > for fairness in the underlying read-write lock to be taken > advantage of. > > There is more overhead on the per-cpu write lock path due to checking > each CPUs fast path per-cpu variable but this overhead is likely be > hidden by the required delay of waiting for readers to exit the > critical section. The loop is optimised to only iterate over > the per-cpu data of active readers of the rwlock. The cpumask_t for > tracking the active readers is stored in a single per-cpu data > location and thus the write lock is not pre-emption safe. Therefore > the per-cpu write lock can only be used with interrupts disabled. > > Signed-off-by: Malcolm Crossley > -- > Changes since v4: > - Use static inlines for the percpu_owner check > > Changes since v3: > - Fix the ASSERTS for the percpu_owner check > > Changes since v2: > - Remove stray hard tabs > - Added owner field to percpu_rwlock_t for debug builds. This allows > for ASSERTs to be added which verify the correct global percpu > structure is used for the local initialised percpu_rwlock lock > > Changes since v1: > - Removed restriction on taking two or more different percpu rw > locks simultaneously > - Moved fast-path/slow-path barrier to be per lock instead of global > - Created seperate percpu_rwlock_t type and added macros to > initialise new type > - Added helper macros for using the percpu rwlock itself > - Moved active_readers cpumask off the stack and into a percpu > variable > --- > xen/common/spinlock.c| 45 + > xen/include/asm-arm/percpu.h | 5 ++ > xen/include/asm-x86/percpu.h | 6 +++ > xen/include/xen/percpu.h | 4 ++ > xen/include/xen/spinlock.h | 115 > +++ > 5 files changed, 175 insertions(+) > > diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c > index 7f89694..901c2b2 100644 > --- a/xen/common/spinlock.c > +++ b/xen/common/spinlock.c > @@ -10,6 +10,8 @@ > #include > #include > > +static DEFINE_PER_CPU(cpumask_t, percpu_rwlock_readers); > + > #ifndef NDEBUG > > static atomic_t spin_debug __read_mostly = ATOMIC_INIT(0); > @@ -492,6 +494,49 @@ int _rw_is_write_locked(rwlock_t *lock) > return (lock->lock == RW_WRITE_FLAG); /* writer in critical section? */ > } > > +void _percpu_write_lock(percpu_rwlock_t **per_cpudata, > +percpu_rwlock_t *percpu_rwlock) > +{ > +unsigned int cpu; > +cpumask_t *rwlock_readers = &this_cpu(percpu_rwlock_readers); > + > +/* Validate the correct per_cpudata variable has been provided. */ > +_percpu_rwlock_owner_check(per_cpudata, percpu_rwlock); > + > +/* > + * First take the write lock to protect against other writers or slow > + * path readers. > + */ > +write_lock(&percpu_rwlock->rwlock); > + > +/* Now set the global variable so that readers start using read_lock. */ > +percpu_rwlock->writer_activating = 1; > +smp_mb(); > + > +/* Using a per cpu cpumask is only safe if there is no nesting. */ > +ASSERT(!in_irq()); > +cpumask_copy(rwlock_readers, &cpu_online_map); > + > +/* Check if there are any percpu readers in progress on this rwlock. */ > +for ( ; ; ) > +{ > +for_each_cpu(cpu, rwlock_readers) > +{ > +/* > + * Remove any percpu readers not contending on this rwlock > +
[Xen-devel] [seabios baseline-only test] 38549: tolerable FAIL
This run is configured for baseline tests only. flight 38549 seabios real [real] http://osstest.xs.citrite.net/~osstest/testlogs/logs/38549/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-qemuu-winxpsp3 9 windows-install fail baseline untested Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass version targeted for testing: seabios 76327b9f32a009245c215f4a3c5d58a01b5310ae baseline version: seabios 948f3c9b5f74f265bd171ee367a632a8f7c81f68 Last test of basis38383 2015-11-29 07:20:01 Z 23 days Testing same since38549 2015-12-22 02:02:51 Z0 days1 attempts People who touched revisions under test: Marcel Apfelbaum Michael S. Tsirkin Stefan Berger jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass test-amd64-amd64-qemuu-nested-amdfail test-amd64-i386-qemuu-rhel6hvm-amd pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-i386-xl-qemuu-debianhvm-amd64 pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass test-amd64-amd64-xl-qemuu-win7-amd64 fail test-amd64-i386-xl-qemuu-win7-amd64 fail test-amd64-amd64-qemuu-nested-intel pass test-amd64-i386-qemuu-rhel6hvm-intel pass test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 pass test-amd64-amd64-xl-qemuu-winxpsp3 fail test-amd64-i386-xl-qemuu-winxpsp3pass sg-report-flight on osstest.xs.citrite.net logs: /home/osstest/logs images: /home/osstest/images Logs, config files, etc. are available at http://osstest.xs.citrite.net/~osstest/testlogs/logs Test harness code can be found at http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary Push not applicable. commit 76327b9f32a009245c215f4a3c5d58a01b5310ae Author: Marcel Apfelbaum Date: Mon Dec 7 14:05:14 2015 +0200 fw/pci: do not automatically allocate IO region for PCIe bridges PCIe downstream ports (Root Ports and switches Downstream Ports) appear to firmware as PCI-PCI bridges and a 4K IO space is allocated for them even if there is no device behind them requesting IO space, all that for hotplug purpose. However, PCIe devices can work without IO, so there is no need to allocate IO space for hotplug. Acked-by: Michael S. Tsirkin Signed-off-by: Marcel Apfelbaum commit 320df85010401c6d3ee189a54c63b937d1dcade9 Author: Stefan Berger Date: Mon Nov 30 11:14:19 2015 -0500 tpm: Add a menu for TPM configuration This patch adds an new menu entry to the main menu. This menu item enables the user to enter a TPM control menu which allows control of those aspects of the TPM's state that can only be controlled while in the firmware and while physical presence can be asserted. If the machine has a TPM, the boot menu will look as follows, with the new menu item accessible by pressing the 't' key. Select boot device: 1. ata0-1: QEMU HARDDISK ATA-7 Hard-Disk (6144 MiBytes) 2. Legacy option rom 3. iPXE (PCI 00:03.0) t. TPM Menu Upon pressing t the TPM submenu will be shown: The Trusted Platform Module (TPM) is a hardware device in this machine. I
Re: [Xen-devel] [PATCHv5 3/3] p2m: convert p2m rwlock to percpu rwlock
On 18/12/15 16:08, Malcolm Crossley wrote: > The per domain p2m read lock suffers from significant contention when > performance multi-queue block or network IO due to the parallel > grant map/unmaps/copies occuring on the DomU's p2m. > > On multi-socket systems, the contention results in the locked compare swap > operation failing frequently which results in a tight loop of retries of the > compare swap operation. As the coherency fabric can only support a specific > rate of compare swap operations for a particular data location then taking > the read lock itself becomes a bottleneck for p2m operations. > > Percpu rwlock p2m performance with the same configuration is approximately > 64 gbit/s vs the 48 gbit/s with grant table percpu rwlocks only. > > Oprofile was used to determine the initial overhead of the read-write locks > and to confirm the overhead was dramatically reduced by the percpu rwlocks. > > Note: altp2m users will not achieve a gain if they take an altp2m read lock > simultaneously with the main p2m lock. > > Signed-off-by: Malcolm Crossley Looks good, thanks: Reviewed-by: George Dunlap If you end up switching to always using the per-cpu pointer stored in the percpu_rwlock struct, you can retain the Reviewed-by for those changes. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv6 2/2] x86/ept: defer the invalidation until the p2m lock is released
On 18/12/15 13:50, David Vrabel wrote: > Holding the p2m lock while calling ept_sync_domain() is very expensive > since it does a on_selected_cpus() call. IPIs on many socket machines > can be very slows and on_selected_cpus() is serialized. > > It is safe to defer the invalidate until the p2m lock is released > except for two cases: > > 1. When freeing a page table page (since partial translations may be >cached). > 2. When reclaiming a zero page as part of PoD. > > For these cases, add p2m_tlb_flush_sync() calls which will immediately > perform the invalidate before the page is freed or reclaimed. There are at least two other places in the PoD code where the "remove -> check -> add to cache -> unlock" pattern exist; and it looks to me like there are other places where races might occur (e.g., p2m_paging_evict(), which does remove -> scrub -> put -> unlock; p2m_altp2m_propagate_change(), which does remove -> put -> unlock). Part of me wonders whether, rather than making callers that need it remember to do a flush, it might be better to explicitly pass in P2M_FLUSH or P2M_CAN_DEFER when calling p2m_set_entry, just to make people think about the fact that the p2m change may not actually take effect until later. Any thoughts on that? Comments on the current approach inline. > diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c > index c094320..43c7f1b 100644 > --- a/xen/arch/x86/mm/p2m-ept.c > +++ b/xen/arch/x86/mm/p2m-ept.c > @@ -263,6 +263,7 @@ static void ept_free_entry(struct p2m_domain *p2m, > ept_entry_t *ept_entry, int l > unmap_domain_page(epte); > } > > +p2m_tlb_flush_sync(p2m); > p2m_free_ptp(p2m, mfn_to_page(ept_entry->mfn)); It's probably worth a comment here pointing out that even if this function is called several times (e.g., if you replace a load of 4k entries with a 1G entry), the actual flush will only happen the first time. > +static void ept_flush_and_unlock(struct p2m_domain *p2m, bool_t unlock) > +{ > +p2m->need_flush = 0; > +if ( unlock ) > +mm_write_unlock(&p2m->lock); > +ept_sync_domain_mask(p2m, p2m->domain->domain_dirty_cpumask); > } Having a function called "flush_and_unlock", with a boolean as to whether to unlock or not, just seems a bit wonky. Wouldn't it make more sense to have the hook just named "flush_sync()", and move the unlocking out in the generic p2m code (where you already have the check for need_flush)? > diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h > index fa46dd9..9c394c2 100644 > --- a/xen/include/asm-x86/p2m.h > +++ b/xen/include/asm-x86/p2m.h > @@ -261,6 +261,10 @@ struct p2m_domain { >unsigned long gfn, l1_pgentry_t *p, >l1_pgentry_t new, unsigned int > level); > long (*audit_p2m)(struct p2m_domain *p2m); > +void (*flush_and_unlock)(struct p2m_domain *p2m, bool_t > unlock); > + > +unsigned int defer_flush; > +bool_t need_flush; It's probably worth a comment that at the moment calling flush_and_unlock() is gated on need_flush; so it's OK not to implement flush_and_unlock() as long as you never set need_flush. Thanks, -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen Security Advisory 155 (CVE-2015-8550) - paravirtualized drivers incautious about shared memory
MiniOS for QEMU stubdom has frontends, such as mini-os/blkfront.c and mini-os/netfront.c, not backends. Cheers, Stefano On Mon, 21 Dec 2015, Eric Shelton wrote: > Seeing as "All OSes providing PV backends are susceptible," doesn't this > include MiniOS for QEMU stubdom as well? > Are there patches available for mini-os/blkfront.c, mini-os/netfront.c, and > mini-os/pcifront.c? I didn't see > anything for this. > Best, > Eric > > On Thu, Dec 17, 2015 at 1:36 PM, Xen.org security team > wrote: > > - Topal: Output generated on Tue Dec 22 12:23:44 GMT 2015 - > Topal: GPG output starts - gpg: > no valid OpenPGP data found. gpg: processing message failed: eof - > Topal: GPG output ends - > - Topal: Original message starts - -BEGIN PGP SIGNED > MESSAGE- > Hash: SHA1 > > Xen Security Advisory CVE-2015-8550 / XSA-155 > version 6 > > paravirtualized drivers incautious about shared memory contents > > UPDATES IN VERSION 6 > > > Correct CREDITS section. > > ISSUE DESCRIPTION > = > > The compiler can emit optimizations in the PV backend drivers which > can lead to double fetch vulnerabilities. Specifically the shared > memory between the frontend and backend can be fetched twice (during > which time the frontend can alter the contents) possibly leading to > arbitrary code execution in backend. > > IMPACT > == > > Malicious guest administrators can cause denial of service. If driver > domains are not in use, the impact can be a host crash, or privilege > escalation. > > VULNERABLE SYSTEMS > == > > Systems running PV or HVM guests are vulnerable. > > ARM and x86 systems are vulnerable. > > All OSes providing PV backends are susceptible, this includes > Linux and NetBSD. By default the Linux distributions compile kernels > with optimizations. > > MITIGATION > == > > There is no mitigation. > > CREDITS > === > > This issue was discovered by Felix Wilhelm (ERNW Research, KIT / > Operating Systems Group). > > RESOLUTION > == > > Applying the appropriate attached patches should fix the problem for > PV backends. Note only that PV backends are fixed; PV frontend > patches will be developed and released (publicly) after the embargo > date. > > Please note that there is a bug in some versions of gcc, > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 which can cause the > construct used in RING_COPY_REQUEST() to be ineffective in some > circumstances. We have determined that this is only the case when the > structure being copied consists purely of bitfields. The Xen PV > protocols updated here do not use bitfields in this way and therefore > these patches are not subject to that bug. However authors of third > party PV protocols should take this into consideration. > > Linux v4.4: > xsa155-linux-xsa155-0001-xen-Add-RING_COPY_REQUEST.patch > > xsa155-linux-xsa155-0002-xen-netback-don-t-use-last-request-to-determine-mini.patch > > xsa155-linux-xsa155-0003-xen-netback-use-RING_COPY_REQUEST-throughout.patch > > xsa155-linux-xsa155-0004-xen-blkback-only-read-request-operation-from-shared-.patch > > xsa155-linux-xsa155-0005-xen-blkback-read-from-indirect-descriptors-only-once.patch > xsa155-linux-xsa155-0006-xen-scsiback-safely-copy-requests.patch > > xsa155-linux-xsa155-0007-xen-pciback-Save-xen_pci_op-commands-before-processi.patch > Linux v4.[0,1,2,3] > All the above patches except #5 will apply, please use: > > xsa155-linux43-0005-xen-blkback-read-from-indirect-descriptors-only-once.patch > Linux v3.19: > All the above patches except #5 and #6 will apply, please use: > > xsa155-linux43-0005-xen-blkback-read-from-indirect-descriptors-only-once.patch > xsa155-linux319-0006-xen-scsiback-safely-copy-requests.patch > > qemu-xen: > xsa155-qemu-qdisk-double-access.patch > xsa155-qemu-xenfb.patch > > qemu-traditional: > xsa155-qemut-qdisk-double-access.patch > xsa155-qemut-xenfb.patch > > NetBSD 7.0: > xsa155-netbsd-xsa155-0001-netbsd-xen-Add-RING_COPY_REQUEST.patch > > xsa155-netbsd-xsa155-0002-netbsd-netback-Use-RING_COPY_REQUEST-instead-of-RING.patch > > xsa155-netbsd-xsa155-0003-netbsd-ring-Add-barrier-to-provide-an-compiler-barri.patch > > xsa155-netbsd-xsa155-0004-netbsd-block-only-read-request-operation-from-shared.patch > > xsa155-netbsd-xsa155-0005-netbsd-pciback-Operate-on-local-version-of-xen_pci_o.patch > > xen: > xsa155-xen-0001-xen-Add-RING_COPY_REQUEST.patch > xsa155-xen-000
Re: [Xen-devel] [PATCH V5 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys
>>> On 22.12.15 at 11:30, wrote: > This patch adds the flag("pku") to enable Memory Protection Keys, and updates > the markdown. > > Signed-off-by: Huaitong Han > Reviewed-by: Andrew Cooper Please avoid resending patches which got applied already. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] build: save generated config in /boot
On 12/21/15 6:53 AM, Jan Beulich wrote: On 21.12.15 at 13:40, wrote: >> On 21/12/15 12:11, Jan Beulich wrote: >> On 18.12.15 at 22:35, wrote: Since we now support changing Xen options with Kconfig, we should save the configuration that was used to build up Xen. This will save it in /boot alongside the installed xen.gz and call it xen-$(FULLVERSION).config Suggested-by: Ian Campbell Signed-off-by: Doug Goldstein --- xen/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/xen/Makefile b/xen/Makefile index 9023863..460b977 100644 --- a/xen/Makefile +++ b/xen/Makefile @@ -58,6 +58,7 @@ _install: $(TARGET)$(CONFIG_XEN_INSTALL_SUFFIX) ln -f -s $(T)-$(XEN_FULLVERSION)$(Z) $(D)$(BOOT_DIR)/$(T)$(Z) [ -d "$(D)$(DEBUG_DIR)" ] || $(INSTALL_DIR) $(D)$(DEBUG_DIR) $(INSTALL_DATA) $(TARGET)-syms $(D)$(DEBUG_DIR)/$(T)-syms-$(XEN_FULLVERSION) + $(INSTALL_DATA) $(KCONFIG_CONFIG) >> $(D)$(BOOT_DIR)/$(T)-$(XEN_FULLVERSION).config >>> Was it really suggested to put this into /boot? It has no business >>> being there... >> >> A typical Linux has: >> >> andrewcoop@andrewcoop:/local/xen.git/xen$ ls -lA /boot/ >> total 21044 >> -rw-r--r-- 1 root root 157726 Dec 15 15:40 config-3.16.0-4-amd64 >> drwxr-xr-x 5 root root 4096 Dec 18 07:53 grub >> -rw-r--r-- 1 root root 15535038 Dec 18 07:52 initrd.img-3.16.0-4-amd64 >> drwx-- 2 root root16384 Oct 19 11:11 lost+found >> -rw-r--r-- 1 root root 2676277 Dec 15 15:40 System.map-3.16.0-4-amd64 >> -rw-r--r-- 1 root root 3118928 Dec 15 15:37 vmlinuz-3.16.0-4-amd64 >> >> which at the very least is consistent between Debian and RHEL derivatives. >> >> IMO, doing the same for Xen is sensible. > > I'm afraid I have to disagree - just because Linux does things a > certain way doesn't mean that the only (sensible) way. Imo /boot > should hold exclusively stuff needed for booting. Remember how > we moved xen-syms out of there not so long ago? You could have > objected to that change too, considering that Linux puts > System.map and sometimes also the uncompressed vmlinux there. > Yet I think it was a correct move, and the change here should > follow that model instead of Linux'es. > > Jan > I don't think you're correct here Jan. Looking at most distros I have at hand (I'll admit I don't have SuSE available to me). The uncompressed kernel is not in /boot. In fact all of them use /usr/lib/debug so Xen did follow Linux here. -- Doug Goldstein signature.asc Description: OpenPGP digital signature ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [qemu-mainline test] 66848: regressions - FAIL
flight 66848 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/66848/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-i3865 xen-build fail REGR. vs. 66433 build-i386-xsm5 xen-build fail REGR. vs. 66433 build-amd64-xsm 5 xen-build fail REGR. vs. 66433 build-amd64 5 xen-build fail REGR. vs. 66433 build-armhf-xsm 5 xen-build fail REGR. vs. 66433 build-armhf 5 xen-build fail REGR. vs. 66433 Tests which did not succeed, but are not blocking: build-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-xl-pvh-amd 1 build-check(1) blocked n/a test-amd64-amd64-xl-pvh-intel 1 build-check(1) blocked n/a test-armhf-armhf-xl-vhd 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-qcow2 1 build-check(1) blocked n/a build-i386-libvirt1 build-check(1) blocked n/a test-armhf-armhf-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-credit2 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-xsm1 build-check(1) blocked n/a test-amd64-amd64-xl-multivcpu 1 build-check(1) blocked n/a test-amd64-amd64-pair 1 build-check(1) blocked n/a test-amd64-i386-pair 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-pair 1 build-check(1) blocked n/a test-amd64-i386-xl1 build-check(1) blocked n/a test-amd64-amd64-amd64-pvgrub 1 build-check(1) blocked n/a test-amd64-amd64-pygrub 1 build-check(1) blocked n/a test-armhf-armhf-xl-multivcpu 1 build-check(1) blocked n/a test-armhf-armhf-xl 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-intel 1 build-check(1) blocked n/a test-amd64-amd64-qemuu-nested-amd 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a test-amd64-i386-libvirt-pair 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1)blocked n/a test-armhf-armhf-xl-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-armhf-armhf-xl-cubietruck 1 build-check(1) blocked n/a test-armhf-armhf-xl-credit2 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked n/a build-armhf-libvirt 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-amd64 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-i386-xl-raw1 build-check(1) blocked n/a test-amd64-amd64-qemuu-nested-intel 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-i386 1 build-check(1) blocked n/a test-armhf-armhf-xl-arndale 1 build-check(1) blocked n/a test-amd64-amd64-xl-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-rtds 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-qcow2 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-amd 1 build-check(1) blocked n/a test-armhf-armhf-xl-rtds 1 build-check(1) blocked n/a test-amd64-amd64-i386-pvgrub 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 1 build-check(1)blocked n/a test-amd64-amd64-libvirt-vhd 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-winxpsp3 1 build-check(1)
Re: [Xen-devel] [PATCH] build: save generated config in /boot
On 12/21/15 9:35 AM, Jan Beulich wrote: On 21.12.15 at 16:20, wrote: >> On 12/21/15 8:51 AM, Jan Beulich wrote: >> On 21.12.15 at 15:35, wrote: On 12/21/15 6:11 AM, Jan Beulich wrote: On 18.12.15 at 22:35, wrote: >> Since we now support changing Xen options with Kconfig, we should save >> the configuration that was used to build up Xen. This will save it in >> /boot alongside the installed xen.gz and call it >> xen-$(FULLVERSION).config >> >> Suggested-by: Ian Campbell >> Signed-off-by: Doug Goldstein >> --- >> xen/Makefile | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/xen/Makefile b/xen/Makefile >> index 9023863..460b977 100644 >> --- a/xen/Makefile >> +++ b/xen/Makefile >> @@ -58,6 +58,7 @@ _install: $(TARGET)$(CONFIG_XEN_INSTALL_SUFFIX) >> ln -f -s $(T)-$(XEN_FULLVERSION)$(Z) $(D)$(BOOT_DIR)/$(T)$(Z) >> [ -d "$(D)$(DEBUG_DIR)" ] || $(INSTALL_DIR) $(D)$(DEBUG_DIR) >> $(INSTALL_DATA) $(TARGET)-syms >> $(D)$(DEBUG_DIR)/$(T)-syms-$(XEN_FULLVERSION) >> +$(INSTALL_DATA) $(KCONFIG_CONFIG) $(D)$(BOOT_DIR)/$(T)-$(XEN_FULLVERSION).config > > Was it really suggested to put this into /boot? It has no business > being there... Yes. By multiple people. Ian Campbell was the first person to suggest it in that location. >>> >>> Okay, so I've looked it up, and no, he didn't. He just gave this as one >>> possibility: >>> >>> "It occurred to me this morning that we probably ought to stash the .config >>> somewhere on install in such a way that it can be associated with the Xen >>> binary (i.e. with the same full suffix as the binary itself, not the >>> abridged symlink names), maybe as $(BOOT_DIR)/$(T)- >>> $(XEN_FULLVERSION).config?" >>> >>> But yes, I'm sorry for not noticing this as an undesirable place right >>> away. >> >> Ok well I'm at a loss here because the quote clearly shows him >> suggesting that location. Do you have a suggested location because so >> far I've just got a no from you on the only suggested location. > > Match the xen-syms location? > > Jan > I guess I fail to grasp the rationale behind not putting it in /boot. -- Doug Goldstein signature.asc Description: OpenPGP digital signature ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] build: save generated config in /boot
>>> On 22.12.15 at 13:45, wrote: > On 12/21/15 6:53 AM, Jan Beulich wrote: > On 21.12.15 at 13:40, wrote: >>> On 21/12/15 12:11, Jan Beulich wrote: >>> On 18.12.15 at 22:35, wrote: > Since we now support changing Xen options with Kconfig, we should save > the configuration that was used to build up Xen. This will save it in > /boot alongside the installed xen.gz and call it > xen-$(FULLVERSION).config > > Suggested-by: Ian Campbell > Signed-off-by: Doug Goldstein > --- > xen/Makefile | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/xen/Makefile b/xen/Makefile > index 9023863..460b977 100644 > --- a/xen/Makefile > +++ b/xen/Makefile > @@ -58,6 +58,7 @@ _install: $(TARGET)$(CONFIG_XEN_INSTALL_SUFFIX) > ln -f -s $(T)-$(XEN_FULLVERSION)$(Z) $(D)$(BOOT_DIR)/$(T)$(Z) > [ -d "$(D)$(DEBUG_DIR)" ] || $(INSTALL_DIR) $(D)$(DEBUG_DIR) > $(INSTALL_DATA) $(TARGET)-syms > $(D)$(DEBUG_DIR)/$(T)-syms-$(XEN_FULLVERSION) > + $(INSTALL_DATA) $(KCONFIG_CONFIG) >>> $(D)$(BOOT_DIR)/$(T)-$(XEN_FULLVERSION).config Was it really suggested to put this into /boot? It has no business being there... >>> >>> A typical Linux has: >>> >>> andrewcoop@andrewcoop:/local/xen.git/xen$ ls -lA /boot/ >>> total 21044 >>> -rw-r--r-- 1 root root 157726 Dec 15 15:40 config-3.16.0-4-amd64 >>> drwxr-xr-x 5 root root 4096 Dec 18 07:53 grub >>> -rw-r--r-- 1 root root 15535038 Dec 18 07:52 initrd.img-3.16.0-4-amd64 >>> drwx-- 2 root root16384 Oct 19 11:11 lost+found >>> -rw-r--r-- 1 root root 2676277 Dec 15 15:40 System.map-3.16.0-4-amd64 >>> -rw-r--r-- 1 root root 3118928 Dec 15 15:37 vmlinuz-3.16.0-4-amd64 >>> >>> which at the very least is consistent between Debian and RHEL derivatives. >>> >>> IMO, doing the same for Xen is sensible. >> >> I'm afraid I have to disagree - just because Linux does things a >> certain way doesn't mean that the only (sensible) way. Imo /boot >> should hold exclusively stuff needed for booting. Remember how >> we moved xen-syms out of there not so long ago? You could have >> objected to that change too, considering that Linux puts >> System.map and sometimes also the uncompressed vmlinux there. >> Yet I think it was a correct move, and the change here should >> follow that model instead of Linux'es. > > I don't think you're correct here Jan. Looking at most distros I have at > hand (I'll admit I don't have SuSE available to me). The uncompressed > kernel is not in /boot. Note how I said "sometimes"? And I suppose you will agree that System.map is unnecessary for booting, despite it getting put in /boot quite commonly as it looks. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] build: save generated config in /boot
>>> On 22.12.15 at 13:45, wrote: > On 12/21/15 9:35 AM, Jan Beulich wrote: > On 21.12.15 at 16:20, wrote: >>> On 12/21/15 8:51 AM, Jan Beulich wrote: >>> On 21.12.15 at 15:35, wrote: > On 12/21/15 6:11 AM, Jan Beulich wrote: > On 18.12.15 at 22:35, wrote: >>> Since we now support changing Xen options with Kconfig, we should save >>> the configuration that was used to build up Xen. This will save it in >>> /boot alongside the installed xen.gz and call it >>> xen-$(FULLVERSION).config >>> >>> Suggested-by: Ian Campbell >>> Signed-off-by: Doug Goldstein >>> --- >>> xen/Makefile | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/xen/Makefile b/xen/Makefile >>> index 9023863..460b977 100644 >>> --- a/xen/Makefile >>> +++ b/xen/Makefile >>> @@ -58,6 +58,7 @@ _install: $(TARGET)$(CONFIG_XEN_INSTALL_SUFFIX) >>> ln -f -s $(T)-$(XEN_FULLVERSION)$(Z) $(D)$(BOOT_DIR)/$(T)$(Z) >>> [ -d "$(D)$(DEBUG_DIR)" ] || $(INSTALL_DIR) $(D)$(DEBUG_DIR) >>> $(INSTALL_DATA) $(TARGET)-syms >>> $(D)$(DEBUG_DIR)/$(T)-syms-$(XEN_FULLVERSION) >>> + $(INSTALL_DATA) $(KCONFIG_CONFIG) > $(D)$(BOOT_DIR)/$(T)-$(XEN_FULLVERSION).config >> >> Was it really suggested to put this into /boot? It has no business >> being there... > > Yes. By multiple people. Ian Campbell was the first person to suggest it > in that location. Okay, so I've looked it up, and no, he didn't. He just gave this as one possibility: "It occurred to me this morning that we probably ought to stash the .config somewhere on install in such a way that it can be associated with the Xen binary (i.e. with the same full suffix as the binary itself, not the abridged symlink names), maybe as $(BOOT_DIR)/$(T)- $(XEN_FULLVERSION).config?" But yes, I'm sorry for not noticing this as an undesirable place right away. >>> >>> Ok well I'm at a loss here because the quote clearly shows him >>> suggesting that location. Do you have a suggested location because so >>> far I've just got a no from you on the only suggested location. >> >> Match the xen-syms location? > > I guess I fail to grasp the rationale behind not putting it in /boot. It's the other way around really - you'd have to provide a reason (other than "Linux does so too") for putting it in /boot. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
>>> On 22.12.15 at 11:30, wrote: I dislike having to repeat this: Please trim your Cc lists. > --- a/xen/arch/x86/mm/guest_walk.c > +++ b/xen/arch/x86/mm/guest_walk.c > @@ -90,6 +90,57 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, > int set_dirty) > return 0; > } > > +extern bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec, > +uint32_t pte_flags, uint32_t pte_pkey); > +#if GUEST_PAGING_LEVELS == CONFIG_PAGING_LEVELS > +bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec, > +uint32_t pte_flags, uint32_t pte_pkey) > +{ See my comments on the previous version. Please avoid sending new versions without having addressed all comments on the previous one (verbally or by code changes). Having done the suggested change just partially (by removing the #ifdef-s from the call sites) you now do the key check universally, and things remain correct just because of the long mode check in the middle of the function. > +unsigned int pkru = 0; > +bool_t pkru_ad, pkru_wd; > + > +bool_t pf = !!(pfec & PFEC_page_present); There's still this stray blank line above (and I continue to wonder whether you really need all these boolean variables many of which get used just once). > +bool_t uf = !!(pfec & PFEC_user_mode); > +bool_t wf = !!(pfec & PFEC_write_access); > +bool_t ff = !!(pfec & PFEC_insn_fetch); > +bool_t rsvdf = !!(pfec & PFEC_reserved_bit); > + > +/* When page isn't present, PKEY isn't checked. */ > +if ( !pf || is_pv_vcpu(vcpu) ) > +return 0; > + > +/* > + * PKU: additional mechanism by which the paging controls > + * access to user-mode addresses based on the value in the > + * PKRU register. A fault is considered as a PKU violation if all > + * of the following conditions are ture: > + * 1.CR4_PKE=1. > + * 2.EFER_LMA=1. > + * 3.page is present with no reserved bit violations. > + * 4.the access is not an instruction fetch. > + * 5.the access is to a user page. > + * 6.PKRU.AD=1 > + * or The access is a data write and PKRU.WD=1 > + *and either CR0.WP=1 or it is a user access. > + */ > +if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) || > +rsvdf || ff || !(pte_flags & _PAGE_USER) ) Indentation (I think this also was broken already in the previous version). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V5 5/6] x86/hvm: pkeys, add xstate support for pkeys
>>> On 22.12.15 at 11:30, wrote: > --- a/xen/include/asm-x86/xstate.h > +++ b/xen/include/asm-x86/xstate.h > @@ -34,13 +34,15 @@ > #define XSTATE_OPMASK (1ULL << 5) > #define XSTATE_ZMM (1ULL << 6) > #define XSTATE_HI_ZMM (1ULL << 7) > +#define XSTATE_PKRU(1ULL << 9) > #define XSTATE_LWP (1ULL << 62) /* AMD lightweight profiling */ > #define XSTATE_FP_SSE (XSTATE_FP | XSTATE_SSE) > #define XCNTXT_MASK(XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK > | \ > XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY) > > #define XSTATE_ALL (~(1ULL << 63)) > -#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR) > +#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR | \ > +XSTATE_PKRU) Same thing here - I don't recall any reply to my inquiry regarding this change. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V5 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
>>> On 22.12.15 at 11:30, wrote: > This patch adds pkeys support for cpuid handing. > > Pkeys hardware support is CPUID.7.0.ECX[3]:PKU. software support is > CPUID.7.0.ECX[4]:OSPKE and it reflects the support setting of CR4.PKE. > > Signed-off-by: Huaitong Han > --- > tools/libxc/xc_cpufeature.h | 2 ++ > tools/libxc/xc_cpuid_x86.c | 6 -- > xen/arch/x86/hvm/hvm.c | 36 +++- > 3 files changed, 29 insertions(+), 15 deletions(-) > > diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h > index c3ddc80..f6a9778 100644 > --- a/tools/libxc/xc_cpufeature.h > +++ b/tools/libxc/xc_cpufeature.h > @@ -141,5 +141,7 @@ > #define X86_FEATURE_ADX 19 /* ADCX, ADOX instructions */ > #define X86_FEATURE_SMAP20 /* Supervisor Mode Access Protection */ > > +/* Intel-defined CPU features, CPUID level 0x0007:0 (ecx) */ > +#define X86_FEATURE_PKU 3 > > #endif /* __LIBXC_CPUFEATURE_H */ > diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c > index 8882c01..1ce979b 100644 > --- a/tools/libxc/xc_cpuid_x86.c > +++ b/tools/libxc/xc_cpuid_x86.c > @@ -427,9 +427,11 @@ static void xc_cpuid_hvm_policy(xc_interface *xch, > bitmaskof(X86_FEATURE_ADX) | > bitmaskof(X86_FEATURE_SMAP) | > bitmaskof(X86_FEATURE_FSGSBASE)); > +regs[2] &= bitmaskof(X86_FEATURE_PKU); > } else > -regs[1] = 0; > -regs[0] = regs[2] = regs[3] = 0; > +regs[1] = regs[2] = 0; > + > +regs[0] = regs[3] = 0; > break; > > case 0x000d: > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c > index 59916ed..076313b 100644 > --- a/xen/arch/x86/hvm/hvm.c > +++ b/xen/arch/x86/hvm/hvm.c > @@ -4572,7 +4572,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, > unsigned int *ebx, > __clear_bit(X86_FEATURE_APIC & 31, edx); > > /* Fix up OSXSAVE. */ > -if ( cpu_has_xsave ) > +if ( *ecx & cpufeat_mask(X86_FEATURE_XSAVE) ) > *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) ? > cpufeat_mask(X86_FEATURE_OSXSAVE) : 0; > > @@ -4585,21 +4585,31 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, > unsigned int *ebx, > *edx &= ~cpufeat_mask(X86_FEATURE_PSE36); > break; > case 0x7: > -if ( (count == 0) && !cpu_has_smep ) > -*ebx &= ~cpufeat_mask(X86_FEATURE_SMEP); > +if ( count == 0 ) > +{ > +if ( !cpu_has_smep ) > +*ebx &= ~cpufeat_mask(X86_FEATURE_SMEP); > > -if ( (count == 0) && !cpu_has_smap ) > -*ebx &= ~cpufeat_mask(X86_FEATURE_SMAP); > +if ( !cpu_has_smap ) > +*ebx &= ~cpufeat_mask(X86_FEATURE_SMAP); > > -/* Don't expose MPX to hvm when VMX support is not available */ > -if ( (count == 0) && > - (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) || > - !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS)) ) > -*ebx &= ~cpufeat_mask(X86_FEATURE_MPX); > +/* Don't expose MPX to hvm when VMX support is not available */ Please fix the comment style if you touch this anyway. > +if (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) || > + !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS)) > +*ebx &= ~cpufeat_mask(X86_FEATURE_MPX); > > -/* Don't expose INVPCID to non-hap hvm. */ > -if ( (count == 0) && !hap_enabled(d) ) > -*ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID); > +if ( !hap_enabled(d) ) > +{ > +/* Don't expose INVPCID to non-hap hvm. */ > +*ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID); > +/* X86_FEATURE_PKU is not yet implemented for shadow paging. > */ > +*ecx &= ~cpufeat_mask(X86_FEATURE_PKU); > +} > + > +if ( *ecx & cpufeat_mask(X86_FEATURE_PKU)) > +*ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ? > + cpufeat_mask(X86_FEATURE_OSPKE) : 0; Bogus use of ?: considering that you or in zero in its "false" case. Just extend the if() condition. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [distros-debian-snapshot test] 38550: tolerable FAIL
flight 38550 distros-debian-snapshot real [real] http://osstest.xs.citrite.net/~osstest/testlogs/logs/38550/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-armhf-armhf-armhf-daily-netboot-pygrub 9 debian-di-install fail like 38518 test-amd64-i386-i386-daily-netboot-pvgrub 9 debian-di-install fail like 38518 test-amd64-i386-amd64-daily-netboot-pygrub 9 debian-di-install fail like 38518 test-amd64-amd64-i386-daily-netboot-pygrub 9 debian-di-install fail like 38518 test-amd64-amd64-amd64-daily-netboot-pvgrub 9 debian-di-install fail like 38518 baseline version: flight 38518 jobs: build-amd64 pass build-armhf pass build-i386 pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-amd64-daily-netboot-pvgrub fail test-amd64-i386-i386-daily-netboot-pvgrubfail test-amd64-i386-amd64-daily-netboot-pygrub fail test-armhf-armhf-armhf-daily-netboot-pygrub fail test-amd64-amd64-i386-daily-netboot-pygrub fail test-amd64-amd64-amd64-current-netinst-pygrubpass test-amd64-i386-amd64-current-netinst-pygrub pass test-amd64-amd64-i386-current-netinst-pygrub pass test-amd64-i386-i386-current-netinst-pygrub pass test-amd64-amd64-amd64-weekly-netinst-pygrub pass test-amd64-i386-amd64-weekly-netinst-pygrub pass test-amd64-amd64-i386-weekly-netinst-pygrub pass test-amd64-i386-i386-weekly-netinst-pygrub pass sg-report-flight on osstest.xs.citrite.net logs: /home/osstest/logs images: /home/osstest/images Logs, config files, etc. are available at http://osstest.xs.citrite.net/~osstest/testlogs/logs Test harness code can be found at http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary Push not applicable. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen Security Advisory 155 (CVE-2015-8550) - paravirtualized drivers incautious about shared memory
Stefano Stabellini, on Tue 22 Dec 2015 12:24:35 +, wrote: > MiniOS for QEMU stubdom has frontends, such as mini-os/blkfront.c and > mini-os/netfront.c, not backends. There is one backend, tpmback. It however doesn't use a ring. Samuel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 13/13] xen-hvm: Mark inappropriate error handling FIXME
On Thu, 17 Dec 2015, Markus Armbruster wrote: > Cc: Stefano Stabellini > Cc: xen-de...@lists.xensource.com > Signed-off-by: Markus Armbruster > --- > xen-hvm.c | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/xen-hvm.c b/xen-hvm.c > index 3d78a0c..2a93390 100644 > --- a/xen-hvm.c > +++ b/xen-hvm.c > @@ -240,6 +240,7 @@ static void xen_ram_init(PCMachineState *pcms, > > void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr) > { > +/* FIXME caller ram_block_add() wants error_setg() on failure */ > unsigned long nr_pfn; > xen_pfn_t *pfn_list; > int i; > @@ -1192,6 +1193,12 @@ static void xen_wakeup_notifier(Notifier *notifier, > void *data) > int xen_hvm_init(PCMachineState *pcms, > MemoryRegion **ram_memory) > { > +/* > + * FIXME Returns -1 without cleaning up on some errors (harmless > + * as long as the caller exit()s on error), dies with hw_error() Could you please stop the comment here and just replace hw_error() with either return -1 or exit(1) within xen_hvm_init? > + * on others. hw_error() isn't approprate here. Should probably > + * simply exit() on all errors. > + */ > int i, rc; > xen_pfn_t ioreq_pfn; > xen_pfn_t bufioreq_pfn; > -- > 2.4.3 > > > ___ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv1 4/4] spinlock: add fair read-write locks
>>> On 18.12.15 at 15:09, wrote: > +/** > + * rspin_until_writer_unlock - inc reader count & spin until writer is gone Stale comment - the function doesn't increment anything. Also throughout the file, with Linux coding style converted to Xen style, comment style should be made Xen-like too. > +/* > + * Readers come here when they cannot get the lock without waiting > + */ > +if ( unlikely(in_irq()) ) > +{ > +/* > + * Readers in interrupt context will spin until the lock is > + * available without waiting in the queue. > + */ > +smp_rmb(); > +cnts = atomic_read(&lock->cnts); > +rspin_until_writer_unlock(lock, cnts); > +return; > +} I can't immediately see the reason for this re-introduction of unfairness - can you say a word on this, or perhaps extend the comment? > +/** > + * queue_write_lock_slowpath - acquire write lock of a queue rwlock > + * @lock : Pointer to queue rwlock structure > + */ > +void queue_write_lock_slowpath(rwlock_t *lock) > { > -_read_unlock(lock); > -local_irq_enable(); > -} > +u32 cnts; > > -void _read_unlock_irqrestore(rwlock_t *lock, unsigned long flags) > -{ > -_read_unlock(lock); > -local_irq_restore(flags); > -} > +/* Put the writer into the wait queue */ > +spin_lock(&lock->lock); > > -void _write_lock(rwlock_t *lock) > -{ > -uint32_t x; > +/* Try to acquire the lock directly if no reader is present */ > +if ( !atomic_read(&lock->cnts) && > + (atomic_cmpxchg(&lock->cnts, 0, _QW_LOCKED) == 0) ) > +goto unlock; > > -check_lock(&lock->debug); > -do { > -while ( (x = lock->lock) & RW_WRITE_FLAG ) > -cpu_relax(); > -} while ( cmpxchg(&lock->lock, x, x|RW_WRITE_FLAG) != x ); > -while ( x != 0 ) > +/* > + * Set the waiting flag to notify readers that a writer is pending, > + * or wait for a previous writer to go away. > + */ > +for (;;) > { > -cpu_relax(); > -x = lock->lock & ~RW_WRITE_FLAG; > -} > -preempt_disable(); > -} > - > -void _write_lock_irq(rwlock_t *lock) > -{ > -uint32_t x; > +cnts = atomic_read(&lock->cnts); > +if ( !(cnts & _QW_WMASK) && > + (atomic_cmpxchg(&lock->cnts, cnts, > + cnts | _QW_WAITING) == cnts) ) > +break; Considering that (at least on x86) cmpxchg is relatively expensive, and that atomic OR would be carried out by some cmpxchg-like mechanism on most other arches anyway, can't this be an atomic OR, followed by a read to check for another active writer? > +unlock: > +spin_unlock(&lock->lock); Labels indented by at least one space please. Also - are you using any of the static functions in spinlock.c? If not, putting the rwlock code in a new rwlock.c would help review quite a bit, since code removal and code addition would then be separate rather than intermixed. > +/* > + * Writer states & reader shift and bias > + */ > +#define_QW_WAITING 1 /* A writer is waiting */ > +#define_QW_LOCKED 0xff/* A writer holds the lock */ > +#define_QW_WMASK0xff/* Writer mask */ > +#define_QR_SHIFT8 /* Reader count shift */ > +#define_QR_BIAS (1U << _QR_SHIFT) Is there a particular reason for the 8-bit writer mask (a 2-bit one would seem to suffice)? > +static inline int _write_trylock(rwlock_t *lock) > +{ > +u32 cnts; > + > +cnts = atomic_read(&lock->cnts); > +if ( unlikely(cnts) ) > +return 0; > + > +return likely(atomic_cmpxchg(&lock->cnts, > + cnts, cnts | _QW_LOCKED) == cnts); The | is pointless here considering that cnts is zero. > +static inline void _write_unlock(rwlock_t *lock) > +{ > +/* > + * If the writer field is atomic, it can be cleared directly. > + * Otherwise, an atomic subtraction will be used to clear it. > + */ > +atomic_sub(_QW_LOCKED, &lock->cnts); > +} Ah, I guess the comment here is the explanation for the 8-bit write mask. > +static inline int _rw_is_write_locked(rwlock_t *lock) > +{ > +return atomic_read(&lock->cnts) & _QW_WMASK; > +} This returns true for write-locked or writer-waiting - is this intended? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-4.4-testing test] 66583: regressions - FAIL
On Mon, 21 Dec 2015, Jan Beulich wrote: > >>> On 20.12.15 at 13:25, wrote: > > flight 66583 xen-4.4-testing real [real] > > http://logs.test-lab.xenproject.org/osstest/logs/66583/ > > > > Regressions :-( > > > > Tests which did not succeed and are blocking, > > including tests which could not be run: > > build-amd64-prev 5 xen-build fail REGR. vs. > > 66458 > > build-i386-prev 5 xen-build fail REGR. vs. > > 66458 > > Ian, Stefano, > > is one of you two looking into this? > > block-qcow.c: In function 'get_cluster_offset': > block-qcow.c:444:3: error: 'tmp_ptr' may be used uninitialized in this > function [-Werror=maybe-uninitialized] >memcpy(tmp_ptr, l1_ptr, 4096); >^ > block-qcow.c:619:7: error: 'tmp_ptr2' may be used uninitialized in this > function [-Werror=maybe-uninitialized] >if (write(s->fd, tmp_ptr2, 4096) != 4096) { >^ > cc1: all warnings being treated as errors > /home/osstest/build.66583.build-amd64-prev/xen/tools/blktap/drivers/../../../tools/Rules.mk:89: > recipe for target 'block-qcow.o' failed > make[5]: *** [block-qcow.o] Error 1 What does build-amd64-prev means? I cannot find the explanation of the test anywhere. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv6 2/2] x86/ept: defer the invalidation until the p2m lock is released
On 22/12/15 12:23, George Dunlap wrote: > On 18/12/15 13:50, David Vrabel wrote: >> Holding the p2m lock while calling ept_sync_domain() is very expensive >> since it does a on_selected_cpus() call. IPIs on many socket machines >> can be very slows and on_selected_cpus() is serialized. >> >> It is safe to defer the invalidate until the p2m lock is released >> except for two cases: >> >> 1. When freeing a page table page (since partial translations may be >>cached). >> 2. When reclaiming a zero page as part of PoD. >> >> For these cases, add p2m_tlb_flush_sync() calls which will immediately >> perform the invalidate before the page is freed or reclaimed. > There are at least two other places in the PoD code where the "remove -> > check -> add to cache -> unlock" pattern exist; and it looks to me like > there are other places where races might occur (e.g., > p2m_paging_evict(), which does remove -> scrub -> put -> unlock; > p2m_altp2m_propagate_change(), which does remove -> put -> unlock). > > Part of me wonders whether, rather than making callers that need it > remember to do a flush, it might be better to explicitly pass in > P2M_FLUSH or P2M_CAN_DEFER when calling p2m_set_entry, just to make > people think about the fact that the p2m change may not actually take > effect until later. Any thoughts on that? > > Comments on the current approach inline. > >> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c >> index c094320..43c7f1b 100644 >> --- a/xen/arch/x86/mm/p2m-ept.c >> +++ b/xen/arch/x86/mm/p2m-ept.c >> @@ -263,6 +263,7 @@ static void ept_free_entry(struct p2m_domain *p2m, >> ept_entry_t *ept_entry, int l >> unmap_domain_page(epte); >> } >> >> +p2m_tlb_flush_sync(p2m); >> p2m_free_ptp(p2m, mfn_to_page(ept_entry->mfn)); > It's probably worth a comment here pointing out that even if this > function is called several times (e.g., if you replace a load of 4k > entries with a 1G entry), the actual flush will only happen the first time. > >> +static void ept_flush_and_unlock(struct p2m_domain *p2m, bool_t unlock) >> +{ >> +p2m->need_flush = 0; >> +if ( unlock ) >> +mm_write_unlock(&p2m->lock); >> +ept_sync_domain_mask(p2m, p2m->domain->domain_dirty_cpumask); >> } > Having a function called "flush_and_unlock", with a boolean as to > whether to unlock or not, just seems a bit wonky. > > Wouldn't it make more sense to have the hook just named "flush_sync()", > and move the unlocking out in the generic p2m code (where you already > have the check for need_flush)? > >> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h >> index fa46dd9..9c394c2 100644 >> --- a/xen/include/asm-x86/p2m.h >> +++ b/xen/include/asm-x86/p2m.h >> @@ -261,6 +261,10 @@ struct p2m_domain { >>unsigned long gfn, l1_pgentry_t >> *p, >>l1_pgentry_t new, unsigned int >> level); >> long (*audit_p2m)(struct p2m_domain *p2m); >> +void (*flush_and_unlock)(struct p2m_domain *p2m, bool_t >> unlock); >> + >> +unsigned int defer_flush; >> +bool_t need_flush; > It's probably worth a comment that at the moment calling > flush_and_unlock() is gated on need_flush; so it's OK not to implement > flush_and_unlock() as long as you never set need_flush. This is just one small accident (in code elsewhere) away from a code injection vulnerability. Either we should require that all function pointers are filled in, or BUG() if the pointer is missing when we attempt to use it. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] libxc: Defer initialization of start_page for HVM guests
On 12/22/2015 04:42 AM, Roger Pau Monné wrote: El 22/12/15 a les 0.45, Boris Ostrovsky ha escrit: With commit 8c45adec18e0 ("libxc: create unmapped initrd in domain builder if supported") location of ramdisk may not be available to HVMlite guests by the time alloc_magic_pages_hvm() is invoked if the guest supports unmapped initrd. Since the only operation related to allocating magic pages in that routine is allocation of HVMlite start info we can move everything else to a later point such as xc_dom_arch.start_info(). Signed-off-by: Boris Ostrovsky --- I am not sure xc_dom_arch.start_info() is the right place neither since we still are doing things that have nothing to do with start_info. tools/libxc/xc_dom_x86.c | 45 ++--- 1 files changed, 30 insertions(+), 15 deletions(-) diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c index 3960875..f64079e 100644 --- a/tools/libxc/xc_dom_x86.c +++ b/tools/libxc/xc_dom_x86.c @@ -585,6 +585,32 @@ static void build_hvm_info(void *hvm_info_page, struct xc_dom_image *dom) static int alloc_magic_pages_hvm(struct xc_dom_image *dom) { +struct xc_dom_seg seg; +size_t start_info_size = sizeof(struct hvm_start_info); + +if ( dom->device_model ) +return 0; + +if ( dom->cmdline ) +start_info_size += ROUNDUP(strlen(dom->cmdline) + 1, 8); +if ( dom->ramdisk_blob ) +/* Limited to one module. */ +start_info_size += sizeof( struct hvm_modlist_entry); ^ extra space. + +if ( xc_dom_alloc_segment(dom, &seg, "HVMlite start info", 0, + start_info_size) ) +{ +DOMPRINTF("Unable to reserve memory for the start info"); +return -1; +} + +dom->start_info_pfn = seg.pfn; + +return 0; +} + +static int start_info_hvm(struct xc_dom_image *dom) +{ unsigned long i; void *hvm_info_page; uint32_t *ident_pt, domid = dom->guest_domid; @@ -636,7 +662,6 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom) if ( !dom->device_model ) { -struct xc_dom_seg seg; struct hvm_start_info *start_info; char *cmdline; struct hvm_modlist_entry *modlist; @@ -653,17 +678,9 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom) if ( dom->ramdisk_blob ) start_info_size += sizeof(*modlist); /* Limited to one module. */ -rc = xc_dom_alloc_segment(dom, &seg, "HVMlite start info", 0, - start_info_size); -if ( rc != 0 ) -{ -DOMPRINTF("Unable to reserve memory for the start info"); -goto out; -} - start_page = xc_map_foreign_range(xch, domid, start_info_size, IMHO, I would stash start_info_size somewhere inside xc_dom_image in order to avoid calculating it twice. Yes, I should do this. Also, why is everything done inside of alloc_magic_pages_hvm moved to start_info_hvm? AFAICT we should only need to move the code that fills the hvm_start_info struct, but the rest of the code already present in alloc_magic_pages_hvm could be left as-is. That's what I was referring to in the commit message (and in the comment below it): most, if not all, of the stuff that alloc_magic_pages_hvm() currently does really has nothing to do with magic pages allocation. E.g. hvm_params settting, special pages initialization, etc. So I figured I could move all of this to a later point. But, as I said in the comment, I am not convinced that start_info() is the right place either. -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH RESEND v4] arm: remove !CPU_V6 and !GENERIC_ATOMIC64 build dependencies for XEN
Hello Russell, Would you please consider this patch for the next merge window? It is a very old patch (March 2014) which has been left over. See some of the original threads here: http://marc.info/?l=linux-arm-kernel&m=138920414402610&w=2 http://marc.info/?l=linux-arm-kernel&m=139031200118436&w=2 Many thanks, Stefano --- Remove !GENERIC_ATOMIC64 build dependency: - introduce xen_atomic64_xchg - use it to implement xchg_xen_ulong Remove !CPU_V6 build dependency: - introduce __cmpxchg8 and __cmpxchg16, compiled even ifdef CONFIG_CPU_V6 - implement sync_cmpxchg using __cmpxchg8 and __cmpxchg16 Signed-off-by: Stefano Stabellini Reviewed-by: Will Deacon Acked-by: Ian Campbell CC: a...@arndb.de CC: will.dea...@arm.com CC: catalin.mari...@arm.com CC: linux-arm-ker...@lists.infradead.org CC: linux-ker...@vger.kernel.org CC: xen-de...@lists.xenproject.org diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 34e1569..e09ed94 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -1807,8 +1807,7 @@ config XEN_DOM0 config XEN bool "Xen guest support on ARM" depends on ARM && AEABI && OF - depends on CPU_V7 && !CPU_V6 - depends on !GENERIC_ATOMIC64 + depends on CPU_V7 depends on MMU select ARCH_DMA_ADDR_T_64BIT select ARM_PSCI diff --git a/arch/arm/include/asm/cmpxchg.h b/arch/arm/include/asm/cmpxchg.h index 97882f9..3b342ec 100644 --- a/arch/arm/include/asm/cmpxchg.h +++ b/arch/arm/include/asm/cmpxchg.h @@ -152,6 +152,44 @@ extern void __bad_cmpxchg(volatile void *ptr, int size); * cmpxchg only support 32-bits operands on ARMv6. */ +static inline unsigned long __cmpxchg8(volatile void *ptr, unsigned long old, + unsigned long new) +{ + unsigned long oldval, res; + + do { + asm volatile("@ __cmpxchg8\n" + " ldrexb %1, [%2]\n" + " mov %0, #0\n" + " teq %1, %3\n" + " strexbeq %0, %4, [%2]\n" + : "=&r" (res), "=&r" (oldval) + : "r" (ptr), "Ir" (old), "r" (new) + : "memory", "cc"); + } while (res); + + return oldval; +} + +static inline unsigned long __cmpxchg16(volatile void *ptr, unsigned long old, + unsigned long new) +{ + unsigned long oldval, res; + + do { + asm volatile("@ __cmpxchg16\n" + " ldrexh %1, [%2]\n" + " mov %0, #0\n" + " teq %1, %3\n" + " strexheq %0, %4, [%2]\n" + : "=&r" (res), "=&r" (oldval) + : "r" (ptr), "Ir" (old), "r" (new) + : "memory", "cc"); + } while (res); + + return oldval; +} + static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old, unsigned long new, int size) { @@ -162,28 +200,10 @@ static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old, switch (size) { #ifndef CONFIG_CPU_V6 /* min ARCH >= ARMv6K */ case 1: - do { - asm volatile("@ __cmpxchg1\n" - " ldrexb %1, [%2]\n" - " mov %0, #0\n" - " teq %1, %3\n" - " strexbeq %0, %4, [%2]\n" - : "=&r" (res), "=&r" (oldval) - : "r" (ptr), "Ir" (old), "r" (new) - : "memory", "cc"); - } while (res); + oldval = __cmpxchg8(ptr, old, new); break; case 2: - do { - asm volatile("@ __cmpxchg1\n" - " ldrexh %1, [%2]\n" - " mov %0, #0\n" - " teq %1, %3\n" - " strexheq %0, %4, [%2]\n" - : "=&r" (res), "=&r" (oldval) - : "r" (ptr), "Ir" (old), "r" (new) - : "memory", "cc"); - } while (res); + oldval = __cmpxchg16(ptr, old, new); break; #endif case 4: diff --git a/arch/arm/include/asm/sync_bitops.h b/arch/arm/include/asm/sync_bitops.h index 9732b8e..4103319 100644 --- a/arch/arm/include/asm/sync_bitops.h +++ b/arch/arm/include/asm/sync_bitops.h @@ -20,7 +20,29 @@ #define sync_test_and_clear_bit(nr, p) _test_and_clear_bit(nr, p) #define sync_test_and_change_bit(nr, p)_test_and_change_bit(nr, p) #define sync_test_bit(nr, addr)test_bit(nr, addr) -#define sync_cmpxchg cmpxchg +static inline unsigned long sync_cmpxchg(volatile void *ptr, +
Re: [Xen-devel] [PATCHv6 2/2] x86/ept: defer the invalidation until the p2m lock is released
On 22/12/15 14:01, Andrew Cooper wrote: > On 22/12/15 12:23, George Dunlap wrote: >> On 18/12/15 13:50, David Vrabel wrote: >>> Holding the p2m lock while calling ept_sync_domain() is very expensive >>> since it does a on_selected_cpus() call. IPIs on many socket machines >>> can be very slows and on_selected_cpus() is serialized. >>> >>> It is safe to defer the invalidate until the p2m lock is released >>> except for two cases: >>> >>> 1. When freeing a page table page (since partial translations may be >>>cached). >>> 2. When reclaiming a zero page as part of PoD. >>> >>> For these cases, add p2m_tlb_flush_sync() calls which will immediately >>> perform the invalidate before the page is freed or reclaimed. >> There are at least two other places in the PoD code where the "remove -> >> check -> add to cache -> unlock" pattern exist; and it looks to me like >> there are other places where races might occur (e.g., >> p2m_paging_evict(), which does remove -> scrub -> put -> unlock; >> p2m_altp2m_propagate_change(), which does remove -> put -> unlock). >> >> Part of me wonders whether, rather than making callers that need it >> remember to do a flush, it might be better to explicitly pass in >> P2M_FLUSH or P2M_CAN_DEFER when calling p2m_set_entry, just to make >> people think about the fact that the p2m change may not actually take >> effect until later. Any thoughts on that? >> >> Comments on the current approach inline. >> >>> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c >>> index c094320..43c7f1b 100644 >>> --- a/xen/arch/x86/mm/p2m-ept.c >>> +++ b/xen/arch/x86/mm/p2m-ept.c >>> @@ -263,6 +263,7 @@ static void ept_free_entry(struct p2m_domain *p2m, >>> ept_entry_t *ept_entry, int l >>> unmap_domain_page(epte); >>> } >>> >>> +p2m_tlb_flush_sync(p2m); >>> p2m_free_ptp(p2m, mfn_to_page(ept_entry->mfn)); >> It's probably worth a comment here pointing out that even if this >> function is called several times (e.g., if you replace a load of 4k >> entries with a 1G entry), the actual flush will only happen the first time. >> >>> +static void ept_flush_and_unlock(struct p2m_domain *p2m, bool_t unlock) >>> +{ >>> +p2m->need_flush = 0; >>> +if ( unlock ) >>> +mm_write_unlock(&p2m->lock); >>> +ept_sync_domain_mask(p2m, p2m->domain->domain_dirty_cpumask); >>> } >> Having a function called "flush_and_unlock", with a boolean as to >> whether to unlock or not, just seems a bit wonky. >> >> Wouldn't it make more sense to have the hook just named "flush_sync()", >> and move the unlocking out in the generic p2m code (where you already >> have the check for need_flush)? >> >>> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h >>> index fa46dd9..9c394c2 100644 >>> --- a/xen/include/asm-x86/p2m.h >>> +++ b/xen/include/asm-x86/p2m.h >>> @@ -261,6 +261,10 @@ struct p2m_domain { >>>unsigned long gfn, l1_pgentry_t >>> *p, >>>l1_pgentry_t new, unsigned int >>> level); >>> long (*audit_p2m)(struct p2m_domain *p2m); >>> +void (*flush_and_unlock)(struct p2m_domain *p2m, bool_t >>> unlock); >>> + >>> +unsigned int defer_flush; >>> +bool_t need_flush; >> It's probably worth a comment that at the moment calling >> flush_and_unlock() is gated on need_flush; so it's OK not to implement >> flush_and_unlock() as long as you never set need_flush. > > This is just one small accident (in code elsewhere) away from a code > injection vulnerability. > > Either we should require that all function pointers are filled in, or > BUG() if the pointer is missing when we attempt to use it. Jan asked for the call to be conditional on need_flush and to not test flush_and_unlock. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-4.4-testing test] 66583: regressions - FAIL
>>> On 22.12.15 at 14:55, wrote: > On Mon, 21 Dec 2015, Jan Beulich wrote: >> >>> On 20.12.15 at 13:25, wrote: >> > flight 66583 xen-4.4-testing real [real] >> > http://logs.test-lab.xenproject.org/osstest/logs/66583/ >> > >> > Regressions :-( >> > >> > Tests which did not succeed and are blocking, >> > including tests which could not be run: >> > build-amd64-prev 5 xen-build fail REGR. vs. > 66458 >> > build-i386-prev 5 xen-build fail REGR. vs. > 66458 >> >> Ian, Stefano, >> >> is one of you two looking into this? >> >> block-qcow.c: In function 'get_cluster_offset': >> block-qcow.c:444:3: error: 'tmp_ptr' may be used uninitialized in this > function [-Werror=maybe-uninitialized] >>memcpy(tmp_ptr, l1_ptr, 4096); >>^ >> block-qcow.c:619:7: error: 'tmp_ptr2' may be used uninitialized in this > function [-Werror=maybe-uninitialized] >>if (write(s->fd, tmp_ptr2, 4096) != 4096) { >>^ >> cc1: all warnings being treated as errors >> > /home/osstest/build.66583.build-amd64-prev/xen/tools/blktap/drivers/../../../to > ols/Rules.mk:89: recipe for target 'block-qcow.o' failed >> make[5]: *** [block-qcow.o] Error 1 > > What does build-amd64-prev means? I cannot find the explanation of the > test anywhere. Iiuc it's the build pf the previous version, needed to test n-1 -> n migration. I.e. (in line with the respective 4.3 failure) it indicates that 4.3's qemu doesn't build. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] libxc: Defer initialization of start_page for HVM guests
El 22/12/15 a les 15.12, Boris Ostrovsky ha escrit: > On 12/22/2015 04:42 AM, Roger Pau Monné wrote: >> Also, why is everything done inside of alloc_magic_pages_hvm moved to >> start_info_hvm? AFAICT we should only need to move the code that fills >> the hvm_start_info struct, but the rest of the code already present in >> alloc_magic_pages_hvm could be left as-is. > > That's what I was referring to in the commit message (and in the comment > below it): most, if not all, of the stuff that alloc_magic_pages_hvm() > currently does really has nothing to do with magic pages allocation. > E.g. hvm_params settting, special pages initialization, etc. So I > figured I could move all of this to a later point. But, as I said in the > comment, I am not convinced that start_info() is the right place either. Some of it has to do with magic pages IMHO (although I have to admit this is probably a question of taste), for example it makes sense from my PoV to allocate the hvm_info_page, the special pages (xenstore, console...), the ioreq server pages and possibly the EPT identity map. Moving the start_info stuff into it's own function also makes sense. Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-4.4-testing test] 66583: regressions - FAIL
On Tue, 22 Dec 2015, Jan Beulich wrote: > >>> On 22.12.15 at 14:55, wrote: > > On Mon, 21 Dec 2015, Jan Beulich wrote: > >> >>> On 20.12.15 at 13:25, wrote: > >> > flight 66583 xen-4.4-testing real [real] > >> > http://logs.test-lab.xenproject.org/osstest/logs/66583/ > >> > > >> > Regressions :-( > >> > > >> > Tests which did not succeed and are blocking, > >> > including tests which could not be run: > >> > build-amd64-prev 5 xen-build fail REGR. > >> > vs. > > 66458 > >> > build-i386-prev 5 xen-build fail REGR. > >> > vs. > > 66458 > >> > >> Ian, Stefano, > >> > >> is one of you two looking into this? > >> > >> block-qcow.c: In function 'get_cluster_offset': > >> block-qcow.c:444:3: error: 'tmp_ptr' may be used uninitialized in this > > function [-Werror=maybe-uninitialized] > >>memcpy(tmp_ptr, l1_ptr, 4096); > >>^ > >> block-qcow.c:619:7: error: 'tmp_ptr2' may be used uninitialized in this > > function [-Werror=maybe-uninitialized] > >>if (write(s->fd, tmp_ptr2, 4096) != 4096) { > >>^ > >> cc1: all warnings being treated as errors > >> > > /home/osstest/build.66583.build-amd64-prev/xen/tools/blktap/drivers/../../../to > > ols/Rules.mk:89: recipe for target 'block-qcow.o' failed > >> make[5]: *** [block-qcow.o] Error 1 > > > > What does build-amd64-prev means? I cannot find the explanation of the > > test anywhere. > > Iiuc it's the build pf the previous version, needed to test n-1 -> n > migration. I.e. (in line with the respective 4.3 failure) it indicates > that 4.3's qemu doesn't build. Thanks for the explanation, it was very helpful. >From the logs what is not building is actually tools/blktap, not QEMU though, but I cannot reproduce the failure. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RESEND v4] arm: remove !CPU_V6 and !GENERIC_ATOMIC64 build dependencies for XEN
On Tue, Dec 22, 2015 at 02:17:17PM +, Stefano Stabellini wrote: > Hello Russell, > > Would you please consider this patch for the next merge window? It > is a very old patch (March 2014) which has been left over. This patch has some obvious problems... > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > index 34e1569..e09ed94 100644 > --- a/arch/arm/Kconfig > +++ b/arch/arm/Kconfig > @@ -1807,8 +1807,7 @@ config XEN_DOM0 > config XEN > bool "Xen guest support on ARM" > depends on ARM && AEABI && OF > - depends on CPU_V7 && !CPU_V6 > - depends on !GENERIC_ATOMIC64 > + depends on CPU_V7 How sure are we that this won't cause regressions? How much testing has been done with these changed dependencies? > diff --git a/arch/arm/include/asm/sync_bitops.h > b/arch/arm/include/asm/sync_bitops.h > index 9732b8e..4103319 100644 > --- a/arch/arm/include/asm/sync_bitops.h > +++ b/arch/arm/include/asm/sync_bitops.h > @@ -20,7 +20,29 @@ > #define sync_test_and_clear_bit(nr, p) _test_and_clear_bit(nr, p) > #define sync_test_and_change_bit(nr, p) _test_and_change_bit(nr, p) > #define sync_test_bit(nr, addr) test_bit(nr, addr) > -#define sync_cmpxchg cmpxchg > > +static inline unsigned long sync_cmpxchg(volatile void *ptr, > + > unsigned long old, > + > unsigned long new) > +{ > + unsigned long oldval; > + int size = sizeof(*(ptr)); This is buggy. You're doing sizeof(void) here, which on GCC will always be 1: A consequence of this is that `sizeof' is also allowed on `void' and on function types, and returns 1. > + > + smp_mb(); > + switch (size) { > + case 1: > + oldval = __cmpxchg8(ptr, old, new); > + break; > + case 2: > + oldval = __cmpxchg16(ptr, old, new); > + break; The ldrexb/ldrexh instructions are not available on ARMv6 CPUs. __xchg() and friends avoided these for ARMv6 CPUs, but this does not. I'd expect anything that uses sync_cmpxchg() will fail to build when a kernel including ARMv6 is attempted. So... I don't think this patch is ready. -- RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] build: save generated config in /boot
On 22/12/15 12:52, Jan Beulich wrote: On 22.12.15 at 13:45, wrote: >> On 12/21/15 9:35 AM, Jan Beulich wrote: >> On 21.12.15 at 16:20, wrote: On 12/21/15 8:51 AM, Jan Beulich wrote: On 21.12.15 at 15:35, wrote: >> On 12/21/15 6:11 AM, Jan Beulich wrote: >> On 18.12.15 at 22:35, wrote: Since we now support changing Xen options with Kconfig, we should save the configuration that was used to build up Xen. This will save it in /boot alongside the installed xen.gz and call it xen-$(FULLVERSION).config Suggested-by: Ian Campbell Signed-off-by: Doug Goldstein --- xen/Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/xen/Makefile b/xen/Makefile index 9023863..460b977 100644 --- a/xen/Makefile +++ b/xen/Makefile @@ -58,6 +58,7 @@ _install: $(TARGET)$(CONFIG_XEN_INSTALL_SUFFIX) ln -f -s $(T)-$(XEN_FULLVERSION)$(Z) $(D)$(BOOT_DIR)/$(T)$(Z) [ -d "$(D)$(DEBUG_DIR)" ] || $(INSTALL_DIR) $(D)$(DEBUG_DIR) $(INSTALL_DATA) $(TARGET)-syms $(D)$(DEBUG_DIR)/$(T)-syms-$(XEN_FULLVERSION) + $(INSTALL_DATA) $(KCONFIG_CONFIG) >> $(D)$(BOOT_DIR)/$(T)-$(XEN_FULLVERSION).config >>> Was it really suggested to put this into /boot? It has no business >>> being there... >> Yes. By multiple people. Ian Campbell was the first person to suggest it >> in that location. > Okay, so I've looked it up, and no, he didn't. He just gave this as one > possibility: > > "It occurred to me this morning that we probably ought to stash the > .config > somewhere on install in such a way that it can be associated with the Xen > binary (i.e. with the same full suffix as the binary itself, not the > abridged symlink names), maybe as $(BOOT_DIR)/$(T)- > $(XEN_FULLVERSION).config?" > > But yes, I'm sorry for not noticing this as an undesirable place right > away. Ok well I'm at a loss here because the quote clearly shows him suggesting that location. Do you have a suggested location because so far I've just got a no from you on the only suggested location. >>> Match the xen-syms location? >> I guess I fail to grasp the rationale behind not putting it in /boot. > It's the other way around really - you'd have to provide a reason > (other than "Linux does so too") for putting it in /boot. I disagree. Xen being consistent with Linux in this regard is in the best interest of the users of Xen, as they end up finding similar information in similar places. The onus is on you to provide a reason why we should deliberately do something different. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] libxc: Defer initialization of start_page for HVM guests
On 12/22/2015 09:36 AM, Roger Pau Monné wrote: El 22/12/15 a les 15.12, Boris Ostrovsky ha escrit: On 12/22/2015 04:42 AM, Roger Pau Monné wrote: Also, why is everything done inside of alloc_magic_pages_hvm moved to start_info_hvm? AFAICT we should only need to move the code that fills the hvm_start_info struct, but the rest of the code already present in alloc_magic_pages_hvm could be left as-is. That's what I was referring to in the commit message (and in the comment below it): most, if not all, of the stuff that alloc_magic_pages_hvm() currently does really has nothing to do with magic pages allocation. E.g. hvm_params settting, special pages initialization, etc. So I figured I could move all of this to a later point. But, as I said in the comment, I am not convinced that start_info() is the right place either. Some of it has to do with magic pages IMHO (although I have to admit this is probably a question of taste), for example it makes sense from my PoV to allocate the hvm_info_page, the special pages (xenstore, console...), the ioreq server pages and possibly the EPT identity map. Moving the start_info stuff into it's own function also makes sense. Since there are many ways to separate this I'll wait for maintainers to express their preference. (I don't know who is around now and I will disappear too on Friday until January so this may have to wait until then). -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv6 2/2] x86/ept: defer the invalidation until the p2m lock is released
On 22/12/15 14:20, David Vrabel wrote: > On 22/12/15 14:01, Andrew Cooper wrote: >> On 22/12/15 12:23, George Dunlap wrote: >>> On 18/12/15 13:50, David Vrabel wrote: Holding the p2m lock while calling ept_sync_domain() is very expensive since it does a on_selected_cpus() call. IPIs on many socket machines can be very slows and on_selected_cpus() is serialized. It is safe to defer the invalidate until the p2m lock is released except for two cases: 1. When freeing a page table page (since partial translations may be cached). 2. When reclaiming a zero page as part of PoD. For these cases, add p2m_tlb_flush_sync() calls which will immediately perform the invalidate before the page is freed or reclaimed. >>> There are at least two other places in the PoD code where the "remove -> >>> check -> add to cache -> unlock" pattern exist; and it looks to me like >>> there are other places where races might occur (e.g., >>> p2m_paging_evict(), which does remove -> scrub -> put -> unlock; >>> p2m_altp2m_propagate_change(), which does remove -> put -> unlock). >>> >>> Part of me wonders whether, rather than making callers that need it >>> remember to do a flush, it might be better to explicitly pass in >>> P2M_FLUSH or P2M_CAN_DEFER when calling p2m_set_entry, just to make >>> people think about the fact that the p2m change may not actually take >>> effect until later. Any thoughts on that? >>> >>> Comments on the current approach inline. >>> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c index c094320..43c7f1b 100644 --- a/xen/arch/x86/mm/p2m-ept.c +++ b/xen/arch/x86/mm/p2m-ept.c @@ -263,6 +263,7 @@ static void ept_free_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry, int l unmap_domain_page(epte); } +p2m_tlb_flush_sync(p2m); p2m_free_ptp(p2m, mfn_to_page(ept_entry->mfn)); >>> It's probably worth a comment here pointing out that even if this >>> function is called several times (e.g., if you replace a load of 4k >>> entries with a 1G entry), the actual flush will only happen the first time. >>> +static void ept_flush_and_unlock(struct p2m_domain *p2m, bool_t unlock) +{ +p2m->need_flush = 0; +if ( unlock ) +mm_write_unlock(&p2m->lock); +ept_sync_domain_mask(p2m, p2m->domain->domain_dirty_cpumask); } >>> Having a function called "flush_and_unlock", with a boolean as to >>> whether to unlock or not, just seems a bit wonky. >>> >>> Wouldn't it make more sense to have the hook just named "flush_sync()", >>> and move the unlocking out in the generic p2m code (where you already >>> have the check for need_flush)? >>> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h index fa46dd9..9c394c2 100644 --- a/xen/include/asm-x86/p2m.h +++ b/xen/include/asm-x86/p2m.h @@ -261,6 +261,10 @@ struct p2m_domain { unsigned long gfn, l1_pgentry_t *p, l1_pgentry_t new, unsigned int level); long (*audit_p2m)(struct p2m_domain *p2m); +void (*flush_and_unlock)(struct p2m_domain *p2m, bool_t unlock); + +unsigned int defer_flush; +bool_t need_flush; >>> It's probably worth a comment that at the moment calling >>> flush_and_unlock() is gated on need_flush; so it's OK not to implement >>> flush_and_unlock() as long as you never set need_flush. >> >> This is just one small accident (in code elsewhere) away from a code >> injection vulnerability. >> >> Either we should require that all function pointers are filled in, or >> BUG() if the pointer is missing when we attempt to use it. > > Jan asked for the call to be conditional on need_flush and to not test > flush_and_unlock. Then perhaps the other paging modes should point this to a function which calls BUG()? Or perhaps a noop -- no point in crashing a machine in production because you don't need to actually do a flush. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [linux-next test] 66804: regressions - trouble: blocked/broken/fail/pass
flight 66804 linux-next real [real] http://logs.test-lab.xenproject.org/osstest/logs/66804/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-i386-pvops 5 kernel-build fail REGR. vs. 66614 build-amd64-pvops 5 kernel-build fail REGR. vs. 66614 Regressions which are regarded as allowable (not blocking): test-armhf-armhf-libvirt 3 host-install(3) broken REGR. vs. 66614 test-armhf-armhf-xl-rtds 6 xen-boot fail like 66614 test-armhf-armhf-xl-arndale 6 xen-boot fail like 66614 test-armhf-armhf-xl-vhd 6 xen-boot fail like 66614 test-armhf-armhf-xl-multivcpu 6 xen-boot fail like 66614 test-armhf-armhf-xl-credit2 6 xen-boot fail like 66614 test-armhf-armhf-libvirt-xsm 6 xen-boot fail like 66614 test-armhf-armhf-xl-cubietruck 6 xen-boot fail like 66614 test-armhf-armhf-libvirt-raw 6 xen-boot fail like 66614 test-armhf-armhf-xl 6 xen-boot fail like 66614 test-armhf-armhf-xl-xsm 6 xen-boot fail like 66614 test-armhf-armhf-libvirt-qcow2 6 xen-boot fail like 66614 Tests which did not succeed, but are not blocking: test-amd64-i386-rumpuserxen-i386 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-rumpuserxen-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-pvh-amd 1 build-check(1) blocked n/a test-amd64-amd64-xl-pvh-intel 1 build-check(1) blocked n/a test-amd64-amd64-xl 1 build-check(1) blocked n/a test-amd64-i386-libvirt-pair 1 build-check(1) blocked n/a test-amd64-amd64-xl-multivcpu 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-intel 1 build-check(1) blocked n/a test-amd64-amd64-pygrub 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-xl-credit2 1 build-check(1) blocked n/a test-amd64-i386-xl-xsm1 build-check(1) blocked n/a test-amd64-amd64-libvirt-pair 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-i386-xl1 build-check(1) blocked n/a test-amd64-amd64-xl-xsm 1 build-check(1) blocked n/a test-amd64-amd64-i386-pvgrub 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 1 build-check(1)blocked n/a test-amd64-i386-qemut-rhel6hvm-intel 1 build-check(1) blocked n/a test-amd64-amd64-amd64-pvgrub 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 1 build-check(1)blocked n/a test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-rtds 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemut-debianhvm-amd64 1 build-check(1)blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-debianhvm-amd64 1 build-check(1) blocked n/a test-amd64-amd64-qemuu-nested-intel 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qcow2 1 build-check(1) blocked n/a test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-qemut-rhel6hvm-amd 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-amd 1 build-check(1) blocked n/a test-amd64-i386-pair 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-i386 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-raw1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-qemuu-nested-amd 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-pair 1 build-check(1) block
Re: [Xen-devel] Xen Security Advisory 155 (CVE-2015-8550) - paravirtualized drivers incautious about shared memory
The XSA mentions that "PV frontend patches will be developed and released (publicly) after the embargo date." Has anything been done towards this that should also be incorporated into MiniOS? On a system utilizing a "driver domain," where a backend is running on a domain that is considered unprivileged and untrusted (such as the example described in http://wiki.xenproject.org/wiki/Driver_Domain), it seems XSA-155-style double fetch vulnerabilities in the frontends are also a potential security concern, and should be eliminated. However, perhaps that does not include pcifront, since pciback would always be running in dom0. Eric On Tue, Dec 22, 2015 at 7:24 AM, Stefano Stabellini wrote: > MiniOS for QEMU stubdom has frontends, such as mini-os/blkfront.c and > mini-os/netfront.c, not backends. > > Cheers, > > Stefano > > > On Mon, 21 Dec 2015, Eric Shelton wrote: >> Seeing as "All OSes providing PV backends are susceptible," doesn't this >> include MiniOS for QEMU stubdom as well? >> Are there patches available for mini-os/blkfront.c, mini-os/netfront.c, and >> mini-os/pcifront.c? I didn't see >> anything for this. >> Best, >> Eric >> >> On Thu, Dec 17, 2015 at 1:36 PM, Xen.org security team >> wrote: >> >> - Topal: Output generated on Tue Dec 22 12:23:44 GMT 2015 - >> Topal: GPG output starts - gpg: >> no valid OpenPGP data found. gpg: processing message failed: eof - >> Topal: GPG output ends - >> - Topal: Original message starts - -BEGIN PGP SIGNED >> MESSAGE- >> Hash: SHA1 >> >> Xen Security Advisory CVE-2015-8550 / XSA-155 >> version 6 >> >> paravirtualized drivers incautious about shared memory contents >> >> UPDATES IN VERSION 6 >> >> >> Correct CREDITS section. >> >> ISSUE DESCRIPTION >> = >> >> The compiler can emit optimizations in the PV backend drivers which >> can lead to double fetch vulnerabilities. Specifically the shared >> memory between the frontend and backend can be fetched twice (during >> which time the frontend can alter the contents) possibly leading to >> arbitrary code execution in backend. >> >> IMPACT >> == >> >> Malicious guest administrators can cause denial of service. If driver >> domains are not in use, the impact can be a host crash, or privilege >> escalation. >> >> VULNERABLE SYSTEMS >> == >> >> Systems running PV or HVM guests are vulnerable. >> >> ARM and x86 systems are vulnerable. >> >> All OSes providing PV backends are susceptible, this includes >> Linux and NetBSD. By default the Linux distributions compile kernels >> with optimizations. >> >> MITIGATION >> == >> >> There is no mitigation. >> >> CREDITS >> === >> >> This issue was discovered by Felix Wilhelm (ERNW Research, KIT / >> Operating Systems Group). >> >> RESOLUTION >> == >> >> Applying the appropriate attached patches should fix the problem for >> PV backends. Note only that PV backends are fixed; PV frontend >> patches will be developed and released (publicly) after the embargo >> date. >> >> Please note that there is a bug in some versions of gcc, >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 which can cause the >> construct used in RING_COPY_REQUEST() to be ineffective in some >> circumstances. We have determined that this is only the case when the >> structure being copied consists purely of bitfields. The Xen PV >> protocols updated here do not use bitfields in this way and therefore >> these patches are not subject to that bug. However authors of third >> party PV protocols should take this into consideration. >> >> Linux v4.4: >> xsa155-linux-xsa155-0001-xen-Add-RING_COPY_REQUEST.patch >> >> xsa155-linux-xsa155-0002-xen-netback-don-t-use-last-request-to-determine-mini.patch >> >> xsa155-linux-xsa155-0003-xen-netback-use-RING_COPY_REQUEST-throughout.patch >> >> xsa155-linux-xsa155-0004-xen-blkback-only-read-request-operation-from-shared-.patch >> >> xsa155-linux-xsa155-0005-xen-blkback-read-from-indirect-descriptors-only-once.patch >> xsa155-linux-xsa155-0006-xen-scsiback-safely-copy-requests.patch >> >> xsa155-linux-xsa155-0007-xen-pciback-Save-xen_pci_op-commands-before-processi.patch >> Linux v4.[0,1,2,3] >> All the above patches except #5 will apply, please use: >> >> xsa155-linux43-0005-xen-blkback-read-from-indirect-descriptors-only-once.patch >> Linux v3.19: >> All the above patches except #5 and #6 will apply, please use: >> >> xsa155-linux43-0005-xen-blkback-read-from-indirect-descriptors-only-once.patch >> xsa155-linux319-0006-
Re: [Xen-devel] [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
On 22/12/15 13:02, Jan Beulich wrote: On 22.12.15 at 11:30, wrote: > > I dislike having to repeat this: Please trim your Cc lists. > >> --- a/xen/arch/x86/mm/guest_walk.c >> +++ b/xen/arch/x86/mm/guest_walk.c >> @@ -90,6 +90,57 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, >> int set_dirty) >> return 0; >> } >> >> +extern bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec, >> +uint32_t pte_flags, uint32_t pte_pkey); >> +#if GUEST_PAGING_LEVELS == CONFIG_PAGING_LEVELS >> +bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec, >> +uint32_t pte_flags, uint32_t pte_pkey) >> +{ > > See my comments on the previous version. Please avoid sending new > versions without having addressed all comments on the previous one > (verbally or by code changes). Having done the suggested change > just partially (by removing the #ifdef-s from the call sites) you now > do the key check universally, and things remain correct just because > of the long mode check in the middle of the function. > >> +unsigned int pkru = 0; >> +bool_t pkru_ad, pkru_wd; >> + >> +bool_t pf = !!(pfec & PFEC_page_present); > > There's still this stray blank line above (and I continue to wonder > whether you really need all these boolean variables many of which > get used just once). I suspect the "stray blank line" was added for readability. But I agree that I'd prefer not to use local boolean variables, and just to put the flag checking inline. > >> +bool_t uf = !!(pfec & PFEC_user_mode); >> +bool_t wf = !!(pfec & PFEC_write_access); >> +bool_t ff = !!(pfec & PFEC_insn_fetch); >> +bool_t rsvdf = !!(pfec & PFEC_reserved_bit); >> + >> +/* When page isn't present, PKEY isn't checked. */ >> +if ( !pf || is_pv_vcpu(vcpu) ) >> +return 0; >> + >> +/* >> + * PKU: additional mechanism by which the paging controls >> + * access to user-mode addresses based on the value in the >> + * PKRU register. A fault is considered as a PKU violation if all >> + * of the following conditions are ture: *true >> + * 1.CR4_PKE=1. >> + * 2.EFER_LMA=1. >> + * 3.page is present with no reserved bit violations. >> + * 4.the access is not an instruction fetch. >> + * 5.the access is to a user page. >> + * 6.PKRU.AD=1 >> + * or The access is a data write and PKRU.WD=1 >> + *and either CR0.WP=1 or it is a user access. >> + */ >> +if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) || >> +rsvdf || ff || !(pte_flags & _PAGE_USER) ) And I think you might as well make this one line per condition, something like this: if ( is_pv_vcpu(vcpu) || !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) !(pfec & PFEC_page_present) || (pfec & (PFEC_insn_fetch|PFEC_reserved_bit)) || !(pte_flags & _PAGE_USER) ) return 0; -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
On 22/12/15 10:30, Huaitong Han wrote: > Protection keys define a new 4-bit protection key field(PKEY) in bits 62:59 of > leaf entries of the page tables. > > PKRU register defines 32 bits, there are 16 domains and 2 attribute bits per > domain in pkru, for each i (0 ≤ i ≤ 15), PKRU[2i] is the access-disable bit > for > protection key i (ADi); PKRU[2i+1] is the write-disable bit for protection key > i (WDi). PKEY is index to a defined domain. > > A fault is considered as a PKU violation if all of the following conditions > are > ture: > 1.CR4_PKE=1. > 2.EFER_LMA=1. > 3.Page is present with no reserved bit violations. > 4.The access is not an instruction fetch. > 5.The access is to a user page. > 6.PKRU.AD=1 > or The access is a data write and PKRU.WD=1 > and either CR0.WP=1 or it is a user access. Thanks, I think this architecture is much better. More comments inline. > diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c > index 18d1acf..9cdd607 100644 > --- a/xen/arch/x86/mm/guest_walk.c > +++ b/xen/arch/x86/mm/guest_walk.c > @@ -90,6 +90,57 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, > int set_dirty) > return 0; > } > > +extern bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec, > +uint32_t pte_flags, uint32_t pte_pkey); > +#if GUEST_PAGING_LEVELS == CONFIG_PAGING_LEVELS > +bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec, > +uint32_t pte_flags, uint32_t pte_pkey) > +{ > +unsigned int pkru = 0; > +bool_t pkru_ad, pkru_wd; > + > +bool_t pf = !!(pfec & PFEC_page_present); > +bool_t uf = !!(pfec & PFEC_user_mode); > +bool_t wf = !!(pfec & PFEC_write_access); > +bool_t ff = !!(pfec & PFEC_insn_fetch); > +bool_t rsvdf = !!(pfec & PFEC_reserved_bit); > + > +/* When page isn't present, PKEY isn't checked. */ > +if ( !pf || is_pv_vcpu(vcpu) ) > +return 0; > + > +/* > + * PKU: additional mechanism by which the paging controls > + * access to user-mode addresses based on the value in the > + * PKRU register. A fault is considered as a PKU violation if all > + * of the following conditions are ture: > + * 1.CR4_PKE=1. > + * 2.EFER_LMA=1. > + * 3.page is present with no reserved bit violations. > + * 4.the access is not an instruction fetch. > + * 5.the access is to a user page. > + * 6.PKRU.AD=1 > + * or The access is a data write and PKRU.WD=1 > + *and either CR0.WP=1 or it is a user access. > + */ > +if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) || > +rsvdf || ff || !(pte_flags & _PAGE_USER) ) > +return 0; See my comment in response to Jan's comment. > @@ -199,6 +252,9 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m, > > pse1G = (gflags & _PAGE_PSE) && guest_supports_1G_superpages(v); > > +if ( pse1G && pkey_fault(v, pfec, gflags, pkey) ) > +rc |= _PAGE_PKEY_BITS; > + > if ( pse1G ) So it looks like you're only doing a pkey check on the 'leaf' ptes, is that right? In which case we have a lot of duplication here -- duplicate pse1G / pse2M checks, and exact copies of the call to pkey_fault(). Would it make sense to move the pkey_fault() check down below the set_ad label? -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] retain the configuration that Xen was built with
This should retain the .config file from the Kconfig process so that we know how this build of Xen was configured. Signed-off-by: Doug Goldstein --- ts-xen-build | 1 + 1 file changed, 1 insertion(+) diff --git a/ts-xen-build b/ts-xen-build index b02e737..80b1faa 100755 --- a/ts-xen-build +++ b/ts-xen-build @@ -192,6 +192,7 @@ sub stash () { } built_stash_file($ho, $builddir, "xen-syms", "xen/xen/xen-syms", 1); built_stash_file($ho, $builddir, "xen-config", "xen/.config", 1); +built_stash_file($ho, $builddir, "xen-hv-config", "xen/xen/.config", 1); built_stash_file($ho, $builddir, "seabios-config", "xen/tools/firmware/seabios-dir-remote/.config", 1); built_compress_stashed("xen-syms"); -- 2.4.10 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RESEND v4] arm: remove !CPU_V6 and !GENERIC_ATOMIC64 build dependencies for XEN
On Tue, 22 Dec 2015, Russell King - ARM Linux wrote: > On Tue, Dec 22, 2015 at 02:17:17PM +, Stefano Stabellini wrote: > > Hello Russell, > > > > Would you please consider this patch for the next merge window? It > > is a very old patch (March 2014) which has been left over. > > This patch has some obvious problems... > > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > > index 34e1569..e09ed94 100644 > > --- a/arch/arm/Kconfig > > +++ b/arch/arm/Kconfig > > @@ -1807,8 +1807,7 @@ config XEN_DOM0 > > config XEN > > bool "Xen guest support on ARM" > > depends on ARM && AEABI && OF > > - depends on CPU_V7 && !CPU_V6 > > - depends on !GENERIC_ATOMIC64 > > + depends on CPU_V7 > > How sure are we that this won't cause regressions? How much testing > has been done with these changed dependencies? I did build-test a range of combinations of those options, sorry for not spotting the error below. > > diff --git a/arch/arm/include/asm/sync_bitops.h > > b/arch/arm/include/asm/sync_bitops.h > > index 9732b8e..4103319 100644 > > --- a/arch/arm/include/asm/sync_bitops.h > > +++ b/arch/arm/include/asm/sync_bitops.h > > @@ -20,7 +20,29 @@ > > #define sync_test_and_clear_bit(nr, p) _test_and_clear_bit(nr, p) > > #define sync_test_and_change_bit(nr, p)_test_and_change_bit(nr, p) > > #define sync_test_bit(nr, addr)test_bit(nr, addr) > > -#define sync_cmpxchg cmpxchg > > > > +static inline unsigned long sync_cmpxchg(volatile void *ptr, > > + > > unsigned long old, > > + > > unsigned long new) > > +{ > > + unsigned long oldval; > > + int size = sizeof(*(ptr)); > > This is buggy. You're doing sizeof(void) here, which on GCC will always > be 1: > >A consequence of this is that `sizeof' is also allowed on `void' and > on function types, and returns 1. You are right, thank you very much for looking at the patch and finding this bug. I think the issue can be solved with something like: +#define sync_cmpxchg(ptr, o, n) ({ \ + (__typeof(*ptr))__sync_cmpxchg((ptr), \ + (unsigned long)(o), \ + (unsigned long)(n), \ + sizeof(*(ptr)));\ +}) +static inline unsigned long __sync_cmpxchg(volatile void *ptr, + unsigned long old, + unsigned long new, + int size) +{ > > + > > + smp_mb(); > > + switch (size) { > > + case 1: > > + oldval = __cmpxchg8(ptr, old, new); > > + break; > > + case 2: > > + oldval = __cmpxchg16(ptr, old, new); > > + break; > > The ldrexb/ldrexh instructions are not available on ARMv6 CPUs. __xchg() > and friends avoided these for ARMv6 CPUs, but this does not. > > I'd expect anything that uses sync_cmpxchg() will fail to build when a > kernel including ARMv6 is attempted. The code builds, but of course it could not actually run on CPU_V6. But the use case for me is building drivers/xen/grant-table.c, which can only run on CPU_V7 anyway, so it is not a problem. Is that acceptable for you? If not, do you have any other suggestions? > So... I don't think this patch is ready. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] retain the configuration that Xen was built with
On 12/22/15 9:44 AM, Doug Goldstein wrote: > This should retain the .config file from the Kconfig process so that we > know how this build of Xen was configured. > > Signed-off-by: Doug Goldstein > --- > ts-xen-build | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/ts-xen-build b/ts-xen-build > index b02e737..80b1faa 100755 > --- a/ts-xen-build > +++ b/ts-xen-build > @@ -192,6 +192,7 @@ sub stash () { > } > built_stash_file($ho, $builddir, "xen-syms", "xen/xen/xen-syms", 1); > built_stash_file($ho, $builddir, "xen-config", "xen/.config", 1); > +built_stash_file($ho, $builddir, "xen-hv-config", "xen/xen/.config", 1); > built_stash_file($ho, $builddir, "seabios-config", >"xen/tools/firmware/seabios-dir-remote/.config", 1); > built_compress_stashed("xen-syms"); > This is for osstest. I forgot to include that in the subject. -- Doug Goldstein signature.asc Description: OpenPGP digital signature ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RESEND v4] arm: remove !CPU_V6 and !GENERIC_ATOMIC64 build dependencies for XEN
On Tue, Dec 22, 2015 at 03:45:19PM +, Stefano Stabellini wrote: > The code builds, but of course it could not actually run on CPU_V6. But > the use case for me is building drivers/xen/grant-table.c, which can > only run on CPU_V7 anyway, so it is not a problem. What I'm concerned about in this case is if you try to use instructions which are not supported by the CPU, even in assembly, the assembler will barf. So, if we're building an ARMv6 + ARMv7 kernel, and you try to use LDREXB in generic code, even though you may intend it to only be executed on ARMv7, the assembler will error out. We used to be able to work around that by passing -Wa,-march=armv7 to the compiler for specific files, which would then tell the assembler to accept ARMv7 instructions, but modern GCC output .arch pseudo-ops which override the command line. So now the only way to do it is to override the assemblers selected archtecture using .arch pseudo-ops in the assembly code. So, what I'm saying is that dropping the !ARMv6 dependency is not as simple as it would seem, and I'm nervous about including any such patches this close to Christmas and the merge window. I've recently been through a run of "apply this patch, no it's wrong, drop it, apply the next version, no it's still wrong, drop it again." So, today is the last day I'm accepting _known_ good patches into my tree for the 4.5 merge window. This is not yet a known good patch because it hasn't been through several iterations of testing - and I don't want to be adding and removing patches from my tree over Christmas. I'd rather leave my tree in a known good state before Christmas so that those who want to fiddle with the kernel over the holiday break can do so without having the hassle of trees containing partially tested patches. -- RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] build: save generated config in /boot
>>> On 22.12.15 at 15:46, wrote: > On 22/12/15 12:52, Jan Beulich wrote: > On 22.12.15 at 13:45, wrote: >>> On 12/21/15 9:35 AM, Jan Beulich wrote: >>> On 21.12.15 at 16:20, wrote: > On 12/21/15 8:51 AM, Jan Beulich wrote: > On 21.12.15 at 15:35, wrote: >>> On 12/21/15 6:11 AM, Jan Beulich wrote: >>> On 18.12.15 at 22:35, wrote: > Since we now support changing Xen options with Kconfig, we should save > the configuration that was used to build up Xen. This will save it in > /boot alongside the installed xen.gz and call it > xen-$(FULLVERSION).config > > Suggested-by: Ian Campbell > Signed-off-by: Doug Goldstein > --- > xen/Makefile | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/xen/Makefile b/xen/Makefile > index 9023863..460b977 100644 > --- a/xen/Makefile > +++ b/xen/Makefile > @@ -58,6 +58,7 @@ _install: $(TARGET)$(CONFIG_XEN_INSTALL_SUFFIX) > ln -f -s $(T)-$(XEN_FULLVERSION)$(Z) $(D)$(BOOT_DIR)/$(T)$(Z) > [ -d "$(D)$(DEBUG_DIR)" ] || $(INSTALL_DIR) $(D)$(DEBUG_DIR) > $(INSTALL_DATA) $(TARGET)-syms > $(D)$(DEBUG_DIR)/$(T)-syms-$(XEN_FULLVERSION) > + $(INSTALL_DATA) $(KCONFIG_CONFIG) >>> $(D)$(BOOT_DIR)/$(T)-$(XEN_FULLVERSION).config Was it really suggested to put this into /boot? It has no business being there... >>> Yes. By multiple people. Ian Campbell was the first person to suggest it >>> in that location. >> Okay, so I've looked it up, and no, he didn't. He just gave this as one >> possibility: >> >> "It occurred to me this morning that we probably ought to stash the >> .config >> somewhere on install in such a way that it can be associated with the >> Xen >> binary (i.e. with the same full suffix as the binary itself, not the >> abridged symlink names), maybe as $(BOOT_DIR)/$(T)- >> $(XEN_FULLVERSION).config?" >> >> But yes, I'm sorry for not noticing this as an undesirable place right >> away. > Ok well I'm at a loss here because the quote clearly shows him > suggesting that location. Do you have a suggested location because so > far I've just got a no from you on the only suggested location. Match the xen-syms location? >>> I guess I fail to grasp the rationale behind not putting it in /boot. >> It's the other way around really - you'd have to provide a reason >> (other than "Linux does so too") for putting it in /boot. > > I disagree. > > Xen being consistent with Linux in this regard is in the best interest > of the users of Xen, as they end up finding similar information in > similar places. > > The onus is on you to provide a reason why we should deliberately do > something different. I'm sorry, but no - why would we slavishly follow what Linux does, no matter whether it makes sense? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] build: save generated config in /boot
On 12/22/15 9:59 AM, Jan Beulich wrote: On 22.12.15 at 15:46, wrote: >> On 22/12/15 12:52, Jan Beulich wrote: >> On 22.12.15 at 13:45, wrote: On 12/21/15 9:35 AM, Jan Beulich wrote: On 21.12.15 at 16:20, wrote: >> On 12/21/15 8:51 AM, Jan Beulich wrote: >> On 21.12.15 at 15:35, wrote: On 12/21/15 6:11 AM, Jan Beulich wrote: On 18.12.15 at 22:35, wrote: >> Since we now support changing Xen options with Kconfig, we should >> save >> the configuration that was used to build up Xen. This will save it in >> /boot alongside the installed xen.gz and call it >> xen-$(FULLVERSION).config >> >> Suggested-by: Ian Campbell >> Signed-off-by: Doug Goldstein >> --- >> xen/Makefile | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/xen/Makefile b/xen/Makefile >> index 9023863..460b977 100644 >> --- a/xen/Makefile >> +++ b/xen/Makefile >> @@ -58,6 +58,7 @@ _install: $(TARGET)$(CONFIG_XEN_INSTALL_SUFFIX) >> ln -f -s $(T)-$(XEN_FULLVERSION)$(Z) $(D)$(BOOT_DIR)/$(T)$(Z) >> [ -d "$(D)$(DEBUG_DIR)" ] || $(INSTALL_DIR) $(D)$(DEBUG_DIR) >> $(INSTALL_DATA) $(TARGET)-syms >> $(D)$(DEBUG_DIR)/$(T)-syms-$(XEN_FULLVERSION) >> +$(INSTALL_DATA) $(KCONFIG_CONFIG) $(D)$(BOOT_DIR)/$(T)-$(XEN_FULLVERSION).config > Was it really suggested to put this into /boot? It has no business > being there... Yes. By multiple people. Ian Campbell was the first person to suggest it in that location. >>> Okay, so I've looked it up, and no, he didn't. He just gave this as one >>> possibility: >>> >>> "It occurred to me this morning that we probably ought to stash the >>> .config >>> somewhere on install in such a way that it can be associated with the >>> Xen >>> binary (i.e. with the same full suffix as the binary itself, not the >>> abridged symlink names), maybe as $(BOOT_DIR)/$(T)- >>> $(XEN_FULLVERSION).config?" >>> >>> But yes, I'm sorry for not noticing this as an undesirable place right >>> away. >> Ok well I'm at a loss here because the quote clearly shows him >> suggesting that location. Do you have a suggested location because so >> far I've just got a no from you on the only suggested location. > Match the xen-syms location? I guess I fail to grasp the rationale behind not putting it in /boot. >>> It's the other way around really - you'd have to provide a reason >>> (other than "Linux does so too") for putting it in /boot. >> >> I disagree. >> >> Xen being consistent with Linux in this regard is in the best interest >> of the users of Xen, as they end up finding similar information in >> similar places. >> >> The onus is on you to provide a reason why we should deliberately do >> something different. > > I'm sorry, but no - why would we slavishly follow what Linux does, no > matter whether it makes sense? > > Jan > How does it not make sense in this case? That's what Andrew and I are asking you to explain. -- Doug Goldstein signature.asc Description: OpenPGP digital signature ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RESEND v4] arm: remove !CPU_V6 and !GENERIC_ATOMIC64 build dependencies for XEN
On Tue, 22 Dec 2015, Russell King - ARM Linux wrote: > On Tue, Dec 22, 2015 at 03:45:19PM +, Stefano Stabellini wrote: > > The code builds, but of course it could not actually run on CPU_V6. But > > the use case for me is building drivers/xen/grant-table.c, which can > > only run on CPU_V7 anyway, so it is not a problem. > > What I'm concerned about in this case is if you try to use instructions > which are not supported by the CPU, even in assembly, the assembler > will barf. > > So, if we're building an ARMv6 + ARMv7 kernel, and you try to use > LDREXB in generic code, even though you may intend it to only be > executed on ARMv7, the assembler will error out. > > We used to be able to work around that by passing -Wa,-march=armv7 > to the compiler for specific files, which would then tell the assembler > to accept ARMv7 instructions, but modern GCC output .arch pseudo-ops > which override the command line. So now the only way to do it is to > override the assemblers selected archtecture using .arch pseudo-ops > in the assembly code. > > So, what I'm saying is that dropping the !ARMv6 dependency is not as > simple as it would seem, and I'm nervous about including any such > patches this close to Christmas and the merge window. Sure, that's understandable. Merge window aside, do you think that for the future it might be better to change approach and write a Xen specific implementation of sync_cmpxchg? Like this older patch tried to do http://marc.info/?l=linux-arm-kernel&m=138436406724990&w=2 ? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] build: save generated config in /boot
>>> On 22.12.15 at 17:02, wrote: > How does it not make sense in this case? That's what Andrew and I are > asking you to explain. But I already explained it: The file isn't needed for booting. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC 0/3] xen/arm: Support XENMEM_exchange
>>> On 17.12.15 at 17:31, wrote: > I've looked at reworking XENMEM_exchange to avoid mfn_to_gmfn. My idea would > be to allocate a temporary array to store the GFN between the two loops. > However, the array would be quite large (the max default is 18 on ARM), > so I don't know if it's acceptable. This would quite clearly not be acceptable, yet I don't have an alternative suggestion other then implementing M2P on ARM, which you have already named as not an option. Otoh I'm not sure what two loops you talk about - I find exactly one use of mfn_to_gmfn() in memory_exchange(), and latching the GFN earlier would be wrong (as it might change in parallel to your processing). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC 2/3] xen/common: memory: Add support for direct mapped domain in XEMEM_exchange
>>> On 17.12.15 at 17:31, wrote: > Direct mapped domain needs to retrieve the exact same underlying > physical page when the region is re-populated. In which case - what sense does it make for such a domain to exchange memory (the primary purpose of which is to get _different_ memory populated at the given GFN (range))? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC 3/3] xen/common: memory: Move steal_page in common code
>>> On 17.12.15 at 17:31, wrote: > --- a/xen/common/page_alloc.c > +++ b/xen/common/page_alloc.c > @@ -1786,6 +1786,57 @@ int assign_pages( > return -1; > } > > +int steal_page( > +struct domain *d, struct page_info *page, unsigned int memflags) > +{ I think it would better go into common/memory.c - there's no allocation involved here. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PULL 0/4] xen-2015-12-22
The following changes since commit c3626ca7df027dabf0568284360a23faf18f0884: Update version for v2.5.0-rc3 release (2015-12-07 17:47:40 +) are available in the git repository at: git://xenbits.xen.org/people/sstabellini/qemu-dm.git tags/xen-2015-12-22 for you to fetch changes up to fc3e493bc8e96ef4bf7ae4f035f43cb39382c936: xen_disk: treat "vhd" as "vpc" (2015-12-11 17:02:37 +) Xen 2015/12/22 Jan Beulich (3): xen/MSI-X: latch MSI-X table writes xen/MSI-X: really enforce alignment xen/pass-through: correctly deal with RW1C bits Stefano Stabellini (1): xen_disk: treat "vhd" as "vpc" hw/block/xen_disk.c |3 ++ hw/xen/xen_pt.h |6 ++- hw/xen/xen_pt_config_init.c | 40 +++- hw/xen/xen_pt_msi.c | 86 --- 4 files changed, 60 insertions(+), 75 deletions(-) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PULL 1/4] xen/MSI-X: latch MSI-X table writes
From: Jan Beulich The remaining log message in pci_msix_write() is wrong, as there guest behavior may only appear to be wrong: For one, the old logic didn't take the mask-all bit into account. And then this shouldn't depend on host device state (i.e. the host may have masked the entry without the guest having done so). Plus these writes shouldn't be dropped even when an entry gets unmasked. Instead, if they can't be made take effect right away, they should take effect on the next unmasking or enabling operation - the specification explicitly describes such caching behavior. Signed-off-by: Jan Beulich Signed-off-by: Stefano Stabellini Reviewed-by: Stefano Stabellini --- hw/xen/xen_pt.h |4 +-- hw/xen/xen_pt_config_init.c |2 ++ hw/xen/xen_pt_msi.c | 74 --- 3 files changed, 32 insertions(+), 48 deletions(-) diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h index c545280..4f922f4 100644 --- a/hw/xen/xen_pt.h +++ b/hw/xen/xen_pt.h @@ -187,13 +187,13 @@ typedef struct XenPTMSIXEntry { int pirq; uint64_t addr; uint32_t data; -uint32_t vector_ctrl; +uint32_t latch[4]; bool updated; /* indicate whether MSI ADDR or DATA is updated */ -bool warned; /* avoid issuing (bogus) warning more than once */ } XenPTMSIXEntry; typedef struct XenPTMSIX { uint32_t ctrl_offset; bool enabled; +bool maskall; int total_entries; int bar_index; uint64_t table_base; diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c index 3d8686d..4fd7206 100644 --- a/hw/xen/xen_pt_config_init.c +++ b/hw/xen/xen_pt_config_init.c @@ -1499,6 +1499,8 @@ static int xen_pt_msixctrl_reg_write(XenPCIPassthroughState *s, xen_pt_msix_disable(s); } +s->msix->maskall = *val & PCI_MSIX_FLAGS_MASKALL; + debug_msix_enabled_old = s->msix->enabled; s->msix->enabled = !!(*val & PCI_MSIX_FLAGS_ENABLE); if (s->msix->enabled != debug_msix_enabled_old) { diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c index 82de2bc..e10cd2a 100644 --- a/hw/xen/xen_pt_msi.c +++ b/hw/xen/xen_pt_msi.c @@ -25,6 +25,7 @@ #define XEN_PT_GFLAGSSHIFT_DELIV_MODE 12 #define XEN_PT_GFLAGSSHIFT_TRG_MODE 15 +#define latch(fld) latch[PCI_MSIX_ENTRY_##fld / sizeof(uint32_t)] /* * Helpers @@ -314,7 +315,8 @@ static int msix_set_enable(XenPCIPassthroughState *s, bool enabled) enabled); } -static int xen_pt_msix_update_one(XenPCIPassthroughState *s, int entry_nr) +static int xen_pt_msix_update_one(XenPCIPassthroughState *s, int entry_nr, + uint32_t vec_ctrl) { XenPTMSIXEntry *entry = NULL; int pirq; @@ -332,6 +334,19 @@ static int xen_pt_msix_update_one(XenPCIPassthroughState *s, int entry_nr) pirq = entry->pirq; +/* + * Update the entry addr and data to the latest values only when the + * entry is masked or they are all masked, as required by the spec. + * Addr and data changes while the MSI-X entry is unmasked get deferred + * until the next masked -> unmasked transition. + */ +if (pirq == XEN_PT_UNASSIGNED_PIRQ || s->msix->maskall || +(vec_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT)) { +entry->addr = entry->latch(LOWER_ADDR) | + ((uint64_t)entry->latch(UPPER_ADDR) << 32); +entry->data = entry->latch(DATA); +} + rc = msi_msix_setup(s, entry->addr, entry->data, &pirq, true, entry_nr, entry->pirq == XEN_PT_UNASSIGNED_PIRQ); if (rc) { @@ -357,7 +372,7 @@ int xen_pt_msix_update(XenPCIPassthroughState *s) int i; for (i = 0; i < msix->total_entries; i++) { -xen_pt_msix_update_one(s, i); +xen_pt_msix_update_one(s, i, msix->msix_entry[i].latch(VECTOR_CTRL)); } return 0; @@ -406,35 +421,15 @@ int xen_pt_msix_update_remap(XenPCIPassthroughState *s, int bar_index) static uint32_t get_entry_value(XenPTMSIXEntry *e, int offset) { -switch (offset) { -case PCI_MSIX_ENTRY_LOWER_ADDR: -return e->addr & UINT32_MAX; -case PCI_MSIX_ENTRY_UPPER_ADDR: -return e->addr >> 32; -case PCI_MSIX_ENTRY_DATA: -return e->data; -case PCI_MSIX_ENTRY_VECTOR_CTRL: -return e->vector_ctrl; -default: -return 0; -} +return !(offset % sizeof(*e->latch)) + ? e->latch[offset / sizeof(*e->latch)] : 0; } static void set_entry_value(XenPTMSIXEntry *e, int offset, uint32_t val) { -switch (offset) { -case PCI_MSIX_ENTRY_LOWER_ADDR: -e->addr = (e->addr & ((uint64_t)UINT32_MAX << 32)) | val; -break; -case PCI_MSIX_ENTRY_UPPER_ADDR: -e->addr = (uint64_t)val << 32 | (e->addr & UINT32_MAX); -break; -case PCI_MSIX_ENTRY_DATA: -e->data = val; -break; -case PCI_MSIX_ENTRY_VECTOR_CTRL: -e->vector_ctrl = val; -break; +if (!(offset % sizeof(*e
[Xen-devel] [PULL 2/4] xen/MSI-X: really enforce alignment
From: Jan Beulich The way the generic infrastructure works the intention of not allowing unaligned accesses can't be achieved by simply setting .unaligned to false. The benefit is that we can now replace the conditionals in {get,set}_entry_value() by assert()-s. Signed-off-by: Jan Beulich Signed-off-by: Stefano Stabellini Reviewed-by: Stefano Stabellini --- hw/xen/xen_pt_msi.c | 22 -- 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c index e10cd2a..302a310 100644 --- a/hw/xen/xen_pt_msi.c +++ b/hw/xen/xen_pt_msi.c @@ -421,16 +421,14 @@ int xen_pt_msix_update_remap(XenPCIPassthroughState *s, int bar_index) static uint32_t get_entry_value(XenPTMSIXEntry *e, int offset) { -return !(offset % sizeof(*e->latch)) - ? e->latch[offset / sizeof(*e->latch)] : 0; +assert(!(offset % sizeof(*e->latch))); +return e->latch[offset / sizeof(*e->latch)]; } static void set_entry_value(XenPTMSIXEntry *e, int offset, uint32_t val) { -if (!(offset % sizeof(*e->latch))) -{ -e->latch[offset / sizeof(*e->latch)] = val; -} +assert(!(offset % sizeof(*e->latch))); +e->latch[offset / sizeof(*e->latch)] = val; } static void pci_msix_write(void *opaque, hwaddr addr, @@ -494,6 +492,12 @@ static uint64_t pci_msix_read(void *opaque, hwaddr addr, } } +static bool pci_msix_accepts(void *opaque, hwaddr addr, + unsigned size, bool is_write) +{ +return !(addr & (size - 1)); +} + static const MemoryRegionOps pci_msix_ops = { .read = pci_msix_read, .write = pci_msix_write, @@ -502,7 +506,13 @@ static const MemoryRegionOps pci_msix_ops = { .min_access_size = 4, .max_access_size = 4, .unaligned = false, +.accepts = pci_msix_accepts }, +.impl = { +.min_access_size = 4, +.max_access_size = 4, +.unaligned = false +} }; int xen_pt_msix_init(XenPCIPassthroughState *s, uint32_t base) -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PULL 3/4] xen/pass-through: correctly deal with RW1C bits
From: Jan Beulich Introduce yet another mask for them, so that the generic routine can handle them, at once rendering xen_pt_pmcsr_reg_write() superfluous. Signed-off-by: Jan Beulich Signed-off-by: Stefano Stabellini Reviewed-by: Stefano Stabellini --- hw/xen/xen_pt.h |2 ++ hw/xen/xen_pt_config_init.c | 38 -- 2 files changed, 14 insertions(+), 26 deletions(-) diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h index 4f922f4..3749711 100644 --- a/hw/xen/xen_pt.h +++ b/hw/xen/xen_pt.h @@ -113,6 +113,8 @@ struct XenPTRegInfo { uint32_t res_mask; /* reg read only field mask (ON:RO/ROS, OFF:other) */ uint32_t ro_mask; +/* reg read/write-1-clear field mask (ON:RW1C/RW1CS, OFF:other) */ +uint32_t rw1c_mask; /* reg emulate field mask (ON:emu, OFF:passthrough) */ uint32_t emu_mask; xen_pt_conf_reg_init init; diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c index 4fd7206..185a698 100644 --- a/hw/xen/xen_pt_config_init.c +++ b/hw/xen/xen_pt_config_init.c @@ -179,7 +179,8 @@ static int xen_pt_byte_reg_write(XenPCIPassthroughState *s, XenPTReg *cfg_entry, *data = XEN_PT_MERGE_VALUE(*val, *data, writable_mask); /* create value for writing to I/O device register */ -*val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask); +*val = XEN_PT_MERGE_VALUE(*val, dev_value & ~reg->rw1c_mask, + throughable_mask); return 0; } @@ -197,7 +198,8 @@ static int xen_pt_word_reg_write(XenPCIPassthroughState *s, XenPTReg *cfg_entry, *data = XEN_PT_MERGE_VALUE(*val, *data, writable_mask); /* create value for writing to I/O device register */ -*val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask); +*val = XEN_PT_MERGE_VALUE(*val, dev_value & ~reg->rw1c_mask, + throughable_mask); return 0; } @@ -215,7 +217,8 @@ static int xen_pt_long_reg_write(XenPCIPassthroughState *s, XenPTReg *cfg_entry, *data = XEN_PT_MERGE_VALUE(*val, *data, writable_mask); /* create value for writing to I/O device register */ -*val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask); +*val = XEN_PT_MERGE_VALUE(*val, dev_value & ~reg->rw1c_mask, + throughable_mask); return 0; } @@ -633,6 +636,7 @@ static XenPTRegInfo xen_pt_emu_reg_header0[] = { .init_val = 0x, .res_mask = 0x0007, .ro_mask= 0x06F8, +.rw1c_mask = 0xF900, .emu_mask = 0x0010, .init = xen_pt_status_reg_init, .u.w.read = xen_pt_word_reg_read, @@ -944,6 +948,7 @@ static XenPTRegInfo xen_pt_emu_reg_pcie[] = { .size = 2, .res_mask = 0xFFC0, .ro_mask= 0x0030, +.rw1c_mask = 0x000F, .init = xen_pt_common_reg_init, .u.w.read = xen_pt_word_reg_read, .u.w.write = xen_pt_word_reg_write, @@ -964,6 +969,7 @@ static XenPTRegInfo xen_pt_emu_reg_pcie[] = { .offset = PCI_EXP_LNKSTA, .size = 2, .ro_mask= 0x3FFF, +.rw1c_mask = 0xC000, .init = xen_pt_common_reg_init, .u.w.read = xen_pt_word_reg_read, .u.w.write = xen_pt_word_reg_write, @@ -1000,27 +1006,6 @@ static XenPTRegInfo xen_pt_emu_reg_pcie[] = { * Power Management Capability */ -/* write Power Management Control/Status register */ -static int xen_pt_pmcsr_reg_write(XenPCIPassthroughState *s, - XenPTReg *cfg_entry, uint16_t *val, - uint16_t dev_value, uint16_t valid_mask) -{ -XenPTRegInfo *reg = cfg_entry->reg; -uint16_t writable_mask = 0; -uint16_t throughable_mask = get_throughable_mask(s, reg, valid_mask); -uint16_t *data = cfg_entry->ptr.half_word; - -/* modify emulate register */ -writable_mask = reg->emu_mask & ~reg->ro_mask & valid_mask; -*data = XEN_PT_MERGE_VALUE(*val, *data, writable_mask); - -/* create value for writing to I/O device register */ -*val = XEN_PT_MERGE_VALUE(*val, dev_value & ~PCI_PM_CTRL_PME_STATUS, - throughable_mask); - -return 0; -} - /* Power Management Capability reg static information table */ static XenPTRegInfo xen_pt_emu_reg_pm[] = { /* Next Pointer reg */ @@ -1051,11 +1036,12 @@ static XenPTRegInfo xen_pt_emu_reg_pm[] = { .size = 2, .init_val = 0x0008, .res_mask = 0x00F0, -.ro_mask= 0xE10C, +.ro_mask= 0x610C, +.rw1c_mask = 0x8000, .emu_mask = 0x810B, .init = xen_pt_common_reg_init, .u.w.read = xen_pt_word_reg_read, -.u.w.write = xen_pt_pmcsr_reg_write, +.u.w.write = xen_pt_word_reg_write, }, { .size = 0, -- 1.7.10.4 ___ Xen-devel mailing list Xen-dev
[Xen-devel] [PULL 4/4] xen_disk: treat "vhd" as "vpc"
The Xen toolstack uses "vhd" to specify a disk in VHD format, however the name of the driver in QEMU is "vpc". Replace "vhd" with "vpc", so that QEMU can find the right driver to use for it. Signed-off-by: Stefano Stabellini --- hw/block/xen_disk.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index 8146650..a48e726 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -825,6 +825,9 @@ static int blk_init(struct XenDevice *xendev) if (!strcmp("aio", blkdev->fileproto)) { blkdev->fileproto = "raw"; } +if (!strcmp("vhd", blkdev->fileproto)) { +blkdev->fileproto = "vpc"; +} if (blkdev->mode == NULL) { blkdev->mode = xenstore_read_be_str(&blkdev->xendev, "mode"); } -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] x86/VPMU: Check more carefully which bits are allowed to be written to MSRs
Current Intel VPMU emulation needs to perform more checks when writing PMU MSRs on guest's behalf: * MSR_CORE_PERF_GLOBAL_CTRL is not checked at all * MSR_CORE_PERF_FIXED_CTR_CTRL has more reserved bits in PMU version 2 * MSR_CORE_PERF_GLOBAL_OVF_CTRL's bit 61 is allowed on versions greater * than 2. We can also use precomputed mask in core2_vpmu_do_interrupt(). Signed-off-by: Boris Ostrovsky --- xen/arch/x86/cpu/vpmu_intel.c | 18 -- 1 files changed, 16 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/cpu/vpmu_intel.c b/xen/arch/x86/cpu/vpmu_intel.c index 3eff1ae..8dc0b48 100644 --- a/xen/arch/x86/cpu/vpmu_intel.c +++ b/xen/arch/x86/cpu/vpmu_intel.c @@ -87,7 +87,7 @@ static unsigned int __read_mostly arch_pmc_cnt, fixed_pmc_cnt; /* Masks used for testing whether and MSR is valid */ #define ARCH_CTRL_MASK (~((1ull << 32) - 1) | (1ull << 21)) static uint64_t __read_mostly fixed_ctrl_mask, fixed_counters_mask; -static uint64_t __read_mostly global_ovf_ctrl_mask; +static uint64_t __read_mostly global_ovf_ctrl_mask, global_ctrl_mask; /* Total size of PMU registers block (copied to/from PV(H) guest) */ static unsigned int __read_mostly regs_sz; @@ -392,6 +392,8 @@ static int core2_vpmu_verify(struct vcpu *v) if ( core2_vpmu_cxt->global_ovf_ctrl & global_ovf_ctrl_mask ) return -EINVAL; +if ( core2_vpmu_cxt->global_ctrl & global_ctrl_mask ) +return -EINVAL; fixed_ctrl = core2_vpmu_cxt->fixed_ctrl; if ( fixed_ctrl & fixed_ctrl_mask ) @@ -627,6 +629,8 @@ static int core2_vpmu_do_wrmsr(unsigned int msr, uint64_t msr_content, gdprintk(XENLOG_WARNING, "Guest setting of DTS is ignored.\n"); return 0; case MSR_CORE_PERF_GLOBAL_CTRL: +if ( msr_content & global_ctrl_mask ) +return -EINVAL; core2_vpmu_cxt->global_ctrl = msr_content; break; case MSR_CORE_PERF_FIXED_CTR_CTRL: @@ -820,7 +824,7 @@ static int core2_vpmu_do_interrupt(struct cpu_user_regs *regs) if ( is_pmc_quirk ) handle_pmc_quirk(msr_content); core2_vpmu_cxt->global_status |= msr_content; -msr_content = 0xC007 | ((1 << arch_pmc_cnt) - 1); +msr_content = ~global_ovf_ctrl_mask; wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, msr_content); } else @@ -1001,10 +1005,20 @@ int __init core2_vpmu_init(void) full_width_write = (caps >> 13) & 1; fixed_ctrl_mask = ~((1ull << (fixed_pmc_cnt * FIXED_CTR_CTRL_BITS)) - 1); +if ( version == 2 ) +fixed_ctrl_mask |= 0x444; fixed_counters_mask = ~((1ull << core2_get_bitwidth_fix_count()) - 1); +global_ctrl_mask = ~1ULL << fixed_pmc_cnt) - 1) << 32) | + ((1ULL << arch_pmc_cnt) - 1)); global_ovf_ctrl_mask = ~(0xC000 | (((1ULL << fixed_pmc_cnt) - 1) << 32) | ((1ULL << arch_pmc_cnt) - 1)); +if ( version > 2) +/* + * Even though we don't support Uncore counters guests should be + * able to clear all available overflows. + */ +global_ovf_ctrl_mask &= ~(1ULL << 61); regs_sz = (sizeof(struct xen_pmu_intel_ctxt) - regs_off) + sizeof(uint64_t) * fixed_pmc_cnt + -- 1.7.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC 01/31] xen/public: Export featureset information in the public API
>>> On 16.12.15 at 22:24, wrote: > +/* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC001, FSMAX+2 */ > +#define X86_FEATURE_XSTORE ((FSMAX+1)*32+ 2) /* on-CPU RNG present (xstore > insn) */ > +#define X86_FEATURE_XSTORE_EN((FSMAX+1)*32+ 3) /* on-CPU RNG enabled > */ > +#define X86_FEATURE_XCRYPT ((FSMAX+1)*32+ 6) /* on-CPU crypto (xcrypt > insn) */ > +#define X86_FEATURE_XCRYPT_EN((FSMAX+1)*32+ 7) /* on-CPU crypto > enabled */ > +#define X86_FEATURE_ACE2 ((FSMAX+1)*32+ 8) /* Advanced Cryptography > Engine v2 */ > +#define X86_FEATURE_ACE2_EN ((FSMAX+1)*32+ 9) /* ACE v2 enabled */ > +#define X86_FEATURE_PHE ((FSMAX+1)*32+10) /* PadLock Hash > Engine */ > +#define X86_FEATURE_PHE_EN ((FSMAX+1)*32+11) /* PHE enabled */ > +#define X86_FEATURE_PMM ((FSMAX+1)*32+12) /* PadLock Montgomery > Multiplier */ > +#define X86_FEATURE_PMM_EN ((FSMAX+1)*32+13) /* PMM enabled */ None of these get consumed anywhere - if you already touch this code, just drop all of them. > +/* > + * CPUID leaf shorthand: > + * - optional 'e', Extended (0x8xxx) > + * - leaf, uppercase hex > + * - register, lowercase > + * - optional subleaf, uppercase hex > + */ > +#define XEN_FEATURESET_1d 0 /* 0x0001.edx */ > +#define XEN_FEATURESET_1c 1 /* 0x0001.ecx */ > +#define XEN_FEATURESET_e1d2 /* 0x8001.edx */ > +#define XEN_FEATURESET_e1c3 /* 0x8001.ecx */ > +#define XEN_FEATURESET_Da14 /* 0x000d:1.eax*/ > +#define XEN_FEATURESET_7b05 /* 0x0007:0.ebx*/ This ends up being pretty cryptic. > +#define XEN_NR_FEATURESET_ENTRIES (XEN_FEATURESET_7b0 + 1) This shouldn't be exposed, as it will necessarily change sooner or later. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC 02/31] tools/libxc: Use public/featureset.h for cpuid policy generation
>>> On 16.12.15 at 22:24, wrote: > --- a/xen/include/public/arch-x86/featureset.h > +++ b/xen/include/public/arch-x86/featureset.h > @@ -163,6 +163,7 @@ > > /* Intel-defined CPU features, CPUID level 0x0007:0.ebx, word 5 */ > #define X86_FEATURE_FSGSBASE ( 5*32+ 0) /* {RD,WR}{FS,GS}BASE > instructions */ > +#define X86_FEATURE_TSC_ADJUST( 5*32+ 1) /* TSC_ADJUST MSR available */ This would probably better go into patch 1. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC 03/31] xen/x86: Store antifeatures inverted in a featureset
>>> On 16.12.15 at 22:24, wrote: > --- /dev/null > +++ b/xen/arch/x86/cpuid/cpuid-private.h > @@ -0,0 +1,27 @@ > +#ifdef __XEN__ > + > +#include > + > +#include > + > +#else > + > +# error TODO for userspace I suppose your intentions with this will become apparent in later patches? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2] memory-hotplug: add automatic onlining policy for the newly added memory
Currently, all newly added memory blocks remain in 'offline' state unless someone onlines them, some linux distributions carry special udev rules like: SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online" to make this happen automatically. This is not a great solution for virtual machines where memory hotplug is being used to address high memory pressure situations as such onlining is slow and a userspace process doing this (udev) has a chance of being killed by the OOM killer as it will probably require to allocate some memory. Introduce default policy for the newly added memory blocks in /sys/devices/system/memory/hotplug_autoonline file with two possible values: "offline" which preserves the current behavior and "online" which causes all newly added memory blocks to go online as soon as they're added. The default is "online" when MEMORY_HOTPLUG_AUTOONLINE kernel config option is selected. Cc: Jonathan Corbet Cc: Greg Kroah-Hartman Cc: Daniel Kiper Cc: Dan Williams Cc: Tang Chen Cc: David Vrabel Cc: David Rientjes Cc: Andrew Morton Cc: Naoya Horiguchi Cc: Xishi Qiu Cc: Mel Gorman Cc: "K. Y. Srinivasan" Cc: Igor Mammedov Cc: Kay Sievers Cc: Konrad Rzeszutek Wilk Cc: Boris Ostrovsky Signed-off-by: Vitaly Kuznetsov --- - Changes since 'v1': Add 'online' parameter to add_memory_resource() as it is being used by xen ballon driver and it adds "empty" memory pages [David Vrabel]. (I don't completely understand what prevents manual onlining in this case as we still have all newly added blocks in sysfs ... this is the discussion point.) - Changes since 'RFC': It seems nobody is strongly opposed to the idea, thus non-RFC. Change memhp_autoonline to bool, we support only MMOP_ONLINE_KEEP and MMOP_OFFLINE for the auto-onlining policy, eliminate 'unknown' from show_memhp_autoonline(). [Daniel Kiper] Put everything under CONFIG_MEMORY_HOTPLUG_AUTOONLINE, enable the feature by default (when the config option is selected) and add kernel parameter (nomemhp_autoonline) to disable the functionality upon boot when needed. - RFC: I was able to find previous attempts to fix the issue, e.g.: http://marc.info/?l=linux-kernel&m=137425951924598&w=2 http://marc.info/?l=linux-acpi&m=127186488905382 but I'm not completely sure why it didn't work out and the solution I suggest is not 'smart enough', thus 'RFC'. --- Documentation/kernel-parameters.txt | 2 ++ Documentation/memory-hotplug.txt| 26 -- drivers/base/memory.c | 36 drivers/xen/balloon.c | 2 +- include/linux/memory_hotplug.h | 6 +- mm/Kconfig | 9 + mm/memory_hotplug.c | 25 +++-- 7 files changed, 96 insertions(+), 10 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 742f69d..652efe1 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2537,6 +2537,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted. shutdown the other cpus. Instead use the REBOOT_VECTOR irq. + nomemhp_autoonline Don't automatically online newly added memory. + nomoduleDisable module load nopat [X86] Disable PAT (page attribute table extension of diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt index ce2cfcf..041efac 100644 --- a/Documentation/memory-hotplug.txt +++ b/Documentation/memory-hotplug.txt @@ -111,8 +111,9 @@ To use memory hotplug feature, kernel must be compiled with following config options. - For all memory hotplug -Memory model -> Sparse Memory (CONFIG_SPARSEMEM) -Allow for memory hot-add (CONFIG_MEMORY_HOTPLUG) +Memory model -> Sparse Memory (CONFIG_SPARSEMEM) +Allow for memory hot-add (CONFIG_MEMORY_HOTPLUG) +Automatically online hot-added memory (CONFIG_MEMORY_HOTPLUG_AUTOONLINE) - To enable memory removal, the followings are also necessary Allow for memory hot remove(CONFIG_MEMORY_HOTREMOVE) @@ -254,12 +255,25 @@ If the memory block is online, you'll read "online". If the memory block is offline, you'll read "offline". -5.2. How to online memory +5.2. Memory onlining -Even if the memory is hot-added, it is not at ready-to-use state. -For using newly added memory, you have to "online" the memory block. +When the memory is hot-added, the kernel decides whether or not to "online" +it according to the policy which can be read from "hotplug_autoonline" file +(requires CONFIG_MEMORY_HOTPLUG_AUTOONLINE): -For onlining, you have to write "online" to the memory block's state file as: +% cat /sys/devices/system/memory/hotplug_autoonline + +The default is "online" which means the newly added memory will
Re: [Xen-devel] [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
On 22/12/15 10:30, Huaitong Han wrote: > Protection keys define a new 4-bit protection key field(PKEY) in bits 62:59 of > leaf entries of the page tables. > > PKRU register defines 32 bits, there are 16 domains and 2 attribute bits per > domain in pkru, for each i (0 ≤ i ≤ 15), PKRU[2i] is the access-disable bit > for > protection key i (ADi); PKRU[2i+1] is the write-disable bit for protection key > i (WDi). PKEY is index to a defined domain. > > A fault is considered as a PKU violation if all of the following conditions > are > ture: > 1.CR4_PKE=1. > 2.EFER_LMA=1. > 3.Page is present with no reserved bit violations. > 4.The access is not an instruction fetch. > 5.The access is to a user page. > 6.PKRU.AD=1 > or The access is a data write and PKRU.WD=1 > and either CR0.WP=1 or it is a user access. One comment you didn't address from v3, however: "At the moment PFEC_insn_fetch is only set in hvm_fetch_from_guest_virt() if hvm_nx_enabled() or hvm_smep_enabled() are true. Which means that if you *don't* have nx or smep enabled, then the patch series as is will fault on instruction fetches when it shouldn't. (I don't remember anyone mentioning nx or smep being enabled as a prerequisite for pkeys.)" I think realistically the only way to address this is to start making the clean separation between "pfec in" and "pfec out" I mentioned in the previous discussion. I've coded up the attached patch, but only compile-tested it. Can you give it a look to see if you think it is correct, test it, include it in your next patch series? Thanks, -George From b8ec8e14c670215587a4e74c3aaec8dab6f26a8c Mon Sep 17 00:00:00 2001 From: George Dunlap Date: Tue, 22 Dec 2015 16:17:22 + Subject: [PATCH] xen/mm: Clean up pfec handling in gva_to_gfn At the moment, the pfec argument to gva_to_gfn has two functions: * To inform guest_walk what kind of access is happenind * As a value to pass back into the guest in the event of a fault. Unfortunately this is not quite treated consistently: the hvm_fetch_* function will "pre-clear" the PFEC_insn_fetch flag before calling gva_to_gfn; meaning guest_walk doesn't actually know whether a given access is an instruction fetch or not. This works now, but will cause issues when pkeys are introduced, since guest_walk will need to know whether an access is an instruction fetch even if it doesn't return PFEC_insn_fetch. Fix this by making a clean separation for in and out functionalities of the pfec argument: 1. Always pass in the access type to gva_to_gfn 2. Filter out inappropriate access flags before returning from gva_to_gfn. (The PFEC_insn_fetch flag should only be passed to the guest if either NX or SMEP is enabled. See Intel 64 Developer's Manual, Volume 3, Section 4.7.) Signed-off-by: George Dunlap --- xen/arch/x86/hvm/hvm.c | 8 ++-- xen/arch/x86/mm/hap/guest_walk.c | 10 +- xen/arch/x86/mm/shadow/multi.c | 6 ++ 3 files changed, 17 insertions(+), 7 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 08cef1f..aa1f64f 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -4423,11 +4423,9 @@ enum hvm_copy_result hvm_copy_from_guest_virt( enum hvm_copy_result hvm_fetch_from_guest_virt( void *buf, unsigned long vaddr, int size, uint32_t pfec) { -if ( hvm_nx_enabled(current) || hvm_smep_enabled(current) ) -pfec |= PFEC_insn_fetch; return __hvm_copy(buf, vaddr, size, HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt, - PFEC_page_present | pfec); + PFEC_page_present | PFEC_insn_fetch | pfec); } enum hvm_copy_result hvm_copy_to_guest_virt_nofault( @@ -4449,11 +4447,9 @@ enum hvm_copy_result hvm_copy_from_guest_virt_nofault( enum hvm_copy_result hvm_fetch_from_guest_virt_nofault( void *buf, unsigned long vaddr, int size, uint32_t pfec) { -if ( hvm_nx_enabled(current) || hvm_smep_enabled(current) ) -pfec |= PFEC_insn_fetch; return __hvm_copy(buf, vaddr, size, HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt, - PFEC_page_present | pfec); + PFEC_page_present | PFEC_insn_fetch | pfec); } unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len) diff --git a/xen/arch/x86/mm/hap/guest_walk.c b/xen/arch/x86/mm/hap/guest_walk.c index 11c1b35..2dce111 100644 --- a/xen/arch/x86/mm/hap/guest_walk.c +++ b/xen/arch/x86/mm/hap/guest_walk.c @@ -82,7 +82,7 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)( if ( !top_page ) { pfec[0] &= ~PFEC_page_present; -return INVALID_GFN; +goto out_tweak_pfec; } top_mfn = _mfn(page_to_mfn(top_page)); @@ -136,6 +136,14 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)( if ( missing & _PAGE_SHARED ) pfec[0] = PFEC_page_shared; +out_tweak_pfec: +/* + * Intel 64 Volume 3, Section 4.7:
Re: [Xen-devel] [PATCH RFC 04/31] xen/x86: Mask out unknown features from Xen's capabilities
>>> On 16.12.15 at 22:24, wrote: > --- a/xen/arch/x86/cpu/common.c > +++ b/xen/arch/x86/cpu/common.c > @@ -324,6 +324,7 @@ void __cpuinit identify_cpu(struct cpuinfo_x86 *c) >* we do "generic changes." >*/ > for (i = 0; i < XEN_NR_FEATURESET_ENTRIES; ++i) { > + c->x86_capability[i] &= known_features[i]; > c->x86_capability[i] ^= inverted_features[i]; > } Assert that no unknown features slipped into the inverted ones? But see also below. > --- a/xen/arch/x86/cpuid/cpuid-private.h > +++ b/xen/arch/x86/cpuid/cpuid-private.h > @@ -10,6 +10,32 @@ > > #endif > > +/* Mask of bits which are shared between 1d and e1d. */ > +#define SHARED_1d \ > +(cpufeat_mask(X86_FEATURE_FPU) | \ > + cpufeat_mask(X86_FEATURE_VME) | \ > + cpufeat_mask(X86_FEATURE_DE)| \ > + cpufeat_mask(X86_FEATURE_PSE) | \ > + cpufeat_mask(X86_FEATURE_TSC) | \ > + cpufeat_mask(X86_FEATURE_MSR) | \ > + cpufeat_mask(X86_FEATURE_PAE) | \ > + cpufeat_mask(X86_FEATURE_MCE) | \ > + cpufeat_mask(X86_FEATURE_CX8) | \ > + cpufeat_mask(X86_FEATURE_APIC) | \ > + cpufeat_mask(X86_FEATURE_MTRR) | \ > + cpufeat_mask(X86_FEATURE_PGE) | \ > + cpufeat_mask(X86_FEATURE_MCA) | \ > + cpufeat_mask(X86_FEATURE_CMOV) | \ > + cpufeat_mask(X86_FEATURE_PAT) | \ > + cpufeat_mask(X86_FEATURE_PSE36) | \ > + cpufeat_mask(X86_FEATURE_MMX) | \ > + cpufeat_mask(X86_FEATURE_FXSR)) These are shared for AMD, but zero in the extended leaf for Intel. > --- a/xen/arch/x86/cpuid/cpuid.c > +++ b/xen/arch/x86/cpuid/cpuid.c > @@ -1,5 +1,110 @@ > #include "cpuid-private.h" > > +const uint32_t known_features[XEN_NR_FEATURESET_ENTRIES] = > +{ > +[cpufeat_word(X86_FEATURE_FPU)] = (SHARED_1d | > + cpufeat_mask(X86_FEATURE_SEP) | I can see how you try to remove fixed relationship, but using FPU in the array index when no FPU appear in the list is at least confusing. Looking at the rest, adding a new feature will become quite a bit more involved. I think we need some better abstraction here, e.g. a mechanism similar to the one I used in public/errno.h or the one Linux uses to populate its syscall tables or /proc/cpuinfo's feature list to allow multiple inclusion of a single list of definitions where all properties of each individual feature are maintained on one line. The tables in this file would then be derived from this (perhaps via further steps of machine processing). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC 01/31] xen/public: Export featureset information in the public API
On 22/12/15 16:28, Jan Beulich wrote: On 16.12.15 at 22:24, wrote: >> +/* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC001, FSMAX+2 >> */ >> +#define X86_FEATURE_XSTORE ((FSMAX+1)*32+ 2) /* on-CPU RNG present (xstore >> insn) */ >> +#define X86_FEATURE_XSTORE_EN ((FSMAX+1)*32+ 3) /* on-CPU RNG enabled >> */ >> +#define X86_FEATURE_XCRYPT ((FSMAX+1)*32+ 6) /* on-CPU crypto (xcrypt >> insn) */ >> +#define X86_FEATURE_XCRYPT_EN ((FSMAX+1)*32+ 7) /* on-CPU crypto >> enabled */ >> +#define X86_FEATURE_ACE2((FSMAX+1)*32+ 8) /* Advanced Cryptography >> Engine v2 */ >> +#define X86_FEATURE_ACE2_EN ((FSMAX+1)*32+ 9) /* ACE v2 enabled */ >> +#define X86_FEATURE_PHE ((FSMAX+1)*32+10) /* PadLock Hash >> Engine */ >> +#define X86_FEATURE_PHE_EN ((FSMAX+1)*32+11) /* PHE enabled */ >> +#define X86_FEATURE_PMM ((FSMAX+1)*32+12) /* PadLock Montgomery >> Multiplier */ >> +#define X86_FEATURE_PMM_EN ((FSMAX+1)*32+13) /* PMM enabled */ > None of these get consumed anywhere - if you already touch this > code, just drop all of them. X86_FEATURE_XSTORE gets used in xen/arch/x86/cpu/centaur.c to stash word 0xC001, but nothing actually reads the stashed values. I could do a precursor patch which drops the stashing of this information, which will result in NCAPINTS getting shorter by the end of the series. > >> +/* >> + * CPUID leaf shorthand: >> + * - optional 'e', Extended (0x8xxx) >> + * - leaf, uppercase hex >> + * - register, lowercase >> + * - optional subleaf, uppercase hex >> + */ >> +#define XEN_FEATURESET_1d 0 /* 0x0001.edx */ >> +#define XEN_FEATURESET_1c 1 /* 0x0001.ecx */ >> +#define XEN_FEATURESET_e1d2 /* 0x8001.edx */ >> +#define XEN_FEATURESET_e1c3 /* 0x8001.ecx */ >> +#define XEN_FEATURESET_Da14 /* 0x000d:1.eax*/ >> +#define XEN_FEATURESET_7b05 /* 0x0007:0.ebx*/ > This ends up being pretty cryptic. Perhaps at a first glance, but there are so many uses that a shorthand really is needed. See especially the MSR masking patches towards the end of the series. > >> +#define XEN_NR_FEATURESET_ENTRIES (XEN_FEATURESET_7b0 + 1) > This shouldn't be exposed, as it will necessarily change sooner or later. Hmm yes - I think I can alter where this lives, although libxc does need to be able to get this value. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC XEN v1 09/14] xen: arm: Save and restore GIC state.
On Thu, 17 Dec 2015, Ian Campbell wrote: > > > +/* Save PPI and SGI states (per-CPU) */ > > > +spin_lock(&v->arch.vgic.lock); > > > > vgic_lock > > Ah, yes, this function was originally in save.c so didn't have access, but > now it is in the right place I should use all the correct functions. > > > Is the domain already paused at this point? If so, maybe we could get > > away without locking? > > Maybe we could think about it hard enough to rationalise that, but it seems > safer and easier to just follow the locking rules regardless. > > > > +int vgic_irq_save_core(struct vcpu *v, unsigned irq, struct hvm_hw_irq > > > *s) > > > +{ > > > +struct pending_irq *p = irq_to_pending(v, irq); > > > +struct vgic_irq_rank *rank = vgic_rank_irq(v, p->irq); > > > + > > > +const bool enabled = test_bit(GIC_IRQ_GUEST_ENABLED, &p->status); > > > +const bool queued = test_bit(GIC_IRQ_GUEST_QUEUED, &p->status); > > > +const bool visible = test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); > > > + > > > +spin_lock(&rank->lock); > > > > vgic_rank_lock. Maybe we need to take the vgic lock because there might > > be incoming hardware interrupts for the domain but I don't think we need > > the rank lock, do we? > > As above I think it is simpler and safer to just follow the usual rules > rather than to special case things. I don't think we take the rank lock with the vgic lock held anywhere in the code though. It would be good to avoid nested locking unless necessary. > > +{ > > > +/* > > > + * This uses the current IPRIORITYR, which may differ from > > > + * when the IRQ was actually made pending. h/w spec probably > > > + * allows this, check > > > + */ > > > > From this VM point of view the IPRIORITYR is s->priority. I would remove > > the comment. > > If you have an IRQ with priority 55 which fires and then while it is > pending the priority is changed to 75 then I'm not sure what impact that > has on the priority which the hardware considers the already pending > interrupt to have. > > To make it more complex consider if there was a second interrupt IRQ > with priority 65, if the handler for that was the one changed IRQ's > priority should it expect to get immediately preempted by IRQ > > The comment is there because I'm not sure what behaviour the spec requires. The spec says: It is IMPLEMENTATION DEFINED whether changing the value of a priority field changes the priority of an active interrupt. We could infer that changing the priority of pending interrupts is supposed to be supported. I would only write: * This uses the current IPRIORITYR, which may differ from * when the IRQ was actually made pending. > In terms of our code without the save restore I think the pending IRQ > would remain at priority 55 and the guest would get IRQ when IRQ > EOIs. Whereas upon restore we would reraise it with priority 75 and IRQ > would appear to preempt IRQ, which would be an interesting difference in > behaviour from the PoV of the guest. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC 05/31] xen/x86: Collect more CPUID feature words
>>> On 16.12.15 at 22:24, wrote: > --- a/xen/arch/x86/cpu/amd.c > +++ b/xen/arch/x86/cpu/amd.c > @@ -481,7 +481,7 @@ static void __devinit init_amd(struct cpuinfo_x86 *c) > > if (c->extended_cpuid_level >= 0x8007) { > c->x86_power = cpuid_edx(0x8007); > - if (c->x86_power & (1<<8)) { > + if (c->x86_power & cpufeat_mask(X86_FEATURE_ITSC)) { generic_identify() has already run by the time we get here, so no need to re-read cpuid_edx() (interestingly enough you leverage this in init_intel()). > --- a/xen/arch/x86/cpuid/cpuid.c > +++ b/xen/arch/x86/cpuid/cpuid.c > @@ -103,6 +103,12 @@ const uint32_t known_features[XEN_NR_FEATURESET_ENTRIES] > = > cpufeat_mask(X86_FEATURE_RDSEED) > | > cpufeat_mask(X86_FEATURE_ADX) > | > cpufeat_mask(X86_FEATURE_SMAP)), > + > +[cpufeat_word(X86_FEATURE_PREFETCHWT1)] = > (cpufeat_mask(X86_FEATURE_PREFETCHWT1)), > + > +[cpufeat_word(X86_FEATURE_ITSC)] = (cpufeat_mask(X86_FEATURE_ITSC)), > + > +[cpufeat_word(X86_FEATURE_CLZERO)] = (cpufeat_mask(X86_FEATURE_CLZERO)), > }; Are these indeed the only ones in their groups that can be safely marked "known"? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC 06/31] xen/x86: Infrastructure to calculate guest featuresets
>>> On 16.12.15 at 22:24, wrote: > xen/arch/x86/Makefile | 1 + > xen/arch/x86/cpuid.c| 23 +++ > xen/arch/x86/setup.c| 3 +++ > xen/include/asm-x86/cpuid.h | 22 ++ > 4 files changed, 49 insertions(+) > create mode 100644 xen/arch/x86/cpuid.c You already introduced xen/arch/x86/cpuid/ in patch 3 - why don't you put this file there, as e.g. guest.c? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/8] xen: arm: fix typo in the description of struct pending_irq->desc
On Tue, 10 Nov 2015, Ian Campbell wrote: > s/it/if/ makes more sense. > > Signed-off-by: Ian Campbell Acked-by: Stefano Stabellini > xen/include/asm-arm/vgic.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h > index cb51a9e..7d580cc 100644 > --- a/xen/include/asm-arm/vgic.h > +++ b/xen/include/asm-arm/vgic.h > @@ -66,7 +66,7 @@ struct pending_irq > #define GIC_IRQ_GUEST_ENABLED 3 > #define GIC_IRQ_GUEST_MIGRATING 4 > unsigned long status; > -struct irq_desc *desc; /* only set it the irq corresponds to a physical > irq */ > +struct irq_desc *desc; /* only set if the irq corresponds to a physical > irq */ > unsigned int irq; > #define GIC_INVALID_LR ~(uint8_t)0 > uint8_t lr; > -- > 2.1.4 > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/8] xen: arm: fix indendation of struct vtimer
On Tue, 10 Nov 2015, Ian Campbell wrote: > Signed-off-by: Ian Campbell Acked-by: Stefano Stabellini > xen/include/asm-arm/domain.h | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h > index e7e40da..c56f06e 100644 > --- a/xen/include/asm-arm/domain.h > +++ b/xen/include/asm-arm/domain.h > @@ -35,11 +35,11 @@ extern int dom0_11_mapping; > #define is_domain_direct_mapped(d) ((d) == hardware_domain && > dom0_11_mapping) > > struct vtimer { > -struct vcpu *v; > -int irq; > -struct timer timer; > -uint32_t ctl; > -uint64_t cval; > +struct vcpu *v; > +int irq; > +struct timer timer; > +uint32_t ctl; > +uint64_t cval; > }; > > struct arch_domain > -- > 2.1.4 > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC 07/31] xen/x86: Export host featureset via SYSCTL
>>> On 16.12.15 at 22:24, wrote: > @@ -190,6 +191,56 @@ long arch_do_sysctl( > } > break; > > +case XEN_SYSCTL_get_featureset: > +{ > +uint32_t *featureset; const > +unsigned int nr; > + > +/* Request for maximum number of features? */ > +if ( guest_handle_is_null(sysctl->u.featureset.features) ) > +{ > +sysctl->u.featureset.nr_features = XEN_NR_FEATURESET_ENTRIES; > +if ( __copy_to_guest(u_sysctl, sysctl, 1) ) __copy_field_to_guest()? > +ret = -EFAULT; > +break; > +} > + > +/* Clip the number of entries. */ > +nr = min_t(unsigned int, sysctl->u.featureset.nr_features, > + XEN_NR_FEATURESET_ENTRIES); Let's try to avoid min_t() wherever possible. > --- a/xen/include/public/sysctl.h > +++ b/xen/include/public/sysctl.h > @@ -764,6 +764,24 @@ struct xen_sysctl_tmem_op { > typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t; > DEFINE_XEN_GUEST_HANDLE(xen_sysctl_tmem_op_t); > > +/* > + * XEN_SYSCTL_get_featureset (x86 specific) Depending on your further intention this and/or ... > + * > + * Return information about the maximum sets of features which can be offered > + * to different types of guests. This is all strictly information as found > in > + * `cpuid` feature leaves with no synthetic alterations. > + */ > +struct xen_sysctl_featureset { > +#define XEN_SYSCTL_featureset_host 0 ... this need to carry "CPU" in their names. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel