Re: [Xen-devel] About UMIP unit test.
On 02/03/2017 02:33, Yu Zhang wrote: > Hi Jan, > > Previously I saw your UMIP patches merged in xen, and we'd like to > try some unit test here in Intel. And I wonder do you have any unit > test code for this feature, or any suggestions? :) There is support in the x86 emulator to handle UMIP correctly in the case that it is found set in CR4. However, the logic to emulate UMIP on non-UMIP-capable hardware has not been merged yet. As for unit tests, if you use http://xenbits.xen.org/gitweb/?p=xtf.git;a=blob;f=tests/cpuid-faulting/main.c;hb=HEAD as an example, making a unit test should hopefully be trivial. General docs can be found here http://xenbits.xen.org/docs/xtf/ and 'd be happy to answer any other questions you have on XTF, should you choose to use it. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 02/17] x86emul: support MMX/SSE{, 2, 3} moves
>>> On 01.03.17 at 20:56, wrote: > On 01/03/17 14:19, Jan Beulich wrote: > On 01.03.17 at 14:59, wrote: >>> On 28/02/17 12:50, Jan Beulich wrote: >> I also think avoiding two identical prefixes is (marginally) better >> architecture- >> wise. > > There is no specific advice in the AMD optimisation guide. > > The Intel guide warns against unnecessary use of 0x66 and 0x67 > (specifically in the Length-Changing Prefixes section), which > dynamically change the length of the instruction. This doesn't apply to > us in this situation. > > The only other reference to prefixes comes from the Other Decoding > Guidelines section, which state (obviously) that extra prefixes decrease > instruction bandwidth (as more bytes need consuming to decode the > instruction), and that any instruction with multiple prefixes at all > require decoding in the first decoder, which builds competition of resource. > > I can't see anything suggesting that a double %ds vs a single %ds and > rex prefix would make any difference. Well - performance isn't of interest here anyway, only correctness is. Since not emitting multiple identical prefixes is marginally better and since using REX here is slightly better overall code, I'd prefer for it to stay this way. @@ -5429,6 +5496,57 @@ x86_emulate( singlestep = _regs._eflags & X86_EFLAGS_TF; break; +CASE_SIMD_PACKED_FP(, 0x0f, 0x50): /* movmskp{s,d} xmm,reg */ +CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x50): /* vmovmskp{s,d} {x,y}mm,reg */ +CASE_SIMD_PACKED_INT(0x0f, 0xd7): /* pmovmskb {,x}mm,reg */ +case X86EMUL_OPC_VEX_66(0x0f, 0xd7): /* vpmovmskb {x,y}mm,reg */ +generate_exception_if(ea.type != OP_REG, EXC_UD); + +if ( vex.opcx == vex_none ) +{ +if ( vex.pfx & VEX_PREFIX_DOUBLE_MASK ) +vcpu_must_have(sse2); +else +{ +if ( b != 0x50 ) +host_and_vcpu_must_have(mmx); +vcpu_must_have(sse); +} +if ( b == 0x50 || (vex.pfx & VEX_PREFIX_DOUBLE_MASK) ) +get_fpu(X86EMUL_FPU_xmm, &fic); +else +get_fpu(X86EMUL_FPU_mmx, &fic); +} +else +{ +generate_exception_if(vex.reg != 0xf, EXC_UD); >>> Isn't this TwoOp? >> Yes, hence the #UD. Or is the question "Why is this being done >> here, instead of on the common code path?" If so - the common >> code path doing this isn't being reached, as we invoke the stub >> inside the case block. > > My question was actually "Why isn't this based on d & TwoByte"? Because the above variant is more explicit imo: Why depend on some derived info when we can use the original encoding directly? +if ( b == 0x50 || !vex.l ) +host_and_vcpu_must_have(avx); +else +host_and_vcpu_must_have(avx2); +get_fpu(X86EMUL_FPU_ymm, &fic); +} + +opc = init_prefixes(stub); +opc[0] = b; +/* Convert GPR destination to %rAX. */ +rex_prefix &= ~REX_R; +vex.r = 1; +if ( !mode_64bit() ) +vex.w = 0; +opc[1] = modrm & 0xc7; +fic.insn_bytes = PFX_BYTES + 2; +opc[2] = 0xc3; + +copy_REX_VEX(opc, rex_prefix, vex); +invoke_stub("", "", "=a" (dst.val) : [dummy] "i" (0)); + +put_stub(stub); +put_fpu(&fic); + +dst.bytes = 4; >>> Somewhere there should probably be an ASSERT() that state->simd_size is >>> 0, so we don't try to invoke the stub twice. >> I can do this, but it didn't seem natural to do so when putting this >> together, as - obviously - I did produce/check the table entries at >> basically the same time as I did write this code. > > It is obvious to you now, and I do trust that you checked the > correctness as it related to this patch, but it will not be obvious in 6 > months time with another dev-cycles worth of change on top. Right, that's what I was trying to hint at with my explanation of why I didn't think of adding ASSERT()s to this effect. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 10/17] x86emul: add tables for 0f38 and 0f3a extension space
>>> On 01.03.17 at 21:35, wrote: > On 01/03/17 16:11, Jan Beulich wrote: > On 01.03.17 at 16:49, wrote: >>> On 28/02/17 12:54, Jan Beulich wrote: @@ -2740,6 +2790,13 @@ x86_decode( break; case ext_0f3a: +d = ext0f3a_table[b].to_memory ? DstMem | SrcReg : DstReg | SrcMem; +if ( ext0f3a_table[b].two_op ) +d |= TwoOp; +else if ( ext0f3a_table[b].four_op && !mode_64bit() && vex.opcx ) +imm1 &= 0x7f; >>> Is this sensible to do? The behaviour of imm1 doesn't appear to be very >>> consistent across encodings. As it is all passed onto hardware anyway >>> via stub, does it really matter? >> Oh, yes, it does matter: We're running in 64-bit mode, and the high >> bit, when representing a register, is ignored outside of 64-bit mode. >> If we didn't mask it off, we'd access the wrong register if 32- or 16- >> bit code had the bit set. > > Ok, but my first question still stands. > > Across this entire series, the only .four_op instructions appear to be > the Vex blend instructions at 660f3a4{a,b,c}, and these are called out > as special cases in the instruction manual. > > How does the above condition logically equate to "this instruction uses > its imm8 byte to encode an extra source register", because that is the > purpose of the applied mask. Well, "four_op" means four register operands (one of them possibly also allowing for memory), just like two_op also doesn't include possible immediates. "four_regs" would seem worse to me, as that would be more along the lines of excluding memory ones. I'll add a comment to this effect. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-4.4-testing test] 106323: tolerable FAIL - PUSHED
flight 106323 xen-4.4-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/106323/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-amd64-i386-xl-qemut-win7-amd64 3 host-install(3) broken in 106018 pass in 106323 test-amd64-amd64-xl-qemuu-ovmf-amd64 3 host-install(3) broken in 106018 pass in 106323 test-amd64-i386-xl 3 host-install(3) broken in 106018 pass in 106323 test-amd64-i386-xl-qemuu-ovmf-amd64 3 host-install(3) broken in 106018 pass in 106323 test-amd64-amd64-pygrub 3 host-install(3) broken in 106018 pass in 106323 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 3 host-install(3) broken in 106018 pass in 106323 test-amd64-i386-freebsd10-amd64 3 host-install(3) broken in 106051 pass in 106323 test-armhf-armhf-xl-multivcpu 16 guest-start.2 fail in 106051 pass in 106130 test-amd64-i386-xend-qemut-winxpsp3 15 guest-localmigrate/x10 fail in 106051 pass in 106323 test-armhf-armhf-xl-credit2 6 xen-boot fail in 106130 pass in 106323 test-amd64-i386-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail in 106281 pass in 106323 test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail in 106281 pass in 106323 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 13 guest-localmigrate fail in 106281 pass in 106323 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail pass in 106018 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail pass in 106051 test-armhf-armhf-xl-multivcpu 15 guest-start/debian.repeat fail pass in 106130 test-xtf-amd64-amd64-5 16 xtf/test-pv32pae-selftest fail pass in 106281 Regressions which are regarded as allowable (not blocking): test-amd64-i386-xend-qemut-winxpsp3 9 windows-install fail in 106281 like 105815 test-xtf-amd64-amd64-2 16 xtf/test-pv32pae-selftestfail like 105835 test-xtf-amd64-amd64-4 16 xtf/test-pv32pae-selftestfail like 105835 test-xtf-amd64-amd64-3 20 xtf/test-hvm32-invlpg~shadow fail like 105835 test-xtf-amd64-amd64-3 33 xtf/test-hvm32pae-invlpg~shadow fail like 105835 test-xtf-amd64-amd64-3 44 xtf/test-hvm64-invlpg~shadow fail like 105835 test-xtf-amd64-amd64-2 54 leak-check/check fail like 105835 test-xtf-amd64-amd64-4 54 leak-check/check fail like 105835 test-xtf-amd64-amd64-1 54 leak-check/check fail like 105835 test-xtf-amd64-amd64-5 54 leak-check/check fail like 105835 test-xtf-amd64-amd64-3 54 leak-check/check fail like 105835 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 105835 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 105835 Tests which did not succeed, but are not blocking: test-amd64-amd64-rumprun-amd64 1 build-check(1) blocked n/a test-amd64-i386-rumprun-i386 1 build-check(1) blocked n/a test-xtf-amd64-amd64-2 10 xtf-fep fail never pass test-xtf-amd64-amd64-4 10 xtf-fep fail never pass test-xtf-amd64-amd64-2 18 xtf/test-hvm32-cpuid-faulting fail never pass test-xtf-amd64-amd64-4 18 xtf/test-hvm32-cpuid-faulting fail never pass test-xtf-amd64-amd64-1 10 xtf-fep fail never pass test-xtf-amd64-amd64-1 16 xtf/test-pv32pae-selftestfail never pass test-xtf-amd64-amd64-1 18 xtf/test-hvm32-cpuid-faulting fail never pass test-xtf-amd64-amd64-2 31 xtf/test-hvm32pae-cpuid-faulting fail never pass test-xtf-amd64-amd64-4 31 xtf/test-hvm32pae-cpuid-faulting fail never pass test-xtf-amd64-amd64-2 37 xtf/test-hvm32pse-cpuid-faulting fail never pass test-xtf-amd64-amd64-4 37 xtf/test-hvm32pse-cpuid-faulting fail never pass test-xtf-amd64-amd64-2 41 xtf/test-hvm64-cpuid-faulting fail never pass test-xtf-amd64-amd64-1 31 xtf/test-hvm32pae-cpuid-faulting fail never pass test-xtf-amd64-amd64-4 41 xtf/test-hvm64-cpuid-faulting fail never pass test-xtf-amd64-amd64-1 37 xtf/test-hvm32pse-cpuid-faulting fail never pass test-xtf-amd64-amd64-1 41 xtf/test-hvm64-cpuid-faulting fail never pass test-xtf-amd64-amd64-5 10 xtf-fep fail never pass test-xtf-amd64-amd64-5 18 xtf/test-hvm32-cpuid-faulting fail never pass test-xtf-amd64-amd64-5 31 xtf/test-hvm32pae-cpuid-faulting fail never pass test-xtf-amd64-amd64-5 37 xtf/test-hvm32pse-cpuid-faulting fail never pass test-xtf-amd64-amd64-5 41 xtf/test-hvm64-cpuid-faulting fail never pass build-i386-rumprun7 xen-buildfail never pass build-amd64-rumprun 7 xen-buildfail never pass test-xtf-amd64-amd64-3 10 xtf-fep fail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-xtf-amd64-amd64-2 53 xtf/test-
Re: [Xen-devel] [PATCH v4 12/17] x86emul: support SSE4.1 insns
>>> On 01.03.17 at 17:58, wrote: > On 28/02/17 12:56, Jan Beulich wrote: >> @@ -6951,6 +7040,97 @@ x86_emulate( >> fic.insn_bytes = PFX_BYTES + 3; >> break; >> >> +case X86EMUL_OPC_66(0x0f38, 0x20): /* pmovsxbw xmm/m64,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x21): /* pmovsxbd xmm/m32,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x22): /* pmovsxbq xmm/m16,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x23): /* pmovsxwd xmm/m64,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x24): /* pmovsxwq xmm/m32,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x25): /* pmovsxdq xmm/m64,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x30): /* pmovzxbw xmm/m64,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x31): /* pmovzxbd xmm/m32,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x32): /* pmovzxbq xmm/m16,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x33): /* pmovzxwd xmm/m64,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x34): /* pmovzxwq xmm/m32,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x35): /* pmovzxdq xmm/m64,xmm */ >> +op_bytes = 16 >> pmov_convert_delta[b & 7]; >> +/* fall through */ >> +case X86EMUL_OPC_66(0x0f38, 0x10): /* pblendvb XMM0,xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x14): /* blendvps XMM0,xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x15): /* blendvpd XMM0,xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x28): /* pmuldq xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x29): /* pcmpeqq xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x2b): /* packusdw xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x38): /* pminsb xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x39): /* pminsd xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x3a): /* pminub xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x3b): /* pminud xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x3c): /* pmaxsb xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x3d): /* pmaxsd xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x3e): /* pmaxub xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x3f): /* pmaxud xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x40): /* pmulld xmm/m128,xmm */ >> +case X86EMUL_OPC_66(0x0f38, 0x41): /* phminposuw xmm/m128,xmm */ >> +host_and_vcpu_must_have(sse4_1); >> +goto simd_0f38_common; >> + >> +case X86EMUL_OPC_66(0x0f38, 0x17): /* ptest xmm/m128,xmm */ >> +case X86EMUL_OPC_VEX_66(0x0f38, 0x17): /* vptest {x,y}mm/mem,{x,y}mm */ >> +if ( vex.opcx == vex_none ) >> +{ >> +host_and_vcpu_must_have(sse4_1); >> +get_fpu(X86EMUL_FPU_xmm, &fic); >> +} >> +else >> +{ >> +generate_exception_if(vex.reg != 0xf, EXC_UD); >> +host_and_vcpu_must_have(avx); >> +get_fpu(X86EMUL_FPU_ymm, &fic); >> +} >> + >> +opc = init_prefixes(stub); >> +if ( vex.opcx == vex_none ) >> +opc[0] = 0x38; >> +opc[vex.opcx == vex_none] = b; >> +opc[1 + (vex.opcx == vex_none)] = modrm; > > This use of (vex.opcx == vex_none) for construction is very awkward to read. > > How about: > > if ( vex.opcx == vex_none ) > { > opc[0] = 0x38; > opc++; /* Adjust for extra prefix. */ > } > > ... > > if ( vex.opcx == vex_none ) > opc--; /* Undo adjustment for extra prefix. */ > > which allows the rest of the opc[] setup to read like all the other > similar code. I can do that, as you think it's better to read (which I'm not sure of). > In fact, thinking more about this, using a pointer-arithmatic-based > method of filling the stub would allow for the removal of > "fic.insn_bytes = PFX_BYTES + $X", and any chance of getting the count > wrong. Well, that would leave the need to store the base address of the stub somewhere (requiring to keep the very #ifdef-ary you disliked in an earlier patch). I'd rather keep it the way it is. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 15/17] x86emul: support PCLMULQDQ
>>> On 01.03.17 at 18:44, wrote: > On 28/02/17 12:58, Jan Beulich wrote: >> @@ -7434,6 +7436,14 @@ x86_emulate( >> generate_exception_if(vex.l, EXC_UD); >> goto simd_0f_imm8_avx; >> >> +case X86EMUL_OPC_66(0x0f3a, 0x44): /* pclmulqdq $imm8,xmm/m128,xmm >> */ >> +case X86EMUL_OPC_VEX_66(0x0f3a, 0x44): /* vpclmulqdq >> $imm8,xmm/m128,xmm,xmm */ >> +host_and_vcpu_must_have(pclmulqdq); >> +if ( vex.opcx == vex_none ) >> +goto simd_0f3a_common; > > What is this for? There are no other instructions defined (that I can > find) in 0f3a44. Perhaps I'm misunderstanding the question. Are you mixing up vex.opcx and vex.pfx? We want the non-VEX case handled by the code at simd_0f3a_common, and the VEX one by the code further down. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] xen/arm and swiotlb-xen: possible data corruption
On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote: > Hi all, > > Edgar reported a data corruption on network packets in dom0 when the > swiotlb-xen is in use. He also reported that the following patch "fixes" > the problem for him: > > static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t > handle, > size_t size, enum dma_data_direction dir) > { > - dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, > DMA_MAP); > + printk("%s: addr=%lx size=%zd\n", __func__, handle, size); > + dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size + 64, > dir, DMA_MAP); > > I am thinking that the problem has something to do with cacheline > alignment on the Xen side > (xen/common/grant_table.c:__gnttab_cache_flush). > > If op == GNTTAB_CACHE_INVAL, we call invalidate_dcache_va_range; if op > == GNTTAB_CACHE_CLEAN, we call clean_dcache_va_range instead. The > parameter, v, could be non-cacheline aligned. > > invalidate_dcache_va_range is capable of handling a not aligned address, > while clean_dcache_va_range does not. > > Edgar, does the appended patch fix the problem for you? Thanks Stefano, This does indeed fix the issue for me. Cheers, Edgar > > --- > > diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h > index 86de0b6..9cdf2fb 100644 > --- a/xen/include/asm-arm/page.h > +++ b/xen/include/asm-arm/page.h > @@ -322,10 +322,30 @@ static inline int invalidate_dcache_va_range(const void > *p, unsigned long size) > > static inline int clean_dcache_va_range(const void *p, unsigned long size) > { > -const void *end; > +size_t off; > +const void *end = p + size; > + > dsb(sy); /* So the CPU issues all writes to the range */ > -for ( end = p + size; p < end; p += cacheline_bytes ) > + > +off = (unsigned long)p % cacheline_bytes; > +if ( off ) > +{ > +p -= off; > asm volatile (__clean_dcache_one(0) : : "r" (p)); > +p += cacheline_bytes; > +size -= cacheline_bytes - off; > +} > +off = (unsigned long)end % cacheline_bytes; > +if ( off ) > +{ > +end -= off; > +size -= off; > +asm volatile (__clean_dcache_one(0) : : "r" (end)); > +} > + > +for ( ; p < end; p += cacheline_bytes ) > +asm volatile (__clean_dcache_one(0) : : "r" (p)); > + > dsb(sy); /* So we know the flushes happen before continuing */ > /* ARM callers assume that dcache_* functions cannot fail. */ > return 0; ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [DO NOT APPLY PATCH XTF 1/2] x86: add UMIP feature bit
Strangely there was already one cpu_has_umip defined, so there is no need for one in this patch. Signed-off-by: Wei Liu --- include/arch/x86/processor.h | 1 + include/xen/arch-x86/cpufeatureset.h | 1 + 2 files changed, 2 insertions(+) diff --git a/include/arch/x86/processor.h b/include/arch/x86/processor.h index 0106954..e2a2cbb 100644 --- a/include/arch/x86/processor.h +++ b/include/arch/x86/processor.h @@ -52,6 +52,7 @@ #define X86_CR4_PCE 0x0100 /* Performance counters at ipl 3 */ #define X86_CR4_OSFXSR 0x0200 /* Fast FPU save and restore */ #define X86_CR4_OSXMMEXCPT 0x0400 /* Unmasked SSE exceptions */ +#define X86_CR4_UMIP0x0800 /* UMIP */ #define X86_CR4_VMXE0x2000 /* VMX */ #define X86_CR4_SMXE0x4000 /* SMX */ #define X86_CR4_FSGSBASE0x0001 /* {rd,wr}{fs,gs}base */ diff --git a/include/xen/arch-x86/cpufeatureset.h b/include/xen/arch-x86/cpufeatureset.h index 905e8e8..f66a4ab 100644 --- a/include/xen/arch-x86/cpufeatureset.h +++ b/include/xen/arch-x86/cpufeatureset.h @@ -144,6 +144,7 @@ /* Intel-defined CPU features, CPUID level 0x0007:0.ecx, word 6 */ #define X86_FEATURE_PREFETCHWT1 (6*32+ 0) /* PREFETCHWT1 instruction */ +#define X86_FEATURE_UMIP (6*32+ 2) /* User-Mode Instruction Prevention */ #define X86_FEATURE_PKU (6*32+ 3) /* Protection Keys for Userspace */ #define X86_FEATURE_OSPKE (6*32+ 4) /* OS Protection Keys Enable */ -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [DO NOT APPLY PATCH XTF 2/2] Add UMIP test
Signed-off-by: Wei Liu --- tests/umip/Makefile | 9 + tests/umip/main.c | 102 2 files changed, 111 insertions(+) create mode 100644 tests/umip/Makefile create mode 100644 tests/umip/main.c diff --git a/tests/umip/Makefile b/tests/umip/Makefile new file mode 100644 index 000..a74016c --- /dev/null +++ b/tests/umip/Makefile @@ -0,0 +1,9 @@ +include $(ROOT)/build/common.mk + +NAME := umip +CATEGORY := functional +TEST-ENVS := hvm32 + +obj-perenv += main.o + +include $(ROOT)/build/gen.mk diff --git a/tests/umip/main.c b/tests/umip/main.c new file mode 100644 index 000..ca74085 --- /dev/null +++ b/tests/umip/main.c @@ -0,0 +1,102 @@ +/** + * @file tests/umip/main.c + * @ref test-umip + * + * @page test-umip UMIP + * + * Test if UMIP (User-Mode Instruction Prevention) is functional. + * + * Returns SUCCESS if UMIP is functional, SKIP if UMIP is not + * available and FAILURE if UMIP doesn't cause SGDT, SLDT, SIDT, STR + * or SMSW to fault. + * + * @see tests/umip/main.c + */ +#include + +#include + +#define EX(instr) \ +bool seen_ ## instr ## _fault; \ +bool ex_ ##instr(struct cpu_regs *regs, \ + const struct extable_entry *ex)\ +{ \ +if ( regs->entry_vector == X86_EXC_GP ) \ +{ \ +seen_ ## instr ## _fault = true;\ +regs->ip = ex->fixup; \ +\ +return true;\ +} \ +\ +return false; \ +} \ + +EX(sgdt) +EX(sldt) +EX(sidt) +EX(str) +EX(smsw) + +#undef EX + +void test_main(void) +{ +uint8_t buf[1024]; + +printk("Test if UMIP is functional\n"); + +if ( !cpu_has_umip ) +{ +xtf_skip("UMIP is not available\n"); +return; +} + +if ( !xtf_has_fep ) +xtf_skip("FEP not available, some tests will be skipped\n"); + +write_cr4(read_cr4() | X86_CR4_UMIP); + +#define TEST(fep,instr) \ +do {\ +seen_ ## instr ## _fault = false; \ +asm volatile ("1: " fep #instr " %0; 2:"\ + _ASM_EXTABLE_HANDLER(1b, 2b, ex_ ##instr) \ + : : "m"(buf));\ +if ( !seen_ ## instr ## _fault )\ +{ \ +xtf_failure(fep #instr " didn't cause fault\n");\ +return; \ +} \ +seen_ ## instr ## _fault = false; \ +} while (0) + +TEST(, sgdt); +TEST(, sldt); +TEST(, sidt); +TEST(, str); +TEST(, smsw); + +if ( xtf_has_fep ) +{ +TEST(_ASM_XEN_FEP, sgdt); +TEST(_ASM_XEN_FEP, sldt); +TEST(_ASM_XEN_FEP, sidt); +TEST(_ASM_XEN_FEP, str); +TEST(_ASM_XEN_FEP, smsw); +} + +#undef TEST + +xtf_success(NULL); +} + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * tab-width: 4 + * indent-tabs-mode: nil + * End: + */ -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [DO NOT APPLY PATCH XTF 0/2] UMIP test case
I wrote this long time ago before UMIP was merged. Yu, since you asked, I might as well post it for your reference on how to do it with XTF. This series is not yet tested in any way. Wei Liu (2): x86: add UMIP feature bit Add UMIP test include/arch/x86/processor.h | 1 + include/xen/arch-x86/cpufeatureset.h | 1 + tests/umip/Makefile | 9 tests/umip/main.c| 102 +++ 4 files changed, 113 insertions(+) create mode 100644 tests/umip/Makefile create mode 100644 tests/umip/main.c -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] x86/apicv: enhance posted-interrupt processing
On Thu, Mar 02, 2017 at 07:42:33AM +, Xuquan (Quan Xu) wrote: >On March 01, 2017 2:24 PM, wrote: >> >>Good point. I ignore v->processor maybe change. I have thought over >> __vmx_deliver_posted_interrupt() again and want to share you my opinion. >>First of all, __vmx_deliver_posted_interrupt() is to let the target vCPU sync >>PIR to vIRR ASAP. >>different strategies we will used to deal with different cases. >>One is we just unblock the target vCPU when the vCPU is blocked. This can >>make sure the vCPU will go to vmx_intr_assist() where we achieve the goal. >>The second one is the vCPU is runnable, we will achieve the goal automatically >>when the vCPU is chosen to run. >>The third one is the vCPU is running and running on the same pCPU with the >>source vCPU. It just wants to notify itself. Just raise a softirq is enough. >>The fourth one is the vCPU is running on other pCPU. To notify the target >>vCPU, we send a IPI to the target PCPU which the vCPU is on. Note that when >>the notification arrives to the target PCPU, the target vCPU maybe is blocked, >>runnable, running in root mode, or running in non-root mode. If the target >>vCPU is running in non-root mode, hardware will sync PIR to vIRR. If the >>target >>vCPU is in non-root mode, the Interrupt handler starts to run. To make sure, >>we can go back to vmx_intr_assist(), I have suggested that the interrupt >>handler should raise a softirq. > >Does the interrupt handler refer to pi_notification_interrupt() or >event_check_interrupt()? > Yes. Please refer to the v3 patch later. Thanks, Chao ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3] x86/apicv: Enhance posted-interrupt processing
__vmx_deliver_posted_interrupt() wrongly used a softirq bit to decide whether to suppress an IPI. Its logic was: the first time an IPI was sent, we set the softirq bit. Next time, we would check that softirq bit before sending another IPI. If the 1st IPI arrived at the pCPU which was in non-root mode, the hardware would consume the IPI and sync PIR to vIRR. During the process, no one (both hardware and software) will clear the softirq bit. As a result, the following IPI would be wrongly suppressed. This patch discards the suppression check, always sending IPI. The softirq also need to be raised. But there is a little change. This patch moves the place where we raise a softirq for 'cpu != smp_processor_id()' case to the IPI interrupt handler. Namely, don't raise a softirq for this case and set the interrupt handler to pi_notification_interrupt()(in which a softirq is raised) regardless of posted interrupt enabled or not. The only difference is when the IPI arrives at the pCPU which is happened in non-root mode, the patch will not raise a useless softirq since the IPI is consumed by hardware rather than raise a softirq unconditionally. Quan doesn't have enough time to upstream this fix patch. He asks me to do this. Merge another his related patch (https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg02885.html). Signed-off-by: Quan Xu Signed-off-by: Chao Gao --- xen/arch/x86/hvm/vmx/vmx.c | 56 -- 1 file changed, 49 insertions(+), 7 deletions(-) diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index 5b1717d..9db4bd0 100644 --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -1842,13 +1842,59 @@ static void __vmx_deliver_posted_interrupt(struct vcpu *v) bool_t running = v->is_running; vcpu_unblock(v); +/* + * The underlying 'IF' excludes two cases which we don't need further + * operation to make sure the target vCPU (@v) will sync PIR to vIRR ASAP. + * Specifically, the two cases are: + * 1. The target vCPU is not running, meaning it is blocked or runnable. + * Since we have unblocked the target vCPU above, the target vCPU will + * sync PIR to vIRR when it is chosen to run. + * 2. The target vCPU is the current vCPU and in_irq() is false. It means + * the function is called in noninterrupt context. Since we don't call + * the function in noninterrupt context after the last time a vCPU syncs + * PIR to vIRR, excluding this case is also safe. + */ if ( running && (in_irq() || (v != current)) ) { +/* + * Note: Only two cases will reach here: + * 1. The target vCPU is running on other pCPU. + * 2. The target vCPU is running on the same pCPU with the current + * vCPU and the current vCPU is in interrupt context. That's to say, + * the target vCPU is the current vCPU. + * + * Note2: Don't worry the v->processor may change since at least when + * the target vCPU is chosen to run or be blocked, there is a chance + * to sync PIR to vIRR. + */ unsigned int cpu = v->processor; -if ( !test_and_set_bit(VCPU_KICK_SOFTIRQ, &softirq_pending(cpu)) - && (cpu != smp_processor_id()) ) +/* + * For case 1, we send a IPI to the pCPU. When the IPI arrives, the + * target vCPU maybe is running in non-root mode, running in root + * mode, runnable or blocked. If the target vCPU is running in + * non-root mode, the hardware will sync PIR to vIRR for the IPI + * vector is special to the pCPU. If the target vCPU is running in + * root-mode, the interrupt handler starts to run. In order to make + * sure the target vCPU could go back to vmx_intr_assist(), the + * interrupt handler should raise a softirq if no pending softirq. + * If the target vCPU is runnable, it will sync PIR to vIRR next time + * it is chose to run. In this case, a IPI and a softirq is sent to + * a wrong vCPU which we think it is not a big issue. If the target + * vCPU is blocked, since vcpu_block() checks whether there is an + * event to be delivered through local_events_need_delivery() just + * after blocking, the vCPU must have synced PIR to vIRR. Similarly, + * there is a IPI and a softirq sent to a wrong vCPU. + */ +if ( cpu != smp_process_id() ) send_IPI_mask(cpumask_of(cpu), posted_intr_vector); +/* + * For case 2, raising a softirq can cause vmx_intr_assist() where PIR + * has a chance to be synced to vIRR to be called. As an optimization, + * We only need to raise a softirq when no pending softirq. + */ +else if ( !softirq_pending(cpu) ) +raise_softirq(VCPU_KICK_SOFTIRQ); } } @@ -2281,13 +2327,9 @@ const struct hvm_function_table * __init start_vmx(void)
Re: [Xen-devel] xen/arm and swiotlb-xen: possible data corruption
On Thu, Mar 02, 2017 at 09:38:37AM +0100, Edgar E. Iglesias wrote: > On Wed, Mar 01, 2017 at 05:05:21PM -0800, Stefano Stabellini wrote: > > Hi all, > > > > Edgar reported a data corruption on network packets in dom0 when the > > swiotlb-xen is in use. He also reported that the following patch "fixes" > > the problem for him: > > > > static void __xen_dma_page_cpu_to_dev(struct device *hwdev, dma_addr_t > > handle, > > size_t size, enum dma_data_direction dir) > > { > > - dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size, dir, > > DMA_MAP); > > + printk("%s: addr=%lx size=%zd\n", __func__, handle, size); > > + dma_cache_maint(handle & PAGE_MASK, handle & ~PAGE_MASK, size + 64, > > dir, DMA_MAP); > > > > I am thinking that the problem has something to do with cacheline > > alignment on the Xen side > > (xen/common/grant_table.c:__gnttab_cache_flush). > > > > If op == GNTTAB_CACHE_INVAL, we call invalidate_dcache_va_range; if op > > == GNTTAB_CACHE_CLEAN, we call clean_dcache_va_range instead. The > > parameter, v, could be non-cacheline aligned. > > > > invalidate_dcache_va_range is capable of handling a not aligned address, > > while clean_dcache_va_range does not. > > > > Edgar, does the appended patch fix the problem for you? > > > Thanks Stefano, > > This does indeed fix the issue for me. Hi again, Looking at the code, the problem here is that we may flush one cache line less than expected. This smaller patch fixes it for me too: diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h index c492d6d..fa1b4dd 100644 --- a/xen/include/asm-arm/page.h +++ b/xen/include/asm-arm/page.h @@ -325,7 +325,9 @@ static inline int clean_dcache_va_range(const void *p, unsigned long size) { const void *end; dsb(sy); /* So the CPU issues all writes to the range */ -for ( end = p + size; p < end; p += cacheline_bytes ) + +end = (void *)ROUNDUP((uintptr_t)p + size, cacheline_bytes); +for ( ; p < end; p += cacheline_bytes ) asm volatile (__clean_dcache_one(0) : : "r" (p)); dsb(sy); /* So we know the flushes happen before continuing */ /* ARM callers assume that dcache_* functions cannot fail. */ Anyway, I'm OK with either fix. Cheers, Edgar > > Cheers, > Edgar > > > > > > --- > > > > diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h > > index 86de0b6..9cdf2fb 100644 > > --- a/xen/include/asm-arm/page.h > > +++ b/xen/include/asm-arm/page.h > > @@ -322,10 +322,30 @@ static inline int invalidate_dcache_va_range(const > > void *p, unsigned long size) > > > > static inline int clean_dcache_va_range(const void *p, unsigned long size) > > { > > -const void *end; > > +size_t off; > > +const void *end = p + size; > > + > > dsb(sy); /* So the CPU issues all writes to the range */ > > -for ( end = p + size; p < end; p += cacheline_bytes ) > > + > > +off = (unsigned long)p % cacheline_bytes; > > +if ( off ) > > +{ > > +p -= off; > > asm volatile (__clean_dcache_one(0) : : "r" (p)); > > +p += cacheline_bytes; > > +size -= cacheline_bytes - off; > > +} > > +off = (unsigned long)end % cacheline_bytes; > > +if ( off ) > > +{ > > +end -= off; > > +size -= off; > > +asm volatile (__clean_dcache_one(0) : : "r" (end)); > > +} > > + > > +for ( ; p < end; p += cacheline_bytes ) > > +asm volatile (__clean_dcache_one(0) : : "r" (p)); > > + > > dsb(sy); /* So we know the flushes happen before continuing > > */ > > /* ARM callers assume that dcache_* functions cannot fail. */ > > return 0; > > ___ > Xen-devel mailing list > Xen-devel@lists.xen.org > https://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/5] xen: use libxendevicemodel when available
> -Original Message- > From: Stefano Stabellini [mailto:sstabell...@kernel.org] > Sent: 02 March 2017 02:06 > To: Paul Durrant > Cc: xen-de...@lists.xenproject.org; qemu-de...@nongnu.org; Stefano > Stabellini ; Anthony Perard > > Subject: Re: [PATCH 5/5] xen: use libxendevicemodel when available > > On Thu, 23 Feb 2017, Paul Durrant wrote: > > This patch modifies the wrapper functions in xen_common.h to use the > > new xendevicemodel interface if it is available along with compatibility > > code to use the old libxenctrl interface if it is not. > > > > Signed-off-by: Paul Durrant > > Hi Paul, > > Sorry for the late reply, I am flooded with patches over here. Aside > from Anthony's comments, the other patches of this series look good to > me. > Good :-) I will reply to Anthony's recent comment shortly. > The Xen side patches are not in, are they? Do you have a Xen git branch > with them so that I can test this patch properly? Yes, they are in staging but not yet in master. > > This patch breaks the build against Xen >= 4.3, with: > > /local/qemu-upstream/xen-common.c: In function 'xen_init': > /local/qemu-upstream/xen-common.c:135:9: error: implicit declaration of > function 'xenforeignmemory_close' [-Werror=implicit-function-declaration] > /local/qemu-upstream/xen-common.c:135:9: error: nested extern > declaration of 'xenforeignmemory_close' [-Werror=nested-externs] > cc1: all warnings being treated as errors I must have missed that compat define. Thanks for spotting that. I'll re-check and send v2. Cheers, Paul > > > > Cc: Stefano Stabellini > > Cc: Anthony Perard > > --- > > include/hw/xen/xen_common.h | 114 > +++- > > xen-common.c| 8 > > 2 files changed, 89 insertions(+), 33 deletions(-) > > > > diff --git a/include/hw/xen/xen_common.h > b/include/hw/xen/xen_common.h > > index 31cf25f..2c2a61b 100644 > > --- a/include/hw/xen/xen_common.h > > +++ b/include/hw/xen/xen_common.h > > @@ -9,6 +9,7 @@ > > #undef XC_WANT_COMPAT_EVTCHN_API > > #undef XC_WANT_COMPAT_GNTTAB_API > > #undef XC_WANT_COMPAT_MAP_FOREIGN_API > > +#undef XC_WANT_COMPAT_DEVICEMODEL_API > > > > #include > > #include > > @@ -26,48 +27,95 @@ extern xc_interface *xen_xc; > > * We don't support Xen prior to 4.2.0. > > */ > > > > +#if CONFIG_XEN_CTRL_INTERFACE_VERSION < 490 > > + > > +typedef xc_interface xendevicemodel_handle; > > + > > +#define xendevicemodel_open(l, f) xen_xc > > + > > +#define xendevicemodel_map_io_range_to_ioreq_server \ > > Could you please use static inline functions for compatibility rather > than macros? That way we are going to have some more compile time > checks. Or at least macros with parameters. > > > > +xc_hvm_map_io_range_to_ioreq_server > > +#define xendevicemodel_unmap_io_range_from_ioreq_server \ > > +xc_hvm_unmap_io_range_from_ioreq_server > > +#define xendevicemodel_map_pcidev_to_ioreq_server \ > > +xc_hvm_map_pcidev_to_ioreq_server > > +#define xendevicemodel_unmap_pcidev_from_ioreq_server \ > > +xc_hvm_unmap_pcidev_from_ioreq_server > > +#define xendevicemodel_create_ioreq_server \ > > +xc_hvm_create_ioreq_server > > +#define xendevicemodel_destroy_ioreq_server \ > > +xc_hvm_destroy_ioreq_server > > +#define xendevicemodel_get_ioreq_server_info \ > > +xc_hvm_get_ioreq_server_info > > +#define xendevicemodel_set_ioreq_server_state \ > > +xc_hvm_set_ioreq_server_state > > +#define xendevicemodel_set_pci_intx_level \ > > +xc_hvm_set_pci_intx_level > > +#define xendevicemodel_set_pci_link_route \ > > +xc_hvm_set_pci_link_route > > +#define xendevicemodel_set_isa_irq_level \ > > +xc_hvm_set_isa_irq_level > > +#define xendevicemodel_inject_msi \ > > +xc_hvm_inject_msi > > +#define xendevicemodel_set_mem_type \ > > +xc_hvm_set_mem_type > > +#define xendevicemodel_track_dirty_vram \ > > +xc_hvm_track_dirty_vram > > +#define xendevicemodel_modified_memory \ > > +xc_hvm_modified_memory > > + > > +#else /* CONFIG_XEN_CTRL_INTERFACE_VERSION >= 490 */ > > + > > +#include > > + > > +#endif > > + > > +extern xendevicemodel_handle *xen_dmod; > > + > > static inline int xen_set_mem_type(domid_t domid, hvmmem_type_t > type, > > uint64_t first_pfn, uint32_t nr) > > { > > -return xc_hvm_set_mem_type(xen_xc, domid, type, first_pfn, nr); > > +return xendevicemodel_set_mem_type(xen_dmod, domid, type, > first_pfn, > > + nr); > > } > > > > static inline int xen_set_pci_intx_level(domid_t domid, uint16_t segment, > > uint8_t bus, uint8_t device, > > uint8_t intx, unsigned int level) > > { > > -return xc_hvm_set_pci_intx_level(xen_xc, domid, segment, bus, > device, > > - intx, level); > > +return xendevicemodel_set_pci_intx_level(xen_dmod, domid, > s
Re: [Xen-devel] [PATCH v9 5/8] VT-d: Introduce a new function update_irte_for_msi_common
>>> On 27.02.17 at 02:45, wrote: > @@ -547,16 +548,116 @@ static int remap_entry_to_msi_msg( > return 0; > } > > +/* > + * This function is a common interface to update irte for msi case. > + * > + * If @pi_desc != NULL and @gvec != 0, the IRTE will be updated to a posted > + * format. In this case, @msg is ignored because constructing a posted format > + * IRTE doesn't need any information about the msi address or msi data. > + * > + * If @pi_desc == NULL and @gvec == 0, the IRTE will be updated to a remapped > + * format. In this case, @msg can't be NULL. This kind of implies that in the other case msg can be NULL. Please make this explicit or remove the last sentence, to avoid confusing readers. Plus, if msg can be NULL in that case, why don't you pass NULL in that case? > + * Assume 'ir_ctrl->iremap_lock' has been acquired and the remap_index > + * of msi_desc has a benign value. > + */ > +static int update_irte_for_msi_common( > +struct iommu *iommu, const struct pci_dev *pdev, > +const struct msi_desc *msi_desc, struct msi_msg *msg, > +const struct pi_desc *pi_desc, const uint8_t gvec) > +{ > +struct iremap_entry *iremap_entry = NULL, *iremap_entries; > +struct iremap_entry new_ire = {{0}}; I think just "{ }" will do. > +unsigned int index = msi_desc->remap_index; > +struct ir_ctrl *ir_ctrl = iommu_ir_ctrl(iommu); > + > +ASSERT( ir_ctrl ); > +ASSERT( spin_is_locked(&ir_ctrl->iremap_lock) ); > +ASSERT( (index >= 0) && (index < IREMAP_ENTRY_NR) ); Stray blanks inside parentheses. > +if ( (!pi_desc && gvec) || (pi_desc && !gvec) ) gvec == 0 alone is never a valid check: Either all vectors are valid, or a whole range (e.g. 0x00...0x0f) is invalid. Furthermore I think such checks are easier to read as either if ( !pi_desc != !gvec ) or if ( pi_desc ? !gvec : gvec ) > +return -EINVAL; > + > +if ( !pi_desc && !gvec && !msg ) With the earlier check the first or second part could be omitted afaict. > +return -EINVAL; > + > +GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index, > + iremap_entries, iremap_entry); > + > +if ( !pi_desc ) > +{ > + /* Set interrupt remapping table entry */ Again a request for consistency: Either have a respective comment also at the top of the else branch, or omit the one here too. > +new_ire.remap.dm = msg->address_lo >> MSI_ADDR_DESTMODE_SHIFT; > +new_ire.remap.tm = msg->data >> MSI_DATA_TRIGGER_SHIFT; > +new_ire.remap.dlm = msg->data >> MSI_DATA_DELIVERY_MODE_SHIFT; > +/* Hardware require RH = 1 for LPR delivery mode */ > +new_ire.remap.rh = (new_ire.remap.dlm == dest_LowestPrio); > +new_ire.remap.vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) & > +MSI_DATA_VECTOR_MASK; > +if ( x2apic_enabled ) > +new_ire.remap.dst = msg->dest32; > +else > +new_ire.remap.dst = ((msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT) > + & 0xff) << 8; Please strive to eliminate literal numbers here. At least the 0xff looks to be easy to deal with (using MASK_EXTR() together with MSI_ADDR_DEST_ID_MASK). > +new_ire.remap.p = 1; > +} > +else > +{ > +new_ire.post.im = 1; > +new_ire.post.vector = gvec; > +new_ire.post.pda_l = virt_to_maddr(pi_desc) >> (32 - PDA_LOW_BIT); > +new_ire.post.pda_h = virt_to_maddr(pi_desc) >> 32; > +new_ire.post.p = 1; > +} > + > +if ( pdev ) > +set_msi_source_id(pdev, &new_ire); > +else > +set_hpet_source_id(msi_desc->hpet_id, &new_ire); > + > +if ( iremap_entry->val != new_ire.val ) > +{ > +if ( cpu_has_cx16 ) > +{ > +__uint128_t ret; > +struct iremap_entry old_ire; > + > +old_ire = *iremap_entry; > +ret = cmpxchg16b(iremap_entry, &old_ire, &new_ire); > + > +/* > + * In the above, we use cmpxchg16 to atomically update the > 128-bit > + * IRTE, and the hardware cannot update the IRTE behind us, so > + * the return value of cmpxchg16 should be the same as old_ire. > + * This ASSERT validate it. > + */ > +ASSERT(ret == old_ire.val); > +} > +else > +{ This wants a comment added explaining the conditions under which this is safe. Perhaps also one or more ASSERT()s to that effect. > +iremap_entry->lo = new_ire.lo; > +iremap_entry->hi = new_ire.hi; > +} > + > +iommu_flush_cache_entry(iremap_entry, sizeof(struct iremap_entry)); sizeof(*iremap_entry) > @@ -592,38 +693,33 @@ static int msi_msg_to_remap_entry( > return -EFAULT; > } > > +/* Get the IRTE's bind relationship with guest from the live IRTE. */ > GET_IREMAP_ENTRY(ir_ctrl->iremap_maddr, index, > iremap_entr
Re: [Xen-devel] [PATCH v9 8/8] VT-d: Add copy_irte_{to, from}_irt for updating irte
>>> On 27.02.17 at 02:45, wrote: > We used structure assignment to update irte which was not safe when a > interrupt happend during update. It is better to update IRTE atomically > through cmpxchg16b(). When cmpxchg16b is not supported, two 64-bit write > operation can atomically update IRTE when only one the high dword or > low dword is intented to be changed. s/dword/qword/ in the last sentence. > --- a/xen/drivers/passthrough/vtd/intremap.c > +++ b/xen/drivers/passthrough/vtd/intremap.c > @@ -169,6 +169,37 @@ bool_t __init iommu_supports_eim(void) > return 1; > } > > +static inline void copy_irte_from_irt(struct iremap_entry *dst, > + struct iremap_entry *src) > +{ > +*dst = *src; > +} > + > +static void copy_irte_to_irt(struct iremap_entry *dst, struct iremap_entry > *src) For both functions the source side wants to be pointer to const. > +{ > +if ( cpu_has_cx16 ) > +{ > +__uint128_t ret; > +struct iremap_entry old_ire; > + > +copy_irte_from_irt(&old_ire, dst); > +ret = cmpxchg16b(dst, &old_ire, src); > + > +/* > + * In the above, we use cmpxchg16 to atomically update the 128-bit > + * IRTE, and the hardware cannot update the IRTE behind us, so > + * the return value of cmpxchg16 should be the same as old_ire. > + * This ASSERT validate it. > + */ > +ASSERT(ret == old_ire.val); > +} > +else > +{ > +dst->lo = src->lo; > +dst->hi = src->hi; This doesn't exactly match the description: You nevertheless write both halves, which makes the update non-atomic. But with the earlier patch adjusted to document the conditions under which this is safe, this may be fine here too. Just make the patch description match reality then please. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [DO NOT APPLY PATCH XTF 0/2] UMIP test case
On 02/03/2017 08:42, Wei Liu wrote: > I wrote this long time ago before UMIP was merged. > > Yu, since you asked, I might as well post it for your reference on how to > do it with XTF. > > This series is not yet tested in any way. Unfortunately, you execute all of the sensitive instructions in kernel mode, where they wouldn't fault even with UMIP active. For full testing of a feature like this, the test should include a check that the ability to modify CR4.UMIP depends strictly on the visibility of the feature, that uses in the kernel still continue to work, even when active, and that behaviour returns back to normal after the feature has been deactivated. The first patch however, is fine to go in. ~Andrew (I am also going to pretend I didn't see those macros :) ) ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/5] configure: detect presence of libxendevicemodel
> -Original Message- > From: Anthony PERARD [mailto:anthony.per...@citrix.com] > Sent: 01 March 2017 17:18 > To: Paul Durrant > Cc: xen-de...@lists.xenproject.org; qemu-de...@nongnu.org; Stefano > Stabellini > Subject: Re: [PATCH 4/5] configure: detect presence of libxendevicemodel > > On Thu, Feb 23, 2017 at 02:53:54PM +, Paul Durrant wrote: > > This patch adds code in configure to set > CONFIG_XEN_CTRL_INTERFACE_VERSION > > to a new value of 490 if libxendevicemodel is present in the build > > environment. > > > > Signed-off-by: Paul Durrant > > --- > > Cc: Stefano Stabellini > > Cc: Anthony Perard > > --- > > configure | 19 +++ > > 1 file changed, 19 insertions(+) > > > > diff --git a/configure b/configure > > index 8e8f18d..fc1e12b 100755 > > --- a/configure > > +++ b/configure > > @@ -1980,6 +1980,25 @@ EOF > ># Xen unstable > >elif > >cat > $TMPC < > +#undef XC_WANT_COMPAT_DEVICEMODEL_API > > +#define __XEN_TOOLS__ > > Isn't __XEN_TOOLS__ supposed to be reserved for some to tools inside > xen.git? > > Also it seems to be the only time this define is used in your patch > series. > No. QEMU falls under the definition of 'tools' as far as Xen goes and the hypercalls and xendevicemodel API are protected by that. The reason you don't see it elsewhere is that xenctrl.h defines it. (See http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libxc/include/xenctrl.h;hb=HEAD#l27). I think that's a little underhand so I thought I'd make the new code in configure more transparent. I can change it to just include xenctrl.h before xendevicemodel.h if you'd prefer. Paul > > +#include > > +int main(void) { > > + xendevicemodel_handle *xd; > > + > > + xd = xendevicemodel_open(0, 0); > > + xendevicemodel_close(xd); > > + > > + return 0; > > +} > > +EOF > > + compile_prog "" "$xen_libs $xen_stable_libs -lxendevicemodel" > > +then > > +xen_stable_libs="$xen_stable_libs -lxendevicemodel" > > +xen_ctrl_version=490 > > +xen=yes > > + elif > > + cat > $TMPC < > /* > > * If we have stable libs the we don't want the libxc compat > > * layers, regardless of what CFLAGS we may have been given. > > -- > > 2.1.4 > > > > -- > Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] Xen staging: libxendevicemodel.so.1 [...] not found
Hello, Just pulled the latest staging and I get: qemu successfuly configured for Xen qemu-dm build === PCI passthrough capability has been enabled === make[2]: Entering directory `~/work/xen.git/tools/qemu-xen-traditional-dir-remote' === PCI passthrough capability has been enabled === === PCI passthrough capability has been enabled === make[3]: Entering directory `~/work/xen.git/tools/qemu-xen-traditional-dir-remote/i386-dm' LINK i386-dm/qemu-dm /usr/bin/ld: warning: libxendevicemodel.so.1, needed by ~/work/xen.git/tools/../tools/libxc/libxenctrl.so, not found (try using -rpath or -rpath-link) ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_map_pcidev_to_ioreq_server@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_get_ioreq_server_info@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_open@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_unmap_io_range_from_ioreq_server@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_modified_memory@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_set_ioreq_server_state@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_track_dirty_vram@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_close@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_set_mem_type@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_destroy_ioreq_server@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_set_pci_intx_level@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_create_ioreq_server@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_unmap_pcidev_from_ioreq_server@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_inject_event@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_set_pci_link_route@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_inject_msi@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_set_isa_irq_level@VERS_1.0' ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference to `xendevicemodel_map_io_range_to_ioreq_server@VERS_1.0' collect2: error: ld returned 1 exit status make[3]: *** [qemu-dm] Error 1 Just thought somebody'd like to know. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Xen staging: libxendevicemodel.so.1 [...] not found
> -Original Message- > From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of > Razvan Cojocaru > Sent: 02 March 2017 09:19 > To: Xen-devel > Subject: [Xen-devel] Xen staging: libxendevicemodel.so.1 [...] not found > > Hello, > > Just pulled the latest staging and I get: > > qemu successfuly configured for Xen qemu-dm build > === PCI passthrough capability has been enabled === > make[2]: Entering directory > `~/work/xen.git/tools/qemu-xen-traditional-dir-remote' > === PCI passthrough capability has been enabled === > === PCI passthrough capability has been enabled === > make[3]: Entering directory > `~/work/xen.git/tools/qemu-xen-traditional-dir-remote/i386-dm' > LINK i386-dm/qemu-dm > /usr/bin/ld: warning: libxendevicemodel.so.1, needed by > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so, not found (try using > -rpath or -rpath-link) > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_map_pcidev_to_ioreq_server@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_get_ioreq_server_info@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_open@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_unmap_io_range_from_ioreq_server@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_modified_memory@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_set_ioreq_server_state@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_track_dirty_vram@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_close@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_set_mem_type@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_destroy_ioreq_server@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_set_pci_intx_level@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_create_ioreq_server@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_unmap_pcidev_from_ioreq_server@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_inject_event@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_set_pci_link_route@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_inject_msi@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_set_isa_irq_level@VERS_1.0' > ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > to `xendevicemodel_map_io_range_to_ioreq_server@VERS_1.0' > collect2: error: ld returned 1 exit status > make[3]: *** [qemu-dm] Error 1 > > Just thought somebody'd like to know. > Are you using a custom qemu-trad? You need to make sure it has the following patches applied: http://xenbits.xen.org/gitweb/?p=qemu-xen-traditional.git;a=commit;h=58b9047bf2da88d2976bd1b7ba50dfdcc68b503d http://xenbits.xen.org/gitweb/?p=qemu-xen-traditional.git;a=commit;h=8b4834ee1202852ed83a9fc61268c65fb6961ea7 Paul > > Thanks, > Razvan > > ___ > Xen-devel mailing list > Xen-devel@lists.xen.org > https://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Xen staging: libxendevicemodel.so.1 [...] not found
On 03/02/2017 11:23 AM, Paul Durrant wrote: >> -Original Message- >> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of >> Razvan Cojocaru >> Sent: 02 March 2017 09:19 >> To: Xen-devel >> Subject: [Xen-devel] Xen staging: libxendevicemodel.so.1 [...] not found >> >> Hello, >> >> Just pulled the latest staging and I get: >> >> qemu successfuly configured for Xen qemu-dm build >> === PCI passthrough capability has been enabled === >> make[2]: Entering directory >> `~/work/xen.git/tools/qemu-xen-traditional-dir-remote' >> === PCI passthrough capability has been enabled === >> === PCI passthrough capability has been enabled === >> make[3]: Entering directory >> `~/work/xen.git/tools/qemu-xen-traditional-dir-remote/i386-dm' >> LINK i386-dm/qemu-dm >> /usr/bin/ld: warning: libxendevicemodel.so.1, needed by >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so, not found (try using >> -rpath or -rpath-link) >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_map_pcidev_to_ioreq_server@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_get_ioreq_server_info@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_open@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_unmap_io_range_from_ioreq_server@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_modified_memory@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_set_ioreq_server_state@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_track_dirty_vram@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_close@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_set_mem_type@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_destroy_ioreq_server@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_set_pci_intx_level@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_create_ioreq_server@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_unmap_pcidev_from_ioreq_server@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_inject_event@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_set_pci_link_route@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_inject_msi@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_set_isa_irq_level@VERS_1.0' >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference >> to `xendevicemodel_map_io_range_to_ioreq_server@VERS_1.0' >> collect2: error: ld returned 1 exit status >> make[3]: *** [qemu-dm] Error 1 >> >> Just thought somebody'd like to know. >> > > Are you using a custom qemu-trad? You need to make sure it has the following > patches applied: > > http://xenbits.xen.org/gitweb/?p=qemu-xen-traditional.git;a=commit;h=58b9047bf2da88d2976bd1b7ba50dfdcc68b503d > http://xenbits.xen.org/gitweb/?p=qemu-xen-traditional.git;a=commit;h=8b4834ee1202852ed83a9fc61268c65fb6961ea7 No, just did ./configure --disable-systemd --prefix=/usr and a make dist. Those patches have not made it into staging yet apparently - git log doesn't show them. I'll apply them manually. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Xen staging: libxendevicemodel.so.1 [...] not found
> -Original Message- > From: Razvan Cojocaru [mailto:rcojoc...@bitdefender.com] > Sent: 02 March 2017 09:27 > To: Paul Durrant ; Xen-devel de...@lists.xen.org> > Subject: Re: [Xen-devel] Xen staging: libxendevicemodel.so.1 [...] not found > > On 03/02/2017 11:23 AM, Paul Durrant wrote: > >> -Original Message- > >> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of > >> Razvan Cojocaru > >> Sent: 02 March 2017 09:19 > >> To: Xen-devel > >> Subject: [Xen-devel] Xen staging: libxendevicemodel.so.1 [...] not found > >> > >> Hello, > >> > >> Just pulled the latest staging and I get: > >> > >> qemu successfuly configured for Xen qemu-dm build > >> === PCI passthrough capability has been enabled === > >> make[2]: Entering directory > >> `~/work/xen.git/tools/qemu-xen-traditional-dir-remote' > >> === PCI passthrough capability has been enabled === > >> === PCI passthrough capability has been enabled === > >> make[3]: Entering directory > >> `~/work/xen.git/tools/qemu-xen-traditional-dir-remote/i386-dm' > >> LINK i386-dm/qemu-dm > >> /usr/bin/ld: warning: libxendevicemodel.so.1, needed by > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so, not found (try using > >> -rpath or -rpath-link) > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_map_pcidev_to_ioreq_server@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_get_ioreq_server_info@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_open@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_unmap_io_range_from_ioreq_server@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_modified_memory@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_set_ioreq_server_state@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_track_dirty_vram@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_close@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_set_mem_type@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_destroy_ioreq_server@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_set_pci_intx_level@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_create_ioreq_server@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_unmap_pcidev_from_ioreq_server@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_inject_event@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_set_pci_link_route@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_inject_msi@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_set_isa_irq_level@VERS_1.0' > >> ~/work/xen.git/tools/../tools/libxc/libxenctrl.so: undefined reference > >> to `xendevicemodel_map_io_range_to_ioreq_server@VERS_1.0' > >> collect2: error: ld returned 1 exit status > >> make[3]: *** [qemu-dm] Error 1 > >> > >> Just thought somebody'd like to know. > >> > > > > Are you using a custom qemu-trad? You need to make sure it has the > following patches applied: > > > > http://xenbits.xen.org/gitweb/?p=qemu-xen- > traditional.git;a=commit;h=58b9047bf2da88d2976bd1b7ba50dfdcc68b503d > > http://xenbits.xen.org/gitweb/?p=qemu-xen- > traditional.git;a=commit;h=8b4834ee1202852ed83a9fc61268c65fb6961ea7 > > No, just did ./configure --disable-systemd --prefix=/usr and a make > dist. Those patches have not made it into staging yet apparently - git > log doesn't show them. I'll apply them manually. The tag in staging has been updated. If you do a 'make mrproper' or just blow away your qemu-trad repo then 'make tools' should updated it for you. Paul > > > Thanks, > Razvan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3] x86/apicv: Enhance posted-interrupt processing
>>> On 02.03.17 at 02:49, wrote: The patch title, btw, makes it looks like this isn't a bug fix, which is contrary to the understanding I've gained so far. > __vmx_deliver_posted_interrupt() wrongly used a softirq bit to decide whether > to suppress an IPI. Its logic was: the first time an IPI was sent, we set > the softirq bit. Next time, we would check that softirq bit before sending > another IPI. If the 1st IPI arrived at the pCPU which was in > non-root mode, the hardware would consume the IPI and sync PIR to vIRR. > During the process, no one (both hardware and software) will clear the > softirq bit. As a result, the following IPI would be wrongly suppressed. > > This patch discards the suppression check, always sending IPI. > The softirq also need to be raised. But there is a little change. > This patch moves the place where we raise a softirq for > 'cpu != smp_processor_id()' case to the IPI interrupt handler. > Namely, don't raise a softirq for this case and set the interrupt handler > to pi_notification_interrupt()(in which a softirq is raised) regardless of > posted interrupt enabled or not. As using a PI thing even in the non-PI case is counterintuitive, this needs expanding on imo. Maybe not here but in a code comment. Or perhaps, looking at the code, this is just not precise enough a description: The code is inside a cpu_has_vmx_posted_intr_processing conditional, and what you do is move code out of the iommu_intpost specific section. I.e. a reference to IOMMU may simply be missing here. > The only difference is when the IPI arrives > at the pCPU which is happened in non-root mode, the patch will not raise a s/patch/code/ (or some such) > useless softirq since the IPI is consumed by hardware rather than raise a > softirq unconditionally. > > Quan doesn't have enough time to upstream this fix patch. He asks me to do > this. Merge another his related patch > (https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg02885.html). This doesn't belong in the commit message, and hence should be after the first ---. > --- a/xen/arch/x86/hvm/vmx/vmx.c > +++ b/xen/arch/x86/hvm/vmx/vmx.c > @@ -1842,13 +1842,59 @@ static void __vmx_deliver_posted_interrupt(struct > vcpu *v) > bool_t running = v->is_running; > > vcpu_unblock(v); > +/* > + * The underlying 'IF' excludes two cases which we don't need further Who or what is 'IF'? > + * operation to make sure the target vCPU (@v) will sync PIR to vIRR > ASAP. > + * Specifically, the two cases are: > + * 1. The target vCPU is not running, meaning it is blocked or runnable. > + * Since we have unblocked the target vCPU above, the target vCPU will > + * sync PIR to vIRR when it is chosen to run. > + * 2. The target vCPU is the current vCPU and in_irq() is false. It means > + * the function is called in noninterrupt context. * 2. The target vCPU is the current vCPU and we're in * non-interrupt context. > Since we don't call > + * the function in noninterrupt context after the last time a vCPU syncs > + * PIR to vIRR, excluding this case is also safe. It is not really clear to me what "the function" here refers to. Surely the function the comment is in won't call itself, no matter what context. > + */ > if ( running && (in_irq() || (v != current)) ) > { > +/* > + * Note: Only two cases will reach here: > + * 1. The target vCPU is running on other pCPU. > + * 2. The target vCPU is running on the same pCPU with the current > + * vCPU This is an impossible thing: There can't be two vCPU-s running on the same pCPU-s at the same time. Hence ... > and the current vCPU is in interrupt context. That's to say, > + * the target vCPU is the current vCPU. ... just this last statement is sufficient here. > + * Note2: Don't worry the v->processor may change since at least when > + * the target vCPU is chosen to run or be blocked, there is a chance > + * to sync PIR to vIRR. > + */ "There is a chance" reads as if it may or may not happen. How about "The vCPU being moved to another processor is guaranteed to sync PIR to vIRR, due to the involved scheduling cycle"? Of course only if this matches reality. > unsigned int cpu = v->processor; > > -if ( !test_and_set_bit(VCPU_KICK_SOFTIRQ, &softirq_pending(cpu)) > - && (cpu != smp_processor_id()) ) > +/* > + * For case 1, we send a IPI to the pCPU. When the IPI arrives, the ... an IPI ... (also at least once more below) > + * target vCPU maybe is running in non-root mode, running in root > + * mode, runnable or blocked. If the target vCPU is running in > + * non-root mode, the hardware will sync PIR to vIRR for the IPI > + * vector is special to the pCPU. If the target vCPU is running in > + * root-mode, the interrupt handler starts to run. In order to make >
Re: [Xen-devel] Xen staging: libxendevicemodel.so.1 [...] not found
On Thu, Mar 02, 2017 at 11:19:23AM +0200, Razvan Cojocaru wrote: > Hello, > > Just pulled the latest staging and I get: > Just nuke the working tree with git clean -fd and rebuild. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [libvirt test] 106352: tolerable FAIL - PUSHED
flight 106352 libvirt real [real] http://logs.test-lab.xenproject.org/osstest/logs/106352/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-armhf-armhf-libvirt 13 saverestore-support-checkfail like 106226 test-armhf-armhf-libvirt-xsm 13 saverestore-support-checkfail like 106226 test-armhf-armhf-libvirt-raw 12 saverestore-support-checkfail like 106226 Tests which did not succeed, but are not blocking: test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a build-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-qcow2 1 build-check(1) blocked n/a test-arm64-arm64-libvirt 1 build-check(1) blocked n/a build-arm64 5 xen-buildfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass build-arm64-xsm 5 xen-buildfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass build-arm64-pvops 5 kernel-build fail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass version targeted for testing: libvirt 9d87f769726bd5714eb6a930379d8b4c53311196 baseline version: libvirt ca1f38545750d597c75c9773723c716483b03e5c Last test of basis 106226 2017-02-28 04:20:43 Z2 days Testing same since 106352 2017-03-02 04:21:48 Z0 days1 attempts People who touched revisions under test: Michal Privoznik jobs: build-amd64-xsm pass build-arm64-xsm fail build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-arm64 fail build-armhf pass build-i386 pass build-amd64-libvirt pass build-arm64-libvirt blocked build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-arm64-pvopsfail build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass test-amd64-amd64-libvirt-xsm pass test-arm64-arm64-libvirt-xsm blocked test-armhf-armhf-libvirt-xsm pass test-amd64-i386-libvirt-xsm pass test-amd64-amd64-libvirt pass test-arm64-arm64-libvirt blocked test-armhf-armhf-libvirt pass test-amd64-i386-libvirt pass test-amd64-amd64-libvirt-pairpass test-amd64-i386-libvirt-pair pass test-arm64-arm64-libvirt-qcow2 blocked test-armhf-armhf-libvirt-raw pass test-amd64-amd64-libvirt-vhd pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be foun
[Xen-devel] [PATCH 3/6] xen: credit1: increase efficiency and scalability of load balancing.
During load balancing, we check the non idle pCPUs to see if they have runnable but not running vCPUs that can be stolen by and set to run on currently idle pCPUs. If a pCPU has only one running (or runnable) vCPU, though, we don't want to steal it from there, and it's therefore pointless bothering with it (especially considering that bothering means trying to take its runqueue lock!). On large systems, when load is only slightly higher than the number of pCPUs (i.e., there are just a few more active vCPUs than the number of the pCPUs), this may mean that: - we go through all the pCPUs, - for each one, we (try to) take its runqueue locks, - we figure out there's actually nothing to be stolen! To mitigate this, we introduce here the concept of overloaded runqueues, and a cpumask where to record what pCPUs are in such state. An overloaded runqueue has at least runnable 2 vCPUs (plus the idle one, which is always there). Typically, this means 1 vCPU is running, and 1 is sitting in the runqueue, and can hence be stolen. Then, in csched_balance_load(), it is enough to go over the overloaded pCPUs, instead than all non-idle pCPUs, which is better. signed-off-by: Dario Faggioli --- Cc: George Dunlap Cc: Andrew Cooper --- I'm Cc-ing Andy on this patch, because we've discussed once about doing something like this upstream. --- xen/common/sched_credit.c | 56 ++--- 1 file changed, 47 insertions(+), 9 deletions(-) diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c index 2b13e99..529b6c7 100644 --- a/xen/common/sched_credit.c +++ b/xen/common/sched_credit.c @@ -171,6 +171,7 @@ struct csched_pcpu { struct timer ticker; unsigned int tick; unsigned int idle_bias; +unsigned int nr_runnable; }; /* @@ -221,6 +222,7 @@ struct csched_private { uint32_t ncpus; struct timer master_ticker; unsigned int master; +cpumask_var_t overloaded; cpumask_var_t idlers; cpumask_var_t cpus; uint32_t weight; @@ -263,7 +265,10 @@ static inline bool_t is_runq_idle(unsigned int cpu) static inline void __runq_insert(struct csched_vcpu *svc) { -const struct list_head * const runq = RUNQ(svc->vcpu->processor); +unsigned int cpu = svc->vcpu->processor; +const struct list_head * const runq = RUNQ(cpu); +struct csched_private * const prv = CSCHED_PRIV(per_cpu(scheduler, cpu)); +struct csched_pcpu * const spc = CSCHED_PCPU(cpu); struct list_head *iter; BUG_ON( __vcpu_on_runq(svc) ); @@ -288,12 +293,37 @@ __runq_insert(struct csched_vcpu *svc) } list_add_tail(&svc->runq_elem, iter); + +/* + * If there is more than just the idle vCPU and a "regular" vCPU runnable + * on the runqueue of this pCPU, mark it as overloaded (so other pCPU + * can come and pick up some work). + */ +if ( ++spc->nr_runnable > 2 && + !cpumask_test_cpu(cpu, prv->overloaded) ) +cpumask_set_cpu(cpu, prv->overloaded); } static inline void __runq_remove(struct csched_vcpu *svc) { +unsigned int cpu = svc->vcpu->processor; +struct csched_private * const prv = CSCHED_PRIV(per_cpu(scheduler, cpu)); +struct csched_pcpu * const spc = CSCHED_PCPU(cpu); + BUG_ON( !__vcpu_on_runq(svc) ); + +/* + * Mark the CPU as no longer overloaded when we drop to having only + * 1 vCPU in its runqueue. In fact, this means that just the idle + * idle vCPU and a "regular" vCPU are around. + */ +if ( --spc->nr_runnable <= 2 && + cpumask_test_cpu(cpu, prv->overloaded) ) +cpumask_clear_cpu(cpu, prv->overloaded); + +ASSERT(spc->nr_runnable >= 1); + list_del_init(&svc->runq_elem); } @@ -590,6 +620,7 @@ init_pdata(struct csched_private *prv, struct csched_pcpu *spc, int cpu) /* Start off idling... */ BUG_ON(!is_idle_vcpu(curr_on_cpu(cpu))); cpumask_set_cpu(cpu, prv->idlers); +spc->nr_runnable = 1; } static void @@ -1704,8 +1735,8 @@ csched_load_balance(struct csched_private *prv, int cpu, peer_node = node; do { -/* Find out what the !idle are in this node */ -cpumask_andnot(&workers, online, prv->idlers); +/* Select the pCPUs in this node that have work we can steal. */ +cpumask_and(&workers, online, prv->overloaded); cpumask_and(&workers, &workers, &node_to_cpumask(peer_node)); __cpumask_clear_cpu(cpu, &workers); @@ -1989,7 +2020,8 @@ csched_dump_pcpu(const struct scheduler *ops, int cpu) runq = &spc->runq; cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_sibling_mask, cpu)); -printk("CPU[%02d] sort=%d, sibling=%s, ", cpu, spc->runq_sort_last, cpustr); +printk("CPU[%02d] nr_run=%d, sort=%d, sibling=%s, ", + cpu, spc->nr_runnable, spc->runq_sort_last, cpustr); cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_core_mask, cpu)); printk("core=%s\n", cpustr)
[Xen-devel] [PATCH 5/6] xen/tools: tracing: add record for credit1 runqueue stealing.
Including whether we actually tried stealing a vCPU from a given pCPU, or we skipped that one, because of lock contention. Signed-off-by: Dario Faggioli --- Cc: George Dunlap Cc: Ian Jackson Cc: Wei Liu --- tools/xentrace/formats |1 + tools/xentrace/xenalyze.c| 11 +++ xen/common/sched_credit.c|6 ++ xen/include/xen/perfc_defn.h |1 + 4 files changed, 19 insertions(+) diff --git a/tools/xentrace/formats b/tools/xentrace/formats index a055231..8b31780 100644 --- a/tools/xentrace/formats +++ b/tools/xentrace/formats @@ -47,6 +47,7 @@ 0x00022008 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched:unboost [ dom:vcpu = 0x%(1)04x%(2)04x ] 0x00022009 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched:schedule [ cpu[16]:tasklet[8]:idle[8] = %(1)08x ] 0x0002200A CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched:ratelimit [ dom:vcpu = 0x%(1)08x, runtime = %(2)d ] +0x0002200B CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched:steal_check [ peer_cpu = %(1)d, checked = %(2)d ] 0x00022201 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched2:tick 0x00022202 CPU%(cpu)d %(tsc)d (+%(reltsc)8d) csched2:runq_pos [ dom:vcpu = 0x%(1)08x, pos = %(2)d] diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c index 68ffcc2..fd8ddb9 100644 --- a/tools/xentrace/xenalyze.c +++ b/tools/xentrace/xenalyze.c @@ -7651,6 +7651,17 @@ void sched_process(struct pcpu_info *p) r->runtime / 1000, r->runtime % 1000); } break; +case TRC_SCHED_CLASS_EVT(CSCHED, 11): /* STEAL_CHECK */ +if(opt.dump_all) { +struct { +unsigned int peer_cpu, check; +} *r = (typeof(r))ri->d; + +printf(" %s csched:load_balance %s %u\n", + ri->dump_header, r->check ? "checking" : "skipping", + r->peer_cpu); +} +break; /* CREDIT 2 (TRC_CSCHED2_xxx) */ case TRC_SCHED_CLASS_EVT(CSCHED2, 1): /* TICK */ case TRC_SCHED_CLASS_EVT(CSCHED2, 4): /* CREDIT_ADD*/ diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c index bae29a7..ed44d40 100644 --- a/xen/common/sched_credit.c +++ b/xen/common/sched_credit.c @@ -134,6 +134,7 @@ #define TRC_CSCHED_BOOST_END TRC_SCHED_CLASS_EVT(CSCHED, 8) #define TRC_CSCHED_SCHEDULE TRC_SCHED_CLASS_EVT(CSCHED, 9) #define TRC_CSCHED_RATELIMIT TRC_SCHED_CLASS_EVT(CSCHED, 10) +#define TRC_CSCHED_STEAL_CHECK TRC_SCHED_CLASS_EVT(CSCHED, 11) /* @@ -1767,12 +1768,17 @@ csched_load_balance(struct csched_private *prv, int cpu, */ spinlock_t *lock = pcpu_schedule_trylock(peer_cpu); +SCHED_STAT_CRANK(steal_check); + if ( !lock ) { SCHED_STAT_CRANK(steal_trylock_failed); +TRACE_2D(TRC_CSCHED_STEAL_CHECK, peer_cpu, /* skipped */ 0); peer_cpu = cpumask_cycle(peer_cpu, &workers); continue; } +else +TRACE_2D(TRC_CSCHED_STEAL_CHECK, peer_cpu, /* checked */ 1); /* Any work over there to steal? */ speer = cpumask_test_cpu(peer_cpu, online) ? diff --git a/xen/include/xen/perfc_defn.h b/xen/include/xen/perfc_defn.h index 0d702f0..742d429 100644 --- a/xen/include/xen/perfc_defn.h +++ b/xen/include/xen/perfc_defn.h @@ -48,6 +48,7 @@ PERFCOUNTER(vcpu_unpark,"csched: vcpu_unpark") PERFCOUNTER(load_balance_idle, "csched: load_balance_idle") PERFCOUNTER(load_balance_over, "csched: load_balance_over") PERFCOUNTER(load_balance_other, "csched: load_balance_other") +PERFCOUNTER(steal_check,"csched: steal_check") PERFCOUNTER(steal_trylock_failed, "csched: steal_trylock_failed") PERFCOUNTER(steal_peer_idle,"csched: steal_peer_idle") PERFCOUNTER(migrate_queued, "csched: migrate_queued") ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH 2/6] xen: credit: (micro) optimize csched_runq_steal().
Chacking whether or not a vCPU can be 'stolen' from a peer pCPU's runqueue is relatively cheap. Therefore, let's do that as early as possible, avoiding potentially useless complex checks, and cpumask manipulations. Signed-off-by: Dario Faggioli --- Cc: George Dunlap --- xen/common/sched_credit.c | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c index 63a8675..2b13e99 100644 --- a/xen/common/sched_credit.c +++ b/xen/common/sched_credit.c @@ -708,12 +708,10 @@ static inline int __csched_vcpu_is_migrateable(struct vcpu *vc, int dest_cpu, cpumask_t *mask) { /* - * Don't pick up work that's in the peer's scheduling tail or hot on - * peer PCPU. Only pick up work that prefers and/or is allowed to run - * on our CPU. + * Don't pick up work that's or hot on peer PCPU, or that can't (or + * would prefer not to) run on cpu. */ -return !vc->is_running && - !__csched_vcpu_is_cache_hot(vc) && +return !__csched_vcpu_is_cache_hot(vc) && cpumask_test_cpu(dest_cpu, mask); } @@ -1622,7 +1620,9 @@ csched_runq_steal(int peer_cpu, int cpu, int pri, int balance_step) BUG_ON( is_idle_vcpu(vc) ); /* - * If the vcpu has no useful soft affinity, skip this vcpu. + * If the vcpu is still in peer_cpu's scheduling tail, or if it + * has no useful soft affinity, skip it. + * * In fact, what we want is to check if we have any "soft-affine * work" to steal, before starting to look at "hard-affine work". * @@ -1633,8 +1633,9 @@ csched_runq_steal(int peer_cpu, int cpu, int pri, int balance_step) * vCPUs with useful soft affinities in some sort of bitmap * or counter. */ -if ( balance_step == CSCHED_BALANCE_SOFT_AFFINITY - && !__vcpu_has_soft_affinity(vc, vc->cpu_hard_affinity) ) +if ( vc->is_running || + (balance_step == CSCHED_BALANCE_SOFT_AFFINITY + && !__vcpu_has_soft_affinity(vc, vc->cpu_hard_affinity)) ) continue; csched_balance_cpumask(vc, balance_step, cpumask_scratch); ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH 4/6] xen: credit1: treat pCPUs more evenly during balancing.
Right now, we use cpumask_first() for going through the bus pCPUs in csched_load_balance(). This means not all pCPUs have equal chances of seeing their pending work stolen. It also means there is more runqueue lock pressure on lower ID pCPUs. To avoid all this, let's record and remember, for each NUMA node, from what pCPU we have stolen for last, and start from that the following time. Signed-off-by: Dario Faggioli --- Cc: George Dunlap --- xen/common/sched_credit.c | 37 + 1 file changed, 33 insertions(+), 4 deletions(-) diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c index 529b6c7..bae29a7 100644 --- a/xen/common/sched_credit.c +++ b/xen/common/sched_credit.c @@ -229,6 +229,7 @@ struct csched_private { uint32_t credit; int credit_balance; uint32_t runq_sort; +uint32_t *balance_bias; unsigned ratelimit_us; /* Period of master and tick in milliseconds */ unsigned tslice_ms, tick_period_us, ticks_per_tslice; @@ -548,6 +549,7 @@ csched_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu) { struct csched_private *prv = CSCHED_PRIV(ops); struct csched_pcpu *spc = pcpu; +unsigned int node = cpu_to_node(cpu); unsigned long flags; /* @@ -571,6 +573,12 @@ csched_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu) prv->master = cpumask_first(prv->cpus); migrate_timer(&prv->master_ticker, prv->master); } +if ( prv->balance_bias[node] == cpu ) +{ +cpumask_and(cpumask_scratch, prv->cpus, &node_to_cpumask(node)); +if ( !cpumask_empty(cpumask_scratch) ) +prv->balance_bias[node] = cpumask_first(cpumask_scratch); +} kill_timer(&spc->ticker); if ( prv->ncpus == 0 ) kill_timer(&prv->master_ticker); @@ -610,6 +618,10 @@ init_pdata(struct csched_private *prv, struct csched_pcpu *spc, int cpu) NOW() + MILLISECS(prv->tslice_ms)); } +cpumask_and(cpumask_scratch, prv->cpus, &node_to_cpumask(cpu_to_node(cpu))); +if ( cpumask_weight(cpumask_scratch) == 1 ) +prv->balance_bias[cpu_to_node(cpu)] = cpu; + init_timer(&spc->ticker, csched_tick, (void *)(unsigned long)cpu, cpu); set_timer(&spc->ticker, NOW() + MICROSECS(prv->tick_period_us) ); @@ -1696,7 +1708,7 @@ csched_load_balance(struct csched_private *prv, int cpu, struct csched_vcpu *speer; cpumask_t workers; cpumask_t *online; -int peer_cpu, peer_node, bstep; +int peer_cpu, first_cpu, peer_node, bstep; int node = cpu_to_node(cpu); BUG_ON( cpu != snext->vcpu->processor ); @@ -1740,9 +1752,10 @@ csched_load_balance(struct csched_private *prv, int cpu, cpumask_and(&workers, &workers, &node_to_cpumask(peer_node)); __cpumask_clear_cpu(cpu, &workers); -peer_cpu = cpumask_first(&workers); -if ( peer_cpu >= nr_cpu_ids ) +first_cpu = cpumask_cycle(prv->balance_bias[peer_node], &workers); +if ( first_cpu >= nr_cpu_ids ) goto next_node; +peer_cpu = first_cpu; do { /* @@ -1770,12 +1783,18 @@ csched_load_balance(struct csched_private *prv, int cpu, if ( speer != NULL ) { *stolen = 1; +/* + * Next time we'll look for work to steal on this node, we + * will start from the next pCPU, with respect to this one, + * so we don't risk stealing always from the same ones. + */ +prv->balance_bias[peer_node] = peer_cpu; return speer; } peer_cpu = cpumask_cycle(peer_cpu, &workers); -} while( peer_cpu != cpumask_first(&workers) ); +} while( peer_cpu != first_cpu ); next_node: peer_node = cycle_node(peer_node, node_online_map); @@ -2126,6 +2145,14 @@ csched_init(struct scheduler *ops) prv = xzalloc(struct csched_private); if ( prv == NULL ) return -ENOMEM; + +prv->balance_bias = xzalloc_array(uint32_t, MAX_NUMNODES); +if ( prv->balance_bias == NULL ) +{ +xfree(prv); +return -ENOMEM; +} + if ( !zalloc_cpumask_var(&prv->cpus) || !zalloc_cpumask_var(&prv->idlers) || !zalloc_cpumask_var(&prv->overloaded) ) @@ -2133,6 +2160,7 @@ csched_init(struct scheduler *ops) free_cpumask_var(prv->overloaded); free_cpumask_var(prv->idlers); free_cpumask_var(prv->cpus); +xfree(prv->balance_bias); xfree(prv); return -ENOMEM; } @@ -2179,6 +2207,7 @@ csched_deinit(struct scheduler *ops) free_cpumask_var(prv->cpus); free_cpumask_var(prv->idlers); free_cpumask_var(prv->overloaded); +xfree(prv->balance_bias); xfree(prv); }
[Xen-devel] [PATCH 1/6] xen: credit1: simplify csched_runq_steal() a little bit.
Since we're holding the lock on the pCPU from which we are trying to steal, it can't have disappeared, so we can drop the check for that (and convert it in an ASSERT()). And since we try to steal only from busy pCPUs, it's unlikely for such pCPU to be idle, so we mark it as such (and bail early if it unfortunately is). Signed-off-by: Dario Faggioli --- Cc: George Dunlap --- xen/common/sched_credit.c | 87 +++-- 1 file changed, 44 insertions(+), 43 deletions(-) diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c index 4649e64..63a8675 100644 --- a/xen/common/sched_credit.c +++ b/xen/common/sched_credit.c @@ -1593,64 +1593,65 @@ static struct csched_vcpu * csched_runq_steal(int peer_cpu, int cpu, int pri, int balance_step) { const struct csched_pcpu * const peer_pcpu = CSCHED_PCPU(peer_cpu); -const struct vcpu * const peer_vcpu = curr_on_cpu(peer_cpu); struct csched_vcpu *speer; struct list_head *iter; struct vcpu *vc; +ASSERT(peer_pcpu != NULL); + /* * Don't steal from an idle CPU's runq because it's about to * pick up work from it itself. */ -if ( peer_pcpu != NULL && !is_idle_vcpu(peer_vcpu) ) +if ( unlikely(is_idle_vcpu(curr_on_cpu(peer_cpu))) ) +goto out; + +list_for_each( iter, &peer_pcpu->runq ) { -list_for_each( iter, &peer_pcpu->runq ) -{ -speer = __runq_elem(iter); +speer = __runq_elem(iter); -/* - * If next available VCPU here is not of strictly higher - * priority than ours, this PCPU is useless to us. - */ -if ( speer->pri <= pri ) -break; +/* + * If next available VCPU here is not of strictly higher + * priority than ours, this PCPU is useless to us. + */ +if ( speer->pri <= pri ) +break; -/* Is this VCPU runnable on our PCPU? */ -vc = speer->vcpu; -BUG_ON( is_idle_vcpu(vc) ); +/* Is this VCPU runnable on our PCPU? */ +vc = speer->vcpu; +BUG_ON( is_idle_vcpu(vc) ); -/* - * If the vcpu has no useful soft affinity, skip this vcpu. - * In fact, what we want is to check if we have any "soft-affine - * work" to steal, before starting to look at "hard-affine work". - * - * Notice that, if not even one vCPU on this runq has a useful - * soft affinity, we could have avoid considering this runq for - * a soft balancing step in the first place. This, for instance, - * can be implemented by taking note of on what runq there are - * vCPUs with useful soft affinities in some sort of bitmap - * or counter. - */ -if ( balance_step == CSCHED_BALANCE_SOFT_AFFINITY - && !__vcpu_has_soft_affinity(vc, vc->cpu_hard_affinity) ) -continue; +/* + * If the vcpu has no useful soft affinity, skip this vcpu. + * In fact, what we want is to check if we have any "soft-affine + * work" to steal, before starting to look at "hard-affine work". + * + * Notice that, if not even one vCPU on this runq has a useful + * soft affinity, we could have avoid considering this runq for + * a soft balancing step in the first place. This, for instance, + * can be implemented by taking note of on what runq there are + * vCPUs with useful soft affinities in some sort of bitmap + * or counter. + */ +if ( balance_step == CSCHED_BALANCE_SOFT_AFFINITY + && !__vcpu_has_soft_affinity(vc, vc->cpu_hard_affinity) ) +continue; -csched_balance_cpumask(vc, balance_step, cpumask_scratch); -if ( __csched_vcpu_is_migrateable(vc, cpu, cpumask_scratch) ) -{ -/* We got a candidate. Grab it! */ -TRACE_3D(TRC_CSCHED_STOLEN_VCPU, peer_cpu, - vc->domain->domain_id, vc->vcpu_id); -SCHED_VCPU_STAT_CRANK(speer, migrate_q); -SCHED_STAT_CRANK(migrate_queued); -WARN_ON(vc->is_urgent); -__runq_remove(speer); -vc->processor = cpu; -return speer; -} +csched_balance_cpumask(vc, balance_step, cpumask_scratch); +if ( __csched_vcpu_is_migrateable(vc, cpu, cpumask_scratch) ) +{ +/* We got a candidate. Grab it! */ +TRACE_3D(TRC_CSCHED_STOLEN_VCPU, peer_cpu, + vc->domain->domain_id, vc->vcpu_id); +SCHED_VCPU_STAT_CRANK(speer, migrate_q); +SCHED_STAT_CRANK(migrate_queued); +WARN_ON(vc->is_urgent); +__runq_remove(speer); +vc->processor = cpu; +return speer; }
[Xen-devel] [PATCH 0/6] xen: sched: improve scalability of Credit1, and optimize a bit both Credit1 and Credit2
Hello, This series aims at introducing some optimization and performance improvement in Credit1 (in certain specific situations), but Credit2 is lightly touched as well. The core of the series is patches 3 and 4, which aim at both redistributing and reducing spinlock contention during load balancing. In fact, Credit1 load balancing is based on "work stealing". This means that, when a pCPU would go idle, it looks around inside other pCPUs' runqueues, to see if there are vCPUs waiting to run, and steal the first one it finds. This process of going around pCPUs happens in a NUMA node wise fashion, and always starts from the first pCPU on each node. That may lead to higher scheduler lock pressure on lower ID pCPUs (of each node), as well as stealing happening more frequently from them. And this is what patch 4 aims at fixing. This is not necessarily expected to improve performance per-se, although a fairer lock pressure is likely to bring benefits. Still about load balancing, when deciding whether or not to try to steal work from a pCPU, we only consider the ones that are non-idle. A pCPU which is running a vCPU and does not have any other vCPU in its runqueue waiting to run, is not idle, but there's nothing we can steal. It's therefore possible that we check a number of pCPUs, which include at least trying to take their runqueue lock, only to figure out that there's no vCPU we can grab, and we need to continue checking other processors. On a large system, in situations where the load (i.e., the number of runnable and running vCPUs) is only _slightly_ higher than the number of pCPUs, this can have a significant performance impact. A way of improving this situation, is to keep track of not only whether pCPUs are idle or not, but also which ones have more than one runnable vCPU, which basically means they have at least one vCPU ready to be stolen by anyone that would otherwise go idle. And this exactly is what is done in patch 3. Finally, patch 6 does to Credit2, something similar to what patch 3 does to Credit1, although the context is, actually, different. In fact, there are places in Credit2, where we just want the scheduler to give us one pCPU from a certain runqueue. We do that by means of cpumask_any(), which is great, but comes at a price. As a matter of fact, we don't really care much which one, as a subsequent call to runq_tickle() will override such choice anyway. But --within runqueue tickle itself-- the pCPU we choose is at last used as an hint, so we really don't want to totally give up and introduce biases (by, e.g., just using cpumask_first(). We, therefore, use an approach similar to the one in patch 3, i.e., we record and remember which pCPU we choose for last, and start from it next time. As said already, the performance benefit of this series are to be expected on large systems, with very specific load conditions. I've done some benchmarking on a 16 CPUs NUMA box that I have at hand. I've run three experiments. A Xen compile ('MAKEXEN') inside a 16 vCPUs guest. 2 Xen compiles running concurrently inside two 16 vCPUs VMs. And a Xen compile and Iperf ('IPERF') running concurrently inside two 16 vCPUs VMs Here's the result for Credit1. For MAKEXEN, lower is better, while for IPERF, higher is. Average and standard dviation over 10 runs is what's shown in the tables below. |CREDIT1| |---| |MAKEXEN, 1VM|MAKEXEN, 2VMs |vm1: MAKEXEN vm2: IPERF | |baseline patched|baseline patched|baseline patched baseline patched| |||-| avg | 18.154 17.906| 52.832 51.088| 29.306 28.936 15.840 18.580| stdd| 0.5800.059| 1.0611.717| 0.7570.296 4.2642.492| So, with this patch applied, Xen compiles a little bit faster, and Iperf achieves higher throughput, which is great. :-D As far as Credit2 goes, here's the numbers: |CREDIT2| |---| |MAKEXEN, 1VM|MAKEXEN, 2VMs |vm1: MAKEXEN vm2: IPERF | |baseline patched|baseline patched|baseline patched baseline patched| |||-| avg | 18.062 17.894| 53.136 52.968| 32.754 32.880 18.160 19.240| stdd| 0.3310.205| 0.8860.566| 0.7870.548 1.9101.842| In this case, the expected impact of the series is smaller, and that in fact matches what we get, with baseline and patched numbers very very close. What I wanted to verify is that I was not introducing regressions, which seems to be confirmed. Thanks and Regards, Dario --- Dario Faggioli (6): xen: credit1: simplify csched_runq_steal() a little bit. xen: credit: (micro) optimize csch
[Xen-devel] [PATCH 6/6] xen: credit2: avoid cpumask_any() in pick_cpu().
cpumask_any() is costly (because of the randomization). And since it does not really matter which exact CPU is selected within a runqueue, as that will be overridden shortly after, in runq_tickle(), spending too much time and achieving true randomization is pretty pointless. As the picked CPU, however, would be used as an hint, within runq_tickle(), don't give up on it entirely, and let's make sure we don't always return the same CPU, or favour lower or higher ID CPUs. To achieve that, let's record and remember, for each runqueue, what CPU we picked for last, and start from that the following time. Signed-off-by: Dario Faggioli --- Cc: George Dunlap Cc: Anshul Makkar --- xen/common/sched_credit2.c | 22 ++ 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c index af457c1..7b9e1a1 100644 --- a/xen/common/sched_credit2.c +++ b/xen/common/sched_credit2.c @@ -363,6 +363,7 @@ struct csched2_runqueue_data { struct list_head runq; /* Ordered list of runnable vms */ struct list_head svc; /* List of all vcpus assigned to this runqueue */ unsigned int max_weight; +unsigned int pick_bias;/* Last CPU we picked. Start from it next time */ cpumask_t idle,/* Currently idle pcpus */ smt_idle, /* Fully idle-and-untickled cores (see below) */ @@ -1679,7 +1680,9 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc) { cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu), &svc->migrate_rqd->active); -new_cpu = cpumask_any(cpumask_scratch_cpu(cpu)); +new_cpu = cpumask_cycle(svc->migrate_rqd->pick_bias, +cpumask_scratch_cpu(cpu)); +svc->migrate_rqd->pick_bias = new_cpu; goto out_up; } /* Fall-through to normal cpu pick */ @@ -1737,7 +1740,9 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc) cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu), &prv->rqd[min_rqi].active); -new_cpu = cpumask_any(cpumask_scratch_cpu(cpu)); +new_cpu = cpumask_cycle(prv->rqd[min_rqi].pick_bias, +cpumask_scratch_cpu(cpu)); +prv->rqd[min_rqi].pick_bias = new_cpu; BUG_ON(new_cpu >= nr_cpu_ids); out_up: @@ -1854,7 +1859,9 @@ static void migrate(const struct scheduler *ops, cpupool_domain_cpumask(svc->vcpu->domain)); cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu), &trqd->active); -svc->vcpu->processor = cpumask_any(cpumask_scratch_cpu(cpu)); +svc->vcpu->processor = cpumask_cycle(trqd->pick_bias, + cpumask_scratch_cpu(cpu)); +trqd->pick_bias = svc->vcpu->processor; ASSERT(svc->vcpu->processor < nr_cpu_ids); _runq_assign(svc, trqd); @@ -2821,13 +2828,15 @@ csched2_dump(const struct scheduler *ops) printk("Runqueue %d:\n" "\tncpus = %u\n" "\tcpus = %s\n" - "\tmax_weight = %d\n" + "\tmax_weight = %u\n" + "\tpick_bias = %u\n" "\tinstload = %d\n" "\taveload= %"PRI_stime" (~%"PRI_stime"%%)\n", i, cpumask_weight(&prv->rqd[i].active), cpustr, prv->rqd[i].max_weight, + prv->rqd[i].pick_bias, prv->rqd[i].load, prv->rqd[i].avgload, fraction); @@ -2930,6 +2939,9 @@ init_pdata(struct csched2_private *prv, unsigned int cpu) __cpumask_set_cpu(cpu, &prv->initialized); __cpumask_set_cpu(cpu, &rqd->smt_idle); +if ( cpumask_weight(&rqd->active) == 1 ) +rqd->pick_bias = cpu; + return rqi; } @@ -3042,6 +3054,8 @@ csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu) printk(XENLOG_INFO " No cpus left on runqueue, disabling\n"); deactivate_runqueue(prv, rqi); } +else if ( rqd->pick_bias == cpu ) +rqd->pick_bias = cpumask_first(&rqd->active); spin_unlock(&rqd->lock); ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-unstable bisection] complete test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm
On Wed, Mar 01, 2017 at 11:53:52PM +, osstest service owner wrote: > branch xen-unstable > xenbranch xen-unstable > job test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm > testid xen-boot > > Tree: linux git://xenbits.xen.org/linux-pvops.git > Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git > Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git > Tree: qemuu git://xenbits.xen.org/qemu-xen.git > Tree: xen git://xenbits.xen.org/xen.git > > *** Found and reproduced problem changeset *** > > Bug is in tree: xen git://xenbits.xen.org/xen.git > Bug introduced: c5b9805bc1f79319ae342c65fcc201a15a47 > Bug not present: b199c44afa3a0d18d0e968e78a590eb9e69e20ad > Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/106324/ > > > commit c5b9805bc1f79319ae342c65fcc201a15a47 > Author: Daniel Kiper > Date: Wed Feb 22 14:38:06 2017 +0100 > > efi: create new early memory allocator Does anybody know why this still fails? Andrew applied a fix (workaround) for this yesterday. Daniel ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-unstable bisection] complete test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm
On 02/03/17 10:41, Daniel Kiper wrote: > On Wed, Mar 01, 2017 at 11:53:52PM +, osstest service owner wrote: >> branch xen-unstable >> xenbranch xen-unstable >> job test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm >> testid xen-boot >> >> Tree: linux git://xenbits.xen.org/linux-pvops.git >> Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git >> Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git >> Tree: qemuu git://xenbits.xen.org/qemu-xen.git >> Tree: xen git://xenbits.xen.org/xen.git >> >> *** Found and reproduced problem changeset *** >> >> Bug is in tree: xen git://xenbits.xen.org/xen.git >> Bug introduced: c5b9805bc1f79319ae342c65fcc201a15a47 >> Bug not present: b199c44afa3a0d18d0e968e78a590eb9e69e20ad >> Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/106324/ >> >> >> commit c5b9805bc1f79319ae342c65fcc201a15a47 >> Author: Daniel Kiper >> Date: Wed Feb 22 14:38:06 2017 +0100 >> >> efi: create new early memory allocator > Does anybody know why this still fails? Andrew applied a fix (workaround) for > this yesterday. Bisection runs from before will still be running. Also, the fix hasn't passed staging yet. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/5] xen: create wrappers for all other uses of xc_hvm_XXX() functions
On Wed, Mar 01, 2017 at 04:16:32PM +, Paul Durrant wrote: > > -Original Message- > > From: Anthony PERARD [mailto:anthony.per...@citrix.com] > > Sent: 01 March 2017 16:14 > > To: Paul Durrant > > Cc: xen-de...@lists.xenproject.org; qemu-de...@nongnu.org; Stefano > > Stabellini ; Paolo Bonzini ; > > Richard Henderson ; Eduardo Habkost > > ; Michael S. Tsirkin > > Subject: Re: [PATCH 3/5] xen: create wrappers for all other uses of > > xc_hvm_XXX() functions > > > > On Thu, Feb 23, 2017 at 02:53:53PM +, Paul Durrant wrote: > > > This patch creates inline wrapper functions in xen_common.h for all open > > > coded calls to xc_hvm_XXX() functions outside of xen_common.h so that > > use > > > of xen_xc can be made implicit. This again is in preparation for the move > > > to using libxendevicemodel. > > > > > > Signed-off-by: Paul Durrant > > > --- > > > diff --git a/include/hw/xen/xen_common.h > > b/include/hw/xen/xen_common.h > > > index 1e08b98..31cf25f 100644 > > > --- a/include/hw/xen/xen_common.h > > > +++ b/include/hw/xen/xen_common.h > > > @@ -26,6 +26,50 @@ extern xc_interface *xen_xc; > > > * We don't support Xen prior to 4.2.0. > > > */ > > > > > > +static inline int xen_set_mem_type(domid_t domid, hvmmem_type_t > > type, > > > + uint64_t first_pfn, uint32_t nr) > > > +{ > > > > I don't know if it matters from where the functions are called, but > > here, xc_hvm_set_mem_type takes a "uint64_t nr" (and not uint32_t). > > Yes, the old APIs were wrong and discarded the upper 32 bits, so limiting > here is correct. Moving to the new API fixes the issue. OK, thanks. In that case: Reviewed-by: Anthony PERARD -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable-smoke test] 106361: regressions - trouble: broken/fail/pass
flight 106361 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/106361/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl 6 xen-boot fail REGR. vs. 106320 Tests which did not succeed, but are not blocking: test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass build-arm64 5 xen-buildfail never pass build-arm64-pvops 5 kernel-build fail never pass version targeted for testing: xen f14ce1a13455bfc3fb7b33c185e3e49749d68e28 baseline version: xen 1a0ab02e342cfd80decd72606c94479e4b309a3c Last test of basis 106320 2017-03-01 20:01:41 Z0 days Testing same since 106361 2017-03-02 09:01:12 Z0 days1 attempts People who touched revisions under test: Marek Marczykowski-Górecki Olaf Hering Wei Liu jobs: build-amd64 pass build-arm64 fail build-armhf pass build-amd64-libvirt pass build-arm64-pvopsfail test-armhf-armhf-xl fail test-arm64-arm64-xl-xsm broken test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. commit f14ce1a13455bfc3fb7b33c185e3e49749d68e28 Author: Wei Liu Date: Wed Mar 1 12:32:26 2017 + MAINTAINERS: add Marek as maintainer of python bindings Marek has kindly agreed to step up and co-maintain the python bindings. Signed-off-by: Wei Liu Acked-by: Marek Marczykowski-Górecki commit d31828e348e7b7229d7ecb32f7f0bee7f125e61e Author: Olaf Hering Date: Wed Mar 1 12:27:08 2017 + tools/libxendevicemodel: define O_CLOEXEC Some libc headers don't have O_CLOEXEC, we need to take care of it by defining to 0 (on the ground that such glibc might barf on O_CLOEXEC). Fixes e7745d8ef5 ("tools/libxendevicemodel: introduce a Linux-specific implementation") Signed-off-by: Olaf Hering Acked-by: Wei Liu commit 34b05462d93f06f9a08caf98b7aed865f6b796bd Author: Wei Liu Date: Wed Mar 1 11:07:24 2017 + acpi: check if mapping is valid before reading / writing If acpi_map_os_memory has failed, return early with AE_ERROR. Coverity-ID: 1401601 Coverity-ID: 1401602 Signed-off-by: Wei Liu Reviewed-by: Jan Beulich (qemu changes not included) ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH net] xen-netback: Use GFP_ATOMIC to allocate hash
Allocation of new_hash, inside xenvif_new_hash(), always happen in softirq context, so use GFP_ATOMIC instead of GFP_KERNEL for new hash allocation. Signed-off-by: Anoob Soman --- drivers/net/xen-netback/hash.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen-netback/hash.c index e8c5ddd..3c4c58b 100644 --- a/drivers/net/xen-netback/hash.c +++ b/drivers/net/xen-netback/hash.c @@ -39,7 +39,7 @@ static void xenvif_add_hash(struct xenvif *vif, const u8 *tag, unsigned long flags; bool found; - new = kmalloc(sizeof(*entry), GFP_KERNEL); + new = kmalloc(sizeof(*entry), GFP_ATOMIC); if (!new) return; -- 2.7.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH net] xen-netback: Use GFP_ATOMIC to allocate hash
> -Original Message- > From: Anoob Soman [mailto:anoob.so...@citrix.com] > Sent: 02 March 2017 10:50 > To: net...@vger.kernel.org; xen-de...@lists.xenproject.org > Cc: Paul Durrant ; Wei Liu ; > Anoob Soman > Subject: [PATCH net] xen-netback: Use GFP_ATOMIC to allocate hash > > Allocation of new_hash, inside xenvif_new_hash(), always happen > in softirq context, so use GFP_ATOMIC instead of GFP_KERNEL for new > hash allocation. > > Signed-off-by: Anoob Soman Acked-by: Paul Durrant > --- > drivers/net/xen-netback/hash.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen- > netback/hash.c > index e8c5ddd..3c4c58b 100644 > --- a/drivers/net/xen-netback/hash.c > +++ b/drivers/net/xen-netback/hash.c > @@ -39,7 +39,7 @@ static void xenvif_add_hash(struct xenvif *vif, const u8 > *tag, > unsigned long flags; > bool found; > > - new = kmalloc(sizeof(*entry), GFP_KERNEL); > + new = kmalloc(sizeof(*entry), GFP_ATOMIC); > if (!new) > return; > > -- > 2.7.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH net] xen-netback: Use GFP_ATOMIC to allocate hash
On Thu, Mar 02, 2017 at 10:50:20AM +, Anoob Soman wrote: > Allocation of new_hash, inside xenvif_new_hash(), always happen > in softirq context, so use GFP_ATOMIC instead of GFP_KERNEL for new > hash allocation. > > Signed-off-by: Anoob Soman Acked-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [DO NOT APPLY PATCH XTF 0/2] UMIP test case
On Thu, Mar 02, 2017 at 09:05:27AM +, Andrew Cooper wrote: > On 02/03/2017 08:42, Wei Liu wrote: > > I wrote this long time ago before UMIP was merged. > > > > Yu, since you asked, I might as well post it for your reference on how to > > do it with XTF. > > > > This series is not yet tested in any way. > > Unfortunately, you execute all of the sensitive instructions in kernel > mode, where they wouldn't fault even with UMIP active. > > For full testing of a feature like this, the test should include a check > that the ability to modify CR4.UMIP depends strictly on the visibility > of the feature, that uses in the kernel still continue to work, even > when active, and that behaviour returns back to normal after the feature > has been deactivated. > I'm pretty sure there are bugs. :-) > The first patch however, is fine to go in. Feel free to pick it up in your next batch. > > ~Andrew > > (I am also going to pretend I didn't see those macros :) ) ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/5] configure: detect presence of libxendevicemodel
On Thu, Mar 02, 2017 at 09:06:43AM +, Paul Durrant wrote: > > -Original Message- > > From: Anthony PERARD [mailto:anthony.per...@citrix.com] > > Sent: 01 March 2017 17:18 > > To: Paul Durrant > > Cc: xen-de...@lists.xenproject.org; qemu-de...@nongnu.org; Stefano > > Stabellini > > Subject: Re: [PATCH 4/5] configure: detect presence of libxendevicemodel > > > > On Thu, Feb 23, 2017 at 02:53:54PM +, Paul Durrant wrote: > > > This patch adds code in configure to set > > CONFIG_XEN_CTRL_INTERFACE_VERSION > > > to a new value of 490 if libxendevicemodel is present in the build > > > environment. > > > > > > Signed-off-by: Paul Durrant > > > --- > > > Cc: Stefano Stabellini > > > Cc: Anthony Perard > > > --- > > > configure | 19 +++ > > > 1 file changed, 19 insertions(+) > > > > > > diff --git a/configure b/configure > > > index 8e8f18d..fc1e12b 100755 > > > --- a/configure > > > +++ b/configure > > > @@ -1980,6 +1980,25 @@ EOF > > ># Xen unstable > > >elif > > >cat > $TMPC < > > +#undef XC_WANT_COMPAT_DEVICEMODEL_API > > > +#define __XEN_TOOLS__ > > > > Isn't __XEN_TOOLS__ supposed to be reserved for some to tools inside > > xen.git? > > > > Also it seems to be the only time this define is used in your patch > > series. > > > No. QEMU falls under the definition of 'tools' as far as Xen goes and the > hypercalls and xendevicemodel API are protected by that. The reason you don't > see it elsewhere is that xenctrl.h defines it. (See > http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libxc/include/xenctrl.h;hb=HEAD#l27). > I think that's a little underhand so I thought I'd make the new code in > configure more transparent. I can change it to just include xenctrl.h before > xendevicemodel.h if you'd prefer. I guess that is fine with __XEN_TOOLS__. You can add my Reviewed-by: Anthony PERARD -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/5] configure: detect presence of libxendevicemodel
> -Original Message- > From: Anthony PERARD [mailto:anthony.per...@citrix.com] > Sent: 02 March 2017 10:55 > To: Paul Durrant > Cc: xen-de...@lists.xenproject.org; qemu-de...@nongnu.org; Stefano > Stabellini > Subject: Re: [PATCH 4/5] configure: detect presence of libxendevicemodel > > On Thu, Mar 02, 2017 at 09:06:43AM +, Paul Durrant wrote: > > > -Original Message- > > > From: Anthony PERARD [mailto:anthony.per...@citrix.com] > > > Sent: 01 March 2017 17:18 > > > To: Paul Durrant > > > Cc: xen-de...@lists.xenproject.org; qemu-de...@nongnu.org; Stefano > > > Stabellini > > > Subject: Re: [PATCH 4/5] configure: detect presence of > libxendevicemodel > > > > > > On Thu, Feb 23, 2017 at 02:53:54PM +, Paul Durrant wrote: > > > > This patch adds code in configure to set > > > CONFIG_XEN_CTRL_INTERFACE_VERSION > > > > to a new value of 490 if libxendevicemodel is present in the build > > > > environment. > > > > > > > > Signed-off-by: Paul Durrant > > > > --- > > > > Cc: Stefano Stabellini > > > > Cc: Anthony Perard > > > > --- > > > > configure | 19 +++ > > > > 1 file changed, 19 insertions(+) > > > > > > > > diff --git a/configure b/configure > > > > index 8e8f18d..fc1e12b 100755 > > > > --- a/configure > > > > +++ b/configure > > > > @@ -1980,6 +1980,25 @@ EOF > > > ># Xen unstable > > > >elif > > > >cat > $TMPC < > > > +#undef XC_WANT_COMPAT_DEVICEMODEL_API > > > > +#define __XEN_TOOLS__ > > > > > > Isn't __XEN_TOOLS__ supposed to be reserved for some to tools inside > > > xen.git? > > > > > > Also it seems to be the only time this define is used in your patch > > > series. > > > > > No. QEMU falls under the definition of 'tools' as far as Xen goes and the > hypercalls and xendevicemodel API are protected by that. The reason you > don't see it elsewhere is that xenctrl.h defines it. (See > http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libxc/include/xenc > trl.h;hb=HEAD#l27). > > I think that's a little underhand so I thought I'd make the new code in > configure more transparent. I can change it to just include xenctrl.h before > xendevicemodel.h if you'd prefer. > > I guess that is fine with __XEN_TOOLS__. > You can add my > Reviewed-by: Anthony PERARD Cool. Thanks, Paul > > -- > Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 0/6] xen: sched: improve scalability of Credit1, and optimize a bit both Credit1 and Credit2
On Thu, 2017-03-02 at 11:37 +0100, Dario Faggioli wrote: > --- > Dario Faggioli (6): > xen: credit1: simplify csched_runq_steal() a little bit. > xen: credit: (micro) optimize csched_runq_steal(). > xen: credit1: increase efficiency and scalability of load balancing. > xen: credit1: treat pCPUs more evenly during balancing. > xen/tools: tracing: add record for credit1 runqueue stealing. > xen: credit2: avoid cpumask_any() in pick_cpu(). > > tools/xentrace/formats |1 > tools/xentrace/xenalyze.c| 11 ++ > xen/common/sched_credit.c| 199 > +- > xen/common/sched_credit2.c | 22 - > xen/include/xen/perfc_defn.h |1 > 5 files changed, 169 insertions(+), 65 deletions(-) > And there's a git branch available here: git://xenbits.xen.org/people/dariof/xen.git rel/sched/credit1-credit2-optim-and-scalability http://xenbits.xen.org/gitweb/?p=people/dariof/xen.git;a=shortlog;h=refs/heads/rel/sched/credit1-credit2-optim-and-scalability https://travis-ci.org/fdario/xen/builds/206774498 Sorry I forgot the links before. Regards, Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/5] configure: detect presence of libxendevicemodel
On 02/03/17 11:54, Anthony PERARD wrote: > On Thu, Mar 02, 2017 at 09:06:43AM +, Paul Durrant wrote: >>> -Original Message- >>> From: Anthony PERARD [mailto:anthony.per...@citrix.com] >>> Sent: 01 March 2017 17:18 >>> To: Paul Durrant >>> Cc: xen-de...@lists.xenproject.org; qemu-de...@nongnu.org; Stefano >>> Stabellini >>> Subject: Re: [PATCH 4/5] configure: detect presence of libxendevicemodel >>> >>> On Thu, Feb 23, 2017 at 02:53:54PM +, Paul Durrant wrote: This patch adds code in configure to set >>> CONFIG_XEN_CTRL_INTERFACE_VERSION to a new value of 490 if libxendevicemodel is present in the build environment. Signed-off-by: Paul Durrant --- Cc: Stefano Stabellini Cc: Anthony Perard --- configure | 19 +++ 1 file changed, 19 insertions(+) diff --git a/configure b/configure index 8e8f18d..fc1e12b 100755 --- a/configure +++ b/configure @@ -1980,6 +1980,25 @@ EOF # Xen unstable elif cat > $TMPC <>>> +#undef XC_WANT_COMPAT_DEVICEMODEL_API +#define __XEN_TOOLS__ >>> >>> Isn't __XEN_TOOLS__ supposed to be reserved for some to tools inside >>> xen.git? >>> >>> Also it seems to be the only time this define is used in your patch >>> series. >>> >> No. QEMU falls under the definition of 'tools' as far as Xen goes and the >> hypercalls and xendevicemodel API are protected by that. The reason you >> don't see it elsewhere is that xenctrl.h defines it. (See >> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libxc/include/xenctrl.h;hb=HEAD#l27). >> I think that's a little underhand so I thought I'd make the new code in >> configure more transparent. I can change it to just include xenctrl.h before >> xendevicemodel.h if you'd prefer. > > I guess that is fine with __XEN_TOOLS__. > You can add my > Reviewed-by: Anthony PERARD > Sorry for jumping in only now: Wouldn't it be better to base qemu's configure modification for support of Xen 4.9 (and newer) on my just posted series introducing pkg-config? I'm just about to expand this series to all of Xen's libraries and I already have a qemu patch at hand supporting this new scheme. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/6] xen: credit1: increase efficiency and scalability of load balancing.
On 02/03/17 10:38, Dario Faggioli wrote: > During load balancing, we check the non idle pCPUs to > see if they have runnable but not running vCPUs that > can be stolen by and set to run on currently idle pCPUs. > > If a pCPU has only one running (or runnable) vCPU, > though, we don't want to steal it from there, and > it's therefore pointless bothering with it > (especially considering that bothering means trying > to take its runqueue lock!). > > On large systems, when load is only slightly higher > than the number of pCPUs (i.e., there are just a few > more active vCPUs than the number of the pCPUs), this > may mean that: > - we go through all the pCPUs, > - for each one, we (try to) take its runqueue locks, > - we figure out there's actually nothing to be stolen! > > To mitigate this, we introduce here the concept of > overloaded runqueues, and a cpumask where to record > what pCPUs are in such state. > > An overloaded runqueue has at least runnable 2 vCPUs > (plus the idle one, which is always there). Typically, > this means 1 vCPU is running, and 1 is sitting in the > runqueue, and can hence be stolen. > > Then, in csched_balance_load(), it is enough to go > over the overloaded pCPUs, instead than all non-idle > pCPUs, which is better. > > signed-off-by: Dario Faggioli > --- > Cc: George Dunlap > Cc: Andrew Cooper Malcolm’s solution to this problem is https://github.com/xenserver/xen-4.7.pg/commit/0f830b9f229fa6472accc9630ad16cfa42258966 This has been in 2 releases of XenServer now, and has a very visible improvement for aggregate multi-queue multi-vm intrahost network performance (although I can't find the numbers right now). The root of the performance problems is that pcpu_schedule_trylock() is expensive even for the local case, while cross-cpu locking is much worse. Locking every single pcpu in turn is terribly expensive, in terms of hot cacheline pingpong, and the lock is frequently contended. As a first opinion of this patch, you are adding another cpumask which is going to play hot cacheline pingpong. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/5] configure: detect presence of libxendevicemodel
> -Original Message- > From: Juergen Gross [mailto:jgr...@suse.com] > Sent: 02 March 2017 11:01 > To: Anthony Perard ; Paul Durrant > > Cc: xen-de...@lists.xenproject.org; Stefano Stabellini > ; qemu-de...@nongnu.org > Subject: Re: [Xen-devel] [PATCH 4/5] configure: detect presence of > libxendevicemodel > > On 02/03/17 11:54, Anthony PERARD wrote: > > On Thu, Mar 02, 2017 at 09:06:43AM +, Paul Durrant wrote: > >>> -Original Message- > >>> From: Anthony PERARD [mailto:anthony.per...@citrix.com] > >>> Sent: 01 March 2017 17:18 > >>> To: Paul Durrant > >>> Cc: xen-de...@lists.xenproject.org; qemu-de...@nongnu.org; Stefano > >>> Stabellini > >>> Subject: Re: [PATCH 4/5] configure: detect presence of > libxendevicemodel > >>> > >>> On Thu, Feb 23, 2017 at 02:53:54PM +, Paul Durrant wrote: > This patch adds code in configure to set > >>> CONFIG_XEN_CTRL_INTERFACE_VERSION > to a new value of 490 if libxendevicemodel is present in the build > environment. > > Signed-off-by: Paul Durrant > --- > Cc: Stefano Stabellini > Cc: Anthony Perard > --- > configure | 19 +++ > 1 file changed, 19 insertions(+) > > diff --git a/configure b/configure > index 8e8f18d..fc1e12b 100755 > --- a/configure > +++ b/configure > @@ -1980,6 +1980,25 @@ EOF > # Xen unstable > elif > cat > $TMPC < +#undef XC_WANT_COMPAT_DEVICEMODEL_API > +#define __XEN_TOOLS__ > >>> > >>> Isn't __XEN_TOOLS__ supposed to be reserved for some to tools inside > >>> xen.git? > >>> > >>> Also it seems to be the only time this define is used in your patch > >>> series. > >>> > >> No. QEMU falls under the definition of 'tools' as far as Xen goes and the > hypercalls and xendevicemodel API are protected by that. The reason you > don't see it elsewhere is that xenctrl.h defines it. (See > http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libxc/include/xenc > trl.h;hb=HEAD#l27). > >> I think that's a little underhand so I thought I'd make the new code in > configure more transparent. I can change it to just include xenctrl.h before > xendevicemodel.h if you'd prefer. > > > > I guess that is fine with __XEN_TOOLS__. > > You can add my > > Reviewed-by: Anthony PERARD > > > > Sorry for jumping in only now: > > Wouldn't it be better to base qemu's configure modification for support > of Xen 4.9 (and newer) on my just posted series introducing pkg-config? > > I'm just about to expand this series to all of Xen's libraries and I > already have a qemu patch at hand supporting this new scheme. Ok, but I'm at v2 and I was literally just about to post. I guess it will also be pretty trivial to change configure after this change has been applied. Paul > > > Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 3/5] xen: create wrappers for all other uses of xc_hvm_XXX() functions
This patch creates inline wrapper functions in xen_common.h for all open coded calls to xc_hvm_XXX() functions outside of xen_common.h so that use of xen_xc can be made implicit. This again is in preparation for the move to using libxendevicemodel. Signed-off-by: Paul Durrant Reviewed-by: Anthony Perard --- Cc: Stefano Stabellini Cc: Paolo Bonzini Cc: Richard Henderson Cc: Eduardo Habkost Cc: "Michael S. Tsirkin" --- hw/i386/xen/xen_platform.c | 2 +- include/hw/xen/xen_common.h | 44 xen-hvm.c | 27 +-- 3 files changed, 58 insertions(+), 15 deletions(-) diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c index 6010f35..1419fc9 100644 --- a/hw/i386/xen/xen_platform.c +++ b/hw/i386/xen/xen_platform.c @@ -195,7 +195,7 @@ static void platform_fixed_ioport_writeb(void *opaque, uint32_t addr, uint32_t v case 0: /* Platform flags */ { hvmmem_type_t mem_type = (val & PFFLAG_ROM_LOCK) ? HVMMEM_ram_ro : HVMMEM_ram_rw; -if (xc_hvm_set_mem_type(xen_xc, xen_domid, mem_type, 0xc0, 0x40)) { +if (xen_set_mem_type(xen_domid, mem_type, 0xc0, 0x40)) { DPRINTF("unable to change ro/rw state of ROM memory area!\n"); } else { s->flags = val & PFFLAG_ROM_LOCK; diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h index 1e08b98..31cf25f 100644 --- a/include/hw/xen/xen_common.h +++ b/include/hw/xen/xen_common.h @@ -26,6 +26,50 @@ extern xc_interface *xen_xc; * We don't support Xen prior to 4.2.0. */ +static inline int xen_set_mem_type(domid_t domid, hvmmem_type_t type, + uint64_t first_pfn, uint32_t nr) +{ +return xc_hvm_set_mem_type(xen_xc, domid, type, first_pfn, nr); +} + +static inline int xen_set_pci_intx_level(domid_t domid, uint16_t segment, + uint8_t bus, uint8_t device, + uint8_t intx, unsigned int level) +{ +return xc_hvm_set_pci_intx_level(xen_xc, domid, segment, bus, device, + intx, level); +} + +static inline int xen_set_pci_link_route(domid_t domid, uint8_t link, + uint8_t irq) +{ +return xc_hvm_set_pci_link_route(xen_xc, domid, link, irq); +} + +static inline int xen_inject_msi(domid_t domid, uint64_t msi_addr, + uint32_t msi_data) +{ +return xc_hvm_inject_msi(xen_xc, domid, msi_addr, msi_data); +} + +static inline int xen_set_isa_irq_level(domid_t domid, uint8_t irq, +unsigned int level) +{ +return xc_hvm_set_isa_irq_level(xen_xc, domid, irq, level); +} + +static inline int xen_track_dirty_vram(domid_t domid, uint64_t first_pfn, + uint32_t nr, unsigned long *bitmap) +{ +return xc_hvm_track_dirty_vram(xen_xc, domid, first_pfn, nr, bitmap); +} + +static inline int xen_modified_memory(domid_t domid, uint64_t first_pfn, + uint32_t nr) +{ +return xc_hvm_modified_memory(xen_xc, domid, first_pfn, nr); +} + /* Xen 4.2 through 4.6 */ #if CONFIG_XEN_CTRL_INTERFACE_VERSION < 471 diff --git a/xen-hvm.c b/xen-hvm.c index edf4983..4b928cf 100644 --- a/xen-hvm.c +++ b/xen-hvm.c @@ -125,8 +125,8 @@ int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num) void xen_piix3_set_irq(void *opaque, int irq_num, int level) { -xc_hvm_set_pci_intx_level(xen_xc, xen_domid, 0, 0, irq_num >> 2, - irq_num & 3, level); +xen_set_pci_intx_level(xen_domid, 0, 0, irq_num >> 2, + irq_num & 3, level); } void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len) @@ -141,7 +141,7 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len) } v &= 0xf; if (((address + i) >= 0x60) && ((address + i) <= 0x63)) { -xc_hvm_set_pci_link_route(xen_xc, xen_domid, address + i - 0x60, v); +xen_set_pci_link_route(xen_domid, address + i - 0x60, v); } } } @@ -156,7 +156,7 @@ int xen_is_pirq_msi(uint32_t msi_data) void xen_hvm_inject_msi(uint64_t addr, uint32_t data) { -xc_hvm_inject_msi(xen_xc, xen_domid, addr, data); +xen_inject_msi(xen_domid, addr, data); } static void xen_suspend_notifier(Notifier *notifier, void *data) @@ -168,7 +168,7 @@ static void xen_suspend_notifier(Notifier *notifier, void *data) static void xen_set_irq(void *opaque, int irq, int level) { -xc_hvm_set_isa_irq_level(xen_xc, xen_domid, irq, level); +xen_set_isa_irq_level(xen_domid, irq, level); } qemu_irq *xen_interrupt_controller_init(void) @@ -481,10 +481,10 @@ static void xen_set_memory(struct MemoryListener *listener, section->mr, section->offset_within_region); }
[Xen-devel] [PATCH v2 0/5] xen: use new xendevicemodel library
My recent patches to Xen [1] introduced a new library to support running device models for HVM guests. This series ports QEMU onto the new library if it is available in the build environment. [1] Patches starting with http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=b108240265deea37601f1a605910069a837da841 Paul Durrant (5): xen: make use of xen_xc implicit in xen_common.h inlines xen: rename xen_modified_memory() to xen_hvm_modified_memory() xen: create wrappers for all other uses of xc_hvm_XXX() functions configure: detect presence of libxendevicemodel xen: use libxendevicemodel when available configure| 19 hw/i386/xen/xen_platform.c | 2 +- hw/xen/xen_backend.c | 2 - include/exec/ram_addr.h | 4 +- include/hw/xen/xen.h | 2 +- include/hw/xen/xen_backend.h | 2 - include/hw/xen/xen_common.h | 201 --- xen-common.c | 11 +++ xen-hvm-stub.c | 2 +- xen-hvm.c| 49 ++- 10 files changed, 208 insertions(+), 86 deletions(-) -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 1/5] xen: make use of xen_xc implicit in xen_common.h inlines
Doing this will make the transition to using the new libxendevicemodel interface less intrusive on the callers of these functions, since using the new library will require a change of handle. NOTE: The patch also moves the 'externs' for xen_xc and xen_fmem from xen_backend.h to xen_common.h, and the declarations from xen_backend.c to xen-common.c, which is where they belong. Signed-off-by: Paul Durrant Reviewed-by: Anthony Perard --- Cc: Stefano Stabellini --- hw/xen/xen_backend.c | 2 - include/hw/xen/xen_backend.h | 2 - include/hw/xen/xen_common.h | 90 +++- xen-common.c | 3 ++ xen-hvm.c| 20 +- 5 files changed, 60 insertions(+), 57 deletions(-) diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c index 6c21c37..d34c49e 100644 --- a/hw/xen/xen_backend.c +++ b/hw/xen/xen_backend.c @@ -43,8 +43,6 @@ BusState *xen_sysbus; /* - */ /* public */ -xc_interface *xen_xc = NULL; -xenforeignmemory_handle *xen_fmem = NULL; struct xs_handle *xenstore = NULL; const char *xen_protocol; diff --git a/include/hw/xen/xen_backend.h b/include/hw/xen/xen_backend.h index 4f4799a..30811a1 100644 --- a/include/hw/xen/xen_backend.h +++ b/include/hw/xen/xen_backend.h @@ -14,8 +14,6 @@ OBJECT_CHECK(XenDevice, (obj), TYPE_XENBACKEND) /* variables */ -extern xc_interface *xen_xc; -extern xenforeignmemory_handle *xen_fmem; extern struct xs_handle *xenstore; extern const char *xen_protocol; extern DeviceState *xen_sysdev; diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h index dce76ee..1e08b98 100644 --- a/include/hw/xen/xen_common.h +++ b/include/hw/xen/xen_common.h @@ -20,6 +20,8 @@ #include "qemu/queue.h" #include "hw/xen/trace.h" +extern xc_interface *xen_xc; + /* * We don't support Xen prior to 4.2.0. */ @@ -73,6 +75,8 @@ static inline void *xenforeignmemory_map(xc_interface *h, uint32_t dom, #endif +extern xenforeignmemory_handle *xen_fmem; + void destroy_hvm_domain(bool reboot); /* shutdown/destroy current domain because of an error */ @@ -107,8 +111,7 @@ static inline int xen_get_vmport_regs_pfn(xc_interface *xc, domid_t dom, #endif -static inline int xen_get_default_ioreq_server_info(xc_interface *xc, -domid_t dom, +static inline int xen_get_default_ioreq_server_info(domid_t dom, xen_pfn_t *ioreq_pfn, xen_pfn_t *bufioreq_pfn, evtchn_port_t @@ -117,7 +120,7 @@ static inline int xen_get_default_ioreq_server_info(xc_interface *xc, unsigned long param; int rc; -rc = xc_get_hvm_param(xc, dom, HVM_PARAM_IOREQ_PFN, ¶m); +rc = xc_get_hvm_param(xen_xc, dom, HVM_PARAM_IOREQ_PFN, ¶m); if (rc < 0) { fprintf(stderr, "failed to get HVM_PARAM_IOREQ_PFN\n"); return -1; @@ -125,7 +128,7 @@ static inline int xen_get_default_ioreq_server_info(xc_interface *xc, *ioreq_pfn = param; -rc = xc_get_hvm_param(xc, dom, HVM_PARAM_BUFIOREQ_PFN, ¶m); +rc = xc_get_hvm_param(xen_xc, dom, HVM_PARAM_BUFIOREQ_PFN, ¶m); if (rc < 0) { fprintf(stderr, "failed to get HVM_PARAM_BUFIOREQ_PFN\n"); return -1; @@ -133,7 +136,7 @@ static inline int xen_get_default_ioreq_server_info(xc_interface *xc, *bufioreq_pfn = param; -rc = xc_get_hvm_param(xc, dom, HVM_PARAM_BUFIOREQ_EVTCHN, +rc = xc_get_hvm_param(xen_xc, dom, HVM_PARAM_BUFIOREQ_EVTCHN, ¶m); if (rc < 0) { fprintf(stderr, "failed to get HVM_PARAM_BUFIOREQ_EVTCHN\n"); @@ -156,63 +159,64 @@ static inline int xen_get_default_ioreq_server_info(xc_interface *xc, typedef uint16_t ioservid_t; -static inline void xen_map_memory_section(xc_interface *xc, domid_t dom, +static inline void xen_map_memory_section(domid_t dom, ioservid_t ioservid, MemoryRegionSection *section) { } -static inline void xen_unmap_memory_section(xc_interface *xc, domid_t dom, +static inline void xen_unmap_memory_section(domid_t dom, ioservid_t ioservid, MemoryRegionSection *section) { } -static inline void xen_map_io_section(xc_interface *xc, domid_t dom, +static inline void xen_map_io_section(domid_t dom, ioservid_t ioservid, MemoryRegionSection *section) { } -static inline void xen_unmap_io_section(xc_interface *xc, domid_t dom, +static inline void xen_unmap_io_section(domid_t dom, ioservid_t ioservid, MemoryRegi
[Xen-devel] [PATCH v2 5/5] xen: use libxendevicemodel when available
This patch modifies the wrapper functions in xen_common.h to use the new xendevicemodel interface if it is available along with compatibility code to use the old libxenctrl interface if it is not. Signed-off-by: Paul Durrant --- Cc: Stefano Stabellini Cc: Anthony Perard v2: - Add a compat define for xenforeignmemory_close() since this is now used. --- include/hw/xen/xen_common.h | 115 +++- xen-common.c| 8 +++ 2 files changed, 90 insertions(+), 33 deletions(-) diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h index 31cf25f..48444e5 100644 --- a/include/hw/xen/xen_common.h +++ b/include/hw/xen/xen_common.h @@ -9,6 +9,7 @@ #undef XC_WANT_COMPAT_EVTCHN_API #undef XC_WANT_COMPAT_GNTTAB_API #undef XC_WANT_COMPAT_MAP_FOREIGN_API +#undef XC_WANT_COMPAT_DEVICEMODEL_API #include #include @@ -26,48 +27,95 @@ extern xc_interface *xen_xc; * We don't support Xen prior to 4.2.0. */ +#if CONFIG_XEN_CTRL_INTERFACE_VERSION < 490 + +typedef xc_interface xendevicemodel_handle; + +#define xendevicemodel_open(l, f) xen_xc + +#define xendevicemodel_map_io_range_to_ioreq_server \ +xc_hvm_map_io_range_to_ioreq_server +#define xendevicemodel_unmap_io_range_from_ioreq_server \ +xc_hvm_unmap_io_range_from_ioreq_server +#define xendevicemodel_map_pcidev_to_ioreq_server \ +xc_hvm_map_pcidev_to_ioreq_server +#define xendevicemodel_unmap_pcidev_from_ioreq_server \ +xc_hvm_unmap_pcidev_from_ioreq_server +#define xendevicemodel_create_ioreq_server \ +xc_hvm_create_ioreq_server +#define xendevicemodel_destroy_ioreq_server \ +xc_hvm_destroy_ioreq_server +#define xendevicemodel_get_ioreq_server_info \ +xc_hvm_get_ioreq_server_info +#define xendevicemodel_set_ioreq_server_state \ +xc_hvm_set_ioreq_server_state +#define xendevicemodel_set_pci_intx_level \ +xc_hvm_set_pci_intx_level +#define xendevicemodel_set_pci_link_route \ +xc_hvm_set_pci_link_route +#define xendevicemodel_set_isa_irq_level \ +xc_hvm_set_isa_irq_level +#define xendevicemodel_inject_msi \ +xc_hvm_inject_msi +#define xendevicemodel_set_mem_type \ +xc_hvm_set_mem_type +#define xendevicemodel_track_dirty_vram \ +xc_hvm_track_dirty_vram +#define xendevicemodel_modified_memory \ +xc_hvm_modified_memory + +#else /* CONFIG_XEN_CTRL_INTERFACE_VERSION >= 490 */ + +#include + +#endif + +extern xendevicemodel_handle *xen_dmod; + static inline int xen_set_mem_type(domid_t domid, hvmmem_type_t type, uint64_t first_pfn, uint32_t nr) { -return xc_hvm_set_mem_type(xen_xc, domid, type, first_pfn, nr); +return xendevicemodel_set_mem_type(xen_dmod, domid, type, first_pfn, + nr); } static inline int xen_set_pci_intx_level(domid_t domid, uint16_t segment, uint8_t bus, uint8_t device, uint8_t intx, unsigned int level) { -return xc_hvm_set_pci_intx_level(xen_xc, domid, segment, bus, device, - intx, level); +return xendevicemodel_set_pci_intx_level(xen_dmod, domid, segment, bus, + device, intx, level); } static inline int xen_set_pci_link_route(domid_t domid, uint8_t link, uint8_t irq) { -return xc_hvm_set_pci_link_route(xen_xc, domid, link, irq); +return xendevicemodel_set_pci_link_route(xen_dmod, domid, link, irq); } static inline int xen_inject_msi(domid_t domid, uint64_t msi_addr, uint32_t msi_data) { -return xc_hvm_inject_msi(xen_xc, domid, msi_addr, msi_data); +return xendevicemodel_inject_msi(xen_dmod, domid, msi_addr, msi_data); } static inline int xen_set_isa_irq_level(domid_t domid, uint8_t irq, unsigned int level) { -return xc_hvm_set_isa_irq_level(xen_xc, domid, irq, level); +return xendevicemodel_set_isa_irq_level(xen_dmod, domid, irq, level); } static inline int xen_track_dirty_vram(domid_t domid, uint64_t first_pfn, uint32_t nr, unsigned long *bitmap) { -return xc_hvm_track_dirty_vram(xen_xc, domid, first_pfn, nr, bitmap); +return xendevicemodel_track_dirty_vram(xen_dmod, domid, first_pfn, nr, + bitmap); } static inline int xen_modified_memory(domid_t domid, uint64_t first_pfn, uint32_t nr) { -return xc_hvm_modified_memory(xen_xc, domid, first_pfn, nr); +return xendevicemodel_modified_memory(xen_dmod, domid, first_pfn, nr); } /* Xen 4.2 through 4.6 */ @@ -97,6 +145,7 @@ typedef xc_gnttab xengnttab_handle; xc_gnttab_map_domain_grant_refs(h, c, d, r, p) #define xenforeignmemory_open(l, f) xen_xc +#define xenforeignmemory_close(h) static inline void *xenfor
Re: [Xen-devel] [xen-unstable bisection] complete test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm
On Thu, Mar 02, 2017 at 10:42:57AM +, Andrew Cooper wrote: > On 02/03/17 10:41, Daniel Kiper wrote: > > On Wed, Mar 01, 2017 at 11:53:52PM +, osstest service owner wrote: > >> branch xen-unstable > >> xenbranch xen-unstable > >> job test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm > >> testid xen-boot > >> > >> Tree: linux git://xenbits.xen.org/linux-pvops.git > >> Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git > >> Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git > >> Tree: qemuu git://xenbits.xen.org/qemu-xen.git > >> Tree: xen git://xenbits.xen.org/xen.git > >> > >> *** Found and reproduced problem changeset *** > >> > >> Bug is in tree: xen git://xenbits.xen.org/xen.git > >> Bug introduced: c5b9805bc1f79319ae342c65fcc201a15a47 > >> Bug not present: b199c44afa3a0d18d0e968e78a590eb9e69e20ad > >> Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/106324/ > >> > >> > >> commit c5b9805bc1f79319ae342c65fcc201a15a47 > >> Author: Daniel Kiper > >> Date: Wed Feb 22 14:38:06 2017 +0100 > >> > >> efi: create new early memory allocator > > Does anybody know why this still fails? Andrew applied a fix (workaround) > > for this yesterday. > > Bisection runs from before will still be running. Also, the fix hasn't This is what I expected. > passed staging yet. Yep, I saw that. Thanks for update. Daniel ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 4/5] configure: detect presence of libxendevicemodel
This patch adds code in configure to set CONFIG_XEN_CTRL_INTERFACE_VERSION to a new value of 490 if libxendevicemodel is present in the build environment. Signed-off-by: Paul Durrant Reviewed-by: Anthony Perard --- Cc: Stefano Stabellini --- configure | 19 +++ 1 file changed, 19 insertions(+) diff --git a/configure b/configure index 8e8f18d..fc1e12b 100755 --- a/configure +++ b/configure @@ -1980,6 +1980,25 @@ EOF # Xen unstable elif cat > $TMPC < +int main(void) { + xendevicemodel_handle *xd; + + xd = xendevicemodel_open(0, 0); + xendevicemodel_close(xd); + + return 0; +} +EOF + compile_prog "" "$xen_libs $xen_stable_libs -lxendevicemodel" +then +xen_stable_libs="$xen_stable_libs -lxendevicemodel" +xen_ctrl_version=490 +xen=yes + elif + cat > $TMPC
[Xen-devel] [PATCH v2 2/5] xen: rename xen_modified_memory() to xen_hvm_modified_memory()
This patch is a purely cosmetic change that avoids a name collision in a subsequent patch. Signed-off-by: Paul Durrant Reviewed-by: Anthony Perard --- Cc: Paolo Bonzini Cc: Stefano Stabellini --- include/exec/ram_addr.h | 4 ++-- include/hw/xen/xen.h| 2 +- xen-hvm-stub.c | 2 +- xen-hvm.c | 2 +- 4 files changed, 5 insertions(+), 5 deletions(-) diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h index 3e79466..8715af6 100644 --- a/include/exec/ram_addr.h +++ b/include/exec/ram_addr.h @@ -259,7 +259,7 @@ static inline void cpu_physical_memory_set_dirty_range(ram_addr_t start, rcu_read_unlock(); -xen_modified_memory(start, length); +xen_hvm_modified_memory(start, length); } #if !defined(_WIN32) @@ -313,7 +313,7 @@ static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap, rcu_read_unlock(); -xen_modified_memory(start, pages << TARGET_PAGE_BITS); +xen_hvm_modified_memory(start, pages << TARGET_PAGE_BITS); } else { uint8_t clients = tcg_enabled() ? DIRTY_CLIENTS_ALL : DIRTY_CLIENTS_NOCODE; /* diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h index 09c2ce5..2b1733b 100644 --- a/include/hw/xen/xen.h +++ b/include/hw/xen/xen.h @@ -43,7 +43,7 @@ void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory); void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, struct MemoryRegion *mr, Error **errp); -void xen_modified_memory(ram_addr_t start, ram_addr_t length); +void xen_hvm_modified_memory(ram_addr_t start, ram_addr_t length); void xen_register_framebuffer(struct MemoryRegion *mr); diff --git a/xen-hvm-stub.c b/xen-hvm-stub.c index c500325..3ca6c51 100644 --- a/xen-hvm-stub.c +++ b/xen-hvm-stub.c @@ -50,7 +50,7 @@ void xen_register_framebuffer(MemoryRegion *mr) { } -void xen_modified_memory(ram_addr_t start, ram_addr_t length) +void xen_hvm_modified_memory(ram_addr_t start, ram_addr_t length) { } diff --git a/xen-hvm.c b/xen-hvm.c index dbb8c66..edf4983 100644 --- a/xen-hvm.c +++ b/xen-hvm.c @@ -1391,7 +1391,7 @@ void xen_shutdown_fatal_error(const char *fmt, ...) qemu_system_shutdown_request(); } -void xen_modified_memory(ram_addr_t start, ram_addr_t length) +void xen_hvm_modified_memory(ram_addr_t start, ram_addr_t length) { if (unlikely(xen_in_migration)) { int rc; -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH] tools: Fix build of QEMU with lib xendevicemodel support
Signed-off-by: Anthony PERARD --- tools/Makefile | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/Makefile b/tools/Makefile index 68633a413d..3e15463567 100644 --- a/tools/Makefile +++ b/tools/Makefile @@ -269,6 +269,7 @@ subdir-all-qemu-xen-dir: qemu-xen-dir-find -I$(XEN_ROOT)/tools/libs/evtchn/include \ -I$(XEN_ROOT)/tools/libs/gnttab/include \ -I$(XEN_ROOT)/tools/libs/foreignmemory/include \ + -I$(XEN_ROOT)/tools/libs/devicemodel/include \ -I$(XEN_ROOT)/tools/libxc/include \ -I$(XEN_ROOT)/tools/xenstore/include \ -I$(XEN_ROOT)/tools/xenstore/compat/include \ @@ -278,6 +279,7 @@ subdir-all-qemu-xen-dir: qemu-xen-dir-find -L$(XEN_ROOT)/tools/libs/evtchn \ -L$(XEN_ROOT)/tools/libs/gnttab \ -L$(XEN_ROOT)/tools/libs/foreignmemory \ + -L$(XEN_ROOT)/tools/libs/devicemodel \ -Wl,-rpath-link=$(XEN_ROOT)/tools/libs/toollog \ -Wl,-rpath-link=$(XEN_ROOT)/tools/libs/evtchn \ -Wl,-rpath-link=$(XEN_ROOT)/tools/libs/gnttab \ -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/5] configure: detect presence of libxendevicemodel
On 02/03/17 12:06, Paul Durrant wrote: >> -Original Message- >> From: Juergen Gross [mailto:jgr...@suse.com] >> Sent: 02 March 2017 11:01 >> To: Anthony Perard ; Paul Durrant >> >> Cc: xen-de...@lists.xenproject.org; Stefano Stabellini >> ; qemu-de...@nongnu.org >> Subject: Re: [Xen-devel] [PATCH 4/5] configure: detect presence of >> libxendevicemodel >> >> On 02/03/17 11:54, Anthony PERARD wrote: >>> On Thu, Mar 02, 2017 at 09:06:43AM +, Paul Durrant wrote: > -Original Message- > From: Anthony PERARD [mailto:anthony.per...@citrix.com] > Sent: 01 March 2017 17:18 > To: Paul Durrant > Cc: xen-de...@lists.xenproject.org; qemu-de...@nongnu.org; Stefano > Stabellini > Subject: Re: [PATCH 4/5] configure: detect presence of >> libxendevicemodel > > On Thu, Feb 23, 2017 at 02:53:54PM +, Paul Durrant wrote: >> This patch adds code in configure to set > CONFIG_XEN_CTRL_INTERFACE_VERSION >> to a new value of 490 if libxendevicemodel is present in the build >> environment. >> >> Signed-off-by: Paul Durrant >> --- >> Cc: Stefano Stabellini >> Cc: Anthony Perard >> --- >> configure | 19 +++ >> 1 file changed, 19 insertions(+) >> >> diff --git a/configure b/configure >> index 8e8f18d..fc1e12b 100755 >> --- a/configure >> +++ b/configure >> @@ -1980,6 +1980,25 @@ EOF >># Xen unstable >>elif >>cat > $TMPC <> +#undef XC_WANT_COMPAT_DEVICEMODEL_API >> +#define __XEN_TOOLS__ > > Isn't __XEN_TOOLS__ supposed to be reserved for some to tools inside > xen.git? > > Also it seems to be the only time this define is used in your patch > series. > No. QEMU falls under the definition of 'tools' as far as Xen goes and the >> hypercalls and xendevicemodel API are protected by that. The reason you >> don't see it elsewhere is that xenctrl.h defines it. (See >> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libxc/include/xenc >> trl.h;hb=HEAD#l27). I think that's a little underhand so I thought I'd make the new code in >> configure more transparent. I can change it to just include xenctrl.h before >> xendevicemodel.h if you'd prefer. >>> >>> I guess that is fine with __XEN_TOOLS__. >>> You can add my >>> Reviewed-by: Anthony PERARD >>> >> >> Sorry for jumping in only now: >> >> Wouldn't it be better to base qemu's configure modification for support >> of Xen 4.9 (and newer) on my just posted series introducing pkg-config? >> >> I'm just about to expand this series to all of Xen's libraries and I >> already have a qemu patch at hand supporting this new scheme. > > Ok, but I'm at v2 and I was literally just about to post. I guess it will > also be pretty trivial to change configure after this change has been applied. Sure, this will work. Its your decision. :-) Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3] x86/apicv: Enhance posted-interrupt processing
On Thu, Mar 02, 2017 at 02:41:55AM -0700, Jan Beulich wrote: On 02.03.17 at 02:49, wrote: > >The patch title, btw, makes it looks like this isn't a bug fix, which is >contrary to the understanding I've gained so far. Thanks to your patience and your changing so much description for me. Also to the questions you asked. I agree to the comments i don't reply to. How about this: Fix a wrongly IPI suppression during posted interrupt delivery > >> __vmx_deliver_posted_interrupt() wrongly used a softirq bit to decide whether >> to suppress an IPI. Its logic was: the first time an IPI was sent, we set >> the softirq bit. Next time, we would check that softirq bit before sending >> another IPI. If the 1st IPI arrived at the pCPU which was in >> non-root mode, the hardware would consume the IPI and sync PIR to vIRR. >> During the process, no one (both hardware and software) will clear the >> softirq bit. As a result, the following IPI would be wrongly suppressed. >> >> This patch discards the suppression check, always sending IPI. >> The softirq also need to be raised. But there is a little change. >> This patch moves the place where we raise a softirq for >> 'cpu != smp_processor_id()' case to the IPI interrupt handler. >> Namely, don't raise a softirq for this case and set the interrupt handler >> to pi_notification_interrupt()(in which a softirq is raised) regardless of >> posted interrupt enabled or not. > >As using a PI thing even in the non-PI case is counterintuitive, this >needs expanding on imo. Maybe not here but in a code comment. >Or perhaps, looking at the code, this is just not precise enough a >description: The code is inside a cpu_has_vmx_posted_intr_processing >conditional, and what you do is move code out of the iommu_intpost >specific section. I.e. a reference to IOMMU may simply be missing >here. Yes. The right description is "regardless of VT-d PI enabled or not". > >> The only difference is when the IPI arrives >> at the pCPU which is happened in non-root mode, the patch will not raise a > >s/patch/code/ (or some such) > >> useless softirq since the IPI is consumed by hardware rather than raise a >> softirq unconditionally. >> >> Quan doesn't have enough time to upstream this fix patch. He asks me to do >> this. Merge another his related patch >> (https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg02885.html). > >This doesn't belong in the commit message, and hence should be after >the first ---. > Will take care of this later. >> --- a/xen/arch/x86/hvm/vmx/vmx.c >> +++ b/xen/arch/x86/hvm/vmx/vmx.c >> @@ -1842,13 +1842,59 @@ static void __vmx_deliver_posted_interrupt(struct >> vcpu *v) >> bool_t running = v->is_running; >> >> vcpu_unblock(v); >> +/* >> + * The underlying 'IF' excludes two cases which we don't need further > >Who or what is 'IF'? > I meaned the 'if sentence'. >> + * operation to make sure the target vCPU (@v) will sync PIR to vIRR >> ASAP. >> + * Specifically, the two cases are: >> + * 1. The target vCPU is not running, meaning it is blocked or runnable. >> + * Since we have unblocked the target vCPU above, the target vCPU will >> + * sync PIR to vIRR when it is chosen to run. >> + * 2. The target vCPU is the current vCPU and in_irq() is false. It >> means >> + * the function is called in noninterrupt context. > > * 2. The target vCPU is the current vCPU and we're in > * non-interrupt context. > >> Since we don't call >> + * the function in noninterrupt context after the last time a vCPU syncs >> + * PIR to vIRR, excluding this case is also safe. > >It is not really clear to me what "the function" here refers to. Surely >the function the comment is in won't call itself, no matter what >context. Yes, it is not clear. How about: The target vCPU is the current vCPU and we're in non-interrupt context. we can't be in the window between the call to vmx_intr_assist() and interrupts getting disabled since no existed code reachs here in non-interrupt context. Considering that, it is safe to just leave without further operation. Do you think this is correct? I have an assumption here to explain why (in_irq() || (v != current)) is reasonable. It is totally my guess. > >> + */ >> if ( running && (in_irq() || (v != current)) ) >> { >> +/* >> + * Note: Only two cases will reach here: >> + * 1. The target vCPU is running on other pCPU. >> + * 2. The target vCPU is running on the same pCPU with the current >> + * vCPU > >This is an impossible thing: There can't be two vCPU-s running on >the same pCPU-s at the same time. Hence ... > >> and the current vCPU is in interrupt context. That's to say, >> + * the target vCPU is the current vCPU. > >... just this last statement is sufficient here. > >> + * Note2: Don't worry the v->processor may change since at least >> when >> + * the target vCPU is chosen to run or be blocke
Re: [Xen-devel] [PATCH] tools: Fix build of QEMU with lib xendevicemodel support
On Thu, Mar 02, 2017 at 11:22:35AM +, Anthony PERARD wrote: > Signed-off-by: Anthony PERARD Acked + applied. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/6] xen: credit1: increase efficiency and scalability of load balancing.
On Thu, 2017-03-02 at 11:06 +, Andrew Cooper wrote: > On 02/03/17 10:38, Dario Faggioli wrote: > > signed-off-by: Dario Faggioli > > --- > > Cc: George Dunlap > > Cc: Andrew Cooper > > Malcolm’s solution to this problem is https://github.com/xenserver/xe > n-4.7.pg/commit/0f830b9f229fa6472accc9630ad16cfa42258966 This has > been in 2 releases of XenServer now, and has a very visible > improvement for aggregate multi-queue multi-vm intrahost network > performance (although I can't find the numbers right now). > Yep, as you know, I had become aware of that. > The root of the performance problems is that pcpu_schedule_trylock() > is expensive even for the local case, while cross-cpu locking is much > worse. Locking every single pcpu in turn is terribly expensive, in > terms of hot cacheline pingpong, and the lock is frequently > contended. > > As a first opinion of this patch, you are adding another cpumask > which is going to play hot cacheline pingpong. > Can you clarify? Inside csched_load_balance(), I'm actually trading one currently existing cpumask_and() with another. Sure this new mask needs updating, but that only happens when a pCPUs acquires or leaves the "overloaded" status. Malcom's patch uses an atomic counter which does not fit very well in Credit1's load balancer's logic, and in fact it (potentially) requires an additional cpumask_cycle(). And it also comes with cache coherency implications, doesn't it? Regards, Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] tools: Fix build of QEMU with lib xendevicemodel support
> -Original Message- > From: Anthony PERARD [mailto:anthony.per...@citrix.com] > Sent: 02 March 2017 11:23 > To: xen-de...@lists.xenproject.org > Cc: Paul Durrant ; Anthony Perard > ; Ian Jackson ; Wei > Liu > Subject: [PATCH] tools: Fix build of QEMU with lib xendevicemodel support > > Signed-off-by: Anthony PERARD You beat me to it. I just found this when I blew away my installed tools. Thanks for fixing. Paul > --- > tools/Makefile | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/tools/Makefile b/tools/Makefile > index 68633a413d..3e15463567 100644 > --- a/tools/Makefile > +++ b/tools/Makefile > @@ -269,6 +269,7 @@ subdir-all-qemu-xen-dir: qemu-xen-dir-find > -I$(XEN_ROOT)/tools/libs/evtchn/include \ > -I$(XEN_ROOT)/tools/libs/gnttab/include \ > -I$(XEN_ROOT)/tools/libs/foreignmemory/include \ > + -I$(XEN_ROOT)/tools/libs/devicemodel/include \ > -I$(XEN_ROOT)/tools/libxc/include \ > -I$(XEN_ROOT)/tools/xenstore/include \ > -I$(XEN_ROOT)/tools/xenstore/compat/include \ > @@ -278,6 +279,7 @@ subdir-all-qemu-xen-dir: qemu-xen-dir-find > -L$(XEN_ROOT)/tools/libs/evtchn \ > -L$(XEN_ROOT)/tools/libs/gnttab \ > -L$(XEN_ROOT)/tools/libs/foreignmemory \ > + -L$(XEN_ROOT)/tools/libs/devicemodel \ > -Wl,-rpath-link=$(XEN_ROOT)/tools/libs/toollog \ > -Wl,-rpath-link=$(XEN_ROOT)/tools/libs/evtchn \ > -Wl,-rpath-link=$(XEN_ROOT)/tools/libs/gnttab \ > -- > Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/4] stubdom: set xen interface version for stubdom apps using xenctrl.h
Juergen Gross writes ("[PATCH v2 1/4] stubdom: set xen interface version for stubdom apps using xenctrl.h"): > A stubdom app using xenctrl.h must use the latest interface version of > Xen in order to avoid compatibility issues. Add the related config > item to the stubdom config files where needed. Acked-by: Ian Jackson ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/4] tools: add pkg-config file for libxc
Juergen Gross writes ("[PATCH v2 2/4] tools: add pkg-config file for libxc"): > Instead of a try and error approach needing updates for nearly each > new version of Xen just provide xencontrol.pc to be used via > pkg-config. Acked-by: Ian Jackson Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-unstable test] 106307: regressions - FAIL
On 02/03/17 04:15, osstest service owner wrote: > flight 106307 xen-unstable real [real] > http://logs.test-lab.xenproject.org/osstest/logs/106307/ > > Regressions :-( > > Tests which did not succeed and are blocking, > including tests which could not be run: > test-armhf-armhf-xl-cubietruck 15 guest-start/debian.repeat fail REGR. vs. > 105933 > test-armhf-armhf-libvirt-xsm 15 guest-start/debian.repeat fail REGR. vs. > 105933 Force push? These are known intermittent issues, and in the meantime, more work is piling into staging which has a non-zero chance of introducing further regressions. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 7/7] x86/pagewalk: Re-implement the pagetable walker
>>> On 27.02.17 at 15:03, wrote: > @@ -91,12 +91,12 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)( > #if GUEST_PAGING_LEVELS == 3 > top_map += (cr3 & ~(PAGE_MASK | 31)); > #endif > -missing = guest_walk_tables(v, p2m, ga, &gw, pfec[0], top_mfn, top_map); > +walk_ok = guest_walk_tables(v, p2m, ga, &gw, pfec[0], top_mfn, top_map); Since afaict pfec doesn't really point to an array, could I talk you into adjusting the uses you touch anyway to become *pfec? > --- a/xen/arch/x86/mm/hap/nested_ept.c > +++ b/xen/arch/x86/mm/hap/nested_ept.c > @@ -208,7 +208,7 @@ nept_walk_tables(struct vcpu *v, unsigned long l2ga, > ept_walk_t *gw) > goto out; > > map_err: > -if ( rc == _PAGE_PAGED ) > +if ( rc == PFEC_page_paged ) While orthogonal to the patch here, is == (instead of &) really the right check here? > @@ -3737,18 +3737,9 @@ sh_gva_to_gfn(struct vcpu *v, struct p2m_domain *p2m, > return vtlb_gfn; > #endif /* (SHADOW_OPTIMIZATIONS & SHOPT_VIRTUAL_TLB) */ > > -if ( (missing = sh_walk_guest_tables(v, va, &gw, pfec[0])) != 0 ) > +if ( !(walk_ok = sh_walk_guest_tables(v, va, &gw, pfec[0])) ) *pfec again (also further down)? > --- a/xen/include/asm-x86/guest_pt.h > +++ b/xen/include/asm-x86/guest_pt.h > @@ -236,19 +236,26 @@ static inline bool guest_supports_nx(const struct vcpu > *v) > return hvm_nx_enabled(v); > } > > +static inline bool guest_wp_enabled(const struct vcpu *v) > +{ > +/* PV guests can't control CR0.WP, and it is unconditionally set by Xen. > */ > +return is_pv_vcpu(v) || hvm_wp_enabled(v); > +} > > -/* > - * Some bits are invalid in any pagetable entry. > - * Normal flags values get represented in 24-bit values (see > - * get_pte_flags() and put_pte_flags()), so set bit 24 in > - * addition to be able to flag out of range frame numbers. > - */ > -#if GUEST_PAGING_LEVELS == 3 > -#define _PAGE_INVALID_BITS \ > -(_PAGE_INVALID_BIT | get_pte_flags(((1ull << 63) - 1) & ~(PAGE_SIZE - > 1))) > -#else /* 2-level and 4-level */ > -#define _PAGE_INVALID_BITS _PAGE_INVALID_BIT > -#endif > +static inline bool guest_smep_enabled(const struct vcpu *v) > +{ > +return is_pv_vcpu(v) ? 0 : hvm_smep_enabled(v); > +} > + > +static inline bool guest_smap_enabled(const struct vcpu *v) > +{ > +return is_pv_vcpu(v) ? 0 : hvm_smap_enabled(v); > +} > + > +static inline bool guest_pku_enabled(const struct vcpu *v) > +{ > +return is_pv_vcpu(v) ? 0 : hvm_pku_enabled(v); > +} All three "!is_pv_vcpu(v) && hvm_..._enabled(v)" ? Or otherwise please use false. It would be nice if you could adjust the one _eflags use to eflags, avoiding the need for a follow-up patch. Reviewed-by: Jan Beulich Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-unstable test] 106307: regressions - FAIL
On Thu, Mar 02, 2017 at 11:47:27AM +, Andrew Cooper wrote: > On 02/03/17 04:15, osstest service owner wrote: > > flight 106307 xen-unstable real [real] > > http://logs.test-lab.xenproject.org/osstest/logs/106307/ > > > > Regressions :-( > > > > Tests which did not succeed and are blocking, > > including tests which could not be run: > > test-armhf-armhf-xl-cubietruck 15 guest-start/debian.repeat fail REGR. vs. > > 105933 > > test-armhf-armhf-libvirt-xsm 15 guest-start/debian.repeat fail REGR. vs. > > 105933 > > Force push? These are known intermittent issues, and in the meantime, > more work is piling into staging which has a non-zero chance of > introducing further regressions. > Another flight is due to finish around 1400. We can wait a bit for that. > ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] BUG due to "xen-netback: protect resource cleaning on XenBus disconnect"
With commits f16f1df65 and 9a6cdf52b we get in our Xen testing: [ 174.512861] switch: port 2(vif3.0) entered disabled state [ 174.522735] BUG: sleeping function called from invalid context at /home/build/linux-linus/mm/vmalloc.c:1441 [ 174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch [ 174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: GW 4.10.0upstream-11073-g4977ab6-dirty #1 [ 174.524819] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0 03/14/2011 [ 174.525517] Call Trace: [ 174.526217] show_stack+0x23/0x60 [ 174.526899] dump_stack+0x5b/0x88 [ 174.527562] ___might_sleep+0xde/0x130 [ 174.528208] __might_sleep+0x35/0xa0 [ 174.528840] ? _raw_spin_unlock_irqrestore+0x13/0x20 [ 174.529463] ? __wake_up+0x40/0x50 [ 174.530089] remove_vm_area+0x20/0x90 [ 174.530724] __vunmap+0x1d/0xc0 [ 174.531346] ? delete_object_full+0x13/0x20 [ 174.531973] vfree+0x40/0x80 [ 174.532594] set_backend_state+0x18a/0xa90 [ 174.533221] ? dwc_scan_descriptors+0x24d/0x430 [ 174.533850] ? kfree+0x5b/0xc0 [ 174.534476] ? xenbus_read+0x3d/0x50 [ 174.535101] ? xenbus_read+0x3d/0x50 [ 174.535718] ? xenbus_gather+0x31/0x90 [ 174.536332] ? ___might_sleep+0xf6/0x130 [ 174.536945] frontend_changed+0x6b/0xd0 [ 174.537565] xenbus_otherend_changed+0x7d/0x80 [ 174.538185] frontend_changed+0x12/0x20 [ 174.538803] xenwatch_thread+0x74/0x110 [ 174.539417] ? woken_wake_function+0x20/0x20 [ 174.540049] kthread+0xe5/0x120 [ 174.540663] ? xenbus_printf+0x50/0x50 [ 174.541278] ? __kthread_init_worker+0x40/0x40 [ 174.541898] ret_from_fork+0x21/0x2c [ 174.548635] switch: port 2(vif3.0) entered disabled state I believe calling vfree() when holding a spin_lock isn't a good idea. Boris, this is the dumpdata failure: FAILURE 4.10.0upstream-11073-g4977ab6-dirty(x86_64) 4.10.0upstream-11073-g4977ab6-dirty(i386)\: 2017-03-02 (tst007) Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 7/7] x86/pagewalk: Re-implement the pagetable walker
On 02/03/17 11:52, Jan Beulich wrote: On 27.02.17 at 15:03, wrote: >> @@ -91,12 +91,12 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)( >> #if GUEST_PAGING_LEVELS == 3 >> top_map += (cr3 & ~(PAGE_MASK | 31)); >> #endif >> -missing = guest_walk_tables(v, p2m, ga, &gw, pfec[0], top_mfn, top_map); >> +walk_ok = guest_walk_tables(v, p2m, ga, &gw, pfec[0], top_mfn, top_map); > Since afaict pfec doesn't really point to an array, could I talk you into > adjusting the uses you touch anyway to become *pfec? I did find the use of an array annoying, but left it as-was. Any objection if I do a general cleanup of that as a prerequisite patch, to avoid introducing semi-unrelated changes into this patch? > >> --- a/xen/arch/x86/mm/hap/nested_ept.c >> +++ b/xen/arch/x86/mm/hap/nested_ept.c >> @@ -208,7 +208,7 @@ nept_walk_tables(struct vcpu *v, unsigned long l2ga, >> ept_walk_t *gw) >> goto out; >> >> map_err: >> -if ( rc == _PAGE_PAGED ) >> +if ( rc == PFEC_page_paged ) > While orthogonal to the patch here, is == (instead of &) really the > right check here? In practice, if PFEC_page_paged is set, it is the only bit set. But I agree - conceptually it should &, and will need to be when I fix nested pagetable walking. > >> @@ -3737,18 +3737,9 @@ sh_gva_to_gfn(struct vcpu *v, struct p2m_domain *p2m, >> return vtlb_gfn; >> #endif /* (SHADOW_OPTIMIZATIONS & SHOPT_VIRTUAL_TLB) */ >> >> -if ( (missing = sh_walk_guest_tables(v, va, &gw, pfec[0])) != 0 ) >> +if ( !(walk_ok = sh_walk_guest_tables(v, va, &gw, pfec[0])) ) > *pfec again (also further down)? > >> --- a/xen/include/asm-x86/guest_pt.h >> +++ b/xen/include/asm-x86/guest_pt.h >> @@ -236,19 +236,26 @@ static inline bool guest_supports_nx(const struct vcpu >> *v) >> return hvm_nx_enabled(v); >> } >> >> +static inline bool guest_wp_enabled(const struct vcpu *v) >> +{ >> +/* PV guests can't control CR0.WP, and it is unconditionally set by >> Xen. */ >> +return is_pv_vcpu(v) || hvm_wp_enabled(v); >> +} >> >> -/* >> - * Some bits are invalid in any pagetable entry. >> - * Normal flags values get represented in 24-bit values (see >> - * get_pte_flags() and put_pte_flags()), so set bit 24 in >> - * addition to be able to flag out of range frame numbers. >> - */ >> -#if GUEST_PAGING_LEVELS == 3 >> -#define _PAGE_INVALID_BITS \ >> -(_PAGE_INVALID_BIT | get_pte_flags(((1ull << 63) - 1) & ~(PAGE_SIZE - >> 1))) >> -#else /* 2-level and 4-level */ >> -#define _PAGE_INVALID_BITS _PAGE_INVALID_BIT >> -#endif >> +static inline bool guest_smep_enabled(const struct vcpu *v) >> +{ >> +return is_pv_vcpu(v) ? 0 : hvm_smep_enabled(v); >> +} >> + >> +static inline bool guest_smap_enabled(const struct vcpu *v) >> +{ >> +return is_pv_vcpu(v) ? 0 : hvm_smap_enabled(v); >> +} >> + >> +static inline bool guest_pku_enabled(const struct vcpu *v) >> +{ >> +return is_pv_vcpu(v) ? 0 : hvm_pku_enabled(v); >> +} > All three "!is_pv_vcpu(v) && hvm_..._enabled(v)" ? Or otherwise > please use false. Ok. > > It would be nice if you could adjust the one _eflags use to eflags, > avoiding the need for a follow-up patch. Will do. > Reviewed-by: Jan Beulich Thanks, ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3] x86/apicv: Enhance posted-interrupt processing
>>> On 02.03.17 at 05:28, wrote: > On Thu, Mar 02, 2017 at 02:41:55AM -0700, Jan Beulich wrote: > On 02.03.17 at 02:49, wrote: >> >>The patch title, btw, makes it looks like this isn't a bug fix, which is >>contrary to the understanding I've gained so far. > > Thanks to your patience and your changing so much description for me. Also > to the questions you asked. I agree to the comments i don't reply to. > How about this: > Fix a wrongly IPI suppression during posted interrupt delivery fix wrong IPI suppression during posted interrupt delivery >>> --- a/xen/arch/x86/hvm/vmx/vmx.c >>> +++ b/xen/arch/x86/hvm/vmx/vmx.c >>> @@ -1842,13 +1842,59 @@ static void __vmx_deliver_posted_interrupt(struct >>> vcpu *v) >>> bool_t running = v->is_running; >>> >>> vcpu_unblock(v); >>> +/* >>> + * The underlying 'IF' excludes two cases which we don't need further >> >>Who or what is 'IF'? >> > > I meaned the 'if sentence'. I'm sorry, but this doesn't make it any more clear. DYM some if() statement? But then why "underlying"? Was that meant to be "following"? >>> + * operation to make sure the target vCPU (@v) will sync PIR to vIRR >>> ASAP. >>> + * Specifically, the two cases are: >>> + * 1. The target vCPU is not running, meaning it is blocked or >>> runnable. >>> + * Since we have unblocked the target vCPU above, the target vCPU will >>> + * sync PIR to vIRR when it is chosen to run. >>> + * 2. The target vCPU is the current vCPU and in_irq() is false. It >>> means >>> + * the function is called in noninterrupt context. >> >> * 2. The target vCPU is the current vCPU and we're in >> * non-interrupt context. >> >>> Since we don't call >>> + * the function in noninterrupt context after the last time a vCPU >>> syncs >>> + * PIR to vIRR, excluding this case is also safe. >> >>It is not really clear to me what "the function" here refers to. Surely >>the function the comment is in won't call itself, no matter what >>context. > > Yes, it is not clear. How about: > The target vCPU is the current vCPU and we're in non-interrupt context. > we can't be in the window between the call to vmx_intr_assist() "we can't be" looks misleading. Perhaps "We're not being called from the window ..."? > and interrupts getting disabled since no existed code reachs here in > non-interrupt context. Considering that, it is safe to just leave without > further operation. > > Do you think this is correct? I have an assumption here to explain why > (in_irq() || (v != current)) is reasonable. It is totally my guess. As suggested earlier, feel free to simply refer to vcpu_kick() doing the same. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] BUG due to "xen-netback: protect resource cleaning on XenBus disconnect"
On Thu, Mar 02, 2017 at 12:56:20PM +0100, Juergen Gross wrote: > With commits f16f1df65 and 9a6cdf52b we get in our Xen testing: > > [ 174.512861] switch: port 2(vif3.0) entered disabled state > [ 174.522735] BUG: sleeping function called from invalid context at > /home/build/linux-linus/mm/vmalloc.c:1441 > [ 174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch > [ 174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: GW > 4.10.0upstream-11073-g4977ab6-dirty #1 > [ 174.524819] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0 > 03/14/2011 > [ 174.525517] Call Trace: > [ 174.526217] show_stack+0x23/0x60 > [ 174.526899] dump_stack+0x5b/0x88 > [ 174.527562] ___might_sleep+0xde/0x130 > [ 174.528208] __might_sleep+0x35/0xa0 > [ 174.528840] ? _raw_spin_unlock_irqrestore+0x13/0x20 > [ 174.529463] ? __wake_up+0x40/0x50 > [ 174.530089] remove_vm_area+0x20/0x90 > [ 174.530724] __vunmap+0x1d/0xc0 > [ 174.531346] ? delete_object_full+0x13/0x20 > [ 174.531973] vfree+0x40/0x80 > [ 174.532594] set_backend_state+0x18a/0xa90 > [ 174.533221] ? dwc_scan_descriptors+0x24d/0x430 > [ 174.533850] ? kfree+0x5b/0xc0 > [ 174.534476] ? xenbus_read+0x3d/0x50 > [ 174.535101] ? xenbus_read+0x3d/0x50 > [ 174.535718] ? xenbus_gather+0x31/0x90 > [ 174.536332] ? ___might_sleep+0xf6/0x130 > [ 174.536945] frontend_changed+0x6b/0xd0 > [ 174.537565] xenbus_otherend_changed+0x7d/0x80 > [ 174.538185] frontend_changed+0x12/0x20 > [ 174.538803] xenwatch_thread+0x74/0x110 > [ 174.539417] ? woken_wake_function+0x20/0x20 > [ 174.540049] kthread+0xe5/0x120 > [ 174.540663] ? xenbus_printf+0x50/0x50 > [ 174.541278] ? __kthread_init_worker+0x40/0x40 > [ 174.541898] ret_from_fork+0x21/0x2c > [ 174.548635] switch: port 2(vif3.0) entered disabled state > > I believe calling vfree() when holding a spin_lock isn't a good idea. > Use vfree_atomic instead? > Boris, this is the dumpdata failure: > FAILURE 4.10.0upstream-11073-g4977ab6-dirty(x86_64) > 4.10.0upstream-11073-g4977ab6-dirty(i386)\: 2017-03-02 (tst007) > > > Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 5/5] xen: use libxendevicemodel when available
On Thu, Mar 02, 2017 at 11:09:46AM +, Paul Durrant wrote: > This patch modifies the wrapper functions in xen_common.h to use the > new xendevicemodel interface if it is available along with compatibility > code to use the old libxenctrl interface if it is not. > > Signed-off-by: Paul Durrant Reviewed-by: Anthony PERARD Thanks, -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH v1 08/21] ARM: NUMA: Parse NUMA distance information
On Wed, Feb 22, 2017 at 5:14 PM, Julien Grall wrote: > Hello Vijay, > > > On 22/02/17 11:38, Vijay Kilari wrote: >> >> On Mon, Feb 20, 2017 at 11:58 PM, Julien Grall >> wrote: >>> >>> Hello Vijay, >>> >>> On 09/02/17 15:57, vijay.kil...@gmail.com wrote: From: Vijaya Kumar K Parse distance-matrix and fetch node distance information. Store distance information in node_distance[]. Signed-off-by: Vijaya Kumar K --- xen/arch/arm/dt_numa.c | 90 ++ xen/arch/arm/numa.c| 19 +- xen/include/asm-arm/numa.h | 1 + 3 files changed, 109 insertions(+), 1 deletion(-) diff --git a/xen/arch/arm/dt_numa.c b/xen/arch/arm/dt_numa.c index fce9e67..8979612 100644 --- a/xen/arch/arm/dt_numa.c +++ b/xen/arch/arm/dt_numa.c @@ -28,6 +28,19 @@ nodemask_t numa_nodes_parsed; extern struct node node_memblk_range[NR_NODE_MEMBLKS]; +extern int _node_distance[MAX_NUMNODES * 2]; +extern int *node_distance; >>> >>> >>> >>> I don't like this idea of having _node_distance and node_distance. >>> Looking >>> at the code, I see little point to do that. You could just initialize >>> node_distance with the correct value. >>> >>> Also the node distance can fit in u8, so you can save memory by using u8. >> >> >> u8 might restrict the distance value > > > The numa distance function returns an u8 and the common code rely on u8. So > IHMO it is fine to restrict to u8. > > If you want to keep u8 then please fix the rest of the code. > > [...] > +numa_set_distance(nodea, nodeb, distance); >>> >>> >>> >>> What if numa_set_distance failed? >> >> >> IMO, no need to fail numa. Should be set to default values for >> node_distance[]. > > > If you look at your implementation of numa_set_distance it returns an error > if the nodea, nodeb are too big. So you should really check the > return an report an error. Because the DT is buggy. ok > > [...] > > >>> >>> +return dt_numa_parse_distance_map(fdt, node, name, address_cells, + size_cells); + +return 0; +} + int __init dt_numa_init(void) { int ret; @@ -149,6 +234,11 @@ int __init dt_numa_init(void) ret = device_tree_for_each_node((void *)device_tree_flattened, dt_numa_scan_memory_node, NULL); +if ( ret ) +return ret; + +ret = device_tree_for_each_node((void *)device_tree_flattened, +dt_numa_scan_distance_node, NULL); return ret; } diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c index 9a7f0bb..11d100b 100644 --- a/xen/arch/arm/numa.c +++ b/xen/arch/arm/numa.c @@ -22,14 +22,31 @@ #include #include #include +#include >>> >>> >>> >>> Why did you add this include. I don't see any errno here. >>> + +int _node_distance[MAX_NUMNODES * 2]; +int *node_distance; + +u8 __node_distance(nodeid_t a, nodeid_t b) +{ +if ( !node_distance ) +return a == b ? 10 : 20; >>> >>> >>> >>> Why does the 10/20 comes from? >> >> >> That is default distance value. > > > From where? Please give a link to the doc. 10/20 is used by x86 implementation as well. https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/topology.h?id=refs/tags/v4.10#n47 Also the default matrix is shown below Documentation/devicetree/bindings/numa.txt > > Regards, > > -- > Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] BUG due to "xen-netback: protect resource cleaning on XenBus disconnect"
On 02/03/17 13:06, Wei Liu wrote: > On Thu, Mar 02, 2017 at 12:56:20PM +0100, Juergen Gross wrote: >> With commits f16f1df65 and 9a6cdf52b we get in our Xen testing: >> >> [ 174.512861] switch: port 2(vif3.0) entered disabled state >> [ 174.522735] BUG: sleeping function called from invalid context at >> /home/build/linux-linus/mm/vmalloc.c:1441 >> [ 174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch >> [ 174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: GW >> 4.10.0upstream-11073-g4977ab6-dirty #1 >> [ 174.524819] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0 >> 03/14/2011 >> [ 174.525517] Call Trace: >> [ 174.526217] show_stack+0x23/0x60 >> [ 174.526899] dump_stack+0x5b/0x88 >> [ 174.527562] ___might_sleep+0xde/0x130 >> [ 174.528208] __might_sleep+0x35/0xa0 >> [ 174.528840] ? _raw_spin_unlock_irqrestore+0x13/0x20 >> [ 174.529463] ? __wake_up+0x40/0x50 >> [ 174.530089] remove_vm_area+0x20/0x90 >> [ 174.530724] __vunmap+0x1d/0xc0 >> [ 174.531346] ? delete_object_full+0x13/0x20 >> [ 174.531973] vfree+0x40/0x80 >> [ 174.532594] set_backend_state+0x18a/0xa90 >> [ 174.533221] ? dwc_scan_descriptors+0x24d/0x430 >> [ 174.533850] ? kfree+0x5b/0xc0 >> [ 174.534476] ? xenbus_read+0x3d/0x50 >> [ 174.535101] ? xenbus_read+0x3d/0x50 >> [ 174.535718] ? xenbus_gather+0x31/0x90 >> [ 174.536332] ? ___might_sleep+0xf6/0x130 >> [ 174.536945] frontend_changed+0x6b/0xd0 >> [ 174.537565] xenbus_otherend_changed+0x7d/0x80 >> [ 174.538185] frontend_changed+0x12/0x20 >> [ 174.538803] xenwatch_thread+0x74/0x110 >> [ 174.539417] ? woken_wake_function+0x20/0x20 >> [ 174.540049] kthread+0xe5/0x120 >> [ 174.540663] ? xenbus_printf+0x50/0x50 >> [ 174.541278] ? __kthread_init_worker+0x40/0x40 >> [ 174.541898] ret_from_fork+0x21/0x2c >> [ 174.548635] switch: port 2(vif3.0) entered disabled state >> >> I believe calling vfree() when holding a spin_lock isn't a good idea. >> > > Use vfree_atomic instead? Hmm, isn't this overkill here? You can just set a local variable with the address and do vfree() after releasing the lock. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH v1 08/21] ARM: NUMA: Parse NUMA distance information
Hello Vijay, On 02/03/17 12:10, Vijay Kilari wrote: On Wed, Feb 22, 2017 at 5:14 PM, Julien Grall wrote: On 22/02/17 11:38, Vijay Kilari wrote: [...] That is default distance value. From where? Please give a link to the doc. 10/20 is used by x86 implementation as well. https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/topology.h?id=refs/tags/v4.10#n47 Then please introduce LOCAL_DISTANCE and REMOTE_DISTANCE to avoid hardcoded value. Also the default matrix is shown below Documentation/devicetree/bindings/numa.txt Looking at the binding, the show an example not the default value. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] BUG due to "xen-netback: protect resource cleaning on XenBus disconnect"
> -Original Message- > From: Juergen Gross [mailto:jgr...@suse.com] > Sent: 02 March 2017 12:13 > To: Wei Liu > Cc: Igor Druzhinin ; xen-devel de...@lists.xenproject.org>; Linux Kernel Mailing List ker...@vger.kernel.org>; net...@vger.kernel.org; Boris Ostrovsky > ; David Miller ; Paul > Durrant > Subject: Re: BUG due to "xen-netback: protect resource cleaning on XenBus > disconnect" > > On 02/03/17 13:06, Wei Liu wrote: > > On Thu, Mar 02, 2017 at 12:56:20PM +0100, Juergen Gross wrote: > >> With commits f16f1df65 and 9a6cdf52b we get in our Xen testing: > >> > >> [ 174.512861] switch: port 2(vif3.0) entered disabled state > >> [ 174.522735] BUG: sleeping function called from invalid context at > >> /home/build/linux-linus/mm/vmalloc.c:1441 > >> [ 174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch > >> [ 174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: GW > >> 4.10.0upstream-11073-g4977ab6-dirty #1 > >> [ 174.524819] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS > V17.0 > >> 03/14/2011 > >> [ 174.525517] Call Trace: > >> [ 174.526217] show_stack+0x23/0x60 > >> [ 174.526899] dump_stack+0x5b/0x88 > >> [ 174.527562] ___might_sleep+0xde/0x130 > >> [ 174.528208] __might_sleep+0x35/0xa0 > >> [ 174.528840] ? _raw_spin_unlock_irqrestore+0x13/0x20 > >> [ 174.529463] ? __wake_up+0x40/0x50 > >> [ 174.530089] remove_vm_area+0x20/0x90 > >> [ 174.530724] __vunmap+0x1d/0xc0 > >> [ 174.531346] ? delete_object_full+0x13/0x20 > >> [ 174.531973] vfree+0x40/0x80 > >> [ 174.532594] set_backend_state+0x18a/0xa90 > >> [ 174.533221] ? dwc_scan_descriptors+0x24d/0x430 > >> [ 174.533850] ? kfree+0x5b/0xc0 > >> [ 174.534476] ? xenbus_read+0x3d/0x50 > >> [ 174.535101] ? xenbus_read+0x3d/0x50 > >> [ 174.535718] ? xenbus_gather+0x31/0x90 > >> [ 174.536332] ? ___might_sleep+0xf6/0x130 > >> [ 174.536945] frontend_changed+0x6b/0xd0 > >> [ 174.537565] xenbus_otherend_changed+0x7d/0x80 > >> [ 174.538185] frontend_changed+0x12/0x20 > >> [ 174.538803] xenwatch_thread+0x74/0x110 > >> [ 174.539417] ? woken_wake_function+0x20/0x20 > >> [ 174.540049] kthread+0xe5/0x120 > >> [ 174.540663] ? xenbus_printf+0x50/0x50 > >> [ 174.541278] ? __kthread_init_worker+0x40/0x40 > >> [ 174.541898] ret_from_fork+0x21/0x2c > >> [ 174.548635] switch: port 2(vif3.0) entered disabled state > >> > >> I believe calling vfree() when holding a spin_lock isn't a good idea. > >> > > > > Use vfree_atomic instead? > > Hmm, isn't this overkill here? > > You can just set a local variable with the address and do vfree() after > releasing the lock. > Yep, that's what I was thinking. Patch coming shortly. Paul > > Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/7] x86/pagewalk: Helpers for reserved bit handling
On 01/03/17 15:57, Jan Beulich wrote: On 27.02.17 at 15:03, wrote: >> -static inline int >> -guest_supports_1G_superpages(struct vcpu *v) >> +static inline bool guest_has_pse36(const struct vcpu *v) >> +{ >> + /* No support for 2-level PV guests. */ >> +return is_pv_vcpu(v) ? 0 : paging_mode_hap(v->domain); > Considering the return type ITYM "false" instead of "0" here. Yes, sorry. (This series was what caused me to introduce bool to the hypervisor in the first place, and I clearly haven't rebased it forwards properly.) > But then why use a conditional expression here at all? > > return !is_pv_vcpu(v) && paging_mode_hap(v->domain); Can do. > Furthermore there's no point in passing in a struct vcpu here - > both checked constructs are per-domain, and passing in > struct domain is also a better fit with the guest_ prefix of the > function name. At the moment, all the guest_* functions consistently take a vcpu. However, they are inconsistent with other terms, and some of the static-configuration properties probably should take a domain. I will split out this cleanup into an earlier patch, adjusting some naming and parameters. > >> +static inline bool guest_l2e_rsvd_bits(const struct vcpu *v, guest_l2e_t >> l2e) >> +{ >> +uint64_t rsvd_bits = guest_rsvd_bits(v); >> + >> +return ((l2e.l2 & (rsvd_bits | GUEST_L2_PAGETABLE_RSVD | >> + (guest_supports_superpages(v) ? 0 : _PAGE_PSE))) || >> +((l2e.l2 & _PAGE_PSE) && >> + (l2e.l2 & ((GUEST_PAGING_LEVELS == 2 && guest_has_pse36(v)) >> +? (fold_pse36(rsvd_bits | (1ULL << 40))) > While one can look it up in the doc, I strongly think this 40 needs a > comment. Thinking about it, having GUEST_L2_PSE36_RSVD is probably a good move. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH v1 07/21] ARM: NUMA: Parse memory NUMA information
On Mon, Feb 20, 2017 at 11:35 PM, Julien Grall wrote: > Hello Vijay, > > > On 09/02/17 15:56, vijay.kil...@gmail.com wrote: >> >> From: Vijaya Kumar K >> >> Parse memory node and fetch numa-node-id information. >> For each memory range, store in node_memblk_range[] >> along with node id. >> >> Signed-off-by: Vijaya Kumar K >> --- >> xen/arch/arm/bootfdt.c| 4 +-- >> xen/arch/arm/dt_numa.c| 84 >> ++- >> xen/common/numa.c | 8 + >> xen/include/xen/device_tree.h | 3 ++ >> xen/include/xen/numa.h| 1 + >> 5 files changed, 97 insertions(+), 3 deletions(-) >> >> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c >> index d1122d8..5e2df92 100644 >> --- a/xen/arch/arm/bootfdt.c >> +++ b/xen/arch/arm/bootfdt.c >> @@ -56,8 +56,8 @@ static bool_t __init device_tree_node_compatible(const >> void *fdt, int node, >> return 0; >> } >> >> -static void __init device_tree_get_reg(const __be32 **cell, u32 >> address_cells, >> - u32 size_cells, u64 *start, u64 >> *size) >> +void __init device_tree_get_reg(const __be32 **cell, u32 address_cells, >> +u32 size_cells, u64 *start, u64 *size) >> { >> *start = dt_next_cell(address_cells, cell); >> *size = dt_next_cell(size_cells, cell); >> diff --git a/xen/arch/arm/dt_numa.c b/xen/arch/arm/dt_numa.c >> index 4b94c36..fce9e67 100644 >> --- a/xen/arch/arm/dt_numa.c >> +++ b/xen/arch/arm/dt_numa.c >> @@ -27,6 +27,7 @@ >> #include >> >> nodemask_t numa_nodes_parsed; >> +extern struct node node_memblk_range[NR_NODE_MEMBLKS]; > > > This should have been declared in an header (likely in patch #3). > >> >> /* >> * Even though we connect cpus to numa domains later in SMP >> @@ -48,11 +49,73 @@ static int __init dt_numa_process_cpu_node(const void >> *fdt, int node, >> return 0; >> } >> >> +static int __init dt_numa_process_memory_node(const void *fdt, int node, >> + const char *name, >> + u32 address_cells, >> + u32 size_cells) > > > Rather than reimplementing the wheel, it might be better to hook the parsing > in bootfdt.c. This would avoid an extra parsing of the device-tree, > duplicate the code and expose primitive for early DT parsing. The process_memory_node() is called only if EFI_BOOT is not enabled. So hooking might not work. > > If parsing NUMA information can be done after the DT has been unflattened, > this will be even better. > >> +{ >> +const struct fdt_property *prop; >> +int i, ret, banks; > > > Both banks and i should be unsigned. > >> +const __be32 *cell; >> +paddr_t start, size; >> +u32 reg_cells = address_cells + size_cells; >> +u32 nid; >> + >> +if ( address_cells < 1 || size_cells < 1 ) >> +{ >> +printk(XENLOG_WARNING >> + "fdt: node `%s': invalid #address-cells or #size-cells", >> name); >> +return -EINVAL; >> +} >> + >> +nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES); >> +if ( nid >= MAX_NUMNODES) { > > > Coding style > > if ( ... ) > { > >> +/* >> + * No node id found. Skip this memory node. >> + */ > > > This could be a single line: > > /* . */ > > So no warning if it is impossible to get the numa-node-id? Also, I don't > think this is right to boot using NUMA on platform having a buggy DT. So we > should probably return an error and disable NUMA. OK. > >> +return 0; >> +} >> + >> +prop = fdt_get_property(fdt, node, "reg", NULL); >> +if ( !prop ) >> +{ >> +printk(XENLOG_WARNING "fdt: node `%s': missing `reg' property\n", >> + name); >> +return -EINVAL; >> +} >> + >> +cell = (const __be32 *)prop->data; >> +banks = fdt32_to_cpu(prop->len) / (reg_cells * sizeof (u32)); >> + >> +for ( i = 0; i < banks; i++ ) >> +{ >> +device_tree_get_reg(&cell, address_cells, size_cells, &start, >> &size); >> +if ( !size ) >> +continue; >> + >> +/* It is fine to add this area to the nodes data it will be used >> later*/ >> +ret = conflicting_memblks(start, start + size); >> +if (ret < 0) >> + numa_add_memblk(nid, start, size); > > > numa_add_memblk rely on the caller to check whether the array is not full. I > think we should move the check in numa_add_memblk and return an error in > this case. OK > >> +else >> +{ >> + printk(XENLOG_ERR >> +"NUMA DT: node %u (%"PRIx64"-%"PRIx64") overlaps with >> ret %d (%"PRIx64"-%"PRIx64")\n", >> +nid, start, start + size, ret, >> +node_memblk_range[i].start, >> node_memblk_range[i].end); > > > i != ret. Please use the correct variable accordingly. However, I don't > think the o
Re: [Xen-devel] [PATCH 5/7] x86/shadow: Use the pagewalk reserved bits helpers
On 01/03/17 16:03, Jan Beulich wrote: On 27.02.17 at 15:03, wrote: >> The shadow logic should never create a shadow of a guest PTE which contains >> reserved bits from the guests point of view. Such a shadowed entry might not >> cause #PF[RSVD] when walked by hardware, thus won't behave architecturally >> from the guests point of view. > But are we already or-ing in the RSVD bit accordingly in such cases, > before handing the #PF to the guest? The patch here certainly > doesn't make any change towards that, afaics. The purpose of this patch is to ensure we never create a shadow which risks causing hardware to generate #PF[RSVD] when running on the shadows, other than the one deliberate case (MMIO fastpath). ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/4] tools: add pkg-config file for libxc
On Thu, Mar 02, 2017 at 06:13:16AM +0100, Juergen Gross wrote: > When configuring the build of qemu the configure script is building > various test programs to determine the exact version of libxencontrol. > > Instead of a try and error approach needing updates for nearly each > new version of Xen just provide xencontrol.pc to be used via > pkg-config. > > In the end we need two different variants of that file: one for the > target system where eventually someone wants to build qemu, and one > for the local system to be used for building qemu as part of the Xen > build process. > > The local variant is created in a dedicated directory in order to be > able to collect more pkg-config files used for building tools there. > > Signed-off-by: Juergen Gross Please consider writing a follow-up patch to at least move libxenlight.pc and libxlutil.pc there, too. :-) Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH v1 12/21] ARM: NUMA: Do not expose numa info to DOM0
On Tue, Feb 21, 2017 at 12:06 AM, Julien Grall wrote: > Hello Vijay, > > On 09/02/17 15:57, vijay.kil...@gmail.com wrote: >> >> From: Vijaya Kumar K >> >> Delete numa-node-id and distance map from Dom0 DT >> so that NUMA information is not exposed to Dom0. >> >> This helps particularly to boot Node 1 devices >> as if booting on Node0. >> >> However this approach has limitation where memory allocation >> for the devices should be local. > > > We had a discussion about this few weeks ago but you never answered back... > (see [1]). OK. I will reply to [1]. > > If there is an issue, please provides input with examples and what will > happen. > > >> >> Signed-off-by: Vijaya Kumar >> --- >> xen/arch/arm/domain_build.c | 9 + >> 1 file changed, 9 insertions(+) >> >> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c >> index c97a1f5..5e89eaa 100644 >> --- a/xen/arch/arm/domain_build.c >> +++ b/xen/arch/arm/domain_build.c >> @@ -424,6 +424,10 @@ static int write_properties(struct domain *d, struct >> kernel_info *kinfo, >> } >> } >> >> +/* Don't expose the property numa to the guest */ >> +if ( dt_property_name_is_equal(prop, "numa-node-id") ) >> +continue; >> + >> /* Don't expose the property "xen,passthrough" to the guest */ >> if ( dt_property_name_is_equal(prop, "xen,passthrough") ) >> continue; >> @@ -1176,6 +1180,11 @@ static int handle_node(struct domain *d, struct >> kernel_info *kinfo, >> DT_MATCH_TYPE("memory"), >> /* The memory mapped timer is not supported by Xen. */ >> DT_MATCH_COMPATIBLE("arm,armv7-timer-mem"), >> +/* >> + * NUMA info is not exposed to Dom0. >> + * So, skip distance-map infomation >> + */ >> +DT_MATCH_COMPATIBLE("numa-distance-map-v1"), >> { /* sentinel */ }, >> }; >> static const struct dt_device_match timer_matches[] __initconst = >> > > Regards, > > [1] > https://lists.xenproject.org/archives/html/xen-devel/2016-12/msg02073.html > > -- > Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [DO NOT APPLY PATCH XTF 0/2] UMIP test case
Wah. Thank you, Andrew & Wei. :-) On 3/2/2017 5:05 PM, Andrew Cooper wrote: On 02/03/2017 08:42, Wei Liu wrote: I wrote this long time ago before UMIP was merged. Yu, since you asked, I might as well post it for your reference on how to do it with XTF. This series is not yet tested in any way. Unfortunately, you execute all of the sensitive instructions in kernel mode, where they wouldn't fault even with UMIP active. For full testing of a feature like this, the test should include a check that the ability to modify CR4.UMIP depends strictly on the visibility of the feature, that uses in the kernel still continue to work, even when active, and that behaviour returns back to normal after the feature has been deactivated. So, before cr4 is written, a cpuid is needed first in the test code, right? But besides the emulation of cpuid and cr4, my understanding is that hypervisor need to inject a GP fault if an instruction causes a VM exit, right? This should be different than the normal GP fault inside a VM, and how do we test this code path? Our QA and I may need to have a study of the XTF, and may probably seek for your help in the future. :-) Anyway, thanks a lot! Yu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/4] tools: add pkg-config file for libxc
On 02/03/17 13:29, Wei Liu wrote: > On Thu, Mar 02, 2017 at 06:13:16AM +0100, Juergen Gross wrote: >> When configuring the build of qemu the configure script is building >> various test programs to determine the exact version of libxencontrol. >> >> Instead of a try and error approach needing updates for nearly each >> new version of Xen just provide xencontrol.pc to be used via >> pkg-config. >> >> In the end we need two different variants of that file: one for the >> target system where eventually someone wants to build qemu, and one >> for the local system to be used for building qemu as part of the Xen >> build process. >> >> The local variant is created in a dedicated directory in order to be >> able to collect more pkg-config files used for building tools there. >> >> Signed-off-by: Juergen Gross > > Please consider writing a follow-up patch to at least move > libxenlight.pc and libxlutil.pc there, too. :-) I'm planning to do this and more: - add *.pc for all other Xen libraries - add the library dependencies to the *.pc files - use the pkg-config files for the Xen build process in order to get rid of the lib specific variables in tools/Rules.mk When this has been done qemu can be switched to use pkg-config files from Xen 4.9 on. After this (when Xen is using the enhanced qemu version) we can drop passing the CFLAGS and LDFLAGS to qemu's configure. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/4] tools: add pkg-config file for libxc
On Thu, Mar 02, 2017 at 01:35:54PM +0100, Juergen Gross wrote: > On 02/03/17 13:29, Wei Liu wrote: > > On Thu, Mar 02, 2017 at 06:13:16AM +0100, Juergen Gross wrote: > >> When configuring the build of qemu the configure script is building > >> various test programs to determine the exact version of libxencontrol. > >> > >> Instead of a try and error approach needing updates for nearly each > >> new version of Xen just provide xencontrol.pc to be used via > >> pkg-config. > >> > >> In the end we need two different variants of that file: one for the > >> target system where eventually someone wants to build qemu, and one > >> for the local system to be used for building qemu as part of the Xen > >> build process. > >> > >> The local variant is created in a dedicated directory in order to be > >> able to collect more pkg-config files used for building tools there. > >> > >> Signed-off-by: Juergen Gross > > > > Please consider writing a follow-up patch to at least move > > libxenlight.pc and libxlutil.pc there, too. :-) > > I'm planning to do this and more: > > - add *.pc for all other Xen libraries > - add the library dependencies to the *.pc files > - use the pkg-config files for the Xen build process in order to get > rid of the lib specific variables in tools/Rules.mk > > When this has been done qemu can be switched to use pkg-config files > from Xen 4.9 on. After this (when Xen is using the enhanced qemu > version) we can drop passing the CFLAGS and LDFLAGS to qemu's > configure. > Great! Looking forward to more patches! > > Juergen > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [DO NOT APPLY PATCH XTF 0/2] UMIP test case
On 02/03/17 12:30, Yu Zhang wrote: > Wah. Thank you, Andrew & Wei. :-) > > On 3/2/2017 5:05 PM, Andrew Cooper wrote: >> On 02/03/2017 08:42, Wei Liu wrote: >>> I wrote this long time ago before UMIP was merged. >>> >>> Yu, since you asked, I might as well post it for your reference on >>> how to >>> do it with XTF. >>> >>> This series is not yet tested in any way. >> Unfortunately, you execute all of the sensitive instructions in kernel >> mode, where they wouldn't fault even with UMIP active. >> >> For full testing of a feature like this, the test should include a check >> that the ability to modify CR4.UMIP depends strictly on the visibility >> of the feature, that uses in the kernel still continue to work, even >> when active, and that behaviour returns back to normal after the feature >> has been deactivated. > > So, before cr4 is written, a cpuid is needed first in the test code, > right? Yes, but that is performed automatically by the framework for convenience. The predicate cpu_has_umip will work fine for your test. > > But besides the emulation of cpuid and cr4, my understanding is that > hypervisor need to inject a GP fault if an instruction causes a VM > exit, right? This should be different than the normal GP fault inside > a VM, and how do we test this code path? Using the Force Emulation Prefix. This breaks out to the hypervisor and runs the instruction through the emulator. You should test both that the native instructions behave as expected, and that once forced, the instructions also behave sensibly. The FEP test is to cover alternative cases with would require emulation normally (e.g. memory operand hitting an MMIO region, or the page marked NX in EPT because of VM instrospection being active). > > Our QA and I may need to have a study of the XTF, and may probably > seek for your help in the future. :-) > Anyway, thanks a lot! Haozhong (CC'd) has previously used XTF for a load of nested-virt testing, if you'd like an alternative opinion, (although I do hope you find XTF easy to use and useful). ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] arm64: Approach for DT based NUMA and issues
On Fri, Dec 16, 2016 at 3:48 PM, Dario Faggioli wrote: > On Fri, 2016-12-16 at 09:40 +, Julien Grall wrote: >> Hi Vijay, >> On 16/12/2016 07:39, Vijay Kilari wrote: >> > If we drop numa-node-id from memory node generated to dom0, then >> > dom0 will >> > assume all the memory is from node0. So eventually node1 device >> > intialization fails. >> >> I suggested to drop the property numa-node-id from every node (not >> only >> memory one). So DOM0 will think it is running a non-NUMA platform. >> >> From my knowledge this is working on x86, and I don't understand >> why >> this would be an issue on ARM. If you think the device may not work, >> please explain why. >> > Yes, I confirm that what you said works and any x86 NUMA system I've > seen. > > AFAIUI, since Vijay is talking about "devices", the x86 equivalent of > what would be an "IONUMA system", i.e. a platform where I/O devices are > physically attached to more than just one I/O hub, which in turn are > attached to different nodes. > I don't have first hand experience with these systems on the x86 world, > but I'm quite sure they also function with the configuration Julien is > suggesting to use. > > Boris did some work to _improve_ the situation (namely, to make it > possible for *Xen* to report to the toolstack, to which NUMA node a > specific device is attached to). But: > - things were working already before this > - that does involve Xen and toolstack, while Dom0 remains totally >NUMA _unaware_. > > And I indeed think that doing what Julien says (i.e., keep dom0 NUMA- > ignorant) is, if possible, best as a first step. Sorry for late reply. I want to confirm only after checking if this approach works. Yes for first step implementation this fine. For now, our platform does not have any such restrictions that memory for IO device should be always local. Also, the issue is seen with SMMU, which is going to be hidden from Dom0. I have mentioned this restriction in my NUMA RFC patch commit. > > Regards, > Dario > -- > <> (Raistlin Majere) > - > Dario Faggioli, Ph.D, http://about.me/dario.faggioli > Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable-smoke test] 106366: tolerable trouble: broken/fail/pass - PUSHED
flight 106366 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/106366/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a build-arm64 5 xen-buildfail never pass build-arm64-pvops 5 kernel-build fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass version targeted for testing: xen f14ce1a13455bfc3fb7b33c185e3e49749d68e28 baseline version: xen 1a0ab02e342cfd80decd72606c94479e4b309a3c Last test of basis 106320 2017-03-01 20:01:41 Z0 days Testing same since 106361 2017-03-02 09:01:12 Z0 days2 attempts People who touched revisions under test: Marek Marczykowski-Górecki Olaf Hering Wei Liu jobs: build-amd64 pass build-arm64 fail build-armhf pass build-amd64-libvirt pass build-arm64-pvopsfail test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm broken test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=xen-unstable-smoke + revision=f14ce1a13455bfc3fb7b33c185e3e49749d68e28 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke f14ce1a13455bfc3fb7b33c185e3e49749d68e28 + branch=xen-unstable-smoke + revision=f14ce1a13455bfc3fb7b33c185e3e49749d68e28 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=xen + xenbranch=xen-unstable-smoke + qemuubranch=qemu-upstream-unstable + '[' xxen = xlinux ']' + linuxbranch= + '[' xqemu-upstream-unstable = x ']' + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable-smoke + prevxenbranch=xen-4.8-testing + '[' xf14ce1a13455bfc3fb7b33c185e3e49749d68e28 = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstes
Re: [Xen-devel] [PATCH 5/7] x86/shadow: Use the pagewalk reserved bits helpers
>>> On 02.03.17 at 13:26, wrote: > On 01/03/17 16:03, Jan Beulich wrote: > On 27.02.17 at 15:03, wrote: >>> The shadow logic should never create a shadow of a guest PTE which contains >>> reserved bits from the guests point of view. Such a shadowed entry might >>> not >>> cause #PF[RSVD] when walked by hardware, thus won't behave architecturally >>> from the guests point of view. >> But are we already or-ing in the RSVD bit accordingly in such cases, >> before handing the #PF to the guest? The patch here certainly >> doesn't make any change towards that, afaics. > > The purpose of this patch is to ensure we never create a shadow which > risks causing hardware to generate #PF[RSVD] when running on the > shadows, other than the one deliberate case (MMIO fastpath). Right, but instead of answering my question this emphasizes the need for an answer, as what you say basically means we'd never (except for that one special case) see the RSVD bit set when getting #PF handed by hardware, yet for forwarding to the guest we need to set that bit then in such cases. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] arm64: Approach for DT based NUMA and issues
Hello Vijay, On 02/03/17 12:39, Vijay Kilari wrote: On Fri, Dec 16, 2016 at 3:48 PM, Dario Faggioli wrote: On Fri, 2016-12-16 at 09:40 +, Julien Grall wrote: Hi Vijay, On 16/12/2016 07:39, Vijay Kilari wrote: If we drop numa-node-id from memory node generated to dom0, then dom0 will assume all the memory is from node0. So eventually node1 device intialization fails. I suggested to drop the property numa-node-id from every node (not only memory one). So DOM0 will think it is running a non-NUMA platform. From my knowledge this is working on x86, and I don't understand why this would be an issue on ARM. If you think the device may not work, please explain why. Yes, I confirm that what you said works and any x86 NUMA system I've seen. AFAIUI, since Vijay is talking about "devices", the x86 equivalent of what would be an "IONUMA system", i.e. a platform where I/O devices are physically attached to more than just one I/O hub, which in turn are attached to different nodes. I don't have first hand experience with these systems on the x86 world, but I'm quite sure they also function with the configuration Julien is suggesting to use. Boris did some work to _improve_ the situation (namely, to make it possible for *Xen* to report to the toolstack, to which NUMA node a specific device is attached to). But: - things were working already before this - that does involve Xen and toolstack, while Dom0 remains totally NUMA _unaware_. And I indeed think that doing what Julien says (i.e., keep dom0 NUMA- ignorant) is, if possible, best as a first step. Sorry for late reply. I want to confirm only after checking if this approach works. Yes for first step implementation this fine. For now, our platform does not have any such restrictions that memory for IO device should be always local. Also, the issue is seen with SMMU, which is going to be hidden from Dom0. I have mentioned this restriction in my NUMA RFC patch commit. Can you detail the restrictions with SMMU? Which memory should be allocated to the correct node? Stage-2 page tables? Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 7/7] x86/pagewalk: Re-implement the pagetable walker
>>> On 02.03.17 at 13:00, wrote: > On 02/03/17 11:52, Jan Beulich wrote: > On 27.02.17 at 15:03, wrote: >>> @@ -91,12 +91,12 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)( >>> #if GUEST_PAGING_LEVELS == 3 >>> top_map += (cr3 & ~(PAGE_MASK | 31)); >>> #endif >>> -missing = guest_walk_tables(v, p2m, ga, &gw, pfec[0], top_mfn, >>> top_map); >>> +walk_ok = guest_walk_tables(v, p2m, ga, &gw, pfec[0], top_mfn, >>> top_map); >> Since afaict pfec doesn't really point to an array, could I talk you into >> adjusting the uses you touch anyway to become *pfec? > > I did find the use of an array annoying, but left it as-was. > > Any objection if I do a general cleanup of that as a prerequisite patch, > to avoid introducing semi-unrelated changes into this patch? Certainly not. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH net 2/2] xen-netback: don't vfree() queues under spinlock
This leads to a BUG of the following form: [ 174.512861] switch: port 2(vif3.0) entered disabled state [ 174.522735] BUG: sleeping function called from invalid context at /home/build/linux-linus/mm/vmalloc.c:1441 [ 174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch [ 174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: GW 4.10.0upstream-11073-g4977ab6-dirty #1 [ 174.524819] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0 03/14/2011 [ 174.525517] Call Trace: [ 174.526217] show_stack+0x23/0x60 [ 174.526899] dump_stack+0x5b/0x88 [ 174.527562] ___might_sleep+0xde/0x130 [ 174.528208] __might_sleep+0x35/0xa0 [ 174.528840] ? _raw_spin_unlock_irqrestore+0x13/0x20 [ 174.529463] ? __wake_up+0x40/0x50 [ 174.530089] remove_vm_area+0x20/0x90 [ 174.530724] __vunmap+0x1d/0xc0 [ 174.531346] ? delete_object_full+0x13/0x20 [ 174.531973] vfree+0x40/0x80 [ 174.532594] set_backend_state+0x18a/0xa90 [ 174.533221] ? dwc_scan_descriptors+0x24d/0x430 [ 174.533850] ? kfree+0x5b/0xc0 [ 174.534476] ? xenbus_read+0x3d/0x50 [ 174.535101] ? xenbus_read+0x3d/0x50 [ 174.535718] ? xenbus_gather+0x31/0x90 [ 174.536332] ? ___might_sleep+0xf6/0x130 [ 174.536945] frontend_changed+0x6b/0xd0 [ 174.537565] xenbus_otherend_changed+0x7d/0x80 [ 174.538185] frontend_changed+0x12/0x20 [ 174.538803] xenwatch_thread+0x74/0x110 [ 174.539417] ? woken_wake_function+0x20/0x20 [ 174.540049] kthread+0xe5/0x120 [ 174.540663] ? xenbus_printf+0x50/0x50 [ 174.541278] ? __kthread_init_worker+0x40/0x40 [ 174.541898] ret_from_fork+0x21/0x2c [ 174.548635] switch: port 2(vif3.0) entered disabled state This patch defers the vfree() until after the spinlock is released. Reported-by: Juergen Gross Signed-off-by: Paul Durrant --- Cc: Juergen Gross Cc: Wei Liu --- drivers/net/xen-netback/xenbus.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index d82ddc9..d2d7cd9 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -496,6 +496,7 @@ static void backend_disconnect(struct backend_info *be) if (vif) { unsigned int queue_index; + struct xenvif_queue *queues; xen_unregister_watchers(vif); #ifdef CONFIG_DEBUG_FS @@ -508,11 +509,13 @@ static void backend_disconnect(struct backend_info *be) xenvif_deinit_queue(&vif->queues[queue_index]); spin_lock(&vif->lock); - vfree(vif->queues); + queues = vif->queues; vif->num_queues = 0; vif->queues = NULL; spin_unlock(&vif->lock); + vfree(queues); + xenvif_disconnect_ctrl(vif); } } -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH net 1/2] xen-netback: keep a local pointer for vif in backend_disconnect()
This patch replaces use of 'be->vif' with 'vif' and hence generally makes the function look tidier. No semantic change. Signed-off-by: Paul Durrant --- Cc: Wei Liu --- drivers/net/xen-netback/xenbus.c | 32 ++-- 1 file changed, 18 insertions(+), 14 deletions(-) diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index bb854f9..d82ddc9 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -492,24 +492,28 @@ static int backend_create_xenvif(struct backend_info *be) static void backend_disconnect(struct backend_info *be) { - if (be->vif) { + struct xenvif *vif = be->vif; + + if (vif) { unsigned int queue_index; - xen_unregister_watchers(be->vif); + xen_unregister_watchers(vif); #ifdef CONFIG_DEBUG_FS - xenvif_debugfs_delif(be->vif); + xenvif_debugfs_delif(vif); #endif /* CONFIG_DEBUG_FS */ - xenvif_disconnect_data(be->vif); - for (queue_index = 0; queue_index < be->vif->num_queues; ++queue_index) - xenvif_deinit_queue(&be->vif->queues[queue_index]); - - spin_lock(&be->vif->lock); - vfree(be->vif->queues); - be->vif->num_queues = 0; - be->vif->queues = NULL; - spin_unlock(&be->vif->lock); - - xenvif_disconnect_ctrl(be->vif); + xenvif_disconnect_data(vif); + for (queue_index = 0; +queue_index < vif->num_queues; +++queue_index) + xenvif_deinit_queue(&vif->queues[queue_index]); + + spin_lock(&vif->lock); + vfree(vif->queues); + vif->num_queues = 0; + vif->queues = NULL; + spin_unlock(&vif->lock); + + xenvif_disconnect_ctrl(vif); } } -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH net 0/2] xen-netback: update memory leak fix to avoid BUG
Commit 9a6cdf52b85e "xen-netback: fix memory leaks on XenBus disconnect" added missing code to fix a memory leak by calling vfree() in the appropriate place. Unfortunately subsequent commit f16f1df65f1c "xen-netback: protect resource cleaning on XenBus disconnect" then wrapped this call to vfree() in a spin lock, leading to a BUG due to incorrect context. Patch #1 makes the existing code more readable Patch #2 fixes the problem Paul Durrant (2): xen-netback: keep a local pointer for vif in backend_disconnect() xen-netback: don't vfree() queues under spinlock drivers/net/xen-netback/xenbus.c | 31 +++ 1 file changed, 19 insertions(+), 12 deletions(-) -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/7] x86/shadow: Use the pagewalk reserved bits helpers
On 02/03/17 12:51, Jan Beulich wrote: On 02.03.17 at 13:26, wrote: >> On 01/03/17 16:03, Jan Beulich wrote: >> On 27.02.17 at 15:03, wrote: The shadow logic should never create a shadow of a guest PTE which contains reserved bits from the guests point of view. Such a shadowed entry might not cause #PF[RSVD] when walked by hardware, thus won't behave architecturally from the guests point of view. >>> But are we already or-ing in the RSVD bit accordingly in such cases, >>> before handing the #PF to the guest? The patch here certainly >>> doesn't make any change towards that, afaics. >> The purpose of this patch is to ensure we never create a shadow which >> risks causing hardware to generate #PF[RSVD] when running on the >> shadows, other than the one deliberate case (MMIO fastpath). > Right, but instead of answering my question this emphasizes the > need for an answer, as what you say basically means we'd never > (except for that one special case) see the RSVD bit set when > getting #PF handed by hardware, yet for forwarding to the guest > we need to set that bit then in such cases. This is intentional. We hand #PF[RSVD] back to the guest based on walking the guest pagetables, rather than what we find from hardware walking the shadows we create. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 0/5] xl cleanup and docs
Wei Liu writes ("[PATCH 0/5] xl cleanup and docs"): > Wei Liu (5): > CONTRIBUTING: list xl in inbound license section > xl: add CODING_STYLE > xl: remove declaration of ctx in c files > xl: lift common_domname declaration to xl.h > xl: lift logfile declaration to xl.h All five Acked-by: Ian Jackson ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH] x86/SVM: correct boot time cpu_data[] handling
start_svm() already runs after cpu_data[] was set up, so it shouldn't modify it anymore (at least not directly). Constify the involved pointers. Furthermore LMSLE feature detection was broken by 566ddbe833 ("x86: Fail CPU bringup cleanly if it cannot initialise HVM"), as Andrew Cooper has pointed out: c couldn't possibly equal &boot_cpu_data anymore. Signed-off-by: Jan Beulich --- a/xen/arch/x86/hvm/svm/asid.c +++ b/xen/arch/x86/hvm/svm/asid.c @@ -22,7 +22,7 @@ #include #include -void svm_asid_init(struct cpuinfo_x86 *c) +void svm_asid_init(const struct cpuinfo_x86 *c) { int nasids = 0; --- a/xen/arch/x86/hvm/svm/svm.c +++ b/xen/arch/x86/hvm/svm/svm.c @@ -1330,7 +1330,7 @@ static int svm_cpu_up_prepare(unsigned i return 0; } -static void svm_init_erratum_383(struct cpuinfo_x86 *c) +static void svm_init_erratum_383(const struct cpuinfo_x86 *c) { uint64_t msr_content; @@ -1365,11 +1365,12 @@ static int svm_handle_osvw(struct vcpu * return 0; } -static int svm_cpu_up(void) +static int _svm_cpu_up(bool bsp) { uint64_t msr_content; -int rc, cpu = smp_processor_id(); -struct cpuinfo_x86 *c = &cpu_data[cpu]; +int rc; +unsigned int cpu = smp_processor_id(); +const struct cpuinfo_x86 *c = &cpu_data[cpu]; /* Check whether SVM feature is disabled in BIOS */ rdmsrl(MSR_K8_VM_CR, msr_content); @@ -1402,7 +1403,7 @@ static int svm_cpu_up(void) rdmsrl(MSR_EFER, msr_content); if ( msr_content & EFER_LMSLE ) { -if ( c == &boot_cpu_data ) +if ( bsp ) cpu_has_lmsl = 1; wrmsrl(MSR_EFER, msr_content ^ EFER_LMSLE); } @@ -1419,13 +1420,18 @@ static int svm_cpu_up(void) return 0; } +static int svm_cpu_up(void) +{ +return _svm_cpu_up(false); +} + const struct hvm_function_table * __init start_svm(void) { bool_t printed = 0; svm_host_osvw_reset(); -if ( svm_cpu_up() ) +if ( _svm_cpu_up(true) ) { printk("SVM: failed to initialise.\n"); return NULL; --- a/xen/include/asm-x86/hvm/svm/asid.h +++ b/xen/include/asm-x86/hvm/svm/asid.h @@ -22,7 +22,7 @@ #include #include -void svm_asid_init(struct cpuinfo_x86 *c); +void svm_asid_init(const struct cpuinfo_x86 *c); static inline void svm_asid_g_invlpg(struct vcpu *v, unsigned long g_vaddr) { x86/SVM: correct boot time cpu_data[] handling start_svm() already runs after cpu_data[] was set up, so it shouldn't modify it anymore (at least not directly). Constify the involved pointers. Furthermore LMSLE feature detection was broken by 566ddbe833 ("x86: Fail CPU bringup cleanly if it cannot initialise HVM"), as Andrew Cooper has pointed out: c couldn't possibly equal &boot_cpu_data anymore. Signed-off-by: Jan Beulich --- a/xen/arch/x86/hvm/svm/asid.c +++ b/xen/arch/x86/hvm/svm/asid.c @@ -22,7 +22,7 @@ #include #include -void svm_asid_init(struct cpuinfo_x86 *c) +void svm_asid_init(const struct cpuinfo_x86 *c) { int nasids = 0; --- a/xen/arch/x86/hvm/svm/svm.c +++ b/xen/arch/x86/hvm/svm/svm.c @@ -1330,7 +1330,7 @@ static int svm_cpu_up_prepare(unsigned i return 0; } -static void svm_init_erratum_383(struct cpuinfo_x86 *c) +static void svm_init_erratum_383(const struct cpuinfo_x86 *c) { uint64_t msr_content; @@ -1365,11 +1365,12 @@ static int svm_handle_osvw(struct vcpu * return 0; } -static int svm_cpu_up(void) +static int _svm_cpu_up(bool bsp) { uint64_t msr_content; -int rc, cpu = smp_processor_id(); -struct cpuinfo_x86 *c = &cpu_data[cpu]; +int rc; +unsigned int cpu = smp_processor_id(); +const struct cpuinfo_x86 *c = &cpu_data[cpu]; /* Check whether SVM feature is disabled in BIOS */ rdmsrl(MSR_K8_VM_CR, msr_content); @@ -1402,7 +1403,7 @@ static int svm_cpu_up(void) rdmsrl(MSR_EFER, msr_content); if ( msr_content & EFER_LMSLE ) { -if ( c == &boot_cpu_data ) +if ( bsp ) cpu_has_lmsl = 1; wrmsrl(MSR_EFER, msr_content ^ EFER_LMSLE); } @@ -1419,13 +1420,18 @@ static int svm_cpu_up(void) return 0; } +static int svm_cpu_up(void) +{ +return _svm_cpu_up(false); +} + const struct hvm_function_table * __init start_svm(void) { bool_t printed = 0; svm_host_osvw_reset(); -if ( svm_cpu_up() ) +if ( _svm_cpu_up(true) ) { printk("SVM: failed to initialise.\n"); return NULL; --- a/xen/include/asm-x86/hvm/svm/asid.h +++ b/xen/include/asm-x86/hvm/svm/asid.h @@ -22,7 +22,7 @@ #include #include -void svm_asid_init(struct cpuinfo_x86 *c); +void svm_asid_init(const struct cpuinfo_x86 *c); static inline void svm_asid_g_invlpg(struct vcpu *v, unsigned long g_vaddr) { ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/7] x86/shadow: Use the pagewalk reserved bits helpers
>>> On 02.03.17 at 13:56, wrote: > On 02/03/17 12:51, Jan Beulich wrote: > On 02.03.17 at 13:26, wrote: >>> On 01/03/17 16:03, Jan Beulich wrote: >>> On 27.02.17 at 15:03, wrote: > The shadow logic should never create a shadow of a guest PTE which > contains > reserved bits from the guests point of view. Such a shadowed entry might > not > cause #PF[RSVD] when walked by hardware, thus won't behave architecturally > from the guests point of view. But are we already or-ing in the RSVD bit accordingly in such cases, before handing the #PF to the guest? The patch here certainly doesn't make any change towards that, afaics. >>> The purpose of this patch is to ensure we never create a shadow which >>> risks causing hardware to generate #PF[RSVD] when running on the >>> shadows, other than the one deliberate case (MMIO fastpath). >> Right, but instead of answering my question this emphasizes the >> need for an answer, as what you say basically means we'd never >> (except for that one special case) see the RSVD bit set when >> getting #PF handed by hardware, yet for forwarding to the guest >> we need to set that bit then in such cases. > > This is intentional. > > We hand #PF[RSVD] back to the guest based on walking the guest > pagetables, rather than what we find from hardware walking the shadows > we create. Well, is that (always) the case here already, or only after patch 7? My question after all is whether this works as intended at this point. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH net 2/2] xen-netback: don't vfree() queues under spinlock
On 02/03/17 13:54, Paul Durrant wrote: > This leads to a BUG of the following form: > > [ 174.512861] switch: port 2(vif3.0) entered disabled state > [ 174.522735] BUG: sleeping function called from invalid context at > /home/build/linux-linus/mm/vmalloc.c:1441 > [ 174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch > [ 174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: GW > 4.10.0upstream-11073-g4977ab6-dirty #1 > [ 174.524819] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0 > 03/14/2011 > [ 174.525517] Call Trace: > [ 174.526217] show_stack+0x23/0x60 > [ 174.526899] dump_stack+0x5b/0x88 > [ 174.527562] ___might_sleep+0xde/0x130 > [ 174.528208] __might_sleep+0x35/0xa0 > [ 174.528840] ? _raw_spin_unlock_irqrestore+0x13/0x20 > [ 174.529463] ? __wake_up+0x40/0x50 > [ 174.530089] remove_vm_area+0x20/0x90 > [ 174.530724] __vunmap+0x1d/0xc0 > [ 174.531346] ? delete_object_full+0x13/0x20 > [ 174.531973] vfree+0x40/0x80 > [ 174.532594] set_backend_state+0x18a/0xa90 > [ 174.533221] ? dwc_scan_descriptors+0x24d/0x430 > [ 174.533850] ? kfree+0x5b/0xc0 > [ 174.534476] ? xenbus_read+0x3d/0x50 > [ 174.535101] ? xenbus_read+0x3d/0x50 > [ 174.535718] ? xenbus_gather+0x31/0x90 > [ 174.536332] ? ___might_sleep+0xf6/0x130 > [ 174.536945] frontend_changed+0x6b/0xd0 > [ 174.537565] xenbus_otherend_changed+0x7d/0x80 > [ 174.538185] frontend_changed+0x12/0x20 > [ 174.538803] xenwatch_thread+0x74/0x110 > [ 174.539417] ? woken_wake_function+0x20/0x20 > [ 174.540049] kthread+0xe5/0x120 > [ 174.540663] ? xenbus_printf+0x50/0x50 > [ 174.541278] ? __kthread_init_worker+0x40/0x40 > [ 174.541898] ret_from_fork+0x21/0x2c > [ 174.548635] switch: port 2(vif3.0) entered disabled state > > This patch defers the vfree() until after the spinlock is released. > > Reported-by: Juergen Gross > Signed-off-by: Paul Durrant Reviewed-by: Juergen Gross Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH net 1/2] xen-netback: keep a local pointer for vif in backend_disconnect()
On Thu, Mar 02, 2017 at 12:54:25PM +, Paul Durrant wrote: > This patch replaces use of 'be->vif' with 'vif' and hence generally > makes the function look tidier. No semantic change. > > Signed-off-by: Paul Durrant Acked-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH net 2/2] xen-netback: don't vfree() queues under spinlock
On Thu, Mar 02, 2017 at 12:54:26PM +, Paul Durrant wrote: > This leads to a BUG of the following form: > > [ 174.512861] switch: port 2(vif3.0) entered disabled state > [ 174.522735] BUG: sleeping function called from invalid context at > /home/build/linux-linus/mm/vmalloc.c:1441 > [ 174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch > [ 174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: GW > 4.10.0upstream-11073-g4977ab6-dirty #1 > [ 174.524819] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0 > 03/14/2011 > [ 174.525517] Call Trace: > [ 174.526217] show_stack+0x23/0x60 > [ 174.526899] dump_stack+0x5b/0x88 > [ 174.527562] ___might_sleep+0xde/0x130 > [ 174.528208] __might_sleep+0x35/0xa0 > [ 174.528840] ? _raw_spin_unlock_irqrestore+0x13/0x20 > [ 174.529463] ? __wake_up+0x40/0x50 > [ 174.530089] remove_vm_area+0x20/0x90 > [ 174.530724] __vunmap+0x1d/0xc0 > [ 174.531346] ? delete_object_full+0x13/0x20 > [ 174.531973] vfree+0x40/0x80 > [ 174.532594] set_backend_state+0x18a/0xa90 > [ 174.533221] ? dwc_scan_descriptors+0x24d/0x430 > [ 174.533850] ? kfree+0x5b/0xc0 > [ 174.534476] ? xenbus_read+0x3d/0x50 > [ 174.535101] ? xenbus_read+0x3d/0x50 > [ 174.535718] ? xenbus_gather+0x31/0x90 > [ 174.536332] ? ___might_sleep+0xf6/0x130 > [ 174.536945] frontend_changed+0x6b/0xd0 > [ 174.537565] xenbus_otherend_changed+0x7d/0x80 > [ 174.538185] frontend_changed+0x12/0x20 > [ 174.538803] xenwatch_thread+0x74/0x110 > [ 174.539417] ? woken_wake_function+0x20/0x20 > [ 174.540049] kthread+0xe5/0x120 > [ 174.540663] ? xenbus_printf+0x50/0x50 > [ 174.541278] ? __kthread_init_worker+0x40/0x40 > [ 174.541898] ret_from_fork+0x21/0x2c > [ 174.548635] switch: port 2(vif3.0) entered disabled state > > This patch defers the vfree() until after the spinlock is released. > > Reported-by: Juergen Gross > Signed-off-by: Paul Durrant Acked-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable test] 106351: tolerable FAIL - PUSHED
flight 106351 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/106351/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-armhf-armhf-libvirt-xsm 13 saverestore-support-checkfail like 105933 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stopfail like 105933 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 105933 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 105933 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 105933 test-armhf-armhf-libvirt-raw 12 saverestore-support-checkfail like 105933 test-armhf-armhf-libvirt 13 saverestore-support-checkfail like 105933 test-amd64-amd64-xl-rtds 9 debian-install fail like 105933 Tests which did not succeed, but are not blocking: test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-arm64-arm64-xl 1 build-check(1) blocked n/a build-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-qcow2 1 build-check(1) blocked n/a test-arm64-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-xl-credit2 1 build-check(1) blocked n/a test-arm64-arm64-xl-rtds 1 build-check(1) blocked n/a test-arm64-arm64-xl-multivcpu 1 build-check(1) blocked n/a test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass build-arm64 5 xen-buildfail never pass build-arm64-xsm 5 xen-buildfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass build-arm64-pvops 5 kernel-build fail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass version targeted for testing: xen 1a0ab02e342cfd80decd72606c94479e4b309a3c baseline version: xen 2f5af2c962c05b789bdd65b46c74711e903f86d0 Last test of basis 105933 2017-02-20 20:13:35 Z9 days Failing since105946 2017-02-21 10:19:53 Z9 days 15 attempts Testing same since 106351 2017-03-02 04:21:56 Z0 days1 attempts People who touched revisions under test: Andrew Cooper Boris Ostrovsky Chao Gao Daniel Kiper Dario Faggioli Doug Goldstein George Dunlap Haozhong Zhang Ian Jackson Jan Beulich Juergen Gross Julien Grall Kevin Tian Marek Marczykowski-Górecki Olaf Hering Paul Durrant Razvan Cojocaru Roger Pau Monne Roger Pau Monné Samuel Thibault Sergey D
[Xen-devel] [xtf test] 106365: all pass - PUSHED
flight 106365 xtf real [real] http://logs.test-lab.xenproject.org/osstest/logs/106365/ Perfect :-) All tests in this flight passed as required version targeted for testing: xtf f02259db8c737220b4e6ae5564a8f6da4fba2949 baseline version: xtf b834586f982174aa0bc09daf17f1375908ad04af Last test of basis 106302 2017-03-01 12:13:51 Z1 days Testing same since 106365 2017-03-02 10:44:47 Z0 days1 attempts People who touched revisions under test: Wei Liu jobs: build-amd64-xtf pass build-amd64 pass build-amd64-pvopspass test-xtf-amd64-amd64-1 pass test-xtf-amd64-amd64-2 pass test-xtf-amd64-amd64-3 pass test-xtf-amd64-amd64-4 pass test-xtf-amd64-amd64-5 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=xtf + revision=f02259db8c737220b4e6ae5564a8f6da4fba2949 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xtf f02259db8c737220b4e6ae5564a8f6da4fba2949 + branch=xtf + revision=f02259db8c737220b4e6ae5564a8f6da4fba2949 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=xtf + xenbranch=xen-unstable + '[' xxtf = xlinux ']' + linuxbranch= + '[' x = x ']' + qemuubranch=qemu-upstream-unstable + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable + prevxenbranch=xen-4.8-testing + '[' xf02259db8c737220b4e6ae5564a8f6da4fba2949 = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/linux-firmware.git ++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git ++ : git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git ++ : osst...@xenbits.xen.org:/home/xen/git/linux-pvops.git ++ : git://xenbits.xen.org/linux-pvops.git ++ : tested/linux-3.14 ++ : tested/linux-arm-xen ++ '[' xgit://xenbits.xen.org/linux-pvops.git = x ']' ++
Re: [Xen-devel] [RFC PATCH] mm, hotplug: get rid of auto_online_blocks
On Mon 27-02-17 16:43:04, Michal Hocko wrote: > On Mon 27-02-17 12:25:10, Heiko Carstens wrote: > > On Mon, Feb 27, 2017 at 11:02:09AM +0100, Vitaly Kuznetsov wrote: > > > A couple of other thoughts: > > > 1) Having all newly added memory online ASAP is probably what people > > > want for all virtual machines. > > > > This is not true for s390. On s390 we have "standby" memory that a guest > > sees and potentially may use if it sets it online. Every guest that sets > > memory offline contributes to the hypervisor's standby memory pool, while > > onlining standby memory takes memory away from the standby pool. > > > > The use-case is that a system administrator in advance knows the maximum > > size a guest will ever have and also defines how much memory should be used > > at boot time. The difference is standby memory. > > > > Auto-onlining of standby memory is the last thing we want. I don't know much about anything other than x86 so all comments below are from that point of view, archetectures that don't need auto online can keep current default > > > Unfortunately, we have additional complexity with memory zones > > > (ZONE_NORMAL, ZONE_MOVABLE) and in some cases manual intervention is > > > required. Especially, when further unplug is expected. > > > > This also is a reason why auto-onlining doesn't seem be the best way. When trying to support memory unplug on guest side in RHEL7, experience shows otherwise. Simplistic udev rule which onlines added block doesn't work in case one wants to online it as movable. Hotplugged blocks in current kernel should be onlined in reverse order to online blocks as movable depending on adjacent blocks zone. Which means simple udev rule isn't usable since it gets event from the first to the last hotplugged block order. So now we would have to write a daemon that would - watch for all blocks in hotplugged memory appear (how would it know) - online them in right order (order might also be different depending on kernel version) -- it becomes even more complicated in NUMA case when there are multiple zones and kernel would have to provide user-space with information about zone maps In short current experience shows that userspace approach - doesn't solve issues that Vitaly has been fixing (i.e. onlining fast and/or under memory pressure) when udev (or something else might be killed) - doesn't reduce overall system complexity, it only gets worse as user-space handler needs to know a lot about kernel internals and implementation details/kernel versions to work properly It's might be not easy but doing onlining in kernel on the other hand is: - faster - more reliable (can't be killed under memory pressure) - kernel has access to all info needed for onlining and how it internals work to implement auto-online correctly - since there is no need to mantain ABI for user-space (zones layout/ordering/maybe something else), kernel is free change internal implemetation without breaking userspace (currently hotplug+unplug doesn't work reliably and we might need something more flexible than zones) That's direction of research in progress, i.e. making kernel implementation better instead of putting responsibility on user-space to deal with kernel shortcomings. > Can you imagine any situation when somebody actually might want to have > this knob enabled? From what I understand it doesn't seem to be the > case. For x86: * this config option is enabled by default in recent Fedora, * RHEL6 ships similar downstream patches to do the same thing for years * RHEL7 has udev rule (because there wasn't kernel side solution at fork time) that auto-onlines it unconditionally, Vitaly might backport it later when he has time. Not linux kernel but still auto online policy is used by Windows both on baremetal and guest configurations. That's somewhat shows that current defaults upstream on x86 might be not what end-users wish for. When auto_online_blocks were introduced, Vitaly has been conservative and left current upstream defaults where they were lest it would break someone else setup but allowing downstreams set their own auto-online policy, eventually we might switch it upstream too. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel