[Xen-devel] [qemu-mainline test] 78509: regressions - FAIL
flight 78509 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/78509/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl-cubietruck 15 guest-start/debian.repeat fail REGR. vs. 78421 Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-rtds 11 guest-start fail REGR. vs. 78421 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass version targeted for testing: qemuu3db34bf64ab4f8797565dd8750003156c32b301d baseline version: qemuu19b6d84316892c8086e0115d6f09cb01abb86cfc Last test of basis78421 2016-01-18 09:47:22 Z1 days Testing same since78509 2016-01-19 09:21:37 Z0 days1 attempts People who touched revisions under test: Andreas Färber Christophe Fergeau Daniel P. Berrange Gerd Hoffmann Juan Quintela Mark Cave-Ayland Paolo Bonzini Peter Maydell Wolfgang Bumiller jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-xl pass test-armhf-armhf-xl pass test-amd64-i386-xl pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass test-amd64-amd64-xl-qemuu-debianhvm-amd64-x
Re: [Xen-devel] [PATCH] build: specify minimum versions of make and binutils
>>> On 19.01.16 at 20:24, wrote: > On 1/19/16 2:48 AM, Jan Beulich wrote: > On 18.01.16 at 18:21, wrote: >>> On 1/18/16 11:03 AM, Jan Beulich wrote: >>> On 18.01.16 at 17:53, wrote: > To help people avoid having to figure out what versions of make and > binutils need to be supported document them explicitly. The version of > binutils that had to be supported was mentioned in > http://lists.xenproject.org/archives/html/xen-devel/2016-01/msg00609.html > as 2.17. Knowing that Jan got these versions from SLES10 I looked up the > version of GNU make from the same vintage (mid-2006) and landed on 3.81. I'm afraid that same SLE10 has been using binutils 2.16.9. and make 3.80. While (still building Xen there once in a while) I'd probably not be in big trouble if we decided we don't want to support that old an environment anymore, I don't think we can just go and document higher versions than we so far allowed. We'd first need to settle on where to draw the line nowadays (which then likely would mean a gcc minimal version bum too). >>> >>> Not a problem. I was just trying to take the situation from a guessing >>> game to be explicitly called out. I was documenting what my logic was >>> behind the version numbers I selected. I wasn't able to compare dates >>> with binutils because their repo goes from 2003 to 2011 [1]. So I went >>> back to SLES10's release date [2] and the GCC 4.1.0 release date [3] to >>> compare it with GNU make [4]. >>> >>> Honestly I'd be happy if we just drew a line in the sand so that its >>> clear what I need to test against when I submit patches. I don't really >>> care where the line is. >> >> Then how about 2.16.1 and 3.80 respectively as the initial line? > > Sounds great to me. Would you like me to resubmit or do you want to make > that change. I'm ok if you throw away my patch and author it yourself. > Whatever is easiest for you (or whoever commits it). I'd prefer if you re-submitted. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
>>> On 20.01.16 at 08:49, wrote: >> From: Jan Beulich [mailto:jbeul...@suse.com] >> Sent: Monday, January 18, 2016 11:14 PM >> >>> On 03.12.15 at 09:35, wrote: >> > --- a/xen/arch/x86/hvm/vmx/vmx.c >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c >> > @@ -83,7 +83,131 @@ static int vmx_msr_write_intercept(unsigned int >> msr, uint64_t msr_content); >> > +ASSERT(v->arch.hvm_vmx.pi_block_cpu == NR_CPUS); >> > + >> > +/* >> > + * The vCPU is blocking, we need to add it to one of the per pCPU >> > lists. >> > + * We save v->processor to v->arch.hvm_vmx.pi_block_cpu and use it >> for >> > + * the per-CPU list, we also save it to posted-interrupt descriptor >> > and >> > + * make it as the destination of the wake-up notification event. >> > + */ >> > +v->arch.hvm_vmx.pi_block_cpu = v->processor; >> > + >> > +spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock, >> > + v->arch.hvm_vmx.pi_block_cpu), flags); >> > +list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list, >> > + &per_cpu(pi_blocked_vcpu, >> > v->arch.hvm_vmx.pi_block_cpu)); >> > +spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock, >> > + v->arch.hvm_vmx.pi_block_cpu), flags); >> > + >> > +ASSERT(!pi_test_sn(pi_desc)); >> > + >> > +/* >> > + * We don't need to set the 'NDST' field, since it should point to >> > + * the same pCPU as v->processor. >> > + */ >> > + >> > +write_atomic(&pi_desc->nv, pi_wakeup_vector); >> >> Stray blank line between comment and corresponding code. >> >> Also the ASSERT() is rather more disconnected from the write >> than seems desirable: Wouldn't it be better to cmpxchg() the >> whole 32-bit value, validating that SN is clear in the result? > > Not quite understand this. The control field is 64 bits, do you > mean cmpxchg() the whole 64-bit value like follows: > > +do { > +old.control = new.control = pi_desc->control; > +new.nv = pi_wakeup_vector; > +} while ( cmpxchg(&pi_desc->control, old.control, new.control) > + != old.control ); > > This a somewhat like the implementation in the earlier versions, > why do you want to change it back? Did you read what I've said? I'm worried about the disconnect of assertion and update: You really care about SN being clear _after_ the update aiui. And since NDST doesn't change, a 32-bit CMPXCHG would seem to suffice. >> > +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu; >> > +if ( pi_block_cpu == NR_CPUS ) >> > +return; >> >> Are this condition and the one in the immediately preceding if() >> connected in any way? > > I am not sure I understand this correctly. Which one is > " the one in the immediately preceding if()"? If you hadn't ripped out too much of the context, it would be a matter of pointing you back up. Now I have to re-quote the full code: +if ( pi_desc->nv != posted_intr_vector ) +write_atomic(&pi_desc->nv, posted_intr_vector); + +/* the vCPU is not on any blocking list. */ +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu; +if ( pi_block_cpu == NR_CPUS ) +return; These are the two if()-s the question is about. >> I.e. could the latter perhaps become an >> ASSERT() by connecting the two blocks suitably? I'm simply >> trying to limit the number of different cases to consider mentally >> when looking at this code ... > > If we get a true from above check, it means the vcpu was removed by > pi_wakeup_interrupt(), then we just need to return. If we get a false, > we will acquire the spinlock as the following code does, there are two > scenarios: > - pi_wakeup_interrupt() acquires the spinlock before us, then when > we get the lock, 'v->arch.hvm_vmx.pi_block_cpu' has already been > set to NR_CPUS by pi_wakeup_interrupt(), we need do nothing. > - We get the spinlock before pi_wakeup_interrupt(), this time we need > to remove the vCPU from the list and set 'v->arch.hvm_vmx.pi_block_cpu' > to NR_CPUS. This is all only about the second condition, but the question really is whether one being true or false may imply the result of other. In which case this would better be ASSERT()ed than handled by two conditionals. >> > @@ -106,10 +230,17 @@ static int vmx_vcpu_initialise(struct vcpu *v) >> > >> > spin_lock_init(&v->arch.hvm_vmx.vmcs_lock); >> > >> > +INIT_LIST_HEAD(&v->arch.hvm_vmx.pi_blocked_vcpu_list); >> > + >> > +v->arch.hvm_vmx.pi_block_cpu = NR_CPUS; >> > + >> > v->arch.schedule_tail= vmx_do_resume; >> > v->arch.ctxt_switch_from = vmx_ctxt_switch_from; >> > v->arch.ctxt_switch_to = vmx_ctxt_switch_to; >> > >> > +if ( iommu_intpost ) >> > +v->arch.vcpu_block = vmx_vcpu_block; >> >> Considering the locking question above - couldn't this be done only >> when a device gets added, and the pointer zapped upon removal >> of the last device, at once saving the conditional inside the hook? > > That is another so
Re: [Xen-devel] [PATCH v2 08/16] xen/hvm/params: Add a new delivery type for event-channel in HVM_PARAM_CALLBACK_IRQ
On 20/01/2016 06:33, Shannon Zhao wrote: > > On 2016/1/18 20:52, Andrew Cooper wrote: >> On 18/01/16 12:46, Stefano Stabellini wrote: On Mon, 18 Jan 2016, Andrew Cooper wrote: >> On 18/01/16 12:38, Stefano Stabellini wrote: On Fri, 15 Jan 2016, Shannon Zhao wrote: >> From: Shannon Zhao >> >> Add a new delivery type: >> val[63:56] == 3: val[15:8] is flag: val[7:0] is a PPI. >> To the flag, bit 0 stands the interrupt mode is edge(1) or level(0) >> and >> bit 1 stands the interrupt polarity is active low(1) or high(0). >> >> Signed-off-by: Shannon Zhao >> --- >> include/xen/interface/hvm/params.h | 5 + >> 1 file changed, 5 insertions(+) >> >> diff --git a/include/xen/interface/hvm/params.h >> b/include/xen/interface/hvm/params.h >> index a6c7991..550688a 100644 >> --- a/include/xen/interface/hvm/params.h >> +++ b/include/xen/interface/hvm/params.h >> @@ -34,6 +34,11 @@ >> * Domain = val[47:32], Bus = val[31:16], >> * DevFn = val[15: 8], IntX = val[ 1: 0] >> * val[63:56] == 2: val[7:0] is a vector number. >> + * val[63:56] == 3: val[15:8] is flag of event-channel interrupt: >> + * bit 0: interrupt is edge(1) or level(0) >> triggered >> + * bit 1: interrupt is active low(1) or high(0) >> + * val[7:0] is PPI number used by event-channel. >> + * This is only used by ARM/ARM64. >> * If val == 0 then CPU0 event-channel notifications are not >> delivered. >> */ >> #define HVM_PARAM_CALLBACK_IRQ 0 Andrew, I think that this patch is correct. Looking back at your previous comment (http://marc.info/?l=devicetree&m=144804014214262&w=2), is it possible that you were confused by enum callback_via_type, which is internal to Xen and offset'ed by 1 compared to the described values in xen/include/public/hvm/params.h? If not, and indeed somebody introduced one more field but failed to document it, then I suggest she sends a patch to fix the issue as soon as possible. >> I was indeed confused - the ABI is utterly mad. All right. In that case, Shannon, you can add my Reviewed-by: Stefano Stabellini >> However, this change does need rebasing over c/s ca5c54b, which was the >> result of the original discussion. c/s ca5c54b is for Xen, while this is a Linux patch (Linux has its own set of Xen headers). >> All ABI changes need to happen in the Xen public headers first. This >> patch cannot be accepted yet. > So you mean it should port c/s ca5c54b to Linux first? Then add this > change based on that? FIrst, you must get this content and submit a patch against Xen. The Xen repository is the authoritative version of the ABI. Once that patch is accepted, you will need to port ca5c54b and the new patch as part of this series. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
>>> On 20.01.16 at 06:31, wrote: > The primary reason of current solution is to reuse existing NVDIMM > driver in Linux kernel. Re-using code in the Dom0 kernel has benefits and drawbacks, and in any event needs to depend on proper layering to remain in place. A benefit is less code duplication between Xen and Linux; along the same lines a drawback is code duplication between various Dom0 OS variants. > One responsibility of this driver is to discover NVDIMM devices and > their parameters (e.g. which portion of an NVDIMM device can be mapped > into the system address space and which address it is mapped to) by > parsing ACPI NFIT tables. Looking at the NFIT spec in Sec 5.2.25 of > ACPI Specification v6 and the actual code in Linux kernel > (drivers/acpi/nfit.*), it's not a trivial task. To answer one of Kevin's questions: The NFIT table doesn't appear to require the ACPI interpreter. They seem more like SRAT and SLIT. Also you failed to answer Kevin's question regarding E820 entries: I think NVDIMM (or at least parts thereof) get represented in E820 (or the EFI memory map), and if that's the case this would be a very strong hint towards management needing to be in the hypervisor. > Secondly, the driver implements a convenient block device interface to > let software access areas where NVDIMM devices are mapped. The > existing vNVDIMM implementation in QEMU uses this interface. > > As Linux NVDIMM driver has already done above, why do we bother to > reimplement them in Xen? See above; a possibility is that we may need a split model (block layer parts on Dom0, "normal memory" parts in the hypervisor. Iirc the split is being determined by firmware, and hence set in stone by the time OS (or hypervisor) boot starts. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 08/16] xen/hvm/params: Add a new delivery type for event-channel in HVM_PARAM_CALLBACK_IRQ
On 2016/1/20 16:39, Andrew Cooper wrote: > On 20/01/2016 06:33, Shannon Zhao wrote: >> > >> > On 2016/1/18 20:52, Andrew Cooper wrote: >>> >> On 18/01/16 12:46, Stefano Stabellini wrote: > On Mon, 18 Jan 2016, Andrew Cooper wrote: >>> >> On 18/01/16 12:38, Stefano Stabellini wrote: > On Fri, 15 Jan 2016, Shannon Zhao wrote: >>> >> From: Shannon Zhao >>> >> >>> >> Add a new delivery type: >>> >> val[63:56] == 3: val[15:8] is flag: val[7:0] is a PPI. >>> >> To the flag, bit 0 stands the interrupt mode is edge(1) >>> >> or level(0) and >>> >> bit 1 stands the interrupt polarity is active low(1) or >>> >> high(0). >>> >> >>> >> Signed-off-by: Shannon Zhao >>> >> --- >>> >> include/xen/interface/hvm/params.h | 5 + >>> >> 1 file changed, 5 insertions(+) >>> >> >>> >> diff --git a/include/xen/interface/hvm/params.h >>> >> b/include/xen/interface/hvm/params.h >>> >> index a6c7991..550688a 100644 >>> >> --- a/include/xen/interface/hvm/params.h >>> >> +++ b/include/xen/interface/hvm/params.h >>> >> @@ -34,6 +34,11 @@ >>> >> * Domain = val[47:32], Bus = >>> >> val[31:16], >>> >> * DevFn = val[15: 8], IntX = val[ 1: >>> >> 0] >>> >> * val[63:56] == 2: val[7:0] is a vector number. >>> >> + * val[63:56] == 3: val[15:8] is flag of event-channel >>> >> interrupt: >>> >> + * bit 0: interrupt is edge(1) or >>> >> level(0) triggered >>> >> + * bit 1: interrupt is active >>> >> low(1) or high(0) >>> >> + * val[7:0] is PPI number used by >>> >> event-channel. >>> >> + * This is only used by ARM/ARM64. >>> >> * If val == 0 then CPU0 event-channel notifications >>> >> are not delivered. >>> >> */ >>> >> #define HVM_PARAM_CALLBACK_IRQ 0 > Andrew, I think that this patch is correct. Looking back at > your > previous comment > (http://marc.info/?l=devicetree&m=144804014214262&w=2), > is it possible that you were confused by enum > callback_via_type, which > is internal to Xen and offset'ed by 1 compared to the > described values > in xen/include/public/hvm/params.h? > > If not, and indeed somebody introduced one more field but > failed to > document it, then I suggest she sends a patch to fix the > issue as soon > as possible. >>> >> I was indeed confused - the ABI is utterly mad. > All right. In that case, Shannon, you can add my > > Reviewed-by: Stefano Stabellini > > >>> >> However, this change does need rebasing over c/s ca5c54b, which >>> >> was the >>> >> result of the original discussion. > c/s ca5c54b is for Xen, while this is a Linux patch (Linux has its > own > set of Xen headers). >>> >> All ABI changes need to happen in the Xen public headers first. This >>> >> patch cannot be accepted yet. >> > So you mean it should port c/s ca5c54b to Linux first? Then add this >> > change based on that? > FIrst, you must get this content and submit a patch against Xen. The > Xen repository is the authoritative version of the ABI. > I've sent a patch to Xen, but not applied yet. > Once that patch is accepted, you will need to port ca5c54b and the new > patch as part of this series. Ok, thanks. -- Shannon ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
Introduce pvclk interface which is useful when doing device passthrough on ARM platform. To ARM platform, saying embedded SoCs, clock management hardware IP is controlled by Dom0, DomU does not have clocksource. So we need a paravirtualized clock source to let DomU can handle device that are passed through from Dom0. Signed-off-by: Peng Fan Cc: George Dunlap Cc: Konrad Rzeszutek Wilk Cc: David Vrabel Cc: Julien Grall Cc: Stefano Stabellini Cc: Ian Campbell --- V2: The V1 thread: http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html The V1 thread binds the interface part to the implementation for linux kernel. This V2 only contains the pvclk interface part. In this patch: Backend, /local/domain//backend/vclk//nr-clks /local/domain//backend/vclk///name Y is frontend domid, Z is clk id, name is the clk name. Frontend, /local/domain//device/vclk/clk-ring-ref /local/domain//device/vclk/event-channel /local/domain//device/vclk//name Y is the clk id. My original idea is to pass clk name which is parsed from dts. And not expose clk name to xenstore. DomU pass the name to Dom0, Dom0 use the name to find correct clk and operate on the clk. Not sure whether I am right or not, please advise. I have no knowledge of ACPI, just following Ian's comment on V1, I add the clk id and expose the info to xenstore, but discard the devid/bus. If devid/bus is needed, I think we can encoded it into the clk id part in future. Exposing the info to xenstore means we need to add entry in xl configuration file like this: vclks = ["id:clkname", "id:clkname"] --> vclks = ["1:uart2-root-clk", "3:gpu-root-clk"]. And the libxl pvclk should parse the vclks entry and fill the contents to xenstore backend nodes. The frontend xenstore node will be created, by driver probe function when DomU booting or by libxl pvclk before DomU boots. To my use case, Dom0 and DomU both use device tree, I need to build the mapping table between id and name, since I use name to lookup the clk in backend, like this: "clk = __clk_loopkup(clkname); clk_prepare_enable(clk)". Maybe ACPI is another different case. xen/include/public/io/clkif.h | 129 ++ 1 file changed, 129 insertions(+) create mode 100644 xen/include/public/io/clkif.h diff --git a/xen/include/public/io/clkif.h b/xen/include/public/io/clkif.h new file mode 100644 index 000..f6f9f20 --- /dev/null +++ b/xen/include/public/io/clkif.h @@ -0,0 +1,129 @@ +/* + * clkif.h + * + * CLK I/O interface for Xen guest OSes. + * + * Copyright (C) 2016, Peng Fan + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to + * deal in the Software without restriction, including without limitation the + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#ifndef __XEN_PUBLIC_IO_CLKIF_H__ +#define __XEN_PUBLIC_IO_CLKIF_H__ + +#include "ring.h" +#include "../grant_table.h" + +/* + * The two halves of an Xen pvclk driver utilize nodes within the + * XenStore to communicate and negotiate operating parameters. + * + * Backend: + * /local/domain//backend/vclk//nr-clks + * /local/domain//backend/vclk///name + * + * X - The backend domid + * Y - The frontend domid + * Z - The clk id + * name - The clk name + * + * Backend example: + * /local/domain/0/backend/vclk/1/nr-clks --> value 2 + * /local/domain/0/backend/vclk/1/0/name --> uart2-root-clk + * /local/domain/0/backend/vclk/1/1/name --> gpu-root-clk + * + * /local/domain/0/backend/vclk/2/nr-clks --> value 1 + * /local/domain/0/backend/vclk/2/0/name --> lcdif-per-clk + * + * Frontend: + * /local/domain//device/vclk/clk-ring-ref + * /local/domain//device/vclk/event-channel + * /local/domain//device/vclk//name + * + * Frontend example: + * /local/domain/1/device/vclk/0/name --> uart2-root-clk + * /local/domain/1/device/vclk/1/name --> gpu-root-clk + * + * /local/domain/2/device/vclk/0/name -->lcdif-per-clk + */ + +/* + * Define the CMDs for communication between frontend and backend + * + * If the Guest OSes want to ask the privileged OS to operate on + * a specific clk, the Guest OSes should pass the CMD to th
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
On 20/01/2016 08:46, Jan Beulich wrote: On 20.01.16 at 06:31, wrote: >> The primary reason of current solution is to reuse existing NVDIMM >> driver in Linux kernel. > Re-using code in the Dom0 kernel has benefits and drawbacks, and > in any event needs to depend on proper layering to remain in place. > A benefit is less code duplication between Xen and Linux; along the > same lines a drawback is code duplication between various Dom0 > OS variants. > >> One responsibility of this driver is to discover NVDIMM devices and >> their parameters (e.g. which portion of an NVDIMM device can be mapped >> into the system address space and which address it is mapped to) by >> parsing ACPI NFIT tables. Looking at the NFIT spec in Sec 5.2.25 of >> ACPI Specification v6 and the actual code in Linux kernel >> (drivers/acpi/nfit.*), it's not a trivial task. > To answer one of Kevin's questions: The NFIT table doesn't appear > to require the ACPI interpreter. They seem more like SRAT and SLIT. > Also you failed to answer Kevin's question regarding E820 entries: I > think NVDIMM (or at least parts thereof) get represented in E820 (or > the EFI memory map), and if that's the case this would be a very > strong hint towards management needing to be in the hypervisor. Conceptually, an NVDIMM is just like a fast SSD which is linearly mapped into memory. I am still on the dom0 side of this fence. The real question is whether it is possible to take an NVDIMM, split it in half, give each half to two different guests (with appropriate NFIT tables) and that be sufficient for the guests to just work. Either way, it needs to be a toolstack policy decision as to how to split the resource. ~Andrew > >> Secondly, the driver implements a convenient block device interface to >> let software access areas where NVDIMM devices are mapped. The >> existing vNVDIMM implementation in QEMU uses this interface. >> >> As Linux NVDIMM driver has already done above, why do we bother to >> reimplement them in Xen? > See above; a possibility is that we may need a split model (block > layer parts on Dom0, "normal memory" parts in the hypervisor. > Iirc the split is being determined by firmware, and hence set in > stone by the time OS (or hypervisor) boot starts. > > Jan > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
On 20/01/16 09:31, Peng Fan wrote: > Introduce pvclk interface which is useful when doing device passthrough > on ARM platform. ... > +/* > + * Frontend request > + * > + * cmd: command for operation on clk, should be XEN_CLK_[xx], > + * excluding XEN_CLK_END. id is the > + * id: clk id > + * rate: clock rate. Used for set rate. Which unit? Hz? > + */ > +struct xen_clkif_request { > + uint32_t cmd; > + uint32_t id; > + uint64_t rate; > +}; > +typedef struct xen_clkif_request xen_clkif_request_t; > + > +/* > + * Backend response > + * > + * cmd: command for operation on clk, same with the cmd in request. > + * id: clk id, same with the id in request. > + * success: indicate failure or success for the cmd. Values? > + * rate: clock rate. Used for get rate. > + * > + * cmd and id are filled by backend and passed to frontend to > + * let frontend check whether the response is for the current > + * request or not. I'd rather let the frontend add a request Id to the request which will be echoed here instead cmd and clock Id. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Bug 1591 - xentop truncates VM names
Hi! Can anybody tell me please where I have to post bugs like this to find the right contact person who is qualified to fix or even to discuss it? Is xen-devel@lists.xen.org the right list for this? Thanks in advance, Michael Gesendet: Freitag, 15. Januar 2016 um 10:17 Uhr Von: "Michael Großer" An: xen-devel@lists.xen.org Betreff: [Xen-devel] Bug 1591 - xentop truncated VM names Hi! Some days ago, I discovered the 'xentop' tool (somebody mentioned it here: http://discussions.citrix.com/topic/374218-what-can-we-do-to-decrease-the-time-needed-for-snapshot-reverts/ ) We work with XenServer 6.5 and we use long names for our virtual machines to distinguish between them. A machine name can have about 40 characters, but xentop displays only the first 10 characters in the NAME column. The result is, that I cannot identify a particular VM using xentop. I read the man page and googled around to find a solution. I found the page http://www.bl-nk.net/2014/07/xentop-and-long-server-names/ and created my own quirky solution as follows: # cd /usr/sbin # cp xentop xentop_wide # sed -i 's/10\.10/80\.80/g' xentop_wide This works for me today, but my common sense tells me that this solution could cease working after we update to a newer version or that I could demage some other instructions that accidentally have similar machine codes. I discovered this 6 years old bug: http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1591 The composer of this bug makes recommendations that I wouldn't recommend, so that seams to be the reason why the bug is nearly 6 years old without being solved. I recommend a very simple and reconcilable problem-solving approach: - The man page should contain / describe another option, let me call it "-w / --column-name-width", which requires a number. - When starting xentop with "-w 80" or "--column-name-width=80", then it should display (in every imaginable work mode) a name column with a width of 80 characters. - The default width should stay of cause at 10 characters, but with such an option, I as a user could have a choice. Please tell me, how is the chance that such an bug fix enters a next possible release of XenServer, which release number could this be and when will it roughly appear? Thanks in advance, Michael Großer ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 05/11] igd: TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE: call parent realize
Hi, > > > > +i440fx_realize = k->realize; > > > > k->realize = igd_pt_i440fx_realize; > > > > ... because we are overriding it right here. > > Many device classes have a parent_realize field so they can keep > a pointer to the original realize function. It's better than a > static variable. How does the attached patch (incremental fix, not tested yet) look like? cheers, Gerd From 3d110e297b5182107e055db3ab69092affdef5bb Mon Sep 17 00:00:00 2001 From: Gerd Hoffmann Date: Wed, 20 Jan 2016 10:08:19 +0100 Subject: [PATCH] [fixup] TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE realize_parent --- hw/pci-host/igd.c | 24 +++- 1 file changed, 19 insertions(+), 5 deletions(-) diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c index 160828f..2d51745 100644 --- a/hw/pci-host/igd.c +++ b/hw/pci-host/igd.c @@ -49,12 +49,24 @@ out_free: g_free(path); } -static void (*i440fx_realize)(PCIDevice *pci_dev, Error **errp); +#define IGD_PT_I440FX_CLASS(class) \ +OBJECT_CLASS_CHECK(IGDPtI440fxClass, (class), \ + TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE) +#define IGD_PT_I440FX_GET_CLASS(obj)\ +OBJECT_GET_CLASS(IGDPtI440fxClass, (obj), \ + TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE) + +typedef struct IGDPtI440fxClass { +PCIDeviceClass parent_class; +void (*parent_realize)(PCIDevice *dev, Error **errp); +} IGDPtI440fxClass; + static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp) { +IGDPtI440fxClass *k = IGD_PT_I440FX_GET_CLASS(pci_dev); Error *err = NULL; -i440fx_realize(pci_dev, &err); +k->parent_realize(pci_dev, &err); if (err != NULL) { error_propagate(errp, err); return; @@ -72,11 +84,12 @@ static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp) static void igd_passthrough_i440fx_class_init(ObjectClass *klass, void *data) { +IGDPtI440fxClass *k = IGD_PT_I440FX_CLASS(klass); DeviceClass *dc = DEVICE_CLASS(klass); -PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); +PCIDeviceClass *pc = PCI_DEVICE_CLASS(klass); -i440fx_realize = k->realize; -k->realize = igd_pt_i440fx_realize; +k->parent_realize = pc->realize; +pc->realize = igd_pt_i440fx_realize; dc->desc = "IGD Passthrough Host bridge (i440fx)"; } @@ -84,6 +97,7 @@ static const TypeInfo igd_passthrough_i440fx_info = { .name = TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE, .parent= TYPE_I440FX_PCI_DEVICE, .class_init= igd_passthrough_i440fx_class_init, +.class_size= sizeof(IGDPtI440fxClass), }; static void (*q35_realize)(PCIDevice *pci_dev, Error **errp); -- 1.8.3.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] libxc: Expose the MPX cpuid flag to guest
> Cc: wei.l...@citrix.com; ian.campb...@citrix.com; > stefano.stabell...@eu.citrix.com; ian.jack...@eu.citrix.com; xen- > de...@lists.xen.org; jbeul...@suse.com > Subject: Re: [Xen-devel] [PATCH] libxc: Expose the MPX cpuid flag to guest > > On Mon, Jan 11, 2016 at 04:52:10PM +0800, Liang Li wrote: > > If hardware support memory protect externsion, expose this feature > > extension Thanks! Liang ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
Hi Juergen, On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote: >On 20/01/16 09:31, Peng Fan wrote: >> Introduce pvclk interface which is useful when doing device passthrough >> on ARM platform. > >... > >> +/* >> + * Frontend request >> + * >> + * cmd: command for operation on clk, should be XEN_CLK_[xx], >> + * excluding XEN_CLK_END. id is the >> + * id: clk id >> + * rate: clock rate. Used for set rate. > >Which unit? Hz? Yeah. Hz. I'll add comments in V3. > >> + */ >> +struct xen_clkif_request { >> +uint32_t cmd; >> +uint32_t id; >> +uint64_t rate; >> +}; >> +typedef struct xen_clkif_request xen_clkif_request_t; >> + >> +/* >> + * Backend response >> + * >> + * cmd: command for operation on clk, same with the cmd in request. >> + * id: clk id, same with the id in request. >> + * success: indicate failure or success for the cmd. > >Values? I'd like to let the frontend/backend driver to determine the value. In my implementation for linux, if the backend API supports return value, I'll fill the return value to success entry. If the backend API returns void, I'll just fill 0 to success entry. > >> + * rate: clock rate. Used for get rate. >> + * >> + * cmd and id are filled by backend and passed to frontend to >> + * let frontend check whether the response is for the current >> + * request or not. > >I'd rather let the frontend add a request Id to the request which >will be echoed here instead cmd and clock Id. If using request id, I think we can encode cmd and id into request id? To my dts case, clk id is simple. But I am not sure about ACPI part. Maybe I can not simply encode them into request id. Wait for more comments on this:) Thanks, Peng. > > >Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Error booting Xen
>>> On 19.01.16 at 20:55, wrote: > I tried booting staging branch but results were identical. Following > line repeats endlessly. > (XEN) traps.c:3290: GPF (): 82d0801c1cce -> 82d080252e5c > > $ 'addr2line -e xen-syms 82d0801c1cce' returns > 'xen/xen/arch/x86/xstate.c:387' which again points to > xsave. Also, adding 'xsave=0' makes it boot just fine. Ah, I think I see the issue: We're zeroing the entire save area in the fixup code, which will make XRSTORS fault unconditionally. Shuai, would you please look into possible ways of fixing this? Just setting the compressed flag may not be enough, and falling back to plain XRSTOR doesn't seem to be an option either - I'm in particular worried that the current model of zeroing everything is bogus with the growing number of different components which get loaded here. But of course another question then is why the XRSTORS faults in the first place. I guess we'll need a debugging patch to dump the full state to understand that. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
>>> On 20.01.16 at 09:31, wrote: > +/* > + * Backend response > + * > + * cmd: command for operation on clk, same with the cmd in request. > + * id: clk id, same with the id in request. > + * success: indicate failure or success for the cmd. > + * rate: clock rate. Used for get rate. > + * > + * cmd and id are filled by backend and passed to frontend to > + * let frontend check whether the response is for the current > + * request or not. > + */ > +struct xen_clkif_response { > + uint32_t cmd; > + uint32_t id; > + uint32_t success; > + uint64_t rate; > +}; This isn't 32-/64-bit clean. Question is whether echoing cmd is really needed. Also naming a field "success" is pretty odd - is this mean to be a boolean? Better name it e.g. status, allowing for different (error) indicators. And what I'm missing throughout the file is some description of how clock events (interrupts?) are actually meant to make it into the guest. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
On 01/20/16 08:58, Andrew Cooper wrote: > On 20/01/2016 08:46, Jan Beulich wrote: > On 20.01.16 at 06:31, wrote: > >> The primary reason of current solution is to reuse existing NVDIMM > >> driver in Linux kernel. > > Re-using code in the Dom0 kernel has benefits and drawbacks, and > > in any event needs to depend on proper layering to remain in place. > > A benefit is less code duplication between Xen and Linux; along the > > same lines a drawback is code duplication between various Dom0 > > OS variants. > > > >> One responsibility of this driver is to discover NVDIMM devices and > >> their parameters (e.g. which portion of an NVDIMM device can be mapped > >> into the system address space and which address it is mapped to) by > >> parsing ACPI NFIT tables. Looking at the NFIT spec in Sec 5.2.25 of > >> ACPI Specification v6 and the actual code in Linux kernel > >> (drivers/acpi/nfit.*), it's not a trivial task. > > To answer one of Kevin's questions: The NFIT table doesn't appear > > to require the ACPI interpreter. They seem more like SRAT and SLIT. > > Also you failed to answer Kevin's question regarding E820 entries: I > > think NVDIMM (or at least parts thereof) get represented in E820 (or > > the EFI memory map), and if that's the case this would be a very > > strong hint towards management needing to be in the hypervisor. > CCing QEMU vNVDIMM maintainer: Xiao Guangrong > Conceptually, an NVDIMM is just like a fast SSD which is linearly mapped > into memory. I am still on the dom0 side of this fence. > > The real question is whether it is possible to take an NVDIMM, split it > in half, give each half to two different guests (with appropriate NFIT > tables) and that be sufficient for the guests to just work. > Yes, one NVDIMM device can be split into multiple parts and assigned to different guests, and QEMU is responsible to maintain virtual NFIT tables for each part. > Either way, it needs to be a toolstack policy decision as to how to > split the resource. > But the split does not need to be done at Xen side IMO. It can be done by dom0 kernel and QEMU as long as they tells Xen hypervisor the address space range of each part. Haozhong > ~Andrew > > > > >> Secondly, the driver implements a convenient block device interface to > >> let software access areas where NVDIMM devices are mapped. The > >> existing vNVDIMM implementation in QEMU uses this interface. > >> > >> As Linux NVDIMM driver has already done above, why do we bother to > >> reimplement them in Xen? > > See above; a possibility is that we may need a split model (block > > layer parts on Dom0, "normal memory" parts in the hypervisor. > > Iirc the split is being determined by firmware, and hence set in > > stone by the time OS (or hypervisor) boot starts. > > > > Jan > > > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
>>> On 20.01.16 at 10:25, wrote: > On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote: >>On 20/01/16 09:31, Peng Fan wrote: >>> + */ >>> +struct xen_clkif_request { >>> + uint32_t cmd; >>> + uint32_t id; >>> + uint64_t rate; >>> +}; >>> +typedef struct xen_clkif_request xen_clkif_request_t; >>> + >>> +/* >>> + * Backend response >>> + * >>> + * cmd: command for operation on clk, same with the cmd in request. >>> + * id: clk id, same with the id in request. >>> + * success: indicate failure or success for the cmd. >> >>Values? > > I'd like to let the frontend/backend driver to determine the value. No, that would imply matched pairs of frontends and backends. Yet the purpose of spelling out an interface is to allow interoperability. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter max_ranges.
On Wed, 2016-01-20 at 03:58 +, Tian, Kevin wrote: > > From: Yu, Zhang [mailto:yu.c.zh...@linux.intel.com] > > Sent: Wednesday, January 20, 2016 11:33 AM > > > As a feature this write-protection has nothing to be GPU > > > virtualization specific. > > > In the future the same mediated pass-through idea used in XenGT may > > > be > > > used on other I/O devices which need to shadow some structure w/ > > > requirement > > > to write-protect guest memory. So it's not good to tie this to either > > > XenGT > > > or GTT. > > > > > Thank you, Kevin. > > Well, if this parameter is not supposed to be xengt specific, we do not > > need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1". > > Hence the user will have to configure the max_wp_ram_ranges himself, > > right? > > > > Not always. The option can be configured manually by the user, or > automatically set in the code when "vgt=1" is recognized. Is the latter approach not always sufficient? IOW, if it can be done automatically, why would the user need to tweak it? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter max_ranges.
> -Original Message- > From: Ian Campbell [mailto:ian.campb...@citrix.com] > Sent: 20 January 2016 10:16 > To: Kevin Tian; Yu, Zhang; Wei Liu; Paul Durrant > Cc: Keir (Xen.org); jbeul...@suse.com; Andrew Cooper; xen- > de...@lists.xen.org; Lv, Zhiyuan; Stefano Stabellini > Subject: Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter > max_ranges. > > On Wed, 2016-01-20 at 03:58 +, Tian, Kevin wrote: > > > From: Yu, Zhang [mailto:yu.c.zh...@linux.intel.com] > > > Sent: Wednesday, January 20, 2016 11:33 AM > > > > As a feature this write-protection has nothing to be GPU > > > > virtualization specific. > > > > In the future the same mediated pass-through idea used in XenGT may > > > > be > > > > used on other I/O devices which need to shadow some structure w/ > > > > requirement > > > > to write-protect guest memory. So it's not good to tie this to either > > > > XenGT > > > > or GTT. > > > > > > > Thank you, Kevin. > > > Well, if this parameter is not supposed to be xengt specific, we do not > > > need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1". > > > Hence the user will have to configure the max_wp_ram_ranges himself, > > > right? > > > > > > > Not always. The option can be configured manually by the user, or > > automatically set in the code when "vgt=1" is recognized. > > Is the latter approach not always sufficient? IOW, if it can be done > automatically, why would the user need to tweak it? > I think latter is sufficient for now. We always have the option of adding a specific wp_ram_ranges parameter in future if there is a need. Paul > Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter max_ranges.
On Wed, Jan 20, 2016 at 01:02:39PM +0800, Yu, Zhang wrote: > > > On 1/20/2016 11:58 AM, Tian, Kevin wrote: > >>From: Yu, Zhang [mailto:yu.c.zh...@linux.intel.com] > >>Sent: Wednesday, January 20, 2016 11:33 AM > >>>As a feature this write-protection has nothing to be GPU virtualization > >>>specific. > >>>In the future the same mediated pass-through idea used in XenGT may be > >>>used on other I/O devices which need to shadow some structure w/ > >>>requirement > >>>to write-protect guest memory. So it's not good to tie this to either XenGT > >>>or GTT. > >>> > >>Thank you, Kevin. > >>Well, if this parameter is not supposed to be xengt specific, we do not > >>need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1". > >>Hence the user will have to configure the max_wp_ram_ranges himself, > >>right? > >> > > > >Not always. The option can be configured manually by the user, or > >automatically set in the code when "vgt=1" is recognized. > > OK. That sounds more reasonable. :) > To give a summary, I'll do the following changes in next version: > > 1> rename this new parameter to "max_wp_ram_ranges", then use this > parameter as the wp-ram rangeset limit, for the I/O rangeset, keep > MAX_NR_IO_RANGES as its limit; > 2> clear the documentation part; > 3> define a LIBXL_HAVE_XXX in libxl.h to indicate a new field in the > build info; > 4> We do not introduce the xengt flag by now, and will add code to > automatically set the "max_wp_ram_ranges" after this flag is accepted > in the future. > > Does anyone have more suggestions? :) > Ian posted an enquiry earlier: "Could we use something like one of those to cause the t/stack to just DTRT without the user having to micromanage the amount of pages which are allowed to have this property?" Is that possible? Wei. > B.R. > Yu > > > >Thanks > >Kevin > > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen: Improvements to clean and distclean targets
On Tue, Jan 19, 2016 at 9:38 AM, Ian Campbell wrote: > On Tue, 2016-01-19 at 01:43 -0700, Jan Beulich wrote: >> > > > On 18.01.16 at 19:19, wrote: >> > On 18/01/16 16:57, Jan Beulich wrote: >> > > > > > On 18.01.16 at 17:45, wrote: >> > > > On 18/01/16 16:41, Jan Beulich wrote: >> > > > > > > > On 18.01.16 at 17:27, wrote: >> > > > > > * Move '*~' and 'core' into the find rule. >> > > > > I don't understand this part: Where in the build process do such >> > > > > get >> > > > > generated? I'm tempted to instead recommend to just drop those >> > > > > from the rm invocation... >> > > > No idea about 'core' files, but *~ are emacs backup files. >> > > But emacs should clean up after itself; this shouldn't be the job >> > > of our clean rule. >> > >> > Why? the point is to have a one-revision old version of the file to >> > hand. >> >> I guess there may be different strategies here: My editor also >> creates such named files, but deletes them as the program gets >> shut down. I.e. the one-revision old backup exists as long as the >> program is running. I can see benefits from the alternative >> model, but still it shouldn't be our scripts to clean up such backups. >> After all - what if another program used another name patter for >> its backups? Would we go clean those up then too? > > IMHO these files should be in .gitignore (so they don't clutter "git > status", AFAICT this is already done correctly) but it's not really > necessary for "make clean" (or distclean) to get rid of them, that's up to > either the editor or the user. IOW I'd be happy removing the existing > rules. As an emacs user, I agree with this. The purpose of "make clean" IMO is be to make sure that the *build* operates cleanly (i.e., doesn't end up using any output generated from a previous build), not to get rid of extraneous random files that don't affect the build. "git clean" is the proper tool for cleaning out the tree for git commands. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.
Jan, thanks for your comments. > On January 15, 2016 at 9:10, wrote: > >>> On 23.12.15 at 09:25, wrote: > > --- a/xen/drivers/passthrough/vtd/qinval.c > > +++ b/xen/drivers/passthrough/vtd/qinval.c > > @@ -190,9 +190,19 @@ static int queue_invalidate_wait(struct iommu > > *iommu, static int invalidate_sync(struct iommu *iommu) { > > struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu); > > +int rc; > > > > if ( qi_ctrl->qinval_maddr ) > > -return queue_invalidate_wait(iommu, 0, 1, 1); > > +{ > > +rc = queue_invalidate_wait(iommu, 0, 1, 1); > > +if ( rc ) > > +{ > > +if ( rc == -ETIMEDOUT ) > > +panic("Queue invalidate wait descriptor was > > + timeout.\n"); > > Unless I'm overlooking something this doesn't replace and existing panic(), > and I > think I had indicated quite clearly that this series shouldn't add any new > ones, > unless the alternative of crashing the owning domain is completely unfeasible. > I will remove it in next v5. [...] > > @@ -229,6 +239,63 @@ int qinval_device_iotlb(struct iommu *iommu, > > return 0; > > } > > > > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did, > > + u16 seg, u8 bus, u8 devfn) > { > > +struct domain *d; > > +struct pci_dev *pdev; > > + > > +d = rcu_lock_domain_by_id(iommu->domid_map[did]); > > +ASSERT(d); > > Don't you need to acquire some lock in order to safely assert this? Referred to the other use case of 'rcu_lock_domain_by_id()', Xen checks whether The domain is 'NULL'. Could I also replace the 'ASSERT(d)' with + If ( d == NULL ) + return; ? > Also note that unused slots hold zero, i.e. there's at least a theoretical > risk of > problems here when you don't also look at > iommu->domid_bitmap. > I am not clear how to fix this point. Do you have good idea? Add a lock to 'iommu->domid_bitmap'? > > +for_each_pdev(d, pdev) > > +{ > > +if ( (pdev->seg == seg) && > > + (pdev->bus == bus) && > > + (pdev->devfn == devfn) ) > > +{ > > +if ( pdev->domain ) > > If the device is on the domain's list, can this reasonably be false? I can't find the false case. > I.e. don't you rather mean ASSERT() here? I'd better replace 'if' with ASSERT() for this case. > > > +{ > > +list_del(&pdev->domain_list); > > This should happen under pcidevs_lock - you need to either acquire it or > ASSERT() it being held. > I should check whether it is under pcidevs_lock -- with spin_is_locked(&pcidevs_lock) If it is not under pcidevs_lock, I should acquire it. I think ASSERT() is not a good idea. Hypervisor acquires this lock and then remove the resource. > > +} > > + > > +if ( !is_hardware_domain(d) ) > > +domain_crash(d); > > I wonder whether the device hiding shouldn't also be avoided if the device is > owned by hwdom. > I think it should be avoided if the device is owned by hwdom. Otherwise, it is quite similar to panic.. > > @@ -349,9 +416,18 @@ static int flush_iotlb_qi( > > queue_invalidate_iotlb(iommu, > > type >> > DMA_TLB_FLUSH_GRANU_OFFSET, dr, > > dw, did, size_order, 0, addr); > > + > > +/* > > + * Synchronize with hardware for invalidation request descriptors > > + * submitted before Device-TLB invalidate descriptor. > > + */ > > +rc = invalidate_sync(iommu); > > +if ( rc ) > > + return rc; > > + > > if ( flush_dev_iotlb ) > > ret = dev_invalidate_iotlb(iommu, did, addr, size_order, type); > > -rc = invalidate_sync(iommu); > > + > > if ( !ret ) > > ret = rc; > > } > > This change is because of ...? > As Kevin's comments, I add 'dev_invalidate_iotlb_sync()' to dev_invalidate_iotlb(). Then i can know which device is flush timeout. > > --- a/xen/drivers/passthrough/vtd/x86/ats.c > > +++ b/xen/drivers/passthrough/vtd/x86/ats.c > > @@ -162,6 +162,19 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 > did, > > return -EOPNOTSUPP; > > } > > > > +/* > > + * Synchronize with hardware for Device-TLB invalidate > > + * descriptor. > > + */ > > +ret = dev_invalidate_iotlb_sync(iommu, did, pdev->seg, > > +pdev->bus, pdev->devfn); > > +if ( ret ) > > +{ > > +dprintk(XENLOG_WARNING VTDPREFIX, > > +"Device-TLB flush timeout.\n"); > > Are you aware that dprintk() compiles to nothing in non-debug builds? To be honest, I was not aware. > Plus logging the error code and device coordinates might turn out helpful in > such cases. But first of all - if there was a timeout, we'd make it here only > for > Dom0. Perhaps the printing should
Re: [Xen-devel] Bug 1591 - xentop truncates VM names
On Wed, 2016-01-20 at 09:56 +0100, > Hi! > > Can anybody tell me please where I have to post bugs like this to > find the right contact person who is qualified to fix or even to discuss > it? > > Is xen-devel@lists.xen.org the right list for this? For XenServer related bugs: If you are a Citrix customer in which case you will have a Support contact which you should use. Otherwise the right place is somewhere on www.xenserver.org (forums, lists etc), since that is a separate downstream project from xen.org (xenproject). For upstream/XenProject related buts: xen-devel@lists.xen.org is the right place, the process/expectations are described more fully at http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen . At least in recent xen.org releases xentop takes a -f/--full-name option which does not truncate the name: root@st40:~# xentop -f -b NAME STATE CPU(sec) CPU(%) MEM(k) MEM(%) MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS VBD_OO VBD_RD VBD_WR VBD_RSECT VBD_WSECT SSID aa -r 560.0 5242886.3 525312 6.3 11 53120 1991 43 507388240 Domain-0 -r3080.0 522484 6.2 524288 6.3 8000000 0 0 00 So if this isn't present in XenServer then you want to be talking to them or your Support person. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] xen: allocate gntdev_copy_batch dynamically
struct gntdev_copy_batch is arguably too large to fit on the kernel stack, and we get a warning about the stack usage in gntdev_ioctl_grant_copy: drivers/xen/gntdev.c:949:1: error: the frame size of 1240 bytes is larger than 1024 bytes This changes the code to us a dynamic allocation instead. Signed-off-by: Arnd Bergmann Fixes: a4cdb556cae0 ("xen/gntdev: add ioctl for grant copy") --- This is a regression against v4.4, found with ARM randconfig testing. diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c index dc495383ad73..cc753b3a7154 100644 --- a/drivers/xen/gntdev.c +++ b/drivers/xen/gntdev.c @@ -915,15 +915,16 @@ static int gntdev_grant_copy_seg(struct gntdev_copy_batch *batch, static long gntdev_ioctl_grant_copy(struct gntdev_priv *priv, void __user *u) { struct ioctl_gntdev_grant_copy copy; - struct gntdev_copy_batch batch; + struct gntdev_copy_batch *batch; unsigned int i; int ret = 0; if (copy_from_user(©, u, sizeof(copy))) return -EFAULT; - batch.nr_ops = 0; - batch.nr_pages = 0; + batch = kzalloc(sizeof(*batch), GFP_KERNEL); + if (!batch) + return -ENOMEM; for (i = 0; i < copy.count; i++) { struct gntdev_grant_copy_segment seg; @@ -933,18 +934,20 @@ static long gntdev_ioctl_grant_copy(struct gntdev_priv *priv, void __user *u) goto out; } - ret = gntdev_grant_copy_seg(&batch, &seg, ©.segments[i].status); + ret = gntdev_grant_copy_seg(batch, &seg, ©.segments[i].status); if (ret < 0) goto out; cond_resched(); } - if (batch.nr_ops) - ret = gntdev_copy(&batch); + if (batch->nr_ops) + ret = gntdev_copy(batch); + kfree(batch); return ret; out: - gntdev_put_pages(&batch); + gntdev_put_pages(batch); + kfree(batch); return ret; } ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
On 20/01/16 10:25, Peng Fan wrote: > Hi Juergen, > > On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote: >> On 20/01/16 09:31, Peng Fan wrote: >>> Introduce pvclk interface which is useful when doing device passthrough >>> on ARM platform. >> >> ... >> >>> +/* >>> + * Frontend request >>> + * >>> + * cmd: command for operation on clk, should be XEN_CLK_[xx], >>> + * excluding XEN_CLK_END. id is the >>> + * id: clk id >>> + * rate: clock rate. Used for set rate. >> >> Which unit? Hz? > > Yeah. Hz. I'll add comments in V3. > >> >>> + */ >>> +struct xen_clkif_request { >>> + uint32_t cmd; >>> + uint32_t id; >>> + uint64_t rate; >>> +}; >>> +typedef struct xen_clkif_request xen_clkif_request_t; >>> + >>> +/* >>> + * Backend response >>> + * >>> + * cmd: command for operation on clk, same with the cmd in request. >>> + * id: clk id, same with the id in request. >>> + * success: indicate failure or success for the cmd. >> >> Values? > > I'd like to let the frontend/backend driver to determine the value. > In my implementation for linux, if the backend API supports return value, > I'll fill the return value to success entry. If the backend API returns > void, I'll just fill 0 to success entry. So please specify "0" to mean success and add some sensible error values. There might be multiple frontend- and/or backend-variants which all must agree using the same interface. >>> + * rate: clock rate. Used for get rate. >>> + * >>> + * cmd and id are filled by backend and passed to frontend to >>> + * let frontend check whether the response is for the current >>> + * request or not. >> >> I'd rather let the frontend add a request Id to the request which >> will be echoed here instead cmd and clock Id. > > If using request id, I think we can encode cmd and id into request id? This should just be an opaque value for the backend. The frontend is free how to create this value, it should be unique for every outstanding request of a domU, however. It could be built from cmd and id in case there can't be multiple requests with the same cmd/id combination active in a domU. juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
Hi, On 01/20/2016 06:15 PM, Haozhong Zhang wrote: CCing QEMU vNVDIMM maintainer: Xiao Guangrong Conceptually, an NVDIMM is just like a fast SSD which is linearly mapped into memory. I am still on the dom0 side of this fence. The real question is whether it is possible to take an NVDIMM, split it in half, give each half to two different guests (with appropriate NFIT tables) and that be sufficient for the guests to just work. Yes, one NVDIMM device can be split into multiple parts and assigned to different guests, and QEMU is responsible to maintain virtual NFIT tables for each part. Either way, it needs to be a toolstack policy decision as to how to split the resource. Currently, we are using NVDIMM as a block device and a DAX-based filesystem is created upon it in Linux so that file-related accesses directly reach the NVDIMM device. In KVM, If the NVDIMM device need to be shared by different VMs, we can create multiple files on the DAX-based filesystem and assign the file to each VMs. In the future, we can enable namespace (partition-like) for PMEM memory and assign the namespace to each VMs (current Linux driver uses the whole PMEM as a single namespace). I think it is not a easy work to let Xen hypervisor recognize NVDIMM device and manager NVDIMM resource. Thanks! ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH LIBVIRT] libxl: Support cmdline= in xl config files
On Tue, 2016-01-19 at 21:46 -0700, Jim Fehlig wrote: > On 01/19/2016 05:03 AM, Ian Campbell wrote: > > I went to ping this but noticed that I had sent it to "jimfehlig" (i.e. > > no > > domain), so no wonder there was no reply! > > > > To: line fixed here, let me know if you would prefer a resend. > > That would be much appreciated, thanks! > > > > > Ian. > > > > On Wed, 2015-12-16 at 12:09 +, Ian Campbell wrote: > > > ... and consolidate the cmdline/extra/root parsing to facilitate > > > doing > > > so. > > > > > > The logic is the same as xl's parse_cmdline from the current xen.git > > > master > > > branch (e6f0e099d2c17de47fd86e817b1998db903cab61), except I was > > > unable > > > to figure out how/where to route the warning about ignoring > > > root+extra if cmdline was specified. > > I think VIR_WARN() would be appropriate. Ok, will do, thanks. > > > > > > Signed-off-by: Ian Campbell > > > --- > > > src/xenconfig/xen_xl.c | 62 ++ > > > > > > > > > 1 file changed, 37 insertions(+), 25 deletions(-) > > > > > > diff --git a/src/xenconfig/xen_xl.c b/src/xenconfig/xen_xl.c > > > index 91cdff6..ba8b938 100644 > > > --- a/src/xenconfig/xen_xl.c > > > +++ b/src/xenconfig/xen_xl.c > > > @@ -58,11 +58,45 @@ extern int xlu_disk_parse(XLU_Config *cfg, > > > libxl_device_disk *disk); > > > #endif > > > > > > +static int xenParseCmdline(virConfPtr conf, char **r_cmdline) > > > +{ > > > +char *cmdline = NULL; > > > +const char *root = NULL, *extra = NULL, *buf = NULL; > > In theory, these three don't need to be initialized since > xenConfigGetString > will do that. But in practice, I worry that Coverity might complain :-/. It looks like some of the callers of xenConfigGetString initialise the value to NULL, while others don't. I can't see any public libvirt scan results to look if some of the ones which don't have been picked up or not. I've just noticed also that the code I am moving/removing didn't initialise to NULL, so I think I'll remove these initialisers. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable test] 78525: regressions - FAIL
flight 78525 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/78525/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-xl-qemut-win7-amd64 9 windows-installfail REGR. vs. 77892 test-amd64-i386-xl-qemuu-win7-amd64 9 windows-installfail REGR. vs. 77892 Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-rtds15 guest-start/debian.repeat fail blocked in 77892 test-amd64-i386-rumpuserxen-i386 10 guest-startfail like 77892 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail like 77892 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 77892 test-amd64-amd64-libvirt-vhd 9 debian-di-installfail like 77892 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 9 debian-di-installfail never pass version targeted for testing: xen 0c2f1283486f953604a282188a06e2db143f475d baseline version: xen f7347a282420a5edc74afb31e7c42c2765f24de5 Last test of basis77892 2016-01-12 11:30:40 Z7 days Failing since 77945 2016-01-13 04:01:14 Z7 days 11 attempts Testing same since78525 2016-01-19 13:38:51 Z0 days1 attempts People who touched revisions under test: Andrew Cooper Ashwin Chaugule Boris Ostrovsky Brendan Gregg Daniel De Graaf Doug Goldstein Edgar E. Iglesias Feng Tang Grant Likely Hanjun Guo Haozhong Zhang Ian Campbell Ian Jackson Jan Beulich Jonathan Creekmore Juergen Gross Kevin Tian Konrad Rzeszutek Wilk Len Brown Parth Dixit Paul Durrant Rafael J. Wysocki Razvan Cojocaru Roger Pau Monne Roger Pau Monné Shannon Zhao Tamas K Lengyel Tomasz Nowicki Wei Liu Wei Liu for tools bits jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt
[Xen-devel] OSSTEST: Re-blessing cubietruck-{picasso, gleizes, metzinger} for production use
I've been running a couple of adhoc production tests per day on these since before Xmas and they haven't lost sight of their disks again. TLDR; I think we should throw them back in the pool. With the recent timeout fixes they are working as well as the production cubietruck-braque. There are two flakey tests test-armhf-armhf-xl-rtds and test-armhf-armhf- libvirt-raw, but those appear to be much better than before the timeout changes and not specific to these three boards since the fourth one looks to behave much the same. At first glance it looks like some later test steps might just need a bit more time on CT too. FWIW comparison of production vs commission runs can be seen at http://osstest.test-lab.xenproject.org/~ianc/commission/history/test-armhf-armhf-xl-rtds/xen-unstable.html vs http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-xl-rtds/xen-unstable.html and http://osstest.test-lab.xenproject.org/~ianc/commission/history/test-armhf-armhf-libvirt-raw/xen-unstable.html vs http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-libvirt-raw/xen-unstable.html (nb: osstest e23886aa8fe7 is when the timeout fixes went live) Unfortunately the second one has always run on arndale-*, which doesn't give any data on cubietruck. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [linux-mingo-tip-master test] 78581: regressions - FAIL
flight 78581 linux-mingo-tip-master real [real] http://logs.test-lab.xenproject.org/osstest/logs/78581/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-rumpuserxen-i386 10 guest-start fail REGR. vs. 60684 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail blocked in 60684 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-pvh-intel 14 guest-saverestorefail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1 fail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail never pass version targeted for testing: linux3d1c05c9c775c565b6f40179e42a5cf8b6043d4e baseline version: linux69f75ebe3b1d1e636c4ce0a0ee248edacc69cbe0 Last test of basis60684 2015-08-13 04:21:46 Z 160 days Failing since 60712 2015-08-15 18:33:48 Z 157 days 109 attempts Testing same since78581 2016-01-20 04:20:47 Z0 days1 attempts jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen pass test-amd64-amd64-xl pass test-amd64-i386-xl pass test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsmpass test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm pass test-amd64-amd64-libvirt-xsm pass test-amd64-i386-libvirt-xsm pass test-amd64-amd64-xl-xsm pass test-amd64-i386-xl-xsm pass test-amd64-amd64-qemuu-nested-amdfail test-amd64-amd64-xl-pvh-amd fail test-amd64-i386-qemut-rhel6hvm-amd pass test-amd64-i386-qemuu-rhel6hvm-amd pass test-amd64-amd64-xl-qemut-debianhvm-amd64pass test-amd64-i386-xl-qemut-debianhvm-amd64 pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-i386-xl-qemuu-debianhvm-amd64 pass test-amd64-i386-freebsd10-amd64 pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass test-amd64-amd64-rumpuserxen-amd64 pass test-amd64-amd64-xl-qemut-win7-amd64 fail test-amd64-i386-xl-qemut-win7-amd64 pass test-amd64-amd64-xl-qemuu-win7-amd64 fail test-amd64-i386-xl-qemuu-win7-amd64 fail test-amd64-amd64-xl-credit2 pass test-amd64-i386-freebsd10-i386 pass test-amd64-i386-rumpuserxen-i386 fail tes
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
On 01/20/16 01:46, Jan Beulich wrote: > >>> On 20.01.16 at 06:31, wrote: > > The primary reason of current solution is to reuse existing NVDIMM > > driver in Linux kernel. > CC'ing QEMU vNVDIMM maintainer: Xiao Guangrong > Re-using code in the Dom0 kernel has benefits and drawbacks, and > in any event needs to depend on proper layering to remain in place. > A benefit is less code duplication between Xen and Linux; along the > same lines a drawback is code duplication between various Dom0 > OS variants. > Not clear about other Dom0 OS. But for Linux, it already has a NVDIMM driver since 4.2. > > One responsibility of this driver is to discover NVDIMM devices and > > their parameters (e.g. which portion of an NVDIMM device can be mapped > > into the system address space and which address it is mapped to) by > > parsing ACPI NFIT tables. Looking at the NFIT spec in Sec 5.2.25 of > > ACPI Specification v6 and the actual code in Linux kernel > > (drivers/acpi/nfit.*), it's not a trivial task. > > To answer one of Kevin's questions: The NFIT table doesn't appear > to require the ACPI interpreter. They seem more like SRAT and SLIT. Sorry, I made a mistake in another reply. NFIT does not contain anything requiring ACPI interpreter. But there are some _DSM methods for NVDIMM in SSDT, which needs ACPI interpreter. > Also you failed to answer Kevin's question regarding E820 entries: I > think NVDIMM (or at least parts thereof) get represented in E820 (or > the EFI memory map), and if that's the case this would be a very > strong hint towards management needing to be in the hypervisor. > Legacy NVDIMM devices may use E820 entries or other ad-hoc ways to announce their locations, but newer ones that follow ACPI v6 spec do not need E820 any more and only need ACPI NFIT (i.e. firmware may not build E820 entries for them). The current linux kernel can handle both legacy and new NVDIMM devices and provide the same block device interface for them. > > Secondly, the driver implements a convenient block device interface to > > let software access areas where NVDIMM devices are mapped. The > > existing vNVDIMM implementation in QEMU uses this interface. > > > > As Linux NVDIMM driver has already done above, why do we bother to > > reimplement them in Xen? > > See above; a possibility is that we may need a split model (block > layer parts on Dom0, "normal memory" parts in the hypervisor. > Iirc the split is being determined by firmware, and hence set in > stone by the time OS (or hypervisor) boot starts. > For the "normal memory" parts, do you mean parts that map the host NVDIMM device's address space range to the guest? I'm going to implement that part in hypervisor and expose it as a hypercall so that it can be used by QEMU. Haozhong ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
On Wed, 2016-01-20 at 01:35 -0700, Jan Beulich wrote: > > > > On 20.01.16 at 08:49, wrote: > > > > > We need to call arch_vcpu_block() before > > local_events_need_delivery(), > > since VT-d hardware can issue notification event when we are in > > arch_vcpu_block(), it that happens, 'ON' bit is set, then we will > > check > > it in local_events_need_delivery(). If we do arch_vcpu_block() in > > the > > else part, we cannot handle this. This is one reason I can recall, > > I am > > not sure whether there are other concerns since it has been really > > a long time. The current implementation is fully discussed with > > scheduler > > maintainers. > > Okay, this all makes sense. > Yes, I proposed / asked about that here: http://bugs.xenproject.org/xen/mid/%3c1445948204.2937.130.ca...@citrix.com%3E As Feng says, the whole idea is doing one last check before actually blocking, to see whether we can avoid doing so, if an event has arrived in the meantime. It can be seen as an optimization (whether a really effective one, I can't tell). In fact, an event can well come just immediately after local_events_need_delivery() has returned false! We somehow do the same thing already in vcpu_block(), and consistency with that was what convinced me that we can live with Feng's code as it is in this patch. > It's just that I don't see how the ON > bit would get checked by local_events_need_delivery(). But that > may in turn be because, with the long turnaround, I've lost some > of the details of the rest of this series. > I think what Feng means is "if that happens [== that an event arrives], 'ON' bit is set, then we will check it [== whether an event has arrived] in local_events_need_delivery()." If *no* event has arrived, no need to do anything particular, we can just block. If an event *has* arrived in time for local_events_need_delivery() to see it, we avoid blocking. We do need to rollback what arch_vcpu_block() did, and that is done while leaving the hypervisor, as soon as we realize we haven't blocked. If we put the hook in 'else', we'd always go down the full blocking path, even if an event became pending while arranging for that, and we'll have to be woken again, as soon as the event is handled. How effective an optimization this is depends on how probable and frequent it is that we get interrupts during the execution of the hook, which, as said, it's not easy to either just tell or measure. Regards, Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter max_ranges.
On 1/20/2016 6:18 PM, Paul Durrant wrote: -Original Message- From: Ian Campbell [mailto:ian.campb...@citrix.com] Sent: 20 January 2016 10:16 To: Kevin Tian; Yu, Zhang; Wei Liu; Paul Durrant Cc: Keir (Xen.org); jbeul...@suse.com; Andrew Cooper; xen- de...@lists.xen.org; Lv, Zhiyuan; Stefano Stabellini Subject: Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter max_ranges. On Wed, 2016-01-20 at 03:58 +, Tian, Kevin wrote: From: Yu, Zhang [mailto:yu.c.zh...@linux.intel.com] Sent: Wednesday, January 20, 2016 11:33 AM As a feature this write-protection has nothing to be GPU virtualization specific. In the future the same mediated pass-through idea used in XenGT may be used on other I/O devices which need to shadow some structure w/ requirement to write-protect guest memory. So it's not good to tie this to either XenGT or GTT. Thank you, Kevin. Well, if this parameter is not supposed to be xengt specific, we do not need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1". Hence the user will have to configure the max_wp_ram_ranges himself, right? Not always. The option can be configured manually by the user, or automatically set in the code when "vgt=1" is recognized. Is the latter approach not always sufficient? IOW, if it can be done automatically, why would the user need to tweak it? I think latter is sufficient for now. We always have the option of adding a specific wp_ram_ranges parameter in future if there is a need Thank you all for your reply. Well, I believe the latter option is only sufficient for most usage models on BDW due to rangeset's ability to merge continuous pages into one range, but there might be some extreme cases, e.g. too many graphic related applications in one VM, which create a great deal of per-process graphic translation tables. And also, future cpu platforms might provide even more PPGGTs. So, I suggest we use this max_wp_ram_ranges, and give the control to the system administrator. Besides, like Kevin said, XenGT's mediated pass-thru idea can also be adopted to other devices, and this parameter may also help. Also, we have plans to upstream the tool-stack changes later this year. If this max_wp_ram_ranges is not convenient, we can introduce a method to automatically set its default value. B.R. Yu Paul Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
> -Original Message- > From: Dario Faggioli [mailto:dario.faggi...@citrix.com] > Sent: Wednesday, January 20, 2016 7:13 PM > To: Jan Beulich ; Wu, Feng > Cc: Andrew Cooper ; George Dunlap > ; Tian, Kevin ; xen- > de...@lists.xen.org; Keir Fraser > Subject: Re: [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling > > On Wed, 2016-01-20 at 01:35 -0700, Jan Beulich wrote: > > > > > On 20.01.16 at 08:49, wrote: > > > > > > If we put the hook in 'else', we'd always go down the full blocking > path, even if an event became pending while arranging for that, and > we'll have to be woken again, as soon as the event is handled. How > effective an optimization this is depends on how probable and frequent > it is that we get interrupts during the execution of the hook, which, > as said, it's not easy to either just tell or measure. Thanks a lot for your explanation, Dario. Thanks, Feng ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
> -Original Message- > From: Jan Beulich [mailto:jbeul...@suse.com] > Sent: Wednesday, January 20, 2016 4:35 PM > To: Wu, Feng > Cc: Andrew Cooper ; Dario Faggioli > ; George Dunlap ; > Tian, Kevin ; xen-devel@lists.xen.org; Keir Fraser > > Subject: RE: [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling > > >>> On 20.01.16 at 08:49, wrote: > >> From: Jan Beulich [mailto:jbeul...@suse.com] > >> Sent: Monday, January 18, 2016 11:14 PM > >> >>> On 03.12.15 at 09:35, wrote: > >> > --- a/xen/arch/x86/hvm/vmx/vmx.c > >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c > >> > @@ -83,7 +83,131 @@ static int vmx_msr_write_intercept(unsigned int > >> msr, uint64_t msr_content); > >> > +ASSERT(v->arch.hvm_vmx.pi_block_cpu == NR_CPUS); > >> > + > >> > +/* > >> > + * The vCPU is blocking, we need to add it to one of the per pCPU > lists. > >> > + * We save v->processor to v->arch.hvm_vmx.pi_block_cpu and use > it > >> for > >> > + * the per-CPU list, we also save it to posted-interrupt descriptor > and > >> > + * make it as the destination of the wake-up notification event. > >> > + */ > >> > +v->arch.hvm_vmx.pi_block_cpu = v->processor; > >> > + > >> > +spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock, > >> > + v->arch.hvm_vmx.pi_block_cpu), flags); > >> > +list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list, > >> > + &per_cpu(pi_blocked_vcpu, v- > >arch.hvm_vmx.pi_block_cpu)); > >> > +spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock, > >> > + v->arch.hvm_vmx.pi_block_cpu), flags); > >> > + > >> > +ASSERT(!pi_test_sn(pi_desc)); > >> > + > >> > +/* > >> > + * We don't need to set the 'NDST' field, since it should point to > >> > + * the same pCPU as v->processor. > >> > + */ > >> > + > >> > +write_atomic(&pi_desc->nv, pi_wakeup_vector); > >> > >> Stray blank line between comment and corresponding code. > >> > >> Also the ASSERT() is rather more disconnected from the write > >> than seems desirable: Wouldn't it be better to cmpxchg() the > >> whole 32-bit value, validating that SN is clear in the result? > > > > Not quite understand this. The control field is 64 bits, do you > > mean cmpxchg() the whole 64-bit value like follows: > > > > +do { > > +old.control = new.control = pi_desc->control; > > +new.nv = pi_wakeup_vector; > > +} while ( cmpxchg(&pi_desc->control, old.control, new.control) > > + != old.control ); > > > > This a somewhat like the implementation in the earlier versions, > > why do you want to change it back? > > Did you read what I've said? I'm worried about the disconnect of > assertion and update: You really care about SN being clear > _after_ the update aiui. No, the ASSERT has no connection with the update. the SN bit should be clear before updating pi_desc->nv, adding ASSERT here is just kind of protection code. > And since NDST doesn't change, a 32-bit > CMPXCHG would seem to suffice. > > >> > +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu; > >> > +if ( pi_block_cpu == NR_CPUS ) > >> > +return; > >> > >> Are this condition and the one in the immediately preceding if() > >> connected in any way? > > > > I am not sure I understand this correctly. Which one is > > " the one in the immediately preceding if()"? > > If you hadn't ripped out too much of the context, it would be a > matter of pointing you back up. Now I have to re-quote the full > code: > > +if ( pi_desc->nv != posted_intr_vector ) > +write_atomic(&pi_desc->nv, posted_intr_vector); > + > +/* the vCPU is not on any blocking list. */ > +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu; > +if ( pi_block_cpu == NR_CPUS ) > +return; > > These are the two if()-s the question is about. > > >> I.e. could the latter perhaps become an > >> ASSERT() by connecting the two blocks suitably? I'm simply > >> trying to limit the number of different cases to consider mentally > >> when looking at this code ... > > > > If we get a true from above check, it means the vcpu was removed by > > pi_wakeup_interrupt(), then we just need to return. If we get a false, > > we will acquire the spinlock as the following code does, there are two > > scenarios: > > - pi_wakeup_interrupt() acquires the spinlock before us, then when > > we get the lock, 'v->arch.hvm_vmx.pi_block_cpu' has already been > > set to NR_CPUS by pi_wakeup_interrupt(), we need do nothing. > > - We get the spinlock before pi_wakeup_interrupt(), this time we need > > to remove the vCPU from the list and set 'v->arch.hvm_vmx.pi_block_cpu' > > to NR_CPUS. > > This is all only about the second condition, but the question really is > whether one being true or false may imply the result of other. In > which case this would better be ASSERT()ed than handled by two > conditionals. Thanks for the explanation. I don't think there are any connections Betwee
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
Hi Jan, On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote: On 20.01.16 at 09:31, wrote: >> +/* >> + * Backend response >> + * >> + * cmd: command for operation on clk, same with the cmd in request. >> + * id: clk id, same with the id in request. >> + * success: indicate failure or success for the cmd. >> + * rate: clock rate. Used for get rate. >> + * >> + * cmd and id are filled by backend and passed to frontend to >> + * let frontend check whether the response is for the current >> + * request or not. >> + */ >> +struct xen_clkif_response { >> +uint32_t cmd; >> +uint32_t id; >> +uint32_t success; >> +uint64_t rate; >> +}; > >This isn't 32-/64-bit clean. Question is whether echoing cmd is really >needed. As Juergen suggested, use a request id. I'll fix this in V3. 32-/64-bit unclean, I can not get you here (: > >Also naming a field "success" is pretty odd - is this mean to be a >boolean? Better name it e.g. status, allowing for different (error) >indicators. As you suggested, how about `int status`? And in this clkif.h, define different status value, such as `#define XEN_CLK_PREPARE_OK 0, #define XEN_CLK_PREPARE_FAILURE -1`. The frontend and backend should be aware of the status value. > >And what I'm missing throughout the file is some description of how >clock events (interrupts?) are actually meant to make it into the >guest. I have a simple implementation v1 patch for linux, http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html. You can see how it works. Thanks, Peng. > >Jan > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
Hi Juergen, On Wed, Jan 20, 2016 at 11:40:55AM +0100, Juergen Gross wrote: >On 20/01/16 10:25, Peng Fan wrote: >> Hi Juergen, >> >> On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote: >>> On 20/01/16 09:31, Peng Fan wrote: Introduce pvclk interface which is useful when doing device passthrough on ARM platform. >>> >>> ... >>> +/* + * Frontend request + * + * cmd: command for operation on clk, should be XEN_CLK_[xx], + *excluding XEN_CLK_END. id is the + * id: clk id + * rate: clock rate. Used for set rate. >>> >>> Which unit? Hz? >> >> Yeah. Hz. I'll add comments in V3. >> >>> + */ +struct xen_clkif_request { + uint32_t cmd; + uint32_t id; + uint64_t rate; +}; +typedef struct xen_clkif_request xen_clkif_request_t; + +/* + * Backend response + * + * cmd: command for operation on clk, same with the cmd in request. + * id: clk id, same with the id in request. + * success: indicate failure or success for the cmd. >>> >>> Values? >> >> I'd like to let the frontend/backend driver to determine the value. >> In my implementation for linux, if the backend API supports return value, >> I'll fill the return value to success entry. If the backend API returns >> void, I'll just fill 0 to success entry. > >So please specify "0" to mean success and add some sensible error >values. There might be multiple frontend- and/or backend-variants which >all must agree using the same interface. Will change this to `int status`, as also observed by Jan. I'll also define macros such as "#define XEN_CLK_ENABLE_OK 0", "#define XEN_CLK_ENABLE_FAILURE -1" > + * rate: clock rate. Used for get rate. + * + * cmd and id are filled by backend and passed to frontend to + * let frontend check whether the response is for the current + * request or not. >>> >>> I'd rather let the frontend add a request Id to the request which >>> will be echoed here instead cmd and clock Id. >> >> If using request id, I think we can encode cmd and id into request id? > >This should just be an opaque value for the backend. The frontend is >free how to create this value, it should be unique for every outstanding >request of a domU, however. It could be built from cmd and id in case >there can't be multiple requests with the same cmd/id combination >active in a domU. Thanks for teaching me on this. So the request id should be globally unique for backend. So "random value(generated in frontend) + frontend domid" is ok for this? Thanks, Peng. > > >juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] OSSTEST: Re-blessing cubietruck-{picasso, gleizes, metzinger} for production use
Ian Campbell writes ("OSSTEST: Re-blessing cubietruck-{picasso,gleizes,metzinger} for production use"): > I've been running a couple of adhoc production tests per day on these since > before Xmas and they haven't lost sight of their disks again. > > TLDR; I think we should throw them back in the pool. Great. > With the recent timeout fixes they are working as well as the production > cubietruck-braque. > > There are two flakey tests test-armhf-armhf-xl-rtds and test-armhf-armhf- > libvirt-raw, but those appear to be much better than before the timeout > changes and not specific to these three boards since the fourth one looks > to behave much the same. > > At first glance it looks like some later test steps might just need a bit > more time on CT too. Maybe we should have target_adjust_timeout honour a host property to multiply timeouts by some factor. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 4/4] x86/PV: enable the emulated PIT
On Mon, 2016-01-18 at 18:58 +0100, Roger Pau Monné wrote: > You certainly are more familiar with the migration code than me, but > wouldn't it be enough to add a new field to libxl_domain_build_info > (uint32_t emulation_flags), and teach > libxl_domain_build_info_gen_json/libxl__domain_build_info_parse_json > how to properly parse it? Is libxl not already aware whether a domain is to be dmlite vs pv or hvm based on the domain config? As such can't it figure out the hardcoded set to use already without actually needing to expose this in libxl API (until we actually have a desire to expose such nobs to users)? Can't this work on resume just like it does on create? > This however raises the question about how to signal that the field is > not initialised, because 0 is a valid value (maybe ~0)? In libxl's API we tend to try and avoid opaque hex numbers, so the natural result would be a struct full of libxl_defbool options, which IIRC already get properly propagated and have handling for defaulting already in place. Ian. > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 1/5] libelf: rewrite symtab/strtab loading for Dom0
Current implementation of elf_load_bsdsyms is broken when loading inside of a HVM guest, because it assumes elf_memcpy_safe is able to write into guest memory space, which it is not. Take the oportunity to do some cleanup and properly document how elf_{parse/load}_bsdsyms works. The new implementation uses elf_load_image when dealing with data that needs to be copied to the guest memory space. Also reduce the number of section headers copied to the minimum necessary. Signed-off-by: Roger Pau Monné --- Cc: Ian Campbell Cc: Ian Jackson Cc: Jan Beulich Cc: Keir Fraser Cc: Tim Deegan --- xen/common/libelf/libelf-loader.c | 213 -- 1 file changed, 158 insertions(+), 55 deletions(-) diff --git a/xen/common/libelf/libelf-loader.c b/xen/common/libelf/libelf-loader.c index 6f42bea..9552d4c 100644 --- a/xen/common/libelf/libelf-loader.c +++ b/xen/common/libelf/libelf-loader.c @@ -153,7 +153,7 @@ void elf_parse_bsdsyms(struct elf_binary *elf, uint64_t pstart) { uint64_t sz; ELF_HANDLE_DECL(elf_shdr) shdr; -unsigned i, type; +unsigned int i; if ( !ELF_HANDLE_VALID(elf->sym_tab) ) return; @@ -164,20 +164,33 @@ void elf_parse_bsdsyms(struct elf_binary *elf, uint64_t pstart) sz = sizeof(uint32_t); /* Space for the elf and elf section headers */ -sz += (elf_uval(elf, elf->ehdr, e_ehsize) + - elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize)); +sz += elf_uval(elf, elf->ehdr, e_ehsize) + + 3 * elf_uval(elf, elf->ehdr, e_shentsize); sz = elf_round_up(elf, sz); -/* Space for the symbol and string tables. */ +/* Space for the symbol and string table. */ for ( i = 0; i < elf_shdr_count(elf); i++ ) { shdr = elf_shdr_by_index(elf, i); if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) /* input has an insane section header count field */ break; -type = elf_uval(elf, shdr, sh_type); -if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) ) -sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); + +if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB ) +continue; + +sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); +shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link)); + +if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) +/* input has an insane section header count field */ +break; + +if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB ) +/* Invalid symtab -> strtab link */ +break; + +sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); } elf->bsd_symtab_pstart = pstart; @@ -186,13 +199,31 @@ void elf_parse_bsdsyms(struct elf_binary *elf, uint64_t pstart) static void elf_load_bsdsyms(struct elf_binary *elf) { -ELF_HANDLE_DECL(elf_ehdr) sym_ehdr; -unsigned long sz; -elf_ptrval maxva; -elf_ptrval symbase; -elf_ptrval symtab_addr; -ELF_HANDLE_DECL(elf_shdr) shdr; -unsigned i, type; +/* + * Header that is placed at the end of the kernel and allows + * the OS to find where the symtab and strtab have been loaded. + * It mimics a valid ELF file header, although it only contains + * a symtab and a strtab section. + * + * NB: according to the ELF spec there's only ONE symtab per ELF + * file, and accordingly we will only load the corresponding + * strtab, so we only need three section headers in our fake ELF + * header (first section header is always a dummy). + */ +struct __packed { +elf_ehdr header; +elf_shdr section[3]; +} symbol_header; + +ELF_HANDLE_DECL(elf_ehdr) header_handle; +unsigned long shdr_size; +uint32_t symsize; +ELF_HANDLE_DECL(elf_shdr) section_handle; +ELF_HANDLE_DECL(elf_shdr) image_handle; +unsigned int i, link; +elf_ptrval header_base; +elf_ptrval symtab_base; +elf_ptrval strtab_base; if ( !elf->bsd_symtab_pstart ) return; @@ -200,64 +231,136 @@ static void elf_load_bsdsyms(struct elf_binary *elf) #define elf_hdr_elm(_elf, _hdr, _elm, _val) \ do {\ if ( elf_64bit(_elf) ) \ -elf_store_field(_elf, _hdr, e64._elm, _val); \ +(_hdr).e64._elm = _val; \ else\ -elf_store_field(_elf, _hdr, e32._elm, _val); \ +(_hdr).e32._elm = _val; \ } while ( 0 ) -symbase = elf_get_ptr(elf, elf->bsd_symtab_pstart); -symtab_addr = maxva = symbase + sizeof(uint32_t); +#define SYMTAB_INDEX1 +#define STRTAB_INDEX2 -/* Set up Elf header. */ -sym_ehdr = ELF_MAKE_HANDLE(elf_ehdr, symtab_addr); -sz = elf_uval(elf, elf->ehdr, e_ehsize); -elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(sym_ehdr), ELF_HANDLE_PTRVAL(elf->ehdr), s
[Xen-devel] [PATCH v3 2/5] libxl: introduce LIBXL_VGA_INTERFACE_TYPE_UNDEF
And use it as the default value for the VGA kind. This allows libxl to set it to the default value later on when the domain type is known. For HVM guests the default value is LIBXL_VGA_INTERFACE_TYPE_CIRRUS while for HVMlite the default value is LIBXL_VGA_INTERFACE_TYPE_NONE. Signed-off-by: Roger Pau Monné --- Cc: Ian Jackson Cc: Ian Campbell Cc: Wei Liu --- tools/libxl/libxl_create.c | 8 ++-- tools/libxl/libxl_dm.c | 6 ++ tools/libxl/libxl_types.idl | 3 ++- 3 files changed, 14 insertions(+), 3 deletions(-) diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index e491d83..61a4001 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -206,8 +206,12 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc, if (b_info->u.hvm.mmio_hole_memkb == LIBXL_MEMKB_DEFAULT) b_info->u.hvm.mmio_hole_memkb = 0; -if (!b_info->u.hvm.vga.kind) -b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS; +if (b_info->u.hvm.vga.kind == LIBXL_VGA_INTERFACE_TYPE_UNDEF) { +if (b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE) +b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_NONE; +else +b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS; +} if (!b_info->u.hvm.hdtype) b_info->u.hvm.hdtype = LIBXL_HDTYPE_IDE; diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index a088d71..9aa0cc8 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -531,6 +531,9 @@ static int libxl__build_device_model_args_old(libxl__gc *gc, break; case LIBXL_VGA_INTERFACE_TYPE_QXL: break; +default: +LOG(ERROR, "Invalid emulated video card specified"); +return ERROR_INVAL; } if (b_info->u.hvm.boot) { @@ -970,6 +973,9 @@ static int libxl__build_device_model_args_new(libxl__gc *gc, GCSPRINTF("qxl-vga,vram_size_mb=%"PRIu64",ram_size_mb=%"PRIu64, (b_info->video_memkb/2/1024), (b_info->video_memkb/2/1024) ) ); break; +default: +LOG(ERROR, "Invalid emulated video card specified"); +return ERROR_INVAL; } if (b_info->u.hvm.boot) { diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 9ad7eba..92c95e5 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -204,11 +204,12 @@ libxl_shutdown_reason = Enumeration("shutdown_reason", [ ], init_val = "LIBXL_SHUTDOWN_REASON_UNKNOWN") libxl_vga_interface_type = Enumeration("vga_interface_type", [ +(0, "UNDEF"), (1, "CIRRUS"), (2, "STD"), (3, "NONE"), (4, "QXL"), -], init_val = "LIBXL_VGA_INTERFACE_TYPE_CIRRUS") +], init_val = "LIBXL_VGA_INTERFACE_TYPE_UNDEF") libxl_vendor_device = Enumeration("vendor_device", [ (0, "NONE"), -- 1.9.5 (Apple Git-50.3) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 3/5] libxl: initialise the build info before calling prepare_config
libxl__arch_domain_prepare_config has access to the libxl_domain_build_info struct, so make sure it's properly initialised. Signed-off-by: Roger Pau Monné --- Cc: Ian Jackson Cc: Ian Campbell Cc: Wei Liu --- NB: libxl__arch_domain_prepare_config is called from libxl__domain_make. --- tools/libxl/libxl_create.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index 61a4001..ba4c9e8 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -892,6 +892,12 @@ static void initiate_domain_create(libxl__egc *egc, goto error_out; } +ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info); +if (ret) { +LOG(ERROR, "Unable to set domain build info defaults"); +goto error_out; +} + ret = libxl__domain_make(gc, d_config, &domid, &state->config); if (ret) { LOG(ERROR, "cannot make domain: %d", ret); @@ -903,12 +909,6 @@ static void initiate_domain_create(libxl__egc *egc, dcs->guest_domid = domid; dcs->dmss.dm.guest_domid = 0; /* means we haven't spawned */ -ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info); -if (ret) { -LOG(ERROR, "Unable to set domain build info defaults"); -goto error_out; -} - if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM && (libxl_defbool_val(d_config->b_info.u.hvm.nested_hvm) && libxl_defbool_val(d_config->b_info.u.hvm.altp2m))) { -- 1.9.5 (Apple Git-50.3) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 5/5] libxl: add options to enable/disable emulated devices
Allow enabling or disabling emulated devices from the libxl domain configuration file. For HVM guests with a device model all the emulated devices are enabled. For HVM guests without a device model no devices are enabled by default, although they can be enabled using the options provided. The arbiter of whether a combination is posible or not is always Xen, libxl doesn't do any kind of check. This set of options is also propagated inside of the libxl migration record as part of the contents of the libxl_domain_build_info struct. Signed-off-by: Roger Pau Monné --- Cc: Ian Jackson Cc: Ian Campbell Cc: Wei Liu --- docs/man/xl.cfg.pod.5 | 39 +++ tools/libxl/libxl.h | 11 +++ tools/libxl/libxl_create.c | 21 - tools/libxl/libxl_types.idl | 6 ++ tools/libxl/libxl_x86.c | 33 - tools/libxl/xl_cmdimpl.c| 7 +++ 6 files changed, 111 insertions(+), 6 deletions(-) diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5 index 8899f75..46d4529 100644 --- a/docs/man/xl.cfg.pod.5 +++ b/docs/man/xl.cfg.pod.5 @@ -1762,6 +1762,45 @@ See F for more information. =back +=head3 HVM without a device model options + +This options can be used to change the set of emulated devices provided +to guests without a device model. Note that Xen might not support all +possible combinations. By default HVM guests without a device model +don't have any of them enabled. + +=over 4 + +=item B + +Enables or disables the Local APIC. + +=item B + +Enables or disables the IO APIC. + +=item B + +Enables or disables the RTC. + +=item B + +Enables or disables the ACPI power management timer and control interfaces. + +=item B + +Enables or disables the PIC. + +=item B + +Enables or disables the PIT. + +=item B + +Enables or disables the HPET. + +=back + =head2 Device-Model Options The following options control the selection of the device-model. This diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 7114491..fc4ff1d 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -876,6 +876,17 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src); */ #define LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE 1 +/* + * LIBXL_HAVE_EMULATED_DEVS_OPTIONS + * + * In the case that LIBXL_HAVE_EMULATED_DEVS_OPTIONS is set libxl + * allows enabling or disabling emulated devices for HVM guests + * without a device model. The following fields are added to the + * hvm structure inside of libxl_domain_build_info: lapic, ioapic, + * rtc, power_management, pic, pit. + */ +#define LIBXL_HAVE_EMULATED_DEVS_OPTIONS 1 + typedef char **libxl_string_list; void libxl_string_list_dispose(libxl_string_list *sl); int libxl_string_list_length(const libxl_string_list *sl); diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index ba4c9e8..8a76a6b 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -295,13 +295,32 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc, libxl_defbool_setdefault(&b_info->u.hvm.acpi_s4,true); libxl_defbool_setdefault(&b_info->u.hvm.nx, true); libxl_defbool_setdefault(&b_info->u.hvm.viridian, false); -libxl_defbool_setdefault(&b_info->u.hvm.hpet, true); libxl_defbool_setdefault(&b_info->u.hvm.vpt_align, true); libxl_defbool_setdefault(&b_info->u.hvm.nested_hvm, false); libxl_defbool_setdefault(&b_info->u.hvm.altp2m, false); libxl_defbool_setdefault(&b_info->u.hvm.usb,false); libxl_defbool_setdefault(&b_info->u.hvm.xen_platform_pci, true); +if (b_info->type == LIBXL_DOMAIN_TYPE_HVM && +b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE) +{ +libxl_defbool_setdefault(&b_info->u.hvm.hpet, false); +libxl_defbool_setdefault(&b_info->u.hvm.lapic, false); +libxl_defbool_setdefault(&b_info->u.hvm.ioapic, false); +libxl_defbool_setdefault(&b_info->u.hvm.rtc,false); +libxl_defbool_setdefault(&b_info->u.hvm.power_management, false); +libxl_defbool_setdefault(&b_info->u.hvm.pic,false); +libxl_defbool_setdefault(&b_info->u.hvm.pit,false); +} else { +libxl_defbool_setdefault(&b_info->u.hvm.hpet, true); +libxl_defbool_setdefault(&b_info->u.hvm.lapic, true); +libxl_defbool_setdefault(&b_info->u.hvm.ioapic, true); +libxl_defbool_setdefault(&b_info->u.hvm.rtc,true); +libxl_defbool_setdefault(&b_info->u.hvm.power_management, true); +libxl_defbool_setdefault(&b_info->u.hvm.pic,true); +libxl_defbool_setdefault
[Xen-devel] [PATCH v3 0/5] HVMlite: DomU fixes and a Dom0 preparatory patch
Hello, The series is sorted in the following way: - Patch 1/5 is a preparatory patch for Dom0 HVMlite support. - Patch 4/5 is a fix from a fallout introduced by the HVMlite series, which inadvertently disabled the emulated PIT that was added to PV(H) guests. - Patches 2, 3 and 5 expand the fields of the libxl_domain_build_info structure in order to store whether emulated devices inside of Xen are enabled or not. - NOTE: patch 4/5 and 5/5 should be committed together. I haven't squashed them together in order to ease the review, but I can do so in the next iteration if requested. The full series can also be found in my git tree: http://xenbits.xen.org/gitweb/?p=people/royger/xen.git;a=shortlog;h=refs/heads/hvmlite_fixes_v3 I hope this will help review of patch 1/5, which is a significant rework, and will probably be better reviewed by looking at the resulting code. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 4/5] x86/PV: allow PV guests to have an emulated PIT
This fixes the fallout from the HVMlite series, that removed the emulated PIT from PV(H) guests. Also, this patch forces the hardware domain to always have an emulated PIT, regardless of whether the toolstack specified one or not. Signed-off-by: Roger Pau Monné --- Cc: Jan Beulich Cc: Andrew Cooper --- Changes since v2: - Force the emulated PIT to always be enabled for the hardware domain. - Change indentation of the valid set of emulation bitmaps check. --- xen/arch/x86/domain.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index e70c125..78df5ae 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -542,8 +542,11 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags, d->domain_id, config->emulation_flags); return -EINVAL; } -if ( config->emulation_flags != 0 && - (!is_hvm_domain(d) || config->emulation_flags != XEN_X86_EMU_ALL) ) +if ( is_hardware_domain(d) ) +config->emulation_flags |= XEN_X86_EMU_PIT; +if ( !is_hvm_domain(d) ? (config->emulation_flags != XEN_X86_EMU_PIT) : + (config->emulation_flags != XEN_X86_EMU_ALL && + config->emulation_flags != 0)) { printk(XENLOG_G_ERR "d%d: Xen does not allow %s domain creation " "with the current selection of emulators: %#x\n", -- 1.9.5 (Apple Git-50.3) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] OSSTEST: Re-blessing cubietruck-{picasso, gleizes, metzinger} for production use
On Wed, 2016-01-20 at 11:52 +, Ian Jackson wrote: > Ian Campbell writes ("OSSTEST: Re-blessing cubietruck- > {picasso,gleizes,metzinger} for production use"): > > I've been running a couple of adhoc production tests per day on these > > since > > before Xmas and they haven't lost sight of their disks again. > > > > TLDR; I think we should throw them back in the pool. > > Great. I'll take this as a "yes, go ahead" ;-) > > With the recent timeout fixes they are working as well as the > > production > > cubietruck-braque. > > > > There are two flakey tests test-armhf-armhf-xl-rtds and test-armhf- > > armhf- > > libvirt-raw, but those appear to be much better than before the timeout > > changes and not specific to these three boards since the fourth one > > looks > > to behave much the same. > > > > At first glance it looks like some later test steps might just need a > > bit > > more time on CT too. > > Maybe we should have target_adjust_timeout honour a host property to > multiply timeouts by some factor. That's not a bad idea, assuming the remaining issues really are timeouts of this sort. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [ovmf test] 78583: regressions - FAIL
flight 78583 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/78583/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-xl-qemuu-ovmf-amd64 17 guest-start/debianhvm.repeat fail REGR. vs. 65543 test-amd64-amd64-xl-qemuu-ovmf-amd64 17 guest-start/debianhvm.repeat fail REGR. vs. 65543 version targeted for testing: ovmf 5af229d81f15da8cb97e817ccdf9a10177d6b4c4 baseline version: ovmf 5ac96e3a28dd26eabee421919f67fa7c443a47f1 Last test of basis65543 2015-12-08 08:45:15 Z 43 days Failing since 65593 2015-12-08 23:44:51 Z 42 days 43 attempts Testing same since78583 2016-01-20 04:26:02 Z0 days1 attempts People who touched revisions under test: "Samer El-Haj-Mahmoud" "Yao, Jiewen" Andrew Fish Ard Biesheuvel Cecil Sheng Chao Zhang Dandan Bi Daocheng Bu Daryl McDaniel Eric Dong Eric Dong Eugene Cohen Feng Tian Fu Siyuan Hao Wu Hess Chen Heyi Guo Jaben Carsey Jeff Fan Jiaxin Wu Jim Dailey Jordan Justen Larry Hauch Laszlo Ersek Leekha Shaveta Liming Gao Mark Rutland Michael Kinney Michael Thomas Paulo Alcantara Qin Long Qiu Shumin Ruiyu Ni Samer El-Haj-Mahmoud Samer El-Haj-Mahmoud Star Zeng Tapan Shah Yao Jiewen Yao, Jiewen Ye Ting Yonghong Zhu Zhang Lubo jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 fail test-amd64-i386-xl-qemuu-ovmf-amd64 fail sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. (No revision log; it would be 5795 lines long.) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH LIBVIRT v2] libxl: Support cmdline= in xl config files
... and consolidate the cmdline/extra/root parsing to facilitate doing so. The logic is the same as xl's parse_cmdline from the current xen.git master branch (e6f0e099d2c17de47fd86e817b1998db903cab61). In order to introduce a use of VIR_WARN for logging I had to add virerror.h and VIR_LOG_INIT. Signed-off-by: Ian Campbell --- NB: I was unsure if I was supposed to change VIR_FROM_NONE into VIR_FROM_XEN, so I didn't (but will on advice). v2: Use VIR_INFO (adding necessary infra) Don't initialise things to NULL when there is no need. --- src/xenconfig/xen_xl.c | 66 +++--- 1 file changed, 41 insertions(+), 25 deletions(-) diff --git a/src/xenconfig/xen_xl.c b/src/xenconfig/xen_xl.c index 91cdff6..3d820cc 100644 --- a/src/xenconfig/xen_xl.c +++ b/src/xenconfig/xen_xl.c @@ -27,6 +27,7 @@ #include "virconf.h" #include "virerror.h" +#include "virlog.h" #include "domain_conf.h" #include "viralloc.h" #include "virstring.h" @@ -35,6 +36,8 @@ #define VIR_FROM_THIS VIR_FROM_NONE +VIR_LOG_INIT("xen.xen_xl"); + /* * Xen provides a libxl utility library, with several useful functions, * specifically xlu_disk_parse for parsing xl disk config strings. @@ -58,11 +61,46 @@ extern int xlu_disk_parse(XLU_Config *cfg, libxl_device_disk *disk); #endif +static int xenParseCmdline(virConfPtr conf, char **r_cmdline) +{ +char *cmdline; +const char *root, *extra, *buf; + +if (xenConfigGetString(conf, "cmdline", &buf, NULL) < 0) +return -1; + +if (xenConfigGetString(conf, "root", &root, NULL) < 0) +return -1; + +if (xenConfigGetString(conf, "extra", &extra, NULL) < 0) +return -1; + +if (buf) { +if (VIR_STRDUP(cmdline, buf) < 0) +return -1; +if (root || extra) +VIR_WARN("ignoring root= and extra= in favour of cmdline="); +} else { +if (root && extra) { +if (virAsprintf(&cmdline, "root=%s %s", root, extra) < 0) +return -1; +} else if (root) { +if (virAsprintf(&cmdline, "root=%s", root) < 0) +return -1; +} else if (extra) { +if (VIR_STRDUP(cmdline, extra) < 0) +return -1; +} +} + +*r_cmdline = cmdline; +return 0; +} + static int xenParseXLOS(virConfPtr conf, virDomainDefPtr def, virCapsPtr caps) { size_t i; -const char *extra, *root; if (def->os.type == VIR_DOMAIN_OSTYPE_HVM) { const char *boot; @@ -84,19 +122,8 @@ xenParseXLOS(virConfPtr conf, virDomainDefPtr def, virCapsPtr caps) if (xenConfigCopyStringOpt(conf, "ramdisk", &def->os.initrd) < 0) return -1; -if (xenConfigGetString(conf, "extra", &extra, NULL) < 0) -return -1; - -if (xenConfigGetString(conf, "root", &root, NULL) < 0) +if (xenParseCmdline(conf, &def->os.cmdline) < 0) return -1; - -if (root) { -if (virAsprintf(&def->os.cmdline, "root=%s %s", root, extra) < 0) -return -1; -} else { -if (VIR_STRDUP(def->os.cmdline, extra) < 0) -return -1; -} #endif if (xenConfigGetString(conf, "boot", &boot, "c") < 0) @@ -132,19 +159,8 @@ xenParseXLOS(virConfPtr conf, virDomainDefPtr def, virCapsPtr caps) if (xenConfigCopyStringOpt(conf, "ramdisk", &def->os.initrd) < 0) return -1; -if (xenConfigGetString(conf, "extra", &extra, NULL) < 0) -return -1; - -if (xenConfigGetString(conf, "root", &root, NULL) < 0) +if (xenParseCmdline(conf, &def->os.cmdline) < 0) return -1; - -if (root) { -if (virAsprintf(&def->os.cmdline, "root=%s %s", root, extra) < 0) -return -1; -} else { -if (VIR_STRDUP(def->os.cmdline, extra) < 0) -return -1; -} } return 0; -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
On Wed, 20 Jan 2016, Peng Fan wrote: > To my use case, Dom0 and DomU both use device tree, I need to build > the mapping table between id and name, since I use name to lookup > the clk in backend, like this: > "clk = __clk_loopkup(clkname); clk_prepare_enable(clk)". Maybe ACPI > is another different case. Theoretically on systems using ACPI there is no need to fiddle with clocks, see Documentation/arm64/arm-acpi.txt "In DT, clocks need to be specified and the drivers need to take them into account. In ACPI, the assumption is that UEFI will leave the device in a reasonable default state, including any clock settings. If for some reason the driver needs to change a clock value, this can be done in an ACPI method; all the driver needs to do is invoke the method and not concern itself with what the method needs to do to change the clock. Changing the hardware can then take place over time by changing what the ACPI method does, and not the driver. In DT, the parameters needed by the driver to set up clocks as in the example above are known as "bindings"; in ACPI, these are known as "Device Properties" and provided to a driver via the _DSD object." However currently we don't have the ability to run ACPI in DomU guests on ARM. Even if we had, there is no way to call native ACPI methods from any guests other than Dom0, even on x86. We just have to hope that clocks don't need to be reset on ACPI systems. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] Xen Security Advisory 167 (CVE-2016-1570) - PV superpage functionality missing sanity checks
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Xen Security Advisory CVE-2016-1570 / XSA-167 version 4 PV superpage functionality missing sanity checks UPDATES IN VERSION 4 Public release. ISSUE DESCRIPTION = The PV superpage functionality lacks certain validity checks on data being passed to the hypervisor by guests. This is the case for the page identifier (MFN) passed to MMUEXT_MARK_SUPER and MMUEXT_UNMARK_SUPER sub-ops of the HYPERVISOR_mmuext_op hypercall as well as for various forms of page table updates. IMPACT == Use of the feature, which is disabled by default, may have unknown effects, ranging from information leaks through Denial of Service to privilege escalation. VULNERABLE SYSTEMS == Only systems which enable the PV superpage feature are affected. That is, only systems with an `allowsuperpage' setting on the hypervisor command line. Note that in Xen 4.0.x and 3.4.x the option is named `allowhugepage'. Xen versions 3.4.0, 3.4.1, and from 4.1 onwards are affected. Only x86 systems are affected. Only PV guests can exploit the vulnerability. MITIGATION == Running only HVM guests will avoid this issue. Not enabling PV superpage support (by omitting the `allowsuperpage' or `allowhugepage' hypervisor command line options) will avoid exposing the issue. CREDITS === This issue was discovered by Qinghao Tang of 360 Marvel Team. RESOLUTION == Applying the appropriate attached patch resolves this issue. xsa167.patch xen-unstable xsa167-4.6.patch Xen 4.6.x, 4.5.x xsa167-4.4.patch Xen 4.4.x, 4.3.x $ sha256sum xsa167* a71f709eef59425cb2113fa48d3b44048c6bf41063200fee1c847f6e0ed45a09 xsa167.patch 194c1ce89292f4cbb9980baa703095bcbeb5849abf46d193e07a98a0d8301f78 xsa167-4.4.patch 2bd786cccfd13c6732d6db8afc9e18058465efcb1bc93f894c359e3a820d5403 xsa167-4.6.patch $ DEPLOYMENT DURING EMBARGO = Deployment of the patches and/or mitigations described above (or others which are substantially similar) is permitted during the embargo, even on public-facing systems with untrusted guest users and administrators. However deployment of the SUPERPAGE DISABLEMENT MITIGATION is NOT permitted (except where all the affected systems and VMs are administered and used only by organisations which are members of the Xen Project Security Issues Predisclosure List). Specifically, deployment on public cloud systems is NOT permitted. This is because disabling PV superpage support is visible to guests, so such deployment could lead to the rediscovery of the vulnerability. Deployment of the mitigation is permitted only AFTER the embargo ends. Also: Distribution of updated software is prohibited (except to other members of the predisclosure list). Predisclosure list members who wish to deploy significantly different patches and/or mitigations, please contact the Xen Project Security Team. (Note: this during-embargo deployment notice is retained in post-embargo publicly released Xen Project advisories, even though it is then no longer applicable. This is to enable the community to have oversight of the Xen Project Security Team's decisionmaking.) For more information about permissible uses of embargoed information, consult the Xen Project community's agreed Security Policy: http://www.xenproject.org/security-policy.html -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAEBAgAGBQJWn3jEAAoJEIP+FMlX6CvZTOsH/2ReFJ0Yhp5da69XKvFEJR/s 0yEFxjvqiSyBPsWjyiaAdOp/1A2sltEeDDnMy7xEoXHmon0p6IV0IR4L+fMCLjl2 1ZI4tKpkn3zUE+IOjfu/GJ53f87XWSq/u9Ri7yZQdxFpgd3AXcLegGm8i4L/58iY vdwAAuczACztEN/NbWFedlGUEd5PKqKwb4wOg1uhLIMwzvjxgtejVAyZD83HgP6i LeWMO7EfeU8ND38Otiw9lNlKD/Ia7vpRG+BXuADLx18hbR1TU9AJ0RO1zb9JnAAj snYdgB6s1wzRD4/HOc+s1uaIttPPODs0IhZunylI7UVhdWKp5Qkszw/QUcmufnk= =5acB -END PGP SIGNATURE- xsa167.patch Description: Binary data xsa167-4.4.patch Description: Binary data xsa167-4.6.patch Description: Binary data ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] Xen Security Advisory 168 (CVE-2016-1571) - VMX: intercept issue with INVLPG on non-canonical address
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Xen Security Advisory CVE-2016-1571 / XSA-168 version 3 VMX: intercept issue with INVLPG on non-canonical address UPDATES IN VERSION 3 Public release. ISSUE DESCRIPTION = While INVLPG does not cause a General Protection Fault when used on a non-canonical address, INVVPID in its "individual address" variant, which is used to back the intercepted INVLPG in certain cases, fails in such cases. Failure of INVVPID results in a hypervisor bug check. IMPACT == A malicious guest can crash the host, leading to a Denial of Service. VULNERABLE SYSTEMS == Xen versions from 3.3 onwards are affected. Only systems using Intel or Cyrix CPUs are affected. ARM and AMD systems are unaffected. Only HVM guests using shadow mode paging can expose this vulnerability. PV guests, and HVM guests using Hardware Assisted Paging (also known as EPT on affected hardware), are unaffected. Note that while unsupported, guests with enabled nested virtualization are vulnerable even when using EPT. CHECKING FOR VULNERABLE CONFIGURATION = To discover whether your HVM guests are using HAP, or shadow page tables: request debug key `q' (from the Xen console, or with `xl debug-keys q'). This will print (to the console, and visible in `xl dmesg'), debug information for every domain, containing something like this: (XEN) General information for domain 2: (XEN) refcnt=1 dying=2 pause_count=2 (XEN) nr_pages=2 xenheap_pages=0 shared_pages=0 paged_pages=0 dirty_cpus={} max_pages=262400 (XEN) handle=ef58ef1a-784d-4e59-8079-42bdee87f219 vm_assist= (XEN) paging assistance: hap refcounts translate external ^^^ The presence of `hap' here indicates that the host is not vulnerable to this domain. For an HVM domain the presence of `shadow' indicates that the domain can exploit the vulnerability. Note that `General information' will also be printed for PV domains. For most PV domains there will be no `paging assistance' reported. But PV guests currently being migrated will report (XEN) paging assistance: shadow log_dirty Overall: a domain can exploit the vulnerability if this debug output contains a `paging assistance' line which reports `translate' and which does not report `hap'. MITIGATION == Running only PV guests will avoid this vulnerability. Running HVM guests on only AMD hardware will also avoid this vulnerability. Running HVM guests with Hardware Assisted Paging (HAP) enabled will also avoid this vulnerability. This is the default mode on hardware supporting HAP, but can be overridden by hypervisor command line option and guest configuration setting. Such overrides ("hap=0" in either case, with variants like "no-hap" being possible in the hypervisor command line case) would need to be removed to avoid this vulnerability. CREDITS === This issue was discovered by Jan Beulich of SUSE. RESOLUTION == Applying the attached patch resolves this issue. xsa168.patch xen-unstable, Xen 4.6.x, Xen 4.5.x, Xen 4.4.x, Xen 4.3.x $ sha256sum xsa168* c95198a66485d6e538d113ce2b84630d77c15f597113c38fadd6bf1e24e4c8ec xsa168.patch $ DEPLOYMENT DURING EMBARGO = Deployment of the patches and/or mitigations described above (or others which are substantially similar) is permitted during the embargo, even on public-facing systems with untrusted guest users and administrators. But: Distribution of updated software is prohibited (except to other members of the predisclosure list). Predisclosure list members who wish to deploy significantly different patches and/or mitigations, please contact the Xen Project Security Team. (Note: this during-embargo deployment notice is retained in post-embargo publicly released Xen Project advisories, even though it is then no longer applicable. This is to enable the community to have oversight of the Xen Project Security Team's decisionmaking.) For more information about permissible uses of embargoed information, consult the Xen Project community's agreed Security Policy: http://www.xenproject.org/security-policy.html -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAEBAgAGBQJWn3dEAAoJEIP+FMlX6CvZLaAH/A1FzwQebCOF0MCEMcM9V/zK At3L0XG5oBiVZVpbXAfYULeKaLtTGLBXqhBJjzej0FypCvEYX6BLBITLsw7kMqoW JSYHNHlg4pLH2Wnf6i3fVC7EIHx5XNuDa8Zeyt73wEFJhVpp43PcMwMzBolTUBmP +f5WDkLYflYXv+0XiHfbBLA2fl+K+A5OdDhKgjPZJouGvdfiZxX7EChR0asmmD1i AbSZYTLGhdlSU+fvw+w2XUYSeINS1FEhsZxMbWMVuz7jmPBmOn6u8NLrBdZatYoE Z2Fly81pWD7KDwusVscoLBdmBmI1Wr3u975j5EkQLbsCTsqo5ayP3BpfsieijIg= =UJX5 -END PGP SIGNATURE- xsa168.patch Description: Binary data ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
On 20/01/16 12:48, Peng Fan wrote: > Hi Juergen, > > On Wed, Jan 20, 2016 at 11:40:55AM +0100, Juergen Gross wrote: >> On 20/01/16 10:25, Peng Fan wrote: >>> Hi Juergen, >>> >>> On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote: On 20/01/16 09:31, Peng Fan wrote: > Introduce pvclk interface which is useful when doing device passthrough > on ARM platform. ... > +/* > + * Frontend request > + * > + * cmd: command for operation on clk, should be XEN_CLK_[xx], > + * excluding XEN_CLK_END. id is the > + * id: clk id > + * rate: clock rate. Used for set rate. Which unit? Hz? >>> >>> Yeah. Hz. I'll add comments in V3. >>> > + */ > +struct xen_clkif_request { > + uint32_t cmd; > + uint32_t id; > + uint64_t rate; > +}; > +typedef struct xen_clkif_request xen_clkif_request_t; > + > +/* > + * Backend response > + * > + * cmd: command for operation on clk, same with the cmd in request. > + * id: clk id, same with the id in request. > + * success: indicate failure or success for the cmd. Values? >>> >>> I'd like to let the frontend/backend driver to determine the value. >>> In my implementation for linux, if the backend API supports return value, >>> I'll fill the return value to success entry. If the backend API returns >>> void, I'll just fill 0 to success entry. >> >> So please specify "0" to mean success and add some sensible error >> values. There might be multiple frontend- and/or backend-variants which >> all must agree using the same interface. > > Will change this to `int status`, as also observed by Jan. > I'll also define macros such as "#define XEN_CLK_ENABLE_OK 0", "#define > XEN_CLK_ENABLE_FAILURE -1" > >> > + * rate: clock rate. Used for get rate. > + * > + * cmd and id are filled by backend and passed to frontend to > + * let frontend check whether the response is for the current > + * request or not. I'd rather let the frontend add a request Id to the request which will be echoed here instead cmd and clock Id. >>> >>> If using request id, I think we can encode cmd and id into request id? >> >> This should just be an opaque value for the backend. The frontend is >> free how to create this value, it should be unique for every outstanding >> request of a domU, however. It could be built from cmd and id in case >> there can't be multiple requests with the same cmd/id combination >> active in a domU. > > Thanks for teaching me on this. So the request id should be globally unique > for backend. So "random value(generated in frontend) + frontend domid" is ok > for this? Being unique per frontend is enough, as each frontend will only ever see responses for it's own requests. Make it as simple as possible. I guess there will be a maximum of active requests possible, e.g. the number of request slots in the request ring. You could use the index into the ring as Id (pvSCSI frontend is doing it this way). Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] netfront/netback multiqueue exhausting grants
There have been a few reports recently[0] which relate to a failure of netfront to allocate sufficient grant refs for all the queues: [0.533589] xen_netfront: can't alloc rx grant refs [0.533612] net eth0: only created 31 queues Which can be worked around by increasing the number of grants on the hypervisor command line or by limiting the number of queues permitted by either back or front using a module param (which was broken but is now fixed on both sides, but I'm not sure it has been backported everywhere such that it is a reliable thing to always tell users as a workaround). Is there any plan to do anything about the default/out of the box experience? Either limiting the number of queues or making both ends cope more gracefully with failure to create some queues (or both) might be sufficient? I think the crash after the above in the first link at [0] is fixed? I think that was the purpose of ca88ea1247df "xen-netfront: update num_queues to real created" which was in 4.3. Ian. [0] http://lists.xen.org/archives/html/xen-users/2016-01/msg00100.html http://lists.xen.org/archives/html/xen-users/2016-01/msg00072.html some before hte xmas break too IIRC ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
On Wed, 2016-01-20 at 12:06 +, Stefano Stabellini wrote: > On Wed, 20 Jan 2016, Peng Fan wrote: > > To my use case, Dom0 and DomU both use device tree, I need to build > > the mapping table between id and name, since I use name to lookup > > the clk in backend, like this: > > "clk = __clk_loopkup(clkname); clk_prepare_enable(clk)". Maybe ACPI > > is another different case. > > Theoretically on systems using ACPI there is no need to fiddle with > clocks, I mentioned ACPI in my replies to v1 more as a placeholder for "other firmware descriptions than DT", in order that the pv protocol we end up with does not end up being DT specific, which would be a mistake irrespective of what may or may not be required for non-DT firmware descriptions. Ian. > see > > Documentation/arm64/arm-acpi.txt > > > "In DT, clocks need to be specified > and the drivers need to take them into account. In ACPI, the assumption > is that UEFI will leave the device in a reasonable default state, > including > any clock settings. If for some reason the driver needs to change a > clock > value, this can be done in an ACPI method; all the driver needs to do is > invoke the method and not concern itself with what the method needs to do > to change the clock. Changing the hardware can then take place over time > by changing what the ACPI method does, and not the driver. > > In DT, the parameters needed by the driver to set up clocks as in the > example > above are known as "bindings"; in ACPI, these are known as "Device > Properties" > and provided to a driver via the _DSD object." > > > However currently we don't have the ability to run ACPI in DomU guests > on ARM. Even if we had, there is no way to call native ACPI methods from > any guests other than Dom0, even on x86. We just have to hope that > clocks don't need to be reset on ACPI systems. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] OSSTEST: Re-blessing cubietruck-{picasso, gleizes, metzinger} for production use
On Wed, 2016-01-20 at 11:58 +, Ian Campbell wrote: > On Wed, 2016-01-20 at 11:52 +, Ian Jackson wrote: > > Ian Campbell writes ("OSSTEST: Re-blessing cubietruck- > > {picasso,gleizes,metzinger} for production use"): > > > I've been running a couple of adhoc production tests per day on these > > > since > > > before Xmas and they haven't lost sight of their disks again. > > > > > > TLDR; I think we should throw them back in the pool. > > > > Great. > > I'll take this as a "yes, go ahead" ;-) and now I have actually gone and done it... ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 2/5] libxl: introduce LIBXL_VGA_INTERFACE_TYPE_UNDEF
On Wed, 2016-01-20 at 12:57 +0100, Roger Pau Monne wrote: > And use it as the default value for the VGA kind. This allows libxl to > set > it to the default value later on when the domain type is known. For HVM > guests the default value is LIBXL_VGA_INTERFACE_TYPE_CIRRUS while for > HVMlite the default value is LIBXL_VGA_INTERFACE_TYPE_NONE. Most enums in lixl_type.idl call this value "UNKNOWN" rather than "UNDEF". We should be consistent. You will also need to add the usual #define LIBXL_HAVE to libxl.h. Everything else looks ok to me. > > Signed-off-by: Roger Pau Monné > --- > Cc: Ian Jackson > Cc: Ian Campbell > Cc: Wei Liu > --- > tools/libxl/libxl_create.c | 8 ++-- > tools/libxl/libxl_dm.c | 6 ++ > tools/libxl/libxl_types.idl | 3 ++- > 3 files changed, 14 insertions(+), 3 deletions(-) > > diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c > index e491d83..61a4001 100644 > --- a/tools/libxl/libxl_create.c > +++ b/tools/libxl/libxl_create.c > @@ -206,8 +206,12 @@ int libxl__domain_build_info_setdefault(libxl__gc > *gc, > if (b_info->u.hvm.mmio_hole_memkb == LIBXL_MEMKB_DEFAULT) > b_info->u.hvm.mmio_hole_memkb = 0; > > -if (!b_info->u.hvm.vga.kind) > -b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS; > +if (b_info->u.hvm.vga.kind == LIBXL_VGA_INTERFACE_TYPE_UNDEF) { > +if (b_info->device_model_version == > LIBXL_DEVICE_MODEL_VERSION_NONE) > +b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_NONE; > +else > +b_info->u.hvm.vga.kind = > LIBXL_VGA_INTERFACE_TYPE_CIRRUS; > +} > > if (!b_info->u.hvm.hdtype) > b_info->u.hvm.hdtype = LIBXL_HDTYPE_IDE; > diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c > index a088d71..9aa0cc8 100644 > --- a/tools/libxl/libxl_dm.c > +++ b/tools/libxl/libxl_dm.c > @@ -531,6 +531,9 @@ static int > libxl__build_device_model_args_old(libxl__gc *gc, > break; > case LIBXL_VGA_INTERFACE_TYPE_QXL: > break; > +default: > +LOG(ERROR, "Invalid emulated video card specified"); > +return ERROR_INVAL; > } > > if (b_info->u.hvm.boot) { > @@ -970,6 +973,9 @@ static int > libxl__build_device_model_args_new(libxl__gc *gc, > GCSPRINTF("qxl- > vga,vram_size_mb=%"PRIu64",ram_size_mb=%"PRIu64, > (b_info->video_memkb/2/1024), (b_info- > >video_memkb/2/1024) ) ); > break; > +default: > +LOG(ERROR, "Invalid emulated video card specified"); > +return ERROR_INVAL; > } > > if (b_info->u.hvm.boot) { > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl > index 9ad7eba..92c95e5 100644 > --- a/tools/libxl/libxl_types.idl > +++ b/tools/libxl/libxl_types.idl > @@ -204,11 +204,12 @@ libxl_shutdown_reason = > Enumeration("shutdown_reason", [ > ], init_val = "LIBXL_SHUTDOWN_REASON_UNKNOWN") > > libxl_vga_interface_type = Enumeration("vga_interface_type", [ > +(0, "UNDEF"), > (1, "CIRRUS"), > (2, "STD"), > (3, "NONE"), > (4, "QXL"), > -], init_val = "LIBXL_VGA_INTERFACE_TYPE_CIRRUS") > +], init_val = "LIBXL_VGA_INTERFACE_TYPE_UNDEF") > > libxl_vendor_device = Enumeration("vendor_device", [ > (0, "NONE"), ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 0/4] add support for vNVDIMM
On Wed, 20 Jan 2016, Tian, Kevin wrote: > > From: Zhang, Haozhong > > Sent: Tuesday, December 29, 2015 7:32 PM > > > > This patch series is the Xen part patch to provide virtual NVDIMM to > > guest. The corresponding QEMU patch series is sent separately with the > > title "[PATCH 0/2] add vNVDIMM support for Xen". > > > > * Background > > > > NVDIMM (Non-Volatile Dual In-line Memory Module) is going to be > > supported on Intel's platform. NVDIMM devices are discovered via ACPI > > and configured by _DSM method of NVDIMM device in ACPI. Some > > documents can be found at > > [1] ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf > > [2] NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf > > [3] DSM Interface Example: > > http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf > > [4] Driver Writer's Guide: > > http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf > > > > The upstream QEMU (commits 5c42eef ~ 70d1fb9) has added support to > > provide virtual NVDIMM in PMEM mode, in which NVDIMM devices are > > mapped into CPU's address space and are accessed via normal memory > > read/write and three special instructions (clflushopt/clwb/pcommit). > > > > This patch series and the corresponding QEMU patch series enable Xen > > to provide vNVDIMM devices to HVM domains. > > > > * Design > > > > Supporting vNVDIMM in PMEM mode has three requirements. > > > > Although this design is about vNVDIMM, some background of how pNVDIMM > is managed in Xen would be helpful to understand the whole design since > in PMEM mode you need map pNVDIMM into GFN addr space so there's > a matter of how pNVDIMM is allocated. Yes, some background would be very helpful. Given that there are so many moving parts on this (Xen, the Dom0 kernel, QEMU, hvmloader, libxl) I suggest that we start with a design document for this feature. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 3/5] libxl: initialise the build info before calling prepare_config
On Wed, 2016-01-20 at 12:57 +0100, Roger Pau Monne wrote: > libxl__arch_domain_prepare_config has access to the > libxl_domain_build_info > struct, so make sure it's properly initialised. > > Signed-off-by: Roger Pau Monné > --- > Cc: Ian Jackson > Cc: Ian Campbell > Cc: Wei Liu > --- > NB: libxl__arch_domain_prepare_config is called from libxl__domain_make. I think this is worth moving into the actual commit message. Plus it would be useful to clarify that while prepare_config has access to b_info it doesn't touch it right now (but presumably you are about to make it do so). If it does touch it then that is currently a bug which should be mentioned in the commit message and tagged for backport etc. I suspect the reason for the ordering today is that domain_make is intended to consume create_info, not build_info. However that distinction seems to me to be an artefact of a much older API structure which doesn't seem to make much sense now (but we are stuck with it :-() > --- > tools/libxl/libxl_create.c | 12 ++-- > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c > index 61a4001..ba4c9e8 100644 > --- a/tools/libxl/libxl_create.c > +++ b/tools/libxl/libxl_create.c > @@ -892,6 +892,12 @@ static void initiate_domain_create(libxl__egc *egc, > goto error_out; > } > > +ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info); > +if (ret) { > +LOG(ERROR, "Unable to set domain build info defaults"); > +goto error_out; > +} > + > ret = libxl__domain_make(gc, d_config, &domid, &state->config); > if (ret) { > LOG(ERROR, "cannot make domain: %d", ret); > @@ -903,12 +909,6 @@ static void initiate_domain_create(libxl__egc *egc, > dcs->guest_domid = domid; > dcs->dmss.dm.guest_domid = 0; /* means we haven't spawned */ > > -ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info); > -if (ret) { > -LOG(ERROR, "Unable to set domain build info defaults"); > -goto error_out; > -} > - > if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM && > (libxl_defbool_val(d_config->b_info.u.hvm.nested_hvm) && > libxl_defbool_val(d_config->b_info.u.hvm.altp2m))) { ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
>>> On 20.01.16 at 12:04, wrote: > On 01/20/16 01:46, Jan Beulich wrote: >> >>> On 20.01.16 at 06:31, wrote: >> > Secondly, the driver implements a convenient block device interface to >> > let software access areas where NVDIMM devices are mapped. The >> > existing vNVDIMM implementation in QEMU uses this interface. >> > >> > As Linux NVDIMM driver has already done above, why do we bother to >> > reimplement them in Xen? >> >> See above; a possibility is that we may need a split model (block >> layer parts on Dom0, "normal memory" parts in the hypervisor. >> Iirc the split is being determined by firmware, and hence set in >> stone by the time OS (or hypervisor) boot starts. > > For the "normal memory" parts, do you mean parts that map the host > NVDIMM device's address space range to the guest? I'm going to > implement that part in hypervisor and expose it as a hypercall so that > it can be used by QEMU. To answer this I need to have my understanding of the partitioning being done by firmware confirmed: If that's the case, then "normal" means the part that doesn't get exposed as a block device (SSD). In any event there's no correlation to guest exposure here. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] public/io/netif.h: change semantics of "request-multicast-control" flag
My patch b2700877 "move and amend multicast control documentation" clarified use of the multicast control protocol between frontend and backend. However, it transpires that the restrictions that documentation placed on the "request-multicast-control" flag make it hard for a frontend to enable 'all multicast' promiscuous mode, in that to do so would require the frontend and backend to disconnect and re-connect. This patch adds a new "feature-dynamic-multicast-control" flag to allow a backend to advertise that it will watch "request-multicast-control" hence allowing it to be meaningfully modified by the frontend at any time rather than only when the frontend and backend are disconnected. Signed-off-by: Paul Durrant Cc: Ian Campbell Cc: Ian Jackson Cc: Jan Beulich Cc: Keir Fraser Cc: Tim Deegan --- xen/include/public/io/netif.h | 34 ++ 1 file changed, 22 insertions(+), 12 deletions(-) diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h index fe0a87f..8816e0f 100644 --- a/xen/include/public/io/netif.h +++ b/xen/include/public/io/netif.h @@ -136,18 +136,28 @@ */ /* - * "feature-multicast-control" advertises the capability to filter ethernet - * multicast packets in the backend. To enable use of this capability the - * frontend must set "request-multicast-control" before moving into the - * connected state. - * - * If "request-multicast-control" is set then the backend transmit side should - * no longer flood multicast packets to the frontend, it should instead drop any - * multicast packet that does not match in a filter list. The list is - * amended by the frontend by sending dummy transmit requests containing - * XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL} extra-info fragments as specified below. - * Once enabled by the frontend, the feature cannot be disabled except by - * closing and re-connecting to the backend. + * "feature-multicast-control" and "feature-dynamic-multicast-control" + * advertise the capability to filter ethernet multicast packets in the + * backend. If the frontend wishes to take advantage of this feature then + * it may set "request-multicast-control". If the backend only advertises + * "feature-multicast-control" then "request-multicast-control" must be set + * before the frontend moves into the connected state. The backend will + * sample the value on this state transition and any subsequent change in + * value will have no effect. However, if the backend also advertises + * "feature-dynamic-multicast-control" then "request-multicast-control" + * may be set by the frontend at any time. In this case, the backend will + * watch the value and re-sample on watch events. + * + * If the sampled value of "request-multicast-control" is set then the + * backend transmit side should no longer flood multicast packets to the + * frontend, it should instead drop any multicast packet that does not + * match in a filter list. + * The list is amended by the frontend by sending dummy transmit requests + * containing XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL} extra-info fragments as + * specified below. + * Note that the filter list may be amended even if the sampled value of + * "request-multicast-control" is not set, however the filter should only + * be applied if it is set. */ /* -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.
>>> On 20.01.16 at 11:26, wrote: >> On January 15, 2016 at 9:10, wrote: >> >>> On 23.12.15 at 09:25, wrote: >> > @@ -229,6 +239,63 @@ int qinval_device_iotlb(struct iommu *iommu, >> > return 0; >> > } >> > >> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did, >> > + u16 seg, u8 bus, u8 devfn) >> { >> > +struct domain *d; >> > +struct pci_dev *pdev; >> > + >> > +d = rcu_lock_domain_by_id(iommu->domid_map[did]); >> > +ASSERT(d); >> >> Don't you need to acquire some lock in order to safely assert this? > > Referred to the other use case of 'rcu_lock_domain_by_id()', Xen checks > whether > The domain is 'NULL'. > Could I also replace the 'ASSERT(d)' with > + If ( d == NULL ) > + return; > ? At a first glance this doesn't look right, but in the end that's something you need to answer. >> Also note that unused slots hold zero, i.e. there's at least a theoretical > risk of >> problems here when you don't also look at >> iommu->domid_bitmap. >> > I am not clear how to fix this point. Do you have good idea? > Add a lock to 'iommu->domid_bitmap'? How would a lock help avoiding mistaking an unused slot to mean Dom0? As already suggested - I think you simply need to consult the bitmap along with the domain ID array. >> > +{ >> > +list_del(&pdev->domain_list); >> >> This should happen under pcidevs_lock - you need to either acquire it or >> ASSERT() it being held. >> > > I should check whether it is under pcidevs_lock -- with > spin_is_locked(&pcidevs_lock) > If it is not under pcidevs_lock, I should acquire it. > I think ASSERT() is not a good idea. Hypervisor acquires this lock and then > remove the resource. I don't understand this last sentence. >> > @@ -349,9 +416,18 @@ static int flush_iotlb_qi( >> > queue_invalidate_iotlb(iommu, >> > type >> >> DMA_TLB_FLUSH_GRANU_OFFSET, dr, >> > dw, did, size_order, 0, addr); >> > + >> > +/* >> > + * Synchronize with hardware for invalidation request descriptors >> > + * submitted before Device-TLB invalidate descriptor. >> > + */ >> > +rc = invalidate_sync(iommu); >> > +if ( rc ) >> > + return rc; >> > + >> > if ( flush_dev_iotlb ) >> > ret = dev_invalidate_iotlb(iommu, did, addr, size_order, >> > type); >> > -rc = invalidate_sync(iommu); >> > + >> > if ( !ret ) >> > ret = rc; >> > } >> >> This change is because of ...? >> > As Kevin's comments, > I add 'dev_invalidate_iotlb_sync()' to dev_invalidate_iotlb(). > Then i can know which device is flush timeout. I don't understand: How is your reply related to you moving up the invocation of invalidate_sync()? >> Plus logging the error code and device coordinates might turn out helpful in >> such cases. But first of all - if there was a timeout, we'd make it here >> only for >> Dom0. Perhaps the printing should move into an else to the domain_crash()? >> And if there was another error, the message would be outright wrong. >> > IMO, the print for Dom0 is enough. > I think it is not a good idea to move into an else to domain_crash(). > Domain_crash is quite a common part. > Anyway I can improve it in low priority. At the very least the message should not end up being actively misleading. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
>>> On 20.01.16 at 12:20, wrote: >> From: Jan Beulich [mailto:jbeul...@suse.com] >> Sent: Wednesday, January 20, 2016 4:35 PM >> >>> On 20.01.16 at 08:49, wrote: >> >> From: Jan Beulich [mailto:jbeul...@suse.com] >> >> Sent: Monday, January 18, 2016 11:14 PM >> >> >>> On 03.12.15 at 09:35, wrote: >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c >> >> > @@ -83,7 +83,131 @@ static int vmx_msr_write_intercept(unsigned int >> >> msr, uint64_t msr_content); >> >> > +ASSERT(v->arch.hvm_vmx.pi_block_cpu == NR_CPUS); >> >> > + >> >> > +/* >> >> > + * The vCPU is blocking, we need to add it to one of the per pCPU >> lists. >> >> > + * We save v->processor to v->arch.hvm_vmx.pi_block_cpu and use >> it >> >> for >> >> > + * the per-CPU list, we also save it to posted-interrupt descriptor >> and >> >> > + * make it as the destination of the wake-up notification event. >> >> > + */ >> >> > +v->arch.hvm_vmx.pi_block_cpu = v->processor; >> >> > + >> >> > +spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock, >> >> > + v->arch.hvm_vmx.pi_block_cpu), flags); >> >> > +list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list, >> >> > + &per_cpu(pi_blocked_vcpu, v- >> >arch.hvm_vmx.pi_block_cpu)); >> >> > +spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock, >> >> > + v->arch.hvm_vmx.pi_block_cpu), flags); >> >> > + >> >> > +ASSERT(!pi_test_sn(pi_desc)); >> >> > + >> >> > +/* >> >> > + * We don't need to set the 'NDST' field, since it should point to >> >> > + * the same pCPU as v->processor. >> >> > + */ >> >> > + >> >> > +write_atomic(&pi_desc->nv, pi_wakeup_vector); >> >> >> >> Stray blank line between comment and corresponding code. >> >> >> >> Also the ASSERT() is rather more disconnected from the write >> >> than seems desirable: Wouldn't it be better to cmpxchg() the >> >> whole 32-bit value, validating that SN is clear in the result? >> > >> > Not quite understand this. The control field is 64 bits, do you >> > mean cmpxchg() the whole 64-bit value like follows: >> > >> > +do { >> > +old.control = new.control = pi_desc->control; >> > +new.nv = pi_wakeup_vector; >> > +} while ( cmpxchg(&pi_desc->control, old.control, new.control) >> > + != old.control ); >> > >> > This a somewhat like the implementation in the earlier versions, >> > why do you want to change it back? >> >> Did you read what I've said? I'm worried about the disconnect of >> assertion and update: You really care about SN being clear >> _after_ the update aiui. > > No, the ASSERT has no connection with the update. the SN bit should > be clear before updating pi_desc->nv, adding ASSERT here is just kind > of protection code. And SN can't get set behind your back between the ASSERT() and the update? If the answer is "yes", then the code is indeed fine as is. >> >> > +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu; >> >> > +if ( pi_block_cpu == NR_CPUS ) >> >> > +return; >> >> >> >> Are this condition and the one in the immediately preceding if() >> >> connected in any way? >> > >> > I am not sure I understand this correctly. Which one is >> > " the one in the immediately preceding if()"? >> >> If you hadn't ripped out too much of the context, it would be a >> matter of pointing you back up. Now I have to re-quote the full >> code: >> >> +if ( pi_desc->nv != posted_intr_vector ) >> +write_atomic(&pi_desc->nv, posted_intr_vector); >> + >> +/* the vCPU is not on any blocking list. */ >> +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu; >> +if ( pi_block_cpu == NR_CPUS ) >> +return; >> >> These are the two if()-s the question is about. >> >> >> I.e. could the latter perhaps become an >> >> ASSERT() by connecting the two blocks suitably? I'm simply >> >> trying to limit the number of different cases to consider mentally >> >> when looking at this code ... >> > >> > If we get a true from above check, it means the vcpu was removed by >> > pi_wakeup_interrupt(), then we just need to return. If we get a false, >> > we will acquire the spinlock as the following code does, there are two >> > scenarios: >> > - pi_wakeup_interrupt() acquires the spinlock before us, then when >> > we get the lock, 'v->arch.hvm_vmx.pi_block_cpu' has already been >> > set to NR_CPUS by pi_wakeup_interrupt(), we need do nothing. >> > - We get the spinlock before pi_wakeup_interrupt(), this time we need >> > to remove the vCPU from the list and set 'v->arch.hvm_vmx.pi_block_cpu' >> > to NR_CPUS. >> >> This is all only about the second condition, but the question really is >> whether one being true or false may imply the result of other. In >> which case this would better be ASSERT()ed than handled by two >> conditionals. > > Thanks for the explanation. I don't think there are any connections > Between " if ( pi_desc->nv != poste
Re: [Xen-devel] [PATCH v3 5/5] libxl: add options to enable/disable emulated devices
On Wed, 2016-01-20 at 12:57 +0100, Roger Pau Monne wrote: > Allow enabling or disabling emulated devices from the libxl domain > configuration file. For HVM guests with a device model all the emulated > devices are enabled. For HVM guests without a device model no devices are > enabled by default, although they can be enabled using the options > provided. > The arbiter of whether a combination is posible or not is always Xen, > libxl > doesn't do any kind of check. > > This set of options is also propagated inside of the libxl migration record > as part of the contents of the libxl_domain_build_info struct. ... and this is the real motivation for this change, not actually allowing users to control all this AIUI. Did you check that the fields updated using libxl_defbool_setdefault are actually updated in the JSON copy and therefore propagated to the other side of a migration as specific values and not as "pick a default"? I think we don't want these changing on migration. I think/hope all this was automatically handled by the work Wei did in the last release cycle. > Signed-off-by: Roger Pau Monné > --- > Cc: Ian Jackson > Cc: Ian Campbell > Cc: Wei Liu > --- > docs/man/xl.cfg.pod.5 | 39 +++ > tools/libxl/libxl.h | 11 +++ > tools/libxl/libxl_create.c | 21 - > tools/libxl/libxl_types.idl | 6 ++ > tools/libxl/libxl_x86.c | 33 - > tools/libxl/xl_cmdimpl.c| 7 +++ > 6 files changed, 111 insertions(+), 6 deletions(-) > > diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5 > index 8899f75..46d4529 100644 > --- a/docs/man/xl.cfg.pod.5 > +++ b/docs/man/xl.cfg.pod.5 > @@ -1762,6 +1762,45 @@ See F for > more information. > > =back > > +=head3 HVM without a device model options > + > +This options can be used to change the set of emulated devices provided "These..." > +to guests without a device model. Note that Xen might not support all > +possible combinations. By default HVM guests without a device model > +don't have any of them enabled. ... and for those with a device model? The title and text suggest these options are invalid/ignored in that case, but the code does actually honour what the user specified in this case. > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl > index 92c95e5..8a21cda 100644 > --- a/tools/libxl/libxl_types.idl > +++ b/tools/libxl/libxl_types.idl > @@ -519,6 +519,12 @@ libxl_domain_build_info = Struct("domain_build_info",[ > ("serial_list", > libxl_string_list), > ("rdm", libxl_rdm_reserve), > ("rdm_mem_boundary_memkb", MemKB), > + ("lapic",libxl_defbool), > + ("ioapic", libxl_defbool), > + ("rtc", libxl_defbool), > + ("power_management", libxl_defbool), > + ("pic", libxl_defbool), > + ("pit", libxl_defbool), I wonder if these should go in a sub-struct. Although you might reasonably argue that this is already such a dumping ground it doesn't matter... > ])), > ("pv", Struct(None, [("kernel", string), > ("slack_memkb", MemKB), > diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c > index 46cfafb..92f25fd 100644 > --- a/tools/libxl/libxl_x86.c > +++ b/tools/libxl/libxl_x86.c > @@ -7,15 +7,38 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, > libxl_domain_config *d_config, > xc_domain_configuration_t > *xc_config) > { > +struct libxl_domain_build_info *info = &d_config->b_info; > > -if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM && > -d_config->b_info.device_model_version != > -LIBXL_DEVICE_MODEL_VERSION_NONE) { > -/* HVM domains with a device model. */ So, I'm not 100% clear on why this check and the corresponding logic to set xc_config->emulation_flags is not also sufficient for after migration. Could you explain (and likely eventually add the rationale to the commit message). Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
>>> On 20.01.16 at 12:40, wrote: > Hi Jan, > > On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote: > On 20.01.16 at 09:31, wrote: >>> +/* >>> + * Backend response >>> + * >>> + * cmd: command for operation on clk, same with the cmd in request. >>> + * id: clk id, same with the id in request. >>> + * success: indicate failure or success for the cmd. >>> + * rate: clock rate. Used for get rate. >>> + * >>> + * cmd and id are filled by backend and passed to frontend to >>> + * let frontend check whether the response is for the current >>> + * request or not. >>> + */ >>> +struct xen_clkif_response { >>> + uint32_t cmd; >>> + uint32_t id; >>> + uint32_t success; >>> + uint64_t rate; >>> +}; >> >>This isn't 32-/64-bit clean. Question is whether echoing cmd is really >>needed. > > As Juergen suggested, use a request id. I'll fix this in V3. > 32-/64-bit unclean, I can not get you here (: The layout of above structure may end up different for 32- and 64-bit guests, depending on the alignment of uint64_t. >>And what I'm missing throughout the file is some description of how >>clock events (interrupts?) are actually meant to make it into the >>guest. > > I have a simple implementation v1 patch for linux, > http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html. > You can see how it works. No, sorry, that's not the point of the inquiry. It seems to me that there are more aspects to this interface, not directly related to the request/reply protocol written down here, which need to be spelled out event if they don't require any particular definitions or type declarations. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Backports for Xen 4.6
>>> On 18.01.16 at 16:53, wrote: > Possibly also: > 42940c046902 "x86/shadow: Fix missing newline in dprintk()" The affected statement compiles to nothing in a release build, which can be taken as an argument both ways. I lean towards not putting it in. > 6851e979874e "VT-d: use proper error codes in iommu_enable_x2apic_IR()" Since I had pulled this into our own tree already, and since you're now also viewing this as useful, I guess I will throw it in. > 0ce647ad6f70 "x86: suppress bogus log message" This being a purely cosmetic change, may I ask for the reason you consider this a backport candidate? > 484c14b7254e "x86/vmx: enable PML by default" Not in a point release I would say. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 4/5] x86/PV: allow PV guests to have an emulated PIT
>>> On 20.01.16 at 12:57, wrote: > --- a/xen/arch/x86/domain.c > +++ b/xen/arch/x86/domain.c > @@ -542,8 +542,11 @@ int arch_domain_create(struct domain *d, unsigned int > domcr_flags, > d->domain_id, config->emulation_flags); > return -EINVAL; > } > -if ( config->emulation_flags != 0 && > - (!is_hvm_domain(d) || config->emulation_flags != > XEN_X86_EMU_ALL) ) > +if ( is_hardware_domain(d) ) > +config->emulation_flags |= XEN_X86_EMU_PIT; > +if ( !is_hvm_domain(d) ? (config->emulation_flags != > XEN_X86_EMU_PIT) : > + (config->emulation_flags != XEN_X86_EMU_ALL > && > + config->emulation_flags != 0)) > { > printk(XENLOG_G_ERR "d%d: Xen does not allow %s domain creation " > "with the current selection of emulators: %#x\n", While you mentioned in the cover letter that this needs to be committed together with patch 5, having them as separate commits would still break bisectabaility afaict. Would it really be so unreasonable to allow PV(H) DomU-s to not have an emulated PIT, depending on (a future) tool stack gathered per-domain setting? Converting the check above accordingly would - afaict - at once break that dependency on patch 5. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [xen-4.6-testing test] 78528: tolerable FAIL - PUSHED
flight 78528 xen-4.6-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/78528/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-rtds 11 guest-start fail blocked in 77441 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 77441 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 77441 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass version targeted for testing: xen 6150df9f3f99ecbcbd9917002186d1d895b5602e baseline version: xen 1d3cc6e62c4d2fc3dd9251d4921881425c9d27bd Last test of basis77441 2016-01-07 23:13:46 Z 12 days Testing same since78528 2016-01-19 14:09:10 Z0 days1 attempts People who touched revisions under test: Andrew Cooper Ian Jackson Jonathan Davies jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-prev pass build-i386-prev pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen pass test-amd64-amd64-xl pass test-armhf-armhf-xl pass test-amd64-i386-xl pass test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass test-amd64-i386-xl-qemut-debianhvm
Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of "request-multicast-control" flag
On Wed, 2016-01-20 at 12:50 +, Paul Durrant wrote: > My patch b2700877 "move and amend multicast control documentation" > clarified use of the multicast control protocol between frontend and > backend. However, it transpires that the restrictions that documentation > placed on the "request-multicast-control" flag make it hard for a > frontend to enable 'all multicast' promiscuous mode, in that to do so > would require the frontend and backend to disconnect and re-connect. Do we therefore think that this document reflected reality, i.e. might this not be "just" a documentation bug? (Or maybe we can't tell because the only previous implementation was years ago in Solaris or something) > This patch adds a new "feature-dynamic-multicast-control" flag to allow > a backend to advertise that it will watch "request-multicast-control" hence > allowing it to be meaningfully modified by the frontend at any time rather > than only when the frontend and backend are disconnected. Would allowing XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL} to take a bcast address be easier on the backend, in that it would just need to be a static feature rather than watching stuff on the fly? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of "request-multicast-control" flag
> -Original Message- > From: Ian Campbell [mailto:ian.campb...@citrix.com] > Sent: 20 January 2016 13:06 > To: Paul Durrant; xen-de...@lists.xenproject.org > Cc: Ian Jackson; Jan Beulich; Keir (Xen.org); Tim (Xen.org) > Subject: Re: [PATCH] public/io/netif.h: change semantics of "request- > multicast-control" flag > > On Wed, 2016-01-20 at 12:50 +, Paul Durrant wrote: > > My patch b2700877 "move and amend multicast control documentation" > > clarified use of the multicast control protocol between frontend and > > backend. However, it transpires that the restrictions that documentation > > placed on the "request-multicast-control" flag make it hard for a > > frontend to enable 'all multicast' promiscuous mode, in that to do so > > would require the frontend and backend to disconnect and re-connect. > > Do we therefore think that this document reflected reality, i.e. might this > not be "just" a documentation bug? > > (Or maybe we can't tell because the only previous implementation was years > ago in Solaris or something) That's my concern. I hope it's just a documentation bug, but I don't know. Also I've already done an implementation in Linux netback according to the restricted semantics. > > > This patch adds a new "feature-dynamic-multicast-control" flag to allow > > a backend to advertise that it will watch "request-multicast-control" hence > > allowing it to be meaningfully modified by the frontend at any time rather > > than only when the frontend and backend are disconnected. > > Would allowing XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL} to take a bcast > address > be easier on the backend, in that it would just need to be a static feature > rather than watching stuff on the fly? The documented semantics of the list are 'exact match' so sending a bcast address doesn't do much good with a backend that doesn't know to treat is specially hence a frontend can't tell whether 'all multicast' mode is going to work without the extra feature flag. As for watching "request-multcast-control" vs. add/remove of bcast, the complexity of implementation is cheaper for the latter but I think the former is 'nicer'. Paul ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Backports for Xen 4.6
On 20/01/16 12:29, Jan Beulich wrote: On 18.01.16 at 16:53, wrote: >> Possibly also: >> 42940c046902 "x86/shadow: Fix missing newline in dprintk()" > The affected statement compiles to nothing in a release build, which > can be taken as an argument both ways. I lean towards not putting > it in. > >> 6851e979874e "VT-d: use proper error codes in iommu_enable_x2apic_IR()" > Since I had pulled this into our own tree already, and since you're > now also viewing this as useful, I guess I will throw it in. > >> 0ce647ad6f70 "x86: suppress bogus log message" > This being a purely cosmetic change, may I ask for the reason > you consider this a backport candidate? The shear volume reduction in debug builds. XenServer ships both a release and a debug hypervisor for first-line triage of customer issues. As such, "only affecting a debug build" isn't a relevant consideration for us, and I expect we are not alone here. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
On 20/01/16 10:36, Xiao Guangrong wrote: > > Hi, > > On 01/20/2016 06:15 PM, Haozhong Zhang wrote: > >> CCing QEMU vNVDIMM maintainer: Xiao Guangrong >> >>> Conceptually, an NVDIMM is just like a fast SSD which is linearly >>> mapped >>> into memory. I am still on the dom0 side of this fence. >>> >>> The real question is whether it is possible to take an NVDIMM, split it >>> in half, give each half to two different guests (with appropriate NFIT >>> tables) and that be sufficient for the guests to just work. >>> >> >> Yes, one NVDIMM device can be split into multiple parts and assigned >> to different guests, and QEMU is responsible to maintain virtual NFIT >> tables for each part. >> >>> Either way, it needs to be a toolstack policy decision as to how to >>> split the resource. > > Currently, we are using NVDIMM as a block device and a DAX-based > filesystem > is created upon it in Linux so that file-related accesses directly reach > the NVDIMM device. > > In KVM, If the NVDIMM device need to be shared by different VMs, we can > create multiple files on the DAX-based filesystem and assign the file to > each VMs. In the future, we can enable namespace (partition-like) for > PMEM > memory and assign the namespace to each VMs (current Linux driver uses > the > whole PMEM as a single namespace). > > I think it is not a easy work to let Xen hypervisor recognize NVDIMM > device > and manager NVDIMM resource. > > Thanks! > The more I see about this, the more sure I am that we want to keep it as a block device managed by dom0. In the case of the DAX-based filesystem, I presume files are not necessarily contiguous. I also presume that this is worked around by permuting the mapping of the virtual NVDIMM such that the it appears as a contiguous block of addresses to the guest? Today in Xen, Qemu already has the ability to create mappings in the guest's address space, e.g. to map PCI device BARs. I don't see a conceptual difference here, although the security/permission model certainly is more complicated. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
On Wed, 2016-01-20 at 04:35 -0700, Jan Beulich wrote: > > > > On 20.01.16 at 12:20, wrote: > > > > > > Then you didn't understand: The question isn't this path, but the > > > path where the hook gets called if non-NULL (and hence the > > > possibility to avoid such needless calls). > > > > I understand you mean the overhead happens when the hooks > > is called. My point is the hook is not called in a critical path, > > so I doubt > > whether it worth doing so to make the logic complex. > > Are you sure scheduling code is not a critical path? > TBH, I like Jan's point... It's always good to make all we can to avoid calling the hook, if unnecessary. Does it really complicates things a lot? Feng, can you give it a try? Regards, Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] preparing for 4.6.1
All, with this first point release due in about two weeks, please indicate backports you find missing from its staging tree but necessary in the release. Thanks, Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
> -Original Message- > From: Dario Faggioli [mailto:dario.faggi...@citrix.com] > Sent: Wednesday, January 20, 2016 9:31 PM > To: Jan Beulich ; Wu, Feng > Cc: Andrew Cooper ; George Dunlap > ; Tian, Kevin ; xen- > de...@lists.xen.org; Keir Fraser > Subject: Re: [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling > > On Wed, 2016-01-20 at 04:35 -0700, Jan Beulich wrote: > > > > > On 20.01.16 at 12:20, wrote: > > > > > > > > Then you didn't understand: The question isn't this path, but the > > > > path where the hook gets called if non-NULL (and hence the > > > > possibility to avoid such needless calls). > > > > > > I understand you mean the overhead happens when the hooks > > > is called. My point is the hook is not called in a critical path, > > > so I doubt > > > whether it worth doing so to make the logic complex. > > > > Are you sure scheduling code is not a critical path? > > > TBH, I like Jan's point... It's always good to make all we can to avoid > calling the hook, if unnecessary. > > Does it really complicates things a lot? Feng, can you give it a try? Sure, I will try it and discuss with your guys if I meet some issues in future. Thanks, Feng > > Regards, > Dario > -- > <> (Raistlin Majere) > - > Dario Faggioli, Ph.D, http://about.me/dario.faggioli > Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
> -Original Message- > From: Jan Beulich [mailto:jbeul...@suse.com] > Sent: Wednesday, January 20, 2016 7:36 PM > To: Wu, Feng > Cc: Andrew Cooper ; Dario Faggioli > ; George Dunlap ; > Tian, Kevin ; xen-devel@lists.xen.org; Keir Fraser > > Subject: RE: [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling > > >>> On 20.01.16 at 12:20, wrote: > >> From: Jan Beulich [mailto:jbeul...@suse.com] > >> Sent: Wednesday, January 20, 2016 4:35 PM > >> >>> On 20.01.16 at 08:49, wrote: > >> >> From: Jan Beulich [mailto:jbeul...@suse.com] > >> >> Sent: Monday, January 18, 2016 11:14 PM > >> >> >>> On 03.12.15 at 09:35, wrote: > >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c > >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c > >> >> > @@ -83,7 +83,131 @@ static int vmx_msr_write_intercept(unsigned > int > >> >> msr, uint64_t msr_content); > >> >> > +ASSERT(v->arch.hvm_vmx.pi_block_cpu == NR_CPUS); > >> >> > + > >> >> > +/* > >> >> > + * The vCPU is blocking, we need to add it to one of the per pCPU > >> lists. > >> >> > + * We save v->processor to v->arch.hvm_vmx.pi_block_cpu and > use > >> it > >> >> for > >> >> > + * the per-CPU list, we also save it to posted-interrupt > >> >> > descriptor > >> and > >> >> > + * make it as the destination of the wake-up notification event. > >> >> > + */ > >> >> > +v->arch.hvm_vmx.pi_block_cpu = v->processor; > >> >> > + > >> >> > +spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock, > >> >> > + v->arch.hvm_vmx.pi_block_cpu), flags); > >> >> > +list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list, > >> >> > + &per_cpu(pi_blocked_vcpu, v- > >> >arch.hvm_vmx.pi_block_cpu)); > >> >> > +spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock, > >> >> > + v->arch.hvm_vmx.pi_block_cpu), flags); > >> >> > + > >> >> > +ASSERT(!pi_test_sn(pi_desc)); > >> >> > + > >> >> > +/* > >> >> > + * We don't need to set the 'NDST' field, since it should point > >> >> > to > >> >> > + * the same pCPU as v->processor. > >> >> > + */ > >> >> > + > >> >> > +write_atomic(&pi_desc->nv, pi_wakeup_vector); > >> >> > >> >> Stray blank line between comment and corresponding code. > >> >> > >> >> Also the ASSERT() is rather more disconnected from the write > >> >> than seems desirable: Wouldn't it be better to cmpxchg() the > >> >> whole 32-bit value, validating that SN is clear in the result? > >> > > >> > Not quite understand this. The control field is 64 bits, do you > >> > mean cmpxchg() the whole 64-bit value like follows: > >> > > >> > +do { > >> > +old.control = new.control = pi_desc->control; > >> > +new.nv = pi_wakeup_vector; > >> > +} while ( cmpxchg(&pi_desc->control, old.control, new.control) > >> > + != old.control ); > >> > > >> > This a somewhat like the implementation in the earlier versions, > >> > why do you want to change it back? > >> > >> Did you read what I've said? I'm worried about the disconnect of > >> assertion and update: You really care about SN being clear > >> _after_ the update aiui. > > > > No, the ASSERT has no connection with the update. the SN bit should > > be clear before updating pi_desc->nv, adding ASSERT here is just kind > > of protection code. > > And SN can't get set behind your back between the ASSERT() and > the update? Yes. > If the answer is "yes", then the code is indeed fine as is. > > >> >> > +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu; > >> >> > +if ( pi_block_cpu == NR_CPUS ) > >> >> > +return; > >> >> > >> >> Are this condition and the one in the immediately preceding if() > >> >> connected in any way? > >> > > >> > I am not sure I understand this correctly. Which one is > >> > " the one in the immediately preceding if()"? > >> > >> If you hadn't ripped out too much of the context, it would be a > >> matter of pointing you back up. Now I have to re-quote the full > >> code: > >> > >> +if ( pi_desc->nv != posted_intr_vector ) > >> +write_atomic(&pi_desc->nv, posted_intr_vector); > >> + > >> +/* the vCPU is not on any blocking list. */ > >> +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu; > >> +if ( pi_block_cpu == NR_CPUS ) > >> +return; > >> > >> These are the two if()-s the question is about. > >> > >> >> I.e. could the latter perhaps become an > >> >> ASSERT() by connecting the two blocks suitably? I'm simply > >> >> trying to limit the number of different cases to consider mentally > >> >> when looking at this code ... > >> > > >> > If we get a true from above check, it means the vcpu was removed by > >> > pi_wakeup_interrupt(), then we just need to return. If we get a false, > >> > we will acquire the spinlock as the following code does, there are two > >> > scenarios: > >> > - pi_wakeup_interrupt() acquires the spinlock before us, then when > >> > we get the lock, 'v->arch.hvm_vmx.pi_block_cpu' has already been > >>
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
Hi Ian, Stefano On Wed, Jan 20, 2016 at 12:27:07PM +, Ian Campbell wrote: >On Wed, 2016-01-20 at 12:06 +, Stefano Stabellini wrote: >> On Wed, 20 Jan 2016, Peng Fan wrote: >> > To my use case, Dom0 and DomU both use device tree, I need to build >> > the mapping table between id and name, since I use name to lookup >> > the clk in backend, like this: >> > "clk = __clk_loopkup(clkname); clk_prepare_enable(clk)". Maybe ACPI >> > is another different case. >> >> Theoretically on systems using ACPI there is no need to fiddle with >> clocks, > >I mentioned ACPI in my replies to v1 more as a placeholder for "other >firmware descriptions than DT", in order that the pv protocol we end up >with does not end up being DT specific, which would be a mistake >irrespective of what may or may not be required for non-DT firmware >descriptions. Thanks for clarifying. Beside this, are you ok with the xenstore node description in this file? Thanks, Peng. > >Ian. > >> see >> >> Documentation/arm64/arm-acpi.txt >> >> >> "In DT, clocks need to be specified >> and the drivers need to take them into account. In ACPI, the assumption >> is that UEFI will leave the device in a reasonable default state, >> including >> any clock settings. If for some reason the driver needs to change a >> clock >> value, this can be done in an ACPI method; all the driver needs to do is >> invoke the method and not concern itself with what the method needs to do >> to change the clock. Changing the hardware can then take place over time >> by changing what the ACPI method does, and not the driver. >> >> In DT, the parameters needed by the driver to set up clocks as in the >> example >> above are known as "bindings"; in ACPI, these are known as "Device >> Properties" >> and provided to a driver via the _DSD object." >> >> >> However currently we don't have the ability to run ACPI in DomU guests >> on ARM. Even if we had, there is no way to call native ACPI methods from >> any guests other than Dom0, even on x86. We just have to hope that >> clocks don't need to be reset on ACPI systems. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 0/3] x86: XSA-168 follow-up
1: mmuext: tighten TLB flush address checks 2: PV: relax LDT address check 3: paging: invlpg() hook returns boolean Signed-off-by: Jan Beulich ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
>>> On 20.01.16 at 14:48, wrote: > >> -Original Message- >> From: Jan Beulich [mailto:jbeul...@suse.com] >> Sent: Wednesday, January 20, 2016 7:36 PM >> To: Wu, Feng >> Cc: Andrew Cooper ; Dario Faggioli >> ; George Dunlap ; >> Tian, Kevin ; xen-devel@lists.xen.org; Keir Fraser >> >> Subject: RE: [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling >> >> >>> On 20.01.16 at 12:20, wrote: >> >> From: Jan Beulich [mailto:jbeul...@suse.com] >> >> Sent: Wednesday, January 20, 2016 4:35 PM >> >> >>> On 20.01.16 at 08:49, wrote: >> >> >> From: Jan Beulich [mailto:jbeul...@suse.com] >> >> >> Sent: Monday, January 18, 2016 11:14 PM >> >> >> >>> On 03.12.15 at 09:35, wrote: >> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c >> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c >> >> >> > @@ -83,7 +83,131 @@ static int vmx_msr_write_intercept(unsigned >> int >> >> >> msr, uint64_t msr_content); >> >> >> > +ASSERT(v->arch.hvm_vmx.pi_block_cpu == NR_CPUS); >> >> >> > + >> >> >> > +/* >> >> >> > + * The vCPU is blocking, we need to add it to one of the per >> >> >> > pCPU >> >> lists. >> >> >> > + * We save v->processor to v->arch.hvm_vmx.pi_block_cpu and >> use >> >> it >> >> >> for >> >> >> > + * the per-CPU list, we also save it to posted-interrupt >> >> >> > descriptor >> >> and >> >> >> > + * make it as the destination of the wake-up notification event. >> >> >> > + */ >> >> >> > +v->arch.hvm_vmx.pi_block_cpu = v->processor; >> >> >> > + >> >> >> > +spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock, >> >> >> > + v->arch.hvm_vmx.pi_block_cpu), flags); >> >> >> > +list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list, >> >> >> > + &per_cpu(pi_blocked_vcpu, v- >> >> >arch.hvm_vmx.pi_block_cpu)); >> >> >> > +spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock, >> >> >> > + v->arch.hvm_vmx.pi_block_cpu), flags); >> >> >> > + >> >> >> > +ASSERT(!pi_test_sn(pi_desc)); >> >> >> > + >> >> >> > +/* >> >> >> > + * We don't need to set the 'NDST' field, since it should point >> >> >> > to >> >> >> > + * the same pCPU as v->processor. >> >> >> > + */ >> >> >> > + >> >> >> > +write_atomic(&pi_desc->nv, pi_wakeup_vector); >> >> >> >> >> >> Stray blank line between comment and corresponding code. >> >> >> >> >> >> Also the ASSERT() is rather more disconnected from the write >> >> >> than seems desirable: Wouldn't it be better to cmpxchg() the >> >> >> whole 32-bit value, validating that SN is clear in the result? >> >> > >> >> > Not quite understand this. The control field is 64 bits, do you >> >> > mean cmpxchg() the whole 64-bit value like follows: >> >> > >> >> > +do { >> >> > +old.control = new.control = pi_desc->control; >> >> > +new.nv = pi_wakeup_vector; >> >> > +} while ( cmpxchg(&pi_desc->control, old.control, new.control) >> >> > + != old.control ); >> >> > >> >> > This a somewhat like the implementation in the earlier versions, >> >> > why do you want to change it back? >> >> >> >> Did you read what I've said? I'm worried about the disconnect of >> >> assertion and update: You really care about SN being clear >> >> _after_ the update aiui. >> > >> > No, the ASSERT has no connection with the update. the SN bit should >> > be clear before updating pi_desc->nv, adding ASSERT here is just kind >> > of protection code. >> >> And SN can't get set behind your back between the ASSERT() and >> the update? > > Yes. > >> If the answer is "yes", then the code is indeed fine as is. >> >> >> >> > +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu; >> >> >> > +if ( pi_block_cpu == NR_CPUS ) >> >> >> > +return; >> >> >> >> >> >> Are this condition and the one in the immediately preceding if() >> >> >> connected in any way? >> >> > >> >> > I am not sure I understand this correctly. Which one is >> >> > " the one in the immediately preceding if()"? >> >> >> >> If you hadn't ripped out too much of the context, it would be a >> >> matter of pointing you back up. Now I have to re-quote the full >> >> code: >> >> >> >> +if ( pi_desc->nv != posted_intr_vector ) >> >> +write_atomic(&pi_desc->nv, posted_intr_vector); >> >> + >> >> +/* the vCPU is not on any blocking list. */ >> >> +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu; >> >> +if ( pi_block_cpu == NR_CPUS ) >> >> +return; >> >> >> >> These are the two if()-s the question is about. >> >> >> >> >> I.e. could the latter perhaps become an >> >> >> ASSERT() by connecting the two blocks suitably? I'm simply >> >> >> trying to limit the number of different cases to consider mentally >> >> >> when looking at this code ... >> >> > >> >> > If we get a true from above check, it means the vcpu was removed by >> >> > pi_wakeup_interrupt(), then we just need to return. If we get a false, >> >> > we will acquire the spinlock as the following code does, there are
[Xen-devel] [PATCH 1/3] x86/mmuext: tighten TLB flush address checks
Addresses passed by PV guests should be subjected to __addr_ok(), avoiding undue TLB flushes; . Signed-off-by: Jan Beulich --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -3268,8 +3268,9 @@ long do_mmuext_op( case MMUEXT_INVLPG_LOCAL: if ( unlikely(d != pg_owner) ) rc = -EPERM; -else if ( !paging_mode_enabled(d) || - paging_invlpg(curr, op.arg1.linear_addr) != 0 ) +else if ( !paging_mode_enabled(d) + ? __addr_ok(op.arg1.linear_addr) + : paging_invlpg(curr, op.arg1.linear_addr) ) flush_tlb_one_local(op.arg1.linear_addr); break; @@ -3290,7 +3291,7 @@ long do_mmuext_op( if ( op.cmd == MMUEXT_TLB_FLUSH_MULTI ) flush_tlb_mask(&pmask); -else +else if ( __addr_ok(op.arg1.linear_addr) ) flush_tlb_one_mask(&pmask, op.arg1.linear_addr); break; } @@ -3303,10 +3304,10 @@ long do_mmuext_op( break; case MMUEXT_INVLPG_ALL: -if ( likely(d == pg_owner) ) -flush_tlb_one_mask(d->domain_dirty_cpumask, op.arg1.linear_addr); -else +if ( unlikely(d != pg_owner) ) rc = -EPERM; +else if ( __addr_ok(op.arg1.linear_addr) ) +flush_tlb_one_mask(d->domain_dirty_cpumask, op.arg1.linear_addr); break; case MMUEXT_FLUSH_CACHE: --- a/xen/include/asm-x86/paging.h +++ b/xen/include/asm-x86/paging.h @@ -245,7 +245,9 @@ paging_fault(unsigned long va, struct cp * or 0 if it's safe not to do so. */ static inline int paging_invlpg(struct vcpu *v, unsigned long va) { -return is_canonical_address(va) && paging_get_hostmode(v)->invlpg(v, va); +return (paging_mode_external(v->domain) ? is_canonical_address(va) +: __addr_ok(va)) && + paging_get_hostmode(v)->invlpg(v, va); } /* Translate a guest virtual address to the frame number that the x86/mmuext: tighten TLB flush address checks Addresses passed by PV guests should be subjected to __addr_ok(), avoiding undue TLB flushes; . Signed-off-by: Jan Beulich --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -3268,8 +3268,9 @@ long do_mmuext_op( case MMUEXT_INVLPG_LOCAL: if ( unlikely(d != pg_owner) ) rc = -EPERM; -else if ( !paging_mode_enabled(d) || - paging_invlpg(curr, op.arg1.linear_addr) != 0 ) +else if ( !paging_mode_enabled(d) + ? __addr_ok(op.arg1.linear_addr) + : paging_invlpg(curr, op.arg1.linear_addr) ) flush_tlb_one_local(op.arg1.linear_addr); break; @@ -3290,7 +3291,7 @@ long do_mmuext_op( if ( op.cmd == MMUEXT_TLB_FLUSH_MULTI ) flush_tlb_mask(&pmask); -else +else if ( __addr_ok(op.arg1.linear_addr) ) flush_tlb_one_mask(&pmask, op.arg1.linear_addr); break; } @@ -3303,10 +3304,10 @@ long do_mmuext_op( break; case MMUEXT_INVLPG_ALL: -if ( likely(d == pg_owner) ) -flush_tlb_one_mask(d->domain_dirty_cpumask, op.arg1.linear_addr); -else +if ( unlikely(d != pg_owner) ) rc = -EPERM; +else if ( __addr_ok(op.arg1.linear_addr) ) +flush_tlb_one_mask(d->domain_dirty_cpumask, op.arg1.linear_addr); break; case MMUEXT_FLUSH_CACHE: --- a/xen/include/asm-x86/paging.h +++ b/xen/include/asm-x86/paging.h @@ -245,7 +245,9 @@ paging_fault(unsigned long va, struct cp * or 0 if it's safe not to do so. */ static inline int paging_invlpg(struct vcpu *v, unsigned long va) { -return is_canonical_address(va) && paging_get_hostmode(v)->invlpg(v, va); +return (paging_mode_external(v->domain) ? is_canonical_address(va) +: __addr_ok(va)) && + paging_get_hostmode(v)->invlpg(v, va); } /* Translate a guest virtual address to the frame number that the ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 2/3] x86/PV: relax LDT address check
There's no point placing restrictions on its address when the LDT size is zero. Also convert a local variable to a slightly more efficient type. Signed-off-by: Jan Beulich --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -3348,8 +3348,8 @@ long do_mmuext_op( case MMUEXT_SET_LDT: { -unsigned long ptr = op.arg1.linear_addr; -unsigned long ents = op.arg2.nr_ents; +unsigned int ents = op.arg2.nr_ents; +unsigned long ptr = ents ? op.arg1.linear_addr : 0; if ( unlikely(d != pg_owner) ) rc = -EPERM; @@ -3361,7 +3361,7 @@ long do_mmuext_op( else if ( ((ptr & (PAGE_SIZE - 1)) != 0) || !__addr_ok(ptr) || (ents > 8192) ) { -MEM_LOG("Bad args to SET_LDT: ptr=%lx, ents=%lx", ptr, ents); +MEM_LOG("Bad args to SET_LDT: ptr=%lx, ents=%x", ptr, ents); rc = -EINVAL; } else if ( (curr->arch.pv_vcpu.ldt_ents != ents) || x86/PV: relax LDT address check There's no point placing restrictions on its address when the LDT size is zero. Also convert a local variable to a slightly more efficient type. Signed-off-by: Jan Beulich --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -3348,8 +3348,8 @@ long do_mmuext_op( case MMUEXT_SET_LDT: { -unsigned long ptr = op.arg1.linear_addr; -unsigned long ents = op.arg2.nr_ents; +unsigned int ents = op.arg2.nr_ents; +unsigned long ptr = ents ? op.arg1.linear_addr : 0; if ( unlikely(d != pg_owner) ) rc = -EPERM; @@ -3361,7 +3361,7 @@ long do_mmuext_op( else if ( ((ptr & (PAGE_SIZE - 1)) != 0) || !__addr_ok(ptr) || (ents > 8192) ) { -MEM_LOG("Bad args to SET_LDT: ptr=%lx, ents=%lx", ptr, ents); +MEM_LOG("Bad args to SET_LDT: ptr=%lx, ents=%x", ptr, ents); rc = -EINVAL; } else if ( (curr->arch.pv_vcpu.ldt_ents != ents) || ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
Hi Jan, On Wed, Jan 20, 2016 at 05:01:40AM -0700, Jan Beulich wrote: On 20.01.16 at 12:40, wrote: >> Hi Jan, >> >> On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote: >> On 20.01.16 at 09:31, wrote: +/* + * Backend response + * + * cmd: command for operation on clk, same with the cmd in request. + * id: clk id, same with the id in request. + * success: indicate failure or success for the cmd. + * rate: clock rate. Used for get rate. + * + * cmd and id are filled by backend and passed to frontend to + * let frontend check whether the response is for the current + * request or not. + */ +struct xen_clkif_response { + uint32_t cmd; + uint32_t id; + uint32_t success; + uint64_t rate; +}; >>> >>>This isn't 32-/64-bit clean. Question is whether echoing cmd is really >>>needed. >> >> As Juergen suggested, use a request id. I'll fix this in V3. >> 32-/64-bit unclean, I can not get you here (: > >The layout of above structure may end up different for 32- and >64-bit guests, depending on the alignment of uint64_t. Thanks for teaching me. In V3, the layout will be changed to like this struct xen_clkif_response { uint32_t request_id; int32_t status; uint64_t rate; }; And more macro definitions for the status entry: #define XEN_CLK_OP_SUCCESS 0 #define XEN_CLK_ENABLE_FALIURE -1 #define XEN_CLK_DISABLE_FAILURE -2 #define XEN_CLK_PREPARE_FAILURE -3 #define XEN_CLK_UNPREPARE_FAILURE -4 #define XEN_CLK_SET_RATE_FAILURE -5 #define XEN_CLK_GET_RATE_FALIURE -6 > >>>And what I'm missing throughout the file is some description of how >>>clock events (interrupts?) are actually meant to make it into the >>>guest. >> >> I have a simple implementation v1 patch for linux, >> http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html. >> You can see how it works. > >No, sorry, that's not the point of the inquiry. It seems to me that >there are more aspects to this interface, not directly related to >the request/reply protocol written down here, which need to be >spelled out event if they don't require any particular definitions >or type declarations. I try to follow you about clock events(interrupts?), not sure whether I understand correct or not. clock in this file is the clk for a device. In linux side, it managed by clk subsystem, drivers/clk/xx. This is not the clock events or clock source in drivers/clocksource/xx. For the pvclk interface, there will be no interrupt for the guest. I'll add more texts in the file or commit log. Thanks, Peng. > >Jan > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 3/3] x86/paging: invlpg() hook returns boolean
... so make its return type reflect this. Signed-off-by: Jan Beulich --- a/xen/arch/x86/mm/hap/hap.c +++ b/xen/arch/x86/mm/hap/hap.c @@ -680,7 +680,7 @@ static int hap_page_fault(struct vcpu *v * HAP guests can handle invlpg without needing any action from Xen, so * should not be intercepting it. */ -static int hap_invlpg(struct vcpu *v, unsigned long va) +static bool_t hap_invlpg(struct vcpu *v, unsigned long va) { if (nestedhvm_enabled(v->domain)) { /* Emulate INVLPGA: --- a/xen/arch/x86/mm/shadow/multi.c +++ b/xen/arch/x86/mm/shadow/multi.c @@ -3510,11 +3510,12 @@ propagate: } -static int -sh_invlpg(struct vcpu *v, unsigned long va) -/* Called when the guest requests an invlpg. Returns 1 if the invlpg +/* + * Called when the guest requests an invlpg. Returns 1 if the invlpg * instruction should be issued on the hardware, or 0 if it's safe not - * to do so. */ + * to do so. + */ +static bool_t sh_invlpg(struct vcpu *v, unsigned long va) { mfn_t sl1mfn; shadow_l2e_t sl2e; --- a/xen/arch/x86/mm/shadow/none.c +++ b/xen/arch/x86/mm/shadow/none.c @@ -32,10 +32,10 @@ static int _page_fault(struct vcpu *v, u return 0; } -static int _invlpg(struct vcpu *v, unsigned long va) +static bool_t _invlpg(struct vcpu *v, unsigned long va) { ASSERT_UNREACHABLE(); -return -EOPNOTSUPP; +return 1; } static unsigned long _gva_to_gfn(struct vcpu *v, struct p2m_domain *p2m, --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -4392,8 +4392,7 @@ static int __do_update_va_mapping( switch ( (bmap_ptr = flags & ~UVMF_FLUSHTYPE_MASK) ) { case UVMF_LOCAL: -if ( !paging_mode_enabled(d) || - (paging_invlpg(v, va) != 0) ) +if ( !paging_mode_enabled(d) || paging_invlpg(v, va) ) flush_tlb_one_local(va); break; case UVMF_ALL: --- a/xen/include/asm-x86/paging.h +++ b/xen/include/asm-x86/paging.h @@ -104,7 +104,7 @@ struct shadow_paging_mode { struct paging_mode { int (*page_fault)(struct vcpu *v, unsigned long va, struct cpu_user_regs *regs); -int (*invlpg)(struct vcpu *v, unsigned long va); +bool_t(*invlpg)(struct vcpu *v, unsigned long va); unsigned long (*gva_to_gfn)(struct vcpu *v, struct p2m_domain *p2m, unsigned long va, @@ -243,7 +243,7 @@ paging_fault(unsigned long va, struct cp /* Handle invlpg requests on vcpus. * Returns 1 if the invlpg instruction should be issued on the hardware, * or 0 if it's safe not to do so. */ -static inline int paging_invlpg(struct vcpu *v, unsigned long va) +static inline bool_t paging_invlpg(struct vcpu *v, unsigned long va) { return (paging_mode_external(v->domain) ? is_canonical_address(va) : __addr_ok(va)) && x86/paging: invlpg() hook returns boolean ... so make its return type reflect this. Signed-off-by: Jan Beulich --- a/xen/arch/x86/mm/hap/hap.c +++ b/xen/arch/x86/mm/hap/hap.c @@ -680,7 +680,7 @@ static int hap_page_fault(struct vcpu *v * HAP guests can handle invlpg without needing any action from Xen, so * should not be intercepting it. */ -static int hap_invlpg(struct vcpu *v, unsigned long va) +static bool_t hap_invlpg(struct vcpu *v, unsigned long va) { if (nestedhvm_enabled(v->domain)) { /* Emulate INVLPGA: --- a/xen/arch/x86/mm/shadow/multi.c +++ b/xen/arch/x86/mm/shadow/multi.c @@ -3510,11 +3510,12 @@ propagate: } -static int -sh_invlpg(struct vcpu *v, unsigned long va) -/* Called when the guest requests an invlpg. Returns 1 if the invlpg +/* + * Called when the guest requests an invlpg. Returns 1 if the invlpg * instruction should be issued on the hardware, or 0 if it's safe not - * to do so. */ + * to do so. + */ +static bool_t sh_invlpg(struct vcpu *v, unsigned long va) { mfn_t sl1mfn; shadow_l2e_t sl2e; --- a/xen/arch/x86/mm/shadow/none.c +++ b/xen/arch/x86/mm/shadow/none.c @@ -32,10 +32,10 @@ static int _page_fault(struct vcpu *v, u return 0; } -static int _invlpg(struct vcpu *v, unsigned long va) +static bool_t _invlpg(struct vcpu *v, unsigned long va) { ASSERT_UNREACHABLE(); -return -EOPNOTSUPP; +return 1; } static unsigned long _gva_to_gfn(struct vcpu *v, struct p2m_domain *p2m, --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -4392,8 +4392,7 @@ static int __do_update_va_mapping( switch ( (bmap_ptr = flags & ~UVMF_FLUSHTYPE_MASK) ) { case UVMF_LOCAL: -if ( !paging_mode_enabled(d) || - (paging_invlpg(v, va) != 0) ) +if ( !paging_mode_enabled(d) || paging_invlpg(v, va) ) flush_tlb_one_local(va); break; case UVMF_ALL: --- a/xen/include/asm-x86
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
Hi Juergen, On Wed, Jan 20, 2016 at 01:11:39PM +0100, Juergen Gross wrote: >On 20/01/16 12:48, Peng Fan wrote: >> Hi Juergen, >> >> On Wed, Jan 20, 2016 at 11:40:55AM +0100, Juergen Gross wrote: >>> On 20/01/16 10:25, Peng Fan wrote: Hi Juergen, On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote: > On 20/01/16 09:31, Peng Fan wrote: >> Introduce pvclk interface which is useful when doing device passthrough >> on ARM platform. > > ... > >> +/* >> + * Frontend request >> + * >> + * cmd: command for operation on clk, should be XEN_CLK_[xx], >> + * excluding XEN_CLK_END. id is the >> + * id: clk id >> + * rate: clock rate. Used for set rate. > > Which unit? Hz? Yeah. Hz. I'll add comments in V3. > >> + */ >> +struct xen_clkif_request { >> +uint32_t cmd; >> +uint32_t id; >> +uint64_t rate; >> +}; >> +typedef struct xen_clkif_request xen_clkif_request_t; >> + >> +/* >> + * Backend response >> + * >> + * cmd: command for operation on clk, same with the cmd in request. >> + * id: clk id, same with the id in request. >> + * success: indicate failure or success for the cmd. > > Values? I'd like to let the frontend/backend driver to determine the value. In my implementation for linux, if the backend API supports return value, I'll fill the return value to success entry. If the backend API returns void, I'll just fill 0 to success entry. >>> >>> So please specify "0" to mean success and add some sensible error >>> values. There might be multiple frontend- and/or backend-variants which >>> all must agree using the same interface. >> >> Will change this to `int status`, as also observed by Jan. >> I'll also define macros such as "#define XEN_CLK_ENABLE_OK 0", "#define >> XEN_CLK_ENABLE_FAILURE -1" >> >>> >> + * rate: clock rate. Used for get rate. >> + * >> + * cmd and id are filled by backend and passed to frontend to >> + * let frontend check whether the response is for the current >> + * request or not. > > I'd rather let the frontend add a request Id to the request which > will be echoed here instead cmd and clock Id. If using request id, I think we can encode cmd and id into request id? >>> >>> This should just be an opaque value for the backend. The frontend is >>> free how to create this value, it should be unique for every outstanding >>> request of a domU, however. It could be built from cmd and id in case >>> there can't be multiple requests with the same cmd/id combination >>> active in a domU. >> >> Thanks for teaching me on this. So the request id should be globally unique >> for backend. So "random value(generated in frontend) + frontend domid" is ok >> for this? > >Being unique per frontend is enough, as each frontend will only ever see >responses for it's own requests. Make it as simple as possible. I guess >there will be a maximum of active requests possible, e.g. the number of >request slots in the request ring. You could use the index into the ring >as Id (pvSCSI frontend is doing it this way). Thanks for this info. In my case, such as let DomU handle uart2, I only need to let DomU handle uart2-root-clk. Later I will passthrough gpu to DomU, only gpu-root-clk needs be handled by DomU. In linux side, clk operation is exclusive for one device(not definitely sure), so the number of slots can be max number of devices that supported for device passthrough. I'll take pvSCSI for reference. Thanks, Peng. > >Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
>>> On 20.01.16 at 15:05, wrote: > Hi Jan, > On Wed, Jan 20, 2016 at 05:01:40AM -0700, Jan Beulich wrote: > On 20.01.16 at 12:40, wrote: >>> Hi Jan, >>> >>> On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote: >>> On 20.01.16 at 09:31, wrote: > +/* > + * Backend response > + * > + * cmd: command for operation on clk, same with the cmd in request. > + * id: clk id, same with the id in request. > + * success: indicate failure or success for the cmd. > + * rate: clock rate. Used for get rate. > + * > + * cmd and id are filled by backend and passed to frontend to > + * let frontend check whether the response is for the current > + * request or not. > + */ > +struct xen_clkif_response { > + uint32_t cmd; > + uint32_t id; > + uint32_t success; > + uint64_t rate; > +}; This isn't 32-/64-bit clean. Question is whether echoing cmd is really needed. >>> >>> As Juergen suggested, use a request id. I'll fix this in V3. >>> 32-/64-bit unclean, I can not get you here (: >> >>The layout of above structure may end up different for 32- and >>64-bit guests, depending on the alignment of uint64_t. > > Thanks for teaching me. In V3, the layout will be changed to like this > struct xen_clkif_response { > uint32_t request_id; > int32_t status; > uint64_t rate; > }; Okay. Just as a not - iirc other pv interfaces use 64-bit ID values, so not doing this here perhaps ought to be justified. > And more macro definitions for the status entry: > #define XEN_CLK_OP_SUCCESS 0 > #define XEN_CLK_ENABLE_FALIURE -1 > #define XEN_CLK_DISABLE_FAILURE -2 > #define XEN_CLK_PREPARE_FAILURE -3 > #define XEN_CLK_UNPREPARE_FAILURE -4 > #define XEN_CLK_SET_RATE_FAILURE -5 > #define XEN_CLK_GET_RATE_FALIURE -6 That's bogus - different kinds of errors would be meaningful, but an error per command is quite pointless imo. (Also please be aware of typos and parenthesization.) And what I'm missing throughout the file is some description of how clock events (interrupts?) are actually meant to make it into the guest. >>> >>> I have a simple implementation v1 patch for linux, >>> http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html. >>> You can see how it works. >> >>No, sorry, that's not the point of the inquiry. It seems to me that >>there are more aspects to this interface, not directly related to >>the request/reply protocol written down here, which need to be >>spelled out event if they don't require any particular definitions >>or type declarations. > > I try to follow you about clock events(interrupts?), not sure whether I > understand correct or not. > clock in this file is the clk for a device. In linux side, it managed by clk > subsystem, drivers/clk/xx. > This is not the clock events or clock source in drivers/clocksource/xx. > For the pvclk interface, there will be no interrupt for the guest. Then (also considering the set of commands you propose) what use is the clock to the guest? It can't get events from it, it can't read its current value, all it can is get/set its rate, enable/disable, and prepare/unprepare it. I may be lacking some ARM knowledge here, but all of this looks pretty odd to me. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/3] x86/mmuext: tighten TLB flush address checks
On 20/01/16 14:05, Jan Beulich wrote: > Addresses passed by PV guests should be subjected to __addr_ok(), > avoiding undue TLB flushes; . > > Signed-off-by: Jan Beulich > > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -3268,8 +3268,9 @@ long do_mmuext_op( > case MMUEXT_INVLPG_LOCAL: > if ( unlikely(d != pg_owner) ) > rc = -EPERM; > -else if ( !paging_mode_enabled(d) || > - paging_invlpg(curr, op.arg1.linear_addr) != 0 ) > +else if ( !paging_mode_enabled(d) > + ? __addr_ok(op.arg1.linear_addr) > + : paging_invlpg(curr, op.arg1.linear_addr) ) paging_mode_enabled() changes depending on whether the guest has logdirty currently enabled. As you have also patched paging_invlpg() to DTRT with __addr_ok(), I think this hunk can be dropped. Everything else, Reviewed-by: Andrew Cooper ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [qemu-mainline baseline-only test] 38665: trouble: broken/fail/pass
This run is configured for baseline tests only. flight 38665 qemu-mainline real [real] http://osstest.xs.citrite.net/~osstest/testlogs/logs/38665/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-amd64-pvgrub 3 host-install(3)broken REGR. vs. 38647 test-amd64-i386-xl-qemuu-debianhvm-amd64 3 host-install(3) broken REGR. vs. 38647 test-amd64-i386-xl-xsm3 host-install(3) broken REGR. vs. 38647 test-amd64-amd64-xl-qcow2 3 host-install(3) broken REGR. vs. 38647 Regressions which are regarded as allowable (not blocking): test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 3 host-install(3) broken REGR. vs. 38647 test-amd64-amd64-libvirt-xsm 3 host-install(3) broken REGR. vs. 38647 test-amd64-amd64-xl-pvh-amd 3 host-install(3) broken like 38647 test-amd64-amd64-qemuu-nested-amd 3 host-install(3) broken like 38647 test-amd64-amd64-xl-qemuu-win7-amd64 3 host-install(3) broken like 38647 test-amd64-amd64-libvirt-pair 4 host-install/dst_host(4)broken like 38647 test-amd64-amd64-libvirt-pair 3 host-install/src_host(3)broken like 38647 test-amd64-amd64-pair 3 host-install/src_host(3) broken like 38647 test-amd64-amd64-pair 4 host-install/dst_host(4) broken like 38647 test-amd64-amd64-i386-pvgrub 10 guest-start fail blocked in 38647 test-amd64-amd64-xl 19 guest-start/debian.repeatfail like 38647 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail like 38647 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail never pass test-armhf-armhf-libvirt 14 guest-saverestorefail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-midway 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-midway 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 guest-saverestorefail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass version targeted for testing: qemuu19b6d84316892c8086e0115d6f09cb01abb86cfc baseline version: qemuu5a57acb66f19ee52723aa05b8afbbc41c3e9ec99 Last test of basis38647 2016-01-16 22:18:51 Z3 days Testing same since38665 2016-01-19 09:20:50 Z1 days1 attempts People who touched revisions under test: Cao jin Daniel P. Berrange Fam Zheng Michael S. Tsirkin P J P Paolo Bonzini Peter Maydell Prasad J Pandit Shmulik Ladkani Zhu Lingshan jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386
[Xen-devel] [xen-unstable-smoke test] 78617: regressions - FAIL
flight 78617 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/78617/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64 5 xen-build fail REGR. vs. 78553 Tests which did not succeed, but are not blocking: build-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-i386 1 build-check(1) blocked n/a test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass version targeted for testing: xen bf05e88ed7342a91cceba050b6c622accb809842 baseline version: xen 1949868d640427dc91bfb23741d78eb1d86841c8 Last test of basis78553 2016-01-19 20:02:25 Z0 days Testing same since78617 2016-01-20 13:07:12 Z0 days1 attempts People who touched revisions under test: Ian Campbell Jan Beulich jobs: build-amd64 fail build-armhf pass build-amd64-libvirt blocked test-armhf-armhf-xl pass test-amd64-amd64-xl-qemuu-debianhvm-i386 blocked test-amd64-amd64-libvirt blocked sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. commit bf05e88ed7342a91cceba050b6c622accb809842 Author: Jan Beulich Date: Wed Jan 20 13:50:10 2016 +0100 x86/VMX: prevent INVVPID failure due to non-canonical guest address While INVLPG (and on SVM INVLPGA) don't fault on non-canonical addresses, INVVPID fails (in the "individual address" case) when passed such an address. Since such intercepted INVLPG are effectively no-ops anyway, don't fix this in vmx_invlpg_intercept(), but instead have paging_invlpg() never return true in such a case. This is CVE-2016-1571 / XSA-168. Signed-off-by: Jan Beulich Reviewed-by: Andrew Cooper Acked-by: Ian Campbell commit 47abf29a9255b2e7b94e56d66b455d0a584b68b8 Author: Jan Beulich Date: Wed Jan 20 13:49:23 2016 +0100 x86/mm: PV superpage handling lacks sanity checks MMUEXT_{,UN}MARK_SUPER fail to check the input MFN for validity before dereferencing pointers into the superpage frame table. Reported-by: Qinghao Tang get_superpage() has a similar issue. This is CVE-2016-1570 / XSA-167. Signed-off-by: Jan Beulich Acked-by: Ian Campbell (qemu changes not included) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 0/4] add support for vNVDIMM
On 01/20/16 12:43, Stefano Stabellini wrote: > On Wed, 20 Jan 2016, Tian, Kevin wrote: > > > From: Zhang, Haozhong > > > Sent: Tuesday, December 29, 2015 7:32 PM > > > > > > This patch series is the Xen part patch to provide virtual NVDIMM to > > > guest. The corresponding QEMU patch series is sent separately with the > > > title "[PATCH 0/2] add vNVDIMM support for Xen". > > > > > > * Background > > > > > > NVDIMM (Non-Volatile Dual In-line Memory Module) is going to be > > > supported on Intel's platform. NVDIMM devices are discovered via ACPI > > > and configured by _DSM method of NVDIMM device in ACPI. Some > > > documents can be found at > > > [1] ACPI 6: > > > http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf > > > [2] NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf > > > [3] DSM Interface Example: > > > http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf > > > [4] Driver Writer's Guide: > > > http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf > > > > > > The upstream QEMU (commits 5c42eef ~ 70d1fb9) has added support to > > > provide virtual NVDIMM in PMEM mode, in which NVDIMM devices are > > > mapped into CPU's address space and are accessed via normal memory > > > read/write and three special instructions (clflushopt/clwb/pcommit). > > > > > > This patch series and the corresponding QEMU patch series enable Xen > > > to provide vNVDIMM devices to HVM domains. > > > > > > * Design > > > > > > Supporting vNVDIMM in PMEM mode has three requirements. > > > > > > > Although this design is about vNVDIMM, some background of how pNVDIMM > > is managed in Xen would be helpful to understand the whole design since > > in PMEM mode you need map pNVDIMM into GFN addr space so there's > > a matter of how pNVDIMM is allocated. > > Yes, some background would be very helpful. Given that there are so many > moving parts on this (Xen, the Dom0 kernel, QEMU, hvmloader, libxl) > I suggest that we start with a design document for this feature. Let me prepare a design document. Basically, it would include following contents. Please let me know if you want anything additional to be included. * What NVDIMM is and how it is used * Software interface of NVDIMM - ACPI NFIT: what parameters are recorded and their usage - ACPI SSDT: what _DSM methods are provided and their functionality - New instructions: clflushopt/clwb/pcommit * How the linux kernel drives NVDIMM - ACPI parsing - Block device interface - Partition NVDIMM devices * How KVM/QEMU implements vNVDIMM * What I propose to implement vNVDIMM in Xen - Xen hypervisor/toolstack: new instruction enabling and address mapping - Dom0 Linux kernel: host NVDIMM driver - QEMU: virtual NFIT/SSDT, _DSM handling, and role in address mapping Haozhong ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/3] x86/PV: relax LDT address check
On 20/01/16 14:06, Jan Beulich wrote: > There's no point placing restrictions on its address when the LDT size > is zero. > > Also convert a local variable to a slightly more efficient type. > > Signed-off-by: Jan Beulich Reviewed-by: Andrew Cooper ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/3] x86/paging: invlpg() hook returns boolean
On 20/01/16 14:07, Jan Beulich wrote: > ... so make its return type reflect this. > > Signed-off-by: Jan Beulich Reviewed-by: Andrew Cooper , although this needs MM and shadow acks as well (CC'd). ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
On Wed, 20 Jan 2016, Andrew Cooper wrote: > On 20/01/16 10:36, Xiao Guangrong wrote: > > > > Hi, > > > > On 01/20/2016 06:15 PM, Haozhong Zhang wrote: > > > >> CCing QEMU vNVDIMM maintainer: Xiao Guangrong > >> > >>> Conceptually, an NVDIMM is just like a fast SSD which is linearly > >>> mapped > >>> into memory. I am still on the dom0 side of this fence. > >>> > >>> The real question is whether it is possible to take an NVDIMM, split it > >>> in half, give each half to two different guests (with appropriate NFIT > >>> tables) and that be sufficient for the guests to just work. > >>> > >> > >> Yes, one NVDIMM device can be split into multiple parts and assigned > >> to different guests, and QEMU is responsible to maintain virtual NFIT > >> tables for each part. > >> > >>> Either way, it needs to be a toolstack policy decision as to how to > >>> split the resource. > > > > Currently, we are using NVDIMM as a block device and a DAX-based > > filesystem > > is created upon it in Linux so that file-related accesses directly reach > > the NVDIMM device. > > > > In KVM, If the NVDIMM device need to be shared by different VMs, we can > > create multiple files on the DAX-based filesystem and assign the file to > > each VMs. In the future, we can enable namespace (partition-like) for > > PMEM > > memory and assign the namespace to each VMs (current Linux driver uses > > the > > whole PMEM as a single namespace). > > > > I think it is not a easy work to let Xen hypervisor recognize NVDIMM > > device > > and manager NVDIMM resource. > > > > Thanks! > > > > The more I see about this, the more sure I am that we want to keep it as > a block device managed by dom0. > > In the case of the DAX-based filesystem, I presume files are not > necessarily contiguous. I also presume that this is worked around by > permuting the mapping of the virtual NVDIMM such that the it appears as > a contiguous block of addresses to the guest? > > Today in Xen, Qemu already has the ability to create mappings in the > guest's address space, e.g. to map PCI device BARs. I don't see a > conceptual difference here, although the security/permission model > certainly is more complicated. I imagine that mmap'ing these /dev/pmemXX devices require root privileges, does it not? I wouldn't encourage the introduction of anything else that requires root privileges in QEMU. With QEMU running as non-root by default in 4.7, the feature will not be available unless users explicitly ask to run QEMU as root (which they shouldn't really). ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v6 3/5] build: Alloc space for sched list in the link file
> On Jan 15, 2016, at 11:01 AM, Jonathan Creekmore > wrote: > > Creates a section to contain scheduler entry pointers that are gathered > together into an array. This will allow, in a follow-on patch, scheduler > entries to be automatically gathered together into the array for > automatic parsing. > > CC: Ian Campbell > CC: Stefano Stabellini > CC: Keir Fraser > CC: Jan Beulich > CC: Andrew Cooper > Signed-off-by: Jonathan Creekmore > Reviewed-by: Andrew Cooper > Reviewed-by: Doug Goldstein > Reviewed-by: Konrad Rzeszutek Wilk > > --- > Changed since v4: > * Remove defensive check for schedulers since the credit scheduler >must always be present > > Changed since v3: > * Add defensive check for schedulers in the linker > > Changed since v1: > * rename the __start and __end symbols to better match >the rest of the file > --- > xen/arch/arm/xen.lds.S | 4 > xen/arch/x86/xen.lds.S | 4 > 2 files changed, 8 insertions(+) > > diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S > index 0488f37..f501a2f 100644 > --- a/xen/arch/arm/xen.lds.S > +++ b/xen/arch/arm/xen.lds.S > @@ -57,6 +57,10 @@ SECTIONS >. = ALIGN(PAGE_SIZE); >*(.data.page_aligned) >*(.data) > + . = ALIGN(8); > + __start_schedulers_array = .; > + *(.data.schedulers) > + __end_schedulers_array = .; >*(.data.rel) >*(.data.rel.*) >CONSTRUCTORS > diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S > index e18e08f..c1ce027 100644 > --- a/xen/arch/x86/xen.lds.S > +++ b/xen/arch/x86/xen.lds.S > @@ -80,6 +80,10 @@ SECTIONS >__stop___pre_ex_table = .; > >*(.data.read_mostly) > + . = ALIGN(8); > + __start_schedulers_array = .; > + *(.data.schedulers) > + __end_schedulers_array = .; >*(.data.rel.ro) >*(.data.rel.ro.*) > } :text > -- > 2.6.4 I am pretty sure, with Dario’s latest ACK on (2/5), that this patch is the only one in the series that has not been ACKed yet. Is there anything else that I need to do to get this series in, especially since the (1/5) CONFIG_EXPERT patch has already landed? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 0/4] add support for vNVDIMM
On Wed, 20 Jan 2016, Zhang, Haozhong wrote: > On 01/20/16 12:43, Stefano Stabellini wrote: > > On Wed, 20 Jan 2016, Tian, Kevin wrote: > > > > From: Zhang, Haozhong > > > > Sent: Tuesday, December 29, 2015 7:32 PM > > > > > > > > This patch series is the Xen part patch to provide virtual NVDIMM to > > > > guest. The corresponding QEMU patch series is sent separately with the > > > > title "[PATCH 0/2] add vNVDIMM support for Xen". > > > > > > > > * Background > > > > > > > > NVDIMM (Non-Volatile Dual In-line Memory Module) is going to be > > > > supported on Intel's platform. NVDIMM devices are discovered via ACPI > > > > and configured by _DSM method of NVDIMM device in ACPI. Some > > > > documents can be found at > > > > [1] ACPI 6: > > > > http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf > > > > [2] NVDIMM Namespace: > > > > http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf > > > > [3] DSM Interface Example: > > > > http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf > > > > [4] Driver Writer's Guide: > > > > http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf > > > > > > > > The upstream QEMU (commits 5c42eef ~ 70d1fb9) has added support to > > > > provide virtual NVDIMM in PMEM mode, in which NVDIMM devices are > > > > mapped into CPU's address space and are accessed via normal memory > > > > read/write and three special instructions (clflushopt/clwb/pcommit). > > > > > > > > This patch series and the corresponding QEMU patch series enable Xen > > > > to provide vNVDIMM devices to HVM domains. > > > > > > > > * Design > > > > > > > > Supporting vNVDIMM in PMEM mode has three requirements. > > > > > > > > > > Although this design is about vNVDIMM, some background of how pNVDIMM > > > is managed in Xen would be helpful to understand the whole design since > > > in PMEM mode you need map pNVDIMM into GFN addr space so there's > > > a matter of how pNVDIMM is allocated. > > > > Yes, some background would be very helpful. Given that there are so many > > moving parts on this (Xen, the Dom0 kernel, QEMU, hvmloader, libxl) > > I suggest that we start with a design document for this feature. > > Let me prepare a design document. Basically, it would include > following contents. Please let me know if you want anything additional > to be included. Thank you! > * What NVDIMM is and how it is used > * Software interface of NVDIMM > - ACPI NFIT: what parameters are recorded and their usage > - ACPI SSDT: what _DSM methods are provided and their functionality > - New instructions: clflushopt/clwb/pcommit > * How the linux kernel drives NVDIMM > - ACPI parsing > - Block device interface > - Partition NVDIMM devices > * How KVM/QEMU implements vNVDIMM This is a very good start. > * What I propose to implement vNVDIMM in Xen > - Xen hypervisor/toolstack: new instruction enabling and address mapping > - Dom0 Linux kernel: host NVDIMM driver > - QEMU: virtual NFIT/SSDT, _DSM handling, and role in address mapping This is OK. It might be also good to list other options that were discussed, but it is certainly not necessary in first instance. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [distros-debian-squeeze test] 38672: trouble: blocked/broken/pass
flight 38672 distros-debian-squeeze real [real] http://osstest.xs.citrite.net/~osstest/testlogs/logs/38672/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64 3 host-install(3) broken REGR. vs. 38630 Tests which did not succeed, but are not blocking: test-amd64-amd64-i386-squeeze-netboot-pygrub 1 build-check(1) blocked n/a test-amd64-amd64-amd64-squeeze-netboot-pygrub 1 build-check(1)blocked n/a test-amd64-i386-amd64-squeeze-netboot-pygrub 1 build-check(1) blocked n/a test-amd64-i386-i386-squeeze-netboot-pygrub 1 build-check(1) blocked n/a baseline version: flight 38630 jobs: build-amd64 broken build-armhf pass build-i386 pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-amd64-squeeze-netboot-pygrubblocked test-amd64-i386-amd64-squeeze-netboot-pygrub blocked test-amd64-amd64-i386-squeeze-netboot-pygrub blocked test-amd64-i386-i386-squeeze-netboot-pygrub blocked sg-report-flight on osstest.xs.citrite.net logs: /home/osstest/logs images: /home/osstest/images Logs, config files, etc. are available at http://osstest.xs.citrite.net/~osstest/testlogs/logs Test harness code can be found at http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary Push not applicable. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface
Hi Jan, On Wed, Jan 20, 2016 at 07:16:36AM -0700, Jan Beulich wrote: On 20.01.16 at 15:05, wrote: >> Hi Jan, >> On Wed, Jan 20, 2016 at 05:01:40AM -0700, Jan Beulich wrote: >> On 20.01.16 at 12:40, wrote: Hi Jan, On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote: On 20.01.16 at 09:31, wrote: >> +/* >> + * Backend response >> + * >> + * cmd: command for operation on clk, same with the cmd in request. >> + * id: clk id, same with the id in request. >> + * success: indicate failure or success for the cmd. >> + * rate: clock rate. Used for get rate. >> + * >> + * cmd and id are filled by backend and passed to frontend to >> + * let frontend check whether the response is for the current >> + * request or not. >> + */ >> +struct xen_clkif_response { >> +uint32_t cmd; >> +uint32_t id; >> +uint32_t success; >> +uint64_t rate; >> +}; > >This isn't 32-/64-bit clean. Question is whether echoing cmd is really >needed. As Juergen suggested, use a request id. I'll fix this in V3. 32-/64-bit unclean, I can not get you here (: >>> >>>The layout of above structure may end up different for 32- and >>>64-bit guests, depending on the alignment of uint64_t. >> >> Thanks for teaching me. In V3, the layout will be changed to like this >> struct xen_clkif_response { >> uint32_t request_id; >> int32_t status; >> uint64_t rate; >> }; > >Okay. Just as a not - iirc other pv interfaces use 64-bit ID values, >so not doing this here perhaps ought to be justified. oh. I see uint64_t id in blkif.h :). Thanks. > >> And more macro definitions for the status entry: >> #define XEN_CLK_OP_SUCCESS 0 >> #define XEN_CLK_ENABLE_FALIURE -1 >> #define XEN_CLK_DISABLE_FAILURE -2 >> #define XEN_CLK_PREPARE_FAILURE -3 >> #define XEN_CLK_UNPREPARE_FAILURE -4 >> #define XEN_CLK_SET_RATE_FAILURE -5 >> #define XEN_CLK_GET_RATE_FALIURE -6 > >That's bogus - different kinds of errors would be meaningful, >but an error per command is quite pointless imo. (Also please >be aware of typos and parenthesization.) Will fine tune this in V3. > >And what I'm missing throughout the file is some description of how >clock events (interrupts?) are actually meant to make it into the >guest. I have a simple implementation v1 patch for linux, http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html. You can see how it works. >>> >>>No, sorry, that's not the point of the inquiry. It seems to me that >>>there are more aspects to this interface, not directly related to >>>the request/reply protocol written down here, which need to be >>>spelled out event if they don't require any particular definitions >>>or type declarations. >> >> I try to follow you about clock events(interrupts?), not sure whether I >> understand correct or not. >> clock in this file is the clk for a device. In linux side, it managed by clk >> subsystem, drivers/clk/xx. >> This is not the clock events or clock source in drivers/clocksource/xx. >> For the pvclk interface, there will be no interrupt for the guest. > >Then (also considering the set of commands you propose) what >use is the clock to the guest? It can't get events from it, it can't >read its current value, all it can is get/set its rate, enable/disable, >and prepare/unprepare it. I may be lacking some ARM knowledge >here, but all of this looks pretty odd to me. I follow this https://events.linuxfoundation.org/sites/events/files/slides/talk_5.pdf to do platform device passthrough on ARM platform. I met the same issue as note in the ppt, at page 24. In my test case, the uart driver works well without xen. There is serveral function calls in the driver, such as clk = clk_get("uart2_root"),rate = clk_get_rate(clk), use rate to set baudrate for uart. There is a clk tree in kernel without XEN or dom0 kernel like the following (only an example): osc - |-->pll1 |-->pll2 |-->pll2_div |-->uart2_gate |-->uart2_root --> clk for uart2 uart2_root is directly handled by Dom0.If I assign uart2 to DomU, DomU does not have the clk tree as above, so DomU can not handle directly handle uart2_root and the uart2 driver in DomU will run into failure to intialize the uart2 hardware IP. So I introudce pvclk. Pass the operation for uart2_root in DomU to Dom0 and Dom0 directly handle the clock management hardware IP. Hope this is clear. Thanks, Peng. > >Jan > ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/3] x86/paging: invlpg() hook returns boolean
At 07:07 -0700 on 20 Jan (1453273623), Jan Beulich wrote: > ... so make its return type reflect this. > > Signed-off-by: Jan Beulich Acked-by: Tim Deegan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/3] x86/mmuext: tighten TLB flush address checks
>>> On 20.01.16 at 15:23, wrote: > On 20/01/16 14:05, Jan Beulich wrote: >> Addresses passed by PV guests should be subjected to __addr_ok(), >> avoiding undue TLB flushes; . >> >> Signed-off-by: Jan Beulich >> >> --- a/xen/arch/x86/mm.c >> +++ b/xen/arch/x86/mm.c >> @@ -3268,8 +3268,9 @@ long do_mmuext_op( >> case MMUEXT_INVLPG_LOCAL: >> if ( unlikely(d != pg_owner) ) >> rc = -EPERM; >> -else if ( !paging_mode_enabled(d) || >> - paging_invlpg(curr, op.arg1.linear_addr) != 0 ) >> +else if ( !paging_mode_enabled(d) >> + ? __addr_ok(op.arg1.linear_addr) >> + : paging_invlpg(curr, op.arg1.linear_addr) ) > > paging_mode_enabled() changes depending on whether the guest has > logdirty currently enabled. > > As you have also patched paging_invlpg() to DTRT with __addr_ok(), I > think this hunk can be dropped. How that? Without the change there's no address validation at all when !paging_mode_enabled(d). > Everything else, Reviewed-by: Andrew Cooper Please clarify. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
On 01/20/16 14:29, Stefano Stabellini wrote: > On Wed, 20 Jan 2016, Andrew Cooper wrote: > > On 20/01/16 10:36, Xiao Guangrong wrote: > > > > > > Hi, > > > > > > On 01/20/2016 06:15 PM, Haozhong Zhang wrote: > > > > > >> CCing QEMU vNVDIMM maintainer: Xiao Guangrong > > >> > > >>> Conceptually, an NVDIMM is just like a fast SSD which is linearly > > >>> mapped > > >>> into memory. I am still on the dom0 side of this fence. > > >>> > > >>> The real question is whether it is possible to take an NVDIMM, split it > > >>> in half, give each half to two different guests (with appropriate NFIT > > >>> tables) and that be sufficient for the guests to just work. > > >>> > > >> > > >> Yes, one NVDIMM device can be split into multiple parts and assigned > > >> to different guests, and QEMU is responsible to maintain virtual NFIT > > >> tables for each part. > > >> > > >>> Either way, it needs to be a toolstack policy decision as to how to > > >>> split the resource. > > > > > > Currently, we are using NVDIMM as a block device and a DAX-based > > > filesystem > > > is created upon it in Linux so that file-related accesses directly reach > > > the NVDIMM device. > > > > > > In KVM, If the NVDIMM device need to be shared by different VMs, we can > > > create multiple files on the DAX-based filesystem and assign the file to > > > each VMs. In the future, we can enable namespace (partition-like) for > > > PMEM > > > memory and assign the namespace to each VMs (current Linux driver uses > > > the > > > whole PMEM as a single namespace). > > > > > > I think it is not a easy work to let Xen hypervisor recognize NVDIMM > > > device > > > and manager NVDIMM resource. > > > > > > Thanks! > > > > > > > The more I see about this, the more sure I am that we want to keep it as > > a block device managed by dom0. > > > > In the case of the DAX-based filesystem, I presume files are not > > necessarily contiguous. I also presume that this is worked around by > > permuting the mapping of the virtual NVDIMM such that the it appears as > > a contiguous block of addresses to the guest? > > > > Today in Xen, Qemu already has the ability to create mappings in the > > guest's address space, e.g. to map PCI device BARs. I don't see a > > conceptual difference here, although the security/permission model > > certainly is more complicated. > > I imagine that mmap'ing these /dev/pmemXX devices require root > privileges, does it not? > Yes, unless we assign non-root access permissions to /dev/pmemXX (but this is not the default behavior of linux kernel so far). > I wouldn't encourage the introduction of anything else that requires > root privileges in QEMU. With QEMU running as non-root by default in > 4.7, the feature will not be available unless users explicitly ask to > run QEMU as root (which they shouldn't really). > Yes, I'll include those privileged operations in the design document. Haozhong > ___ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/3] x86/paging: invlpg() hook returns boolean
>>> On 20.01.16 at 15:30, wrote: > On 20/01/16 14:07, Jan Beulich wrote: >> ... so make its return type reflect this. >> >> Signed-off-by: Jan Beulich > > Reviewed-by: Andrew Cooper , although this > needs MM and shadow acks as well (CC'd). Oh, sure - I meant to but then forgot; thanks for noticing! Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu
On 20/01/16 14:29, Stefano Stabellini wrote: > On Wed, 20 Jan 2016, Andrew Cooper wrote: >> On 20/01/16 10:36, Xiao Guangrong wrote: >>> Hi, >>> >>> On 01/20/2016 06:15 PM, Haozhong Zhang wrote: >>> CCing QEMU vNVDIMM maintainer: Xiao Guangrong > Conceptually, an NVDIMM is just like a fast SSD which is linearly > mapped > into memory. I am still on the dom0 side of this fence. > > The real question is whether it is possible to take an NVDIMM, split it > in half, give each half to two different guests (with appropriate NFIT > tables) and that be sufficient for the guests to just work. > Yes, one NVDIMM device can be split into multiple parts and assigned to different guests, and QEMU is responsible to maintain virtual NFIT tables for each part. > Either way, it needs to be a toolstack policy decision as to how to > split the resource. >>> Currently, we are using NVDIMM as a block device and a DAX-based >>> filesystem >>> is created upon it in Linux so that file-related accesses directly reach >>> the NVDIMM device. >>> >>> In KVM, If the NVDIMM device need to be shared by different VMs, we can >>> create multiple files on the DAX-based filesystem and assign the file to >>> each VMs. In the future, we can enable namespace (partition-like) for >>> PMEM >>> memory and assign the namespace to each VMs (current Linux driver uses >>> the >>> whole PMEM as a single namespace). >>> >>> I think it is not a easy work to let Xen hypervisor recognize NVDIMM >>> device >>> and manager NVDIMM resource. >>> >>> Thanks! >>> >> The more I see about this, the more sure I am that we want to keep it as >> a block device managed by dom0. >> >> In the case of the DAX-based filesystem, I presume files are not >> necessarily contiguous. I also presume that this is worked around by >> permuting the mapping of the virtual NVDIMM such that the it appears as >> a contiguous block of addresses to the guest? >> >> Today in Xen, Qemu already has the ability to create mappings in the >> guest's address space, e.g. to map PCI device BARs. I don't see a >> conceptual difference here, although the security/permission model >> certainly is more complicated. > I imagine that mmap'ing these /dev/pmemXX devices require root > privileges, does it not? I presume it does, although mmap()ing a file on a DAX filesystem will work in the standard POSIX way. Neither of these are sufficient however. That gets Qemu a mapping of the NVDIMM, not the guest. Something, one way or another, has to turn this into appropriate add-to-phymap hypercalls. > > I wouldn't encourage the introduction of anything else that requires > root privileges in QEMU. With QEMU running as non-root by default in > 4.7, the feature will not be available unless users explicitly ask to > run QEMU as root (which they shouldn't really). This isn't how design works. First, design a feature in an architecturally correct way, and then design an security policy to fit. (note, both before implement happens). We should not stunt design based on an existing implementation. In particular, if design shows that being a root only feature is the only sane way of doing this, it should be a root only feature. (I hope this is not the case, but it shouldn't cloud the judgement of a design). ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel