Re: [RFC PATCH 0/6] ARM64: KVM: PMU infrastructure support
On Thu, Nov 20, 2014 at 08:17:32PM +0530, Anup Patel wrote: > On Wed, Nov 19, 2014 at 8:59 PM, Christoffer Dall > wrote: > > On Tue, Nov 11, 2014 at 02:48:25PM +0530, Anup Patel wrote: > >> Hi All, > >> > >> I have second thoughts about rebasing KVM PMU patches > >> to Marc's irq-forwarding patches. > >> > >> The PMU IRQs (when virtualized by KVM) are not exactly > >> forwarded IRQs because they are shared between Host > >> and Guest. > >> > >> Scenario1 > >> - > >> > >> We might have perf running on Host and no KVM guest > >> running. In this scenario, we wont get interrupts on Host > >> because the kvm_pmu_hyp_init() (similar to the function > >> kvm_timer_hyp_init() of Marc's IRQ-forwarding > >> implementation) has put all host PMU IRQs in forwarding > >> mode. > >> > >> The only way solve this problem is to not set forwarding > >> mode for PMU IRQs in kvm_pmu_hyp_init() and instead > >> have special routines to turn on and turn off the forwarding > >> mode of PMU IRQs. These routines will be called from > >> kvm_arch_vcpu_ioctl_run() for toggling the PMU IRQ > >> forwarding state. > >> > >> Scenario2 > >> - > >> > >> We might have perf running on Host and Guest simultaneously > >> which means it is quite likely that PMU HW trigger IRQ meant > >> for Host between "ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);" > >> and "kvm_pmu_sync_hwstate(vcpu);" (similar to timer sync routine > >> of Marc's patchset which is called before local_irq_enable()). > >> > >> In this scenario, the updated kvm_pmu_sync_hwstate(vcpu) > >> will accidentally forward IRQ meant for Host to Guest unless > >> we put additional checks to inspect VCPU PMU state. > >> > >> Am I missing any detail about IRQ forwarding for above > >> scenarios? > >> > > Hi Anup, > > Hi Christoffer, > > > > > I briefly discussed this with Marc. What I don't understand is how it > > would be possible to get an interrupt for the host while running the > > guest? > > > > The rationale behind my question is that whenever you're running the > > guest, the PMU should be programmed exclusively with guest state, and > > since the PMU is per core, any interrupts should be for the guest, where > > it would always be pending. > > Yes, thats right PMU is programmed exclusively for guest when > guest is running and for host when host is running. > > Let us assume a situation (Scenario2 mentioned previously) > where both host and guest are using PMU. When the guest is > running we come back to host mode due to variety of reasons > (stage2 fault, guest IO, regular host interrupt, host interrupt > meant for guest, ) which means we will return from the > "ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);" statement in the > kvm_arch_vcpu_ioctl_run() function with local IRQs disabled. > At this point we would have restored back host PMU context and > any PMU counter used by host can trigger PMU overflow interrup > for host. Now we will be having "kvm_pmu_sync_hwstate(vcpu);" > in the kvm_arch_vcpu_ioctl_run() function (similar to the > kvm_timer_sync_hwstate() of Marc's IRQ forwarding patchset) > which will try to detect PMU irq forwarding state in GIC hence it > can accidentally discover PMU irq pending for guest while this > PMU irq is actually meant for host. > > This above mentioned situation does not happen for timer > because virtual timer interrupts are exclusively used for guest. > The exclusive use of virtual timer interrupt for guest ensures that > the function kvm_timer_sync_hwstate() will always see correct > state of virtual timer IRQ from GIC. 
> I'm not quite following. When you call kvm_pmu_sync_hwstate(vcpu) in the non-preemtible section, you would (1) capture the active state of the IRQ pertaining to the guest and (2) deactive the IRQ on the host, then (3) switch the state of the PMU to the host state, and finally (4) re-enable IRQs on the CPU you're running on. If the host PMU state restored in (3) causes the PMU to raise an interrupt, you'll take an interrupt after (4), which is for the host, and you'll handle it on the host. Whenever you schedule the guest VCPU again, you'll (a) disable interrupts on the CPU, (b) restore the active state of the IRQ for the guest, (c) restore the guest PMU state, (d) switch to the guest with IRQs enabled on the CPU (potentially). If the state in (c) causes an IRQ it will not fire on the host, because it is marked as active in (b). Where does this break? -Christoffer -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
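To make the ordering above concrete, the run loop being described looks roughly like the sketch below. kvm_pmu_restore_host_state() is a made-up name for step (3); as the follow-up later in this thread points out, the RFC actually performs that switch inside __kvm_vcpu_run, which is where the disagreement lies.

    /* in kvm_arch_vcpu_ioctl_run(), local IRQs still disabled at this point */
    ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);

    /* (1) capture the active state of the guest's PMU IRQ and
     * (2) deactivate it on the host so it stays pending for the guest */
    kvm_pmu_sync_hwstate(vcpu);
    kvm_timer_sync_hwstate(vcpu);

    /* (3) restore the host PMU context (hypothetical helper) */
    kvm_pmu_restore_host_state(vcpu);

    /* (4) a host PMU overflow raised by the state restored in (3) fires
     * here, after IRQs are re-enabled, and is handled by the host */
    local_irq_enable();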
Re: [PATCH 3/3] arm/arm64: Enable Dirty Page logging for ARMv8 move log read, tlb flush to generic code
On Wed, Nov 19, 2014 at 12:15:55PM -0800, Mario Smarduch wrote: > On 11/19/2014 06:39 AM, Christoffer Dall wrote: > > Hi Mario, > > > > On Fri, Nov 07, 2014 at 12:51:39PM -0800, Mario Smarduch wrote: > >> On 11/07/2014 12:20 PM, Christoffer Dall wrote: > >>> On Thu, Oct 09, 2014 at 07:34:07PM -0700, Mario Smarduch wrote: > This patch enables ARMv8 dirty page logging and unifies ARMv7/ARMv8 code. > > Signed-off-by: Mario Smarduch > --- > arch/arm/include/asm/kvm_host.h | 12 > arch/arm/kvm/arm.c | 9 - > arch/arm/kvm/mmu.c | 17 +++-- > arch/arm64/kvm/Kconfig | 2 +- > 4 files changed, 12 insertions(+), 28 deletions(-) > > diff --git a/arch/arm/include/asm/kvm_host.h > b/arch/arm/include/asm/kvm_host.h > index 12311a5..59565f5 100644 > --- a/arch/arm/include/asm/kvm_host.h > +++ b/arch/arm/include/asm/kvm_host.h > @@ -220,18 +220,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t > boot_pgd_ptr, > kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr); > } > > -/** > - * kvm_arch_flush_remote_tlbs() - flush all VM TLB entries > - * @kvm: pointer to kvm structure. > - * > - * Interface to HYP function to flush all VM TLB entries without address > - * parameter. > - */ > -static inline void kvm_arch_flush_remote_tlbs(struct kvm *kvm) > -{ > -kvm_call_hyp(__kvm_tlb_flush_vmid, kvm); > -} > - > static inline int kvm_arch_dev_ioctl_check_extension(long ext) > { > return 0; > diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c > index 0546fa3..6a6fd6b 100644 > --- a/arch/arm/kvm/arm.c > +++ b/arch/arm/kvm/arm.c > @@ -242,7 +242,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, > const struct kvm_memory_slot *old, > enum kvm_mr_change change) > { > -#ifdef CONFIG_ARM > /* > * At this point memslot has been committed and there is an > * allocated dirty_bitmap[], dirty pages will be be tracked > while the > @@ -250,7 +249,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, > */ > if ((change != KVM_MR_DELETE) && (mem->flags & > KVM_MEM_LOG_DIRTY_PAGES)) > kvm_mmu_wp_memory_region(kvm, mem->slot); > -#endif > } > > void kvm_arch_flush_shadow_all(struct kvm *kvm) > @@ -783,13 +781,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp, > } > } > > -#ifdef CONFIG_ARM64 > -int kvm_arch_vm_ioctl_get_dirty_log(struct kvm *kvm, struct > kvm_dirty_log *log) > -{ > -return -EINVAL; > -} > -#endif > - > static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm, > struct kvm_arm_device_addr > *dev_addr) > { > diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c > index df1a5a3..8c0f9f2 100644 > --- a/arch/arm/kvm/mmu.c > +++ b/arch/arm/kvm/mmu.c > @@ -49,11 +49,18 @@ static phys_addr_t hyp_idmap_vector; > > static bool kvm_get_logging_state(struct kvm_memory_slot *memslot) > { > -#ifdef CONFIG_ARM > return !!memslot->dirty_bitmap; > -#else > -return false; > -#endif > +} > + > +/** > + * kvm_arch_flush_remote_tlbs() - flush all VM TLB entries for ARMv7/8 > + * @kvm:pointer to kvm structure. 
> + * > + * Interface to HYP function to flush all VM TLB entries > + */ > +inline void kvm_arch_flush_remote_tlbs(struct kvm *kvm) > +{ > +kvm_call_hyp(__kvm_tlb_flush_vmid, kvm); > } > > static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa) > @@ -769,7 +776,6 @@ static bool transparent_hugepage_adjust(pfn_t *pfnp, > phys_addr_t *ipap) > return false; > } > > -#ifdef CONFIG_ARM > /** > * stage2_wp_ptes - write protect PMD range > * @pmd:pointer to pmd entry > @@ -917,7 +923,6 @@ void kvm_mmu_write_protect_pt_masked(struct kvm *kvm, > > stage2_wp_range(kvm, start, end); > } > -#endif > > static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, > struct kvm_memory_slot *memslot, > diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig > index 40a8d19..a1a35809 100644 > --- a/arch/arm64/kvm/Kconfig > +++ b/arch/arm64/kvm/Kconfig > @@ -26,7 +26,7 @@ config KVM > select KVM_ARM_HOST > select KVM_ARM_VGIC > select KVM_ARM_TIMER > -
Re: Exposing host debug capabilities to userspace
On Thu, Nov 20, 2014 at 04:55:14PM +, Alex Bennée wrote: > Hi, > > I've almost finished the ARMv8 guest debug support but I have one > problem left to solve. userspace needs to know how many hardware debug > registers are available for GDB to use. This information is available > from the ID_AA64DFR0_EL1 register. Currently I abuse GET_ONE_REG to > fetch its value; however, semantically this is poor as its API is for > getting guest state, not host state, and they could theoretically have > different values. > > So far the options I've examined are: > > * KVM ioctl GET_ONE_REG(ID_AA64DFR0_EL1) > > As explained above, abusing a guest state API for host configuration. It's just wrong, and we should only do this if there's absolutely no other way to do this. > > * ptrace(PTRACE_GETREGSET, NT_ARM_HW_WATCH) > > This is used by GDB to access the host details in debug-monitors. > However the ptrace API really wants you to attach to a process before > calling PTRACE_GETREGSET. Currently I've tried attaching to the > thread_id of the vCPU but this fails with EPERM, I suspect because > attaching to your own threads likely upsets the kernel. Can you confirm your suspicion? This seems like a rather good approach so we should really investigate why this doesn't work and explore ways to get it working. > > * KVM ioctl KVM_GET_DEBUGREGS > > This is currently x86 only and looks like it's more aimed at debug > registers than capability stuff. Also I'm not sure what the state of > this ioctl is compared to KVM_SET_GUEST_DEBUG. Do these APIs overlap or > is one an older deprecated x86 only API? The API text and a brief glance at the x86 code seem to indicate that this is also the vcpu state... > > * Export the information via sysfs > > I suppose the correct canonical non-subsystem specific way to make this > information available is to expose the data in some sort of sysfs node? > However I don't see any existing sysfs structure for the CPU. > > * Expand /proc/cpuinfo > > I suspect adding extra text to be badly parsed by userspace is just > horrid and unacceptable behaviour ;-) > > * Add another KVM ioctl? > > This would have the downside of being specific to KVM and of course > proliferating the API space again. > This may not be that bad, for example, could we ever imagine that we'd only want to export a few of the debug registers for host gdbstub usage? -Christoffer -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
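For reference, the ptrace option boils down to a single regset read; roughly what the GDB path does is sketched below, with the field layout taken from arm64's struct user_hwdebug_state (the low byte of dbg_info holds the slot count). Note it only works on a thread the caller is already tracing, which is exactly the EPERM problem described above.

    #include <sys/types.h>
    #include <sys/ptrace.h>
    #include <sys/uio.h>
    #include <linux/elf.h>      /* NT_ARM_HW_WATCH */
    #include <asm/ptrace.h>     /* struct user_hwdebug_state (arm64) */

    /* query how many hardware watchpoint slots a traced thread exposes */
    static int hw_watchpoint_slots(pid_t tid)
    {
            struct user_hwdebug_state dbg;
            struct iovec iov = { .iov_base = &dbg, .iov_len = sizeof(dbg) };

            if (ptrace(PTRACE_GETREGSET, tid, (void *)NT_ARM_HW_WATCH, &iov) < 0)
                    return -1;      /* EPERM here if tid is not traced by us */

            return dbg.dbg_info & 0xff;     /* number of watchpoint registers */
    }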
Re: Exposing host debug capabilities to userspace
Christoffer Dall writes: > On Thu, Nov 20, 2014 at 04:55:14PM +, Alex Bennée wrote: >> Hi, >> >> I've almost finished the ARMv8 guest debug support but I have one >> problem left to solve. userspace needs to know how many hardware debug >> registers are available for GDB to use. >> * KVM ioctl KVM_GET_DEBUGREGS >> >> This is currently x86 only and looks like it's more aimed at debug >> registers than capability stuff. Also I'm not sure what the state of >> this ioctl is compared to KVM_SET_GUEST_DEBUG. Do these APIs overlap or >> is one an older deprecated x86 only API? > > The API text and a brief glance of the x86 code seems to indicate that > this is also the vcpu state... Yeah I was getting confused as to the difference between the two API calls. Is this just an x86 version of what GET/SET_ONE_REG replaced? >> * Add another KVM ioctl? >> >> This would have the downside of being specific to KVM and of course >> proliferating the API space again. >> > This may not be that bad, for example, could we ever imaging that we'd > only want to export a few of the debug registers for host gdbstub > usage? However it is general information which might be useful to the whole system (although I suspect KVM and PTRACE are the only two). It would be a shame to have an informational API wrapped up in the extra boiler-plate of a specific API. -- Alex Bennée -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 0/6] ARM64: KVM: PMU infrastructure support
Hi Christoffer, On Fri, Nov 21, 2014 at 3:29 PM, Christoffer Dall wrote: > On Thu, Nov 20, 2014 at 08:17:32PM +0530, Anup Patel wrote: >> On Wed, Nov 19, 2014 at 8:59 PM, Christoffer Dall >> wrote: >> > On Tue, Nov 11, 2014 at 02:48:25PM +0530, Anup Patel wrote: >> >> Hi All, >> >> >> >> I have second thoughts about rebasing KVM PMU patches >> >> to Marc's irq-forwarding patches. >> >> >> >> The PMU IRQs (when virtualized by KVM) are not exactly >> >> forwarded IRQs because they are shared between Host >> >> and Guest. >> >> >> >> Scenario1 >> >> - >> >> >> >> We might have perf running on Host and no KVM guest >> >> running. In this scenario, we wont get interrupts on Host >> >> because the kvm_pmu_hyp_init() (similar to the function >> >> kvm_timer_hyp_init() of Marc's IRQ-forwarding >> >> implementation) has put all host PMU IRQs in forwarding >> >> mode. >> >> >> >> The only way solve this problem is to not set forwarding >> >> mode for PMU IRQs in kvm_pmu_hyp_init() and instead >> >> have special routines to turn on and turn off the forwarding >> >> mode of PMU IRQs. These routines will be called from >> >> kvm_arch_vcpu_ioctl_run() for toggling the PMU IRQ >> >> forwarding state. >> >> >> >> Scenario2 >> >> - >> >> >> >> We might have perf running on Host and Guest simultaneously >> >> which means it is quite likely that PMU HW trigger IRQ meant >> >> for Host between "ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);" >> >> and "kvm_pmu_sync_hwstate(vcpu);" (similar to timer sync routine >> >> of Marc's patchset which is called before local_irq_enable()). >> >> >> >> In this scenario, the updated kvm_pmu_sync_hwstate(vcpu) >> >> will accidentally forward IRQ meant for Host to Guest unless >> >> we put additional checks to inspect VCPU PMU state. >> >> >> >> Am I missing any detail about IRQ forwarding for above >> >> scenarios? >> >> >> > Hi Anup, >> >> Hi Christoffer, >> >> > >> > I briefly discussed this with Marc. What I don't understand is how it >> > would be possible to get an interrupt for the host while running the >> > guest? >> > >> > The rationale behind my question is that whenever you're running the >> > guest, the PMU should be programmed exclusively with guest state, and >> > since the PMU is per core, any interrupts should be for the guest, where >> > it would always be pending. >> >> Yes, thats right PMU is programmed exclusively for guest when >> guest is running and for host when host is running. >> >> Let us assume a situation (Scenario2 mentioned previously) >> where both host and guest are using PMU. When the guest is >> running we come back to host mode due to variety of reasons >> (stage2 fault, guest IO, regular host interrupt, host interrupt >> meant for guest, ) which means we will return from the >> "ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);" statement in the >> kvm_arch_vcpu_ioctl_run() function with local IRQs disabled. >> At this point we would have restored back host PMU context and >> any PMU counter used by host can trigger PMU overflow interrup >> for host. Now we will be having "kvm_pmu_sync_hwstate(vcpu);" >> in the kvm_arch_vcpu_ioctl_run() function (similar to the >> kvm_timer_sync_hwstate() of Marc's IRQ forwarding patchset) >> which will try to detect PMU irq forwarding state in GIC hence it >> can accidentally discover PMU irq pending for guest while this >> PMU irq is actually meant for host. >> >> This above mentioned situation does not happen for timer >> because virtual timer interrupts are exclusively used for guest. 
>> The exclusive use of virtual timer interrupt for guest ensures that >> the function kvm_timer_sync_hwstate() will always see correct >> state of virtual timer IRQ from GIC. >> > I'm not quite following. > > When you call kvm_pmu_sync_hwstate(vcpu) in the non-preemtible section, > you would (1) capture the active state of the IRQ pertaining to the > guest and (2) deactive the IRQ on the host, then (3) switch the state of > the PMU to the host state, and finally (4) re-enable IRQs on the CPU > you're running on. > > If the host PMU state restored in (3) causes the PMU to raise an > interrupt, you'll take an interrupt after (4), which is for the host, > and you'll handle it on the host. > We only switch PMU state in assembly code using kvm_call_hyp(__kvm_vcpu_run, vcpu) so whenever we are in kvm_arch_vcpu_ioctl_run() (i.e. host mode) the current hardware PMU state is for host. This means whenever we are in host mode the host PMU can change state of PMU IRQ in GIC even if local IRQs are disabled. Whenever we inspect active state of PMU IRQ in the kvm_pmu_sync_hwstate() function using irq_get_fwd_state() API. Here we are not guaranteed that IRQ forward state returned by the irq_get_fwd_state() API is for guest only. The above situation does not manifest for virtual timer because virtual timer registers are exclusively accessed by Guest and virtual timer interrupt is only for Guest (never used by Host). > Whenever yo
Re: [PATCH 3/3] arm, arm64: KVM: handle potential incoherency of readonly memslots
Hi Mario, On Wed, Nov 19, 2014 at 03:32:31PM -0800, Mario Smarduch wrote: > Hi Laszlo, > > a couple of observations. > > I'm wondering if access from qemu and guest won't > result in mixed memory attributes and if that's acceptable > to the CPU. > > Also, if you update memory from qemu you may break > dirty page logging/migration. Unless there is some other way > you keep track. Of course it may not be applicable in your > case (i.e. flash unused after boot). > I'm not concerned about this particular case; dirty page logging exists so KVM can inform userspace when a page may have been dirtied. If userspace directly dirties (is that a verb?) a page, then it already knows that it needs to migrate that page and deal with it accordingly. Or did I miss some more subtle point here? -Christoffer -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
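For context, the userspace side of that contract is just KVM_GET_DIRTY_LOG per memslot; pages userspace writes directly have to be tracked by userspace itself on top of KVM's log. A minimal sketch under that assumption (slot number, bitmap allocation and sizing are illustrative, not QEMU code):

    #include <stdint.h>
    #include <stddef.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* fetch KVM's dirty bitmap for one memslot and OR in the pages that
     * userspace already knows it wrote directly */
    static int sync_dirty_bitmap(int vm_fd, uint32_t slot,
                                 unsigned long *kvm_bits,
                                 const unsigned long *user_bits,
                                 unsigned long *merged, size_t words)
    {
            struct kvm_dirty_log log = {
                    .slot = slot,
                    .dirty_bitmap = kvm_bits,
            };
            size_t i;

            if (ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log) < 0)
                    return -1;

            for (i = 0; i < words; i++)
                    merged[i] = kvm_bits[i] | user_bits[i];
            return 0;
    }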
Re: Exposing host debug capabilities to userspace
Christoffer Dall writes: > On Thu, Nov 20, 2014 at 04:55:14PM +, Alex Bennée wrote: >> >> * ptrace(PTRACE_GETREGSET, NT_ARM_HW_WATCH) >> >> This is used by GDB to access the host details in debug-monitors. >> However the ptrace API really wants you to attach to a process before >> calling PTRACE_GETREGSET. Currently I've tried attaching to the >> thread_id of the vCPU but this fails with EPERM, I suspect because >> attaching to your own threads likely upsets the kernel. > > Can you confirm your suspicion? This seems like a rather good approach > so we should really investigate why this doesn't work and explore ways > to get it working. From ptrace_attach: retval = -EPERM; if (unlikely(task->flags & PF_KTHREAD)) goto out; if (same_thread_group(task, current)) goto out; I think this is what is triggering my EPERM. I'm going to dig into the history of code around that bit. While I can see it might be undesirable, I'm not sure if it has to be verboten... -- Alex Bennée -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] arm/arm64: kvm: drop inappropriate use of kvm_is_mmio_pfn()
On Mon, Nov 10, 2014 at 09:33:55AM +0100, Ard Biesheuvel wrote: > Instead of using kvm_is_mmio_pfn() to decide whether a host region > should be stage 2 mapped with device attributes, add a new static > function kvm_is_device_pfn() that disregards RAM pages with the > reserved bit set, as those should usually not be mapped as device > memory. > > Signed-off-by: Ard Biesheuvel > --- > arch/arm/kvm/mmu.c | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c > index 57a403a5c22b..b007438242e2 100644 > --- a/arch/arm/kvm/mmu.c > +++ b/arch/arm/kvm/mmu.c > @@ -834,6 +834,11 @@ static bool kvm_is_write_fault(struct kvm_vcpu *vcpu) > return kvm_vcpu_dabt_iswrite(vcpu); > } > > +static bool kvm_is_device_pfn(unsigned long pfn) > +{ > + return !pfn_valid(pfn); > +} > + > static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, > struct kvm_memory_slot *memslot, unsigned long hva, > unsigned long fault_status) > @@ -904,7 +909,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, > phys_addr_t fault_ipa, > if (is_error_pfn(pfn)) > return -EFAULT; > > - if (kvm_is_mmio_pfn(pfn)) > + if (kvm_is_device_pfn(pfn)) > mem_type = PAGE_S2_DEVICE; > > spin_lock(&kvm->mmu_lock); > -- > 1.8.3.2 > Acked-by: Christoffer Dall -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()
On 10 November 2014 09:33, Ard Biesheuvel wrote: > This reverts commit 85c8555ff0 ("KVM: check for !is_zero_pfn() in > kvm_is_mmio_pfn()") and renames the function to kvm_is_reserved_pfn. > > The problem being addressed by the patch above was that some ARM code > based the memory mapping attributes of a pfn on the return value of > kvm_is_mmio_pfn(), whose name indeed suggests that such pfns should > be mapped as device memory. > > However, kvm_is_mmio_pfn() doesn't do quite what it says on the tin, > and the existing non-ARM users were already using it in a way which > suggests that its name should probably have been 'kvm_is_reserved_pfn' > from the beginning, e.g., whether or not to call get_page/put_page on > it etc. This means that returning false for the zero page is a mistake > and the patch above should be reverted. > > Signed-off-by: Ard Biesheuvel Ping? > --- > arch/ia64/kvm/kvm-ia64.c | 2 +- > arch/x86/kvm/mmu.c | 6 +++--- > include/linux/kvm_host.h | 2 +- > virt/kvm/kvm_main.c | 16 > 4 files changed, 13 insertions(+), 13 deletions(-) > > diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c > index ec6b9acb6bea..dbe46f43884d 100644 > --- a/arch/ia64/kvm/kvm-ia64.c > +++ b/arch/ia64/kvm/kvm-ia64.c > @@ -1563,7 +1563,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, > > for (i = 0; i < npages; i++) { > pfn = gfn_to_pfn(kvm, base_gfn + i); > - if (!kvm_is_mmio_pfn(pfn)) { > + if (!kvm_is_reserved_pfn(pfn)) { > kvm_set_pmt_entry(kvm, base_gfn + i, > pfn << PAGE_SHIFT, > _PAGE_AR_RWX | _PAGE_MA_WB); > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c > index ac1c4de3a484..978f402006ee 100644 > --- a/arch/x86/kvm/mmu.c > +++ b/arch/x86/kvm/mmu.c > @@ -630,7 +630,7 @@ static int mmu_spte_clear_track_bits(u64 *sptep) > * kvm mmu, before reclaiming the page, we should > * unmap it from mmu first. > */ > - WARN_ON(!kvm_is_mmio_pfn(pfn) && !page_count(pfn_to_page(pfn))); > + WARN_ON(!kvm_is_reserved_pfn(pfn) && !page_count(pfn_to_page(pfn))); > > if (!shadow_accessed_mask || old_spte & shadow_accessed_mask) > kvm_set_pfn_accessed(pfn); > @@ -2461,7 +2461,7 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep, > spte |= PT_PAGE_SIZE_MASK; > if (tdp_enabled) > spte |= kvm_x86_ops->get_mt_mask(vcpu, gfn, > - kvm_is_mmio_pfn(pfn)); > + kvm_is_reserved_pfn(pfn)); > > if (host_writable) > spte |= SPTE_HOST_WRITEABLE; > @@ -2737,7 +2737,7 @@ static void transparent_hugepage_adjust(struct kvm_vcpu > *vcpu, > * PT_PAGE_TABLE_LEVEL and there would be no adjustment done > * here. 
> */ > - if (!is_error_noslot_pfn(pfn) && !kvm_is_mmio_pfn(pfn) && > + if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn) && > level == PT_PAGE_TABLE_LEVEL && > PageTransCompound(pfn_to_page(pfn)) && > !has_wrprotected_page(vcpu->kvm, gfn, PT_DIRECTORY_LEVEL)) { > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index ea53b04993f2..a6059bdf7b03 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -703,7 +703,7 @@ void kvm_arch_sync_events(struct kvm *kvm); > int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu); > void kvm_vcpu_kick(struct kvm_vcpu *vcpu); > > -bool kvm_is_mmio_pfn(pfn_t pfn); > +bool kvm_is_reserved_pfn(pfn_t pfn); > > struct kvm_irq_ack_notifier { > struct hlist_node link; > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 25ffac9e947d..3cee7b167052 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -107,10 +107,10 @@ EXPORT_SYMBOL_GPL(kvm_rebooting); > > static bool largepages_enabled = true; > > -bool kvm_is_mmio_pfn(pfn_t pfn) > +bool kvm_is_reserved_pfn(pfn_t pfn) > { > if (pfn_valid(pfn)) > - return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn)); > + return PageReserved(pfn_to_page(pfn)); > > return true; > } > @@ -1321,7 +1321,7 @@ static pfn_t hva_to_pfn(unsigned long addr, bool > atomic, bool *async, > else if ((vma->vm_flags & VM_PFNMAP)) { > pfn = ((addr - vma->vm_start) >> PAGE_SHIFT) + > vma->vm_pgoff; > - BUG_ON(!kvm_is_mmio_pfn(pfn)); > + BUG_ON(!kvm_is_reserved_pfn(pfn)); > } else { > if (async && vma_is_valid(vma, write_fault)) > *async = true; > @@ -1427,7 +1427,7 @@ static struct page *kvm_pfn_to_page(pfn_t pfn) > if (is_error_noslot_pfn(pfn)) > return KVM_ERR_PTR_BAD_PAGE; > > - if (kvm_is_mmio_pfn(pfn)) { > + if (kvm_is_reserved_pfn(pfn)) { >
Re: [PATCH 2/2] kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()
Hi Paolo, I think these look good, would you mind queueing them as either a fix or for 3.19 as you see fit, assuming you agree with the content? Thanks, -Christoffer On Mon, Nov 10, 2014 at 09:33:56AM +0100, Ard Biesheuvel wrote: > This reverts commit 85c8555ff0 ("KVM: check for !is_zero_pfn() in > kvm_is_mmio_pfn()") and renames the function to kvm_is_reserved_pfn. > > The problem being addressed by the patch above was that some ARM code > based the memory mapping attributes of a pfn on the return value of > kvm_is_mmio_pfn(), whose name indeed suggests that such pfns should > be mapped as device memory. > > However, kvm_is_mmio_pfn() doesn't do quite what it says on the tin, > and the existing non-ARM users were already using it in a way which > suggests that its name should probably have been 'kvm_is_reserved_pfn' > from the beginning, e.g., whether or not to call get_page/put_page on > it etc. This means that returning false for the zero page is a mistake > and the patch above should be reverted. > > Signed-off-by: Ard Biesheuvel > --- > arch/ia64/kvm/kvm-ia64.c | 2 +- > arch/x86/kvm/mmu.c | 6 +++--- > include/linux/kvm_host.h | 2 +- > virt/kvm/kvm_main.c | 16 > 4 files changed, 13 insertions(+), 13 deletions(-) > > diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c > index ec6b9acb6bea..dbe46f43884d 100644 > --- a/arch/ia64/kvm/kvm-ia64.c > +++ b/arch/ia64/kvm/kvm-ia64.c > @@ -1563,7 +1563,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, > > for (i = 0; i < npages; i++) { > pfn = gfn_to_pfn(kvm, base_gfn + i); > - if (!kvm_is_mmio_pfn(pfn)) { > + if (!kvm_is_reserved_pfn(pfn)) { > kvm_set_pmt_entry(kvm, base_gfn + i, > pfn << PAGE_SHIFT, > _PAGE_AR_RWX | _PAGE_MA_WB); > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c > index ac1c4de3a484..978f402006ee 100644 > --- a/arch/x86/kvm/mmu.c > +++ b/arch/x86/kvm/mmu.c > @@ -630,7 +630,7 @@ static int mmu_spte_clear_track_bits(u64 *sptep) >* kvm mmu, before reclaiming the page, we should >* unmap it from mmu first. >*/ > - WARN_ON(!kvm_is_mmio_pfn(pfn) && !page_count(pfn_to_page(pfn))); > + WARN_ON(!kvm_is_reserved_pfn(pfn) && !page_count(pfn_to_page(pfn))); > > if (!shadow_accessed_mask || old_spte & shadow_accessed_mask) > kvm_set_pfn_accessed(pfn); > @@ -2461,7 +2461,7 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep, > spte |= PT_PAGE_SIZE_MASK; > if (tdp_enabled) > spte |= kvm_x86_ops->get_mt_mask(vcpu, gfn, > - kvm_is_mmio_pfn(pfn)); > + kvm_is_reserved_pfn(pfn)); > > if (host_writable) > spte |= SPTE_HOST_WRITEABLE; > @@ -2737,7 +2737,7 @@ static void transparent_hugepage_adjust(struct kvm_vcpu > *vcpu, >* PT_PAGE_TABLE_LEVEL and there would be no adjustment done >* here. 
>*/ > - if (!is_error_noslot_pfn(pfn) && !kvm_is_mmio_pfn(pfn) && > + if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn) && > level == PT_PAGE_TABLE_LEVEL && > PageTransCompound(pfn_to_page(pfn)) && > !has_wrprotected_page(vcpu->kvm, gfn, PT_DIRECTORY_LEVEL)) { > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index ea53b04993f2..a6059bdf7b03 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -703,7 +703,7 @@ void kvm_arch_sync_events(struct kvm *kvm); > int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu); > void kvm_vcpu_kick(struct kvm_vcpu *vcpu); > > -bool kvm_is_mmio_pfn(pfn_t pfn); > +bool kvm_is_reserved_pfn(pfn_t pfn); > > struct kvm_irq_ack_notifier { > struct hlist_node link; > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 25ffac9e947d..3cee7b167052 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -107,10 +107,10 @@ EXPORT_SYMBOL_GPL(kvm_rebooting); > > static bool largepages_enabled = true; > > -bool kvm_is_mmio_pfn(pfn_t pfn) > +bool kvm_is_reserved_pfn(pfn_t pfn) > { > if (pfn_valid(pfn)) > - return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn)); > + return PageReserved(pfn_to_page(pfn)); > > return true; > } > @@ -1321,7 +1321,7 @@ static pfn_t hva_to_pfn(unsigned long addr, bool > atomic, bool *async, > else if ((vma->vm_flags & VM_PFNMAP)) { > pfn = ((addr - vma->vm_start) >> PAGE_SHIFT) + > vma->vm_pgoff; > - BUG_ON(!kvm_is_mmio_pfn(pfn)); > + BUG_ON(!kvm_is_reserved_pfn(pfn)); > } else { > if (async && vma_is_valid(vma, write_fault)) > *async = true; > @@ -1427,7 +1427,7 @@ static struct page *kvm_pfn_to_page(pfn_t pfn) > if (is_error_noslot_pfn(pfn)) > return KVM_ERR_P
Re: [RFC PATCH 0/6] ARM64: KVM: PMU infrastructure support
On Fri, Nov 21, 2014 at 04:06:05PM +0530, Anup Patel wrote: > Hi Christoffer, > > On Fri, Nov 21, 2014 at 3:29 PM, Christoffer Dall > wrote: > > On Thu, Nov 20, 2014 at 08:17:32PM +0530, Anup Patel wrote: > >> On Wed, Nov 19, 2014 at 8:59 PM, Christoffer Dall > >> wrote: > >> > On Tue, Nov 11, 2014 at 02:48:25PM +0530, Anup Patel wrote: > >> >> Hi All, > >> >> > >> >> I have second thoughts about rebasing KVM PMU patches > >> >> to Marc's irq-forwarding patches. > >> >> > >> >> The PMU IRQs (when virtualized by KVM) are not exactly > >> >> forwarded IRQs because they are shared between Host > >> >> and Guest. > >> >> > >> >> Scenario1 > >> >> - > >> >> > >> >> We might have perf running on Host and no KVM guest > >> >> running. In this scenario, we wont get interrupts on Host > >> >> because the kvm_pmu_hyp_init() (similar to the function > >> >> kvm_timer_hyp_init() of Marc's IRQ-forwarding > >> >> implementation) has put all host PMU IRQs in forwarding > >> >> mode. > >> >> > >> >> The only way solve this problem is to not set forwarding > >> >> mode for PMU IRQs in kvm_pmu_hyp_init() and instead > >> >> have special routines to turn on and turn off the forwarding > >> >> mode of PMU IRQs. These routines will be called from > >> >> kvm_arch_vcpu_ioctl_run() for toggling the PMU IRQ > >> >> forwarding state. > >> >> > >> >> Scenario2 > >> >> - > >> >> > >> >> We might have perf running on Host and Guest simultaneously > >> >> which means it is quite likely that PMU HW trigger IRQ meant > >> >> for Host between "ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);" > >> >> and "kvm_pmu_sync_hwstate(vcpu);" (similar to timer sync routine > >> >> of Marc's patchset which is called before local_irq_enable()). > >> >> > >> >> In this scenario, the updated kvm_pmu_sync_hwstate(vcpu) > >> >> will accidentally forward IRQ meant for Host to Guest unless > >> >> we put additional checks to inspect VCPU PMU state. > >> >> > >> >> Am I missing any detail about IRQ forwarding for above > >> >> scenarios? > >> >> > >> > Hi Anup, > >> > >> Hi Christoffer, > >> > >> > > >> > I briefly discussed this with Marc. What I don't understand is how it > >> > would be possible to get an interrupt for the host while running the > >> > guest? > >> > > >> > The rationale behind my question is that whenever you're running the > >> > guest, the PMU should be programmed exclusively with guest state, and > >> > since the PMU is per core, any interrupts should be for the guest, where > >> > it would always be pending. > >> > >> Yes, thats right PMU is programmed exclusively for guest when > >> guest is running and for host when host is running. > >> > >> Let us assume a situation (Scenario2 mentioned previously) > >> where both host and guest are using PMU. When the guest is > >> running we come back to host mode due to variety of reasons > >> (stage2 fault, guest IO, regular host interrupt, host interrupt > >> meant for guest, ) which means we will return from the > >> "ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);" statement in the > >> kvm_arch_vcpu_ioctl_run() function with local IRQs disabled. > >> At this point we would have restored back host PMU context and > >> any PMU counter used by host can trigger PMU overflow interrup > >> for host. 
Now we will be having "kvm_pmu_sync_hwstate(vcpu);" > >> in the kvm_arch_vcpu_ioctl_run() function (similar to the > >> kvm_timer_sync_hwstate() of Marc's IRQ forwarding patchset) > >> which will try to detect PMU irq forwarding state in GIC hence it > >> can accidentally discover PMU irq pending for guest while this > >> PMU irq is actually meant for host. > >> > >> This above mentioned situation does not happen for timer > >> because virtual timer interrupts are exclusively used for guest. > >> The exclusive use of virtual timer interrupt for guest ensures that > >> the function kvm_timer_sync_hwstate() will always see correct > >> state of virtual timer IRQ from GIC. > >> > > I'm not quite following. > > > > When you call kvm_pmu_sync_hwstate(vcpu) in the non-preemtible section, > > you would (1) capture the active state of the IRQ pertaining to the > > guest and (2) deactive the IRQ on the host, then (3) switch the state of > > the PMU to the host state, and finally (4) re-enable IRQs on the CPU > > you're running on. > > > > If the host PMU state restored in (3) causes the PMU to raise an > > interrupt, you'll take an interrupt after (4), which is for the host, > > and you'll handle it on the host. > > > We only switch PMU state in assembly code using > kvm_call_hyp(__kvm_vcpu_run, vcpu) > so whenever we are in kvm_arch_vcpu_ioctl_run() (i.e. host mode) > the current hardware PMU state is for host. This means whenever > we are in host mode the host PMU can change state of PMU IRQ > in GIC even if local IRQs are disabled. > > Whenever we inspect active state of PMU IRQ in the > kvm_pmu_sync_hwstate() function using irq_get_fwd_state() API. > Here we are not guaranteed that
Re: [PATCH v1] ARM/ARM64: support KVM_IOEVENTFD
Hi Ming, for your information there is a series written by Antonios (added in CC) https://lists.cs.columbia.edu/pipermail/kvmarm/2014-March/008416.html exactly on the same topic. The thread was reactivated by Nikolay latterly on Nov (see http://www.gossamer-threads.com/lists/linux/kernel/1886716?page=last). I am also convinced we must progress on ioeventfd topic concurrently with irqfd one. What starting point do we use then for further comments? Best Regards Eric On 11/19/2014 06:16 AM, Ming Lei wrote: > From Documentation/virtual/kvm/api.txt, all ARCHs should support > ioeventfd. > > Also ARM VM has supported PCI bus already, and ARM64 will do too, > ioeventfd is required for some popular devices, like virtio-blk > and virtio-scsi dataplane in QEMU. > > Without this patch, virtio-blk-pci dataplane can't work in QEMU. > > This patch has been tested on both ARM and ARM64. > > Signed-off-by: Ming Lei > --- > v1: > - make eventfd.o built in ARM64 > arch/arm/kvm/Kconfig|1 + > arch/arm/kvm/Makefile |2 +- > arch/arm/kvm/arm.c |1 + > arch/arm/kvm/mmio.c | 19 +++ > arch/arm64/kvm/Kconfig |1 + > arch/arm64/kvm/Makefile |2 +- > 6 files changed, 24 insertions(+), 2 deletions(-) > > diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig > index 466bd29..25bd83a 100644 > --- a/arch/arm/kvm/Kconfig > +++ b/arch/arm/kvm/Kconfig > @@ -23,6 +23,7 @@ config KVM > select HAVE_KVM_CPU_RELAX_INTERCEPT > select KVM_MMIO > select KVM_ARM_HOST > + select HAVE_KVM_EVENTFD > depends on ARM_VIRT_EXT && ARM_LPAE > ---help--- > Support hosting virtualized guest machines. You will also > diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile > index f7057ed..859db09 100644 > --- a/arch/arm/kvm/Makefile > +++ b/arch/arm/kvm/Makefile > @@ -15,7 +15,7 @@ AFLAGS_init.o := -Wa,-march=armv7-a$(plus_virt) > AFLAGS_interrupts.o := -Wa,-march=armv7-a$(plus_virt) > > KVM := ../../../virt/kvm > -kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o > +kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o > > obj-y += kvm-arm.o init.o interrupts.o > obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o > diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c > index 9e193c8..d90d989 100644 > --- a/arch/arm/kvm/arm.c > +++ b/arch/arm/kvm/arm.c > @@ -172,6 +172,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long > ext) > case KVM_CAP_IRQCHIP: > r = vgic_present; > break; > + case KVM_CAP_IOEVENTFD: > case KVM_CAP_DEVICE_CTRL: > case KVM_CAP_USER_MEMORY: > case KVM_CAP_SYNC_MMU: > diff --git a/arch/arm/kvm/mmio.c b/arch/arm/kvm/mmio.c > index 4cb5a93..ee332a7 100644 > --- a/arch/arm/kvm/mmio.c > +++ b/arch/arm/kvm/mmio.c > @@ -162,6 +162,21 @@ static int decode_hsr(struct kvm_vcpu *vcpu, phys_addr_t > fault_ipa, > return 0; > } > > +static int handle_io_bus_rw(struct kvm_vcpu *vcpu, gpa_t addr, > + int len, void *val, bool write) > +{ > + int idx, ret; > + > + idx = srcu_read_lock(&vcpu->kvm->srcu); > + if (write) > + ret = kvm_io_bus_write(vcpu->kvm, KVM_MMIO_BUS, addr, len, val); > + else > + ret = kvm_io_bus_read(vcpu->kvm, KVM_MMIO_BUS, addr, len, val); > + srcu_read_unlock(&vcpu->kvm->srcu, idx); > + > + return ret; > +} > + > int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run, >phys_addr_t fault_ipa) > { > @@ -200,6 +215,10 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run > *run, > if (vgic_handle_mmio(vcpu, run, &mmio)) > return 1; > > + if (!handle_io_bus_rw(vcpu, fault_ipa, mmio.len, &mmio.data, > + mmio.is_write)) > + return 1; > + > kvm_prepare_mmio(run, &mmio); > 
return 0; > } > diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig > index 8ba85e9..642f57c 100644 > --- a/arch/arm64/kvm/Kconfig > +++ b/arch/arm64/kvm/Kconfig > @@ -26,6 +26,7 @@ config KVM > select KVM_ARM_HOST > select KVM_ARM_VGIC > select KVM_ARM_TIMER > + select HAVE_KVM_EVENTFD > ---help--- > Support hosting virtualized guest machines. > > diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile > index 32a0961..2e6b827 100644 > --- a/arch/arm64/kvm/Makefile > +++ b/arch/arm64/kvm/Makefile > @@ -11,7 +11,7 @@ ARM=../../../arch/arm/kvm > > obj-$(CONFIG_KVM_ARM_HOST) += kvm.o > > -kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o > +kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o > $(KVM)/eventfd.o > kvm-$(CONFIG_KVM_ARM_HOST) += $(ARM)/arm.o $(ARM)/mmu.o $(ARM)/mmio.o > kvm-$(CONFIG_KVM_ARM_HOST) += $(ARM)/psci.o $(ARM)/perf.o > > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info
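For context, what the patch enables is the standard KVM_IOEVENTFD registration path, so a VMM can turn a guest MMIO write into an eventfd wakeup instead of a full exit to userspace. A rough sketch of the userspace side (the address and access length are illustrative):

    #include <sys/ioctl.h>
    #include <sys/eventfd.h>
    #include <linux/kvm.h>

    /* register an eventfd that fires when the guest writes 4 bytes to gpa
     * (e.g. a virtio queue notify doorbell) */
    static int register_mmio_ioeventfd(int vm_fd, __u64 gpa)
    {
            int efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
            struct kvm_ioeventfd ioev = {
                    .addr  = gpa,
                    .len   = 4,
                    .fd    = efd,
                    .flags = 0,     /* MMIO (not PIO), no datamatch */
            };

            if (efd < 0)
                    return -1;
            if (ioctl(vm_fd, KVM_IOEVENTFD, &ioev) < 0)
                    return -1;
            return efd;             /* the I/O thread polls this fd instead */
    }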
Re: [PATCH 2/2] kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()
On 21/11/2014 12:46, Christoffer Dall wrote: > Hi Paolo, > > I think these look good, would you mind queueing them as either a fix or > for 3.19 as you see fit, assuming you agree with the content? Ah, I was thinking _you_ would queue them for 3.19. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3.10] vhost-net: backport extend device allocation to 3.10
On 12.10.2014 13:30, Michael S. Tsirkin wrote: On Thu, Oct 09, 2014 at 08:41:23AM +0400, Dmitry Petuhov wrote: Cc: Michael Mueller Signed-off-by: Romain Francoise Acked-by: Michael S. Tsirkin [mityapetuhov: backport to v3.10: vhost_net_free() in one more place] Signed-off-by: Dmitry Petuhov Sounds reasonable. Acked-by: Michael S. Tsirkin Do I need to take any extra actions to see it in the next 3.10 release? -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()
On Fri, Nov 21, 2014 at 02:06:40PM +0100, Paolo Bonzini wrote: > > > On 21/11/2014 12:46, Christoffer Dall wrote: > > Hi Paolo, > > > > I think these look good, would you mind queueing them as either a fix or > > for 3.19 as you see fit, assuming you agree with the content? > > Ah, I was thinking _you_ would queue them for 3.19. > We can do that, did I miss your previous ack or reviewed-by? -Christoffer -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: can I make this work… (Foundation for accessibility project)
On 20/11/2014 23:22, Eric S. Johansson wrote: > I'll be able to run some tests in about 2 to 3 hours after I finish this > document. Let me know what I should look at? on a side note, a pointer > to an automated install process would be wonderful. GNOME Boxes can pretty much automate the install process. Can you just run "ps aux" while the install is running and send the result? Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: KVM causes #GP on XRSTORS
On 20/11/2014 17:34, Nadav Amit wrote: > Fenghua, > > I got KVM (v3.17) crashing on a machine that supports XRSTORS - It appears to > get a #GP when it is trying to load the guest FPU. > One reason for the #GP is that XCOMP_BV[63] is zeroed on the guest_fpu, but I > am not sure it is the only problem. > Was KVM ever tested with XRSTORS? What is the content of the CPUID[EAX=13,ECX=0] and CPUID[EAX=13,ECX=1] leaves on the host? Fenghua, which processors have XSAVEC, which have XGETBV with ECX=1, and which have XSAVES? We need to expose this in QEMU, for which I can send a patch later today or next week (CCing Eduardo for this). We will also have to uncompact the XSAVE area either in KVM_GET_XSAVE or in QEMU. It's probably not hard to do it in the kernel. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
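For reference, the compacted-format requirement Nadav refers to lives in the 64-byte XSAVE header that follows the legacy region of the XSAVE area. A rough check of the documented #GP conditions for XRSTORS is sketched below (alignment and reserved-bit checks omitted; the macro name is assumed for this sketch):

    #include <stdint.h>

    /* header that follows the 512-byte legacy region of an XSAVE area */
    struct xstate_header {
            uint64_t xstate_bv;
            uint64_t xcomp_bv;      /* bit 63 = compacted format, required by XRSTORS */
            uint64_t reserved[6];
    };

    #define XCOMP_BV_COMPACTED (1ULL << 63)

    static int xsave_area_ok_for_xrstors(const struct xstate_header *hdr,
                                         uint64_t xcr0_or_xss)
    {
            return (hdr->xcomp_bv & XCOMP_BV_COMPACTED) &&                  /* else #GP */
                   !((hdr->xcomp_bv & ~XCOMP_BV_COMPACTED) & ~xcr0_or_xss) &&
                   !(hdr->xstate_bv & ~hdr->xcomp_bv);
    }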
Re: [PATCH] kvm: x86: move ioapic.c and irq_comm.c back to arch/x86/
2014-11-20 14:42+0100, Paolo Bonzini: > ia64 does not need them anymore. (Similar for device assignment and iommu, should I prepare patches?) > Signed-off-by: Paolo Bonzini > --- At least one compile-breaker on arches without IOAPIC, > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index ea53b04993f2..d2d42709d6f4 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > +#ifdef __KVM_HAVE_IOAPIC > +void kvm_vcpu_request_scan_ioapic(struct kvm *kvm); > +#else > +static inline void kvm_vcpu_request-scan_ioapic(struct kvm *kvm) ^_- > +{ > +} > +#endif > + Reviewed-by: Radim Krčmář And we could clean them up as well: ---8<--- KVM: x86: remove IA64 from ioapic.c and irq_comm.c They won't get compiled in x86 tree. Signed-off-by: Radim Krčmář --- arch/x86/kvm/ioapic.c | 12 arch/x86/kvm/irq_comm.c | 41 ++--- 2 files changed, 2 insertions(+), 51 deletions(-) diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c index 0ba4057..b1947e0 100644 --- a/arch/x86/kvm/ioapic.c +++ b/arch/x86/kvm/ioapic.c @@ -270,7 +270,6 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap, spin_unlock(&ioapic->lock); } -#ifdef CONFIG_X86 void kvm_vcpu_request_scan_ioapic(struct kvm *kvm) { struct kvm_ioapic *ioapic = kvm->arch.vioapic; @@ -279,12 +278,6 @@ void kvm_vcpu_request_scan_ioapic(struct kvm *kvm) return; kvm_make_scan_ioapic_request(kvm); } -#else -void kvm_vcpu_request_scan_ioapic(struct kvm *kvm) -{ - return; -} -#endif static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val) { @@ -586,11 +579,6 @@ static int ioapic_mmio_write(struct kvm_io_device *this, gpa_t addr, int len, case IOAPIC_REG_WINDOW: ioapic_write_indirect(ioapic, data); break; -#ifdef CONFIG_IA64 - case IOAPIC_REG_EOI: - __kvm_ioapic_update_eoi(NULL, ioapic, data, IOAPIC_LEVEL_TRIG); - break; -#endif default: break; diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c index a8f988c..72298b3 100644 --- a/arch/x86/kvm/irq_comm.c +++ b/arch/x86/kvm/irq_comm.c @@ -26,9 +26,6 @@ #include #include -#ifdef CONFIG_IA64 -#include -#endif #include "irq.h" @@ -38,12 +35,8 @@ static int kvm_set_pic_irq(struct kvm_kernel_irq_routing_entry *e, struct kvm *kvm, int irq_source_id, int level, bool line_status) { -#ifdef CONFIG_X86 struct kvm_pic *pic = pic_irqchip(kvm); return kvm_pic_set_irq(pic, e->irqchip.pin, irq_source_id, level); -#else - return -1; -#endif } static int kvm_set_ioapic_irq(struct kvm_kernel_irq_routing_entry *e, @@ -57,12 +50,7 @@ static int kvm_set_ioapic_irq(struct kvm_kernel_irq_routing_entry *e, inline static bool kvm_is_dm_lowest_prio(struct kvm_lapic_irq *irq) { -#ifdef CONFIG_IA64 - return irq->delivery_mode == - (IOSAPIC_LOWEST_PRIORITY << IOSAPIC_DELIVERY_SHIFT); -#else return irq->delivery_mode == APIC_DM_LOWEST; -#endif } int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src, @@ -202,9 +190,7 @@ int kvm_request_irq_source_id(struct kvm *kvm) } ASSERT(irq_source_id != KVM_USERSPACE_IRQ_SOURCE_ID); -#ifdef CONFIG_X86 ASSERT(irq_source_id != KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID); -#endif set_bit(irq_source_id, bitmap); unlock: mutex_unlock(&kvm->irq_lock); @@ -215,9 +201,7 @@ unlock: void kvm_free_irq_source_id(struct kvm *kvm, int irq_source_id) { ASSERT(irq_source_id != KVM_USERSPACE_IRQ_SOURCE_ID); -#ifdef CONFIG_X86 ASSERT(irq_source_id != KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID); -#endif mutex_lock(&kvm->irq_lock); if (irq_source_id < 0 || @@ -230,9 +214,7 @@ void kvm_free_irq_source_id(struct kvm *kvm, int irq_source_id) goto unlock; 
kvm_ioapic_clear_all(kvm->arch.vioapic, irq_source_id); -#ifdef CONFIG_X86 kvm_pic_clear_all(pic_irqchip(kvm), irq_source_id); -#endif unlock: mutex_unlock(&kvm->irq_lock); } @@ -322,16 +304,11 @@ out: .u.irqchip = { .irqchip = KVM_IRQCHIP_IOAPIC, .pin = (irq) } } #define ROUTING_ENTRY1(irq) IOAPIC_ROUTING_ENTRY(irq) -#ifdef CONFIG_X86 -# define PIC_ROUTING_ENTRY(irq) \ +#define PIC_ROUTING_ENTRY(irq) \ { .gsi = irq, .type = KVM_IRQ_ROUTING_IRQCHIP, \ .u.irqchip = { .irqchip = SELECT_PIC(irq), .pin = (irq) % 8 } } -# define ROUTING_ENTRY2(irq) \ +#define ROUTING_ENTRY2(irq) \ IOAPIC_ROUTING_ENTRY(irq), PIC_ROUTING_ENTRY(irq) -#else -# define ROUTING_ENTRY2(irq) \ - IOAPIC_ROUTING_ENTRY(irq) -#endif static const struct kvm_irq_routing_entry default_routing[] = { ROUTING_ENTRY2(0), ROUTING_ENTRY2(1), @@ -346,20 +323,6 @@ static const struct kvm_irq_routing_entry
Re: [PATCH] kvm: x86: move ioapic.c and irq_comm.c back to arch/x86/
On 21/11/2014 17:19, Radim Krčmář wrote: > 2014-11-20 14:42+0100, Paolo Bonzini: >> ia64 does not need them anymore. > > (Similar for device assignment and iommu, should I prepare patches?) Sure! Feel free to join the party. ;) Paolo >> Signed-off-by: Paolo Bonzini >> --- > > At least one compile-breaker on arches without IOAPIC, > >> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h >> index ea53b04993f2..d2d42709d6f4 100644 >> --- a/include/linux/kvm_host.h >> +++ b/include/linux/kvm_host.h >> +#ifdef __KVM_HAVE_IOAPIC >> +void kvm_vcpu_request_scan_ioapic(struct kvm *kvm); >> +#else >> +static inline void kvm_vcpu_request-scan_ioapic(struct kvm *kvm) > ^_- >> +{ >> +} >> +#endif >> + > > Reviewed-by: Radim Krčmář > > And we could clean them up as well: Will squash in next monday. Paolo > ---8<--- > KVM: x86: remove IA64 from ioapic.c and irq_comm.c > > They won't get compiled in x86 tree. > > Signed-off-by: Radim Krčmář > --- > arch/x86/kvm/ioapic.c | 12 > arch/x86/kvm/irq_comm.c | 41 ++--- > 2 files changed, 2 insertions(+), 51 deletions(-) > > diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c > index 0ba4057..b1947e0 100644 > --- a/arch/x86/kvm/ioapic.c > +++ b/arch/x86/kvm/ioapic.c > @@ -270,7 +270,6 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 > *eoi_exit_bitmap, > spin_unlock(&ioapic->lock); > } > > -#ifdef CONFIG_X86 > void kvm_vcpu_request_scan_ioapic(struct kvm *kvm) > { > struct kvm_ioapic *ioapic = kvm->arch.vioapic; > @@ -279,12 +278,6 @@ void kvm_vcpu_request_scan_ioapic(struct kvm *kvm) > return; > kvm_make_scan_ioapic_request(kvm); > } > -#else > -void kvm_vcpu_request_scan_ioapic(struct kvm *kvm) > -{ > - return; > -} > -#endif > > static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val) > { > @@ -586,11 +579,6 @@ static int ioapic_mmio_write(struct kvm_io_device *this, > gpa_t addr, int len, > case IOAPIC_REG_WINDOW: > ioapic_write_indirect(ioapic, data); > break; > -#ifdef CONFIG_IA64 > - case IOAPIC_REG_EOI: > - __kvm_ioapic_update_eoi(NULL, ioapic, data, IOAPIC_LEVEL_TRIG); > - break; > -#endif > > default: > break; > diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c > index a8f988c..72298b3 100644 > --- a/arch/x86/kvm/irq_comm.c > +++ b/arch/x86/kvm/irq_comm.c > @@ -26,9 +26,6 @@ > #include > > #include > -#ifdef CONFIG_IA64 > -#include > -#endif > > #include "irq.h" > > @@ -38,12 +35,8 @@ static int kvm_set_pic_irq(struct > kvm_kernel_irq_routing_entry *e, > struct kvm *kvm, int irq_source_id, int level, > bool line_status) > { > -#ifdef CONFIG_X86 > struct kvm_pic *pic = pic_irqchip(kvm); > return kvm_pic_set_irq(pic, e->irqchip.pin, irq_source_id, level); > -#else > - return -1; > -#endif > } > > static int kvm_set_ioapic_irq(struct kvm_kernel_irq_routing_entry *e, > @@ -57,12 +50,7 @@ static int kvm_set_ioapic_irq(struct > kvm_kernel_irq_routing_entry *e, > > inline static bool kvm_is_dm_lowest_prio(struct kvm_lapic_irq *irq) > { > -#ifdef CONFIG_IA64 > - return irq->delivery_mode == > - (IOSAPIC_LOWEST_PRIORITY << IOSAPIC_DELIVERY_SHIFT); > -#else > return irq->delivery_mode == APIC_DM_LOWEST; > -#endif > } > > int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src, > @@ -202,9 +190,7 @@ int kvm_request_irq_source_id(struct kvm *kvm) > } > > ASSERT(irq_source_id != KVM_USERSPACE_IRQ_SOURCE_ID); > -#ifdef CONFIG_X86 > ASSERT(irq_source_id != KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID); > -#endif > set_bit(irq_source_id, bitmap); > unlock: > mutex_unlock(&kvm->irq_lock); > @@ -215,9 +201,7 @@ unlock: 
> void kvm_free_irq_source_id(struct kvm *kvm, int irq_source_id) > { > ASSERT(irq_source_id != KVM_USERSPACE_IRQ_SOURCE_ID); > -#ifdef CONFIG_X86 > ASSERT(irq_source_id != KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID); > -#endif > > mutex_lock(&kvm->irq_lock); > if (irq_source_id < 0 || > @@ -230,9 +214,7 @@ void kvm_free_irq_source_id(struct kvm *kvm, int > irq_source_id) > goto unlock; > > kvm_ioapic_clear_all(kvm->arch.vioapic, irq_source_id); > -#ifdef CONFIG_X86 > kvm_pic_clear_all(pic_irqchip(kvm), irq_source_id); > -#endif > unlock: > mutex_unlock(&kvm->irq_lock); > } > @@ -322,16 +304,11 @@ out: > .u.irqchip = { .irqchip = KVM_IRQCHIP_IOAPIC, .pin = (irq) } } > #define ROUTING_ENTRY1(irq) IOAPIC_ROUTING_ENTRY(irq) > > -#ifdef CONFIG_X86 > -# define PIC_ROUTING_ENTRY(irq) \ > +#define PIC_ROUTING_ENTRY(irq) \ > { .gsi = irq, .type = KVM_IRQ_ROUTING_IRQCHIP, \ > .u.irqchip = { .irqchip = SELECT_PIC(irq), .pin = (irq) % 8 } } > -# define ROUTING_E
Re: can I make this work… (Foundation for accessibility project)
On 11/21/2014 09:06 AM, Paolo Bonzini wrote: On 20/11/2014 23:22, Eric S. Johansson wrote: I'll be able to run some tests in about 2 to 3 hours after I finish this document. Let me know what I should look at? on a side note, a pointer to an automated install process would be wonderful. GNOME Boxes can pretty much automate the install process. Can you just run "ps aux" while the install is running and send the result? I went back and verified I had installed all packages. apparently I missed a few updates. also I was more familiar with the UI tool. I noticed a few places where kvm was now an option. last I made a copy of the dvd to an iso as an install image. end result is *wow* much faster. I now have hope that my project will work. sure does like giving 110% in cpu speed. 4384 libvirt+ 20 0 2825112 2.058g 9960 R 109.1 26.6 12:47.73 qemu-system-x86 next report after updates install btw, would you like a better UI design for a management tool? I have some ideas but would need someone with hands to put it together. --- eric top sez Tasks: 182 total, 4 running, 178 sleeping, 0 stopped, 0 zombie %Cpu(s): 44.2 us, 14.9 sy, 0.0 ni, 38.7 id, 2.0 wa, 0.0 hi, 0.2 si, 0.0 st KiB Mem: 8128204 total, 4750320 used, 3377884 free,54476 buffers KiB Swap: 8338428 total,0 used, 8338428 free. 1996164 cached Mem PID USER PR NIVIRTRESSHR S %CPU %MEM TIME+ COMMAND 4384 libvirt+ 20 0 2634992 2.033g 9940 R 108.6 26.2 2:02.83 qemu-syste+ 2668 eric 20 0 1284184 66308 29828 S 2.3 0.8 0:21.50 compiz 1314 root 20 0 1032288 22264 11436 S 2.0 0.3 0:46.29 libvirtd 18 root 20 0 0 0 0 S 1.7 0.0 0:00.96 kworker/1:0 1423 root 20 0 410736 49196 35228 S 1.7 0.6 0:32.18 Xorg 4694 root 20 0 0 0 0 R 1.7 0.0 0:00.20 kworker/0:1 2837 eric 20 0 1481612 102828 38476 S 1.0 1.3 0:54.03 python 2628 eric 20 0 20232940768 S 0.3 0.0 0:00.69 syndaemon 3047 eric 20 0 653160 20868 12472 S 0.3 0.3 0:02.14 gnome-term+ 3147 eric 20 0 377868 4168 3288 S 0.3 0.1 0:00.04 deja-dup-m+ 1 root 20 0 33908 3280 1472 S 0.0 0.0 0:01.62 init 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd 3 root 20 0 0 0 0 S 0.0 0.0 0:00.16 ksoftirqd/0 4 root 20 0 0 0 0 S 0.0 0.0 0:00.72 kworker/0:0 5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:+ 7 root 20 0 0 0 0 S 0.0 0.0 0:00.50 rcu_sched 8 root 20 0 0 0 0 R 0.0 0.0 0:00.40 rcuos/0 eric@garnet:~$ ps aux sez eric@garnet:~$ ps -aux USER PID %CPU %MEMVSZ RSS TTY STAT START TIME COMMAND root 1 0.1 0.0 33908 3280 ?Ss 11:12 0:01 /sbin/init root 2 0.0 0.0 0 0 ?S11:12 0:00 [kthreadd] root 3 0.0 0.0 0 0 ?S11:12 0:00 [ksoftirqd/0] root 4 0.0 0.0 0 0 ?S11:12 0:00 [kworker/0:0] root 5 0.0 0.0 0 0 ?S< 11:12 0:00 [kworker/0:0H] root 7 0.0 0.0 0 0 ?S11:12 0:00 [rcu_sched] root 8 0.0 0.0 0 0 ?S11:12 0:00 [rcuos/0] root 9 0.0 0.0 0 0 ?S11:12 0:00 [rcuos/1] root10 0.0 0.0 0 0 ?S11:12 0:00 [rcu_bh] root11 0.0 0.0 0 0 ?S11:12 0:00 [rcuob/0] root12 0.0 0.0 0 0 ?S11:12 0:00 [rcuob/1] root13 0.0 0.0 0 0 ?S11:12 0:00 [migration/0] root14 0.0 0.0 0 0 ?S11:12 0:00 [watchdog/0] root15 0.0 0.0 0 0 ?S11:12 0:00 [watchdog/1] root16 0.0 0.0 0 0 ?S11:12 0:00 [migration/1] root17 0.0 0.0 0 0 ?S11:12 0:00 [ksoftirqd/1] root18 0.0 0.0 0 0 ?S11:12 0:01 [kworker/1:0] root19 0.0 0.0 0 0 ?S< 11:12 0:00 [kworker/1:0H] root20 0.0 0.0 0 0 ?S< 11:12 0:00 [khelper] root21 0.0 0.0 0 0 ?S11:12 0:00 [kdevtmpfs] root22 0.0 0.0 0 0 ?S< 11:12 0:00 [netns] root23 0.0 0.0 0 0 ?S< 11:12 0:00 [writeback] root24 0.0 0.0 0 0 ?S< 11:12 0:00 [kintegrityd] root25 0.0 0.0 0 0 ?S< 11:12 0:00 [bioset] root26 0.0 0.0 0 0 ?S< 11:12 0:00 [kworker/u5:0] root27 0.0 0.0 0 0 ?S< 11:12 0:00 
[kblockd] root28 0.0 0.0 0 0 ?S< 11:
Re: [PATCH] kvm: x86: move ioapic.c and irq_comm.c back to arch/x86/
On 21/11/2014 17:19, Radim Krčmář wrote: > KVM: x86: remove IA64 from ioapic.c and irq_comm.c > > They won't get compiled in x86 tree. Ah no, these were already in my ia64 removal patch. I had a deja-vu feeling... Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] arm64: KVM: fix unmapping with 48-bit VAs
From: Mark Rutland Currently if using a 48-bit VA, tearing down the hyp page tables (which can happen in the absence of a GICH or GICV resource) results in the rather nasty splat below, evidently becasue we access a table that doesn't actually exist. Commit 38f791a4e499792e (arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2) added a pgd_none check to __create_hyp_mappings to account for the additional level of tables, but didn't add a corresponding check to unmap_range, and this seems to be the source of the problem. This patch adds the missing pgd_none check, ensuring we don't try to access tables that don't exist. Original splat below: kvm [1]: Using HYP init bounce page @83fe94a000 kvm [1]: Cannot obtain GICH resource Unable to handle kernel paging request at virtual address 7f7fff00 pgd = 8077 [7f7fff00] *pgd= Internal error: Oops: 9604 [#1] PREEMPT SMP Modules linked in: CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc2+ #89 task: 8003eb50 ti: 8003eb45c000 task.ti: 8003eb45c000 PC is at unmap_range+0x120/0x580 LR is at free_hyp_pgds+0xac/0xe4 pc : [] lr : [] pstate: 8045 sp : 8003eb45fbf0 x29: 8003eb45fbf0 x28: 80736000 x27: 80735000 x26: 7f7fff00 x25: 4000 x24: 806f5000 x23: x22: 007f x21: 8000 x20: 0080 x19: x18: 80648000 x17: 80537228 x16: x15: 001f x14: x13: 0001 x12: 0020 x11: 0062 x10: 0006 x9 : x8 : 0063 x7 : 0018 x6 : 0003ff00 x5 : 80744188 x4 : 0001 x3 : 4000 x2 : 8000 x1 : 007f x0 : 3fff Process swapper/0 (pid: 1, stack limit = 0x8003eb45c058) Stack: (0x8003eb45fbf0 to 0x8003eb46) fbe0: eb45fcb0 8003 0009cad8 8000 fc00: 0080 00736140 8000 00736000 8000 7c80 fc20: 0080 006f5000 8000 0080 00743000 8000 fc40: 00735000 8000 006d3030 8000 006fe7b8 8000 0080 fc60: 007f fdac1000 8003 fd94b000 8003 fda47000 8003 fc80: 00502b40 8000 ff00 7f7f fdec6000 8003 fdac1630 8003 fca0: eb45fcb0 8003 007f eb45fd00 8003 0009b378 8000 fcc0: ffea 006fe000 8000 00736728 8000 00736120 8000 fce0: 0040 00743000 8000 006fe7b8 8000 0050cd48 fd00: eb45fd60 8003 00096070 8000 006f06e0 8000 006f06e0 8000 fd20: fd948b40 8003 0009a320 8000 fd40: 0ae0 006aa25c 8000 eb45fd60 8003 0017ca44 0002 fd60: eb45fdc0 8003 0009a33c 8000 006f06e0 8000 006f06e0 8000 fd80: fd948b40 8003 0009a320 8000 00735000 8000 fda0: 006d3090 8000 006aa25c 8000 00735000 8000 006d3030 8000 fdc0: eb45fdd0 8003 000814c0 8000 eb45fe50 8003 006aaac4 8000 fde0: 006ddd90 8000 0006 006d3000 8000 0095 fe00: 006a1e90 8000 00735000 8000 006d3000 8000 006aa25c 8000 fe20: 00735000 8000 006d3030 8000 eb45fe50 8003 006fac68 8000 fe40: 0006 0006 fe293ee6 8003 eb45feb0 8003 004f8ee8 8000 fe60: 004f8ed4 8000 00735000 8000 fe80: fea0: 000843d0 8000 fec0: 004f8ed4 8000 fee0: ff00: ff20: ff40: ff60: ff80: ffa0: ffc0: 0005 ffe0: Call trace: [] unmap_range+0x120/0x580 [] free_hyp_pgds+0xa8/0xe4 [] kvm_arch_init+0x268/0x44c [] kvm_init+0x24/0x260 [] arm_init+0x18/0x24 [] do_one_initcall+0x88/0x1a0 [] kernel_init_freeable+0x148/0x1e8 [] kernel_init+0x10/0xd4 Code: 8b000263 92628479 d1000720 eb01001f (f9400340) ---[ end trace 3bc230562e926fa4 ]--- Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b Signed-off-by: Mark Rutland Cc: Catalin Marinas Cc: Jun
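The patch body itself is not quoted above (only the oops), but the fix described amounts to guarding the descent into the PUD level with a pgd_none() check, mirroring what commit 38f791a4e499792e already did for __create_hyp_mappings. A rough sketch of that shape, with helper names approximated rather than taken from the actual diff:

/* Sketch only: skip top-level entries that were never populated, so the
 * walk never dereferences a table that does not exist. */
static void unmap_range(struct kvm *kvm, pgd_t *pgdp,
			phys_addr_t start, u64 size)
{
	phys_addr_t addr = start, end = start + size;
	phys_addr_t next;
	pgd_t *pgd = pgdp + pgd_index(addr);

	do {
		next = pgd_addr_end(addr, end);
		if (!pgd_none(*pgd))	/* the check that was missing */
			unmap_puds(kvm, pgd, addr, next);
	} while (pgd++, addr = next, addr != end);
}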
[PULL] arm/arm64: KVM: pull request for 3.18-rc6
Hi Paolo, Please consider pulling the following patches that fixes a few issues for KVM on arm/arm64. The following changes since commit 41e7ed64d86db351a94063596b478a0bfc040258: KVM: nVMX: Disable preemption while reading from shadow VMCS (2014-10-29 13:13:52 +0100) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvm-arm-for-3.18-rc6 for you to fetch changes up to 837711af0e99718af5a8cc84fe42ea335c9c71ce: arm/arm64: KVM: vgic: Fix error code in kvm_vgic_create() (2014-11-21 17:00:57 +) Updates for KVM/{arm,arm64}, fixing a few issues: - fix an unmap error when using 48bit VAs - trap access to ICC_SRE_EL1 when the guest is trying to use GICv3 - return an error when userspace is trying to init the vgic on a running VM Christoffer Dall (2): arm64: KVM: Handle traps of ICC_SRE_EL1 as RAZ/WI arm/arm64: KVM: vgic: Fix error code in kvm_vgic_create() Mark Rutland (1): arm64: KVM: fix unmapping with 48-bit VAs arch/arm/kvm/mmu.c| 3 ++- arch/arm64/kvm/sys_regs.c | 9 + virt/kvm/arm/vgic.c | 8 3 files changed, 15 insertions(+), 5 deletions(-) -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/3] arm64: KVM: Handle traps of ICC_SRE_EL1 as RAZ/WI
From: Christoffer Dall When running on a system with a GICv3, we currenly don't allow the guest to access the system register interface of the GICv3. We do this by clearing the ICC_SRE_EL2.Enable, which causes all guest accesses to ICC_SRE_EL1 to trap to EL2 and causes all guest accesses to other ICC_ registers to cause an undefined exception in the guest. However, we currently don't handle the trap of guest accesses to ICC_SRE_EL1 and will spill out a warning. The trap just needs to handle the access as RAZ/WI, and a guest that tries to prod this register and set ICC_SRE_EL1.SRE=1, must read back the value (which Linux already does) to see if it succeeded, and will thus observe that ICC_SRE_EL1.SRE was not set. Add the simple trap handler in the sorted table of the system registers. Signed-off-by: Christoffer Dall [ardb: added cp15 handling] Signed-off-by: Ard Biesheuvel Signed-off-by: Marc Zyngier --- arch/arm64/kvm/sys_regs.c | 9 + 1 file changed, 9 insertions(+) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 4cc3b71..3d7c2df 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -424,6 +424,11 @@ static const struct sys_reg_desc sys_reg_descs[] = { /* VBAR_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b), Op2(0b000), NULL, reset_val, VBAR_EL1, 0 }, + + /* ICC_SRE_EL1 */ + { Op0(0b11), Op1(0b000), CRn(0b1100), CRm(0b1100), Op2(0b101), + trap_raz_wi }, + /* CONTEXTIDR_EL1 */ { Op0(0b11), Op1(0b000), CRn(0b1101), CRm(0b), Op2(0b001), access_vm_reg, reset_val, CONTEXTIDR_EL1, 0 }, @@ -690,6 +695,10 @@ static const struct sys_reg_desc cp15_regs[] = { { Op1( 0), CRn(10), CRm( 2), Op2( 1), access_vm_reg, NULL, c10_NMRR }, { Op1( 0), CRn(10), CRm( 3), Op2( 0), access_vm_reg, NULL, c10_AMAIR0 }, { Op1( 0), CRn(10), CRm( 3), Op2( 1), access_vm_reg, NULL, c10_AMAIR1 }, + + /* ICC_SRE */ + { Op1( 0), CRn(12), CRm(12), Op2( 5), trap_raz_wi }, + { Op1( 0), CRn(13), CRm( 0), Op2( 1), access_vm_reg, NULL, c13_CID }, }; -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
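The diff above only adds table entries that point at trap_raz_wi; the handler itself already exists in sys_regs.c and is not part of this patch. Its behaviour is exactly what RAZ/WI implies: writes are silently dropped and reads return zero, so a guest that sets ICC_SRE_EL1.SRE=1 and reads the register back observes 0. Roughly (helper names and signature approximated, not taken from this patch):

/* Sketch of a RAZ/WI trap handler: writes are ignored, reads return 0. */
static bool trap_raz_wi(struct kvm_vcpu *vcpu,
			const struct sys_reg_params *p,
			const struct sys_reg_desc *r)
{
	if (p->is_write)
		return ignore_write(vcpu, p);	/* WI: discard the value */
	else
		return read_zero(vcpu, p);	/* RAZ: destination register := 0 */
}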
[PATCH 3/3] arm/arm64: KVM: vgic: Fix error code in kvm_vgic_create()
From: Christoffer Dall If we detect another vCPU is running we just exit and return 0 as if we succesfully created the VGIC, but the VGIC wouldn't actual be created. This shouldn't break in-kernel behavior because the kernel will not observe the failed the attempt to create the VGIC, but userspace could be rightfully confused. Cc: Andre Przywara Signed-off-by: Christoffer Dall Signed-off-by: Marc Zyngier --- virt/kvm/arm/vgic.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c index 3aaca49..aacdb59 100644 --- a/virt/kvm/arm/vgic.c +++ b/virt/kvm/arm/vgic.c @@ -1933,7 +1933,7 @@ out: int kvm_vgic_create(struct kvm *kvm) { - int i, vcpu_lock_idx = -1, ret = 0; + int i, vcpu_lock_idx = -1, ret; struct kvm_vcpu *vcpu; mutex_lock(&kvm->lock); @@ -1948,6 +1948,7 @@ int kvm_vgic_create(struct kvm *kvm) * vcpu->mutex. By grabbing the vcpu->mutex of all VCPUs we ensure * that no other VCPUs are run while we create the vgic. */ + ret = -EBUSY; kvm_for_each_vcpu(i, vcpu, kvm) { if (!mutex_trylock(&vcpu->mutex)) goto out_unlock; @@ -1955,11 +1956,10 @@ int kvm_vgic_create(struct kvm *kvm) } kvm_for_each_vcpu(i, vcpu, kvm) { - if (vcpu->arch.has_run_once) { - ret = -EBUSY; + if (vcpu->arch.has_run_once) goto out_unlock; - } } + ret = 0; spin_lock_init(&kvm->arch.vgic.lock); kvm->arch.vgic.in_kernel = true; -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm: x86: move ioapic.c and irq_comm.c back to arch/x86/
2014-11-21 18:05+0100, Paolo Bonzini: > On 21/11/2014 17:19, Radim Krčmář wrote: > > KVM: x86: remove IA64 from ioapic.c and irq_comm.c > > > > They won't get compiled in x86 tree. > > Ah no, these were already in my ia64 removal patch. I had a deja-vu > feeling... Oops, renaming simplifies conflict resolution ... CONFIG_X86 removal should still be applicable though. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: can I make this work… (Foundation for accessibility project)
On 21/11/2014 17:52, Eric S. Johansson wrote: > > 4384 libvirt+ 20 0 2825112 2.058g 9960 R 109.1 26.6 12:47.73 > qemu-system-x86 > > next report after updates install > > btw, would you like a better UI design for a management tool? I have > some ideas but would need someone with hands to put it together. I don't develop the management tool, but there are several. The most advanced UI is probably in GNOME Boxes, but it also has less functionality than virt-manager. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
next puzzle: Re: can I make this work… (Foundation for accessibility project)
On 11/21/2014 11:52 AM, Eric S. Johansson wrote: 4384 libvirt+ 20 0 2825112 2.058g 9960 R 109.1 26.6 12:47.73 qemu-system-x86 next report after updates install next puzzle. updates are not working using bridged to eth0 using virt io driver (checked install on windows) browser works in vm (quite well in fact) watching output of tcpdump and there is no apparent traffic for updates. any ideas? btw, would you like a better UI design for a management tool? I have some ideas but would need someone with hands to put it together. --- eric top sez Tasks: 182 total, 4 running, 178 sleeping, 0 stopped, 0 zombie %Cpu(s): 44.2 us, 14.9 sy, 0.0 ni, 38.7 id, 2.0 wa, 0.0 hi, 0.2 si, 0.0 st KiB Mem: 8128204 total, 4750320 used, 3377884 free,54476 buffers KiB Swap: 8338428 total,0 used, 8338428 free. 1996164 cached Mem PID USER PR NIVIRTRESSHR S %CPU %MEM TIME+ COMMAND 4384 libvirt+ 20 0 2634992 2.033g 9940 R 108.6 26.2 2:02.83 qemu-syste+ 2668 eric 20 0 1284184 66308 29828 S 2.3 0.8 0:21.50 compiz 1314 root 20 0 1032288 22264 11436 S 2.0 0.3 0:46.29 libvirtd 18 root 20 0 0 0 0 S 1.7 0.0 0:00.96 kworker/1:0 1423 root 20 0 410736 49196 35228 S 1.7 0.6 0:32.18 Xorg 4694 root 20 0 0 0 0 R 1.7 0.0 0:00.20 kworker/0:1 2837 eric 20 0 1481612 102828 38476 S 1.0 1.3 0:54.03 python 2628 eric 20 0 20232940768 S 0.3 0.0 0:00.69 syndaemon 3047 eric 20 0 653160 20868 12472 S 0.3 0.3 0:02.14 gnome-term+ 3147 eric 20 0 377868 4168 3288 S 0.3 0.1 0:00.04 deja-dup-m+ 1 root 20 0 33908 3280 1472 S 0.0 0.0 0:01.62 init 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd 3 root 20 0 0 0 0 S 0.0 0.0 0:00.16 ksoftirqd/0 4 root 20 0 0 0 0 S 0.0 0.0 0:00.72 kworker/0:0 5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:+ 7 root 20 0 0 0 0 S 0.0 0.0 0:00.50 rcu_sched 8 root 20 0 0 0 0 R 0.0 0.0 0:00.40 rcuos/0 eric@garnet:~$ ps aux sez eric@garnet:~$ ps -aux USER PID %CPU %MEMVSZ RSS TTY STAT START TIME COMMAND root 1 0.1 0.0 33908 3280 ?Ss 11:12 0:01 /sbin/init root 2 0.0 0.0 0 0 ?S11:12 0:00 [kthreadd] root 3 0.0 0.0 0 0 ?S11:12 0:00 [ksoftirqd/0] root 4 0.0 0.0 0 0 ?S11:12 0:00 [kworker/0:0] root 5 0.0 0.0 0 0 ?S< 11:12 0:00 [kworker/0:0H] root 7 0.0 0.0 0 0 ?S11:12 0:00 [rcu_sched] root 8 0.0 0.0 0 0 ?S11:12 0:00 [rcuos/0] root 9 0.0 0.0 0 0 ?S11:12 0:00 [rcuos/1] root10 0.0 0.0 0 0 ?S11:12 0:00 [rcu_bh] root11 0.0 0.0 0 0 ?S11:12 0:00 [rcuob/0] root12 0.0 0.0 0 0 ?S11:12 0:00 [rcuob/1] root13 0.0 0.0 0 0 ?S11:12 0:00 [migration/0] root14 0.0 0.0 0 0 ?S11:12 0:00 [watchdog/0] root15 0.0 0.0 0 0 ?S11:12 0:00 [watchdog/1] root16 0.0 0.0 0 0 ?S11:12 0:00 [migration/1] root17 0.0 0.0 0 0 ?S11:12 0:00 [ksoftirqd/1] root18 0.0 0.0 0 0 ?S11:12 0:01 [kworker/1:0] root19 0.0 0.0 0 0 ?S< 11:12 0:00 [kworker/1:0H] root20 0.0 0.0 0 0 ?S< 11:12 0:00 [khelper] root21 0.0 0.0 0 0 ?S11:12 0:00 [kdevtmpfs] root22 0.0 0.0 0 0 ?S< 11:12 0:00 [netns] root23 0.0 0.0 0 0 ?S< 11:12 0:00 [writeback] root24 0.0 0.0 0 0 ?S< 11:12 0:00 [kintegrityd] root25 0.0 0.0 0 0 ?S< 11:12 0:00 [bioset] root26 0.0 0.0 0 0 ?S< 11:12 0:00 [kworker/u5:0] root27 0.0 0.0 0 0 ?S< 11:12 0:00 [kblockd] root28 0.0 0.0 0 0 ?S< 11:12 0:00 [ata_sff] root29 0.0 0.0 0 0 ?S11:12 0:00 [khubd] root30 0.0 0.0 0 0 ?S< 11:12 0:00 [md] root31 0.0 0.0 0 0 ?S< 11:12 0:00 [devfreq_wq] root34 0.0 0.0 0 0 ?S11:12 0:00 [khungtaskd] root35 0.0 0.0 0 0 ?S11:12 0:00 [kswapd0] root36 0.1 0.0 0 0 ?SN 11:12 0:02 [ksmd] root37 0.0 0.0 0 0 ?SN 11:12 0:00 [khugepaged] root38 0
[CFT PATCH 1/2] kvm: x86: mask out XSAVES
This feature is not supported inside KVM guests yet, because we do not emulate MSR_IA32_XSS. Mask it out. Cc: sta...@vger.kernel.org Cc: Nadav Amit Signed-off-by: Paolo Bonzini --- arch/x86/kvm/cpuid.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 20d83217fb1d..a4f5ac46226c 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -320,6 +320,10 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, F(ADX) | F(SMAP) | F(AVX512F) | F(AVX512PF) | F(AVX512ER) | F(AVX512CD); + /* cpuid 0xD.1.eax */ + const u32 kvm_supported_word10_x86_features = + F(XSAVEOPT) | F(XSAVEC) | F(XGETBV1); + /* all calls to cpuid_count() should be made on the same cpu */ get_cpu(); @@ -456,13 +460,18 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function, entry->eax &= supported; entry->edx &= supported >> 32; entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX; + if (!supported) + break; + for (idx = 1, i = 1; idx < 64; ++idx) { u64 mask = ((u64)1 << idx); if (*nent >= maxnent) goto out; do_cpuid_1_ent(&entry[i], function, idx); - if (entry[i].eax == 0 || !(supported & mask)) + if (idx == 1) + entry[i].eax &= kvm_supported_word10_x86_features; + else if (entry[i].eax == 0 || !(supported & mask)) continue; entry[i].flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX; -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
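The observable effect in the guest is that CPUID leaf 0xD, sub-leaf 1 now reports only XSAVEOPT, XSAVEC and XGETBV1; the XSAVES bit stays clear until MSR_IA32_XSS is emulated. A small guest-side check, as a sketch, assuming GCC/Clang's <cpuid.h> (bit positions per the SDM: 0=XSAVEOPT, 1=XSAVEC, 2=XGETBV1 with ECX=1, 3=XSAVES):

#include <stdio.h>
#include <cpuid.h>

int main(void)
{
	unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

	/* CPUID.(EAX=0Dh, ECX=1).EAX enumerates the XSAVE extended features. */
	__cpuid_count(0x0d, 1, eax, ebx, ecx, edx);

	printf("XSAVEOPT=%u XSAVEC=%u XGETBV1=%u XSAVES=%u\n",
	       eax & 1, (eax >> 1) & 1, (eax >> 2) & 1, (eax >> 3) & 1);
	return 0;
}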
[CFT PATCH 2/2] KVM: x86: support XSAVES usage in the host
Userspace is expecting non-compacted format for KVM_GET_XSAVE, but struct xsave_struct might be using the compacted format. Convert in order to preserve userspace ABI. Fixes: f31a9f7c71691569359fa7fb8b0acaa44bce0324 Cc: Fenghua Yu Cc: sta...@vger.kernel.org Cc: Nadav Amit Signed-off-by: Paolo Bonzini --- arch/x86/kvm/x86.c | 48 +++- 1 file changed, 43 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 5337039427c8..7e8a20e5615a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3131,15 +3131,53 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu, return 0; } +#define XSTATE_COMPACTION_ENABLED (1ULL << 63) + +static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu) +{ + struct xsave_struct *xsave = &vcpu->arch.guest_fpu.state->xsave; + u64 xstate_bv = vcpu->arch.guest_supported_xcr0 | XSTATE_FPSSE; + u64 valid; + + /* +* Copy legacy XSAVE area, to avoid complications with CPUID +* leaves 0 and 1 in the loop below. +*/ + memcpy(dest, xsave, XSAVE_HDR_OFFSET); + + /* Set XSTATE_BV */ + *(u64 *)(dest + XSAVE_HDR_OFFSET) = xstate_bv; + + /* +* Copy each region from the possibly compacted offset to the +* non-compacted offset. +*/ + valid = xstate_bv & ~XSTATE_FPSSE; + if (xsave->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED) + valid &= xsave->xsave_hdr.xcomp_bv; + + while (valid) { + u64 feature = valid & -valid; + int index = fls64(feature) - 1; + void *src = get_xsave_addr(xsave, feature); + + if (src) { + u32 size, offset, ecx, edx; + cpuid_count(XSTATE_CPUID, index, + &size, &offset, &ecx, &edx); + memcpy(dest + offset, src, size); + } + + valid -= feature; + } +} + static void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu, struct kvm_xsave *guest_xsave) { if (cpu_has_xsave) { - memcpy(guest_xsave->region, - &vcpu->arch.guest_fpu.state->xsave, - vcpu->arch.guest_xstate_size); - *(u64 *)&guest_xsave->region[XSAVE_HDR_OFFSET / sizeof(u32)] &= - vcpu->arch.guest_supported_xcr0 | XSTATE_FPSSE; + memset(guest_xsave, 0, sizeof(struct kvm_xsave)); + fill_xsave((u8 *) guest_xsave->region, vcpu); } else { memcpy(guest_xsave->region, &vcpu->arch.guest_fpu.state->fxsave, -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
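The point of fill_xsave() is that KVM_GET_XSAVE keeps handing userspace the legacy, non-compacted XSAVE image even when the kernel stores the guest FPU state in compacted form, so existing consumers keep working unchanged. A minimal sketch of that consumer side (assumes an already-created vCPU fd; error handling mostly elided):

#include <linux/kvm.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* XSTATE_BV sits at byte offset 512 of the non-compacted XSAVE image;
 * the constant is defined locally since it is not exported to userspace. */
#define XSAVE_HDR_OFFSET 512

/* Sketch: fetch the guest XSAVE image; because the region is always in
 * non-compacted layout, per-feature offsets come straight from CPUID 0xD. */
static uint64_t guest_xstate_bv(int vcpu_fd)
{
	struct kvm_xsave xsave;
	uint64_t xstate_bv;

	if (ioctl(vcpu_fd, KVM_GET_XSAVE, &xsave) < 0)
		return 0;
	memcpy(&xstate_bv, (uint8_t *)xsave.region + XSAVE_HDR_OFFSET,
	       sizeof(xstate_bv));
	return xstate_bv;
}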
Re: [PULL] arm/arm64: KVM: pull request for 3.18-rc6
On 21/11/2014 18:21, Marc Zyngier wrote: > git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git > tags/kvm-arm-for-3.18-rc6 Pulled, thanks. I'm not sure I'll be able to forward the request to Linus in time, though. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 88671] BUG at drivers/pci/ats.c:62! the second time booting a kvm guest with pci passthrough
https://bugzilla.kernel.org/show_bug.cgi?id=88671 --- Comment #1 from Tom Stellard --- Created attachment 158391 --> https://bugzilla.kernel.org/attachment.cgi?id=158391&action=edit Backtrace from BUG_ON -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 88671] New: BUG at drivers/pci/ats.c:62! the second time booting a kvm guest with pci passthrough
https://bugzilla.kernel.org/show_bug.cgi?id=88671 Bug ID: 88671 Summary: BUG at drivers/pci/ats.c:62! the second time booting a kvm guest with pci passthrough Product: Virtualization Version: unspecified Kernel Version: 3.17.3 Hardware: All OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: kvm Assignee: virtualization_...@kernel-bugs.osdl.org Reporter: tstel...@gmail.com Regression: No Created attachment 158381 --> https://bugzilla.kernel.org/attachment.cgi?id=158381&action=edit lspci I'm running into this bug while trying to use pci passthrough of an AMD BONAIRE XT (Radeon HD 7790) Steps to reproduce: 1. virsh start vm 2. virsh destroy vm 3. virsh start vm This bug only appears after starting the vm for the second time. The first time the vm boots normally and passthrough works as expected. -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 88671] BUG at drivers/pci/ats.c:62! the second time booting a kvm guest with pci passthrough
https://bugzilla.kernel.org/show_bug.cgi?id=88671 --- Comment #2 from Tom Stellard --- Created attachment 158401 --> https://bugzilla.kernel.org/attachment.cgi?id=158401&action=edit Virtual machine definition -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: KVM causes #GP on XRSTORS
On 21/11/2014 15:46, Paolo Bonzini wrote: > Fenghua, which processors have XSAVEC, which have XGETBV with ECX=1, and > which have XSAVES? We need to expose this in QEMU, for which I can send > a patch later today or next week (CCing Eduardo for this). Actually no change in QEMU is needed to hide XSAVES; the KVM patch I just sent should be enough for "-cpu host" to work. We still need the information on processor support though, in order to enable the feature with the right -cpu options. I assume Nadav was using "-cpu host". Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 88671] BUG at drivers/pci/ats.c:62! the second time booting a kvm guest with pci passthrough
https://bugzilla.kernel.org/show_bug.cgi?id=88671 --- Comment #3 from Tom Stellard --- I should also mention that I have this hook executing when the machine starts up: if [ "$2" = "prepare" ]; then virsh nodedev-detach pci__01_00_1 fi -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()
On 10/11/2014 09:33, Ard Biesheuvel wrote: > This reverts commit 85c8555ff0 ("KVM: check for !is_zero_pfn() in > kvm_is_mmio_pfn()") and renames the function to kvm_is_reserved_pfn. > > The problem being addressed by the patch above was that some ARM code > based the memory mapping attributes of a pfn on the return value of > kvm_is_mmio_pfn(), whose name indeed suggests that such pfns should > be mapped as device memory. > > However, kvm_is_mmio_pfn() doesn't do quite what it says on the tin, > and the existing non-ARM users were already using it in a way which > suggests that its name should probably have been 'kvm_is_reserved_pfn' > from the beginning, e.g., whether or not to call get_page/put_page on > it etc. This means that returning false for the zero page is a mistake > and the patch above should be reverted. > > Signed-off-by: Ard Biesheuvel > --- > arch/ia64/kvm/kvm-ia64.c | 2 +- > arch/x86/kvm/mmu.c | 6 +++--- > include/linux/kvm_host.h | 2 +- > virt/kvm/kvm_main.c | 16 > 4 files changed, 13 insertions(+), 13 deletions(-) > > diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c > index ec6b9acb6bea..dbe46f43884d 100644 > --- a/arch/ia64/kvm/kvm-ia64.c > +++ b/arch/ia64/kvm/kvm-ia64.c > @@ -1563,7 +1563,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, > > for (i = 0; i < npages; i++) { > pfn = gfn_to_pfn(kvm, base_gfn + i); > - if (!kvm_is_mmio_pfn(pfn)) { > + if (!kvm_is_reserved_pfn(pfn)) { > kvm_set_pmt_entry(kvm, base_gfn + i, > pfn << PAGE_SHIFT, > _PAGE_AR_RWX | _PAGE_MA_WB); > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c > index ac1c4de3a484..978f402006ee 100644 > --- a/arch/x86/kvm/mmu.c > +++ b/arch/x86/kvm/mmu.c > @@ -630,7 +630,7 @@ static int mmu_spte_clear_track_bits(u64 *sptep) >* kvm mmu, before reclaiming the page, we should >* unmap it from mmu first. >*/ > - WARN_ON(!kvm_is_mmio_pfn(pfn) && !page_count(pfn_to_page(pfn))); > + WARN_ON(!kvm_is_reserved_pfn(pfn) && !page_count(pfn_to_page(pfn))); > > if (!shadow_accessed_mask || old_spte & shadow_accessed_mask) > kvm_set_pfn_accessed(pfn); > @@ -2461,7 +2461,7 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep, > spte |= PT_PAGE_SIZE_MASK; > if (tdp_enabled) > spte |= kvm_x86_ops->get_mt_mask(vcpu, gfn, > - kvm_is_mmio_pfn(pfn)); > + kvm_is_reserved_pfn(pfn)); > > if (host_writable) > spte |= SPTE_HOST_WRITEABLE; > @@ -2737,7 +2737,7 @@ static void transparent_hugepage_adjust(struct kvm_vcpu > *vcpu, >* PT_PAGE_TABLE_LEVEL and there would be no adjustment done >* here. 
>*/ > - if (!is_error_noslot_pfn(pfn) && !kvm_is_mmio_pfn(pfn) && > + if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn) && > level == PT_PAGE_TABLE_LEVEL && > PageTransCompound(pfn_to_page(pfn)) && > !has_wrprotected_page(vcpu->kvm, gfn, PT_DIRECTORY_LEVEL)) { > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index ea53b04993f2..a6059bdf7b03 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -703,7 +703,7 @@ void kvm_arch_sync_events(struct kvm *kvm); > int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu); > void kvm_vcpu_kick(struct kvm_vcpu *vcpu); > > -bool kvm_is_mmio_pfn(pfn_t pfn); > +bool kvm_is_reserved_pfn(pfn_t pfn); > > struct kvm_irq_ack_notifier { > struct hlist_node link; > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 25ffac9e947d..3cee7b167052 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -107,10 +107,10 @@ EXPORT_SYMBOL_GPL(kvm_rebooting); > > static bool largepages_enabled = true; > > -bool kvm_is_mmio_pfn(pfn_t pfn) > +bool kvm_is_reserved_pfn(pfn_t pfn) > { > if (pfn_valid(pfn)) > - return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn)); > + return PageReserved(pfn_to_page(pfn)); > > return true; > } > @@ -1321,7 +1321,7 @@ static pfn_t hva_to_pfn(unsigned long addr, bool > atomic, bool *async, > else if ((vma->vm_flags & VM_PFNMAP)) { > pfn = ((addr - vma->vm_start) >> PAGE_SHIFT) + > vma->vm_pgoff; > - BUG_ON(!kvm_is_mmio_pfn(pfn)); > + BUG_ON(!kvm_is_reserved_pfn(pfn)); > } else { > if (async && vma_is_valid(vma, write_fault)) > *async = true; > @@ -1427,7 +1427,7 @@ static struct page *kvm_pfn_to_page(pfn_t pfn) > if (is_error_noslot_pfn(pfn)) > return KVM_ERR_PTR_BAD_PAGE; > > - if (kvm_is_mmio_pfn(pfn)) { > + if (kvm_is_reserved_pfn(pfn)) { > WARN_ON(1); > return KVM_ERR_PTR_BAD_PAGE; > } > @@ -1456,7
Re: [PATCH 2/2] kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()
On 21/11/2014 14:18, Christoffer Dall wrote: > On Fri, Nov 21, 2014 at 02:06:40PM +0100, Paolo Bonzini wrote: >> >> >> On 21/11/2014 12:46, Christoffer Dall wrote: >>> Hi Paolo, >>> >>> I think these look good, would you mind queueing them as either a fix or >>> for 3.19 as you see fit, assuming you agree with the content? >> >> Ah, I was thinking _you_ would queue them for 3.19. >> > We can do that, did I miss your previous ack or reviewed-by? Since there's more stuff for 3.18 I can include these too. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[CFT PATCH 0/2] KVM: support XSAVES usage in the host
The first patch ensures that XSAVES is not exposed in the guest until we emulate MSR_IA32_XSS. The second exports XSAVE data in the correct format. I tested these on a non-XSAVES system so they should not be completely broken, but I need some help. I am not even sure which XSAVE states are _not_ enabled, and thus compacted, in Linux. Note that these patches do not add support for XSAVES in the guest yet, since MSR_IA32_XSS is not emulated. If they fix the bug Nadav reported, I'll add Reported-by and commit. Thanks, Paolo Paolo Bonzini (2): kvm: x86: mask out XSAVES KVM: x86: support XSAVES usage in the host arch/x86/kvm/cpuid.c | 11 ++- arch/x86/kvm/x86.c | 48 +++- 2 files changed, 53 insertions(+), 6 deletions(-) -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: next puzzle: Re: can I make this work… (Foundation for accessibility project)
a little more info On 11/21/2014 01:24 PM, Eric S. Johansson wrote: next puzzle. updates are not working using bridged to eth0 using virt io driver (checked install on windows) browser works in vm (quite well in fact) watching output of tcpdump and there is no apparent traffic for updates. in resource manager, svchost.exe (netsvcs) is running at 100% -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 88671] BUG at drivers/pci/ats.c:62! the second time booting a kvm guest with pci passthrough
https://bugzilla.kernel.org/show_bug.cgi?id=88671 Alex Williamson changed: What|Removed |Added CC||alex.william...@redhat.com --- Comment #4 from Alex Williamson --- I'm not sure how you're getting to this BUG_ON, but (a) legacy KVM device assignment is deprecated and (b) the card you've chosen has known reset issues. You might want to try vfio-pci, which I know can make this card work at least once per host boot, but you're likely to get a BSOD and IOMMU faults on subsequent guest [re]boots. The reset problem with this card has been reported to AMD, but there is no solution at this time. -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH v1 1/2] vfio: Add new interrupt group for VFIO
On Fri, 2014-11-21 at 06:06 +, Wu, Feng wrote: > > > -Original Message- > > From: Alex Williamson [mailto:alex.william...@redhat.com] > > Sent: Thursday, November 20, 2014 11:54 PM > > To: Wu, Feng > > Cc: pbonz...@redhat.com; kvm@vger.kernel.org; eric.auger > > Subject: Re: [RFC PATCH v1 1/2] vfio: Add new interrupt group for VFIO > > > > On Thu, 2014-11-20 at 17:05 +0800, Feng Wu wrote: > > > Add new group KVM_DEV_VFIO_INTERRUPT and command > > > KVM_DEV_VFIO_DEVIE_POSTING_IRQ related to it. > > > > > > This is used for VT-d Posted-Interrupts setup. > > > > Eric proposed an interface for ARM forwarded interrupts[1] using group > > KVM_DEV_VFIO_DEVICE with attributes > > KVM_DEV_VFIO_DEVICE_ASSIGN_IRQ and > > KVM_DEV_VFIO_DEVICE_DEASSIGN_IRQ. Why are we proposing yet another > > group and attributes here? Why can't we re-use the ones Eric proposes? > > > > I totally agree that I can reuse Eric's proposals. However, as Eric mentioned > in > his reply, I am using another data structure. So how about adding my own > attribute, say, KVM_DEV_VFIO_DEVICE_POSTING_IRQ in group KVM_DEV_VFIO_DEVICE. Right, Eric's latest proposal (sorry I picked the v1 links by mistake in my previous reply) includes: KVM_DEV_VFIO_DEVICE attributes: KVM_DEV_VFIO_DEVICE_FORWARD_IRQ KVM_DEV_VFIO_DEVICE_UNFORWARD_IRQ So I think we'd want to add something similar for posted interrupts, some sort of "start" and "stop" attribute. At the QEMU level we'll want to abstract both of these as opportunistic IRQ accelerators, but at the KVM-VFIO level we probably need to make them distinct using a separate set of attributes. Who knows, maybe one day ARM will support posted interrupts and Intel will support forwarding... I expect the calls from the KVM-VFIO device into VFIO at the kernel level to be largely the same between the different attributes though. Thanks, Alex > > [1] https://lkml.org/lkml/2014/8/25/258 > > > > > Signed-off-by: Feng Wu > > > --- > > > Documentation/virtual/kvm/devices/vfio.txt |8 > > > include/uapi/linux/kvm.h | 14 ++ > > > 2 files changed, 22 insertions(+), 0 deletions(-) > > > > > > diff --git a/Documentation/virtual/kvm/devices/vfio.txt > > b/Documentation/virtual/kvm/devices/vfio.txt > > > index ef51740..bd99176 100644 > > > --- a/Documentation/virtual/kvm/devices/vfio.txt > > > +++ b/Documentation/virtual/kvm/devices/vfio.txt > > > @@ -13,6 +13,7 @@ VFIO-group is held by KVM. > > > > > > Groups: > > >KVM_DEV_VFIO_GROUP > > > + KVM_DEV_VFIO_INTERRUPT > > > > > > KVM_DEV_VFIO_GROUP attributes: > > >KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device > > tracking > > > @@ -20,3 +21,10 @@ KVM_DEV_VFIO_GROUP attributes: > > > > > > For each, kvm_device_attr.addr points to an int32_t file descriptor > > > for the VFIO group. > > > + > > > +KVM_DEV_VFIO_INTERRUPT attributes: > > > + KVM_DEV_VFIO_INTERRUPT_POSTING_IRQ: Set up the interrupt > > configuration for > > > +VT-d Posted-Interrrupts > > > + > > > +For each, kvm_device_attr.addr points to struct kvm_posted_intr, which > > > +include the needed information for VT-d Posted-Interrupts setup. 
> > > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > > > index 6076882..5544fcc 100644 > > > --- a/include/uapi/linux/kvm.h > > > +++ b/include/uapi/linux/kvm.h > > > @@ -943,9 +943,23 @@ struct kvm_device_attr { > > > __u64 addr; /* userspace address of attr data */ > > > }; > > > > > > +struct virq_info { > > > + __u32 index; /* index of the msi/msix entry */ > > > + int virq; /* virq of the interrupt */ > > > +}; > > > + > > > +struct kvm_posted_intr { > > > + __u32 fd; /* file descriptor of the VFIO device */ > > > + __u32 count; > > > + boolmsix; > > > > Note that MSI-X (as opposed to MSI) is a PCI concept. Being a VFIO > > interface this should operate at VFIO IRQ index and sub-index. > > Yes, I will use VFIO stuff instead. > > Thanks, > Feng > > > > > > + struct virq_info virq_info[0]; > > > +}; > > > + > > > #define KVM_DEV_VFIO_GROUP 1 > > > #define KVM_DEV_VFIO_GROUP_ADD 1 > > > #define KVM_DEV_VFIO_GROUP_DEL 2 > > > +#define KVM_DEV_VFIO_INTERRUPT 2 > > > +#define KVM_DEV_VFIO_INTERRUPT_POSTING_IRQ 1 > > > > > > enum kvm_device_type { > > > KVM_DEV_TYPE_FSL_MPIC_20= 1, > > > > > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
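Whichever attribute names win out (forward/unforward vs. a posted-interrupt start/stop pair), the userspace plumbing stays the existing KVM device API: create the VFIO pseudo-device once, then issue KVM_SET_DEVICE_ATTR with the chosen group/attr and a pointer to the payload. A sketch using the group attribute that is already upstream; the posted-interrupt attribute discussed in this thread is still only a proposal and is not shown:

#include <linux/kvm.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Sketch: create the KVM-VFIO device and attach a VFIO group fd to it.
 * A forward/posted-IRQ attribute would be set the same way, just with a
 * different group/attr pair and payload structure. */
static int kvm_vfio_add_group(int vm_fd, int vfio_group_fd)
{
	struct kvm_create_device cd = { .type = KVM_DEV_TYPE_VFIO };
	struct kvm_device_attr attr;
	int32_t fd = vfio_group_fd;

	if (ioctl(vm_fd, KVM_CREATE_DEVICE, &cd) < 0)
		return -1;

	attr.flags = 0;
	attr.group = KVM_DEV_VFIO_GROUP;
	attr.attr  = KVM_DEV_VFIO_GROUP_ADD;
	attr.addr  = (uint64_t)(uintptr_t)&fd;	/* points at an int32_t group fd */

	return ioctl(cd.fd, KVM_SET_DEVICE_ATTR, &attr);
}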
Re: [CFT PATCH 2/2] KVM: x86: support XSAVES usage in the host
On 11/21/2014 10:31 AM, Paolo Bonzini wrote: > Userspace is expecting non-compacted format for KVM_GET_XSAVE, but > struct xsave_struct might be using the compacted format. Convert > in order to preserve userspace ABI. > > Fixes: f31a9f7c71691569359fa7fb8b0acaa44bce0324 > Cc: Fenghua Yu > Cc: sta...@vger.kernel.org > Cc: Nadav Amit > Signed-off-by: Paolo Bonzini > --- > arch/x86/kvm/x86.c | 48 +++- > 1 file changed, 43 insertions(+), 5 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 5337039427c8..7e8a20e5615a 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -3131,15 +3131,53 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct > kvm_vcpu *vcpu, > return 0; > } > > +#define XSTATE_COMPACTION_ENABLED (1ULL << 63) > + > +static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu) > +{ > + struct xsave_struct *xsave = &vcpu->arch.guest_fpu.state->xsave; > + u64 xstate_bv = vcpu->arch.guest_supported_xcr0 | XSTATE_FPSSE; > + u64 valid; > + > + /* > + * Copy legacy XSAVE area, to avoid complications with CPUID > + * leaves 0 and 1 in the loop below. > + */ > + memcpy(dest, xsave, XSAVE_HDR_OFFSET); > + > + /* Set XSTATE_BV */ > + *(u64 *)(dest + XSAVE_HDR_OFFSET) = xstate_bv; > + > + /* > + * Copy each region from the possibly compacted offset to the > + * non-compacted offset. > + */ > + valid = xstate_bv & ~XSTATE_FPSSE; > + if (xsave->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED) > + valid &= xsave->xsave_hdr.xcomp_bv; > + > + while (valid) { > + u64 feature = valid & -valid; > + int index = fls64(feature) - 1; > + void *src = get_xsave_addr(xsave, feature); > + > + if (src) { > + u32 size, offset, ecx, edx; > + cpuid_count(XSTATE_CPUID, index, > + &size, &offset, &ecx, &edx); > + memcpy(dest + offset, src, size); Is this really the best way to do this? cpuid is serializing, so this is possibly *very* slow. --Andy -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 00/17] RFC: userfault v2
Hi Peter, On Wed, Oct 29, 2014 at 05:56:59PM +, Peter Maydell wrote: > On 29 October 2014 17:46, Andrea Arcangeli wrote: > > After some chat during the KVMForum I've been already thinking it > > could be beneficial for some usage to give userland the information > > about the fault being read or write > > ...I wonder if that would let us replace the current nasty > mess we use in linux-user to detect read vs write faults > (which uses a bunch of architecture-specific hacks including > in some cases "look at the insn that triggered this SEGV and > decode it to see if it was a load or a store"; see the > various cpu_signal_handler() implementations in user-exec.c). There's currently no plan to deliver to userland read access notifications of a present page, simply because the task of the userfaultfd is to handle the page fault in userland, but if the page is mapped and readable it won't fault in the first place :). I just mean it's not like gdb read watch. Even if the region would be set to PROT_NONE it would still SEGV without triggering an userfault (after all pte_present would still true because the page is still mapped despite not being readable, so in any case it wouldn't be considered a not-present page fault). If you temporarily remove the page (which requires an unavoidable TLB flush also considering if the page was previously mapped the TLB could still resolve it for reads) it would work then, because the plan is to provide read/write fault information through the userfaultfd. In theory it would be possible to deliver PROT_NONE faults through userfault too but it doesn't make much sense because PROT_NONE still requires a TLB flush, in addition to the vma modifications/splitting/rbtree-rebalance and the mmap_sem for writing as well. Temporarily removing/moving the page with remap_anon_pages shall be much better than using PROT_NONE for this (or alternative syscall name to differentiate it further from remap_file_pages, or equivalent userfaultfd command if we decide to hide the pte/pmd mangling as userfaultfd commands instead of adding new standalone syscalls). It would have the only constraint that you must mark the region MADV_DONTFORK if you intend linux-user to ever fork or it won't work reliably (that constraint is to eliminate the need of additional rmap complexity, precisely so that it doesn't turn into something more intrusive like remap_file_pages). I assume that would be a fine constraint for linux-user. Thanks, Andrea -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
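For reference, this is roughly what a handler loop looks like against the userfaultfd interface as it later stabilized upstream. It is not the v2 RFC interface under discussion here (remap_anon_pages was eventually dropped in favour of a UFFDIO_COPY ioctl), but it shows the read/write fault information mentioned above arriving with the fault message:

#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Sketch: resolve missing-page faults for [base, base+len) by copying the
 * corresponding pages from a pre-filled buffer 'src'. Error handling and
 * threading are elided; normally the loop runs in a dedicated thread. */
static void handle_faults(void *base, void *src, size_t len, size_t page)
{
	int uffd = syscall(__NR_userfaultfd, O_CLOEXEC);
	struct uffdio_api api = { .api = UFFD_API };
	struct uffdio_register reg = {
		.range = { .start = (unsigned long)base, .len = len },
		.mode  = UFFDIO_REGISTER_MODE_MISSING,
	};
	struct uffd_msg msg;

	ioctl(uffd, UFFDIO_API, &api);
	ioctl(uffd, UFFDIO_REGISTER, &reg);

	while (read(uffd, &msg, sizeof(msg)) == sizeof(msg)) {
		unsigned long addr;
		int is_write;

		if (msg.event != UFFD_EVENT_PAGEFAULT)
			continue;

		addr = msg.arg.pagefault.address & ~(page - 1);
		is_write = msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WRITE;
		(void)is_write;		/* read vs. write info, as discussed above */

		struct uffdio_copy copy = {
			.dst = addr,
			.src = (unsigned long)src + (addr - (unsigned long)base),
			.len = page,
		};
		ioctl(uffd, UFFDIO_COPY, &copy);
	}
}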
[PATCH] kvm: x86: move assigned-dev.c and iommu.c to arch/x86/
Now that ia64 is gone, we can hide deprecated device assignment in x86. Notable changes: - kvm_vm_ioctl_assigned_device() was moved to x86/kvm_arch_vm_ioctl() The easy parts were removed from generic kvm code, remaining - kvm_iommu_(un)map_pages() would require new code to be moved - struct kvm_assigned_dev_kernel depends on struct kvm_irq_ack_notifier Signed-off-by: Radim Krčmář --- Or are we going to remove it instead? ;) arch/x86/include/asm/kvm_host.h | 23 +++ arch/x86/kvm/Makefile | 2 +- {virt => arch/x86}/kvm/assigned-dev.c | 0 {virt => arch/x86}/kvm/iommu.c| 0 arch/x86/kvm/x86.c| 2 +- include/linux/kvm_host.h | 26 -- virt/kvm/kvm_main.c | 2 -- 7 files changed, 25 insertions(+), 30 deletions(-) rename {virt => arch/x86}/kvm/assigned-dev.c (100%) rename {virt => arch/x86}/kvm/iommu.c (100%) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 76ff3e2..d549cf8 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1112,4 +1112,27 @@ int kvm_pmu_read_pmc(struct kvm_vcpu *vcpu, unsigned pmc, u64 *data); void kvm_handle_pmu_event(struct kvm_vcpu *vcpu); void kvm_deliver_pmi(struct kvm_vcpu *vcpu); +#ifdef CONFIG_KVM_DEVICE_ASSIGNMENT +int kvm_iommu_map_guest(struct kvm *kvm); +int kvm_iommu_unmap_guest(struct kvm *kvm); + +long kvm_vm_ioctl_assigned_device(struct kvm *kvm, unsigned ioctl, + unsigned long arg); + +void kvm_free_all_assigned_devices(struct kvm *kvm); +#else +static inline int kvm_iommu_unmap_guest(struct kvm *kvm) +{ + return 0; +} + +static inline long kvm_vm_ioctl_assigned_device(struct kvm *kvm, unsigned ioctl, + unsigned long arg) +{ + return -ENOTTY; +} + +static inline void kvm_free_all_assigned_devices(struct kvm *kvm) {} +#endif + #endif /* _ASM_X86_KVM_HOST_H */ diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile index ee1cd92..08f790d 100644 --- a/arch/x86/kvm/Makefile +++ b/arch/x86/kvm/Makefile @@ -9,11 +9,11 @@ KVM := ../../../virt/kvm kvm-y += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \ $(KVM)/eventfd.o $(KVM)/irqchip.o $(KVM)/vfio.o -kvm-$(CONFIG_KVM_DEVICE_ASSIGNMENT)+= $(KVM)/assigned-dev.o $(KVM)/iommu.o kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o kvm-y += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \ i8254.o ioapic.o irq_comm.o cpuid.o pmu.o +kvm-$(CONFIG_KVM_DEVICE_ASSIGNMENT)+= assigned-dev.o iommu.o kvm-intel-y+= vmx.o kvm-amd-y += svm.o diff --git a/virt/kvm/assigned-dev.c b/arch/x86/kvm/assigned-dev.c similarity index 100% rename from virt/kvm/assigned-dev.c rename to arch/x86/kvm/assigned-dev.c diff --git a/virt/kvm/iommu.c b/arch/x86/kvm/iommu.c similarity index 100% rename from virt/kvm/iommu.c rename to arch/x86/kvm/iommu.c diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 5337039..782e4ea 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4007,7 +4007,7 @@ long kvm_arch_vm_ioctl(struct file *filp, } default: - ; + r = kvm_vm_ioctl_assigned_device(kvm, ioctl, arg); } out: return r; diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index d2d4270..746e3ef 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -764,8 +764,6 @@ void kvm_free_irq_source_id(struct kvm *kvm, int irq_source_id); #ifdef CONFIG_KVM_DEVICE_ASSIGNMENT int kvm_iommu_map_pages(struct kvm *kvm, struct kvm_memory_slot *slot); void kvm_iommu_unmap_pages(struct kvm *kvm, struct kvm_memory_slot *slot); -int kvm_iommu_map_guest(struct kvm *kvm); -int kvm_iommu_unmap_guest(struct kvm *kvm); int kvm_assign_device(struct kvm *kvm, struct kvm_assigned_dev_kernel 
*assigned_dev); int kvm_deassign_device(struct kvm *kvm, @@ -781,11 +779,6 @@ static inline void kvm_iommu_unmap_pages(struct kvm *kvm, struct kvm_memory_slot *slot) { } - -static inline int kvm_iommu_unmap_guest(struct kvm *kvm) -{ - return 0; -} #endif static inline void kvm_guest_enter(void) @@ -1005,25 +998,6 @@ static inline bool kvm_vcpu_compatible(struct kvm_vcpu *vcpu) { return true; } #endif -#ifdef CONFIG_KVM_DEVICE_ASSIGNMENT - -long kvm_vm_ioctl_assigned_device(struct kvm *kvm, unsigned ioctl, - unsigned long arg); - -void kvm_free_all_assigned_devices(struct kvm *kvm); - -#else - -static inline long kvm_vm_ioctl_assigned_device(struct kvm *kvm, unsigned ioctl, - unsigned long arg) -{ - return -ENOTTY; -} - -static inline void kvm_free_all_assigned_devices(struct kvm *kvm) {} - -#endif - static inl
Re: [CFT PATCH 2/2] KVM: x86: support XSAVES usage in the host
On 21/11/2014 21:06, Andy Lutomirski wrote: >> > + cpuid_count(XSTATE_CPUID, index, >> > + &size, &offset, &ecx, &edx); >> > + memcpy(dest + offset, src, size); > Is this really the best way to do this? cpuid is serializing, so this > is possibly *very* slow. The data is in arch/x86/kernel/xsave.c, but it is not exported. But this is absolutely not a hotspot. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] KVM: x86: fix access memslots w/o hold srcu read lock
On 21/11/2014 07:30, Wanpeng Li wrote: > I test it on the other guy's Ivytown and take advantage of the qemu command > line which he used, so I forget the accurate command line which used that day. > > Paolo also reproduce the bug, Paolo, ping. It also always reproduced for me with a debug kernel from Fedora. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [question] lots of interrupts injected to vm when pressing some key w/o releasing
On 21/11/2014 15:20, Zhang, Yang Z wrote: > Zhang Haoyu wrote on 2014-11-20: >> Hi all, >> >> If I press the one of "Insert/Delete/Home/End/PageUp/PageDown/UpArrow/ >> DownArrow/LeftArrow/RightArrow" key w/o releasing, then lots of >> interrupts will be injected to vm(win7/win2008), about 8000/s, the >> system become very slow, bringing very bad experience. But the other keys >> are okay. >> And, linux guest has no this problem. >> >> If I remove the commit of 0bc830b05c667218d703f2026ec866c49df974fc, then >> the problem disappeared, but win7 guest got stuck at booting stage. And >> so strange that If the vm has only one vcpu, then the problem also >> disappeared. >> >> Any ideas? > > It looks commit 0bc830 doesn't do the right thing. The right point > to clear an edge triggered interrupt in ioapic->irr is after userspace > changes the irq line status. Otherwise, there may cause interrupt storm > if a device sets the irq line in a fix edge continuously. > > See below code: > ioapic_set_irq: > . > old_irr = ioapic->irr; > ioapic->irr |= mask; > if ((edge && old_irr == ioapic->irr) || > (!edge && entry.fields.remote_irr)) { > ret = 0; // normally, > we should break from here. But we never go to here due to (edge && old_irr != > ioapic->irr) now. > goto out; > } The IRR register means an interrupt was received and not serviced yet, similar to the LAPIC or PIC register. It is not the same thing as the interrupt line level (it happens to be for level-triggered interrupts). We observed lost interrupts during migration, and fixing the semantics of IRR was necessary in order to reinject those properly (commit 673f7b4257). If QEMU sends KVM_IRQ_LINE twice with level=1 it should be fixed---it is not supposed to do so. Commit 0bc830b05 makes the kernel IOAPIC behave the same way as QEMU's. If you want the old semantics of KVM_IRQ_LINE, that requires a separate register, different from IRR but it is not easy because they were buggy: the level of the interrupt is not part of the IOAPIC state structs in KVM, and it is not migrated in QEMU either. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
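In other words, for an edge-triggered source userspace is expected to pulse the line rather than re-assert level=1: each interrupt is a 1 followed by a 0 through KVM_IRQ_LINE. A sketch of that pulse (assumes an in-kernel irqchip has been created for the VM):

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Sketch: deliver one edge on a GSI by raising and then lowering the line. */
static int pulse_edge_irq(int vm_fd, unsigned int gsi)
{
	struct kvm_irq_level irq = { .irq = gsi, .level = 1 };

	if (ioctl(vm_fd, KVM_IRQ_LINE, &irq) < 0)
		return -1;

	irq.level = 0;
	return ioctl(vm_fd, KVM_IRQ_LINE, &irq);
}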
Re: [Qemu-devel] [PATCH 00/17] RFC: userfault v2
On 21 November 2014 20:14, Andrea Arcangeli wrote: > Hi Peter, > > On Wed, Oct 29, 2014 at 05:56:59PM +, Peter Maydell wrote: >> On 29 October 2014 17:46, Andrea Arcangeli wrote: >> > After some chat during the KVMForum I've been already thinking it >> > could be beneficial for some usage to give userland the information >> > about the fault being read or write >> >> ...I wonder if that would let us replace the current nasty >> mess we use in linux-user to detect read vs write faults >> (which uses a bunch of architecture-specific hacks including >> in some cases "look at the insn that triggered this SEGV and >> decode it to see if it was a load or a store"; see the >> various cpu_signal_handler() implementations in user-exec.c). > > There's currently no plan to deliver to userland read access > notifications of a present page, simply because the task of the > userfaultfd is to handle the page fault in userland, but if the page > is mapped and readable it won't fault in the first place :). I just > mean it's not like gdb read watch. If it's mapped and readable-but-not-writable then it should still fault on write accesses, though? These are cases we currently get SEGV for, anyway. > Even if the region would be set to PROT_NONE it would still SEGV > without triggering an userfault (after all pte_present would still > true because the page is still mapped despite not being readable, so > in any case it wouldn't be considered a not-present page fault). Ah, I guess we have a terminology difference. I was considering "page fault" to mean (roughly) "anything that causes the CPU to take an exception on an attempted load/store" and expected that userfaultfd would notify userspace of any of those. (Well, not alignment faults, maybe, but I'm definitely surprised that access permission issues don't get reported the same way as page-completely-missing issues. In other words I was expecting that this was "everything previously reported via SIGSEGV or SIGBUS now comes via userfaultfd".) > Temporarily removing/moving the page with remap_anon_pages shall be > much better than using PROT_NONE for this (or alternative syscall name > to differentiate it further from remap_file_pages, or equivalent > userfaultfd command if we decide to hide the pte/pmd mangling as > userfaultfd commands instead of adding new standalone syscalls). We don't use PROT_NONE for the linux-user situation, we just use mprotect() to remove the PAGE_WRITE permission so it's still readable. I suspect actually linux-user would be better off implementing something like "if this is a page which we've mapped read-only because we translated code out of it, then go ahead and remap it r/w and throw away the translation and retry the access, otherwise report SEGV to the guest", because taking SEGVs shouldn't be a fast path in the guest binary. That would let us work without architecture-specific junk and without requiring new kernel features either. So you can ignore this whole tangent thread :-) thanks -- PMM -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] arm/arm64: Enable Dirty Page logging for ARMv8 move log read, tlb flush to generic code
On 11/21/2014 02:09 AM, Christoffer Dall wrote: > On Wed, Nov 19, 2014 at 12:15:55PM -0800, Mario Smarduch wrote: >> On 11/19/2014 06:39 AM, Christoffer Dall wrote: >>> Hi Mario, >>> >>> On Fri, Nov 07, 2014 at 12:51:39PM -0800, Mario Smarduch wrote: On 11/07/2014 12:20 PM, Christoffer Dall wrote: > On Thu, Oct 09, 2014 at 07:34:07PM -0700, Mario Smarduch wrote: >> This patch enables ARMv8 dirty page logging and unifies ARMv7/ARMv8 code. >> >> Signed-off-by: Mario Smarduch >> --- >> arch/arm/include/asm/kvm_host.h | 12 >> arch/arm/kvm/arm.c | 9 - >> arch/arm/kvm/mmu.c | 17 +++-- >> arch/arm64/kvm/Kconfig | 2 +- >> 4 files changed, 12 insertions(+), 28 deletions(-) >> >> diff --git a/arch/arm/include/asm/kvm_host.h >> b/arch/arm/include/asm/kvm_host.h >> index 12311a5..59565f5 100644 >> --- a/arch/arm/include/asm/kvm_host.h >> +++ b/arch/arm/include/asm/kvm_host.h >> @@ -220,18 +220,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t >> boot_pgd_ptr, >> kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr); >> } >> >> -/** >> - * kvm_arch_flush_remote_tlbs() - flush all VM TLB entries >> - * @kvm: pointer to kvm structure. >> - * >> - * Interface to HYP function to flush all VM TLB entries without address >> - * parameter. >> - */ >> -static inline void kvm_arch_flush_remote_tlbs(struct kvm *kvm) >> -{ >> -kvm_call_hyp(__kvm_tlb_flush_vmid, kvm); >> -} >> - >> static inline int kvm_arch_dev_ioctl_check_extension(long ext) >> { >> return 0; >> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c >> index 0546fa3..6a6fd6b 100644 >> --- a/arch/arm/kvm/arm.c >> +++ b/arch/arm/kvm/arm.c >> @@ -242,7 +242,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, >> const struct kvm_memory_slot *old, >> enum kvm_mr_change change) >> { >> -#ifdef CONFIG_ARM >> /* >> * At this point memslot has been committed and there is an >> * allocated dirty_bitmap[], dirty pages will be be tracked >> while the >> @@ -250,7 +249,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, >> */ >> if ((change != KVM_MR_DELETE) && (mem->flags & >> KVM_MEM_LOG_DIRTY_PAGES)) >> kvm_mmu_wp_memory_region(kvm, mem->slot); >> -#endif >> } >> >> void kvm_arch_flush_shadow_all(struct kvm *kvm) >> @@ -783,13 +781,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp, >> } >> } >> >> -#ifdef CONFIG_ARM64 >> -int kvm_arch_vm_ioctl_get_dirty_log(struct kvm *kvm, struct >> kvm_dirty_log *log) >> -{ >> -return -EINVAL; >> -} >> -#endif >> - >> static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm, >> struct kvm_arm_device_addr >> *dev_addr) >> { >> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c >> index df1a5a3..8c0f9f2 100644 >> --- a/arch/arm/kvm/mmu.c >> +++ b/arch/arm/kvm/mmu.c >> @@ -49,11 +49,18 @@ static phys_addr_t hyp_idmap_vector; >> >> static bool kvm_get_logging_state(struct kvm_memory_slot *memslot) >> { >> -#ifdef CONFIG_ARM >> return !!memslot->dirty_bitmap; >> -#else >> -return false; >> -#endif >> +} >> + >> +/** >> + * kvm_arch_flush_remote_tlbs() - flush all VM TLB entries for ARMv7/8 >> + * @kvm:pointer to kvm structure. 
>> + * >> + * Interface to HYP function to flush all VM TLB entries >> + */ >> +inline void kvm_arch_flush_remote_tlbs(struct kvm *kvm) >> +{ >> +kvm_call_hyp(__kvm_tlb_flush_vmid, kvm); >> } >> >> static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa) >> @@ -769,7 +776,6 @@ static bool transparent_hugepage_adjust(pfn_t *pfnp, >> phys_addr_t *ipap) >> return false; >> } >> >> -#ifdef CONFIG_ARM >> /** >> * stage2_wp_ptes - write protect PMD range >> * @pmd:pointer to pmd entry >> @@ -917,7 +923,6 @@ void kvm_mmu_write_protect_pt_masked(struct kvm *kvm, >> >> stage2_wp_range(kvm, start, end); >> } >> -#endif >> >> static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, >>struct kvm_memory_slot *memslot, >> diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig >> index 40a8d19..a1a35809 100644 >> --- a/arch/arm64/kvm/Kconfig >> +++ b/arch/arm64/kvm/Kconfig >> @@ -26,7 +26,7 @@ config KVM >> select KVM_ARM_HOST >> select KVM_ARM_VGI
Re: [PATCH 3/3] arm, arm64: KVM: handle potential incoherency of readonly memslots
On 11/21/2014 03:19 AM, Christoffer Dall wrote: > Hi Mario, > > On Wed, Nov 19, 2014 at 03:32:31PM -0800, Mario Smarduch wrote: >> Hi Laszlo, >> >> couple observations. >> >> I'm wondering if access from qemu and guest won't >> result in mixed memory attributes and if that's acceptable >> to the CPU. >> >> Also is if you update memory from qemu you may break >> dirty page logging/migration. Unless there is some other way >> you keep track. Of course it may not be applicable in your >> case (i.e. flash unused after boot). >> > I'm not concerned about this particular case; dirty page logging exists > so KVM can inform userspace when a page may have been dirtied. If > userspace directly dirties (is that a verb?) a page, I would think so, I rely on software too much :) > then it already knows that it needs to migrate that page and > deal with it accordingly. > > Or did I miss some more subtle point here QEMU has a global migration bitmap for all regions initially set dirty, and it's updated over iterations with KVM's dirty bitmap. Once dirty pages are migrated bits are cleared. If QEMU updates a memory region directly I can't see how it's reflected in that migration bitmap that determines what pages should be migrated as it makes it's passes. On x86 if host updates guest memory it marks that page dirty. But virtio writes to guest memory directly and that appears to work just fine. I read that code sometime back, and will need to revisit. - Mario > > -Christoffer > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] KVM: nVMX: nested MSR auto load/restore emulation.
Some hypervisors need MSR auto load/restore feature. We read MSRs from vm-entry MSR load area which specified by L1, and load them via kvm_set_msr in the nested entry. When nested exit occurs, we get MSRs via kvm_get_msr, writting them to L1`s MSR store area. After this, we read MSRs from vm-exit MSR load area, and load them via kvm_set_msr. VirtualBox will work fine with this patch. Signed-off-by: Wincy Van diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h index 990a2fe..986af3f 100644 --- a/arch/x86/include/uapi/asm/vmx.h +++ b/arch/x86/include/uapi/asm/vmx.h @@ -56,6 +56,7 @@ #define EXIT_REASON_MSR_READ31 #define EXIT_REASON_MSR_WRITE 32 #define EXIT_REASON_INVALID_STATE 33 +#define EXIT_REASON_MSR_LOAD_FAIL 34 #define EXIT_REASON_MWAIT_INSTRUCTION 36 #define EXIT_REASON_MONITOR_INSTRUCTION 39 #define EXIT_REASON_PAUSE_INSTRUCTION 40 @@ -114,8 +115,12 @@ { EXIT_REASON_APIC_WRITE,"APIC_WRITE" }, \ { EXIT_REASON_EOI_INDUCED, "EOI_INDUCED" }, \ { EXIT_REASON_INVALID_STATE, "INVALID_STATE" }, \ + { EXIT_REASON_MSR_LOAD_FAIL, "MSR_LOAD_FAIL" }, \ { EXIT_REASON_INVD, "INVD" }, \ { EXIT_REASON_INVVPID, "INVVPID" }, \ { EXIT_REASON_INVPCID, "INVPCID" } +#define VMX_ABORT_SAVE_GUEST_MSR_FAIL1 +#define VMX_ABORT_LOAD_HOST_MSR_FAIL 4 + #endif /* _UAPIVMX_H */ diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 6a951d8..377e405 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -6088,6 +6088,13 @@ static void nested_vmx_failValid(struct kvm_vcpu *vcpu, */ } +static void nested_vmx_abort(struct kvm_vcpu *vcpu, u32 indicator) +{ + /* TODO: not to simply reset guest here. */ + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu); + printk(KERN_WARNING"kvm: nested vmx abort, indicator %d\n", indicator); +} + static enum hrtimer_restart vmx_preemption_timer_fn(struct hrtimer *timer) { struct vcpu_vmx *vmx = @@ -8215,6 +8222,88 @@ static void vmx_start_preemption_timer(struct kvm_vcpu *vcpu) ns_to_ktime(preemption_timeout), HRTIMER_MODE_REL); } +static inline int nested_msr_check_common(struct vmx_msr_entry *e) +{ + if (e->index >> 8 == 0x8 || e->reserved != 0) + return -EINVAL; +return 0; +} + +static inline int nested_load_msr_check(struct vmx_msr_entry *e) +{ + if (e->index == MSR_FS_BASE || +e->index == MSR_GS_BASE || +nested_msr_check_common(e)) + return -EINVAL; + return 0; +} + +/* load guest msr at nested entry. + * return 0 for success, entry index for failed. 
+ */ +static u32 nested_entry_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count) +{ + u32 i = 0; + struct vmx_msr_entry e; + struct msr_data msr; + + msr.host_initiated = false; + while (i < count) { + kvm_read_guest(vcpu->kvm, gpa + i * sizeof(struct vmx_msr_entry), + &e, sizeof(struct vmx_msr_entry)); + if (nested_load_msr_check(&e)) + goto fail; + msr.index = e.index; + msr.data = e.value; + if (kvm_set_msr(vcpu, &msr)) + goto fail; + ++i; +} + return 0; +fail: + return i + 1; +} + +static int nested_exit_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count) +{ + u32 i = 0; + struct vmx_msr_entry e; + +while (i < count) { + kvm_read_guest(vcpu->kvm, gpa + i * sizeof(struct vmx_msr_entry), + &e, sizeof(struct vmx_msr_entry)); + if (nested_msr_check_common(&e)) + return -EINVAL; + if (kvm_get_msr(vcpu, e.index, &e.value)) + return -EINVAL; + kvm_write_guest(vcpu->kvm, gpa + i * sizeof(struct vmx_msr_entry), + &e, sizeof(struct vmx_msr_entry)); + ++i; + } + return 0; +} + +static int nested_exit_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count) +{ + u32 i = 0; + struct vmx_msr_entry e; + struct msr_data msr; + + msr.host_initiated = false; + while (i < count) { + kvm_read_guest(vcpu->kvm, gpa + i * sizeof(struct vmx_msr_entry), + &e, sizeof(struct vmx_msr_entry)); + if (nested_load_msr_check(&e)) + return -EINVAL; + msr.index = e.index; + msr.data = e.value; + if (kvm_set_msr(vcpu, &msr)) + return -EINVAL; + ++i; + } + return 0; +} + /* * prepare_vmcs02 is called when the L1 guest hypervisor runs its nested * L2 guest. L1 has a vmcs for L2 (vmcs12), and this function "merges" it @@ -8509,6 +8598,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) int cpu; struct loaded_vmcs *vmcs02; bool ia32e; + u32 msr_entry_idx; if (!nested_vmx_check_permission(vcpu) || !nested_vmx_check_vmcs12(vcpu)) @@ -8556,11 +8646,12 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) return 1; } - if (vmcs12->vm_entry_msr_load_count > 0 || -vmcs12->vm_exit_msr_load_count > 0 || -vmcs12->vm_exit_msr_store_count > 0) { - pr_warn_ratelimited("%s: VMCS MSR_{LOAD,STORE} unsupported\n", -__func__); + if ((vmcs12->vm_entry_msr_load_count > 0 && + !IS_ALIGNED(vmcs12->vm_entry_msr_load_addr, 16)) || +(vmcs12->vm_exit_msr_load_count > 0 && + !IS_ALIGNED(vmcs12->vm_exit_msr_load
Re: [PATCH] KVM: nVMX: nested MSR auto load/restore emulation.
On 2014-11-22 05:24, Wincy Van wrote: > Some hypervisors need MSR auto load/restore feature. > > We read MSRs from vm-entry MSR load area which specified by L1, > and load them via kvm_set_msr in the nested entry. > When nested exit occurs, we get MSRs via kvm_get_msr, writting > them to L1`s MSR store area. After this, we read MSRs from vm-exit > MSR load area, and load them via kvm_set_msr. > > VirtualBox will work fine with this patch. Cool! This feature is long overdue. Patch is unfortunately misformatted which makes it very hard to read. Please check via linux/scripts/checkpatch.pl for the proper style. Could you also write a corresponding kvm-unit-test (see x86/vmx_tests.c)? Jan