Re: [PATCH 08/18] PCI, powerpc: Register busn_res for root buses
On Mon, 2012-02-27 at 22:36 -0700, Bjorn Helgaas wrote: > > There's a lot of powerpc code that does this: > > bus_range = of_get_property(pcictrl, "bus-range", &len); > hose->first_busno = bus_range[0]; > hose->last_busno = bus_range[1]; > > That *looks* like it is discovering the bus number aperture. Is it? > If it is, why are we using the largest bus number found by > pci_scan_child_bus() rather than "last_busno"? We do that but we somewhat -also- rely on the core bumping it if it needs to make room :-) As I said, we are swimming in dirty waters between reverse engineered stuff we don't know 100% and "designed" stuff. I think we should have ways to more explicitely define what we want tho, ie whether hose->last_busno is just what happens to be the "current" bus number assigned by the firmware or the hard max. Maybe a pci flag ? On the other hand some platforms (all the ppc4xx ones for example) set the flag to reassign all busses ... but have limit on bus numbers simply because they have a memory mapped only config space and we don't have enough address space to ioremap it all on 32-bit. We need to fix them to use a fixmap entry to do atomic on-demand mapping of the config space and lift that restriction, but that isn't done yet. So I think those patches will need really careful handling on our side. Cheers, Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
RE: [PATCH 1/2] powerpc/e500: make load_up_spe a normal fuction
Hi Scott, This had been reviewed before and accepted by internal tree. http://linux.freescale.net/patchwork/patch/11100/ http://git.am.freescale.net/gitolite/gitweb.cgi/sdk/kvm.git/commit/?h=for-sdk1.2&id=c5088844dc665dbdae4fa51b8d58dc203bacc17e I didn't change anything except the line. I just commit to external kvm-ppc mailing list. Should I add my own Signed-off-by? Best Regards, Olivia -Original Message- From: Wood Scott-B07421 Sent: Tuesday, February 28, 2012 3:19 AM To: Yin Olivia-R63875 Cc: kvm-...@vger.kernel.org; k...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; Liu Yu-B13201 Subject: Re: [PATCH 1/2] powerpc/e500: make load_up_spe a normal fuction On 02/27/2012 04:59 AM, Olivia Yin wrote: > So that we can call it in kernel. > > Signed-off-by: Liu Yu Explain why we want this, and point out that this makes it similar to load_up_fpu. > --- > arch/powerpc/kernel/head_fsl_booke.S | 23 ++- > 1 files changed, 6 insertions(+), 17 deletions(-) When posting a patch authored by someone else, more or less unchanged, you should put a From: line in the body of the e-mail. git send-email will do this automatically if you preserve the authorship in the git commit. Also, you should add your own Signed-off-by. -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Recall: [PATCH 1/2] powerpc/e500: make load_up_spe a normal fuction
Yin Olivia-R63875 would like to recall the message, "[PATCH 1/2] powerpc/e500: make load_up_spe a normal fuction". ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
RE: [PATCH 20/21] Introduce struct eeh_stats for EEH
> +struct eeh_stats { > + unsigned int no_device; /* PCI device not found */ ... > + "no device =%d\n" ... Use %u (for all the stats), you really don't want negative values printed. I've NFI how long wrapping these counters might take! If it is feasable (maybe much above 100Hz) then you need 64bit counters. David ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2 0/2] powerpc: Add support for GE IMP3A
These patches add support for the GE IMP3A. This board (based on a Freescale P2020) uses some support for FPGA logic common with the PPC9A and other 86xx based boards, so this support has been moved out of the 86xx directory. A config option (GE_FPGA) has been added to reduce churn on dependant drivers (such as the watchdog timer) when further boards are added. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2 1/2] powerpc: Move GE GPIO and PIC drivers
Move the GE GPIO and PIC drivers to allow these to be used by non-86xx boards. Signed-off-by: Martyn Welch --- v2: Move GPIO and PIC drivers to sysdev/ge/ rather than platforms/. arch/powerpc/platforms/86xx/Kconfig|3 +++ arch/powerpc/platforms/86xx/Makefile |7 +++ arch/powerpc/platforms/86xx/gef_ppc9a.c|2 +- arch/powerpc/platforms/86xx/gef_sbc310.c |2 +- arch/powerpc/platforms/86xx/gef_sbc610.c |2 +- arch/powerpc/sysdev/Kconfig|7 +++ arch/powerpc/sysdev/Makefile |2 ++ arch/powerpc/sysdev/ge/Makefile|2 ++ .../86xx/gef_gpio.c => sysdev/ge/ge_gpio.c}|2 +- .../86xx/gef_pic.c => sysdev/ge/ge_pic.c} |2 +- .../86xx/gef_pic.h => sysdev/ge/ge_pic.h} |0 drivers/watchdog/Kconfig |2 +- 12 files changed, 23 insertions(+), 10 deletions(-) create mode 100644 arch/powerpc/sysdev/ge/Makefile rename arch/powerpc/{platforms/86xx/gef_gpio.c => sysdev/ge/ge_gpio.c} (98%) rename arch/powerpc/{platforms/86xx/gef_pic.c => sysdev/ge/ge_pic.c} (99%) rename arch/powerpc/{platforms/86xx/gef_pic.h => sysdev/ge/ge_pic.h} (100%) diff --git a/arch/powerpc/platforms/86xx/Kconfig b/arch/powerpc/platforms/86xx/Kconfig index 8d6599d..2015022 100644 --- a/arch/powerpc/platforms/86xx/Kconfig +++ b/arch/powerpc/platforms/86xx/Kconfig @@ -39,6 +39,7 @@ config GEF_PPC9A select MMIO_NVRAM select GENERIC_GPIO select ARCH_REQUIRE_GPIOLIB + select GE_FPGA help This option enables support for the GE PPC9A. @@ -48,6 +49,7 @@ config GEF_SBC310 select MMIO_NVRAM select GENERIC_GPIO select ARCH_REQUIRE_GPIOLIB + select GE_FPGA help This option enables support for the GE SBC310. @@ -58,6 +60,7 @@ config GEF_SBC610 select GENERIC_GPIO select ARCH_REQUIRE_GPIOLIB select HAS_RAPIDIO + select GE_FPGA help This option enables support for the GE SBC610. diff --git a/arch/powerpc/platforms/86xx/Makefile b/arch/powerpc/platforms/86xx/Makefile index 4b0d7b1..ede815d 100644 --- a/arch/powerpc/platforms/86xx/Makefile +++ b/arch/powerpc/platforms/86xx/Makefile @@ -7,7 +7,6 @@ obj-$(CONFIG_SMP) += mpc86xx_smp.o obj-$(CONFIG_MPC8641_HPCN) += mpc86xx_hpcn.o obj-$(CONFIG_SBC8641D) += sbc8641d.o obj-$(CONFIG_MPC8610_HPCD) += mpc8610_hpcd.o -gef-gpio-$(CONFIG_GPIOLIB) += gef_gpio.o -obj-$(CONFIG_GEF_SBC610) += gef_sbc610.o gef_pic.o $(gef-gpio-y) -obj-$(CONFIG_GEF_SBC310) += gef_sbc310.o gef_pic.o $(gef-gpio-y) -obj-$(CONFIG_GEF_PPC9A)+= gef_ppc9a.o gef_pic.o $(gef-gpio-y) +obj-$(CONFIG_GEF_SBC610) += gef_sbc610.o +obj-$(CONFIG_GEF_SBC310) += gef_sbc310.o +obj-$(CONFIG_GEF_PPC9A)+= gef_ppc9a.o diff --git a/arch/powerpc/platforms/86xx/gef_ppc9a.c b/arch/powerpc/platforms/86xx/gef_ppc9a.c index 60ce07e..ed58b6c 100644 --- a/arch/powerpc/platforms/86xx/gef_ppc9a.c +++ b/arch/powerpc/platforms/86xx/gef_ppc9a.c @@ -37,9 +37,9 @@ #include #include +#include #include "mpc86xx.h" -#include "gef_pic.h" #undef DEBUG diff --git a/arch/powerpc/platforms/86xx/gef_sbc310.c b/arch/powerpc/platforms/86xx/gef_sbc310.c index 3ecee25..710db69 100644 --- a/arch/powerpc/platforms/86xx/gef_sbc310.c +++ b/arch/powerpc/platforms/86xx/gef_sbc310.c @@ -37,9 +37,9 @@ #include #include +#include #include "mpc86xx.h" -#include "gef_pic.h" #undef DEBUG diff --git a/arch/powerpc/platforms/86xx/gef_sbc610.c b/arch/powerpc/platforms/86xx/gef_sbc610.c index 5090d60..4a13d2f 100644 --- a/arch/powerpc/platforms/86xx/gef_sbc610.c +++ b/arch/powerpc/platforms/86xx/gef_sbc610.c @@ -37,9 +37,9 @@ #include #include +#include #include "mpc86xx.h" -#include "gef_pic.h" #undef DEBUG diff --git a/arch/powerpc/sysdev/Kconfig b/arch/powerpc/sysdev/Kconfig index 7b4df37..cd0ef0b 100644 --- a/arch/powerpc/sysdev/Kconfig +++ b/arch/powerpc/sysdev/Kconfig @@ -29,3 +29,10 @@ config SCOM_DEBUGFS bool "Expose SCOM controllers via debugfs" depends on PPC_SCOM default n + +config GE_FPGA + bool + default n + help + Support for common GPIO and interrupt routing functionality provided + on some GE Single Board Computers. diff --git a/arch/powerpc/sysdev/Makefile b/arch/powerpc/sysdev/Makefile index 5e37b47..f80ff9f 100644 --- a/arch/powerpc/sysdev/Makefile +++ b/arch/powerpc/sysdev/Makefile @@ -65,3 +65,5 @@ obj-$(CONFIG_PPC_SCOM)+= scom.o subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror obj-$(CONFIG_PPC_XICS) += xics/ + +obj-$(CONFIG_GE_FPGA) += ge/ diff --git a/arch/powerpc/sysdev/ge/Makefile b/arch/powerpc/sysdev/ge/Makefile new file mode 100644 index 000..6a10372 --- /dev/null +++ b/arch/pow
[PATCH V2 2/2] powerpc: Board support for GE IMP3A
Initial board support for the GE IMP3A, a 3U compactPCI card with a p2020 processor. Signed-off-by: Martyn Welch --- v2: Rebase patch onto powerpc/next, taking work by Kyle Moffett into account. arch/powerpc/boot/dts/ge_imp3a.dts | 254 ++ arch/powerpc/configs/ge_imp3a_defconfig | 256 +++ arch/powerpc/platforms/85xx/Kconfig | 15 ++ arch/powerpc/platforms/85xx/Makefile|1 + arch/powerpc/platforms/85xx/ge_imp3a.c | 246 + arch/powerpc/sysdev/ge/ge_gpio.c| 28 6 files changed, 800 insertions(+), 0 deletions(-) create mode 100644 arch/powerpc/boot/dts/ge_imp3a.dts create mode 100644 arch/powerpc/configs/ge_imp3a_defconfig create mode 100644 arch/powerpc/platforms/85xx/ge_imp3a.c diff --git a/arch/powerpc/boot/dts/ge_imp3a.dts b/arch/powerpc/boot/dts/ge_imp3a.dts new file mode 100644 index 000..f30fadb --- /dev/null +++ b/arch/powerpc/boot/dts/ge_imp3a.dts @@ -0,0 +1,254 @@ +/* + * GE IMP3A Device Tree Source + * + * Copyright 2010-2011 GE Intelligent Platforms Embedded Systems, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * Based on: P2020 DS Device Tree Source + * Copyright 2009 Freescale Semiconductor Inc. + */ + +/include/ "fsl/p2020si-pre.dtsi" + +/ { + model = "GE_IMP3A"; + compatible = "ge,imp3a"; + + memory { + device_type = "memory"; + }; + + lbc: localbus@fef05000 { + reg = <0 0xfef05000 0 0x1000>; + + ranges = <0x0 0x0 0x0 0xff00 0x0100 + 0x1 0x0 0x0 0xe000 0x0800 + 0x2 0x0 0x0 0xe800 0x0800 + 0x3 0x0 0x0 0xfc10 0x0002 + 0x4 0x0 0x0 0xfc00 0x8000 + 0x5 0x0 0x0 0xfc008000 0x8000 + 0x6 0x0 0x0 0xfee0 0x0004 + 0x7 0x0 0x0 0xfee8 0x0004>; + + /* nor@0,0 is a mirror of part of the memory in nor@1,0 + nor@0,0 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "ge,imp3a-firmware-mirror", "cfi-flash"; + reg = <0x0 0x0 0x100>; + bank-width = <2>; + device-width = <1>; + + partition@0 { + label = "firmware"; + reg = <0x0 0x100>; + read-only; + }; + }; + */ + + nor@1,0 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "ge,imp3a-paged-flash", "cfi-flash"; + reg = <0x1 0x0 0x800>; + bank-width = <2>; + device-width = <1>; + + partition@0 { + label = "user"; + reg = <0x0 0x780>; + }; + + partition@780 { + label = "firmware"; + reg = <0x780 0x80>; + read-only; + }; + }; + + nvram@3,0 { + device_type = "nvram"; + compatible = "simtek,stk14ca8"; + reg = <0x3 0x0 0x2>; + }; + + fpga@4,0 { + compatible = "ge,imp3a-fpga-regs"; + reg = <0x4 0x0 0x20>; + }; + + gef_pic: pic@4,20 { + #interrupt-cells = <1>; + interrupt-controller; + compatible = "ge,imp3a-fpga-pic", "gef,fpga-pic-1.00"; + reg = <0x4 0x20 0x20>; + interrupts = <6 7 0 0>; + }; + + gef_gpio: gpio@4,400 { + #gpio-cells = <2>; + compatible = "ge,imp3a-gpio"; + reg = <0x4 0x400 0x24>; + gpio-controller; + }; + + wdt@4,800 { + compatible = "ge,imp3a-fpga-wdt", "gef,fpga-wdt-1.00", + "gef,fpga-wdt"; + reg = <0x4 0x800 0x8>; + interrupts = <10 4>; + interrupt-parent = <&gef_pic>; + }; + + /* Second watchdog available, driver currently supports one. + wdt@4,808 {
Re: [PATCH 24/37] KVM: PPC: booke: rework rescheduling checks
On 27.02.2012, at 20:28, Scott Wood wrote: > On 02/24/2012 08:26 AM, Alexander Graf wrote: >> -void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) >> +int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) >> { >> unsigned long *pending = &vcpu->arch.pending_exceptions; >> unsigned long old_pending = vcpu->arch.pending_exceptions; >> @@ -283,6 +283,8 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) >> >> /* Tell the guest about our interrupt status */ >> kvmppc_update_int_pending(vcpu, *pending, old_pending); >> + >> +return 0; >> } >> >> pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn) >> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c >> index 9979be1..3fcec2c 100644 >> --- a/arch/powerpc/kvm/booke.c >> +++ b/arch/powerpc/kvm/booke.c >> @@ -439,8 +439,9 @@ static void kvmppc_core_check_exceptions(struct kvm_vcpu >> *vcpu) >> } >> >> /* Check pending exceptions and deliver one, if possible. */ >> -void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) >> +int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) >> { >> +int r = 0; >> WARN_ON_ONCE(!irqs_disabled()); >> >> kvmppc_core_check_exceptions(vcpu); >> @@ -451,8 +452,44 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) >> local_irq_disable(); >> >> kvmppc_set_exit_type(vcpu, EMULATED_MTMSRWE_EXITS); >> -kvmppc_core_check_exceptions(vcpu); >> +r = 1; >> }; >> + >> +return r; >> +} >> + >> +/* >> + * Common checks before entering the guest world. Call with interrupts >> + * disabled. >> + * >> + * returns !0 if a signal is pending and check_signal is true >> + */ >> +static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal) >> +{ >> +int r = 0; >> + >> +WARN_ON_ONCE(!irqs_disabled()); >> +while (true) { >> +if (need_resched()) { >> +local_irq_enable(); >> +cond_resched(); >> +local_irq_disable(); >> +continue; >> +} >> + >> +if (kvmppc_core_prepare_to_enter(vcpu)) { >> +/* interrupts got enabled in between, so we >> + are back at square 1 */ >> +continue; >> +} >> + >> + >> +if (check_signal && signal_pending(current)) >> +r = 1; > > If there is a signal pending and MSR[WE] is set, we'll loop forever > without reaching this check. Good point. How about something like this on top (will fold in later)? diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 430055e..9f27258 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -477,15 +477,17 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu) continue; } + if (signal_pending(current)) { + r = 1; + break; + } + if (kvmppc_core_prepare_to_enter(vcpu)) { /* interrupts got enabled in between, so we are back at square 1 */ continue; } - if (signal_pending(current)) - r = 1; - break; } Alex ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 24/37] KVM: PPC: booke: rework rescheduling checks
On 02/28/2012 05:03 AM, Alexander Graf wrote: > > On 27.02.2012, at 20:28, Scott Wood wrote: > >> If there is a signal pending and MSR[WE] is set, we'll loop forever >> without reaching this check. > > Good point. How about something like this on top (will fold in later)? > > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c > index 430055e..9f27258 100644 > --- a/arch/powerpc/kvm/booke.c > +++ b/arch/powerpc/kvm/booke.c > @@ -477,15 +477,17 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu > *vcpu) > continue; > } > > + if (signal_pending(current)) { > + r = 1; > + break; > + } > + > if (kvmppc_core_prepare_to_enter(vcpu)) { > /* interrupts got enabled in between, so we >are back at square 1 */ > continue; > } > > - if (signal_pending(current)) > - r = 1; > - > break; > } Looks OK. -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] sparsemem/bootmem: catch greater than section size allocations
On Fri, Feb 24, 2012 at 11:33:58AM -0800, Nishanth Aravamudan wrote: > While testing AMS (Active Memory Sharing) / CMO (Cooperative Memory > Overcommit) on powerpc, we tripped the following: > > kernel BUG at mm/bootmem.c:483! > cpu 0x0: Vector: 700 (Program Check) at [c0c03940] > pc: c0a62bd8: .alloc_bootmem_core+0x90/0x39c > lr: c0a64bcc: .sparse_early_usemaps_alloc_node+0x84/0x29c > sp: c0c03bc0 >msr: 80021032 > current = 0xc0b0cce0 > paca= 0xc1d8 > pid = 0, comm = swapper > kernel BUG at mm/bootmem.c:483! > enter ? for help > [c0c03c80] c0a64bcc > .sparse_early_usemaps_alloc_node+0x84/0x29c > [c0c03d50] c0a64f10 .sparse_init+0x12c/0x28c > [c0c03e20] c0a474f4 .setup_arch+0x20c/0x294 > [c0c03ee0] c0a4079c .start_kernel+0xb4/0x460 > [c0c03f90] c0009670 .start_here_common+0x1c/0x2c > > This is > > BUG_ON(limit && goal + size > limit); > > and after some debugging, it seems that > > goal = 0x700 > limit = 0x800 > > and sparse_early_usemaps_alloc_node -> > sparse_early_usemaps_alloc_pgdat_section -> alloc_bootmem_section calls > > return alloc_bootmem_section(usemap_size() * count, section_nr); > > This is on a system with 8TB available via the AMS pool, and as a quirk > of AMS in firmware, all of that memory shows up in node 0. So, we end up > with an allocation that will fail the goal/limit constraints. In theory, > we could "fall-back" to alloc_bootmem_node() in > sparse_early_usemaps_alloc_node(), but since we actually have HOTREMOVE > defined, we'll BUG_ON() instead. A simple solution appears to be to > disable the limit check if the size of the allocation in > alloc_bootmem_secition exceeds the section size. It makes sense to allow the usemaps to spill over to subsequent sections instead of panicking, so FWIW: Acked-by: Johannes Weiner That being said, it would be good if check_usemap_section_nr() printed the cross-dependencies between pgdats and sections when the usemaps of a node spilled over to other sections than the ones holding the pgdat. How about this? --- From: Johannes Weiner Subject: sparsemem/bootmem: catch greater than section size allocations fix If alloc_bootmem_section() no longer guarantees section-locality, we need check_usemap_section_nr() to print possible cross-dependencies between node descriptors and the usemaps allocated through it. Signed-off-by: Johannes Weiner --- diff --git a/mm/sparse.c b/mm/sparse.c index 61d7cde..9e032dc 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -359,6 +359,7 @@ static void __init sparse_early_usemaps_alloc_node(unsigned long**usemap_map, continue; usemap_map[pnum] = usemap; usemap += size; + check_usemap_section_nr(nodeid, usemap_map[pnum]); } return; } --- Furthermore, I wonder if we can remove the sparse-specific stuff from bootmem.c as well, as now even more so than before, calculating the desired area is really none of bootmem's business. Would something like this be okay? --- From: Johannes Weiner Subject: [patch] mm: remove sparsemem allocation details from the bootmem allocator alloc_bootmem_section() derives allocation area constraints from the specified sparsemem section. This is a bit specific for a generic memory allocator like bootmem, though, so move it over to sparsemem. Since __alloc_bootmem_node() already retries failed allocations with relaxed area constraints, the fallback code in sparsemem.c can be removed and the code becomes a bit more compact overall. Signed-off-by: Johannes Weiner --- include/linux/bootmem.h |3 --- mm/bootmem.c| 26 -- mm/sparse.c | 29 + 3 files changed, 9 insertions(+), 49 deletions(-) diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h index ab344a5..001c248 100644 --- a/include/linux/bootmem.h +++ b/include/linux/bootmem.h @@ -135,9 +135,6 @@ extern void *__alloc_bootmem_low_node(pg_data_t *pgdat, extern int reserve_bootmem_generic(unsigned long addr, unsigned long size, int flags); -extern void *alloc_bootmem_section(unsigned long size, - unsigned long section_nr); - #ifdef CONFIG_HAVE_ARCH_ALLOC_REMAP extern void *alloc_remap(int nid, unsigned long size); #else diff --git a/mm/bootmem.c b/mm/bootmem.c index 7bc0557..d34026c 100644 --- a/mm/bootmem.c +++ b/mm/bootmem.c @@ -756,32 +756,6 @@ void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, unsigned long size, } -#ifdef CONFIG_SPARSEMEM -/** - * alloc_bootmem_section - allocate boot memory from a specific section - * @size: size of the request in bytes - * @section_nr: sparse map section to allocat
Re: [PATCH] sparsemem/bootmem: catch greater than section size allocations
On Fri, Feb 24, 2012 at 11:33:58AM -0800, Nishanth Aravamudan wrote: > While testing AMS (Active Memory Sharing) / CMO (Cooperative Memory > Overcommit) on powerpc, we tripped the following: > > kernel BUG at mm/bootmem.c:483! > cpu 0x0: Vector: 700 (Program Check) at [c0c03940] > pc: c0a62bd8: .alloc_bootmem_core+0x90/0x39c > lr: c0a64bcc: .sparse_early_usemaps_alloc_node+0x84/0x29c > sp: c0c03bc0 >msr: 80021032 > current = 0xc0b0cce0 > paca= 0xc1d8 > pid = 0, comm = swapper > kernel BUG at mm/bootmem.c:483! > enter ? for help > [c0c03c80] c0a64bcc > .sparse_early_usemaps_alloc_node+0x84/0x29c > [c0c03d50] c0a64f10 .sparse_init+0x12c/0x28c > [c0c03e20] c0a474f4 .setup_arch+0x20c/0x294 > [c0c03ee0] c0a4079c .start_kernel+0xb4/0x460 > [c0c03f90] c0009670 .start_here_common+0x1c/0x2c > > This is > > BUG_ON(limit && goal + size > limit); > > and after some debugging, it seems that > > goal = 0x700 > limit = 0x800 > > and sparse_early_usemaps_alloc_node -> > sparse_early_usemaps_alloc_pgdat_section -> alloc_bootmem_section calls > > return alloc_bootmem_section(usemap_size() * count, section_nr); > > This is on a system with 8TB available via the AMS pool, and as a quirk > of AMS in firmware, all of that memory shows up in node 0. So, we end up > with an allocation that will fail the goal/limit constraints. In theory, > we could "fall-back" to alloc_bootmem_node() in > sparse_early_usemaps_alloc_node(), but since we actually have HOTREMOVE > defined, we'll BUG_ON() instead. A simple solution appears to be to > disable the limit check if the size of the allocation in > alloc_bootmem_secition exceeds the section size. > > Signed-off-by: Nishanth Aravamudan > Cc: Dave Hansen > Cc: Anton Blanchard > Cc: Paul Mackerras > Cc: Ben Herrenschmidt > Cc: Robert Jennings > Cc: linux...@kvack.org > Cc: linuxppc-dev@lists.ozlabs.org > --- > include/linux/mmzone.h |2 ++ > mm/bootmem.c |5 - > 2 files changed, 6 insertions(+), 1 deletions(-) > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > index 650ba2f..4176834 100644 > --- a/include/linux/mmzone.h > +++ b/include/linux/mmzone.h > @@ -967,6 +967,8 @@ static inline unsigned long early_pfn_to_nid(unsigned > long pfn) > * PA_SECTION_SHIFT physical address to/from section number > * PFN_SECTION_SHIFT pfn to/from section number > */ > +#define BYTES_PER_SECTION(1UL << SECTION_SIZE_BITS) > + > #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS) > > #define PA_SECTION_SHIFT (SECTION_SIZE_BITS) > diff --git a/mm/bootmem.c b/mm/bootmem.c > index 668e94d..5cbbc76 100644 > --- a/mm/bootmem.c > +++ b/mm/bootmem.c > @@ -770,7 +770,10 @@ void * __init alloc_bootmem_section(unsigned long size, > > pfn = section_nr_to_pfn(section_nr); > goal = pfn << PAGE_SHIFT; > - limit = section_nr_to_pfn(section_nr + 1) << PAGE_SHIFT; > + if (size > BYTES_PER_SECTION) > + limit = 0; > + else > + limit = section_nr_to_pfn(section_nr + 1) << PAGE_SHIFT; As it's ok to spill the allocation over to an adjacent section, why not just make limit==0 unconditionally. That would avoid defining BYTES_PER_SECTION. -- Mel Gorman SUSE Labs ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH] powerpc/boot: fix typo in p1010rdb.dtsi
Fix typo introduced by "powerpc: Add TBI PHY node to first MDIO bus" from Andy Fleming. It's device_type rather than device-type, which causes the mdio probe to fail thus making all gianfar ethernet interfaces unusable. Signed-off-by: Gustavo Zacarias --- arch/powerpc/boot/dts/p1010rdb.dtsi |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/boot/dts/p1010rdb.dtsi b/arch/powerpc/boot/dts/p1010rdb.dtsi index d4c4a77..9e0a0b7 100644 --- a/arch/powerpc/boot/dts/p1010rdb.dtsi +++ b/arch/powerpc/boot/dts/p1010rdb.dtsi @@ -196,7 +196,7 @@ }; tbi-phy@3 { - device-type = "tbi-phy"; + device_type = "tbi-phy"; reg = <0x3>; }; }; -- 1.7.3.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH] powerpc/boot: fix typo in p1010rdb.dtsi
Fix typo introduced by "powerpc: Add TBI PHY node to first MDIO bus" from Andy Fleming. It's device_type rather than device-type, which causes the mdio probe to fail thus making all gianfar ethernet interfaces unusable. Signed-off-by: Gustavo Zacarias --- arch/powerpc/boot/dts/p1010rdb.dtsi |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/boot/dts/p1010rdb.dtsi b/arch/powerpc/boot/dts/p1010rdb.dtsi index d4c4a77..9e0a0b7 100644 --- a/arch/powerpc/boot/dts/p1010rdb.dtsi +++ b/arch/powerpc/boot/dts/p1010rdb.dtsi @@ -196,7 +196,7 @@ }; tbi-phy@3 { - device-type = "tbi-phy"; + device_type = "tbi-phy"; reg = <0x3>; }; }; -- 1.7.3.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH trivial next 0/9] treewide: Use vsprintf extention %pf
Emit the actual function name when possible Joe Perches (9): sparc: Use vsprintf extention %pf with builtin_return_address alpha: Use vsprintf extention %pf with builtin_return_address arm: Use vsprintf extention %pf with builtin_return_address microblaze: Use vsprintf extention %pf with builtin_return_address powerpc: Use vsprintf extention %pf with builtin_return_address parisc: Use vsprintf extention %pf with builtin_return_address scsi: Use vsprintf extention %pf with builtin_return_address staging: ramster: Use vsprintf extention %pf with builtin_return_address gadget: Use vsprintf extention %pf with builtin_return_address arch/alpha/kernel/pci_iommu.c | 20 ++-- arch/arm/nwfpe/fpmodule.c |2 +- arch/microblaze/mm/pgtable.c|2 +- arch/powerpc/mm/pgtable_32.c|2 +- arch/sparc/kernel/ds.c |2 +- arch/sparc/mm/srmmu.c |2 +- drivers/parisc/superio.c|2 +- drivers/scsi/esp_scsi.c |2 +- drivers/staging/ramster/cluster/heartbeat.c |4 ++-- drivers/usb/gadget/u_serial.c |2 +- 10 files changed, 20 insertions(+), 20 deletions(-) -- 1.7.8.111.gad25c.dirty ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH trivial next 5/9] powerpc: Use vsprintf extention %pf with builtin_return_address
Emit the function name not the address when possible. builtin_return_address() gives an address. When building a kernel with CONFIG_KALLSYMS, emit the actual function name not the address. Signed-off-by: Joe Perches --- arch/powerpc/mm/pgtable_32.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index 51f8795..0907f92 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -207,7 +207,7 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags, */ if (mem_init_done && (p < virt_to_phys(high_memory)) && !(__allow_ioremap_reserved && memblock_is_region_reserved(p, size))) { - printk("__ioremap(): phys addr 0x%llx is RAM lr %p\n", + printk("__ioremap(): phys addr 0x%llx is RAM lr %pf\n", (unsigned long long)p, __builtin_return_address(0)); return NULL; } -- 1.7.8.111.gad25c.dirty ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] sparsemem/bootmem: catch greater than section size allocations
On 28.02.2012 [14:53:26 +0100], Johannes Weiner wrote: > On Fri, Feb 24, 2012 at 11:33:58AM -0800, Nishanth Aravamudan wrote: > > While testing AMS (Active Memory Sharing) / CMO (Cooperative Memory > > Overcommit) on powerpc, we tripped the following: > > > > kernel BUG at mm/bootmem.c:483! > > cpu 0x0: Vector: 700 (Program Check) at [c0c03940] > > pc: c0a62bd8: .alloc_bootmem_core+0x90/0x39c > > lr: c0a64bcc: .sparse_early_usemaps_alloc_node+0x84/0x29c > > sp: c0c03bc0 > >msr: 80021032 > > current = 0xc0b0cce0 > > paca= 0xc1d8 > > pid = 0, comm = swapper > > kernel BUG at mm/bootmem.c:483! > > enter ? for help > > [c0c03c80] c0a64bcc > > .sparse_early_usemaps_alloc_node+0x84/0x29c > > [c0c03d50] c0a64f10 .sparse_init+0x12c/0x28c > > [c0c03e20] c0a474f4 .setup_arch+0x20c/0x294 > > [c0c03ee0] c0a4079c .start_kernel+0xb4/0x460 > > [c0c03f90] c0009670 .start_here_common+0x1c/0x2c > > > > This is > > > > BUG_ON(limit && goal + size > limit); > > > > and after some debugging, it seems that > > > > goal = 0x700 > > limit = 0x800 > > > > and sparse_early_usemaps_alloc_node -> > > sparse_early_usemaps_alloc_pgdat_section -> alloc_bootmem_section calls > > > > return alloc_bootmem_section(usemap_size() * count, section_nr); > > > > This is on a system with 8TB available via the AMS pool, and as a quirk > > of AMS in firmware, all of that memory shows up in node 0. So, we end up > > with an allocation that will fail the goal/limit constraints. In theory, > > we could "fall-back" to alloc_bootmem_node() in > > sparse_early_usemaps_alloc_node(), but since we actually have HOTREMOVE > > defined, we'll BUG_ON() instead. A simple solution appears to be to > > disable the limit check if the size of the allocation in > > alloc_bootmem_secition exceeds the section size. > > It makes sense to allow the usemaps to spill over to subsequent > sections instead of panicking, so FWIW: > > Acked-by: Johannes Weiner > > That being said, it would be good if check_usemap_section_nr() printed > the cross-dependencies between pgdats and sections when the usemaps of > a node spilled over to other sections than the ones holding the pgdat. > > How about this? > > --- > From: Johannes Weiner > Subject: sparsemem/bootmem: catch greater than section size allocations fix > > If alloc_bootmem_section() no longer guarantees section-locality, we > need check_usemap_section_nr() to print possible cross-dependencies > between node descriptors and the usemaps allocated through it. > > Signed-off-by: Johannes Weiner > --- > > diff --git a/mm/sparse.c b/mm/sparse.c > index 61d7cde..9e032dc 100644 > --- a/mm/sparse.c > +++ b/mm/sparse.c > @@ -359,6 +359,7 @@ static void __init > sparse_early_usemaps_alloc_node(unsigned long**usemap_map, > continue; > usemap_map[pnum] = usemap; > usemap += size; > + check_usemap_section_nr(nodeid, usemap_map[pnum]); > } > return; > } This makes sense to me -- ok if I fold it into the re-worked patch (based upon Mel's comments)? > --- > > Furthermore, I wonder if we can remove the sparse-specific stuff from > bootmem.c as well, as now even more so than before, calculating the > desired area is really none of bootmem's business. > > Would something like this be okay? > > --- > From: Johannes Weiner > Subject: [patch] mm: remove sparsemem allocation details from the bootmem > allocator > > alloc_bootmem_section() derives allocation area constraints from the > specified sparsemem section. This is a bit specific for a generic > memory allocator like bootmem, though, so move it over to sparsemem. > > Since __alloc_bootmem_node() already retries failed allocations with > relaxed area constraints, the fallback code in sparsemem.c can be > removed and the code becomes a bit more compact overall. > > Signed-off-by: Johannes Weiner I've not tested it, but the intention seems sensible. I think it should remain a separate change. Thanks, Nish -- Nishanth Aravamudan IBM Linux Technology Center ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 14/24] PCI, powerpc: Register busn_res for root buses
Signed-off-by: Yinghai Lu Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: linuxppc-dev@lists.ozlabs.org --- arch/powerpc/include/asm/pci-bridge.h |1 + arch/powerpc/kernel/pci-common.c | 10 +- 2 files changed, 10 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h index 5d48765..11cebf0 100644 --- a/arch/powerpc/include/asm/pci-bridge.h +++ b/arch/powerpc/include/asm/pci-bridge.h @@ -30,6 +30,7 @@ struct pci_controller { int first_busno; int last_busno; int self_busno; + struct resource busn; void __iomem *io_base_virt; #ifdef CONFIG_PPC64 diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index 910b9de..ee8c0c9 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -1648,6 +1648,11 @@ void __devinit pcibios_scan_phb(struct pci_controller *hose) /* Wire up PHB bus resources */ pcibios_setup_phb_resources(hose, &resources); + hose->busn.start = hose->first_busno; + hose->busn.end = hose->last_busno; + hose->busn.flags = IORESOURCE_BUS; + pci_add_resource(&resources, &hose->busn); + /* Create an empty bus for the toplevel */ bus = pci_create_root_bus(hose->parent, hose->first_busno, hose->ops, hose, &resources); @@ -1670,8 +1675,11 @@ void __devinit pcibios_scan_phb(struct pci_controller *hose) of_scan_bus(node, bus); } - if (mode == PCI_PROBE_NORMAL) + if (mode == PCI_PROBE_NORMAL) { + pci_bus_update_busn_res_end(bus, 255); hose->last_busno = bus->subordinate = pci_scan_child_bus(bus); + pci_bus_update_busn_res_end(bus, bus->subordinate); + } /* Platform gets a chance to do some global fixups before * we proceed to resource allocation -- 1.7.7 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH] [PATCH v2] powerpc: icswx: fix race condition where threads do not get their ACOP register updated in time.
There is a race where a thread causes a coprocessor type to be valid in its own ACOP _and_ in the current context, but it does not propagate to the ACOP register of other threads in time for them to use it. The original code tries to solve this by sending an IPI to all threads on the system, which is heavy handed, but unfortunately still provides a window where the icswx is issued by other threads and the ACOP is not up to date. This patch detects that the ACOP DSI fault was a "false positive" and syncs the ACOP and causes the icswx to be replayed. Signed-off-by: Jimi Xenidis Cc: Anton Blanchard Cc: Benjamin Herrenschmidt --- Re: benh - fix typo in logic where I used "&&" and not "&" - remove pr_debug --- arch/powerpc/mm/icswx.c | 23 +-- arch/powerpc/mm/icswx.h |6 ++ 2 files changed, 27 insertions(+), 2 deletions(-) diff --git arch/powerpc/mm/icswx.c arch/powerpc/mm/icswx.c index 5d9a59e..8cdbd86 100644 --- arch/powerpc/mm/icswx.c +++ arch/powerpc/mm/icswx.c @@ -163,7 +163,7 @@ EXPORT_SYMBOL_GPL(drop_cop); static int acop_use_cop(int ct) { - /* todo */ + /* There is no alternate policy, yet */ return -1; } @@ -227,11 +227,30 @@ int acop_handle_fault(struct pt_regs *regs, unsigned long address, ct = (ccw >> 16) & 0x3f; } + /* +* We could be here because another thread has enabled acop +* but the ACOP register has yet to be updated. +* +* This should have been taken care of by the IPI to sync all +* the threads (see smp_call_function(sync_cop, mm, 1)), but +* that could take forever if there are a significant amount +* of threads. +* +* Given the number of threads on some of these systems, +* perhaps this is the best way to sync ACOP rather than whack +* every thread with an IPI. +*/ + if ((acop_copro_type_bit(ct) & current->active_mm->context.acop) != 0) { + sync_cop(current->active_mm); + return 0; + } + + /* check for alternate policy */ if (!acop_use_cop(ct)) return 0; /* at this point the CT is unknown to the system */ - pr_warn("%s[%d]: Coprocessor %d is unavailable", + pr_warn("%s[%d]: Coprocessor %d is unavailable\n", current->comm, current->pid, ct); /* get inst if we don't already have it */ diff --git arch/powerpc/mm/icswx.h arch/powerpc/mm/icswx.h index 42176bd..6dedc08 100644 --- arch/powerpc/mm/icswx.h +++ arch/powerpc/mm/icswx.h @@ -59,4 +59,10 @@ extern void free_cop_pid(int free_pid); extern int acop_handle_fault(struct pt_regs *regs, unsigned long address, unsigned long error_code); + +static inline u64 acop_copro_type_bit(unsigned int type) +{ + return 1ULL << (63 - type); +} + #endif /* !_ARCH_POWERPC_MM_ICSWX_H_ */ -- 1.7.0.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 08/18] PCI, powerpc: Register busn_res for root buses
On Mon, Feb 27, 2012 at 10:36 PM, Bjorn Helgaas wrote: > On Mon, Feb 27, 2012 at 7:09 PM, Yinghai Lu wrote: >> Signed-off-by: Yinghai Lu >> Cc: Benjamin Herrenschmidt >> Cc: Paul Mackerras >> Cc: linuxppc-dev@lists.ozlabs.org >> --- >> arch/powerpc/kernel/pci-common.c | 7 ++- >> 1 files changed, 6 insertions(+), 1 deletions(-) >> >> diff --git a/arch/powerpc/kernel/pci-common.c >> b/arch/powerpc/kernel/pci-common.c >> index 910b9de..ae5ae5f 100644 >> --- a/arch/powerpc/kernel/pci-common.c >> +++ b/arch/powerpc/kernel/pci-common.c >> @@ -1660,6 +1660,8 @@ void __devinit pcibios_scan_phb(struct pci_controller >> *hose) >> bus->secondary = hose->first_busno; >> hose->bus = bus; >> >> + pci_bus_insert_busn_res(bus, hose->first_busno, hose->last_busno); >> + >> /* Get probe mode and perform scan */ >> mode = PCI_PROBE_NORMAL; >> if (node && ppc_md.pci_probe_mode) >> @@ -1670,8 +1672,11 @@ void __devinit pcibios_scan_phb(struct pci_controller >> *hose) >> of_scan_bus(node, bus); >> } >> >> - if (mode == PCI_PROBE_NORMAL) >> + if (mode == PCI_PROBE_NORMAL) { >> + pci_bus_update_busn_res_end(bus, 255); >> hose->last_busno = bus->subordinate = pci_scan_child_bus(bus); >> + pci_bus_update_busn_res_end(bus, bus->subordinate); >> + } > > There's a lot of powerpc code that does this: > > bus_range = of_get_property(pcictrl, "bus-range", &len); > hose->first_busno = bus_range[0]; > hose->last_busno = bus_range[1]; > > That *looks* like it is discovering the bus number aperture. Is it? > If it is, why are we using the largest bus number found by > pci_scan_child_bus() rather than "last_busno"? Sorry, I missed the earlier hunk of the patch where you *do* use last_busno: >> + pci_bus_insert_busn_res(bus, hose->first_busno, hose->last_busno); I still think this part is wrong: + if (mode == PCI_PROBE_NORMAL) { + pci_bus_update_busn_res_end(bus, 255); hose->last_busno = bus->subordinate = pci_scan_child_bus(bus); + pci_bus_update_busn_res_end(bus, bus->subordinate); I think there are two problems: 1) We can enumerate devices under the wrong PHB. Assume this: PCI host bridge A to [bus 00] pci :00:01.0: PCI bridge PCI host bridge B to [bus 01] pci :01:01.0: PCI endpoint The P2P bridge at 00:01.0 has no devices below it, but of course we can't tell that until we look behind it. To do that, we'll have to assign a bus number, and since we forced the bus number aperture to [bus 00-ff] instead of the correct [bus 00], we'll probably allocate bus number 01 as the secondary bus. Then we'll generate a config cycle for 01:01.0, which discovers a device. But we can't tell that the cycle was actually claimed by host bridge B, not A. So now we wrongly think that 01:01.0 is under A, so we can't handle its resources correctly. I think we should have failed when allocating a secondary bus number for 00:01.0 and just skipped looking behind it. 2) We preclude hot-add in some cases. For example, if we scan this topology: PCI host bridge C to [bus 00-7f] pci :00:01.0: PCI bridge to [bus 01] pci :01:01.0: PCI endpoint we set the root bus's subordinate bus number to 01 (the highest bus number we discovered), so we now think host bridge C leads only to [bus 00-01]. Now let's remove 01:01.0 and plug in a card with a bridge on it, e.g., pci :01:01.0: PCI bridge to ... pci :xx:01.0: PCI endpoint We can't allocate a bus number for 01:01.0's secondary bus because we think we're out of space. But we're really not; the true bus number aperture for C is [bus 00-7f], not [bus 00-01]. We may need mechanism to say "don't trust this info from the firmware," but we should be able to figure out a way that doesn't penalize platforms that do everything correctly. The current patch breaks these scenarios even when the platform firmware is 100% correct. Bjorn ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 08/18] PCI, powerpc: Register busn_res for root buses
On Tue, 2012-02-28 at 16:31 -0700, Bjorn Helgaas wrote: > We may need mechanism to say "don't trust this info from the > firmware," but we should be able to figure out a way that doesn't > penalize platforms that do everything correctly. The current patch > breaks these scenarios even when the platform firmware is 100% > correct. On the other hand, our firmwares tend not to be and the vast majority of our platforms have separate bus number domains (In fact I'm not sure whether we have one that actually splits bus numbers or not, maybe some ancient Apple gear, I need to double check). We did use to force renumbering on macs to avoid bus number collisions between domains because of ancient X servers that didn't do domains properly but I think we dropped that. Cheers, Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 01/38] powerpc/booke: Set CPU_FTR_DEBUG_LVL_EXC on 32-bit
From: Scott Wood Currently 32-bit only cares about this for choice of exception vector, which is done in core-specific code. However, KVM will want to distinguish as well. Signed-off-by: Scott Wood Signed-off-by: Alexander Graf --- arch/powerpc/include/asm/cputable.h |5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h index ad55a1c..6a034a2 100644 --- a/arch/powerpc/include/asm/cputable.h +++ b/arch/powerpc/include/asm/cputable.h @@ -376,7 +376,8 @@ extern const char *powerpc_base_platform; #define CPU_FTRS_47X (CPU_FTRS_440x6) #define CPU_FTRS_E200 (CPU_FTR_USE_TB | CPU_FTR_SPE_COMP | \ CPU_FTR_NODSISRALIGN | CPU_FTR_COHERENT_ICACHE | \ - CPU_FTR_UNIFIED_ID_CACHE | CPU_FTR_NOEXECUTE) + CPU_FTR_UNIFIED_ID_CACHE | CPU_FTR_NOEXECUTE | \ + CPU_FTR_DEBUG_LVL_EXC) #define CPU_FTRS_E500 (CPU_FTR_MAYBE_CAN_DOZE | CPU_FTR_USE_TB | \ CPU_FTR_SPE_COMP | CPU_FTR_MAYBE_CAN_NAP | CPU_FTR_NODSISRALIGN | \ CPU_FTR_NOEXECUTE) @@ -385,7 +386,7 @@ extern const char *powerpc_base_platform; CPU_FTR_NODSISRALIGN | CPU_FTR_NOEXECUTE) #define CPU_FTRS_E500MC(CPU_FTR_USE_TB | CPU_FTR_NODSISRALIGN | \ CPU_FTR_L2CSR | CPU_FTR_LWSYNC | CPU_FTR_NOEXECUTE | \ - CPU_FTR_DBELL) + CPU_FTR_DBELL | CPU_FTR_DEBUG_LVL_EXC) #define CPU_FTRS_E5500 (CPU_FTR_USE_TB | CPU_FTR_NODSISRALIGN | \ CPU_FTR_L2CSR | CPU_FTR_LWSYNC | CPU_FTR_NOEXECUTE | \ CPU_FTR_DBELL | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \ -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 03/38] KVM: PPC: factor out lpid allocator from book3s_64_mmu_hv
From: Scott Wood We'll use it on e500mc as well. Signed-off-by: Scott Wood Signed-off-by: Alexander Graf --- arch/powerpc/include/asm/kvm_book3s.h |3 ++ arch/powerpc/include/asm/kvm_booke.h |3 ++ arch/powerpc/include/asm/kvm_ppc.h|5 arch/powerpc/kvm/book3s_64_mmu_hv.c | 26 +--- arch/powerpc/kvm/powerpc.c| 34 + 5 files changed, 55 insertions(+), 16 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index aa795cc..046041f 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -452,4 +452,7 @@ static inline bool kvmppc_critical_section(struct kvm_vcpu *vcpu) #define INS_DCBZ 0x7c0007ec +/* LPIDs we support with this build -- runtime limit may be lower */ +#define KVMPPC_NR_LPIDS(LPID_RSVD + 1) + #endif /* __ASM_KVM_BOOK3S_H__ */ diff --git a/arch/powerpc/include/asm/kvm_booke.h b/arch/powerpc/include/asm/kvm_booke.h index a90e091..b7cd335 100644 --- a/arch/powerpc/include/asm/kvm_booke.h +++ b/arch/powerpc/include/asm/kvm_booke.h @@ -23,6 +23,9 @@ #include #include +/* LPIDs we support with this build -- runtime limit may be lower */ +#define KVMPPC_NR_LPIDS64 + static inline void kvmppc_set_gpr(struct kvm_vcpu *vcpu, int num, ulong val) { vcpu->arch.gpr[num] = val; diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index 9d6dee0..731e920 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -204,4 +204,9 @@ int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu, int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu, struct kvm_dirty_tlb *cfg); +long kvmppc_alloc_lpid(void); +void kvmppc_claim_lpid(long lpid); +void kvmppc_free_lpid(long lpid); +void kvmppc_init_lpid(unsigned long nr_lpids); + #endif /* __POWERPC_KVM_PPC_H__ */ diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index ddc485a..d031ce1 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -36,13 +36,11 @@ /* POWER7 has 10-bit LPIDs, PPC970 has 6-bit LPIDs */ #define MAX_LPID_970 63 -#define NR_LPIDS (LPID_RSVD + 1) -unsigned long lpid_inuse[BITS_TO_LONGS(NR_LPIDS)]; long kvmppc_alloc_hpt(struct kvm *kvm) { unsigned long hpt; - unsigned long lpid; + long lpid; struct revmap_entry *rev; struct kvmppc_linear_info *li; @@ -72,14 +70,9 @@ long kvmppc_alloc_hpt(struct kvm *kvm) } kvm->arch.revmap = rev; - /* Allocate the guest's logical partition ID */ - do { - lpid = find_first_zero_bit(lpid_inuse, NR_LPIDS); - if (lpid >= NR_LPIDS) { - pr_err("kvm_alloc_hpt: No LPIDs free\n"); - goto out_freeboth; - } - } while (test_and_set_bit(lpid, lpid_inuse)); + lpid = kvmppc_alloc_lpid(); + if (lpid < 0) + goto out_freeboth; kvm->arch.sdr1 = __pa(hpt) | (HPT_ORDER - 18); kvm->arch.lpid = lpid; @@ -96,7 +89,7 @@ long kvmppc_alloc_hpt(struct kvm *kvm) void kvmppc_free_hpt(struct kvm *kvm) { - clear_bit(kvm->arch.lpid, lpid_inuse); + kvmppc_free_lpid(kvm->arch.lpid); vfree(kvm->arch.revmap); if (kvm->arch.hpt_li) kvm_release_hpt(kvm->arch.hpt_li); @@ -171,8 +164,7 @@ int kvmppc_mmu_hv_init(void) if (!cpu_has_feature(CPU_FTR_HVMODE)) return -EINVAL; - memset(lpid_inuse, 0, sizeof(lpid_inuse)); - + /* POWER7 has 10-bit LPIDs, PPC970 and e500mc have 6-bit LPIDs */ if (cpu_has_feature(CPU_FTR_ARCH_206)) { host_lpid = mfspr(SPRN_LPID); /* POWER7 */ rsvd_lpid = LPID_RSVD; @@ -181,9 +173,11 @@ int kvmppc_mmu_hv_init(void) rsvd_lpid = MAX_LPID_970; } - set_bit(host_lpid, lpid_inuse); + kvmppc_init_lpid(rsvd_lpid + 1); + + kvmppc_claim_lpid(host_lpid); /* rsvd_lpid is reserved for use in partition switching */ - set_bit(rsvd_lpid, lpid_inuse); + kvmppc_claim_lpid(rsvd_lpid); return 0; } diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 00d7e34..9806ea5 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -808,6 +808,40 @@ out: return r; } +static unsigned long lpid_inuse[BITS_TO_LONGS(KVMPPC_NR_LPIDS)]; +static unsigned long nr_lpids; + +long kvmppc_alloc_lpid(void) +{ + long lpid; + + do { + lpid = find_first_zero_bit(lpid_inuse, KVMPPC_NR_LPIDS); + if (lpid >= nr_lpids) { + pr_err("%s: No LPIDs free\n", __func__); + return -ENOMEM; + } +
[PATCH 04/38] KVM: PPC: booke: add booke-level vcpu load/put
From: Scott Wood This gives us a place to put load/put actions that correspond to code that is booke-specific but not specific to a particular core. Signed-off-by: Scott Wood Signed-off-by: Alexander Graf --- arch/powerpc/kvm/44x.c |3 +++ arch/powerpc/kvm/booke.c |8 arch/powerpc/kvm/booke.h |3 +++ arch/powerpc/kvm/e500.c |3 +++ 4 files changed, 17 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/kvm/44x.c b/arch/powerpc/kvm/44x.c index 7b612a7..879a1a7 100644 --- a/arch/powerpc/kvm/44x.c +++ b/arch/powerpc/kvm/44x.c @@ -29,15 +29,18 @@ #include #include "44x_tlb.h" +#include "booke.h" void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu) { + kvmppc_booke_vcpu_load(vcpu, cpu); kvmppc_44x_tlb_load(vcpu); } void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu) { kvmppc_44x_tlb_put(vcpu); + kvmppc_booke_vcpu_put(vcpu); } int kvmppc_core_check_processor_compat(void) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index ee9e1ee..a2456c7 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -968,6 +968,14 @@ void kvmppc_decrementer_func(unsigned long data) kvmppc_set_tsr_bits(vcpu, TSR_DIS); } +void kvmppc_booke_vcpu_load(struct kvm_vcpu *vcpu, int cpu) +{ +} + +void kvmppc_booke_vcpu_put(struct kvm_vcpu *vcpu) +{ +} + int __init kvmppc_booke_init(void) { unsigned long ivor[16]; diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h index 2fe2027..05d1d99 100644 --- a/arch/powerpc/kvm/booke.h +++ b/arch/powerpc/kvm/booke.h @@ -71,4 +71,7 @@ void kvmppc_save_guest_spe(struct kvm_vcpu *vcpu); /* high-level function, manages flags, host state */ void kvmppc_vcpu_disable_spe(struct kvm_vcpu *vcpu); +void kvmppc_booke_vcpu_load(struct kvm_vcpu *vcpu, int cpu); +void kvmppc_booke_vcpu_put(struct kvm_vcpu *vcpu); + #endif /* __KVM_BOOKE_H__ */ diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c index ddcd896..2d5fe04 100644 --- a/arch/powerpc/kvm/e500.c +++ b/arch/powerpc/kvm/e500.c @@ -36,6 +36,7 @@ void kvmppc_core_load_guest_debugstate(struct kvm_vcpu *vcpu) void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu) { + kvmppc_booke_vcpu_load(vcpu, cpu); kvmppc_e500_tlb_load(vcpu, cpu); } @@ -47,6 +48,8 @@ void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu) if (vcpu->arch.shadow_msr & MSR_SPE) kvmppc_vcpu_disable_spe(vcpu); #endif + + kvmppc_booke_vcpu_put(vcpu); } int kvmppc_core_check_processor_compat(void) -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 00/38] KVM: PPC: e500mc support v3
This is Scott's e500mc RFC patch set rebased, berobbed of its pt_regs parts and fixed for bisectability. On top of them, I addressed all the comments that I had on the code and that came up in his code as FIXMEs. I verified that this patch set works just fine on e500mc and doesn't break e500v2, so I would say it's good to go as it is, unless someone has strong objections to how things are done. Everything hereafter I would prefer to do based on a working upstream version rather than a downstream fork, as that way exposure is a lot higher. v1 -> v2: - ESR -> GESR - introduce and use constants for doorbell - drop e500mc ifdefs for doorbell - fix whitespace - use explicit preempt counts in inst fixup - rework e500v2 kconfig patch - add patches 31-37 v2 -> v3: - add patch 38 - check for signals earlier - also remove "lwzr9, VCPU_KVM(r4)" which was as superfluous - sync host state instead of guest state to pt_regs - optimize reinject code path to get out fast when not reinjecting Alexander Graf (23): KVM: PPC: e500mc: Add doorbell emulation support KVM: PPC: e500mc: implicitly set MSR_GS KVM: PPC: e500mc: Move r1/r2 restoration very early KVM: PPC: e500mc: add load inst fixup KVM: PPC: rename CONFIG_KVM_E500 -> CONFIG_KVM_E500V2 KVM: PPC: make e500v2 kvm and e500mc cpu mutually exclusive KVM: PPC: booke: remove leftover debugging KVM: PPC: booke: deliver program int on emulation failure KVM: PPC: booke: rework rescheduling checks KVM: PPC: booke: BOOKE_IRQPRIO_MAX is n+1 KVM: PPC: bookehv: fix exit timing KVM: PPC: bookehv: remove negation for CONFIG_64BIT KVM: PPC: bookehv: remove SET_VCPU KVM: PPC: bookehv: disable MAS register updates early KVM: PPC: bookehv: add comment about shadow_msr KVM: PPC: booke: Readd debug abort code for machine check KVM: PPC: booke: add GS documentation for program interrupt KVM: PPC: bookehv: remove unused code KVM: PPC: e500: fix typo in tlb code KVM: PPC: booke: Support perfmon interrupts KVM: PPC: booke: expose good state on irq reinject KVM: PPC: booke: Reinject performance monitor interrupts KVM: PPC: Booke: only prepare to enter when we enter Scott Wood (15): powerpc/booke: Set CPU_FTR_DEBUG_LVL_EXC on 32-bit powerpc/e500: split CPU_FTRS_ALWAYS/CPU_FTRS_POSSIBLE KVM: PPC: factor out lpid allocator from book3s_64_mmu_hv KVM: PPC: booke: add booke-level vcpu load/put KVM: PPC: booke: Move vm core init/destroy out of booke.c KVM: PPC: e500: rename e500_tlb.h to e500.h KVM: PPC: e500: merge into arch/powerpc/kvm/e500.h KVM: PPC: e500: clean up arch/powerpc/kvm/e500.h KVM: PPC: e500: refactor core-specific TLB code KVM: PPC: e500: Track TLB1 entries with a bitmap KVM: PPC: e500: emulate tlbilx powerpc/booke: Provide exception macros with interrupt name KVM: PPC: booke: category E.HV (GS-mode) support KVM: PPC: booke: standard PPC floating point support KVM: PPC: e500mc support arch/powerpc/include/asm/cputable.h | 21 +- arch/powerpc/include/asm/dbell.h|3 + arch/powerpc/include/asm/hw_irq.h |1 + arch/powerpc/include/asm/kvm.h |1 + arch/powerpc/include/asm/kvm_asm.h |8 + arch/powerpc/include/asm/kvm_book3s.h |3 + arch/powerpc/include/asm/kvm_booke.h|3 + arch/powerpc/include/asm/kvm_booke_hv_asm.h | 49 +++ arch/powerpc/include/asm/kvm_e500.h | 96 - arch/powerpc/include/asm/kvm_host.h | 22 +- arch/powerpc/include/asm/kvm_ppc.h | 10 +- arch/powerpc/include/asm/mmu-book3e.h |6 + arch/powerpc/include/asm/processor.h|3 + arch/powerpc/include/asm/reg.h |2 + arch/powerpc/include/asm/reg_booke.h| 34 ++ arch/powerpc/include/asm/system.h |1 + arch/powerpc/kernel/asm-offsets.c | 15 +- arch/powerpc/kernel/cpu_setup_fsl_booke.S |1 + arch/powerpc/kernel/head_44x.S | 23 +- arch/powerpc/kernel/head_booke.h| 69 ++- arch/powerpc/kernel/head_fsl_booke.S| 98 - arch/powerpc/kvm/44x.c | 12 + arch/powerpc/kvm/Kconfig| 28 +- arch/powerpc/kvm/Makefile | 15 +- arch/powerpc/kvm/book3s.c |4 +- arch/powerpc/kvm/book3s_64_mmu_hv.c | 26 +- arch/powerpc/kvm/booke.c| 469 + arch/powerpc/kvm/booke.h| 57 +++- arch/powerpc/kvm/booke_emulate.c| 23 +- arch/powerpc/kvm/bookehv_interrupts.S | 602 +++ arch/powerpc/kvm/e500.c | 372 ++--- arch/powerpc/kvm/e500.h | 302 ++ arch/powerpc/kvm/e500_emulate.c | 110 +- arch/powerpc/kvm/e500_tlb.c | 588 +++--- arch/powerpc/kvm/e500_tlb.h | 174
[PATCH 07/38] KVM: PPC: e500: merge into arch/powerpc/kvm/e500.h
From: Scott Wood Keeping two separate headers for e500-specific things was a pain, and wasn't even organized along any logical boundary. There was TLB stuff in despite the existence of arch/powerpc/kvm/e500_tlb.h, and nothing in needed to be referenced from outside arch/powerpc/kvm. Signed-off-by: Scott Wood [agraf: fix bisectability] Signed-off-by: Alexander Graf --- arch/powerpc/include/asm/kvm_e500.h | 96 --- arch/powerpc/kvm/e500.c |1 - arch/powerpc/kvm/e500.h | 82 -- arch/powerpc/kvm/e500_emulate.c |1 - arch/powerpc/kvm/e500_tlb.c |1 - 5 files changed, 78 insertions(+), 103 deletions(-) delete mode 100644 arch/powerpc/include/asm/kvm_e500.h diff --git a/arch/powerpc/include/asm/kvm_e500.h b/arch/powerpc/include/asm/kvm_e500.h deleted file mode 100644 index 8cd50a5..000 --- a/arch/powerpc/include/asm/kvm_e500.h +++ /dev/null @@ -1,96 +0,0 @@ -/* - * Copyright (C) 2008-2011 Freescale Semiconductor, Inc. All rights reserved. - * - * Author: Yu Liu, - * - * Description: - * This file is derived from arch/powerpc/include/asm/kvm_44x.h, - * by Hollis Blanchard . - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License, version 2, as - * published by the Free Software Foundation. - */ - -#ifndef __ASM_KVM_E500_H__ -#define __ASM_KVM_E500_H__ - -#include - -#define BOOKE_INTERRUPT_SIZE 36 - -#define E500_PID_NUM 3 -#define E500_TLB_NUM 2 - -#define E500_TLB_VALID 1 -#define E500_TLB_DIRTY 2 - -struct tlbe_ref { - pfn_t pfn; - unsigned int flags; /* E500_TLB_* */ -}; - -struct tlbe_priv { - struct tlbe_ref ref; /* TLB0 only -- TLB1 uses tlb_refs */ -}; - -struct vcpu_id_table; - -struct kvmppc_e500_tlb_params { - int entries, ways, sets; -}; - -struct kvmppc_vcpu_e500 { - /* Unmodified copy of the guest's TLB -- shared with host userspace. */ - struct kvm_book3e_206_tlb_entry *gtlb_arch; - - /* Starting entry number in gtlb_arch[] */ - int gtlb_offset[E500_TLB_NUM]; - - /* KVM internal information associated with each guest TLB entry */ - struct tlbe_priv *gtlb_priv[E500_TLB_NUM]; - - struct kvmppc_e500_tlb_params gtlb_params[E500_TLB_NUM]; - - unsigned int gtlb_nv[E500_TLB_NUM]; - - /* -* information associated with each host TLB entry -- -* TLB1 only for now. If/when guest TLB1 entries can be -* mapped with host TLB0, this will be used for that too. -* -* We don't want to use this for guest TLB0 because then we'd -* have the overhead of doing the translation again even if -* the entry is still in the guest TLB (e.g. we swapped out -* and back, and our host TLB entries got evicted). -*/ - struct tlbe_ref *tlb_refs[E500_TLB_NUM]; - unsigned int host_tlb1_nv; - - u32 host_pid[E500_PID_NUM]; - u32 pid[E500_PID_NUM]; - u32 svr; - - /* vcpu id table */ - struct vcpu_id_table *idt; - - u32 l1csr0; - u32 l1csr1; - u32 hid0; - u32 hid1; - u32 tlb0cfg; - u32 tlb1cfg; - u64 mcar; - - struct page **shared_tlb_pages; - int num_shared_tlb_pages; - - struct kvm_vcpu vcpu; -}; - -static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu) -{ - return container_of(vcpu, struct kvmppc_vcpu_e500, vcpu); -} - -#endif /* __ASM_KVM_E500_H__ */ diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c index 5c450ba..76b35d8 100644 --- a/arch/powerpc/kvm/e500.c +++ b/arch/powerpc/kvm/e500.c @@ -20,7 +20,6 @@ #include #include #include -#include #include #include "booke.h" diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index 02ecde2..51d13bd 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -1,11 +1,12 @@ /* * Copyright (C) 2008-2011 Freescale Semiconductor, Inc. All rights reserved. * - * Author: Yu Liu, yu@freescale.com + * Author: Yu Liu * * Description: - * This file is based on arch/powerpc/kvm/44x_tlb.h, - * by Hollis Blanchard . + * This file is based on arch/powerpc/kvm/44x_tlb.h and + * arch/powerpc/include/asm/kvm_44x.h by Hollis Blanchard , + * Copyright IBM Corp. 2007-2008 * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License, version 2, as @@ -18,7 +19,80 @@ #include #include #include -#include + +#define E500_PID_NUM 3 +#define E500_TLB_NUM 2 + +#define E500_TLB_VALID 1 +#define E500_TLB_DIRTY 2 + +struct tlbe_ref { + pfn_t pfn; + unsigned int flags; /* E500_TLB_* */ +}; + +struct tlbe_priv { + struct tlbe_ref ref; /* TLB0 only -- TLB1 uses tlb_refs */ +}; + +struct vcpu_id_table; + +struct kvmppc_e500_tlb_params { + int entries, ways, sets; +}; + +struct kvmppc_vc
[PATCH 06/38] KVM: PPC: e500: rename e500_tlb.h to e500.h
From: Scott Wood This is in preparation for merging in the contents of arch/powerpc/include/asm/kvm_e500.h. Signed-off-by: Scott Wood Signed-off-by: Alexander Graf --- arch/powerpc/kvm/e500.c |2 +- arch/powerpc/kvm/{e500_tlb.h => e500.h} |6 +++--- arch/powerpc/kvm/e500_emulate.c |2 +- arch/powerpc/kvm/e500_tlb.c |2 +- 4 files changed, 6 insertions(+), 6 deletions(-) rename arch/powerpc/kvm/{e500_tlb.h => e500.h} (98%) diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c index ac6c9ae..5c450ba 100644 --- a/arch/powerpc/kvm/e500.c +++ b/arch/powerpc/kvm/e500.c @@ -24,7 +24,7 @@ #include #include "booke.h" -#include "e500_tlb.h" +#include "e500.h" void kvmppc_core_load_host_debugstate(struct kvm_vcpu *vcpu) { diff --git a/arch/powerpc/kvm/e500_tlb.h b/arch/powerpc/kvm/e500.h similarity index 98% rename from arch/powerpc/kvm/e500_tlb.h rename to arch/powerpc/kvm/e500.h index 5c6d2d7..02ecde2 100644 --- a/arch/powerpc/kvm/e500_tlb.h +++ b/arch/powerpc/kvm/e500.h @@ -12,8 +12,8 @@ * published by the Free Software Foundation. */ -#ifndef __KVM_E500_TLB_H__ -#define __KVM_E500_TLB_H__ +#ifndef KVM_E500_H +#define KVM_E500_H #include #include @@ -171,4 +171,4 @@ static inline int tlbe_is_host_safe(const struct kvm_vcpu *vcpu, return 1; } -#endif /* __KVM_E500_TLB_H__ */ +#endif /* KVM_E500_H */ diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c index 6d0b2bd..2a1a228 100644 --- a/arch/powerpc/kvm/e500_emulate.c +++ b/arch/powerpc/kvm/e500_emulate.c @@ -17,7 +17,7 @@ #include #include "booke.h" -#include "e500_tlb.h" +#include "e500.h" #define XOP_TLBIVAX 786 #define XOP_TLBSX 914 diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c index 6e53e41..1d623a0 100644 --- a/arch/powerpc/kvm/e500_tlb.c +++ b/arch/powerpc/kvm/e500_tlb.c @@ -29,7 +29,7 @@ #include #include "../mm/mmu_decl.h" -#include "e500_tlb.h" +#include "e500.h" #include "trace.h" #include "timing.h" -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 17/38] KVM: PPC: e500mc: implicitly set MSR_GS
When setting MSR for an e500mc guest, we implicitly always set MSR_GS to make sure the guest is in guest state. Since we have this implicit rule there, we don't need to explicitly pass MSR_GS to set_msr(). Remove all explicit setters of MSR_GS. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/booke.c | 11 +-- 1 files changed, 5 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 85bd5b8..fcbe928 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -280,7 +280,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu, unsigned int priority) { int allowed = 0; - ulong uninitialized_var(msr_mask); + ulong msr_mask = 0; bool update_esr = false, update_dear = false; ulong crit_raw = vcpu->arch.shared->critical; ulong crit_r1 = kvmppc_get_gpr(vcpu, 1); @@ -322,20 +322,19 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu, case BOOKE_IRQPRIO_AP_UNAVAIL: case BOOKE_IRQPRIO_ALIGNMENT: allowed = 1; - msr_mask = MSR_GS | MSR_CE | MSR_ME | MSR_DE; + msr_mask = MSR_CE | MSR_ME | MSR_DE; int_class = INT_CLASS_NONCRIT; break; case BOOKE_IRQPRIO_CRITICAL: case BOOKE_IRQPRIO_DBELL_CRIT: allowed = vcpu->arch.shared->msr & MSR_CE; allowed = allowed && !crit; - msr_mask = MSR_GS | MSR_ME; + msr_mask = MSR_ME; int_class = INT_CLASS_CRIT; break; case BOOKE_IRQPRIO_MACHINE_CHECK: allowed = vcpu->arch.shared->msr & MSR_ME; allowed = allowed && !crit; - msr_mask = MSR_GS; int_class = INT_CLASS_MC; break; case BOOKE_IRQPRIO_DECREMENTER: @@ -346,13 +345,13 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu, case BOOKE_IRQPRIO_DBELL: allowed = vcpu->arch.shared->msr & MSR_EE; allowed = allowed && !crit; - msr_mask = MSR_GS | MSR_CE | MSR_ME | MSR_DE; + msr_mask = MSR_CE | MSR_ME | MSR_DE; int_class = INT_CLASS_NONCRIT; break; case BOOKE_IRQPRIO_DEBUG: allowed = vcpu->arch.shared->msr & MSR_DE; allowed = allowed && !crit; - msr_mask = MSR_GS | MSR_ME; + msr_mask = MSR_ME; int_class = INT_CLASS_CRIT; break; } -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 02/38] powerpc/e500: split CPU_FTRS_ALWAYS/CPU_FTRS_POSSIBLE
From: Scott Wood Split e500 (v1/v2) and e500mc/e5500 to allow optimization of feature checks that differ between the two. Signed-off-by: Scott Wood Signed-off-by: Alexander Graf --- arch/powerpc/include/asm/cputable.h | 12 1 files changed, 8 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h index 6a034a2..2022f2d 100644 --- a/arch/powerpc/include/asm/cputable.h +++ b/arch/powerpc/include/asm/cputable.h @@ -483,8 +483,10 @@ enum { CPU_FTRS_E200 | #endif #ifdef CONFIG_E500 - CPU_FTRS_E500 | CPU_FTRS_E500_2 | CPU_FTRS_E500MC | - CPU_FTRS_E5500 | + CPU_FTRS_E500 | CPU_FTRS_E500_2 | +#endif +#ifdef CONFIG_PPC_E500MC + CPU_FTRS_E500MC | CPU_FTRS_E5500 | #endif 0, }; @@ -528,8 +530,10 @@ enum { CPU_FTRS_E200 & #endif #ifdef CONFIG_E500 - CPU_FTRS_E500 & CPU_FTRS_E500_2 & CPU_FTRS_E500MC & - CPU_FTRS_E5500 & + CPU_FTRS_E500 & CPU_FTRS_E500_2 & +#endif +#ifdef CONFIG_PPC_E500MC + CPU_FTRS_E500MC & CPU_FTRS_E5500 & #endif CPU_FTRS_POSSIBLE, }; -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 05/38] KVM: PPC: booke: Move vm core init/destroy out of booke.c
From: Scott Wood e500mc will want to do lpid allocation/deallocation here. Signed-off-by: Scott Wood Signed-off-by: Alexander Graf --- arch/powerpc/kvm/44x.c |9 + arch/powerpc/kvm/booke.c |9 - arch/powerpc/kvm/e500.c |9 + 3 files changed, 18 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/kvm/44x.c b/arch/powerpc/kvm/44x.c index 879a1a7..50e7dbc 100644 --- a/arch/powerpc/kvm/44x.c +++ b/arch/powerpc/kvm/44x.c @@ -163,6 +163,15 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu) kmem_cache_free(kvm_vcpu_cache, vcpu_44x); } +int kvmppc_core_init_vm(struct kvm *kvm) +{ + return 0; +} + +void kvmppc_core_destroy_vm(struct kvm *kvm) +{ +} + static int __init kvmppc_44x_init(void) { int r; diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index a2456c7..2ee9bae 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -932,15 +932,6 @@ void kvmppc_core_commit_memory_region(struct kvm *kvm, { } -int kvmppc_core_init_vm(struct kvm *kvm) -{ - return 0; -} - -void kvmppc_core_destroy_vm(struct kvm *kvm) -{ -} - void kvmppc_set_tcr(struct kvm_vcpu *vcpu, u32 new_tcr) { vcpu->arch.tcr = new_tcr; diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c index 2d5fe04..ac6c9ae 100644 --- a/arch/powerpc/kvm/e500.c +++ b/arch/powerpc/kvm/e500.c @@ -226,6 +226,15 @@ void kvmppc_core_vcpu_free(struct kvm_vcpu *vcpu) kmem_cache_free(kvm_vcpu_cache, vcpu_e500); } +int kvmppc_core_init_vm(struct kvm *kvm) +{ + return 0; +} + +void kvmppc_core_destroy_vm(struct kvm *kvm) +{ +} + static int __init kvmppc_e500_init(void) { int r, i; -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 18/38] KVM: PPC: e500mc: Move r1/r2 restoration very early
If we hit any exception whatsoever in the restore path and r1/r2 aren't the host registers, we don't get a working oops. So it's always a good idea to restore them as early as possible. This time, it actually has practical reasons to do so too, since we need to have the host page fault handler fix up our guest instruction read code. And for that to work we need r1/r2 restored. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/bookehv_interrupts.S | 12 ++-- 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S index 9eaeebd..63023ae 100644 --- a/arch/powerpc/kvm/bookehv_interrupts.S +++ b/arch/powerpc/kvm/bookehv_interrupts.S @@ -67,6 +67,12 @@ * saved in vcpu: cr, ctr, r3-r13 */ .macro kvm_handler_common intno, srr0, flags + /* Restore host stack pointer */ + PPC_STL r1, VCPU_GPR(r1)(r4) + PPC_STL r2, VCPU_GPR(r2)(r4) + PPC_LL r1, VCPU_HOST_STACK(r4) + PPC_LL r2, HOST_R2(r1) + mfspr r10, SPRN_PID lwz r8, VCPU_HOST_PID(r4) PPC_LL r11, VCPU_SHARED(r4) @@ -290,10 +296,8 @@ _GLOBAL(kvmppc_resume_host) /* Save remaining volatile guest register state to vcpu. */ mfspr r3, SPRN_VRSAVE PPC_STL r0, VCPU_GPR(r0)(r4) - PPC_STL r1, VCPU_GPR(r1)(r4) mflrr5 mfspr r6, SPRN_SPRG4 - PPC_STL r2, VCPU_GPR(r2)(r4) PPC_STL r5, VCPU_LR(r4) mfspr r7, SPRN_SPRG5 PPC_STL r3, VCPU_VRSAVE(r4) @@ -334,10 +338,6 @@ _GLOBAL(kvmppc_resume_host) mtspr SPRN_EPCR, r3 isync - /* Restore host stack pointer */ - PPC_LL r1, VCPU_HOST_STACK(r4) - PPC_LL r2, HOST_R2(r1) - /* Switch to kernel stack and jump to handler. */ PPC_LL r3, HOST_RUN(r1) mr r5, r14 /* intno */ -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 08/38] KVM: PPC: e500: clean up arch/powerpc/kvm/e500.h
From: Scott Wood Move vcpu to the beginning of vcpu_e500 to give it appropriate prominence, especially if more fields end up getting added to the end of vcpu_e500 (and vcpu ends up in the middle). Remove gratuitous "extern" and add parameter names to prototypes. Signed-off-by: Scott Wood [agraf: fix bisectability] Signed-off-by: Alexander Graf --- arch/powerpc/kvm/e500.h | 25 ++--- 1 files changed, 14 insertions(+), 11 deletions(-) diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index 51d13bd..a48af00 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -42,6 +42,8 @@ struct kvmppc_e500_tlb_params { }; struct kvmppc_vcpu_e500 { + struct kvm_vcpu vcpu; + /* Unmodified copy of the guest's TLB -- shared with host userspace. */ struct kvm_book3e_206_tlb_entry *gtlb_arch; @@ -85,8 +87,6 @@ struct kvmppc_vcpu_e500 { struct page **shared_tlb_pages; int num_shared_tlb_pages; - - struct kvm_vcpu vcpu; }; static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu) @@ -113,19 +113,22 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu) (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \ | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK) -extern void kvmppc_dump_tlbs(struct kvm_vcpu *); -extern int kvmppc_e500_emul_mt_mmucsr0(struct kvmppc_vcpu_e500 *, ulong); -extern int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *); -extern int kvmppc_e500_emul_tlbre(struct kvm_vcpu *); -extern int kvmppc_e500_emul_tlbivax(struct kvm_vcpu *, int, int); -extern int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *, int); -extern int kvmppc_e500_tlb_search(struct kvm_vcpu *, gva_t, unsigned int, int); extern void kvmppc_e500_tlb_put(struct kvm_vcpu *); extern void kvmppc_e500_tlb_load(struct kvm_vcpu *, int); -extern int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *); -extern void kvmppc_e500_tlb_uninit(struct kvmppc_vcpu_e500 *); extern void kvmppc_e500_tlb_setup(struct kvmppc_vcpu_e500 *); extern void kvmppc_e500_recalc_shadow_pid(struct kvmppc_vcpu_e500 *); +int kvmppc_e500_emul_mt_mmucsr0(struct kvmppc_vcpu_e500 *vcpu_e500, + ulong value); +int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu); +int kvmppc_e500_emul_tlbre(struct kvm_vcpu *vcpu); +int kvmppc_e500_emul_tlbivax(struct kvm_vcpu *vcpu, int ra, int rb); +int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *vcpu, int rb); +int kvmppc_e500_tlb_search(struct kvm_vcpu *, gva_t, unsigned int, int); +int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500); +void kvmppc_e500_tlb_uninit(struct kvmppc_vcpu_e500 *vcpu_e500); + +void kvmppc_get_sregs_e500_tlb(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs); +int kvmppc_set_sregs_e500_tlb(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs); /* TLB helper functions */ static inline unsigned int -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 11/38] KVM: PPC: e500: emulate tlbilx
From: Scott Wood tlbilx is the new, preferred invalidation instruction. It is not found on e500 prior to e500mc, but there should be no harm in supporting it on all e500. Based on code from Ashish Kalra . Signed-off-by: Scott Wood Signed-off-by: Alexander Graf --- arch/powerpc/kvm/e500.h |1 + arch/powerpc/kvm/e500_emulate.c |9 ++ arch/powerpc/kvm/e500_tlb.c | 52 +++ 3 files changed, 62 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index f4dee55..ce3f163 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -124,6 +124,7 @@ int kvmppc_e500_emul_mt_mmucsr0(struct kvmppc_vcpu_e500 *vcpu_e500, int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu); int kvmppc_e500_emul_tlbre(struct kvm_vcpu *vcpu); int kvmppc_e500_emul_tlbivax(struct kvm_vcpu *vcpu, int ra, int rb); +int kvmppc_e500_emul_tlbilx(struct kvm_vcpu *vcpu, int rt, int ra, int rb); int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *vcpu, int rb); int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500); void kvmppc_e500_tlb_uninit(struct kvmppc_vcpu_e500 *vcpu_e500); diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c index c80794d..af02c18 100644 --- a/arch/powerpc/kvm/e500_emulate.c +++ b/arch/powerpc/kvm/e500_emulate.c @@ -22,6 +22,7 @@ #define XOP_TLBSX 914 #define XOP_TLBRE 946 #define XOP_TLBWE 978 +#define XOP_TLBILX 18 int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu, unsigned int inst, int *advance) @@ -29,6 +30,7 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu, int emulated = EMULATE_DONE; int ra; int rb; + int rt; switch (get_op(inst)) { case 31: @@ -47,6 +49,13 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu, emulated = kvmppc_e500_emul_tlbsx(vcpu,rb); break; + case XOP_TLBILX: + ra = get_ra(inst); + rb = get_rb(inst); + rt = get_rt(inst); + emulated = kvmppc_e500_emul_tlbilx(vcpu, rt, ra, rb); + break; + case XOP_TLBIVAX: ra = get_ra(inst); rb = get_rb(inst); diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c index c8ce51d..6eb5d65 100644 --- a/arch/powerpc/kvm/e500_tlb.c +++ b/arch/powerpc/kvm/e500_tlb.c @@ -631,6 +631,58 @@ int kvmppc_e500_emul_tlbivax(struct kvm_vcpu *vcpu, int ra, int rb) return EMULATE_DONE; } +static void tlbilx_all(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel, + int pid, int rt) +{ + struct kvm_book3e_206_tlb_entry *tlbe; + int tid, esel; + + /* invalidate all entries */ + for (esel = 0; esel < vcpu_e500->gtlb_params[tlbsel].entries; esel++) { + tlbe = get_entry(vcpu_e500, tlbsel, esel); + tid = get_tlb_tid(tlbe); + if (rt == 0 || tid == pid) { + inval_gtlbe_on_host(vcpu_e500, tlbsel, esel); + kvmppc_e500_gtlbe_invalidate(vcpu_e500, tlbsel, esel); + } + } +} + +static void tlbilx_one(struct kvmppc_vcpu_e500 *vcpu_e500, int pid, + int ra, int rb) +{ + int tlbsel, esel; + gva_t ea; + + ea = kvmppc_get_gpr(&vcpu_e500->vcpu, rb); + if (ra) + ea += kvmppc_get_gpr(&vcpu_e500->vcpu, ra); + + for (tlbsel = 0; tlbsel < 2; tlbsel++) { + esel = kvmppc_e500_tlb_index(vcpu_e500, ea, tlbsel, pid, -1); + if (esel >= 0) { + inval_gtlbe_on_host(vcpu_e500, tlbsel, esel); + kvmppc_e500_gtlbe_invalidate(vcpu_e500, tlbsel, esel); + break; + } + } +} + +int kvmppc_e500_emul_tlbilx(struct kvm_vcpu *vcpu, int rt, int ra, int rb) +{ + struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); + int pid = get_cur_spid(vcpu); + + if (rt == 0 || rt == 1) { + tlbilx_all(vcpu_e500, 0, pid, rt); + tlbilx_all(vcpu_e500, 1, pid, rt); + } else if (rt == 3) { + tlbilx_one(vcpu_e500, pid, ra, rb); + } + + return EMULATE_DONE; +} + int kvmppc_e500_emul_tlbre(struct kvm_vcpu *vcpu) { struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 14/38] KVM: PPC: booke: standard PPC floating point support
From: Scott Wood e500mc has a normal PPC FPU, rather than SPE which is found on e500v1/v2. Based on code from Liu Yu . Signed-off-by: Scott Wood Signed-off-by: Alexander Graf --- arch/powerpc/include/asm/system.h |1 + arch/powerpc/kvm/booke.c | 44 + arch/powerpc/kvm/booke.h | 30 + 3 files changed, 75 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/include/asm/system.h b/arch/powerpc/include/asm/system.h index c377457..73eee86 100644 --- a/arch/powerpc/include/asm/system.h +++ b/arch/powerpc/include/asm/system.h @@ -140,6 +140,7 @@ extern void via_cuda_init(void); extern void read_rtc_time(void); extern void pmac_find_display(void); extern void giveup_fpu(struct task_struct *); +extern void load_up_fpu(void); extern void disable_kernel_fp(void); extern void enable_kernel_fp(void); extern void flush_fp_to_thread(struct task_struct *); diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 75dbaeb..0b77be1 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -457,6 +457,11 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) { int ret; +#ifdef CONFIG_PPC_FPU + unsigned int fpscr; + int fpexc_mode; + u64 fpr[32]; +#endif if (!vcpu->arch.sane) { kvm_run->exit_reason = KVM_EXIT_INTERNAL_ERROR; @@ -479,7 +484,46 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) } kvm_guest_enter(); + +#ifdef CONFIG_PPC_FPU + /* Save userspace FPU state in stack */ + enable_kernel_fp(); + memcpy(fpr, current->thread.fpr, sizeof(current->thread.fpr)); + fpscr = current->thread.fpscr.val; + fpexc_mode = current->thread.fpexc_mode; + + /* Restore guest FPU state to thread */ + memcpy(current->thread.fpr, vcpu->arch.fpr, sizeof(vcpu->arch.fpr)); + current->thread.fpscr.val = vcpu->arch.fpscr; + + /* +* Since we can't trap on MSR_FP in GS-mode, we consider the guest +* as always using the FPU. Kernel usage of FP (via +* enable_kernel_fp()) in this thread must not occur while +* vcpu->fpu_active is set. +*/ + vcpu->fpu_active = 1; + + kvmppc_load_guest_fp(vcpu); +#endif + ret = __kvmppc_vcpu_run(kvm_run, vcpu); + +#ifdef CONFIG_PPC_FPU + kvmppc_save_guest_fp(vcpu); + + vcpu->fpu_active = 0; + + /* Save guest FPU state from thread */ + memcpy(vcpu->arch.fpr, current->thread.fpr, sizeof(vcpu->arch.fpr)); + vcpu->arch.fpscr = current->thread.fpscr.val; + + /* Restore userspace FPU state from stack */ + memcpy(current->thread.fpr, fpr, sizeof(current->thread.fpr)); + current->thread.fpscr.val = fpscr; + current->thread.fpexc_mode = fpexc_mode; +#endif + kvm_guest_exit(); out: diff --git a/arch/powerpc/kvm/booke.h b/arch/powerpc/kvm/booke.h index d53bcf2..3bf5eda 100644 --- a/arch/powerpc/kvm/booke.h +++ b/arch/powerpc/kvm/booke.h @@ -96,4 +96,34 @@ enum int_class { void kvmppc_set_pending_interrupt(struct kvm_vcpu *vcpu, enum int_class type); +/* + * Load up guest vcpu FP state if it's needed. + * It also set the MSR_FP in thread so that host know + * we're holding FPU, and then host can help to save + * guest vcpu FP state if other threads require to use FPU. + * This simulates an FP unavailable fault. + * + * It requires to be called with preemption disabled. + */ +static inline void kvmppc_load_guest_fp(struct kvm_vcpu *vcpu) +{ +#ifdef CONFIG_PPC_FPU + if (vcpu->fpu_active && !(current->thread.regs->msr & MSR_FP)) { + load_up_fpu(); + current->thread.regs->msr |= MSR_FP; + } +#endif +} + +/* + * Save guest vcpu FP state into thread. + * It requires to be called with preemption disabled. + */ +static inline void kvmppc_save_guest_fp(struct kvm_vcpu *vcpu) +{ +#ifdef CONFIG_PPC_FPU + if (vcpu->fpu_active && (current->thread.regs->msr & MSR_FP)) + giveup_fpu(current); +#endif +} #endif /* __KVM_BOOKE_H__ */ -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 12/38] powerpc/booke: Provide exception macros with interrupt name
From: Scott Wood DO_KVM will need to identify the particular exception type. There is an existing set of arbitrary numbers that Linux passes, but it's an undocumented mess that sort of corresponds to server/classic exception vectors but not really. Signed-off-by: Scott Wood Signed-off-by: Alexander Graf --- arch/powerpc/kernel/head_44x.S | 23 +-- arch/powerpc/kernel/head_booke.h | 41 ++ arch/powerpc/kernel/head_fsl_booke.S | 52 +- 3 files changed, 68 insertions(+), 48 deletions(-) diff --git a/arch/powerpc/kernel/head_44x.S b/arch/powerpc/kernel/head_44x.S index 7dd2981..d1192c5 100644 --- a/arch/powerpc/kernel/head_44x.S +++ b/arch/powerpc/kernel/head_44x.S @@ -248,10 +248,11 @@ _ENTRY(_start); interrupt_base: /* Critical Input Interrupt */ - CRITICAL_EXCEPTION(0x0100, CriticalInput, unknown_exception) + CRITICAL_EXCEPTION(0x0100, CRITICAL, CriticalInput, unknown_exception) /* Machine Check Interrupt */ - CRITICAL_EXCEPTION(0x0200, MachineCheck, machine_check_exception) + CRITICAL_EXCEPTION(0x0200, MACHINE_CHECK, MachineCheck, \ + machine_check_exception) MCHECK_EXCEPTION(0x0210, MachineCheckA, machine_check_exception) /* Data Storage Interrupt */ @@ -261,7 +262,8 @@ interrupt_base: INSTRUCTION_STORAGE_EXCEPTION /* External Input Interrupt */ - EXCEPTION(0x0500, ExternalInput, do_IRQ, EXC_XFER_LITE) + EXCEPTION(0x0500, BOOKE_INTERRUPT_EXTERNAL, ExternalInput, \ + do_IRQ, EXC_XFER_LITE) /* Alignment Interrupt */ ALIGNMENT_EXCEPTION @@ -273,29 +275,32 @@ interrupt_base: #ifdef CONFIG_PPC_FPU FP_UNAVAILABLE_EXCEPTION #else - EXCEPTION(0x2010, FloatingPointUnavailable, unknown_exception, EXC_XFER_EE) + EXCEPTION(0x2010, BOOKE_INTERRUPT_FP_UNAVAIL, \ + FloatingPointUnavailable, unknown_exception, EXC_XFER_EE) #endif /* System Call Interrupt */ START_EXCEPTION(SystemCall) - NORMAL_EXCEPTION_PROLOG + NORMAL_EXCEPTION_PROLOG(BOOKE_INTERRUPT_SYSCALL) EXC_XFER_EE_LITE(0x0c00, DoSyscall) /* Auxiliary Processor Unavailable Interrupt */ - EXCEPTION(0x2020, AuxillaryProcessorUnavailable, unknown_exception, EXC_XFER_EE) + EXCEPTION(0x2020, BOOKE_INTERRUPT_AP_UNAVAIL, \ + AuxillaryProcessorUnavailable, unknown_exception, EXC_XFER_EE) /* Decrementer Interrupt */ DECREMENTER_EXCEPTION /* Fixed Internal Timer Interrupt */ /* TODO: Add FIT support */ - EXCEPTION(0x1010, FixedIntervalTimer, unknown_exception, EXC_XFER_EE) + EXCEPTION(0x1010, BOOKE_INTERRUPT_FIT, FixedIntervalTimer, \ + unknown_exception, EXC_XFER_EE) /* Watchdog Timer Interrupt */ /* TODO: Add watchdog support */ #ifdef CONFIG_BOOKE_WDT - CRITICAL_EXCEPTION(0x1020, WatchdogTimer, WatchdogException) + CRITICAL_EXCEPTION(0x1020, WATCHDOG, WatchdogTimer, WatchdogException) #else - CRITICAL_EXCEPTION(0x1020, WatchdogTimer, unknown_exception) + CRITICAL_EXCEPTION(0x1020, WATCHDOG, WatchdogTimer, unknown_exception) #endif /* Data TLB Error Interrupt */ diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h index fc921bf..06ab353 100644 --- a/arch/powerpc/kernel/head_booke.h +++ b/arch/powerpc/kernel/head_booke.h @@ -2,6 +2,8 @@ #define __HEAD_BOOKE_H__ #include /* for STACK_FRAME_REGS_MARKER */ +#include + /* * Macros used for common Book-e exception handling */ @@ -28,7 +30,7 @@ */ #define THREAD_NORMSAVE(offset)(THREAD_NORMSAVES + (offset * 4)) -#define NORMAL_EXCEPTION_PROLOG \ +#define NORMAL_EXCEPTION_PROLOG(intno) \ mtspr SPRN_SPRG_WSCRATCH0, r10; /* save one register */ \ mfspr r10, SPRN_SPRG_THREAD; \ stw r11, THREAD_NORMSAVE(0)(r10);\ @@ -113,7 +115,7 @@ * registers as the normal prolog above. Instead we use a portion of the * critical/machine check exception stack at low physical addresses. */ -#define EXC_LEVEL_EXCEPTION_PROLOG(exc_level, exc_level_srr0, exc_level_srr1) \ +#define EXC_LEVEL_EXCEPTION_PROLOG(exc_level, intno, exc_level_srr0, exc_level_srr1) \ mtspr SPRN_SPRG_WSCRATCH_##exc_level,r8; \ BOOKE_LOAD_EXC_LEVEL_STACK(exc_level);/* r8 points to the exc_level stack*/ \ stw r9,GPR9(r8);/* save various registers */\ @@ -162,12 +164,13 @@ SAVE_4GPRS(3, r11); \ SAVE_2GPRS(7, r11) -#define CRITICAL_EXCEPTION_PROLOG \ - EXC_LEVEL_EXCEPTION_PROLOG(CRIT, S
[PATCH 09/38] KVM: PPC: e500: refactor core-specific TLB code
From: Scott Wood The PID handling is e500v1/v2-specific, and is moved to e500.c. The MMU sregs code and kvmppc_core_vcpu_translate will be shared with e500mc, and is moved from e500.c to e500_tlb.c. Partially based on patches from Liu Yu . Signed-off-by: Scott Wood [agraf: fix bisectability] Signed-off-by: Alexander Graf --- arch/powerpc/include/asm/kvm_host.h |2 + arch/powerpc/kvm/e500.c | 357 +++ arch/powerpc/kvm/e500.h | 62 - arch/powerpc/kvm/e500_emulate.c |6 +- arch/powerpc/kvm/e500_tlb.c | 460 +-- 5 files changed, 473 insertions(+), 414 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 52eb9c1..47612cc 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -426,6 +426,8 @@ struct kvm_vcpu_arch { ulong fault_esr; ulong queued_dear; ulong queued_esr; + u32 tlbcfg[4]; + u32 mmucfg; #endif gpa_t paddr_accessed; diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c index 76b35d8..b479ed7 100644 --- a/arch/powerpc/kvm/e500.c +++ b/arch/powerpc/kvm/e500.c @@ -22,9 +22,281 @@ #include #include +#include "../mm/mmu_decl.h" #include "booke.h" #include "e500.h" +struct id { + unsigned long val; + struct id **pentry; +}; + +#define NUM_TIDS 256 + +/* + * This table provide mappings from: + * (guestAS,guestTID,guestPR) --> ID of physical cpu + * guestAS [0..1] + * guestTID[0..255] + * guestPR [0..1] + * ID [1..255] + * Each vcpu keeps one vcpu_id_table. + */ +struct vcpu_id_table { + struct id id[2][NUM_TIDS][2]; +}; + +/* + * This table provide reversed mappings of vcpu_id_table: + * ID --> address of vcpu_id_table item. + * Each physical core has one pcpu_id_table. + */ +struct pcpu_id_table { + struct id *entry[NUM_TIDS]; +}; + +static DEFINE_PER_CPU(struct pcpu_id_table, pcpu_sids); + +/* This variable keeps last used shadow ID on local core. + * The valid range of shadow ID is [1..255] */ +static DEFINE_PER_CPU(unsigned long, pcpu_last_used_sid); + +/* + * Allocate a free shadow id and setup a valid sid mapping in given entry. + * A mapping is only valid when vcpu_id_table and pcpu_id_table are match. + * + * The caller must have preemption disabled, and keep it that way until + * it has finished with the returned shadow id (either written into the + * TLB or arch.shadow_pid, or discarded). + */ +static inline int local_sid_setup_one(struct id *entry) +{ + unsigned long sid; + int ret = -1; + + sid = ++(__get_cpu_var(pcpu_last_used_sid)); + if (sid < NUM_TIDS) { + __get_cpu_var(pcpu_sids).entry[sid] = entry; + entry->val = sid; + entry->pentry = &__get_cpu_var(pcpu_sids).entry[sid]; + ret = sid; + } + + /* +* If sid == NUM_TIDS, we've run out of sids. We return -1, and +* the caller will invalidate everything and start over. +* +* sid > NUM_TIDS indicates a race, which we disable preemption to +* avoid. +*/ + WARN_ON(sid > NUM_TIDS); + + return ret; +} + +/* + * Check if given entry contain a valid shadow id mapping. + * An ID mapping is considered valid only if + * both vcpu and pcpu know this mapping. + * + * The caller must have preemption disabled, and keep it that way until + * it has finished with the returned shadow id (either written into the + * TLB or arch.shadow_pid, or discarded). + */ +static inline int local_sid_lookup(struct id *entry) +{ + if (entry && entry->val != 0 && + __get_cpu_var(pcpu_sids).entry[entry->val] == entry && + entry->pentry == &__get_cpu_var(pcpu_sids).entry[entry->val]) + return entry->val; + return -1; +} + +/* Invalidate all id mappings on local core -- call with preempt disabled */ +static inline void local_sid_destroy_all(void) +{ + __get_cpu_var(pcpu_last_used_sid) = 0; + memset(&__get_cpu_var(pcpu_sids), 0, sizeof(__get_cpu_var(pcpu_sids))); +} + +static void *kvmppc_e500_id_table_alloc(struct kvmppc_vcpu_e500 *vcpu_e500) +{ + vcpu_e500->idt = kzalloc(sizeof(struct vcpu_id_table), GFP_KERNEL); + return vcpu_e500->idt; +} + +static void kvmppc_e500_id_table_free(struct kvmppc_vcpu_e500 *vcpu_e500) +{ + kfree(vcpu_e500->idt); + vcpu_e500->idt = NULL; +} + +/* Map guest pid to shadow. + * We use PID to keep shadow of current guest non-zero PID, + * and use PID1 to keep shadow of guest zero PID. + * So that guest tlbe with TID=0 can be accessed at any time */ +static void kvmppc_e500_recalc_shadow_pid(struct kvmppc_vcpu_e500 *vcpu_e500) +{ + preempt_disable(); + vcpu_e500->vcpu.arch.shadow_pid = kvmppc_e500_get_sid(vcpu_e500, + get_cur_as(&vcpu_e500->vcpu), + get
[PATCH 19/38] KVM: PPC: e500mc: add load inst fixup
There's always a chance we're unable to read a guest instruction. The guest could have its TLB mapped execute-, but not readable, something odd happens and our TLB gets flushed. So it's a good idea to be prepared for that case and have a fallback that allows us to fix things up in that case. Add fixup code that keeps guest code from potentially crashing our host kernel. Signed-off-by: Alexander Graf --- v1 -> v2: - fix whitespace - use explicit preempt counts --- arch/powerpc/kvm/bookehv_interrupts.S | 30 +- 1 files changed, 29 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S index 63023ae..f7dc3f6 100644 --- a/arch/powerpc/kvm/bookehv_interrupts.S +++ b/arch/powerpc/kvm/bookehv_interrupts.S @@ -28,6 +28,7 @@ #include #include #include +#include #include "../kernel/head_booke.h" /* for THREAD_NORMSAVE() */ @@ -171,9 +172,36 @@ PPC_STL r30, VCPU_GPR(r30)(r4) PPC_STL r31, VCPU_GPR(r31)(r4) mtspr SPRN_EPLC, r8 + + /* disable preemption, so we are sure we hit the fixup handler */ +#ifdef CONFIG_PPC64 + clrrdi r8,r1,THREAD_SHIFT +#else + rlwinm r8,r1,0,0,31-THREAD_SHIFT /* current thread_info */ +#endif + li r7, 1 +stwr7, TI_PREEMPT(r8) + isync - lwepx r9, 0, r5 + + /* +* In case the read goes wrong, we catch it and write an invalid value +* in LAST_INST instead. +*/ +1: lwepx r9, 0, r5 +2: +.section .fixup, "ax" +3: li r9, KVM_INST_FETCH_FAILED + b 2b +.previous +.section __ex_table,"a" + PPC_LONG_ALIGN + PPC_LONG 1b,3b +.previous + mtspr SPRN_EPLC, r3 + li r7, 0 +stwr7, TI_PREEMPT(r8) stw r9, VCPU_LAST_INST(r4) .endif -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 22/38] KVM: PPC: booke: remove leftover debugging
The e500mc patches left some debug code in that we don't need. Remove it. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/booke.c |5 - 1 files changed, 0 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 9fcc760..17d5318 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -469,11 +469,6 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) return -EINVAL; } - if (!current->thread.kvm_vcpu) { - WARN(1, "no vcpu\n"); - return -EPERM; - } - local_irq_disable(); kvmppc_core_prepare_to_enter(vcpu); -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 27/38] KVM: PPC: bookehv: remove negation for CONFIG_64BIT
Instead if doing #ifndef CONFIG_64BIT ... #else ... #endif we should rather do #ifdef CONFIG_64BIT ... #else ... #endif which is a lot easier to read. Change the bookehv implementation to stick with this rule. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/bookehv_interrupts.S | 24 1 files changed, 12 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S index 215381e..c5a0796 100644 --- a/arch/powerpc/kvm/bookehv_interrupts.S +++ b/arch/powerpc/kvm/bookehv_interrupts.S @@ -99,10 +99,10 @@ .endif orisr8, r6, MSR_CE@h -#ifndef CONFIG_64BIT - stw r6, (VCPU_SHARED_MSR + 4)(r11) -#else +#ifdef CONFIG_64BIT std r6, (VCPU_SHARED_MSR)(r11) +#else + stw r6, (VCPU_SHARED_MSR + 4)(r11) #endif ori r8, r8, MSR_ME | MSR_RI PPC_STL r5, VCPU_PC(r4) @@ -344,10 +344,10 @@ _GLOBAL(kvmppc_resume_host) stw r5, VCPU_SHARED_MAS0(r11) mfspr r7, SPRN_MAS2 stw r6, VCPU_SHARED_MAS1(r11) -#ifndef CONFIG_64BIT - stw r7, (VCPU_SHARED_MAS2 + 4)(r11) -#else +#ifdef CONFIG_64BIT std r7, (VCPU_SHARED_MAS2)(r11) +#else + stw r7, (VCPU_SHARED_MAS2 + 4)(r11) #endif mfspr r5, SPRN_MAS3 mfspr r6, SPRN_MAS4 @@ -530,10 +530,10 @@ lightweight_exit: stw r3, VCPU_HOST_MAS6(r4) lwz r3, VCPU_SHARED_MAS0(r11) lwz r5, VCPU_SHARED_MAS1(r11) -#ifndef CONFIG_64BIT - lwz r6, (VCPU_SHARED_MAS2 + 4)(r11) -#else +#ifdef CONFIG_64BIT ld r6, (VCPU_SHARED_MAS2)(r11) +#else + lwz r6, (VCPU_SHARED_MAS2 + 4)(r11) #endif lwz r7, VCPU_SHARED_MAS7_3+4(r11) lwz r8, VCPU_SHARED_MAS4(r11) @@ -572,10 +572,10 @@ lightweight_exit: PPC_LL r6, VCPU_CTR(r4) PPC_LL r7, VCPU_CR(r4) PPC_LL r8, VCPU_PC(r4) -#ifndef CONFIG_64BIT - lwz r9, (VCPU_SHARED_MSR + 4)(r11) -#else +#ifdef CONFIG_64BIT ld r9, (VCPU_SHARED_MSR)(r11) +#else + lwz r9, (VCPU_SHARED_MSR + 4)(r11) #endif PPC_LL r0, VCPU_GPR(r0)(r4) PPC_LL r1, VCPU_GPR(r1)(r4) -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 26/38] KVM: PPC: bookehv: fix exit timing
When using exit timing stats, we clobber r9 in the NEED_EMU case, so better move that part down a few lines and fix it that way. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/bookehv_interrupts.S |8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S index f7dc3f6..215381e 100644 --- a/arch/powerpc/kvm/bookehv_interrupts.S +++ b/arch/powerpc/kvm/bookehv_interrupts.S @@ -83,10 +83,6 @@ stw r10, VCPU_GUEST_PID(r4) mtspr SPRN_PID, r8 - .if \flags & NEED_EMU - lwz r9, VCPU_KVM(r4) - .endif - #ifdef CONFIG_KVM_EXIT_TIMING /* save exit time */ 1: mfspr r7, SPRN_TBRU @@ -98,6 +94,10 @@ PPC_STL r9, VCPU_TIMING_EXIT_TBU(r4) #endif + .if \flags & NEED_EMU + lwz r9, VCPU_KVM(r4) + .endif + orisr8, r6, MSR_CE@h #ifndef CONFIG_64BIT stw r6, (VCPU_SHARED_MSR + 4)(r11) -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 23/38] KVM: PPC: booke: deliver program int on emulation failure
When we fail to emulate an instruction for the guest, we better go in and tell it that we failed to emulate it, by throwing an illegal instruction exception. Please beware that we basically never get around to telling the guest that we failed thanks to the debugging code right above it. If user space however decides that it wants to ignore the debug, we would at least do "the right thing" afterwards. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/booke.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 17d5318..9979be1 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -545,13 +545,13 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu) return RESUME_HOST; case EMULATE_FAIL: - /* XXX Deliver Program interrupt to guest. */ printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n", __func__, vcpu->arch.pc, vcpu->arch.last_inst); /* For debugging, encode the failing instruction and * report it to userspace. */ run->hw.hardware_exit_reason = ~0ULL << 32; run->hw.hardware_exit_reason |= vcpu->arch.last_inst; + kvmppc_core_queue_program(vcpu, ESR_PIL); return RESUME_HOST; default: -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 25/38] KVM: PPC: booke: BOOKE_IRQPRIO_MAX is n+1
The semantics of BOOKE_IRQPRIO_MAX changed to denote the highest available irqprio + 1, so let's reflect that in the code too. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/booke.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 3da0e42..11b0625 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -425,7 +425,7 @@ static void kvmppc_core_check_exceptions(struct kvm_vcpu *vcpu) } priority = __ffs(*pending); - while (priority <= BOOKE_IRQPRIO_MAX) { + while (priority < BOOKE_IRQPRIO_MAX) { if (kvmppc_booke_irqprio_deliver(vcpu, priority)) break; -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 16/38] KVM: PPC: e500mc: Add doorbell emulation support
When one vcpu wants to kick another, it can issue a special IPI instruction called msgsnd. This patch emulates this instruction, its clearing counterpart and the infrastructure required to actually trigger that interrupt inside a guest vcpu. With this patch, SMP guests on e500mc work. Signed-off-by: Alexander Graf --- v1 -> v2: - introduce and use constants - drop e500mc ifdefs --- arch/powerpc/include/asm/dbell.h |2 + arch/powerpc/kvm/booke.c |2 + arch/powerpc/kvm/e500_emulate.c | 68 ++ 3 files changed, 72 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/include/asm/dbell.h b/arch/powerpc/include/asm/dbell.h index d7365b0..154c067 100644 --- a/arch/powerpc/include/asm/dbell.h +++ b/arch/powerpc/include/asm/dbell.h @@ -19,7 +19,9 @@ #define PPC_DBELL_MSG_BRDCAST (0x0400) #define PPC_DBELL_TYPE(x) (((x) & 0xf) << (63-36)) +#define PPC_DBELL_TYPE_MASKPPC_DBELL_TYPE(0xf) #define PPC_DBELL_LPID(x) ((x) << (63 - 49)) +#define PPC_DBELL_PIR_MASK 0x3fff enum ppc_dbell { PPC_DBELL = 0, /* doorbell */ PPC_DBELL_CRIT = 1, /* critical doorbell */ diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 0b77be1..85bd5b8 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -326,6 +326,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu, int_class = INT_CLASS_NONCRIT; break; case BOOKE_IRQPRIO_CRITICAL: + case BOOKE_IRQPRIO_DBELL_CRIT: allowed = vcpu->arch.shared->msr & MSR_CE; allowed = allowed && !crit; msr_mask = MSR_GS | MSR_ME; @@ -342,6 +343,7 @@ static int kvmppc_booke_irqprio_deliver(struct kvm_vcpu *vcpu, keep_irq = true; /* fall through */ case BOOKE_IRQPRIO_EXTERNAL: + case BOOKE_IRQPRIO_DBELL: allowed = vcpu->arch.shared->msr & MSR_EE; allowed = allowed && !crit; msr_mask = MSR_GS | MSR_CE | MSR_ME | MSR_DE; diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c index 98b6c1c..99155f8 100644 --- a/arch/powerpc/kvm/e500_emulate.c +++ b/arch/powerpc/kvm/e500_emulate.c @@ -14,16 +14,74 @@ #include #include +#include #include "booke.h" #include "e500.h" +#define XOP_MSGSND 206 +#define XOP_MSGCLR 238 #define XOP_TLBIVAX 786 #define XOP_TLBSX 914 #define XOP_TLBRE 946 #define XOP_TLBWE 978 #define XOP_TLBILX 18 +#ifdef CONFIG_KVM_E500MC +static int dbell2prio(ulong param) +{ + int msg = param & PPC_DBELL_TYPE_MASK; + int prio = -1; + + switch (msg) { + case PPC_DBELL_TYPE(PPC_DBELL): + prio = BOOKE_IRQPRIO_DBELL; + break; + case PPC_DBELL_TYPE(PPC_DBELL_CRIT): + prio = BOOKE_IRQPRIO_DBELL_CRIT; + break; + default: + break; + } + + return prio; +} + +static int kvmppc_e500_emul_msgclr(struct kvm_vcpu *vcpu, int rb) +{ + ulong param = vcpu->arch.gpr[rb]; + int prio = dbell2prio(param); + + if (prio < 0) + return EMULATE_FAIL; + + clear_bit(prio, &vcpu->arch.pending_exceptions); + return EMULATE_DONE; +} + +static int kvmppc_e500_emul_msgsnd(struct kvm_vcpu *vcpu, int rb) +{ + ulong param = vcpu->arch.gpr[rb]; + int prio = dbell2prio(rb); + int pir = param & PPC_DBELL_PIR_MASK; + int i; + struct kvm_vcpu *cvcpu; + + if (prio < 0) + return EMULATE_FAIL; + + kvm_for_each_vcpu(i, cvcpu, vcpu->kvm) { + int cpir = cvcpu->arch.shared->pir; + if ((param & PPC_DBELL_MSG_BRDCAST) || (cpir == pir)) { + set_bit(prio, &cvcpu->arch.pending_exceptions); + kvm_vcpu_kick(cvcpu); + } + } + + return EMULATE_DONE; +} +#endif + int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu, unsigned int inst, int *advance) { @@ -36,6 +94,16 @@ int kvmppc_core_emulate_op(struct kvm_run *run, struct kvm_vcpu *vcpu, case 31: switch (get_xop(inst)) { +#ifdef CONFIG_KVM_E500MC + case XOP_MSGSND: + emulated = kvmppc_e500_emul_msgsnd(vcpu, get_rb(inst)); + break; + + case XOP_MSGCLR: + emulated = kvmppc_e500_emul_msgclr(vcpu, get_rb(inst)); + break; +#endif + case XOP_TLBRE: emulated = kvmppc_e500_emul_tlbre(vcpu); break; -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 28/38] KVM: PPC: bookehv: remove SET_VCPU
The SET_VCPU macro is a leftover from times when the vcpu struct wasn't stored in the thread on vcpu_load/put. It's not needed anymore. Remove it. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/bookehv_interrupts.S |8 1 files changed, 0 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S index c5a0796..469bd3f 100644 --- a/arch/powerpc/kvm/bookehv_interrupts.S +++ b/arch/powerpc/kvm/bookehv_interrupts.S @@ -35,9 +35,6 @@ #define GET_VCPU(vcpu, thread) \ PPC_LL vcpu, THREAD_KVM_VCPU(thread) -#define SET_VCPU(vcpu) \ -PPC_STLvcpu, (THREAD + THREAD_KVM_VCPU)(r2) - #define LONGBYTES (BITS_PER_LONG / 8) #define VCPU_GPR(n)(VCPU_GPRS + (n * LONGBYTES)) @@ -517,11 +514,6 @@ lightweight_exit: lwz r3, VCPU_GUEST_PID(r4) mtspr SPRN_PID, r3 - /* Save vcpu pointer for the exception handlers -* must be done before loading guest r2. -*/ -// SET_VCPU(r4) - PPC_LL r11, VCPU_SHARED(r4) /* Save host mas4 and mas6 and load guest MAS registers */ mfspr r3, SPRN_MAS4 -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 20/38] KVM: PPC: rename CONFIG_KVM_E500 -> CONFIG_KVM_E500V2
The CONFIG_KVM_E500 option really indicates that we're running on a V2 machine, not on a machine of the generic E500 class. So indicate that properly and change the config name accordingly. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/Kconfig|8 arch/powerpc/kvm/Makefile |4 ++-- arch/powerpc/kvm/booke.c|2 +- arch/powerpc/kvm/e500.h |6 +++--- arch/powerpc/kvm/e500_tlb.c |2 +- arch/powerpc/kvm/powerpc.c |8 6 files changed, 15 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig index 58f6e68..44a998d 100644 --- a/arch/powerpc/kvm/Kconfig +++ b/arch/powerpc/kvm/Kconfig @@ -109,7 +109,7 @@ config KVM_440 config KVM_EXIT_TIMING bool "Detailed exit timing" - depends on KVM_440 || KVM_E500 || KVM_E500MC + depends on KVM_440 || KVM_E500V2 || KVM_E500MC ---help--- Calculate elapsed time for every exit/enter cycle. A per-vcpu report is available in debugfs kvm/vm#_vcpu#_timing. @@ -118,14 +118,14 @@ config KVM_EXIT_TIMING If unsure, say N. -config KVM_E500 - bool "KVM support for PowerPC E500 processors" +config KVM_E500V2 + bool "KVM support for PowerPC E500v2 processors" depends on EXPERIMENTAL && E500 select KVM select KVM_MMIO ---help--- Support running unmodified E500 guest kernels in virtual machines on - E500 host processors. + E500v2 host processors. This module provides access to the hardware capabilities through a character device node named /dev/kvm. diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile index 62febd7..25225ae 100644 --- a/arch/powerpc/kvm/Makefile +++ b/arch/powerpc/kvm/Makefile @@ -36,7 +36,7 @@ kvm-e500-objs := \ e500.o \ e500_tlb.o \ e500_emulate.o -kvm-objs-$(CONFIG_KVM_E500) := $(kvm-e500-objs) +kvm-objs-$(CONFIG_KVM_E500V2) := $(kvm-e500-objs) kvm-e500mc-objs := \ $(common-objs-y) \ @@ -98,7 +98,7 @@ kvm-objs-$(CONFIG_KVM_BOOK3S_32) := $(kvm-book3s_32-objs) kvm-objs := $(kvm-objs-m) $(kvm-objs-y) obj-$(CONFIG_KVM_440) += kvm.o -obj-$(CONFIG_KVM_E500) += kvm.o +obj-$(CONFIG_KVM_E500V2) += kvm.o obj-$(CONFIG_KVM_E500MC) += kvm.o obj-$(CONFIG_KVM_BOOK3S_64) += kvm.o obj-$(CONFIG_KVM_BOOK3S_32) += kvm.o diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index fcbe928..9fcc760 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -762,7 +762,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, gpa_t gpaddr; gfn_t gfn; -#ifdef CONFIG_KVM_E500 +#ifdef CONFIG_KVM_E500V2 if (!(vcpu->arch.shared->msr & MSR_PR) && (eaddr & PAGE_MASK) == vcpu->arch.magic_page_ea) { kvmppc_map_magic(vcpu); diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index 3143085..7967f3f 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -39,7 +39,7 @@ struct tlbe_priv { struct tlbe_ref ref; /* TLB0 only -- TLB1 uses tlb_refs */ }; -#ifdef CONFIG_KVM_E500 +#ifdef CONFIG_KVM_E500V2 struct vcpu_id_table; #endif @@ -89,7 +89,7 @@ struct kvmppc_vcpu_e500 { u64 *g2h_tlb1_map; unsigned int *h2g_tlb1_rmap; -#ifdef CONFIG_KVM_E500 +#ifdef CONFIG_KVM_E500V2 u32 pid[E500_PID_NUM]; /* vcpu id table */ @@ -136,7 +136,7 @@ void kvmppc_get_sregs_e500_tlb(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs); int kvmppc_set_sregs_e500_tlb(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs); -#ifdef CONFIG_KVM_E500 +#ifdef CONFIG_KVM_E500V2 unsigned int kvmppc_e500_get_sid(struct kvmppc_vcpu_e500 *vcpu_e500, unsigned int as, unsigned int gid, unsigned int pr, int avoid_recursion); diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c index e232bb4..279e10a 100644 --- a/arch/powerpc/kvm/e500_tlb.c +++ b/arch/powerpc/kvm/e500_tlb.c @@ -156,7 +156,7 @@ static inline void write_host_tlbe(struct kvmppc_vcpu_e500 *vcpu_e500, } } -#ifdef CONFIG_KVM_E500 +#ifdef CONFIG_KVM_E500V2 void kvmppc_map_magic(struct kvm_vcpu *vcpu) { struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 58a084f..26c6a8d 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -74,7 +74,7 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu) } case HC_VENDOR_KVM | KVM_HC_FEATURES: r = HC_EV_SUCCESS; -#if defined(CONFIG_PPC_BOOK3S) || defined(CONFIG_KVM_E500) +#if defined(CONFIG_PPC_BOOK3S) || defined(CONFIG_KVM_E500V2) /* XXX Missing magic page on 44x */ r2 |= (1 << KVM_FEATURE_MAGIC_PAGE); #endif @@ -230,7 +230,7 @@ int kvm_dev_ioctl_check_extension(long ext) case KVM_CAP_
[PATCH 33/38] KVM: PPC: bookehv: remove unused code
There was some unused code in the exit code path that must have been a leftover from earlier iterations. While it did no harm, it's superfluous and thus should be removed. Signed-off-by: Alexander Graf --- v2 -> v3: - fix commit message - also remove "lwzr9, VCPU_KVM(r4)" which was as superfluous --- arch/powerpc/kvm/bookehv_interrupts.S |7 --- 1 files changed, 0 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S index 021d087..63fc5f0 100644 --- a/arch/powerpc/kvm/bookehv_interrupts.S +++ b/arch/powerpc/kvm/bookehv_interrupts.S @@ -91,10 +91,6 @@ PPC_STL r9, VCPU_TIMING_EXIT_TBU(r4) #endif - .if \flags & NEED_EMU - lwz r9, VCPU_KVM(r4) - .endif - orisr8, r6, MSR_CE@h #ifdef CONFIG_64BIT std r6, (VCPU_SHARED_MSR)(r11) @@ -112,9 +108,6 @@ * appropriate for the exception type). */ cmpwr6, r8 - .if \flags & NEED_EMU - lwz r9, KVM_LPID(r9) - .endif beq 1f mfmsr r7 .if \srr0 != SPRN_MCSRR0 && \srr0 != SPRN_CSRR0 -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 32/38] KVM: PPC: booke: add GS documentation for program interrupt
The comment for program interrupts triggered when using bookehv was misleading. Update it to mention why MSR_GS indicates that we have to inject an interrupt into the guest again, not emulate it. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/booke.c | 10 -- 1 files changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index af02d9d..7df3f3a 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -685,8 +685,14 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, case BOOKE_INTERRUPT_PROGRAM: if (vcpu->arch.shared->msr & (MSR_PR | MSR_GS)) { - /* Program traps generated by user-level software must be handled -* by the guest kernel. */ + /* +* Program traps generated by user-level software must +* be handled by the guest kernel. +* +* In GS mode, hypervisor privileged instructions trap +* on BOOKE_INTERRUPT_HV_PRIV, not here, so these are +* actual program interrupts, handled by the guest. +*/ kvmppc_core_queue_program(vcpu, vcpu->arch.fault_esr); r = RESUME_GUEST; kvmppc_account_exit(vcpu, USR_PR_INST); -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 31/38] KVM: PPC: booke: Readd debug abort code for machine check
When during guest execution we get a machine check interrupt, we don't know how to handle it yet. So let's add the error printing code back again that we dropped accidently earlier and tell user space that something went really wrong. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/booke.c |7 ++- 1 files changed, 6 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 11b0625..af02d9d 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -634,7 +634,12 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, switch (exit_nr) { case BOOKE_INTERRUPT_MACHINE_CHECK: - r = RESUME_GUEST; + printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR)); + kvmppc_dump_vcpu(vcpu); + /* For debugging, send invalid exit reason to user space */ + run->hw.hardware_exit_reason = ~1ULL << 32; + run->hw.hardware_exit_reason |= mfspr(SPRN_MCSR); + r = RESUME_HOST; break; case BOOKE_INTERRUPT_EXTERNAL: -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 30/38] KVM: PPC: bookehv: add comment about shadow_msr
For BookE HV the guest visible MSR is shared->msr and is identical to the MSR that is in use while the guest is running, because we can't trap reads from/to MSR. So shadow_msr is unused there. Indicate that with a comment. Signed-off-by: Alexander Graf --- arch/powerpc/include/asm/kvm_host.h |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index ed95f53..633d68f 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -386,6 +386,7 @@ struct kvm_vcpu_arch { #endif u32 vrsave; /* also USPRG0 */ u32 mmucr; + /* shadow_msr is unused for BookE HV */ ulong shadow_msr; ulong csrr0; ulong csrr1; -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 29/38] KVM: PPC: bookehv: disable MAS register updates early
We need to make sure that no MAS updates happen automatically while we have the guest MAS registers loaded. So move the disabling code a bit higher up so that it covers the full time we have guest values in MAS registers. The race this patch fixes should never occur, but it makes the code a bit more logical to do it this way around. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/bookehv_interrupts.S | 10 ++ 1 files changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S index 469bd3f..021d087 100644 --- a/arch/powerpc/kvm/bookehv_interrupts.S +++ b/arch/powerpc/kvm/bookehv_interrupts.S @@ -358,6 +358,7 @@ _GLOBAL(kvmppc_resume_host) mtspr SPRN_MAS4, r6 stw r5, VCPU_SHARED_MAS7_3+0(r11) mtspr SPRN_MAS6, r8 + /* Enable MAS register updates via exception */ mfspr r3, SPRN_EPCR rlwinm r3, r3, 0, ~SPRN_EPCR_DMIUH mtspr SPRN_EPCR, r3 @@ -515,6 +516,11 @@ lightweight_exit: mtspr SPRN_PID, r3 PPC_LL r11, VCPU_SHARED(r4) + /* Disable MAS register updates via exception */ + mfspr r3, SPRN_EPCR + orisr3, r3, SPRN_EPCR_DMIUH@h + mtspr SPRN_EPCR, r3 + isync /* Save host mas4 and mas6 and load guest MAS registers */ mfspr r3, SPRN_MAS4 stw r3, VCPU_HOST_MAS4(r4) @@ -538,10 +544,6 @@ lightweight_exit: lwz r5, VCPU_SHARED_MAS7_3+0(r11) mtspr SPRN_MAS6, r3 mtspr SPRN_MAS7, r5 - /* Disable MAS register updates via exception */ - mfspr r3, SPRN_EPCR - orisr3, r3, SPRN_EPCR_DMIUH@h - mtspr SPRN_EPCR, r3 /* * Host interrupt handlers may have clobbered these guest-readable -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 24/38] KVM: PPC: booke: rework rescheduling checks
Instead of checking whether we should reschedule only when we exited due to an interrupt, let's always check before entering the guest back again. This gets the target more in line with the other archs. Also while at it, generalize the whole thing so that eventually we could have a single kvmppc_prepare_to_enter function for all ppc targets that does signal and reschedule checking for us. Signed-off-by: Alexander Graf --- v2 -> v3: - check for signals earlier --- arch/powerpc/include/asm/kvm_ppc.h |2 +- arch/powerpc/kvm/book3s.c |4 +- arch/powerpc/kvm/booke.c | 72 +--- 3 files changed, 54 insertions(+), 24 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index e709975..7f0a3da 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -95,7 +95,7 @@ extern int kvmppc_core_vcpu_translate(struct kvm_vcpu *vcpu, extern void kvmppc_core_vcpu_load(struct kvm_vcpu *vcpu, int cpu); extern void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu); -extern void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu); +extern int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu); extern int kvmppc_core_pending_dec(struct kvm_vcpu *vcpu); extern void kvmppc_core_queue_program(struct kvm_vcpu *vcpu, ulong flags); extern void kvmppc_core_queue_dec(struct kvm_vcpu *vcpu); diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index 7d54f4e..c8ead7b 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -258,7 +258,7 @@ static bool clear_irqprio(struct kvm_vcpu *vcpu, unsigned int priority) return true; } -void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) +int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) { unsigned long *pending = &vcpu->arch.pending_exceptions; unsigned long old_pending = vcpu->arch.pending_exceptions; @@ -283,6 +283,8 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) /* Tell the guest about our interrupt status */ kvmppc_update_int_pending(vcpu, *pending, old_pending); + + return 0; } pfn_t kvmppc_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 9979be1..3da0e42 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -439,8 +439,9 @@ static void kvmppc_core_check_exceptions(struct kvm_vcpu *vcpu) } /* Check pending exceptions and deliver one, if possible. */ -void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) +int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) { + int r = 0; WARN_ON_ONCE(!irqs_disabled()); kvmppc_core_check_exceptions(vcpu); @@ -451,8 +452,46 @@ void kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) local_irq_disable(); kvmppc_set_exit_type(vcpu, EMULATED_MTMSRWE_EXITS); - kvmppc_core_check_exceptions(vcpu); + r = 1; }; + + return r; +} + +/* + * Common checks before entering the guest world. Call with interrupts + * disabled. + * + * returns !0 if a signal is pending and check_signal is true + */ +static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal) +{ + int r = 0; + + WARN_ON_ONCE(!irqs_disabled()); + while (true) { + if (need_resched()) { + local_irq_enable(); + cond_resched(); + local_irq_disable(); + continue; + } + + if (check_signal && signal_pending(current)) { + r = 1; + break; + } + + if (kvmppc_core_prepare_to_enter(vcpu)) { + /* interrupts got enabled in between, so we + are back at square 1 */ + continue; + } + + break; + } + + return r; } int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) @@ -470,10 +509,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) } local_irq_disable(); - - kvmppc_core_prepare_to_enter(vcpu); - - if (signal_pending(current)) { + if (kvmppc_prepare_to_enter(vcpu, true)) { kvm_run->exit_reason = KVM_EXIT_INTR; ret = -EINTR; goto out; @@ -598,25 +634,21 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, switch (exit_nr) { case BOOKE_INTERRUPT_MACHINE_CHECK: - kvm_resched(vcpu); r = RESUME_GUEST; break; case BOOKE_INTERRUPT_EXTERNAL: kvmppc_account_exit(vcpu, EXT_INTR_EXITS); - kvm_resched(vcpu); r = RESUME_GUEST; break; case BOOKE_INTERRUPT_DECREMENTER:
[PATCH 15/38] KVM: PPC: e500mc support
From: Scott Wood Add processor support for e500mc, using hardware virtualization support (GS-mode). Current issues include: - No support for external proxy (coreint) interrupt mode in the guest. Includes work by Ashish Kalra , Varun Sethi , and Liu Yu . Signed-off-by: Scott Wood Signed-off-by: Alexander Graf --- arch/powerpc/include/asm/cputable.h |6 +- arch/powerpc/include/asm/kvm.h|1 + arch/powerpc/kernel/cpu_setup_fsl_booke.S |1 + arch/powerpc/kernel/head_fsl_booke.S | 46 arch/powerpc/kvm/Kconfig | 17 ++- arch/powerpc/kvm/Makefile | 11 + arch/powerpc/kvm/e500.h | 13 +- arch/powerpc/kvm/e500_emulate.c | 24 ++- arch/powerpc/kvm/e500_tlb.c | 21 ++- arch/powerpc/kvm/e500mc.c | 342 + arch/powerpc/kvm/powerpc.c|6 +- 11 files changed, 476 insertions(+), 12 deletions(-) create mode 100644 arch/powerpc/kvm/e500mc.c diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h index 2022f2d..598cd24 100644 --- a/arch/powerpc/include/asm/cputable.h +++ b/arch/powerpc/include/asm/cputable.h @@ -168,6 +168,7 @@ extern const char *powerpc_base_platform; #define CPU_FTR_LWSYNC ASM_CONST(0x0800) #define CPU_FTR_NOEXECUTE ASM_CONST(0x1000) #define CPU_FTR_INDEXED_DCRASM_CONST(0x2000) +#define CPU_FTR_EMB_HV ASM_CONST(0x4000) /* * Add the 64-bit processor unique features in the top half of the word; @@ -386,11 +387,11 @@ extern const char *powerpc_base_platform; CPU_FTR_NODSISRALIGN | CPU_FTR_NOEXECUTE) #define CPU_FTRS_E500MC(CPU_FTR_USE_TB | CPU_FTR_NODSISRALIGN | \ CPU_FTR_L2CSR | CPU_FTR_LWSYNC | CPU_FTR_NOEXECUTE | \ - CPU_FTR_DBELL | CPU_FTR_DEBUG_LVL_EXC) + CPU_FTR_DBELL | CPU_FTR_DEBUG_LVL_EXC | CPU_FTR_EMB_HV) #define CPU_FTRS_E5500 (CPU_FTR_USE_TB | CPU_FTR_NODSISRALIGN | \ CPU_FTR_L2CSR | CPU_FTR_LWSYNC | CPU_FTR_NOEXECUTE | \ CPU_FTR_DBELL | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \ - CPU_FTR_DEBUG_LVL_EXC) + CPU_FTR_DEBUG_LVL_EXC | CPU_FTR_EMB_HV) #define CPU_FTRS_GENERIC_32(CPU_FTR_COMMON | CPU_FTR_NODSISRALIGN) /* 64-bit CPUs */ @@ -535,6 +536,7 @@ enum { #ifdef CONFIG_PPC_E500MC CPU_FTRS_E500MC & CPU_FTRS_E5500 & #endif + ~CPU_FTR_EMB_HV & /* can be removed at runtime */ CPU_FTRS_POSSIBLE, }; #endif /* __powerpc64__ */ diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index b921c3f..1bea4d8 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -277,6 +277,7 @@ struct kvm_sync_regs { #define KVM_CPU_E500V2 2 #define KVM_CPU_3S_32 3 #define KVM_CPU_3S_64 4 +#define KVM_CPU_E500MC 5 /* for KVM_CAP_SPAPR_TCE */ struct kvm_create_spapr_tce { diff --git a/arch/powerpc/kernel/cpu_setup_fsl_booke.S b/arch/powerpc/kernel/cpu_setup_fsl_booke.S index 8053db0..69fdd23 100644 --- a/arch/powerpc/kernel/cpu_setup_fsl_booke.S +++ b/arch/powerpc/kernel/cpu_setup_fsl_booke.S @@ -73,6 +73,7 @@ _GLOBAL(__setup_cpu_e500v2) mtlrr4 blr _GLOBAL(__setup_cpu_e500mc) + mr r5, r4 mflrr4 bl __e500_icache_setup bl __e500_dcache_setup diff --git a/arch/powerpc/kernel/head_fsl_booke.S b/arch/powerpc/kernel/head_fsl_booke.S index 418931f..88c0a35 100644 --- a/arch/powerpc/kernel/head_fsl_booke.S +++ b/arch/powerpc/kernel/head_fsl_booke.S @@ -380,10 +380,16 @@ interrupt_base: mtspr SPRN_SPRG_WSCRATCH0, r10 /* Save some working registers */ mfspr r10, SPRN_SPRG_THREAD stw r11, THREAD_NORMSAVE(0)(r10) +#ifdef CONFIG_KVM_BOOKE_HV +BEGIN_FTR_SECTION + mfspr r11, SPRN_SRR1 +END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV) +#endif stw r12, THREAD_NORMSAVE(1)(r10) stw r13, THREAD_NORMSAVE(2)(r10) mfcrr13 stw r13, THREAD_NORMSAVE(3)(r10) + DO_KVM BOOKE_INTERRUPT_DTLB_MISS SPRN_SRR1 mfspr r10, SPRN_DEAR /* Get faulting address */ /* If we are faulting a kernel address, we have to use the @@ -468,10 +474,16 @@ interrupt_base: mtspr SPRN_SPRG_WSCRATCH0, r10 /* Save some working registers */ mfspr r10, SPRN_SPRG_THREAD stw r11, THREAD_NORMSAVE(0)(r10) +#ifdef CONFIG_KVM_BOOKE_HV +BEGIN_FTR_SECTION + mfspr r11, SPRN_SRR1 +END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV) +#endif stw r12, THREAD_NORMSAVE(1)(r10) stw r13, THREAD_NORMSAVE(2)(r10) mfcrr13 stw r13, THREAD_NORMSAVE(3)(r10) + DO_KVM BOOKE_INTERRUPT_ITLB_MISS SPRN_SRR1 mfspr r10, SPRN_SRR0 /* Get faulting address */ /*
[PATCH 13/38] KVM: PPC: booke: category E.HV (GS-mode) support
From: Scott Wood Chips such as e500mc that implement category E.HV in Power ISA 2.06 provide hardware virtualization features, including a new MSR mode for guest state. The guest OS can perform many operations without trapping into the hypervisor, including transitions to and from guest userspace. Since we can use SRR1[GS] to reliably tell whether an exception came from guest state, instead of messing around with IVPR, we use DO_KVM similarly to book3s. Current issues include: - Machine checks from guest state are not routed to the host handler. - The guest can cause a host oops by executing an emulated instruction in a page that lacks read permission. Existing e500/4xx support has the same problem. Includes work by Ashish Kalra , Varun Sethi , and Liu Yu . Signed-off-by: Scott Wood [agraf: remove pt_regs usage] Signed-off-by: Alexander Graf --- v1 -> v2: - ESR -> GESR --- arch/powerpc/include/asm/dbell.h|1 + arch/powerpc/include/asm/kvm_asm.h |8 + arch/powerpc/include/asm/kvm_booke_hv_asm.h | 49 +++ arch/powerpc/include/asm/kvm_host.h | 19 +- arch/powerpc/include/asm/kvm_ppc.h |3 + arch/powerpc/include/asm/mmu-book3e.h |6 + arch/powerpc/include/asm/processor.h|3 + arch/powerpc/include/asm/reg.h |2 + arch/powerpc/include/asm/reg_booke.h| 34 ++ arch/powerpc/kernel/asm-offsets.c | 15 +- arch/powerpc/kernel/head_booke.h| 28 ++- arch/powerpc/kvm/Kconfig|3 + arch/powerpc/kvm/booke.c| 309 --- arch/powerpc/kvm/booke.h| 24 +- arch/powerpc/kvm/booke_emulate.c| 23 +- arch/powerpc/kvm/bookehv_interrupts.S | 587 +++ arch/powerpc/kvm/powerpc.c |5 + arch/powerpc/kvm/timing.h |6 + 18 files changed, 1058 insertions(+), 67 deletions(-) create mode 100644 arch/powerpc/include/asm/kvm_booke_hv_asm.h create mode 100644 arch/powerpc/kvm/bookehv_interrupts.S diff --git a/arch/powerpc/include/asm/dbell.h b/arch/powerpc/include/asm/dbell.h index efa74ac..d7365b0 100644 --- a/arch/powerpc/include/asm/dbell.h +++ b/arch/powerpc/include/asm/dbell.h @@ -19,6 +19,7 @@ #define PPC_DBELL_MSG_BRDCAST (0x0400) #define PPC_DBELL_TYPE(x) (((x) & 0xf) << (63-36)) +#define PPC_DBELL_LPID(x) ((x) << (63 - 49)) enum ppc_dbell { PPC_DBELL = 0, /* doorbell */ PPC_DBELL_CRIT = 1, /* critical doorbell */ diff --git a/arch/powerpc/include/asm/kvm_asm.h b/arch/powerpc/include/asm/kvm_asm.h index 7b1f0e0..0978152 100644 --- a/arch/powerpc/include/asm/kvm_asm.h +++ b/arch/powerpc/include/asm/kvm_asm.h @@ -48,6 +48,14 @@ #define BOOKE_INTERRUPT_SPE_FP_DATA 33 #define BOOKE_INTERRUPT_SPE_FP_ROUND 34 #define BOOKE_INTERRUPT_PERFORMANCE_MONITOR 35 +#define BOOKE_INTERRUPT_DOORBELL 36 +#define BOOKE_INTERRUPT_DOORBELL_CRITICAL 37 + +/* booke_hv */ +#define BOOKE_INTERRUPT_GUEST_DBELL 38 +#define BOOKE_INTERRUPT_GUEST_DBELL_CRIT 39 +#define BOOKE_INTERRUPT_HV_SYSCALL 40 +#define BOOKE_INTERRUPT_HV_PRIV 41 /* book3s */ diff --git a/arch/powerpc/include/asm/kvm_booke_hv_asm.h b/arch/powerpc/include/asm/kvm_booke_hv_asm.h new file mode 100644 index 000..30a600f --- /dev/null +++ b/arch/powerpc/include/asm/kvm_booke_hv_asm.h @@ -0,0 +1,49 @@ +/* + * Copyright 2010-2011 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. + */ + +#ifndef ASM_KVM_BOOKE_HV_ASM_H +#define ASM_KVM_BOOKE_HV_ASM_H + +#ifdef __ASSEMBLY__ + +/* + * All exceptions from guest state must go through KVM + * (except for those which are delivered directly to the guest) -- + * there are no exceptions for which we fall through directly to + * the normal host handler. + * + * Expected inputs (normal exceptions): + * SCRATCH0 = saved r10 + * r10 = thread struct + * r11 = appropriate SRR1 variant (currently used as scratch) + * r13 = saved CR + * *(r10 + THREAD_NORMSAVE(0)) = saved r11 + * *(r10 + THREAD_NORMSAVE(2)) = saved r13 + * + * Expected inputs (crit/mcheck/debug exceptions): + * appropriate SCRATCH = saved r8 + * r8 = exception level stack frame + * r9 = *(r8 + _CCR) = saved CR + * r11 = appropriate SRR1 variant (currently used as scratch) + * *(r8 + GPR9) = saved r9 + * *(r8 + GPR10) = saved r10 (r10 not yet clobbered) + * *(r8 + GPR11) = saved r11 + */ +.macro DO_KVM intno srr1 +#ifdef CONFIG_KVM_BOOKE_HV +BEGIN_FTR_SECTION + mtocrf 0x80, r11 /* check MSR[GS] without clobbering reg */ + bf 3, kvmppc_resume_\intno\()_\srr1 + b kvmppc_handler_\intno\()_\srr1 +kvmppc_resume_\intno\()_\srr1: +END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV) +#endif +.endm + +#endif /*__ASSEMB
[PATCH 21/38] KVM: PPC: make e500v2 kvm and e500mc cpu mutually exclusive
We can't run e500v2 kvm on e500mc kernels, so indicate that by making the 2 options mutually exclusive in kconfig. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/Kconfig |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig index 44a998d..f4dacb9 100644 --- a/arch/powerpc/kvm/Kconfig +++ b/arch/powerpc/kvm/Kconfig @@ -120,7 +120,7 @@ config KVM_EXIT_TIMING config KVM_E500V2 bool "KVM support for PowerPC E500v2 processors" - depends on EXPERIMENTAL && E500 + depends on EXPERIMENTAL && E500 && !PPC_E500MC select KVM select KVM_MMIO ---help--- -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 10/38] KVM: PPC: e500: Track TLB1 entries with a bitmap
From: Scott Wood Rather than invalidate everything when a TLB1 entry needs to be taken down, keep track of which host TLB1 entries are used for a given guest TLB1 entry, and invalidate just those entries. Based on code from Ashish Kalra and Liu Yu . Signed-off-by: Scott Wood Signed-off-by: Alexander Graf --- arch/powerpc/kvm/e500.h |5 +++ arch/powerpc/kvm/e500_tlb.c | 72 --- 2 files changed, 72 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index 34cef08..f4dee55 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -2,6 +2,7 @@ * Copyright (C) 2008-2011 Freescale Semiconductor, Inc. All rights reserved. * * Author: Yu Liu + * Ashish Kalra * * Description: * This file is based on arch/powerpc/kvm/44x_tlb.h and @@ -25,6 +26,7 @@ #define E500_TLB_VALID 1 #define E500_TLB_DIRTY 2 +#define E500_TLB_BITMAP 4 struct tlbe_ref { pfn_t pfn; @@ -82,6 +84,9 @@ struct kvmppc_vcpu_e500 { struct page **shared_tlb_pages; int num_shared_tlb_pages; + u64 *g2h_tlb1_map; + unsigned int *h2g_tlb1_rmap; + #ifdef CONFIG_KVM_E500 u32 pid[E500_PID_NUM]; diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c index 9925fc6..c8ce51d 100644 --- a/arch/powerpc/kvm/e500_tlb.c +++ b/arch/powerpc/kvm/e500_tlb.c @@ -2,6 +2,7 @@ * Copyright (C) 2008-2011 Freescale Semiconductor, Inc. All rights reserved. * * Author: Yu Liu, yu@freescale.com + * Ashish Kalra, ashish.ka...@freescale.com * * Description: * This file is based on arch/powerpc/kvm/44x_tlb.c, @@ -175,8 +176,28 @@ static void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, struct kvm_book3e_206_tlb_entry *gtlbe = get_entry(vcpu_e500, tlbsel, esel); - if (tlbsel == 1) { - kvmppc_e500_tlbil_all(vcpu_e500); + if (tlbsel == 1 && + vcpu_e500->gtlb_priv[1][esel].ref.flags & E500_TLB_BITMAP) { + u64 tmp = vcpu_e500->g2h_tlb1_map[esel]; + int hw_tlb_indx; + unsigned long flags; + + local_irq_save(flags); + while (tmp) { + hw_tlb_indx = __ilog2_u64(tmp & -tmp); + mtspr(SPRN_MAS0, + MAS0_TLBSEL(1) | + MAS0_ESEL(to_htlb1_esel(hw_tlb_indx))); + mtspr(SPRN_MAS1, 0); + asm volatile("tlbwe"); + vcpu_e500->h2g_tlb1_rmap[hw_tlb_indx] = 0; + tmp &= tmp - 1; + } + mb(); + vcpu_e500->g2h_tlb1_map[esel] = 0; + vcpu_e500->gtlb_priv[1][esel].ref.flags &= ~E500_TLB_BITMAP; + local_irq_restore(flags); + return; } @@ -282,6 +303,16 @@ static inline void kvmppc_e500_ref_release(struct tlbe_ref *ref) } } +static void clear_tlb1_bitmap(struct kvmppc_vcpu_e500 *vcpu_e500) +{ + if (vcpu_e500->g2h_tlb1_map) + memset(vcpu_e500->g2h_tlb1_map, + sizeof(u64) * vcpu_e500->gtlb_params[1].entries, 0); + if (vcpu_e500->h2g_tlb1_rmap) + memset(vcpu_e500->h2g_tlb1_rmap, + sizeof(unsigned int) * host_tlb_params[1].entries, 0); +} + static void clear_tlb_privs(struct kvmppc_vcpu_e500 *vcpu_e500) { int tlbsel = 0; @@ -511,7 +542,7 @@ static void kvmppc_e500_tlb0_map(struct kvmppc_vcpu_e500 *vcpu_e500, /* XXX for both one-one and one-to-many , for now use TLB1 */ static int kvmppc_e500_tlb1_map(struct kvmppc_vcpu_e500 *vcpu_e500, u64 gvaddr, gfn_t gfn, struct kvm_book3e_206_tlb_entry *gtlbe, - struct kvm_book3e_206_tlb_entry *stlbe) + struct kvm_book3e_206_tlb_entry *stlbe, int esel) { struct tlbe_ref *ref; unsigned int victim; @@ -524,6 +555,14 @@ static int kvmppc_e500_tlb1_map(struct kvmppc_vcpu_e500 *vcpu_e500, ref = &vcpu_e500->tlb_refs[1][victim]; kvmppc_e500_shadow_map(vcpu_e500, gvaddr, gfn, gtlbe, 1, stlbe, ref); + vcpu_e500->g2h_tlb1_map[esel] |= (u64)1 << victim; + vcpu_e500->gtlb_priv[1][esel].ref.flags |= E500_TLB_BITMAP; + if (vcpu_e500->h2g_tlb1_rmap[victim]) { + unsigned int idx = vcpu_e500->h2g_tlb1_rmap[victim]; + vcpu_e500->g2h_tlb1_map[idx] &= ~(1ULL << victim); + } + vcpu_e500->h2g_tlb1_rmap[victim] = esel; + return victim; } @@ -728,7 +767,7 @@ int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu) * are mapped on the fly. */ stlbsel = 1; sesel = kvmppc_e500_tlb1_map(vcpu_e500, eaddr, - raddr >> PAGE_SHIFT, gtlbe, &stlbe); + raddr >> PAGE_SHIFT, gtlbe, &stlbe, esel);
[PATCH 36/38] KVM: PPC: booke: expose good state on irq reinject
When reinjecting an interrupt into the host interrupt handler after we're back in host kernel land, we need to tell the kernel where the interrupt happened. We can't tell it that we were in guest state, because that might lead to random code walking host addresses. So instead, we tell it that we came from the interrupt reinject code. This helps getting reasonable numbers out of perf. Signed-off-by: Alexander Graf --- v2 -> v3: - actually sync host state - no need for vcpu in sync --- arch/powerpc/kvm/booke.c | 56 + 1 files changed, 41 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index ee39c8a..488936b 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -595,37 +595,63 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu) } } -/** - * kvmppc_handle_exit - * - * Return value is in the form (errcode<<2 | RESUME_FLAG_HOST | RESUME_FLAG_NV) - */ -int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, - unsigned int exit_nr) +static void kvmppc_fill_pt_regs(struct pt_regs *regs) { - int r = RESUME_HOST; + ulong r1, ip, msr, lr; + + asm("mr %0, 1" : "=r"(r1)); + asm("mflr %0" : "=r"(lr)); + asm("mfmsr %0" : "=r"(msr)); + asm("bl 1f; 1: mflr %0" : "=r"(ip)); + + memset(regs, 0, sizeof(*regs)); + regs->gpr[1] = r1; + regs->nip = ip; + regs->msr = msr; + regs->link = lr; +} - /* update before a new last_exit_type is rewritten */ - kvmppc_update_timing_stats(vcpu); +static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu, +unsigned int exit_nr) +{ + struct pt_regs regs; switch (exit_nr) { case BOOKE_INTERRUPT_EXTERNAL: - do_IRQ(current->thread.regs); + kvmppc_fill_pt_regs(®s); + do_IRQ(®s); break; - case BOOKE_INTERRUPT_DECREMENTER: - timer_interrupt(current->thread.regs); + kvmppc_fill_pt_regs(®s); + timer_interrupt(®s); break; - #if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_BOOK3E_64) case BOOKE_INTERRUPT_DOORBELL: - doorbell_exception(current->thread.regs); + kvmppc_fill_pt_regs(®s); + doorbell_exception(®s); break; #endif case BOOKE_INTERRUPT_MACHINE_CHECK: /* FIXME */ break; } +} + +/** + * kvmppc_handle_exit + * + * Return value is in the form (errcode<<2 | RESUME_FLAG_HOST | RESUME_FLAG_NV) + */ +int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, + unsigned int exit_nr) +{ + int r = RESUME_HOST; + + /* update before a new last_exit_type is rewritten */ + kvmppc_update_timing_stats(vcpu); + + /* restart interrupts if they were meant for the host */ + kvmppc_restart_interrupt(vcpu, exit_nr); local_irq_enable(); -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 34/38] KVM: PPC: e500: fix typo in tlb code
The tlbncfg registers should be populated with their respective TLB's values. Fix the obvious typo. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/e500_tlb.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c index 279e10a..e05232b 100644 --- a/arch/powerpc/kvm/e500_tlb.c +++ b/arch/powerpc/kvm/e500_tlb.c @@ -1268,8 +1268,8 @@ int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500) vcpu->arch.tlbcfg[1] = mfspr(SPRN_TLB1CFG) & ~(TLBnCFG_N_ENTRY | TLBnCFG_ASSOC); - vcpu->arch.tlbcfg[0] |= vcpu_e500->gtlb_params[1].entries; - vcpu->arch.tlbcfg[0] |= + vcpu->arch.tlbcfg[1] |= vcpu_e500->gtlb_params[1].entries; + vcpu->arch.tlbcfg[1] |= vcpu_e500->gtlb_params[1].ways << TLBnCFG_ASSOC_SHIFT; return 0; -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 38/38] KVM: PPC: Booke: only prepare to enter when we enter
So far, we've always called prepare_to_enter even when all we did was return to the host. This patch changes that semantic to only call prepare_to_enter when we actually want to get back into the guest. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/booke.c | 18 ++ 1 files changed, 10 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 8e8aa4c..9f27258 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -464,7 +464,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) * * returns !0 if a signal is pending and check_signal is true */ -static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal) +static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu) { int r = 0; @@ -477,7 +477,7 @@ static int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu, bool check_signal) continue; } - if (check_signal && signal_pending(current)) { + if (signal_pending(current)) { r = 1; break; } @@ -509,7 +509,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) } local_irq_disable(); - if (kvmppc_prepare_to_enter(vcpu, true)) { + if (kvmppc_prepare_to_enter(vcpu)) { kvm_run->exit_reason = KVM_EXIT_INTR; ret = -EINTR; goto out; @@ -946,11 +946,13 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, * To avoid clobbering exit_reason, only check for signals if we * aren't already exiting to userspace for some other reason. */ - local_irq_disable(); - if (kvmppc_prepare_to_enter(vcpu, !(r & RESUME_HOST))) { - run->exit_reason = KVM_EXIT_INTR; - r = (-EINTR << 2) | RESUME_HOST | (r & RESUME_FLAG_NV); - kvmppc_account_exit(vcpu, SIGNAL_EXITS); + if (!(r & RESUME_HOST)) { + local_irq_disable(); + if (kvmppc_prepare_to_enter(vcpu)) { + run->exit_reason = KVM_EXIT_INTR; + r = (-EINTR << 2) | RESUME_HOST | (r & RESUME_FLAG_NV); + kvmppc_account_exit(vcpu, SIGNAL_EXITS); + } } return r; -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 37/38] KVM: PPC: booke: Reinject performance monitor interrupts
When we get a performance monitor interrupt, we need to make sure that the host receives it. So reinject it like we reinject the other host destined interrupts. Signed-off-by: Alexander Graf --- v2 -> v3: - call regs sync directly --- arch/powerpc/include/asm/hw_irq.h |1 + arch/powerpc/kvm/booke.c |4 2 files changed, 5 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index bb712c9..904e66c 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -12,6 +12,7 @@ #include extern void timer_interrupt(struct pt_regs *); +extern void performance_monitor_exception(struct pt_regs *regs); #ifdef CONFIG_PPC64 #include diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 488936b..8e8aa4c 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -634,6 +634,10 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu, case BOOKE_INTERRUPT_MACHINE_CHECK: /* FIXME */ break; + case BOOKE_INTERRUPT_PERFORMANCE_MONITOR: + kvmppc_fill_pt_regs(®s); + performance_monitor_exception(®s); + break; } } -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 35/38] KVM: PPC: booke: Support perfmon interrupts
When during guest context we get a performance monitor interrupt, we currently bail out and oops. Let's route it to its correct handler instead. Signed-off-by: Alexander Graf --- arch/powerpc/kvm/booke.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 7df3f3a..ee39c8a 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -679,6 +679,10 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, r = RESUME_GUEST; break; + case BOOKE_INTERRUPT_PERFORMANCE_MONITOR: + r = RESUME_GUEST; + break; + case BOOKE_INTERRUPT_HV_PRIV: r = emulation_exit(run, vcpu); break; -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 20/21] Introduce struct eeh_stats for EEH
> > > +struct eeh_stats { > > + unsigned int no_device; /* PCI device not found */ > ... > > + "no device =%d\n" > ... > > Use %u (for all the stats), you really don't want negative > values printed. Yes. > I've NFI how long wrapping these counters might take! > If it is feasable (maybe much above 100Hz) then you > need 64bit counters. > I think it's better to use "u64" here ;-) > David > Thanks, Gavin ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH] KVM: PPC: Don't sync timebase when inside KVM
When we know that we're running inside of a KVM guest, we don't have to worry about synchronizing timebases between different CPUs, since the host already took care of that. This fixes CPU overcommit scenarios where vCPUs could hang forever trying to sync each other while not being scheduled. Reported-by: Stuart Yoder Signed-off-by: Alexander Graf --- arch/powerpc/kernel/smp.c |6 -- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index 46695fe..670b453 100644 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -49,6 +49,8 @@ #ifdef CONFIG_PPC64 #include #endif +#include +#include #ifdef DEBUG #include @@ -541,7 +543,7 @@ int __cpuinit __cpu_up(unsigned int cpu) DBG("Processor %u found.\n", cpu); - if (smp_ops->give_timebase) + if (!kvm_para_available() && smp_ops->give_timebase) smp_ops->give_timebase(); /* Wait until cpu puts itself in the online map */ @@ -626,7 +628,7 @@ void __devinit start_secondary(void *unused) if (smp_ops->setup_cpu) smp_ops->setup_cpu(cpu); - if (smp_ops->take_timebase) + if (!kvm_para_available() && smp_ops->take_timebase) smp_ops->take_timebase(); secondary_cpu_time_init(); -- 1.6.0.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 20/21] Introduce struct eeh_stats for EEH
With the original EEH implementation, the EEH global statistics are maintained by individual global variables. That makes the code a little hard to maintain. The patch introduces extra struct eeh_stats for the EEH global statistics so that it can be maintained in collective fashion. It's the rework on the corresponding v5 patch. According to the comments from David Laight, the EEH global statistics have been changed for a litte bit so that they have fixed-type of "u64". Also, the format used to print them has been changed to "%llu" based on David's suggestion. Signed-off-by: Gavin Shan --- arch/powerpc/platforms/pseries/eeh.c | 65 -- 1 files changed, 38 insertions(+), 27 deletions(-) diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c index 9b1fd0c..753ec8a 100644 --- a/arch/powerpc/platforms/pseries/eeh.c +++ b/arch/powerpc/platforms/pseries/eeh.c @@ -102,14 +102,22 @@ static DEFINE_RAW_SPINLOCK(confirm_error_lock); #define EEH_PCI_REGS_LOG_LEN 4096 static unsigned char pci_regs_buf[EEH_PCI_REGS_LOG_LEN]; -/* System monitoring statistics */ -static unsigned long no_device; -static unsigned long no_dn; -static unsigned long no_cfg_addr; -static unsigned long ignored_check; -static unsigned long total_mmio_ffs; -static unsigned long false_positives; -static unsigned long slot_resets; +/* + * The struct is used to maintain the EEH global statistic + * information. Besides, the EEH global statistics will be + * exported to user space through procfs + */ +struct eeh_stats { + u64 no_device; /* PCI device not found */ + u64 no_dn; /* OF node not found*/ + u64 no_cfg_addr;/* Config address not found */ + u64 ignored_check; /* EEH check skipped*/ + u64 total_mmio_ffs; /* Total EEH checks */ + u64 false_positives;/* Unnecessary EEH checks */ + u64 slot_resets;/* PE reset */ +}; + +static struct eeh_stats eeh_stats; #define IS_BRIDGE(class_code) (((class_code)<<16) == PCI_BASE_CLASS_BRIDGE) @@ -392,13 +400,13 @@ int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev) int rc = 0; const char *location; - total_mmio_ffs++; + eeh_stats.total_mmio_ffs++; if (!eeh_subsystem_enabled) return 0; if (!dn) { - no_dn++; + eeh_stats.no_dn++; return 0; } dn = eeh_find_device_pe(dn); @@ -407,14 +415,14 @@ int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev) /* Access to IO BARs might get this far and still not want checking. */ if (!(edev->mode & EEH_MODE_SUPPORTED) || edev->mode & EEH_MODE_NOCHECK) { - ignored_check++; + eeh_stats.ignored_check++; pr_debug("EEH: Ignored check (%x) for %s %s\n", edev->mode, eeh_pci_name(dev), dn->full_name); return 0; } if (!edev->config_addr && !edev->pe_config_addr) { - no_cfg_addr++; + eeh_stats.no_cfg_addr++; return 0; } @@ -460,13 +468,13 @@ int eeh_dn_check_failure(struct device_node *dn, struct pci_dev *dev) (ret == EEH_STATE_NOT_SUPPORT) || (ret & (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE)) == (EEH_STATE_MMIO_ACTIVE | EEH_STATE_DMA_ACTIVE)) { - false_positives++; + eeh_stats.false_positives++; edev->false_positives ++; rc = 0; goto dn_unlock; } - slot_resets++; + eeh_stats.slot_resets++; /* Avoid repeated reports of this failure, including problems * with other functions on this device, and functions under @@ -513,7 +521,7 @@ unsigned long eeh_check_failure(const volatile void __iomem *token, unsigned lon addr = eeh_token_to_phys((unsigned long __force) token); dev = pci_addr_cache_get_device(addr); if (!dev) { - no_device++; + eeh_stats.no_device++; return val; } @@ -1174,21 +1182,24 @@ static int proc_eeh_show(struct seq_file *m, void *v) { if (0 == eeh_subsystem_enabled) { seq_printf(m, "EEH Subsystem is globally disabled\n"); - seq_printf(m, "eeh_total_mmio_ffs=%ld\n", total_mmio_ffs); + seq_printf(m, "eeh_total_mmio_ffs=%llu\n", eeh_stats.total_mmio_ffs); } else { seq_printf(m, "EEH Subsystem is enabled\n"); seq_printf(m, - "no device=%ld\n" - "no device node=%ld\n" - "no config address=%ld\n" - "check not wanted=%ld\n" - "e
RE: [PATCH V3] fsl-sata: add support for interrupt coalsecing feature
Hi Jeff, Do you plan to apply it to upstream, or any suggestions? Thanks. > -Original Message- > From: linux-ide-ow...@vger.kernel.org [mailto:linux-ide- > ow...@vger.kernel.org] On Behalf Of Li Yang > Sent: Wednesday, February 15, 2012 3:51 PM > To: Liu Qiang-B32616 > Cc: jgar...@pobox.com; linux-...@vger.kernel.org; linux- > ker...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org > Subject: Re: [PATCH V3] fsl-sata: add support for interrupt coalsecing > feature > > On Wed, Feb 15, 2012 at 3:40 PM, Qiang Liu > wrote: > > Adds support for interrupt coalescing feature to reduce interrupt > events. > > Provides a mechanism of adjusting coalescing count and timeout tick by > > sysfs at runtime, so that tradeoff of latency and CPU load can be made > > depending on different applications. > > > > Signed-off-by: Qiang Liu > > Acked-by: Li Yang > > - Leo > -- > To unsubscribe from this list: send the line "unsubscribe linux-ide" in > the body of a message to majord...@vger.kernel.org More majordomo info at > http://vger.kernel.org/majordomo-info.html ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v5 00/21] EEH reorganization
Hi Ben, Could you pls take a look on this when you have time? Thanks, Gavin > This series of patches is going to reorganize EEH so that it could support > multiple platforms in future. The requirements were raised from the aspects. > > * The original EEH implementation only support pSeries platform, which > would be regarded as guest system. Platform powernv is coming and EEH > needs to be supported on powernv as well. > * Different platforms might be running based on variable > firmware.Further > more, the firmware would supply different EEH interfaces to kernel. > Therefore, we have to do necessary abstraction on current EEH > implementation. > > In order to accomodate the requirements, the series of patches have > reorganized > current EEH implementation. > > * The original implementation looks not clean enough. Necessary cleanup > will be done in some of the patches. > * struct eeh_ops has been introduced so that EEH core components and > platform > dependent implementation could be split up. That make it possible for > EEH > to be supported on multiple platforms. > * struct eeh_dev has been introduced to replace struct pci_dn so that > EEH module > works independently as much as possible. > * EEH global statistics will be maintained in a collective fashion. > > v1 -> v2: > > * If possible, to add "eeh_" prefix for function names. > * The format of leading function comments won't be changed in order not > to > break kernel document automatic generation (e.g. by "make pdfdocs"). > * The name of local variables won't be changed if there're no explicit > reasons. > * Represent the PE's state in bitmap fasion. > * Some function names have been adjusted so that they look shorter and > meaningful. > * Platform operation name has been changed to "pseries". > * Merge those patches for cleanup if possible. > * The line length is kept as appropriately short if possible. > * Fixup on alignment & spacing issues. > > v2 -> v3: > * Split cleanup patch into 2: one for comment cleanup and another one > for > renaming function names. > * Try to use pr_warning/pr_info/pr_debug instead of printk() function > call. > * Function names are adjusted a little bit so that they looks more > meaningful > according to comments from Michael/Ben. > * Useful comment has been kept according to Michael's comments. > * struct eeh_ops::set_eeh has been changed to eeh_ops::set_option. > * struct eeh_ops::name has been changed to "char *". > * Remove file name from the source file. > * Copyright (C) format has been changed since "(C)" isn't encouraged to > use. > * The header files included in the source file have been sorted > alphabetically. > * eeh_platform_init() has been replaced by eeh_pseries_init() to avoid > duplicate > functions when kernel supports multiple platforms. > * "F/W" has been changed to "Firmware". > * The maximal wait time to retrieve PE's state has been covered by > macro. > * It also include changes according to the minor comments from Michael. > > v3 -> v4: > * Fix some typo included in the commit messages. > * Reduce code nesting according to Ram's suggestions. > * Addtinal pr_warning on failure of configuring bridges. > > v4 -> v5: > * OF node and PCI device are tracing the corresponding eeh device. > That has been changed to "struct eeh_dev *" instead of the original > "void *". > * The conversion between OF node, PCI device, eeh device is changed > to inline functions instead of the original macros. > * The "struct eeh_stats" has been moved from eeh.h to eeh.c. Besides, > the individual members of the struct have been changed to fixed-type > "unsigned int". > > > The series of patches (v5) has been verified on Firebird-L machine. In order > to carry out > the test, you have to install IBM Power Tools from IBM internal yum source. > Following > command is used to force EEH check on ethernet interface, which could be > recovered eventually > by EEH and device driver successfully. You could keep pinging to the blade > before issuing > the following command to force EEH. You should see the network interface > can't be reached for > a moment and everything will be recovered couple of seconds after the forced > EEH error. At the > same time, you should see EEH error log out of system console. > > * errinjct eeh -v -f 0 -p U78AE.001.WZS00M9-P1-C18-L1-T2 -a 0x0 -m 0x0 > > - > > arch/powerpc/include/asm/device.h|3 + > arch/powerpc/include/asm/eeh.h | 134 +++- > arch/powerpc/include/asm/eeh_event.h | 33 +- > arch/powerpc/include/asm/ppc-pci.h | 89 +-- > arch/powerpc/kernel/of_p