[PATCH 6/6] powerpc/eeh: Aux PE data for error log

2014-07-15 Thread Gavin Shan
The patch allows a PE (struct eeh_pe) instance to have auxiliary data, whose size is configurable per platform. For PowerNV, the auxiliary data will be used to cache PHB diag-data for that PE (frozen PE or fenced PHB). In turn, we can retrieve the diag-data at any later point. It's useful for

[PATCH 2/6] powerpc/eeh: Selectively enable IO for error log

2014-07-15 Thread Gavin Shan
According to the experiment I did, PCI config access is blocked on P7IOC frozen PE by hardware, but PHB3 doesn't do that. That means we always get 0xFF's while dumping PCI config space of the frozen PE on P7IOC. We don't have the problem on PHB3. So we have to enable I/O prior to collecting error

[PATCH 5/6] powerpc/eeh: Make diag-data not endian dependent

2014-07-15 Thread Gavin Shan
It's a follow-up to commit ddf0322a ("powerpc/powernv: Fix endianness problems in EEH"). The patch helps to get non-endian-dependent diag-data. Cc: Guo Chao Signed-off-by: Gavin Shan --- arch/powerpc/include/asm/opal.h | 128 +++--- arch/powerpc/platforms/powernv/
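As a rough illustration of what "non-endian-dependent" can mean here, the sketch below decodes a 64-bit field byte by byte so that the result is the same on big- and little-endian hosts. It assumes the data arrives in big-endian order; `read_be64` is a hypothetical helper for this note, not a function from the patch (kernel code would normally use the `be64_to_cpu()` family instead).

```c
#include <stdint.h>

/*
 * Hypothetical helper: decode a big-endian 64-bit field from raw bytes.
 * The loop never reinterprets memory, so it gives the same answer on
 * big- and little-endian hosts.
 */
static uint64_t read_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];	/* most significant byte first */
	return v;
}
```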

[PATCH 0/6] EEH Cleanup

2014-07-15 Thread Gavin Shan
The patchset is EEH cleanup and expected to be merged during 3.17 window. The patchset is expected to be applied after: |EEH support for guest |2 more bug fixes for EEH support for guest |M64 related EEH changes |2 bug fixes from Mike Qiu | +-> The current pa

[PATCH 3/6] powerpc/eeh: Reduce lines of log dump

2014-07-15 Thread Gavin Shan
The patch prints 4 PCIE or AER config registers each line, which is part of the EEH log so that it looks a bit more compact. Suggested-by: Benjamin Herrenschmidt Signed-off-by: Gavin Shan --- arch/powerpc/kernel/eeh.c | 37 +++-- 1 file changed, 31 insertions(+),
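A minimal sketch of the "four registers per line" layout, written as plain user-space C. The function name `dump_regs` and the `%08x` field format are assumptions made for illustration; the patch's actual output format is not shown in the excerpt.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical formatter: write `count` 32-bit values into `buf`,
 * four per line. Returns the number of characters stored, not
 * counting the terminating NUL.
 */
static size_t dump_regs(char *buf, size_t len,
			const uint32_t *regs, size_t count)
{
	size_t n = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (size_t i = 0; i < count; i++) {
		/* End the line after every fourth value and after the last. */
		const char *sep = (i % 4 == 3 || i == count - 1) ? "\n" : " ";
		int w = snprintf(buf + n, len - n, "%08x%s",
				 (unsigned int)regs[i], sep);

		if (w < 0 || (size_t)w >= len - n) {
			buf[n] = '\0';	/* drop the partial entry */
			break;
		}
		n += (size_t)w;
	}
	return n;
}
```

Five values therefore come out as one full line of four followed by a short line holding the fifth.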

[PATCH 1/6] powerpc/eeh: Refactor EEH flag accessors

2014-07-15 Thread Gavin Shan
There are multiple global EEH flags. Almost each flag has its own accessor, which doesn't make sense. The patch refactors EEH flag accessors so that they look unified: eeh_add_flag(): Add EEH flag eeh_clear_flag(): Clear EEH flag eeh_has_flag(): Check if one specific flag has been set
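The three unified accessors named above can be sketched as thin wrappers around a single flag word. The accessor names come from the patch description; the flag values and the `eeh_subsystem_flags` variable are assumptions for this example and may not match the kernel source.

```c
#include <stdbool.h>

/* Hypothetical flag bits, chosen only for illustration. */
#define EEH_ENABLED		0x01
#define EEH_FORCE_DISABLED	0x02

/* Assumed: one global word holds every EEH flag. */
static int eeh_subsystem_flags;

static inline void eeh_add_flag(int flag)
{
	eeh_subsystem_flags |= flag;
}

static inline void eeh_clear_flag(int flag)
{
	eeh_subsystem_flags &= ~flag;
}

static inline bool eeh_has_flag(int flag)
{
	return (eeh_subsystem_flags & flag) != 0;
}
```

With this shape, adding a new flag needs only a new bit definition rather than a new accessor.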

[PATCH 4/6] powerpc/eeh: Replace pr_warning() with pr_warn()

2014-07-15 Thread Gavin Shan
pr_warn() is equal to pr_warning(), but the former is a bit more formal. The patch replaces pr_warning() with pr_warn(). Signed-off-by: Gavin Shan --- arch/powerpc/kernel/eeh.c| 16 arch/powerpc/kernel/eeh_cache.c | 7 --- arch/powerpc/kerne

Re: OF_DYNAMIC node lifecycle

2014-07-15 Thread Grant Likely
I've got another question about powerpc reconfiguration. I was looking at the dlpar_configure_connector() function in dlpar.c. I see that the function has the ability to process multiple nodes with additional sibling and child nodes. It appears to link them into a detached tree structure, and the f

Re: [PATCH v2] powerpc/pseries: dynamically added OF nodes need to call of_node_init

2014-07-15 Thread Grant Likely
On Thu, Jul 10, 2014 at 1:59 PM, Nathan Fontenot wrote: > On 07/10/2014 01:50 PM, Tyrel Datwyler wrote: >> Commit 75b57ecf9 refactored device tree nodes to use kobjects such that they >> can be exposed via /sysfs. A secondary commit 0829f6d1f furthered this rework >> by moving the kobject initializ

[PATCH v2 3/3] powerpc/pseries: Switch pseries drivers to use machine_xxx_initcall()

2014-07-15 Thread Michael Ellerman
A lot of the code in platforms/pseries is using non-machine initcalls. That means if a kernel built with pseries support runs on another platform, for example powernv, the initcalls will still run. Most of these cases are OK, though sometimes only due to luck. Some were having more effect: * hca
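The idea behind a machine-specific initcall can be sketched in user-space C as a wrapper that checks the running platform before calling the real init function. Everything below (`current_machine`, `running_on`, `DEFINE_MACHINE_INITCALL`, `io_setup`) is hypothetical and only mimics the behaviour described above; it is not the kernel's macro.

```c
#include <string.h>

/* Hypothetical: name of the platform the kernel booted on. */
static const char *current_machine = "powernv";

static int running_on(const char *name)
{
	return strcmp(current_machine, name) == 0;
}

/*
 * Hypothetical wrapper generator: mach##_##fn() calls fn() only when
 * the running platform matches, and is a no-op otherwise.
 */
#define DEFINE_MACHINE_INITCALL(mach, fn)	\
	static int mach##_##fn(void)		\
	{					\
		if (running_on(#mach))		\
			return fn();		\
		return 0;			\
	}

static int io_setup_calls;

/* Stand-in for a pseries-only init function. */
static int io_setup(void)
{
	io_setup_calls++;
	return 0;
}

DEFINE_MACHINE_INITCALL(pseries, io_setup)
```

Calling `pseries_io_setup()` while `current_machine` is "powernv" leaves `io_setup_calls` untouched, which is the point of the conversion: the init code no longer depends on luck to be harmless on other platforms.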

Re: [PATCH] ppc/xmon: use isspace/isxdigit/isalnum from linux/ctype.h

2014-07-15 Thread Benjamin Herrenschmidt
On Tue, 2014-07-15 at 13:43 +0200, Vincent Bernat wrote: > isxdigit() macro definition is the same. > > isalnum() from linux/ctype.h will accept additional latin non-ASCII > characters. This is harmless since this macro is used in scanhex() which > parses user input. > > isspace() from linux/ctyp

Re: [PATCH] powerpc: subpage_protect: Increase the array size to take care of 64TB

2014-07-15 Thread Aneesh Kumar K.V
"Aneesh Kumar K.V" writes: > We now support TASK_SIZE of 16TB, hence the array should be 8. Correction: that should read 64TB, not 16TB. > > Fixes the below crash: -aneesh

Re: bit fields && data tearing

2014-07-15 Thread Richard Henderson
On 07/15/2014 06:54 AM, Peter Hurley wrote: > > Jonathan Corbet wrote a LWN article about this back in 2012: > http://lwn.net/Articles/478657/ > > I guess it's fixed in gcc 4.8, but too bad there's not a workaround for > earlier compilers (akin to -fstrict_volatile_bitfields without requiring > t

[PATCH 1/2] powerpc: thp: don't recompute vsid and ssize in loop on invalidate

2014-07-15 Thread Aneesh Kumar K.V
The segment identifier and segment size remain the same throughout the loop, so we can compute them outside it. We also change the hugepage_invalidate interface so that we can use it in a later patch. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/machdep.h | 6 +++--- arch/powerpc/mm/ha

[PATCH] powerpc: subpage_protect: Increase the array size to take care of 64TB

2014-07-15 Thread Aneesh Kumar K.V
We now support TASK_SIZE of 16TB, hence the array should be 8. Fixes the below crash: Unable to handle kernel paging request for data at address 0x000100bd Faulting instruction address: 0xc004f914 cpu 0x13: Vector: 300 (Data Access) at [c00fea75fa90] pc: c004f914: .sys_sub

[PATCH] powerpc: thp: Add write barrier after updating the valid bit

2014-07-15 Thread Aneesh Kumar K.V
With hugepages, we store the hpte valid information in the pte page whose address is stored in the second half of the PMD. Use a write barrier to make sure that clearing pmd busy bit and updating hpte valid info are ordered properly. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/p
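The ordering requirement can be sketched with C11 atomics: publish the valid information first, issue a release fence, and only then clear the busy bit. This is a user-space analogy under stated assumptions. `struct fake_pmd`, `PMD_BUSY` and `mark_valid_and_unbusy` are invented for the example, and the release fence stands in for the kernel's write barrier.

```c
#include <stdatomic.h>
#include <stdint.h>

#define PMD_BUSY 0x1u	/* hypothetical busy bit */

/* Hypothetical stand-in for the PMD plus its hpte valid information. */
struct fake_pmd {
	_Atomic uint32_t flags;		/* carries the busy bit */
	uint8_t hpte_valid[16];		/* one valid marker per slot */
};

/* `index` must be below 16 in this sketch. */
static void mark_valid_and_unbusy(struct fake_pmd *pmd, unsigned int index)
{
	/* 1. Update the valid information. */
	pmd->hpte_valid[index] = 1;

	/* 2. Write barrier: the store above must become visible first. */
	atomic_thread_fence(memory_order_release);

	/* 3. Only now drop the busy bit. */
	atomic_fetch_and_explicit(&pmd->flags, ~PMD_BUSY,
				  memory_order_relaxed);
}
```

A reader that observes the busy bit cleared (with a matching acquire on its side) is then guaranteed to see the updated valid marker as well.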

[PATCH 2/2] powerpc: thp: invalidate old 64K based hash page mapping before insert

2014-07-15 Thread Aneesh Kumar K.V
If we change the base page size of the segment, either via sub_page_protect or via remap_4k_pfn, we do a demote_segment, which doesn't flush the hash table entries. We do that when inserting a new hash pte by checking the _PAGE_COMBO flag. We failed to do that when inserting a hash entry for a new 16MB page. A

Re: [PATCH v3] arm64, ia64, ppc, s390, sh, tile, um, x86, mm: Remove default gate area

2014-07-15 Thread Andy Lutomirski
On Sun, Jul 13, 2014 at 1:01 PM, Andy Lutomirski wrote: > The core mm code will provide a default gate area based on > FIXADDR_USER_START and FIXADDR_USER_END if > !defined(__HAVE_ARCH_GATE_AREA) && defined(AT_SYSINFO_EHDR). > > This default is only useful for ia64. arm64, ppc, s390, sh, tile, >

Re: bit fields && data tearing

2014-07-15 Thread Peter Hurley
On 07/13/2014 06:25 PM, Benjamin Herrenschmidt wrote: On Sun, 2014-07-13 at 09:15 -0400, Peter Hurley wrote: I'm not sure I understand your point here, Ben. Suppose that two different spinlocks are used independently to protect r-m-w access to adjacent data. In Oleg's example, suppose spinlock

[PATCH 3/3] powerpc/pseries: Switch pseries drivers to use machine_xxx_initcall()

2014-07-15 Thread Michael Ellerman
A lot of the code in platforms/pseries is using non-machine initcalls. That means if a kernel built with pseries support runs on another platform, for example powernv, the initcalls will still run. Most of these cases are OK, though sometimes only due to luck. Some were having more effect: * hca

[PATCH 2/3] powerpc/powernv: Switch powernv drivers to use machine_xxx_initcall()

2014-07-15 Thread Michael Ellerman
A lot of the code in platforms/powernv is using non-machine initcalls. That means if a kernel built with powernv support runs on another platform, for example pseries, the initcalls will still run. That is usually OK, because the initcalls will check for something in the device tree or elsewhere b

[PATCH 1/3] powerpc: Add machine_early_initcall()

2014-07-15 Thread Michael Ellerman
Signed-off-by: Michael Ellerman --- arch/powerpc/include/asm/machdep.h | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h index f92b0b54e921..5c7e74ddee4c 100644 --- a/arch/powerpc/include/asm/machdep.h +++ b/arch/powerpc/i

[PATCH] ppc/xmon: use isspace/isxdigit/isalnum from linux/ctype.h

2014-07-15 Thread Vincent Bernat
isxdigit() macro definition is the same. isalnum() from linux/ctype.h will accept additional latin non-ASCII characters. This is harmless since this macro is used in scanhex() which parses user input. isspace() from linux/ctype.h will accept vertical tab and form feed but not NULL. The use of thi
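The behavioural difference described here can be checked with a small comparison. The sketch uses the user-space `<ctype.h>` `isspace()` in the default "C" locale as a stand-in for the `linux/ctype.h` version, which is an assumption; `xmon_isspace` is the custom macro as quoted in the review thread, with parentheses added around the argument, and `isspace_differs` is a hypothetical helper.

```c
#include <ctype.h>

/* The custom definition being removed, parenthesized for safety. */
#define xmon_isspace(c) \
	((c) == ' ' || (c) == '\t' || (c) == 10 || (c) == 13 || (c) == 0)

/* Returns 1 when the two definitions disagree on character c. */
static int isspace_differs(int c)
{
	return (xmon_isspace(c) != 0) != (isspace(c) != 0);
}
```

In the "C" locale the two disagree on exactly the cases the patch description names: NUL (accepted only by the old macro) and vertical tab and form feed (accepted only by the library version).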

Re: [PATCH 1/6] powerpc/powernv: Enable M64 aperatus for PHB3

2014-07-15 Thread Gavin Shan
On Tue, Jul 15, 2014 at 10:55:25AM +0800, Wei Yang wrote: >On Thu, Jul 10, 2014 at 09:53:41PM +0800, Guo Chao wrote: >>This patch enable M64 aperatus for PHB3. >> >>We already had platform hook (ppc_md.pcibios_window_alignment) to affect >>the PCI resource assignment done in PCI core so that each P

[PATCH 3/3] powerpc: Remove misleading DISABLE_INTS

2014-07-15 Thread Michael Ellerman
DISABLE_INTS has a long and storied history, but for some time now it has not actually disabled interrupts. For the open-coded exception handlers, just stop using it, instead call RECONCILE_IRQ_STATE directly. This has the benefit of removing a level of indirection, and making it clear that r10 &

[PATCH 2/3] powerpc: Document register clobbering in EXCEPTION_COMMON()

2014-07-15 Thread Michael Ellerman
Signed-off-by: Michael Ellerman --- arch/powerpc/include/asm/exception-64s.h | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h index 8f35cd7d59cc..066c15cd2837 100644 --- a/arch/powerpc/include/asm/exception-64

[PATCH 1/3] powerpc: Update comments in irqflags.h

2014-07-15 Thread Michael Ellerman
The comment on TRACE_ENABLE_INTS is incorrect, and appears to have always been incorrect since the code was merged. It probably came from an original out-of-tree patch. Replace it with something that's correct. Also propagate the message to RECONCILE_IRQ_STATE(), because it's potentially subtle.

[PATCH] powerpc: Move bad_stack() below the fwnmi_data_area

2014-07-15 Thread Michael Ellerman
At the moment the allmodconfig build is failing because we run out of space between altivec_assist() at 0x5700 and the fwnmi_data_area at 0x7000. Fixing it permanently will take some more work, but a quick fix is to move bad_stack() below the fwnmi_data_area. That gives us just enough room with ev

Re: Re: [PATCH v5 2/2] [BUGFIX] kprobes: Fix "Failed to find blacklist" error on ia64 and ppc64

2014-07-15 Thread Masami Hiramatsu
(2014/07/15 16:16), Benjamin Herrenschmidt wrote: > On Tue, 2014-07-15 at 13:19 +1000, Michael Ellerman wrote: > >>> Signed-off-by: Masami Hiramatsu >>> Reported-by: Tony Luck >>> Tested-by: Tony Luck >>> Cc: Michael Ellerman >> >> Tested-by: Michael Ellerman >> Acked-by: Michael Ellerman (f

Re: [PATCH] ppc/xmon: use isxdigit/isspace/isalnum from ctype.h

2014-07-15 Thread Vincent Bernat
❦ 15 July 2014 08:55 GMT, David Laight: >> Use linux/ctype.h instead of defining custom versions of >> isxdigit/isspace/isalnum. > > ... >> -#define isspace(c) (c == ' ' || c == '\t' || c == 10 || c == 13 || c == 0) > > That is different from the version in linux/ctype.h > Especially for 'c

[PATCH QEMU 09/12] vfio: Enable DDW ioctls to VFIO IOMMU driver

2014-07-15 Thread Alexey Kardashevskiy
Signed-off-by: Alexey Kardashevskiy --- hw/misc/vfio.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c index 0b9eba0..e7b4d6e 100644 --- a/hw/misc/vfio.c +++ b/hw/misc/vfio.c @@ -4437,6 +4437,10 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid

[PATCH QEMU 10/12] headers: update for KVM_CAP_SPAPR_TCE_64 and VFIO KVM device

2014-07-15 Thread Alexey Kardashevskiy
Signed-off-by: Alexey Kardashevskiy --- linux-headers/asm-mips/kvm_para.h | 6 +- linux-headers/asm-powerpc/kvm.h | 9 + linux-headers/linux/kvm.h | 12 linux-headers/linux/kvm_para.h| 3 +++ 4 files changed, 29 insertions(+), 1 deletion(-) diff --git a/

[PATCH QEMU 12/12] vfio: Enable in-kernel acceleration via VFIO KVM device

2014-07-15 Thread Alexey Kardashevskiy
TCE hypercalls (H_PUT_TCE, H_PUT_TCE_INDIRECT, H_STUFF_TCE) use a logical bus number (LIOBN) to identify which TCE table the request is addressed to. However VFIO kernel driver operates with IOMMU group IDs and has no idea about which LIOBN corresponds to which group. If the host kernel supports in

[PATCH QEMU 11/12] target-ppc: kvm: make use of KVM_CREATE_SPAPR_TCE_64

2014-07-15 Thread Alexey Kardashevskiy
Signed-off-by: Alexey Kardashevskiy --- hw/ppc/spapr_iommu.c | 7 --- target-ppc/kvm.c | 47 --- target-ppc/kvm_ppc.h | 10 +++--- 3 files changed, 47 insertions(+), 17 deletions(-) diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c

[PATCH QEMU 07/12] spapr_pci: Enable DDW

2014-07-15 Thread Alexey Kardashevskiy
Signed-off-by: Alexey Kardashevskiy --- hw/ppc/spapr_pci.c | 62 + include/hw/pci-host/spapr.h | 3 +++ 2 files changed, 65 insertions(+) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 230b59c..038a485 100644 --- a/hw/ppc/spapr_pc

[PATCH QEMU 08/12] spapr_pci_vfio: Enable DDW

2014-07-15 Thread Alexey Kardashevskiy
Signed-off-by: Alexey Kardashevskiy --- hw/ppc/spapr_pci_vfio.c | 73 + 1 file changed, 73 insertions(+) diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index d3bddf2..b72aff0 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_p

[PATCH QEMU 01/12] spapr_iommu: Disable in-kernel IOMMU tables for >4GB windows

2014-07-15 Thread Alexey Kardashevskiy
The existing KVM_CREATE_SPAPR_TCE ioctl only supports windows of up to 4GB. We are going to add huge DMA window support, so this would create a small window and unexpectedly fail later. This disables KVM_CREATE_SPAPR_TCE for windows bigger than 4GB. Since those windows are normally mapped at boot time,

[PATCH QEMU 06/12] spapr: Add "ddw" machine option

2014-07-15 Thread Alexey Kardashevskiy
This option will enable Dynamic DMA windows (DDW) support for pseries machine. Signed-off-by: Alexey Kardashevskiy --- hw/ppc/spapr.c | 15 +++ vl.c | 4 2 files changed, 19 insertions(+) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index d01978f..fec295b 100644 ---

[PATCH QEMU 03/12] spapr_iommu: Make spapr_tce_find_by_liobn() public

2014-07-15 Thread Alexey Kardashevskiy
Signed-off-by: Alexey Kardashevskiy --- hw/ppc/spapr_iommu.c | 2 +- include/hw/ppc/spapr.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c index 36f5d27..588d442 100644 --- a/hw/ppc/spapr_iommu.c +++ b/hw/ppc/spapr_iommu.c @@ -40

[PATCH QEMU 04/12] linux headers update for DDW

2014-07-15 Thread Alexey Kardashevskiy
Signed-off-by: Alexey Kardashevskiy --- linux-headers/linux/vfio.h | 37 - 1 file changed, 36 insertions(+), 1 deletion(-) diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h index 26c218e..f0aa97d 100644 --- a/linux-headers/linux/vfio.h +++ b

[PATCH QEMU 02/12] spapr_pci: Make find_phb()/find_dev() public

2014-07-15 Thread Alexey Kardashevskiy
This makes find_phb()/find_dev() public and renames them to spapr_pci_find_phb()/spapr_pci_find_dev() as they are going to be used from other parts of QEMU such as VFIO DDW (dynamic DMA window) or VFIO PCI error injection or VFIO EEH handling - in all these cases there are RTAS calls which are

[PATCH QEMU 05/12] spapr_rtas: Add Dynamic DMA windows (DDW) RTAS calls support

2014-07-15 Thread Alexey Kardashevskiy
spapr_pci_vfio: Support dynamic DMA window This adds support for Dynamic DMA Windows (DDW) option defined by the SPAPR specification which allows to have additional DMA windows besides the default and small one which can only handle 4K pages and which should completely fit into first 32bit of PCI

[PATCH QEMU 00/12] vfio: pci: Enable DDW and in-kernel acceleration

2014-07-15 Thread Alexey Kardashevskiy
This makes use of kernel patchsets: [PATCH v1 00/16] powernv: vfio: Add Dynamic DMA windows (DDW) [PATCH v1 0/7] powerpc/iommu: kvm: Enable MultiTCE support [PATCH v1 00/13] powerpc: kvm: Enable in-kernel acceleration for VFIO I am posting it for reference here, reviews are still welcome but not r

Re: [PATCH v1 01/13] KVM: PPC: Account TCE pages in locked_vm

2014-07-15 Thread Alexey Kardashevskiy
On 07/15/2014 07:25 PM, Alexey Kardashevskiy wrote: > Signed-off-by: Alexey Kardashevskiy Just realized this should go to "powernv: vfio: Add Dynamic DMA windows (DDW)". And neither patchset accounts DDW in locked_vm, need to decide how... > --- > arch/powerpc/kvm/book3s_64_vio.c | 35 ++

[PATCH v1 13/13] KVM: PPC: Add support for IOMMU in-kernel handling

2014-07-15 Thread Alexey Kardashevskiy
This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT and H_STUFF_TCE requests targeted at an IOMMU TCE table without passing them to user space which saves time on switching to user space and back. Both real and virtual modes are supported. The kernel tries to handle a TCE request in t

[PATCH v1 11/13] KVM: PPC: Associate IOMMU group with guest copy of TCE table

2014-07-15 Thread Alexey Kardashevskiy
The existing in-kernel TCE table for emulated devices contains guest physical addresses which are accessed by emulated devices. Since we need to keep this information for VFIO devices too in order to implement H_GET_TCE, we are reusing it. This adds iommu_group* and iommu_table* pointers to kvmppc

[PATCH v1 12/13] KVM: PPC: vfio kvm device: support spapr tce

2014-07-15 Thread Alexey Kardashevskiy
In addition to the external VFIO user API, a VFIO KVM device has been introduced recently. sPAPR TCE IOMMU is para-virtualized and the guest does map/unmap via hypercalls which take a logical bus id (LIOBN) as a target IOMMU identifier. LIOBNs are made up, advertised to the guest system and linked

[PATCH v1 10/13] KVM: PPC: Fix kvmppc_gpa_to_hva_and_get() to return host physical address

2014-07-15 Thread Alexey Kardashevskiy
The existing support of emulated devices does not need to calculate a host physical address as the translation is performed by the userspace. The upcoming support of VFIO needs it as it stores host physical addresses in the real hardware TCE table which hardware uses during DMA transfer. This tran

[PATCH v1 09/13] KVM: PPC: Add page_shift support for in-kernel H_PUT_TCE/etc handlers

2014-07-15 Thread Alexey Kardashevskiy
Recently introduced KVM_CREATE_SPAPR_TCE_64 added page_shift. This makes use of it in kvmppc_tce_put(). This changes kvmppc_tce_put() to take a TCE index rather than IO address. This does not change the existing behaviour and will be utilized later by Dynamic DMA windows which support 64K and 16

[PATCH v1 08/13] KVM: PPC: Add hugepage support for IOMMU in-kernel handling

2014-07-15 Thread Alexey Kardashevskiy
This adds special support for huge pages (16MB) in real mode. The reference counting cannot be easily done for such pages in real mode (when MMU is off) so this adds a hash table of huge pages. It is populated in virtual mode and get_page is called just once per a huge page. Real mode handlers chec

[PATCH v1 06/13] KVM: PPC: Add @offset to kvmppc_spapr_tce_table

2014-07-15 Thread Alexey Kardashevskiy
This enables guest visible TCE tables to start from non-zero offset on a bus. This will be used for VFIO support. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/include/asm/kvm_host.h | 1 + arch/powerpc/kvm/book3s_64_vio_hv.c | 5 - 2 files changed, 5 insertions(+), 1 deletion(-) dif

[PATCH v1 07/13] KVM: PPC: Add support for 64bit TCE windows

2014-07-15 Thread Alexey Kardashevskiy
The existing KVM_CREATE_SPAPR_TCE only supports 32bit windows which is not enough for directly mapped windows as the guest can get more than 4GB. This adds KVM_CREATE_SPAPR_TCE_64 ioctl and advertises it via KVM_CAP_SPAPR_TCE_64 capability. Since 64bit windows are to support Dynamic DMA windows (

[PATCH v1 03/13] KVM: PPC: Enable IOMMU_API for KVM_BOOK3S_64 permanently

2014-07-15 Thread Alexey Kardashevskiy
It does not make much sense to have KVM in book3s-64 and not to have IOMMU bits for PCI pass through support as it costs little and allows VFIO to function on book3s KVM. Having IOMMU_API always enabled makes it unnecessary to have a lot of "#ifdef IOMMU_API" in arch/powerpc/kvm/book3s_64_vio*. Wi

[PATCH v1 05/13] KVM: PPC: Reserve KVM_CAP_SPAPR_TCE_64 capability number

2014-07-15 Thread Alexey Kardashevskiy
This adds a capability number for 64-bit TCE tables support. Signed-off-by: Alexey Kardashevskiy --- include/uapi/linux/kvm.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 3048c86..65c2689 100644 --- a/include/uapi/linux/kvm.h +++ b

[PATCH v1 04/13] KVM: PPC: Reserve KVM_CAP_SPAPR_TCE_VFIO capability number

2014-07-15 Thread Alexey Kardashevskiy
This adds a capability number for in-kernel support for VFIO on SPAPR platform. The capability will tell the user space whether in-kernel handlers of H_PUT_TCE can handle VFIO-targeted requests or not. If not, the user space must not attempt allocating a TCE table in the host kernel via the KVM_CR

[PATCH v1 02/13] KVM: PPC: Rework kvmppc_spapr_tce_table to support variable page size

2014-07-15 Thread Alexey Kardashevskiy
At the moment the kvmppc_spapr_tce_table struct can only describe 4GB windows which is not enough for big DMA windows. This replaces window_size (in bytes, 4GB max) with page_shift (32bit) and size (64bit, in pages). Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/include/asm/kvm_host.h |
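The arithmetic behind the rework is simple: a window described by a page shift and a 64-bit page count can exceed 4GB, whereas a 32-bit byte count cannot. The sketch below uses a hypothetical `struct tce_window` that mirrors only the two fields named in the description; it is not the kernel structure.

```c
#include <stdint.h>

/* Hypothetical mirror of the two reworked fields. */
struct tce_window {
	uint32_t page_shift;	/* log2 of the IOMMU page size */
	uint64_t size;		/* window size in pages */
};

/* Window size in bytes: number of pages times the page size. */
static uint64_t tce_window_bytes(const struct tce_window *w)
{
	return w->size << w->page_shift;
}
```

For example, 2^20 pages of 4K (shift 12) give exactly 4GB, while the same page count with 64K pages (shift 16) gives 64GB, which the old byte-sized field could not describe.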

[PATCH v1 01/13] KVM: PPC: Account TCE pages in locked_vm

2014-07-15 Thread Alexey Kardashevskiy
Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/kvm/book3s_64_vio.c | 35 ++- 1 file changed, 34 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c index 2137836..4ca33f1 100644 --- a/arch/powerpc/kvm

[PATCH v1 00/13] powerpc: kvm: Enable in-kernel acceleration for VFIO

2014-07-15 Thread Alexey Kardashevskiy
This enables in-kernel acceleration of TCE hypercalls (H_PUT_TCE, H_PUT_TCE_INDIRECT, H_STUFF_TCE). This implements acceleration for both real and virtual modes. This was made on top of both: [PATCH v1 00/16] powernv: vfio: Add Dynamic DMA windows (DDW) [PATCH v1 0/7] powerpc/iommu: kvm: Enable Mu

[PATCH v1 15/16] vfio: Use it_page_size

2014-07-15 Thread Alexey Kardashevskiy
This makes use of the it_page_size from the iommu_table struct as page size can differ. This replaces missing IOMMU_PAGE_SHIFT macro in commented debug code as recently introduced IOMMU_PAGE_XXX macros do not include IOMMU_PAGE_SHIFT. Signed-off-by: Alexey Kardashevskiy --- drivers/vfio/vfio_io

[PATCH v1 7/7] KVM: PPC: Add support for multiple-TCE hcalls

2014-07-15 Thread Alexey Kardashevskiy
This adds real and virtual mode handlers for the H_PUT_TCE_INDIRECT and H_STUFF_TCE hypercalls for user space emulated devices such as IBMVIO devices or emulated PCI. These calls allow adding multiple entries (up to 512) into the TCE table in one call which saves time on transition between kernel
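A multi-entry update of this kind boils down to a bounded copy into the table. The sketch below is a simplified user-space model: the 512-entry limit comes from the description above, while `put_tce_indirect`, its parameters and its return values are hypothetical and ignore everything a real handler would do (address translation, validation, locking).

```c
#include <stddef.h>
#include <stdint.h>

#define TCES_PER_CALL 512	/* limit taken from the patch description */

/*
 * Hypothetical model of a multi-TCE update: copy `npages` entries from
 * `list` into `table` starting at index `first`.
 * Returns 0 on success, -1 if the request is too large or out of range.
 */
static int put_tce_indirect(uint64_t *table, size_t table_len, size_t first,
			    const uint64_t *list, size_t npages)
{
	if (npages > TCES_PER_CALL)
		return -1;
	if (first > table_len || npages > table_len - first)
		return -1;

	for (size_t i = 0; i < npages; i++)
		table[first + i] = list[i];
	return 0;
}
```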

[PATCH v1 6/7] KVM: PPC: Add kvmppc_find_tce_table()

2014-07-15 Thread Alexey Kardashevskiy
This adds a common helper to search for a kvmppc_spapr_tce_table by LIOBN. This makes H_PUT_TCE and H_GET_TCE handler use this new helper. The helper will be also used in H_PUT_TCE_INDIRECT and H_STUFF_TCE handlers. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/kvm/book3s_64_vio_hv.c |
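A lookup helper of this kind can be sketched as a walk over a list of tables keyed by LIOBN. The structure layout and the name `find_tce_table` are assumptions for illustration; the real helper is kvmppc_find_tce_table(), and how the kernel stores and protects its list is not shown in the excerpt.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, minimal table descriptor. */
struct tce_table {
	uint64_t liobn;			/* logical bus number */
	struct tce_table *next;
};

/* Return the table registered under `liobn`, or NULL if none matches. */
static struct tce_table *find_tce_table(struct tce_table *head,
					uint64_t liobn)
{
	for (struct tce_table *t = head; t != NULL; t = t->next)
		if (t->liobn == liobn)
			return t;
	return NULL;
}
```

Sharing one helper means every hypercall handler resolves the LIOBN the same way and returns the same error when the table does not exist.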

[PATCH v1 5/7] KVM: PPC: Move reusable bits of H_PUT_TCE handler to helpers

2014-07-15 Thread Alexey Kardashevskiy
Upcoming multi-tce support (H_PUT_TCE_INDIRECT/H_STUFF_TCE hypercalls) will validate TCE (not to have unexpected bits) and IO address (to be within the DMA window boundaries). This introduces helpers to validate TCE and IO address. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/include/as
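The two checks described above can be sketched as small predicates. All constants and names here are assumptions: 4K IOMMU pages, read/write permission in the two low bits, and an alignment requirement on the IO address. The actual helpers in the patch may check different things.

```c
#include <stdbool.h>
#include <stdint.h>

#define TCE_PAGE_SHIFT	12	/* assumed 4K IOMMU pages */
#define TCE_PAGE_MASK	((1ULL << TCE_PAGE_SHIFT) - 1)
#define TCE_READ	0x1ULL	/* assumed permission bits */
#define TCE_WRITE	0x2ULL

/* Hypothetical DMA window description, in bytes. */
struct dma_window {
	uint64_t start;
	uint64_t size;
};

/* IO address must be page aligned and inside [start, start + size). */
static bool ioba_is_valid(const struct dma_window *w, uint64_t ioba)
{
	if (ioba & TCE_PAGE_MASK)
		return false;
	return ioba >= w->start && ioba - w->start < w->size;
}

/* Below the page boundary, only the permission bits may be set. */
static bool tce_is_valid(uint64_t tce)
{
	return (tce & TCE_PAGE_MASK & ~(TCE_READ | TCE_WRITE)) == 0;
}
```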

[PATCH v1 4/7] KVM: PPC: Replace SPAPR_TCE_SHIFT with IOMMU_PAGE_SHIFT_4K

2014-07-15 Thread Alexey Kardashevskiy
SPAPR_TCE_SHIFT is used in only a few places, and since IOMMU_PAGE_SHIFT_4K can easily be used instead, remove SPAPR_TCE_SHIFT. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/include/asm/kvm_book3s_64.h | 2 -- arch/powerpc/kvm/book3s_64_vio.c | 3 ++- arch/powerpc/kvm/book3s_64_vio_

[PATCH v1 2/7] powerpc/iommu: Support real mode

2014-07-15 Thread Alexey Kardashevskiy
The TCE tables handling differs for real (MMU off) and virtual modes (MMU on) so additional set of realmode-capable TCE callbacks has been added to ppc_md: * tce_build_rm * tce_free_rm * tce_flush_rm This makes use of new ppc_md calls in iommu_clear_tces_and_put_pages. This changes iommu_tce_buil

[PATCH v1 0/7] powerpc/iommu: kvm: Enable MultiTCE support

2014-07-15 Thread Alexey Kardashevskiy
This prepares upstream kernel for in-kernel acceleration of TCE hypercalls (H_PUT_TCE, H_PUT_TCE_INDIRECT, H_STUFF_TCE). This implements acceleration for both real and virtual modes. As it requires gup() for real mode to parse TCE list page, this implements gup() for realmode. This only accelerat

[PATCH v1 1/7] powerpc/iommu: Change prototypes for realmode support

2014-07-15 Thread Alexey Kardashevskiy
This is a mechanical patch to add an extra "realmode" parameter to iommu_clear_tces_and_put_pages() and iommu_tce_build() helpers. This changes iommu_tce_build() to receive multiple page addresses at once as in the future we want to save on locks and TCE flushes in realmode. Signed-off-by: Alexey

[PATCH v1 3/7] powerpc/iommu: Clean up IOMMU API

2014-07-15 Thread Alexey Kardashevskiy
The iommu_tce_direction() function is not used from outside iommu.c so make it static. The iommu_clear_tce() is not used anymore at all so remove it. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/include/asm/iommu.h | 4 arch/powerpc/kernel/iommu.c | 22 +-

[PATCH v1 14/16] powerpc/powernv: Implement Dynamic DMA windows (DDW) for IODA

2014-07-15 Thread Alexey Kardashevskiy
SPAPR defines an interface to create additional DMA windows dynamically. "Dynamically" means that the window is not allocated at the guest start and the guest can request it later. In practice, existing linux guests check for the capability and if it is there, they create+map one big DMA window as

[PATCH v1 16/16] vfio: powerpc: Enable Dynamic DMA windows

2014-07-15 Thread Alexey Kardashevskiy
This defines and implements VFIO IOMMU API required to support Dynamic DMA windows defined in the SPAPR specification. The ioctl handlers implement host-side part of corresponding RTAS calls: - VFIO_IOMMU_SPAPR_TCE_QUERY - ibm,query-pe-dma-window; - VFIO_IOMMU_SPAPR_TCE_CREATE - ibm,create-pe-dma-w

[PATCH v1 09/16] powerpc/iommu: Fix IOMMU ownership control functions

2014-07-15 Thread Alexey Kardashevskiy
This adds missing locks in iommu_take_ownership()/ iommu_release_ownership(). This marks all pages busy in iommu_table::it_map in order to catch errors if there is an attempt to use this table while ownership over it is taken. This only clears TCE content if there is no page marked busy in it_map

[PATCH v1 10/16] powerpc/iommu: Fix missing permission bits in iommu_put_tce_user_mode()

2014-07-15 Thread Alexey Kardashevskiy
This adds missing permission bits to the translated TCE. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/kernel/iommu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c index da04561..01ac319 100644 --- a/arch/powerpc/kernel/iommu

[PATCH v1 12/16] powerpc/powernv: Return non-zero TCE from pnv_tce_build

2014-07-15 Thread Alexey Kardashevskiy
This returns old TCE values to the caller if requested. The caller is expected to call put_page() for them. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/pci.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/powernv

[PATCH v1 13/16] powerpc/iommu: Implement put_page() if TCE had non-zero value

2014-07-15 Thread Alexey Kardashevskiy
Guests might put new TCEs without clearing them first and the PAPR spec allows that. This adds put_page() for TCEs which we just replaced. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/kernel/iommu.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/powe

[PATCH v1 11/16] powerpc/iommu: Extend ppc_md.tce_build(_rm) to return old TCE values

2014-07-15 Thread Alexey Kardashevskiy
The tce_build/tce_build_rm callbacks are used to implement H_PUT_TCE/etc hypercalls. The PAPR spec does not allow to fail if the TCE is not empty. However we cannot just overwrite the existing TCE value with the new one as we still have to do page counting. This adds an optional @old_tces return p
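The "return the old value so the caller can release the page" pattern can be sketched as follows. The optional `old_tces` argument is taken from the description; the function signature, the flat array standing in for the TCE table, and the name `tce_build` as used here are simplifications for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: overwrite `npages` entries starting at `index`.
 * When `old_tces` is non-NULL, the previous contents are handed back
 * so the caller can drop page references for non-zero entries.
 */
static void tce_build(uint64_t *table, size_t index, size_t npages,
		      const uint64_t *new_tces, uint64_t *old_tces)
{
	for (size_t i = 0; i < npages; i++) {
		if (old_tces)
			old_tces[i] = table[index + i];
		table[index + i] = new_tces[i];
	}
}
```

A caller would then walk `old_tces` and call put_page() for each non-zero entry, which keeps page counting correct even when a guest overwrites a TCE without clearing it first.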

[PATCH v1 04/16] powerpc/powernv: Use it_page_shift in TCE build

2014-07-15 Thread Alexey Kardashevskiy
This makes use of iommu_table::it_page_shift instead of TCE_SHIFT and TCE_RPN_SHIFT hardcoded values. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/pci.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch

[PATCH v1 08/16] powerpc/powernv: Convert/move set_bypass() callback to take_ownership()

2014-07-15 Thread Alexey Kardashevskiy
At the moment the iommu_table struct has a set_bypass() which enables/ disables DMA bypass on IODA2 PHB. This is exposed to POWERPC IOMMU code which calls this callback when external IOMMU users such as VFIO are about to get over a PHB. Since the set_bypass() is not really an iommu_table function

[PATCH v1 03/16] powerpc/powernv: Use it_page_shift for TCE invalidation

2014-07-15 Thread Alexey Kardashevskiy
This fixes IODA1/2 to use it_page_shift as it may be bigger than 4K. This changes the involved constant values to use "ull" modifier. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/pci-ioda.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --

[PATCH v1 05/16] powerpc/powernv: Add a page size parameter to pnv_pci_setup_iommu_table()

2014-07-15 Thread Alexey Kardashevskiy
Since a TCE page size can be other than 4K, make it configurable for P5IOC2 and IODA PHBs. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/pci-ioda.c | 5 +++-- arch/powerpc/platforms/powernv/pci-p5ioc2.c | 3 ++- arch/powerpc/platforms/powernv/pci.c| 6 +++--- a

[PATCH v1 07/16] powerpc/spapr: vfio: Implement spapr_tce_iommu_ops

2014-07-15 Thread Alexey Kardashevskiy
Modern IBM POWERPC systems support multiple IOMMU tables per PHB so we need a more reliable way (compared to container_of()) to get a PE pointer from the iommu_table struct pointer used in IOMMU functions. At the moment IOMMU group data points to an iommu_table struct. This introduces a spapr_tce_

[PATCH v1 06/16] powerpc/powernv: Make invalidate() callback an iommu_table callback

2014-07-15 Thread Alexey Kardashevskiy
This implements pnv_pci_ioda(1|2)_tce_invalidate as a callback of iommu_table to simplify code structure. The callbacks receive iommu_table only and cast it to PE, the specific callback knows how. This registers invalidate() callbacks for IODA1 and IODA2: - pnv_pci_ioda1_tce_invalidate; - pnv_pci_

[PATCH v1 00/16] powernv: vfio: Add Dynamic DMA windows (DDW)

2014-07-15 Thread Alexey Kardashevskiy
This prepares existing upstream kernel for DDW (Dynamic DMA windows) and adds actual DDW support for VFIO. This patchset does not contain any in-kernel acceleration stuff. This patchset does not enable DDW for emulated devices. Alexey Kardashevskiy (16): powerpc/iommu: Fix comments with it_pa

[PATCH v1 02/16] KVM: PPC: Use RCU when adding to arch.spapr_tce_tables

2014-07-15 Thread Alexey Kardashevskiy
Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/kvm/book3s_64_vio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c index 54cf9bc..516f2ee 100644 --- a/arch/powerpc/kvm/book3s_64_vio.c +++ b/arch/powerpc/

[PATCH v1 01/16] powerpc/iommu: Fix comments with it_page_shift

2014-07-15 Thread Alexey Kardashevskiy
There are a couple of commented debug prints which still use IOMMU_PAGE_SHIFT(), which is no longer defined for POWERPC; replace them with it_page_shift. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/kernel/iommu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/ar

RE: [PATCH] ppc/xmon: use isxdigit/isspace/isalnum from ctype.h

2014-07-15 Thread David Laight
From: Vincent Bernat > Use linux/ctype.h instead of defining custom versions of > isxdigit/isspace/isalnum. ... > -#define isspace(c) (c == ' ' || c == '\t' || c == 10 || c == 13 || c == 0) That is different from the version in linux/ctype.h Especially for 'c == 0', but probably also vertical t

[PATCH] ppc/xmon: use isxdigit/isspace/isalnum from ctype.h

2014-07-15 Thread Vincent Bernat
Use linux/ctype.h instead of defining custom versions of isxdigit/isspace/isalnum. Signed-off-by: Vincent Bernat --- arch/powerpc/xmon/xmon.c | 12 +--- 1 file changed, 1 insertion(+), 11 deletions(-) diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c index d199bfa2f1fa..c

Re: [PATCH v5 2/2] [BUGFIX] kprobes: Fix "Failed to find blacklist" error on ia64 and ppc64

2014-07-15 Thread Benjamin Herrenschmidt
On Tue, 2014-07-15 at 13:19 +1000, Michael Ellerman wrote: > > Signed-off-by: Masami Hiramatsu > > Reported-by: Tony Luck > > Tested-by: Tony Luck > > Cc: Michael Ellerman > > Tested-by: Michael Ellerman > Acked-by: Michael Ellerman (for powerpc) > > Ben, can you take this in your tree? A

[PATCH 0/2] Bug fix for VFIO EEH

2014-07-15 Thread Gavin Shan
These 2 patches are bug fixes for VFIO EEH support, which isn't merged yet though all reviewers gave their ack. So I'm sending this to avoid a revert or something like that. The problem is that dma_offset/iommu_table_base share the same memory location. When disabling bypass mode, we missed to rest

[PATCH 2/2] powerpc/eeh: Fetch IOMMU table in reliable way

2014-07-15 Thread Gavin Shan
Function eeh_iommu_group_to_pe() iterates each PCI device to check the binding IOMMU group with get_iommu_table_base(), which possibly fetches pdev->dev.archdata.dma_data.dma_offset. It's (0x1 << 59) for "bypass" cases. The patch fixes the issue by iterating devices hooked to the IOMMU group and f
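The underlying hazard is that two fields occupy the same storage, so reading one after writing the other yields a value that only looks plausible. The sketch below models that with a union; `union dma_data` here is a hypothetical mirror of the shared field, `struct iommu_table` is a placeholder type, and `looks_like_bypass` is an invented helper. Only the `0x1 << 59` value comes from the description.

```c
#include <stdint.h>

struct iommu_table {
	int placeholder;	/* stand-in; the real struct is much larger */
};

/* Hypothetical mirror of the two fields that share one location. */
union dma_data {
	struct iommu_table *iommu_table_base;
	uint64_t dma_offset;
};

#define BYPASS_OFFSET (0x1ULL << 59)	/* value quoted in the description */

/* True when the shared storage currently holds the bypass offset. */
static int looks_like_bypass(const union dma_data *d)
{
	return d->dma_offset == BYPASS_OFFSET;
}
```

Code that blindly dereferences `iommu_table_base` while the device is in bypass mode would be following the offset as if it were a pointer, which is why looking the table up through the IOMMU group is the more reliable route.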

[PATCH 1/2] powerpc/powernv: Fix IOMMU table for VFIO dev

2014-07-15 Thread Gavin Shan
On PHB3, PCI devices can bypass the IOMMU for DMA access. If we pass through a PCI device whose host driver has ever enabled bypass mode, pdev->dev.archdata.dma_data.iommu_table_base isn't the IOMMU table. However, EEH needs to access the IOMMU table when the device is owned by a guest. The patch fixes pdev-