[PATCH kernel v11 06/34] vfio: powerpc/spapr: Move page pinning from arch code to VFIO IOMMU driver

2015-05-29 Thread Alexey Kardashevskiy
This moves page pinning (get_user_pages_fast()/put_page()) code out of the platform IOMMU code and puts it into the VFIO IOMMU driver, where it belongs, as the platform code does not deal with page pinning. This makes iommu_take_ownership()/iommu_release_ownership() deal with the IOMMU table bitmap onl

[PATCH kernel v11 13/34] powerpc/powernv: Do not set "read" flag if direction==DMA_NONE

2015-05-29 Thread Alexey Kardashevskiy
Normally a bitmap from the iommu_table is used to track what TCE entry is in use. Since we are going to use iommu_table without its locks and do xchg() instead, it becomes essential not to put bits which are not implied in the direction flag as the old TCE value (more precisely - the permission bit
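
The idea of deriving permission bits strictly from the DMA direction can be sketched as follows. This is only an illustration: the enum, the bit values and the function name are local stand-ins chosen for the example, not the kernel's definitions.

```c
/* Local stand-ins for DMA directions; not the kernel's enum. */
enum dma_dir { DIR_BIDIRECTIONAL, DIR_TO_DEVICE, DIR_FROM_DEVICE, DIR_NONE };

/* Hypothetical permission bits stored in a table entry. */
#define PERM_READ  0x1UL  /* device may read from memory */
#define PERM_WRITE 0x2UL  /* device may write to memory */

/* Return only the bits implied by the direction; DIR_NONE yields none. */
static unsigned long dir_to_perm(enum dma_dir dir)
{
    switch (dir) {
    case DIR_BIDIRECTIONAL:
        return PERM_READ | PERM_WRITE;
    case DIR_TO_DEVICE:
        return PERM_READ;
    case DIR_FROM_DEVICE:
        return PERM_WRITE;
    default:
        return 0;
    }
}
```

With a mapping like this, an entry written with no direction carries no permission bits, so a later exchange can tell from the old value whether the entry was in use.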

[PATCH kernel v11 14/34] powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table

2015-05-29 Thread Alexey Kardashevskiy
This adds an iommu_table_ops struct and puts a pointer to it into the iommu_table struct. This moves tce_build/tce_free/tce_get/tce_flush callbacks from ppc_md to the new struct where they really belong. This adds the requirement for @it_ops to be initialized before calling iommu_init_table() to m
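
A minimal sketch of the per-table callback pattern described here, assuming a simplified table and invented names (struct table, table_ops, table_init); it is not the kernel's iommu_table_ops layout.

```c
#include <stddef.h>

#define TABLE_SIZE 8

struct table;

/* Hypothetical per-table callbacks, held by the table itself. */
struct table_ops {
    int  (*set)(struct table *tbl, long index, unsigned long value);
    void (*clear)(struct table *tbl, long index);
};

struct table {
    const struct table_ops *ops;      /* must be set before table_init() */
    unsigned long entries[TABLE_SIZE];
};

static int demo_set(struct table *tbl, long index, unsigned long value)
{
    if (index < 0 || index >= TABLE_SIZE)
        return -1;
    tbl->entries[index] = value;
    return 0;
}

static void demo_clear(struct table *tbl, long index)
{
    if (index >= 0 && index < TABLE_SIZE)
        tbl->entries[index] = 0;
}

static const struct table_ops demo_ops = {
    .set = demo_set,
    .clear = demo_clear,
};

/* Refuse to initialize a table whose callbacks were not provided first. */
static int table_init(struct table *tbl)
{
    if (tbl->ops == NULL)
        return -1;
    for (long i = 0; i < TABLE_SIZE; i++)
        tbl->ops->clear(tbl, i);
    return 0;
}
```

Keeping the callbacks in the table rather than in one global structure lets different tables use different implementations.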

[PATCH kernel v11 19/34] powerpc/iommu: Fix IOMMU ownership control functions

2015-05-29 Thread Alexey Kardashevskiy
This adds missing locks in iommu_take_ownership()/ iommu_release_ownership(). This marks all pages busy in iommu_table::it_map in order to catch errors if there is an attempt to use this table while ownership over it is taken. This only clears TCE content if there is no page marked busy in it_map

[PATCH kernel v11 02/34] powerpc/iommu/powernv: Get rid of set_iommu_table_base_and_group

2015-05-29 Thread Alexey Kardashevskiy
The set_iommu_table_base_and_group() name suggests that the function sets table base and adds a device to an IOMMU group. The actual purpose for table base setting is to put some reference into a device so later iommu_add_device() can get the IOMMU group reference and add the device to the group. At t

[PATCH kernel v11 01/34] powerpc/eeh/ioda2: Use device::iommu_group to check IOMMU group

2015-05-29 Thread Alexey Kardashevskiy
This relies on the fact that a PCI device always has an IOMMU table which may not be the case when we get dynamic DMA windows so let's use a more reliable check for IOMMU group here. As we do not rely on the table presence here, remove the workaround from pnv_pci_ioda2_set_bypass(); also remove the

[PATCH kernel v11 26/34] powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_set_window

2015-05-29 Thread Alexey Kardashevskiy
This is a part of moving DMA window programming to an iommu_ops callback. pnv_pci_ioda2_set_window() takes an iommu_table_group as a first parameter (not pnv_ioda_pe) as it is going to be used as a callback for VFIO DDW code. This adds pnv_pci_ioda2_tvt_invalidate() to invalidate TVT as it is a go

[PATCH kernel v11 24/34] powerpc/powernv/ioda2: Rework iommu_table creation

2015-05-29 Thread Alexey Kardashevskiy
This moves iommu_table creation to the beginning to make following changes easier to review. This starts using table parameters from the iommu_table struct. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson Reviewed-by: Gavin Shan --- Change

[PATCH kernel v11 22/34] powerpc/powernv: Implement accessor to TCE entry

2015-05-29 Thread Alexey Kardashevskiy
This replaces direct accesses to the TCE table with a helper which returns a TCE entry address. This makes no difference now but will when multi-level TCE tables get introduced. No change in behavior is expected. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson Reviewed-by: Gavin

[PATCH kernel v11 25/34] powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages

2015-05-29 Thread Alexey Kardashevskiy
This is a part of moving TCE table allocation into an iommu_ops callback to support multiple IOMMU groups per one VFIO container. This moves the code which allocates the actual TCE tables to helpers: pnv_pci_ioda2_table_alloc_pages() and pnv_pci_ioda2_table_free_pages(). These do not allocate/free

[PATCH kernel v11 27/34] powerpc/powernv: Implement multilevel TCE tables

2015-05-29 Thread Alexey Kardashevskiy
TCE tables might get too big in case of 4K IOMMU pages and DDW enabled on huge guests (hundreds of GB of RAM) so the kernel might be unable to allocate a contiguous chunk of physical memory to store the TCE table. To address this, POWER8 CPU (actually, IODA2) supports multi-level TCE tables, up to 5
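
How an entry number splits into per-level indices in a multi-level table can be sketched generically. The function name and parameters are assumptions for illustration, and the sketch assumes every level uses the same number of index bits; the hardware's actual layout is not reproduced here.

```c
/*
 * Index into the table at a given level for entry number n.
 * Level 0 is the top level; level (levels - 1) holds the final entries.
 * Assumes level < levels and that levels * bits fits in an unsigned long.
 */
static unsigned long level_index(unsigned long n, unsigned levels,
                                 unsigned bits, unsigned level)
{
    unsigned shift = (levels - 1 - level) * bits;

    return (n >> shift) & ((1UL << bits) - 1);
}
```

For example, with three levels of 8 bits each, entry 0x12345 resolves to index 0x01 at the top level, 0x23 at the middle level and 0x45 at the last level, so no single level needs more than 256 contiguous entries.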

[PATCH kernel v11 00/34] powerpc/iommu/vfio: Enable Dynamic DMA windows

2015-05-29 Thread Alexey Kardashevskiy
This enables the sPAPR-defined feature called Dynamic DMA windows (DDW). Each Partitionable Endpoint (IOMMU group) has an address range on a PCI bus where devices are allowed to do DMA. These ranges are called DMA windows. By default, there is a single DMA window, 1 or 2GB big, mapped at zero on a PC

[PATCH kernel v11 23/34] powerpc/iommu/powernv: Release replaced TCE

2015-05-29 Thread Alexey Kardashevskiy
At the moment writing a new TCE value to the IOMMU table fails with EBUSY if there is a valid entry already. However the PAPR specification allows the guest to write a new TCE value without clearing it first. Another problem this patch is addressing is the use of pool locks for external IOMMU users such a
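
Replacing an entry in one step and releasing whatever it held before can be sketched with a C11 atomic exchange. This is a user-space illustration only: the permission mask, the release counter and the function names are invented for the example and do not reflect the patch's code.

```c
#include <stdatomic.h>
#include <stdint.h>

#define PERM_MASK 0x3ULL  /* hypothetical: low bits mark an entry as in use */

static int released_pages;  /* counts releases, standing in for put_page() */

static void release_old(uint64_t old)
{
    if (old & PERM_MASK)
        released_pages++;
}

/* Write the new entry unconditionally and release the one it replaced. */
static void entry_update(_Atomic uint64_t *entry, uint64_t new_value)
{
    uint64_t old = atomic_exchange(entry, new_value);

    release_old(old);
}
```

Because the exchange returns the previous value atomically, the writer needs no lock to learn which page, if any, it has just displaced.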

[PATCH kernel v11 29/34] powerpc/powernv/ioda2: Use new helpers to do proper cleanup on PE release

2015-05-29 Thread Alexey Kardashevskiy
The existing code programmed TVT#0 with some address and then immediately released that memory. This makes use of pnv_pci_ioda2_unset_window() and pnv_pci_ioda2_set_bypass() which do correct resource release and TVT update. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/

[PATCH kernel v11 18/34] vfio: powerpc/spapr/iommu/powernv/ioda2: Rework IOMMU ownership control

2015-05-29 Thread Alexey Kardashevskiy
This adds tce_iommu_take_ownership() and tce_iommu_release_ownership(), which call iommu_take_ownership()/iommu_release_ownership() in a loop for every table on the group. As there is just one now, no change in behaviour is expected. At the moment the iommu_table struct has a set_bypass() which enabl

[PATCH kernel v11 12/34] vfio: powerpc/spapr: Rework groups attaching

2015-05-29 Thread Alexey Kardashevskiy
This is to make extended ownership and multiple groups support patches simpler for review. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy [aw: for the vfio related changes] Acked-by: Alex Williamson Reviewed-by: David Gibson Reviewed-by: Gavin Shan --- drivers/v

[PATCH kernel v11 08/34] vfio: powerpc/spapr: Use it_page_size

2015-05-29 Thread Alexey Kardashevskiy
This makes use of the it_page_size from the iommu_table struct as page size can differ. This replaces missing IOMMU_PAGE_SHIFT macro in commented debug code as recently introduced IOMMU_PAGE_XXX macros do not include IOMMU_PAGE_SHIFT. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson

[PATCH kernel v11 16/34] powerpc/spapr: vfio: Replace iommu_table with iommu_table_group

2015-05-29 Thread Alexey Kardashevskiy
Modern IBM POWERPC systems support multiple (currently two) TCE tables per IOMMU group (a.k.a. PE). This adds an iommu_table_group container for TCE tables. Right now just one table is supported. This defines iommu_table_group struct which stores pointers to iommu_group and iommu_table(s). This rep

[PATCH kernel v11 31/34] vfio: powerpc/spapr: powerpc/powernv/ioda2: Use DMA windows API in ownership control

2015-05-29 Thread Alexey Kardashevskiy
Before the IOMMU user (VFIO) would take control over the IOMMU table belonging to a specific IOMMU group. This approach did not allow sharing tables between IOMMU groups attached to the same container. This introduces a new IOMMU ownership flavour when the user can not just control the existing IO

[PATCH kernel v11 21/34] powerpc/powernv/ioda2: Add TCE invalidation for all attached groups

2015-05-29 Thread Alexey Kardashevskiy
The iommu_table struct keeps a list of IOMMU groups it is used for. At the moment there is just a single group attached but further patches will add TCE table sharing. When sharing is enabled, the TCE cache in each PE needs to be invalidated, which this patch does. This does not change pnv_pci_ioda1_tce_in

[PATCH kernel v11 33/34] vfio: powerpc/spapr: Register memory and define IOMMU v2

2015-05-29 Thread Alexey Kardashevskiy
The existing implementation accounts the whole DMA window in the locked_vm counter. This is going to be worse with multiple containers and huge DMA windows. Also, real-time accounting would require additional tracking of accounted pages due to the page size difference - IOMMU uses 4K pages and syst

[PATCH kernel v11 34/34] vfio: powerpc/spapr: Support Dynamic DMA windows

2015-05-29 Thread Alexey Kardashevskiy
This adds create/remove window ioctls to create and remove DMA windows. sPAPR defines a Dynamic DMA windows capability which allows para-virtualized guests to create additional DMA windows on a PCI bus. The existing Linux kernels use this new window to map the entire guest memory and switch to the

[PATCH kernel v11 28/34] vfio: powerpc/spapr: powerpc/powernv/ioda: Define and implement DMA windows API

2015-05-29 Thread Alexey Kardashevskiy
This extends iommu_table_group_ops by a set of callbacks to support dynamic DMA windows management. create_table() creates a TCE table with specific parameters. It receives iommu_table_group to know nodeid in order to allocate TCE table memory closer to the PHB. The exact format of allocated multi

[PATCH kernel v11 09/34] vfio: powerpc/spapr: Move locked_vm accounting to helpers

2015-05-29 Thread Alexey Kardashevskiy
This moves locked pages accounting to helpers. Later they will be reused for Dynamic DMA windows (DDW). This reworks debug messages to show the current value and the limit. This stores the locked pages number in the container so when unlocking the iommu table pointer won't be needed. This does n
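
The shape of such accounting helpers can be sketched as below. The structures and function names are assumptions for illustration; in particular, the limit is a plain field here rather than a process resource limit.

```c
#include <errno.h>

/* Hypothetical per-process accounting state. */
struct mm_acct {
    unsigned long locked_vm;  /* pages currently accounted as locked */
    unsigned long limit;      /* maximum pages allowed to be locked */
};

/* Hypothetical container remembering how many pages it accounted. */
struct container {
    unsigned long locked_pages;
};

/* Account npages, failing without side effects if the limit is exceeded. */
static int container_lock(struct container *c, struct mm_acct *mm,
                          unsigned long npages)
{
    if (mm->locked_vm > mm->limit || npages > mm->limit - mm->locked_vm)
        return -ENOMEM;
    mm->locked_vm += npages;
    c->locked_pages += npages;
    return 0;
}

/* Undo the accounting using only the count stored in the container. */
static void container_unlock(struct container *c, struct mm_acct *mm)
{
    unsigned long npages = c->locked_pages;

    if (npages > mm->locked_vm)
        npages = mm->locked_vm;
    mm->locked_vm -= npages;
    c->locked_pages = 0;
}
```

Storing the count in the container means the unlock path does not have to recompute the size from a table, which matches the motivation given in the message.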

[PATCH kernel v11 04/34] powerpc/iommu: Put IOMMU group explicitly

2015-05-29 Thread Alexey Kardashevskiy
So far an iommu_table lifetime was the same as PE. Dynamic DMA windows will change this and iommu_free_table() will not always require the group to be released. This moves iommu_group_put() out of iommu_free_table(). This adds an iommu_pseries_free_table() helper which does iommu_group_put() and i

[PATCH kernel v11 30/34] powerpc/iommu/ioda2: Add get_table_size() to calculate the size of future table

2015-05-29 Thread Alexey Kardashevskiy
This adds a way for the IOMMU user to know how much a new table will use so it can be accounted in the locked_vm limit before allocation happens. This stores the allocated table size in pnv_pci_ioda2_get_table_size() so the locked_vm counter can be updated correctly when a table is being disposed.

[PATCH kernel v11 10/34] vfio: powerpc/spapr: Disable DMA mappings on disabled container

2015-05-29 Thread Alexey Kardashevskiy
At the moment DMA map/unmap requests are handled irrespective of the container's state. This allows the user space to pin memory which it might not be allowed to pin. This adds checks to MAP/UNMAP that the container is enabled, otherwise -EPERM is returned. Signed-off-by: Alexey Kardashevskiy [a
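
The guard described here amounts to checking a state flag before doing any work. A minimal sketch, with an invented container structure and function names:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical container state; not the VFIO driver's structure. */
struct container {
    bool enabled;
    unsigned long mapped_pages;
};

/* Refuse to map anything while the container is disabled. */
static int container_map(struct container *c, unsigned long npages)
{
    if (!c->enabled)
        return -EPERM;
    c->mapped_pages += npages;
    return 0;
}

/* Unmap is guarded the same way. */
static int container_unmap(struct container *c, unsigned long npages)
{
    if (!c->enabled)
        return -EPERM;
    if (npages > c->mapped_pages)
        npages = c->mapped_pages;
    c->mapped_pages -= npages;
    return 0;
}
```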

[PATCH kernel v11 03/34] powerpc/powernv/ioda: Clean up IOMMU group registration

2015-05-29 Thread Alexey Kardashevskiy
The existing code has 3 calls to iommu_register_group() and all 3 branches actually cover all possible cases. This replaces 3 calls with one and moves the registration earlier; the latter will make more sense when we add TCE table sharing. Signed-off-by: Alexey Kardashevskiy Reviewed-by: Gavin S

[PATCH kernel v11 11/34] vfio: powerpc/spapr: Moving pinning/unpinning to helpers

2015-05-29 Thread Alexey Kardashevskiy
This is a pretty mechanical patch to make next patches simpler. New tce_iommu_unuse_page() helper does put_page() now but it might skip that after the memory registering patch is applied. As we are here, this removes unnecessary checks for a value returned by pfn_to_page() as it cannot possibly retu

[PATCH kernel v11 32/34] powerpc/mmu: Add userspace-to-physical addresses translation cache

2015-05-29 Thread Alexey Kardashevskiy
We are adding support for DMA memory pre-registration to be used in conjunction with VFIO. The idea is that the userspace which is going to run a guest may want to pre-register a user space memory region so it all gets pinned once and never goes away. Having this done, a hypervisor will not have to

[PATCH kernel v11 05/34] powerpc/iommu: Always release iommu_table in iommu_free_table()

2015-05-29 Thread Alexey Kardashevskiy
At the moment iommu_free_table() only releases memory if the table was initialized for the platform code use, i.e. it had it_map initialized (whose purpose is to track DMA memory space use). With dynamic DMA windows, we will need to be able to release iommu_table even if it was used for VFIO in wh

[PATCH kernel v11 17/34] powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group

2015-05-29 Thread Alexey Kardashevskiy
Modern IBM POWERPC systems support multiple (currently two) TCE tables per IOMMU group (a.k.a. PE). This adds an iommu_table_group container for TCE tables. Right now just one table is supported. For IODA, instead of embedding iommu_table, the new iommu_table_group keeps pointers to those. The iomm

[PATCH kernel v11 15/34] powerpc/powernv/ioda/ioda2: Rework TCE invalidation in tce_build()/tce_free()

2015-05-29 Thread Alexey Kardashevskiy
The pnv_pci_ioda_tce_invalidate() helper invalidates TCE cache. It is supposed to be called on IODA1/2 and not called on p5ioc2. It receives start and end host addresses of TCE table. IODA2 actually needs PCI addresses to invalidate the cache. Those can be calculated from host addresses but since

[PATCH kernel v11 07/34] vfio: powerpc/spapr: Check that IOMMU page is fully contained by system page

2015-05-29 Thread Alexey Kardashevskiy
This checks that the TCE table page size is not bigger than the size of a page we just pinned and are going to put its physical address to the table. Otherwise the hardware gets unwanted access to physical memory between the end of the actual page and the end of the aligned up TCE page. Since compoun
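
Expressed with page shifts (log2 of the page size), the containment condition is a single comparison. The function name and parameters are assumptions for illustration; the sketch ignores how the pinned page's size is actually determined.

```c
#include <stdbool.h>

/*
 * An IOMMU page of size 2^iommu_page_shift fits inside a pinned page of
 * size 2^pinned_page_shift only if it is not the larger of the two.
 */
static bool iommu_page_is_contained(unsigned pinned_page_shift,
                                    unsigned iommu_page_shift)
{
    return pinned_page_shift >= iommu_page_shift;
}
```

For instance, a 4K IOMMU page (shift 12) is contained by a 64K system page (shift 16), while a 64K IOMMU page backed by a single 4K system page is not, which is the case that would expose memory past the end of the pinned page.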

[PATCH kernel v11 20/34] powerpc/powernv/ioda2: Move TCE kill register address to PE

2015-05-29 Thread Alexey Kardashevskiy
At the moment the DMA setup code looks for the "ibm,opal-tce-kill" property which contains the TCE kill register address. Writing to this register invalidates TCE cache on IODA/IODA2 hub. This moves the register address from iommu_table to pnv_phb as this register belongs to PHB and invalidates TC

[PATCH v5 03/12] KVM: arm64: guest debug, define API headers

2015-05-29 Thread Alex Bennée
This commit defines the API headers for guest debugging. There are two architecture specific debug structures: - kvm_guest_debug_arch, allows us to pass in HW debug registers - kvm_debug_exit_arch, signals exception and possible faulting address The type of debugging being used is controlled

[PATCH v5 00/12] KVM Guest Debug support for arm64

2015-05-29 Thread Alex Bennée
Here is V5 of the KVM Guest Debug support for arm64. The changes are fairly minimal from the last round: - dropped KVM_GUESTDBG_USE_SW/HW_BP unifying patch (ABI break) - new comment patch to fix comments in hyp.S (also sent separately) - simplified singlestep code (no longer needs to preser

[PATCH v5 05/12] KVM: arm: introduce kvm_arm_init/setup/clear_debug

2015-05-29 Thread Alex Bennée
This is a precursor for later patches which will need to do more to set up debug state before entering the hyp.S switch code. The existing functionality for setting mdcr_el2 has been moved out of hyp.S and now uses the value kept in vcpu->arch.mdcr_el2. As the assembler used to previously mask and

[PATCH v5 09/12] KVM: arm64: introduce vcpu->arch.debug_ptr

2015-05-29 Thread Alex Bennée
This introduces a level of indirection for the debug registers. Instead of using the sys_regs[] directly we store registers in a structure in the vcpu. As we are no longer tied to the layout of the sys_regs[] we can make the copies size appropriate for control and value registers. This also entail

[PATCH v5 01/12] KVM: add comments for kvm_debug_exit_arch struct

2015-05-29 Thread Alex Bennée
Bring into line with the comments for the other structures and their KVM_EXIT_* cases. Also update api.txt to reflect use in kvm_run documentation. Signed-off-by: Alex Bennée Reviewed-by: David Hildenbrand Reviewed-by: Andrew Jones Acked-by: Christoffer Dall --- v2 - add comments for other

[PATCH v5 06/12] KVM: arm64: guest debug, add SW break point support

2015-05-29 Thread Alex Bennée
This adds support for SW breakpoints inserted by userspace. We do this by trapping all guest software debug exceptions to the hypervisor (MDCR_EL2.TDE). The exit handler sets an exit reason of KVM_EXIT_DEBUG with the kvm_debug_exit_arch structure holding the exception syndrome information. It wil

[PATCH v5 02/12] KVM: arm64: fix misleading comments in save/restore

2015-05-29 Thread Alex Bennée
The elr_el2 and spsr_el2 registers in fact contain the processor state before entry into the hypervisor code. In the case of guest state it could be in either el0 or el1. Signed-off-by: Alex Bennée --- arch/arm64/kvm/hyp.S | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git

[PATCH v5 08/12] KVM: arm64: re-factor hyp.S debug register code

2015-05-29 Thread Alex Bennée
This is a precursor to sharing the code with the guest debug support. This replaces the big macro that fishes data out of a fixed location with a more general helper macro to restore a set of debug registers. It uses macro substitution so it can be re-used for debug control and value registers. It

[PATCH v5 04/12] KVM: arm: guest debug, add stub KVM_SET_GUEST_DEBUG ioctl

2015-05-29 Thread Alex Bennée
This commit adds a stub function to support the KVM_SET_GUEST_DEBUG ioctl. Any unsupported flag will return -EINVAL. For now, only KVM_GUESTDBG_ENABLE is supported, although it won't have any effects. Signed-off-by: Alex Bennée Reviewed-by: Christoffer Dall --- v2 - simplified form of the io
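
Rejecting unsupported flags is typically done by masking the request against the set of known flags. A minimal sketch, in which DBG_ENABLE and set_guest_debug() are hypothetical stand-ins rather than the KVM definitions:

```c
#include <errno.h>
#include <stdint.h>

#define DBG_ENABLE     0x1u        /* hypothetical stand-in for the enable flag */
#define DBG_VALID_MASK DBG_ENABLE  /* every flag the stub accepts */

/* Accept only known flags; accepted flags have no effect in this stub. */
static int set_guest_debug(uint32_t control)
{
    if (control & ~DBG_VALID_MASK)
        return -EINVAL;
    return 0;
}
```

Later support for more flags would then only need to widen the mask.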

[PATCH v5 07/12] KVM: arm64: guest debug, add support for single-step

2015-05-29 Thread Alex Bennée
This adds support for single-stepping the guest. To do this we need to manipulate the guest's PSTATE.SS and MDSCR_EL1.SS bits which we do in the kvm_arm_setup/clear_debug() so we don't affect the apparent state of the guest. Additionally while the host is debugging the guest we suppress the ability

[PATCH v5 11/12] KVM: arm64: enable KVM_CAP_SET_GUEST_DEBUG

2015-05-29 Thread Alex Bennée
Finally advertise the KVM capability for SET_GUEST_DEBUG. Once arm support is added this check can be moved to the common kvm_vm_ioctl_check_extension() code. Signed-off-by: Alex Bennée Acked-by: Christoffer Dall --- v3: - separated capability check from previous patches - moved into arm64 s

[PATCH v5 12/12] KVM: arm64: add trace points for guest_debug debug

2015-05-29 Thread Alex Bennée
This includes trace points for: kvm_arch_setup_guest_debug kvm_arch_clear_guest_debug I've also added some generic register setting trace events and also a trace point to dump the array of hardware registers. Signed-off-by: Alex Bennée --- v3 - add trace event for debug access. - remove

[PATCH v5 10/12] KVM: arm64: guest debug, HW assisted debug support

2015-05-29 Thread Alex Bennée
This adds support for userspace to control the HW debug registers for guest debug. In the debug ioctl we copy the IMPDEF defined number of registers into a new register set called host_debug_state. There is now a new vcpu parameter called debug_ptr which selects which register set is to be copied into
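
Selecting a register set through a pointer can be sketched as follows. Only the names host_debug_state and debug_ptr come from the message; the register layout, the guest_debug_state field and the helper names are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, much-reduced debug register set. */
struct debug_regs {
    uint64_t bcr[2];  /* breakpoint control registers */
    uint64_t bvr[2];  /* breakpoint value registers */
};

struct vcpu {
    struct debug_regs guest_debug_state;  /* what the guest programmed */
    struct debug_regs host_debug_state;   /* what userspace asked for */
    struct debug_regs *debug_ptr;         /* set to load on guest entry */
};

/* Point debug_ptr at the host-supplied set while HW debug is in use. */
static void debug_setup(struct vcpu *v, bool use_hw_debug)
{
    v->debug_ptr = use_hw_debug ? &v->host_debug_state
                                : &v->guest_debug_state;
}

/* Fall back to the guest's own registers when debugging stops. */
static void debug_clear(struct vcpu *v)
{
    v->debug_ptr = &v->guest_debug_state;
}
```

The switch code then only ever dereferences debug_ptr and does not need to know which of the two sets is active.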

[PATCH 01/13] KVM: arm/arm64: VGIC: don't track used LRs in the distributor

2015-05-29 Thread Andre Przywara
Currently we track which IRQ has been mapped to which VGIC list register and also have to synchronize both. We used to do this to hold some extra state (for instance the active bit). It turns out that this extra state in the LRs is no longer needed and this extra tracking causes some pain later. Re

[PATCH 00/13] arm64: KVM: GICv3 ITS emulation

2015-05-29 Thread Andre Przywara
The GICv3 ITS (Interrupt Translation Service) is a part of the ARM GICv3 interrupt controller used for implementing MSIs. It specifies a new kind of interrupts (LPIs), which are mapped to establish a connection between a device, its MSI payload value and the target processor the IRQ is eventually d

[PATCH 02/13] KVM: extend struct kvm_msi to hold a 32-bit device ID

2015-05-29 Thread Andre Przywara
The ARM GICv3 ITS MSI controller requires a device ID to be able to assign the proper interrupt vector. On real hardware, this ID is sampled from the bus. To be able to emulate an ITS controller, extend the KVM MSI interface to let userspace provide such a device ID. For PCI devices, the device ID

[PATCH 03/13] KVM: arm/arm64: add emulation model specific destroy function

2015-05-29 Thread Andre Przywara
Currently we destroy the VGIC emulation in one function that cares for all emulated models. The ITS emulation will require some differentiation, so introduce a per-emulation-model destroy method. Use it for a tiny GICv3 specific code already. Signed-off-by: Andre Przywara --- include/kvm/arm_vgi

[PATCH 10/13] KVM: arm64: sync LPI properties and status between guest and KVM

2015-05-29 Thread Andre Przywara
The properties and status of the GICv3 LPIs are held in tables in (guest) memory. To achieve reasonable performance, we cache this data in our own data structures, so we need to sync those two views from time to time. This behaviour is well described in the GICv3 spec and is also exercised by hardw

[PATCH 06/13] KVM: arm64: introduce ITS emulation file with stub functions

2015-05-29 Thread Andre Przywara
The ARM GICv3 ITS emulation code goes into a separate file, but needs to be connected to the GICv3 emulation, of which it is an option. Introduce the skeleton with function stubs to be filled later. Introduce the basic ITS data structure and initialize it, but don't return any success yet, as we a

[PATCH 09/13] KVM: arm64: handle pending bit for LPIs in ITS emulation

2015-05-29 Thread Andre Przywara
As the actual LPI number in a guest can be quite high, but is mostly assigned using a very sparse allocation scheme, bitmaps and arrays for storing the virtual interrupt status are a waste of memory. We use our equivalent of the "Interrupt Translation Table Entry" (ITTE) to hold this extra status i
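
Keeping the pending state inside a per-interrupt entry, rather than in a bitmap sized for the whole LPI number space, can be sketched with a simple linked list. The structure and function names are assumptions for illustration and do not reflect the patch's data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical translation entry carrying its own pending flag. */
struct itte {
    struct itte *next;
    uint32_t lpi;   /* interrupt number this entry maps to */
    bool pending;
};

/* Walk the list looking for the entry of a given LPI. */
static struct itte *itte_find(struct itte *head, uint32_t lpi)
{
    for (struct itte *e = head; e != NULL; e = e->next)
        if (e->lpi == lpi)
            return e;
    return NULL;
}

/* Mark an LPI pending; returns false if no entry maps that LPI. */
static bool itte_set_pending(struct itte *head, uint32_t lpi)
{
    struct itte *e = itte_find(head, lpi);

    if (e == NULL)
        return false;
    e->pending = true;
    return true;
}
```

Memory use then scales with the number of mapped interrupts instead of with the highest LPI number in use, at the cost of a list walk per lookup.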

[PATCH 05/13] KVM: arm64: handle ITS related GICv3 redistributor registers

2015-05-29 Thread Andre Przywara
In the GICv3 redistributor there are the PENDBASER and PROPBASER registers which we did not emulate so far, as they only make sense when having an ITS. In preparation for that emulate those MMIO accesses by storing the 64-bit data written into it into a variable which we later read in the ITS emula

[PATCH 08/13] KVM: arm64: add data structures to model ITS interrupt translation

2015-05-29 Thread Andre Przywara
The GICv3 Interrupt Translation Service (ITS) uses tables in memory to allow a sophisticated interrupt routing. It features device tables, an interrupt table per device and a table connecting "collections" to actual CPUs (aka. redistributors in the GICv3 lingo). Since the interrupt numbers for the

[PATCH 07/13] KVM: arm64: implement basic ITS register handlers

2015-05-29 Thread Andre Przywara
Add emulation for some basic MMIO registers used in the ITS emulation. This includes: - GITS_{CTLR,TYPER,IIDR} - ID registers - GITS_{CBASER,CREAD,CWRITER} those implement the ITS command buffer handling Signed-off-by: Andre Przywara --- include/kvm/arm_vgic.h | 3 + include/linu

[PATCH 12/13] KVM: arm64: implement MSI injection in ITS emulation

2015-05-29 Thread Andre Przywara
When userland wants to inject an MSI into the guest, we have to use our data structures to find the LPI number and the VCPU to receive the interrupt. Use the wrapper functions to iterate the linked lists and find the proper Interrupt Translation Table Entry. Then set the pending bit in this ITTE to

[PATCH 11/13] KVM: arm64: implement ITS command queue command handlers

2015-05-29 Thread Andre Przywara
The connection between a device, an event ID, the LPI number and the allocated CPU is stored in in-memory tables in a GICv3, but their format is not specified by the spec. Instead software uses a command queue to let the ITS implementation use their own format. Implement handlers for the various IT

[PATCH 04/13] KVM: arm64: Introduce new MMIO region for the ITS base address

2015-05-29 Thread Andre Przywara
The ARM GICv3 ITS controller requires a separate register frame to cover ITS specific registers. Add a new VGIC address type and store the address in a field in the vgic_dist structure. Provide a function to check whether userland has provided the address, so ITS functionality can be guarded by tha

[PATCH 13/13] KVM: arm64: enable ITS emulation as a virtual MSI controller

2015-05-29 Thread Andre Przywara
If userspace has provided a base address for the ITS register frame, we enable the bits that advertise LPIs in the GICv3. When the guest has enabled LPIs and the ITS, we enable the emulation part by initializing the ITS data structures and trapping on ITS register frame accesses by the guest. Also

[PATCH] virtio: fix fsync() on a directory

2015-05-29 Thread Russell King
dpkg in the guest fails when it tries to use fsync() on a directory: openat(AT_FDCWD, "/var/lib/dpkg", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 4 fsync(4)= -1 EINVAL (Invalid argument) stracing lkvm shows that this is converted to: openat(AT_FDCWD

Re: [PATCH V1 4/5] kvm: arm64: Implement ACPI probing code for GICv2

2015-05-29 Thread Andrew Jones
On Thu, May 28, 2015 at 01:34:33AM -0400, Wei Huang wrote: > This patch enables ACPI support for KVM virtual GICv2. KVM parses > ACPI table for virt GIC related information and initializes resources. > > Signed-off-by: Alexander Spyridaki > Signed-off-by: Wei Huang > --- > virt/kvm/arm/vgic-v

[PATCH] Remove visible dependency files

2015-05-29 Thread Russell King
After building, there is a lot of clutter from the dependency system. Let's clean this up by using dir/.file.d style dependencies, similar to those used in the Linux kernel. In order to support this, rearrange the dependency generation to create the dependency files as we build rather than as a se

Re: [PATCH V1 4/5] kvm: arm64: Implement ACPI probing code for GICv2

2015-05-29 Thread Wei Huang
On 05/29/2015 09:06 AM, Andrew Jones wrote: > On Thu, May 28, 2015 at 01:34:33AM -0400, Wei Huang wrote: >> This patch enables ACPI support for KVM virtual GICv2. KVM parses >> ACPI table for virt GIC related information and initializes resources. >> >> Signed-off-by: Alexander Spyridaki >> Si

[PATCH v5 2/6] target-arm: kvm64: introduce kvm_arm_init_debug()

2015-05-29 Thread Alex Bennée
As we haven't always had guest debug support we need to probe for it. Additionally we don't do this in the start-up capability code so we don't fall over on old kernels. Signed-off-by: Alex Bennée --- target-arm/kvm64.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/targ

[PATCH v5 4/6] target-arm: kvm - support for single step

2015-05-29 Thread Alex Bennée
This adds support for single-step. There isn't much to do on the QEMU side as after we set-up the request for single step via the debug ioctl it is all handled within the kernel. Signed-off-by: Alex Bennée --- v2 - convert to using HSR_EC v3 - use internals.h definitions --- target-arm/kvm.

[PATCH v5 0/6] QEMU support for KVM Guest Debug on arm64

2015-05-29 Thread Alex Bennée
Hi, You may be wondering what happened to v3 and v4. They do exist but they didn't change much from the original patches as I've been mostly looking at the kernel side of the equation. So in summary the changes are: - updates to the kernel ABI - don't fall over on kernels without debug suppo

[PATCH v5 1/6] linux-headers: sync from my kernel tree (DEV)

2015-05-29 Thread Alex Bennée
I assume I'll properly merge the KVM Headers direct from Linux when the kernel side is upstream. These headers came from: https://git.linaro.org/people/alex.bennee/linux.git/shortlog/refs/heads/guest-debug/4.1-rc5-v5 Signed-off-by: Alex Bennée --- v2 - update ABI to include ->far v3 - updat

[PATCH v5 6/6] target-arm: kvm - re-inject guest debug exceptions

2015-05-29 Thread Alex Bennée
From: Alex Bennée If we can't find details for the debug exception in our debug state then we can assume the exception is due to debugging inside the guest. To inject the exception into the guest state we re-use the TCG exception code (do_interupt). However while guest debugging is in effect we

[PATCH v5 5/6] target-arm: kvm - add support for HW assisted debug

2015-05-29 Thread Alex Bennée
This adds basic support for HW assisted debug. The ioctl interface to KVM allows us to pass an implementation defined number of break and watch point registers. When KVM_GUESTDBG_USE_HW_BP is specified these debug registers will be installed in place on the world switch into the guest. The hardwar

[PATCH v5 3/6] target-arm: kvm - implement software breakpoints

2015-05-29 Thread Alex Bennée
These don't involve messing around with debug registers, just setting the breakpoint instruction in memory. GDB will not use this mechanism if it can't access the memory to write the breakpoint. All the kernel has to do is ensure the hypervisor traps the breakpoint exceptions and returns to usersp

[PATCH] KVM: arm: vgic: Drop useless Group0 warning

2015-05-29 Thread Marc Zyngier
If a GICv3-enabled guest tries to configure Group0, we print a warning on the console (because we don't support Group0 interrupts). This is fairly pointless, and would allow a guest to spam the console. Let's just drop the warning. Signed-off-by: Marc Zyngier --- virt/kvm/arm/vgic-v3-emul.c | 2

Re: [PATCH v2 04/13] KVM: x86: API changes for SMM support

2015-05-29 Thread Radim Krčmář
2015-05-27 19:05+0200, Paolo Bonzini: > This patch includes changes to the external API for SMM support. > All the changes are predicated by the availability of a new > capability, KVM_CAP_X86_SMM, which is added at the end of the > patch series. > > Signed-off-by: Paolo Bonzini > --- > diff --gi

Re: [PATCH v2 00/13] SMM implementation for KVM

2015-05-29 Thread Radim Krčmář
I found a corner case that doesn't fit any specific patch: We allow INIT while in SMM. This brings some security complications as we also don't reset hflags (another long standing bug?), but we don't really need to because INIT in SMM is against the spec anyway; APM May 2013 2:10.3.3 Exceptions a

Re: [PATCH v2 07/13] KVM: add vcpu-specific functions to read/write/translate GFNs

2015-05-29 Thread Radim Krčmář
2015-05-27 19:05+0200, Paolo Bonzini: > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > @@ -1616,6 +1727,27 @@ int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const > void *data, | int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data, | unsigned long len) |

Re: [PATCH v2 00/13] SMM implementation for KVM

2015-05-29 Thread Paolo Bonzini
On 29/05/2015 21:03, Radim Krčmář wrote: > I found a corner case that doesn't fit any specific patch: > > We allow INIT while in SMM. This brings some security complications as > we also don't reset hflags (another long standing bug?), but we don't > really need to because INIT in SMM is agains

Re: [PATCH v2] arm/arm64: KVM: Properly account for guest CPU time

2015-05-29 Thread Mario Smarduch
On 05/28/2015 11:49 AM, Christoffer Dall wrote: > Until now we have been calling kvm_guest_exit after re-enabling > interrupts when we come back from the guest, but this has the > unfortunate effect that CPU time accounting done in the context of timer > interrupts occurring while the guest is runn