On Fri, Sep 06, 2013 at 04:57:05PM +0200, Paolo Bonzini wrote: > Il 25/08/2013 17:47, Alexander Graf ha scritto: > > > > On 23.08.2013, at 21:10, Christoffer Dall wrote: > > > >> Save and restore the ARM KVM VGIC state from the kernel. We rely on > >> QEMU to marshal the GICState data structure and therefore simply > >> synchronize the kernel state with the QEMU emulated state in both > >> directions. > >> > >> We take some care on the restore path to check the VGIC has been > >> configured with enough IRQs and CPU interfaces that we can properly > >> restore the state, and for separate set/clear registers we first fully > >> clear the registers and then set the required bits. > >> > >> Signed-off-by: Christoffer Dall <christoffer.d...@linaro.org> > >> --- > >> hw/intc/arm_gic_common.c | 2 + > >> hw/intc/arm_gic_kvm.c | 418 > >> ++++++++++++++++++++++++++++++++++++++++++- > >> hw/intc/gic_internal.h | 1 + > >> include/migration/vmstate.h | 6 + > >> 4 files changed, 425 insertions(+), 2 deletions(-) > >> > >> diff --git a/hw/intc/arm_gic_common.c b/hw/intc/arm_gic_common.c > >> index a50ded2..f39bdc1 100644 > >> --- a/hw/intc/arm_gic_common.c > >> +++ b/hw/intc/arm_gic_common.c > >> @@ -77,6 +77,8 @@ static const VMStateDescription vmstate_gic = { > >> VMSTATE_UINT16_ARRAY(running_priority, GICState, NCPU), > >> VMSTATE_UINT16_ARRAY(current_pending, GICState, NCPU), > >> VMSTATE_UINT8_2DARRAY(binary_point, GICState, 2, NCPU), > >> + VMSTATE_UINT32_2DARRAY(active_prio, GICState, 4, NCPU), > >> + VMSTATE_UINT32(num_irq, GICState), > >> VMSTATE_END_OF_LIST() > >> } > >> }; > >> diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c > >> index 9f0a852..9785281 100644 > >> --- a/hw/intc/arm_gic_kvm.c > >> +++ b/hw/intc/arm_gic_kvm.c > >> @@ -23,6 +23,15 @@ > >> #include "kvm_arm.h" > >> #include "gic_internal.h" > >> > >> +//#define DEBUG_GIC_KVM > >> + > >> +#ifdef DEBUG_GIC_KVM > >> +#define DPRINTF(fmt, ...) \ > >> +do { fprintf(stderr, "arm_gic: " fmt , ## __VA_ARGS__); } while (0) > >> +#else > >> +#define DPRINTF(fmt, ...) do {} while(0) > > > > You really want to make DPRINTF still be format checked by the compiler. > > Check out hw/intc/openpic.c for an example how to get there. > > > >> +#endif > >> + > >> #define TYPE_KVM_ARM_GIC "kvm-arm-gic" > >> #define KVM_ARM_GIC(obj) \ > >> OBJECT_CHECK(GICState, (obj), TYPE_KVM_ARM_GIC) > >> @@ -73,14 +82,419 @@ static void kvm_arm_gic_set_irq(void *opaque, int > >> irq, int level) > >> kvm_set_irq(kvm_state, kvm_irq, !!level); > >> } > >> > >> +static bool kvm_arm_gic_can_save_restore(GICState *s) > >> +{ > >> + KVMARMGICClass *kgc = KVM_ARM_GIC_GET_CLASS(s); > >> + return kgc->dev_fd >= 0; > >> +} > >> + > >> +static void kvm_arm_gic_reg_access(GICState *s, int group, int offset, > >> + int cpu, uint32_t *val, bool write) > >> +{ > >> + KVMARMGICClass *kgc = KVM_ARM_GIC_GET_CLASS(s); > >> + struct kvm_device_attr attr; > >> + int type; > >> + > >> + cpu = cpu & 0xff; > >> + > >> + attr.flags = 0; > >> + attr.group = group; > >> + attr.attr = (((uint64_t)cpu << KVM_DEV_ARM_VGIC_CPUID_SHIFT) & > >> + KVM_DEV_ARM_VGIC_CPUID_MASK) | > >> + (((uint64_t)offset << KVM_DEV_ARM_VGIC_OFFSET_SHIFT) & > >> + KVM_DEV_ARM_VGIC_OFFSET_MASK); > >> + attr.addr = (uint64_t)(long)val; > > > > uintptr_t? > > > >> + > >> + if (write) { > >> + type = KVM_SET_DEVICE_ATTR; > >> + } else { > >> + type = KVM_GET_DEVICE_ATTR; > >> + } > >> + > >> + if (kvm_device_ioctl(kgc->dev_fd, type, &attr) < 0) { > >> + fprintf(stderr, "KVM_{SET/GET}_DEVICE_ATTR failed: %s\n", > >> + strerror(errno)); > >> + abort(); > >> + } > >> +} > >> + > >> +static void kvm_arm_gic_dist_reg_access(GICState *s, int offset, int cpu, > >> + uint32_t *val, bool write) > >> +{ > >> + kvm_arm_gic_reg_access(s, KVM_DEV_ARM_VGIC_GRP_DIST_REGS, > >> + offset, cpu, val, write); > >> +} > >> + > >> +static void kvm_arm_gic_cpu_reg_access(GICState *s, int offset, int cpu, > >> + uint32_t *val, bool write) > >> +{ > >> + kvm_arm_gic_reg_access(s, KVM_DEV_ARM_VGIC_GRP_CPU_REGS, > >> + offset, cpu, val, write); > >> +} > >> + > >> +#define for_each_irq_reg(_ctr, _max_irq, _field_width) \ > >> + for (_ctr = 0; _ctr < ((_max_irq) / (32 / (_field_width))); _ctr++) > >> + > >> +/* > >> + * Translate from the in-kernel field for an IRQ value to/from the qemu > >> + * representation. > >> + */ > >> +typedef void (*vgic_translate_fn)(GICState *s, int irq, int cpu, > >> + uint32_t *field, bool to_kernel); > >> + > >> +/* synthetic translate function used for clear/set registers to completely > >> + * clear a setting using a clear-register before setting the remaing bits > >> + * using a set-register */ > >> +static void translate_clear(GICState *s, int irq, int cpu, > >> + uint32_t *field, bool to_kernel) > >> +{ > >> + if (to_kernel) { > >> + *field = ~0; > >> + } else { > >> + /* does not make sense: qemu model doesn't use set/clear regs */ > >> + abort(); > >> + } > > > > I don't understand the to_kernel bits. I thought we're in a device file > > that only works with KVM? > > The opposite of "to_kernel" is "from_kernel". > > IIUC this is for GIC registers that are modelled as a pair of MMIO > registers, one with write-1-sets and one with write-1-clears semantics. > The patch is first writing all ones to the w1c register, and then > writing the actual value to the w1s register. > > The way we solved this for x86 is that the set-registers, when called > from userspace, also clear the zero bits. See the "host_initiated" > parts in arch/x86/kvm/x86.c. However, this would be an ABI change for > ARM, so Christoffer's solution is also fine.
I prefer sticking to the ARM ABI so we don't have to keep a documentation of exceptions to that around to the widest extend possible. See my comment on Alex's e-mail for a more thorough response to this as well. Sorry for the long response time on this. > > One comment on the function names: > > kvm_arm_gic_dist_readr > kvm_arm_gic_dist_writer > > Why not get_reg/set_reg (I was quite surprised to see readr instead of > reader :) and it took me a while to understand the convention)? Or if > the name is too long, s/readr/get/ and s/writer/set/ would be enough to > match the ioctls. > r for register, read register, write register... I thought I'd seen this convention elsewhere, but I may be wrong. If you feel really strongly about it, I can rename. -Christoffer