date:20190310

Re: [Qemu-devel] [PATCH v4 4/9] {hmp, hw/pvrdma}: Expose device internals via monitor interface

2019-03-10 Thread Yuval Shaia

> 
> [...]
> > diff --git a/hw/rdma/rdma_hmp.c b/hw/rdma/rdma_hmp.c
> > new file mode 100644
> > index 00..c5814473c5
> > --- /dev/null
> > +++ b/hw/rdma/rdma_hmp.c
> > @@ -0,0 +1,30 @@
> > +/*
> > + * RDMA device: Human Monitor interface
> 
> The file name and this comment are a bit awkward.  Yes, you create
> TYPE_RDMA_STATS_PROVIDER for use in HMP info rdma, but there's
> absolutely nothing HMP-related in this file.  Same for rdma_hmp.h below.
> 
> Call them rdma_stats.c and rdma_stats.h?

Renamed to rdma.h and rdma.c; who knows what other things might be added in
the future.

> 
> > + *
> > + * Copyright (C) 2018 Oracle
> > + * Copyright (C) 2018 Red Hat Inc
> > + *
> > + * Authors:
> > + * Yuval Shaia 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or 
> > later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "hw/rdma/rdma_hmp.h"
> > +#include "qemu/module.h"
> > +
> > +static const TypeInfo rdma_hmp_info = {
> > +.name = TYPE_RDMA_STATS_PROVIDER,
> > +.parent = TYPE_INTERFACE,
> > +.class_size = sizeof(RdmaStatsProviderClass),
> > +};
> > +
> > +static void rdma_hmp_register_types(void)
> > +{
> > +type_register_static(&rdma_hmp_info);
> > +}
> > +
> > +type_init(rdma_hmp_register_types)
> 
> Also rename _hmp_ to _stats_.

Ditto.

>

[Qemu-devel] [PULL 05/60] target/ppc/spapr: Add SPAPR_CAP_LARGE_DECREMENTER

2019-03-10 Thread David Gibson

From: Suraj Jitindar Singh 

Add spapr_cap SPAPR_CAP_LARGE_DECREMENTER to be used to control the
availability of the large decrementer for a guest.

Signed-off-by: Suraj Jitindar Singh 
Message-Id: <20190301024317.22137-1-sjitindarsi...@gmail.com>
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c |  2 ++
 hw/ppc/spapr_caps.c| 17 +
 include/hw/ppc/spapr.h |  5 -
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index f7d527464c..e07e5370d3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2088,6 +2088,7 @@ static const VMStateDescription vmstate_spapr = {
 &vmstate_spapr_irq_map,
 &vmstate_spapr_cap_nested_kvm_hv,
 &vmstate_spapr_dtb,
+&vmstate_spapr_cap_large_decr,
 NULL
 }
 };
@@ -4302,6 +4303,7 @@ static void spapr_machine_class_init(ObjectClass *oc, 
void *data)
 smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
 smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 16; /* 64kiB */
 smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF;
+smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_OFF;
 spapr_caps_add_properties(smc, &error_abort);
 smc->irq = &spapr_irq_xics;
 smc->dr_phb_enabled = true;
diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 64f98ae68d..3f90f5823e 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -390,6 +390,13 @@ static void cap_nested_kvm_hv_apply(sPAPRMachineState 
*spapr,
 }
 }
 
+static void cap_large_decr_apply(sPAPRMachineState *spapr,
+ uint8_t val, Error **errp)
+{
+if (val)
+error_setg(errp, "No large decrementer support, try 
cap-large-decr=off");
+}
+
 sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
 [SPAPR_CAP_HTM] = {
 .name = "htm",
@@ -468,6 +475,15 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
 .type = "bool",
 .apply = cap_nested_kvm_hv_apply,
 },
+[SPAPR_CAP_LARGE_DECREMENTER] = {
+.name = "large-decr",
+.description = "Allow Large Decrementer",
+.index = SPAPR_CAP_LARGE_DECREMENTER,
+.get = spapr_cap_get_bool,
+.set = spapr_cap_set_bool,
+.type = "bool",
+.apply = cap_large_decr_apply,
+},
 };
 
 static sPAPRCapabilities default_caps_with_cpu(sPAPRMachineState *spapr,
@@ -596,6 +612,7 @@ SPAPR_CAP_MIG_STATE(cfpc, SPAPR_CAP_CFPC);
 SPAPR_CAP_MIG_STATE(sbbc, SPAPR_CAP_SBBC);
 SPAPR_CAP_MIG_STATE(ibs, SPAPR_CAP_IBS);
 SPAPR_CAP_MIG_STATE(nested_kvm_hv, SPAPR_CAP_NESTED_KVM_HV);
+SPAPR_CAP_MIG_STATE(large_decr, SPAPR_CAP_LARGE_DECREMENTER);
 
 void spapr_caps_init(sPAPRMachineState *spapr)
 {
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 59073a7579..8efc5e0779 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -74,8 +74,10 @@ typedef enum {
 #define SPAPR_CAP_HPT_MAXPAGESIZE   0x06
 /* Nested KVM-HV */
 #define SPAPR_CAP_NESTED_KVM_HV 0x07
+/* Large Decrementer */
+#define SPAPR_CAP_LARGE_DECREMENTER 0x08
 /* Num Caps */
-#define SPAPR_CAP_NUM   (SPAPR_CAP_NESTED_KVM_HV + 1)
+#define SPAPR_CAP_NUM   (SPAPR_CAP_LARGE_DECREMENTER + 1)
 
 /*
  * Capability Values
@@ -828,6 +830,7 @@ extern const VMStateDescription vmstate_spapr_cap_cfpc;
 extern const VMStateDescription vmstate_spapr_cap_sbbc;
 extern const VMStateDescription vmstate_spapr_cap_ibs;
 extern const VMStateDescription vmstate_spapr_cap_nested_kvm_hv;
+extern const VMStateDescription vmstate_spapr_cap_large_decr;
 
 static inline uint8_t spapr_get_cap(sPAPRMachineState *spapr, int cap)
 {
-- 
2.20.1

[Qemu-devel] [PULL 02/60] vfio/spapr: Rename local systempagesize variable

2019-03-10 Thread David Gibson

From: Alexey Kardashevskiy 

The "systempagesize" name suggests that it is the host system page size,
while it is actually the smallest page size of the memory backing the guest
RAM, so let's rename it to avoid confusion. This should cause no behavioral
change.
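
As an illustration of how the renamed value is used, the following is a minimal sketch of the page size selection visible in the hunk below: the guest IOMMU page size is clamped to the RAM backing page size, and the largest page size supported by the container that does not exceed it is chosen. The names pick_iommu_pagesize() and clz64_sketch() are hypothetical stand-ins, not QEMU functions, and the zero return for "no supported size" is an assumption of this sketch.

```c
#include <stdint.h>

/* Count leading zeros of a 64-bit value (stand-in for QEMU's clz64()). */
static unsigned clz64_sketch(uint64_t v)
{
    unsigned n = 0;

    if (v == 0) {
        return 64;
    }
    while (!(v & (1ULL << 63))) {
        v <<= 1;
        n++;
    }
    return n;
}

/*
 * pagesize:    IOMMU page size requested by the guest (power of two)
 * rampagesize: smallest page size backing guest RAM (power of two)
 * pgsizes:     bitmask of page sizes the container supports
 * Returns the chosen page size, or 0 if none fits (sketch-only convention).
 */
static uint64_t pick_iommu_pagesize(uint64_t pagesize, uint64_t rampagesize,
                                    uint64_t pgsizes)
{
    uint64_t candidates;

    /* Never use IOMMU pages bigger than the pages backing guest RAM. */
    if (pagesize > rampagesize) {
        pagesize = rampagesize;
    }
    /* Keep only the supported sizes that are <= pagesize. */
    candidates = pgsizes & (pagesize | (pagesize - 1));
    if (!candidates) {
        return 0;
    }
    /* The highest remaining bit is the largest usable page size. */
    return 1ULL << (63 - clz64_sketch(candidates));
}
```

For example, a 64K guest IOMMU page backed by 4K RAM pages ends up with 4K IOMMU pages, provided the container supports them.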

Signed-off-by: Alexey Kardashevskiy 
Message-Id: <20190227085149.38596-4-...@ozlabs.ru>
Signed-off-by: David Gibson 
---
 hw/vfio/spapr.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c
index 88437a79e6..57fe758e54 100644
--- a/hw/vfio/spapr.c
+++ b/hw/vfio/spapr.c
@@ -148,14 +148,14 @@ int vfio_spapr_create_window(VFIOContainer *container,
 uint64_t pagesize = memory_region_iommu_get_min_page_size(iommu_mr);
 unsigned entries, bits_total, bits_per_level, max_levels;
 struct vfio_iommu_spapr_tce_create create = { .argsz = sizeof(create) };
-long systempagesize = qemu_getrampagesize();
+long rampagesize = qemu_getrampagesize();
 
 /*
  * The host might not support the guest supported IOMMU page size,
  * so we will use smaller physical IOMMU pages to back them.
  */
-if (pagesize > systempagesize) {
-pagesize = systempagesize;
+if (pagesize > rampagesize) {
+pagesize = rampagesize;
 }
 pagesize = 1ULL << (63 - clz64(container->pgsizes &
(pagesize | (pagesize - 1))));
-- 
2.20.1

[Qemu-devel] [PULL 04/60] Revert "spapr: support memory unplug for qtest"

2019-03-10 Thread David Gibson

From: Greg Kurz 

Commit b8165118f52c broke CPU hotplug tests for old machine types:

$ QTEST_QEMU_BINARY=ppc64-softmmu/qemu-system-ppc64 ./tests/cpu-plug-test 
-m=slow
/ppc64/cpu-plug/pseries-3.1/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.12-sxxm/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-3.0/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.10/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.11/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.12/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.9/device-add/2x3x1&maxcpus=6: OK
/ppc64/cpu-plug/pseries-2.7/device-add/2x3x1&maxcpus=6: **
ERROR:/home/thuth/devel/qemu/hw/ppc/spapr_events.c:313:rtas_event_log_to_source:
 assertion failed: (source->enabled)
Broken pipe
/home/thuth/devel/qemu/tests/libqtest.c:143: kill_qemu() detected QEMU death 
from signal 6 (Aborted) (core dumped)
Aborted (core dumped)

The approach of faking the availability of OV5_HP_EVT causes the
code to assume the hotplug event source is enabled, which is wrong
for older machines.

This reverts commit b8165118f52ce5ee88565d3cec83d30374efdc96.

A subsequent patch will address the problem of CAS under qtest from
a different angle.

Reported-by: Thomas Huth 
Signed-off-by: Greg Kurz 
Message-Id: <155146875097.147873.1732264036668112686.st...@bahia.lan>
Tested-by: Michael Roth 
Reviewed-by: Michael Roth 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_ovec.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/hw/ppc/spapr_ovec.c b/hw/ppc/spapr_ovec.c
index 12510b236a..318bf33de4 100644
--- a/hw/ppc/spapr_ovec.c
+++ b/hw/ppc/spapr_ovec.c
@@ -16,7 +16,6 @@
 #include "qemu/bitmap.h"
 #include "exec/address-spaces.h"
 #include "qemu/error-report.h"
-#include "sysemu/qtest.h"
 #include "trace.h"
 #include 
 
@@ -132,11 +131,6 @@ bool spapr_ovec_test(sPAPROptionVector *ov, long bitnr)
 g_assert(ov);
 g_assert(bitnr < OV_MAXBITS);
 
-/* support memory unplug for qtest */
-if (qtest_enabled() && bitnr == OV5_HP_EVT) {
-return true;
-}
-
 return test_bit(bitnr, ov->bitmap) ? true : false;
 }
 
-- 
2.20.1

[Qemu-devel] [PULL 13/60] target/ppc: Move exception vector offset computation into a function

2019-03-10 Thread David Gibson

From: Fabiano Rosas 

Signed-off-by: Fabiano Rosas 
Reviewed-by: Alexey Kardashevskiy 
Message-Id: <20190228225759.21328-2-faro...@linux.ibm.com>
Signed-off-by: David Gibson 
---
 target/ppc/excp_helper.c | 30 +++---
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 39bedbb11d..beafcf1ebd 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -107,6 +107,24 @@ static int powerpc_reset_wakeup(CPUState *cs, CPUPPCState 
*env, int excp,
 return POWERPC_EXCP_RESET;
 }
 
+static uint64_t ppc_excp_vector_offset(CPUState *cs, int ail)
+{
+uint64_t offset = 0;
+
+switch (ail) {
+case AIL_0001_8000:
+offset = 0x18000;
+break;
+case AIL_C000___4000:
+offset = 0xc0004000ull;
+break;
+default:
+cpu_abort(cs, "Invalid AIL combination %d\n", ail);
+break;
+}
+
+return offset;
+}
 
 /* Note that this function should be greatly optimized
  * when called with a constant excp, from ppc_hw_interrupt
@@ -708,17 +726,7 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 /* Handle AIL */
 if (ail) {
 new_msr |= (1 << MSR_IR) | (1 << MSR_DR);
-switch(ail) {
-case AIL_0001_8000:
-vector |= 0x18000;
-break;
-case AIL_C000___4000:
-vector |= 0xc0004000ull;
-break;
-default:
-cpu_abort(cs, "Invalid AIL combination %d\n", ail);
-break;
-}
+vector |= ppc_excp_vector_offset(cs, ail);
 }
 
 #if defined(TARGET_PPC64)
-- 
2.20.1

[Qemu-devel] [PULL 01/60] vfio/spapr: Fix indirect levels calculation

2019-03-10 Thread David Gibson

From: Alexey Kardashevskiy 

The current code assumes that we can address more bits on a PCI bus
for DMA than we really can, but there is no way of knowing the actual limit.

This makes a better guess for the number of levels and, if the kernel
fails to allocate that, increases the number of levels until the
allocation succeeds or the 64-bit limit is reached.

This adds levels to the trace point.

This may cause the kernel to warn about failed allocation:
   [65122.837458] Failed to allocate a TCE memory, level shift=28
which might happen if MAX_ORDER is not large enough as it can vary:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/Kconfig?h=v5.0-rc2#n727
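
The level estimate described above can be sketched roughly as follows. This mirrors the arithmetic in the hunk below, but the function names (tce_levels_guess(), tce_levels_max(), ctz64_sketch()) are hypothetical, the host page size is passed in instead of calling getpagesize(), and 64-bit arithmetic is used for the entry count; window size and page size are assumed to be powers of two.

```c
#include <stdint.h>

/* Count trailing zeros of a 64-bit value (stand-in for QEMU's ctz64()). */
static unsigned ctz64_sketch(uint64_t v)
{
    unsigned n = 0;

    if (v == 0) {
        return 64;
    }
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
}

/* First guess for the number of TCE table levels. */
static unsigned tce_levels_guess(uint64_t window_size, unsigned page_shift,
                                 uint64_t host_page_size)
{
    uint64_t entries = window_size >> page_shift;
    /* Bits needed to index every byte of a flat TCE table. */
    unsigned bits_total = ctz64_sketch(entries * sizeof(uint64_t));
    /* Assume each level can be a contiguous allocation of 2^8 host pages. */
    unsigned bits_per_level = ctz64_sketch(host_page_size) + 8;
    unsigned levels = bits_total / bits_per_level;

    if (bits_total % bits_per_level) {
        ++levels;
    }
    return levels;
}

/* Upper bound on levels: the caller retries up to this many. */
static unsigned tce_levels_max(unsigned page_shift, uint64_t host_page_size)
{
    return (64 - page_shift) / ctz64_sketch(host_page_size);
}
```

With 64K host and IOMMU pages, a 1 GiB window needs 17 index bits and fits in one level, while a 1 TiB window needs 27 bits and starts at two levels. The caller would then retry the create ioctl with one more level each time until it succeeds or the maximum is exceeded.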

Signed-off-by: Alexey Kardashevskiy 
Message-Id: <20190227085149.38596-3-...@ozlabs.ru>
Signed-off-by: David Gibson 
---
 hw/vfio/spapr.c  | 43 +--
 hw/vfio/trace-events |  2 +-
 2 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c
index becf71a3fc..88437a79e6 100644
--- a/hw/vfio/spapr.c
+++ b/hw/vfio/spapr.c
@@ -143,10 +143,10 @@ int vfio_spapr_create_window(VFIOContainer *container,
  MemoryRegionSection *section,
  hwaddr *pgsize)
 {
-int ret;
+int ret = 0;
 IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr);
 uint64_t pagesize = memory_region_iommu_get_min_page_size(iommu_mr);
-unsigned entries, pages;
+unsigned entries, bits_total, bits_per_level, max_levels;
 struct vfio_iommu_spapr_tce_create create = { .argsz = sizeof(create) };
 long systempagesize = qemu_getrampagesize();
 
@@ -176,16 +176,38 @@ int vfio_spapr_create_window(VFIOContainer *container,
 create.window_size = int128_get64(section->size);
 create.page_shift = ctz64(pagesize);
 /*
- * SPAPR host supports multilevel TCE tables, there is some
- * heuristic to decide how many levels we want for our table:
- * 0..64 = 1; 65..4096 = 2; 4097..262144 = 3; 262145.. = 4
+ * SPAPR host supports multilevel TCE tables. We try to guess optimal
+ * levels number and if this fails (for example due to the host memory
+ * fragmentation), we increase levels. The DMA address structure is:
+ *  rxxx       
+ * where:
+ *   r = reserved (bits >= 55 are reserved in the existing hardware)
+ *   i = IOMMU page offset (64K in this example)
+ *   x = bits to index a TCE which can be split to equal chunks to index
+ *  within the level.
+ * The aim is to split "x" to smaller possible number of levels.
  */
 entries = create.window_size >> create.page_shift;
-pages = MAX((entries * sizeof(uint64_t)) / getpagesize(), 1);
-pages = MAX(pow2ceil(pages), 1); /* Round up */
-create.levels = ctz64(pages) / 6 + 1;
-
-ret = ioctl(container->fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
+/* bits_total is number of "x" needed */
+bits_total = ctz64(entries * sizeof(uint64_t));
+/*
+ * bits_per_level is a safe guess of how much we can allocate per level:
+ * 8 is the current minimum for CONFIG_FORCE_MAX_ZONEORDER and MAX_ORDER
+ * is usually bigger than that.
+ * Below we look at getpagesize() as TCEs are allocated from system pages.
+ */
+bits_per_level = ctz64(getpagesize()) + 8;
+create.levels = bits_total / bits_per_level;
+if (bits_total % bits_per_level) {
+++create.levels;
+}
+max_levels = (64 - create.page_shift) / ctz64(getpagesize());
+for ( ; create.levels <= max_levels; ++create.levels) {
+ret = ioctl(container->fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
+if (!ret) {
+break;
+}
+}
 if (ret) {
 error_report("Failed to create a window, ret = %d (%m)", ret);
 return -errno;
@@ -200,6 +222,7 @@ int vfio_spapr_create_window(VFIOContainer *container,
 return -EINVAL;
 }
 trace_vfio_spapr_create_window(create.page_shift,
+   create.levels,
create.window_size,
create.start_addr);
 *pgsize = pagesize;
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index ed2f333ad7..cf1e886818 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -129,6 +129,6 @@ vfio_prereg_listener_region_add_skip(uint64_t start, 
uint64_t end) "0x%"PRIx64"
 vfio_prereg_listener_region_del_skip(uint64_t start, uint64_t end) 
"0x%"PRIx64" - 0x%"PRIx64
 vfio_prereg_register(uint64_t va, uint64_t size, int ret) "va=0x%"PRIx64" 
size=0x%"PRIx64" ret=%d"
 vfio_prereg_unregister(uint64_t va, uint64_t size, int ret) "va=0x%"PRIx64" 
size=0x%"PRIx64" ret=%d"
-vfio_spapr_create_window(int ps, uint64_t ws, uint64_t off) "pageshift=0x%x 
winsize=0x%"PRIx64" offset=0x%"PRIx64
+vfio_spapr_create_window(int ps, unsigned int levels, uint64_t ws, uint64_t

[Qemu-devel] [PULL 07/60] target/ppc: Implement large decrementer support for KVM

2019-03-10 Thread David Gibson

From: Suraj Jitindar Singh 

Implement support to allow KVM guests to take advantage of the large
decrementer introduced on POWER9 cpus.

To determine if the host can support the requested large decrementer
size, we check that it matches the size specified in the ibm,dec-bits
device-tree property. We also need to enable it in KVM by setting the
LPCR_LD bit in the LPCR. To do this we try to set the bit, then read it
back to check whether the host allowed the change. If the bit is set we
can use the large decrementer; if not, the host cannot support it and we
must not use it.
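
The set-then-read-back probe can be illustrated without KVM by modelling a register in which the host silently ignores writes to bits it does not support. This is only a sketch: struct fake_host, its accessors and SKETCH_LPCR_LD (including its bit position) are hypothetical placeholders for the kvm_get_one_reg()/kvm_set_one_reg() calls and LPCR_LD used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit; the real LPCR_LD position is defined by QEMU, not here. */
#define SKETCH_LPCR_LD (1ULL << 17)

/* Toy host: only bits in writable_mask can be changed by the guest side. */
struct fake_host {
    uint64_t lpcr;
    uint64_t writable_mask;
};

static uint64_t fake_get_lpcr(const struct fake_host *h)
{
    return h->lpcr;
}

static void fake_set_lpcr(struct fake_host *h, uint64_t val)
{
    /* Writes to unsupported bits are dropped without an error. */
    h->lpcr = (h->lpcr & ~h->writable_mask) | (val & h->writable_mask);
}

/* Returns 0 if the bit ended up in the requested state, -1 otherwise. */
static int enable_large_decr(struct fake_host *h, bool enable)
{
    uint64_t lpcr = fake_get_lpcr(h);

    /* Only touch the register if the bit actually needs to change. */
    if (!!(lpcr & SKETCH_LPCR_LD) != enable) {
        if (enable) {
            lpcr |= SKETCH_LPCR_LD;
        } else {
            lpcr &= ~SKETCH_LPCR_LD;
        }
        fake_set_lpcr(h, lpcr);
        /* Read back: the write "succeeds" even when the bit is ignored. */
        lpcr = fake_get_lpcr(h);
        if (!!(lpcr & SKETCH_LPCR_LD) != enable) {
            return -1;
        }
    }
    return 0;
}
```

The read-back is needed because, in this model as in the commit message, a rejected bit does not produce an error from the write itself.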

Signed-off-by: Suraj Jitindar Singh 
Signed-off-by: Cédric Le Goater 
Message-Id: <20190301024317.22137-3-sjitindarsi...@gmail.com>
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_caps.c  | 18 --
 target/ppc/kvm.c | 39 +++
 target/ppc/kvm_ppc.h | 12 
 3 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 9a34d1f4ed..1e76685199 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -394,6 +394,7 @@ static void cap_large_decr_apply(sPAPRMachineState *spapr,
  uint8_t val, Error **errp)
 {
 PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
+PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 
 if (!val)
 return; /* Disabled by default */
@@ -405,8 +406,16 @@ static void cap_large_decr_apply(sPAPRMachineState *spapr,
 "Large decrementer only supported on POWER9, try -cpu POWER9");
 return;
 }
-} else {
-error_setg(errp, "No large decrementer support, try 
cap-large-decr=off");
+} else if (kvm_enabled()) {
+int kvm_nr_bits = kvmppc_get_cap_large_decr();
+
+if (!kvm_nr_bits) {
+error_setg(errp, "No large decrementer support, try 
cap-large-decr=off");
+} else if (pcc->lrg_decr_bits != kvm_nr_bits) {
+error_setg(errp,
+"KVM large decrementer size (%d) differs to model (%d), try 
-cap-large-decr=off",
+kvm_nr_bits, pcc->lrg_decr_bits);
+}
 }
 }
 
@@ -417,6 +426,11 @@ static void cap_large_decr_cpu_apply(sPAPRMachineState 
*spapr,
 CPUPPCState *env = &cpu->env;
 target_ulong lpcr = env->spr[SPR_LPCR];
 
+if (kvm_enabled()) {
+if (kvmppc_enable_cap_large_decr(cpu, val))
+error_setg(errp, "No large decrementer support, try 
cap-large-decr=off");
+}
+
 if (val)
 lpcr |= LPCR_LD;
 else
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index d01852fe31..3f650c8fc4 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -91,6 +91,7 @@ static int cap_ppc_safe_cache;
 static int cap_ppc_safe_bounds_check;
 static int cap_ppc_safe_indirect_branch;
 static int cap_ppc_nested_kvm_hv;
+static int cap_large_decr;
 
 static uint32_t debug_inst_opcode;
 
@@ -124,6 +125,7 @@ static bool kvmppc_is_pr(KVMState *ks)
 
 static int kvm_ppc_register_host_cpu_type(MachineState *ms);
 static void kvmppc_get_cpu_characteristics(KVMState *s);
+static int kvmppc_get_dec_bits(void);
 
 int kvm_arch_init(MachineState *ms, KVMState *s)
 {
@@ -151,6 +153,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 cap_resize_hpt = kvm_vm_check_extension(s, KVM_CAP_SPAPR_RESIZE_HPT);
 kvmppc_get_cpu_characteristics(s);
 cap_ppc_nested_kvm_hv = kvm_vm_check_extension(s, KVM_CAP_PPC_NESTED_HV);
+cap_large_decr = kvmppc_get_dec_bits();
 /*
  * Note: setting it to false because there is not such capability
  * in KVM at this moment.
@@ -1927,6 +1930,15 @@ uint64_t kvmppc_get_clockfreq(void)
 return kvmppc_read_int_cpu_dt("clock-frequency");
 }
 
+static int kvmppc_get_dec_bits(void)
+{
+int nr_bits = kvmppc_read_int_cpu_dt("ibm,dec-bits");
+
+if (nr_bits > 0)
+return nr_bits;
+return 0;
+}
+
 static int kvmppc_get_pvinfo(CPUPPCState *env, struct kvm_ppc_pvinfo *pvinfo)
  {
  PowerPCCPU *cpu = ppc_env_get_cpu(env);
@@ -2442,6 +2454,33 @@ bool kvmppc_has_cap_spapr_vfio(void)
 return cap_spapr_vfio;
 }
 
+int kvmppc_get_cap_large_decr(void)
+{
+return cap_large_decr;
+}
+
+int kvmppc_enable_cap_large_decr(PowerPCCPU *cpu, int enable)
+{
+CPUState *cs = CPU(cpu);
+uint64_t lpcr;
+
+kvm_get_one_reg(cs, KVM_REG_PPC_LPCR_64, &lpcr);
+/* Do we need to modify the LPCR? */
+if (!!(lpcr & LPCR_LD) != !!enable) {
+if (enable)
+lpcr |= LPCR_LD;
+else
+lpcr &= ~LPCR_LD;
+kvm_set_one_reg(cs, KVM_REG_PPC_LPCR_64, &lpcr);
+kvm_get_one_reg(cs, KVM_REG_PPC_LPCR_64, &lpcr);
+
+if (!!(lpcr & LPCR_LD) != !!enable)
+return -1;
+}
+
+return 0;
+}
+
 PowerPCCPUClass *kvm_ppc_get_host_cpu_class(void)
 {
 uint32_t host_pvr = mfpvr();
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index bdfaa4e70a..a79835bd14 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@

[Qemu-devel] [PULL 00/60] ppc-for-4.0 queue 20190310

2019-03-10 Thread David Gibson

The following changes since commit f5b4c31030f45293bb4517445722768434829d91:

  Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into 
staging (2019-03-09 17:35:48 +)

are available in the Git repository at:

  git://github.com/dgibson/qemu.git tags/ppc-for-4.0-20190310

for you to fetch changes up to 08d020471fcd41cb020fc9987ed1945eefcc8805:

  spapr: Use CamelCase properly (2019-03-10 14:35:44 +1100)


ppc patch queue for 2019-03-10

Here's a final pull request before the 4.0 soft freeze.  Changes
include:
  * A Great Renaming to use camel case properly in spapr code
  * Optimization of some vector instructions
  * Support for POWER9 cpus in the powernv machine
  * Fixes a regression from the last pull request in handling VSX
instructions with mixed operands from the FPR and VMX parts of the
register array
  * Optimization hack to avoid scanning all the (empty) entries on a
new IOMMU window
  * Add FSL I2C controller model for E500
  * Support for KVM acceleration of the H_PAGE_INIT hypercall on spapr
  * Update u-boot image for E500
  * Enable Spectre/Meltdown mitigations by default on the new machine type
  * Enable large decrementer support for POWER9

Plus a number of assorted bugfixes and cleanups.


Alexander Graf (1):
  PPC: E500: Update u-boot to v2019.01

Alexey Kardashevskiy (3):
  vfio/spapr: Fix indirect levels calculation
  vfio/spapr: Rename local systempagesize variable
  spapr_iommu: Do not replay mappings from just created DMA window

Andrew Randrianasulu (1):
  PPC: E500: Add FSL I2C controller and integrate RTC with it

Cédric Le Goater (27):
  ppc/xive: hardwire the Physical CAM line of the thread context
  ppc: externalize ppc_get_vcpu_by_pir()
  ppc/xive: export the TIMA memory accessors
  ppc/pnv: export the xive_router_notify() routine
  ppc/pnv: change the CPU machine_data presenter type to Object *
  ppc/pnv: add a XIVE interrupt controller model for POWER9
  ppc/pnv: introduce a new dt_populate() operation to the chip model
  ppc/pnv: introduce a new pic_print_info() operation to the chip model
  ppc/xive: activate HV support
  ppc/pnv: fix logging primitives using Ox
  ppc/pnv: psi: add a PSIHB_REG macro
  ppc/pnv: psi: add a reset handler
  ppc/pnv: add a PSI bridge class model
  ppc/pnv: add a PSI bridge model for POWER9
  ppc/pnv: lpc: fix OPB address ranges
  ppc/pnv: add a LPC Controller class model
  ppc/pnv: add a 'dt_isa_nodename' to the chip
  ppc/pnv: add a LPC Controller model for POWER9
  ppc/pnv: add SerIRQ routing registers
  ppc/pnv: add a OCC model class
  ppc/pnv: add a OCC model for POWER9
  ppc/pnv: extend XSCOM core support for POWER9
  ppc/pnv: POWER9 XSCOM quad support
  ppc/pnv: activate XSCOM tests for POWER9
  ppc/pnv: add more dummy XSCOM addresses
  ppc/pnv: add a "ibm,opal/power-mgt" device tree node on POWER9
  target/ppc: add HV support for POWER9

David Gibson (2):
  spapr: Force SPAPR_MEMORY_BLOCK_SIZE to be a hwaddr (64-bit)
  spapr: Use CamelCase properly

Fabiano Rosas (3):
  target/ppc: Move exception vector offset computation into a function
  target/ppc: Move handling of hardware breakpoints to a separate function
  target/ppc: Refactor kvm_handle_debug

Greg Kurz (2):
  spapr: Simulate CAS for qtest
  Revert "spapr: support memory unplug for qtest"

Mark Cave-Ayland (9):
  target/ppc: introduce single fpr_offset() function
  target/ppc: introduce single vsrl_offset() function
  target/ppc: move Vsr* macros from internal.h to cpu.h
  target/ppc: introduce avr_full_offset() function
  target/ppc: improve avr64_offset() and use it to simplify 
get_avr64()/set_avr64()
  target/ppc: switch fpr/vsrl registers so all VSX registers are in host 
endian order
  target/ppc: introduce vsr64_offset() to simplify get_cpu_vsr{l,h}() and 
set_cpu_vsr{l,h}()
  mac_oldworld: use node name instead of alias name for hd device in 
FWPathProvider
  mac_newworld: use node name instead of alias name for hd device in 
FWPathProvider

Philippe Mathieu-Daudé (2):
  target/ppc: Optimize xviexpdp() using deposit_i64()
  target/ppc: Optimize x[sv]xsigdp using deposit_i64()

Suraj Jitindar Singh (10):
  target/ppc/spapr: Add SPAPR_CAP_LARGE_DECREMENTER
  target/ppc: Implement large decrementer support for TCG
  target/ppc: Implement large decrementer support for KVM
  target/ppc/spapr: Enable the large decrementer for pseries-4.0
  target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS
  target/ppc/spapr: Add SPAPR_CAP_CCF_ASSIST
  target/ppc/tcg: make spapr_caps apply cap-[cfpc/sbbc/ibs] non-fatal for 
tcg
  target/ppc/spapr:

[Qemu-devel] [PULL 09/60] target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS

2019-03-10 Thread David Gibson

From: Suraj Jitindar Singh 

The spapr_cap SPAPR_CAP_IBS is used to indicate the level of capability
for mitigations for indirect branch speculation. Currently the available
values are broken (default), fixed-ibs (fixed by serialising indirect
branches) and fixed-ccd (fixed by disabling the count cache).

Introduce a new value for this capability denoted workaround, meaning that
software can work around the issue by flushing the count cache on
context switch. This option is available if the hypervisor sets the
H_CPU_BEHAV_FLUSH_COUNT_CACHE flag in the cpu behaviours returned from
the KVM_PPC_GET_CPU_CHAR ioctl.
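
As a rough sketch of the KVM-side check this patch moves to (the "val > kvm_val" comparison in the hunk below), the capability values can be read as an ordered scale where a request is accepted only if it does not exceed what the host reports. The numeric values are the ones defined in include/hw/ppc/spapr.h by this patch; the enum names and ibs_level_allowed_kvm() are hypothetical, and treating every pair of levels as meaningfully ordered is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined for SPAPR_CAP_IBS in this patch; names are shortened. */
enum {
    CAP_BROKEN     = 0x00,
    CAP_WORKAROUND = 0x01,
    CAP_FIXED_IBS  = 0x02,
    CAP_FIXED_CCD  = 0x03,
    CAP_FIXED_NA   = 0x10,
};

/* Under KVM, reject any request above the level the host reports. */
static bool ibs_level_allowed_kvm(uint8_t requested, uint8_t host)
{
    return requested <= host;
}
```

Under this reading, a host reporting fixed-ibs accepts a guest configured with workaround or broken, while a host reporting workaround rejects fixed-ccd.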

Signed-off-by: Suraj Jitindar Singh 
Message-Id: <20190301031912.28809-1-sjitindarsi...@gmail.com>
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_caps.c| 21 ++---
 hw/ppc/spapr_hcall.c   |  5 +
 include/hw/ppc/spapr.h |  7 +++
 target/ppc/kvm.c   |  8 +++-
 4 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 920224d0c2..74a48a423a 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -276,11 +276,13 @@ static void cap_safe_bounds_check_apply(sPAPRMachineState 
*spapr, uint8_t val,
 }
 
 sPAPRCapPossible cap_ibs_possible = {
-.num = 4,
+.num = 5,
 /* Note workaround only maintained for compatibility */
-.vals = {"broken", "workaround", "fixed-ibs", "fixed-ccd"},
-.help = "broken - no protection, fixed-ibs - indirect branch 
serialisation,"
-" fixed-ccd - cache count disabled",
+.vals = {"broken", "workaround", "fixed-ibs", "fixed-ccd", "fixed-na"},
+.help = "broken - no protection, workaround - count cache flush"
+", fixed-ibs - indirect branch serialisation,"
+" fixed-ccd - cache count disabled,"
+" fixed-na - fixed in hardware (no longer applicable)",
 };
 
 static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
@@ -288,15 +290,11 @@ static void 
cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
 {
 uint8_t kvm_val = kvmppc_get_cap_safe_indirect_branch();
 
-if (val == SPAPR_CAP_WORKAROUND) { /* Can only be Broken or Fixed */
-error_setg(errp,
-"Requested safe indirect branch capability level \"workaround\" not valid, try 
cap-ibs=%s",
-   cap_ibs_possible.vals[kvm_val]);
-} else if (tcg_enabled() && val) {
+if (tcg_enabled() && val) {
 /* TODO - for now only allow broken for TCG */
 error_setg(errp,
 "Requested safe indirect branch capability level not supported by tcg, try a 
different value for cap-ibs");
-} else if (kvm_enabled() && val && (val != kvm_val)) {
+} else if (kvm_enabled() && (val > kvm_val)) {
 error_setg(errp,
 "Requested safe indirect branch capability level not supported by kvm, try 
cap-ibs=%s",
cap_ibs_possible.vals[kvm_val]);
@@ -489,7 +487,8 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
 [SPAPR_CAP_IBS] = {
 .name = "ibs",
 .description =
-"Indirect Branch Speculation (broken, fixed-ibs, fixed-ccd)",
+"Indirect Branch Speculation (broken, workaround, fixed-ibs,"
+"fixed-ccd, fixed-na)",
 .index = SPAPR_CAP_IBS,
 .get = spapr_cap_get_string,
 .set = spapr_cap_set_string,
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 476bad6271..4aa8036fc0 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1723,12 +1723,17 @@ static target_ulong 
h_get_cpu_characteristics(PowerPCCPU *cpu,
 }
 
 switch (safe_indirect_branch) {
+case SPAPR_CAP_FIXED_NA:
+break;
 case SPAPR_CAP_FIXED_CCD:
 characteristics |= H_CPU_CHAR_CACHE_COUNT_DIS;
 break;
 case SPAPR_CAP_FIXED_IBS:
 characteristics |= H_CPU_CHAR_BCCTRL_SERIALISED;
 break;
+case SPAPR_CAP_WORKAROUND:
+behaviour |= H_CPU_BEHAV_FLUSH_COUNT_CACHE;
+break;
 default: /* broken */
 assert(safe_indirect_branch == SPAPR_CAP_BROKEN);
 break;
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 8efc5e0779..a7f3b1bfdd 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -85,12 +85,17 @@ typedef enum {
 /* Bool Caps */
 #define SPAPR_CAP_OFF   0x00
 #define SPAPR_CAP_ON0x01
+
 /* Custom Caps */
+
+/* Generic */
 #define SPAPR_CAP_BROKEN0x00
 #define SPAPR_CAP_WORKAROUND0x01
 #define SPAPR_CAP_FIXED 0x02
+/* SPAPR_CAP_IBS (cap-ibs) */
 #define SPAPR_CAP_FIXED_IBS 0x02
 #define SPAPR_CAP_FIXED_CCD 0x03
+#define SPAPR_CAP_FIXED_NA  0x10 /* Lets leave a bit of a gap... */
 
 typedef struct sPAPRCapabilities sPAPRCapabilities;
 struct sPAPRCapabilities {
@@ -339,9 +344,11 @@ struct sPAPRMachineState {
 #define H_CPU_CHAR_HON_BRANCH_HINTS PPC_BIT(5)
 #define H_CPU_CHAR_THR_RECONF_TRIG

[Qemu-devel] [PULL 08/60] target/ppc/spapr: Enable the large decrementer for pseries-4.0

2019-03-10 Thread David Gibson

From: Suraj Jitindar Singh 

Enable the large decrementer by default for the pseries-4.0 machine type.
It is disabled again by default_caps_with_cpu() for pre-POWER9 cpus
since they don't support the large decrementer.

Signed-off-by: Suraj Jitindar Singh 
Message-Id: <20190301024317.22137-4-sjitindarsi...@gmail.com>
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c  | 3 ++-
 hw/ppc/spapr_caps.c | 5 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6b54ad260a..8e24d7dc50 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4311,7 +4311,7 @@ static void spapr_machine_class_init(ObjectClass *oc, 
void *data)
 smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
 smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 16; /* 64kiB */
 smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF;
-smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_OFF;
+smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_ON;
 spapr_caps_add_properties(smc, &error_abort);
 smc->irq = &spapr_irq_xics;
 smc->dr_phb_enabled = true;
@@ -4387,6 +4387,7 @@ static void spapr_machine_3_1_class_options(MachineClass 
*mc)
 mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power8_v2.0");
 smc->update_dt_enabled = false;
 smc->dr_phb_enabled = false;
+smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_OFF;
 }
 
 DEFINE_SPAPR_MACHINE(3_1, "3.1", false);
diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 1e76685199..920224d0c2 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -536,6 +536,11 @@ static sPAPRCapabilities 
default_caps_with_cpu(sPAPRMachineState *spapr,
 
 caps = smc->default_caps;
 
+if (!ppc_type_check_compat(cputype, CPU_POWERPC_LOGICAL_3_00,
+   0, spapr->max_compat_pvr)) {
+caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_OFF;
+}
+
 if (!ppc_type_check_compat(cputype, CPU_POWERPC_LOGICAL_2_07,
0, spapr->max_compat_pvr)) {
 caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF;
-- 
2.20.1

[Qemu-devel] [PULL 18/60] spapr: Force SPAPR_MEMORY_BLOCK_SIZE to be a hwaddr (64-bit)

2019-03-10 Thread David Gibson

SPAPR_MEMORY_BLOCK_SIZE is logically a difference in memory addresses, and
hence of type hwaddr which is 64-bit.  Previously it wasn't marked as such
which means that it could be treated as 32-bit.  That will work in some
circumstances but if multiplied by another 32-bit value it could lead to
a 32-bit overflow and an incorrect result.

One specific instance of this in spapr_lmb_dt_populate() was spotted by
Coverity (CID 1399145).
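
To make the overflow concrete, here is a small sketch comparing a 32-bit block size with a 64-bit one. The macro and function names are hypothetical, and uint32_t is used instead of the original signed int so that the wrap-around is well defined in C (with a signed int the overflow would be undefined behaviour).

```c
#include <stdint.h>

#define BLOCK_SIZE_32 ((uint32_t)1 << 28) /* 256MB, 32-bit type */
#define BLOCK_SIZE_64 ((uint64_t)1 << 28) /* 256MB, 64-bit type */

/* The product is computed in 32 bits and wraps before being widened. */
static uint64_t total_bytes_32(uint32_t nr_blocks)
{
    return BLOCK_SIZE_32 * nr_blocks;
}

/* The 64-bit operand forces the whole product to be computed in 64 bits. */
static uint64_t total_bytes_64(uint32_t nr_blocks)
{
    return BLOCK_SIZE_64 * nr_blocks;
}
```

With 16 blocks (4GB) the 32-bit product wraps to zero, while the 64-bit version yields 0x100000000.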

Reported-by: Peter Maydell 
Signed-off-by: David Gibson 
---
 include/hw/ppc/spapr.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index ff1bd60615..1311ebe28e 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -792,7 +792,7 @@ int spapr_rtc_import_offset(sPAPRRTCState *rtc, int64_t 
legacy_offset);
 
 #define TYPE_SPAPR_RNG "spapr-rng"
 
-#define SPAPR_MEMORY_BLOCK_SIZE (1 << 28) /* 256MB */
+#define SPAPR_MEMORY_BLOCK_SIZE ((hwaddr)1 << 28) /* 256MB */
 
 /*
  * This defines the maximum number of DIMM slots we can have for sPAPR
-- 
2.20.1

[Qemu-devel] [PULL 06/60] target/ppc: Implement large decrementer support for TCG

2019-03-10 Thread David Gibson

From: Suraj Jitindar Singh 

Prior to POWER9 the decrementer was a 32-bit register which decremented
with each tick of the timebase. From POWER9 onwards the decrementer can
be set to operate in a mode called large decrementer where it acts as an
n-bit decrementing register which is visible as a 64-bit register; that
is, the value of the decrementer is sign extended to 64 bits (where n is
implementation dependent).

The mode in which the decrementer operates is controlled by the LPCR_LD
bit in the logical partition control register (LPCR).

From POWER9 onwards the HDEC (hypervisor decrementer) was enlarged to
h-bits, also sign extended to 64 bits (where h is implementation
dependent). Note this isn't configurable and is always enabled.

On POWER9 the large decrementer and hdec are both 56 bits, as
represented by the lrg_decr_bits cpu class property. Since they are the
same size we only add one property for now, which could be extended in
the case they ever differ in the future.

We also add the lrg_decr_bits property for POWER5+/7/8 since it is used
to determine the size of the hdec, which is only generated on the
POWER5+ processor and later. On these processors it is 32 bits.
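
As a rough illustration of the sign extension described above, the following
standalone sketch truncates a value to an n-bit decrementer width and sign
extends it to 64 bits. It is not the QEMU code: the function name
decr_sign_extend is hypothetical, and the explicit handling of a 64-bit width
is an addition made here to avoid an undefined shift.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Truncate value to nr_bits and sign extend the result to 64 bits.
 * nr_bits is the implementation-defined decrementer width (1..64).
 */
uint64_t decr_sign_extend(uint64_t value, int nr_bits)
{
    assert(nr_bits >= 1 && nr_bits <= 64);

    if (nr_bits == 64) {
        return value;   /* nothing to truncate; avoids shifting by 64 */
    }

    value &= (1ULL << nr_bits) - 1;           /* keep the low nr_bits */
    if (value & (1ULL << (nr_bits - 1))) {    /* top kept bit is the sign */
        value |= ~0ULL << nr_bits;            /* fill the upper bits with ones */
    }
    return value;
}
```

For a 56-bit width, as on POWER9, a value with bit 55 set comes back with the
top eight bits filled with ones, while a 32-bit width behaves like the
pre-POWER9 decrementer.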

Signed-off-by: Suraj Jitindar Singh 
Signed-off-by: Cédric Le Goater 
Message-Id: <20190301024317.22137-2-sjitindarsi...@gmail.com>
Signed-off-by: David Gibson 
---
 hw/ppc/ppc.c| 85 +++--
 hw/ppc/spapr.c  |  8 
 hw/ppc/spapr_caps.c | 30 +++-
 target/ppc/cpu-qom.h|  1 +
 target/ppc/cpu.h|  8 ++--
 target/ppc/mmu-hash64.c |  2 +-
 target/ppc/translate.c  |  2 +-
 target/ppc/translate_init.inc.c |  4 ++
 8 files changed, 107 insertions(+), 33 deletions(-)

diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index d1e3d4cd20..9145aeddcb 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -744,11 +744,10 @@ bool ppc_decr_clear_on_delivery(CPUPPCState *env)
 return ((tb_env->flags & flags) == PPC_DECR_UNDERFLOW_TRIGGERED);
 }
 
-static inline uint32_t _cpu_ppc_load_decr(CPUPPCState *env, uint64_t next)
+static inline int64_t _cpu_ppc_load_decr(CPUPPCState *env, uint64_t next)
 {
 ppc_tb_t *tb_env = env->tb_env;
-uint32_t decr;
-int64_t diff;
+int64_t decr, diff;
 
 diff = next - qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
 if (diff >= 0) {
@@ -758,27 +757,47 @@ static inline uint32_t _cpu_ppc_load_decr(CPUPPCState *env, uint64_t next)
 }  else {
 decr = -muldiv64(-diff, tb_env->decr_freq, NANOSECONDS_PER_SECOND);
 }
-LOG_TB("%s: %08" PRIx32 "\n", __func__, decr);
+LOG_TB("%s: %016" PRIx64 "\n", __func__, decr);
 
 return decr;
 }
 
-uint32_t cpu_ppc_load_decr (CPUPPCState *env)
+target_ulong cpu_ppc_load_decr (CPUPPCState *env)
 {
 ppc_tb_t *tb_env = env->tb_env;
+uint64_t decr;
 
 if (kvm_enabled()) {
 return env->spr[SPR_DECR];
 }
 
-return _cpu_ppc_load_decr(env, tb_env->decr_next);
+decr = _cpu_ppc_load_decr(env, tb_env->decr_next);
+
+/*
+ * If large decrementer is enabled then the decrementer is sign extended
+ * to 64 bits, otherwise it is a 32 bit value.
+ */
+if (env->spr[SPR_LPCR] & LPCR_LD)
+return decr;
+return (uint32_t) decr;
 }
 
-uint32_t cpu_ppc_load_hdecr (CPUPPCState *env)
+target_ulong cpu_ppc_load_hdecr (CPUPPCState *env)
 {
+PowerPCCPU *cpu = ppc_env_get_cpu(env);
+PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 ppc_tb_t *tb_env = env->tb_env;
+uint64_t hdecr;
 
-return _cpu_ppc_load_decr(env, tb_env->hdecr_next);
+hdecr =  _cpu_ppc_load_decr(env, tb_env->hdecr_next);
+
+/*
+ * If we have a large decrementer (POWER9 or later) then hdecr is sign
+ * extended to 64 bits, otherwise it is 32 bits.
+ */
+if (pcc->lrg_decr_bits > 32)
+return hdecr;
+return (uint32_t) hdecr;
 }
 
 uint64_t cpu_ppc_load_purr (CPUPPCState *env)
@@ -832,13 +851,21 @@ static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t *nextp,
  QEMUTimer *timer,
  void (*raise_excp)(void *),
  void (*lower_excp)(PowerPCCPU *),
- uint32_t decr, uint32_t value)
+ target_ulong decr, target_ulong value,
+ int nr_bits)
 {
 CPUPPCState *env = &cpu->env;
 ppc_tb_t *tb_env = env->tb_env;
 uint64_t now, next;
+bool negative;
+
+/* Truncate value to decr_width and sign extend for simplicity */
+value &= ((1ULL << nr_bits) - 1);
+negative = !!(value & (1ULL << (nr_bits - 1)));
+if (negative)
+value |= (0xFFFFFFFFFFFFFFFFULL << nr_bits);
 
-LOG_TB("%s: %08" PRIx32 " => %08" PRIx32 "\n", __func__,
+LOG_TB("%s: " TARGET_FMT_lx " => " TARGET_FMT_lx "\n", __func__,
 decr, value);
 
 if (kvm_enabled()) {
@@ -860,15 +887,15 @@ static

[Qemu-devel] [PULL 11/60] target/ppc/tcg: make spapr_caps apply cap-[cfpc/sbbc/ibs] non-fatal for tcg

2019-03-10 Thread David Gibson

From: Suraj Jitindar Singh 

The spapr_caps cap-cfpc, cap-sbbc and cap-ibs are used to control the
availability of certain mitigations to the guest. These haven't been
implemented under TCG, it is unlikely they ever will be, and it is unclear
as to whether they even need to be.

As such, make failure to apply these capabilities under TCG non-fatal.
Instead we print a warning message to the user but still allow the guest
to continue.

Signed-off-by: Suraj Jitindar Singh 
Message-Id: <20190301044609.9626-2-sjitindarsi...@gmail.com>
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_caps.c | 33 -
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index f03f2f64e7..b68d767d63 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -239,17 +239,22 @@ sPAPRCapPossible cap_cfpc_possible = {
 static void cap_safe_cache_apply(sPAPRMachineState *spapr, uint8_t val,
  Error **errp)
 {
+Error *local_err = NULL;
 uint8_t kvm_val =  kvmppc_get_cap_safe_cache();
 
 if (tcg_enabled() && val) {
-/* TODO - for now only allow broken for TCG */
-error_setg(errp,
-"Requested safe cache capability level not supported by tcg, try a different value for cap-cfpc");
+/* TCG only supports broken, allow other values and print a warning */
+error_setg(&local_err,
+   "TCG doesn't support requested feature, cap-cfpc=%s",
+   cap_cfpc_possible.vals[val]);
 } else if (kvm_enabled() && (val > kvm_val)) {
 error_setg(errp,
 "Requested safe cache capability level not supported by kvm, try cap-cfpc=%s",
cap_cfpc_possible.vals[kvm_val]);
 }
+
+if (local_err != NULL)
+warn_report_err(local_err);
 }
 
 sPAPRCapPossible cap_sbbc_possible = {
@@ -262,17 +267,22 @@ sPAPRCapPossible cap_sbbc_possible = {
 static void cap_safe_bounds_check_apply(sPAPRMachineState *spapr, uint8_t val,
 Error **errp)
 {
+Error *local_err = NULL;
 uint8_t kvm_val =  kvmppc_get_cap_safe_bounds_check();
 
 if (tcg_enabled() && val) {
-/* TODO - for now only allow broken for TCG */
-error_setg(errp,
-"Requested safe bounds check capability level not supported by tcg, try a different value for cap-sbbc");
+/* TCG only supports broken, allow other values and print a warning */
+error_setg(&local_err,
+   "TCG doesn't support requested feature, cap-sbbc=%s",
+   cap_sbbc_possible.vals[val]);
 } else if (kvm_enabled() && (val > kvm_val)) {
 error_setg(errp,
 "Requested safe bounds check capability level not supported by kvm, try cap-sbbc=%s",
cap_sbbc_possible.vals[kvm_val]);
 }
+
+if (local_err != NULL)
+warn_report_err(local_err);
 }
 
 sPAPRCapPossible cap_ibs_possible = {
@@ -288,17 +298,22 @@ sPAPRCapPossible cap_ibs_possible = {
 static void cap_safe_indirect_branch_apply(sPAPRMachineState *spapr,
uint8_t val, Error **errp)
 {
+Error *local_err = NULL;
 uint8_t kvm_val = kvmppc_get_cap_safe_indirect_branch();
 
 if (tcg_enabled() && val) {
-/* TODO - for now only allow broken for TCG */
-error_setg(errp,
-"Requested safe indirect branch capability level not supported by tcg, try a different value for cap-ibs");
+/* TCG only supports broken, allow other values and print a warning */
+error_setg(&local_err,
+   "TCG doesn't support requested feature, cap-ibs=%s",
+   cap_ibs_possible.vals[val]);
 } else if (kvm_enabled() && (val > kvm_val)) {
 error_setg(errp,
 "Requested safe indirect branch capability level not supported by kvm, try cap-ibs=%s",
cap_ibs_possible.vals[kvm_val]);
 }
+
+if (local_err != NULL)
+warn_report_err(local_err);
 }
 
 #define VALUE_DESC_TRISTATE " (broken, workaround, fixed)"
-- 
2.20.1

[Qemu-devel] [PULL 03/60] spapr: Simulate CAS for qtest

2019-03-10 Thread David Gibson

From: Greg Kurz 

The RTAS event hotplug code for machine types 2.8 and newer depends on
the CAS negotiated ov5 in order to work properly. However, there's no
CAS when running under qtest. There was an attempt to trick the
code by faking the OV5_HP_EVT bit, but it turned out to break other
assumptions in the code and the change got reverted.

Go for a more general approach and simulate a CAS when running under
qtest. For simplicity, this pseudo CAS simply simulates the case where
the guest supports the same features as the machine. It is done at
reset time, just before we reset the DRCs, which could potentially
exercise the unplug code.

This allows testing unplug on spapr with both older and newer machine
types.

Suggested-by: Michael Roth 
Signed-off-by: Greg Kurz 
Message-Id: <155146875704.147873.10563808578795890265.st...@bahia.lan>
Tested-by: Michael Roth 
Reviewed-by: Michael Roth 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 9e01226e18..f7d527464c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -29,6 +29,7 @@
 #include "qapi/visitor.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/numa.h"
+#include "sysemu/qtest.h"
 #include "hw/hw.h"
 #include "qemu/log.h"
 #include "hw/fw-path-provider.h"
@@ -1711,6 +1712,16 @@ static void spapr_machine_reset(void)
  */
 spapr_irq_reset(spapr, &error_fatal);
 
+/*
+ * There is no CAS under qtest. Simulate one to please the code that
+ * depends on spapr->ov5_cas. This is especially needed to test device
+ * unplug, so we do that before resetting the DRCs.
+ */
+if (qtest_enabled()) {
+spapr_ovec_cleanup(spapr->ov5_cas);
+spapr->ov5_cas = spapr_ovec_clone(spapr->ov5);
+}
+
 /* DRC reset may cause a device to be unplugged. This will cause troubles
  * if this device is used by another device (eg, a running vhost backend
  * will crash QEMU if the DIMM holding the vring goes away). To avoid such
-- 
2.20.1

[Qemu-devel] [PULL 10/60] target/ppc/spapr: Add SPAPR_CAP_CCF_ASSIST

2019-03-10 Thread David Gibson

From: Suraj Jitindar Singh 

Introduce a new spapr_cap SPAPR_CAP_CCF_ASSIST to be used to indicate
the requirement for a hw-assisted version of the count cache flush
workaround.

The count cache flush workaround is a software workaround which can be
used to flush the count cache on context switch. Some revisions of
hardware may have a hardware accelerated flush, in which case the
software flush can be shortened. This cap is used to set the
availability of such hardware acceleration for the count cache flush
routine.

The availability of such hardware acceleration is indicated by the
H_CPU_CHAR_BCCTR_FLUSH_ASSIST flag being set in the characteristics
returned from the KVM_PPC_GET_CPU_CHAR ioctl.

Signed-off-by: Suraj Jitindar Singh 
Message-Id: <20190301031912.28809-2-sjitindarsi...@gmail.com>
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c |  2 ++
 hw/ppc/spapr_caps.c| 25 +
 hw/ppc/spapr_hcall.c   |  3 +++
 include/hw/ppc/spapr.h |  5 -
 target/ppc/kvm.c   | 14 ++
 target/ppc/kvm_ppc.h   |  6 ++
 6 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 8e24d7dc50..37fd7a1411 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2097,6 +2097,7 @@ static const VMStateDescription vmstate_spapr = {
 &vmstate_spapr_cap_nested_kvm_hv,
 &vmstate_spapr_dtb,
 &vmstate_spapr_cap_large_decr,
+&vmstate_spapr_cap_ccf_assist,
 NULL
 }
 };
@@ -4312,6 +4313,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
 smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 16; /* 64kiB */
 smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF;
 smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_ON;
+smc->default_caps.caps[SPAPR_CAP_CCF_ASSIST] = SPAPR_CAP_OFF;
 spapr_caps_add_properties(smc, &error_abort);
 smc->irq = &spapr_irq_xics;
 smc->dr_phb_enabled = true;
diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 74a48a423a..f03f2f64e7 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -436,6 +436,21 @@ static void cap_large_decr_cpu_apply(sPAPRMachineState *spapr,
 ppc_store_lpcr(cpu, lpcr);
 }
 
+static void cap_ccf_assist_apply(sPAPRMachineState *spapr, uint8_t val,
+ Error **errp)
+{
+uint8_t kvm_val = kvmppc_get_cap_count_cache_flush_assist();
+
+if (tcg_enabled() && val) {
+/* TODO - for now only allow broken for TCG */
+error_setg(errp,
+"Requested count cache flush assist capability level not supported by tcg, try cap-ccf-assist=off");
+} else if (kvm_enabled() && (val > kvm_val)) {
+error_setg(errp,
+"Requested count cache flush assist capability level not supported by kvm, try cap-ccf-assist=off");
+}
+}
+
 sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
 [SPAPR_CAP_HTM] = {
 .name = "htm",
@@ -525,6 +540,15 @@ sPAPRCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
 .apply = cap_large_decr_apply,
 .cpu_apply = cap_large_decr_cpu_apply,
 },
+[SPAPR_CAP_CCF_ASSIST] = {
+.name = "ccf-assist",
+.description = "Count Cache Flush Assist via HW Instruction",
+.index = SPAPR_CAP_CCF_ASSIST,
+.get = spapr_cap_get_bool,
+.set = spapr_cap_set_bool,
+.type = "bool",
+.apply = cap_ccf_assist_apply,
+},
 };
 
 static sPAPRCapabilities default_caps_with_cpu(sPAPRMachineState *spapr,
@@ -659,6 +683,7 @@ SPAPR_CAP_MIG_STATE(sbbc, SPAPR_CAP_SBBC);
 SPAPR_CAP_MIG_STATE(ibs, SPAPR_CAP_IBS);
 SPAPR_CAP_MIG_STATE(nested_kvm_hv, SPAPR_CAP_NESTED_KVM_HV);
 SPAPR_CAP_MIG_STATE(large_decr, SPAPR_CAP_LARGE_DECREMENTER);
+SPAPR_CAP_MIG_STATE(ccf_assist, SPAPR_CAP_CCF_ASSIST);
 
 void spapr_caps_init(sPAPRMachineState *spapr)
 {
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 4aa8036fc0..8bfdddc964 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1693,6 +1693,7 @@ static target_ulong h_get_cpu_characteristics(PowerPCCPU *cpu,
 uint8_t safe_cache = spapr_get_cap(spapr, SPAPR_CAP_CFPC);
 uint8_t safe_bounds_check = spapr_get_cap(spapr, SPAPR_CAP_SBBC);
 uint8_t safe_indirect_branch = spapr_get_cap(spapr, SPAPR_CAP_IBS);
+uint8_t count_cache_flush_assist = spapr_get_cap(spapr, SPAPR_CAP_CCF_ASSIST);
 
 switch (safe_cache) {
 case SPAPR_CAP_WORKAROUND:
@@ -1733,6 +1734,8 @@ static target_ulong h_get_cpu_characteristics(PowerPCCPU *cpu,
 break;
 case SPAPR_CAP_WORKAROUND:
 behaviour |= H_CPU_BEHAV_FLUSH_COUNT_CACHE;
+if (count_cache_flush_assist)
+characteristics |= H_CPU_CHAR_BCCTR_FLUSH_ASSIST;
 break;
 default: /* broken */
 assert(safe_indirect_branch == SPAPR_CAP_BROKEN);
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index a7f3b1bfdd..ff1bd60615 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr

[Qemu-devel] [PULL 19/60] target/ppc/spapr: Enable H_PAGE_INIT in-kernel handling

2019-03-10 Thread David Gibson

From: Suraj Jitindar Singh 

The H_CALL H_PAGE_INIT can be used to zero or copy a page of guest
memory. Enable the in-kernel H_PAGE_INIT handler.

The in-kernel handler takes half the time to complete compared to
handling the H_CALL in userspace.

Signed-off-by: Suraj Jitindar Singh 
Message-Id: <20190306060608.19935-1-sjitindarsi...@gmail.com>
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c   | 3 +++
 target/ppc/kvm.c | 5 +
 target/ppc/kvm_ppc.h | 5 +
 3 files changed, 13 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 755056875c..e764e89806 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2822,6 +2822,9 @@ static void spapr_machine_init(MachineState *machine)
 
 /* H_CLEAR_MOD/_REF are mandatory in PAPR, but off by default */
 kvmppc_enable_clear_ref_mod_hcalls();
+
+/* Enable H_PAGE_INIT */
+kvmppc_enable_h_page_init();
 }
 
 /* allocate RAM */
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 4e3f1e4b78..d0bfb076df 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -2043,6 +2043,11 @@ void kvmppc_enable_clear_ref_mod_hcalls(void)
 kvmppc_enable_hcall(kvm_state, H_CLEAR_MOD);
 }
 
+void kvmppc_enable_h_page_init(void)
+{
+kvmppc_enable_hcall(kvm_state, H_PAGE_INIT);
+}
+
 void kvmppc_set_papr(PowerPCCPU *cpu)
 {
 CPUState *cs = CPU(cpu);
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index 2937b36cae..2c2ea30e87 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@ -23,6 +23,7 @@ int kvmppc_set_interrupt(PowerPCCPU *cpu, int irq, int level);
 void kvmppc_enable_logical_ci_hcalls(void);
 void kvmppc_enable_set_mode_hcall(void);
 void kvmppc_enable_clear_ref_mod_hcalls(void);
+void kvmppc_enable_h_page_init(void);
 void kvmppc_set_papr(PowerPCCPU *cpu);
 int kvmppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr);
 void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy);
@@ -138,6 +139,10 @@ static inline void kvmppc_enable_clear_ref_mod_hcalls(void)
 {
 }
 
+static inline void kvmppc_enable_h_page_init(void)
+{
+}
+
 static inline void kvmppc_set_papr(PowerPCCPU *cpu)
 {
 }
-- 
2.20.1

[Qemu-devel] [PULL 21/60] ppc/xive: hardwire the Physical CAM line of the thread context

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

By default on P9, the HW CAM line (23bits) is hardwired to :

  0x000||0b1||4Bit chip number||7Bit Thread number.

When the block group mode is enabled at the controller level (PowerNV),
the CAM line is changed for CAM compares to :

  4Bit chip number||0x001||7Bit Thread number

This will require changes in xive_presenter_tctx_match() possibly.
This is a lowlevel functionality of the HW controller and it is not
strictly needed. Leave it for later.

Signed-off-by: Cédric Le Goater 
Message-Id: <20190306085032.15744-2-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/intc/xive.c | 31 ++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index daa7badc84..b21759c938 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -1112,6 +1112,30 @@ XiveTCTX *xive_router_get_tctx(XiveRouter *xrtr, CPUState *cs)
 return xrc->get_tctx(xrtr, cs);
 }
 
+/*
+ * By default on P9, the HW CAM line (23bits) is hardwired to :
+ *
+ *   0x000||0b1||4Bit chip number||7Bit Thread number.
+ *
+ * When the block grouping is enabled, the CAM line is changed to :
+ *
+ *   4Bit chip number||0x001||7Bit Thread number.
+ */
+static uint32_t hw_cam_line(uint8_t chip_id, uint8_t tid)
+{
+return 1 << 11 | (chip_id & 0xf) << 7 | (tid & 0x7f);
+}
+
+static bool xive_presenter_tctx_match_hw(XiveTCTX *tctx,
+ uint8_t nvt_blk, uint32_t nvt_idx)
+{
+CPUPPCState *env = &POWERPC_CPU(tctx->cs)->env;
+uint32_t pir = env->spr_cb[SPR_PIR].default_value;
+
+return hw_cam_line((pir >> 8) & 0xf, pir & 0x7f) ==
+hw_cam_line(nvt_blk, nvt_idx);
+}
+
 /*
  * The thread context register words are in big-endian format.
  */
@@ -1120,6 +1144,7 @@ static int xive_presenter_tctx_match(XiveTCTX *tctx, uint8_t format,
  bool cam_ignore, uint32_t logic_serv)
 {
 uint32_t cam = xive_nvt_cam_line(nvt_blk, nvt_idx);
+uint32_t qw3w2 = xive_tctx_word2(&tctx->regs[TM_QW3_HV_PHYS]);
 uint32_t qw2w2 = xive_tctx_word2(&tctx->regs[TM_QW2_HV_POOL]);
 uint32_t qw1w2 = xive_tctx_word2(&tctx->regs[TM_QW1_OS]);
 uint32_t qw0w2 = xive_tctx_word2(&tctx->regs[TM_QW0_USER]);
@@ -1142,7 +1167,11 @@ static int xive_presenter_tctx_match(XiveTCTX *tctx, uint8_t format,
 
 /* F=0 & i=0: Specific NVT notification */
 
-/* TODO (PowerNV) : PHYS ring */
+/* PHYS ring */
+if ((be32_to_cpu(qw3w2) & TM_QW3W2_VT) &&
+xive_presenter_tctx_match_hw(tctx, nvt_blk, nvt_idx)) {
+return TM_QW3_HV_PHYS;
+}
 
 /* HV POOL ring */
 if ((be32_to_cpu(qw2w2) & TM_QW2W2_VP) &&
-- 
2.20.1
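
To make the default CAM line layout from this patch easier to follow, here is
a small standalone sketch that packs the 0b1 marker, the 4-bit chip number and
the 7-bit thread number, and derives both fields from a PIR value using the
same shifts as xive_presenter_tctx_match_hw(). The function names cam_line and
cam_line_from_pir are hypothetical and no QEMU types are used.

```c
#include <assert.h>
#include <stdint.h>

/* 0b1 || 4-bit chip number || 7-bit thread number */
uint32_t cam_line(uint8_t chip_id, uint8_t tid)
{
    uint32_t line = (1u << 11) |
                    ((uint32_t)(chip_id & 0xf) << 7) |
                    (uint32_t)(tid & 0x7f);

    assert(line < (1u << 12));   /* only the low 12 bits are ever non-zero */
    return line;
}

/* Chip number taken from PIR bits 8..11, thread number from PIR bits 0..6. */
uint32_t cam_line_from_pir(uint32_t pir)
{
    return cam_line((pir >> 8) & 0xf, pir & 0x7f);
}
```

For example, a PIR of 0x205 decodes to chip 2 and thread 5 and gives the CAM
line 0x905. The PHYS ring match in the patch compares a line built this way
from the PIR against one built from nvt_blk and nvt_idx.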

[Qemu-devel] [PULL 15/60] target/ppc: Refactor kvm_handle_debug

2019-03-10 Thread David Gibson

From: Fabiano Rosas 

There are four scenarios being handled in this function:

- single stepping
- hardware breakpoints
- software breakpoints
- fallback (no debug supported)

A future patch will add code to handle specific single step and
software breakpoints cases so let's split each scenario into its own
function now to avoid hurting readability.

Signed-off-by: Fabiano Rosas 
Reviewed-by: Alexey Kardashevskiy 
Message-Id: <20190228225759.21328-5-faro...@linux.ibm.com>
Signed-off-by: David Gibson 
---
 target/ppc/kvm.c | 86 
 1 file changed, 50 insertions(+), 36 deletions(-)

diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 996b08a1d3..4e3f1e4b78 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -1624,52 +1624,66 @@ static int kvm_handle_hw_breakpoint(CPUState *cs,
 return handle;
 }
 
+static int kvm_handle_singlestep(void)
+{
+return 1;
+}
+
+static int kvm_handle_sw_breakpoint(void)
+{
+return 1;
+}
+
 static int kvm_handle_debug(PowerPCCPU *cpu, struct kvm_run *run)
 {
 CPUState *cs = CPU(cpu);
 CPUPPCState *env = &cpu->env;
 struct kvm_debug_exit_arch *arch_info = &run->debug.arch;
-int handle = 0;
 
 if (cs->singlestep_enabled) {
-handle = 1;
-} else if (arch_info->status) {
-handle = kvm_handle_hw_breakpoint(cs, arch_info);
-} else if (kvm_find_sw_breakpoint(cs, arch_info->address)) {
-handle = 1;
-} else {
-/* QEMU is not able to handle debug exception, so inject
- * program exception to guest;
- * Yes program exception NOT debug exception !!
- * When QEMU is using debug resources then debug exception must
- * be always set. To achieve this we set MSR_DE and also set
- * MSRP_DEP so guest cannot change MSR_DE.
- * When emulating debug resource for guest we want guest
- * to control MSR_DE (enable/disable debug interrupt on need).
- * Supporting both configurations are NOT possible.
- * So the result is that we cannot share debug resources
- * between QEMU and Guest on BOOKE architecture.
- * In the current design QEMU gets the priority over guest,
- * this means that if QEMU is using debug resources then guest
- * cannot use them;
- * For software breakpoint QEMU uses a privileged instruction;
- * So there cannot be any reason that we are here for guest
- * set debug exception, only possibility is guest executed a
- * privileged / illegal instruction and that's why we are
- * injecting a program interrupt.
- */
+return kvm_handle_singlestep();
+}
 
-cpu_synchronize_state(cs);
-/* env->nip is PC, so increment this by 4 to use
- * ppc_cpu_do_interrupt(), which set srr0 = env->nip - 4.
- */
-env->nip += 4;
-cs->exception_index = POWERPC_EXCP_PROGRAM;
-env->error_code = POWERPC_EXCP_INVAL;
-ppc_cpu_do_interrupt(cs);
+if (arch_info->status) {
+return kvm_handle_hw_breakpoint(cs, arch_info);
 }
 
-return handle;
+if (kvm_find_sw_breakpoint(cs, arch_info->address)) {
+return kvm_handle_sw_breakpoint();
+}
+
+/*
+ * QEMU is not able to handle debug exception, so inject
+ * program exception to guest;
+ * Yes program exception NOT debug exception !!
+ * When QEMU is using debug resources then debug exception must
+ * be always set. To achieve this we set MSR_DE and also set
+ * MSRP_DEP so guest cannot change MSR_DE.
+ * When emulating debug resource for guest we want guest
+ * to control MSR_DE (enable/disable debug interrupt on need).
+ * Supporting both configurations are NOT possible.
+ * So the result is that we cannot share debug resources
+ * between QEMU and Guest on BOOKE architecture.
+ * In the current design QEMU gets the priority over guest,
+ * this means that if QEMU is using debug resources then guest
+ * cannot use them;
+ * For software breakpoint QEMU uses a privileged instruction;
+ * So there cannot be any reason that we are here for guest
+ * set debug exception, only possibility is guest executed a
+ * privileged / illegal instruction and that's why we are
+ * injecting a program interrupt.
+ */
+cpu_synchronize_state(cs);
+/*
+ * env->nip is PC, so increment this by 4 to use
+ * ppc_cpu_do_interrupt(), which set srr0 = env->nip - 4.
+ */
+env->nip += 4;
+cs->exception_index = POWERPC_EXCP_PROGRAM;
+env->error_code = POWERPC_EXCP_INVAL;
+ppc_cpu_do_interrupt(cs);
+
+return 0;
 }
 
 int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
-- 
2.20.1

[Qemu-devel] [PULL 17/60] target/ppc/spapr: Clear partition table entry when allocating hash table

2019-03-10 Thread David Gibson

From: Suraj Jitindar Singh 

If we allocate a hash page table then we know that the guest won't be
using process tables, so set the partition table entry maintained for
the guest to zero. If this isn't done, then the guest radix bit will
remain set in the entry. This means that when the guest calls
H_REGISTER_PROCESS_TABLE there will be a mismatch between the flags
and the value in spapr->patb_entry, and the call will fail. The guest
will then panic:

Failed to register process table (rc=-4)
kernel BUG at arch/powerpc/platforms/pseries/lpar.c:959

The result being that it isn't possible to boot a hash guest on a P9
system.

Also fix a bug in the flags parsing in h_register_process_table() which
was introduced by the same patch, and simplify the handling to make it
less likely that errors will be introduced in the future. The effect
would have been setting the host radix bit LPCR_HR for a hash guest
using process tables, which currently isn't supported and so couldn't
have been triggered.

Fixes: 00fd075e18 "target/ppc/spapr: Set LPCR:HR when using Radix mode"

Signed-off-by: Suraj Jitindar Singh 
Message-Id: <20190305022102.17610-1-sjitindarsi...@gmail.com>
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c   |  1 +
 hw/ppc/spapr_hcall.c | 12 
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 946bbcf9ee..755056875c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1632,6 +1632,7 @@ void spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift,
 }
 }
 /* We're setting up a hash table, so that means we're not radix */
+spapr->patb_entry = 0;
 spapr_set_all_lpcrs(0, LPCR_HR | LPCR_UPRT);
 }
 
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 8bfdddc964..7016a09386 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1339,6 +1339,7 @@ static target_ulong h_register_process_table(PowerPCCPU *cpu,
 target_ulong proc_tbl = args[1];
 target_ulong page_size = args[2];
 target_ulong table_size = args[3];
+target_ulong update_lpcr = 0;
 uint64_t cproc;
 
 if (flags & ~FLAGS_MASK) { /* Check no reserved bits are set */
@@ -1394,10 +1395,13 @@ static target_ulong h_register_process_table(PowerPCCPU *cpu,
 spapr->patb_entry = cproc; /* Save new process table */
 
 /* Update the UPRT, HR and GTSE bits in the LPCR for all cpus */
-spapr_set_all_lpcrs(((flags & (FLAG_RADIX | FLAG_HASH_PROC_TBL)) ?
- (LPCR_UPRT | LPCR_HR) : 0) |
-((flags & FLAG_GTSE) ? LPCR_GTSE : 0),
-LPCR_UPRT | LPCR_HR | LPCR_GTSE);
+if (flags & FLAG_RADIX) /* Radix must use process tables, also set HR */
+update_lpcr |= (LPCR_UPRT | LPCR_HR);
+else if (flags & FLAG_HASH_PROC_TBL) /* Hash with process tables */
+update_lpcr |= LPCR_UPRT;
+if (flags & FLAG_GTSE)  /* Guest translation shootdown enable */
+update_lpcr |= LPCR_GTSE;
+spapr_set_all_lpcrs(update_lpcr, LPCR_UPRT | LPCR_HR | LPCR_GTSE);
 
 if (kvm_enabled()) {
 return kvmppc_configure_v3_mmu(cpu, flags & FLAG_RADIX,
-- 
2.20.1

[Qemu-devel] [PULL 12/60] target/ppc/spapr: Enable mitigations by default for pseries-4.0 machine type

2019-03-10 Thread David Gibson

From: Suraj Jitindar Singh 

There are currently 3 mitigations the availability of which is controlled
by the spapr-caps mechanism, cap-cfpc, cap-sbbc, and cap-ibs. Enable these
mitigations by default for the pseries-4.0 machine type.

By now machine firmware should have been upgraded to allow these
settings.

Signed-off-by: Suraj Jitindar Singh 
Message-Id: <20190301044609.9626-3-sjitindarsi...@gmail.com>
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 37fd7a1411..946bbcf9ee 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4307,9 +4307,9 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
 smc->default_caps.caps[SPAPR_CAP_HTM] = SPAPR_CAP_OFF;
 smc->default_caps.caps[SPAPR_CAP_VSX] = SPAPR_CAP_ON;
 smc->default_caps.caps[SPAPR_CAP_DFP] = SPAPR_CAP_ON;
-smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
-smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
-smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
+smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_WORKAROUND;
+smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_WORKAROUND;
+smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_WORKAROUND;
 smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 16; /* 64kiB */
 smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF;
 smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_ON;
@@ -4389,6 +4389,9 @@ static void spapr_machine_3_1_class_options(MachineClass *mc)
 mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power8_v2.0");
 smc->update_dt_enabled = false;
 smc->dr_phb_enabled = false;
+smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
+smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
+smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;
 smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_OFF;
 }
 
-- 
2.20.1

[Qemu-devel] [PULL 41/60] mac_oldworld: use node name instead of alias name for hd device in FWPathProvider

2019-03-10 Thread David Gibson

From: Mark Cave-Ayland 

When using -drive to configure the hd drive for the Old World machine, the node
name "disk" should be used instead of the "hd" alias.

Signed-off-by: Mark Cave-Ayland 
Message-Id: <20190307212058.4890-2-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: David Gibson 
---
 hw/ppc/mac_oldworld.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/mac_oldworld.c b/hw/ppc/mac_oldworld.c
index cc1e463466..460cbc7923 100644
--- a/hw/ppc/mac_oldworld.c
+++ b/hw/ppc/mac_oldworld.c
@@ -402,11 +402,11 @@ static char *heathrow_fw_dev_path(FWPathProvider *p, BusState *bus,
 return g_strdup("cdrom");
 }
 
-return g_strdup("hd");
+return g_strdup("disk");
 }
 
 if (!strcmp(object_get_typename(OBJECT(dev)), "ide-hd")) {
-return g_strdup("hd");
+return g_strdup("disk");
 }
 
 if (!strcmp(object_get_typename(OBJECT(dev)), "ide-cd")) {
-- 
2.20.1

[Qemu-devel] [PULL 14/60] target/ppc: Move handling of hardware breakpoints to a separate function

2019-03-10 Thread David Gibson

From: Fabiano Rosas 

This is in preparation for a refactoring of the kvm_handle_debug
function in the next patch.

Signed-off-by: Fabiano Rosas 
Message-Id: <20190228225759.21328-4-faro...@linux.ibm.com>
Signed-off-by: David Gibson 
---
 target/ppc/kvm.c | 47 ---
 1 file changed, 28 insertions(+), 19 deletions(-)

diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index e0f0de0ce0..996b08a1d3 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -1597,35 +1597,44 @@ void kvm_arch_update_guest_debug(CPUState *cs, struct kvm_guest_debug *dbg)
 }
 }
 
+static int kvm_handle_hw_breakpoint(CPUState *cs,
+struct kvm_debug_exit_arch *arch_info)
+{
+int handle = 0;
+int n;
+int flag = 0;
+
+if (nb_hw_breakpoint + nb_hw_watchpoint > 0) {
+if (arch_info->status & KVMPPC_DEBUG_BREAKPOINT) {
+n = find_hw_breakpoint(arch_info->address, GDB_BREAKPOINT_HW);
+if (n >= 0) {
+handle = 1;
+}
+} else if (arch_info->status & (KVMPPC_DEBUG_WATCH_READ |
+KVMPPC_DEBUG_WATCH_WRITE)) {
+n = find_hw_watchpoint(arch_info->address,  &flag);
+if (n >= 0) {
+handle = 1;
+cs->watchpoint_hit = &hw_watchpoint;
+hw_watchpoint.vaddr = hw_debug_points[n].addr;
+hw_watchpoint.flags = flag;
+}
+}
+}
+return handle;
+}
+
 static int kvm_handle_debug(PowerPCCPU *cpu, struct kvm_run *run)
 {
 CPUState *cs = CPU(cpu);
 CPUPPCState *env = &cpu->env;
 struct kvm_debug_exit_arch *arch_info = &run->debug.arch;
 int handle = 0;
-int n;
-int flag = 0;
 
 if (cs->singlestep_enabled) {
 handle = 1;
 } else if (arch_info->status) {
-if (nb_hw_breakpoint + nb_hw_watchpoint > 0) {
-if (arch_info->status & KVMPPC_DEBUG_BREAKPOINT) {
-n = find_hw_breakpoint(arch_info->address, GDB_BREAKPOINT_HW);
-if (n >= 0) {
-handle = 1;
-}
-} else if (arch_info->status & (KVMPPC_DEBUG_WATCH_READ |
-KVMPPC_DEBUG_WATCH_WRITE)) {
-n = find_hw_watchpoint(arch_info->address,  &flag);
-if (n >= 0) {
-handle = 1;
-cs->watchpoint_hit = &hw_watchpoint;
-hw_watchpoint.vaddr = hw_debug_points[n].addr;
-hw_watchpoint.flags = flag;
-}
-}
-}
+handle = kvm_handle_hw_breakpoint(cs, arch_info);
 } else if (kvm_find_sw_breakpoint(cs, arch_info->address)) {
 handle = 1;
 } else {
-- 
2.20.1

[Qemu-devel] [PULL 29/60] ppc/xive: activate HV support

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

The NSR register of the HV ring has a different, although similar, bit
layout. The TM_QW3_NSR_HE_PHYS bit should now be raised when the
Hypervisor interrupt line is signaled. The other bits, TM_QW3_NSR_HE_POOL
and TM_QW3_NSR_HE_LSI, are not modeled. The LSI bit is for special
interrupts reserved for HW bringup and the POOL bit is used when
signaling a group of VPs. This is not currently implemented in Linux but
it is in pHyp.

The most important special commands on the HV TIMA page are added to
let the core manage interrupts: acking and changing the CPU priority.
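
As a rough illustration of the NSR handling described above, here is a
minimal standalone sketch, not the QEMU code itself. The constant values
(EO as the top bit of the OS ring NSR, HE as a 2-bit field in the top two
bits of the HV ring NSR, PHYS encoded as 2) are assumptions modeled on the
XIVE register definitions, and the names ring_regs and ring_notify are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encodings; check them against the real XIVE register header. */
#define NSR_OS_EO      0x80u  /* OS ring: exception outstanding */
#define NSR_HV_HE_MASK 0xC0u  /* HV ring: 2-bit HE field, top two bits */
#define NSR_HV_HE_NONE 0u
#define NSR_HV_HE_POOL 1u     /* not modeled */
#define NSR_HV_HE_PHYS 2u
#define NSR_HV_HE_LSI  3u     /* not modeled */

enum ring { RING_OS, RING_HV_PHYS };

/* Hypothetical per-ring register subset. */
struct ring_regs {
    uint8_t nsr;   /* notification source register */
    uint8_t cppr;  /* current processor priority */
    uint8_t pipr;  /* pending interrupt priority */
};

/*
 * Flag a pending exception in the NSR if the pending priority is more
 * favored (numerically lower) than the current CPU priority.
 * Returns 1 when the output interrupt line should be raised.
 */
static int ring_notify(struct ring_regs *r, enum ring ring)
{
    if (r->pipr >= r->cppr) {
        return 0;
    }

    switch (ring) {
    case RING_OS:
        r->nsr |= NSR_OS_EO;
        break;
    case RING_HV_PHYS:
        /* The HE field sits in the two most significant bits. */
        r->nsr |= (uint8_t)(NSR_HV_HE_PHYS << 6);
        break;
    default:
        assert(0 && "unknown ring");
    }
    return 1;
}
```

The per-ring switch mirrors the point of the patch: the two rings no longer
share a single exception mask when the bit is set, because the HV ring
encodes the kind of exception in a field rather than in a single bit.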

Signed-off-by: Cédric Le Goater 
Message-Id: <20190306085032.15744-10-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/intc/xive.c | 57 +++---
 1 file changed, 54 insertions(+), 3 deletions(-)

diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index 7d7992c0ce..a0b87001da 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -54,6 +54,8 @@ static uint8_t exception_mask(uint8_t ring)
 switch (ring) {
 case TM_QW1_OS:
 return TM_QW1_NSR_EO;
+case TM_QW3_HV_PHYS:
+return TM_QW3_NSR_HE;
 default:
 g_assert_not_reached();
 }
@@ -88,7 +90,16 @@ static void xive_tctx_notify(XiveTCTX *tctx, uint8_t ring)
 uint8_t *regs = &tctx->regs[ring];
 
 if (regs[TM_PIPR] < regs[TM_CPPR]) {
-regs[TM_NSR] |= exception_mask(ring);
+switch (ring) {
+case TM_QW1_OS:
+regs[TM_NSR] |= TM_QW1_NSR_EO;
+break;
+case TM_QW3_HV_PHYS:
+regs[TM_NSR] |= (TM_QW3_NSR_HE_PHYS << 6);
+break;
+default:
+g_assert_not_reached();
+}
 qemu_irq_raise(tctx->output);
 }
 }
@@ -109,6 +120,38 @@ static void xive_tctx_set_cppr(XiveTCTX *tctx, uint8_t ring, uint8_t cppr)
  * XIVE Thread Interrupt Management Area (TIMA)
  */
 
+static void xive_tm_set_hv_cppr(XiveTCTX *tctx, hwaddr offset,
+uint64_t value, unsigned size)
+{
+xive_tctx_set_cppr(tctx, TM_QW3_HV_PHYS, value & 0xff);
+}
+
+static uint64_t xive_tm_ack_hv_reg(XiveTCTX *tctx, hwaddr offset, unsigned size)
+{
+return xive_tctx_accept(tctx, TM_QW3_HV_PHYS);
+}
+
+static uint64_t xive_tm_pull_pool_ctx(XiveTCTX *tctx, hwaddr offset,
+  unsigned size)
+{
+uint64_t ret;
+
+ret = tctx->regs[TM_QW2_HV_POOL + TM_WORD2] & TM_QW2W2_POOL_CAM;
+tctx->regs[TM_QW2_HV_POOL + TM_WORD2] &= ~TM_QW2W2_POOL_CAM;
+return ret;
+}
+
+static void xive_tm_vt_push(XiveTCTX *tctx, hwaddr offset,
+uint64_t value, unsigned size)
+{
+tctx->regs[TM_QW3_HV_PHYS + TM_WORD2] = value & 0xff;
+}
+
+static uint64_t xive_tm_vt_poll(XiveTCTX *tctx, hwaddr offset, unsigned size)
+{
+return tctx->regs[TM_QW3_HV_PHYS + TM_WORD2] & 0xff;
+}
+
 /*
  * Define an access map for each page of the TIMA that we will use in
  * the memory region ops to filter values when doing loads and stores
@@ -288,10 +331,16 @@ static const XiveTmOp xive_tm_operations[] = {
  * effects
  */
 { XIVE_TM_OS_PAGE, TM_QW1_OS + TM_CPPR,   1, xive_tm_set_os_cppr, NULL },
+{ XIVE_TM_HV_PAGE, TM_QW3_HV_PHYS + TM_CPPR, 1, xive_tm_set_hv_cppr, NULL },
+{ XIVE_TM_HV_PAGE, TM_QW3_HV_PHYS + TM_WORD2, 1, xive_tm_vt_push, NULL },
+{ XIVE_TM_HV_PAGE, TM_QW3_HV_PHYS + TM_WORD2, 1, NULL, xive_tm_vt_poll },
 
 /* MMIOs above 2K : special operations with side effects */
 { XIVE_TM_OS_PAGE, TM_SPC_ACK_OS_REG, 2, NULL, xive_tm_ack_os_reg },
 { XIVE_TM_OS_PAGE, TM_SPC_SET_OS_PENDING, 1, xive_tm_set_os_pending, NULL },
+{ XIVE_TM_HV_PAGE, TM_SPC_ACK_HV_REG, 2, NULL, xive_tm_ack_hv_reg },
+{ XIVE_TM_HV_PAGE, TM_SPC_PULL_POOL_CTX,  4, NULL, xive_tm_pull_pool_ctx },
+{ XIVE_TM_HV_PAGE, TM_SPC_PULL_POOL_CTX,  8, NULL, xive_tm_pull_pool_ctx },
 };
 
 static const XiveTmOp *xive_tm_find_op(hwaddr offset, unsigned size, bool write)
@@ -323,7 +372,7 @@ void xive_tctx_tm_write(XiveTCTX *tctx, hwaddr offset, uint64_t value,
 const XiveTmOp *xto;
 
 /*
- * TODO: check V bit in Q[0-3]W2, check PTER bit associated with CPU
+ * TODO: check V bit in Q[0-3]W2
  */
 
 /*
@@ -360,7 +409,7 @@ uint64_t xive_tctx_tm_read(XiveTCTX *tctx, hwaddr offset, unsigned size)
 const XiveTmOp *xto;
 
 /*
- * TODO: check V bit in Q[0-3]W2, check PTER bit associated with CPU
+ * TODO: check V bit in Q[0-3]W2
  */
 
 /*
@@ -472,6 +521,8 @@ static void xive_tctx_reset(void *dev)
  */
 tctx->regs[TM_QW1_OS + TM_PIPR] =
 ipb_to_pipr(tctx->regs[TM_QW1_OS + TM_IPB]);
+tctx->regs[TM_QW3_HV_PHYS + TM_PIPR] =
+ipb_to_pipr(tctx->regs[TM_QW3_HV_PHYS + TM_IPB]);
 }
 
 static void xive_tctx_realize(DeviceState *dev, Error **errp)
-- 
2.20.1

[Qemu-devel] [PULL 22/60] ppc: externalize ppc_get_vcpu_by_pir()

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

We will use it to get the CPU interrupt presenter in XIVE when the
TIMA is accessed from the indirect page.

Signed-off-by: Cédric Le Goater 
Message-Id: <20190306085032.15744-3-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 16 
 hw/ppc/ppc.c | 16 
 include/hw/ppc/ppc.h |  1 +
 3 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 3d5dfef220..9aa81c7f09 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -1082,22 +1082,6 @@ static void pnv_ics_resend(XICSFabric *xi)
 }
 }
 
-static PowerPCCPU *ppc_get_vcpu_by_pir(int pir)
-{
-CPUState *cs;
-
-CPU_FOREACH(cs) {
-PowerPCCPU *cpu = POWERPC_CPU(cs);
-CPUPPCState *env = &cpu->env;
-
-if (env->spr_cb[SPR_PIR].default_value == pir) {
-return cpu;
-}
-}
-
-return NULL;
-}
-
 static ICPState *pnv_icp_get(XICSFabric *xi, int pir)
 {
 PowerPCCPU *cpu = ppc_get_vcpu_by_pir(pir);
diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index 9145aeddcb..b2ff99ec66 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -1487,3 +1487,19 @@ void PPC_debug_write (void *opaque, uint32_t addr, uint32_t val)
 break;
 }
 }
+
+PowerPCCPU *ppc_get_vcpu_by_pir(int pir)
+{
+CPUState *cs;
+
+CPU_FOREACH(cs) {
+PowerPCCPU *cpu = POWERPC_CPU(cs);
+CPUPPCState *env = &cpu->env;
+
+if (env->spr_cb[SPR_PIR].default_value == pir) {
+return cpu;
+}
+}
+
+return NULL;
+}
diff --git a/include/hw/ppc/ppc.h b/include/hw/ppc/ppc.h
index 746170f635..4bdcb8bacd 100644
--- a/include/hw/ppc/ppc.h
+++ b/include/hw/ppc/ppc.h
@@ -4,6 +4,7 @@
 #include "target/ppc/cpu-qom.h"
 
 void ppc_set_irq(PowerPCCPU *cpu, int n_IRQ, int level);
+PowerPCCPU *ppc_get_vcpu_by_pir(int pir);
 
 /* PowerPC hardware exceptions management helpers */
 typedef void (*clk_setup_cb)(void *opaque, uint32_t freq);
-- 
2.20.1

[Qemu-devel] [PULL 37/60] target/ppc: introduce avr_full_offset() function

2019-03-10 Thread David Gibson

From: Mark Cave-Ayland 

All TCG vector operations require pointers to the base address of the vector
rather than separate access to the top and bottom 64-bits. Convert the VMX TCG
instructions to use a new avr_full_offset() function instead of avr64_offset()
which can then itself be written as a simple wrapper onto vsr_full_offset().

This same function can also be reused in cpu_avr_ptr() to avoid having more
than one copy of the offset calculation logic.
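
To make the layering concrete, here is a small self-contained sketch of
the idea, assuming a simplified CPU state in which the 32 Altivec registers
are the upper half of a 64-entry VSR array. The struct cpu_state and
vec128_t types are stand-ins, not the real CPUPPCState and ppc_avr_t, and
the offset is computed with plain arithmetic rather than an indexed
offsetof() so that the sketch stays within standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a 128-bit vector register. */
typedef union {
    uint8_t  u8[16];
    uint64_t u64[2];
} vec128_t;

/* Stand-in for the CPU state: AVR n is assumed to live in vsr[32 + n]. */
struct cpu_state {
    uint64_t gpr[32];
    vec128_t vsr[64];
};

/* Byte offset of the start of VSR i within the CPU state. */
static inline size_t vsr_full_offset(int i)
{
    assert(i >= 0 && i < 64);
    return offsetof(struct cpu_state, vsr) + (size_t)i * sizeof(vec128_t);
}

/* AVR i is a thin wrapper: it is simply VSR 32 + i. */
static inline size_t avr_full_offset(int i)
{
    return vsr_full_offset(i + 32);
}

/* Pointer accessor built on the same offset, so the logic exists once. */
static inline vec128_t *cpu_avr_ptr(struct cpu_state *env, int i)
{
    return (vec128_t *)((uintptr_t)env + avr_full_offset(i));
}
```

Because both the code generator and the pointer accessor go through
avr_full_offset(), a change to where the Altivec registers live would only
need to be made in one place.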

Signed-off-by: Mark Cave-Ayland 
Message-Id: <20190307180520.13868-5-mark.cave-ayl...@ilande.co.uk>
Reviewed-by: Richard Henderson 
Signed-off-by: David Gibson 
---
 target/ppc/cpu.h| 12 +++-
 target/ppc/translate/vmx-impl.inc.c | 22 +++---
 target/ppc/translate/vsx-impl.inc.c |  5 -
 3 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 1c4af4a1dc..caddbd012c 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2598,14 +2598,24 @@ static inline int vsrl_offset(int i)
 return offsetof(CPUPPCState, vsr[i].u64[1]);
 }
 
+static inline int vsr_full_offset(int i)
+{
+return offsetof(CPUPPCState, vsr[i].u64[0]);
+}
+
 static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
 {
 return (uint64_t *)((uintptr_t)env + vsrl_offset(i));
 }
 
+static inline int avr_full_offset(int i)
+{
+return vsr_full_offset(i + 32);
+}
+
 static inline ppc_avr_t *cpu_avr_ptr(CPUPPCState *env, int i)
 {
-return &env->vsr[32 + i];
+return (ppc_avr_t *)((uintptr_t)env + avr_full_offset(i));
 }
 
 void dump_mmu(FILE *f, fprintf_function cpu_fprintf, CPUPPCState *env);
diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index f1b15ae2cb..4e5d0bc0e0 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -10,7 +10,7 @@
 static inline TCGv_ptr gen_avr_ptr(int reg)
 {
 TCGv_ptr r = tcg_temp_new_ptr();
-tcg_gen_addi_ptr(r, cpu_env, offsetof(CPUPPCState, vsr[32 + reg].u64[0]));
+tcg_gen_addi_ptr(r, cpu_env, avr_full_offset(reg));
 return r;
 }
 
@@ -205,7 +205,7 @@ static void gen_mtvscr(DisasContext *ctx)
 }
 
 val = tcg_temp_new_i32();
-bofs = avr64_offset(rB(ctx->opcode), true);
+bofs = avr_full_offset(rB(ctx->opcode));
 #ifdef HOST_WORDS_BIGENDIAN
 bofs += 3 * 4;
 #endif
@@ -284,9 +284,9 @@ static void glue(gen_, name)(DisasContext *ctx) \
 }   \
 \
 tcg_op(vece,\
-   avr64_offset(rD(ctx->opcode), true), \
-   avr64_offset(rA(ctx->opcode), true), \
-   avr64_offset(rB(ctx->opcode), true), \
+   avr_full_offset(rD(ctx->opcode)),\
+   avr_full_offset(rA(ctx->opcode)),\
+   avr_full_offset(rB(ctx->opcode)),\
16, 16); \
 }
 
@@ -578,10 +578,10 @@ static void glue(gen_, NAME)(DisasContext *ctx) \
 gen_exception(ctx, POWERPC_EXCP_VPU);   \
 return; \
 }   \
-tcg_gen_gvec_4(avr64_offset(rD(ctx->opcode), true), \
+tcg_gen_gvec_4(avr_full_offset(rD(ctx->opcode)),\
offsetof(CPUPPCState, vscr_sat), \
-   avr64_offset(rA(ctx->opcode), true), \
-   avr64_offset(rB(ctx->opcode), true), \
+   avr_full_offset(rA(ctx->opcode)),\
+   avr_full_offset(rB(ctx->opcode)),\
16, 16, &g); \
 }
 
@@ -755,7 +755,7 @@ static void glue(gen_, name)(DisasContext *ctx) \
 return; \
 }   \
 simm = SIMM5(ctx->opcode);  \
-tcg_op(avr64_offset(rD(ctx->opcode), true), 16, 16, simm);  \
+tcg_op(avr_full_offset(rD(ctx->opcode)), 16, 16, simm); \
 }
 
 GEN_VXFORM_DUPI(vspltisb, tcg_gen_gvec_dup8i, 6, 12);
@@ -850,8 +850,8 @@ static void gen_vsplt(DisasContext *ctx, int vece)
 }
 
 uimm = UIMM5(ctx->opcode);
-bofs = avr64_offset(rB(ctx->opcode), true);
-dofs = avr64_offset(rD(ctx->opcode), true);
+bofs = avr_full_offset(rB(ctx->opcode));
+dofs = avr_full_offset(rD(ctx->opcode));
 
 /* Exper

[Qemu-devel] [PULL 25/60] ppc/pnv: change the CPU machine_data presenter type to Object *

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

The POWER9 PowerNV machine will use a XIVE interrupt presenter type.

Signed-off-by: Cédric Le Goater 
Message-Id: <20190306085032.15744-6-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c  | 6 +++---
 hw/ppc/pnv_core.c | 2 +-
 include/hw/ppc/pnv_core.h | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 9aa81c7f09..b90d03711a 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -684,7 +684,7 @@ static void pnv_chip_power8_intc_create(PnvChip *chip, PowerPCCPU *cpu,
 return;
 }
 
-pnv_cpu->icp = ICP(obj);
+pnv_cpu->intc = obj;
 }
 
 /*
@@ -1086,7 +1086,7 @@ static ICPState *pnv_icp_get(XICSFabric *xi, int pir)
 {
 PowerPCCPU *cpu = ppc_get_vcpu_by_pir(pir);
 
-return cpu ? pnv_cpu_state(cpu)->icp : NULL;
+return cpu ? ICP(pnv_cpu_state(cpu)->intc) : NULL;
 }
 
 static void pnv_pic_print_info(InterruptStatsProvider *obj,
@@ -1099,7 +1099,7 @@ static void pnv_pic_print_info(InterruptStatsProvider *obj,
 CPU_FOREACH(cs) {
 PowerPCCPU *cpu = POWERPC_CPU(cs);
 
-icp_pic_print_info(pnv_cpu_state(cpu)->icp, mon);
+icp_pic_print_info(ICP(pnv_cpu_state(cpu)->intc), mon);
 }
 
 for (i = 0; i < pnv->num_chips; i++) {
diff --git a/hw/ppc/pnv_core.c b/hw/ppc/pnv_core.c
index 7c806da720..38179cdc53 100644
--- a/hw/ppc/pnv_core.c
+++ b/hw/ppc/pnv_core.c
@@ -198,7 +198,7 @@ static void pnv_unrealize_vcpu(PowerPCCPU *cpu)
 PnvCPUState *pnv_cpu = pnv_cpu_state(cpu);
 
 qemu_unregister_reset(pnv_cpu_reset, cpu);
-object_unparent(OBJECT(pnv_cpu_state(cpu)->icp));
+object_unparent(OBJECT(pnv_cpu_state(cpu)->intc));
 cpu_remove_sync(CPU(cpu));
 cpu->machine_data = NULL;
 g_free(pnv_cpu);
diff --git a/include/hw/ppc/pnv_core.h b/include/hw/ppc/pnv_core.h
index 9961ea3a92..6874bb847a 100644
--- a/include/hw/ppc/pnv_core.h
+++ b/include/hw/ppc/pnv_core.h
@@ -48,7 +48,7 @@ typedef struct PnvCoreClass {
 #define PNV_CORE_TYPE_NAME(cpu_model) cpu_model PNV_CORE_TYPE_SUFFIX
 
 typedef struct PnvCPUState {
-struct ICPState *icp;
+Object *intc;
 } PnvCPUState;
 
 static inline PnvCPUState *pnv_cpu_state(PowerPCCPU *cpu)
-- 
2.20.1

[Qemu-devel] [PULL 20/60] PPC: E500: Add FSL I2C controller and integrate RTC with it

2019-03-10 Thread David Gibson

From: Andrew Randrianasulu 

Original commit message:
This patch adds an emulation model for the i2c controller found on most of
the FSL SoCs. It also integrates the RTC (ds1338) that sits on the i2c bus
with the e500 machine model.

Patch was originally written by Amit Singh Tomar 
see http://patchwork.ozlabs.org/patch/431475/
I only fixed it enough for application on top of current qemu master
20b084c4b1401b7f8fbc385649d48c67b6f43d44, and hopefully fixed checkpatch errors

Tested by booting Linux kernel 4.20.12. The e500 machine no longer needs
a network time protocol daemon because it now has a working RTC
(previously all timestamps on files were from 2016).

Signed-off-by: Amit Singh Tomar 
Signed-off-by: Andrew Randrianasulu 
Message-Id: <20190306102812.28972-1-randrianas...@gmail.com>
[dwg: Add Kconfig stanza to define the new symbol]
Signed-off-by: David Gibson 
---
 default-configs/ppc-softmmu.mak |   2 +
 hw/i2c/Kconfig  |   4 +
 hw/i2c/Makefile.objs|   1 +
 hw/i2c/mpc_i2c.c| 357 
 hw/ppc/e500.c   |  54 +
 5 files changed, 418 insertions(+)
 create mode 100644 hw/i2c/mpc_i2c.c

diff --git a/default-configs/ppc-softmmu.mak b/default-configs/ppc-softmmu.mak
index 6ea36d4090..bf86128a0c 100644
--- a/default-configs/ppc-softmmu.mak
+++ b/default-configs/ppc-softmmu.mak
@@ -1,6 +1,8 @@
 # Default configuration for ppc-softmmu
 
 # For embedded PPCs:
+CONFIG_MPC_I2C=y
+CONFIG_DS1338=y
 CONFIG_E500=y
 CONFIG_PPC405=y
 CONFIG_PPC440=y
diff --git a/hw/i2c/Kconfig b/hw/i2c/Kconfig
index ef1caa6d89..820b24de5b 100644
--- a/hw/i2c/Kconfig
+++ b/hw/i2c/Kconfig
@@ -25,3 +25,7 @@ config BITBANG_I2C
 config IMX_I2C
 bool
 select I2C
+
+config MPC_I2C
+bool
+select I2C
diff --git a/hw/i2c/Makefile.objs b/hw/i2c/Makefile.objs
index 2a3c106551..5f76b6a990 100644
--- a/hw/i2c/Makefile.objs
+++ b/hw/i2c/Makefile.objs
@@ -9,5 +9,6 @@ common-obj-$(CONFIG_EXYNOS4) += exynos4210_i2c.o
 common-obj-$(CONFIG_IMX_I2C) += imx_i2c.o
 common-obj-$(CONFIG_ASPEED_SOC) += aspeed_i2c.o
 common-obj-$(CONFIG_NRF51_SOC) += microbit_i2c.o
+common-obj-$(CONFIG_MPC_I2C) += mpc_i2c.o
 obj-$(CONFIG_OMAP) += omap_i2c.o
 obj-$(CONFIG_PPC4XX) += ppc4xx_i2c.o
diff --git a/hw/i2c/mpc_i2c.c b/hw/i2c/mpc_i2c.c
new file mode 100644
index 00..693ca7ef6b
--- /dev/null
+++ b/hw/i2c/mpc_i2c.c
@@ -0,0 +1,357 @@
+/*
+ * Copyright (C) 2014 Freescale Semiconductor, Inc. All rights reserved.
+ *
+ * Author: Amit Tomar, 
+ *
+ * Description:
+ * This file is derived from IMX I2C controller,
+ * by Jean-Christophe DUBOIS .
+ *
+ * Thanks to Scott Wood and Alexander Graf for their kind help on this.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2 or later,
+ * as published by the Free Software Foundation.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "hw/i2c/i2c.h"
+#include "qemu/log.h"
+#include "hw/sysbus.h"
+
+/* #define DEBUG_I2C */
+
+#ifdef DEBUG_I2C
+#define DPRINTF(fmt, ...)  \
+do { fprintf(stderr, "mpc_i2c[%s]: " fmt, __func__, ## __VA_ARGS__); \
+} while (0)
+#else
+#define DPRINTF(fmt, ...) do {} while (0)
+#endif
+
+#define TYPE_MPC_I2C "mpc-i2c"
+#define MPC_I2C(obj) \
+OBJECT_CHECK(MPCI2CState, (obj), TYPE_MPC_I2C)
+
+#define MPC_I2C_ADR   0x00
+#define MPC_I2C_FDR   0x04
+#define MPC_I2C_CR0x08
+#define MPC_I2C_SR0x0c
+#define MPC_I2C_DR0x10
+#define MPC_I2C_DFSRR 0x14
+
+#define CCR_MEN  (1 << 7)
+#define CCR_MIEN (1 << 6)
+#define CCR_MSTA (1 << 5)
+#define CCR_MTX  (1 << 4)
+#define CCR_TXAK (1 << 3)
+#define CCR_RSTA (1 << 2)
+#define CCR_BCST (1 << 0)
+
+#define CSR_MCF  (1 << 7)
+#define CSR_MAAS (1 << 6)
+#define CSR_MBB  (1 << 5)
+#define CSR_MAL  (1 << 4)
+#define CSR_SRW  (1 << 2)
+#define CSR_MIF  (1 << 1)
+#define CSR_RXAK (1 << 0)
+
+#define CADR_MASK 0xFE
+#define CFDR_MASK 0x3F
+#define CCR_MASK  0xFC
+#define CSR_MASK  0xED
+#define CDR_MASK  0xFF
+
+#define CYCLE_RESET 0xFF
+
+typedef struct MPCI2CState {
+SysBusDevice parent_obj;
+
+I2CBus *bus;
+qemu_irq irq;
+MemoryRegion iomem;
+
+uint8_t address;
+uint8_t adr;
+uint8_t fdr;
+uint8_t cr;
+uint8_t sr;
+uint8_t dr;
+uint8_t dfssr;
+} MPCI2CState;
+
+static bool mpc_i2c_is_enabled(MPCI2CState *s)
+{
+return s->cr & CCR_MEN;
+}
+
+static bool mpc_i2c_is_master(MPCI2CState *s)
+{
+return s->cr & CCR_MSTA;
+}
+
+static bool mpc_i2c_direction_is_tx(MPCI2CState *s)
+{
+return s->cr & CCR_MTX;
+}
+
+static bool mpc_i2c_irq_pending(MPCI2CState *s)
+{
+return s->sr & CSR_MIF;
+}
+
+static bool mpc_i2c_irq_is_enabled(MPCI2CState *s)
+{
+return s->cr & CCR_MIEN;
+}
+
+static void mpc_i2c_reset(DeviceState *dev)
+{
+MPCI2CState *i2c = MPC_I

[Qemu-devel] [PULL 28/60] ppc/pnv: introduce a new pic_print_info() operation to the chip model

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

The POWER9 and POWER8 processors have different interrupt controllers,
and reporting their state requires calling different helper routines.

However, the interrupt presenters are still handled in the higher
level pic_print_info() routine because they are not related to the
chip.
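
The per-chip dispatch can be sketched in isolation as a class structure
carrying a function pointer. This is only an illustration under stated
assumptions: the chip and chip_class types, the output strings, and the
buffer-based interface are hypothetical and stand in for the QOM class and
the monitor used in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct chip;

/* Hypothetical chip class: each chip family supplies its own printer. */
struct chip_class {
    const char *name;
    void (*pic_print_info)(const struct chip *chip, char *buf, size_t len);
};

struct chip {
    const struct chip_class *klass;
    int id;
};

/* POWER8-style chips report the state of their PSI interrupt source. */
static void power8_pic_print_info(const struct chip *chip,
                                  char *buf, size_t len)
{
    snprintf(buf, len, "chip %d: PSI ICS state", chip->id);
}

/* POWER9-style chips report the state of their XIVE controller. */
static void power9_pic_print_info(const struct chip *chip,
                                  char *buf, size_t len)
{
    snprintf(buf, len, "chip %d: XIVE state", chip->id);
}

static const struct chip_class power8_class = {
    "power8", power8_pic_print_info
};
static const struct chip_class power9_class = {
    "power9", power9_pic_print_info
};

/* The machine-level loop no longer needs to know the chip family. */
static void chip_pic_print_info(const struct chip *chip,
                                char *buf, size_t len)
{
    assert(chip->klass->pic_print_info != NULL);
    chip->klass->pic_print_info(chip, buf, len);
}
```

The caller iterates over the chips and calls chip_pic_print_info() on each
one, which is the shape of the loop the patch introduces in place of the
POWER8-only cast.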

Signed-off-by: Cédric Le Goater 
Message-Id: <20190306085032.15744-9-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 27 ---
 include/hw/ppc/pnv.h |  1 +
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 087541a91a..7660eaa22c 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -567,6 +567,20 @@ static ISABus *pnv_isa_create(PnvChip *chip, Error **errp)
 return PNV_CHIP_GET_CLASS(chip)->isa_create(chip, errp);
 }
 
+static void pnv_chip_power8_pic_print_info(PnvChip *chip, Monitor *mon)
+{
+Pnv8Chip *chip8 = PNV8_CHIP(chip);
+
+ics_pic_print_info(&chip8->psi.ics, mon);
+}
+
+static void pnv_chip_power9_pic_print_info(PnvChip *chip, Monitor *mon)
+{
+Pnv9Chip *chip9 = PNV9_CHIP(chip);
+
+pnv_xive_pic_print_info(&chip9->xive, mon);
+}
+
 static void pnv_init(MachineState *machine)
 {
 PnvMachineState *pnv = PNV_MACHINE(machine);
@@ -878,6 +892,7 @@ static void pnv_chip_power8e_class_init(ObjectClass *klass, void *data)
 k->intc_create = pnv_chip_power8_intc_create;
 k->isa_create = pnv_chip_power8_isa_create;
 k->dt_populate = pnv_chip_power8_dt_populate;
+k->pic_print_info = pnv_chip_power8_pic_print_info;
 k->xscom_base = 0x003fc00ull;
 dc->desc = "PowerNV Chip POWER8E";
 
@@ -897,6 +912,7 @@ static void pnv_chip_power8_class_init(ObjectClass *klass, void *data)
 k->intc_create = pnv_chip_power8_intc_create;
 k->isa_create = pnv_chip_power8_isa_create;
 k->dt_populate = pnv_chip_power8_dt_populate;
+k->pic_print_info = pnv_chip_power8_pic_print_info;
 k->xscom_base = 0x003fc00ull;
 dc->desc = "PowerNV Chip POWER8";
 
@@ -916,6 +932,7 @@ static void pnv_chip_power8nvl_class_init(ObjectClass *klass, void *data)
 k->intc_create = pnv_chip_power8_intc_create;
 k->isa_create = pnv_chip_power8nvl_isa_create;
 k->dt_populate = pnv_chip_power8_dt_populate;
+k->pic_print_info = pnv_chip_power8_pic_print_info;
 k->xscom_base = 0x003fc00ull;
 dc->desc = "PowerNV Chip POWER8NVL";
 
@@ -977,6 +994,7 @@ static void pnv_chip_power9_class_init(ObjectClass *klass, void *data)
 k->intc_create = pnv_chip_power9_intc_create;
 k->isa_create = pnv_chip_power9_isa_create;
 k->dt_populate = pnv_chip_power9_dt_populate;
+k->pic_print_info = pnv_chip_power9_pic_print_info;
 k->xscom_base = 0x00603fcull;
 dc->desc = "PowerNV Chip POWER9";
 
@@ -1164,12 +1182,15 @@ static void pnv_pic_print_info(InterruptStatsProvider *obj,
 CPU_FOREACH(cs) {
 PowerPCCPU *cpu = POWERPC_CPU(cs);
 
-icp_pic_print_info(ICP(pnv_cpu_state(cpu)->intc), mon);
+if (pnv_chip_is_power9(pnv->chips[0])) {
+xive_tctx_pic_print_info(XIVE_TCTX(pnv_cpu_state(cpu)->intc), mon);
+} else {
+icp_pic_print_info(ICP(pnv_cpu_state(cpu)->intc), mon);
+}
 }
 
 for (i = 0; i < pnv->num_chips; i++) {
-Pnv8Chip *chip8 = PNV8_CHIP(pnv->chips[i]);
-ics_pic_print_info(&chip8->psi.ics, mon);
+PNV_CHIP_GET_CLASS(pnv->chips[i])->pic_print_info(pnv->chips[i], mon);
 }
 }
 
diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index fa9ec50fd5..eb4bba25b3 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -103,6 +103,7 @@ typedef struct PnvChipClass {
 void (*intc_create)(PnvChip *chip, PowerPCCPU *cpu, Error **errp);
 ISABus *(*isa_create)(PnvChip *chip, Error **errp);
 void (*dt_populate)(PnvChip *chip, void *fdt);
+void (*pic_print_info)(PnvChip *chip, Monitor *mon);
 } PnvChipClass;
 
 #define PNV_CHIP_TYPE_SUFFIX "-" TYPE_PNV_CHIP
-- 
2.20.1

[Qemu-devel] [PULL 35/60] target/ppc: introduce single vsrl_offset() function

2019-03-10 Thread David Gibson

From: Mark Cave-Ayland 

Instead of having multiple copies of the offset calculation logic, move it to a
single vsrl_offset() function.

This commit also renames the existing get_vsr()/set_vsr() functions to
get_vsrl()/set_vsrl() which better describes their purpose.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Richard Henderson 
Message-Id: <20190307180520.13868-3-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: David Gibson 
---
 target/ppc/cpu.h|  7 ++-
 target/ppc/translate/vsx-impl.inc.c | 12 ++--
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 15e053becd..0c3fc8e084 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2573,9 +2573,14 @@ static inline uint64_t *cpu_fpr_ptr(CPUPPCState *env, int i)
 return (uint64_t *)((uintptr_t)env + fpr_offset(i));
 }
 
+static inline int vsrl_offset(int i)
+{
+return offsetof(CPUPPCState, vsr[i].u64[1]);
+}
+
 static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
 {
-return &env->vsr[i].u64[1];
+return (uint64_t *)((uintptr_t)env + vsrl_offset(i));
 }
 
 static inline ppc_avr_t *cpu_avr_ptr(CPUPPCState *env, int i)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index e73197e717..381ae0f2e9 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1,13 +1,13 @@
 /***   VSX extension   ***/
 
-static inline void get_vsr(TCGv_i64 dst, int n)
+static inline void get_vsrl(TCGv_i64 dst, int n)
 {
-tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState, vsr[n].u64[1]));
+tcg_gen_ld_i64(dst, cpu_env, vsrl_offset(n));
 }
 
-static inline void set_vsr(int n, TCGv_i64 src)
+static inline void set_vsrl(int n, TCGv_i64 src)
 {
-tcg_gen_st_i64(src, cpu_env, offsetof(CPUPPCState, vsr[n].u64[1]));
+tcg_gen_st_i64(src, cpu_env, vsrl_offset(n));
 }
 
 static inline int vsr_full_offset(int n)
@@ -27,7 +27,7 @@ static inline void get_cpu_vsrh(TCGv_i64 dst, int n)
 static inline void get_cpu_vsrl(TCGv_i64 dst, int n)
 {
 if (n < 32) {
-get_vsr(dst, n);
+get_vsrl(dst, n);
 } else {
 get_avr64(dst, n - 32, false);
 }
@@ -45,7 +45,7 @@ static inline void set_cpu_vsrh(int n, TCGv_i64 src)
 static inline void set_cpu_vsrl(int n, TCGv_i64 src)
 {
 if (n < 32) {
-set_vsr(n, src);
+set_vsrl(n, src);
 } else {
 set_avr64(n - 32, src, false);
 }
-- 
2.20.1

[Qemu-devel] [PULL 23/60] ppc/xive: export the TIMA memory accessors

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

The PowerNV machine can perform indirect loads and stores on the TIMA
on behalf of another CPU. Give the controller the possibility to call
the TIMA memory accessors with a XiveTCTX of its choice.
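
The refactoring pattern here is to split each handler into a core routine
that takes the thread context explicitly and a thin wrapper that looks up
the context of the current CPU. A minimal sketch of that pattern follows;
the tctx type, the byte-wide register file, and the current_tctx variable
(standing in for the lookup based on current_cpu) are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TCTX_REG_COUNT 64

/* Hypothetical thread interrupt context with a flat register file. */
struct tctx {
    uint8_t regs[TCTX_REG_COUNT];
};

/* Stand-in for "the context of the CPU performing the access". */
static struct tctx *current_tctx;

/* Exported accessors: these would be declared in a shared header. */
void tctx_tm_write(struct tctx *tctx, unsigned offset, uint8_t value);
uint8_t tctx_tm_read(const struct tctx *tctx, unsigned offset);

/* Core write: operates on whichever context the caller passes in. */
void tctx_tm_write(struct tctx *tctx, unsigned offset, uint8_t value)
{
    assert(offset < TCTX_REG_COUNT);
    tctx->regs[offset] = value;
}

/* Core read: same idea, no dependency on the current CPU. */
uint8_t tctx_tm_read(const struct tctx *tctx, unsigned offset)
{
    assert(offset < TCTX_REG_COUNT);
    return tctx->regs[offset];
}

/* Direct-access wrappers: resolve the context, then delegate. */
static void tm_write(unsigned offset, uint8_t value)
{
    tctx_tm_write(current_tctx, offset, value);
}

static uint8_t tm_read(unsigned offset)
{
    return tctx_tm_read(current_tctx, offset);
}
```

A controller handling an indirect access would call tctx_tm_write() or
tctx_tm_read() with the context of the target CPU, while ordinary MMIO
keeps going through the wrappers.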

Signed-off-by: Cédric Le Goater 
Message-Id: <20190306085032.15744-4-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/intc/xive.c| 23 ++-
 include/hw/ppc/xive.h |  3 +++
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index b21759c938..3d7de864e9 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -317,10 +317,9 @@ static const XiveTmOp *xive_tm_find_op(hwaddr offset, unsigned size, bool write)
 /*
  * TIMA MMIO handlers
  */
-static void xive_tm_write(void *opaque, hwaddr offset,
-  uint64_t value, unsigned size)
+void xive_tctx_tm_write(XiveTCTX *tctx, hwaddr offset, uint64_t value,
+unsigned size)
 {
-XiveTCTX *tctx = xive_router_get_tctx(XIVE_ROUTER(opaque), current_cpu);
 const XiveTmOp *xto;
 
 /*
@@ -356,9 +355,8 @@ static void xive_tm_write(void *opaque, hwaddr offset,
 xive_tm_raw_write(tctx, offset, value, size);
 }
 
-static uint64_t xive_tm_read(void *opaque, hwaddr offset, unsigned size)
+uint64_t xive_tctx_tm_read(XiveTCTX *tctx, hwaddr offset, unsigned size)
 {
-XiveTCTX *tctx = xive_router_get_tctx(XIVE_ROUTER(opaque), current_cpu);
 const XiveTmOp *xto;
 
 /*
@@ -392,6 +390,21 @@ static uint64_t xive_tm_read(void *opaque, hwaddr offset, unsigned size)
 return xive_tm_raw_read(tctx, offset, size);
 }
 
+static void xive_tm_write(void *opaque, hwaddr offset,
+  uint64_t value, unsigned size)
+{
+XiveTCTX *tctx = xive_router_get_tctx(XIVE_ROUTER(opaque), current_cpu);
+
+xive_tctx_tm_write(tctx, offset, value, size);
+}
+
+static uint64_t xive_tm_read(void *opaque, hwaddr offset, unsigned size)
+{
+XiveTCTX *tctx = xive_router_get_tctx(XIVE_ROUTER(opaque), current_cpu);
+
+return xive_tctx_tm_read(tctx, offset, size);
+}
+
 const MemoryRegionOps xive_tm_ops = {
 .read = xive_tm_read,
 .write = xive_tm_write,
diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
index 13a487527b..7dd80e0f46 100644
--- a/include/hw/ppc/xive.h
+++ b/include/hw/ppc/xive.h
@@ -410,6 +410,9 @@ void xive_end_queue_pic_print_info(XiveEND *end, uint32_t width, Monitor *mon);
 #define XIVE_TM_USER_PAGE   0x3
 
 extern const MemoryRegionOps xive_tm_ops;
+void xive_tctx_tm_write(XiveTCTX *tctx, hwaddr offset, uint64_t value,
+unsigned size);
+uint64_t xive_tctx_tm_read(XiveTCTX *tctx, hwaddr offset, unsigned size);
 
 void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon);
 Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, Error **errp);
-- 
2.20.1

[Qemu-devel] [PULL 24/60] ppc/pnv: export the xive_router_notify() routine

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

The PowerNV machine will need to encode the block id in the source
interrupt number before forwarding the source event notification to
the Router.
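
For illustration, encoding a block id into a source number could look like
the sketch below. The layout (a 4-bit block id in the top bits of a 32-bit
number, with the remaining 28 bits as the index) is an assumption, and the
helper names srcno_encode, srcno_block and srcno_index are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: [31:28] block id, [27:0] source index. */
#define SRCNO_BLOCK_SHIFT 28
#define SRCNO_BLOCK_MASK  0xfu
#define SRCNO_INDEX_MASK  0x0fffffffu

/* Combine a block id and a source index into one interrupt number. */
static inline uint32_t srcno_encode(uint8_t blk, uint32_t idx)
{
    assert(blk <= SRCNO_BLOCK_MASK);
    assert(idx <= SRCNO_INDEX_MASK);
    return ((uint32_t)blk << SRCNO_BLOCK_SHIFT) | idx;
}

/* Extract the block id, as the router does before looking up the EAS. */
static inline uint8_t srcno_block(uint32_t srcno)
{
    return (uint8_t)((srcno >> SRCNO_BLOCK_SHIFT) & SRCNO_BLOCK_MASK);
}

/* Extract the source index within the block. */
static inline uint32_t srcno_index(uint32_t srcno)
{
    return srcno & SRCNO_INDEX_MASK;
}
```

A machine-specific notifier would build the full number with something
like srcno_encode() and then hand it to the exported router entry point.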

Signed-off-by: Cédric Le Goater 
Message-Id: <20190306085032.15744-5-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/intc/xive.c| 2 +-
 include/hw/ppc/xive.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index 3d7de864e9..7d7992c0ce 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -1404,7 +1404,7 @@ static void xive_router_end_notify(XiveRouter *xrtr, uint8_t end_blk,
 /* TODO: Auto EOI. */
 }
 
-static void xive_router_notify(XiveNotifier *xn, uint32_t lisn)
+void xive_router_notify(XiveNotifier *xn, uint32_t lisn)
 {
 XiveRouter *xrtr = XIVE_ROUTER(xn);
 uint8_t eas_blk = XIVE_SRCNO_BLOCK(lisn);
diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
index 7dd80e0f46..c4f27742ca 100644
--- a/include/hw/ppc/xive.h
+++ b/include/hw/ppc/xive.h
@@ -364,6 +364,7 @@ int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
 int xive_router_write_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
   XiveNVT *nvt, uint8_t word_number);
 XiveTCTX *xive_router_get_tctx(XiveRouter *xrtr, CPUState *cs);
+void xive_router_notify(XiveNotifier *xn, uint32_t lisn);
 
 /*
  * XIVE END ESBs
-- 
2.20.1

[Qemu-devel] [PULL 38/60] target/ppc: improve avr64_offset() and use it to simplify get_avr64()/set_avr64()

2019-03-10 Thread David Gibson

From: Mark Cave-Ayland 

By using the VsrD macro in avr64_offset() the same offset calculation can be
used regardless of the host endianness. This allows get_avr64() and
set_avr64() to be simplified accordingly.
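
The sketch below shows one way an endian-aware doubleword index can fold
the two #ifdef branches into a single expression. It assumes that
architectural doubleword 0 is the most significant half of the register and
that the register is stored in host order, so the array index flips on a
little-endian host. The types and the runtime endianness probe are
stand-ins; the real VsrD macro is resolved at compile time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a 128-bit vector register. */
typedef union {
    uint8_t  u8[16];
    uint64_t u64[2];
} vec128_t;

/* Stand-in CPU state: AVR n is assumed to live in vsr[32 + n]. */
struct cpu_state {
    vec128_t vsr[64];
};

/* Runtime probe used only to keep the sketch portable. */
static bool host_is_big_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 0;
}

/*
 * Offset of the high or low 64 bits of AVR reg.
 * Architectural doubleword 0 is the most significant one; on a
 * little-endian host it is stored at array index 1.
 */
static size_t avr64_offset(int reg, bool high)
{
    int dword = high ? 0 : 1;
    int index = host_is_big_endian() ? dword : 1 - dword;

    assert(reg >= 0 && reg < 32);
    return offsetof(struct cpu_state, vsr)
           + (size_t)(32 + reg) * sizeof(vec128_t)
           + (size_t)index * sizeof(uint64_t);
}
```

With the index selection hidden in one helper, a load or store of either
half only needs to pass the result of avr64_offset() to the generic
64-bit access routine.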

Signed-off-by: Mark Cave-Ayland 
Message-Id: <20190307180520.13868-6-mark.cave-ayl...@ilande.co.uk>
Reviewed-by: Richard Henderson 
Signed-off-by: David Gibson 
---
 target/ppc/cpu.h|  5 +
 target/ppc/translate.c  | 16 ++--
 target/ppc/translate/vmx-impl.inc.c |  5 -
 3 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index caddbd012c..3050982707 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2608,6 +2608,11 @@ static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
 return (uint64_t *)((uintptr_t)env + vsrl_offset(i));
 }
 
+static inline long avr64_offset(int i, bool high)
+{
+return offsetof(CPUPPCState, vsr[32 + i].VsrD(high ? 0 : 1));
+}
+
 static inline int avr_full_offset(int i)
 {
 return vsr_full_offset(i + 32);
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 668d4cf75a..98b37cebc2 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6687,24 +6687,12 @@ static inline void set_fpr(int regno, TCGv_i64 src)
 
 static inline void get_avr64(TCGv_i64 dst, int regno, bool high)
 {
-#ifdef HOST_WORDS_BIGENDIAN
-tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState,
-  vsr[32 + regno].u64[(high ? 0 : 
1)]));
-#else
-tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState,
-  vsr[32 + regno].u64[(high ? 1 : 
0)]));
-#endif
+tcg_gen_ld_i64(dst, cpu_env, avr64_offset(regno, high));
 }
 
 static inline void set_avr64(int regno, TCGv_i64 src, bool high)
 {
-#ifdef HOST_WORDS_BIGENDIAN
-tcg_gen_st_i64(src, cpu_env, offsetof(CPUPPCState,
-  vsr[32 + regno].u64[(high ? 0 : 
1)]));
-#else
-tcg_gen_st_i64(src, cpu_env, offsetof(CPUPPCState,
-  vsr[32 + regno].u64[(high ? 1 : 
0)]));
-#endif
+tcg_gen_st_i64(src, cpu_env, avr64_offset(regno, high));
 }
 
 #include "translate/fp-impl.inc.c"
diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index 4e5d0bc0e0..eb10c533ca 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -14,11 +14,6 @@ static inline TCGv_ptr gen_avr_ptr(int reg)
 return r;
 }
 
-static inline long avr64_offset(int reg, bool high)
-{
-return offsetof(CPUPPCState, vsr[32 + reg].u64[(high ? 0 : 1)]);
-}
-
 #define GEN_VR_LDX(name, opc2, opc3)  \
 static void glue(gen_, name)(DisasContext *ctx) \
 { \
-- 
2.20.1

[Qemu-devel] [PULL 31/60] ppc/pnv: psi: add a PSIHB_REG macro

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

This is a simple helper to translate MMIO addresses to XSCOM register
addresses.
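
A quick worked example of the arithmetic: each register is 8 bytes wide in
the MMIO window, so the byte address is shifted right by 3 and then
rebased. The base value 0x0a used for PSIHB_XSCOM_BAR below is an
assumption made for the sake of the example, and psihb_reg() is a
hypothetical wrapper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed register number of the first PSI host bridge register. */
#define PSIHB_XSCOM_BAR 0x0a

/* Convert a byte offset in the MMIO window to a register number. */
#define PSIHB_REG(addr) (((addr) >> 3) + PSIHB_XSCOM_BAR)

/* Typed wrapper so callers get a plain 32-bit register number. */
static inline uint32_t psihb_reg(uint64_t addr)
{
    return (uint32_t)PSIHB_REG(addr);
}
```

Under that assumption, byte offset 0x00 maps to register 0x0a, 0x08 to
0x0b, and 0x18 to 0x0d.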

Signed-off-by: Cédric Le Goater 
Message-Id: <20190306085032.15744-13-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv_psi.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
index c872be0b9c..a2f8d0dece 100644
--- a/hw/ppc/pnv_psi.c
+++ b/hw/ppc/pnv_psi.c
@@ -114,6 +114,8 @@
 #define PSIHB_BAR_MASK  0x0003fff0ull
 #define PSIHB_FSPBAR_MASK   0x0003ull
 
+#define PSIHB_REG(addr) (((addr) >> 3) + PSIHB_XSCOM_BAR)
+
 static void pnv_psi_set_bar(PnvPsi *psi, uint64_t bar)
 {
 MemoryRegion *sysmem = get_system_memory();
@@ -392,13 +394,13 @@ static void pnv_psi_reg_write(PnvPsi *psi, uint32_t offset, uint64_t val,
  */
 static uint64_t pnv_psi_mmio_read(void *opaque, hwaddr addr, unsigned size)
 {
-return pnv_psi_reg_read(opaque, (addr >> 3) + PSIHB_XSCOM_BAR, true);
+return pnv_psi_reg_read(opaque, PSIHB_REG(addr), true);
 }
 
 static void pnv_psi_mmio_write(void *opaque, hwaddr addr,
   uint64_t val, unsigned size)
 {
-pnv_psi_reg_write(opaque, (addr >> 3) + PSIHB_XSCOM_BAR, val, true);
+pnv_psi_reg_write(opaque, PSIHB_REG(addr), val, true);
 }
 
 static const MemoryRegionOps psi_mmio_ops = {
-- 
2.20.1

[Qemu-devel] [PULL 32/60] ppc/pnv: psi: add a reset handler

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

Reset all regs but keep the MMIO BAR enabled as it is at realize time.
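
The reset behaviour described above can be sketched outside of QEMU as
follows. This is only a simplified model: the DemoPsi type, the register
count, the BAR index and the enable bit are hypothetical stand-ins for
PnvPsi, PSIHB_XSCOM_BAR and PSIHB_BAR_EN.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NR_REGS   0x20     /* hypothetical register file size */
#define DEMO_BAR_INDEX 0x0a     /* hypothetical index of the BAR register */
#define DEMO_BAR_EN    0x1ull   /* hypothetical "BAR enabled" bit */

typedef struct DemoPsi {
    uint64_t regs[DEMO_NR_REGS];
    uint64_t bar;               /* BAR value chosen at realize time */
} DemoPsi;

/* Clear every register, then restore the BAR with its enable bit set. */
static inline void demo_psi_reset(DemoPsi *psi)
{
    memset(psi->regs, 0, sizeof(psi->regs));
    psi->regs[DEMO_BAR_INDEX] = psi->bar | DEMO_BAR_EN;
}
```

The point of the second statement is that a machine reset must not unmap
the MMIO window that realize set up.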

Signed-off-by: Cédric Le Goater 
Message-Id: <20190306085032.15744-14-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv_psi.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
index a2f8d0dece..e61861bfd3 100644
--- a/hw/ppc/pnv_psi.c
+++ b/hw/ppc/pnv_psi.c
@@ -442,6 +442,15 @@ static const MemoryRegionOps pnv_psi_xscom_ops = {
 }
 };
 
+static void pnv_psi_reset(void *dev)
+{
+PnvPsi *psi = PNV_PSI(dev);
+
+memset(psi->regs, 0x0, sizeof(psi->regs));
+
+psi->regs[PSIHB_XSCOM_BAR] = psi->bar | PSIHB_BAR_EN;
+}
+
 static void pnv_psi_init(Object *obj)
 {
 PnvPsi *psi = PNV_PSI(obj);
@@ -511,6 +520,8 @@ static void pnv_psi_realize(DeviceState *dev, Error **errp)
 psi->regs[xivr] = PSIHB_XIVR_PRIO_MSK |
 ((uint64_t) i << PSIHB_XIVR_SRC_SH);
 }
+
+qemu_register_reset(pnv_psi_reset, dev);
 }
 
 static int pnv_psi_dt_xscom(PnvXScomInterface *dev, void *fdt, int xscom_offset)
-- 
2.20.1

[Qemu-devel] [PULL 42/60] mac_newworld: use node name instead of alias name for hd device in FWPathProvider

2019-03-10 Thread David Gibson

From: Mark Cave-Ayland 

When using -drive to configure the hd drive for the New World machine, the node
name "disk" should be used instead of the "hd" alias.

Signed-off-by: Mark Cave-Ayland 
Message-Id: <20190307212058.4890-3-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: David Gibson 
---
 hw/ppc/mac_newworld.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index 97e8817145..02d8559621 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -547,11 +547,11 @@ static char *core99_fw_dev_path(FWPathProvider *p, BusState *bus,
 return g_strdup("cdrom");
 }
 
-return g_strdup("hd");
+return g_strdup("disk");
 }
 
 if (!strcmp(object_get_typename(OBJECT(dev)), "ide-hd")) {
-return g_strdup("hd");
+return g_strdup("disk");
 }
 
 if (!strcmp(object_get_typename(OBJECT(dev)), "ide-cd")) {
-- 
2.20.1

[Qemu-devel] [PULL 30/60] ppc/pnv: fix logging primitives using Ox

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

Signed-off-by: Cédric Le Goater 
Message-Id: <20190306085032.15744-12-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv_lpc.c | 10 +-
 hw/ppc/pnv_psi.c |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/pnv_lpc.c b/hw/ppc/pnv_lpc.c
index 172a915cfc..9b18ce55e3 100644
--- a/hw/ppc/pnv_lpc.c
+++ b/hw/ppc/pnv_lpc.c
@@ -294,7 +294,7 @@ static uint64_t lpc_hc_read(void *opaque, hwaddr addr, unsigned size)
 val =  lpc->lpc_hc_error_addr;
 break;
 default:
-qemu_log_mask(LOG_UNIMP, "LPC HC Unimplemented register: Ox%"
+qemu_log_mask(LOG_UNIMP, "LPC HC Unimplemented register: 0x%"
   HWADDR_PRIx "\n", addr);
 }
 return val;
@@ -332,7 +332,7 @@ static void lpc_hc_write(void *opaque, hwaddr addr, uint64_t val,
 case LPC_HC_ERROR_ADDRESS:
 break;
 default:
-qemu_log_mask(LOG_UNIMP, "LPC HC Unimplemented register: Ox%"
+qemu_log_mask(LOG_UNIMP, "LPC HC Unimplemented register: 0x%"
   HWADDR_PRIx "\n", addr);
 }
 }
@@ -370,7 +370,7 @@ static uint64_t opb_master_read(void *opaque, hwaddr addr, unsigned size)
 val = lpc->opb_irq_input;
 break;
 default:
-qemu_log_mask(LOG_UNIMP, "OPB MASTER Unimplemented register: Ox%"
+qemu_log_mask(LOG_UNIMP, "OPBM: read on unimplemented register: 0x%"
   HWADDR_PRIx "\n", addr);
 }
 
@@ -399,8 +399,8 @@ static void opb_master_write(void *opaque, hwaddr addr,
 /* Read only */
 break;
 default:
-qemu_log_mask(LOG_UNIMP, "OPB MASTER Unimplemented register: Ox%"
-  HWADDR_PRIx "\n", addr);
+qemu_log_mask(LOG_UNIMP, "OPBM: write on unimplemented register: 0x%"
+  HWADDR_PRIx " val=0x%08"PRIx64"\n", addr, val);
 }
 }
 
diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
index 44bc0cbf58..c872be0b9c 100644
--- a/hw/ppc/pnv_psi.c
+++ b/hw/ppc/pnv_psi.c
@@ -323,7 +323,7 @@ static uint64_t pnv_psi_reg_read(PnvPsi *psi, uint32_t offset, bool mmio)
 val = psi->regs[offset];
 break;
 default:
-qemu_log_mask(LOG_UNIMP, "PSI: read at Ox%" PRIx32 "\n", offset);
+qemu_log_mask(LOG_UNIMP, "PSI: read at 0x%" PRIx32 "\n", offset);
 }
 return val;
 }
@@ -382,7 +382,7 @@ static void pnv_psi_reg_write(PnvPsi *psi, uint32_t offset, uint64_t val,
 pnv_psi_set_irsn(psi, val);
 break;
 default:
-qemu_log_mask(LOG_UNIMP, "PSI: write at Ox%" PRIx32 "\n", offset);
+qemu_log_mask(LOG_UNIMP, "PSI: write at 0x%" PRIx32 "\n", offset);
 }
 }
 
-- 
2.20.1

[Qemu-devel] [PULL 33/60] spapr_iommu: Do not replay mappings from just created DMA window

2019-03-10 Thread David Gibson

From: Alexey Kardashevskiy 

On sPAPR vfio_listener_region_add() is called in 2 situations:
1. a new listener is registered from vfio_connect_container();
2. a new IOMMU Memory Region is added from rtas_ibm_create_pe_dma_window().

In both cases vfio_listener_region_add() calls
memory_region_iommu_replay() to notify newly registered IOMMU notifiers
about existing mappings which is totally desirable for case 1.

However, for case 2 it is nothing but a no-op, as the window has just been
created and has no valid mappings, so replaying them does not do anything.
It is barely noticeable with usual guests but if the window happens to be
really big, such no-op replay might take minutes and trigger RCU stall
warnings in the guest.

For example, an upcoming GPU RAM memory region mapped at 64TiB (right
after SPAPR_PCI_LIMIT) causes a 64bit DMA window to be at least 128TiB
which is (128<<40)/0x10000=2,147,483,648 TCEs to replay.
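
The TCE count quoted above can be checked with a few lines of C. This is a
worked example, not code from the patch: the 64KiB (0x10000) IOMMU page size
is an assumption, chosen because it is the page size consistent with the
2,147,483,648 figure, and tce_count() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Number of TCEs needed to cover a DMA window with a given page size. */
static inline uint64_t tce_count(uint64_t window_bytes, uint64_t page_bytes)
{
    assert(page_bytes != 0);
    return window_bytes / page_bytes;
}
```

A 128TiB window is 2^47 bytes and a 64KiB page is 2^16 bytes, so
tce_count(128ull << 40, 0x10000) is 2^31 entries, each of which the generic
replay would have to translate one by one.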

This mitigates the problem by adding a "skipping_replay" flag to
sPAPRTCETable and defining sPAPR's own IOMMU MR replay() hook which does
exactly the same thing as the generic one except it returns early if
@skipping_replay==true.
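
The idea of the early return can be sketched with a toy table. This is not
the QEMU implementation: DemoTable, demo_notify_fn and demo_replay() are
hypothetical names, and a zero entry stands in for a TCE without a valid
mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct DemoTable {
    const uint64_t *entries;    /* 0 means "no mapping" in this sketch */
    size_t nb_entries;
    bool skipping_replay;       /* set while a fresh window is enabled */
} DemoTable;

typedef void (*demo_notify_fn)(size_t index, uint64_t entry, void *opaque);

/* Walk the table and report valid entries; returns how many were found. */
static inline size_t demo_replay(const DemoTable *t, demo_notify_fn notify,
                                 void *opaque)
{
    size_t i, notified = 0;

    if (t->skipping_replay) {
        return 0;               /* known-empty table: skip the whole walk */
    }
    for (i = 0; i < t->nb_entries; i++) {
        if (t->entries[i] != 0) {
            if (notify) {
                notify(i, t->entries[i], opaque);
            }
            notified++;
        }
    }
    return notified;
}
```

The flag does not change what a replay reports for a populated table; it
only avoids the linear walk when the caller already knows the table is
empty.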

Another way of fixing this would be delaying replay till the very first
H_PUT_TCE but this does not work if the in-kernel H_PUT_TCE handler is
enabled (a likely case).

When "ibm,create-pe-dma-window" is complete, the guest will map only
required regions of the huge DMA window.

Signed-off-by: Alexey Kardashevskiy 
Message-Id: <20190307050518.64968-2-...@ozlabs.ru>
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_iommu.c| 31 +++
 hw/ppc/spapr_rtas_ddw.c | 10 ++
 include/hw/ppc/spapr.h  |  1 +
 3 files changed, 42 insertions(+)

diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index 37e98f9321..8f231799b2 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -141,6 +141,36 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(IOMMUMemoryRegion *iommu,
 return ret;
 }
 
+static void spapr_tce_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
+{
+MemoryRegion *mr = MEMORY_REGION(iommu_mr);
+IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
+hwaddr addr, granularity;
+IOMMUTLBEntry iotlb;
+sPAPRTCETable *tcet = container_of(iommu_mr, sPAPRTCETable, iommu);
+
+if (tcet->skipping_replay) {
+return;
+}
+
+granularity = memory_region_iommu_get_min_page_size(iommu_mr);
+
+for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
+iotlb = imrc->translate(iommu_mr, addr, IOMMU_NONE, n->iommu_idx);
+if (iotlb.perm != IOMMU_NONE) {
+n->notify(n, &iotlb);
+}
+
+/*
+ * if (2^64 - MR size) < granularity, it's possible to get an
+ * infinite loop here.  This should catch such a wraparound.
+ */
+if ((addr + granularity) < addr) {
+break;
+}
+}
+}
+
 static int spapr_tce_table_pre_save(void *opaque)
 {
 sPAPRTCETable *tcet = SPAPR_TCE_TABLE(opaque);
@@ -659,6 +689,7 @@ static void spapr_iommu_memory_region_class_init(ObjectClass *klass, void *data)
 IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_CLASS(klass);
 
 imrc->translate = spapr_tce_translate_iommu;
+imrc->replay = spapr_tce_replay;
 imrc->get_min_page_size = spapr_tce_get_min_page_size;
 imrc->notify_flag_changed = spapr_tce_notify_flag_changed;
 imrc->get_attr = spapr_tce_get_attr;
diff --git a/hw/ppc/spapr_rtas_ddw.c b/hw/ppc/spapr_rtas_ddw.c
index cb8a410359..cc9d1f5c1c 100644
--- a/hw/ppc/spapr_rtas_ddw.c
+++ b/hw/ppc/spapr_rtas_ddw.c
@@ -171,8 +171,18 @@ static void rtas_ibm_create_pe_dma_window(PowerPCCPU *cpu,
 }
 
 win_addr = (windows == 0) ? sphb->dma_win_addr : sphb->dma64_win_addr;
+/*
+ * We have just created a window, we know for the fact that it is empty,
+ * use a hack to avoid iterating over the table as it is quite possible
+ * to have billions of TCEs, all empty.
+ * Note that we cannot delay this to the first H_PUT_TCE as this hcall is
+ * mostly likely to be handled in KVM so QEMU just does not know if it
+ * happened.
+ */
+tcet->skipping_replay = true;
 spapr_tce_table_enable(tcet, page_shift, win_addr,
1ULL << (window_shift - page_shift));
+tcet->skipping_replay = false;
 if (!tcet->nb_table) {
 goto hw_error_exit;
 }
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 1311ebe28e..f117a7ce6e 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -723,6 +723,7 @@ struct sPAPRTCETable {
 uint64_t *mig_table;
 bool bypass;
 bool need_vfio;
+bool skipping_replay;
 int fd;
 MemoryRegion root;
 IOMMUMemoryRegion iommu;
-- 
2.20.1

[Qemu-devel] [PULL 40/60] target/ppc: introduce vsr64_offset() to simplify get_cpu_vsr{l, h}() and set_cpu_vsr{l, h}()

2019-03-10 Thread David Gibson

From: Mark Cave-Ayland 

Now that all VSX registers are stored in host endian order, there is no need
to go via different accessors depending upon the register number. Instead we
introduce vsr64_offset() and use it directly from within get_cpu_vsr{l,h}() and
set_cpu_vsr{l,h}().

This also allows us to rewrite avr64_offset() and fpr_offset() in terms of the
new vsr64_offset() function to more clearly express the relationship between the
VSX, FPR and VMX registers, and also remove vsrl_offset() which is no longer
required.
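
The relationship between the three offset helpers can be sketched in a
standalone form. This is only a model under stated assumptions: DemoVsr and
DemoCPUState are hypothetical stand-ins for ppc_vsr_t and CPUPPCState,
DEMO_VSRD mirrors the idea of the VsrD() accessor, and host endianness is
detected with the __BYTE_ORDER__ macro predefined by GCC and Clang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef union DemoVsr {
    uint64_t u64[2];
} DemoVsr;

typedef struct DemoCPUState {
    uint32_t other;             /* placeholder for unrelated CPU state */
    DemoVsr vsr[64];            /* 0..31: FPR/VSX low half, 32..63: VMX */
} DemoCPUState;

/* Host array index of architectural doubleword i (0 = most significant). */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define DEMO_VSRD(i) (i)
#else
#define DEMO_VSRD(i) (1 - (i))
#endif

/* Single source of truth: offset of one 64-bit half of VSX register i. */
static inline size_t demo_vsr64_offset(int i, bool high)
{
    return offsetof(DemoCPUState, vsr) + (size_t)i * sizeof(DemoVsr)
           + (size_t)DEMO_VSRD(high ? 0 : 1) * sizeof(uint64_t);
}

/* FPR i is the high doubleword of VSX register i. */
static inline size_t demo_fpr_offset(int i)
{
    return demo_vsr64_offset(i, true);
}

/* VMX register i is VSX register i + 32. */
static inline size_t demo_avr64_offset(int i, bool high)
{
    return demo_vsr64_offset(i + 32, high);
}
```

Expressing the FPR and VMX helpers through one function keeps the endian
handling in a single place, which is the point of the refactoring.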

Signed-off-by: Mark Cave-Ayland 
Message-Id: <20190307180520.13868-8-mark.cave-ayl...@ilande.co.uk>
Reviewed-by: Richard Henderson 
Signed-off-by: David Gibson 
---
 target/ppc/cpu.h| 20 -
 target/ppc/translate/vsx-impl.inc.c | 34 -
 2 files changed, 14 insertions(+), 40 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 8905edbfd0..c1fce44303 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2583,34 +2583,34 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx)
 #define VsrSD(i) s64[1 - (i)]
 #endif
 
-static inline int fpr_offset(int i)
+static inline int vsr64_offset(int i, bool high)
 {
-return offsetof(CPUPPCState, vsr[i].VsrD(0));
+return offsetof(CPUPPCState, vsr[i].VsrD(high ? 0 : 1));
 }
 
-static inline uint64_t *cpu_fpr_ptr(CPUPPCState *env, int i)
+static inline int vsr_full_offset(int i)
 {
-return (uint64_t *)((uintptr_t)env + fpr_offset(i));
+return offsetof(CPUPPCState, vsr[i].u64[0]);
 }
 
-static inline int vsrl_offset(int i)
+static inline int fpr_offset(int i)
 {
-return offsetof(CPUPPCState, vsr[i].VsrD(1));
+return vsr64_offset(i, true);
 }
 
-static inline int vsr_full_offset(int i)
+static inline uint64_t *cpu_fpr_ptr(CPUPPCState *env, int i)
 {
-return offsetof(CPUPPCState, vsr[i].u64[0]);
+return (uint64_t *)((uintptr_t)env + fpr_offset(i));
 }
 
 static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
 {
-return (uint64_t *)((uintptr_t)env + vsrl_offset(i));
+return (uint64_t *)((uintptr_t)env + vsr64_offset(i, false));
 }
 
 static inline long avr64_offset(int i, bool high)
 {
-return offsetof(CPUPPCState, vsr[32 + i].VsrD(high ? 0 : 1));
+return vsr64_offset(i + 32, high);
 }
 
 static inline int avr_full_offset(int i)
diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 7d02a235e7..95a269fff0 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1,49 +1,23 @@
 /***   VSX extension   ***/
 
-static inline void get_vsrl(TCGv_i64 dst, int n)
-{
-tcg_gen_ld_i64(dst, cpu_env, vsrl_offset(n));
-}
-
-static inline void set_vsrl(int n, TCGv_i64 src)
-{
-tcg_gen_st_i64(src, cpu_env, vsrl_offset(n));
-}
-
 static inline void get_cpu_vsrh(TCGv_i64 dst, int n)
 {
-if (n < 32) {
-get_fpr(dst, n);
-} else {
-get_avr64(dst, n - 32, true);
-}
+tcg_gen_ld_i64(dst, cpu_env, vsr64_offset(n, true));
 }
 
 static inline void get_cpu_vsrl(TCGv_i64 dst, int n)
 {
-if (n < 32) {
-get_vsrl(dst, n);
-} else {
-get_avr64(dst, n - 32, false);
-}
+tcg_gen_ld_i64(dst, cpu_env, vsr64_offset(n, false));
 }
 
 static inline void set_cpu_vsrh(int n, TCGv_i64 src)
 {
-if (n < 32) {
-set_fpr(n, src);
-} else {
-set_avr64(n - 32, src, true);
-}
+tcg_gen_st_i64(src, cpu_env, vsr64_offset(n, true));
 }
 
 static inline void set_cpu_vsrl(int n, TCGv_i64 src)
 {
-if (n < 32) {
-set_vsrl(n, src);
-} else {
-set_avr64(n - 32, src, false);
-}
+tcg_gen_st_i64(src, cpu_env, vsr64_offset(n, false));
 }
 
 #define VSX_LOAD_SCALAR(name, operation)  \
-- 
2.20.1

[Qemu-devel] [PULL 39/60] target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order

2019-03-10 Thread David Gibson

From: Mark Cave-Ayland 

When VSX support was initially added, the fpr registers were added at
offset 0 of the VSR register and the vsrl registers were added at offset
1. This is in contrast to the VMX registers (the last 32 VSX registers) which
are stored in host-endian order.

Switch the fpr/vsrl registers so that the lower 32 VSX registers are now also
stored in host endian order to match the VMX registers. This ensures that TCG
vector operations involving mixed VMX and VSX registers will function
correctly.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Richard Henderson 
Message-Id: <20190307180520.13868-7-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: David Gibson 
---
 target/ppc/cpu.h  | 4 ++--
 target/ppc/internal.h | 8 
 target/ppc/machine.c  | 8 
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 3050982707..8905edbfd0 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2585,7 +2585,7 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx)
 
 static inline int fpr_offset(int i)
 {
-return offsetof(CPUPPCState, vsr[i].u64[0]);
+return offsetof(CPUPPCState, vsr[i].VsrD(0));
 }
 
 static inline uint64_t *cpu_fpr_ptr(CPUPPCState *env, int i)
@@ -2595,7 +2595,7 @@ static inline uint64_t *cpu_fpr_ptr(CPUPPCState *env, int i)
 
 static inline int vsrl_offset(int i)
 {
-return offsetof(CPUPPCState, vsr[i].u64[1]);
+return offsetof(CPUPPCState, vsr[i].VsrD(1));
 }
 
 static inline int vsr_full_offset(int i)
diff --git a/target/ppc/internal.h b/target/ppc/internal.h
index 3ebbdf4da4..fb6f64ed1e 100644
--- a/target/ppc/internal.h
+++ b/target/ppc/internal.h
@@ -206,14 +206,14 @@ EXTRACT_HELPER_SPLIT_3(DCMX_XV, 5, 16, 0, 1, 2, 5, 1, 6, 6);
 
 static inline void getVSR(int n, ppc_vsr_t *vsr, CPUPPCState *env)
 {
-vsr->VsrD(0) = env->vsr[n].u64[0];
-vsr->VsrD(1) = env->vsr[n].u64[1];
+vsr->VsrD(0) = env->vsr[n].VsrD(0);
+vsr->VsrD(1) = env->vsr[n].VsrD(1);
 }
 
 static inline void putVSR(int n, ppc_vsr_t *vsr, CPUPPCState *env)
 {
-env->vsr[n].u64[0] = vsr->VsrD(0);
-env->vsr[n].u64[1] = vsr->VsrD(1);
+env->vsr[n].VsrD(0) = vsr->VsrD(0);
+env->vsr[n].VsrD(1) = vsr->VsrD(1);
 }
 
 void helper_compute_fprf_float16(CPUPPCState *env, float16 arg);
diff --git a/target/ppc/machine.c b/target/ppc/machine.c
index 756b6d2971..a92d0ad3a3 100644
--- a/target/ppc/machine.c
+++ b/target/ppc/machine.c
@@ -150,7 +150,7 @@ static int get_fpr(QEMUFile *f, void *pv, size_t size,
 {
 ppc_vsr_t *v = pv;
 
-v->u64[0] = qemu_get_be64(f);
+v->VsrD(0) = qemu_get_be64(f);
 
 return 0;
 }
@@ -160,7 +160,7 @@ static int put_fpr(QEMUFile *f, void *pv, size_t size,
 {
 ppc_vsr_t *v = pv;
 
-qemu_put_be64(f, v->u64[0]);
+qemu_put_be64(f, v->VsrD(0));
 return 0;
 }
 
@@ -181,7 +181,7 @@ static int get_vsr(QEMUFile *f, void *pv, size_t size,
 {
 ppc_vsr_t *v = pv;
 
-v->u64[1] = qemu_get_be64(f);
+v->VsrD(1) = qemu_get_be64(f);
 
 return 0;
 }
@@ -191,7 +191,7 @@ static int put_vsr(QEMUFile *f, void *pv, size_t size,
 {
 ppc_vsr_t *v = pv;
 
-qemu_put_be64(f, v->u64[1]);
+qemu_put_be64(f, v->VsrD(1));
 return 0;
 }
 
-- 
2.20.1

[Qemu-devel] [PULL 45/60] ppc/pnv: lpc: fix OPB address ranges

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

The PowerNV LPC Controller exposes different sets of registers for
each of the functional units it encompasses, among which the OPB
(On-Chip Peripheral Bus) Master and Arbiter and the LPC HOST
Controller.

The mapping addresses of each register range are correct but the sizes
are too large. Fix the sizes and define the OPB Arbiter range to fill
the gap between the OPB Master registers and the LPC HOST Controller
registers.

Signed-off-by: Cédric Le Goater 
Message-Id: <20190307223548.20516-4-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv_lpc.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/pnv_lpc.c b/hw/ppc/pnv_lpc.c
index 9b18ce55e3..547be609ca 100644
--- a/hw/ppc/pnv_lpc.c
+++ b/hw/ppc/pnv_lpc.c
@@ -89,10 +89,11 @@ enum {
 #define LPC_FW_OPB_SIZE 0x1000
 
 #define LPC_OPB_REGS_OPB_ADDR   0xc001
-#define LPC_OPB_REGS_OPB_SIZE   0x2000
+#define LPC_OPB_REGS_OPB_SIZE   0x0060
+#define LPC_OPB_REGS_OPBA_ADDR  0xc0011000
+#define LPC_OPB_REGS_OPBA_SIZE  0x0008
 #define LPC_HC_REGS_OPB_ADDR0xc0012000
-#define LPC_HC_REGS_OPB_SIZE0x1000
-
+#define LPC_HC_REGS_OPB_SIZE0x0100
 
 static int pnv_lpc_dt_xscom(PnvXScomInterface *dev, void *fdt, int xscom_offset)
 {
-- 
2.20.1

[Qemu-devel] [PULL 27/60] ppc/pnv: introduce a new dt_populate() operation to the chip model

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

The POWER9 and POWER8 processors have a different set of devices and a
different device tree layout.
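
The per-chip operation is a plain function pointer in the chip class. A
minimal sketch of that dispatch, independent of QOM, could look like the
code below. All Demo* names are hypothetical, and the node counts are a
simplification invented for the example: the POWER8 variant is assumed to
emit one core node plus one interrupt presenter node per core, the POWER9
variant only core nodes.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoChip DemoChip;

typedef struct DemoChipClass {
    const char *desc;
    int (*dt_populate)(DemoChip *chip);   /* returns nodes emitted */
} DemoChipClass;

struct DemoChip {
    const DemoChipClass *klass;
    int nr_cores;
};

static inline int demo_power8_dt_populate(DemoChip *chip)
{
    return 2 * chip->nr_cores;  /* core node + presenter node per core */
}

static inline int demo_power9_dt_populate(DemoChip *chip)
{
    return chip->nr_cores;      /* core nodes only */
}

static const DemoChipClass demo_power8_class = {
    "demo POWER8", demo_power8_dt_populate
};
static const DemoChipClass demo_power9_class = {
    "demo POWER9", demo_power9_dt_populate
};

/* The machine code no longer needs to know which chip it is handling. */
static inline int demo_chip_dt_populate(DemoChip *chip)
{
    return chip->klass->dt_populate(chip);
}
```

The caller loops over the chips and invokes the class hook, which matches
the shape of the change to pnv_dt_create() in the diff.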

Signed-off-by: Cédric Le Goater 
Message-Id: <20190306085032.15744-8-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 27 +--
 include/hw/ppc/pnv.h |  1 +
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index a7ec76dbd6..087541a91a 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -267,7 +267,7 @@ static void pnv_dt_icp(PnvChip *chip, void *fdt, uint32_t pir,
 g_free(reg);
 }
 
-static void pnv_dt_chip(PnvChip *chip, void *fdt)
+static void pnv_chip_power8_dt_populate(PnvChip *chip, void *fdt)
 {
 const char *typename = pnv_chip_core_typename(chip);
 size_t typesize = object_type_get_instance_size(typename);
@@ -289,6 +289,25 @@ static void pnv_dt_chip(PnvChip *chip, void *fdt)
 }
 }
 
+static void pnv_chip_power9_dt_populate(PnvChip *chip, void *fdt)
+{
+const char *typename = pnv_chip_core_typename(chip);
+size_t typesize = object_type_get_instance_size(typename);
+int i;
+
+pnv_dt_xscom(chip, fdt, 0);
+
+for (i = 0; i < chip->nr_cores; i++) {
+PnvCore *pnv_core = PNV_CORE(chip->cores + i * typesize);
+
+pnv_dt_core(chip, pnv_core, fdt);
+}
+
+if (chip->ram_size) {
+pnv_dt_memory(fdt, chip->chip_id, chip->ram_start, chip->ram_size);
+}
+}
+
 static void pnv_dt_rtc(ISADevice *d, void *fdt, int lpc_off)
 {
 uint32_t io_base = d->ioport_id;
@@ -474,7 +493,7 @@ static void *pnv_dt_create(MachineState *machine)
 
 /* Populate device tree for each chip */
 for (i = 0; i < pnv->num_chips; i++) {
-pnv_dt_chip(pnv->chips[i], fdt);
+PNV_CHIP_GET_CLASS(pnv->chips[i])->dt_populate(pnv->chips[i], fdt);
 }
 
 /* Populate ISA devices on chip 0 */
@@ -858,6 +877,7 @@ static void pnv_chip_power8e_class_init(ObjectClass *klass, void *data)
 k->core_pir = pnv_chip_core_pir_p8;
 k->intc_create = pnv_chip_power8_intc_create;
 k->isa_create = pnv_chip_power8_isa_create;
+k->dt_populate = pnv_chip_power8_dt_populate;
 k->xscom_base = 0x003fc00ull;
 dc->desc = "PowerNV Chip POWER8E";
 
@@ -876,6 +896,7 @@ static void pnv_chip_power8_class_init(ObjectClass *klass, void *data)
 k->core_pir = pnv_chip_core_pir_p8;
 k->intc_create = pnv_chip_power8_intc_create;
 k->isa_create = pnv_chip_power8_isa_create;
+k->dt_populate = pnv_chip_power8_dt_populate;
 k->xscom_base = 0x003fc00ull;
 dc->desc = "PowerNV Chip POWER8";
 
@@ -894,6 +915,7 @@ static void pnv_chip_power8nvl_class_init(ObjectClass *klass, void *data)
 k->core_pir = pnv_chip_core_pir_p8;
 k->intc_create = pnv_chip_power8_intc_create;
 k->isa_create = pnv_chip_power8nvl_isa_create;
+k->dt_populate = pnv_chip_power8_dt_populate;
 k->xscom_base = 0x003fc00ull;
 dc->desc = "PowerNV Chip POWER8NVL";
 
@@ -954,6 +976,7 @@ static void pnv_chip_power9_class_init(ObjectClass *klass, void *data)
 k->core_pir = pnv_chip_core_pir_p9;
 k->intc_create = pnv_chip_power9_intc_create;
 k->isa_create = pnv_chip_power9_isa_create;
+k->dt_populate = pnv_chip_power9_dt_populate;
 k->xscom_base = 0x00603fcull;
 dc->desc = "PowerNV Chip POWER9";
 
diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index ebbb3d0e9a..fa9ec50fd5 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -102,6 +102,7 @@ typedef struct PnvChipClass {
 uint32_t (*core_pir)(PnvChip *chip, uint32_t core_id);
 void (*intc_create)(PnvChip *chip, PowerPCCPU *cpu, Error **errp);
 ISABus *(*isa_create)(PnvChip *chip, Error **errp);
+void (*dt_populate)(PnvChip *chip, void *fdt);
 } PnvChipClass;
 
 #define PNV_CHIP_TYPE_SUFFIX "-" TYPE_PNV_CHIP
-- 
2.20.1

[Qemu-devel] [PULL 34/60] target/ppc: introduce single fpr_offset() function

2019-03-10 Thread David Gibson

From: Mark Cave-Ayland 

Instead of having multiple copies of the offset calculation logic, move it to a
single fpr_offset() function.

Signed-off-by: Mark Cave-Ayland 
Reviewed-by: Richard Henderson 
Message-Id: <20190307180520.13868-2-mark.cave-ayl...@ilande.co.uk>
Signed-off-by: David Gibson 
---
 target/ppc/cpu.h   | 7 ++-
 target/ppc/translate.c | 4 ++--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 81763d72f9..15e053becd 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2563,9 +2563,14 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx)
 }
 
 /* Accessors for FP, VMX and VSX registers */
+static inline int fpr_offset(int i)
+{
+return offsetof(CPUPPCState, vsr[i].u64[0]);
+}
+
 static inline uint64_t *cpu_fpr_ptr(CPUPPCState *env, int i)
 {
-return &env->vsr[i].u64[0];
+return (uint64_t *)((uintptr_t)env + fpr_offset(i));
 }
 
 static inline uint64_t *cpu_vsrl_ptr(CPUPPCState *env, int i)
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index b156be4d98..668d4cf75a 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6677,12 +6677,12 @@ GEN_TM_PRIV_NOOP(trechkpt);
 
 static inline void get_fpr(TCGv_i64 dst, int regno)
 {
-tcg_gen_ld_i64(dst, cpu_env, offsetof(CPUPPCState, vsr[regno].u64[0]));
+tcg_gen_ld_i64(dst, cpu_env, fpr_offset(regno));
 }
 
 static inline void set_fpr(int regno, TCGv_i64 src)
 {
-tcg_gen_st_i64(src, cpu_env, offsetof(CPUPPCState, vsr[regno].u64[0]));
+tcg_gen_st_i64(src, cpu_env, fpr_offset(regno));
 }
 
 static inline void get_avr64(TCGv_i64 dst, int regno, bool high)
-- 
2.20.1

[Qemu-devel] [PULL 58/60] target/ppc: Optimize xviexpdp() using deposit_i64()

2019-03-10 Thread David Gibson

From: Philippe Mathieu-Daudé 

The t0 tcg_temp register is now unused, remove it.
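
The equivalence between the old and/shift/or sequence and a single deposit
can be checked on plain integers. This is a host-side sketch, not TCG code:
demo_deposit64() is a hypothetical helper written for the example, and the
full 64-bit mask 0x800FFFFFFFFFFFFF is an assumption (the constant appears
shortened in the archived diff), chosen as the mask that clears the 11-bit
field at bit 52.

```c
#include <assert.h>
#include <stdint.h>

/* Replace `length` bits of `value` starting at bit `start` with `field`. */
static inline uint64_t demo_deposit64(uint64_t value, int start, int length,
                                      uint64_t field)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    uint64_t mask = (~0ULL >> (64 - length)) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/* Old sequence: clear the exponent field of a, then OR in 11 bits of b. */
static inline uint64_t demo_insert_exp_old(uint64_t a, uint64_t b)
{
    uint64_t t = a & 0x800FFFFFFFFFFFFFull;
    return t | ((b & 0x7FF) << 52);
}

/* New sequence: one deposit of 11 bits at position 52. */
static inline uint64_t demo_insert_exp_new(uint64_t a, uint64_t b)
{
    return demo_deposit64(a, 52, 11, b);
}
```

Since the deposit does the masking and shifting itself, the scratch
temporary used by the old sequence has no remaining user.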

Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20190309214255.9952-2-f4...@amsat.org>
Signed-off-by: David Gibson 
---
 target/ppc/translate/vsx-impl.inc.c | 14 +++---
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/target/ppc/translate/vsx-impl.inc.c b/target/ppc/translate/vsx-impl.inc.c
index 95a269fff0..30d8aabd92 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1695,7 +1695,6 @@ static void gen_xviexpdp(DisasContext *ctx)
 TCGv_i64 xal;
 TCGv_i64 xbh;
 TCGv_i64 xbl;
-TCGv_i64 t0;
 
 if (unlikely(!ctx->vsx_enabled)) {
 gen_exception(ctx, POWERPC_EXCP_VSXU);
@@ -1711,20 +1710,13 @@ static void gen_xviexpdp(DisasContext *ctx)
 get_cpu_vsrl(xal, xA(ctx->opcode));
 get_cpu_vsrh(xbh, xB(ctx->opcode));
 get_cpu_vsrl(xbl, xB(ctx->opcode));
-t0 = tcg_temp_new_i64();
 
-tcg_gen_andi_i64(xth, xah, 0x800F);
-tcg_gen_andi_i64(t0, xbh, 0x7FF);
-tcg_gen_shli_i64(t0, t0, 52);
-tcg_gen_or_i64(xth, xth, t0);
+tcg_gen_deposit_i64(xth, xah, xbh, 52, 11);
 set_cpu_vsrh(xT(ctx->opcode), xth);
-tcg_gen_andi_i64(xtl, xal, 0x800F);
-tcg_gen_andi_i64(t0, xbl, 0x7FF);
-tcg_gen_shli_i64(t0, t0, 52);
-tcg_gen_or_i64(xtl, xtl, t0);
+
+tcg_gen_deposit_i64(xtl, xal, xbl, 52, 11);
 set_cpu_vsrl(xT(ctx->opcode), xtl);
 
-tcg_temp_free_i64(t0);
 tcg_temp_free_i64(xth);
 tcg_temp_free_i64(xtl);
 tcg_temp_free_i64(xah);
-- 
2.20.1

[Qemu-devel] [PULL 54/60] ppc/pnv: activate XSCOM tests for POWER9

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

We now have enough support to let the XSCOM test run on POWER9.

Signed-off-by: Cédric Le Goater 
Message-Id: <20190307223548.20516-13-...@kaod.org>
Signed-off-by: David Gibson 
---
 tests/pnv-xscom-test.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tests/pnv-xscom-test.c b/tests/pnv-xscom-test.c
index 974f8da5b2..63d464048d 100644
--- a/tests/pnv-xscom-test.c
+++ b/tests/pnv-xscom-test.c
@@ -39,7 +39,6 @@ static const PnvChip pnv_chips[] = {
 .cfam_id= 0x120d30498000ull,
 .first_core = 0x1,
 },
-#if 0 /* POWER9 support is not ready yet */
 {
 .chip_type  = PNV_CHIP_POWER9,
 .cpu_model  = "POWER9",
@@ -47,7 +46,6 @@ static const PnvChip pnv_chips[] = {
 .cfam_id= 0x220d10498000ull,
 .first_core = 0x0,
 },
-#endif
 };
 
 static uint64_t pnv_xscom_addr(const PnvChip *chip, uint32_t pcba)
-- 
2.20.1

[Qemu-devel] [PULL 26/60] ppc/pnv: add a XIVE interrupt controller model for POWER9

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

This is a simple model of the POWER9 XIVE interrupt controller for the
PowerNV machine which only addresses the needs of the skiboot
firmware. The PowerNV model reuses the common XIVE framework developed
for sPAPR as the fundamental aspects are quite the same. The
differences are outlined below.

The controller's initial BAR configuration is performed using the XSCOM
bus; from there, MMIO is used for further configuration.

The MMIO regions exposed are :

 - Interrupt controller registers
 - ESB pages for IPIs and ENDs
 - Presenter MMIO (Not used)
 - Thread Interrupt Management Area MMIO, direct and indirect

The virtualization controller MMIO region containing the IPI ESB pages
and END ESB pages is sub-divided into "sets" which map portions of the
VC region to the different ESB pages. These are modeled with custom
address spaces and the XiveSource and XiveENDSource objects are sized
to the maximum allowed by HW. The memory regions are resized at
run-time using the configuration of EDT set translation table provided
by the firmware.

The XIVE virtualization structure tables (EAT, ENDT, NVTT) are now in
the machine RAM and not in the hypervisor anymore. The firmware
(skiboot) configures these tables using Virtual Structure Descriptors
defining the characteristics of each table : SBE, EAS, END and
NVT. These are later used to access the virtual interrupt entries. The
internal cache of these tables in the interrupt controller is updated
and invalidated using a set of registers.

Support for block grouping still needs to be addressed to complete the
model but is not strictly required. Escalation support will be necessary
for KVM guests.

Signed-off-by: Cédric Le Goater 
Message-Id: <20190306085032.15744-7-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/intc/Makefile.objs  |2 +-
 hw/intc/pnv_xive.c | 1753 
 hw/intc/pnv_xive_regs.h|  248 +
 hw/ppc/pnv.c   |   44 +-
 include/hw/ppc/pnv.h   |   21 +
 include/hw/ppc/pnv_xive.h  |   93 ++
 include/hw/ppc/pnv_xscom.h |3 +
 7 files changed, 2162 insertions(+), 2 deletions(-)
 create mode 100644 hw/intc/pnv_xive.c
 create mode 100644 hw/intc/pnv_xive_regs.h
 create mode 100644 include/hw/ppc/pnv_xive.h

diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index 301a8e972d..df712c3e6c 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -39,7 +39,7 @@ obj-$(CONFIG_XICS_SPAPR) += xics_spapr.o
 obj-$(CONFIG_XICS_KVM) += xics_kvm.o
 obj-$(CONFIG_XIVE) += xive.o
 obj-$(CONFIG_XIVE_SPAPR) += spapr_xive.o
-obj-$(CONFIG_POWERNV) += xics_pnv.o
+obj-$(CONFIG_POWERNV) += xics_pnv.o pnv_xive.o
 obj-$(CONFIG_ALLWINNER_A10_PIC) += allwinner-a10-pic.o
 obj-$(CONFIG_S390_FLIC) += s390_flic.o
 obj-$(CONFIG_S390_FLIC_KVM) += s390_flic_kvm.o
diff --git a/hw/intc/pnv_xive.c b/hw/intc/pnv_xive.c
new file mode 100644
index 00..bb0877cbdf
--- /dev/null
+++ b/hw/intc/pnv_xive.c
@@ -0,0 +1,1753 @@
+/*
+ * QEMU PowerPC XIVE interrupt controller model
+ *
+ * Copyright (c) 2017-2019, IBM Corporation.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qapi/error.h"
+#include "target/ppc/cpu.h"
+#include "sysemu/cpus.h"
+#include "sysemu/dma.h"
+#include "monitor/monitor.h"
+#include "hw/ppc/fdt.h"
+#include "hw/ppc/pnv.h"
+#include "hw/ppc/pnv_core.h"
+#include "hw/ppc/pnv_xscom.h"
+#include "hw/ppc/pnv_xive.h"
+#include "hw/ppc/xive_regs.h"
+#include "hw/ppc/ppc.h"
+
+#include 
+
+#include "pnv_xive_regs.h"
+
+#define XIVE_DEBUG
+
+/*
+ * Virtual structures table (VST)
+ */
+#define SBE_PER_BYTE   4
+
+typedef struct XiveVstInfo {
+const char *name;
+uint32_tsize;
+uint32_tmax_blocks;
+} XiveVstInfo;
+
+static const XiveVstInfo vst_infos[] = {
+[VST_TSEL_IVT]  = { "EAT",  sizeof(XiveEAS), 16 },
+[VST_TSEL_SBE]  = { "SBE",  1,   16 },
+[VST_TSEL_EQDT] = { "ENDT", sizeof(XiveEND), 16 },
+[VST_TSEL_VPDT] = { "VPDT", sizeof(XiveNVT), 32 },
+
+/*
+ *  Interrupt fifo backing store table (not modeled) :
+ *
+ * 0 - IPI,
+ * 1 - HWD,
+ * 2 - First escalate,
+ * 3 - Second escalate,
+ * 4 - Redistribution,
+ * 5 - IPI cascaded queue ?
+ */
+[VST_TSEL_IRQ]  = { "IRQ",  1,   6  },
+};
+
+#define xive_error(xive, fmt, ...)  \
+qemu_log_mask(LOG_GUEST_ERROR, "XIVE[%x] - " fmt "\n",  \
+  (xive)->chip->chip_id, ## __VA_ARGS__);
+
+/*
+ * QEMU version of the GETFIELD/SETFIELD macros
+ *
+ * TODO: It might be better to use the existing extract64() and
+ * deposit64() but this means that all the register definitions will
+ * change and become incompatible with the ones found in skiboot.
+ *
+ * Keep it as it is for now until we find a common ground.
+ */
+static inline uint64_t GETFIELD(uint64_

[Qemu-devel] [PULL 43/60] ppc/pnv: add a PSI bridge class model

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

To ease the introduction of the PSI bridge model for POWER9, abstract
the POWER chip differences in a PnvPsi class model and introduce a
specific Pnv8Psi type for POWER8. The POWER8 interface to the interrupt
controller is still XICS whereas POWER9 uses the new XIVE model.

Signed-off-by: Cédric Le Goater 
Message-Id: <20190307223548.20516-2-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c |  6 ++-
 hw/ppc/pnv_psi.c | 79 
 include/hw/ppc/pnv.h |  2 +-
 include/hw/ppc/pnv_psi.h | 29 ++-
 4 files changed, 87 insertions(+), 29 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 7660eaa22c..5bb2332f16 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -788,7 +788,7 @@ static void pnv_chip_power8_instance_init(Object *obj)
 Pnv8Chip *chip8 = PNV8_CHIP(obj);
 
 object_initialize_child(obj, "psi",  &chip8->psi, sizeof(chip8->psi),
-TYPE_PNV_PSI, &error_abort, NULL);
+TYPE_PNV8_PSI, &error_abort, NULL);
 object_property_add_const_link(OBJECT(&chip8->psi), "xics",
OBJECT(qdev_get_machine()), &error_abort);
 
@@ -840,6 +840,7 @@ static void pnv_chip_power8_realize(DeviceState *dev, Error **errp)
 PnvChipClass *pcc = PNV_CHIP_GET_CLASS(dev);
 PnvChip *chip = PNV_CHIP(dev);
 Pnv8Chip *chip8 = PNV8_CHIP(dev);
+Pnv8Psi *psi8 = &chip8->psi;
 Error *local_err = NULL;
 
 pcc->parent_realize(dev, &local_err);
@@ -856,7 +857,8 @@ static void pnv_chip_power8_realize(DeviceState *dev, Error **errp)
 error_propagate(errp, local_err);
 return;
 }
-pnv_xscom_add_subregion(chip, PNV_XSCOM_PSIHB_BASE, &chip8->psi.xscom_regs);
+pnv_xscom_add_subregion(chip, PNV_XSCOM_PSIHB_BASE,
+&PNV_PSI(psi8)->xscom_regs);
 
 /* Create LPC controller */
 object_property_set_bool(OBJECT(&chip8->lpc), true, "realized",
diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
index e61861bfd3..067f733f1e 100644
--- a/hw/ppc/pnv_psi.c
+++ b/hw/ppc/pnv_psi.c
@@ -118,10 +118,11 @@
 
 static void pnv_psi_set_bar(PnvPsi *psi, uint64_t bar)
 {
+PnvPsiClass *ppc = PNV_PSI_GET_CLASS(psi);
 MemoryRegion *sysmem = get_system_memory();
 uint64_t old = psi->regs[PSIHB_XSCOM_BAR];
 
-psi->regs[PSIHB_XSCOM_BAR] = bar & (PSIHB_BAR_MASK | PSIHB_BAR_EN);
+psi->regs[PSIHB_XSCOM_BAR] = bar & (ppc->bar_mask | PSIHB_BAR_EN);
 
 /* Update MR, always remove it first */
 if (old & PSIHB_BAR_EN) {
@@ -130,7 +131,7 @@ static void pnv_psi_set_bar(PnvPsi *psi, uint64_t bar)
 
 /* Then add it back if needed */
 if (bar & PSIHB_BAR_EN) {
-uint64_t addr = bar & PSIHB_BAR_MASK;
+uint64_t addr = bar & ppc->bar_mask;
 memory_region_add_subregion(sysmem, addr, &psi->regs_mr);
 }
 }
@@ -154,7 +155,7 @@ static void pnv_psi_set_cr(PnvPsi *psi, uint64_t cr)
 
 static void pnv_psi_set_irsn(PnvPsi *psi, uint64_t val)
 {
-ICSState *ics = &psi->ics;
+ICSState *ics = &PNV8_PSI(psi)->ics;
 
 /* In this model we ignore the up/down enable bits for now
  * as SW doesn't use them (other than setting them at boot).
@@ -207,7 +208,12 @@ static const uint64_t stat_bits[] = {
 [PSIHB_IRQ_EXTERNAL]  = PSIHB_IRQ_STAT_EXT,
 };
 
-void pnv_psi_irq_set(PnvPsi *psi, PnvPsiIrq irq, bool state)
+void pnv_psi_irq_set(PnvPsi *psi, int irq, bool state)
+{
+PNV_PSI_GET_CLASS(psi)->irq_set(psi, irq, state);
+}
+
+static void pnv_psi_power8_irq_set(PnvPsi *psi, int irq, bool state)
 {
 uint32_t xivr_reg;
 uint32_t stat_reg;
@@ -262,7 +268,7 @@ void pnv_psi_irq_set(PnvPsi *psi, PnvPsiIrq irq, bool state)
 
 static void pnv_psi_set_xivr(PnvPsi *psi, uint32_t reg, uint64_t val)
 {
-ICSState *ics = &psi->ics;
+ICSState *ics = &PNV8_PSI(psi)->ics;
 uint16_t server;
 uint8_t prio;
 uint8_t src;
@@ -451,11 +457,11 @@ static void pnv_psi_reset(void *dev)
 psi->regs[PSIHB_XSCOM_BAR] = psi->bar | PSIHB_BAR_EN;
 }
 
-static void pnv_psi_init(Object *obj)
+static void pnv_psi_power8_instance_init(Object *obj)
 {
-PnvPsi *psi = PNV_PSI(obj);
+Pnv8Psi *psi8 = PNV8_PSI(obj);
 
-object_initialize_child(obj, "ics-psi",  &psi->ics, sizeof(psi->ics),
+object_initialize_child(obj, "ics-psi",  &psi8->ics, sizeof(psi8->ics),
 TYPE_ICS_SIMPLE, &error_abort, NULL);
 }
 
@@ -468,10 +474,10 @@ static const uint8_t irq_to_xivr[] = {
 PSIHB_XSCOM_XIVR_EXT,
 };
 
-static void pnv_psi_realize(DeviceState *dev, Error **errp)
+static void pnv_psi_power8_realize(DeviceState *dev, Error **errp)
 {
 PnvPsi *psi = PNV_PSI(dev);
-ICSState *ics = &psi->ics;
+ICSState *ics = &PNV8_PSI(psi)->ics;
 Object *obj;
 Error *err = NULL;
 unsigned int i;
@@ -524,28 +530,28 @@ static void pnv_psi_realize(DeviceState *dev, Error **errp)
 qemu_register_reset(pnv_psi_reset, dev

[Qemu-devel] [PULL 50/60] ppc/pnv: add a OCC model class

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

To ease the introduction of the OCC model for POWER9, provide new
class attributes to define the XSCOM operations per CPU family and the
PSI IRQ number.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Message-Id: <20190307223548.20516-9-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c |  2 +-
 hw/ppc/pnv_occ.c | 55 +++-
 include/hw/ppc/pnv_occ.h | 15 +++
 3 files changed, 54 insertions(+), 18 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 918fae057b..6ae9ce6795 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -790,7 +790,7 @@ static void pnv_chip_power8_instance_init(Object *obj)
OBJECT(&chip8->psi), &error_abort);
 
 object_initialize_child(obj, "occ",  &chip8->occ, sizeof(chip8->occ),
-TYPE_PNV_OCC, &error_abort, NULL);
+TYPE_PNV8_OCC, &error_abort, NULL);
 object_property_add_const_link(OBJECT(&chip8->occ), "psi",
OBJECT(&chip8->psi), &error_abort);
 }
diff --git a/hw/ppc/pnv_occ.c b/hw/ppc/pnv_occ.c
index 04880f26d6..ea725647c9 100644
--- a/hw/ppc/pnv_occ.c
+++ b/hw/ppc/pnv_occ.c
@@ -34,15 +34,17 @@
 static void pnv_occ_set_misc(PnvOCC *occ, uint64_t val)
 {
 bool irq_state;
+PnvOCCClass *poc = PNV_OCC_GET_CLASS(occ);
 
 val &= 0xull;
 
 occ->occmisc = val;
 irq_state = !!(val >> 63);
-pnv_psi_irq_set(occ->psi, PSIHB_IRQ_OCC, irq_state);
+pnv_psi_irq_set(occ->psi, poc->psi_irq, irq_state);
 }
 
-static uint64_t pnv_occ_xscom_read(void *opaque, hwaddr addr, unsigned size)
+static uint64_t pnv_occ_power8_xscom_read(void *opaque, hwaddr addr,
+  unsigned size)
 {
 PnvOCC *occ = PNV_OCC(opaque);
 uint32_t offset = addr >> 3;
@@ -54,13 +56,13 @@ static uint64_t pnv_occ_xscom_read(void *opaque, hwaddr addr, unsigned size)
 break;
 default:
 qemu_log_mask(LOG_UNIMP, "OCC Unimplemented register: Ox%"
-  HWADDR_PRIx "\n", addr);
+  HWADDR_PRIx "\n", addr >> 3);
 }
 return val;
 }
 
-static void pnv_occ_xscom_write(void *opaque, hwaddr addr,
-uint64_t val, unsigned size)
+static void pnv_occ_power8_xscom_write(void *opaque, hwaddr addr,
+   uint64_t val, unsigned size)
 {
 PnvOCC *occ = PNV_OCC(opaque);
 uint32_t offset = addr >> 3;
@@ -77,13 +79,13 @@ static void pnv_occ_xscom_write(void *opaque, hwaddr addr,
 break;
 default:
 qemu_log_mask(LOG_UNIMP, "OCC Unimplemented register: Ox%"
-  HWADDR_PRIx "\n", addr);
+  HWADDR_PRIx "\n", addr >> 3);
 }
 }
 
-static const MemoryRegionOps pnv_occ_xscom_ops = {
-.read = pnv_occ_xscom_read,
-.write = pnv_occ_xscom_write,
+static const MemoryRegionOps pnv_occ_power8_xscom_ops = {
+.read = pnv_occ_power8_xscom_read,
+.write = pnv_occ_power8_xscom_write,
 .valid.min_access_size = 8,
 .valid.max_access_size = 8,
 .impl.min_access_size = 8,
@@ -91,27 +93,42 @@ static const MemoryRegionOps pnv_occ_xscom_ops = {
 .endianness = DEVICE_BIG_ENDIAN,
 };
 
+static void pnv_occ_power8_class_init(ObjectClass *klass, void *data)
+{
+PnvOCCClass *poc = PNV_OCC_CLASS(klass);
+
+poc->xscom_size = PNV_XSCOM_OCC_SIZE;
+poc->xscom_ops = &pnv_occ_power8_xscom_ops;
+poc->psi_irq = PSIHB_IRQ_OCC;
+}
+
+static const TypeInfo pnv_occ_power8_type_info = {
+.name  = TYPE_PNV8_OCC,
+.parent= TYPE_PNV_OCC,
+.instance_size = sizeof(PnvOCC),
+.class_init= pnv_occ_power8_class_init,
+};
 
 static void pnv_occ_realize(DeviceState *dev, Error **errp)
 {
 PnvOCC *occ = PNV_OCC(dev);
+PnvOCCClass *poc = PNV_OCC_GET_CLASS(occ);
 Object *obj;
-Error *error = NULL;
+Error *local_err = NULL;
 
 occ->occmisc = 0;
 
-/* get PSI object from chip */
-obj = object_property_get_link(OBJECT(dev), "psi", &error);
+obj = object_property_get_link(OBJECT(dev), "psi", &local_err);
 if (!obj) {
-error_setg(errp, "%s: required link 'psi' not found: %s",
-   __func__, error_get_pretty(error));
+error_propagate(errp, local_err);
+error_prepend(errp, "required link 'psi' not found: ");
 return;
 }
 occ->psi = PNV_PSI(obj);
 
 /* XScom region for OCC registers */
-pnv_xscom_region_init(&occ->xscom_regs, OBJECT(dev), &pnv_occ_xscom_ops,
-  occ, "xscom-occ", PNV_XSCOM_OCC_SIZE);
+pnv_xscom_region_init(&occ->xscom_regs, OBJECT(dev), poc->xscom_ops,
+  occ, "xscom-occ", poc->xscom_size);
 }
 
 static void pnv_occ_class_init(ObjectClass *klass, void *data)
@@ -119,6 +136,7 @@ static void pnv_occ_class_init(ObjectClass *klass, void *data)
 DeviceClass

[Qemu-devel] [PULL 46/60] ppc/pnv: add a LPC Controller class model

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

It will ease the introduction of the LPC Controller model for POWER9.
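
The diff below relies on chaining the family-specific realize routine to
the common one through a saved parent_realize pointer. A minimal standalone
sketch of that pattern, assuming simplified stand-in types (Dev, DevClass)
and a boolean return instead of QEMU's Error ** convention, could look like
this. None of these names are QEMU APIs.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct Dev Dev;
typedef bool (*RealizeFn)(Dev *dev);   /* returns false on error */

typedef struct DevClass {
    RealizeFn realize;          /* what the framework calls */
    RealizeFn parent_realize;   /* saved common routine */
} DevClass;

struct Dev {
    const DevClass *klass;
    bool base_ready;    /* set by the common realize */
    bool xscom_ready;   /* set by the family-specific realize */
    bool fail_base;     /* test knob: make the common realize fail */
};

/* Same idea as device_class_set_parent_realize(): save, then override. */
static void class_set_parent_realize(DevClass *dc, RealizeFn child)
{
    dc->parent_realize = dc->realize;
    dc->realize = child;
}

/* Common part shared by every family. */
static bool base_realize(Dev *dev)
{
    if (dev->fail_base) {
        return false;
    }
    dev->base_ready = true;
    return true;
}

/* Family-specific part: run the parent first, stop on error. */
static bool family_realize(Dev *dev)
{
    if (!dev->klass->parent_realize(dev)) {
        return false;
    }
    dev->xscom_ready = true;
    return true;
}

static void base_class_init(DevClass *dc)
{
    dc->realize = base_realize;
    dc->parent_realize = NULL;
}

static void family_class_init(DevClass *dc)
{
    base_class_init(dc);
    class_set_parent_realize(dc, family_realize);
}
```

With this layout the family-specific step (the XSCOM region on POWER8 in
the patch) never runs when the common step has failed.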

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Message-Id: <20190307223548.20516-5-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c |  2 +-
 hw/ppc/pnv_lpc.c | 85 
 include/hw/ppc/pnv_lpc.h | 15 +++
 3 files changed, 77 insertions(+), 25 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 1cc454cbbc..922e3ec48b 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -794,7 +794,7 @@ static void pnv_chip_power8_instance_init(Object *obj)
OBJECT(qdev_get_machine()), &error_abort);
 
 object_initialize_child(obj, "lpc",  &chip8->lpc, sizeof(chip8->lpc),
-TYPE_PNV_LPC, &error_abort, NULL);
+TYPE_PNV8_LPC, &error_abort, NULL);
 object_property_add_const_link(OBJECT(&chip8->lpc), "psi",
OBJECT(&chip8->psi), &error_abort);
 
diff --git a/hw/ppc/pnv_lpc.c b/hw/ppc/pnv_lpc.c
index 547be609ca..3c509a30a0 100644
--- a/hw/ppc/pnv_lpc.c
+++ b/hw/ppc/pnv_lpc.c
@@ -245,6 +245,7 @@ static const MemoryRegionOps pnv_lpc_xscom_ops = {
 static void pnv_lpc_eval_irqs(PnvLpcController *lpc)
 {
 bool lpc_to_opb_irq = false;
+PnvLpcClass *plc = PNV_LPC_GET_CLASS(lpc);
 
 /* Update LPC controller to OPB line */
 if (lpc->lpc_hc_irqser_ctrl & LPC_HC_IRQSER_EN) {
@@ -267,7 +268,7 @@ static void pnv_lpc_eval_irqs(PnvLpcController *lpc)
 lpc->opb_irq_stat |= lpc->opb_irq_input & lpc->opb_irq_mask;
 
 /* Reflect the interrupt */
-pnv_psi_irq_set(lpc->psi, PSIHB_IRQ_LPC_I2C, lpc->opb_irq_stat != 0);
+pnv_psi_irq_set(lpc->psi, plc->psi_irq, lpc->opb_irq_stat != 0);
 }
 
 static uint64_t lpc_hc_read(void *opaque, hwaddr addr, unsigned size)
@@ -419,11 +420,65 @@ static const MemoryRegionOps opb_master_ops = {
 },
 };
 
+static void pnv_lpc_power8_realize(DeviceState *dev, Error **errp)
+{
+PnvLpcController *lpc = PNV_LPC(dev);
+PnvLpcClass *plc = PNV_LPC_GET_CLASS(dev);
+Error *local_err = NULL;
+
+plc->parent_realize(dev, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+
+/* P8 uses a XSCOM region for LPC registers */
+pnv_xscom_region_init(&lpc->xscom_regs, OBJECT(lpc),
+  &pnv_lpc_xscom_ops, lpc, "xscom-lpc",
+  PNV_XSCOM_LPC_SIZE);
+}
+
+static void pnv_lpc_power8_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PnvXScomInterfaceClass *xdc = PNV_XSCOM_INTERFACE_CLASS(klass);
+PnvLpcClass *plc = PNV_LPC_CLASS(klass);
+
+dc->desc = "PowerNV LPC Controller POWER8";
+
+xdc->dt_xscom = pnv_lpc_dt_xscom;
+
+plc->psi_irq = PSIHB_IRQ_LPC_I2C;
+
+device_class_set_parent_realize(dc, pnv_lpc_power8_realize,
+&plc->parent_realize);
+}
+
+static const TypeInfo pnv_lpc_power8_info = {
+.name  = TYPE_PNV8_LPC,
+.parent= TYPE_PNV_LPC,
+.instance_size = sizeof(PnvLpcController),
+.class_init= pnv_lpc_power8_class_init,
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_PNV_XSCOM_INTERFACE },
+{ }
+}
+};
+
 static void pnv_lpc_realize(DeviceState *dev, Error **errp)
 {
 PnvLpcController *lpc = PNV_LPC(dev);
 Object *obj;
-Error *error = NULL;
+Error *local_err = NULL;
+
+obj = object_property_get_link(OBJECT(dev), "psi", &local_err);
+if (!obj) {
+error_propagate(errp, local_err);
+error_prepend(errp, "required link 'psi' not found: ");
+return;
+}
+/* The LPC controller needs PSI to generate interrupts  */
+lpc->psi = PNV_PSI(obj);
 
 /* Reg inits */
 lpc->lpc_hc_fw_rd_acc_size = LPC_HC_FW_RD_4B;
@@ -463,46 +518,28 @@ static void pnv_lpc_realize(DeviceState *dev, Error **errp)
   "lpc-hc", LPC_HC_REGS_OPB_SIZE);
 memory_region_add_subregion(&lpc->opb_mr, LPC_HC_REGS_OPB_ADDR,
 &lpc->lpc_hc_regs);
-
-/* XScom region for LPC registers */
-pnv_xscom_region_init(&lpc->xscom_regs, OBJECT(dev),
-  &pnv_lpc_xscom_ops, lpc, "xscom-lpc",
-  PNV_XSCOM_LPC_SIZE);
-
-/* get PSI object from chip */
-obj = object_property_get_link(OBJECT(dev), "psi", &error);
-if (!obj) {
-error_setg(errp, "%s: required link 'psi' not found: %s",
-   __func__, error_get_pretty(error));
-return;
-}
-lpc->psi = PNV_PSI(obj);
 }
 
 static void pnv_lpc_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
-PnvXScomInterfaceClass *xdc = PNV_XSCOM_INTERFACE_CLASS(klass);
-
-xdc->dt_xscom = pnv_lpc_dt_xscom;
 
 dc->realize = pnv_lpc_realize;
+dc->desc = "PowerNV LPC Controller";

[Qemu-devel] [PULL 56/60] ppc/pnv: add a "ibm, opal/power-mgt" device tree node on POWER9

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

Activate only stop0 and stop1 levels. We should not need more levels
when under QEMU.

Signed-off-by: Cédric Le Goater 
Message-Id: <20190307223548.20516-15-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index e68d419203..8be4d4cbf7 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -438,6 +438,16 @@ static void pnv_dt_isa(PnvMachineState *pnv, void *fdt)
&args);
 }
 
+static void pnv_dt_power_mgt(void *fdt)
+{
+int off;
+
+off = fdt_add_subnode(fdt, 0, "ibm,opal");
+off = fdt_add_subnode(fdt, off, "power-mgt");
+
+_FDT(fdt_setprop_cell(fdt, off, "ibm,enabled-stop-levels", 0xc000));
+}
+
 static void *pnv_dt_create(MachineState *machine)
 {
 const char plat_compat[] = "qemu,powernv\0ibm,powernv";
@@ -493,6 +503,11 @@ static void *pnv_dt_create(MachineState *machine)
 pnv_dt_bmc_sensors(pnv->bmc, fdt);
 }
 
+/* Create an extra node for power management on Power9 */
+if (pnv_is_power9(pnv)) {
+pnv_dt_power_mgt(fdt);
+}
+
 return fdt;
 }
 
-- 
2.20.1

[Qemu-devel] [PULL 51/60] ppc/pnv: add a OCC model for POWER9

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

The OCC on POWER9 is very similar to the one found on POWER8. Provide
the same routines with P9 values for the registers and IRQ number.
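
The register behaviour implemented by the diff below can be sketched
standalone as follows. The register numbers are the ones from the patch;
the Occ struct and the occ_read/occ_write/occ_set_misc helpers are
simplified stand-ins, the PSI interrupt is reduced to a boolean, and the
masking step of pnv_occ_set_misc() is left out on purpose because its
constant is not legible in this archive.

```c
#include <stdbool.h>
#include <stdint.h>

/* XSCOM register numbers, as defined in the patch. */
#define P9_OCB_OCI_OCCMISC        0x6080
#define P9_OCB_OCI_OCCMISC_CLEAR  0x6081
#define P9_OCB_OCI_OCCMISC_OR     0x6082

typedef struct Occ {
    uint64_t occmisc;
    bool irq_raised;    /* stands in for the PSI interrupt line */
} Occ;

/* Store the value and derive the interrupt state from bit 63. */
static void occ_set_misc(Occ *occ, uint64_t val)
{
    occ->occmisc = val;
    occ->irq_raised = !!(val >> 63);
}

/* XSCOM addresses are byte addresses of 8-byte registers. */
static uint64_t occ_read(const Occ *occ, uint64_t addr)
{
    switch (addr >> 3) {
    case P9_OCB_OCI_OCCMISC:
        return occ->occmisc;
    default:
        return 0;   /* unimplemented registers read as zero */
    }
}

static void occ_write(Occ *occ, uint64_t addr, uint64_t val)
{
    switch (addr >> 3) {
    case P9_OCB_OCI_OCCMISC_CLEAR:
        occ_set_misc(occ, 0);               /* mirrors the patch */
        break;
    case P9_OCB_OCI_OCCMISC_OR:
        occ_set_misc(occ, occ->occmisc | val);
        break;
    case P9_OCB_OCI_OCCMISC:
        occ_set_misc(occ, val);
        break;
    default:
        break;      /* unimplemented registers ignore writes */
    }
}
```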

Signed-off-by: Cédric Le Goater 
Message-Id: <20190307223548.20516-10-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c   | 13 +++
 hw/ppc/pnv_occ.c   | 72 ++
 include/hw/ppc/pnv.h   |  1 +
 include/hw/ppc/pnv_occ.h   |  2 ++
 include/hw/ppc/pnv_xscom.h |  3 ++
 5 files changed, 91 insertions(+)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 6ae9ce6795..1559a73323 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -956,6 +956,11 @@ static void pnv_chip_power9_instance_init(Object *obj)
 TYPE_PNV9_LPC, &error_abort, NULL);
 object_property_add_const_link(OBJECT(&chip9->lpc), "psi",
OBJECT(&chip9->psi), &error_abort);
+
+object_initialize_child(obj, "occ",  &chip9->occ, sizeof(chip9->occ),
+TYPE_PNV9_OCC, &error_abort, NULL);
+object_property_add_const_link(OBJECT(&chip9->occ), "psi",
+   OBJECT(&chip9->psi), &error_abort);
 }
 
 static void pnv_chip_power9_realize(DeviceState *dev, Error **errp)
@@ -1012,6 +1017,14 @@ static void pnv_chip_power9_realize(DeviceState *dev, Error **errp)
 
 chip->dt_isa_nodename = g_strdup_printf("/lpcm-opb@%" PRIx64 "/lpc@0",
 (uint64_t) PNV9_LPCM_BASE(chip));
+
+/* Create the simplified OCC model */
+object_property_set_bool(OBJECT(&chip9->occ), true, "realized", &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+pnv_xscom_add_subregion(chip, PNV9_XSCOM_OCC_BASE, &chip9->occ.xscom_regs);
 }
 
 static void pnv_chip_power9_class_init(ObjectClass *klass, void *data)
diff --git a/hw/ppc/pnv_occ.c b/hw/ppc/pnv_occ.c
index ea725647c9..fdd9296e1b 100644
--- a/hw/ppc/pnv_occ.c
+++ b/hw/ppc/pnv_occ.c
@@ -109,6 +109,77 @@ static const TypeInfo pnv_occ_power8_type_info = {
 .class_init= pnv_occ_power8_class_init,
 };
 
+#define P9_OCB_OCI_OCCMISC        0x6080
+#define P9_OCB_OCI_OCCMISC_CLEAR  0x6081
+#define P9_OCB_OCI_OCCMISC_OR     0x6082
+
+
+static uint64_t pnv_occ_power9_xscom_read(void *opaque, hwaddr addr,
+  unsigned size)
+{
+PnvOCC *occ = PNV_OCC(opaque);
+uint32_t offset = addr >> 3;
+uint64_t val = 0;
+
+switch (offset) {
+case P9_OCB_OCI_OCCMISC:
+val = occ->occmisc;
+break;
+default:
+qemu_log_mask(LOG_UNIMP, "OCC Unimplemented register: Ox%"
+  HWADDR_PRIx "\n", addr >> 3);
+}
+return val;
+}
+
+static void pnv_occ_power9_xscom_write(void *opaque, hwaddr addr,
+   uint64_t val, unsigned size)
+{
+PnvOCC *occ = PNV_OCC(opaque);
+uint32_t offset = addr >> 3;
+
+switch (offset) {
+case P9_OCB_OCI_OCCMISC_CLEAR:
+pnv_occ_set_misc(occ, 0);
+break;
+case P9_OCB_OCI_OCCMISC_OR:
+pnv_occ_set_misc(occ, occ->occmisc | val);
+break;
+case P9_OCB_OCI_OCCMISC:
+pnv_occ_set_misc(occ, val);
+   break;
+default:
+qemu_log_mask(LOG_UNIMP, "OCC Unimplemented register: Ox%"
+  HWADDR_PRIx "\n", addr >> 3);
+}
+}
+
+static const MemoryRegionOps pnv_occ_power9_xscom_ops = {
+.read = pnv_occ_power9_xscom_read,
+.write = pnv_occ_power9_xscom_write,
+.valid.min_access_size = 8,
+.valid.max_access_size = 8,
+.impl.min_access_size = 8,
+.impl.max_access_size = 8,
+.endianness = DEVICE_BIG_ENDIAN,
+};
+
+static void pnv_occ_power9_class_init(ObjectClass *klass, void *data)
+{
+PnvOCCClass *poc = PNV_OCC_CLASS(klass);
+
+poc->xscom_size = PNV9_XSCOM_OCC_SIZE;
+poc->xscom_ops = &pnv_occ_power9_xscom_ops;
+poc->psi_irq = PSIHB9_IRQ_OCC;
+}
+
+static const TypeInfo pnv_occ_power9_type_info = {
+.name  = TYPE_PNV9_OCC,
+.parent= TYPE_PNV_OCC,
+.instance_size = sizeof(PnvOCC),
+.class_init= pnv_occ_power9_class_init,
+};
+
 static void pnv_occ_realize(DeviceState *dev, Error **errp)
 {
 PnvOCC *occ = PNV_OCC(dev);
@@ -152,6 +223,7 @@ static void pnv_occ_register_types(void)
 {
 type_register_static(&pnv_occ_type_info);
 type_register_static(&pnv_occ_power8_type_info);
+type_register_static(&pnv_occ_power9_type_info);
 }
 
 type_init(pnv_occ_register_types);
diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index 1cd1ad622d..39888f9d52 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -88,6 +88,7 @@ typedef struct Pnv9Chip {
 PnvXive  xive;
 Pnv9Psi  psi;
 PnvLpcController lpc;
+PnvOCC   occ;
 } Pnv9Chip;
 
 typedef struct PnvChipClass {
diff --git a/include/hw/ppc/pnv_occ.h b/include/hw/ppc/pnv_occ.h
index

[Qemu-devel] [PULL 36/60] target/ppc: move Vsr* macros from internal.h to cpu.h

2019-03-10 Thread David Gibson

From: Mark Cave-Ayland 

It isn't possible to include internal.h from cpu.h so move the Vsr* macros
into cpu.h alongside the other VMX/VSX register access functions.
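
For readers unfamiliar with these macros, the following standalone sketch
shows what the index reversal buys: element 0 always names the most
significant part of the 128-bit register, whatever the host byte order.
The reduced vsr_t union and the vsr_make/vsr_byte helpers are hypothetical,
and the compiler-provided __BYTE_ORDER__ test (GCC/Clang) is an assumption
standing in for QEMU's HOST_WORDS_BIGENDIAN.

```c
#include <stdint.h>

/* Reduced stand-in for ppc_vsr_t: only the byte and doubleword views. */
typedef union {
    uint8_t  u8[16];
    uint64_t u64[2];
} vsr_t;

#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define VsrB(i) u8[i]
#define VsrD(i) u64[i]
#else
#define VsrB(i) u8[15 - (i)]
#define VsrD(i) u64[1 - (i)]
#endif

/* Build a register from its high and low doublewords. */
static vsr_t vsr_make(uint64_t hi, uint64_t lo)
{
    vsr_t v;

    v.VsrD(0) = hi;
    v.VsrD(1) = lo;
    return v;
}

/* Byte i in register order: 0 is the most significant byte. */
static uint8_t vsr_byte(const vsr_t *v, int i)
{
    return v->VsrB(i);
}
```

Because both the byte and the doubleword accessors are reversed together on
little-endian hosts, code written against VsrB()/VsrD() sees the same
register layout on either host.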

Signed-off-by: Mark Cave-Ayland 
Message-Id: <20190307180520.13868-4-mark.cave-ayl...@ilande.co.uk>
Reviewed-by: Richard Henderson 
Signed-off-by: David Gibson 
---
 target/ppc/cpu.h  | 20 
 target/ppc/internal.h | 19 ---
 2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 0c3fc8e084..1c4af4a1dc 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2563,6 +2563,26 @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx)
 }
 
 /* Accessors for FP, VMX and VSX registers */
+#if defined(HOST_WORDS_BIGENDIAN)
+#define VsrB(i) u8[i]
+#define VsrSB(i) s8[i]
+#define VsrH(i) u16[i]
+#define VsrSH(i) s16[i]
+#define VsrW(i) u32[i]
+#define VsrSW(i) s32[i]
+#define VsrD(i) u64[i]
+#define VsrSD(i) s64[i]
+#else
+#define VsrB(i) u8[15 - (i)]
+#define VsrSB(i) s8[15 - (i)]
+#define VsrH(i) u16[7 - (i)]
+#define VsrSH(i) s16[7 - (i)]
+#define VsrW(i) u32[3 - (i)]
+#define VsrSW(i) s32[3 - (i)]
+#define VsrD(i) u64[1 - (i)]
+#define VsrSD(i) s64[1 - (i)]
+#endif
+
 static inline int fpr_offset(int i)
 {
 return offsetof(CPUPPCState, vsr[i].u64[0]);
diff --git a/target/ppc/internal.h b/target/ppc/internal.h
index f26a71ffcf..3ebbdf4da4 100644
--- a/target/ppc/internal.h
+++ b/target/ppc/internal.h
@@ -204,25 +204,6 @@ EXTRACT_HELPER(IMM8, 11, 8);
 EXTRACT_HELPER(DCMX, 16, 7);
 EXTRACT_HELPER_SPLIT_3(DCMX_XV, 5, 16, 0, 1, 2, 5, 1, 6, 6);
 
-#if defined(HOST_WORDS_BIGENDIAN)
-#define VsrB(i) u8[i]
-#define VsrSB(i) s8[i]
-#define VsrH(i) u16[i]
-#define VsrSH(i) s16[i]
-#define VsrW(i) u32[i]
-#define VsrSW(i) s32[i]
-#define VsrD(i) u64[i]
-#define VsrSD(i) s64[i]
-#else
-#define VsrB(i) u8[15 - (i)]
-#define VsrSB(i) s8[15 - (i)]
-#define VsrH(i) u16[7 - (i)]
-#define VsrSH(i) s16[7 - (i)]
-#define VsrW(i) u32[3 - (i)]
-#define VsrSW(i) s32[3 - (i)]
-#define VsrD(i) u64[1 - (i)]
-#define VsrSD(i) s64[1 - (i)]
-#endif
 static inline void getVSR(int n, ppc_vsr_t *vsr, CPUPPCState *env)
 {
 vsr->VsrD(0) = env->vsr[n].u64[0];
-- 
2.20.1

[Qemu-devel] [PULL 57/60] target/ppc: add HV support for POWER9

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

We now have enough support to boot a PowerNV machine with a POWER9
processor. Allow HV mode on POWER9.

Signed-off-by: Cédric Le Goater 
Message-Id: <20190307223548.20516-16-...@kaod.org>
Signed-off-by: David Gibson 
---
 target/ppc/translate_init.inc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index af70a3b78c..0bd555eb19 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -8895,7 +8895,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
PPC_CACHE | PPC_CACHE_ICBI | PPC_CACHE_DCBZ |
PPC_MEM_SYNC | PPC_MEM_EIEIO |
PPC_MEM_TLBSYNC |
-   PPC_64B | PPC_64BX | PPC_ALTIVEC |
+   PPC_64B | PPC_64H | PPC_64BX | PPC_ALTIVEC |
PPC_SEGMENT_64B | PPC_SLBI |
PPC_POPCNTB | PPC_POPCNTWD |
PPC_CILDST;
@@ -8907,6 +8907,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
 PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 |
 PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL;
 pcc->msr_mask = (1ull << MSR_SF) |
+(1ull << MSR_SHV) |
 (1ull << MSR_TM) |
 (1ull << MSR_VR) |
 (1ull << MSR_VSX) |
-- 
2.20.1

[Qemu-devel] [PULL 49/60] ppc/pnv: add SerIRQ routing registers

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

This is just a simple reminder that SerIRQ routing should be
addressed.

Signed-off-by: Cédric Le Goater 
Message-Id: <20190307223548.20516-8-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv_lpc.c | 14 ++
 include/hw/ppc/pnv_lpc.h |  2 ++
 2 files changed, 16 insertions(+)

diff --git a/hw/ppc/pnv_lpc.c b/hw/ppc/pnv_lpc.c
index 6df694e0ab..641e2046db 100644
--- a/hw/ppc/pnv_lpc.c
+++ b/hw/ppc/pnv_lpc.c
@@ -39,6 +39,8 @@ enum {
 };
 
 /* OPB Master LS registers */
+#define OPB_MASTER_LS_ROUTE0    0x8
+#define OPB_MASTER_LS_ROUTE1    0xC
 #define OPB_MASTER_LS_IRQ_STAT  0x50
 #define   OPB_MASTER_IRQ_LPC0x0800
 #define OPB_MASTER_LS_IRQ_MASK  0x54
@@ -521,6 +523,12 @@ static uint64_t opb_master_read(void *opaque, hwaddr addr, unsigned size)
 uint64_t val = 0xul;
 
 switch (addr) {
+case OPB_MASTER_LS_ROUTE0: /* TODO */
+val = lpc->opb_irq_route0;
+break;
+case OPB_MASTER_LS_ROUTE1: /* TODO */
+val = lpc->opb_irq_route1;
+break;
 case OPB_MASTER_LS_IRQ_STAT:
 val = lpc->opb_irq_stat;
 break;
@@ -547,6 +555,12 @@ static void opb_master_write(void *opaque, hwaddr addr,
 PnvLpcController *lpc = opaque;
 
 switch (addr) {
+case OPB_MASTER_LS_ROUTE0: /* TODO */
+lpc->opb_irq_route0 = val;
+break;
+case OPB_MASTER_LS_ROUTE1: /* TODO */
+lpc->opb_irq_route1 = val;
+break;
 case OPB_MASTER_LS_IRQ_STAT:
 lpc->opb_irq_stat &= ~val;
 pnv_lpc_eval_irqs(lpc);
diff --git a/include/hw/ppc/pnv_lpc.h b/include/hw/ppc/pnv_lpc.h
index 242b18081c..413579792e 100644
--- a/include/hw/ppc/pnv_lpc.h
+++ b/include/hw/ppc/pnv_lpc.h
@@ -55,6 +55,8 @@ typedef struct PnvLpcController {
 MemoryRegion opb_master_regs;
 
 /* OPB Master LS registers */
+uint32_t opb_irq_route0;
+uint32_t opb_irq_route1;
 uint32_t opb_irq_stat;
 uint32_t opb_irq_mask;
 uint32_t opb_irq_pol;
-- 
2.20.1

[Qemu-devel] [PULL 47/60] ppc/pnv: add a 'dt_isa_nodename' to the chip

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

The ISA bus has a different DT nodename on POWER9. Compute the name
when the PnvChip is realized, that is before it is used by the machine
to populate the device tree with the ISA devices.
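
A small standalone sketch of the idea: each chip family formats its own ISA
node path once, at realize time, and the machine code later uses only the
stored string. The POWER8 format string is the one from this patch and the
POWER9 one is taken from a later patch in this series; the Chip struct, the
fixed-size buffer (instead of g_strdup_printf()), the helper names and the
base addresses used in any call are assumptions for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct Chip {
    char dt_isa_nodename[64];   /* fixed buffer keeps the sketch simple */
} Chip;

/* POWER8 style: the ISA bus hangs off the XSCOM node. */
static void chip_p8_realize(Chip *chip, uint64_t xscom_base,
                            unsigned int lpc_base)
{
    snprintf(chip->dt_isa_nodename, sizeof(chip->dt_isa_nodename),
             "/xscom@%" PRIx64 "/isa@%x", xscom_base, lpc_base);
}

/* POWER9 style: the ISA bus hangs off the LPCM OPB node. */
static void chip_p9_realize(Chip *chip, uint64_t lpcm_base)
{
    snprintf(chip->dt_isa_nodename, sizeof(chip->dt_isa_nodename),
             "/lpcm-opb@%" PRIx64 "/lpc@0", lpcm_base);
}

/*
 * Stand-in for the machine-level lookup (fdt_path_offset() in the
 * patch): it needs only the stored path, not the chip family.
 */
static int isa_path_matches(const Chip *chip, const char *path)
{
    return strcmp(chip->dt_isa_nodename, path) == 0;
}
```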

Signed-off-by: Cédric Le Goater 
Message-Id: <20190307223548.20516-6-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 18 +-
 include/hw/ppc/pnv.h |  2 ++
 2 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 922e3ec48b..6625562d27 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -417,24 +417,12 @@ static int pnv_dt_isa_device(DeviceState *dev, void *opaque)
 return 0;
 }
 
-static int pnv_chip_isa_offset(PnvChip *chip, void *fdt)
-{
-char *name;
-int offset;
-
-name = g_strdup_printf("/xscom@%" PRIx64 "/isa@%x",
-   (uint64_t) PNV_XSCOM_BASE(chip), PNV_XSCOM_LPC_BASE);
-offset = fdt_path_offset(fdt, name);
-g_free(name);
-return offset;
-}
-
 /* The default LPC bus of a multichip system is on chip 0. It's
  * recognized by the firmware (skiboot) using a "primary" property.
  */
 static void pnv_dt_isa(PnvMachineState *pnv, void *fdt)
 {
-int isa_offset = pnv_chip_isa_offset(pnv->chips[0], fdt);
+int isa_offset = fdt_path_offset(fdt, pnv->chips[0]->dt_isa_nodename);
 ForeachPopulateArgs args = {
 .fdt = fdt,
 .offset = isa_offset,
@@ -866,6 +854,10 @@ static void pnv_chip_power8_realize(DeviceState *dev, Error **errp)
  &error_fatal);
 pnv_xscom_add_subregion(chip, PNV_XSCOM_LPC_BASE, &chip8->lpc.xscom_regs);
 
+chip->dt_isa_nodename = g_strdup_printf("/xscom@%" PRIx64 "/isa@%x",
+(uint64_t) PNV_XSCOM_BASE(chip),
+PNV_XSCOM_LPC_BASE);
+
 /* Interrupt Management Area. This is the memory region holding
  * all the Interrupt Control Presenter (ICP) registers */
 pnv_chip_icp_realize(chip8, &local_err);
diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index 8d80cb34ee..c81f157f41 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -58,6 +58,8 @@ typedef struct PnvChip {
 MemoryRegion xscom_mmio;
 MemoryRegion xscom;
 AddressSpace xscom_as;
+
+gchar *dt_isa_nodename;
 } PnvChip;
 
 #define TYPE_PNV8_CHIP "pnv8-chip"
-- 
2.20.1

[Qemu-devel] [PULL 48/60] ppc/pnv: add a LPC Controller model for POWER9

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

The LPC Controller on POWER9 is very similar to the one found on
POWER8, but accesses are now done via MMIOs, without the XSCOM and
ECCB logic. The device tree is populated differently, so we add a
specific POWER9 routine for the purpose.
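
Part of populating the device tree is encoding 64-bit addresses and sizes
as pairs of 32-bit big-endian cells, as the opb_reg array in the diff below
does with cpu_to_be32(). A host-independent sketch of that encoding, with
hypothetical helper names (put_be32, encode_reg64) and byte buffers instead
of uint32_t arrays, might look like this.

```c
#include <stdint.h>

/* Store a 32-bit value in big-endian byte order, whatever the host is. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Encode a "reg" property of the form
 *   <addr-hi addr-lo size-hi size-lo>
 * which is the cell layout used for opb_reg in the patch.
 */
static void encode_reg64(uint8_t out[16], uint64_t addr, uint64_t size)
{
    put_be32(out + 0,  (uint32_t)(addr >> 32));
    put_be32(out + 4,  (uint32_t)addr);
    put_be32(out + 8,  (uint32_t)(size >> 32));
    put_be32(out + 12, (uint32_t)size);
}
```

Writing bytes explicitly avoids any dependence on host endianness; the
patch gets the same result by converting each cell with cpu_to_be32().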

SerIRQ routing is yet to be done.

Signed-off-by: Cédric Le Goater 
Message-Id: <20190307223548.20516-7-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c |  22 -
 hw/ppc/pnv_lpc.c | 200 +++
 include/hw/ppc/pnv.h |   4 +
 include/hw/ppc/pnv_lpc.h |   9 ++
 4 files changed, 234 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 6625562d27..918fae057b 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -306,6 +306,8 @@ static void pnv_chip_power9_dt_populate(PnvChip *chip, void *fdt)
 if (chip->ram_size) {
 pnv_dt_memory(fdt, chip->chip_id, chip->ram_start, chip->ram_size);
 }
+
+pnv_dt_lpc(chip, fdt, 0);
 }
 
 static void pnv_dt_rtc(ISADevice *d, void *fdt, int lpc_off)
@@ -547,7 +549,8 @@ static ISABus *pnv_chip_power8nvl_isa_create(PnvChip *chip, Error **errp)
 
 static ISABus *pnv_chip_power9_isa_create(PnvChip *chip, Error **errp)
 {
-return NULL;
+Pnv9Chip *chip9 = PNV9_CHIP(chip);
+return pnv_lpc_isa_create(&chip9->lpc, false, errp);
 }
 
 static ISABus *pnv_isa_create(PnvChip *chip, Error **errp)
@@ -948,6 +951,11 @@ static void pnv_chip_power9_instance_init(Object *obj)
 TYPE_PNV9_PSI, &error_abort, NULL);
 object_property_add_const_link(OBJECT(&chip9->psi), "chip", obj,
&error_abort);
+
+object_initialize_child(obj, "lpc",  &chip9->lpc, sizeof(chip9->lpc),
+TYPE_PNV9_LPC, &error_abort, NULL);
+object_property_add_const_link(OBJECT(&chip9->lpc), "psi",
+   OBJECT(&chip9->psi), &error_abort);
 }
 
 static void pnv_chip_power9_realize(DeviceState *dev, Error **errp)
@@ -992,6 +1000,18 @@ static void pnv_chip_power9_realize(DeviceState *dev, Error **errp)
 }
 pnv_xscom_add_subregion(chip, PNV9_XSCOM_PSIHB_BASE,
 &PNV_PSI(psi9)->xscom_regs);
+
+/* LPC */
+object_property_set_bool(OBJECT(&chip9->lpc), true, "realized", &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+memory_region_add_subregion(get_system_memory(), PNV9_LPCM_BASE(chip),
+&chip9->lpc.xscom_regs);
+
+chip->dt_isa_nodename = g_strdup_printf("/lpcm-opb@%" PRIx64 "/lpc@0",
+(uint64_t) PNV9_LPCM_BASE(chip));
 }
 
 static void pnv_chip_power9_class_init(ObjectClass *klass, void *data)
diff --git a/hw/ppc/pnv_lpc.c b/hw/ppc/pnv_lpc.c
index 3c509a30a0..6df694e0ab 100644
--- a/hw/ppc/pnv_lpc.c
+++ b/hw/ppc/pnv_lpc.c
@@ -118,6 +118,100 @@ static int pnv_lpc_dt_xscom(PnvXScomInterface *dev, void *fdt, int xscom_offset)
 return 0;
 }
 
+/* POWER9 only */
+int pnv_dt_lpc(PnvChip *chip, void *fdt, int root_offset)
+{
+const char compat[] = "ibm,power9-lpcm-opb\0simple-bus";
+const char lpc_compat[] = "ibm,power9-lpc\0ibm,lpc";
+char *name;
+int offset, lpcm_offset;
+uint64_t lpcm_addr = PNV9_LPCM_BASE(chip);
+uint32_t opb_ranges[8] = { 0,
+   cpu_to_be32(lpcm_addr >> 32),
+   cpu_to_be32((uint32_t)lpcm_addr),
+   cpu_to_be32(PNV9_LPCM_SIZE / 2),
+   cpu_to_be32(PNV9_LPCM_SIZE / 2),
+   cpu_to_be32(lpcm_addr >> 32),
+   cpu_to_be32(PNV9_LPCM_SIZE / 2),
+   cpu_to_be32(PNV9_LPCM_SIZE / 2),
+};
+uint32_t opb_reg[4] = { cpu_to_be32(lpcm_addr >> 32),
+cpu_to_be32((uint32_t)lpcm_addr),
+cpu_to_be32(PNV9_LPCM_SIZE >> 32),
+cpu_to_be32((uint32_t)PNV9_LPCM_SIZE),
+};
+uint32_t reg[2];
+
+/*
+ * OPB bus
+ */
+name = g_strdup_printf("lpcm-opb@%"PRIx64, lpcm_addr);
+lpcm_offset = fdt_add_subnode(fdt, root_offset, name);
+_FDT(lpcm_offset);
+g_free(name);
+
+_FDT((fdt_setprop(fdt, lpcm_offset, "reg", opb_reg, sizeof(opb_reg))));
+_FDT((fdt_setprop_cell(fdt, lpcm_offset, "#address-cells", 1)));
+_FDT((fdt_setprop_cell(fdt, lpcm_offset, "#size-cells", 1)));
+_FDT((fdt_setprop(fdt, lpcm_offset, "compatible", compat, sizeof(compat))));
+_FDT((fdt_setprop_cell(fdt, lpcm_offset, "ibm,chip-id", chip->chip_id)));
+_FDT((fdt_setprop(fdt, lpcm_offset, "ranges", opb_ranges,
+  sizeof(opb_ranges))));
+
+/*
+ * OPB Master registers
+ */
+name = g_strdup_printf("opb-master@%x", LPC_OPB_REGS_OPB_ADDR);
+offset = fdt_add_subnode(fdt, lpcm_offset, name);
+_FDT(offse

[Qemu-devel] [PULL 44/60] ppc/pnv: add a PSI bridge model for POWER9

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

The PSI bridge on POWER9 is very similar to POWER8. The BAR is still
set through XSCOM but the controls are now entirely done with MMIOs.
More interrupts are defined and the interrupt controller interface has
changed to XIVE. The POWER9 model is a first example of the usage of
the notify() handler of the XiveNotifier interface, linking the PSI
XiveSource to its owning device model.

Signed-off-by: Cédric Le Goater 
Message-Id: <20190307223548.20516-3-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c   |  18 ++
 hw/ppc/pnv_psi.c   | 329 -
 include/hw/ppc/pnv.h   |   6 +
 include/hw/ppc/pnv_psi.h   |  30 
 include/hw/ppc/pnv_xscom.h |   3 +
 5 files changed, 384 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 5bb2332f16..1cc454cbbc 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -579,6 +579,7 @@ static void pnv_chip_power9_pic_print_info(PnvChip *chip, Monitor *mon)
 Pnv9Chip *chip9 = PNV9_CHIP(chip);
 
 pnv_xive_pic_print_info(&chip9->xive, mon);
+pnv_psi_pic_print_info(&chip9->psi, mon);
 }
 
 static void pnv_init(MachineState *machine)
@@ -950,6 +951,11 @@ static void pnv_chip_power9_instance_init(Object *obj)
 TYPE_PNV_XIVE, &error_abort, NULL);
 object_property_add_const_link(OBJECT(&chip9->xive), "chip", obj,
&error_abort);
+
+object_initialize_child(obj, "psi",  &chip9->psi, sizeof(chip9->psi),
+TYPE_PNV9_PSI, &error_abort, NULL);
+object_property_add_const_link(OBJECT(&chip9->psi), "chip", obj,
+   &error_abort);
 }
 
 static void pnv_chip_power9_realize(DeviceState *dev, Error **errp)
@@ -957,6 +963,7 @@ static void pnv_chip_power9_realize(DeviceState *dev, Error **errp)
 PnvChipClass *pcc = PNV_CHIP_GET_CLASS(dev);
 Pnv9Chip *chip9 = PNV9_CHIP(dev);
 PnvChip *chip = PNV_CHIP(dev);
+Pnv9Psi *psi9 = &chip9->psi;
 Error *local_err = NULL;
 
 pcc->parent_realize(dev, &local_err);
@@ -982,6 +989,17 @@ static void pnv_chip_power9_realize(DeviceState *dev, Error **errp)
 }
 pnv_xscom_add_subregion(chip, PNV9_XSCOM_XIVE_BASE,
 &chip9->xive.xscom_regs);
+
+/* Processor Service Interface (PSI) Host Bridge */
+object_property_set_int(OBJECT(&chip9->psi), PNV9_PSIHB_BASE(chip),
+"bar", &error_fatal);
+object_property_set_bool(OBJECT(&chip9->psi), true, "realized", 
&local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+pnv_xscom_add_subregion(chip, PNV9_XSCOM_PSIHB_BASE,
+&PNV_PSI(psi9)->xscom_regs);
 }
 
 static void pnv_chip_power9_class_init(ObjectClass *klass, void *data)
diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
index 067f733f1e..5a923e4151 100644
--- a/hw/ppc/pnv_psi.c
+++ b/hw/ppc/pnv_psi.c
@@ -22,6 +22,7 @@
 #include "target/ppc/cpu.h"
 #include "qemu/log.h"
 #include "qapi/error.h"
+#include "monitor/monitor.h"
 
 #include "exec/address-spaces.h"
 
@@ -114,6 +115,9 @@
 #define PSIHB_BAR_MASK  0x0003fff0ull
 #define PSIHB_FSPBAR_MASK   0x0003ull
 
+#define PSIHB9_BAR_MASK 0x00f0ull
+#define PSIHB9_FSPBAR_MASK  0x00ffull
+
 #define PSIHB_REG(addr) (((addr) >> 3) + PSIHB_XSCOM_BAR)
 
 static void pnv_psi_set_bar(PnvPsi *psi, uint64_t bar)
@@ -531,6 +535,7 @@ static void pnv_psi_power8_realize(DeviceState *dev, Error 
**errp)
 }
 
 static const char compat_p8[] = "ibm,power8-psihb-x\0ibm,psihb-x";
+static const char compat_p9[] = "ibm,power9-psihb-x\0ibm,psihb-x";
 
 static int pnv_psi_dt_xscom(PnvXScomInterface *dev, void *fdt, int 
xscom_offset)
 {
@@ -550,8 +555,13 @@ static int pnv_psi_dt_xscom(PnvXScomInterface *dev, void 
*fdt, int xscom_offset)
 _FDT(fdt_setprop(fdt, offset, "reg", reg, sizeof(reg)));
 _FDT(fdt_setprop_cell(fdt, offset, "#address-cells", 2));
 _FDT(fdt_setprop_cell(fdt, offset, "#size-cells", 1));
-_FDT(fdt_setprop(fdt, offset, "compatible", compat_p8,
- sizeof(compat_p8)));
+if (ppc->chip_type == PNV_CHIP_POWER9) {
+_FDT(fdt_setprop(fdt, offset, "compatible", compat_p9,
+ sizeof(compat_p9)));
+} else {
+_FDT(fdt_setprop(fdt, offset, "compatible", compat_p8,
+ sizeof(compat_p8)));
+}
 return 0;
 }
 
@@ -584,6 +594,308 @@ static const TypeInfo pnv_psi_power8_info = {
 .class_init= pnv_psi_power8_class_init,
 };
 
+
+/* Common registers */
+
+#define PSIHB9_CR   0x20
+#define PSIHB9_SEMR 0x28
+
+/* P9 registers */
+
+#define PSIHB9_INTERRUPT_CONTROL0x58
+#define   PSIHB9_IRQ_METHOD PPC_BIT(0)
+#define   PSIHB9_IRQ_RESET  PPC_BIT(1)
+#

[Qemu-devel] [PULL 59/60] target/ppc: Optimize x[sv]xsigdp using deposit_i64()

2019-03-10 Thread David Gibson

From: Philippe Mathieu-Daudé 

Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20190309214255.9952-3-f4...@amsat.org>
Signed-off-by: David Gibson 
---
 target/ppc/translate/vsx-impl.inc.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/target/ppc/translate/vsx-impl.inc.c 
b/target/ppc/translate/vsx-impl.inc.c
index 30d8aabd92..508e9199c8 100644
--- a/target/ppc/translate/vsx-impl.inc.c
+++ b/target/ppc/translate/vsx-impl.inc.c
@@ -1587,8 +1587,7 @@ static void gen_xsxsigdp(DisasContext *ctx)
 tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
 tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);
 get_cpu_vsrh(t1, xB(ctx->opcode));
-tcg_gen_andi_i64(rt, t1, 0x000F);
-tcg_gen_or_i64(rt, rt, t0);
+tcg_gen_deposit_i64(rt, t0, t1, 0, 52);
 
 tcg_temp_free_i64(t0);
 tcg_temp_free_i64(t1);
@@ -1624,8 +1623,7 @@ static void gen_xsxsigqp(DisasContext *ctx)
 tcg_gen_movi_i64(t0, 0x0001);
 tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
 tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);
-tcg_gen_andi_i64(xth, xbh, 0x);
-tcg_gen_or_i64(xth, xth, t0);
+tcg_gen_deposit_i64(xth, t0, xbh, 0, 48);
 set_cpu_vsrh(rD(ctx->opcode) + 32, xth);
 tcg_gen_mov_i64(xtl, xbl);
 set_cpu_vsrl(rD(ctx->opcode) + 32, xtl);
@@ -1814,16 +1812,14 @@ static void gen_xvxsigdp(DisasContext *ctx)
 tcg_gen_movi_i64(t0, 0x0010);
 tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
 tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);
-tcg_gen_andi_i64(xth, xbh, 0x000F);
-tcg_gen_or_i64(xth, xth, t0);
+tcg_gen_deposit_i64(xth, t0, xbh, 0, 52);
 set_cpu_vsrh(xT(ctx->opcode), xth);
 
 tcg_gen_extract_i64(exp, xbl, 52, 11);
 tcg_gen_movi_i64(t0, 0x0010);
 tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, zr, zr, t0);
 tcg_gen_movcond_i64(TCG_COND_EQ, t0, exp, nan, zr, t0);
-tcg_gen_andi_i64(xtl, xbl, 0x000F);
-tcg_gen_or_i64(xtl, xtl, t0);
+tcg_gen_deposit_i64(xtl, t0, xbl, 0, 52);
 set_cpu_vsrl(xT(ctx->opcode), xtl);
 
 tcg_temp_free_i64(t0);
-- 
2.20.1
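
For readers following the and/or to deposit rewrite above, here is a minimal plain-C sketch of why the two sequences agree. It is not TCG code: deposit64_model() is a hypothetical stand-in for the deposit operation, and the 52-bit mask is derived from the deposit length rather than copied from the patch. The equivalence only holds under the assumption that the value holding the implicit bit has its low 52 bits clear.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a 64-bit deposit: replace the `len` bits of `base`
 * starting at bit `pos` with the low `len` bits of `field`.
 */
static uint64_t deposit64_model(uint64_t base, unsigned pos, unsigned len,
                                uint64_t field)
{
    assert(pos < 64 && len > 0 && len <= 64 - pos);
    uint64_t mask = (len == 64 ? ~0ULL : ((1ULL << len) - 1)) << pos;
    return (base & ~mask) | ((field << pos) & mask);
}

/* Old sequence: mask the fraction out of the raw value, then OR in t0. */
static uint64_t sig_and_or(uint64_t implicit_bit, uint64_t raw)
{
    return (raw & 0x000FFFFFFFFFFFFFULL) | implicit_bit;
}

/* New sequence: deposit the low 52 bits of the raw value into t0. */
static uint64_t sig_deposit(uint64_t implicit_bit, uint64_t raw)
{
    return deposit64_model(implicit_bit, 0, 52, raw);
}
```

Both helpers return the same value whenever implicit_bit is zero or has only bit 52 set, which appears to be the only case the translator generates.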

[Qemu-devel] [PULL 55/60] ppc/pnv: add more dummy XSCOM addresses

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

To improve OPAL/skiboot support. We don't need to strictly model these
XSCOM accesses.

Signed-off-by: Cédric Le Goater 
Message-Id: <20190307223548.20516-14-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv_xscom.c | 33 +++--
 1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/hw/ppc/pnv_xscom.c b/hw/ppc/pnv_xscom.c
index 46fae41f32..c285ef514e 100644
--- a/hw/ppc/pnv_xscom.c
+++ b/hw/ppc/pnv_xscom.c
@@ -64,11 +64,21 @@ static uint64_t xscom_read_default(PnvChip *chip, uint32_t 
pcba)
 switch (pcba) {
 case 0xf000f:
 return PNV_CHIP_GET_CLASS(chip)->chip_cfam_id;
+case 0x18002:   /* ECID2 */
+return 0;
+
 case 0x1010c00: /* PIBAM FIR */
 case 0x1010c03: /* PIBAM FIR MASK */
-case 0x2020007: /* ADU stuff */
-case 0x2020009: /* ADU stuff */
-case 0x202000f: /* ADU stuff */
+
+/* P9 xscom reset */
+case 0x0090018: /* Receive status reg */
+case 0x0090012: /* log register */
+case 0x0090013: /* error register */
+
+/* P8 xscom reset */
+case 0x2020007: /* ADU stuff, log register */
+case 0x2020009: /* ADU stuff, error register */
+case 0x202000f: /* ADU stuff, receive status register*/
 return 0;
 case 0x2013f00: /* PBA stuff */
 case 0x2013f01: /* PBA stuff */
@@ -100,9 +110,20 @@ static bool xscom_write_default(PnvChip *chip, uint32_t 
pcba, uint64_t val)
 case 0x1010c03: /* PIBAM FIR MASK */
 case 0x1010c04: /* PIBAM FIR MASK */
 case 0x1010c05: /* PIBAM FIR MASK */
-case 0x2020007: /* ADU stuff */
-case 0x2020009: /* ADU stuff */
-case 0x202000f: /* ADU stuff */
+/* P9 xscom reset */
+case 0x0090018: /* Receive status reg */
+case 0x0090012: /* log register */
+case 0x0090013: /* error register */
+
+/* P8 xscom reset */
+case 0x2020007: /* ADU stuff, log register */
+case 0x2020009: /* ADU stuff, error register */
+case 0x202000f: /* ADU stuff, receive status register*/
+
+case 0x2013028: /* CAPP stuff */
+case 0x201302a: /* CAPP stuff */
+case 0x2013801: /* CAPP stuff */
+case 0x2013802: /* CAPP stuff */
 return true;
 default:
 return false;
-- 
2.20.1

[Qemu-devel] [PATCH v5 00/10] Misc fixes to pvrdma device

2019-03-10 Thread Yuval Shaia

Hi,
Please review the following patch set, which consists of cosmetic fixes to
the device's user interface (traces, error_report and monitor) and some bug
fixes.

Thanks Markus, Eric, Marcel and David for your review.

Review is needed for patch #4 - "hw/pvrdma: Collect debugging statistics"

v0 -> v1:
* Explain why device attributes are exposed only in HMP interface.
* Squash the 3 patches related to HMP interface into one.
* Make monitor dump function simple.
* Make HMP interface available only if pvrdma is included (detected by
  build robot).
* Remove patch 03/10 ("Warn when too many consecutive poll CQ triggered
  on an empty CQ") and add the two counters to patch 0/7 (monitor).
* Add Marcel's R-Bs.
* Add mutex protection to cqe_ctx list.
* Add two new patches.

v1 -> v2:
* Rename locked-lists to protected-lists in patch 2 and patch 6.
* Add Marcel's R-Bs.

v2 -> v3:
* Address some 32 bit host compilation issues.

v3 -> v4:
* Per suggestion from Markus rebase HMP report on
  object_child_foreach_recursive().
* Strip off David's r-b from HMP patch (#4) because of the above.

v4 -> v5:
* Accept comments from Markus and David
* Rename RDMA device interface
* Split patch #4 into 2 patches
* Add some more counters
* Add Markus's and David's Acked-by to the HMP patch

Yuval Shaia (10):
  hw/rdma: Switch to generic error reporting way
  hw/rdma: Introduce protected qlist
  hw/rdma: Protect against concurrent execution of poll_cq
  hw/pvrdma: Collect debugging statistics
  {hmp, hw/pvrdma}: Expose device internals via monitor interface
  hw/rdma: Free all MAD receive buffers when device is closed
  hw/rdma: Free all receive buffers when QP is destroyed
  hw/pvrdma: Delete unneeded function argument
  hw/pvrdma: Delete pvrdma_exit function
  hw/pvrdma: Unregister from shutdown notifier when device goes down

 hmp-commands-info.hx  |  14 +
 hmp.c |  27 ++
 hmp.h |   1 +
 hw/rdma/Makefile.objs |   2 +-
 hw/rdma/rdma.c|  30 +++
 hw/rdma/rdma_backend.c| 483 +-
 hw/rdma/rdma_backend.h|   3 +-
 hw/rdma/rdma_backend_defs.h   |  10 +-
 hw/rdma/rdma_rm.c | 193 --
 hw/rdma/rdma_rm.h |   7 +-
 hw/rdma/rdma_rm_defs.h|  28 +-
 hw/rdma/rdma_utils.c  |  83 +-
 hw/rdma/rdma_utils.h  |  61 ++---
 hw/rdma/trace-events  |  32 ++-
 hw/rdma/vmw/pvrdma.h  |  12 +-
 hw/rdma/vmw/pvrdma_cmd.c  | 115 +++-
 hw/rdma/vmw/pvrdma_dev_ring.c |  26 +-
 hw/rdma/vmw/pvrdma_main.c | 168 ++--
 hw/rdma/vmw/pvrdma_qp_ops.c   |  52 +---
 hw/rdma/vmw/trace-events  |  16 +-
 include/hw/rdma/rdma.h|  40 +++
 21 files changed, 796 insertions(+), 607 deletions(-)
 create mode 100644 hw/rdma/rdma.c
 create mode 100644 include/hw/rdma/rdma.h

-- 
2.17.2

[Qemu-devel] [PULL 52/60] ppc/pnv: extend XSCOM core support for POWER9

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

Provide a new class attribute to define XSCOM operations per CPU
family and add a couple of XSCOM addresses controlling the power
management states of the core on POWER9.
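
As an illustration of the per-family class attribute described above, here is a minimal sketch in plain C. It does not use QOM: the CoreXscomOps and CoreClass types and the is_modelled() hook are hypothetical names chosen for this example. Only the two special wake-up offsets are taken from the patch, and the DTS registers are left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets taken from the patch (POWER9 special wake-up registers). */
#define P9_SPECIAL_WKUP_HYP 0xf010d
#define P9_SPECIAL_WKUP_OTR 0xf010a

/* Hypothetical per-family operations table. */
typedef struct CoreXscomOps {
    const char *family;
    bool (*is_modelled)(uint32_t offset);
} CoreXscomOps;

/* Hypothetical class structure: each CPU family points at its own ops. */
typedef struct CoreClass {
    const CoreXscomOps *xscom_ops;
} CoreClass;

static bool p8_is_modelled(uint32_t offset)
{
    (void)offset;           /* DTS registers omitted in this sketch */
    return false;
}

static bool p9_is_modelled(uint32_t offset)
{
    switch (offset) {
    case P9_SPECIAL_WKUP_HYP:
    case P9_SPECIAL_WKUP_OTR:
        return true;
    default:
        return false;
    }
}

static const CoreXscomOps p8_ops = { "power8", p8_is_modelled };
static const CoreXscomOps p9_ops = { "power9", p9_is_modelled };

static const CoreClass p8_class = { &p8_ops };
static const CoreClass p9_class = { &p9_ops };

/* Common code dispatches through the class instead of a global ops table. */
static bool core_access_is_modelled(const CoreClass *cc, uint32_t offset)
{
    assert(cc != NULL && cc->xscom_ops != NULL);
    return cc->xscom_ops->is_modelled(offset);
}
```

The design point is that the realize path no longer names a single ops structure; it reads whichever one the class was initialised with.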

Signed-off-by: Cédric Le Goater 
Message-Id: <20190307223548.20516-11-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv_core.c | 100 +-
 include/hw/ppc/pnv_core.h |   2 +
 2 files changed, 89 insertions(+), 13 deletions(-)

diff --git a/hw/ppc/pnv_core.c b/hw/ppc/pnv_core.c
index 38179cdc53..171474e080 100644
--- a/hw/ppc/pnv_core.c
+++ b/hw/ppc/pnv_core.c
@@ -60,8 +60,8 @@ static void pnv_cpu_reset(void *opaque)
 #define PNV_XSCOM_EX_DTS_RESULT0 0x5
 #define PNV_XSCOM_EX_DTS_RESULT1 0x50001
 
-static uint64_t pnv_core_xscom_read(void *opaque, hwaddr addr,
-unsigned int width)
+static uint64_t pnv_core_power8_xscom_read(void *opaque, hwaddr addr,
+   unsigned int width)
 {
 uint32_t offset = addr >> 3;
 uint64_t val = 0;
@@ -82,16 +82,74 @@ static uint64_t pnv_core_xscom_read(void *opaque, hwaddr 
addr,
 return val;
 }
 
-static void pnv_core_xscom_write(void *opaque, hwaddr addr, uint64_t val,
- unsigned int width)
+static void pnv_core_power8_xscom_write(void *opaque, hwaddr addr, uint64_t 
val,
+unsigned int width)
 {
 qemu_log_mask(LOG_UNIMP, "Warning: writing to reg=0x%" HWADDR_PRIx "\n",
   addr);
 }
 
-static const MemoryRegionOps pnv_core_xscom_ops = {
-.read = pnv_core_xscom_read,
-.write = pnv_core_xscom_write,
+static const MemoryRegionOps pnv_core_power8_xscom_ops = {
+.read = pnv_core_power8_xscom_read,
+.write = pnv_core_power8_xscom_write,
+.valid.min_access_size = 8,
+.valid.max_access_size = 8,
+.impl.min_access_size = 8,
+.impl.max_access_size = 8,
+.endianness = DEVICE_BIG_ENDIAN,
+};
+
+
+/*
+ * POWER9 core controls
+ */
+#define PNV9_XSCOM_EC_PPM_SPECIAL_WKUP_HYP 0xf010d
+#define PNV9_XSCOM_EC_PPM_SPECIAL_WKUP_OTR 0xf010a
+
+static uint64_t pnv_core_power9_xscom_read(void *opaque, hwaddr addr,
+   unsigned int width)
+{
+uint32_t offset = addr >> 3;
+uint64_t val = 0;
+
+/* The result should be 38 C */
+switch (offset) {
+case PNV_XSCOM_EX_DTS_RESULT0:
+val = 0x26f024f023full;
+break;
+case PNV_XSCOM_EX_DTS_RESULT1:
+val = 0x24full;
+break;
+case PNV9_XSCOM_EC_PPM_SPECIAL_WKUP_HYP:
+case PNV9_XSCOM_EC_PPM_SPECIAL_WKUP_OTR:
+val = 0x0;
+break;
+default:
+qemu_log_mask(LOG_UNIMP, "Warning: reading reg=0x%" HWADDR_PRIx "\n",
+  addr);
+}
+
+return val;
+}
+
+static void pnv_core_power9_xscom_write(void *opaque, hwaddr addr, uint64_t 
val,
+unsigned int width)
+{
+uint32_t offset = addr >> 3;
+
+switch (offset) {
+case PNV9_XSCOM_EC_PPM_SPECIAL_WKUP_HYP:
+case PNV9_XSCOM_EC_PPM_SPECIAL_WKUP_OTR:
+break;
+default:
+qemu_log_mask(LOG_UNIMP, "Warning: writing to reg=0x%" HWADDR_PRIx 
"\n",
+  addr);
+}
+}
+
+static const MemoryRegionOps pnv_core_power9_xscom_ops = {
+.read = pnv_core_power9_xscom_read,
+.write = pnv_core_power9_xscom_write,
 .valid.min_access_size = 8,
 .valid.max_access_size = 8,
 .impl.min_access_size = 8,
@@ -138,6 +196,7 @@ static void pnv_realize_vcpu(PowerPCCPU *cpu, PnvChip 
*chip, Error **errp)
 static void pnv_core_realize(DeviceState *dev, Error **errp)
 {
 PnvCore *pc = PNV_CORE(OBJECT(dev));
+PnvCoreClass *pcc = PNV_CORE_GET_CLASS(pc);
 CPUCore *cc = CPU_CORE(OBJECT(dev));
 const char *typename = pnv_core_cpu_typename(pc);
 Error *local_err = NULL;
@@ -180,7 +239,7 @@ static void pnv_core_realize(DeviceState *dev, Error **errp)
 }
 
 snprintf(name, sizeof(name), "xscom-core.%d", cc->core_id);
-pnv_xscom_region_init(&pc->xscom_regs, OBJECT(dev), &pnv_core_xscom_ops,
+pnv_xscom_region_init(&pc->xscom_regs, OBJECT(dev), pcc->xscom_ops,
   pc, name, PNV_XSCOM_EX_SIZE);
 return;
 
@@ -222,6 +281,20 @@ static Property pnv_core_properties[] = {
 DEFINE_PROP_END_OF_LIST(),
 };
 
+static void pnv_core_power8_class_init(ObjectClass *oc, void *data)
+{
+PnvCoreClass *pcc = PNV_CORE_CLASS(oc);
+
+pcc->xscom_ops = &pnv_core_power8_xscom_ops;
+}
+
+static void pnv_core_power9_class_init(ObjectClass *oc, void *data)
+{
+PnvCoreClass *pcc = PNV_CORE_CLASS(oc);
+
+pcc->xscom_ops = &pnv_core_power9_xscom_ops;
+}
+
 static void pnv_core_class_init(ObjectClass *oc, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(oc);
@@ -231,10 +304,11 @@ static void pnv_core_class_init(ObjectClass *oc, void 
*data)
 dc->props = pnv_core_properties;
 }
 
-

[Qemu-devel] [PULL 53/60] ppc/pnv: POWER9 XSCOM quad support

2019-03-10 Thread David Gibson

From: Cédric Le Goater 

The POWER9 processor does not support per-core frequency control. The
cores are arranged in groups of four, along with their respective L2
and L3 caches, into a structure known as a Quad. The frequency must be
managed at the Quad level.

Provide a basic Quad model to fake the settings done by the firmware
on the Non-Cacheable Unit (NCU). Each core pair (EX) needs a special
BAR setting for the TIMA area of XIVE because it resides on the same
address on all chips.
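
The grouping of cores into Quads can be sketched with a little arithmetic. This is a simplified model under the assumption that cores are laid out contiguously, four per Quad; quad_count() and quad_first_core_index() are hypothetical helper names, not functions from the patch.

```c
#include <assert.h>

#define CORES_PER_QUAD 4

/* Number of Quads needed to hold nr_cores cores, rounding up. */
static int quad_count(int nr_cores)
{
    assert(nr_cores >= 0);
    return (nr_cores + CORES_PER_QUAD - 1) / CORES_PER_QUAD;
}

/* Index of the first core that belongs to a given Quad. */
static int quad_first_core_index(int quad)
{
    assert(quad >= 0);
    return quad * CORES_PER_QUAD;
}
```

For instance, a chip with 5 cores needs 2 Quads, and the second Quad starts at core index 4.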

Signed-off-by: Cédric Le Goater 
Message-Id: <20190307223548.20516-12-...@kaod.org>
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c   | 38 -
 hw/ppc/pnv_core.c  | 87 ++
 include/hw/ppc/pnv.h   |  4 ++
 include/hw/ppc/pnv_core.h  | 10 +
 include/hw/ppc/pnv_xscom.h | 12 --
 5 files changed, 146 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 1559a73323..e68d419203 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -963,6 +963,36 @@ static void pnv_chip_power9_instance_init(Object *obj)
OBJECT(&chip9->psi), &error_abort);
 }
 
+static void pnv_chip_quad_realize(Pnv9Chip *chip9, Error **errp)
+{
+PnvChip *chip = PNV_CHIP(chip9);
+const char *typename = pnv_chip_core_typename(chip);
+size_t typesize = object_type_get_instance_size(typename);
+int i;
+
+chip9->nr_quads = DIV_ROUND_UP(chip->nr_cores, 4);
+chip9->quads = g_new0(PnvQuad, chip9->nr_quads);
+
+for (i = 0; i < chip9->nr_quads; i++) {
+char eq_name[32];
+PnvQuad *eq = &chip9->quads[i];
+PnvCore *pnv_core = PNV_CORE(chip->cores + (i * 4) * typesize);
+int core_id = CPU_CORE(pnv_core)->core_id;
+
+object_initialize(eq, sizeof(*eq), TYPE_PNV_QUAD);
+snprintf(eq_name, sizeof(eq_name), "eq[%d]", core_id);
+
+object_property_add_child(OBJECT(chip), eq_name, OBJECT(eq),
+  &error_fatal);
+object_property_set_int(OBJECT(eq), core_id, "id", &error_fatal);
+object_property_set_bool(OBJECT(eq), true, "realized", &error_fatal);
+object_unref(OBJECT(eq));
+
+pnv_xscom_add_subregion(chip, PNV9_XSCOM_EQ_BASE(eq->id),
+&eq->xscom_regs);
+}
+}
+
 static void pnv_chip_power9_realize(DeviceState *dev, Error **errp)
 {
 PnvChipClass *pcc = PNV_CHIP_GET_CLASS(dev);
@@ -977,6 +1007,12 @@ static void pnv_chip_power9_realize(DeviceState *dev, 
Error **errp)
 return;
 }
 
+pnv_chip_quad_realize(chip9, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+
 /* XIVE interrupt controller (POWER9) */
 object_property_set_int(OBJECT(&chip9->xive), PNV9_XIVE_IC_BASE(chip),
 "ic-bar", &error_fatal);
@@ -1135,7 +1171,7 @@ static void pnv_chip_core_realize(PnvChip *chip, Error 
**errp)
 if (!pnv_chip_is_power9(chip)) {
 xscom_core_base = PNV_XSCOM_EX_BASE(core_hwid);
 } else {
-xscom_core_base = PNV_XSCOM_P9_EC_BASE(core_hwid);
+xscom_core_base = PNV9_XSCOM_EC_BASE(core_hwid);
 }
 
 pnv_xscom_add_subregion(chip, xscom_core_base,
diff --git a/hw/ppc/pnv_core.c b/hw/ppc/pnv_core.c
index 171474e080..5feeed6bc4 100644
--- a/hw/ppc/pnv_core.c
+++ b/hw/ppc/pnv_core.c
@@ -327,3 +327,90 @@ static const TypeInfo pnv_core_infos[] = {
 };
 
 DEFINE_TYPES(pnv_core_infos)
+
+/*
+ * POWER9 Quads
+ */
+
+#define P9X_EX_NCU_SPEC_BAR 0x11010
+
+static uint64_t pnv_quad_xscom_read(void *opaque, hwaddr addr,
+unsigned int width)
+{
+uint32_t offset = addr >> 3;
+uint64_t val = -1;
+
+switch (offset) {
+case P9X_EX_NCU_SPEC_BAR:
+case P9X_EX_NCU_SPEC_BAR + 0x400: /* Second EX */
+val = 0;
+break;
+default:
+qemu_log_mask(LOG_UNIMP, "%s: reading @0x%08x\n", __func__,
+  offset);
+}
+
+return val;
+}
+
+static void pnv_quad_xscom_write(void *opaque, hwaddr addr, uint64_t val,
+ unsigned int width)
+{
+uint32_t offset = addr >> 3;
+
+switch (offset) {
+case P9X_EX_NCU_SPEC_BAR:
+case P9X_EX_NCU_SPEC_BAR + 0x400: /* Second EX */
+break;
+default:
+qemu_log_mask(LOG_UNIMP, "%s: writing @0x%08x\n", __func__,
+  offset);
+}
+}
+
+static const MemoryRegionOps pnv_quad_xscom_ops = {
+.read = pnv_quad_xscom_read,
+.write = pnv_quad_xscom_write,
+.valid.min_access_size = 8,
+.valid.max_access_size = 8,
+.impl.min_access_size = 8,
+.impl.max_access_size = 8,
+.endianness = DEVICE_BIG_ENDIAN,
+};
+
+static void pnv_quad_realize(DeviceState *dev, Error **errp)
+{
+PnvQuad *eq = PNV_QUAD(dev);
+char name[32];
+
+snprintf(name, sizeof(name), "xscom-quad.%d", eq->id);
+pnv_xscom_region_

[Qemu-devel] [PATCH v5 02/10] hw/rdma: Introduce protected qlist

2019-03-10 Thread Yuval Shaia

To make the code more readable, move the handling of the protected list
to rdma_utils.
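
The idea of a protected list can be sketched outside QEMU as follows. This is a minimal model, assuming POSIX threads and a hand-rolled singly linked list in place of QList and QemuMutex; all names (ProtectedList, protected_list_*) are hypothetical. As in the patch, an empty list is reported by returning -ENOENT from pop.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct Node {
    int64_t value;
    struct Node *next;
} Node;

/* A FIFO list whose every access is guarded by its own mutex. */
typedef struct ProtectedList {
    pthread_mutex_t lock;
    Node *head;
    Node *tail;
} ProtectedList;

static void protected_list_init(ProtectedList *l)
{
    assert(l != NULL);
    pthread_mutex_init(&l->lock, NULL);
    l->head = l->tail = NULL;
}

/* Append at the tail; returns 0 on success or -ENOMEM. */
static int protected_list_append(ProtectedList *l, int64_t value)
{
    Node *n = malloc(sizeof(*n));
    if (!n) {
        return -ENOMEM;
    }
    n->value = value;
    n->next = NULL;

    pthread_mutex_lock(&l->lock);
    if (l->tail) {
        l->tail->next = n;
    } else {
        l->head = n;
    }
    l->tail = n;
    pthread_mutex_unlock(&l->lock);
    return 0;
}

/* Pop from the head; returns -ENOENT when the list is empty. */
static int64_t protected_list_pop(ProtectedList *l)
{
    int64_t value = -ENOENT;
    Node *n;

    pthread_mutex_lock(&l->lock);
    n = l->head;
    if (n) {
        l->head = n->next;
        if (!l->head) {
            l->tail = NULL;
        }
        value = n->value;
    }
    pthread_mutex_unlock(&l->lock);

    free(n);    /* free(NULL) is a no-op */
    return value;
}

/* Release remaining nodes, then the mutex. No other users may remain. */
static void protected_list_destroy(ProtectedList *l)
{
    while (l->head) {
        protected_list_pop(l);
    }
    pthread_mutex_destroy(&l->lock);
}
```

Callers then never touch the lock directly, which is what lets the call sites shrink to a single line each. One caveat of the -ENOENT convention is that a stored value equal to -ENOENT cannot be told apart from an empty list.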

Signed-off-by: Yuval Shaia 
Reviewed-by: Marcel Apfelbaum 
---
 hw/rdma/rdma_backend.c  | 20 +--
 hw/rdma/rdma_backend_defs.h |  8 ++--
 hw/rdma/rdma_utils.c| 39 +
 hw/rdma/rdma_utils.h|  9 +
 4 files changed, 55 insertions(+), 21 deletions(-)

diff --git a/hw/rdma/rdma_backend.c b/hw/rdma/rdma_backend.c
index 24bac00a20..0ed14751be 100644
--- a/hw/rdma/rdma_backend.c
+++ b/hw/rdma/rdma_backend.c
@@ -527,9 +527,7 @@ static unsigned int save_mad_recv_buffer(RdmaBackendDev 
*backend_dev,
 bctx->up_ctx = ctx;
 bctx->sge = *sge;
 
-qemu_mutex_lock(&backend_dev->recv_mads_list.lock);
-qlist_append_int(backend_dev->recv_mads_list.list, bctx_id);
-qemu_mutex_unlock(&backend_dev->recv_mads_list.lock);
+rdma_protected_qlist_append_int64(&backend_dev->recv_mads_list, bctx_id);
 
 return 0;
 }
@@ -913,23 +911,19 @@ static inline void build_mad_hdr(struct ibv_grh *grh, 
union ibv_gid *sgid,
 static void process_incoming_mad_req(RdmaBackendDev *backend_dev,
  RdmaCmMuxMsg *msg)
 {
-QObject *o_ctx_id;
 unsigned long cqe_ctx_id;
 BackendCtx *bctx;
 char *mad;
 
 trace_mad_message("recv", msg->umad.mad, msg->umad_len);
 
-qemu_mutex_lock(&backend_dev->recv_mads_list.lock);
-o_ctx_id = qlist_pop(backend_dev->recv_mads_list.list);
-qemu_mutex_unlock(&backend_dev->recv_mads_list.lock);
-if (!o_ctx_id) {
+cqe_ctx_id = rdma_protected_qlist_pop_int64(&backend_dev->recv_mads_list);
+if (cqe_ctx_id == -ENOENT) {
 rdma_warn_report("No more free MADs buffers, waiting for a while");
 sleep(THR_POLL_TO);
 return;
 }
 
-cqe_ctx_id = qnum_get_uint(qobject_to(QNum, o_ctx_id));
 bctx = rdma_rm_get_cqe_ctx(backend_dev->rdma_dev_res, cqe_ctx_id);
 if (unlikely(!bctx)) {
 rdma_error_report("No matching ctx for req %ld", cqe_ctx_id);
@@ -994,8 +988,7 @@ static int mad_init(RdmaBackendDev *backend_dev, 
CharBackend *mad_chr_be)
 return -EIO;
 }
 
-qemu_mutex_init(&backend_dev->recv_mads_list.lock);
-backend_dev->recv_mads_list.list = qlist_new();
+rdma_protected_qlist_init(&backend_dev->recv_mads_list);
 
 enable_rdmacm_mux_async(backend_dev);
 
@@ -1010,10 +1003,7 @@ static void mad_fini(RdmaBackendDev *backend_dev)
 {
 disable_rdmacm_mux_async(backend_dev);
 qemu_chr_fe_disconnect(backend_dev->rdmacm_mux.chr_be);
-if (backend_dev->recv_mads_list.list) {
-qlist_destroy_obj(QOBJECT(backend_dev->recv_mads_list.list));
-qemu_mutex_destroy(&backend_dev->recv_mads_list.lock);
-}
+rdma_protected_qlist_destroy(&backend_dev->recv_mads_list);
 }
 
 int rdma_backend_get_gid_index(RdmaBackendDev *backend_dev,
diff --git a/hw/rdma/rdma_backend_defs.h b/hw/rdma/rdma_backend_defs.h
index 15ae8b970e..a8c15b09ab 100644
--- a/hw/rdma/rdma_backend_defs.h
+++ b/hw/rdma/rdma_backend_defs.h
@@ -20,6 +20,7 @@
 #include "chardev/char-fe.h"
 #include 
 #include "contrib/rdmacm-mux/rdmacm-mux.h"
+#include "rdma_utils.h"
 
 typedef struct RdmaDeviceResources RdmaDeviceResources;
 
@@ -30,11 +31,6 @@ typedef struct RdmaBackendThread {
 bool is_running; /* Set by the thread to report its status */
 } RdmaBackendThread;
 
-typedef struct RecvMadList {
-QemuMutex lock;
-QList *list;
-} RecvMadList;
-
 typedef struct RdmaCmMux {
 CharBackend *chr_be;
 int can_receive;
@@ -48,7 +44,7 @@ typedef struct RdmaBackendDev {
 struct ibv_context *context;
 struct ibv_comp_channel *channel;
 uint8_t port_num;
-RecvMadList recv_mads_list;
+RdmaProtectedQList recv_mads_list;
 RdmaCmMux rdmacm_mux;
 } RdmaBackendDev;
 
diff --git a/hw/rdma/rdma_utils.c b/hw/rdma/rdma_utils.c
index b9f07fcda7..0a8abe572d 100644
--- a/hw/rdma/rdma_utils.c
+++ b/hw/rdma/rdma_utils.c
@@ -14,6 +14,8 @@
  */
 
 #include "qemu/osdep.h"
+#include "qapi/qmp/qlist.h"
+#include "qapi/qmp/qnum.h"
 #include "trace.h"
 #include "rdma_utils.h"
 
@@ -51,3 +53,40 @@ void rdma_pci_dma_unmap(PCIDevice *dev, void *buffer, 
dma_addr_t len)
 pci_dma_unmap(dev, buffer, len, DMA_DIRECTION_TO_DEVICE, 0);
 }
 }
+
+void rdma_protected_qlist_init(RdmaProtectedQList *list)
+{
+qemu_mutex_init(&list->lock);
+list->list = qlist_new();
+}
+
+void rdma_protected_qlist_destroy(RdmaProtectedQList *list)
+{
+if (list->list) {
+qlist_destroy_obj(QOBJECT(list->list));
+qemu_mutex_destroy(&list->lock);
+list->list = NULL;
+}
+}
+
+void rdma_protected_qlist_append_int64(RdmaProtectedQList *list, int64_t value)
+{
+qemu_mutex_lock(&list->lock);
+qlist_append_int(list->list, value);
+qemu_mutex_unlock(&list->lock);
+}
+
+int64_t rdma_protected_qlist_pop_int64(RdmaProtectedQList *list)
+{
+QObject *obj;
+
+qemu_mutex_lock(&list->lock);
+obj = qlist_pop(

[Qemu-devel] [PATCH v5 04/10] hw/pvrdma: Collect debugging statistics

2019-03-10 Thread Yuval Shaia

Add counters to enable enhanced debugging.

Signed-off-by: Yuval Shaia 
---
 hw/rdma/rdma_backend.c| 70 +--
 hw/rdma/rdma_rm.c |  7 
 hw/rdma/rdma_rm_defs.h| 27 ++-
 hw/rdma/vmw/pvrdma.h  | 10 ++
 hw/rdma/vmw/pvrdma_cmd.c  |  2 ++
 hw/rdma/vmw/pvrdma_main.c |  8 +
 6 files changed, 106 insertions(+), 18 deletions(-)

diff --git a/hw/rdma/rdma_backend.c b/hw/rdma/rdma_backend.c
index 9679b842d1..bc2fefcf93 100644
--- a/hw/rdma/rdma_backend.c
+++ b/hw/rdma/rdma_backend.c
@@ -64,9 +64,9 @@ static inline void complete_work(enum ibv_wc_status status, 
uint32_t vendor_err,
 comp_handler(ctx, &wc);
 }
 
-static void rdma_poll_cq(RdmaDeviceResources *rdma_dev_res, struct ibv_cq 
*ibcq)
+static int rdma_poll_cq(RdmaDeviceResources *rdma_dev_res, struct ibv_cq *ibcq)
 {
-int i, ne;
+int i, ne, total_ne = 0;
 BackendCtx *bctx;
 struct ibv_wc wc[2];
 
@@ -89,12 +89,18 @@ static void rdma_poll_cq(RdmaDeviceResources *rdma_dev_res, 
struct ibv_cq *ibcq)
 rdma_rm_dealloc_cqe_ctx(rdma_dev_res, wc[i].wr_id);
 g_free(bctx);
 }
+total_ne += ne;
 } while (ne > 0);
+atomic_sub(&rdma_dev_res->stats.missing_cqe, total_ne);
 qemu_mutex_unlock(&rdma_dev_res->lock);
 
 if (ne < 0) {
 rdma_error_report("ibv_poll_cq fail, rc=%d, errno=%d", ne, errno);
 }
+
+rdma_dev_res->stats.completions += total_ne;
+
+return total_ne;
 }
 
 static void *comp_handler_thread(void *arg)
@@ -122,6 +128,9 @@ static void *comp_handler_thread(void *arg)
 while (backend_dev->comp_thread.run) {
 do {
 rc = qemu_poll_ns(pfds, 1, THR_POLL_TO * (int64_t)SCALE_MS);
+if (!rc) {
+backend_dev->rdma_dev_res->stats.poll_cq_ppoll_to++;
+}
 } while (!rc && backend_dev->comp_thread.run);
 
 if (backend_dev->comp_thread.run) {
@@ -138,6 +147,7 @@ static void *comp_handler_thread(void *arg)
   errno);
 }
 
+backend_dev->rdma_dev_res->stats.poll_cq_from_bk++;
 rdma_poll_cq(backend_dev->rdma_dev_res, ev_cq);
 
 ibv_ack_cq_events(ev_cq, 1);
@@ -271,7 +281,13 @@ int rdma_backend_query_port(RdmaBackendDev *backend_dev,
 
 void rdma_backend_poll_cq(RdmaDeviceResources *rdma_dev_res, RdmaBackendCQ *cq)
 {
-rdma_poll_cq(rdma_dev_res, cq->ibcq);
+int polled;
+
+rdma_dev_res->stats.poll_cq_from_guest++;
+polled = rdma_poll_cq(rdma_dev_res, cq->ibcq);
+if (!polled) {
+rdma_dev_res->stats.poll_cq_from_guest_empty++;
+}
 }
 
 static GHashTable *ah_hash;
@@ -333,7 +349,7 @@ static void ah_cache_init(void)
 
 static int build_host_sge_array(RdmaDeviceResources *rdma_dev_res,
 struct ibv_sge *dsge, struct ibv_sge *ssge,
-uint8_t num_sge)
+uint8_t num_sge, uint64_t *total_length)
 {
 RdmaRmMR *mr;
 int ssge_idx;
@@ -349,6 +365,8 @@ static int build_host_sge_array(RdmaDeviceResources 
*rdma_dev_res,
 dsge->length = ssge[ssge_idx].length;
 dsge->lkey = rdma_backend_mr_lkey(&mr->backend_mr);
 
+*total_length += dsge->length;
+
 dsge++;
 }
 
@@ -445,8 +463,10 @@ void rdma_backend_post_send(RdmaBackendDev *backend_dev,
 rc = mad_send(backend_dev, sgid_idx, sgid, sge, num_sge);
 if (rc) {
 complete_work(IBV_WC_GENERAL_ERR, VENDOR_ERR_MAD_SEND, ctx);
+backend_dev->rdma_dev_res->stats.mad_tx_err++;
 } else {
 complete_work(IBV_WC_SUCCESS, 0, ctx);
+backend_dev->rdma_dev_res->stats.mad_tx++;
 }
 }
 return;
@@ -458,20 +478,21 @@ void rdma_backend_post_send(RdmaBackendDev *backend_dev,
 rc = rdma_rm_alloc_cqe_ctx(backend_dev->rdma_dev_res, &bctx_id, bctx);
 if (unlikely(rc)) {
 complete_work(IBV_WC_GENERAL_ERR, VENDOR_ERR_NOMEM, ctx);
-goto out_free_bctx;
+goto err_free_bctx;
 }
 
-rc = build_host_sge_array(backend_dev->rdma_dev_res, new_sge, sge, 
num_sge);
+rc = build_host_sge_array(backend_dev->rdma_dev_res, new_sge, sge, num_sge,
+  &backend_dev->rdma_dev_res->stats.tx_len);
 if (rc) {
 complete_work(IBV_WC_GENERAL_ERR, rc, ctx);
-goto out_dealloc_cqe_ctx;
+goto err_dealloc_cqe_ctx;
 }
 
 if (qp_type == IBV_QPT_UD) {
 wr.wr.ud.ah = create_ah(backend_dev, qp->ibpd, sgid_idx, dgid);
 if (!wr.wr.ud.ah) {
 complete_work(IBV_WC_GENERAL_ERR, VENDOR_ERR_FAIL_BACKEND, ctx);
-goto out_dealloc_cqe_ctx;
+goto err_dealloc_cqe_ctx;
 }
 wr.wr.ud.remote_qpn = dqpn;
 wr.wr.ud.remote_qkey = dqkey;
@@ -488,15 +509,19 @@ void rdma_backend_post_send(RdmaBackendDev *backend_dev,
 rdma_error_report("ibv_p

[Qemu-devel] [PATCH v5 03/10] hw/rdma: Protect against concurrent execution of poll_cq

2019-03-10 Thread Yuval Shaia

The function rdma_poll_cq is called from two contexts: from the completion
handler thread, which senses new completions on the backend channel, and
explicitly as a result of the guest issuing a poll_cq command.

Add a lock to protect against concurrent executions.

Signed-off-by: Yuval Shaia 
Reviewed-by: Marcel Apfelbaum 
---
 hw/rdma/rdma_backend.c | 2 ++
 hw/rdma/rdma_rm.c  | 4 
 hw/rdma/rdma_rm_defs.h | 1 +
 3 files changed, 7 insertions(+)

diff --git a/hw/rdma/rdma_backend.c b/hw/rdma/rdma_backend.c
index 0ed14751be..9679b842d1 100644
--- a/hw/rdma/rdma_backend.c
+++ b/hw/rdma/rdma_backend.c
@@ -70,6 +70,7 @@ static void rdma_poll_cq(RdmaDeviceResources *rdma_dev_res, 
struct ibv_cq *ibcq)
 BackendCtx *bctx;
 struct ibv_wc wc[2];
 
+qemu_mutex_lock(&rdma_dev_res->lock);
 do {
 ne = ibv_poll_cq(ibcq, ARRAY_SIZE(wc), wc);
 
@@ -89,6 +90,7 @@ static void rdma_poll_cq(RdmaDeviceResources *rdma_dev_res, 
struct ibv_cq *ibcq)
 g_free(bctx);
 }
 } while (ne > 0);
+qemu_mutex_unlock(&rdma_dev_res->lock);
 
 if (ne < 0) {
 rdma_error_report("ibv_poll_cq fail, rc=%d, errno=%d", ne, errno);
diff --git a/hw/rdma/rdma_rm.c b/hw/rdma/rdma_rm.c
index 5dab4a2189..14580ca379 100644
--- a/hw/rdma/rdma_rm.c
+++ b/hw/rdma/rdma_rm.c
@@ -618,12 +618,16 @@ int rdma_rm_init(RdmaDeviceResources *dev_res, struct 
ibv_device_attr *dev_attr,
 
 init_ports(dev_res);
 
+qemu_mutex_init(&dev_res->lock);
+
 return 0;
 }
 
 void rdma_rm_fini(RdmaDeviceResources *dev_res, RdmaBackendDev *backend_dev,
   const char *ifname)
 {
+qemu_mutex_destroy(&dev_res->lock);
+
 fini_ports(dev_res, backend_dev, ifname);
 
 res_tbl_free(&dev_res->uc_tbl);
diff --git a/hw/rdma/rdma_rm_defs.h b/hw/rdma/rdma_rm_defs.h
index 0ba61d1838..f0ee1f3072 100644
--- a/hw/rdma/rdma_rm_defs.h
+++ b/hw/rdma/rdma_rm_defs.h
@@ -105,6 +105,7 @@ typedef struct RdmaDeviceResources {
 RdmaRmResTbl cq_tbl;
 RdmaRmResTbl cqe_ctx_tbl;
 GHashTable *qp_hash; /* Keeps mapping between real and emulated */
+QemuMutex lock;
 } RdmaDeviceResources;
 
 #endif
-- 
2.17.2
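
The locking pattern in the patch above can be sketched in isolation. This is a simplified model, assuming POSIX threads; FakeCq, DevRes and fake_poll() are hypothetical stand-ins for the verbs completion queue, the device resources and ibv_poll_cq(). The returned total is an addition for illustration, since the function in this patch returns void.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical completion queue: just a count of pending completions. */
typedef struct FakeCq {
    int pending;
} FakeCq;

/* Hypothetical device resources holding the lock shared by all pollers. */
typedef struct DevRes {
    pthread_mutex_t lock;
} DevRes;

static void dev_res_init(DevRes *res)
{
    assert(res != NULL);
    pthread_mutex_init(&res->lock, NULL);
}

/* Stand-in for the poll call: hand out at most `max` completions. */
static int fake_poll(FakeCq *cq, int max)
{
    int n = cq->pending < max ? cq->pending : max;
    cq->pending -= n;
    return n;
}

/*
 * Drain the queue in batches of two while holding the lock, so a caller
 * from the completion thread and a caller acting for the guest cannot
 * interleave their batches.
 */
static int poll_cq_locked(DevRes *res, FakeCq *cq)
{
    int ne, total = 0;

    assert(res != NULL && cq != NULL);
    pthread_mutex_lock(&res->lock);
    do {
        ne = fake_poll(cq, 2);
        total += ne;
    } while (ne > 0);
    pthread_mutex_unlock(&res->lock);

    return total;
}
```

Holding the lock across the whole drain loop, rather than per batch, mirrors where the patch places its lock and unlock calls.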

[Qemu-devel] [PATCH v5 08/10] hw/pvrdma: Delete unneeded function argument

2019-03-10 Thread Yuval Shaia

The function's argument rdma_dev_res is not needed as it is stored in
the backend_dev object at init.

Signed-off-by: Yuval Shaia 
Reviewed-by: Marcel Apfelbaum 
---
 hw/rdma/rdma_backend.c  | 13 ++---
 hw/rdma/rdma_backend.h  |  1 -
 hw/rdma/vmw/pvrdma_qp_ops.c |  3 +--
 3 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/hw/rdma/rdma_backend.c b/hw/rdma/rdma_backend.c
index d511ca282b..66185bd487 100644
--- a/hw/rdma/rdma_backend.c
+++ b/hw/rdma/rdma_backend.c
@@ -594,7 +594,6 @@ static unsigned int save_mad_recv_buffer(RdmaBackendDev 
*backend_dev,
 }
 
 void rdma_backend_post_recv(RdmaBackendDev *backend_dev,
-RdmaDeviceResources *rdma_dev_res,
 RdmaBackendQP *qp, uint8_t qp_type,
 struct ibv_sge *sge, uint32_t num_sge, void *ctx)
 {
@@ -613,9 +612,9 @@ void rdma_backend_post_recv(RdmaBackendDev *backend_dev,
 rc = save_mad_recv_buffer(backend_dev, sge, num_sge, ctx);
 if (rc) {
 complete_work(IBV_WC_GENERAL_ERR, rc, ctx);
-rdma_dev_res->stats.mad_rx_bufs_err++;
+backend_dev->rdma_dev_res->stats.mad_rx_bufs_err++;
 } else {
-rdma_dev_res->stats.mad_rx_bufs++;
+backend_dev->rdma_dev_res->stats.mad_rx_bufs++;
 }
 }
 return;
@@ -625,7 +624,7 @@ void rdma_backend_post_recv(RdmaBackendDev *backend_dev,
 bctx->up_ctx = ctx;
 bctx->backend_qp = qp;
 
-rc = rdma_rm_alloc_cqe_ctx(rdma_dev_res, &bctx_id, bctx);
+rc = rdma_rm_alloc_cqe_ctx(backend_dev->rdma_dev_res, &bctx_id, bctx);
 if (unlikely(rc)) {
 complete_work(IBV_WC_GENERAL_ERR, VENDOR_ERR_NOMEM, ctx);
 goto err_free_bctx;
@@ -633,7 +632,7 @@ void rdma_backend_post_recv(RdmaBackendDev *backend_dev,
 
 rdma_protected_gslist_append_int32(&qp->cqe_ctx_list, bctx_id);
 
-rc = build_host_sge_array(rdma_dev_res, new_sge, sge, num_sge,
+rc = build_host_sge_array(backend_dev->rdma_dev_res, new_sge, sge, num_sge,
   &backend_dev->rdma_dev_res->stats.rx_bufs_len);
 if (rc) {
 complete_work(IBV_WC_GENERAL_ERR, rc, ctx);
@@ -652,13 +651,13 @@ void rdma_backend_post_recv(RdmaBackendDev *backend_dev,
 }
 
 atomic_inc(&backend_dev->rdma_dev_res->stats.missing_cqe);
-rdma_dev_res->stats.rx_bufs++;
+backend_dev->rdma_dev_res->stats.rx_bufs++;
 
 return;
 
 err_dealloc_cqe_ctx:
 backend_dev->rdma_dev_res->stats.rx_bufs_err++;
-rdma_rm_dealloc_cqe_ctx(rdma_dev_res, bctx_id);
+rdma_rm_dealloc_cqe_ctx(backend_dev->rdma_dev_res, bctx_id);
 
 err_free_bctx:
 g_free(bctx);
diff --git a/hw/rdma/rdma_backend.h b/hw/rdma/rdma_backend.h
index cb5efa2a3a..5d507a1c41 100644
--- a/hw/rdma/rdma_backend.h
+++ b/hw/rdma/rdma_backend.h
@@ -111,7 +111,6 @@ void rdma_backend_post_send(RdmaBackendDev *backend_dev,
 union ibv_gid *dgid, uint32_t dqpn, uint32_t dqkey,
 void *ctx);
 void rdma_backend_post_recv(RdmaBackendDev *backend_dev,
-RdmaDeviceResources *rdma_dev_res,
 RdmaBackendQP *qp, uint8_t qp_type,
 struct ibv_sge *sge, uint32_t num_sge, void *ctx);
 
diff --git a/hw/rdma/vmw/pvrdma_qp_ops.c b/hw/rdma/vmw/pvrdma_qp_ops.c
index 16db726dac..508d8fca3c 100644
--- a/hw/rdma/vmw/pvrdma_qp_ops.c
+++ b/hw/rdma/vmw/pvrdma_qp_ops.c
@@ -231,8 +231,7 @@ void pvrdma_qp_recv(PVRDMADev *dev, uint32_t qp_handle)
 continue;
 }
 
-rdma_backend_post_recv(&dev->backend_dev, &dev->rdma_dev_res,
-   &qp->backend_qp, qp->qp_type,
+rdma_backend_post_recv(&dev->backend_dev, &qp->backend_qp, qp->qp_type,
(struct ibv_sge *)&wqe->sge[0], 
wqe->hdr.num_sge,
comp_ctx);
 
-- 
2.17.2

[Qemu-devel] [PATCH v5 01/10] hw/rdma: Switch to generic error reporting way

2019-03-10 Thread Yuval Shaia

Use error_report for all pr_err calls and for the pr_dbg calls that
actually report errors.
Of the remaining pr_dbg calls, the important ones were replaced by
trace points and the others were deleted.

Signed-off-by: Yuval Shaia 
Reviewed-by: Marcel Apfelbaum 
---
 hw/rdma/rdma_backend.c| 336 ++
 hw/rdma/rdma_rm.c | 127 ++---
 hw/rdma/rdma_rm.h |   6 +-
 hw/rdma/rdma_utils.c  |  15 +-
 hw/rdma/rdma_utils.h  |  45 +
 hw/rdma/trace-events  |  32 +++-
 hw/rdma/vmw/pvrdma.h  |   2 +-
 hw/rdma/vmw/pvrdma_cmd.c  | 113 +++-
 hw/rdma/vmw/pvrdma_dev_ring.c |  26 +--
 hw/rdma/vmw/pvrdma_main.c | 132 +
 hw/rdma/vmw/pvrdma_qp_ops.c   |  49 ++---
 hw/rdma/vmw/trace-events  |  16 +-
 12 files changed, 343 insertions(+), 556 deletions(-)

diff --git a/hw/rdma/rdma_backend.c b/hw/rdma/rdma_backend.c
index fd571f21e5..24bac00a20 100644
--- a/hw/rdma/rdma_backend.c
+++ b/hw/rdma/rdma_backend.c
@@ -14,7 +14,6 @@
  */
 
 #include "qemu/osdep.h"
-#include "qemu/error-report.h"
 #include "sysemu/sysemu.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qlist.h"
@@ -39,7 +38,6 @@
 
 typedef struct BackendCtx {
 void *up_ctx;
-bool is_tx_req;
 struct ibv_sge sge; /* Used to save MAD recv buffer */
 } BackendCtx;
 
@@ -52,7 +50,7 @@ static void (*comp_handler)(void *ctx, struct ibv_wc *wc);
 
 static void dummy_comp_handler(void *ctx, struct ibv_wc *wc)
 {
-pr_err("No completion handler is registered\n");
+rdma_error_report("No completion handler is registered");
 }
 
 static inline void complete_work(enum ibv_wc_status status, uint32_t 
vendor_err,
@@ -66,29 +64,24 @@ static inline void complete_work(enum ibv_wc_status status, 
uint32_t vendor_err,
 comp_handler(ctx, &wc);
 }
 
-static void poll_cq(RdmaDeviceResources *rdma_dev_res, struct ibv_cq *ibcq)
+static void rdma_poll_cq(RdmaDeviceResources *rdma_dev_res, struct ibv_cq 
*ibcq)
 {
 int i, ne;
 BackendCtx *bctx;
 struct ibv_wc wc[2];
 
-pr_dbg("Entering poll_cq loop on cq %p\n", ibcq);
 do {
 ne = ibv_poll_cq(ibcq, ARRAY_SIZE(wc), wc);
 
-pr_dbg("Got %d completion(s) from cq %p\n", ne, ibcq);
+trace_rdma_poll_cq(ne, ibcq);
 
 for (i = 0; i < ne; i++) {
-pr_dbg("wr_id=0x%" PRIx64 "\n", wc[i].wr_id);
-pr_dbg("status=%d\n", wc[i].status);
-
 bctx = rdma_rm_get_cqe_ctx(rdma_dev_res, wc[i].wr_id);
 if (unlikely(!bctx)) {
-pr_dbg("Error: Failed to find ctx for req %" PRId64 "\n",
-   wc[i].wr_id);
+rdma_error_report("No matching ctx for req %"PRId64,
+  wc[i].wr_id);
 continue;
 }
-pr_dbg("Processing %s CQE\n", bctx->is_tx_req ? "send" : "recv");
 
 comp_handler(bctx->up_ctx, &wc[i]);
 
@@ -98,7 +91,7 @@ static void poll_cq(RdmaDeviceResources *rdma_dev_res, struct 
ibv_cq *ibcq)
 } while (ne > 0);
 
 if (ne < 0) {
-pr_dbg("Got error %d from ibv_poll_cq\n", ne);
+rdma_error_report("ibv_poll_cq fail, rc=%d, errno=%d", ne, errno);
 }
 }
 
@@ -115,12 +108,10 @@ static void *comp_handler_thread(void *arg)
 flags = fcntl(backend_dev->channel->fd, F_GETFL);
 rc = fcntl(backend_dev->channel->fd, F_SETFL, flags | O_NONBLOCK);
 if (rc < 0) {
-pr_dbg("Fail to change to non-blocking mode\n");
+rdma_error_report("Failed to change backend channel FD to 
non-blocking");
 return NULL;
 }
 
-pr_dbg("Starting\n");
-
 pfds[0].fd = backend_dev->channel->fd;
 pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
 
@@ -132,27 +123,25 @@ static void *comp_handler_thread(void *arg)
 } while (!rc && backend_dev->comp_thread.run);
 
 if (backend_dev->comp_thread.run) {
-pr_dbg("Waiting for completion on channel %p\n", 
backend_dev->channel);
 rc = ibv_get_cq_event(backend_dev->channel, &ev_cq, &ev_ctx);
-pr_dbg("ibv_get_cq_event=%d\n", rc);
 if (unlikely(rc)) {
-pr_dbg("---> ibv_get_cq_event (%d)\n", rc);
+rdma_error_report("ibv_get_cq_event fail, rc=%d, errno=%d", rc,
+  errno);
 continue;
 }
 
 rc = ibv_req_notify_cq(ev_cq, 0);
 if (unlikely(rc)) {
-pr_dbg("Error %d from ibv_req_notify_cq\n", rc);
+rdma_error_report("ibv_req_notify_cq fail, rc=%d, errno=%d", 
rc,
+  errno);
 }
 
-poll_cq(backend_dev->rdma_dev_res, ev_cq);
+rdma_poll_cq(backend_dev->rdma_dev_res, ev_cq);
 
 ibv_ack_cq_events(ev_cq, 1);
 }
 }
 
-pr_dbg("Going down\n");
-
 /* TODO: Post cqe for all remaining buffs that were posted */
 
 backend_dev->comp_thread.is_running = fa

[Qemu-devel] [PATCH v5 07/10] hw/rdma: Free all receive buffers when QP is destroyed

2019-03-10 Thread Yuval Shaia

When a QP is destroyed, the backend QP is destroyed as well. This ensures
that all receive buffers we posted to it are cleaned.
However, the contexts of these buffers still remain in the device.
Fix it by maintaining a list of the buffers' contexts and freeing them when
the QP is destroyed.
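
The bookkeeping described above can be sketched in isolation roughly as
follows. This is only an illustration under stated assumptions, not the
code from this patch: ProtectedIdList, id_list_add, id_list_remove and
id_list_destroy are hypothetical names, and a plain pthread mutex with a
singly linked list stands in for whatever the real helpers use. An id is
recorded when a buffer is posted, dropped when its completion arrives,
and whatever is still listed is released when the QP goes away.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-QP list of outstanding context ids. */
typedef struct IdNode {
    int32_t id;
    struct IdNode *next;
} IdNode;

typedef struct ProtectedIdList {
    pthread_mutex_t lock;
    IdNode *head;
} ProtectedIdList;

static void id_list_init(ProtectedIdList *l)
{
    assert(l);
    pthread_mutex_init(&l->lock, NULL);
    l->head = NULL;
}

/* Record an id when a buffer is posted (order in the list is irrelevant). */
static void id_list_add(ProtectedIdList *l, int32_t id)
{
    IdNode *n = malloc(sizeof(*n));

    assert(n);
    n->id = id;
    pthread_mutex_lock(&l->lock);
    n->next = l->head;
    l->head = n;
    pthread_mutex_unlock(&l->lock);
}

/* Forget an id once its completion arrived; returns 1 if it was listed. */
static int id_list_remove(ProtectedIdList *l, int32_t id)
{
    int found = 0;

    pthread_mutex_lock(&l->lock);
    for (IdNode **pp = &l->head; *pp; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            IdNode *dead = *pp;

            *pp = dead->next;
            free(dead);
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&l->lock);
    return found;
}

/*
 * On destroy: detach the whole list under the lock, then pass every id
 * that never completed to free_fn (if given). Returns how many ids were
 * still outstanding.
 */
static int id_list_destroy(ProtectedIdList *l,
                           void (*free_fn)(int32_t id, void *opaque),
                           void *opaque)
{
    int count = 0;
    IdNode *cur;

    pthread_mutex_lock(&l->lock);
    cur = l->head;
    l->head = NULL;
    pthread_mutex_unlock(&l->lock);

    while (cur) {
        IdNode *next = cur->next;

        if (free_fn) {
            free_fn(cur->id, opaque);
        }
        free(cur);
        cur = next;
        count++;
    }
    pthread_mutex_destroy(&l->lock);
    return count;
}
```

Detaching the list before walking it keeps the callback outside the
lock, so a callback that touches other locked state cannot deadlock on
this list.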

Signed-off-by: Yuval Shaia 
Reviewed-by: Marcel Apfelbaum 
---
 hw/rdma/rdma_backend.c  | 26 --
 hw/rdma/rdma_backend.h  |  2 +-
 hw/rdma/rdma_backend_defs.h |  2 +-
 hw/rdma/rdma_rm.c   |  2 +-
 hw/rdma/rdma_utils.c| 29 +
 hw/rdma/rdma_utils.h| 11 +++
 6 files changed, 63 insertions(+), 9 deletions(-)

diff --git a/hw/rdma/rdma_backend.c b/hw/rdma/rdma_backend.c
index a65f5737e4..d511ca282b 100644
--- a/hw/rdma/rdma_backend.c
+++ b/hw/rdma/rdma_backend.c
@@ -39,6 +39,7 @@
 typedef struct BackendCtx {
 void *up_ctx;
 struct ibv_sge sge; /* Used to save MAD recv buffer */
+RdmaBackendQP *backend_qp; /* To maintain recv buffers */
 } BackendCtx;
 
 struct backend_umad {
@@ -73,6 +74,7 @@ static void free_cqe_ctx(gpointer data, gpointer user_data)
 bctx = rdma_rm_get_cqe_ctx(rdma_dev_res, cqe_ctx_id);
 if (bctx) {
 rdma_rm_dealloc_cqe_ctx(rdma_dev_res, cqe_ctx_id);
+atomic_dec(&rdma_dev_res->stats.missing_cqe);
 }
 g_free(bctx);
 }
@@ -85,13 +87,15 @@ static void clean_recv_mads(RdmaBackendDev *backend_dev)
 cqe_ctx_id = rdma_protected_qlist_pop_int64(&backend_dev->
 recv_mads_list);
 if (cqe_ctx_id != -ENOENT) {
+atomic_inc(&backend_dev->rdma_dev_res->stats.missing_cqe);
 free_cqe_ctx(GINT_TO_POINTER(cqe_ctx_id),
  backend_dev->rdma_dev_res);
 }
 } while (cqe_ctx_id != -ENOENT);
 }
 
-static int rdma_poll_cq(RdmaDeviceResources *rdma_dev_res, struct ibv_cq *ibcq)
+static int rdma_poll_cq(RdmaBackendDev *backend_dev,
+RdmaDeviceResources *rdma_dev_res, struct ibv_cq *ibcq)
 {
 int i, ne, total_ne = 0;
 BackendCtx *bctx;
@@ -113,6 +117,8 @@ static int rdma_poll_cq(RdmaDeviceResources *rdma_dev_res, 
struct ibv_cq *ibcq)
 
 comp_handler(bctx->up_ctx, &wc[i]);
 
+rdma_protected_gslist_remove_int32(&bctx->backend_qp->cqe_ctx_list,
+   wc[i].wr_id);
 rdma_rm_dealloc_cqe_ctx(rdma_dev_res, wc[i].wr_id);
 g_free(bctx);
 }
@@ -175,14 +181,12 @@ static void *comp_handler_thread(void *arg)
 }
 
 backend_dev->rdma_dev_res->stats.poll_cq_from_bk++;
-rdma_poll_cq(backend_dev->rdma_dev_res, ev_cq);
+rdma_poll_cq(backend_dev, backend_dev->rdma_dev_res, ev_cq);
 
 ibv_ack_cq_events(ev_cq, 1);
 }
 }
 
-/* TODO: Post cqe for all remaining buffs that were posted */
-
 backend_dev->comp_thread.is_running = false;
 
 qemu_thread_exit(0);
@@ -311,7 +315,7 @@ void rdma_backend_poll_cq(RdmaDeviceResources 
*rdma_dev_res, RdmaBackendCQ *cq)
 int polled;
 
 rdma_dev_res->stats.poll_cq_from_guest++;
-polled = rdma_poll_cq(rdma_dev_res, cq->ibcq);
+polled = rdma_poll_cq(cq->backend_dev, rdma_dev_res, cq->ibcq);
 if (!polled) {
 rdma_dev_res->stats.poll_cq_from_guest_empty++;
 }
@@ -501,6 +505,7 @@ void rdma_backend_post_send(RdmaBackendDev *backend_dev,
 
 bctx = g_malloc0(sizeof(*bctx));
 bctx->up_ctx = ctx;
+bctx->backend_qp = qp;
 
 rc = rdma_rm_alloc_cqe_ctx(backend_dev->rdma_dev_res, &bctx_id, bctx);
 if (unlikely(rc)) {
@@ -508,6 +513,8 @@ void rdma_backend_post_send(RdmaBackendDev *backend_dev,
 goto err_free_bctx;
 }
 
+rdma_protected_gslist_append_int32(&qp->cqe_ctx_list, bctx_id);
+
 rc = build_host_sge_array(backend_dev->rdma_dev_res, new_sge, sge, num_sge,
   &backend_dev->rdma_dev_res->stats.tx_len);
 if (rc) {
@@ -616,6 +623,7 @@ void rdma_backend_post_recv(RdmaBackendDev *backend_dev,
 
 bctx = g_malloc0(sizeof(*bctx));
 bctx->up_ctx = ctx;
+bctx->backend_qp = qp;
 
 rc = rdma_rm_alloc_cqe_ctx(rdma_dev_res, &bctx_id, bctx);
 if (unlikely(rc)) {
@@ -623,6 +631,8 @@ void rdma_backend_post_recv(RdmaBackendDev *backend_dev,
 goto err_free_bctx;
 }
 
+rdma_protected_gslist_append_int32(&qp->cqe_ctx_list, bctx_id);
+
 rc = build_host_sge_array(rdma_dev_res, new_sge, sge, num_sge,
   &backend_dev->rdma_dev_res->stats.rx_bufs_len);
 if (rc) {
@@ -762,6 +772,8 @@ int rdma_backend_create_qp(RdmaBackendQP *qp, uint8_t 
qp_type,
 return -EIO;
 }
 
+rdma_protected_gslist_init(&qp->cqe_ctx_list);
+
 qp->ibpd = pd->ibpd;
 
 /* TODO: Query QP to get max_inline_data and save it to be used in send */
@@ -919,11 +931,13 @@ int rdma_backend_query_qp(RdmaBackendQP *qp, struct 
ibv_qp_attr *attr,
 return ib

[Qemu-devel] [PATCH v5 10/10] hw/pvrdma: Unregister from shutdown notifier when device goes down

2019-03-10 Thread Yuval Shaia

This hook was installed to close the device when the VM is going down.
Once the device is closed, there is no need to be informed of VM
shutdown.
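
To illustrate why the notifier has to be removed, here is a minimal,
self-contained sketch of a notifier list. It is an assumption-laden
illustration and not QEMU's notifier implementation: SketchNotifier,
sketch_notifier_add, sketch_notifier_remove and sketch_notify_all are
hypothetical names. Once a notifier is unlinked, a later notification no
longer reaches the (possibly already freed) device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical notifier: a callback plus intrusive list links. */
typedef struct SketchNotifier SketchNotifier;
struct SketchNotifier {
    void (*notify)(SketchNotifier *n, void *data);
    SketchNotifier *next;
    SketchNotifier **pprev; /* address of the pointer that points at us */
};

typedef struct SketchNotifierList {
    SketchNotifier *head;
} SketchNotifierList;

/* Link a notifier at the head of the list. */
static void sketch_notifier_add(SketchNotifierList *list, SketchNotifier *n)
{
    assert(list && n && n->notify);
    n->next = list->head;
    if (list->head) {
        list->head->pprev = &n->next;
    }
    n->pprev = &list->head;
    list->head = n;
}

/* Unlink a notifier; safe to call again on an already removed one. */
static void sketch_notifier_remove(SketchNotifier *n)
{
    if (!n->pprev) {
        return;
    }
    *n->pprev = n->next;
    if (n->next) {
        n->next->pprev = n->pprev;
    }
    n->next = NULL;
    n->pprev = NULL;
}

/* Invoke every registered callback; returns how many were called. */
static int sketch_notify_all(SketchNotifierList *list, void *data)
{
    int called = 0;
    SketchNotifier *n = list->head;

    while (n) {
        SketchNotifier *next = n->next; /* callback may unlink itself */

        n->notify(n, data);
        called++;
        n = next;
    }
    return called;
}

/* Example callback: counts how often it was notified. */
static void count_calls(SketchNotifier *n, void *data)
{
    (void)n;
    ++*(int *)data;
}
```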

Signed-off-by: Yuval Shaia 
Reviewed-by: Marcel Apfelbaum 
---
 hw/rdma/vmw/pvrdma_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/rdma/vmw/pvrdma_main.c b/hw/rdma/vmw/pvrdma_main.c
index 84df4294eb..1f7a95d596 100644
--- a/hw/rdma/vmw/pvrdma_main.c
+++ b/hw/rdma/vmw/pvrdma_main.c
@@ -311,6 +311,8 @@ static void pvrdma_fini(PCIDevice *pdev)
 {
 PVRDMADev *dev = PVRDMA_DEV(pdev);
 
+notifier_remove(&dev->shutdown_notifier);
+
 pvrdma_qp_ops_fini();
 
 rdma_backend_stop(&dev->backend_dev);
-- 
2.17.2

[Qemu-devel] [PATCH v5 05/10] {hmp, hw/pvrdma}: Expose device internals via monitor interface

2019-03-10 Thread Yuval Shaia

Allow interrogating device internals through the HMP interface.
The exposed indicators can be used for troubleshooting by developers or
sysadmins.
There is no need to expose these attributes to a management system (e.g.
libvirt) because (1) most of them are not "device-management" related
info and (2) there is no guarantee the interface is stable.

Signed-off-by: Yuval Shaia 
Acked-by: Dr. David Alan Gilbert 
Acked-by: Markus Armbruster 
---
 hmp-commands-info.hx  | 14 +++
 hmp.c | 27 
 hmp.h |  1 +
 hw/rdma/Makefile.objs |  2 +-
 hw/rdma/rdma.c| 30 ++
 hw/rdma/rdma_rm.c | 53 +++
 hw/rdma/rdma_rm.h |  1 +
 hw/rdma/vmw/pvrdma_main.c | 26 +++
 include/hw/rdma/rdma.h| 40 +
 9 files changed, 193 insertions(+), 1 deletion(-)
 create mode 100644 hw/rdma/rdma.c
 create mode 100644 include/hw/rdma/rdma.h

diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index cbee8b944d..c59444c461 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -202,6 +202,20 @@ STEXI
 @item info pic
 @findex info pic
 Show PIC state.
+ETEXI
+
+{
+.name   = "rdma",
+.args_type  = "",
+.params = "",
+.help   = "show RDMA state",
+.cmd= hmp_info_rdma,
+},
+
+STEXI
+@item info rdma
+@findex info rdma
+Show RDMA state.
 ETEXI
 
 {
diff --git a/hmp.c b/hmp.c
index 1e006eeb49..e24e3f9d5d 100644
--- a/hmp.c
+++ b/hmp.c
@@ -51,6 +51,7 @@
 #include "qemu/error-report.h"
 #include "exec/ramlist.h"
 #include "hw/intc/intc.h"
+#include "hw/rdma/rdma.h"
 #include "migration/snapshot.h"
 #include "migration/misc.h"
 
@@ -968,6 +969,32 @@ void hmp_info_pic(Monitor *mon, const QDict *qdict)
hmp_info_pic_foreach, mon);
 }
 
+static int hmp_info_rdma_foreach(Object *obj, void *opaque)
+{
+RdmaProvider *rdma;
+RdmaProviderClass *k;
+Monitor *mon = opaque;
+
+if (object_dynamic_cast(obj, INTERFACE_RDMA_PROVIDER)) {
+rdma = RDMA_PROVIDER(obj);
+k = RDMA_PROVIDER_GET_CLASS(obj);
+if (k->print_statistics) {
+k->print_statistics(mon, rdma);
+} else {
+monitor_printf(mon, "RDMA statistics not available for %s.\n",
+   object_get_typename(obj));
+}
+}
+
+return 0;
+}
+
+void hmp_info_rdma(Monitor *mon, const QDict *qdict)
+{
+object_child_foreach_recursive(object_get_root(),
+   hmp_info_rdma_foreach, mon);
+}
+
 void hmp_info_pci(Monitor *mon, const QDict *qdict)
 {
 PciInfoList *info_list, *info;
diff --git a/hmp.h b/hmp.h
index 5f1addcca2..666949afc3 100644
--- a/hmp.h
+++ b/hmp.h
@@ -36,6 +36,7 @@ void hmp_info_spice(Monitor *mon, const QDict *qdict);
 void hmp_info_balloon(Monitor *mon, const QDict *qdict);
 void hmp_info_irq(Monitor *mon, const QDict *qdict);
 void hmp_info_pic(Monitor *mon, const QDict *qdict);
+void hmp_info_rdma(Monitor *mon, const QDict *qdict);
 void hmp_info_pci(Monitor *mon, const QDict *qdict);
 void hmp_info_block_jobs(Monitor *mon, const QDict *qdict);
 void hmp_info_tpm(Monitor *mon, const QDict *qdict);
diff --git a/hw/rdma/Makefile.objs b/hw/rdma/Makefile.objs
index bd36cbf51c..c354e60e5b 100644
--- a/hw/rdma/Makefile.objs
+++ b/hw/rdma/Makefile.objs
@@ -1,5 +1,5 @@
 ifeq ($(CONFIG_PVRDMA),y)
-obj-$(CONFIG_PCI) += rdma_utils.o rdma_backend.o rdma_rm.o
+obj-$(CONFIG_PCI) += rdma_utils.o rdma_backend.o rdma_rm.o rdma.o
 obj-$(CONFIG_PCI) += vmw/pvrdma_dev_ring.o vmw/pvrdma_cmd.o \
  vmw/pvrdma_qp_ops.o vmw/pvrdma_main.o
 endif
diff --git a/hw/rdma/rdma.c b/hw/rdma/rdma.c
new file mode 100644
index 00..7bec0d0d2c
--- /dev/null
+++ b/hw/rdma/rdma.c
@@ -0,0 +1,30 @@
+/*
+ * RDMA device interface
+ *
+ * Copyright (C) 2018 Oracle
+ * Copyright (C) 2018 Red Hat Inc
+ *
+ * Authors:
+ * Yuval Shaia 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "hw/rdma/rdma.h"
+#include "qemu/module.h"
+
+static const TypeInfo rdma_hmp_info = {
+.name = INTERFACE_RDMA_PROVIDER,
+.parent = TYPE_INTERFACE,
+.class_size = sizeof(RdmaProviderClass),
+};
+
+static void rdma_register_types(void)
+{
+type_register_static(&rdma_hmp_info);
+}
+
+type_init(rdma_register_types)
diff --git a/hw/rdma/rdma_rm.c b/hw/rdma/rdma_rm.c
index 16109b9647..e019de1a14 100644
--- a/hw/rdma/rdma_rm.c
+++ b/hw/rdma/rdma_rm.c
@@ -16,6 +16,7 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "cpu.h"
+#include "monitor/monitor.h"
 
 #include "trace.h"
 #include "rdma_utils.h"
@@ -26,6 +27,58 @@
 #define PG_DIR_SZ { TARGET_PAGE_SIZE / sizeof(__u64) }
 #define PG_TBL_SZ { TARGET_PAGE_SIZE / sizeof(__u64) }
 
+void rdma_du

[Qemu-devel] [PATCH v5 09/10] hw/pvrdma: Delete pvrdma_exit function

2019-03-10 Thread Yuval Shaia

This hook is not called and was implemented by mistake.

Signed-off-by: Yuval Shaia 
Reviewed-by: Marcel Apfelbaum 
---
 hw/rdma/vmw/pvrdma_main.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/hw/rdma/vmw/pvrdma_main.c b/hw/rdma/vmw/pvrdma_main.c
index 01bb6e6b17..84df4294eb 100644
--- a/hw/rdma/vmw/pvrdma_main.c
+++ b/hw/rdma/vmw/pvrdma_main.c
@@ -654,11 +654,6 @@ out:
 }
 }
 
-static void pvrdma_exit(PCIDevice *pdev)
-{
-pvrdma_fini(pdev);
-}
-
 static void pvrdma_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -666,7 +661,6 @@ static void pvrdma_class_init(ObjectClass *klass, void 
*data)
 RdmaProviderClass *ir = INTERFACE_RDMA_PROVIDER_CLASS(klass);
 
 k->realize = pvrdma_realize;
-k->exit = pvrdma_exit;
 k->vendor_id = PCI_VENDOR_ID_VMWARE;
 k->device_id = PCI_DEVICE_ID_VMWARE_PVRDMA;
 k->revision = 0x00;
-- 
2.17.2

[Qemu-devel] [PATCH v5 06/10] hw/rdma: Free all MAD receive buffers when device is closed

2019-03-10 Thread Yuval Shaia

When the device is going down, free all saved MAD buffers.

Signed-off-by: Yuval Shaia 
Reviewed-by: Marcel Apfelbaum 
---
 hw/rdma/rdma_backend.c| 34 +-
 hw/rdma/vmw/pvrdma_main.c |  2 ++
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/hw/rdma/rdma_backend.c b/hw/rdma/rdma_backend.c
index bc2fefcf93..a65f5737e4 100644
--- a/hw/rdma/rdma_backend.c
+++ b/hw/rdma/rdma_backend.c
@@ -64,6 +64,33 @@ static inline void complete_work(enum ibv_wc_status status, 
uint32_t vendor_err,
 comp_handler(ctx, &wc);
 }
 
+static void free_cqe_ctx(gpointer data, gpointer user_data)
+{
+BackendCtx *bctx;
+RdmaDeviceResources *rdma_dev_res = user_data;
+unsigned long cqe_ctx_id = GPOINTER_TO_INT(data);
+
+bctx = rdma_rm_get_cqe_ctx(rdma_dev_res, cqe_ctx_id);
+if (bctx) {
+rdma_rm_dealloc_cqe_ctx(rdma_dev_res, cqe_ctx_id);
+}
+g_free(bctx);
+}
+
+static void clean_recv_mads(RdmaBackendDev *backend_dev)
+{
+unsigned long cqe_ctx_id;
+
+do {
+cqe_ctx_id = rdma_protected_qlist_pop_int64(&backend_dev->
+recv_mads_list);
+if (cqe_ctx_id != -ENOENT) {
+free_cqe_ctx(GINT_TO_POINTER(cqe_ctx_id),
+ backend_dev->rdma_dev_res);
+}
+} while (cqe_ctx_id != -ENOENT);
+}
+
 static int rdma_poll_cq(RdmaDeviceResources *rdma_dev_res, struct ibv_cq *ibcq)
 {
 int i, ne, total_ne = 0;
@@ -1037,6 +1064,11 @@ static int mad_init(RdmaBackendDev *backend_dev, 
CharBackend *mad_chr_be)
 return 0;
 }
 
+static void mad_stop(RdmaBackendDev *backend_dev)
+{
+clean_recv_mads(backend_dev);
+}
+
 static void mad_fini(RdmaBackendDev *backend_dev)
 {
 disable_rdmacm_mux_async(backend_dev);
@@ -1224,12 +1256,12 @@ void rdma_backend_start(RdmaBackendDev *backend_dev)
 
 void rdma_backend_stop(RdmaBackendDev *backend_dev)
 {
+mad_stop(backend_dev);
 stop_backend_thread(&backend_dev->comp_thread);
 }
 
 void rdma_backend_fini(RdmaBackendDev *backend_dev)
 {
-rdma_backend_stop(backend_dev);
 mad_fini(backend_dev);
 g_hash_table_destroy(ah_hash);
 ibv_destroy_comp_channel(backend_dev->channel);
diff --git a/hw/rdma/vmw/pvrdma_main.c b/hw/rdma/vmw/pvrdma_main.c
index 8c240e2b69..01bb6e6b17 100644
--- a/hw/rdma/vmw/pvrdma_main.c
+++ b/hw/rdma/vmw/pvrdma_main.c
@@ -313,6 +313,8 @@ static void pvrdma_fini(PCIDevice *pdev)
 
 pvrdma_qp_ops_fini();
 
+rdma_backend_stop(&dev->backend_dev);
+
 rdma_rm_fini(&dev->rdma_dev_res, &dev->backend_dev,
  dev->backend_eth_device_name);
 
-- 
2.17.2

Re: [Qemu-devel] [PULL 00/60] ppc-for-4.0 queue 20190310

2019-03-10 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20190310082703.1245-1-da...@gibson.dropbear.id.au/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20190310082703.1245-1-da...@gibson.dropbear.id.au
Subject: [Qemu-devel] [PULL 00/60] ppc-for-4.0 queue 20190310

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]   
patchew/20190310082703.1245-1-da...@gibson.dropbear.id.au -> 
patchew/20190310082703.1245-1-da...@gibson.dropbear.id.au
Switched to a new branch 'test'
a13e9857de spapr: Use CamelCase properly
9b2d3e81df target/ppc: Optimize x[sv]xsigdp using deposit_i64()
1a63e5d488 target/ppc: Optimize xviexpdp() using deposit_i64()
82f058a774 target/ppc: add HV support for POWER9
c85ade25f0 ppc/pnv: add a "ibm, opal/power-mgt" device tree node on POWER9
e39b190d65 ppc/pnv: add more dummy XSCOM addresses
2aaa98b799 ppc/pnv: activate XSCOM tests for POWER9
40a779400e ppc/pnv: POWER9 XSCOM quad support
bf6b37e0e3 ppc/pnv: extend XSCOM core support for POWER9
a16eb6280a ppc/pnv: add a OCC model for POWER9
177a01f268 ppc/pnv: add a OCC model class
0032552915 ppc/pnv: add SerIRQ routing registers
cad3544b00 ppc/pnv: add a LPC Controller model for POWER9
c8a3537791 ppc/pnv: add a 'dt_isa_nodename' to the chip
8a24b7e984 ppc/pnv: add a LPC Controller class model
b565618566 ppc/pnv: lpc: fix OPB address ranges
b3753bf4d5 ppc/pnv: add a PSI bridge model for POWER9
0f805a7337 ppc/pnv: add a PSI bridge class model
dc90b53163 mac_newworld: use node name instead of alias name for hd device in 
FWPathProvider
b40751980c mac_oldworld: use node name instead of alias name for hd device in 
FWPathProvider
09a2890670 target/ppc: introduce vsr64_offset() to simplify get_cpu_vsr{l, h}() 
and set_cpu_vsr{l, h}()
6502f1d34c target/ppc: switch fpr/vsrl registers so all VSX registers are in 
host endian order
c8ccee4ae4 target/ppc: improve avr64_offset() and use it to simplify 
get_avr64()/set_avr64()
009069ca82 target/ppc: introduce avr_full_offset() function
42276547b8 target/ppc: move Vsr* macros from internal.h to cpu.h
9b303bf2fe target/ppc: introduce single vsrl_offset() function
7d115a7cca target/ppc: introduce single fpr_offset() function
bc857d3423 spapr_iommu: Do not replay mappings from just created DMA window
318207457b ppc/pnv: psi: add a reset handler
d8c41355f7 ppc/pnv: psi: add a PSIHB_REG macro
0afa36e08f ppc/pnv: fix logging primitives using Ox
71115811ab ppc/xive: activate HV support
6a4dfae59e ppc/pnv: introduce a new pic_print_info() operation to the chip model
d775142ddd ppc/pnv: introduce a new dt_populate() operation to the chip model
087e79ef3a ppc/pnv: add a XIVE interrupt controller model for POWER9
560306e92f ppc/pnv: change the CPU machine_data presenter type to Object *
61f1c7c909 ppc/pnv: export the xive_router_notify() routine
7de67358c5 ppc/xive: export the TIMA memory accessors
bb2f939bf2 ppc: externalize ppc_get_vcpu_by_pir()
371cc21427 ppc/xive: hardwire the Physical CAM line of the thread context
2bd3f311b6 PPC: E500: Add FSL I2C controller and integrate RTC with it
cd8d1d103b target/ppc/spapr: Enable H_PAGE_INIT in-kernel handling
331fb825f9 spapr: Force SPAPR_MEMORY_BLOCK_SIZE to be a hwaddr (64-bit)
4d0a5b8bac target/ppc/spapr: Clear partition table entry when allocating hash 
table
459399789c PPC: E500: Update u-boot to v2019.01
d95c3a89e7 target/ppc: Refactor kvm_handle_debug
4e0adda559 target/ppc: Move handling of hardware breakpoints to a separate 
function
f914b4a55d target/ppc: Move exception vector offset computation into a function
dec20748cf target/ppc/spapr: Enable mitigations by default for pseries-4.0 
machine type
2a2020fd52 target/ppc/tcg: make spapr_caps apply cap-[cfpc/sbbc/ibs] non-fatal 
for tcg
abe6251060 target/ppc/spapr: Add SPAPR_CAP_CCF_ASSIST
efcc41a1f0 target/ppc/spapr: Add workaround option to SPAPR_CAP_IBS
d27181434a target/ppc/spapr: Enable the large decrementer for pseries-4.0
e3509a4217 target/ppc: Implement large decrementer support for KVM
85592739a6 target/ppc: Implement large decrementer support for TCG
ed7935af70 target/ppc/spapr: Add SPAPR_CAP_LARGE_DECREMENTER
9bcb00db74 Revert "spapr: support memory unplug for qtest"
16684c3584 spapr: Simulate CAS for qtest
31faed9bf2 vfio/spapr: Rename local systempagesize variable
29eef1ef4b vfio/spapr: Fix indirect levels calculation

=== OUTPUT BEGIN ===
1/60 Checking commit 29eef1ef4b33 (vfio/spapr: Fix indirect levels calculation)
2/60 Checking commit 31faed9bf208 (vfio/spapr: Rename local systempagesize 
variable)
3/60 Checking commit 16684c358405 (spapr: Simulate CAS for qt

[Qemu-devel] Re: 'make check' error

2019-03-10 Thread Li Qiang

Thanks Emilio,
I found that the ssh connection was very slow and the submodule wasn’t checked
out completely.
I switched to the https connection and it now works normally.

Thanks,
Li Qiang

From: Emilio G. Cota
Sent: March 10, 2019 1:38
To: Li Qiang
Cc: qemu-devel@nongnu.org
Subject: Re: 'make check' error

On Sat, Mar 09, 2019 at 13:53:32 +0800, Li Qiang wrote:
> Hi all, 
> 
> Today I ‘git clone’ && configure && make && make check 
> 
> And get following error, 
> 
> fp-test.c:50:10: fatal error: fail.h: No such file or directory
>  #include "fail.h"
>   ^~~~
> 
> I look at the commit:
> https://git.qemu.org/?p=qemu.git;a=commitdiff;h=3ac1f81329f4dfdc10a51e180f9cf28dbcb02a3c;hp=b44b5abeae4a3b54ccbd7137f59c0a8923cecec9
> 
> It seems to be an old commit; I think I got ‘make check’ working after this commit.
> So I don’t know what is wrong.
> 
> Any hints?

fail.h is part of berkeley-testfloat-3 -- I suspect the
berkeley-testfloat-3 git submodule wasn't checked out.

Make sure both berkeley-softfloat-3 and berkeley-testfloat-3 are
checked out at $src/tests/fp. If not, you can get them with
"git submodule init && git submodule update".

Hope that helps,

Emilio

Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option

2019-03-10 Thread Markus Armbruster

Daniel P. Berrangé  writes:

> On Mon, Mar 04, 2019 at 12:45:14PM +0100, Markus Armbruster wrote:
>> Daniel P. Berrangé  writes:
>> 
>> > On Mon, Mar 04, 2019 at 08:13:53AM +0100, Markus Armbruster wrote:
>> >> If we deprecate outdated NUMA configurations now, we can start rejecting
>> >> them with new machine types after a suitable grace period.
>> >
>> > How is libvirt going to know what machines it can use with the feature ?
>> > We don't have any way to introspect machine type specific logic, since we
>> > run all probing with "-machine none", and QEMU can't report anything about
>> > machines without instantiating them.
>> 
>> Fair point.  A practical way for management applications to decide which
>> of the two interfaces they can use with which machine type may be
>> required for deprecating one of the interfaces with new machine types.
>
> We currently have "qom-list-properties" which can report on the
> existence of properties registered against object types. What it
> can't do though is report on the default values of these properties.

Yes.

> What's interesting though is that qmp_qom_list_properties will actually
> instantiate objects in order to query properties, if the type isn't an
> abstract type.

If it's an abstract type, qom-list-properties returns the properties
created with object_class_property_add() & friends, typically by the
class_init method.  This is possible without instantiating the type.

If it's a concrete type, qom-list-properties additionally returns the
properties created with object_property_add(), typically by the
instance_init() method.  This requires instantiating the type.

Both kinds of properties can be added or deleted at any time.  For
instance, setting a property value with object_property_set() or similar
could create additional properties.

For historical reasons, we often use object_property_add() where
object_class_property_add() would do.  Sad.

> IOW, even if you are running "$QEMU -machine none", then if at the qmp-shell
> you do
>
>(QEMU) qom-list-properties typename=pc-q35-2.6-machine
>
> it will have actually instantiated the pc-q35-2.6-machine machine type.
> Since it has instantiated the machine, the object initializer function
> will have run and initialized the default values for various properties.
>
> IOW, it is possible for qom-list-properties to report on default values
> for non-abstract types.

instance_init() also initializes the properties' values.
qom-list-properties could show these initial values (I hesitate calling
them default values).

Setting a property's value can change other properties' values by side
effect.

My point is: the properties qom-list-properties shows and the initial
values it could show are not necessarily final.  QOM is designed to be
maximally flexible, and flexibility brings along its bosom-buddy
complexity.

If you keep that in mind, qom-list-properties can be put to good use all
the same.

A way to report "default values" (really: whatever the values are after
object_new()) feels like a fair feature request to me, if backed by an
actual use case.

[...]

Re: [Qemu-devel] [libvirt] [PATCH 1/2] numa: deprecate 'mem' parameter of '-numa node' option

2019-03-10 Thread Markus Armbruster

Daniel P. Berrangé  writes:

> On Wed, Mar 06, 2019 at 08:03:48PM +0100, Igor Mammedov wrote:
>> On Mon, 4 Mar 2019 16:35:16 +
>> Daniel P. Berrangé  wrote:
>> 
>> > On Mon, Mar 04, 2019 at 05:20:13PM +0100, Michal Privoznik wrote:
>> > > We couldn't have done that. How we would migrate from older qemu?
>> > > 
>> > > Anyway, now that I look into this (esp. git log) I came across:
>> > > 
>> > > commit f309db1f4d51009bad0d32e12efc75530b66836b
>> > > Author: Michal Privoznik 
>> > > AuthorDate: Thu Dec 18 12:36:48 2014 +0100
>> > > Commit: Michal Privoznik 
>> > > CommitDate: Fri Dec 19 07:44:44 2014 +0100
>> > > 
>> > > qemu: Create memory-backend-{ram,file} iff needed
>> > > 
>> > > Or this 7832fac84741d65e851dbdbfaf474785cbfdcf3c. We did try to generate the
>> > > newer cmd line but then for various reasons (e.g. avoiding triggering a qemu
>> > > bug) we turned it off and made libvirt default to the older (now deprecated)
>> > > cmd line.
>> > > 
>> > > Frankly, I don't know how to proceed. Unless qemu is fixed to allow
>> > > migration from deprecated to new cmd line (unlikely, if not impossible,
>> > > right?) then I guess the only approach we can have is that:
>> > > 
>> > > 1) whenever so called cold booting a new machine (fresh, brand new start 
>> > > of
>> > > a new domain) libvirt would default to modern cmd line,
>> > > 
>> > > 2) on migration, libvirt would record in the migration stream (or status 
>> > > XML
>> > > or wherever) that modern cmd line was generated and thus it'll make the
>> > > destination generate modern cmd line too.
>> > > 
>> > > This solution still suffers a couple of problems:
>> > > a) migration to older libvirt will fail as older libvirt won't recognize 
>> > > the
>> > > flag set in 2) and therefore would default to deprecated cmd line
>> > > b) migrating from one host to another won't modernize the cmd line
>> > > 
>> > > But I guess we have to draw a line somewhere (if we are not willing to 
>> > > write
>> > > those migration patches).
>> > 
>> > Yeah supporting backwards migration is a non-optional requirement from at
>> > least one of the mgmt apps using libvirt, so breaking the new to old case
>> > is something we always aim to avoid.
>> Aiming for support of 
>> "new QEMU + new machine type" => "old QEMU + non-existing machine type"
>> seems a bit difficult.
>
> That's not the scenario that's the problem. The problem is
>
>new QEMU + new machine type + new libvirt   -> new QEMU + new machine type 
> + old libvirt
>
> Previously released versions of libvirt will happily use any new machine
> type that QEMU introduces. So we can't make new libvirt use a different
> options, only for new machine types, as old libvirt supports those machine
> types too.

Avoiding tight coupling between QEMU and libvirt versions makes sense,
because having to upgrade stuff in lock-step is such a pain.

Does not imply we must support arbitrary combinations of QEMU and
libvirt versions.

Unless upstream libvirt's test matrix covers all versions of libvirt
against all released versions of QEMU, "previously released versions of
libvirt will continue to work with new QEMU" is largely an empty promise
anyway.  The real promise is more like "we won't break it intentionally;
good luck".

Mind, I'm not criticizing that real promise.  I'm criticizing cutting
yourself off from large areas of the solution space so you can continue
to pretend to yourself you actually deliver on the empty promise.

Now, if you limited what you promise to something more realistic,
ideally to something you actually test, we could talk about deprecation
schedules constructively.

For instance, if you promised

QEMU as of time T + its latest machine type + libvirt as of time T
 -> QEMU as of time T + its latest machine type + libvirt as of time T - d

will work for a certain value of d, then once all released versions of
libvirt since T - d support a new way of doing things, flipping to that
new way becomes a whole lot easier.

Re: [Qemu-devel] [PATCH] vfio-pci: enable by default

2019-03-10 Thread Daniel Henrique Barboza


Just faced this problem when trying to test vfio-pci using upstream:

qemu-system-ppc64: -device vfio-pci,host=0035:03:00.0,id=hostdev8: 
'vfio-pci' is not a valid device model name



This patch fixed it.

Tested-by: Daniel Henrique Barboza 


On 3/8/19 2:36 PM, Paolo Bonzini wrote:

CONFIG_VFIO_PCI was not "default y" - and once you do that, it is also important
to disable it if PCI is not there.

Reported-by: Alex Williamson 
Signed-off-by: Paolo Bonzini 
---
  hw/vfio/Kconfig | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/Kconfig b/hw/vfio/Kconfig
index ebda9fdf22..34da2a3cfd 100644
--- a/hw/vfio/Kconfig
+++ b/hw/vfio/Kconfig
@@ -4,8 +4,9 @@ config VFIO
  
  config VFIO_PCI

  bool
+default y
  select VFIO
-depends on LINUX
+depends on LINUX && PCI
  
  config VFIO_CCW

  bool

Re: [Qemu-devel] [PATCH v3 1/2] hw/arm/virt: Remove null-check in virt_build_smbios()

2019-03-10 Thread Laurent Vivier

On 09/03/2019 19:19, Philippe Mathieu-Daudé wrote:
> Since commit 578f3c7b0835 ("arm: add fw_cfg to "virt" board",
> 2014-12-22), the machvirt_init() unconditionally creates the
> fw_cfg object.  Later, commit c30e15658b1b ("smbios: implement
> smbios support for mach-virt", 2015-09-07) added a superfluous
> null-check on it.
> Remove this superfluous check.
> 
> Reviewed-by: Laszlo Ersek 
> Reviewed-by: Markus Armbruster 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> v2: Corrected commit reference (Laszlo)
> v3: Dropped 'Fixes:' (Markus)
> ---
>  hw/arm/virt.c | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 7f66ddad89..377e95a4cd 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1281,10 +1281,6 @@ static void virt_build_smbios(VirtMachineState *vms)
>  size_t smbios_tables_len, smbios_anchor_len;
>  const char *product = "QEMU Virtual Machine";
>  
> -if (!vms->fw_cfg) {
> -return;
> -}
> -
>  if (kvm_enabled()) {
>  product = "KVM Virtual Machine";
>  }
> 

Applied to my trivial-patches branch.

Thanks,
Laurent

Re: [Qemu-devel] [PATCH v3 2/2] hw/nvram/fw_cfg: Use the ldst API

2019-03-10 Thread Laurent Vivier

On 09/03/2019 19:19, Philippe Mathieu-Daudé wrote:
> The load/store API eases code review.
> 
> Reviewed-by: Laszlo Ersek 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/nvram/fw_cfg.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index 7fdf04adc9..3d8859e333 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -85,7 +85,7 @@ static char *read_splashfile(char *filename, gsize 
> *file_sizep,
>  }
>  
>  /* check magic ID */
> -filehead = ((content[0] & 0xff) + (content[1] << 8)) & 0x;
> +filehead = lduw_le_p(content);
>  if (filehead == 0xd8ff) {
>  file_type = JPG_FILE;
>  } else if (filehead == 0x4d42) {
> @@ -96,7 +96,7 @@ static char *read_splashfile(char *filename, gsize 
> *file_sizep,
>  
>  /* check BMP bpp */
>  if (file_type == BMP_FILE) {
> -bmp_bpp = (content[28] + (content[29] << 8)) & 0x;
> +bmp_bpp = lduw_le_p(&content[28]);
>  if (bmp_bpp != 24) {
>  goto error;
>  }
> 

Applied to my trivial-patches branch.

Thanks,
Laurent
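
For readers unfamiliar with the load helpers, the magic values tested in the hunk follow from reading the first two bytes of the file in little-endian order: a JPEG file starts with the bytes ff d8, a BMP file with "BM". A minimal standalone sketch, assuming nothing from QEMU (`load_u16_le()` is a hypothetical stand-in written for illustration, not QEMU's lduw_le_p() implementation):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for a little-endian 16-bit load. Going through uint8_t avoids
 * the sign-extension pitfalls of open-coded shifts on plain char.
 */
uint16_t load_u16_le(const void *ptr)
{
    const uint8_t *p = ptr;

    assert(p != NULL);
    /* Byte 0 is the low byte, byte 1 is the high byte. */
    return (uint16_t)(p[0] | (p[1] << 8));
}
```

Called on a buffer beginning with ff d8 this yields 0xd8ff, and on a buffer beginning with "BM" it yields 0x4d42, which are the two constants compared against in read_splashfile().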

Re: [Qemu-devel] [PATCH] Added periodic IRQ support for bcm2836_control local timer

2019-03-10 Thread bzt

Hi,

Okay, as you wish. My code works either way and on real hardware as
well, because I acknowledge the periodic IRQ as soon as possible, so
good for me.

Signed-off-by: Zoltán Baldaszti 
Subject: [PATCH] Added periodic IRQ support for bcm2836_control local timer
diff --git a/hw/intc/bcm2836_control.c b/hw/intc/bcm2836_control.c
index cfa5bc7365..82d2f51ffe 100644
--- a/hw/intc/bcm2836_control.c
+++ b/hw/intc/bcm2836_control.c
@@ -7,7 +7,13 @@
  * This code is licensed under the GNU GPLv2 and later.
  *
  * At present, only implements interrupt routing, and mailboxes (i.e.,
- * not local timer, PMU interrupt, or AXI counters).
+ * not PMU interrupt, or AXI counters).
+ *
+ * ARM Local Timer IRQ Copyright (c) 2019. Zoltán Baldaszti
+ * The IRQ_TIMER support is still very basic, does not provide timer counter
+ * access and other timer features, it just generates periodic IRQs. But it
+ * still requires not only the interrupt enable, but the timer enable bit to
+ * be set.
  *
  * Ref:
  * 
https://www.raspberrypi.org/documentation/hardware/raspberrypi/bcm2836/QA7_rev3.4.pdf
@@ -18,6 +24,9 @@
 #include "qemu/log.h"

 #define REG_GPU_ROUTE   0x0c
+#define REG_LOCALTIMERROUTING   0x24
+#define REG_LOCALTIMERCONTROL   0x34
+#define REG_LOCALTIMERACK   0x38
 #define REG_TIMERCONTROL0x40
 #define REG_MBOXCONTROL 0x50
 #define REG_IRQSRC  0x60
@@ -43,6 +52,13 @@
 #define IRQ_TIMER   11
 #define IRQ_MAX IRQ_TIMER

+#define LOCALTIMER_FREQ  3840
+#define LOCALTIMER_INTFLAG   (1 << 31)
+#define LOCALTIMER_RELOAD(1 << 30)
+#define LOCALTIMER_INTENABLE (1 << 29)
+#define LOCALTIMER_ENABLE(1 << 28)
+#define LOCALTIMER_VALUE(x)  ((x) & 0xfff)
+
 static void deliver_local(BCM2836ControlState *s, uint8_t core, uint8_t irq,
   uint32_t controlreg, uint8_t controlidx)
 {
@@ -78,6 +94,17 @@ static void bcm2836_control_update(BCM2836ControlState *s)
 s->fiqsrc[s->route_gpu_fiq] |= (uint32_t)1 << IRQ_GPU;
 }

+/* handle THE local timer interrupt for one of the cores' IRQ/FIQ */
+if ((s->local_timer_control & LOCALTIMER_INTENABLE) &&
+(s->local_timer_control & LOCALTIMER_INTFLAG)) {
+/* note: this will keep firing the IRQ as Peter asked */
+if (s->route_localtimer & 4) {
+s->fiqsrc[(s->route_localtimer & 3)] |= (uint32_t)1 << IRQ_TIMER;
+} else {
+s->irqsrc[(s->route_localtimer & 3)] |= (uint32_t)1 << IRQ_TIMER;
+}
+}
+
 for (i = 0; i < BCM2836_NCORES; i++) {
 /* handle local timer interrupts for this core */
 if (s->timerirqs[i]) {
@@ -162,6 +189,54 @@ static void bcm2836_control_set_gpu_fiq(void
*opaque, int irq, int level)
 bcm2836_control_update(s);
 }

+static void bcm2836_control_local_timer_set_next(void *opaque)
+{
+BCM2836ControlState *s = opaque;
+uint64_t next_event;
+
+assert(LOCALTIMER_VALUE(s->local_timer_control) > 0);
+
+next_event = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
+muldiv64(LOCALTIMER_VALUE(s->local_timer_control),
+NANOSECONDS_PER_SECOND, LOCALTIMER_FREQ);
+timer_mod(&s->timer, next_event);
+}
+
+static void bcm2836_control_local_timer_tick(void *opaque)
+{
+BCM2836ControlState *s = opaque;
+
+bcm2836_control_local_timer_set_next(s);
+
+s->local_timer_control |= LOCALTIMER_INTFLAG;
+bcm2836_control_update(s);
+}
+
+static void bcm2836_control_local_timer_control(void *opaque, uint32_t val)
+{
+BCM2836ControlState *s = opaque;
+
+s->local_timer_control = val;
+if (val & LOCALTIMER_ENABLE) {
+bcm2836_control_local_timer_set_next(s);
+} else {
+timer_del(&s->timer);
+}
+}
+
+static void bcm2836_control_local_timer_ack(void *opaque, uint32_t val)
+{
+BCM2836ControlState *s = opaque;
+
+if (val & LOCALTIMER_INTFLAG) {
+s->local_timer_control &= ~LOCALTIMER_INTFLAG;
+}
+if ((val & LOCALTIMER_RELOAD) &&
+(s->local_timer_control & LOCALTIMER_ENABLE)) {
+bcm2836_control_local_timer_set_next(s);
+}
+}
+
 static uint64_t bcm2836_control_read(void *opaque, hwaddr offset,
unsigned size)
 {
 BCM2836ControlState *s = opaque;
@@ -170,6 +245,12 @@ static uint64_t bcm2836_control_read(void
*opaque, hwaddr offset, unsigned size)
 assert(s->route_gpu_fiq < BCM2836_NCORES
&& s->route_gpu_irq < BCM2836_NCORES);
 return ((uint32_t)s->route_gpu_fiq << 2) | s->route_gpu_irq;
+} else if (offset == REG_LOCALTIMERROUTING) {
+return s->route_localtimer;
+} else if (offset == REG_LOCALTIMERCONTROL) {
+return s->local_timer_control;
+} else if (offset == REG_LOCALTIMERACK) {
+return 0;
 } else if (offset >= REG_TIMERCONTROL && offset < REG_MBOXCONTROL) {
 return s->timercontrol[(offset - REG_TIMERCONTROL) >> 2];
 } else if (offset >= REG_MBOXCONTROL && offset < REG_IRQSRC) {
@@ -195,6 +276,12 @@ static void
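
The deadline computed in bcm2836_control_local_timer_set_next() is "now" plus the reload value converted from timer ticks to nanoseconds. A minimal sketch of that conversion, assuming a 32-bit reload value and taking the frequency as a parameter (`ticks_to_ns()` and `NS_PER_SEC` are hypothetical names for illustration; the patch itself uses muldiv64() and NANOSECONDS_PER_SECOND):

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Convert a reload value in timer ticks to nanoseconds.
 * A 32-bit reload times 10^9 still fits in 64 bits, so a plain
 * multiply-then-divide is enough for this sketch.
 */
uint64_t ticks_to_ns(uint32_t reload, uint64_t freq_hz)
{
    assert(reload > 0 && freq_hz > 0);
    return (uint64_t)reload * NS_PER_SEC / freq_hz;
}
```

For example, a reload value of 1000 at 1 kHz gives one second, and a single tick at 1 MHz gives 1000 ns.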

[Qemu-devel] [PATCH v2] linux-user: Add missing IPV6 sockopts

2019-03-10 Thread Helge Deller

When running ssh over IPv6 with linux-user I faced this warning:
 Unsupported setsockopt level=41 optname=67
 setsockopt IPV6_TCLASS 32: Protocol not available:

This patch adds code to the linux-user emulation for setting and
retrieving a few missing IPV6 options, including IPV6_TCLASS.

Signed-off-by: Helge Deller 

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 208fd1813d..0da51b1208 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -1871,6 +1874,20 @@ static abi_long do_setsockopt(int sockfd, int level, int 
optname,
 case IPV6_RECVHOPLIMIT:
 case IPV6_2292HOPLIMIT:
 case IPV6_CHECKSUM:
+case IPV6_ADDRFORM:
+case IPV6_2292PKTINFO:
+case IPV6_RECVTCLASS:
+case IPV6_RECVRTHDR:
+case IPV6_2292RTHDR:
+case IPV6_RECVHOPOPTS:
+case IPV6_2292HOPOPTS:
+case IPV6_RECVDSTOPTS:
+case IPV6_2292DSTOPTS:
+case IPV6_TCLASS:
+case IPV6_RECVPATHMTU:
+case IPV6_TRANSPARENT:
+case IPV6_FREEBIND:
+case IPV6_RECVORIGDSTADDR:
 val = 0;
 if (optlen < sizeof(uint32_t)) {
 return -TARGET_EINVAL;
@@ -2365,6 +2382,20 @@ static abi_long do_getsockopt(int sockfd, int level, int 
optname,
 case IPV6_RECVHOPLIMIT:
 case IPV6_2292HOPLIMIT:
 case IPV6_CHECKSUM:
+case IPV6_ADDRFORM:
+case IPV6_2292PKTINFO:
+case IPV6_RECVTCLASS:
+case IPV6_RECVRTHDR:
+case IPV6_2292RTHDR:
+case IPV6_RECVHOPOPTS:
+case IPV6_2292HOPOPTS:
+case IPV6_RECVDSTOPTS:
+case IPV6_2292DSTOPTS:
+case IPV6_TCLASS:
+case IPV6_RECVPATHMTU:
+case IPV6_TRANSPARENT:
+case IPV6_FREEBIND:
+case IPV6_RECVORIGDSTADDR:
 if (get_user_u32(len, optlen))
 return -TARGET_EFAULT;
 if (len < 0)
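
The warning quoted in the commit message corresponds to a guest call of the following shape: on Linux, level 41 is IPPROTO_IPV6 and optname 67 is IPV6_TCLASS. A minimal sketch of such a guest-side call, assuming a Linux host (`set_ipv6_tclass()` and `get_ipv6_tclass()` are hypothetical wrapper names for illustration):

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Set the IPv6 traffic class on a socket. Returns 0 on success and -1
 * on error with errno set, exactly as setsockopt() does.
 */
int set_ipv6_tclass(int fd, int tclass)
{
    assert(tclass >= 0 && tclass <= 255);
    return setsockopt(fd, IPPROTO_IPV6, IPV6_TCLASS, &tclass, sizeof(tclass));
}

/* Read the traffic class back; same return convention as getsockopt(). */
int get_ipv6_tclass(int fd, int *tclass)
{
    socklen_t len = sizeof(*tclass);

    return getsockopt(fd, IPPROTO_IPV6, IPV6_TCLASS, tclass, &len);
}
```

Both paths pass a plain int, which is why the patch can add the option to the existing group of cases that transfer a 32-bit value.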

[Qemu-devel] [PATCH] tests: test-bdrv-graph-mod: fix memory leak

2019-03-10 Thread Li Qiang

Fixes: 2dbfadf
Spotted by ASAN when running 'make check'.

Signed-off-by: Li Qiang 
---
 tests/test-bdrv-graph-mod.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/test-bdrv-graph-mod.c b/tests/test-bdrv-graph-mod.c
index 458dfa6661..8bf0fe735d 100644
--- a/tests/test-bdrv-graph-mod.c
+++ b/tests/test-bdrv-graph-mod.c
@@ -117,6 +117,7 @@ static void test_update_perm_tree(void)
 
 bdrv_unref(bs);
 blk_unref(root);
+error_free(local_err);
 }
 
 /*
-- 
2.17.1

Re: [Qemu-devel] [PATCH 07/10] roms: build edk2 firmware binaries and variable store templates

2019-03-10 Thread Philippe Mathieu-Daudé

Hi Laszlo,

On 3/9/19 1:48 AM, Laszlo Ersek wrote:
> Add the "efi" target to "Makefile".
> 
> Introduce "Makefile.edk2" for building and cleaning the firmware images
> and varstore templates.
> 
> Collect the common bits from the recipes in the helper script
> "edk2-build.sh".
> 
> Signed-off-by: Laszlo Ersek 
> ---
>  roms/Makefile  |   5 +
>  roms/Makefile.edk2 | 138 
>  roms/edk2-build.sh |  55 
>  3 files changed, 198 insertions(+)
> 
> diff --git a/roms/Makefile b/roms/Makefile
> index 2e83ececa25a..054b432834ba 100644
> --- a/roms/Makefile
> +++ b/roms/Makefile
> @@ -61,6 +61,7 @@ default:
>   @echo "  skiboot-- update skiboot.lid"
>   @echo "  u-boot.e500-- update u-boot.e500"
>   @echo "  u-boot.sam460  -- update u-boot.sam460"
> + @echo "  efi-- update UEFI (edk2) platform firmware"
>  
>  bios: build-seabios-config-seabios-128k build-seabios-config-seabios-256k
>   cp seabios/builds/seabios-128k/bios.bin ../pc-bios/bios.bin
> @@ -143,6 +144,9 @@ skiboot:
>   $(MAKE) -C skiboot CROSS=$(powerpc64_cross_prefix)
>   cp skiboot/skiboot.lid ../pc-bios/skiboot.lid
>  
> +efi: edk2-basetools
> + $(MAKE) -f Makefile.edk2
> +
>  clean:
>   rm -rf seabios/.config seabios/out seabios/builds
>   $(MAKE) -C sgabios clean
> @@ -153,3 +157,4 @@ clean:
>   rm -rf u-boot/build.e500
>   $(MAKE) -C u-boot-sam460ex distclean
>   $(MAKE) -C skiboot clean
> + $(MAKE) -f Makefile.edk2 clean
> diff --git a/roms/Makefile.edk2 b/roms/Makefile.edk2
> new file mode 100644
> index ..ad6fff044cd6
> --- /dev/null
> +++ b/roms/Makefile.edk2
> @@ -0,0 +1,138 @@
> +# Makefile for building firmware binaries and variable store templates for a
> +# number of virtual platforms in edk2.
> +#
> +# Copyright (C) 2019, Red Hat, Inc.
> +#
> +# This program and the accompanying materials are licensed and made available
> +# under the terms and conditions of the BSD License that accompanies this
> +# distribution. The full text of the license may be found at
> +# .
> +#
> +# THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS, 
> WITHOUT
> +# WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
> +
> +toolchain = $(shell source ./edk2-funcs.sh && qemu_edk2_get_toolchain $(1))
> +
> +licenses := \
> + edk2/License.txt \
> + edk2/OvmfPkg/License.txt \
> + edk2/CryptoPkg/Library/OpensslLib/openssl/LICENSE
> +
> +# The "edk2-arm-vars.fd" varstore template is suitable for aarch64 as well.
> +# Similarly, the "edk2-i386-vars.fd" varstore template is suitable for x86_64
> +# as well, independently of "secure" too.
> +all: \
> + ../pc-bios/edk2-aarch64-code.fd \
> + ../pc-bios/edk2-arm-code.fd \
> + ../pc-bios/edk2-i386-code.fd \
> + ../pc-bios/edk2-i386-secure-code.fd \
> + ../pc-bios/edk2-x86_64-code.fd \
> + ../pc-bios/edk2-x86_64-secure-code.fd \
> + \
> + ../pc-bios/edk2-arm-vars.fd \
> + ../pc-bios/edk2-i386-vars.fd \
> + \
> + ../pc-bios/edk2-licenses.txt
> +
> +submodules:
> + cd edk2 && git submodule update --init --force
> +
> +# See notes on the ".NOTPARALLEL" target and the "+" indicator in
> +# "tests/uefi-test-tools/Makefile".
> +.NOTPARALLEL:
> +
> +../pc-bios/edk2-aarch64-code.fd: submodules
> + +./edk2-build.sh \
> + aarch64 \
> + --arch=AARCH64 \
> + --platform=ArmVirtPkg/ArmVirtQemu.dsc \
> + -D NETWORK_IP6_ENABLE \
> + -D HTTP_BOOT_ENABLE
> + cp edk2/Build/ArmVirtQemu-AARCH64/DEBUG_$(call 
> toolchain,aarch64)/FV/QEMU_EFI.fd \
> + $@
> + truncate --size=64M $@
> +
> +../pc-bios/edk2-arm-code.fd: submodules
> + +./edk2-build.sh \
> + arm \
> + --arch=ARM \
> + --platform=ArmVirtPkg/ArmVirtQemu.dsc \
> + -D NETWORK_IP6_ENABLE \
> + -D HTTP_BOOT_ENABLE
> + cp edk2/Build/ArmVirtQemu-ARM/DEBUG_$(call 
> toolchain,arm)/FV/QEMU_EFI.fd \
> + $@
> + truncate --size=64M $@
> +
> +../pc-bios/edk2-i386-code.fd: submodules
> + +./edk2-build.sh \
> + i386 \
> + --arch=IA32 \
> + --platform=OvmfPkg/OvmfPkgIa32.dsc \
> + -D NETWORK_IP6_ENABLE \
> + -D HTTP_BOOT_ENABLE \
> + -D TLS_ENABLE \
> + -D TPM2_ENABLE \
> + -D TPM2_CONFIG_ENABLE
> + cp edk2/Build/OvmfIa32/DEBUG_$(call toolchain,i386)/FV/OVMF_CODE.fd $@
> +
> +../pc-bios/edk2-i386-secure-code.fd: submodules
> + +./edk2-build.sh \
> + i386 \
> + --arch=IA32 \
> + --platform=OvmfPkg/OvmfPkgIa32.dsc \
> + -D NETWORK_IP6_ENABLE \
> + -D HTTP_BOOT_ENABLE \
> + -D TLS_ENABLE \
> + -D TPM2_ENABLE \
> + -D TPM2_CONFIG_ENABLE \
> + -D SECURE_BOOT_ENABLE \
> +

Re: [Qemu-devel] [PULL 0/4] NBD patches for 2019-03-08, 4.0 softfreeze

2019-03-10 Thread Peter Maydell

On Fri, 8 Mar 2019 at 17:41, Eric Blake  wrote:
>
> The following changes since commit c557a8c7b755d8c153fc0f5be00688228be96e76:
>
>   Merge remote-tracking branch 
> 'remotes/dgilbert/tags/pull-migration-20190306a' into staging (2019-03-06 
> 14:50:33 +)
>
> are available in the Git repository at:
>
>   https://repo.or.cz/qemu/ericb.git tags/pull-nbd-2019-03-08
>
> for you to fetch changes up to 054be3605459d4342e9ee5a82ae0fcffeeb09e4d:
>
>   iotests: Wait for qemu to end in 223 (2019-03-06 11:05:27 -0600)
>
> 
> nbd patches for 2019-03-08
>
> - support TLS client authorization in NBD servers
> - iotest 223 race fix
>
> 
> Daniel P. Berrangé (3):
>   qemu-nbd: add support for authorization of TLS clients
>   nbd: allow authorization with nbd-server-start QMP command
>   nbd: fix outdated qapi docs syntax for tls-creds
>
> Eric Blake (1):
>   iotests: Wait for qemu to end in 223

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.0
for any user-visible changes.

-- PMM

Re: [Qemu-devel] [PATCH] net: tap: use qemu_set_nonblock

2019-03-10 Thread Li Qiang

Hi Jason,

What's the status of this patch? I don't see it in upstream.

Thanks,
Li Qiang

On Thu, Nov 22, 2018 at 10:22 AM, Jason Wang wrote:

>
> On 2018/11/22 1:39 AM, Michael S. Tsirkin wrote:
> > On Wed, Nov 21, 2018 at 11:30:41AM -0600, Eric Blake wrote:
> >> On 11/21/18 6:23 AM, Michael S. Tsirkin wrote:
> >>
>  I agree it is good to preserve fcntl flags though, so this patch
>  looks desirable.
> 
>  Reviewed-by: Daniel P. Berrangé 
> >>> Sure
> >>>
> >>> Acked-by: Michael S. Tsirkin 
> >>>
> >>> but really not for this release I guess as we are in freeze.
> >> We're in freeze, so the criteria is: Does this fix a bug that we would
> >> otherwise not want in 3.1.  If the code is pre-existing (that is, if
> 3.0 was
> >> released with the same problem), or then delaying the patch to 4.0 is an
> >> easier call to make.  If the problem is new to 3.1, then fixing it for
> -rc3
> >> is still reasonable with maintainer discretion (although once -rc3
> lands, we
> >> want as little as possible to go into -rc4, even if our track record
> says we
> >> will be unable to avoid -rc4 altogether).
> >>
> >> I think that losing flags is likely enough to be a noticeable bug worth
> >> fixing for 3.1, but I did not research when the problem was introduced,
> so I
> >> don't have a strong preference for 3.1 vs. 4.0.
> > Maintainer in this case is Jason, so it's up to him
>
>
> I've queued this for 4.0.
>
> Thanks
>
>

Re: [Qemu-devel] [PATCH 00/10] bundle edk2 platform firmware with QEMU

2019-03-10 Thread Philippe Mathieu-Daudé

On 3/10/19 4:56 AM, Michael S. Tsirkin wrote:
> On Sat, Mar 09, 2019 at 01:48:16AM +0100, Laszlo Ersek wrote:
>> Repo:   https://github.com/lersek/qemu.git
>> Branch: edk2_build
>>
>> This series advances the roms/edk2 submodule to the "edk2-stable201903"
>> release, and builds and captures platform firmware binaries from that
>> release. At this point they are meant to be used by both end-users and
>> by Igor's ACPI unit tests in qtest ("make check").
>>
>> Previous discussion:
>>
>>   [Qemu-devel] bundling edk2 platform firmware images with QEMU
>>   http://mid.mail-archive.com/80f0bae3-e79a-bb68-04c4-1c9c684d95b8@redhat.com
>>   https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg02601.html

There David raised a concern about "[adding] ~206 MB of binaries to the
pc-bios directory". I'm also worried.

GitHub kindly suggests using git-lfs. It is an extra dependency I'd
strongly prefer to avoid (because we support a wide range of host OSes,
each using a wide range of filesystem types).

What about storing those binaries on a file server (http/ftp) together
with a file containing their hash digests (SHA1/SHA256)? We already
have all the tools required to fetch and verify those ROM blobs with the
build system.
Or we could store the hashes in the QEMU repository too.

>> Note that the series was formatted with "--no-binary" (affecting patch
>> #8), therefore it cannot be applied with "git-am". See the remote
>> repo/branch reference near the top instead.
>>
>> Thanks,
>> Laszlo
> 
> High time IMO.

:)

> Reviewed-by: Michael S. Tsirkin 
> 
> Laszlo I suggest you add an entry to MAINTAINERS
> and start doing pull requests.

This is the entry I added here:
https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg02967.html

> 
> Peter, what do you say? OK with you?

Since this series doesn't change the QEMU binaries built, it looks OK to
me to merge it past soft freeze (as we do with tests/CI); this way it gets
merged with the final EDK2 release tag.
Else we can merge it next week, and update the EDK2 submodule tag
before the QEMU release.

>> Laszlo Ersek (10):
>>   roms: lift "edk2-funcs.sh" from "tests/uefi-test-tools/build.sh"
>>   roms/edk2-funcs.sh: require gcc-4.8+ for building i386 and x86_64
>>   tests/uefi-test-tools/build.sh: work around TianoCore#1607
>>   roms/edk2: advance to tag edk2-stable201903
>>   roms/edk2-funcs.sh: add the qemu_edk2_get_thread_count() function
>>   roms/Makefile: replace the $(EFIROM) target with "edk2-basetools"
>>   roms: build edk2 firmware binaries and variable store templates
>>   pc-bios: add edk2 firmware binaries and variable store templates
>>   pc-bios: document the edk2 firmware images; add firmware descriptors
>>   Makefile: install the edk2 firmware images and their descriptors
>>
>>  Makefile   |  17 +-
>>  pc-bios/README |  11 +
>>  pc-bios/descriptors/50-edk2-i386-secure.json   |  34 +++
>>  pc-bios/descriptors/50-edk2-x86_64-secure.json |  35 +++
>>  pc-bios/descriptors/60-edk2-aarch64.json   |  31 +++
>>  pc-bios/descriptors/60-edk2-arm.json   |  31 +++
>>  pc-bios/descriptors/60-edk2-i386.json  |  33 +++
>>  pc-bios/descriptors/60-edk2-x86_64.json|  34 +++
>>  pc-bios/edk2-aarch64-code.fd   | Bin 0 -> 67108864 bytes
>>  pc-bios/edk2-arm-code.fd   | Bin 0 -> 67108864 bytes
>>  pc-bios/edk2-arm-vars.fd   | Bin 0 -> 67108864 bytes
>>  pc-bios/edk2-i386-code.fd  | Bin 0 -> 3653632 bytes
>>  pc-bios/edk2-i386-secure-code.fd   | Bin 0 -> 3653632 bytes
>>  pc-bios/edk2-i386-vars.fd  | Bin 0 -> 540672 bytes
>>  pc-bios/edk2-licenses.txt  | 209 
>>  pc-bios/edk2-x86_64-code.fd| Bin 0 -> 3653632 bytes
>>  pc-bios/edk2-x86_64-secure-code.fd | Bin 0 -> 3653632 bytes
>>  roms/Makefile  |   9 +-
>>  roms/Makefile.edk2 | 138 +++
>>  roms/edk2  |   2 +-
>>  roms/edk2-build.sh |  55 +
>>  roms/edk2-funcs.sh | 253 
>>  tests/uefi-test-tools/build.sh | 100 +---
>>  23 files changed, 897 insertions(+), 95 deletions(-)
>>  create mode 100644 pc-bios/descriptors/50-edk2-i386-secure.json
>>  create mode 100644 pc-bios/descriptors/50-edk2-x86_64-secure.json
>>  create mode 100644 pc-bios/descriptors/60-edk2-aarch64.json
>>  create mode 100644 pc-bios/descriptors/60-edk2-arm.json
>>  create mode 100644 pc-bios/descriptors/60-edk2-i386.json
>>  create mode 100644 pc-bios/descriptors/60-edk2-x86_64.json
>>  create mode 100644 pc-bios/edk2-aarch64-code.fd
>>  create mode 100644 pc-bios/edk2-arm-code.fd
>>  create mode 100644 pc-bios/edk2-arm-vars.fd
>>  create mode 100644

Re: [Qemu-devel] [PATCH] qom: cpu: destroy work_mutex in cpu_common_finalize

2019-03-10 Thread 李强

Hi Paolo,

What's the status of this patch? I don't see it in upstream.


Thanks,
Li Qiang


At 2019-01-08 07:41:09, "Paolo Bonzini"  wrote:
>On 02/01/19 08:41, Li Qiang wrote:
>> Commit 376692b9dc6 ("cpus: protect work list with work_mutex")
>> initializes work_mutex in cpu_common_initfn but forgets to
>> destroy it. This causes a resource leak when a CPU is hot-unplugged
>> or when CPU hotplug fails.
>> 
>> Signed-off-by: Li Qiang 
>> ---
>>  qom/cpu.c | 3 +++
>>  1 file changed, 3 insertions(+)
>> 
>> diff --git a/qom/cpu.c b/qom/cpu.c
>> index 9ad1372d57..367ebf9d61 100644
>> --- a/qom/cpu.c
>> +++ b/qom/cpu.c
>> @@ -380,6 +380,9 @@ static void cpu_common_initfn(Object *obj)
>>  
>>  static void cpu_common_finalize(Object *obj)
>>  {
>> +CPUState *cpu = CPU(obj);
>> +
>> +qemu_mutex_destroy(&cpu->work_mutex);
>>  }
>>  
>>  static int64_t cpu_common_get_arch_id(CPUState *cpu)
>> 
>
>Queued, thanks.
>
>Paolo
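
The fix pairs the mutex initialization done in the init function with a destroy in the finalizer. The same pairing, sketched with plain pthreads rather than QEMU's qemu_mutex_* wrappers (`struct worker`, `worker_init()` and `worker_finalize()` are hypothetical names for illustration):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical object with an embedded lock; stands in for CPUState. */
struct worker {
    pthread_mutex_t work_mutex;
    int initialized;
};

/* Counterpart of the init function: creates the lock. */
int worker_init(struct worker *w)
{
    int ret = pthread_mutex_init(&w->work_mutex, NULL);

    w->initialized = (ret == 0);
    return ret;
}

/* Counterpart of the finalizer: releases whatever init created. */
int worker_finalize(struct worker *w)
{
    assert(w->initialized);
    w->initialized = 0;
    return pthread_mutex_destroy(&w->work_mutex);
}
```

Every object that goes through worker_init() must eventually go through worker_finalize(), including on the error path of a failed hotplug, otherwise the lock's resources are never released.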

Re: [Qemu-devel] [PULL 00/25] Misc patches for QEMU 4.0 soft freeze

2019-03-10 Thread Peter Maydell

On Sat, 9 Mar 2019 at 07:50, Paolo Bonzini  wrote:
>
> The following changes since commit 62cfabb52210139843e26c95434356f73a0631b9:
>
>   Merge remote-tracking branch 'remotes/rth/tags/pull-hppa-20190307' into 
> staging (2019-03-08 15:17:01 +)
>
> are available in the git repository at:
>
>
>   git://github.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to d7fa7c30d45c375fb7d9ad6f58462a64c3317037:
>
>   exec: streamline flatview_add_to_dispatch (2019-03-09 08:46:56 +0100)
>
> 
> * allow building QEMU without TCG or KVM support (Anthony)
> * update AMD IOMMU copyright (David)
> * compilation fixes for GCC and BSDs (Alexey, David, Paolo, Philippe)
> * coalesced I/O bugfix (Jagannathan)
> * Processor Tracing cpuid fix (Luwei)
> * Kconfig fix (Paolo)
> * Cleanups (Paolo)
> * PVH vs. multiboot fix (Stefano)
> * LSI bugfixes (Sven)
> * elf2dmp Coverity fix (Victor)
> * scsi-disk fix (Zhengui)

Signed-off-by: Philippe Mathieu-DaudÃ© 
Signed-off-by: Philippe Mathieu-DaudÃ© 
Signed-off-by: Philippe Mathieu-DaudÃ© 
ERROR: pull request includes tag with UTF-8 error in person name

(Maybe we should add "patch includes LATIN CAPITAL LETTER A WITH TILDE"
in checkpatch.pl as a warning, since it's much more likely to
be "something got doubly-utf-8-encoded" than intentional.)

thanks
-- PMM
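
The check suggested above works because of how double encoding behaves: "é" (U+00E9) is the bytes c3 a9 in UTF-8; if those bytes are misread as Latin-1 and encoded again, c3 becomes U+00C3 (bytes c3 83) and a9 becomes U+00A9 (bytes c2 a9). A minimal sketch of such a detector in C, purely for illustration (checkpatch.pl itself is a Perl script, and `has_a_tilde()` is a hypothetical name):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if the byte string contains U+00C3 (LATIN CAPITAL LETTER A
 * WITH TILDE) encoded as UTF-8, i.e. the byte pair 0xC3 0x83.
 */
bool has_a_tilde(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    assert(p != NULL);
    /* p[1] is always readable here: at worst it is the terminating NUL. */
    for (; p[0] != '\0'; p++) {
        if (p[0] == 0xC3 && p[1] == 0x83) {
            return true;
        }
    }
    return false;
}
```

A correctly encoded name does not trigger the check, while the doubly encoded form shown in the error output does.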

[Qemu-devel] [PATCH] tests: test-announce-self: fix memory leak

2019-03-10 Thread Li Qiang

Spotted by ASAN when running 'make check'.

Signed-off-by: Li Qiang 
---
 tests/test-announce-self.c | 20 ++--
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/tests/test-announce-self.c b/tests/test-announce-self.c
index 1644d34a3f..3f370d8bf5 100644
--- a/tests/test-announce-self.c
+++ b/tests/test-announce-self.c
@@ -21,17 +21,6 @@
 #define ETH_P_RARP 0x8035
 #endif
 
-static QTestState *test_init(int socket)
-{
-char *args;
-
-args = g_strdup_printf("-netdev socket,fd=%d,id=hs0 -device "
-   "virtio-net-pci,netdev=hs0", socket);
-
-return qtest_start(args);
-}
-
-
 static void test_announce(int socket)
 {
 char buffer[60];
@@ -58,19 +47,22 @@ static void test_announce(int socket)
 
 static void setup(gconstpointer data)
 {
-QTestState *qs;
 void (*func) (int socket) = data;
 int sv[2], ret;
+char *args;
 
 ret = socketpair(PF_UNIX, SOCK_STREAM, 0, sv);
 g_assert_cmpint(ret, !=, -1);
 
-qs = test_init(sv[1]);
+args = g_strdup_printf("-netdev socket,fd=%d,id=hs0 -device "
+   "virtio-net-pci,netdev=hs0", sv[1]);
+qtest_start(args);
 func(sv[0]);
 
 /* End test */
 close(sv[0]);
-qtest_quit(qs);
+qtest_end();
+g_free(args);
 }
 
 int main(int argc, char **argv)
-- 
2.17.1

Re: [Qemu-devel] [PATCH] hw/riscv/virt: re-add machine-specific compatible string to /soc/ node

2019-03-10 Thread Auer, Lukas

Hi Bin,

On Sun, 2019-03-10 at 09:07 +0800, Bin Meng wrote:
> Hi Lukas,
> 
> On Mon, Feb 11, 2019 at 6:13 AM Lukas Auer
>  wrote:
> > Re-add the previous compatible string "riscv-virtio-soc" to the soc
> > device tree node to allow U-Boot and Linux to bind machine-specific
> > drivers to it. The current compatible string "simple-bus" is
> > retained.
> > 
> > This is required by U-Boot to bind devices early, as part of the
> > pre-relocation driver model.
> > 
> 
> I see no problem with U-Boot working with current compatible string
> "simple-bus". In fact I had planned to remove the compatible string
> "riscv-virtio-soc" in U-Boot but did not get time to work on it.
> 

It is only required if U-Boot is running in machine-mode. For
relocation it needs to use the CLINT driver to send appropriate IPIs to
the other harts. To be able to probe the driver, the device and its
parent device tree node (soc) must therefore be available in the pre-
relocation device model.
This patch was the easiest way I could think of for achieving this. It
could be that there is a better way of solving this.

Thanks,
Lukas

> > Fixes: 53f54508dae6("hw/riscv/virtio: Set the soc device tree node
> > as a
> > simple-bus")
> > Signed-off-by: Lukas Auer 
> > ---
> > 
> >  hw/riscv/virt.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> 
> Regards,
> Bin

[Qemu-devel] [PATCH] ati-vga: Implement DDC and EDID info from monitor

2019-03-10 Thread BALATON Zoltan

This adds DDC support to ati-vga and connects i2c-ddc to provide EDID
info that is read by guests to find available screen modes. Not sure
if this is 100% correct yet but at least MorphOS is happy with it and
starts in a high resolution mode instead of 640x480 (although its
splash screen is still not correct). Linux needs support from VESA
vgabios, it seems to be missing INT10 0x4F15 function (see
https://gitlab.freedesktop.org/xorg/xserver/blob/master/hw/xfree86/vbe/vbe.c)
without which no DDC is available that also prevents loading the
accelerated X driver.

Besides, this depends on bitbang_i2c.h which is now in hw/i2c so if
including it from there is not desirable that may need to be moved
somewhere.

Signed-off-by: BALATON Zoltan 
---
 hw/display/Kconfig   |  2 ++
 hw/display/ati.c | 24 ++--
 hw/display/ati_int.h |  3 +++
 3 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/hw/display/Kconfig b/hw/display/Kconfig
index 86c1d544c5..f8d65802a9 100644
--- a/hw/display/Kconfig
+++ b/hw/display/Kconfig
@@ -112,3 +112,5 @@ config ATI_VGA
 default y if PCI_DEVICES
 depends on PCI
 select VGA
+select BITBANG_I2C
+select DDC
diff --git a/hw/display/ati.c b/hw/display/ati.c
index 8322f52aff..260cd803d8 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -24,6 +24,8 @@
 #include "qapi/error.h"
 #include "hw/hw.h"
 #include "ui/console.h"
+#include "hw/i2c/i2c-ddc.h"
+#include "../i2c/bitbang_i2c.h"
 #include "trace.h"
 
 #define ATI_DEBUG_HW_CURSOR 0
@@ -267,7 +269,9 @@ static uint64_t ati_mm_read(void *opaque, hwaddr addr, 
unsigned int size)
 case DAC_CNTL:
 val = s->regs.dac_cntl;
 break;
-/*case GPIO_MONID: FIXME hook up DDC I2C here */
+case GPIO_MONID:
+val = s->regs.gpio_monid;
+break;
 case PALETTE_INDEX:
 /* FIXME unaligned access */
 val = vga_ioport_read(&s->vga, VGA_PEL_IR) << 16;
@@ -501,7 +505,17 @@ static void ati_mm_write(void *opaque, hwaddr addr,
 s->regs.dac_cntl = data & 0xe3ff;
 s->vga.dac_8bit = !!(data & DAC_8BIT_EN);
 break;
-/*case GPIO_MONID: FIXME hook up DDC I2C here */
+case GPIO_MONID:
+s->regs.gpio_monid = data & 0x0f0f000f;
+if (data & BIT(2) << 24) {
+s->regs.gpio_monid |= !!(data & BIT(2)) << 10;
+bitbang_i2c_set(s->bbi2c, BITBANG_I2C_SCL, (data & BIT(2)) != 0);
+}
+if (data & BIT(1) << 24) {
+s->regs.gpio_monid |= bitbang_i2c_set(s->bbi2c, BITBANG_I2C_SDA,
+  (data & BIT(1)) != 0) << 9;
+}
+break;
 case PALETTE_INDEX ... PALETTE_INDEX + 3:
 if (size == 4) {
 vga_ioport_write(&s->vga, VGA_PEL_IR, (data >> 16) & 0xff);
@@ -792,6 +806,12 @@ static void ati_vga_realize(PCIDevice *dev, Error **errp)
 vga->cursor_draw_line = ati_cursor_draw_line;
 }
 
+/* ddc, edid */
+I2CBus *i2cbus = i2c_init_bus(DEVICE(s), "ati-vga.ddc");
+s->bbi2c = bitbang_i2c_init(i2cbus);
+I2CSlave *i2cddc = I2C_SLAVE(qdev_create(BUS(i2cbus), TYPE_I2CDDC));
+i2c_set_slave_address(i2cddc, 0x50);
+
 /* mmio register space */
 memory_region_init_io(&s->mm, OBJECT(s), &ati_mm_ops, s,
   "ati.mmregs", 0x4000);
diff --git a/hw/display/ati_int.h b/hw/display/ati_int.h
index a6f3e20e63..8df00efd93 100644
--- a/hw/display/ati_int.h
+++ b/hw/display/ati_int.h
@@ -11,6 +11,7 @@
 
 #include "qemu/osdep.h"
 #include "hw/pci/pci.h"
+#include "hw/i2c/i2c.h"
 #include "vga_int.h"
 
 /*#define DEBUG_ATI*/
@@ -36,6 +37,7 @@ typedef struct ATIVGARegs {
 uint32_t crtc_gen_cntl;
 uint32_t crtc_ext_cntl;
 uint32_t dac_cntl;
+uint32_t gpio_monid;
 uint32_t crtc_h_total_disp;
 uint32_t crtc_h_sync_strt_wid;
 uint32_t crtc_v_total_disp;
@@ -84,6 +86,7 @@ typedef struct ATIVGAState {
 uint16_t cursor_size;
 uint32_t cursor_offset;
 QEMUCursor *cursor;
+bitbang_i2c_interface *bbi2c;
 MemoryRegion io;
 MemoryRegion mm;
 ATIVGARegs regs;
-- 
2.13.7

[Qemu-devel] [PATCH] isa: Add APM and ACPI dependencies for VT82C686

2019-03-10 Thread BALATON Zoltan

Compiling vt82c686.c fails without the APM and ACPI_PM functions. Add
a dependency on these in Kconfig to fix this.

Signed-off-by: BALATON Zoltan 
---
 hw/isa/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/isa/Kconfig b/hw/isa/Kconfig
index 57e09a0cb8..e092da3fc3 100644
--- a/hw/isa/Kconfig
+++ b/hw/isa/Kconfig
@@ -36,6 +36,8 @@ config VT82C686
 select ACPI_SMBUS
 select SERIAL_ISA
 select FDC
+select APM
+select ACPI_X86
 
 config SMC37C669
 bool
-- 
2.13.7

[Qemu-devel] [PATCH v7 1/2] hw/display: Add basic ATI VGA emulation

2019-03-10 Thread BALATON Zoltan

At least two machines, the PPC mac99 and MIPS fulong2e, have an ATI
gfx chip by default (Rage 128 Pro and M6/RV100 respectively) and
guests running on these and the PMON2000 firmware of the fulong2e
expect this to be available. Fortunately these are very similar chips
so they can be mostly emulated in the same device model. This patch
adds basic emulation of these ATI VGA chips.

While this is incomplete and currently only enough to run the MIPS
firmware and get framebuffer output with Linux, it allows the fulong2e
board to work more like the real hardware and having it in QEMU in
this state provides a way to experiment with it and allows others to
contribute to improve it. It is compiled for all archs but only the
fulong2e (which currently has no display output at all) is set to use
it by default (in a separate patch).

Signed-off-by: BALATON Zoltan 
Acked-by: Aleksandar Markovic 
Tested-by: Andrew Randrianasulu 
Tested-by: Howard Spoelstra 
---
v7:
- rebased for Kconfig changes in master
- added Tested-by: tags

v6:
- add support for hwcursor rendered in device as well, enable with
  guest_hwcursor=true property

v5:
- review suggestions: add const to model aliases, \n to log, %u in trace
- implemented hardware cursor support

v4:
- fix mingw build (from Gerd)
- set dev_id in realize to allow pci_patch_ids to change bios rom
- add model aliases to select device variant by name instead of id
- misc mode switch and 2d fixes (better but still not quite right)

v3:
- add to default-configs/pci.mak instead of mips64el and ppc only
- rename device_id property to x-device-id
- use extract32/deposit32 in *_offs functions
- add ati-vga to vl.c default_list[]

v2:
- Extended debug logs
- Fix mode switching and some registers
- Fixes to 2D functions

 hw/display/Kconfig   |   6 +
 hw/display/Makefile.objs |   2 +
 hw/display/ati.c | 865 +++
 hw/display/ati_2d.c  | 167 +
 hw/display/ati_dbg.c | 259 ++
 hw/display/ati_int.h |  96 ++
 hw/display/ati_regs.h| 461 +
 hw/display/trace-events  |   4 +
 vl.c |   1 +
 9 files changed, 1861 insertions(+)
 create mode 100644 hw/display/ati.c
 create mode 100644 hw/display/ati_2d.c
 create mode 100644 hw/display/ati_dbg.c
 create mode 100644 hw/display/ati_int.h
 create mode 100644 hw/display/ati_regs.h

diff --git a/hw/display/Kconfig b/hw/display/Kconfig
index a96ea763a8..86c1d544c5 100644
--- a/hw/display/Kconfig
+++ b/hw/display/Kconfig
@@ -106,3 +106,9 @@ config VIRTIO_VGA
 
 config DPCD
 bool
+
+config ATI_VGA
+bool
+default y if PCI_DEVICES
+depends on PCI
+select VGA
diff --git a/hw/display/Makefile.objs b/hw/display/Makefile.objs
index 576fca4eb6..dbd453ab1b 100644
--- a/hw/display/Makefile.objs
+++ b/hw/display/Makefile.objs
@@ -51,3 +51,5 @@ virtio-gpu-3d.o-cflags := $(VIRGL_CFLAGS)
 virtio-gpu-3d.o-libs += $(VIRGL_LIBS)
 obj-$(CONFIG_DPCD) += dpcd.o
 obj-$(CONFIG_XLNX_ZYNQMP_ARM) += xlnx_dp.o
+
+obj-$(CONFIG_ATI_VGA) += ati.o ati_2d.o ati_dbg.o
diff --git a/hw/display/ati.c b/hw/display/ati.c
new file mode 100644
index 00..8322f52aff
--- /dev/null
+++ b/hw/display/ati.c
@@ -0,0 +1,865 @@
+/*
+ * QEMU ATI SVGA emulation
+ *
+ * Copyright (c) 2019 BALATON Zoltan
+ *
+ * This work is licensed under the GNU GPL license version 2 or later.
+ */
+
+/*
+ * WARNING:
+ * This is very incomplete and only enough for Linux console and some
+ * unaccelerated X output at the moment.
+ * Currently it's little more than a frame buffer with minimal functions,
+ * other more advanced features of the hardware are yet to be implemented.
+ * We only aim for Rage 128 Pro (and some RV100) and 2D only at first,
+ * No 3D at all yet (maybe after 2D works, but feel free to improve it)
+ */
+
+#include "ati_int.h"
+#include "ati_regs.h"
+#include "vga_regs.h"
+#include "qemu/log.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "hw/hw.h"
+#include "ui/console.h"
+#include "trace.h"
+
+#define ATI_DEBUG_HW_CURSOR 0
+
+static const struct {
+const char *name;
+uint16_t dev_id;
+} ati_model_aliases[] = {
+{ "rage128p", PCI_DEVICE_ID_ATI_RAGE128_PF },
+{ "rv100", PCI_DEVICE_ID_ATI_RADEON_QY },
+};
+
+enum { VGA_MODE, EXT_MODE };
+
+static void ati_vga_switch_mode(ATIVGAState *s)
+{
+DPRINTF("%d -> %d\n",
+s->mode, !!(s->regs.crtc_gen_cntl & CRTC2_EXT_DISP_EN));
+if (s->regs.crtc_gen_cntl & CRTC2_EXT_DISP_EN) {
+/* Extended mode enabled */
+s->mode = EXT_MODE;
+if (s->regs.crtc_gen_cntl & CRTC2_EN) {
+/* CRT controller enabled, use CRTC values */
+uint32_t offs = s->regs.crtc_offset & 0x07ff;
+int stride = (s->regs.crtc_pitch & 0x7ff) * 8;
+int bpp = 0;
+int h, v;
+
+if (s->regs.crtc_h_total_disp == 0) {
+s->regs.crtc_h_total_disp = ((640 / 8) - 1) << 16;
+

[Qemu-devel] [PATCH v7 2/2] mips_fulong2e: Add on-board graphics chip

2019-03-10 Thread BALATON Zoltan

Add (partial) emulation of the on-board GPU of the machine. This
allows the PMON2000 firmware to run and should also work with Linux
console but probably not with X yet.

Signed-off-by: BALATON Zoltan 
Reviewed-by: Philippe Mathieu-Daudé
Tested-by: Philippe Mathieu-Daudé
Reviewed-by: Aleksandar Markovic 
---
v7:
- set vgamem_mb explicitly to match board instead of relying on default

 hw/mips/mips_fulong2e.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/mips/mips_fulong2e.c b/hw/mips/mips_fulong2e.c
index fbbc543eed..2bdb766ed1 100644
--- a/hw/mips/mips_fulong2e.c
+++ b/hw/mips/mips_fulong2e.c
@@ -287,6 +287,7 @@ static void mips_fulong2e_init(MachineState *machine)
 I2CBus *smbus;
 MIPSCPU *cpu;
 CPUMIPSState *env;
+DeviceState *dev;
 
 /* init CPUs */
 cpu = MIPS_CPU(cpu_create(machine->cpu_type));
@@ -347,6 +348,12 @@ static void mips_fulong2e_init(MachineState *machine)
 vt82c686b_southbridge_init(pci_bus, FULONG2E_VIA_SLOT, env->irq[5],
&smbus, &isa_bus);
 
+/* GPU */
+dev = DEVICE(pci_create(pci_bus, -1, "ati-vga"));
+qdev_prop_set_uint32(dev, "vgamem_mb", 16);
+qdev_prop_set_uint16(dev, "x-device-id", 0x5159);
+qdev_init_nofail(dev);
+
 /* Populate SPD eeprom data */
 spd_data = spd_data_generate(DDR, ram_size, &err);
 if (err) {
-- 
2.13.7

[Qemu-devel] [PATCH v7 0/2] Basic ATI VGA emulation

2019-03-10 Thread BALATON Zoltan

Rebase on master to fix build with new Kconfig based stuff.

BALATON Zoltan (2):
  hw/display: Add basic ATI VGA emulation
  mips_fulong2e: Add on-board graphics chip

 hw/display/Kconfig   |   6 +
 hw/display/Makefile.objs |   2 +
 hw/display/ati.c | 865 +++
 hw/display/ati_2d.c  | 167 +
 hw/display/ati_dbg.c | 259 ++
 hw/display/ati_int.h |  96 ++
 hw/display/ati_regs.h| 461 +
 hw/display/trace-events  |   4 +
 hw/mips/mips_fulong2e.c  |   7 +
 vl.c |   1 +
 10 files changed, 1868 insertions(+)
 create mode 100644 hw/display/ati.c
 create mode 100644 hw/display/ati_2d.c
 create mode 100644 hw/display/ati_dbg.c
 create mode 100644 hw/display/ati_int.h
 create mode 100644 hw/display/ati_regs.h

-- 
2.13.7

[Qemu-devel] [Bug 1818483] Re: qemu user mode does not support binfmt_misc config with flags include "P"

2019-03-10 Thread YunQiang Su

@Peter: Luyou and I are working on passing the information about whether
the P flag is enabled to QEMU through an environment variable or auxval.
However, we have not yet found the right way to do this from binfmt_misc.

In fact, QEMU currently tries to process the O flag, and that does not work
at all. When you install the qemu-user-static package from Debian/Ubuntu,
the O flag is enabled, yet
   execfd = qemu_getauxval(AT_EXECFD);
always returns 0.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1818483

Title:
  qemu user mode does not support binfmt_misc config with flags include
  "P"

Status in QEMU:
  New

Bug description:
  Hi Sir:
  During our tests in a chroot environment with qemu-user-static, some test
  cases failed because programs printed warnings containing an unexpected
  full path name.
  For example, in the test module "Devscripts",
  the test item for a broken tarball expected the warning info:
  
  but the output was:
  
  The cause is that the binfmt_misc config file was set not to pass argv[0].
  For example, typing the command "tar" after chroot gives:
  ==
  lpeng@lpeng-VirtualBox:~/projects_lpeng/qemu/mips_2/sid$ sudo chroot .
  [sudo] password for lpeng: 
  root@lpeng-VirtualBox:/# tar
  /bin/tar: You must specify one of the '-Acdtrux', '--delete' or 
'--test-label' options
  Try '/bin/tar --help' or '/bin/tar --usage' for more information.
  root@lpeng-VirtualBox:/# 
  ===

  By adding log output to main() in qemu/linux-user/main.c,
  we found that the original input command had been changed without QEMU
  knowing about it. We got these input args:
  argv_0/usr/bin/qemu-mips64el-static---
  argv_1/bin/tar---
  argv_2NULL---

  As a next step we set flags=P in the corresponding config under the folder
  /proc/sys/fs/binfmt_misc, and binfmt_misc then passed argv[0] to QEMU.
  But chroot could not start bash, because current QEMU does not account
  for this unexpected extra "argv[0]".

  
  After temporarily modifying the QEMU code to handle the new argv list, we
  got the following input args, where the original input command starts at
  argv[2]:
  argv_0/usr/bin/qemu-mips64el-static---
  argv_1/bin/tar---
  argv_2tar---
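
  The two argv layouts shown above can be sketched in C as follows. This is
  only an illustration under stated assumptions: split_binfmt_argv() and
  struct guest_args are hypothetical names and are not part of QEMU, and the
  p_flag parameter stands in for the token that QEMU does not currently
  receive from binfmt_misc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper for illustration only; this is not QEMU code. */
struct guest_args {
    const char *exec_path;   /* full path of the guest binary (argv[1]) */
    const char *guest_argv0; /* what the guest should see as its argv[0] */
    int first_arg;           /* index in argv of the first argument after argv[0] */
};

/*
 * Split the argv that binfmt_misc hands to the interpreter.
 *
 * Without the P flag: argv = { interpreter, full_path, args... }
 * With the P flag:    argv = { interpreter, full_path, original_argv0, args... }
 *
 * p_flag is an assumption: the caller must learn it from somewhere else.
 * Returns false if argv is too short for the chosen layout.
 */
static bool split_binfmt_argv(int argc, char **argv, bool p_flag,
                              struct guest_args *out)
{
    int fixed = p_flag ? 3 : 2; /* entries consumed before the guest's own args */

    assert(argv != NULL && out != NULL);
    if (argc < fixed) {
        return false;
    }
    out->exec_path = argv[1];
    out->guest_argv0 = p_flag ? argv[2] : argv[1];
    out->first_arg = fixed;
    return true;
}
```

  With the P layout from the log above, guest_argv0 would be "tar" rather
  than "/bin/tar", which is what the failing tests expect to see in warnings.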

  We need the original input from the command line. Would it be possible for
  binfmt_misc to pass one additional argument or environment variable to QEMU
  as a token for the binfmt_misc flag, so that QEMU can decide how to parse
  the input args?
  Looking forward to your suggestions.

  Thanks
  luyou

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1818483/+subscriptions

Re: [Qemu-devel] converting build system to Meson?

2019-03-10 Thread Markus Armbruster

Daniel P. Berrangé  writes:

[...]
> As long term contributors we've built enough knowledge to
> QEMU to consider our build system attractive or even "simple",

The words that come to my mind aren't "attractive" or "simple", but
"impressively clever", "brittle", and "uh, can I hack on something else
today?"

I've obviously failed to develop enough of a Stockholm Syndrome in 8+
years.

> which skews our view of the benefits of alternatives to people
> without this tribal knowledge.

[Qemu-devel] [Bug 1819343] [NEW] Qcow2 image stuck as locked after host crash

2019-03-10 Thread Tim Schuster

Public bug reported:

After a host crash, the qcow2 image of the VM, stored on a remote NFS
share, has become inaccessible. Libvirt/QEMU reports that 'failed to get
"write" lock\nIs another process using the image
[/path/nfs/image.qcow2]?'. No process is accessing the image from either
host or the network share side. There is no obvious way in qemu-img to
force unlocking the file or repair the image (attempting a qemu-img
check with -r all results in qemu-img complaining about the lock and
being unable to do force-share=on on anything but readonly images).

I'm currently attempting to fix this by converting the image via 'qemu-
img convert -U -f qcow2 -O qcow2 image.qcow2 image_2.gcow2', though this
will likely take some time.

Using QEMU 3.1.0

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1819343

Title:
  Qcow2 image stuck as locked after host crash

Status in QEMU:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1819343/+subscriptions

Re: [Qemu-devel] converting build system to Meson?

2019-03-10 Thread Markus Armbruster

Peter Maydell  writes:

> On Thu, 7 Mar 2019 at 06:39, Thomas Huth  wrote:
>> On 06/03/2019 19.12, Paolo Bonzini wrote:
>> > lately I have been thinking of converting the QEMU build system to
>> > Meson.  Meson is a relatively new build system that can replace
>> > Autotools or hand-written Makefiles such as QEMU; as a die-hard
>> > Autotools fan, I must say that Meson is by far better than anything else
>> > that has ever tried to replace Autotools, and actually has the potential
>> > to do so.
>> >
>> > Advantages of Meson that directly matter for QEMU include:[...]
>>
>> I'm not objecting a new build system per se, but could you elaborate on
>>  problems of the current QEMU build system that will be fixed by this
>> change? Since apart from some minor glitches (with the *.mak file
>> dependencies for example), the current build system seems to work quite
>> well for me ... so at least I currently don't feel enough pain yet to do
>> such a big step, just because there is another new cool build system
>> around...
>
> Yes, that tends to be my view. Our current build system:
>  * has no dependencies that are problematic for older hosts
>(contrast Meson, which needs Python 3.5, even if we take
>the drastic step of shipping an entire build tool along
>with QEMU; OSX python is 2.7 still)

By the time Meson is ready for us, and we're ready for Meson, chances
are even OS-X has moved on from Python 2.

https://pythonclock.org/

>  * is not particularly hard to deal with for the common cases
>("add new source file" is straightforward)

Yes.  Quite an achievement.

>  * covers all our requirements as far as I'm aware
>(whereas you've listed a couple of places where Meson
>would need changes/extensions to support things we do already)
>  * is generally flexible enough to be hackable to deal with odd
>cases (it has escape mechanisms to generic-programmability,
>even if they're ugly and awkward)

Yes, it's hackable, but it takes quite a hacker to hack it.  While it's
reasonably easy to do simple things in it with basic voodoo skills, the
learning curve goes up like the Zimbabwean inflation rate after that.  I
got plenty of experience in Make, and consider myself pretty fluent, yet
I find myself running to Paolo for help.

> So I think we'd need a more compelling reason to move right now.
> (This might change in the future, eg if Meson catches on to the
> extent that everybody is using it and competitors like CMake are
> more obviously eclipsed by it, in the way that git took over
> from svn and relegated mercurial and bzr to obscurity.)
>
> thanks
> -- PMM

Re: [Qemu-devel] [PATCH] tests: test-bdrv-graph-mod: fix memory leak

2019-03-10 Thread Philippe Mathieu-Daudé

On 3/10/19 12:34 PM, Li Qiang wrote:
> Fixes: 2dbfadf

  ^ Please keep tags together (with Signed-off-by, ...)

> Spotted by ASAN when 'make check'.

I'm not a native English speaker, but I'd say:

Spotted by ASAN [with] 'make check'.

or

Spotted by ASAN [while running] 'make check'.

Here goes:

"Fixes: 2dbfadf"

> Signed-off-by: Li Qiang 
> ---
>  tests/test-bdrv-graph-mod.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/tests/test-bdrv-graph-mod.c b/tests/test-bdrv-graph-mod.c
> index 458dfa6661..8bf0fe735d 100644
> --- a/tests/test-bdrv-graph-mod.c
> +++ b/tests/test-bdrv-graph-mod.c
> @@ -117,6 +117,7 @@ static void test_update_perm_tree(void)
>  
>  bdrv_unref(bs);
>  blk_unref(root);
> +error_free(local_err);

Reviewed-by: Philippe Mathieu-Daudé 

>  }
>  
>  /*
>
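
The kind of leak fixed by the one-line change above can be sketched in plain
C. This is a minimal illustration, assuming a hypothetical DemoError type and
demo_* helpers; it does not use QEMU's real Error API, and the live_errors
counter exists only to make the ownership visible.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an error object; not QEMU's Error type. */
typedef struct DemoError {
    char *msg;
} DemoError;

/* Number of error objects currently allocated (for illustration only). */
static int live_errors;

/* Allocate an error and hand ownership to the caller through errp. */
static void demo_error_set(DemoError **errp, const char *msg)
{
    DemoError *e;
    size_t n = strlen(msg) + 1;

    if (errp == NULL) {
        return;
    }
    e = malloc(sizeof(*e));
    if (e == NULL) {
        return;
    }
    e->msg = malloc(n);
    if (e->msg == NULL) {
        free(e);
        return;
    }
    memcpy(e->msg, msg, n);
    *errp = e;
    live_errors++;
}

/* Release an error object; accepts NULL like free() does. */
static void demo_error_free(DemoError *e)
{
    if (e == NULL) {
        return;
    }
    free(e->msg);
    free(e);
    live_errors--;
}

/* An operation that fails on purpose and reports why. */
static int demo_op_expected_to_fail(DemoError **errp)
{
    demo_error_set(errp, "permission conflict");
    return -1;
}

/* A test that expects the failure must still free the error it received. */
static int demo_test(void)
{
    DemoError *local_err = NULL;
    int ret = demo_op_expected_to_fail(&local_err);

    demo_error_free(local_err); /* without this call the object leaks */
    return ret;
}
```

The point of the sketch is ownership: once an error has been stored in a
local variable, the function that holds it has to free it, even when the
failure was the expected outcome of the test.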

Re: [Qemu-devel] [PATCH] hw/riscv/virt: re-add machine-specific compatible string to /soc/ node

2019-03-10 Thread Bin Meng

Hi Lukas,

On Sun, Mar 10, 2019 at 9:44 PM Auer, Lukas
 wrote:
>
> Hi Bin,
>
> On Sun, 2019-03-10 at 09:07 +0800, Bin Meng wrote:
> > Hi Lukas,
> >
> > On Mon, Feb 11, 2019 at 6:13 AM Lukas Auer
> >  wrote:
> > > Re-add the previous compatible string "riscv-virtio-soc" to the soc
> > > device tree node to allow U-Boot and Linux to bind machine-specific
> > > drivers to it. The current compatible string "simple-bus" is
> > > retained.
> > >
> > > This is required by U-Boot to bind devices early, as part of the
> > > pre-relocation driver model.
> > >
> >
> > I see no problem with U-Boot working with current compatible string
> > "simple-bus". In fact I had planned to remove the compatible string
> > "riscv-virtio-soc" in U-Boot but did not get time to work on it.
> >
>
> It is only required if U-Boot is running in machine-mode. For
> relocation it needs to use the CLINT driver to send appropriate IPIs to
> the other harts. To be able to probe the driver, the device and its
> parent device tree node (soc) must therefore be available in the pre-
> relocation device model.
> This patch was the easiest way I could think of for achieving this. It
> could be that there is a better way of solving this.
>

I tested your SMP U-Boot series in both M-mode and S-mode, using a 4
core 'virt' target. Works fine. I am using QEMU 3.1.0 so it is
"simple-bus".

Regards,
Bin
