Re: [RFC PATCH 0/2] QEMU/openbios: PPC Software TLB support in the G4 family

2021-11-26 Thread Mark Cave-Ayland

On 24/11/2021 22:00, Fabiano Rosas wrote:


Fabiano Rosas  writes:


Hi all,

We have this bug in QEMU which indicates that we haven't been able to
run openbios on a 7450 cpu for quite a long time:

https://gitlab.com/qemu-project/qemu/-/issues/86

OK:
   $ ./qemu-system-ppc -serial mon:stdio -nographic -cpu 7410

   >> =
   >> OpenBIOS 1.1 [Nov 1 2021 20:36]
   ...

NOK:
   $ ./qemu-system-ppc -serial mon:stdio -nographic -cpu 7450 -d int
   Raise exception at fff08cc4 => 004e (00)
   QEMU: Terminated

The actual issue is straightforward. There is a non-architected
feature that QEMU has enabled by default that openbios doesn't know
about. From the user manual:

"The MPC7540 has a set of implementation-specific registers,
exceptions, and instructions that facilitate very efficient software
searching of the page tables in memory for when software table
searching is enabled (HID0[STEN] = 1). This section describes those
resources and provides three example code sequences that can be used
in a MPC7540 system for an efficient search of the translation tables
in software. These three code sequences can be used as handlers for
the three exceptions requiring access to the PTEs in the page tables
in memory in this case: instruction TLB miss, data TLB miss on load,
and data TLB miss on store exceptions."

The current state:

1) QEMU does not check HID0[STEN] and makes the feature always enabled
by setting these cpus with the POWERPC_MMU_SOFT_74xx MMU model,
instead of the generic POWERPC_MMU_32B.

2) openbios does not recognize the PVRs for those cpus and also does
not have any handlers for the software TLB exceptions (vectors 0x1000,
0x1100, 0x1200).

Some assumptions (correct me if I'm wrong please):

- openbios is the only firmware we use for the following cpus: 7441,
7445, 7450, 7451, 7455, 7457, 7447, 7447a, 7448.
- without openbios, we cannot have a guest running on these cpus.

So to bring 7450 back to life we would need to either:

a) find another firmware/guest OS code that supports the feature;

b) implement the switching of the feature in QEMU and have the guest
code enable it only when supported. That would take some fiddling with
the MMU code to: merge POWERPC_MMU_SOFT_74xx into POWERPC_MMU_32B,
check the HID0[STEN] bit, figure out how to switch from HW TLB miss to
SW TLB miss on demand, block access to the TLBMISS register (and
others) when the feature is off, and so on;

c) leave the feature enabled in QEMU and implement the software TLB
miss handlers in openbios. The UM provides sample code, so this is
easy;

d) remove support for software TLB search for the 7450 family and
switch the cpus to the POWERPC_MMU_32B model. This is by far the
easiest solution, but could cause problems for any (which?) guest OS
code that actually uses the feature. All of the existing code for the
POWERPC_MMU_SOFT_74xx MMU model would probably be removed since it
would be dead code then;
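
To make option (b) a little more concrete, here is a minimal sketch of the decision it would need on a TLB miss: look at HID0[STEN] and either let the (emulated) hardware walk the page table or raise one of the three software miss vectors. This is not QEMU code. All toy_* names are hypothetical, the HID0[STEN] mask is an assumption to be checked against the user manual, and the vector numbers are the ones quoted above, mapped in the order the manual lists the exceptions.

```c
#include <stdint.h>

/* Assumed mask for HID0[STEN]; confirm against the user manual. */
#define TOY_HID0_STEN 0x02000000u

enum toy_access { TOY_IFETCH, TOY_LOAD, TOY_STORE };

/*
 * Return the exception vector to raise on a TLB miss, or 0 when the
 * (emulated) hardware should search the page table itself.
 */
static uint32_t toy_tlb_miss_vector(uint32_t hid0, enum toy_access access)
{
    if (!(hid0 & TOY_HID0_STEN)) {
        return 0; /* software table search disabled: hardware walk */
    }

    switch (access) {
    case TOY_IFETCH:
        return 0x1000; /* instruction TLB miss */
    case TOY_LOAD:
        return 0x1100; /* data TLB miss on load */
    case TOY_STORE:
        return 0x1200; /* data TLB miss on store */
    }
    return 0;
}
```

The current behaviour described in (1) corresponds to always taking the switch, regardless of what the guest wrote to HID0.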

Option (c) seemed to me like a good compromise so this is a patch
series for openbios doing that and also adding the necessary PVRs so
we can get a working guest with these cpus without too much effort.

I have also a patch for QEMU adding basic sanity check tests for the
7400 and 7450 families. I'll send that separately to the QEMU ml.

Fabiano Rosas (2):
   ppc: Add support for MPC7450 software TLB miss interrupts
   ppc: Add PVRs for the MPC7450 family

  arch/ppc/qemu/init.c  |  52 ++
  arch/ppc/qemu/start.S | 236 +-
  2 files changed, 285 insertions(+), 3 deletions(-)


(Adding Mark because his email got somehow dropped from the original
message)



So with these patches in OpenBIOS we could get a bit further and call
into the Linux kernel using the same image as the one used for the
7400. However there seems to be no support for the 7450 software TLB in
the kernel. There are only handlers for the 4xx, 8xx and 603 which are
different code altogether. There's no mention of the TLBMISS and
PTEHI/LO registers in the code as well.

Do we know of any guest OS that implements the 7450 software TLB at
vectors 0x1000, 0x1100 and 0x1200? Otherwise replacing the
POWERPC_MMU_SOFT_74xx model with POWERPC_MMU_32B might be the only way
of getting an OS to run in the 7450 family.


My experience of anything other than the default CPUs used on the PPC Mac machines is 
basically zero, so you're certainly in new territory :)


I could live with your proposed solution c) although it would be nice to guard the 
extra vectors so that they remain uninitialised for the non-7450 CPUs. My main 
question is if the kernel itself doesn't support software TLBs then does adding the 
new code help at all? Or are you eventually planning for solution b) to improve 
QEMU's 7450 CPU emulation for developers without real hardware?



ATB,

Mark.



Re: [PATCH 1/2] block-backend: Retain permissions after migration

2021-11-26 Thread Peng Liang via
On 11/25/2021 9:53 PM, Hanna Reitz wrote:
> After migration, the permissions the guest device wants to impose on its
> BlockBackend are stored in blk->perm and blk->shared_perm.  In
> blk_root_activate(), we take our permissions, but keep all shared
> permissions open by calling `blk_set_perm(blk->perm, BLK_PERM_ALL)`.
> 
> Only afterwards (immediately or later, depending on the runstate) do we
> restrict the shared permissions by calling
> `blk_set_perm(blk->perm, blk->shared_perm)`.  Unfortunately, our first
> call with shared_perm=BLK_PERM_ALL has overwritten blk->shared_perm to
> be BLK_PERM_ALL, so this is a no-op and the set of shared permissions is
> not restricted.
> 
> Fix this bug by saving the set of shared permissions before invoking
> blk_set_perm() with BLK_PERM_ALL and restoring it afterwards.
> 
> Fixes: 5f7772c4d0cf32f4e779fcd5a69ae4dae24aeebf
>        ("block-backend: Defer shared_perm tightening migration
>        completion")
> Reported-by: Peng Liang 
> Signed-off-by: Hanna Reitz 
> ---
>  block/block-backend.c | 11 +++
>  1 file changed, 11 insertions(+)
> 

Thanks for your patch!

Tested-by: Peng Liang 
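
For readers skimming the thread, the bug described in the commit message can be reproduced with a small stand-alone model. This is only a sketch under stated assumptions: the toy_* names, the struct layout and the TOY_PERM_ALL value are made up for illustration, and toy_set_perm() merely imitates the described behaviour of blk_set_perm() (it applies the permissions and also records them in the object).

```c
#include <stdint.h>

#define TOY_PERM_ALL 0x1fu /* stand-in for BLK_PERM_ALL, value illustrative */

struct toy_blk {
    uint64_t perm;             /* permissions the device wants (stored) */
    uint64_t shared_perm;      /* permissions it is willing to share (stored) */
    uint64_t effective_shared; /* shared permissions currently applied */
};

/* Applies the permissions and, as a side effect, records them. */
static void toy_set_perm(struct toy_blk *blk, uint64_t perm, uint64_t shared)
{
    blk->perm = perm;
    blk->shared_perm = shared;
    blk->effective_shared = shared;
}

static void toy_activate_buggy(struct toy_blk *blk)
{
    /* Take our permissions but keep everything shared for now. */
    toy_set_perm(blk, blk->perm, TOY_PERM_ALL);
    /* Later: meant to tighten, but shared_perm was overwritten above. */
    toy_set_perm(blk, blk->perm, blk->shared_perm);
}

static void toy_activate_fixed(struct toy_blk *blk)
{
    uint64_t saved_shared_perm = blk->shared_perm;

    toy_set_perm(blk, blk->perm, TOY_PERM_ALL);
    blk->shared_perm = saved_shared_perm; /* undo the overwrite */
    /* Later: the tightening now uses the value the device asked for. */
    toy_set_perm(blk, blk->perm, blk->shared_perm);
}
```

In the buggy variant the second call is a no-op and the shared set stays wide open; in the fixed variant it ends up at the value the device originally requested.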




Re: [RFC PATCH v2 21/30] hw/intc: Add LoongArch extioi interrupt controller(EIOINTC)

2021-11-26 Thread Mark Cave-Ayland

On 25/11/2021 08:20, yangxiaojuan wrote:


Hi Mark,

On 11/11/2021 10:49 PM, Mark Cave-Ayland wrote:

On 11/11/2021 01:35, Xiaojuan Yang wrote:


This patch realizes the EIOINTC interrupt controller.

Signed-off-by: Xiaojuan Yang 
Signed-off-by: Song Gao 
---
   hw/intc/Kconfig|   3 +
   hw/intc/loongarch_extioi.c | 570 +
   hw/intc/meson.build|   1 +
   hw/loongarch/Kconfig   |   1 +
   include/hw/intc/loongarch_extioi.h |  99 +
   include/hw/loongarch/loongarch.h   |   1 +
   6 files changed, 675 insertions(+)
   create mode 100644 hw/intc/loongarch_extioi.c
   create mode 100644 include/hw/intc/loongarch_extioi.h

diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig
index c0dc12dfa0..a2d9efd5aa 100644
--- a/hw/intc/Kconfig
+++ b/hw/intc/Kconfig
@@ -82,3 +82,6 @@ config LOONGARCH_PCH_MSI
   select MSI_NONBROKEN
   bool
   select UNIMP
+
+config LOONGARCH_EXTIOI
+bool
diff --git a/hw/intc/loongarch_extioi.c b/hw/intc/loongarch_extioi.c
new file mode 100644
index 00..592cd8d1e2
--- /dev/null
+++ b/hw/intc/loongarch_extioi.c
@@ -0,0 +1,570 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Loongson 3A5000 ext interrupt controller emulation
+ *
+ * Copyright (C) 2021 Loongson Technology Corporation Limited
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/module.h"
+#include "qemu/log.h"
+#include "hw/irq.h"
+#include "hw/sysbus.h"
+#include "hw/loongarch/loongarch.h"
+#include "hw/qdev-properties.h"
+#include "exec/address-spaces.h"
+#include "hw/intc/loongarch_extioi.h"
+#include "migration/vmstate.h"
+
+#define DEBUG_APIC 0
+
+#define DPRINTF(fmt, ...) \
+do { \
+if (DEBUG_APIC) { \
+fprintf(stderr, "APIC: " fmt , ## __VA_ARGS__); \
+} \
+} while (0)


Again please use trace-events instead of DPRINTF().


+static void extioi_update_irq(void *opaque, int irq_num, int level)
+{
+loongarch_extioi *s = opaque;
+uint8_t  ipnum, cpu;
+unsigned long found1, found2;
+
+ipnum = s->sw_ipmap[irq_num];
+cpu   = s->sw_coremap[irq_num];
+if (level == 1) {
+if (test_bit(irq_num, (void *)s->en_reg8) == false) {
+return;
+}
+bitmap_set((void *)s->coreisr_reg8[cpu], irq_num, 1);
+found1 = find_next_bit((void *)&(s->sw_ipisr[cpu][ipnum]),
+   EXTIOI_IRQS, 0);
+bitmap_set((void *)&(s->sw_ipisr[cpu][ipnum]), irq_num, 1);
+
+if (found1 >= EXTIOI_IRQS) {
+qemu_set_irq(s->parent_irq[cpu][ipnum], level);
+}
+} else {
+bitmap_clear((void *)s->coreisr_reg8[cpu], irq_num, 1);
+found1 = find_next_bit((void *)&(s->sw_ipisr[cpu][ipnum]),
+   EXTIOI_IRQS, 0);
+bitmap_clear((void *)&(s->sw_ipisr[cpu][ipnum]), irq_num, 1);
+found2 = find_next_bit((void *)&(s->sw_ipisr[cpu][ipnum]),
+   EXTIOI_IRQS, 0);
+
+if ((found1 < EXTIOI_IRQS) && (found2 >= EXTIOI_IRQS)) {
+qemu_set_irq(s->parent_irq[cpu][ipnum], level);
+}
+}
+}
+
+static void extioi_setirq(void *opaque, int irq, int level)
+{
+loongarch_extioi *s = opaque;
+extioi_update_irq(s, irq, level);
+}
+
+static void extioi_handler(void *opaque, int irq, int level)
+{
+loongarch_extioi *extioi = (loongarch_extioi *)opaque;
+
+qemu_set_irq(extioi->irq[irq], level);
+}
+
+static uint32_t extioi_readb(void *opaque, hwaddr addr)
+{
+loongarch_extioi *state = opaque;


Add a QOM cast here.


+unsigned long offset, reg_count;
+uint8_t ret;
+int cpu;
+
+offset = addr & 0x;
+
+if ((offset >= EXTIOI_ENABLE_START) && (offset < EXTIOI_ENABLE_END)) {
+reg_count = (offset - EXTIOI_ENABLE_START);
+ret = state->en_reg8[reg_count];
+} else if ((offset >= EXTIOI_BOUNCE_START) &&
+   (offset < EXTIOI_BOUNCE_END)) {
+reg_count = (offset - EXTIOI_BOUNCE_START);
+ret = state->bounce_reg8[reg_count];
+} else if ((offset >= EXTIOI_COREISR_START) &&
+   (offset < EXTIOI_COREISR_END)) {
+reg_count = ((offset - EXTIOI_COREISR_START) & 0x1f);
+cpu = ((offset - EXTIOI_COREISR_START) >> 8) & 0x3;
+ret = state->coreisr_reg8[cpu][reg_count];
+} else if ((offset >= EXTIOI_IPMAP_START) &&
+   (offset < EXTIOI_IPMAP_END)) {
+reg_count = (offset - EXTIOI_IPMAP_START);
+ret = state->ipmap_reg8[reg_count];
+} else if ((offset >= EXTIOI_COREMAP_START) &&
+   (offset < EXTIOI_COREMAP_END)) {
+reg_count = (offset - EXTIOI_COREMAP_START);
+ret = state->coremap_reg8[reg_count];
+} else if ((offset >= EXTIOI_NODETYPE_START) &&
+   (offset < EXTIOI_NODETYPE_END)) {
+reg_count = (offset - EXTIOI_NODETYPE_START);
+ret = state->nodetype_reg8[reg_count];
+}
+
+DPRINTF("readb reg 0x" TARGET_FMT_plx " = %x\n", addr, ret);
+retu

Re: [OpenBIOS] Re: [RFC PATCH 0/2] QEMU/openbios: PPC Software TLB support in the G4 family

2021-11-26 Thread Cédric Le Goater

Hello,

On 11/25/21 10:38, Segher Boessenkool wrote:

Hi!

On Thu, Nov 25, 2021 at 01:45:00AM +0100, BALATON Zoltan wrote:

As for guests, those running on the said PowerMac G4 should have support
for these CPUs so maybe you can try some Mac OS X versions (or maybe


OSX uses hardware pagetables.


MorphOS but that is not the best for debugging as there's no source
available nor any help from its owners but just to see if it boots it may
be sufficient, it should work on real PowerMac G4).


I have no idea what MorphOS uses, but I bet HPT as well.  That is
because HPT is fastest in general.  Software TLB reloads are good in
special cases only; the most common is real-time OSes, which can use its
lower guaranteed latency for some special address spaces (and can have a
simpler address map in general).


The support was added to QEMU knowing that Linux didn't handle soft TLBs.
And the commit says that it was kept disabled initially. I guess that was
broken these last years.

C.


$ git show 7dbe11acd807
commit 7dbe11acd807
Author: Jocelyn Mayer 
Date:   Mon Oct 1 05:16:57 2007 +

    Handle all MMU models in switches, even if it's just to abort because of lack
      of supporting code.
    Implement 74xx software TLB model.
    Keep 74xx with software TLB disabled, as Linux is not able to handle TLB miss
      on those processors.

C.



Re: [RFC PATCH 0/2] QEMU/openbios: PPC Software TLB support in the G4 family

2021-11-26 Thread Cédric Le Goater

On 11/26/21 09:01, Mark Cave-Ayland wrote:

My experience of anything other than the default CPUs used on the PPC Mac 
machines is basically zero, so you're certainly in new territory :)

I could live with your proposed solution c) although it would be nice to guard the extra vectors so that they remain uninitialised for the non-7450 CPUs. My main question is if the kernel itself doesn't support software TLBs then does adding the new code help at all? 


Yes, it helps to boot Linux and MacOS (9 and 10) on those CPUs, but you still
need to switch the MMU model to POWERPC_MMU_32B in QEMU.

Or are you eventually planning for solution b) to improve QEMU's 7450 CPU 

Re: [RFC PATCH 0/2] QEMU/openbios: PPC Software TLB support in the G4 family

2021-11-26 Thread Mark Cave-Ayland

On 26/11/2021 08:40, Cédric Le Goater wrote:


On 11/26/21 09:01, Mark Cave-Ayland wrote:

My experience of anything other than the default CPUs used on the PPC Mac machines 
is basically zero, so you're certainly in new territory :)


I could live with your proposed solution c) although it would be nice to guard the 
extra vectors so that they remain uninitialised for the non-7450 CPUs. My main 
question is if the kernel itself doesn't support software TLBs then does adding the 
new code help at all? 


Yes, it helps to boot Linux and MacOS (9 and 10) on those CPUs, but you still
need to switch the MMU model to POWERPC_MMU_32B in QEMU.

Or are you eventually p

Re: [PATCH 1/3] ppc/pnv: Tune the POWER9 PCIe Host bridge model

2021-11-26 Thread Cédric Le Goater

On 11/16/21 18:01, Frederic Barrat wrote:

The PHB v4 found on POWER9 doesn't request any LSI, so let's clear the
Interrupt Pin register in the config space so that the model matches
the hardware.

If we don't, then we inherit from the default pcie root bridge, which
requests a LSI. And because we don't map it correctly in the device
tree, all PHBs allocate the same bogus hw interrupt. We end up with
inconsistent interrupt controller (xive) data. The problem goes away
if we don't allocate the LSI in the first place.

Signed-off-by: Frederic Barrat 
---
  hw/pci-host/pnv_phb4.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index 5c375a9f28..1659d55b4f 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -1234,10 +1234,13 @@ static void pnv_phb4_reset(DeviceState *dev)
  PCIDevice *root_dev = PCI_DEVICE(&phb->root);
  
  /*

- * Configure PCI device id at reset using a property.
+ * Configure the PCI device at reset:
+ *   - set the Vendor and Device ID for the root bridge
+ *   - no LSI
   */
  pci_config_set_vendor_id(root_dev->config, PCI_VENDOR_ID_IBM);
  pci_config_set_device_id(root_dev->config, phb->device_id);
+pci_config_set_interrupt_pin(root_dev->config, 0);
  }
  
  static const char *pnv_phb4_root_bus_path(PCIHostState *host_bridge,




FYI, I am seeing an issue with FreeBSD when booting from iso :

  
https://download.freebsd.org/ftp/snapshots/powerpc/powerpc64/ISO-IMAGES/14.0/FreeBSD-14.0-CURRENT-powerpc-powerpc64-20211028-4827bf76bce-250301-disc1.iso.xz

Thanks,

C.

SIGTERM received, booting...
KDB: debugger backends: ddb
KDB: current backend: ddb
---<>---
Copyright (c) 1992-2021 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 14.0-CURRENT #0 main-n250301-4827bf76bce: Thu Oct 28 06:53:58 UTC 2021

r...@releng1.nyi.freebsd.org:/usr/obj/usr/src/powerpc.powerpc64/sys/GENERIC64 powerpc
FreeBSD clang version 12.0.1 (g...@github.com:llvm/llvm-project.git llvmorg-12.0.1-0-gfed41342a82f)
WARNING: WITNESS option enabled, expect reduced performance.
VT: init without driver.
ofw_initrd: initrd loaded at 0x2800-0x28c7928c
cpu0: IBM POWER9 revision 2.0, 1000.00 MHz
cpu0: Features dc007182
cpu0: Features2 bee0
real memory  = 1014484992 (967 MB)
avail memory = 117903360 (112 MB)
random: registering fast source PowerISA DARN random number generator
random: fast provider: "PowerISA DARN random number generator"
arc4random: WARNING: initial seeding bypassed the cryptographic random device because it was not yet seeded and the knob 'bypass_before_seeding' was enabled.
random: entropy device external interface
kbd0 at kbdmux0
ofwbus0:  on nexus0
opal0:  irq 1048560,1048561,1048562,1048563,1048564,1048565,1048566,1048567,1048568,1048569,1048570,1048571,1048572,1048573 on ofwbus0
opal0: registered as a time-of-day clock, resolution 0.002000s
simplebus0:  mem 0x60300-0x60300 on ofwbus0
pcib0:  mem 0x600c3c000-0x600c3cfff,0x600c3-0x600c30fff on ofwbus0
pci0:  numa-domain 0 on pcib0
qemu-system-ppc64: ../hw/pci/pci.c:1487: pci_irq_handler: Assertion `0 <= irq_num && irq_num < PCI_NUM_PINS' failed.
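
As background for the assertion above, here is a small sketch of how an Interrupt Pin value of 0 relates to the range check. In PCI config space the Interrupt Pin register lives at offset 0x3d; 0 means the function uses no INTx pin and 1..4 mean INTA..INTD. The toy_* helpers below are hypothetical and only assume that the zero-based pin index is derived as "pin - 1"; they are not a claim about the root cause of the FreeBSD failure.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PCI_INTERRUPT_PIN 0x3d /* Interrupt Pin offset in config space */
#define TOY_PCI_NUM_PINS      4    /* INTA..INTD */

/* Zero-based INTx index, or -1 when the device declares no pin. */
static int toy_intx(const uint8_t *config)
{
    return (int)config[TOY_PCI_INTERRUPT_PIN] - 1;
}

/* Same range condition as the one quoted in the assertion message. */
static bool toy_irq_num_valid(int irq_num)
{
    return 0 <= irq_num && irq_num < TOY_PCI_NUM_PINS;
}
```

With the pin cleared to 0, toy_intx() yields -1, which is outside the range, so any path that still tries to raise a legacy interrupt for such a device would trip a check of this shape.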





Re: [PATCH v2] target/ppc: fix Hash64 MMU update of PTE bit R

2021-11-26 Thread Cédric Le Goater

Hello,

Curiously, I didn't get the v2 email.

On 11/26/21 02:13, David Gibson wrote:

On Thu, Nov 25, 2021 at 03:33:22PM -0300, Leandro Lupori wrote:

When updating the R bit of a PTE, the Hash64 MMU was using a wrong byte
offset, causing the first byte of the adjacent PTE to be corrupted.
This caused a panic when booting FreeBSD, using the Hash MMU.

Fixes: a2dd4e83e76b ("ppc/hash64: Rework R and C bit updates")
Signed-off-by: Leandro Lupori 


If you're introducing the constant, it would make sense to also use it
in spapr_hpte_set_r().


I agree and please add one for the C bit also since it's the same
kind of twiddling.

Thanks,

C.



---
Changes from v1:
- Add and use a new define for the byte offset of PTE bit R
---
  target/ppc/mmu-hash64.c | 2 +-
  target/ppc/mmu-hash64.h | 3 +++
  2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 19832c4b46..0968927744 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -786,7 +786,7 @@ static void ppc_hash64_set_dsi(CPUState *cs, int mmu_idx, uint64_t dar, uint64_t
  
  static void ppc_hash64_set_r(PowerPCCPU *cpu, hwaddr ptex, uint64_t pte1)

  {
-hwaddr base, offset = ptex * HASH_PTE_SIZE_64 + 16;
+hwaddr base, offset = ptex * HASH_PTE_SIZE_64 + HPTE64_R_R_BYTE_OFFSET;
  
  if (cpu->vhyp) {

  PPCVirtualHypervisorClass *vhc =
diff --git a/target/ppc/mmu-hash64.h b/target/ppc/mmu-hash64.h
index c5b2f97ff7..40bb901262 100644
--- a/target/ppc/mmu-hash64.h
+++ b/target/ppc/mmu-hash64.h
@@ -97,6 +97,9 @@ void ppc_hash64_finalize(PowerPCCPU *cpu);
  #define HPTE64_V_1TB_SEG0x4000ULL
  #define HPTE64_V_VRMA_MASK  0x4001ff00ULL
  
+/* PTE byte offsets */
+#define HPTE64_R_R_BYTE_OFFSET  14
+
  /* Format changes for ARCH v3 */
  #define HPTE64_V_COMMON_BITS0x000fULL
  #define HPTE64_R_3_0_SSIZE_SHIFT 58
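
For anyone wondering where 14 comes from, the arithmetic can be sketched as follows. A 64-bit hash PTE is 16 bytes, stored as two big-endian doublewords, so an offset of 16 is already the first byte of the next PTE. The sketch assumes the R bit is mask 0x100 and the C bit is mask 0x80 in the second doubleword; the toy_* names are hypothetical and the constants should be checked against mmu-hash64.h.

```c
#include <stdint.h>

#define TOY_PTE_SIZE 16u      /* two big-endian 64-bit doublewords */
#define TOY_HPTE_R_R 0x100ull /* assumed mask of R in the second doubleword */
#define TOY_HPTE_R_C 0x080ull /* assumed mask of C in the second doubleword */

/*
 * Byte offset, within one PTE, of the byte holding a single-bit mask of
 * the second doubleword. The mask must be non-zero.
 */
static unsigned toy_pte1_byte_offset(uint64_t mask)
{
    unsigned bit = 0;

    while (!(mask & 1)) {
        mask >>= 1;
        bit++;
    }
    /* Big-endian: byte 7 of the doubleword holds bits 0..7. */
    return 8 + (7 - bit / 8);
}

/* Set R for entry ptex by touching exactly one byte of that entry. */
static void toy_set_r(uint8_t *table, unsigned ptex)
{
    unsigned off = toy_pte1_byte_offset(TOY_HPTE_R_R);

    table[ptex * TOY_PTE_SIZE + off] |= (uint8_t)(TOY_HPTE_R_R >> 8);
}

/* Read back the second doubleword of entry ptex as a host integer. */
static uint64_t toy_load_pte1(const uint8_t *table, unsigned ptex)
{
    uint64_t v = 0;

    for (unsigned i = 0; i < 8; i++) {
        v = (v << 8) | table[ptex * TOY_PTE_SIZE + 8 + i];
    }
    return v;
}
```

Under these assumptions R lands at byte 14 and C at byte 15, which is why a separate define for the C bit, as requested above, would follow the same pattern.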







Re: [PATCH v3 04/23] multifd: Add missing documention

2021-11-26 Thread Juan Quintela
"Dr. David Alan Gilbert"  wrote:
> * Juan Quintela (quint...@redhat.com) wrote:
>> Signed-off-by: Juan Quintela 
>
> Pretty obvious, but I guess to have the complete set of comments:

Yeap.  When I was removing the used parameter, I found that we have this
function without the comment.  If we have the comments, just make them
right.

> Reviewed-by: Dr. David Alan Gilbert 

Thanks, Juan.




Re: [PATCH v3 01/23] multifd: Delete useless operation

2021-11-26 Thread Juan Quintela
"Dr. David Alan Gilbert"  wrote:
> * Juan Quintela (quint...@redhat.com) wrote:
>> "Dr. David Alan Gilbert"  wrote:
>> > * Juan Quintela (quint...@redhat.com) wrote:
>> >> We are divining by page_size to multiply again in the only use.
>> >  ^--- typo
>> >> Once there, impreve the comments.
>> >   ^--- typo
>> >> 
>> >> Signed-off-by: Juan Quintela 
>> >
>> > OK, with the typo's fixed:
>> 
>> Thanks.
>> 
>> > Reviewed-by: Dr. David Alan Gilbert 
>> >
>> > but, could you also explain the  x 2 (that's no worse than the current
>> > code); is this defined somewhere in zlib?  I thought there was a routine
>> > that told you the worst case?
>> 
>> Nowhere.
>> 
>> There are pathological cases where it can be worse.  Not clear at all
>> how much (ok, for zlib it appears that it is on the order of dozen of
>> bytes, because it marks it as uncompressed on the worst possible case),
>> For zstd, there is not a clear/fast answer when you google.
>
> For zlib:
>
> ZEXTERN uLong ZEXPORT compressBound OF((uLong sourceLen));
> /*
>  compressBound() returns an upper bound on the compressed size after
>compress() or compress2() on sourceLen bytes.  It would be used before a
>compress() or compress2() call to allocate the destination buffer.
> */

Aha, exactly what I needed.

thanks.

zstd one is called:

ZSTD_compressBound()

Added to the series.

Thanks, Juan.




Re: [PATCH v2 1/7] accel/tcg: introduce CF_NOIRQ

2021-11-26 Thread Richard Henderson

On 11/25/21 4:41 PM, Alex Bennée wrote:

Here we introduce a new compiler flag to disable the checking of exit
request (icount_decr.u32). This is useful when we want to ensure the
next block cannot be preempted by an asynchronous event.

Suggested-by: Richard Henderson
Signed-off-by: Alex Bennée

---
v2
   - split from larger patch
   - reword the check in cpu_handle_interrupt and scope to CF_NOIRQ only
---
  include/exec/exec-all.h   |  1 +
  include/exec/gen-icount.h | 21 +
  2 files changed, 18 insertions(+), 4 deletions(-)


Reviewed-by: Richard Henderson 

r~
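
A rough sketch of the idea, for readers not familiar with the flag: a translation block compiled with the flag skips the exit-request check at its start, so a block forced to execute a single instruction cannot be preempted before that instruction runs. The flag values and toy_* names below are illustrative assumptions, not QEMU's definitions, and the sketch assumes the current flags carry no instruction count in the low bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_CF_COUNT_MASK 0x000001ffu /* low bits: instruction count */
#define TOY_CF_NOIRQ      0x00020000u /* skip the exit-request check */

/* Flags for "execute exactly one instruction, uninterrupted". */
static uint32_t toy_one_insn_cflags(uint32_t curr_cflags)
{
    return 1u | TOY_CF_NOIRQ | curr_cflags;
}

/* Would a block with these flags bail out before running anything? */
static bool toy_tb_exits_early(uint32_t cflags, bool exit_requested)
{
    if (cflags & TOY_CF_NOIRQ) {
        return false; /* check suppressed for this block */
    }
    return exit_requested;
}
```

The pending request is not lost in this model; it is simply looked at again when the following, normally flagged block starts.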



[PATCH v2 1/1] MAINTAINERS: update email address of Christian Borntraeger

2021-11-26 Thread Christian Borntraeger
My borntrae...@de.ibm.com email is just a forwarder to the
linux.ibm.com address. Let us remove the extra hop to avoid
a potential source of errors.

While at it, add the relevant email addresses to mailmap.

Signed-off-by: Christian Borntraeger 
---
 .mailmap| 1 +
 MAINTAINERS | 6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/.mailmap b/.mailmap
index 8beb2f95ae28..c45d1c530144 100644
--- a/.mailmap
+++ b/.mailmap
@@ -50,6 +50,7 @@ Aleksandar Rikalo  

 Aleksandar Rikalo  
 Alexander Graf  
 Anthony Liguori  Anthony Liguori 
+Christian Borntraeger  
 Filip Bozuta  
 Frederic Konrad  
 Greg Kurz  
diff --git a/MAINTAINERS b/MAINTAINERS
index d3879aa3c12c..e19d88ca9960 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -393,7 +393,7 @@ F: target/ppc/kvm.c
 
 S390 KVM CPUs
 M: Halil Pasic 
-M: Christian Borntraeger 
+M: Christian Borntraeger 
 S: Supported
 F: target/s390x/kvm/
 F: target/s390x/ioinst.[ch]
@@ -1527,7 +1527,7 @@ S390 Machines
 -
 S390 Virtio-ccw
 M: Halil Pasic 
-M: Christian Borntraeger 
+M: Christian Borntraeger 
 S: Supported
 F: hw/char/sclp*.[hc]
 F: hw/char/terminal3270.c
@@ -1541,7 +1541,7 @@ T: git https://github.com/borntraeger/qemu.git s390-next
 L: qemu-s3...@nongnu.org
 
 S390-ccw boot
-M: Christian Borntraeger 
+M: Christian Borntraeger 
 M: Thomas Huth 
 S: Supported
 F: hw/s390x/ipl.*
-- 
2.31.1




Re: [PATCH v2 2/7] accel/tcg: suppress IRQ check for special TBs

2021-11-26 Thread Richard Henderson

On 11/25/21 4:41 PM, Alex Bennée wrote:

@@ -1738,7 +1738,7 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
  if (current_tb_modified) {
  page_collection_unlock(pages);
  /* Force execution of one insn next time.  */
-cpu->cflags_next_tb = 1 | curr_cflags(cpu);
+cpu->cflags_next_tb = 1 | CF_NOIRQ | curr_cflags(cpu);
  mmap_unlock();
  cpu_loop_exit_noexc(cpu);
  }


There's another instance in tb_invalidate_phys_page.


diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 314f8b439c..b43f92e900 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -946,7 +946,7 @@ void cpu_check_watchpoint(CPUState *cpu, vaddr addr, vaddr len,
  cpu_loop_exit(cpu);
  } else {
  /* Force execution of one insn next time.  */
-cpu->cflags_next_tb = 1 | CF_LAST_IO | curr_cflags(cpu);
+cpu->cflags_next_tb = 1 | CF_LAST_IO | CF_NOIRQ | curr_cflags(cpu);
  mmap_unlock();
  cpu_loop_exit_noexc(cpu);
  }


And a second instance in this function.


r~



Re: [OpenBIOS] Re: [RFC PATCH 0/2] QEMU/openbios: PPC Software TLB support in the G4 family

2021-11-26 Thread Segher Boessenkool
Hi!

On Fri, Nov 26, 2021 at 09:34:44AM +0100, Cédric Le Goater wrote:
> On 11/25/21 10:38, Segher Boessenkool wrote:
> >On Thu, Nov 25, 2021 at 01:45:00AM +0100, BALATON Zoltan wrote:
> >>As for guests, those running on the said PowerMac G4 should have support
> >>for these CPUs so maybe you can try some Mac OS X versions (or maybe
> >
> >OSX uses hardware pagetables.
> >
> >>MorphOS but that is not the best for debugging as there's no source
> >>available nor any help from its owners but just to see if it boots it may
> >>be sufficient, it should work on real PowerMac G4).
> >
> >I have no idea what MorphOS uses, but I bet HPT as well.  That is
> >because HPT is fastest in general.  Software TLB reloads are good in
> >special cases only; the most common is real-time OSes, which can use its
> >lower guaranteed latency for some special address spaces (and can have a
> >simpler address map in general).
> 
> The support was added to QEMU knowing that Linux didn't handle soft TLBs.
> And the commit says that it was kept disabled initially. I guess that was
> broken these last years.

Ah :-)  So when was it enabled, do you know?

> $ git show 7dbe11acd807
> commit 7dbe11acd807
> Author: Jocelyn Mayer 
> Date:   Mon Oct 1 05:16:57 2007 +
> 
> Handle all MMU models in switches, even if it's just to abort because of lack
>   of supporting code.
> Implement 74xx software TLB model.
> Keep 74xx with software TLB disabled, as Linux is not able to handle TLB miss
>   on those processors.

This is very specifically for 7450, not 7400, fwiw.  7400 is a nice
core, while 7450 is ugly and asymmetric and unbalanced as hell.  It can
be faster though ;-)


Segher



Re: [PATCH v2 1/3] linux-user: Move target_signal.h generic definitions to generic/signal.h

2021-11-26 Thread Richard Henderson

On 11/26/21 3:23 AM, Song Gao wrote:

No code change

Suggested-by: Richard Henderson
Signed-off-by: Song Gao
Reviewed-by: Laurent Vivier
---
  linux-user/aarch64/target_signal.h| 18 --
  linux-user/arm/target_signal.h| 18 --
  linux-user/cris/target_signal.h   | 18 --
  linux-user/generic/signal.h   | 16 
  linux-user/hexagon/target_signal.h| 11 ---
  linux-user/i386/target_signal.h   | 18 --
  linux-user/m68k/target_signal.h   | 18 --
  linux-user/microblaze/target_signal.h | 18 --
  linux-user/nios2/target_signal.h  | 16 
  linux-user/openrisc/target_signal.h   | 23 ---
  linux-user/ppc/target_signal.h| 18 --
  linux-user/riscv/target_signal.h  | 12 
  linux-user/s390x/target_signal.h  | 15 ---
  linux-user/sh4/target_signal.h| 18 --
  linux-user/x86_64/target_signal.h | 18 --
  linux-user/xtensa/target_signal.h | 17 -
  16 files changed, 16 insertions(+), 256 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 2/3] linux-user: target_syscall.h remove definition TARGET_MINSIGSTKSZ

2021-11-26 Thread Richard Henderson

On 11/26/21 3:23 AM, Song Gao wrote:

TARGET_MINSIGSTKSZ has been defined in generic/signal.h
or target_signal.h, so we don't need to define it again.

Signed-off-by: Song Gao
Reviewed-by: Laurent Vivier
Reviewed-by: Philippe Mathieu-Daudé
---
  linux-user/aarch64/target_syscall.h| 1 -
  linux-user/alpha/target_syscall.h  | 1 -
  linux-user/arm/target_syscall.h| 1 -
  linux-user/cris/target_syscall.h   | 1 -
  linux-user/hppa/target_syscall.h   | 1 -
  linux-user/i386/target_syscall.h   | 1 -
  linux-user/m68k/target_syscall.h   | 1 -
  linux-user/microblaze/target_syscall.h | 1 -
  linux-user/mips/target_syscall.h   | 1 -
  linux-user/mips64/target_syscall.h | 1 -
  linux-user/nios2/target_syscall.h  | 1 -
  linux-user/openrisc/target_syscall.h   | 1 -
  linux-user/ppc/target_syscall.h| 1 -
  linux-user/riscv/target_syscall.h  | 1 -
  linux-user/s390x/target_syscall.h  | 1 -
  linux-user/sh4/target_syscall.h| 1 -
  linux-user/sparc/target_syscall.h  | 1 -
  linux-user/x86_64/target_syscall.h | 1 -
  18 files changed, 18 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 3/3] linux-user: Remove TARGET_SIGSTKSZ

2021-11-26 Thread Richard Henderson

On 11/26/21 3:23 AM, Song Gao wrote:

TARGET_SIGSTKSZ is not used, we should remove it.

Signed-off-by: Song Gao
---
  linux-user/alpha/target_signal.h  | 1 -
  linux-user/generic/signal.h   | 1 -
  linux-user/hppa/target_signal.h   | 1 -
  linux-user/mips/target_signal.h   | 1 -
  linux-user/mips64/target_signal.h | 1 -
  linux-user/sparc/target_signal.h  | 1 -
  6 files changed, 6 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: Follow-up on the CXL discussion at OFTC

2021-11-26 Thread Jonathan Cameron via
On Fri, 19 Nov 2021 18:53:43 +
Jonathan Cameron  wrote:

> On Thu, 18 Nov 2021 17:52:07 -0800
> Ben Widawsky  wrote:
> 
> > On 21-11-18 15:20:34, Saransh Gupta1 wrote:  
> > > Hi Ben and Jonathan,
> > > 
> > > Thanks for your replies. I'm looking forward to the patches.
> > > 
> > > For QEMU, I see hotplug support as an item on the list and would like to 
> > > start working on it. It would be great if you can provide some pointers 
> > > about how I should go about it.
> > 
> > It's been a while, so I can't recall what's actually missing. I think it should
> > mostly behave like a normal PCIe endpoint.
> >   
> > > Also, which version of kernel and QEMU (maybe Jonathan's upcoming version)
> > > would be a good starting point for it?
> > 
> > If he rebased and claims it works I have no reason to doubt it :-). I have a
> > small fix on my v4 branch if you want to use the latest port patches.  
> 
> Thanks. I'd missed that one. Now pushed down into the original patch.
> 
> It occurred to me that technically I only know my rebase works on Arm64...
> Fingers crossed for x86.
> 
> Anyhow, I'll run more tests on it next week (possibly even including x86),

x86 tests throw up an issue with a 2 byte write to the mailbox registers.
For now I've papered over that by explicitly adding support - obvious how to
do it if you look at mailbox_reg_read.  I want to understand what the source
of that access is though before deciding if this fix is correct and that might
take a little bit of tracking down.

Jonathan

> 
> Available at: 
> https://github.com/hisilicon/qemu/tree/cxl-hacks
> 
> For arm64 the description at
> https://people.kernel.org/jic23/ will almost work with this. 
> There is a bug however that I need to track down which currently means you
> need to set the pxb uid to the same as the bus number.   Shouldn't take
> long to fix but it's Friday evening...
> (add uid=0x80 to the options for pxb-cxl)
> 
> I dropped the CMA patch from Avery from this tree as need to improve
> the way it's getting hold of some parts of libSPDM and move to the current
> version of that library (rather than the old openSPDM)
> 
> Ben, if you don't mind me trying to push this forwards, I'll do a bit
> of cleanup and reordering then make use of the QEMU folks we have / know and
> try and start getting your hard work upstream.
> 
> Whilst I've not poked the various interfaces yet, this is working with
> a kernel tree that is current cxl/next + Ira's DOE series and Ben's region series
> + (for fun) my SPDM series.  That tree's a franken monster so I'm not planning
> to share it unless anyone has particular need of it.  Hopefully the various
> parts will move forwards this cycle anyway so I can stop having to spend
> as much time on rebases!
> 
> Jonathan 
> 
> >   
> > > 
> > > Thanks,
> > > Saransh
> > > 
> > > 
> > > 
> > > From:   "Jonathan Cameron" 
> > > To: "Ben Widawsky" 
> > > Cc: "Saransh Gupta1" , , 
> > > 
> > > Date:   11/17/2021 09:32 AM
> > > Subject:[EXTERNAL] Re: Follow-up on the CXL discussion at OFTC
> > > 
> > > 
> > > 
> > > On Wed, 17 Nov 2021 08:57:19 -0800
> > > Ben Widawsky  wrote:
> > > 
> > > > > Hi Saransh. Please add the list for these kind of questions. I've converted your
> > > > > HTML mail, but going forward, the list will eat it, so please use text only.
> > > > > 
> > > > > On 21-11-16 00:14:33, Saransh Gupta1 wrote:
> > > > > >Hi Ben,
> > > > > > 
> > > > > >This is Saransh from IBM. Sorry to have (unintentionally) dropped out
> > > > > >of the conversation on OFTC, I'm new to IRC.
> > > > > >Just wanted to follow-up on the discussion there. We discussed about
> > > > > >helping with linux patches reviews. On that front, I have identified
> > > > > >some colleague(s) who can help me with this. Let me know if/how you
> > > > > >want to proceed with that. 
> > > > > 
> > > > > Currently the ball is in my court to re-roll the RFC v2 patches [1] based on
> > > > > feedback from Dan. I've implemented all/most of it, but I'm still debugging some
> > > > > issues with the result.
> > > > > 
> > > > > > 
> > > > > >Maybe not urgently, but my team would also like to get an understanding
> > > > > >of the missing pieces in QEMU. Initially our focus is on type3 memory
> > > > > >access and hotplug support. Most of the work that my team does is
> > > > > >open-source, so contributing to the QEMU effort is another possible
> > > > > >line of collaboration. 
> > > > > 
> > > > > If you haven't seen it already, check out my LPC talk [2]. The QEMU patches
> > > > > could use a lot of love. Mostly, I have little/no motivation until upstream
> > > > > shows an interest because I don't have time currently to make sure I don't break
> > > > > vs. upstream. If you want

Re: [PATCH v8 04/10] target/ppc: PMU: update counters on MMCR1 write

2021-11-26 Thread David Gibson
On Thu, Nov 25, 2021 at 12:08:11PM -0300, Daniel Henrique Barboza wrote:
> MMCR1 determines the events to be sampled by the PMU. Updating the
> counters at every MMCR1 write ensures that we're not sampling more
> or fewer events by looking only at MMCR0 and the PMCs.
> 
> It is worth noting that both the Book3S PowerPC PMU and the IBM
> Power8+ PMU that we're modeling also use MMCRA, MMCR2 and MMCR3 to
> control the PMU. These three registers aren't being handled in this
> initial implementation, so for now we're controlling all the PMU
> aspects using MMCR0, MMCR1 and the PMCs.
> 
> Signed-off-by: Daniel Henrique Barboza 

Reviewed-by: David Gibson 
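
To illustrate why the counters are brought up to date before MMCR1 changes, here is a minimal sketch. It is not the code from this series: every `sketch_*` / `Sketch*` name is made up, freeze bits are ignored, and the 8-bit selector positions (shifts 24, 16, 8, 0 for PMC1..PMC4) are taken from the MMCR1_PMCnEVT_EXTR macros in patch 02/10, with 0x1E as the cycles event named in that patch's commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles event for PMC1-4, as named in the 02/10 commit message. */
#define SKETCH_EVT_CYCLES 0x1E

/* Hypothetical PMU state; this is not QEMU's CPUPPCState. */
typedef struct {
    uint64_t mmcr1;         /* event selectors for PMC1..PMC4 */
    uint64_t pmc[4];        /* PMC1..PMC4 */
    uint64_t base_time_ns;  /* time of the last counter update */
} SketchMmcr1PMU;

/* Extract the 8-bit event selector of PMC1..PMC4 from MMCR1. */
static uint8_t sketch_pmc_event(uint64_t mmcr1, int pmc)
{
    int shift = 8 * (4 - pmc);  /* PMC1 -> 24, PMC2 -> 16, PMC3 -> 8, PMC4 -> 0 */

    return (mmcr1 >> shift) & 0xFF;
}

/* Credit elapsed time (1 ns == 1 cycle) to every PMC counting cycles. */
static void sketch_update_cycles(SketchMmcr1PMU *pmu, uint64_t now_ns)
{
    uint64_t elapsed = now_ns - pmu->base_time_ns;

    for (int i = 1; i <= 4; i++) {
        if (sketch_pmc_event(pmu->mmcr1, i) == SKETCH_EVT_CYCLES) {
            pmu->pmc[i - 1] += elapsed;
        }
    }
    pmu->base_time_ns = now_ns;
}

/* Update the counters under the OLD selection, then switch events. */
static void sketch_store_mmcr1(SketchMmcr1PMU *pmu, uint64_t value,
                               uint64_t now_ns)
{
    sketch_update_cycles(pmu, now_ns);
    pmu->mmcr1 = value;
}
```

If the update were skipped, the time elapsed before the write would later be attributed to whichever events the new MMCR1 value selects.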

> ---
>  target/ppc/cpu_init.c|  2 +-
>  target/ppc/helper.h  |  1 +
>  target/ppc/power8-pmu-regs.c.inc | 11 +++
>  target/ppc/power8-pmu.c  |  7 +++
>  target/ppc/spr_tcg.h |  1 +
>  5 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index a7f47ec322..2d72dde26d 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -6825,7 +6825,7 @@ static void register_book3s_pmu_sup_sprs(CPUPPCState *env)
>   KVM_REG_PPC_MMCR0, 0x80000000);
>  spr_register_kvm(env, SPR_POWER_MMCR1, "MMCR1",
>   SPR_NOACCESS, SPR_NOACCESS,
> - &spr_read_generic, &spr_write_generic,
> + &spr_read_generic, &spr_write_MMCR1,
>   KVM_REG_PPC_MMCR1, 0x00000000);
>  spr_register_kvm(env, SPR_POWER_MMCRA, "MMCRA",
>   SPR_NOACCESS, SPR_NOACCESS,
> diff --git a/target/ppc/helper.h b/target/ppc/helper.h
> index d7567f75b4..94b4690375 100644
> --- a/target/ppc/helper.h
> +++ b/target/ppc/helper.h
> @@ -21,6 +21,7 @@ DEF_HELPER_1(hrfid, void, env)
>  DEF_HELPER_2(store_lpcr, void, env, tl)
>  DEF_HELPER_2(store_pcr, void, env, tl)
>  DEF_HELPER_2(store_mmcr0, void, env, tl)
> +DEF_HELPER_2(store_mmcr1, void, env, tl)
>  DEF_HELPER_3(store_pmc, void, env, i32, i64)
>  DEF_HELPER_2(read_pmc, tl, env, i32)
>  #endif
> diff --git a/target/ppc/power8-pmu-regs.c.inc b/target/ppc/power8-pmu-regs.c.inc
> index f0c9cc343b..25b13ad564 100644
> --- a/target/ppc/power8-pmu-regs.c.inc
> +++ b/target/ppc/power8-pmu-regs.c.inc
> @@ -255,6 +255,12 @@ void spr_write_MMCR0(DisasContext *ctx, int sprn, int gprn)
>  {
>  write_MMCR0_common(ctx, cpu_gpr[gprn]);
>  }
> +
> +void spr_write_MMCR1(DisasContext *ctx, int sprn, int gprn)
> +{
> +gen_icount_io_start(ctx);
> +gen_helper_store_mmcr1(cpu_env, cpu_gpr[gprn]);
> +}
>  #else
>  void spr_read_MMCR0_ureg(DisasContext *ctx, int gprn, int sprn)
>  {
> @@ -301,6 +307,11 @@ void spr_write_MMCR0(DisasContext *ctx, int sprn, int gprn)
>  spr_write_generic(ctx, sprn, gprn);
>  }
>  
> +void spr_write_MMCR1(DisasContext *ctx, int sprn, int gprn)
> +{
> +spr_write_generic(ctx, sprn, gprn);
> +}
> +
>  void spr_write_PMC(DisasContext *ctx, int sprn, int gprn)
>  {
>  spr_write_generic(ctx, sprn, gprn);
> diff --git a/target/ppc/power8-pmu.c b/target/ppc/power8-pmu.c
> index 5f2623aa25..acdaee7459 100644
> --- a/target/ppc/power8-pmu.c
> +++ b/target/ppc/power8-pmu.c
> @@ -145,6 +145,13 @@ void helper_store_mmcr0(CPUPPCState *env, target_ulong value)
>  }
>  }
>  
> +void helper_store_mmcr1(CPUPPCState *env, uint64_t value)
> +{
> +pmu_update_cycles(env);
> +
> +env->spr[SPR_POWER_MMCR1] = value;
> +}
> +
>  target_ulong helper_read_pmc(CPUPPCState *env, uint32_t sprn)
>  {
>  pmu_update_cycles(env);
> diff --git a/target/ppc/spr_tcg.h b/target/ppc/spr_tcg.h
> index 1e79a0522a..1d6521eedc 100644
> --- a/target/ppc/spr_tcg.h
> +++ b/target/ppc/spr_tcg.h
> @@ -26,6 +26,7 @@ void spr_noaccess(DisasContext *ctx, int gprn, int sprn);
>  void spr_read_generic(DisasContext *ctx, int gprn, int sprn);
>  void spr_write_generic(DisasContext *ctx, int sprn, int gprn);
>  void spr_write_MMCR0(DisasContext *ctx, int sprn, int gprn);
> +void spr_write_MMCR1(DisasContext *ctx, int sprn, int gprn);
>  void spr_write_PMC(DisasContext *ctx, int sprn, int gprn);
>  void spr_read_xer(DisasContext *ctx, int gprn, int sprn);
>  void spr_write_xer(DisasContext *ctx, int sprn, int gprn);

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [PATCH v8 02/10] target/ppc: PMU basic cycle count for pseries TCG

2021-11-26 Thread David Gibson
On Thu, Nov 25, 2021 at 12:08:09PM -0300, Daniel Henrique Barboza wrote:
> This patch adds the barebones of the PMU logic by enabling cycle
> counting. The overall logic goes as follows:
> 
> - MMCR0 reg initial value is set to 0x80000000 (MMCR0_FC set) to avoid
> having to spin the PMU right at system init;
> 
> - to retrieve the events that are being profiled, pmc_get_event() will
> check the current MMCR0 and MMCR1 value and return the appropriate
> PMUEventType. For PMCs 1-4, event 0x2 is the implementation dependent
> value of PMU_EVENT_INSTRUCTIONS and event 0x1E is the implementation
> dependent value of PMU_EVENT_CYCLES. These events are supported by IBM
> Power chips since Power8, at least, and the Linux Perf driver makes use
> of these events until kernel v5.15. For PMC1, event 0xF0 is the
> architected PowerISA event for cycles. Event 0xFE is the architected
> PowerISA event for instructions;
> 
> - if the counter is frozen, either via the global MMCR0_FC bit or its
> individual frozen counter bit, PMU_EVENT_INACTIVE is returned;
> 
> - pmu_update_cycles() will go through each counter and update the
> values of all PMCs that are counting cycles. This function will be
> called every time an MMCR0 update is done to keep counter values
> up to date. Upcoming patches will use this function to allow the
> counters to be properly updated during read/write of the PMCs
> and MMCR1 writes.
> 
> Given that the base CPU frequency is fixed at 1 GHz for both powernv and
> pseries clock, cycle calculation assumes that 1 nanosecond equals 1 CPU
> cycle. Cycle value is then calculated by adding the elapsed time, in
> nanoseconds, since the last cycle update done via pmu_update_cycles().
> 
> Signed-off-by: Daniel Henrique Barboza 

Reviewed-by: David Gibson 
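
For readers following along, the cycle arithmetic described in the commit message can be sketched as below. This is only an illustration under the stated 1 GHz assumption (1 ns == 1 cycle), not the code in the patch: `SketchCyclePMU` and `sketch_pmu_update_cycles()` are made-up names, and a single counter with a single freeze flag stands in for the real MMCR0/PMC state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-counter PMU; not QEMU's CPUPPCState. */
typedef struct {
    uint64_t pmc_cycles;    /* a PMC programmed to count cycles */
    uint64_t base_time_ns;  /* time of the last update, in nanoseconds */
    int frozen;             /* nonzero while the counter is frozen */
} SketchCyclePMU;

/*
 * With the clock fixed at 1 GHz, the nanoseconds elapsed since the
 * last update are also the number of cycles elapsed since then.
 */
static void sketch_pmu_update_cycles(SketchCyclePMU *pmu, uint64_t now_ns)
{
    uint64_t elapsed = now_ns - pmu->base_time_ns;

    if (!pmu->frozen) {
        pmu->pmc_cycles += elapsed;
    }
    /* Always advance the base so frozen time is never counted later. */
    pmu->base_time_ns = now_ns;
}
```

Moving the base time forward even while frozen is a choice made in this sketch so that unfreezing does not retroactively credit the frozen interval.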

> ---
>  target/ppc/cpu.h |  20 +
>  target/ppc/cpu_init.c|   6 +-
>  target/ppc/helper.h  |   1 +
>  target/ppc/power8-pmu-regs.c.inc |  23 +-
>  target/ppc/power8-pmu.c  | 122 +++
>  target/ppc/spr_tcg.h |   1 +
>  6 files changed, 169 insertions(+), 4 deletions(-)
> 
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index 2ad47b06d0..9c732953f0 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -361,6 +361,9 @@ typedef enum {
>  #define MMCR0_FCECE  PPC_BIT(38) /* FC on Enabled Cond or Event */
>  #define MMCR0_PMCC0  PPC_BIT(44) /* PMC Control bit 0 */
>  #define MMCR0_PMCC1  PPC_BIT(45) /* PMC Control bit 1 */
> +#define MMCR0_PMCC   PPC_BITMASK(44, 45) /* PMC Control */
> +#define MMCR0_FC14   PPC_BIT(58) /* PMC Freeze Counters 1-4 bit */
> +#define MMCR0_FC56   PPC_BIT(59) /* PMC Freeze Counters 5-6 bit */
>  /* MMCR0 userspace r/w mask */
>  #define MMCR0_UREG_MASK (MMCR0_FC | MMCR0_PMAO | MMCR0_PMAE)
>  /* MMCR2 userspace r/w mask */
> @@ -373,6 +376,17 @@ typedef enum {
>  #define MMCR2_UREG_MASK (MMCR2_FC1P0 | MMCR2_FC2P0 | MMCR2_FC3P0 | \
>   MMCR2_FC4P0 | MMCR2_FC5P0 | MMCR2_FC6P0)
>  
> +#define MMCR1_EVT_SIZE 8
> +/* extract64() does a right shift before extracting */
> +#define MMCR1_PMC1SEL_START 32
> +#define MMCR1_PMC1EVT_EXTR (64 - MMCR1_PMC1SEL_START - MMCR1_EVT_SIZE)
> +#define MMCR1_PMC2SEL_START 40
> +#define MMCR1_PMC2EVT_EXTR (64 - MMCR1_PMC2SEL_START - MMCR1_EVT_SIZE)
> +#define MMCR1_PMC3SEL_START 48
> +#define MMCR1_PMC3EVT_EXTR (64 - MMCR1_PMC3SEL_START - MMCR1_EVT_SIZE)
> +#define MMCR1_PMC4SEL_START 56
> +#define MMCR1_PMC4EVT_EXTR (64 - MMCR1_PMC4SEL_START - MMCR1_EVT_SIZE)
> +
>  /* LPCR bits */
>  #define LPCR_VPM0 PPC_BIT(0)
>  #define LPCR_VPM1 PPC_BIT(1)
> @@ -1207,6 +1221,12 @@ struct CPUPPCState {
>   * when counting cycles.
>   */
>  QEMUTimer *pmu_cyc_overflow_timers[PMU_TIMERS_NUM];
> +
> +/*
> + * PMU base time value used by the PMU to calculate
> + * running cycles.
> + */
> +uint64_t pmu_base_time;
>  };
>  
>  #define SET_FIT_PERIOD(a_, b_, c_, d_)  \
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index 9610e65c76..e0b6fe4057 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -6821,8 +6821,8 @@ static void register_book3s_pmu_sup_sprs(CPUPPCState *env)
>  {
>  spr_register_kvm(env, SPR_POWER_MMCR0, "MMCR0",
>   SPR_NOACCESS, SPR_NOACCESS,
> - &spr_read_generic, &spr_write_generic,
> - KVM_REG_PPC_MMCR0, 0x00000000);
> + &spr_read_generic, &spr_write_MMCR0,
> + KVM_REG_PPC_MMCR0, 0x80000000);
>  spr_register_kvm(env, SPR_POWER_MMCR1, "MMCR1",
>   SPR_NOACCESS, SPR_NOACCESS,
>   &spr_read_generic, &spr_write_generic,
> @@ -6870,7 +6870,7 @@ static void register_book3s_pmu_user_sprs(CPUPPCState *env)
>  spr_register(env, SPR_POWER_UMMCR0, "UMMCR0",
>   &spr_read_MMCR0_ureg, &spr_write

Re: [PATCH v8 03/10] target/ppc: PMU: update counters on PMCs r/w

2021-11-26 Thread David Gibson
On Thu, Nov 25, 2021 at 12:08:10PM -0300, Daniel Henrique Barboza wrote:
> Calling pmu_update_cycles() on every PMC read/write operation ensures
> that the values being fetched are up to date with the current PMU state.
> 
> In theory we can get away with just trapping PMC reads, but we're going
> to trap PMC writes to deal with counter overflow logic later on.  Let's
> put the required wiring for that and make our lives a bit easier in the
> next patches.
> 
> Signed-off-by: Daniel Henrique Barboza 

Reviewed-by: David Gibson 

> ---
>  target/ppc/cpu_init.c| 12 ++--
>  target/ppc/helper.h  |  2 ++
>  target/ppc/power8-pmu-regs.c.inc | 29 +++--
>  target/ppc/power8-pmu.c  | 14 ++
>  target/ppc/spr_tcg.h |  2 ++
>  5 files changed, 51 insertions(+), 8 deletions(-)
> 
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index e0b6fe4057..a7f47ec322 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -6833,27 +6833,27 @@ static void register_book3s_pmu_sup_sprs(CPUPPCState *env)
>   KVM_REG_PPC_MMCRA, 0x00000000);
>  spr_register_kvm(env, SPR_POWER_PMC1, "PMC1",
>   SPR_NOACCESS, SPR_NOACCESS,
> - &spr_read_generic, &spr_write_generic,
> + &spr_read_PMC, &spr_write_PMC,
>   KVM_REG_PPC_PMC1, 0x00000000);
>  spr_register_kvm(env, SPR_POWER_PMC2, "PMC2",
>   SPR_NOACCESS, SPR_NOACCESS,
> - &spr_read_generic, &spr_write_generic,
> + &spr_read_PMC, &spr_write_PMC,
>   KVM_REG_PPC_PMC2, 0x00000000);
>  spr_register_kvm(env, SPR_POWER_PMC3, "PMC3",
>   SPR_NOACCESS, SPR_NOACCESS,
> - &spr_read_generic, &spr_write_generic,
> + &spr_read_PMC, &spr_write_PMC,
>   KVM_REG_PPC_PMC3, 0x00000000);
>  spr_register_kvm(env, SPR_POWER_PMC4, "PMC4",
>   SPR_NOACCESS, SPR_NOACCESS,
> - &spr_read_generic, &spr_write_generic,
> + &spr_read_PMC, &spr_write_PMC,
>   KVM_REG_PPC_PMC4, 0x00000000);
>  spr_register_kvm(env, SPR_POWER_PMC5, "PMC5",
>   SPR_NOACCESS, SPR_NOACCESS,
> - &spr_read_generic, &spr_write_generic,
> + &spr_read_PMC, &spr_write_PMC,
>   KVM_REG_PPC_PMC5, 0x00000000);
>  spr_register_kvm(env, SPR_POWER_PMC6, "PMC6",
>   SPR_NOACCESS, SPR_NOACCESS,
> - &spr_read_generic, &spr_write_generic,
> + &spr_read_PMC, &spr_write_PMC,
>   KVM_REG_PPC_PMC6, 0x00000000);
>  spr_register_kvm(env, SPR_POWER_SIAR, "SIAR",
>   SPR_NOACCESS, SPR_NOACCESS,
> diff --git a/target/ppc/helper.h b/target/ppc/helper.h
> index ea60a7493c..d7567f75b4 100644
> --- a/target/ppc/helper.h
> +++ b/target/ppc/helper.h
> @@ -21,6 +21,8 @@ DEF_HELPER_1(hrfid, void, env)
>  DEF_HELPER_2(store_lpcr, void, env, tl)
>  DEF_HELPER_2(store_pcr, void, env, tl)
>  DEF_HELPER_2(store_mmcr0, void, env, tl)
> +DEF_HELPER_3(store_pmc, void, env, i32, i64)
> +DEF_HELPER_2(read_pmc, tl, env, i32)
>  #endif
>  DEF_HELPER_1(check_tlb_flush_local, void, env)
>  DEF_HELPER_1(check_tlb_flush_global, void, env)
> diff --git a/target/ppc/power8-pmu-regs.c.inc b/target/ppc/power8-pmu-regs.c.inc
> index fbb8977641..f0c9cc343b 100644
> --- a/target/ppc/power8-pmu-regs.c.inc
> +++ b/target/ppc/power8-pmu-regs.c.inc
> @@ -181,13 +181,23 @@ void spr_write_MMCR2_ureg(DisasContext *ctx, int sprn, int gprn)
>  tcg_temp_free(masked_gprn);
>  }
>  
> +void spr_read_PMC(DisasContext *ctx, int gprn, int sprn)
> +{
> +TCGv_i32 t_sprn = tcg_const_i32(sprn);
> +
> +gen_icount_io_start(ctx);
> +gen_helper_read_pmc(cpu_gpr[gprn], cpu_env, t_sprn);
> +
> +tcg_temp_free_i32(t_sprn);
> +}
> +
>  void spr_read_PMC14_ureg(DisasContext *ctx, int gprn, int sprn)
>  {
>  if (!spr_groupA_read_allowed(ctx)) {
>  return;
>  }
>  
> -spr_read_ureg(ctx, gprn, sprn);
> +spr_read_PMC(ctx, gprn, sprn + 0x10);
>  }
>  
>  void spr_read_PMC56_ureg(DisasContext *ctx, int gprn, int sprn)
> @@ -206,13 +216,23 @@ void spr_read_PMC56_ureg(DisasContext *ctx, int gprn, int sprn)
>  spr_read_PMC14_ureg(ctx, gprn, sprn);
>  }
>  
> +void spr_write_PMC(DisasContext *ctx, int sprn, int gprn)
> +{
> +TCGv_i32 t_sprn = tcg_const_i32(sprn);
> +
> +gen_icount_io_start(ctx);
> +gen_helper_store_pmc(cpu_env, t_sprn, cpu_gpr[gprn]);
> +
> +tcg_temp_free_i32(t_sprn);
> +}
> +
>  void spr_write_PMC14_ureg(DisasContext *ctx, int sprn, int gprn)
>  {
>  if (!spr_groupA_write_allowed(ctx)) {
>  return;
>  }
>  
> -spr_write_ureg(ctx, sprn, gprn);
> +spr_write_PMC(ctx, sprn 

Re: [PATCH v5 01/22] target/riscv: Adjust pmpcfg access with mxl

2021-11-26 Thread Richard Henderson

On 11/25/21 8:39 AM, LIU Zhiwei wrote:

+static bool check_pmp_reg_index(CPURISCVState *env, uint32_t reg_index)
+{
+if ((reg_index & 1) && (riscv_cpu_mxl(env) == MXL_RV64)) {


Let's make this != MXL_RV32.  I suppose real RV128 will extend this restriction to mod 4, 
but that is not yet documented.


Otherwise,
Reviewed-by: Richard Henderson 


r~
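
A minimal sketch of the suggested condition, for illustration only. The `SketchMXL` enum and `sketch_check_pmp_reg_index()` are stand-ins, not QEMU's `RISCVMXL` or the function in the patch, and the return values are an assumption since only the `if` line is quoted above. The enum values follow the misa.MXL encoding (1 = RV32, 2 = RV64, 3 = RV128).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for QEMU's RISCVMXL, using the misa.MXL encoding. */
typedef enum {
    SKETCH_MXL_RV32 = 1,
    SKETCH_MXL_RV64 = 2,
    SKETCH_MXL_RV128 = 3,
} SketchMXL;

/*
 * Return true when pmpcfg<reg_index> may be accessed.
 * Odd-numbered pmpcfg registers exist only on RV32, so the test is
 * written as "!= RV32" rather than "== RV64"; RV128 is then covered
 * by the same restriction until a stricter one is documented.
 */
static bool sketch_check_pmp_reg_index(SketchMXL mxl, uint32_t reg_index)
{
    if ((reg_index & 1) && mxl != SKETCH_MXL_RV32) {
        return false;
    }
    return true;
}
```

With this form, index 2 is still accepted for RV128; a "mod 4" rule would tighten that once it is specified.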



Re: [PATCH] dbus-vmstate: Restrict error checks to registered proxies in dbus_get_proxies

2021-11-26 Thread Marc-André Lureau
Hi

On Fri, Nov 26, 2021 at 10:49 AM Priyankar Jain 
wrote:

> The purpose of dbus_get_proxies is to construct the proxies corresponding to
> the IDs registered to dbus-vmstate.
>
> Currently, this function returns an error in case there is any failure
> while instantiating a proxy for "all" the names on dbus.
>
> Ideally this function should error out only if it is not able to find and
> validate the proxies registered to the backend. Otherwise any offending
> process (e.g. one that purposefully does not export its Id property on
> the dbus) may connect to the dbus and cause migration failures.
>

ok


> This commit ensures that dbus_get_proxies returns an error only if it is not
> able to find and validate the proxies of interest (the IDs registered
> during the dbus-vmstate instantiation).
>
> Signed-off-by: Priyankar Jain 
> ---
>  backends/dbus-vmstate.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/backends/dbus-vmstate.c b/backends/dbus-vmstate.c
> index 9cfd758c42..ec86b5bac2 100644
> --- a/backends/dbus-vmstate.c
> +++ b/backends/dbus-vmstate.c
> @@ -114,14 +114,13 @@ dbus_get_proxies(DBusVMState *self, GError **err)
>  "org.qemu.VMState1",
>  NULL, err);
>  if (!proxy) {
> -return NULL;
>

This would leak "err", you would need to pass NULL instead. Imho we need to
report a warning anyway with "err".


> +continue;
>  }
>
>  result = g_dbus_proxy_get_cached_property(proxy, "Id");
>  if (!result) {
> -g_set_error_literal(err, G_IO_ERROR, G_IO_ERROR_FAILED,
> -"VMState Id property is missing.");
> -return NULL;
>

Similarly, report a warning.


> +g_clear_object(&proxy);
> +continue;
>  }
>
>  id = g_variant_dup_string(result, &size);
> --
> 2.30.1 (Apple Git-130)
>
>
>
thanks

-- 
Marc-André Lureau


Re: [PATCH v5 04/22] target/riscv: Create xl field in env

2021-11-26 Thread Richard Henderson

On 11/25/21 8:39 AM, LIU Zhiwei wrote:

Signed-off-by: LIU Zhiwei 
---
  target/riscv/cpu.c| 1 +
  target/riscv/cpu.h| 3 +++
  target/riscv/cpu_helper.c | 3 ++-
  target/riscv/csr.c| 2 ++
  target/riscv/machine.c| 5 +++--
  5 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index f812998123..5c757ce33a 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -377,6 +377,7 @@ static void riscv_cpu_reset(DeviceState *dev)
  /* mmte is supposed to have pm.current hardwired to 1 */
  env->mmte |= (PM_EXT_INITIAL | MMTE_M_PM_CURRENT);
  #endif
+env->xl = riscv_cpu_mxl(env);
  cs->exception_index = RISCV_EXCP_NONE;
  env->load_res = -1;
  set_default_nan_mode(1, &env->fp_status);
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 0760c0af93..412339dbad 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -138,6 +138,7 @@ struct CPURISCVState {
  uint32_t misa_mxl_max;  /* max mxl for this cpu */
  uint32_t misa_ext;  /* current extensions */
  uint32_t misa_ext_mask; /* max ext for this cpu */
+uint32_t xl;/* current xlen */
  
  uint32_t features;
  
@@ -420,6 +421,8 @@ static inline RISCVMXL riscv_cpu_mxl(CPURISCVState *env)

  }
  #endif
  
+RISCVMXL cpu_get_xl(CPURISCVState *env);


Probably this name should be a define/inline function, just like riscv_cpu_mxl.  The 
proper function should probably be renamed cpu_recompute_xl, or something.



  const VMStateDescription vmstate_riscv_cpu = {
  .name = "cpu",
-.version_id = 3,
-.minimum_version_id = 3,
+.version_id = 4,
+.minimum_version_id = 4,
  .fields = (VMStateField[]) {
  VMSTATE_UINTTL_ARRAY(env.gpr, RISCVCPU, 32),
  VMSTATE_UINT64_ARRAY(env.fpr, RISCVCPU, 32),
@@ -183,6 +183,7 @@ const VMStateDescription vmstate_riscv_cpu = {
  VMSTATE_UINT32(env.misa_ext, RISCVCPU),
  VMSTATE_UINT32(env.misa_mxl_max, RISCVCPU),
  VMSTATE_UINT32(env.misa_ext_mask, RISCVCPU),
+VMSTATE_UINT32(env.xl, RISCVCPU),


Do not save this.  We prefer to only save architectural state (which is of course 
required), and recompute anything else (which is qemu internal) from that in the post_load 
hook.  This allows qemu internals to change without breaking compatibility.



r~
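
The "recompute in post_load" pattern can be sketched as follows. This is an illustration, not the patch: `SketchEnv`, `sketch_recompute_xl()` and `sketch_post_load()` are made-up names, and the recompute rule (xl equals the migrated MXL) is a simplifying assumption mirroring the reset path quoted above. Only the hook signature, `int (*post_load)(void *opaque, int version_id)`, matches what VMStateDescription uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical CPU state; not QEMU's CPURISCVState. */
typedef struct {
    uint32_t misa_mxl;  /* architectural state: part of the migration stream */
    uint32_t xl;        /* derived, QEMU-internal: deliberately NOT migrated */
} SketchEnv;

/* Assumption for this sketch: the current xlen simply follows MXL. */
static uint32_t sketch_recompute_xl(const SketchEnv *env)
{
    return env->misa_mxl;
}

/* Runs after the architectural fields have been loaded. */
static int sketch_post_load(void *opaque, int version_id)
{
    SketchEnv *env = opaque;

    (void)version_id;  /* unused in this sketch */
    env->xl = sketch_recompute_xl(env);
    return 0;
}
```

Because `xl` never enters the stream, the way it is derived can change later without bumping `version_id`.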



Re: [PATCH v5 10/22] target/riscv: Create current pm fields in env

2021-11-26 Thread Richard Henderson

On 11/25/21 8:39 AM, LIU Zhiwei wrote:

Signed-off-by: LIU Zhiwei
Reviewed-by: Alistair Francis
---
  target/riscv/cpu.c|  1 +
  target/riscv/cpu.h|  4 
  target/riscv/cpu_helper.c | 43 +++
  target/riscv/csr.c| 19 +
  target/riscv/machine.c| 10 +
  5 files changed, 77 insertions(+)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v5 20/22] target/riscv: Adjust vector address with mask

2021-11-26 Thread Richard Henderson

On 11/25/21 8:39 AM, LIU Zhiwei wrote:

The mask comes from the pointer masking extension, or the max value
corresponding to XLEN bits.

Signed-off-by: LIU Zhiwei
Acked-by: Alistair Francis
---
  target/riscv/vector_helper.c | 23 ++-
  1 file changed, 14 insertions(+), 9 deletions(-)


Reviewed-by: Richard Henderson 

r~



[PULL for-6.2 0/2] ppc queue

2021-11-26 Thread Cédric Le Goater
The following changes since commit 67f9968ce3f0847ffddb6ee2837a3641acd92abf:

  Update version for v6.2.0-rc1 release (2021-11-16 21:07:31 +0100)

are available in the Git repository at:

  https://github.com/legoater/qemu/ tags/pull-ppc-2029

for you to fetch changes up to a443d55c3f7cafa3d5dfb7fe2b5c3cd9d671b61d:

  tests/tcg/ppc64le: Fix compile flags for byte_reverse (2021-11-17 19:10:44 +0100)


ppc 6.2 queue:

* fix pmu vmstate
* Fix compile of byte_reverse on new compilers


Laurent Vivier (1):
  pmu: fix pmu vmstate subsection list

Richard Henderson (1):
  tests/tcg/ppc64le: Fix compile flags for byte_reverse

 hw/misc/macio/pmu.c   |  1 +
 tests/tcg/ppc64le/Makefile.target | 12 +++-
 2 files changed, 4 insertions(+), 9 deletions(-)



[PULL for-6.2 2/2] tests/tcg/ppc64le: Fix compile flags for byte_reverse

2021-11-26 Thread Cédric Le Goater
From: Richard Henderson 

With a host compiler new enough to recognize power10 insns,
CROSS_CC_HAS_POWER10 is true, but we do not supply the -cpu
option to the compiler, resulting in

/tmp/ccAVdYJd.s: Assembler messages:
/tmp/ccAVdYJd.s:49: Error: unrecognized opcode: `brh'
/tmp/ccAVdYJd.s:78: Error: unrecognized opcode: `brw'
/tmp/ccAVdYJd.s:107: Error: unrecognized opcode: `brd'
make[2]: *** [byte_reverse] Error 1

Signed-off-by: Richard Henderson 
Signed-off-by: Cédric Le Goater 
---
 tests/tcg/ppc64le/Makefile.target | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/tests/tcg/ppc64le/Makefile.target b/tests/tcg/ppc64le/Makefile.target
index 5e65b1590dba..ba2fde5ff1c3 100644
--- a/tests/tcg/ppc64le/Makefile.target
+++ b/tests/tcg/ppc64le/Makefile.target
@@ -9,18 +9,12 @@ PPC64LE_TESTS=bcdsub
 endif
 bcdsub: CFLAGS += -mpower8-vector
 
-PPC64LE_TESTS += byte_reverse
 ifneq ($(DOCKER_IMAGE)$(CROSS_CC_HAS_POWER10),)
+PPC64LE_TESTS += byte_reverse
+endif
+byte_reverse: CFLAGS += -mcpu=power10
 run-byte_reverse: QEMU_OPTS+=-cpu POWER10
 run-plugin-byte_reverse-with-%: QEMU_OPTS+=-cpu POWER10
-else
-byte_reverse:
-   $(call skip-test, "BUILD of $@", "missing compiler support")
-run-byte_reverse:
-   $(call skip-test, "RUN of byte_reverse", "not built")
-run-plugin-byte_reverse-with-%:
-   $(call skip-test, "RUN of byte_reverse ($*)", "not built")
-endif
 
 PPC64LE_TESTS += signal_save_restore_xer
 
-- 
2.31.1




[PATCH v3 02/18] ppc/xive2: Introduce a presenter matching routine

2021-11-26 Thread Cédric Le Goater
The VP space is larger in XIVE2 (P10), 24 bits instead of 19bits on
XIVE (P9), and the CAM line can use a 7bits or 8bits thread id.

For now, we only use 7bits thread ids, same as P9, but because of the
change of the size of the VP space, the CAM matching routine is
different between P9 and P10. It is easier to duplicate the whole
routine than to add extra handlers in xive_presenter_tctx_match() used
for P9.

We might come with a better solution later on, after we have added
some more support for the XIVE2 controller.
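
As a rough illustration of the HW CAM line encoding with a 7-bit
thread id, here is a minimal standalone sketch. It assumes that the
block id is placed above a 24-bit VP index, as the "chipid << 24"
comment in the patch suggests; the sketch_* names are hypothetical and
only mirror what xive2_tctx_hw_cam_line() does in the diff below:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the block id sits directly above a 24-bit VP index. */
#define SKETCH_NVP_SHIFT 24

/* Combine a block id and a VP index into a CAM line. */
static uint32_t sketch_nvp_cam_line(uint8_t blk, uint32_t idx)
{
    assert(idx < (1u << SKETCH_NVP_SHIFT)); /* index must fit in 24 bits */
    return ((uint32_t)blk << SKETCH_NVP_SHIFT) | idx;
}

/*
 * HW CAM line of a thread: a marker bit placed just above the 7-bit
 * thread id taken from the low-order bits of the PIR.
 */
static uint32_t sketch_hw_cam_line(uint8_t blk, uint32_t pir)
{
    const unsigned tid_shift = 7;
    const uint32_t tid_mask = (1u << tid_shift) - 1;

    return sketch_nvp_cam_line(blk, (1u << tid_shift) | (pir & tid_mask));
}
```

For block 2 and PIR 0x85, only the low 7 bits of the PIR (0x05) are
kept and the result is 0x02000085.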

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/xive2.h |  9 +
 hw/intc/xive2.c| 82 ++
 2 files changed, 91 insertions(+)

diff --git a/include/hw/ppc/xive2.h b/include/hw/ppc/xive2.h
index a3cd02520475..e881c039d9c0 100644
--- a/include/hw/ppc/xive2.h
+++ b/include/hw/ppc/xive2.h
@@ -55,6 +55,15 @@ int xive2_router_write_nvp(Xive2Router *xrtr, uint8_t nvp_blk, uint32_t nvp_idx,
 
 void xive2_router_notify(XiveNotifier *xn, uint32_t lisn);
 
+/*
+ * XIVE2 Presenter (POWER10)
+ */
+
+int xive2_presenter_tctx_match(XivePresenter *xptr, XiveTCTX *tctx,
+   uint8_t format,
+   uint8_t nvt_blk, uint32_t nvt_idx,
+   bool cam_ignore, uint32_t logic_serv);
+
 /*
  * XIVE2 END ESBs  (POWER10)
  */
diff --git a/hw/intc/xive2.c b/hw/intc/xive2.c
index e4aa614f3cc8..9e186bbb6cd9 100644
--- a/hw/intc/xive2.c
+++ b/hw/intc/xive2.c
@@ -209,6 +209,88 @@ static int xive2_router_get_block_id(Xive2Router *xrtr)
return xrc->get_block_id(xrtr);
 }
 
+/*
+ * Encode the HW CAM line with 7bit or 8bit thread id. The thread id
+ * width and block id width is configurable at the IC level.
+ *
+ *chipid << 24 |     1 threadid (7Bit)
+ *chipid << 24 |    0001 threadid   (8Bit)
+ */
+static uint32_t xive2_tctx_hw_cam_line(XivePresenter *xptr, XiveTCTX *tctx)
+{
+Xive2Router *xrtr = XIVE2_ROUTER(xptr);
+CPUPPCState *env = &POWERPC_CPU(tctx->cs)->env;
+uint32_t pir = env->spr_cb[SPR_PIR].default_value;
+uint8_t blk = xive2_router_get_block_id(xrtr);
+uint8_t tid_shift = 7;
+uint8_t tid_mask = (1 << tid_shift) - 1;
+
+return xive2_nvp_cam_line(blk, 1 << tid_shift | (pir & tid_mask));
+}
+
+/*
+ * The thread context register words are in big-endian format.
+ */
+int xive2_presenter_tctx_match(XivePresenter *xptr, XiveTCTX *tctx,
+   uint8_t format,
+   uint8_t nvt_blk, uint32_t nvt_idx,
+   bool cam_ignore, uint32_t logic_serv)
+{
+uint32_t cam =   xive2_nvp_cam_line(nvt_blk, nvt_idx);
+uint32_t qw3w2 = xive_tctx_word2(&tctx->regs[TM_QW3_HV_PHYS]);
+uint32_t qw2w2 = xive_tctx_word2(&tctx->regs[TM_QW2_HV_POOL]);
+uint32_t qw1w2 = xive_tctx_word2(&tctx->regs[TM_QW1_OS]);
+uint32_t qw0w2 = xive_tctx_word2(&tctx->regs[TM_QW0_USER]);
+
+/*
+ * TODO (PowerNV): ignore mode. The low order bits of the NVT
+ * identifier are ignored in the "CAM" match.
+ */
+
+if (format == 0) {
+if (cam_ignore == true) {
+/*
+ * F=0 & i=1: Logical server notification (bits ignored at
+ * the end of the NVT identifier)
+ */
+qemu_log_mask(LOG_UNIMP, "XIVE: no support for LS NVT %x/%x\n",
+  nvt_blk, nvt_idx);
+return -1;
+}
+
+/* F=0 & i=0: Specific NVT notification */
+
+/* PHYS ring */
+if ((be32_to_cpu(qw3w2) & TM2_QW3W2_VT) &&
+cam == xive2_tctx_hw_cam_line(xptr, tctx)) {
+return TM_QW3_HV_PHYS;
+}
+
+/* HV POOL ring */
+if ((be32_to_cpu(qw2w2) & TM2_QW2W2_VP) &&
+cam == xive_get_field32(TM2_QW2W2_POOL_CAM, qw2w2)) {
+return TM_QW2_HV_POOL;
+}
+
+/* OS ring */
+if ((be32_to_cpu(qw1w2) & TM2_QW1W2_VO) &&
+cam == xive_get_field32(TM2_QW1W2_OS_CAM, qw1w2)) {
+return TM_QW1_OS;
+}
+} else {
+/* F=1 : User level Event-Based Branch (EBB) notification */
+
+/* USER ring */
+if  ((be32_to_cpu(qw1w2) & TM2_QW1W2_VO) &&
+ (cam == xive_get_field32(TM2_QW1W2_OS_CAM, qw1w2)) &&
+ (be32_to_cpu(qw0w2) & TM2_QW0W2_VU) &&
+ (logic_serv == xive_get_field32(TM2_QW0W2_LOGIC_SERV, qw0w2))) {
+return TM_QW0_USER;
+}
+}
+return -1;
+}
+
 static void xive2_router_realize(DeviceState *dev, Error **errp)
 {
 Xive2Router *xrtr = XIVE2_ROUTER(dev);
-- 
2.31.1




[PATCH v3 01/18] ppc/xive2: Introduce a XIVE2 core framework

2021-11-26 Thread Cédric Le Goater
The XIVE2 interrupt controller of the POWER10 processor has the same
logic as on POWER9 but its SW interface has been largely reworked. The
interrupt controller has a new register interface, different BARs,
extra VSDs. These will be described when we add the device model for
the baremetal machine.

The XIVE internal structures for the EAS, END, NVT have different
layouts which is a problem for the current core XIVE framework. To
avoid adding too much complexity in the XIVE models, a new XIVE2 core
framework is introduced. It duplicates the models which are closely
linked to the XIVE internal structures: Xive2Router and
Xive2EndSource, and reuses the XiveSource, XivePresenter, XiveTCTX
models, as they are more generic.

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/xive2.h  |  78 +
 include/hw/ppc/xive2_regs.h | 198 +++
 hw/intc/xive2.c | 666 
 hw/intc/meson.build |   2 +-
 4 files changed, 943 insertions(+), 1 deletion(-)
 create mode 100644 include/hw/ppc/xive2.h
 create mode 100644 include/hw/ppc/xive2_regs.h
 create mode 100644 hw/intc/xive2.c

diff --git a/include/hw/ppc/xive2.h b/include/hw/ppc/xive2.h
new file mode 100644
index ..a3cd02520475
--- /dev/null
+++ b/include/hw/ppc/xive2.h
@@ -0,0 +1,78 @@
+/*
+ * QEMU PowerPC XIVE2 interrupt controller model  (POWER10)
+ *
+ * Copyright (c) 2019-2021, IBM Corporation.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef PPC_XIVE2_H
+#define PPC_XIVE2_H
+
+#include "hw/ppc/xive2_regs.h"
+
+/*
+ * XIVE2 Router (POWER10)
+ */
+typedef struct Xive2Router {
+SysBusDevice parent;
+
+XiveFabric *xfb;
+} Xive2Router;
+
+#define TYPE_XIVE2_ROUTER "xive2-router"
+OBJECT_DECLARE_TYPE(Xive2Router, Xive2RouterClass, XIVE2_ROUTER);
+
+typedef struct Xive2RouterClass {
+SysBusDeviceClass parent;
+
+/* XIVE table accessors */
+int (*get_eas)(Xive2Router *xrtr, uint8_t eas_blk, uint32_t eas_idx,
+   Xive2Eas *eas);
+int (*get_end)(Xive2Router *xrtr, uint8_t end_blk, uint32_t end_idx,
+   Xive2End *end);
+int (*write_end)(Xive2Router *xrtr, uint8_t end_blk, uint32_t end_idx,
+ Xive2End *end, uint8_t word_number);
+int (*get_nvp)(Xive2Router *xrtr, uint8_t nvp_blk, uint32_t nvp_idx,
+   Xive2Nvp *nvp);
+int (*write_nvp)(Xive2Router *xrtr, uint8_t nvp_blk, uint32_t nvp_idx,
+ Xive2Nvp *nvp, uint8_t word_number);
+uint8_t (*get_block_id)(Xive2Router *xrtr);
+} Xive2RouterClass;
+
+int xive2_router_get_eas(Xive2Router *xrtr, uint8_t eas_blk, uint32_t eas_idx,
+Xive2Eas *eas);
+int xive2_router_get_end(Xive2Router *xrtr, uint8_t end_blk, uint32_t end_idx,
+Xive2End *end);
+int xive2_router_write_end(Xive2Router *xrtr, uint8_t end_blk, uint32_t end_idx,
+  Xive2End *end, uint8_t word_number);
+int xive2_router_get_nvp(Xive2Router *xrtr, uint8_t nvp_blk, uint32_t nvp_idx,
+Xive2Nvp *nvp);
+int xive2_router_write_nvp(Xive2Router *xrtr, uint8_t nvp_blk, uint32_t nvp_idx,
+  Xive2Nvp *nvp, uint8_t word_number);
+
+void xive2_router_notify(XiveNotifier *xn, uint32_t lisn);
+
+/*
+ * XIVE2 END ESBs  (POWER10)
+ */
+
+#define TYPE_XIVE2_END_SOURCE "xive2-end-source"
+OBJECT_DECLARE_SIMPLE_TYPE(Xive2EndSource, XIVE2_END_SOURCE)
+
+typedef struct Xive2EndSource {
+DeviceState parent;
+
+uint32_t nr_ends;
+
+/* ESB memory region */
+uint32_t esb_shift;
+MemoryRegion esb_mmio;
+
+Xive2Router *xrtr;
+} Xive2EndSource;
+
+
+#endif /* PPC_XIVE2_H */
diff --git a/include/hw/ppc/xive2_regs.h b/include/hw/ppc/xive2_regs.h
new file mode 100644
index ..f4827f4c6d54
--- /dev/null
+++ b/include/hw/ppc/xive2_regs.h
@@ -0,0 +1,198 @@
+/*
+ * QEMU PowerPC XIVE2 internal structure definitions (POWER10)
+ *
+ * Copyright (c) 2019-2021, IBM Corporation.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef PPC_XIVE2_REGS_H
+#define PPC_XIVE2_REGS_H
+
+/*
+ * Thread Interrupt Management Area (TIMA)
+ *
+ * In Gen1 mode (P9 compat mode) word 2 is the same. However in Gen2
+ * mode (P10), the CAM line is slightly different as the VP space was
+ * increased.
+ */
+#define   TM2_QW0W2_VU   PPC_BIT32(0)
+#define   TM2_QW0W2_LOGIC_SERV   PPC_BITMASK32(4, 31)
+#define   TM2_QW1W2_VO   PPC_BIT32(0)
+#define   TM2_QW1W2_OS_CAM   PPC_BITMASK32(4, 31)
+#define   TM2_QW2W2_VP   PPC_BIT32(0)
+#define   TM2_QW2W2_POOL_CAM PPC_BITMASK32(4, 31)
+#define   TM2_QW3W2_VT   PPC_BIT32(0)
+#define   TM2_QW3W2_LP   PPC_BIT32(6)
+#define   TM2_QW3W2_LE   PPC_BIT32(7)
+
+/*
+ * Event Assignment Structure (EAS)
+ */
+
+type

[PATCH v3 05/18] ppc/pnv: Add POWER10 quads

2021-11-26 Thread Cédric Le Goater
and use a pnv_chip_power10_quad_realize() helper to avoid code
duplication with P9. This still needs some refinements on the XSCOM
registers handling in PnvQuad.

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/pnv.h |  3 +++
 hw/ppc/pnv.c | 50 +++-
 2 files changed, 43 insertions(+), 10 deletions(-)

diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index a299fbc7f25c..13495423283a 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -128,6 +128,9 @@ struct Pnv10Chip {
 Pnv9Psi  psi;
 PnvLpcController lpc;
 PnvOCC   occ;
+
+uint32_t nr_quads;
+PnvQuad  *quads;
 };
 
 #define PNV10_PIR2FUSEDCORE(pir) (((pir) >> 3) & 0xf)
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index a186df3fee41..5c342e313329 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -1370,6 +1370,21 @@ static void pnv_chip_power9_instance_init(Object *obj)
 chip->num_phbs = pcc->num_phbs;
 }
 
+static void pnv_chip_quad_realize_one(PnvChip *chip, PnvQuad *eq,
+  PnvCore *pnv_core)
+{
+char eq_name[32];
+int core_id = CPU_CORE(pnv_core)->core_id;
+
+snprintf(eq_name, sizeof(eq_name), "eq[%d]", core_id);
+object_initialize_child_with_props(OBJECT(chip), eq_name, eq,
+   sizeof(*eq), TYPE_PNV_QUAD,
+   &error_fatal, NULL);
+
+object_property_set_int(OBJECT(eq), "quad-id", core_id, &error_fatal);
+qdev_realize(DEVICE(eq), NULL, &error_fatal);
+}
+
 static void pnv_chip_quad_realize(Pnv9Chip *chip9, Error **errp)
 {
 PnvChip *chip = PNV_CHIP(chip9);
@@ -1379,18 +1394,9 @@ static void pnv_chip_quad_realize(Pnv9Chip *chip9, Error **errp)
 chip9->quads = g_new0(PnvQuad, chip9->nr_quads);
 
 for (i = 0; i < chip9->nr_quads; i++) {
-char eq_name[32];
 PnvQuad *eq = &chip9->quads[i];
-PnvCore *pnv_core = chip->cores[i * 4];
-int core_id = CPU_CORE(pnv_core)->core_id;
-
-snprintf(eq_name, sizeof(eq_name), "eq[%d]", core_id);
-object_initialize_child_with_props(OBJECT(chip), eq_name, eq,
-   sizeof(*eq), TYPE_PNV_QUAD,
-   &error_fatal, NULL);
 
-object_property_set_int(OBJECT(eq), "quad-id", core_id, &error_fatal);
-qdev_realize(DEVICE(eq), NULL, &error_fatal);
+pnv_chip_quad_realize_one(chip, eq, chip->cores[i * 4]);
 
 pnv_xscom_add_subregion(chip, PNV9_XSCOM_EQ_BASE(eq->quad_id),
 &eq->xscom_regs);
@@ -1606,6 +1612,24 @@ static void pnv_chip_power10_instance_init(Object *obj)
 object_initialize_child(obj, "occ",  &chip10->occ, TYPE_PNV10_OCC);
 }
 
+static void pnv_chip_power10_quad_realize(Pnv10Chip *chip10, Error **errp)
+{
+PnvChip *chip = PNV_CHIP(chip10);
+int i;
+
+chip10->nr_quads = DIV_ROUND_UP(chip->nr_cores, 4);
+chip10->quads = g_new0(PnvQuad, chip10->nr_quads);
+
+for (i = 0; i < chip10->nr_quads; i++) {
+PnvQuad *eq = &chip10->quads[i];
+
+pnv_chip_quad_realize_one(chip, eq, chip->cores[i * 4]);
+
+pnv_xscom_add_subregion(chip, PNV10_XSCOM_EQ_BASE(eq->quad_id),
+&eq->xscom_regs);
+}
+}
+
 static void pnv_chip_power10_realize(DeviceState *dev, Error **errp)
 {
 PnvChipClass *pcc = PNV_CHIP_GET_CLASS(dev);
@@ -1627,6 +1651,12 @@ static void pnv_chip_power10_realize(DeviceState *dev, Error **errp)
 return;
 }
 
+pnv_chip_power10_quad_realize(chip10, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+
 /* XIVE2 interrupt controller (POWER10) */
 object_property_set_int(OBJECT(&chip10->xive), "ic-bar",
 PNV10_XIVE2_IC_BASE(chip), &error_fatal);
-- 
2.31.1




[PATCH v3 03/18] ppc/pnv: Add a XIVE2 controller to the POWER10 chip

2021-11-26 Thread Cédric Le Goater
The XIVE2 interrupt controller of the POWER10 processor follows the
same logic as on POWER9 but the HW interface has been largely
reworked.  It has a new register interface, different BARs, extra
VSDs, a new layout for the XIVE2 structures, and a set of new features
which are described below.

This is a model of the POWER10 XIVE2 interrupt controller for the
PowerNV machine. It focuses primarily on the needs of the skiboot
firmware but some initial hypervisor support is implemented for KVM
use (escalation).

Support for new features will be implemented in time and will require
new support from the OS.

* XIVE2 BARS

The interrupt controller BARs have a different layout outlined below.
Each sub-engine now has its own range and the indirect TIMA access was
replaced with a set of pages, one per CPU, under the IC BAR:

  - IC BAR (Interrupt Controller)
. 4 pages, one per sub-engine
. 128 indirect TIMA pages
  - TM BAR (Thread Interrupt Management Area)
. 4 pages
  - ESB BAR (ESB pages for IPIs)
. up to 1TB
  - END BAR (ESB pages for ENDs)
. up to 2TB
  - NVC BAR (Notification Virtual Crowd)
. up to 128
  - NVPG BAR (Notification Virtual Process and Group)
. up to 1TB
  - Direct mapped Thread Context Area (reads & writes)

OPAL does not use the grouping and crowd capability.

* Virtual Structure Tables

XIVE2 adds new table types and also changes the field layout of the END
and NVP Virtualization Structure Descriptors.

  - EAS
  - END new layout
  - NVT was split into:
. NVP (Processor), 32B
. NVG (Group), 32B
. NVC (Crowd == P9 block group) 32B
  - IC for remote configuration
  - SYNC for cache injection
  - ERQ for event input queue

The setup is slightly different on XIVE2 because the indexing has changed
for some of the tables: the block ID or the chip topology ID can be used.

* XIVE2 features

SCOM and MMIO registers have a new layout and XIVE2 adds new global
capability and configuration registers.

The low-level hardware offers a set of new features, among which:

  - a configurable number of priorities: 1 - 8
  - StoreEOI with load-after-store ordering is activated by default
  - Gen2 TIMA layout
  - A P9-compat mode, or Gen1, TIMA toggle bit for SW compatibility
  - increase to 24 bits for the VP number

Other features will have some impact on the Hypervisor and guest OS
when activated, but this is not required for initial support of the
controller.
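
To make the "configurable number of priorities" more concrete, here is
a small sketch of how such a field could be decoded. It assumes that
the 2-bit CQ_XIVE_CFG_VP_INT_PRIO field at PPC_BITMASK(10, 11), defined
in the register header of this patch, holds log2 of the number of
priorities (values 0 to 3 selecting 1, 2, 4 or 8). The SKETCH_* macros
and sketch_* functions are hypothetical stand-ins, not the model's
actual helpers:

```c
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of 64. */
#define SKETCH_PPC_BIT(bit)  (0x8000000000000000ULL >> (bit))
#define SKETCH_PPC_BITMASK(bs, be) \
    ((SKETCH_PPC_BIT(bs) - SKETCH_PPC_BIT(be)) | SKETCH_PPC_BIT(bs))

/* Extract the field spanning IBM bits bs..be from a 64-bit register. */
static uint64_t sketch_get_field(uint64_t val, unsigned bs, unsigned be)
{
    return (val & SKETCH_PPC_BITMASK(bs, be)) >> (63 - be);
}

/* Assumed encoding: field value n selects 2^n priorities. */
static unsigned sketch_vp_int_prio(uint64_t cfg)
{
    return 1u << sketch_get_field(cfg, 10, 11);
}
```

Under that assumption, a configuration register with the field set to 3
describes 8 priorities, and a cleared field describes a single one.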

Signed-off-by: Cédric Le Goater 
---
 hw/intc/pnv_xive2_regs.h   |  428 
 include/hw/ppc/pnv.h   |   22 +
 include/hw/ppc/pnv_xive.h  |   71 ++
 include/hw/ppc/pnv_xscom.h |3 +
 hw/intc/pnv_xive2.c| 2027 
 hw/ppc/pnv.c   |   85 +-
 hw/intc/meson.build|2 +-
 7 files changed, 2634 insertions(+), 4 deletions(-)
 create mode 100644 hw/intc/pnv_xive2_regs.h
 create mode 100644 hw/intc/pnv_xive2.c

diff --git a/hw/intc/pnv_xive2_regs.h b/hw/intc/pnv_xive2_regs.h
new file mode 100644
index ..084fccc8d3e9
--- /dev/null
+++ b/hw/intc/pnv_xive2_regs.h
@@ -0,0 +1,428 @@
+/*
+ * QEMU PowerPC XIVE2 interrupt controller model  (POWER10)
+ *
+ * Copyright (c) 2019-2021, IBM Corporation.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef PPC_PNV_XIVE2_REGS_H
+#define PPC_PNV_XIVE2_REGS_H
+
+/*
+ * CQ Common Queue (PowerBus bridge) Registers
+ */
+
+/* XIVE2 Capabilities */
+#define X_CQ_XIVE_CAP   0x02
+#define CQ_XIVE_CAP 0x010
+#define    CQ_XIVE_CAP_VERSION          PPC_BITMASK(0, 3)
+/* 4:6 reserved */
+#define    CQ_XIVE_CAP_USER_INT_PRIO    PPC_BITMASK(8, 9)
+#define       CQ_XIVE_CAP_USER_INT_PRIO_1   0
+#define       CQ_XIVE_CAP_USER_INT_PRIO_1_2 1
+#define       CQ_XIVE_CAP_USER_INT_PRIO_1_4 2
+#define       CQ_XIVE_CAP_USER_INT_PRIO_1_8 3
+#define    CQ_XIVE_CAP_VP_INT_PRIO      PPC_BITMASK(10, 11)
+#define       CQ_XIVE_CAP_VP_INT_PRIO_1_8   0
+#define       CQ_XIVE_CAP_VP_INT_PRIO_2_8   1
+#define       CQ_XIVE_CAP_VP_INT_PRIO_4_8   2
+#define       CQ_XIVE_CAP_VP_INT_PRIO_8     3
+#define    CQ_XIVE_CAP_BLOCK_ID_WIDTH   PPC_BITMASK(12, 13)
+
+/* XIVE2 Configuration */
+#define X_CQ_XIVE_CFG   0x03
+#define CQ_XIVE_CFG 0x018
+
+/* 0:7 reserved */
+#define    CQ_XIVE_CFG_USER_INT_PRIO    PPC_BITMASK(8, 9)
+#define    CQ_XIVE_CFG_VP_INT_PRIO      PPC_BITMASK(10, 11)
+#define       CQ_XIVE_CFG_INT_PRIO_1        0
+#define       CQ_XIVE_CFG_INT_PRIO_2        1
+#define       CQ_XIVE_CFG_INT_PRIO_4        2
+#define       CQ_XIVE_CFG_INT_PRIO_8        3
+#define    CQ_XIVE_CFG_BLOCK_ID_WIDTH   PPC_BITMASK(12, 13)
+#define       CQ_XIVE_CFG_BLOCK_ID_4BITS    0
+#define       CQ_XIVE_CFG_BLOCK_ID_5BITS    1
+#define   CQ_XIVE_CFG_BLOCK_ID_6B

[PULL for-6.2 1/2] pmu: fix pmu vmstate subsection list

2021-11-26 Thread Cédric Le Goater
From: Laurent Vivier 

The subsection is not closed by a NULL marker so this can trigger
a segfault when the pmu vmstate is saved.

This can be easily shown with:

  $ ./qemu-system-ppc64  -dump-vmstate vmstate.json
  Segmentation fault (core dumped)
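
To illustrate why the missing terminator matters, here is a simplified
standalone sketch of a walk over a NULL-terminated list of
descriptions. SketchDesc and sketch_count_subsections() are
hypothetical stand-ins, not the actual migration code; the point is
only that the loop has no other way to know where the list ends:

```c
#include <stddef.h>

/* Hypothetical stand-in for a VMStateDescription. */
typedef struct SketchDesc {
    const char *name;
} SketchDesc;

/*
 * Walk the list until the NULL sentinel. Without the sentinel, the
 * loop reads past the end of the array and dereferences whatever
 * happens to follow it in memory.
 */
static size_t sketch_count_subsections(const SketchDesc *const *subs)
{
    size_t n = 0;

    while (subs && subs[n] != NULL) {
        n++;
    }
    return n;
}

static const SketchDesc sketch_adb = { "pmu/adb" };

/* Correctly terminated list, as in the fixed vmstate_pmu. */
static const SketchDesc *const sketch_subsections[] = {
    &sketch_adb,
    NULL
};
```

With the sentinel in place, sketch_count_subsections(sketch_subsections)
stops after the single entry.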

Fixes: d811d61fbc6c ("mac_newworld: add PMU device")
Cc: mark.cave-ayl...@ilande.co.uk
Signed-off-by: Laurent Vivier 
Reviewed-by: Greg Kurz 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Mark Cave-Ayland 
Signed-off-by: Cédric Le Goater 
---
 hw/misc/macio/pmu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/misc/macio/pmu.c b/hw/misc/macio/pmu.c
index 4ad4f50e08c3..eb39c64694aa 100644
--- a/hw/misc/macio/pmu.c
+++ b/hw/misc/macio/pmu.c
@@ -718,6 +718,7 @@ static const VMStateDescription vmstate_pmu = {
 },
 .subsections = (const VMStateDescription * []) {
 &vmstate_pmu_adb,
+NULL
 }
 };
 
-- 
2.31.1




[PATCH v3 07/18] ppc/pnv: Add a HOMER model to POWER10

2021-11-26 Thread Cédric Le Goater
Reviewed-by: David Gibson 
Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/pnv.h   | 10 ++
 include/hw/ppc/pnv_homer.h |  3 ++
 include/hw/ppc/pnv_xscom.h |  3 ++
 hw/ppc/pnv.c   | 20 
 hw/ppc/pnv_homer.c | 64 ++
 5 files changed, 100 insertions(+)

diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index f44b9947d00e..3ea2d798eed1 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -128,6 +128,7 @@ struct Pnv10Chip {
 Pnv9Psi  psi;
 PnvLpcController lpc;
 PnvOCC   occ;
+PnvHomer homer;
 
 uint32_t nr_quads;
 PnvQuad  *quads;
@@ -358,4 +359,13 @@ void pnv_bmc_set_pnor(IPMIBmc *bmc, PnvPnor *pnor);
 #define PNV10_XIVE2_END_SIZE0x0200ull
 #define PNV10_XIVE2_END_BASE(chip)  PNV10_CHIP_BASE(chip, 0x00060600ull)
 
+#define PNV10_OCC_COMMON_AREA_SIZE  0x0080ull
+#define PNV10_OCC_COMMON_AREA_BASE  0x300fff80ull
+#define PNV10_OCC_SENSOR_BASE(chip) (PNV10_OCC_COMMON_AREA_BASE +   \
+PNV_OCC_SENSOR_DATA_BLOCK_BASE((chip)->chip_id))
+
+#define PNV10_HOMER_SIZE  0x0040ull
+#define PNV10_HOMER_BASE(chip)   \
+(0x300ffd80ll + ((uint64_t)(chip)->chip_id) * PNV10_HOMER_SIZE)
+
 #endif /* PPC_PNV_H */
diff --git a/include/hw/ppc/pnv_homer.h b/include/hw/ppc/pnv_homer.h
index 1889e3083c57..07e8b193116e 100644
--- a/include/hw/ppc/pnv_homer.h
+++ b/include/hw/ppc/pnv_homer.h
@@ -32,6 +32,9 @@ DECLARE_INSTANCE_CHECKER(PnvHomer, PNV8_HOMER,
 #define TYPE_PNV9_HOMER TYPE_PNV_HOMER "-POWER9"
 DECLARE_INSTANCE_CHECKER(PnvHomer, PNV9_HOMER,
  TYPE_PNV9_HOMER)
+#define TYPE_PNV10_HOMER TYPE_PNV_HOMER "-POWER10"
+DECLARE_INSTANCE_CHECKER(PnvHomer, PNV10_HOMER,
+ TYPE_PNV10_HOMER)
 
 struct PnvHomer {
 DeviceState parent;
diff --git a/include/hw/ppc/pnv_xscom.h b/include/hw/ppc/pnv_xscom.h
index 75db33d46af6..7c7440de0c40 100644
--- a/include/hw/ppc/pnv_xscom.h
+++ b/include/hw/ppc/pnv_xscom.h
@@ -134,6 +134,9 @@ struct PnvXScomInterfaceClass {
 #define PNV10_XSCOM_OCC_BASE   PNV9_XSCOM_OCC_BASE
 #define PNV10_XSCOM_OCC_SIZE   PNV9_XSCOM_OCC_SIZE
 
+#define PNV10_XSCOM_PBA_BASE   0x01010CDA
+#define PNV10_XSCOM_PBA_SIZE   0x40
+
 #define PNV10_XSCOM_XIVE2_BASE 0x2010800
 #define PNV10_XSCOM_XIVE2_SIZE 0x400
 
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 0de3027b7122..d510d2e1d917 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -1621,6 +1621,7 @@ static void pnv_chip_power10_instance_init(Object *obj)
 object_initialize_child(obj, "psi", &chip10->psi, TYPE_PNV10_PSI);
 object_initialize_child(obj, "lpc", &chip10->lpc, TYPE_PNV10_LPC);
 object_initialize_child(obj, "occ",  &chip10->occ, TYPE_PNV10_OCC);
+object_initialize_child(obj, "homer", &chip10->homer, TYPE_PNV10_HOMER);
 
 for (i = 0; i < PNV10_CHIP_MAX_PEC; i++) {
 object_initialize_child(obj, "pec[*]", &chip10->pecs[i],
@@ -1795,6 +1796,25 @@ static void pnv_chip_power10_realize(DeviceState *dev, Error **errp)
 pnv_xscom_add_subregion(chip, PNV10_XSCOM_OCC_BASE,
 &chip10->occ.xscom_regs);
 
+/* OCC SRAM model */
+memory_region_add_subregion(get_system_memory(),
+PNV10_OCC_SENSOR_BASE(chip),
+&chip10->occ.sram_regs);
+
+/* HOMER */
+object_property_set_link(OBJECT(&chip10->homer), "chip", OBJECT(chip),
+ &error_abort);
+if (!qdev_realize(DEVICE(&chip10->homer), NULL, errp)) {
+return;
+}
+/* Homer Xscom region */
+pnv_xscom_add_subregion(chip, PNV10_XSCOM_PBA_BASE,
+&chip10->homer.pba_regs);
+
+/* Homer mmio region */
+memory_region_add_subregion(get_system_memory(), PNV10_HOMER_BASE(chip),
+&chip10->homer.regs);
+
 /* PHBs */
 pnv_chip_power10_phb_realize(chip, &local_err);
 if (local_err) {
diff --git a/hw/ppc/pnv_homer.c b/hw/ppc/pnv_homer.c
index 9a262629b73a..ea73919e54ca 100644
--- a/hw/ppc/pnv_homer.c
+++ b/hw/ppc/pnv_homer.c
@@ -332,6 +332,69 @@ static const TypeInfo pnv_homer_power9_type_info = {
 .class_init= pnv_homer_power9_class_init,
 };
 
+static uint64_t pnv_homer_power10_pba_read(void *opaque, hwaddr addr,
+  unsigned size)
+{
+PnvHomer *homer = PNV_HOMER(opaque);
+PnvChip *chip = homer->chip;
+uint32_t reg = addr >> 3;
+uint64_t val = 0;
+
+switch (reg) {
+case PBA_BAR0:
+val = PNV10_HOMER_BASE(chip);
+break;
+case PBA_BARMASK0: /* P10 homer region mask */
+val = (PNV10_HOMER_SIZE - 1) & 0x30;
+break;
+case PBA_BAR2: /* P10 occ common area */
+val = PNV10_OCC_COMMON_AREA_BASE;
+break;
+case PBA_BARMASK2: /* P10 occ co

[PATCH v3 10/18] ppc/xive: Add support for PQ state bits offload

2021-11-26 Thread Cédric Le Goater
The trigger message coming from a HW source contains a special bit
informing the XIVE interrupt controller whether the PQ bits have been
checked at the source. Depending on the value, the IC can
perform the check and the state transition locally using its own PQ
state bits.

The following changes add new accessors to the XiveRouter required to
query and update the PQ state bits. This only applies to the PowerNV
machine. sPAPR accessors are provided but the pSeries machine should
not be concerned by such a complex configuration for the moment.
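
The idea can be sketched as follows. This is a simplified standalone
model, assuming the usual ESB encoding of the PQ bits (P = 0x2,
Q = 0x1) and the usual trigger state machine; the sketch_* names are
hypothetical and do not match the accessors added by the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed PQ encoding: P is bit 1, Q is bit 0. */
enum {
    SKETCH_PQ_RESET   = 0x0,
    SKETCH_PQ_OFF     = 0x1,
    SKETCH_PQ_PENDING = 0x2,
    SKETCH_PQ_QUEUED  = 0x3,
};

/*
 * Apply a trigger to a PQ state. Returns true when the event must be
 * forwarded, i.e. only on the RESET -> PENDING transition.
 */
static bool sketch_pq_trigger(uint8_t *pq)
{
    switch (*pq & 0x3) {
    case SKETCH_PQ_RESET:
        *pq = SKETCH_PQ_PENDING;
        return true;
    case SKETCH_PQ_PENDING:
    case SKETCH_PQ_QUEUED:
        *pq = SKETCH_PQ_QUEUED;
        return false;
    default: /* SKETCH_PQ_OFF: the source is masked */
        *pq = SKETCH_PQ_OFF;
        return false;
    }
}

/*
 * Router side: if the source already checked its PQ bits, forward the
 * event as-is; otherwise do the check on the router's own copy.
 */
static bool sketch_router_should_forward(bool pq_checked, uint8_t *router_pq)
{
    if (pq_checked) {
        return true;
    }
    return sketch_pq_trigger(router_pq);
}
```

In this model, a first unchecked trigger on a RESET entry is forwarded
and leaves the entry PENDING; a second one is coalesced and leaves it
QUEUED.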

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/xive.h  |  8 +--
 include/hw/ppc/xive2.h |  6 +-
 hw/intc/pnv_xive.c | 37 +---
 hw/intc/pnv_xive2.c| 37 +---
 hw/intc/spapr_xive.c   | 25 ++
 hw/intc/xive.c | 48 --
 hw/intc/xive2.c| 42 +++-
 hw/pci-host/pnv_phb4.c |  9 ++--
 hw/ppc/pnv_psi.c   |  8 +--
 9 files changed, 199 insertions(+), 21 deletions(-)

diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
index 875c7f639689..649b58a08f0c 100644
--- a/include/hw/ppc/xive.h
+++ b/include/hw/ppc/xive.h
@@ -160,7 +160,7 @@ DECLARE_CLASS_CHECKERS(XiveNotifierClass, XIVE_NOTIFIER,
 
 struct XiveNotifierClass {
 InterfaceClass parent;
-void (*notify)(XiveNotifier *xn, uint32_t lisn);
+void (*notify)(XiveNotifier *xn, uint32_t lisn, bool pq_checked);
 };
 
 /*
@@ -386,6 +386,10 @@ struct XiveRouterClass {
 /* XIVE table accessors */
 int (*get_eas)(XiveRouter *xrtr, uint8_t eas_blk, uint32_t eas_idx,
XiveEAS *eas);
+int (*get_pq)(XiveRouter *xrtr, uint8_t eas_blk, uint32_t eas_idx,
+  uint8_t *pq);
+int (*set_pq)(XiveRouter *xrtr, uint8_t eas_blk, uint32_t eas_idx,
+  uint8_t *pq);
 int (*get_end)(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
XiveEND *end);
 int (*write_end)(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
@@ -407,7 +411,7 @@ int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
 XiveNVT *nvt);
 int xive_router_write_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
   XiveNVT *nvt, uint8_t word_number);
-void xive_router_notify(XiveNotifier *xn, uint32_t lisn);
+void xive_router_notify(XiveNotifier *xn, uint32_t lisn, bool pq_checked);
 
 /*
  * XIVE Presenter
diff --git a/include/hw/ppc/xive2.h b/include/hw/ppc/xive2.h
index e881c039d9c0..9222b5b36979 100644
--- a/include/hw/ppc/xive2.h
+++ b/include/hw/ppc/xive2.h
@@ -31,6 +31,10 @@ typedef struct Xive2RouterClass {
 /* XIVE table accessors */
 int (*get_eas)(Xive2Router *xrtr, uint8_t eas_blk, uint32_t eas_idx,
Xive2Eas *eas);
+int (*get_pq)(Xive2Router *xrtr, uint8_t eas_blk, uint32_t eas_idx,
+  uint8_t *pq);
+int (*set_pq)(Xive2Router *xrtr, uint8_t eas_blk, uint32_t eas_idx,
+  uint8_t *pq);
 int (*get_end)(Xive2Router *xrtr, uint8_t end_blk, uint32_t end_idx,
Xive2End *end);
 int (*write_end)(Xive2Router *xrtr, uint8_t end_blk, uint32_t end_idx,
@@ -53,7 +57,7 @@ int xive2_router_get_nvp(Xive2Router *xrtr, uint8_t nvp_blk, uint32_t nvp_idx,
 int xive2_router_write_nvp(Xive2Router *xrtr, uint8_t nvp_blk, uint32_t nvp_idx,
   Xive2Nvp *nvp, uint8_t word_number);
 
-void xive2_router_notify(XiveNotifier *xn, uint32_t lisn);
+void xive2_router_notify(XiveNotifier *xn, uint32_t lisn, bool pq_checked);
 
 /*
  * XIVE2 Presenter (POWER10)
diff --git a/hw/intc/pnv_xive.c b/hw/intc/pnv_xive.c
index ad43483612e5..5022f85350f4 100644
--- a/hw/intc/pnv_xive.c
+++ b/hw/intc/pnv_xive.c
@@ -393,6 +393,34 @@ static int pnv_xive_get_eas(XiveRouter *xrtr, uint8_t blk, uint32_t idx,
 return pnv_xive_vst_read(xive, VST_TSEL_IVT, blk, idx, eas);
 }
 
+static int pnv_xive_get_pq(XiveRouter *xrtr, uint8_t blk, uint32_t idx,
+   uint8_t *pq)
+{
+PnvXive *xive = PNV_XIVE(xrtr);
+
+if (pnv_xive_block_id(xive) != blk) {
+xive_error(xive, "VST: EAS %x is remote !?", XIVE_EAS(blk, idx));
+return -1;
+}
+
+*pq = xive_source_esb_get(&xive->ipi_source, idx);
+return 0;
+}
+
+static int pnv_xive_set_pq(XiveRouter *xrtr, uint8_t blk, uint32_t idx,
+   uint8_t *pq)
+{
+PnvXive *xive = PNV_XIVE(xrtr);
+
+if (pnv_xive_block_id(xive) != blk) {
+xive_error(xive, "VST: EAS %x is remote !?", XIVE_EAS(blk, idx));
+return -1;
+}
+
+*pq = xive_source_esb_set(&xive->ipi_source, idx, *pq);
+return 0;
+}
+
 /*
  * One bit per thread id. The first register PC_THREAD_EN_REG0 covers
  * the first cores 0-15 (normal) of the chip or 0-7 (fused). The
@@ -489,12 +517,12 @@ static PnvXive *pnv_xive_tm_get_xive(Power

[PATCH v3 08/18] ppc/psi: Add support for StoreEOI and 64k ESB pages (POWER10)

2021-11-26 Thread Cédric Le Goater
POWER10 adds support for the StoreEOI operation and 64K ESB pages on
PSIHB to be consistent with the other interrupt sources of the system.
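
The handling of the ESB CI base register in the diff below can be
summarized by the following standalone sketch. The bit positions
(64K flag at IBM bit 1, valid flag at IBM bit 63) come from the patch;
the shift values 12 and 16 for 4K and 64K pages are an assumption, and
all SKETCH_*/sketch_* names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of 64. */
#define SKETCH_PPC_BIT(n)    (0x8000000000000000ULL >> (n))
#define SKETCH_ESB_CI_64K    SKETCH_PPC_BIT(1)
#define SKETCH_ESB_CI_VALID  SKETCH_PPC_BIT(63)

/* Assumed page shifts for 4K and 64K ESB pages. */
#define SKETCH_ESB_4K   12
#define SKETCH_ESB_64K  16

typedef struct SketchEsbCi {
    bool     valid;  /* the ESB window should be mapped */
    unsigned shift;  /* ESB page shift selected by the 64K flag */
    uint64_t base;   /* base address with the flag bits stripped */
} SketchEsbCi;

/* Split a register value into its flags and the base address. */
static SketchEsbCi sketch_decode_esb_ci_base(uint64_t val)
{
    SketchEsbCi ci;

    ci.valid = (val & SKETCH_ESB_CI_VALID) != 0;
    ci.shift = (val & SKETCH_ESB_CI_64K) ? SKETCH_ESB_64K : SKETCH_ESB_4K;
    ci.base  = val & ~(SKETCH_ESB_CI_VALID | SKETCH_ESB_CI_64K);
    return ci;
}
```

Masking out the 64K flag as well as the valid bit matters because both
live in the same register as the address; leaving the flag in would map
the ESB window at a wrong address.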

Signed-off-by: Cédric Le Goater 
---
 hw/ppc/pnv.c |  6 ++
 hw/ppc/pnv_psi.c | 30 --
 2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index d510d2e1d917..96c908c753cb 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -1526,6 +1526,9 @@ static void pnv_chip_power9_realize(DeviceState *dev, Error **errp)
 /* Processor Service Interface (PSI) Host Bridge */
 object_property_set_int(OBJECT(&chip9->psi), "bar", PNV9_PSIHB_BASE(chip),
 &error_fatal);
+/* This is the only device with 4k ESB pages */
+object_property_set_int(OBJECT(&chip9->psi), "shift", XIVE_ESB_4K,
+&error_fatal);
 if (!qdev_realize(DEVICE(&chip9->psi), NULL, errp)) {
 return;
 }
@@ -1768,6 +1771,9 @@ static void pnv_chip_power10_realize(DeviceState *dev, Error **errp)
 /* Processor Service Interface (PSI) Host Bridge */
 object_property_set_int(OBJECT(&chip10->psi), "bar",
 PNV10_PSIHB_BASE(chip), &error_fatal);
+/* PSI can now be configured to use 64k ESB pages on POWER10 */
+object_property_set_int(OBJECT(&chip10->psi), "shift", XIVE_ESB_64K,
+&error_fatal);
 if (!qdev_realize(DEVICE(&chip10->psi), NULL, errp)) {
 return;
 }
diff --git a/hw/ppc/pnv_psi.c b/hw/ppc/pnv_psi.c
index cd9a2c5952a6..737486046d5a 100644
--- a/hw/ppc/pnv_psi.c
+++ b/hw/ppc/pnv_psi.c
@@ -601,7 +601,6 @@ static const TypeInfo pnv_psi_power8_info = {
 #define   PSIHB9_IRQ_METHOD PPC_BIT(0)
 #define   PSIHB9_IRQ_RESET  PPC_BIT(1)
 #define PSIHB9_ESB_CI_BASE  0x60
-#define   PSIHB9_ESB_CI_64K PPC_BIT(1)
 #define   PSIHB9_ESB_CI_ADDR_MASK   PPC_BITMASK(8, 47)
 #define   PSIHB9_ESB_CI_VALID   PPC_BIT(63)
 #define PSIHB9_ESB_NOTIF_ADDR   0x68
@@ -646,6 +645,14 @@ static const TypeInfo pnv_psi_power8_info = {
 #define   PSIHB9_IRQ_STAT_DIO   PPC_BIT(12)
 #define   PSIHB9_IRQ_STAT_PSU   PPC_BIT(13)
 
+/* P10 register extensions */
+
+#define PSIHB10_CR   PSIHB9_CR
+#define    PSIHB10_CR_STORE_EOI  PPC_BIT(12)
+
+#define PSIHB10_ESB_CI_BASE  PSIHB9_ESB_CI_BASE
+#define   PSIHB10_ESB_CI_64K PPC_BIT(1)
+
 static void pnv_psi_notify(XiveNotifier *xf, uint32_t srcno)
 {
 PnvPsi *psi = PNV_PSI(xf);
@@ -704,6 +711,13 @@ static void pnv_psi_p9_mmio_write(void *opaque, hwaddr addr,
 
 switch (addr) {
 case PSIHB9_CR:
+if (val & PSIHB10_CR_STORE_EOI) {
+psi9->source.esb_flags |= XIVE_SRC_STORE_EOI;
+} else {
+psi9->source.esb_flags &= ~XIVE_SRC_STORE_EOI;
+}
+break;
+
 case PSIHB9_SEMR:
 /* FSP stuff */
 break;
@@ -715,15 +729,20 @@ static void pnv_psi_p9_mmio_write(void *opaque, hwaddr addr,
 break;
 
 case PSIHB9_ESB_CI_BASE:
+if (val & PSIHB10_ESB_CI_64K) {
+psi9->source.esb_shift = XIVE_ESB_64K;
+} else {
+psi9->source.esb_shift = XIVE_ESB_4K;
+}
 if (!(val & PSIHB9_ESB_CI_VALID)) {
 if (psi->regs[reg] & PSIHB9_ESB_CI_VALID) {
 memory_region_del_subregion(sysmem, &psi9->source.esb_mmio);
 }
 } else {
 if (!(psi->regs[reg] & PSIHB9_ESB_CI_VALID)) {
-memory_region_add_subregion(sysmem,
-val & ~PSIHB9_ESB_CI_VALID,
-&psi9->source.esb_mmio);
+hwaddr addr = val & ~(PSIHB9_ESB_CI_VALID | PSIHB10_ESB_CI_64K);
+memory_region_add_subregion(sysmem, addr,
+&psi9->source.esb_mmio);
 }
 }
 psi->regs[reg] = val;
@@ -831,6 +850,7 @@ static void pnv_psi_power9_instance_init(Object *obj)
 Pnv9Psi *psi = PNV9_PSI(obj);
 
 object_initialize_child(obj, "source", &psi->source, TYPE_XIVE_SOURCE);
+object_property_add_alias(obj, "shift", OBJECT(&psi->source), "shift");
 }
 
 static void pnv_psi_power9_realize(DeviceState *dev, Error **errp)
@@ -839,8 +859,6 @@ static void pnv_psi_power9_realize(DeviceState *dev, Error **errp)
 XiveSource *xsrc = &PNV9_PSI(psi)->source;
 int i;
 
-/* This is the only device with 4k ESB pages */
-object_property_set_int(OBJECT(xsrc), "shift", XIVE_ESB_4K, &error_fatal);
 object_property_set_int(OBJECT(xsrc), "nr-irqs", PSIHB9_NUM_IRQS,
 &error_fatal);
 object_property_set_link(OBJECT(xsrc), "xive", OBJECT(psi), &error_abort);
-- 
2.31.1




[PATCH v3 06/18] ppc/pnv: Add model for POWER10 PHB5 PCIe Host bridge

2021-11-26 Thread Cédric Le Goater
PHB4 and PHB5 are very similar. Use the PHB4 models with some minor
adjustments in a subclass for P10.

Signed-off-by: Cédric Le Goater 
---
 include/hw/pci-host/pnv_phb4.h | 11 
 include/hw/ppc/pnv.h   |  3 ++
 include/hw/ppc/pnv_xscom.h |  6 +++
 hw/pci-host/pnv_phb4_pec.c | 44 
 hw/ppc/pnv.c   | 94 ++
 5 files changed, 158 insertions(+)

diff --git a/include/hw/pci-host/pnv_phb4.h b/include/hw/pci-host/pnv_phb4.h
index 27556ae53425..78ae74349299 100644
--- a/include/hw/pci-host/pnv_phb4.h
+++ b/include/hw/pci-host/pnv_phb4.h
@@ -221,4 +221,15 @@ struct PnvPhb4PecClass {
 int stk_compat_size;
 };
 
+/*
+ * POWER10 definitions
+ */
+
+#define PNV_PHB5_VERSION   0x00a50001ull
+#define PNV_PHB5_DEVICE_ID 0x0652
+
+#define TYPE_PNV_PHB5_PEC "pnv-phb5-pec"
+#define PNV_PHB5_PEC(obj) \
+OBJECT_CHECK(PnvPhb4PecState, (obj), TYPE_PNV_PHB5_PEC)
+
 #endif /* PCI_HOST_PNV_PHB4_H */
diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index 13495423283a..f44b9947d00e 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -131,6 +131,9 @@ struct Pnv10Chip {
 
 uint32_t nr_quads;
 PnvQuad  *quads;
+
+#define PNV10_CHIP_MAX_PEC 2
+PnvPhb4PecState pecs[PNV10_CHIP_MAX_PEC];
 };
 
 #define PNV10_PIR2FUSEDCORE(pir) (((pir) >> 3) & 0xf)
diff --git a/include/hw/ppc/pnv_xscom.h b/include/hw/ppc/pnv_xscom.h
index 151df15378d1..75db33d46af6 100644
--- a/include/hw/ppc/pnv_xscom.h
+++ b/include/hw/ppc/pnv_xscom.h
@@ -137,6 +137,12 @@ struct PnvXScomInterfaceClass {
 #define PNV10_XSCOM_XIVE2_BASE 0x2010800
 #define PNV10_XSCOM_XIVE2_SIZE 0x400
 
+#define PNV10_XSCOM_PEC_NEST_BASE  0x3011800 /* index goes downwards ... */
+#define PNV10_XSCOM_PEC_NEST_SIZE  0x100
+
+#define PNV10_XSCOM_PEC_PCI_BASE   0x8010800 /* index goes upwards ... */
+#define PNV10_XSCOM_PEC_PCI_SIZE   0x200
+
 void pnv_xscom_realize(PnvChip *chip, uint64_t size, Error **errp);
 int pnv_dt_xscom(PnvChip *chip, void *fdt, int root_offset,
  uint64_t xscom_base, uint64_t xscom_size,
diff --git a/hw/pci-host/pnv_phb4_pec.c b/hw/pci-host/pnv_phb4_pec.c
index 741ddc90ed8d..ab13311ef4c7 100644
--- a/hw/pci-host/pnv_phb4_pec.c
+++ b/hw/pci-host/pnv_phb4_pec.c
@@ -584,9 +584,53 @@ static const TypeInfo pnv_pec_stk_type_info = {
 }
 };
 
+/*
+ * POWER10 definitions
+ */
+
+static uint32_t pnv_phb5_pec_xscom_pci_base(PnvPhb4PecState *pec)
+{
+return PNV10_XSCOM_PEC_PCI_BASE + 0x100 * pec->index;
+}
+
+static uint32_t pnv_phb5_pec_xscom_nest_base(PnvPhb4PecState *pec)
+{
+/* index goes down ... */
+return PNV10_XSCOM_PEC_NEST_BASE - 0x100 * pec->index;
+}
+
+static void pnv_phb5_pec_class_init(ObjectClass *klass, void *data)
+{
+PnvPhb4PecClass *pecc = PNV_PHB4_PEC_CLASS(klass);
+static const char compat[] = "ibm,power10-pbcq";
+static const char stk_compat[] = "ibm,power10-phb-stack";
+
+pecc->xscom_nest_base = pnv_phb5_pec_xscom_nest_base;
+pecc->xscom_pci_base  = pnv_phb5_pec_xscom_pci_base;
+pecc->xscom_nest_size = PNV10_XSCOM_PEC_NEST_SIZE;
+pecc->xscom_pci_size  = PNV10_XSCOM_PEC_PCI_SIZE;
+pecc->compat = compat;
+pecc->compat_size = sizeof(compat);
+pecc->stk_compat = stk_compat;
+pecc->stk_compat_size = sizeof(stk_compat);
+}
+
+static const TypeInfo pnv_phb5_pec_type_info = {
+.name  = TYPE_PNV_PHB5_PEC,
+.parent= TYPE_PNV_PHB4_PEC,
+.instance_size = sizeof(PnvPhb4PecState),
+.class_init= pnv_phb5_pec_class_init,
+.class_size= sizeof(PnvPhb4PecClass),
+.interfaces= (InterfaceInfo[]) {
+{ TYPE_PNV_XSCOM_INTERFACE },
+{ }
+}
+};
+
 static void pnv_pec_register_types(void)
 {
 type_register_static(&pnv_pec_type_info);
+type_register_static(&pnv_phb5_pec_type_info);
 type_register_static(&pnv_pec_stk_type_info);
 }
 
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 5c342e313329..0de3027b7122 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -706,9 +706,17 @@ static void pnv_ipmi_bt_init(ISABus *bus, IPMIBmc *bmc, 
uint32_t irq)
 static void pnv_chip_power10_pic_print_info(PnvChip *chip, Monitor *mon)
 {
 Pnv10Chip *chip10 = PNV10_CHIP(chip);
+int i, j;
 
 pnv_xive2_pic_print_info(&chip10->xive, mon);
 pnv_psi_pic_print_info(&chip10->psi, mon);
+
+for (i = 0; i < PNV10_CHIP_MAX_PEC; i++) {
+PnvPhb4PecState *pec = &chip10->pecs[i];
+for (j = 0; j < pec->num_stacks; j++) {
+pnv_phb4_pic_print_info(&pec->stacks[j].phb, mon);
+}
+}
 }
 
 /* Always give the first 1GB to chip 0 else we won't boot */
@@ -1602,7 +1610,10 @@ static void pnv_chip_power9_class_init(ObjectClass 
*klass, void *data)
 
 static void pnv_chip_power10_instance_init(Object *obj)
 {
+PnvChip *chip = PNV_CHIP(obj);
 Pnv10Chip *chip10 = PNV10_CHIP(obj);
+PnvChipClass *pcc = PNV_CHIP_GET_CLASS(obj);
+int i;
 
   

Re: [PULL for-6.2 0/2] ppc queue

2021-11-26 Thread Cédric Le Goater

Oops forget this email. I got the directory wrong on my command line.

Sorry for the noise.

C.

On 11/26/21 12:51, Cédric Le Goater wrote:

The following changes since commit 67f9968ce3f0847ffddb6ee2837a3641acd92abf:

   Update version for v6.2.0-rc1 release (2021-11-16 21:07:31 +0100)

are available in the Git repository at:

   https://github.com/legoater/qemu/ tags/pull-ppc-2029

for you to fetch changes up to a443d55c3f7cafa3d5dfb7fe2b5c3cd9d671b61d:

   tests/tcg/ppc64le: Fix compile flags for byte_reverse (2021-11-17 19:10:44 +0100)


ppc 6.2 queue:

* fix pmu vmstate
* Fix compile of byte_reverse on new compilers


Laurent Vivier (1):
   pmu: fix pmu vmstate subsection list

Richard Henderson (1):
   tests/tcg/ppc64le: Fix compile flags for byte_reverse

  hw/misc/macio/pmu.c   |  1 +
  tests/tcg/ppc64le/Makefile.target | 12 +++-
  2 files changed, 4 insertions(+), 9 deletions(-)






[PATCH v3 09/18] ppc/xive2: Add support for notification injection on ESB pages

2021-11-26 Thread Cédric Le Goater
This is an internal offset used to inject triggers when the PQ state
bits are not controlled locally, such as for LSIs when the PHB5 is
using the Address-Based Interrupt Trigger mode, and on the END.
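
To illustrate where the new offset sits, here is a minimal sketch of how
a store offset within an ESB management page could be decoded. The
offset values come from the xive.h hunk below. The enum and
esb_decode_store() are hypothetical names, and the plain trigger window
at 0x000-0x3ff is an assumption about the existing layout, not
something this patch changes:

```c
#include <stdint.h>

/* Store offsets within an ESB management page (values from xive.h). */
#define ESB_STORE_EOI 0x400
#define ESB_INJECT    0x800
#define ESB_SET_PQ_00 0xc00

/* Hypothetical operation names, for illustration only. */
enum esb_store_op {
    ESB_OP_TRIGGER,   /* assumed: 0x000-0x3ff, trigger through the PQ bits */
    ESB_OP_STORE_EOI, /* 0x400-0x7ff */
    ESB_OP_INJECT,    /* 0x800-0xbff, forwarded without a local PQ check */
    ESB_OP_SET_PQ,    /* 0xc00-0xfff */
};

/* Map a store address to an operation; only the low 12 bits matter. */
static enum esb_store_op esb_decode_store(uint64_t addr)
{
    uint32_t offset = addr & 0xfff;

    if (offset >= ESB_SET_PQ_00) {
        return ESB_OP_SET_PQ;
    }
    if (offset >= ESB_INJECT) {
        return ESB_OP_INJECT;
    }
    if (offset >= ESB_STORE_EOI) {
        return ESB_OP_STORE_EOI;
    }
    return ESB_OP_TRIGGER;
}
```

XIVE_ESB_INJECT shares the value 0x800 with XIVE_ESB_GET. The two do
not collide because one is a store and the other a load.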

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/xive.h |  1 +
 hw/intc/xive.c|  9 +
 hw/intc/xive2.c   | 10 ++
 3 files changed, 20 insertions(+)

diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
index b8ab0bf7490f..875c7f639689 100644
--- a/include/hw/ppc/xive.h
+++ b/include/hw/ppc/xive.h
@@ -278,6 +278,7 @@ uint8_t xive_esb_set(uint8_t *pq, uint8_t value);
 #define XIVE_ESB_STORE_EOI  0x400 /* Store */
 #define XIVE_ESB_LOAD_EOI   0x000 /* Load */
 #define XIVE_ESB_GET0x800 /* Load */
+#define XIVE_ESB_INJECT 0x800 /* Store */
 #define XIVE_ESB_SET_PQ_00  0xc00 /* Load */
 #define XIVE_ESB_SET_PQ_01  0xd00 /* Load */
 #define XIVE_ESB_SET_PQ_10  0xe00 /* Load */
diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index 190194d27f84..2c73ab5ca9d6 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -1061,6 +1061,15 @@ static void xive_source_esb_write(void *opaque, hwaddr 
addr,
 notify = xive_source_esb_eoi(xsrc, srcno);
 break;
 
+/*
+ * This is an internal offset used to inject triggers when the PQ
+ * state bits are not controlled locally. Such as for LSIs when
+ * under ABT mode.
+ */
+case XIVE_ESB_INJECT ... XIVE_ESB_INJECT + 0x3FF:
+notify = true;
+break;
+
 case XIVE_ESB_SET_PQ_00 ... XIVE_ESB_SET_PQ_00 + 0x0FF:
 case XIVE_ESB_SET_PQ_01 ... XIVE_ESB_SET_PQ_01 + 0x0FF:
 case XIVE_ESB_SET_PQ_10 ... XIVE_ESB_SET_PQ_10 + 0x0FF:
diff --git a/hw/intc/xive2.c b/hw/intc/xive2.c
index 9e186bbb6cd9..d474476b5a55 100644
--- a/hw/intc/xive2.c
+++ b/hw/intc/xive2.c
@@ -658,6 +658,16 @@ static void xive2_end_source_write(void *opaque, hwaddr 
addr,
 notify = xive_esb_eoi(&pq);
 break;
 
+case XIVE_ESB_INJECT ... XIVE_ESB_INJECT + 0x3FF:
+if (end_esmask == END2_W1_ESe) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "XIVE: END %x/%x can not EQ inject on ESe\n",
+   end_blk, end_idx);
+return;
+}
+notify = true;
+break;
+
 default:
 qemu_log_mask(LOG_GUEST_ERROR, "XIVE: invalid END ESB write addr %d\n",
   offset);
-- 
2.31.1




[PATCH v3 11/18] ppc/pnv: Add support for PQ offload on PHB5

2021-11-26 Thread Cédric Le Goater
The PQ_disable configuration bit disables the check done on the PQ
state bits when processing new MSI interrupts. When bit 9 is enabled,
the PHB forwards any MSI trigger to the XIVE interrupt controller
without checking the PQ state bits. The XIVE IC knows from the trigger
message that the PQ bits have not been checked and performs the check
locally.

This configuration bit only applies to MSIs. LSIs are still checked
on the PHB to handle the assertion level.

PQ_disable enablement is a requirement for StoreEOI.
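
The behaviour described above can be sketched as follows. This is a
simplified model, not the QEMU code: struct src, pq_trigger() and
src_trigger() are hypothetical names, the PQ encodings are assumed to
match xive.h, and the LSI assertion-level handling is left out:

```c
#include <stdbool.h>
#include <stdint.h>

/* PQ encodings, P is bit 1 and Q is bit 0 (assumed to match xive.h). */
#define PQ_RESET   0x0
#define PQ_OFF     0x1
#define PQ_PENDING 0x2
#define PQ_QUEUED  0x3

/* Hypothetical, minimal view of one interrupt source. */
struct src {
    uint8_t pq;      /* local PQ state bits */
    bool is_lsi;     /* level-sensitive interrupt */
    bool pq_disable; /* PQ_disable set on the PHB */
};

/* Local PQ check: returns true when the event should be forwarded. */
static bool pq_trigger(uint8_t *pq)
{
    switch (*pq & 0x3) {
    case PQ_RESET:
        *pq = PQ_PENDING;
        return true;
    case PQ_PENDING:
    case PQ_QUEUED:
        *pq = PQ_QUEUED;
        return false;
    default: /* PQ_OFF */
        return false;
    }
}

/*
 * Trigger a source. With PQ_disable, MSIs bypass the local check and
 * *pq_checked tells the IC that it has to do the check itself.
 */
static bool src_trigger(struct src *s, bool *pq_checked)
{
    bool offloaded = s->pq_disable && !s->is_lsi;

    *pq_checked = !offloaded;
    if (offloaded) {
        return true;
    }
    return pq_trigger(&s->pq);
}
```

The pq_checked value plays the role of the third argument passed to the
notifier in xive_source_notify() below.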

Signed-off-by: Cédric Le Goater 
---
 include/hw/pci-host/pnv_phb4_regs.h |  1 +
 include/hw/ppc/xive.h   |  1 +
 hw/intc/xive.c  | 22 +-
 hw/pci-host/pnv_phb4.c  |  9 +
 4 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/include/hw/pci-host/pnv_phb4_regs.h 
b/include/hw/pci-host/pnv_phb4_regs.h
index 55df2c3e5ece..64f326b7158e 100644
--- a/include/hw/pci-host/pnv_phb4_regs.h
+++ b/include/hw/pci-host/pnv_phb4_regs.h
@@ -225,6 +225,7 @@
 /* Fundamental register set B */
 #define PHB_VERSION 0x800
 #define PHB_CTRLR   0x810
+#define   PHB_CTRLR_IRQ_PQ_DISABLE  PPC_BIT(9)   /* P10 */
 #define   PHB_CTRLR_IRQ_PGSZ_64KPPC_BIT(11)
 #define   PHB_CTRLR_IRQ_STORE_EOI   PPC_BIT(12)
 #define   PHB_CTRLR_MMIO_RD_STRICT  PPC_BIT(13)
diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
index 649b58a08f0c..126e4e2c3a17 100644
--- a/include/hw/ppc/xive.h
+++ b/include/hw/ppc/xive.h
@@ -176,6 +176,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(XiveSource, XIVE_SOURCE)
  */
 #define XIVE_SRC_H_INT_ESB 0x1 /* ESB managed with hcall H_INT_ESB */
 #define XIVE_SRC_STORE_EOI 0x2 /* Store EOI supported */
+#define XIVE_SRC_PQ_DISABLE0x4 /* Disable check on the PQ state bits */
 
 struct XiveSource {
 DeviceState parent;
diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index 3cc439a84655..4f3d67f246b5 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -886,6 +886,16 @@ static bool xive_source_lsi_trigger(XiveSource *xsrc, 
uint32_t srcno)
 }
 }
 
+/*
+ * Sources can be configured with PQ offloading in which case the check
+ * on the PQ state bits of MSIs is disabled
+ */
+static bool xive_source_esb_disabled(XiveSource *xsrc, uint32_t srcno)
+{
+return (xsrc->esb_flags & XIVE_SRC_PQ_DISABLE) &&
+!xive_source_irq_is_lsi(xsrc, srcno);
+}
+
 /*
  * Returns whether the event notification should be forwarded.
  */
@@ -895,6 +905,10 @@ static bool xive_source_esb_trigger(XiveSource *xsrc, 
uint32_t srcno)
 
 assert(srcno < xsrc->nr_irqs);
 
+if (xive_source_esb_disabled(xsrc, srcno)) {
+return true;
+}
+
 ret = xive_esb_trigger(&xsrc->status[srcno]);
 
 if (xive_source_irq_is_lsi(xsrc, srcno) &&
@@ -915,6 +929,11 @@ static bool xive_source_esb_eoi(XiveSource *xsrc, uint32_t 
srcno)
 
 assert(srcno < xsrc->nr_irqs);
 
+if (xive_source_esb_disabled(xsrc, srcno)) {
+qemu_log_mask(LOG_GUEST_ERROR, "XIVE: invalid EOI for IRQ %d\n", 
srcno);
+return false;
+}
+
 ret = xive_esb_eoi(&xsrc->status[srcno]);
 
 /*
@@ -936,9 +955,10 @@ static bool xive_source_esb_eoi(XiveSource *xsrc, uint32_t 
srcno)
 static void xive_source_notify(XiveSource *xsrc, int srcno)
 {
 XiveNotifierClass *xnc = XIVE_NOTIFIER_GET_CLASS(xsrc->xive);
+bool pq_checked = !xive_source_esb_disabled(xsrc, srcno);
 
 if (xnc->notify) {
-xnc->notify(xsrc->xive, srcno, true);
+xnc->notify(xsrc->xive, srcno, pq_checked);
 }
 }
 
diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index 3edd5845ebde..cf506d1623c3 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -475,6 +475,15 @@ static void pnv_phb4_update_xsrc(PnvPHB4 *phb)
 flags = 0;
 }
 
+/*
+ * When the PQ disable configuration bit is set, the check on the
+ * PQ state bits is disabled on the PHB side (for MSI only) and it
+ * is performed on the IC side instead.
+ */
+if (phb->regs[PHB_CTRLR >> 3] & PHB_CTRLR_IRQ_PQ_DISABLE) {
+flags |= XIVE_SRC_PQ_DISABLE;
+}
+
 phb->xsrc.esb_shift = shift;
 phb->xsrc.esb_flags = flags;
 
-- 
2.31.1




[PATCH v3 00/18] ppc/pnv: Extend the powernv10 machine

2021-11-26 Thread Cédric Le Goater
Hi,

The skiboot merged in QEMU already has POWER10 support. This series
adds a minimum set of models (XIVE2, PHB5) to boot a baremetal POWER10
machine using the OpenPOWER firmware images.

The major change is the support for the interrupt controller of the
POWER10 processor. XIVE2 is very much like XIVE on POWER9 but the
register interface, the different MMIO regions, the XIVE internal
descriptors have gone through a major cleanup. It was easier to
duplicate the models than to try to adapt the current models.

XIVE2 adds a new set of features. Only some are modeled:

  - Address-based trigger (ABT) mode. Activated by default on the
PHB5. When using ABT [1], the PHB5 offloads [2] all interrupt
management to the IC to improve latency.

  - P9 compat mode. XIVE2 can be configured to provide a strict P9
interface for the TIMA.

  - Automatic save & restore of thread context registers. Used in
KVM [3].
  
  - 8bits thread id. Mostly to validate the model.

Thanks,

C.

[1] 
https://github.com/open-power/skiboot/commit/2a7e3d203496a016cc90ce91eeb2c4e680ebd1d2
[2] 
https://github.com/open-power/skiboot/commit/5b2d7c79a2049c1bedfaa8a9dfa19880f980b2ef
[3] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f5af0a978776

Changes in v3:

  - rebased on upstream
   
Changes in v2:

  - Most comments on v1 have been addressed independently and merged

Cédric Le Goater (18):
  ppc/xive2: Introduce a XIVE2 core framework
  ppc/xive2: Introduce a presenter matching routine
  ppc/pnv: Add a XIVE2 controller to the POWER10 chip
  ppc/pnv: Add a OCC model for POWER10
  ppc/pnv: Add POWER10 quads
  ppc/pnv: Add model for POWER10 PHB5 PCIe Host bridge
  ppc/pnv: Add a HOMER model to POWER10
  ppc/psi: Add support for StoreEOI and 64k ESB pages (POWER10)
  ppc/xive2: Add support for notification injection on ESB pages
  ppc/xive: Add support for PQ state bits offload
  ppc/pnv: Add support for PQ offload on PHB5
  ppc/pnv: Add support for PHB5 "Address-based trigger" mode
  pnv/xive2: Introduce new capability bits
  ppc/pnv: add XIVE Gen2 TIMA support
  pnv/xive2: Add support XIVE2 P9-compat mode (or Gen1)
  xive2: Add a get_config() handler for the router configuration
  pnv/xive2: Add support for automatic save&restore
  pnv/xive2: Add support for 8bits thread id

 hw/intc/pnv_xive2_regs.h|  442 ++
 include/hw/pci-host/pnv_phb4.h  |   11 +
 include/hw/pci-host/pnv_phb4_regs.h |3 +
 include/hw/ppc/pnv.h|   39 +
 include/hw/ppc/pnv_homer.h  |3 +
 include/hw/ppc/pnv_occ.h|2 +
 include/hw/ppc/pnv_xive.h   |   71 +
 include/hw/ppc/pnv_xscom.h  |   15 +
 include/hw/ppc/xive.h   |   10 +-
 include/hw/ppc/xive2.h  |  109 ++
 include/hw/ppc/xive2_regs.h |  210 +++
 hw/intc/pnv_xive.c  |   37 +-
 hw/intc/pnv_xive2.c | 2127 +++
 hw/intc/spapr_xive.c|   25 +
 hw/intc/xive.c  |   77 +-
 hw/intc/xive2.c | 1017 +
 hw/pci-host/pnv_phb4.c  |   87 +-
 hw/pci-host/pnv_phb4_pec.c  |   44 +
 hw/ppc/pnv.c|  265 +++-
 hw/ppc/pnv_homer.c  |   64 +
 hw/ppc/pnv_occ.c|   16 +
 hw/ppc/pnv_psi.c|   38 +-
 hw/intc/meson.build |4 +-
 hw/pci-host/trace-events|2 +
 24 files changed, 4677 insertions(+), 41 deletions(-)
 create mode 100644 hw/intc/pnv_xive2_regs.h
 create mode 100644 include/hw/ppc/xive2.h
 create mode 100644 include/hw/ppc/xive2_regs.h
 create mode 100644 hw/intc/pnv_xive2.c
 create mode 100644 hw/intc/xive2.c

-- 
2.31.1




[PATCH v3 18/18] pnv/xive2: Add support for 8bits thread id

2021-11-26 Thread Cédric Le Goater
Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/xive2.h | 1 +
 hw/intc/pnv_xive2.c| 5 +
 hw/intc/xive2.c| 3 ++-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/hw/ppc/xive2.h b/include/hw/ppc/xive2.h
index 88c3e393162d..001388ccea7a 100644
--- a/include/hw/ppc/xive2.h
+++ b/include/hw/ppc/xive2.h
@@ -31,6 +31,7 @@ OBJECT_DECLARE_TYPE(Xive2Router, Xive2RouterClass, 
XIVE2_ROUTER);
 
 #define XIVE2_GEN1_TIMA_OS  0x0001
 #define XIVE2_VP_SAVE_RESTORE   0x0002
+#define XIVE2_THREADID_8BITS0x0004
 
 typedef struct Xive2RouterClass {
 SysBusDeviceClass parent;
diff --git a/hw/intc/pnv_xive2.c b/hw/intc/pnv_xive2.c
index 6f0a63cd3d2f..5aaccaf78934 100644
--- a/hw/intc/pnv_xive2.c
+++ b/hw/intc/pnv_xive2.c
@@ -438,6 +438,11 @@ static uint32_t pnv_xive2_get_config(Xive2Router *xrtr)
 cfg |= XIVE2_VP_SAVE_RESTORE;
 }
 
+if (GETFIELD(CQ_XIVE_CFG_HYP_HARD_RANGE,
+  xive->cq_regs[CQ_XIVE_CFG >> 3]) == CQ_XIVE_CFG_THREADID_8BITS) {
+cfg |= XIVE2_THREADID_8BITS;
+}
+
 return cfg;
 }
 
diff --git a/hw/intc/xive2.c b/hw/intc/xive2.c
index 978a0e3972e1..6b46f7021b46 100644
--- a/hw/intc/xive2.c
+++ b/hw/intc/xive2.c
@@ -458,7 +458,8 @@ static uint32_t xive2_tctx_hw_cam_line(XivePresenter *xptr, 
XiveTCTX *tctx)
 CPUPPCState *env = &POWERPC_CPU(tctx->cs)->env;
 uint32_t pir = env->spr_cb[SPR_PIR].default_value;
 uint8_t blk = xive2_router_get_block_id(xrtr);
-uint8_t tid_shift = 7;
+uint8_t tid_shift =
+xive2_router_get_config(xrtr) & XIVE2_THREADID_8BITS ? 8 : 7;
 uint8_t tid_mask = (1 << tid_shift) - 1;
 
 return xive2_nvp_cam_line(blk, 1 << tid_shift | (pir & tid_mask));
-- 
2.31.1




[PATCH v3 04/18] ppc/pnv: Add a OCC model for POWER10

2021-11-26 Thread Cédric Le Goater
Our OCC model is very minimal and POWER10 can simply reuse the OCC
model we introduced for POWER9.

Reviewed-by: David Gibson 
Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/pnv.h   |  1 +
 include/hw/ppc/pnv_occ.h   |  2 ++
 include/hw/ppc/pnv_xscom.h |  3 +++
 hw/ppc/pnv.c   | 10 ++
 hw/ppc/pnv_occ.c   | 16 
 5 files changed, 32 insertions(+)

diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index b773b09f9f8e..a299fbc7f25c 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -127,6 +127,7 @@ struct Pnv10Chip {
 PnvXive2 xive;
 Pnv9Psi  psi;
 PnvLpcController lpc;
+PnvOCC   occ;
 };
 
 #define PNV10_PIR2FUSEDCORE(pir) (((pir) >> 3) & 0xf)
diff --git a/include/hw/ppc/pnv_occ.h b/include/hw/ppc/pnv_occ.h
index b78185aecaf2..f982ba002481 100644
--- a/include/hw/ppc/pnv_occ.h
+++ b/include/hw/ppc/pnv_occ.h
@@ -32,6 +32,8 @@ DECLARE_INSTANCE_CHECKER(PnvOCC, PNV8_OCC,
 #define TYPE_PNV9_OCC TYPE_PNV_OCC "-POWER9"
 DECLARE_INSTANCE_CHECKER(PnvOCC, PNV9_OCC,
  TYPE_PNV9_OCC)
+#define TYPE_PNV10_OCC TYPE_PNV_OCC "-POWER10"
+DECLARE_INSTANCE_CHECKER(PnvOCC, PNV10_OCC, TYPE_PNV10_OCC)
 
 #define PNV_OCC_SENSOR_DATA_BLOCK_OFFSET 0x0058
 #define PNV_OCC_SENSOR_DATA_BLOCK_SIZE   0x00025800
diff --git a/include/hw/ppc/pnv_xscom.h b/include/hw/ppc/pnv_xscom.h
index 188da874a4b0..151df15378d1 100644
--- a/include/hw/ppc/pnv_xscom.h
+++ b/include/hw/ppc/pnv_xscom.h
@@ -131,6 +131,9 @@ struct PnvXScomInterfaceClass {
 #define PNV10_XSCOM_PSIHB_BASE 0x3011D00
 #define PNV10_XSCOM_PSIHB_SIZE 0x100
 
+#define PNV10_XSCOM_OCC_BASE   PNV9_XSCOM_OCC_BASE
+#define PNV10_XSCOM_OCC_SIZE   PNV9_XSCOM_OCC_SIZE
+
 #define PNV10_XSCOM_XIVE2_BASE 0x2010800
 #define PNV10_XSCOM_XIVE2_SIZE 0x400
 
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 4ec51e9157fd..a186df3fee41 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -1603,6 +1603,7 @@ static void pnv_chip_power10_instance_init(Object *obj)
   "xive-fabric");
 object_initialize_child(obj, "psi", &chip10->psi, TYPE_PNV10_PSI);
 object_initialize_child(obj, "lpc", &chip10->lpc, TYPE_PNV10_LPC);
+object_initialize_child(obj, "occ",  &chip10->occ, TYPE_PNV10_OCC);
 }
 
 static void pnv_chip_power10_realize(DeviceState *dev, Error **errp)
@@ -1668,6 +1669,15 @@ static void pnv_chip_power10_realize(DeviceState *dev, 
Error **errp)
 chip->fw_mr = &chip10->lpc.isa_fw;
 chip->dt_isa_nodename = g_strdup_printf("/lpcm-opb@%" PRIx64 "/lpc@0",
 (uint64_t) PNV10_LPCM_BASE(chip));
+
+/* Create the simplified OCC model */
+object_property_set_link(OBJECT(&chip10->occ), "psi", OBJECT(&chip10->psi),
+ &error_abort);
+if (!qdev_realize(DEVICE(&chip10->occ), NULL, errp)) {
+return;
+}
+pnv_xscom_add_subregion(chip, PNV10_XSCOM_OCC_BASE,
+&chip10->occ.xscom_regs);
 }
 
 static uint32_t pnv_chip_power10_xscom_pcba(PnvChip *chip, uint64_t addr)
diff --git a/hw/ppc/pnv_occ.c b/hw/ppc/pnv_occ.c
index 5a716c256edc..4ed66f5e1fcc 100644
--- a/hw/ppc/pnv_occ.c
+++ b/hw/ppc/pnv_occ.c
@@ -236,7 +236,9 @@ static const MemoryRegionOps pnv_occ_power9_xscom_ops = {
 static void pnv_occ_power9_class_init(ObjectClass *klass, void *data)
 {
 PnvOCCClass *poc = PNV_OCC_CLASS(klass);
+DeviceClass *dc = DEVICE_CLASS(klass);
 
+dc->desc = "PowerNV OCC Controller (POWER9)";
 poc->xscom_size = PNV9_XSCOM_OCC_SIZE;
 poc->xscom_ops = &pnv_occ_power9_xscom_ops;
 poc->psi_irq = PSIHB9_IRQ_OCC;
@@ -249,6 +251,19 @@ static const TypeInfo pnv_occ_power9_type_info = {
 .class_init= pnv_occ_power9_class_init,
 };
 
+static void pnv_occ_power10_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+
+dc->desc = "PowerNV OCC Controller (POWER10)";
+}
+
+static const TypeInfo pnv_occ_power10_type_info = {
+.name  = TYPE_PNV10_OCC,
+.parent= TYPE_PNV9_OCC,
+.class_init= pnv_occ_power10_class_init,
+};
+
 static void pnv_occ_realize(DeviceState *dev, Error **errp)
 {
 PnvOCC *occ = PNV_OCC(dev);
@@ -297,6 +312,7 @@ static void pnv_occ_register_types(void)
 type_register_static(&pnv_occ_type_info);
 type_register_static(&pnv_occ_power8_type_info);
 type_register_static(&pnv_occ_power9_type_info);
+type_register_static(&pnv_occ_power10_type_info);
 }
 
 type_init(pnv_occ_register_types);
-- 
2.31.1




[PATCH v3 15/18] pnv/xive2: Add support XIVE2 P9-compat mode (or Gen1)

2021-11-26 Thread Cédric Le Goater
The thread interrupt management area (TIMA) is a set of pages mapped
in the Hypervisor and in the guest OS address space giving access to
the interrupt thread context registers for interrupt management, ACK,
EOI, CPPR, etc.

XIVE2 slightly changes the TIMA layout, with extra bits for the new
features and larger CAM lines, and the controller provides configuration
switches for backward compatibility. This is called the XIVE2
P9-compat mode, or Gen1 TIMA. It impacts the layout of the TIMA and
the availability of the internal features associated with it,
Automatic Save & Restore for instance. Using a P9 layout also means
setting the controller in such a mode at init time.

As the OPAL driver initializes the XIVE2 controller with a XIVE2/P10
TIMA directly, the XIVE2 model only has simple support for the
compat mode in the OS TIMA.

Signed-off-by: Cédric Le Goater 
---
 hw/intc/pnv_xive2_regs.h |  6 ++
 hw/intc/pnv_xive2.c  | 22 +-
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/hw/intc/pnv_xive2_regs.h b/hw/intc/pnv_xive2_regs.h
index 46d4fb378135..902220e6be69 100644
--- a/hw/intc/pnv_xive2_regs.h
+++ b/hw/intc/pnv_xive2_regs.h
@@ -60,6 +60,12 @@
 #defineCQ_XIVE_CFG_HYP_HARD_BLKID_OVERRIDE  PPC_BIT(16)
 #defineCQ_XIVE_CFG_HYP_HARD_BLOCK_IDPPC_BITMASK(17, 23)
 
+#defineCQ_XIVE_CFG_GEN1_TIMA_OS PPC_BIT(24)
+#defineCQ_XIVE_CFG_GEN1_TIMA_HYPPPC_BIT(25)
+#defineCQ_XIVE_CFG_GEN1_TIMA_HYP_BLK0   PPC_BIT(26) /* 0 if bit[25]=0 
*/
+#defineCQ_XIVE_CFG_GEN1_TIMA_CROWD_DIS  PPC_BIT(27) /* 0 if bit[25]=0 
*/
+#defineCQ_XIVE_CFG_GEN1_END_ESX PPC_BIT(28)
+
 /* Interrupt Controller Base Address Register - 512 pages (32M) */
 #define X_CQ_IC_BAR 0x08
 #define CQ_IC_BAR   0x040
diff --git a/hw/intc/pnv_xive2.c b/hw/intc/pnv_xive2.c
index 4a2649893232..b364ee3b306b 100644
--- a/hw/intc/pnv_xive2.c
+++ b/hw/intc/pnv_xive2.c
@@ -444,6 +444,8 @@ static int pnv_xive2_match_nvt(XivePresenter *xptr, uint8_t 
format,
 PnvChip *chip = xive->chip;
 int count = 0;
 int i, j;
+bool gen1_tima_os =
+xive->cq_regs[CQ_XIVE_CFG >> 3] & CQ_XIVE_CFG_GEN1_TIMA_OS;
 
 for (i = 0; i < chip->nr_cores; i++) {
 PnvCore *pc = chip->cores[i];
@@ -460,9 +462,15 @@ static int pnv_xive2_match_nvt(XivePresenter *xptr, 
uint8_t format,
 
 tctx = XIVE_TCTX(pnv_cpu_state(cpu)->intc);
 
-ring = xive2_presenter_tctx_match(xptr, tctx, format, nvt_blk,
-  nvt_idx, cam_ignore,
-  logic_serv);
+if (gen1_tima_os) {
+ring = xive_presenter_tctx_match(xptr, tctx, format, nvt_blk,
+ nvt_idx, cam_ignore,
+ logic_serv);
+} else {
+ring = xive2_presenter_tctx_match(xptr, tctx, format, nvt_blk,
+   nvt_idx, cam_ignore,
+   logic_serv);
+}
 
 /*
  * Save the context and follow on to catch duplicates,
@@ -1627,9 +1635,11 @@ static void pnv_xive2_tm_write(void *opaque, hwaddr 
offset,
 PnvXive2 *xive = pnv_xive2_tm_get_xive(cpu);
 XiveTCTX *tctx = XIVE_TCTX(pnv_cpu_state(cpu)->intc);
 XivePresenter *xptr = XIVE_PRESENTER(xive);
+bool gen1_tima_os =
+xive->cq_regs[CQ_XIVE_CFG >> 3] & CQ_XIVE_CFG_GEN1_TIMA_OS;
 
 /* TODO: should we switch the TM ops table instead ? */
-if (offset == HV_PUSH_OS_CTX_OFFSET) {
+if (!gen1_tima_os && offset == HV_PUSH_OS_CTX_OFFSET) {
 xive2_tm_push_os_ctx(xptr, tctx, offset, value, size);
 return;
 }
@@ -1644,9 +1654,11 @@ static uint64_t pnv_xive2_tm_read(void *opaque, hwaddr 
offset, unsigned size)
 PnvXive2 *xive = pnv_xive2_tm_get_xive(cpu);
 XiveTCTX *tctx = XIVE_TCTX(pnv_cpu_state(cpu)->intc);
 XivePresenter *xptr = XIVE_PRESENTER(xive);
+bool gen1_tima_os =
+xive->cq_regs[CQ_XIVE_CFG >> 3] & CQ_XIVE_CFG_GEN1_TIMA_OS;
 
 /* TODO: should we switch the TM ops table instead ? */
-if (offset == HV_PULL_OS_CTX_OFFSET) {
+if (!gen1_tima_os && offset == HV_PULL_OS_CTX_OFFSET) {
 return xive2_tm_pull_os_ctx(xptr, tctx, offset, size);
 }
 
-- 
2.31.1




[PATCH v3 16/18] xive2: Add a get_config() handler for the router configuration

2021-11-26 Thread Cédric Le Goater
Add GEN1 config even if we don't use it yet in the core framework.

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/xive2.h |  8 
 hw/intc/pnv_xive2.c| 13 +
 hw/intc/xive2.c|  7 +++
 3 files changed, 28 insertions(+)

diff --git a/include/hw/ppc/xive2.h b/include/hw/ppc/xive2.h
index cf6211a0ecb9..b08600cbd5ee 100644
--- a/include/hw/ppc/xive2.h
+++ b/include/hw/ppc/xive2.h
@@ -25,6 +25,12 @@ typedef struct Xive2Router {
 #define TYPE_XIVE2_ROUTER "xive2-router"
 OBJECT_DECLARE_TYPE(Xive2Router, Xive2RouterClass, XIVE2_ROUTER);
 
+/*
+ * Configuration flags
+ */
+
+#define XIVE2_GEN1_TIMA_OS  0x0001
+
 typedef struct Xive2RouterClass {
 SysBusDeviceClass parent;
 
@@ -44,6 +50,7 @@ typedef struct Xive2RouterClass {
 int (*write_nvp)(Xive2Router *xrtr, uint8_t nvp_blk, uint32_t nvp_idx,
  Xive2Nvp *nvp, uint8_t word_number);
 uint8_t (*get_block_id)(Xive2Router *xrtr);
+uint32_t (*get_config)(Xive2Router *xrtr);
 } Xive2RouterClass;
 
 int xive2_router_get_eas(Xive2Router *xrtr, uint8_t eas_blk, uint32_t eas_idx,
@@ -56,6 +63,7 @@ int xive2_router_get_nvp(Xive2Router *xrtr, uint8_t nvp_blk, 
uint32_t nvp_idx,
 Xive2Nvp *nvp);
 int xive2_router_write_nvp(Xive2Router *xrtr, uint8_t nvp_blk, uint32_t 
nvp_idx,
   Xive2Nvp *nvp, uint8_t word_number);
+uint32_t xive2_router_get_config(Xive2Router *xrtr);
 
 void xive2_router_notify(XiveNotifier *xn, uint32_t lisn, bool pq_checked);
 
diff --git a/hw/intc/pnv_xive2.c b/hw/intc/pnv_xive2.c
index b364ee3b306b..2b7d6ccbd097 100644
--- a/hw/intc/pnv_xive2.c
+++ b/hw/intc/pnv_xive2.c
@@ -425,6 +425,18 @@ static int pnv_xive2_get_eas(Xive2Router *xrtr, uint8_t 
blk, uint32_t idx,
 return pnv_xive2_vst_read(xive, VST_EAS, blk, idx, eas);
 }
 
+static uint32_t pnv_xive2_get_config(Xive2Router *xrtr)
+{
+PnvXive2 *xive = PNV_XIVE2(xrtr);
+uint32_t cfg = 0;
+
+if (xive->cq_regs[CQ_XIVE_CFG >> 3] & CQ_XIVE_CFG_GEN1_TIMA_OS) {
+cfg |= XIVE2_GEN1_TIMA_OS;
+}
+
+return cfg;
+}
+
 static bool pnv_xive2_is_cpu_enabled(PnvXive2 *xive, PowerPCCPU *cpu)
 {
 int pir = ppc_cpu_pir(cpu);
@@ -1949,6 +1961,7 @@ static void pnv_xive2_class_init(ObjectClass *klass, void 
*data)
 xrc->write_end = pnv_xive2_write_end;
 xrc->get_nvp   = pnv_xive2_get_nvp;
 xrc->write_nvp = pnv_xive2_write_nvp;
+xrc->get_config  = pnv_xive2_get_config;
 xrc->get_block_id = pnv_xive2_get_block_id;
 
 xnc->notify= pnv_xive2_notify;
diff --git a/hw/intc/xive2.c b/hw/intc/xive2.c
index e31037e1f030..71086c7fbd01 100644
--- a/hw/intc/xive2.c
+++ b/hw/intc/xive2.c
@@ -20,6 +20,13 @@
 #include "hw/ppc/xive2.h"
 #include "hw/ppc/xive2_regs.h"
 
+uint32_t xive2_router_get_config(Xive2Router *xrtr)
+{
+Xive2RouterClass *xrc = XIVE2_ROUTER_GET_CLASS(xrtr);
+
+return xrc->get_config(xrtr);
+}
+
 void xive2_eas_pic_print_info(Xive2Eas *eas, uint32_t lisn, Monitor *mon)
 {
 if (!xive2_eas_is_valid(eas)) {
-- 
2.31.1




[PATCH v3 13/18] pnv/xive2: Introduce new capability bits

2021-11-26 Thread Cédric Le Goater
These bits control the availability of interrupt features : StoreEOI,
PHB PQ_disable, PHB Address-Based Trigger and the overall XIVE
exploitation mode. These bits can be set at early boot time of the
system to activate/deactivate a feature for testing purposes. The
default value should be '1'.

The 'XIVE exploitation mode' bit is a software bit that skiboot could
use to disable the XIVE OS interface and propose a P8 style XICS
interface instead. There are no plans for that for the moment.
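
As a quick check of the new default value, the sketch below tests the
four capability bits with IBM bit numbering (bit 0 is the most
significant bit). The PPC_BIT definition is the usual one, the bit
positions and the default value are copied from the diff below, and
cap_enabled() and the shortened macro names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the MSB of a 64-bit word. */
#define PPC_BIT(bit) (0x8000000000000000ULL >> (bit))

/* Bit positions from the pnv_xive2_regs.h hunk of this patch. */
#define CAP_PHB_PQ_DISABLE    PPC_BIT(56)
#define CAP_PHB_ABT           PPC_BIT(57)
#define CAP_EXPLOITATION_MODE PPC_BIT(58)
#define CAP_STORE_EOI         PPC_BIT(59)

/* New POWER10 default capabilities from this patch. */
#define XIVE2_DEFAULT_CAPS 0x2000120076f000FCULL

/* Hypothetical helper: is capability 'cap' advertised in 'caps'? */
static bool cap_enabled(uint64_t caps, uint64_t cap)
{
    return (caps & cap) != 0;
}
```

The four bits together form the mask 0xF0, which is contained in the
low byte 0xFC of the new default.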

Signed-off-by: Cédric Le Goater 
---
 hw/intc/pnv_xive2_regs.h | 5 +
 hw/intc/pnv_xive2.c  | 4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/hw/intc/pnv_xive2_regs.h b/hw/intc/pnv_xive2_regs.h
index 084fccc8d3e9..46d4fb378135 100644
--- a/hw/intc/pnv_xive2_regs.h
+++ b/hw/intc/pnv_xive2_regs.h
@@ -31,6 +31,11 @@
 #define   CQ_XIVE_CAP_VP_INT_PRIO_8 3
 #defineCQ_XIVE_CAP_BLOCK_ID_WIDTH   PPC_BITMASK(12, 13)
 
+#defineCQ_XIVE_CAP_PHB_PQ_DISABLE   PPC_BIT(56)
+#defineCQ_XIVE_CAP_PHB_ABT  PPC_BIT(57)
+#defineCQ_XIVE_CAP_EXPLOITATION_MODEPPC_BIT(58)
+#defineCQ_XIVE_CAP_STORE_EOIPPC_BIT(59)
+
 /* XIVE2 Configuration */
 #define X_CQ_XIVE_CFG   0x03
 #define CQ_XIVE_CFG 0x018
diff --git a/hw/intc/pnv_xive2.c b/hw/intc/pnv_xive2.c
index 186ab66e105d..cb12cea14fc6 100644
--- a/hw/intc/pnv_xive2.c
+++ b/hw/intc/pnv_xive2.c
@@ -1708,9 +1708,9 @@ static const MemoryRegionOps pnv_xive2_nvpg_ops = {
 };
 
 /*
- * POWER10 default capabilities: 0x2000120076f0
+ * POWER10 default capabilities: 0x2000120076f000FC
  */
-#define PNV_XIVE2_CAPABILITIES  0x2000120076f0
+#define PNV_XIVE2_CAPABILITIES  0x2000120076f000FC
 
 /*
  * POWER10 default configuration: 0x00303300
-- 
2.31.1




[PATCH v3 12/18] ppc/pnv: Add support for PHB5 "Address-based trigger" mode

2021-11-26 Thread Cédric Le Goater
When the Address-Based Interrupt Trigger mode is activated, the PHB
maps the interrupt source number into the interrupt command address.
The PHB directly triggers the IC ESB page of the interrupt number and
not the notify page of the IC anymore.
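
The address mapping can be sketched as below. It follows the arithmetic
of pnv_phb4_xive_notify_abt() in the diff. The function name
abt_trigger_addr() and the shortened macro names are hypothetical, and
the base address in the example further down is made up:

```c
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the MSB of a 64-bit word. */
#define PPC_BIT(bit) (0x8000000000000000ULL >> (bit))

#define NOTIFY_ADDR_64K PPC_BIT(1) /* 64K ESB pages flag in the register */
#define ESB_INJECT      0x800      /* unconditional trigger offset */

/* Hypothetical helper: address stored to when triggering 'srcno'. */
static uint64_t abt_trigger_addr(uint64_t notif_port, uint32_t srcno,
                                 bool pq_checked)
{
    int esb_shift = (notif_port & NOTIFY_ADDR_64K) ? 16 : 12;
    uint64_t addr = notif_port & ~NOTIFY_ADDR_64K;

    /* Each source owns a pair of ESB pages. */
    addr |= (1ULL << (esb_shift + 1)) * srcno;
    /* Select the second page of the pair, the management page. */
    addr |= 1ULL << esb_shift;
    /* PQ bits already checked on the PHB: bypass them on the IC. */
    if (pq_checked) {
        addr |= ESB_INJECT;
    }
    return addr;
}
```

For example, with a made-up base of 0x100000000, 64K pages and source 3,
a trigger whose PQ bits were already checked goes to 0x100070800.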

Signed-off-by: Cédric Le Goater 
---
 include/hw/pci-host/pnv_phb4_regs.h |  2 +
 hw/pci-host/pnv_phb4.c  | 73 ++---
 hw/pci-host/trace-events|  2 +
 3 files changed, 71 insertions(+), 6 deletions(-)

diff --git a/include/hw/pci-host/pnv_phb4_regs.h 
b/include/hw/pci-host/pnv_phb4_regs.h
index 64f326b7158e..4a0d3b28efb3 100644
--- a/include/hw/pci-host/pnv_phb4_regs.h
+++ b/include/hw/pci-host/pnv_phb4_regs.h
@@ -220,12 +220,14 @@
 #define   PHB_PAPR_ERR_INJ_MASK_MMIOPPC_BITMASK(16, 63)
 #define PHB_ETU_ERR_SUMMARY 0x2c8
 #define PHB_INT_NOTIFY_ADDR 0x300
+#define   PHB_INT_NOTIFY_ADDR_64K   PPC_BIT(1)   /* P10 */
 #define PHB_INT_NOTIFY_INDEX0x308
 
 /* Fundamental register set B */
 #define PHB_VERSION 0x800
 #define PHB_CTRLR   0x810
 #define   PHB_CTRLR_IRQ_PQ_DISABLE  PPC_BIT(9)   /* P10 */
+#define   PHB_CTRLR_IRQ_ABT_MODEPPC_BIT(10)  /* P10 */
 #define   PHB_CTRLR_IRQ_PGSZ_64KPPC_BIT(11)
 #define   PHB_CTRLR_IRQ_STORE_EOI   PPC_BIT(12)
 #define   PHB_CTRLR_MMIO_RD_STRICT  PPC_BIT(13)
diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index cf506d1623c3..353ce6617743 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -1259,10 +1259,54 @@ static const char *pnv_phb4_root_bus_path(PCIHostState 
*host_bridge,
 return phb->bus_path;
 }
 
-static void pnv_phb4_xive_notify(XiveNotifier *xf, uint32_t srcno,
- bool pq_checked)
+/*
+ * Address base trigger mode (POWER10)
+ *
+ * Trigger directly the IC ESB page
+ */
+static void pnv_phb4_xive_notify_abt(PnvPHB4 *phb, uint32_t srcno,
+ bool pq_checked)
+{
+uint64_t notif_port = phb->regs[PHB_INT_NOTIFY_ADDR >> 3];
+uint64_t data = 0; /* trigger data : don't care */
+hwaddr addr;
+MemTxResult result;
+int esb_shift;
+
+if (notif_port & PHB_INT_NOTIFY_ADDR_64K) {
+esb_shift = 16;
+} else {
+esb_shift = 12;
+}
+
+/* Compute the address of the IC ESB management page */
+addr = (notif_port & ~PHB_INT_NOTIFY_ADDR_64K);
+addr |= (1ull << (esb_shift + 1)) * srcno;
+addr |= (1ull << esb_shift);
+
+/*
+ * When the PQ state bits are checked on the PHB, the associated
+ * PQ state bits on the IC should be ignored. Use the unconditional
+ * trigger offset to inject a trigger on the IC. This is always
+ * the case for LSIs
+ */
+if (pq_checked) {
+addr |= XIVE_ESB_INJECT;
+}
+
+trace_pnv_phb4_xive_notify_ic(addr, data);
+
+address_space_stq_be(&address_space_memory, addr, data,
+ MEMTXATTRS_UNSPECIFIED, &result);
+if (result != MEMTX_OK) {
+phb_error(phb, "trigger failed @%"HWADDR_PRIx "\n", addr);
+return;
+}
+}
+
+static void pnv_phb4_xive_notify_ic(PnvPHB4 *phb, uint32_t srcno,
+bool pq_checked)
 {
-PnvPHB4 *phb = PNV_PHB4(xf);
 uint64_t notif_port = phb->regs[PHB_INT_NOTIFY_ADDR >> 3];
 uint32_t offset = phb->regs[PHB_INT_NOTIFY_INDEX >> 3];
 uint64_t data = offset | srcno;
@@ -1272,7 +1316,7 @@ static void pnv_phb4_xive_notify(XiveNotifier *xf, 
uint32_t srcno,
 data |= XIVE_TRIGGER_PQ;
 }
 
-trace_pnv_phb4_xive_notify(notif_port, data);
+trace_pnv_phb4_xive_notify_ic(notif_port, data);
 
 address_space_stq_be(&address_space_memory, notif_port, data,
  MEMTXATTRS_UNSPECIFIED, &result);
@@ -1282,6 +1326,18 @@ static void pnv_phb4_xive_notify(XiveNotifier *xf, 
uint32_t srcno,
 }
 }
 
+static void pnv_phb4_xive_notify(XiveNotifier *xf, uint32_t srcno,
+ bool pq_checked)
+{
+PnvPHB4 *phb = PNV_PHB4(xf);
+
+if (phb->regs[PHB_CTRLR >> 3] & PHB_CTRLR_IRQ_ABT_MODE) {
+pnv_phb4_xive_notify_abt(phb, srcno, pq_checked);
+} else {
+pnv_phb4_xive_notify_ic(phb, srcno, pq_checked);
+}
+}
+
 static Property pnv_phb4_properties[] = {
 DEFINE_PROP_UINT32("index", PnvPHB4, phb_id, 0),
 DEFINE_PROP_UINT32("chip-id", PnvPHB4, chip_id, 0),
@@ -1442,10 +1498,15 @@ void pnv_phb4_update_regions(PnvPhb4PecStack *stack)
 
 void pnv_phb4_pic_print_info(PnvPHB4 *phb, Monitor *mon)
 {
+uint64_t notif_port =
+phb->regs[PHB_INT_NOTIFY_ADDR >> 3] & ~PHB_INT_NOTIFY_ADDR_64K;
 uint32_t offset = phb->regs[PHB_INT_NOTIFY_INDEX >> 3];
+bool abt = !!(phb->regs[PHB_CTRLR >> 3] & PHB_CTRLR_IRQ_ABT_MODE);
 
-monitor_printf(mon, "PHB4[%x:%x] Source %08x .. %08x\n",
+monitor_printf(mon, "PHB4[%x:%x] Source %08x .. %08x %s @%"HWADDR_PRIx"\n",
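
The address arithmetic in pnv_phb4_xive_notify_abt() above can be sketched on its own as follows. This is only an illustration: the function name and the idea of passing an already masked base address are assumptions, not part of the patch. Each source owns a pair of ESB pages, and the second page of the pair is the one the patch calls the management page.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the patch's computation.
 * 'base' is the notify port address with the 64K flag already masked off,
 * 'esb_shift' is 12 for 4K pages or 16 for 64K pages.
 */
static uint64_t esb_mgmt_page_addr(uint64_t base, uint32_t srcno, int esb_shift)
{
    assert(esb_shift == 12 || esb_shift == 16);

    uint64_t addr = base;
    /* Each source has two pages, so the stride is 2^(esb_shift + 1). */
    addr |= (1ull << (esb_shift + 1)) * srcno;
    /* Select the second page of the pair (the management page). */
    addr |= 1ull << esb_shift;
    return addr;
}
```

Like the patch, the sketch ORs the offsets into the base, so it assumes the base is aligned to the whole ESB window.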

Re: [PATCH v2] target/ppc: fix Hash64 MMU update of PTE bit R

2021-11-26 Thread Leandro Lupori

On 26/11/2021 06:18, Cédric Le Goater wrote:


Hello,

Curiously, I didn't get the v2 email.



I've received an e-mail from postmaster, saying that delivery to 
recipients @kaod.org has been delayed, but I don't know why.





qemu-devel@nongnu.org

2021-11-26 Thread Cédric Le Goater
The XIVE interrupt controller on P10 can automatically save and
restore the state of the interrupt registers under the internal NVP
structure representing the VCPU. This saves a costly store/load in
guest entries and exits.

Signed-off-by: Cédric Le Goater 
---
 hw/intc/pnv_xive2_regs.h|   3 +
 include/hw/ppc/xive2.h  |   1 +
 include/hw/ppc/xive2_regs.h |  12 
 hw/intc/pnv_xive2.c |  18 +-
 hw/intc/xive2.c | 126 ++--
 5 files changed, 154 insertions(+), 6 deletions(-)

diff --git a/hw/intc/pnv_xive2_regs.h b/hw/intc/pnv_xive2_regs.h
index 902220e6be69..3488ae188938 100644
--- a/hw/intc/pnv_xive2_regs.h
+++ b/hw/intc/pnv_xive2_regs.h
@@ -30,6 +30,7 @@
 #define   CQ_XIVE_CAP_VP_INT_PRIO_4_8   2
 #define   CQ_XIVE_CAP_VP_INT_PRIO_8 3
 #define    CQ_XIVE_CAP_BLOCK_ID_WIDTH   PPC_BITMASK(12, 13)
+#define    CQ_XIVE_CAP_VP_SAVE_RESTORE  PPC_BIT(38)
 
 #define    CQ_XIVE_CAP_PHB_PQ_DISABLE   PPC_BIT(56)
 #define    CQ_XIVE_CAP_PHB_ABT          PPC_BIT(57)
@@ -65,6 +66,8 @@
 #define    CQ_XIVE_CFG_GEN1_TIMA_HYP_BLK0   PPC_BIT(26) /* 0 if bit[25]=0 */
 #define    CQ_XIVE_CFG_GEN1_TIMA_CROWD_DIS  PPC_BIT(27) /* 0 if bit[25]=0 */
 #define    CQ_XIVE_CFG_GEN1_END_ESX         PPC_BIT(28)
+#define    CQ_XIVE_CFG_EN_VP_SAVE_RESTORE   PPC_BIT(38) /* 0 if bit[25]=1 */
+#define    CQ_XIVE_CFG_EN_VP_SAVE_REST_STRICT   PPC_BIT(39) /* 0 if bit[25]=1 */
 
 /* Interrupt Controller Base Address Register - 512 pages (32M) */
 #define X_CQ_IC_BAR 0x08
diff --git a/include/hw/ppc/xive2.h b/include/hw/ppc/xive2.h
index b08600cbd5ee..88c3e393162d 100644
--- a/include/hw/ppc/xive2.h
+++ b/include/hw/ppc/xive2.h
@@ -30,6 +30,7 @@ OBJECT_DECLARE_TYPE(Xive2Router, Xive2RouterClass, XIVE2_ROUTER);
  */
 
 #define XIVE2_GEN1_TIMA_OS  0x0001
+#define XIVE2_VP_SAVE_RESTORE   0x0002
 
 typedef struct Xive2RouterClass {
 SysBusDeviceClass parent;
diff --git a/include/hw/ppc/xive2_regs.h b/include/hw/ppc/xive2_regs.h
index f4827f4c6d54..d214b49bef75 100644
--- a/include/hw/ppc/xive2_regs.h
+++ b/include/hw/ppc/xive2_regs.h
@@ -20,10 +20,13 @@
 #define   TM2_QW0W2_VU   PPC_BIT32(0)
 #define   TM2_QW0W2_LOGIC_SERV   PPC_BITMASK32(4, 31)
 #define   TM2_QW1W2_VO   PPC_BIT32(0)
+#define   TM2_QW1W2_HO   PPC_BIT32(1)
 #define   TM2_QW1W2_OS_CAM   PPC_BITMASK32(4, 31)
 #define   TM2_QW2W2_VP   PPC_BIT32(0)
+#define   TM2_QW2W2_HP   PPC_BIT32(1)
 #define   TM2_QW2W2_POOL_CAM PPC_BITMASK32(4, 31)
 #define   TM2_QW3W2_VT   PPC_BIT32(0)
+#define   TM2_QW3W2_HT   PPC_BIT32(1)
 #define   TM2_QW3W2_LP   PPC_BIT32(6)
 #define   TM2_QW3W2_LE   PPC_BIT32(7)
 
@@ -137,10 +140,17 @@ void xive2_end_eas_pic_print_info(Xive2End *end, uint32_t end_idx,
 typedef struct Xive2Nvp {
 uint32_t   w0;
 #define NVP2_W0_VALID          PPC_BIT32(0)
+#define NVP2_W0_HW             PPC_BIT32(7)
 #define NVP2_W0_ESC_END        PPC_BIT32(25) /* 'N' bit 0:ESB  1:END */
 uint32_t   w1;
+#define NVP2_W1_CO             PPC_BIT32(13)
+#define NVP2_W1_CO_PRIV        PPC_BITMASK32(14, 15)
+#define NVP2_W1_CO_THRID_VALID PPC_BIT32(16)
+#define NVP2_W1_CO_THRID       PPC_BITMASK32(17, 31)
 uint32_t   w2;
+#define NVP2_W2_CPPR           PPC_BITMASK32(0, 7)
 #define NVP2_W2_IPB            PPC_BITMASK32(8, 15)
+#define NVP2_W2_LSMFB          PPC_BITMASK32(16, 23)
 uint32_t   w3;
 uint32_t   w4;
 #define NVP2_W4_ESC_ESB_BLOCK  PPC_BITMASK32(0, 3)  /* N:0 */
@@ -156,6 +166,8 @@ typedef struct Xive2Nvp {
 } Xive2Nvp;
 
 #define xive2_nvp_is_valid(nvp)    (be32_to_cpu((nvp)->w0) & NVP2_W0_VALID)
+#define xive2_nvp_is_hw(nvp)       (be32_to_cpu((nvp)->w0) & NVP2_W0_HW)
+#define xive2_nvp_is_co(nvp)       (be32_to_cpu((nvp)->w1) & NVP2_W1_CO)
 
 /*
  * The VP number space in a block is defined by the END2_W6_VP_OFFSET
diff --git a/hw/intc/pnv_xive2.c b/hw/intc/pnv_xive2.c
index 2b7d6ccbd097..6f0a63cd3d2f 100644
--- a/hw/intc/pnv_xive2.c
+++ b/hw/intc/pnv_xive2.c
@@ -434,6 +434,10 @@ static uint32_t pnv_xive2_get_config(Xive2Router *xrtr)
 cfg |= XIVE2_GEN1_TIMA_OS;
 }
 
+if (xive->cq_regs[CQ_XIVE_CFG >> 3] & CQ_XIVE_CFG_EN_VP_SAVE_RESTORE) {
+cfg |= XIVE2_VP_SAVE_RESTORE;
+}
+
 return cfg;
 }
 
@@ -1999,9 +2003,21 @@ static void xive2_nvp_pic_print_info(Xive2Nvp *nvp, uint32_t nvp_idx,
 return;
 }
 
-monitor_printf(mon, "  %08x end:%02x/%04x IPB:%02x\n",
+monitor_printf(mon, "  %08x end:%02x/%04x IPB:%02x",
nvp_idx, eq_blk, eq_idx,
xive_get_field32(NVP2_W2_IPB, nvp->w2));
+/*
+ * When the NVP is HW controlled, more fields are updated
+ */
+if (xive2_nvp_is_hw(nvp)) {
+monitor_printf(mon, " CPPR:%02x",
+  
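
The new NVP fields (CPPR, IPB, LSMFB) are described with IBM bit numbering, where bit 0 is the most significant bit of the word. A minimal sketch of how such masks and fields work is below. The helper names are hypothetical stand-ins for PPC_BIT32(), PPC_BITMASK32() and the field accessors, and the sketch works on host-order words, whereas the patch converts with be32_to_cpu() first.

```c
#include <assert.h>
#include <stdint.h>

/* IBM numbering: bit 0 is the most significant bit of a 32-bit word. */
static uint32_t ppc_bit32(unsigned bit)
{
    assert(bit < 32);
    return UINT32_C(0x80000000) >> bit;
}

/* Mask covering bits first..last (inclusive) in IBM numbering. */
static uint32_t ppc_bitmask32(unsigned first, unsigned last)
{
    assert(first <= last && last < 32);
    return (ppc_bit32(first) - ppc_bit32(last)) | ppc_bit32(first);
}

/* Extract the field selected by 'mask', shifted down to bit 0 (LSB). */
static uint32_t get_field32(uint32_t mask, uint32_t word)
{
    assert(mask != 0);

    unsigned shift = 0;
    while (((mask >> shift) & 1) == 0) {
        shift++;
    }
    return (word & mask) >> shift;
}
```

With these helpers, the CPPR field at bits 0..7 is the top byte of w2 and the IPB field at bits 8..15 is the byte below it.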

[PATCH v3 14/18] ppc/pnv: add XIVE Gen2 TIMA support

2021-11-26 Thread Cédric Le Goater
Only the CAM line updates done by the hypervisor are specific to
POWER10. Instead of duplicating the TM ops table, we handle these
commands locally under the PowerNV XIVE2 model.

Signed-off-by: Cédric Le Goater 
---
 include/hw/ppc/xive2.h |  8 
 hw/intc/pnv_xive2.c| 27 +++-
 hw/intc/xive2.c| 95 ++
 3 files changed, 128 insertions(+), 2 deletions(-)

diff --git a/include/hw/ppc/xive2.h b/include/hw/ppc/xive2.h
index 9222b5b36979..cf6211a0ecb9 100644
--- a/include/hw/ppc/xive2.h
+++ b/include/hw/ppc/xive2.h
@@ -87,5 +87,13 @@ typedef struct Xive2EndSource {
 Xive2Router *xrtr;
 } Xive2EndSource;
 
+/*
+ * XIVE2 Thread Interrupt Management Area (POWER10)
+ */
+
+void xive2_tm_push_os_ctx(XivePresenter *xptr, XiveTCTX *tctx, hwaddr offset,
+   uint64_t value, unsigned size);
+uint64_t xive2_tm_pull_os_ctx(XivePresenter *xptr, XiveTCTX *tctx,
+   hwaddr offset, unsigned size);
 
 #endif /* PPC_XIVE2_H */
diff --git a/hw/intc/pnv_xive2.c b/hw/intc/pnv_xive2.c
index cb12cea14fc6..4a2649893232 100644
--- a/hw/intc/pnv_xive2.c
+++ b/hw/intc/pnv_xive2.c
@@ -1610,15 +1610,32 @@ static const MemoryRegionOps pnv_xive2_ic_tm_indirect_ops = {
  * TIMA ops
  */
 
+/*
+ * Special TIMA offsets to handle accesses in a POWER10 way.
+ *
+ * Only the CAM line updates done by the hypervisor should be handled
+ * specifically.
+ */
+#define HV_PAGE_OFFSET (XIVE_TM_HV_PAGE << TM_SHIFT)
+#define HV_PUSH_OS_CTX_OFFSET  (HV_PAGE_OFFSET | (TM_QW1_OS + TM_WORD2))
+#define HV_PULL_OS_CTX_OFFSET  (HV_PAGE_OFFSET | TM_SPC_PULL_OS_CTX)
+
 static void pnv_xive2_tm_write(void *opaque, hwaddr offset,
uint64_t value, unsigned size)
 {
 PowerPCCPU *cpu = POWERPC_CPU(current_cpu);
 PnvXive2 *xive = pnv_xive2_tm_get_xive(cpu);
 XiveTCTX *tctx = XIVE_TCTX(pnv_cpu_state(cpu)->intc);
+XivePresenter *xptr = XIVE_PRESENTER(xive);
+
+/* TODO: should we switch the TM ops table instead ? */
+if (offset == HV_PUSH_OS_CTX_OFFSET) {
+xive2_tm_push_os_ctx(xptr, tctx, offset, value, size);
+return;
+}
 
 /* Other TM ops are the same as XIVE1 */
-xive_tctx_tm_write(XIVE_PRESENTER(xive), tctx, offset, value, size);
+xive_tctx_tm_write(xptr, tctx, offset, value, size);
 }
 
 static uint64_t pnv_xive2_tm_read(void *opaque, hwaddr offset, unsigned size)
@@ -1626,9 +1643,15 @@ static uint64_t pnv_xive2_tm_read(void *opaque, hwaddr offset, unsigned size)
 PowerPCCPU *cpu = POWERPC_CPU(current_cpu);
 PnvXive2 *xive = pnv_xive2_tm_get_xive(cpu);
 XiveTCTX *tctx = XIVE_TCTX(pnv_cpu_state(cpu)->intc);
+XivePresenter *xptr = XIVE_PRESENTER(xive);
+
+/* TODO: should we switch the TM ops table instead ? */
+if (offset == HV_PULL_OS_CTX_OFFSET) {
+return xive2_tm_pull_os_ctx(xptr, tctx, offset, size);
+}
 
 /* Other TM ops are the same as XIVE1 */
-return xive_tctx_tm_read(XIVE_PRESENTER(xive), tctx, offset, size);
+return xive_tctx_tm_read(xptr, tctx, offset, size);
 }
 
 static const MemoryRegionOps pnv_xive2_tm_ops = {
diff --git a/hw/intc/xive2.c b/hw/intc/xive2.c
index 26af08a5de07..e31037e1f030 100644
--- a/hw/intc/xive2.c
+++ b/hw/intc/xive2.c
@@ -158,6 +158,101 @@ static void xive2_end_enqueue(Xive2End *end, uint32_t data)
 }
 end->w1 = xive_set_field32(END2_W1_PAGE_OFF, end->w1, qindex);
 }
+
+/*
+ * XIVE Thread Interrupt Management Area (TIMA) - Gen2 mode
+ */
+
+static void xive2_os_cam_decode(uint32_t cam, uint8_t *nvp_blk,
+uint32_t *nvp_idx, bool *vo)
+{
+*nvp_blk = xive2_nvp_blk(cam);
+*nvp_idx = xive2_nvp_idx(cam);
+*vo = !!(cam & TM2_QW1W2_VO);
+}
+
+uint64_t xive2_tm_pull_os_ctx(XivePresenter *xptr, XiveTCTX *tctx,
+  hwaddr offset, unsigned size)
+{
+uint32_t qw1w2 = xive_tctx_word2(&tctx->regs[TM_QW1_OS]);
+uint32_t qw1w2_new;
+uint32_t cam = be32_to_cpu(qw1w2);
+uint8_t nvp_blk;
+uint32_t nvp_idx;
+bool vo;
+
+xive2_os_cam_decode(cam, &nvp_blk, &nvp_idx, &vo);
+
+if (!vo) {
+qemu_log_mask(LOG_GUEST_ERROR, "XIVE: pulling invalid NVP %x/%x !?\n",
+  nvp_blk, nvp_idx);
+}
+
+/* Invalidate CAM line */
+qw1w2_new = xive_set_field32(TM2_QW1W2_VO, qw1w2, 0);
+memcpy(&tctx->regs[TM_QW1_OS + TM_WORD2], &qw1w2_new, 4);
+
+return qw1w2;
+}
+
+static void xive2_tctx_need_resend(Xive2Router *xrtr, XiveTCTX *tctx,
+   uint8_t nvp_blk, uint32_t nvp_idx)
+{
+Xive2Nvp nvp;
+uint8_t ipb;
+uint8_t cppr = 0;
+
+/*
+ * Grab the associated thread interrupt context registers in the
+ * associated NVP
+ */
+if (xive2_router_get_nvp(xrtr, nvp_blk, nvp_idx, &nvp)) {
+qemu_log_mask(LOG_GUEST_ERROR, "XIVE: No NVP %x/%x\n",
+  nvp_blk, nvp_idx);
+return;
+   
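
The "pull OS context" read in xive2_tm_pull_os_ctx() returns the current CAM word and clears its valid bit in the same access. A stripped-down sketch of that read-and-invalidate pattern follows. The struct and function names are hypothetical, the word is kept in host byte order for simplicity, and only the position of the valid bit (bit 0 in IBM numbering, as for TM2_QW1W2_VO) is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 0 in IBM numbering, i.e. the most significant bit. */
#define CAM_VALID_BIT UINT32_C(0x80000000)

/* Hypothetical stand-in for the OS ring word 2 of the thread context. */
struct os_cam_line {
    uint32_t word2;
};

/*
 * Return the CAM word as it was before the pull and invalidate the line.
 * '*was_valid' reports whether the line was valid, so a caller could log
 * a guest error when pulling an already invalid context.
 */
static uint32_t pull_os_ctx(struct os_cam_line *line, bool *was_valid)
{
    assert(line != NULL && was_valid != NULL);

    uint32_t old = line->word2;
    *was_valid = (old & CAM_VALID_BIT) != 0;
    line->word2 = old & ~CAM_VALID_BIT;
    return old;
}
```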

Re: [PATCH v2] hw/intc/arm_gicv3: Update cached state after LPI state changes

2021-11-26 Thread Alex Bennée


Peter Maydell  writes:

> The logic of gicv3_redist_update() is as follows:
>  * it must be called in any code path that changes the state of
>(only) redistributor interrupts
>  * if it finds a redistributor interrupt that is (now) higher
>priority than the previous highest-priority pending interrupt,
>then this must be the new highest-priority pending interrupt
>  * if it does *not* find a better redistributor interrupt, then:
> - if the previous state was "no interrupts pending" then
>   the new state is still "no interrupts pending"
> - if the previous best interrupt was not a redistributor
>   interrupt then that remains the best interrupt
> - if the previous best interrupt *was* a redistributor interrupt,
>   then the new best interrupt must be some non-redistributor
>   interrupt, but we don't know which so must do a full scan
>
> In commit 17fb5e36aabd4b2c125 we effectively added the LPI interrupts
> as a kind of "redistributor interrupt" for this purpose, by adding
> cs->hpplpi to the set of things that gicv3_redist_update() considers
> before it gives up and decides to do a full scan of distributor
> interrupts. However we didn't quite get this right:
>  * the condition check for "was the previous best interrupt a
>redistributor interrupt" must be updated to include LPIs
>in what it considers to be redistributor interrupts
>  * every code path which updates the LPI state which
>gicv3_redist_update() checks must also call gicv3_redist_update():
>this is cs->hpplpi and the GICR_CTLR ENABLE_LPIS bit
>
> This commit fixes this by:
>  * correcting the test on cs->hppi.irq in gicv3_redist_update()
>  * making gicv3_redist_update_lpi() always call gicv3_redist_update()
>  * introducing a new gicv3_redist_update_lpi_only() for the one
>callsite (the post-load hook) which must not call
>gicv3_redist_update()
>  * making gicv3_redist_lpi_pending() always call gicv3_redist_update(),
>either directly or via gicv3_redist_update_lpi()
>  * removing a couple of now-unnecessary calls to gicv3_redist_update()
>from some callers of those two functions
>  * calling gicv3_redist_update() when the GICR_CTLR ENABLE_LPIS
>bit is cleared
>
> (This means that the not-file-local gicv3_redist_* LPI related
> functions now all take care of the updates of internally cached
> GICv3 information, in the same way the older functions
> gicv3_redist_set_irq() and gicv3_redist_send_sgi() do.)
>
> The visible effect of this bug was that when the guest acknowledged
> an LPI by reading ICC_IAR1_EL1, we marked it as not pending in the
> LPI data structure but still left it in cs->hppi so we would offer it
> to the guest again.  In particular for setups using an emulated GICv3
> and ITS and using devices which use LPIs (ie PCI devices) a Linux
> guest would complain "irq 54: nobody cared" and then hang.  (The hang
> was intermittent, presumably depending on the timing between
> different interrupts arriving and being completed.)
>
> Signed-off-by: Peter Maydell 

Tested-by: Alex Bennée 

Interestingly this also triggers an extra IRQ in v4 of my kvm-unit-test
ITS patches. However it works with v3 which was more limited in the
excising of the test:

v3:

--8<---cut here---start->8---
modified   arm/gic.c
@@ -732,21 +732,17 @@ static void test_its_trigger(void)
"dev2/eventid=20 does not trigger any LPI");
 
/*
-* re-enable the LPI but willingly do not call invall
-* so the change in config is not taken into account.
-* The LPI should not hit
+* re-enable the LPI. While "A change to the LPI configuration
+* is not guaranteed to be visible until an appropriate
+* invalidation operation has completed" hardware that doesn't
+* implement caches may have delivered the event at any point
+* after the enabling. Check the LPI has hit by the time the
+* invall is done.
 */
gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT);
stats_reset();
cpumask_clear(&mask);
its_send_int(dev2, 20);
-   wait_for_interrupts(&mask);
-   report(check_acked(&mask, -1, -1),
-   "dev2/eventid=20 still does not trigger any LPI");
-
-   /* Now call the invall and check the LPI hits */
-   stats_reset();
-   cpumask_clear(&mask);
cpumask_set_cpu(3, &mask);
its_send_invall(col3);
wait_for_interrupts(&mask);
--8<---cut here---end--->8---

v4:

--8<---cut here---start->8---
modified   arm/gic.c
@@ -732,34 +732,22 @@ static void test_its_trigger(void)
"dev2/eventid=20 does not trigger any LPI");
 
/*
-* re-enable the LPI but willingly do not call invall
-* so the change in config is not taken into account.
-* The LPI should not hit
+* 
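
The decision rules for gicv3_redist_update() quoted at the top of this message can be written down as a small pure function. This is a sketch of the rules as the commit message states them, not QEMU's code, and all names in it are made up. After the fix, LPIs count as redistributor interrupts, so a previous best LPI falls under PREV_REDIST.

```c
#include <assert.h>
#include <stdbool.h>

/* Where the previous highest-priority pending interrupt came from. */
enum prev_best {
    PREV_NONE,    /* no interrupt was pending */
    PREV_REDIST,  /* a redistributor interrupt (including LPIs) */
    PREV_OTHER    /* a non-redistributor interrupt */
};

enum update_action {
    ACT_TAKE_NEW,   /* the redistributor interrupt found is the new best */
    ACT_KEEP_PREV,  /* previous state (possibly "none pending") still holds */
    ACT_FULL_SCAN   /* best is now elsewhere: a full scan is required */
};

static enum update_action redist_update_action(enum prev_best prev,
                                               bool found_better)
{
    if (found_better) {
        return ACT_TAKE_NEW;
    }
    switch (prev) {
    case PREV_NONE:
    case PREV_OTHER:
        return ACT_KEEP_PREV;
    case PREV_REDIST:
        return ACT_FULL_SCAN;
    }
    assert(!"unreachable");
    return ACT_FULL_SCAN;
}
```

The bug described in the commit message corresponds to treating a previous best LPI as PREV_OTHER, which keeps a stale interrupt instead of triggering the full scan.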

Re: [OpenBIOS] Re: [RFC PATCH 0/2] QEMU/openbios: PPC Software TLB support in the G4 family

2021-11-26 Thread Fabiano Rosas
Segher Boessenkool  writes:

> Hi!
>
> On Fri, Nov 26, 2021 at 09:34:44AM +0100, Cédric Le Goater wrote:
>> On 11/25/21 10:38, Segher Boessenkool wrote:
>> >On Thu, Nov 25, 2021 at 01:45:00AM +0100, BALATON Zoltan wrote:
>> >>As for guests, those running on the said PowerMac G4 should have support
>> >>for these CPUs so maybe you can try some Mac OS X versions (or maybe
>> >
>> >OSX uses hardware pagetables.
>> >
>> >>MorphOS but that is not the best for debugging as there's no source
>> >>available nor any help from its owners but just to see if it boots it may
>> >>be sufficient, it should work on real PowerMac G4).
>> >
>> >I have no idea what MorphOS uses, but I bet HPT as well.  That is
>> >because HPT is fastest in general.  Software TLB reloads are good in
>> >special cases only; the most common is real-time OSes, which can use its
>> >lower guaranteed latency for some special address spaces (and can have a
>> >simpler address map in general).
>> 
>> The support was added to QEMU knowing that Linux didn't handle soft TLBs.
>> And the commit says that it was kept disabled initially. I guess that was
>> broken these last years.
>
> Ah :-)  So when was it enabled, do you know?

Hm.. That commit message does not match the code. They simply added the
software TLB implementation to an already existing SOFT_74xx MMU
model. I don't see anything that would keep it disabled at that time.



Re: Follow-up on the CXL discussion at OFTC

2021-11-26 Thread Alex Bennée


Ben Widawsky  writes:

> On 21-11-19 02:29:51, Shreyas Shah wrote:
>> Hi Ben
>> 
>> Are you planning to add the CXL2.0 switch inside QEMU or already added in 
>> one of the version? 
>>  
>
> From me, there are no plans for QEMU anything until/unless upstream thinks it
> will merge the existing patches, or provide feedback as to what it would take
> to get them merged. If upstream doesn't see a point in these patches, then I
> really don't see much value in continuing to further them. Once hardware comes
> out, the value proposition is certainly less.

I take it:

  Subject: [RFC PATCH v3 00/31] CXL 2.0 Support
  Date: Mon,  1 Feb 2021 16:59:17 -0800
  Message-Id: <20210202005948.241655-1-ben.widaw...@intel.com>

is the current state of the support? I saw there was a fair amount of
discussion on the thread so assumed there would be a v4 forthcoming at
some point.

Adding new subsystems to QEMU does seem to be a pain point for new
contributors. Patches tend to fall through the cracks of existing
maintainers who spend most of their time looking at stuff that directly
touches their files. There is also a reluctance to merge large chunks of
functionality without an identified maintainer (and maybe reviewers) who
can be the contact point for new patches. So in short you need:

 - Maintainer Reviewed-by/Acked-by on patches that touch other sub-systems
 - Reviewed-by tags on the new sub-system patches from anyone who understands
   CXL
 - Some* in-tree testing (so it doesn't quietly bitrot)
 - A patch adding the sub-system to MAINTAINERS with identified people

* Some means at least ensuring qtest can instantiate the device and not
  fall over. Obviously more testing is better but it can always be
  expanded on in later series.

Is that the feedback you were looking for?

-- 
Alex Bennée



Re: [RFC PATCH 0/2] QEMU/openbios: PPC Software TLB support in the G4 family

2021-11-26 Thread Fabiano Rosas
Mark Cave-Ayland  writes:

> On 26/11/2021 08:40, Cédric Le Goater wrote:
>
>> On 11/26/21 09:01, Mark Cave-Ayland wrote:
>>> On 24/11/2021 22:00, Fabiano Rosas wrote:
>>>
 Fabiano Rosas  writes:

> Hi all,
>
> We have this bug in QEMU which indicates that we haven't been able to
> run openbios on a 7450 cpu for quite a long time:
>
> https://gitlab.com/qemu-project/qemu/-/issues/86
>
> OK:
>    $ ./qemu-system-ppc -serial mon:stdio -nographic -cpu 7410
>
>    >> =
>    >> OpenBIOS 1.1 [Nov 1 2021 20:36]
>    ...
>
> NOK:
>    $ ./qemu-system-ppc -serial mon:stdio -nographic -cpu 7450 -d int
>    Raise exception at fff08cc4 => 004e (00)
>    QEMU: Terminated
>
> The actual issue is straightforward. There is a non-architected
> feature that QEMU has enabled by default that openbios doesn't know
> about. From the user manual:
>
> "The MPC7540 has a set of implementation-specific registers,
> exceptions, and instructions that facilitate very efficient software
> searching of the page tables in memory for when software table
> searching is enabled (HID0[STEN] = 1). This section describes those
> resources and provides three example code sequences that can be used
> in a MPC7540 system for an efficient search of the translation tables
> in software. These three code sequences can be used as handlers for
> the three exceptions requiring access to the PTEs in the page tables
> in memory in this case-instruction TLB miss, data TLB miss on load,
> and data TLB miss on store exceptions."
>
> The current state:
>
> 1) QEMU does not check HID0[STEN] and makes the feature always enabled
> by setting these cpus with the POWERPC_MMU_SOFT_74xx MMU model,
> instead of the generic POWERPC_MMU_32B.
>
> 2) openbios does not recognize the PVRs for those cpus and also does
> not have any handlers for the software TLB exceptions (vectors 0x1000,
> 0x1100, 0x1200).
>
> Some assumptions (correct me if I'm wrong please):
>
> - openbios is the only firmware we use for the following cpus: 7441,
> 7445, 7450, 7451, 7455, 7457, 7447, 7447a, 7448.
> - without openbios, we cannot have a guest running on these cpus.
>
> So to bring 7450 back to life we would need to either:
>
> a) find another firmware/guest OS code that supports the feature;
>
> b) implement the switching of the feature in QEMU and have the guest
> code enable it only when supported. That would take some fiddling with
> the MMU code to: merge POWERPC_MMU_SOFT_74xx into POWERPC_MMU_32B,
> check the HID0[STEN] bit, figure out how to switch from HW TLB miss to
> SW TLB miss on demand, block access to the TLBMISS register (and
> others) when the feature is off, and so on;
>
> c) leave the feature enabled in QEMU and implement the software TLB
> miss handlers in openbios. The UM provides sample code, so this is
> easy;
>
> d) remove support for software TLB search for the 7450 family and
> switch the cpus to the POWERPC_MMU_32B model. This is by far the
> easiest solution, but could cause problems for any (which?) guest OS
> code that actually uses the feature. All of the existing code for the
> POWERPC_MMU_SOFT_74xx MMU model would probably be removed since it
> would be dead code then;
>
> Option (c) seemed to me like a good compromise so this is a patch
> series for openbios doing that and also adding the necessary PVRs so
> we can get a working guest with these cpus without too much effort.
>
> I have also a patch for QEMU adding basic sanity check tests for the
> 7400 and 7450 families. I'll send that separately to the QEMU ml.
>
> Fabiano Rosas (2):
>    ppc: Add support for MPC7450 software TLB miss interrupts
>    ppc: Add PVRs for the MPC7450 family
>
>   arch/ppc/qemu/init.c  |  52 ++
>   arch/ppc/qemu/start.S | 236 +-
>   2 files changed, 285 insertions(+), 3 deletions(-)

 (Adding Mark because his email got somehow dropped from the original
 message)
>>>
 So with these patches in OpenBIOS we could get a bit further and call
 into the Linux kernel using the same image as the one used for the
 7400. However there seems to be no support for the 7450 software TLB in
 the kernel. There are only handlers for the 4xx, 8xx and 603 which are
 different code altogether. There's no mention of the TLBMISS and
 PTEHI/LO registers in the code as well.

 Do we know of any guest OS that implements the 7450 software TLB at
 vectors 0x1000, 0x1100 and 0x1200? Otherwise replacing the
 POWERPC_MMU_SOFT_74xx model with POWERPC_MMU_32B might be the only way
 of getting an OS 
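
For reference, the three software TLB miss vectors discussed in this thread can be tabulated as below. This is a sketch with hypothetical names. The pairing of each vector with its exception assumes that the order in which the message lists the vectors (0x1000, 0x1100, 0x1200) matches the order in which the manual excerpt lists the exceptions.

```c
#include <assert.h>
#include <stdint.h>

enum soft_tlb_miss {
    TLB_MISS_NONE = 0,   /* not a software TLB miss vector */
    TLB_MISS_INSN = 1,   /* instruction TLB miss */
    TLB_MISS_LOAD = 2,   /* data TLB miss on load */
    TLB_MISS_STORE = 3   /* data TLB miss on store */
};

/* Map an exception vector offset to the kind of TLB miss it serves. */
static enum soft_tlb_miss soft_tlb_miss_for_vector(uint32_t vector)
{
    switch (vector) {
    case 0x1000:
        return TLB_MISS_INSN;
    case 0x1100:
        return TLB_MISS_LOAD;
    case 0x1200:
        return TLB_MISS_STORE;
    default:
        return TLB_MISS_NONE;
    }
}

/* Inverse mapping: the vector a handler for 'kind' must be installed at. */
static uint32_t soft_tlb_vector(enum soft_tlb_miss kind)
{
    assert(kind != TLB_MISS_NONE);
    return 0x1000u + 0x100u * ((uint32_t)kind - 1u);
}
```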

Re: [RFC PATCH 0/2] QEMU/openbios: PPC Software TLB support in the G4 family

2021-11-26 Thread Fabiano Rosas
BALATON Zoltan  writes:

> On Wed, 24 Nov 2021, Fabiano Rosas wrote:
>> Fabiano Rosas  writes:
>>
>>> Hi all,
>>>
>>> We have this bug in QEMU which indicates that we haven't been able to
>>> run openbios on a 7450 cpu for quite a long time:
>>>
>>> https://gitlab.com/qemu-project/qemu/-/issues/86
>>>
>>> OK:
>>>   $ ./qemu-system-ppc -serial mon:stdio -nographic -cpu 7410
>>>
>>>  >> =
>>>  >> OpenBIOS 1.1 [Nov 1 2021 20:36]
>>>   ...
>>>
>>> NOK:
>>>   $ ./qemu-system-ppc -serial mon:stdio -nographic -cpu 7450 -d int
>
> This CPU appears in PowerMac G4 so maybe better use -machine mac99,via=pmu 
> with it as it's strange to put it in a g3beige but that may not matter for 
> reproducing the problem.
>
> As for guests, those running on the said PowerMac G4 should have support 
> for these CPUs so maybe you can try some Mac OS X versions (or maybe 
> MorphOS but that is not the best for debugging as there's no source 
> available nor any help from its owners but just to see if it boots it may 
> be sufficient, it should work on real PowerMac G4). According to 
>  this CPU was used 
> in  
> and it runs up to Mac OS 10.4.11. (Although OpenBIOS sets the device tree 
> according to a PowerMac3,1 so not sure it's entirely correct for the 
> PowerMac3,5 that has a 7450 CPU and if it matters for Mac OS X.)
>
> I asked about this before but got no reply back then:
> https://lists.nongnu.org/archive/html/qemu-ppc/2020-03/msg00292.html
>
> This was because pegasos2 should have 7447 but it did not work so 
> currently I've set it to 7400 which also works. The original board 
> firmware had some problem detecting it but I think that only results in 
> wrong CPU speed shown which is only a cosmetic problem, otherwise it seems 
> to work. Since pegasos2 does not use OpenBIOS but either VOF or the 
> board's original firmware it may be an alternative way to test at least 
> 7447 which the firmware and guests running on that board should work with. 
> At least Debian 8.11 powerpc version had support for pegasos2 and should 
> boot, I'm not sure newer versions still work. More info on pegasos2 can be 
> found at:
> http://zero.eik.bme.hu/~balaton/qemu/amiga/#morphos 
> and
> https://osdn.net/projects/qmiga/wiki/SubprojectPegasos2
>
> I don't remember what problem I had with 7447 but if it does not work with 
> pegasos2 then maybe there's some other problem with it too. I think it was 
> maybe related to TLBs but I don't know and had no time to try again so I 
> could be entirely wrong about this.

So yesterday I tested these:

* all with my patched OpenBIOS
** all of them work fine with the 7400 CPU

- macos9 w/ -M mac99 and -cpu 7450

OS starts and then freezes. Upon further inspection I see that it has
the 0x1000, 0x1100 and 0x1200 vectors implemented, but not the 7450
ones. It implements the 6xx SW TLB handler, i.e. it accesses SPR 976
instead of 980.

- macosx10 w/ -M mac99 and -cpu 7450

Shows the apple logo and then spins. Looking at the asm I don't see
anything resembling the 7450 software TLB code. I'm calling it unsupported.

- debian 10 w/ -M mac99 and -cpu 7450

Boots linux and then spins. It has the vectors implemented, but it also
uses different SPRs. The data misses come via 976, which is different
from 7450, which uses only 980 (tlbmiss) for instruction and data.

- morphos w/ -M pegasos2 and -cpu 7447|7450

Hangs. It also has a different software TLB model implemented:
Trying to read invalid spr 978 (0x3d2) at 1100
Trying to read invalid spr 977 (0x3d1) at 110c
Trying to read invalid spr 979 (0x3d3) at 115c
Trying to read invalid spr 976 (0x3d0) at 1188

So my initial impression that no OS supports the 7450 software TLB seems
to match these findings and what people have said elsewhere in the
thread.

>
> Regards,
> BALATON Zoltan
>
>>>   Raise exception at fff08cc4 => 004e (00)
>>>   QEMU: Terminated
>>>
>>> The actual issue is straightforward. There is a non-architected
>>> feature that QEMU has enabled by default that openbios doesn't know
>>> about. From the user manual:
>>>
>>> "The MPC7540 has a set of implementation-specific registers,
>>> exceptions, and instructions that facilitate very efficient software
>>> searching of the page tables in memory for when software table
>>> searching is enabled (HID0[STEN] = 1). This section describes those
>>> resources and provides three example code sequences that can be used
>>> in a MPC7540 system for an efficient search of the translation tables
>>> in software. These three code sequences can be used as handlers for
>>> the three exceptions requiring access to the PTEs in the page tables
>>> in memory in this case-instruction TLB miss, data TLB miss on load,
>>> and data TLB miss on store exceptions."
>>>
>>> The current state:
>>>
>>> 1) QEMU does not check
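
The check applied by hand to each guest above (does its miss handler read SPRs 976..979 rather than only 980?) could be automated roughly as follows. This is a sketch under the assumptions stated in the message: the 7450 model uses only SPR 980 (TLBMISS) for both instruction and data misses, and a read of any of SPRs 976 to 979 indicates a different software TLB model. The function name is hypothetical and the input is simply a list of SPR numbers collected from a trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if any SPR read by the guest's TLB miss handlers falls in
 * 976..979, which the 7450 software TLB model does not use.
 */
static bool reads_non_7450_miss_sprs(const unsigned *sprs, size_t n)
{
    assert(sprs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (sprs[i] >= 976 && sprs[i] <= 979) {
            return true;
        }
    }
    return false;
}
```

Feeding it the four SPR numbers from the MorphOS log (978, 977, 979, 976) reports a non-7450 handler, while a trace that only touches SPR 980 does not.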

Re: [OpenBIOS] Re: [RFC PATCH 0/2] QEMU/openbios: PPC Software TLB support in the G4 family

2021-11-26 Thread Cédric Le Goater

On 11/26/21 13:13, Fabiano Rosas wrote:

Segher Boessenkool  writes:


Hi!

On Fri, Nov 26, 2021 at 09:34:44AM +0100, Cédric Le Goater wrote:

On 11/25/21 10:38, Segher Boessenkool wrote:

On Thu, Nov 25, 2021 at 01:45:00AM +0100, BALATON Zoltan wrote:

As for guests, those running on the said PowerMac G4 should have support
for these CPUs so maybe you can try some Mac OS X versions (or maybe


OSX uses hardware pagetables.


MorphOS but that is not the best for debugging as there's no source
available nor any help from its owners but just to see if it boots it may
be sufficient, it should work on real PowerMac G4).


I have no idea what MorphOS uses, but I bet HPT as well.  That is
because HPT is fastest in general.  Software TLB reloads are good in
special cases only; the most common is real-time OSes, which can use its
lower guaranteed latency for some special address spaces (and can have a
simpler address map in general).


The support was added to QEMU knowing that Linux didn't handle soft TLBs.
And the commit says that it was kept disabled initially. I guess that was
broken these last years.


Ah :-)  So when was it enabled, do you know?


Hm.. That commit message does not match the code. They simply added the
software TLB implementation to an already existing SOFT_74xx MMU
model. I don't see anything that would keep it disabled at that time.



because most of the cpu definitions in ppc_defs[] are protected by a :

#if defined (TODO)

See below. commit 8ca3f6c3824c ("Allow selection of all defined PowerPC
74xx (aka G4) CPUs.") removed the TODO without a reason :/ This is old,
when SVN was in use.


Thanks,

C.


#if defined (TODO)
/* PowerPC 7450 (G4) */
POWERPC_DEF("7450",CPU_POWERPC_7450,0x, 7450),
/* Code name for PowerPC 7450*/
POWERPC_DEF("Vger",CPU_POWERPC_7450,0x, 7450),
#endif
#if defined (TODO)
/* PowerPC 7450 v1.0 (G4)*/
POWERPC_DEF("7450v1.0",CPU_POWERPC_7450_v10,0x, 7450),
#endif
#if defined (TODO)
/* PowerPC 7450 v1.1 (G4)*/
POWERPC_DEF("7450v1.1",CPU_POWERPC_7450_v11,0x, 7450),
#endif
#if defined (TODO)
/* PowerPC 7450 v1.2 (G4)*/
POWERPC_DEF("7450v1.2",CPU_POWERPC_7450_v12,0x, 7450),
#endif
#if defined (TODO)
/* PowerPC 7450 v2.0 (G4)*/
POWERPC_DEF("7450v2.0",CPU_POWERPC_7450_v20,0x, 7450),
#endif
#if defined (TODO)
/* PowerPC 7450 v2.1 (G4)*/
POWERPC_DEF("7450v2.1",CPU_POWERPC_7450_v21,0x, 7450),
#endif
#if defined (TODO)
/* PowerPC 7441 (G4) */
POWERPC_DEF("7441",CPU_POWERPC_74x1,0x, 7440),
/* PowerPC 7451 (G4) */
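
The effect of the guard is easy to reproduce in isolation: as long as TODO is not defined, the guarded entries are never compiled into the table, so the CPU cannot be selected by name. The sketch below uses a made-up table and lookup function, not the real ppc_defs[] or POWERPC_DEF().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct cpu_def {
    const char *name;
};

/* Hypothetical miniature of a CPU table with a guarded entry. */
static const struct cpu_def cpu_defs[] = {
    { "7400" },
#if defined(TODO)
    { "7450" },   /* only present when TODO is defined at build time */
#endif
};

/* Look a CPU up by name; returns NULL if it is not in the table. */
static const struct cpu_def *find_cpu(const char *name)
{
    assert(name != NULL);

    for (size_t i = 0; i < sizeof(cpu_defs) / sizeof(cpu_defs[0]); i++) {
        if (strcmp(cpu_defs[i].name, name) == 0) {
            return &cpu_defs[i];
        }
    }
    return NULL;
}
```

Removing the guard, as the commit mentioned above did, makes the entry selectable regardless of whether the code behind it works.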




Re: [OpenBIOS] Re: [RFC PATCH 0/2] QEMU/openbios: PPC Software TLB support in the G4 family

2021-11-26 Thread Fabiano Rosas
Cédric Le Goater  writes:

> On 11/26/21 13:13, Fabiano Rosas wrote:
>> Segher Boessenkool  writes:
>> 
>>> Hi!
>>>
>>> On Fri, Nov 26, 2021 at 09:34:44AM +0100, Cédric Le Goater wrote:
>>>> On 11/25/21 10:38, Segher Boessenkool wrote:
>>>>> On Thu, Nov 25, 2021 at 01:45:00AM +0100, BALATON Zoltan wrote:
>>>>>> As for guests, those running on the said PowerMac G4 should have support
>>>>>> for these CPUs so maybe you can try some Mac OS X versions (or maybe
>>>>>
>>>>> OSX uses hardware pagetables.
>>>>>
>>>>>> MorphOS but that is not the best for debugging as there's no source
>>>>>> available nor any help from its owners but just to see if it boots it may
>>>>>> be sufficient, it should work on real PowerMac G4).
>>>>>
>>>>> I have no idea what MorphOS uses, but I bet HPT as well.  That is
>>>>> because HPT is fastest in general.  Software TLB reloads are good in
>>>>> special cases only; the most common is real-time OSes, which can use its
>>>>> lower guaranteed latency for some special address spaces (and can have a
>>>>> simpler address map in general).
>>>>
>>>> The support was added to QEMU knowing that Linux didn't handle soft TLBs.
>>>> And the commit says that it was kept disabled initially. I guess that was
>>>> broken these last years.
>>>
>>> Ah :-)  So when was it enabled, do you know?
>> 
>> Hm.. That commit message does not match the code. They simply added the
>> software TLB implementation to an already existing SOFT_74xx MMU
>> model. I don't see anything that would keep it disabled at that time.
>> 
>
> because most of the cpu definitions in ppc_defs[] are protected by a :
>
> #if defined (TODO)
>
> See below. commit 8ca3f6c3824c ("Allow selection of all defined PowerPC
> 74xx (aka G4) CPUs.") removed the TODO without a reason :/ This is old,
> when SVN was in use.

Ah nice catch!



[PATCH v2] dbus-vmstate: Restrict error checks to registered proxies in dbus_get_proxies

2021-11-26 Thread Priyankar Jain
The purpose of dbus_get_proxies is to construct the proxies corresponding to
the IDs registered to dbus-vmstate.

Currently, this function returns an error if instantiating a proxy fails for
any of the names on the bus.

Ideally this function should error out only if it is not able to find and
validate the proxies registered to the backend. Otherwise any offending
process (for example, one that purposefully does not export its Id property
on the bus) may connect to the bus and cause migration failures.

This commit ensures that dbus_get_proxies returns an error only if it is not
able to find and validate the proxies of interest (the IDs registered
during the dbus-vmstate instantiation).

Signed-off-by: Priyankar Jain 
---
 backends/dbus-vmstate.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/backends/dbus-vmstate.c b/backends/dbus-vmstate.c
index 9cfd758c42..57369ec0f2 100644
--- a/backends/dbus-vmstate.c
+++ b/backends/dbus-vmstate.c
@@ -114,14 +114,19 @@ dbus_get_proxies(DBusVMState *self, GError **err)
 "org.qemu.VMState1",
 NULL, err);
 if (!proxy) {
-return NULL;
+if (err != NULL && *err != NULL) {
+warn_report("%s: Failed to create proxy: %s",
+__func__, (*err)->message);
+g_clear_error(err);
+}
+continue;
 }
 
 result = g_dbus_proxy_get_cached_property(proxy, "Id");
 if (!result) {
-g_set_error_literal(err, G_IO_ERROR, G_IO_ERROR_FAILED,
-"VMState Id property is missing.");
-return NULL;
+warn_report("%s: VMState Id property is missing.", __func__);
+g_clear_object(&proxy);
+continue;
 }
 
 id = g_variant_dup_string(result, &size);
-- 
2.30.1 (Apple Git-130)
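
The policy described in the commit message can be sketched without GLib or D-Bus. This is only an illustration under stated assumptions: BusPeer and all_registered_ids_found are hypothetical names, not part of dbus-vmstate, and the real code works on GDBusProxy objects rather than plain structs. Peers that fail to instantiate or that export no Id are skipped, and the check fails only when a registered ID ends up without a usable peer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one name seen on the bus. */
typedef struct {
    const char *id;   /* NULL models a peer that exports no Id property */
    bool proxy_ok;    /* false models a failed proxy instantiation */
} BusPeer;

/*
 * Returns true when every ID in `wanted` is backed by a usable peer.
 * Unusable peers are skipped, so they can only cause a failure when
 * they leave a registered ID unmatched.
 */
static bool all_registered_ids_found(const BusPeer *peers, size_t n_peers,
                                     const char *const *wanted,
                                     size_t n_wanted)
{
    assert(peers != NULL || n_peers == 0);

    for (size_t w = 0; w < n_wanted; w++) {
        bool found = false;

        for (size_t p = 0; p < n_peers; p++) {
            if (!peers[p].proxy_ok || peers[p].id == NULL) {
                continue; /* the patch warns and skips at this point */
            }
            if (strcmp(peers[p].id, wanted[w]) == 0) {
                found = true;
                break;
            }
        }
        if (!found) {
            return false; /* a registered ID has no usable proxy */
        }
    }
    return true;
}
```

With this shape, an unrelated process on the bus cannot fail the migration by itself; it only matters if it is the sole provider of an ID the backend was configured with.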




Re: [PATCH v2] dbus-vmstate: Restrict error checks to registered proxies in dbus_get_proxies

2021-11-26 Thread Marc-André Lureau
On Fri, Nov 26, 2021 at 5:40 PM Priyankar Jain 
wrote:

> The purpose of dbus_get_proxies is to construct the proxies corresponding to
> the IDs registered to dbus-vmstate.
>
> Currently, this function returns an error if instantiating a proxy fails for
> any of the names on the bus.
>
> Ideally this function should error out only if it is not able to find and
> validate the proxies registered to the backend. Otherwise any offending
> process (for example, one that purposefully does not export its Id property
> on the bus) may connect to the bus and cause migration failures.
>
> This commit ensures that dbus_get_proxies returns an error only if it is not
> able to find and validate the proxies of interest (the IDs registered
> during the dbus-vmstate instantiation).
>
> Signed-off-by: Priyankar Jain 
>

Reviewed-by: Marc-André Lureau 

thanks

---
>  backends/dbus-vmstate.c | 13 +
>  1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/backends/dbus-vmstate.c b/backends/dbus-vmstate.c
> index 9cfd758c42..57369ec0f2 100644
> --- a/backends/dbus-vmstate.c
> +++ b/backends/dbus-vmstate.c
> @@ -114,14 +114,19 @@ dbus_get_proxies(DBusVMState *self, GError **err)
>  "org.qemu.VMState1",
>  NULL, err);
>  if (!proxy) {
> -return NULL;
> +if (err != NULL && *err != NULL) {
> +warn_report("%s: Failed to create proxy: %s",
> +__func__, (*err)->message);
> +g_clear_error(err);
> +}
> +continue;
>  }
>
>  result = g_dbus_proxy_get_cached_property(proxy, "Id");
>  if (!result) {
> -g_set_error_literal(err, G_IO_ERROR, G_IO_ERROR_FAILED,
> -"VMState Id property is missing.");
> -return NULL;
> +warn_report("%s: VMState Id property is missing.", __func__);
> +g_clear_object(&proxy);
> +continue;
>  }
>
>  id = g_variant_dup_string(result, &size);
> --
> 2.30.1 (Apple Git-130)
>
>
>

-- 
Marc-André Lureau


Re: [RFC PATCH 0/2] QEMU/openbios: PPC Software TLB support in the G4 family

2021-11-26 Thread Cédric Le Goater

Right. If we're doing this to say "I can boot a kernel with a 7450 cpu in QEMU"
but the implementation is different from real hardware, then I'm not sure what
the real value is. That effectively leaves option b) if someone is willing to
do the work, or as you say to simply remove the code from QEMU.


Yeah, that is a good point. Although the software TLB is well contained,
so we could certainly document that our 7450s don't have that feature
and call it a day. Does QEMU have any policy on how much of a machine is
required to be implemented?

I am more inclined to apply c) for now as I said, just to have some code
running on the CPU and maybe document in a gitlab issue that we're
lacking the runtime switch and eventually implement that. It's not like
this is high traffic code anyway. It has been broken for 10+ years.

That said, if Cédric and Daniel see more value in moving the 7450s to
the POWERPC_MMU_32B I won't oppose.


I am in favor of dropping unused code in QEMU and keeping the CPUs for
which we have support in Linux using the POWERPC_MMU_32B in QEMU and the
openbios patch. If we need SoftTLB support for the 74x CPUs in QEMU, we
can always dig in the history.

We can also give FreeBSD a try, since they had support for the G4:

  https://people.freebsd.org/~arved/stuff/minimac


With the openbios patch, Linux boots fine under 7450, 7455, 7447 CPUs.

Under 7448, it drops in xmon with a :
 
kernel tried to execute exec-protected page (c07fdd98) - exploit attempt? (uid: 0)

BUG: Unable to handle kernel instruction fetch
Faulting instruction address: 0xc07fdd98
Vector: 400 (Instruction Access) at [f1019d30]
pc: c07fdd98: __do_softirq+0x0/0x2f0
lr: c00516a4: irq_exit+0xbc/0xf8
sp: f1019df0
   msr: 10001032
  current = 0xc0d0
pid   = 1, comm = swapper


This should be fixable.

Thanks,

C.






[Bug 1952448] Re: qemu 1:6.0+dfsg-2expubuntu2: Fail to build against OpenSSL 3.0

2021-11-26 Thread Paride Legovini
** Also affects: qemu
   Importance: Undecided
   Status: New

** No longer affects: qemu

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1952448

Title:
  qemu 1:6.0+dfsg-2expubuntu2: Fail to build against OpenSSL 3.0

Status in qemu package in Ubuntu:
  Triaged

Bug description:
  Issue discovered after doing a "No-change rebuild" upload to Jammy
  while working at the liburing2 migration (LP: #1944037).

  Full build log:

  https://launchpadlibrarian.net/570888790/buildlog_ubuntu-jammy-amd64.qemu_1%3A6.0+dfsg-2expubuntu3_BUILDING.txt.gz

  Failure mode:

  /<>/qemu-6.0+dfsg/roms/skiboot/libstb/create-container.c: In function ‘getPublicKeyRaw’:
  /<>/qemu-6.0+dfsg/roms/skiboot/libstb/create-container.c:85:17: error: ‘EVP_PKEY_get1_EC_KEY’ is deprecated: Since OpenSSL 3.0 [-Werror=deprecated-declarations]

  Also note that:

  cc1: all warnings being treated as errors

  Upstream skiboot [1] still uses EVP_PKEY_get1_EC_KEY in master, and
  doesn't have an open issue about this. One is to be filed once we set
  up a reproducer that builds skiboot "standalone", outside of the qemu
  source tree.

  For the moment we have to relax the severity of that deprecation
  error, likely appending a -Wno-deprecated-declarations somewhere in
  d/rules.

  
  [1] https://github.com/open-power/skiboot

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1952448/+subscriptions




[PATCH v3] dbus-vmstate: Restrict error checks to registered proxies in dbus_get_proxies

2021-11-26 Thread Priyankar Jain
The purpose of dbus_get_proxies is to construct the proxies corresponding to
the IDs registered to dbus-vmstate.

Currently, this function returns an error if instantiating a proxy fails for
any of the names on the bus.

Ideally this function should error out only if it is not able to find and
validate the proxies registered to the backend. Otherwise any offending
process (for example, one that purposefully does not export its Id property
on the bus) may connect to the bus and cause migration failures.

This commit ensures that dbus_get_proxies returns an error only if it is not
able to find and validate the proxies of interest (the IDs registered
during the dbus-vmstate instantiation).

Signed-off-by: Priyankar Jain 
Reviewed-by: Marc-André Lureau 
---
 backends/dbus-vmstate.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/backends/dbus-vmstate.c b/backends/dbus-vmstate.c
index 9cfd758c42..57369ec0f2 100644
--- a/backends/dbus-vmstate.c
+++ b/backends/dbus-vmstate.c
@@ -114,14 +114,19 @@ dbus_get_proxies(DBusVMState *self, GError **err)
 "org.qemu.VMState1",
 NULL, err);
 if (!proxy) {
-return NULL;
+if (err != NULL && *err != NULL) {
+warn_report("%s: Failed to create proxy: %s",
+__func__, (*err)->message);
+g_clear_error(err);
+}
+continue;
 }
 
 result = g_dbus_proxy_get_cached_property(proxy, "Id");
 if (!result) {
-g_set_error_literal(err, G_IO_ERROR, G_IO_ERROR_FAILED,
-"VMState Id property is missing.");
-return NULL;
+warn_report("%s: VMState Id property is missing.", __func__);
+g_clear_object(&proxy);
+continue;
 }
 
 id = g_variant_dup_string(result, &size);
-- 
2.30.1 (Apple Git-130)




Re: [RFC PATCH 0/2] QEMU/openbios: PPC Software TLB support in the G4 family

2021-11-26 Thread BALATON Zoltan

On Fri, 26 Nov 2021, Fabiano Rosas wrote:

BALATON Zoltan  writes:


On Wed, 24 Nov 2021, Fabiano Rosas wrote:

Fabiano Rosas  writes:


Hi all,

We have this bug in QEMU which indicates that we haven't been able to
run openbios on a 7450 cpu for quite a long time:

https://gitlab.com/qemu-project/qemu/-/issues/86

OK:
  $ ./qemu-system-ppc -serial mon:stdio -nographic -cpu 7410

>> =
>> OpenBIOS 1.1 [Nov 1 2021 20:36]
  ...

NOK:
  $ ./qemu-system-ppc -serial mon:stdio -nographic -cpu 7450 -d int


This CPU appears in PowerMac G4 so maybe better use -machine mac99,via=pmu
with it as it's strange to put it in a g3beige but that may not matter for
reproducing the problem.

As for guests, those running on the said PowerMac G4 should have support
for these CPUs so maybe you can try some Mac OS X versions (or maybe
MorphOS but that is not the best for debugging as there's no source
available nor any help from its owners but just to see if it boots it may
be sufficient, it should work on real PowerMac G4). According to
 this CPU was used
in 
and it runs up to Mac OS 10.4.11. (Although OpenBIOS sets the device tree
according to a PowerMac3,1 so not sure it's entirely correct for the
PowerMac3,5 that has a 7450 CPU and if it matters for Mac OS X.)

I asked about this before but got no reply back then:
https://lists.nongnu.org/archive/html/qemu-ppc/2020-03/msg00292.html

This was because pegasos2 should have 7447 but it did not work so
currently I've set it to 7400 which also works. The original board
firmware had some problem detecting it but I think that only results in
wrong CPU speed shown which is only a cosmetic problem, otherwise it seems
to work. Since pegasos2 does not use OpenBIOS but either VOF or the
board's original firmware it may be an alternative way to test at least
7447 which the firmware and guests running on that board should work with.
At least Debian 8.11 powerpc version had support for pegasos2 and should
boot, I'm not sure newer versions still work. More info on pegasos2 can be
found at:
http://zero.eik.bme.hu/~balaton/qemu/amiga/#morphos
and
https://osdn.net/projects/qmiga/wiki/SubprojectPegasos2

I don't remember what problem I had with 7447 but if it does not work with
pegasos2 then maybe there's some other problem with it too. I think it was
maybe related to TLBs but I don't know and had no time to try again so I
could be entirely wrong about this.


So yesterday I tested these:

* all with my patched OpenBIOS
** all of them work fine with the 7400 CPU

- macos9 w/ -M mac99 and -cpu 7450

OS starts and then freezes. Upon further inspection I see that it has
the 0x1000, 0x1100 and 0x1200 vectors implemented, but not the 7450
ones. It implements the 6xx SW TLB handler, i.e. it accesses SPR 976
instead of 980.

- macosx10 w/ -M mac99 and -cpu 7450

Shows the apple logo and then spins. Looking at the asm I don't see
anything resembling the 7450 software TLB code. I'm calling it unsupported.

- debian 10 w/ -M mac99 and -cpu 7450


Beware that -M mac99 does not match the device tree, as it has ADB instead
of USB but claims to be PowerMac3,1 nevertheless. This is a silly default 
preserved for compatibility with some older OS X versions but to avoid 
problems it's better to always use -M mac99,via=pmu which is the closest 
to a real PowerMac3,1. There is some info on this and which option to use 
with which version at 
https://www.emaculation.com/doku.php/ppc-osx-on-qemu-for-osx



Boots linux and then spins. It has the vectors implemented, but it also
uses different SPRs. The data misses come via 976, which is different
from 7450, which uses only 980 (tlbmiss) for instruction and data.

- morphos w/ -M pegasos2 and -cpu 7447|7450

Hangs. It also has a different software TLB model implemented:
Trying to read invalid spr 978 (0x3d2) at 1100
Trying to read invalid spr 977 (0x3d1) at 110c
Trying to read invalid spr 979 (0x3d3) at 115c
Trying to read invalid spr 976 (0x3d0) at 1188

So my initial impression that no OS supports the 7450 software TLB seems
to match these findings and what people have said elsewhere in the
thread.
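
The reasoning behind these results can be written down as a small check. This is a sketch that relies only on the SPR numbers quoted above (976 to 979 read by the guests tested, 980 as the 7450 "tlbmiss" register); the enum and function names are made up for illustration and do not exist in QEMU. It says a handler cannot be following the 7450 software TLB model if it touches any of 976 to 979, and it makes no claim about what else a real 7450 handler would need.

```c
#include <assert.h>
#include <stdbool.h>

/* SPR numbers quoted in the test results above. */
enum {
    SPR_6XX_STYLE_FIRST = 976, /* data-miss SPR used by the guests tested */
    SPR_6XX_STYLE_LAST  = 979,
    SPR_7450_TLBMISS    = 980  /* "tlbmiss", instruction and data on 7450 */
};

/* True if the SPR is one the thread reports the 7450 model does not use. */
static bool spr_rules_out_7450_soft_tlb(int spr)
{
    assert(spr >= 0);
    return spr >= SPR_6XX_STYLE_FIRST && spr <= SPR_6XX_STYLE_LAST;
}

/* A handler fits the 7450 model only if none of its SPR reads rule it out. */
static bool handler_may_be_7450_style(const int *sprs, int n)
{
    for (int i = 0; i < n; i++) {
        if (spr_rules_out_7450_soft_tlb(sprs[i])) {
            return false;
        }
    }
    return true;
}
```

Feeding it the four SPRs from the MorphOS log (978, 977, 979, 976) gives false, which matches the conclusion that MorphOS implements a different software TLB model.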


I'm getting this with MorphOS on pegasos2:

$ qemu-system-ppc -M pegasos2 -device ati-vga,romfile="" -kernel boot.img 
-cdrom morphos-3.15.iso -serial stdio -d unimp,guest_errors,int 2>&1 | 
grep --line-buffered -v '^Invalid read\|^hypercall\|^syscall\|^Raise exception.*000[48a]'


Memory used before SYS_Init: 9MB
i8259: level sensitive irq not supported
i8259: level sensitive irq not supported


unsupported keyboard cmd=0xaf
PCI ATA/ATAPI Driver@2: PIO Mode 4
PCI ATA/ATAPI Driver@2: UDMA Mode 5
ide.device@2: QEMU QEMU DVD-ROM 

and it boots with the default 7400 CPU but with -cpu 7447 I get:

Memory used before SYS_Init: 9MB
i8259: level sensitive irq not sup

Re: [RFC PATCH 0/2] QEMU/openbios: PPC Software TLB support in the G4 family

2021-11-26 Thread BALATON Zoltan

On Fri, 26 Nov 2021, Cédric Le Goater wrote:
Right. If we're doing this to say "I can boot a kernel with a 7450 cpu in QEMU"
but the implementation is different from real hardware, then I'm not sure what
the real value is. That effectively leaves option b) if someone is willing to
do the work, or as you say to simply remove the code from QEMU.


Yeah, that is a good point. Although the software TLB is well contained,
so we could certainly document that our 7450s don't have that feature
and call it a day. Does QEMU have any policy on how much of a machine is
required to be implemented?

I am more inclined to apply c) for now as I said, just to have some code
running on the CPU and maybe document in a gitlab issue that we're
lacking the runtime switch and eventually implement that. It's not like
this is high traffic code anyway. It has been broken for 10+ years.

That said, if Cédric and Daniel see more value in moving the 7450s to
the POWERPC_MMU_32B I won't oppose.


I am in favor of dropping unused code in QEMU and keeping the CPUs for
which we have support in Linux using the POWERPC_MMU_32B in QEMU and the
openbios patch. If we need SoftTLB support for the 74x CPUs in QEMU, we
can always dig in the history.


If we can't find a guest that needs it and can be used to test with it may 
be OK to remove it for now but digging the history may not be easy if 
somebody later comes along with a guest that would need this. Likely they 
would just go away when finding it's not supported or maybe try to redo it 
from scratch and not think of checking history first. So if you drop it 
maybe leave a one line comment at some obvious place saying something like 
"74xx soft TLB removed in commit x" to make it simpler for those who 
may want to resurrect it later.


Regards,
BALATON Zoltan


We can also give FreeBSD a try, since they had support for the G4:

 https://people.freebsd.org/~arved/stuff/minimac


With the openbios patch, Linux boots fine under 7450, 7455, 7447 CPUs.

Under 7448, it drops in xmon with a :
kernel tried to execute exec-protected page (c07fdd98) - exploit attempt? (uid: 0)

BUG: Unable to handle kernel instruction fetch
Faulting instruction address: 0xc07fdd98
Vector: 400 (Instruction Access) at [f1019d30]
   pc: c07fdd98: __do_softirq+0x0/0x2f0
   lr: c00516a4: irq_exit+0xbc/0xf8
   sp: f1019df0
  msr: 10001032
 current = 0xc0d0
   pid   = 1, comm = swapper


This should be fixable.

Thanks,

C.






[PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever

2021-11-26 Thread Li Zhang
When doing live migration with 8, 16 or more multifd channels,
the guest hangs in the presence of network errors such as missing TCP ACKs.

At the sender's side:
The main thread is blocked on qemu_thread_join. migrate_fd_cleanup
is called because one thread fails on qio_channel_write_all when
the network problem happens, while the other send threads are blocked on
sendmsg and cannot be terminated. So the main thread blocks on
qemu_thread_join, waiting for those threads to terminate.

(gdb) bt
0  0x7f30c8dcffc0 in __pthread_clockjoin_ex () at /lib64/libpthread.so.0
1  0x55cbb716084b in qemu_thread_join (thread=0x55cbb881f418) at 
../util/qemu-thread-posix.c:627
2  0x55cbb6b54e40 in multifd_save_cleanup () at ../migration/multifd.c:542
3  0x55cbb6b4de06 in migrate_fd_cleanup (s=0x55cbb8024000) at 
../migration/migration.c:1808
4  0x55cbb6b4dfb4 in migrate_fd_cleanup_bh (opaque=0x55cbb8024000) at 
../migration/migration.c:1850
5  0x55cbb7173ac1 in aio_bh_call (bh=0x55cbb7eb98e0) at ../util/async.c:141
6  0x55cbb7173bcb in aio_bh_poll (ctx=0x55cbb7ebba80) at ../util/async.c:169
7  0x55cbb715ba4b in aio_dispatch (ctx=0x55cbb7ebba80) at 
../util/aio-posix.c:381
8  0x55cbb7173ffe in aio_ctx_dispatch (source=0x55cbb7ebba80, callback=0x0, 
user_data=0x0) at ../util/async.c:311
9  0x7f30c9c8cdf4 in g_main_context_dispatch () at 
/usr/lib64/libglib-2.0.so.0
10 0x55cbb71851a2 in glib_pollfds_poll () at ../util/main-loop.c:232
11 0x55cbb718521c in os_host_main_loop_wait (timeout=42251070366) at 
../util/main-loop.c:255
12 0x55cbb7185321 in main_loop_wait (nonblocking=0) at 
../util/main-loop.c:531
13 0x55cbb6e6ba27 in qemu_main_loop () at ../softmmu/runstate.c:726
14 0x55cbb6ad6fd7 in main (argc=68, argv=0x7ffc0c57, 
envp=0x7ffc0c578ab0) at ../softmmu/main.c:50

At the receiver's side:
Several receive threads are not created successfully, and the receive threads
that have been created are blocked on qemu_sem_wait. The semaphore is never
posted: migration does not start unless all the receive threads are created,
so multifd_recv_sync_main, which posts the semaphore to the receive threads,
is never called. The receive threads therefore wait on the semaphore and
never return. They should not wait for the semaphore forever.
Use qemu_sem_timedwait to wait for a while, then return and close the channels,
so that the guest no longer hangs.

(gdb) bt
0  0x7fd61c43f064 in do_futex_wait.constprop () at /lib64/libpthread.so.0
1  0x7fd61c43f158 in __new_sem_wait_slow.constprop.0 () at 
/lib64/libpthread.so.0
2  0x56075916014a in qemu_sem_wait (sem=0x56075b6515f0) at 
../util/qemu-thread-posix.c:358
3  0x560758b56643 in multifd_recv_thread (opaque=0x56075b651550) at 
../migration/multifd.c:1112
4  0x560759160598 in qemu_thread_start (args=0x56075befad00) at 
../util/qemu-thread-posix.c:556
5  0x7fd61c43594a in start_thread () at /lib64/libpthread.so.0
6  0x7fd61c158d0f in clone () at /lib64/libc.so.6

Signed-off-by: Li Zhang 
---
 migration/multifd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 7c9deb1921..656239ca2a 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1109,7 +1109,7 @@ static void *multifd_recv_thread(void *opaque)
 
 if (flags & MULTIFD_FLAG_SYNC) {
 qemu_sem_post(&multifd_recv_state->sem_sync);
-qemu_sem_wait(&p->sem_sync);
+qemu_sem_timedwait(&p->sem_sync, 1000);
 }
 }
 
-- 
2.31.1
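
For readers unfamiliar with timed semaphore waits, here is a minimal sketch of the idea using plain POSIX sem_timedwait as a stand-in for qemu_sem_timedwait. The helper name sem_wait_ms is hypothetical, and the sketch assumes a POSIX system with unnamed semaphores (for example Linux). It converts a millisecond budget into the absolute CLOCK_REALTIME deadline that sem_timedwait expects and reports whether the semaphore was actually acquired.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stdbool.h>
#include <time.h>

/*
 * Wait on `sem` for at most `ms` milliseconds.
 * Returns true if the semaphore was acquired, false on timeout or error.
 */
static bool sem_wait_ms(sem_t *sem, long ms)
{
    struct timespec deadline;
    int rc;

    assert(ms >= 0);
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0) {
        return false;
    }

    /* sem_timedwait takes an absolute deadline, not a duration. */
    deadline.tv_sec += ms / 1000;
    deadline.tv_nsec += (ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    /* Retry if a signal interrupts the wait before the deadline. */
    do {
        rc = sem_timedwait(sem, &deadline);
    } while (rc == -1 && errno == EINTR);

    return rc == 0;
}
```

The helper returns a result on purpose: a caller that times out still has to decide what to do next, for instance treating the timeout as a failure and leaving its loop.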




[PATCH 0/2] migration: multifd live migration improvement

2021-11-26 Thread Li Zhang
When testing live migration with multifd channels (8, 16, or a bigger number)
and using qemu -incoming (without "defer"), if a network error occurs
(for example, triggering the kernel SYN flooding detection),
the migration fails and the guest hangs forever.

The test environment and the command lines are as follows:

QEMU version: QEMU emulator version 6.2.91 (v6.2.0-rc1-47-gc5fbdd60cf)
Host OS: SLE 15  with kernel: 5.14.5-1-default
Network Card: mlx5 100Gbps
Network card: Intel Corporation I350 Gigabit (1Gbps)

Source:
qemu-system-x86_64 -M q35 -smp 32 -nographic \
-serial telnet:10.156.208.153:4321,server,nowait \
-m 4096 -enable-kvm -hda /var/lib/libvirt/images/openSUSE-15.3.img \
-monitor stdio
Dest:
qemu-system-x86_64 -M q35 -smp 32 -nographic \
-serial telnet:10.156.208.154:4321,server,nowait \
-m 4096 -enable-kvm -hda /var/lib/libvirt/images/openSUSE-15.3.img \
-monitor stdio \
-incoming tcp:1.0.8.154:4000

(qemu) migrate_set_parameter max-bandwidth 100G
(qemu) migrate_set_capability multifd on
(qemu) migrate_set_parameter multifd-channels 16

The guest hangs when executing the command: migrate -d tcp:1.0.8.154:4000.

If a network problem happens, TCP ACK is not received by destination
and the destination resets the connection with RST.

No.  Time      Source     Destination  Protocol  Length  Info
119  1.021169  1.0.8.153  1.0.8.154    TCP       1410    60166 → 4000 [PSH, ACK] Seq=65 Ack=1 Win=62720 Len=1344 TSval=1338662881 TSecr=1399531897
125  1.021181  1.0.8.154  1.0.8.153    TCP       54      4000 → 60166 [RST] Seq=1 Win=0 Len=0

kernel log:
[334520.229445] TCP: request_sock_TCP: Possible SYN flooding on port 4000. 
Sending cookies.  Check SNMP counters.
[334562.994919] TCP: request_sock_TCP: Possible SYN flooding on port 4000. 
Sending cookies.  Check SNMP counters.
[334695.519927] TCP: request_sock_TCP: Possible SYN flooding on port 4000. 
Sending cookies.  Check SNMP counters.
[334734.689511] TCP: request_sock_TCP: Possible SYN flooding on port 4000. 
Sending cookies.  Check SNMP counters.
[335687.740415] TCP: request_sock_TCP: Possible SYN flooding on port 4000. 
Sending cookies.  Check SNMP counters.
[335730.013598] TCP: request_sock_TCP: Possible SYN flooding on port 4000. 
Sending cookies.  Check SNMP counters.

There are two problems here:
1. When live migration fails, the guest hangs and no error is reported,
   even though an error has happened.
2. The network problem causes live migration to fail when the channel
   number is 8, 16, or larger.

The two patches fix these two problems.

Li Zhang (2):
  multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid
waiting forever
  migration: Set the socket backlog number to reduce chance of live
migration failure

 migration/multifd.c | 2 +-
 migration/socket.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

-- 
2.31.1




[PATCH 2/2] migration: Set the socket backlog number to reduce the chance of live migration failure

2021-11-26 Thread Li Zhang
When the guest is created on the destination with -incoming ip:port on the
QEMU command line, the code that sets the backlog to the number of multifd
channels doesn't get called, so the backlog is always 1. This easily causes
live migration to fail, so a bigger number is preferred to reduce the chance
of failure.

Signed-off-by: Li Zhang 
---
 migration/socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/socket.c b/migration/socket.c
index 05705a32d8..398d4c10fa 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -152,7 +152,7 @@ socket_start_incoming_migration_internal(SocketAddress 
*saddr,
 QIONetListener *listener = qio_net_listener_new();
 MigrationIncomingState *mis = migration_incoming_get_current();
 size_t i;
-int num = 1;
+int num = 16;
 
 qio_net_listener_set_name(listener, "migration-socket-listener");
 
-- 
2.31.1
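
As a point of comparison with the fixed value above, the commit message mentions code that sets the backlog to the number of multifd channels. A sketch of that rule in isolation could look as follows. incoming_backlog is a hypothetical helper, not a QEMU function, and the fallback to 1 when multifd is off or the channel count is not positive is an assumption made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper; not a QEMU function. */
static int incoming_backlog(bool multifd_enabled, int multifd_channels)
{
    assert(multifd_channels >= 0);

    /* Without multifd, a single pending connection is enough. */
    if (!multifd_enabled || multifd_channels < 1) {
        return 1;
    }
    /* With multifd, allow one pending connection per channel. */
    return multifd_channels;
}
```

The difficulty described in the commit message is that with -incoming on the command line the channel count is not applied when the listener is created, which is why the patch picks a constant instead.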




[PATCH] hid: Implement support for side and extra buttons

2021-11-26 Thread Noah Bergbauer
Simply set the respective bits and update the descriptor accordingly.

Signed-off-by: Noah Bergbauer 
---
 hw/input/hid.c   | 2 ++
 hw/usb/dev-hid.c | 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/input/hid.c b/hw/input/hid.c
index 8aab0521f4..e7ecebdf8f 100644
--- a/hw/input/hid.c
+++ b/hw/input/hid.c
@@ -114,6 +114,8 @@ static void hid_pointer_event(DeviceState *dev, QemuConsole 
*src,
 [INPUT_BUTTON_LEFT]   = 0x01,
 [INPUT_BUTTON_RIGHT]  = 0x02,
 [INPUT_BUTTON_MIDDLE] = 0x04,
+[INPUT_BUTTON_SIDE] = 0x08,
+[INPUT_BUTTON_EXTRA] = 0x10,
 };
 HIDState *hs = (HIDState *)dev;
 HIDPointerEvent *e;
diff --git a/hw/usb/dev-hid.c b/hw/usb/dev-hid.c
index 1c7ae97c30..bdd6d1ffaf 100644
--- a/hw/usb/dev-hid.c
+++ b/hw/usb/dev-hid.c
@@ -461,14 +461,14 @@ static const uint8_t qemu_mouse_hid_report_descriptor[] = 
{
 0xa1, 0x00,/*   Collection (Physical) */
 0x05, 0x09,/* Usage Page (Button) */
 0x19, 0x01,/* Usage Minimum (1) */
-0x29, 0x03,/* Usage Maximum (3) */
+0x29, 0x05,/* Usage Maximum (5) */
 0x15, 0x00,/* Logical Minimum (0) */
 0x25, 0x01,/* Logical Maximum (1) */
-0x95, 0x03,/* Report Count (3) */
+0x95, 0x05,/* Report Count (5) */
 0x75, 0x01,/* Report Size (1) */
 0x81, 0x02,/* Input (Data, Variable, Absolute) */
 0x95, 0x01,/* Report Count (1) */
-0x75, 0x05,/* Report Size (5) */
+0x75, 0x03,/* Report Size (3) */
 0x81, 0x01,/* Input (Constant) */
 0x05, 0x01,/* Usage Page (Generic Desktop) */
 0x09, 0x30,/* Usage (X) */
-- 
2.34.0
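
The descriptor change keeps the button field byte-aligned: five 1-bit button inputs followed by one 3-bit constant field add up to 8 bits. A small sketch of that layout, using the bit values from the patch; the enum and function names are illustrative only and are not QEMU identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Button bit values as assigned in the patch above. */
enum {
    BTN_LEFT   = 0x01,
    BTN_RIGHT  = 0x02,
    BTN_MIDDLE = 0x04,
    BTN_SIDE   = 0x08,
    BTN_EXTRA  = 0x10
};

/* Field widths from the updated report descriptor. */
enum { BUTTON_BITS = 5, PADDING_BITS = 3 };

/* Pack pressed-button flags into the first report byte. */
static uint8_t pack_buttons(unsigned pressed)
{
    const uint8_t mask = (uint8_t)((1u << BUTTON_BITS) - 1);

    /* Buttons plus constant padding must fill exactly one byte. */
    assert(BUTTON_BITS + PADDING_BITS == 8);
    return (uint8_t)(pressed & mask);
}
```

Before the patch the split was 3 button bits and 5 padding bits, so the report length stays the same and only the meaning of two formerly constant bits changes.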




Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever

2021-11-26 Thread Daniel P . Berrangé
On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> When doing live migration with 8, 16 or more multifd channels,
> the guest hangs in the presence of network errors such as missing TCP ACKs.
>
> At the sender's side:
> The main thread is blocked on qemu_thread_join. migrate_fd_cleanup
> is called because one thread fails on qio_channel_write_all when
> the network problem happens, while the other send threads are blocked on
> sendmsg and cannot be terminated. So the main thread blocks on
> qemu_thread_join, waiting for those threads to terminate.

Isn't the right answer here to ensure we've called 'shutdown' on
all the FDs, so that the threads get kicked out of sendmsg, before
trying to join the thread ?
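
A minimal sketch of the effect being relied on here, assuming a POSIX system with MSG_NOSIGNAL (for example Linux). It does not start a blocked thread; it only shows that once shutdown() has been called on a socket, a send on it fails at once with EPIPE instead of waiting. The function name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Shut down both directions of `fd`, then report whether a send on it
 * fails immediately instead of blocking.
 */
static bool send_fails_after_shutdown(int fd)
{
    char byte = 0;
    ssize_t n;

    assert(fd >= 0);
    if (shutdown(fd, SHUT_RDWR) != 0) {
        return false;
    }
    /* MSG_NOSIGNAL: report EPIPE as an error code rather than SIGPIPE. */
    n = send(fd, &byte, 1, MSG_NOSIGNAL);
    return n == -1 && errno == EPIPE;
}
```

Note that shutdown() leaves the descriptor open, so the thread being kicked can still see its error return and clean up before the descriptor is closed.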


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v6 3/3] cpus-common: implement dirty page limit on vCPU

2021-11-26 Thread Hyman




On 2021/11/26 15:03, Markus Armbruster wrote:

huang...@chinatelecom.cn writes:


From: Hyman Huang(黄勇) 

Implement dirtyrate calculation periodically basing on
dirty-ring and throttle vCPU until it reachs the quota
dirty page rate given by user.

Introduce qmp commands set-dirty-limit/cancel-dirty-limit to
set/cancel dirty page limit on vCPU.

Signed-off-by: Hyman Huang(黄勇) 
---
  cpus-common.c | 41 +
  include/hw/core/cpu.h |  9 +
  qapi/migration.json   | 47 +++
  softmmu/vl.c  |  1 +
  4 files changed, 98 insertions(+)

diff --git a/cpus-common.c b/cpus-common.c
index 6e73d3e..3c156b3 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -23,6 +23,11 @@
  #include "hw/core/cpu.h"
  #include "sysemu/cpus.h"
  #include "qemu/lockable.h"
+#include "sysemu/dirtylimit.h"
+#include "sysemu/cpu-throttle.h"
+#include "sysemu/kvm.h"
+#include "qapi/error.h"
+#include "qapi/qapi-commands-migration.h"
  
  static QemuMutex qemu_cpu_list_lock;

  static QemuCond exclusive_cond;
@@ -352,3 +357,39 @@ void process_queued_cpu_work(CPUState *cpu)
  qemu_mutex_unlock(&cpu->work_mutex);
  qemu_cond_broadcast(&qemu_work_cond);
  }
+
+void qmp_set_dirty_limit(int64_t idx,
+ uint64_t dirtyrate,
+ Error **errp)
+{
+if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
+error_setg(errp, "setting a dirty page limit requires support from dirty 
ring");


Can we phrase the message in a way that gives the user a chance to guess
what he needs to do to avoid it?
Perhaps: "setting a dirty page limit requires KVM with accelerator
property 'dirty-ring-size' set".

Sounds good, this makes things clearer.



+return;
+}
+
+dirtylimit_calc();
+dirtylimit_vcpu(idx, dirtyrate);
+}
+
+void qmp_cancel_dirty_limit(int64_t idx,
+Error **errp)
+{


Three cases:

Case 1: enable is impossible, so nothing to do.

Case 2: enable is possible and we actually enabled.

Case 3: enable is possible, but we didn't.  Nothing to do.


+if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
+error_setg(errp, "no need to cancel a dirty page limit as dirty ring not 
enabled");
+return;


This is case 1.  We error out.


+}
+
+if (unlikely(!dirtylimit_cancel_vcpu(idx))) {


I don't think unlikely() matters here.


+dirtylimit_calc_quit();
+}


In case 2, dirtylimit_calc_quit() returns zero if this was the last
limit, else non-zero.  If the former, we request the thread to stop.

I am guessing you misunderstood the function dirtylimit_cancel_vcpu, see below.


In case 3, dirtylimit_calc_quit() returns zero, and we do nothing.

In this case, we actually cancel the "dirtylimit thread" in the function
dirtylimit_cancel_vcpu. If it was the last limit thread of the whole VM,
dirtylimit_cancel_vcpu returns zero and we request the dirtyrate calculation
thread to stop by calling dirtylimit_calc_quit, which stops the
"dirtyrate calculation thread" internally.


Why is case 1 an error, but case 3 isn't?

Both could silently do nothing, like case 3 does now.

Both could error out, like case 1 does now.  A possible common error
message: "there is no dirty page limit to cancel".

I'd be okay with consistently doing nothing, and with consistently
erroring out.


+}
+
+void dirtylimit_setup(int max_cpus)
+{
+if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
+return;
+}
+
+dirtylimit_calc_state_init(max_cpus);
+dirtylimit_state_init(max_cpus);
+}
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index e948e81..11df012 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -881,6 +881,15 @@ void end_exclusive(void);
   */
  void qemu_init_vcpu(CPUState *cpu);
  
+/**

+ * dirtylimit_setup:
+ *
+ * Initializes the global state of dirtylimit calculation and
+ * dirtylimit itself. This is prepared for vCPU dirtylimit which
+ * could be triggered during vm lifecycle.
+ */
+void dirtylimit_setup(int max_cpus);
+
  #define SSTEP_ENABLE  0x1  /* Enable simulated HW single stepping */
  #define SSTEP_NOIRQ   0x2  /* Do not use IRQ while single stepping */
  #define SSTEP_NOTIMER 0x4  /* Do not Timers while single stepping */
diff --git a/qapi/migration.json b/qapi/migration.json
index bbfd48c..2b0fe19 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1850,6 +1850,53 @@
  { 'command': 'query-dirty-rate', 'returns': 'DirtyRateInfo' }
  
  ##

+# @set-dirty-limit:
+#
+# Set the upper limit of dirty page rate for a vCPU.
+#
+# This command could be used to cap the vCPU memory load, which is also


"Could be used" suggests there are other uses.  I don't think there are.
+# refered as "dirty page rate". Users can use set-dirty-limit unconditionally,

+# but if one want to know which vCPU is in high memory load and which vCPU


"one wants"


+# should be limi

[PATCH] gitlab-ci.d/buildtest: Add jobs that run the device-crash-test

2021-11-26 Thread Thomas Huth
The device-crash-test script has been quite neglected in the past,
so it has bit-rotted quite often. Let's add CI jobs that run this
script for at least some targets, so that it does not regress that
easily anymore.

Signed-off-by: Thomas Huth 
---
 .gitlab-ci.d/buildtest.yml | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index 71d0f407ad..7e1cb0b3c2 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -100,6 +100,17 @@ avocado-system-debian:
     IMAGE: debian-amd64
     MAKE_CHECK_ARGS: check-avocado
 
+crash-test-debian:
+  extends: .native_test_job_template
+  needs:
+    - job: build-system-debian
+      artifacts: true
+  variables:
+    IMAGE: debian-amd64
+  script:
+    - cd build
+    - scripts/device-crash-test -q ./qemu-system-i386
+
 build-system-fedora:
   extends: .native_build_job_template
   needs:
@@ -134,6 +145,18 @@ avocado-system-fedora:
     IMAGE: fedora
     MAKE_CHECK_ARGS: check-avocado
 
+crash-test-fedora:
+  extends: .native_test_job_template
+  needs:
+    - job: build-system-fedora
+      artifacts: true
+  variables:
+    IMAGE: fedora
+  script:
+    - cd build
+    - scripts/device-crash-test -q ./qemu-system-ppc
+    - scripts/device-crash-test -q ./qemu-system-riscv32
+
 build-system-centos:
   extends: .native_build_job_template
   needs:
-- 
2.27.0




Re: [PATCH 2/2] migration: Set the socket backlog number to reduce the chance of live migration failure

2021-11-26 Thread Juan Quintela
Li Zhang  wrote:
> When creating the guest on destination with -incoming ip:port in QEMU command
> line, the source code that sets the backlog number as the same as multifd
> channels doesn't get called. So the number of backlog is always 1. It's very
> easy to cause live migration failure, so a bigger number is preferred to
> reduce the chance of the failure.
>
> Signed-off-by: Li Zhang 
> ---
>  migration/socket.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/migration/socket.c b/migration/socket.c
> index 05705a32d8..398d4c10fa 100644
> --- a/migration/socket.c
> +++ b/migration/socket.c
> @@ -152,7 +152,7 @@ socket_start_incoming_migration_internal(SocketAddress *saddr,
>  QIONetListener *listener = qio_net_listener_new();
>  MigrationIncomingState *mis = migration_incoming_get_current();
>  size_t i;
> -int num = 1;
> +int num = 16;
>  
>  qio_net_listener_set_name(listener, "migration-socket-listener");

Here, the right answer is to use -incoming defer.

Later, Juan.
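
For illustration, a minimal sketch of why the listen backlog should follow
the number of multifd channels rather than a fixed constant. This is not
QEMU code: pick_listen_backlog() and open_listener() are hypothetical
helpers, and plain POSIX sockets stand in for QIONetListener. The assumption
is that the channel count is already known when the listener is opened,
which is the situation -incoming defer is meant to provide.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: one backlog slot per expected multifd connection,
 * falling back to 1 when multifd is off or the count is not usable. */
int pick_listen_backlog(bool multifd_enabled, int multifd_channels)
{
    if (multifd_enabled && multifd_channels > 1) {
        return multifd_channels;
    }
    return 1;
}

/* Hypothetical helper: listen on 127.0.0.1:port with the given backlog.
 * Returns the listening fd, or -1 on error. */
int open_listener(uint16_t port, int backlog)
{
    struct sockaddr_in addr;
    int fd;

    assert(backlog > 0);
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would do something like open_listener(port, pick_listen_backlog(true, 16)),
so the backlog grows with the configured channel count instead of being
hard-coded to 1 or 16.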




Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever

2021-11-26 Thread Juan Quintela
Li Zhang  wrote:
> When doing live migration with multifd channels 8, 16 or larger number,
> the guest hangs in the presence of the network errors such as missing TCP ACKs.
>
> At sender's side:
> The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> is called because one thread fails on qio_channel_write_all when
> the network problem happens and other send threads are blocked on sendmsg.
> They could not be terminated. So the main thread is blocked on qemu_thread_join
> to wait for the threads terminated.
>
> (gdb) bt
> #0  0x7f30c8dcffc0 in __pthread_clockjoin_ex () at /lib64/libpthread.so.0
> #1  0x55cbb716084b in qemu_thread_join (thread=0x55cbb881f418) at ../util/qemu-thread-posix.c:627
> #2  0x55cbb6b54e40 in multifd_save_cleanup () at ../migration/multifd.c:542
> #3  0x55cbb6b4de06 in migrate_fd_cleanup (s=0x55cbb8024000) at ../migration/migration.c:1808
> #4  0x55cbb6b4dfb4 in migrate_fd_cleanup_bh (opaque=0x55cbb8024000) at ../migration/migration.c:1850
> #5  0x55cbb7173ac1 in aio_bh_call (bh=0x55cbb7eb98e0) at ../util/async.c:141
> #6  0x55cbb7173bcb in aio_bh_poll (ctx=0x55cbb7ebba80) at ../util/async.c:169
> #7  0x55cbb715ba4b in aio_dispatch (ctx=0x55cbb7ebba80) at ../util/aio-posix.c:381
> #8  0x55cbb7173ffe in aio_ctx_dispatch (source=0x55cbb7ebba80, callback=0x0, user_data=0x0) at ../util/async.c:311
> #9  0x7f30c9c8cdf4 in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0
> #10 0x55cbb71851a2 in glib_pollfds_poll () at ../util/main-loop.c:232
> #11 0x55cbb718521c in os_host_main_loop_wait (timeout=42251070366) at ../util/main-loop.c:255
> #12 0x55cbb7185321 in main_loop_wait (nonblocking=0) at ../util/main-loop.c:531
> #13 0x55cbb6e6ba27 in qemu_main_loop () at ../softmmu/runstate.c:726
> #14 0x55cbb6ad6fd7 in main (argc=68, argv=0x7ffc0c57, envp=0x7ffc0c578ab0) at ../softmmu/main.c:50
>
> At receiver's side:
> Several receive threads are not created successfully and the receive threads
> which have been created are blocked on qemu_sem_wait. No semaphores are posted
> because migration is not started if not all the receive threads are created
> successfully and multifd_recv_sync_main is not called which posts the semaphore
> to receive threads. So the receive threads are waiting on the semaphore and
> never return. It shouldn't wait for the semaphore forever.
> Use qemu_sem_timedwait to wait for a while, then return and close the channels.
> So the guest doesn't hang anymore.
>
> (gdb) bt
> #0  0x7fd61c43f064 in do_futex_wait.constprop () at /lib64/libpthread.so.0
> #1  0x7fd61c43f158 in __new_sem_wait_slow.constprop.0 () at /lib64/libpthread.so.0
> #2  0x56075916014a in qemu_sem_wait (sem=0x56075b6515f0) at ../util/qemu-thread-posix.c:358
> #3  0x560758b56643 in multifd_recv_thread (opaque=0x56075b651550) at ../migration/multifd.c:1112
> #4  0x560759160598 in qemu_thread_start (args=0x56075befad00) at ../util/qemu-thread-posix.c:556
> #5  0x7fd61c43594a in start_thread () at /lib64/libpthread.so.0
> #6  0x7fd61c158d0f in clone () at /lib64/libc.so.6
>
> Signed-off-by: Li Zhang 
> ---
>  migration/multifd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 7c9deb1921..656239ca2a 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -1109,7 +1109,7 @@ static void *multifd_recv_thread(void *opaque)
>  
>  if (flags & MULTIFD_FLAG_SYNC) {
>  qemu_sem_post(&multifd_recv_state->sem_sync);
> -qemu_sem_wait(&p->sem_sync);
> +qemu_sem_timedwait(&p->sem_sync, 1000);
>  }
>  }

Problem happens here, but I think that the solution is not wrong.  We
are returning from the semaphore without giving a single error message.

Later, Juan.
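
A minimal sketch of the point above, that a timed wait should at least
report why it gave up. It uses plain POSIX semaphores, not QEMU's
qemu_sem_timedwait(); wait_sync_or_report() is a hypothetical helper and
the message text is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: wait on sem for at most timeout_ms milliseconds.
 * Returns 0 if the semaphore was acquired, -1 on timeout or error.
 * A failure is reported on stderr instead of being swallowed. */
int wait_sync_or_report(sem_t *sem, long timeout_ms)
{
    struct timespec deadline;
    int ret;

    assert(sem != NULL && timeout_ms >= 0);

    /* sem_timedwait() takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    /* Retry if a signal interrupts the wait. */
    do {
        ret = sem_timedwait(sem, &deadline);
    } while (ret < 0 && errno == EINTR);

    if (ret < 0) {
        fprintf(stderr, "sync wait failed: %s\n",
                errno == ETIMEDOUT ? "timed out" : "semaphore error");
        return -1;
    }
    return 0;
}
```

The caller can then treat -1 as a channel error and start cleanup, rather
than continuing as if the sync had completed.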




[PATCH for-6.2? 0/2] arm_gicv3: Fix handling of LPIs in list registers

2021-11-26 Thread Peter Maydell
(Marc: cc'd you on this one in case you're still using QEMU
to test KVM stuff with, in which case you might have run into
the bug this is fixing.)

It is valid for an OS to put virtual interrupt ID values into the
list registers ICH_LR which are greater than 1023.  This
corresponds to (for example) KVM running as an L1 guest inside
emulated QEMU and using the in-kernel emulated ITS to give a (nested)
L2 guest an ITS.  LPIs are delivered by the L1 kernel to the L2 guest
via the list registers in the same way as non-LPI interrupts.

QEMU's code for handling reads of ICV_IARn (which happen when the L2
guest acknowledges an interrupt) and writes to ICV_EOIRn (which happen at
the end of the interrupt) did not consider LPIs, so it would
incorrectly treat interrupt IDs above 1023 as invalid, with the
effect that a read to ICV_IARn would return the correct interrupt ID
number but not actually mark the interrupt active or set the CPU
priority accordingly, and a write to ICV_EOIRn would do nothing.

This bug doesn't seem to have any visible effect on Linux L2 guests
most of the time, because the two bugs cancel each other out: we
neither mark the interrupt active nor deactivate it.  However it does
mean that the L2 vCPU priority while the LPI handler is running will
not be correct, so the interrupt handler could be unexpectedly
interrupted by a different interrupt.  (I haven't observed this; I
found the ICV_IARn bug by code inspection, and then the ICV_EOIRn bug
by figuring out why fixing ICV_IARn broke L2 guests :-))

This isn't a regression -- we've behaved like this since the GICv3
support for virtualization was first implemented. I'm tempted to
put it into 6.2 anyway, though.

Patch 1 abstracts out the test we were using already elsewhere
in the code into its own function, and patch 2 uses it to fix
the EOIR and IAR behaviour.

Based-on: 20211124202005.989935-1-peter.mayd...@linaro.org
("[PATCH v2] hw/intc/arm_gicv3: Update cached state after LPI state changes")

Peter Maydell (2):
  hw/intc/arm_gicv3: Add new gicv3_intid_is_special() function
  hw/intc/arm_gicv3: fix handling of LPIs in list registers

 hw/intc/gicv3_internal.h  | 13 +
 hw/intc/arm_gicv3_cpuif.c |  9 -
 2 files changed, 17 insertions(+), 5 deletions(-)

-- 
2.25.1
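
The difference between the old and new checks described in this cover
letter can be sketched in isolation. The constants below are the values
QEMU's GICv3 model is understood to use (1020 to 1023 special, LPIs from
8192 up); the function names are illustrative only and do not appear in
the patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, mirroring the GICv3 interrupt ID layout. */
#define GICV3_MAXIRQ     1020
#define INTID_SECURE     1020
#define INTID_SPURIOUS   1023
#define LPI_INTID_START  8192   /* first LPI interrupt ID */

/* True for the four special IDs, 1020 to 1023 inclusive. */
bool intid_is_special(int intid)
{
    return intid >= INTID_SECURE && intid <= INTID_SPURIOUS;
}

/* Old IAR read path: only IDs below 1020 were activated, so LPIs were not. */
bool old_iar_activates(int intid)
{
    return intid < INTID_SECURE;
}

/* New IAR read path: everything except the special IDs is activated. */
bool new_iar_activates(int intid)
{
    return !intid_is_special(intid);
}

/* Old EOIR write path: any ID >= 1020 was ignored, which dropped LPIs too. */
bool old_eoir_ignored(int intid)
{
    return intid >= GICV3_MAXIRQ;
}

/* New EOIR write path: only the special IDs are ignored. */
bool new_eoir_ignored(int intid)
{
    return intid_is_special(intid);
}

/* Small self-check showing how an LPI is treated before and after. */
void intid_demo(void)
{
    int lpi = LPI_INTID_START;

    assert(!old_iar_activates(lpi) && old_eoir_ignored(lpi));
    assert(new_iar_activates(lpi) && !new_eoir_ignored(lpi));
    /* Ordinary and special IDs behave the same under both versions. */
    assert(old_iar_activates(42) && new_iar_activates(42));
    assert(!old_iar_activates(1023) && !new_iar_activates(1023));
}
```

The self-check also shows why the two old bugs cancelled out for an LPI:
it was neither activated on acknowledge nor deactivated at end of interrupt.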




[PATCH for-6.2? 1/2] hw/intc/arm_gicv3: Add new gicv3_intid_is_special() function

2021-11-26 Thread Peter Maydell
The GICv3/v4 pseudocode has a function IsSpecial() which returns true
if passed a "special" interrupt ID number (anything between 1020 and
1023 inclusive).  We open-code this condition in a couple of places,
so abstract it out into a new function gicv3_intid_is_special().

Signed-off-by: Peter Maydell 
---
 hw/intc/gicv3_internal.h  | 13 +
 hw/intc/arm_gicv3_cpuif.c |  4 ++--
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/hw/intc/gicv3_internal.h b/hw/intc/gicv3_internal.h
index 70f34ee4955..b9c37453b04 100644
--- a/hw/intc/gicv3_internal.h
+++ b/hw/intc/gicv3_internal.h
@@ -411,6 +411,19 @@ FIELD(MAPC, RDBASE, 16, 32)
 
 /* Functions internal to the emulated GICv3 */
 
+/**
+ * gicv3_intid_is_special:
+ * @intid: interrupt ID
+ *
+ * Return true if @intid is a special interrupt ID (1020 to
+ * 1023 inclusive). This corresponds to the GIC spec pseudocode
+ * IsSpecial() function.
+ */
+static inline bool gicv3_intid_is_special(int intid)
+{
+    return intid >= INTID_SECURE && intid <= INTID_SPURIOUS;
+}
+
 /**
  * gicv3_redist_update:
  * @cs: GICv3CPUState for this redistributor
diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index 3fe5de8ad7d..7fbc36ff41b 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -997,7 +997,7 @@ static uint64_t icc_iar0_read(CPUARMState *env, const ARMCPRegInfo *ri)
         intid = icc_hppir0_value(cs, env);
     }
 
-    if (!(intid >= INTID_SECURE && intid <= INTID_SPURIOUS)) {
+    if (!gicv3_intid_is_special(intid)) {
         icc_activate_irq(cs, intid);
     }
 
@@ -1020,7 +1020,7 @@ static uint64_t icc_iar1_read(CPUARMState *env, const ARMCPRegInfo *ri)
         intid = icc_hppir1_value(cs, env);
     }
 
-    if (!(intid >= INTID_SECURE && intid <= INTID_SPURIOUS)) {
+    if (!gicv3_intid_is_special(intid)) {
         icc_activate_irq(cs, intid);
     }
 
-- 
2.25.1




[PATCH for-6.2? 2/2] hw/intc/arm_gicv3: fix handling of LPIs in list registers

2021-11-26 Thread Peter Maydell
It is valid for an OS to put virtual interrupt ID values into the
list registers ICH_LR which are greater than 1023.  This
corresponds to (for example) KVM using the in-kernel emulated ITS to
give a (nested) guest an ITS.  LPIs are delivered by the L1 kernel to
the L2 guest via the list registers in the same way as non-LPI
interrupts.

QEMU's code for handling reads of ICV_IARn (which happen when the L2
guest acknowledges an interrupt) and writes to ICV_EOIRn (which happen at
the end of the interrupt) did not consider LPIs, so it would
incorrectly treat interrupt IDs above 1023 as invalid.  Fix this by
using the correct condition, which is gicv3_intid_is_special().

Note that the condition in icv_dir_write() is correct -- LPIs
are not valid there and so we want to ignore both "special" ID
values and LPIs.

(In the pseudocode this logic is in:
 - VirtualReadIAR0(), VirtualReadIAR1(), which call IsSpecial()
 - VirtualWriteEOIR0(), VirtualWriteEOIR1(), which call
 VirtualIdentifierValid(data, TRUE) meaning "LPIs OK"
 - VirtualWriteDIR(), which calls VirtualIdentifierValid(data, FALSE)
 meaning "LPIs not OK")

This bug doesn't seem to have any visible effect on Linux L2 guests
most of the time, because the two bugs cancel each other out: we
neither mark the interrupt active nor deactivate it.  However it does
mean that the L2 vCPU priority while the LPI handler is running will
not be correct, so the interrupt handler could be unexpectedly
interrupted by a different interrupt.

(NB: this has nothing to do with using QEMU's emulated ITS.)

Signed-off-by: Peter Maydell 
---
Not sure whether to put this into 6.2 -- I haven't ever seen
any actual misbehaviour, I found the bug by code inspection;
and we've behaved this way since the GICv3 support for
virtualization was first implemented, so it's not a regression.
---
 hw/intc/arm_gicv3_cpuif.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index 7fbc36ff41b..7fba9314508 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -653,7 +653,7 @@ static uint64_t icv_iar_read(CPUARMState *env, const ARMCPRegInfo *ri)
 
         if (thisgrp == grp && icv_hppi_can_preempt(cs, lr)) {
             intid = ich_lr_vintid(lr);
-            if (intid < INTID_SECURE) {
+            if (!gicv3_intid_is_special(intid)) {
                 icv_activate_irq(cs, idx, grp);
             } else {
                 /* Interrupt goes from Pending to Invalid */
@@ -1265,8 +1265,7 @@ static void icv_eoir_write(CPUARMState *env, const ARMCPRegInfo *ri,
     trace_gicv3_icv_eoir_write(ri->crm == 8 ? 0 : 1,
                                gicv3_redist_affid(cs), value);
 
-    if (irq >= GICV3_MAXIRQ) {
-        /* Also catches special interrupt numbers and LPIs */
+    if (gicv3_intid_is_special(irq)) {
         return;
     }
 
-- 
2.25.1




Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever

2021-11-26 Thread Li Zhang



On 11/26/21 4:49 PM, Daniel P. Berrangé wrote:
> On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> > When doing live migration with multifd channels 8, 16 or larger number,
> > the guest hangs in the presence of the network errors such as missing TCP ACKs.
> >
> > At sender's side:
> > The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> > is called because one thread fails on qio_channel_write_all when
> > the network problem happens and other send threads are blocked on sendmsg.
> > They could not be terminated. So the main thread is blocked on qemu_thread_join
> > to wait for the threads terminated.
> Isn't the right answer here to ensure we've called 'shutdown' on
> all the FDs, so that the threads get kicked out of sendmsg, before
> trying to join the thread ?

If we shutdown the channels at sender's side, it could terminate send
threads. The receive threads are still waiting there.

From receiver's side, if wait semaphore is timeout, the channels can be
terminated at last. And the sender threads also be terminated at last.

If we close channels on both sides, I am not sure if it works correctly.

> Regards,
> Daniel




Re: [PATCH 2/2] migration: Set the socket backlog number to reduce the chance of live migration failure

2021-11-26 Thread Li Zhang



On 11/26/21 5:32 PM, Juan Quintela wrote:
> Li Zhang  wrote:
> > When creating the guest on destination with -incoming ip:port in QEMU command
> > line, the source code that sets the backlog number as the same as multifd
> > channels doesn't get called. So the number of backlog is always 1. It's very
> > easy to cause live migration failure, so a bigger number is preferred to
> > reduce the chance of the failure.
> >
> > Signed-off-by: Li Zhang 
> > ---
> >  migration/socket.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/migration/socket.c b/migration/socket.c
> > index 05705a32d8..398d4c10fa 100644
> > --- a/migration/socket.c
> > +++ b/migration/socket.c
> > @@ -152,7 +152,7 @@ socket_start_incoming_migration_internal(SocketAddress *saddr,
> >  QIONetListener *listener = qio_net_listener_new();
> >  MigrationIncomingState *mis = migration_incoming_get_current();
> >  size_t i;
> > -int num = 1;
> > +int num = 16;
> >  
> >  qio_net_listener_set_name(listener, "migration-socket-listener");
> Here, the right answer is to use -incoming defer.

Ok, thanks a lot.

> Later, Juan.






Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever

2021-11-26 Thread Daniel P . Berrangé
On Fri, Nov 26, 2021 at 05:44:04PM +0100, Li Zhang wrote:
> 
> On 11/26/21 4:49 PM, Daniel P. Berrangé wrote:
> > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> > > When doing live migration with multifd channels 8, 16 or larger number,
> > > the guest hangs in the presence of the network errors such as missing TCP 
> > > ACKs.
> > > 
> > > At sender's side:
> > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> > > is called because one thread fails on qio_channel_write_all when
> > > the network problem happens and other send threads are blocked on sendmsg.
> > > They could not be terminated. So the main thread is blocked on 
> > > qemu_thread_join
> > > to wait for the threads terminated.
> > Isn't the right answer here to ensure we've called 'shutdown' on
> > all the FDs, so that the threads get kicked out of sendmsg, before
> > trying to join the thread ?
> 
> If we shutdown the channels at sender's side, it could terminate send
> threads. The receive threads are still waiting there.
> 
> From receiver's side, if wait semaphore is timeout, the channels can be
> terminated at last. And the sender threads also be terminated at last.

If something goes wrong on the sender side, the mgmt app should be
tearing down the destination QEMU entirely, so I'm not sure we need
to do anything special to deal with received threads.

Using semtimedwait just feels risky because it will introduce false
failures if the system/network is under high load such that the
connections don't all establish within 1 second.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
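
As a small illustration of the shutdown() suggestion quoted above: once a
socket has been shut down, further sends on it fail immediately with EPIPE
instead of blocking. The sketch below only shows that single-threaded
effect on a local socket pair; that a thread already blocked in sendmsg()
gets kicked out the same way is the claim from the message, not something
this code demonstrates. send_errno_after_shutdown() is a hypothetical
helper, not a QEMU function.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: shut down fd in both directions, then try to send
 * one byte. Returns the errno of the failing call, or 0 if the send
 * unexpectedly succeeded. MSG_NOSIGNAL avoids SIGPIPE on the failed send. */
int send_errno_after_shutdown(int fd)
{
    const char byte = 'x';

    assert(fd >= 0);
    if (shutdown(fd, SHUT_RDWR) < 0) {
        return errno;
    }
    if (send(fd, &byte, 1, MSG_NOSIGNAL) < 0) {
        return errno;
    }
    return 0;
}
```

On a connected stream socket the expected result is EPIPE, which a send
thread can treat as its cue to exit so that the join in the cleanup path
does not wait forever.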




Re: [PATCH] hw/arm/virt: Extend nested and mte checks to hvf

2021-11-26 Thread Peter Maydell
On Tue, 23 Nov 2021 at 14:00, Alexander Graf  wrote:
>
>
> On 23.11.21 13:34, Peter Maydell wrote:
> > On Tue, 23 Nov 2021 at 12:29, Alexander Graf  wrote:
> >> The virt machine has properties to enable MTE and Nested Virtualization
> >> support. However, its check to ensure the backing accel implementation
> >> supports it today only looks for KVM and bails out if it finds it.
> >>
> >> Extend the checks to HVF as well as it does not support either today.
> >>
> >> Reported-by: saar amar 
> >> Signed-off-by: Alexander Graf 
> > Without this check, what happens if you try to enable
> > both eg virtualization and hvf? Crash, unhelpful error
> > message, something else?
>
>
> The guest just never gets either feature enabled. No crash, no error
> message.

Thanks; I've added that info to the commit message and applied this
to target-arm.next for 6.2.

-- PMM



Re: [PATCH] hw/intc: cannot clear GICv3 ITS CTLR[Enabled] bit

2021-11-26 Thread Peter Maydell
On Thu, 25 Nov 2021 at 15:47, Peter Maydell  wrote:
>
> On Wed, 24 Nov 2021 at 18:22, Shashi Mallela  wrote:
> >
> > When Enabled bit is cleared in GITS_CTLR,ITS feature continues
> > to be enabled.This patch fixes the issue.
> >
> > Signed-off-by: Shashi Mallela 
> > ---
> >  hw/intc/arm_gicv3_its.c | 7 ---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/intc/arm_gicv3_its.c b/hw/intc/arm_gicv3_its.c
> > index 84bcbb5f56..c929a9cb5c 100644
> > --- a/hw/intc/arm_gicv3_its.c
> > +++ b/hw/intc/arm_gicv3_its.c
> > @@ -896,13 +896,14 @@ static bool its_writel(GICv3ITSState *s, hwaddr offset,
> >
> >  switch (offset) {
> >  case GITS_CTLR:
> > -s->ctlr |= (value & ~(s->ctlr));
> > -
> > -if (s->ctlr & ITS_CTLR_ENABLED) {
> > +if (value & R_GITS_CTLR_ENABLED_MASK) {
> > +s->ctlr |= ITS_CTLR_ENABLED;
> >  extract_table_params(s);
> >  extract_cmdq_params(s);
> >  s->creadr = 0;
> >  process_cmdq(s);
> > +} else {
> > +s->ctlr &= ~ITS_CTLR_ENABLED;
> >  }
> >  break;
> >  case GITS_CBASER:
>
> The code looks fine, so in that sense
> Reviewed-by: Peter Maydell 
>
> It seems odd that we have two different #defines for the
> same bit, though (ITS_CTLR_ENABLED and R_GITS_CTLR_ENABLED_MASK).
> We should probably standardize on the latter and drop the
> former.

Applied this version to target-arm.next for 6.2, anyway.

-- PMM
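
To make the bug concrete: the removed line can only ever set bits in the
stored register value, so a write of 0 leaves Enabled set. A minimal
sketch of both update rules on a bare integer follows; CTLR_ENABLED and
the function names are illustrative and are not the QEMU definitions.

```c
#include <assert.h>
#include <stdint.h>

#define CTLR_ENABLED 0x1u   /* Enabled is bit 0 of GITS_CTLR */

/* Old update rule: ORs in any newly set bits, never clears anything. */
uint32_t old_ctlr_write(uint32_t ctlr, uint32_t value)
{
    ctlr |= (value & ~ctlr);
    return ctlr;
}

/* New update rule: Enabled follows the written value in both directions. */
uint32_t new_ctlr_write(uint32_t ctlr, uint32_t value)
{
    if (value & CTLR_ENABLED) {
        ctlr |= CTLR_ENABLED;
    } else {
        ctlr &= ~CTLR_ENABLED;
    }
    return ctlr;
}

/* Small self-check: enable, then try to disable, under each rule. */
void ctlr_demo(void)
{
    uint32_t old_val = old_ctlr_write(0, CTLR_ENABLED);
    uint32_t new_val = new_ctlr_write(0, CTLR_ENABLED);

    assert(old_val == CTLR_ENABLED && new_val == CTLR_ENABLED);
    /* Writing 0 leaves the old rule stuck at enabled. */
    assert(old_ctlr_write(old_val, 0) == CTLR_ENABLED);
    assert(new_ctlr_write(new_val, 0) == 0);
}
```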



Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever

2021-11-26 Thread Li Zhang



On 11/26/21 5:33 PM, Juan Quintela wrote:
> Li Zhang  wrote:
> > When doing live migration with multifd channels 8, 16 or larger number,
> > the guest hangs in the presence of the network errors such as missing TCP ACKs.
> >
> > At sender's side:
> > The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> > is called because one thread fails on qio_channel_write_all when
> > the network problem happens and other send threads are blocked on sendmsg.
> > They could not be terminated. So the main thread is blocked on qemu_thread_join
> > to wait for the threads terminated.
> >
> > (gdb) bt
> > #0  0x7f30c8dcffc0 in __pthread_clockjoin_ex () at /lib64/libpthread.so.0
> > #1  0x55cbb716084b in qemu_thread_join (thread=0x55cbb881f418) at ../util/qemu-thread-posix.c:627
> > #2  0x55cbb6b54e40 in multifd_save_cleanup () at ../migration/multifd.c:542
> > #3  0x55cbb6b4de06 in migrate_fd_cleanup (s=0x55cbb8024000) at ../migration/migration.c:1808
> > #4  0x55cbb6b4dfb4 in migrate_fd_cleanup_bh (opaque=0x55cbb8024000) at ../migration/migration.c:1850
> > #5  0x55cbb7173ac1 in aio_bh_call (bh=0x55cbb7eb98e0) at ../util/async.c:141
> > #6  0x55cbb7173bcb in aio_bh_poll (ctx=0x55cbb7ebba80) at ../util/async.c:169
> > #7  0x55cbb715ba4b in aio_dispatch (ctx=0x55cbb7ebba80) at ../util/aio-posix.c:381
> > #8  0x55cbb7173ffe in aio_ctx_dispatch (source=0x55cbb7ebba80, callback=0x0, user_data=0x0) at ../util/async.c:311
> > #9  0x7f30c9c8cdf4 in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0
> > #10 0x55cbb71851a2 in glib_pollfds_poll () at ../util/main-loop.c:232
> > #11 0x55cbb718521c in os_host_main_loop_wait (timeout=42251070366) at ../util/main-loop.c:255
> > #12 0x55cbb7185321 in main_loop_wait (nonblocking=0) at ../util/main-loop.c:531
> > #13 0x55cbb6e6ba27 in qemu_main_loop () at ../softmmu/runstate.c:726
> > #14 0x55cbb6ad6fd7 in main (argc=68, argv=0x7ffc0c57, envp=0x7ffc0c578ab0) at ../softmmu/main.c:50
> >
> > At receiver's side:
> > Several receive threads are not created successfully and the receive threads
> > which have been created are blocked on qemu_sem_wait. No semaphores are posted
> > because migration is not started if not all the receive threads are created
> > successfully and multifd_recv_sync_main is not called which posts the semaphore
> > to receive threads. So the receive threads are waiting on the semaphore and
> > never return. It shouldn't wait for the semaphore forever.
> > Use qemu_sem_timedwait to wait for a while, then return and close the channels.
> > So the guest doesn't hang anymore.
> >
> > (gdb) bt
> > #0  0x7fd61c43f064 in do_futex_wait.constprop () at /lib64/libpthread.so.0
> > #1  0x7fd61c43f158 in __new_sem_wait_slow.constprop.0 () at /lib64/libpthread.so.0
> > #2  0x56075916014a in qemu_sem_wait (sem=0x56075b6515f0) at ../util/qemu-thread-posix.c:358
> > #3  0x560758b56643 in multifd_recv_thread (opaque=0x56075b651550) at ../migration/multifd.c:1112
> > #4  0x560759160598 in qemu_thread_start (args=0x56075befad00) at ../util/qemu-thread-posix.c:556
> > #5  0x7fd61c43594a in start_thread () at /lib64/libpthread.so.0
> > #6  0x7fd61c158d0f in clone () at /lib64/libc.so.6
> >
> > Signed-off-by: Li Zhang 
> > ---
> >  migration/multifd.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/migration/multifd.c b/migration/multifd.c
> > index 7c9deb1921..656239ca2a 100644
> > --- a/migration/multifd.c
> > +++ b/migration/multifd.c
> > @@ -1109,7 +1109,7 @@ static void *multifd_recv_thread(void *opaque)
> >  
> >  if (flags & MULTIFD_FLAG_SYNC) {
> >  qemu_sem_post(&multifd_recv_state->sem_sync);
> > -qemu_sem_wait(&p->sem_sync);
> > +qemu_sem_timedwait(&p->sem_sync, 1000);
> >  }
> >  }
> Problem happens here, but I think that the solution is not wrong.  We
> are returning from the semaphore without giving a single error message.

Ah, okay. I can add an error message.

Thanks

Li

> Later, Juan.





Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever

2021-11-26 Thread Li Zhang



On 11/26/21 5:51 PM, Daniel P. Berrangé wrote:
> On Fri, Nov 26, 2021 at 05:44:04PM +0100, Li Zhang wrote:
> > On 11/26/21 4:49 PM, Daniel P. Berrangé wrote:
> > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> > > > When doing live migration with multifd channels 8, 16 or larger number,
> > > > the guest hangs in the presence of the network errors such as missing TCP ACKs.
> > > >
> > > > At sender's side:
> > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> > > > is called because one thread fails on qio_channel_write_all when
> > > > the network problem happens and other send threads are blocked on sendmsg.
> > > > They could not be terminated. So the main thread is blocked on qemu_thread_join
> > > > to wait for the threads terminated.
> > > Isn't the right answer here to ensure we've called 'shutdown' on
> > > all the FDs, so that the threads get kicked out of sendmsg, before
> > > trying to join the thread ?
> > If we shutdown the channels at sender's side, it could terminate send
> > threads. The receive threads are still waiting there.
> >
> > From receiver's side, if wait semaphore is timeout, the channels can be
> > terminated at last. And the sender threads also be terminated at last.
> If something goes wrong on the sender side, the mgmt app should be
> tearing down the destination QEMU entirely, so I'm not sure we need
> to do anything special to deal with received threads.
>
> Using semtimedwait just feels risky because it will introduce false
> failures if the system/network is under high load such that the
> connections don't all establish within 1 second.

You are right. This may be a risk. I am not sure if the interval is
proper, we can set longer.

> Regards,
> Daniel



Re: [PATCH 1/3] ppc/pnv: Tune the POWER9 PCIe Host bridge model

2021-11-26 Thread Cédric Le Goater

[ Adding Alfredo to the thread ]

On 11/26/21 10:09, Cédric Le Goater wrote:

On 11/16/21 18:01, Frederic Barrat wrote:

The PHB v4 found on POWER9 doesn't request any LSI, so let's clear the
Interrupt Pin register in the config space so that the model matches
the hardware.

If we don't, then we inherit from the default pcie root bridge, which
requests a LSI. And because we don't map it correctly in the device
tree, all PHBs allocate the same bogus hw interrupt. We end up with
inconsistent interrupt controller (xive) data. The problem goes away
if we don't allocate the LSI in the first place.

Signed-off-by: Frederic Barrat 
---
  hw/pci-host/pnv_phb4.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index 5c375a9f28..1659d55b4f 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -1234,10 +1234,13 @@ static void pnv_phb4_reset(DeviceState *dev)
  PCIDevice *root_dev = PCI_DEVICE(&phb->root);
  /*
- * Configure PCI device id at reset using a property.
+ * Configure the PCI device at reset:
+ *   - set the Vendor and Device ID to for the root bridge
+ *   - no LSI
   */
  pci_config_set_vendor_id(root_dev->config, PCI_VENDOR_ID_IBM);
  pci_config_set_device_id(root_dev->config, phb->device_id);
+    pci_config_set_interrupt_pin(root_dev->config, 0);
  }
  static const char *pnv_phb4_root_bus_path(PCIHostState *host_bridge,



FYI, I am seeing an issue with FreeBSD when booting from iso :

   
https://download.freebsd.org/ftp/snapshots/powerpc/powerpc64/ISO-IMAGES/14.0/FreeBSD-14.0-CURRENT-powerpc-powerpc64-20211028-4827bf76bce-250301-disc1.iso.xz

Thanks,

C.

SIGTERM received, booting...
KDB: debugger backends: ddb
KDB: current backend: ddb
---<>---
Copyright (c) 1992-2021 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
 The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 14.0-CURRENT #0 main-n250301-4827bf76bce: Thu Oct 28 06:53:58 UTC 2021
     
r...@releng1.nyi.freebsd.org:/usr/obj/usr/src/powerpc.powerpc64/sys/GENERIC64 
powerpc
FreeBSD clang version 12.0.1 (g...@github.com:llvm/llvm-project.git 
llvmorg-12.0.1-0-gfed41342a82f)
WARNING: WITNESS option enabled, expect reduced performance.
VT: init without driver.
ofw_initrd: initrd loaded at 0x2800-0x28c7928c
cpu0: IBM POWER9 revision 2.0, 1000.00 MHz
cpu0: Features 
dc007182
cpu0: Features2 bee0
real memory  = 1014484992 (967 MB)
avail memory = 117903360 (112 MB)
random: registering fast source PowerISA DARN random number generator
random: fast provider: "PowerISA DARN random number generator"
arc4random: WARNING: initial seeding bypassed the cryptographic random device 
because it was not yet seeded and the knob 'bypass_before_seeding' was enabled.
random: entropy device external interface
kbd0 at kbdmux0
ofwbus0:  on nexus0
opal0:  irq 
1048560,1048561,1048562,1048563,1048564,1048565,1048566,1048567,1048568,1048569,1048570,1048571,1048572,1048573
 on ofwbus0
opal0: registered as a time-of-day clock, resolution 0.002000s
simplebus0:  mem 
0x60300-0x60300 on ofwbus0
pcib0:  mem 
0x600c3c000-0x600c3cfff,0x600c3-0x600c30fff on ofwbus0
pci0:  numa-domain 0 on pcib0
qemu-system-ppc64: ../hw/pci/pci.c:1487: pci_irq_handler: Assertion `0 <= irq_num && irq_num < PCI_NUM_PINS' failed.







Re: [PATCH v2 1/1] MAINTAINERS: update email address of Christian Borntraeger

2021-11-26 Thread Thomas Huth

On 26/11/2021 11.24, Christian Borntraeger wrote:
> My borntrae...@de.ibm.com email is just a forwarder to the
> linux.ibm.com address. Let us remove the extra hop to avoid
> a potential source of errors.
>
> While at it, add the relevant email addresses to mailmap.
>
> Signed-off-by: Christian Borntraeger 
> ---
>  .mailmap    | 1 +
>  MAINTAINERS | 6 +++---
>  2 files changed, 4 insertions(+), 3 deletions(-)


Thanks, queued it to my s390x-next branch now:

 https://gitlab.com/thuth/qemu/-/commits/s390x-next/

  Thomas




Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever

2021-11-26 Thread Daniel P . Berrangé
On Fri, Nov 26, 2021 at 06:00:24PM +0100, Li Zhang wrote:
> 
> On 11/26/21 5:51 PM, Daniel P. Berrangé wrote:
> > On Fri, Nov 26, 2021 at 05:44:04PM +0100, Li Zhang wrote:
> > > On 11/26/21 4:49 PM, Daniel P. Berrangé wrote:
> > > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> > > > > When doing live migration with multifd channels 8, 16 or larger 
> > > > > number,
> > > > > the guest hangs in the presence of the network errors such as missing 
> > > > > TCP ACKs.
> > > > > 
> > > > > At sender's side:
> > > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> > > > > is called because one thread fails on qio_channel_write_all when
> > > > > the network problem happens and other send threads are blocked on 
> > > > > sendmsg.
> > > > > They could not be terminated. So the main thread is blocked on 
> > > > > qemu_thread_join
> > > > > to wait for the threads terminated.
> > > > Isn't the right answer here to ensure we've called 'shutdown' on
> > > > all the FDs, so that the threads get kicked out of sendmsg, before
> > > > trying to join the thread ?
> > > If we shutdown the channels at sender's side, it could terminate send
> > > threads. The receive threads are still waiting there.
> > > 
> > >  From receiver's side, if wait semaphore is timeout, the channels can be
> > > terminated at last. And the sender threads also be terminated at last.
> > If something goes wrong on the sender side, the mgmt app should be
> > tearing down the destination QEMU entirely, so I'm not sure we need
> > to do anything special to deal with received threads.
> > 
> > Using semtimedwait just feels risky because it will introduce false
> > failures if the system/network is under high load such that the
> > connections don't all establish within 1 second.
> 
> You are right. This may be a risk. I am not sure if the interval is proper,
> we can set longer.

I don't think any kind of timeout is right in this context. There should
be a sem_post() invoked in every scenario where we want to tear down the
recv thread. That should only be the case when we see the other end of
the connection close IMHO.
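
As a rough illustration of that idea, here is a minimal sketch using plain POSIX threads and semaphores instead of QEMU's own wrappers. The names recv_state, recv_worker and teardown_recv are hypothetical and not taken from the multifd code; the sketch assumes a Linux-like system where unnamed semaphores are available. The worker waits with no timeout, and the teardown path is responsible for posting the semaphore.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-channel state; not the real multifd structure. */
struct recv_state {
    sem_t sem;              /* worker sleeps here, with no timeout */
    pthread_mutex_t lock;   /* protects quit and wakeups */
    bool quit;              /* set by the teardown path before posting */
    int wakeups;            /* number of times the worker was woken */
};

static void *recv_worker(void *opaque)
{
    struct recv_state *s = opaque;
    bool quit = false;

    while (!quit) {
        if (sem_wait(&s->sem) != 0) {
            continue;       /* interrupted by a signal: just retry */
        }
        pthread_mutex_lock(&s->lock);
        s->wakeups++;
        quit = s->quit;
        pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Every path that wants the worker gone must end up here. */
static void teardown_recv(struct recv_state *s)
{
    pthread_mutex_lock(&s->lock);
    s->quit = true;
    pthread_mutex_unlock(&s->lock);
    sem_post(&s->sem);
}

/* Start a worker, tear it down, and report how often it woke up. */
static int run_teardown_demo(void)
{
    struct recv_state s;
    pthread_t tid;
    int rc;

    s.quit = false;
    s.wakeups = 0;
    rc = sem_init(&s.sem, 0, 0);
    assert(rc == 0);
    rc = pthread_mutex_init(&s.lock, NULL);
    assert(rc == 0);
    rc = pthread_create(&tid, NULL, recv_worker, &s);
    assert(rc == 0);

    teardown_recv(&s);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    sem_destroy(&s.sem);
    pthread_mutex_destroy(&s.lock);
    return s.wakeups;
}
```

Because the join only returns after the single post has been consumed, no timeout value has to be guessed, which is the property argued for above.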

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH 1/2] multifd: use qemu_sem_timedwait in multifd_recv_thread to avoid waiting forever

2021-11-26 Thread Li Zhang



On 11/26/21 6:13 PM, Daniel P. Berrangé wrote:
> On Fri, Nov 26, 2021 at 06:00:24PM +0100, Li Zhang wrote:
> > On 11/26/21 5:51 PM, Daniel P. Berrangé wrote:
> > > On Fri, Nov 26, 2021 at 05:44:04PM +0100, Li Zhang wrote:
> > > > On 11/26/21 4:49 PM, Daniel P. Berrangé wrote:
> > > > > On Fri, Nov 26, 2021 at 04:31:53PM +0100, Li Zhang wrote:
> > > > > > When doing live migration with multifd channels 8, 16 or larger number,
> > > > > > the guest hangs in the presence of the network errors such as missing TCP ACKs.
> > > > > > 
> > > > > > At sender's side:
> > > > > > The main thread is blocked on qemu_thread_join, migration_fd_cleanup
> > > > > > is called because one thread fails on qio_channel_write_all when
> > > > > > the network problem happens and other send threads are blocked on sendmsg.
> > > > > > They could not be terminated. So the main thread is blocked on qemu_thread_join
> > > > > > to wait for the threads terminated.
> > > > > Isn't the right answer here to ensure we've called 'shutdown' on
> > > > > all the FDs, so that the threads get kicked out of sendmsg, before
> > > > > trying to join the thread ?
> > > > If we shutdown the channels at sender's side, it could terminate send
> > > > threads. The receive threads are still waiting there.
> > > > 
> > > > From receiver's side, if wait semaphore is timeout, the channels can be
> > > > terminated at last. And the sender threads also be terminated at last.
> > > If something goes wrong on the sender side, the mgmt app should be
> > > tearing down the destination QEMU entirely, so I'm not sure we need
> > > to do anything special to deal with received threads.
> > > 
> > > Using semtimedwait just feels risky because it will introduce false
> > > failures if the system/network is under high load such that the
> > > connections don't all establish within 1 second.
> > You are right. This may be a risk. I am not sure if the interval is proper,
> > we can set longer.
> I don't think any kind of timeout is right in this context. There should
> be a sem_post() invoked in every scenario where we want to tear down the
> recv thread. That should only be the case when we see the other end of
> the connection close IMHO.


OK, I need to think about that. It may be better to shut down the
channels from the sender's side.
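
For reference, a minimal sketch of the "shutdown before join" pattern, assuming Linux socket semantics where shutdown() wakes up a thread blocked on the same socket. It uses a blocked recv() on a socketpair rather than a blocked sendmsg(), only because that is easier to reproduce in a few lines; the struct and function names are hypothetical and unrelated to the migration code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical channel: one fd plus the result of the blocking call. */
struct chan {
    int fd;
    ssize_t result;
};

static void *blocked_reader(void *opaque)
{
    struct chan *c = opaque;
    char buf[16];

    /* Nothing is ever written to the peer, so this blocks until shutdown. */
    c->result = recv(c->fd, buf, sizeof(buf), 0);
    return NULL;
}

/* Returns what recv() reported after the channel was shut down. */
static long shutdown_then_join(void)
{
    int fds[2];
    struct chan c;
    pthread_t tid;
    int rc;

    rc = socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
    assert(rc == 0);
    c.fd = fds[0];
    c.result = -1;
    rc = pthread_create(&tid, NULL, blocked_reader, &c);
    assert(rc == 0);

    /* Kick the thread out of its blocking call first ... */
    shutdown(fds[0], SHUT_RDWR);
    /* ... and only then wait for it, so the join cannot hang. */
    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    close(fds[0]);
    close(fds[1]);
    return (long)c.result;
}
```

Whether the reader was already blocked or had not yet entered recv() when shutdown() ran, recv() reports end-of-stream, so the ordering "shutdown, then join" is what matters.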



Thanks

Li



> Regards,
> Daniel




Re: [PATCH 1/1] ppc/pnv.c: add a friendly warning when accel=kvm is used

2021-11-26 Thread Cédric Le Goater

On 11/26/21 02:11, David Gibson wrote:
> On Thu, Nov 25, 2021 at 07:42:02PM -0300, Daniel Henrique Barboza wrote:
> > If one tries to use -machine powernv9,accel=kvm in a Power9 host, a
> > cryptic error will be shown:
> > 
> > qemu-system-ppc64: Register sync failed... If you're using kvm-hv.ko, only "-cpu host" is possible
> > qemu-system-ppc64: kvm_init_vcpu: kvm_arch_init_vcpu failed (0): Invalid argument
> > 
> > Appending '-cpu host' will throw another error:
> > 
> > qemu-system-ppc64: invalid chip model 'host' for powernv9 machine
> > 
> > The root cause is that in IBM PowerPC we have different specs for the bare-metal
> > and the guests. The bare-metal follows OPAL, the guests follow PAPR. The kernel
> > KVM modules presented in the ppc kernels implements PAPR. This means that we
> > can't use KVM accel when using the powernv machine, which is the emulation of
> > the bare-metal host.
> > 
> > All that said, let's give a more informative error in this case.
> > 
> > Signed-off-by: Daniel Henrique Barboza 
> > ---
> >   hw/ppc/pnv.c | 5 +
> >   1 file changed, 5 insertions(+)
> > 
> > diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> > index 71e45515f1..e5b87e8730 100644
> > --- a/hw/ppc/pnv.c
> > +++ b/hw/ppc/pnv.c
> > @@ -742,6 +742,11 @@ static void pnv_init(MachineState *machine)
> >       DriveInfo *pnor = drive_get(IF_MTD, 0, 0);
> >       DeviceState *dev;
> > +    if (kvm_enabled()) {
> > +        error_report("The powernv machine does not work with KVM acceleration");
> > +        exit(EXIT_FAILURE);
> > +    }
> 
> Hmm.. my only concern here is that powernv could, at least
> theoretically, work with KVM PR.  I don't think it does right now,
> though.

At the same time, it is nice to not let the user think that it could work
in its current state. Don't you think so ?

C.

> > +
> >       /* allocate RAM */
> >       if (machine->ram_size < mc->default_ram_size) {
> >           char *sz = size_to_str(mc->default_ram_size);







Re: [PATCH-for-6.2] docs: add a word of caution on x-native-hotplug property for pcie-root-ports

2021-11-26 Thread Igor Mammedov
It's hardly 6.2 material

On Fri, 26 Nov 2021 11:12:55 +0530 (IST)
Ani Sinha  wrote:

> On Thu, 25 Nov 2021, Michael S. Tsirkin wrote:
> 
> > On Thu, Nov 25, 2021 at 05:36:29PM +0530, Ani Sinha wrote:  
> > > x-native-hotplug property, when used in order to disable HPC bit on the PCIE
> > > root ports, can lead to unexpected results from the guest operating system.
> > > Users are strongly advised not to touch this property in order to manipulate the
> > > HPC bit. Add a word of caution in the pcie.txt doc file to document this.
> > >
> > > Signed-off-by: Ani Sinha   
> >
> > Do we want to generally document this for all "x-" options?  
> 
> Yes, Igor suggested it, but I sent this one for two reasons:
> (a) I could not find a place to document this for properties without
> adding a new file. That seemed too big a hammer at present. If you
> can suggest an existing place for documenting this for the property names,
> I will go and add this info there as well.
> 
> (b) I think we need to document this experimental property here regardless
> because this doc deals with hotplug and pcie ports and we had too much of
> a mess with this acpi/pci native switch.
> 
> When things stabilize a bit, Igor suggested elsewhere that we start a
> separate doc just for hotplug and various options we have and at
> that point we can move this info in this new doc.
> 
> https://www.mail-archive.com/libvir-list@redhat.com/msg221746.html

I'd rather put a blanket statement somewhere, like:
 
"x-" prefixed properties are experimental, unstable, internal and
are subject to change/go away without prior notice.
Such properties are not meant for use by users unless explicitly
documented otherwise.

> >  
> > > ---
> > >  docs/pcie.txt | 17 -
> > >  1 file changed, 16 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/docs/pcie.txt b/docs/pcie.txt
> > > index 89e3502075..e1f99f725f 100644
> > > --- a/docs/pcie.txt
> > > +++ b/docs/pcie.txt
> > > @@ -262,11 +262,26 @@ PCI Express Root Ports (and PCI Express Downstream 
> > > Ports).
> > >  Port, which may come handy for hot-plugging another device.
> > >
> > >
> > > -5.3 Hot-plug example:
> > > +5.2 Hot-plug example:
> > >  Using HMP: (add -monitor stdio to QEMU command line)
> > >device_add ,id=,bus= > > Downstream Port Id/PCI-PCI Bridge Id/>
> > >
> > >
> > > +5.3 A word of caution using hotplug on PCI Express Root Ports:
> > > +Starting Qemu version 6.2, PCI Express Root ports have a property
> > > +"x-native-hotplug" ("native-hotplug" for Qemu version 6.1), that can be used to
> > > +enable or disable hotplug on that port. For example:
> > > +
> > > +-device pcie-root-port,x-native-hotplug=off,... etc.
> > > +
> > > +The "x-" prefix indicates that this property is highly experimental and can
> > > +lead to unexpected results from the guest operating system if users try to use
> > > +it to alter the native hotplug on the port. It also means that the property
> > > +name and its behavior is liable to change in the future and is not expected to
> > > +be stable across Qemu versions. Therefore, end users are advised not to change
> > > +the value of this option from its default set value or use it in the Qemu
> > > +command line.
> > > +
> > >  6. Device assignment
> > >  
> > >  Host devices are mostly PCI Express and should be plugged only into
> > > --
> > > 2.25.1  
> >
> >  
> 




[PATCH v3] target/ppc: fix Hash64 MMU update of PTE bit R

2021-11-26 Thread Leandro Lupori
When updating the R bit of a PTE, the Hash64 MMU was using a wrong byte
offset, causing the first byte of the adjacent PTE to be corrupted.
This caused a panic when booting FreeBSD, using the Hash MMU.

Fixes: a2dd4e83e76b ("ppc/hash64: Rework R and C bit updates")
Signed-off-by: Leandro Lupori 
---
Changes from v2:
- Add new defines for the byte offset of PTE bit C and
  HASH_PTE_SIZE_64 / 2 (pte1)
- Use new defines in hash64 and spapr code
---
 hw/ppc/spapr.c  | 8 
 hw/ppc/spapr_softmmu.c  | 2 +-
 target/ppc/mmu-hash64.c | 4 ++--
 target/ppc/mmu-hash64.h | 5 +
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 163c90388a..8ebf85bad8 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1414,7 +1414,7 @@ void spapr_store_hpte(PowerPCCPU *cpu, hwaddr ptex,
 kvmppc_write_hpte(ptex, pte0, pte1);
 } else {
 if (pte0 & HPTE64_V_VALID) {
-stq_p(spapr->htab + offset + HASH_PTE_SIZE_64 / 2, pte1);
+stq_p(spapr->htab + offset + HPTE64_R_BYTE_OFFSET, pte1);
 /*
  * When setting valid, we write PTE1 first. This ensures
  * proper synchronization with the reading code in
@@ -1430,7 +1430,7 @@ void spapr_store_hpte(PowerPCCPU *cpu, hwaddr ptex,
  * ppc_hash64_pteg_search()
  */
 smp_wmb();
-stq_p(spapr->htab + offset + HASH_PTE_SIZE_64 / 2, pte1);
+stq_p(spapr->htab + offset + HPTE64_R_BYTE_OFFSET, pte1);
 }
 }
 }
@@ -1438,7 +1438,7 @@ void spapr_store_hpte(PowerPCCPU *cpu, hwaddr ptex,
 static void spapr_hpte_set_c(PPCVirtualHypervisor *vhyp, hwaddr ptex,
  uint64_t pte1)
 {
-hwaddr offset = ptex * HASH_PTE_SIZE_64 + 15;
+hwaddr offset = ptex * HASH_PTE_SIZE_64 + HPTE64_R_C_BYTE_OFFSET;
 SpaprMachineState *spapr = SPAPR_MACHINE(vhyp);
 
 if (!spapr->htab) {
@@ -1454,7 +1454,7 @@ static void spapr_hpte_set_c(PPCVirtualHypervisor *vhyp, 
hwaddr ptex,
 static void spapr_hpte_set_r(PPCVirtualHypervisor *vhyp, hwaddr ptex,
  uint64_t pte1)
 {
-hwaddr offset = ptex * HASH_PTE_SIZE_64 + 14;
+hwaddr offset = ptex * HASH_PTE_SIZE_64 + HPTE64_R_R_BYTE_OFFSET;
 SpaprMachineState *spapr = SPAPR_MACHINE(vhyp);
 
 if (!spapr->htab) {
diff --git a/hw/ppc/spapr_softmmu.c b/hw/ppc/spapr_softmmu.c
index f8924270ef..03676c4448 100644
--- a/hw/ppc/spapr_softmmu.c
+++ b/hw/ppc/spapr_softmmu.c
@@ -426,7 +426,7 @@ static void new_hpte_store(void *htab, uint64_t pteg, int 
slot,
 addr += slot * HASH_PTE_SIZE_64;
 
 stq_p(addr, pte0);
-stq_p(addr + HASH_PTE_SIZE_64 / 2, pte1);
+stq_p(addr + HPTE64_R_BYTE_OFFSET, pte1);
 }
 
 static int rehash_hpte(PowerPCCPU *cpu,
diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 19832c4b46..168d397c26 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -786,7 +786,7 @@ static void ppc_hash64_set_dsi(CPUState *cs, int mmu_idx, 
uint64_t dar, uint64_t
 
 static void ppc_hash64_set_r(PowerPCCPU *cpu, hwaddr ptex, uint64_t pte1)
 {
-hwaddr base, offset = ptex * HASH_PTE_SIZE_64 + 16;
+hwaddr base, offset = ptex * HASH_PTE_SIZE_64 + HPTE64_R_R_BYTE_OFFSET;
 
 if (cpu->vhyp) {
 PPCVirtualHypervisorClass *vhc =
@@ -803,7 +803,7 @@ static void ppc_hash64_set_r(PowerPCCPU *cpu, hwaddr ptex, 
uint64_t pte1)
 
 static void ppc_hash64_set_c(PowerPCCPU *cpu, hwaddr ptex, uint64_t pte1)
 {
-hwaddr base, offset = ptex * HASH_PTE_SIZE_64 + 15;
+hwaddr base, offset = ptex * HASH_PTE_SIZE_64 + HPTE64_R_C_BYTE_OFFSET;
 
 if (cpu->vhyp) {
 PPCVirtualHypervisorClass *vhc =
diff --git a/target/ppc/mmu-hash64.h b/target/ppc/mmu-hash64.h
index c5b2f97ff7..2a46763f70 100644
--- a/target/ppc/mmu-hash64.h
+++ b/target/ppc/mmu-hash64.h
@@ -97,6 +97,11 @@ void ppc_hash64_finalize(PowerPCCPU *cpu);
 #define HPTE64_V_1TB_SEG0x4000ULL
 #define HPTE64_V_VRMA_MASK  0x4001ff00ULL
 
+/* PTE byte offsets */
+#define HPTE64_R_R_BYTE_OFFSET  14
+#define HPTE64_R_C_BYTE_OFFSET  15
+#define HPTE64_R_BYTE_OFFSET    (HASH_PTE_SIZE_64 / 2)
+
 /* Format changes for ARCH v3 */
 #define HPTE64_V_COMMON_BITS0x000fULL
 #define HPTE64_R_3_0_SSIZE_SHIFT 58
-- 
2.25.1
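
The offsets 14 and 15 used by the patch above can be cross-checked with a small stand-alone sketch. It assumes a 16-byte PTE stored big-endian, with pte1 occupying bytes 8 to 15, and assumes the R and C bit masks within pte1 are 0x100 and 0x80; the macro and function names below are hypothetical and chosen for this example only.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PTE_SIZE     16                    /* two 64-bit doublewords */
#define DEMO_PTE1_OFFSET  (DEMO_PTE_SIZE / 2)   /* pte1 starts at byte 8 */

/* Assumed bit masks within pte1 for the R and C bits. */
#define DEMO_R_MASK 0x0000000000000100ULL
#define DEMO_C_MASK 0x0000000000000080ULL

/*
 * Byte offset, from the start of a big-endian PTE, of the byte that
 * holds the lowest set bit of a pte1 mask.
 */
static unsigned pte1_bit_byte_offset(uint64_t mask)
{
    unsigned bit = 0;

    assert(mask != 0);
    while (!(mask & 1)) {
        mask >>= 1;
        bit++;
    }
    /* Big-endian: the least significant byte is the last one. */
    return DEMO_PTE1_OFFSET + 7 - bit / 8;
}

/* True if a byte offset still lies inside the PTE it was computed for. */
static int offset_in_same_pte(unsigned offset)
{
    return offset < DEMO_PTE_SIZE;
}
```

Under these assumptions the R bit lands in byte 14 and the C bit in byte 15, while an offset of 16 is already the first byte of the next PTE, which matches the corruption described in the commit message.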




[RFC PATCH] blog post: how to get your new feature up-streamed

2021-11-26 Thread Alex Bennée
Experience has shown that getting new functionality up-streamed can be
a somewhat painful process. Let's see if we can collect some of our
community knowledge into a blog post describing some best practices
for getting code accepted.

[AJB: obviously RFC for now, need material for the end]

Signed-off-by: Alex Bennée 
---
 ...26-so-you-want-to-add-something-to-qemu.md | 100 ++
 1 file changed, 100 insertions(+)
 create mode 100644 _posts/2021-11-26-so-you-want-to-add-something-to-qemu.md

diff --git a/_posts/2021-11-26-so-you-want-to-add-something-to-qemu.md 
b/_posts/2021-11-26-so-you-want-to-add-something-to-qemu.md
new file mode 100644
index 000..d38c0ca
--- /dev/null
+++ b/_posts/2021-11-26-so-you-want-to-add-something-to-qemu.md
@@ -0,0 +1,100 @@
+---
+layout: post
+title:  "So you want to add a sub-system/device/architecture to QEMU?"
+date:   2021-11-26 19:43:45
+author: Alex Bennée
+categories: [blog, process, development]
+---
+
+From time to time I hear of frustrations from potential new
+contributors who have tried to get new features up-streamed into the
+QEMU repository. After having read [our patch
+guidelines](https://qemu.readthedocs.io/en/latest/devel/submitting-a-patch.html)
+they post them to [qemu-devel](https://lore.kernel.org/qemu-devel/).
+Often the patches sit there seemingly unread and unloved. The
+developer is left wondering if they missed the secret handshake
+required to move the process forward. My hope is that this blog post
+will help.
+
+
+New features != Fixing a bug
+
+
+Adding a new feature is not the same as fixing a bug. For an area of
+code that is supported for Odd Fixes or above there will be
+someone listed in the
+[MAINTAINERS](https://gitlab.com/qemu-project/qemu/-/blob/master/MAINTAINERS)
+file. A properly configured `git-send-email` will even automatically
+add them to the patches as they are sent out. The maintainer will
+review the code and if no changes are requested they ensure the 
+patch flows through the appropriate trees and eventually makes it into
+the master branch.
+
+This doesn't usually happen for new code unless your patches happen to
+touch a directory that is marked as maintained. Without a maintainer
+to look at and apply your patches how will it ever get merged?
+
+Adding new code to a project is not a free activity. Code that isn't
+actively maintained has a tendency to [bit
+rot](http://www.catb.org/jargon/html/B/bit-rot.html) and become a drag
+on the rest of the code base. The QEMU code base is quite large and
+none of the developers are knowledgeable about all of it. If
+features aren't
+[documented](https://qemu.readthedocs.io/en/latest/devel/submitting-a-patch.html)
+they tend to remain unused as users struggle to enable them. If an
+unused feature becomes a drag on the rest of the code base by preventing
+re-factoring and other clean ups it is likely to be deprecated.
+Eventually deprecated code gets removed from the code base never to be
+seen again.
+
+Fortunately there is a way to avoid the ignominy of ignored new features
+and that is to become a maintainer of your own code!
+
+The maintainer's path
+
+
+There is perhaps an unfortunate stereotype in the open source world of
+maintainers being grumpy old experts who spend their time dismissively
+rejecting the patches of new contributors. Having done their time in
+the metaphorical trenches of the project they must ingest the email
+archive to prove their encyclopedic mastery. Eventually they then
+ascend to the status of maintainer having completed the dark key
+signing ritual.
+
+In reality the process is much more prosaic - you simply need to send
+a patch to the MAINTAINERS file with your email address, the areas you
+are going to cover and the level of support you expect to give.
+
+I won't pretend there isn't some commitment required when becoming a
+maintainer. However if you were motivated enough to write the code for
+a new feature you should be up to keeping it running smoothly in the
+upstream. The level of effort required is also proportional to the
+popularity of the feature - there is a world of difference between
+maintaining an individual device and a core subsystem. If the feature 
+
+Practically you will probably want to get yourself a
+[GitLab](https://gitlab.com/qemu-project/qemu/-/blob/master/MAINTAINERS)
+account so you can run the CI tests on your pull requests. While
+membership of `qemu-devel` is recommended no one is expecting you to
+read every message sent to it as long as you look at those where you
+are explicitly Cc'd.
+
+Now if you are convinced to become a maintainer for your new feature
+let's discuss how you can improve the chances of getting it merged.
+
+A practically perfect set of patches
+
+
+I don't want to repeat all the valuable information from the
+submitting patches document but I do want to emphasise the importance
+of respon

Re: [PATCH] gitlab-ci.d/buildtest: Add jobs that run the device-crash-test

2021-11-26 Thread Philippe Mathieu-Daudé
On 11/26/21 17:27, Thomas Huth wrote:
> The device-crash-test script has been quite neglected in the past,
> so that it bit-rot quite often. Let's add CI jobs that run this
> script for at least some targets, so that this script does not
> regress that easily anymore.
> 
> Signed-off-by: Thomas Huth 
> ---
>  .gitlab-ci.d/buildtest.yml | 23 +++
>  1 file changed, 23 insertions(+)

Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v3] target/ppc: fix Hash64 MMU update of PTE bit R

2021-11-26 Thread David Gibson
On Fri, Nov 26, 2021 at 04:39:40PM -0300, Leandro Lupori wrote:
> When updating the R bit of a PTE, the Hash64 MMU was using a wrong byte
> offset, causing the first byte of the adjacent PTE to be corrupted.
> This caused a panic when booting FreeBSD, using the Hash MMU.
> 
> Fixes: a2dd4e83e76b ("ppc/hash64: Rework R and C bit updates")
> Signed-off-by: Leandro Lupori 
> ---
> Changes from v2:
> - Add new defines for the byte offset of PTE bit C and
>   HASH_PTE_SIZE_64 / 2 (pte1)
> - Use new defines in hash64 and spapr code
> ---
>  hw/ppc/spapr.c  | 8 
>  hw/ppc/spapr_softmmu.c  | 2 +-
>  target/ppc/mmu-hash64.c | 4 ++--
>  target/ppc/mmu-hash64.h | 5 +
>  4 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 163c90388a..8ebf85bad8 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1414,7 +1414,7 @@ void spapr_store_hpte(PowerPCCPU *cpu, hwaddr ptex,
>  kvmppc_write_hpte(ptex, pte0, pte1);
>  } else {
>  if (pte0 & HPTE64_V_VALID) {
> -stq_p(spapr->htab + offset + HASH_PTE_SIZE_64 / 2, pte1);
> +stq_p(spapr->htab + offset + HPTE64_R_BYTE_OFFSET, pte1);

Urgh.. so, initially I thought this was wrong because I was confusing
HPTE64_R_BYTE_OFFSET with HPTE64_R_R_BYTE_OFFSET.  I doubt I'd be the
only one.

Calling something a BYTE_OFFSET then doing an stq to it is pretty
misleading I think.  WORD1_OFFSET or R_WORD_OFFSET might be better?

Or you could change these writebacks to byte writes, as powernv has
already been changed.  I'm not sure if that's necessary in the case of
pseries - since in that case the HPT doesn't exist within the guest's
address space.

>  /*
>   * When setting valid, we write PTE1 first. This ensures
>   * proper synchronization with the reading code in
> @@ -1430,7 +1430,7 @@ void spapr_store_hpte(PowerPCCPU *cpu, hwaddr ptex,
>   * ppc_hash64_pteg_search()
>   */
>  smp_wmb();
> -stq_p(spapr->htab + offset + HASH_PTE_SIZE_64 / 2, pte1);
> +stq_p(spapr->htab + offset + HPTE64_R_BYTE_OFFSET, pte1);
>  }
>  }
>  }
> @@ -1438,7 +1438,7 @@ void spapr_store_hpte(PowerPCCPU *cpu, hwaddr ptex,
>  static void spapr_hpte_set_c(PPCVirtualHypervisor *vhyp, hwaddr ptex,
>   uint64_t pte1)
>  {
> -hwaddr offset = ptex * HASH_PTE_SIZE_64 + 15;
> +hwaddr offset = ptex * HASH_PTE_SIZE_64 + HPTE64_R_C_BYTE_OFFSET;
>  SpaprMachineState *spapr = SPAPR_MACHINE(vhyp);
>  
>  if (!spapr->htab) {
> @@ -1454,7 +1454,7 @@ static void spapr_hpte_set_c(PPCVirtualHypervisor 
> *vhyp, hwaddr ptex,
>  static void spapr_hpte_set_r(PPCVirtualHypervisor *vhyp, hwaddr ptex,
>   uint64_t pte1)
>  {
> -hwaddr offset = ptex * HASH_PTE_SIZE_64 + 14;
> +hwaddr offset = ptex * HASH_PTE_SIZE_64 + HPTE64_R_R_BYTE_OFFSET;
>  SpaprMachineState *spapr = SPAPR_MACHINE(vhyp);
>  
>  if (!spapr->htab) {
> diff --git a/hw/ppc/spapr_softmmu.c b/hw/ppc/spapr_softmmu.c
> index f8924270ef..03676c4448 100644
> --- a/hw/ppc/spapr_softmmu.c
> +++ b/hw/ppc/spapr_softmmu.c
> @@ -426,7 +426,7 @@ static void new_hpte_store(void *htab, uint64_t pteg, int 
> slot,
>  addr += slot * HASH_PTE_SIZE_64;
>  
>  stq_p(addr, pte0);
> -stq_p(addr + HASH_PTE_SIZE_64 / 2, pte1);
> +stq_p(addr + HPTE64_R_BYTE_OFFSET, pte1);
>  }
>  
>  static int rehash_hpte(PowerPCCPU *cpu,
> diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
> index 19832c4b46..168d397c26 100644
> --- a/target/ppc/mmu-hash64.c
> +++ b/target/ppc/mmu-hash64.c
> @@ -786,7 +786,7 @@ static void ppc_hash64_set_dsi(CPUState *cs, int mmu_idx, 
> uint64_t dar, uint64_t
>  
>  static void ppc_hash64_set_r(PowerPCCPU *cpu, hwaddr ptex, uint64_t pte1)
>  {
> -hwaddr base, offset = ptex * HASH_PTE_SIZE_64 + 16;
> +hwaddr base, offset = ptex * HASH_PTE_SIZE_64 + HPTE64_R_R_BYTE_OFFSET;
>  
>  if (cpu->vhyp) {
>  PPCVirtualHypervisorClass *vhc =
> @@ -803,7 +803,7 @@ static void ppc_hash64_set_r(PowerPCCPU *cpu, hwaddr 
> ptex, uint64_t pte1)
>  
>  static void ppc_hash64_set_c(PowerPCCPU *cpu, hwaddr ptex, uint64_t pte1)
>  {
> -hwaddr base, offset = ptex * HASH_PTE_SIZE_64 + 15;
> +hwaddr base, offset = ptex * HASH_PTE_SIZE_64 + HPTE64_R_C_BYTE_OFFSET;
>  
>  if (cpu->vhyp) {
>  PPCVirtualHypervisorClass *vhc =
> diff --git a/target/ppc/mmu-hash64.h b/target/ppc/mmu-hash64.h
> index c5b2f97ff7..2a46763f70 100644
> --- a/target/ppc/mmu-hash64.h
> +++ b/target/ppc/mmu-hash64.h
> @@ -97,6 +97,11 @@ void ppc_hash64_finalize(PowerPCCPU *cpu);
>  #define HPTE64_V_1TB_SEG0x4000ULL
>  #define HPTE64_V_VRMA_MASK  0x4001ff00ULL
>  
> +/* PTE byte offsets */
> +#define HPTE64_R_R_BYTE_OFFSET  14
> +#define HPTE64_R_C_BYTE_OFFSET  15
> +#define HPTE64_R_BYTE_OFFSET    (HASH_PTE_SIZE_64 / 2)
> +

Re: [PATCH v3 0/3] tpm: Add missing ACPI device identification objects

2021-11-26 Thread Stefan Berger

Is this series now acceptable for 'after 6.2'?


On 11/10/21 08:35, Stefan Berger wrote:
> This series of patches adds missing ACPI device identification objects _STR
> and _UID to TPM 1.2 and TPM 2 ACPI tables.
> 
> Stefan
> 
> v3:
>   - Dropped replacement of ACPI tables with empty files in 1/3.
>   - Reduced ignored files
> 
> Stefan Berger (3):
>   tests: acpi: prepare for updated TPM related tables
>   acpi: tpm: Add missing device identification objects
>   tests: acpi: Add updated TPM related tables
> 
>  hw/arm/virt-acpi-build.c           |   1 +
>  hw/i386/acpi-build.c               |   8 
>  tests/data/acpi/q35/DSDT.tis.tpm12 | Bin 8894 -> 8900 bytes
>  tests/data/acpi/q35/DSDT.tis.tpm2  | Bin 8894 -> 8921 bytes
>  4 files changed, 9 insertions(+)





Re: [PATCH 1/1] ppc/pnv.c: add a friendly warning when accel=kvm is used

2021-11-26 Thread David Gibson
On Fri, Nov 26, 2021 at 06:51:38PM +0100, Cédric le Goater wrote:
> On 11/26/21 02:11, David Gibson wrote:
> > On Thu, Nov 25, 2021 at 07:42:02PM -0300, Daniel Henrique Barboza wrote:
> > > If one tries to use -machine powernv9,accel=kvm in a Power9 host, a
> > > cryptic error will be shown:
> > > 
> > > qemu-system-ppc64: Register sync failed... If you're using kvm-hv.ko, 
> > > only "-cpu host" is possible
> > > qemu-system-ppc64: kvm_init_vcpu: kvm_arch_init_vcpu failed (0): Invalid 
> > > argument
> > > 
> > > Appending '-cpu host' will throw another error:
> > > 
> > > qemu-system-ppc64: invalid chip model 'host' for powernv9 machine
> > > 
> > > The root cause is that in IBM PowerPC we have different specs for the 
> > > bare-metal
> > > and the guests. The bare-metal follows OPAL, the guests follow PAPR. The 
> > > kernel
> > > KVM modules presented in the ppc kernels implements PAPR. This means that 
> > > we
> > > can't use KVM accel when using the powernv machine, which is the 
> > > emulation of
> > > the bare-metal host.
> > > 
> > > All that said, let's give a more informative error in this case.
> > > 
> > > Signed-off-by: Daniel Henrique Barboza 
> > > ---
> > >   hw/ppc/pnv.c | 5 +
> > >   1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> > > index 71e45515f1..e5b87e8730 100644
> > > --- a/hw/ppc/pnv.c
> > > +++ b/hw/ppc/pnv.c
> > > @@ -742,6 +742,11 @@ static void pnv_init(MachineState *machine)
> > >   DriveInfo *pnor = drive_get(IF_MTD, 0, 0);
> > >   DeviceState *dev;
> > > +if (kvm_enabled()) {
> > > +error_report("The powernv machine does not work with KVM 
> > > acceleration");
> > > +exit(EXIT_FAILURE);
> > > +}
> > 
> > 
> > Hmm.. my only concern here is that powernv could, at least
> > theoretically, work with KVM PR.  I don't think it does right now,
> > though.
> 
> At the same time, it is nice to not let the user think that it could work
> in its current state. Don't you think so ?

Right, I'm thinking of the implication if you have an old qemu but a
new KVM which let it work.  Chances of KVM actually implementing this
probably aren't good though, so requiring the qemu update if we ever
do is probably the better deal.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




[PATCH v6 0/4] virtio-iommu: config related fixes and qtest

2021-11-26 Thread Eric Auger
Introduce a qtest for the virtio-iommu device. The test
helped identify an endianness bug in get_config().
We also remove the unneeded set_config() and fix the value
of the domain_range.end field.
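
To make the endianness issue concrete, here is a minimal sketch that assumes the bug came from copying a host-endian structure straight into config space, and that the fix amounts to storing each field in little-endian order, as virtio 1.0 config space expects. The structure, its two fields and the helper names are hypothetical and much smaller than the real virtio_iommu_config.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down device config kept in host byte order. */
struct demo_config {
    uint64_t page_size_mask;
    uint32_t probe_size;
};

#define DEMO_CONFIG_WIRE_SIZE 12    /* 8 + 4 bytes, no padding on the wire */

/* Store a 32-bit value least significant byte first. */
static void store_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Store a 64-bit value least significant byte first. */
static void store_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/*
 * Serialize field by field instead of memcpy()ing the struct, so the
 * guest sees the same bytes whatever the host byte order is.
 */
static void demo_get_config(const struct demo_config *cfg, uint8_t *out)
{
    assert(cfg != NULL && out != NULL);
    store_le64(out, cfg->page_size_mask);
    store_le32(out + 8, cfg->probe_size);
}
```

A plain memcpy() of the struct yields these bytes only on a little-endian host, which would explain why such a bug goes unnoticed until a test reads the config on a big-endian one.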

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/virtio-iommu-test-v6

History:
v5 -> v6:
- added patches 1-3
- qtest: fix domain_range.end expected value

Eric Auger (4):
  virtio-iommu: Remove set_config callback
  virtio-iommu: Fix endianness in get_config
  virtio-iommu: Fix the domain_range end
  tests: qtest: Add virtio-iommu test

 hw/virtio/trace-events|   3 +-
 hw/virtio/virtio-iommu.c  |  38 ++--
 tests/qtest/libqos/meson.build|   1 +
 tests/qtest/libqos/virtio-iommu.c | 126 
 tests/qtest/libqos/virtio-iommu.h |  40 
 tests/qtest/meson.build   |   1 +
 tests/qtest/virtio-iommu-test.c   | 326 ++
 7 files changed, 511 insertions(+), 24 deletions(-)
 create mode 100644 tests/qtest/libqos/virtio-iommu.c
 create mode 100644 tests/qtest/libqos/virtio-iommu.h
 create mode 100644 tests/qtest/virtio-iommu-test.c

-- 
2.26.3




[PATCH v6 1/4] virtio-iommu: Remove set_config callback

2021-11-26 Thread Eric Auger
The spec says "the driver must not write to device configuration
fields". So remove the set_config() callback, which did nothing
anyway.

Signed-off-by: Eric Auger 
---
 hw/virtio/trace-events   |  1 -
 hw/virtio/virtio-iommu.c | 14 --
 2 files changed, 15 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 650e521e351..54bd7da00c8 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -92,7 +92,6 @@ virtio_iommu_device_reset(void) "reset!"
 virtio_iommu_get_features(uint64_t features) "device supports 
features=0x%"PRIx64
 virtio_iommu_device_status(uint8_t status) "driver status = %d"
 virtio_iommu_get_config(uint64_t page_size_mask, uint64_t start, uint64_t end, 
uint32_t domain_range, uint32_t probe_size) "page_size_mask=0x%"PRIx64" 
start=0x%"PRIx64" end=0x%"PRIx64" domain_range=%d probe_size=0x%x"
-virtio_iommu_set_config(uint64_t page_size_mask, uint64_t start, uint64_t end, 
uint32_t domain_range, uint32_t probe_size) "page_size_mask=0x%"PRIx64" 
start=0x%"PRIx64" end=0x%"PRIx64" domain_bits=%d probe_size=0x%x"
 virtio_iommu_attach(uint32_t domain_id, uint32_t ep_id) "domain=%d endpoint=%d"
 virtio_iommu_detach(uint32_t domain_id, uint32_t ep_id) "domain=%d endpoint=%d"
 virtio_iommu_map(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end, 
uint64_t phys_start, uint32_t flags) "domain=%d virt_start=0x%"PRIx64" 
virt_end=0x%"PRIx64 " phys_start=0x%"PRIx64" flags=%d"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 1b23e8e18c7..645c0aa3997 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -832,19 +832,6 @@ static void virtio_iommu_get_config(VirtIODevice *vdev, 
uint8_t *config_data)
 memcpy(config_data, &dev->config, sizeof(struct virtio_iommu_config));
 }
 
-static void virtio_iommu_set_config(VirtIODevice *vdev,
-  const uint8_t *config_data)
-{
-struct virtio_iommu_config config;
-
-memcpy(&config, config_data, sizeof(struct virtio_iommu_config));
-trace_virtio_iommu_set_config(config.page_size_mask,
-  config.input_range.start,
-  config.input_range.end,
-  config.domain_range.end,
-  config.probe_size);
-}
-
 static uint64_t virtio_iommu_get_features(VirtIODevice *vdev, uint64_t f,
   Error **errp)
 {
@@ -1185,7 +1172,6 @@ static void virtio_iommu_class_init(ObjectClass *klass, 
void *data)
 vdc->unrealize = virtio_iommu_device_unrealize;
 vdc->reset = virtio_iommu_device_reset;
 vdc->get_config = virtio_iommu_get_config;
-vdc->set_config = virtio_iommu_set_config;
 vdc->get_features = virtio_iommu_get_features;
 vdc->set_status = virtio_iommu_set_status;
 vdc->vmsd = &vmstate_virtio_iommu_device;
-- 
2.26.3




[PATCH v6 4/4] tests: qtest: Add virtio-iommu test

2021-11-26 Thread Eric Auger
Add the framework to test the virtio-iommu-pci device
and tests exercising the attach/detach, map/unmap API.

Signed-off-by: Eric Auger 
Tested-by: Jean-Philippe Brucker 
Reviewed-by: Jean-Philippe Brucker 

---

v5 -> v6:
- changed the expected value for domain.end (32 -> MAX_UINT32)
---
 tests/qtest/libqos/meson.build|   1 +
 tests/qtest/libqos/virtio-iommu.c | 126 
 tests/qtest/libqos/virtio-iommu.h |  40 
 tests/qtest/meson.build   |   1 +
 tests/qtest/virtio-iommu-test.c   | 326 ++
 5 files changed, 494 insertions(+)
 create mode 100644 tests/qtest/libqos/virtio-iommu.c
 create mode 100644 tests/qtest/libqos/virtio-iommu.h
 create mode 100644 tests/qtest/virtio-iommu-test.c

diff --git a/tests/qtest/libqos/meson.build b/tests/qtest/libqos/meson.build
index 4af1f04787c..e988d157917 100644
--- a/tests/qtest/libqos/meson.build
+++ b/tests/qtest/libqos/meson.build
@@ -41,6 +41,7 @@ libqos_srcs = files('../libqtest.c',
 'virtio-rng.c',
 'virtio-scsi.c',
 'virtio-serial.c',
+'virtio-iommu.c',
 
 # qgraph machines:
 'aarch64-xlnx-zcu102-machine.c',
diff --git a/tests/qtest/libqos/virtio-iommu.c 
b/tests/qtest/libqos/virtio-iommu.c
new file mode 100644
index 000..18cba4ca36b
--- /dev/null
+++ b/tests/qtest/libqos/virtio-iommu.c
@@ -0,0 +1,126 @@
+/*
+ * libqos driver virtio-iommu-pci framework
+ *
+ * Copyright (c) 2021 Red Hat, Inc.
+ *
+ * Authors:
+ *  Eric Auger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at your
+ * option) any later version.  See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest.h"
+#include "qemu/module.h"
+#include "qgraph.h"
+#include "virtio-iommu.h"
+#include "hw/virtio/virtio-iommu.h"
+
+static QGuestAllocator *alloc;
+
+/* virtio-iommu-device */
+static void *qvirtio_iommu_get_driver(QVirtioIOMMU *v_iommu,
+                                      const char *interface)
+{
+    if (!g_strcmp0(interface, "virtio-iommu")) {
+        return v_iommu;
+    }
+    if (!g_strcmp0(interface, "virtio")) {
+        return v_iommu->vdev;
+    }
+
+    fprintf(stderr, "%s not present in virtio-iommu-device\n", interface);
+    g_assert_not_reached();
+}
+
+static void virtio_iommu_cleanup(QVirtioIOMMU *interface)
+{
+    qvirtqueue_cleanup(interface->vdev->bus, interface->vq, alloc);
+}
+
+static void virtio_iommu_setup(QVirtioIOMMU *interface)
+{
+    QVirtioDevice *vdev = interface->vdev;
+    uint64_t features;
+
+    features = qvirtio_get_features(vdev);
+    features &= ~(QVIRTIO_F_BAD_FEATURE |
+                  (1ull << VIRTIO_RING_F_INDIRECT_DESC) |
+                  (1ull << VIRTIO_RING_F_EVENT_IDX) |
+                  (1ull << VIRTIO_IOMMU_F_BYPASS));
+    qvirtio_set_features(vdev, features);
+    interface->vq = qvirtqueue_setup(interface->vdev, alloc, 0);
+    qvirtio_set_driver_ok(interface->vdev);
+}
+
+/* virtio-iommu-pci */
+static void *qvirtio_iommu_pci_get_driver(void *object, const char *interface)
+{
+    QVirtioIOMMUPCI *v_iommu = object;
+    if (!g_strcmp0(interface, "pci-device")) {
+        return v_iommu->pci_vdev.pdev;
+    }
+    return qvirtio_iommu_get_driver(&v_iommu->iommu, interface);
+}
+
+static void qvirtio_iommu_pci_destructor(QOSGraphObject *obj)
+{
+    QVirtioIOMMUPCI *iommu_pci = (QVirtioIOMMUPCI *) obj;
+    QVirtioIOMMU *interface = &iommu_pci->iommu;
+    QOSGraphObject *pci_vobj =  &iommu_pci->pci_vdev.obj;
+
+    virtio_iommu_cleanup(interface);
+    qvirtio_pci_destructor(pci_vobj);
+}
+
+static void qvirtio_iommu_pci_start_hw(QOSGraphObject *obj)
+{
+    QVirtioIOMMUPCI *iommu_pci = (QVirtioIOMMUPCI *) obj;
+    QVirtioIOMMU *interface = &iommu_pci->iommu;
+    QOSGraphObject *pci_vobj =  &iommu_pci->pci_vdev.obj;
+
+    qvirtio_pci_start_hw(pci_vobj);
+    virtio_iommu_setup(interface);
+}
+
+
+static void *virtio_iommu_pci_create(void *pci_bus, QGuestAllocator *t_alloc,
+                                     void *addr)
+{
+    QVirtioIOMMUPCI *virtio_rpci = g_new0(QVirtioIOMMUPCI, 1);
+    QVirtioIOMMU *interface = &virtio_rpci->iommu;
+    QOSGraphObject *obj = &virtio_rpci->pci_vdev.obj;
+
+    virtio_pci_init(&virtio_rpci->pci_vdev, pci_bus, addr);
+    interface->vdev = &virtio_rpci->pci_vdev.vdev;
+    alloc = t_alloc;
+
+    obj->get_driver = qvirtio_iommu_pci_get_driver;
+    obj->start_hw = qvirtio_iommu_pci_start_hw;
+    obj->destructor = qvirtio_iommu_pci_destructor;
+
+    return obj;
+}
+
+static void virtio_iommu_register_nodes(void)
+{
+    QPCIAddress addr = {
+        .devfn = QPCI_DEVFN(4, 0),
+    };
+
+    QOSGraphEdgeOptions opts = {
+        .extra_device_opts = "addr=04.0",
+    };
+
+    /* virtio-iommu-pci */
+    add_qpci_address(&opts, &addr);
+    qos_node_create_driver("virtio-iommu-pci", virtio_iommu_pci_create);
+    qos_node_consumes("virtio-iommu-pci", "pci-bus", &opts);
+qos_node_produces("virtio-iommu-pci", "pc

[PATCH v6 2/4] virtio-iommu: Fix endianness in get_config

2021-11-26 Thread Eric Auger
Endianness is not properly handled when populating
the returned config. Use the cpu_to_le* primitives
for each separate field. Also, while at it, trace
the domain range start.

Signed-off-by: Eric Auger 
Reported-by: Thomas Huth 
---
 hw/virtio/trace-events   |  2 +-
 hw/virtio/virtio-iommu.c | 22 +++---
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 54bd7da00c8..f7ad6be5fbb 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -91,7 +91,7 @@ virtio_mmio_setting_irq(int level) "virtio_mmio setting IRQ %d"
 virtio_iommu_device_reset(void) "reset!"
 virtio_iommu_get_features(uint64_t features) "device supports features=0x%"PRIx64
 virtio_iommu_device_status(uint8_t status) "driver status = %d"
-virtio_iommu_get_config(uint64_t page_size_mask, uint64_t start, uint64_t end, uint32_t domain_range, uint32_t probe_size) "page_size_mask=0x%"PRIx64" start=0x%"PRIx64" end=0x%"PRIx64" domain_range=%d probe_size=0x%x"
+virtio_iommu_get_config(uint64_t page_size_mask, uint64_t start, uint64_t end, uint32_t domain_start, uint32_t domain_end, uint32_t probe_size) "page_size_mask=0x%"PRIx64" input range start=0x%"PRIx64" input range end=0x%"PRIx64" domain range start=%d domain range end=%d probe_size=0x%x"
 virtio_iommu_attach(uint32_t domain_id, uint32_t ep_id) "domain=%d endpoint=%d"
 virtio_iommu_detach(uint32_t domain_id, uint32_t ep_id) "domain=%d endpoint=%d"
 virtio_iommu_map(uint32_t domain_id, uint64_t virt_start, uint64_t virt_end, uint64_t phys_start, uint32_t flags) "domain=%d virt_start=0x%"PRIx64" virt_end=0x%"PRIx64 " phys_start=0x%"PRIx64" flags=%d"
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 645c0aa3997..30ee09187b8 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -822,14 +822,22 @@ unlock:
 static void virtio_iommu_get_config(VirtIODevice *vdev, uint8_t *config_data)
 {
     VirtIOIOMMU *dev = VIRTIO_IOMMU(vdev);
-    struct virtio_iommu_config *config = &dev->config;
+    struct virtio_iommu_config *dev_config = &dev->config;
+    struct virtio_iommu_config *out_config = (void *)config_data;
 
-    trace_virtio_iommu_get_config(config->page_size_mask,
-                                  config->input_range.start,
-                                  config->input_range.end,
-                                  config->domain_range.end,
-                                  config->probe_size);
-    memcpy(config_data, &dev->config, sizeof(struct virtio_iommu_config));
+    out_config->page_size_mask = cpu_to_le64(dev_config->page_size_mask);
+    out_config->input_range.start = cpu_to_le64(dev_config->input_range.start);
+    out_config->input_range.end = cpu_to_le64(dev_config->input_range.end);
+    out_config->domain_range.start = cpu_to_le32(dev_config->domain_range.start);
+    out_config->domain_range.end = cpu_to_le32(dev_config->domain_range.end);
+    out_config->probe_size = cpu_to_le32(dev_config->probe_size);
+
+    trace_virtio_iommu_get_config(dev_config->page_size_mask,
+                                  dev_config->input_range.start,
+                                  dev_config->input_range.end,
+                                  dev_config->domain_range.start,
+                                  dev_config->domain_range.end,
+                                  dev_config->probe_size);
 }
 
 static uint64_t virtio_iommu_get_features(VirtIODevice *vdev, uint64_t f,
-- 
2.26.3
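For readers who want to see the per-field conversion outside of QEMU, here is a minimal standalone sketch. It does not use QEMU's cpu_to_le32()/cpu_to_le64(); the put_le32()/put_le64() helpers, the iommu_config_sketch struct, the byte offsets and CONFIG_SKETCH_SIZE are all assumptions made up for illustration. Only the field names are taken from the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device config; field names follow the patch. */
struct range64_sketch { uint64_t start, end; };
struct range32_sketch { uint32_t start, end; };
struct iommu_config_sketch {
    uint64_t page_size_mask;
    struct range64_sketch input_range;
    struct range32_sketch domain_range;
    uint32_t probe_size;
};

/* Bytes written by config_to_le(): 3 x 8 + 3 x 4 (sketch assumption). */
#define CONFIG_SKETCH_SIZE 36

/* Store v least-significant byte first, whatever the host byte order is. */
static void put_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

static void put_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/*
 * Convert each field separately instead of copying the host-endian struct
 * in one go, which is the idea behind replacing the memcpy() in the patch.
 */
static void config_to_le(const struct iommu_config_sketch *cfg, uint8_t *out)
{
    assert(cfg != NULL && out != NULL);
    put_le64(out + 0, cfg->page_size_mask);
    put_le64(out + 8, cfg->input_range.start);
    put_le64(out + 16, cfg->input_range.end);
    put_le32(out + 24, cfg->domain_range.start);
    put_le32(out + 28, cfg->domain_range.end);
    put_le32(out + 32, cfg->probe_size);
}
```

On a little-endian host a plain memcpy() of the struct happens to produce the same bytes for these fields, which is presumably why the problem only shows up on big-endian hosts.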




[PATCH v6 3/4] virtio-iommu: Fix the domain_range end

2021-11-26 Thread Eric Auger
In the past, the domain range was defined by a domain_bits le32.
This was later converted into a domain_range struct. During that
conversion the original value of '32' (bits) was kept, although
the end field now holds the maximum domain id (UINT32_MAX).
Fix that and also use UINT64_MAX for the input_range.end.

Signed-off-by: Eric Auger 
Reported-by: Jean-Philippe Brucker 
---
 hw/virtio/virtio-iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 30ee09187b8..aa9c16a17b1 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -978,8 +978,8 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
     s->event_vq = virtio_add_queue(vdev, VIOMMU_DEFAULT_QUEUE_SIZE, NULL);
 
     s->config.page_size_mask = TARGET_PAGE_MASK;
-    s->config.input_range.end = -1UL;
-    s->config.domain_range.end = 32;
+    s->config.input_range.end = UINT64_MAX;
+    s->config.domain_range.end = UINT32_MAX;
     s->config.probe_size = VIOMMU_PROBE_SIZE;
 
     virtio_add_feature(&s->features, VIRTIO_RING_F_EVENT_IDX);
-- 
2.26.3
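To illustrate the difference between a bit width and a maximum id, here is a small standalone sketch. The domain_range_sketch struct and both helpers are hypothetical and not part of QEMU; they only model an inclusive [start, end] range as described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an inclusive domain id range. */
struct domain_range_sketch {
    uint32_t start;
    uint32_t end;
};

/* True if id lies within [start, end], both bounds included. */
static bool domain_id_in_range(const struct domain_range_sketch *r, uint32_t id)
{
    assert(r != NULL);
    return id >= r->start && id <= r->end;
}

/*
 * Translate an old-style width in bits into the largest representable id.
 * A width of 32 bits corresponds to UINT32_MAX, not to the value 32.
 */
static uint32_t max_id_from_bits(unsigned bits)
{
    return bits >= 32 ? UINT32_MAX : (UINT32_C(1) << bits) - 1;
}
```

With end left at 32, a range check like the one above would accept only domain ids 0 to 32; with UINT32_MAX every 32-bit id is accepted. Separately, the width of unsigned long is platform dependent, so -1UL is only guaranteed to be all-ones in 64 bits on hosts where unsigned long is 64 bits wide. That may be part of the motivation for UINT64_MAX, although the commit message does not say so.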



