date:20220816

On Fri, Jul 29, 2022 at 03:14:02PM +0800, Xiaoyao Li wrote:
> On 5/30/2022 7:49 PM, Gerd Hoffmann wrote:
> > On Thu, May 26, 2022 at 10:48:56AM +0800, Xiaoyao Li wrote:
> > > On 5/24/2022 3:08 PM, Gerd Hoffmann wrote:
> > > > On Thu, May 12, 2022 at 11:17:45AM +0800, Xiaoyao Li wrote:
> > > > > TDX guest cannot go to real mode, so just skip the setup of isa-bios.
> > > > 
> > > > Does isa-bios setup cause any actual problems?
> > > > (same question for patch #19).

> > There is no need for copying, end_of_1M is a alias memory region for
> > end_of_4G, so the backing storage is the same.
> 
> It is a reason that current alias approach cannot work for TDX. Because in
> TDX a private page can be only mapped to one gpa.

Ok, so memory aliasing not being supported by TDX is the underlying
reason.

> So for simplicity, I will
> just skip isa-bios shadowing for TDX instead of implementing a non-alias +
> memcpy approach.

Makes sense given that tdx wouldn't use the mapping below 1M anyway.
A comment explaining the tdx aliasing restriction would be good to make
clear why the special case for tdx exists.

take care,
  Gerd

Re: [RFC v2 00/10] Introduce an extensible static analyzer

Hi

On Fri, Aug 12, 2022 at 7:49 PM Alberto Faria  wrote:
>
> On Thu, Aug 4, 2022 at 12:44 PM Marc-André Lureau
>  wrote:
> > On fc36, I had several dependencies I needed to install manually (imho
> > they should have been pulled by python3-clang), but more annoyingly I
> > got:
> > clang.cindex.LibclangError: libclang.so: cannot open shared object
> > file: No such file or directory. To provide a path to libclang use
> > Config.set_library_path() or Config.set_library_file().
> >
> > clang-libs doesn't install libclang.so, I wonder why. I made a link
> > manually and it works, but it's probably incorrect. I'll try to open
> > issues for the clang packaging.
>
> That's strange. Thanks for looking into this.

No that's normal, I just got confused. clang-devel provides it.

However, I opened https://bugzilla.redhat.com/show_bug.cgi?id=2115362
"python3-clang depends on libclang.so"

Re: [RFC PATCH v4 18/36] i386/tdx: Skip BIOS shadowing setup

  Hi,

> I did some tracing for this, and the result differs for q35 machine type and
> pc machine type.
> 
> - For q35, the memslot update for isa-bios/pc.rom happens when mc->reset()
> that is triggered via
> 
>   qdev_machine_creation_done()
> -> qemu_system_reset(SHUTDOWN_CASE_NONE);
> 
> It's surely later than TDX's machine_init_done_notify callback which
> initializes the part of private memory via KVM_TDX_INIT_MEM_REGION
> 
> - For pc machine type, the memslot update happens in i440fx_init(), which is
> earlier than TDX's machine_init_done_notify callback
> 
> I haven't fully understand in what condition will QEMU carry out the memslot
> update yet. I will keep learning and try to come up a solution to ensure
> TDX's machine_init_done_notify callback executed after all the memslot
> settle down.

My guess would be the rom shadowing initialization being slightly
different in 'pc' and 'q35'.

take care,
  Gerd

Re: [PATCH 01/22] ppc/ppc4xx: Introduce a DCR device model


On 8/13/22 17:34, BALATON Zoltan wrote:

From: Cédric Le Goater 

The Device Control Registers (DCR) of on-SoC devices are accessed by
software through the use of the mtdcr and mfdcr instructions. These
are converted in transactions on a side band bus, the DCR bus, which
connects the on-SoC devices to the CPU.

Ideally, we should model these accesses with a DCR namespace and DCR
memory regions but today the DCR handlers are installed in a DCR table
under the CPU. Instead, introduce a little device model wrapper to hold
a CPU link and handle registration of DCR handlers.

The DCR device inherits from SysBus because most of these devices also
have MMIO regions and/or IRQs. Being a SysBusDevice makes things easier
to install the device model in the overall SoC.

Signed-off-by: Cédric Le Goater 


When re-sending a patch, it is a good practice to list the changes before
the Sob of the person doing the resend.

I think you only changed the ppc4xx_dcr_register prototype. Correct ?

Thanks,

C.


Signed-off-by: BALATON Zoltan 
---
  hw/ppc/ppc4xx_devs.c| 41 +
  include/hw/ppc/ppc4xx.h | 17 +
  2 files changed, 58 insertions(+)

diff --git a/hw/ppc/ppc4xx_devs.c b/hw/ppc/ppc4xx_devs.c
index 069b511951..f4d7ae9567 100644
--- a/hw/ppc/ppc4xx_devs.c
+++ b/hw/ppc/ppc4xx_devs.c
@@ -664,3 +664,44 @@ void ppc4xx_mal_init(CPUPPCState *env, uint8_t txcnum, 
uint8_t rxcnum,
   mal, &dcr_read_mal, &dcr_write_mal);
  }
  }
+
+/* PPC4xx_DCR_DEVICE */
+
+void ppc4xx_dcr_register(Ppc4xxDcrDeviceState *dev, int dcrn, void *opaque,
+ dcr_read_cb dcr_read, dcr_write_cb dcr_write)
+{
+assert(dev->cpu);
+ppc_dcr_register(&dev->cpu->env, dcrn, opaque, dcr_read, dcr_write);
+}
+
+bool ppc4xx_dcr_realize(Ppc4xxDcrDeviceState *dev, PowerPCCPU *cpu,
+Error **errp)
+{
+object_property_set_link(OBJECT(dev), "cpu", OBJECT(cpu), &error_abort);
+return sysbus_realize(SYS_BUS_DEVICE(dev), errp);
+}
+
+static Property ppc4xx_dcr_properties[] = {
+DEFINE_PROP_LINK("cpu", Ppc4xxDcrDeviceState, cpu, TYPE_POWERPC_CPU,
+ PowerPCCPU *),
+DEFINE_PROP_END_OF_LIST(),
+};
+
+static void ppc4xx_dcr_class_init(ObjectClass *oc, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(oc);
+
+device_class_set_props(dc, ppc4xx_dcr_properties);
+}
+
+static const TypeInfo ppc4xx_types[] = {
+{
+.name   = TYPE_PPC4xx_DCR_DEVICE,
+.parent = TYPE_SYS_BUS_DEVICE,
+.instance_size  = sizeof(Ppc4xxDcrDeviceState),
+.class_init = ppc4xx_dcr_class_init,
+.abstract   = true,
+}
+};
+
+DEFINE_TYPES(ppc4xx_types)
diff --git a/include/hw/ppc/ppc4xx.h b/include/hw/ppc/ppc4xx.h
index 591e2421a3..a537a5567b 100644
--- a/include/hw/ppc/ppc4xx.h
+++ b/include/hw/ppc/ppc4xx.h
@@ -27,6 +27,7 @@
  
  #include "hw/ppc/ppc.h"

  #include "exec/memory.h"
+#include "hw/sysbus.h"
  
  void ppc4xx_sdram_banks(MemoryRegion *ram, int nr_banks,

  MemoryRegion ram_memories[],
@@ -44,4 +45,20 @@ void ppc4xx_mal_init(CPUPPCState *env, uint8_t txcnum, 
uint8_t rxcnum,
  
  #define TYPE_PPC4xx_PCI_HOST_BRIDGE "ppc4xx-pcihost"
  
+/*

+ * Generic DCR device
+ */
+#define TYPE_PPC4xx_DCR_DEVICE "ppc4xx-dcr-device"
+OBJECT_DECLARE_SIMPLE_TYPE(Ppc4xxDcrDeviceState, PPC4xx_DCR_DEVICE);
+struct Ppc4xxDcrDeviceState {
+SysBusDevice parent_obj;
+
+PowerPCCPU *cpu;
+};
+
+void ppc4xx_dcr_register(Ppc4xxDcrDeviceState *dev, int dcrn, void *opaque,
+ dcr_read_cb dcr_read, dcr_write_cb dcr_write);
+bool ppc4xx_dcr_realize(Ppc4xxDcrDeviceState *dev, PowerPCCPU *cpu,
+Error **errp);
+
  #endif /* PPC4XX_H */

Re: [PATCH] ui/console: fix qemu_console_resize() regression

> > >> diff --git a/ui/console.c b/ui/console.c
> > >> index e139f7115e1f..765892f84f1c 100644
> > >> --- a/ui/console.c
> > >> +++ b/ui/console.c
> > >> @@ -2575,11 +2575,13 @@ static void vc_chr_open(Chardev *chr,
> > >>   void qemu_console_resize(QemuConsole *s, int width, int height)
> > >>   {
> > >> -DisplaySurface *surface;
> > >> +DisplaySurface *surface = qemu_console_surface(s);
> > >>   assert(s->console_type == GRAPHIC_CONSOLE);
> > >> -if (qemu_console_get_width(s, -1) == width &&
> > >> +if ((s->scanout.kind != SCANOUT_SURFACE ||
> > >> + (surface && surface->flags & QEMU_ALLOCATED_FLAG)) &&
> > >> +qemu_console_get_width(s, -1) == width &&
> > >>   qemu_console_get_height(s, -1) == height) {
> > >>   return;
> > >>   }

> Gerd, could you review the patch and let me send a MR ? (or do you
> have other UI patches queued already and take it?)

Patch looks good to me.

Acked-by: Gerd Hoffmann 

[ just back from summer vacation, no pending queue atm, just started
  walking through my email backlog though ... ]

take care,
  Gerd

Re: [PATCH 13/22] ppc4xx: Move EBC model to ppc4xx_devs.c


On 8/13/22 17:34, BALATON Zoltan wrote:

The EBC is shared between 405 and 440 so move it to shared file.


Should we rename the device to Ppc4xxEbcState ?

Thanks,

C.



Signed-off-by: BALATON Zoltan 
---
  hw/ppc/ppc405.h |  15 
  hw/ppc/ppc405_uc.c  | 191 
  hw/ppc/ppc4xx_devs.c| 191 
  include/hw/ppc/ppc4xx.h |  15 
  4 files changed, 206 insertions(+), 206 deletions(-)

diff --git a/hw/ppc/ppc405.h b/hw/ppc/ppc405.h
index d85c595f9d..c0251f0894 100644
--- a/hw/ppc/ppc405.h
+++ b/hw/ppc/ppc405.h
@@ -85,21 +85,6 @@ struct Ppc405OpbaState {
  uint8_t pr;
  };
  
-/* Peripheral controller */

-#define TYPE_PPC405_EBC "ppc405-ebc"
-OBJECT_DECLARE_SIMPLE_TYPE(Ppc405EbcState, PPC405_EBC);
-struct Ppc405EbcState {
-Ppc4xxDcrDeviceState parent_obj;
-
-uint32_t addr;
-uint32_t bcr[8];
-uint32_t bap[8];
-uint32_t bear;
-uint32_t besr0;
-uint32_t besr1;
-uint32_t cfg;
-};
-
  /* DMA controller */
  #define TYPE_PPC405_DMA "ppc405-dma"
  OBJECT_DECLARE_SIMPLE_TYPE(Ppc405DmaState, PPC405_DMA);
diff --git a/hw/ppc/ppc405_uc.c b/hw/ppc/ppc405_uc.c
index 3de6c77631..01625e3237 100644
--- a/hw/ppc/ppc405_uc.c
+++ b/hw/ppc/ppc405_uc.c
@@ -299,192 +299,6 @@ static void ppc405_opba_class_init(ObjectClass *oc, void 
*data)
  /* Code decompression controller */
  /* XXX: TODO */
  
-/*/

-/* Peripheral controller */
-enum {
-EBC0_CFGADDR = 0x012,
-EBC0_CFGDATA = 0x013,
-};
-
-static uint32_t dcr_read_ebc(void *opaque, int dcrn)
-{
-Ppc405EbcState *ebc = opaque;
-uint32_t ret;
-
-switch (dcrn) {
-case EBC0_CFGADDR:
-ret = ebc->addr;
-break;
-case EBC0_CFGDATA:
-switch (ebc->addr) {
-case 0x00: /* B0CR */
-ret = ebc->bcr[0];
-break;
-case 0x01: /* B1CR */
-ret = ebc->bcr[1];
-break;
-case 0x02: /* B2CR */
-ret = ebc->bcr[2];
-break;
-case 0x03: /* B3CR */
-ret = ebc->bcr[3];
-break;
-case 0x04: /* B4CR */
-ret = ebc->bcr[4];
-break;
-case 0x05: /* B5CR */
-ret = ebc->bcr[5];
-break;
-case 0x06: /* B6CR */
-ret = ebc->bcr[6];
-break;
-case 0x07: /* B7CR */
-ret = ebc->bcr[7];
-break;
-case 0x10: /* B0AP */
-ret = ebc->bap[0];
-break;
-case 0x11: /* B1AP */
-ret = ebc->bap[1];
-break;
-case 0x12: /* B2AP */
-ret = ebc->bap[2];
-break;
-case 0x13: /* B3AP */
-ret = ebc->bap[3];
-break;
-case 0x14: /* B4AP */
-ret = ebc->bap[4];
-break;
-case 0x15: /* B5AP */
-ret = ebc->bap[5];
-break;
-case 0x16: /* B6AP */
-ret = ebc->bap[6];
-break;
-case 0x17: /* B7AP */
-ret = ebc->bap[7];
-break;
-case 0x20: /* BEAR */
-ret = ebc->bear;
-break;
-case 0x21: /* BESR0 */
-ret = ebc->besr0;
-break;
-case 0x22: /* BESR1 */
-ret = ebc->besr1;
-break;
-case 0x23: /* CFG */
-ret = ebc->cfg;
-break;
-default:
-ret = 0x;
-break;
-}
-break;
-default:
-ret = 0x;
-break;
-}
-
-return ret;
-}
-
-static void dcr_write_ebc(void *opaque, int dcrn, uint32_t val)
-{
-Ppc405EbcState *ebc = opaque;
-
-switch (dcrn) {
-case EBC0_CFGADDR:
-ebc->addr = val;
-break;
-case EBC0_CFGDATA:
-switch (ebc->addr) {
-case 0x00: /* B0CR */
-break;
-case 0x01: /* B1CR */
-break;
-case 0x02: /* B2CR */
-break;
-case 0x03: /* B3CR */
-break;
-case 0x04: /* B4CR */
-break;
-case 0x05: /* B5CR */
-break;
-case 0x06: /* B6CR */
-break;
-case 0x07: /* B7CR */
-break;
-case 0x10: /* B0AP */
-break;
-case 0x11: /* B1AP */
-break;
-case 0x12: /* B2AP */
-break;
-case 0x13: /* B3AP */
-break;
-case 0x14: /* B4AP */
-break;
-case 0x15: /* B5AP */
-break;
-case 0x16: /* B6AP */
-break;
-case 0x17: /* B7AP */
-break;
-case 0x20: /* BEAR */
-break;
-case 0x21: /* BESR0 */
-break;
-case 0x22: /* BESR1 */
-break;
-case 0x23: /* CFG */
-break;
-default:
-bre

Re: [PATCH 20/22] hw/ppc/Kconfig: Move imply before select


On 8/13/22 17:34, BALATON Zoltan wrote:

In pegasos2 section move imply before select to match other sections.


Reviewed-by: Cédric Le Goater 

Thanks,

C.



Signed-off-by: BALATON Zoltan 
---
  hw/ppc/Kconfig | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/Kconfig b/hw/ppc/Kconfig
index 205f9f98d7..3a4418a69e 100644
--- a/hw/ppc/Kconfig
+++ b/hw/ppc/Kconfig
@@ -71,6 +71,7 @@ config SAM460EX
  
  config PEGASOS2

  bool
+imply ATI_VGA
  select MV64361
  select VT82C686
  select IDE_VIA
@@ -78,7 +79,6 @@ config PEGASOS2
  select VOF
  # This should come with VT82C686
  select ACPI_X86
-imply ATI_VGA
  
  config PREP

  bool

Re: [PATCH 12/22] ppc4xx: Move PLB model to ppc4xx_devs.c


On 8/13/22 17:34, BALATON Zoltan wrote:

The PLB is shared between 405 and 440 so move it to the shared file.


Should we rename the device to Ppc4xxPlbState ?

Thanks,

C.



Signed-off-by: BALATON Zoltan 
---
  hw/ppc/ppc405.h | 11 -
  hw/ppc/ppc405_uc.c  | 93 
  hw/ppc/ppc4xx_devs.c| 94 +
  include/hw/ppc/ppc4xx.h | 11 +
  4 files changed, 105 insertions(+), 104 deletions(-)

diff --git a/hw/ppc/ppc405.h b/hw/ppc/ppc405.h
index 31c94e4742..d85c595f9d 100644
--- a/hw/ppc/ppc405.h
+++ b/hw/ppc/ppc405.h
@@ -63,17 +63,6 @@ struct ppc4xx_bd_info_t {
  uint32_t bi_iic_fast[2];
  };
  
-/* Peripheral local bus arbitrer */

-#define TYPE_PPC405_PLB "ppc405-plb"
-OBJECT_DECLARE_SIMPLE_TYPE(Ppc405PlbState, PPC405_PLB);
-struct Ppc405PlbState {
-Ppc4xxDcrDeviceState parent_obj;
-
-uint32_t acr;
-uint32_t bear;
-uint32_t besr;
-};
-
  /* PLB to OPB bridge */
  #define TYPE_PPC405_POB "ppc405-pob"
  OBJECT_DECLARE_SIMPLE_TYPE(Ppc405PobState, PPC405_POB);
diff --git a/hw/ppc/ppc405_uc.c b/hw/ppc/ppc405_uc.c
index 922c23346f..3de6c77631 100644
--- a/hw/ppc/ppc405_uc.c
+++ b/hw/ppc/ppc405_uc.c
@@ -137,94 +137,6 @@ ram_addr_t ppc405_set_bootinfo(CPUPPCState *env, 
ram_addr_t ram_size)
  
/*/
  /* Shared peripherals */
  
-/*/

-/* Peripheral local bus arbitrer */
-enum {
-PLB3A0_ACR = 0x077,
-PLB4A0_ACR = 0x081,
-PLB0_BESR  = 0x084,
-PLB0_BEAR  = 0x086,
-PLB0_ACR   = 0x087,
-PLB4A1_ACR = 0x089,
-};
-
-static uint32_t dcr_read_plb(void *opaque, int dcrn)
-{
-Ppc405PlbState *plb = opaque;
-uint32_t ret;
-
-switch (dcrn) {
-case PLB0_ACR:
-ret = plb->acr;
-break;
-case PLB0_BEAR:
-ret = plb->bear;
-break;
-case PLB0_BESR:
-ret = plb->besr;
-break;
-default:
-/* Avoid gcc warning */
-ret = 0;
-break;
-}
-
-return ret;
-}
-
-static void dcr_write_plb(void *opaque, int dcrn, uint32_t val)
-{
-Ppc405PlbState *plb = opaque;
-
-switch (dcrn) {
-case PLB0_ACR:
-/* We don't care about the actual parameters written as
- * we don't manage any priorities on the bus
- */
-plb->acr = val & 0xF800;
-break;
-case PLB0_BEAR:
-/* Read only */
-break;
-case PLB0_BESR:
-/* Write-clear */
-plb->besr &= ~val;
-break;
-}
-}
-
-static void ppc405_plb_reset(DeviceState *dev)
-{
-Ppc405PlbState *plb = PPC405_PLB(dev);
-
-plb->acr = 0x;
-plb->bear = 0x;
-plb->besr = 0x;
-}
-
-static void ppc405_plb_realize(DeviceState *dev, Error **errp)
-{
-Ppc405PlbState *plb = PPC405_PLB(dev);
-Ppc4xxDcrDeviceState *dcr = PPC4xx_DCR_DEVICE(dev);
-
-ppc4xx_dcr_register(dcr, PLB3A0_ACR, plb, &dcr_read_plb, &dcr_write_plb);
-ppc4xx_dcr_register(dcr, PLB4A0_ACR, plb, &dcr_read_plb, &dcr_write_plb);
-ppc4xx_dcr_register(dcr, PLB0_ACR, plb, &dcr_read_plb, &dcr_write_plb);
-ppc4xx_dcr_register(dcr, PLB0_BEAR, plb, &dcr_read_plb, &dcr_write_plb);
-ppc4xx_dcr_register(dcr, PLB0_BESR, plb, &dcr_read_plb, &dcr_write_plb);
-ppc4xx_dcr_register(dcr, PLB4A1_ACR, plb, &dcr_read_plb, &dcr_write_plb);
-}
-
-static void ppc405_plb_class_init(ObjectClass *oc, void *data)
-{
-DeviceClass *dc = DEVICE_CLASS(oc);
-
-dc->realize = ppc405_plb_realize;
-dc->reset = ppc405_plb_reset;
-/* Reason: only works as function of a ppc4xx SoC */
-dc->user_creatable = false;
-}
-
  
/*/
  /* PLB to OPB bridge */
  enum {
@@ -1535,11 +1447,6 @@ static void ppc405_soc_class_init(ObjectClass *oc, void 
*data)
  
  static const TypeInfo ppc405_types[] = {

  {
-.name   = TYPE_PPC405_PLB,
-.parent = TYPE_PPC4xx_DCR_DEVICE,
-.instance_size  = sizeof(Ppc405PlbState),
-.class_init = ppc405_plb_class_init,
-}, {
  .name   = TYPE_PPC405_POB,
  .parent = TYPE_PPC4xx_DCR_DEVICE,
  .instance_size  = sizeof(Ppc405PobState),
diff --git a/hw/ppc/ppc4xx_devs.c b/hw/ppc/ppc4xx_devs.c
index 7d40c1b68a..843d759b1b 100644
--- a/hw/ppc/ppc4xx_devs.c
+++ b/hw/ppc/ppc4xx_devs.c
@@ -658,6 +658,95 @@ static void ppc4xx_mal_class_init(ObjectClass *oc, void 
*data)
  device_class_set_props(dc, ppc4xx_mal_properties);
  }
  
+/*/

+/* Peripheral local bus arbitrer */
+enum {
+PLB3A0_ACR = 0x077,
+PLB4A0_ACR = 0x081,
+PLB0_BESR  = 0x084,
+PLB0_BEAR  = 0x086,
+PLB0_ACR   = 0x087,
+PLB4A1_ACR = 0x089,
+};
+
+static uint32_t dcr_read_plb(void *opaque, int dc

Re: [PATCH 19/22] hw/ppc/Kconfig: Remove PPC405 dependency from sam460ex


On 8/13/22 17:34, BALATON Zoltan wrote:

Now that shared PPC4xx devices are separated from PPC405 ones we can
drop this depencency.



Reviewed-by: Cédric Le Goater 

Thanks,

C.



Signed-off-by: BALATON Zoltan 
---
  hw/ppc/Kconfig | 1 -
  1 file changed, 1 deletion(-)

diff --git a/hw/ppc/Kconfig b/hw/ppc/Kconfig
index 400511c6b7..205f9f98d7 100644
--- a/hw/ppc/Kconfig
+++ b/hw/ppc/Kconfig
@@ -58,7 +58,6 @@ config PPC4XX
  
  config SAM460EX

  bool
-select PPC405
  select PFLASH_CFI01
  select IDE_SII3112
  select M41T80

Re: [PATCH] RFC: char: deprecate usage of bidirectional pipe

On Tue, Jul 26, 2022 at 09:44:25AM +0100, Daniel P. Berrangé wrote:
> On Tue, Jul 26, 2022 at 12:32:32PM +0400, marcandre.lur...@redhat.com wrote:
> > From: Marc-André Lureau 
> > 
> > As Ed Swierk explained back in 2006:
> > https://lists.nongnu.org/archive/html/qemu-devel/2006-12/msg00160.html
> > 
> > "When qemu writes into the pipe, it immediately reads back what it just
> > wrote and treats it as a monitor command, endlessly breathing its own
> > exhaust."
> > 
> > This is similarly confusing when using the chardev with a serial device,
> > as reported in https://bugzilla.redhat.com/show_bug.cgi?id=2106975.
> > 
> > It seems we have kept the support for bidirectional pipes for historical
> > reasons and odd systems, however it's not documented in qemu -chardev
> > options. I suggest to stop supporting it, for portability reasons.
> 
> Hmm, I always assumed that in this scenario the pipe was operating
> in output-only mode. Obviously not the case with the code as it
> exists, but perhaps this would be useful ?  eg its good as a serial
> console logging mechanism at least.

Well, using ${file}.in and ${file}.out has the advantage that it works
fine with all qemu versions.  So adding a warning suggesting to use that
makes sense to me, especially as 7.1 fix.

When looking at longer-term improvements it is probably better to add
support for explicit in/out pipe names, i.e. input= and output=
parameters as suggested later in the thread.  Avoids needing to know
qemu internals (pipe naming convention) and allows to make the input
pipe optional for the logging use case.  That probably is something we
don't want rush into 7.1 though ...

take care,
  Gerd

Re: [PATCH v12 2/6] target/riscv: Simplify counter predicate function

2022-08-16 Thread Atish Kumar Patra

On Mon, Aug 15, 2022 at 12:54 AM Andrew Jones 
wrote:

> On Tue, Aug 02, 2022 at 04:33:03PM -0700, Atish Patra wrote:
> > All the hpmcounters and the fixed counters (CY, IR, TM) can be
> represented
> > as a unified counter. Thus, the predicate function doesn't need handle
> each
> > case separately.
> >
> > Simplify the predicate function so that we just handle things differently
> > between RV32/RV64 and S/HS mode.
> >
> > Reviewed-by: Bin Meng 
> > Acked-by: Alistair Francis 
> > Signed-off-by: Atish Patra 
> > ---
> >  target/riscv/csr.c | 110 -
> >  1 file changed, 9 insertions(+), 101 deletions(-)
> >
> > diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> > index 9b45c49ab45f..6690b72ea0e7 100644
> > --- a/target/riscv/csr.c
> > +++ b/target/riscv/csr.c
> > @@ -74,6 +74,7 @@ static RISCVException ctr(CPURISCVState *env, int
> csrno)
> >  CPUState *cs = env_cpu(env);
> >  RISCVCPU *cpu = RISCV_CPU(cs);
> >  int ctr_index;
> > +target_ulong ctr_mask;
> >  int base_csrno = CSR_CYCLE;
> >  bool rv32 = riscv_cpu_mxl(env) == MXL_RV32 ? true : false;
> >
> > @@ -82,122 +83,29 @@ static RISCVException ctr(CPURISCVState *env, int
> csrno)
> >  base_csrno += 0x80;
> >  }
> >  ctr_index = csrno - base_csrno;
> > +ctr_mask = BIT(ctr_index);
> >
> >  if ((csrno >= CSR_CYCLE && csrno <= CSR_INSTRET) ||
> >  (csrno >= CSR_CYCLEH && csrno <= CSR_INSTRETH)) {
> >  goto skip_ext_pmu_check;
> >  }
> >
> > -if ((!cpu->cfg.pmu_num || !(cpu->pmu_avail_ctrs & BIT(ctr_index
> {
> > +if (!(cpu->pmu_avail_ctrs & ctr_mask)) {
> >  /* No counter is enabled in PMU or the counter is out of range
> */
> >  return RISCV_EXCP_ILLEGAL_INST;
> >  }
> >
> >  skip_ext_pmu_check:
> >
> > -if (env->priv == PRV_S) {
> > -switch (csrno) {
> > -case CSR_CYCLE:
> > -if (!get_field(env->mcounteren, COUNTEREN_CY)) {
> > -return RISCV_EXCP_ILLEGAL_INST;
> > -}
> > -break;
> > -case CSR_TIME:
> > -if (!get_field(env->mcounteren, COUNTEREN_TM)) {
> > -return RISCV_EXCP_ILLEGAL_INST;
> > -}
> > -break;
> > -case CSR_INSTRET:
> > -if (!get_field(env->mcounteren, COUNTEREN_IR)) {
> > -return RISCV_EXCP_ILLEGAL_INST;
> > -}
> > -break;
> > -case CSR_HPMCOUNTER3...CSR_HPMCOUNTER31:
> > -if (!get_field(env->mcounteren, 1 << ctr_index)) {
> > -return RISCV_EXCP_ILLEGAL_INST;
> > -}
> > -break;
> > -}
> > -if (rv32) {
> > -switch (csrno) {
> > -case CSR_CYCLEH:
> > -if (!get_field(env->mcounteren, COUNTEREN_CY)) {
> > -return RISCV_EXCP_ILLEGAL_INST;
> > -}
> > -break;
> > -case CSR_TIMEH:
> > -if (!get_field(env->mcounteren, COUNTEREN_TM)) {
> > -return RISCV_EXCP_ILLEGAL_INST;
> > -}
> > -break;
> > -case CSR_INSTRETH:
> > -if (!get_field(env->mcounteren, COUNTEREN_IR)) {
> > -return RISCV_EXCP_ILLEGAL_INST;
> > -}
> > -break;
> > -case CSR_HPMCOUNTER3H...CSR_HPMCOUNTER31H:
> > -if (!get_field(env->mcounteren, 1 << ctr_index)) {
> > -return RISCV_EXCP_ILLEGAL_INST;
> > -}
> > -break;
> > -}
> > -}
> > +if (((env->priv == PRV_S) && (!get_field(env->mcounteren,
> ctr_mask))) ||
> > +   ((env->priv == PRV_U) && (!get_field(env->scounteren,
> ctr_mask {
> ^ there should be another space here
>
> > +return RISCV_EXCP_ILLEGAL_INST;
> >  }
> >
> >  if (riscv_cpu_virt_enabled(env)) {
> > -switch (csrno) {
> > -case CSR_CYCLE:
> > -if (!get_field(env->hcounteren, COUNTEREN_CY) &&
> > -get_field(env->mcounteren, COUNTEREN_CY)) {
> > -return RISCV_EXCP_VIRT_INSTRUCTION_FAULT;
> > -}
> > -break;
> > -case CSR_TIME:
> > -if (!get_field(env->hcounteren, COUNTEREN_TM) &&
> > -get_field(env->mcounteren, COUNTEREN_TM)) {
> > -return RISCV_EXCP_VIRT_INSTRUCTION_FAULT;
> > -}
> > -break;
> > -case CSR_INSTRET:
> > -if (!get_field(env->hcounteren, COUNTEREN_IR) &&
> > -get_field(env->mcounteren, COUNTEREN_IR)) {
> > -return RISCV_EXCP_VIRT_INSTRUCTION_FAULT;
> > -}
> > -break;
> > -case CSR_HPMCOUNTER3...CSR_HPMCOUNTER31:
> > -if (!get_field(env->hcounteren, 1 << ctr_index) &&
> > - get_field(env->mcounteren, 1 << ctr_ind

[PATCH] target/hppa: Fix proberi instruction emulation for linux-user

2022-08-16 Thread Helge Deller

The proberi assembler instruction checks the read/write access rights
for the page of a given address and shall return a value of 1 if the
test succeeds and a value of 0 on failure in the target register.

But when run in linux-user mode, qemu currently simply returns the
return code of page_check_range() which returns 0 on success and -1 on
failure, which is the opposite of what proberi should return.

Fix it by checking the return code of page_check_range() and return the
expected return value.

The easiest way to reproduce the issue is by running
"/lib/ld.so.1 --version" in a chroot which fails without this patch.
At startup of ld.so the __canonicalize_funcptr_for_compare() function is
used to resolve the function address out of a function descriptor, which
fails because proberi (due to the wrong return code) seems to indicate
that the given address isn't accessible.

Signed-off-by: Helge Deller 

diff --git a/target/hppa/op_helper.c b/target/hppa/op_helper.c
index cd304f051e..fbd80e4248 100644
--- a/target/hppa/op_helper.c
+++ b/target/hppa/op_helper.c
@@ -170,7 +170,7 @@ target_ureg HELPER(probe)(CPUHPPAState *env, target_ulong 
addr,
   uint32_t level, uint32_t want)
 {
 #ifdef CONFIG_USER_ONLY
-return page_check_range(addr, 1, want);
+return (page_check_range(addr, 1, want) == 0) ? 1 : 0;
 #else
 int prot, excp;
 hwaddr phys;

Re: [PATCH v5 01/18] dump: Replace opaque DumpState pointer with a typed one

On Thu, Aug 11, 2022 at 4:13 PM Janosch Frank  wrote:
>
> It's always better to convey the type of a pointer if at all
> possible. So let's add the DumpState typedef to typedefs.h and move
> the dump note functions from the opaque pointers to DumpState
> pointers.
>
> Signed-off-by: Janosch Frank 

Reviewed-by: Marc-André Lureau 

> CC: Peter Maydell 
> CC: Cédric Le Goater 
> CC: Daniel Henrique Barboza 
> CC: David Gibson 
> CC: Greg Kurz 
> CC: Palmer Dabbelt 
> CC: Alistair Francis 
> CC: Bin Meng 
> CC: Cornelia Huck 
> CC: Thomas Huth 
> CC: Richard Henderson 
> CC: David Hildenbrand 
> ---
>  include/hw/core/sysemu-cpu-ops.h |  8 
>  include/qemu/typedefs.h  |  1 +
>  target/arm/arch_dump.c   |  6 ++
>  target/arm/cpu.h |  4 ++--
>  target/i386/arch_dump.c  | 30 +++---
>  target/i386/cpu.h|  8 
>  target/ppc/arch_dump.c   | 18 +-
>  target/ppc/cpu.h |  4 ++--
>  target/riscv/arch_dump.c |  6 ++
>  target/riscv/cpu.h   |  4 ++--
>  target/s390x/arch_dump.c | 10 +-
>  target/s390x/s390x-internal.h|  2 +-
>  12 files changed, 49 insertions(+), 52 deletions(-)
>
> diff --git a/include/hw/core/sysemu-cpu-ops.h 
> b/include/hw/core/sysemu-cpu-ops.h
> index a9ba39e5f2..ee169b872c 100644
> --- a/include/hw/core/sysemu-cpu-ops.h
> +++ b/include/hw/core/sysemu-cpu-ops.h
> @@ -53,25 +53,25 @@ typedef struct SysemuCPUOps {
>   * 32-bit VM coredump.
>   */
>  int (*write_elf32_note)(WriteCoreDumpFunction f, CPUState *cpu,
> -int cpuid, void *opaque);
> +int cpuid, DumpState *s);
>  /**
>   * @write_elf64_note: Callback for writing a CPU-specific ELF note to a
>   * 64-bit VM coredump.
>   */
>  int (*write_elf64_note)(WriteCoreDumpFunction f, CPUState *cpu,
> -int cpuid, void *opaque);
> +int cpuid, DumpState *s);
>  /**
>   * @write_elf32_qemunote: Callback for writing a CPU- and QEMU-specific 
> ELF
>   * note to a 32-bit VM coredump.
>   */
>  int (*write_elf32_qemunote)(WriteCoreDumpFunction f, CPUState *cpu,
> -void *opaque);
> +DumpState *s);
>  /**
>   * @write_elf64_qemunote: Callback for writing a CPU- and QEMU-specific 
> ELF
>   * note to a 64-bit VM coredump.
>   */
>  int (*write_elf64_qemunote)(WriteCoreDumpFunction f, CPUState *cpu,
> -void *opaque);
> +DumpState *s);
>  /**
>   * @virtio_is_big_endian: Callback to return %true if a CPU which 
> supports
>   * runtime configurable endianness is currently big-endian.
> diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
> index 42f4ceb701..054fd46fa6 100644
> --- a/include/qemu/typedefs.h
> +++ b/include/qemu/typedefs.h
> @@ -129,6 +129,7 @@ typedef struct VirtIODevice VirtIODevice;
>  typedef struct Visitor Visitor;
>  typedef struct VMChangeStateEntry VMChangeStateEntry;
>  typedef struct VMStateDescription VMStateDescription;
> +typedef struct DumpState DumpState;
>
>  /*
>   * Pointer types
> diff --git a/target/arm/arch_dump.c b/target/arm/arch_dump.c
> index b1f040e69f..2d8e41ab8a 100644
> --- a/target/arm/arch_dump.c
> +++ b/target/arm/arch_dump.c
> @@ -232,12 +232,11 @@ static int 
> aarch64_write_elf64_sve(WriteCoreDumpFunction f,
>  #endif
>
>  int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
> - int cpuid, void *opaque)
> + int cpuid, DumpState *s)
>  {
>  struct aarch64_note note;
>  ARMCPU *cpu = ARM_CPU(cs);
>  CPUARMState *env = &cpu->env;
> -DumpState *s = opaque;
>  uint64_t pstate, sp;
>  int ret, i;
>
> @@ -360,12 +359,11 @@ static int arm_write_elf32_vfp(WriteCoreDumpFunction f, 
> CPUARMState *env,
>  }
>
>  int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
> - int cpuid, void *opaque)
> + int cpuid, DumpState *s)
>  {
>  struct arm_note note;
>  ARMCPU *cpu = ARM_CPU(cs);
>  CPUARMState *env = &cpu->env;
> -DumpState *s = opaque;
>  int ret, i;
>  bool fpvalid = cpu_isar_feature(aa32_vfp_simd, cpu);
>
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index 5168e3d837..fc8b358779 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -1100,9 +1100,9 @@ int arm_gen_dynamic_svereg_xml(CPUState *cpu, int 
> base_reg);
>  const char *arm_gdb_get_dynamic_xml(CPUState *cpu, const char *xmlname);
>
>  int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
> - int cpuid, void *opaque);
> + int cpuid, DumpState *s);
>  int arm_cpu_write_elf32_note(WriteCore

Re: [PATCH 18/22] ppc405: Move machine specific code to ppc405_boards.c


On 8/13/22 17:34, BALATON Zoltan wrote:

These are only used by tha board code so move out from the shared SoC


s/tha/the/


model and put it in the boards file.


Reviewed-by: Cédric Le Goater 

Thanks,

C.



Signed-off-by: BALATON Zoltan 
---
  hw/ppc/ppc405.h|  38 -
  hw/ppc/ppc405_boards.c | 375 +++--
  hw/ppc/ppc405_uc.c |  92 --
  3 files changed, 251 insertions(+), 254 deletions(-)

diff --git a/hw/ppc/ppc405.h b/hw/ppc/ppc405.h
index 12eee97169..6b26fc6d17 100644
--- a/hw/ppc/ppc405.h
+++ b/hw/ppc/ppc405.h
@@ -30,41 +30,6 @@
  #include "hw/intc/ppc-uic.h"
  #include "hw/i2c/ppc4xx_i2c.h"
  
-#define PPC405EP_SDRAM_BASE 0x

-#define PPC405EP_NVRAM_BASE 0xF000
-#define PPC405EP_FPGA_BASE  0xF030
-#define PPC405EP_SRAM_BASE  0xFFF0
-#define PPC405EP_SRAM_SIZE  (512 * KiB)
-#define PPC405EP_FLASH_BASE 0xFFF8
-
-/* Bootinfo as set-up by u-boot */
-typedef struct ppc4xx_bd_info_t ppc4xx_bd_info_t;
-struct ppc4xx_bd_info_t {
-uint32_t bi_memstart;
-uint32_t bi_memsize;
-uint32_t bi_flashstart;
-uint32_t bi_flashsize;
-uint32_t bi_flashoffset; /* 0x10 */
-uint32_t bi_sramstart;
-uint32_t bi_sramsize;
-uint32_t bi_bootflags;
-uint32_t bi_ipaddr; /* 0x20 */
-uint8_t  bi_enetaddr[6];
-uint16_t bi_ethspeed;
-uint32_t bi_intfreq;
-uint32_t bi_busfreq; /* 0x30 */
-uint32_t bi_baudrate;
-uint8_t  bi_s_version[4];
-uint8_t  bi_r_version[32];
-uint32_t bi_procfreq;
-uint32_t bi_plb_busfreq;
-uint32_t bi_pci_busfreq;
-uint8_t  bi_pci_enetaddr[6];
-uint8_t  bi_pci_enetaddr2[6]; /* PPC405EP specific */
-uint32_t bi_opbfreq;
-uint32_t bi_iic_fast[2];
-};
-
  /* PLB to OPB bridge */
  #define TYPE_PPC405_POB "ppc405-pob"
  OBJECT_DECLARE_SIMPLE_TYPE(Ppc405PobState, PPC405_POB);
@@ -224,7 +189,4 @@ struct Ppc405SoCState {
  Ppc4xxMalState mal;
  };
  
-/* PowerPC 405 core */

-ram_addr_t ppc405_set_bootinfo(CPUPPCState *env, ram_addr_t ram_size);
-
  #endif /* PPC405_H */
diff --git a/hw/ppc/ppc405_boards.c b/hw/ppc/ppc405_boards.c
index 7af0d7feef..083f12b23e 100644
--- a/hw/ppc/ppc405_boards.c
+++ b/hw/ppc/ppc405_boards.c
@@ -48,6 +48,10 @@
  #define KERNEL_LOAD_ADDR 0x0100
  #define INITRD_LOAD_ADDR 0x0180
  
+#define PPC405EP_SDRAM_BASE 0x

+#define PPC405EP_SRAM_BASE  0xFFF0
+#define PPC405EP_SRAM_SIZE  (512 * KiB)
+
  #define USE_FLASH_BIOS
  
  #define TYPE_PPC405_MACHINE MACHINE_TYPE_NAME("ppc405")

@@ -61,112 +65,7 @@ struct Ppc405MachineState {
  Ppc405SoCState soc;
  };
  
-/*/

-/* PPC405EP reference board (IBM) */
-/* Standalone board with:
- * - PowerPC 405EP CPU
- * - SDRAM (0x)
- * - Flash (0xFFF8)
- * - SRAM  (0xFFF0)
- * - NVRAM (0xF000)
- * - FPGA  (0xF030)
- */
-
-#define TYPE_REF405EP_FPGA "ref405ep-fpga"
-OBJECT_DECLARE_SIMPLE_TYPE(Ref405epFpgaState, REF405EP_FPGA);
-struct Ref405epFpgaState {
-SysBusDevice parent_obj;
-
-MemoryRegion iomem;
-
-uint8_t reg0;
-uint8_t reg1;
-};
-
-static uint64_t ref405ep_fpga_readb(void *opaque, hwaddr addr, unsigned size)
-{
-Ref405epFpgaState *fpga = opaque;
-uint32_t ret;
-
-switch (addr) {
-case 0x0:
-ret = fpga->reg0;
-break;
-case 0x1:
-ret = fpga->reg1;
-break;
-default:
-ret = 0;
-break;
-}
-
-return ret;
-}
-
-static void ref405ep_fpga_writeb(void *opaque, hwaddr addr, uint64_t value,
- unsigned size)
-{
-Ref405epFpgaState *fpga = opaque;
-
-switch (addr) {
-case 0x0:
-/* Read only */
-break;
-case 0x1:
-fpga->reg1 = value;
-break;
-default:
-break;
-}
-}
-
-static const MemoryRegionOps ref405ep_fpga_ops = {
-.read = ref405ep_fpga_readb,
-.write = ref405ep_fpga_writeb,
-.impl.min_access_size = 1,
-.impl.max_access_size = 1,
-.valid.min_access_size = 1,
-.valid.max_access_size = 4,
-.endianness = DEVICE_BIG_ENDIAN,
-};
-
-static void ref405ep_fpga_reset(DeviceState *dev)
-{
-Ref405epFpgaState *fpga = REF405EP_FPGA(dev);
-
-fpga->reg0 = 0x00;
-fpga->reg1 = 0x0F;
-}
-
-static void ref405ep_fpga_realize(DeviceState *dev, Error **errp)
-{
-Ref405epFpgaState *s = REF405EP_FPGA(dev);
-
-memory_region_init_io(&s->iomem, OBJECT(s), &ref405ep_fpga_ops, s,
-  "fpga", 0x0100);
-sysbus_init_mmio(SYS_BUS_DEVICE(s), &s->iomem);
-}
-
-static void ref405ep_fpga_class_init(ObjectClass *oc, void *data)
-{
-DeviceClass *dc = DEVICE_CLASS(oc);
-
-dc->realize = ref405ep_fpga_realize;
-dc->reset = ref405ep_fpga_reset;
-/* Reason: only works as part of a ppc405 board */
-dc->user_creatable = false;
-}
-
-static const TypeInfo ref405ep_fpga_type = {
-.name = TYPE_REF405EP_FPGA,
-.parent

Re: [PATCH v5 03/18] dump: Refactor dump_iterate and introduce dump_filter_memblock_*()

On Thu, Aug 11, 2022 at 4:12 PM Janosch Frank  wrote:
>
> The iteration over the memblocks in dump_iterate() is hard to
> understand so it's about time to clean it up. Instead of manually
> grabbing the next memblock we can use QTAILQ_FOREACH to iterate over
> all memblocks.
>
> Additionally we move the calculation of the offset and length out by
> introducing and using the dump_filter_memblock_*() functions. These
> functions will later be used to cleanup other parts of dump.c.
>
> Signed-off-by: Janosch Frank 
> Reviewed-by: Janis Schoetterl-Glausch 

Reviewed-by: Marc-André Lureau 

> ---
>  dump/dump.c | 80 ++---
>  1 file changed, 45 insertions(+), 35 deletions(-)
>
> diff --git a/dump/dump.c b/dump/dump.c
> index 0ed7cf9c7b..340de5a1e7 100644
> --- a/dump/dump.c
> +++ b/dump/dump.c
> @@ -591,31 +591,43 @@ static void dump_begin(DumpState *s, Error **errp)
>  write_elf_notes(s, errp);
>  }
>
> -static int get_next_block(DumpState *s, GuestPhysBlock *block)
> +static int64_t dump_filtered_memblock_size(GuestPhysBlock *block,
> +   int64_t filter_area_start,
> +   int64_t filter_area_length)
>  {
> -while (1) {
> -block = QTAILQ_NEXT(block, next);
> -if (!block) {
> -/* no more block */
> -return 1;
> -}
> +int64_t size, left, right;
>
> -s->start = 0;
> -s->next_block = block;
> -if (s->has_filter) {
> -if (block->target_start >= s->begin + s->length ||
> -block->target_end <= s->begin) {
> -/* This block is out of the range */
> -continue;
> -}
> -
> -if (s->begin > block->target_start) {
> -s->start = s->begin - block->target_start;
> -}
> -}
> -
> -return 0;
> +/* No filter, return full size */
> +if (!filter_area_length) {
> +return block->target_end - block->target_start;
>  }
> +
> +/* calculate the overlapped region. */
> +left = MAX(filter_area_start, block->target_start);
> +right = MIN(filter_area_start + filter_area_length, block->target_end);
> +size = right - left;
> +size = size > 0 ? size : 0;
> +
> +return size;
> +}
> +
> +static int64_t dump_filtered_memblock_start(GuestPhysBlock *block,
> +int64_t filter_area_start,
> +int64_t filter_area_length)
> +{
> +if (filter_area_length) {
> +/* return -1 if the block is not within filter area */
> +if (block->target_start >= filter_area_start + filter_area_length ||
> +block->target_end <= filter_area_start) {
> +return -1;
> +}
> +
> +if (filter_area_start > block->target_start) {
> +return filter_area_start - block->target_start;
> +}
> +}
> +
> +return 0;
>  }
>
>  /* write all memory to vmcore */
> @@ -623,24 +635,22 @@ static void dump_iterate(DumpState *s, Error **errp)
>  {
>  ERRP_GUARD();
>  GuestPhysBlock *block;
> -int64_t size;
> +int64_t memblock_size, memblock_start;
>
> -do {
> -block = s->next_block;
> -
> -size = block->target_end - block->target_start;
> -if (s->has_filter) {
> -size -= s->start;
> -if (s->begin + s->length < block->target_end) {
> -size -= block->target_end - (s->begin + s->length);
> -}
> +QTAILQ_FOREACH(block, &s->guest_phys_blocks.head, next) {
> +memblock_start = dump_filtered_memblock_start(block, s->begin, 
> s->length);
> +if (memblock_start == -1) {
> +continue;
>  }
> -write_memory(s, block, s->start, size, errp);
> +
> +memblock_size = dump_filtered_memblock_size(block, s->begin, 
> s->length);
> +
> +/* Write the memory to file */
> +write_memory(s, block, memblock_start, memblock_size, errp);
>  if (*errp) {
>  return;
>  }
> -
> -} while (!get_next_block(s, block));
> +}
>  }
>
>  static void create_vmcore(DumpState *s, Error **errp)
> --
> 2.34.1
>

Re: [PATCH 15/22] hw/intc/ppc-uic: Convert ppc-uic to a PPC4xx DCR device


On 8/13/22 17:34, BALATON Zoltan wrote:

Make ppc-uic a subclass of ppc4xx-dcr-device which will handle the cpu
link and make it uniform with the other PPC4xx devices.


Reviewed-by: Cédric Le Goater 

Thanks,

C.


Signed-off-by: BALATON Zoltan 
---
  hw/intc/ppc-uic.c | 26 ++
  hw/ppc/ppc405_uc.c|  6 ++
  hw/ppc/ppc440_bamboo.c|  7 ++-
  hw/ppc/ppc4xx_devs.c  |  1 -
  hw/ppc/sam460ex.c | 17 +++--
  hw/ppc/virtex_ml507.c |  7 ++-
  include/hw/intc/ppc-uic.h |  6 ++
  7 files changed, 21 insertions(+), 49 deletions(-)

diff --git a/hw/intc/ppc-uic.c b/hw/intc/ppc-uic.c
index 60013f2dde..dcf5de5d43 100644
--- a/hw/intc/ppc-uic.c
+++ b/hw/intc/ppc-uic.c
@@ -25,11 +25,8 @@
  #include "qemu/osdep.h"
  #include "hw/intc/ppc-uic.h"
  #include "hw/irq.h"
-#include "cpu.h"
-#include "hw/ppc/ppc.h"
  #include "hw/qdev-properties.h"
  #include "migration/vmstate.h"
-#include "qapi/error.h"
  
  enum {

  DCR_UICSR  = 0x000,
@@ -105,10 +102,9 @@ static void ppcuic_trigger_irq(PPCUIC *uic)
  
  static void ppcuic_set_irq(void *opaque, int irq_num, int level)

  {
-PPCUIC *uic;
+PPCUIC *uic = opaque;
  uint32_t mask, sr;
  
-uic = opaque;

  mask = 1U << (31 - irq_num);
  LOG_UIC("%s: irq %d level %d uicsr %08" PRIx32
  " mask %08" PRIx32 " => %08" PRIx32 " %08" PRIx32 "\n",
@@ -144,10 +140,9 @@ static void ppcuic_set_irq(void *opaque, int irq_num, int 
level)
  
  static uint32_t dcr_read_uic(void *opaque, int dcrn)

  {
-PPCUIC *uic;
+PPCUIC *uic = opaque;
  uint32_t ret;
  
-uic = opaque;

  dcrn -= uic->dcr_base;
  switch (dcrn) {
  case DCR_UICSR:
@@ -192,9 +187,8 @@ static uint32_t dcr_read_uic(void *opaque, int dcrn)
  
  static void dcr_write_uic(void *opaque, int dcrn, uint32_t val)

  {
-PPCUIC *uic;
+PPCUIC *uic = opaque;
  
-uic = opaque;

  dcrn -= uic->dcr_base;
  LOG_UIC("%s: dcr %d val 0x%x\n", __func__, dcrn, val);
  switch (dcrn) {
@@ -251,19 +245,12 @@ static void ppc_uic_reset(DeviceState *dev)
  static void ppc_uic_realize(DeviceState *dev, Error **errp)
  {
  PPCUIC *uic = PPC_UIC(dev);
+Ppc4xxDcrDeviceState *dcr = PPC4xx_DCR_DEVICE(dev);
  SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
-PowerPCCPU *cpu;
  int i;
  
-if (!uic->cpu) {

-/* This is a programming error in the code using this device */
-error_setg(errp, "ppc-uic 'cpu' link property was not set");
-return;
-}
-
-cpu = POWERPC_CPU(uic->cpu);
  for (i = 0; i < DCR_UICMAX; i++) {
-ppc_dcr_register(&cpu->env, uic->dcr_base + i, uic,
+ppc4xx_dcr_register(dcr, uic->dcr_base + i, uic,
   &dcr_read_uic, &dcr_write_uic);
  }
  
@@ -273,7 +260,6 @@ static void ppc_uic_realize(DeviceState *dev, Error **errp)

  }
  
  static Property ppc_uic_properties[] = {

-DEFINE_PROP_LINK("cpu", PPCUIC, cpu, TYPE_CPU, CPUState *),
  DEFINE_PROP_UINT32("dcr-base", PPCUIC, dcr_base, 0xc0),
  DEFINE_PROP_BOOL("use-vectors", PPCUIC, use_vectors, true),
  DEFINE_PROP_END_OF_LIST()
@@ -308,7 +294,7 @@ static void ppc_uic_class_init(ObjectClass *klass, void 
*data)
  
  static const TypeInfo ppc_uic_info = {

  .name = TYPE_PPC_UIC,
-.parent = TYPE_SYS_BUS_DEVICE,
+.parent = TYPE_PPC4xx_DCR_DEVICE,
  .instance_size = sizeof(PPCUIC),
  .class_init = ppc_uic_class_init,
  };
diff --git a/hw/ppc/ppc405_uc.c b/hw/ppc/ppc405_uc.c
index 82830f52bf..aa3617f876 100644
--- a/hw/ppc/ppc405_uc.c
+++ b/hw/ppc/ppc405_uc.c
@@ -1149,12 +1149,10 @@ static void ppc405_soc_realize(DeviceState *dev, Error 
**errp)
  sysbus_mmio_map(sbd, 0, 0xef600600);
  
  /* Universal interrupt controller */

-object_property_set_link(OBJECT(&s->uic), "cpu", OBJECT(&s->cpu),
- &error_fatal);
-sbd = SYS_BUS_DEVICE(&s->uic);
-if (!sysbus_realize(sbd, errp)) {
+if (!ppc4xx_dcr_realize(PPC4xx_DCR_DEVICE(&s->uic), &s->cpu, errp)) {
  return;
  }
+sbd = SYS_BUS_DEVICE(&s->uic);
  sysbus_connect_irq(sbd, PPCUIC_OUTPUT_INT,
 qdev_get_gpio_in(DEVICE(&s->cpu), PPC40x_INPUT_INT));
  sysbus_connect_irq(sbd, PPCUIC_OUTPUT_CINT,
diff --git a/hw/ppc/ppc440_bamboo.c b/hw/ppc/ppc440_bamboo.c
index 873f930c77..b14a9ef776 100644
--- a/hw/ppc/ppc440_bamboo.c
+++ b/hw/ppc/ppc440_bamboo.c
@@ -193,12 +193,9 @@ static void bamboo_init(MachineState *machine)
  
  /* interrupt controller */

  uicdev = qdev_new(TYPE_PPC_UIC);
+ppc4xx_dcr_realize(PPC4xx_DCR_DEVICE(uicdev), cpu, &error_fatal);
+object_unref(OBJECT(uicdev));
  uicsbd = SYS_BUS_DEVICE(uicdev);
-
-object_property_set_link(OBJECT(uicdev), "cpu", OBJECT(cpu),
- &error_fatal);
-sysbus_realize_and_unref(uicsbd, &error_fatal);
-
  sysbus_connect_irq(uicsbd, PPCUIC_OUTPUT_INT,

Re: [PATCH v5 05/18] dump: Rework filter area variables

Hi

On Thu, Aug 11, 2022 at 4:12 PM Janosch Frank  wrote:
>
> While the DumpState begin and length variables directly mirror the API
> variable names they are not very descriptive. So let's add a
> "filter_area_" prefix and make has_filter a function checking length > 0.
>
> Signed-off-by: Janosch Frank 
> ---
>  dump/dump.c   | 53 +--
>  include/sysemu/dump.h | 13 ---
>  2 files changed, 41 insertions(+), 25 deletions(-)
>
> diff --git a/dump/dump.c b/dump/dump.c
> index e204912a89..b043337bc7 100644
> --- a/dump/dump.c
> +++ b/dump/dump.c
> @@ -59,6 +59,11 @@ static inline bool dump_is_64bit(DumpState *s)
>  return s->dump_info.d_class == ELFCLASS64;
>  }
>
> +static inline bool dump_has_filter(DumpState *s)

I'd drop the inline, and let the compiler decide.


Otherwise:
Reviewed-by: Marc-André Lureau 

> +{
> +return s->filter_area_length > 0;
> +}
> +
>  uint16_t cpu_to_dump16(DumpState *s, uint16_t val)
>  {
>  if (s->dump_info.d_endian == ELFDATA2LSB) {
> @@ -443,29 +448,30 @@ static void get_offset_range(hwaddr phys_addr,
>  *p_offset = -1;
>  *p_filesz = 0;
>
> -if (s->has_filter) {
> -if (phys_addr < s->begin || phys_addr >= s->begin + s->length) {
> +if (dump_has_filter(s)) {
> +if (phys_addr < s->filter_area_begin ||
> +phys_addr >= s->filter_area_begin + s->filter_area_length) {
>  return;
>  }
>  }
>
>  QTAILQ_FOREACH(block, &s->guest_phys_blocks.head, next) {
> -if (s->has_filter) {
> -if (block->target_start >= s->begin + s->length ||
> -block->target_end <= s->begin) {
> +if (dump_has_filter(s)) {
> +if (block->target_start >= s->filter_area_begin + 
> s->filter_area_length ||
> +block->target_end <= s->filter_area_begin) {
>  /* This block is out of the range */
>  continue;
>  }
>
> -if (s->begin <= block->target_start) {
> +if (s->filter_area_begin <= block->target_start) {
>  start = block->target_start;
>  } else {
> -start = s->begin;
> +start = s->filter_area_begin;
>  }
>
>  size_in_block = block->target_end - start;
> -if (s->begin + s->length < block->target_end) {
> -size_in_block -= block->target_end - (s->begin + s->length);
> +if (s->filter_area_begin + s->filter_area_length < 
> block->target_end) {
> +size_in_block -= block->target_end - (s->filter_area_begin + 
> s->filter_area_length);
>  }
>  } else {
>  start = block->target_start;
> @@ -638,12 +644,12 @@ static void dump_iterate(DumpState *s, Error **errp)
>  int64_t memblock_size, memblock_start;
>
>  QTAILQ_FOREACH(block, &s->guest_phys_blocks.head, next) {
> -memblock_start = dump_filtered_memblock_start(block, s->begin, 
> s->length);
> +memblock_start = dump_filtered_memblock_start(block, 
> s->filter_area_begin, s->filter_area_length);
>  if (memblock_start == -1) {
>  continue;
>  }
>
> -memblock_size = dump_filtered_memblock_size(block, s->begin, 
> s->length);
> +memblock_size = dump_filtered_memblock_size(block, 
> s->filter_area_begin, s->filter_area_length);
>
>  /* Write the memory to file */
>  write_memory(s, block, memblock_start, memblock_size, errp);
> @@ -1504,14 +1510,14 @@ static int validate_start_block(DumpState *s)
>  {
>  GuestPhysBlock *block;
>
> -if (!s->has_filter) {
> +if (!dump_has_filter(s)) {
>  return 0;
>  }
>
>  QTAILQ_FOREACH(block, &s->guest_phys_blocks.head, next) {
>  /* This block is out of the range */
> -if (block->target_start >= s->begin + s->length ||
> -block->target_end <= s->begin) {
> +if (block->target_start >= s->filter_area_begin + 
> s->filter_area_length ||
> +block->target_end <= s->filter_area_begin) {
>  continue;
>  }
>  return 0;
> @@ -1550,10 +1556,10 @@ static int64_t dump_calculate_size(DumpState *s)
>  int64_t size = 0, total = 0, left = 0, right = 0;
>
>  QTAILQ_FOREACH(block, &s->guest_phys_blocks.head, next) {
> -if (s->has_filter) {
> +if (dump_has_filter(s)) {
>  /* calculate the overlapped region. */
> -left = MAX(s->begin, block->target_start);
> -right = MIN(s->begin + s->length, block->target_end);
> +left = MAX(s->filter_area_begin, block->target_start);
> +right = MIN(s->filter_area_begin + s->filter_area_length, 
> block->target_end);
>  size = right - left;
>  size = size > 0 ? size : 0;
>  } else {
> @@ -1643,9 +1649,12 @@ static void dump_init(DumpState *s, int fd, bool 
> has_format,
>  }
>
>

Re: [PATCH v5 07/18] dump: Split elf header functions into prepare and write

On Thu, Aug 11, 2022 at 4:29 PM Janosch Frank  wrote:

> Let's split the write from the modification of the elf header so we
> can consolidate the write of the data in one function.
>
> Signed-off-by: Janosch Frank 
>

Reviewed-by: Marc-André Lureau 


> ---
>  dump/dump.c | 100 
>  1 file changed, 53 insertions(+), 47 deletions(-)
>
> diff --git a/dump/dump.c b/dump/dump.c
> index d82cc46d7d..8a2a97a85e 100644
> --- a/dump/dump.c
> +++ b/dump/dump.c
> @@ -131,7 +131,7 @@ static int fd_write_vmcore(const void *buf, size_t
> size, void *opaque)
>  return 0;
>  }
>
> -static void write_elf64_header(DumpState *s, Error **errp)
> +static void prepare_elf64_header(DumpState *s, Elf64_Ehdr *elf_header)
>  {
>  /*
>   * phnum in the elf header is 16 bit, if we have more segments we
> @@ -139,34 +139,27 @@ static void write_elf64_header(DumpState *s, Error
> **errp)
>   * special section.
>   */
>  uint16_t phnum = MIN(s->phdr_num, PN_XNUM);
> -Elf64_Ehdr elf_header;
> -int ret;
>
> -memset(&elf_header, 0, sizeof(Elf64_Ehdr));
> -memcpy(&elf_header, ELFMAG, SELFMAG);
> -elf_header.e_ident[EI_CLASS] = ELFCLASS64;
> -elf_header.e_ident[EI_DATA] = s->dump_info.d_endian;
> -elf_header.e_ident[EI_VERSION] = EV_CURRENT;
> -elf_header.e_type = cpu_to_dump16(s, ET_CORE);
> -elf_header.e_machine = cpu_to_dump16(s, s->dump_info.d_machine);
> -elf_header.e_version = cpu_to_dump32(s, EV_CURRENT);
> -elf_header.e_ehsize = cpu_to_dump16(s, sizeof(elf_header));
> -elf_header.e_phoff = cpu_to_dump64(s, s->phdr_offset);
> -elf_header.e_phentsize = cpu_to_dump16(s, sizeof(Elf64_Phdr));
> -elf_header.e_phnum = cpu_to_dump16(s, phnum);
> +memset(elf_header, 0, sizeof(Elf64_Ehdr));
> +memcpy(elf_header, ELFMAG, SELFMAG);
> +elf_header->e_ident[EI_CLASS] = ELFCLASS64;
> +elf_header->e_ident[EI_DATA] = s->dump_info.d_endian;
> +elf_header->e_ident[EI_VERSION] = EV_CURRENT;
> +elf_header->e_type = cpu_to_dump16(s, ET_CORE);
> +elf_header->e_machine = cpu_to_dump16(s, s->dump_info.d_machine);
> +elf_header->e_version = cpu_to_dump32(s, EV_CURRENT);
> +elf_header->e_ehsize = cpu_to_dump16(s, sizeof(elf_header));
> +elf_header->e_phoff = cpu_to_dump64(s, s->phdr_offset);
> +elf_header->e_phentsize = cpu_to_dump16(s, sizeof(Elf64_Phdr));
> +elf_header->e_phnum = cpu_to_dump16(s, phnum);
>  if (s->shdr_num) {
> -elf_header.e_shoff = cpu_to_dump64(s, s->shdr_offset);
> -elf_header.e_shentsize = cpu_to_dump16(s, sizeof(Elf64_Shdr));
> -elf_header.e_shnum = cpu_to_dump16(s, s->shdr_num);
> -}
> -
> -ret = fd_write_vmcore(&elf_header, sizeof(elf_header), s);
> -if (ret < 0) {
> -error_setg_errno(errp, -ret, "dump: failed to write elf header");
> +elf_header->e_shoff = cpu_to_dump64(s, s->shdr_offset);
> +elf_header->e_shentsize = cpu_to_dump16(s, sizeof(Elf64_Shdr));
> +elf_header->e_shnum = cpu_to_dump16(s, s->shdr_num);
>  }
>  }
>
> -static void write_elf32_header(DumpState *s, Error **errp)
> +static void prepare_elf32_header(DumpState *s, Elf32_Ehdr *elf_header)
>  {
>  /*
>   * phnum in the elf header is 16 bit, if we have more segments we
> @@ -174,28 +167,45 @@ static void write_elf32_header(DumpState *s, Error
> **errp)
>   * special section.
>   */
>  uint16_t phnum = MIN(s->phdr_num, PN_XNUM);
> -Elf32_Ehdr elf_header;
> +
> +memset(elf_header, 0, sizeof(Elf32_Ehdr));
> +memcpy(elf_header, ELFMAG, SELFMAG);
> +elf_header->e_ident[EI_CLASS] = ELFCLASS32;
> +elf_header->e_ident[EI_DATA] = s->dump_info.d_endian;
> +elf_header->e_ident[EI_VERSION] = EV_CURRENT;
> +elf_header->e_type = cpu_to_dump16(s, ET_CORE);
> +elf_header->e_machine = cpu_to_dump16(s, s->dump_info.d_machine);
> +elf_header->e_version = cpu_to_dump32(s, EV_CURRENT);
> +elf_header->e_ehsize = cpu_to_dump16(s, sizeof(elf_header));
> +elf_header->e_phoff = cpu_to_dump32(s, s->phdr_offset);
> +elf_header->e_phentsize = cpu_to_dump16(s, sizeof(Elf32_Phdr));
> +elf_header->e_phnum = cpu_to_dump16(s, phnum);
> +if (s->shdr_num) {
> +elf_header->e_shoff = cpu_to_dump32(s, s->shdr_offset);
> +elf_header->e_shentsize = cpu_to_dump16(s, sizeof(Elf32_Shdr));
> +elf_header->e_shnum = cpu_to_dump16(s, s->shdr_num);
> +}
> +}
> +
> +static void write_elf_header(DumpState *s, Error **errp)
> +{
> +Elf32_Ehdr elf32_header;
> +Elf64_Ehdr elf64_header;
> +size_t header_size;
> +void *header_ptr;
>  int ret;
>
> -memset(&elf_header, 0, sizeof(Elf32_Ehdr));
> -memcpy(&elf_header, ELFMAG, SELFMAG);
> -elf_header.e_ident[EI_CLASS] = ELFCLASS32;
> -elf_header.e_ident[EI_DATA] = s->dump_info.d_endian;
> -elf_header.e_ident[EI_VERSION] = EV_CURRENT;
> -elf_header.e_type = cpu_to_dump16(s, ET_CORE);
>

Re: [PATCH v5 08/18] dump: Rename write_elf_phdr_note to prepare_elf_phdr_note

On Thu, Aug 11, 2022 at 4:43 PM Janosch Frank  wrote:

> The functions in question do not actually write to the file descriptor
> they set up a buffer which is later written to the fd.
>
> Signed-off-by: Janosch Frank 
>

Reviewed-by: Marc-André Lureau 

---
>  dump/dump.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/dump/dump.c b/dump/dump.c
> index 8a2a97a85e..a905316fe5 100644
> --- a/dump/dump.c
> +++ b/dump/dump.c
> @@ -260,7 +260,7 @@ static void write_elf32_load(DumpState *s,
> MemoryMapping *memory_mapping,
>  }
>  }
>
> -static void write_elf64_phdr_note(DumpState *s, Elf64_Phdr *phdr)
> +static void prepare_elf64_phdr_note(DumpState *s, Elf64_Phdr *phdr)
>  {
>  memset(phdr, 0, sizeof(*phdr));
>  phdr->p_type = cpu_to_dump32(s, PT_NOTE);
> @@ -316,7 +316,7 @@ static void write_elf64_notes(WriteCoreDumpFunction f,
> DumpState *s,
>  write_guest_note(f, s, errp);
>  }
>
> -static void write_elf32_phdr_note(DumpState *s, Elf32_Phdr *phdr)
> +static void prepare_elf32_phdr_note(DumpState *s, Elf32_Phdr *phdr)
>  {
>  memset(phdr, 0, sizeof(*phdr));
>  phdr->p_type = cpu_to_dump32(s, PT_NOTE);
> @@ -364,11 +364,11 @@ static void write_elf_phdr_note(DumpState *s, Error
> **errp)
>  int ret;
>
>  if (dump_is_64bit(s)) {
> -write_elf64_phdr_note(s, &phdr64);
> +prepare_elf64_phdr_note(s, &phdr64);
>  size = sizeof(phdr64);
>  phdr = &phdr64;
>  } else {
> -write_elf32_phdr_note(s, &phdr32);
> +prepare_elf32_phdr_note(s, &phdr32);
>  size = sizeof(phdr32);
>  phdr = &phdr32;
>  }
> --
> 2.34.1
>
>
>

-- 
Marc-André Lureau

Re: [PATCH] hw/usb/hcd-xhci: Fix endless loop in case the DMA access fails (CVE-2020-14394)

On Thu, Aug 04, 2022 at 01:43:14PM +0200, Thomas Huth wrote:
> On 04/08/2022 12.17, Peter Maydell wrote:
> > That sounds like we do still have an unbounded-loop problem,
> > then: there's no limit on the number of consecutive TRBs
> > we try to read in that function. Maybe we're missing an
> > error check of some kind (does the spec limit how many
> > consecutive TRBs there can be somehow?) or else we need
> > another artificial limit.
> 
> I'm not an XHCI expert at all, but while at least having a quick glance at
> the spec, I did not see any limit there. So I assume that we should enforce
> an artificial limit? What would be a good value for this? Gerd, do you maybe
> have any opinion?

Hmm, dunno.  Typical workflow is that the driver allocates a page
(or multiple physically contiguous pages) for the TRBs, and when
it reaches the end of the allocation it gets a new block and chains
it using TRB_LINK.

Right now we have a limit for type=TRB_LINK only, having another one for
all TRBs makes sense.  The number of TRBs for a transfer can be quite
large (think usb-storage reads/writes), so don't set that too low, 64k
maybe?

take care,
  Gerd

Re: [PATCH] [PATCH] linux-user/aarch64: Reset target data on MADV_DONTNEED



Laurent Vivier  writes:

> Le 11/08/2022 à 17:18, Alex Bennée a écrit :
>> Laurent Vivier  writes:
>> 
>>> Le 11/08/2022 à 13:54, Peter Maydell a écrit :
 On Thu, 11 Aug 2022 at 09:29, Laurent Vivier  wrote:
>
> Le 10/08/2022 à 22:47, Richard Henderson a écrit :
>> On 8/10/22 13:32, Vitaly Buka wrote:
>>> Sorry, I only noticed today that it's not submitted.
>>> Version is not critical for us, as we build from masters anyway.
>>> Richard, do you know a reason to consider this critical?
>>>
>>> On Wed, 10 Aug 2022 at 13:04, Peter Maydell >> > wrote:
>>>
>>>   On Wed, 10 Aug 2022 at 21:00, Vitaly Buka >>   > wrote:
>>>>
>>>> How can we land this one?
>>>
>>>   Pinging it a week ago rather than now would have been a good 
>>> start :-(
>>>   I think it got missed because you didn't cc the linux-user 
>>> maintainer.
>>>
>>>   Is this a critical fix for 7.1 or can we let it slip to 7.2 ?
>>
>> It's unfortunate that it got missed.  It's not critical, but it would be 
>> nice, because support for
>> MADV_DONTNEED is new in 7.1 (previously, we ignored all madvise).
>>
>> I'll note there are missing braces for coding style on an IF.
>>
>> Laurent, do you have an objection to merging this for rc3?
>>
>
> No objection.
>
> Do you want it goes via the arm branch or via the linux-user branch?
>
> If it goes via linux-user I can run the LTP testsuite but it takes 1 day.
 I think we should definitely run the LTP testsuite on it, so
 taking it via linux-user probably makes more sense.
>>>
>>> ok, applied to my linux-user-for-7.1 branch.
>>>
>>> Running tests.
>> Any chance you could pick up:
>>Subject: [PATCH v2] linux-user: un-parent OBJECT(cpu) when
>> closing thread
>>Date: Wed,  3 Aug 2022 14:05:37 +0100
>>Message-Id: <20220803130537.763666-1-alex.ben...@linaro.org>
>> before you run the tests?
>> 
>
> I've tested it, it works fine.
>
> Do you plan to do a PR including it or do you want I do (there will be
> only this one in mine)?

I'm going to a roll a PR today so I can include it. Shall I add a
Tested-by for you?

>
> Thanks,
> Laurent


-- 
Alex Bennée

Re: [PATCH] hw/usb/hcd-xhci: Fix endless loop in case the DMA access fails (CVE-2020-14394)

On 16/08/2022 10.37, Gerd Hoffmann wrote:

On Thu, Aug 04, 2022 at 01:43:14PM +0200, Thomas Huth wrote:

On 04/08/2022 12.17, Peter Maydell wrote:

That sounds like we do still have an unbounded-loop problem,
then: there's no limit on the number of consecutive TRBs
we try to read in that function. Maybe we're missing an
error check of some kind (does the spec limit how many
consecutive TRBs there can be somehow?) or else we need
another artificial limit.

I'm not an XHCI expert at all, but while at least having a quick glance at
the spec, I did not see any limit there. So I assume that we should enforce
an artificial limit? What would be a good value for this? Gerd, do you maybe
have any opinion?

Hmm, dunno. Typical workflow is that the driver allocates a page
(or multiple physically contiguous pages) for the TRBs, and when
it reaches the end of the allocation it gets a new block and chains
it using TRB_LINK.

Right now we have a limit for type=TRB_LINK only, having another one for
all TRBs makes sense. The number of TRBs for a transfer can be quite
large (think usb-storage reads/writes), so don't set that too low, 64k
maybe?

Yes, after some more reading, I found a remark in the spec (in chapter 6)
saying that each segment should be limited by 64 kb.

I've sent a v2 with that limit check here (it has a slightly different
subject, so it's hard to relate it to this v1 here):

https://patchwork.kernel.org/project/qemu-devel/patch/20220804131300.96368-1-th...@redhat.com/

Thomas

Re: [PATCH v5 09/18] dump: Use a buffer for ELF section data and headers

Hi

On Thu, Aug 11, 2022 at 4:16 PM Janosch Frank  wrote:

> Currently we're writing the NULL section header if we overflow the
> physical header number in the ELF header. But in the future we'll add
> custom section headers AND section data.
>
> To facilitate this we need to rearange section handling a bit. As with
> the other ELF headers we split the code into a prepare and a write
> step.
>
> Signed-off-by: Janosch Frank 
>
---
>  dump/dump.c   | 83 +--
>  include/sysemu/dump.h |  2 ++
>  2 files changed, 58 insertions(+), 27 deletions(-)
>
> diff --git a/dump/dump.c b/dump/dump.c
> index a905316fe5..0051c71d08 100644
> --- a/dump/dump.c
> +++ b/dump/dump.c
> @@ -380,30 +380,57 @@ static void write_elf_phdr_note(DumpState *s, Error
> **errp)
>  }
>  }
>
> -static void write_elf_section(DumpState *s, int type, Error **errp)
> +static void prepare_elf_section_hdr_zero(DumpState *s)
>  {
> -Elf32_Shdr shdr32;
> -Elf64_Shdr shdr64;
> -int shdr_size;
> -void *shdr;
> +if (dump_is_64bit(s)) {
> +Elf64_Shdr *shdr64 = s->elf_section_hdrs;
> +
> +shdr64->sh_info = cpu_to_dump32(s, s->phdr_num);
> +} else {
> +Elf32_Shdr *shdr32 = s->elf_section_hdrs;
> +
> +shdr32->sh_info = cpu_to_dump32(s, s->phdr_num);
> +}
> +}
> +
> +static void prepare_elf_section_hdrs(DumpState *s)
> +{
> +size_t len, sizeof_shdr;
> +
> +/*
> + * Section ordering:
> + * - HDR zero
> + */
> +sizeof_shdr = dump_is_64bit(s) ? sizeof(Elf64_Shdr) :
> sizeof(Elf32_Shdr);
> +len = sizeof_shdr * s->shdr_num;
> +s->elf_section_hdrs = g_malloc0(len);
> +
> +/*
> + * The first section header is ALWAYS a special initial section
> + * header.
> + *
> + * The header should be 0 with one exception being that if
> + * phdr_num is PN_XNUM then the sh_info field contains the real
> + * number of segment entries.
> + *
> + * As we zero allocate the buffer we will only need to modify
> + * sh_info for the PN_XNUM case.
> + */
> +if (s->phdr_num >= PN_XNUM) {
> +prepare_elf_section_hdr_zero(s);
> +}
> +}
> +
> +static void write_elf_section_headers(DumpState *s, Error **errp)
> +{
> +size_t sizeof_shdr = dump_is_64bit(s) ? sizeof(Elf64_Shdr) :
> sizeof(Elf32_Shdr);
>  int ret;
>
> -if (type == 0) {
> -shdr_size = sizeof(Elf32_Shdr);
> -memset(&shdr32, 0, shdr_size);
> -shdr32.sh_info = cpu_to_dump32(s, s->phdr_num);
> -shdr = &shdr32;
> -} else {
> -shdr_size = sizeof(Elf64_Shdr);
> -memset(&shdr64, 0, shdr_size);
> -shdr64.sh_info = cpu_to_dump32(s, s->phdr_num);
> -shdr = &shdr64;
> -}
> +prepare_elf_section_hdrs(s);
>
> -ret = fd_write_vmcore(shdr, shdr_size, s);
> +ret = fd_write_vmcore(s->elf_section_hdrs, s->shdr_num * sizeof_shdr,
> s);
>  if (ret < 0) {
> -error_setg_errno(errp, -ret,
> - "dump: failed to write section header table");
> +error_setg_errno(errp, -ret, "dump: failed to write section
> headers");
>  }
>  }
>
> @@ -579,6 +606,12 @@ static void dump_begin(DumpState *s, Error **errp)
>  return;
>  }
>
> +/* write section headers to vmcore */
> +write_elf_section_headers(s, errp);
> +if (*errp) {
> +return;
> +}
>

Can you make that move a separate commit? And please explain why this is
valid, and also update the table in the comment too.

Otherwise, changes look good to me.

+
>  /* write PT_NOTE to vmcore */
>  write_elf_phdr_note(s, errp);
>  if (*errp) {
> @@ -591,14 +624,6 @@ static void dump_begin(DumpState *s, Error **errp)
>  return;
>  }
>
> -/* write section to vmcore */
> -if (s->shdr_num) {
> -write_elf_section(s, 1, errp);
> -if (*errp) {
> -return;
> -}
> -}
> -
>  /* write notes to vmcore */
>  write_elf_notes(s, errp);
>  }
> @@ -674,7 +699,11 @@ static void create_vmcore(DumpState *s, Error **errp)
>  return;
>  }
>
> +/* Iterate over memory and dump it to file */
>  dump_iterate(s, errp);
> +if (*errp) {
> +return;
> +}
>  }
>
>  static int write_start_flat_header(int fd)
> diff --git a/include/sysemu/dump.h b/include/sysemu/dump.h
> index b62513d87d..9995f65dc8 100644
> --- a/include/sysemu/dump.h
> +++ b/include/sysemu/dump.h
> @@ -177,6 +177,8 @@ typedef struct DumpState {
>  int64_t filter_area_begin;  /* Start address of partial guest memory
> area */
>  int64_t filter_area_length; /* Length of partial guest memory area */
>
> +void *elf_section_hdrs; /* Pointer to section header buffer */
> +
>  uint8_t *note_buf;  /* buffer for notes */
>  size_t note_buf_offset; /* the writing place in note_buf */
>  uint32_t nr_cpus;   /* number of guest's cpu */
> --
> 2.34.1
>
>
>

-- 
Marc-An

[PATCH] virtio-crypto: support asynchronous mode

2022-08-16 Thread Lei He

virtio-crypto: Modify the current interface of virtio-crypto
device to support asynchronous mode.

Signed-off-by: lei he 
---
 backends/cryptodev-builtin.c|  69 ++---
 backends/cryptodev-vhost-user.c |  51 +--
 backends/cryptodev.c|  44 +++---
 hw/virtio/virtio-crypto.c   | 324 ++--
 include/sysemu/cryptodev.h  |  60 +---
 5 files changed, 336 insertions(+), 212 deletions(-)

diff --git a/backends/cryptodev-builtin.c b/backends/cryptodev-builtin.c
index 125cbad1d3..cda6ca3b71 100644
--- a/backends/cryptodev-builtin.c
+++ b/backends/cryptodev-builtin.c
@@ -355,42 +355,62 @@ static int cryptodev_builtin_create_akcipher_session(
 return index;
 }
 
-static int64_t cryptodev_builtin_create_session(
+static int cryptodev_builtin_create_session(
CryptoDevBackend *backend,
CryptoDevBackendSessionInfo *sess_info,
-   uint32_t queue_index, Error **errp)
+   uint32_t queue_index,
+   CryptoDevCompletionFunc cb,
+   void *opaque)
 {
 CryptoDevBackendBuiltin *builtin =
   CRYPTODEV_BACKEND_BUILTIN(backend);
 CryptoDevBackendSymSessionInfo *sym_sess_info;
 CryptoDevBackendAsymSessionInfo *asym_sess_info;
+int ret, status;
+Error *local_error = NULL;
 
 switch (sess_info->op_code) {
 case VIRTIO_CRYPTO_CIPHER_CREATE_SESSION:
 sym_sess_info = &sess_info->u.sym_sess_info;
-return cryptodev_builtin_create_cipher_session(
-   builtin, sym_sess_info, errp);
+ret = cryptodev_builtin_create_cipher_session(
+builtin, sym_sess_info, &local_error);
+break;
 
 case VIRTIO_CRYPTO_AKCIPHER_CREATE_SESSION:
 asym_sess_info = &sess_info->u.asym_sess_info;
-return cryptodev_builtin_create_akcipher_session(
-   builtin, asym_sess_info, errp);
+ret = cryptodev_builtin_create_akcipher_session(
+   builtin, asym_sess_info, &local_error);
+break;
 
 case VIRTIO_CRYPTO_HASH_CREATE_SESSION:
 case VIRTIO_CRYPTO_MAC_CREATE_SESSION:
 default:
-error_setg(errp, "Unsupported opcode :%" PRIu32 "",
+error_setg(&local_error, "Unsupported opcode :%" PRIu32 "",
sess_info->op_code);
-return -1;
+return -VIRTIO_CRYPTO_NOTSUPP;
 }
 
-return -1;
+if (local_error) {
+error_report_err(local_error);
+}
+if (ret < 0) {
+status = -VIRTIO_CRYPTO_ERR;
+} else {
+sess_info->session_id = ret;
+status = VIRTIO_CRYPTO_OK;
+}
+if (cb) {
+cb(opaque, status);
+}
+return 0;
 }
 
 static int cryptodev_builtin_close_session(
CryptoDevBackend *backend,
uint64_t session_id,
-   uint32_t queue_index, Error **errp)
+   uint32_t queue_index,
+   CryptoDevCompletionFunc cb,
+   void *opaque)
 {
 CryptoDevBackendBuiltin *builtin =
   CRYPTODEV_BACKEND_BUILTIN(backend);
@@ -407,6 +427,9 @@ static int cryptodev_builtin_close_session(
 
 g_free(session);
 builtin->sessions[session_id] = NULL;
+if (cb) {
+cb(opaque, VIRTIO_CRYPTO_OK);
+}
 return 0;
 }
 
@@ -506,7 +529,9 @@ static int cryptodev_builtin_asym_operation(
 static int cryptodev_builtin_operation(
  CryptoDevBackend *backend,
  CryptoDevBackendOpInfo *op_info,
- uint32_t queue_index, Error **errp)
+ uint32_t queue_index,
+ CryptoDevCompletionFunc cb,
+ void *opaque)
 {
 CryptoDevBackendBuiltin *builtin =
   CRYPTODEV_BACKEND_BUILTIN(backend);
@@ -514,11 +539,12 @@ static int cryptodev_builtin_operation(
 CryptoDevBackendSymOpInfo *sym_op_info;
 CryptoDevBackendAsymOpInfo *asym_op_info;
 enum CryptoDevBackendAlgType algtype = op_info->algtype;
-int ret = -VIRTIO_CRYPTO_ERR;
+int status = -VIRTIO_CRYPTO_ERR;
+Error *local_error = NULL;
 
 if (op_info->session_id >= MAX_NUM_SESSIONS ||
   builtin->sessions[op_info->session_id] == NULL) {
-error_setg(errp, "Cannot find a valid session id: %" PRIu64 "",
+error_setg(&local_error, "Cannot find a valid session id: %" PRIu64 "",
op_info->session_id);
 return -VIRTIO_CRYPTO_INVSESS;
 }
@@ -526,14 +552,21 @@ static int cryptodev_builtin_operation(
 sess = builtin->sessions[op_info->session_id];
 if (algtype == CRYPTODEV_BACKEND_ALG_SYM) {
 sym_op_info = op_info->u.sym_op_info;
-ret = cryptodev_builtin_sym_operation(sess, sym_op_info, errp);
+status = cryptodev_builtin_sym_operation(sess, sym_op_info,
+ &local_error);
 } else if (algtype == CRYPTODEV_BACKEND_ALG_ASYM) {
 asym_op_info = op_info->u.asym_op_info;
-

Re: [PATCH 1/7] semihosting: Allow optional use of semihosting from userspace



Peter Maydell  writes:

> Currently our semihosting implementations generally prohibit use of
> semihosting calls in system emulation from the guest userspace.  This
> is a very long standing behaviour justified originally "to provide
> some semblance of security" (since code with access to the
> semihosting ABI can do things like read and write arbitrary files on
> the host system).  However, it is sometimes useful to be able to run
> trusted guest code which performs semihosting calls from guest
> userspace, notably for test code.  Add a command line suboption to
> the existing semihosting-config option group so that you can
> explicitly opt in to semihosting from guest userspace with
>  -semihosting-config userspace=on
>
> (There is no equivalent option for the user-mode emulator, because
> there by definition all code runs in userspace and has access to
> semihosting already.)
>
> This commit adds the infrastructure for the command line option and
> adds a bool 'is_user' parameter to the function
> semihosting_userspace_enabled() that target code can use to check
> whether it should be permitting the semihosting call for userspace.
> It mechanically makes all the callsites pass 'false', so they
> continue checking "is semihosting enabled in general".  Subsequent
> commits will make each target that implements semihosting honour the
> userspace=on option by passing the correct value and removing
> whatever "don't do this for userspace" checking they were doing by
> hand.
>
> Signed-off-by: Peter Maydell 

Acked-by: Alex Bennée 

-- 
Alex Bennée

Re: [PATCH v2] hw/i386: place setup_data at fixed place in memory

  Hi,

> > We can make setup_data chaining work with OVMF, but the whole chain
> > should be located in a GPA range that OVMF dictates.
> 
> It sounds like what you describe is pretty OVMF-specific though,
> right? Do we want to tie things together so tightly like that?
> 
> Given we only need 48 bytes or so, isn't there a more subtle place we
> could just throw this in ram that doesn't need such complex
> coordination?

Joining the party late (and still catching up the thread).  Given we
don't need that anyway with EFI, only with legacy BIOS:  Can't that just
be a protocol between qemu and pc-bios/optionrom/*boot*.S on how to pass
those 48 bytes random seed?

take care,
  Gerd

Re: [PATCH v2 for-7.1] hw/usb/hcd-xhci: Fix unbounded loop in xhci_ring_chain_length() (CVE-2020-14394)

> +
> +/*
> + * According to the xHCI spec, Transfer Ring segments should have
> + * a maximum size of 64 kB (see chapter "6 Data Structures")
> + */
> +} while (length < TRB_LINK_LIMIT * 65536 / TRB_SIZE);

Acked-by: Gerd Hoffmann 

take care,
  Gerd

[PATCH v2] xio3130_upstream: Add ACS (Access Control Services) capability

2022-08-16 Thread Paul Schlacter

v1 -> v2:
- Allow ACS to be disabled.
- Suggested by Michael S. Tsirkin, use disable-acs to set property.

v1:
- Add ACS (Access Control Services) capability.

If it is a pcie device, check that all devices on the path from

the device to the root complex have ACS enabled, and then the

device will become an iommu_group.

it will have the effect of isolation



Signed-off-by: wangliang 

Signed-off-by: wangliang 



---

 hw/pci-bridge/xio3130_upstream.c | 12 +++-

 1 file changed, 11 insertions(+), 1 deletion(-)


diff --git a/hw/pci-bridge/xio3130_upstream.c
b/hw/pci-bridge/xio3130_upstream.c

index 2df95b..5433d06fb3 100644

--- a/hw/pci-bridge/xio3130_upstream.c

+++ b/hw/pci-bridge/xio3130_upstream.c

@@ -24,6 +24,7 @@

 #include "hw/pci/msi.h"

 #include "hw/pci/pcie.h"

 #include "hw/pci/pcie_port.h"

+#include "hw/qdev-properties.h"

 #include "migration/vmstate.h"

 #include "qemu/module.h"



@@ -59,6 +60,7 @@ static void xio3130_upstream_reset(DeviceState *qdev)

 static void xio3130_upstream_realize(PCIDevice *d, Error **errp)

 {

 PCIEPort *p = PCIE_PORT(d);

+PCIESlot *s = PCIE_SLOT(d);

 int rc;



 pci_bridge_initfn(d, TYPE_PCIE_BUS);

@@ -94,7 +96,9 @@ static void xio3130_upstream_realize(PCIDevice *d, Error
**errp)

 goto err;

 }



-pcie_acs_init(d, XIO3130_ACS_OFFSET);

+if (!s->disable_acs) {

+pcie_acs_init(d, XIO3130_ACS_OFFSET);

+}

 return;



 err:

@@ -113,6 +117,11 @@ static void xio3130_upstream_exitfn(PCIDevice *d)

 pci_bridge_exitfn(d);

 }



+static Property xio3130_upstream_props[] = {

+DEFINE_PROP_BOOL("disable-acs", PCIESlot, disable_acs, false),

+DEFINE_PROP_END_OF_LIST()

+};

+

 static const VMStateDescription vmstate_xio3130_upstream = {

 .name = "xio3130-express-upstream-port",

 .priority = MIG_PRI_PCI_BUS,

@@ -142,6 +151,7 @@ static void xio3130_upstream_class_init(ObjectClass
*klass, void *data)

 dc->desc = "TI X3130 Upstream Port of PCI Express Switch";

 dc->reset = xio3130_upstream_reset;

 dc->vmsd = &vmstate_xio3130_upstream;

+device_class_set_props(dc, xio3130_upstream_props);

 }



 static const TypeInfo xio3130_upstream_info = {

-- 

2.24.3 (Apple Git-128)

[PATCH] acpi_ged: Add ospm_status hook implementation

2022-08-16 Thread Keqian Zhu via

This fixes a bug that causes segmentation fault with following dumpstack:
 #1  0xab64235c in qmp_query_acpi_ospm_status 
(errp=errp@entry=0xf030) at ../monitor/qmp-cmds.c:312
 #2  0xabfc4e20 in qmp_marshal_query_acpi_ospm_status (args=, ret=0xea4ffe90, errp=0xea4ffe88) at qapi/qapi-commands-acpi.c:63
 #3  0xabff8ba0 in do_qmp_dispatch_bh (opaque=0xea4ffe98) at 
../qapi/qmp-dispatch.c:128
 #4  0xac02e594 in aio_bh_call (bh=0xe0004d80) at 
../util/async.c:150
 #5  aio_bh_poll (ctx=ctx@entry=0xad0f6040) at ../util/async.c:178
 #6  0xac00bd40 in aio_dispatch (ctx=ctx@entry=0xad0f6040) at 
../util/aio-posix.c:421
 #7  0xac02e010 in aio_ctx_dispatch (source=0xad0f6040, 
callback=, user_data=) at ../util/async.c:320
 #8  0xf76f6884 in g_main_context_dispatch () at 
/usr/lib64/libglib-2.0.so.0
 #9  0xac0452d4 in glib_pollfds_poll () at ../util/main-loop.c:297
 #10 os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:320
 #11 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:596
 #12 0xab5c9e50 in qemu_main_loop () at ../softmmu/runstate.c:734
 #13 0xab185370 in qemu_main (argc=argc@entry=47, 
argv=argv@entry=0xf518, envp=envp@entry=0x0) at ../softmmu/main.c:38
 #14 0xab16f99c in main (argc=47, argv=0xf518) at 
../softmmu/main.c:47

Fixes: ebb62075021a ("hw/acpi: Add ACPI Generic Event Device Support")
Signed-off-by: Keqian Zhu 
---
 hw/acpi/generic_event_device.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index e28457a7d1..a3d31631fe 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -267,6 +267,13 @@ static void acpi_ged_unplug_cb(HotplugHandler *hotplug_dev,
 }
 }
 
+static void acpi_ged_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***list)
+{
+AcpiGedState *s = ACPI_GED(adev);
+
+acpi_memory_ospm_status(&s->memhp_state, list);
+}
+
 static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
 {
 AcpiGedState *s = ACPI_GED(adev);
@@ -409,6 +416,7 @@ static void acpi_ged_class_init(ObjectClass *class, void 
*data)
 hc->unplug_request = acpi_ged_unplug_request_cb;
 hc->unplug = acpi_ged_unplug_cb;
 
+adevc->ospm_status = acpi_ged_ospm_status;
 adevc->send_event = acpi_ged_send_event;
 }
 
-- 
2.33.0

Re: [PATCH] acpi_ged: Add ospm_status hook implementation

On Tue, 16 Aug 2022 at 10:26, Keqian Zhu  wrote:
>
> This fixes a bug that causes segmentation fault with following dumpstack:
>  #1  0xab64235c in qmp_query_acpi_ospm_status 
> (errp=errp@entry=0xf030) at ../monitor/qmp-cmds.c:312
>  #2  0xabfc4e20 in qmp_marshal_query_acpi_ospm_status 
> (args=, ret=0xea4ffe90, errp=0xea4ffe88) at 
> qapi/qapi-commands-acpi.c:63
>  #3  0xabff8ba0 in do_qmp_dispatch_bh (opaque=0xea4ffe98) at 
> ../qapi/qmp-dispatch.c:128
>  #4  0xac02e594 in aio_bh_call (bh=0xe0004d80) at 
> ../util/async.c:150
>  #5  aio_bh_poll (ctx=ctx@entry=0xad0f6040) at ../util/async.c:178
>  #6  0xac00bd40 in aio_dispatch (ctx=ctx@entry=0xad0f6040) at 
> ../util/aio-posix.c:421
>  #7  0xac02e010 in aio_ctx_dispatch (source=0xad0f6040, 
> callback=, user_data=) at ../util/async.c:320
>  #8  0xf76f6884 in g_main_context_dispatch () at 
> /usr/lib64/libglib-2.0.so.0
>  #9  0xac0452d4 in glib_pollfds_poll () at ../util/main-loop.c:297
>  #10 os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:320
>  #11 main_loop_wait (nonblocking=nonblocking@entry=0) at 
> ../util/main-loop.c:596
>  #12 0xab5c9e50 in qemu_main_loop () at ../softmmu/runstate.c:734
>  #13 0xab185370 in qemu_main (argc=argc@entry=47, 
> argv=argv@entry=0xf518, envp=envp@entry=0x0) at ../softmmu/main.c:38
>  #14 0xab16f99c in main (argc=47, argv=0xf518) at 
> ../softmmu/main.c:47

What are the conditions required to trigger the segfault?


> Fixes: ebb62075021a ("hw/acpi: Add ACPI Generic Event Device Support")
> Signed-off-by: Keqian Zhu 
> ---
>  hw/acpi/generic_event_device.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
> index e28457a7d1..a3d31631fe 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -267,6 +267,13 @@ static void acpi_ged_unplug_cb(HotplugHandler 
> *hotplug_dev,
>  }
>  }
>
> +static void acpi_ged_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***list)
> +{
> +AcpiGedState *s = ACPI_GED(adev);
> +
> +acpi_memory_ospm_status(&s->memhp_state, list);
> +}
> +
>  static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
>  {
>  AcpiGedState *s = ACPI_GED(adev);
> @@ -409,6 +416,7 @@ static void acpi_ged_class_init(ObjectClass *class, void 
> *data)
>  hc->unplug_request = acpi_ged_unplug_request_cb;
>  hc->unplug = acpi_ged_unplug_cb;
>
> +adevc->ospm_status = acpi_ged_ospm_status;
>  adevc->send_event = acpi_ged_send_event;
>  }
>
> --

thanks
-- PMM

Re: [PATCH 01/22] ppc/ppc4xx: Introduce a DCR device model

2022-08-16 Thread BALATON Zoltan


On Tue, 16 Aug 2022, Cédric Le Goater wrote:

On 8/13/22 17:34, BALATON Zoltan wrote:

From: Cédric Le Goater 

The Device Control Registers (DCR) of on-SoC devices are accessed by
software through the use of the mtdcr and mfdcr instructions. These
are converted in transactions on a side band bus, the DCR bus, which
connects the on-SoC devices to the CPU.

Ideally, we should model these accesses with a DCR namespace and DCR
memory regions but today the DCR handlers are installed in a DCR table
under the CPU. Instead, introduce a little device model wrapper to hold
a CPU link and handle registration of DCR handlers.

The DCR device inherits from SysBus because most of these devices also
have MMIO regions and/or IRQs. Being a SysBusDevice makes things easier
to install the device model in the overall SoC.

Signed-off-by: Cédric Le Goater 


When re-sending a patch, it is a good practice to list the changes before
the Sob of the person doing the resend.

I think you only changed the ppc4xx_dcr_register prototype. Correct ?


Mostly, and the resulting rebase but maybe some small changes here and 
there but I think those are also just code style fixes. I did not know 
what you prefer with the from line so if you're OK with keeping it I can 
go through it again and mark changes before signed-off if you think that's 
better.


I've also started cleaning up the sdram model, I need a bit more time for 
that, I'll probably send it as a separate series.



Thanks,

C.


Signed-off-by: BALATON Zoltan 
---
  hw/ppc/ppc4xx_devs.c| 41 +
  include/hw/ppc/ppc4xx.h | 17 +
  2 files changed, 58 insertions(+)

diff --git a/hw/ppc/ppc4xx_devs.c b/hw/ppc/ppc4xx_devs.c
index 069b511951..f4d7ae9567 100644
--- a/hw/ppc/ppc4xx_devs.c
+++ b/hw/ppc/ppc4xx_devs.c
@@ -664,3 +664,44 @@ void ppc4xx_mal_init(CPUPPCState *env, uint8_t txcnum, 
uint8_t rxcnum,

   mal, &dcr_read_mal, &dcr_write_mal);
  }
  }
+
+/* PPC4xx_DCR_DEVICE */
+
+void ppc4xx_dcr_register(Ppc4xxDcrDeviceState *dev, int dcrn, void 
*opaque,

+ dcr_read_cb dcr_read, dcr_write_cb dcr_write)
+{
+assert(dev->cpu);
+ppc_dcr_register(&dev->cpu->env, dcrn, opaque, dcr_read, dcr_write);
+}
+
+bool ppc4xx_dcr_realize(Ppc4xxDcrDeviceState *dev, PowerPCCPU *cpu,
+Error **errp)
+{
+object_property_set_link(OBJECT(dev), "cpu", OBJECT(cpu), 
&error_abort);

+return sysbus_realize(SYS_BUS_DEVICE(dev), errp);
+}
+
+static Property ppc4xx_dcr_properties[] = {
+DEFINE_PROP_LINK("cpu", Ppc4xxDcrDeviceState, cpu, TYPE_POWERPC_CPU,
+ PowerPCCPU *),
+DEFINE_PROP_END_OF_LIST(),
+};
+
+static void ppc4xx_dcr_class_init(ObjectClass *oc, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(oc);
+
+device_class_set_props(dc, ppc4xx_dcr_properties);
+}
+
+static const TypeInfo ppc4xx_types[] = {
+{
+.name   = TYPE_PPC4xx_DCR_DEVICE,
+.parent = TYPE_SYS_BUS_DEVICE,
+.instance_size  = sizeof(Ppc4xxDcrDeviceState),
+.class_init = ppc4xx_dcr_class_init,
+.abstract   = true,
+}
+};
+
+DEFINE_TYPES(ppc4xx_types)
diff --git a/include/hw/ppc/ppc4xx.h b/include/hw/ppc/ppc4xx.h
index 591e2421a3..a537a5567b 100644
--- a/include/hw/ppc/ppc4xx.h
+++ b/include/hw/ppc/ppc4xx.h
@@ -27,6 +27,7 @@
#include "hw/ppc/ppc.h"
  #include "exec/memory.h"
+#include "hw/sysbus.h"
void ppc4xx_sdram_banks(MemoryRegion *ram, int nr_banks,
  MemoryRegion ram_memories[],
@@ -44,4 +45,20 @@ void ppc4xx_mal_init(CPUPPCState *env, uint8_t txcnum, 
uint8_t rxcnum,

#define TYPE_PPC4xx_PCI_HOST_BRIDGE "ppc4xx-pcihost"
  +/*
+ * Generic DCR device
+ */
+#define TYPE_PPC4xx_DCR_DEVICE "ppc4xx-dcr-device"
+OBJECT_DECLARE_SIMPLE_TYPE(Ppc4xxDcrDeviceState, PPC4xx_DCR_DEVICE);
+struct Ppc4xxDcrDeviceState {
+SysBusDevice parent_obj;
+
+PowerPCCPU *cpu;
+};
+
+void ppc4xx_dcr_register(Ppc4xxDcrDeviceState *dev, int dcrn, void 
*opaque,

+ dcr_read_cb dcr_read, dcr_write_cb dcr_write);
+bool ppc4xx_dcr_realize(Ppc4xxDcrDeviceState *dev, PowerPCCPU *cpu,
+Error **errp);
+
  #endif /* PPC4XX_H */

Re: [PATCH 12/22] ppc4xx: Move PLB model to ppc4xx_devs.c

2022-08-16 Thread BALATON Zoltan


On Tue, 16 Aug 2022, Cédric Le Goater wrote:

On 8/13/22 17:34, BALATON Zoltan wrote:

The PLB is shared between 405 and 440 so move it to the shared file.


Should we rename the device to Ppc4xxPlbState ?


I could do that (also for the other one moved). Ptobably nothing in these 
boards care about the names so it shouldn't break anything.


Regards,
BALATON Zoltan


Thanks,

C.



Signed-off-by: BALATON Zoltan 
---
  hw/ppc/ppc405.h | 11 -
  hw/ppc/ppc405_uc.c  | 93 
  hw/ppc/ppc4xx_devs.c| 94 +
  include/hw/ppc/ppc4xx.h | 11 +
  4 files changed, 105 insertions(+), 104 deletions(-)

diff --git a/hw/ppc/ppc405.h b/hw/ppc/ppc405.h
index 31c94e4742..d85c595f9d 100644
--- a/hw/ppc/ppc405.h
+++ b/hw/ppc/ppc405.h
@@ -63,17 +63,6 @@ struct ppc4xx_bd_info_t {
  uint32_t bi_iic_fast[2];
  };
  -/* Peripheral local bus arbitrer */
-#define TYPE_PPC405_PLB "ppc405-plb"
-OBJECT_DECLARE_SIMPLE_TYPE(Ppc405PlbState, PPC405_PLB);
-struct Ppc405PlbState {
-Ppc4xxDcrDeviceState parent_obj;
-
-uint32_t acr;
-uint32_t bear;
-uint32_t besr;
-};
-
  /* PLB to OPB bridge */
  #define TYPE_PPC405_POB "ppc405-pob"
  OBJECT_DECLARE_SIMPLE_TYPE(Ppc405PobState, PPC405_POB);
diff --git a/hw/ppc/ppc405_uc.c b/hw/ppc/ppc405_uc.c
index 922c23346f..3de6c77631 100644
--- a/hw/ppc/ppc405_uc.c
+++ b/hw/ppc/ppc405_uc.c
@@ -137,94 +137,6 @@ ram_addr_t ppc405_set_bootinfo(CPUPPCState *env, 
ram_addr_t ram_size)

  
/*/
  /* Shared peripherals */

-/*/
-/* Peripheral local bus arbitrer */
-enum {
-PLB3A0_ACR = 0x077,
-PLB4A0_ACR = 0x081,
-PLB0_BESR  = 0x084,
-PLB0_BEAR  = 0x086,
-PLB0_ACR   = 0x087,
-PLB4A1_ACR = 0x089,
-};
-
-static uint32_t dcr_read_plb(void *opaque, int dcrn)
-{
-Ppc405PlbState *plb = opaque;
-uint32_t ret;
-
-switch (dcrn) {
-case PLB0_ACR:
-ret = plb->acr;
-break;
-case PLB0_BEAR:
-ret = plb->bear;
-break;
-case PLB0_BESR:
-ret = plb->besr;
-break;
-default:
-/* Avoid gcc warning */
-ret = 0;
-break;
-}
-
-return ret;
-}
-
-static void dcr_write_plb(void *opaque, int dcrn, uint32_t val)
-{
-Ppc405PlbState *plb = opaque;
-
-switch (dcrn) {
-case PLB0_ACR:
-/* We don't care about the actual parameters written as
- * we don't manage any priorities on the bus
- */
-plb->acr = val & 0xF800;
-break;
-case PLB0_BEAR:
-/* Read only */
-break;
-case PLB0_BESR:
-/* Write-clear */
-plb->besr &= ~val;
-break;
-}
-}
-
-static void ppc405_plb_reset(DeviceState *dev)
-{
-Ppc405PlbState *plb = PPC405_PLB(dev);
-
-plb->acr = 0x;
-plb->bear = 0x;
-plb->besr = 0x;
-}
-
-static void ppc405_plb_realize(DeviceState *dev, Error **errp)
-{
-Ppc405PlbState *plb = PPC405_PLB(dev);
-Ppc4xxDcrDeviceState *dcr = PPC4xx_DCR_DEVICE(dev);
-
-ppc4xx_dcr_register(dcr, PLB3A0_ACR, plb, &dcr_read_plb, 
&dcr_write_plb);
-ppc4xx_dcr_register(dcr, PLB4A0_ACR, plb, &dcr_read_plb, 
&dcr_write_plb);
-ppc4xx_dcr_register(dcr, PLB0_ACR, plb, &dcr_read_plb, 
&dcr_write_plb);
-ppc4xx_dcr_register(dcr, PLB0_BEAR, plb, &dcr_read_plb, 
&dcr_write_plb);
-ppc4xx_dcr_register(dcr, PLB0_BESR, plb, &dcr_read_plb, 
&dcr_write_plb);
-ppc4xx_dcr_register(dcr, PLB4A1_ACR, plb, &dcr_read_plb, 
&dcr_write_plb);

-}
-
-static void ppc405_plb_class_init(ObjectClass *oc, void *data)
-{
-DeviceClass *dc = DEVICE_CLASS(oc);
-
-dc->realize = ppc405_plb_realize;
-dc->reset = ppc405_plb_reset;
-/* Reason: only works as function of a ppc4xx SoC */
-dc->user_creatable = false;
-}
-
  
/*/
  /* PLB to OPB bridge */
  enum {
@@ -1535,11 +1447,6 @@ static void ppc405_soc_class_init(ObjectClass *oc, 
void *data)

static const TypeInfo ppc405_types[] = {
  {
-.name   = TYPE_PPC405_PLB,
-.parent = TYPE_PPC4xx_DCR_DEVICE,
-.instance_size  = sizeof(Ppc405PlbState),
-.class_init = ppc405_plb_class_init,
-}, {
  .name   = TYPE_PPC405_POB,
  .parent = TYPE_PPC4xx_DCR_DEVICE,
  .instance_size  = sizeof(Ppc405PobState),
diff --git a/hw/ppc/ppc4xx_devs.c b/hw/ppc/ppc4xx_devs.c
index 7d40c1b68a..843d759b1b 100644
--- a/hw/ppc/ppc4xx_devs.c
+++ b/hw/ppc/ppc4xx_devs.c
@@ -658,6 +658,95 @@ static void ppc4xx_mal_class_init(ObjectClass *oc, 
void *data)

  device_class_set_props(dc, ppc4xx_mal_properties);
  }
  
+/*/
+/* Peripheral local bus arbitrer */
+e

[PATCH v3] xio3130_upstream: Add ACS, Access Control Services, capability

2022-08-16 Thread Paul Schlacter

v2 -> v3:
- Add the missing code in V2.

v1 -> v2:
- Allow ACS to be disabled.
- Suggested by Michael S. Tsirkin, use disable-acs to set property.

v1:
- Add ACS (Access Control Services) capability.

If it is a pcie device, check that all devices on the path from

the device to the root complex have ACS enabled, and then the

device will become an iommu_group.

it will have the effect of isolation




Signed-off-by: wlfightup 

Signed-off-by: wangliang 

---

 hw/pci-bridge/xio3130_upstream.c | 13 +

 1 file changed, 13 insertions(+)


diff --git a/hw/pci-bridge/xio3130_upstream.c
b/hw/pci-bridge/xio3130_upstream.c

index 5ff46ef050..5433d06fb3 100644

--- a/hw/pci-bridge/xio3130_upstream.c

+++ b/hw/pci-bridge/xio3130_upstream.c

@@ -24,6 +24,7 @@

 #include "hw/pci/msi.h"

 #include "hw/pci/pcie.h"

 #include "hw/pci/pcie_port.h"

+#include "hw/qdev-properties.h"

 #include "migration/vmstate.h"

 #include "qemu/module.h"



@@ -37,6 +38,8 @@

 #define XIO3130_SSVID_SSID  0

 #define XIO3130_EXP_OFFSET  0x90

 #define XIO3130_AER_OFFSET  0x100

+#define XIO3130_ACS_OFFSET \

+(XIO3130_AER_OFFSET + PCI_ERR_SIZEOF)



 static void xio3130_upstream_write_config(PCIDevice *d, uint32_t address,

   uint32_t val, int len)

@@ -57,6 +60,7 @@ static void xio3130_upstream_reset(DeviceState *qdev)

 static void xio3130_upstream_realize(PCIDevice *d, Error **errp)

 {

 PCIEPort *p = PCIE_PORT(d);

+PCIESlot *s = PCIE_SLOT(d);

 int rc;



 pci_bridge_initfn(d, TYPE_PCIE_BUS);

@@ -92,6 +96,9 @@ static void xio3130_upstream_realize(PCIDevice *d, Error
**errp)

 goto err;

 }



+if (!s->disable_acs) {

+pcie_acs_init(d, XIO3130_ACS_OFFSET);

+}

 return;



 err:

@@ -110,6 +117,11 @@ static void xio3130_upstream_exitfn(PCIDevice *d)

 pci_bridge_exitfn(d);

 }



+static Property xio3130_upstream_props[] = {

+DEFINE_PROP_BOOL("disable-acs", PCIESlot, disable_acs, false),

+DEFINE_PROP_END_OF_LIST()

+};

+

 static const VMStateDescription vmstate_xio3130_upstream = {

 .name = "xio3130-express-upstream-port",

 .priority = MIG_PRI_PCI_BUS,

@@ -139,6 +151,7 @@ static void xio3130_upstream_class_init(ObjectClass
*klass, void *data)

 dc->desc = "TI X3130 Upstream Port of PCI Express Switch";

 dc->reset = xio3130_upstream_reset;

 dc->vmsd = &vmstate_xio3130_upstream;

+device_class_set_props(dc, xio3130_upstream_props);

 }



 static const TypeInfo xio3130_upstream_info = {

-- 

2.24.3 (Apple Git-128)

答复: [PATCH] acpi_ged: Add ospm_status hook implementation

2022-08-16 Thread zhukeqian via

Hi Peter,

Setup an ARM virtual machine of machine virt and execute qmp 
"query-acpi-ospm-status" can trigger this bug.

Thanks.

-邮件原件-
发件人: Qemu-devel [mailto:qemu-devel-bounces+zhukeqian1=huawei@nongnu.org] 代表 
Peter Maydell
发送时间: 2022年8月16日 17:30
收件人: zhukeqian 
抄送: qemu-devel@nongnu.org; qemu-...@nongnu.org; qemu-triv...@nongnu.org; 
Philippe Mathieu-Daudé ; Eric Auger ; 
Peter Xu ; Igor Mammedov ; Wanghaibin 
(D) 
主题: Re: [PATCH] acpi_ged: Add ospm_status hook implementation

On Tue, 16 Aug 2022 at 10:26, Keqian Zhu  wrote:
>
> This fixes a bug that causes segmentation fault with following dumpstack:
>  #1  0xab64235c in qmp_query_acpi_ospm_status 
> (errp=errp@entry=0xf030) at ../monitor/qmp-cmds.c:312
>  #2  0xabfc4e20 in qmp_marshal_query_acpi_ospm_status 
> (args=, ret=0xea4ffe90, errp=0xea4ffe88) at 
> qapi/qapi-commands-acpi.c:63
>  #3  0xabff8ba0 in do_qmp_dispatch_bh (opaque=0xea4ffe98) 
> at ../qapi/qmp-dispatch.c:128
>  #4  0xac02e594 in aio_bh_call (bh=0xe0004d80) at 
> ../util/async.c:150
>  #5  aio_bh_poll (ctx=ctx@entry=0xad0f6040) at ../util/async.c:178
>  #6  0xac00bd40 in aio_dispatch (ctx=ctx@entry=0xad0f6040) 
> at ../util/aio-posix.c:421
>  #7  0xac02e010 in aio_ctx_dispatch (source=0xad0f6040, 
> callback=, user_data=) at 
> ../util/async.c:320
>  #8  0xf76f6884 in g_main_context_dispatch () at 
> /usr/lib64/libglib-2.0.so.0
>  #9  0xac0452d4 in glib_pollfds_poll () at 
> ../util/main-loop.c:297
>  #10 os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:320
>  #11 main_loop_wait (nonblocking=nonblocking@entry=0) at 
> ../util/main-loop.c:596
>  #12 0xab5c9e50 in qemu_main_loop () at 
> ../softmmu/runstate.c:734
>  #13 0xab185370 in qemu_main (argc=argc@entry=47, 
> argv=argv@entry=0xf518, envp=envp@entry=0x0) at 
> ../softmmu/main.c:38
>  #14 0xab16f99c in main (argc=47, argv=0xf518) at 
> ../softmmu/main.c:47

What are the conditions required to trigger the segfault?


> Fixes: ebb62075021a ("hw/acpi: Add ACPI Generic Event Device Support")
> Signed-off-by: Keqian Zhu 
> ---
>  hw/acpi/generic_event_device.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/hw/acpi/generic_event_device.c 
> b/hw/acpi/generic_event_device.c index e28457a7d1..a3d31631fe 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -267,6 +267,13 @@ static void acpi_ged_unplug_cb(HotplugHandler 
> *hotplug_dev,
>  }
>  }
>
> +static void acpi_ged_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList 
> +***list) {
> +AcpiGedState *s = ACPI_GED(adev);
> +
> +acpi_memory_ospm_status(&s->memhp_state, list); }
> +
>  static void acpi_ged_send_event(AcpiDeviceIf *adev, 
> AcpiEventStatusBits ev)  {
>  AcpiGedState *s = ACPI_GED(adev); @@ -409,6 +416,7 @@ static void 
> acpi_ged_class_init(ObjectClass *class, void *data)
>  hc->unplug_request = acpi_ged_unplug_request_cb;
>  hc->unplug = acpi_ged_unplug_cb;
>
> +adevc->ospm_status = acpi_ged_ospm_status;
>  adevc->send_event = acpi_ged_send_event;  }
>
> --

thanks
-- PMM

Re: [PATCH] acpi_ged: Add ospm_status hook implementation

On Tue, 16 Aug 2022 at 10:40, zhukeqian  wrote:
>
> Hi Peter,
>
> Setup an ARM virtual machine of machine virt and execute qmp 
> "query-acpi-ospm-status" can trigger this bug.

Thanks. That is worth stating in the commit message, I think.

-- PMm

[PATCH v3 2/2] virtio-gpu: hostmem

2022-08-16 Thread Antonio Caggiano

From: Gerd Hoffmann 

Use VIRTIO_GPU_SHM_ID_HOST_VISIBLE as id for virtio-gpu.

v2: Formatting fixes

Signed-off-by: Antonio Caggiano 
Acked-by: Michael S. Tsirkin 
---
 hw/display/virtio-gpu-pci.c| 15 +++
 hw/display/virtio-gpu.c|  1 +
 hw/display/virtio-vga.c| 33 -
 include/hw/virtio/virtio-gpu.h |  5 +
 4 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/hw/display/virtio-gpu-pci.c b/hw/display/virtio-gpu-pci.c
index 93f214ff58..2cbbacd7fe 100644
--- a/hw/display/virtio-gpu-pci.c
+++ b/hw/display/virtio-gpu-pci.c
@@ -33,6 +33,21 @@ static void virtio_gpu_pci_base_realize(VirtIOPCIProxy 
*vpci_dev, Error **errp)
 DeviceState *vdev = DEVICE(g);
 int i;
 
+if (virtio_gpu_hostmem_enabled(g->conf)) {
+vpci_dev->msix_bar_idx = 1;
+vpci_dev->modern_mem_bar_idx = 2;
+memory_region_init(&g->hostmem, OBJECT(g), "virtio-gpu-hostmem",
+   g->conf.hostmem);
+pci_register_bar(&vpci_dev->pci_dev, 4,
+ PCI_BASE_ADDRESS_SPACE_MEMORY |
+ PCI_BASE_ADDRESS_MEM_PREFETCH |
+ PCI_BASE_ADDRESS_MEM_TYPE_64,
+ &g->hostmem);
+virtio_pci_add_shm_cap(vpci_dev, 4, 0, g->conf.hostmem,
+   VIRTIO_GPU_SHM_ID_HOST_VISIBLE);
+}
+
+qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus), errp);
 virtio_pci_force_virtio_1(vpci_dev);
 if (!qdev_realize(vdev, BUS(&vpci_dev->bus), errp)) {
 return;
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 20cc703dcc..506b3b8eef 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1424,6 +1424,7 @@ static Property virtio_gpu_properties[] = {
  256 * MiB),
 DEFINE_PROP_BIT("blob", VirtIOGPU, parent_obj.conf.flags,
 VIRTIO_GPU_FLAG_BLOB_ENABLED, false),
+DEFINE_PROP_SIZE("hostmem", VirtIOGPU, parent_obj.conf.hostmem, 0),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/display/virtio-vga.c b/hw/display/virtio-vga.c
index 4dcb34c4a7..aa8d1ab993 100644
--- a/hw/display/virtio-vga.c
+++ b/hw/display/virtio-vga.c
@@ -115,17 +115,32 @@ static void virtio_vga_base_realize(VirtIOPCIProxy 
*vpci_dev, Error **errp)
 pci_register_bar(&vpci_dev->pci_dev, 0,
  PCI_BASE_ADDRESS_MEM_PREFETCH, &vga->vram);
 
-/*
- * Configure virtio bar and regions
- *
- * We use bar #2 for the mmio regions, to be compatible with stdvga.
- * virtio regions are moved to the end of bar #2, to make room for
- * the stdvga mmio registers at the start of bar #2.
- */
-vpci_dev->modern_mem_bar_idx = 2;
-vpci_dev->msix_bar_idx = 4;
 vpci_dev->modern_io_bar_idx = 5;
 
+if (!virtio_gpu_hostmem_enabled(g->conf)) {
+/*
+ * Configure virtio bar and regions
+ *
+ * We use bar #2 for the mmio regions, to be compatible with stdvga.
+ * virtio regions are moved to the end of bar #2, to make room for
+ * the stdvga mmio registers at the start of bar #2.
+ */
+vpci_dev->modern_mem_bar_idx = 2;
+vpci_dev->msix_bar_idx = 4;
+} else {
+vpci_dev->msix_bar_idx = 1;
+vpci_dev->modern_mem_bar_idx = 2;
+memory_region_init(&g->hostmem, OBJECT(g), "virtio-gpu-hostmem",
+   g->conf.hostmem);
+pci_register_bar(&vpci_dev->pci_dev, 4,
+ PCI_BASE_ADDRESS_SPACE_MEMORY |
+ PCI_BASE_ADDRESS_MEM_PREFETCH |
+ PCI_BASE_ADDRESS_MEM_TYPE_64,
+ &g->hostmem);
+virtio_pci_add_shm_cap(vpci_dev, 4, 0, g->conf.hostmem,
+   VIRTIO_GPU_SHM_ID_HOST_VISIBLE);
+}
+
 if (!(vpci_dev->flags & VIRTIO_PCI_FLAG_PAGE_PER_VQ)) {
 /*
  * with page-per-vq=off there is no padding space we can use
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index 2e28507efe..eafce75b04 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -102,12 +102,15 @@ enum virtio_gpu_base_conf_flags {
 (_cfg.flags & (1 << VIRTIO_GPU_FLAG_DMABUF_ENABLED))
 #define virtio_gpu_blob_enabled(_cfg) \
 (_cfg.flags & (1 << VIRTIO_GPU_FLAG_BLOB_ENABLED))
+#define virtio_gpu_hostmem_enabled(_cfg) \
+(_cfg.hostmem > 0)
 
 struct virtio_gpu_base_conf {
 uint32_t max_outputs;
 uint32_t flags;
 uint32_t xres;
 uint32_t yres;
+uint64_t hostmem;
 };
 
 struct virtio_gpu_ctrl_command {
@@ -131,6 +134,8 @@ struct VirtIOGPUBase {
 int renderer_blocked;
 int enable;
 
+MemoryRegion hostmem;
+
 struct virtio_gpu_scanout scanout[VIRTIO_GPU_MAX_SCANOUTS];
 
 int enabled_output_bitmask;
-- 
2.34.1

Re: [PATCH] hw/riscv: microchip_pfsoc: fix kernel panics due to missing peripherals

2022-08-16 Thread Conor.Dooley

On 16/08/2022 01:40, Philippe Mathieu-Daudé wrote:
> [You don't often get email from f4...@amsat.org. Learn why this is important 
> at https://aka.ms/LearnAboutSenderIdentification ]
> 
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the 
> content is safe
> 
> Hi Conor,
> 
> On 15/8/22 00:48, conor.doo...@microchip.com wrote:
>> On 14/08/2022 23:08, Alistair Francis wrote:
>>> EXTERNAL EMAIL: Do not click links or open attachments unless you know the 
>>> content is safe
>>>
>>> On Sat, Aug 13, 2022 at 11:51 PM Conor Dooley  wrote:
 QEMU support for PolarFire SoC seems to be fairly out of date at this
 point. Running with a recent HSS, U-Boot etc doesn't work, partly due
 to the unimplemented cache controller that the HSS tries to read from
 (it needs to know the ways configuration now) and the rest seems to be
 down to 64 bit address DMA to the sd card (not 100% on that yet).
 There's some patches floating around internally that supposedly fixed
 things for QEMU v6.something but I could not replicate & they're fairly
 conflicty at this point. Plan is to clean them up, but no point sitting
 on this patch until then as I have no ETA for that at this point.
>>>
>>> Awesome! It is great to see Microchip supporting open source projects
>>
>> Better late than never ehh..
>> As I said, no ETA yet as I don't know just how far off the sd card stuff
>> is, but it's in the todo pile. In the meantime, I'll keep an eye out here
>> which I am ~certain we haven't been doing so far. I've added QEMU stuff
>> to my build/test scripts now that I've got the direct kernel boot working
>> for me so hopefully once things get fixed, they'll stay that way.
> 
> Please Cc me and Cédric in your future posts regarding SD card, or open
> a GitLab issue describing the problem.

Willdo. Need to do some more digging first :)

Thanks.

[PATCH v2] hw/acpi: Add ospm_status hook implementation for acpi-ged

2022-08-16 Thread Keqian Zhu via

Setup an ARM virtual machine of machine virt and execute qmp 
"query-acpi-ospm-status"
causes segmentation fault with following dumpstack:
 #1  0xab64235c in qmp_query_acpi_ospm_status 
(errp=errp@entry=0xf030) at ../monitor/qmp-cmds.c:312
 #2  0xabfc4e20 in qmp_marshal_query_acpi_ospm_status (args=, ret=0xea4ffe90, errp=0xea4ffe88) at qapi/qapi-commands-acpi.c:63
 #3  0xabff8ba0 in do_qmp_dispatch_bh (opaque=0xea4ffe98) at 
../qapi/qmp-dispatch.c:128
 #4  0xac02e594 in aio_bh_call (bh=0xe0004d80) at 
../util/async.c:150
 #5  aio_bh_poll (ctx=ctx@entry=0xad0f6040) at ../util/async.c:178
 #6  0xac00bd40 in aio_dispatch (ctx=ctx@entry=0xad0f6040) at 
../util/aio-posix.c:421
 #7  0xac02e010 in aio_ctx_dispatch (source=0xad0f6040, 
callback=, user_data=) at ../util/async.c:320
 #8  0xf76f6884 in g_main_context_dispatch () at 
/usr/lib64/libglib-2.0.so.0
 #9  0xac0452d4 in glib_pollfds_poll () at ../util/main-loop.c:297
 #10 os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:320
 #11 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:596
 #12 0xab5c9e50 in qemu_main_loop () at ../softmmu/runstate.c:734
 #13 0xab185370 in qemu_main (argc=argc@entry=47, 
argv=argv@entry=0xf518, envp=envp@entry=0x0) at ../softmmu/main.c:38
 #14 0xab16f99c in main (argc=47, argv=0xf518) at 
../softmmu/main.c:47

Fixes: ebb62075021a ("hw/acpi: Add ACPI Generic Event Device Support")
Signed-off-by: Keqian Zhu 
---
 hw/acpi/generic_event_device.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index e28457a7d1..a3d31631fe 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -267,6 +267,13 @@ static void acpi_ged_unplug_cb(HotplugHandler *hotplug_dev,
 }
 }
 
+static void acpi_ged_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***list)
+{
+AcpiGedState *s = ACPI_GED(adev);
+
+acpi_memory_ospm_status(&s->memhp_state, list);
+}
+
 static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
 {
 AcpiGedState *s = ACPI_GED(adev);
@@ -409,6 +416,7 @@ static void acpi_ged_class_init(ObjectClass *class, void 
*data)
 hc->unplug_request = acpi_ged_unplug_request_cb;
 hc->unplug = acpi_ged_unplug_cb;
 
+adevc->ospm_status = acpi_ged_ospm_status;
 adevc->send_event = acpi_ged_send_event;
 }
 
-- 
2.33.0

[PATCH v3 0/2] virtio-gpu: Shared memory capability

2022-08-16 Thread Antonio Caggiano

Previously part of [0], now a patch series on its own.

This patch series cherry picks two commits from [1] and applies one fix
according to [2], which should answer Gerd's comment [3] on previous
patch.

v2: Squash patch #3 into patch #2, and formatting fixes to patch #1.
v3: Reverse commits order.

[0] https://www.mail-archive.com/qemu-devel@nongnu.org/msg826897.html
[1] https://gitlab.freedesktop.org/virgl/qemu/-/commits/virtio-gpu-next/
[2] 
https://github.com/torvalds/linux/commit/0dd4ff93f4c8dba016ad79384007da4938cd54a1
[3] https://www.mail-archive.com/qemu-devel@nongnu.org/msg827306.html


Dr. David Alan Gilbert (1):
  virtio: Add shared memory capability

Gerd Hoffmann (1):
  virtio-gpu: hostmem

 hw/display/virtio-gpu-pci.c| 15 +++
 hw/display/virtio-gpu.c|  1 +
 hw/display/virtio-vga.c| 33 -
 hw/virtio/virtio-pci.c | 18 ++
 include/hw/virtio/virtio-gpu.h |  5 +
 include/hw/virtio/virtio-pci.h |  4 
 6 files changed, 67 insertions(+), 9 deletions(-)

-- 
2.34.1

[PATCH v3 1/2] virtio: Add shared memory capability

2022-08-16 Thread Antonio Caggiano

From: "Dr. David Alan Gilbert" 

Define a new capability type 'VIRTIO_PCI_CAP_SHARED_MEMORY_CFG'
and the data structure 'virtio_pci_shm_cap' to go with it.
They allow defining shared memory regions with sizes and offsets
of 2^32 and more.
Multiple instances of the capability are allowed and distinguished
by a device-specific 'id'.

v2: Remove virtio_pci_shm_cap as virtio_pci_cap64 is used instead.
v3: No need for mask32 as cpu_to_le32 truncates the value.

Signed-off-by: Dr. David Alan Gilbert 
Signed-off-by: Antonio Caggiano 
---
 hw/virtio/virtio-pci.c | 18 ++
 include/hw/virtio/virtio-pci.h |  4 
 2 files changed, 22 insertions(+)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 45327f0b31..50bd230122 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1164,6 +1164,24 @@ static int virtio_pci_add_mem_cap(VirtIOPCIProxy *proxy,
 return offset;
 }
 
+int virtio_pci_add_shm_cap(VirtIOPCIProxy *proxy,
+   uint8_t bar, uint64_t offset, uint64_t length,
+   uint8_t id)
+{
+struct virtio_pci_cap64 cap = {
+.cap.cap_len = sizeof cap,
+.cap.cfg_type = VIRTIO_PCI_CAP_SHARED_MEMORY_CFG,
+};
+
+cap.cap.bar = bar;
+cap.cap.length = cpu_to_le32(length);
+cap.length_hi = cpu_to_le32(length >> 32);
+cap.cap.offset = cpu_to_le32(offset);
+cap.offset_hi = cpu_to_le32(offset >> 32);
+cap.cap.id = id;
+return virtio_pci_add_mem_cap(proxy, &cap.cap);
+}
+
 static uint64_t virtio_pci_common_read(void *opaque, hwaddr addr,
unsigned size)
 {
diff --git a/include/hw/virtio/virtio-pci.h b/include/hw/virtio/virtio-pci.h
index 2446dcd9ae..5e5c4a4c6d 100644
--- a/include/hw/virtio/virtio-pci.h
+++ b/include/hw/virtio/virtio-pci.h
@@ -252,4 +252,8 @@ void virtio_pci_types_register(const 
VirtioPCIDeviceTypeInfo *t);
  */
 unsigned virtio_pci_optimal_num_queues(unsigned fixed_queues);
 
+int virtio_pci_add_shm_cap(VirtIOPCIProxy *proxy,
+   uint8_t bar, uint64_t offset, uint64_t length,
+   uint8_t id);
+
 #endif
-- 
2.34.1

答复: [PATCH] acpi_ged: Add ospm_status hook implementation

2022-08-16 Thread zhukeqian via

OK, I'll send v2 soon.

-邮件原件-
发件人: Peter Maydell [mailto:peter.mayd...@linaro.org] 
发送时间: 2022年8月16日 17:42
收件人: zhukeqian 
抄送: qemu-devel@nongnu.org; qemu-...@nongnu.org; qemu-triv...@nongnu.org; 
Philippe Mathieu-Daudé ; Eric Auger ; 
Peter Xu ; Igor Mammedov ; Wanghaibin 
(D) 
主题: Re: [PATCH] acpi_ged: Add ospm_status hook implementation

On Tue, 16 Aug 2022 at 10:40, zhukeqian  wrote:
>
> Hi Peter,
>
> Setup an ARM virtual machine of machine virt and execute qmp 
> "query-acpi-ospm-status" can trigger this bug.

Thanks. That is worth stating in the commit message, I think.

-- PMm

[PULL 1/2] tests/qtest: misc tweaks to readconfig

From: Daniel P. Berrangé 

The property name parameter is ignored when visiting a top
level type, but the obvious typo should be fixed to avoid
confusion. A few indentation issues were tidied up. We
can break out of the loop when finding the RNG device.
Finally, close the temp FD immediately when no longer
needed.

Signed-off-by: Daniel P. Berrangé 
Message-Id: <20220809093854.168438-1-berra...@redhat.com>
Reviewed-by: Marc-André Lureau 
Reviewed-by: Thomas Huth 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Thomas Huth 
---
 tests/qtest/readconfig-test.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/qtest/readconfig-test.c b/tests/qtest/readconfig-test.c
index 2e604d7c2d..c7a9b0c7dd 100644
--- a/tests/qtest/readconfig-test.c
+++ b/tests/qtest/readconfig-test.c
@@ -33,13 +33,12 @@ static QTestState *qtest_init_with_config(const char 
*cfgdata)
 g_assert_cmpint(cfgfd, >=, 0);
 
 ret = qemu_write_full(cfgfd, cfgdata, strlen(cfgdata));
+close(cfgfd);
 if (ret < 0) {
 unlink(cfgpath);
 }
 g_assert_cmpint(ret, ==, strlen(cfgdata));
 
-close(cfgfd);
-
 args = g_strdup_printf("-nodefaults -machine none -readconfig %s", 
cfgpath);
 
 qts = qtest_init(args);
@@ -79,7 +78,7 @@ static void test_x86_memdev(void)
 "size = \"200\"";
 
 qts = qtest_init_with_config(cfgdata);
-   /* Test valid command */
+/* Test valid command */
 resp = qtest_qmp(qts, "{ 'execute': 'query-memdev' }");
 test_x86_memdev_resp(qdict_get(resp, "return"));
 qobject_unref(resp);
@@ -96,7 +95,7 @@ static void test_spice_resp(QObject *res)
 
 g_assert(res);
 v = qobject_input_visitor_new(res);
-visit_type_SpiceInfo(v, "spcie", &spice, &error_abort);
+visit_type_SpiceInfo(v, "spice", &spice, &error_abort);
 
 g_assert(spice);
 g_assert(spice->enabled);
@@ -114,7 +113,7 @@ static void test_spice(void)
 "unix = \"on\"\n";
 
 qts = qtest_init_with_config(cfgdata);
-   /* Test valid command */
+/* Test valid command */
 resp = qtest_qmp(qts, "{ 'execute': 'query-spice' }");
 test_spice_resp(qdict_get(resp, "return"));
 qobject_unref(resp);
@@ -144,6 +143,7 @@ static void test_object_rng_resp(QObject *res)
 if (g_str_equal(obj->name, "rng0") &&
 g_str_equal(obj->type, "child")) {
 seen_rng = true;
+break;
 }
 
 tmp = tmp->next;
@@ -164,7 +164,7 @@ static void test_object_rng(void)
 "id = \"rng0\"\n";
 
 qts = qtest_init_with_config(cfgdata);
-   /* Test valid command */
+/* Test valid command */
 resp = qtest_qmp(qts,
  "{ 'execute': 'qom-list',"
  "  'arguments': {'path': '/objects' }}");
-- 
2.31.1

[PULL 2/2] hw/usb/hcd-xhci: Fix unbounded loop in xhci_ring_chain_length() (CVE-2020-14394)

The loop condition in xhci_ring_chain_length() is under control of
the guest, and additionally the code does not check for failed DMA
transfers (e.g. if reaching the end of the RAM), so the loop there
could run for a very long time or even forever. Fix it by checking
the return value of dma_memory_read() and by introducing a maximum
loop length.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/646
Message-Id: <20220804131300.96368-1-th...@redhat.com>
Reviewed-by: Mauro Matteo Cascella 
Acked-by: Gerd Hoffmann 
Signed-off-by: Thomas Huth 
---
 hw/usb/hcd-xhci.c | 23 +++
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index 296cc6c8e6..3c48b58dde 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -21,6 +21,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/timer.h"
+#include "qemu/log.h"
 #include "qemu/module.h"
 #include "qemu/queue.h"
 #include "migration/vmstate.h"
@@ -725,10 +726,14 @@ static int xhci_ring_chain_length(XHCIState *xhci, const 
XHCIRing *ring)
 bool control_td_set = 0;
 uint32_t link_cnt = 0;
 
-while (1) {
+do {
 TRBType type;
-dma_memory_read(xhci->as, dequeue, &trb, TRB_SIZE,
-MEMTXATTRS_UNSPECIFIED);
+if (dma_memory_read(xhci->as, dequeue, &trb, TRB_SIZE,
+MEMTXATTRS_UNSPECIFIED) != MEMTX_OK) {
+qemu_log_mask(LOG_GUEST_ERROR, "%s: DMA memory access failed!\n",
+  __func__);
+return -1;
+}
 le64_to_cpus(&trb.parameter);
 le32_to_cpus(&trb.status);
 le32_to_cpus(&trb.control);
@@ -762,7 +767,17 @@ static int xhci_ring_chain_length(XHCIState *xhci, const 
XHCIRing *ring)
 if (!control_td_set && !(trb.control & TRB_TR_CH)) {
 return length;
 }
-}
+
+/*
+ * According to the xHCI spec, Transfer Ring segments should have
+ * a maximum size of 64 kB (see chapter "6 Data Structures")
+ */
+} while (length < TRB_LINK_LIMIT * 65536 / TRB_SIZE);
+
+qemu_log_mask(LOG_GUEST_ERROR, "%s: exceeded maximum tranfer ring size!\n",
+  __func__);
+
+return -1;
 }
 
 static void xhci_er_reset(XHCIState *xhci, int v)
-- 
2.31.1

Re: Teensy 4.1 Implementation

On Tue, 16 Aug 2022 at 10:59, Alex Bennée  wrote:
> Shiny Saana  writes:
> > I personally don't need any of the GPIO interfaces, but if needed
> > by someone else, that could be a good second step to
> > work on once that part of the board is implemented.
>
> Handling GPIOs in QEMU is fine (we have things like the qdev_init_gpio_*
> functions to handle them). The problem is usually what to do with the
> actual general purpose pins which aren't wired to something we emulate
> in the board. Some boards expose their values via QMP properties but I
> suspect whats really needed is a generic mechanism for exposing GPIO to
> external scripts rather than have every board define it's own thing.

Yes. However one key thing for trying to get a new board model
in is not to get tangled up in trying to improve/extend
the core QEMU facilities for something. That's much harder
than "my board model supports GPIO output lines to the same
extent as the other existing board models" :-)

thanks
-- PMM

[PULL 0/2] Two small fixes for QEMU 7.1-rc3

 Hi Richard!

Two minor fixes for rc3. If this is too late for rc3, please feel free
to ignore, I think they are not severe enough to justify an rc4 later.

The following changes since commit d102b8162a1e5fe8288d4d5c01801ce6536ac2d1:

  Merge tag 'pull-la-20220814' of https://gitlab.com/rth7680/qemu into staging 
(2022-08-14 08:48:11 -0500)

are available in the Git repository at:

  https://gitlab.com/thuth/qemu.git tags/pull-request-2022-08-16

for you to fetch changes up to effaf5a240e03020f4ae953e10b764622c3e87cc:

  hw/usb/hcd-xhci: Fix unbounded loop in xhci_ring_chain_length() 
(CVE-2020-14394) (2022-08-16 11:37:19 +0200)


* Fix a possible endless loop in USB XHCI code
* Minor fixes for the new readconfig test


Daniel P. Berrangé (1):
  tests/qtest: misc tweaks to readconfig

Thomas Huth (1):
  hw/usb/hcd-xhci: Fix unbounded loop in xhci_ring_chain_length() 
(CVE-2020-14394)

 hw/usb/hcd-xhci.c | 23 +++
 tests/qtest/readconfig-test.c | 12 ++--
 2 files changed, 25 insertions(+), 10 deletions(-)

Re: Teensy 4.1 Implementation



Shiny Saana  writes:

> Thank you very much for your answer!
>
> Apologies if I mess up the process of communicating via mailing lists,
> it's my first time communicating via this channel.

Don't worry about it - mailing lists are absolutely a good place to
discuss things ahead of time. I suspect because the devel list looks
to be mostly consisting of PATCHes and discussion of them that puts off
people from asking questions up front.

> The project that requires me to implement this board in QEMU currently would 
> require full USB interface and flash
> storage at the end of the day, and I feel on the same page that implementing 
> UART via USB would indeed be a good
> place to start.

Does the board use a standardised host controller or something custom?
You can see all the current host controllers in hw/usb/meson.build:

  # usb host adapters
  softmmu_ss.add(when: 'CONFIG_USB_UHCI', if_true: files('hcd-uhci.c'))
  softmmu_ss.add(when: 'CONFIG_USB_OHCI', if_true: files('hcd-ohci.c'))
  softmmu_ss.add(when: 'CONFIG_USB_OHCI_PCI', if_true: files('hcd-ohci-pci.c'))
  softmmu_ss.add(when: 'CONFIG_USB_EHCI', if_true: files('hcd-ehci.c'))
  softmmu_ss.add(when: 'CONFIG_USB_EHCI_PCI', if_true: files('hcd-ehci-pci.c'))
  softmmu_ss.add(when: 'CONFIG_USB_EHCI_SYSBUS', if_true: files('hcd-ehci.c', 
'hcd-ehci-sysbus.c'))
  softmmu_ss.add(when: 'CONFIG_USB_XHCI', if_true: files('hcd-xhci.c'))
  softmmu_ss.add(when: 'CONFIG_USB_XHCI_PCI', if_true: files('hcd-xhci-pci.c'))
  softmmu_ss.add(when: 'CONFIG_USB_XHCI_SYSBUS', if_true: 
files('hcd-xhci-sysbus.c'))
  softmmu_ss.add(when: 'CONFIG_USB_XHCI_NEC', if_true: files('hcd-xhci-nec.c'))
  softmmu_ss.add(when: 'CONFIG_USB_MUSB', if_true: files('hcd-musb.c'))
  softmmu_ss.add(when: 'CONFIG_USB_DWC2', if_true: files('hcd-dwc2.c'))
  softmmu_ss.add(when: 'CONFIG_USB_DWC3', if_true: files('hcd-dwc3.c'))

  softmmu_ss.add(when: 'CONFIG_TUSB6010', if_true: files('tusb6010.c'))
  softmmu_ss.add(when: 'CONFIG_IMX', if_true: files('chipidea.c'))
  softmmu_ss.add(when: 'CONFIG_IMX_USBPHY', if_true: files('imx-usb-phy.c'))
  softmmu_ss.add(when: 'CONFIG_VT82C686', if_true: files('vt82c686-uhci-pci.c'))
  specific_ss.add(when: 'CONFIG_XLNX_VERSAL', if_true: 
files('xlnx-versal-usb2-ctrl-regs.c'))
  specific_ss.add(when: 'CONFIG_XLNX_USB_SUBSYS', if_true: 
files('xlnx-usb-subsystem.c'))

So hopefully you can just use an existing model. Search for the TYPE_FOO
variable for your given host controller to see how the board models
instantiate it.


> I personally don't need any of the GPIO interfaces, but if needed by someone 
> else, that could be a good second step to
> work on once that part of the board is implemented.

Handling GPIOs in QEMU is fine (we have things like the qdev_init_gpio_*
functions to handle them). The problem is usually what to do with the
actual general purpose pins which aren't wired to something we emulate
in the board. Some boards expose their values via QMP properties but I
suspect whats really needed is a generic mechanism for exposing GPIO to
external scripts rather than have every board define it's own thing.

>
> I have jotted down a few lines based on the MPS2 implementation, which made 
> me get a feel of how a board is
> implemented. I'll look at the board you recommended too, thanks a lot for the 
> recommendation!
>
> As for upstreaming, once (if?) I get a working "MVP", and if I cannot figure 
> out the best way to submit the patches, I'll
> keep in touch to figure it out, if that's alright!

The submitting patches document has a lot of information on this
including a guide to using sourcehut to publish a branch as emails if
setting up local email is too tricky.

>
> Again, thanks a bunch!
>
> //Saana
>
> Le lun. 15 août 2022 à 16:58, Peter Maydell  a 
> écrit :
>
>  On Sat, 13 Aug 2022 at 18:54, Shiny Saana  wrote:
>  > I'd like to implement support for emulating the teensy 4.1 board (
>  > https://www.pjrc.com/store/teensy41.html) to QEMU.
>  >
>  > I'm honestly quite lost as to where to start at the moment, since
>  > I can't really find any emulated Cortex-M7 that would be close to
>  > that board already implemented.
>
>  Hi! Yes, implementing a new board and SoC model is quite a bit of
>  work, and unfortunately the process isn't really documented, so
>  the best thing is to look for some other existing SoC model and
>  do something similar. In this case, we implement the Cortex-M7
>  CPU itself, but we don't implement the IMXRT1062 SoC that it uses,
>  or any similar SoC in that family. (We do have some of the older
>  A-profile IMX boards, so it's possible some device models are
>  reusable -- but equally they might be very different.)
>
>  As a pattern, I would look at the stm32vldiscovery machine.
>  This is a Cortex-M3 system based on the stm32f100 SoC, where
>  we have implemented a few of the SoC devices and have stub
>  implementations of most of the rest. That's a fairly recently
>  added M-profile SoC and it's written in the "mod

[RFC PATCH 1/2] softmmu/memory: add missing begin/commit callback calls

kvm listeners now need ->commit callback in order to actually send
the ioctl to the hypervisor. Therefore, add missing callers around
address_space_set_flatview(), which in turn calls
address_space_update_topology_pass() which calls ->region_* and
->log_* callbacks.

Using MEMORY_LISTENER_CALL_GLOBAL is a little bit an overkill,
but it is harmless, considering that other listeners that are not
invoked in address_space_update_topology_pass() won't do anything,
since they won't have anything to commit.

Signed-off-by: Emanuele Giuseppe Esposito 
---
 softmmu/memory.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index 7ba2048836..1afd3f9703 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1076,7 +1076,9 @@ static void address_space_update_topology(AddressSpace 
*as)
 if (!g_hash_table_lookup(flat_views, physmr)) {
 generate_memory_topology(physmr);
 }
+MEMORY_LISTENER_CALL_GLOBAL(begin, Forward);
 address_space_set_flatview(as);
+MEMORY_LISTENER_CALL_GLOBAL(commit, Forward);
 }
 
 void memory_region_transaction_begin(void)
-- 
2.31.1

Re: [PATCH v2] xio3130_upstream: Add ACS (Access Control Services) capability

2022-08-16 Thread Michael S. Tsirkin

On Tue, Aug 16, 2022 at 05:16:38PM +0800, Paul Schlacter wrote:
> v1 -> v2:
> - Allow ACS to be disabled.
> - Suggested by Michael S. Tsirkin, use disable-acs to set property.
> 
> v1:
> - Add ACS (Access Control Services) capability.

changelog generally after ---

> 
> If it is a pcie device, check that all devices on the path from


Hmm I don't see any checks on a path. what does this refer to?

> 
> the device to the root complex have ACS enabled, and then the
> 
> device will become an iommu_group.
> 
> it will have the effect of isolation
> 
> 
> 
> Signed-off-by: wangliang 
> 
> Signed-off-by: wangliang 
> 

why two signatures?

> 
> ---

> 
>  hw/pci-bridge/xio3130_upstream.c | 12 +++-
> 
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 

Patch has corrupted whitespace.

> 
> diff --git a/hw/pci-bridge/xio3130_upstream.c b/hw/pci-bridge/
> xio3130_upstream.c
> 
> index 2df95b..5433d06fb3 100644
> 
> --- a/hw/pci-bridge/xio3130_upstream.c
> 
> +++ b/hw/pci-bridge/xio3130_upstream.c
> 
> @@ -24,6 +24,7 @@
> 
>  #include "hw/pci/msi.h"
> 
>  #include "hw/pci/pcie.h"
> 
>  #include "hw/pci/pcie_port.h"
> 
> +#include "hw/qdev-properties.h"
> 
>  #include "migration/vmstate.h"
> 
>  #include "qemu/module.h"
> 
>  
> 
> @@ -59,6 +60,7 @@ static void xio3130_upstream_reset(DeviceState *qdev)
> 
>  static void xio3130_upstream_realize(PCIDevice *d, Error **errp)
> 
>  {
> 
>      PCIEPort *p = PCIE_PORT(d);
> 
> +    PCIESlot *s = PCIE_SLOT(d);
> 
>      int rc;
> 
>  
> 
>      pci_bridge_initfn(d, TYPE_PCIE_BUS);
> 
> @@ -94,7 +96,9 @@ static void xio3130_upstream_realize(PCIDevice *d, Error
> **errp)
> 
>          goto err;
> 
>      }
> 
>  
> 
> -    pcie_acs_init(d, XIO3130_ACS_OFFSET);
> 
> +    if (!s->disable_acs) {
> 
> +        pcie_acs_init(d, XIO3130_ACS_OFFSET);
> 
> +    }
> 
>      return;
> 
>  
> 
>  err:
> 
> @@ -113,6 +117,11 @@ static void xio3130_upstream_exitfn(PCIDevice *d)
> 
>      pci_bridge_exitfn(d);
> 
>  }
> 
>  
> 
> +static Property xio3130_upstream_props[] = {
> 
> +    DEFINE_PROP_BOOL("disable-acs", PCIESlot, disable_acs, false),
> 
> +    DEFINE_PROP_END_OF_LIST()
> 
> +};
> 
> +

I'd say prefix the property with "x-".


> 
>  static const VMStateDescription vmstate_xio3130_upstream = {
> 
>      .name = "xio3130-express-upstream-port",
> 
>      .priority = MIG_PRI_PCI_BUS,
> 
> @@ -142,6 +151,7 @@ static void xio3130_upstream_class_init(ObjectClass 
> *klass,
> void *data)
> 
>      dc->desc = "TI X3130 Upstream Port of PCI Express Switch";
> 
>      dc->reset = xio3130_upstream_reset;
> 
>      dc->vmsd = &vmstate_xio3130_upstream;
> 
> +    device_class_set_props(dc, xio3130_upstream_props);
> 
>  }
> 

Seems to lack compat machinety for existing machine types.


>  
> 
>  static const TypeInfo xio3130_upstream_info = {
> 
> -- 
> 
> 2.24.3 (Apple Git-128)
>

[RFC PATCH 2/2] kvm/kvm-all.c: listener should delay kvm_vm_ioctl to the commit phase

Instead of sending a single ioctl every time ->region_* or ->log_*
callbacks are called, "queue" all memory regions in a list that will
be emptied only when committing.

This allow the KVM kernel API to be extended and support multiple
memslots updates in a single call.

Signed-off-by: Emanuele Giuseppe Esposito 
---
 accel/kvm/kvm-all.c   | 99 ---
 include/sysemu/kvm_int.h  |  6 +++
 linux-headers/linux/kvm.h |  9 
 3 files changed, 87 insertions(+), 27 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 645f0a249a..3afa46b2ef 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -357,39 +357,40 @@ int kvm_physical_memory_addr_from_host(KVMState *s, void 
*ram,
 return ret;
 }
 
+static void kvm_memory_region_node_add(KVMMemoryListener *kml,
+   struct kvm_userspace_memory_region *mem)
+{
+MemoryRegionNode *node;
+
+node = g_malloc(sizeof(MemoryRegionNode));
+*node = (MemoryRegionNode) {
+.mem = mem,
+};
+QTAILQ_INSERT_TAIL(&kml->mem_list, node, list);
+}
+
 static int kvm_set_user_memory_region(KVMMemoryListener *kml, KVMSlot *slot, 
bool new)
 {
-KVMState *s = kvm_state;
-struct kvm_userspace_memory_region mem;
-int ret;
+struct kvm_userspace_memory_region *mem;
 
-mem.slot = slot->slot | (kml->as_id << 16);
-mem.guest_phys_addr = slot->start_addr;
-mem.userspace_addr = (unsigned long)slot->ram;
-mem.flags = slot->flags;
+mem = g_malloc(sizeof(struct kvm_userspace_memory_region));
 
-if (slot->memory_size && !new && (mem.flags ^ slot->old_flags) & 
KVM_MEM_READONLY) {
+mem->slot = slot->slot | (kml->as_id << 16);
+mem->guest_phys_addr = slot->start_addr;
+mem->userspace_addr = (unsigned long)slot->ram;
+mem->flags = slot->flags;
+
+if (slot->memory_size && !new && (mem->flags ^ slot->old_flags) &
+KVM_MEM_READONLY) {
 /* Set the slot size to 0 before setting the slot to the desired
  * value. This is needed based on KVM commit 75d61fbc. */
-mem.memory_size = 0;
-ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
-if (ret < 0) {
-goto err;
-}
-}
-mem.memory_size = slot->memory_size;
-ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, &mem);
-slot->old_flags = mem.flags;
-err:
-trace_kvm_set_user_memory(mem.slot, mem.flags, mem.guest_phys_addr,
-  mem.memory_size, mem.userspace_addr, ret);
-if (ret < 0) {
-error_report("%s: KVM_SET_USER_MEMORY_REGION failed, slot=%d,"
- " start=0x%" PRIx64 ", size=0x%" PRIx64 ": %s",
- __func__, mem.slot, slot->start_addr,
- (uint64_t)mem.memory_size, strerror(errno));
+mem->memory_size = 0;
+kvm_memory_region_node_add(kml, mem);
 }
-return ret;
+mem->memory_size = slot->memory_size;
+kvm_memory_region_node_add(kml, mem);
+slot->old_flags = mem->flags;
+return 0;
 }
 
 static int do_kvm_destroy_vcpu(CPUState *cpu)
@@ -1517,12 +1518,52 @@ static void kvm_region_add(MemoryListener *listener,
 static void kvm_region_del(MemoryListener *listener,
MemoryRegionSection *section)
 {
-KVMMemoryListener *kml = container_of(listener, KVMMemoryListener, 
listener);
+KVMMemoryListener *kml = container_of(listener, KVMMemoryListener,
+  listener);
 
 kvm_set_phys_mem(kml, section, false);
 memory_region_unref(section->mr);
 }
 
+static void kvm_begin(MemoryListener *listener)
+{
+KVMMemoryListener *kml = container_of(listener, KVMMemoryListener,
+  listener);
+assert(QTAILQ_EMPTY(&kml->mem_list));
+}
+
+static void kvm_commit(MemoryListener *listener)
+{
+KVMMemoryListener *kml = container_of(listener, KVMMemoryListener,
+  listener);
+MemoryRegionNode *node, *next;
+KVMState *s = kvm_state;
+
+QTAILQ_FOREACH_SAFE(node, &kml->mem_list, list, next) {
+struct kvm_userspace_memory_region *mem = node->mem;
+int ret;
+
+ret = kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, mem);
+
+trace_kvm_set_user_memory(mem->slot, mem->flags, mem->guest_phys_addr,
+  mem->memory_size, mem->userspace_addr, 0);
+
+if (ret < 0) {
+error_report("%s: KVM_SET_USER_MEMORY_REGION failed, slot=%d,"
+ " start=0x%" PRIx64 ": %s",
+ __func__, mem->slot,
+ (uint64_t)mem->memory_size, strerror(errno));
+}
+
+QTAILQ_REMOVE(&kml->mem_list, node, list);
+g_free(mem);
+g_free(node);
+}
+
+
+
+}
+
 static void kvm_log_sync(MemoryListener *listener,
  MemoryRegionSection *section)
 {
@@ -1664,8 +1705,12

[RFC PATCH 0/2] accel/kvm: extend kvm memory listener to support

The aim of this serie is to prepare kvm memory listener to support atomic
memslots update. In order to do that, QEMU should take care of sending all
memslot updates in a single ioctl, so that they can all be processed
atomically.

In order to do that, implement kml->begin() and kml->commit() callbacks, and
change the logic by replacing every ioctl invocation in ->region_* and ->log_*
so that the struct kvm_userspace_memory_region are queued in a linked list that
is then traversed and processed in ->commit.

Patch 1 ensures that ->region_* and ->log_* are always wrapped by ->begin and
->commit.

Emanuele Giuseppe Esposito (2):
  softmmu/memory: add missing begin/commit callback calls
  kvm/kvm-all.c: listener should delay kvm_vm_ioctl to the commit phase

 accel/kvm/kvm-all.c   | 99 ---
 include/sysemu/kvm_int.h  |  6 +++
 linux-headers/linux/kvm.h |  9 
 softmmu/memory.c  |  2 +
 4 files changed, 89 insertions(+), 27 deletions(-)

-- 
2.31.1

Re: [PATCH 4/4] hw/nvme: add MSI-x mask handlers for irqfd

2022-08-16 Thread Klaus Jensen

On Aug 11 23:37, Jinhao Fan wrote:
> When irqfd is enabled, we bypass QEMU's irq emulation and let KVM to
> directly assert the irq. However, KVM is not aware of the device's MSI-x
> masking status. Add MSI-x mask bookkeeping in NVMe emulation and
> detach the corresponding irqfd when the certain vector is masked.
> 

Did qtest work out for you for testing? If so, it would be nice to add a
simple test case as well.


signature.asc
Description: PGP signature

Re: [PATCH] [PATCH] linux-user/aarch64: Reset target data on MADV_DONTNEED

2022-08-16 Thread Laurent Vivier

Le 16/08/2022 à 10:41, Alex Bennée a écrit :

Laurent Vivier  writes:

Le 11/08/2022 à 17:18, Alex Bennée a écrit :

Laurent Vivier  writes:

Le 11/08/2022 à 13:54, Peter Maydell a écrit :

On Thu, 11 Aug 2022 at 09:29, Laurent Vivier  wrote:

Le 10/08/2022 à 22:47, Richard Henderson a écrit :

On 8/10/22 13:32, Vitaly Buka wrote:

Sorry, I only noticed today that it's not submitted.
Version is not critical for us, as we build from masters anyway.
Richard, do you know a reason to consider this critical?

On Wed, 10 Aug 2022 at 13:04, Peter Maydell mailto:peter.mayd...@linaro.org>> wrote:

   On Wed, 10 Aug 2022 at 21:00, Vitaly Buka mailto:vitalyb...@google.com>> wrote:
>
> How can we land this one?

   Pinging it a week ago rather than now would have been a good start :-(
   I think it got missed because you didn't cc the linux-user maintainer.

   Is this a critical fix for 7.1 or can we let it slip to 7.2 ?

It's unfortunate that it got missed.  It's not critical, but it would be nice, 
because support for
MADV_DONTNEED is new in 7.1 (previously, we ignored all madvise).

I'll note there are missing braces for coding style on an IF.

Laurent, do you have an objection to merging this for rc3?

No objection.

Do you want it goes via the arm branch or via the linux-user branch?

If it goes via linux-user I can run the LTP testsuite but it takes 1 day.

I think we should definitely run the LTP testsuite on it, so
taking it via linux-user probably makes more sense.

ok, applied to my linux-user-for-7.1 branch.

Running tests.

Any chance you could pick up:
Subject: [PATCH v2] linux-user: un-parent OBJECT(cpu) when
closing thread
Date: Wed,  3 Aug 2022 14:05:37 +0100
Message-Id: <20220803130537.763666-1-alex.ben...@linaro.org>
before you run the tests?

I've tested it, it works fine.

Do you plan to do a PR including it or do you want I do (there will be
only this one in mine)?

I'm going to a roll a PR today so I can include it. Shall I add a
Tested-by for you?

OK. No need to add the Tested-by, I've run generic tests, not targeted to this 
problem.

Thanks,
Laurent

Re: [PATCH 3/7] target/m68k: Honour -semihosting-config userspace=on

2022-08-16 Thread Laurent Vivier


Le 15/08/2022 à 21:02, Peter Maydell a écrit :

Honour the commandline -semihosting-config userspace=on option,
instead of never permitting userspace semihosting calls in system
emulation mode, by passing the correct value to the is_userspace
argument of semihosting_enabled(), instead of manually checking and
always forbidding semihosting if the guest is in userspace.

(Note that target/m68k doesn't support semihosting at all
in the linux-user build.)

Signed-off-by: Peter Maydell 
---
  target/m68k/op_helper.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/m68k/op_helper.c b/target/m68k/op_helper.c
index 4b3dfec1306..a96a0340506 100644
--- a/target/m68k/op_helper.c
+++ b/target/m68k/op_helper.c
@@ -203,8 +203,7 @@ static void cf_interrupt_all(CPUM68KState *env, int is_hw)
  cf_rte(env);
  return;
  case EXCP_HALT_INSN:
-if (semihosting_enabled(false)
-&& (env->sr & SR_S) != 0
+if (semihosting_enabled((env->sr & SR_S) == 0)
  && (env->pc & 3) == 0
  && cpu_lduw_code(env, env->pc - 4) == 0x4e71
  && cpu_ldl_code(env, env->pc) == 0x4e7bf000) {


Reviewed-by: Laurent Vivier

Re: [PATCH 01/22] ppc/ppc4xx: Introduce a DCR device model


On 8/16/22 11:33, BALATON Zoltan wrote:

On Tue, 16 Aug 2022, Cédric Le Goater wrote:

On 8/13/22 17:34, BALATON Zoltan wrote:

From: Cédric Le Goater 

The Device Control Registers (DCR) of on-SoC devices are accessed by
software through the use of the mtdcr and mfdcr instructions. These
are converted in transactions on a side band bus, the DCR bus, which
connects the on-SoC devices to the CPU.

Ideally, we should model these accesses with a DCR namespace and DCR
memory regions but today the DCR handlers are installed in a DCR table
under the CPU. Instead, introduce a little device model wrapper to hold
a CPU link and handle registration of DCR handlers.

The DCR device inherits from SysBus because most of these devices also
have MMIO regions and/or IRQs. Being a SysBusDevice makes things easier
to install the device model in the overall SoC.

Signed-off-by: Cédric Le Goater 


When re-sending a patch, it is a good practice to list the changes before
the Sob of the person doing the resend.

I think you only changed the ppc4xx_dcr_register prototype. Correct ?


Mostly, and the resulting rebase but maybe some small changes here and there but I think those are also just code style fixes. I did not know what you prefer with the from line 


See commit 6a54ac2a9737 for next time.


so if you're OK with keeping it I can go through it again and mark changes 
before signed-off if you think that's better.


Don't bother. I did a quick scan and didn't notice anything important.


I've also started cleaning up the sdram model, I need a bit more time for that, 
I'll probably send it as a separate series.


OK. My initial patches were good enough for 405 and bamboo. May be keep the same
basic idea.

Thanks,

C.



Thanks,

C.


Signed-off-by: BALATON Zoltan 
---
  hw/ppc/ppc4xx_devs.c    | 41 +
  include/hw/ppc/ppc4xx.h | 17 +
  2 files changed, 58 insertions(+)

diff --git a/hw/ppc/ppc4xx_devs.c b/hw/ppc/ppc4xx_devs.c
index 069b511951..f4d7ae9567 100644
--- a/hw/ppc/ppc4xx_devs.c
+++ b/hw/ppc/ppc4xx_devs.c
@@ -664,3 +664,44 @@ void ppc4xx_mal_init(CPUPPCState *env, uint8_t txcnum, 
uint8_t rxcnum,
   mal, &dcr_read_mal, &dcr_write_mal);
  }
  }
+
+/* PPC4xx_DCR_DEVICE */
+
+void ppc4xx_dcr_register(Ppc4xxDcrDeviceState *dev, int dcrn, void *opaque,
+ dcr_read_cb dcr_read, dcr_write_cb dcr_write)
+{
+    assert(dev->cpu);
+    ppc_dcr_register(&dev->cpu->env, dcrn, opaque, dcr_read, dcr_write);
+}
+
+bool ppc4xx_dcr_realize(Ppc4xxDcrDeviceState *dev, PowerPCCPU *cpu,
+    Error **errp)
+{
+    object_property_set_link(OBJECT(dev), "cpu", OBJECT(cpu), &error_abort);
+    return sysbus_realize(SYS_BUS_DEVICE(dev), errp);
+}
+
+static Property ppc4xx_dcr_properties[] = {
+    DEFINE_PROP_LINK("cpu", Ppc4xxDcrDeviceState, cpu, TYPE_POWERPC_CPU,
+ PowerPCCPU *),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void ppc4xx_dcr_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    device_class_set_props(dc, ppc4xx_dcr_properties);
+}
+
+static const TypeInfo ppc4xx_types[] = {
+    {
+    .name   = TYPE_PPC4xx_DCR_DEVICE,
+    .parent = TYPE_SYS_BUS_DEVICE,
+    .instance_size  = sizeof(Ppc4xxDcrDeviceState),
+    .class_init = ppc4xx_dcr_class_init,
+    .abstract   = true,
+    }
+};
+
+DEFINE_TYPES(ppc4xx_types)
diff --git a/include/hw/ppc/ppc4xx.h b/include/hw/ppc/ppc4xx.h
index 591e2421a3..a537a5567b 100644
--- a/include/hw/ppc/ppc4xx.h
+++ b/include/hw/ppc/ppc4xx.h
@@ -27,6 +27,7 @@
    #include "hw/ppc/ppc.h"
  #include "exec/memory.h"
+#include "hw/sysbus.h"
    void ppc4xx_sdram_banks(MemoryRegion *ram, int nr_banks,
  MemoryRegion ram_memories[],
@@ -44,4 +45,20 @@ void ppc4xx_mal_init(CPUPPCState *env, uint8_t txcnum, 
uint8_t rxcnum,
    #define TYPE_PPC4xx_PCI_HOST_BRIDGE "ppc4xx-pcihost"
  +/*
+ * Generic DCR device
+ */
+#define TYPE_PPC4xx_DCR_DEVICE "ppc4xx-dcr-device"
+OBJECT_DECLARE_SIMPLE_TYPE(Ppc4xxDcrDeviceState, PPC4xx_DCR_DEVICE);
+struct Ppc4xxDcrDeviceState {
+    SysBusDevice parent_obj;
+
+    PowerPCCPU *cpu;
+};
+
+void ppc4xx_dcr_register(Ppc4xxDcrDeviceState *dev, int dcrn, void *opaque,
+ dcr_read_cb dcr_read, dcr_write_cb dcr_write);
+bool ppc4xx_dcr_realize(Ppc4xxDcrDeviceState *dev, PowerPCCPU *cpu,
+    Error **errp);
+
  #endif /* PPC4XX_H */

Re: [PATCH 21/22] ppc4xx: Drop empty default cases


On 8/13/22 17:34, BALATON Zoltan wrote:

Remove default case labels that do nothing or only there to set a
default value that could easily be done at the variable definition
instead.


May be instead, the default case labels deserve a LOG_GUEST_ERROR or
a UNIMP or even g_assert_not_reached() ?

C.


Signed-off-by: BALATON Zoltan 
---
  hw/ppc/ppc405_boards.c |  7 +--
  hw/ppc/ppc405_uc.c | 29 +
  hw/ppc/ppc440_uc.c | 27 ---
  hw/ppc/ppc4xx_devs.c   | 31 ---
  4 files changed, 10 insertions(+), 84 deletions(-)

diff --git a/hw/ppc/ppc405_boards.c b/hw/ppc/ppc405_boards.c
index 083f12b23e..a876b4af85 100644
--- a/hw/ppc/ppc405_boards.c
+++ b/hw/ppc/ppc405_boards.c
@@ -401,7 +401,7 @@ struct Ref405epFpgaState {
  static uint64_t ref405ep_fpga_readb(void *opaque, hwaddr addr, unsigned size)
  {
  Ref405epFpgaState *fpga = opaque;
-uint32_t ret;
+uint32_t ret = 0;
  
  switch (addr) {

  case 0x0:
@@ -410,9 +410,6 @@ static uint64_t ref405ep_fpga_readb(void *opaque, hwaddr 
addr, unsigned size)
  case 0x1:
  ret = fpga->reg1;
  break;
-default:
-ret = 0;
-break;
  }
  
  return ret;

@@ -430,8 +427,6 @@ static void ref405ep_fpga_writeb(void *opaque, hwaddr addr, 
uint64_t value,
  case 0x1:
  fpga->reg1 = value;
  break;
-default:
-break;
  }
  }
  
diff --git a/hw/ppc/ppc405_uc.c b/hw/ppc/ppc405_uc.c

index b13026200f..6c84d87330 100644
--- a/hw/ppc/ppc405_uc.c
+++ b/hw/ppc/ppc405_uc.c
@@ -56,7 +56,7 @@ enum {
  static uint32_t dcr_read_pob(void *opaque, int dcrn)
  {
  Ppc405PobState *pob = opaque;
-uint32_t ret;
+uint32_t ret = 0;
  
  switch (dcrn) {

  case POB0_BEAR:
@@ -68,10 +68,6 @@ static uint32_t dcr_read_pob(void *opaque, int dcrn)
  case POB0_BESR1:
  ret = pob->besr1;
  break;
-default:
-/* Avoid gcc warning */
-ret = 0;
-break;
  }
  
  return ret;

@@ -131,7 +127,7 @@ static void ppc405_pob_class_init(ObjectClass *oc, void 
*data)
  static uint64_t opba_readb(void *opaque, hwaddr addr, unsigned size)
  {
  Ppc405OpbaState *opba = opaque;
-uint32_t ret;
+uint32_t ret = 0;
  
  switch (addr) {

  case 0x00:
@@ -140,9 +136,6 @@ static uint64_t opba_readb(void *opaque, hwaddr addr, 
unsigned size)
  case 0x01:
  ret = opba->pr;
  break;
-default:
-ret = 0x00;
-break;
  }
  
  trace_opba_readb(addr, ret);

@@ -163,8 +156,6 @@ static void opba_writeb(void *opaque, hwaddr addr, uint64_t 
value,
  case 0x01:
  opba->pr = value & 0xFF;
  break;
-default:
-break;
  }
  }
  static const MemoryRegionOps opba_ops = {
@@ -403,7 +394,7 @@ static void ocm_update_mappings(Ppc405OcmState *ocm,
  static uint32_t dcr_read_ocm(void *opaque, int dcrn)
  {
  Ppc405OcmState *ocm = opaque;
-uint32_t ret;
+uint32_t ret = 0;
  
  switch (dcrn) {

  case OCM0_ISARC:
@@ -418,9 +409,6 @@ static uint32_t dcr_read_ocm(void *opaque, int dcrn)
  case OCM0_DSACNTL:
  ret = ocm->dsacntl;
  break;
-default:
-ret = 0;
-break;
  }
  
  return ret;

@@ -556,7 +544,7 @@ static void ppc4xx_gpt_compute_timer(Ppc405GptState *gpt)
  static uint64_t ppc4xx_gpt_read(void *opaque, hwaddr addr, unsigned size)
  {
  Ppc405GptState *gpt = opaque;
-uint32_t ret;
+uint32_t ret = -1;
  int idx;
  
  trace_ppc4xx_gpt_read(addr, size);

@@ -598,9 +586,6 @@ static uint64_t ppc4xx_gpt_read(void *opaque, hwaddr addr, 
unsigned size)
  idx = (addr - 0xC0) >> 2;
  ret = gpt->mask[idx];
  break;
-default:
-ret = -1;
-break;
  }
  
  return ret;

@@ -844,7 +829,7 @@ static void ppc405ep_compute_clocks(Ppc405CpcState *cpc)
  static uint32_t dcr_read_epcpc(void *opaque, int dcrn)
  {
  Ppc405CpcState *cpc = opaque;
-uint32_t ret;
+uint32_t ret = 0;
  
  switch (dcrn) {

  case PPC405EP_CPC0_BOOT:
@@ -871,10 +856,6 @@ static uint32_t dcr_read_epcpc(void *opaque, int dcrn)
  case PPC405EP_CPC0_PCI:
  ret = cpc->pci;
  break;
-default:
-/* Avoid gcc warning */
-ret = 0;
-break;
  }
  
  return ret;

diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index 11fdb88c22..b390672bce 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -147,9 +147,6 @@ static uint32_t dcr_read_l2sram(void *opaque, int dcrn)
  case DCR_ISRAM0_DPC:
  ret = l2sram->isram0[dcrn - DCR_ISRAM0_BASE];
  break;
-
-default:
-break;
  }
  
  return ret;

@@ -304,12 +301,8 @@ static uint32_t dcr_read_cpr(void *opaque, int dcrn)
  case CPR0_AHBD:
  ret = (1 << 24);
  break;
-default:
-break;
  }
  bre

Re: [PATCH 00/22] QOMify PPC4xx devices and minor clean ups


On 8/13/22 17:34, BALATON Zoltan wrote:

Hello,

This is mased on gitlab.com/danielhb/qemu/tree/ppc-7.2

This series contains the rest of Cédric's patches modified according
my review comments and some other small clean ups I've noticed along
the way. I've kept the From line of Cédric for patches that were
originally his even though they are modified a bit. Not sure what's
the best way for this or what Cédric prefers.

The last sdram changes are not yet here because I'm still looking at
those and will come back to them but this series is ready to merge
unless there are comments that need further changes. Please let me
know what do you think.


LGTM. In case you resend, may be change the names of the models which
are now common to PPC4xx. That's minor really.

I would dig the default case labels a little more before removing
them.

Thanks,

C.




Regards,
BALATON Zoltan

BALATON Zoltan (22):
   ppc/ppc4xx: Introduce a DCR device model
   ppc/ppc405: QOM'ify CPC
   ppc/ppc405: QOM'ify GPT
   ppc/ppc405: QOM'ify OCM
   ppc/ppc405: QOM'ify GPIO
   ppc/ppc405: QOM'ify DMA
   ppc/ppc405: QOM'ify EBC
   ppc/ppc405: QOM'ify OPBA
   ppc/ppc405: QOM'ify POB
   ppc/ppc405: QOM'ify PLB
   ppc/ppc405: QOM'ify MAL
   ppc4xx: Move PLB model to ppc4xx_devs.c
   ppc4xx: Move EBC model to ppc4xx_devs.c
   ppc/ppc405: Use an embedded PPCUIC model in SoC state
   hw/intc/ppc-uic: Convert ppc-uic to a PPC4xx DCR device
   ppc/ppc405: Use an explicit I2C object
   ppc/ppc405: QOM'ify FPGA
   ppc405: Move machine specific code to ppc405_boards.c
   hw/ppc/Kconfig: Remove PPC405 dependency from sam460ex
   hw/ppc/Kconfig: Move imply before select
   ppc4xx: Drop empty default cases
   ppc/ppc4xx: Fix sdram trace events

  hw/intc/ppc-uic.c |   26 +-
  hw/ppc/Kconfig|3 +-
  hw/ppc/ppc405.h   |  182 +--
  hw/ppc/ppc405_boards.c|  360 +
  hw/ppc/ppc405_uc.c| 1071 -
  hw/ppc/ppc440_bamboo.c|7 +-
  hw/ppc/ppc440_uc.c|   27 -
  hw/ppc/ppc4xx_devs.c  |  473 +---
  hw/ppc/sam460ex.c |   37 +-
  hw/ppc/trace-events   |3 -
  hw/ppc/virtex_ml507.c |7 +-
  include/hw/intc/ppc-uic.h |6 +-
  include/hw/ppc/ppc4xx.h   |   71 ++-
  13 files changed, 1223 insertions(+), 1050 deletions(-)

Re: [PATCH 2/4] hw/nvme: add option to (de)assert irq with eventfd

2022-08-16 Thread Klaus Jensen

On Aug 11 23:37, Jinhao Fan wrote:
> When the new option 'irq-eventfd' is turned on, the IO emulation code
> signals an eventfd when it want to (de)assert an irq. The main loop
> eventfd handler does the actual irq (de)assertion.  This paves the way
> for iothread support since QEMU's interrupt emulation is not thread
> safe.
> 
> Asserting and deasseting irq with eventfd has some performance
> implications. For small queue depth it increases request latency but
> for large queue depth it effectively coalesces irqs.
> 
> Comparision (KIOPS):
> 
> QD1   4  16  64
> QEMU 38 123 210 329
> irq-eventfd  32 106 240 364
> 
> Signed-off-by: Jinhao Fan 
> ---
>  hw/nvme/ctrl.c | 89 --
>  hw/nvme/nvme.h |  4 +++
>  2 files changed, 90 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index bd3350d7e0..8a1c5ce3e1 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -7675,6 +7757,7 @@ static Property nvme_props[] = {
>  DEFINE_PROP_BOOL("use-intel-id", NvmeCtrl, params.use_intel_id, false),
>  DEFINE_PROP_BOOL("legacy-cmb", NvmeCtrl, params.legacy_cmb, false),
>  DEFINE_PROP_BOOL("ioeventfd", NvmeCtrl, params.ioeventfd, false),
> +DEFINE_PROP_BOOL("irq-eventfd", NvmeCtrl, params.irq_eventfd, false),

This option does not seem to change anything - the value is never used
;)


signature.asc
Description: PGP signature

Re: [PATCH v2] ppc/pnv: Add initial P9/10 SBE model





On 8/11/22 06:37, Nicholas Piggin wrote:

The SBE (Self Boot Engine) are on-chip microcontrollers that perform
early boot steps, as well as provide some runtime facilities (e.g.,
timer, secure register access, MPIPL). The latter facilities are
accessed mostly via a message system called SBEFIFO.

This driver provides initial emulation for the SBE runtime registers
and a very basic SBEFIFO implementation that provides the timer
command. This covers the basic SBE behaviour expected by skiboot when
booting.

Reviewed-by: Cédric Le Goater 
Signed-off-by: Nicholas Piggin 
---


Applied to gitlab.com/danielhb/qemu/tree/ppc-7.2 after fixing the following
checkpatch.pl error:


WARNING: line over 80 characters
#282: FILE: hw/ppc/pnv_sbe.c:73:
+#define SBE_HOST_RESPONSE_MASK  (PPC_BITMASK(0, 4) | 
SBE_HOST_TIMER_EXPIRY)

total: 0 errors, 2 warnings, 595 lines checked


Thanks,


Daniel




v2: use trace events instead of open coded debug print macros

  hw/ppc/meson.build |   1 +
  hw/ppc/pnv.c   |  25 +++
  hw/ppc/pnv_sbe.c   | 413 +
  hw/ppc/pnv_xscom.c |   3 +
  hw/ppc/trace-events|  11 +
  include/hw/ppc/pnv.h   |   3 +
  include/hw/ppc/pnv_sbe.h   |  55 +
  include/hw/ppc/pnv_xscom.h |  12 ++
  8 files changed, 523 insertions(+)
  create mode 100644 hw/ppc/pnv_sbe.c
  create mode 100644 include/hw/ppc/pnv_sbe.h

diff --git a/hw/ppc/meson.build b/hw/ppc/meson.build
index aa4c8e6a2e..62801923f3 100644
--- a/hw/ppc/meson.build
+++ b/hw/ppc/meson.build
@@ -46,6 +46,7 @@ ppc_ss.add(when: 'CONFIG_POWERNV', if_true: files(
'pnv_lpc.c',
'pnv_psi.c',
'pnv_occ.c',
+  'pnv_sbe.c',
'pnv_bmc.c',
'pnv_homer.c',
'pnv_pnor.c',
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index d3f77c8367..7ff1f464d3 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -1397,6 +1397,8 @@ static void pnv_chip_power9_instance_init(Object *obj)
  
  object_initialize_child(obj, "occ", &chip9->occ, TYPE_PNV9_OCC);
  
+object_initialize_child(obj, "sbe", &chip9->sbe, TYPE_PNV9_SBE);

+
  object_initialize_child(obj, "homer", &chip9->homer, TYPE_PNV9_HOMER);
  
  /* Number of PECs is the chip default */

@@ -1549,6 +1551,17 @@ static void pnv_chip_power9_realize(DeviceState *dev, 
Error **errp)
  memory_region_add_subregion(get_system_memory(), 
PNV9_OCC_SENSOR_BASE(chip),
  &chip9->occ.sram_regs);
  
+/* SBE */

+if (!qdev_realize(DEVICE(&chip9->sbe), NULL, errp)) {
+return;
+}
+pnv_xscom_add_subregion(chip, PNV9_XSCOM_SBE_CTRL_BASE,
+&chip9->sbe.xscom_ctrl_regs);
+pnv_xscom_add_subregion(chip, PNV9_XSCOM_SBE_MBOX_BASE,
+&chip9->sbe.xscom_mbox_regs);
+qdev_connect_gpio_out(DEVICE(&chip9->sbe), 0, qdev_get_gpio_in(
+  DEVICE(&chip9->psi), PSIHB9_IRQ_PSU));
+
  /* HOMER */
  object_property_set_link(OBJECT(&chip9->homer), "chip", OBJECT(chip),
   &error_abort);
@@ -1613,6 +1626,7 @@ static void pnv_chip_power10_instance_init(Object *obj)
  object_initialize_child(obj, "psi", &chip10->psi, TYPE_PNV10_PSI);
  object_initialize_child(obj, "lpc", &chip10->lpc, TYPE_PNV10_LPC);
  object_initialize_child(obj, "occ",  &chip10->occ, TYPE_PNV10_OCC);
+object_initialize_child(obj, "sbe",  &chip10->sbe, TYPE_PNV10_SBE);
  object_initialize_child(obj, "homer", &chip10->homer, TYPE_PNV10_HOMER);
  
  chip->num_pecs = pcc->num_pecs;

@@ -1754,6 +1768,17 @@ static void pnv_chip_power10_realize(DeviceState *dev, 
Error **errp)
  PNV10_OCC_SENSOR_BASE(chip),
  &chip10->occ.sram_regs);
  
+/* SBE */

+if (!qdev_realize(DEVICE(&chip10->sbe), NULL, errp)) {
+return;
+}
+pnv_xscom_add_subregion(chip, PNV10_XSCOM_SBE_CTRL_BASE,
+&chip10->sbe.xscom_ctrl_regs);
+pnv_xscom_add_subregion(chip, PNV10_XSCOM_SBE_MBOX_BASE,
+&chip10->sbe.xscom_mbox_regs);
+qdev_connect_gpio_out(DEVICE(&chip10->sbe), 0, qdev_get_gpio_in(
+  DEVICE(&chip10->psi), PSIHB9_IRQ_PSU));
+
  /* HOMER */
  object_property_set_link(OBJECT(&chip10->homer), "chip", OBJECT(chip),
   &error_abort);
diff --git a/hw/ppc/pnv_sbe.c b/hw/ppc/pnv_sbe.c
new file mode 100644
index 00..41420c47bd
--- /dev/null
+++ b/hw/ppc/pnv_sbe.c
@@ -0,0 +1,413 @@
+/*
+ * QEMU PowerPC PowerNV Emulation of some SBE behaviour
+ *
+ * Copyright (c) 2022, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the

Re: [PATCH v7 00/14] KVM: mm: fd-based approach for supporting KVM guest private memory

2022-08-16 Thread Gupta, Pankaj


Hi Chao,



Actually the current version allows you to delay the allocation to a
later time (e.g. page fault time) if you don't call fallocate() on the
private fd. fallocate() is necessary in previous versions because we
treat the existense in the fd as 'private' but in this version we track
private/shared info in KVM so we don't rely on that fact from memory
backstores.


Does this also mean reservation of guest physical memory with secure 
processor (both for SEV-SNP & TDX) will also happen at page fault time?


Do we plan to keep it this way?

Thanks,
Pankaj


Definitely the page will still be pinned once it's allocated, there is
no way to swap it out for example just with the current code. That kind
of support, if desirable, can be extended through MOVABLE flag and some
other callbacks to let feature-specific code to involve.

Re: [PATCH 2/2] hw/mips/boston: Pack fdt in fdt filter

2022-08-16 Thread Jiaxun Yang




> 2022年8月16日 01:44，Philippe Mathieu-Daudé  写道：
> 
> On 13/8/22 18:27, Jiaxun Yang wrote:
>> FDT can be awfully fat after series of modifications in fdt
>> filter. Just pack it up before add to ram.
>> Signed-off-by: Jiaxun Yang 
>> ---
>>  hw/mips/boston.c | 1 +
>>  1 file changed, 1 insertion(+)
>> diff --git a/hw/mips/boston.c b/hw/mips/boston.c
>> index 5145179951..a40f193f78 100644
>> --- a/hw/mips/boston.c
>> +++ b/hw/mips/boston.c
>> @@ -400,6 +400,7 @@ static const void *boston_fdt_filter(void *opaque, const 
>> void *fdt_orig,
>>  1, boston_memmap[BOSTON_HIGHDDR].base + ram_low_sz,
>>  1, ram_high_sz);
>>  +fdt_pack(fdt);
>>  fdt = g_realloc(fdt, fdt_totalsize(fdt));
>>  qemu_fdt_dumpdtb(fdt, fdt_sz);
>>  
> 
> Why not pack by default in qemu_fdt_dumpdtb()?

qemu_fdt_dumpdtb() is explicitly a function for debugging purpose.
Donno if it’s wise to hijack it.

Thanks.
---
Jiaxun Yang

Re: [PATCH 21/22] ppc4xx: Drop empty default cases

2022-08-16 Thread BALATON Zoltan


On Tue, 16 Aug 2022, Cédric Le Goater wrote:

On 8/13/22 17:34, BALATON Zoltan wrote:

Remove default case labels that do nothing or only there to set a
default value that could easily be done at the variable definition
instead.


May be instead, the default case labels deserve a LOG_GUEST_ERROR or
a UNIMP or even g_assert_not_reached() ?


g_assert_not_reached() is probably too much as a lot of these models may 
be incomplete so I think that could cause crashes where it works now but 
adding LOG_UNIMP could be useful to detect missing pieces.


Regards,
BALATON Zoltan


C.


Signed-off-by: BALATON Zoltan 
---
  hw/ppc/ppc405_boards.c |  7 +--
  hw/ppc/ppc405_uc.c | 29 +
  hw/ppc/ppc440_uc.c | 27 ---
  hw/ppc/ppc4xx_devs.c   | 31 ---
  4 files changed, 10 insertions(+), 84 deletions(-)

diff --git a/hw/ppc/ppc405_boards.c b/hw/ppc/ppc405_boards.c
index 083f12b23e..a876b4af85 100644
--- a/hw/ppc/ppc405_boards.c
+++ b/hw/ppc/ppc405_boards.c
@@ -401,7 +401,7 @@ struct Ref405epFpgaState {
  static uint64_t ref405ep_fpga_readb(void *opaque, hwaddr addr, unsigned 
size)

  {
  Ref405epFpgaState *fpga = opaque;
-uint32_t ret;
+uint32_t ret = 0;
switch (addr) {
  case 0x0:
@@ -410,9 +410,6 @@ static uint64_t ref405ep_fpga_readb(void *opaque, 
hwaddr addr, unsigned size)

  case 0x1:
  ret = fpga->reg1;
  break;
-default:
-ret = 0;
-break;
  }
return ret;
@@ -430,8 +427,6 @@ static void ref405ep_fpga_writeb(void *opaque, hwaddr 
addr, uint64_t value,

  case 0x1:
  fpga->reg1 = value;
  break;
-default:
-break;
  }
  }
  diff --git a/hw/ppc/ppc405_uc.c b/hw/ppc/ppc405_uc.c
index b13026200f..6c84d87330 100644
--- a/hw/ppc/ppc405_uc.c
+++ b/hw/ppc/ppc405_uc.c
@@ -56,7 +56,7 @@ enum {
  static uint32_t dcr_read_pob(void *opaque, int dcrn)
  {
  Ppc405PobState *pob = opaque;
-uint32_t ret;
+uint32_t ret = 0;
switch (dcrn) {
  case POB0_BEAR:
@@ -68,10 +68,6 @@ static uint32_t dcr_read_pob(void *opaque, int dcrn)
  case POB0_BESR1:
  ret = pob->besr1;
  break;
-default:
-/* Avoid gcc warning */
-ret = 0;
-break;
  }
return ret;
@@ -131,7 +127,7 @@ static void ppc405_pob_class_init(ObjectClass *oc, void 
*data)

  static uint64_t opba_readb(void *opaque, hwaddr addr, unsigned size)
  {
  Ppc405OpbaState *opba = opaque;
-uint32_t ret;
+uint32_t ret = 0;
switch (addr) {
  case 0x00:
@@ -140,9 +136,6 @@ static uint64_t opba_readb(void *opaque, hwaddr addr, 
unsigned size)

  case 0x01:
  ret = opba->pr;
  break;
-default:
-ret = 0x00;
-break;
  }
trace_opba_readb(addr, ret);
@@ -163,8 +156,6 @@ static void opba_writeb(void *opaque, hwaddr addr, 
uint64_t value,

  case 0x01:
  opba->pr = value & 0xFF;
  break;
-default:
-break;
  }
  }
  static const MemoryRegionOps opba_ops = {
@@ -403,7 +394,7 @@ static void ocm_update_mappings(Ppc405OcmState *ocm,
  static uint32_t dcr_read_ocm(void *opaque, int dcrn)
  {
  Ppc405OcmState *ocm = opaque;
-uint32_t ret;
+uint32_t ret = 0;
switch (dcrn) {
  case OCM0_ISARC:
@@ -418,9 +409,6 @@ static uint32_t dcr_read_ocm(void *opaque, int dcrn)
  case OCM0_DSACNTL:
  ret = ocm->dsacntl;
  break;
-default:
-ret = 0;
-break;
  }
return ret;
@@ -556,7 +544,7 @@ static void ppc4xx_gpt_compute_timer(Ppc405GptState 
*gpt)

  static uint64_t ppc4xx_gpt_read(void *opaque, hwaddr addr, unsigned size)
  {
  Ppc405GptState *gpt = opaque;
-uint32_t ret;
+uint32_t ret = -1;
  int idx;
trace_ppc4xx_gpt_read(addr, size);
@@ -598,9 +586,6 @@ static uint64_t ppc4xx_gpt_read(void *opaque, hwaddr 
addr, unsigned size)

  idx = (addr - 0xC0) >> 2;
  ret = gpt->mask[idx];
  break;
-default:
-ret = -1;
-break;
  }
return ret;
@@ -844,7 +829,7 @@ static void ppc405ep_compute_clocks(Ppc405CpcState 
*cpc)

  static uint32_t dcr_read_epcpc(void *opaque, int dcrn)
  {
  Ppc405CpcState *cpc = opaque;
-uint32_t ret;
+uint32_t ret = 0;
switch (dcrn) {
  case PPC405EP_CPC0_BOOT:
@@ -871,10 +856,6 @@ static uint32_t dcr_read_epcpc(void *opaque, int dcrn)
  case PPC405EP_CPC0_PCI:
  ret = cpc->pci;
  break;
-default:
-/* Avoid gcc warning */
-ret = 0;
-break;
  }
return ret;
diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index 11fdb88c22..b390672bce 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -147,9 +147,6 @@ static uint32_t dcr_read_l2sram(void *opaque, int dcrn)
  case DCR_ISRAM0_DPC:
  ret = l2sram->isram0[dcrn - DCR_ISRAM0_BASE];
  break;
-
-de

KVM Forum gpg key signing

Hi; we haven't had an in-person KVM Forum for a while. This seems
like a good opportunity for people who are or who expect to be
submitting pull requests to get their GPG key signed, if it's
not been signed by anybody else yet or it's a bit low on signatures.

If that's you, and you're planning to be at KVM Forum, please let
me know, and send me your key ID, key type, fingerprint, and key size
(the output of "gpg --fingerprint your-key-id" is best). Please
do this *before the 28th August* as I won't have access to a printer
after that.

Depending on numbers I will either arrange ad-hoc to meet
up with people, or else organise a larger-group "keysigning party":
https://wiki.qemu.org/KeySigningParty
Either way you'll need to bring with you:
 a) positive (government-issued) photo ID, eg a passport
 b) a printed out copy of your key info
(independently of what you send me)
 c) a pen or pencil

thanks
-- PMM

Re: [PATCH v7 00/14] KVM: mm: fd-based approach for supporting KVM guest private memory

2022-08-16 Thread Kirill A . Shutemov

On Tue, Aug 16, 2022 at 01:33:00PM +0200, Gupta, Pankaj wrote:
> Hi Chao,
> 
> > 
> > Actually the current version allows you to delay the allocation to a
> > later time (e.g. page fault time) if you don't call fallocate() on the
> > private fd. fallocate() is necessary in previous versions because we
> > treat the existense in the fd as 'private' but in this version we track
> > private/shared info in KVM so we don't rely on that fact from memory
> > backstores.
> 
> Does this also mean reservation of guest physical memory with secure
> processor (both for SEV-SNP & TDX) will also happen at page fault time?
> 
> Do we plan to keep it this way?

If you are talking about accepting memory by the guest, it is initiated by
the guest and has nothing to do with page fault time vs fallocate()
allocation of host memory. I mean acceptance happens after host memory
allocation but they are not in lockstep, acceptance can happen much later.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

[PULL 2/3] tests/avocado: add timeout to the aspeed tests

On some systems the test can hang. At least defining a timeout stops
it from hanging forever.

Signed-off-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20220811151413.3350684-7-alex.ben...@linaro.org>

diff --git a/tests/avocado/machine_aspeed.py b/tests/avocado/machine_aspeed.py
index b4e35a3d07..c54da0fd8f 100644
--- a/tests/avocado/machine_aspeed.py
+++ b/tests/avocado/machine_aspeed.py
@@ -40,6 +40,8 @@ def test_ast1030_zephyros(self):
 
 class AST2x00Machine(QemuSystemTest):
 
+timeout = 90
+
 def wait_for_console_pattern(self, success_message, vm=None):
 wait_for_console_pattern(self, success_message,
  failure_message='Kernel panic - not syncing',
-- 
2.30.2

[PULL for 7.1 0/3] memory leak and testing tweaks

The following changes since commit d102b8162a1e5fe8288d4d5c01801ce6536ac2d1:

  Merge tag 'pull-la-20220814' of https://gitlab.com/rth7680/qemu into staging 
(2022-08-14 08:48:11 -0500)

are available in the Git repository at:

  https://github.com/stsquad/qemu.git tags/pull-for-7.1-fixes-160822-1

for you to fetch changes up to 65711f9a87874a9ec61108c6009f8baec07c8b0d:

  tests/avocado: apply a band aid to aspeed-evb login (2022-08-16 09:57:12 
+0100)


A few small fixes:

  - properly un-parent OBJECT(cpu) when closing -user thread
  - add missing timeout to aspeed tests
  - reduce raciness of login: prompt handling for aspeed tests


Alex Bennée (3):
  linux-user: un-parent OBJECT(cpu) when closing thread
  tests/avocado: add timeout to the aspeed tests
  tests/avocado: apply a band aid to aspeed-evb login

 linux-user/syscall.c| 13 +++--
 tests/avocado/machine_aspeed.py |  4 
 2 files changed, 11 insertions(+), 6 deletions(-)


-- 
2.30.2

[PULL 1/3] linux-user: un-parent OBJECT(cpu) when closing thread

While forcing the CPU to unrealize by hand does trigger the clean-up
code we never fully free resources because refcount never reaches
zero. This is because QOM automatically added objects without an
explicit parent to /unattached/, incrementing the refcount.

Instead of manually triggering unrealization just unparent the object
and let the device machinery deal with that for us.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/866
Signed-off-by: Alex Bennée 
Reviewed-by: Laurent Vivier 
Message-Id: <20220811151413.3350684-2-alex.ben...@linaro.org>

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index f409121202..bfdd60136b 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8594,7 +8594,13 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int 
num, abi_long arg1,
 if (CPU_NEXT(first_cpu)) {
 TaskState *ts = cpu->opaque;
 
-object_property_set_bool(OBJECT(cpu), "realized", false, NULL);
+if (ts->child_tidptr) {
+put_user_u32(0, ts->child_tidptr);
+do_sys_futex(g2h(cpu, ts->child_tidptr),
+ FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
+}
+
+object_unparent(OBJECT(cpu));
 object_unref(OBJECT(cpu));
 /*
  * At this point the CPU should be unrealized and removed
@@ -8604,11 +8610,6 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int 
num, abi_long arg1,
 
 pthread_mutex_unlock(&clone_lock);
 
-if (ts->child_tidptr) {
-put_user_u32(0, ts->child_tidptr);
-do_sys_futex(g2h(cpu, ts->child_tidptr),
- FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
-}
 thread_cpu = NULL;
 g_free(ts);
 rcu_unregister_thread();
-- 
2.30.2

[PULL 3/3] tests/avocado: apply a band aid to aspeed-evb login

This is really a limitation of the underlying console code which
doesn't allow us to detect the login: and following "#" prompts
because it reads input line wise. By adding a small delay we ensure
that the login prompt has appeared so we don't accidentally spaff the
shell commands to a confused getty in the guest.

Signed-off-by: Alex Bennée 
Reviewed-by: Cédric Le Goater 
Acked-by: John Snow 
Message-Id: <20220811151413.3350684-8-alex.ben...@linaro.org>

diff --git a/tests/avocado/machine_aspeed.py b/tests/avocado/machine_aspeed.py
index c54da0fd8f..65d38f4efa 100644
--- a/tests/avocado/machine_aspeed.py
+++ b/tests/avocado/machine_aspeed.py
@@ -101,7 +101,9 @@ def do_test_arm_aspeed_buidroot_start(self, image, cpu_id):
 self.wait_for_console_pattern('Starting kernel ...')
 self.wait_for_console_pattern('Booting Linux on physical CPU ' + 
cpu_id)
 self.wait_for_console_pattern('lease of 10.0.2.15')
+# the line before login:
 self.wait_for_console_pattern('Aspeed EVB')
+time.sleep(0.1)
 exec_command(self, 'root')
 time.sleep(0.1)
 
-- 
2.30.2

Re: [PULL 2/3] tests/avocado: add timeout to the aspeed tests

On Tue, 16 Aug 2022 at 13:26, Alex Bennée  wrote:
>
> On some systems the test can hang. At least defining a timeout stops
> it from hanging forever.

Aha. Yeah, I've seen this test hang forever sometimes.

Is there some place (in the superclass??) that we can put a
default timeout that applies to *all* avocado tests, so we
don't have the risk of forgetting it in a particular test?

-- PMM

Re: [PATCH v10 18/21] job.c: enable job lock/unlock and remove Aiocontext locks




Am 27/07/2022 um 17:53 schrieb Vladimir Sementsov-Ogievskiy:
>>    * job_lock:
>> @@ -672,7 +673,7 @@ void job_user_cancel_locked(Job *job, bool force,
>> Error **errp);
>>    * Returns the return value from the job if the job actually completed
>>    * during the call, or -ECANCELED if it was canceled.
>>    *
>> - * Callers must hold the AioContext lock of job->aio_context.
>> + * Called with job_lock held.
> 
> That's wrong, it should be called with job_lock not held :)
> 
>>    */
>>   int job_cancel_sync(Job *job, bool force);
>>   @@ -697,8 +698,7 @@ void job_cancel_sync_all(void);
>>    * function).
>>    *
>>    * Returns the return value from the job.
>> - *
>> - * Callers must hold the AioContext lock of job->aio_context.
>> + * Called with job_lock held.
> 
> and this,
> 
>>    */
>>   int job_complete_sync(Job *job, Error **errp);
>>   @@ -734,7 +734,7 @@ void job_dismiss_locked(Job **job, Error **errp);
>>    * Returns 0 if the job is successfully completed, -ECANCELED if the
>> job was
>>    * cancelled before completing, and -errno in other error cases.
>>    *
>> - * Callers must hold the AioContext lock of job->aio_context.
>> + * Called with job_lock held.
> 
> and this.

Well, except for this part here :) You're right here.

Re: [PATCH v7 00/14] KVM: mm: fd-based approach for supporting KVM guest private memory

2022-08-16 Thread Gupta, Pankaj





Actually the current version allows you to delay the allocation to a
later time (e.g. page fault time) if you don't call fallocate() on the
private fd. fallocate() is necessary in previous versions because we
treat the existense in the fd as 'private' but in this version we track
private/shared info in KVM so we don't rely on that fact from memory
backstores.


Does this also mean reservation of guest physical memory with secure
processor (both for SEV-SNP & TDX) will also happen at page fault time?

Do we plan to keep it this way?


If you are talking about accepting memory by the guest, it is initiated by
the guest and has nothing to do with page fault time vs fallocate()
allocation of host memory. I mean acceptance happens after host memory
allocation but they are not in lockstep, acceptance can happen much later.


No, I meant reserving guest physical memory range from hypervisor e.g 
with RMPUpdate for SEV-SNP or equivalent at TDX side (PAMTs?).


Thanks,
Pankaj

Re: [PATCH v3 0/8] parallels: Refactor the code of images checks and fix a bug

2022-08-16 Thread Denis V. Lunev


On 15.08.2022 11:02, Alexander Ivanov wrote:

Fix image inflation when offset in BAT is out of image.

Replace whole BAT syncing by flushing only dirty blocks.

Move all the checks outside the main check function in
separate functions

Use WITH_QEMU_LOCK_GUARD for simplier code.


Alexander Ivanov (8):
   parallels: Out of image offset in BAT leads to image inflation
   parallels: create parallels_set_bat_entry_helper() to assign BAT value
   parallels: Use generic infrastructure for BAT writing in
 parallels_co_check()
   parallels: Move check of unclean image to a separate function
   parallels: Move check of cluster outside image to a separate function
   parallels: Move check of leaks to a separate function
   parallels: Move statistic collection to a separate function
   parallels: Replace qemu_co_mutex_lock by WITH_QEMU_LOCK_GUARD

  block/parallels.c | 188 --
  1 file changed, 132 insertions(+), 56 deletions(-)


Reviewed-by: Denis V. Lunev 

I am happy with this code now.

Stefan, Vova,

can we take this to block-next via one of your trees?

Den

Re: [PATCH v10 17/21] blockjob: protect iostatus field in BlockJob struct




Am 27/07/2022 um 17:29 schrieb Vladimir Sementsov-Ogievskiy:
> On 7/25/22 10:38, Emanuele Giuseppe Esposito wrote:
>> iostatus is the only field (together with .job) that needs
>> protection using the job mutex.
>>
>> It is set in the main loop (GLOBAL_STATE functions) but read
>> in I/O code (block_job_error_action).
> 
> Hmm, block_job_error_action doesn't read iostatus..

block_job_error_action -> block_job_iostatus_set_err_locked -> reads
iostatus

>  
> Also, iostatus is read by by mirror_run, which is not protected.

Will do, thanks

Emanuele

>>
>> In order to protect it, change block_job_iostatus_set_err
>> to block_job_iostatus_set_err_locked(), always called under
>> job lock.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito 
>> ---
>>   blockjob.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/blockjob.c b/blockjob.c
>> index 0663faee2c..448bdb5a53 100644
>> --- a/blockjob.c
>> +++ b/blockjob.c
>> @@ -363,7 +363,8 @@ BlockJobInfo *block_job_query(BlockJob *job, Error
>> **errp)
>>   return block_job_query_locked(job, errp);
>>   }
>>   -static void block_job_iostatus_set_err(BlockJob *job, int error)
>> +/* Called with job lock held */
>> +static void block_job_iostatus_set_err_locked(BlockJob *job, int error)
>>   {
>>   if (job->iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
>>   job->iostatus = error == ENOSPC ?
>> BLOCK_DEVICE_IO_STATUS_NOSPACE :
>> @@ -577,8 +578,8 @@ BlockErrorAction block_job_error_action(BlockJob
>> *job, BlockdevOnError on_err,
>>    */
>>   job->job.user_paused = true;
>>   }
>> +    block_job_iostatus_set_err_locked(job, error);
>>   }
>> -    block_job_iostatus_set_err(job, error);
>>   }
>>   return action;
>>   }
> 
>

Re: [PATCH v2] xio3130_upstream: Add ACS (Access Control Services) capability

2022-08-16 Thread Paul Schlacter

On Tue, Aug 16, 2022 at 6:11 PM Michael S. Tsirkin  wrote:
>
> On Tue, Aug 16, 2022 at 05:16:38PM +0800, Paul Schlacter wrote:
> > v1 -> v2:
> > - Allow ACS to be disabled.
> > - Suggested by Michael S. Tsirkin, use disable-acs to set property.
> >
> > v1:
> > - Add ACS (Access Control Services) capability.
>
> changelog generally after ---
>
> >
> > If it is a pcie device, check that all devices on the path from
>
>
> Hmm I don't see any checks on a path. what does this refer to?

pci_acs_path_enabled, this function in the Linux kernel,
it means that if the device is a PCIe device,
check the path from the device to the root complex. If ACS is all enabled,
the device will become an iommu_group.
acs determine whether it is a separate iommu_group.
>
> >
> > the device to the root complex have ACS enabled, and then the
> >
> > device will become an iommu_group.
> >
> > it will have the effect of isolation
> >
> >
> >
> > Signed-off-by: wangliang 
> >
> > Signed-off-by: wangliang 
> >
>
> why two signatures?
>
> >
> > ---
>
> >
> >  hw/pci-bridge/xio3130_upstream.c | 12 +++-
> >
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> >
>
> Patch has corrupted whitespace.
>
> >
> > diff --git a/hw/pci-bridge/xio3130_upstream.c b/hw/pci-bridge/
> > xio3130_upstream.c
> >
> > index 2df95b..5433d06fb3 100644
> >
> > --- a/hw/pci-bridge/xio3130_upstream.c
> >
> > +++ b/hw/pci-bridge/xio3130_upstream.c
> >
> > @@ -24,6 +24,7 @@
> >
> >  #include "hw/pci/msi.h"
> >
> >  #include "hw/pci/pcie.h"
> >
> >  #include "hw/pci/pcie_port.h"
> >
> > +#include "hw/qdev-properties.h"
> >
> >  #include "migration/vmstate.h"
> >
> >  #include "qemu/module.h"
> >
> >
> >
> > @@ -59,6 +60,7 @@ static void xio3130_upstream_reset(DeviceState *qdev)
> >
> >  static void xio3130_upstream_realize(PCIDevice *d, Error **errp)
> >
> >  {
> >
> >  PCIEPort *p = PCIE_PORT(d);
> >
> > +PCIESlot *s = PCIE_SLOT(d);
> >
> >  int rc;
> >
> >
> >
> >  pci_bridge_initfn(d, TYPE_PCIE_BUS);
> >
> > @@ -94,7 +96,9 @@ static void xio3130_upstream_realize(PCIDevice *d, Error
> > **errp)
> >
> >  goto err;
> >
> >  }
> >
> >
> >
> > -pcie_acs_init(d, XIO3130_ACS_OFFSET);
> >
> > +if (!s->disable_acs) {
> >
> > +pcie_acs_init(d, XIO3130_ACS_OFFSET);
> >
> > +}
> >
> >  return;
> >
> >
> >
> >  err:
> >
> > @@ -113,6 +117,11 @@ static void xio3130_upstream_exitfn(PCIDevice *d)
> >
> >  pci_bridge_exitfn(d);
> >
> >  }
> >
> >
> >
> > +static Property xio3130_upstream_props[] = {
> >
> > +DEFINE_PROP_BOOL("disable-acs", PCIESlot, disable_acs, false),
> >
> > +DEFINE_PROP_END_OF_LIST()
> >
> > +};
> >
> > +
>
> I'd say prefix the property with "x-".

Do you want me to change it to this?
DEFINE_PROP_BOOL("x-disable-acs", PCIESlot, disable_acs, false),

disable_acs is a field in PCIESlot, which is also used by other code.
It may not be good to modify this field
>
>
> >
> >  static const VMStateDescription vmstate_xio3130_upstream = {
> >
> >  .name = "xio3130-express-upstream-port",
> >
> >  .priority = MIG_PRI_PCI_BUS,
> >
> > @@ -142,6 +151,7 @@ static void xio3130_upstream_class_init(ObjectClass 
> > *klass,
> > void *data)
> >
> >  dc->desc = "TI X3130 Upstream Port of PCI Express Switch";
> >
> >  dc->reset = xio3130_upstream_reset;
> >
> >  dc->vmsd = &vmstate_xio3130_upstream;
> >
> > +device_class_set_props(dc, xio3130_upstream_props);
> >
> >  }
> >
>
> Seems to lack compat machinety for existing machine types.

DEFINE_PROP_BOOL("x-disable-acs", PCIESlot, disable_acs, true),
If I change the default value to true, can I not add compat machine
>
>
> >
> >
> >  static const TypeInfo xio3130_upstream_info = {
> >
> > --
> >
> > 2.24.3 (Apple Git-128)
> >
>

Re: [PATCH v10 18/21] job.c: enable job lock/unlock and remove Aiocontext locks



> 
>>   }
>> @@ -501,8 +481,12 @@ void job_unref_locked(Job *job)>  
>> assert(!job->txn);
>>     if (job->driver->free) {
>> +    AioContext *aio_context = job->aio_context;
>>   job_unlock();
>> +    /* FIXME: aiocontext lock is required because cb calls
>> blk_unref */
>> +    aio_context_acquire(aio_context);
>>   job->driver->free(job);
>> +    aio_context_release(aio_context);
> 
> So, job_unref_locked() must be called with aio_context not locked,
> otherwise we dead-lock here? That should be documented in function
> declaration comment.
> 
> For example in qemu-img.c in run_block_job() we do exactly that: call
> job_unref_locked()  inside aio-context lock critical seaction..

No, job_unref_locked has also status and refcnt and all the other fields
that need to be protectd. Only free must be called without job lock held.

> 
> 
>>   job_lock();
>>   }
>>   @@ -581,21 +565,17 @@ void job_enter_cond_locked(Job *job,
>> bool(*fn)(Job *job))
>>   return;
>>   }
>>   -    real_job_lock();
>>   if (job->busy) {
>> -    real_job_unlock();
>>   return;
>>   }
>>     if (fn && !fn(job)) {
>> -    real_job_unlock();
>>   return;
>>   }
>>     assert(!job->deferred_to_main_loop);
>>   timer_del(&job->sleep_timer);
>>   job->busy = true;
>> -    real_job_unlock();
>>   job_unlock();
>>   aio_co_wake(job->co);
>>   job_lock();
>> @@ -626,13 +606,11 @@ static void coroutine_fn job_do_yield_locked(Job
>> *job, uint64_t ns)
>>   {
>>   AioContext *next_aio_context;
>>   -    real_job_lock();
>>   if (ns != -1) {
>>   timer_mod(&job->sleep_timer, ns);
>>   }
>>   job->busy = false;
>>   job_event_idle_locked(job);
>> -    real_job_unlock();
>>   job_unlock();
>>   qemu_coroutine_yield();
>>   job_lock();
>> @@ -922,6 +900,7 @@ static void job_clean(Job *job)
>>   static int job_finalize_single_locked(Job *job)
>>   {
>>   int job_ret;
>> +    AioContext *ctx = job->aio_context;
>>     assert(job_is_completed_locked(job));
>>   @@ -929,6 +908,7 @@ static int job_finalize_single_locked(Job *job)
>>   job_update_rc_locked(job);
>>     job_unlock();
>> +    aio_context_acquire(ctx);
> 
> Hmm, and this function and all its callers now should be called with
> aio-context lock not locked?

Why not leave it as it is?
> 
> For example job_exit is scheduled as as BH. Aren't BHs called with
> aio-context lock held?

I see no aiocontext call in aio_bh_schedule_oneshot and callees...

So summing up, no, I don't think I will apply your suggestions for this
patch here (assume the opposite for all the others).

Emanuele
> 
>>     if (!job->ret) {
>>   job_commit(job);
>> @@ -937,6 +917,7 @@ static int job_finalize_single_locked(Job *job)
>>   }
>>   job_clean(job);
>>   +    aio_context_release(ctx);
>>   job_lock();
>>     if (job->cb) {
> 
> [..]
> 
>

Re: [PULL 2/3] tests/avocado: add timeout to the aspeed tests

Peter Maydell  writes:

> On Tue, 16 Aug 2022 at 13:26, Alex Bennée  wrote:
>>
>> On some systems the test can hang. At least defining a timeout stops
>> it from hanging forever.
>
> Aha. Yeah, I've seen this test hang forever sometimes.
>
> Is there some place (in the superclass??) that we can put a
> default timeout that applies to *all* avocado tests, so we
> don't have the risk of forgetting it in a particular test?

It's a bit muddy. Most tests are sub-classed on LinuxTest which does
define a default timeout:

  class LinuxTest(LinuxSSHMixIn, QemuSystemTest):
  """Facilitates having a cloud-image Linux based available.

  For tests that indent to interact with guests, this is a better choice
  to start with than the more vanilla `QemuSystemTest` class.
  """

  timeout = 900
  distro = None
  username = 'root'
  password = 'password'
  smp = '2'
  memory = '1024'

However the aspeed tests are directly derived from QemuSystemTest.
Perhaps we should just move the timeout down to that or maybe
QemuBaseTest?

>
> -- PMM

-- 
Alex Bennée

[RFC PATCH] tests/avocado: push default timeout to QemuBaseTest

All of the QEMU tests eventually end up derrived from this class. Move
the default timeout from LinuxTest to ensure we catch them all.

Signed-off-by: Alex Bennée 
---
 tests/avocado/avocado_qemu/__init__.py | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tests/avocado/avocado_qemu/__init__.py 
b/tests/avocado/avocado_qemu/__init__.py
index ed4853c805..9d17a287cf 100644
--- a/tests/avocado/avocado_qemu/__init__.py
+++ b/tests/avocado/avocado_qemu/__init__.py
@@ -227,6 +227,10 @@ def exec_command_and_wait_for_pattern(test, command,
 _console_interaction(test, success_message, failure_message, command + 
'\r')
 
 class QemuBaseTest(avocado.Test):
+
+# default timeout for all tests, can be overridden
+timeout = 900
+
 def _get_unique_tag_val(self, tag_name):
 """
 Gets a tag value, if unique for a key
@@ -512,7 +516,6 @@ class LinuxTest(LinuxSSHMixIn, QemuSystemTest):
 to start with than the more vanilla `QemuSystemTest` class.
 """
 
-timeout = 900
 distro = None
 username = 'root'
 password = 'password'
-- 
2.30.2

Re: qemu-system-aarch64: Failed to retrieve host CPU features

On Sat, 13 Aug 2022 at 14:32, Marc Zyngier  wrote:
> But we probably need to handle EINTR when creating the mini VM.

It's easy enough to add a retry-on-EINTR loop to the KVM_CREATE_VM
ioctl in the target/arm/ code. But do we need to do that more
widely ? At the moment QEMU seems to assume that KVM ioctls
will never fail EINTR except for the one special-cased
KVM_CREATE_VM, plus (more obviously) KVM_RUN...

thanks
-- PMM

Re: [PULL 2/3] tests/avocado: add timeout to the aspeed tests

On Tue, 16 Aug 2022 at 14:34, Alex Bennée  wrote:
> Peter Maydell  writes:
> > Is there some place (in the superclass??) that we can put a
> > default timeout that applies to *all* avocado tests, so we
> > don't have the risk of forgetting it in a particular test?
>
> It's a bit muddy. Most tests are sub-classed on LinuxTest which does
> define a default timeout:
>
>   class LinuxTest(LinuxSSHMixIn, QemuSystemTest):
>   """Facilitates having a cloud-image Linux based available.
>
>   For tests that indent to interact with guests, this is a better choice
>   to start with than the more vanilla `QemuSystemTest` class.
>   """
>
>   timeout = 900
>   distro = None
>   username = 'root'
>   password = 'password'
>   smp = '2'
>   memory = '1024'
>
> However the aspeed tests are directly derived from QemuSystemTest.
> Perhaps we should just move the timeout down to that or maybe
> QemuBaseTest?

Ideally, we should do it at whatever level ensures it is applied
to every single test that 'check-avocado' runs, regardless of how
the test was written. "QemuBaseTest" still sounds a bit higher
than the absolute basic "this is a test" level, but maybe that's
the lowest level we have access to?

thanks
-- PMM

Re: [RFC v3 7/8] blkio: implement BDRV_REQ_REGISTERED_BUF optimization

2022-08-16 Thread Stefan Hajnoczi

On Tue, Jul 12, 2022 at 04:28:02PM +0200, Stefano Garzarella wrote:
> On Fri, Jul 08, 2022 at 05:17:36AM +0100, Stefan Hajnoczi wrote:
> > Avoid bounce buffers when QEMUIOVector elements are within previously
> > registered bdrv_register_buf() buffers.
> > 
> > The idea is that emulated storage controllers will register guest RAM
> > using bdrv_register_buf() and set the BDRV_REQ_REGISTERED_BUF on I/O
> > requests. Therefore no blkio_map_mem_region() calls are necessary in the
> > performance-critical I/O code path.
> > 
> > This optimization doesn't apply if the I/O buffer is internally
> > allocated by QEMU (e.g. qcow2 metadata). There we still take the slow
> > path because BDRV_REQ_REGISTERED_BUF is not set.
> > 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> > block/blkio.c | 104 --
> > 1 file changed, 101 insertions(+), 3 deletions(-)
> > 
> > diff --git a/block/blkio.c b/block/blkio.c
> > index 7fbdbd7fae..37d593a20c 100644
> > --- a/block/blkio.c
> > +++ b/block/blkio.c
> > @@ -1,7 +1,9 @@
> > #include "qemu/osdep.h"
> > #include 
> > #include "block/block_int.h"
> > +#include "exec/memory.h"
> > #include "qapi/error.h"
> > +#include "qemu/error-report.h"
> > #include "qapi/qmp/qdict.h"
> > #include "qemu/module.h"
> > 
> > @@ -28,6 +30,9 @@ typedef struct {
> > 
> > /* Can we skip adding/deleting blkio_mem_regions? */
> > bool needs_mem_regions;
> > +
> > +/* Are file descriptors necessary for blkio_mem_regions? */
> > +bool needs_mem_region_fd;
> > } BDRVBlkioState;
> > 
> > static void blkio_aiocb_complete(BlkioAIOCB *acb, int ret)
> > @@ -198,6 +203,8 @@ static BlockAIOCB *blkio_aio_preadv(BlockDriverState 
> > *bs, int64_t offset,
> > BlockCompletionFunc *cb, void *opaque)
> > {
> > BDRVBlkioState *s = bs->opaque;
> > +bool needs_mem_regions =
> > +s->needs_mem_regions && !(flags & BDRV_REQ_REGISTERED_BUF);
> > struct iovec *iov = qiov->iov;
> > int iovcnt = qiov->niov;
> > BlkioAIOCB *acb;
> > @@ -206,7 +213,7 @@ static BlockAIOCB *blkio_aio_preadv(BlockDriverState 
> > *bs, int64_t offset,
> > 
> > acb = blkio_aiocb_get(bs, cb, opaque);
> > 
> > -if (s->needs_mem_regions) {
> > +if (needs_mem_regions) {
> > if (blkio_aiocb_init_mem_region_locked(acb, bytes) < 0) {
> > qemu_aio_unref(&acb->common);
> > return NULL;
> > @@ -230,6 +237,8 @@ static BlockAIOCB *blkio_aio_pwritev(BlockDriverState 
> > *bs, int64_t offset,
> > {
> > uint32_t blkio_flags = (flags & BDRV_REQ_FUA) ? BLKIO_REQ_FUA : 0;
> > BDRVBlkioState *s = bs->opaque;
> > +bool needs_mem_regions =
> > +s->needs_mem_regions && !(flags & BDRV_REQ_REGISTERED_BUF);
> > struct iovec *iov = qiov->iov;
> > int iovcnt = qiov->niov;
> > BlkioAIOCB *acb;
> > @@ -238,7 +247,7 @@ static BlockAIOCB *blkio_aio_pwritev(BlockDriverState 
> > *bs, int64_t offset,
> > 
> > acb = blkio_aiocb_get(bs, cb, opaque);
> > 
> > -if (s->needs_mem_regions) {
> > +if (needs_mem_regions) {
> > if (blkio_aiocb_init_mem_region_locked(acb, bytes) < 0) {
> > qemu_aio_unref(&acb->common);
> > return NULL;
> > @@ -324,6 +333,80 @@ static void blkio_io_unplug(BlockDriverState *bs)
> > }
> > }
> > 
> > +static void blkio_register_buf(BlockDriverState *bs, void *host, size_t 
> > size)
> > +{
> > +BDRVBlkioState *s = bs->opaque;
> > +int ret;
> > +struct blkio_mem_region region = (struct blkio_mem_region){
> > +.addr = host,
> > +.len = size,
> > +.fd = -1,
> > +};
> > +
> > +if (((uintptr_t)host | size) % s->mem_region_alignment) {
> > +error_report_once("%s: skipping unaligned buf %p with size %zu",
> > +  __func__, host, size);
> > +return; /* skip unaligned */
> > +}
> > +
> > +/* Attempt to find the fd for a MemoryRegion */
> > +if (s->needs_mem_region_fd) {
> > +int fd = -1;
> > +ram_addr_t offset;
> > +MemoryRegion *mr;
> > +
> > +/*
> > + * bdrv_register_buf() is called with the BQL held so mr lives at 
> > least
> > + * until this function returns.
> > + */
> > +mr = memory_region_from_host(host, &offset);
> > +if (mr) {
> > +fd = memory_region_get_fd(mr);
> 
> If s->needs_mem_region_fd is true, memory_region_get_fd() crashes I think
> because mr->ram_block is not yet set, indeed from the stack trace
> blkio_register_buf() is called inside qemu_ram_alloc_resizeable(), and its
> result is used to set mr->ram_block in memory_region_init_resizeable_ram():
> 
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x56235bf1f7a3 in memory_region_get_fd (mr=) at 
> ../softmmu/memory.c:2309
> #1  0x56235c07e54d in blkio_register_buf (bs=, 
> host=0x7f824e20, size=2097152)
> at ../block/blkio.c:364
> #2  0x56235c0246c6 in bdrv_register_buf (b

Re: [RFC v3 8/8] virtio-blk: use BDRV_REQ_REGISTERED_BUF optimization hint

2022-08-16 Thread Stefan Hajnoczi

On Thu, Jul 14, 2022 at 12:16:16PM +0200, Hanna Reitz wrote:
> On 08.07.22 06:17, Stefan Hajnoczi wrote:
> > Register guest RAM using BlockRAMRegistrar and set the
> > BDRV_REQ_REGISTERED_BUF flag so block drivers can optimize memory
> > accesses in I/O requests.
> > 
> > This is for vdpa-blk, vhost-user-blk, and other I/O interfaces that rely
> > on DMA mapping/unmapping.
> > 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> >   include/hw/virtio/virtio-blk.h |  2 ++
> >   hw/block/virtio-blk.c  | 13 +
> >   2 files changed, 11 insertions(+), 4 deletions(-)
> 
> Seems fair, but as said on patch 5, I’m quite wary of “register guest RAM”. 
> How can we guarantee that it won’t be too fragmented to be registerable with
> either nvme.c or blkio.c?

We can't guarantee it. blkio instances have a maximum number of mappings
and we might exceed it. This patch doesn't have a smart solution.

Smart solutions are possible, but I haven't had time to work on one yet.
It is necessary to keep track of which mappings are referenced by
in-flight requests. When the maximum number of mappings is hit, a
mapping that currently has no references can be evicted to make space.
When the maximum number of mappings is reached by in-flight requests
then new requests may have to wait.

Until we hit the maximum number of mappings in the real world this
doesn't matter.

Stefan

signature.asc
Description: PGP signature

Re: [PATCH v10 10/21] block/mirror.c: use of job helpers in drivers to avoid TOC/TOU




Am 04/08/2022 um 18:35 schrieb Kevin Wolf:
> Am 25.07.2022 um 09:38 hat Emanuele Giuseppe Esposito geschrieben:
>> Once job lock is used and aiocontext is removed, mirror has
>> to perform job operations under the same critical section,
>> using the helpers prepared in previous commit.
>>
>> Note: at this stage, job_{lock/unlock} and job lock guard macros
>> are *nop*.
>>
>> Reviewed-by: Vladimir Sementsov-Ogievskiy 
>> Reviewed-by: Stefan Hajnoczi 
>> Signed-off-by: Emanuele Giuseppe Esposito 
> 
> Can you explain in the commit message what the TOC/TOU case is that this
> patch is addressing? It's not obvious to me why you picked exactly these
> places to add locking.

Well after thinking about it, mentioning TOC/TOU doesn't really make
sense here. I'll remove the "to avoid TOC/TOU" in the title.

> 
>> diff --git a/block/mirror.c b/block/mirror.c
>> index d8ecb9efa2..b38676e19d 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -654,9 +654,13 @@ static int mirror_exit_common(Job *job)
>>  BlockDriverState *target_bs;
>>  BlockDriverState *mirror_top_bs;
>>  Error *local_err = NULL;
>> -bool abort = job->ret < 0;
>> +bool abort;
>>  int ret = 0;
>>  
>> +WITH_JOB_LOCK_GUARD() {
>> +abort = job->ret < 0;
>> +}
> 
> This is the most mysterious hunk to me. The only thing that should
> modify job->ret is the caller of this function anyway, but let's assume
> for a moment that another thread could write to it.
> 
> Then why is it only important that we hold the lock when we're reading
> the value, but not any more when we are actually using it? And what is
> the TOC/TOU that this fixes?

No TOC/TOU, no sense for this fix too. I'll remove this hunk too.

Emanuele
> 
> Kevin
>

Re: [PATCH v10 11/21] jobs: group together API calls under the same job lock




Am 04/08/2022 um 19:10 schrieb Kevin Wolf:
> Am 25.07.2022 um 09:38 hat Emanuele Giuseppe Esposito geschrieben:
>> Now that the API offers also _locked() functions, take advantage
>> of it and give also the caller control to take the lock and call
>> _locked functions.
>>
>> This makes sense especially when we have for loops, because it
>> makes no sense to have:
>>
>> for(job = job_next(); ...)
>>
>> where each job_next() takes the lock internally.
>> Instead we want
>>
>> JOB_LOCK_GUARD();
>> for(job = job_next_locked(); ...)
>>
>> In addition, protect also direct field accesses, by either creating a
>> new critical section or widening the existing ones.
> 
> "In addition" sounds like it should be a separate patch. I was indeed
> surprised when after a few for loops where you just pulled the existing
> locking up a bit, I saw some hunks that add completely new locking.

Would it be okay if we don't split it in two? There would be two
microscopical patches.

> 
>> Note: at this stage, job_{lock/unlock} and job lock guard macros
>> are *nop*.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito 
>> ---
>>  block.c| 17 ++---
>>  blockdev.c | 12 +---
>>  blockjob.c | 35 ++-
>>  job-qmp.c  |  4 +++-
>>  job.c  |  7 +--
>>  monitor/qmp-cmds.c |  7 +--
>>  qemu-img.c | 37 +
>>  7 files changed, 75 insertions(+), 44 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 2c0080..7559965dbc 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -4978,8 +4978,8 @@ static void bdrv_close(BlockDriverState *bs)
>>  
>>  void bdrv_close_all(void)
>>  {
>> -assert(job_next(NULL) == NULL);
>>  GLOBAL_STATE_CODE();
>> +assert(job_next(NULL) == NULL);
>>  
>>  /* Drop references from requests still in flight, such as canceled block
>>   * jobs whose AIO context has not been polled yet */
>> @@ -6165,13 +6165,16 @@ XDbgBlockGraph *bdrv_get_xdbg_block_graph(Error 
>> **errp)
>>  }
>>  }
>>  
>> -for (job = block_job_next(NULL); job; job = block_job_next(job)) {
>> -GSList *el;
>> +WITH_JOB_LOCK_GUARD() {
>> +for (job = block_job_next_locked(NULL); job;
>> + job = block_job_next_locked(job)) {
>> +GSList *el;
>>  
>> -xdbg_graph_add_node(gr, job, X_DBG_BLOCK_GRAPH_NODE_TYPE_BLOCK_JOB,
>> -   job->job.id);
>> -for (el = job->nodes; el; el = el->next) {
>> -xdbg_graph_add_edge(gr, job, (BdrvChild *)el->data);
>> +xdbg_graph_add_node(gr, job, 
>> X_DBG_BLOCK_GRAPH_NODE_TYPE_BLOCK_JOB,
>> +job->job.id);
>> +for (el = job->nodes; el; el = el->next) {
>> +xdbg_graph_add_edge(gr, job, (BdrvChild *)el->data);
>> +}
>>  }
>>  }
>>  
>> diff --git a/blockdev.c b/blockdev.c
>> index 71f793c4ab..5b79093155 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -150,12 +150,15 @@ void blockdev_mark_auto_del(BlockBackend *blk)
>>  return;
>>  }
>>  
>> -for (job = block_job_next(NULL); job; job = block_job_next(job)) {
>> +JOB_LOCK_GUARD();
>> +
>> +for (job = block_job_next_locked(NULL); job;
>> + job = block_job_next_locked(job)) {
>>  if (block_job_has_bdrv(job, blk_bs(blk))) {
> 
> Should this be renamed to block_job_has_bdrv_locked() now?
> 
> It looks to me like it does need the locking. (Which actually makes
> this patch a fix and not just an optimisation as the commit message
> suggests.)

Nope, as GSList *nodes; is always read and written under BQL.

> 
>>  AioContext *aio_context = job->job.aio_context;
>>  aio_context_acquire(aio_context);
>>  
>> -job_cancel(&job->job, false);
>> +job_cancel_locked(&job->job, false);
>>  
>>  aio_context_release(aio_context);
>>  }
>> @@ -3745,7 +3748,10 @@ BlockJobInfoList *qmp_query_block_jobs(Error **errp)
>>  BlockJobInfoList *head = NULL, **tail = &head;
>>  BlockJob *job;
>>  
>> -for (job = block_job_next(NULL); job; job = block_job_next(job)) {
>> +JOB_LOCK_GUARD();
>> +
>> +for (job = block_job_next_locked(NULL); job;
>> + job = block_job_next_locked(job)) {
>>  BlockJobInfo *value;
>>  AioContext *aio_context;
> 
> More context:
> 
> BlockJobInfo *value;
> AioContext *aio_context;
> 
> if (block_job_is_internal(job)) {
> continue;
> }
> aio_context = block_job_get_aio_context(job);
> aio_context_acquire(aio_context);
> value = block_job_query(job, errp);
> aio_context_release(aio_context);
> 
> This should become block_job_query_locked(). (You do that in patch 18,
> but it looks a bit out of place there - which is precisely because it
> really belongs in this one.)

Ok
> 
>> diff --git a/bloc

Re: [PATCH v10 12/21] commit and mirror: create new nodes using bdrv_get_aio_context, and not the job aiocontext




Am 05/08/2022 um 10:14 schrieb Kevin Wolf:
> Am 25.07.2022 um 09:38 hat Emanuele Giuseppe Esposito geschrieben:
>> We are always using the given bs AioContext, so there is no need
>> to take the job ones (which is identical anyways).
>> This also reduces the point we need to check when protecting
>> job.aio_context field.
>>
>> Reviewed-by: Vladimir Sementsov-Ogievskiy 
>> Reviewed-by: Stefan Hajnoczi 
>> Signed-off-by: Emanuele Giuseppe Esposito 
> 
> I'm not sure against which scenario this is actually protecting. The
> only reason for not using job.aio_context seems to be if someone else
> can change the job AioContext in parallel. However, if that is the case
> (I don't think it is, but for the hypothetical case this patch seems to
> address), the AioContext is not identical any more and the change is
> wrong because we actually want things to run in the job AioContext -
> otherwise accessing the BlockBackend from the job coroutine wouldn't be
> possible.
> 
> So I believe the current code expresses how we actually want to use the
> BlockBackend and the change doesn't only protect nothing, but is even
> misleading because it implies that the job can work with any AioContext,
> which is not true.
> 
> Kevin
> 
Ok, dropped

Re: [PATCH v10 13/21] job: detect change of aiocontext within job coroutine




Am 05/08/2022 um 10:37 schrieb Kevin Wolf:
> Am 25.07.2022 um 09:38 hat Emanuele Giuseppe Esposito geschrieben:
>> From: Paolo Bonzini 
>>
>> We want to make sure access of job->aio_context is always done
>> under either BQL or job_mutex.
> 
> Is this the goal of this series? If so, it would have been useful to
> state somewhere more obvious, because I had assumed that holding the BQL
> would not be considered enough, but everyone needs to hold the job_mutex.

It is the goal for this patch :)
The whole job API can't rely on BQL since there are coroutines running
in another aiocontext.
> 
>> The problem is that using
>> aio_co_enter(job->aiocontext, job->co) in job_start and job_enter_cond
>> makes the coroutine immediately resume, so we can't hold the job lock.
>> And caching it is not safe either, as it might change.
>>
>> job_start is under BQL, so it can freely read job->aiocontext, but
>> job_enter_cond is not. In order to fix this, use aio_co_wake():
>> the advantage is that it won't use job->aiocontext, but the
>> main disadvantage is that it won't be able to detect a change of
>> job AioContext.
>>
>> Calling bdrv_try_set_aio_context() will issue the following calls
>> (simplified):
>> * in terms of  bdrv callbacks:
>>   .drained_begin -> .set_aio_context -> .drained_end
>> * in terms of child_job functions:
>>   child_job_drained_begin -> child_job_set_aio_context -> 
>> child_job_drained_end
>> * in terms of job functions:
>>   job_pause_locked -> job_set_aio_context -> job_resume_locked
>>
>> We can see that after setting the new aio_context, job_resume_locked
>> calls again job_enter_cond, which then invokes aio_co_wake(). But
>> while job->aiocontext has been set in job_set_aio_context,
>> job->co->ctx has not changed, so the coroutine would be entering in
>> the wrong aiocontext.
>>
>> Using aio_co_schedule in job_resume_locked() might seem as a valid
>> alternative, but the problem is that the bh resuming the coroutine
>> is not scheduled immediately, and if in the meanwhile another
>> bdrv_try_set_aio_context() is run (see test_propagate_mirror() in
>> test-block-iothread.c), we would have the first schedule in the
>> wrong aiocontext, and the second set of drains won't even manage
>> to schedule the coroutine, as job->busy would still be true from
>> the previous job_resume_locked().
>>
>> The solution is to stick with aio_co_wake(), but then detect every time
>> the coroutine resumes back from yielding if job->aio_context
>> has changed. If so, we can reschedule it to the new context.
> 
> Hm, but with this in place, what does aio_co_wake() actually buy us
> compared to aio_co_enter()?
> 
> I guess it's a bit simpler code because you don't have to explicitly
> specify the AioContext, but we're still going to enter the coroutine in
> the wrong AioContext occasionally and have to reschedule it, just like
> in the existing code (except that the rescheduling doesn't exist there
> yet).
> 
> So while I don't disagree with the change, I don't think the
> justification in the commit message is right for this part.

What do you suggest to change?

> 
>> Check for the aiocontext change in job_do_yield_locked because:
>> 1) aio_co_reschedule_self requires to be in the running coroutine
>> 2) since child_job_set_aio_context allows changing the aiocontext only
>>while the job is paused, this is the exact place where the coroutine
>>resumes, before running JobDriver's code.
>>
>> Reviewed-by: Vladimir Sementsov-Ogievskiy 
>> Reviewed-by: Stefan Hajnoczi 
>> Signed-off-by: Paolo Bonzini 
>> ---
>>  job.c | 21 +++--
>>  1 file changed, 19 insertions(+), 2 deletions(-)
>>
>> diff --git a/job.c b/job.c
>> index b0729e2bb2..ecec66b44e 100644
>> --- a/job.c
>> +++ b/job.c
>> @@ -585,7 +585,9 @@ void job_enter_cond_locked(Job *job, bool(*fn)(Job *job))
>>  timer_del(&job->sleep_timer);
>>  job->busy = true;
>>  real_job_unlock();
>> -aio_co_enter(job->aio_context, job->co);
>> +job_unlock();
>> +aio_co_wake(job->co);
>> +job_lock();
> 
> The addition of job_unlock/lock is unrelated, this was necessary even
> before this patch.

Ok

> 
>>  }
>>  
>>  void job_enter_cond(Job *job, bool(*fn)(Job *job))
>> @@ -611,6 +613,8 @@ void job_enter(Job *job)
>>   */
>>  static void coroutine_fn job_do_yield_locked(Job *job, uint64_t ns)
>>  {
>> +AioContext *next_aio_context;
>> +
>>  real_job_lock();
>>  if (ns != -1) {
>>  timer_mod(&job->sleep_timer, ns);
>> @@ -622,7 +626,20 @@ static void coroutine_fn job_do_yield_locked(Job *job, 
>> uint64_t ns)
>>  qemu_coroutine_yield();
>>  job_lock();
>>  
>> -/* Set by job_enter_cond() before re-entering the coroutine.  */
>> +next_aio_context = job->aio_context;
>> +/*
>> + * Coroutine has resumed, but in the meanwhile the job AioContext
>> + * might have changed via bdrv_try_set_aio_context(), so we need to move
>> + * the coroutine too in the new aiocontext.
>> +

Re: [PATCH 1/4] hw/nvme: avoid unnecessary call to irq (de)assertion functions

2022-08-16 Thread Stefan Hajnoczi

On Thu, 11 Aug 2022 at 11:38, Jinhao Fan  wrote:
>
> nvme_irq_assert() only does useful work when cq->irq_enabled is true.
> nvme_irq_deassert() only works for pin-based interrupts. Avoid calls
> into these functions if we are sure they will not do useful work.
>
> This will be most useful when we use eventfd to send interrupts. We
> can avoid the unnecessary overhead of signalling eventfd.
>
> Signed-off-by: Jinhao Fan 
> ---
>  hw/nvme/ctrl.c | 40 ++--
>  1 file changed, 22 insertions(+), 18 deletions(-)

There is code duplication and nvme_irq_assert/deassert() check
->irq_enabled and msix_enabled() again.

Can the logic be moved into assert()/deassert() so callers don't need
to duplicate the checks?

(I assume the optimization is that eventfd syscalls are avoided, not
that the function call is avoided.)

Stefan

Re: [PATCH v7 00/14] KVM: mm: fd-based approach for supporting KVM guest private memory

2022-08-16 Thread Sean Christopherson

On Tue, Aug 16, 2022, Gupta, Pankaj wrote:
> 
> > > > Actually the current version allows you to delay the allocation to a
> > > > later time (e.g. page fault time) if you don't call fallocate() on the
> > > > private fd. fallocate() is necessary in previous versions because we
> > > > treat the existense in the fd as 'private' but in this version we track
> > > > private/shared info in KVM so we don't rely on that fact from memory
> > > > backstores.
> > > 
> > > Does this also mean reservation of guest physical memory with secure
> > > processor (both for SEV-SNP & TDX) will also happen at page fault time?
> > > 
> > > Do we plan to keep it this way?
> > 
> > If you are talking about accepting memory by the guest, it is initiated by
> > the guest and has nothing to do with page fault time vs fallocate()
> > allocation of host memory. I mean acceptance happens after host memory
> > allocation but they are not in lockstep, acceptance can happen much later.
> 
> No, I meant reserving guest physical memory range from hypervisor e.g with
> RMPUpdate for SEV-SNP or equivalent at TDX side (PAMTs?).

As proposed, RMP/PAMT updates will occur in the fault path, i.e. there is no way
for userspace to pre-map guest memory.

I think the best approach is to turn KVM_TDX_INIT_MEM_REGION into a generic
vCPU-scoped ioctl() that allows userspace to pre-map guest memory.  Supporting
initializing guest private memory with a source page can be implemented via a
flag.  That also gives KVM line of sight to in-place "conversion", e.g. another
flag could be added to say that the dest is also the source.

The TDX and SNP restrictions would then become addition restrictions on when
initializing with a source is allowed (and VMs that don't have guest private
memory wouldn't allow the flag at all).

Re: [PULL for 7.1 0/3] memory leak and testing tweaks

2022-08-16 Thread Richard Henderson


On 8/16/22 07:26, Alex Bennée wrote:

The following changes since commit d102b8162a1e5fe8288d4d5c01801ce6536ac2d1:

   Merge tag 'pull-la-20220814' of https://gitlab.com/rth7680/qemu into staging 
(2022-08-14 08:48:11 -0500)

are available in the Git repository at:

   https://github.com/stsquad/qemu.git tags/pull-for-7.1-fixes-160822-1

for you to fetch changes up to 65711f9a87874a9ec61108c6009f8baec07c8b0d:

   tests/avocado: apply a band aid to aspeed-evb login (2022-08-16 09:57:12 
+0100)


A few small fixes:

   - properly un-parent OBJECT(cpu) when closing -user thread
   - add missing timeout to aspeed tests
   - reduce raciness of login: prompt handling for aspeed tests


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/7.1 as 
appropriate.


r~






Alex Bennée (3):
   linux-user: un-parent OBJECT(cpu) when closing thread
   tests/avocado: add timeout to the aspeed tests
   tests/avocado: apply a band aid to aspeed-evb login

  linux-user/syscall.c| 13 +++--
  tests/avocado/machine_aspeed.py |  4 
  2 files changed, 11 insertions(+), 6 deletions(-)

Re: [PATCH v7 2/8] file-posix: introduce get_sysfs_str_val for device zoned model

2022-08-16 Thread Sam Li

Sam Li  于2022年8月16日周二 14:25写道：
>
> Use sysfs attribute files to get the string value of device
> zoned model. Then get_sysfs_zoned_model can convert it to
> BlockZoneModel type in QEMU.
>
> Signed-off-by: Sam Li 
> Reviewed-by: Hannes Reinecke 
> ---
>  block/file-posix.c   | 93 ++--
>  include/block/block_int-common.h |  3 ++
>  2 files changed, 55 insertions(+), 41 deletions(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 48cd096624..c07ac4c697 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1210,66 +1210,71 @@ static int hdev_get_max_hw_transfer(int fd, struct 
> stat *st)
>  #endif
>  }
>
> -static int hdev_get_max_segments(int fd, struct stat *st)
> -{
> +/*
> + * Convert the zoned attribute file in sysfs to internal value.
> + */
> +static int get_sysfs_str_val(struct stat *st, const char *attribute,
> + char **val) {
>  #ifdef CONFIG_LINUX
> -char buf[32];
> -const char *end;
> -char *sysfspath = NULL;
> +g_autofree char *sysfspath = NULL;
>  int ret;
> -int sysfd = -1;
> -long max_segments;
> -
> -if (S_ISCHR(st->st_mode)) {
> -if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
> -return ret;
> -}
> -return -ENOTSUP;
> -}
> +size_t len;
>
>  if (!S_ISBLK(st->st_mode)) {
>  return -ENOTSUP;
>  }
>
> -sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
> -major(st->st_rdev), minor(st->st_rdev));
> -sysfd = open(sysfspath, O_RDONLY);
> -if (sysfd == -1) {
> -ret = -errno;
> -goto out;
> +sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/%s",
> +major(st->st_rdev), minor(st->st_rdev),
> +attribute);
> +ret = g_file_get_contents(sysfspath, val, &len, NULL);
> +if (ret == -1) {
> +return -ENOENT;
>  }

+/* The file is ended with '\n' */
+char *p;
+p = *val;
+if (*(p + len - 1) == '\n') {
+*(p + len - 1) = '\0';
+}

I'm sorry to miss this part to make the str end with '\0'.

> -do {
> -ret = read(sysfd, buf, sizeof(buf) - 1);
> -} while (ret == -1 && errno == EINTR);
> +return ret;
> +#else
> +return -ENOTSUP;
> +#endif
> +}
> +
> +static int get_sysfs_zoned_model(struct stat *st, BlockZoneModel *zoned) {
> +g_autofree char *val = NULL;
> +int ret;
> +
> +ret = get_sysfs_str_val(st, "zoned", &val);
>  if (ret < 0) {
> -ret = -errno;
> -goto out;
> -} else if (ret == 0) {
> -ret = -EIO;
> -goto out;
> +return ret;
>  }
> -buf[ret] = 0;
> -/* The file is ended with '\n', pass 'end' to accept that. */
> -ret = qemu_strtol(buf, &end, 10, &max_segments);
> -if (ret == 0 && end && *end == '\n') {
> -ret = max_segments;
> +
> +if (strcmp(val, "host-managed") == 0) {
> +*zoned = BLK_Z_HM;
> +} else if (strcmp(val, "host-aware") == 0) {
> +*zoned = BLK_Z_HA;
> +} else if (strcmp(val, "none") == 0) {
> +*zoned = BLK_Z_NONE;
> +} else {
> +return -ENOTSUP;
>  }
> +return 0;
> +}
>
> -out:
> -if (sysfd != -1) {
> -close(sysfd);
> +static int hdev_get_max_segments(int fd, struct stat *st) {
> +int ret;
> +if (S_ISCHR(st->st_mode)) {
> +if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
> +return ret;
> +}
> +return -ENOTSUP;
>  }
> -g_free(sysfspath);
> -return ret;
> -#else
> -return -ENOTSUP;
> -#endif
> +return get_sysfs_long_val(st, "max_segments");
>  }
>
>  static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
>  {
>  BDRVRawState *s = bs->opaque;
>  struct stat st;
> +int ret;
> +BlockZoneModel zoned;
>
>  s->needs_alignment = raw_needs_alignment(bs);
>  raw_probe_alignment(bs, s->fd, errp);
> @@ -1307,6 +1312,12 @@ static void raw_refresh_limits(BlockDriverState *bs, 
> Error **errp)
>  bs->bl.max_hw_iov = ret;
>  }
>  }
> +
> +ret = get_sysfs_zoned_model(s->fd, &st, &zoned);
> +if (ret < 0) {
> +zoned = BLK_Z_NONE;
> +}
> +bs->bl.zoned = zoned;
>  }
>
>  static int check_for_dasd(int fd)
> diff --git a/include/block/block_int-common.h 
> b/include/block/block_int-common.h
> index 8947abab76..7f7863cc9e 100644
> --- a/include/block/block_int-common.h
> +++ b/include/block/block_int-common.h
> @@ -825,6 +825,9 @@ typedef struct BlockLimits {
>
>  /* maximum number of iovec elements */
>  int max_iov;
> +
> +/* device zone model */
> +BlockZoneModel zoned;
>  } BlockLimits;
>
>  typedef struct BdrvOpBlocker BdrvOpBlocker;
> --
> 2.37.1
>

Re: [PATCH v7 3/8] file-posix: introduce get_sysfs_long_val for the long sysfs attribute

2022-08-16 Thread Sam Li

Sam Li  于2022年8月16日周二 14:25写道：
>
> Use sysfs attribute files to get the long value of zoned device
> information.
>
> Signed-off-by: Sam Li 
> Reviewed-by: Hannes Reinecke 
> Reviewed-by: Stefan Hajnoczi 
> ---
>  block/file-posix.c | 27 +++
>  1 file changed, 27 insertions(+)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index c07ac4c697..727389488c 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1258,6 +1258,33 @@ static int get_sysfs_zoned_model(struct stat *st, 
> BlockZoneModel *zoned) {
>  return 0;
>  }
>
> +/*
> + * Get zoned device information (chunk_sectors, zoned_append_max_bytes,
> + * max_open_zones, max_active_zones) through sysfs attribute files.
> + */
> +static long get_sysfs_long_val(struct stat *st, const char *attribute) {
> +#ifdef CONFIG_LINUX
> +g_autofree char *str = NULL;
> +const char *end;
> +long val;
> +int ret;
> +
> +ret = get_sysfs_str_val(st, attribute, &str);
> +if (ret < 0) {
> +return ret;
> +}
> +
> +/* The file is ended with '\n', pass 'end' to accept that. */
> +ret = qemu_strtol(str, &end, 10, &val);
> +if (ret == 0 && end && *end == '\n') {

Should be  "if (ret == 0 && end && *end == '\0') {". Changes accordingly.

> +ret = val;
> +}
> +return ret;
> +#else
> +return -ENOTSUP;
> +#endif
> +}
> +
>  static int hdev_get_max_segments(int fd, struct stat *st) {
>  int ret;
>  if (S_ISCHR(st->st_mode)) {
> --
> 2.37.1
>

Re: [PATCH v2 4/4] virt/hw/virt: Add virt_set_high_memmap() helper

2022-08-16 Thread Zhenyu Zhang

commit 49e00c1fe2ab24b73ac16908f3c05ebe88b9186d (HEAD -> master)
Author: Gavin Shan 
Date:   Mon Aug 15 14:29:58 2022 +0800

virt/hw/virt: Add virt_set_high_memmap() helper

The logic to assign high memory region's address in virt_set_memmap()
is independent. Lets move the logic to virt_set_high_memmap() helper.
"each device" is replaced by "each region" in the comments.

No functional change intended.

Signed-off-by: Gavin Shan 

The patchs works well on my Fujitsu host.

[root@hpe-apollo80-02-n00 qemu]# /home/qemu/build/qemu-system-aarch64 -version
QEMU emulator version 7.0.92 (v7.1.0-rc2-12-gd102b8162a)
[root@hpe-apollo80-02-n00 qemu]# /home/qemu/build/qemu-system-aarch64
-accel kvm -m 4096,maxmem=1023G -machine virt-2.12 -cpu host

[root@hpe-apollo80-02-n00 qemu]# /home/qemu/build/qemu-system-aarch64
-accel kvm -m 4096,maxmem=1024G -machine virt-2.12 -cpu host
qemu-system-aarch64: -accel kvm: Addressing limited to 40 bits, but
memory exceeds it by 1073741824 bytes

[root@hpe-apollo80-02-n00 qemu]# /home/qemu/build/qemu-system-aarch64
-accel kvm -m 4096,maxmem=1023G -machine virt -cpu host

[root@hpe-apollo80-02-n00 qemu]# /home/qemu/build/qemu-system-aarch64
-accel kvm -m 4096,maxmem=1024G -machine virt -cpu host
qemu-system-aarch64: -accel kvm: Addressing limited to 40 bits, but
memory exceeds it by 1073741824 bytes

Tested-by:zheny...@redhat.com


On Mon, Aug 15, 2022 at 2:30 PM Gavin Shan  wrote:
>
> The logic to assign high memory region's address in virt_set_memmap()
> is independent. Lets move the logic to virt_set_high_memmap() helper.
> "each device" is replaced by "each region" in the comments.
>
> No functional change intended.
>
> Signed-off-by: Gavin Shan 
> ---
>  hw/arm/virt.c | 92 ---
>  1 file changed, 50 insertions(+), 42 deletions(-)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index e38b6919c9..4dde08a924 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1688,6 +1688,55 @@ static uint64_t virt_cpu_mp_affinity(VirtMachineState 
> *vms, int idx)
>  return arm_cpu_mp_affinity(idx, clustersz);
>  }
>
> +static void virt_set_high_memmap(VirtMachineState *vms,
> + hwaddr base, int pa_bits)
> +{
> +hwaddr region_base, region_size;
> +bool *region_enabled, fits;
> +int i;
> +
> +for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
> +region_base = ROUND_UP(base, extended_memmap[i].size);
> +region_size = extended_memmap[i].size;
> +
> +switch (i) {
> +case VIRT_HIGH_GIC_REDIST2:
> +region_enabled = &vms->highmem_redists;
> +break;
> +case VIRT_HIGH_PCIE_ECAM:
> +region_enabled = &vms->highmem_ecam;
> +break;
> +case VIRT_HIGH_PCIE_MMIO:
> +region_enabled = &vms->highmem_mmio;
> +break;
> +default:
> +region_enabled = NULL;
> +}
> +
> +/* Skip unknwon or disabled regions */
> +if (!region_enabled || !*region_enabled) {
> +continue;
> +}
> +
> +/*
> + * Check each region to see if they fit in the PA space,
> + * moving highest_gpa as we go.
> + *
> + * For each device that doesn't fit, disable it.
> + */
> +fits = (region_base + region_size) <= BIT_ULL(pa_bits);
> +if (fits) {
> +vms->memmap[i].base = region_base;
> +vms->memmap[i].size = region_size;
> +
> +base = region_base + region_size;
> +vms->highest_gpa = region_base + region_size - 1;
> +} else {
> +*region_enabled = false;
> +}
> +}
> +}
> +
>  static void virt_set_memmap(VirtMachineState *vms, int pa_bits)
>  {
>  MachineState *ms = MACHINE(vms);
> @@ -1742,48 +1791,7 @@ static void virt_set_memmap(VirtMachineState *vms, int 
> pa_bits)
>
>  /* We know for sure that at least the memory fits in the PA space */
>  vms->highest_gpa = memtop - 1;
> -
> -for (i = VIRT_LOWMEMMAP_LAST; i < ARRAY_SIZE(extended_memmap); i++) {
> -hwaddr region_base = ROUND_UP(base, extended_memmap[i].size);
> -hwaddr region_size = extended_memmap[i].size;
> -bool *region_enabled, fits;
> -
> -switch (i) {
> -case VIRT_HIGH_GIC_REDIST2:
> -region_enabled = &vms->highmem_redists;
> -break;
> -case VIRT_HIGH_PCIE_ECAM:
> -region_enabled = &vms->highmem_ecam;
> -break;
> -case VIRT_HIGH_PCIE_MMIO:
> -region_enabled = &vms->highmem_mmio;
> -break;
> -default:
> -region_enabled = NULL;
> -}
> -
> -/* Skip unknwon or disabled regions */
> -if (!region_enabled || !*region_enabled) {
> -continue;
> -}
> -
> -/*
> - * Check each device to see if they fit in the PA space,
> -

Re: [PATCH 00/24] Support VIRTIO_F_RING_RESET for virtio-net, vhost-user, vhost-kernel in virtio pci-modern

2022-08-16 Thread Xuan Zhuo

On Tue, 16 Aug 2022 02:14:10 -0400, "Michael S. Tsirkin"  
wrote:
> On Tue, Aug 16, 2022 at 09:06:12AM +0800, Kangjie Xu wrote:
> > The virtio queue reset function has already been defined in the virtio spec 
> > 1.2.
> > The relevant virtio spec information is here:
> >
> > https://github.com/oasis-tcs/virtio-spec/issues/124
> > https://github.com/oasis-tcs/virtio-spec/issues/139
> >
> > This patch set is to support this function in QEMU. It consists of several 
> > parts:
> > 1. Patches 1-7 are the basic interfaces for vq reset in virtio and 
> > virtio-pci.
> > 2. Patches 8-12 support vq stop and vq restart for vhost-kernel.
> > 3. Patches 13-19 support vq stop and vq restart for vhost-user.
> > 4. Patches 20-22 support vq reset and re-enable for virtio-net.
> > 5. Patches 23-24 enable the vq reset feature for vhost-kernel and 
> > vhost-user.
> >
> > The process of virtqueue reset can be concluded as:
> > 1. The virtqueue is disabled when VIRTIO_PCI_COMMON_Q_RESET is written.
> > 2. Then the virtqueue can be optionally restarted(re-enabled).
> >
> > Since this patch set involves multiple modules and seems a bit messy, we 
> > briefly describe the
> > calling process for different modes below.
> > virtio-net:
> > 1. VIRTIO_PCI_COMMON_Q_RESET is written [virtio-pci]
> > -> virtio_queue_reset() [virtio]
> > -> virtio_net_queue_reset() [virtio-net]
> > -> __virtio_queue_reset()
> > 2. VIRTIO_PCI_COMMON_Q_ENABLE is written [virtio-pci]
> > -> set enabled, reset status of vq.
> >
> > vhost-kernel:
> > 1. VIRTIO_PCI_COMMON_Q_RESET is written [virtio-pci]
> > -> virtio_queue_reset() [virtio]
> > -> virtio_net_queue_reset() [virtio-net]
> > -> vhost_net_virtqueue_stop() [vhost-net]
> > -> vhost_net_set_backend() [vhost]
> > -> vhost_dev_virtqueue_stop()
> > -> vhost_virtqueue_unmap()
> > -> __virtio_queue_reset()
> > 2. VIRTIO_PCI_COMMON_Q_ENABLE is written [virtio-pci]
> > -> virtio_queue_enable() [virtio]
> > -> virtio_net_queue_enable() [virtio-net]
> > -> vhost_net_virtqueue_restart() [vhost-net]
> > -> vhost_dev_virtqueue_restart() [vhost]
> > -> vhost_virtqueue_start()
> > -> vhost_net_set_backend()
> > -> set enabled, reset status of vq.
> >
> > vhost-user:
> > 1. VIRTIO_PCI_COMMON_Q_RESET is written [virtio-pci]
> > -> virtio_queue_reset() [virtio]
> > -> virtio_net_queue_reset() [virtio-net]
> > -> vhost_net_virtqueue_stop() [vhost-net]
> > -> vhost_dev_virtqueue_stop() [vhost]
> > -> vhost_user_reset_vring() [vhost-user]
> > -> send VHOST_USER_RESET_VRING to the device
> > -> vhost_virtqueue_unmap()
> > -> __virtio_queue_reset()
> > 2. VIRTIO_PCI_COMMON_Q_ENABLE is written [virtio-pci]
> > -> virtio_queue_enable() [virtio]
> > -> virtio_net_queue_enable() [virtio-net]
> > -> vhost_net_virtqueue_restart() [vhost-net]
> > -> vhost_dev_virtqueue_restart() [vhost]
> > -> vhost_virtqueue_start()
> > -> vhost_user_set_single_vring_enable [vhost-user]
> > -> send VHOST_USER_SET_VRING_ENABLE to the device
> > -> set enabled, reset status of vq.
> >
> >
> > Test environment:
> > Host: 5.19.0-rc3 (With vq reset support)
> > Qemu: QEMU emulator version 7.0.50
> > Guest: 5.19.0-rc3 (With vq reset support)
> > DPDK: 22.07-rc1 (With vq reset support)
> > Test Cmd: ethtool -g eth1; ethtool -G eth1 rx $1 tx $2; ethtool -g eth1;
> >
> > The drvier can resize the virtio queue, then virtio queue reset 
> > function should
> > be triggered.
> >
> > The default is split mode, modify Qemu virtio-net to add PACKED feature 
> > to
> > test packed mode.
>
> legacy mode testing?


legacy does not support vq reset.

Thanks.

>
> > Guest Kernel Patch:
> > 
> > https://lore.kernel.org/bpf/20220801063902.129329-1-xuanz...@linux.alibaba.com/
> >
> > DPDK Patch:
> > 
> > https://github.com/middaywords/dpdk/compare/72206323a5dd3182b13f61b25a64abdddfee595c...eabadfac7953da66bc10ffb8284b490d09bb7ec7
> >
> > Host Kernel Patch:
> > 
> > https://github.com/middaywords/linux/commit/19a91e0d7167b2031e46078c6215c213b89cb2c3
> >
> > Looking forward to your review and comments. Thanks.
> >
> > Kangjie Xu (19):
> >   virtio: introduce virtio_queue_enable()
> >   virtio: core: vq reset feature negotation support
> >   virtio-pci: support queue enable
> >   vhost: extract the logic of unmapping the vrings and desc
> >   vhost: introduce vhost_dev_virtqueue_stop()
> >   vhost: introduce vhost_dev_virtqueue_restart()
> >   vhost-net: vhost-kernel: introduce vhost_net_virtqueue_stop()
> >   vhost-net: vhost-kernel: introduce vhost_net_virtqueue_restart()
> >   docs: vhost-user: add VHOST_USER_R

Re: [PATCH 00/24] Support VIRTIO_F_RING_RESET for virtio-net, vhost-user, vhost-kernel in virtio pci-modern

2022-08-16 Thread Xuan Zhuo

On Tue, 16 Aug 2022 02:22:16 -0400, "Michael S. Tsirkin"  
wrote:
> On Tue, Aug 16, 2022 at 02:15:57PM +0800, Xuan Zhuo wrote:
> > On Tue, 16 Aug 2022 02:14:10 -0400, "Michael S. Tsirkin"  
> > wrote:
> > > On Tue, Aug 16, 2022 at 09:06:12AM +0800, Kangjie Xu wrote:
> > > > The virtio queue reset function has already been defined in the virtio 
> > > > spec 1.2.
> > > > The relevant virtio spec information is here:
> > > >
> > > > https://github.com/oasis-tcs/virtio-spec/issues/124
> > > > https://github.com/oasis-tcs/virtio-spec/issues/139
> > > >
> > > > This patch set is to support this function in QEMU. It consists of 
> > > > several parts:
> > > > 1. Patches 1-7 are the basic interfaces for vq reset in virtio and 
> > > > virtio-pci.
> > > > 2. Patches 8-12 support vq stop and vq restart for vhost-kernel.
> > > > 3. Patches 13-19 support vq stop and vq restart for vhost-user.
> > > > 4. Patches 20-22 support vq reset and re-enable for virtio-net.
> > > > 5. Patches 23-24 enable the vq reset feature for vhost-kernel and 
> > > > vhost-user.
> > > >
> > > > The process of virtqueue reset can be concluded as:
> > > > 1. The virtqueue is disabled when VIRTIO_PCI_COMMON_Q_RESET is written.
> > > > 2. Then the virtqueue can be optionally restarted(re-enabled).
> > > >
> > > > Since this patch set involves multiple modules and seems a bit messy, 
> > > > we briefly describe the
> > > > calling process for different modes below.
> > > > virtio-net:
> > > > 1. VIRTIO_PCI_COMMON_Q_RESET is written [virtio-pci]
> > > > -> virtio_queue_reset() [virtio]
> > > > -> virtio_net_queue_reset() [virtio-net]
> > > > -> __virtio_queue_reset()
> > > > 2. VIRTIO_PCI_COMMON_Q_ENABLE is written [virtio-pci]
> > > > -> set enabled, reset status of vq.
> > > >
> > > > vhost-kernel:
> > > > 1. VIRTIO_PCI_COMMON_Q_RESET is written [virtio-pci]
> > > > -> virtio_queue_reset() [virtio]
> > > > -> virtio_net_queue_reset() [virtio-net]
> > > > -> vhost_net_virtqueue_stop() [vhost-net]
> > > > -> vhost_net_set_backend() [vhost]
> > > > -> vhost_dev_virtqueue_stop()
> > > > -> vhost_virtqueue_unmap()
> > > > -> __virtio_queue_reset()
> > > > 2. VIRTIO_PCI_COMMON_Q_ENABLE is written [virtio-pci]
> > > > -> virtio_queue_enable() [virtio]
> > > > -> virtio_net_queue_enable() [virtio-net]
> > > > -> vhost_net_virtqueue_restart() [vhost-net]
> > > > -> vhost_dev_virtqueue_restart() [vhost]
> > > > -> vhost_virtqueue_start()
> > > > -> vhost_net_set_backend()
> > > > -> set enabled, reset status of vq.
> > > >
> > > > vhost-user:
> > > > 1. VIRTIO_PCI_COMMON_Q_RESET is written [virtio-pci]
> > > > -> virtio_queue_reset() [virtio]
> > > > -> virtio_net_queue_reset() [virtio-net]
> > > > -> vhost_net_virtqueue_stop() [vhost-net]
> > > > -> vhost_dev_virtqueue_stop() [vhost]
> > > > -> vhost_user_reset_vring() [vhost-user]
> > > > -> send VHOST_USER_RESET_VRING to the device
> > > > -> vhost_virtqueue_unmap()
> > > > -> __virtio_queue_reset()
> > > > 2. VIRTIO_PCI_COMMON_Q_ENABLE is written [virtio-pci]
> > > > -> virtio_queue_enable() [virtio]
> > > > -> virtio_net_queue_enable() [virtio-net]
> > > > -> vhost_net_virtqueue_restart() [vhost-net]
> > > > -> vhost_dev_virtqueue_restart() [vhost]
> > > > -> vhost_virtqueue_start()
> > > > -> vhost_user_set_single_vring_enable [vhost-user]
> > > > -> send VHOST_USER_SET_VRING_ENABLE to the 
> > > > device
> > > > -> set enabled, reset status of vq.
> > > >
> > > >
> > > > Test environment:
> > > > Host: 5.19.0-rc3 (With vq reset support)
> > > > Qemu: QEMU emulator version 7.0.50
> > > > Guest: 5.19.0-rc3 (With vq reset support)
> > > > DPDK: 22.07-rc1 (With vq reset support)
> > > > Test Cmd: ethtool -g eth1; ethtool -G eth1 rx $1 tx $2; ethtool -g 
> > > > eth1;
> > > >
> > > > The drvier can resize the virtio queue, then virtio queue reset 
> > > > function should
> > > > be triggered.
> > > >
> > > > The default is split mode, modify Qemu virtio-net to add PACKED 
> > > > feature to
> > > > test packed mode.
> > >
> > > legacy mode testing?
> >
> >
> > legacy does not support vq reset.
> >
> > Thanks.
>
> yes but did it break with all these code changes?

OK, I see, we'll test this.

Thanks.


>
> > >
> > > > Guest Kernel Patch:
> > > > 
> > > > https://lore.kernel.org/bpf/20220801063902.129329-1-xuanz...@linux.alibaba.com/
> > > >
> > > > DPDK Patch:
> > > > 
> > > > https://github.com/middaywords/dpdk/compare/72206323a5dd3182b13f61b25a64abdddfee595c...eabadfac7953da66bc10ffb8284b490d09bb7ec7
> > > >
> > > > Host Kernel Patch:
> > > > 
>

Re: [PATCH v7 1/8] include: add zoned device structs

2022-08-16 Thread Damien Le Moal

On 2022/08/15 23:25, Sam Li wrote:
> Signed-off-by: Sam Li 
> Reviewed-by: Stefan Hajnoczi 

Looks good.

Reviewed-by: Damien Le Moal 

> ---
>  include/block/block-common.h | 43 
>  1 file changed, 43 insertions(+)
> 
> diff --git a/include/block/block-common.h b/include/block/block-common.h
> index fdb7306e78..36bd0e480e 100644
> --- a/include/block/block-common.h
> +++ b/include/block/block-common.h
> @@ -49,6 +49,49 @@ typedef struct BlockDriver BlockDriver;
>  typedef struct BdrvChild BdrvChild;
>  typedef struct BdrvChildClass BdrvChildClass;
>  
> +typedef enum BlockZoneOp {
> +BLK_ZO_OPEN,
> +BLK_ZO_CLOSE,
> +BLK_ZO_FINISH,
> +BLK_ZO_RESET,
> +} BlockZoneOp;
> +
> +typedef enum BlockZoneModel {
> +BLK_Z_NONE = 0x0, /* Regular block device */
> +BLK_Z_HM = 0x1, /* Host-managed zoned block device */
> +BLK_Z_HA = 0x2, /* Host-aware zoned block device */
> +} BlockZoneModel;
> +
> +typedef enum BlockZoneCondition {
> +BLK_ZS_NOT_WP = 0x0,
> +BLK_ZS_EMPTY = 0x1,
> +BLK_ZS_IOPEN = 0x2,
> +BLK_ZS_EOPEN = 0x3,
> +BLK_ZS_CLOSED = 0x4,
> +BLK_ZS_RDONLY = 0xD,
> +BLK_ZS_FULL = 0xE,
> +BLK_ZS_OFFLINE = 0xF,
> +} BlockZoneCondition;
> +
> +typedef enum BlockZoneType {
> +BLK_ZT_CONV = 0x1, /* Conventional random writes supported */
> +BLK_ZT_SWR = 0x2, /* Sequential writes required */
> +BLK_ZT_SWP = 0x3, /* Sequential writes preferred */
> +} BlockZoneType;
> +
> +/*
> + * Zone descriptor data structure.
> + * Provides information on a zone with all position and size values in bytes.
> + */
> +typedef struct BlockZoneDescriptor {
> +uint64_t start;
> +uint64_t length;
> +uint64_t cap;
> +uint64_t wp;
> +BlockZoneType type;
> +BlockZoneCondition cond;
> +} BlockZoneDescriptor;
> +
>  typedef struct BlockDriverInfo {
>  /* in bytes, 0 if irrelevant */
>  int cluster_size;


-- 
Damien Le Moal
Western Digital Research

Re: [PATCH v7 2/8] file-posix: introduce get_sysfs_str_val for device zoned model

2022-08-16 Thread Damien Le Moal

On 2022/08/15 23:25, Sam Li wrote:
> Use sysfs attribute files to get the string value of device
> zoned model. Then get_sysfs_zoned_model can convert it to
> BlockZoneModel type in QEMU.
> 
> Signed-off-by: Sam Li 
> Reviewed-by: Hannes Reinecke 
> ---
>  block/file-posix.c   | 93 ++--
>  include/block/block_int-common.h |  3 ++
>  2 files changed, 55 insertions(+), 41 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 48cd096624..c07ac4c697 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1210,66 +1210,71 @@ static int hdev_get_max_hw_transfer(int fd, struct 
> stat *st)
>  #endif
>  }
>  
> -static int hdev_get_max_segments(int fd, struct stat *st)
> -{
> +/*
> + * Convert the zoned attribute file in sysfs to internal value.

This function does not convert anything. So this comment should be changed to
something like:

/*
 * Get a sysfs attribute value as a character string.
 */

> + */
> +static int get_sysfs_str_val(struct stat *st, const char *attribute,
> + char **val) {
>  #ifdef CONFIG_LINUX
> -char buf[32];
> -const char *end;
> -char *sysfspath = NULL;
> +g_autofree char *sysfspath = NULL;
>  int ret;
> -int sysfd = -1;
> -long max_segments;
> -
> -if (S_ISCHR(st->st_mode)) {
> -if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
> -return ret;
> -}
> -return -ENOTSUP;
> -}
> +size_t len;
>  
>  if (!S_ISBLK(st->st_mode)) {
>  return -ENOTSUP;
>  }
>  
> -sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
> -major(st->st_rdev), minor(st->st_rdev));
> -sysfd = open(sysfspath, O_RDONLY);
> -if (sysfd == -1) {
> -ret = -errno;
> -goto out;
> +sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/%s",
> +major(st->st_rdev), minor(st->st_rdev),
> +attribute);
> +ret = g_file_get_contents(sysfspath, val, &len, NULL);
> +if (ret == -1) {
> +return -ENOENT;
>  }
> -do {
> -ret = read(sysfd, buf, sizeof(buf) - 1);
> -} while (ret == -1 && errno == EINTR);
> +return ret;
> +#else
> +return -ENOTSUP;
> +#endif
> +}
> +
> +static int get_sysfs_zoned_model(struct stat *st, BlockZoneModel *zoned) {
> +g_autofree char *val = NULL;
> +int ret;
> +
> +ret = get_sysfs_str_val(st, "zoned", &val);
>  if (ret < 0) {
> -ret = -errno;
> -goto out;
> -} else if (ret == 0) {
> -ret = -EIO;
> -goto out;
> +return ret;
>  }
> -buf[ret] = 0;
> -/* The file is ended with '\n', pass 'end' to accept that. */
> -ret = qemu_strtol(buf, &end, 10, &max_segments);
> -if (ret == 0 && end && *end == '\n') {
> -ret = max_segments;
> +
> +if (strcmp(val, "host-managed") == 0) {
> +*zoned = BLK_Z_HM;
> +} else if (strcmp(val, "host-aware") == 0) {
> +*zoned = BLK_Z_HA;
> +} else if (strcmp(val, "none") == 0) {
> +*zoned = BLK_Z_NONE;
> +} else {
> +return -ENOTSUP;
>  }
> +return 0;
> +}
>  
> -out:
> -if (sysfd != -1) {
> -close(sysfd);
> +static int hdev_get_max_segments(int fd, struct stat *st) {
> +int ret;

Add a blank line here ? Not sure about the qemu code style convention. But a
blank line after a variable declaration is always nice to clearly separate
declarations and code.

> +if (S_ISCHR(st->st_mode)) {
> +if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
> +return ret;
> +}
> +return -ENOTSUP;
>  }
> -g_free(sysfspath);
> -return ret;
> -#else
> -return -ENOTSUP;
> -#endif
> +return get_sysfs_long_val(st, "max_segments");
>  }
>  
>  static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
>  {
>  BDRVRawState *s = bs->opaque;
>  struct stat st;
> +int ret;
> +BlockZoneModel zoned;
>  
>  s->needs_alignment = raw_needs_alignment(bs);
>  raw_probe_alignment(bs, s->fd, errp);
> @@ -1307,6 +1312,12 @@ static void raw_refresh_limits(BlockDriverState *bs, 
> Error **errp)
>  bs->bl.max_hw_iov = ret;
>  }
>  }
> +
> +ret = get_sysfs_zoned_model(s->fd, &st, &zoned);
> +if (ret < 0) {
> +zoned = BLK_Z_NONE;
> +}
> +bs->bl.zoned = zoned;
>  }
>  
>  static int check_for_dasd(int fd)
> diff --git a/include/block/block_int-common.h 
> b/include/block/block_int-common.h
> index 8947abab76..7f7863cc9e 100644
> --- a/include/block/block_int-common.h
> +++ b/include/block/block_int-common.h
> @@ -825,6 +825,9 @@ typedef struct BlockLimits {
>  
>  /* maximum number of iovec elements */
>  int max_iov;
> +
> +/* device zone model */
> +BlockZoneModel zoned;
>  } BlockLimits;
>  
>  typedef struct BdrvOpBlocker BdrvOpBlocker;


-- 
Damien

[PATCH for-7.2 v3 00/20] QMP/HMP: add 'dumpdtb' and 'info fdt' commands

Hi,

In this new version the most notable changes are:

- removed fdt_pack() from machine specific code. As discussed in the previous
version, the proper use of fdt_pack() would require more work/thought and,
since it's not required for the work we're doing here, it was removed;

- we're now handling string arrays. The previous version was interpreting
all string properties as a single, plain string. We're now dealing with string
arrays instead;

- changed the output format to be more in line with the dts format.

Other small changes were made based on the feeback of the previous version.

Changes from v2:
- patches 1-8:
  - remove fdt_pack() to shrink the FDT before assigning it to ms->fdt
- patch 9:
  - call g_free(ms->fdt) to avoid leaking an old fdt during reset
- patch 10:
  - added a commit msg note about why we're not eliminating spapr->fdt_blob
for machine->fdt at this moment
- patches 11, 12:
  - remove fdt_pack() to shrink the FDT before assigning it to ms->fdt
  - added Alistair's r-b
- patch 13:
  - remove fdt_pack() to shrink the FDT before assigning it to ms->fdt
- patch 14:
  - added a commit msg note about BQL
- patch 15:
  - added a commit msg note about BQL
- patch 16:
  - renamed fdt_prop_is_string to fdt_prop_is_string_array. \0 characters
in the middle of the data array is now legal
  - added a new fdt_prop_format_string_array() to format the string array
  - added a semicolon at the end of the string array
- patch 17:
  - added semicolon at the end of properties
  - use %02x instead of %x to format vals in [] notation

- v2 link: https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg00937.html


Daniel Henrique Barboza (20):
  hw/arm: do not free machine->fdt in arm_load_dtb()
  hw/microblaze: set machine->fdt in microblaze_load_dtb()
  hw/nios2: set machine->fdt in nios2_load_dtb()
  hw/ppc: set machine->fdt in ppce500_load_device_tree()
  hw/ppc: set machine->fdt in bamboo_load_device_tree()
  hw/ppc: set machine->fdt in sam460ex_load_device_tree()
  hw/ppc: set machine->fdt in xilinx_load_device_tree()
  hw/ppc: set machine->fdt in pegasos2_machine_reset()
  hw/ppc: set machine->fdt in pnv_reset()
  hw/ppc: set machine->fdt in spapr machine
  hw/riscv: set machine->fdt in sifive_u_machine_init()
  hw/riscv: set machine->fdt in spike_board_init()
  hw/xtensa: set machine->fdt in xtfpga_init()
  qmp/hmp, device_tree.c: introduce dumpdtb
  qmp/hmp, device_tree.c: introduce 'info fdt' command
  device_tree.c: support string array prop in fdt_format_node()
  device_tree.c: support remaining FDT prop types
  device_node.c: enable 'info fdt' to print subnodes
  device_tree.c: add fdt_format_property() helper
  hmp, device_tree.c: add 'info fdt ' support

 hmp-commands-info.hx |  14 +++
 hmp-commands.hx  |  13 +++
 hw/arm/boot.c|   6 +-
 hw/microblaze/boot.c |  11 +-
 hw/microblaze/meson.build|   2 +-
 hw/nios2/boot.c  |  11 +-
 hw/nios2/meson.build |   2 +-
 hw/ppc/e500.c|  13 ++-
 hw/ppc/pegasos2.c|   7 ++
 hw/ppc/pnv.c |   8 +-
 hw/ppc/ppc440_bamboo.c   |  11 +-
 hw/ppc/sam460ex.c|   8 +-
 hw/ppc/spapr.c   |   6 +
 hw/ppc/spapr_hcall.c |   8 ++
 hw/ppc/virtex_ml507.c|  11 +-
 hw/riscv/sifive_u.c  |   6 +
 hw/riscv/spike.c |   9 ++
 hw/xtensa/meson.build|   2 +-
 hw/xtensa/xtfpga.c   |   9 +-
 include/monitor/hmp.h|   2 +
 include/sysemu/device_tree.h |   7 ++
 monitor/hmp-cmds.c   |  28 +
 monitor/qmp-cmds.c   |  27 +
 qapi/machine.json|  38 ++
 softmmu/device_tree.c| 219 +++
 25 files changed, 466 insertions(+), 12 deletions(-)

-- 
2.37.2

[PATCH for-7.2 v3 01/20] hw/arm: do not free machine->fdt in arm_load_dtb()

At this moment, arm_load_dtb() can free machine->fdt when
binfo->dtb_filename is NULL. If there's no 'dtb_filename', 'fdt' will be
retrieved by binfo->get_dtb(). If get_dtb() returns machine->fdt, as is
the case of machvirt_dtb() from hw/arm/virt.c, fdt now has a pointer to
machine->fdt. And, in that case, the existing g_free(fdt) at the end of
arm_load_dtb() will make machine->fdt point to an invalid memory region.

This is not an issue right now because there's no code that access
machine->fdt after arm_load_dtb(), but we're going to add a couple do
FDT HMP commands that will rely on machine->fdt being valid.

Instead of freeing 'fdt' at the end of arm_load_dtb(), assign it to
machine->fdt. This will allow the FDT of ARM machines that relies on
arm_load_dtb() to be accessed later on.

Since all ARM machines allocates the FDT only once, we don't need to
worry about leaking the existing FDT during a machine reset (which is
something that other machines have to look after, e.g. the ppc64 pSeries
machine).

Cc: Peter Maydell 
Cc: qemu-...@nongnu.org
Signed-off-by: Daniel Henrique Barboza 
---
 hw/arm/boot.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index ada2717f76..669a978157 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -684,7 +684,11 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info 
*binfo,
  */
 rom_add_blob_fixed_as("dtb", fdt, size, addr, as);
 
-g_free(fdt);
+/*
+ * Update the ms->fdt pointer to enable support for
+ * 'dumpdtb' and 'info fdt' QMP/HMP commands.
+ */
+ms->fdt = fdt;
 
 return size;
 
-- 
2.37.2

[PATCH for-7.2 v3 03/20] hw/nios2: set machine->fdt in nios2_load_dtb()

This will enable support for 'dumpdtb' and 'info fdt' HMP commands for
all nios2 machines that uses nios2_load_dtb().

Cc: Chris Wulff 
Cc: Marek Vasut 
Signed-off-by: Daniel Henrique Barboza 
---
 hw/nios2/boot.c  | 11 ++-
 hw/nios2/meson.build |  2 +-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/hw/nios2/boot.c b/hw/nios2/boot.c
index 21cb47..db3b21fea6 100644
--- a/hw/nios2/boot.c
+++ b/hw/nios2/boot.c
@@ -43,6 +43,8 @@
 
 #include "boot.h"
 
+#include 
+
 #define NIOS2_MAGIC0x534f494e
 
 static struct nios2_boot_info {
@@ -81,6 +83,7 @@ static uint64_t translate_kernel_address(void *opaque, 
uint64_t addr)
 static int nios2_load_dtb(struct nios2_boot_info bi, const uint32_t ramsize,
   const char *kernel_cmdline, const char *dtb_filename)
 {
+MachineState *machine = MACHINE(qdev_get_machine());
 int fdt_size;
 void *fdt = NULL;
 int r;
@@ -113,7 +116,13 @@ static int nios2_load_dtb(struct nios2_boot_info bi, const 
uint32_t ramsize,
 }
 
 cpu_physical_memory_write(bi.fdt, fdt, fdt_size);
-g_free(fdt);
+
+/*
+ * Update the machine->fdt pointer to enable support for
+ * 'dumpdtb' and 'info fdt' QMP/HMP commands.
+ */
+machine->fdt = fdt;
+
 return fdt_size;
 }
 
diff --git a/hw/nios2/meson.build b/hw/nios2/meson.build
index 6c58e8082b..22277bd6c5 100644
--- a/hw/nios2/meson.build
+++ b/hw/nios2/meson.build
@@ -1,5 +1,5 @@
 nios2_ss = ss.source_set()
-nios2_ss.add(files('boot.c'))
+nios2_ss.add(files('boot.c'), fdt)
 nios2_ss.add(when: 'CONFIG_NIOS2_10M50', if_true: files('10m50_devboard.c'))
 nios2_ss.add(when: 'CONFIG_NIOS2_GENERIC_NOMMU', if_true: 
files('generic_nommu.c'))
 
-- 
2.37.2

[PATCH for-7.2 v3 02/20] hw/microblaze: set machine->fdt in microblaze_load_dtb()

This will enable support for 'dumpdtb' and 'info fdt' HMP commands for
all microblaze machines that uses microblaze_load_dtb().

Cc: Edgar E. Iglesias 
Signed-off-by: Daniel Henrique Barboza 
---
 hw/microblaze/boot.c  | 11 ++-
 hw/microblaze/meson.build |  2 +-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/hw/microblaze/boot.c b/hw/microblaze/boot.c
index 8b92a9801a..e9ebc04381 100644
--- a/hw/microblaze/boot.c
+++ b/hw/microblaze/boot.c
@@ -39,6 +39,8 @@
 
 #include "boot.h"
 
+#include 
+
 static struct
 {
 void (*machine_cpu_reset)(MicroBlazeCPU *);
@@ -72,6 +74,7 @@ static int microblaze_load_dtb(hwaddr addr,
const char *kernel_cmdline,
const char *dtb_filename)
 {
+MachineState *machine = MACHINE(qdev_get_machine());
 int fdt_size;
 void *fdt = NULL;
 int r;
@@ -100,7 +103,13 @@ static int microblaze_load_dtb(hwaddr addr,
 }
 
 cpu_physical_memory_write(addr, fdt, fdt_size);
-g_free(fdt);
+
+/*
+ * Update the machine->fdt pointer to enable support for
+ * 'dumpdtb' and 'info fdt' QMP/HMP commands.
+ */
+machine->fdt = fdt;
+
 return fdt_size;
 }
 
diff --git a/hw/microblaze/meson.build b/hw/microblaze/meson.build
index bb9e4eb8f4..a38a397872 100644
--- a/hw/microblaze/meson.build
+++ b/hw/microblaze/meson.build
@@ -1,5 +1,5 @@
 microblaze_ss = ss.source_set()
-microblaze_ss.add(files('boot.c'))
+microblaze_ss.add(files('boot.c'), fdt)
 microblaze_ss.add(when: 'CONFIG_PETALOGIX_S3ADSP1800', if_true: 
files('petalogix_s3adsp1800_mmu.c'))
 microblaze_ss.add(when: 'CONFIG_PETALOGIX_ML605', if_true: 
files('petalogix_ml605_mmu.c'))
 microblaze_ss.add(when: 'CONFIG_XLNX_ZYNQMP_PMU', if_true: 
files('xlnx-zynqmp-pmu.c'))
-- 
2.37.2

[PATCH for-7.2 v3 04/20] hw/ppc: set machine->fdt in ppce500_load_device_tree()

This will enable support for 'dumpdtb' and 'info fdt' HMP commands for
the e500 machine.

Cc: Cédric Le Goater 
Signed-off-by: Daniel Henrique Barboza 
---
 hw/ppc/e500.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
index 32495d0123..6197a905b8 100644
--- a/hw/ppc/e500.c
+++ b/hw/ppc/e500.c
@@ -47,6 +47,8 @@
 #include "hw/i2c/i2c.h"
 #include "hw/irq.h"
 
+#include 
+
 #define EPAPR_MAGIC(0x45504150)
 #define DTC_LOAD_PAD   0x180
 #define DTC_PAD_MASK   0xF
@@ -600,7 +602,16 @@ done:
 cpu_physical_memory_write(addr, fdt, fdt_size);
 }
 ret = fdt_size;
-g_free(fdt);
+
+/*
+ * Update the machine->fdt pointer to enable support for
+ * 'dumpdtb' and 'info fdt' QMP/HMP commands.
+ *
+ * The FDT is re-created during reset, so free machine->fdt
+ * to avoid leaking the old FDT.
+ */
+g_free(machine->fdt);
+machine->fdt = fdt;
 
 out:
 g_free(pci_map);
-- 
2.37.2

[PATCH for-7.2 v3 07/20] hw/ppc: set machine->fdt in xilinx_load_device_tree()