date:20250410

Re: [PATCH] vfio/igd: Check host PCI address when probing

2025-04-10 Thread Cédric Le Goater


+ Corvin

On 4/9/25 19:18, Alex Williamson wrote:

On Wed, 26 Mar 2025 01:22:39 +0800
Tomita Moeko  wrote:


So far, all Intel VGA adapters, including discrete GPUs like A770 and
B580, were treated as IGD devices. While this had no functional impact,
an error about "unsupported IGD device" will be printed when passing
through Intel discrete GPUs.

Since IGD devices must be at ":00:02.0", let's check the host PCI
address when probing.

Signed-off-by: Tomita Moeko 
---
  hw/vfio/igd.c | 23 +--
  1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/hw/vfio/igd.c b/hw/vfio/igd.c
index 265fffc2aa..ff250017b0 100644
--- a/hw/vfio/igd.c
+++ b/hw/vfio/igd.c
@@ -53,6 +53,13 @@
   * headless setup is desired, the OpRegion gets in the way of that.
   */
  
+static bool vfio_is_igd(VFIOPCIDevice *vdev)

+{
+return vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) &&
+   vfio_is_vga(vdev) &&
+   vfio_pci_host_match(&vdev->host, ":00:02.0");
+}


vfio-pci devices can also be specified via sysfsdev= rather than host=,
so at a minimum I think we'd need to test against vdev->vbasedev.name,
as other callers of vfio_pci_host_match do.  For example building a
local PCIHostDeviceAddress and comparing it to name.  This is also not
foolproof though if we start taking advantage of devices passed by fd.
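
To illustrate the sysfsdev= point, the address test could parse the device name instead of relying on vdev->host. This is a minimal standalone sketch, not QEMU code: struct pci_host_addr, parse_pci_name() and name_is_igd_slot() are hypothetical names, and it assumes the name is a sysfs-style "domain:bus:slot.function" string with IGD in domain 0.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for QEMU's PCIHostDeviceAddress. */
struct pci_host_addr {
    unsigned int domain, bus, slot, function;
};

/* Parse "DDDD:BB:SS.F"; return false if name is not a complete PCI address. */
static bool parse_pci_name(const char *name, struct pci_host_addr *addr)
{
    int consumed = 0;

    if (sscanf(name, "%x:%x:%x.%x%n", &addr->domain, &addr->bus,
               &addr->slot, &addr->function, &consumed) != 4) {
        return false;
    }
    /* Reject trailing garbage and out-of-range slot/function numbers. */
    return name[consumed] == '\0' && addr->slot <= 0x1f && addr->function <= 7;
}

/* True only for domain 0, bus 0, slot 2, function 0 (assumed IGD location). */
static bool name_is_igd_slot(const char *name)
{
    struct pci_host_addr a;

    return parse_pci_name(name, &a) &&
           a.domain == 0 && a.bus == 0 && a.slot == 2 && a.function == 0;
}
```

A name that is not a PCI address at all (for example a device passed by fd) simply fails to parse, which is the case flagged above as not foolproof.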

Could we instead rely on PCIe capabilities?  A discrete GPU should
identify as either an endpoint or legacy endpoint and IGD should
identify as a root complex integrated endpoint, or maybe older versions
would lack the PCIe capability altogether.


Is maintaining a list of PCI IDs for Intel GPU devices, as Corvin
proposed in [1], not a viable solution?

Thanks,

C.

[1] https://lore.kernel.org/qemu-devel/20250206121341.118337-1-corvin.koe...@gmail.com/
 

Also I think the comments that were dropped below are still valid and
useful to transfer to this new helper.  I think those are actually
referring to the guest address of 00:02.0 though, which should maybe be
a test as well.  Thanks,

Alex


+
  /*
   * This presumes the device is already known to be an Intel VGA device, so we
   * take liberties in which device ID bits match which generation.  This should
@@ -427,13 +434,7 @@ void vfio_probe_igd_bar0_quirk(VFIOPCIDevice *vdev, int nr)
  VFIOConfigMirrorQuirk *ggc_mirror, *bdsm_mirror;
  int gen;
  
-/*

- * This must be an Intel VGA device at address 00:02.0 for us to even
- * consider enabling legacy mode. Some driver have dependencies on the PCI
- * bus address.
- */
-if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
-!vfio_is_vga(vdev) || nr != 0) {
+if (nr != 0 || !vfio_is_igd(vdev)) {
  return;
  }
  
@@ -490,13 +491,7 @@ static bool vfio_pci_igd_config_quirk(VFIOPCIDevice *vdev, Error **errp)

  bool legacy_mode_enabled = false;
  Error *err = NULL;
  
-/*

- * This must be an Intel VGA device at address 00:02.0 for us to even
- * consider enabling legacy mode.  The vBIOS has dependencies on the
- * PCI bus address.
- */
-if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
-!vfio_is_vga(vdev)) {
+if (!vfio_is_igd(vdev)) {
  return true;
  }

[PATCH] i386/cpu: Consolidate the helper to get Host's vendor

2025-04-10 Thread Zhao Liu

Extend host_cpu_vendor_fms() to help more cases to get Host's vendor
information.

Cc: Dongli Zhang 
Signed-off-by: Zhao Liu 
---
 target/i386/host-cpu.c| 10 ++
 target/i386/kvm/vmsr_energy.c |  3 +--
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/target/i386/host-cpu.c b/target/i386/host-cpu.c
index 3e4e85e729c8..072731a4dd25 100644
--- a/target/i386/host-cpu.c
+++ b/target/i386/host-cpu.c
@@ -109,9 +109,13 @@ void host_cpu_vendor_fms(char *vendor, int *family, int *model, int *stepping)
 {
 uint32_t eax, ebx, ecx, edx;
 
-host_cpuid(0x0, 0, &eax, &ebx, &ecx, &edx);
+host_cpuid(0x0, 0, NULL, &ebx, &ecx, &edx);
 x86_cpu_vendor_words2str(vendor, ebx, edx, ecx);
 
+if (!family && !model && !stepping) {
+return;
+}
+
 host_cpuid(0x1, 0, &eax, &ebx, &ecx, &edx);
 if (family) {
 *family = ((eax >> 8) & 0x0F) + ((eax >> 20) & 0xFF);
@@ -129,11 +133,9 @@ void host_cpu_instance_init(X86CPU *cpu)
 X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
 
 if (xcc->model) {
-uint32_t ebx = 0, ecx = 0, edx = 0;
 char vendor[CPUID_VENDOR_SZ + 1];
 
-host_cpuid(0, 0, NULL, &ebx, &ecx, &edx);
-x86_cpu_vendor_words2str(vendor, ebx, edx, ecx);
+host_cpu_vendor_fms(vendor, NULL, NULL, NULL);
 object_property_set_str(OBJECT(cpu), "vendor", vendor, &error_abort);
 }
 }
diff --git a/target/i386/kvm/vmsr_energy.c b/target/i386/kvm/vmsr_energy.c
index 31508d4e77a2..f499ec6e8b05 100644
--- a/target/i386/kvm/vmsr_energy.c
+++ b/target/i386/kvm/vmsr_energy.c
@@ -29,10 +29,9 @@ char *vmsr_compute_default_paths(void)
 
 bool is_host_cpu_intel(void)
 {
-int family, model, stepping;
 char vendor[CPUID_VENDOR_SZ + 1];
 
-host_cpu_vendor_fms(vendor, &family, &model, &stepping);
+host_cpu_vendor_fms(vendor, NULL, NULL, NULL);
 
 return g_str_equal(vendor, CPUID_VENDOR_INTEL);
 }
-- 
2.34.1
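
The vendor-only path added by this patch can be sketched in standalone C. This is an illustration under assumptions, not the QEMU implementation: fake_cpuid() is a hypothetical stand-in for host_cpuid() that returns fixed values, leaf1_queries exists only to make the early return observable, and the model/stepping decoding follows the usual CPUID leaf 1 layout rather than anything shown in the diff.

```c
#include <stddef.h>
#include <stdint.h>

#define VENDOR_SZ 12

/* Counts leaf 1 queries so the early return can be observed. */
static int leaf1_queries;

/* Hypothetical stand-in for host_cpuid(); returns fixed demo values. */
static void fake_cpuid(uint32_t leaf, uint32_t *eax, uint32_t *ebx,
                       uint32_t *ecx, uint32_t *edx)
{
    if (leaf == 0) {
        *eax = 0;
        *ebx = 0x756e6547;  /* "Genu" */
        *edx = 0x49656e69;  /* "ineI" */
        *ecx = 0x6c65746e;  /* "ntel" */
    } else {
        leaf1_queries++;
        *eax = 0x000906ea;
        *ebx = *ecx = *edx = 0;
    }
}

/* vendor must hold VENDOR_SZ + 1 bytes; the other outputs may be NULL. */
static void vendor_fms(char *vendor, int *family, int *model, int *stepping)
{
    uint32_t eax, ebx, ecx, edx;

    fake_cpuid(0, &eax, &ebx, &ecx, &edx);
    /* The vendor string is stored in EBX, EDX, ECX, in that order. */
    for (int i = 0; i < 4; i++) {
        vendor[i]     = (char)(ebx >> (8 * i));
        vendor[i + 4] = (char)(edx >> (8 * i));
        vendor[i + 8] = (char)(ecx >> (8 * i));
    }
    vendor[VENDOR_SZ] = '\0';

    /* Skip the second CPUID query when only the vendor is wanted. */
    if (!family && !model && !stepping) {
        return;
    }

    fake_cpuid(1, &eax, &ebx, &ecx, &edx);
    if (family) {
        *family = ((eax >> 8) & 0x0F) + ((eax >> 20) & 0xFF);
    }
    if (model) {
        *model = ((eax >> 4) & 0x0F) | ((eax & 0xF0000) >> 12);
    }
    if (stepping) {
        *stepping = eax & 0x0F;
    }
}
```

Callers such as is_host_cpu_intel() then pass NULL for the three outputs and compare only the vendor string.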

[PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence

2025-04-10 Thread Manos Pitsidianakis

A hang was observed when running a small kernel that exercised VIRTIO 
GPU under TCG. This is an edge-case and won't happen under typical 
conditions.

When unmapping a blob object, the freeing of its MemoryRegion is deferred
to the RCU thread. The cause of the hang was determined to be a busy main
loop that never let the RCU thread run, because the kernel neither set up
any timers nor had any interrupts on the way. While fixing the RCU thread
to run even if the guest CPU spins is a solution, it's easier to fix the
reason why the MemoryRegion isn't freed from the main loop instead.

While at it, also restructure the 3-stage cleanup to respond to the guest
immediately if the MR happens to have no other reference.

PS: The hang can be reproduced by running this unikernel with TCG 

https://git.codelinaro.org/manos.pitsidianakis/virtio-tests/-/tree/8c0ebe9395827e24aa5711186d499bf5de87cf63/virtio-test-suite

v1 to v2:
  - Add patch by Alex to prevent double-free when FlatView is destroyed 
from RCU thread.

Alex Bennée (1):
  hw/display: re-arrange memory region tracking

Manos Pitsidianakis (2):
  virtio-gpu: fix hang under TCG when unmapping blob
  virtio-gpu: refactor async blob unmapping

 include/exec/memory.h |  1 +
 hw/display/virtio-gpu-virgl.c | 60 ---
 2 files changed, 35 insertions(+), 26 deletions(-)


base-commit: 56c6e249b6988c1b6edc2dd34ebb0f1e570a1365
-- 
γαῖα πυρί μιχθήτω

Re: [PATCH 00/10] Enable QEMU to run on browsers

2025-04-10 Thread Kohei Tokunaga

Hi Stefan,

> > This patch series enables QEMU's system emulator to run in a browser using
> > Emscripten.
> > It includes implementations and workarounds to address browser environment
> > limitations, as shown in the following.
>
> I think it would be great to merge this even if there are limitations
> once code review comments have been addressed. Developing WebAssembly
> support in-tree is likely to allow this effort to develop further than
> if done in personal repos (and with significant efforts required to
> rebase the code periodically).
>
> > # New TCG Backend for Browsers
> >
> > A new TCG backend translates IR instructions into Wasm instructions and runs
> > them using the browser's WebAssembly APIs (WebAssembly.Module and
> > WebAssembly.instantiate). To minimize compilation overhead and avoid hitting
> > the browser's limitation of the number of instances, this backend integrates
> > a forked TCI. TBs run on TCI by default, with frequently executed TBs
> > compiled into WebAssembly.
> >
> > # Workaround for Running 64-bit Guests
> >
> > The current implementation uses Wasm's 32-bit memory model, even though Wasm
> > supports 64-bit variables and instructions. This patch explores implementing
> > TCG 64-bit instructions while leveraging SoftMMU for address translation. To
> > enable 64-bit guest support in Wasm today, it was necessary to partially
> > revert recent changes that removed support for different pointer widths
> > between the host and guest (e.g., commits
> > a70af12addd9060fdf8f3dbd42b42e3072c3914f and
> > bf455ec50b6fea15b4d2493059365bf94c706273) when compiling with
> > Emscripten. While this serves as a temporary workaround, a long-term
> > solution could involve adopting Wasm's 64-bit memory model once it gains
> > broader support, as it is currently not widely adopted (e.g., unsupported by
> > Safari and libffi). Feedback and suggestions on this approach are welcome.
> >
> > # Emscripten-Based Coroutine Backend
> >
> > Emscripten does not support the coroutine methods currently used by QEMU but
> > provides a coroutine implementation called "fiber". This patch series
> > introduces a coroutine backend using fiber. However, fiber does not support
> > submitting coroutines to other threads. So this patch series modifies
> > hw/9pfs/coth.h to disable this behavior when compiled with Emscripten.
>
> QEMU's block job coroutines also rely on switching between threads. See
> how job_co_entry() schedules job_exit(). It's not very likely that users
> will run jobs in a WebAssembly environment, so maybe this is more of a
> theoretical problem for the time being.

Thank you for the feedback. I'll investigate the block job coroutines
further. As you pointed out, I agree that users aren't likely to run block
jobs in the WebAssembly environment.

> If I understand correctly the QEMU project is only building the statically
> linked wasm binary in the CI system and not distributing it (e.g. making
> it available for download)? I'm asking because if the QEMU project wants
> to distribute the wasm binary it may be necessary to put together a
> combined software license to meet the license requirements of glib and
> other dependencies that are statically linked.

Yes, it doesn't distribute the statically linked wasm binary.

Re: [PATCH] vfio/igd: Check host PCI address when probing

2025-04-10 Thread Alex Williamson

On Wed, 26 Mar 2025 01:22:39 +0800
Tomita Moeko  wrote:

> So far, all Intel VGA adapters, including discrete GPUs like A770 and
> B580, were treated as IGD devices. While this had no functional impact,
> an error about "unsupported IGD device" will be printed when passing
> through Intel discrete GPUs.
> 
> Since IGD devices must be at ":00:02.0", let's check the host PCI
> address when probing.
> 
> Signed-off-by: Tomita Moeko 
> ---
>  hw/vfio/igd.c | 23 +--
>  1 file changed, 9 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/vfio/igd.c b/hw/vfio/igd.c
> index 265fffc2aa..ff250017b0 100644
> --- a/hw/vfio/igd.c
> +++ b/hw/vfio/igd.c
> @@ -53,6 +53,13 @@
>   * headless setup is desired, the OpRegion gets in the way of that.
>   */
>  
> +static bool vfio_is_igd(VFIOPCIDevice *vdev)
> +{
> +return vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) &&
> +   vfio_is_vga(vdev) &&
> +   vfio_pci_host_match(&vdev->host, ":00:02.0");
> +}

vfio-pci devices can also be specified via sysfsdev= rather than host=,
so at a minimum I think we'd need to test against vdev->vbasedev.name,
as other callers of vfio_pci_host_match do.  For example building a
local PCIHostDeviceAddress and comparing it to name.  This is also not
foolproof though if we start taking advantage of devices passed by fd.

Could we instead rely on PCIe capabilities?  A discrete GPU should
identify as either an endpoint or legacy endpoint and IGD should
identify as a root complex integrated endpoint, or maybe older versions
would lack the PCIe capability altogether.
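
The capability-based test suggested above could look roughly like the following standalone sketch. The register mask and type values are the ones defined by the PCIe specification (named here as in Linux's pci_regs.h); enum gpu_kind and classify_intel_vga() are hypothetical names, and treating a missing PCIe capability as "integrated" is only the guess voiced in the paragraph above, not verified behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Device/Port Type field of the PCI Express Capabilities register. */
#define PCI_EXP_FLAGS_TYPE     0x00f0
#define PCI_EXP_TYPE_ENDPOINT  0x0   /* PCI Express Endpoint */
#define PCI_EXP_TYPE_LEG_END   0x1   /* Legacy PCI Express Endpoint */
#define PCI_EXP_TYPE_RC_END    0x9   /* Root Complex Integrated Endpoint */

enum gpu_kind { GPU_INTEGRATED, GPU_DISCRETE, GPU_UNKNOWN };

/*
 * has_pcie_cap: whether the device exposes a PCIe capability at all.
 * pcie_flags:   the 16-bit PCI Express Capabilities register value.
 */
static enum gpu_kind classify_intel_vga(bool has_pcie_cap, uint16_t pcie_flags)
{
    if (!has_pcie_cap) {
        /* Assumption: older IGD generations lack the capability. */
        return GPU_INTEGRATED;
    }

    switch ((pcie_flags & PCI_EXP_FLAGS_TYPE) >> 4) {
    case PCI_EXP_TYPE_RC_END:
        return GPU_INTEGRATED;
    case PCI_EXP_TYPE_ENDPOINT:
    case PCI_EXP_TYPE_LEG_END:
        return GPU_DISCRETE;
    default:
        return GPU_UNKNOWN;
    }
}
```

Unlike an address comparison, this does not depend on how the device was specified (host=, sysfsdev= or fd), since it only looks at config space contents.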

Also I think the comments that were dropped below are still valid and
useful to transfer to this new helper.  I think those are actually
referring to the guest address of 00:02.0 though, which should maybe be
a test as well.  Thanks,

Alex

> +
>  /*
>   * This presumes the device is already known to be an Intel VGA device, so we
>   * take liberties in which device ID bits match which generation.  This should
> @@ -427,13 +434,7 @@ void vfio_probe_igd_bar0_quirk(VFIOPCIDevice *vdev, int nr)
>  VFIOConfigMirrorQuirk *ggc_mirror, *bdsm_mirror;
>  int gen;
>  
> -/*
> - * This must be an Intel VGA device at address 00:02.0 for us to even
> - * consider enabling legacy mode. Some driver have dependencies on the PCI
> - * bus address.
> - */
> -if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
> -!vfio_is_vga(vdev) || nr != 0) {
> +if (nr != 0 || !vfio_is_igd(vdev)) {
>  return;
>  }
>  
> @@ -490,13 +491,7 @@ static bool vfio_pci_igd_config_quirk(VFIOPCIDevice *vdev, Error **errp)
>  bool legacy_mode_enabled = false;
>  Error *err = NULL;
>  
> -/*
> - * This must be an Intel VGA device at address 00:02.0 for us to even
> - * consider enabling legacy mode.  The vBIOS has dependencies on the
> - * PCI bus address.
> - */
> -if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
> -!vfio_is_vga(vdev)) {
> +if (!vfio_is_igd(vdev)) {
>  return true;
>  }
>

Re: [PATCH for-10.0] scsi-disk: Apply error policy for host_status errors again

2025-04-10 Thread Kevin Wolf

On 10.04.2025 at 14:37, Michael Tokarev wrote:
> 07.04.2025 18:59, Kevin Wolf wrote:
> > Originally, all failed SG_IO requests called scsi_handle_rw_error() to
> > apply the configured error policy. However, commit f3126d65, which was
> > supposed to be a mere refactoring for scsi-disk.c, broke this and
> > accidentally completed the SCSI request without considering the error
> > policy any more if the error was signalled in the host_status field.
> > 
> > Apart from the commit message not describing the change as intended,
> > errors indicated in host_status are also obviously backend errors and
> > not something the guest must deal with independently of the error policy.
> > 
> > This behaviour means that some recoverable errors (such as a path error
> > in multipath configurations) were reported to the guest anyway, which
> > might not expect it and might consider its disk broken.
> > 
> > Make sure that we apply the error policy again for host_status errors,
> > too. This addresses an existing FIXME comment and allows us to remove
> > some comments warning that callbacks weren't always called. With this
> > fix, they are called in all cases again.
> > 
> > The return value passed to the request callback doesn't have more free
> > values that could be used to indicate host_status errors as well as SAM
> > status codes and negative errno. Store the value in the host_status
> > field of the SCSIRequest instead and use -ENODEV as the return value (if
> > a path hasn't been reachable for a while, blk_aio_ioctl() will return
> > -ENODEV instead of just setting host_status, so just reuse it here -
> > it's not necessarily entirely accurate, but it's as good as any errno).
> > 
> > Cc: qemu-sta...@nongnu.org
> > Fixes: f3126d65b393 ('scsi: move host_status handling into SCSI drivers')
> 
> Does it make sense to apply this one for older stable qemu series?
> In particular, in 8.2, we lack cfe0880835cd3
> "scsi-disk: Use positive return value for status in dma_readv/writev",
> which seems to be relevant here.  Or should I pick up cfe0880835cd3 too,
> maybe together with 8a0495624f (a no-op, just to make this patch to apply
> cleanly) and probably 9da6bd39f924?

Yes, I think it makes sense to pick all of them up (and 622a7016 in the
middle, too), they were part of one series:

https://patchew.org/QEMU/20240731123207.27636-1-kw...@redhat.com/

And this patch builds on top of that series, so rebasing it correctly
might not be trivial without the previous series.

Kevin
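
The return-value convention described in the quoted commit message can be summarised in a small standalone sketch. It is an illustration only: enum ret_kind and classify_ret() are hypothetical names, and the only facts taken from the thread are that 0 means success, negative values are errno codes, positive values are SAM status codes, and a host_status error is signalled by a host_status other than -1 together with ret == -ENODEV.

```c
#include <errno.h>

enum ret_kind { RET_SUCCESS, RET_ERRNO, RET_SAM_STATUS, RET_HOST_STATUS };

/*
 * ret:         value passed to the request callback.
 * host_status: host_status field of the request; -1 means "not set".
 */
static enum ret_kind classify_ret(int ret, int host_status)
{
    if (host_status != -1) {
        /* By convention this path is entered with ret == -ENODEV. */
        return RET_HOST_STATUS;
    }
    if (ret == 0) {
        return RET_SUCCESS;
    }
    return ret < 0 ? RET_ERRNO : RET_SAM_STATUS;
}
```

Checking host_status before looking at ret matters here, because -ENODEV alone cannot be told apart from a genuine errno returned by blk_aio_ioctl().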

Re: [PATCH for-10.0] scsi-disk: Apply error policy for host_status errors again

2025-04-10 Thread Paolo Bonzini


On 4/7/25 17:59, Kevin Wolf wrote:

Originally, all failed SG_IO requests called scsi_handle_rw_error() to
apply the configured error policy. However, commit f3126d65, which was
supposed to be a mere refactoring for scsi-disk.c, broke this and
accidentally completed the SCSI request without considering the error
policy any more if the error was signalled in the host_status field.

Apart from the commit message not describing the change as intended,
errors indicated in host_status are also obviously backend errors and
not something the guest must deal with independently of the error policy.

This behaviour means that some recoverable errors (such as a path error
in multipath configurations) were reported to the guest anyway, which
might not expect it and might consider its disk broken.

Make sure that we apply the error policy again for host_status errors,
too. This addresses an existing FIXME comment and allows us to remove
some comments warning that callbacks weren't always called. With this
fix, they are called in all cases again.

The return value passed to the request callback doesn't have more free
values that could be used to indicate host_status errors as well as SAM
status codes and negative errno. Store the value in the host_status
field of the SCSIRequest instead and use -ENODEV as the return value (if
a path hasn't been reachable for a while, blk_aio_ioctl() will return
-ENODEV instead of just setting host_status, so just reuse it here -
it's not necessarily entirely accurate, but it's as good as any errno).

Cc: qemu-sta...@nongnu.org
Fixes: f3126d65b393 ('scsi: move host_status handling into SCSI drivers')
Signed-off-by: Kevin Wolf 
---
  hw/scsi/scsi-disk.c | 39 +--
  1 file changed, 25 insertions(+), 14 deletions(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 8da1d5a77c..e59632e9b1 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -68,10 +68,9 @@ struct SCSIDiskClass {
  SCSIDeviceClass parent_class;
  /*
   * Callbacks receive ret == 0 for success. Errors are represented either as
- * negative errno values, or as positive SAM status codes.
- *
- * Beware: For errors returned in host_status, the function may directly
- * complete the request and never call the callback.
+ * negative errno values, or as positive SAM status codes. For host_status
+ * errors, the function passes ret == -ENODEV and sets the host_status field
+ * of the SCSIRequest.
   */
  DMAIOFunc   *dma_readv;
  DMAIOFunc   *dma_writev;
@@ -225,11 +224,26 @@ static bool scsi_handle_rw_error(SCSIDiskReq *r, int ret, bool acct_failed)
  SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
  SCSIDiskClass *sdc = (SCSIDiskClass *) object_get_class(OBJECT(s));
  SCSISense sense = SENSE_CODE(NO_SENSE);
+int16_t host_status;
  int error;
  bool req_has_sense = false;
  BlockErrorAction action;
  int status;
  
+/*

+ * host_status should only be set for SG_IO requests that came back with a
+ * host_status error in scsi_block_sgio_complete(). This error path passes
+ * -ENODEV as the return value.
+ *
+ * Reset host_status in the request because we may still want to complete
+ * the request successfully with the 'stop' or 'ignore' error policy.
+ */
+host_status = r->req.host_status;
+if (host_status != -1) {
+assert(ret == -ENODEV);
+r->req.host_status = -1;


You should set ret = 0 here to avoid going down the 
scsi_sense_from_errno() path.


Otherwise,

Reviewed-by: Paolo Bonzini 


+}
+
  if (ret < 0) {
  status = scsi_sense_from_errno(-ret, &sense);
  error = -ret;
@@ -289,6 +303,10 @@ static bool scsi_handle_rw_error(SCSIDiskReq *r, int ret, bool acct_failed)
  if (acct_failed) {
  block_acct_failed(blk_get_stats(s->qdev.conf.blk), &r->acct);
  }
+if (host_status != -1) {
+scsi_req_complete_failed(&r->req, host_status);
+return true;
+}
  if (req_has_sense) {
  sdc->update_sense(&r->req);
  } else if (status == CHECK_CONDITION) {
@@ -409,7 +427,6 @@ done:
  scsi_req_unref(&r->req);
  }
  
-/* May not be called in all error cases, don't rely on cleanup here */

  static void scsi_dma_complete(void *opaque, int ret)
  {
  SCSIDiskReq *r = (SCSIDiskReq *)opaque;
@@ -448,7 +465,6 @@ done:
  scsi_req_unref(&r->req);
  }
  
-/* May not be called in all error cases, don't rely on cleanup here */

  static void scsi_read_complete(void *opaque, int ret)
  {
  SCSIDiskReq *r = (SCSIDiskReq *)opaque;
@@ -585,7 +601,6 @@ done:
  scsi_req_unref(&r->req);
  }
  
-/* May not be called in all error cases, don't rely on cleanup here */

  static void scsi_write_complete(void * opaque, int ret)
  {
  SCSIDiskReq *r = (SCSIDiskReq *)opaque;
@@ -2846,14 +2861,10 @@ static void scsi_b

Re: [RFC PATCH-for-8.0 09/10] hw/virtio: Extract vhost_user_ram_slots_max() to vhost-user-target.c

2025-04-10 Thread Pierrick Bouvier


On 4/10/25 05:14, Philippe Mathieu-Daudé wrote:

Hi Pierrick,

On 13/12/22 00:05, Philippe Mathieu-Daudé wrote:

The current definition of VHOST_USER_MAX_RAM_SLOTS is
target specific. By converting this definition to a runtime
vhost_user_ram_slots_max() helper declared in a target
specific unit, we can have the rest of vhost-user.c target
independent.

To avoid variable length array or using the heap to store
arrays of vhost_user_ram_slots_max() elements, we simply
declare an array of the biggest VHOST_USER_MAX_RAM_SLOTS,
and each target uses up to vhost_user_ram_slots_max()
elements of it. Ensure arrays are big enough by adding an
assertion in vhost_user_init().

Signed-off-by: Philippe Mathieu-Daudé 
---
RFC: Should I add VHOST_USER_MAX_RAM_SLOTS to vhost-user.h
   or create an internal header for it?
---
   hw/virtio/meson.build  |  1 +
   hw/virtio/vhost-user-target.c  | 29 +
   hw/virtio/vhost-user.c | 26 +-
   include/hw/virtio/vhost-user.h |  7 +++
   4 files changed, 42 insertions(+), 21 deletions(-)
   create mode 100644 hw/virtio/vhost-user-target.c

diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index eb7ee8ea92..bf7e35fa8a 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -11,6 +11,7 @@ if have_vhost
 specific_virtio_ss.add(files('vhost.c', 'vhost-backend.c', 'vhost-iova-tree.c'))
 if have_vhost_user
   specific_virtio_ss.add(files('vhost-user.c'))
+specific_virtio_ss.add(files('vhost-user-target.c'))
 endif
 if have_vhost_vdpa
   specific_virtio_ss.add(files('vhost-vdpa.c', 'vhost-shadow-virtqueue.c'))
diff --git a/hw/virtio/vhost-user-target.c b/hw/virtio/vhost-user-target.c
new file mode 100644
index 00..6a0d0f53d0
--- /dev/null
+++ b/hw/virtio/vhost-user-target.c
@@ -0,0 +1,29 @@
+/*
+ * vhost-user target-specific helpers
+ *
+ * Copyright (c) 2013 Virtual Open Systems Sarl.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "hw/virtio/vhost-user.h"
+
+#if defined(TARGET_X86) || defined(TARGET_X86_64) || \
+defined(TARGET_ARM) || defined(TARGET_ARM_64)
+#include "hw/acpi/acpi.h"
+#elif defined(TARGET_PPC) || defined(TARGET_PPC64)
+#include "hw/ppc/spapr.h"
+#endif
+
+unsigned int vhost_user_ram_slots_max(void)
+{
+#if defined(TARGET_X86) || defined(TARGET_X86_64) || \
+defined(TARGET_ARM) || defined(TARGET_ARM_64)
+return ACPI_MAX_RAM_SLOTS;
+#elif defined(TARGET_PPC) || defined(TARGET_PPC64)
+return SPAPR_MAX_RAM_SLOTS;
+#else
+return 512;


Should vhost_user_ram_slots_max be another TargetInfo field?



I don't think so, it would be better to transform the existing function
into something like:


switch (target_current()) {
case TARGET_X86:
case TARGET_ARM:
case TARGET_X86_64:
case TARGET_ARM_64:
return ACPI_MAX_RAM_SLOTS;
case TARGET_PPC:
case TARGET_PPC64:
return SPAPR_MAX_RAM_SLOTS;
default:
return 512;
}

We should not add everything possible to TargetInfo just for the sake of
it, especially because it's hard to follow values set per architecture.
In a case like this, a switch is much more readable and located in one
place. With a generated jump table, it's also quite efficient.


In my opinion, it's another proof we need to have TARGET_X, and 
target_X() available at runtime.
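
For reference, a compilable version of the switch proposed above might look like this. It is a sketch under stated assumptions: enum target_arch and the explicit target parameter are hypothetical stand-ins for a runtime target_current(), and the two slot limits are placeholder values, since the real ones come from hw/acpi/acpi.h and hw/ppc/spapr.h.

```c
/* Hypothetical runtime target identifiers. */
enum target_arch {
    TARGET_X86,
    TARGET_X86_64,
    TARGET_ARM,
    TARGET_ARM_64,
    TARGET_PPC,
    TARGET_PPC64,
    TARGET_OTHER,
};

/* Placeholder limits for illustration only. */
#define ACPI_MAX_RAM_SLOTS   256
#define SPAPR_MAX_RAM_SLOTS  32

/* Runtime replacement for the #if ladder in vhost_user_ram_slots_max(). */
static unsigned int ram_slots_max_for(enum target_arch target)
{
    switch (target) {
    case TARGET_X86:
    case TARGET_X86_64:
    case TARGET_ARM:
    case TARGET_ARM_64:
        return ACPI_MAX_RAM_SLOTS;
    case TARGET_PPC:
    case TARGET_PPC64:
        return SPAPR_MAX_RAM_SLOTS;
    default:
        return 512;
    }
}
```

Because the function no longer depends on preprocessor target macros, it can live in a file that is compiled once for all targets.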



+#endif
+}
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 8f635844af..21fc176725 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -41,24 +41,7 @@
   #define VHOST_MEMORY_BASELINE_NREGIONS8
   #define VHOST_USER_F_PROTOCOL_FEATURES 30
   #define VHOST_USER_SLAVE_MAX_FDS 8
-
-/*
- * Set maximum number of RAM slots supported to
- * the maximum number supported by the target
- * hardware plaform.
- */
-#if defined(TARGET_X86) || defined(TARGET_X86_64) || \
-defined(TARGET_ARM) || defined(TARGET_ARM_64)
-#include "hw/acpi/acpi.h"
-#define VHOST_USER_MAX_RAM_SLOTS ACPI_MAX_RAM_SLOTS
-
-#elif defined(TARGET_PPC) || defined(TARGET_PPC64)
-#include "hw/ppc/spapr.h"
-#define VHOST_USER_MAX_RAM_SLOTS SPAPR_MAX_RAM_SLOTS
-
-#else
   #define VHOST_USER_MAX_RAM_SLOTS 512
-#endif
   
   /*

* Maximum size of virtio device config space
@@ -935,7 +918,7 @@ static int vhost_user_add_remove_regions(struct vhost_dev *dev,
   
   if (track_ramblocks) {

   memcpy(u->postcopy_client_bases, shadow_pcb,
-   sizeof(uint64_t) * VHOST_USER_MAX_RAM_SLOTS);
+   sizeof(uint64_t) * vhost_user_ram_slots_max());
   /*
* Now we've registered this with the postcopy code, we ack to the
* client, because now we're in the position to be able to deal with
@@ -956,7 +939,7 @@ static int vhost_user_add_remove_regions(struct vhost_dev *dev,
   err:
   if (track_ramblocks) {
   memcpy(u->postcopy_client_bases, shadow_pcb,
-   sizeof(uint64_t) * VHOST_USER_MAX_RAM_SLOTS);
+

[PATCH v2 6/6] h264: new vnc option to configure h264 at server side

2025-04-10 Thread Dietmar Maurer

Values can be 'on', 'off', or a space separated list of
allowed gstreamer encoders.

- on: automatically select the encoder
- off: disable h264
- encoder-list: select first available encoder from that list.

Signed-off-by: Dietmar Maurer 
---
 ui/vnc-enc-h264.c | 28 +---
 ui/vnc.c  | 26 +-
 ui/vnc.h  |  6 +-
 3 files changed, 47 insertions(+), 13 deletions(-)

diff --git a/ui/vnc-enc-h264.c b/ui/vnc-enc-h264.c
index 3eabfc2cfe..33067d3a16 100644
--- a/ui/vnc-enc-h264.c
+++ b/ui/vnc-enc-h264.c
@@ -3,13 +3,21 @@
 
 #include 
 
-const char *encoder_list[] = { "x264enc", "openh264enc", NULL };
-
-static const char *get_available_encoder(void)
+static char *get_available_encoder(const char *encoder_list)
 {
+g_assert(encoder_list != NULL);
+
+if (!strcmp(encoder_list, "")) {
+/* use default list */
+encoder_list = "x264enc openh264enc";
+}
+
+char *ret = NULL;
+char **encoder_array = g_strsplit(encoder_list, " ", -1);
+
 int i = 0;
 do {
-const char *encoder_name = encoder_list[i];
+const char *encoder_name = encoder_array[i];
 if (encoder_name == NULL) {
 break;
 }
@@ -17,12 +25,15 @@ static const char *get_available_encoder(void)
 encoder_name, "video-encoder");
 if (element != NULL) {
 gst_object_unref(element);
-return encoder_name;
+ret = strdup(encoder_name);
+break;
 }
 i = i + 1;
 } while (true);
 
-return NULL;
+g_strfreev(encoder_array);
+
+return ret;
 }
 
 static GstElement *create_encoder(const char *encoder_name)
@@ -215,8 +226,10 @@ static bool create_encoder_context(VncState *vs, int w, int h)
 int vnc_h264_encoder_init(VncState *vs)
 {
 g_assert(vs->h264 == NULL);
+g_assert(vs->vd != NULL);
+g_assert(vs->vd->h264_encoder_list != NULL);
 
-const char *encoder_name = get_available_encoder();
+char *encoder_name = get_available_encoder(vs->vd->h264_encoder_list);
 if (encoder_name == NULL) {
 VNC_DEBUG("No H264 encoder available.\n");
 return -1;
@@ -316,6 +329,7 @@ void vnc_h264_clear(VncState *vs)
 
 destroy_encoder_context(vs);
 
+g_free(vs->h264->encoder_name);
 g_free(vs->h264);
 vs->h264 = NULL;
 }
diff --git a/ui/vnc.c b/ui/vnc.c
index 4ba0b715fd..5a7f93e762 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -2190,11 +2190,11 @@ static void set_encodings(VncState *vs, int32_t *encodings, size_t n_encodings)
 break;
 #ifdef CONFIG_GSTREAMER
 case VNC_ENCODING_H264:
-if (vnc_h264_encoder_init(vs) == 0) {
-vnc_set_feature(vs, VNC_FEATURE_H264);
-vs->vnc_encoding = enc;
-} else {
-VNC_DEBUG("vnc_h264_encoder_init failed\n");
+if (vs->vd->h264_encoder_list != NULL) { /* if h264 is enabled */
+if (vnc_h264_encoder_init(vs) == 0) {
+vnc_set_feature(vs, VNC_FEATURE_H264);
+vs->vnc_encoding = enc;
+}
 }
 break;
 #endif
@@ -3634,6 +3634,9 @@ static QemuOptsList qemu_vnc_opts = {
 },{
 .name = "power-control",
 .type = QEMU_OPT_BOOL,
+},{
+.name = "h264",
+.type = QEMU_OPT_STRING,
 },
 { /* end of list */ }
 },
@@ -4196,6 +4199,19 @@ void vnc_display_open(const char *id, Error **errp)
 }
 #endif
 
+#ifdef CONFIG_GSTREAMER
+const char *h264_opt = qemu_opt_get(opts, "h264");
+fprintf(stderr, "GOT %s\n", h264_opt);
+if (!strcmp(h264_opt, "off")) {
+vd->h264_encoder_list = NULL; /* disable h264 */
+} else if  (!strcmp(h264_opt, "on")) {
+vd->h264_encoder_list = ""; /* use default encoder list */
+} else  {
+/* assume this is a list of endiers */
+vd->h264_encoder_list = h264_opt;
+}
+#endif
+
 if (vnc_display_setup_auth(&vd->auth, &vd->subauth,
vd->tlscreds, password,
sasl, false, errp) < 0) {
diff --git a/ui/vnc.h b/ui/vnc.h
index f39dbe21aa..e459441e35 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -188,6 +188,10 @@ struct VncDisplay
 VncDisplaySASL sasl;
 #endif
 
+#ifdef CONFIG_GSTREAMER
+const char *h264_encoder_list;
+#endif
+
 AudioState *audio_state;
 };
 
@@ -239,7 +243,7 @@ typedef struct VncZywrle {
 /* Number of frames we send after the display is clean. */
 #define VNC_H264_KEEP_DIRTY 10
 typedef struct VncH264 {
-const char *encoder_name;
+char *encoder_name;
 GstElement *pipeline, *source, *gst_encoder, *sink, *convert;
 size_t width;
 size_t height;
-- 
2.39.5
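
The option semantics from the commit message ('on', 'off', or a space separated encoder list) can be sketched without GLib or GStreamer as follows. This is not the patch's code: select_h264_encoder(), encoder_probe_fn and only_openh264() are hypothetical names, the probe callback stands in for the GStreamer availability check, and treating an absent option (NULL) like 'on' is an assumption made here so that the sketch never passes NULL to strcmp().

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_ENCODERS "x264enc openh264enc"

/* Callback that reports whether a named encoder can be instantiated. */
typedef bool (*encoder_probe_fn)(const char *name);

/*
 * Return a malloc'ed copy of the first available encoder name, or NULL if
 * h264 is disabled or nothing in the list is available. Caller frees.
 */
static char *select_h264_encoder(const char *opt, encoder_probe_fn available)
{
    if (opt == NULL || strcmp(opt, "on") == 0) {
        opt = DEFAULT_ENCODERS;          /* automatic selection */
    } else if (strcmp(opt, "off") == 0) {
        return NULL;                     /* h264 disabled */
    }

    for (const char *p = opt; *p != '\0'; ) {
        p += strspn(p, " ");             /* skip separators */
        size_t len = strcspn(p, " ");    /* length of the next token */
        if (len == 0) {
            break;
        }
        char *name = malloc(len + 1);
        if (name == NULL) {
            return NULL;
        }
        memcpy(name, p, len);
        name[len] = '\0';
        if (available(name)) {
            return name;
        }
        free(name);
        p += len;
    }
    return NULL;
}

/* Demo probe: pretends that only openh264enc is installed. */
static bool only_openh264(const char *name)
{
    return strcmp(name, "openh264enc") == 0;
}
```

Returning an owned copy mirrors the patch's switch from `const char *` to `char *` for encoder_name, which is why vnc_h264_clear() gains a g_free() call.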

Re: [PATCH] hw/ppc/spapr_hcall: Return host mitigation characteristics in KVM mode

2025-04-10 Thread Philippe Mathieu-Daudé


Hi Gautam,

On 10/4/25 12:43, Gautam Menghani wrote:

Currently, on a P10 KVM guest, the mitigations seen in the output of
"lscpu" command are different from the host. The reason for this
behaviour is that when the KVM guest makes the "h_get_cpu_characteristics"
hcall, QEMU does not consider the data it received from the host via the
KVM_PPC_GET_CPU_CHAR ioctl, and just uses the values present in
spapr->eff.caps[], which in turn just contain the default values set in
spapr_machine_class_init().

Fix this behaviour by making sure that h_get_cpu_characteristics()
returns the data received from the KVM ioctl for a KVM guest.

Perf impact:
With null syscall benchmark[1], ~45% improvement is observed.

1. Vanilla QEMU
$ ./null_syscall
132.19 ns 456.54 cycles

2. With this patch
$ ./null_syscall
91.18 ns 314.57 cycles

[1]: https://ozlabs.org/~anton/junkcode/null_syscall.c

Signed-off-by: Gautam Menghani 
---
  hw/ppc/spapr_hcall.c   | 6 ++
  include/hw/ppc/spapr.h | 1 +
  target/ppc/kvm.c   | 2 ++
  3 files changed, 9 insertions(+)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 406aea4ecb..6aec4e22fc 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1415,6 +1415,12 @@ static target_ulong h_get_cpu_characteristics(PowerPCCPU *cpu,
  uint8_t count_cache_flush_assist = spapr_get_cap(spapr,
   SPAPR_CAP_CCF_ASSIST);
  
+if (kvm_enabled()) {

+args[0] = spapr->chars.character;
+args[1] = spapr->chars.behaviour;


If the kvmppc_get_cpu_characteristics() call fails, we return random data.

Can't we just call kvm_vm_check_extension(s, KVM_CAP_PPC_GET_CPU_CHAR)
and kvm_vm_ioctl(s, KVM_PPC_GET_CPU_CHAR, &c) here?


+return H_SUCCESS;
+}
+
  switch (safe_cache) {
  case SPAPR_CAP_WORKAROUND:
  characteristics |= H_CPU_CHAR_L1D_FLUSH_ORI30;
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 39bd5bd5ed..b1e3ee1ae2 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -283,6 +283,7 @@ struct SpaprMachineState {
  Error *fwnmi_migration_blocker;
  
  SpaprWatchdog wds[WDT_MAX_WATCHDOGS];

+struct kvm_ppc_cpu_char chars;
  };
  
  #define H_SUCCESS 0

diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 992356cb75..fee6c5d131 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -2511,6 +2511,7 @@ bool kvmppc_has_cap_xive(void)
  
  static void kvmppc_get_cpu_characteristics(KVMState *s)

  {
+SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
  struct kvm_ppc_cpu_char c;
  int ret;
  
@@ -2528,6 +2529,7 @@ static void kvmppc_get_cpu_characteristics(KVMState *s)

  return;
  }
  
+spapr->chars = c;

  cap_ppc_safe_cache = parse_cap_ppc_safe_cache(c);
  cap_ppc_safe_bounds_check = parse_cap_ppc_safe_bounds_check(c);
  cap_ppc_safe_indirect_branch = parse_cap_ppc_safe_indirect_branch(c);
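
The review comment above is about the early-return path: when kvmppc_get_cpu_characteristics() returns before reaching "spapr->chars = c", the hcall would hand back whatever spapr->chars happens to contain. One possible way to guard against that is to record whether the query succeeded. This is only a minimal sketch; CachedChars, cache_store() and cache_read() are hypothetical names and not part of QEMU:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct kvm_ppc_cpu_char plus a validity flag. */
typedef struct CachedChars {
    bool valid;            /* set only after a successful query */
    uint64_t character;
    uint64_t behaviour;
} CachedChars;

/* Record the values returned by a successful query. */
static void cache_store(CachedChars *c, uint64_t character, uint64_t behaviour)
{
    c->character = character;
    c->behaviour = behaviour;
    c->valid = true;
}

/*
 * Fill args[] from the cache.  Returns false and leaves args[] untouched
 * when nothing valid was cached, so the caller can fall back to defaults.
 */
static bool cache_read(const CachedChars *c, uint64_t args[2])
{
    assert(c != NULL && args != NULL);
    if (!c->valid) {
        return false;
    }
    args[0] = c->character;
    args[1] = c->behaviour;
    return true;
}
```

With such a flag, the KVM branch of the hcall would only return early when cache_read() succeeds and would otherwise continue into the existing capability-based switch.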

Re: [PATCH 00/10] Enable QEMU to run on browsers

2025-04-10 Thread Kohei Tokunaga

Hi Philippe,

> On 9/4/25 21:21, Stefan Hajnoczi wrote:
> > On Mon, Apr 07, 2025 at 11:45:51PM +0900, Kohei Tokunaga wrote:
> > >> This patch series enables QEMU's system emulator to run in a browser using
> > >> Emscripten.
> > >> It includes implementations and workarounds to address browser environment
> > >> limitations, as shown in the following.
> >
> > I think it would be great to merge this even if there are limitations
> > once code review comments have been addressed. Developing WebAssembly
> > support in-tree is likely to allow this effort to develop further than
> > if done in personal repos (and with significant efforts required to
> > rebase the code periodically).
> >
> >> # New TCG Backend for Browsers
> >>
> > >> A new TCG backend translates IR instructions into Wasm instructions and
> > >> runs them using the browser's WebAssembly APIs (WebAssembly.Module and
> > >> WebAssembly.instantiate). To minimize compilation overhead and avoid
> > >> hitting the browser's limitation of the number of instances, this backend
> > >> integrates a forked TCI. TBs run on TCI by default, with frequently
> > >> executed TBs compiled into WebAssembly.
> > >>
> > >> # Workaround for Running 64-bit Guests
> > >>
> > >> The current implementation uses Wasm's 32-bit memory model, even though
> > >> Wasm supports 64-bit variables and instructions. This patch explores
> > >> implementing TCG 64-bit instructions while leveraging SoftMMU for address
> > >> translation. To enable 64-bit guest support in Wasm today, it was
> > >> necessary to partially revert recent changes that removed support for
> > >> different pointer widths between the host and guest (e.g., commits
> > >> a70af12addd9060fdf8f3dbd42b42e3072c3914f and
> > >> bf455ec50b6fea15b4d2493059365bf94c706273) when compiling with
> > >> Emscripten. While this serves as a temporary workaround, a long-term
> > >> solution could involve adopting Wasm's 64-bit memory model once it gains
> > >> broader support, as it is currently not widely adopted (e.g., unsupported
> > >> by Safari and libffi). Feedback and suggestions on this approach are
> > >> welcome.
>
> The biggest problem I'm seeing is we no longer support 64-bit guests on
> 32-bit hosts, and don't plan to revert that.

Yes, so the sixth patch ("[PATCH 06/10] include/exec: Allow using 64bit
guest addresses on emscripten") should be considered as a temporary
workaround, enabled only for Emscripten builds. It will be removed once
wasm64 gains broader support and is adopted in the Wasm backend.

Re: Note improvements to QAPI-generated manuals in ChangeLog/10.0?

2025-04-10 Thread Stefan Hajnoczi

On Thu, Apr 10, 2025 at 06:53:53AM +0200, Markus Armbruster wrote:
> Stefan Hajnoczi  writes:
> 
> > On Wed, Apr 9, 2025 at 9:44 AM Markus Armbruster  wrote:
> >>
> >> John improved looks and usability of the QAPI-generated manuals quite
> >> a bit.  These are
> >>
> >> QEMU QMP Reference Manual
> >> QEMU Storage Daemon QMP Reference Manual
> >> QEMU Guest Agent Protocol Reference
> >>
> >> Where should it go?  https://wiki.qemu.org/ChangeLog/10.0 has no section
> >> dedicated to the manuals.  We could mention it under System Emulation /
> >> Monitor / QMP, and again under Guest Agent.  Thoughts?
> >
> > I would add a separate Documentation section for changes like this.
> > That's also where major docs infrastructure items like changes to
> > formats (rST), tooling (Sphinx), etc could be announced.
> 
> Insert a "Documentation" section between "Guest agent" and "Build
> Information"?

Sounds good to me.

Stefan



[PATCH v4 2/4] hw/s390x: add Control-Program Identification to QOM

2025-04-10 Thread Shalini Chellathurai Saroja

Add Control-Program Identification data to the QEMU Object
Model (QOM), along with the timestamp at which the data was received.

Example:
virsh # qemu-monitor-command vm --pretty '{
"execute": "qom-get",
"arguments": {
"path": "/machine/sclp/s390-sclp-event-facility/sclpcpi",
"property": "control-program-id" }}'
{
  "return": {
"timestamp": 1742390410685762000,
"system-level": 74872343805430528,
"sysplex-name": "PLEX ",
"system-name": "TESTVM  ",
"system-type": "LINUX   "
  },
  "id": "libvirt-15"
}
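
The two numeric fields in the example reply are easier to read after a little arithmetic. The patch fills "timestamp" from qemu_clock_get_ns(QEMU_CLOCK_HOST), so it is a nanosecond count, and "system-level" is a 64-bit value that QMP prints in decimal. A minimal sketch of the conversions follows; the helper names are hypothetical and not part of the patch:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define NS_PER_SEC UINT64_C(1000000000)

/* Whole seconds contained in a nanosecond timestamp. */
static uint64_t ns_to_sec(uint64_t ns)
{
    return ns / NS_PER_SEC;
}

/* Nanoseconds left over after removing the whole seconds. */
static uint64_t ns_remainder(uint64_t ns)
{
    return ns % NS_PER_SEC;
}

/* Render a 64-bit value as 16 zero-padded hex digits; buf needs 19 bytes. */
static void format_hex64(uint64_t v, char *buf, size_t len)
{
    assert(len >= 19);
    snprintf(buf, len, "0x%016" PRIx64, v);
}
```

Applied to the example, 1742390410685762000 ns splits into 1742390410 s plus 685762000 ns, and 74872343805430528 renders as 0x010a000000060b00.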

Signed-off-by: Shalini Chellathurai Saroja 
---
 hw/s390x/sclpcpi.c| 39 +
 include/hw/s390x/event-facility.h |  9 +
 qapi/machine.json | 58 +++
 3 files changed, 106 insertions(+)

diff --git a/hw/s390x/sclpcpi.c b/hw/s390x/sclpcpi.c
index 13589459b1..dcc8bd3245 100644
--- a/hw/s390x/sclpcpi.c
+++ b/hw/s390x/sclpcpi.c
@@ -18,7 +18,10 @@
   */
 
 #include "qemu/osdep.h"
+#include "qemu/timer.h"
 #include "hw/s390x/event-facility.h"
+#include "hw/s390x/ebcdic.h"
+#include "qapi/qapi-visit-machine.h"
 
 typedef struct Data {
 uint8_t id_format;
@@ -58,11 +61,39 @@ static int write_event_data(SCLPEvent *event, EventBufferHeader *evt_buf_hdr)
 {
 ControlProgramIdMsg *cpim = container_of(evt_buf_hdr, ControlProgramIdMsg,
  ebh);
+SCLPEventCPI *e = SCLP_EVENT_CPI(event);
+
+ascii_put(e->cpi.system_type, (char *)cpim->data.system_type, 8);
+ascii_put(e->cpi.system_name, (char *)cpim->data.system_name, 8);
+ascii_put(e->cpi.sysplex_name, (char *)cpim->data.sysplex_name, 8);
+e->cpi.system_level = ldq_be_p(&cpim->data.system_level);
+e->cpi.timestamp = qemu_clock_get_ns(QEMU_CLOCK_HOST);
 
 cpim->ebh.flags = SCLP_EVENT_BUFFER_ACCEPTED;
 return SCLP_RC_NORMAL_COMPLETION;
 }
 
+static void get_control_program_id(Object *obj, Visitor *v,
+   const char *name, void *opaque,
+   Error **errp)
+{
+SCLPEventCPI *e = SCLP_EVENT_CPI(obj);
+S390ControlProgramId *cpi;
+
+cpi = &(S390ControlProgramId){
+.system_type = g_strndup((char *) e->cpi.system_type,
+ sizeof(e->cpi.system_type)),
+.system_name = g_strndup((char *) e->cpi.system_name,
+ sizeof(e->cpi.system_name)),
+.system_level = e->cpi.system_level,
+.sysplex_name = g_strndup((char *) e->cpi.sysplex_name,
+  sizeof(e->cpi.sysplex_name)),
+.timestamp = e->cpi.timestamp
+};
+
+visit_type_S390ControlProgramId(v, name, &cpi, errp);
+}
+
 static void cpi_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -74,6 +105,14 @@ static void cpi_class_init(ObjectClass *klass, void *data)
 k->get_send_mask = send_mask;
 k->get_receive_mask = receive_mask;
 k->write_event_data = write_event_data;
+
+object_class_property_add(klass, "control-program-id",
+  "S390ControlProgramId",
+  get_control_program_id,
+  NULL, NULL, NULL);
+object_class_property_set_description(klass, "control-program-id",
+"Control-program identifiers provide data about the guest "
+"operating system");
 }
 
 static const TypeInfo sclp_cpi_info = {
diff --git a/include/hw/s390x/event-facility.h b/include/hw/s390x/event-facility.h
index ef469e62ae..123c4ac49c 100644
--- a/include/hw/s390x/event-facility.h
+++ b/include/hw/s390x/event-facility.h
@@ -199,9 +199,18 @@ typedef struct SCLPEventCPI SCLPEventCPI;
 OBJECT_DECLARE_TYPE(SCLPEventCPI, SCLPEventCPIClass,
 SCLP_EVENT_CPI)
 
+typedef struct ControlProgramId {
+uint8_t system_type[8];
+uint8_t system_name[8];
+uint64_t system_level;
+uint8_t sysplex_name[8];
+uint64_t timestamp;
+} ControlProgramId;
+
 struct SCLPEventCPI {
 DeviceState qdev;
 SCLPEvent event;
+ControlProgramId cpi;
 };
 
 #define TYPE_SCLP_EVENT_FACILITY "s390-sclp-event-facility"
diff --git a/qapi/machine.json b/qapi/machine.json
index a6b8795b09..cd2bcd2d13 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1898,3 +1898,61 @@
 { 'command': 'x-query-interrupt-controllers',
   'returns': 'HumanReadableText',
   'features': [ 'unstable' ]}
+
+##
+# @S390ControlProgramId:
+#
+# Control-program identifiers provide data about the guest operating system.
+# The control-program identifiers are: system type, system name, system level
+# and sysplex name.
+#
+# In Linux, all the control-program identifiers are user configurable. The
+# system type, system name, and sysplex name use EBCDIC characters from
+# this set: capital A-Z, 0-9, $, @, #, and blank.  In Linux, the system type,
+# system name and sysplex name are arbitrary free-form texts.
+#
+# In Linux, the 8-byte hexadecimal sy

[PATCH v4 4/4] hw/s390x: compat handling for backward migration

2025-04-10 Thread Shalini Chellathurai Saroja

Add Control-Program Identification (CPI) device to QOM only when the virtual
machine supports CPI. CPI is supported from "s390-ccw-virtio-10.0" machine
and higher.

Signed-off-by: Shalini Chellathurai Saroja 
---
 hw/s390x/s390-virtio-ccw.c | 10 +-
 include/hw/s390x/s390-virtio-ccw.h |  1 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 7f28cbd1de..81832ee638 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -274,6 +274,7 @@ static void s390_create_sclpcpi(SCLPDevice *sclp)
 static void ccw_init(MachineState *machine)
 {
 MachineClass *mc = MACHINE_GET_CLASS(machine);
+S390CcwMachineClass *s390mc = S390_CCW_MACHINE_CLASS(mc);
 S390CcwMachineState *ms = S390_CCW_MACHINE(machine);
 int ret;
 VirtualCssBus *css_bus;
@@ -336,7 +337,10 @@ static void ccw_init(MachineState *machine)
 s390_init_tod();
 
 /* init SCLP event Control-Program Identification */
-s390_create_sclpcpi(ms->sclp);
+if (s390mc->use_cpi) {
+s390_create_sclpcpi(ms->sclp);
+}
+
 }
 
 static void s390_cpu_plug(HotplugHandler *hotplug_dev,
@@ -827,6 +831,7 @@ static void ccw_machine_class_init(ObjectClass *oc, void *data)
 
 s390mc->hpage_1m_allowed = true;
 s390mc->max_threads = 1;
+s390mc->use_cpi = true;
 mc->reset = s390_machine_reset;
 mc->block_default_type = IF_VIRTIO;
 mc->no_cdrom = 1;
@@ -955,6 +960,9 @@ static void ccw_machine_9_2_class_options(MachineClass *mc)
 { TYPE_S390_PCI_DEVICE, "relaxed-translation", "off", },
 };
 
+S390CcwMachineClass *s390mc = S390_CCW_MACHINE_CLASS(mc);
+s390mc->use_cpi = false;
+
 ccw_machine_10_0_class_options(mc);
 compat_props_add(mc->compat_props, hw_compat_9_2, hw_compat_9_2_len);
 compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
diff --git a/include/hw/s390x/s390-virtio-ccw.h b/include/hw/s390x/s390-virtio-ccw.h
index 686d9497d2..fc4112fbf5 100644
--- a/include/hw/s390x/s390-virtio-ccw.h
+++ b/include/hw/s390x/s390-virtio-ccw.h
@@ -55,6 +55,7 @@ struct S390CcwMachineClass {
 /*< public >*/
 bool hpage_1m_allowed;
 int max_threads;
+bool use_cpi;
 };
 
 /* 1M huge page mappings allowed by the machine */
-- 
2.49.0

[PATCH v4 1/4] hw/s390x: add SCLP event type CPI

2025-04-10 Thread Shalini Chellathurai Saroja

Implement the Service-Call Logical Processor (SCLP) event
type Control-Program Identification (CPI) in QEMU. This
event is used to send CPI identifiers from the guest to the
host. The CPI identifiers are: system type, system name,
system level and sysplex name.

System type: operating system of the guest (e.g. "LINUX").
System name: user configurable name of the guest (e.g. "TESTVM").
System level: distribution and kernel version, if the system type is Linux
(e.g. 0x50e00).
Sysplex name: name of the cluster which the guest belongs to (if any)
(e.g. "PLEX").

Signed-off-by: Shalini Chellathurai Saroja 
Reviewed-by: Thomas Huth 
---
 hw/s390x/event-facility.c |  2 +
 hw/s390x/meson.build  |  1 +
 hw/s390x/s390-virtio-ccw.c| 14 +
 hw/s390x/sclpcpi.c| 92 +++
 include/hw/s390x/event-facility.h | 13 +
 5 files changed, 122 insertions(+)
 create mode 100644 hw/s390x/sclpcpi.c

diff --git a/hw/s390x/event-facility.c b/hw/s390x/event-facility.c
index 2b0332c20e..60237b8581 100644
--- a/hw/s390x/event-facility.c
+++ b/hw/s390x/event-facility.c
@@ -4,6 +4,7 @@
  *   handles SCLP event types
  *  - Signal Quiesce - system power down
  *  - ASCII Console Data - VT220 read and write
+ *  - Control-Program Identification - Send OS data from guest to host
  *
  * Copyright IBM, Corp. 2012
  *
@@ -40,6 +41,7 @@ struct SCLPEventFacility {
 SysBusDevice parent_obj;
 SCLPEventsBus sbus;
 SCLPEvent quiesce, cpu_hotplug;
+SCLPEventCPI cpi;
 /* guest's receive mask */
 union {
 uint32_t receive_mask_pieces[2];
diff --git a/hw/s390x/meson.build b/hw/s390x/meson.build
index 3bbebfd817..eb7950489c 100644
--- a/hw/s390x/meson.build
+++ b/hw/s390x/meson.build
@@ -13,6 +13,7 @@ s390x_ss.add(files(
   's390-skeys.c',
   's390-stattrib.c',
   'sclp.c',
+  'sclpcpi.c',
   'sclpcpu.c',
   'sclpquiesce.c',
   'tod.c',
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 75b32182eb..7f28cbd1de 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -260,6 +260,17 @@ static void s390_create_sclpconsole(SCLPDevice *sclp,
 qdev_realize_and_unref(dev, ev_fac_bus, &error_fatal);
 }
 
+static void s390_create_sclpcpi(SCLPDevice *sclp)
+{
+SCLPEventFacility *ef = sclp->event_facility;
+BusState *ev_fac_bus = sclp_get_event_facility_bus(ef);
+DeviceState *dev;
+
+dev = qdev_new(TYPE_SCLP_EVENT_CPI);
+object_property_add_child(OBJECT(ef), "sclpcpi", OBJECT(dev));
+qdev_realize_and_unref(dev, ev_fac_bus, &error_fatal);
+}
+
 static void ccw_init(MachineState *machine)
 {
 MachineClass *mc = MACHINE_GET_CLASS(machine);
@@ -323,6 +334,9 @@ static void ccw_init(MachineState *machine)
 
 /* init the TOD clock */
 s390_init_tod();
+
+/* init SCLP event Control-Program Identification */
+s390_create_sclpcpi(ms->sclp);
 }
 
 static void s390_cpu_plug(HotplugHandler *hotplug_dev,
diff --git a/hw/s390x/sclpcpi.c b/hw/s390x/sclpcpi.c
new file mode 100644
index 00..13589459b1
--- /dev/null
+++ b/hw/s390x/sclpcpi.c
@@ -0,0 +1,92 @@
+ /*
+  * SPDX-License-Identifier: GPL-2.0-or-later
+  *
+  * SCLP event type 11 - Control-Program Identification (CPI):
+  *CPI is used to send program identifiers from the guest to the
+  *Service-Call Logical Processor (SCLP). It is not sent by the SCLP.
+  *Please refer S390ControlProgramId QOM-type description for details
+  *on the contents of the CPI.
+  *
+  * Copyright IBM, Corp. 2024
+  *
+  * Authors:
+  *  Shalini Chellathurai Saroja 
+  *
+  * This work is licensed under the terms of the GNU GPL, version 2 or (at your
+  * option) any later version.  See the COPYING file in the top-level directory.
+  *
+  */
+
+#include "qemu/osdep.h"
+#include "hw/s390x/event-facility.h"
+
+typedef struct Data {
+uint8_t id_format;
+uint8_t reserved0;
+uint8_t system_type[8];
+uint64_t reserved1;
+uint8_t system_name[8];
+uint64_t reserved2;
+uint64_t system_level;
+uint64_t reserved3;
+uint8_t sysplex_name[8];
+uint8_t reserved4[16];
+} QEMU_PACKED Data;
+
+typedef struct ControlProgramIdMsg {
+EventBufferHeader ebh;
+Data data;
+} QEMU_PACKED ControlProgramIdMsg;
+
+static bool can_handle_event(uint8_t type)
+{
+return type == SCLP_EVENT_CTRL_PGM_ID;
+}
+
+static sccb_mask_t send_mask(void)
+{
+return 0;
+}
+
+/* Enable SCLP to accept buffers of event type CPI from the control-program. */
+static sccb_mask_t receive_mask(void)
+{
+return SCLP_EVENT_MASK_CTRL_PGM_ID;
+}
+
+static int write_event_data(SCLPEvent *event, EventBufferHeader *evt_buf_hdr)
+{
+ControlProgramIdMsg *cpim = container_of(evt_buf_hdr, ControlProgramIdMsg,
+ ebh);
+
+cpim->ebh.flags = SCLP_EVENT_BUFFER_ACCEPTED;
+return SCLP_RC_NORMAL_COMPLETION;
+}
+
+static void cpi_class_init(ObjectClass *klass, vo

Re: [PATCH 01/10] various: Fix type conflict of GLib function pointers

2025-04-10 Thread Paolo Bonzini


On 4/7/25 16:45, Kohei Tokunaga wrote:

On Emscripten, function pointer casts can cause function calls to fail.
This commit fixes the function definitions to match the type used at the
call site.

- qtest_set_command_cb passed to g_once should match GThreadFunc


I'm sending an alternative patch that doesn't use GOnce, since this code
runs in the main thread.



- object_class_cmp and cpreg_key_compare are passed to g_list_sort as
   GCompareFunc, but GLib casts them to GCompareDataFunc.


Please use g_list_sort_with_data instead, and poison 
g_slist_sort/g_list_sort in include/glib-compat.h, with a comment 
explaining that it's done this way because of Emscripten.


Paolo


Signed-off-by: Kohei Tokunaga 
---
  hw/riscv/riscv_hart.c | 9 -
  qom/object.c  | 5 +++--
  target/arm/helper.c   | 4 ++--
  3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/hw/riscv/riscv_hart.c b/hw/riscv/riscv_hart.c
index a55d156668..e37317dcbd 100644
--- a/hw/riscv/riscv_hart.c
+++ b/hw/riscv/riscv_hart.c
@@ -102,10 +102,17 @@ static bool csr_qtest_callback(CharBackend *chr, gchar **words)
  return false;
  }
  
+static gpointer g_qtest_set_command_cb(

+bool (*pc_cb)(CharBackend *chr, gchar **words))
+{
+qtest_set_command_cb(pc_cb);
+return NULL;
+}
+
  static void riscv_cpu_register_csr_qtest_callback(void)
  {
  static GOnce once;
-g_once(&once, (GThreadFunc)qtest_set_command_cb, csr_qtest_callback);
+g_once(&once, (GThreadFunc)g_qtest_set_command_cb, csr_qtest_callback);
  }
  #endif
  
diff --git a/qom/object.c b/qom/object.c

index 01618d06bd..19698aae4c 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -1191,7 +1191,8 @@ GSList *object_class_get_list(const char *implements_type,
  return list;
  }
  
-static gint object_class_cmp(gconstpointer a, gconstpointer b)

+static gint object_class_cmp(gconstpointer a, gconstpointer b,
+ gpointer user_data)
  {
  return strcasecmp(object_class_get_name((ObjectClass *)a),
object_class_get_name((ObjectClass *)b));
@@ -1201,7 +1202,7 @@ GSList *object_class_get_list_sorted(const char *implements_type,
   bool include_abstract)
  {
  return g_slist_sort(object_class_get_list(implements_type, include_abstract),
-object_class_cmp);
+(GCompareFunc)object_class_cmp);
  }
  
  Object *object_ref(void *objptr)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index bb445e30cd..68f81fadfc 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -220,7 +220,7 @@ static void count_cpreg(gpointer key, gpointer opaque)
  }
  }
  
-static gint cpreg_key_compare(gconstpointer a, gconstpointer b)

+static gint cpreg_key_compare(gconstpointer a, gconstpointer b, void *d)
  {
  uint64_t aidx = cpreg_to_kvm_id((uintptr_t)a);
  uint64_t bidx = cpreg_to_kvm_id((uintptr_t)b);
@@ -244,7 +244,7 @@ void init_cpreg_list(ARMCPU *cpu)
  int arraylen;
  
  keys = g_hash_table_get_keys(cpu->cp_regs);

-keys = g_list_sort(keys, cpreg_key_compare);
+keys = g_list_sort(keys, (GCompareFunc)cpreg_key_compare);
  
  cpu->cpreg_array_len = 0;
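
The suggestion in this reply, a comparator declared with the full three-argument signature and passed to a sort that takes exactly that type, can be illustrated without GLib. WebAssembly checks the signature of an indirect call at run time, so a two-argument function called through a three-argument pointer type fails there even though it tends to work on native hosts. The sketch below is only an illustration: sort_with_data() is a hypothetical stand-in for g_list_sort_with_data(), and CompareDataFunc mirrors the shape of GLib's GCompareDataFunc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same shape as GLib's GCompareDataFunc: two items plus caller data. */
typedef int (*CompareDataFunc)(const void *a, const void *b, void *user_data);

/* Comparator declared with the full three-argument signature. */
static int name_cmp(const void *a, const void *b, void *user_data)
{
    (void)user_data; /* unused, but kept so the type matches exactly */
    return strcmp((const char *)a, (const char *)b);
}

/*
 * Tiny insertion sort standing in for g_list_sort_with_data(): it calls the
 * comparator through a pointer of exactly the type it was declared with,
 * so no function pointer cast is needed anywhere.
 */
static void sort_with_data(const char **items, size_t n,
                           CompareDataFunc cmp, void *user_data)
{
    assert(items != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        const char *key = items[i];
        size_t j = i;
        while (j > 0 && cmp(items[j - 1], key, user_data) > 0) {
            items[j] = items[j - 1];
            j--;
        }
        items[j] = key;
    }
}
```

Because name_cmp() already has the type the sort expects, the call site needs no (GCompareFunc)-style cast, which is the property the review asks for.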

Re: [PATCH v3 0/3] Enable QEMU NVMe userspace driver on s390x

2025-04-10 Thread Farhan Ali




On 4/3/2025 2:24 PM, Alex Williamson wrote:

On Thu, 3 Apr 2025 13:33:17 -0700
Farhan Ali  wrote:


On 4/3/2025 11:05 AM, Alex Williamson wrote:

On Thu, 3 Apr 2025 10:33:52 -0700
Farhan Ali  wrote:
  

On 4/3/2025 9:27 AM, Alex Williamson wrote:

On Thu, 3 Apr 2025 11:44:42 -0400
Stefan Hajnoczi  wrote:
 

On Thu, Apr 03, 2025 at 09:47:26AM +0200, Niklas Schnelle wrote:

On Wed, 2025-04-02 at 11:51 -0400, Stefan Hajnoczi wrote:

On Tue, Apr 01, 2025 at 10:22:43AM -0700, Farhan Ali wrote:

Hi,

Recently on s390x we have enabled mmap support for vfio-pci devices [1].

Hi Alex,
I wanted to bring this to your attention. Feel free to merge it through
the VFIO tree, otherwise I will merge it once you have taken a look.

Thanks,
Stefan
 

This allows us to take advantage of this and use userspace drivers on s390x.
However, on s390x we have special instructions for MMIO access. Starting with
z15 (and newer platforms) we have new PCI Memory I/O (MIO) instructions which
operate on virtually mapped PCI memory spaces, and can be used from userspace.
On older platforms we would fall back to using existing system calls for MMIO
access.

This patch series introduces support for the PCI MIO instructions, and enables
the userspace NVMe driver on s390x. I would appreciate any review/feedback
on the patches.

Thanks
Farhan

Hi Stefan,

the kernel patch actually made it into Linus' tree for v6.15 already as
commit aa9f168d55dc ("s390/pci: Support mmap() of PCI resources except
for ISM devices") plus prerequisites. This went via the PCI tree
because they included a change to struct pci_dev and also enabled
mmap() on PCI resource files. Alex reviewed an earlier version and was
the one who suggested to also enable mmap() on PCI resources.

The introduction of a new QEMU API for accessing MMIO BARs in this
series is something Alex might be interested in as QEMU VFIO maintainer.
That wouldn't have been part of the kernel patch review.

If he's aware of the new API he can encourage other VFIO users to use it
in the future so that you won't need to convert them to work on s390x
again.

I don't claim any jurisdiction over the vfio-nvme driver.  In general
vfio users should be using either vfio_region_ops, ram_device_mem_ops,
or directly mapping MMIO into the VM address space.  The first uses
pread/write through the region offset, irrespective of the type of
memory, the second provides the type of access used here where we're
dereferencing into an mmap, and the last is of course the preferred
mechanism where available.

It is curious that the proposal here doesn't include any changes to
ram_device_mem_ops for more generically enabling MMIO access on s390x.
Thanks,

Alex

Hi Alex,
   From my understanding the ram_device_mem_ops sets up the BAR access for
a guest passthrough device. Unfortunately today an s390x KVM guest
doesn't use or have support for these MIO instructions. We wanted to
use this series as an initial test vehicle of the mmap support.

Right, ram_device_mem_ops is what we'll use to access a BAR that
supports mmap but for whatever reason we're accessing it directly
through the mmap.  For instance if an overlapping quirk prevents the
page from being mapped to the VM or we have some back channel mechanism
where the VMM is interacting with the BAR.

I bring it up here because it's effectively the same kind of access
you're adding with these helpers and would need to be addressed if this
were generically enabling vfio mmap access on s390x.

On s390x the use of the MIO instructions is limited to only PCI access.
So I am not sure if we should generically apply this to all vfio mmap
access (for non PCI devices).



Prior to commit 2b8fe81b3c2e ("system/memory: use ldn_he_p/stn_he_p")
the mmio helpers here might have been a drop-in replacement for the
dereferencing of mmap offsets, but something would need to be done
about the explicit PCI assumption introduced here and the possibility
of unaligned accesses that the noted commit tries to resolve.  Thanks,

Alex

AFAICT in qemu today the ram_device_mem_ops is used for non PCI vfio
mmap cases. For s390x these helpers should be restricted to PCI
accesses. For the unaligned accesses (thanks for pointing out that
commit!), are you suggesting we use the ld*_he_p/st*_he_p functions in
the helpers I defined? Though those functions don't seem to be doing
volatile accesses.

TBH, it's not clear to me that 2b8fe81b3c2e is correct.  We implemented
the ram_device MemoryRegion specifically to avoid memory access
optimizations that are not compatible with MMIO, but I see that these
{ld,st}*_he_p operations are using __builtin_memcpy.  I'm not a
compiler aficionado, but is __builtin_memcpy guaranteed to use an
instruction set compatible with MMIO?
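
The distinction being asked about can be sketched in plain C. Both helpers below are hypothetical illustrations rather than QEMU's actual ld*_he_p or ram_device code: the first copies through memcpy, which leaves the compiler free to choose the instructions, while the second performs a volatile access, which compilers commonly emit as a single load of the requested width (the C standard does not strictly promise that).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * memcpy-style load: safe for unaligned pointers, but the compiler may
 * implement it with any instruction sequence (byte loads, vector loads, ...).
 */
static uint32_t load32_memcpy(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/*
 * volatile load: the access must actually be performed and cannot be
 * elided or merged with neighbouring accesses; p must be 4-byte aligned.
 */
static uint32_t load32_volatile(const volatile void *p)
{
    assert(((uintptr_t)p & 3) == 0);
    return *(const volatile uint32_t *)p;
}
```

On ordinary RAM both return the same value; the difference only matters when the address is device memory, where the width and number of accesses are visible to the device.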

Cc: folks related to that commit.

The original issue that brought us ram_device was a very obscure
alignment of a memory region versus a device quirk only seen with
assignment of specific RTL NICs.

The description for comm

Re: [PATCH 02/10] various: Define macros for dependencies on emscripten

2025-04-10 Thread Paolo Bonzini


On 4/7/25 16:45, Kohei Tokunaga wrote:

+#ifdef EMSCRIPTEN
+/*
+ * emscripten exposes copy_file_range declaration but doesn't provide the
+ * implementation in the final link. Define the stub here but avoid type
+ * conflict with the emscripten's header.
+ */
+ssize_t copy_file_range(int in_fd, off_t *in_off, int out_fd,
+ off_t *out_off, size_t len, unsigned int flags)
+{
+errno = ENOSYS;
+return -1;
+}


Please add a file stubs/emscripten.c with this function, and add it to 
the build in stubs/meson.build.



+#ifdef EMSCRIPTEN
+error_report("initgroups unsupported");
+exit(1);


I think it's best to add a new file os-wasm.c in addition to 
os-posix.c and os-win32.c, and disable all the functionality of 
-run-with and -daemonize in vl.c via


-#if defined(CONFIG_POSIX)
+#if defined(CONFIG_POSIX) && !defined(EMSCRIPTEN)

(there are a couple of occurrences).

Thanks,

Paolo
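
A stub of the kind requested for stubs/emscripten.c could look roughly like the sketch below. It keeps the parameter list from the quoted hunk but uses the hypothetical name stub_copy_file_range so that the example also builds on hosts whose libc already provides the real function; the name and the static linkage are assumptions made for illustration only.

```c
#include <errno.h>
#include <sys/types.h>

/*
 * Hypothetical stub with the same parameters as copy_file_range(2).
 * It never copies anything: it reports ENOSYS so callers take their
 * existing "not supported" fallback path.
 */
static ssize_t stub_copy_file_range(int in_fd, off_t *in_off, int out_fd,
                                    off_t *out_off, size_t len,
                                    unsigned int flags)
{
    (void)in_fd;
    (void)in_off;
    (void)out_fd;
    (void)out_off;
    (void)len;
    (void)flags;
    errno = ENOSYS;
    return -1;
}
```

Keeping the stub in its own file, as the review suggests, leaves the generic code free of #ifdef EMSCRIPTEN blocks.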

[PATCH] tests/avocado: add memlock tests

2025-04-10 Thread Alexandr Moshkov

Add new tests to check the correctness of the `-overcommit mem-lock`
option (possible values: off, on, on-fault) by using the
`/proc/{qemu_pid}/smaps` file to check the Size, Rss and Locked fields of
anonymous segments:

* if `mem-lock=off`, then Locked = 0 on every anonymous segment;
* if `mem-lock=on`, then the Size, Rss and Locked values must be equal for
every anonymous segment where Rss is not 0;
* if `mem-lock=on-fault`, then Rss and Locked must be equal on every
anonymous segment, and an anonymous segment with Rss < Size must exist.

Signed-off-by: Alexandr Moshkov 
---
 tests/avocado/memlock.py | 98 
 1 file changed, 98 insertions(+)
 create mode 100644 tests/avocado/memlock.py

diff --git a/tests/avocado/memlock.py b/tests/avocado/memlock.py
new file mode 100644
index 00..935cc3dc42
--- /dev/null
+++ b/tests/avocado/memlock.py
@@ -0,0 +1,98 @@
+# Functional test that checks overcommit memlock options
+#
+# Copyright (c) Yandex Technologies LLC, 2025
+#
+# Author:
+#  Alexandr Moshkov 
+#
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or
+# later.  See the COPYING file in the top-level directory.
+
+import re
+
+from typing import List, Dict
+
+from avocado_qemu.linuxtest import LinuxTest
+
+
+SMAPS_HEADER_PATTERN = re.compile(r'^\w+-\w+', re.MULTILINE)
+SMAPS_VALUE_PATTERN = re.compile(r'^(\w+):\s+(\d+) kB', re.MULTILINE)
+
+
+class Memlock(LinuxTest):
+"""
+Boots a Linux system with memlock options.
+Then verifies that these options are working correctly
+by checking the smaps of the QEMU process.
+"""
+
+def common_vm_setup_with_memlock(self, memlock):
+self.vm.add_args('-overcommit', f'mem-lock={memlock}')
+self.launch_and_wait(set_up_ssh_connection=False)
+
+def get_anon_smaps_by_pid(self, pid):
+smaps_raw = self._get_raw_smaps_by_pid(pid)
+return self._parse_anonymous_smaps(smaps_raw)
+
+
+def test_memlock_off(self):
+self.common_vm_setup_with_memlock('off')
+
+anon_smaps = self.get_anon_smaps_by_pid(self.vm.get_pid())
+
+# locked = 0 on every smap
+for smap in anon_smaps:
+self.assertEqual(smap['Locked'], 0)
+
+def test_memlock_on(self):
+self.common_vm_setup_with_memlock('on')
+
+anon_smaps = self.get_anon_smaps_by_pid(self.vm.get_pid())
+
+# size = rss = locked on every smap where rss not 0
+for smap in anon_smaps:
+if smap['Rss'] == 0:
+continue
+self.assertTrue(smap['Size'] == smap['Rss'] == smap['Locked'])
+
+def test_memlock_onfault(self):
+self.common_vm_setup_with_memlock('on-fault')
+
+anon_smaps = self.get_anon_smaps_by_pid(self.vm.get_pid())
+
+# rss = locked on every smap and segment with rss < size exists
+exists = False
+for smap in anon_smaps:
+self.assertTrue(smap['Rss'] == smap['Locked'])
+if smap['Rss'] < smap['Size']:
+exists = True
+self.assertTrue(exists)
+
+
+def _parse_anonymous_smaps(self, smaps_raw: str) -> List[Dict[str, int]]:
+result_segments = []
+current_segment = {}
+is_anonymous = False
+
+for line in smaps_raw.split('\n'):
+if SMAPS_HEADER_PATTERN.match(line):
+if current_segment and is_anonymous:
+result_segments.append(current_segment)
+current_segment = {}
+# anonymous segment header looks like this:
+# 7f3b8d3f-7f3b8d3f3000 rw-s  00:0f 1052
+# and non anonymous header looks like this:
+# 7f3b8d3f-7f3b8d3f3000 rw-s  00:0f 1052   [stack]
+is_anonymous = len(line.split()) == 5
+elif m := SMAPS_VALUE_PATTERN.match(line):
+current_segment[m.group(1)] = int(m.group(2))
+
+if current_segment and is_anonymous:
+result_segments.append(current_segment)
+
+return result_segments
+
+def _get_raw_smaps_by_pid(self, pid: int) -> str:
+with open(f'/proc/{pid}/smaps', 'r') as f:
+return f.read()
-- 
2.34.1
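
The three smaps invariants from the commit message can also be written as standalone predicates, which may help when reasoning about edge cases such as segments with Rss = 0. This is a minimal sketch in C rather than the test's Python; the Segment struct and the check_* names are hypothetical, and values are assumed to be in kB as reported by smaps.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one anonymous smaps segment, values in kB. */
typedef struct Segment {
    long size;
    long rss;
    long locked;
} Segment;

/* mem-lock=off: no segment has locked pages. */
static bool check_off(const Segment *s, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (s[i].locked != 0) {
            return false;
        }
    }
    return true;
}

/* mem-lock=on: every populated segment is fully resident and locked. */
static bool check_on(const Segment *s, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (s[i].rss == 0) {
            continue; /* unpopulated segments are skipped, as in the test */
        }
        if (s[i].size != s[i].rss || s[i].rss != s[i].locked) {
            return false;
        }
    }
    return true;
}

/*
 * mem-lock=on-fault: resident pages are locked, and at least one segment
 * is only partly resident (Rss < Size).
 */
static bool check_on_fault(const Segment *s, size_t n)
{
    bool partial = false;
    for (size_t i = 0; i < n; i++) {
        if (s[i].rss != s[i].locked) {
            return false;
        }
        if (s[i].rss < s[i].size) {
            partial = true;
        }
    }
    return partial;
}
```

Note that a segment with Rss = 0 and Locked = 0 counts as "partly resident" for the on-fault check, since 0 < Size holds for it.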

Re: [PATCH for-10.0] scsi-disk: Apply error policy for host_status errors again

2025-04-10 Thread Paolo Bonzini

On Thu, Apr 10, 2025 at 4:25 PM Paolo Bonzini  wrote:
> You should set ret = 0 here to avoid going down the
> scsi_sense_from_errno() path.
>
> Otherwise,
>
> Reviewed-by: Paolo Bonzini 

Okay, going down the scsi_sense_from_errno() path is more or less
harmless because status and sense end up unused; even though ENODEV is
not something that the function handles, that can be added as a
cleanup in 10.1.

Paolo

> > +}
> > +
> >   if (ret < 0) {
> >   status = scsi_sense_from_errno(-ret, &sense);
> >   error = -ret;
> > @@ -289,6 +303,10 @@ static bool scsi_handle_rw_error(SCSIDiskReq *r, int ret, bool acct_failed)
> >   if (acct_failed) {
> >   block_acct_failed(blk_get_stats(s->qdev.conf.blk), &r->acct);
> >   }
> > +if (host_status != -1) {
> > +scsi_req_complete_failed(&r->req, host_status);
> > +return true;
> > +}
> >   if (req_has_sense) {
> >   sdc->update_sense(&r->req);
> >   } else if (status == CHECK_CONDITION) {
> > @@ -409,7 +427,6 @@ done:
> >   scsi_req_unref(&r->req);
> >   }
> >
> > -/* May not be called in all error cases, don't rely on cleanup here */
> >   static void scsi_dma_complete(void *opaque, int ret)
> >   {
> >   SCSIDiskReq *r = (SCSIDiskReq *)opaque;
> > @@ -448,7 +465,6 @@ done:
> >   scsi_req_unref(&r->req);
> >   }
> >
> > -/* May not be called in all error cases, don't rely on cleanup here */
> >   static void scsi_read_complete(void *opaque, int ret)
> >   {
> >   SCSIDiskReq *r = (SCSIDiskReq *)opaque;
> > @@ -585,7 +601,6 @@ done:
> >   scsi_req_unref(&r->req);
> >   }
> >
> > -/* May not be called in all error cases, don't rely on cleanup here */
> >   static void scsi_write_complete(void * opaque, int ret)
> >   {
> >   SCSIDiskReq *r = (SCSIDiskReq *)opaque;
> > @@ -2846,14 +2861,10 @@ static void scsi_block_sgio_complete(void *opaque, int ret)
> >   sg_io_hdr_t *io_hdr = &req->io_header;
> >
> >   if (ret == 0) {
> > -/* FIXME This skips calling req->cb() and any cleanup in it */
> >   if (io_hdr->host_status != SCSI_HOST_OK) {
> > -scsi_req_complete_failed(&r->req, io_hdr->host_status);
> > -scsi_req_unref(&r->req);
> > -return;
> > -}
> > -
> > -if (io_hdr->driver_status & SG_ERR_DRIVER_TIMEOUT) {
> > +r->req.host_status = io_hdr->host_status;
> > +ret = -ENODEV;
> > +} else if (io_hdr->driver_status & SG_ERR_DRIVER_TIMEOUT) {
> >   ret = BUSY;
> >   } else {
> >   ret = io_hdr->status;
>

[PATCH v4 0/4] Add SCLP event type CPI

2025-04-10 Thread Shalini Chellathurai Saroja

Implement the Service-Call Logical Processor (SCLP) event
type Control-Program Identification (CPI) in QEMU.

Changed since v3:
- Add QOM object sclpcpi from ccw_init()
- Add SCLPEventCPI state to store the CPI data in the sclpcpi device
- Other minor changes

Changed since v2:
- Add SPDX license tag in the new file hw/s390x/sclpcpi.c
- Store the Control-Program Identification data in the sclpcpi device
- Update the description of CPI attributes
- Use ldq_be_p() instead of be64_to_cpu()
- Return the CPI attribute system-level as an integer in QMP
- Add compat handling for backward migration
- Other minor changes

Shalini Chellathurai Saroja (4):
  hw/s390x: add SCLP event type CPI
  hw/s390x: add Control-Program Identification to QOM
  hw/s390x: support migration of CPI data
  hw/s390x: compat handling for backward migration

 hw/s390x/event-facility.c  |   2 +
 hw/s390x/meson.build   |   1 +
 hw/s390x/s390-virtio-ccw.c |  22 
 hw/s390x/sclpcpi.c | 156 +
 include/hw/s390x/event-facility.h  |  22 
 include/hw/s390x/s390-virtio-ccw.h |   1 +
 qapi/machine.json  |  58 +++
 7 files changed, 262 insertions(+)
 create mode 100644 hw/s390x/sclpcpi.c

-- 
2.49.0

[PATCH] scsi: add conversion from ENODEV to sense

2025-04-10 Thread Paolo Bonzini

This is mostly for completeness; I noticed it because ENODEV is used internally
within scsi-disk.c, but when scsi_sense_from_errno(ENODEV) is called the
resulting sense is never used and instead scsi_sense_from_host_status() is
called later by scsi_req_complete_failed().

Signed-off-by: Paolo Bonzini 
---
 scsi/utils.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/scsi/utils.c b/scsi/utils.c
index 357b0366716..c6d9e04cb19 100644
--- a/scsi/utils.c
+++ b/scsi/utils.c
@@ -587,20 +587,21 @@ int scsi_sense_from_errno(int errno_value, SCSISense *sense)
 return GOOD;
 case EDOM:
 return TASK_SET_FULL;
+case ENODEV:
 #ifdef CONFIG_LINUX
 /* These errno mapping are specific to Linux.  For more information:
  * - scsi_check_sense and scsi_decide_disposition in drivers/scsi/scsi_error.c
  * - scsi_result_to_blk_status in drivers/scsi/scsi_lib.c
  * - blk_errors[] in block/blk-core.c
  */
+case EREMOTEIO:
+*sense = SENSE_CODE(TARGET_FAILURE);
+return CHECK_CONDITION;
 case EBADE:
 return RESERVATION_CONFLICT;
 case ENODATA:
 *sense = SENSE_CODE(READ_ERROR);
 return CHECK_CONDITION;
-case EREMOTEIO:
-*sense = SENSE_CODE(TARGET_FAILURE);
-return CHECK_CONDITION;
 #endif
 case ENOMEDIUM:
 *sense = SENSE_CODE(NO_MEDIUM);
-- 
2.49.0
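
The effect of stacking the new label can be sketched in isolation. This is a
minimal illustration, not the code in scsi/utils.c: the Status and SenseKind
enums and the sense_from_errno() name are simplified stand-ins invented here,
the CONFIG_LINUX guard is dropped, and a Linux host (where EREMOTEIO and
ENODATA are defined) is assumed. It only shows that a "case ENODEV:" label with
no statements shares the body of the "case EREMOTEIO:" label that follows it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for illustration only; not QEMU's definitions. */
typedef enum { ST_GOOD, ST_CHECK_CONDITION } Status;
typedef enum {
    SENSE_NONE,
    SENSE_READ_ERROR,
    SENSE_TARGET_FAILURE,
    SENSE_OTHER
} SenseKind;

/* Hypothetical helper: map an errno value to a status and a sense kind. */
static Status sense_from_errno(int errno_value, SenseKind *sense)
{
    assert(sense != NULL);
    *sense = SENSE_NONE;

    switch (errno_value) {
    case 0:
        return ST_GOOD;
    case ENODEV:        /* no statements: shares the EREMOTEIO body below */
    case EREMOTEIO:
        *sense = SENSE_TARGET_FAILURE;
        return ST_CHECK_CONDITION;
    case ENODATA:
        *sense = SENSE_READ_ERROR;
        return ST_CHECK_CONDITION;
    default:
        /* Everything else is lumped together in this sketch. */
        *sense = SENSE_OTHER;
        return ST_CHECK_CONDITION;
    }
}
```

With this shape, a caller passing ENODEV gets the same sense as one passing
EREMOTEIO, which is the behaviour the patch description asks for.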

Re: [PATCH 1/5] qapi/qom: Introduce kvm-pmu-filter object

2025-04-10 Thread Markus Armbruster

Zhao Liu  writes:

> Introduce the kvm-pmu-filter object and support the PMU event with raw
> format.

Remind me, what does the kvm-pmu-filter object do, and why would we want
to use it?

> The raw format, as a native PMU event code representation, can be used
> for several architectures.
>
> Signed-off-by: Zhao Liu 
> Tested-by: Yi Lai

[PATCH] hw/riscv: Fix type conflict of GLib function pointers

2025-04-10 Thread Paolo Bonzini

qtest_set_command_cb passed to g_once should match GThreadFunc,
which it does not.  But using g_once is actually unnecessary,
because the function is called by riscv_harts_realize() under
the Big QEMU Lock.

Reported-by: Kohei Tokunaga 
Signed-off-by: Paolo Bonzini 
---
 hw/riscv/riscv_hart.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/riscv/riscv_hart.c b/hw/riscv/riscv_hart.c
index a55d1566687..bb9104bae0b 100644
--- a/hw/riscv/riscv_hart.c
+++ b/hw/riscv/riscv_hart.c
@@ -104,8 +104,11 @@ static bool csr_qtest_callback(CharBackend *chr, gchar **words)
 
 static void riscv_cpu_register_csr_qtest_callback(void)
 {
-static GOnce once;
-g_once(&once, (GThreadFunc)qtest_set_command_cb, csr_qtest_callback);
+static bool first = true;
+if (first) {
+first = false;
+qtest_set_command_cb(csr_qtest_callback);
+}
 }
 #endif
 
-- 
2.49.0
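
The replacement pattern can be tried outside QEMU. The sketch below is only an
illustration under stated assumptions: set_command_cb(), the CommandCb type and
the counter are hypothetical stand-ins for qtest_set_command_cb() and its real
callback signature. The static flag is safe only because all callers are
assumed to be serialised by one lock, as the commit message says is the case
under the Big QEMU Lock; it is not a general thread-safe replacement for
g_once().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified callback type; the real one takes different arguments. */
typedef bool (*CommandCb)(const char *cmd);

static CommandCb registered_cb;
static int registration_count;

/* Hypothetical stand-in for the registration function. */
static void set_command_cb(CommandCb cb)
{
    assert(cb != NULL);
    registered_cb = cb;
    registration_count++;
}

static bool csr_callback(const char *cmd)
{
    return cmd != NULL;
}

/*
 * Register the callback at most once. Callers must already be serialised
 * by a single lock; the plain bool offers no protection of its own.
 */
static void register_csr_callback(void)
{
    static bool registered;

    if (!registered) {
        registered = true;
        set_command_cb(csr_callback);
    }
}
```

Calling register_csr_callback() repeatedly leaves registration_count at 1, and
no function-pointer cast is needed because the registration function is called
directly with a correctly typed argument.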

Re: [PATCH] hw/riscv: Fix type conflict of GLib function pointers

2025-04-10 Thread Kohei Tokunaga

Hi Paolo,
I appreciate you addressing this issue and submitting the fix.

Reviewed-by: Kohei Tokunaga

Re: [PATCH] [for-10.1] qtest: introduce qtest_init_ext

2025-04-10 Thread Philippe Mathieu-Daudé


On 10/4/25 18:22, Vladimir Sementsov-Ogievskiy wrote:

Merge qtest_init_with_env_and_capabilities() and qtest_init_with_env()
into one qtest_init_ext().

Reasons:

1. qtest_init_with_env() is just wrong: it takes a do_connect parameter
but always passes true to qtest_init_with_env_and_capabilities().
Happily, all qtest_init_with_env() callers pass true as well.

2. qtest_init_with_env() is not used outside of libqtest.c, so there is
no reason to keep it as a public function.

3. In libqtest.c it is not used often, so there is no problem using the
more generic function instead.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qtest/libqtest.c| 18 +-
  tests/qtest/libqtest.h| 30 +++---
  tests/qtest/migration/framework.c |  7 +++
  3 files changed, 15 insertions(+), 40 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH] hw/riscv: Fix type conflict of GLib function pointers

2025-04-10 Thread Philippe Mathieu-Daudé


On 10/4/25 18:17, Paolo Bonzini wrote:

qtest_set_command_cb passed to g_once should match GThreadFunc,
which it does not.  But using g_once is actually unnecessary,
because the function is called by riscv_harts_realize() under
the Big QEMU Lock.

Reported-by: Kohei Tokunaga 
Signed-off-by: Paolo Bonzini 
---
  hw/riscv/riscv_hart.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/riscv/riscv_hart.c b/hw/riscv/riscv_hart.c
index a55d1566687..bb9104bae0b 100644
--- a/hw/riscv/riscv_hart.c
+++ b/hw/riscv/riscv_hart.c
@@ -104,8 +104,11 @@ static bool csr_qtest_callback(CharBackend *chr, gchar **words)
  
  static void riscv_cpu_register_csr_qtest_callback(void)

  {
-static GOnce once;
-g_once(&once, (GThreadFunc)qtest_set_command_cb, csr_qtest_callback);
+static bool first = true;


Preferably using 'qtest_cb_registered' boolean name,

Reviewed-by: Philippe Mathieu-Daudé 


+if (first) {
+first = false;
+qtest_set_command_cb(csr_qtest_callback);
+}
  }
  #endif

[PATCH v2 0/6] Add VNC Open H.264 Encoding

2025-04-10 Thread Dietmar Maurer

As defined by:

https://github.com/rfbproto/rfbproto/blob/master/rfbproto.rst#open-h-264-encoding

The noVNC HTML application recently added support for this encoding. There is
also an open pull request to add audio support to noVNC:

https://github.com/novnc/noVNC/pull/1952

With that in place, the web based VNC console is good enough to display
a VM showing a video with reasonable bandwidth.

Possible improvements:

- Dynamic switching to/from H264 mode at high change rates
- Support for hardware encoders

We may also extend the RFB Audio protocol with "opus" encoding, because
uncompressed audio needs too much bandwidth.

Changes in v2:

- cleanup: h264: remove wrong libavcodec_ prefix from function names
- search for available h264 encoder, and only enable h264 if an
  encoder is available
- new vnc option to configure h264 at server side


Dietmar Maurer (6):
  new configure option to enable gstreamer
  add vnc h264 encoder
  vnc: h264: send additional frames after the display is clean
  h264: remove wrong libavcodec_ prefix from function names
  h264: search for available h264 encoder
  h264: new vnc option to configure h264 at server side

 meson.build   |  10 +
 meson_options.txt |   2 +
 scripts/meson-buildoptions.sh |   5 +-
 ui/meson.build|   1 +
 ui/vnc-enc-h264.c | 335 ++
 ui/vnc-jobs.c |  49 +++--
 ui/vnc.c  |  62 ++-
 ui/vnc.h  |  29 +++
 8 files changed, 476 insertions(+), 17 deletions(-)
 create mode 100644 ui/vnc-enc-h264.c

-- 
2.39.5

Re: [PATCH v2 2/3] virtio-gpu: fix hang under TCG when unmapping blob

2025-04-10 Thread Alex Bennée

Manos Pitsidianakis  writes:

> From: Manos Pitsidianakis 
>
> This commit fixes an indefinite hang when using VIRTIO GPU blob objects
> under TCG in certain conditions.
>
> The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a
> MemoryRegion and attaches it to an offset on a PCI BAR of the
> VirtIOGPU device. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps
> it.
>
> Because virglrenderer commands are not thread-safe they are only
> called on the main context and QEMU performs the cleanup in three steps
> to prevent a use-after-free scenario where the guest can access the
> region after it’s unmapped:
>
> 1. From the main context, the region’s field finish_unmapping is false
>by default, so it sets a variable cmd_suspended, increases the
>renderer_blocked variable, deletes the blob subregion, and unparents
>the blob subregion causing its reference count to decrement.
>
> 2. From an RCU context, the MemoryView gets freed, the FlatView gets
>recalculated, the free callback of the blob region
>virtio_gpu_virgl_hostmem_region_free is called which sets the
>region’s field finish_unmapping to true, allowing the main thread
>context to finish replying to the command
>
> 3. From the main context, the command is processed again, but this time
>finish_unmapping is true, so virgl_renderer_resource_unmap can be
>called and a response is sent to the guest.
>
> It happens so that under TCG, if the guest has no timers configured (and
> thus no interrupt will cause the CPU to exit), the RCU thread does not
> have enough time to grab the locks and recalculate the FlatView.
>
> That’s not a big problem in practice since most guests will assume a
> response will happen later in time and go on to do different things,
> potentially triggering interrupts and allowing the RCU context to run.
> If the guest waits for the unmap command to complete though, it blocks
> indefinitely. Attaching to the QEMU monitor and force quitting the guest
> allows the cleanup to continue.
>
> There's no reason why the FlatView recalculation can't occur right away
> when we delete the blob subregion, however. It does not, because when we
> create the subregion we set the object as its own parent:
>
> memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
>
> The extra reference is what prevents freeing the memory region object in
> the memory transaction of deleting the subregion.
>
> This commit changes the owner object to the device, which removes the
> extra owner reference in the memory region and causes the MR to be
> freed right away in the main context.
>
> Acked-by: Michael S. Tsirkin 
> Signed-off-by: Manos Pitsidianakis 

Reviewed-by: Alex Bennée 
Tested-by: Alex Bennée 

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
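
The owner-reference argument in the quoted commit message can be modelled with
a toy reference counter. This is a sketch under loudly stated assumptions, not
QEMU's object model: Obj, obj_ref(), obj_unref() and freed_on_unparent() are
all invented here, and the model assumes that whoever pins a mapped region
takes its reference on the region's owner. Under that assumption a self-owned
region keeps itself alive until the stale view is dropped, while a
device-owned region is freed as soon as its parent lets go.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy reference-counted object; purely illustrative. */
typedef struct Obj {
    int refs;
    bool freed;
    struct Obj *owner;  /* object that references to this region land on */
} Obj;

static void obj_ref(Obj *o)
{
    assert(!o->freed);
    o->refs++;
}

static void obj_unref(Obj *o)
{
    assert(o->refs > 0);
    if (--o->refs == 0) {
        o->freed = true;
    }
}

/* Returns true if the region is freed at unparent time. */
static bool freed_on_unparent(bool self_owned)
{
    Obj device = { 1, false, NULL };
    Obj region = { 1, false, NULL };    /* 1 = reference held by its parent */
    Obj *pinned;
    bool freed_immediately;

    region.owner = self_owned ? &region : &device;

    pinned = region.owner;
    obj_ref(pinned);            /* view built while the region is mapped */
    obj_unref(&region);         /* unparent: the parent drops its reference */
    freed_immediately = region.freed;
    obj_unref(pinned);          /* later: the stale view is destroyed */

    return freed_immediately;
}
```

In this model freed_on_unparent(true) is false (the free is deferred to the
later step), and freed_on_unparent(false) is true, mirroring the before and
after behaviour the commit message describes.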

Re: [PATCH] target/i386: Reset parked vCPUs together with the online ones

2025-04-10 Thread Maciej S. Szmigiero


On 27.03.2025 19:24, Maciej S. Szmigiero wrote:

From: "Maciej S. Szmigiero" 

Commit 3f2a05b31ee9 ("target/i386: Reset TSCs of parked vCPUs too on VM
reset") introduced a way to reset TSCs of parked vCPUs during VM reset to
prevent them getting desynchronized with the online vCPUs and therefore
causing the KVM PV clock to lose PVCLOCK_TSC_STABLE_BIT.

The way this was done was by registering a parked vCPU-specific QEMU reset
callback via qemu_register_reset().

However, it turns out that on particularly device-rich VMs QEMU reset
callbacks can take a long time to execute (which isn't surprising,
considering that they involve resetting all of VM devices).

In particular, their total runtime can exceed the 1-second TSC
synchronization window introduced in KVM commit 5d3cb0f6a8e3 ("KVM:
Improve TSC offset matching").
Since the TSCs of online vCPUs are only reset from "synchronize_post_reset"
AccelOps handler (which runs after all qemu_register_reset() handlers) this
essentially makes that fix ineffective on these VMs.

The easiest way to guarantee that these parked vCPUs are reset at the same
time as the online ones (regardless how long it takes for VM devices to
reset) is to piggyback on post-reset vCPU synchronization handler for one
of online vCPUs - as there is no generic post-reset AccelOps handler that
isn't per-vCPU.

The first online vCPU was selected for that since it is easily available
under "first_cpu" define.
This does not create an ordering issue since the order of vCPU TSC resets
does not matter.

Fixes: 3f2a05b31ee9 ("target/i386: Reset TSCs of parked vCPUs too on VM reset")
Signed-off-by: Maciej S. Szmigiero 
---


Friendly ping?

Thanks,
Maciej

Re: [PATCH-for-10.1 v3 6/9] qtest/bios-tables-test: Whitelist aarch64/virt 'its_off' variant blobs

2025-04-10 Thread Gustavo Romero


Hi Igor,

On 4/10/25 03:50, Igor Mammedov wrote:

On Wed, 9 Apr 2025 12:49:36 -0300
Gustavo Romero  wrote:


Hi Igor,

On 4/9/25 11:05, Igor Mammedov wrote:

On Fri, 4 Apr 2025 00:01:22 -0300
Gustavo Romero  wrote:
   

Hi Phil,

On 4/3/25 17:40, Philippe Mathieu-Daudé wrote:

We are going to fix the test_acpi_aarch64_virt_tcg_its_off()
test. In preparation, copy the ACPI tables which will be
altered as 'its_off' variants, and whitelist them.

Reviewed-by: Gustavo Romero 
Signed-off-by: Philippe Mathieu-Daudé 
---
tests/qtest/bios-tables-test-allowed-diff.h |   3 +++
tests/qtest/bios-tables-test.c  |   1 +
tests/data/acpi/aarch64/virt/APIC.its_off   | Bin 0 -> 184 bytes
tests/data/acpi/aarch64/virt/FACP.its_off   | Bin 0 -> 276 bytes
tests/data/acpi/aarch64/virt/IORT.its_off   | Bin 0 -> 236 bytes
5 files changed, 4 insertions(+)
create mode 100644 tests/data/acpi/aarch64/virt/APIC.its_off
create mode 100644 tests/data/acpi/aarch64/virt/FACP.its_off
create mode 100644 tests/data/acpi/aarch64/virt/IORT.its_off

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8bf..3421dd5adf3 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,4 @@
/* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/aarch64/virt/APIC.its_off",
+"tests/data/acpi/aarch64/virt/FACP.its_off",
+"tests/data/acpi/aarch64/virt/IORT.its_off",


I think your first approach is the correct one: you add the blobs
when adding the new test, so they would go into patch 5/9 in this series,
making the test pass without adding anything to bios-tables-test-allowed-diff.h.
Then in this patch only add the APIC.its_off table to
bios-tables-test-allowed-diff.h,
since that's the table that changes when the fix is in place, as you did in:


If APIC.its_off is the only one that changes and the FACP/IORT blobs are the
same as the suffix-less blobs, one can omit copying FACP/IORT, as the test
harness will fall back to the suffix-less blob if the one with the suffix
isn't found.
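
The fallback rule described above can be sketched as a small lookup helper.
This is an illustration only and not the code in bios-tables-test.c:
pick_blob(), the ExistsFn predicate, demo_exists() and the path layout are
hypothetical names chosen here, and the variant is passed without its leading
dot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef bool (*ExistsFn)(const char *path);

/*
 * Write "<dir>/<table>.<variant>" into out if that file exists, otherwise
 * fall back to "<dir>/<table>". Returns false if neither exists or if the
 * path does not fit into out.
 */
static bool pick_blob(const char *dir, const char *table, const char *variant,
                      ExistsFn exists, char *out, size_t out_len)
{
    int n;

    assert(dir && table && exists && out);

    if (variant && *variant) {
        n = snprintf(out, out_len, "%s/%s.%s", dir, table, variant);
        if (n > 0 && (size_t)n < out_len && exists(out)) {
            return true;
        }
    }

    n = snprintf(out, out_len, "%s/%s", dir, table);
    return n > 0 && (size_t)n < out_len && exists(out);
}

/* Demo predicate: only APIC has a variant blob, FACP has just the default. */
static bool demo_exists(const char *path)
{
    return strcmp(path, "virt/APIC.its_off") == 0 ||
           strcmp(path, "virt/APIC") == 0 ||
           strcmp(path, "virt/FACP") == 0;
}
```

With demo_exists(), a lookup for APIC with variant "its_off" resolves to
"virt/APIC.its_off", while the same lookup for FACP resolves to "virt/FACP".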


OK. Just clarifying and for the record, this is not the case for this series.



If the blobs are different from the defaults, then create empty blobs and
whitelist them in the same patch, then do your changes, and then update the
blobs and wipe out the whitelist.


Thanks for confirming it. That's what I suggested to Phil in my first review
and what I understood from the prescription in bios-tables-test.c.

However, on second thoughts, for this particular series, isn't it better to 
have the following commit sequence instead:

1) Add the new test and the new blobs that make the test pass, i.e. 
APIC.suffix, FACP.suffix, and IORT.suffix (they are different than the default 
suffix-less blobs)


Blobs should be a separate commit (that way it's easier for the maintainer to
rebase them if they clash with some other change during merge).


I see. What is a bit confusing here is that the series consists of
one blob addition (for the new test) and one blob update/removal (after
the fix).



2) Whitelist only the APIC.suffix since that's the table that will change with 
the fix
3) Add the fix (which changes the APIC table so a new APIC.suffix blob is 
needed and also stops generating the IORT table, so no more IORT.suffix blob is 
necessary)
4) Finally, update only the APIC.suffix blob and remove the IORT.suffix blob 
and wipe out the whitelist

This way:

A) It's clear that only the APIC blob changed with the fix, because there is no
addition of a FACP.suffix blob in 4) (it remains the same)
B) It's clear that the IORT table is removed with the fix and is not relevant
anymore for the test


I'd just mention it in the commit log so that later no one would wonder why we
are adding and then removing tables.

As for the rest of suggestions, it looks fine to me.


Well, 2) won't make sense anymore since APIC.suffix would be already in the
whitelist in the previous patch that added the empty blobs. Since there won't
be a commit that adds _only_ the APIC.suffix to the whitelist, in preparation
for the fix, this info is "lost" in the series, even though it's possible to
mention it in the commit message.

Hence, what I think is not ideal from a maintainer's/reviewer's perspective,
is that in one commit all the blobs are updated/removed at once, which is
confusing because the fix did not touch the FACP table (for instance) and
this table is updated with APIC and with the removal of IORT, altogether,
in the last commit.

So, for this series, which adds new blobs and _also_ updates and removes some
of them, how about the following organization:

- Patch 1 : Add the new test, add the empty *.suffix blob files, and whitelist
those blobs
- Patch 2 : Update the blobs in Patch 1 with the ones that make the new 
test pass and remove them from the whitelist

- Patch 3 : Add the APIC.suffix blob to the whitelist (the table that 
changes

[PATCH] [for-10.1] qtest: introduce qtest_init_ext

2025-04-10 Thread Vladimir Sementsov-Ogievskiy

Merge qtest_init_with_env_and_capabilities() and qtest_init_with_env()
into one qtest_init_ext().

Reasons:

1. qtest_init_with_env() is just wrong: it takes a do_connect parameter
   but always passes true to qtest_init_with_env_and_capabilities().
   Happily, all qtest_init_with_env() callers pass true as well.

2. qtest_init_with_env() is not used outside of libqtest.c, so there is
   no reason to keep it as a public function.

3. In libqtest.c it is not used often, so there is no problem using the
   more generic function instead.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qtest/libqtest.c| 18 +-
 tests/qtest/libqtest.h| 30 +++---
 tests/qtest/migration/framework.c |  7 +++
 3 files changed, 15 insertions(+), 40 deletions(-)
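
The problem named in reason 1 can be reproduced in a few lines. The sketch
below is a reduced model, not libqtest: FakeState, init_ext() and
init_with_env_buggy() are hypothetical names that only mimic the shape of the
two functions being merged, with the capability list reduced to a string
array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced stand-in for the test state; records what the caller asked for. */
typedef struct {
    const char *binary_var;
    const char *extra_args;
    bool connected;
    bool has_capabilities;
} FakeState;

/* Plays the role of the single, general entry point. */
static FakeState init_ext(const char *var, const char *extra_args,
                          const char *const *capabilities, bool do_connect)
{
    FakeState s = { var, extra_args, do_connect, capabilities != NULL };
    return s;
}

/*
 * The wrapper pattern being removed: do_connect is accepted but ignored,
 * and true is always forwarded.
 */
static FakeState init_with_env_buggy(const char *var, const char *extra_args,
                                     bool do_connect)
{
    (void)do_connect;
    return init_ext(var, extra_args, NULL, true);
}
```

A caller that passes false to init_with_env_buggy() still gets a connected
state, whereas calling init_ext() directly honours the flag. Dropping the
wrapper removes the misleading parameter altogether.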

diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
index fad307d125..66ff318201 100644
--- a/tests/qtest/libqtest.c
+++ b/tests/qtest/libqtest.c
@@ -574,10 +574,8 @@ void qtest_qmp_handshake(QTestState *s, QList *capabilities)
 }
 }
 
-QTestState *qtest_init_with_env_and_capabilities(const char *var,
- const char *extra_args,
- QList *capabilities,
- bool do_connect)
+QTestState *qtest_init_ext(const char *var, const char *extra_args,
+   QList *capabilities, bool do_connect)
 {
 QTestState *s = qtest_init_internal(qtest_qemu_binary(var), extra_args,
 do_connect);
@@ -594,15 +592,9 @@ QTestState *qtest_init_with_env_and_capabilities(const char *var,
 return s;
 }
 
-QTestState *qtest_init_with_env(const char *var, const char *extra_args,
-bool do_connect)
-{
-return qtest_init_with_env_and_capabilities(var, extra_args, NULL, true);
-}
-
 QTestState *qtest_init(const char *extra_args)
 {
-return qtest_init_with_env(NULL, extra_args, true);
+return qtest_init_ext(NULL, extra_args, NULL, true);
 }
 
 QTestState *qtest_vinitf(const char *fmt, va_list ap)
@@ -1662,7 +1654,7 @@ static struct MachInfo *qtest_get_machines(const char *var)
 
 silence_spawn_log = !g_test_verbose();
 
-qts = qtest_init_with_env(qemu_var, "-machine none", true);
+qts = qtest_init_ext(qemu_var, "-machine none", NULL, true);
 response = qtest_qmp(qts, "{ 'execute': 'query-machines' }");
 g_assert(response);
 list = qdict_get_qlist(response, "return");
@@ -1717,7 +1709,7 @@ static struct CpuModel *qtest_get_cpu_models(void)
 
 silence_spawn_log = !g_test_verbose();
 
-qts = qtest_init_with_env(NULL, "-machine none", true);
+qts = qtest_init_ext(NULL, "-machine none", NULL, true);
 response = qtest_qmp(qts, "{ 'execute': 'query-cpu-definitions' }");
 g_assert(response);
 list = qdict_get_qlist(response, "return");
diff --git a/tests/qtest/libqtest.h b/tests/qtest/libqtest.h
index 930a91dcb7..b3f2e7fbef 100644
--- a/tests/qtest/libqtest.h
+++ b/tests/qtest/libqtest.h
@@ -57,37 +57,21 @@ QTestState *qtest_vinitf(const char *fmt, va_list ap) G_GNUC_PRINTF(1, 0);
 QTestState *qtest_init(const char *extra_args);
 
 /**
- * qtest_init_with_env:
- * @var: Environment variable from where to take the QEMU binary
- * @extra_args: Other arguments to pass to QEMU.  CAUTION: these
- * arguments are subject to word splitting and shell evaluation.
- * @do_connect: connect to qemu monitor and qtest socket.
- *
- * Like qtest_init(), but use a different environment variable for the
- * QEMU binary.
- *
- * Returns: #QTestState instance.
- */
-QTestState *qtest_init_with_env(const char *var, const char *extra_args,
-bool do_connect);
-
-/**
- * qtest_init_with_env_and_capabilities:
+ * qtest_init_ext:
  * @var: Environment variable from where to take the QEMU binary
  * @extra_args: Other arguments to pass to QEMU.  CAUTION: these
  * arguments are subject to word splitting and shell evaluation.
  * @capabilities: list of QMP capabilities (strings) to enable
  * @do_connect: connect to qemu monitor and qtest socket.
  *
- * Like qtest_init_with_env(), but enable specified capabilities during
- * hadshake.
+ * Like qtest_init(), but use a different environment variable for the
+ * QEMU binary, allow specify capabilities and skip connecting
+ * to QEMU monitor.
  *
  * Returns: #QTestState instance.
  */
-QTestState *qtest_init_with_env_and_capabilities(const char *var,
- const char *extra_args,
- QList *capabilities,
- bool do_connect);
+QTestState *qtest_init_ext(const char *var, const char *extra_args,
+   QList *capabilities, bool do_connect);
 
 /**
  * qtest_init_without_qmp_handshake:
@@ -102,7 +86,7 @@ QTestState *qtest_init_without_qmp_handshake(const c

Re: [PATCH] [for-10.1] qtest: introduce qtest_init_ext

2025-04-10 Thread Vladimir Sementsov-Ogievskiy


On 10.04.25 19:22, Vladimir Sementsov-Ogievskiy wrote:

Merge qtest_init_with_env_and_capabilities() and qtest_init_with_env()
into one qtest_init_ext().


CC Steve

--
Best regards,
Vladimir

[PATCH v4 3/4] hw/s390x: support migration of CPI data

2025-04-10 Thread Shalini Chellathurai Saroja

Register Control-Program Identification data with the live
migration infrastructure.

Signed-off-by: Shalini Chellathurai Saroja 
Reviewed-by: Nina Schoetterl-Glausch 
---
 hw/s390x/sclpcpi.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/hw/s390x/sclpcpi.c b/hw/s390x/sclpcpi.c
index dcc8bd3245..40a74c16b5 100644
--- a/hw/s390x/sclpcpi.c
+++ b/hw/s390x/sclpcpi.c
@@ -22,6 +22,7 @@
 #include "hw/s390x/event-facility.h"
 #include "hw/s390x/ebcdic.h"
 #include "qapi/qapi-visit-machine.h"
+#include "migration/vmstate.h"
 
 typedef struct Data {
 uint8_t id_format;
@@ -94,12 +95,36 @@ static void get_control_program_id(Object *obj, Visitor *v,
 visit_type_S390ControlProgramId(v, name, &cpi, errp);
 }
 
+static const VMStateDescription vmstate_control_program_id = {
+.name = "s390_control_program_id",
+.version_id = 0,
+.fields = (const VMStateField[]) {
+VMSTATE_UINT8_ARRAY(system_type, ControlProgramId, 8),
+VMSTATE_UINT8_ARRAY(system_name, ControlProgramId, 8),
+VMSTATE_UINT64(system_level, ControlProgramId),
+VMSTATE_UINT8_ARRAY(sysplex_name, ControlProgramId, 8),
+VMSTATE_UINT64(timestamp, ControlProgramId),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static const VMStateDescription vmstate_sclpcpi = {
+.name = "s390_sclpcpi",
+.version_id = 0,
+.fields = (const VMStateField[]) {
+VMSTATE_STRUCT(cpi, SCLPEventCPI, 0, vmstate_control_program_id,
+   ControlProgramId),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static void cpi_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 SCLPEventClass *k = SCLP_EVENT_CLASS(klass);
 
 dc->user_creatable = false;
+dc->vmsd =  &vmstate_sclpcpi;
 
 k->can_handle_event = can_handle_event;
 k->get_send_mask = send_mask;
-- 
2.49.0

Re: [PATCH 08/10] hw/9pfs: Allow using hw/9pfs with emscripten

2025-04-10 Thread Paolo Bonzini


On 4/7/25 16:45, Kohei Tokunaga wrote:

Emscripten's fiber does not support submitting coroutines to other
threads. 


Does it work as long as the thread does not rewind?


diff --git a/hw/9pfs/9p-util-stub.c b/hw/9pfs/9p-util-stub.c
new file mode 100644
index 00..57c89902ab
--- /dev/null
+++ b/hw/9pfs/9p-util-stub.c
@@ -0,0 +1,43 @@
+/*
+ * 9p utilities stub functions
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "9p-util.h"
+
+ssize_t fgetxattrat_nofollow(int dirfd, const char *path, const char *name,
+ void *value, size_t size)
+{
+return -1;
+}
+
+ssize_t flistxattrat_nofollow(int dirfd, const char *filename,
+  char *list, size_t size)
+{
+return -1;
+}
+
+ssize_t fremovexattrat_nofollow(int dirfd, const char *filename,
+const char *name)
+{
+return -1;
+}
+
+int fsetxattrat_nofollow(int dirfd, const char *path, const char *name,
+ void *value, size_t size, int flags)
+{
+return -1;
+
+}
+
+int qemu_mknodat(int dirfd, const char *filename, mode_t mode, dev_t dev)
+{
+return -1;
+}
+
+ssize_t fgetxattr(int fd, const char *name, void *value, size_t size)
+{
+return -1;
+}


You can add all these to the stubs/emscripten.c file that I suggested
elsewhere.



diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
index 7bc4ec8e85..8c5006fcdc 100644
--- a/hw/9pfs/9p-util.h
+++ b/hw/9pfs/9p-util.h
@@ -84,6 +84,24 @@ static inline int errno_to_dotl(int err) {
  } else if (err == EOPNOTSUPP) {
  err = 95; /* ==EOPNOTSUPP on Linux */
  }
+#elif defined(EMSCRIPTEN)
+/*
+ * FIXME: Only most important errnos translated here yet, this should be
+ * extended to as many errnos being translated as possible in future.
+ */
+if (err == ENAMETOOLONG) {
+err = 36; /* ==ENAMETOOLONG on Linux */
+} else if (err == ENOTEMPTY) {
+err = 39; /* ==ENOTEMPTY on Linux */
+} else if (err == ELOOP) {
+err = 40; /* ==ELOOP on Linux */
+} else if (err == ENODATA) {
+err = 61; /* ==ENODATA on Linux */
+} else if (err == ENOTSUP) {
+err = 95; /* ==EOPNOTSUPP on Linux */
+} else if (err == EOPNOTSUPP) {
+err = 95; /* ==EOPNOTSUPP on Linux */
+}
  #else
  #error Missing errno translation to Linux for this host system
  #endif
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 7cad2bce62..4f45f0edd3 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -4013,6 +4013,9 @@ out_nofid:
   * Linux guests.
   */
  #define P9_XATTR_SIZE_MAX 65536
+#elif defined(EMSCRIPTEN)
+/* No support for xattr */
+#define P9_XATTR_SIZE_MAX 0
  #else
  #error Missing definition for P9_XATTR_SIZE_MAX for this host system
  #endif
diff --git a/hw/9pfs/coth.h b/hw/9pfs/coth.h
index 2c54249b35..7b0d05ba1b 100644
--- a/hw/9pfs/coth.h
+++ b/hw/9pfs/coth.h
@@ -19,6 +19,7 @@
  #include "qemu/coroutine-core.h"
  #include "9p.h"
  
+#ifndef EMSCRIPTEN

  /*
   * we want to use bottom half because we want to make sure the below
   * sequence of events.
@@ -57,6 +58,17 @@
  /* re-enter back to qemu thread */  \
  qemu_coroutine_yield(); \
  } while (0)
+#else
+/*
+ * FIXME: implement this on emscripten but emscripten's coroutine
+ * implementation (fiber) doesn't support submitting a coroutine to other
+ * threads.
+ */
+#define v9fs_co_run_in_worker(code_block)   \
+do {\
+code_block; \
+} while (0)
+#endif


You could extract v9fs_co_run_in_worker()'s bodies into separate
functions.  It is tedious but not hard; all you have to do is define
structs for the parameters and return values of v9fs_co_*(), unpack
them in the callback functions, and retrieve the return value in
v9fs_co_*().  Many functions


The advantage is that, instead of all the bottom half and yielding dance 
that is done by v9fs_co_run_in_worker() and co_run_in_worker_bh(), you 
can just use thread_pool_submit_co().
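
The struct-packing approach described above might look roughly like the sketch
below. Everything in it is an assumption made for illustration:
submit_to_worker() is a hypothetical stand-in that simply runs the callback
inline (a real implementation would hand it to a thread pool), and
PathLenArgs, path_len_worker() and co_path_len() are invented names, not 9pfs
functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int WorkerFunc(void *opaque);

/*
 * Hypothetical stand-in for a thread-pool submit call. It runs the
 * callback inline here so that the sketch stays self-contained.
 */
static int submit_to_worker(WorkerFunc *func, void *opaque)
{
    assert(func != NULL);
    return func(opaque);
}

/* One struct per operation: inputs and outputs travel together. */
typedef struct {
    const char *path;   /* in */
    size_t length;      /* out */
} PathLenArgs;

/* Callback: unpack the arguments, do the work, store the result. */
static int path_len_worker(void *opaque)
{
    PathLenArgs *args = opaque;

    if (args->path == NULL) {
        return -1;
    }
    args->length = strlen(args->path);
    return 0;
}

/* Front end: pack the arguments, submit, then retrieve the result. */
static int co_path_len(const char *path, size_t *length)
{
    PathLenArgs args = { .path = path, .length = 0 };
    int ret = submit_to_worker(path_len_worker, &args);

    if (ret == 0) {
        *length = args.length;
    }
    return ret;
}
```

The front-end function keeps its original signature for callers, while the
body that used to sit inside the macro becomes an ordinary function that any
submit mechanism can invoke.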


Paolo


  void co_run_in_worker_bh(void *);
  int coroutine_fn v9fs_co_readlink(V9fsPDU *, V9fsPath *, V9fsString *);
diff --git a/hw/9pfs/meson.build b/hw/9pfs/meson.build
index d35d4f44ff..04f85fb9e9 100644
--- a/hw/9pfs/meson.build
+++ b/hw/9pfs/meson.build
@@ -17,6 +17,8 @@ if host_os == 'darwin'
fs_ss.add(files('9p-util-darwin.c'))
  elif host_os == 'linux'
fs_ss.add(files('9p-util-linux.c'))
+elif host_os == 'emscripten'
+  fs_ss.add(files('9p-util-stub.c'))
  endif
  fs_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xen-9p-backend.c'))
  system_ss.add_all(when: 'CONFIG_FSDEV_9P', if_true: fs_ss)
diff --git a/meson.build b/meson.build
index ab84820bc5..a3aadf8b59 100644
--- a/meson.build
+++ b/meson.build
@@ -2356,11 +2356,11 @@ dbus_display =

Re: [PATCH] target/i386: Reset parked vCPUs together with the online ones

2025-04-10 Thread Paolo Bonzini


On 4/10/25 17:57, Maciej S. Szmigiero wrote:

On 27.03.2025 19:24, Maciej S. Szmigiero wrote:

From: "Maciej S. Szmigiero" 

Commit 3f2a05b31ee9 ("target/i386: Reset TSCs of parked vCPUs too on VM
reset") introduced a way to reset TSCs of parked vCPUs during VM reset to
prevent them getting desynchronized with the online vCPUs and therefore
causing the KVM PV clock to lose PVCLOCK_TSC_STABLE_BIT.

The way this was done was by registering a parked vCPU-specific QEMU reset
callback via qemu_register_reset().

However, it turns out that on particularly device-rich VMs QEMU reset
callbacks can take a long time to execute (which isn't surprising,
considering that they involve resetting all of VM devices).

In particular, their total runtime can exceed the 1-second TSC
synchronization window introduced in KVM commit 5d3cb0f6a8e3 ("KVM:
Improve TSC offset matching").
Since the TSCs of online vCPUs are only reset from "synchronize_post_reset"
AccelOps handler (which runs after all qemu_register_reset() handlers) this
essentially makes that fix ineffective on these VMs.

The easiest way to guarantee that these parked vCPUs are reset at the same
time as the online ones (regardless how long it takes for VM devices to
reset) is to piggyback on post-reset vCPU synchronization handler for one
of online vCPUs - as there is no generic post-reset AccelOps handler that
isn't per-vCPU.

The first online vCPU was selected for that since it is easily available
under "first_cpu" define.
This does not create an ordering issue since the order of vCPU TSC resets
does not matter.

Fixes: 3f2a05b31ee9 ("target/i386: Reset TSCs of parked vCPUs too on VM reset")
Signed-off-by: Maciej S. Szmigiero 
---


Friendly ping?


Applied, thanks.

Paolo

Re: [PATCH] virtio: Call set_features during reset

2025-04-10 Thread Michael S. Tsirkin

On Thu, Apr 10, 2025 at 05:26:47PM +0900, Akihiko Odaki wrote:
> On 2025/04/10 17:02, Michael S. Tsirkin wrote:
> > On Thu, Apr 10, 2025 at 04:54:41PM +0900, Akihiko Odaki wrote:
> > > On 2025/04/10 16:48, 'Michael S. Tsirkin' via devel wrote:
> > > > On Thu, Apr 10, 2025 at 04:42:06PM +0900, Akihiko Odaki wrote:
> > > > > virtio-net expects set_features() will be called when the feature set
> > > > > used by the guest changes to update the number of virtqueues. Call it
> > > > > during reset as reset clears all features and the queues added for
> > > > > VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS will need to be removed.
> > > > > 
> > > > > Fixes: f9d6dbf0bf6e ("virtio-net: remove virtio queues if the guest doesn't support multiqueue")
> > > > > Buglink: https://issues.redhat.com/browse/RHEL-73842
> > > > > Cc: qemu-sta...@nongnu.org
> > > > > Signed-off-by: Akihiko Odaki 
> > > > 
> > > > The issue seems specific to virtio-net: reset is reset,
> > > > it is distinct from set features.
> > > > Why not just call the necessary functionality from virtio_net_reset?
> > > 
> > > set_features is currently implemented only in virtio-net; virtio-gpu-base
> > > also has the function set, but it only has code to trace. If another device
> > > implements the function in the future, I think the device will also want to
> > > have it called during reset for the same reason as virtio-net.
> > > 
> > > virtio_reset() also calls set_status to update the status field so calling
> > > set_features() is more aligned with the handling of the status field.
> > 
> > That came to be because writing 0 to status resets the virtio device.
> > For a while, this was the only way to reset vhost-user so we just
> > went along with it.
> 
> It is possible to have code to send a command to write 0 to status to
> vhost-user in reset(), but calling set_status() in virtio_reset() is more
> convenient and makes sense as the status is indeed being set to 0. I think
> the same reasoning applies to features.

I don't know who makes assumptions that features are only set during
driver setup, though.
This will send an extra VHOST_USER_SET_FEATURES message for vhost-user,
for example.
I want to have a good reason to add this overhead.

> > 
> > 
> > > > 
> > > > 
> > > > > ---
> > > > > >hw/virtio/virtio.c | 86 +++---
> > > > >1 file changed, 43 insertions(+), 43 deletions(-)
> > > > > 
> > > > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > > > > index 85110bce3744..033e87cdd3b9 100644
> > > > > --- a/hw/virtio/virtio.c
> > > > > +++ b/hw/virtio/virtio.c
> > > > > @@ -2316,49 +2316,6 @@ void virtio_queue_enable(VirtIODevice *vdev, 
> > > > > uint32_t queue_index)
> > > > >}
> > > > >}
> > > > > -void virtio_reset(void *opaque)
> > > > > -{
> > > > > -VirtIODevice *vdev = opaque;
> > > > > -VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > > > -int i;
> > > > > -
> > > > > -virtio_set_status(vdev, 0);
> > > > > -if (current_cpu) {
> > > > > -/* Guest initiated reset */
> > > > > -vdev->device_endian = virtio_current_cpu_endian();
> > > > > -} else {
> > > > > -/* System reset */
> > > > > -vdev->device_endian = virtio_default_endian();
> > > > > -}
> > > > > -
> > > > > -if (k->get_vhost) {
> > > > > -struct vhost_dev *hdev = k->get_vhost(vdev);
> > > > > -/* Only reset when vhost back-end is connected */
> > > > > -if (hdev && hdev->vhost_ops) {
> > > > > -vhost_reset_device(hdev);
> > > > > -}
> > > > > -}
> > > > > -
> > > > > -if (k->reset) {
> > > > > -k->reset(vdev);
> > > > > -}
> > > > > -
> > > > > -vdev->start_on_kick = false;
> > > > > -vdev->started = false;
> > > > > -vdev->broken = false;
> > > > > -vdev->guest_features = 0;
> > > > > -vdev->queue_sel = 0;
> > > > > -vdev->status = 0;
> > > > > -vdev->disabled = false;
> > > > > -qatomic_set(&vdev->isr, 0);
> > > > > -vdev->config_vector = VIRTIO_NO_VECTOR;
> > > > > -virtio_notify_vector(vdev, vdev->config_vector);
> > > > > -
> > > > > -for(i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> > > > > -__virtio_queue_reset(vdev, i);
> > > > > -}
> > > > > -}
> > > > > -
> > > > >void virtio_queue_set_addr(VirtIODevice *vdev, int n, hwaddr addr)
> > > > >{
> > > > >if (!vdev->vq[n].vring.num) {
> > > > > @@ -3169,6 +3126,49 @@ int virtio_set_features(VirtIODevice *vdev, 
> > > > > uint64_t val)
> > > > >return ret;
> > > > >}
> > > > > +void virtio_reset(void *opaque)
> > > > > +{
> > > > > +VirtIODevice *vdev = opaque;
> > > > > +VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > > > +int i;
> > > > > +
> > > > > +virtio_set_status(vdev, 0);
> > > > > +if (current_cpu) {
> > > > > +/* Guest initiated reset */
> > > > > +vdev->device_endia

Re: [PATCH v3 10/10] target/i386/kvm: don't stop Intel PMU counters

2025-04-10 Thread Zhao Liu

On Sun, Mar 30, 2025 at 06:32:29PM -0700, Dongli Zhang wrote:
> Date: Sun, 30 Mar 2025 18:32:29 -0700
> From: Dongli Zhang 
> Subject: [PATCH v3 10/10] target/i386/kvm: don't stop Intel PMU counters
> X-Mailer: git-send-email 2.43.5
> 
> The kvm_put_msrs() sets the MSRs using KVM_SET_MSRS. The x86 KVM processes
> these MSRs one by one in a loop, only saving the config and triggering the
> KVM_REQ_PMU request. This approach does not immediately stop the event
> before updating PMC.

This is true after KVM's 68fb4757e867 (v6.2). QEMU even supports v4.5
(docs/system/target-i386.rst)... I'm not sure whether it is outdated,
but it's better to mention the Linux version.

> In addition, PMU MSRs are set only at levels >= KVM_PUT_RESET_STATE,
> excluding runtime. Therefore, updating these MSRs without stopping events
> should be acceptable.

I agree.

> Finally, KVM creates kernel perf events with host mode excluded
> (exclude_host = 1). While the events remain active, they don't increment
> the counter during QEMU vCPU userspace mode.
> 
> No Fixes tag is going to be added for the commit 0d89436786b0 ("kvm:
> migrate vPMU state"), because this isn't a bugfix.
> 
> Signed-off-by: Dongli Zhang 
> ---
>  target/i386/kvm/kvm.c | 9 -
>  1 file changed, 9 deletions(-)

Fine for me,

Reviewed-by: Zhao Liu

Re: [PATCH] virtio: Call set_features during reset

2025-04-10 Thread Philippe Mathieu-Daudé


Hi Akihiko,

On 10/4/25 09:42, Akihiko Odaki wrote:

virtio-net expects set_features() will be called when the feature set
used by the guest changes to update the number of virtqueues. Call it
during reset as reset clears all features and the queues added for
VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS will need to be removed.

Fixes: f9d6dbf0bf6e ("virtio-net: remove virtio queues if the guest doesn't support multiqueue")
Buglink: https://issues.redhat.com/browse/RHEL-73842
Cc: qemu-sta...@nongnu.org
Signed-off-by: Akihiko Odaki 
---
  hw/virtio/virtio.c | 86 +++---
  1 file changed, 43 insertions(+), 43 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 85110bce3744..033e87cdd3b9 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2316,49 +2316,6 @@ void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index)
  }
  }
  
-void virtio_reset(void *opaque)

-{
-VirtIODevice *vdev = opaque;
-VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
-int i;
-
-virtio_set_status(vdev, 0);
-if (current_cpu) {
-/* Guest initiated reset */
-vdev->device_endian = virtio_current_cpu_endian();
-} else {
-/* System reset */
-vdev->device_endian = virtio_default_endian();
-}
-
-if (k->get_vhost) {
-struct vhost_dev *hdev = k->get_vhost(vdev);
-/* Only reset when vhost back-end is connected */
-if (hdev && hdev->vhost_ops) {
-vhost_reset_device(hdev);
-}
-}
-
-if (k->reset) {
-k->reset(vdev);
-}
-
-vdev->start_on_kick = false;
-vdev->started = false;
-vdev->broken = false;
-vdev->guest_features = 0;
-vdev->queue_sel = 0;
-vdev->status = 0;
-vdev->disabled = false;
-qatomic_set(&vdev->isr, 0);
-vdev->config_vector = VIRTIO_NO_VECTOR;
-virtio_notify_vector(vdev, vdev->config_vector);
-
-for(i = 0; i < VIRTIO_QUEUE_MAX; i++) {
-__virtio_queue_reset(vdev, i);
-}
-}
-
  void virtio_queue_set_addr(VirtIODevice *vdev, int n, hwaddr addr)
  {
  if (!vdev->vq[n].vring.num) {
@@ -3169,6 +3126,49 @@ int virtio_set_features(VirtIODevice *vdev, uint64_t val)
  return ret;
  }
  
+void virtio_reset(void *opaque)

+{
+VirtIODevice *vdev = opaque;
+VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
+int i;
+
+virtio_set_status(vdev, 0);
+if (current_cpu) {
+/* Guest initiated reset */
+vdev->device_endian = virtio_current_cpu_endian();
+} else {
+/* System reset */
+vdev->device_endian = virtio_default_endian();
+}
+
+if (k->get_vhost) {
+struct vhost_dev *hdev = k->get_vhost(vdev);
+/* Only reset when vhost back-end is connected */
+if (hdev && hdev->vhost_ops) {
+vhost_reset_device(hdev);
+}
+}
+
+if (k->reset) {
+k->reset(vdev);
+}
+
+vdev->start_on_kick = false;
+vdev->started = false;
+vdev->broken = false;
+virtio_set_features_nocheck(vdev, 0);


This would be simpler to review as two patches: a first one doing only
the code movement, then a second one with the addition.

For my own education, are feature sets modifiable at runtime?


+vdev->queue_sel = 0;
+vdev->status = 0;
+vdev->disabled = false;
+qatomic_set(&vdev->isr, 0);
+vdev->config_vector = VIRTIO_NO_VECTOR;
+virtio_notify_vector(vdev, vdev->config_vector);
+
+for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
+__virtio_queue_reset(vdev, i);
+}
+}
+
  static void virtio_device_check_notification_compatibility(VirtIODevice *vdev,
 Error **errp)
  {

---
base-commit: 825b96dbcee23d134b691fc75618b59c5f53da32
change-id: 20250406-reset-5ed5248ee3c1

Best regards,

Re: [PATCH v3 2/4] hw/s390x: add Control-Program Identification to QOM

2025-04-10 Thread Shalini Chellathurai Saroja


On 2025-04-09 07:30, Thomas Huth wrote:

On 03/04/2025 16.33, Shalini Chellathurai Saroja wrote:

On 2025-04-01 15:55, Nina Schoetterl-Glausch wrote:

On Mon, 2025-03-31 at 16:00 +0200, Shalini Chellathurai Saroja wrote:

Add Control-Program Identification data to the QEMU Object
Model (QOM), along with the timestamp at which the data was received.


Example:
virsh # qemu-monitor-command vm --pretty '{
"execute": "qom-get",
"arguments": {
"path": "/machine/sclp/s390-sclp-event-facility/sclpcpi",
"property": "control-program-id" }}'
{
  "return": {
    "timestamp": 1742390410685762000,
    "system-level": 74872343805430528,
    "sysplex-name": "PLEX ",
    "system-name": "TESTVM  ",
    "system-type": "LINUX   "
  },
  "id": "libvirt-15"
}
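
The patch below fills system_level with ldq_be_p(), i.e. it reads the
field as a big-endian 64-bit integer. As a rough illustration of that
step only, the sketch here assembles such a value byte by byte.
load_be64() is a hypothetical stand-in written for this note, not the
QEMU helper, and the byte values in the usage example are simply the
"system-level" number from the output above split into bytes, not a
captured SCLP buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 64-bit value from eight bytes stored most significant first. */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}
```

Called on the bytes 01 0a 00 00 00 06 0b 00, this returns
74872343805430528, the value shown in the example, regardless of the
byte order of the host.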

Signed-off-by: Shalini Chellathurai Saroja 
---
 hw/s390x/sclpcpi.c    | 38 
 include/hw/s390x/event-facility.h |  9 +
 qapi/machine.json | 58 +++

 3 files changed, 105 insertions(+)

diff --git a/hw/s390x/sclpcpi.c b/hw/s390x/sclpcpi.c
index 7ace5dd64e..969c15e43d 100644
--- a/hw/s390x/sclpcpi.c
+++ b/hw/s390x/sclpcpi.c
@@ -57,8 +57,11 @@
   */

 #include "qemu/osdep.h"
+#include "qemu/timer.h"
 #include "hw/s390x/sclp.h"
 #include "hw/s390x/event-facility.h"
+#include "hw/s390x/ebcdic.h"
+#include "qapi/qapi-visit-machine.h"

 typedef struct Data {
 uint8_t id_format;
@@ -99,10 +102,37 @@ static int write_event_data(SCLPEvent *event, EventBufferHeader *evt_buf_hdr)
 ControlProgramIdMsg *cpim = container_of(evt_buf_hdr, ControlProgramIdMsg,
  ebh);

+    ascii_put(event->cpi.system_type, (char *)cpim->data.system_type, 8);
+    ascii_put(event->cpi.system_name, (char *)cpim->data.system_name, 8);
+    ascii_put(event->cpi.sysplex_name, (char *)cpim->data.sysplex_name, 8);

+    event->cpi.system_level = ldq_be_p(&cpim->data.system_level);
+    event->cpi.timestamp = qemu_clock_get_ns(QEMU_CLOCK_HOST);
+
 cpim->ebh.flags = SCLP_EVENT_BUFFER_ACCEPTED;
 return SCLP_RC_NORMAL_COMPLETION;
 }

+static void get_control_program_id(Object *obj, Visitor *v,
+   const char *name, void *opaque,
+   Error **errp)
+{
+    SCLPEvent *event = (SCLPEvent *)(obj);


Do a checked cast with SCLP_EVENT(obj).


Hello Nina,

ok, thank you.



+    S390ControlProgramId *cpi;
+
+    cpi = &(S390ControlProgramId){
+    .system_type = g_strndup((char *) event->cpi.system_type,
+ sizeof(event->cpi.system_type)),
+    .system_name = g_strndup((char *) event->cpi.system_name,
+ sizeof(event->cpi.system_name)),
+    .system_level = event->cpi.system_level,
+    .sysplex_name = g_strndup((char *) event->cpi.sysplex_name,
+  sizeof(event->cpi.sysplex_name)),
+    .timestamp = event->cpi.timestamp
+    };
+
+    visit_type_S390ControlProgramId(v, name, &cpi, errp);
+}
+
 static void cpi_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -114,6 +144,14 @@ static void cpi_class_init(ObjectClass *klass, void *data)

 k->get_send_mask = send_mask;
 k->get_receive_mask = receive_mask;
 k->write_event_data = write_event_data;
+
+    object_class_property_add(klass, "control-program-id",
+  "S390ControlProgramId",
+  get_control_program_id,
+  NULL, NULL, NULL);
+    object_class_property_set_description(klass, "control-program-id",

+    "Control-program identifiers provide data about the guest "
+    "operating system");
 }

 static const TypeInfo sclp_cpi_info = {
diff --git a/include/hw/s390x/event-facility.h b/include/hw/s390x/event-facility.h

index f445d2f9f5..39e589ed44 100644
--- a/include/hw/s390x/event-facility.h
+++ b/include/hw/s390x/event-facility.h
@@ -169,10 +169,19 @@ typedef struct ReadEventData {
 };
 } QEMU_PACKED ReadEventData;

+typedef struct ControlProgramId {
+    uint8_t system_type[8];
+    uint8_t system_name[8];
+    uint64_t system_level;
+    uint8_t sysplex_name[8];
+    uint64_t timestamp;
+} QEMU_PACKED ControlProgramId;
+
 struct SCLPEvent {
 DeviceState qdev;
 bool event_pending;
 char *name;
+    ControlProgramId cpi;


I don't think this should go into SCLPEvent.
It belongs in SCLPEventFacility or SCLPDevice instead. Otherwise all events,
including quiesce and cpu_hotplug, have a cpi field.


ok, that is correct.

I gave it a try by moving ControlProgramId to SCLPDevice. With this, 
the migration data is stored in dc->vmsd of TYPE_SCLP as shown below.


diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index 5945c9b1d8..4d6d5bb857 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -424,6 +424,29 @@ static void sclp_init(Object *obj)
  sclp_memory_init(sclp);
  }

+static const VMStateDescription vmst

[PATCH v2 2/6] add vnc h264 encoder

2025-04-10 Thread Dietmar Maurer

This patch implements H264 support for VNC. The RFB protocol
extension is defined in:

https://github.com/rfbproto/rfbproto/blob/master/rfbproto.rst#open-h-264-encoding

Currently the Gstreamer x264enc plugin (software encoder) is used
to encode the video stream.

The gstreamer pipe is:

appsrc -> videoconvert -> x264enc -> appsink

Note: videoconvert is required for RGBx to YUV420 conversion.

The code still uses the VNC server framebuffer change detection,
and only encodes and sends video frames if there are changes.

Signed-off-by: Dietmar Maurer 
---
 ui/meson.build|   1 +
 ui/vnc-enc-h264.c | 269 ++
 ui/vnc-jobs.c |  49 ++---
 ui/vnc.c  |  21 
 ui/vnc.h  |  21 
 5 files changed, 346 insertions(+), 15 deletions(-)
 create mode 100644 ui/vnc-enc-h264.c

diff --git a/ui/meson.build b/ui/meson.build
index 35fb04cadf..34f1f33699 100644
--- a/ui/meson.build
+++ b/ui/meson.build
@@ -46,6 +46,7 @@ vnc_ss.add(files(
 ))
 vnc_ss.add(zlib, jpeg)
 vnc_ss.add(when: sasl, if_true: files('vnc-auth-sasl.c'))
+vnc_ss.add(when: gstreamer, if_true: files('vnc-enc-h264.c'))
 system_ss.add_all(when: [vnc, pixman], if_true: vnc_ss)
 system_ss.add(when: vnc, if_false: files('vnc-stubs.c'))
 
diff --git a/ui/vnc-enc-h264.c b/ui/vnc-enc-h264.c
new file mode 100644
index 00..ca8e206335
--- /dev/null
+++ b/ui/vnc-enc-h264.c
@@ -0,0 +1,269 @@
+#include "qemu/osdep.h"
+#include "vnc.h"
+
+#include 
+
+static void libavcodec_destroy_encoder_context(VncState *vs)
+{
+if (!vs->h264) {
+return;
+}
+
+if (vs->h264->source) {
+gst_object_unref(vs->h264->source);
+vs->h264->source = NULL;
+}
+
+if (vs->h264->convert) {
+gst_object_unref(vs->h264->convert);
+vs->h264->convert = NULL;
+}
+
+if (vs->h264->gst_encoder) {
+gst_object_unref(vs->h264->gst_encoder);
+vs->h264->gst_encoder = NULL;
+}
+
+if (vs->h264->sink) {
+gst_object_unref(vs->h264->sink);
+vs->h264->sink = NULL;
+}
+
+if (vs->h264->pipeline) {
+gst_object_unref(vs->h264->pipeline);
+vs->h264->pipeline = NULL;
+}
+}
+
+static bool libavcodec_create_encoder_context(VncState *vs, int w, int h)
+{
+g_assert(vs->h264 != NULL);
+
+if (vs->h264->sink) {
+if (w != vs->h264->width || h != vs->h264->height) {
+libavcodec_destroy_encoder_context(vs);
+}
+}
+
+if (vs->h264->sink) {
+return TRUE;
+}
+
+vs->h264->width = w;
+vs->h264->height = h;
+
+vs->h264->source = gst_element_factory_make("appsrc", "source");
+if (!vs->h264->source) {
+VNC_DEBUG("Could not create gst source\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+vs->h264->convert = gst_element_factory_make("videoconvert", "convert");
+if (!vs->h264->convert) {
+VNC_DEBUG("Could not create gst convert element\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+vs->h264->gst_encoder = gst_element_factory_make("x264enc", "gst-encoder");
+if (!vs->h264->gst_encoder) {
+VNC_DEBUG("Could not create gst x264 encoder\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+g_object_set(vs->h264->gst_encoder, "tune", 4, NULL); /* zerolatency */
+/* fix for zerolatency with novnc (without, noVNC displays green stripes) */
+g_object_set(vs->h264->gst_encoder, "threads", 1, NULL);
+
+g_object_set(vs->h264->gst_encoder, "pass", 5, NULL); /* Constant Quality */
+g_object_set(vs->h264->gst_encoder, "quantizer", 26, NULL);
+
+/* avoid access unit delimiters (Nal Unit Type 9) - not required */
+g_object_set(vs->h264->gst_encoder, "aud", false, NULL);
+
+vs->h264->sink = gst_element_factory_make("appsink", "sink");
+if (!vs->h264->sink) {
+VNC_DEBUG("Could not create gst sink\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+vs->h264->pipeline = gst_pipeline_new("vnc-h264-pipeline");
+if (!vs->h264->pipeline) {
+VNC_DEBUG("Could not create gst pipeline\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+gst_object_ref(vs->h264->source);
+if (!gst_bin_add(GST_BIN(vs->h264->pipeline), vs->h264->source)) {
+gst_object_unref(vs->h264->source);
+VNC_DEBUG("Could not add source to gst pipeline\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+gst_object_ref(vs->h264->convert);
+if (!gst_bin_add(GST_BIN(vs->h264->pipeline), vs->h264->convert)) {
+gst_object_unref(vs->h264->convert);
+VNC_DEBUG("Could not add convert to gst pipeline\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+gst_object_ref(vs->h264->gst_encoder);
+if (!gst_bin_add(GST_BIN(vs->h264->pipeline), vs

[PATCH v2 4/6] h264: remove wrong libavcodec_ prefix from function names

2025-04-10 Thread Dietmar Maurer

Signed-off-by: Dietmar Maurer 
---
 ui/vnc-enc-h264.c | 36 ++--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/ui/vnc-enc-h264.c b/ui/vnc-enc-h264.c
index ca8e206335..9e01b8a548 100644
--- a/ui/vnc-enc-h264.c
+++ b/ui/vnc-enc-h264.c
@@ -3,7 +3,7 @@
 
 #include 
 
-static void libavcodec_destroy_encoder_context(VncState *vs)
+static void destroy_encoder_context(VncState *vs)
 {
 if (!vs->h264) {
 return;
@@ -35,13 +35,13 @@ static void libavcodec_destroy_encoder_context(VncState *vs)
 }
 }
 
-static bool libavcodec_create_encoder_context(VncState *vs, int w, int h)
+static bool create_encoder_context(VncState *vs, int w, int h)
 {
 g_assert(vs->h264 != NULL);
 
 if (vs->h264->sink) {
 if (w != vs->h264->width || h != vs->h264->height) {
-libavcodec_destroy_encoder_context(vs);
+destroy_encoder_context(vs);
 }
 }
 
@@ -55,21 +55,21 @@ static bool libavcodec_create_encoder_context(VncState *vs, int w, int h)
 vs->h264->source = gst_element_factory_make("appsrc", "source");
 if (!vs->h264->source) {
 VNC_DEBUG("Could not create gst source\n");
-libavcodec_destroy_encoder_context(vs);
+destroy_encoder_context(vs);
 return FALSE;
 }
 
 vs->h264->convert = gst_element_factory_make("videoconvert", "convert");
 if (!vs->h264->convert) {
 VNC_DEBUG("Could not create gst convert element\n");
-libavcodec_destroy_encoder_context(vs);
+destroy_encoder_context(vs);
 return FALSE;
 }
 
 vs->h264->gst_encoder = gst_element_factory_make("x264enc", "gst-encoder");
 if (!vs->h264->gst_encoder) {
 VNC_DEBUG("Could not create gst x264 encoder\n");
-libavcodec_destroy_encoder_context(vs);
+destroy_encoder_context(vs);
 return FALSE;
 }
 
@@ -86,14 +86,14 @@ static bool libavcodec_create_encoder_context(VncState *vs, int w, int h)
 vs->h264->sink = gst_element_factory_make("appsink", "sink");
 if (!vs->h264->sink) {
 VNC_DEBUG("Could not create gst sink\n");
-libavcodec_destroy_encoder_context(vs);
+destroy_encoder_context(vs);
 return FALSE;
 }
 
 vs->h264->pipeline = gst_pipeline_new("vnc-h264-pipeline");
 if (!vs->h264->pipeline) {
 VNC_DEBUG("Could not create gst pipeline\n");
-libavcodec_destroy_encoder_context(vs);
+destroy_encoder_context(vs);
 return FALSE;
 }
 
@@ -101,7 +101,7 @@ static bool libavcodec_create_encoder_context(VncState *vs, int w, int h)
 if (!gst_bin_add(GST_BIN(vs->h264->pipeline), vs->h264->source)) {
 gst_object_unref(vs->h264->source);
 VNC_DEBUG("Could not add source to gst pipeline\n");
-libavcodec_destroy_encoder_context(vs);
+destroy_encoder_context(vs);
 return FALSE;
 }
 
@@ -109,7 +109,7 @@ static bool libavcodec_create_encoder_context(VncState *vs, int w, int h)
 if (!gst_bin_add(GST_BIN(vs->h264->pipeline), vs->h264->convert)) {
 gst_object_unref(vs->h264->convert);
 VNC_DEBUG("Could not add convert to gst pipeline\n");
-libavcodec_destroy_encoder_context(vs);
+destroy_encoder_context(vs);
 return FALSE;
 }
 
@@ -117,7 +117,7 @@ static bool libavcodec_create_encoder_context(VncState *vs, int w, int h)
 if (!gst_bin_add(GST_BIN(vs->h264->pipeline), vs->h264->gst_encoder)) {
 gst_object_unref(vs->h264->gst_encoder);
 VNC_DEBUG("Could not add encoder to gst pipeline\n");
-libavcodec_destroy_encoder_context(vs);
+destroy_encoder_context(vs);
 return FALSE;
 }
 
@@ -125,7 +125,7 @@ static bool libavcodec_create_encoder_context(VncState *vs, int w, int h)
 if (!gst_bin_add(GST_BIN(vs->h264->pipeline), vs->h264->sink)) {
 gst_object_unref(vs->h264->sink);
 VNC_DEBUG("Could not add sink to gst pipeline\n");
-libavcodec_destroy_encoder_context(vs);
+destroy_encoder_context(vs);
 return FALSE;
 }
 
@@ -139,7 +139,7 @@ static bool libavcodec_create_encoder_context(VncState *vs, int w, int h)
 
 if (!source_caps) {
 VNC_DEBUG("Could not create source caps filter\n");
-libavcodec_destroy_encoder_context(vs);
+destroy_encoder_context(vs);
 return FALSE;
 }
 
@@ -154,7 +154,7 @@ static bool libavcodec_create_encoder_context(VncState *vs, int w, int h)
 NULL
 ) != TRUE) {
 VNC_DEBUG("Elements could not be linked.\n");
-libavcodec_destroy_encoder_context(vs);
+destroy_encoder_context(vs);
 return FALSE;
 }
 
@@ -162,7 +162,7 @@ static bool libavcodec_create_encoder_context(VncState *vs, int w, int h)
 int ret = gst_element_set_state(vs->h264->pipeline, GST_STATE_PLAYING);
 if (ret == GST_STATE_CHANGE_FAILURE) {
 VNC_DEBUG("Unable to set the pipeline to the playing state

[Bug 2072564] Re: qemu-aarch64-static segfaults running ldconfig.real (amd64 host)

2025-04-10 Thread Dimitry Andric

I've tested
https://launchpad.net/ubuntu/+source/qemu/1:8.2.2+ds-0ubuntu1.7/+build/30620359/+files/qemu-user-static_8.2.2+ds-0ubuntu1.7_amd64.deb,
and it solves the problem for me.

With 8.2.2+ds-0ubuntu1.6, running a Docker container with Ubuntu 22.04,
targeting arm64 on an amd64 host, and upgrading the libc package results
in:

124.7 Processing triggers for libc-bin (2.35-0ubuntu3.9) ...
124.8 Segmentation fault
124.8 Segmentation fault
124.8 dpkg: error processing package libc-bin (--configure):
124.8  installed libc-bin package post-installation script subprocess returned error exit status 139

With 8.2.2+ds-0ubuntu1.7, this problem does not appear, and building the
container works fine.

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/2072564

Title:
  qemu-aarch64-static segfaults running ldconfig.real (amd64 host)

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Noble:
  Fix Committed
Status in qemu source package in Oracular:
  Fix Committed

Bug description:
  [ Impact ]

   * QEMU crashes when running (emulating) ldconfig in an Ubuntu 22.04
  arm64 guest

   * This affects the qemu-user-static 1:8.2.2+ds-0ubuntu1 package on
  Ubuntu 24.04+, running on an amd64 host.

   * When running docker containers with Ubuntu 22.04 in them, emulating
  arm64 with qemu-aarch64-static, invocations of ldconfig (actually
  ldconfig.real) segfault, leading to problems when loading shared
  libraries.

  [ Test Plan ]

   * Reproducer is very easy:

  $ sudo snap install docker
  docker 27.5.1 from Canonical** installed
  $ docker run -ti --platform linux/arm64/v8 ubuntu:22.04
  Unable to find image 'ubuntu:22.04' locally
  22.04: Pulling from library/ubuntu
  0d1c17d4e593: Pull complete 
  Digest: sha256:ed1544e454989078f5dec1bfdabd8c5cc9c48e0705d07b678ab6ae3fb61952d2
  Status: Downloaded newer image for ubuntu:22.04

  # Execute ldconfig.real inside the arm64 guest.
  # This should not crash after the fix!
  root@ad80af5378dc:/# /sbin/ldconfig.real
  qemu: uncaught target signal 11 (Segmentation fault) - core dumped
  Segmentation fault (core dumped)

  [ Where problems could occur ]

   * This changes the alignment of sections in the ELF binary via QEMU's
  elfloader. If something goes wrong with this change, it could lead to
  all kinds of crashes (segfaults) of any emulated binary.

  [ Other Info ]

   * Upstream bug: https://gitlab.com/qemu-project/qemu/-/issues/1913
   * Upstream fix: https://gitlab.com/qemu-project/qemu/-/commit/4b7b20a3
 - Fix dependency (needed for QEMU < 9.20): https://gitlab.com/qemu-project/qemu/-/commit/c81d1faf

  --- original bug report ---

  
  This affects the qemu-user-static 1:8.2.2+ds-0ubuntu1 package on Ubuntu
  24.04, running on an amd64 host.

  When running docker containers with Ubuntu 22.04 in them, emulating
  arm64 with qemu-aarch64-static, invocations of ldconfig (actually
  ldconfig.real) segfault. For example:

  $ docker run -ti --platform linux/arm64/v8 ubuntu:22.04
  root@8861ff640a1c:/# /sbin/ldconfig.real
  Segmentation fault

  If you copy the ldconfig.real binary to the host, and run it directly
  via qemu-aarch64-static:

  $ gdb --args qemu-aarch64-static ./ldconfig.real
  GNU gdb (Ubuntu 15.0.50.20240403-0ubuntu1) 15.0.50.20240403-git
  Copyright (C) 2024 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later 
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.
  Type "show copying" and "show warranty" for details.
  This GDB was configured as "x86_64-linux-gnu".
  Type "show configuration" for configuration details.
  For bug reporting instructions, please see:
  .
  Find the GDB manual and other documentation resources online at:
  .

  For help, type "help".
  Type "apropos word" to search for commands related to "word"...
  Reading symbols from qemu-aarch64-static...
  Reading symbols from /home/dim/.cache/debuginfod_client/86579812b213be0964189499f62f176bea817bf2/debuginfo...
  (gdb) r
  Starting program: /usr/bin/qemu-aarch64-static ./ldconfig.real
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
  [New Thread 0x776006c0 (LWP 28378)]

  Thread 1 "qemu-aarch64-st" received signal SIGSEGV, Segmentation fault.
  0x7fffe801645b in ?? ()
  (gdb) disassemble
  No function contains program counter for selected frame.

  It looks like this is a known qemu regression after v8.1.1:
  https://gitlab.com/qemu-project/qemu/-/issues/1913

  Downgrading the package to
  qemu-user-static_8.0.4+dfsg-1ubuntu3_amd64.deb fixes the segfault.

To manage notifications about this bug go to:
https://bugs.launchp

Re: [PATCH v3 08/10] target/i386/kvm: reset AMD PMU registers during VM reset

2025-04-10 Thread Zhao Liu

...

> TODO:
>   - This patch adds is_host_compat_vendor(), while there is already something
> like is_host_cpu_intel() in target/i386/kvm/vmsr_energy.c. A rework
> may help move those helpers to target/i386/cpu*.

vmsr_energy emulates RAPL in user space...but RAPL is not architectural
(no CPUID), so this case doesn't need to consider "compat" vendor.

>  target/i386/cpu.h |   8 ++
>  target/i386/kvm/kvm.c | 176 +-
>  2 files changed, 180 insertions(+), 4 deletions(-)

...

> +static bool is_host_compat_vendor(CPUX86State *env)
> +{
> +char host_vendor[CPUID_VENDOR_SZ + 1];
> +uint32_t host_cpuid_vendor1;
> +uint32_t host_cpuid_vendor2;
> +uint32_t host_cpuid_vendor3;
>
> +host_cpuid(0x0, 0, NULL, &host_cpuid_vendor1, &host_cpuid_vendor3,
> +   &host_cpuid_vendor2);
> +
> +x86_cpu_vendor_words2str(host_vendor, host_cpuid_vendor1,
> + host_cpuid_vendor2, host_cpuid_vendor3);

We can use host_cpu_vendor_fms() (with a little change). If you like
this idea, pls feel free to pick my cleanup patch into your series.
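
For readers unfamiliar with the vendor-string step in the quoted code:
CPUID leaf 0 returns the 12-character vendor string split across EBX,
EDX and ECX, in that order, four bytes per register with the least
significant byte first. The sketch below shows only that conversion.
toy_put_word() and toy_vendor_words2str() are hypothetical names written
for this note; they are not the QEMU helpers mentioned above and make no
claim about how those are implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_VENDOR_SZ 12

/* Copy the four bytes of one register, least significant byte first. */
static void toy_put_word(char *dst, uint32_t word)
{
    for (int i = 0; i < 4; i++) {
        dst[i] = (char)((word >> (8 * i)) & 0xff);
    }
}

/*
 * Build the NUL-terminated vendor string from the CPUID.0 registers.
 * dst must have room for TOY_VENDOR_SZ + 1 bytes.
 */
static void toy_vendor_words2str(char *dst, uint32_t ebx, uint32_t edx,
                                 uint32_t ecx)
{
    toy_put_word(dst, ebx);
    toy_put_word(dst + 4, edx);
    toy_put_word(dst + 8, ecx);
    dst[TOY_VENDOR_SZ] = '\0';
}
```

With EBX = 0x756e6547, EDX = 0x49656e69 and ECX = 0x6c65746e this yields
"GenuineIntel".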

> +/*
> + * Intel and Zhaoxin are compatible.
> + */
> +if ((g_str_equal(host_vendor, CPUID_VENDOR_INTEL) ||
> + g_str_equal(host_vendor, CPUID_VENDOR_ZHAOXIN1) ||
> + g_str_equal(host_vendor, CPUID_VENDOR_ZHAOXIN2)) &&
> +(IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env))) {
> +return true;
> +}
> +
> +return env->cpuid_vendor1 == host_cpuid_vendor1 &&
> +   env->cpuid_vendor2 == host_cpuid_vendor2 &&
> +   env->cpuid_vendor3 == host_cpuid_vendor3;

Checking AMD directly makes the "compat" rule clear:

return g_str_equal(host_vendor, CPUID_VENDOR_AMD) &&
   IS_AMD_CPU(env);

> +}

...

>  if (env->mcg_cap) {
>  kvm_msr_entry_add(cpu, MSR_MCG_STATUS, 0);
>  kvm_msr_entry_add(cpu, MSR_MCG_CTL, 0);
> @@ -4871,6 +5024,21 @@ static int kvm_get_msrs(X86CPU *cpu)
>  case MSR_P6_EVNTSEL0 ... MSR_P6_EVNTSEL0 + MAX_GP_COUNTERS - 1:
>  env->msr_gp_evtsel[index - MSR_P6_EVNTSEL0] = msrs[i].data;
>  break;
> +case MSR_K7_EVNTSEL0 ... MSR_K7_EVNTSEL0 + AMD64_NUM_COUNTERS - 1:
> +env->msr_gp_evtsel[index - MSR_K7_EVNTSEL0] = msrs[i].data;
> +break;
> +case MSR_K7_PERFCTR0 ... MSR_K7_PERFCTR0 + AMD64_NUM_COUNTERS - 1:
> +env->msr_gp_counters[index - MSR_K7_PERFCTR0] = msrs[i].data;
> +break;
> +case MSR_F15H_PERF_CTL0 ...
> + MSR_F15H_PERF_CTL0 + AMD64_NUM_COUNTERS_CORE * 2 - 1:
> +index = index - MSR_F15H_PERF_CTL0;
> +if (index & 0x1) {
> +env->msr_gp_counters[index] = msrs[i].data;
> +} else {
> +env->msr_gp_evtsel[index] = msrs[i].data;

This msr_gp_evtsel[] array's size is 18:

#define MAX_GP_COUNTERS    (MSR_IA32_PERF_STATUS - MSR_P6_EVNTSEL0)

This formula is based on Intel's MSRs, so it's best to add a note that the
current size also meets AMD's needs. (No need to adjust the size, as
it will affect migration).
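
One possible form for such a note is a compile-time check next to the
definition. The sketch below is only a suggestion under stated
assumptions: the MSR numbers (0x186, 0x198) and the value 6 for
AMD64_NUM_COUNTERS_CORE are written out here from memory so that the
example is self-contained, and should be checked against the real
headers; toy_max_f15h_index() is a hypothetical helper that mirrors the
largest index the quoted MSR_F15H_PERF_CTL0 case can produce.

```c
#include <assert.h>

/* Assumed MSR numbers; verify against the real definitions. */
#define MSR_P6_EVNTSEL0         0x186
#define MSR_IA32_PERF_STATUS    0x198
#define MAX_GP_COUNTERS         (MSR_IA32_PERF_STATUS - MSR_P6_EVNTSEL0)

/* Assumed number of AMD core PMU counters. */
#define AMD64_NUM_COUNTERS_CORE 6

/* The Intel-derived array size must also cover the AMD index range. */
static_assert(AMD64_NUM_COUNTERS_CORE * 2 <= MAX_GP_COUNTERS,
              "msr_gp_counters[]/msr_gp_evtsel[] too small for AMD PMU");

/* Largest array index used by the interleaved CTL/CTR MSR range. */
static int toy_max_f15h_index(void)
{
    return AMD64_NUM_COUNTERS_CORE * 2 - 1;
}
```

Under these assumptions the array has 18 entries and the largest AMD
index is 11, so the existing size is sufficient and the migration format
stays unchanged.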

> +}
> +break;
>  case HV_X64_MSR_HYPERCALL:
>  env->msr_hv_hypercall = msrs[i].data;
>  break;

Others LGTM!

Thanks,
Zhao

Re: [RFC PATCH-for-8.0 09/10] hw/virtio: Extract vhost_user_ram_slots_max() to vhost-user-target.c

2025-04-10 Thread Philippe Mathieu-Daudé


Hi Pierrick,

On 13/12/22 00:05, Philippe Mathieu-Daudé wrote:

The current definition of VHOST_USER_MAX_RAM_SLOTS is
target specific. By converting this definition to a runtime
vhost_user_ram_slots_max() helper declared in a target
specific unit, we can have the rest of vhost-user.c target
independent.

To avoid variable length array or using the heap to store
arrays of vhost_user_ram_slots_max() elements, we simply
declare an array of the biggest VHOST_USER_MAX_RAM_SLOTS,
and each target uses up to vhost_user_ram_slots_max()
elements of it. Ensure arrays are big enough by adding an
assertion in vhost_user_init().
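
The pattern described above can be sketched as follows. This is an
illustration, not the code from the patch: ToyVhostUser,
TOY_MAX_RAM_SLOTS, toy_ram_slots_max() and the value 256 are hypothetical
and chosen only to show a statically sized array, a runtime per-target
limit, and the assertion that ties the two together.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Compile-time upper bound shared by every target. */
#define TOY_MAX_RAM_SLOTS 512

typedef struct ToyVhostUser {
    uint64_t client_bases[TOY_MAX_RAM_SLOTS];
} ToyVhostUser;

/* Stand-in for the per-target helper; 256 is an arbitrary example. */
static unsigned int toy_ram_slots_max(void)
{
    return 256;
}

/* One-time check that the static array covers the runtime limit. */
static void toy_init(ToyVhostUser *u)
{
    assert(toy_ram_slots_max() <= TOY_MAX_RAM_SLOTS);
    /* Fill with a marker so the effect of a later clear is visible. */
    memset(u->client_bases, 0xff, sizeof(u->client_bases));
}

/* Clear only the slots this target can actually use. */
static void toy_clear_bases(ToyVhostUser *u)
{
    memset(u->client_bases, 0, sizeof(uint64_t) * toy_ram_slots_max());
}
```

The cost of this design is some unused array space on targets with a
smaller limit; the benefit is that no variable length array or heap
allocation is needed.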

Signed-off-by: Philippe Mathieu-Daudé 
---
RFC: Should I add VHOST_USER_MAX_RAM_SLOTS to vhost-user.h
  or create an internal header for it?
---
  hw/virtio/meson.build  |  1 +
  hw/virtio/vhost-user-target.c  | 29 +
  hw/virtio/vhost-user.c | 26 +-
  include/hw/virtio/vhost-user.h |  7 +++
  4 files changed, 42 insertions(+), 21 deletions(-)
  create mode 100644 hw/virtio/vhost-user-target.c

diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index eb7ee8ea92..bf7e35fa8a 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -11,6 +11,7 @@ if have_vhost
specific_virtio_ss.add(files('vhost.c', 'vhost-backend.c', 'vhost-iova-tree.c'))
if have_vhost_user
  specific_virtio_ss.add(files('vhost-user.c'))
+specific_virtio_ss.add(files('vhost-user-target.c'))
endif
if have_vhost_vdpa
  specific_virtio_ss.add(files('vhost-vdpa.c', 'vhost-shadow-virtqueue.c'))
diff --git a/hw/virtio/vhost-user-target.c b/hw/virtio/vhost-user-target.c
new file mode 100644
index 00..6a0d0f53d0
--- /dev/null
+++ b/hw/virtio/vhost-user-target.c
@@ -0,0 +1,29 @@
+/*
+ * vhost-user target-specific helpers
+ *
+ * Copyright (c) 2013 Virtual Open Systems Sarl.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "hw/virtio/vhost-user.h"
+
+#if defined(TARGET_X86) || defined(TARGET_X86_64) || \
+defined(TARGET_ARM) || defined(TARGET_ARM_64)
+#include "hw/acpi/acpi.h"
+#elif defined(TARGET_PPC) || defined(TARGET_PPC64)
+#include "hw/ppc/spapr.h"
+#endif
+
+unsigned int vhost_user_ram_slots_max(void)
+{
+#if defined(TARGET_X86) || defined(TARGET_X86_64) || \
+defined(TARGET_ARM) || defined(TARGET_ARM_64)
+return ACPI_MAX_RAM_SLOTS;
+#elif defined(TARGET_PPC) || defined(TARGET_PPC64)
+return SPAPR_MAX_RAM_SLOTS;
+#else
+return 512;


Should vhost_user_ram_slots_max be another TargetInfo field?


+#endif
+}
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 8f635844af..21fc176725 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -41,24 +41,7 @@
  #define VHOST_MEMORY_BASELINE_NREGIONS 8
  #define VHOST_USER_F_PROTOCOL_FEATURES 30
  #define VHOST_USER_SLAVE_MAX_FDS 8
-
-/*
- * Set maximum number of RAM slots supported to
- * the maximum number supported by the target
- * hardware plaform.
- */
-#if defined(TARGET_X86) || defined(TARGET_X86_64) || \
-defined(TARGET_ARM) || defined(TARGET_ARM_64)
-#include "hw/acpi/acpi.h"
-#define VHOST_USER_MAX_RAM_SLOTS ACPI_MAX_RAM_SLOTS
-
-#elif defined(TARGET_PPC) || defined(TARGET_PPC64)
-#include "hw/ppc/spapr.h"
-#define VHOST_USER_MAX_RAM_SLOTS SPAPR_MAX_RAM_SLOTS
-
-#else
  #define VHOST_USER_MAX_RAM_SLOTS 512
-#endif
  
  /*

   * Maximum size of virtio device config space
@@ -935,7 +918,7 @@ static int vhost_user_add_remove_regions(struct vhost_dev 
*dev,
  
  if (track_ramblocks) {

  memcpy(u->postcopy_client_bases, shadow_pcb,
-   sizeof(uint64_t) * VHOST_USER_MAX_RAM_SLOTS);
+   sizeof(uint64_t) * vhost_user_ram_slots_max());
  /*
   * Now we've registered this with the postcopy code, we ack to the
   * client, because now we're in the position to be able to deal with
@@ -956,7 +939,7 @@ static int vhost_user_add_remove_regions(struct vhost_dev 
*dev,
  err:
  if (track_ramblocks) {
  memcpy(u->postcopy_client_bases, shadow_pcb,
-   sizeof(uint64_t) * VHOST_USER_MAX_RAM_SLOTS);
+   sizeof(uint64_t) * vhost_user_ram_slots_max());
  }
  
  return ret;

@@ -1030,7 +1013,7 @@ static int vhost_user_set_mem_table_postcopy(struct 
vhost_dev *dev,
  }
  
  memset(u->postcopy_client_bases, 0,

-   sizeof(uint64_t) * VHOST_USER_MAX_RAM_SLOTS);
+   sizeof(uint64_t) * vhost_user_ram_slots_max());
  
  /*

   * They're in the same order as the regions that were sent
@@ -2169,7 +2152,7 @@ static int vhost_user_backend_init(struct vhost_dev *dev, 
void *opaque,
  return -EINVAL;
  }
  
-u->user->memory_slots = MIN(ram_slots, VHOST_USER_MAX_RAM_SLOTS);

+u->user->memory_slots = MIN(ram_slots, vhost_user_ram_slots_max());
  }
  }
  
@@ -2649,6 +2632,

Re: [PATCH 00/10] Enable QEMU to run on browsers

2025-04-10 Thread Philippe Mathieu-Daudé


On 9/4/25 21:21, Stefan Hajnoczi wrote:

On Mon, Apr 07, 2025 at 11:45:51PM +0900, Kohei Tokunaga wrote:

This patch series enables QEMU's system emulator to run in a browser using
Emscripten.
It includes implementations and workarounds to address browser environment
limitations, as shown in the following.


I think it would be great to merge this even if there are limitations
once code review comments have been addressed. Developing WebAssembly
support in-tree is likely to allow this effort to develop further than
if done in personal repos (and with significant efforts required to
rebase the code periodically).


# New TCG Backend for Browsers

A new TCG backend translates IR instructions into Wasm instructions and runs
them using the browser's WebAssembly APIs (WebAssembly.Module and
WebAssembly.instantiate). To minimize compilation overhead and avoid hitting
the browser's limitation of the number of instances, this backend integrates
a forked TCI. TBs run on TCI by default, with frequently executed TBs
compiled into WebAssembly.

# Workaround for Running 64-bit Guests

The current implementation uses Wasm's 32-bit memory model, even though Wasm
supports 64-bit variables and instructions. This patch explores implementing
TCG 64-bit instructions while leveraging SoftMMU for address translation. To
enable 64-bit guest support in Wasm today, it was necessary to partially
revert recent changes that removed support for different pointer widths
between the host and guest (e.g., commits
a70af12addd9060fdf8f3dbd42b42e3072c3914f and
bf455ec50b6fea15b4d2493059365bf94c706273) when compiling with
Emscripten. While this serves as a temporary workaround, a long-term
solution could involve adopting Wasm's 64-bit memory model once it gains
broader support, as it is currently not widely adopted (e.g., unsupported by
Safari and libffi). Feedback and suggestions on this approach are welcome.


The biggest problem I'm seeing is we no longer support 64-bit guests on
32-bit hosts, and don't plan to revert that.

Re: [PATCH v8 0/7] Allow to enable multifd and postcopy migration together

2025-04-10 Thread Prasad Pandit

Hello Fabiano,

On Thu, 3 Apr 2025 at 18:41, Fabiano Rosas  wrote:
> Prasad Pandit  writes:
> > * Thank you for the reproducer and traces. I'll try to check more and
> > see if I'm able to reproduce it on my side.
>
> Thanks. I cannot merge this series until that issue is resolved. If it
> reproduces on my machine there's a high chance that it will break CI at
> some point and then it'll be a nightmare to debug. This has happened
> many times before with multifd.

===
qemu/build)$ for i in $(seq 1 ); do echo "$i ";
QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test
--full -r '/x86_64/migration/postcopy/multifd/plain' || break; done |
tee /tmp/migration-test.out | awk -e '// { printf ("%s ", $_) };
/slow test/ { printf("%s\n", $_); }'

Host-1]
...
9980  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.51 secs
9981  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.47 secs
9982  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.42 secs
9983  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.56 secs
9984  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.44 secs
9985  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.43 secs
9986  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.45 secs
9987  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.53 secs
9988  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.46 secs
9989  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.49 secs
9990  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs
9991  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.47 secs
9992  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.45 secs
9993  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.47 secs
9994  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.41 secs
9995  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.42 secs
9996  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.58 secs
9997  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.45 secs
9998  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.51 secs
  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.51 secs

Iter: , low: 1.35, high: 1.73, avg: 1.47 secs


Host-2]
...
9980  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.45 secs
9981  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.69 secs
9982  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.41 secs
9983  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.54 secs
9984  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.45 secs
9985  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.44 secs
9986  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs
9987  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs
9988  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.44 secs
9989  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.51 secs
9990  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.37 secs
9991  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs
9992  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.51 secs
9993  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.47 secs
9994  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.47 secs
9995  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.45 secs
9996  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.53 secs
9997  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs
9998  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.47 secs
  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs

Iter: , low: 1.34, high: 1.82, avg: 1.48 secs


Host-3]
...
9980  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.50 secs
9981  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.55 secs
9982  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.54 secs
9983  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.49 secs
9984  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.49 secs
9985  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.52 secs
9986  # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs
9987  # slow test /x86_64/migration/postcopy/multifd/plain
executed

Re: [PATCH 05/10] meson: Add wasm build in build scripts

2025-04-10 Thread Kohei Tokunaga

Hi Paolo, thank you for the comments.

> >> has_int128_type is set to false on emscripten as of now to avoid errors by
> >> libffi.
>
> What is the error here?  How hard would it be to test for it?

When has_int128_type=true, I encountered a runtime error from libffi. To
reproduce this, we need to actually execute a libffi call with 128-bit
arguments.

> Uncaught TypeError: Cannot convert 1079505232 to a BigInt
> at ffi_call_js (out.js:702:37)
> at qemu-system-x86_64.wasm.ffi_call (qemu-system-x86_64.wasm:0xa37712)
> at qemu-system-x86_64.wasm.tcg_qemu_tb_exec_tci (qemu-system-x86_64.wasm:0x65f440)
> at qemu-system-x86_64.wasm.tcg_qemu_tb_exec (qemu-system-x86_64.wasm:0x65edff)
> at qemu-system-x86_64.wasm.cpu_tb_exec (qemu-system-x86_64.wasm:0x6762c0)
> at qemu-system-x86_64.wasm.cpu_exec_loop (qemu-system-x86_64.wasm:0x677c84)
> at qemu-system-x86_64.wasm.dynCall_iii (qemu-system-x86_64.wasm:0xab9014)
> at ret. (out.js:6016:24)
> at invoke_iii (out.js:7574:10)
> at qemu-system-x86_64.wasm.cpu_exec_setjmp (qemu-system-x86_64.wasm:0x676db8)

> >> And tests aren't integrated with Wasm execution environment as of
> >> now so this commit disables tests.
>
> Perhaps it would be enough to add
>
> [binaries]
> exe_wrapper = 'node'
>
> to the emscripten.txt file?

Thank you for the suggestion. I'll explore this approach.

> >> +[built-in options]
> >> +c_args = ['-Wno-unused-command-line-argument','-g','-O3','-pthread']
> >> +cpp_args = ['-Wno-unused-command-line-argument','-g','-O3','-pthread']
> >> +objc_args = ['-Wno-unused-command-line-argument','-g','-O3','-pthread']
> >> +c_link_args = ['-Wno-unused-command-line-argument','-g','-O3','-pthread','-sASYNCIFY=1','-sPROXY_TO_PTHREAD=1','-sFORCE_FILESYSTEM','-sALLOW_TABLE_GROWTH','-sTOTAL_MEMORY=2GB','-sWASM_BIGINT','-sEXPORT_ES6=1','-sASYNCIFY_IMPORTS=ffi_call_js','-sEXPORTED_RUNTIME_METHODS=addFunction,removeFunction,TTY,FS']
> >> +cpp_link_args = ['-Wno-unused-command-line-argument','-g','-O3','-pthread','-sASYNCIFY=1','-sPROXY_TO_PTHREAD=1','-sFORCE_FILESYSTEM','-sALLOW_TABLE_GROWTH','-sTOTAL_MEMORY=2GB','-sWASM_BIGINT','-sEXPORT_ES6=1','-sASYNCIFY_IMPORTS=ffi_call_js','-sEXPORTED_RUNTIME_METHODS=addFunction,removeFunction,TTY,FS']
>
> At least -g -O3 -pthread should not be necessary.

Thank you for the suggestion. -sPROXY_TO_PTHREAD flag used in c_link_args
always requires -pthread, even during configuration. Otherwise, emcc returns
an error like:

> emcc: error: -sPROXY_TO_PTHREAD requires -pthread to work!

So I think -pthread needs to be included in c_link_args at minimum. I'll try
to remove other flags in the next version of the series.

> For -Wno-unused-command-line-argument what are the warnings/errors that
> you are getting?

I encountered the following error when compiling QEMU:

> clang: error: argument unused during compilation: '-no-pie' [-Werror,-Wunused-command-line-argument]

It seems Emscripten doesn't support the -no-pie flag, and this wasn't caught
during the configure phase. It seems that removing
-Wno-unused-command-line-argument would require the following change in
meson.build, but I'm open to better approaches.

> -if not get_option('b_pie')
> +if not get_option('b_pie') and host_os != 'emscripten'
>qemu_common_flags += cc.get_supported_arguments('-fno-pie', '-no-pie')
>  endif

> >> +elif host_os == 'emscripten'
> >> +  supported_backends += ['fiber']
>
> Can you rename the backend to 'wasm' since the 'windows' backend also
> uses an API called Fibers?

Sure, I'll rename the coroutine backend in the next version of the series.

Re: [RFC PATCH v2 0/5] virtio-balloon: Working Set Reporting

2025-04-10 Thread Michael S. Tsirkin

On Wed, Apr 09, 2025 at 09:52:12AM -0700, Yuanchu Xie wrote:
> On Mon, Apr 7, 2025 at 2:39 AM Michael S. Tsirkin  wrote:
> >
> > On Thu, May 25, 2023 at 10:20:11PM +, T.J. Alumbaugh wrote:
> > > This is the device implementation for the proposed expanded balloon feature
> > > described here:
> > >
> > > https://lore.kernel.org/linux-mm/20230509185419.1088297-1-yuan...@google.com/
> > >
> > > This series has a fixed number of "bins" for the working set report, but this is
> > > not a constraint of the system. The bin number is fixed at device realization
> > > time (in other implementations it is specified as a command line argument). Once
> > > that number is fixed, this determines the correct number of bin intervals to
> > > pass to the QMP/HMP function 'working_set_config'. Any feedback on how to
> > > properly construct that function for this use case (passing a variable length
> > > list?) would be appreciated.
> >
> > It's been a while. Is there interest in reviving this? I also note that
> > reserving a feature bit is very much recommended to avoid a complete
> > mess.
> Thanks for the reminder Michael! I've been informed [1] that this
> should be brought up in VIRTIO TC, but I haven't gotten around to this
> yet. Should I send a patch to this mailing list or a proposal to
> virtio-comm...@lists.oasis-open.org to reserve a feature bit?
> 
> [1] 
> https://lore.kernel.org/linux-mm/20241127025728.3689245-10-yuan...@google.com/


You can do this in any order.

-- 
MST

Re: [PATCH v2] target/i386: Fix model number of Zhaoxin YongFeng vCPU template

2025-04-10 Thread Paolo Bonzini


On 4/7/25 04:07, Ewan Hai wrote:

The model number was mistakenly set to 0x0b (11) in commit ff04bc1ac4.
The correct value is 0x5b. This mistake occurred because the extended
model bits in cpuid[eax=0x1].eax were overlooked, and only the base
model was used.

This patch corrects the model field.


Hi, please follow commit e0013791b9326945ccd09b5b602437beb322cab8 to 
define a new version of the CPU.


Paolo


Fixes: ff04bc1ac4 ("target/i386: Introduce Zhaoxin Yongfeng CPU model")
Signed-off-by: Ewan Hai 
Reviewed-by: Zhao Liu 
---
  target/i386/cpu.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1b64ceaaba..0dd9788a68 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5503,7 +5503,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
  .level = 0x1F,
  .vendor = CPUID_VENDOR_ZHAOXIN1,
  .family = 7,
-.model = 11,
+.model = 0x5b,
  .stepping = 3,
  /* missing: CPUID_HT, CPUID_TM, CPUID_PBE */
  .features[FEAT_1_EDX] =
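
The commit message above attributes the wrong value to the extended model bits of cpuid[eax=0x1].eax being overlooked. The following is a minimal sketch of that bit layout, assuming the usual CPUID leaf 1 encoding (base model in EAX bits 7:4, extended model in bits 19:16). The helper names are hypothetical and are not taken from target/i386/cpu.c.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only.
 * CPUID.01H:EAX keeps the low nibble of the model in bits [7:4]
 * (base model) and the upper nibble in bits [19:16] (extended model).
 */
static uint32_t cpuid_pack_model(uint32_t model)
{
    return ((model & 0xf) << 4) | (((model >> 4) & 0xf) << 16);
}

/* Base model only: this is what you get if bits [19:16] are ignored. */
static uint32_t cpuid_base_model(uint32_t eax)
{
    return (eax >> 4) & 0xf;
}

static uint32_t cpuid_ext_model(uint32_t eax)
{
    return (eax >> 16) & 0xf;
}

/* Full model: extended model forms the high nibble, base model the low. */
static uint32_t cpuid_full_model(uint32_t eax)
{
    return (cpuid_ext_model(eax) << 4) | cpuid_base_model(eax);
}
```

With model 0x5b the base model nibble is 0xb (11), which is the value the original commit used; combining it with the extended nibble 0x5 gives back 0x5b.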

[PATCH v2 1/3] hw/display: re-arrange memory region tracking

2025-04-10 Thread Manos Pitsidianakis

From: Alex Bennée 

QOM objects can be embedded in other QOM objects and managed as part
of their lifetime but this isn't the case for
virtio_gpu_virgl_hostmem_region. However before we can split it out we
need some other way of associating the wider data structure with the
memory region.

Fortunately MemoryRegion has an opaque pointer. This is passed down to
MemoryRegionOps for device type regions but is unused in the
memory_region_init_ram_ptr() case. Use the opaque to carry the
reference and allow the final MemoryRegion object to be reaped when
its reference count is cleared.

Signed-off-by: Alex Bennée 
Signed-off-by: Manos Pitsidianakis 
---
 include/exec/memory.h |  1 +
 hw/display/virtio-gpu-virgl.c | 23 ---
 2 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index d09af58c97..bb735a3c7e 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -784,6 +784,7 @@ struct MemoryRegion {
 DeviceState *dev;
 
 const MemoryRegionOps *ops;
+/* opaque data, used by backends like @ops */
 void *opaque;
 MemoryRegion *container;
 int mapped_via_alias; /* Mapped via an alias, container might be NULL */
diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 145a0b3879..71a7500de9 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -52,17 +52,11 @@ virgl_get_egl_display(G_GNUC_UNUSED void *cookie)
 
 #if VIRGL_VERSION_MAJOR >= 1
 struct virtio_gpu_virgl_hostmem_region {
-MemoryRegion mr;
+MemoryRegion *mr;
 struct VirtIOGPU *g;
 bool finish_unmapping;
 };
 
-static struct virtio_gpu_virgl_hostmem_region *
-to_hostmem_region(MemoryRegion *mr)
-{
-return container_of(mr, struct virtio_gpu_virgl_hostmem_region, mr);
-}
-
 static void virtio_gpu_virgl_resume_cmdq_bh(void *opaque)
 {
 VirtIOGPU *g = opaque;
@@ -73,14 +67,12 @@ static void virtio_gpu_virgl_resume_cmdq_bh(void *opaque)
 static void virtio_gpu_virgl_hostmem_region_free(void *obj)
 {
 MemoryRegion *mr = MEMORY_REGION(obj);
-struct virtio_gpu_virgl_hostmem_region *vmr;
+struct virtio_gpu_virgl_hostmem_region *vmr = mr->opaque;
 VirtIOGPUBase *b;
 VirtIOGPUGL *gl;
 
-vmr = to_hostmem_region(mr);
-vmr->finish_unmapping = true;
-
 b = VIRTIO_GPU_BASE(vmr->g);
+vmr->finish_unmapping = true;
 b->renderer_blocked--;
 
 /*
@@ -118,8 +110,8 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
 
 vmr = g_new0(struct virtio_gpu_virgl_hostmem_region, 1);
 vmr->g = g;
+mr = g_new0(MemoryRegion, 1);
 
-mr = &vmr->mr;
 memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
 memory_region_add_subregion(&b->hostmem, offset, mr);
 memory_region_set_enabled(mr, true);
@@ -131,7 +123,9 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
  * command processing until MR is fully unreferenced and freed.
  */
 OBJECT(mr)->free = virtio_gpu_virgl_hostmem_region_free;
+mr->opaque = vmr;
 
+vmr->mr = mr;
 res->mr = mr;
 
 return 0;
@@ -142,16 +136,15 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
  struct virtio_gpu_virgl_resource *res,
  bool *cmd_suspended)
 {
-struct virtio_gpu_virgl_hostmem_region *vmr;
 VirtIOGPUBase *b = VIRTIO_GPU_BASE(g);
 MemoryRegion *mr = res->mr;
+struct virtio_gpu_virgl_hostmem_region *vmr;
 int ret;
 
 if (!mr) {
 return 0;
 }
-
-vmr = to_hostmem_region(res->mr);
+vmr = mr->opaque;
 
 /*
  * Perform async unmapping in 3 steps:
-- 
γαῖα πυρί μιχθήτω
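
The patch above swaps a container_of() lookup on an embedded MemoryRegion for a back-pointer stored in the region's opaque field. The following is a minimal standalone sketch of the two patterns, assuming simplified stand-in types; struct region, struct embedded_owner and struct pointer_owner are hypothetical and only mimic the shape of the QEMU structures.

```c
#include <stddef.h>
#include <stdlib.h>

/* Local definition for the sketch; QEMU has its own container_of(). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for MemoryRegion: only the opaque pointer matters here. */
struct region {
    void *opaque;
};

/* Old layout: the region is embedded, so the owner is found by offset. */
struct embedded_owner {
    int id;
    struct region mr;
};

/* New layout: the region is a separate allocation with its own lifetime. */
struct pointer_owner {
    int id;
    struct region *mr;
};

static struct embedded_owner *owner_from_embedded(struct region *mr)
{
    return container_of(mr, struct embedded_owner, mr);
}

static struct pointer_owner *pointer_owner_new(int id)
{
    struct pointer_owner *o = calloc(1, sizeof(*o));
    struct region *mr = calloc(1, sizeof(*mr));

    if (!o || !mr) {
        free(o);
        free(mr);
        return NULL;
    }
    o->id = id;
    o->mr = mr;
    mr->opaque = o; /* back-pointer replaces the container_of() lookup */
    return o;
}

static struct pointer_owner *owner_from_opaque(struct region *mr)
{
    return mr->opaque;
}

static void pointer_owner_free(struct pointer_owner *o)
{
    if (o) {
        free(o->mr);
        free(o);
    }
}
```

The second layout costs one extra allocation, but the region no longer has to live inside the wider structure, which is the decoupling the commit message asks for before the object can be split out.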

Re: [PATCH 1/3] scripts: nixify archive-source.sh

2025-04-10 Thread Paolo Bonzini


On 4/8/25 22:14, Joel Granados wrote:

Use "#!/usr/bin/env bash" instead of "#!/bin/bash". This is necessary
for nix environments as they only provide /usr/bin/env at the standard
location.


I am confused, how does this not break everything else?  All the test 
scripts in tests/docker/test-* have "#!/bin/bash", and configure has 
"/bin/sh".  How is the environment that runs scripts/archive-source.sh 
different, and why should it be fixed in scripts/archive-source.sh?


These are genuine questions - it would help if the commit message 
explained those... In fact, what is a nix overlay and why would you use 
scripts/archive-source.sh to prepare one? :)




Signed-off-by: Joel Granados 
---
  scripts/archive-source.sh | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/archive-source.sh b/scripts/archive-source.sh
index 
30677c3ec9032ea01090f74602d839d1c571d012..a469a5e2dec4b05e51474f0a1af190c1ccf23c7e
 100755
--- a/scripts/archive-source.sh
+++ b/scripts/archive-source.sh
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
  #
  # Author: Fam Zheng 
  #

Re: [PATCH for-10.0] scsi-disk: Apply error policy for host_status errors again

2025-04-10 Thread Michael Tokarev


07.04.2025 18:59, Kevin Wolf wrote:

Originally, all failed SG_IO requests called scsi_handle_rw_error() to
apply the configured error policy. However, commit f3126d65, which was
supposed to be a mere refactoring for scsi-disk.c, broke this and
accidentally completed the SCSI request without considering the error
policy any more if the error was signalled in the host_status field.

Apart from the commit message not describing the change as intended,
errors indicated in host_status are also obviously backend errors and
not something the guest must deal with independently of the error policy.

This behaviour means that some recoverable errors (such as a path error
in multipath configurations) were reported to the guest anyway, which
might not expect it and might consider its disk broken.

Make sure that we apply the error policy again for host_status errors,
too. This addresses an existing FIXME comment and allows us to remove
some comments warning that callbacks weren't always called. With this
fix, they are called in all cases again.

The return value passed to the request callback doesn't have more free
values that could be used to indicate host_status errors as well as SAM
status codes and negative errno. Store the value in the host_status
field of the SCSIRequest instead and use -ENODEV as the return value (if
a path hasn't been reachable for a while, blk_aio_ioctl() will return
-ENODEV instead of just setting host_status, so just reuse it here -
it's not necessarily entirely accurate, but it's as good as any errno).

Cc: qemu-sta...@nongnu.org
Fixes: f3126d65b393 ('scsi: move host_status handling into SCSI drivers')


Hi!

Does it make sense to apply this one to older stable qemu series?
In particular, in 8.2, we lack cfe0880835cd3
"scsi-disk: Use positive return value for status in dma_readv/writev",
which seems to be relevant here.  Or should I pick up cfe0880835cd3 too,
maybe together with 8a0495624f (a no-op, just to make this patch apply
cleanly) and probably 9da6bd39f924?

Thanks,

/mjt

[PATCH 1/2] hw/i386/amd_iommu: Fix device setup failure when PT is on.

2025-04-10 Thread Sairaj Kodilkar

The current amd_iommu code enables the iommu_nodma address space when the
pt_supported flag is on. This causes devices to bypass the IOMMU and use
untranslated addresses for DMA when the guest kernel uses DMA mode, so
device setup fails in the guest.

Fix the issue by removing the pt_supported check and disabling the nodma
memory region. Adding pt_supported requires additional changes and we will
look into it later.

Fixes: c1f46999ef506 ("amd_iommu: Add support for pass though mode")
Signed-off-by: Sairaj Kodilkar 
---
 hw/i386/amd_iommu.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 5f9b95279997..df8ba5d39ada 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1426,7 +1426,6 @@ static AddressSpace *amdvi_host_dma_iommu(PCIBus *bus, 
void *opaque, int devfn)
 AMDVIState *s = opaque;
 AMDVIAddressSpace **iommu_as, *amdvi_dev_as;
 int bus_num = pci_bus_num(bus);
-X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
 
 iommu_as = s->address_spaces[bus_num];
 
@@ -1486,15 +1485,8 @@ static AddressSpace *amdvi_host_dma_iommu(PCIBus *bus, 
void *opaque, int devfn)
 AMDVI_INT_ADDR_FIRST,
 &amdvi_dev_as->iommu_ir, 1);
 
-if (!x86_iommu->pt_supported) {
-memory_region_set_enabled(&amdvi_dev_as->iommu_nodma, false);
-memory_region_set_enabled(MEMORY_REGION(&amdvi_dev_as->iommu),
-  true);
-} else {
-memory_region_set_enabled(MEMORY_REGION(&amdvi_dev_as->iommu),
-  false);
-memory_region_set_enabled(&amdvi_dev_as->iommu_nodma, true);
-}
+memory_region_set_enabled(&amdvi_dev_as->iommu_nodma, false);
+memory_region_set_enabled(MEMORY_REGION(&amdvi_dev_as->iommu), true);
 }
 return &iommu_as[devfn]->as;
 }
-- 
2.34.1

[PATCH 2/2] hw/i386/amd_iommu: Fix xtsup when vcpus < 255

2025-04-10 Thread Sairaj Kodilkar

From: Vasant Hegde 

If vCPUs > 255, the x86 common code (x86_cpus_init()) calls kvm_enable_x2apic().
If vCPUs <= 255, it does not call kvm_enable_x2apic().

Booting a guest in x2apic mode with amd-iommu,xtsup=on and <= 255 vCPUs is
broken because kvm_enable_x2apic() is never called.

Fix this by adding back the kvm_enable_x2apic() call when xtsup=on.

Fixes: 8c6619f3e692 ("hw/i386/amd_iommu: Simplify non-KVM checks on XTSup feature")
Reported-by: Alejandro Jimenez 
Cc: Philippe Mathieu-Daudé 
Cc: Joao Martins 
Signed-off-by: Vasant Hegde 
---
 hw/i386/amd_iommu.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index df8ba5d39ada..af85706b8a0d 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1649,6 +1649,14 @@ static void amdvi_sysbus_realize(DeviceState *dev, Error 
**errp)
 exit(EXIT_FAILURE);
 }
 
+if (s->xtsup) {
+if (kvm_irqchip_is_split() && !kvm_enable_x2apic()) {
+error_report("AMD IOMMU xtsup=on requires x2APIC support on "
+  "the KVM side");
+exit(EXIT_FAILURE);
+}
+}
+
 pci_setup_iommu(bus, &amdvi_iommu_ops, s);
 amdvi_init(s);
 }
-- 
2.34.1

[PATCH] virtio: Call set_features during reset

2025-04-10 Thread Akihiko Odaki

virtio-net expects set_features() will be called when the feature set
used by the guest changes to update the number of virtqueues. Call it
during reset as reset clears all features and the queues added for
VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS will need to be removed.

Fixes: f9d6dbf0bf6e ("virtio-net: remove virtio queues if the guest doesn't support multiqueue")
Buglink: https://issues.redhat.com/browse/RHEL-73842
Cc: qemu-sta...@nongnu.org
Signed-off-by: Akihiko Odaki 
---
 hw/virtio/virtio.c | 86 +++---
 1 file changed, 43 insertions(+), 43 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 85110bce3744..033e87cdd3b9 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2316,49 +2316,6 @@ void virtio_queue_enable(VirtIODevice *vdev, uint32_t 
queue_index)
 }
 }
 
-void virtio_reset(void *opaque)
-{
-VirtIODevice *vdev = opaque;
-VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
-int i;
-
-virtio_set_status(vdev, 0);
-if (current_cpu) {
-/* Guest initiated reset */
-vdev->device_endian = virtio_current_cpu_endian();
-} else {
-/* System reset */
-vdev->device_endian = virtio_default_endian();
-}
-
-if (k->get_vhost) {
-struct vhost_dev *hdev = k->get_vhost(vdev);
-/* Only reset when vhost back-end is connected */
-if (hdev && hdev->vhost_ops) {
-vhost_reset_device(hdev);
-}
-}
-
-if (k->reset) {
-k->reset(vdev);
-}
-
-vdev->start_on_kick = false;
-vdev->started = false;
-vdev->broken = false;
-vdev->guest_features = 0;
-vdev->queue_sel = 0;
-vdev->status = 0;
-vdev->disabled = false;
-qatomic_set(&vdev->isr, 0);
-vdev->config_vector = VIRTIO_NO_VECTOR;
-virtio_notify_vector(vdev, vdev->config_vector);
-
-for(i = 0; i < VIRTIO_QUEUE_MAX; i++) {
-__virtio_queue_reset(vdev, i);
-}
-}
-
 void virtio_queue_set_addr(VirtIODevice *vdev, int n, hwaddr addr)
 {
 if (!vdev->vq[n].vring.num) {
@@ -3169,6 +3126,49 @@ int virtio_set_features(VirtIODevice *vdev, uint64_t val)
 return ret;
 }
 
+void virtio_reset(void *opaque)
+{
+VirtIODevice *vdev = opaque;
+VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
+int i;
+
+virtio_set_status(vdev, 0);
+if (current_cpu) {
+/* Guest initiated reset */
+vdev->device_endian = virtio_current_cpu_endian();
+} else {
+/* System reset */
+vdev->device_endian = virtio_default_endian();
+}
+
+if (k->get_vhost) {
+struct vhost_dev *hdev = k->get_vhost(vdev);
+/* Only reset when vhost back-end is connected */
+if (hdev && hdev->vhost_ops) {
+vhost_reset_device(hdev);
+}
+}
+
+if (k->reset) {
+k->reset(vdev);
+}
+
+vdev->start_on_kick = false;
+vdev->started = false;
+vdev->broken = false;
+virtio_set_features_nocheck(vdev, 0);
+vdev->queue_sel = 0;
+vdev->status = 0;
+vdev->disabled = false;
+qatomic_set(&vdev->isr, 0);
+vdev->config_vector = VIRTIO_NO_VECTOR;
+virtio_notify_vector(vdev, vdev->config_vector);
+
+for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
+__virtio_queue_reset(vdev, i);
+}
+}
+
 static void virtio_device_check_notification_compatibility(VirtIODevice *vdev,
Error **errp)
 {

---
base-commit: 825b96dbcee23d134b691fc75618b59c5f53da32
change-id: 20250406-reset-5ed5248ee3c1

Best regards,
-- 
Akihiko Odaki
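
The behavioural difference the patch above describes can be sketched in isolation: clearing the feature field directly leaves any state derived from the features stale, while routing the reset through the feature-setting path lets the device callback recompute it. This is a toy model under stated assumptions, not the QEMU implementation; struct toy_dev, TOY_F_MQ, the queue counts and all function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical feature bit standing in for a multiqueue feature. */
#define TOY_F_MQ (1ULL << 0)

struct toy_dev {
    uint64_t guest_features;
    unsigned int num_queues; /* state derived from the features */
    void (*set_features)(struct toy_dev *dev, uint64_t features);
};

/* Device callback: resize the queue set to match the negotiated features. */
static void toy_net_set_features(struct toy_dev *dev, uint64_t features)
{
    dev->num_queues = (features & TOY_F_MQ) ? 8 : 2;
}

/* Common path: store the features, then notify the device. */
static void toy_apply_features(struct toy_dev *dev, uint64_t features)
{
    dev->guest_features = features;
    if (dev->set_features) {
        dev->set_features(dev, features);
    }
}

/* Reset that only clears the field: the callback never runs. */
static void toy_reset_field_only(struct toy_dev *dev)
{
    dev->guest_features = 0;
}

/* Reset that goes through the common path: derived state is refreshed. */
static void toy_reset_with_callback(struct toy_dev *dev)
{
    toy_apply_features(dev, 0);
}
```

In this model, a device that negotiated TOY_F_MQ keeps eight queues after toy_reset_field_only() but drops back to two after toy_reset_with_callback().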

Re: [PATCH] virtio: Call set_features during reset

2025-04-10 Thread Michael S. Tsirkin

On Thu, Apr 10, 2025 at 04:42:06PM +0900, Akihiko Odaki wrote:
> virtio-net expects set_features() will be called when the feature set
> used by the guest changes to update the number of virtqueues. Call it
> during reset as reset clears all features and the queues added for
> VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS will need to be removed.
> 
> Fixes: f9d6dbf0bf6e ("virtio-net: remove virtio queues if the guest doesn't support multiqueue")
> Buglink: https://issues.redhat.com/browse/RHEL-73842
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Akihiko Odaki 

The issue seems specific to virtio-net: reset is reset,
and it is distinct from set features.
Why not just call the necessary functionality from virtio_net_reset?


> ---
>  hw/virtio/virtio.c | 86 
> +++---
>  1 file changed, 43 insertions(+), 43 deletions(-)
> 
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 85110bce3744..033e87cdd3b9 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -2316,49 +2316,6 @@ void virtio_queue_enable(VirtIODevice *vdev, uint32_t 
> queue_index)
>  }
>  }
>  
> -void virtio_reset(void *opaque)
> -{
> -VirtIODevice *vdev = opaque;
> -VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> -int i;
> -
> -virtio_set_status(vdev, 0);
> -if (current_cpu) {
> -/* Guest initiated reset */
> -vdev->device_endian = virtio_current_cpu_endian();
> -} else {
> -/* System reset */
> -vdev->device_endian = virtio_default_endian();
> -}
> -
> -if (k->get_vhost) {
> -struct vhost_dev *hdev = k->get_vhost(vdev);
> -/* Only reset when vhost back-end is connected */
> -if (hdev && hdev->vhost_ops) {
> -vhost_reset_device(hdev);
> -}
> -}
> -
> -if (k->reset) {
> -k->reset(vdev);
> -}
> -
> -vdev->start_on_kick = false;
> -vdev->started = false;
> -vdev->broken = false;
> -vdev->guest_features = 0;
> -vdev->queue_sel = 0;
> -vdev->status = 0;
> -vdev->disabled = false;
> -qatomic_set(&vdev->isr, 0);
> -vdev->config_vector = VIRTIO_NO_VECTOR;
> -virtio_notify_vector(vdev, vdev->config_vector);
> -
> -for(i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> -__virtio_queue_reset(vdev, i);
> -}
> -}
> -
>  void virtio_queue_set_addr(VirtIODevice *vdev, int n, hwaddr addr)
>  {
>  if (!vdev->vq[n].vring.num) {
> @@ -3169,6 +3126,49 @@ int virtio_set_features(VirtIODevice *vdev, uint64_t 
> val)
>  return ret;
>  }
>  
> +void virtio_reset(void *opaque)
> +{
> +VirtIODevice *vdev = opaque;
> +VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> +int i;
> +
> +virtio_set_status(vdev, 0);
> +if (current_cpu) {
> +/* Guest initiated reset */
> +vdev->device_endian = virtio_current_cpu_endian();
> +} else {
> +/* System reset */
> +vdev->device_endian = virtio_default_endian();
> +}
> +
> +if (k->get_vhost) {
> +struct vhost_dev *hdev = k->get_vhost(vdev);
> +/* Only reset when vhost back-end is connected */
> +if (hdev && hdev->vhost_ops) {
> +vhost_reset_device(hdev);
> +}
> +}
> +
> +if (k->reset) {
> +k->reset(vdev);
> +}
> +
> +vdev->start_on_kick = false;
> +vdev->started = false;
> +vdev->broken = false;
> +virtio_set_features_nocheck(vdev, 0);
> +vdev->queue_sel = 0;
> +vdev->status = 0;
> +vdev->disabled = false;
> +qatomic_set(&vdev->isr, 0);
> +vdev->config_vector = VIRTIO_NO_VECTOR;
> +virtio_notify_vector(vdev, vdev->config_vector);
> +
> +for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> +__virtio_queue_reset(vdev, i);
> +}
> +}
> +
>  static void virtio_device_check_notification_compatibility(VirtIODevice 
> *vdev,
> Error **errp)
>  {
> 
> ---
> base-commit: 825b96dbcee23d134b691fc75618b59c5f53da32
> change-id: 20250406-reset-5ed5248ee3c1
> 
> Best regards,
> -- 
> Akihiko Odaki

Re: [PATCH] virtio: Call set_features during reset

2025-04-10 Thread Michael S. Tsirkin

On Thu, Apr 10, 2025 at 04:54:41PM +0900, Akihiko Odaki wrote:
> On 2025/04/10 16:48, 'Michael S. Tsirkin' via devel wrote:
> > On Thu, Apr 10, 2025 at 04:42:06PM +0900, Akihiko Odaki wrote:
> > > virtio-net expects set_features() will be called when the feature set
> > > used by the guest changes to update the number of virtqueues. Call it
> > > during reset as reset clears all features and the queues added for
> > > VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS will need to be removed.
> > > 
> > > Fixes: f9d6dbf0bf6e ("virtio-net: remove virtio queues if the guest 
> > > doesn't support multiqueue")
> > > Buglink: https://issues.redhat.com/browse/RHEL-73842
> > > Cc: qemu-sta...@nongnu.org
> > > Signed-off-by: Akihiko Odaki 
> > 
> > > The issue seems specific to virtio-net: reset is reset;
> > > it is distinct from set features.
> > Why not just call the necessary functionality from virtio_net_reset?
> 
> > set_features is currently implemented only in virtio-net; virtio-gpu-base
> > also has the function set, but it only contains tracing code. If another device
> > implements the function in the future, I think the device will also want to
> > have it called during reset for the same reason as virtio-net.
> 
> virtio_reset() also calls set_status to update the status field so calling
> set_features() is more aligned with the handling of the status field.

That came to be because writing 0 to status resets the virtio device.
For a while, this was the only way to reset vhost-user so we just
went along with it.


> > 
> > 
> > > ---
> > >   hw/virtio/virtio.c | 86 
> > > +++---
> > >   1 file changed, 43 insertions(+), 43 deletions(-)
> > > 
> > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > > index 85110bce3744..033e87cdd3b9 100644
> > > --- a/hw/virtio/virtio.c
> > > +++ b/hw/virtio/virtio.c
> > > @@ -2316,49 +2316,6 @@ void virtio_queue_enable(VirtIODevice *vdev, 
> > > uint32_t queue_index)
> > >   }
> > >   }
> > > -void virtio_reset(void *opaque)
> > > -{
> > > -VirtIODevice *vdev = opaque;
> > > -VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > -int i;
> > > -
> > > -virtio_set_status(vdev, 0);
> > > -if (current_cpu) {
> > > -/* Guest initiated reset */
> > > -vdev->device_endian = virtio_current_cpu_endian();
> > > -} else {
> > > -/* System reset */
> > > -vdev->device_endian = virtio_default_endian();
> > > -}
> > > -
> > > -if (k->get_vhost) {
> > > -struct vhost_dev *hdev = k->get_vhost(vdev);
> > > -/* Only reset when vhost back-end is connected */
> > > -if (hdev && hdev->vhost_ops) {
> > > -vhost_reset_device(hdev);
> > > -}
> > > -}
> > > -
> > > -if (k->reset) {
> > > -k->reset(vdev);
> > > -}
> > > -
> > > -vdev->start_on_kick = false;
> > > -vdev->started = false;
> > > -vdev->broken = false;
> > > -vdev->guest_features = 0;
> > > -vdev->queue_sel = 0;
> > > -vdev->status = 0;
> > > -vdev->disabled = false;
> > > -qatomic_set(&vdev->isr, 0);
> > > -vdev->config_vector = VIRTIO_NO_VECTOR;
> > > -virtio_notify_vector(vdev, vdev->config_vector);
> > > -
> > > -for(i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> > > -__virtio_queue_reset(vdev, i);
> > > -}
> > > -}
> > > -
> > >   void virtio_queue_set_addr(VirtIODevice *vdev, int n, hwaddr addr)
> > >   {
> > >   if (!vdev->vq[n].vring.num) {
> > > @@ -3169,6 +3126,49 @@ int virtio_set_features(VirtIODevice *vdev, 
> > > uint64_t val)
> > >   return ret;
> > >   }
> > > +void virtio_reset(void *opaque)
> > > +{
> > > +VirtIODevice *vdev = opaque;
> > > +VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > > +int i;
> > > +
> > > +virtio_set_status(vdev, 0);
> > > +if (current_cpu) {
> > > +/* Guest initiated reset */
> > > +vdev->device_endian = virtio_current_cpu_endian();
> > > +} else {
> > > +/* System reset */
> > > +vdev->device_endian = virtio_default_endian();
> > > +}
> > > +
> > > +if (k->get_vhost) {
> > > +struct vhost_dev *hdev = k->get_vhost(vdev);
> > > +/* Only reset when vhost back-end is connected */
> > > +if (hdev && hdev->vhost_ops) {
> > > +vhost_reset_device(hdev);
> > > +}
> > > +}
> > > +
> > > +if (k->reset) {
> > > +k->reset(vdev);
> > > +}
> > > +
> > > +vdev->start_on_kick = false;
> > > +vdev->started = false;
> > > +vdev->broken = false;
> > > +virtio_set_features_nocheck(vdev, 0);
> > > +vdev->queue_sel = 0;
> > > +vdev->status = 0;
> > > +vdev->disabled = false;
> > > +qatomic_set(&vdev->isr, 0);
> > > +vdev->config_vector = VIRTIO_NO_VECTOR;
> > > +virtio_notify_vector(vdev, vdev->config_vector);
> > > +
> > > +for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
> > > +__virtio_queue_reset(vdev, i);
> > > +

Re: [PATCH 1/2] hw/i386/amd_iommu: Fix device setup failure when PT is on.

2025-04-10 Thread Vasant Hegde



+ Michael,

On 4/10/2025 12:14 PM, Sairaj Kodilkar wrote:
> Current amd_iommu enables the iommu_nodma address space when pt_supported
> flag is on. This causes device to bypass the IOMMU and use untranslated
> address to perform DMA when guest kernel uses DMA mode, resulting in
> failure to setup the devices in the guest.
> 
> Fix the issue by removing pt_supported check and disabling nodma memory
> region. Adding pt_supported requires additional changes and we will look
> into it later.
> 
> Fixes: c1f46999ef506 ("amd_iommu: Add support for pass though mode")
> Signed-off-by: Sairaj Kodilkar 

Reviewed-by: Vasant Hegde 

-Vasant

Re: [PATCH v3 09/10] target/i386/kvm: support perfmon-v2 for reset

2025-04-10 Thread Zhao Liu

On Sun, Mar 30, 2025 at 06:32:28PM -0700, Dongli Zhang wrote:
> Date: Sun, 30 Mar 2025 18:32:28 -0700
> From: Dongli Zhang 
> Subject: [PATCH v3 09/10] target/i386/kvm: support perfmon-v2 for reset
> X-Mailer: git-send-email 2.43.5
> 
> Since perfmon-v2, the AMD PMU supports additional registers. This update
> includes get/put functionality for these extra registers.
> 
> Similar to the implementation in KVM:
> 
> - MSR_CORE_PERF_GLOBAL_STATUS and MSR_AMD64_PERF_CNTR_GLOBAL_STATUS both
> use env->msr_global_status.
> - MSR_CORE_PERF_GLOBAL_CTRL and MSR_AMD64_PERF_CNTR_GLOBAL_CTL both use
> env->msr_global_ctrl.
> - MSR_CORE_PERF_GLOBAL_OVF_CTRL and MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR
> both use env->msr_global_ovf_ctrl.
> 
> No changes are needed for vmstate_msr_architectural_pmu or
> pmu_enable_needed().
> 
> Signed-off-by: Dongli Zhang 
> ---

...

> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 3a35fd741d..f4532e6f2a 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -2149,6 +2149,16 @@ static void kvm_init_pmu_info_amd(struct kvm_cpuid2 
> *cpuid, X86CPU *cpu)
>  }
>  
>  num_pmu_gp_counters = AMD64_NUM_COUNTERS_CORE;
> +
> +c = cpuid_find_entry(cpuid, 0x80000022, 0);
> +if (c && (c->eax & CPUID_8000_0022_EAX_PERFMON_V2)) {
> +pmu_version = 2;
> +num_pmu_gp_counters = c->ebx & 0xf;
> +
> +if (num_pmu_gp_counters > MAX_GP_COUNTERS) {
> +num_pmu_gp_counters = MAX_GP_COUNTERS;

OK! KVM now supports 6 GP counters (KVM_MAX_NR_AMD_GP_COUNTERS).

> +}
> +}
>  }

Fine for me,

Reviewed-by: Zhao Liu
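
For readers skimming the hunk above, the counter clamp can be sketched on its
own. This is only an illustration, not code from the patch: the helper name
and the max_counters parameter are hypothetical, and the 0xf mask is simply
the one used in the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: read the counter count
 * from the low four bits of EBX (the mask used in the quoted hunk)
 * and cap it at the number of counters the caller can track.
 */
static uint32_t perfmon_v2_gp_counters(uint32_t ebx, uint32_t max_counters)
{
    uint32_t n = ebx & 0xf;

    assert(max_counters > 0);
    return n > max_counters ? max_counters : n;
}
```

The cap matters when the reported count is larger than the array that holds
the per-counter MSR state; without it the later get/put loops would index
past the end.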

Re: [PATCH] virtio: Call set_features during reset

2025-04-10 Thread Akihiko Odaki


On 2025/04/10 17:02, Michael S. Tsirkin wrote:

On Thu, Apr 10, 2025 at 04:54:41PM +0900, Akihiko Odaki wrote:

On 2025/04/10 16:48, 'Michael S. Tsirkin' via devel wrote:

On Thu, Apr 10, 2025 at 04:42:06PM +0900, Akihiko Odaki wrote:

virtio-net expects set_features() will be called when the feature set
used by the guest changes to update the number of virtqueues. Call it
during reset as reset clears all features and the queues added for
VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS will need to be removed.

Fixes: f9d6dbf0bf6e ("virtio-net: remove virtio queues if the guest doesn't support 
multiqueue")
Buglink: https://issues.redhat.com/browse/RHEL-73842
Cc: qemu-sta...@nongnu.org
Signed-off-by: Akihiko Odaki 


The issue seems specific to virtio-net: reset is reset;
it is distinct from set features.
Why not just call the necessary functionality from virtio_net_reset?


set_features is currently implemented only in virtio-net; virtio-gpu-base
also has the function set, but it only contains tracing code. If another device
implements the function in the future, I think the device will also want to
have it called during reset for the same reason as virtio-net.

virtio_reset() also calls set_status to update the status field so calling
set_features() is more aligned with the handling of the status field.


That came to be because writing 0 to status resets the virtio device.
For a while, this was the only way to reset vhost-user so we just
went along with it.


It is possible to have code to send a command to write 0 to status to 
vhost-user in reset(), but calling set_status() in virtio_reset() is 
more convenient and makes sense as the status is indeed being set to 0. 
I think the same reasoning applies to features.









---
   hw/virtio/virtio.c | 86 
+++---
   1 file changed, 43 insertions(+), 43 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 85110bce3744..033e87cdd3b9 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2316,49 +2316,6 @@ void virtio_queue_enable(VirtIODevice *vdev, uint32_t 
queue_index)
   }
   }
-void virtio_reset(void *opaque)
-{
-VirtIODevice *vdev = opaque;
-VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
-int i;
-
-virtio_set_status(vdev, 0);
-if (current_cpu) {
-/* Guest initiated reset */
-vdev->device_endian = virtio_current_cpu_endian();
-} else {
-/* System reset */
-vdev->device_endian = virtio_default_endian();
-}
-
-if (k->get_vhost) {
-struct vhost_dev *hdev = k->get_vhost(vdev);
-/* Only reset when vhost back-end is connected */
-if (hdev && hdev->vhost_ops) {
-vhost_reset_device(hdev);
-}
-}
-
-if (k->reset) {
-k->reset(vdev);
-}
-
-vdev->start_on_kick = false;
-vdev->started = false;
-vdev->broken = false;
-vdev->guest_features = 0;
-vdev->queue_sel = 0;
-vdev->status = 0;
-vdev->disabled = false;
-qatomic_set(&vdev->isr, 0);
-vdev->config_vector = VIRTIO_NO_VECTOR;
-virtio_notify_vector(vdev, vdev->config_vector);
-
-for(i = 0; i < VIRTIO_QUEUE_MAX; i++) {
-__virtio_queue_reset(vdev, i);
-}
-}
-
   void virtio_queue_set_addr(VirtIODevice *vdev, int n, hwaddr addr)
   {
   if (!vdev->vq[n].vring.num) {
@@ -3169,6 +3126,49 @@ int virtio_set_features(VirtIODevice *vdev, uint64_t val)
   return ret;
   }
+void virtio_reset(void *opaque)
+{
+VirtIODevice *vdev = opaque;
+VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
+int i;
+
+virtio_set_status(vdev, 0);
+if (current_cpu) {
+/* Guest initiated reset */
+vdev->device_endian = virtio_current_cpu_endian();
+} else {
+/* System reset */
+vdev->device_endian = virtio_default_endian();
+}
+
+if (k->get_vhost) {
+struct vhost_dev *hdev = k->get_vhost(vdev);
+/* Only reset when vhost back-end is connected */
+if (hdev && hdev->vhost_ops) {
+vhost_reset_device(hdev);
+}
+}
+
+if (k->reset) {
+k->reset(vdev);
+}
+
+vdev->start_on_kick = false;
+vdev->started = false;
+vdev->broken = false;
+virtio_set_features_nocheck(vdev, 0);
+vdev->queue_sel = 0;
+vdev->status = 0;
+vdev->disabled = false;
+qatomic_set(&vdev->isr, 0);
+vdev->config_vector = VIRTIO_NO_VECTOR;
+virtio_notify_vector(vdev, vdev->config_vector);
+
+for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
+__virtio_queue_reset(vdev, i);
+}
+}
+
   static void virtio_device_check_notification_compatibility(VirtIODevice 
*vdev,
  Error **errp)
   {

---
base-commit: 825b96dbcee23d134b691fc75618b59c5f53da32
change-id: 20250406-reset-5ed5248ee3c1

Best regards,
--
Akihiko Odaki

[PATCH 2/3] add vnc h264 encoder

2025-04-10 Thread Dietmar Maurer

Signed-off-by: Dietmar Maurer 
---
 ui/meson.build|   1 +
 ui/vnc-enc-h264.c | 269 ++
 ui/vnc-jobs.c |  49 ++---
 ui/vnc.c  |  21 
 ui/vnc.h  |  21 
 5 files changed, 346 insertions(+), 15 deletions(-)
 create mode 100644 ui/vnc-enc-h264.c

diff --git a/ui/meson.build b/ui/meson.build
index 35fb04cadf..34f1f33699 100644
--- a/ui/meson.build
+++ b/ui/meson.build
@@ -46,6 +46,7 @@ vnc_ss.add(files(
 ))
 vnc_ss.add(zlib, jpeg)
 vnc_ss.add(when: sasl, if_true: files('vnc-auth-sasl.c'))
+vnc_ss.add(when: gstreamer, if_true: files('vnc-enc-h264.c'))
 system_ss.add_all(when: [vnc, pixman], if_true: vnc_ss)
 system_ss.add(when: vnc, if_false: files('vnc-stubs.c'))
 
diff --git a/ui/vnc-enc-h264.c b/ui/vnc-enc-h264.c
new file mode 100644
index 0000000000..ca8e206335
--- /dev/null
+++ b/ui/vnc-enc-h264.c
@@ -0,0 +1,269 @@
+#include "qemu/osdep.h"
+#include "vnc.h"
+
+#include <gst/gst.h>
+
+static void libavcodec_destroy_encoder_context(VncState *vs)
+{
+if (!vs->h264) {
+return;
+}
+
+if (vs->h264->source) {
+gst_object_unref(vs->h264->source);
+vs->h264->source = NULL;
+}
+
+if (vs->h264->convert) {
+gst_object_unref(vs->h264->convert);
+vs->h264->convert = NULL;
+}
+
+if (vs->h264->gst_encoder) {
+gst_object_unref(vs->h264->gst_encoder);
+vs->h264->gst_encoder = NULL;
+}
+
+if (vs->h264->sink) {
+gst_object_unref(vs->h264->sink);
+vs->h264->sink = NULL;
+}
+
+if (vs->h264->pipeline) {
+gst_object_unref(vs->h264->pipeline);
+vs->h264->pipeline = NULL;
+}
+}
+
+static bool libavcodec_create_encoder_context(VncState *vs, int w, int h)
+{
+g_assert(vs->h264 != NULL);
+
+if (vs->h264->sink) {
+if (w != vs->h264->width || h != vs->h264->height) {
+libavcodec_destroy_encoder_context(vs);
+}
+}
+
+if (vs->h264->sink) {
+return TRUE;
+}
+
+vs->h264->width = w;
+vs->h264->height = h;
+
+vs->h264->source = gst_element_factory_make("appsrc", "source");
+if (!vs->h264->source) {
+VNC_DEBUG("Could not create gst source\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+vs->h264->convert = gst_element_factory_make("videoconvert", "convert");
+if (!vs->h264->convert) {
+VNC_DEBUG("Could not create gst convert element\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+vs->h264->gst_encoder = gst_element_factory_make("x264enc", "gst-encoder");
+if (!vs->h264->gst_encoder) {
+VNC_DEBUG("Could not create gst x264 encoder\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+g_object_set(vs->h264->gst_encoder, "tune", 4, NULL); /* zerolatency */
+/* fix for zerolatency with novnc (without, noVNC displays green stripes) */
+g_object_set(vs->h264->gst_encoder, "threads", 1, NULL);
+
+g_object_set(vs->h264->gst_encoder, "pass", 5, NULL); /* Constant Quality */
+g_object_set(vs->h264->gst_encoder, "quantizer", 26, NULL);
+
+/* avoid access unit delimiters (Nal Unit Type 9) - not required */
+g_object_set(vs->h264->gst_encoder, "aud", false, NULL);
+
+vs->h264->sink = gst_element_factory_make("appsink", "sink");
+if (!vs->h264->sink) {
+VNC_DEBUG("Could not create gst sink\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+vs->h264->pipeline = gst_pipeline_new("vnc-h264-pipeline");
+if (!vs->h264->pipeline) {
+VNC_DEBUG("Could not create gst pipeline\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+gst_object_ref(vs->h264->source);
+if (!gst_bin_add(GST_BIN(vs->h264->pipeline), vs->h264->source)) {
+gst_object_unref(vs->h264->source);
+VNC_DEBUG("Could not add source to gst pipeline\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+gst_object_ref(vs->h264->convert);
+if (!gst_bin_add(GST_BIN(vs->h264->pipeline), vs->h264->convert)) {
+gst_object_unref(vs->h264->convert);
+VNC_DEBUG("Could not add convert to gst pipeline\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+gst_object_ref(vs->h264->gst_encoder);
+if (!gst_bin_add(GST_BIN(vs->h264->pipeline), vs->h264->gst_encoder)) {
+gst_object_unref(vs->h264->gst_encoder);
+VNC_DEBUG("Could not add encoder to gst pipeline\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+gst_object_ref(vs->h264->sink);
+if (!gst_bin_add(GST_BIN(vs->h264->pipeline), vs->h264->sink)) {
+gst_object_unref(vs->h264->sink);
+VNC_DEBUG("Could not add sink to gst pipeline\n");
+libavcodec_destroy_encoder_context(vs);
+return FALSE;
+}
+
+GstCaps *so

[PATCH] hw/intc/loongarch_pch_msi: Remove gpio input handler

2025-04-10 Thread Bibo Mao

An MSI interrupt is triggered by writing a message to a specified memory
address. It is generally used by PCI devices, and no device is connected to
the pch MSI irqchip through a GPIO pin line, so remove the gpio input setup
for the MSI controller.

Signed-off-by: Bibo Mao 
---
 hw/intc/loongarch_pch_msi.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/hw/intc/loongarch_pch_msi.c b/hw/intc/loongarch_pch_msi.c
index 66b5c1e660..bc93504ff7 100644
--- a/hw/intc/loongarch_pch_msi.c
+++ b/hw/intc/loongarch_pch_msi.c
@@ -42,13 +42,6 @@ static const MemoryRegionOps loongarch_pch_msi_ops = {
 .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
-static void pch_msi_irq_handler(void *opaque, int irq, int level)
-{
-LoongArchPCHMSI *s = LOONGARCH_PCH_MSI(opaque);
-
-qemu_set_irq(s->pch_msi_irq[irq], level);
-}
-
 static void loongarch_pch_msi_realize(DeviceState *dev, Error **errp)
 {
 LoongArchPCHMSI *s = LOONGARCH_PCH_MSI(dev);
@@ -59,9 +52,7 @@ static void loongarch_pch_msi_realize(DeviceState *dev, Error 
**errp)
 }
 
 s->pch_msi_irq = g_new(qemu_irq, s->irq_num);
-
 qdev_init_gpio_out(dev, s->pch_msi_irq, s->irq_num);
-qdev_init_gpio_in(dev, pch_msi_irq_handler, s->irq_num);
 }
 
 static void loongarch_pch_msi_unrealize(DeviceState *dev)

base-commit: 56c6e249b6988c1b6edc2dd34ebb0f1e570a1365
-- 
2.39.3

Re: [PATCH v4 07/13] ram-block-attribute: Introduce RamBlockAttribute to manage RAMBLock with guest_memfd

2025-04-10 Thread Chenyi Qiang




On 4/9/2025 5:57 PM, Alexey Kardashevskiy wrote:
> 
> 
> On 7/4/25 17:49, Chenyi Qiang wrote:
>> Commit 852f0048f3 ("RAMBlock: make guest_memfd require uncoordinated
>> discard") highlighted that subsystems like VFIO may disable RAM block
>> discard. However, guest_memfd relies on discard operations for page
>> conversion between private and shared memory, potentially leading to
>> stale IOMMU mapping issue when assigning hardware devices to
>> confidential VMs via shared memory. To address this, it is crucial to
>> ensure systems like VFIO refresh their IOMMU mappings.
>>
>> PrivateSharedManager is introduced to manage private and shared states in
>> confidential VMs, similar to RamDiscardManager, which supports
>> coordinated RAM discard in VFIO. Integrating PrivateSharedManager with
>> guest_memfd can facilitate the adjustment of VFIO mappings in response
>> to page conversion events.
>>
>> Since guest_memfd is not an object, it cannot directly implement the
>> PrivateSharedManager interface. Implementing it in HostMemoryBackend is
>> not appropriate because guest_memfd is per RAMBlock, and some RAMBlocks
>> have a memory backend while others do not. 
> 
> HostMemoryBackend::mr::ram_block::guest_memfd?
> And there is HostMemoryBackendMemfd too.

HostMemoryBackend is the parent of HostMemoryBackendMemfd. It is also
possible to use HostMemoryBackendFile or HostMemoryBackendRAM.

> 
>> Notably, virtual BIOS
>> RAMBlocks using memory_region_init_ram_guest_memfd() do not have a
>> backend.
> 
> I thought private memory can be allocated from guest_memfd only. And it
> is still not clear if this BIOS memory can be discarded or not, does it
> change state during the VM lifetime?
> (sorry I keep asking but I do not remember definitive answer).

The BIOS region supports conversion as it is backed by guest_memfd. It
can change the state but it never does during VM lifetime.

> 
>> To manage RAMBlocks with guest_memfd, define a new object named
>> RamBlockAttribute to implement the RamDiscardManager interface. This
>> object stores guest_memfd information such as shared_bitmap, and handles
>> page conversion notification. The memory state is tracked at the host
>> page size granularity, as the minimum memory conversion size can be one
>> page per request. Additionally, VFIO expects the DMA mapping for a
>> specific iova to be mapped and unmapped with the same granularity.
>> Confidential VMs may perform partial conversions, such as conversions on
>> small regions within larger regions. To prevent invalid cases and until
>> cut_mapping operation support is available, all operations are performed
>> with 4K granularity.
>>
>> Signed-off-by: Chenyi Qiang 
>> ---
>> Changes in v4:
>>  - Change the name from memory-attribute-manager to
>>    ram-block-attribute.
>>  - Implement the newly-introduced PrivateSharedManager instead of
>>    RamDiscardManager and change related commit message.
>>  - Define the new object in ramblock.h instead of adding a new file.
>>
>> Changes in v3:
>>  - Some rename (bitmap_size->shared_bitmap_size,
>>    first_one/zero_bit->first_bit, etc.)
>>  - Change shared_bitmap_size from uint32_t to unsigned
>>  - Return mgr->mr->ram_block->page_size in get_block_size()
>>  - Move set_ram_discard_manager() up to avoid a g_free() in failure
>>    case.
>>  - Add const for the memory_attribute_manager_get_block_size()
>>  - Unify the ReplayRamPopulate and ReplayRamDiscard and related
>>    callback.
>>
>> Changes in v2:
>>  - Rename the object name to MemoryAttributeManager
>>  - Rename the bitmap to shared_bitmap to make it more clear.
>>  - Remove block_size field and get it from a helper. In future, we
>>    can get the page_size from RAMBlock if necessary.
>>  - Remove the unnecessary "struct" before GuestMemfdReplayData
>>  - Remove the unnecessary g_free() for the bitmap
>>  - Add some error report when the callback failure for
>>    populated/discarded section.
>>  - Move the realize()/unrealize() definition to this patch.
>> ---
>>   include/exec/ramblock.h  |  24 +++
>>   system/meson.build   |   1 +
>>   system/ram-block-attribute.c | 282 +++
>>   3 files changed, 307 insertions(+)
>>   create mode 100644 system/ram-block-attribute.c
>>
>> diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
>> index 0babd105c0..b8b5469db9 100644
>> --- a/include/exec/ramblock.h
>> +++ b/include/exec/ramblock.h
>> @@ -23,6 +23,10 @@
>>   #include "cpu-common.h"
>>   #include "qemu/rcu.h"
>>   #include "exec/ramlist.h"
>> +#include "system/hostmem.h"
>> +
>> +#define TYPE_RAM_BLOCK_ATTRIBUTE "ram-block-attribute"
>> +OBJECT_DECLARE_TYPE(RamBlockAttribute, RamBlockAttributeClass,
>> RAM_BLOCK_ATTRIBUTE)
>>     struct RAMBlock {
>>   struct rcu_head rcu;
>> @@ -90,5 +94,25 @@ struct RAMBlock {
>>    */
>>   ram_addr_t postcopy_length;
>>   };
>> +
>> +struc

[PATCH v1 06/24] s390x/diag: Implement DIAG 320 subcode 1

2025-04-10 Thread Zhuoying Cai

DIAG 320 subcode 1 provides information needed to determine
the amount of storage to store one or more certificates.

The subcode value is denoted by setting the left-most bit
of an 8-byte field.

The verification-certificate-storage-size block (VCSSB) contains
the output data when the operation completes successfully.

Signed-off-by: Zhuoying Cai 
---
 include/hw/s390x/ipl/diag320.h | 25 ++
 target/s390x/diag.c| 39 +-
 2 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/include/hw/s390x/ipl/diag320.h b/include/hw/s390x/ipl/diag320.h
index d6f70c65df..ded336df25 100644
--- a/include/hw/s390x/ipl/diag320.h
+++ b/include/hw/s390x/ipl/diag320.h
@@ -13,7 +13,32 @@
 #define S390X_DIAG320_H
 
 #define DIAG_320_SUBC_QUERY_ISM 0
+#define DIAG_320_SUBC_QUERY_VCSI1
 
 #define DIAG_320_RC_OK  0x0001
+#define DIAG_320_RC_NOMEM   0x0202
+
+#define VCSSB_MAX_LEN   128
+#define VCE_HEADER_LEN  128
+#define VCB_HEADER_LEN  64
+
+#define DIAG_320_ISM_QUERY_VCSI 0x4000000000000000
+
+struct VerificationCertificateStorageSizeBlock {
+uint32_t length;
+uint8_t reserved0[3];
+uint8_t version;
+uint32_t reserved1[6];
+uint16_t totalvc;
+uint16_t maxvc;
+uint32_t reserved3[7];
+uint32_t maxvcelen;
+uint32_t reserved4[3];
+uint32_t largestvcblen;
+uint32_t totalvcblen;
+uint32_t reserved5[10];
+} QEMU_PACKED;
+typedef struct VerificationCertificateStorageSizeBlock \
+VerificationCertificateStorageSizeBlock;
 
 #endif
diff --git a/target/s390x/diag.c b/target/s390x/diag.c
index c64b935c87..cc639819ec 100644
--- a/target/s390x/diag.c
+++ b/target/s390x/diag.c
@@ -194,6 +194,7 @@ out:
 void handle_diag_320(CPUS390XState *env, uint64_t r1, uint64_t r3, uintptr_t 
ra)
 {
 S390CPU *cpu = env_archcpu(env);
+S390IPLCertificateStore *qcs = s390_ipl_get_certificate_store();
 uint64_t subcode = env->regs[r3];
 uint64_t addr = env->regs[r1];
 int rc;
@@ -210,7 +211,7 @@ void handle_diag_320(CPUS390XState *env, uint64_t r1, 
uint64_t r3, uintptr_t ra)
 
 switch (subcode) {
 case DIAG_320_SUBC_QUERY_ISM:
-uint64_t ism =  0;
+uint64_t ism = DIAG_320_ISM_QUERY_VCSI;
 
 if (s390_cpu_virt_mem_write(cpu, addr, (uint8_t)r1, &ism,
 be64_to_cpu(sizeof(ism)))) {
@@ -218,6 +219,42 @@ void handle_diag_320(CPUS390XState *env, uint64_t r1, 
uint64_t r3, uintptr_t ra)
 return;
 }
 
+rc = DIAG_320_RC_OK;
+break;
+case DIAG_320_SUBC_QUERY_VCSI:
+VerificationCertificateStorageSizeBlock vcssb;
+
+if (!diag_parm_addr_valid(addr, 
sizeof(VerificationCertificateStorageSizeBlock),
+  true)) {
+s390_program_interrupt(env, PGM_ADDRESSING, ra);
+return;
+}
+
+if (!qcs || !qcs->count) {
+vcssb.length = 4;
+} else {
+vcssb.length = VCSSB_MAX_LEN;
+vcssb.version = 0;
+vcssb.totalvc = qcs->count;
+vcssb.maxvc = MAX_CERTIFICATES;
+vcssb.maxvcelen = VCE_HEADER_LEN + qcs->max_cert_size;
+vcssb.largestvcblen = VCB_HEADER_LEN + vcssb.maxvcelen;
+vcssb.totalvcblen = VCB_HEADER_LEN + qcs->count * VCE_HEADER_LEN +
+qcs->total_bytes;
+}
+
+if (vcssb.length < 128) {
+rc = DIAG_320_RC_NOMEM;
+break;
+}
+
+if (s390_cpu_virt_mem_write(cpu, addr, (uint8_t)r1, &vcssb,
+be64_to_cpu(
+
sizeof(VerificationCertificateStorageSizeBlock)
+))) {
+s390_cpu_virt_mem_handle_exc(cpu, ra);
+return;
+}
 rc = DIAG_320_RC_OK;
 break;
 default:
-- 
2.49.0
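
As a reading aid for the commit message of the DIAG 320 patch above, the
"left-most bit of an 8-byte field" convention can be sketched as follows.
This is an assumption-laden illustration, not code from the series: the
helper names are hypothetical, and it assumes the usual s390 numbering where
bit 0 is the most significant bit, so that subcode n maps to bit n counted
from the left.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: mask bit for a subcode in the installed-subcodes
 * mask, assuming bit 0 is the left-most (most significant) bit.
 */
static uint64_t diag320_ism_bit(unsigned subcode)
{
    assert(subcode < 64);
    return UINT64_C(1) << (63 - subcode);
}

/* Hypothetical helper: test whether a subcode is advertised in the mask. */
static int diag320_subcode_installed(uint64_t ism, unsigned subcode)
{
    return (ism & diag320_ism_bit(subcode)) != 0;
}
```

Under that assumption a guest would query subcode 0 first and only issue
subcode 1 when the corresponding bit is set in the returned mask.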

[PATCH 00/10] Enable QEMU to run on browsers

2025-04-10 Thread Kohei Tokunaga

This patch series enables QEMU's system emulator to run in a browser using
Emscripten.
It includes implementations and workarounds to address browser environment
limitations, as shown in the following.

# New TCG Backend for Browsers

A new TCG backend translates IR instructions into Wasm instructions and runs
them using the browser's WebAssembly APIs (WebAssembly.Module and
WebAssembly.instantiate). To minimize compilation overhead and avoid hitting
the browser's limitation of the number of instances, this backend integrates
a forked TCI. TBs run on TCI by default, with frequently executed TBs
compiled into WebAssembly.

# Workaround for Running 64-bit Guests

The current implementation uses Wasm's 32-bit memory model, even though Wasm
supports 64-bit variables and instructions. This patch explores implementing
TCG 64-bit instructions while leveraging SoftMMU for address translation. To
enable 64-bit guest support in Wasm today, it was necessary to partially
revert recent changes that removed support for different pointer widths
between the host and guest (e.g., commits
a70af12addd9060fdf8f3dbd42b42e3072c3914f and
bf455ec50b6fea15b4d2493059365bf94c706273) when compiling with
Emscripten. While this serves as a temporary workaround, a long-term
solution could involve adopting Wasm's 64-bit memory model once it gains
broader support, as it is currently not widely adopted (e.g., unsupported by
Safari and libffi). Feedback and suggestions on this approach are welcome.

# Emscripten-Based Coroutine Backend

Emscripten does not support the coroutine methods currently used by QEMU but
provides a coroutine implementation called "fiber". This patch series
introduces a coroutine backend using fiber. However, fiber does not support
submitting coroutines to other threads. So this patch series modifies
hw/9pfs/coth.h to disable this behavior when compiled with Emscripten.

# Overview of build process

This section provides an overview of the build process for compiling QEMU
using Emscripten. Full instructions are available in the sample
repository[1].

To compile QEMU with Emscripten, the following dependencies are required.
The emsdk-wasm32-cross.docker environment includes all necessary components
and can be used as the build environment:

- Emscripten SDK (emsdk) v3.1.50
- Libraries cross-compiled with Emscripten (refer to
  emsdk-wasm32-cross.docker for build steps)
  - GLib v2.84.0
  - zlib v1.3.1
  - libffi v3.4.7
  - Pixman v0.44.2

QEMU can be compiled using Emscripten's emconfigure and emmake, which
automatically set environment variables such as CC for targeting Emscripten.

emconfigure configure --static --disable-tools --target-list=x86_64-softmmu
emmake make -j$(nproc)

This process generates the following files:

- qemu-system-x86_64.js
- qemu-system-x86_64.wasm
- qemu-system-x86_64.worker.js

Guest images can be packaged using Emscripten's file_packager.py tool.
For example, if the images are stored in a directory named "pack", the
following command packages them, allowing QEMU to access them through
Emscripten's virtual filesystem:

/path/to/file_packager.py qemu-system-x86_64.data --preload pack > load.js

This process generates the following files:

- qemu-system-x86_64.data
- load.js

Emscripten allows passing arguments to the QEMU command via the Module
object in JavaScript:

Module['arguments'] = [
'-nographic', '-m', '512M', '-accel', 'tcg,tb-size=500',
'-L', 'pack/',
'-drive', 'if=virtio,format=raw,file=pack/rootfs.bin',
'-kernel', 'pack/bzImage',
'-append', 'earlyprintk=ttyS0 console=ttyS0 root=/dev/vda loglevel=7',
];

The sample repository[1] provides a complete setup, including an HTML file
that implements a terminal UI.

[1] https://github.com/ktock/qemu-wasm-sample

# Additional references

- A talk at FOSDEM 2025:
  https://fosdem.org/2025/schedule/event/fosdem-2025-6290-running-qemu-inside-browser/
- Demo page on GitHub Pages: https://ktock.github.io/qemu-wasm-demo/

Kohei Tokunaga (10):
  various: Fix type conflict of GLib function pointers
  various: Define macros for dependencies on emscripten
  util/mmap-alloc: Add qemu_ram_mmap implementation for emscripten
  util: Add coroutine backend for emscripten
  meson: Add wasm build in build scripts
  include/exec: Allow using 64bit guest addresses on emscripten
  tcg: Add a TCG backend for WebAssembly
  hw/9pfs: Allow using hw/9pfs with emscripten
  gitlab: Enable CI for wasm build
  MAINTAINERS: Update MAINTAINERS file for wasm-related files

 .gitlab-ci.d/buildtest-template.yml   |   27 +
 .gitlab-ci.d/buildtest.yml|9 +
 .gitlab-ci.d/container-cross.yml  |5 +
 MAINTAINERS   |   11 +
 accel/tcg/cputlb.c|8 +-
 block/file-posix.c|   18 +
 configs/meson/emscripten.txt  |6 +
 configure |7 +
 fsdev/file-op-9p.h

Re: [PATCH 03/13] hw/intc/aspeed: Add support for AST2700 SSP INTC

2025-04-10 Thread Cédric Le Goater


On 3/13/25 06:40, Steven Lee wrote:

- Define new types for ast2700ssp INTC and INTCIO
- Add register definitions for SSP INTC and INTCIO
- Implement write handlers for SSP INTC and INTCIO
- Register new types in aspeed_intc_register_types

The design of the SSP INTC and INTCIO controllers is similar to
AST2700, with the following differences:

- AST2700
   Support GICINT128 to GICINT136 in INTC
   The INTCIO GIC_192_201 has 10 output pins, mapped as follows:
 Bit 0 -> GIC 192
 Bit 1 -> GIC 193
 Bit 2 -> GIC 194
 Bit 3 -> GIC 195
 Bit 4 -> GIC 196

- AST2700-ssp
   Support SSPINT128 to SSPINT136 in INTC
   The INTCIO SSPINT_160_169 has 10 output pins, mapped as follows:
 Bit 0 -> SSPINT 160
 Bit 1 -> SSPINT 161
 Bit 2 -> SSPINT 162
 Bit 3 -> SSPINT 163
 Bit 4 -> SSPINT 164

Signed-off-by: Steven Lee 
Change-Id: I5329767b21c0e982d3afcb87c7d1690cc04ce2ef



Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  include/hw/intc/aspeed_intc.h |   3 +
  hw/intc/aspeed_intc.c | 211 ++
  2 files changed, 214 insertions(+)

diff --git a/include/hw/intc/aspeed_intc.h b/include/hw/intc/aspeed_intc.h
index 3727ba24be..746f159bf3 100644
--- a/include/hw/intc/aspeed_intc.h
+++ b/include/hw/intc/aspeed_intc.h
@@ -15,6 +15,9 @@
  #define TYPE_ASPEED_INTC "aspeed.intc"
  #define TYPE_ASPEED_2700_INTC TYPE_ASPEED_INTC "-ast2700"
  #define TYPE_ASPEED_2700_INTCIO TYPE_ASPEED_INTC "io-ast2700"
+#define TYPE_ASPEED_2700SSP_INTC TYPE_ASPEED_INTC "-ast2700ssp"
+#define TYPE_ASPEED_2700SSP_INTCIO TYPE_ASPEED_INTC "io-ast2700ssp"
+
  OBJECT_DECLARE_TYPE(AspeedINTCState, AspeedINTCClass, ASPEED_INTC)
  
  #define ASPEED_INTC_MAX_INPINS 10

diff --git a/hw/intc/aspeed_intc.c b/hw/intc/aspeed_intc.c
index 3fd417084f..1f8b4d4d36 100644
--- a/hw/intc/aspeed_intc.c
+++ b/hw/intc/aspeed_intc.c
@@ -62,6 +62,50 @@ REG32(GICINT196_STATUS, 0x44)
  REG32(GICINT197_EN, 0x50)
  REG32(GICINT197_STATUS, 0x54)
  
+/*
+ * SSP INTC Registers
+ */
+REG32(SSPINT128_EN, 0x2000)
+REG32(SSPINT128_STATUS, 0x2004)
+REG32(SSPINT129_EN, 0x2100)
+REG32(SSPINT129_STATUS, 0x2104)
+REG32(SSPINT130_EN, 0x2200)
+REG32(SSPINT130_STATUS, 0x2204)
+REG32(SSPINT131_EN, 0x2300)
+REG32(SSPINT131_STATUS, 0x2304)
+REG32(SSPINT132_EN, 0x2400)
+REG32(SSPINT132_STATUS, 0x2404)
+REG32(SSPINT133_EN, 0x2500)
+REG32(SSPINT133_STATUS, 0x2504)
+REG32(SSPINT134_EN, 0x2600)
+REG32(SSPINT134_STATUS, 0x2604)
+REG32(SSPINT135_EN, 0x2700)
+REG32(SSPINT135_STATUS, 0x2704)
+REG32(SSPINT136_EN, 0x2800)
+REG32(SSPINT136_STATUS, 0x2804)
+REG32(SSPINT137_EN, 0x2900)
+REG32(SSPINT137_STATUS, 0x2904)
+REG32(SSPINT138_EN, 0x2A00)
+REG32(SSPINT138_STATUS, 0x2A04)
+REG32(SSPINT160_169_EN, 0x2B00)
+REG32(SSPINT160_169_STATUS, 0x2B04)
+
+/*
+ * SSP INTCIO Registers
+ */
+REG32(SSPINT160_EN, 0x180)
+REG32(SSPINT160_STATUS, 0x184)
+REG32(SSPINT161_EN, 0x190)
+REG32(SSPINT161_STATUS, 0x194)
+REG32(SSPINT162_EN, 0x1A0)
+REG32(SSPINT162_STATUS, 0x1A4)
+REG32(SSPINT163_EN, 0x1B0)
+REG32(SSPINT163_STATUS, 0x1B4)
+REG32(SSPINT164_EN, 0x1C0)
+REG32(SSPINT164_STATUS, 0x1C4)
+REG32(SSPINT165_EN, 0x1D0)
+REG32(SSPINT165_STATUS, 0x1D4)
+
  static const AspeedINTCIRQ *aspeed_intc_get_irq(AspeedINTCClass *aic,
  uint32_t reg)
  {
@@ -452,6 +496,50 @@ static void aspeed_intc_write(void *opaque, hwaddr offset, uint64_t data,
  return;
  }
  
+static void aspeed_ssp_intc_write(void *opaque, hwaddr offset, uint64_t data,
+unsigned size)
+{
+AspeedINTCState *s = ASPEED_INTC(opaque);
+const char *name = object_get_typename(OBJECT(s));
+uint32_t reg = offset >> 2;
+
+trace_aspeed_intc_write(name, offset, size, data);
+
+switch (reg) {
+case R_SSPINT128_EN:
+case R_SSPINT129_EN:
+case R_SSPINT130_EN:
+case R_SSPINT131_EN:
+case R_SSPINT132_EN:
+case R_SSPINT133_EN:
+case R_SSPINT134_EN:
+case R_SSPINT135_EN:
+case R_SSPINT136_EN:
+case R_SSPINT160_169_EN:
+aspeed_intc_enable_handler(s, offset, data);
+break;
+case R_SSPINT128_STATUS:
+case R_SSPINT129_STATUS:
+case R_SSPINT130_STATUS:
+case R_SSPINT131_STATUS:
+case R_SSPINT132_STATUS:
+case R_SSPINT133_STATUS:
+case R_SSPINT134_STATUS:
+case R_SSPINT135_STATUS:
+case R_SSPINT136_STATUS:
+aspeed_intc_status_handler(s, offset, data);
+break;
+case R_SSPINT160_169_STATUS:
+aspeed_intc_status_handler_multi_outpins(s, offset, data);
+break;
+default:
+s->regs[reg] = data;
+break;
+}
+
+return;
+}
+
  static uint64_t as

[PATCH v1 01/24] Add -boot-certificates /path/dir:/path/file option in QEMU command line

2025-04-10 Thread Zhuoying Cai

The `-boot-certificates /path/dir:/path/file` option is implemented
to provide a path to either a directory or a single certificate.

Multiple paths can be delineated using a colon.
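
A colon-separated argument of this kind could be split as in the sketch below. This is not the parsing code from the series (later patches may handle it differently); `split_cert_paths` is a hypothetical helper, and skipping empty components is an assumption made for the example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: record the start and length of each non-empty,
 * colon-separated component of arg. The input string is not modified.
 * Returns the number of components stored (at most max).
 */
static size_t split_cert_paths(const char *arg, const char **starts,
                               size_t *lens, size_t max)
{
    size_t n = 0;
    const char *p = arg;

    while (p != NULL && n < max) {
        const char *colon = strchr(p, ':');
        size_t len = colon ? (size_t)(colon - p) : strlen(p);

        if (len > 0) {              /* skip empty components, e.g. "a::b" */
            starts[n] = p;
            lens[n] = len;
            n++;
        }
        p = colon ? colon + 1 : NULL;
    }
    return n;
}
```

Called on "/path/dir:/path/file", the helper yields two components; a caller would then check whether each one names a directory or a single certificate file.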

Signed-off-by: Zhuoying Cai 
---
 qemu-options.hx | 11 +++
 system/vl.c | 22 ++
 2 files changed, 33 insertions(+)

diff --git a/qemu-options.hx b/qemu-options.hx
index dc694a99a3..b460c63490 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1251,6 +1251,17 @@ SRST
 Set system UUID.
 ERST
 
+DEF("boot-certificates", HAS_ARG, QEMU_OPTION_boot_certificates,
+"-boot-certificates /path/directory:/path/file\n"
+"  Provide a path to a directory or a boot certificate.\n"
+"  A colon may be used to delineate multiple paths.\n",
+QEMU_ARCH_S390X)
+SRST
+``-boot-certificates /path/directory:/path/file``
+Provide a path to a directory or a boot certificate.
+A colon may be used to delineate multiple paths.
+ERST
+
 DEFHEADING()
 
 DEFHEADING(Block device options:)
diff --git a/system/vl.c b/system/vl.c
index ec93988a03..bd6197c887 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -510,6 +510,20 @@ static QemuOptsList qemu_action_opts = {
 },
 };
 
+static QemuOptsList qemu_boot_certificates_opts = {
+.name = "boot-certificates",
+.implied_opt_name = "boot-certificates",
+.merge_lists = true,
+.head = QTAILQ_HEAD_INITIALIZER(qemu_boot_certificates_opts.head),
+.desc = {
+   {
+   .name = "boot-certificates",
+   .type = QEMU_OPT_STRING,
+   },
+{ /* end of list */}
+},
+};
+
 const char *qemu_get_vm_name(void)
 {
 return qemu_name;
@@ -2879,6 +2893,7 @@ void qemu_init(int argc, char **argv)
 qemu_add_opts(&qemu_semihosting_config_opts);
 qemu_add_opts(&qemu_fw_cfg_opts);
 qemu_add_opts(&qemu_action_opts);
+qemu_add_opts(&qemu_boot_certificates_opts);
 qemu_add_run_with_opts();
 module_call_init(MODULE_INIT_OPTS);
 
@@ -3024,6 +3039,13 @@ void qemu_init(int argc, char **argv)
 case QEMU_OPTION_boot:
 machine_parse_property_opt(qemu_find_opts("boot-opts"), "boot", optarg);
 break;
+case QEMU_OPTION_boot_certificates:
+opts = qemu_opts_parse_noisily(qemu_find_opts("boot-certificates"),
+   optarg, true);
+if (!opts) {
+exit(1);
+}
+break;
 case QEMU_OPTION_fda:
 case QEMU_OPTION_fdb:
 drive_add(IF_FLOPPY, popt->index - QEMU_OPTION_fda,
-- 
2.49.0

[PATCH for-10.1 v5 07/13] arm/cpu: Store aa64smfr0 into the idregs array

2025-04-10 Thread Cornelia Huck

From: Eric Auger 

Reviewed-by: Richard Henderson 
Reviewed-by: Sebastian Ott 
Signed-off-by: Eric Auger 
Signed-off-by: Cornelia Huck 
---
 target/arm/cpu-features.h | 6 +++---
 target/arm/cpu.h  | 1 -
 target/arm/cpu64.c| 7 ++-
 target/arm/helper.c   | 2 +-
 target/arm/kvm.c  | 3 +--
 target/arm/tcg/cpu64.c| 4 ++--
 6 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/target/arm/cpu-features.h b/target/arm/cpu-features.h
index 7f6331ca437d..1ac1f3e95984 100644
--- a/target/arm/cpu-features.h
+++ b/target/arm/cpu-features.h
@@ -978,17 +978,17 @@ static inline bool isar_feature_aa64_sve_f64mm(const ARMISARegisters *id)
 
 static inline bool isar_feature_aa64_sme_f64f64(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64smfr0, ID_AA64SMFR0, F64F64);
+return FIELD_EX64_IDREG(id, ID_AA64SMFR0, F64F64);
 }
 
 static inline bool isar_feature_aa64_sme_i16i64(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64smfr0, ID_AA64SMFR0, I16I64) == 0xf;
+return FIELD_EX64_IDREG(id, ID_AA64SMFR0, I16I64) == 0xf;
 }
 
 static inline bool isar_feature_aa64_sme_fa64(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64smfr0, ID_AA64SMFR0, FA64);
+return FIELD_EX64_IDREG(id, ID_AA64SMFR0, FA64);
 }
 
 /*
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 37bb337b3c71..a3a3b8031eed 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1097,7 +1097,6 @@ struct ArchCPU {
 uint32_t dbgdidr;
 uint32_t dbgdevid;
 uint32_t dbgdevid1;
-uint64_t id_aa64smfr0;
 uint64_t reset_pmcr_el0;
 uint64_t idregs[NUM_ID_IDX];
 } isar;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 4ba53f75ed96..c8ab8761282a 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -328,7 +328,7 @@ void arm_cpu_sme_finalize(ARMCPU *cpu, Error **errp)
 
 if (vq_map == 0) {
 if (!cpu_isar_feature(aa64_sme, cpu)) {
-cpu->isar.id_aa64smfr0 = 0;
+SET_IDREG(&cpu->isar, ID_AA64SMFR0, 0);
 return;
 }
 
@@ -381,11 +381,8 @@ static bool cpu_arm_get_sme_fa64(Object *obj, Error **errp)
 static void cpu_arm_set_sme_fa64(Object *obj, bool value, Error **errp)
 {
 ARMCPU *cpu = ARM_CPU(obj);
-uint64_t t;
 
-t = cpu->isar.id_aa64smfr0;
-t = FIELD_DP64(t, ID_AA64SMFR0, FA64, value);
-cpu->isar.id_aa64smfr0 = t;
+FIELD_DP64_IDREG(&cpu->isar, ID_AA64SMFR0, FA64, value);
 }
 
 #ifdef CONFIG_USER_ONLY
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 8efe508306e5..275e590876bf 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -7946,7 +7946,7 @@ void register_cp_regs_for_features(ARMCPU *cpu)
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 5,
   .access = PL1_R, .type = ARM_CP_CONST,
   .accessfn = access_aa64_tid3,
-  .resetvalue = cpu->isar.id_aa64smfr0 },
+  .resetvalue = GET_IDREG(isar, ID_AA64SMFR0)},
 { .name = "ID_AA64PFR6_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 6,
   .access = PL1_R, .type = ARM_CP_CONST,
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index e999d98dcf7f..a73ff0a603bc 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -353,8 +353,7 @@ static bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
 err = 0;
 } else {
 err |= get_host_cpu_reg(fd, ahcf, ID_AA64PFR1_EL1_IDX);
-err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64smfr0,
-  ARM64_SYS_REG(3, 0, 0, 4, 5));
+err |= get_host_cpu_reg(fd, ahcf, ID_AA64SMFR0_EL1_IDX);
 err |= get_host_cpu_reg(fd, ahcf, ID_AA64DFR0_EL1_IDX);
 err |= get_host_cpu_reg(fd, ahcf, ID_AA64DFR1_EL1_IDX);
 err |= get_host_cpu_reg(fd, ahcf, ID_AA64ISAR0_EL1_IDX);
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index 41077b3dcd08..cadc1258fc40 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -1268,7 +1268,7 @@ void aarch64_max_tcg_initfn(Object *obj)
 t = FIELD_DP64(t, ID_AA64DFR0, HPMN0, 1); /* FEAT_HPMN0 */
 SET_IDREG(isar, ID_AA64DFR0, t);
 
-t = cpu->isar.id_aa64smfr0;
+t = GET_IDREG(isar, ID_AA64SMFR0);
 t = FIELD_DP64(t, ID_AA64SMFR0, F32F32, 1);   /* FEAT_SME */
 t = FIELD_DP64(t, ID_AA64SMFR0, B16F32, 1);   /* FEAT_SME */
 t = FIELD_DP64(t, ID_AA64SMFR0, F16F32, 1);   /* FEAT_SME */
@@ -1276,7 +1276,7 @@ void aarch64_max_tcg_initfn(Object *obj)
 t = FIELD_DP64(t, ID_AA64SMFR0, F64F64, 1);   /* FEAT_SME_F64F64 */
 t = FIELD_DP64(t, ID_AA64SMFR0, I16I64, 0xf); /* FEAT_SME_I16I64 */
 t = FIELD_DP64(t, ID_AA64SMFR0, FA64, 1); /* FEAT_SME_FA64 */
-cpu->isar.id_aa64smfr0 = t;
+SET_IDREG(isar, ID_AA64SMFR0, t);
 
 /* Replicate the same data to the 32-bit id registers.  */
 aa32_max_features(cpu);
-- 
2.48.1

Re: [PATCH v8 08/55] i386/tdx: Initialize TDX before creating TD vcpus

2025-04-10 Thread Xiaoyao Li


On 4/2/2025 7:41 PM, Daniel P. Berrangé wrote:

On Tue, Apr 01, 2025 at 09:01:18AM -0400, Xiaoyao Li wrote:

Invoke KVM_TDX_INIT_VM in kvm_arch_pre_create_vcpu(), because
KVM_TDX_INIT_VM configures global TD configurations, e.g. the canonical
CPUID config, and must be executed prior to creating vCPUs.

Use kvm_x86_arch_cpuid() to setup the CPUID settings for TDX VM.

Note, this doesn't address the fact that QEMU may change the CPUID
configuration when creating vCPUs, i.e. punts on refactoring QEMU to
provide a stable CPUID config prior to kvm_arch_init().

Signed-off-by: Xiaoyao Li 
Acked-by: Gerd Hoffmann 
Acked-by: Markus Armbruster 
---
Changes in v8:
- Drop the code that initializes cpu->kvm_state before
   kvm_arch_pre_create_vcpu() because it's not needed anymore.

Changes in v7:
- Add comments to explain why KVM_TDX_INIT_VM should retry on -EAGAIN;
- Add a retry limit of 1 for -EAGAIN on KVM_TDX_INIT_VM;

Changes in v6:
- setup xfam explicitly to fit with new uapi;
- use tdx_caps->cpuid to filter the input of cpuids because now KVM only
   allows the leaves that are reported via KVM_TDX_GET_CAPABILITIES;

Changes in v4:
- mark init_vm with g_autofree() and use QEMU_LOCK_GUARD() to eliminate
   the goto labels; (Daniel)
Changes in v3:
- Pass @errp in tdx_pre_create_vcpu() and pass error info to it. (Daniel)
---
  target/i386/kvm/kvm.c   |  16 +++---
  target/i386/kvm/kvm_i386.h  |   5 ++
  target/i386/kvm/meson.build |   2 +-
  target/i386/kvm/tdx-stub.c  |  10 
  target/i386/kvm/tdx.c   | 105 
  target/i386/kvm/tdx.h   |   6 +++
  6 files changed, 137 insertions(+), 7 deletions(-)
  create mode 100644 target/i386/kvm/tdx-stub.c




diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 16f67e18ae78..0afaf739c09f 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c



+int tdx_pre_create_vcpu(CPUState *cpu, Error **errp)
+{
+X86CPU *x86cpu = X86_CPU(cpu);
+CPUX86State *env = &x86cpu->env;
+g_autofree struct kvm_tdx_init_vm *init_vm = NULL;
+Error *local_err = NULL;
+int retry = 1;
+int r = 0;
+
+QEMU_LOCK_GUARD(&tdx_guest->lock);
+if (tdx_guest->initialized) {
+return r;
+}
+
+init_vm = g_malloc0(sizeof(struct kvm_tdx_init_vm) +
+sizeof(struct kvm_cpuid_entry2) * KVM_MAX_CPUID_ENTRIES);
+
+r = setup_td_xfam(x86cpu, errp);
+if (r) {
+return r;
+}
+
+init_vm->cpuid.nent = kvm_x86_build_cpuid(env, init_vm->cpuid.entries, 0);
+tdx_filter_cpuid(&init_vm->cpuid);
+
+init_vm->attributes = tdx_guest->attributes;
+init_vm->xfam = tdx_guest->xfam;
+
+/*
+ * KVM_TDX_INIT_VM gets -EAGAIN when KVM side SEAMCALL(TDH_MNG_CREATE)
+ * gets TDX_RND_NO_ENTROPY due to Random number generation (e.g., RDRAND or
+ * RDSEED) is busy.
+ *
+ * Retry for the case.
+ */
+do {
+error_free(local_err);
+local_err = NULL;
+r = tdx_vm_ioctl(KVM_TDX_INIT_VM, 0, init_vm, &local_err);
+} while (r == -EAGAIN && --retry);
+
+if (r < 0) {
+if (!retry) {
+error_report("Hardware RNG (Random Number Generator) is busy "
+ "occupied by someone (via RDRAND/RDSEED) maliciously, "
+ "which leads to KVM_TDX_INIT_VM keeping failure "
+ "due to lack of entropy.");


This needs to be

  error_append_hint(local_err, );

so that this message gets associated with the error object that
is propagated, and the top level will print it all at once.


Good suggestion! Will change to it in the next version.


+}
+error_propagate(errp, local_err);
+return r;
+}
+
+tdx_guest->initialized = true;
+
+return 0;
+}


With regards,
Daniel
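
The bounded retry on -EAGAIN discussed in this thread can be sketched independently of QEMU as follows. This is a minimal illustration under stated assumptions, not the patch's code: `call_with_retry` and `busy_then_ok` are hypothetical names, the callback stands in for the KVM_TDX_INIT_VM ioctl, and QEMU's Error API is left out entirely.

```c
#include <errno.h>
#include <stddef.h>

typedef int (*demo_init_fn)(void *opaque);

/*
 * Call fn up to max_attempts times, repeating only while it reports
 * -EAGAIN. Returns the last result; *attempts_made (if non-NULL) receives
 * the number of calls actually performed.
 */
static int call_with_retry(demo_init_fn fn, void *opaque,
                           unsigned max_attempts, unsigned *attempts_made)
{
    int r = -EINVAL;    /* returned as-is if max_attempts is 0 */
    unsigned n = 0;

    while (n < max_attempts) {
        r = fn(opaque);
        n++;
        if (r != -EAGAIN) {
            break;      /* success or a non-transient error */
        }
    }
    if (attempts_made != NULL) {
        *attempts_made = n;
    }
    return r;
}

/* Stand-in for a busy resource: fails with -EAGAIN *opaque times, then succeeds. */
static int busy_then_ok(void *opaque)
{
    int *busy_left = opaque;

    if (*busy_left > 0) {
        (*busy_left)--;
        return -EAGAIN;
    }
    return 0;
}
```

The sketch counts attempts rather than retries, so max_attempts = 2 allows one retry. If the result is still -EAGAIN once the budget is exhausted, the caller is the natural place to attach an explanatory hint to the error it propagates.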

[PATCH v1 19/24] pc-bios/s390-ccw: Add additional security checks for secure boot

2025-04-10 Thread Zhuoying Cai

Add additional checks to ensure that components do not overlap with
signed components when loaded into memory.

Add additional checks to ensure the load addresses of unsigned components
are greater than or equal to 0x2000.
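
The two checks described so far reduce to a closed-interval overlap test and a lower bound on the load address. The sketch below restates them in isolation; the function names are hypothetical, and treating end addresses as inclusive is an assumption that matches the `<=` comparisons in the diff further down.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimum load address for unsigned components, per the commit message. */
#define DEMO_MIN_UNSIGNED_LOAD_ADDR UINT64_C(0x2000)

/* True if [a_start, a_end] and [b_start, b_end] share at least one address. */
static bool demo_ranges_overlap(uint64_t a_start, uint64_t a_end,
                                uint64_t b_start, uint64_t b_end)
{
    return a_start <= b_end && b_start <= a_end;
}

/* True if an unsigned component may be loaded at load_addr. */
static bool demo_unsigned_load_addr_ok(uint64_t load_addr)
{
    return load_addr >= DEMO_MIN_UNSIGNED_LOAD_ADDR;
}
```

Because both ends are inclusive, a component ending at 0x2fff overlaps one starting at 0x2fff but not one starting at 0x3000.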

When the secure IPL code loading attributes facility (SCLAF) is installed,
all signed components must contain a secure code loading attributes block
(SCLAB).

The SCLAB provides further validation of information on where to load the
signed binary code from the load device, and where to start the execution
of the loaded OS code.

When SCLAF is installed, its content must be evaluated during secure IPL.
However, a missing SCLAB will not be reported in audit mode. The SCLAB
checking will be skipped in this case.

Add IPL Information Error Indicators (IIEI) and Component Error
Indicators (CEI) for IPL Information Report Block (IIRB).

When SCLAF is installed, additional secure boot checks are performed
during zipl and the verification results are stored in the IIRB.

Signed-off-by: Zhuoying Cai 
---
 pc-bios/s390-ccw/bootmap.c  | 281 +++-
 pc-bios/s390-ccw/iplb.h |  43 +-
 pc-bios/s390-ccw/s390-ccw.h |   1 +
 pc-bios/s390-ccw/sclp.c |   8 +
 pc-bios/s390-ccw/sclp.h |   1 +
 5 files changed, 331 insertions(+), 3 deletions(-)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index bdbd6ccd96..4bc6311802 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -683,6 +683,207 @@ static int zipl_load_segment(ComponentEntry *entry, uint64_t address)
 return comp_len;
 }
 
+typedef struct SecureIplCompAddrRange {
+bool is_signed;
+uint64_t start_addr;
+uint64_t end_addr;
+} SecureIplCompAddrRange;
+
+static bool is_comp_overlap(SecureIplCompAddrRange *comp_addr_range, int addr_range_index,
+uint64_t start_addr, uint64_t end_addr)
+{
+/* neither a signed nor an unsigned component can overlap with a signed component */
+for (int i = 0; i < addr_range_index; i++) {
+if ((comp_addr_range[i].start_addr <= end_addr &&
+start_addr <= comp_addr_range[i].end_addr) &&
+comp_addr_range[i].is_signed) {
+return true;
+   }
+}
+
+return false;
+}
+
+static void comp_addr_range_add(SecureIplCompAddrRange *comp_addr_range,
+int addr_range_index, bool is_signed,
+uint64_t start_addr, uint64_t end_addr)
+{
+comp_addr_range[addr_range_index].is_signed = is_signed;
+comp_addr_range[addr_range_index].start_addr = start_addr;
+comp_addr_range[addr_range_index].end_addr = end_addr;
+}
+
+static void unsigned_addr_check(uint64_t load_addr, IplDeviceComponentList *comps,
+int comp_index, void (*print_func)(bool, const char *))
+{
+bool is_addr_valid;
+
+is_addr_valid = load_addr >= 0x2000;
+if (!is_addr_valid) {
+comps->device_entries[comp_index].cei |=
+S390_IPL_COMPONENT_CEI_INVALID_UNSIGNED_ADDR;
+print_func(is_addr_valid, "Load address is less than 0x2000");
+}
+}
+
+static void addr_overlap_check(SecureIplCompAddrRange *comp_addr_range,
+   int *addr_range_index,
+   uint64_t start_addr, uint64_t end_addr,
+   bool is_signed, void (*print_func)(bool, const char *))
+{
+bool overlap;
+
+overlap = is_comp_overlap(comp_addr_range, *addr_range_index,
+  start_addr, end_addr);
+if (!overlap) {
+comp_addr_range_add(comp_addr_range, *addr_range_index, is_signed,
+start_addr, end_addr);
+*addr_range_index += 1;
+} else {
+print_func(!overlap, "Component addresses overlap");
+}
+}
+
+static void valid_sclab_check(SclabOriginLocator *sclab_locator,
+ IplDeviceComponentList *comps, int comp_index,
+ void (*print_func)(bool, const char *))
+{
+bool is_magic_match;
+bool is_len_valid;
+
+/* identifies the presence of SCLAB */
+is_magic_match = magic_match(sclab_locator->magic, ZIPL_MAGIC);
+if (!is_magic_match) {
+comps->device_entries[comp_index].cei |= S390_IPL_COMPONENT_CEI_INVALID_SCLAB;
+
+/* a missing SCLAB will not be reported in audit mode */
+return;
+}
+
+is_len_valid = sclab_locator->len >= 32;
+if (!is_len_valid) {
+comps->device_entries[comp_index].cei |= S390_IPL_COMPONENT_CEI_INVALID_SCLAB_LEN;
+comps->device_entries[comp_index].cei |= S390_IPL_COMPONENT_CEI_INVALID_SCLAB;
+print_func(is_len_valid, "Invalid SCLAB length");
+}
+}
+
+static void sclab_format_check(SecureCodeLoadingAttributesBlock *sclab,
+   IplDeviceComponentList *comps, int comp_index,
+   void (*print_func)(bool, const char *))
+{
+bool valid_format;
+
+

[PATCH v3 5/5] target/hexagon: Remove unreachable

2025-04-10 Thread Brian Cain

We should raise an exception in the event that we encounter a packet
that can't be correctly decoded, not fault.
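
The idea of reporting an undecodable word to the caller instead of asserting can be sketched with a toy decoder. Everything below is hypothetical (the `Demo` names, the two "known" encodings, the exception codes) and is not Hexagon's actual decode logic; it only shows the control flow the commit message describes.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool valid;        /* false means "empty" insn: nothing was recognized */
    uint16_t opcode;
} DemoInsn;

enum { DEMO_EXCP_NONE = 0, DEMO_EXCP_INVALID_PACKET = 1 };

/*
 * Toy decoder: only encodings whose top nibble is 0x7 or 0xA are "known".
 * Unknown encodings leave the insn empty rather than aborting the process.
 * Returns the number of words consumed (always 1 in this sketch).
 */
static int demo_decode(uint32_t encoding, DemoInsn *insn)
{
    insn->valid = false;
    insn->opcode = 0;

    switch (encoding >> 28) {
    case 0x7:
        insn->valid = true;
        insn->opcode = 1;
        break;
    case 0xA:
        insn->valid = true;
        insn->opcode = 2;
        break;
    default:
        break;          /* unrecognized: fall through with an empty insn */
    }
    return 1;
}

/* Code generation step: an empty insn becomes a guest-visible exception. */
static int demo_gen(const DemoInsn *insn)
{
    return insn->valid ? DEMO_EXCP_NONE : DEMO_EXCP_INVALID_PACKET;
}
```

The design point is that a malformed guest instruction stream is guest input, so it should surface as a guest exception rather than terminate the emulator.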

Signed-off-by: Brian Cain 
---
 target/hexagon/decode.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index b5ece60450..1db7f1950f 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -489,7 +489,6 @@ decode_insns(DisasContext *ctx, Insn *insn, uint32_t encoding)
 insn->iclass = iclass_bits(encoding);
 return 1;
 }
-g_assert_not_reached();
 } else {
 uint32_t iclass = get_duplex_iclass(encoding);
 unsigned int slot0_subinsn = get_slot0_subinsn(encoding);
@@ -512,6 +511,11 @@ decode_insns(DisasContext *ctx, Insn *insn, uint32_t encoding)
 }
 g_assert_not_reached();
 }
+/*
+ * invalid/unrecognized opcode; return 1 and let gen_insn() raise an
+ * exception when it sees this empty insn.
+ */
+return 1;
 }
 
 static void decode_add_endloop_insn(Insn *insn, int loopnum)
-- 
2.34.1

[PATCH v3 4/5] target/hexagon: s/pkt_has_store/pkt_has_scalar_store

2025-04-10 Thread Brian Cain

To remove any confusion with HVX or other potential store instructions,
we'll qualify this context var with "scalar".

Signed-off-by: Brian Cain 
---
 target/hexagon/idef-parser/README.rst   | 2 +-
 target/hexagon/insn.h   | 4 ++--
 target/hexagon/macros.h | 8 
 target/hexagon/decode.c | 4 ++--
 target/hexagon/genptr.c | 3 ++-
 target/hexagon/idef-parser/parser-helpers.c | 4 ++--
 target/hexagon/op_helper.c  | 4 ++--
 target/hexagon/translate.c  | 9 +
 target/hexagon/gen_helper_funcs.py  | 2 +-
 9 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/target/hexagon/idef-parser/README.rst b/target/hexagon/idef-parser/README.rst
index 7199177ee3..235e3debee 100644
--- a/target/hexagon/idef-parser/README.rst
+++ b/target/hexagon/idef-parser/README.rst
@@ -637,7 +637,7 @@ tinycode for the Hexagon ``add`` instruction
 ::
 
 00021094
-   mov_i32 pkt_has_store_s1,$0x0
+   mov_i32 pkt_has_scalar_store_s1,$0x0
add_i32 tmp0,r2,r2
mov_i32 loc2,tmp0
mov_i32 new_r1,loc2
diff --git a/target/hexagon/insn.h b/target/hexagon/insn.h
index 24dcf7fe9f..5d59430da9 100644
--- a/target/hexagon/insn.h
+++ b/target/hexagon/insn.h
@@ -66,8 +66,8 @@ struct Packet {
 
 bool pkt_has_dczeroa;
 
-bool pkt_has_store_s0;
-bool pkt_has_store_s1;
+bool pkt_has_scalar_store_s0;
+bool pkt_has_scalar_store_s1;
 
 bool pkt_has_hvx;
 Insn *vhist_insn;
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index ee3d4c88e7..b6e5c8aae2 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -82,7 +82,7 @@
  */
 #define CHECK_NOSHUF(VA, SIZE) \
 do { \
-if (insn->slot == 0 && ctx->pkt->pkt_has_store_s1) { \
+if (insn->slot == 0 && ctx->pkt->pkt_has_scalar_store_s1) { \
 probe_noshuf_load(VA, SIZE, ctx->mem_idx); \
 process_store(ctx, 1); \
 } \
@@ -93,11 +93,11 @@
 TCGLabel *noshuf_label = gen_new_label(); \
 tcg_gen_brcondi_tl(TCG_COND_EQ, PRED, 0, noshuf_label); \
 GET_EA; \
-if (insn->slot == 0 && ctx->pkt->pkt_has_store_s1) { \
+if (insn->slot == 0 && ctx->pkt->pkt_has_scalar_store_s1) { \
 probe_noshuf_load(EA, SIZE, ctx->mem_idx); \
 } \
 gen_set_label(noshuf_label); \
-if (insn->slot == 0 && ctx->pkt->pkt_has_store_s1) { \
+if (insn->slot == 0 && ctx->pkt->pkt_has_scalar_store_s1) { \
 process_store(ctx, 1); \
 } \
 } while (0)
@@ -524,7 +524,7 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, int shift)
 
 #define fLOAD(NUM, SIZE, SIGN, EA, DST) \
 do { \
-check_noshuf(env, pkt_has_store_s1, slot, EA, SIZE, GETPC()); \
+check_noshuf(env, pkt_has_scalar_store_s1, slot, EA, SIZE, GETPC()); \
 DST = (size##SIZE##SIGN##_t)MEM_LOAD##SIZE(env, EA, GETPC()); \
 } while (0)
 #endif
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index 23deba2426..b5ece60450 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -236,9 +236,9 @@ static void decode_set_insn_attr_fields(Packet *pkt)
 if (GET_ATTRIB(opcode, A_SCALAR_STORE) &&
 !GET_ATTRIB(opcode, A_MEMSIZE_0B)) {
 if (pkt->insn[i].slot == 0) {
-pkt->pkt_has_store_s0 = true;
+pkt->pkt_has_scalar_store_s0 = true;
 } else {
-pkt->pkt_has_store_s1 = true;
+pkt->pkt_has_scalar_store_s1 = true;
 }
 }
 }
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 2c5e15cfcf..7c73772e40 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -395,7 +395,8 @@ static inline void gen_store_conditional8(DisasContext *ctx,
 #ifndef CONFIG_HEXAGON_IDEF_PARSER
 static TCGv gen_slotval(DisasContext *ctx)
 {
-int slotval = (ctx->pkt->pkt_has_store_s1 & 1) | (ctx->insn->slot << 1);
+int slotval =
+(ctx->pkt->pkt_has_scalar_store_s1 & 1) | (ctx->insn->slot << 1);
 return tcg_constant_tl(slotval);
 }
 #endif
diff --git a/target/hexagon/idef-parser/parser-helpers.c b/target/hexagon/idef-parser/parser-helpers.c
index a7dcd85fe4..3316c230f8 100644
--- a/target/hexagon/idef-parser/parser-helpers.c
+++ b/target/hexagon/idef-parser/parser-helpers.c
@@ -1725,7 +1725,7 @@ void gen_cancel(Context *c, YYLTYPE *locp)
 
 void gen_load_cancel(Context *c, YYLTYPE *locp)
 {
-OUT(c, locp, "if (insn->slot == 0 && pkt->pkt_has_store_s1) {\n");
+OUT(c, locp, "if (insn->slot == 0 && pkt->pkt_has_scalar_store_s1) {\n");
 OUT(c, locp, "ctx->s1_store_processed = false;\n");
 OUT(c, locp, "process_store(ctx, 1);\n");
 OUT(c, locp, "}\n");
@@ -1750,7 +1750,7 @@ void gen_load(Context *c, YYLTYPE *locp, HexValue *width,
 
 /* Lookup the effective address EA */

Re: Issue with stoptrigger.c Plugin in QEMU Emulation

2025-04-10 Thread Pierrick Bouvier


Hi Saanjh,

I have not been able to reproduce the issue with current master branch.
Is it an error you see for every run?

Regards,
Pierrick

On 4/10/25 04:10, Saanjh Sengupta wrote:

Hi,

I am writing to seek assistance with an issue I am experiencing while
using the stoptrigger.c plugin in QEMU emulation. I am currently
using the latest QEMU version, 9.2.92, and attempting to run
Debian 11 as the guest operating system.


The command I am using to launch QEMU is as follows:
*./build/qemu-system-x86_64 -m 2048M -smp 2 -boot c -nographic -serial mon:stdio -nic tap,ifname=tap0,script=no,downscript=no -hda debian11.qcow2 -icount shift=0 -plugin ./build/contrib/plugins/libstoptrigger.so,icount=90 -d plugin -qmp tcp:localhost:,server,wait=off*


However, when I attempt to use the -icount shift=0 option, the plugin 
fails with the error "*Basic icount read*". I have attached a screenshot 
of the error for your reference.


error.png

When I remove the -plugin argument from the command the OS boots up
perfectly, as expected. Command utilised in that context was somewhat like
*./build/qemu-system-x86_64 -m 2048M -smp 2 -boot c -nographic -serial mon:stdio -nic tap,ifname=tap0,script=no,downscript=no -hda debian11.qcow2 -icount shift=0 -qmp tcp:localhost:,server,wait=off*



I would greatly appreciate it if you could provide guidance on resolving 
this issue. Specifically, I would like to know the cause of the error 
and any potential solutions or workarounds that could be implemented to 
successfully use the stoptrigger.c plugin with the -icount shift=0 option.



Regards

Saanjh Sengupta

Re: [PATCH] vfio/spapr: Fix L2 crash with PCI device passthrough with L2 guest memory > 128G

2025-04-10 Thread Amit Machhiwal

Hi Cédric,

Thanks for looking into this patch. Please find my response inline:

On 2025/04/04 01:29 PM, Cédric Le Goater wrote:
> On 4/4/25 11:17, Amit Machhiwal wrote:
> > An L2 KVM guest fails to boot inside a pSeries LPAR when booted with a
> > memory more than 128 GB and PCI device passthrough. The L2 guest also
> > crashes when it is booted with a memory greater than 128 GB and a PCI
> > device is hotplugged later.
> > 
> > The issue arises from a conditional check for `levels > 1` in
> > `spapr_tce_create_table()` within L1 KVM. This check is meant to prevent
> > multi-level TCEs, which are not supported by the PowerVM hypervisor. As
> > a result, when QEMU makes a `VFIO_IOMMU_SPAPR_TCE_CREATE` ioctl call
> > with `levels > 1`, it triggers the conditional check and returns
> > `EINVAL`, causing the guest to crash with the following errors:
> > 
> >   2025-03-04T06:36:36.133117Z qemu-system-ppc64: Failed to create a window, ret = -1 (Invalid argument)
> >   2025-03-04T06:36:36.133176Z qemu-system-ppc64: Failed to create SPAPR window: Invalid argument
> >   qemu: hardware error: vfio: DMA mapping failed, unable to continue
> > 
> > Fix this by checking the supported DDW "levels" returned by the
> > VFIO_IOMMU_SPAPR_TCE_GET_INFO ioctl before attempting the TCE create
> > ioctl in KVM.
> > 
> > The patch has been tested on KVM guests with memory configurations of up
> > to 390GB, and 450GB on PowerVM and bare-metal environments respectively.
> > 
> > Signed-off-by: Amit Machhiwal 
> > ---
> >   hw/vfio/spapr.c | 35 ++-
> >   1 file changed, 26 insertions(+), 9 deletions(-)
> > 
> > diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c
> > index 1a5d1611f2cd..07498218fea9 100644
> > --- a/hw/vfio/spapr.c
> > +++ b/hw/vfio/spapr.c
> > @@ -26,6 +26,7 @@ typedef struct VFIOSpaprContainer {
> >   VFIOContainer container;
> >   MemoryListener prereg_listener;
> >   QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
> > +unsigned int levels;
> >   } VFIOSpaprContainer;
> >   OBJECT_DECLARE_SIMPLE_TYPE(VFIOSpaprContainer, VFIO_IOMMU_SPAPR);
> > @@ -236,9 +237,11 @@ static int vfio_spapr_create_window(VFIOContainer 
> > *container,
> >   {
> >   int ret = 0;
> >   VFIOContainerBase *bcontainer = &container->bcontainer;
> > +VFIOSpaprContainer *scontainer = container_of(container, 
> > VFIOSpaprContainer,
> > +  container);
> >   IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> >   uint64_t pagesize = memory_region_iommu_get_min_page_size(iommu_mr), 
> > pgmask;
> > -unsigned entries, bits_total, bits_per_level, max_levels;
> > +unsigned entries, bits_total, bits_per_level, max_levels, ddw_levels;
> >   struct vfio_iommu_spapr_tce_create create = { .argsz = sizeof(create) 
> > };
> >   long rampagesize = qemu_minrampagesize();
> > @@ -291,16 +294,28 @@ static int vfio_spapr_create_window(VFIOContainer 
> > *container,
> >*/
> >   bits_per_level = ctz64(qemu_real_host_page_size()) + 8;
> >   create.levels = bits_total / bits_per_level;
> > -if (bits_total % bits_per_level) {
> > -++create.levels;
> > -}
> > -max_levels = (64 - create.page_shift) / 
> > ctz64(qemu_real_host_page_size());
> > -for ( ; create.levels <= max_levels; ++create.levels) {
> > -ret = ioctl(container->fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
> > -if (!ret) {
> > -break;
> > +
> > +ddw_levels = scontainer->levels;
> > +if (ddw_levels > 1) {
> > +if (bits_total % bits_per_level) {
> > +++create.levels;
> >   }
> > +max_levels = (64 - create.page_shift) / 
> > ctz64(qemu_real_host_page_size());
> > +for ( ; create.levels <= max_levels; ++create.levels) {
> > +ret = ioctl(container->fd, VFIO_IOMMU_SPAPR_TCE_CREATE, 
> > &create);
> > +if (!ret) {
> > +break;
> > +}
> > +}
> > +} else { /* ddw_levels == 1 */
> > +if (create.levels > ddw_levels) {
> > +error_report("Host doesn't support multi-level TCE tables. "
> > + "Use larger IO page size. Supported mask is 0x%lx",
> > + bcontainer->pgsizes);
> 
> While at it, please modify vfio_spapr_create_window(), add an 'Error **'
> parameter to report errors to the caller with error_setg(errp ...)

Sure, I'll include the suggested changes and send a v2 soon.

Thanks,
Amit

> 
> Thanks,
> 
> C.
> 
> 
> 
> 
> > +}
> > +ret = ioctl(container->fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
> >   }
> > +
> >   if (ret) {
> >   error_report("Failed to create a window, ret = %d (%m)", ret);
> >   return -errno;
> > @@ -502,6 +517,8 @@ static bool 
> > vfio_spapr_container_setup(VFIOContainerBase *bcontainer,
> >   goto listener_unregister_exit;
> >   }
> > +scontai
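
The level computation in the quoted hunk can be restated as a small standalone sketch. It is illustrative only: the function names are hypothetical, `bits_total` is taken as an input because its derivation is not part of the quoted hunk, and `host_page_shift` stands for `ctz64(qemu_real_host_page_size())`.

```c
#include <stdbool.h>

/*
 * Number of TCE table levels needed to index bits_total bits when each
 * level covers (host_page_shift + 8) bits, rounded up.
 */
static unsigned demo_tce_levels(unsigned bits_total, unsigned host_page_shift)
{
    unsigned bits_per_level = host_page_shift + 8;

    return (bits_total + bits_per_level - 1) / bits_per_level;
}

/*
 * Mirrors the branch structure of the patch: a host reporting more than one
 * DDW level lets QEMU keep trying larger level counts, while a single-level
 * host (e.g. under PowerVM) can only accept a window that fits in one level.
 */
static bool demo_window_allowed(unsigned needed_levels, unsigned ddw_levels)
{
    if (ddw_levels > 1) {
        return true;
    }
    return needed_levels <= ddw_levels;
}
```

With 64 KiB host pages (shift 16) each level covers 24 bits, so a window needing 25 index bits requires two levels and would be refused on a single-level host unless a larger IO page size shrinks `bits_total`.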

Re: [RFC PATCH-for-8.0 09/10] hw/virtio: Extract vhost_user_ram_slots_max() to vhost-user-target.c

2025-04-10 Thread Philippe Mathieu-Daudé


On 10/4/25 16:36, Pierrick Bouvier wrote:

On 4/10/25 05:14, Philippe Mathieu-Daudé wrote:

Hi Pierrick,

On 13/12/22 00:05, Philippe Mathieu-Daudé wrote:

The current definition of VHOST_USER_MAX_RAM_SLOTS is
target specific. By converting this definition to a runtime
vhost_user_ram_slots_max() helper declared in a target
specific unit, we can have the rest of vhost-user.c target
independent.

To avoid variable length array or using the heap to store
arrays of vhost_user_ram_slots_max() elements, we simply
declare an array of the biggest VHOST_USER_MAX_RAM_SLOTS,
and each target uses up to vhost_user_ram_slots_max()
elements of it. Ensure arrays are big enough by adding an
assertion in vhost_user_init().

Signed-off-by: Philippe Mathieu-Daudé 
---
RFC: Should I add VHOST_USER_MAX_RAM_SLOTS to vhost-user.h
   or create an internal header for it?
---
   hw/virtio/meson.build  |  1 +
   hw/virtio/vhost-user-target.c  | 29 +
   hw/virtio/vhost-user.c | 26 +-
   include/hw/virtio/vhost-user.h |  7 +++
   4 files changed, 42 insertions(+), 21 deletions(-)
   create mode 100644 hw/virtio/vhost-user-target.c

diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index eb7ee8ea92..bf7e35fa8a 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -11,6 +11,7 @@ if have_vhost
 specific_virtio_ss.add(files('vhost.c', 'vhost-backend.c', 'vhost-iova-tree.c'))

 if have_vhost_user
   specific_virtio_ss.add(files('vhost-user.c'))
+    specific_virtio_ss.add(files('vhost-user-target.c'))
 endif
 if have_vhost_vdpa
   specific_virtio_ss.add(files('vhost-vdpa.c', 'vhost-shadow-virtqueue.c'))
diff --git a/hw/virtio/vhost-user-target.c b/hw/virtio/vhost-user-target.c

new file mode 100644
index 00..6a0d0f53d0
--- /dev/null
+++ b/hw/virtio/vhost-user-target.c
@@ -0,0 +1,29 @@
+/*
+ * vhost-user target-specific helpers
+ *
+ * Copyright (c) 2013 Virtual Open Systems Sarl.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "hw/virtio/vhost-user.h"
+
+#if defined(TARGET_X86) || defined(TARGET_X86_64) || \
+    defined(TARGET_ARM) || defined(TARGET_ARM_64)
+#include "hw/acpi/acpi.h"
+#elif defined(TARGET_PPC) || defined(TARGET_PPC64)
+#include "hw/ppc/spapr.h"
+#endif
+
+unsigned int vhost_user_ram_slots_max(void)
+{
+#if defined(TARGET_X86) || defined(TARGET_X86_64) || \
+    defined(TARGET_ARM) || defined(TARGET_ARM_64)
+    return ACPI_MAX_RAM_SLOTS;
+#elif defined(TARGET_PPC) || defined(TARGET_PPC64)
+    return SPAPR_MAX_RAM_SLOTS;
+#else
+    return 512;


Should vhost_user_ram_slots_max be another TargetInfo field?



I don't think so, it would be better to transform the existing function
into something like:


switch (target_current()) {
case TARGET_X86:
case TARGET_ARM:
case TARGET_X86_64:
case TARGET_ARM_64:
 return ACPI_MAX_RAM_SLOTS;
case TARGET_PPC:
case TARGET_PPC64:
 return SPAPR_MAX_RAM_SLOTS;
default:
 return 512;
}


Clever, I like it, thanks!

Re: [PATCH for-10.0] scsi-disk: Apply error policy for host_status errors again

2025-04-10 Thread Michael Tokarev


10.04.2025 16:14, Kevin Wolf wrote:

Am 10.04.2025 um 14:37 hat Michael Tokarev geschrieben:

... Does it make sense to apply this one for older stable qemu series?

In particular, in 8.2, we lack cfe0880835cd3
"scsi-disk: Use positive return value for status in dma_readv/writev",
which seems to be relevant here.  Or should I pick up cfe0880835cd3 too,
maybe together with 8a0495624f (a no-op, just to make this patch apply
cleanly) and probably 9da6bd39f924?


Yes, I think it makes sense to pick all of them up (and 622a7016 in the
middle, too), they were part of one series:

https://patchew.org/QEMU/20240731123207.27636-1-kw...@redhat.com/

And this patch builds on top of that series, so rebasing it correctly
might not be trivial without the previous series.


A (most likely small) issue here: 622a70161a "scsi-block: Don't skip
callback for sgio error status/driver_status" is on top of an earlier
commit, 1404226804 "scsi: don't lock AioContext in I/O code path",
but does not actually *require* it, since it removes whole code block
where a locking has been removed earlier by 1404226804.

Also the next comment commit, 8a0495624f "scsi-disk: Add warning comments
that host_status errors take a shortcut", clashes with e7fc3c4a8cc "scsi:
remove outdated AioContext lock comment".

This seems a bit too fragile for 8.2, don't you think?  And I haven't even
tried to check 7.2 yet :)

https://gitlab.com/mjt0k/qemu/-/tree/staging-8.2 for the current result
(not yet tested) - I dislike my comment handling at scsi_dma_complete().

Thanks,

/mjt

RE: [PATCH v1 0/1] hw/misc/aspeed_sbc: Implement OTP memory and controller

2025-04-10 Thread Kane Chen

Hi Cédric,

Thank you for your comments and suggestions. I’ll start by reviewing
the changes in commit ebc29e1beab0. At the moment, I don’t have
another example of a device that is accessed through an external bus,
but I’ll look into it and see what I can find.

Best Regards,
Kane
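
For readers following the quoted discussion, the "each bit can be programmed only once" behaviour can be sketched as below. This is an illustration only, not the proposed aspeed_sbc code: every `sketch_`/`Sketch` name is hypothetical, the array size is arbitrary, and the sketch assumes a blank bit reads as 0 and programming sets it to 1. The quoted patch uses different defaults for odd and even words, so the real polarity may differ per word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_OTP_WORDS 16   /* hypothetical size, chosen for the example */

typedef struct {
    uint32_t words[SKETCH_OTP_WORDS];
} SketchOtp;

/* A blank device: every bit unprogrammed (0 in this sketch). */
static void sketch_otp_reset(SketchOtp *otp)
{
    for (size_t i = 0; i < SKETCH_OTP_WORDS; i++) {
        otp->words[i] = 0;
    }
}

/*
 * Program one word. `value` is the desired full content of the word.
 * Only 0 -> 1 transitions are possible; if `value` would require
 * clearing an already programmed bit, the word is left untouched and
 * false is returned.
 */
static bool sketch_otp_program(SketchOtp *otp, size_t idx, uint32_t value)
{
    assert(idx < SKETCH_OTP_WORDS);

    uint32_t cur = otp->words[idx];
    if ((cur & ~value) != 0) {
        return false;           /* would un-program a bit */
    }
    otp->words[idx] = cur | value;
    return true;
}
```

A file-backed model, as discussed in the quoted thread, would load `words` from the backing image at startup instead of calling `sketch_otp_reset()`, and write the word back after each successful program operation.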

> -----Original Message-----
> From: Cédric Le Goater 
> Sent: Monday, April 7, 2025 5:55 PM
> To: Kane Chen ; Philippe Mathieu-Daudé
> ; Peter Maydell ; Steven Lee
> ; Troy Lee ; Jamin Lin
> ; Andrew Jeffery
> ; Joel Stanley ; open
> list:ASPEED BMCs ; open list:All patches CC here
> ; qemu-block ; Markus
> Armbruster 
> Cc: Troy Lee 
> Subject: Re: [PATCH v1 0/1] hw/misc/aspeed_sbc: Implement OTP memory and
> controller
> 
> Hello Kane,
> 
> + Markus (for ebc29e1beab0 implementation)
> 
> On 4/7/25 09:33, Kane Chen wrote:
> > Hi Cédric/Philippe,
> >
> > OTP (One-Time Programmable) memory is a type of non-volatile memory in
> > which each bit can be programmed only once. It is typically used to
> > store critical and permanent information, such as the chip ID and
> > secure boot keys. The structure and behavior of OTP memory are
> > consistent across both the AST1030 and AST2600 platforms.
> >
> > As Philippe pointed out, this proposal models the OTP memory as a
> > flash device and utilizes a block backend for persistent storage. In
> > contrast, existing implementations such as NPCM7xxOTPState,
> > BCM2835OTPState, and SiFiveUOTPState expose OTP memory via MMIO and
> > always initialize it in a blank state.
> 
> AFAIU, Aspeed SBC is also MMIO based or is there another device, an eeprom,
> accessible through an external bus ? How is it implemented in HW ?
> 
> > The goal of this design is to
> > allow the guest system to boot with a pre-configured OTP memory state.
> 
> Yes. This is a valid request. It's not the first time we've had this kind of
> request.
> The initial contents of EEPROM devices are an example, and some machines,
> like the rainier, have a lot.
> 
> If the device can be defined on the command line, as is the case for an EEPROM
> device attached to an I2C bus or a flash device attached to a SPI bus, we can
> use a 'drive' property. Something like:
> 
>qemu-system-arm -M ast2600-evb \
>-blockdev node-name=fmc0,driver=file,filename=/path/to/fmc0.img \
>-device mx66u51235f,bus=ssi.0,cs=0x0,drive=fmc0 \
>-blockdev node-name=fmc1,driver=file,filename=/path/to/fmc1.img \
>-device mx66u51235f,bus=ssi.0,cs=0x1,drive=fmc1 \
>-blockdev node-name=spi1,driver=file,filename=/path/to/spi1.img \
>-device mx66u51235f,cs=0x0,bus=ssi.1,drive=spi1 \
>...
> 
> However, the Aspeed SBC device is a platform device, which makes things more
> complex: it cannot be created on the command line, it is created directly by
> the machine and the SoC, and passing device properties to specify a blockdev
> is not possible:
> 
>qemu-system-arm -M ast2600-evb \
>-blockdev node-name=otpmem,driver=file,filename=/path/to/otpmem.img \
>-device aspeed-sbc,drive=otpmem \
>...
> 
> 
> > To support this, the OTP memory is backed by a file, simulating
> > persistent flash behavior.
> 
> The idea is good but the implementation is problematic.
> 
>  +static BlockBackend *init_otpmem(int64_t size_bytes)
>  +{
>  +Error *local_err = NULL;
>  +BlockDriverState *bs = NULL;
>  +BlockBackend *blk = NULL;
>  +bool image_created = false;
>  +QDict *options;
>  +uint32_t i, odd_def = 0x, even_def = 0, *def;
>  +
>  +if (!g_file_test(OTP_FILE_PATH, G_FILE_TEST_EXISTS)) {
>  +bdrv_img_create(OTP_FILE_PATH, "raw", NULL, NULL,
>  +NULL, size_bytes, 0, true, &local_err);
>  +if (local_err) {
>  +qemu_log_mask(LOG_GUEST_ERROR,
>  +  "%s: Failed to create image %s: %s\n",
>  +  __func__, OTP_FILE_PATH,
>  +  error_get_pretty(local_err));
>  +error_free(local_err);
>  +return NULL;
>  +}
>  +image_created = true;
>  +}
>  +
>  +blk = blk_new(qemu_get_aio_context(),
>  +  BLK_PERM_CONSISTENT_READ |
> BLK_PERM_WRITE,
>  +  0);
>  +if (!blk) {
>  +qemu_log_mask(LOG_GUEST_ERROR,
>  +  "%s: Failed to create BlockBackend\n",
>  +  __func__);
>  +return NULL;
>  +}
>  +
>  +options =  qdict_new();
>  +qdict_put_str(options, "driver", "raw");
>  +bs = bdrv_open(OTP_FILE_PATH, NULL, options, BDRV_O_RDWR,
> &local_err);
>  +if (local_err) {
>  +qemu_log_mask(LOG_GUEST_ERROR,
>  +  "%s: Failed to create OTP memory, err =
> %s\n",
>  +  __func__, error_get_pretty(local_err));
>

Re: [PATCH 00/16] Add Multi-Core Debug (MCD) API support

2025-04-10 Thread Markus Armbruster

Alex Bennée  writes:

> Markus Armbruster  writes:
>
>> Mario Fleischmann  writes:
>>
>>> Apologies for the line wrapping in yesterday's answer. Should be fixed now.
>>>
>>> On 08.04.2025 09:00, Markus Armbruster wrote:

[...]

 What about providing the MCD interface as a separate QMP-like protocol?
 It gets its own QAPI schema, just like for qemu-ga.  Simplifies
 compiling it out when not needed.

 It gets its own socket, just like the GDB stub.  Might reduce
 interference between debugging and QMP.

 Thoughts?  Alex, Philippe, care to chime in?
>>>
>>> Sounds reasonable to me. Keeping in mind the size of generated QAPI code,
>>> an option to `./configure [...] --enable-mcd` is definitely advisable.
>>
>> Alex, Philippe?
>
> When I spoke to Mario at DVCon last year I liked the idea of re-using
> QMP instead of inventing yet another RPC interface for QEMU. QMP
> certainly has nicer properties than the gdbstub which has a very
> "organic" and "serial" feel to it.
>
> Are you suggesting we re-use the machinery but use an entirely separate
> socket with just the MCD namespace in it? I don't see that being a
> problem as long as we can test it properly in the CI.

Yes.

"Keep them separate" is only a gut feeling, though.  While I pay
attention to my gut feelings, I know they can be wrong.  I am soliciting
opinions.

Re: [RFC v2 1/5] qapi/qom: Introduce kvm-pmu-filter object

2025-04-10 Thread Markus Armbruster

Zhao Liu  writes:

> Hi Markus,
>
> I'm really sorry I completely missed your reply (and your patient
> advice). It wasn't until I looked back at the lore archives that I
> realized my mistake. Thinking it over again, I see that your reply,
> which I missed, really helped clear up my confusion:

I'm glad I was able to help some!

> On Fri, Feb 07, 2025 at 02:02:44PM +0100, Markus Armbruster wrote:
>> Date: Fri, 07 Feb 2025 14:02:44 +0100
>> From: Markus Armbruster 
>> Subject: Re: [RFC v2 1/5] qapi/qom: Introduce kvm-pmu-filter object
>> 
>> Zhao Liu  writes:
>> 
>> >> Let's ignore how to place it for now, and focus on where we would *like*
>> >> to place it.
>> >> 
>> >> Is it related to anything other than ObjectType / ObjectOptions in the
>> >> QMP reference manual?
>> >
>> > Yes!
>> 
>> Now I'm confused :)
>> 
>> It is related to ObjectType / ObjectOptions.
>> 
>> Is it related to anything else in the QMP reference manual, and if yes,
>> to what exactly is it related?
>
> I misunderstood your point. The PMU stuff and the QAPI definitions for
> ObjectType/ObjectOptions are not related. They should belong to separate
> categories or sections.
>
>> >> I guess qapi/kvm.json is for KVM-specific stuff in general, not just the
>> >> KVM PMU filter.  Should we have a section for accelerator-specific
>> >> stuff, with subsections for the various accelerators?
>> >> 
>> >> [...]
>> >
>> > If we consider the accelerator from a top-down perspective, I understand
>> > that we need to add accelerator.json, kvm.json, and kvm-pmu-filter.json.
>> >
>> > The first two files are just to include subsections without any additional
>> > content. Is this overkill? Could we just add a single kvm-pmu-filter.json
>> > (I also considered this name, thinking that kvm might need to add more
>> > things in the future)?
>> >
>> > Of course, I lack experience with the file organization here. If you think
>> > the three-level sections (accelerator.json, kvm.json, and 
>> > kvm-pmu-filter.json)
>> > is necessary, I am happy to try this way. :-)
>> 
>> We don't have to create files just to get a desired section structure.
>> 
>> I'll show you how in a jiffie, but before I do that, let me stress: we
>> should figure out what we want *first*, and only then how to get it.
>> So, what section structure would make the most sense for the QMP
>> reference manual?
>
> Thank you for your patience. I have revisited and carefully considered
> the "QEMU QMP Reference Manual," especially from a reader's perspective.
> Indeed, I agree that, as you mentioned, a three-level directory
> (accelerator - kvm - kvm stuff) is more readable and easier to maintain.

Sounds good to me.

> For this question "what we want *first*, and only then how to get it", I
> think my thought is:
>
> First, the structure should be considered, and then the specific content
> can be added. Once the structure is clearly defined, categorizing items
> into their appropriate places becomes a natural process...
>
> Then for this question "what section structure would make the most sense
> for the QMP reference manual?", I understand that a top-down, clearly
> defined hierarchical directory makes the most sense, allowing readers to
> follow the structure to find what they want. Directly adding
> kvm-pmu-filter.json or kvm.json would disrupt the entire structure, because
> KVM is just one of the accelerators supported by QEMU. Using "accelerator"
> as the entry point for the documentation, similar to the "accel" directory
> in QEMU's source code, would make indexing more convenient.

I think so, too.

>> A few hints on how...
>> 
>> Consider how qapi/block.json includes qapi/block-core.json:
>> 
>> ##
>> # = Block devices
>> ##
>> 
>> { 'include': 'block-core.json' }
>> 
>> ##
>> # == Additional block stuff (VM related)
>> ##
>> 
>> block-core.json starts with
>> 
>> ##
>> # == Block core (VM unrelated)
>> ##
>> 
>> Together, this produces this section structure
>> 
>> = Block devices
>> == Block core (VM unrelated)
>> == Additional block stuff (VM related)
>> 
>> Note that qapi/block-core.json isn't included anywhere else.
>> qapi/qapi-schema.json advises:
>> 
>> # Documentation generated with qapi-gen.py is in source order, with
>> # included sub-schemas inserted at the first include directive
>> # (subsequent include directives have no effect).  To get a sane and
>> # stable order, it's best to include each sub-schema just once, or
>> # include it first right here.
>
> Thank you very much!!
>
> Based on your inspiration, I think the ideal section structure for my
> issue could be:
>
> = accelerator
> == KVM
> === PMU
>
> Firstly, I should have a new accelerator.json () to include KVM stuff:
>
> ##
> # = Accelerator
> ##
>
> { 'include': 'kvm.json' }
>
> Next, in kvm.json, I could organi

Re: [PATCH v3 1/2] vfio/spapr: Enhance error handling in vfio_spapr_create_window()

2025-04-10 Thread Cédric Le Goater


On 4/8/25 14:40, Amit Machhiwal wrote:

Introduce an Error ** parameter to vfio_spapr_create_window() to enable
structured error reporting. This allows the function to propagate
detailed errors back to callers.

Suggested-by: Cédric Le Goater 
Signed-off-by: Amit Machhiwal 
---
  hw/vfio/spapr.c | 33 -
  1 file changed, 16 insertions(+), 17 deletions(-)

diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c
index 1a5d1611f2cd..dd9207679dbe 100644
--- a/hw/vfio/spapr.c
+++ b/hw/vfio/spapr.c
@@ -230,9 +230,9 @@ static int vfio_spapr_remove_window(VFIOContainer 
*container,
  return 0;
  }
  
-static int vfio_spapr_create_window(VFIOContainer *container,

+static bool vfio_spapr_create_window(VFIOContainer *container,
  MemoryRegionSection *section,
-hwaddr *pgsize)
+hwaddr *pgsize, Error **errp)
  {
  int ret = 0;
  VFIOContainerBase *bcontainer = &container->bcontainer;
@@ -252,11 +252,11 @@ static int vfio_spapr_create_window(VFIOContainer 
*container,
  pgmask = bcontainer->pgsizes & (pagesize | (pagesize - 1));
  pagesize = pgmask ? (1ULL << (63 - clz64(pgmask))) : 0;
  if (!pagesize) {
-error_report("Host doesn't support page size 0x%"PRIx64
- ", the supported mask is 0x%lx",
- memory_region_iommu_get_min_page_size(iommu_mr),
- bcontainer->pgsizes);
-return -EINVAL;
+error_setg_errno(errp, EINVAL, "Host doesn't support page size 
0x%"PRIx64
+ ", the supported mask is 0x%lx",
+ memory_region_iommu_get_min_page_size(iommu_mr),
+ bcontainer->pgsizes);
+return false;
  }
  
  /*

@@ -302,17 +302,17 @@ static int vfio_spapr_create_window(VFIOContainer 
*container,
  }
  }
  if (ret) {
-error_report("Failed to create a window, ret = %d (%m)", ret);
-return -errno;
+error_setg_errno(errp, errno, "Failed to create a window, ret = %d", 
ret);
+return false;
  }
  
  if (create.start_addr != section->offset_within_address_space) {

  vfio_spapr_remove_window(container, create.start_addr);
  
-error_report("Host doesn't support DMA window at %"HWADDR_PRIx", must be %"PRIx64,

- section->offset_within_address_space,
- (uint64_t)create.start_addr);
-return -EINVAL;
+error_setg_errno(errp, EINVAL, "Host doesn't support DMA window at 
%"HWADDR_PRIx
+ ", must be %"PRIx64, 
section->offset_within_address_space,
+ (uint64_t)create.start_addr);
+return false;
  }
  trace_vfio_spapr_create_window(create.page_shift,
 create.levels,
@@ -320,7 +320,7 @@ static int vfio_spapr_create_window(VFIOContainer 
*container,
 create.start_addr);
  *pgsize = pagesize;
  
-return 0;

+return true;
  }
  
  static bool

@@ -377,9 +377,8 @@ vfio_spapr_container_add_section_window(VFIOContainerBase 
*bcontainer,
  }
  }
  
-ret = vfio_spapr_create_window(container, section, &pgsize);

-if (ret) {
-error_setg_errno(errp, -ret, "Failed to create SPAPR window");
+ret = vfio_spapr_create_window(container, section, &pgsize, errp);
+if (!ret) {
  return false;
  }
  


ret is not needed. Minor.
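
The convention the patch moves to (return `bool`, report details through an `Error **` argument) can be sketched in isolation as follows. This is a self-contained stand-in, not QEMU's Error API: `SketchError`, `sketch_error_set()` and the two window functions are hypothetical names invented for the example. It also shows the call site without the intermediate `ret` variable.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for an error object; it carries only a message. */
typedef struct SketchError {
    char msg[128];
} SketchError;

/* Fill *errp with "what: strerror(os_errno)"; a NULL errp means "don't care". */
static void sketch_error_set(SketchError **errp, int os_errno, const char *what)
{
    if (!errp) {
        return;
    }
    assert(*errp == NULL);      /* an error must not be overwritten */

    SketchError *err = malloc(sizeof(*err));
    if (!err) {
        abort();
    }
    snprintf(err->msg, sizeof(err->msg), "%s: %s", what, strerror(os_errno));
    *errp = err;
}

/* Callee: returns true on success, false with *errp set on failure. */
static bool sketch_create_window(unsigned long long supported_mask,
                                 SketchError **errp)
{
    if (supported_mask == 0) {
        sketch_error_set(errp, EINVAL,
                         "Host doesn't support requested page size");
        return false;
    }
    return true;
}

/* Caller: the error is propagated as-is, so no local 'ret' is required. */
static bool sketch_add_section_window(unsigned long long supported_mask,
                                      SketchError **errp)
{
    if (!sketch_create_window(supported_mask, errp)) {
        return false;
    }
    return true;
}
```

The point of the pattern is that the detailed message is produced once, where the failure is detected, and the callers only forward the pointer.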


Reviewed-by: Cédric Le Goater 

Thanks,

C.
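
As a side note on the page size selection visible in the quoted hunk (`pgmask = pgsizes & (pagesize | (pagesize - 1))`, then the highest set bit of `pgmask`), here is a small stand-alone sketch of the same arithmetic. The function names are hypothetical, and a portable loop is used in place of `clz64()`; it assumes the requested size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit; mask must be non-zero. */
static unsigned sketch_highest_bit(uint64_t mask)
{
    unsigned bit = 0;

    assert(mask != 0);
    while (mask >>= 1) {
        bit++;
    }
    return bit;
}

/*
 * Pick the largest supported page size that does not exceed 'wanted'.
 * 'supported' is a bitmask in which bit N set means "2^N bytes supported".
 * Returns 0 when nothing suitable is supported.
 */
static uint64_t sketch_pick_pagesize(uint64_t supported, uint64_t wanted)
{
    /* 'wanted' must be a power of two for the mask trick below to work. */
    assert(wanted != 0 && (wanted & (wanted - 1)) == 0);

    /* wanted | (wanted - 1) keeps every bit at or below the wanted size. */
    uint64_t candidates = supported & (wanted | (wanted - 1));

    return candidates ? (UINT64_C(1) << sketch_highest_bit(candidates)) : 0;
}
```

For example, with 4 KiB, 64 KiB and 16 MiB supported, a 2 MiB request falls back to 64 KiB, while a 2 KiB request yields 0, which is the case the patch reports as an error.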

[PATCH v1 21/24] hw/s390x/ipl: Set IPIB flags for secure IPL

2025-04-10 Thread Zhuoying Cai

If `-secure-boot on` is specified on the command line, indicating that
true secure IPL is enabled, set the Secure-IPL and IPL-Information-Report
bits in the IPIB Flags field and trigger true secure IPL in the S390 BIOS.

Any error that occurs during true secure IPL will cause the IPL to
terminate.

Signed-off-by: Zhuoying Cai 
---
 hw/s390x/ipl.c | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index b646fcc74e..60bafcbd2e 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -438,6 +438,15 @@ static bool s390_has_certificate(void)
 return ipl->cert_store.count > 0;
 }
 
+static bool s390_secure_boot_enabled(void)
+{
+QemuOpts *opts;
+
+opts = qemu_find_opts_singleton("secure-boot");
+
+return qemu_opt_get_bool(opts, "secure-boot", false);
+}
+
 static bool s390_build_iplb(DeviceState *dev_st, IplParameterBlock *iplb)
 {
 CcwDevice *ccw_dev = NULL;
@@ -495,6 +504,17 @@ static bool s390_build_iplb(DeviceState *dev_st, 
IplParameterBlock *iplb)
 s390_ipl_convert_loadparm((char *)lp, iplb->loadparm);
 iplb->flags |= DIAG308_FLAGS_LP_VALID;
 
+/*
+ * If -secure-boot on, then toggle the secure IPL flags to trigger
+ * secure boot in the s390 BIOS.
+ *
+ * Boot process will terminate if any error occurs during secure boot.
+ *
+ * If SIPL is on, IPLIR must also be on.
+ */
+if (s390_secure_boot_enabled()) {
+iplb->hdr_flags |= (DIAG308_IPIB_FLAGS_SIPL | 
DIAG308_IPIB_FLAGS_IPLIR);
+}
 /*
  * Secure boot in audit mode will perform
  * if certificate(s) exist in the key store.
@@ -504,7 +524,7 @@ static bool s390_build_iplb(DeviceState *dev_st, 
IplParameterBlock *iplb)
  *
  * Results of secure boot will be stored in IIRB.
  */
-if (s390_has_certificate()) {
+else if (s390_has_certificate()) {
 iplb->hdr_flags |= DIAG308_IPIB_FLAGS_IPLIR;
 }
 
-- 
2.49.0
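
The flag selection in the patch above reduces to a small decision function, sketched here outside of QEMU. The bit values are placeholders (the real `DIAG308_IPIB_FLAGS_*` constants are not shown in this patch and may differ), and `sketch_ipib_flags()` is a hypothetical name used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values; the real constants may differ. */
#define SKETCH_IPIB_FLAGS_SIPL   0x40u
#define SKETCH_IPIB_FLAGS_IPLIR  0x80u

/*
 * secure_boot_on:  -secure-boot on was given (true secure IPL)
 * has_certificate: at least one certificate is in the key store
 */
static uint8_t sketch_ipib_flags(bool secure_boot_on, bool has_certificate)
{
    uint8_t flags = 0;

    if (secure_boot_on) {
        /* True secure IPL: any error terminates the boot. */
        flags |= SKETCH_IPIB_FLAGS_SIPL | SKETCH_IPIB_FLAGS_IPLIR;
    } else if (has_certificate) {
        /* Audit mode: results are only reported. */
        flags |= SKETCH_IPIB_FLAGS_IPLIR;
    }

    /* Invariant stated in the patch comment: SIPL on implies IPLIR on. */
    assert(!(flags & SKETCH_IPIB_FLAGS_SIPL) ||
           (flags & SKETCH_IPIB_FLAGS_IPLIR));
    return flags;
}
```

Writing it this way makes the precedence explicit: `-secure-boot on` wins over the certificate check, and with neither condition met no flag is set.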

[PATCH for-10.1 v5 02/13] arm/cpu: Store aa64isar0/aa64zfr0 into the idregs arrays

2025-04-10 Thread Cornelia Huck

From: Eric Auger 

Also add kvm accessors for storing host features into idregs.

Reviewed-by: Richard Henderson 
Reviewed-by: Sebastian Ott 
Signed-off-by: Eric Auger 
Signed-off-by: Cornelia Huck 
---
 target/arm/cpu-features.h | 57 ---
 target/arm/cpu-sysregs.h  |  4 +++
 target/arm/cpu.c  | 10 +++
 target/arm/cpu.h  |  2 --
 target/arm/cpu64.c|  8 +++---
 target/arm/helper.c   |  6 +++--
 target/arm/hvf/hvf.c  |  3 ++-
 target/arm/kvm.c  | 30 ++---
 target/arm/tcg/cpu64.c| 44 ++
 9 files changed, 101 insertions(+), 63 deletions(-)

diff --git a/target/arm/cpu-features.h b/target/arm/cpu-features.h
index 525e4cee12f6..779bcd1abb36 100644
--- a/target/arm/cpu-features.h
+++ b/target/arm/cpu-features.h
@@ -22,6 +22,7 @@
 
 #include "hw/registerfields.h"
 #include "qemu/host-utils.h"
+#include "cpu-sysregs.h"
 
 /*
  * Naming convention for isar_feature functions:
@@ -376,92 +377,92 @@ static inline bool isar_feature_aa32_doublelock(const 
ARMISARegisters *id)
  */
 static inline bool isar_feature_aa64_aes(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, AES) != 0;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, AES) != 0;
 }
 
 static inline bool isar_feature_aa64_pmull(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, AES) > 1;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, AES) > 1;
 }
 
 static inline bool isar_feature_aa64_sha1(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA1) != 0;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, SHA1) != 0;
 }
 
 static inline bool isar_feature_aa64_sha256(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA2) != 0;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, SHA2) != 0;
 }
 
 static inline bool isar_feature_aa64_sha512(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA2) > 1;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, SHA2) > 1;
 }
 
 static inline bool isar_feature_aa64_crc32(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, CRC32) != 0;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, CRC32) != 0;
 }
 
 static inline bool isar_feature_aa64_atomics(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, ATOMIC) != 0;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, ATOMIC) != 0;
 }
 
 static inline bool isar_feature_aa64_rdm(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, RDM) != 0;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, RDM) != 0;
 }
 
 static inline bool isar_feature_aa64_sha3(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SHA3) != 0;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, SHA3) != 0;
 }
 
 static inline bool isar_feature_aa64_sm3(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SM3) != 0;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, SM3) != 0;
 }
 
 static inline bool isar_feature_aa64_sm4(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, SM4) != 0;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, SM4) != 0;
 }
 
 static inline bool isar_feature_aa64_dp(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, DP) != 0;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, DP) != 0;
 }
 
 static inline bool isar_feature_aa64_fhm(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, FHM) != 0;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, FHM) != 0;
 }
 
 static inline bool isar_feature_aa64_condm_4(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, TS) != 0;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, TS) != 0;
 }
 
 static inline bool isar_feature_aa64_condm_5(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, TS) >= 2;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, TS) >= 2;
 }
 
 static inline bool isar_feature_aa64_rndr(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, RNDR) != 0;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, RNDR) != 0;
 }
 
 static inline bool isar_feature_aa64_tlbirange(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, TLB) == 2;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, TLB) == 2;
 }
 
 static inline bool isar_feature_aa64_tlbios(const ARMISARegisters *id)
 {
-return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, TLB) != 0;
+return FIELD_EX64_IDREG(id, ID_AA64ISAR0, TLB) != 0;
 }
 
 static inline bool isar_feature_aa64_jscvt(const ARMISARegisters *id)
@@ -927,52 +928,52 @@ static inline bool isar_feature_aa64_doublelock(const 
ARMISARegisters *id)
 
 static inline bool isar_feature_aa64_sve2(const ARMISA
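
The idea behind the conversion above (ID registers stored in one array indexed by an enum, with feature tests extracting a bit field from the indexed entry) can be sketched as follows. All `Sketch`/`sketch_` names are hypothetical and do not reproduce the macros from the series; the only architectural fact relied upon is that the AES field of ID_AA64ISAR0_EL1 occupies bits [7:4].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One index per ID register instead of one struct member per register. */
typedef enum {
    SKETCH_ID_AA64ISAR0_IDX,
    SKETCH_ID_AA64ZFR0_IDX,
    SKETCH_NUM_ID_IDX,
} SketchIdIdx;

typedef struct {
    uint64_t idregs[SKETCH_NUM_ID_IDX];
} SketchIsaRegs;

/* Extract 'length' bits starting at bit 'shift'. */
static uint64_t sketch_extract64(uint64_t value, unsigned shift,
                                 unsigned length)
{
    assert(length > 0 && shift < 64 && length <= 64 - shift);
    return (value >> shift) & (~UINT64_C(0) >> (64 - length));
}

/* ID_AA64ISAR0_EL1.AES: bits [7:4]. */
#define SKETCH_AES_SHIFT   4
#define SKETCH_AES_LENGTH  4

static uint64_t sketch_aes_field(const SketchIsaRegs *id)
{
    return sketch_extract64(id->idregs[SKETCH_ID_AA64ISAR0_IDX],
                            SKETCH_AES_SHIFT, SKETCH_AES_LENGTH);
}

/* Same tests as the quoted helpers: AES != 0 and AES > 1. */
static bool sketch_feature_aes(const SketchIsaRegs *id)
{
    return sketch_aes_field(id) != 0;
}

static bool sketch_feature_pmull(const SketchIsaRegs *id)
{
    return sketch_aes_field(id) > 1;
}
```

With an array, code that copies, compares or fills in all ID registers (for instance from values read on the host) can loop over the indices rather than naming each member.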

[PATCH for-10.1 v5 11/13] arm/cpu: Store id_mmfr0-5 into the idregs array

2025-04-10 Thread Cornelia Huck

From: Eric Auger 

Reviewed-by: Richard Henderson 
Reviewed-by: Sebastian Ott 
Signed-off-by: Eric Auger 
Signed-off-by: Cornelia Huck 
---
 hw/intc/armv7m_nvic.c |  8 ++--
 target/arm/cpu-features.h | 18 
 target/arm/cpu.h  |  6 ---
 target/arm/cpu64.c| 16 +++
 target/arm/helper.c   | 12 ++---
 target/arm/kvm.c  | 18 +++-
 target/arm/tcg/cpu-v7m.c  | 48 ++--
 target/arm/tcg/cpu32.c| 94 +++
 target/arm/tcg/cpu64.c| 76 +++
 9 files changed, 140 insertions(+), 156 deletions(-)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index 7f42803fef7c..f6d945c52923 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -1284,22 +1284,22 @@ static uint32_t nvic_readl(NVICState *s, uint32_t 
offset, MemTxAttrs attrs)
 if (!arm_feature(&cpu->env, ARM_FEATURE_M_MAIN)) {
 goto bad_offset;
 }
-return cpu->isar.id_mmfr0;
+return GET_IDREG(isar, ID_MMFR0);
 case 0xd54: /* MMFR1.  */
 if (!arm_feature(&cpu->env, ARM_FEATURE_M_MAIN)) {
 goto bad_offset;
 }
-return cpu->isar.id_mmfr1;
+return GET_IDREG(isar, ID_MMFR1);
 case 0xd58: /* MMFR2.  */
 if (!arm_feature(&cpu->env, ARM_FEATURE_M_MAIN)) {
 goto bad_offset;
 }
-return cpu->isar.id_mmfr2;
+return GET_IDREG(isar, ID_MMFR2);
 case 0xd5c: /* MMFR3.  */
 if (!arm_feature(&cpu->env, ARM_FEATURE_M_MAIN)) {
 goto bad_offset;
 }
-return cpu->isar.id_mmfr3;
+return GET_IDREG(isar, ID_MMFR3);
 case 0xd60: /* ISAR0.  */
 if (!arm_feature(&cpu->env, ARM_FEATURE_M_MAIN)) {
 goto bad_offset;
diff --git a/target/arm/cpu-features.h b/target/arm/cpu-features.h
index cad34f0ad403..db3c99e42964 100644
--- a/target/arm/cpu-features.h
+++ b/target/arm/cpu-features.h
@@ -283,17 +283,17 @@ static inline bool isar_feature_aa32_vminmaxnm(const 
ARMISARegisters *id)
 
 static inline bool isar_feature_aa32_pxn(const ARMISARegisters *id)
 {
-return FIELD_EX32(id->id_mmfr0, ID_MMFR0, VMSA) >= 4;
+return FIELD_EX32_IDREG(id, ID_MMFR0, VMSA) >= 4;
 }
 
 static inline bool isar_feature_aa32_pan(const ARMISARegisters *id)
 {
-return FIELD_EX32(id->id_mmfr3, ID_MMFR3, PAN) != 0;
+return FIELD_EX32_IDREG(id, ID_MMFR3, PAN) != 0;
 }
 
 static inline bool isar_feature_aa32_ats1e1(const ARMISARegisters *id)
 {
-return FIELD_EX32(id->id_mmfr3, ID_MMFR3, PAN) >= 2;
+return FIELD_EX32_IDREG(id, ID_MMFR3, PAN) >= 2;
 }
 
 static inline bool isar_feature_aa32_pmuv3p1(const ARMISARegisters *id)
@@ -319,32 +319,32 @@ static inline bool isar_feature_aa32_pmuv3p5(const 
ARMISARegisters *id)
 
 static inline bool isar_feature_aa32_hpd(const ARMISARegisters *id)
 {
-return FIELD_EX32(id->id_mmfr4, ID_MMFR4, HPDS) != 0;
+return FIELD_EX32_IDREG(id, ID_MMFR4, HPDS) != 0;
 }
 
 static inline bool isar_feature_aa32_ac2(const ARMISARegisters *id)
 {
-return FIELD_EX32(id->id_mmfr4, ID_MMFR4, AC2) != 0;
+return FIELD_EX32_IDREG(id, ID_MMFR4, AC2) != 0;
 }
 
 static inline bool isar_feature_aa32_ccidx(const ARMISARegisters *id)
 {
-return FIELD_EX32(id->id_mmfr4, ID_MMFR4, CCIDX) != 0;
+return FIELD_EX32_IDREG(id, ID_MMFR4, CCIDX) != 0;
 }
 
 static inline bool isar_feature_aa32_tts2uxn(const ARMISARegisters *id)
 {
-return FIELD_EX32(id->id_mmfr4, ID_MMFR4, XNX) != 0;
+return FIELD_EX32_IDREG(id, ID_MMFR4, XNX) != 0;
 }
 
 static inline bool isar_feature_aa32_half_evt(const ARMISARegisters *id)
 {
-return FIELD_EX32(id->id_mmfr4, ID_MMFR4, EVT) >= 1;
+return FIELD_EX32_IDREG(id, ID_MMFR4, EVT) >= 1;
 }
 
 static inline bool isar_feature_aa32_evt(const ARMISARegisters *id)
 {
-return FIELD_EX32(id->id_mmfr4, ID_MMFR4, EVT) >= 2;
+return FIELD_EX32_IDREG(id, ID_MMFR4, EVT) >= 2;
 }
 
 static inline bool isar_feature_aa32_dit(const ARMISARegisters *id)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 496c7f9a3ce7..d27134f4a025 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1073,12 +1073,6 @@ struct ArchCPU {
  * field by reading the value from the KVM vCPU.
  */
 struct ARMISARegisters {
-uint32_t id_mmfr0;
-uint32_t id_mmfr1;
-uint32_t id_mmfr2;
-uint32_t id_mmfr3;
-uint32_t id_mmfr4;
-uint32_t id_mmfr5;
 uint32_t mvfr0;
 uint32_t mvfr1;
 uint32_t mvfr2;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 292d09fb8e9b..9769401a8585 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -656,10 +656,10 @@ static void aarch64_a57_initfn(Object *obj)
 SET_IDREG(isar, ID_PFR1, 0x00011011);
 SET_IDREG(isar, ID_DFR0, 0x03010066);
 cpu->id_afr0 = 0x;
-cpu->isar.id_mmfr0 = 0x10101105;
-cpu->isar.id_mmfr1 = 0x4000;
-cpu->isar.id_mmfr2 = 0x0126;
-cpu

Re: [RFC PATCH-for-8.0 09/10] hw/virtio: Extract vhost_user_ram_slots_max() to vhost-user-target.c

2025-04-10 Thread Pierrick Bouvier


On 4/10/25 10:21, Philippe Mathieu-Daudé wrote:

On 10/4/25 16:36, Pierrick Bouvier wrote:

On 4/10/25 05:14, Philippe Mathieu-Daudé wrote:

Hi Pierrick,

On 13/12/22 00:05, Philippe Mathieu-Daudé wrote:

The current definition of VHOST_USER_MAX_RAM_SLOTS is
target specific. By converting this definition to a runtime
vhost_user_ram_slots_max() helper declared in a target
specific unit, we can have the rest of vhost-user.c target
independent.

To avoid variable length array or using the heap to store
arrays of vhost_user_ram_slots_max() elements, we simply
declare an array of the biggest VHOST_USER_MAX_RAM_SLOTS,
and each target uses up to vhost_user_ram_slots_max()
elements of it. Ensure arrays are big enough by adding an
assertion in vhost_user_init().

Signed-off-by: Philippe Mathieu-Daudé 
---
RFC: Should I add VHOST_USER_MAX_RAM_SLOTS to vhost-user.h
    or create an internal header for it?
---
    hw/virtio/meson.build  |  1 +
    hw/virtio/vhost-user-target.c  | 29 +
    hw/virtio/vhost-user.c | 26 +-
    include/hw/virtio/vhost-user.h |  7 +++
    4 files changed, 42 insertions(+), 21 deletions(-)
    create mode 100644 hw/virtio/vhost-user-target.c

diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index eb7ee8ea92..bf7e35fa8a 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -11,6 +11,7 @@ if have_vhost
  specific_virtio_ss.add(files('vhost.c', 'vhost-backend.c',
'vhost-iova-tree.c'))
  if have_vhost_user
    specific_virtio_ss.add(files('vhost-user.c'))
+    specific_virtio_ss.add(files('vhost-user-target.c'))
  endif
  if have_vhost_vdpa
    specific_virtio_ss.add(files('vhost-vdpa.c', 'vhost-shadow-
virtqueue.c'))
diff --git a/hw/virtio/vhost-user-target.c b/hw/virtio/vhost-user-
target.c
new file mode 100644
index 00..6a0d0f53d0
--- /dev/null
+++ b/hw/virtio/vhost-user-target.c
@@ -0,0 +1,29 @@
+/*
+ * vhost-user target-specific helpers
+ *
+ * Copyright (c) 2013 Virtual Open Systems Sarl.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "hw/virtio/vhost-user.h"
+
+#if defined(TARGET_X86) || defined(TARGET_X86_64) || \
+    defined(TARGET_ARM) || defined(TARGET_ARM_64)
+#include "hw/acpi/acpi.h"
+#elif defined(TARGET_PPC) || defined(TARGET_PPC64)
+#include "hw/ppc/spapr.h"
+#endif
+
+unsigned int vhost_user_ram_slots_max(void)
+{
+#if defined(TARGET_X86) || defined(TARGET_X86_64) || \
+    defined(TARGET_ARM) || defined(TARGET_ARM_64)
+    return ACPI_MAX_RAM_SLOTS;
+#elif defined(TARGET_PPC) || defined(TARGET_PPC64)
+    return SPAPR_MAX_RAM_SLOTS;
+#else
+    return 512;


Should vhost_user_ram_slots_max be another TargetInfo field?



I don't think so, it would be better to transform the existing function
into something like:

switch (target_current()) {
case TARGET_X86:
case TARGET_ARM:
case TARGET_X86_64:
case TARGET_ARM_64:
  return ACPI_MAX_RAM_SLOTS;
case TARGET_PPC:
case TARGET_PPC64:
  return SPAPR_MAX_RAM_SLOTS;
default:
  return 512;
}


Clever, I like it, thanks!


It's a pattern we can reuse in all places where it'll be needed.
It's better if we keep in TargetInfo only global information that is
used throughout the codebase, and not specifics about a given
subsystem/device/file.


By the way, TARGET_ARM_64 is probably TARGET_AARCH64.
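
Putting the two ideas from this thread together (a runtime switch instead of the `#if` ladder, and arrays sized for the largest limit with an assertion at init time), a stand-alone sketch could look like this. The enum, the limit values and all `sketch_` names are assumptions made for the example; they are not QEMU's `target_current()` API nor the real `ACPI_MAX_RAM_SLOTS` / `SPAPR_MAX_RAM_SLOTS` definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical target identifiers. */
typedef enum {
    SKETCH_TARGET_X86_64,
    SKETCH_TARGET_AARCH64,
    SKETCH_TARGET_PPC64,
    SKETCH_TARGET_OTHER,
} SketchTarget;

/* Placeholder limits; the bound must be >= every value returned below. */
enum {
    SKETCH_ACPI_MAX_RAM_SLOTS  = 256,
    SKETCH_SPAPR_MAX_RAM_SLOTS = 32,
    SKETCH_DEFAULT_RAM_SLOTS   = 512,
    SKETCH_RAM_SLOTS_BOUND     = 512,
};

/* Runtime replacement for the preprocessor ladder. */
static unsigned int sketch_ram_slots_max(SketchTarget target)
{
    switch (target) {
    case SKETCH_TARGET_X86_64:
    case SKETCH_TARGET_AARCH64:
        return SKETCH_ACPI_MAX_RAM_SLOTS;
    case SKETCH_TARGET_PPC64:
        return SKETCH_SPAPR_MAX_RAM_SLOTS;
    default:
        return SKETCH_DEFAULT_RAM_SLOTS;
    }
}

/* Fixed-size storage: no VLA and no heap, sized for the largest target. */
typedef struct {
    unsigned int max_slots;
    unsigned int nregions;
    uint64_t region_base[SKETCH_RAM_SLOTS_BOUND];
} SketchState;

static void sketch_init(SketchState *s, SketchTarget target)
{
    s->max_slots = sketch_ram_slots_max(target);
    /* Same role as the assertion the commit message adds at init time. */
    assert(s->max_slots <= SKETCH_RAM_SLOTS_BOUND);
    s->nregions = 0;
}

/* Each target uses only the first max_slots entries of the array. */
static bool sketch_add_region(SketchState *s, uint64_t base)
{
    if (s->nregions >= s->max_slots) {
        return false;
    }
    s->region_base[s->nregions++] = base;
    return true;
}
```

The trade-off is a little wasted memory on targets with a smaller limit, in exchange for a single object file that no longer depends on target-specific macros.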

Re: [PATCH] migration: add FEATURE_SEEKABLE to QIOChannelBlock

2025-04-10 Thread Prasad Pandit

On Mon, 7 Apr 2025 at 14:31, Marco Cavenati  wrote:
> As you said the capability is used internally. Its goal is to signal to
> other QEMU code that the QIOChannel is seekable.
> 'qio_channel_pwritev' and 'qio_channel_preadv' can be used only if
> the QIOChannel has the 'QIO_CHANNEL_FEATURE_SEEKABLE'
> capability.
>
> The mapped-ram migration checks if the channel has this capability
> because it uses the aforementioned functions. Without the capability
> and the functions implemented in this patch, the mapped-ram migration
> won't work with QIOChannelBlock.
>
> You can have a look at the patch where those functions were
> introduced here [0].

*  _channel_preadv/_writev functions are generic. They are independent
of whether the underlying channel is file or socket or memory or
something else. They are called if and when they are defined and they
in turn call channel specific preadv/pwritev functions.

if (!klass->io_pwritev) {
error_setg(errp, "Channel does not support pwritev");
return -1;
}

* io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file
-> 
https://gitlab.com/qemu-project/qemu/-/commit/401e311ff72e0a62c834bfe466de68a82cfd90cb

   This commit sets the *_FEATURE_SEEKABLE flag for the file channel
when the lseek(2) call succeeds.

* i.e. a 'file' OR 'fd' channel is seekable when the lseek(2) call works.
Similarly, a Block channel would be seekable when the ->io_seek() method is
defined and it works. And the ->io_seek() method is also called if and
when it is defined.

qio_channel_io_seek
if (!klass->io_seek) {
error_setg(errp, "Channel does not support random access");
return -1;
}

  Setting  '*_FEATURE_SEEKABLE' for the block channel does not ensure
that ->io_seek() is defined and works. It seems redundant that way.

Maybe I'm missing something here, not sure. Thank you.
---
  - Prasad
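
The two checks discussed in this message can be put side by side in a small sketch. Everything here is hypothetical (the SketchChannel type, the feature bit and the helpers are illustrative and are not the QIOChannel API); it only shows the difference between a feature flag that callers can test up front and a method pointer that is checked at call time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit, standing in for a "seekable" capability. */
enum { SKETCH_FEATURE_SEEKABLE = 1u << 0 };

typedef struct SketchChannel SketchChannel;
struct SketchChannel {
    uint32_t features;                                   /* advertised capabilities */
    int64_t pos;                                         /* current position */
    int64_t (*io_seek)(SketchChannel *ch, int64_t off);  /* optional method */
};

/* A seek implementation for a channel that supports random access. */
int64_t sketch_block_seek(SketchChannel *ch, int64_t off)
{
    ch->pos = off;
    return ch->pos;
}

/* Up-front check: lets a caller pick a code path before doing any I/O. */
bool sketch_has_feature(const SketchChannel *ch, uint32_t feature)
{
    return (ch->features & feature) != 0;
}

/* Call-time check: fails only once the operation is attempted. */
int64_t sketch_seek(SketchChannel *ch, int64_t off)
{
    if (!ch->io_seek) {
        return -1; /* "Channel does not support random access" */
    }
    return ch->io_seek(ch, off);
}

/* Usage: a channel without the method fails at call time, while a channel
 * with both the flag and the method can be detected in advance. */
void sketch_channel_demo(void)
{
    SketchChannel plain = { 0 };
    SketchChannel block = {
        .features = SKETCH_FEATURE_SEEKABLE,
        .io_seek = sketch_block_seek,
    };

    assert(!sketch_has_feature(&plain, SKETCH_FEATURE_SEEKABLE));
    assert(sketch_seek(&plain, 0) == -1);
    assert(sketch_has_feature(&block, SKETCH_FEATURE_SEEKABLE));
    assert(sketch_seek(&block, 4096) == 4096);
}
```

In this sketch the flag and the method are set together by whoever constructs the channel; nothing enforces that they stay consistent, which is the concern raised above.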

[PATCH v3 02/16] hw/intc/loongarch_pch: Modify register name PCH_PIC_xxx_OFFSET with PCH_PIC_xxx

2025-04-10 Thread Bibo Mao

Macros PCH_PIC_HTMSI_VEC_OFFSET and PCH_PIC_ROUTE_ENTRY_OFFSET are renamed
to PCH_PIC_HTMSI_VEC and PCH_PIC_ROUTE_ENTRY respectively, which is easier
to understand.

Signed-off-by: Bibo Mao 
---
 hw/intc/loongarch_pch_pic.c| 20 ++--
 hw/loongarch/virt.c|  2 +-
 include/hw/intc/loongarch_pic_common.h |  4 ++--
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/hw/intc/loongarch_pch_pic.c b/hw/intc/loongarch_pch_pic.c
index 2b90ccd1ff..4c845ba5e9 100644
--- a/hw/intc/loongarch_pch_pic.c
+++ b/hw/intc/loongarch_pch_pic.c
@@ -263,18 +263,18 @@ static uint64_t loongarch_pch_pic_readb(void *opaque, 
hwaddr addr,
 {
 LoongArchPICCommonState *s = LOONGARCH_PIC_COMMON(opaque);
 uint64_t val = 0;
-uint32_t offset = (addr & 0xfff) + PCH_PIC_ROUTE_ENTRY_OFFSET;
+uint32_t offset = (addr & 0xfff) + PCH_PIC_ROUTE_ENTRY;
 int64_t offset_tmp;
 
 switch (offset) {
-case PCH_PIC_HTMSI_VEC_OFFSET ... PCH_PIC_HTMSI_VEC_END:
-offset_tmp = offset - PCH_PIC_HTMSI_VEC_OFFSET;
+case PCH_PIC_HTMSI_VEC ... PCH_PIC_HTMSI_VEC_END:
+offset_tmp = offset - PCH_PIC_HTMSI_VEC;
 if (offset_tmp >= 0 && offset_tmp < 64) {
 val = s->htmsi_vector[offset_tmp];
 }
 break;
-case PCH_PIC_ROUTE_ENTRY_OFFSET ... PCH_PIC_ROUTE_ENTRY_END:
-offset_tmp = offset - PCH_PIC_ROUTE_ENTRY_OFFSET;
+case PCH_PIC_ROUTE_ENTRY ... PCH_PIC_ROUTE_ENTRY_END:
+offset_tmp = offset - PCH_PIC_ROUTE_ENTRY;
 if (offset_tmp >= 0 && offset_tmp < 64) {
 val = s->route_entry[offset_tmp];
 }
@@ -292,19 +292,19 @@ static void loongarch_pch_pic_writeb(void *opaque, hwaddr 
addr,
 {
 LoongArchPICCommonState *s = LOONGARCH_PIC_COMMON(opaque);
 int32_t offset_tmp;
-uint32_t offset = (addr & 0xfff) + PCH_PIC_ROUTE_ENTRY_OFFSET;
+uint32_t offset = (addr & 0xfff) + PCH_PIC_ROUTE_ENTRY;
 
 trace_loongarch_pch_pic_writeb(size, addr, data);
 
 switch (offset) {
-case PCH_PIC_HTMSI_VEC_OFFSET ... PCH_PIC_HTMSI_VEC_END:
-offset_tmp = offset - PCH_PIC_HTMSI_VEC_OFFSET;
+case PCH_PIC_HTMSI_VEC ... PCH_PIC_HTMSI_VEC_END:
+offset_tmp = offset - PCH_PIC_HTMSI_VEC;
 if (offset_tmp >= 0 && offset_tmp < 64) {
 s->htmsi_vector[offset_tmp] = (uint8_t)(data & 0xff);
 }
 break;
-case PCH_PIC_ROUTE_ENTRY_OFFSET ... PCH_PIC_ROUTE_ENTRY_END:
-offset_tmp = offset - PCH_PIC_ROUTE_ENTRY_OFFSET;
+case PCH_PIC_ROUTE_ENTRY ... PCH_PIC_ROUTE_ENTRY_END:
+offset_tmp = offset - PCH_PIC_ROUTE_ENTRY;
 if (offset_tmp >= 0 && offset_tmp < 64) {
 s->route_entry[offset_tmp] = (uint8_t)(data & 0xff);
 }
diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 8c0cc98c72..1f1cca667e 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -429,7 +429,7 @@ static void virt_irq_init(LoongArchVirtMachineState *lvms)
 memory_region_add_subregion(get_system_memory(), VIRT_IOAPIC_REG_BASE,
 sysbus_mmio_get_region(d, 0));
 memory_region_add_subregion(get_system_memory(),
-VIRT_IOAPIC_REG_BASE + PCH_PIC_ROUTE_ENTRY_OFFSET,
+VIRT_IOAPIC_REG_BASE + PCH_PIC_ROUTE_ENTRY,
 sysbus_mmio_get_region(d, 1));
 memory_region_add_subregion(get_system_memory(),
 VIRT_IOAPIC_REG_BASE + PCH_PIC_INT_STATUS,
diff --git a/include/hw/intc/loongarch_pic_common.h 
b/include/hw/intc/loongarch_pic_common.h
index c04471b08d..b33bebb129 100644
--- a/include/hw/intc/loongarch_pic_common.h
+++ b/include/hw/intc/loongarch_pic_common.h
@@ -19,9 +19,9 @@
 #define PCH_PIC_INT_CLEAR   0x80
 #define PCH_PIC_AUTO_CTRL0  0xc0
 #define PCH_PIC_AUTO_CTRL1  0xe0
-#define PCH_PIC_ROUTE_ENTRY_OFFSET  0x100
+#define PCH_PIC_ROUTE_ENTRY 0x100
 #define PCH_PIC_ROUTE_ENTRY_END 0x13f
-#define PCH_PIC_HTMSI_VEC_OFFSET0x200
+#define PCH_PIC_HTMSI_VEC   0x200
 #define PCH_PIC_HTMSI_VEC_END   0x23f
 #define PCH_PIC_INT_STATUS  0x3a0
 #define PCH_PIC_INT_POL 0x3e0
-- 
2.39.3

Re: [PATCH 3/3] vnc: h264: send additional frames after the display is clean

2025-04-10 Thread Marc-André Lureau

On Mon, Apr 7, 2025 at 3:06 PM Dietmar Maurer  wrote:
>
> So that encoder can improve the picture quality.
>
> Signed-off-by: Dietmar Maurer 

Reviewed-by: Marc-André Lureau 

> ---
>  ui/vnc.c | 25 -
>  ui/vnc.h |  3 +++
>  2 files changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/ui/vnc.c b/ui/vnc.c
> index 2e60b55e47..4ba0b715fd 100644
> --- a/ui/vnc.c
> +++ b/ui/vnc.c
> @@ -3239,7 +3239,30 @@ static void vnc_refresh(DisplayChangeListener *dcl)
>  vnc_unlock_display(vd);
>
>  QTAILQ_FOREACH_SAFE(vs, &vd->clients, next, vn) {
> -rects += vnc_update_client(vs, has_dirty);
> +int client_dirty = has_dirty;
> +if (vs->h264) {
> +if (client_dirty) {
> +vs->h264->keep_dirty = VNC_H264_KEEP_DIRTY;
> +} else {
> +if (vs->h264->keep_dirty > 0) {
> +client_dirty = 1;
> +vs->h264->keep_dirty--;
> +}
> +}
> +}
> +
> +int count = vnc_update_client(vs, client_dirty);
> +rects += count;
> +
> +if (vs->h264 && !count && vs->h264->keep_dirty) {
> +VncJob *job = vnc_job_new(vs);
> +int height = pixman_image_get_height(vd->server);
> +int width = pixman_image_get_width(vd->server);
> +vs->job_update = vs->update;
> +vs->update = VNC_STATE_UPDATE_NONE;
> +vnc_job_add_rect(job, 0, 0, width, height);
> +vnc_job_push(job);
> +}
>  /* vs might be free()ed here */
>  }
>
> diff --git a/ui/vnc.h b/ui/vnc.h
> index 7e232f7dac..e1b81d6bcc 100644
> --- a/ui/vnc.h
> +++ b/ui/vnc.h
> @@ -236,10 +236,13 @@ typedef struct VncZywrle {
>  } VncZywrle;
>
>  #ifdef CONFIG_GSTREAMER
> +/* Number of frames we send after the display is clean. */
> +#define VNC_H264_KEEP_DIRTY 10
>  typedef struct VncH264 {
>  GstElement *pipeline, *source, *gst_encoder, *sink, *convert;
>  size_t width;
>  size_t height;
> +guint keep_dirty;
>  } VncH264;
>  #endif
>
> --
> 2.39.5
>
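
The keep_dirty countdown in the quoted patch can be isolated into a small sketch. The names below (SketchH264, sketch_client_dirty) are hypothetical and the function only models the counter, not the job queueing done in vnc_refresh(); the constant 10 mirrors VNC_H264_KEEP_DIRTY from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of extra frames sent after the display is clean (as in the patch). */
enum { SKETCH_KEEP_DIRTY = 10 };

typedef struct {
    unsigned int keep_dirty; /* remaining extra frames */
} SketchH264;

/* Returns true if this refresh should still be treated as dirty. */
bool sketch_client_dirty(SketchH264 *h264, bool has_dirty)
{
    if (has_dirty) {
        /* Real change: re-arm the countdown. */
        h264->keep_dirty = SKETCH_KEEP_DIRTY;
        return true;
    }
    if (h264->keep_dirty > 0) {
        /* Display is clean, but keep feeding the encoder for a while. */
        h264->keep_dirty--;
        return true;
    }
    return false;
}

/* Usage: one dirty refresh is followed by exactly SKETCH_KEEP_DIRTY extra
 * frames, after which the client is treated as clean. */
void sketch_keep_dirty_demo(void)
{
    SketchH264 h = { 0 };
    int extra = 0;

    assert(sketch_client_dirty(&h, true));
    while (sketch_client_dirty(&h, false)) {
        extra++;
    }
    assert(extra == SKETCH_KEEP_DIRTY);
}
```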

[PATCH v2 02/10] hw/arm/aspeed_ast27x0: Add "vbootrom_size" field to AspeedSoCClass

2025-04-10 Thread Jamin Lin via

Introduced a "vbootrom_size" attribute in "AspeedSoCClass" to define virtual
boot ROM size.
Initialized "vbootrom_size" to "0x2" for both AST2700 A0 and A1 variants.

Signed-off-by: Jamin Lin 
---
 include/hw/arm/aspeed_soc.h | 1 +
 hw/arm/aspeed_ast27x0.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/include/hw/arm/aspeed_soc.h b/include/hw/arm/aspeed_soc.h
index 37cd7cd793..432f6178ac 100644
--- a/include/hw/arm/aspeed_soc.h
+++ b/include/hw/arm/aspeed_soc.h
@@ -152,6 +152,7 @@ struct AspeedSoCClass {
 const char * const *valid_cpu_types;
 uint32_t silicon_rev;
 uint64_t sram_size;
+uint64_t vbootrom_size;
 uint64_t secsram_size;
 int spis_num;
 int ehcis_num;
diff --git a/hw/arm/aspeed_ast27x0.c b/hw/arm/aspeed_ast27x0.c
index dce7255a2c..81dd90ffdd 100644
--- a/hw/arm/aspeed_ast27x0.c
+++ b/hw/arm/aspeed_ast27x0.c
@@ -898,6 +898,7 @@ static void aspeed_soc_ast2700a0_class_init(ObjectClass 
*oc, void *data)
 
 sc->valid_cpu_types = valid_cpu_types;
 sc->silicon_rev  = AST2700_A0_SILICON_REV;
+sc->vbootrom_size = 0x2;
 sc->sram_size= 0x2;
 sc->spis_num = 3;
 sc->wdts_num = 8;
@@ -925,6 +926,7 @@ static void aspeed_soc_ast2700a1_class_init(ObjectClass 
*oc, void *data)
 
 sc->valid_cpu_types = valid_cpu_types;
 sc->silicon_rev  = AST2700_A1_SILICON_REV;
+sc->vbootrom_size = 0x2;
 sc->sram_size= 0x2;
 sc->spis_num = 3;
 sc->wdts_num = 8;
-- 
2.43.0

[PATCH 02/14] vfio: refactor out vfio_pci_config_setup()

2025-04-10 Thread John Levon

Refactor the PCI config setup code out of vfio_realize() for
readability.

Reviewed-by: Cédric Le Goater 
Signed-off-by: John Levon 
---
 hw/vfio/pci.c | 176 +++---
 1 file changed, 94 insertions(+), 82 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 02f23efaba..81bf0dab28 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2963,6 +2963,99 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice 
*vdev)
 vdev->req_enabled = false;
 }
 
+static bool vfio_pci_config_setup(VFIOPCIDevice *vdev, Error **errp)
+{
+PCIDevice *pdev = &vdev->pdev;
+VFIODevice *vbasedev = &vdev->vbasedev;
+
+/* vfio emulates a lot for us, but some bits need extra love */
+vdev->emulated_config_bits = g_malloc0(vdev->config_size);
+
+/* QEMU can choose to expose the ROM or not */
+memset(vdev->emulated_config_bits + PCI_ROM_ADDRESS, 0xff, 4);
+/* QEMU can also add or extend BARs */
+memset(vdev->emulated_config_bits + PCI_BASE_ADDRESS_0, 0xff, 6 * 4);
+
+/*
+ * The PCI spec reserves vendor ID 0x as an invalid value.  The
+ * device ID is managed by the vendor and need only be a 16-bit value.
+ * Allow any 16-bit value for subsystem so they can be hidden or changed.
+ */
+if (vdev->vendor_id != PCI_ANY_ID) {
+if (vdev->vendor_id >= 0x) {
+error_setg(errp, "invalid PCI vendor ID provided");
+return false;
+}
+vfio_add_emulated_word(vdev, PCI_VENDOR_ID, vdev->vendor_id, ~0);
+trace_vfio_pci_emulated_vendor_id(vbasedev->name, vdev->vendor_id);
+} else {
+vdev->vendor_id = pci_get_word(pdev->config + PCI_VENDOR_ID);
+}
+
+if (vdev->device_id != PCI_ANY_ID) {
+if (vdev->device_id > 0x) {
+error_setg(errp, "invalid PCI device ID provided");
+return false;
+}
+vfio_add_emulated_word(vdev, PCI_DEVICE_ID, vdev->device_id, ~0);
+trace_vfio_pci_emulated_device_id(vbasedev->name, vdev->device_id);
+} else {
+vdev->device_id = pci_get_word(pdev->config + PCI_DEVICE_ID);
+}
+
+if (vdev->sub_vendor_id != PCI_ANY_ID) {
+if (vdev->sub_vendor_id > 0x) {
+error_setg(errp, "invalid PCI subsystem vendor ID provided");
+return false;
+}
+vfio_add_emulated_word(vdev, PCI_SUBSYSTEM_VENDOR_ID,
+   vdev->sub_vendor_id, ~0);
+trace_vfio_pci_emulated_sub_vendor_id(vbasedev->name,
+  vdev->sub_vendor_id);
+}
+
+if (vdev->sub_device_id != PCI_ANY_ID) {
+if (vdev->sub_device_id > 0x) {
+error_setg(errp, "invalid PCI subsystem device ID provided");
+return false;
+}
+vfio_add_emulated_word(vdev, PCI_SUBSYSTEM_ID, vdev->sub_device_id, 
~0);
+trace_vfio_pci_emulated_sub_device_id(vbasedev->name,
+  vdev->sub_device_id);
+}
+
+/* QEMU can change multi-function devices to single function, or reverse */
+vdev->emulated_config_bits[PCI_HEADER_TYPE] =
+  PCI_HEADER_TYPE_MULTI_FUNCTION;
+
+/* Restore or clear multifunction, this is always controlled by QEMU */
+if (vdev->pdev.cap_present & QEMU_PCI_CAP_MULTIFUNCTION) {
+vdev->pdev.config[PCI_HEADER_TYPE] |= PCI_HEADER_TYPE_MULTI_FUNCTION;
+} else {
+vdev->pdev.config[PCI_HEADER_TYPE] &= ~PCI_HEADER_TYPE_MULTI_FUNCTION;
+}
+
+/*
+ * Clear host resource mapping info.  If we choose not to register a
+ * BAR, such as might be the case with the option ROM, we can get
+ * confusing, unwritable, residual addresses from the host here.
+ */
+memset(&vdev->pdev.config[PCI_BASE_ADDRESS_0], 0, 24);
+memset(&vdev->pdev.config[PCI_ROM_ADDRESS], 0, 4);
+
+vfio_pci_size_rom(vdev);
+
+vfio_bars_prepare(vdev);
+
+if (!vfio_msix_early_setup(vdev, errp)) {
+return false;
+}
+
+vfio_bars_register(vdev);
+
+return true;
+}
+
 static bool vfio_interrupt_setup(VFIOPCIDevice *vdev, Error **errp)
 {
 PCIDevice *pdev = &vdev->pdev;
@@ -3067,91 +3160,10 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
 goto error;
 }
 
-/* vfio emulates a lot for us, but some bits need extra love */
-vdev->emulated_config_bits = g_malloc0(vdev->config_size);
-
-/* QEMU can choose to expose the ROM or not */
-memset(vdev->emulated_config_bits + PCI_ROM_ADDRESS, 0xff, 4);
-/* QEMU can also add or extend BARs */
-memset(vdev->emulated_config_bits + PCI_BASE_ADDRESS_0, 0xff, 6 * 4);
-
-/*
- * The PCI spec reserves vendor ID 0x as an invalid value.  The
- * device ID is managed by the vendor and need only be a 16-bit value.
- * Allow any 16-bit value for subsystem so they can be hidden or changed.
- */
-if (vdev->vendor_id != PCI_ANY_

[PATCH v1 00/24] Secure IPL Support for SCSI Scheme of virtio-blk/virtio-scsi Devices

2025-04-10 Thread Zhuoying Cai

# Description

This patch series is an external requirement by Linux distribution
partners to verify the secure IPL process. Additional secure IPL checks
are also included in this series to address security holes in the original
secure IPL design and to prevent malicious actors from booting modified or
unsigned code despite secure IPL being enforced.

Secure IPL is enabled when the QEMU options for secure IPL are specified
in the command line.

During this process, additional security checks are performed to ensure
system integrity.

As components are loaded from disk, DIAG 508 subcode 2 performs
signature verification if a signature entry is identified. Upon
successful verification, DIAG 320 subcode 2 will request the
corresponding certificate from QEMU key store to the BIOS.

Secure IPL will continue until all the components are loaded if no error
occurs during True secure IPL mode or in Audit mode (see explanation below).

After that, an IPL Information report block (IIRB) is initialized
immediately following an IPL Parameter Information Block. The IIRB is
populated with information about the components, verification results
and certificate data.

Finally, the guest system proceeds to boot.

Only List-Directed-IPL contains the relevant zIPL data structures to
perform secure IPL. This patch series only adds support for the SCSI
scheme of virtio-blk/virtio-scsi devices. Secure IPL for other device
types will be considered as follow-up work at a later date.

** Note: “secure IPL” and “secure boot” are used interchangeably
throughout the design. **

# True Secure IPL Mode and Audit Mode

## True Secure IPL Mode

When secure IPL is enabled and certificates are provided, all the secure
IPL checks will be performed. The boot process will abort if any error
occurs during the secure IPL checks.

## Audit Mode

When the secure IPL option is not selected and certificates are
provided, all the secure IPL checks will still be performed. However,
the boot process will continue if any errors occur, with messages logged
to the console during the secure IPL checks.

The audit mode is also considered as simulated secure IPL because it is
less pervasive, and allows the guest to boot regardless of the secure
checking results.

# How to Enable Secure IPL

## QEMU Build Notes

When building QEMU, enable the cryptographic libraries.

Run the configure script in the QEMU repository with either parameter:

./configure … --enable-nettle or --enable-gcrypt

## Create Certificates via Openssl

openssl req -new -x509 -newkey rsa:2048 -keyout mykey.priv
-outform DER -out mycert.der -days 36500 -subj "/CN=My Name/"
-nodes

Use an RSA private key for signing.

It is recommended to store the certificate(s) in the /…/qemu/certs
directory for easy identification.

## Sign Kernel and Prepare zipl

All actions must be performed on a KVM guest.

Copy the sign-file script (located in Linux source repository),
generated private key(s), and certificate(s) to guest’s file system.

Sign guest image(s) and stage3 binary:

./sign-file sha256 mykey.priv mycert.der /boot/vmlinuz-…

./sign-file sha256 mykey.priv mycert.der /usr/lib/s390-tools/stage3.bin

Run zipl with secure boot enabled.

zipl --secure 1 -V

Guest image(s) are now signed, stored on disk, and can be verified.

## New QEMU Command Options for Secure IPL

The following options enable secure IPL and provide certificates for
signature verification via the QEMU command line.

Enables secure IPL/boot. This option defaults to off if it is not
provided on the command line.

-secure-boot [on|off]

Provides a path to either a directory or a single boot certificate. A
colon may be used to delineate multiple paths.

-boot-certificates 

Example:
qemu-system-s390x ... \
-secure-boot on \
-boot-certificates /.../qemu/certs:/another/path/cert.der

Secure IPL command options overview:

If neither the -secure-boot nor the -boot-certificates options are
specified, the guest will boot in normal mode, and no security checks
will be conducted.

If the -secure-boot option is not specified or is set to off, and the
-boot-certificates option is provided, the guest will boot in audit
mode. In this mode, all security checks are performed; however, any
errors encountered will not interrupt the boot process.

If the -secure-boot option is set to on and the -boot-certificates
option is provided, the guest will boot in true secure IPL mode. In this
mode, all security checks are performed, and any errors encountered will
terminate the boot process.
  - If the -boot-certificates option is not provided in true secure IPL
mode, the boot process will fail for the corresponding device.
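
The option overview above amounts to a small decision table. The sketch below encodes it; the enum and function names are hypothetical and do not come from the patch series, and "have_certs" simply means that -boot-certificates was provided.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    SKETCH_IPL_NORMAL, /* no security checks */
    SKETCH_IPL_AUDIT,  /* checks run, errors do not stop the boot */
    SKETCH_IPL_SECURE, /* checks run, errors terminate the boot */
    SKETCH_IPL_FAIL,   /* secure boot requested without certificates */
} SketchIplMode;

/* Map the two command-line options to the resulting boot mode. */
SketchIplMode sketch_select_ipl_mode(bool secure_boot_on, bool have_certs)
{
    if (secure_boot_on) {
        return have_certs ? SKETCH_IPL_SECURE : SKETCH_IPL_FAIL;
    }
    return have_certs ? SKETCH_IPL_AUDIT : SKETCH_IPL_NORMAL;
}

/* Usage: the four combinations described in the overview. */
void sketch_ipl_mode_demo(void)
{
    assert(sketch_select_ipl_mode(false, false) == SKETCH_IPL_NORMAL);
    assert(sketch_select_ipl_mode(false, true) == SKETCH_IPL_AUDIT);
    assert(sketch_select_ipl_mode(true, true) == SKETCH_IPL_SECURE);
    assert(sketch_select_ipl_mode(true, false) == SKETCH_IPL_FAIL);
}
```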

## Constraints

- certificates must be in X.509 DER format

- only the sha256 hash algorithm is supported

- only support for SCSI scheme of virtio-blk/virtio-scsi devices
- The boot process will terminate if secure boot is enabled without
specifying a boot device.
- If enabl

Re: CXL memory pooling emulation inqury

2025-04-10 Thread Fan Ni

On Wed, Mar 12, 2025 at 03:33:12PM -0400, Gregory Price wrote:
> On Wed, Mar 12, 2025 at 06:05:43PM +, Jonathan Cameron wrote:
> > 
> > Longer term I remain a little unconvinced by whether this is the best 
> > approach
> > because I also want a single management path (so fake CCI etc) and that may
> > need to be exposed to one of the hosts for tests purposes.  In the current
> > approach commands are issued to each host directly to surface memory.
> >
> 
> Lets say we implement this
> 
>   --- ---
>   |  Host 1 | | Host 2  |
>   ||| | |
>   |v|   Add   | |
>   |   CCI   | --> | Evt Log |
>   --- ---
>  ^ 
>   What mechanism
>  do you use here?
> 
> And how does it not just replicate QMP logic?
> 
> Not arguing against it, I just see what amounts to more code than
> required to test the functionality.  QMP fits the bill so split the CCI
> interface for single-host management testing and the MHSLD interface.
> 
> Why not leave the 1-node DCD with inbound CCI interface for testing and
> leave QMP interface for development of a reference fabric manager
> outside the scope of another host?

Hi Gregory,

FYI, I just posted an RFC for FM emulation. The approach used does not need
to replicate QMP logic, but we do use one QMP to notify host2 of an
incoming MCTP message.
https://lore.kernel.org/linux-cxl/20250408043051.430340-1-nifan@gmail.com/

Fan

> 
> TL;DR:  :[ distributed systems are hard to test
> 
> > > 
> > > 2.If not fully supported yet, are there any available development 
> > > branches 
> > > or patches that implement this functionality?
> > > 
> > > 3.Are there any guidelines or considerations for configuring and testing 
> > > CXL memory pooling in QEMU?
> > 
> > There is some information in that patch series cover letter.
> >
> 
> The attached series implements an MHSLD, but implementing the pooling
> mechanism (i.e. fabric manager logic) is left to the imagination of the
> reader.   You will want to look at Fan Ni's DCD patch set to understand
> the QMP Add/Remove logic for DCD capacity.  This patch set just enables
> you to manage 2+ QEMU Guests sharing a DCD State in shared memory.
> 
> So you'll have to send DCD commands individual guest QEMU via QMP, but
> the underlying logic manages the shared state via locks to emulate real
> MHSLD behavior.
>  QMP|---> Host 1 v
>[FM]-|  [Shared State]
>QMP|---> Host 2 ^
> 
> This differs from a real DCD in that a real DCD is a single endpoint for
> management, rather than N endpoints (1 per vm).
> 
>   |---> Host 1
> [FM] ---> [DCD] --|
> |---> Host 2
> 
> However this is an implementation detail on the FM side, so I chose to
> do it this way to simplify the QEMU MHSLD implementation.  There's far
> fewer interactions this way - with the downside that having one of the
> hosts manage the shared state isn't possible via the current emulation.
> 
> It could probably be done, but I'm not sure what value it has since the
> FM implementation difference is a matter a small amount of python.
> 
> It's been a while since I played with this patch set and I do not have a
> reference pooling manager available to me any longer unfortunately. But
> I'm happy to provide some guidance where I can.
> 
> ~Gregory

Re: [PATCH v4 02/13] memory: Change memory_region_set_ram_discard_manager() to return the result

2025-04-10 Thread Alexey Kardashevskiy





On 7/4/25 17:49, Chenyi Qiang wrote:

Modify memory_region_set_ram_discard_manager() to return false if a
RamDiscardManager is already set in the MemoryRegion. The caller must
handle this failure, such as having virtio-mem undo its actions and fail
the realize() process. Opportunistically move the call earlier to avoid
complex error handling.

This change is beneficial when introducing a new RamDiscardManager
instance besides virtio-mem. After
ram_block_coordinated_discard_require(true) unlocks all
RamDiscardManager instances, only one instance is allowed to be set for
a MemoryRegion at present.

Suggested-by: David Hildenbrand 
Signed-off-by: Chenyi Qiang 
---
Changes in v4:
 - No change.

Changes in v3:
 - Move set_ram_discard_manager() up to avoid a g_free()
 - Clean up set_ram_discard_manager() definition

Changes in v2:
 - newly added.
---
  hw/virtio/virtio-mem.c | 29 -
  include/exec/memory.h  |  6 +++---
  system/memory.c| 10 +++---
  3 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 21f16e4912..d0d3a0240f 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -1049,6 +1049,17 @@ static void virtio_mem_device_realize(DeviceState *dev, 
Error **errp)
  return;
  }
  
+/*

+ * Set ourselves as RamDiscardManager before the plug handler maps the
+ * memory region and exposes it via an address space.
+ */
+if (memory_region_set_ram_discard_manager(&vmem->memdev->mr,
+  RAM_DISCARD_MANAGER(vmem))) {
+error_setg(errp, "Failed to set RamDiscardManager");
+ram_block_coordinated_discard_require(false);
+return;
+}
+
  /*
   * We don't know at this point whether shared RAM is migrated using
   * QEMU or migrated using the file content. "x-ignore-shared" will be


Right after the end of this comment block, don't you want
memory_region_set_generic_state_manager(..., NULL)?




@@ -1124,13 +1135,6 @@ static void virtio_mem_device_realize(DeviceState *dev, 
Error **errp)
  vmem->system_reset = VIRTIO_MEM_SYSTEM_RESET(obj);
  vmem->system_reset->vmem = vmem;
  qemu_register_resettable(obj);
-
-/*
- * Set ourselves as RamDiscardManager before the plug handler maps the
- * memory region and exposes it via an address space.
- */
-memory_region_set_ram_discard_manager(&vmem->memdev->mr,
-  RAM_DISCARD_MANAGER(vmem));
  }
  
  static void virtio_mem_device_unrealize(DeviceState *dev)

@@ -1138,12 +1142,6 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
  VirtIODevice *vdev = VIRTIO_DEVICE(dev);
  VirtIOMEM *vmem = VIRTIO_MEM(dev);
  
-/*

- * The unplug handler unmapped the memory region, it cannot be
- * found via an address space anymore. Unset ourselves.
- */
-memory_region_set_ram_discard_manager(&vmem->memdev->mr, NULL);
-
  qemu_unregister_resettable(OBJECT(vmem->system_reset));
  object_unref(OBJECT(vmem->system_reset));
  
@@ -1156,6 +1154,11 @@ static void virtio_mem_device_unrealize(DeviceState *dev)

  virtio_del_queue(vdev, 0);
  virtio_cleanup(vdev);
  g_free(vmem->bitmap);
+/*
+ * The unplug handler unmapped the memory region, it cannot be
+ * found via an address space anymore. Unset ourselves.
+ */
+memory_region_set_ram_discard_manager(&vmem->memdev->mr, NULL);
  ram_block_coordinated_discard_require(false);
  }
  
diff --git a/include/exec/memory.h b/include/exec/memory.h

index 3bebc43d59..390477b588 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2487,13 +2487,13 @@ static inline bool 
memory_region_has_ram_discard_manager(MemoryRegion *mr)
   *
   * This function must not be called for a mapped #MemoryRegion, a 
#MemoryRegion
   * that does not cover RAM, or a #MemoryRegion that already has a
- * #RamDiscardManager assigned.
+ * #RamDiscardManager assigned. Return 0 if the rdm is set successfully.
   *
   * @mr: the #MemoryRegion
   * @rdm: #RamDiscardManager to set
   */
-void memory_region_set_ram_discard_manager(MemoryRegion *mr,
-   RamDiscardManager *rdm);
+int memory_region_set_ram_discard_manager(MemoryRegion *mr,
+  RamDiscardManager *rdm);
  
  /**

   * memory_region_find: translate an address/size relative to a
diff --git a/system/memory.c b/system/memory.c
index b17b5538ff..62d6b410f0 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2115,12 +2115,16 @@ RamDiscardManager 
*memory_region_get_ram_discard_manager(MemoryRegion *mr)
  return mr->rdm;
  }
  
-void memory_region_set_ram_discard_manager(MemoryRegion *mr,

-   RamDiscardManager *rdm)
+int memory_region_set_ram_discard_manager(MemoryRegion *mr,
+

[PATCH v2 1/2] vfio/spapr: Enhance error handling in vfio_spapr_create_window()

2025-04-10 Thread Amit Machhiwal

Introduce an Error ** parameter to vfio_spapr_create_window() to enable
structured error reporting. This allows the function to propagate
detailed errors back to callers.

Suggested-by: Cédric Le Goater 
Signed-off-by: Amit Machhiwal 
---
 hw/vfio/spapr.c | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c
index 1a5d1611f2cd..4f2858b43f36 100644
--- a/hw/vfio/spapr.c
+++ b/hw/vfio/spapr.c
@@ -232,7 +232,7 @@ static int vfio_spapr_remove_window(VFIOContainer 
*container,
 
 static int vfio_spapr_create_window(VFIOContainer *container,
 MemoryRegionSection *section,
-hwaddr *pgsize)
+hwaddr *pgsize, Error **errp)
 {
 int ret = 0;
 VFIOContainerBase *bcontainer = &container->bcontainer;
@@ -252,10 +252,10 @@ static int vfio_spapr_create_window(VFIOContainer 
*container,
 pgmask = bcontainer->pgsizes & (pagesize | (pagesize - 1));
 pagesize = pgmask ? (1ULL << (63 - clz64(pgmask))) : 0;
 if (!pagesize) {
-error_report("Host doesn't support page size 0x%"PRIx64
- ", the supported mask is 0x%lx",
- memory_region_iommu_get_min_page_size(iommu_mr),
- bcontainer->pgsizes);
+error_setg(errp, "Host doesn't support page size 0x%"PRIx64
+   ", the supported mask is 0x%lx",
+   memory_region_iommu_get_min_page_size(iommu_mr),
+   bcontainer->pgsizes);
 return -EINVAL;
 }
 
@@ -302,16 +302,16 @@ static int vfio_spapr_create_window(VFIOContainer 
*container,
 }
 }
 if (ret) {
-error_report("Failed to create a window, ret = %d (%m)", ret);
+error_setg_errno(errp, -ret, "Failed to create a window, ret = %d 
(%m)", ret);
 return -errno;
 }
 
 if (create.start_addr != section->offset_within_address_space) {
 vfio_spapr_remove_window(container, create.start_addr);
 
-error_report("Host doesn't support DMA window at %"HWADDR_PRIx", must 
be %"PRIx64,
- section->offset_within_address_space,
- (uint64_t)create.start_addr);
+error_setg(errp, "Host doesn't support DMA window at %"HWADDR_PRIx
+   ", must be %"PRIx64, section->offset_within_address_space,
+   (uint64_t)create.start_addr);
 return -EINVAL;
 }
 trace_vfio_spapr_create_window(create.page_shift,
@@ -334,6 +334,7 @@ vfio_spapr_container_add_section_window(VFIOContainerBase 
*bcontainer,
   container);
 VFIOHostDMAWindow *hostwin;
 hwaddr pgsize = 0;
+Error *local_err = NULL;
 int ret;
 
 /*
@@ -377,9 +378,9 @@ vfio_spapr_container_add_section_window(VFIOContainerBase 
*bcontainer,
 }
 }
 
-ret = vfio_spapr_create_window(container, section, &pgsize);
+ret = vfio_spapr_create_window(container, section, &pgsize, &local_err);
 if (ret) {
-error_setg_errno(errp, -ret, "Failed to create SPAPR window");
+error_propagate(errp, local_err);
 return false;
 }
 

base-commit: 53f3a13ac1069975ad47cf8bd05cc96b4ac09962
-- 
2.49.0
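
The error-propagation shape introduced by this patch (the callee fills in an error object, the caller forwards it) can be sketched in isolation. The SketchError type and the sketch_* helpers below are hypothetical, heavily simplified stand-ins written for this illustration; they are not QEMU's Error API and omit most of its behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, minimal error object. */
typedef struct SketchError {
    char msg[128];
} SketchError;

/* Store a message in *errp unless the caller passed NULL. */
void sketch_error_set(SketchError **errp, const char *msg)
{
    if (!errp) {
        return; /* caller is not interested in details */
    }
    assert(*errp == NULL); /* an error must not be overwritten */
    *errp = calloc(1, sizeof(**errp));
    if (*errp) {
        snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
    }
}

/* Hand a locally collected error to the caller, or drop it. */
void sketch_error_propagate(SketchError **dst, SketchError *local)
{
    if (!local) {
        return;
    }
    if (dst && !*dst) {
        *dst = local;
    } else {
        free(local);
    }
}

/* Callee: reports the specific reason for the failure. */
int sketch_create_window(unsigned long long pgmask, SketchError **errp)
{
    if (!pgmask) {
        sketch_error_set(errp, "Host doesn't support the requested page size");
        return -EINVAL;
    }
    return 0;
}

/* Caller: forwards the detailed error instead of a generic message. */
bool sketch_add_section_window(unsigned long long pgmask, SketchError **errp)
{
    SketchError *local_err = NULL;

    if (sketch_create_window(pgmask, &local_err)) {
        sketch_error_propagate(errp, local_err);
        return false;
    }
    return true;
}
```

With this shape the top-level caller sees the specific reason (here, the page size mismatch) rather than only a generic "failed to create window" message.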

Re: [PATCH 00/11] qapi: Documentation improvements

2025-04-10 Thread Markus Armbruster

Queued for 10.0.

[PATCH v2 2/2] block/io: skip head/tail requests on EINVAL

2025-04-10 Thread Stefan Hajnoczi

When guests send misaligned discard requests, the block layer breaks
them up into a misaligned head, an aligned main body, and a misaligned
tail.

The file-posix block driver on Linux returns -EINVAL on misaligned
discard requests. This causes bdrv_co_pdiscard() to fail and guests
configured with werror=stop will pause.

Add a special case for misaligned head/tail requests. Simply continue
when EINVAL is encountered so that the aligned main body of the request
can be completed and the guest is not paused. This is the best we can do
when guest discard limits do not match the host discard limits.

Fixes: https://issues.redhat.com/browse/RHEL-86032
Signed-off-by: Stefan Hajnoczi 
---
 block/io.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 1ba8d1aeea..a0d0b31a3e 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3180,7 +3180,11 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, 
int64_t offset,
 }
 }
 if (ret && ret != -ENOTSUP) {
-goto out;
+if (ret == -EINVAL && (offset % align != 0 || num % align != 0)) {
+/* Silently skip rejected unaligned head/tail requests */
+} else {
+goto out; /* bail out */
+}
 }
 
 offset += num;
-- 
2.49.0
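
The head/body/tail handling described in the commit message can be modelled with a short sketch. This is a simplified illustration, not the code in bdrv_co_pdiscard(): the sketch_* functions are hypothetical, the driver is a stub that rejects every misaligned request with -EINVAL, and details such as -ENOTSUP handling and maximum request sizes are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical driver: rejects any request whose offset or length is not a
 * multiple of align, and otherwise counts the discarded bytes. */
static int sketch_driver_discard(int64_t offset, int64_t bytes, int64_t align,
                                 int64_t *discarded)
{
    if (offset % align != 0 || bytes % align != 0) {
        return -EINVAL;
    }
    *discarded += bytes;
    return 0;
}

/* Split [offset, offset + bytes) into a misaligned head, an aligned body and
 * a misaligned tail; skip head/tail pieces that the driver rejects. */
int sketch_discard(int64_t offset, int64_t bytes, int64_t align,
                   int64_t *discarded)
{
    assert(align > 0 && offset >= 0 && bytes >= 0);

    int64_t head = offset % align;
    int64_t tail = (offset + bytes) % align;

    while (bytes > 0) {
        int64_t num = bytes;
        int ret;

        if (head) {
            /* Small request up to the next alignment boundary. */
            int64_t to_boundary = align - head;
            num = bytes < to_boundary ? bytes : to_boundary;
            head = (head + num) % align;
        } else if (tail && num > align) {
            /* Shorten the request so it ends on an aligned boundary. */
            num -= tail;
        }

        ret = sketch_driver_discard(offset, num, align, discarded);
        if (ret == -EINVAL && (offset % align != 0 || num % align != 0)) {
            ret = 0; /* silently skip a rejected misaligned head/tail */
        }
        if (ret) {
            return ret;
        }

        offset += num;
        bytes -= num;
    }
    return 0;
}
```

For example, with an alignment of 4096, a request at offset 1000 of length 10000 is split into a 3096-byte head (skipped), a 4096-byte aligned body (discarded) and a 2808-byte tail (skipped), and the overall call still succeeds.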

Re: [PATCH v2] Revert "virtio-net: Copy received header to buffer"

2025-04-10 Thread Stefan Hajnoczi

On Tue, Apr 8, 2025 at 10:55 AM Antoine Damhet  wrote:
>
> This reverts commit 7987d2be5a8bc3a502f89ba8cf3ac3e09f64d1ce.
>
> The goal was to remove the need to patch the (const) input buffer
> with a recomputed UDP checksum by copying headers to a RW region and
> inject the checksum there. The patch computed the checksum only from the
> header fields (missing the rest of the payload) producing an invalid one
> and making guests fail to acquire a DHCP lease.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2727
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Antoine Damhet 
> ---
> v2: Rebased on master due to conflict with c17ad4b11bd2 (
> "virtio-net: Fix num_buffers for version 1")

Michael: Please review this and send a pull request for 10.0 (-rc4
will be tagged on Tuesday). There was a conflict so this is not a
mechanical revert.

Thanks!
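
As background for the bug described in the quoted commit message (a checksum computed over the headers only), here is a minimal sketch of the 16-bit ones' complement Internet checksum. It is an illustration under stated assumptions: sketch_inet_checksum() is a hypothetical helper, not QEMU's net_checksum_calculate(), and the UDP pseudo-header is omitted. The point is only that a checksum over a prefix of the data generally differs from the checksum over all of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones' complement sum of big-endian 16-bit words, then inverted. */
uint16_t sketch_inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    }
    if (len & 1) {
        sum += (uint32_t)data[len - 1] << 8; /* pad the odd byte with zero */
    }
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);  /* fold the carries back in */
    }
    return (uint16_t)~sum;
}

/* Usage: checksumming only the first bytes gives a different result from
 * checksumming the whole buffer. */
void sketch_checksum_demo(void)
{
    const uint8_t buf[8] = { 0x00, 0x01, 0xf2, 0x03, 0xf4, 0xf5, 0xf6, 0xf7 };

    assert(sketch_inet_checksum(buf, sizeof(buf)) !=
           sketch_inet_checksum(buf, 4));
}
```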

>
>  hw/net/virtio-net.c | 87 +
>  1 file changed, 40 insertions(+), 47 deletions(-)
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 340c6b642224..bd37651dabb0 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -1702,44 +1702,41 @@ static void virtio_net_hdr_swap(VirtIODevice *vdev, 
> struct virtio_net_hdr *hdr)
>   * cache.
>   */
>  static void work_around_broken_dhclient(struct virtio_net_hdr *hdr,
> -size_t *hdr_len, const uint8_t *buf,
> -size_t buf_size, size_t *buf_offset)
> +uint8_t *buf, size_t size)
>  {
>  size_t csum_size = ETH_HLEN + sizeof(struct ip_header) +
> sizeof(struct udp_header);
>
> -buf += *buf_offset;
> -buf_size -= *buf_offset;
> -
>  if ((hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) && /* missing csum */
> -(buf_size >= csum_size && buf_size < 1500) && /* normal sized MTU */
> +(size >= csum_size && size < 1500) && /* normal sized MTU */
>  (buf[12] == 0x08 && buf[13] == 0x00) && /* ethertype == IPv4 */
>  (buf[23] == 17) && /* ip.protocol == UDP */
>  (buf[34] == 0 && buf[35] == 67)) { /* udp.srcport == bootps */
> -memcpy((uint8_t *)hdr + *hdr_len, buf, csum_size);
> -net_checksum_calculate((uint8_t *)hdr + *hdr_len, csum_size, 
> CSUM_UDP);
> +net_checksum_calculate(buf, size, CSUM_UDP);
>  hdr->flags &= ~VIRTIO_NET_HDR_F_NEEDS_CSUM;
> -*hdr_len += csum_size;
> -*buf_offset += csum_size;
>  }
>  }
>
> -static size_t receive_header(VirtIONet *n, struct virtio_net_hdr *hdr,
> - const void *buf, size_t buf_size,
> - size_t *buf_offset)
> +static void receive_header(VirtIONet *n, const struct iovec *iov, int iov_cnt,
> +   const void *buf, size_t size)
>  {
> -size_t hdr_len = n->guest_hdr_len;
> -
> -memcpy(hdr, buf, sizeof(struct virtio_net_hdr));
> -
> -*buf_offset = n->host_hdr_len;
> -work_around_broken_dhclient(hdr, &hdr_len, buf, buf_size, buf_offset);
> +if (n->has_vnet_hdr) {
> +/* FIXME this cast is evil */
> +void *wbuf = (void *)buf;
> +work_around_broken_dhclient(wbuf, wbuf + n->host_hdr_len,
> +size - n->host_hdr_len);
>
> -if (n->needs_vnet_hdr_swap) {
> -virtio_net_hdr_swap(VIRTIO_DEVICE(n), hdr);
> +if (n->needs_vnet_hdr_swap) {
> +virtio_net_hdr_swap(VIRTIO_DEVICE(n), wbuf);
> +}
> +iov_from_buf(iov, iov_cnt, 0, buf, sizeof(struct virtio_net_hdr));
> +} else {
> +struct virtio_net_hdr hdr = {
> +.flags = 0,
> +.gso_type = VIRTIO_NET_HDR_GSO_NONE
> +};
> +iov_from_buf(iov, iov_cnt, 0, &hdr, sizeof hdr);
>  }
> -
> -return hdr_len;
>  }
>
>  static int receive_filter(VirtIONet *n, const uint8_t *buf, int size)
> @@ -1907,13 +1904,6 @@ static int virtio_net_process_rss(NetClientState *nc, const uint8_t *buf,
>  return (index == new_index) ? -1 : new_index;
>  }
>
> -typedef struct Header {
> -struct virtio_net_hdr_v1_hash virtio_net;
> -struct eth_header eth;
> -struct ip_header ip;
> -struct udp_header udp;
> -} Header;
> -
>  static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
>size_t size)
>  {
> @@ -1923,15 +1913,15 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
>  VirtQueueElement *elems[VIRTQUEUE_MAX_SIZE];
>  size_t lens[VIRTQUEUE_MAX_SIZE];
>  struct iovec mhdr_sg[VIRTQUEUE_MAX_SIZE];
> -Header hdr;
> +struct virtio_net_hdr_v1_hash extra_hdr;
>  unsigned mhdr_cnt = 0;
>  size_t offset, i, guest_offset, j;
>  ssize_t err;
>
> -memset(&hdr.virtio_net, 0, sizeof(hdr.virtio_net));
> +memset(&extra_hdr, 0, sizeof(extra_hdr));
>
>  if (n->rss_data.enabled && n

Re: [PATCH v6] hw/misc/vmfwupdate: Introduce hypervisor fw-cfg interface support

2025-04-10 Thread Gerd Hoffmann

On Thu, Apr 10, 2025 at 12:01:18PM +0530, Ani Sinha wrote:
> 
> 
> > On 9 Apr 2025, at 11:51 AM, Gerd Hoffmann wrote:
> > 
> >  Hi,
> > 
> >>> The chicken-and-egg problem arises if you go for hashing and want to embed
> >>> the igvm file in the UKI.
> >> 
> >> I don't really see how signing the IGVM file for secure boot helps 
> >> anything.
> > 
> > It doesn't help indeed.  This comes from the original idea by Alex to
> > simply add a firmware image to the UKI.  In that case the firmware is
> > covered by the signature / hash, even though it is not needed.  Quite
> > the contrary, it complicates things when we want to ship db/dbx in the
> > firmware image.
> > 
> > So most likely the firmware will not be part of the main UKI.  Options
> > for alternatives are using UKI add-ons,
> 
> But add-ons are also subjected to signature verification. How does not using 
> the main UKI help?

For the first boot, using secure boot doesn't help much: the trust
in the firmware being loaded for the second boot is established via
launch measurement, not a secure boot signature.

For the second boot (with new firmware) you don't need the add-on any
more.

The main advantage of wrapping the igvm into a uki add-on is that we
can easily use the hwid matching support of systemd-stub when packaging
multiple firmware variants (aws, azure, gcp, qemu, ...).  Not sure this
will actually matter in practice though.

take care,
  Gerd

[PULL 3/4] scsi-disk: Apply error policy for host_status errors again

2025-04-10 Thread Kevin Wolf

Originally, all failed SG_IO requests called scsi_handle_rw_error() to
apply the configured error policy. However, commit f3126d65, which was
supposed to be a mere refactoring for scsi-disk.c, broke this and
accidentally completed the SCSI request without considering the error
policy any more if the error was signalled in the host_status field.

Apart from the commit message not describing the change as intended,
errors indicated in host_status are also obviously backend errors and
not something the guest must deal with independently of the error
policy.

This behaviour means that some recoverable errors (such as a path error
in multipath configurations) were reported to the guest anyway, which
might not expect it and might consider its disk broken.

Make sure that we apply the error policy again for host_status errors,
too. This addresses an existing FIXME comment and allows us to remove
some comments warning that callbacks weren't always called. With this
fix, they are called in all cases again.

The return value passed to the request callback doesn't have more free
values that could be used to indicate host_status errors as well as SAM
status codes and negative errno. Store the value in the host_status
field of the SCSIRequest instead and use -ENODEV as the return value (if
a path hasn't been reachable for a while, blk_aio_ioctl() will return
-ENODEV instead of just setting host_status, so just reuse it here -
it's not necessarily entirely accurate, but it's as good as any errno).
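
The scheme in the paragraph above can be sketched with simplified stand-in types. This is an illustration under stated assumptions, not the code from scsi-disk.c: `struct toy_req`, `enum toy_policy`, `toy_handle_error` and the `reported_host_status` field (standing in for the effect of `scsi_req_complete_failed()`) are all hypothetical, and the return value here only means "the error was reported to the guest".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins; these are not QEMU's real types. */
enum toy_policy { TOY_POLICY_REPORT, TOY_POLICY_IGNORE, TOY_POLICY_STOP };

struct toy_req {
    int16_t host_status;          /* -1 means "no host_status error" */
    int16_t reported_host_status; /* what the guest would see, -1 if nothing */
};

/*
 * The callback return value has no spare range for host_status codes, so
 * the completion path passes -ENODEV and leaves the real code in the
 * request. The handler picks it up, resets the field (the request may
 * still complete successfully under 'ignore' or 'stop'), and only reports
 * it when the policy says so.
 */
static bool toy_handle_error(struct toy_req *r, int ret, enum toy_policy p)
{
    int16_t host_status = r->host_status;

    if (host_status != -1) {
        assert(ret == -ENODEV);
        r->host_status = -1;
    }
    if (p != TOY_POLICY_REPORT) {
        return false; /* ignored, or VM stopped for a later retry */
    }
    if (host_status != -1) {
        r->reported_host_status = host_status;
    }
    return true;
}
```

With an 'ignore' or 'stop' policy the guest never sees the host_status error; with 'report' it receives the original host_status rather than a status derived from -ENODEV.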

Cc: qemu-sta...@nongnu.org
Fixes: f3126d65b393 ('scsi: move host_status handling into SCSI drivers')
Signed-off-by: Kevin Wolf 
Message-ID: <20250407155949.44736-1-kw...@redhat.com>
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Hanna Czenczek 
Signed-off-by: Kevin Wolf 
---
 hw/scsi/scsi-disk.c | 39 +--
 1 file changed, 25 insertions(+), 14 deletions(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 8da1d5a77c..e59632e9b1 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -68,10 +68,9 @@ struct SCSIDiskClass {
 SCSIDeviceClass parent_class;
 /*
  * Callbacks receive ret == 0 for success. Errors are represented either as
- * negative errno values, or as positive SAM status codes.
- *
- * Beware: For errors returned in host_status, the function may directly
- * complete the request and never call the callback.
+ * negative errno values, or as positive SAM status codes. For host_status
+ * errors, the function passes ret == -ENODEV and sets the host_status field
+ * of the SCSIRequest.
  */
 DMAIOFunc   *dma_readv;
 DMAIOFunc   *dma_writev;
@@ -225,11 +224,26 @@ static bool scsi_handle_rw_error(SCSIDiskReq *r, int ret, bool acct_failed)
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
 SCSIDiskClass *sdc = (SCSIDiskClass *) object_get_class(OBJECT(s));
 SCSISense sense = SENSE_CODE(NO_SENSE);
+int16_t host_status;
 int error;
 bool req_has_sense = false;
 BlockErrorAction action;
 int status;
 
+/*
+ * host_status should only be set for SG_IO requests that came back with a
+ * host_status error in scsi_block_sgio_complete(). This error path passes
+ * -ENODEV as the return value.
+ *
+ * Reset host_status in the request because we may still want to complete
+ * the request successfully with the 'stop' or 'ignore' error policy.
+ */
+host_status = r->req.host_status;
+if (host_status != -1) {
+assert(ret == -ENODEV);
+r->req.host_status = -1;
+}
+
 if (ret < 0) {
 status = scsi_sense_from_errno(-ret, &sense);
 error = -ret;
@@ -289,6 +303,10 @@ static bool scsi_handle_rw_error(SCSIDiskReq *r, int ret, bool acct_failed)
 if (acct_failed) {
 block_acct_failed(blk_get_stats(s->qdev.conf.blk), &r->acct);
 }
+if (host_status != -1) {
+scsi_req_complete_failed(&r->req, host_status);
+return true;
+}
 if (req_has_sense) {
 sdc->update_sense(&r->req);
 } else if (status == CHECK_CONDITION) {
@@ -409,7 +427,6 @@ done:
 scsi_req_unref(&r->req);
 }
 
-/* May not be called in all error cases, don't rely on cleanup here */
 static void scsi_dma_complete(void *opaque, int ret)
 {
 SCSIDiskReq *r = (SCSIDiskReq *)opaque;
@@ -448,7 +465,6 @@ done:
 scsi_req_unref(&r->req);
 }
 
-/* May not be called in all error cases, don't rely on cleanup here */
 static void scsi_read_complete(void *opaque, int ret)
 {
 SCSIDiskReq *r = (SCSIDiskReq *)opaque;
@@ -585,7 +601,6 @@ done:
 scsi_req_unref(&r->req);
 }
 
-/* May not be called in all error cases, don't rely on cleanup here */
 static void scsi_write_complete(void * opaque, int ret)
 {
 SCSIDiskReq *r = (SCSIDiskReq *)opaque;
@@ -2846,14 +2861,10 @@ static void scsi_block_sgio_complete(void *opaque, int ret)
 sg_io_hdr_t *io_hdr = &r

RE: [PATCH v2] hw/i2c/aspeed: Fix wrong I2CC_DMA_LEN when I2CM_DMA_TX/RX_ADDR set first

2025-04-10 Thread Jamin Lin

Hi Cedric, 

> Subject: Re: [PATCH v2] hw/i2c/aspeed: Fix wrong I2CC_DMA_LEN when
> I2CM_DMA_TX/RX_ADDR set first
> 
> Hello,
> 
> On 4/9/25 11:10, Jamin Lin wrote:
> > Hi Cedric,
> >
> > After discussing with the I2C hardware designers, we confirmed that the I2C
> > design in AST2600 and AST2700 A1 is the same.
> > The datasheet will be updated accordingly for AST2700.
> >
> > However, please note that bit 15 and bit 31 are not available on AST2700 A0,
> > and the firmware does not set either bit 15 or bit 31.
> > AST2700 A0 was an engineering sample version. Given this, I plan to resend
> > the v3 patch with AST2700 A0 explicitly marked as unsupported.
> 
> Are you going to introduce an I2C property to distinguish the
> A0 implementation from the A1?
> 

No. I want the I2C model to be as simple as possible, without adding any
"AST2700 A0"-specific changes.
The V1 patch works for AST2700 A1, AST2600, AST2500, AST1030, and AST2400, but
not for AST2700 A0.

> > I prefer not to introduce workarounds specifically for AST2700 A0, as it is
> > not production-grade silicon.
> 
> We can deprecate the AST2700 A0 machine too (see 56a37eda93ed) and
> change the ast2700-evb machine alias.


Will do.

> 
> > I will resend the v3 patch with the same content as v1, since the only
> > issue in v1 was a functional test failure on AST2700 A0.
> > Apologies for the inconvenience, and thank you for your understanding.
> 
> That's ok. Nothing is merged. Let's get it right first.
> 

Thanks-Jamin

> Thanks,
> 
> C.
> 
> 
> 
> >
> > Thanks-Jamin
> >
> >> -----Original Message-----
> >> From: Jamin Lin 
> >> Sent: Tuesday, April 8, 2025 4:25 PM
> >> To: Cédric Le Goater ; Peter Maydell
> >> ; Steven Lee ;
> >> Troy Lee ; Andrew Jeffery
> >> ; Joel Stanley ; open
> >> list:ASPEED BMCs ; open list:All patches CC here
> >> 
> >> Cc: Jamin Lin ; Troy Lee
> >> 
> >> Subject: [PATCH v2] hw/i2c/aspeed: Fix wrong I2CC_DMA_LEN when
> >> I2CM_DMA_TX/RX_ADDR set first
> >>
> >> In the previous design, the I2C model would update I2CC_DMA_LEN
> >> (0x54) based on the value of I2CM_DMA_LEN (0x1C) when the firmware
> >> set either I2CM_DMA_TX_ADDR
> >> (0x30) or I2CM_DMA_RX_ADDR (0x34). However, this only worked
> >> correctly if the firmware set I2CM_DMA_LEN before setting
> >> I2CM_DMA_TX_ADDR or I2CM_DMA_RX_ADDR.
> >>
> >> If the firmware instead set I2CM_DMA_TX_ADDR or I2CM_DMA_RX_ADDR
> >> before setting I2CM_DMA_LEN, the value written to I2CC_DMA_LEN would
> >> be incorrect.
> >>
> >> Ideally, this issue should be resolved by updating the model to set
> >> I2CC_DMA_LEN (0x54) when the firmware writes to the I2CM_DMA_LEN
> >> (0x1C) register, instead of when it writes to I2CM_DMA_TX_ADDR (0x30)
> >> or I2CM_DMA_RX_ADDR (0x34).
> >>
> >> Originally, the design of I2CM_DMA_LEN (0x1C) included buffer length
> >> write-enable bits for the current command:
> >> Bit 31 enabled the RX buffer length update
> >> Bit 15 enabled the TX buffer length update
> >>
> >> In other words, when the firmware set either bit 31 or bit 15, the
> >> I2C model could safely update I2CC_DMA_LEN (0x54) with the value in
> >> I2CM_DMA_LEN (0x1C).
> >>
> >> However, starting with the AST2700, the design of the I2CM_DMA_LEN
> >> (0x1C) register was changed. The write-enable bits (bit 31 and bit
> >> 15) were removed, meaning there is no longer an explicit indication
> >> of whether the firmware intends to update the TX or RX length.
> >>
> >> As a result, on AST2700 and newer SoCs, the model cannot reliably
> >> determine whether a write to I2CM_DMA_LEN was meant for TX or RX.
> >> This ambiguity is especially problematic when the value written is 0,
> >> which actually corresponds to a DMA length of 1.
> >>
> >> To ensure consistent behavior across all SoCs, the model now updates
> >> I2CC_DMA_LEN when I2CM_CMD (0x18) is written, as this is the final
> >> command that initiates a TX or RX transfer and reflects the
> >> firmware’s intent more clearly.
> >>
> >> Signed-off-by: Jamin Lin 
> >> Fixes: ba2cccd (aspeed: i2c: Add new mode support)
> >> ---
> >>   hw/i2c/aspeed_i2c.c | 18 ++
> >>   1 file changed, 14 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/hw/i2c/aspeed_i2c.c b/hw/i2c/aspeed_i2c.c
> >> index a8fbb9f44a..c659099e9a 100644
> >> --- a/hw/i2c/aspeed_i2c.c
> >> +++ b/hw/i2c/aspeed_i2c.c
> >> @@ -634,6 +634,20 @@ static void aspeed_i2c_bus_new_write(AspeedI2CBus *bus, hwaddr offset,
> >>   break;
> >>   }
> >>
> >> +/
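
The approach described in the quoted commit message (update I2CC_DMA_LEN when I2CM_CMD is written, rather than when an address register is written) can be sketched as a toy register model. This is only an illustration: the register offsets are the ones given in the commit message, but `struct toy_bus`, `toy_write` and `toy_read` are hypothetical, the raw I2CM_DMA_LEN value is copied as-is, and no real bit layout (TX/RX fields, command bits) is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets as stated in the commit message. */
#define I2CM_CMD         0x18
#define I2CM_DMA_LEN     0x1c
#define I2CM_DMA_TX_ADDR 0x30
#define I2CM_DMA_RX_ADDR 0x34
#define I2CC_DMA_LEN     0x54

/* Hypothetical toy model: a flat array of 32-bit registers. */
struct toy_bus {
    uint32_t regs[0x60 / 4];
};

static uint32_t toy_read(const struct toy_bus *bus, uint32_t offset)
{
    assert(offset < sizeof(bus->regs) && (offset & 3) == 0);
    return bus->regs[offset / 4];
}

static void toy_write(struct toy_bus *bus, uint32_t offset, uint32_t value)
{
    assert(offset < sizeof(bus->regs) && (offset & 3) == 0);
    bus->regs[offset / 4] = value;

    if (offset == I2CM_CMD) {
        /*
         * Latch the length at command time. The command is the last write
         * before a transfer starts, so the result no longer depends on
         * whether the length or the address was programmed first.
         */
        bus->regs[I2CC_DMA_LEN / 4] = bus->regs[I2CM_DMA_LEN / 4];
    }
}
```

Writing I2CM_DMA_LEN before I2CM_DMA_TX_ADDR, or the other way round, leaves the same value in I2CC_DMA_LEN once I2CM_CMD has been written.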
