date:20231121

Re: [PATCH-for-8.2?] hw/acpi/erst: Do not ignore Error* in realize handler

2023-11-21 Thread Markus Armbruster

Philippe Mathieu-Daudé  writes:

> erst_realizefn() calls functions which could update the 'errp'
> argument, but then ignores it.

To be precise: it ignores failure.  Suggest to clarify the commit
message like this:

  erst_realizefn() passes @errp to functions without checking for
  failure.  If it runs into another failure, it trips error_setv()'s
  assertion.

>Use the ERRP_GUARD() macro and
> check *errp, as suggested in commit ae7c80a7bd ("error: New macro
> ERRP_GUARD()").
>
> Cc: qemu-sta...@nongnu.org
> Fixes: f7e26ffa59 ("ACPI ERST: support for ACPI ERST feature")
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/acpi/erst.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/hw/acpi/erst.c b/hw/acpi/erst.c
> index 35007d8017..ba751dc60e 100644
> --- a/hw/acpi/erst.c
> +++ b/hw/acpi/erst.c
> @@ -947,6 +947,7 @@ static const VMStateDescription erst_vmstate  = {
>  
>  static void erst_realizefn(PCIDevice *pci_dev, Error **errp)
>  {
> +ERRP_GUARD();
>  ERSTDeviceState *s = ACPIERST(pci_dev);
>  
>  trace_acpi_erst_realizefn_in();
> @@ -964,9 +965,15 @@ static void erst_realizefn(PCIDevice *pci_dev, Error 
> **errp)
>  
>  /* HostMemoryBackend size will be multiple of PAGE_SIZE */
>  s->storage_size = object_property_get_int(OBJECT(s->hostmem), "size", 
> errp);
> +if (*errp) {
> +return;
> +}
>  
>  /* Initialize backend storage and record_count */
>  check_erst_backend_storage(s, errp);
> +if (*errp) {
> +return;
> +}

If you change check_erst_backend_storage() to return bool, you can use

   if (!check_erst_backend_storage(s, errp)) {
       return;
   }

Not a demand.

>  
>  /* BAR 0: Programming registers */
>  memory_region_init_io(&s->iomem_mr, OBJECT(pci_dev), &erst_reg_ops, s,
> @@ -977,6 +984,9 @@ static void erst_realizefn(PCIDevice *pci_dev, Error 
> **errp)
>  memory_region_init_ram(&s->exchange_mr, OBJECT(pci_dev),
>  "erst.exchange",
>  le32_to_cpu(s->header->record_size), errp);
> +if (*errp) {
> +return;
> +}

Likewise, with more callers to simplify.  Again, not a demand.

>  pci_register_bar(pci_dev, 1, PCI_BASE_ADDRESS_SPACE_MEMORY,
>  &s->exchange_mr);

With the commit message clarified:
Reviewed-by: Markus Armbruster
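
For readers following the thread, the failure mode and the suggested bool-returning style can be sketched in standalone C. This is only an illustration under assumptions: the Error type, the sketch_* names and the two "steps" are hypothetical stand-ins, not QEMU's actual API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Error object; not the real type. */
typedef struct Error {
    const char *msg;
} Error;

static Error err_size = { "cannot read backend size" };
static Error err_storage = { "backend storage check failed" };

/*
 * Store an error in *errp. Like the assertion mentioned in the review,
 * this asserts that no earlier error is still pending in *errp.
 */
static void sketch_error_set(Error **errp, Error *err)
{
    if (errp == NULL) {
        return; /* caller does not want error details */
    }
    assert(*errp == NULL);
    *errp = err;
}

/* Each step returns true on success, false after setting *errp. */
static bool sketch_get_size(bool fail, Error **errp)
{
    if (fail) {
        sketch_error_set(errp, &err_size);
        return false;
    }
    return true;
}

static bool sketch_check_storage(bool fail, Error **errp)
{
    if (fail) {
        sketch_error_set(errp, &err_storage);
        return false;
    }
    return true;
}

/* Returns how many steps completed; stops at the first failure. */
static int sketch_realize(bool fail_size, bool fail_storage, Error **errp)
{
    int done = 0;

    if (!sketch_get_size(fail_size, errp)) {
        return done;
    }
    done++;
    if (!sketch_check_storage(fail_storage, errp)) {
        return done;
    }
    done++;
    return done;
}
```

Because sketch_realize() returns right after the first failing step, the second step never sees a pending error, so the assertion in sketch_error_set() cannot fire.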

Re: [PATCH] system: Use &error_abort in memory_region_init_ram_[device_]ptr()

2023-11-21 Thread Markus Armbruster

Philippe Mathieu-Daudé  writes:

> If an unexpected error condition happens, we have to abort
> (&error_fatal is meant for expected errors).
>
> Suggested-by: Paolo Bonzini 
> Suggested-by: Markus Armbruster 
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Markus Armbruster

Re: [PATCH v6 01/21] backends/iommufd: Introduce the iommufd object

2023-11-21 Thread Cédric Le Goater


Hello Zhenzhong,


>>> Below are other gaps I can think of for now:
>>>
>>> Gaps:
>>> 1. dirty page sync, WIP (Joao)
>>> 2. p2p dma not supported yet.
>>> 3. fd passing with mdev not support ram discard(vfio-pci) as no way to know
>>> it's a mdev from a fd.
>>
>> Call the section Caveats maybe?
>
> Got it.


It looks like v7 should be ready by rc2 (next week). I would then merge
in vfio-next and wait a week before sending a QEMU-9.0 PR.

Thanks,

C.

[PATCH] tests/qtest: check the return value

2023-11-21 Thread Zhu Jun

These variables "ret" are never referenced in the code, thus
add check logic for the "ret"

Signed-off-by: Zhu Jun 
---
 tests/qtest/test-filter-mirror.c | 1 +
 tests/qtest/test-filter-redirector.c | 2 ++
 tests/qtest/virtio-net-test.c| 1 +
 3 files changed, 4 insertions(+)

diff --git a/tests/qtest/test-filter-mirror.c b/tests/qtest/test-filter-mirror.c
index adeada3eb8..f3865f7519 100644
--- a/tests/qtest/test-filter-mirror.c
+++ b/tests/qtest/test-filter-mirror.c
@@ -61,6 +61,7 @@ static void test_mirror(void)
 g_assert_cmpint(len, ==, sizeof(send_buf));
 recv_buf = g_malloc(len);
 ret = recv(recv_sock[0], recv_buf, len, 0);
+g_assert_cmpint(ret, ==, len);
 g_assert_cmpstr(recv_buf, ==, send_buf);
 
 g_free(recv_buf);
diff --git a/tests/qtest/test-filter-redirector.c 
b/tests/qtest/test-filter-redirector.c
index e72e3b7873..a77d5fd8ec 100644
--- a/tests/qtest/test-filter-redirector.c
+++ b/tests/qtest/test-filter-redirector.c
@@ -118,6 +118,7 @@ static void test_redirector_tx(void)
 g_assert_cmpint(len, ==, sizeof(send_buf));
 recv_buf = g_malloc(len);
 ret = recv(recv_sock, recv_buf, len, 0);
+g_assert_cmpint(ret, ==, len);
 g_assert_cmpstr(recv_buf, ==, send_buf);
 
 g_free(recv_buf);
@@ -185,6 +186,7 @@ static void test_redirector_rx(void)
 g_assert_cmpint(len, ==, sizeof(send_buf));
 recv_buf = g_malloc(len);
 ret = recv(backend_sock[0], recv_buf, len, 0);
+g_assert_cmpint(ret, ==, len);
 g_assert_cmpstr(recv_buf, ==, send_buf);
 
 close(send_sock);
diff --git a/tests/qtest/virtio-net-test.c b/tests/qtest/virtio-net-test.c
index fab5dd8b05..2df75c9780 100644
--- a/tests/qtest/virtio-net-test.c
+++ b/tests/qtest/virtio-net-test.c
@@ -91,6 +91,7 @@ static void tx_test(QVirtioDevice *dev,
 len = ntohl(len);
 
 ret = recv(socket, buffer, len, 0);
+g_assert_cmpint(ret, ==, len);
 g_assert_cmpstr(buffer, ==, "TEST");
 }
 
-- 
2.17.1
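
As a standalone illustration of why the recv() result is worth checking before comparing buffers, here is a minimal sketch using a local socket pair. The helper name sketch_roundtrip is hypothetical and is not part of the qtest code touched by this patch.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send msg (including its NUL terminator) over a local socket pair and
 * read it back into out. Returns the recv() result so the caller can
 * compare it with the expected length, or -1 if setup or send() failed.
 */
static ssize_t sketch_roundtrip(const char *msg, char *out, size_t out_size)
{
    int sv[2];
    size_t len = strlen(msg) + 1;
    ssize_t ret;

    if (len > out_size || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    if (send(sv[0], msg, len, 0) != (ssize_t)len) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    ret = recv(sv[1], out, len, 0);
    close(sv[0]);
    close(sv[1]);
    return ret;
}
```

A caller would compare the return value with the expected length first, and only then look at the buffer contents, which mirrors the order of the assertions added by the patch.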

Re: [PATCH-for-8.2?] hw/arm/fsl-imx: Do not ignore Error argument

2023-11-21 Thread Philippe Mathieu-Daudé


Hi Markus,

On 21/11/23 07:40, Markus Armbruster wrote:

> Peter Maydell  writes:
>
>> On Mon, 20 Nov 2023 at 11:51, Philippe Mathieu-Daudé  wrote:
>>>
>>> Both i.MX25 and i.MX6 SoC models ignore the Error argument when
>>> setting the PHY number. Pick &error_abort which is the error
>>> used by the i.MX7 SoC (see commit 1f7197deb0 "ability to change
>>> the FEC PHY on i.MX7 processor").
>>>
>>> Fixes: 74c1330582 ("ability to change the FEC PHY on i.MX25 processor")
>>> Fixes: a9c167a3c4 ("ability to change the FEC PHY on i.MX6 processor")
>>> Signed-off-by: Philippe Mathieu-Daudé 
>>> ---
>>
>> Applied to target-arm.next, thanks.
>
> With or without my commit message clarification?


I didn't get your email on this patch, but per the other
ones on similar fixes:
https://lore.kernel.org/all/87cyw3mu4r@pond.sub.org/
https://lore.kernel.org/all/87il5vlemo@pond.sub.org/
I assume you want:

  Both i.MX25 and i.MX6 SoC models ignore the Error argument when
  setting the PHY number with object_property_set_uint(). If that call
  sets @errp, the subsequent sysbus_realize() call using the same @errp
  might trigger an assertion in error_setv().

  Pick &error_abort which is the error used by the i.MX7 SoC (see
  commit 1f7197deb0 "ability to change the FEC PHY on i.MX7 processor").

If that is OK with you, Peter, do you mind updating the description?

Thanks both!

Phil.

Re: [PATCH] tests/qtest: check the return value

2023-11-21 Thread Thomas Huth


On 21/11/2023 09.08, Zhu Jun wrote:

> These variables "ret" are never referenced in the code, thus
> add check logic for the "ret"
>
> Signed-off-by: Zhu Jun 


Thanks!

Reviewed-by: Thomas Huth 

I'll queue it for my next pull request.

[PULL SUBSYSTEM qemu-pseries] pseries: Update SLOF firmware image

2023-11-21 Thread Alexey Kardashevskiy

The following changes since commit af9264da80073435fd78944bc5a46e695897d7e5:

  Merge tag '20231119-xtensa-1' of https://github.com/OSLL/qemu-xtensa into 
staging (2023-11-20 05:25:19 -0500)

are available in the Git repository at:

  g...@github.com:aik/qemu.git tags/qemu-slof-20231121

for you to fetch changes up to b6838bf9c01c32bfecd5c446c98e788bbfd467d9:

  pseries: Update SLOF firmware image (2023-11-21 19:11:31 +1100)


Alexey Kardashevskiy (1):
  pseries: Update SLOF firmware image

 pc-bios/README   |   2 +-
 pc-bios/slof.bin | Bin 995176 -> 995000 bytes
 roms/SLOF|   2 +-
 3 files changed, 2 insertions(+), 2 deletions(-)


*** Note: this is not for master, this is for pseries

Just an update, nothing major. Thanks everyone for keeping
an eye on this!

Compiled with  gcc-12.1.0-nolibc

Tested with:
 /home/aik/b/q-slof/qemu-system-ppc64 \
-nodefaults \
-chardev stdio,id=STDIO0,signal=off,mux=on \
-device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \
-mon id=MON0,chardev=STDIO0,mode=readline \
-nographic \
-vga none \
-m 4G \
-kernel /home/aik/t/vml4150le \
-initrd /home/aik/t/le.cpio \
-machine 
pseries,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken,cap-ccf-assist=off \
-bios pc-bios/slof.bin \
-trace events=/home/aik/qemu_trace_events \
-d guest_errors \
-chardev socket,id=SOCKET0,server=on,wait=off,path=qemu.mon.60616 \
-mon chardev=SOCKET0,mode=control \
-name 60616,debug-threads=on


The complete change log is:

Alexey Kardashevskiy (3):
  Remove ?PICK
  version: update to 20230918
  version: update to 20231121

Bernhard M. Wiedemann (1):
  Allow to override build date with SOURCE_DATE_EPOCH

Jordan Niethe (1):
  virtio-serial: Do not close stdout on quiesce

Kautuk Consul (1):
  virtio-serial: Make read and write methods report failure

Thomas Huth (10):
  lib/libnet/ipv6: Silence compiler warning from Clang
  Fix typos in the board-qemu folder
  Fix typos in the lib/libnet folder
  Fix typos in the remaining lib folders
  Fix typos in the slof folder
  Fix typos in the board-js2x folder
  Fix typos in the llfw folder
  Fix typos in the board-js2x folder
  Fix typos in the clients folder
  Fix remaining typos in various folders

Re: [PATCH v10 08/18] target/riscv: add rva22u64 profile definition

2023-11-21 Thread Daniel Henrique Barboza





On 11/21/23 05:13, Jerry Shih wrote:
> On Nov 3, 2023, at 21:46, Daniel Henrique Barboza  wrote:
>> +/*
>> + * RVA22U64 defines some 'named features' or 'synthetic extensions'
>> + * that are cache related: Za64rs, Zic64b, Ziccif, Ziccrse, Ziccamoa
>> + * and Zicclsm. We do not implement caching in QEMU so we'll consider
>> + * all these named features as always enabled.
>> + *
>
> Hi Daniel,
>
> If the cache related extensions are `ignored/assumed enabled`, why don't
> we export them in `riscv,isa`?

These aren't extensions, but 'named features'. They don't have a riscv,isa.
There's no DT bindings for them.

> If we try to check the RVA22 profile in linux kernel running with qemu, the
> isa string is not match RVA22 profile.

The kernel would check profile compatibility by matching the riscv,isa of the
actual extensions, as expected, but then it would need to check these 'named
features' in other fashion. For example, in patch 06, zic64b would be asserted
by checking if all block sizes are 64 bytes.

I agree that this is over-complicated and checking everything in riscv,isa
would make things easier. For now these named extensions don't have DT
bindings, thus we can't add them to the DT. The kernel doesn't seem to care
about their existence in the DT either.

TBH a better place for this discussion is the kernel mailing list. Thanks,

Daniel

> Thanks,
> Jerry
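
The "all block sizes are 64 bytes" check mentioned above could look roughly like the sketch below. The struct and its field names are assumptions made for illustration; the actual code in patch 06 is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of a CPU config; field names are assumptions. */
typedef struct SketchCpuCfg {
    uint16_t cbom_blocksize; /* cache-block management size, bytes */
    uint16_t cbop_blocksize; /* cache-block prefetch size, bytes */
    uint16_t cboz_blocksize; /* cache-block zero size, bytes */
} SketchCpuCfg;

/* The named feature holds only if every cache block size is 64 bytes. */
static bool sketch_has_zic64b(const SketchCpuCfg *cfg)
{
    return cfg->cbom_blocksize == 64 &&
           cfg->cbop_blocksize == 64 &&
           cfg->cboz_blocksize == 64;
}
```

This shows why such a named feature is derived from other settings rather than toggled on its own.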

RE: [PATCH v6 01/21] backends/iommufd: Introduce the iommufd object

2023-11-21 Thread Duan, Zhenzhong



>-----Original Message-----
>From: Cédric Le Goater 
>Sent: Tuesday, November 21, 2023 4:06 PM
>Subject: Re: [PATCH v6 01/21] backends/iommufd: Introduce the iommufd object
>
>Hello Zhenzhong,
>
>>>> Below are other gaps I can think of for now:
>>>>
>>>> Gaps:
>>>> 1. dirty page sync, WIP (Joao)
>>>> 2. p2p dma not supported yet.
>>>> 3. fd passing with mdev not support ram discard(vfio-pci) as no way to know
>>>> it's a mdev from a fd.
>>>
>>> Call the section Caveats maybe?
>>
>> Got it.
>
>It looks like v7 should be ready by rc2 (next week). I would then merge
>in vfio-next and wait a week before sending a QEMU-9.0 PR.

Got it, I'll send out soon.

Thanks
Zhenzhong

[PATCH v7 03/27] vfio/common: return early if space isn't empty

2023-11-21 Thread Zhenzhong Duan

This is a trivial optimization. If there is an active container in the
space, vfio_reset_handler will never be unregistered. So invert the check
of space->containers and return early.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Eric Auger 
Tested-by: Eric Auger 
---
 hw/vfio/common.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 572ae7c934..934f4f5446 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1462,10 +1462,13 @@ VFIOAddressSpace *vfio_get_address_space(AddressSpace 
*as)
 
 void vfio_put_address_space(VFIOAddressSpace *space)
 {
-if (QLIST_EMPTY(&space->containers)) {
-QLIST_REMOVE(space, list);
-g_free(space);
+if (!QLIST_EMPTY(&space->containers)) {
+return;
 }
+
+QLIST_REMOVE(space, list);
+g_free(space);
+
 if (QLIST_EMPTY(&vfio_address_spaces)) {
 qemu_unregister_reset(vfio_reset_handler, NULL);
 }
-- 
2.34.1

[PATCH v7 08/27] vfio/pci: Introduce a vfio pci hot reset interface

2023-11-21 Thread Zhenzhong Duan

Legacy vfio pci and iommufd cdev use different procedures to hot reset
a vfio device. Extend the current code to abstract out a pci_hot_reset
callback for legacy vfio; the same interface will also be used by the
iommufd cdev vfio device.

Rename vfio_pci_hot_reset to vfio_legacy_pci_hot_reset and move it
into container.c.

vfio_pci_[pre/post]_reset and vfio_pci_host_match are exported so
they can be called from both the legacy and iommufd pci_hot_reset
callbacks.

Suggested-by: Cédric Le Goater 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Eric Auger 
Tested-by: Eric Auger 
---
 hw/vfio/pci.h |   3 +
 include/hw/vfio/vfio-container-base.h |   3 +
 hw/vfio/container.c   | 170 ++
 hw/vfio/pci.c | 168 +
 4 files changed, 182 insertions(+), 162 deletions(-)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 1006061afb..6e64a2654e 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -218,6 +218,9 @@ void vfio_probe_igd_bar4_quirk(VFIOPCIDevice *vdev, int nr);
 
 extern const PropertyInfo qdev_prop_nv_gpudirect_clique;
 
+void vfio_pci_pre_reset(VFIOPCIDevice *vdev);
+void vfio_pci_post_reset(VFIOPCIDevice *vdev);
+bool vfio_pci_host_match(PCIHostDeviceAddress *addr, const char *name);
 int vfio_pci_get_pci_hot_reset_info(VFIOPCIDevice *vdev,
 struct vfio_pci_hot_reset_info **info_p);
 
diff --git a/include/hw/vfio/vfio-container-base.h 
b/include/hw/vfio/vfio-container-base.h
index 4b6f017c6f..45bb19c767 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -106,6 +106,9 @@ struct VFIOIOMMUOps {
 int (*set_dirty_page_tracking)(VFIOContainerBase *bcontainer, bool start);
 int (*query_dirty_bitmap)(VFIOContainerBase *bcontainer, VFIOBitmap *vbmap,
   hwaddr iova, hwaddr size);
+/* PCI specific */
+int (*pci_hot_reset)(VFIODevice *vbasedev, bool single);
+
 /* SPAPR specific */
 int (*add_window)(VFIOContainerBase *bcontainer,
   MemoryRegionSection *section,
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index ed2d721b2b..1dbf9b9a17 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -33,6 +33,7 @@
 #include "trace.h"
 #include "qapi/error.h"
 #include "migration/migration.h"
+#include "pci.h"
 
 VFIOGroupList vfio_group_list =
 QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -922,6 +923,174 @@ static void vfio_legacy_detach_device(VFIODevice 
*vbasedev)
 vfio_put_group(group);
 }
 
+static int vfio_legacy_pci_hot_reset(VFIODevice *vbasedev, bool single)
+{
+VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+VFIOGroup *group;
+struct vfio_pci_hot_reset_info *info = NULL;
+struct vfio_pci_dependent_device *devices;
+struct vfio_pci_hot_reset *reset;
+int32_t *fds;
+int ret, i, count;
+bool multi = false;
+
+trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ? "one" : "multi");
+
+if (!single) {
+vfio_pci_pre_reset(vdev);
+}
+vdev->vbasedev.needs_reset = false;
+
+ret = vfio_pci_get_pci_hot_reset_info(vdev, &info);
+
+if (ret) {
+goto out_single;
+}
+devices = &info->devices[0];
+
+trace_vfio_pci_hot_reset_has_dep_devices(vdev->vbasedev.name);
+
+/* Verify that we have all the groups required */
+for (i = 0; i < info->count; i++) {
+PCIHostDeviceAddress host;
+VFIOPCIDevice *tmp;
+VFIODevice *vbasedev_iter;
+
+host.domain = devices[i].segment;
+host.bus = devices[i].bus;
+host.slot = PCI_SLOT(devices[i].devfn);
+host.function = PCI_FUNC(devices[i].devfn);
+
+trace_vfio_pci_hot_reset_dep_devices(host.domain,
+host.bus, host.slot, host.function, devices[i].group_id);
+
+if (vfio_pci_host_match(&host, vdev->vbasedev.name)) {
+continue;
+}
+
+QLIST_FOREACH(group, &vfio_group_list, next) {
+if (group->groupid == devices[i].group_id) {
+break;
+}
+}
+
+if (!group) {
+if (!vdev->has_pm_reset) {
+error_report("vfio: Cannot reset device %s, "
+ "depends on group %d which is not owned.",
+ vdev->vbasedev.name, devices[i].group_id);
+}
+ret = -EPERM;
+goto out;
+}
+
+/* Prep dependent devices for reset and clear our marker. */
+QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
+if (!vbasedev_iter->dev->realized ||
+vbasedev_iter->type != VFIO_DEVICE_TYPE_PCI) {
+continue;
+}
+tmp = container_of(vbasedev_iter, VFIOPCIDevice, vbasedev);
+if (vfio_pci_host_match(&host, tmp->vbasedev.name)) {
+if (single) {
+ret = -EINVAL;
+

[PATCH v7 09/27] vfio/iommufd: Enable pci hot reset through iommufd cdev interface

2023-11-21 Thread Zhenzhong Duan

Implement the newly introduced pci_hot_reset callback named
iommufd_cdev_pci_hot_reset to do iommufd specific check and
reset operation.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Eric Auger 
Tested-by: Eric Auger 
---
 hw/vfio/iommufd.c| 150 +++
 hw/vfio/trace-events |   1 +
 2 files changed, 151 insertions(+)

diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 01b448e840..6e53e013ef 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -24,6 +24,7 @@
 #include "sysemu/reset.h"
 #include "qemu/cutils.h"
 #include "qemu/chardev_open.h"
+#include "pci.h"
 
 static int iommufd_cdev_map(VFIOContainerBase *bcontainer, hwaddr iova,
 ram_addr_t size, void *vaddr, bool readonly)
@@ -468,9 +469,158 @@ static void iommufd_cdev_detach(VFIODevice *vbasedev)
 close(vbasedev->fd);
 }
 
+static VFIODevice *iommufd_cdev_pci_find_by_devid(__u32 devid)
+{
+VFIODevice *vbasedev_iter;
+
+QLIST_FOREACH(vbasedev_iter, &vfio_device_list, global_next) {
+if (vbasedev_iter->bcontainer->ops != &vfio_iommufd_ops) {
+continue;
+}
+if (devid == vbasedev_iter->devid) {
+return vbasedev_iter;
+}
+}
+return NULL;
+}
+
+static VFIOPCIDevice *
+iommufd_cdev_dep_get_realized_vpdev(struct vfio_pci_dependent_device *dep_dev,
+VFIODevice *reset_dev)
+{
+VFIODevice *vbasedev_tmp;
+
+if (dep_dev->devid == reset_dev->devid ||
+dep_dev->devid == VFIO_PCI_DEVID_OWNED) {
+return NULL;
+}
+
+vbasedev_tmp = iommufd_cdev_pci_find_by_devid(dep_dev->devid);
+if (!vbasedev_tmp || !vbasedev_tmp->dev->realized ||
+vbasedev_tmp->type != VFIO_DEVICE_TYPE_PCI) {
+return NULL;
+}
+
+return container_of(vbasedev_tmp, VFIOPCIDevice, vbasedev);
+}
+
+static int iommufd_cdev_pci_hot_reset(VFIODevice *vbasedev, bool single)
+{
+VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+struct vfio_pci_hot_reset_info *info = NULL;
+struct vfio_pci_dependent_device *devices;
+struct vfio_pci_hot_reset *reset;
+int ret, i;
+bool multi = false;
+
+trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ? "one" : "multi");
+
+if (!single) {
+vfio_pci_pre_reset(vdev);
+}
+vdev->vbasedev.needs_reset = false;
+
+ret = vfio_pci_get_pci_hot_reset_info(vdev, &info);
+
+if (ret) {
+goto out_single;
+}
+
+assert(info->flags & VFIO_PCI_HOT_RESET_FLAG_DEV_ID);
+
+devices = &info->devices[0];
+
+if (!(info->flags & VFIO_PCI_HOT_RESET_FLAG_DEV_ID_OWNED)) {
+if (!vdev->has_pm_reset) {
+for (i = 0; i < info->count; i++) {
+if (devices[i].devid == VFIO_PCI_DEVID_NOT_OWNED) {
+error_report("vfio: Cannot reset device %s, "
+ "depends on device %04x:%02x:%02x.%x "
+ "which is not owned.",
+ vdev->vbasedev.name, devices[i].segment,
+ devices[i].bus, PCI_SLOT(devices[i].devfn),
+ PCI_FUNC(devices[i].devfn));
+}
+}
+}
+ret = -EPERM;
+goto out_single;
+}
+
+trace_vfio_pci_hot_reset_has_dep_devices(vdev->vbasedev.name);
+
+for (i = 0; i < info->count; i++) {
+VFIOPCIDevice *tmp;
+
+trace_iommufd_cdev_pci_hot_reset_dep_devices(devices[i].segment,
+ devices[i].bus,
+ 
PCI_SLOT(devices[i].devfn),
+ 
PCI_FUNC(devices[i].devfn),
+ devices[i].devid);
+
+/*
+ * If a VFIO cdev device is resettable, all the dependent devices
+ * are either bound to same iommufd or within same iommu_groups as
+ * one of the iommufd bound devices.
+ */
+assert(devices[i].devid != VFIO_PCI_DEVID_NOT_OWNED);
+
+tmp = iommufd_cdev_dep_get_realized_vpdev(&devices[i], 
&vdev->vbasedev);
+if (!tmp) {
+continue;
+}
+
+if (single) {
+ret = -EINVAL;
+goto out_single;
+}
+vfio_pci_pre_reset(tmp);
+tmp->vbasedev.needs_reset = false;
+multi = true;
+}
+
+if (!single && !multi) {
+ret = -EINVAL;
+goto out_single;
+}
+
+/* Use zero length array for hot reset with iommufd backend */
+reset = g_malloc0(sizeof(*reset));
+reset->argsz = sizeof(*reset);
+
+ /* Bus reset! */
+ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_PCI_HOT_RESET, reset);
+g_free(reset);
+if (ret) {
+ret = -errno;
+}
+
+trace_vfio_pci_hot_reset_result(vdev->vbasedev.name,
+ret ? strerro

[PATCH v7 00/27] vfio: Adopt iommufd

2023-11-21 Thread Zhenzhong Duan

Hi,

Thanks all for the guidance and comments on the previous series; this is
the remaining part of the iommufd support.

Besides suggested changes in v6, I'd like to highlight two changes
for final review:
1. Instantiate can_be_deleted callback to fix race where iommufd object
   can be deleted before vfio device
2. After careful re-thinking, I'd like to follow Nicolin's suggestion in v5,
   remove is_ioas check which indeed looks heavy just for tracepoint.
   In fact we can get corresponding info by looking over trace context.

PATCH 1: Introduce iommufd object
PATCH 2-9: add IOMMUFD container and cdev support
PATCH 10-17: fd passing for cdev and linking to IOMMUFD
PATCH 18: make VFIOContainerBase parameter const
PATCH 19-21: Compile out for IOMMUFD for arm, s390x and x86
PATCH 22-26: vfio device init code cleanup
PATCH 27: add iommufd doc


We have tested widely with different combinations, e.g.:
- PCI devices were tested
- FD passing and hot reset with some trick.
- device hotplug test with legacy and iommufd backends
- with or without vIOMMU for legacy and iommufd backends
- devices linked to different iommufds
- VFIO migration with an E800 net card (no dirty sync support) passthrough
- platform, ccw and ap were only compile-tested due to environment limits
- test mdev passthrough with mtty, mixed with real devices and different BEs
- test iommufd object hotplug/unplug mixed with vfio device plug/unplug

Given some iommufd kernel limitations, the iommufd backend is
not yet fully on par with the legacy backend w.r.t. features like:
- p2p mappings (you will see related error traces)
- dirty page sync
- etc.


qemu code: https://github.com/yiliu1765/qemu/commits/zhenzhong/iommufd_cdev_v7
Based on vfio-next, commit id: c487fb8a50

--

Below is some background and a diagram of the design:

With the introduction of iommufd, the Linux kernel provides a generic
interface for userspace drivers to propagate their DMA mappings to kernel
for assigned devices. This series ports the VFIO devices onto the
/dev/iommu uapi and lets them coexist with the legacy implementation.

At QEMU level, interactions with the /dev/iommu are abstracted by a new
iommufd object (compiled in with the CONFIG_IOMMUFD option).

Any QEMU device (e.g. vfio device) wishing to use /dev/iommu must be
linked with an iommufd object. In this series, the vfio-pci device is
granted with such capability (other VFIO devices are not yet ready):

It gets a new optional parameter named iommufd, which allows passing
an iommufd object:

-object iommufd,id=iommufd0
-device vfio-pci,host=:02:00.0,iommufd=iommufd0

Note the /dev/iommu and vfio cdev can be externally opened by a
management layer. In such a case the fd is passed:

-object iommufd,id=iommufd0,fd=22
-device vfio-pci,iommufd=iommufd0,fd=23

If the fd parameter is not passed, the fd is opened by QEMU.
See https://www.mail-archive.com/qemu-devel@nongnu.org/msg937155.html
for a detailed discussion of this requirement.

If no iommufd option is passed to the vfio-pci device, iommufd is not
used and the end-user gets the behavior based on the legacy vfio iommu
interfaces:

-device vfio-pci,host=:02:00.0

While the legacy kernel interface is group-centric, the new iommufd
interface is device-centric, relying on device fd and iommufd.

To support both interfaces in the QEMU VFIO device we reworked the vfio
container abstraction so that the generic VFIO code can use either
backend.

The VFIOContainer object becomes a base object derived into
a) the legacy VFIO container and
b) the new iommufd based container.

The base object implements generic code such as code related to
memory_listener and address space management, whereas the derived
objects implement callbacks specific to each BE (legacy or iommufd).
Each backend has its own way to set up the secure context and the
DMA management interface. The diagram below shows how this looks
with both BEs.

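                    VFIO                           AddressSpace/Memory
    +-------+  +----------+  +-----+  +-----+
    |  pci  |  | platform |  |  ap |  | ccw |
    +---+---+  +----+-----+  +--+--+  +--+--+     +----------------------+
        |           |           |        |        |   AddressSpace       |
        |           |           |        |        +------------+---------+
    +---V-----------V-----------V--------V----+               /
    |           VFIOAddressSpace              | <------------+
    |                  |                      |  MemoryListener
    |          VFIOContainer list             |
    +-------+----------------------------+----+
            |                            |
            |                            |
    +-------V------+            +--------V----------+
    |   iommufd    |            |    vfio legacy    |
    |  container   |            |     container     |
    +-------+------+            +--------+----------+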
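
The base-object-plus-callbacks arrangement described above can be pictured with a small ops table in plain C. This is only a sketch under assumptions: every sketch_* name and the fields below are hypothetical, and the real VFIOContainerBase and VFIOIOMMUOps carry many more members.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SketchContainer SketchContainer;

/* Per-backend callbacks, in the spirit of an IOMMU ops table. */
typedef struct SketchOps {
    const char *name;
    int (*dma_map)(SketchContainer *c, unsigned long iova,
                   unsigned long size);
} SketchOps;

/* Base object: generic state plus a pointer to the backend's ops. */
struct SketchContainer {
    const SketchOps *ops;
    unsigned long mapped_bytes;
    char last_backend; /* 'L' or 'I', recorded for illustration only */
};

static int sketch_legacy_map(SketchContainer *c, unsigned long iova,
                             unsigned long size)
{
    (void)iova;
    c->mapped_bytes += size;
    c->last_backend = 'L';
    return 0;
}

static int sketch_iommufd_map(SketchContainer *c, unsigned long iova,
                              unsigned long size)
{
    (void)iova;
    c->mapped_bytes += size;
    c->last_backend = 'I';
    return 0;
}

static const SketchOps sketch_legacy_ops = { "legacy", sketch_legacy_map };
static const SketchOps sketch_iommufd_ops = { "iommufd", sketch_iommufd_map };

/* A linked iommufd object selects the iommufd backend, else legacy. */
static const SketchOps *sketch_pick_ops(const void *iommufd)
{
    return iommufd ? &sketch_iommufd_ops : &sketch_legacy_ops;
}

/* Generic code: dispatches without knowing which backend is in use. */
static int sketch_container_map(SketchContainer *c, unsigned long iova,
                                unsigned long size)
{
    assert(c->ops != NULL && c->ops->dma_map != NULL);
    return c->ops->dma_map(c, iova, size);
}
```

The generic caller only ever touches sketch_container_map(), which is the property that lets both backends coexist behind one interface.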

[PATCH v7 15/27] vfio/ap: Make vfio cdev pre-openable by passing a file handle

2023-11-21 Thread Zhenzhong Duan

This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass FD to qemu. This way qemu never needs
to have privilege to open a VFIO or iommu cdev node.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Matthew Rosato 
Reviewed-by: Cédric Le Goater 
---
 hw/vfio/ap.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index 80629609ae..f180e4a32a 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -160,7 +160,10 @@ static void vfio_ap_realize(DeviceState *dev, Error **errp)
 VFIOAPDevice *vapdev = VFIO_AP_DEVICE(dev);
 VFIODevice *vbasedev = &vapdev->vdev;
 
-vbasedev->name = g_path_get_basename(vbasedev->sysfsdev);
+if (vfio_device_get_name(vbasedev, errp) < 0) {
+return;
+}
+
 vbasedev->ops = &vfio_ap_ops;
 vbasedev->type = VFIO_DEVICE_TYPE_AP;
 vbasedev->dev = dev;
@@ -230,11 +233,28 @@ static const VMStateDescription vfio_ap_vmstate = {
 .unmigratable = 1,
 };
 
+static void vfio_ap_instance_init(Object *obj)
+{
+VFIOAPDevice *vapdev = VFIO_AP_DEVICE(obj);
+
+vapdev->vdev.fd = -1;
+}
+
+#ifdef CONFIG_IOMMUFD
+static void vfio_ap_set_fd(Object *obj, const char *str, Error **errp)
+{
+vfio_device_set_fd(&VFIO_AP_DEVICE(obj)->vdev, str, errp);
+}
+#endif
+
 static void vfio_ap_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 
 device_class_set_props(dc, vfio_ap_properties);
+#ifdef CONFIG_IOMMUFD
+object_class_property_add_str(klass, "fd", NULL, vfio_ap_set_fd);
+#endif
 dc->vmsd = &vfio_ap_vmstate;
 dc->desc = "VFIO-based AP device assignment";
 set_bit(DEVICE_CATEGORY_MISC, dc->categories);
@@ -249,6 +269,7 @@ static const TypeInfo vfio_ap_info = {
 .name = TYPE_VFIO_AP_DEVICE,
 .parent = TYPE_AP_DEVICE,
 .instance_size = sizeof(VFIOAPDevice),
+.instance_init = vfio_ap_instance_init,
 .class_init = vfio_ap_class_init,
 };
 
-- 
2.34.1
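
To illustrate the "fd" string property together with the -1 default set in instance_init, here is a minimal sketch of turning a user-supplied string into a file descriptor number. It is an assumption-laden stand-in: sketch_parse_fd and SKETCH_FD_UNSET are hypothetical names, and the real vfio_device_set_fd() is not shown in this patch and may accept more than plain decimal numbers.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Sentinel meaning "no fd was passed, the device must open it itself". */
#define SKETCH_FD_UNSET (-1)

/*
 * Parse a decimal, non-negative fd number from str.
 * On success store it in *fd and return true; on failure leave *fd
 * untouched and return false.
 */
static bool sketch_parse_fd(const char *str, int *fd)
{
    char *end;
    long val;

    errno = 0;
    val = strtol(str, &end, 10);
    if (errno != 0 || end == str || *end != '\0' ||
        val < 0 || val > INT_MAX) {
        return false;
    }
    *fd = (int)val;
    return true;
}
```

Leaving the value at the sentinel on failure keeps the "fd was not passed" state distinguishable from a valid descriptor.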

[PATCH v7 10/27] vfio/pci: Allow the selection of a given iommu backend

2023-11-21 Thread Zhenzhong Duan

From: Eric Auger 

Now we support two types of iommu backends, let's add the capability
to select one of them. This depends on whether an iommufd object has
been linked with the vfio-pci device:

If the user wants to use the legacy backend, it shall not
link the vfio-pci device with any iommufd object:

 -device vfio-pci,host=:02:00.0

This is called the legacy mode/backend.

If the user wants to use the iommufd backend (/dev/iommu) it
shall pass an iommufd object id in the vfio-pci device options:

 -object iommufd,id=iommufd0
 -device vfio-pci,host=:02:00.0,iommufd=iommufd0

Suggested-by: Alex Williamson 
Signed-off-by: Eric Auger 
Signed-off-by: Yi Liu 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Cédric Le Goater 
Tested-by: Eric Auger 
---
 hw/vfio/pci.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index d00c3472c7..c5984b0598 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -19,6 +19,7 @@
  */
 
 #include "qemu/osdep.h"
+#include CONFIG_DEVICES /* CONFIG_IOMMUFD */
 #include 
 #include 
 
@@ -42,6 +43,7 @@
 #include "qapi/error.h"
 #include "migration/blocker.h"
 #include "migration/qemu-file.h"
+#include "sysemu/iommufd.h"
 
 #define TYPE_VFIO_PCI_NOHOTPLUG "vfio-pci-nohotplug"
 
@@ -3386,6 +3388,10 @@ static Property vfio_pci_dev_properties[] = {
  * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),
  * DEFINE_PROP_STRING("vfiogroupfd, VFIOPCIDevice, vfiogroupfd_name),
  */
+#ifdef CONFIG_IOMMUFD
+DEFINE_PROP_LINK("iommufd", VFIOPCIDevice, vbasedev.iommufd,
+ TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *),
+#endif
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.34.1

[PATCH v7 13/27] vfio/platform: Make vfio cdev pre-openable by passing a file handle

2023-11-21 Thread Zhenzhong Duan

This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass FD to qemu. This way qemu never needs
to have privilege to open a VFIO or iommu cdev node.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Cédric Le Goater 
---
 hw/vfio/platform.c | 32 
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 98ae4bc655..a97d9c6234 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -531,14 +531,13 @@ static VFIODeviceOps vfio_platform_ops = {
  */
 static int vfio_base_device_init(VFIODevice *vbasedev, Error **errp)
 {
-struct stat st;
 int ret;
 
-/* @sysfsdev takes precedence over @host */
-if (vbasedev->sysfsdev) {
+/* @fd takes precedence over @sysfsdev which takes precedence over @host */
+if (vbasedev->fd < 0 && vbasedev->sysfsdev) {
 g_free(vbasedev->name);
 vbasedev->name = g_path_get_basename(vbasedev->sysfsdev);
-} else {
+} else if (vbasedev->fd < 0) {
 if (!vbasedev->name || strchr(vbasedev->name, '/')) {
 error_setg(errp, "wrong host device name");
 return -EINVAL;
@@ -548,10 +547,9 @@ static int vfio_base_device_init(VFIODevice *vbasedev, 
Error **errp)
  vbasedev->name);
 }
 
-if (stat(vbasedev->sysfsdev, &st) < 0) {
-error_setg_errno(errp, errno,
- "failed to get the sysfs host device file status");
-return -errno;
+ret = vfio_device_get_name(vbasedev, errp);
+if (ret) {
+return ret;
 }
 
 ret = vfio_attach_device(vbasedev->name, vbasedev,
@@ -658,6 +656,20 @@ static Property vfio_platform_dev_properties[] = {
 DEFINE_PROP_END_OF_LIST(),
 };
 
+static void vfio_platform_instance_init(Object *obj)
+{
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(obj);
+
+vdev->vbasedev.fd = -1;
+}
+
+#ifdef CONFIG_IOMMUFD
+static void vfio_platform_set_fd(Object *obj, const char *str, Error **errp)
+{
+vfio_device_set_fd(&VFIO_PLATFORM_DEVICE(obj)->vbasedev, str, errp);
+}
+#endif
+
 static void vfio_platform_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -665,6 +677,9 @@ static void vfio_platform_class_init(ObjectClass *klass, 
void *data)
 
 dc->realize = vfio_platform_realize;
 device_class_set_props(dc, vfio_platform_dev_properties);
+#ifdef CONFIG_IOMMUFD
+object_class_property_add_str(klass, "fd", NULL, vfio_platform_set_fd);
+#endif
 dc->vmsd = &vfio_platform_vmstate;
 dc->desc = "VFIO-based platform device assignment";
 sbc->connect_irq_notifier = vfio_start_irqfd_injection;
@@ -677,6 +692,7 @@ static const TypeInfo vfio_platform_dev_info = {
 .name = TYPE_VFIO_PLATFORM,
 .parent = TYPE_SYS_BUS_DEVICE,
 .instance_size = sizeof(VFIOPlatformDevice),
+.instance_init = vfio_platform_instance_init,
 .class_init = vfio_platform_class_init,
 .class_size = sizeof(VFIOPlatformDeviceClass),
 };
-- 
2.34.1
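
The precedence rule in the updated comment (@fd over @sysfsdev over @host) can be sketched as a small selection function. The enum and function names are hypothetical and only mirror the order of the checks in vfio_base_device_init() as shown in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum SketchSource {
    SKETCH_SRC_FD,       /* a pre-opened fd was passed */
    SKETCH_SRC_SYSFSDEV, /* a sysfs device path was given */
    SKETCH_SRC_HOST,     /* a bare host device name was given */
    SKETCH_SRC_INVALID   /* nothing usable was given */
} SketchSource;

/* fd wins over sysfsdev, which wins over host. */
static SketchSource sketch_pick_source(int fd, const char *sysfsdev,
                                       const char *host)
{
    if (fd >= 0) {
        return SKETCH_SRC_FD;
    }
    if (sysfsdev != NULL) {
        return SKETCH_SRC_SYSFSDEV;
    }
    /* A host name containing '/' is rejected, as in the diff above. */
    if (host != NULL && strchr(host, '/') == NULL) {
        return SKETCH_SRC_HOST;
    }
    return SKETCH_SRC_INVALID;
}
```

With fd initialised to -1 in instance_init, a device created without the "fd" property falls through to the older sysfsdev and host handling unchanged.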

[PATCH v7 05/27] vfio/iommufd: Relax assert check for iommufd backend

2023-11-21 Thread Zhenzhong Duan

iommufd does not support dirty page sync yet, but that does not
block live migration when VFIO migration is force-enabled.

In that case, allow set_dirty_page_tracking to be NULL.
The same change is not needed for query_dirty_bitmap: when
dirty page sync is not supported, query_dirty_bitmap is never
called.
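
A minimal sketch of the resulting control flow, using stand-in types
rather than the real QEMU structures (Container, ContainerOps and
container_set_dirty_page_tracking are hypothetical names): the generic
wrapper returns early when dirty page tracking is unsupported, so a
backend may leave the callback NULL without tripping the assertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types; the real VFIOContainerBase has many more fields. */
typedef struct Container Container;

typedef struct ContainerOps {
    /* May be NULL for a backend without dirty page tracking. */
    int (*set_dirty_page_tracking)(Container *c, bool start);
} ContainerOps;

struct Container {
    const ContainerOps *ops;
    bool dirty_pages_supported;
    int tracking_calls;   /* illustration only: counts callback runs */
};

/* Generic wrapper: bail out before touching the callback if unsupported. */
static int container_set_dirty_page_tracking(Container *c, bool start)
{
    if (!c->dirty_pages_supported) {
        return 0;
    }
    assert(c->ops->set_dirty_page_tracking);
    return c->ops->set_dirty_page_tracking(c, start);
}

/* A backend callback that only records that it ran. */
static int legacy_set_dirty_page_tracking(Container *c, bool start)
{
    (void)start;
    c->tracking_calls++;
    return 0;
}

static const ContainerOps legacy_ops = {
    .set_dirty_page_tracking = legacy_set_dirty_page_tracking,
};

/* A backend with no dirty tracking support leaves the callback NULL. */
static const ContainerOps no_tracking_ops = {
    .set_dirty_page_tracking = NULL,
};
```

With no_tracking_ops and dirty_pages_supported set to false, the wrapper
returns 0 and the NULL callback is never dereferenced.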

Suggested-by: Cédric Le Goater 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Eric Auger 
Tested-by: Eric Auger 
---
 hw/vfio/container-base.c | 4 
 hw/vfio/container.c  | 4 
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c
index 71f7274973..eee2dcfe76 100644
--- a/hw/vfio/container-base.c
+++ b/hw/vfio/container-base.c
@@ -55,6 +55,10 @@ void vfio_container_del_section_window(VFIOContainerBase 
*bcontainer,
 int vfio_container_set_dirty_page_tracking(VFIOContainerBase *bcontainer,
bool start)
 {
+if (!bcontainer->dirty_pages_supported) {
+return 0;
+}
+
 g_assert(bcontainer->ops->set_dirty_page_tracking);
 return bcontainer->ops->set_dirty_page_tracking(bcontainer, start);
 }
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 6bacf38222..ed2d721b2b 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -216,10 +216,6 @@ static int 
vfio_legacy_set_dirty_page_tracking(VFIOContainerBase *bcontainer,
 .argsz = sizeof(dirty),
 };
 
-if (!bcontainer->dirty_pages_supported) {
-return 0;
-}
-
 if (start) {
 dirty.flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_START;
 } else {
-- 
2.34.1

[PATCH v7 07/27] vfio/pci: Extract out a helper vfio_pci_get_pci_hot_reset_info

2023-11-21 Thread Zhenzhong Duan

This helper will be used by both legacy and iommufd backends.

No functional changes intended.
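
The helper follows a two-pass query pattern: the first call reports how
many entries exist, the buffer is grown, and the second call fills it.
A self-contained sketch of that pattern, under the assumption that a
callback stands in for the ioctl (struct dep_info, query_fn,
get_dep_info and fake_query are hypothetical names, not QEMU or kernel
API):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for a variable-length kernel reply (hypothetical layout). */
struct dep_info {
    uint32_t argsz;      /* size of the buffer supplied by the caller */
    uint32_t count;      /* number of entries available */
    uint32_t devices[];  /* entries, filled when the buffer is big enough */
};

/* Hypothetical query callback standing in for the ioctl. */
typedef int (*query_fn)(struct dep_info *info);

/* Returns 0 and stores a malloc'ed buffer in *info_p, or a negative errno. */
static int get_dep_info(query_fn query, struct dep_info **info_p)
{
    struct dep_info *info, *bigger;
    uint32_t count;
    int ret;

    assert(info_p && !*info_p);

    info = calloc(1, sizeof(*info));
    if (!info) {
        return -ENOMEM;
    }
    info->argsz = sizeof(*info);

    /* First pass: ENOSPC is expected and only tells us the entry count. */
    if (query(info) && errno != ENOSPC) {
        ret = -errno;
        free(info);
        return ret;
    }

    count = info->count;
    bigger = realloc(info, sizeof(*info) + count * sizeof(info->devices[0]));
    if (!bigger) {
        free(info);
        return -ENOMEM;
    }
    info = bigger;
    info->argsz = sizeof(*info) + count * sizeof(info->devices[0]);

    /* Second pass: the buffer is now large enough for every entry. */
    if (query(info)) {
        ret = -errno;
        free(info);
        return ret;
    }

    *info_p = info;
    return 0;
}

/* Fake backend reporting three entries, for illustration only. */
static int fake_query(struct dep_info *info)
{
    const uint32_t n = 3;

    info->count = n;
    if (info->argsz < sizeof(*info) + n * sizeof(info->devices[0])) {
        errno = ENOSPC;
        return -1;
    }
    for (uint32_t i = 0; i < n; i++) {
        info->devices[i] = 100 + i;
    }
    return 0;
}
```

The caller owns the returned buffer and frees it once done, which is
why every error path in the sketch releases the buffer before returning.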

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Eric Auger 
Tested-by: Eric Auger 
---
 hw/vfio/pci.h |  3 +++
 hw/vfio/pci.c | 54 +++
 2 files changed, 40 insertions(+), 17 deletions(-)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index fba8737ab2..1006061afb 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -218,6 +218,9 @@ void vfio_probe_igd_bar4_quirk(VFIOPCIDevice *vdev, int nr);
 
 extern const PropertyInfo qdev_prop_nv_gpudirect_clique;
 
+int vfio_pci_get_pci_hot_reset_info(VFIOPCIDevice *vdev,
+struct vfio_pci_hot_reset_info **info_p);
+
 int vfio_populate_vga(VFIOPCIDevice *vdev, Error **errp);
 
 int vfio_pci_igd_opregion_init(VFIOPCIDevice *vdev,
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index c62c02f7b6..eb55e8ae88 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2445,22 +2445,13 @@ static bool vfio_pci_host_match(PCIHostDeviceAddress 
*addr, const char *name)
 return (strcmp(tmp, name) == 0);
 }
 
-static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
+int vfio_pci_get_pci_hot_reset_info(VFIOPCIDevice *vdev,
+struct vfio_pci_hot_reset_info **info_p)
 {
-VFIOGroup *group;
 struct vfio_pci_hot_reset_info *info;
-struct vfio_pci_dependent_device *devices;
-struct vfio_pci_hot_reset *reset;
-int32_t *fds;
-int ret, i, count;
-bool multi = false;
+int ret, count;
 
-trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ? "one" : "multi");
-
-if (!single) {
-vfio_pci_pre_reset(vdev);
-}
-vdev->vbasedev.needs_reset = false;
+assert(info_p && !*info_p);
 
 info = g_malloc0(sizeof(*info));
 info->argsz = sizeof(*info);
@@ -2468,24 +2459,53 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool 
single)
 ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info);
 if (ret && errno != ENOSPC) {
 ret = -errno;
+g_free(info);
 if (!vdev->has_pm_reset) {
 error_report("vfio: Cannot reset device %s, "
  "no available reset mechanism.", vdev->vbasedev.name);
 }
-goto out_single;
+return ret;
 }
 
 count = info->count;
-info = g_realloc(info, sizeof(*info) + (count * sizeof(*devices)));
-info->argsz = sizeof(*info) + (count * sizeof(*devices));
-devices = &info->devices[0];
+info = g_realloc(info, sizeof(*info) + (count * sizeof(info->devices[0])));
+info->argsz = sizeof(*info) + (count * sizeof(info->devices[0]));
 
 ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info);
 if (ret) {
 ret = -errno;
+g_free(info);
 error_report("vfio: hot reset info failed: %m");
+return ret;
+}
+
+*info_p = info;
+return 0;
+}
+
+static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
+{
+VFIOGroup *group;
+struct vfio_pci_hot_reset_info *info = NULL;
+struct vfio_pci_dependent_device *devices;
+struct vfio_pci_hot_reset *reset;
+int32_t *fds;
+int ret, i, count;
+bool multi = false;
+
+trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ? "one" : "multi");
+
+if (!single) {
+vfio_pci_pre_reset(vdev);
+}
+vdev->vbasedev.needs_reset = false;
+
+ret = vfio_pci_get_pci_hot_reset_info(vdev, &info);
+
+if (ret) {
 goto out_single;
 }
+devices = &info->devices[0];
 
 trace_vfio_pci_hot_reset_has_dep_devices(vdev->vbasedev.name);
 
-- 
2.34.1

[PATCH v7 24/27] vfio/ap: Move VFIODevice initializations in vfio_ap_instance_init

2023-11-21 Thread Zhenzhong Duan

Some of the VFIODevice initializations are done in vfio_ap_realize;
move all of them to vfio_ap_instance_init.

No functional change intended.

Suggested-by: Cédric Le Goater 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Eric Farman 
---
 hw/vfio/ap.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index f180e4a32a..95fe7cd98b 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -164,18 +164,6 @@ static void vfio_ap_realize(DeviceState *dev, Error **errp)
 return;
 }
 
-vbasedev->ops = &vfio_ap_ops;
-vbasedev->type = VFIO_DEVICE_TYPE_AP;
-vbasedev->dev = dev;
-
-/*
- * vfio-ap devices operate in a way compatible with discarding of
- * memory in RAM blocks, as no pages are pinned in the host.
- * This needs to be set before vfio_get_device() for vfio common to
- * handle ram_block_discard_disable().
- */
-vapdev->vdev.ram_block_discard_allowed = true;
-
 ret = vfio_attach_device(vbasedev->name, vbasedev,
  &address_space_memory, errp);
 if (ret) {
@@ -236,8 +224,20 @@ static const VMStateDescription vfio_ap_vmstate = {
 static void vfio_ap_instance_init(Object *obj)
 {
 VFIOAPDevice *vapdev = VFIO_AP_DEVICE(obj);
+VFIODevice *vbasedev = &vapdev->vdev;
 
-vapdev->vdev.fd = -1;
+vbasedev->type = VFIO_DEVICE_TYPE_AP;
+vbasedev->ops = &vfio_ap_ops;
+vbasedev->dev = DEVICE(vapdev);
+vbasedev->fd = -1;
+
+/*
+ * vfio-ap devices operate in a way compatible with discarding of
+ * memory in RAM blocks, as no pages are pinned in the host.
+ * This needs to be set before vfio_get_device() for vfio common to
+ * handle ram_block_discard_disable().
+ */
+vbasedev->ram_block_discard_allowed = true;
 }
 
 #ifdef CONFIG_IOMMUFD
-- 
2.34.1

[PATCH v7 12/27] vfio/platform: Allow the selection of a given iommu backend

2023-11-21 Thread Zhenzhong Duan

QEMU now supports two types of iommu backends, so add the capability
to select one of them. The choice depends on whether an iommufd object
has been linked with the vfio-platform device.

To use the legacy backend, the user shall not link the vfio-platform
device with any iommufd object:

 -device vfio-platform,host=XXX

This is called the legacy mode/backend.

To use the iommufd backend (/dev/iommu), the user shall pass an
iommufd object id in the vfio-platform device options:

 -object iommufd,id=iommufd0
 -device vfio-platform,host=XXX,iommufd=iommufd0
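
The selection rule can be sketched as follows. This is an illustration
with stand-in types, not the QEMU implementation: Backend, IommuOps,
Device and select_iommu_ops are hypothetical names, and the only
behaviour assumed is that a linked iommufd object selects the iommufd
backend while its absence selects the legacy one.

```c
#include <stddef.h>

/* Stand-in for the iommufd object a device may be linked with. */
typedef struct Backend {
    int fd;
} Backend;

/* Stand-in for a table of backend callbacks. */
typedef struct IommuOps {
    const char *name;
} IommuOps;

typedef struct Device {
    Backend *iommufd;   /* non-NULL when the user passed iommufd=<id> */
} Device;

static const IommuOps legacy_ops  = { .name = "legacy" };
static const IommuOps iommufd_ops = { .name = "iommufd" };

/* Legacy is the default; a linked iommufd object overrides it. */
static const IommuOps *select_iommu_ops(const Device *dev)
{
    return dev->iommufd ? &iommufd_ops : &legacy_ops;
}
```

Keeping the legacy backend as the default means existing command lines
that do not mention iommufd keep their behaviour.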

Suggested-by: Alex Williamson 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Eric Auger 
---
 hw/vfio/platform.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 8e3d4ac458..98ae4bc655 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -15,11 +15,13 @@
  */
 
 #include "qemu/osdep.h"
+#include CONFIG_DEVICES /* CONFIG_IOMMUFD */
 #include "qapi/error.h"
 #include 
 #include 
 
 #include "hw/vfio/vfio-platform.h"
+#include "sysemu/iommufd.h"
 #include "migration/vmstate.h"
 #include "qemu/error-report.h"
 #include "qemu/lockable.h"
@@ -649,6 +651,10 @@ static Property vfio_platform_dev_properties[] = {
 DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
mmap_timeout, 1100),
 DEFINE_PROP_BOOL("x-irqfd", VFIOPlatformDevice, irqfd_allowed, true),
+#ifdef CONFIG_IOMMUFD
+DEFINE_PROP_LINK("iommufd", VFIOPlatformDevice, vbasedev.iommufd,
+ TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *),
+#endif
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.34.1

[PATCH v7 20/27] kconfig: Activate IOMMUFD for s390x machines

2023-11-21 Thread Zhenzhong Duan

From: Cédric Le Goater 

Signed-off-by: Cédric Le Goater 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Matthew Rosato 
Reviewed-by: Eric Farman 
---
 hw/s390x/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/s390x/Kconfig b/hw/s390x/Kconfig
index 4c068d7960..26ad104485 100644
--- a/hw/s390x/Kconfig
+++ b/hw/s390x/Kconfig
@@ -6,6 +6,7 @@ config S390_CCW_VIRTIO
 imply VFIO_CCW
 imply WDT_DIAG288
 imply PCIE_DEVICES
+imply IOMMUFD
 select PCI_EXPRESS
 select S390_FLIC
 select S390_FLIC_KVM if KVM
-- 
2.34.1

[PATCH v7 11/27] vfio/pci: Make vfio cdev pre-openable by passing a file handle

2023-11-21 Thread Zhenzhong Duan

This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass the FD to qemu. This way qemu never needs
privilege to open a VFIO or iommu cdev node.

Together with the earlier support for pre-opening the /dev/iommu device,
we now fully support passing a vfio device to an unprivileged qemu from
a management tool. This mode is no longer considered for the legacy
backend, so remove the "TODO" comment.

Add helper functions vfio_device_set_fd() and vfio_device_get_name()
to set the fd and get the device name; they will also be used by other
vfio devices.

There is no easy way to check whether a device is an mdev with FD
passing, so fail the x-balloon-allowed check unconditionally in this case.

There is also no easy way to get the BDF as the name with FD passing,
so fake a name of the form VFIO_FD[fd].
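
A simplified sketch of the naming rule, with stand-in types (Device,
dup_string and device_get_name are hypothetical names). It assumes a
plain "text after the last slash" basename and skips the stat() check
on the sysfs path that the real helper performs:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct Device {
    int fd;                /* -1 when no descriptor was passed in */
    const char *sysfsdev;  /* sysfs path, used when fd < 0 */
    bool has_iommufd;      /* stand-in for a linked iommufd object */
    char *name;            /* heap-allocated; may be preset by the user */
} Device;

/* Portable replacement for strdup(); returns NULL on allocation failure. */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy) {
        memcpy(copy, s, len);
    }
    return copy;
}

/* Returns 0 on success or a negative errno; keeps a preset name as-is. */
static int device_get_name(Device *dev)
{
    char buf[32];

    if (dev->fd < 0) {
        const char *slash;

        if (!dev->sysfsdev) {
            return -EINVAL;
        }
        if (!dev->name) {
            slash = strrchr(dev->sysfsdev, '/');
            dev->name = dup_string(slash ? slash + 1 : dev->sysfsdev);
        }
    } else {
        if (!dev->has_iommufd) {
            return -EINVAL;   /* FD passing needs the iommufd backend */
        }
        if (!dev->name) {
            /* Fake a name so code printing dev->name keeps working. */
            snprintf(buf, sizeof(buf), "VFIO_FD%d", dev->fd);
            dev->name = dup_string(buf);
        }
    }

    return dev->name ? 0 : -ENOMEM;
}
```

In this sketch a device with fd 7 and a linked iommufd object ends up
named VFIO_FD7, while a device identified only by its sysfs path is
named after the last path component.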

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Cédric Le Goater 
Tested-by: Eric Auger 
---
 include/hw/vfio/vfio-common.h |  4 
 hw/vfio/helpers.c | 43 +++
 hw/vfio/iommufd.c | 12 ++
 hw/vfio/pci.c | 28 +--
 4 files changed, 71 insertions(+), 16 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 3dac5c167e..697bf24a35 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -251,4 +251,8 @@ int vfio_devices_query_dirty_bitmap(VFIOContainerBase 
*bcontainer,
 hwaddr size);
 int vfio_get_dirty_bitmap(VFIOContainerBase *bcontainer, uint64_t iova,
  uint64_t size, ram_addr_t ram_addr);
+
+/* Returns 0 on success, or a negative errno. */
+int vfio_device_get_name(VFIODevice *vbasedev, Error **errp);
+void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp);
 #endif /* HW_VFIO_VFIO_COMMON_H */
diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c
index 168847e7c5..3592c3d54e 100644
--- a/hw/vfio/helpers.c
+++ b/hw/vfio/helpers.c
@@ -27,6 +27,7 @@
 #include "trace.h"
 #include "qapi/error.h"
 #include "qemu/error-report.h"
+#include "monitor/monitor.h"
 
 /*
  * Common VFIO interrupt disable
@@ -609,3 +610,45 @@ bool vfio_has_region_cap(VFIODevice *vbasedev, int region, 
uint16_t cap_type)
 
 return ret;
 }
+
+int vfio_device_get_name(VFIODevice *vbasedev, Error **errp)
+{
+struct stat st;
+
+if (vbasedev->fd < 0) {
+if (stat(vbasedev->sysfsdev, &st) < 0) {
+error_setg_errno(errp, errno, "no such host device");
+error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->sysfsdev);
+return -errno;
+}
+/* User may specify a name, e.g: VFIO platform device */
+if (!vbasedev->name) {
+vbasedev->name = g_path_get_basename(vbasedev->sysfsdev);
+}
+} else {
+if (!vbasedev->iommufd) {
+error_setg(errp, "Use FD passing only with iommufd backend");
+return -EINVAL;
+}
+/*
+ * Give a name with fd so any function printing out vbasedev->name
+ * will not break.
+ */
+if (!vbasedev->name) {
+vbasedev->name = g_strdup_printf("VFIO_FD%d", vbasedev->fd);
+}
+}
+
+return 0;
+}
+
+void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp)
+{
+int fd = monitor_fd_param(monitor_cur(), str, errp);
+
+if (fd < 0) {
+error_prepend(errp, "Could not parse remote object fd %s:", str);
+return;
+}
+vbasedev->fd = fd;
+}
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 6e53e013ef..5accd26484 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -320,11 +320,15 @@ static int iommufd_cdev_attach(const char *name, 
VFIODevice *vbasedev,
 uint32_t ioas_id;
 Error *err = NULL;
 
-devfd = iommufd_cdev_getfd(vbasedev->sysfsdev, errp);
-if (devfd < 0) {
-return devfd;
+if (vbasedev->fd < 0) {
+devfd = iommufd_cdev_getfd(vbasedev->sysfsdev, errp);
+if (devfd < 0) {
+return devfd;
+}
+vbasedev->fd = devfd;
+} else {
+devfd = vbasedev->fd;
 }
-vbasedev->fd = devfd;
 
 ret = iommufd_cdev_connect_and_bind(vbasedev, errp);
 if (ret) {
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index c5984b0598..445d58c8e5 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2944,17 +2944,19 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
 VFIODevice *vbasedev = &vdev->vbasedev;
 char *tmp, *subsys;
 Error *err = NULL;
-struct stat st;
 int i, ret;
 bool is_mdev;
 char uuid[UUID_STR_LEN];
 char *name;
 
-if (!vbasedev->sysfsdev) {
+if (vbasedev->fd < 0 && !vbasedev->sysfsdev) {
 if (!(~vdev->host.domain || ~vdev->host.bus ||
   ~vdev->host.slot || ~vdev->host.function)) {
 error_setg(errp, "No provided host device");
 error_append_hint(errp, "Use -device vfio-pci,host=:

[PATCH v7 19/27] hw/arm: Activate IOMMUFD for virt machines

2023-11-21 Thread Zhenzhong Duan

From: Cédric Le Goater 

Signed-off-by: Cédric Le Goater 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Eric Auger 
Tested-by: Eric Auger 
---
 hw/arm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 3ada335a24..660f49db49 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -8,6 +8,7 @@ config ARM_VIRT
 imply TPM_TIS_SYSBUS
 imply TPM_TIS_I2C
 imply NVDIMM
+imply IOMMUFD
 select ARM_GIC
 select ACPI
 select ARM_SMMUV3
-- 
2.34.1

[PATCH v7 26/27] vfio: Introduce a helper function to initialize VFIODevice

2023-11-21 Thread Zhenzhong Duan

Introduce a helper function to replace the common code that initializes
VFIODevice in the pci, platform, ap and ccw VFIO devices.

No functional change intended.

Suggested-by: Cédric Le Goater 
Signed-off-by: Zhenzhong Duan 
---
 include/hw/vfio/vfio-common.h |  2 ++
 hw/vfio/ap.c  |  8 ++--
 hw/vfio/ccw.c |  8 ++--
 hw/vfio/helpers.c | 11 +++
 hw/vfio/pci.c |  6 ++
 hw/vfio/platform.c|  6 ++
 6 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index efcba19f66..b8aa8a5495 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -257,4 +257,6 @@ int vfio_get_dirty_bitmap(const VFIOContainerBase 
*bcontainer, uint64_t iova,
 /* Returns 0 on success, or a negative errno. */
 int vfio_device_get_name(VFIODevice *vbasedev, Error **errp);
 void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp);
+void vfio_device_init(VFIODevice *vbasedev, int type, VFIODeviceOps *ops,
+  DeviceState *dev, bool ram_discard);
 #endif /* HW_VFIO_VFIO_COMMON_H */
diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index 95fe7cd98b..e157aa1ff7 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -226,18 +226,14 @@ static void vfio_ap_instance_init(Object *obj)
 VFIOAPDevice *vapdev = VFIO_AP_DEVICE(obj);
 VFIODevice *vbasedev = &vapdev->vdev;
 
-vbasedev->type = VFIO_DEVICE_TYPE_AP;
-vbasedev->ops = &vfio_ap_ops;
-vbasedev->dev = DEVICE(vapdev);
-vbasedev->fd = -1;
-
 /*
  * vfio-ap devices operate in a way compatible with discarding of
  * memory in RAM blocks, as no pages are pinned in the host.
  * This needs to be set before vfio_get_device() for vfio common to
  * handle ram_block_discard_disable().
  */
-vbasedev->ram_block_discard_allowed = true;
+vfio_device_init(vbasedev, VFIO_DEVICE_TYPE_AP, &vfio_ap_ops,
+ DEVICE(vapdev), true);
 }
 
 #ifdef CONFIG_IOMMUFD
diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 6305a4c1b8..90e4a53437 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -683,11 +683,6 @@ static void vfio_ccw_instance_init(Object *obj)
 VFIOCCWDevice *vcdev = VFIO_CCW(obj);
 VFIODevice *vbasedev = &vcdev->vdev;
 
-vbasedev->type = VFIO_DEVICE_TYPE_CCW;
-vbasedev->ops = &vfio_ccw_ops;
-vbasedev->dev = DEVICE(vcdev);
-vbasedev->fd = -1;
-
 /*
  * All vfio-ccw devices are believed to operate in a way compatible with
  * discarding of memory in RAM blocks, ie. pages pinned in the host are
@@ -696,7 +691,8 @@ static void vfio_ccw_instance_init(Object *obj)
  * needs to be set before vfio_get_device() for vfio common to handle
  * ram_block_discard_disable().
  */
-vbasedev->ram_block_discard_allowed = true;
+vfio_device_init(vbasedev, VFIO_DEVICE_TYPE_CCW, &vfio_ccw_ops,
+ DEVICE(vcdev), true);
 }
 
 #ifdef CONFIG_IOMMUFD
diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c
index 3592c3d54e..6789870802 100644
--- a/hw/vfio/helpers.c
+++ b/hw/vfio/helpers.c
@@ -652,3 +652,14 @@ void vfio_device_set_fd(VFIODevice *vbasedev, const char 
*str, Error **errp)
 }
 vbasedev->fd = fd;
 }
+
+void vfio_device_init(VFIODevice *vbasedev, int type, VFIODeviceOps *ops,
+  DeviceState *dev, bool ram_discard)
+{
+vbasedev->type = type;
+vbasedev->ops = ops;
+vbasedev->dev = dev;
+vbasedev->fd = -1;
+
+vbasedev->ram_block_discard_allowed = ram_discard;
+}
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 87405584d7..1874ec1aba 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3327,10 +3327,8 @@ static void vfio_instance_init(Object *obj)
 vdev->host.slot = ~0U;
 vdev->host.function = ~0U;
 
-vbasedev->type = VFIO_DEVICE_TYPE_PCI;
-vbasedev->ops = &vfio_pci_ops;
-vbasedev->dev = DEVICE(vdev);
-vbasedev->fd = -1;
+vfio_device_init(vbasedev, VFIO_DEVICE_TYPE_PCI, &vfio_pci_ops,
+ DEVICE(vdev), false);
 
 vdev->nv_gpudirect_clique = 0xFF;
 
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 506eb8193f..a8d9b7da63 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -657,10 +657,8 @@ static void vfio_platform_instance_init(Object *obj)
 VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(obj);
 VFIODevice *vbasedev = &vdev->vbasedev;
 
-vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
-vbasedev->ops = &vfio_platform_ops;
-vbasedev->dev = DEVICE(vdev);
-vbasedev->fd = -1;
+vfio_device_init(vbasedev, VFIO_DEVICE_TYPE_PLATFORM, &vfio_platform_ops,
+ DEVICE(vdev), false);
 }
 
 #ifdef CONFIG_IOMMUFD
-- 
2.34.1

[PATCH v7 27/27] docs/devel: Add VFIO iommufd backend documentation

2023-11-21 Thread Zhenzhong Duan

Suggested-by: Cédric Le Goater 
Signed-off-by: Eric Auger 
Signed-off-by: Yi Liu 
Signed-off-by: Zhenzhong Duan 
---
 MAINTAINERS|   1 +
 docs/devel/index-internals.rst |   1 +
 docs/devel/vfio-iommufd.rst| 166 +
 3 files changed, 168 insertions(+)
 create mode 100644 docs/devel/vfio-iommufd.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index ca70bb4e64..0ddb20a35f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2176,6 +2176,7 @@ F: backends/iommufd.c
 F: include/sysemu/iommufd.h
 F: include/qemu/chardev_open.h
 F: util/chardev_open.c
+F: docs/devel/vfio-iommufd.rst
 
 vhost
 M: Michael S. Tsirkin 
diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst
index 6f81df92bc..3def4a138b 100644
--- a/docs/devel/index-internals.rst
+++ b/docs/devel/index-internals.rst
@@ -18,5 +18,6 @@ Details about QEMU's various subsystems including how to add 
features to them.
s390-dasd-ipl
tracing
vfio-migration
+   vfio-iommufd
writing-monitor-commands
virtio-backends
diff --git a/docs/devel/vfio-iommufd.rst b/docs/devel/vfio-iommufd.rst
new file mode 100644
index 00..3d1c11f175
--- /dev/null
+++ b/docs/devel/vfio-iommufd.rst
@@ -0,0 +1,166 @@
+===============================
+IOMMUFD BACKEND usage with VFIO
+===============================
+
+(Same meaning for backend/container/BE)
+
+With the introduction of iommufd, the Linux kernel provides a generic
+interface for user space drivers to propagate their DMA mappings to the kernel
+for assigned devices. While the legacy kernel interface is group-centric,
+the new iommufd interface is device-centric, relying on device fd and iommufd.
+
+To support both interfaces in the QEMU VFIO device, a base container
+abstracts the common part of the VFIO legacy and iommufd containers, so that
+the generic VFIO code can use either container.
+
+The base container implements generic functions such as memory_listener and
+address space management, whereas the derived container implements callbacks
+specific to either legacy or iommufd. Each container has its own way to set up
+the secure context and the DMA management interface. The diagram below shows
+how this looks with both containers.
+
+::
+
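+                      VFIO                           AddressSpace/Memory
+      +-------+  +----------+  +-----+  +-----+
+      |  pci  |  | platform |  |  ap |  | ccw |
+      +---+---+  +----+-----+  +--+--+  +--+--+     +----------------------+
+          |           |           |        |        |   AddressSpace       |
+          |           |           |        |        +------------+---------+
+      +---V-----------V-----------V--------V----+               /
+      |           VFIOAddressSpace              | <------------+
+      |                  |                      |  MemoryListener
+      |        VFIOContainerBase list           |
+      +-------+----------------------------+----+
+              |                            |
+              |                            |
+      +-------V------+            +--------V----------+
+      |   iommufd    |            |    vfio legacy    |
+      |  container   |            |     container     |
+      +-------+------+            +--------+----------+
+              |                            |
+              | /dev/iommu                 | /dev/vfio/vfio
+              | /dev/vfio/devices/vfioX    | /dev/vfio/$group_id
+  Userspace   |                            |
+  ============+============================+===========================
+  Kernel      |  device fd                 |
+              +---------------+            | group/container fd
+              | (BIND_IOMMUFD |            | (SET_CONTAINER/SET_IOMMU)
+              |  ATTACH_IOAS) |            | device fd
+              |               |            |
+              |       +-------V------------V-----------------+
+  iommufd     |       |                vfio                  |
+  (map/unmap  |       +---------+--------------------+-------+
+  ioas_copy)  |                 |                    | map/unmap
+              |                 |                    |
+      +-------V------+    +-----V------+      +------V--------+
+      | iommufd core |    |  device    |      |  vfio iommu   |
+      +--------------+    +------------+      +---------------+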
+
+* Secure Context setup
+
+  - iommufd BE: uses device fd and iommufd to setup secure context
+    (bind_iommufd, attach_ioas)
+  - vfio legacy BE: uses group fd and container fd to setup secure context
+    (set_container, set_iommu)
+
+* Device access
+
+  - iommufd BE: device fd is opened through ``/dev/vfio/devices/vfioX``
+  - vfio legacy BE: device fd is retrieved from group fd ioctl
+
+* DMA Mapping flow
+
+  1. VFIOAddressSpace receives MemoryRegion add/del via MemoryListener
+  2. VFIO populates DMA map/unmap via the container BEs
+     * iommufd BE: uses iommufd
+     * vfio legacy BE: uses container fd
+
+Example

[PATCH v7 21/27] hw/i386: Activate IOMMUFD for q35 machines

2023-11-21 Thread Zhenzhong Duan

From: Cédric Le Goater 

Signed-off-by: Cédric Le Goater 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Eric Auger 
---
 hw/i386/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index 55850791df..a1846be6f7 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -95,6 +95,7 @@ config Q35
 imply E1000E_PCI_EXPRESS
 imply VMPORT
 imply VMMOUSE
+imply IOMMUFD
 select PC_PCI
 select PC_ACPI
 select PCI_EXPRESS_Q35
-- 
2.34.1

[PATCH v7 14/27] vfio/ap: Allow the selection of a given iommu backend

2023-11-21 Thread Zhenzhong Duan

QEMU now supports two types of iommu backends, so add the capability
to select one of them. The choice depends on whether an iommufd object
has been linked with the vfio-ap device.

To use the legacy backend, the user shall not link the vfio-ap
device with any iommufd object:

 -device vfio-ap,sysfsdev=/sys/bus/mdev/devices/XXX

This is called the legacy mode/backend.

To use the iommufd backend (/dev/iommu), the user shall pass an
iommufd object id in the vfio-ap device options:

 -object iommufd,id=iommufd0
 -device vfio-ap,sysfsdev=/sys/bus/mdev/devices/XXX,iommufd=iommufd0

Suggested-by: Alex Williamson 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Matthew Rosato 
Reviewed-by: Cédric Le Goater 
---
 hw/vfio/ap.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index bbf69ff55a..80629609ae 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -11,10 +11,12 @@
  */
 
 #include "qemu/osdep.h"
+#include CONFIG_DEVICES /* CONFIG_IOMMUFD */
 #include 
 #include 
 #include "qapi/error.h"
 #include "hw/vfio/vfio-common.h"
+#include "sysemu/iommufd.h"
 #include "hw/s390x/ap-device.h"
 #include "qemu/error-report.h"
 #include "qemu/event_notifier.h"
@@ -204,6 +206,10 @@ static void vfio_ap_unrealize(DeviceState *dev)
 
 static Property vfio_ap_properties[] = {
 DEFINE_PROP_STRING("sysfsdev", VFIOAPDevice, vdev.sysfsdev),
+#ifdef CONFIG_IOMMUFD
+DEFINE_PROP_LINK("iommufd", VFIOAPDevice, vdev.iommufd,
+ TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *),
+#endif
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.34.1

[PATCH v7 17/27] vfio/ccw: Make vfio cdev pre-openable by passing a file handle

2023-11-21 Thread Zhenzhong Duan

This gives management tools like libvirt a chance to open the vfio
cdev with privilege and pass the FD to qemu. This way qemu never needs
privilege to open a VFIO or iommu cdev node.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Matthew Rosato 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Eric Farman 
---
 hw/vfio/ccw.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index d2d58bb677..2afdf17dbe 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -590,11 +590,12 @@ static void vfio_ccw_realize(DeviceState *dev, Error 
**errp)
 }
 }
 
+if (vfio_device_get_name(vbasedev, errp) < 0) {
+return;
+}
+
 vbasedev->ops = &vfio_ccw_ops;
 vbasedev->type = VFIO_DEVICE_TYPE_CCW;
-vbasedev->name = g_strdup_printf("%x.%x.%04x", vcdev->cdev.hostid.cssid,
-   vcdev->cdev.hostid.ssid,
-   vcdev->cdev.hostid.devid);
 vbasedev->dev = dev;
 
 /*
@@ -691,12 +692,29 @@ static const VMStateDescription vfio_ccw_vmstate = {
 .unmigratable = 1,
 };
 
+static void vfio_ccw_instance_init(Object *obj)
+{
+VFIOCCWDevice *vcdev = VFIO_CCW(obj);
+
+vcdev->vdev.fd = -1;
+}
+
+#ifdef CONFIG_IOMMUFD
+static void vfio_ccw_set_fd(Object *obj, const char *str, Error **errp)
+{
+vfio_device_set_fd(&VFIO_CCW(obj)->vdev, str, errp);
+}
+#endif
+
 static void vfio_ccw_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 S390CCWDeviceClass *cdc = S390_CCW_DEVICE_CLASS(klass);
 
 device_class_set_props(dc, vfio_ccw_properties);
+#ifdef CONFIG_IOMMUFD
+object_class_property_add_str(klass, "fd", NULL, vfio_ccw_set_fd);
+#endif
 dc->vmsd = &vfio_ccw_vmstate;
 dc->desc = "VFIO-based subchannel assignment";
 set_bit(DEVICE_CATEGORY_MISC, dc->categories);
@@ -714,6 +732,7 @@ static const TypeInfo vfio_ccw_info = {
 .name = TYPE_VFIO_CCW,
 .parent = TYPE_S390_CCW,
 .instance_size = sizeof(VFIOCCWDevice),
+.instance_init = vfio_ccw_instance_init,
 .class_init = vfio_ccw_class_init,
 };
 
-- 
2.34.1

[PATCH v7 04/27] vfio/iommufd: Implement the iommufd backend

2023-11-21 Thread Zhenzhong Duan

From: Yi Liu 

The iommufd backend is implemented based on the new /dev/iommu user API.
This backend obviously depends on CONFIG_IOMMUFD.

The iommufd backend does not support dirty page sync yet.

Co-authored-by: Eric Auger 
Signed-off-by: Yi Liu 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Cédric Le Goater 
Tested-by: Eric Auger 
---
 include/hw/vfio/vfio-common.h |  11 +
 hw/vfio/common.c  |   6 +
 hw/vfio/iommufd.c | 422 ++
 hw/vfio/meson.build   |   3 +
 hw/vfio/trace-events  |  10 +
 5 files changed, 452 insertions(+)
 create mode 100644 hw/vfio/iommufd.c

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 24ecc0e7ee..3dac5c167e 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -89,6 +89,14 @@ typedef struct VFIOHostDMAWindow {
 QLIST_ENTRY(VFIOHostDMAWindow) hostwin_next;
 } VFIOHostDMAWindow;
 
+typedef struct IOMMUFDBackend IOMMUFDBackend;
+
+typedef struct VFIOIOMMUFDContainer {
+VFIOContainerBase bcontainer;
+IOMMUFDBackend *be;
+uint32_t ioas_id;
+} VFIOIOMMUFDContainer;
+
 typedef struct VFIODeviceOps VFIODeviceOps;
 
 typedef struct VFIODevice {
@@ -116,6 +124,8 @@ typedef struct VFIODevice {
 OnOffAuto pre_copy_dirty_page_tracking;
 bool dirty_pages_supported;
 bool dirty_tracking;
+int devid;
+IOMMUFDBackend *iommufd;
 } VFIODevice;
 
 struct VFIODeviceOps {
@@ -201,6 +211,7 @@ typedef QLIST_HEAD(VFIODeviceList, VFIODevice) 
VFIODeviceList;
 extern VFIOGroupList vfio_group_list;
 extern VFIODeviceList vfio_device_list;
 extern const VFIOIOMMUOps vfio_legacy_ops;
+extern const VFIOIOMMUOps vfio_iommufd_ops;
 extern const MemoryListener vfio_memory_listener;
 extern int vfio_kvm_device_fd;
 
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 934f4f5446..6569732b7a 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -19,6 +19,7 @@
  */
 
 #include "qemu/osdep.h"
+#include CONFIG_DEVICES /* CONFIG_IOMMUFD */
 #include 
 #ifdef CONFIG_KVM
 #include 
@@ -1503,6 +1504,11 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev,
 {
 const VFIOIOMMUOps *ops = &vfio_legacy_ops;
 
+#ifdef CONFIG_IOMMUFD
+if (vbasedev->iommufd) {
+ops = &vfio_iommufd_ops;
+}
+#endif
 return ops->attach_device(name, vbasedev, as, errp);
 }
 
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
new file mode 100644
index 00..6d31aeac7b
--- /dev/null
+++ b/hw/vfio/iommufd.c
@@ -0,0 +1,422 @@
+/*
+ * iommufd container backend
+ *
+ * Copyright (C) 2023 Intel Corporation.
+ * Copyright Red Hat, Inc. 2023
+ *
+ * Authors: Yi Liu 
+ *  Eric Auger 
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include 
+#include 
+#include 
+
+#include "hw/vfio/vfio-common.h"
+#include "qemu/error-report.h"
+#include "trace.h"
+#include "qapi/error.h"
+#include "sysemu/iommufd.h"
+#include "hw/qdev-core.h"
+#include "sysemu/reset.h"
+#include "qemu/cutils.h"
+#include "qemu/chardev_open.h"
+
+static int iommufd_cdev_map(VFIOContainerBase *bcontainer, hwaddr iova,
+ram_addr_t size, void *vaddr, bool readonly)
+{
+VFIOIOMMUFDContainer *container =
+container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);
+
+return iommufd_backend_map_dma(container->be,
+   container->ioas_id,
+   iova, size, vaddr, readonly);
+}
+
+static int iommufd_cdev_unmap(VFIOContainerBase *bcontainer,
+  hwaddr iova, ram_addr_t size,
+  IOMMUTLBEntry *iotlb)
+{
+VFIOIOMMUFDContainer *container =
+container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);
+
+/* TODO: Handle dma_unmap_bitmap with iotlb args (migration) */
+return iommufd_backend_unmap_dma(container->be,
+ container->ioas_id, iova, size);
+}
+
+static int iommufd_cdev_kvm_device_add(VFIODevice *vbasedev, Error **errp)
+{
+return vfio_kvm_device_add_fd(vbasedev->fd, errp);
+}
+
+static void iommufd_cdev_kvm_device_del(VFIODevice *vbasedev)
+{
+Error *err = NULL;
+
+if (vfio_kvm_device_del_fd(vbasedev->fd, &err)) {
+error_report_err(err);
+}
+}
+
+static int iommufd_cdev_connect_and_bind(VFIODevice *vbasedev, Error **errp)
+{
+IOMMUFDBackend *iommufd = vbasedev->iommufd;
+struct vfio_device_bind_iommufd bind = {
+.argsz = sizeof(bind),
+.flags = 0,
+};
+int ret;
+
+ret = iommufd_backend_connect(iommufd, errp);
+if (ret) {
+return ret;
+}
+
+/*
+ * Add device to kvm-vfio to be prepared for the tracking
+ * in KVM. Especially for some emulated devices, it requires
+ * to have kvm information in the device open.
+ */
+ret = iommufd_cdev_kvm_device_add(vbasedev, errp);
+if (ret) {
+goto err_kvm_device_add;
+}
+
+/* Bin

[PATCH v7 16/27] vfio/ccw: Allow the selection of a given iommu backend

2023-11-21 Thread Zhenzhong Duan

QEMU now supports two types of iommu backends, so add the capability
to select one of them. The choice depends on whether an iommufd object
has been linked with the vfio-ccw device.

To use the legacy backend, the user shall not link the vfio-ccw
device with any iommufd object:

 -device vfio-ccw,sysfsdev=/sys/bus/mdev/devices/XXX

This is called the legacy mode/backend.

To use the iommufd backend (/dev/iommu), the user shall pass an
iommufd object id in the vfio-ccw device options:

 -object iommufd,id=iommufd0
 -device vfio-ccw,sysfsdev=/sys/bus/mdev/devices/XXX,iommufd=iommufd0

Suggested-by: Alex Williamson 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Matthew Rosato 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Eric Farman 
---
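Note: the selection rule in the commit message above can be sketched as
follows. This is a minimal illustration only; the struct layouts and the
helper name vfio_backend_name() are hypothetical stand-ins, not the QEMU
definitions. The only point is that a non-NULL iommufd link selects the
iommufd backend and a NULL link selects the legacy one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the QEMU types; only the link matters here. */
typedef struct IOMMUFDBackend {
    int fd;
} IOMMUFDBackend;

typedef struct VFIODevice {
    const char *sysfsdev;
    IOMMUFDBackend *iommufd;   /* NULL unless "iommufd=<id>" was given */
} VFIODevice;

/* Return the backend name implied by the device options. */
static const char *vfio_backend_name(const VFIODevice *vbasedev)
{
    assert(vbasedev != NULL);
    return vbasedev->iommufd ? "iommufd" : "legacy";
}
```

A device created without iommufd= keeps a NULL link and therefore stays on
the legacy path.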
 hw/vfio/ccw.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index d857bb8d0f..d2d58bb677 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -15,12 +15,14 @@
  */
 
 #include "qemu/osdep.h"
+#include CONFIG_DEVICES /* CONFIG_IOMMUFD */
 #include 
 #include 
 #include 
 
 #include "qapi/error.h"
 #include "hw/vfio/vfio-common.h"
+#include "sysemu/iommufd.h"
 #include "hw/s390x/s390-ccw.h"
 #include "hw/s390x/vfio-ccw.h"
 #include "hw/qdev-properties.h"
@@ -677,6 +679,10 @@ static void vfio_ccw_unrealize(DeviceState *dev)
 static Property vfio_ccw_properties[] = {
 DEFINE_PROP_STRING("sysfsdev", VFIOCCWDevice, vdev.sysfsdev),
 DEFINE_PROP_BOOL("force-orb-pfch", VFIOCCWDevice, force_orb_pfch, false),
+#ifdef CONFIG_IOMMUFD
+DEFINE_PROP_LINK("iommufd", VFIOCCWDevice, vdev.iommufd,
+ TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *),
+#endif
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.34.1

[PATCH v7 06/27] vfio/iommufd: Add support for iova_ranges and pgsizes

2023-11-21 Thread Zhenzhong Duan

Some vIOMMUs, such as virtio-iommu, use the host's IOVA ranges to set up
reserved ranges for passthrough devices, so that the guest will not use
an IOVA range beyond what the host supports.

Use an IOMMUFD uAPI to get the host's IOVA ranges and pass them to the
vIOMMU, just like the legacy backend. If this fails, fall back to the
64bit IOVA range.

Also use out_iova_alignment returned from the uAPI as pgsizes, keeping
qemu_real_host_page_size() only as a fallback.

Signed-off-by: Zhenzhong Duan 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Eric Auger 
Tested-by: Eric Auger 
---
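Note: the patch below uses a two-pass query (probe for the count, allocate,
query again). A self-contained sketch of that pattern follows. It does not
call the real ioctl: fake_query(), the RangeQuery layout and the two host
ranges are invented for illustration, assuming only that a too-small buffer
fails with EMSGSIZE while still reporting the required count.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint64_t start, last;
} IovaRange;

typedef struct {
    uint32_t num_iovas;   /* in: array capacity, out: ranges available */
    IovaRange *ranges;    /* caller-provided array, may be NULL */
} RangeQuery;

/* Hypothetical stand-in for the ioctl; the ranges are made up. */
static int fake_query(RangeQuery *q)
{
    static const IovaRange host[] = {
        { 0x0, 0xfedfffff },
        { 0xfef00000, UINT64_MAX },
    };
    uint32_t avail = sizeof(host) / sizeof(host[0]);
    uint32_t cap = q->num_iovas;

    q->num_iovas = avail;
    if (cap < avail) {
        errno = EMSGSIZE;
        return -1;
    }
    for (uint32_t i = 0; i < avail; i++) {
        q->ranges[i] = host[i];
    }
    return 0;
}

/* Returns the number of ranges stored in *out, or a negative errno. */
static int get_iova_ranges(IovaRange **out)
{
    RangeQuery q = { 0, NULL };
    int saved;

    assert(out != NULL);
    /* Pass 1: zero capacity, only EMSGSIZE is an acceptable failure. */
    if (fake_query(&q) && errno != EMSGSIZE) {
        return -errno;
    }
    q.ranges = calloc(q.num_iovas, sizeof(*q.ranges));
    if (!q.ranges) {
        return -ENOMEM;
    }
    /* Pass 2: the array is now large enough. */
    if (fake_query(&q)) {
        saved = errno;    /* save errno before free() can change it */
        free(q.ranges);
        return -saved;
    }
    *out = q.ranges;
    return (int)q.num_iovas;
}
```

Saving errno before releasing the buffer is a choice made in this sketch so
that the reported error is the one from the failed query.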
 hw/vfio/iommufd.c | 56 ++-
 1 file changed, 55 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 6d31aeac7b..01b448e840 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -261,6 +261,53 @@ static int iommufd_cdev_ram_block_discard_disable(bool state)
 return ram_block_uncoordinated_discard_disable(state);
 }
 
+static int iommufd_cdev_get_info_iova_range(VFIOIOMMUFDContainer *container,
+uint32_t ioas_id, Error **errp)
+{
+VFIOContainerBase *bcontainer = &container->bcontainer;
+struct iommu_ioas_iova_ranges *info;
+struct iommu_iova_range *iova_ranges;
+int ret, sz, fd = container->be->fd;
+
+info = g_malloc0(sizeof(*info));
+info->size = sizeof(*info);
+info->ioas_id = ioas_id;
+
+ret = ioctl(fd, IOMMU_IOAS_IOVA_RANGES, info);
+if (ret && errno != EMSGSIZE) {
+goto error;
+}
+
+sz = info->num_iovas * sizeof(struct iommu_iova_range);
+info = g_realloc(info, sizeof(*info) + sz);
+info->allowed_iovas = (uintptr_t)(info + 1);
+
+ret = ioctl(fd, IOMMU_IOAS_IOVA_RANGES, info);
+if (ret) {
+goto error;
+}
+
+iova_ranges = (struct iommu_iova_range *)(uintptr_t)info->allowed_iovas;
+
+for (int i = 0; i < info->num_iovas; i++) {
+Range *range = g_new(Range, 1);
+
+range_set_bounds(range, iova_ranges[i].start, iova_ranges[i].last);
+bcontainer->iova_ranges =
+range_list_insert(bcontainer->iova_ranges, range);
+}
+bcontainer->pgsizes = info->out_iova_alignment;
+
+g_free(info);
+return 0;
+
+error:
+ret = -errno;
+g_free(info);
+error_setg_errno(errp, errno, "Cannot get IOVA ranges");
+return ret;
+}
+
 static int iommufd_cdev_attach(const char *name, VFIODevice *vbasedev,
AddressSpace *as, Error **errp)
 {
@@ -335,7 +382,14 @@ static int iommufd_cdev_attach(const char *name, VFIODevice *vbasedev,
 goto err_discard_disable;
 }
 
-bcontainer->pgsizes = qemu_real_host_page_size();
+ret = iommufd_cdev_get_info_iova_range(container, ioas_id, &err);
+if (ret) {
+error_append_hint(&err,
+   "Fallback to default 64bit IOVA range and 4K page size\n");
+warn_report_err(err);
+err = NULL;
+bcontainer->pgsizes = qemu_real_host_page_size();
+}
 
 bcontainer->listener = vfio_memory_listener;
 memory_listener_register(&bcontainer->listener, bcontainer->space->as);
-- 
2.34.1

[PATCH v7 01/27] backends/iommufd: Introduce the iommufd object

2023-11-21 Thread Zhenzhong Duan

From: Eric Auger 

Introduce an iommufd object which allows the interaction
with the host /dev/iommu device.

/dev/iommu may already have been opened outside of QEMU, in which
case the fd can be passed directly along with the iommufd object.

This allows the iommufd object to be shared across several
subsystems (VFIO, VDPA, ...). For example, libvirt would open
/dev/iommu once.

If no fd is passed along with the iommufd object, the /dev/iommu
is opened by the qemu code.

Suggested-by: Alex Williamson 
Signed-off-by: Eric Auger 
Signed-off-by: Yi Liu 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Cédric Le Goater 
Tested-by: Eric Auger 
---
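Note: the two opening modes described above (fd passed in versus opened by
QEMU) can be sketched with the fd, owned and users fields that the new
header declares. This is a simplified illustration under assumptions:
fake_open_iommu() and fake_close() are hypothetical stand-ins for the real
open/close calls, locking is omitted, and the logic is not claimed to match
backends/iommufd.c line for line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct Backend {
    int fd;           /* /dev/iommu file descriptor */
    bool owned;       /* true if the fd is opened internally */
    uint32_t users;   /* number of connected users */
} Backend;

/* Hypothetical stand-ins for open("/dev/iommu") and close(). */
static int fake_open_iommu(void)
{
    return 42;
}

static void fake_close(int fd)
{
    (void)fd;
}

static int backend_connect(Backend *be)
{
    assert(be->users < UINT32_MAX);
    /* Only the first user of an internally owned backend opens the fd. */
    if (be->owned && be->users == 0) {
        int fd = fake_open_iommu();

        if (fd < 0) {
            return fd;
        }
        be->fd = fd;
    }
    be->users++;
    return 0;
}

static void backend_disconnect(Backend *be)
{
    assert(be->users > 0);
    be->users--;
    /* A passed-in fd belongs to the caller and is left open. */
    if (be->users == 0 && be->owned) {
        fake_close(be->fd);
        be->fd = -1;
    }
}
```

With a pre-opened fd (owned == false) connect and disconnect only adjust
the user count, which is what lets several subsystems share one object.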
 MAINTAINERS  |   8 ++
 qapi/qom.json|  19 +++
 include/sysemu/iommufd.h |  38 ++
 backends/iommufd.c   | 245 +++
 backends/Kconfig |   4 +
 backends/meson.build |   1 +
 backends/trace-events|  10 ++
 qemu-options.hx  |  12 ++
 8 files changed, 337 insertions(+)
 create mode 100644 include/sysemu/iommufd.h
 create mode 100644 backends/iommufd.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 695e0bd34f..a5a446914a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2167,6 +2167,14 @@ F: hw/vfio/ap.c
 F: docs/system/s390x/vfio-ap.rst
 L: qemu-s3...@nongnu.org
 
+iommufd
+M: Yi Liu 
+M: Eric Auger 
+M: Zhenzhong Duan 
+S: Supported
+F: backends/iommufd.c
+F: include/sysemu/iommufd.h
+
 vhost
 M: Michael S. Tsirkin 
 S: Supported
diff --git a/qapi/qom.json b/qapi/qom.json
index c53ef978ff..95516ba325 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -794,6 +794,23 @@
 { 'struct': 'VfioUserServerProperties',
   'data': { 'socket': 'SocketAddress', 'device': 'str' } }
 
+##
+# @IOMMUFDProperties:
+#
+# Properties for iommufd objects.
+#
+# @fd: file descriptor name previously passed via 'getfd' command,
+# which represents a pre-opened /dev/iommu.  This allows the
+# iommufd object to be shared across several subsystems
+# (VFIO, VDPA, ...), and the file descriptor to be shared
+# with other processes, e.g. DPDK.  (default: QEMU opens
+# /dev/iommu by itself)
+#
+# Since: 9.0
+##
+{ 'struct': 'IOMMUFDProperties',
+  'data': { '*fd': 'str' } }
+
 ##
 # @RngProperties:
 #
@@ -934,6 +951,7 @@
 'input-barrier',
 { 'name': 'input-linux',
   'if': 'CONFIG_LINUX' },
+'iommufd',
 'iothread',
 'main-loop',
 { 'name': 'memory-backend-epc',
@@ -1003,6 +1021,7 @@
   'input-barrier':  'InputBarrierProperties',
   'input-linux':{ 'type': 'InputLinuxProperties',
   'if': 'CONFIG_LINUX' },
+  'iommufd':'IOMMUFDProperties',
   'iothread':   'IothreadProperties',
   'main-loop':  'MainLoopProperties',
   'memory-backend-epc': { 'type': 'MemoryBackendEpcProperties',
diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h
new file mode 100644
index 00..9c5524b0ed
--- /dev/null
+++ b/include/sysemu/iommufd.h
@@ -0,0 +1,38 @@
+#ifndef SYSEMU_IOMMUFD_H
+#define SYSEMU_IOMMUFD_H
+
+#include "qom/object.h"
+#include "qemu/thread.h"
+#include "exec/hwaddr.h"
+#include "exec/cpu-common.h"
+
+#define TYPE_IOMMUFD_BACKEND "iommufd"
+OBJECT_DECLARE_TYPE(IOMMUFDBackend, IOMMUFDBackendClass, IOMMUFD_BACKEND)
+
+struct IOMMUFDBackendClass {
+ObjectClass parent_class;
+};
+
+struct IOMMUFDBackend {
+Object parent;
+
+/*< protected >*/
+int fd;/* /dev/iommu file descriptor */
+bool owned;/* is the /dev/iommu opened internally */
+QemuMutex lock;
+uint32_t users;
+
+/*< public >*/
+};
+
+int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp);
+void iommufd_backend_disconnect(IOMMUFDBackend *be);
+
+int iommufd_backend_alloc_ioas(IOMMUFDBackend *be, uint32_t *ioas_id,
+   Error **errp);
+void iommufd_backend_free_id(IOMMUFDBackend *be, uint32_t id);
+int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova,
+ram_addr_t size, void *vaddr, bool readonly);
+int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id,
+  hwaddr iova, ram_addr_t size);
+#endif
diff --git a/backends/iommufd.c b/backends/iommufd.c
new file mode 100644
index 00..ba58a0eb0d
--- /dev/null
+++ b/backends/iommufd.c
@@ -0,0 +1,245 @@
+/*
+ * iommufd container backend
+ *
+ * Copyright (C) 2023 Intel Corporation.
+ * Copyright Red Hat, Inc. 2023
+ *
+ * Authors: Yi Liu 
+ *  Eric Auger 
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/iommufd.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qerror.h"
+#include "qemu/module.h"
+#include "qom/object_interfaces.h"
+#include "qemu/error-report.h"
+#include "monitor/monitor.h"
+#include "trace.h"
+#include 
+#include 
+
+static void iommufd_backend_init(Object *

[PATCH v7 25/27] vfio/ccw: Move VFIODevice initializations in vfio_ccw_instance_init

2023-11-21 Thread Zhenzhong Duan

Some of the VFIODevice initializations are in vfio_ccw_realize;
move all of them to vfio_ccw_instance_init.

No functional change intended.

Suggested-by: Cédric Le Goater 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Eric Farman 
---
 hw/vfio/ccw.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 2afdf17dbe..6305a4c1b8 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -594,20 +594,6 @@ static void vfio_ccw_realize(DeviceState *dev, Error **errp)
 return;
 }
 
-vbasedev->ops = &vfio_ccw_ops;
-vbasedev->type = VFIO_DEVICE_TYPE_CCW;
-vbasedev->dev = dev;
-
-/*
- * All vfio-ccw devices are believed to operate in a way compatible with
- * discarding of memory in RAM blocks, ie. pages pinned in the host are
- * in the current working set of the guest driver and therefore never
- * overlap e.g., with pages available to the guest balloon driver.  This
- * needs to be set before vfio_get_device() for vfio common to handle
- * ram_block_discard_disable().
- */
-vbasedev->ram_block_discard_allowed = true;
-
 ret = vfio_attach_device(cdev->mdevid, vbasedev,
  &address_space_memory, errp);
 if (ret) {
@@ -695,8 +681,22 @@ static const VMStateDescription vfio_ccw_vmstate = {
 static void vfio_ccw_instance_init(Object *obj)
 {
 VFIOCCWDevice *vcdev = VFIO_CCW(obj);
+VFIODevice *vbasedev = &vcdev->vdev;
+
+vbasedev->type = VFIO_DEVICE_TYPE_CCW;
+vbasedev->ops = &vfio_ccw_ops;
+vbasedev->dev = DEVICE(vcdev);
+vbasedev->fd = -1;
 
-vcdev->vdev.fd = -1;
+/*
+ * All vfio-ccw devices are believed to operate in a way compatible with
+ * discarding of memory in RAM blocks, ie. pages pinned in the host are
+ * in the current working set of the guest driver and therefore never
+ * overlap e.g., with pages available to the guest balloon driver.  This
+ * needs to be set before vfio_get_device() for vfio common to handle
+ * ram_block_discard_disable().
+ */
+vbasedev->ram_block_discard_allowed = true;
 }
 
 #ifdef CONFIG_IOMMUFD
-- 
2.34.1

[PATCH v7 23/27] vfio/platform: Move VFIODevice initializations in vfio_platform_instance_init

2023-11-21 Thread Zhenzhong Duan

Some of the VFIODevice initializations are in vfio_platform_realize;
move all of them to vfio_platform_instance_init.

No functional change intended.

Suggested-by: Cédric Le Goater 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/vfio/platform.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index a97d9c6234..506eb8193f 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -581,10 +581,6 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
 VFIODevice *vbasedev = &vdev->vbasedev;
 int i, ret;
 
-vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
-vbasedev->dev = dev;
-vbasedev->ops = &vfio_platform_ops;
-
 qemu_mutex_init(&vdev->intp_mutex);
 
 trace_vfio_platform_realize(vbasedev->sysfsdev ?
@@ -659,8 +655,12 @@ static Property vfio_platform_dev_properties[] = {
 static void vfio_platform_instance_init(Object *obj)
 {
 VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(obj);
+VFIODevice *vbasedev = &vdev->vbasedev;
 
-vdev->vbasedev.fd = -1;
+vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
+vbasedev->ops = &vfio_platform_ops;
+vbasedev->dev = DEVICE(vdev);
+vbasedev->fd = -1;
 }
 
 #ifdef CONFIG_IOMMUFD
-- 
2.34.1

[PATCH v7 02/27] util/char_dev: Add open_cdev()

2023-11-21 Thread Zhenzhong Duan

From: Yi Liu 

/dev/vfio/devices/vfioX may not exist. In that case it is still possible
to open /dev/char/$major:$minor instead. Add a helper function to abstract
the cdev open.

Suggested-by: Jason Gunthorpe 
Signed-off-by: Yi Liu 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Eric Auger 
Tested-by: Eric Auger 
---
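Note: the fallback path named in the commit message is built from the
device number. A minimal sketch of just that step follows, assuming Linux
with glibc, where major() and minor() come from <sys/sysmacros.h>. The
helper name cdev_fallback_path() is hypothetical and nothing is opened.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Format "/dev/char/<major>:<minor>" into buf.
 * Returns the snprintf() result, i.e. the length that was needed.
 */
static int cdev_fallback_path(char *buf, size_t len, dev_t cdev)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "/dev/char/%u:%u", major(cdev), minor(cdev));
}
```

As the comment in the patch points out, this path only exists when udev
creates the /dev/char/ symlinks.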
 MAINTAINERS |  2 +
 include/qemu/chardev_open.h | 16 
 util/chardev_open.c | 81 +
 util/meson.build|  1 +
 4 files changed, 100 insertions(+)
 create mode 100644 include/qemu/chardev_open.h
 create mode 100644 util/chardev_open.c

diff --git a/MAINTAINERS b/MAINTAINERS
index a5a446914a..ca70bb4e64 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2174,6 +2174,8 @@ M: Zhenzhong Duan 
 S: Supported
 F: backends/iommufd.c
 F: include/sysemu/iommufd.h
+F: include/qemu/chardev_open.h
+F: util/chardev_open.c
 
 vhost
 M: Michael S. Tsirkin 
diff --git a/include/qemu/chardev_open.h b/include/qemu/chardev_open.h
new file mode 100644
index 00..64e8fcfdcb
--- /dev/null
+++ b/include/qemu/chardev_open.h
@@ -0,0 +1,16 @@
+/*
+ * QEMU Chardev Helper
+ *
+ * Copyright (C) 2023 Intel Corporation.
+ *
+ * Authors: Yi Liu 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_CHARDEV_OPEN_H
+#define QEMU_CHARDEV_OPEN_H
+
+int open_cdev(const char *devpath, dev_t cdev);
+#endif
diff --git a/util/chardev_open.c b/util/chardev_open.c
new file mode 100644
index 00..f776429788
--- /dev/null
+++ b/util/chardev_open.c
@@ -0,0 +1,81 @@
+/*
+ * Copyright (c) 2019, Mellanox Technologies. All rights reserved.
+ * Copyright (C) 2023 Intel Corporation.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *  Redistribution and use in source and binary forms, with or
+ *  without modification, are permitted provided that the following
+ *  conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ * Authors: Yi Liu 
+ *
+ * Copied from
+ * https://github.com/linux-rdma/rdma-core/blob/master/util/open_cdev.c
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/chardev_open.h"
+
+static int open_cdev_internal(const char *path, dev_t cdev)
+{
+struct stat st;
+int fd;
+
+fd = qemu_open_old(path, O_RDWR);
+if (fd == -1) {
+return -1;
+}
+if (fstat(fd, &st) || !S_ISCHR(st.st_mode) ||
+(cdev != 0 && st.st_rdev != cdev)) {
+close(fd);
+return -1;
+}
+return fd;
+}
+
+static int open_cdev_robust(dev_t cdev)
+{
+g_autofree char *devpath = NULL;
+
+/*
+ * This assumes that udev is being used and is creating the /dev/char/
+ * symlinks.
+ */
+devpath = g_strdup_printf("/dev/char/%u:%u", major(cdev), minor(cdev));
+return open_cdev_internal(devpath, cdev);
+}
+
+int open_cdev(const char *devpath, dev_t cdev)
+{
+int fd;
+
+fd = open_cdev_internal(devpath, cdev);
+if (fd == -1 && cdev != 0) {
+return open_cdev_robust(cdev);
+}
+return fd;
+}
diff --git a/util/meson.build b/util/meson.build
index c2322ef6e7..174c133368 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -108,6 +108,7 @@ if have_block
 util_ss.add(files('filemonitor-stub.c'))
   endif
   util_ss.add(when: 'CONFIG_LINUX', if_true: files('vfio-helpers.c'))
+  util_ss.add(when: 'CONFIG_LINUX', if_true: files('chardev_open.c'))
 endif
 
 if cpu == 'aarch64'
-- 
2.34.1

[PATCH v7 18/27] vfio: Make VFIOContainerBase pointer parameter const in VFIOIOMMUOps callbacks

2023-11-21 Thread Zhenzhong Duan

Some of the callbacks in VFIOIOMMUOps take a VFIOContainerBase pointer but
only need read access to the sub-objects of VFIOContainerBase. So make
VFIOContainerBase, VFIOContainer and VFIOIOMMUFDContainer const in these
callbacks.

Local functions called by those callbacks need the same change to avoid
build errors.

Suggested-by: Cédric Le Goater 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Eric Auger 
Tested-by: Eric Auger 
---
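Note: a read-only callback usually has to recover the enclosing object from
the embedded base without dropping the qualifier. The sketch below shows one
way to do that. The types and the const_container_of() macro are
hypothetical and are not QEMU's container_of(); they only illustrate that
const can be carried through the pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical base and derived container types. */
typedef struct Base {
    int unused;
} Base;

typedef struct Derived {
    uint32_t ioas_id;
    Base base;
} Derived;

/* Recover the enclosing object while keeping the const qualifier. */
#define const_container_of(ptr, type, member) \
    ((const type *)((const char *)(ptr) - offsetof(type, member)))

/* A read-only callback: it cannot modify the container through 'b'. */
static uint32_t derived_ioas_id(const Base *b)
{
    assert(b != NULL);
    const Derived *d = const_container_of(b, Derived, base);

    return d->ioas_id;
}
```

Any write through d is then rejected at compile time, which is the kind of
check a const parameter gives the callbacks.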
 include/hw/vfio/vfio-common.h | 12 ++
 include/hw/vfio/vfio-container-base.h | 12 ++
 hw/vfio/common.c  |  9 +++
 hw/vfio/container-base.c  |  2 +-
 hw/vfio/container.c   | 34 ++-
 hw/vfio/iommufd.c |  8 +++
 6 files changed, 42 insertions(+), 35 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 697bf24a35..efcba19f66 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -244,13 +244,15 @@ bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp);
 void vfio_migration_exit(VFIODevice *vbasedev);
 
 int vfio_bitmap_alloc(VFIOBitmap *vbmap, hwaddr size);
-bool vfio_devices_all_running_and_mig_active(VFIOContainerBase *bcontainer);
-bool vfio_devices_all_device_dirty_tracking(VFIOContainerBase *bcontainer);
-int vfio_devices_query_dirty_bitmap(VFIOContainerBase *bcontainer,
+bool
+vfio_devices_all_running_and_mig_active(const VFIOContainerBase *bcontainer);
+bool
+vfio_devices_all_device_dirty_tracking(const VFIOContainerBase *bcontainer);
+int vfio_devices_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
 VFIOBitmap *vbmap, hwaddr iova,
 hwaddr size);
-int vfio_get_dirty_bitmap(VFIOContainerBase *bcontainer, uint64_t iova,
- uint64_t size, ram_addr_t ram_addr);
+int vfio_get_dirty_bitmap(const VFIOContainerBase *bcontainer, uint64_t iova,
+  uint64_t size, ram_addr_t ram_addr);
 
 /* Returns 0 on success, or a negative errno. */
 int vfio_device_get_name(VFIODevice *vbasedev, Error **errp);
diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
index 45bb19c767..2ae297ccda 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -82,7 +82,7 @@ void vfio_container_del_section_window(VFIOContainerBase *bcontainer,
MemoryRegionSection *section);
 int vfio_container_set_dirty_page_tracking(VFIOContainerBase *bcontainer,
bool start);
-int vfio_container_query_dirty_bitmap(VFIOContainerBase *bcontainer,
+int vfio_container_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
   VFIOBitmap *vbmap,
   hwaddr iova, hwaddr size);
 
@@ -93,18 +93,20 @@ void vfio_container_destroy(VFIOContainerBase *bcontainer);
 
 struct VFIOIOMMUOps {
 /* basic feature */
-int (*dma_map)(VFIOContainerBase *bcontainer,
+int (*dma_map)(const VFIOContainerBase *bcontainer,
hwaddr iova, ram_addr_t size,
void *vaddr, bool readonly);
-int (*dma_unmap)(VFIOContainerBase *bcontainer,
+int (*dma_unmap)(const VFIOContainerBase *bcontainer,
  hwaddr iova, ram_addr_t size,
  IOMMUTLBEntry *iotlb);
 int (*attach_device)(const char *name, VFIODevice *vbasedev,
  AddressSpace *as, Error **errp);
 void (*detach_device)(VFIODevice *vbasedev);
 /* migration feature */
-int (*set_dirty_page_tracking)(VFIOContainerBase *bcontainer, bool start);
-int (*query_dirty_bitmap)(VFIOContainerBase *bcontainer, VFIOBitmap *vbmap,
+int (*set_dirty_page_tracking)(const VFIOContainerBase *bcontainer,
+   bool start);
+int (*query_dirty_bitmap)(const VFIOContainerBase *bcontainer,
+  VFIOBitmap *vbmap,
   hwaddr iova, hwaddr size);
 /* PCI specific */
 int (*pci_hot_reset)(VFIODevice *vbasedev, bool single);
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6569732b7a..08a3e57672 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -204,7 +204,7 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer)
 return true;
 }
 
-bool vfio_devices_all_device_dirty_tracking(VFIOContainerBase *bcontainer)
+bool vfio_devices_all_device_dirty_tracking(const VFIOContainerBase *bcontainer)
 {
 VFIODevice *vbasedev;
 
@@ -221,7 +221,8 @@ bool vfio_devices_all_device_dirty_tracking(VFIOContainerBase *bcontainer)
  * Check if all VFIO devices are running and migration is active, which is
  * essentially equivalent to the migration being in pre-copy phase.
  */
-bool vf

[PATCH v7 22/27] vfio/pci: Move VFIODevice initializations in vfio_instance_init

2023-11-21 Thread Zhenzhong Duan

Some of the VFIODevice initializations are in vfio_realize;
move all of them to vfio_instance_init.

No functional change intended.

Suggested-by: Cédric Le Goater 
Signed-off-by: Zhenzhong Duan 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/vfio/pci.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 445d58c8e5..87405584d7 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2969,9 +2969,6 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
 if (vfio_device_get_name(vbasedev, errp) < 0) {
 return;
 }
-vbasedev->ops = &vfio_pci_ops;
-vbasedev->type = VFIO_DEVICE_TYPE_PCI;
-vbasedev->dev = DEVICE(vdev);
 
 /*
  * Mediated devices *might* operate compatibly with discarding of RAM, but
@@ -3320,6 +3317,7 @@ static void vfio_instance_init(Object *obj)
 {
 PCIDevice *pci_dev = PCI_DEVICE(obj);
 VFIOPCIDevice *vdev = VFIO_PCI(obj);
+VFIODevice *vbasedev = &vdev->vbasedev;
 
 device_add_bootindex_property(obj, &vdev->bootindex,
   "bootindex", NULL,
@@ -3328,7 +3326,11 @@ static void vfio_instance_init(Object *obj)
 vdev->host.bus = ~0U;
 vdev->host.slot = ~0U;
 vdev->host.function = ~0U;
-vdev->vbasedev.fd = -1;
+
+vbasedev->type = VFIO_DEVICE_TYPE_PCI;
+vbasedev->ops = &vfio_pci_ops;
+vbasedev->dev = DEVICE(vdev);
+vbasedev->fd = -1;
 
 vdev->nv_gpudirect_clique = 0xFF;
 
-- 
2.34.1

[PULL 1/4] ppc/pnv: Fix potential overflow in I2C model

2023-11-21 Thread Cédric Le Goater

Coverity warns that "i2c_bus_busy(i2c->busses[i]) << i" might overflow
because the expression is evaluated using 32-bit arithmetic and then
used in a context expecting a uint64_t.

While we are at it, introduce a PNV_I2C_MAX_BUSSES constant and check
the number of busses at realize time.

Fixes: Coverity CID 1523918
Cc: Glenn Miles 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Glenn Miles 
Signed-off-by: Cédric Le Goater 
---
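Note: the Coverity finding is about the width in which the shift is
evaluated. The sketch below reproduces the fixed form of the loop in
isolation. busy_mask() and MAX_BUSSES are hypothetical names chosen for this
example (MAX_BUSSES mirrors PNV_I2C_MAX_BUSSES), and the busy state is a
plain bool array rather than real I2C busses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_BUSSES 64   /* mirrors PNV_I2C_MAX_BUSSES from the patch */

/* Build a 64-bit mask with bit i set when bus i is busy. */
static uint64_t busy_mask(const bool *busy, unsigned num_busses)
{
    uint64_t val = 0;

    /* A shift by 64 or more would be undefined even for uint64_t. */
    assert(num_busses <= MAX_BUSSES);
    for (unsigned i = 0; i < num_busses; i++) {
        /*
         * Widen before shifting. Without the cast, busy[i] is promoted
         * to int and the shift is evaluated in int width, which cannot
         * represent the upper bits of the mask.
         */
        val |= (uint64_t)busy[i] << i;
    }
    return val;
}
```

The bound check plays the same role as the new realize-time check: it keeps
the shift count below the width of the result type.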
 hw/ppc/pnv_i2c.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/pnv_i2c.c b/hw/ppc/pnv_i2c.c
index f75e59e70977..483d91d15a77 100644
--- a/hw/ppc/pnv_i2c.c
+++ b/hw/ppc/pnv_i2c.c
@@ -151,6 +151,7 @@
 #define I2C_RESET_S_SDA_REG 0x11
 
 #define PNV_I2C_FIFO_SIZE 8
+#define PNV_I2C_MAX_BUSSES 64
 
 static I2CBus *pnv_i2c_get_bus(PnvI2C *i2c)
 {
@@ -437,7 +438,7 @@ static uint64_t pnv_i2c_xscom_read(void *opaque, hwaddr addr,
 case I2C_PORT_BUSY_REG: /* compute busy bit for each port  */
 val = 0;
 for (i = 0; i < i2c->num_busses; i++) {
-val |= i2c_bus_busy(i2c->busses[i]) << i;
+val |= (uint64_t)i2c_bus_busy(i2c->busses[i]) << i;
 }
 break;
 
@@ -641,6 +642,11 @@ static void pnv_i2c_realize(DeviceState *dev, Error **errp)
 
 assert(i2c->chip);
 
+if (i2c->num_busses > PNV_I2C_MAX_BUSSES) {
+error_setg(errp, "Invalid number of busses: %u", i2c->num_busses);
+return;
+}
+
 pnv_xscom_region_init(&i2c->xscom_regs, OBJECT(i2c), &pnv_i2c_xscom_ops,
   i2c, "xscom-i2c", PNV9_XSCOM_I2CM_SIZE);
 
-- 
2.42.0

[PULL 0/4] ppc queue

2023-11-21 Thread Cédric Le Goater

The following changes since commit af9264da80073435fd78944bc5a46e695897d7e5:

  Merge tag '20231119-xtensa-1' of https://github.com/OSLL/qemu-xtensa into staging (2023-11-20 05:25:19 -0500)

are available in the Git repository at:

  https://github.com/legoater/qemu/ tags/pull-ppc-20231121

for you to fetch changes up to b664466d8f3c7b448fc7e9bd50d03a36538c6c27:

  ppc/pnv: Fix PNV I2C invalid status after reset (2023-11-21 08:39:58 +0100)


ppc queue:

* PNV I2C fixes
* VSX instruction fix when converting floating point to integer values


Cédric Le Goater (1):
  ppc/pnv: Fix potential overflow in I2C model

Glenn Miles (2):
  ppc/pnv: PNV I2C engines assigned incorrect XSCOM addresses
  ppc/pnv: Fix PNV I2C invalid status after reset

John Platts (1):
  target/ppc: Fix bugs in VSX_CVT_FP_TO_INT and VSX_CVT_FP_TO_INT2 macros

 hw/ppc/pnv.c|   6 +-
 hw/ppc/pnv_i2c.c|  52 +++
 target/ppc/fpu_helper.c |  12 +-
 tests/tcg/ppc64/vsx_f2i_nan.c   | 300 
 tests/tcg/ppc64/Makefile.target |   5 +
 5 files changed, 343 insertions(+), 32 deletions(-)
 create mode 100644 tests/tcg/ppc64/vsx_f2i_nan.c

[PULL 2/4] target/ppc: Fix bugs in VSX_CVT_FP_TO_INT and VSX_CVT_FP_TO_INT2 macros

2023-11-21 Thread Cédric Le Goater

From: John Platts 

The patch below fixes a bug in the VSX_CVT_FP_TO_INT and VSX_CVT_FP_TO_INT2
macros in target/ppc/fpu_helper.c where a non-NaN floating point value from the
source vector is incorrectly converted to 0, 0x8000, or 0x8000
instead of the expected value if a preceding source floating point value from
the same source vector was a NaN.

The bug in the VSX_CVT_FP_TO_INT and VSX_CVT_FP_TO_INT2 macros in
target/ppc/fpu_helper.c was introduced with commit c3f24257e3c0.

This patch also adds a new vsx_f2i_nan test in tests/tcg/ppc64 that checks that
the VSX xvcvspsxws, xvcvspuxws, xvcvspsxds, xvcvspuxds, xvcvdpsxws, xvcvdpuxws,
xvcvdpsxds, and xvcvdpuxds instructions correctly convert non-NaN floating point
values to integer values if the source vector contains NaN floating point 
values.

Fixes: c3f24257e3c0 ("target/ppc: Clear fpstatus flags on helpers missing it")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1941
Signed-off-by: John Platts 
Reviewed-by: Richard Henderson 
Signed-off-by: Cédric Le Goater 
---
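Note: the fix resets the exception flags before each element and merges them
afterwards. The sketch below models that loop with a toy status word. It
does not use softfloat: FpStatus, the FLAG_* values, cvt_round_to_zero() and
the INT32_MIN placeholder result are all assumptions made for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

#define FLAG_INVALID 0x1
#define FLAG_INEXACT 0x2

typedef struct {
    int flags;   /* sticky exception flags, stand-in for float_status */
} FpStatus;

/* Toy float -> int32 conversion that only ever sets (never clears) flags. */
static int32_t cvt_round_to_zero(float f, FpStatus *st)
{
    if (isnan(f) || f >= 2147483648.0f || f < -2147483648.0f) {
        st->flags |= FLAG_INVALID;
        return 0;
    }
    if ((float)(int32_t)f != f) {
        st->flags |= FLAG_INEXACT;
    }
    return (int32_t)f;
}

/* Convert nels elements; returns the union of all per-element flags. */
static int cvt_vector(const float *src, int32_t *dst, int nels, FpStatus *st)
{
    int all_flags = 0;

    assert(nels >= 0);
    for (int i = 0; i < nels; i++) {
        st->flags = 0;                  /* per-element reset */
        dst[i] = cvt_round_to_zero(src[i], st);
        all_flags |= st->flags;         /* remember for the final check */
        if (st->flags & FLAG_INVALID) {
            dst[i] = INT32_MIN;         /* placeholder invalid result */
        }
    }
    st->flags = all_flags;              /* restore the accumulated flags */
    return all_flags;
}
```

If the reset were done once before the loop, the invalid flag raised by an
earlier NaN would still be set when later elements are checked, and their
valid results would be overwritten.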
 target/ppc/fpu_helper.c |  12 +-
 tests/tcg/ppc64/vsx_f2i_nan.c   | 300 
 tests/tcg/ppc64/Makefile.target |   5 +
 3 files changed, 313 insertions(+), 4 deletions(-)
 create mode 100644 tests/tcg/ppc64/vsx_f2i_nan.c

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 03150a0f1082..4b3dcad5d132 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2880,20 +2880,22 @@ uint64_t helper_XSCVSPDPN(uint64_t xb)
 #define VSX_CVT_FP_TO_INT(op, nels, stp, ttp, sfld, tfld, sfi, rnan) \
 void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \
 {\
+int all_flags = 0;   \
 ppc_vsr_t t = { };   \
 int i, flags;\
  \
-helper_reset_fpstatus(env);  \
- \
 for (i = 0; i < nels; i++) { \
+helper_reset_fpstatus(env);  \
 t.tfld = stp##_to_##ttp##_round_to_zero(xb->sfld, &env->fp_status);  \
 flags = env->fp_status.float_exception_flags;\
+all_flags |= flags;  \
 if (unlikely(flags & float_flag_invalid)) {  \
 t.tfld = float_invalid_cvt(env, flags, t.tfld, rnan, 0, GETPC());\
 }\
 }\
  \
 *xt = t; \
+env->fp_status.float_exception_flags = all_flags;\
 do_float_check_status(env, sfi, GETPC());\
 }
 
@@ -2945,15 +2947,16 @@ VSX_CVT_FP_TO_INT128(XSCVQPSQZ, int128, 0x8000ULL);
 #define VSX_CVT_FP_TO_INT2(op, nels, stp, ttp, sfi, rnan)\
 void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \
 {\
+int all_flags = 0;   \
 ppc_vsr_t t = { };   \
 int i, flags;\
  \
-helper_reset_fpstatus(env);  \
- \
 for (i = 0; i < nels; i++) { \
+helper_reset_fpstatus(env);  \
 t.VsrW(2 * i) = stp##_to_##ttp##_round_to_zero(xb->VsrD(i),  \
&env->fp_status); \
 flags = env->fp_status.float_exception_flags;\
+all_flags |= flags;  \
 if (unlikely(flags & float_flag_invalid)) {  \
 t.VsrW(2 * i) = float_invalid_cvt(env, flags, t.VsrW(2 * i), \
   rnan, 0, GETPC()); \
@@ -2962,6 +2965,7 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \
 }

[PULL 4/4] ppc/pnv: Fix PNV I2C invalid status after reset

2023-11-21 Thread Cédric Le Goater

From: Glenn Miles 

The PNV I2C Controller was clearing the status register
after a reset without repopulating the "upper threshold
for I2C ports", "Command Complete" and the SCL/SDA input
level fields.

Fixed this for resets caused by a system reset as well
as from writing to the "Immediate Reset" register.

Fixes: 263b81ee15af ("ppc/pnv: Add an I2C controller model")
Signed-off-by: Glenn Miles 
Signed-off-by: Cédric Le Goater 
---
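Note: the change routes both reset paths through one helper so the status
defaults cannot diverge. A reduced sketch of that structure follows. The
register indices, the bit positions and the I2CModel type are hypothetical;
they do not reflect the real PNV I2C register layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { REG_MODE, REG_STAT, REG_COUNT };

/* Hypothetical bit positions, for illustration only. */
#define STAT_CMD_COMP   (1ull << 0)
#define STAT_SCL_LEVEL  (1ull << 1)
#define STAT_SDA_LEVEL  (1ull << 2)

typedef struct {
    uint64_t regs[REG_COUNT];
} I2CModel;

/* Single reset helper: clear everything, then repopulate the defaults. */
static void i2c_model_reset(I2CModel *m)
{
    assert(m != NULL);
    memset(m->regs, 0, sizeof(m->regs));
    m->regs[REG_STAT] = STAT_CMD_COMP | STAT_SCL_LEVEL | STAT_SDA_LEVEL;
}

/* Guest write to the reset register reuses the same helper. */
static void i2c_model_write_reset_reg(I2CModel *m)
{
    i2c_model_reset(m);
}
```

Because the register write path calls the helper instead of clearing fields
by hand, a guest-triggered reset leaves the same status value as a system
reset.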
 hw/ppc/pnv_i2c.c | 42 ++
 1 file changed, 18 insertions(+), 24 deletions(-)

diff --git a/hw/ppc/pnv_i2c.c b/hw/ppc/pnv_i2c.c
index 384442f898bf..656a48eebe51 100644
--- a/hw/ppc/pnv_i2c.c
+++ b/hw/ppc/pnv_i2c.c
@@ -463,6 +463,23 @@ static uint64_t pnv_i2c_xscom_read(void *opaque, hwaddr addr,
 return val;
 }
 
+static void pnv_i2c_reset(void *dev)
+{
+PnvI2C *i2c = PNV_I2C(dev);
+
+memset(i2c->regs, 0, sizeof(i2c->regs));
+
+i2c->regs[I2C_STAT_REG] =
+SETFIELD(I2C_STAT_UPPER_THRS, 0ull, i2c->num_busses - 1) |
+I2C_STAT_CMD_COMP | I2C_STAT_SCL_INPUT_LEVEL |
+I2C_STAT_SDA_INPUT_LEVEL;
+i2c->regs[I2C_EXTD_STAT_REG] =
+SETFIELD(I2C_EXTD_STAT_FIFO_SIZE, 0ull, PNV_I2C_FIFO_SIZE) |
+SETFIELD(I2C_EXTD_STAT_I2C_VERSION, 0ull, 23); /* last version */
+
+fifo8_reset(&i2c->fifo);
+}
+
 static void pnv_i2c_xscom_write(void *opaque, hwaddr addr,
 uint64_t val, unsigned size)
 {
@@ -500,16 +517,7 @@ static void pnv_i2c_xscom_write(void *opaque, hwaddr addr,
 break;
 
 case I2C_RESET_I2C_REG:
-i2c->regs[I2C_MODE_REG] = 0;
-i2c->regs[I2C_CMD_REG] = 0;
-i2c->regs[I2C_WATERMARK_REG] = 0;
-i2c->regs[I2C_INTR_MASK_REG] = 0;
-i2c->regs[I2C_INTR_COND_REG] = 0;
-i2c->regs[I2C_INTR_RAW_COND_REG] = 0;
-i2c->regs[I2C_STAT_REG] = 0;
-i2c->regs[I2C_RESIDUAL_LEN_REG] = 0;
-i2c->regs[I2C_EXTD_STAT_REG] &=
-(I2C_EXTD_STAT_FIFO_SIZE | I2C_EXTD_STAT_I2C_VERSION);
+pnv_i2c_reset(i2c);
 break;
 
 case I2C_RESET_ERRORS:
@@ -621,20 +629,6 @@ static int pnv_i2c_dt_xscom(PnvXScomInterface *dev, void *fdt,
 return 0;
 }
 
-static void pnv_i2c_reset(void *dev)
-{
-PnvI2C *i2c = PNV_I2C(dev);
-
-memset(i2c->regs, 0, sizeof(i2c->regs));
-
-i2c->regs[I2C_STAT_REG] = I2C_STAT_CMD_COMP;
-i2c->regs[I2C_EXTD_STAT_REG] =
-SETFIELD(I2C_EXTD_STAT_FIFO_SIZE, 0ull, PNV_I2C_FIFO_SIZE) |
-SETFIELD(I2C_EXTD_STAT_I2C_VERSION, 0ull, 23); /* last version */
-
-fifo8_reset(&i2c->fifo);
-}
-
 static void pnv_i2c_realize(DeviceState *dev, Error **errp)
 {
 PnvI2C *i2c = PNV_I2C(dev);
-- 
2.42.0

[PULL 3/4] ppc/pnv: PNV I2C engines assigned incorrect XSCOM addresses

2023-11-21 Thread Cédric Le Goater

From: Glenn Miles 

The PNV I2C engines for power9 and power10 were being assigned a base
XSCOM address that was off by one I2C engine's address range such
that engine 0 had engine 1's address and so on.  The xscom address
assignment was being based on the device tree engine numbering, which
starts at 1.  Rather than changing the device tree numbering to start
with 0, the addressing was changed to be based on the existing device
tree numbers minus one.

Fixes: 1ceda19c28a1 ("ppc/pnv: Connect PNV I2C controller to powernv10")
Signed-off-by: Glenn Miles 
Signed-off-by: Cédric Le Goater 
---
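Note: the off-by-one comes from mixing 1-based engine numbers with a 0-based
address layout. The arithmetic is sketched below. I2CM_BASE and I2CM_SIZE
are made-up values used only for this example and are not the real
PNV9_XSCOM_I2CM_* constants; i2c_engine_pcba() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration only. */
#define I2CM_BASE 0xa0000u
#define I2CM_SIZE 0x1000u

/* Engines are numbered from 1, so engine 1 sits at the base address. */
static uint32_t i2c_engine_pcba(uint32_t engine)
{
    assert(engine >= 1);
    return I2CM_BASE + (engine - 1) * I2CM_SIZE;
}
```

Without the "- 1", engine 1 would land on the range that belongs to the next
engine, which is the symptom the commit message describes.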
 hw/ppc/pnv.c | 6 --
 hw/ppc/pnv_i2c.c | 2 +-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 9c2972733784..0297871bdd5d 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -1623,7 +1623,8 @@ static void pnv_chip_power9_realize(DeviceState *dev, Error **errp)
 return;
 }
 pnv_xscom_add_subregion(chip, PNV9_XSCOM_I2CM_BASE +
-   chip9->i2c[i].engine * PNV9_XSCOM_I2CM_SIZE,
+(chip9->i2c[i].engine - 1) *
+PNV9_XSCOM_I2CM_SIZE,
 &chip9->i2c[i].xscom_regs);
 qdev_connect_gpio_out(DEVICE(&chip9->i2c[i]), 0,
   qdev_get_gpio_in(DEVICE(&chip9->psi),
@@ -1871,7 +1872,8 @@ static void pnv_chip_power10_realize(DeviceState *dev, 
Error **errp)
 return;
 }
 pnv_xscom_add_subregion(chip, PNV10_XSCOM_I2CM_BASE +
-chip10->i2c[i].engine * PNV10_XSCOM_I2CM_SIZE,
+(chip10->i2c[i].engine - 1) *
+PNV10_XSCOM_I2CM_SIZE,
 &chip10->i2c[i].xscom_regs);
 qdev_connect_gpio_out(DEVICE(&chip10->i2c[i]), 0,
   qdev_get_gpio_in(DEVICE(&chip10->psi),
diff --git a/hw/ppc/pnv_i2c.c b/hw/ppc/pnv_i2c.c
index 483d91d15a77..384442f898bf 100644
--- a/hw/ppc/pnv_i2c.c
+++ b/hw/ppc/pnv_i2c.c
@@ -594,7 +594,7 @@ static int pnv_i2c_dt_xscom(PnvXScomInterface *dev, void 
*fdt,
 int i2c_offset;
 const char i2c_compat[] = "ibm,power8-i2cm\0ibm,power9-i2cm";
 uint32_t i2c_pcba = PNV9_XSCOM_I2CM_BASE +
-i2c->engine * PNV9_XSCOM_I2CM_SIZE;
+(i2c->engine - 1) * PNV9_XSCOM_I2CM_SIZE;
 uint32_t reg[2] = {
 cpu_to_be32(i2c_pcba),
 cpu_to_be32(PNV9_XSCOM_I2CM_SIZE)
-- 
2.42.0
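
As a minimal sketch of the off-by-one described in the patch above: the
device tree engine numbers start at 1, so the base address has to be
computed from (engine - 1). The base and size values below are
placeholders picked for illustration, not the real PNV9_XSCOM_I2CM_BASE
and PNV9_XSCOM_I2CM_SIZE values, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values for illustration only; not the real XSCOM constants. */
#define EXAMPLE_I2CM_BASE 0xa0000u
#define EXAMPLE_I2CM_SIZE 0x1000u

/*
 * Old behaviour: the device tree engine number (starting at 1) is used
 * directly, so engine 1 lands one full address range above the base.
 */
static uint32_t i2c_pcba_old(uint32_t engine)
{
    return EXAMPLE_I2CM_BASE + engine * EXAMPLE_I2CM_SIZE;
}

/* Fixed behaviour: subtract one so that engine 1 maps to the base. */
static uint32_t i2c_pcba_fixed(uint32_t engine)
{
    assert(engine >= 1); /* device tree numbering starts at 1 */
    return EXAMPLE_I2CM_BASE + (engine - 1) * EXAMPLE_I2CM_SIZE;
}
```

With these helpers, i2c_pcba_old(1) equals i2c_pcba_fixed(2), which is
the "engine N gets engine N+1's address" symptom from the commit message.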

Re: [PATCH 3/3] tests/avocado: Enable reverse_debugging.py tests in gitlab CI

2023-11-21 Thread Thomas Huth


On 17/11/2023 08.35, Nicholas Piggin wrote:
> On Fri Nov 17, 2023 at 4:11 AM AEST, Thomas Huth wrote:
> > On 16/11/2023 12.53, Nicholas Piggin wrote:
> > > Let's try enabling reverse_debugging.py in gitlab CI.
> > >
> > > Signed-off-by: Nicholas Piggin
> > > ---
> > > Maybe we could try this again at some point? The bug might have been
> > > noticed sooner.
> > >
> > > They only take a couple of seconds to run so should not take too much
> > > overhead. But my gitlab CI pipeline doesn't run the avocado tests for
> > > some reason, so I can't see if it's still causing problems.
> > >
> > > Thanks,
> > > Nick
> > > ---
> > >
> > >    tests/avocado/reverse_debugging.py | 7 ---
> > >    1 file changed, 7 deletions(-)
> >
> > FYI, I gave it a try, and it survived my CI run:
> >
> >    https://gitlab.com/thuth/qemu/-/jobs/5552213972#L403
> >
> > So I went ahead and put it (together with the first patch) in my current
> > pull request, let's see how it goes...
>
> Great, thank you.


... and here it's failing again (current master branch):

https://gitlab.com/thuth/qemu/-/jobs/5582657378#L404

According to the debug.log in the artifacts, it's failing here:

08:28:32 DEBUG| [0.230392217,5] OPAL v7.0 starting...

08:28:32 DEBUG| [0.230674939,7] initial console log level: memory 7, driver 5

08:28:32 DEBUG| [0.231048494,6] CPU: P9 generation processor (max 4 threads/core)

08:28:32 DEBUG| [
08:28:32 DEBUG| [0.231412547,7] CPU: Boot CPU PIR is 0x PVR is 0x004e1202

08:28:32 DEBUG| [
08:28:32 ERROR|
08:28:32 ERROR| Reproduced traceback from: /builds/thuth/qemu/build/pyvenv/lib64/python3.8/site-packages/avocado/core/test.py:770
08:28:32 ERROR| Traceback (most recent call last):
08:28:32 ERROR|   File "/builds/thuth/qemu/build/tests/avocado/reverse_debugging.py", line 262, in test_ppc64_powernv
08:28:32 ERROR|     self.reverse_debugging()
08:28:32 ERROR|   File "/builds/thuth/qemu/build/tests/avocado/reverse_debugging.py", line 178, in reverse_debugging
08:28:32 ERROR|     g.cmd(b'c')
08:28:32 ERROR|   File "/builds/thuth/qemu/build/pyvenv/lib64/python3.8/site-packages/avocado/utils/gdb.py", line 783, in cmd
08:28:32 ERROR|     response_payload = self.decode(result)
08:28:32 ERROR|   File "/builds/thuth/qemu/build/pyvenv/lib64/python3.8/site-packages/avocado/utils/gdb.py", line 738, in decode
08:28:32 ERROR|     raise InvalidPacketError
08:28:32 ERROR| avocado.utils.gdb.InvalidPacketError
08:28:32 ERROR|
08:28:32 DEBUG| Local variables:
08:28:32 DEBUG|  -> self : 79-tests/avocado/reverse_debugging.py:ReverseDebugging_ppc64.test_ppc64_powernv
08:28:32 DEBUG| Shutting down VM appliance; timeout=30
08:28:32 DEBUG| Attempting graceful termination
08:28:32 DEBUG| Closing console socket
08:28:32 DEBUG| Politely asking QEMU to terminate

So unless someone has a clue how to fix that, I guess it's
likely best to revert this enablement patch again...

 Thomas

Re: [PATCH V7 8/8] docs/specs/acpi_hw_reduced_hotplug: Add the CPU Hotplug Event Bit

2023-11-21 Thread Shaoqin Huang





On 11/14/23 04:12, Salil Mehta via wrote:
> GED interface is used by many hotplug events like memory hotplug, NVDIMM hotplug
> and non-hotplug events like system power down event. Each of these can be
> selected using a bit in the 32 bit GED IO interface. A bit has been reserved for
> the CPU hotplug event.
>
> Signed-off-by: Salil Mehta

Reviewed-by: Shaoqin Huang

> ---
>   docs/specs/acpi_hw_reduced_hotplug.rst | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/docs/specs/acpi_hw_reduced_hotplug.rst b/docs/specs/acpi_hw_reduced_hotplug.rst
> index 0bd3f9399f..3acd6fcd8b 100644
> --- a/docs/specs/acpi_hw_reduced_hotplug.rst
> +++ b/docs/specs/acpi_hw_reduced_hotplug.rst
> @@ -64,7 +64,8 @@ GED IO interface (4 byte access)
>  0: Memory hotplug event
>  1: System power down event
>  2: NVDIMM hotplug event
> -3-31: Reserved
> +   3: CPU hotplug event
> +4-31: Reserved
>
>   **write_access:**
>


--
Shaoqin
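
For illustration, the selector bit layout from the table in the patch
above could be decoded roughly like this. This is only a sketch: the
EX_GED_* names and the helper functions are hypothetical and are not
taken from the QEMU sources; only the bit positions come from the
documentation change.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the table in the patch. */
#define EX_GED_MEM_HOTPLUG_EVT    (1u << 0)
#define EX_GED_PWR_DOWN_EVT       (1u << 1)
#define EX_GED_NVDIMM_HOTPLUG_EVT (1u << 2)
#define EX_GED_CPU_HOTPLUG_EVT    (1u << 3) /* newly assigned bit */
#define EX_GED_RESERVED_MASK      0xfffffff0u /* bits 4-31 */

/* True if the 32-bit selector value reports a CPU hotplug event. */
static bool ex_ged_cpu_hotplug_pending(uint32_t sel)
{
    return (sel & EX_GED_CPU_HOTPLUG_EVT) != 0;
}

/* True if no reserved bit is set in the selector value. */
static bool ex_ged_sel_valid(uint32_t sel)
{
    return (sel & EX_GED_RESERVED_MASK) == 0;
}
```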

Re: [PATCH v3 0/4] ide: implement simple legacy/native mode switching for PCI IDE controllers

2023-11-21 Thread Kevin Wolf

Am 20.11.2023 um 16:02 hat BALATON Zoltan geschrieben:
> On Mon, 20 Nov 2023, Mark Cave-Ayland wrote:
> > On 20/11/2023 13:42, Kevin Wolf wrote:
> > > Am 20.11.2023 um 14:09 hat BALATON Zoltan geschrieben:
> > > > On Mon, 20 Nov 2023, Mark Cave-Ayland wrote:
> > > > > On 19/11/2023 21:43, BALATON Zoltan wrote:
> > > > > > On Thu, 16 Nov 2023, Mark Cave-Ayland wrote:
> > > > > > > This series adds a simple implementation of legacy/native mode
> > > > > > > switching for PCI
> > > > > > > IDE controllers and updates the via-ide device to use it.
> > > > > > > 
> > > > > > > The approach I take here is to add a new pci_ide_update_mode()
> > > > > > > function which handles
> > > > > > > management of the PCI BARs and legacy IDE ioports for each mode
> > > > > > > to avoid exposing
> > > > > > > details of the internal logic to individual PCI IDE controllers.
> > > > > > > 
> > > > > > > As noted in [1] this is extracted from a local WIP branch I have
> > > > > > > which contains
> > > > > > > further work in this area. However for the moment
> > > > > > > I've kept it simple (and
> > > > > > > restricted it to the via-ide device) which is good
> > > > > > > enough for Zoltan's PPC
> > > > > > > images whilst paving the way for future improvements after 8.2.
> > > > > > > 
> > > > > > > Signed-off-by: Mark Cave-Ayland 
> > > > > > > 
> > > > > > > [1] 
> > > > > > > https://lists.gnu.org/archive/html/qemu-devel/2023-10/msg05403.html
> > > > > > > 
> > > > > > > v3:
> > > > > > > - Rebase onto master
> > > > > > > - Move ide_portio_list[] and ide_portio_list2[] to IDE core to
> > > > > > > prevent duplication in
> > > > > > >   hw/ide/pci.c
> > > > > > > - Don't zero BARs when switching from native mode to legacy
> > > > > > > mode, instead always force
> > > > > > >   them to read zero as suggested in the PCI IDE specification
> > > > > > > (note: this also appears
> > > > > > >   to fix the fuloong2e machine booting from IDE)
> > > > > > 
> > > > > > Not sure you're getting this, see also:
> > > > > > https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg04167.html
> > > > > > but this seems to break latest version of the AmigaOS driver for
> > > > > > some reason. I assume this is the BAR zeroing that causes this as it
> > > > > > works with v2 series and nothing else changed in v3 that could cause
> > > > > > this. Testing was done by Rene Engel, cc'd so maybe he can add more
> > > > > > info. It seems to work with my patch that sets BARs to legacy values
> > > > > > and with v2 that sets them to 0 but not with v3 which should also
> > > > > > read 0 but maybe something is off here.
> > > > > 
> > > > > I've been AFK for a few days, so just starting to catch up on various
> > > > > bits and pieces.
> > > > 
> > > > OK just wasn't sure if you saw my emails at all as it happened before
> > > > that some spam filters disliked my mail server and put messages in the
> > > > spam folder.
> > > > 
> > > > > The only difference I can think of regarding the BAR zeroing is that
> > > > > the BMDMA BAR is zeroed here. Does the following diff fix things?
> > > > 
> > > > This helps, with this the latest driver does not crash but still reads
> > > > BAR4 as 0 instead of 0xcc00 so UDMA won't work but at least it boots.
> > > 
> > > And disabling only the first four BARs is actually what the spec says,
> > > too. So I'll make this change to the queued patches.
> > 
> > That was definitely something that I thought about: what should happen
> > to BARs outside of the ones mentioned in the PCI IDE controller
> > specification? It seems reasonable to me just to consider BARs 0-3 for
> > zeroing here.
> > 
> > > If I understand correctly, UDMA didn't work before this series either,
> > > so it's a separate goal and doing it in its own patch is best anyway.
> > > 
> > > As we don't seem to have a good place to set a default, maybe just
> > > overriding it in via_ide_cfg_read(), too, and making it return 0xcc01 in
> > > compatibility mode is enough?
> > 
> > It's difficult to know whether switching to legacy mode on the via-ide
> > device resets BAR4 to its default value, or whether it is simply left
> > unaltered. For 8.2 I don't mind too much as long as the logic is
> > separate from the BAR zeroing logic (which will eventually be lifted up
> > into hw/ide/pci.c).
> 
> My original patch checked for BAR being unset and only reset to default in
> that case so it won't clobber a value set by something (like pegasos2
> firmware) but will set default for amigaone which does not program the BAR
> just uses the default legacy mode (which is the default on real chip but we
> have to make that happen on QEMU after reset). So setting it to default if
> it's unset when switching to legacy seems like a safe bet and testing with
> my patch did not find problem with that.

How about setting the default if it's unset on the first read after
reset instead of only on switching modes? I'd like to avoid that the

Re: [PATCH 3/3] tests/avocado: Enable reverse_debugging.py tests in gitlab CI

2023-11-21 Thread Daniel P . Berrangé

On Tue, Nov 21, 2023 at 09:56:24AM +0100, Thomas Huth wrote:
> On 17/11/2023 08.35, Nicholas Piggin wrote:
> > On Fri Nov 17, 2023 at 4:11 AM AEST, Thomas Huth wrote:
> > > On 16/11/2023 12.53, Nicholas Piggin wrote:
> > > > Let's try enabling reverse_debugging.py in gitlab CI.
> > > > 
> > > > Signed-off-by: Nicholas Piggin 
> > > > ---
> > > > Maybe we could try this again at some point? The bug might have been
> > > > noticed sooner.
> > > > 
> > > > They only take a couple of seconds to run so should not take too much
> > > > overhead. But my gitlab CI pipeline doesn't run the avocado tests for
> > > > some reason, so I can't see if it's still causing problems.
> > > > 
> > > > Thanks,
> > > > Nick
> > > > ---
> > > > 
> > > >tests/avocado/reverse_debugging.py | 7 ---
> > > >1 file changed, 7 deletions(-)
> > > 
> > > FYI, I gave it a try, and it survived my CI run:
> > > 
> > >https://gitlab.com/thuth/qemu/-/jobs/5552213972#L403
> > > 
> > > So I went ahead and put it (together with the first patch) in my current
> > > pull request, let's see how it goes...
> > 
> > Great, thank you.
> 
> ... and here it's failing again (current master branch):
> 
> https://gitlab.com/thuth/qemu/-/jobs/5582657378#L404
> 
> According to the debug.log in the artifacts, it's failing here:
> 
> 08:28:32 DEBUG| [0.230392217,5] OPAL v7.0 starting...
> 
> 08:28:32 DEBUG| [0.230674939,7] initial console log level: memory 7, driver 5
> 
> 08:28:32 DEBUG| [0.231048494,6] CPU: P9 generation processor (max 4 threads/core)
> 
> 08:28:32 DEBUG| [
> 08:28:32 DEBUG| [0.231412547,7] CPU: Boot CPU PIR is 0x PVR is 0x004e1202
> 
> 08:28:32 DEBUG| [
> 08:28:32 ERROR|
> 08:28:32 ERROR| Reproduced traceback from: /builds/thuth/qemu/build/pyvenv/lib64/python3.8/site-packages/avocado/core/test.py:770
> 08:28:32 ERROR| Traceback (most recent call last):
> 08:28:32 ERROR|   File "/builds/thuth/qemu/build/tests/avocado/reverse_debugging.py", line 262, in test_ppc64_powernv
> 08:28:32 ERROR|     self.reverse_debugging()
> 08:28:32 ERROR|   File "/builds/thuth/qemu/build/tests/avocado/reverse_debugging.py", line 178, in reverse_debugging
> 08:28:32 ERROR|     g.cmd(b'c')
> 08:28:32 ERROR|   File "/builds/thuth/qemu/build/pyvenv/lib64/python3.8/site-packages/avocado/utils/gdb.py", line 783, in cmd
> 08:28:32 ERROR|     response_payload = self.decode(result)
> 08:28:32 ERROR|   File "/builds/thuth/qemu/build/pyvenv/lib64/python3.8/site-packages/avocado/utils/gdb.py", line 738, in decode
> 08:28:32 ERROR|     raise InvalidPacketError
> 08:28:32 ERROR| avocado.utils.gdb.InvalidPacketError
> 08:28:32 ERROR|
> 08:28:32 DEBUG| Local variables:
> 08:28:32 DEBUG|  -> self : 79-tests/avocado/reverse_debugging.py:ReverseDebugging_ppc64.test_ppc64_powernv
> 08:28:32 DEBUG| Shutting down VM appliance; timeout=30
> 08:28:32 DEBUG| Attempting graceful termination
> 08:28:32 DEBUG| Closing console socket
> 08:28:32 DEBUG| Politely asking QEMU to terminate
> 
> So unless someone has a clue how to fix that, I guess it's
> likely best to revert this enablement patch again...

A little further in the log we see

08:28:32 DEBUG| Politely asking QEMU to terminate
08:28:32 DEBUG| --> {
  "execute": "quit"
}
08:28:32 DEBUG| <-- {
  "timestamp": {
"seconds": 1700555312,
"microseconds": 86122
  },
  "event": "RESUME"
}
08:28:32 ERROR| Task.Reader: BrokenPipeError: [Errno 32] Broken pipe



Given the bad packet from GDB and the Broken pipe from QMP,
my impression is that the QEMU process is no longer present; most
likely it has SEGV'd, I reckon.

IOW, I think we might well have a genuine bug here, not merely an
unreliable test suite.

Nonetheless, unless someone can guess what the problem is, we'll
need to disable the test to get reliable CI.

A bug should be opened though with the CI logs.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH-for-8.2?] hw/arm/fsl-imx: Do not ignore Error argument

2023-11-21 Thread Markus Armbruster

Philippe Mathieu-Daudé  writes:

> Both i.MX25 and i.MX6 SoC models ignore the Error argument when
> setting the PHY number. Pick &error_abort which is the error
> used by the i.MX7 SoC (see commit 1f7197deb0 "ability to change
> the FEC PHY on i.MX7 processor").
>
> Fixes: 74c1330582 ("ability to change the FEC PHY on i.MX25 processor")
> Fixes: a9c167a3c4 ("ability to change the FEC PHY on i.MX6 processor")
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/arm/fsl-imx25.c | 3 ++-
>  hw/arm/fsl-imx6.c  | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/hw/arm/fsl-imx25.c b/hw/arm/fsl-imx25.c
> index 24c4374590..9aabbf7f58 100644
> --- a/hw/arm/fsl-imx25.c
> +++ b/hw/arm/fsl-imx25.c
> @@ -169,7 +169,8 @@ static void fsl_imx25_realize(DeviceState *dev, Error 
> **errp)
>  epit_table[i].irq));
>  }
>  
> -object_property_set_uint(OBJECT(&s->fec), "phy-num", s->phy_num, &err);

This is actually worse than "ignore the Error argument".  If this fails,
we continue with @err not null.  If we actually reach the next use of
@err...

> +object_property_set_uint(OBJECT(&s->fec), "phy-num", s->phy_num,
> + &error_abort);
>  qdev_set_nic_properties(DEVICE(&s->fec), &nd_table[0]);
>  
>  if (!sysbus_realize(SYS_BUS_DEVICE(&s->fec), errp)) {
   return;
   }
   sysbus_mmio_map(SYS_BUS_DEVICE(&s->fec), 0, FSL_IMX25_FEC_ADDR);
   sysbus_connect_irq(SYS_BUS_DEVICE(&s->fec), 0,
  qdev_get_gpio_in(DEVICE(&s->avic), 
FSL_IMX25_FEC_IRQ));

   if (!sysbus_realize(SYS_BUS_DEVICE(&s->rngc), errp)) {
   return;
   }

   [...]

   /* initialize 2 x 16 KB ROM */

... here, we pass a non-null @err to memory_region_init_rom().  Any
error will trip error_setv()'s assertion.

   memory_region_init_rom(&s->rom[0], OBJECT(dev), "imx25.rom0",
  FSL_IMX25_ROM0_SIZE, &err);
   if (err) {
   error_propagate(errp, err);
   return;
   }

This is an instance of an anti-pattern: passing &err or errp without
checking for failure.  Three possible fixes:

1. Check for failure.

2. Pass &error_abort instead.  This is appropriate for programming
   errors.

3. Pass NULL instead.  This is appropriate when errors don't matter.
   Which is rare.

You go with 2., which looks good to me.
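
To make the anti-pattern and fix 1 concrete, here is a minimal,
self-contained sketch. It uses simplified stand-ins (ExError,
ex_error_set(), ex_step()) that are hypothetical and are NOT QEMU's
real Error API; they only mimic the rule that setting an error into an
already-set slot trips an assertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for illustration; not QEMU's real Error type. */
typedef struct ExError {
    const char *msg;
} ExError;

static ExError ex_fail = { "step failed" };

/*
 * Mimics the rule discussed above: storing an error into a slot that
 * already holds one is a programming error and trips an assertion.
 */
static void ex_error_set(ExError **errp, ExError *e)
{
    if (!errp) {
        return;            /* caller passed NULL: the error is dropped */
    }
    assert(*errp == NULL); /* a second failure would abort here */
    *errp = e;
}

/* A step that reports failure via its return value and via errp. */
static bool ex_step(bool ok, ExError **errp)
{
    if (!ok) {
        ex_error_set(errp, &ex_fail);
        return false;
    }
    return true;
}

/* Fix 1: check every call and stop at the first failure. */
static bool ex_realize_checked(bool ok1, bool ok2, ExError **errp)
{
    if (!ex_step(ok1, errp)) {
        return false;
    }
    return ex_step(ok2, errp);
}
```

Without the early return, a failure in both steps would reach
ex_error_set() with *errp already set and hit the assertion, which is
the failure mode described above.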

> diff --git a/hw/arm/fsl-imx6.c b/hw/arm/fsl-imx6.c
> index 4fa7f0b95e..7dc42cbfe6 100644
> --- a/hw/arm/fsl-imx6.c
> +++ b/hw/arm/fsl-imx6.c
> @@ -379,7 +379,8 @@ static void fsl_imx6_realize(DeviceState *dev, Error 
> **errp)
>  spi_table[i].irq));
>  }
>  
> -object_property_set_uint(OBJECT(&s->eth), "phy-num", s->phy_num, &err);
> +object_property_set_uint(OBJECT(&s->eth), "phy-num", s->phy_num,
> + &error_abort);
>  qdev_set_nic_properties(DEVICE(&s->eth), &nd_table[0]);
>  if (!sysbus_realize(SYS_BUS_DEVICE(&s->eth), errp)) {
>  return;

Same.

Suggest to clarify the commit message like this:

  Both i.MX25 and i.MX6 SoC pass &err to object_property_set_uint()
  without checking for failure.  Running into another failure will trip
  error_setv()'s assertion.

  Pass &error_abort instead, like the i.MX7 SoC does (see commit
  1f7197deb0 "ability to change the FEC PHY on i.MX7 processor").

With something like that:
Reviewed-by: Markus Armbruster

Re: [PATCH for-8.2 2/3] ui: use "vc" chardev for dbus, gtk & spice-app

2023-11-21 Thread Thomas Huth


On 17/11/2023 15.35, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau
>
> Those displays have their own implementation of "vc" chardev, which
> doesn't use pixman. They also don't implement the width/height/cols/rows
> options, so qemu_display_get_vc() should return a compatible argument.
>
> This patch was meant to be with the pixman series, when the "vc" field
> was introduced. It fixes a regression where VCs are created on the
> tty (or null) instead of the display's own "vc" implementation.
>
> Signed-off-by: Marc-André Lureau
> ---
>   ui/dbus.c  | 1 +
>   ui/gtk.c   | 1 +
>   ui/spice-app.c | 1 +
>   3 files changed, 3 insertions(+)


FWIW,
Acked-by: Thomas Huth

Re: [PATCH for-8.2 3/3] ui/console: fix default VC when there are no display

2023-11-21 Thread Thomas Huth


On 17/11/2023 15.35, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau
>
> When display is "none", we may still have remote displays (I think it
> would be simpler if VNC/Spice were regular displays btw). Return the
> default VC then, and set them up, to fix a regression where a remote
> display used the TTY instead.
>
> Fixes: commit 1bec1cc0d ("ui/console: allow to override the default VC")
> Reported-by: German Maglione
> Signed-off-by: Marc-André Lureau
> ---
>   ui/console.c | 18 --
>   1 file changed, 8 insertions(+), 10 deletions(-)


Acked-by: Thomas Huth

[PATCH for-8.2] ui/pixman-minimal.h: fix empty allocation

2023-11-21 Thread Manos Pitsidianakis

In the minimal pixman API stub that is used when the real pixman
dependency is missing, a NULL dereference happens when
virtio-gpu-rutabaga allocates a pixman image with bits = NULL and
rowstride_bytes = zero. The stub requests a buffer of
rowstride_bytes * height = 0 bytes, for which g_malloc0() returns NULL.
Real pixman, however, calculates a new stride value in that scenario,
based on the given width, height and format size.

This commit adds a helper function that performs the same logic as
pixman.

Signed-off-by: Manos Pitsidianakis 
---
 include/ui/pixman-minimal.h | 48 +++--
 1 file changed, 46 insertions(+), 2 deletions(-)

diff --git a/include/ui/pixman-minimal.h b/include/ui/pixman-minimal.h
index efcf570c9e..6dd7de1c7e 100644
--- a/include/ui/pixman-minimal.h
+++ b/include/ui/pixman-minimal.h
@@ -113,6 +113,45 @@ typedef struct pixman_color {
 uint16_t alpha;
 } pixman_color_t;
 
+static inline uint32_t *create_bits(pixman_format_code_t format,
+int width,
+int height,
+int *rowstride_bytes)
+{
+int stride = 0;
+size_t buf_size = 0;
+int bpp = PIXMAN_FORMAT_BPP(format);
+
+/*
+ * Calculate the following while checking for overflow truncation:
+ * stride = ((width * bpp + 0x1f) >> 5) * sizeof(uint32_t);
+ */
+
+if (unlikely(__builtin_mul_overflow(width, bpp, &stride))) {
+return NULL;
+}
+
+if (unlikely(__builtin_add_overflow(stride, 0x1f, &stride))) {
+return NULL;
+}
+
+stride >>= 5;
+
+stride *= sizeof(uint32_t);
+
+if (unlikely(__builtin_mul_overflow((size_t) height,
+(size_t) stride,
+&buf_size))) {
+return NULL;
+}
+
+if (rowstride_bytes) {
+*rowstride_bytes = stride;
+}
+
+return g_malloc0(buf_size);
+}
+
 static inline pixman_image_t *pixman_image_create_bits(pixman_format_code_t 
format,
int width,
int height,
@@ -123,13 +162,18 @@ static inline pixman_image_t 
*pixman_image_create_bits(pixman_format_code_t form
 
 i->width = width;
 i->height = height;
-i->stride = rowstride_bytes ?: width * 
DIV_ROUND_UP(PIXMAN_FORMAT_BPP(format), 8);
 i->format = format;
 if (bits) {
 i->data = bits;
 } else {
-i->free_me = i->data = g_malloc0(rowstride_bytes * height);
+i->free_me = i->data =
+create_bits(format, width, height, &rowstride_bytes);
+if (width && height) {
+assert(i->data);
+}
 }
+i->stride = rowstride_bytes ? rowstride_bytes :
+width * DIV_ROUND_UP(PIXMAN_FORMAT_BPP(format), 8);
 i->ref_count = 1;
 
 return i;

base-commit: af9264da80073435fd78944bc5a46e695897d7e5
-- 
2.39.2
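
As a standalone sketch of the stride arithmetic used by create_bits()
in the patch above (stride = ((width * bpp + 0x1f) >> 5) *
sizeof(uint32_t), then buf_size = stride * height), without any pixman
or GLib dependency. The function name is hypothetical, the rejection of
negative inputs is an extra assumption not present in the patch, and
the overflow builtins require GCC or Clang.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the row stride in bytes (rows padded to 32-bit words) and the
 * total buffer size. Returns false on overflow or negative input.
 */
static bool ex_stride_and_size(int width, int height, int bpp,
                               int *stride_out, size_t *size_out)
{
    int bits;
    int stride;
    size_t size;

    /* Extra guard for this sketch; not part of the patch. */
    if (width < 0 || height < 0 || bpp < 0) {
        return false;
    }

    /* bits = width * bpp + 0x1f, checked for int overflow */
    if (__builtin_mul_overflow(width, bpp, &bits) ||
        __builtin_add_overflow(bits, 0x1f, &bits)) {
        return false;
    }

    /* Round up to whole 32-bit words, then convert words to bytes. */
    stride = (bits >> 5) * (int)sizeof(uint32_t);

    if (__builtin_mul_overflow((size_t)height, (size_t)stride, &size)) {
        return false;
    }

    *stride_out = stride;
    *size_out = size;
    return true;
}
```

For example, a 640x480 image at 32 bits per pixel gives a stride of
2560 bytes and a buffer of 1228800 bytes, while a 3-pixel-wide row at
24 bits per pixel is padded from 9 to 12 bytes.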

Re: [PATCH 2/3] chardev: report blocked write to chardev backend

2023-11-21 Thread Marc-André Lureau

Hi

On Mon, Nov 20, 2023 at 5:36 PM Nicholas Piggin  wrote:
>
> On Mon Nov 20, 2023 at 10:06 PM AEST, Marc-André Lureau wrote:
> > Hi
> >
> > On Thu, Nov 16, 2023 at 3:54 PM Nicholas Piggin  wrote:
> > >
> > > If a chardev socket is not read, it will eventually fill and QEMU
> > > can block attempting to write to it. A difficult bug in avocado
> > > tests where the console socket was not being read from caused this
> > > hang.
> > >
> > > Warn if a chardev write is blocked for 100ms.
> > >
> > > Signed-off-by: Nicholas Piggin 
> > > ---
> > > This is not necessary for the fix but it does trigger in the
> > > failing avocado test without the previous patch applied. Maybe
> > > it would be helpful?
> > >
> > > Thanks,
> > > Nick
> > >
> > >  chardev/char.c | 6 ++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/chardev/char.c b/chardev/char.c
> > > index 996a024c7a..7c375e3cc4 100644
> > > --- a/chardev/char.c
> > > +++ b/chardev/char.c
> > > @@ -114,6 +114,8 @@ static int qemu_chr_write_buffer(Chardev *s,
> > >  {
> > >  ChardevClass *cc = CHARDEV_GET_CLASS(s);
> > >  int res = 0;
> > > +int nr_retries = 0;
> > > +
> > >  *offset = 0;
> > >
> > >  qemu_mutex_lock(&s->chr_write_lock);
> > > @@ -126,6 +128,10 @@ static int qemu_chr_write_buffer(Chardev *s,
> > >  } else {
> > >  g_usleep(100);
> > >  }
> > > +if (++nr_retries == 1000) { /* 100ms */
> > > +warn_report("Chardev '%s' write blocked for > 100ms, "
> > > +"socket buffer full?", s->label);
> > > +}
> >
> > That shouldn't happen, the frontend should poll and only write when it
> > can. What is the qemu command being used here?
>
> You can follow it through the thread here
>
> https://lore.kernel.org/qemu-devel/zvt-by9yor69q...@redhat.com/
>
> In short, a console device is attached to a socket pair and nothing
> ever reads from it. It eventually fills, and writing to it fails
> indefinitely here.
>
> It can be reproduced with:
>
> > make check-avocado AVOCADO_TESTS=tests/avocado/reverse_debugging.py:test_ppc64_pseries
>
>

How reliably? I tried 10/10.

> > I think this change can be worthwhile for debugging though.
> >
> > Reviewed-by: Marc-André Lureau 
>
> Thanks,
> Nick
>
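
For reference, the threshold arithmetic in the quoted hunk (1000
retries of g_usleep(100), i.e. at least 100ms spent sleeping) can be
sketched in isolation as below. The ExRetryState type and the helper
are hypothetical; the sketch only counts retries and does not sleep or
touch any chardev.

```c
#include <stdbool.h>

#define EX_RETRY_SLEEP_US 100    /* matches g_usleep(100) in the patch */
#define EX_WARN_AFTER_US  100000 /* 100ms */
#define EX_WARN_RETRIES   (EX_WARN_AFTER_US / EX_RETRY_SLEEP_US)

/* Hypothetical per-write bookkeeping. */
typedef struct ExRetryState {
    int nr_retries; /* failed (would-block) attempts so far */
    int warnings;   /* how many times a warning was issued */
} ExRetryState;

/*
 * Call once per would-block retry. Returns true exactly once, on the
 * retry that reaches the threshold, so the warning is not repeated
 * while the write stays blocked.
 */
static bool ex_note_retry(ExRetryState *st)
{
    if (++st->nr_retries == EX_WARN_RETRIES) {
        st->warnings++;
        return true;
    }
    return false;
}
```

Comparing with == rather than >= is what keeps the warning to a single
line per blocked write, as in the patch.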

Re: [PATCH 3/3] tests/avocado: Enable reverse_debugging.py tests in gitlab CI

2023-11-21 Thread Thomas Huth


On 21/11/2023 10.14, Daniel P. Berrangé wrote:
> On Tue, Nov 21, 2023 at 09:56:24AM +0100, Thomas Huth wrote:
> > On 17/11/2023 08.35, Nicholas Piggin wrote:
> > > On Fri Nov 17, 2023 at 4:11 AM AEST, Thomas Huth wrote:
> > > > On 16/11/2023 12.53, Nicholas Piggin wrote:
> > > > > Let's try enabling reverse_debugging.py in gitlab CI.
> > > > >
> > > > > Signed-off-by: Nicholas Piggin
> > > > > ---
> > > > > Maybe we could try this again at some point? The bug might have been
> > > > > noticed sooner.
> > > > >
> > > > > They only take a couple of seconds to run so should not take too much
> > > > > overhead. But my gitlab CI pipeline doesn't run the avocado tests for
> > > > > some reason, so I can't see if it's still causing problems.
> > > > >
> > > > > ...
> > > >
> > > > FYI, I gave it a try, and it survived my CI run:
> > > >
> > > > https://gitlab.com/thuth/qemu/-/jobs/5552213972#L403
> > > >
> > > > So I went ahead and put it (together with the first patch) in my current
> > > > pull request, let's see how it goes...
> > >
> > > Great, thank you.
> >
> > ... and here it's failing again (current master branch):
> >
> > https://gitlab.com/thuth/qemu/-/jobs/5582657378#L404
> >
> > According to the debug.log in the artifacts, it's failing here:
> >
> > ...
>
> 08:28:32 ERROR| Task.Reader: BrokenPipeError: [Errno 32] Broken pipe
>
> Given the bad packet from GDB and the Broken pipe from QMP,
> my impression is that the QEMU process is no longer present; most
> likely it has SEGV'd, I reckon.
>
> IOW, I think we might well have a genuine bug here, not merely an
> unreliable test suite.
>
> Nonetheless, unless someone can guess what the problem is, we'll
> need to disable the test to get reliable CI.

I'll send a patch to revert the commit.

> A bug should be opened though with the CI logs.

Done: https://gitlab.com/qemu-project/qemu/-/issues/1992

 Thomas

Re: [PATCH 2/3] chardev: report blocked write to chardev backend

2023-11-21 Thread Daniel P . Berrangé

On Tue, Nov 21, 2023 at 01:39:03PM +0400, Marc-André Lureau wrote:
> Hi
> 
> On Mon, Nov 20, 2023 at 5:36 PM Nicholas Piggin  wrote:
> >
> > On Mon Nov 20, 2023 at 10:06 PM AEST, Marc-André Lureau wrote:
> > > Hi
> > >
> > > On Thu, Nov 16, 2023 at 3:54 PM Nicholas Piggin  wrote:
> > > >
> > > > If a chardev socket is not read, it will eventually fill and QEMU
> > > > can block attempting to write to it. A difficult bug in avocado
> > > > tests where the console socket was not being read from caused this
> > > > hang.
> > > >
> > > > Warn if a chardev write is blocked for 100ms.
> > > >
> > > > Signed-off-by: Nicholas Piggin 
> > > > ---
> > > > This is not necessary for the fix but it does trigger in the
> > > > failing avocado test without the previous patch applied. Maybe
> > > > it would be helpful?
> > > >
> > > > Thanks,
> > > > Nick
> > > >
> > > >  chardev/char.c | 6 ++
> > > >  1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/chardev/char.c b/chardev/char.c
> > > > index 996a024c7a..7c375e3cc4 100644
> > > > --- a/chardev/char.c
> > > > +++ b/chardev/char.c
> > > > @@ -114,6 +114,8 @@ static int qemu_chr_write_buffer(Chardev *s,
> > > >  {
> > > >  ChardevClass *cc = CHARDEV_GET_CLASS(s);
> > > >  int res = 0;
> > > > +int nr_retries = 0;
> > > > +
> > > >  *offset = 0;
> > > >
> > > >  qemu_mutex_lock(&s->chr_write_lock);
> > > > @@ -126,6 +128,10 @@ static int qemu_chr_write_buffer(Chardev *s,
> > > >  } else {
> > > >  g_usleep(100);
> > > >  }
> > > > +if (++nr_retries == 1000) { /* 100ms */
> > > > +warn_report("Chardev '%s' write blocked for > 100ms, "
> > > > +"socket buffer full?", s->label);
> > > > +}
> > >
> > > That shouldn't happen, the frontend should poll and only write when it
> > > can. What is the qemu command being used here?
> >
> > You can follow it through the thread here
> >
> > https://lore.kernel.org/qemu-devel/zvt-by9yor69q...@redhat.com/
> >
> > In short, a console device is attached to a socket pair and nothing
> > ever reads from it. It eventually fills, and writing to it fails
> > indefinitely here.
> >
> > It can be reproduced with:
> >
> > > make check-avocado AVOCADO_TESTS=tests/avocado/reverse_debugging.py:test_ppc64_pseries
> >
> >
> 
> How reliably? I tried 10/10.

It reproduced 100% reliably, but note git master is fixed now, so to
test you'll need to revert cd43f00524070c0267613acc98a153dba0e398d9

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH 2/3] chardev: report blocked write to chardev backend

2023-11-21 Thread Thomas Huth


On 21/11/2023 10.39, Marc-André Lureau wrote:
> Hi
>
> On Mon, Nov 20, 2023 at 5:36 PM Nicholas Piggin  wrote:
> >
> > On Mon Nov 20, 2023 at 10:06 PM AEST, Marc-André Lureau wrote:
> > > Hi
> > >
> > > On Thu, Nov 16, 2023 at 3:54 PM Nicholas Piggin  wrote:
> > > >
> > > > If a chardev socket is not read, it will eventually fill and QEMU
> > > > can block attempting to write to it. A difficult bug in avocado
> > > > tests where the console socket was not being read from caused this
> > > > hang.
> > > >
> > > > Warn if a chardev write is blocked for 100ms.
> > > >
> > > > Signed-off-by: Nicholas Piggin
> > > > ---
> > > > This is not necessary for the fix but it does trigger in the
> > > > failing avocado test without the previous patch applied. Maybe
> > > > it would be helpful?
> > > >
> > > > Thanks,
> > > > Nick
> > > >
> > > >  chardev/char.c | 6 ++
> > > >  1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/chardev/char.c b/chardev/char.c
> > > > index 996a024c7a..7c375e3cc4 100644
> > > > --- a/chardev/char.c
> > > > +++ b/chardev/char.c
> > > > @@ -114,6 +114,8 @@ static int qemu_chr_write_buffer(Chardev *s,
> > > >  {
> > > >  ChardevClass *cc = CHARDEV_GET_CLASS(s);
> > > >  int res = 0;
> > > > +int nr_retries = 0;
> > > > +
> > > >  *offset = 0;
> > > >
> > > >  qemu_mutex_lock(&s->chr_write_lock);
> > > > @@ -126,6 +128,10 @@ static int qemu_chr_write_buffer(Chardev *s,
> > > >  } else {
> > > >  g_usleep(100);
> > > >  }
> > > > +if (++nr_retries == 1000) { /* 100ms */
> > > > +warn_report("Chardev '%s' write blocked for > 100ms, "
> > > > +"socket buffer full?", s->label);
> > > > +}
> > >
> > > That shouldn't happen, the frontend should poll and only write when it
> > > can. What is the qemu command being used here?
> >
> > You can follow it through the thread here
> >
> > https://lore.kernel.org/qemu-devel/zvt-by9yor69q...@redhat.com/
> >
> > In short, a console device is attached to a socket pair and nothing
> > ever reads from it. It eventually fills, and writing to it fails
> > indefinitely here.
> >
> > It can be reproduced with:
> >
> > make check-avocado AVOCADO_TESTS=tests/avocado/reverse_debugging.py:test_ppc64_pseries
>
> How reliably? I tried 10/10.

It used to fail for me every time I tried, but the fix was already
merged yesterday (commit cd43f00524070c026), so if you updated today,
you'll see the test passing again.


 Thomas

Re: [PATCH v3 0/4] ide: implement simple legacy/native mode switching for PCI IDE controllers

2023-11-21 Thread Kevin Wolf

Am 20.11.2023 um 16:11 hat BALATON Zoltan geschrieben:
> On Mon, 20 Nov 2023, Kevin Wolf wrote:
> > Am 20.11.2023 um 14:47 hat BALATON Zoltan geschrieben:
> > > On Mon, 20 Nov 2023, Kevin Wolf wrote:
> > > > Am 20.11.2023 um 14:09 hat BALATON Zoltan geschrieben:
> > > > > On Mon, 20 Nov 2023, Mark Cave-Ayland wrote:
> > > > > > The only difference I can think of regarding the BAR zeroing is
> > > > > > that the BMDMA BAR is zeroed here. Does the following diff fix
> > > > > > things?
> > > > > 
> > > > > This helps, with this the latest driver does not crash but still
> > > > > reads BAR4 as 0 instead of 0xcc00 so UDMA won't work but at least
> > > > > it boots.
> > > > 
> > > > And disabling only the first four BARs is actually what the spec says,
> > > > too. So I'll make this change to the queued patches.
> > > > 
> > > > If I understand correctly, UDMA didn't work before this series either,
> > > > so it's a separate goal and doing it in its own patch is best anyway.
> > > 
> > > UDMA works with my original series, did not work with earlier versions of
> > > this alternative from Mark but could be fixed up on top unless Mark can
> > > send a v4 now.
> > > 
> > > > As we don't seem to have a good place to set a default, maybe just
> > > > overriding it in via_ide_cfg_read(), too, and making it return 0xcc01 in
> > > > compatibility mode is enough?
> > > 
> > > I could give that a try and see if that helps but all this
> > > via_ide_cfg_read() seems like an unnecessary complication to me. Why can't
> > > we just set the BARs (0 for BAR1-3 and default for BAR4) then we don't
> > > need to override config read?
> > 
> > I would be fine with setting 0xcc00 as the default value for BAR 4, but
> > as you said yourself, we can't do that in reset because it will be
> > overwritten by the PCI core code. Where else could we meaningfully do
> > that? As far as I understand, we don't have any hint that the
> > native/compatibility mode switch resets it on real hardware, so I'm
> > hesitant to do it there (and if the guest OS doesn't even switch, it
> > would never get set).
> 
> Luckily machines which need legacy mode also seem to set it explicitly
> on startup so we can set the defaults there.

With machines, you mean the QEMU board code? Where does this happen? Or
just what guests usually do?

> The check to see if something changed the BARs before is enough to
> avoid breaking it when legacy mode is set after native mode which does
> not seem to reset BARs according to how pegasos2 Linux behaves that
> sets legacy mode after firmware set native and programmed BARs but then
> keeps using BAR addresses. The AmigaOne I think just uses the default
> values with setting legacy mode doing nothing as that's the default
> but we can detect this as setting legacy mode with BARs unset so
> that's a good place to set default values which is what my patch did
> and I added a lot of comments trying to explain this.

I guess it's a question of your philosophy if you focus on just making a
specific set of OSes work, or if you focus on trying to do what hardware
actually does. Of course, figuring out what hardware actually does
proved a bit harder than I would have hoped in this case.

> > As for BAR 0-3, didn't we conclude that the via device still accepts I/O
> > to the configured addresses even though they read as zeros? Having
> > inconsistent config space and PCIIORegion seems like a bad idea, the
> > next call to pci_update_mappings() would break it.
> 
> I don't quite get this but then we could also just leave BARs alone
> and it would still work.

Didn't we have config space dumps from real hardware that showed that
this isn't what it does? Otherwise, this would be simpler indeed.

Of course, the spec also mandates that the values in the first four BARs
are ignored in compatibility mode (i.e. we would have to unregister the
memory regions), but we already figured that the via controller doesn't
do this. So that's something we would only have to solve if we want pci.c
code to be actually generic, but not for via.
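
To make the config-read override idea concrete, here is a minimal sketch, not QEMU code: a toy device model in which the stored BAR values (which is what the mappings would follow) stay untouched, and only the guest-visible read is adjusted in compatibility mode. All names (ToyIdeDev, toy_cfg_read32, TOY_BAR4_DEFAULT) are hypothetical, and returning 0xcc01 only while BAR 4 is still unprogrammed is an assumption, not something confirmed by hardware dumps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_CFG_SIZE     256
#define TOY_BAR(n)       (0x10 + 4 * (n))  /* standard PCI BAR offsets */
#define TOY_BAR4_DEFAULT 0xcc01u           /* value suggested in this thread */

/* Hypothetical toy device: raw config space plus a mode flag. */
typedef struct ToyIdeDev {
    uint8_t config[TOY_CFG_SIZE];
    bool legacy_mode;
} ToyIdeDev;

/* Read the stored little-endian dword; addr must leave room for 4 bytes. */
static uint32_t toy_raw_read32(const ToyIdeDev *d, unsigned addr)
{
    assert(addr + 4 <= TOY_CFG_SIZE);
    return (uint32_t)d->config[addr] |
           ((uint32_t)d->config[addr + 1] << 8) |
           ((uint32_t)d->config[addr + 2] << 16) |
           ((uint32_t)d->config[addr + 3] << 24);
}

/* Store a little-endian dword in the raw config space. */
static void toy_raw_write32(ToyIdeDev *d, unsigned addr, uint32_t val)
{
    assert(addr + 4 <= TOY_CFG_SIZE);
    d->config[addr] = val & 0xff;
    d->config[addr + 1] = (val >> 8) & 0xff;
    d->config[addr + 2] = (val >> 16) & 0xff;
    d->config[addr + 3] = (val >> 24) & 0xff;
}

/*
 * Guest-visible read. In legacy mode BARs 0-3 read as zero and an
 * unprogrammed BAR 4 reads as the default (assumption), while the
 * stored values are left alone so they stay in sync with the mappings.
 */
static uint32_t toy_cfg_read32(const ToyIdeDev *d, unsigned addr)
{
    uint32_t val = toy_raw_read32(d, addr);

    if (!d->legacy_mode) {
        return val;
    }
    if (addr >= TOY_BAR(0) && addr <= TOY_BAR(3)) {
        return 0;
    }
    if (addr == TOY_BAR(4) && val == 0) {
        return TOY_BAR4_DEFAULT;
    }
    return val;
}
```

The point of the sketch is that only the read path lies to the guest; nothing that a later mapping update would consult is modified.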

> It probably does not matter what it reads back when the device is in
> legacy mode. What would call pci_update_mappings() if device is in
> legacy and if something switches it to native it will very likely also
> program BARs. I can't imagine what would want to turn on native mode
> without trying to use a PCI driver and program BARs.

For example, I see vmstate_info_pci_config -> get_pci_config_device ->
pci_update_mappings. Would live migration be enough to trigger this?

Maybe the relevant machines don't even support live migration (I haven't
checked), but it just doesn't feel robust to have inconsistent values in
data structures that are supposed to be in sync.

Kevin

Re: [PATCH for-8.2] ui/pixman-minimal.h: fix empty allocation

2023-11-21 Thread Marc-André Lureau

Hi Manos

On Tue, Nov 21, 2023 at 1:38 PM Manos Pitsidianakis
 wrote:
>
> In the minimal pixman API stub that is used when the real pixman
> dependency is missing a NULL dereference happens when
> virtio-gpu-rutabaga allocates a pixman image with bits = NULL and
> rowstride_bytes = zero. A buffer of rowstride_bytes * height is
> allocated which is NULL. However, in that scenario pixman calculates a
> new stride value based on given width, height and format size.
>
> This commit adds a helper function that performs the same logic as
> pixman.
>

Thanks a lot for investigating this and providing a solution!
Reviewed-by: Marc-André Lureau 

> Signed-off-by: Manos Pitsidianakis 
> ---
>  include/ui/pixman-minimal.h | 48 +++--
>  1 file changed, 46 insertions(+), 2 deletions(-)
>
> diff --git a/include/ui/pixman-minimal.h b/include/ui/pixman-minimal.h
> index efcf570c9e..6dd7de1c7e 100644
> --- a/include/ui/pixman-minimal.h
> +++ b/include/ui/pixman-minimal.h
> @@ -113,6 +113,45 @@ typedef struct pixman_color {
>  uint16_t alpha;
>  } pixman_color_t;
>
> +static inline uint32_t *create_bits(pixman_format_code_t format,
> +int width,
> +int height,
> +int *rowstride_bytes)
> +{
> +int stride = 0;
> +size_t buf_size = 0;
> +int bpp = PIXMAN_FORMAT_BPP(format);
> +
> +/*
> + * Calculate the following while checking for overflow truncation:
> + * stride = ((width * bpp + 0x1f) >> 5) * sizeof(uint32_t);
> + */
> +
> +if (unlikely(__builtin_mul_overflow(width, bpp, &stride))) {
> +return NULL;
> +}
> +
> +if (unlikely(__builtin_add_overflow(stride, 0x1f, &stride))) {
> +return NULL;
> +}
> +
> +stride >>= 5;
> +
> +stride *= sizeof(uint32_t);
> +
> +if (unlikely(__builtin_mul_overflow((size_t) height,
> +(size_t) stride,
> +&buf_size))) {
> +return NULL;
> +}
> +
> +if (rowstride_bytes) {
> +*rowstride_bytes = stride;
> +}
> +
> +return g_malloc0(buf_size);
> +}
> +
>  static inline pixman_image_t *pixman_image_create_bits(pixman_format_code_t format,
> int width,
> int height,
> @@ -123,13 +162,18 @@ static inline pixman_image_t *pixman_image_create_bits(pixman_format_code_t form
>
>  i->width = width;
>  i->height = height;
> -i->stride = rowstride_bytes ?: width * DIV_ROUND_UP(PIXMAN_FORMAT_BPP(format), 8);
>  i->format = format;
>  if (bits) {
>  i->data = bits;
>  } else {
> -i->free_me = i->data = g_malloc0(rowstride_bytes * height);
> +i->free_me = i->data =
> +create_bits(format, width, height, &rowstride_bytes);
> +if (width && height) {
> +assert(i->data);
> +}
>  }
> +i->stride = rowstride_bytes ? rowstride_bytes :
> +width * DIV_ROUND_UP(PIXMAN_FORMAT_BPP(format), 8);
>  i->ref_count = 1;
>
>  return i;
>
> base-commit: af9264da80073435fd78944bc5a46e695897d7e5
> --
> 2.39.2
>

Re: [PATCH v3 4/4] hw/riscv/virt: Add IOPMP support

2023-11-21 Thread Ethan Chen via

On Tue, Nov 21, 2023 at 03:22:18PM +1000, Alistair Francis wrote:
> On Tue, Nov 14, 2023 at 7:48 PM Ethan Chen via  wrote:
> >
> > - Add 'iopmp=on' option to enable an IOPMP device and a DMA device
> >  connected to the IOPMP device
> > - Add 'iopmp_cascade=on' option to enable IOPMP cascading.
> 
> Can we document these in docs/system/riscv/virt.rst?
> 
> Alistair

Sure. I will document these.

Thanks,
Ethan Chen

> 
> >
> > Signed-off-by: Ethan Chen 
> > ---
> >  hw/riscv/Kconfig|  2 ++
> >  hw/riscv/virt.c | 72 +++--
> >  include/hw/riscv/virt.h | 10 +-
> >  3 files changed, 81 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
> > index b6a5eb4452..c30a104aa4 100644
> > --- a/hw/riscv/Kconfig
> > +++ b/hw/riscv/Kconfig
> > @@ -45,6 +45,8 @@ config RISCV_VIRT
> >  select FW_CFG_DMA
> >  select PLATFORM_BUS
> >  select ACPI
> > +select ATCDMAC300
> > +select RISCV_IOPMP
> >
> >  config SHAKTI_C
> >  bool
> > diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> > index c7fc97e273..3e23ee3afc 100644
> > --- a/hw/riscv/virt.c
> > +++ b/hw/riscv/virt.c
> > @@ -53,6 +53,8 @@
> >  #include "hw/display/ramfb.h"
> >  #include "hw/acpi/aml-build.h"
> >  #include "qapi/qapi-visit-common.h"
> > +#include "hw/misc/riscv_iopmp.h"
> > +#include "hw/dma/atcdmac300.h"
> >
> >  /*
> >   * The virt machine physical address space used by some of the devices
> > @@ -97,6 +99,9 @@ static const MemMapEntry virt_memmap[] = {
> >  [VIRT_UART0] ={ 0x1000, 0x100 },
> >  [VIRT_VIRTIO] =   { 0x10001000,0x1000 },
> >  [VIRT_FW_CFG] =   { 0x1010,  0x18 },
> > +[VIRT_IOPMP] ={ 0x1020,  0x10 },
> > +[VIRT_IOPMP2] =   { 0x1030,  0x10 },
> > +[VIRT_DMAC] = { 0x1040,  0x10 },
> >  [VIRT_FLASH] ={ 0x2000, 0x400 },
> >  [VIRT_IMSIC_M] =  { 0x2400, VIRT_IMSIC_MAX_SIZE },
> >  [VIRT_IMSIC_S] =  { 0x2800, VIRT_IMSIC_MAX_SIZE },
> > @@ -1527,13 +1532,33 @@ static void virt_machine_init(MachineState *machine)
> >
> >  create_platform_bus(s, mmio_irqchip);
> >
> > -serial_mm_init(system_memory, memmap[VIRT_UART0].base,
> > -0, qdev_get_gpio_in(mmio_irqchip, UART0_IRQ), 399193,
> > +serial_mm_init(system_memory, memmap[VIRT_UART0].base + 0x20,
> > +0x2, qdev_get_gpio_in(mmio_irqchip, UART0_IRQ), 38400,
> >  serial_hd(0), DEVICE_LITTLE_ENDIAN);
> >
> >  sysbus_create_simple("goldfish_rtc", memmap[VIRT_RTC].base,
> >  qdev_get_gpio_in(mmio_irqchip, RTC_IRQ));
> >
> > +/* DMAC */
> > +DeviceState *dmac_dev = atcdmac300_create("atcdmac300",
> > +memmap[VIRT_DMAC].base, memmap[VIRT_DMAC].size,
> > +qdev_get_gpio_in(DEVICE(mmio_irqchip), DMAC_IRQ));
> > +
> > +if (s->have_iopmp) {
> > +/* IOPMP */
> > +DeviceState *iopmp_dev = iopmp_create(memmap[VIRT_IOPMP].base,
> > +qdev_get_gpio_in(DEVICE(mmio_irqchip), IOPMP_IRQ));
> > +/* DMA with IOPMP */
> > +atcdmac300_connect_iopmp(dmac_dev, &(IOPMP(iopmp_dev)->iopmp_as),
> > +(StreamSink *)&(IOPMP(iopmp_dev)->transaction_info_sink), 0);
> > +if (s->have_iopmp_cascade) {
> > +DeviceState *iopmp_dev2 = iopmp_create(memmap[VIRT_IOPMP2].base,
> > +qdev_get_gpio_in(DEVICE(mmio_irqchip), IOPMP2_IRQ));
> > +cascade_iopmp(iopmp_dev, iopmp_dev2);
> > +}
> > +}
> > +
> > +
> >  for (i = 0; i < ARRAY_SIZE(s->flash); i++) {
> >  /* Map legacy -drive if=pflash to machine properties */
> >  pflash_cfi01_legacy_drive(s->flash[i],
> > @@ -1628,6 +1653,35 @@ static void virt_set_aclint(Object *obj, bool value, Error **errp)
> >  s->have_aclint = value;
> >  }
> >
> > +static bool virt_get_iopmp(Object *obj, Error **errp)
> > +{
> > +RISCVVirtState *s = RISCV_VIRT_MACHINE(obj);
> > +
> > +return s->have_iopmp;
> > +}
> > +
> > +static void virt_set_iopmp(Object *obj, bool value, Error **errp)
> > +{
> > +RISCVVirtState *s = RISCV_VIRT_MACHINE(obj);
> > +
> > +s->have_iopmp = value;
> > +}
> > +
> > +static bool virt_get_iopmp_cascade(Object *obj, Error **errp)
> > +{
> > +RISCVVirtState *s = RISCV_VIRT_MACHINE(obj);
> > +
> > +return s->have_iopmp_cascade;
> > +}
> > +
> > +static void virt_set_iopmp_cascade(Object *obj, bool value, Error **errp)
> > +{
> > +RISCVVirtState *s = RISCV_VIRT_MACHINE(obj);
> > +
> > +s->have_iopmp_cascade = value;
> > +}
> > +
> > +
> >  bool virt_is_acpi_enabled(RISCVVirtState *s)
> >  {
> >  return s->acpi != ON_OFF_AUTO_OFF;
> > @@ -1730,6 +1784,20 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
> >NULL, NULL);
> >  object_class_property_set_description(oc, "acpi",
> >

[PULL 1/3] net: Provide MemReentrancyGuard * to qemu_new_nic()

2023-11-21 Thread Jason Wang

From: Akihiko Odaki 

Recently MemReentrancyGuard was added to DeviceState to record that the
device is engaging in I/O. The network device backend needs to update it
when delivering a packet to a device.

In preparation for such a change, add MemReentrancyGuard * as a
parameter of qemu_new_nic().

Signed-off-by: Akihiko Odaki 
Reviewed-by: Alexander Bulekov 
Signed-off-by: Jason Wang 
---
 hw/net/allwinner-sun8i-emac.c | 3 ++-
 hw/net/allwinner_emac.c   | 3 ++-
 hw/net/cadence_gem.c  | 3 ++-
 hw/net/dp8393x.c  | 3 ++-
 hw/net/e1000.c| 3 ++-
 hw/net/e1000e.c   | 2 +-
 hw/net/eepro100.c | 4 +++-
 hw/net/etraxfs_eth.c  | 3 ++-
 hw/net/fsl_etsec/etsec.c  | 3 ++-
 hw/net/ftgmac100.c| 3 ++-
 hw/net/i82596.c   | 2 +-
 hw/net/igb.c  | 2 +-
 hw/net/imx_fec.c  | 2 +-
 hw/net/lan9118.c  | 3 ++-
 hw/net/mcf_fec.c  | 3 ++-
 hw/net/mipsnet.c  | 3 ++-
 hw/net/msf2-emac.c| 3 ++-
 hw/net/mv88w8618_eth.c| 3 ++-
 hw/net/ne2000-isa.c   | 3 ++-
 hw/net/ne2000-pci.c   | 3 ++-
 hw/net/npcm7xx_emc.c  | 3 ++-
 hw/net/opencores_eth.c| 3 ++-
 hw/net/pcnet.c| 3 ++-
 hw/net/rocker/rocker_fp.c | 4 ++--
 hw/net/rtl8139.c  | 3 ++-
 hw/net/smc91c111.c| 3 ++-
 hw/net/spapr_llan.c   | 3 ++-
 hw/net/stellaris_enet.c   | 3 ++-
 hw/net/sungem.c   | 2 +-
 hw/net/sunhme.c   | 3 ++-
 hw/net/tulip.c| 3 ++-
 hw/net/virtio-net.c   | 6 --
 hw/net/vmxnet3.c  | 2 +-
 hw/net/xen_nic.c  | 3 ++-
 hw/net/xgmac.c| 3 ++-
 hw/net/xilinx_axienet.c   | 3 ++-
 hw/net/xilinx_ethlite.c   | 3 ++-
 hw/usb/dev-network.c  | 3 ++-
 include/net/net.h | 1 +
 net/net.c | 1 +
 40 files changed, 75 insertions(+), 40 deletions(-)

diff --git a/hw/net/allwinner-sun8i-emac.c b/hw/net/allwinner-sun8i-emac.c
index fac4405..cc350d4 100644
--- a/hw/net/allwinner-sun8i-emac.c
+++ b/hw/net/allwinner-sun8i-emac.c
@@ -824,7 +824,8 @@ static void allwinner_sun8i_emac_realize(DeviceState *dev, Error **errp)
 
 qemu_macaddr_default_if_unset(&s->conf.macaddr);
 s->nic = qemu_new_nic(&net_allwinner_sun8i_emac_info, &s->conf,
-   object_get_typename(OBJECT(dev)), dev->id, s);
+  object_get_typename(OBJECT(dev)), dev->id,
+  &dev->mem_reentrancy_guard, s);
 qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
 }
 
diff --git a/hw/net/allwinner_emac.c b/hw/net/allwinner_emac.c
index 372e5b6..e10965d 100644
--- a/hw/net/allwinner_emac.c
+++ b/hw/net/allwinner_emac.c
@@ -453,7 +453,8 @@ static void aw_emac_realize(DeviceState *dev, Error **errp)
 
 qemu_macaddr_default_if_unset(&s->conf.macaddr);
 s->nic = qemu_new_nic(&net_aw_emac_info, &s->conf,
-  object_get_typename(OBJECT(dev)), dev->id, s);
+  object_get_typename(OBJECT(dev)), dev->id,
+  &dev->mem_reentrancy_guard, s);
 qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
 
 fifo8_create(&s->rx_fifo, RX_FIFO_SIZE);
diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
index 19adbc0..296bba2 100644
--- a/hw/net/cadence_gem.c
+++ b/hw/net/cadence_gem.c
@@ -1743,7 +1743,8 @@ static void gem_realize(DeviceState *dev, Error **errp)
 qemu_macaddr_default_if_unset(&s->conf.macaddr);
 
 s->nic = qemu_new_nic(&net_gem_info, &s->conf,
-  object_get_typename(OBJECT(dev)), dev->id, s);
+  object_get_typename(OBJECT(dev)), dev->id,
+  &dev->mem_reentrancy_guard, s);
 
 if (s->jumbo_max_len > MAX_FRAME_SIZE) {
 error_setg(errp, "jumbo-max-len is greater than %d",
diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c
index c6f5fb7..b16b18b 100644
--- a/hw/net/dp8393x.c
+++ b/hw/net/dp8393x.c
@@ -913,7 +913,8 @@ static void dp8393x_realize(DeviceState *dev, Error **errp)
   "dp8393x-regs", SONIC_REG_COUNT << s->it_shift);
 
 s->nic = qemu_new_nic(&net_dp83932_info, &s->conf,
-  object_get_typename(OBJECT(dev)), dev->id, s);
+  object_get_typename(OBJECT(dev)), dev->id,
+  &dev->mem_reentrancy_guard, s);
 qemu_format_nic_info_str(qemu_get_queue(s->nic), s->conf.macaddr.a);
 
 s->watchdog = timer_new_ns(QEMU_CLOCK_VIRTUAL, dp8393x_watchdog, s);
diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 548bcab..8ffe107 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -1666,7 +1666,8 @@ static void pci_e1000_realize(PCIDevice *pci_dev, Error **errp)
macaddr);
 
 d->nic = qemu_new_nic(&net_e1000_info, &d->conf,
-

[PULL 2/3] net: Update MemReentrancyGuard for NIC

2023-11-21 Thread Jason Wang

From: Akihiko Odaki 

Recently MemReentrancyGuard was added to DeviceState to record that the
device is engaging in I/O. The network device backend needs to update it
when delivering a packet to a device.

This implementation follows what bottom half does, but it does not add
a tracepoint for the case that the network device backend started
delivering a packet to a device which is already engaging in I/O. This
is because such reentrancy frequently happens for
qemu_flush_queued_packets() and is insignificant.

Fixes: CVE-2023-3019
Reported-by: Alexander Bulekov 
Signed-off-by: Akihiko Odaki 
Acked-by: Alexander Bulekov 
Signed-off-by: Jason Wang 
---
 include/net/net.h |  1 +
 net/net.c | 14 ++
 2 files changed, 15 insertions(+)

diff --git a/include/net/net.h b/include/net/net.h
index 24deea2..ffbd2c8 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -126,6 +126,7 @@ typedef QTAILQ_HEAD(NetClientStateList, NetClientState) NetClientStateList;
 typedef struct NICState {
 NetClientState *ncs;
 NICConf *conf;
+MemReentrancyGuard *reentrancy_guard;
 void *opaque;
 bool peer_deleted;
 } NICState;
diff --git a/net/net.c b/net/net.c
index aaf0c2b..4d1ff7a 100644
--- a/net/net.c
+++ b/net/net.c
@@ -332,6 +332,7 @@ NICState *qemu_new_nic(NetClientInfo *info,
 nic = g_malloc0(info->size + sizeof(NetClientState) * queues);
 nic->ncs = (void *)nic + info->size;
 nic->conf = conf;
+nic->reentrancy_guard = reentrancy_guard;
 nic->opaque = opaque;
 
 for (i = 0; i < queues; i++) {
@@ -814,6 +815,7 @@ static ssize_t qemu_deliver_packet_iov(NetClientState *sender,
int iovcnt,
void *opaque)
 {
+MemReentrancyGuard *owned_reentrancy_guard;
 NetClientState *nc = opaque;
 int ret;
 
@@ -826,12 +828,24 @@ static ssize_t qemu_deliver_packet_iov(NetClientState *sender,
 return 0;
 }
 
+if (nc->info->type != NET_CLIENT_DRIVER_NIC ||
+qemu_get_nic(nc)->reentrancy_guard->engaged_in_io) {
+owned_reentrancy_guard = NULL;
+} else {
+owned_reentrancy_guard = qemu_get_nic(nc)->reentrancy_guard;
+owned_reentrancy_guard->engaged_in_io = true;
+}
+
 if (nc->info->receive_iov && !(flags & QEMU_NET_PACKET_FLAG_RAW)) {
 ret = nc->info->receive_iov(nc, iov, iovcnt);
 } else {
 ret = nc_sendv_compat(nc, iov, iovcnt, flags);
 }
 
+if (owned_reentrancy_guard) {
+owned_reentrancy_guard->engaged_in_io = false;
+}
+
 if (ret == 0) {
 nc->receive_disabled = 1;
 }
-- 
2.7.4
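
The ownership pattern in the hunk above can be illustrated outside QEMU with a small sketch. It is only an approximation under stated assumptions: ToyGuard, toy_deliver and the probe callback are hypothetical names, and the sketch omits the NIC type check done in the real code. The caller that finds the flag clear sets it and becomes responsible for clearing it; a nested delivery finds it already set and leaves it alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-device reentrancy flag. */
typedef struct ToyGuard {
    bool engaged_in_io;
} ToyGuard;

typedef int (*toy_receive_fn)(void *opaque);

/*
 * Deliver one packet through 'receive'. Only the outermost delivery
 * owns the guard: it sets the flag before the callback and clears it
 * afterwards. Nested deliveries see the flag set and do not touch it.
 */
static int toy_deliver(ToyGuard *guard, toy_receive_fn receive, void *opaque)
{
    ToyGuard *owned = NULL;
    int ret;

    assert(receive != NULL);
    if (guard && !guard->engaged_in_io) {
        owned = guard;
        owned->engaged_in_io = true;
    }

    ret = receive(opaque);

    if (owned) {
        owned->engaged_in_io = false;
    }
    return ret;
}

/* Hypothetical test probe: records the flag and optionally re-enters. */
typedef struct ToyProbe {
    ToyGuard *guard;
    bool seen_engaged;
    int depth;
} ToyProbe;

static int toy_probe_receive(void *opaque)
{
    ToyProbe *p = opaque;

    p->seen_engaged = p->guard->engaged_in_io;
    if (p->depth-- > 0) {
        /* Nested delivery while the outer one is still in progress. */
        toy_deliver(p->guard, toy_probe_receive, p);
    }
    return 1;
}
```

Tracking ownership in a local pointer rather than unconditionally clearing the flag is what keeps a nested delivery from dropping the guard while the outer delivery is still running.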

[PULL 0/3] Net patches

2023-11-21 Thread Jason Wang

The following changes since commit af9264da80073435fd78944bc5a46e695897d7e5:

  Merge tag '20231119-xtensa-1' of https://github.com/OSLL/qemu-xtensa into staging (2023-11-20 05:25:19 -0500)

are available in the git repository at:

  https://github.com/jasowang/qemu.git tags/net-pull-request

for you to fetch changes up to 84f85eb95f14add02efd5e69f2ff7783d79b24f7:

  net: do not delete nics in net_cleanup() (2023-11-21 15:42:34 +0800)




Akihiko Odaki (2):
  net: Provide MemReentrancyGuard * to qemu_new_nic()
  net: Update MemReentrancyGuard for NIC

David Woodhouse (1):
  net: do not delete nics in net_cleanup()

 hw/net/allwinner-sun8i-emac.c |  3 ++-
 hw/net/allwinner_emac.c   |  3 ++-
 hw/net/cadence_gem.c  |  3 ++-
 hw/net/dp8393x.c  |  3 ++-
 hw/net/e1000.c|  3 ++-
 hw/net/e1000e.c   |  2 +-
 hw/net/eepro100.c |  4 +++-
 hw/net/etraxfs_eth.c  |  3 ++-
 hw/net/fsl_etsec/etsec.c  |  3 ++-
 hw/net/ftgmac100.c|  3 ++-
 hw/net/i82596.c   |  2 +-
 hw/net/igb.c  |  2 +-
 hw/net/imx_fec.c  |  2 +-
 hw/net/lan9118.c  |  3 ++-
 hw/net/mcf_fec.c  |  3 ++-
 hw/net/mipsnet.c  |  3 ++-
 hw/net/msf2-emac.c|  3 ++-
 hw/net/mv88w8618_eth.c|  3 ++-
 hw/net/ne2000-isa.c   |  3 ++-
 hw/net/ne2000-pci.c   |  3 ++-
 hw/net/npcm7xx_emc.c  |  3 ++-
 hw/net/opencores_eth.c|  3 ++-
 hw/net/pcnet.c|  3 ++-
 hw/net/rocker/rocker_fp.c |  4 ++--
 hw/net/rtl8139.c  |  3 ++-
 hw/net/smc91c111.c|  3 ++-
 hw/net/spapr_llan.c   |  3 ++-
 hw/net/stellaris_enet.c   |  3 ++-
 hw/net/sungem.c   |  2 +-
 hw/net/sunhme.c   |  3 ++-
 hw/net/tulip.c|  3 ++-
 hw/net/virtio-net.c   |  6 --
 hw/net/vmxnet3.c  |  2 +-
 hw/net/xen_nic.c  |  3 ++-
 hw/net/xgmac.c|  3 ++-
 hw/net/xilinx_axienet.c   |  3 ++-
 hw/net/xilinx_ethlite.c   |  3 ++-
 hw/usb/dev-network.c  |  3 ++-
 include/net/net.h |  2 ++
 net/net.c | 43 +--
 40 files changed, 112 insertions(+), 46 deletions(-)

[PULL 3/3] net: do not delete nics in net_cleanup()

2023-11-21 Thread Jason Wang

From: David Woodhouse 

In net_cleanup() we only need to delete the netdevs, as those may have
state which outlives Qemu when it exits, and thus may actually need to
be cleaned up on exit.

The nics, on the other hand, are owned by the device which created them.
Most devices don't bother to clean up on exit because they don't have
any state which will outlive Qemu... but XenBus devices do need to clean
up their nodes in XenStore, and do have an exit handler to delete them.

When the XenBus exit handler destroys the xen-net-device, it attempts
to delete its nic after net_cleanup() had already done so. And crashes.

Fix this by only deleting netdevs as we walk the list. As the comment
notes, we can't use QTAILQ_FOREACH_SAFE() as each deletion may remove
*multiple* entries, including the "safely" saved 'next' pointer. But
we can store the *previous* entry, since nics are safe.

Signed-off-by: David Woodhouse 
Reviewed-by: Paul Durrant 
Signed-off-by: Jason Wang 
---
 net/net.c | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/net/net.c b/net/net.c
index 4d1ff7a..0520bc1 100644
--- a/net/net.c
+++ b/net/net.c
@@ -1514,18 +1514,34 @@ static void net_vm_change_state_handler(void *opaque, bool running,
 
 void net_cleanup(void)
 {
-NetClientState *nc;
+NetClientState *nc, **p = &QTAILQ_FIRST(&net_clients);
 
 /*cleanup colo compare module for COLO*/
 colo_compare_cleanup();
 
-/* We may del multiple entries during qemu_del_net_client(),
- * so QTAILQ_FOREACH_SAFE() is also not safe here.
+/*
+ * Walk the net_clients list and remove the netdevs but *not* any
+ * NET_CLIENT_DRIVER_NIC entries. The latter are owned by the device
+ * model which created them, and in some cases (e.g. xen-net-device)
+ * the device itself may do cleanup at exit and will be upset if we
+ * just delete its NIC from underneath it.
+ *
+ * Since qemu_del_net_client() may delete multiple entries, using
+ * QTAILQ_FOREACH_SAFE() is not safe here. The only safe pointer
+ * to keep as a bookmark is a NET_CLIENT_DRIVER_NIC entry, so keep
+ * 'p' pointing to either the head of the list, or the 'next' field
+ * of the latest NET_CLIENT_DRIVER_NIC, and operate on *p as we walk
+ * the list.
+ *
+ * The 'nc' variable isn't part of the list traversal; it's purely
+ * for convenience as too much '(*p)->' has a tendency to make the
+ * readers' eyes bleed.
  */
-while (!QTAILQ_EMPTY(&net_clients)) {
-nc = QTAILQ_FIRST(&net_clients);
+while (*p) {
+nc = *p;
 if (nc->info->type == NET_CLIENT_DRIVER_NIC) {
-qemu_del_nic(qemu_get_nic(nc));
+/* Skip NET_CLIENT_DRIVER_NIC entries */
+p = &QTAILQ_NEXT(nc, next);
 } else {
 qemu_del_net_client(nc);
 }
-- 
2.7.4
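
The traversal trick described in the commit message can be tried in isolation. The following is a minimal sketch using a plain singly linked list rather than QTAILQ; every name in it (Client, client_del_group, toy_cleanup) is hypothetical. Deleting one non-NIC client may remove several non-NIC entries at once, so the only bookmark kept is a pointer to the list head or to the 'next' field of a NIC entry, neither of which is ever freed by the walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical list node: NICs are skipped, everything else is deleted. */
typedef struct Client {
    bool is_nic;
    int group;              /* non-NIC clients in one group die together */
    struct Client *next;
} Client;

static Client *clients;     /* list head */

static Client *client_add(bool is_nic, int group)
{
    Client *c = calloc(1, sizeof(*c));

    assert(c != NULL);
    c->is_nic = is_nic;
    c->group = group;
    c->next = clients;
    clients = c;
    return c;
}

/* Remove every non-NIC client of 'group'; may delete multiple entries. */
static void client_del_group(int group)
{
    Client **p = &clients;

    while (*p) {
        Client *c = *p;

        if (!c->is_nic && c->group == group) {
            *p = c->next;
            free(c);
        } else {
            p = &c->next;
        }
    }
}

/*
 * Walk the list deleting non-NIC clients. 'p' only ever points at the
 * head or at the 'next' field of a NIC, so it stays valid no matter
 * how many entries a single deletion removes.
 */
static void toy_cleanup(void)
{
    Client **p = &clients;

    while (*p) {
        Client *c = *p;

        if (c->is_nic) {
            p = &c->next;                /* skip, keep as bookmark */
        } else {
            client_del_group(c->group);  /* re-read *p afterwards */
        }
    }
}

static int client_count(bool is_nic)
{
    int n = 0;

    for (Client *c = clients; c; c = c->next) {
        n += (c->is_nic == is_nic);
    }
    return n;
}
```

A saved 'next' pointer would not work here because the saved node itself might be freed by the group deletion, which is the same reason the commit message gives for avoiding QTAILQ_FOREACH_SAFE().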

Re: [RFC 1/2] qapi/virtio: introduce the "show-bits" argument for x-query-virtio-status

2023-11-21 Thread Yong Huang

On Tue, Nov 21, 2023 at 3:58 PM Markus Armbruster  wrote:

> Laurent, there's a question for you at the end.
>
> Yong Huang  writes:
>
> > On Thu, Nov 16, 2023 at 10:44 PM Markus Armbruster wrote:
> >
> >> Hyman Huang  writes:
> >>
> >> > This patch allows displaying feature and status bits in virtio-status.
> >> >
> >> > An optional argument is introduced: show-bits. For example:
> >> > {"execute": "x-query-virtio-status",
> >> >  "arguments": {"path": "/machine/peripheral-anon/device[1]/virtio-backend",
> >> >                "show-bits": true}}
> >> >
> >> > Features and status bits could be helpful for applications to compare
> >> > directly. For instance, when an upper application aims to ensure the
> >> > virtio negotiation correctness between guest, QEMU, and OVS-DPDK, it
> >> > uses the "ovs-vsctl list interface" command to retrieve interface
> >> > features (in number format) and the QMP command x-query-virtio-status
> >> > to retrieve vhost-user net device features. If "show-bits" is added,
> >> > the application can compare the two features directly; there is no
> >> > need to encode the features returned by the QMP command.
> >> >
> >> > This patch also serves as a preparation for the next one, which
> >> > implements a vhost-user test case about acked features of the
> >> > vhost-user protocol.
> >> >
> >> > Note that since the matching HMP command is typically used by humans,
> >> > leave it unchanged.
> >> >
> >> > Signed-off-by: Hyman Huang 
> >> > ---
> >> >  hw/virtio/virtio-hmp-cmds.c |  2 +-
> >> >  hw/virtio/virtio-qmp.c  | 21 +++-
> >> >  qapi/virtio.json| 49 ++---
> >> >  3 files changed, 67 insertions(+), 5 deletions(-)
> >> >
> >> > diff --git a/hw/virtio/virtio-hmp-cmds.c b/hw/virtio/virtio-hmp-cmds.c
> >> > index 477c97dea2..3774f3d4bf 100644
> >> > --- a/hw/virtio/virtio-hmp-cmds.c
> >> > +++ b/hw/virtio/virtio-hmp-cmds.c
> >> > @@ -108,7 +108,7 @@ void hmp_virtio_status(Monitor *mon, const QDict *qdict)
> >> >  {
> >> >  Error *err = NULL;
> >> >  const char *path = qdict_get_try_str(qdict, "path");
> >> > -VirtioStatus *s = qmp_x_query_virtio_status(path, &err);
> >> > +VirtioStatus *s = qmp_x_query_virtio_status(path, false, false, &err);
> >> >
> >> >  if (err != NULL) {
> >> >  hmp_handle_error(mon, err);
> >> > diff --git a/hw/virtio/virtio-qmp.c b/hw/virtio/virtio-qmp.c
> >> > index 1dd96ed20f..2e92bf28ac 100644
> >> > --- a/hw/virtio/virtio-qmp.c
> >> > +++ b/hw/virtio/virtio-qmp.c
> >> > @@ -718,10 +718,15 @@ VirtIODevice *qmp_find_virtio_device(const char *path)
> >> >  return VIRTIO_DEVICE(dev);
> >> >  }
> >> >
> >> > -VirtioStatus *qmp_x_query_virtio_status(const char *path, Error **errp)
> >> > +VirtioStatus *qmp_x_query_virtio_status(const char *path,
> >> > +bool has_show_bits,
> >> > +bool show_bits,
> >> > +Error **errp)
> >> >  {
> >> >  VirtIODevice *vdev;
> >> >  VirtioStatus *status;
> >> > +bool display_bits =
> >> > +has_show_bits ? show_bits : false;
> >>
> >> Since !has_show_bits implies !show_bits, you can simply use
> >> if (show_bits).
> >>
> > Ok
> >
> >>
> >> >
> >> >  vdev = qmp_find_virtio_device(path);
> >> >  if (vdev == NULL) {
> >> > @@ -733,6 +738,11 @@ VirtioStatus *qmp_x_query_virtio_status(const char *path, Error **errp)
> >> >  status->name = g_strdup(vdev->name);
> >> >  status->device_id = vdev->device_id;
> >> >  status->vhost_started = vdev->vhost_started;
> >> > +if (display_bits) {
> >> > +status->guest_features_bits = vdev->guest_features;
> >> > +status->host_features_bits = vdev->host_features;
> >> > +status->backend_features_bits = vdev->backend_features;
> >> > +}
> >> >  status->guest_features = qmp_decode_features(vdev->device_id,
> >> >  vdev->guest_features);
> >> >  status->host_features = qmp_decode_features(vdev->device_id,
> >> > @@ -753,6 +763,9 @@ VirtioStatus *qmp_x_query_virtio_status(const char *path, Error **errp)
> >> >  }
> >> >
> >> >  status->num_vqs = virtio_get_num_queues(vdev);
> >> > +if (display_bits) {
> >> > +status->status_bits = vdev->status;
> >> > +}
> >> >  status->status = qmp_decode_status(vdev->status);
> >> >  status->isr = vdev->isr;
> >> >  status->queue_sel = vdev->queue_sel;
> >> > @@ -775,6 +788,12 @@ VirtioStatus *qmp_x_query_virtio_status(const char *path, Error **errp)
> >> >  status->vhost_dev->n_tmp_sections = hdev->n_tmp_sections;
> >> >  status->vhost_dev->nvqs = hdev->nvqs;
> >> >  status->vhost_dev->vq_index = hdev->vq_index;
> >> > +if (display_bits) {
> >> > +status->vhost_dev->features_bits = hdev->features;
> >> > +status->vhost_dev->acked_features_bits = hdev->acked_features;
> >>

[PATCH] Revert "tests/avocado: Enable reverse_debugging.py tests in gitlab CI"

2023-11-21 Thread Thomas Huth

This reverts commit c4d74ab24a02c90b7a3240510b3dd4e1bec536dd.

The reverse debugging test is sometimes still failing. See:
 https://gitlab.com/qemu-project/qemu/-/issues/1992

Signed-off-by: Thomas Huth 
---
 tests/avocado/reverse_debugging.py | 8 
 1 file changed, 8 insertions(+)

diff --git a/tests/avocado/reverse_debugging.py b/tests/avocado/reverse_debugging.py
index b1410e7a69..ed04e92bb4 100644
--- a/tests/avocado/reverse_debugging.py
+++ b/tests/avocado/reverse_debugging.py
@@ -205,6 +205,8 @@ def get_pc(self, g):
 return self.get_reg_le(g, self.REG_PC) \
 + self.get_reg_le(g, self.REG_CS) * 0x10
 
+# unidentified gitlab timeout problem
+@skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
 def test_x86_64_pc(self):
 """
 :avocado: tags=arch:x86_64
@@ -220,6 +222,8 @@ class ReverseDebugging_AArch64(ReverseDebugging):
 
 REG_PC = 32
 
+# unidentified gitlab timeout problem
+@skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
 def test_aarch64_virt(self):
 """
 :avocado: tags=arch:aarch64
@@ -242,6 +246,8 @@ class ReverseDebugging_ppc64(ReverseDebugging):
 
 REG_PC = 0x40
 
+# unidentified gitlab timeout problem
+@skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
 def test_ppc64_pseries(self):
 """
 :avocado: tags=arch:ppc64
@@ -253,6 +259,8 @@ def test_ppc64_pseries(self):
 self.endian_is_le = False
 self.reverse_debugging()
 
+# See https://gitlab.com/qemu-project/qemu/-/issues/1992
+@skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
 def test_ppc64_powernv(self):
 """
 :avocado: tags=arch:ppc64
-- 
2.42.0

Re: [PATCH v3 1/4] hw/core: Add config stream

2023-11-21 Thread Ethan Chen via

On Tue, Nov 21, 2023 at 03:28:13PM +1000, Alistair Francis wrote:
> On Tue, Nov 21, 2023 at 3:24 PM Alistair Francis  wrote:
> >
> > On Tue, Nov 14, 2023 at 7:49 PM Ethan Chen via  wrote:
> > >
> > > Allow other devices to use hw/core/stream.c by selecting this config.
> > >
> > > Signed-off-by: Ethan Chen 
> > > ---
> > >  hw/core/Kconfig | 3 +++
> > >  hw/core/meson.build | 1 +
> > >  2 files changed, 4 insertions(+)
> > >
> > > diff --git a/hw/core/Kconfig b/hw/core/Kconfig
> > > index 9397503656..628dc3d883 100644
> > > --- a/hw/core/Kconfig
> > > +++ b/hw/core/Kconfig
> > > @@ -27,3 +27,6 @@ config REGISTER
> > >
> > >  config SPLIT_IRQ
> > >  bool
> > > +
> > > +config STREAM
> > > +bool
> > > \ No newline at end of file
> >
> > You are missing a newline here. I think checkpatch should catch this,
> > make sure you run it on all of your patches

Sorry about that. It is weird that this was not caught by checkpatch.

> >
> > > diff --git a/hw/core/meson.build b/hw/core/meson.build
> > > index 67dad04de5..d6ce14d5ce 100644
> > > --- a/hw/core/meson.build
> > > +++ b/hw/core/meson.build
> > > @@ -34,6 +34,7 @@ system_ss.add(when: 'CONFIG_REGISTER', if_true: files('register.c'))
> > >  system_ss.add(when: 'CONFIG_SPLIT_IRQ', if_true: files('split-irq.c'))
> > >  system_ss.add(when: 'CONFIG_XILINX_AXI', if_true: files('stream.c'))
> > >  system_ss.add(when: 'CONFIG_PLATFORM_BUS', if_true: files('sysbus-fdt.c'))
> > > +system_ss.add(when: 'CONFIG_STREAM', if_true: files('stream.c'))
> >
> > You have added the build but not the file. This will fail to compile.
> >
> > Each patch must compile and run when applied individually in order.
> > That way we maintain git bisectability. Can you please make sure that
> > the build is not broken as your patches are applied
> 
> Whoops! The file already exists.
> 
> We should only include the file stream.c once. So we should change the
> CONFIG_XILINX_AXI to select CONFIG_STREAM in this patch
> 

I will fix that in the next revision.

Thanks,
Ethan Chen

Re: [PATCH for-8.2 1/3] vl: revert behaviour for -display none

2023-11-21 Thread David Woodhouse

On Mon, 2023-11-20 at 12:42 +, Peter Maydell wrote:
> 
> This fixes the regression I was seeing with the semihosting
> use case. I haven't checked the Xen setup.
> 
> Tested-by: Peter Maydell 
> Reviewed-by: Peter Maydell 

It does also work for the Xen command line (with my other fix
reverted).

Tested-by: David Woodhouse 
Reviewed-by: David Woodhouse 




Re: [PATCH for-8.2 0/3] UI: fix default VC regressions

2023-11-21 Thread Woodhouse, David

On Tue, 2023-11-21 at 11:37 +0400, Marc-André Lureau wrote:
> On Fri, Nov 17, 2023 at 6:36 PM  wrote:
> > 
> > From: Marc-André Lureau 
> > 
> > Hi,
> > 
> > There are a few annoying regressions with the default VCs introduced with
> > the pixman series. The "vl: revert behaviour for -display none" change
> > solves most of the issues. Another one is hit when using remote displays,
> > and VCs are not created as they used to, see: "ui/console: fix default VC
> > when there are no display". Finally, "ui: use "vc" chardev for dbus, gtk &
> > spice-app" was meant to be included in the pixman series and also brings
> > back default VCs creation.
> > 
> > Marc-André Lureau (3):
> >    vl: revert behaviour for -display none
> >    ui: use "vc" chardev for dbus, gtk & spice-app
> >    ui/console: fix default VC when there are no display
> 
> I wish to send a PR (rc1 today), together with "[PATCH] vl: add
> missing display_remote++".
> 
> Some R-B/A-B appreciated! thanks

Not sure I can give coherent review on the other two, but the first
patch does fix the Xen command line and looks sane.

Please could I ask you to also include
https://lore.kernel.org/qemu-devel/20231115172723.1161679-3-dw...@infradead.org/
in the series as you push it?


Re: [PATCH v3 0/4] ide: implement simple legacy/native mode switching for PCI IDE controllers

2023-11-21 Thread Mark Cave-Ayland


On 21/11/2023 09:12, Kevin Wolf wrote:


Am 20.11.2023 um 16:02 hat BALATON Zoltan geschrieben:

On Mon, 20 Nov 2023, Mark Cave-Ayland wrote:

On 20/11/2023 13:42, Kevin Wolf wrote:

Am 20.11.2023 um 14:09 hat BALATON Zoltan geschrieben:

On Mon, 20 Nov 2023, Mark Cave-Ayland wrote:

On 19/11/2023 21:43, BALATON Zoltan wrote:

On Thu, 16 Nov 2023, Mark Cave-Ayland wrote:

This series adds a simple implementation of legacy/native mode switching for PCI
IDE controllers and updates the via-ide device to use it.

The approach I take here is to add a new pci_ide_update_mode() function which handles
management of the PCI BARs and legacy IDE ioports for each mode to avoid exposing
details of the internal logic to individual PCI IDE controllers.

As noted in [1] this is extracted from a local WIP branch I have which contains
further work in this area. However for the moment I've kept it simple (and
restricted it to the via-ide device) which is good enough for Zoltan's PPC
images whilst paving the way for future improvements after 8.2.

Signed-off-by: Mark Cave-Ayland 

[1] https://lists.gnu.org/archive/html/qemu-devel/2023-10/msg05403.html

v3:
- Rebase onto master
- Move ide_portio_list[] and ide_portio_list2[] to IDE core to prevent duplication in
  hw/ide/pci.c
- Don't zero BARs when switching from native mode to legacy mode, instead always force
  them to read zero as suggested in the PCI IDE specification (note: this also appears
  to fix the fuloong2e machine booting from IDE)


Not sure you're getting this, see also:
https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg04167.html
but this seems to break latest version of the AmigaOS driver for
some reason. I assume this is the BAR zeroing that causes this as it
works with v2 series and nothing else changed in v3 that could cause
this. Testing was done by Rene Engel, cc'd so maybe he can add more
info. It seems to work with my patch that sets BARs to legacy values
and with v2 that sets them to 0 but not with v3 which should also
read 0 but maybe something is off here.


I've been AFK for a few days, so just starting to catch up on various
bits and pieces.


OK just wasn't sure if you saw my emails at all as it happened before that
some spam filters disliked my mail server and put messages in the spam
folder.


The only difference I can think of regarding the BAR zeroing is that the
BMDMA BAR is zeroed here. Does the following diff fix things?


This helps, with this the latest driver does not crash but still reads BAR4
as 0 instead of 0xcc00 so UDMA won't work but at least it boots.


And disabling only the first four BARs is actually what the spec says,
too. So I'll make this change to the queued patches.


That was definitely something that I thought about: what should happen
to BARs outside of the ones mentioned in the PCI IDE controller
specification? It seems reasonable to me just to consider BARs 0-3 for
zeroing here.


If I understand correctly, UDMA didn't work before this series either,
so it's a separate goal and doing it in its own patch is best anyway.

As we don't seem to have a good place to set a default, maybe just
overriding it in via_ide_cfg_read(), too, and making it return 0xcc01 in
compatibility mode is enough?


It's difficult to know whether switching to legacy mode on the via-ide
device resets BAR4 to its default value, or whether it is simply left
unaltered. For 8.2 I don't mind too much as long as the logic is
separate from the BAR zeroing logic (which will eventually be lifted up
into hw/ide/pci.c).


My original patch checked for BAR being unset and only reset to default in
that case so it won't clobber a value set by something (like pegasos2
firmware) but will set default for amigaone which does not program the BAR
just uses the default legacy mode (which is the default on real chip but we
have to make that happen on QEMU after reset). So setting it to default if
it's unset when switching to legacy seems like a safe bet and testing with
my patch did not find problem with that.


How about setting the default if it's unset on the first read after
reset instead of only on switching modes? I'd like to avoid that the
guest could observe surprising state changes, even if the OSes we're
currently looking at don't do that. It just seems more robust than
introducing random magic at arbitrary points.


Note that Zoltan's image will work with just the change suggested in 
https://lists.gnu.org/archive/html/qemu-devel/2023-11/msg04331.html, but without 
BMDMA so any discussion re: default BAR addresses can be handled separately from the 
currently queued patches.


On real AmigaOne hardware BMDMA isn't well tested because it needs a hardware 
modification and configuration changes for U-Boot [1], and given that BAR4 isn't 
programmed by either the OS or the firmware, I'd be inclined to say that the fact it 
even works at all is because of a happy coincidence of bugs.


In the meantime the department of hacks has been looking at w

[PULL 1/8] target/arm: enable FEAT_RNG on Neoverse-N2

2023-11-21 Thread Peter Maydell

From: Marcin Juszkiewicz 

I noticed that Neoverse-V1 has FEAT_RNG enabled so let's enable it also on
Neoverse-N2.

Signed-off-by: Marcin Juszkiewicz 
Reviewed-by: Richard Henderson 
Message-id: 20231114103443.1652308-1-marcin.juszkiew...@linaro.org
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 target/arm/tcg/cpu64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index 08db1dbcc74..fcda99e1583 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -1018,7 +1018,7 @@ static void aarch64_neoverse_n2_initfn(Object *obj)
 cpu->isar.id_aa64dfr1  = 0;
 cpu->id_aa64afr0   = 0;
 cpu->id_aa64afr1   = 0;
-cpu->isar.id_aa64isar0 = 0x022110212120ull; /* with Crypto */
+cpu->isar.id_aa64isar0 = 0x122110212120ull; /* with Crypto and FEAT_RNG */
 cpu->isar.id_aa64isar1 = 0x001101211052ull;
 cpu->isar.id_aa64mmfr0 = 0x022200101125ull;
 cpu->isar.id_aa64mmfr1 = 0x10212122ull;
-- 
2.34.1

[PULL 7/8] hw/arm/stm32f100: Report error when incorrect CPU is used

2023-11-21 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

The 'stm32vldiscovery' machine ignores the CPU type requested by
the command line. This might confuse users, since the following
will create a machine with a Cortex-M3 CPU:

  $ qemu-system-aarch64 -M stm32vldiscovery -cpu neoverse-n1

Set the MachineClass::valid_cpu_types field (introduced in commit
c9cf636d48 "machine: Add a valid_cpu_types property").
Remove the now unused MachineClass::default_cpu_type field.

We now get:

  $ qemu-system-aarch64 -M stm32vldiscovery -cpu neoverse-n1
  qemu-system-aarch64: Invalid CPU type: neoverse-n1-arm-cpu
  The valid types are: cortex-m3-arm-cpu

Since the SoC family can only use Cortex-M3 CPUs, hard-code the
CPU type name at the SoC level, removing the QOM property
entirely.
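
As a minimal sketch of the idea, assuming a plain string comparison: the
machine class carries a NULL-terminated list of allowed CPU type names, and
the requested type is rejected if it is not in that list. The helper
cpu_type_is_valid() and the array example_valid_cpu_types are hypothetical
names for illustration only. QEMU's real check lives in the core machine
code, is not part of this patch, and may match types differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not QEMU's implementation: return true if
 * cpu_type appears in a NULL-terminated list of allowed type names.
 */
static bool cpu_type_is_valid(const char * const *valid_cpu_types,
                              const char *cpu_type)
{
    /* In this sketch a NULL list means "no restriction". */
    if (!valid_cpu_types) {
        return true;
    }
    for (size_t i = 0; valid_cpu_types[i]; i++) {
        if (strcmp(valid_cpu_types[i], cpu_type) == 0) {
            return true;
        }
    }
    return false;
}

/* Example list, shaped like the one this patch adds for the machine. */
static const char * const example_valid_cpu_types[] = {
    "cortex-m3-arm-cpu",
    NULL
};
```

With this list, "cortex-m3-arm-cpu" is accepted and "neoverse-n1-arm-cpu"
is rejected, which corresponds to the error output shown above.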

Reviewed-by: Richard Henderson 
Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Gavin Shan 
Message-id: 20231117071704.35040-5-phi...@linaro.org
Signed-off-by: Peter Maydell 
---
 include/hw/arm/stm32f100_soc.h | 4 
 hw/arm/stm32f100_soc.c | 9 ++---
 hw/arm/stm32vldiscovery.c  | 7 ++-
 3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/include/hw/arm/stm32f100_soc.h b/include/hw/arm/stm32f100_soc.h
index 40cd415b284..a74d7b369c1 100644
--- a/include/hw/arm/stm32f100_soc.h
+++ b/include/hw/arm/stm32f100_soc.h
@@ -43,12 +43,8 @@ OBJECT_DECLARE_SIMPLE_TYPE(STM32F100State, STM32F100_SOC)
 #define SRAM_SIZE (8 * 1024)
 
 struct STM32F100State {
-/*< private >*/
 SysBusDevice parent_obj;
 
-/*< public >*/
-char *cpu_type;
-
 ARMv7MState armv7m;
 
 STM32F2XXUsartState usart[STM_NUM_USARTS];
diff --git a/hw/arm/stm32f100_soc.c b/hw/arm/stm32f100_soc.c
index f7b344ba9fb..b90d440d7aa 100644
--- a/hw/arm/stm32f100_soc.c
+++ b/hw/arm/stm32f100_soc.c
@@ -115,7 +115,7 @@ static void stm32f100_soc_realize(DeviceState *dev_soc, Error **errp)
 /* Init ARMv7m */
 armv7m = DEVICE(&s->armv7m);
 qdev_prop_set_uint32(armv7m, "num-irq", 61);
-qdev_prop_set_string(armv7m, "cpu-type", s->cpu_type);
+qdev_prop_set_string(armv7m, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m3"));
 qdev_prop_set_bit(armv7m, "enable-bitband", true);
 qdev_connect_clock_in(armv7m, "cpuclk", s->sysclk);
 qdev_connect_clock_in(armv7m, "refclk", s->refclk);
@@ -180,17 +180,12 @@ static void stm32f100_soc_realize(DeviceState *dev_soc, Error **errp)
 create_unimplemented_device("CRC",   0x40023000, 0x400);
 }
 
-static Property stm32f100_soc_properties[] = {
-DEFINE_PROP_STRING("cpu-type", STM32F100State, cpu_type),
-DEFINE_PROP_END_OF_LIST(),
-};
-
 static void stm32f100_soc_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 
 dc->realize = stm32f100_soc_realize;
-device_class_set_props(dc, stm32f100_soc_properties);
+/* No vmstate or reset required: device has no internal state */
 }
 
 static const TypeInfo stm32f100_soc_info = {
diff --git a/hw/arm/stm32vldiscovery.c b/hw/arm/stm32vldiscovery.c
index 67675e952fc..190db6118b9 100644
--- a/hw/arm/stm32vldiscovery.c
+++ b/hw/arm/stm32vldiscovery.c
@@ -47,7 +47,6 @@ static void stm32vldiscovery_init(MachineState *machine)
 clock_set_hz(sysclk, SYSCLK_FRQ);
 
 dev = qdev_new(TYPE_STM32F100_SOC);
-qdev_prop_set_string(dev, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m3"));
 qdev_connect_clock_in(dev, "sysclk", sysclk);
 sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
 
@@ -58,8 +57,14 @@ static void stm32vldiscovery_init(MachineState *machine)
 
 static void stm32vldiscovery_machine_init(MachineClass *mc)
 {
+static const char * const valid_cpu_types[] = {
+ARM_CPU_TYPE_NAME("cortex-m3"),
+NULL
+};
+
 mc->desc = "ST STM32VLDISCOVERY (Cortex-M3)";
 mc->init = stm32vldiscovery_init;
+mc->valid_cpu_types = valid_cpu_types;
 }
 
 DEFINE_MACHINE("stm32vldiscovery", stm32vldiscovery_machine_init)
-- 
2.34.1

[PULL 4/8] hw/core/machine: Constify MachineClass::valid_cpu_types[]

2023-11-21 Thread Peter Maydell

From: Gavin Shan 

Constify MachineClass::valid_cpu_types[i], as suggested by Richard
Henderson.
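
A short sketch of what the new type means, using a stand-in struct rather
than the real MachineClass: with "const char * const *" neither the strings
nor the array slots can be modified through the field, so a function-local
static const array can be assigned to it directly. ExampleMachineClass,
example_class_init() and count_cpu_types() are illustrative names, not QEMU
code.

```c
#include <stddef.h>

/* Stand-in for the single MachineClass field this patch changes. */
struct ExampleMachineClass {
    const char * const *valid_cpu_types;
};

static void example_class_init(struct ExampleMachineClass *mc)
{
    /* Both the pointers and the characters they point to are const. */
    static const char * const valid_cpu_types[] = {
        "example-cpu",
        NULL
    };

    mc->valid_cpu_types = valid_cpu_types;
    /* mc->valid_cpu_types[0] = "other";  would be rejected by the compiler */
}

/* Count the entries before the NULL terminator. */
static size_t count_cpu_types(const struct ExampleMachineClass *mc)
{
    size_t n = 0;

    while (mc->valid_cpu_types && mc->valid_cpu_types[n]) {
        n++;
    }
    return n;
}
```

Keeping the array inside the class_init function, as the hunks below do,
also limits its scope to the only place that uses it.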

Suggested-by: Richard Henderson 
Signed-off-by: Gavin Shan 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Signed-off-by: Philippe Mathieu-Daudé 
Message-id: 20231117071704.35040-2-phi...@linaro.org
[PMD: Constify HPPA machines,
  restrict valid_cpu_types to machine_class_init() handlers]
Signed-off-by: Philippe Mathieu-Daudé 
Signed-off-by: Peter Maydell 
---
 include/hw/boards.h |  2 +-
 hw/hppa/machine.c   | 22 ++
 hw/m68k/q800.c  | 11 +--
 3 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index a7359992980..da85f86efb9 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -273,7 +273,7 @@ struct MachineClass {
 bool has_hotpluggable_cpus;
 bool ignore_memory_transaction_failures;
 int numa_mem_align_shift;
-const char **valid_cpu_types;
+const char * const *valid_cpu_types;
 strList *allowed_dynamic_sysbus_devices;
 bool auto_enable_numa_with_memhp;
 bool auto_enable_numa_with_memdev;
diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index 9d08f39490e..c8da7c18d53 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -672,19 +672,18 @@ static void hppa_nmi(NMIState *n, int cpu_index, Error **errp)
 }
 }
 
-static const char *HP_B160L_machine_valid_cpu_types[] = {
-TYPE_HPPA_CPU,
-NULL
-};
-
 static void HP_B160L_machine_init_class_init(ObjectClass *oc, void *data)
 {
+static const char * const valid_cpu_types[] = {
+TYPE_HPPA_CPU,
+NULL
+};
 MachineClass *mc = MACHINE_CLASS(oc);
 NMIClass *nc = NMI_CLASS(oc);
 
 mc->desc = "HP B160L workstation";
 mc->default_cpu_type = TYPE_HPPA_CPU;
-mc->valid_cpu_types = HP_B160L_machine_valid_cpu_types;
+mc->valid_cpu_types = valid_cpu_types;
 mc->init = machine_HP_B160L_init;
 mc->reset = hppa_machine_reset;
 mc->block_default_type = IF_SCSI;
@@ -709,19 +708,18 @@ static const TypeInfo HP_B160L_machine_init_typeinfo = {
 },
 };
 
-static const char *HP_C3700_machine_valid_cpu_types[] = {
-TYPE_HPPA64_CPU,
-NULL
-};
-
 static void HP_C3700_machine_init_class_init(ObjectClass *oc, void *data)
 {
+static const char * const valid_cpu_types[] = {
+TYPE_HPPA64_CPU,
+NULL
+};
 MachineClass *mc = MACHINE_CLASS(oc);
 NMIClass *nc = NMI_CLASS(oc);
 
 mc->desc = "HP C3700 workstation";
 mc->default_cpu_type = TYPE_HPPA64_CPU;
-mc->valid_cpu_types = HP_C3700_machine_valid_cpu_types;
+mc->valid_cpu_types = valid_cpu_types;
 mc->init = machine_HP_C3700_init;
 mc->reset = hppa_machine_reset;
 mc->block_default_type = IF_SCSI;
diff --git a/hw/m68k/q800.c b/hw/m68k/q800.c
index 1d7cd5ff1c3..83d1571d02f 100644
--- a/hw/m68k/q800.c
+++ b/hw/m68k/q800.c
@@ -726,19 +726,18 @@ static GlobalProperty hw_compat_q800[] = {
 };
 static const size_t hw_compat_q800_len = G_N_ELEMENTS(hw_compat_q800);
 
-static const char *q800_machine_valid_cpu_types[] = {
-M68K_CPU_TYPE_NAME("m68040"),
-NULL
-};
-
 static void q800_machine_class_init(ObjectClass *oc, void *data)
 {
+static const char * const valid_cpu_types[] = {
+M68K_CPU_TYPE_NAME("m68040"),
+NULL
+};
 MachineClass *mc = MACHINE_CLASS(oc);
 
 mc->desc = "Macintosh Quadra 800";
 mc->init = q800_machine_init;
 mc->default_cpu_type = M68K_CPU_TYPE_NAME("m68040");
-mc->valid_cpu_types = q800_machine_valid_cpu_types;
+mc->valid_cpu_types = valid_cpu_types;
 mc->max_cpus = 1;
 mc->block_default_type = IF_SCSI;
 mc->default_ram_id = "m68k_mac.ram";
-- 
2.34.1

[PULL 6/8] hw/arm/stm32f205: Report error when incorrect CPU is used

2023-11-21 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

The 'netduino2' machine ignores the CPU type requested by the
command line. This might confuse users, since the following will
create a machine with a Cortex-M3 CPU:

  $ qemu-system-arm -M netduino2 -cpu cortex-a9

Set the MachineClass::valid_cpu_types field (introduced in commit
c9cf636d48 "machine: Add a valid_cpu_types property").
Remove the now unused MachineClass::default_cpu_type field.

We now get:

  $ qemu-system-arm -M netduino2 -cpu cortex-a9
  qemu-system-arm: Invalid CPU type: cortex-a9-arm-cpu
  The valid types are: cortex-m3-arm-cpu

Since the SoC family can only use Cortex-M3 CPUs, hard-code the
CPU type name at the SoC level, removing the QOM property
entirely.

Reviewed-by: Richard Henderson 
Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Gavin Shan 
Message-id: 20231117071704.35040-4-phi...@linaro.org
Signed-off-by: Peter Maydell 
---
 include/hw/arm/stm32f205_soc.h | 4 
 hw/arm/netduino2.c | 7 ++-
 hw/arm/stm32f205_soc.c | 9 ++---
 3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/include/hw/arm/stm32f205_soc.h b/include/hw/arm/stm32f205_soc.h
index 5a4f7762642..4f4c8bbebc1 100644
--- a/include/hw/arm/stm32f205_soc.h
+++ b/include/hw/arm/stm32f205_soc.h
@@ -49,11 +49,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(STM32F205State, STM32F205_SOC)
 #define SRAM_SIZE (128 * 1024)
 
 struct STM32F205State {
-/*< private >*/
 SysBusDevice parent_obj;
-/*< public >*/
-
-char *cpu_type;
 
 ARMv7MState armv7m;
 
diff --git a/hw/arm/netduino2.c b/hw/arm/netduino2.c
index 83753d53a3f..501f63a77f9 100644
--- a/hw/arm/netduino2.c
+++ b/hw/arm/netduino2.c
@@ -44,7 +44,6 @@ static void netduino2_init(MachineState *machine)
 clock_set_hz(sysclk, SYSCLK_FRQ);
 
 dev = qdev_new(TYPE_STM32F205_SOC);
-qdev_prop_set_string(dev, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m3"));
 qdev_connect_clock_in(dev, "sysclk", sysclk);
 sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
 
@@ -54,8 +53,14 @@ static void netduino2_init(MachineState *machine)
 
 static void netduino2_machine_init(MachineClass *mc)
 {
+static const char * const valid_cpu_types[] = {
+ARM_CPU_TYPE_NAME("cortex-m3"),
+NULL
+};
+
 mc->desc = "Netduino 2 Machine (Cortex-M3)";
 mc->init = netduino2_init;
+mc->valid_cpu_types = valid_cpu_types;
 mc->ignore_memory_transaction_failures = true;
 }
 
diff --git a/hw/arm/stm32f205_soc.c b/hw/arm/stm32f205_soc.c
index c6b75a381d9..1a548646f6e 100644
--- a/hw/arm/stm32f205_soc.c
+++ b/hw/arm/stm32f205_soc.c
@@ -127,7 +127,7 @@ static void stm32f205_soc_realize(DeviceState *dev_soc, Error **errp)
 
 armv7m = DEVICE(&s->armv7m);
 qdev_prop_set_uint32(armv7m, "num-irq", 96);
-qdev_prop_set_string(armv7m, "cpu-type", s->cpu_type);
+qdev_prop_set_string(armv7m, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m3"));
 qdev_prop_set_bit(armv7m, "enable-bitband", true);
 qdev_connect_clock_in(armv7m, "cpuclk", s->sysclk);
 qdev_connect_clock_in(armv7m, "refclk", s->refclk);
@@ -201,17 +201,12 @@ static void stm32f205_soc_realize(DeviceState *dev_soc, Error **errp)
 }
 }
 
-static Property stm32f205_soc_properties[] = {
-DEFINE_PROP_STRING("cpu-type", STM32F205State, cpu_type),
-DEFINE_PROP_END_OF_LIST(),
-};
-
 static void stm32f205_soc_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 
 dc->realize = stm32f205_soc_realize;
-device_class_set_props(dc, stm32f205_soc_properties);
+/* No vmstate or reset required: device has no internal state */
 }
 
 static const TypeInfo stm32f205_soc_info = {
-- 
2.34.1

[PULL 8/8] hw/arm/fsl-imx: Do not ignore Error argument

2023-11-21 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

Both i.MX25 and i.MX6 SoC models ignore the Error argument when
setting the PHY number. Pick &error_abort which is the error
used by the i.MX7 SoC (see commit 1f7197deb0 "ability to change
the FEC PHY on i.MX7 processor").

Fixes: 74c1330582 ("ability to change the FEC PHY on i.MX25 processor")
Fixes: a9c167a3c4 ("ability to change the FEC PHY on i.MX6 processor")
Signed-off-by: Philippe Mathieu-Daudé 
Message-id: 20231120115116.76858-1-phi...@linaro.org
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 hw/arm/fsl-imx25.c | 3 ++-
 hw/arm/fsl-imx6.c  | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/arm/fsl-imx25.c b/hw/arm/fsl-imx25.c
index 24c43745903..9aabbf7f587 100644
--- a/hw/arm/fsl-imx25.c
+++ b/hw/arm/fsl-imx25.c
@@ -169,7 +169,8 @@ static void fsl_imx25_realize(DeviceState *dev, Error **errp)
 epit_table[i].irq));
 }
 
-object_property_set_uint(OBJECT(&s->fec), "phy-num", s->phy_num, &err);
+object_property_set_uint(OBJECT(&s->fec), "phy-num", s->phy_num,
+ &error_abort);
 qdev_set_nic_properties(DEVICE(&s->fec), &nd_table[0]);
 
 if (!sysbus_realize(SYS_BUS_DEVICE(&s->fec), errp)) {
diff --git a/hw/arm/fsl-imx6.c b/hw/arm/fsl-imx6.c
index 4fa7f0b95ed..7dc42cbfe64 100644
--- a/hw/arm/fsl-imx6.c
+++ b/hw/arm/fsl-imx6.c
@@ -379,7 +379,8 @@ static void fsl_imx6_realize(DeviceState *dev, Error **errp)
 spi_table[i].irq));
 }
 
-object_property_set_uint(OBJECT(&s->eth), "phy-num", s->phy_num, &err);
+object_property_set_uint(OBJECT(&s->eth), "phy-num", s->phy_num,
+ &error_abort);
 qdev_set_nic_properties(DEVICE(&s->eth), &nd_table[0]);
 if (!sysbus_realize(SYS_BUS_DEVICE(&s->eth), errp)) {
 return;
-- 
2.34.1

[PULL 2/8] hw/intc/arm_gicv3: ICC_PMR_EL1 high bits should be RAZ

2023-11-21 Thread Peter Maydell

From: Ben Dooks 

The ICC_PMR_ELx and ICV_PMR_ELx bit masks returned from
ic{c,v}_fullprio_mask should technically also remove any
bit above 7 as these are marked reserved (read 0) and
therefore should not be written as anything other than 0.

This was noted during a run of a proprietary test system and
discussed on the mailing list [1] and initially thought not to
be an issue due to RES0 being technically allowed to be
written to and read back as long as the implementation does
not use the RES0 bits. It is very possible that the values
are used in comparison without masking, as pointed out by
Peter in [2], if (cs->hppi.prio >= cs->icc_pmr_el1) may well
do the wrong thing.

Masking these values in ic{c,v}_fullprio_mask() should fix
this and prevent any future problems with playing with the
values.
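
To make the effect concrete, here is a small standalone sketch of the mask
arithmetic before and after the change. The function names are illustrative
and take the number of implemented priority bits directly (assumed to be in
the range 1..8) instead of a GICv3CPUState pointer.

```c
#include <stdint.h>

/* Mask as computed before this patch: bits above 7 stay set. */
static uint32_t fullprio_mask_unmasked(int pribits)
{
    return ~0U << (8 - pribits);
}

/* Mask as computed after this patch: only bits [7:0] can survive. */
static uint32_t fullprio_mask(int pribits)
{
    return (~0U << (8 - pribits)) & 0xff;
}
```

With 5 priority bits the old mask is 0xfffffff8, so a guest write of 0x1f0
would be stored as 0x1f0. The new mask is 0xf8, so the same write is stored
as 0xf0 and an unmasked comparison against an 8-bit priority behaves as
expected.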

[1]: https://lists.nongnu.org/archive/html/qemu-arm/2023-11/msg00607.html
[2]: https://lists.nongnu.org/archive/html/qemu-arm/2023-11/msg00737.html

Signed-off-by: Ben Dooks 
Message-id: 20231116172818.792364-1-ben.do...@codethink.co.uk
Suggested-by: Peter Maydell 
Reviewed-by: Peter Maydell 
Signed-off-by: Peter Maydell 
---
 hw/intc/arm_gicv3_cpuif.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index d07b13eb270..ab1a00508e6 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -146,7 +146,7 @@ static uint32_t icv_fullprio_mask(GICv3CPUState *cs)
  * with the group priority, whose mask depends on the value of VBPR
  * for the interrupt group.)
  */
-return ~0U << (8 - cs->vpribits);
+return (~0U << (8 - cs->vpribits)) & 0xff;
 }
 
 static int ich_highest_active_virt_prio(GICv3CPUState *cs)
@@ -803,7 +803,7 @@ static uint32_t icc_fullprio_mask(GICv3CPUState *cs)
  * with the group priority, whose mask depends on the value of BPR
  * for the interrupt group.)
  */
-return ~0U << (8 - cs->pribits);
+return (~0U << (8 - cs->pribits)) & 0xff;
 }
 
 static inline int icc_min_bpr(GICv3CPUState *cs)
-- 
2.34.1

[PULL 3/8] target/arm: Fix SME FMOPA (16-bit), BFMOPA

2023-11-21 Thread Peter Maydell

From: Richard Henderson 

Perform the loop increment unconditionally, not nested
within the predication.
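
A simplified sketch of the loop shape after the fix, not the real helper:
four column groups are visited in steps of 4, and the low bit of each 4-bit
predicate group decides whether work is done. The predicate test and the
counter are stand-ins. The point is only that col and pcol advance on every
iteration, so an inactive group is skipped instead of being revisited.

```c
#include <stdint.h>

/*
 * Visit columns 0, 4, 8, 12. Bit 0 of each 4-bit group in pcol marks
 * the group as active. Returns the number of active groups.
 */
static int count_active_groups(uint16_t pcol)
{
    int col = 0;
    int active = 0;

    do {
        if (pcol & 1) {
            active++;   /* stand-in for the per-element dot product */
        }
        /* Advance unconditionally, outside the predicate check. */
        col += 4;
        pcol >>= 4;
    } while (col & 15);

    return active;
}
```

If the two increment statements sat inside the "if", as in the code being
removed below, a cleared predicate bit would leave col unchanged and the
loop condition would never become false.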

Cc: qemu-sta...@nongnu.org
Fixes: 3916841ac75 ("target/arm: Implement FMOPA, FMOPS (widening)")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1985
Signed-off-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-id: 20231117193135.1180657-1-richard.hender...@linaro.org
Signed-off-by: Peter Maydell 
---
 target/arm/tcg/sme_helper.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c
index 296826ffe6a..1ee2690ceb5 100644
--- a/target/arm/tcg/sme_helper.c
+++ b/target/arm/tcg/sme_helper.c
@@ -1037,10 +1037,9 @@ void HELPER(sme_fmopa_h)(void *vza, void *vzn, void *vzm, void *vpn,
 
 m = f16mop_adj_pair(m, pcol, 0);
 *a = f16_dotadd(*a, n, m, &fpst_std, &fpst_odd);
-
-col += 4;
-pcol >>= 4;
 }
+col += 4;
+pcol >>= 4;
 } while (col & 15);
 }
 row += 4;
@@ -1073,10 +1072,9 @@ void HELPER(sme_bfmopa)(void *vza, void *vzn, void *vzm, void *vpn,
 
 m = f16mop_adj_pair(m, pcol, 0);
 *a = bfdotadd(*a, n, m);
-
-col += 4;
-pcol >>= 4;
 }
+col += 4;
+pcol >>= 4;
 } while (col & 15);
 }
 row += 4;
-- 
2.34.1

[PULL 5/8] hw/arm/stm32f405: Report error when incorrect CPU is used

2023-11-21 Thread Peter Maydell

From: Philippe Mathieu-Daudé 

Both 'netduinoplus2' and 'olimex-stm32-h405' machines ignore the
CPU type requested by the command line. This might confuse users,
since the following will create a machine with a Cortex-M4 CPU:

  $ qemu-system-aarch64 -M netduinoplus2 -cpu cortex-r5f

Set the MachineClass::valid_cpu_types field (introduced in commit
c9cf636d48 "machine: Add a valid_cpu_types property").
Remove the now unused MachineClass::default_cpu_type field.

We now get:

  $ qemu-system-aarch64 -M netduinoplus2 -cpu cortex-r5f
  qemu-system-aarch64: Invalid CPU type: cortex-r5f-arm-cpu
  The valid types are: cortex-m4-arm-cpu

Since the SoC family can only use Cortex-M4 CPUs, hard-code the
CPU type name at the SoC level, removing the QOM property
entirely.

Reviewed-by: Richard Henderson 
Signed-off-by: Philippe Mathieu-Daudé 
Reviewed-by: Gavin Shan 
Message-id: 20231117071704.35040-3-phi...@linaro.org
Signed-off-by: Peter Maydell 
---
 include/hw/arm/stm32f405_soc.h | 4 
 hw/arm/netduinoplus2.c | 7 ++-
 hw/arm/olimex-stm32-h405.c | 8 ++--
 hw/arm/stm32f405_soc.c | 8 +---
 4 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/include/hw/arm/stm32f405_soc.h b/include/hw/arm/stm32f405_soc.h
index c968ce3ab23..d15c03c4b5d 100644
--- a/include/hw/arm/stm32f405_soc.h
+++ b/include/hw/arm/stm32f405_soc.h
@@ -51,11 +51,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(STM32F405State, STM32F405_SOC)
 #define CCM_SIZE (64 * 1024)
 
 struct STM32F405State {
-/*< private >*/
 SysBusDevice parent_obj;
-/*< public >*/
-
-char *cpu_type;
 
 ARMv7MState armv7m;
 
diff --git a/hw/arm/netduinoplus2.c b/hw/arm/netduinoplus2.c
index 515c0816054..2e589849478 100644
--- a/hw/arm/netduinoplus2.c
+++ b/hw/arm/netduinoplus2.c
@@ -44,7 +44,6 @@ static void netduinoplus2_init(MachineState *machine)
 clock_set_hz(sysclk, SYSCLK_FRQ);
 
 dev = qdev_new(TYPE_STM32F405_SOC);
-qdev_prop_set_string(dev, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m4"));
 qdev_connect_clock_in(dev, "sysclk", sysclk);
 sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
 
@@ -55,8 +54,14 @@ static void netduinoplus2_init(MachineState *machine)
 
 static void netduinoplus2_machine_init(MachineClass *mc)
 {
+static const char * const valid_cpu_types[] = {
+ARM_CPU_TYPE_NAME("cortex-m4"),
+NULL
+};
+
 mc->desc = "Netduino Plus 2 Machine (Cortex-M4)";
 mc->init = netduinoplus2_init;
+mc->valid_cpu_types = valid_cpu_types;
 }
 
 DEFINE_MACHINE("netduinoplus2", netduinoplus2_machine_init)
diff --git a/hw/arm/olimex-stm32-h405.c b/hw/arm/olimex-stm32-h405.c
index 3aa61c91b75..d793de7c97f 100644
--- a/hw/arm/olimex-stm32-h405.c
+++ b/hw/arm/olimex-stm32-h405.c
@@ -47,7 +47,6 @@ static void olimex_stm32_h405_init(MachineState *machine)
 clock_set_hz(sysclk, SYSCLK_FRQ);
 
 dev = qdev_new(TYPE_STM32F405_SOC);
-qdev_prop_set_string(dev, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m4"));
 qdev_connect_clock_in(dev, "sysclk", sysclk);
 sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
 
@@ -58,9 +57,14 @@ static void olimex_stm32_h405_init(MachineState *machine)
 
 static void olimex_stm32_h405_machine_init(MachineClass *mc)
 {
+static const char * const valid_cpu_types[] = {
+ARM_CPU_TYPE_NAME("cortex-m4"),
+NULL
+};
+
 mc->desc = "Olimex STM32-H405 (Cortex-M4)";
 mc->init = olimex_stm32_h405_init;
-mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-m4");
+mc->valid_cpu_types = valid_cpu_types;
 
 /* SRAM pre-allocated as part of the SoC instantiation */
 mc->default_ram_size = 0;
diff --git a/hw/arm/stm32f405_soc.c b/hw/arm/stm32f405_soc.c
index cef23d7ee41..a65bbe298d2 100644
--- a/hw/arm/stm32f405_soc.c
+++ b/hw/arm/stm32f405_soc.c
@@ -149,7 +149,7 @@ static void stm32f405_soc_realize(DeviceState *dev_soc, Error **errp)
 
 armv7m = DEVICE(&s->armv7m);
 qdev_prop_set_uint32(armv7m, "num-irq", 96);
-qdev_prop_set_string(armv7m, "cpu-type", s->cpu_type);
+qdev_prop_set_string(armv7m, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m4"));
 qdev_prop_set_bit(armv7m, "enable-bitband", true);
 qdev_connect_clock_in(armv7m, "cpuclk", s->sysclk);
 qdev_connect_clock_in(armv7m, "refclk", s->refclk);
@@ -287,17 +287,11 @@ static void stm32f405_soc_realize(DeviceState *dev_soc, Error **errp)
 create_unimplemented_device("RNG", 0x50060800, 0x400);
 }
 
-static Property stm32f405_soc_properties[] = {
-DEFINE_PROP_STRING("cpu-type", STM32F405State, cpu_type),
-DEFINE_PROP_END_OF_LIST(),
-};
-
 static void stm32f405_soc_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 
 dc->realize = stm32f405_soc_realize;
-device_class_set_props(dc, stm32f405_soc_properties);
 /* No vmstate or reset required: device has no internal state */
 }
 
-- 
2.34.1

[PULL 0/8] target-arm queue

2023-11-21 Thread Peter Maydell

Hi; here are some arm patches for rc1; all small bug fixes and cleanups.

thanks
-- PMM

The following changes since commit af9264da80073435fd78944bc5a46e695897d7e5:

  Merge tag '20231119-xtensa-1' of https://github.com/OSLL/qemu-xtensa into staging (2023-11-20 05:25:19 -0500)

are available in the Git repository at:

  https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20231121

for you to fetch changes up to 0cbb56c236a4a28f5149eed227d74bb737321cfc:

  hw/arm/fsl-imx: Do not ignore Error argument (2023-11-20 15:34:19 +)


target-arm queue:
 * enable FEAT_RNG on Neoverse-N2
 * hw/intc/arm_gicv3: ICC_PMR_EL1 high bits should be RAZ
 * Fix SME FMOPA (16-bit), BFMOPA
 * hw/core/machine: Constify MachineClass::valid_cpu_types[]
 * stm32f* machines: Report error when user asks for wrong CPU type
 * hw/arm/fsl-imx: Do not ignore Error argument


Ben Dooks (1):
  hw/intc/arm_gicv3: ICC_PMR_EL1 high bits should be RAZ

Gavin Shan (1):
  hw/core/machine: Constify MachineClass::valid_cpu_types[]

Marcin Juszkiewicz (1):
  target/arm: enable FEAT_RNG on Neoverse-N2

Philippe Mathieu-Daudé (4):
  hw/arm/stm32f405: Report error when incorrect CPU is used
  hw/arm/stm32f205: Report error when incorrect CPU is used
  hw/arm/stm32f100: Report error when incorrect CPU is used
  hw/arm/fsl-imx: Do not ignore Error argument

Richard Henderson (1):
  target/arm: Fix SME FMOPA (16-bit), BFMOPA

 include/hw/arm/stm32f100_soc.h |  4 
 include/hw/arm/stm32f205_soc.h |  4 
 include/hw/arm/stm32f405_soc.h |  4 
 include/hw/boards.h|  2 +-
 hw/arm/fsl-imx25.c |  3 ++-
 hw/arm/fsl-imx6.c  |  3 ++-
 hw/arm/netduino2.c |  7 ++-
 hw/arm/netduinoplus2.c |  7 ++-
 hw/arm/olimex-stm32-h405.c |  8 ++--
 hw/arm/stm32f100_soc.c |  9 ++---
 hw/arm/stm32f205_soc.c |  9 ++---
 hw/arm/stm32f405_soc.c |  8 +---
 hw/arm/stm32vldiscovery.c  |  7 ++-
 hw/hppa/machine.c  | 22 ++
 hw/intc/arm_gicv3_cpuif.c  |  4 ++--
 hw/m68k/q800.c | 11 +--
 target/arm/tcg/cpu64.c |  2 +-
 target/arm/tcg/sme_helper.c| 10 --
 18 files changed, 56 insertions(+), 68 deletions(-)

RE: [RFC PATCH 1/3] hw/cxl/cxl-mailbox-utils: Add support for feature commands (8.2.9.6)

2023-11-21 Thread Shiju Jose via

Hi Davidlohr,

Thanks for reviewing and for the comments.

>-Original Message-
>From: Davidlohr Bueso 
>Sent: 20 November 2023 19:45
>To: Shiju Jose 
>Cc: qemu-devel@nongnu.org; linux-...@vger.kernel.org; Jonathan Cameron
>; tanxiaofei ;
>Zengtao (B) ; Linuxarm ;
>fan...@samsung.com; a.manzana...@samsung.com
>Subject: Re: [RFC PATCH 1/3] hw/cxl/cxl-mailbox-utils: Add support for feature
>commands (8.2.9.6)
>
>On Tue, 14 Nov 2023, shiju.j...@huawei.com wrote:
>
>>From: Shiju Jose 
>>
>>CXL spec 3.0 section 8.2.9.6 describes optional device specific features.
>>CXL devices supports features with changeable attributes.
>>Get Supported Features retrieves the list of supported device specific
>>features. The settings of a feature can be retrieved using Get Feature
>>and optionally modified using Set Feature.
>>
>>Signed-off-by: Shiju Jose 
>
>Reviewed-by: Davidlohr Bueso 
>
>... with some comments below.
>
>>---
>> hw/cxl/cxl-mailbox-utils.c | 140 +
>> 1 file changed, 140 insertions(+)
>>
>>diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
>>index 6184f44339..93960afd44 100644
>>--- a/hw/cxl/cxl-mailbox-utils.c
>>+++ b/hw/cxl/cxl-mailbox-utils.c
>>@@ -66,6 +66,10 @@ enum {
>> LOGS= 0x04,
>> #define GET_SUPPORTED 0x0
>> #define GET_LOG   0x1
>>+FEATURES= 0x05,
>>+#define GET_SUPPORTED 0x0
>>+#define GET_FEATURE   0x1
>>+#define SET_FEATURE   0x2
>> IDENTIFY= 0x40,
>> #define MEMORY_DEVICE 0x0
>> CCLS= 0x41,
>>@@ -785,6 +789,135 @@ static CXLRetCode cmd_logs_get_log(const struct cxl_cmd *cmd,
>> return CXL_MBOX_SUCCESS;
>> }
>>
>>+/* CXL r3.0 section 8.2.9.6: Features */
>>+typedef struct CXLSupportedFeatureHeader {
>>+uint16_t entries;
>>+uint16_t nsuppfeats_dev;
>>+uint32_t reserved;
>>+} QEMU_PACKED CXLSupportedFeatureHeader;
>>+
>>+typedef struct CXLSupportedFeatureEntry {
>>+QemuUUID uuid;
>>+uint16_t feat_index;
>>+uint16_t get_feat_size;
>>+uint16_t set_feat_size;
>>+uint32_t attrb_flags;
>>+uint8_t get_feat_version;
>>+uint8_t set_feat_version;
>>+uint16_t set_feat_effects;
>>+uint8_t rsvd[18];
>>+} QEMU_PACKED CXLSupportedFeatureEntry;
>>+
>>+enum CXL_SUPPORTED_FEATURES_LIST {
>>+CXL_FEATURE_MAX
>>+};
>>+
>>+typedef struct CXLSetFeatureInHeader {
>>+QemuUUID uuid;
>>+uint32_t flags;
>>+uint16_t offset;
>>+uint8_t version;
>>+uint8_t rsvd[9];
>>+} QEMU_PACKED QEMU_ALIGNED(16) CXLSetFeatureInHeader;
>>+
>>+#define CXL_SET_FEATURE_FLAG_DATA_TRANSFER_MASK   0x7
>>+#define CXL_SET_FEATURE_FLAG_FULL_DATA_TRANSFER0
>>+#define CXL_SET_FEATURE_FLAG_INITIATE_DATA_TRANSFER1
>>+#define CXL_SET_FEATURE_FLAG_CONTINUE_DATA_TRANSFER2
>>+#define CXL_SET_FEATURE_FLAG_FINISH_DATA_TRANSFER3
>>+#define CXL_SET_FEATURE_FLAG_ABORT_DATA_TRANSFER4
>
>Maybe enum here?
Sure. I will change.

>
>>+
>>+/* CXL r3.0 section 8.2.9.6.1: Get Supported Features (Opcode 0500h) */
>>+static CXLRetCode cmd_features_get_supported(const struct cxl_cmd *cmd,
>>+ uint8_t *payload_in,
>>+ size_t len_in,
>>+ uint8_t *payload_out,
>>+ size_t *len_out,
>>+ CXLCCI *cci)
>>+{
>>+struct {
>>+uint32_t count;
>>+uint16_t start_index;
>>+uint16_t reserved;
>>+} QEMU_PACKED QEMU_ALIGNED(16) * get_feats_in = (void *)payload_in;
>>+
>>+struct {
>>+CXLSupportedFeatureHeader hdr;
>>+CXLSupportedFeatureEntry feat_entries[];
>>+} QEMU_PACKED QEMU_ALIGNED(16) * supported_feats = (void *)payload_out;
>
>s/supported_feats/get_feats_out.
Will change.

>
>>+uint16_t index;
>>+uint16_t entry, req_entries;
>>+uint16_t feat_entries = 0;
>>+
>>+if (get_feats_in->count < sizeof(CXLSupportedFeatureHeader) ||
>>+get_feats_in->start_index > CXL_FEATURE_MAX) {
>
>Ah I see you update this to '>=' in the next patch.
>
>>+return CXL_MBOX_INVALID_INPUT;
>>+} else {
>
>This branch is not needed.
Ok.

>
>>+req_entries = (get_feats_in->count -
>>+sizeof(CXLSupportedFeatureHeader)) /
>>+sizeof(CXLSupportedFeatureEntry);
>>+}
>>+if (req_entries > CXL_FEATURE_MAX) {
>>+req_entries = CXL_FEATURE_MAX;
>>+}
>
>min()?
Sure.
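
For reference, the min() form of the clamp could look like the sketch
below. It is standalone and not the patch itself: HDR_SIZE and ENTRY_SIZE
are stand-ins for sizeof(CXLSupportedFeatureHeader) and
sizeof(CXLSupportedFeatureEntry), computed by hand from the packed
structs quoted above, and requested_entries() is a hypothetical helper
name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes of the packed structs from the patch (hand-computed). */
#define HDR_SIZE    8u   /* 2 + 2 + 4 */
#define ENTRY_SIZE  48u  /* 16 + 2 + 2 + 2 + 4 + 1 + 1 + 2 + 18 */

static inline uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/*
 * Number of feature entries that fit in a response of 'count' bytes,
 * clamped to the number of supported features.
 * Returns -1 when 'count' cannot even hold the header.
 */
static int requested_entries(uint32_t count, uint16_t feature_max)
{
    if (count < HDR_SIZE) {
        return -1;
    }
    return (int)min_u32((count - HDR_SIZE) / ENTRY_SIZE, feature_max);
}
```

With the early return on invalid input, the else branch goes away and
the separate "if (req_entries > CXL_FEATURE_MAX)" block folds into the
min() call.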

>
>>+supported_feats->hdr.nsuppfeats_dev = CXL_FEATURE_MAX;
>
>Logically this should go below, when setting the feature entries.
>
>>+index = get_feats_in->start_index;
>>+
>>+entry = 0;
>>+while (entry < req_entries) {
>>+switch (index) {
>>+default:
>>+break;
>>+}
>>+index++;
>>+entry++;
>>+}
>>+
>>+supported_feats->hdr.entries = feat_entrie

Re: [PATCH for-8.2 0/3] UI: fix default VC regressions

2023-11-21 Thread Marc-André Lureau

Hi David

On Tue, Nov 21, 2023 at 2:15 PM Woodhouse, David  wrote:
>
> On Tue, 2023-11-21 at 11:37 +0400, Marc-André Lureau wrote:
> > On Fri, Nov 17, 2023 at 6:36 PM  wrote:
> > >
> > > From: Marc-André Lureau 
> > >
> > > Hi,
> > >
> > > There are a few annoying regressions with the default VCs introduced with the
> > > pixman series. The "vl: revert behaviour for -display none" change solves most
> > > of the issues. Another one is hit when using remote displays, and VCs are not
> > > created as they used to, see: "ui/console: fix default VC when there are no
> > > display". Finally, "ui: use "vc" chardev for dbus, gtk & spice-app" was meant to
> > > be included in the pixman series and also brings back default VCs creation.
> > >
> > > Marc-André Lureau (3):
> > >vl: revert behaviour for -display none
> > >ui: use "vc" chardev for dbus, gtk & spice-app
> > >ui/console: fix default VC when there are no display
> >
> > I wish to send a PR (rc1 today), together with "[PATCH] vl: add
> > missing display_remote++".
> >
> > Some R-B/A-B appreciated! thanks
>
> Not sure I can give coherent review on the other two, but the first
> patch does fix the Xen command line and looks sane.
>
> Please could I ask you to also include
> https://lore.kernel.org/qemu-devel/20231115172723.1161679-3-dw...@infradead.org/
> in the series as you push it?
>
>

Thanks for the quick test. I am a bit reluctant to push your change in
8.2 too. It's a change in behaviour at this point, not simply a fix.
But as the maintainer of Xen stuff, you have perhaps the final call?

-- 
Marc-André Lureau

[PULL 4/5] vl: add missing display_remote++

2023-11-21 Thread marcandre . lureau

From: Marc-André Lureau 

We should also consider -display vnc= as setting up a remote display,
and not attempt to add another default one.

The display_remote++ in qemu_setup_display() isn't necessary at this
point, but is there for completeness and further usages of the variable.

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1988
Fixes: commit 484629fc81 ("vl: simplify display_remote logic ")
Signed-off-by: Marc-André Lureau 
---
 system/vl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/system/vl.c b/system/vl.c
index 14bf0cf0bf..da2654aa77 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -1110,6 +1110,7 @@ static void parse_display(const char *p)
  */
 if (*opts == '=') {
 vnc_parse(opts + 1);
+display_remote++;
 } else {
 error_report("VNC requires a display argument vnc=");
 exit(1);
@@ -1359,6 +1360,7 @@ static void qemu_setup_display(void)
 dpy.type = DISPLAY_TYPE_NONE;
 #if defined(CONFIG_VNC)
 vnc_parse("localhost:0,to=99,id=default");
+display_remote++;
 #endif
 }
 }
-- 
2.42.0

[PULL 0/5] Ui patches

2023-11-21 Thread marcandre . lureau

From: Marc-André Lureau 

The following changes since commit af9264da80073435fd78944bc5a46e695897d7e5:

  Merge tag '20231119-xtensa-1' of https://github.com/OSLL/qemu-xtensa into staging (2023-11-20 05:25:19 -0500)

are available in the Git repository at:

  https://gitlab.com/marcandre.lureau/qemu.git tags/ui-pull-request

for you to fetch changes up to e0c58720bfd8c0553f170b64717278b07438d2f5:

  ui/pixman-minimal.h: fix empty allocation (2023-11-21 14:38:14 +0400)


UI: fixes for 8.2-rc1



Manos Pitsidianakis (1):
  ui/pixman-minimal.h: fix empty allocation

Marc-André Lureau (4):
  vl: revert behaviour for -display none
  ui: use "vc" chardev for dbus, gtk & spice-app
  ui/console: fix default VC when there are no display
  vl: add missing display_remote++

 include/ui/pixman-minimal.h | 48 +++--
 system/vl.c |  4 +++-
 ui/console.c| 18 +++---
 ui/dbus.c   |  1 +
 ui/gtk.c|  1 +
 ui/spice-app.c  |  1 +
 6 files changed, 60 insertions(+), 13 deletions(-)

-- 
2.42.0

[PULL 3/5] ui/console: fix default VC when there are no display

2023-11-21 Thread marcandre . lureau

From: Marc-André Lureau 

When display is "none", we may still have remote displays (I think it
would be simpler if VNC/Spice were regular displays, btw). Return the
default VC in that case and set it up. This fixes a regression where a
remote display was in use and the TTY was used instead.

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1989
Fixes: commit 1bec1cc0d ("ui/console: allow to override the default VC")
Reported-by: German Maglione 
Signed-off-by: Marc-André Lureau 
Acked-by: Thomas Huth 
---
 ui/console.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/ui/console.c b/ui/console.c
index 8e688d3569..7db921e3b7 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -1679,19 +1679,17 @@ void qemu_display_init(DisplayState *ds, DisplayOptions *opts)
 
 const char *qemu_display_get_vc(DisplayOptions *opts)
 {
-assert(opts->type < DISPLAY_TYPE__MAX);
-if (opts->type == DISPLAY_TYPE_NONE) {
-return NULL;
-}
-assert(dpys[opts->type] != NULL);
-if (dpys[opts->type]->vc) {
-return dpys[opts->type]->vc;
-} else {
 #ifdef CONFIG_PIXMAN
-return "vc:80Cx24C";
+const char *vc = "vc:80Cx24C";
+#else
+const char *vc = NULL;
 #endif
+
+assert(opts->type < DISPLAY_TYPE__MAX);
+if (dpys[opts->type] && dpys[opts->type]->vc) {
+vc = dpys[opts->type]->vc;
 }
-return NULL;
+return vc;
 }
 
 void qemu_display_help(void)
-- 
2.42.0
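
To make the new fallback order easier to see, here is a small standalone
model of it. This is only a sketch: the enum, the Display struct and the
have_pixman argument are stand-ins invented for illustration (the real
code uses DisplayOptions, QemuDisplay and the CONFIG_PIXMAN #ifdef).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of display types, for illustration only. */
enum { DPY_NONE, DPY_GTK, DPY_SDL, DPY__MAX };

/* Stand-in for QemuDisplay: only the "vc" field matters here. */
typedef struct {
    const char *vc;
} Display;

static const Display gtk = { .vc = "vc" };
static const Display sdl = { .vc = NULL };

/* DPY_NONE has no registered display, so its slot stays NULL. */
static const Display *dpys[DPY__MAX] = {
    [DPY_GTK] = &gtk,
    [DPY_SDL] = &sdl,
};

/*
 * Start from the build-time default, then let the display override it
 * when it is registered and provides its own "vc" argument.
 */
static const char *get_vc(int type, int have_pixman)
{
    const char *vc = have_pixman ? "vc:80Cx24C" : NULL;

    assert(type >= 0 && type < DPY__MAX);
    if (dpys[type] && dpys[type]->vc) {
        vc = dpys[type]->vc;
    }
    return vc;
}
```

The point of the change is the DPY_NONE case: it no longer returns NULL
unconditionally, so a remote display combined with "-display none" still
gets the default VC when pixman is available.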

[PULL 2/5] ui: use "vc" chardev for dbus, gtk & spice-app

2023-11-21 Thread marcandre . lureau

From: Marc-André Lureau 

Those displays have their own implementation of "vc" chardev, which
doesn't use pixman. They also don't implement the width/height/cols/rows
options, so qemu_display_get_vc() should return a compatible argument.

This patch was meant to be with the pixman series, when the "vc" field
was introduced. It fixes a regression where VCs are created on the
tty (or null) instead of the display's own "vc" implementation.

Signed-off-by: Marc-André Lureau 
Acked-by: Thomas Huth 
---
 ui/dbus.c  | 1 +
 ui/gtk.c   | 1 +
 ui/spice-app.c | 1 +
 3 files changed, 3 insertions(+)

diff --git a/ui/dbus.c b/ui/dbus.c
index 866467ad2e..e08b5de064 100644
--- a/ui/dbus.c
+++ b/ui/dbus.c
@@ -518,6 +518,7 @@ static QemuDisplay qemu_display_dbus = {
 .type   = DISPLAY_TYPE_DBUS,
 .early_init = early_dbus_init,
 .init   = dbus_init,
+.vc = "vc",
 };
 
 static void register_dbus(void)
diff --git a/ui/gtk.c b/ui/gtk.c
index be047a41ad..810d7fc796 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -2534,6 +2534,7 @@ static QemuDisplay qemu_display_gtk = {
 .type   = DISPLAY_TYPE_GTK,
 .early_init = early_gtk_display_init,
 .init   = gtk_display_init,
+.vc = "vc",
 };
 
 static void register_gtk(void)
diff --git a/ui/spice-app.c b/ui/spice-app.c
index 405fb7f9f5..a10b4a58fe 100644
--- a/ui/spice-app.c
+++ b/ui/spice-app.c
@@ -220,6 +220,7 @@ static QemuDisplay qemu_display_spice_app = {
 .type   = DISPLAY_TYPE_SPICE_APP,
 .early_init = spice_app_display_early_init,
 .init   = spice_app_display_init,
+.vc = "vc",
 };
 
 static void register_spice_app(void)
-- 
2.42.0

[PULL 1/5] vl: revert behaviour for -display none

2023-11-21 Thread marcandre . lureau

From: Marc-André Lureau 

Commit 1bec1cc0d ("ui/console: allow to override the default VC") changed
the behaviour of the "-display none" option, so that it now creates a
QEMU monitor on the terminal. "-display none" should not be tangled up
with whether we create a monitor or a serial terminal; it should purely
and only disable the graphical window. Changing its behaviour like this
breaks command lines which, for example, use semihosting for their
output and don't want a graphical window, as they now get a monitor they
never asked for.

It also breaks the command line we document for Xen in
docs/system/i386/xen.html:

 $ ./qemu-system-x86_64 --accel kvm,xen-version=0x40011,kernel-irqchip=split \
-display none -chardev stdio,mux=on,id=char0,signal=off -mon char0 \
-device xen-console,chardev=char0  -drive file=${GUEST_IMAGE},if=xen

qemu-system-x86_64: cannot use stdio by multiple character devices
qemu-system-x86_64: could not connect serial device to character backend 'stdio'

When qemu is compiled without PIXMAN, by default the serials aren't
muxed with the monitor anymore on stdio. The serials are redirected to
"null" instead, and the monitor isn't set up.

Fixes: commit 1bec1cc0d ("ui/console: allow to override the default VC")
Signed-off-by: Marc-André Lureau 
Tested-by: Peter Maydell 
Reviewed-by: Peter Maydell 
Tested-by: David Woodhouse 
Reviewed-by: David Woodhouse 
---
 system/vl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/system/vl.c b/system/vl.c
index 5af7ced2a1..14bf0cf0bf 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -1391,7 +1391,7 @@ static void qemu_create_default_devices(void)
 }
 }
 
-if (nographic || (!vc && !is_daemonized() && isatty(STDOUT_FILENO))) {
+if (nographic) {
 if (default_parallel) {
 add_device_config(DEV_PARALLEL, "null");
 }
-- 
2.42.0

[PULL 5/5] ui/pixman-minimal.h: fix empty allocation

2023-11-21 Thread marcandre . lureau

From: Manos Pitsidianakis 

In the minimal pixman API stub that is used when the real pixman
dependency is missing, a NULL dereference happens when
virtio-gpu-rutabaga allocates a pixman image with bits = NULL and
rowstride_bytes = zero. A buffer of rowstride_bytes * height bytes is
allocated, which is NULL because the size is zero. However, in that
scenario pixman calculates a new stride value based on the given width,
height and format size.

This commit adds a helper function that performs the same logic as
pixman.

Signed-off-by: Manos Pitsidianakis 
Reviewed-by: Marc-André Lureau 
Message-Id: <20231121093840.2121195-1-manos.pitsidiana...@linaro.org>
---
 include/ui/pixman-minimal.h | 48 +++--
 1 file changed, 46 insertions(+), 2 deletions(-)

diff --git a/include/ui/pixman-minimal.h b/include/ui/pixman-minimal.h
index efcf570c9e..6dd7de1c7e 100644
--- a/include/ui/pixman-minimal.h
+++ b/include/ui/pixman-minimal.h
@@ -113,6 +113,45 @@ typedef struct pixman_color {
 uint16_t alpha;
 } pixman_color_t;
 
+static inline uint32_t *create_bits(pixman_format_code_t format,
+int width,
+int height,
+int *rowstride_bytes)
+{
+int stride = 0;
+size_t buf_size = 0;
+int bpp = PIXMAN_FORMAT_BPP(format);
+
+/*
+ * Calculate the following while checking for overflow truncation:
+ * stride = ((width * bpp + 0x1f) >> 5) * sizeof(uint32_t);
+ */
+
+if (unlikely(__builtin_mul_overflow(width, bpp, &stride))) {
+return NULL;
+}
+
+if (unlikely(__builtin_add_overflow(stride, 0x1f, &stride))) {
+return NULL;
+}
+
+stride >>= 5;
+
+stride *= sizeof(uint32_t);
+
+if (unlikely(__builtin_mul_overflow((size_t) height,
+(size_t) stride,
+&buf_size))) {
+return NULL;
+}
+
+if (rowstride_bytes) {
+*rowstride_bytes = stride;
+}
+
+return g_malloc0(buf_size);
+}
+
 static inline pixman_image_t *pixman_image_create_bits(pixman_format_code_t format,
int width,
int height,
@@ -123,13 +162,18 @@ static inline pixman_image_t *pixman_image_create_bits(pixman_format_code_t form
 
 i->width = width;
 i->height = height;
-i->stride = rowstride_bytes ?: width * DIV_ROUND_UP(PIXMAN_FORMAT_BPP(format), 8);
 i->format = format;
 if (bits) {
 i->data = bits;
 } else {
-i->free_me = i->data = g_malloc0(rowstride_bytes * height);
+i->free_me = i->data =
+create_bits(format, width, height, &rowstride_bytes);
+if (width && height) {
+assert(i->data);
+}
 }
+i->stride = rowstride_bytes ? rowstride_bytes :
+width * DIV_ROUND_UP(PIXMAN_FORMAT_BPP(format), 8);
 i->ref_count = 1;
 
 return i;
-- 
2.42.0
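
As a worked example of the stride formula in the comment above
(stride = ((width * bpp + 0x1f) >> 5) * sizeof(uint32_t)), here is a
standalone sketch. row_stride_bytes() is a hypothetical helper name, the
negative-input check is an addition of this sketch, and the
__builtin_*_overflow builtins assume GCC or Clang, as in the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Bytes per row, rounded up to a multiple of 32 bits.
 * Returns -1 on negative input or if an intermediate value overflows int.
 */
static int row_stride_bytes(int width, int bpp)
{
    int stride;

    if (width < 0 || bpp < 0) {
        return -1;                  /* not checked in the patch; added here */
    }
    if (__builtin_mul_overflow(width, bpp, &stride)) {
        return -1;                  /* width * bpp does not fit in int */
    }
    if (__builtin_add_overflow(stride, 0x1f, &stride)) {
        return -1;                  /* rounding term pushed it over */
    }
    stride >>= 5;                   /* bits -> 32-bit words, rounded up */
    stride *= (int)sizeof(uint32_t); /* cannot overflow after the shift */
    return stride;
}
```

For instance, 3 pixels at 24 bpp is 72 bits, which rounds up to three
32-bit words, so the row takes 12 bytes rather than the 9 that
width * DIV_ROUND_UP(bpp, 8) would give.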

Re: [PATCH for-8.2 0/3] UI: fix default VC regressions

2023-11-21 Thread David Woodhouse

On Tue, 2023-11-21 at 14:37 +0400, Marc-André Lureau wrote:
> 
> Thanks for the quick test. I am a bit reluctant to push your change in
> 8.2 too. It's a change in behaviour at this point, not simply a fix.
> But as the maintainer of Xen stuff, you have perhaps the final call?

it's not a change in behaviour yet. Being able to add xen-console
devices on the command line at all is new in 8.2, so it only ends up
being a "change" if we do it after the 8.2 release, which is why I'm
keen to do it now.

Thanks.


Re: [PATCH for-8.2 0/3] UI: fix default VC regressions

2023-11-21 Thread Marc-André Lureau

Hi

On Tue, Nov 21, 2023 at 2:42 PM David Woodhouse  wrote:
>
> On Tue, 2023-11-21 at 14:37 +0400, Marc-André Lureau wrote:
> >
> > Thanks for the quick test. I am a bit reluctant to push your change in
> > 8.2 too. It's a change in behaviour at this point, not simply a fix.
> > But as the maintainer of Xen stuff, you have perhaps the final call?
>
> it's not a change in behaviour yet. Being able to add xen-console
> devices on the command line at all is new in 8.2, so it only ends up
> being a "change" if we do it after the 8.2 release, which is why I'm
> keen to do it now.
>

I didn't realize that. Perhaps it's best to go through the Xen queue.
I already sent a PR for UI regressions, as we are close to rc1.

thanks!


-- 
Marc-André Lureau

Re: [PATCH for-8.2 0/3] UI: fix default VC regressions

2023-11-21 Thread David Woodhouse

On Tue, 2023-11-21 at 14:45 +0400, Marc-André Lureau wrote:
> Hi
> 
> On Tue, Nov 21, 2023 at 2:42 PM David Woodhouse  wrote:
> > 
> > On Tue, 2023-11-21 at 14:37 +0400, Marc-André Lureau wrote:
> > > 
> > > Thanks for the quick test. I am a bit reluctant to push your change in
> > > 8.2 too. It's a change in behaviour at this point, not simply a fix.
> > > But as the maintainer of Xen stuff, you have perhaps the final call?
> > 
> > it's not a change in behaviour yet. Being able to add xen-console
> > devices on the command line at all is new in 8.2, so it only ends up
> > being a "change" if we do it after the 8.2 release, which is why I'm
> > keen to do it now.
> > 
> 
> I didn't realize that. Perhaps it's best to go through the Xen queue.
> I already sent a PR for UI regressions, as we are close to rc1.

Makes sense. Could I trouble you for a R-b for it and then I'll do so?

Thanks.




Re: [PATCH 2/3] vl: disable default serial when xen-console is enabled

2023-11-21 Thread Marc-André Lureau

Hi

On Wed, Nov 15, 2023 at 9:28 PM David Woodhouse  wrote:
>
> From: David Woodhouse 
>
> If a Xen console is configured on the command line, do not add a default
> serial port.
>
> Signed-off-by: David Woodhouse 
> ---
>  system/vl.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/system/vl.c b/system/vl.c
> index 5af7ced2a1..8109231834 100644
> --- a/system/vl.c
> +++ b/system/vl.c
> @@ -198,6 +198,7 @@ static const struct {
>  const char *driver;
>  int *flag;
>  } default_list[] = {
> +{ .driver = "xen-console",  .flag = &default_serial},
>  { .driver = "isa-serial",   .flag = &default_serial},
>  { .driver = "isa-parallel", .flag = &default_parallel  },
>  { .driver = "isa-fdc",  .flag = &default_floppy},

Consistent with the rest of the lines (no conditional compilation nor
driver #define..)
Reviewed-by: Marc-André Lureau 

btw, while quickly testing this (do we have any test for xen-console?):

$ qemu --accel kvm,xen-version=0x40011,kernel-irqchip=split -device
xen-console,chardev=foo -chardev stdio,id=foo
(and close gtk window)

Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x55c11695 in qemu_free_net_client (nc=0x0) at ../net/net.c:387
387if (nc->incoming_queue) {
(gdb) bt
#0  0x55c11695 in qemu_free_net_client (nc=0x0) at ../net/net.c:387
#1  0x55c11a14 in qemu_del_nic (nic=0x58b6f930) at ../net/net.c:459
#2  0x559e398b in xen_netdev_unrealize (xendev=0x58b6b510)
at ../hw/net/xen_nic.c:550
#3  0x55b6e22f in xen_device_unrealize (dev=0x58b6b510) at
../hw/xen/xen-bus.c:973
#4  0x55b6e351 in xen_device_exit (n=0x58b6b5e0, data=0x0)
at ../hw/xen/xen-bus.c:1002
#5  0x560bc3fc in notifier_list_notify (list=0x570b5fc0
, data=0x0) at ../util/notify.c:39
#6  0x55ba1d49 in qemu_run_exit_notifiers () at ../system/runstate.c:800



--
Marc-André Lureau

Re: [PATCH 2/3] vl: disable default serial when xen-console is enabled

2023-11-21 Thread Marc-André Lureau

Hi

On Tue, Nov 21, 2023 at 2:57 PM Marc-André Lureau
 wrote:
>
> Hi
>
> On Wed, Nov 15, 2023 at 9:28 PM David Woodhouse  wrote:
> >
> > From: David Woodhouse 
> >
> > If a Xen console is configured on the command line, do not add a default
> > serial port.
> >
> > Signed-off-by: David Woodhouse 
> > ---
> >  system/vl.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/system/vl.c b/system/vl.c
> > index 5af7ced2a1..8109231834 100644
> > --- a/system/vl.c
> > +++ b/system/vl.c
> > @@ -198,6 +198,7 @@ static const struct {
> >  const char *driver;
> >  int *flag;
> >  } default_list[] = {
> > +{ .driver = "xen-console",  .flag = &default_serial},
> >  { .driver = "isa-serial",   .flag = &default_serial},
> >  { .driver = "isa-parallel", .flag = &default_parallel  },
> >  { .driver = "isa-fdc",  .flag = &default_floppy},
>
> Consistent with the rest of the lines (no conditional compilation nor
> driver #define..)
> Reviewed-by: Marc-André Lureau 
>
> btw, while quickly testing this (do we have any test for xen-console?):
>
> $ qemu --accel kvm,xen-version=0x40011,kernel-irqchip=split -device
> xen-console,chardev=foo -chardev stdio,id=foo
> (and close gtk window)
>
> Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
> 0x55c11695 in qemu_free_net_client (nc=0x0) at ../net/net.c:387
> 387if (nc->incoming_queue) {
> (gdb) bt
> #0  0x55c11695 in qemu_free_net_client (nc=0x0) at ../net/net.c:387
> #1  0x55c11a14 in qemu_del_nic (nic=0x58b6f930) at 
> ../net/net.c:459
> #2  0x559e398b in xen_netdev_unrealize (xendev=0x58b6b510)
> at ../hw/net/xen_nic.c:550
> #3  0x55b6e22f in xen_device_unrealize (dev=0x58b6b510) at
> ../hw/xen/xen-bus.c:973
> #4  0x55b6e351 in xen_device_exit (n=0x58b6b5e0, data=0x0)
> at ../hw/xen/xen-bus.c:1002
> #5  0x560bc3fc in notifier_list_notify (list=0x570b5fc0
> , data=0x0) at ../util/notify.c:39
> #6  0x55ba1d49 in qemu_run_exit_notifiers () at 
> ../system/runstate.c:800

Ok, I found related "[PATCH 1/3] net: do not delete nics in net_cleanup()"



-- 
Marc-André Lureau

Re: [PATCH 2/3] vl: disable default serial when xen-console is enabled

2023-11-21 Thread Paul Durrant


On 15/11/2023 17:24, David Woodhouse wrote:

From: David Woodhouse 

If a Xen console is configured on the command line, do not add a default
serial port.

Signed-off-by: David Woodhouse 
---
  system/vl.c | 1 +
  1 file changed, 1 insertion(+)



Reviewed-by: Paul Durrant

Re: [PATCH v3 0/4] ide: implement simple legacy/native mode switching for PCI IDE controllers

2023-11-21 Thread Peter Maydell

On Tue, 21 Nov 2023 at 10:18, Mark Cave-Ayland
 wrote:
> In the meantime the department of hacks has been looking at ways of trying to set BAR
> addresses during reset, and humbly submits the following for consideration:

> +static void via_ide_bar_reset(void *opaque)
> +{
> +PCIIDEState *d = PCI_IDE(opaque);
> +PCIDevice *pd = PCI_DEVICE(d);
> +uint8_t *pci_conf = pd->config;
> +
> +/*
> + * Some OSs e.g. AmigaOS rely on the default BMDMA BAR value being present
> + * to initialise correctly, even in legacy mode(!)
> + */
> +pci_set_long(pci_conf + PCI_BASE_ADDRESS_4,
> + 0xcc00 | PCI_BASE_ADDRESS_SPACE_IO);
> +
> +/* Unregister reset function */
> +qemu_unregister_reset(via_ide_bar_reset, opaque);
> +}
> +
>   static void via_ide_reset(DeviceState *dev)
>   {
>   PCIIDEState *d = PCI_IDE(dev);
> @@ -156,6 +174,9 @@ static void via_ide_reset(DeviceState *dev)
>   pci_set_long(pci_conf + 0x68, 0x0200);
>   /* PCI PM Block */
>   pci_set_long(pci_conf + 0xc0, 0x00020001);
> +
> +/* Register separate function to set BAR values after PCI bus reset */
> +qemu_register_reset(via_ide_bar_reset, d);
>   }

I'm definitely not very enthusiastic about hacks which
increase the usage of qemu_register_reset() and rely
on reset-hook call order. We really need to try to have
another go at cleaning up the reset mess and this would be
yet another thing somebody's going to have to undo some day.
Unregistering your reset function in the reset function is
also rather curious...

thanks
-- PMM

Re: [PATCH-for-9.0 22/25] hw/sparc: Simplify memory_region_init_ram_nomigrate() calls

2023-11-21 Thread Philippe Mathieu-Daudé


On 20/11/23 22:32, Philippe Mathieu-Daudé wrote:

Mechanical change using the following coccinelle script:

@@
expression mr, owner, arg3, arg4, errp;
@@
-   memory_region_init_ram_nomigrate(mr, owner, arg3, arg4, &errp);
 if (
-   errp
+   !memory_region_init_ram_nomigrate(mr, owner, arg3, arg4, &errp)
 ) {
 ...
 return;
 }

and removing the local Error variable.

Signed-off-by: Philippe Mathieu-Daudé 
---
  hw/sparc/sun4m.c   | 20 ++--
  hw/sparc64/sun4u.c |  7 ++-
  2 files changed, 8 insertions(+), 19 deletions(-)




@@ -631,11 +628,9 @@ static void afx_realize(DeviceState *ds, Error **errp)
  {
  AFXState *s = TCX_AFX(ds);
  SysBusDevice *dev = SYS_BUS_DEVICE(ds);
-Error *local_err = NULL;
  
-memory_region_init_ram_nomigrate(&s->mem, OBJECT(ds), "sun4m.afx", 4,

- &local_err);
-if (local_err) {
+if (!memory_region_init_ram_nomigrate(&s->mem, OBJECT(ds), "sun4m.afx",
+  4, errp)) {
  error_propagate(errp, local_err);


I forgot to remove this error_propagate() line.


  return;
  }
@@ -715,12 +710,9 @@ static void prom_realize(DeviceState *ds, Error **errp)
  {
  PROMState *s = OPENPROM(ds);
  SysBusDevice *dev = SYS_BUS_DEVICE(ds);
-Error *local_err = NULL;
  
-memory_region_init_ram_nomigrate(&s->prom, OBJECT(ds), "sun4m.prom",

- PROM_SIZE_MAX, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+if (!memory_region_init_ram_nomigrate(&s->prom, OBJECT(ds), "sun4m.prom",
+  PROM_SIZE_MAX, errp)) {
  return;
  }
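
The shape of the conversion, reduced to a standalone sketch: once the
init function returns bool, the caller tests the return value and passes
errp straight through, so neither a local Error nor error_propagate() is
needed. The Error struct, region_init() and device_realize() below are
stand-ins invented for illustration and are not QEMU's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for QEMU's Error type, for illustration only. */
typedef struct Error {
    const char *msg;
} Error;

static Error err_empty = { "cannot allocate an empty region" };

/* Hypothetical bool-returning init: false means failure, *errp is set. */
static bool region_init(size_t size, Error **errp)
{
    if (size == 0) {
        if (errp) {
            *errp = &err_empty;
        }
        return false;
    }
    return true;
}

/* Caller after the conversion: no local_err, no propagation step. */
static bool device_realize(size_t size, Error **errp)
{
    if (!region_init(size, errp)) {
        return false;   /* errp already carries the failure */
    }
    /* ... the rest of realize would go here ... */
    return true;
}
```

This is also why the leftover error_propagate() line noted above has to
go: with local_err removed there is nothing left to propagate.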

Re: [PATCH 2/3] vl: disable default serial when xen-console is enabled

2023-11-21 Thread David Woodhouse

On Tue, 2023-11-21 at 14:58 +0400, Marc-André Lureau wrote:
> 
> > Consistent with the rest of the lines (no conditional compilation nor
> > driver #define..)
> > Reviewed-by: Marc-André Lureau 

Thanks.

> > btw, while quickly testing this (do we have any test for xen-console?):
> > 
> > $ qemu --accel kvm,xen-version=0x40011,kernel-irqchip=split -device
> > xen-console,chardev=foo -chardev stdio,id=foo
> > (and close gtk window)
> > 
> > Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
> > 0x55c11695 in qemu_free_net_client (nc=0x0) at ../net/net.c:387
> > 387    if (nc->incoming_queue) {
> > (gdb) bt
> > #0  0x55c11695 in qemu_free_net_client (nc=0x0) at ../net/net.c:387
> > #1  0x55c11a14 in qemu_del_nic (nic=0x58b6f930) at 
> > ../net/net.c:459
> > #2  0x559e398b in xen_netdev_unrealize (xendev=0x58b6b510)
> > at ../hw/net/xen_nic.c:550
> > #3  0x55b6e22f in xen_device_unrealize (dev=0x58b6b510) at
> > ../hw/xen/xen-bus.c:973
> > #4  0x55b6e351 in xen_device_exit (n=0x58b6b5e0, data=0x0)
> > at ../hw/xen/xen-bus.c:1002
> > #5  0x560bc3fc in notifier_list_notify (list=0x570b5fc0
> > , data=0x0) at ../util/notify.c:39
> > #6  0x55ba1d49 in qemu_run_exit_notifiers () at 
> > ../system/runstate.c:800
> 
> Ok, I found related "[PATCH 1/3] net: do not delete nics in net_cleanup()"

Yep, and I think I saw that go by in a pull request not many hours ago,
so it should be fixed by -rc1. Thanks for testing.



Re: [PATCH 2/3] chardev: report blocked write to chardev backend

2023-11-21 Thread Marc-André Lureau

Hi

On Tue, Nov 21, 2023 at 1:45 PM Thomas Huth  wrote:
>
> On 21/11/2023 10.39, Marc-André Lureau wrote:
> > Hi
> >
> > On Mon, Nov 20, 2023 at 5:36 PM Nicholas Piggin  wrote:
> >>
> >> On Mon Nov 20, 2023 at 10:06 PM AEST, Marc-André Lureau wrote:
> >>> Hi
> >>>
> >>> On Thu, Nov 16, 2023 at 3:54 PM Nicholas Piggin  wrote:
> 
>  If a chardev socket is not read, it will eventually fill and QEMU
>  can block attempting to write to it. A difficult bug in avocado
>  tests where the console socket was not being read from caused this
>  hang.
> 
>  warn if a chardev write is blocked for 100ms.
> 
>  Signed-off-by: Nicholas Piggin 
>  ---
>  This is not necessary for the fix but it does trigger in the
>  failing avocado test without the previous patch applied. Maybe
>  it would be helpful?
> 
>  Thanks,
>  Nick
> 
>    chardev/char.c | 6 ++
>    1 file changed, 6 insertions(+)
> 
>  diff --git a/chardev/char.c b/chardev/char.c
>  index 996a024c7a..7c375e3cc4 100644
>  --- a/chardev/char.c
>  +++ b/chardev/char.c
>  @@ -114,6 +114,8 @@ static int qemu_chr_write_buffer(Chardev *s,
>    {
>    ChardevClass *cc = CHARDEV_GET_CLASS(s);
>    int res = 0;
>  +int nr_retries = 0;
>  +
>    *offset = 0;
> 
>    qemu_mutex_lock(&s->chr_write_lock);
>  @@ -126,6 +128,10 @@ static int qemu_chr_write_buffer(Chardev *s,
>    } else {
>    g_usleep(100);
>    }
>  +if (++nr_retries == 1000) { /* 100ms */
>  +warn_report("Chardev '%s' write blocked for > 100ms, "
>  +"socket buffer full?", s->label);
>  +}
> >>>
> >>> That shouldn't happen, the frontend should poll and only write when it
> >>> can. What is the qemu command being used here?
> >>
> >> You can follow it through the thread here
> >>
> >> https://lore.kernel.org/qemu-devel/zvt-by9yor69q...@redhat.com/
> >>
> >> In short, a console device is attached to a socket pair and nothing
> >> ever reads from it. It eventually fills, and writing to it fails
> >> indefinitely here.
> >>
> >> It can be reproduced with:
> >>
> >> make check-avocado
> >> AVOCADO_TESTS=tests/avocado/reverse_debugging.py:test_ppc64_pseries
> >>
> >>
> >
> > How reliably? I tried 10/10.
>
> It used to fail for me every time I tried - but the fix has already been
> merged yesterday (commit cd43f00524070c026), so if you updated today, you'll
> see the test passing again.

Ok so the "frontend" is spapr-vty and there:

void vty_putchars(SpaprVioDevice *sdev, uint8_t *buf, int len)
{
SpaprVioVty *dev = VIO_SPAPR_VTY_DEVICE(sdev);

/* XXX this blocks entire thread. Rewrite to use
 * qemu_chr_fe_write and background I/O callbacks */
qemu_chr_fe_write_all(&dev->chardev, buf, len);
}

(grep "XXX this blocks", we have a lot...)

Can H_PUT_TERM_CHAR return the number of bytes written?

Is there a way to tell the guest the console is ready to accept more bytes?
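
For context, the warning logic from the quoted patch can be modelled in
isolation as below. This is a sketch under assumptions: FakeBackend and
all function names are hypothetical, the sleep between attempts is left
out, and the message text only approximates the one in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical backend: reports "would block" for the first N attempts. */
typedef struct {
    int blocked_attempts;
} FakeBackend;

static bool fake_try_write(FakeBackend *be)
{
    if (be->blocked_attempts > 0) {
        be->blocked_attempts--;
        return false;               /* would block, caller retries */
    }
    return true;
}

/*
 * Retry until the write goes through and warn exactly once when the
 * retry count reaches warn_after. The real loop sleeps between attempts
 * (g_usleep(100) in the patch, hence 1000 retries for 100ms).
 * Returns true if the warning was emitted.
 */
static bool write_with_warning(FakeBackend *be, int warn_after)
{
    int nr_retries = 0;
    bool warned = false;

    while (!fake_try_write(be)) {
        if (++nr_retries == warn_after) {
            fprintf(stderr, "write blocked for %d retries, buffer full?\n",
                    nr_retries);
            warned = true;
        }
    }
    return warned;
}

/* Convenience wrapper for trying different blocking durations. */
static bool warns_when_blocked_for(int blocked_attempts, int warn_after)
{
    FakeBackend be = { .blocked_attempts = blocked_attempts };
    return write_with_warning(&be, warn_after);
}
```

Note that the loop still blocks the calling thread for as long as the
backend is full; the warning only makes the stall visible.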

[PULL 4/9] stream: Fix AioContext locking during bdrv_graph_wrlock()

2023-11-21 Thread Kevin Wolf

In stream_prepare(), we need to temporarily drop the AioContext lock
that job_prepare_locked() took for us while calling the graph write lock
functions which can poll.

All block nodes related to this block job are in the same AioContext, so
we can pass any of them to bdrv_graph_wrlock()/ bdrv_graph_wrunlock().
Unfortunately, the one that we picked is base, which can be NULL - and
in this case the AioContext lock is not released and deadlocks can
occur.

Fix this by passing s->target_bs, which is never NULL.

Signed-off-by: Kevin Wolf 
Message-ID: <20231115172012.112727-4-kw...@redhat.com>
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
 block/stream.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/stream.c b/block/stream.c
index e3aa696289..01fe7c0f16 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -99,9 +99,9 @@ static int stream_prepare(Job *job)
 }
 }
 
-bdrv_graph_wrlock(base);
+bdrv_graph_wrlock(s->target_bs);
 bdrv_set_backing_hd_drained(unfiltered_bs, base, &local_err);
-bdrv_graph_wrunlock(base);
+bdrv_graph_wrunlock(s->target_bs);
 
 /*
  * This call will do I/O, so the graph can change again from here on.
-- 
2.42.0

[PULL 1/9] hw/ide/ahci: fix legacy software reset

2023-11-21 Thread Kevin Wolf

From: Niklas Cassel 

Legacy software contains a standard mechanism for generating a reset to a
Serial ATA device - setting the SRST (software reset) bit in the Device
Control register.

Serial ATA has a more robust mechanism called COMRESET, also referred to
as port reset. A port reset is the preferred mechanism for error
recovery and should be used in place of software reset.

Commit e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
improved the handling of PxCI, such that PxCI gets cleared after handling
a non-NCQ, or NCQ command (instead of incorrectly clearing PxCI after
receiving anything - even a FIS that failed to parse, which should NOT
clear PxCI, so that you can see which command slot that caused an error).

However, simply clearing PxCI after a non-NCQ, or NCQ command, is not
enough, we also need to clear PxCI when receiving a SRST in the Device
Control register.

A legacy software reset is performed by the host sending two H2D FISes,
the first H2D FIS asserts SRST, and the second H2D FIS deasserts SRST.

The first H2D FIS will not get a D2H reply, and requires the FIS to have
the C bit set to one, such that the HBA itself will clear the bit in PxCI.

The second H2D FIS will get a D2H reply once the diagnostic is completed.
The clearing of the bit in PxCI for this command should ideally be done
in ahci_init_d2h() (if it was a legacy software reset that caused the
reset (a COMRESET does not use a command slot)). However, since the reset
value for PxCI is 0, modify ahci_reset_port() to actually clear PxCI to 0,
that way we can avoid complex logic in ahci_init_d2h().

This fixes an issue for FreeBSD where the device would fail to reset.
The problem was not noticed in Linux, because Linux uses a COMRESET
instead of a legacy software reset by default.

Fixes: e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
Reported-by: Marcin Juszkiewicz 
Signed-off-by: Niklas Cassel 
Message-ID: <20231108222657.117984-1-...@flawful.org>
Reviewed-by: Kevin Wolf 
Tested-by: Marcin Juszkiewicz 
Signed-off-by: Kevin Wolf 
---
 hw/ide/ahci.c | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 7676e2d871..afdc44b8e0 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -623,9 +623,13 @@ static void ahci_init_d2h(AHCIDevice *ad)
 return;
 }
 
+/*
+ * For simplicity, do not call ahci_clear_cmd_issue() for this
+ * ahci_write_fis_d2h(). (The reset value for PxCI is 0.)
+ */
 if (ahci_write_fis_d2h(ad, true)) {
 ad->init_d2h_sent = true;
-/* We're emulating receiving the first Reg H2D Fis from the device;
+/* We're emulating receiving the first Reg D2H FIS from the device;
  * Update the SIG register, but otherwise proceed as normal. */
 pr->sig = ((uint32_t)ide_state->hcyl << 24) |
 (ide_state->lcyl << 16) |
@@ -663,6 +667,7 @@ static void ahci_reset_port(AHCIState *s, int port)
 pr->scr_act = 0;
 pr->tfdata = 0x7F;
 pr->sig = 0x;
+pr->cmd_issue = 0;
 d->busy_slot = -1;
 d->init_d2h_sent = false;
 
@@ -1242,10 +1247,30 @@ static void handle_reg_h2d_fis(AHCIState *s, int port,
 case STATE_RUN:
 if (cmd_fis[15] & ATA_SRST) {
 s->dev[port].port_state = STATE_RESET;
+/*
+ * When setting SRST in the first H2D FIS in the reset sequence,
+ * the device does not send a D2H FIS. Host software thus has to
+ * set the "Clear Busy upon R_OK" bit such that PxCI (and BUSY)
+ * gets cleared. See AHCI 1.3.1, section 10.4.1 Software Reset.
+ */
+if (opts & AHCI_CMD_CLR_BUSY) {
+ahci_clear_cmd_issue(ad, slot);
+}
 }
 break;
 case STATE_RESET:
 if (!(cmd_fis[15] & ATA_SRST)) {
+/*
+ * When clearing SRST in the second H2D FIS in the reset
+ * sequence, the device will execute diagnostics. When this is
+ * done, the device will send a D2H FIS with the good status.
+ * See SATA 3.5a Gold, section 11.4 Software reset protocol.
+ *
+ * This D2H FIS is the first D2H FIS received from the device,
+ * and is received regardless if the reset was performed by a
+ * COMRESET or by setting and clearing the SRST bit. Therefore,
+ * the logic for this is found in ahci_init_d2h() and not here.
+ */
 ahci_reset_port(s, port);
 }
 break;
-- 
2.42.0

[PULL 9/9] hw/ide/via: implement legacy/native mode switching

2023-11-21 Thread Kevin Wolf

From: Mark Cave-Ayland 

Allow the VIA IDE controller to switch between legacy and native modes by
calling pci_ide_update_mode() to reconfigure the device whenever PCI_CLASS_PROG
is updated.

This patch moves the initial setting of PCI_CLASS_PROG from via_ide_realize() to
via_ide_reset(), and removes the direct setting of PCI_INTERRUPT_PIN during PCI
bus reset since this is now managed by pci_ide_update_mode(). This ensures that
the device configuration is always consistent with respect to the currently
selected mode.
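
The byte-range checks that this kind of config-space hook relies on can
be sketched in isolation. The following is a minimal standalone
illustration, not the patch itself: access_covers_byte(),
is_legacy_mode() and mask_bars_in_legacy_mode() are hypothetical helper
names, and the two offsets are the standard PCI configuration-space
offsets of the programming interface byte and of the first BAR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_CLASS_PROG      0x09  /* programming interface byte */
#define PCI_BASE_ADDRESS_0  0x10  /* first base address register */

/* True if the access [addr, addr + len) touches the given config byte. */
static bool access_covers_byte(uint32_t addr, int len, uint32_t byte)
{
    return byte >= addr && byte - addr < (uint32_t)len;
}

/* The patch treats a low nibble of 0xa in the prog-if byte as legacy. */
static bool is_legacy_mode(uint8_t prog_if)
{
    return (prog_if & 0xf) == 0xa;
}

/*
 * Zero every byte of a config read that falls inside the 16 bytes
 * starting at PCI_BASE_ADDRESS_0 while the device is in legacy mode.
 */
static uint32_t mask_bars_in_legacy_mode(uint8_t prog_if, uint32_t addr,
                                         int len, uint32_t val)
{
    assert(len >= 1 && len <= 4);

    if (!is_legacy_mode(prog_if)) {
        return val;
    }
    for (int i = 0; i < len; i++) {
        uint32_t byte = addr + i;

        if (byte >= PCI_BASE_ADDRESS_0 && byte < PCI_BASE_ADDRESS_0 + 16) {
            val &= ~(UINT32_C(0xff) << (8 * i));
        }
    }
    return val;
}
```

A write hook would call something like access_covers_byte(addr, len,
PCI_CLASS_PROG) after the default write and reconfigure the device only
when it returns true, so unrelated config writes do not trigger a mode
update.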

Signed-off-by: Mark Cave-Ayland 
Message-ID: <20231116103355.588580-5-mark.cave-ayl...@ilande.co.uk>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 hw/ide/via.c | 39 +--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/hw/ide/via.c b/hw/ide/via.c
index 87b134083a..2d3124ebd7 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -28,6 +28,7 @@
 #include "hw/pci/pci.h"
 #include "migration/vmstate.h"
 #include "qemu/module.h"
+#include "qemu/range.h"
 #include "sysemu/dma.h"
 #include "hw/isa/vt82c686.h"
 #include "hw/ide/pci.h"
@@ -128,11 +129,14 @@ static void via_ide_reset(DeviceState *dev)
 ide_bus_reset(&d->bus[i]);
 }
 
+pci_config_set_prog_interface(pci_conf, 0x8a); /* legacy mode */
+pci_ide_update_mode(d);
+
 pci_set_word(pci_conf + PCI_COMMAND, PCI_COMMAND_IO | PCI_COMMAND_WAIT);
 pci_set_word(pci_conf + PCI_STATUS, PCI_STATUS_FAST_BACK |
  PCI_STATUS_DEVSEL_MEDIUM);
 
-pci_set_long(pci_conf + PCI_INTERRUPT_LINE, 0x010e);
+pci_set_byte(pci_conf + PCI_INTERRUPT_LINE, 0xe);
 
 /* IDE chip enable, IDE configuration 1/2, IDE FIFO Configuration*/
 pci_set_long(pci_conf + 0x40, 0x0a090600);
@@ -154,6 +158,36 @@ static void via_ide_reset(DeviceState *dev)
 pci_set_long(pci_conf + 0xc0, 0x00020001);
 }
 
+static uint32_t via_ide_cfg_read(PCIDevice *pd, uint32_t addr, int len)
+{
+uint32_t val = pci_default_read_config(pd, addr, len);
+uint8_t mode = pd->config[PCI_CLASS_PROG];
+
+if ((mode & 0xf) == 0xa && ranges_overlap(addr, len,
+  PCI_BASE_ADDRESS_0, 16)) {
+/* BARs always read back zero in legacy mode */
+for (int i = addr; i < addr + len; i++) {
+if (i >= PCI_BASE_ADDRESS_0 && i < PCI_BASE_ADDRESS_0 + 16) {
+val &= ~(0xffULL << ((i - addr) << 3));
+}
+}
+}
+
+return val;
+}
+
+static void via_ide_cfg_write(PCIDevice *pd, uint32_t addr,
+  uint32_t val, int len)
+{
+PCIIDEState *d = PCI_IDE(pd);
+
+pci_default_write_config(pd, addr, val, len);
+
+if (range_covers_byte(addr, len, PCI_CLASS_PROG)) {
+pci_ide_update_mode(d);
+}
+}
+
 static void via_ide_realize(PCIDevice *dev, Error **errp)
 {
 PCIIDEState *d = PCI_IDE(dev);
@@ -161,7 +195,6 @@ static void via_ide_realize(PCIDevice *dev, Error **errp)
 uint8_t *pci_conf = dev->config;
 int i;
 
-pci_config_set_prog_interface(pci_conf, 0x8a); /* legacy mode */
 pci_set_long(pci_conf + PCI_CAPABILITY_LIST, 0x00c0);
 dev->wmask[PCI_INTERRUPT_LINE] = 0;
 dev->wmask[PCI_CLASS_PROG] = 5;
@@ -216,6 +249,8 @@ static void via_ide_class_init(ObjectClass *klass, void *data)
 /* Reason: only works as function of VIA southbridge */
 dc->user_creatable = false;
 
+k->config_read = via_ide_cfg_read;
+k->config_write = via_ide_cfg_write;
 k->realize = via_ide_realize;
 k->exit = via_ide_exitfn;
 k->vendor_id = PCI_VENDOR_ID_VIA;
-- 
2.42.0

[PULL 3/9] block: Fix deadlocks in bdrv_graph_wrunlock()

2023-11-21 Thread Kevin Wolf

bdrv_graph_wrunlock() calls aio_poll(), which may run callbacks that
have a nested event loop. Nested event loops can depend on other
iothreads making progress, so in order to allow them to make progress it
must not hold the AioContext lock of another thread while calling
aio_poll().

This introduces a @bs parameter to bdrv_graph_wrunlock() whose
AioContext is temporarily dropped (which matches bdrv_graph_wrlock()),
and a bdrv_graph_wrunlock_ctx() that can be used if the BlockDriverState
doesn't necessarily exist any more when unlocking.

This also requires a change to bdrv_schedule_unref(), which was relying
on the incorrectly taken lock. It needs to take the lock itself now.
While this is a separate bug, it can't be fixed in a separate patch because
otherwise the intermediate state would either deadlock or try to release
a lock that we don't even hold.
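
The locking rule described above (do not hold another thread's context
lock across a poll) can be illustrated with a toy, single-threaded
model. Everything here is hypothetical: struct ctx_lock is a plain
counter standing in for a lock, and unlock_and_poll() and probe_poll()
are illustrative names, not the block layer API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a context lock: counts how often it is held. */
struct ctx_lock {
    int held;
};

typedef void (*poll_fn)(void *opaque);

/*
 * Drop the caller's context lock (if one was passed) around the poll
 * callback, then take it again, so that the callback never runs with
 * the lock held.
 */
static void unlock_and_poll(struct ctx_lock *ctx, poll_fn poll, void *opaque)
{
    if (ctx) {
        assert(ctx->held > 0);
        ctx->held--;            /* temporarily release */
    }

    poll(opaque);               /* may run nested event loops */

    if (ctx) {
        ctx->held++;            /* re-acquire before returning */
    }
}

/* Test probe: records whether the lock was held while polling. */
struct probe {
    const struct ctx_lock *ctx;
    int held_during_poll;
};

static void probe_poll(void *opaque)
{
    struct probe *p = opaque;

    p->held_during_poll = p->ctx->held;
}
```

Passing NULL mirrors callers that hold no context lock at unlock time;
in that case the sketch leaves the counter alone and simply polls.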

Signed-off-by: Kevin Wolf 
Message-ID: <20231115172012.112727-3-kw...@redhat.com>
Reviewed-by: Stefan Hajnoczi 
[kwolf: Fixed up bdrv_schedule_unref()]
Signed-off-by: Kevin Wolf 
---
 include/block/graph-lock.h | 15 +++-
 block.c| 39 ++
 block/backup.c |  2 +-
 block/blklogwrites.c   |  4 +--
 block/blkverify.c  |  2 +-
 block/block-backend.c  |  8 --
 block/commit.c | 10 
 block/graph-lock.c | 23 +-
 block/mirror.c | 14 +--
 block/qcow2.c  |  2 +-
 block/quorum.c |  4 +--
 block/replication.c| 10 
 block/snapshot.c   |  2 +-
 block/stream.c |  8 +++---
 block/vmdk.c   | 10 
 blockdev.c |  4 +--
 blockjob.c |  8 +++---
 tests/unit/test-bdrv-drain.c   | 20 +++
 tests/unit/test-bdrv-graph-mod.c   | 10 
 scripts/block-coroutine-wrapper.py |  2 +-
 20 files changed, 122 insertions(+), 75 deletions(-)

diff --git a/include/block/graph-lock.h b/include/block/graph-lock.h
index 6f1cd12745..22b5db1ed9 100644
--- a/include/block/graph-lock.h
+++ b/include/block/graph-lock.h
@@ -123,8 +123,21 @@ bdrv_graph_wrlock(BlockDriverState *bs);
  * bdrv_graph_wrunlock:
  * Write finished, reset global has_writer to 0 and restart
  * all readers that are waiting.
+ *
+ * If @bs is non-NULL, its AioContext is temporarily released.
+ */
+void no_coroutine_fn TSA_RELEASE(graph_lock) TSA_NO_TSA
+bdrv_graph_wrunlock(BlockDriverState *bs);
+
+/*
+ * bdrv_graph_wrunlock_ctx:
+ * Write finished, reset global has_writer to 0 and restart
+ * all readers that are waiting.
+ *
+ * If @ctx is non-NULL, its lock is temporarily released.
  */
-void bdrv_graph_wrunlock(void) TSA_RELEASE(graph_lock) TSA_NO_TSA;
+void no_coroutine_fn TSA_RELEASE(graph_lock) TSA_NO_TSA
+bdrv_graph_wrunlock_ctx(AioContext *ctx);
 
 /*
  * bdrv_graph_co_rdlock:
diff --git a/block.c b/block.c
index eac105a504..bfb0861ec6 100644
--- a/block.c
+++ b/block.c
@@ -1713,7 +1713,7 @@ open_failed:
 bdrv_unref_child(bs, bs->file);
 assert(!bs->file);
 }
-bdrv_graph_wrunlock();
+bdrv_graph_wrunlock(NULL);
 
 g_free(bs->opaque);
 bs->opaque = NULL;
@@ -3577,7 +3577,7 @@ int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
 bdrv_drained_begin(drain_bs);
 bdrv_graph_wrlock(backing_hd);
 ret = bdrv_set_backing_hd_drained(bs, backing_hd, errp);
-bdrv_graph_wrunlock();
+bdrv_graph_wrunlock(backing_hd);
 bdrv_drained_end(drain_bs);
 bdrv_unref(drain_bs);
 
@@ -3796,7 +3796,7 @@ BdrvChild *bdrv_open_child(const char *filename,
 child = bdrv_attach_child(parent, bs, bdref_key, child_class, child_role,
   errp);
 aio_context_release(ctx);
-bdrv_graph_wrunlock();
+bdrv_graph_wrunlock(NULL);
 
 return child;
 }
@@ -4652,7 +4652,7 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
 
 bdrv_graph_wrlock(NULL);
 tran_commit(tran);
-bdrv_graph_wrunlock();
+bdrv_graph_wrunlock(NULL);
 
 QTAILQ_FOREACH_REVERSE(bs_entry, bs_queue, entry) {
 BlockDriverState *bs = bs_entry->state.bs;
@@ -4671,7 +4671,7 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, Error **errp)
 abort:
 bdrv_graph_wrlock(NULL);
 tran_abort(tran);
-bdrv_graph_wrunlock();
+bdrv_graph_wrunlock(NULL);
 
 QTAILQ_FOREACH_SAFE(bs_entry, bs_queue, entry, next) {
 if (bs_entry->prepared) {
@@ -4857,7 +4857,7 @@ bdrv_reopen_parse_file_or_backing(BDRVReopenState *reopen_state,
 ret = bdrv_set_file_or_backing_noperm(bs, new_child_bs, is_backing,
   tran, errp);
 
-bdrv_graph_wrunlock();
+bdrv_graph_wrunlock_ctx(ctx);
 
 if (old_ctx != ctx) {
 aio_context_release(ctx);
@@ -5216,7 +5216,7 @@ sta

[PULL 0/9] Block layer fixes for 8.2.0-rc1

2023-11-21 Thread Kevin Wolf

The following changes since commit af9264da80073435fd78944bc5a46e695897d7e5:

  Merge tag '20231119-xtensa-1' of https://github.com/OSLL/qemu-xtensa into staging (2023-11-20 05:25:19 -0500)

are available in the Git repository at:

  https://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to debb4911667b1f8213ca8760ae83afcf3b3579e0:

  hw/ide/via: implement legacy/native mode switching (2023-11-21 12:45:21 +0100)


Block layer patches

- Fix graph lock related deadlocks with the stream job
- ahci: Fix legacy software reset
- ide/via: Fix switch between compatibility and native mode


Kevin Wolf (4):
  block: Fix bdrv_graph_wrlock() call in blk_remove_bs()
  block: Fix deadlocks in bdrv_graph_wrunlock()
  stream: Fix AioContext locking during bdrv_graph_wrlock()
  iotests: Test two stream jobs in a single iothread

Mark Cave-Ayland (4):
  ide/ioport: move ide_portio_list[] and ide_portio_list2[] definitions to IDE core
  ide/pci: introduce pci_ide_update_mode() function
  ide/via: don't attempt to set default BAR addresses
  hw/ide/via: implement legacy/native mode switching

Niklas Cassel (1):
  hw/ide/ahci: fix legacy software reset

 include/block/graph-lock.h| 15 -
 include/hw/ide/internal.h |  3 +
 include/hw/ide/pci.h  |  1 +
 block.c   | 39 -
 block/backup.c|  2 +-
 block/blklogwrites.c  |  4 +-
 block/blkverify.c |  2 +-
 block/block-backend.c | 12 +++-
 block/commit.c| 10 ++--
 block/graph-lock.c| 23 +++-
 block/mirror.c| 14 ++---
 block/qcow2.c |  2 +-
 block/quorum.c|  4 +-
 block/replication.c   | 10 ++--
 block/snapshot.c  |  2 +-
 block/stream.c| 10 ++--
 block/vmdk.c  | 10 ++--
 blockdev.c|  4 +-
 blockjob.c|  8 +--
 hw/ide/ahci.c | 27 -
 hw/ide/core.c | 12 
 hw/ide/ioport.c   | 12 
 hw/ide/pci.c  | 84 +++
 hw/ide/via.c  | 44 +++---
 tests/unit/test-bdrv-drain.c  | 20 +++
 tests/unit/test-bdrv-graph-mod.c  | 10 ++--
 scripts/block-coroutine-wrapper.py|  2 +-
 tests/qemu-iotests/tests/iothreads-stream | 74 +++
 tests/qemu-iotests/tests/iothreads-stream.out | 11 
 29 files changed, 374 insertions(+), 97 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/iothreads-stream
 create mode 100644 tests/qemu-iotests/tests/iothreads-stream.out

[PULL 8/9] ide/via: don't attempt to set default BAR addresses

2023-11-21 Thread Kevin Wolf

From: Mark Cave-Ayland 

The via-ide device currently attempts to set the default BAR addresses to the
values shown in the datasheet, but this doesn't work for 2 reasons: firstly
BARs 1-4 do not set the bottom 2 bits to PCI_BASE_ADDRESS_SPACE_IO, and
secondly the initial PCI bus reset clears the values of all PCI device BARs
after the device itself has been reset.

Remove the setting of the default BAR addresses from via_ide_reset() to ensure
there is no doubt that these values are never exposed to the guest.
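
The first reason can be checked with a few lines of C. This is a
minimal sketch: bar_is_io() is a hypothetical helper, the two constants
follow the standard PCI BAR layout in which bit 0 selects I/O space,
and the sample values are the ones removed by this patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_BASE_ADDRESS_SPACE     0x01  /* bit 0: 0 = memory, 1 = I/O */
#define PCI_BASE_ADDRESS_SPACE_IO  0x01

/* True if a raw BAR value is marked as an I/O space BAR. */
static bool bar_is_io(uint32_t bar)
{
    return (bar & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_IO;
}
```

With this helper, a value such as 0x01f0 is not an I/O BAR because bit
0 is clear, whereas 0xcc01 is.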

Signed-off-by: Mark Cave-Ayland 
Message-ID: <20231116103355.588580-4-mark.cave-ayl...@ilande.co.uk>
Reviewed-by: Kevin Wolf 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Kevin Wolf 
---
 hw/ide/via.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/hw/ide/via.c b/hw/ide/via.c
index fff23803a6..87b134083a 100644
--- a/hw/ide/via.c
+++ b/hw/ide/via.c
@@ -132,11 +132,6 @@ static void via_ide_reset(DeviceState *dev)
 pci_set_word(pci_conf + PCI_STATUS, PCI_STATUS_FAST_BACK |
  PCI_STATUS_DEVSEL_MEDIUM);
 
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_0, 0x01f0);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_1, 0x03f4);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_2, 0x0170);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_3, 0x0374);
-pci_set_long(pci_conf + PCI_BASE_ADDRESS_4, 0xcc01); /* BMIBA: 20-23h */
 pci_set_long(pci_conf + PCI_INTERRUPT_LINE, 0x010e);
 
 /* IDE chip enable, IDE configuration 1/2, IDE FIFO Configuration*/
-- 
2.42.0
