Re: [PATCH] ui: Optimization dirty rect empty check logic

2023-11-27 Thread Marc-André Lureau
Hi

On Sat, Nov 25, 2023 at 11:55 AM lijiejun  wrote:
>
> Reduce unnecessary code execution in function qemu_spice_create_update,
> such as "int blocks = DIV_ROUND_UP(surface_width(ssd->ds), blksize);"
> and "int bpp = surface_bytes_per_pixel(ssd->ds);".
>
> Signed-off-by: lijiejun 

This is a micro-optimization which makes the
qemu_spice_create_update() less complete, mixing caller/callee
responsibilities.

> ---
>  ui/spice-display.c | 8 +++-
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/ui/spice-display.c b/ui/spice-display.c
> index 6eb98a5a5c..508e35ed0f 100644
> --- a/ui/spice-display.c
> +++ b/ui/spice-display.c
> @@ -194,10 +194,6 @@ static void qemu_spice_create_update(SimpleSpiceDisplay 
> *ssd)
>  int bpp = surface_bytes_per_pixel(ssd->ds);
>  uint8_t *guest, *mirror;
>
> -if (qemu_spice_rect_is_empty(&ssd->dirty)) {
> -return;
> -};
> -
>  dirty_top = g_new(int, blocks);
>  for (blk = 0; blk < blocks; blk++) {
>  dirty_top[blk] = -1;
> @@ -488,7 +484,9 @@ void qemu_spice_display_refresh(SimpleSpiceDisplay *ssd)
>
>  WITH_QEMU_LOCK_GUARD(&ssd->lock) {
>  if (QTAILQ_EMPTY(&ssd->updates) && ssd->ds) {
> -qemu_spice_create_update(ssd);
> +if (!qemu_spice_rect_is_empty(&ssd->dirty)) {
> +qemu_spice_create_update(ssd);
> +}
>  ssd->notify++;
>  }
>  }
> --
> 2.25.1
>
>


-- 
Marc-André Lureau



Re: [PATCH] l2tpv3: overwrite s->fd in net_l2tpv3_cleanup

2023-11-27 Thread Анастасия Любимова



18/10/23 14:09, Anastasia Belova:
It's better to overwrite freed pointer s->fd to avoid accessing an 
invalid descriptor. Found by Linux Verification Center 
(linuxtesting.org) with SVACE.


Just a friendly reminder.

Anastasia Belova



[RFC v2 0/7] Add persistence to NVMe ZNS emulation

2023-11-27 Thread Sam Li
ZNS emulation follows NVMe ZNS spec but the state of namespace
zones does not persist accross restarts of QEMU. This patch makes the
metadata of ZNS emulation persistent by using new block layer APIs and
the qcow2 img as backing file. It is the second part after the patches
- adding full zoned storage emulation to qcow2 driver.
https://patchwork.kernel.org/project/qemu-devel/cover/20231127043703.49489-1-faithilike...@gmail.com/

The metadata of ZNS emulation divides into two parts, zone metadata and
zone descriptor extension data. The zone metadata is composed of zone
states, zone type, wp and zone attributes. The zone information can be
stored at an uint64_t wp to save space and easy access. The structure of
wp of each zone is as follows:
|(4)| zone type (1)| zone attr (8)| wp (51) ||

The zone descriptor extension data is relatively small comparing to the
overall size therefore we adopt the option that store zded of all zones
in an array regardless of the valid bit set.

Creating a zns format qcow2 image file adds one more option zd_extension_size
to zoned device configurations.

To attach this file as emulated zns drive in the command line of QEMU, use:
  -drive file=${znsimg},id=nvmezns0,format=qcow2,if=none \
  -device nvme-ns,drive=nvmezns0,bus=nvme0,nsid=1,uuid=xxx \

v1->v2:
- split [v1 2/5] patch to three (doc, config, block layer API)
- adapt qcow2 v6

Sam Li (7):
  docs/qcow2: add zd_extension_size option to the zoned format feature
  qcow2: add zd_extension configurations to zoned metadata
  hw/nvme: use blk_get_*() to access zone info in the block layer
  hw/nvme: add blk_get_zone_extension to access zd_extensions
  hw/nvme: make the metadata of ZNS emulation persistent
  hw/nvme: refactor zone append write using block layer APIs
  hw/nvme: make ZDED persistent

 block/block-backend.c |   88 ++
 block/qcow2.c |  119 ++-
 block/qcow2.h |2 +
 docs/interop/qcow2.txt|3 +
 hw/nvme/ctrl.c| 1247 -
 hw/nvme/ns.c  |  162 +---
 hw/nvme/nvme.h|   95 +--
 include/block/block-common.h  |9 +
 include/block/block_int-common.h  |8 +
 include/sysemu/block-backend-io.h |   11 +
 include/sysemu/dma.h  |3 +
 qapi/block-core.json  |4 +
 system/dma-helpers.c  |   17 +
 13 files changed, 647 insertions(+), 1121 deletions(-)

-- 
2.40.1




Re: [EXTERNAL] [PATCH v3 2/5] xen: backends: don't overwrite XenStore nodes created by toolstack

2023-11-27 Thread David Woodhouse
On Fri, 2023-11-24 at 23:24 +, Volodymyr Babchuk wrote:
> Xen PV devices in QEMU can be created in two ways: either by QEMU
> itself, if they were passed via command line, or by Xen toolstack. In
> the latter case, QEMU scans XenStore entries and configures devices
> accordingly.
> 
> In the second case we don't want QEMU to write/delete front-end
> entries for two reasons: it might have no access to those entries if
> it is running in un-privileged domain and it is just incorrect to
> overwrite entries already provided by Xen toolstack, because
> toolstack
> manages those nodes. For example, it might read backend- or frontend-
> state to be sure that they are both disconnected and it is safe to
> destroy a domain.
> 
> This patch checks presence of xendev->backend to check if Xen PV
> device was configured by Xen toolstack to decide if it should touch
> frontend entries in XenStore. Also, when we need to remove XenStore
> entries during device teardown only if they weren't created by Xen
> toolstack. If they were created by toolstack, then it is toolstack's
> job to do proper clean-up.
> 
> Suggested-by: Paul Durrant 
> Suggested-by: David Woodhouse 
> Co-Authored-by: Oleksandr Tyshchenko 
> Signed-off-by: Volodymyr Babchuk 

Reviewed-by: David Woodhouse 

... albeit with a couple of suggestions... 

> diff --git a/hw/char/xen_console.c b/hw/char/xen_console.c
> index bef8a3a621..b52abf 100644
> --- a/hw/char/xen_console.c
> +++ b/hw/char/xen_console.c
> @@ -450,7 +450,7 @@ static void xen_console_realize(XenDevice *xendev, Error 
> **errp)
> 
>  trace_xen_console_realize(con->dev, object_get_typename(OBJECT(cs)));
> 
> -    if (CHARDEV_IS_PTY(cs)) {
> +    if (CHARDEV_IS_PTY(cs) && !xendev->backend) {
>  /* Strip the leading 'pty:' */
>  xen_device_frontend_printf(xendev, "tty", "%s", cs->filename + 4);
>  }


It's kind of weird that that one is a frontend node at all; surely it
should have been a backend node? But it is known only to QEMU once it
actually opens /dev/ptmx and creates a new pty. It can't be populated
by the toolstack in advance.

So shouldn't the toolstack have made it writable by the driver domain? 

I think we should attempt to write this and just gracefully handle the
failure if we can't. (In fact, xen_device_frontend_printf() will just
use error_report_err() which is probably OK unless you feel strongly
about silencing it).

> diff --git a/hw/net/xen_nic.c b/hw/net/xen_nic.c
> index afa10c96e8..27442bef38 100644
> --- a/hw/net/xen_nic.c
> +++ b/hw/net/xen_nic.c
> @@ -315,14 +315,16 @@ static void xen_netdev_realize(XenDevice *xendev, Error 
> **errp)
> 
>  qemu_macaddr_default_if_unset(&netdev->conf.macaddr);
> 
> -    xen_device_frontend_printf(xendev, "mac", 
> "%02x:%02x:%02x:%02x:%02x:%02x",
> -   netdev->conf.macaddr.a[0],
> -   netdev->conf.macaddr.a[1],
> -   netdev->conf.macaddr.a[2],
> -   netdev->conf.macaddr.a[3],
> -   netdev->conf.macaddr.a[4],
> -   netdev->conf.macaddr.a[5]);
> -
> +    if (!xendev->backend) {
> +    xen_device_frontend_printf(xendev, "mac",
> +   "%02x:%02x:%02x:%02x:%02x:%02x",
> +   netdev->conf.macaddr.a[0],
> +   netdev->conf.macaddr.a[1],
> +   netdev->conf.macaddr.a[2],
> +   netdev->conf.macaddr.a[3],
> +   netdev->conf.macaddr.a[4],
> +   netdev->conf.macaddr.a[5]);
> +    }
>  netdev->nic = qemu_new_nic(&net_xen_info, &netdev->conf,
>     object_get_typename(OBJECT(xendev)),
>     DEVICE(xendev)->id, netdev);


Perhaps here you should create the "mac" node if it doesn't exist (or
is that "if it doesn't match netdev->conf.macaddr"?) and just
gracefully accept failure too?





smime.p7s
Description: S/MIME cryptographic signature


Re: [RFC PATCH v3 3/5] xen: add option to disable legacy backends

2023-11-27 Thread Woodhouse, David
On Fri, 2023-11-24 at 23:24 +, Volodymyr Babchuk wrote:
> This patch makes legacy backends optional. As was discussed at [1]
> this is a solution to a problem when we can't run QEMU as a device
> model in a non-privileged domain. This is because legacy backends
> assume that they are always running in domain with ID = 0. Actually,
> this may prevent running QEMU in a privileged domain with ID not equal
> to zero.
> 
> To be able to disable legacy backends we need to alter couple of
> source files that unintentionally depend on them. For example
> xen-all.c used xen_pv_printf to report errors, while not providing any
> additional like xendev pointer. Also, we need to move xenstore
> structure from xen-legacy-backend.c, because it is apparently used in
> xen-all.c.
> 
> With this patch it is possible to provide
> "--disable-xen-legacy-backends" configure option to get QEMU binary
> that can run in a driver domain. With price of not be able to use
> legacy backends of course.
> 
> [1]
> https://lists.gnu.org/archive/html/qemu-devel/2023-11/msg05022.html
> 
> Signed-off-by: Volodymyr Babchuk 
> 
> ---
> 
> I am not sure if I made correct changes to the build system, thus this
> patch is tagged as RFC.

Hm, I was imagining a new CONFIG_LEGACY_XEN_BACKENDS option which would
look a lot like CONFIG_XEN_BUS (which would now be only for the new
XenBus code).

This looks weird to me:

> --- a/hw/display/meson.build
> +++ b/hw/display/meson.build
> @@ -14,7 +14,9 @@ system_ss.add(when: 'CONFIG_PL110', if_true:
> files('pl110.c'))
>  system_ss.add(when: 'CONFIG_SII9022', if_true: files('sii9022.c'))
>  system_ss.add(when: 'CONFIG_SSD0303', if_true: files('ssd0303.c'))
>  system_ss.add(when: 'CONFIG_SSD0323', if_true: files('ssd0323.c'))
> -system_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xenfb.c'))
> +if have_xen_legacy_backends
> +  system_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xenfb.c'))
> +endif
> 
>  system_ss.add(when: 'CONFIG_VGA_PCI', if_true: files('vga-pci.c'))
>  system_ss.add(when: 'CONFIG_VGA_ISA', if_true: files('vga-isa.c'))

I'd prefer to see just:

-system_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xenfb.c'))
+system_ss.add(when: 'CONFIG_XEN_LEGACY_BACKENDS', if_true: files('xenfb.c'))


Probably also better to split out the bits in accel/xen/xen-all.c and
hw/xen/xen-legacy-backend.c to a separate preparatory commit.


smime.p7s
Description: S/MIME cryptographic signature



Amazon Development Centre (London) Ltd. Registered in England and Wales with 
registration number 04543232 with its registered office at 1 Principal Place, 
Worship Street, London EC2A 2FA, United Kingdom.




[RFC v2 2/7] qcow2: add zd_extension configurations to zoned metadata

2023-11-27 Thread Sam Li
Zone descriptor data is host definied data that is associated with
each zone. Add zone descriptor extensions to zonedmeta struct.

Signed-off-by: Sam Li 
---
 block/qcow2.c| 69 +---
 block/qcow2.h|  2 +
 include/block/block_int-common.h |  6 +++
 qapi/block-core.json |  4 ++
 4 files changed, 76 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 26f2bb4a87..75dff27216 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -354,7 +354,8 @@ static inline int qcow2_refresh_zonedmeta(BlockDriverState 
*bs)
 {
 int ret;
 BDRVQcow2State *s = bs->opaque;
-uint64_t wps_size = s->zoned_header.zonedmeta_size;
+uint64_t wps_size = s->zoned_header.zonedmeta_size -
+s->zded_size;
 g_autofree uint64_t *temp = NULL;
 temp = g_new(uint64_t, wps_size);
 ret = bdrv_pread(bs->file, s->zoned_header.zonedmeta_offset,
@@ -364,7 +365,17 @@ static inline int qcow2_refresh_zonedmeta(BlockDriverState 
*bs)
 return ret;
 }
 
+g_autofree uint8_t *zded = NULL;
+zded = g_try_malloc0(s->zded_size);
+ret = bdrv_pread(bs->file, s->zoned_header.zonedmeta_offset + wps_size,
+ s->zded_size, zded, 0);
+if (ret < 0) {
+error_report("Can not read zded");
+return ret;
+}
+
 memcpy(bs->wps->wp, temp, wps_size);
+memcpy(bs->zd_extensions, zded, s->zded_size);
 return 0;
 }
 
@@ -390,6 +401,19 @@ qcow2_check_zone_options(Qcow2ZonedHeaderExtension 
*zone_opt)
 return false;
 }
 
+if (zone_opt->zd_extension_size) {
+if (zone_opt->zd_extension_size & 0x3f) {
+error_report("zone descriptor extension size must be a "
+ "multiple of 64B");
+return false;
+}
+
+if ((zone_opt->zd_extension_size >> 6) > 0xff) {
+error_report("Zone descriptor extension size is too large");
+return false;
+}
+}
+
 if (zone_opt->max_active_zones > zone_opt->nr_zones) {
 error_report("Max_active_zones %" PRIu32 " exceeds "
  "nr_zones %" PRIu32". Set it to nr_zones.",
@@ -676,6 +700,8 @@ qcow2_read_extensions(BlockDriverState *bs, uint64_t 
start_offset,
 zoned_ext.conventional_zones =
 be32_to_cpu(zoned_ext.conventional_zones);
 zoned_ext.nr_zones = be32_to_cpu(zoned_ext.nr_zones);
+zoned_ext.zd_extension_size =
+be32_to_cpu(zoned_ext.zd_extension_size);
 zoned_ext.max_open_zones = be32_to_cpu(zoned_ext.max_open_zones);
 zoned_ext.max_active_zones =
 be32_to_cpu(zoned_ext.max_active_zones);
@@ -686,7 +712,8 @@ qcow2_read_extensions(BlockDriverState *bs, uint64_t 
start_offset,
 zoned_ext.zonedmeta_size = be64_to_cpu(zoned_ext.zonedmeta_size);
 s->zoned_header = zoned_ext;
 bs->wps = g_malloc(sizeof(BlockZoneWps)
-+ s->zoned_header.zonedmeta_size);
++ zoned_ext.zonedmeta_size - s->zded_size);
+bs->zd_extensions = g_malloc0(s->zded_size);
 ret = qcow2_refresh_zonedmeta(bs);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "zonedmeta: "
@@ -2264,6 +2291,7 @@ static void qcow2_refresh_limits(BlockDriverState *bs, 
Error **errp)
 bs->bl.zone_size = s->zoned_header.zone_size;
 bs->bl.zone_capacity = s->zoned_header.zone_capacity;
 bs->bl.write_granularity = BDRV_SECTOR_SIZE;
+bs->bl.zd_extension_size = s->zoned_header.zd_extension_size;
 }
 
 static int GRAPH_UNLOCKED
@@ -3534,6 +3562,8 @@ int qcow2_update_header(BlockDriverState *bs)
 .conventional_zones =
 cpu_to_be32(s->zoned_header.conventional_zones),
 .nr_zones   = cpu_to_be32(s->zoned_header.nr_zones),
+.zd_extension_size  =
+cpu_to_be32(s->zoned_header.zd_extension_size),
 .max_open_zones = cpu_to_be32(s->zoned_header.max_open_zones),
 .max_active_zones   =
 cpu_to_be32(s->zoned_header.max_active_zones),
@@ -4287,6 +4317,15 @@ qcow2_co_create(BlockdevCreateOptions *create_options, 
Error **errp)
 }
 s->zoned_header.max_append_bytes = zone_host_managed->max_append_bytes;
 
+uint64_t zded_size = 0;
+if (zone_host_managed->has_descriptor_extension_size) {
+s->zoned_header.zd_extension_size =
+zone_host_managed->descriptor_extension_size;
+zded_size = s->zoned_header.zd_extension_size *
+bs->bl.nr_zones;
+}
+s->zded_size = zded_size;
+
 if (!qcow2_check_zone_options(&s->zoned_header)) {
 s->zoned_header.zoned = BLK_Z_NONE;
 ret = -EINVAL;
@@ -4294,7 +4333,7 @@ qcow2_co_create(BlockdevCreateOptions *create_opti

[RFC v2 0/7] Add persistence to NVMe ZNS emulation

2023-11-27 Thread Sam Li
ZNS emulation follows NVMe ZNS spec but the state of namespace
zones does not persist accross restarts of QEMU. This patch makes the
metadata of ZNS emulation persistent by using new block layer APIs and
the qcow2 img as backing file. It is the second part after the patches
- adding full zoned storage emulation to qcow2 driver.
https://patchwork.kernel.org/project/qemu-devel/cover/20231127043703.49489-1-faithilike...@gmail.com/

The metadata of ZNS emulation divides into two parts, zone metadata and
zone descriptor extension data. The zone metadata is composed of zone
states, zone type, wp and zone attributes. The zone information can be
stored at an uint64_t wp to save space and easy access. The structure of
wp of each zone is as follows:
|(4)| zone type (1)| zone attr (8)| wp (51) ||

The zone descriptor extension data is relatively small comparing to the
overall size therefore we adopt the option that store zded of all zones
in an array regardless of the valid bit set.

Creating a zns format qcow2 image file adds one more option zd_extension_size
to zoned device configurations.

To attach this file as emulated zns drive in the command line of QEMU, use:
  -drive file=${znsimg},id=nvmezns0,format=qcow2,if=none \
  -device nvme-ns,drive=nvmezns0,bus=nvme0,nsid=1,uuid=xxx \

Sorry, send this one more time due to network problems.

v1->v2:
- split [v1 2/5] patch to three (doc, config, block layer API)
- adapt qcow2 v6

Sam Li (7):
  docs/qcow2: add zd_extension_size option to the zoned format feature
  qcow2: add zd_extension configurations to zoned metadata
  hw/nvme: use blk_get_*() to access zone info in the block layer
  hw/nvme: add blk_get_zone_extension to access zd_extensions
  hw/nvme: make the metadata of ZNS emulation persistent
  hw/nvme: refactor zone append write using block layer APIs
  hw/nvme: make ZDED persistent

 block/block-backend.c |   88 ++
 block/qcow2.c |  119 ++-
 block/qcow2.h |2 +
 docs/interop/qcow2.txt|3 +
 hw/nvme/ctrl.c| 1247 -
 hw/nvme/ns.c  |  162 +---
 hw/nvme/nvme.h|   95 +--
 include/block/block-common.h  |9 +
 include/block/block_int-common.h  |8 +
 include/sysemu/block-backend-io.h |   11 +
 include/sysemu/dma.h  |3 +
 qapi/block-core.json  |4 +
 system/dma-helpers.c  |   17 +
 13 files changed, 647 insertions(+), 1121 deletions(-)

-- 
2.40.1




[RFC v2 1/7] docs/qcow2: add zd_extension_size option to the zoned format feature

2023-11-27 Thread Sam Li
Signed-off-by: Sam Li 
---
 docs/interop/qcow2.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index 0f1938f056..458d05371a 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -428,6 +428,9 @@ The fields of the zoned extension are:
The offset of zoned metadata structure in the contained
image, in bytes.
 
+  44 - 51:  zd_extension_size
+The size of zone descriptor extension data in bytes.
+
 == Full disk encryption header pointer ==
 
 The full disk encryption header must be present if, and only if, the
-- 
2.40.1




[RFC v2 3/7] hw/nvme: use blk_get_*() to access zone info in the block layer

2023-11-27 Thread Sam Li
The zone information is contained in the BlockLimits fileds. Add blk_get_*() 
functions
to access the block layer and update zone info accessing in the NVMe device 
emulation.

Signed-off-by: Sam Li 
---
 block/block-backend.c | 72 +++
 hw/nvme/ctrl.c| 34 +--
 hw/nvme/ns.c  | 61 --
 hw/nvme/nvme.h|  3 --
 include/sysemu/block-backend-io.h |  9 
 5 files changed, 111 insertions(+), 68 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index ec21148806..666df9cfea 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2380,6 +2380,78 @@ int blk_get_max_iov(BlockBackend *blk)
 return blk->root->bs->bl.max_iov;
 }
 
+uint8_t blk_get_zone_model(BlockBackend *blk)
+{
+BlockDriverState *bs = blk_bs(blk);
+IO_CODE();
+return bs ? bs->bl.zoned: 0;
+
+}
+
+uint32_t blk_get_zone_size(BlockBackend *blk)
+{
+BlockDriverState *bs = blk_bs(blk);
+IO_CODE();
+
+return bs ? bs->bl.zone_size : 0;
+}
+
+uint32_t blk_get_zone_capacity(BlockBackend *blk)
+{
+BlockDriverState *bs = blk_bs(blk);
+IO_CODE();
+
+return bs ? bs->bl.zone_capacity : 0;
+}
+
+uint32_t blk_get_max_open_zones(BlockBackend *blk)
+{
+BlockDriverState *bs = blk_bs(blk);
+IO_CODE();
+
+return bs ? bs->bl.max_open_zones : 0;
+}
+
+uint32_t blk_get_max_active_zones(BlockBackend *blk)
+{
+BlockDriverState *bs = blk_bs(blk);
+IO_CODE();
+
+return bs ? bs->bl.max_active_zones : 0;
+}
+
+uint32_t blk_get_max_append_sectors(BlockBackend *blk)
+{
+BlockDriverState *bs = blk_bs(blk);
+IO_CODE();
+
+return bs ? bs->bl.max_append_sectors : 0;
+}
+
+uint32_t blk_get_nr_zones(BlockBackend *blk)
+{
+BlockDriverState *bs = blk_bs(blk);
+IO_CODE();
+
+return bs ? bs->bl.nr_zones : 0;
+}
+
+uint32_t blk_get_write_granularity(BlockBackend *blk)
+{
+BlockDriverState *bs = blk_bs(blk);
+IO_CODE();
+
+return bs ? bs->bl.write_granularity : 0;
+}
+
+BlockZoneWps *blk_get_zone_wps(BlockBackend *blk)
+{
+BlockDriverState *bs = blk_bs(blk);
+IO_CODE();
+
+return bs ? bs->wps : NULL;
+}
+
 void *blk_try_blockalign(BlockBackend *blk, size_t size)
 {
 IO_CODE();
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index f026245d1e..e64b021454 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -417,18 +417,6 @@ static void nvme_assign_zone_state(NvmeNamespace *ns, 
NvmeZone *zone,
 static uint16_t nvme_zns_check_resources(NvmeNamespace *ns, uint32_t act,
  uint32_t opn, uint32_t zrwa)
 {
-if (ns->params.max_active_zones != 0 &&
-ns->nr_active_zones + act > ns->params.max_active_zones) {
-trace_pci_nvme_err_insuff_active_res(ns->params.max_active_zones);
-return NVME_ZONE_TOO_MANY_ACTIVE | NVME_DNR;
-}
-
-if (ns->params.max_open_zones != 0 &&
-ns->nr_open_zones + opn > ns->params.max_open_zones) {
-trace_pci_nvme_err_insuff_open_res(ns->params.max_open_zones);
-return NVME_ZONE_TOO_MANY_OPEN | NVME_DNR;
-}
-
 if (zrwa > ns->zns.numzrwa) {
 return NVME_NOZRWA | NVME_DNR;
 }
@@ -1988,9 +1976,9 @@ static uint16_t nvme_zrm_reset(NvmeNamespace *ns, 
NvmeZone *zone)
 static void nvme_zrm_auto_transition_zone(NvmeNamespace *ns)
 {
 NvmeZone *zone;
+int moz = blk_get_max_open_zones(ns->blkconf.blk);
 
-if (ns->params.max_open_zones &&
-ns->nr_open_zones == ns->params.max_open_zones) {
+if (moz && ns->nr_open_zones == moz) {
 zone = QTAILQ_FIRST(&ns->imp_open_zones);
 if (zone) {
 /*
@@ -2160,7 +2148,7 @@ void nvme_rw_complete_cb(void *opaque, int ret)
 block_acct_done(stats, acct);
 }
 
-if (ns->params.zoned && nvme_is_write(req)) {
+if (blk_get_zone_model(blk) && nvme_is_write(req)) {
 nvme_finalize_zoned_write(ns, req);
 }
 
@@ -2882,7 +2870,7 @@ static void nvme_copy_out_completed_cb(void *opaque, int 
ret)
 goto out;
 }
 
-if (ns->params.zoned) {
+if (blk_get_zone_model(ns->blkconf.blk)) {
 nvme_advance_zone_wp(ns, iocb->zone, nlb);
 }
 
@@ -2994,7 +2982,7 @@ static void nvme_copy_in_completed_cb(void *opaque, int 
ret)
 goto invalid;
 }
 
-if (ns->params.zoned) {
+if (blk_get_zone_model(ns->blkconf.blk)) {
 status = nvme_check_zone_write(ns, iocb->zone, iocb->slba, nlb);
 if (status) {
 goto invalid;
@@ -3088,7 +3076,7 @@ static void nvme_do_copy(NvmeCopyAIOCB *iocb)
 }
 }
 
-if (ns->params.zoned) {
+if (blk_get_zone_model(ns->blkconf.blk)) {
 status = nvme_check_zone_read(ns, slba, nlb);
 if (status) {
 goto invalid;
@@ -3164,7 +3152,7 @@ static uint16_t nvme_copy(NvmeCtrl *n, NvmeRequest *req)
 
 iocb->slba = le64_to_cpu(copy->sdlba);
 
-if (ns->params.zoned) {
+if (blk_ge

[RFC v2 5/7] hw/nvme: make the metadata of ZNS emulation persistent

2023-11-27 Thread Sam Li
The NVMe ZNS devices follow NVMe ZNS spec but the state of namespace
zones does not persist accross restarts of QEMU. This patch makes the
metadata of ZNS emulation persistent by using new block layer APIs. The
ZNS device calls zone report and zone mgmt APIs from the block layer
which will handle zone state transition and manage zone resources.

Signed-off-by: Sam Li 
---
 block/qcow2.c|3 +
 hw/nvme/ctrl.c   | 1106 +++---
 hw/nvme/ns.c |   77 +--
 hw/nvme/nvme.h   |   85 +--
 include/block/block-common.h |8 +
 include/block/block_int-common.h |2 +
 6 files changed, 264 insertions(+), 1017 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 75dff27216..dfaf5566e2 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -5043,6 +5043,9 @@ static int coroutine_fn 
qcow2_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
 case BLK_ZO_RESET:
 ret = qcow2_reset_zone(bs, index, len);
 break;
+case BLK_ZO_OFFLINE:
+/* There are no transitions from the offline state to any other state 
*/
+break;
 default:
 error_report("Unsupported zone op: 0x%x", op);
 ret = -ENOTSUP;
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index dae6f00e4f..b9ed3495e1 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -372,67 +372,6 @@ static inline bool nvme_parse_pid(NvmeNamespace *ns, 
uint16_t pid,
 return nvme_ph_valid(ns, *ph) && nvme_rg_valid(ns->endgrp, *rg);
 }
 
-static void nvme_assign_zone_state(NvmeNamespace *ns, NvmeZone *zone,
-   NvmeZoneState state)
-{
-if (QTAILQ_IN_USE(zone, entry)) {
-switch (nvme_get_zone_state(zone)) {
-case NVME_ZONE_STATE_EXPLICITLY_OPEN:
-QTAILQ_REMOVE(&ns->exp_open_zones, zone, entry);
-break;
-case NVME_ZONE_STATE_IMPLICITLY_OPEN:
-QTAILQ_REMOVE(&ns->imp_open_zones, zone, entry);
-break;
-case NVME_ZONE_STATE_CLOSED:
-QTAILQ_REMOVE(&ns->closed_zones, zone, entry);
-break;
-case NVME_ZONE_STATE_FULL:
-QTAILQ_REMOVE(&ns->full_zones, zone, entry);
-default:
-;
-}
-}
-
-nvme_set_zone_state(zone, state);
-
-switch (state) {
-case NVME_ZONE_STATE_EXPLICITLY_OPEN:
-QTAILQ_INSERT_TAIL(&ns->exp_open_zones, zone, entry);
-break;
-case NVME_ZONE_STATE_IMPLICITLY_OPEN:
-QTAILQ_INSERT_TAIL(&ns->imp_open_zones, zone, entry);
-break;
-case NVME_ZONE_STATE_CLOSED:
-QTAILQ_INSERT_TAIL(&ns->closed_zones, zone, entry);
-break;
-case NVME_ZONE_STATE_FULL:
-QTAILQ_INSERT_TAIL(&ns->full_zones, zone, entry);
-case NVME_ZONE_STATE_READ_ONLY:
-break;
-default:
-zone->d.za = 0;
-}
-}
-
-static uint16_t nvme_zns_check_resources(NvmeNamespace *ns, uint32_t act,
- uint32_t opn, uint32_t zrwa)
-{
-if (zrwa > ns->zns.numzrwa) {
-return NVME_NOZRWA | NVME_DNR;
-}
-
-return NVME_SUCCESS;
-}
-
-/*
- * Check if we can open a zone without exceeding open/active limits.
- * AOR stands for "Active and Open Resources" (see TP 4053 section 2.5).
- */
-static uint16_t nvme_aor_check(NvmeNamespace *ns, uint32_t act, uint32_t opn)
-{
-return nvme_zns_check_resources(ns, act, opn, 0);
-}
-
 static NvmeFdpEvent *nvme_fdp_alloc_event(NvmeCtrl *n, NvmeFdpEventBuffer 
*ebuf)
 {
 NvmeFdpEvent *ret = NULL;
@@ -1769,346 +1708,11 @@ static inline uint32_t nvme_zone_idx(NvmeNamespace 
*ns, uint64_t slba)
 slba / ns->zone_size;
 }
 
-static inline NvmeZone *nvme_get_zone_by_slba(NvmeNamespace *ns, uint64_t slba)
-{
-uint32_t zone_idx = nvme_zone_idx(ns, slba);
-
-if (zone_idx >= ns->num_zones) {
-return NULL;
-}
-
-return &ns->zone_array[zone_idx];
-}
-
-static uint16_t nvme_check_zone_state_for_write(NvmeZone *zone)
-{
-uint64_t zslba = zone->d.zslba;
-
-switch (nvme_get_zone_state(zone)) {
-case NVME_ZONE_STATE_EMPTY:
-case NVME_ZONE_STATE_IMPLICITLY_OPEN:
-case NVME_ZONE_STATE_EXPLICITLY_OPEN:
-case NVME_ZONE_STATE_CLOSED:
-return NVME_SUCCESS;
-case NVME_ZONE_STATE_FULL:
-trace_pci_nvme_err_zone_is_full(zslba);
-return NVME_ZONE_FULL;
-case NVME_ZONE_STATE_OFFLINE:
-trace_pci_nvme_err_zone_is_offline(zslba);
-return NVME_ZONE_OFFLINE;
-case NVME_ZONE_STATE_READ_ONLY:
-trace_pci_nvme_err_zone_is_read_only(zslba);
-return NVME_ZONE_READ_ONLY;
-default:
-assert(false);
-}
-
-return NVME_INTERNAL_DEV_ERROR;
-}
-
-static uint16_t nvme_check_zone_write(NvmeNamespace *ns, NvmeZone *zone,
-  uint64_t slba, uint32_t nlb)
-{
-uint64_t zcap = nvme_zone_wr_boundary(zone);
-uint16_t status;
-
-status = nvm

[RFC v2 7/7] hw/nvme: make ZDED persistent

2023-11-27 Thread Sam Li
Zone descriptor extension data (ZDED) is not persistent across QEMU
restarts. The zone descriptor extension valid bit (ZDEV) is part of
zone attributes, which sets to one when the ZDED is associated with
the zone.

With the qcow2 img as the backing file, the NVMe ZNS device stores
the zone attributes at the following eight bit of zone type bit of write
pointers for each zone. The ZDED is stored as part of zoned metadata as
write pointers.

Signed-off-by: Sam Li 
---
 block/qcow2.c| 45 
 hw/nvme/ctrl.c   |  1 +
 include/block/block-common.h |  1 +
 3 files changed, 47 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 74d2e2bf39..861a8f9f06 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -25,6 +25,7 @@
 #include "qemu/osdep.h"
 
 #include "block/qdict.h"
+#include "block/nvme.h"
 #include "sysemu/block-backend.h"
 #include "qemu/main-loop.h"
 #include "qemu/module.h"
@@ -235,6 +236,17 @@ static inline BlockZoneState 
qcow2_get_zone_state(BlockDriverState *bs,
 return BLK_ZS_NOT_WP;
 }
 
+static inline void qcow2_set_za(uint64_t *wp, uint8_t za)
+{
+/*
+ * The zone attribute takes up one byte. Store it after the zoned
+ * bit.
+ */
+uint64_t addr = *wp;
+addr |= ((uint64_t)za << 51);
+*wp = addr;
+}
+
 /*
  * Write the new wp value to the dedicated location of the image file.
  */
@@ -4990,6 +5002,36 @@ unlock:
 return ret;
 }
 
+static int qcow2_zns_set_zded(BlockDriverState *bs, uint32_t index)
+{
+BDRVQcow2State *s = bs->opaque;
+int ret;
+
+qemu_co_mutex_lock(&bs->wps->colock);
+uint64_t *wp = &bs->wps->wp[index];
+BlockZoneState zs = qcow2_get_zone_state(bs, index);
+if (zs == BLK_ZS_EMPTY) {
+ret = qcow2_check_zone_resources(bs, zs);
+if (ret < 0) {
+goto unlock;
+}
+
+qcow2_set_za(wp, NVME_ZA_ZD_EXT_VALID);
+ret = qcow2_write_wp_at(bs, wp, index);
+if (ret < 0) {
+error_report("Failed to set zone extension at 0x%" PRIx64 "", *wp);
+goto unlock;
+}
+s->nr_zones_closed++;
+qemu_co_mutex_unlock(&bs->wps->colock);
+return ret;
+}
+
+unlock:
+qemu_co_mutex_unlock(&bs->wps->colock);
+return NVME_ZONE_INVAL_TRANSITION;
+}
+
 static int coroutine_fn qcow2_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp 
op,
int64_t offset, int64_t len)
 {
@@ -5046,6 +5088,9 @@ static int coroutine_fn 
qcow2_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
 case BLK_ZO_OFFLINE:
 /* There are no transitions from the offline state to any other state 
*/
 break;
+case BLK_ZO_SET_ZDED:
+ret = qcow2_zns_set_zded(bs, index);
+break;
 default:
 error_report("Unsupported zone op: 0x%x", op);
 ret = -ENOTSUP;
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index f65a87646e..c33e24e303 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -3474,6 +3474,7 @@ static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, 
NvmeRequest *req)
 break;
 
 case NVME_ZONE_ACTION_SET_ZD_EXT:
+op = BLK_ZO_SET_ZDED;
 int zd_ext_size = blk_get_zd_ext_size(blk);
 trace_pci_nvme_set_descriptor_extension(slba, zone_idx);
 if (all || !zd_ext_size) {
diff --git a/include/block/block-common.h b/include/block/block-common.h
index ea213c3887..b61541599f 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -91,6 +91,7 @@ typedef enum BlockZoneOp {
 BLK_ZO_FINISH,
 BLK_ZO_RESET,
 BLK_ZO_OFFLINE,
+BLK_ZO_SET_ZDED,
 } BlockZoneOp;
 
 typedef enum BlockZoneModel {
-- 
2.40.1




[RFC v2 4/7] hw/nvme: add blk_get_zone_extension to access zd_extensions

2023-11-27 Thread Sam Li
Signed-off-by: Sam Li 
---
 block/block-backend.c | 16 
 hw/nvme/ctrl.c| 20 ++--
 hw/nvme/ns.c  | 24 
 hw/nvme/nvme.h|  7 ---
 include/sysemu/block-backend-io.h |  2 ++
 5 files changed, 36 insertions(+), 33 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 666df9cfea..fcdcbe28bf 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2452,6 +2452,22 @@ BlockZoneWps *blk_get_zone_wps(BlockBackend *blk)
 return bs ? bs->wps : NULL;
 }
 
+uint8_t *blk_get_zone_extension(BlockBackend *blk)
+{
+BlockDriverState *bs = blk_bs(blk);
+IO_CODE();
+
+return bs ? bs->zd_extensions : NULL;
+}
+
+uint32_t blk_get_zd_ext_size(BlockBackend *blk)
+{
+BlockDriverState *bs = blk_bs(blk);
+IO_CODE();
+
+return bs ? bs->bl.zd_extension_size : 0;
+}
+
 void *blk_try_blockalign(BlockBackend *blk, size_t size)
 {
 IO_CODE();
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index e64b021454..dae6f00e4f 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -4004,6 +4004,12 @@ static uint16_t nvme_zone_mgmt_send_zrwa_flush(NvmeCtrl 
*n, NvmeZone *zone,
 return NVME_SUCCESS;
 }
 
+static inline uint8_t *nvme_get_zd_extension(NvmeNamespace *ns,
+uint32_t zone_idx)
+{
+return &ns->zd_extensions[zone_idx * blk_get_zd_ext_size(ns->blkconf.blk)];
+}
+
 static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req)
 {
 NvmeZoneSendCmd *cmd = (NvmeZoneSendCmd *)&req->cmd;
@@ -4088,11 +4094,11 @@ static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, 
NvmeRequest *req)
 
 case NVME_ZONE_ACTION_SET_ZD_EXT:
 trace_pci_nvme_set_descriptor_extension(slba, zone_idx);
-if (all || !ns->params.zd_extension_size) {
+if (all || !blk_get_zd_ext_size(ns->blkconf.blk)) {
 return NVME_INVALID_FIELD | NVME_DNR;
 }
 zd_ext = nvme_get_zd_extension(ns, zone_idx);
-status = nvme_h2c(n, zd_ext, ns->params.zd_extension_size, req);
+status = nvme_h2c(n, zd_ext, blk_get_zd_ext_size(ns->blkconf.blk), 
req);
 if (status) {
 trace_pci_nvme_err_zd_extension_map_error(zone_idx);
 return status;
@@ -4183,7 +4189,8 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, 
NvmeRequest *req)
 if (zra != NVME_ZONE_REPORT && zra != NVME_ZONE_REPORT_EXTENDED) {
 return NVME_INVALID_FIELD | NVME_DNR;
 }
-if (zra == NVME_ZONE_REPORT_EXTENDED && !ns->params.zd_extension_size) {
+if (zra == NVME_ZONE_REPORT_EXTENDED &&
+!blk_get_zd_ext_size(ns->blkconf.blk)) {
 return NVME_INVALID_FIELD | NVME_DNR;
 }
 
@@ -4205,7 +4212,7 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, 
NvmeRequest *req)
 
 zone_entry_sz = sizeof(NvmeZoneDescr);
 if (zra == NVME_ZONE_REPORT_EXTENDED) {
-zone_entry_sz += ns->params.zd_extension_size;
+zone_entry_sz += blk_get_zd_ext_size(ns->blkconf.blk) ;
 }
 
 max_zones = (data_size - sizeof(NvmeZoneReportHeader)) / zone_entry_sz;
@@ -4243,11 +4250,12 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, 
NvmeRequest *req)
 }
 
 if (zra == NVME_ZONE_REPORT_EXTENDED) {
+int zd_ext_size = blk_get_zd_ext_size(ns->blkconf.blk);
 if (zone->d.za & NVME_ZA_ZD_EXT_VALID) {
 memcpy(buf_p, nvme_get_zd_extension(ns, zone_idx),
-   ns->params.zd_extension_size);
+   zd_ext_size);
 }
-buf_p += ns->params.zd_extension_size;
+buf_p += zd_ext_size;
 }
 
 max_zones--;
diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
index 82d4f7932d..45c08391f5 100644
--- a/hw/nvme/ns.c
+++ b/hw/nvme/ns.c
@@ -218,15 +218,15 @@ static int 
nvme_ns_zoned_check_calc_geometry(NvmeNamespace *ns, Error **errp)
 
 static void nvme_ns_zoned_init_state(NvmeNamespace *ns)
 {
+BlockBackend *blk = ns->blkconf.blk;
 uint64_t start = 0, zone_size = ns->zone_size;
 uint64_t capacity = ns->num_zones * zone_size;
 NvmeZone *zone;
 int i;
 
 ns->zone_array = g_new0(NvmeZone, ns->num_zones);
-if (ns->params.zd_extension_size) {
-ns->zd_extensions = g_malloc0(ns->params.zd_extension_size *
-  ns->num_zones);
+if (blk_get_zone_extension(blk)) {
+ns->zd_extensions = blk_get_zone_extension(blk);
 }
 
 QTAILQ_INIT(&ns->exp_open_zones);
@@ -275,7 +275,7 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns)
 for (i = 0; i <= ns->id_ns.nlbaf; i++) {
 id_ns_z->lbafe[i].zsze = cpu_to_le64(ns->zone_size);
 id_ns_z->lbafe[i].zdes =
-ns->params.zd_extension_size >> 6; /* Units of 64B */
+blk_get_zd_ext_size(blk) >> 6; /* Units of 64B */
 }
 
 if (ns->params.zrwas) {
@@ -576,19 +576,6 

[RFC v2 6/7] hw/nvme: refactor zone append write using block layer APIs

2023-11-27 Thread Sam Li
Signed-off-by: Sam Li 
---
 block/qcow2.c|   2 +-
 hw/nvme/ctrl.c   | 190 ---
 include/sysemu/dma.h |   3 +
 system/dma-helpers.c |  17 
 4 files changed, 162 insertions(+), 50 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index dfaf5566e2..74d2e2bf39 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2290,7 +2290,7 @@ static void qcow2_refresh_limits(BlockDriverState *bs, 
Error **errp)
 bs->bl.max_open_zones = s->zoned_header.max_open_zones;
 bs->bl.zone_size = s->zoned_header.zone_size;
 bs->bl.zone_capacity = s->zoned_header.zone_capacity;
-bs->bl.write_granularity = BDRV_SECTOR_SIZE;
+bs->bl.write_granularity = BDRV_SECTOR_SIZE; /* physical block size */
 bs->bl.zd_extension_size = s->zoned_header.zd_extension_size;
 }
 
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index b9ed3495e1..f65a87646e 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1735,6 +1735,95 @@ static void nvme_misc_cb(void *opaque, int ret)
 nvme_enqueue_req_completion(nvme_cq(req), req);
 }
 
+typedef struct NvmeZoneCmdAIOCB {
+NvmeRequest *req;
+NvmeCmd *cmd;
+NvmeCtrl *n;
+
+union {
+struct {
+  uint32_t partial;
+  unsigned int nr_zones;
+  BlockZoneDescriptor *zones;
+} zone_report_data;
+struct {
+  int64_t offset;
+} zone_append_data;
+};
+} NvmeZoneCmdAIOCB;
+
+static void nvme_blk_zone_append_complete_cb(void *opaque, int ret)
+{
+NvmeZoneCmdAIOCB *cb = opaque;
+NvmeRequest *req = cb->req;
+int64_t *offset = (int64_t *)&req->cqe;
+
+if (ret) {
+nvme_aio_err(req, ret);
+}
+
+*offset = nvme_b2l(req->ns, cb->zone_append_data.offset);
+nvme_enqueue_req_completion(nvme_cq(req), req);
+g_free(cb);
+}
+
+static inline void nvme_blk_zone_append(BlockBackend *blk, int64_t *offset,
+  uint32_t align,
+  BlockCompletionFunc *cb,
+  NvmeZoneCmdAIOCB *aiocb)
+{
+NvmeRequest *req = aiocb->req;
+assert(req->sg.flags & NVME_SG_ALLOC);
+
+if (req->sg.flags & NVME_SG_DMA) {
+req->aiocb = dma_blk_zone_append(blk, &req->sg.qsg, (int64_t)offset,
+ align, cb, aiocb);
+} else {
+req->aiocb = blk_aio_zone_append(blk, offset, &req->sg.iov, 0,
+ cb, aiocb);
+}
+}
+
+static void nvme_zone_append_cb(void *opaque, int ret)
+{
+NvmeZoneCmdAIOCB *aiocb = opaque;
+NvmeRequest *req = aiocb->req;
+NvmeNamespace *ns = req->ns;
+
+BlockBackend *blk = ns->blkconf.blk;
+
+trace_pci_nvme_rw_cb(nvme_cid(req), blk_name(blk));
+
+if (ret) {
+goto out;
+}
+
+if (ns->lbaf.ms) {
+NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
+uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1;
+int64_t offset = aiocb->zone_append_data.offset;
+
+if (nvme_ns_ext(ns) || req->cmd.mptr) {
+uint16_t status;
+
+nvme_sg_unmap(&req->sg);
+status = nvme_map_mdata(nvme_ctrl(req), nlb, req);
+if (status) {
+ret = -EFAULT;
+goto out;
+}
+
+return nvme_blk_zone_append(blk, &offset, 1,
+nvme_blk_zone_append_complete_cb,
+aiocb);
+}
+}
+
+out:
+nvme_blk_zone_append_complete_cb(aiocb, ret);
+}
+
+
 void nvme_rw_complete_cb(void *opaque, int ret)
 {
 NvmeRequest *req = opaque;
@@ -3061,6 +3150,9 @@ static uint16_t nvme_do_write(NvmeCtrl *n, NvmeRequest 
*req, bool append,
 uint64_t mapped_size = data_size;
 uint64_t data_offset;
 BlockBackend *blk = ns->blkconf.blk;
+BlockZoneWps *wps = blk_get_zone_wps(blk);
+uint32_t zone_size = blk_get_zone_size(blk);
+uint32_t zone_idx;
 uint16_t status;
 
 if (nvme_ns_ext(ns)) {
@@ -3091,42 +3183,47 @@ static uint16_t nvme_do_write(NvmeCtrl *n, NvmeRequest 
*req, bool append,
 }
 
 if (blk_get_zone_model(blk)) {
-uint32_t zone_size = blk_get_zone_size(blk);
-uint32_t zone_idx = slba / zone_size;
-int64_t zone_start = zone_idx * zone_size;
+assert(wps);
+if (zone_size) {
+zone_idx = slba / zone_size;
+int64_t zone_start = zone_idx * zone_size;
+
+if (append) {
+bool piremap = !!(ctrl & NVME_RW_PIREMAP);
+
+if (n->params.zasl &&
+data_size > (uint64_t)
+n->page_size << n->params.zasl) {
+trace_pci_nvme_err_zasl(data_size);
+return NVME_INVALID_FIELD | NVME_DNR;
+}
 
-if (append) {
-bool piremap = !!(ctrl & NVME_RW_PIREMAP);
+rw->slba = cpu_to_le64(slba);
 
-if (n->params.zasl &&
-  

Re: [PATCH] pcie_sriov: Remove g_new assertion

2023-11-27 Thread Cédric Le Goater

On 11/23/23 08:56, Akihiko Odaki wrote:

g_new() aborts if the allocation fails so it returns NULL only if the
requested allocation size is zero. register_vfs() makes such an
allocation if NumVFs is zero so it should not assert that g_new()
returns a non-NULL value.

Fixes: 7c0fa8dff8 ("pcie: Add support for Single Root I/O Virtualization 
(SR/IOV)")
Buglink: https://issues.redhat.com/browse/RHEL-17209
Signed-off-by: Akihiko Odaki 



Reviewed-by: Cédric Le Goater 

Thanks,

C.



---
  hw/pci/pcie_sriov.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c
index 5ef8950940..a1fe65f5d8 100644
--- a/hw/pci/pcie_sriov.c
+++ b/hw/pci/pcie_sriov.c
@@ -178,7 +178,6 @@ static void register_vfs(PCIDevice *dev)
  num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF);
  
  dev->exp.sriov_pf.vf = g_new(PCIDevice *, num_vfs);

-assert(dev->exp.sriov_pf.vf);
  
  trace_sriov_register_vfs(dev->name, PCI_SLOT(dev->devfn),

   PCI_FUNC(dev->devfn), num_vfs);





Re: [PATCH for-8.2] ui/vnc-clipboard: fix inflate_buffer

2023-11-27 Thread Marc-André Lureau
Hi

On Thu, Nov 23, 2023 at 12:46 PM Fiona Ebner  wrote:
>
> Am 23.11.23 um 07:52 schrieb Marc-André Lureau:
> > Hi
> >
> > On Wed, Nov 22, 2023 at 5:25 PM Fiona Ebner  wrote:
> >>
> >> Am 22.11.23 um 14:06 schrieb Marc-André Lureau:
> >>> Hi
> >>>
> >>> On Wed, Nov 22, 2023 at 5:00 PM Fiona Ebner  wrote:
> 
>  Commit d921fea338 ("ui/vnc-clipboard: fix infinite loop in
>  inflate_buffer (CVE-2023-3255)") removed this hunk, but it is still
>  required, because it can happen that stream.avail_in becomes zero
>  before coming across a return value of Z_STREAM_END in the loop.
> >>>
> >>> Isn't this an error from the client side then?
> >>>
> >>
> >> In my test just now I get Z_BUF_ERROR twice and after the second one,
> >> stream.avail_in is zero. Maybe if you'd call inflate() again, you'd get
> >> Z_STREAM_END, but no such call is made, because we exit the loop.
> >
> > It should exit the loop after calling inflate() again though.
> >
> > Or do you mean that it goes to Z_BUF_ERROR a second time with
> > stream.avail_in == 0, thus exit the loop quickly after ?
>
> Yes, sorry if I wasn't clear. After the second inflate() call,
> stream.avail_in == 0. See below.
>
> >
> > That could mean that the input buffer is not complete.
> >
> > "Note that Z_BUF_ERROR is not fatal, and inflate() can be called again
> > with more input..."
> >
> > Something is fishy.. Is it easy to reproduce?
> >
>
> Yes, always. For example when entering "foobar" in the clipboard on the
> host side:
>
> > Thread 1 "qemu-system-x86" hit Breakpoint 2, inflate_buffer 
> > (in=0x57a2980c "x^b```O\313\317OJ,b", in_len=19, size=0x7fffd058) 
> > at ../ui/vnc-clipboard.c:44
> > 44ret = inflateInit(&stream);
> > (gdb) n
> > [Thread 0x7ffec7c166c0 (LWP 20918) exited]
> > 45if (ret != Z_OK) {
> > (gdb) p stream
> > $18 = {next_in = 0x57a2980c "x^b```O\313\317OJ,b", avail_in = 19, 
> > total_in = 0, next_out = 0x57173d20 "\303\337:\002PU", avail_out = 8, 
> > total_out = 0, msg = 0x0, state = 0x57654200,
> >   zalloc = 0x77bc1560, zfree = 0x77bc1570, opaque = 0x0, data_type 
> > = 0, adler = 1, reserved = 0}
> > (gdb) c
> > Continuing.
> > [New Thread 0x7ffec7c166c0 (LWP 21011)]
> >
> > Thread 1 "qemu-system-x86" hit Breakpoint 3, inflate_buffer 
> > (in=0x57a2980c "x^b```O\313\317OJ,b", in_len=19, size=0x7fffd058) 
> > at ../ui/vnc-clipboard.c:50
> > 50ret = inflate(&stream, Z_FINISH);
> > (gdb) n
> > [Thread 0x7ffec7c166c0 (LWP 21011) exited]
> > 51switch (ret) {
> > (gdb) p ret
> > $19 = -5
> > (gdb) p stream
> > $20 = {next_in = 0x57a29818 "b", avail_in = 7, total_in = 12, next_out 
> > = 0x57173d28 "", avail_out = 0, total_out = 8, msg = 0x0, state = 
> > 0x57654200, zalloc = 0x77bc1560, zfree = 0x77bc1570,
> >   opaque = 0x0, data_type = 5, adler = 72352174, reserved = 0}
> > (gdb) c
> > Continuing.
> > [New Thread 0x7ffec7c166c0 (LWP 21131)]
> >
> > Thread 1 "qemu-system-x86" hit Breakpoint 3, inflate_buffer 
> > (in=0x57a2980c "x^b```O\313\317OJ,b", in_len=19, size=0x7fffd058) 
> > at ../ui/vnc-clipboard.c:50
> > 50ret = inflate(&stream, Z_FINISH);
> > (gdb) n
> > [Thread 0x7ffec7c166c0 (LWP 21131) exited]
> > 51switch (ret) {
> > (gdb) p ret
> > $21 = -5
> > (gdb) p stream
> > $22 = {next_in = 0x57a2981f "", avail_in = 0, total_in = 19, next_out = 
> > 0x57173d2b "", avail_out = 5, total_out = 11, msg = 0x0, state = 
> > 0x57654200, zalloc = 0x77bc1560, zfree = 0x77bc1570,
> >   opaque = 0x0, data_type = 128, adler = 190907009, reserved = 0}
> > (gdb) p out + 4
> > $23 = (uint8_t *) 0x57173d24 "foobar"
> > (gdb) p *size
> > $24 = 0
>
> Now we exit the loop and without the hunk this patch re-adds, we don't
> update *size and don't return 'out'.
>

It seems like a bug in tigervnc then. For some reason, the compressed
data doesn't trigger Z_STREAM_END on the decompression side. Have you
investigated or reported an issue to them?

thanks

-- 
Marc-André Lureau



Re: [RFC PATCH v3 3/5] xen: add option to disable legacy backends

2023-11-27 Thread Volodymyr Babchuk


Hi David,

"Woodhouse, David"  writes:

> [[S/MIME Signed Part:Undecided]]
> On Fri, 2023-11-24 at 23:24 +, Volodymyr Babchuk wrote:
>> This patch makes legacy backends optional. As was discussed at [1]
>> this is a solution to a problem when we can't run QEMU as a device
>> model in a non-privileged domain. This is because legacy backends
>> assume that they are always running in domain with ID = 0. Actually,
>> this may prevent running QEMU in a privileged domain with ID not equal
>> to zero.
>> 
>> To be able to disable legacy backends we need to alter couple of
>> source files that unintentionally depend on them. For example
>> xen-all.c used xen_pv_printf to report errors, while not providing any
>> additional like xendev pointer. Also, we need to move xenstore
>> structure from xen-legacy-backend.c, because it is apparently used in
>> xen-all.c.
>> 
>> With this patch it is possible to provide
>> "--disable-xen-legacy-backends" configure option to get QEMU binary
>> that can run in a driver domain. With price of not be able to use
>> legacy backends of course.
>> 
>> [1]
>> https://lists.gnu.org/archive/html/qemu-devel/2023-11/msg05022.html
>> 
>> Signed-off-by: Volodymyr Babchuk 
>> 
>> ---
>> 
>> I am not sure if I made correct changes to the build system, thus this
>> patch is tagged as RFC.
>
> Hm, I was imagining a new CONFIG_LEGACY_XEN_BACKENDS option which would
> look a lot like CONFIG_XEN_BUS (which would now be only for the new
> XenBus code).
>

It was my original intention too. But it appears that it is not possible
to add Kconfig value and then make it configurable via ./config
script. As I understood it can be set only via defconfig file.

> This looks weird to me:
>
>> --- a/hw/display/meson.build
>> +++ b/hw/display/meson.build
>> @@ -14,7 +14,9 @@ system_ss.add(when: 'CONFIG_PL110', if_true:
>> files('pl110.c'))
>>  system_ss.add(when: 'CONFIG_SII9022', if_true: files('sii9022.c'))
>>  system_ss.add(when: 'CONFIG_SSD0303', if_true: files('ssd0303.c'))
>>  system_ss.add(when: 'CONFIG_SSD0323', if_true: files('ssd0323.c'))
>> -system_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xenfb.c'))
>> +if have_xen_legacy_backends
>> +  system_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xenfb.c'))
>> +endif
>> 
>>  system_ss.add(when: 'CONFIG_VGA_PCI', if_true: files('vga-pci.c'))
>>  system_ss.add(when: 'CONFIG_VGA_ISA', if_true: files('vga-isa.c'))
>
> I'd prefer to see just:
>
> -system_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xenfb.c'))
> +system_ss.add(when: 'CONFIG_XEN_LEGACY_BACKENDS', if_true: files('xenfb.c'))

I tried, but it does not work this way. I need to create Kconfig
variable to do this, but then other problems appear.

>
> Probably also better to split out the bits in accel/xen/xen-all.c and
> hw/xen/xen-legacy-backend.c to a separate preparatory commit.

Okay, will do.

-- 
WBR, Volodymyr


[RFC] docs/s390: Fix wrong command example in s390-cpu-topology.rst

2023-11-27 Thread Zhao Liu
From: Zhao Liu 

>From s390_possible_cpu_arch_ids() in hw/s390x/s390-virtio-ccw.c, the
"core-id" is the index of pssible_cpus->cpus[], so it should only be
less than possible_cpus->len, which is equal to ms->smp.max_cpus.

Fix the wrong "core-id" 112 because it is greater than maxcpus (36) in
-smp.

Cc: Nina Schoetterl-Glausch 
Signed-off-by: Zhao Liu 
---
RFC: Not tested on S390 machine, just code reading.
---
 docs/devel/s390-cpu-topology.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/devel/s390-cpu-topology.rst b/docs/devel/s390-cpu-topology.rst
index 9eab28d5e5d8..48313b92d417 100644
--- a/docs/devel/s390-cpu-topology.rst
+++ b/docs/devel/s390-cpu-topology.rst
@@ -15,7 +15,7 @@ have default values:
 -smp 1,drawers=3,books=3,sockets=2,cores=2,maxcpus=36 \
 -device z14-s390x-cpu,core-id=19,entitlement=high \
 -device z14-s390x-cpu,core-id=11,entitlement=low \
--device z14-s390x-cpu,core-id=112,entitlement=high \
+-device z14-s390x-cpu,core-id=12,entitlement=high \
...
 
 Additions to query-cpus-fast
@@ -78,7 +78,7 @@ modifiers for all configured vCPUs.
   "dedicated": true,
   "thread-id": 537005,
   "props": {
-"core-id": 112,
+"core-id": 12,
 "socket-id": 0,
 "drawer-id": 3,
 "book-id": 2
@@ -86,7 +86,7 @@ modifiers for all configured vCPUs.
   "cpu-state": "operating",
   "entitlement": "high",
   "qom-path": "/machine/peripheral-anon/device[2]",
-  "cpu-index": 112,
+  "cpu-index": 12,
   "target": "s390x"
 }
   ]
-- 
2.34.1




Re: [BUG] accel/tcg: cpu_exec_longjmp_cleanup: assertion failed: (cpu == current_cpu)

2023-11-27 Thread Peter Maydell
On Sat, 25 Nov 2023 at 13:09, Petr Cvek  wrote:
>
> It seems there is a bug in SIGALRM handling when 486 system emulates x86_64 
> code.

486 host is pretty well out of support currently. Can you reproduce
this on a less ancient host CPU type ?

> ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: assertion failed: 
> (cpu == current_cpu)
> Bail out! ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: 
> assertion failed: (cpu == current_cpu)
> 0x48874a != 0x3c69e10
> **
> ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: assertion failed: 
> (cpu == current_cpu)
> Bail out! ERROR:../accel/tcg/cpu-exec.c:546:cpu_exec_longjmp_cleanup: 
> assertion failed: (cpu == current_cpu)

What compiler version do you build QEMU with? That
assert is there because we have seen some buggy compilers
in the past which don't correctly preserve the variable
value as the setjmp/longjmp spec requires them to.

thanks
-- PMM



Re: [PATCH] qemu/timer: Don't use RDTSC on i486

2023-11-27 Thread Peter Maydell
On Sat, 25 Nov 2023 at 12:24, Petr Cvek  wrote:
>
> GCC defines __i386__ for i386 and i486, which both lack RDTSC instruction.
> The i386 seems to be impossible to distinguish, but i486 can be identified
> by checking for undefined __i486__.
>
> Signed-off-by: Petr Cvek 

Last time this came up (over a decade ago!) we dropped the idea
because we couldn't find a consistent way of identifying
the no-RDTSC CPUs:
https://patchwork.ozlabs.org/project/qemu-devel/patch/1353683570-30525-1-git-send-email-peter.mayd...@linaro.org/

We have already deprecated 32-bit x86 host support for
system emulation mode, incidentally.

thanks
-- PMM



Re: [PATCH v2] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2023-11-27 Thread Shaoqin Huang

Hi Eric,

On 11/24/23 18:40, Eric Auger wrote:

Hi Shaoqin,

On 11/17/23 07:08, Shaoqin Huang wrote:

The KVM_ARM_VCPU_PMU_V3_FILTER provide the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`pmu-filter` as -accel sub-option to set the PMU Event Filtering.

you remind the reader the default policy without filter (ie. expose all
events from the hots)


Yes. I will add this description to the default policy.



The `pmu-filter` has such format:

   pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"

The A means "allow" and D means "deny", start is the first event of the
range and the end is the last one. The first filter action defines if the whole
event list is an allow or deny list, if the first filter action is "allow", all
other events are denied except start-end; if the first filter action is "deny",
all other events are allowed except start-end. For example:


I prefer the kernel doc wording
The first registered range defines the global policy (global ALLOW if
the first @action is DENY, global DENY if the first @action is ALLOW).


I can replace it this by kernel doc description in next version.



   pmu-filter="A:0x11-0x11;A:0x23-0x3a,D:0x30-0x30"

shoudn't the "," be replaced by a ";"?


Yes. Thanks for catching this. It should be ";".




I would add: since the first action is allow, we have a global deny policy.


That makes the example more clear, will add it.



This will allow event 0x11 (The cycle counter), events 0x23 to 0x3a is
also allowed except the event 0x30 is denied, and all the other events
are disallowed.

Here is an real example shows how to use the PMU Event Filtering, when
we launch a guest by use kvm, add such command line:

   # qemu-system-aarch64 \
-accel kvm,pmu-filter="D:0x11-0x11"

Since the first filter action is deny, we have a global allow policy.
this disables the filtering of the cycle counter (event 0x11 being
CPU_CYCLES)

kernel doc says that the ranges should match the PMU arch (10 bits on
ARMv8.0, 16 bits from ARMv8.1 onwards). How do you handle that?


Currently I think I can rely on the SET_DEVICE_ATTR, when set the 
KVM_ARM_VCPU_PMU_V3_FILTER, check the errno number, if it equals to 
EINVAL, then report the error to the use it's an invalid filter.


Or another way is to detect the ARM version, and do more check in the 
userspace?


Do you have any good suggestions on handle the two different event space 
in QEMU?




And then in guest, use the perf to count the cycle:

   # perf stat sleep 1

Performance counter stats for 'sleep 1':

   1.22 msec task-clock   #0.001 CPUs 
utilized
  1  context-switches #  820.695 /sec
  0  cpu-migrations   #0.000 /sec
 55  page-faults  #   45.138 K/sec
  cycles
1128954  instructions
 227031  branches #  186.323 M/sec
   8686  branch-misses#3.83% of all 
branches

1.002492480 seconds time elapsed

0.001752000 seconds user
0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other pmu events are still work.


perf list should work as well


It works, should I post it output at here?



Signed-off-by: Shaoqin Huang 
---
v1->v2:
   - Add more description for allow and deny meaning in
 commit message. [Sebastian]
   - Small improvement.  [Sebastian]

v1: https://lore.kernel.org/all/20231113081713.153615-1-shahu...@redhat.com/
---
  include/sysemu/kvm_int.h |  1 +
  qemu-options.hx  | 16 +
  target/arm/kvm.c | 22 +
  target/arm/kvm64.c   | 51 
  4 files changed, 90 insertions(+)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index fd846394be..8f4601474f 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -120,6 +120,7 @@ struct KVMState
  uint32_t xen_caps;
  uint16_t xen_gnttab_max_frames;
  uint16_t xen_evtchn_max_pirq;
+char *kvm_pmu_filter;
  };
  
  void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,

diff --git a/qemu-options.hx b/qemu-options.hx
index 42fd09e4de..dd3518092c 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -187,6 +187,7 @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
  "tb-size=n (TCG translation block cache size)\n"
  "dirty-ring-size=n (KVM dirty ring GFN count, default 
0)\n"
  "eager-split-size=n (KVM Eager Page Split chunk size, default 
0, disabled. ARM only)\n"
+"pmu-filter={A,D}:start-end[;...] (KVM PMU Event Filter, 
default no filter. ARM only)\n"
  "notify-vmexit=run|internal-error|disable,notify-windo

Re: [PATCH v2] arm/kvm: Enable support for KVM_ARM_VCPU_PMU_V3_FILTER

2023-11-27 Thread Shaoqin Huang

Hi Eric,

On 11/25/23 02:24, Eric Auger wrote:

Hi,

On 11/17/23 07:08, Shaoqin Huang wrote:

The KVM_ARM_VCPU_PMU_V3_FILTER provide the ability to let the VMM decide
which PMU events are provided to the guest. Add a new option
`pmu-filter` as -accel sub-option to set the PMU Event Filtering.

The `pmu-filter` has such format:

   pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"

The A means "allow" and D means "deny", start is the first event of the
range and the end is the last one. The first filter action defines if the whole
event list is an allow or deny list, if the first filter action is "allow", all
other events are denied except start-end; if the first filter action is "deny",
all other events are allowed except start-end. For example:

   pmu-filter="A:0x11-0x11;A:0x23-0x3a,D:0x30-0x30"

This will allow event 0x11 (The cycle counter), events 0x23 to 0x3a is
also allowed except the event 0x30 is denied, and all the other events
are disallowed.

Here is an real example shows how to use the PMU Event Filtering, when
we launch a guest by use kvm, add such command line:

   # qemu-system-aarch64 \
-accel kvm,pmu-filter="D:0x11-0x11"

And then in guest, use the perf to count the cycle:

   # perf stat sleep 1

Performance counter stats for 'sleep 1':

   1.22 msec task-clock   #0.001 CPUs 
utilized
  1  context-switches #  820.695 /sec
  0  cpu-migrations   #0.000 /sec
 55  page-faults  #   45.138 K/sec
  cycles
1128954  instructions
 227031  branches #  186.323 M/sec
   8686  branch-misses#3.83% of all 
branches

1.002492480 seconds time elapsed

0.001752000 seconds user
0.0 seconds sys

As we can see, the cycle counter has been disabled in the guest, but
other pmu events are still work.

Signed-off-by: Shaoqin Huang 
---
v1->v2:
   - Add more description for allow and deny meaning in
 commit message. [Sebastian]
   - Small improvement.  [Sebastian]

v1: https://lore.kernel.org/all/20231113081713.153615-1-shahu...@redhat.com/
---
  include/sysemu/kvm_int.h |  1 +
  qemu-options.hx  | 16 +
  target/arm/kvm.c | 22 +
  target/arm/kvm64.c   | 51 
  4 files changed, 90 insertions(+)

diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index fd846394be..8f4601474f 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -120,6 +120,7 @@ struct KVMState
  uint32_t xen_caps;
  uint16_t xen_gnttab_max_frames;
  uint16_t xen_evtchn_max_pirq;
+char *kvm_pmu_filter;
  };
  
  void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,

diff --git a/qemu-options.hx b/qemu-options.hx
index 42fd09e4de..dd3518092c 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -187,6 +187,7 @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
  "tb-size=n (TCG translation block cache size)\n"
  "dirty-ring-size=n (KVM dirty ring GFN count, default 
0)\n"
  "eager-split-size=n (KVM Eager Page Split chunk size, default 
0, disabled. ARM only)\n"
+"pmu-filter={A,D}:start-end[;...] (KVM PMU Event Filter, 
default no filter. ARM only)\n"
  "notify-vmexit=run|internal-error|disable,notify-window=n 
(enable notify VM exit and set notify window, x86 only)\n"
  "thread=single|multi (enable multi-threaded TCG)\n", 
QEMU_ARCH_ALL)
  SRST
@@ -259,6 +260,21 @@ SRST
  impact on the memory. By default, this feature is disabled
  (eager-split-size=0).
  
+``pmu-filter={A,D}:start-end[;...]``

+KVM implements pmu event filtering to prevent a guest from being able 
to
+   sample certain events. It has the following format:
+
+   pmu-filter="{A,D}:start-end[;{A,D}:start-end...]"
+
+   The A means "allow" and D means "deny", start if the first event of the
+   range and the end is the last one. For example:
+
+   pmu-filter="A:0x11-0x11;A:0x23-0x3a,D:0x30-0x30"
+
+   This will allow event 0x11 (The cycle counter), events 0x23 to 0x3a is
+   also allowed except the event 0x30 is denied, and all the other events
+   are disallowed.
+
  ``notify-vmexit=run|internal-error|disable,notify-window=n``
  Enables or disables notify VM exit support on x86 host and specify
  the corresponding notify window to trigger the VM exit if enabled.
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 7903e2ddde..74796de055 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1108,6 +1108,21 @@ static void kvm_arch_set_eager_split_size(Object *obj, 
Visitor *v,
 

Re: [PATCH] avr: Fix wrong initial value of stack pointer

2023-11-27 Thread Peter Maydell
On Mon, 27 Nov 2023 at 03:52, Gihun Nam  wrote:
>
> The current implementation initializes the stack pointer of AVR devices
> to 0, but it should be set to RAMEND according to the specs.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1525
> Signed-off-by: Gihun Nam 

Hi; thanks for sending in this patch!

> ---
>  hw/avr/atmega.c  | 3 +++
>  target/avr/cpu.c | 2 +-
>  target/avr/cpu.h | 3 +++
>  3 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/hw/avr/atmega.c b/hw/avr/atmega.c
> index a34803e642..3a8caccf99 100644
> --- a/hw/avr/atmega.c
> +++ b/hw/avr/atmega.c
> @@ -233,6 +233,9 @@ static void atmega_realize(DeviceState *dev, Error **errp)
>
>  /* CPU */
>  object_initialize_child(OBJECT(dev), "cpu", &s->cpu, mc->cpu_type);
> +
> +s->cpu.init_sp = mc->io_size + mc->sram_size - 1;

The board code should not directly touch fields inside the
CPU object. You want to set up a QOM property on the CPU
which the board code can then initialize using a QOM
property-setting function.

For an example of how to do this, have a look at how the
target/microblaze code handles its "base-vectors" property
(in fact any CPU that does a DEFINE_PROP_UINT32() will
be similar):
 * you have a Property array with a DEFINE_PROP_UINT32()
   line to define the property name, the struct field
   that it corresponds to, and the default value
   (you can list a struct field directly, don't
   put it in a substruct 'cfg' the way MicroBlaze does),
   plus a DEFINE_PROP_END_OF_LIST() terminator
 * the CPU class init function calls device_class_set_props()
   to say that that array is its properties
 * the board code sets the correct value using either
   object_property_set_uint() or qdev_prop_set_uint32()
   (doesn't matter which)

> +
>  qdev_realize(DEVICE(&s->cpu), NULL, &error_abort);
>  cpudev = DEVICE(&s->cpu);

As a side note, one of the answers to
https://stackoverflow.com/questions/46949227/compiling-an-assembly-program-using-avr
says that older AVR CPUs set the SP to 0 on reset, and
it's only newer ones that set it to RAMEND. That's
probably why we reset to 0.

thanks
-- PMM



Re: [PATCH v7 0/8] Unified CPU type check

2023-11-27 Thread Marcin Juszkiewicz

W dniu 27.11.2023 o 00:12, Gavin Shan pisze:

After the series is applied:

   [gshan@gshan q]$ ./build/qemu-system-aarch64 -M virt -cpu cortex-a8
   qemu-system-aarch64: Invalid CPU type: cortex-a8
   The valid types are: cortex-a7, cortex-a15, cortex-a35, cortex-a55, \
cortex-a72, cortex-a76, cortex-a710, a64fx,\
neoverse-n1, neoverse-v1, neoverse-n2, cortex-a53, \
cortex-a57, max


Can this list have some better order? A35, A53, A55, A57, A72 feels 
better than current one.




Re: [PATCH v7 5/8] hw/arm/virt: Check CPU type in machine_run_board_init()

2023-11-27 Thread Marcin Juszkiewicz

W dniu 27.11.2023 o 00:12, Gavin Shan pisze:

@@ -2939,6 +2900,28 @@ static void virt_machine_class_init(ObjectClass *oc, 
void *data)
  {
  MachineClass *mc = MACHINE_CLASS(oc);
  HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
+static const char * const valid_cpu_types[] = {
+#ifdef CONFIG_TCG
+ARM_CPU_TYPE_NAME("cortex-a7"),
+ARM_CPU_TYPE_NAME("cortex-a15"),
+ARM_CPU_TYPE_NAME("cortex-a35"),
+ARM_CPU_TYPE_NAME("cortex-a55"),
+ARM_CPU_TYPE_NAME("cortex-a72"),
+ARM_CPU_TYPE_NAME("cortex-a76"),
+ARM_CPU_TYPE_NAME("cortex-a710"),
+ARM_CPU_TYPE_NAME("a64fx"),
+ARM_CPU_TYPE_NAME("neoverse-n1"),
+ARM_CPU_TYPE_NAME("neoverse-v1"),
+ARM_CPU_TYPE_NAME("neoverse-n2"),
+#endif
+ARM_CPU_TYPE_NAME("cortex-a53"),
+ARM_CPU_TYPE_NAME("cortex-a57"),
+#if defined(CONFIG_KVM) || defined(CONFIG_HVF)
+ARM_CPU_TYPE_NAME("host"),
+#endif
+ARM_CPU_TYPE_NAME("max"),
+NULL
+};


I understand that you just move list from one place to the other but 
also wonder why a53/a57 were/are outside of 'ifdef CONFIG_TCG' check.






Re: [PATCH v7 0/8] Unified CPU type check

2023-11-27 Thread Gavin Shan



On 11/27/23 21:10, Marcin Juszkiewicz wrote:

W dniu 27.11.2023 o 00:12, Gavin Shan pisze:

After the series is applied:

   [gshan@gshan q]$ ./build/qemu-system-aarch64 -M virt -cpu cortex-a8
   qemu-system-aarch64: Invalid CPU type: cortex-a8
   The valid types are: cortex-a7, cortex-a15, cortex-a35, cortex-a55, \
    cortex-a72, cortex-a76, cortex-a710, a64fx,    \
    neoverse-n1, neoverse-v1, neoverse-n2, cortex-a53, \
    cortex-a57, max


Can this list have some better order? A35, A53, A55, A57, A72 feels better than 
current one.



Definitely a good idea. I think this series is about to be queued,
let me send one additional patch to do so after this series is
merged.

Thanks,
Gavin




Re: [PATCH v7 5/8] hw/arm/virt: Check CPU type in machine_run_board_init()

2023-11-27 Thread Gavin Shan

On 11/27/23 21:13, Marcin Juszkiewicz wrote:

W dniu 27.11.2023 o 00:12, Gavin Shan pisze:

@@ -2939,6 +2900,28 @@ static void virt_machine_class_init(ObjectClass *oc, 
void *data)
  {
  MachineClass *mc = MACHINE_CLASS(oc);
  HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
+    static const char * const valid_cpu_types[] = {
+#ifdef CONFIG_TCG
+    ARM_CPU_TYPE_NAME("cortex-a7"),
+    ARM_CPU_TYPE_NAME("cortex-a15"),
+    ARM_CPU_TYPE_NAME("cortex-a35"),
+    ARM_CPU_TYPE_NAME("cortex-a55"),
+    ARM_CPU_TYPE_NAME("cortex-a72"),
+    ARM_CPU_TYPE_NAME("cortex-a76"),
+    ARM_CPU_TYPE_NAME("cortex-a710"),
+    ARM_CPU_TYPE_NAME("a64fx"),
+    ARM_CPU_TYPE_NAME("neoverse-n1"),
+    ARM_CPU_TYPE_NAME("neoverse-v1"),
+    ARM_CPU_TYPE_NAME("neoverse-n2"),
+#endif
+    ARM_CPU_TYPE_NAME("cortex-a53"),
+    ARM_CPU_TYPE_NAME("cortex-a57"),
+#if defined(CONFIG_KVM) || defined(CONFIG_HVF)
+    ARM_CPU_TYPE_NAME("host"),
+#endif
+    ARM_CPU_TYPE_NAME("max"),
+    NULL
+    };


I understand that you just move list from one place to the other but also 
wonder why a53/a57 were/are outside of 'ifdef CONFIG_TCG' check.



I'm not sure about HVF, but a53/a57 can be supported by KVM. The supported list 
of
CPUs by KVM is defined in linux/arch/arm64/include/uapi/asm/kvm.h as below

/*
 * Supported CPU Targets - Adding a new target type is not recommended,
 * unless there are some special registers not supported by the
 * genericv8 syreg table.
 */
#define KVM_ARM_TARGET_AEM_V8   0
#define KVM_ARM_TARGET_FOUNDATION_V81
#define KVM_ARM_TARGET_CORTEX_A57   2
#define KVM_ARM_TARGET_XGENE_POTENZA3
#define KVM_ARM_TARGET_CORTEX_A53   4
/* Generic ARM v8 target */
#define KVM_ARM_TARGET_GENERIC_V8   5

#define KVM_ARM_NUM_TARGETS 6

And the following QEMU commit gives more hints about it.

[gshan@gshan q]$ git show 39920a04952
commit 39920a04952b67fb1fce8fc3519ac18b7a95f3f3
Author: Fabiano Rosas 
Date:   Wed Apr 26 15:00:05 2023 -0300

target/arm: Move 64-bit TCG CPUs into tcg/

Move the 64-bit CPUs that are TCG-only:

- cortex-a35
- cortex-a55
- cortex-a72
- cortex-a76
- a64fx
- neoverse-n1

Keep the CPUs that can be used with KVM:

- cortex-a57
- cortex-a53
- max
- host

Signed-off-by: Fabiano Rosas 

Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-id: 20230426180013.14814-6-faro...@suse.de
Signed-off-by: Peter Maydell 

Thanks,
Gavin




Re: [PATCH for-8.2] ui/vnc-clipboard: fix inflate_buffer

2023-11-27 Thread Fiona Ebner
Am 27.11.23 um 10:15 schrieb Marc-André Lureau:
> 
> It seems like a bug in tigervnc then. For some reason, the compressed
> data doesn't trigger Z_STREAM_END on the decompression side. Have you
> investigated or reported an issue to them?
> 

This was with noVNC. A colleague tested with TigerVNC. I haven't stepped
through with GDB there, but it might be similar. No, I haven't
reported/investigated for the VNC clients yet. Unfortunately, I've got
my hands full with other things at the moment, so it will be a while
until I can do that.

Even if it's a bug in the clients, this was working before d921fea338
("ui/vnc-clipboard: fix infinite loop in inflate_buffer
(CVE-2023-3255)") so I still feel like it might be worth handling in QEMU.

But is it really a client error? What I don't understand is why the
return value of inflate() is Z_BUF_ERROR even though all the input was
handled.

>From https://www.zlib.net/manual.html

"inflate() returns [...] Z_BUF_ERROR if no progress was possible or if
there was not enough room in the output buffer when Z_FINISH is used."

> 51ret = inflate(&stream, Z_FINISH);
> (gdb) p stream
> $23 = {next_in = 0x57652708 "", avail_in = 5, total_in = 12, next_out = 
> 0x57627378 "", avail_out = 8, total_out = 8, msg = 0x0, state = 
> 0x578df5c0, zalloc = 0x77bc1560, zfree = 0x77bc1570, 
>   opaque = 0x0, data_type = 5, adler = 71434672, reserved = 0}
> (gdb) n
> 52switch (ret) {
> (gdb) p stream
> $24 = {next_in = 0x5765270d "", avail_in = 0, total_in = 17, next_out = 
> 0x57627379 "", avail_out = 7, total_out = 9, msg = 0x0, state = 
> 0x578df5c0, zalloc = 0x77bc1560, zfree = 0x77bc1570, 
>   opaque = 0x0, data_type = 128, adler = 99746224, reserved = 0}
> (gdb) p ret
> $25 = -5
> (gdb) p out + 4
> $26 = (uint8_t *) 0x57627374 "fish"

Progress was made and there was enough space for the output (avail_out =
7 after the call), so it really shouldn't return Z_BUF_ERROR, right?

zlib version is 1:1.2.13.dfsg-1 (Debian 12 Bookworm)

Best Regards,
Fiona




[PATCH v1 0/2] A Fixup for "QEMU: CXL mailbox rework and features (Part 1)"

2023-11-27 Thread Hyeonggon Yoo
Hi, this is a fixup for the recent patch series "QEMU: CXL mailbox rework and
features (Part 1)" [1].

This fixes two problems:

   1. Media Status in memory device status register not being correctly
  read as "Disabled" while sanitation is in progress.

   2. QEMU assertion failure when it issues an MSI-X interrupt
  (indicating the completion of the sanitize command).

[1] 
https://lore.kernel.org/linux-cxl/20231023160806.13206-1-jonathan.came...@huawei.com

Hyeonggon Yoo (2):
  hw/cxl/device: read from register values in mdev_reg_read()
  hw/mem/cxl_type3: allocate more vectors for MSI-X

 hw/cxl/cxl-device-utils.c   | 17 +++--
 hw/mem/cxl_type3.c  |  2 +-
 include/hw/cxl/cxl_device.h |  4 +++-
 3 files changed, 15 insertions(+), 8 deletions(-)

-- 
2.39.1




[PATCH v1 2/2] hw/mem/cxl_type3: allocate more vectors for MSI-X

2023-11-27 Thread Hyeonggon Yoo
commit 43efb0bfad2b ("hw/cxl/mbox: Wire up interrupts for background
completion") enables notifying background command completion via MSI-X
interrupt (vector number 9).

However, the commit uses vector number 9 but the maximum number of
entries is less thus resulting in error below. Fix it by passing
nentries = 10 when calling msix_init_exclusive_bar().

 # echo 1 > sanitize
 Background command 4400h finished: success
 qemu-system-x86_64: ../hw/pci/msix.c:529: msix_notify: Assertion `vector < 
dev->msix_entries_nr' failed.

Fixes: 43efb0bfad2b ("hw/cxl/mbox: Wire up interrupts for background 
completion")
Signed-off-by: Hyeonggon Yoo <42.hye...@gmail.com>
---
 hw/mem/cxl_type3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 52647b4ac7..72d9371347 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -685,7 +685,7 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp)
 ComponentRegisters *regs = &cxl_cstate->crb;
 MemoryRegion *mr = ®s->component_registers;
 uint8_t *pci_conf = pci_dev->config;
-unsigned short msix_num = 6;
+unsigned short msix_num = 10;
 int i, rc;
 
 QTAILQ_INIT(&ct3d->error_list);
-- 
2.39.1




[PATCH v1 1/2] hw/cxl/device: read from register values in mdev_reg_read()

2023-11-27 Thread Hyeonggon Yoo
In the current mdev_reg_read() implementation, it consistently returns
that the Media Status is Ready (01b). This was fine until commit
25a52959f99d ("hw/cxl: Add support for device sanitation") because the
media was presumed to be ready.

However, as per the CXL 3.0 spec "8.2.9.8.5.1 Sanitize (Opcode 4400h)",
during sanitation, the Media State should be set to Disabled (11b). The
mentioned commit correctly sets it to Disabled, but mdev_reg_read()
still returns Media Status as Ready.

To address this, update mdev_reg_read() to read register values instead
of returning dummy values.

Fixes: commit 25a52959f99d ("hw/cxl: Add support for device sanitation")
Signed-off-by: Hyeonggon Yoo <42.hye...@gmail.com>
---
 hw/cxl/cxl-device-utils.c   | 17 +++--
 include/hw/cxl/cxl_device.h |  4 +++-
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index 61a3c4dc2e..feb20f 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -229,12 +229,9 @@ static void mailbox_reg_write(void *opaque, hwaddr offset, 
uint64_t value,
 
 static uint64_t mdev_reg_read(void *opaque, hwaddr offset, unsigned size)
 {
-uint64_t retval = 0;
-
-retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MEDIA_STATUS, 1);
-retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MBOX_READY, 1);
+CXLDeviceState *cxl_dstate = opaque;
 
-return retval;
+return cxl_dstate->mbox_reg_state64[R_CXL_MEM_DEV_STS];
 }
 
 static void ro_reg_write(void *opaque, hwaddr offset, uint64_t value,
@@ -371,7 +368,15 @@ static void mailbox_reg_init_common(CXLDeviceState 
*cxl_dstate)
 cxl_dstate->mbox_msi_n = msi_n;
 }
 
-static void memdev_reg_init_common(CXLDeviceState *cxl_dstate) { }
+static void memdev_reg_init_common(CXLDeviceState *cxl_dstate)
+{
+uint64_t memdev_status_reg;
+
+memdev_status_reg = FIELD_DP64(0, CXL_MEM_DEV_STS, MEDIA_STATUS, 1);
+memdev_status_reg = FIELD_DP64(memdev_status_reg, CXL_MEM_DEV_STS,
+   MBOX_READY, 1);
+cxl_dstate->mbox_reg_state64[R_CXL_MEM_DEV_STS] = memdev_status_reg;
+}
 
 void cxl_device_register_init_t3(CXLType3Dev *ct3d)
 {
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index befb5f884b..873e6d6ab1 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -353,7 +353,9 @@ static inline void __toggle_media(CXLDeviceState 
*cxl_dstate, int val)
 {
 uint64_t dev_status_reg;
 
-dev_status_reg = FIELD_DP64(0, CXL_MEM_DEV_STS, MEDIA_STATUS, val);
+dev_status_reg = cxl_dstate->mbox_reg_state64[R_CXL_MEM_DEV_STS];
+dev_status_reg = FIELD_DP64(dev_status_reg, CXL_MEM_DEV_STS, MEDIA_STATUS,
+val);
 cxl_dstate->mbox_reg_state64[R_CXL_MEM_DEV_STS] = dev_status_reg;
 }
 #define cxl_dev_disable_media(cxlds)\
-- 
2.39.1




[PATCH for-9.0 v2 2/8] target/riscv: add priv ver restriction to profiles

2023-11-27 Thread Daniel Henrique Barboza
Some profiles, like RVA22S64, has a priv_spec requirement.

Make this requirement explicit for all profiles. We'll validate this
requirement finalize() time and, in case the user chooses an
incompatible priv_spec while activating a profile, a warning will be
shown.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/cpu.c |  1 +
 target/riscv/cpu.h |  2 ++
 target/riscv/tcg/tcg-cpu.c | 31 +++
 3 files changed, 34 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 59b131c1fc..29a9f77702 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1537,6 +1537,7 @@ Property riscv_cpu_options[] = {
 static RISCVCPUProfile RVA22U64 = {
 .name = "rva22u64",
 .misa_ext = RVI | RVM | RVA | RVF | RVD | RVC | RVU,
+.priv_spec = RISCV_PROFILE_ATTR_UNUSED,
 .ext_offsets = {
 CPU_CFG_OFFSET(ext_zicsr), CPU_CFG_OFFSET(ext_zihintpause),
 CPU_CFG_OFFSET(ext_zba), CPU_CFG_OFFSET(ext_zbb),
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 5ff629650d..1f34eda1e4 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -81,10 +81,12 @@ typedef struct riscv_cpu_profile {
 uint32_t misa_ext;
 bool enabled;
 bool user_set;
+int priv_spec;
 const int32_t ext_offsets[];
 } RISCVCPUProfile;
 
 #define RISCV_PROFILE_EXT_LIST_END -1
+#define RISCV_PROFILE_ATTR_UNUSED -1
 
 extern RISCVCPUProfile *riscv_profiles[];
 
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index e395e2449e..4d25fc43d2 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -74,6 +74,20 @@ static void riscv_cpu_write_misa_bit(RISCVCPU *cpu, uint32_t 
bit,
 }
 }
 
+static const char *cpu_priv_ver_to_str(int priv_ver)
+{
+switch (priv_ver) {
+case PRIV_VERSION_1_10_0:
+return "v1.10.0";
+case PRIV_VERSION_1_11_0:
+return "v1.11.0";
+case PRIV_VERSION_1_12_0:
+return "v1.12.0";
+}
+
+g_assert_not_reached();
+}
+
 static void riscv_cpu_synchronize_from_tb(CPUState *cs,
   const TranslationBlock *tb)
 {
@@ -755,11 +769,24 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, 
Error **errp)
 static void riscv_cpu_validate_profile(RISCVCPU *cpu,
RISCVCPUProfile *profile)
 {
+CPURISCVState *env = &cpu->env;
 const char *warn_msg = "Profile %s mandates disabled extension %s";
 bool send_warn = profile->user_set && profile->enabled;
 bool profile_impl = true;
 int i;
 
+if (profile->priv_spec != RISCV_PROFILE_ATTR_UNUSED &&
+profile->priv_spec != env->priv_ver) {
+profile_impl = false;
+
+if (send_warn) {
+warn_report("Profile %s requires priv spec %s, "
+"but priv ver %s was set", profile->name,
+cpu_priv_ver_to_str(profile->priv_spec),
+cpu_priv_ver_to_str(env->priv_ver));
+}
+}
+
 for (i = 0; misa_bits[i] != 0; i++) {
 uint32_t bit = misa_bits[i];
 
@@ -1048,6 +1075,10 @@ static void cpu_set_profile(Object *obj, Visitor *v, 
const char *name,
 profile->user_set = true;
 profile->enabled = value;
 
+if (profile->enabled) {
+cpu->env.priv_ver = profile->priv_spec;
+}
+
 for (i = 0; misa_bits[i] != 0; i++) {
 uint32_t bit = misa_bits[i];
 
-- 
2.41.0




[PATCH for-9.0 v2 0/8] target/riscv: implement RVA22S64 profile

2023-11-27 Thread Daniel Henrique Barboza
Based-on: 20231124202353.1187814-1-dbarb...@ventanamicro.com
("[PATCH for-9.0 v12 00/18] riscv: rv64i/rva22u64 CPUs, RVA22U64 profile 
support")

Hi,

In this second version the most notable change is a new patch where we
added a 'parent' field in the profile description. This feature was
suggested by Drew in the v1 review. 

RVA22S64 is then declared as having RVA22U64 as parent, plus any other
extensions, named features and other contraints that are specific to
RVA22S64.

Another notable change is the removal of riscv_cpu_validate_svade(). The
helper (a single assignment) is now open-coded in
riscv_cpu_update_named_features().

Series based on top of:

"[PATCH for-9.0 v12 00/18] riscv: rv64i/rva22u64 CPUs, RVA22U64 profile support"

Patches missing acks: 2, 6, 7

Changes from v1:
- patch 1:
  - removed riscv_cpu_validate_svade()
- patch 2:
  - add RISCV_PROFILE_ATTR_UNUSED check when validating priv_spec
- patch 5:
  - removed stray blank line
- patch 6 (new):
  - add 'parent' in profile description
- patch 7:
  - declare RVA22U64 as parent of RVA22S64
- v1 link: 
https://lore.kernel.org/qemu-riscv/20231123191532.1101644-1-dbarb...@ventanamicro.com/
 

Daniel Henrique Barboza (8):
  target/riscv: implement svade
  target/riscv: add priv ver restriction to profiles
  target/riscv/cpu.c: finalize satp_mode earlier
  target/riscv/cpu.c: add riscv_cpu_is_32bit()
  target/riscv: add satp_mode profile support
  target/riscv: add 'parent' in profile description
  target/riscv: add RVA22S64 profile
  target/riscv: add rva22s64 cpu

 target/riscv/cpu-qom.h |  1 +
 target/riscv/cpu.c | 67 
 target/riscv/cpu.h |  5 +++
 target/riscv/cpu_cfg.h |  1 +
 target/riscv/tcg/tcg-cpu.c | 90 +-
 5 files changed, 154 insertions(+), 10 deletions(-)

-- 
2.41.0




[PATCH for-9.0 v2 1/8] target/riscv: implement svade

2023-11-27 Thread Daniel Henrique Barboza
'svade' is a RVA22S64 profile requirement, a profile we're going to add
shortly. It is a named feature (i.e. not a formal extension, not defined
in riscv,isa DT at this moment) defined in [1] as:

"Page-fault exceptions are raised when a page is accessed when A bit is
clear, or written when D bit is clear.".

As far as the spec goes, 'svade' is one of the two distinct modes of
handling PTE_A and PTE_D. The other way, i.e. update PTE_A/PTE_D when
they're cleared, is defined by the 'svadu' extension. Checking
cpu_helper.c, get_physical_address(), we can verify that QEMU is
compliant with that: we will update PTE_A/PTE_D if 'svadu' is enabled,
or throw a page-fault exception if 'svadu' isn't enabled.

So, as far as we're concerned, 'svade' translates to 'svadu must be
disabled'.

We'll implement it like 'zic64b': an internal flag that profiles can
enable. The flag will not be exposed to users.

[1] https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc

Signed-off-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
---
 target/riscv/cpu.c | 1 +
 target/riscv/cpu_cfg.h | 1 +
 target/riscv/tcg/tcg-cpu.c | 5 +
 3 files changed, 7 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 3a230608cb..59b131c1fc 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1445,6 +1445,7 @@ const RISCVCPUMultiExtConfig 
riscv_cpu_experimental_exts[] = {
 };
 
 const RISCVCPUMultiExtConfig riscv_cpu_named_features[] = {
+MULTI_EXT_CFG_BOOL("svade", svade, true),
 MULTI_EXT_CFG_BOOL("zic64b", zic64b, true),
 
 DEFINE_PROP_END_OF_LIST(),
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index 90f18eb601..46b06db68b 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -116,6 +116,7 @@ struct RISCVCPUConfig {
 bool ext_smepmp;
 bool rvv_ta_all_1s;
 bool rvv_ma_all_1s;
+bool svade;
 bool zic64b;
 
 uint32_t mvendorid;
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index 04aedf3840..e395e2449e 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -188,6 +188,9 @@ static void riscv_cpu_enable_named_feat(RISCVCPU *cpu, 
uint32_t feat_offset)
 cpu->cfg.cbop_blocksize = 64;
 cpu->cfg.cboz_blocksize = 64;
 break;
+case CPU_CFG_OFFSET(svade):
+cpu->cfg.ext_svadu = false;
+break;
 default:
 g_assert_not_reached();
 }
@@ -381,6 +384,8 @@ static void riscv_cpu_update_named_features(RISCVCPU *cpu)
 cpu->cfg.zic64b = cpu->cfg.cbom_blocksize == 64 &&
   cpu->cfg.cbop_blocksize == 64 &&
   cpu->cfg.cboz_blocksize == 64;
+
+cpu->cfg.svade = !cpu->cfg.ext_svadu;
 }
 
 static void riscv_cpu_validate_g(RISCVCPU *cpu)
-- 
2.41.0




[PATCH for-9.0 v2 4/8] target/riscv/cpu.c: add riscv_cpu_is_32bit()

2023-11-27 Thread Daniel Henrique Barboza
Next patch will need to retrieve if a given RISCVCPU is 32 or 64 bit.
The existing helper riscv_is_32bit() (hw/riscv/boot.c) will always check
the first CPU of a given hart array, not any given CPU.

Create a helper to retrieve the info for any given CPU, not the first
CPU of the hart array. The helper is using the same 32 bit check that
riscv_cpu_satp_mode_finalize() was doing.

Signed-off-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
---
 target/riscv/cpu.c | 7 ++-
 target/riscv/cpu.h | 1 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index a395c77bda..2b79fe861b 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -53,6 +53,11 @@ const uint32_t misa_bits[] = {RVI, RVE, RVM, RVA, RVF, RVD, 
RVV,
 #define BYTE(x)   (x)
 #endif
 
+bool riscv_cpu_is_32bit(RISCVCPU *cpu)
+{
+return riscv_cpu_mxl(&cpu->env) == MXL_RV32;
+}
+
 #define ISA_EXT_DATA_ENTRY(_name, _min_ver, _prop) \
 {#_name, _min_ver, CPU_CFG_OFFSET(_prop)}
 
@@ -980,7 +985,7 @@ static void riscv_cpu_disas_set_info(CPUState *s, 
disassemble_info *info)
 #ifndef CONFIG_USER_ONLY
 static void riscv_cpu_satp_mode_finalize(RISCVCPU *cpu, Error **errp)
 {
-bool rv32 = riscv_cpu_mxl(&cpu->env) == MXL_RV32;
+bool rv32 = riscv_cpu_is_32bit(cpu);
 uint8_t satp_mode_map_max, satp_mode_supported_max;
 
 /* The CPU wants the OS to decide which satp mode to use */
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 1f34eda1e4..485d2da3c2 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -695,6 +695,7 @@ void cpu_get_tb_cpu_state(CPURISCVState *env, vaddr *pc,
   uint64_t *cs_base, uint32_t *pflags);
 
 void riscv_cpu_update_mask(CPURISCVState *env);
+bool riscv_cpu_is_32bit(RISCVCPU *cpu);
 
 RISCVException riscv_csrrw(CPURISCVState *env, int csrno,
target_ulong *ret_value,
-- 
2.41.0




[PATCH for-9.0 v2 7/8] target/riscv: add RVA22S64 profile

2023-11-27 Thread Daniel Henrique Barboza
The RVA22S64 profile consists of the following:

- all mandatory extensions of RVA22U64;
- priv spec v1.12.0;
- satp mode sv39;
- Ssccptr, a cache related named feature that we're assuming always
  enable since we don't implement a cache;
- Other named features already implemented: Sstvecd, Sstvala,
  Sscounterenw;
- the new Svade named feature that was recently added.

Most of the work is already done, so this patch is enough to implement
the profile.

After this patch, the 'rva22s64' user flag alone can be used with the
rva64i CPU to boot Linux:

-cpu rv64i,rva22s64=true

This is the /proc/cpuinfo with this profile enabled:

 # cat /proc/cpuinfo
processor   : 0
hart: 0
isa : 
rv64imafdc_zicbom_zicbop_zicboz_zicntr_zicsr_zifencei_zihintpause_zihpm_zfhmin_zca_zcd_zba_zbb_zbs_zkt_svinval_svpbmt
mmu : sv39

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/cpu.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 6957b4b9be..fa056fb45f 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1560,8 +1560,40 @@ static RISCVCPUProfile RVA22U64 = {
 }
 };
 
+/*
+ * As with RVA22U64, RVA22S64 also defines 'named features'.
+ *
+ * Cache related features that we consider enabled since we don't
+ * implement cache: Ssccptr
+ *
+ * Other named features that we already implement: Sstvecd, Sstvala,
+ * Sscounterenw
+ *
+ * Named features that we need to enable: svade
+ *
+ * The remaining features/extensions comes from RVA22U64.
+ */
+static RISCVCPUProfile RVA22S64 = {
+.parent = &RVA22U64,
+.name = "rva22s64",
+.misa_ext = RVS,
+.priv_spec = PRIV_VERSION_1_12_0,
+.satp_mode = VM_1_10_SV39,
+.ext_offsets = {
+/* rva22s64 exts */
+CPU_CFG_OFFSET(ext_zifencei), CPU_CFG_OFFSET(ext_svpbmt),
+CPU_CFG_OFFSET(ext_svinval),
+
+/* rva22s64 named features */
+CPU_CFG_OFFSET(svade),
+
+RISCV_PROFILE_EXT_LIST_END
+}
+};
+
 RISCVCPUProfile *riscv_profiles[] = {
 &RVA22U64,
+&RVA22S64,
 NULL,
 };
 
-- 
2.41.0




[PATCH for-9.0 v2 6/8] target/riscv: add 'parent' in profile description

2023-11-27 Thread Daniel Henrique Barboza
Certain S-mode profiles, like RVA22S64 and RVA23S64, mandate all the
mandatory extensions of their respective U-mode profiles. RVA22S64
includes all mandatory extensions of RVA22U64, and the same happens with
RVA23 profiles.

Add a 'parent' field to allow profiles to enable other profiles. This
will allow us to describe S-mode profiles by specifying their parent
U-mode profile, then adding just the S-mode specific extensions.

We're naming the field 'parent' to consider the possibility of other
uses (e.g. a s-mode profile including a previous s-mode profile) in the
future.

Suggested-by: Andrew Jones 
Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/cpu.c |  1 +
 target/riscv/cpu.h |  1 +
 target/riscv/tcg/tcg-cpu.c | 14 +-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index a77118549b..6957b4b9be 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1540,6 +1540,7 @@ Property riscv_cpu_options[] = {
  * having a cfg offset) at this moment.
  */
 static RISCVCPUProfile RVA22U64 = {
+.parent = NULL,
 .name = "rva22u64",
 .misa_ext = RVI | RVM | RVA | RVF | RVD | RVC | RVU,
 .priv_spec = RISCV_PROFILE_ATTR_UNUSED,
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 6c5fceb5f5..44fb0a9ca8 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -77,6 +77,7 @@ const char *riscv_get_misa_ext_description(uint32_t bit);
 #define CPU_CFG_OFFSET(_prop) offsetof(struct RISCVCPUConfig, _prop)
 
 typedef struct riscv_cpu_profile {
+struct riscv_cpu_profile *parent;
 const char *name;
 uint32_t misa_ext;
 bool enabled;
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index 152f95718b..6284d36809 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -797,7 +797,7 @@ static void riscv_cpu_validate_profile(RISCVCPU *cpu,
 CPURISCVState *env = &cpu->env;
 const char *warn_msg = "Profile %s mandates disabled extension %s";
 bool send_warn = profile->user_set && profile->enabled;
-bool profile_impl = true;
+bool parent_enabled, profile_impl = true;
 int i;
 
 #ifndef CONFIG_USER_ONLY
@@ -850,6 +850,13 @@ static void riscv_cpu_validate_profile(RISCVCPU *cpu,
 }
 
 profile->enabled = profile_impl;
+
+if (profile->parent != NULL) {
+parent_enabled = object_property_get_bool(OBJECT(cpu),
+  profile->parent->name,
+  NULL);
+profile->enabled = profile->enabled && parent_enabled;
+}
 }
 
 static void riscv_cpu_validate_profiles(RISCVCPU *cpu)
@@ -1107,6 +1114,11 @@ static void cpu_set_profile(Object *obj, Visitor *v, 
const char *name,
 profile->user_set = true;
 profile->enabled = value;
 
+if (profile->parent != NULL) {
+object_property_set_bool(obj, profile->parent->name,
+ profile->enabled, NULL);
+}
+
 if (profile->enabled) {
 cpu->env.priv_ver = profile->priv_spec;
 }
-- 
2.41.0




[PATCH for-9.0 v2 3/8] target/riscv/cpu.c: finalize satp_mode earlier

2023-11-27 Thread Daniel Henrique Barboza
Profiles will need to validate satp_mode during their own finalize
methods. This will occur inside riscv_tcg_cpu_finalize_features() for
TCG. Given that satp_mode does not have any pre-req from the accelerator
finalize() method, it's safe to finalize it earlier.

Signed-off-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
---
 target/riscv/cpu.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 29a9f77702..a395c77bda 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1056,6 +1056,14 @@ void riscv_cpu_finalize_features(RISCVCPU *cpu, Error 
**errp)
 {
 Error *local_err = NULL;
 
+#ifndef CONFIG_USER_ONLY
+riscv_cpu_satp_mode_finalize(cpu, &local_err);
+if (local_err != NULL) {
+error_propagate(errp, local_err);
+return;
+}
+#endif
+
 /*
  * KVM accel does not have a specialized finalize()
  * callback because its extensions are validated
@@ -1068,14 +1076,6 @@ void riscv_cpu_finalize_features(RISCVCPU *cpu, Error 
**errp)
 return;
 }
 }
-
-#ifndef CONFIG_USER_ONLY
-riscv_cpu_satp_mode_finalize(cpu, &local_err);
-if (local_err != NULL) {
-error_propagate(errp, local_err);
-return;
-}
-#endif
 }
 
 static void riscv_cpu_realize(DeviceState *dev, Error **errp)
-- 
2.41.0




[PATCH for-9.0 v2 8/8] target/riscv: add rva22s64 cpu

2023-11-27 Thread Daniel Henrique Barboza
Add a new profile CPU 'rva22s64' to work as an alias of

-cpu rv64i,rva22s64

Like the existing rva22u64 CPU already does with the RVA22U64 profile.

Signed-off-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
---
 target/riscv/cpu-qom.h | 1 +
 target/riscv/cpu.c | 8 
 2 files changed, 9 insertions(+)

diff --git a/target/riscv/cpu-qom.h b/target/riscv/cpu-qom.h
index 12fe78fc52..9219c2fcc3 100644
--- a/target/riscv/cpu-qom.h
+++ b/target/riscv/cpu-qom.h
@@ -36,6 +36,7 @@
 #define TYPE_RISCV_CPU_BASE128  RISCV_CPU_TYPE_NAME("x-rv128")
 #define TYPE_RISCV_CPU_RV64IRISCV_CPU_TYPE_NAME("rv64i")
 #define TYPE_RISCV_CPU_RVA22U64 RISCV_CPU_TYPE_NAME("rva22u64")
+#define TYPE_RISCV_CPU_RVA22S64 RISCV_CPU_TYPE_NAME("rva22s64")
 #define TYPE_RISCV_CPU_IBEX RISCV_CPU_TYPE_NAME("lowrisc-ibex")
 #define TYPE_RISCV_CPU_SHAKTI_C RISCV_CPU_TYPE_NAME("shakti-c")
 #define TYPE_RISCV_CPU_SIFIVE_E31   RISCV_CPU_TYPE_NAME("sifive-e31")
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index fa056fb45f..99bdb96335 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1624,6 +1624,13 @@ static void rva22u64_profile_cpu_init(Object *obj)
 
 RVA22U64.enabled = true;
 }
+
+static void rva22s64_profile_cpu_init(Object *obj)
+{
+rv64i_bare_cpu_init(obj);
+
+RVA22S64.enabled = true;
+}
 #endif
 
 static const gchar *riscv_gdb_arch_name(CPUState *cs)
@@ -1968,6 +1975,7 @@ static const TypeInfo riscv_cpu_type_infos[] = {
 DEFINE_DYNAMIC_CPU(TYPE_RISCV_CPU_BASE128,  rv128_base_cpu_init),
 DEFINE_BARE_CPU(TYPE_RISCV_CPU_RV64I, rv64i_bare_cpu_init),
 DEFINE_PROFILE_CPU(TYPE_RISCV_CPU_RVA22U64, rva22u64_profile_cpu_init),
+DEFINE_PROFILE_CPU(TYPE_RISCV_CPU_RVA22S64, rva22s64_profile_cpu_init),
 #endif
 };
 
-- 
2.41.0




[PATCH for-9.0 v2 5/8] target/riscv: add satp_mode profile support

2023-11-27 Thread Daniel Henrique Barboza
'satp_mode' is a requirement for supervisor profiles like RVA22S64.
User-mode/application profiles like RVA22U64 doesn't care.

Add 'satp_mode' to the profile description. If a profile requires it,
set it during cpu_set_profile(). We'll also check it during finalize()
to validate if the running config implements the profile.

Signed-off-by: Daniel Henrique Barboza 
Reviewed-by: Andrew Jones 
---
 target/riscv/cpu.c |  1 +
 target/riscv/cpu.h |  1 +
 target/riscv/tcg/tcg-cpu.c | 40 ++
 3 files changed, 42 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 2b79fe861b..a77118549b 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1543,6 +1543,7 @@ static RISCVCPUProfile RVA22U64 = {
 .name = "rva22u64",
 .misa_ext = RVI | RVM | RVA | RVF | RVD | RVC | RVU,
 .priv_spec = RISCV_PROFILE_ATTR_UNUSED,
+.satp_mode = RISCV_PROFILE_ATTR_UNUSED,
 .ext_offsets = {
 CPU_CFG_OFFSET(ext_zicsr), CPU_CFG_OFFSET(ext_zihintpause),
 CPU_CFG_OFFSET(ext_zba), CPU_CFG_OFFSET(ext_zbb),
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 485d2da3c2..6c5fceb5f5 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -82,6 +82,7 @@ typedef struct riscv_cpu_profile {
 bool enabled;
 bool user_set;
 int priv_spec;
+int satp_mode;
 const int32_t ext_offsets[];
 } RISCVCPUProfile;
 
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index 4d25fc43d2..152f95718b 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -766,6 +766,31 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, 
Error **errp)
 riscv_cpu_disable_priv_spec_isa_exts(cpu);
 }
 
+#ifndef CONFIG_USER_ONLY
+static bool riscv_cpu_validate_profile_satp(RISCVCPU *cpu,
+RISCVCPUProfile *profile,
+bool send_warn)
+{
+int satp_max = satp_mode_max_from_map(cpu->cfg.satp_mode.supported);
+
+if (profile->satp_mode > satp_max) {
+if (send_warn) {
+bool is_32bit = riscv_cpu_is_32bit(cpu);
+const char *req_satp = satp_mode_str(profile->satp_mode, is_32bit);
+const char *cur_satp = satp_mode_str(satp_max, is_32bit);
+
+warn_report("Profile %s requires satp mode %s, "
+"but satp mode %s was set", profile->name,
+req_satp, cur_satp);
+}
+
+return false;
+}
+
+return true;
+}
+#endif
+
 static void riscv_cpu_validate_profile(RISCVCPU *cpu,
RISCVCPUProfile *profile)
 {
@@ -775,6 +800,13 @@ static void riscv_cpu_validate_profile(RISCVCPU *cpu,
 bool profile_impl = true;
 int i;
 
+#ifndef CONFIG_USER_ONLY
+if (profile->satp_mode != RISCV_PROFILE_ATTR_UNUSED) {
+profile_impl = riscv_cpu_validate_profile_satp(cpu, profile,
+   send_warn);
+}
+#endif
+
 if (profile->priv_spec != RISCV_PROFILE_ATTR_UNUSED &&
 profile->priv_spec != env->priv_ver) {
 profile_impl = false;
@@ -1079,6 +,14 @@ static void cpu_set_profile(Object *obj, Visitor *v, 
const char *name,
 cpu->env.priv_ver = profile->priv_spec;
 }
 
+#ifndef CONFIG_USER_ONLY
+if (profile->satp_mode != RISCV_PROFILE_ATTR_UNUSED) {
+const char *satp_prop = satp_mode_str(profile->satp_mode,
+  riscv_cpu_is_32bit(cpu));
+object_property_set_bool(obj, satp_prop, profile->enabled, NULL);
+}
+#endif
+
 for (i = 0; misa_bits[i] != 0; i++) {
 uint32_t bit = misa_bits[i];
 
-- 
2.41.0




Re: [PATCH v2 for-8.2] ppc/amigaone: Allow running AmigaOS without firmware image

2023-11-27 Thread BALATON Zoltan

On Mon, 27 Nov 2023, Nicholas Piggin wrote:

On Sun Nov 26, 2023 at 2:34 AM AEST, BALATON Zoltan wrote:

The machine uses a modified U-Boot under GPL license but the sources
of it are lost with only a binary available so it cannot be included
in QEMU. Allow running without the firmware image with -bios none
which can be used when calling a boot loader directly and thus
simplifying booting guests. We need a small routine that AmigaOS calls
from ROM which is added in this case to allow booting AmigaOS without
external firmware image.

Signed-off-by: BALATON Zoltan 
---
v2: Unfortunately AmigaOS needs some additional ROM part which is added
Please merge for 8.2 as it allows booting AmigaOS simpler without
having to download separate firmware.


How to test this?


You can check with -M amigaone -bios none then from QEMU monitor
(qemu) xp/10i 0xfff7ff80


 hw/ppc/amigaone.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/amigaone.c b/hw/ppc/amigaone.c
index 992a55e632..a11d2d5556 100644
--- a/hw/ppc/amigaone.c
+++ b/hw/ppc/amigaone.c
@@ -40,6 +40,16 @@
 #define PROM_ADDR 0xfff0
 #define PROM_SIZE (512 * KiB)

+/* AmigaOS calls this routine from ROM, use this if -bios none */
+static const char dummy_fw[] = {
+0x38, 0x00, 0x00, 0x08, /* li  r0,8 */
+0x7c, 0x09, 0x03, 0xa6, /* mtctr   r0 */
+0x54, 0x63, 0xf8, 0x7e, /* srwir3,r3,1 */
+0x42, 0x00, 0xff, 0xfc, /* bdnz0x8 */
+0x7c, 0x63, 0x18, 0xf8, /* not r3,r3 */
+0x4e, 0x80, 0x00, 0x20, /* blr */
+};


This is clever, but does anything else create blobs like this?


There are some examples in hw/mips/{loongson3_virt.c, malta.c} at least 
and maybe others that put code in guest memory. If this was longer than 
this few instructions I'd consider putting it in a binary but this seems 
simpler for such small code.



It could be put into a .S in pc-bios, which might be a bit more
consistent.

We might make a ppc/ subdirectory under there, but that's for
another time.


Maybe later we could reorganise these unless it's really necessary to 
change this for 8.2 now. (The mips boards and some arm and riscv machines 
seem to use rom_add_blob_fixed() which sould show up in info roms under 
monitor so maybe I could look at changing to use that now if you think it 
would be better that way.)


Regards,
BALATON Zoltan


Thanks,
Nick


+
 static void amigaone_cpu_reset(void *opaque)
 {
 PowerPCCPU *cpu = opaque;
@@ -94,17 +104,21 @@ static void amigaone_init(MachineState *machine)
 }

 /* allocate and load firmware */
+rom = g_new(MemoryRegion, 1);
+memory_region_init_rom(rom, NULL, "rom", PROM_SIZE, &error_fatal);
+memory_region_add_subregion(get_system_memory(), PROM_ADDR, rom);
 filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, fwname);
 if (filename) {
-rom = g_new(MemoryRegion, 1);
-memory_region_init_rom(rom, NULL, "rom", PROM_SIZE, &error_fatal);
-memory_region_add_subregion(get_system_memory(), PROM_ADDR, rom);
 sz = load_image_targphys(filename, PROM_ADDR, PROM_SIZE);
 if (sz <= 0 || sz > PROM_SIZE) {
 error_report("Could not load firmware '%s'", filename);
 exit(1);
 }
 g_free(filename);
+} else if (!strcmp(fwname, "none")) {
+address_space_write_rom(&address_space_memory, 0xfff7ff80,
+MEMTXATTRS_UNSPECIFIED, dummy_fw,
+ARRAY_SIZE(dummy_fw));
 } else if (!qtest_enabled()) {
 error_report("Could not find firmware '%s'", fwname);
 exit(1);








Re: [PATCH for-8.2] export/vhost-user-blk: Fix consecutive drains

2023-11-27 Thread Kevin Wolf
Am 24.11.2023 um 18:44 hat Kevin Wolf geschrieben:
> The vhost-user-blk export implement AioContext switches in its drain
> implementation. This means that on drain_begin, it detaches the server
> from its AioContext and on drain_end, attaches it again and schedules
> the server->co_trip coroutine in the updated AioContext.
> 
> However, nothing guarantees that server->co_trip is even safe to be
> scheduled. Not only is it unclear that the coroutine is actually in a
> state where it can be reentered externally without causing problems, but
> with two consecutive drains, it is possible that the scheduled coroutine
> didn't have a chance yet to run and trying to schedule an already
> scheduled coroutine a second time crashes with an assertion failure.
> 
> Following the model of NBD, this commit makes the vhost-user-blk export
> shut down server->co_trip during drain so that resuming the export means
> creating and scheduling a new coroutine, which is always safe.
> 
> There is one exception: If the drain call didn't poll (for example, this
> happens in the context of bdrv_graph_wrlock()), then the coroutine
> didn't have a chance to shut down. However, in this case the AioContext
> can't have changed; changing the AioContext always involves a polling
> drain. So in this case we can simply assert that the AioContext is
> unchanged and just leave the coroutine running or wake it up if it has
> yielded to wait for the AioContext to be attached again.
> 
> Fixes: e1054cd4aad03a493a5d1cded7508f7c348205bf
> Fixes: https://issues.redhat.com/browse/RHEL-1708
> Signed-off-by: Kevin Wolf 
> ---
>  include/qemu/vhost-user-server.h |  2 ++
>  block/export/vhost-user-blk-server.c |  9 +--
>  util/vhost-user-server.c | 36 +++-
>  3 files changed, 39 insertions(+), 8 deletions(-)
> 
> diff --git a/include/qemu/vhost-user-server.h 
> b/include/qemu/vhost-user-server.h
> index 64ad701015..ca1713b53e 100644
> --- a/include/qemu/vhost-user-server.h
> +++ b/include/qemu/vhost-user-server.h
> @@ -45,6 +45,8 @@ typedef struct {
>  /* Protected by ctx lock */
>  bool in_qio_channel_yield;
>  bool wait_idle;
> +bool quiescing;
> +bool wake_on_ctx_attach;
>  VuDev vu_dev;
>  QIOChannel *ioc; /* The I/O channel with the client */
>  QIOChannelSocket *sioc; /* The underlying data channel with the client */
> diff --git a/block/export/vhost-user-blk-server.c 
> b/block/export/vhost-user-blk-server.c
> index fe2cee3a78..16f48388d3 100644
> --- a/block/export/vhost-user-blk-server.c
> +++ b/block/export/vhost-user-blk-server.c
> @@ -283,6 +283,7 @@ static void vu_blk_drained_begin(void *opaque)
>  {
>  VuBlkExport *vexp = opaque;
>  
> +vexp->vu_server.quiescing = true;
>  vhost_user_server_detach_aio_context(&vexp->vu_server);
>  }
>  
> @@ -291,19 +292,23 @@ static void vu_blk_drained_end(void *opaque)
>  {
>  VuBlkExport *vexp = opaque;
>  
> +vexp->vu_server.quiescing = false;
>  vhost_user_server_attach_aio_context(&vexp->vu_server, vexp->export.ctx);
>  }
>  
>  /*
> - * Ensures that bdrv_drained_begin() waits until in-flight requests complete.
> + * Ensures that bdrv_drained_begin() waits until in-flight requests complete
> + * and the server->co_trip coroutine has terminated. It will be restarted in
> + * vhost_user_server_attach_aio_context().
>   *
>   * Called with vexp->export.ctx acquired.
>   */
>  static bool vu_blk_drained_poll(void *opaque)
>  {
>  VuBlkExport *vexp = opaque;
> +VuServer *server = &vexp->vu_server;
>  
> -return vhost_user_server_has_in_flight(&vexp->vu_server);
> +return server->co_trip || vhost_user_server_has_in_flight(server);
>  }
>  
>  static const BlockDevOps vu_blk_dev_ops = {
> diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
> index 5ccc6d24a0..23004d0c62 100644
> --- a/util/vhost-user-server.c
> +++ b/util/vhost-user-server.c
> @@ -133,7 +133,9 @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg 
> *vmsg)
>  server->in_qio_channel_yield = false;
>  } else {
>  /* Wait until attached to an AioContext again */
> +server->wake_on_ctx_attach = true;
>  qemu_coroutine_yield();
> +assert(!server->wake_on_ctx_attach);
>  }

Yielding here isn't good enough as drained_poll waits for the coroutine
to terminate, and if the coroutine is here, it will hang. v2 will return
instead.

Kevin




[PATCH for-8.2 v2] export/vhost-user-blk: Fix consecutive drains

2023-11-27 Thread Kevin Wolf
The vhost-user-blk export implement AioContext switches in its drain
implementation. This means that on drain_begin, it detaches the server
from its AioContext and on drain_end, attaches it again and schedules
the server->co_trip coroutine in the updated AioContext.

However, nothing guarantees that server->co_trip is even safe to be
scheduled. Not only is it unclear that the coroutine is actually in a
state where it can be reentered externally without causing problems, but
with two consecutive drains, it is possible that the scheduled coroutine
didn't have a chance yet to run and trying to schedule an already
scheduled coroutine a second time crashes with an assertion failure.

Following the model of NBD, this commit makes the vhost-user-blk export
shut down server->co_trip during drain so that resuming the export means
creating and scheduling a new coroutine, which is always safe.

There is one exception: If the drain call didn't poll (for example, this
happens in the context of bdrv_graph_wrlock()), then the coroutine
didn't have a chance to shut down. However, in this case the AioContext
can't have changed; changing the AioContext always involves a polling
drain. So in this case we can simply assert that the AioContext is
unchanged and just leave the coroutine running or wake it up if it has
yielded to wait for the AioContext to be attached again.

Fixes: e1054cd4aad03a493a5d1cded7508f7c348205bf
Fixes: https://issues.redhat.com/browse/RHEL-1708
Signed-off-by: Kevin Wolf 
---
 include/qemu/vhost-user-server.h |  1 +
 block/export/vhost-user-blk-server.c |  9 +--
 util/vhost-user-server.c | 39 ++--
 3 files changed, 39 insertions(+), 10 deletions(-)

diff --git a/include/qemu/vhost-user-server.h b/include/qemu/vhost-user-server.h
index 64ad701015..0417ec0533 100644
--- a/include/qemu/vhost-user-server.h
+++ b/include/qemu/vhost-user-server.h
@@ -45,6 +45,7 @@ typedef struct {
 /* Protected by ctx lock */
 bool in_qio_channel_yield;
 bool wait_idle;
+bool quiescing;
 VuDev vu_dev;
 QIOChannel *ioc; /* The I/O channel with the client */
 QIOChannelSocket *sioc; /* The underlying data channel with the client */
diff --git a/block/export/vhost-user-blk-server.c 
b/block/export/vhost-user-blk-server.c
index fe2cee3a78..16f48388d3 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -283,6 +283,7 @@ static void vu_blk_drained_begin(void *opaque)
 {
 VuBlkExport *vexp = opaque;
 
+vexp->vu_server.quiescing = true;
 vhost_user_server_detach_aio_context(&vexp->vu_server);
 }
 
@@ -291,19 +292,23 @@ static void vu_blk_drained_end(void *opaque)
 {
 VuBlkExport *vexp = opaque;
 
+vexp->vu_server.quiescing = false;
 vhost_user_server_attach_aio_context(&vexp->vu_server, vexp->export.ctx);
 }
 
 /*
- * Ensures that bdrv_drained_begin() waits until in-flight requests complete.
+ * Ensures that bdrv_drained_begin() waits until in-flight requests complete
+ * and the server->co_trip coroutine has terminated. It will be restarted in
+ * vhost_user_server_attach_aio_context().
  *
  * Called with vexp->export.ctx acquired.
  */
 static bool vu_blk_drained_poll(void *opaque)
 {
 VuBlkExport *vexp = opaque;
+VuServer *server = &vexp->vu_server;
 
-return vhost_user_server_has_in_flight(&vexp->vu_server);
+return server->co_trip || vhost_user_server_has_in_flight(server);
 }
 
 static const BlockDevOps vu_blk_dev_ops = {
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 5ccc6d24a0..a9a48fffb8 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -132,8 +132,7 @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg 
*vmsg)
 qio_channel_yield(ioc, G_IO_IN);
 server->in_qio_channel_yield = false;
 } else {
-/* Wait until attached to an AioContext again */
-qemu_coroutine_yield();
+return false;
 }
 continue;
 } else {
@@ -201,8 +200,16 @@ static coroutine_fn void vu_client_trip(void *opaque)
 VuServer *server = opaque;
 VuDev *vu_dev = &server->vu_dev;
 
-while (!vu_dev->broken && vu_dispatch(vu_dev)) {
-/* Keep running */
+while (!vu_dev->broken) {
+if (server->quiescing) {
+server->co_trip = NULL;
+aio_wait_kick();
+return;
+}
+/* vu_dispatch() returns false if server->ctx went away */
+if (!vu_dispatch(vu_dev) && server->ctx) {
+break;
+}
 }
 
 if (vhost_user_server_has_in_flight(server)) {
@@ -353,8 +360,7 @@ static void vu_accept(QIONetListener *listener, 
QIOChannelSocket *sioc,
 
 qio_channel_set_follow_coroutine_ctx(server->ioc, true);
 
-server->co_trip = qemu_coroutine_create(vu_client_trip, server);
-
+/* Attaching the AioContext starts 

Re: [PATCH for-9.0 v2 2/8] target/riscv: add priv ver restriction to profiles

2023-11-27 Thread Andrew Jones
On Mon, Nov 27, 2023 at 08:37:46AM -0300, Daniel Henrique Barboza wrote:
> Some profiles, like RVA22S64, has a priv_spec requirement.
> 
> Make this requirement explicit for all profiles. We'll validate this
> requirement finalize() time and, in case the user chooses an
> incompatible priv_spec while activating a profile, a warning will be
> shown.
> 
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  target/riscv/cpu.c |  1 +
>  target/riscv/cpu.h |  2 ++
>  target/riscv/tcg/tcg-cpu.c | 31 +++
>  3 files changed, 34 insertions(+)
>

Reviewed-by: Andrew Jones 



Re: [PATCH] iotests: fix default MT detection

2023-11-27 Thread Kevin Wolf
Am 22.11.2023 um 13:15 hat Andrey Drobyshev geschrieben:
> MT is being detected based on "-M help" output, and we're searching for
> the line ending with " (default)".  However, in downstream one of the
> MTs marked as deprecated might become the default, in which case this
> logic breaks as the line would now end with " (default) (deprecated)".
> To fix potential issues here, let's relax that requirement and detect
> the mere presence of " (default)" line instead.
> 
> Signed-off-by: Andrey Drobyshev 

Thanks, applied to the block branch. (I did however change "MT" to
"machine type" in the commit message because at first I was confused
what you meant with it.)

Kevin




Re: [PATCH for-9.0 v2 6/8] target/riscv: add 'parent' in profile description

2023-11-27 Thread Andrew Jones
On Mon, Nov 27, 2023 at 08:37:50AM -0300, Daniel Henrique Barboza wrote:
> Certain S-mode profiles, like RVA22S64 and RVA23S64, mandate all the
> mandatory extensions of their respective U-mode profiles. RVA22S64
> includes all mandatory extensions of RVA22U64, and the same happens with
> RVA23 profiles.
> 
> Add a 'parent' field to allow profiles to enable other profiles. This
> will allow us to describe S-mode profiles by specifying their parent
> U-mode profile, then adding just the S-mode specific extensions.
> 
> We're naming the field 'parent' to consider the possibility of other
> uses (e.g. a s-mode profile including a previous s-mode profile) in the
> future.
> 
> Suggested-by: Andrew Jones 
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  target/riscv/cpu.c |  1 +
>  target/riscv/cpu.h |  1 +
>  target/riscv/tcg/tcg-cpu.c | 14 +-
>  3 files changed, 15 insertions(+), 1 deletion(-)
>

Reviewed-by: Andrew Jones 



Re: [PATCH for-9.0 v2 7/8] target/riscv: add RVA22S64 profile

2023-11-27 Thread Andrew Jones
On Mon, Nov 27, 2023 at 08:37:51AM -0300, Daniel Henrique Barboza wrote:
> The RVA22S64 profile consists of the following:
> 
> - all mandatory extensions of RVA22U64;
> - priv spec v1.12.0;
> - satp mode sv39;
> - Ssccptr, a cache related named feature that we're assuming always
>   enable since we don't implement a cache;
> - Other named features already implemented: Sstvecd, Sstvala,
>   Sscounterenw;
> - the new Svade named feature that was recently added.
> 
> Most of the work is already done, so this patch is enough to implement
> the profile.
> 
> After this patch, the 'rva22s64' user flag alone can be used with the
> rva64i CPU to boot Linux:
> 
> -cpu rv64i,rva22s64=true
> 
> This is the /proc/cpuinfo with this profile enabled:
> 
>  # cat /proc/cpuinfo
> processor : 0
> hart  : 0
> isa   : 
> rv64imafdc_zicbom_zicbop_zicboz_zicntr_zicsr_zifencei_zihintpause_zihpm_zfhmin_zca_zcd_zba_zbb_zbs_zkt_svinval_svpbmt
> mmu   : sv39
> 
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  target/riscv/cpu.c | 32 
>  1 file changed, 32 insertions(+)
>

Reviewed-by: Andrew Jones 



hanging process with commit 69562648f9 ("vl: revert behaviour for -display none")

2023-11-27 Thread Sebastian Ott

Hej,

qemu fails to start a guest using the following command (the process just
hangs): qemu-system-aarch64 -machine virt -cpu host -smp 4 -m 8192 
-kernel /boot/vmlinuz-6.7.0-rc1 -initrd ~/basic.img -append "root=/dev/ram 
console=ttyAMA0" -enable-kvm -device virtio-gpu,hostmem=2G -display none


..which I've used to debug a potential virtio-gpu issue. Bisect points to
69562648f9 ("vl: revert behaviour for -display none")

Is that qemu cmd just invalid and shouldn't have worked in the first
place?

Sebastian




[PATCH] hostmem: Round up memory size for qemu_madvise() in host_memory_backend_memory_complete()

2023-11-27 Thread Michal Privoznik
Simple reproducer:
qemu.git $ ./build/qemu-system-x86_64 \
-m size=8389632k,slots=16,maxmem=2560k \
-object 
'{"qom-type":"memory-backend-file","id":"ram-node0","mem-path":"/hugepages2M/","prealloc":true,"size":8590983168,"host-nodes":[0],"policy":"bind"}'
 \
-numa node,nodeid=0,cpus=0,memdev=ram-node0

With current master I get:

qemu-system-x86_64: cannot bind memory to host NUMA nodes: Invalid argument

The problem is that memory size (8193MiB) is not an integer
multiple of underlying pagesize (2MiB) which triggers a check
inside of madvise(), since we can't really set a madvise() policy
just to a fraction of a page.

Signed-off-by: Michal Privoznik 
---
 backends/hostmem.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 747e7838c0..4e88d048de 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -326,9 +326,10 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
Error **errp)
 HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
 Error *local_err = NULL;
 void *ptr;
-uint64_t sz;
 
 if (bc->alloc) {
+uint64_t sz;
+
 bc->alloc(backend, &local_err);
 if (local_err) {
 goto out;
@@ -337,6 +338,11 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
Error **errp)
 ptr = memory_region_get_ram_ptr(&backend->mr);
 sz = memory_region_size(&backend->mr);
 
+/* Round up size to be an integer multiple of pagesize, because
+ * madvise() does not really like setting advices on a fraction of a
+ * page. */
+sz = ROUND_UP(sz, qemu_ram_pagesize(backend->mr.ram_block));
+
 if (backend->merge) {
 qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE);
 }
-- 
2.41.0




Re: hanging process with commit 69562648f9 ("vl: revert behaviour for -display none")

2023-11-27 Thread Peter Maydell
On Mon, 27 Nov 2023 at 12:29, Sebastian Ott  wrote:
>
> Hej,
>
> qemu fails to start a guest using the following command (the process just
> hangs): qemu-system-aarch64 -machine virt -cpu host -smp 4 -m 8192
> -kernel /boot/vmlinuz-6.7.0-rc1 -initrd ~/basic.img -append "root=/dev/ram
> console=ttyAMA0" -enable-kvm -device virtio-gpu,hostmem=2G -display none
>
> ..which I've used to debug a potential virtio-gpu issue. Bisect points to
> 69562648f9 ("vl: revert behaviour for -display none")

Is it actually hanging, or is the guest starting up fine but
outputting to a serial port which you haven't directed anywhere?
The commandline is a bit odd because it doesn't set up any of:
 * a serial terminal
 * a graphical window/display
 * network forwarding that would allow ssh into the guest

If you add '-serial stdio' do you see the guest output?

thanks
-- PMM



Re: [PATCH v2 for-8.2] ppc/amigaone: Allow running AmigaOS without firmware image

2023-11-27 Thread Nicholas Piggin
On Mon Nov 27, 2023 at 9:43 PM AEST, BALATON Zoltan wrote:
> On Mon, 27 Nov 2023, Nicholas Piggin wrote:
> > On Sun Nov 26, 2023 at 2:34 AM AEST, BALATON Zoltan wrote:
> >> The machine uses a modified U-Boot under GPL license but the sources
> >> of it are lost with only a binary available so it cannot be included
> >> in QEMU. Allow running without the firmware image with -bios none
> >> which can be used when calling a boot loader directly and thus
> >> simplifying booting guests. We need a small routine that AmigaOS calls
> >> from ROM which is added in this case to allow booting AmigaOS without
> >> external firmware image.
> >>
> >> Signed-off-by: BALATON Zoltan 
> >> ---
> >> v2: Unfortunately AmigaOS needs some additional ROM part which is added
> >> Please merge for 8.2 as it allows booting AmigaOS simpler without
> >> having to download separate firmware.
> >
> > How to test this?
>
> You can check with -M amigaone -bios none then from QEMU monitor
> (qemu) xp/10i 0xfff7ff80

Okay, so to fully test it you really need a non-free AmigaOS image?

And, how does a user know how or when to use this? Should it just be
default if there is no bios binary found?

>
> >>  hw/ppc/amigaone.c | 20 +---
> >>  1 file changed, 17 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/hw/ppc/amigaone.c b/hw/ppc/amigaone.c
> >> index 992a55e632..a11d2d5556 100644
> >> --- a/hw/ppc/amigaone.c
> >> +++ b/hw/ppc/amigaone.c
> >> @@ -40,6 +40,16 @@
> >>  #define PROM_ADDR 0xfff0
> >>  #define PROM_SIZE (512 * KiB)
> >>
> >> +/* AmigaOS calls this routine from ROM, use this if -bios none */
> >> +static const char dummy_fw[] = {
> >> +0x38, 0x00, 0x00, 0x08, /* li  r0,8 */
> >> +0x7c, 0x09, 0x03, 0xa6, /* mtctr   r0 */
> >> +0x54, 0x63, 0xf8, 0x7e, /* srwir3,r3,1 */
> >> +0x42, 0x00, 0xff, 0xfc, /* bdnz0x8 */
> >> +0x7c, 0x63, 0x18, 0xf8, /* not r3,r3 */
> >> +0x4e, 0x80, 0x00, 0x20, /* blr */
> >> +};
> >
> > This is clever, but does anything else create blobs like this?
>
> There are some examples in hw/mips/{loongson3_virt.c, malta.c} at least 
> and maybe others that put code in guest memory. If this was longer than 
> this few instructions I'd consider putting it in a binary but this seems 
> simpler for such small code.

Thanks, compiling blob inline looks fine then.

It's 0x80 bytes from the end of prom AFAIKS. Should you make that
PROM_ADDR + PROM_SIZE - 0x80 instead of hard coding it?

>
> > It could be put into a .S in pc-bios, which might be a bit more
> > consistent.
> >
> > We might make a ppc/ subdirectory under there, but that's for
> > another time.
>
> Maybe later we could reorganise these unless it's really necessary to 
> change this for 8.2 now.

Nah that's fine.

> (The mips boards and some arm and riscv machines 
> seem to use rom_add_blob_fixed() which sould show up in info roms under 
> monitor so maybe I could look at changing to use that now if you think it 
> would be better that way.)

I'm not sure, I don't think it's necessary if your minimal patch works.

I'll do a PR for 8.2 for SLOF and Skiboot updates, so happy to include
this as well.

Thanks,
Nick

>
> Regards,
> BALATON Zoltan
>
> > Thanks,
> > Nick
> >
> >> +
> >>  static void amigaone_cpu_reset(void *opaque)
> >>  {
> >>  PowerPCCPU *cpu = opaque;
> >> @@ -94,17 +104,21 @@ static void amigaone_init(MachineState *machine)
> >>  }
> >>
> >>  /* allocate and load firmware */
> >> +rom = g_new(MemoryRegion, 1);
> >> +memory_region_init_rom(rom, NULL, "rom", PROM_SIZE, &error_fatal);
> >> +memory_region_add_subregion(get_system_memory(), PROM_ADDR, rom);
> >>  filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, fwname);
> >>  if (filename) {
> >> -rom = g_new(MemoryRegion, 1);
> >> -memory_region_init_rom(rom, NULL, "rom", PROM_SIZE, &error_fatal);
> >> -memory_region_add_subregion(get_system_memory(), PROM_ADDR, rom);
> >>  sz = load_image_targphys(filename, PROM_ADDR, PROM_SIZE);
> >>  if (sz <= 0 || sz > PROM_SIZE) {
> >>  error_report("Could not load firmware '%s'", filename);
> >>  exit(1);
> >>  }
> >>  g_free(filename);
> >> +} else if (!strcmp(fwname, "none")) {
> >> +address_space_write_rom(&address_space_memory, 0xfff7ff80,
> >> +MEMTXATTRS_UNSPECIFIED, dummy_fw,
> >> +ARRAY_SIZE(dummy_fw));
> >>  } else if (!qtest_enabled()) {
> >>  error_report("Could not find firmware '%s'", fwname);
> >>  exit(1);
> >
> >
> >




Re: [RFC] docs/s390: Fix wrong command example in s390-cpu-topology.rst

2023-11-27 Thread Nina Schoetterl-Glausch
On Mon, 2023-11-27 at 17:39 +0800, Zhao Liu wrote:
> From: Zhao Liu 
> 
> From s390_possible_cpu_arch_ids() in hw/s390x/s390-virtio-ccw.c, the
> "core-id" is the index of pssible_cpus->cpus[], so it should only be

s/pssible_cpus/possible_cpus/

> less than possible_cpus->len, which is equal to ms->smp.max_cpus.
> 
> Fix the wrong "core-id" 112 because it is greater than maxcpus (36) in

Maybe s/is greater/isn't less/ since the valid ids are 0-35 inclusive.

> -smp.
> 
> Cc: Nina Schoetterl-Glausch 
> Signed-off-by: Zhao Liu 

Reviewed-by: Nina Schoetterl-Glausch 

Thanks!
> ---
> RFC: Not tested on S390 machine, just code reading.
> ---
>  docs/devel/s390-cpu-topology.rst | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/docs/devel/s390-cpu-topology.rst 
> b/docs/devel/s390-cpu-topology.rst
> index 9eab28d5e5d8..48313b92d417 100644
> --- a/docs/devel/s390-cpu-topology.rst
> +++ b/docs/devel/s390-cpu-topology.rst
> @@ -15,7 +15,7 @@ have default values:
>  -smp 1,drawers=3,books=3,sockets=2,cores=2,maxcpus=36 \
>  -device z14-s390x-cpu,core-id=19,entitlement=high \
>  -device z14-s390x-cpu,core-id=11,entitlement=low \
> --device z14-s390x-cpu,core-id=112,entitlement=high \
> +-device z14-s390x-cpu,core-id=12,entitlement=high \
> ...
>  
>  Additions to query-cpus-fast
> @@ -78,7 +78,7 @@ modifiers for all configured vCPUs.
>"dedicated": true,
>"thread-id": 537005,
>"props": {
> -"core-id": 112,
> +"core-id": 12,
>  "socket-id": 0,
>  "drawer-id": 3,
>  "book-id": 2
> @@ -86,7 +86,7 @@ modifiers for all configured vCPUs.
>"cpu-state": "operating",
>"entitlement": "high",
>"qom-path": "/machine/peripheral-anon/device[2]",
> -  "cpu-index": 112,
> +  "cpu-index": 12,
>"target": "s390x"
>  }
>]




[PATCH v2] avr: Fix wrong initial value of stack pointer

2023-11-27 Thread Gihun Nam
The current implementation initializes the stack pointer of AVR devices
to 0. Although older AVR devices used to be like that, newer ones set
it to RAMEND.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1525
Signed-off-by: Gihun Nam 
---
Edit code to use QOM property and add more description to commit message
about the changes

Thanks for the detailed help, Mr. Peter!

P.S. I don't understand how replies work with git send-email, so
 if I've done something wrong, please bear with me.

 hw/avr/atmega.c  |  4 
 target/avr/cpu.c | 10 +-
 target/avr/cpu.h |  3 +++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/hw/avr/atmega.c b/hw/avr/atmega.c
index a34803e642..31c8992d75 100644
--- a/hw/avr/atmega.c
+++ b/hw/avr/atmega.c
@@ -233,6 +233,10 @@ static void atmega_realize(DeviceState *dev, Error **errp)
 
 /* CPU */
 object_initialize_child(OBJECT(dev), "cpu", &s->cpu, mc->cpu_type);
+
+object_property_set_uint(OBJECT(&s->cpu), "init-sp",
+ mc->io_size + mc->sram_size - 1, &error_abort);
+
 qdev_realize(DEVICE(&s->cpu), NULL, &error_abort);
 cpudev = DEVICE(&s->cpu);
 
diff --git a/target/avr/cpu.c b/target/avr/cpu.c
index 44de1e18d1..999c010ded 100644
--- a/target/avr/cpu.c
+++ b/target/avr/cpu.c
@@ -25,6 +25,7 @@
 #include "cpu.h"
 #include "disas/dis-asm.h"
 #include "tcg/debug-assert.h"
+#include "hw/qdev-properties.h"
 
 static void avr_cpu_set_pc(CPUState *cs, vaddr value)
 {
@@ -95,7 +96,7 @@ static void avr_cpu_reset_hold(Object *obj)
 env->rampY = 0;
 env->rampZ = 0;
 env->eind = 0;
-env->sp = 0;
+env->sp = cpu->init_sp;
 
 env->skip = 0;
 
@@ -152,6 +153,11 @@ static void avr_cpu_initfn(Object *obj)
   sizeof(cpu->env.intsrc) * 8);
 }
 
+static Property avr_cpu_properties[] = {
+DEFINE_PROP_UINT32("init-sp", AVRCPU, init_sp, 0),
+DEFINE_PROP_END_OF_LIST()
+};
+
 static ObjectClass *avr_cpu_class_by_name(const char *cpu_model)
 {
 ObjectClass *oc;
@@ -228,6 +234,8 @@ static void avr_cpu_class_init(ObjectClass *oc, void *data)
 
 device_class_set_parent_realize(dc, avr_cpu_realizefn, 
&mcc->parent_realize);
 
+device_class_set_props(dc, avr_cpu_properties);
+
 resettable_class_set_parent_phases(rc, NULL, avr_cpu_reset_hold, NULL,
&mcc->parent_phases);
 
diff --git a/target/avr/cpu.h b/target/avr/cpu.h
index 8a17862737..7960c5c57a 100644
--- a/target/avr/cpu.h
+++ b/target/avr/cpu.h
@@ -145,6 +145,9 @@ struct ArchCPU {
 CPUState parent_obj;
 
 CPUAVRState env;
+
+/* Initial value of stack pointer */
+uint32_t init_sp;
 };
 
 /**
-- 
2.39.2




Re: hanging process with commit 69562648f9 ("vl: revert behaviour for -display none")

2023-11-27 Thread Sebastian Ott

On Mon, 27 Nov 2023, Peter Maydell wrote:

On Mon, 27 Nov 2023 at 12:29, Sebastian Ott  wrote:

qemu fails to start a guest using the following command (the process just
hangs): qemu-system-aarch64 -machine virt -cpu host -smp 4 -m 8192
-kernel /boot/vmlinuz-6.7.0-rc1 -initrd ~/basic.img -append "root=/dev/ram
console=ttyAMA0" -enable-kvm -device virtio-gpu,hostmem=2G -display none

..which I've used to debug a potential virtio-gpu issue. Bisect points to
69562648f9 ("vl: revert behaviour for -display none")


Is it actually hanging, or is the guest starting up fine but
outputting to a serial port which you haven't directed anywhere?


Ough, that's indeed the case. I only had a quick glance at the bt in gdb
and obviously misinterpreted what I got there.


The commandline is a bit odd because it doesn't set up any of:
* a serial terminal
* a graphical window/display
* network forwarding that would allow ssh into the guest

If you add '-serial stdio' do you see the guest output?


I do. I was using the serial terminal which got setup implicitly I guess.
I'll make sure to always add this.

Sry and thanks,
Sebastian




Re: [RFC] docs/s390: Fix wrong command example in s390-cpu-topology.rst

2023-11-27 Thread Zhao Liu
Hi Nina,

On Mon, Nov 27, 2023 at 01:58:32PM +0100, Nina Schoetterl-Glausch wrote:
> Date: Mon, 27 Nov 2023 13:58:32 +0100
> From: Nina Schoetterl-Glausch 
> Subject: Re: [RFC] docs/s390: Fix wrong command example in
>  s390-cpu-topology.rst
> 
> On Mon, 2023-11-27 at 17:39 +0800, Zhao Liu wrote:
> > From: Zhao Liu 
> > 
> > From s390_possible_cpu_arch_ids() in hw/s390x/s390-virtio-ccw.c, the
> > "core-id" is the index of pssible_cpus->cpus[], so it should only be
> 
> s/pssible_cpus/possible_cpus/

Thanks!

> 
> > less than possible_cpus->len, which is equal to ms->smp.max_cpus.
> > 
> > Fix the wrong "core-id" 112 because it is greater than maxcpus (36) in
> 
> Maybe s/is greater/isn't less/ since the valid ids are 0-35 inclusive.

Ok.

> 
> > -smp.
> > 
> > Cc: Nina Schoetterl-Glausch 
> > Signed-off-by: Zhao Liu 
> 
> Reviewed-by: Nina Schoetterl-Glausch 

Thanks! Let me refresh a new version quickly.

-Zhao



[PATCH v2] docs/s390: Fix wrong command example in s390-cpu-topology.rst

2023-11-27 Thread Zhao Liu
From: Zhao Liu 

>From s390_possible_cpu_arch_ids() in hw/s390x/s390-virtio-ccw.c, the
"core-id" is the index of possible_cpus->cpus[], so it should only be
less than possible_cpus->len, which is equal to ms->smp.max_cpus.

Fix the wrong "core-id" 112, because it isn't less than maxcpus (36) in
-smp, and the valid core ids are 0-35 inclusive.

Signed-off-by: Zhao Liu 
Reviewed-by: Nina Schoetterl-Glausch 
---
Changes since v1 RFC:
 * Fixed typo. (Nina)
 * Polished the description of the reason for fixing the wrong core-id.
   (Nina)
---
 docs/devel/s390-cpu-topology.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/devel/s390-cpu-topology.rst b/docs/devel/s390-cpu-topology.rst
index 9eab28d5e5d8..48313b92d417 100644
--- a/docs/devel/s390-cpu-topology.rst
+++ b/docs/devel/s390-cpu-topology.rst
@@ -15,7 +15,7 @@ have default values:
 -smp 1,drawers=3,books=3,sockets=2,cores=2,maxcpus=36 \
 -device z14-s390x-cpu,core-id=19,entitlement=high \
 -device z14-s390x-cpu,core-id=11,entitlement=low \
--device z14-s390x-cpu,core-id=112,entitlement=high \
+-device z14-s390x-cpu,core-id=12,entitlement=high \
...
 
 Additions to query-cpus-fast
@@ -78,7 +78,7 @@ modifiers for all configured vCPUs.
   "dedicated": true,
   "thread-id": 537005,
   "props": {
-"core-id": 112,
+"core-id": 12,
 "socket-id": 0,
 "drawer-id": 3,
 "book-id": 2
@@ -86,7 +86,7 @@ modifiers for all configured vCPUs.
   "cpu-state": "operating",
   "entitlement": "high",
   "qom-path": "/machine/peripheral-anon/device[2]",
-  "cpu-index": 112,
+  "cpu-index": 12,
   "target": "s390x"
 }
   ]
-- 
2.34.1




Re: [PATCH] hostmem: Round up memory size for qemu_madvise() in host_memory_backend_memory_complete()

2023-11-27 Thread David Hildenbrand

On 27.11.23 13:32, Michal Privoznik wrote:

Simple reproducer:
qemu.git $ ./build/qemu-system-x86_64 \
-m size=8389632k,slots=16,maxmem=2560k \
-object 
'{"qom-type":"memory-backend-file","id":"ram-node0","mem-path":"/hugepages2M/","prealloc":true,"size":8590983168,"host-nodes":[0],"policy":"bind"}'
 \
-numa node,nodeid=0,cpus=0,memdev=ram-node0

With current master I get:

qemu-system-x86_64: cannot bind memory to host NUMA nodes: Invalid argument

The problem is that memory size (8193MiB) is not an integer
multiple of underlying pagesize (2MiB) which triggers a check
inside of madvise(), since we can't really set a madvise() policy
just to a fraction of a page.


I thought we would just always fail create something that doesn't really 
make any sense.


Why would we want to support that case?

Let me dig, I thought we would have had some check there at some point 
that would make that fail (especially: RAM block not aligned to the 
pagesize).




Signed-off-by: Michal Privoznik 
---
  backends/hostmem.c | 8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 747e7838c0..4e88d048de 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -326,9 +326,10 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
Error **errp)
  HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
  Error *local_err = NULL;
  void *ptr;
-uint64_t sz;
  
  if (bc->alloc) {

+uint64_t sz;
+
  bc->alloc(backend, &local_err);
  if (local_err) {
  goto out;
@@ -337,6 +338,11 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
Error **errp)
  ptr = memory_region_get_ram_ptr(&backend->mr);
  sz = memory_region_size(&backend->mr);
  
+/* Round up size to be an integer multiple of pagesize, because

+ * madvise() does not really like setting advices on a fraction of a
+ * page. */
+sz = ROUND_UP(sz, qemu_ram_pagesize(backend->mr.ram_block));
+
  if (backend->merge) {
  qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE);
  }


--
Cheers,

David / dhildenb




Re: [PULL 00/14] random fixes for 8.2 (tests, plugins, docs, semihosting)

2023-11-27 Thread Stefan Hajnoczi
Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/8.2 for any 
user-visible changes.


signature.asc
Description: PGP signature


Re: [PATCH v2 for-8.2] ppc/amigaone: Allow running AmigaOS without firmware image

2023-11-27 Thread BALATON Zoltan

On Mon, 27 Nov 2023, Nicholas Piggin wrote:

On Mon Nov 27, 2023 at 9:43 PM AEST, BALATON Zoltan wrote:

On Mon, 27 Nov 2023, Nicholas Piggin wrote:

On Sun Nov 26, 2023 at 2:34 AM AEST, BALATON Zoltan wrote:

The machine uses a modified U-Boot under GPL license but the sources
of it are lost with only a binary available so it cannot be included
in QEMU. Allow running without the firmware image with -bios none
which can be used when calling a boot loader directly and thus
simplifying booting guests. We need a small routine that AmigaOS calls
from ROM which is added in this case to allow booting AmigaOS without
external firmware image.

Signed-off-by: BALATON Zoltan 
---
v2: Unfortunately AmigaOS needs some additional ROM part which is added
Please merge for 8.2 as it allows booting AmigaOS simpler without
having to download separate firmware.


How to test this?


You can check with -M amigaone -bios none then from QEMU monitor
(qemu) xp/10i 0xfff7ff80


Okay, so to fully test it you really need a non-free AmigaOS image?


I'm afraid yes, it's hard to test without AmigaOS otherwise as that's the 
only thing that seems to need this.



And, how does a user know how or when to use this? Should it just be
default if there is no bios binary found?


It could be the default without -bios, that could also allow removing the 
qtest special case than. Maybe it's easier for users so I'll change that 
in v2.



 hw/ppc/amigaone.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/amigaone.c b/hw/ppc/amigaone.c
index 992a55e632..a11d2d5556 100644
--- a/hw/ppc/amigaone.c
+++ b/hw/ppc/amigaone.c
@@ -40,6 +40,16 @@
 #define PROM_ADDR 0xfff0
 #define PROM_SIZE (512 * KiB)

+/* AmigaOS calls this routine from ROM, use this if -bios none */
+static const char dummy_fw[] = {
+0x38, 0x00, 0x00, 0x08, /* li  r0,8 */
+0x7c, 0x09, 0x03, 0xa6, /* mtctr   r0 */
+0x54, 0x63, 0xf8, 0x7e, /* srwir3,r3,1 */
+0x42, 0x00, 0xff, 0xfc, /* bdnz0x8 */
+0x7c, 0x63, 0x18, 0xf8, /* not r3,r3 */
+0x4e, 0x80, 0x00, 0x20, /* blr */
+};


This is clever, but does anything else create blobs like this?


There are some examples in hw/mips/{loongson3_virt.c, malta.c} at least
and maybe others that put code in guest memory. If this was longer than
this few instructions I'd consider putting it in a binary but this seems
simpler for such small code.


Thanks, compiling blob inline looks fine then.

It's 0x80 bytes from the end of prom AFAIKS. Should you make that
PROM_ADDR + PROM_SIZE - 0x80 instead of hard coding it?


I thought about that after sending the patch, I'll change it too.


It could be put into a .S in pc-bios, which might be a bit more
consistent.

We might make a ppc/ subdirectory under there, but that's for
another time.


Maybe later we could reorganise these unless it's really necessary to
change this for 8.2 now.


Nah that's fine.


(The mips boards and some arm and riscv machines
seem to use rom_add_blob_fixed() which sould show up in info roms under
monitor so maybe I could look at changing to use that now if you think it
would be better that way.)


I'm not sure, I don't think it's necessary if your minimal patch works.


It works but just for consistency with other similar machines I'll 
consider trying rom_add_blob_fixed() now that there will be another 
iteration.



I'll do a PR for 8.2 for SLOF and Skiboot updates, so happy to include
this as well.


Thanks, I'll only have time to do a v2 later but still before the next rc 
due tomorrow.


There are some more data around this function in ROM that I'm not sure 
would be needed but don't know how to verify what AmigaOS accesses. I've 
tried enabling memory_region* traces but those only seem to log io 
regions, ram and rom region accesses are probably just memory reads so not 
traced. Then I've tried adding watch points in gdb like this:


(gdb) watch *(char [0x8])*0xfff0
value requires 524288 bytes, which is more than max-value-size
(gdb) show max-value-size
Maximum value size is 65536 bytes.
(gdb) watch *(char [0x1])*0xfff0
Hardware watchpoint 1: *(char [0x1])*0xfff0
(gdb) watch *(char [0x1])*0xfff1
Hardware watchpoint 2: *(char [0x1])*0xfff1
(gdb) watch *(char [0x1])*0xfff2
Hardware watchpoint 3: *(char [0x1])*0xfff2
(gdb) watch *(char [0x1])*0xfff3
Hardware watchpoint 4: *(char [0x1])*0xfff3
(gdb) watch *(char [0x1])*0xfff4
Hardware watchpoint 5: *(char [0x1])*0xfff4
(gdb) watch *(char [0x1])*0xfff5
Hardware watchpoint 6: *(char [0x1])*0xfff5
(gdb) watch *(char [0x1])*0xfff6
Hardware watchpoint 7: *(char [0x1])*0xfff6
(gdb) watch *(char [0x1])*0xfff7
Hardware watchpoint 8: *(char [0x1])*0xfff7

but they are not firing even for the code known to be accessed so not sure 
how to trace rom access. Maybe I'll need to temporarily make it an io 
region to be

Re: [PATCH] hostmem: Round up memory size for qemu_madvise() in host_memory_backend_memory_complete()

2023-11-27 Thread David Hildenbrand

On 27.11.23 14:37, David Hildenbrand wrote:

On 27.11.23 13:32, Michal Privoznik wrote:

Simple reproducer:
qemu.git $ ./build/qemu-system-x86_64 \
-m size=8389632k,slots=16,maxmem=2560k \
-object 
'{"qom-type":"memory-backend-file","id":"ram-node0","mem-path":"/hugepages2M/","prealloc":true,"size":8590983168,"host-nodes":[0],"policy":"bind"}'
 \
-numa node,nodeid=0,cpus=0,memdev=ram-node0

With current master I get:

qemu-system-x86_64: cannot bind memory to host NUMA nodes: Invalid argument

The problem is that memory size (8193MiB) is not an integer
multiple of underlying pagesize (2MiB) which triggers a check
inside of madvise(), since we can't really set a madvise() policy
just to a fraction of a page.


I thought we would just always fail create something that doesn't really
make any sense.

Why would we want to support that case?

Let me dig, I thought we would have had some check there at some point
that would make that fail (especially: RAM block not aligned to the
pagesize).



At least memory-backend-memfd properly fails for that case:

$ ./build/qemu-system-x86_64 -object 
memory-backend-memfd,hugetlb=on,size=3m,id=tmp
qemu-system-x86_64: failed to resize memfd to 3145728: Invalid argument

memory-backend-file ends up creating a new file:

 $ ./build/qemu-system-x86_64 -object 
memory-backend-file,share=on,mem-path=/dev/hugepages/tmp,size=3m,id=tmp

$ stat /dev/hugepages/tmp
  File: /dev/hugepages/tmp
  Size: 4194304 Blocks: 0  IO Block: 2097152 regular file

... and ends up sizing it properly aligned to the huge page size.


Seems to be due to:

if (memory < block->page_size) {
error_setg(errp, "memory size 0x" RAM_ADDR_FMT " must be equal to "
   "or larger than page size 0x%zx",
   memory, block->page_size);
return NULL;
}

memory = ROUND_UP(memory, block->page_size);

/*
 * ftruncate is not supported by hugetlbfs in older
 * hosts, so don't bother bailing out on errors.
 * If anything goes wrong with it under other filesystems,
 * mmap will fail.
 *
 * Do not truncate the non-empty backend file to avoid corrupting
 * the existing data in the file. Disabling shrinking is not
 * enough. For example, the current vNVDIMM implementation stores
 * the guest NVDIMM labels at the end of the backend file. If the
 * backend file is later extended, QEMU will not be able to find
 * those labels. Therefore, extending the non-empty backend file
 * is disabled as well.
 */
if (truncate && ftruncate(fd, offset + memory)) {
perror("ftruncate");
}

So we create a bigger file and map the bigger file and also have a
RAMBlock that is bigger. So we'll also consume more memory.

... but the memory region is smaller and we tell the VM that it has
less memory. Lot of work with no obvious benefit, and only some
memory waste :)


We better should have just rejected such memory backends right from
the start. But now it's likely too late.

I suspect other things like
 * qemu_madvise(ptr, sz, QEMU_MADV_MERGEABLE);
 * qemu_madvise(ptr, sz, QEMU_MADV_DONTDUMP);

fail, but we don't care for hugetlb at least regarding merging
and don't even log an error.

But QEMU_MADV_DONTDUMP might also be broken, because that
qemu_madvise() call will just fail.

Your fix would be correct. But I do wonder if we want to just let that
case fail and warn users that they are doing something that doesn't
make too much sense.

--
Cheers,

David / dhildenb




Re: [PATCH v5 12/14] tests: acpi: implement TPM CRB tests for ARM virt

2023-11-27 Thread Stefan Berger




On 11/24/23 21:39, Joelle van Dyne wrote:

On Fri, Nov 24, 2023 at 8:26 AM Stefan Berger  wrote:




On 11/24/23 11:21, Joelle van Dyne wrote:

On Fri, Nov 24, 2023 at 8:17 AM Stefan Berger  wrote:




On 11/23/23 19:56, Joelle van Dyne wrote:

On Tue, Nov 14, 2023 at 4:12 PM Stefan Berger  wrote:




On 11/14/23 16:05, Stefan Berger wrote:



On 11/14/23 13:03, Stefan Berger wrote:



On 11/14/23 04:36, Marc-André Lureau wrote:

Hi

On Tue, Nov 14, 2023 at 6:12 AM Joelle van Dyne  wrote:


Signed-off-by: Joelle van Dyne 
Reviewed-by: Stefan Berger 


nit: you also added tests for x86, could be a different patch?

For arm, the test fails until next patch with:

# starting QEMU: exec ./qemu-system-aarch64 -qtest
unix:/tmp/qtest-991279.sock -qtest-log /dev/null -chardev
socket,path=/tmp/qtest-991279.qmp,id=char0 -mon
chardev=char0,mode=control -display none -audio none -machine virt
-accel tcg -nodefaults -nographic -drive
if=pflash,format=raw,file=pc-bios/edk2-aarch64-code.fd,readonly=on
-drive if=pflash,format=raw,file=pc-bios/edk2-arm-vars.fd,snapshot=on
-cdrom tests/data/uefi-boot-images/bios-tables-test.aarch64.iso.qcow2
-cpu cortex-a57 -chardev
socket,id=chr,path=/tmp/qemu-test_acpi_virt_tcg_crb-device.KZ3GE2/sock
-tpmdev emulator,id=dev,chardev=chr -device tpm-crb-device,tpmdev=dev
-accel qtest
Warning! zero length expected file
'tests/data/acpi/virt/TPM2.crb-device.tpm2'
Warning! zero length expected file
'tests/data/acpi/virt/DSDT.crb-device.tpm2'
acpi-test: Warning!  binary file mismatch. Actual
[aml:/tmp/aml-GO4ME2], Expected
[aml:tests/data/acpi/virt/TPM2.crb-device.tpm2].
See source file tests/qtest/bios-tables-test.c for instructions on how
to update expected files.
acpi-test: Warning!  binary file mismatch. Actual
[aml:/tmp/aml-6N4ME2], Expected
[aml:tests/data/acpi/virt/DSDT.crb-device.tpm2].
See source file tests/qtest/bios-tables-test.c for instructions on how
to update expected files.
to see ASL diff between mismatched files install IASL, rebuild QEMU
from scratch and re-run tests with V=1 environment variable set**
ERROR:../tests/qtest/bios-tables-test.c:538:test_acpi_asl: assertion
failed: (all_tables_match)
not ok /aarch64/acpi/virt/tpm2-crb -
ERROR:../tests/qtest/bios-tables-test.c:538:test_acpi_asl: assertion
failed: (all_tables_match)
Bail out!
qemu-system-aarch64: tpm-emulator: Could not cleanly shutdown the TPM:
Resource temporarily unavailable
Unexpected error in qio_channel_socket_writev() at
../io/channel-socket.c:622:
/home/elmarco/src/qemu/buildall/tests/qtest/bios-tables-test: Unable
to write to socket: Bad file descriptor



Travis testing on s390x I see the following failures for this patchset
(search for 'ERROR'):

https://app.travis-ci.com/github/stefanberger/qemu-tpm/builds/267230363

Summary of Failures:

134/320 qemu:qtest+qtest-aarch64 / qtest-aarch64/tpm-crb-device-test
ERROR   0.70s   killed by signal 6 SIGABRT

219/320 qemu:qtest+qtest-x86_64 / qtest-x86_64/tpm-crb-test
ERROR   0.88s   killed by signal 6 SIGABRT


Summary of Failures:

271/537 qemu:qtest+qtest-i386 / qtest-i386/tpm-crb-test
ERROR   0.59s   killed by signal 6 SIGABRT


My guess is it's an endianess issue on big endian machines due to
reading from the ROM device where we lost the .endianess:

+const MemoryRegionOps tpm_crb_memory_ops = {
+.read = tpm_crb_mmio_read,
+.write = tpm_crb_mmio_write,
+.endianness = DEVICE_LITTLE_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 4,
+},
+};



I think we need a 2nd set of registers to support the endianess
conversion. It's not exactly nice, though. Basically the saved_regs
could be used for this directly, even though I did not do that but
introduced n_regs:
https://github.com/stefanberger/qemu-tpm/commit/90f6b21c0dd93dbb13d9e80a628f5b631fd07d91

This patch allows the tests on s390x to run farther but the execution of
the command doesn't seem to work maybe due to command data that were
also written in wrong endianess. I don't know. I would have to get
access to a big endian / s390 machine to be able to fix it.




The latest version now passes on Travis s390x:
https://app.travis-ci.com/github/stefanberger/qemu-tpm/builds/267245220



Are the tests failing on S390X due to the added code or are they
failing because previously it was untested? I don't think the original
code took account of endianness and that should be fixed, but feels
like it should be in a separate patch set?


They are failing because something like the topmost one or two patches
as in this branch here are missing for a big endian host:

https://github.com/stefanberger/qemu-tpm/tree/joelle.v5%2B2nd_registers


Right, but is this issue new due to the patchset? i.e. if just the


Yes, it is due to this patchset. The reason is that CRB switched to a
ROMD interface where the fact that the MMIO registers are little endian
got lost for existing x86_64 support.


tests were added without the other patches, would they still

Re: [PATCH for-8.2 v2] export/vhost-user-blk: Fix consecutive drains

2023-11-27 Thread Stefan Hajnoczi
On Mon, 27 Nov 2023 at 06:58, Kevin Wolf  wrote:
>
> The vhost-user-blk export implement AioContext switches in its drain
> implementation. This means that on drain_begin, it detaches the server
> from its AioContext and on drain_end, attaches it again and schedules
> the server->co_trip coroutine in the updated AioContext.
>
> However, nothing guarantees that server->co_trip is even safe to be
> scheduled. Not only is it unclear that the coroutine is actually in a
> state where it can be reentered externally without causing problems, but
> with two consecutive drains, it is possible that the scheduled coroutine
> didn't have a chance yet to run and trying to schedule an already
> scheduled coroutine a second time crashes with an assertion failure.
>
> Following the model of NBD, this commit makes the vhost-user-blk export
> shut down server->co_trip during drain so that resuming the export means
> creating and scheduling a new coroutine, which is always safe.
>
> There is one exception: If the drain call didn't poll (for example, this
> happens in the context of bdrv_graph_wrlock()), then the coroutine
> didn't have a chance to shut down. However, in this case the AioContext
> can't have changed; changing the AioContext always involves a polling
> drain. So in this case we can simply assert that the AioContext is
> unchanged and just leave the coroutine running or wake it up if it has
> yielded to wait for the AioContext to be attached again.
>
> Fixes: e1054cd4aad03a493a5d1cded7508f7c348205bf
> Fixes: https://issues.redhat.com/browse/RHEL-1708
> Signed-off-by: Kevin Wolf 
> ---
>  include/qemu/vhost-user-server.h |  1 +
>  block/export/vhost-user-blk-server.c |  9 +--
>  util/vhost-user-server.c | 39 ++--
>  3 files changed, 39 insertions(+), 10 deletions(-)

Reviewed-by: Stefan Hajnoczi 



Re: hanging process with commit 69562648f9 ("vl: revert behaviour for -display none")

2023-11-27 Thread Peter Maydell
On Mon, 27 Nov 2023 at 13:08, Sebastian Ott  wrote:
>
> On Mon, 27 Nov 2023, Peter Maydell wrote:
> > On Mon, 27 Nov 2023 at 12:29, Sebastian Ott  wrote:
> >> qemu fails to start a guest using the following command (the process just
> >> hangs): qemu-system-aarch64 -machine virt -cpu host -smp 4 -m 8192
> >> -kernel /boot/vmlinuz-6.7.0-rc1 -initrd ~/basic.img -append "root=/dev/ram
> >> console=ttyAMA0" -enable-kvm -device virtio-gpu,hostmem=2G -display none
> >>
> >> ..which I've used to debug a potential virtio-gpu issue. Bisect points to
> >> 69562648f9 ("vl: revert behaviour for -display none")
> >
> > Is it actually hanging, or is the guest starting up fine but
> > outputting to a serial port which you haven't directed anywhere?
>
> Ough, that's indeed the case. I only had a quick glance at the bt in gdb
> and obviously misinterpreted what I got there.
>
> > The commandline is a bit odd because it doesn't set up any of:
> > * a serial terminal
> > * a graphical window/display
> > * network forwarding that would allow ssh into the guest
> >
> > If you add '-serial stdio' do you see the guest output?
>
> I do. I was using the serial terminal which got setup implicitly I guess.

Yep. The issue fixed by 69562648f9 is that we briefly incorrectly
made "-display none" do more than just "disable the display window".
The revert brings us back to the normal behaviour that if you
want a serial port you need to ask for it. (Or use the -nographic
option, which is a legacy 'do what I mean' option that does
multiple things at once including turning off the GUI window,
and adding a serial terminal and a monitor multiplexed onto stdio.
But personally I find it clearer to explicitly ask for all
this stuff via '-display none -serial ...' etc.)

-- PMM



Re: [PATCH v1 1/7] migration/multifd: Remove MultiFDPages_t::packet_num

2023-11-27 Thread Peter Xu
On Fri, Nov 24, 2023 at 01:14:26PM -0300, Fabiano Rosas wrote:
> This was introduced by commit 34c55a94b1 ("migration: Create multipage
> support") and never used.
> 
> Signed-off-by: Fabiano Rosas 

Reviewed-by: Peter Xu 

-- 
Peter Xu




[PATCH 1/2] hw/cpu/core: Cleanup unused included header in core.c

2023-11-27 Thread Zhao Liu
From: Zhao Liu 

Remove unused header (qemu/module.h and sysemu/cpus.h) in core.c,
and reorder the remaining header files (except qemu/osdep.h) in
alphabetical order.

Tested by "./configure" and then "make".

Signed-off-by: Zhao Liu 
---
 hw/cpu/core.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/hw/cpu/core.c b/hw/cpu/core.c
index 987607515574..495a5c30ffe1 100644
--- a/hw/cpu/core.c
+++ b/hw/cpu/core.c
@@ -8,12 +8,11 @@
  */
 
 #include "qemu/osdep.h"
+
+#include "hw/boards.h"
 #include "hw/cpu/core.h"
-#include "qapi/visitor.h"
-#include "qemu/module.h"
 #include "qapi/error.h"
-#include "sysemu/cpus.h"
-#include "hw/boards.h"
+#include "qapi/visitor.h"
 
 static void core_prop_get_core_id(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
-- 
2.34.1




[PATCH 2/2] hw/cpu/cluster: Cleanup unused included header in cluster.c

2023-11-27 Thread Zhao Liu
From: Zhao Liu 

Remove unused header (qemu/module.h and qemu/cutils.h) in cluster.c,
and reorder the remaining header files (except qemu/osdep.h) in
alphabetical order.

Tested by "./configure" and then "make".

Signed-off-by: Zhao Liu 
---
 hw/cpu/cluster.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/cpu/cluster.c b/hw/cpu/cluster.c
index e444b7c29d1b..61289a840d46 100644
--- a/hw/cpu/cluster.c
+++ b/hw/cpu/cluster.c
@@ -19,12 +19,11 @@
  */
 
 #include "qemu/osdep.h"
+
+#include "hw/core/cpu.h"
 #include "hw/cpu/cluster.h"
 #include "hw/qdev-properties.h"
-#include "hw/core/cpu.h"
 #include "qapi/error.h"
-#include "qemu/module.h"
-#include "qemu/cutils.h"
 
 static Property cpu_cluster_properties[] = {
 DEFINE_PROP_UINT32("cluster-id", CPUClusterState, cluster_id, 0),
-- 
2.34.1




Re: [PATCH v1 2/7] migration/multifd: Remove QEMUFile from where it is not needed

2023-11-27 Thread Peter Xu
On Fri, Nov 24, 2023 at 01:14:27PM -0300, Fabiano Rosas wrote:
> Signed-off-by: Fabiano Rosas 

Reviewed-by: Peter Xu 

-- 
Peter Xu




[PATCH 0/2] Cleanup unused included header in core.c & cluster.c

2023-11-27 Thread Zhao Liu
From: Zhao Liu 

Remove unused header in core.c and cluster.c, and reorder the remaining
header files (except qemu/osdep.h) in alphabetical order.

Tested by "./configure" and then "make".

---
Zhao Liu (2):
  hw/cpu/core: Cleanup unused included header in core.c
  hw/cpu/cluster: Cleanup unused included header in cluster.c

 hw/cpu/cluster.c | 5 ++---
 hw/cpu/core.c| 7 +++
 2 files changed, 5 insertions(+), 7 deletions(-)

-- 
2.34.1




Re: [PATCH v1 3/7] migration/multifd: Change multifd_pages_init argument

2023-11-27 Thread Peter Xu
On Fri, Nov 24, 2023 at 01:14:28PM -0300, Fabiano Rosas wrote:
> The 'size' argument is actually the number of pages that fit in a
> multifd packet. Change it to uint32_t and rename.
> 
> Signed-off-by: Fabiano Rosas 

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH v1 4/7] migration: Report error in incoming migration

2023-11-27 Thread Peter Xu
On Fri, Nov 24, 2023 at 01:14:29PM -0300, Fabiano Rosas wrote:
> We're not currently reporting the errors set with migrate_set_error()
> when incoming migration fails.
> 
> Signed-off-by: Fabiano Rosas 

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH v1 5/7] tests/qtest/migration: Print migration incoming errors

2023-11-27 Thread Peter Xu
On Fri, Nov 24, 2023 at 01:14:30PM -0300, Fabiano Rosas wrote:
> We're currently just asserting when incoming migration fails. Let's
> print the error message from QMP as well.
> 
> Signed-off-by: Fabiano Rosas 
> ---
>  tests/qtest/migration-helpers.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
> index 24fb7b3525..f1106128a9 100644
> --- a/tests/qtest/migration-helpers.c
> +++ b/tests/qtest/migration-helpers.c
> @@ -118,6 +118,12 @@ void migrate_incoming_qmp(QTestState *to, const char 
> *uri, const char *fmt, ...)
>  
>  rsp = qtest_qmp(to, "{ 'execute': 'migrate-incoming', 'arguments': %p}",
>  args);
> +
> +if (!qdict_haskey(rsp, "return")) {
> +g_autoptr(GString) s = qobject_to_json_pretty(QOBJECT(rsp), true);
> +g_test_message("%s", s->str);
> +}

This traps the "migrate-incoming" command only (which, afaiu, only setup
the listening), would this capture the incoming error?

> +
>  g_assert(qdict_haskey(rsp, "return"));
>  qobject_unref(rsp);
>  
> -- 
> 2.35.3
> 

-- 
Peter Xu




Re: [PATCH v1 6/7] tests/qtest/migration: Add a wrapper to print test names

2023-11-27 Thread Peter Xu
On Fri, Nov 24, 2023 at 01:14:31PM -0300, Fabiano Rosas wrote:
> Our usage of gtest results in us losing the very basic functionality
> of "knowing which test failed". The issue is that gtest only prints
> test names ("paths" in gtest parlance) once the test has finished, but
> we use asserts in the tests and crash gtest itself before it can print
> anything. We also use a final abort when the result of g_test_run is
> not 0.
> 
> Depending on how the test failed/broke we can see the function that
> trigged the abort, which may be representative of the test, but it
> could also just be some generic function.
> 
> We have been relying on the primitive method of looking at the name of
> the previous successful test and then looking at the code to figure
> out which test should have come next.
> 
> Add a wrapper to the test registration that does the job of printing
> the test name before running.
> 
> Signed-off-by: Fabiano Rosas 
> ---
>  tests/qtest/migration-helpers.c | 32 
>  tests/qtest/migration-helpers.h |  1 +
>  2 files changed, 33 insertions(+)
> 
> diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
> index f1106128a9..164e09c299 100644
> --- a/tests/qtest/migration-helpers.c
> +++ b/tests/qtest/migration-helpers.c
> @@ -298,3 +298,35 @@ char *resolve_machine_version(const char *alias, const 
> char *var1,
>  
>  return find_common_machine_version(machine_name, var1, var2);
>  }
> +
> +typedef struct {
> +char *name;
> +void (*func)(void);
> +} MigrationTest;
> +
> +static void migration_test_destroy(gpointer data)
> +{
> +MigrationTest *test = (MigrationTest *)data;
> +
> +g_free(test->name);
> +g_free(test);
> +}
> +
> +static void migration_test_wrapper(const void *data)
> +{
> +MigrationTest *test = (MigrationTest *)data;
> +
> +g_test_message("Running /%s%s", qtest_get_arch(), test->name);

/%s/%s?

> +test->func();
> +}

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH v1 7/7] tests/qtest/migration: Use the new migration_test_add

2023-11-27 Thread Peter Xu
On Fri, Nov 24, 2023 at 01:14:32PM -0300, Fabiano Rosas wrote:
> Replace the tests registration with the new function that prints tests
> names.
> 
> Signed-off-by: Fabiano Rosas 

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH 1/4] scsi: only access SCSIDevice->requests from one thread

2023-11-27 Thread Eric Blake
On Thu, Nov 23, 2023 at 02:49:28PM -0500, Stefan Hajnoczi wrote:
> Stop depending on the AioContext lock and instead access
> SCSIDevice->requests from only one thread at a time:
> - When the VM is running only the BlockBackend's AioContext may access
>   the requests list.
> - When the VM is stopped only the main loop may access the requests
>   list.
> 
> These constraints protect the requests list without the need for locking
> in the I/O code path.
> 
> Note that multiple IOThreads are not supported yet because the code
> assumes all SCSIRequests are executed from a single AioContext. Leave
> that as future work.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  include/hw/scsi/scsi.h |   7 +-
>  hw/scsi/scsi-bus.c | 174 -
>  2 files changed, 124 insertions(+), 57 deletions(-)
>

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: [PATCH] vmdk: Don't corrupt desc file in vmdk_write_cid

2023-11-27 Thread Eric Blake
On Fri, Nov 24, 2023 at 11:56:54AM +, Fam wrote:
> From: Fam Zheng 
> 
> If the text description file is larger than DESC_SIZE, we force the last
> byte in the buffer to be 0 and write it out.
> 
> This results in a corruption.
> 
> Try to allocate a big buffer in this case.
> 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1923
> 
> Signed-off-by: Fam Zheng 
> ---
>  block/vmdk.c   | 28 
>  tests/qemu-iotests/059 |  2 ++
>  tests/qemu-iotests/059.out |  4 
>  3 files changed, 26 insertions(+), 8 deletions(-)

Reviewed-by: Eric Blake 

Are we trying to get this into 8.2, since it is a data corruption?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: [PATCH 2/4] virtio-scsi: don't lock AioContext around virtio_queue_aio_attach_host_notifier()

2023-11-27 Thread Eric Blake
On Thu, Nov 23, 2023 at 02:49:29PM -0500, Stefan Hajnoczi wrote:
> virtio_queue_aio_attach_host_notifier() does not require the AioContext
> lock. Stop taking the lock and remember add an explicit smp_wmb()

s/remember// ?

> because we were relying on the implicit barrier in the AioContext lock
> before.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  hw/scsi/virtio-scsi-dataplane.c | 8 +---
>  1 file changed, 1 insertion(+), 7 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: [PATCH-for-8.2? v3 0/2] hw/net/can/xlnx-zynqmp: Avoid underflow while popping FIFOs

2023-11-27 Thread Peter Maydell
On Fri, 24 Nov 2023 at 18:34, Philippe Mathieu-Daudé  wrote:
>
> Series fully reviewed.
>
> Since v2:
> - Addressed Vikram review comments,
> - Added R-b tags
>
> Fix a pair of fuzzed bugs.
>
> Tested with the CAN tests from 'make check-qtest-aarch64'.



Applied to target-arm.next, thanks.

-- PMM



Re: [PATCH-for-8.2 0/6] hw: Free DEFINE_PROP_ARRAY()'s arrays in instance_finalize()

2023-11-27 Thread Peter Maydell
On Tue, 21 Nov 2023 at 17:40, Philippe Mathieu-Daudé  wrote:
>
> In few places we forget to free the array allocated by the
> DEFINE_PROP_ARRAY() macro handlers. Fix that.
>
> Philippe Mathieu-Daudé (6):
>   hw/virtio: Add VirtioPCIDeviceTypeInfo::instance_finalize field
>   hw/virtio: Free VirtIOIOMMUPCI::vdev.reserved_regions[] on finalize()
>   hw/misc/mps2-scc: Free MPS2SCC::oscclk[] array on finalize()
>   hw/nvram/xlnx-efuse: Free XlnxEFuse::ro_bits[] array on finalize()
>   hw/nvram/xlnx-efuse-ctrl: Free XlnxVersalEFuseCtrl[] "pg0-lock" array
>   hw/input/stellaris_gamepad: Free StellarisGamepad::keycodes[] array

Applied to target-arm.next, thanks.

-- PMM



Re: [PATCH v1 0/3] ZynqMP / Versal: various fixes

2023-11-27 Thread Peter Maydell
On Fri, 24 Nov 2023 at 14:35, Frederic Konrad  wrote:
>
> Hi,
>
> Those are various simple fixes for ZynqMP:
>   * 1: fixes a possible out of bound access in the SPI model.
>   * 2: is a trivial fix for documentation url.
>   * 3: is a log guest error fix for the CSU DMA.
>



Applied to target-arm.next, thanks.

-- PMM



Re: [PATCH v1 6/7] tests/qtest/migration: Add a wrapper to print test names

2023-11-27 Thread Fabiano Rosas
Peter Xu  writes:

> On Fri, Nov 24, 2023 at 01:14:31PM -0300, Fabiano Rosas wrote:
>> Our usage of gtest results in us losing the very basic functionality
>> of "knowing which test failed". The issue is that gtest only prints
>> test names ("paths" in gtest parlance) once the test has finished, but
>> we use asserts in the tests and crash gtest itself before it can print
>> anything. We also use a final abort when the result of g_test_run is
>> not 0.
>> 
>> Depending on how the test failed/broke we can see the function that
>> trigged the abort, which may be representative of the test, but it
>> could also just be some generic function.
>> 
>> We have been relying on the primitive method of looking at the name of
>> the previous successful test and then looking at the code to figure
>> out which test should have come next.
>> 
>> Add a wrapper to the test registration that does the job of printing
>> the test name before running.
>> 
>> Signed-off-by: Fabiano Rosas 
>> ---
>>  tests/qtest/migration-helpers.c | 32 
>>  tests/qtest/migration-helpers.h |  1 +
>>  2 files changed, 33 insertions(+)
>> 
>> diff --git a/tests/qtest/migration-helpers.c 
>> b/tests/qtest/migration-helpers.c
>> index f1106128a9..164e09c299 100644
>> --- a/tests/qtest/migration-helpers.c
>> +++ b/tests/qtest/migration-helpers.c
>> @@ -298,3 +298,35 @@ char *resolve_machine_version(const char *alias, const 
>> char *var1,
>>  
>>  return find_common_machine_version(machine_name, var1, var2);
>>  }
>> +
>> +typedef struct {
>> +char *name;
>> +void (*func)(void);
>> +} MigrationTest;
>> +
>> +static void migration_test_destroy(gpointer data)
>> +{
>> +MigrationTest *test = (MigrationTest *)data;
>> +
>> +g_free(test->name);
>> +g_free(test);
>> +}
>> +
>> +static void migration_test_wrapper(const void *data)
>> +{
>> +MigrationTest *test = (MigrationTest *)data;
>> +
>> +g_test_message("Running /%s%s", qtest_get_arch(), test->name);
>
> /%s/%s?

The test name contains a leading slash.



[PATCH] tests: bios-tables-test: Rename smbios type 4 related test functions

2023-11-27 Thread Zhao Liu
From: Zhao Liu 

In fact, type4-count, core-count, core-count2, thread-count and
thread-count2 are tested with KVM not TCG.

Rename these test functions to reflect KVM base instead of TCG.

Signed-off-by: Zhao Liu 
---
 tests/qtest/bios-tables-test.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index 71af5cf69fc5..9a7e459e8ffb 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -1003,7 +1003,7 @@ static void test_acpi_q35_tcg(void)
 free_test_data(&data);
 }
 
-static void test_acpi_q35_tcg_type4_count(void)
+static void test_acpi_q35_kvm_type4_count(void)
 {
 test_data data = {
 .machine = MACHINE_Q35,
@@ -1019,7 +1019,7 @@ static void test_acpi_q35_tcg_type4_count(void)
 free_test_data(&data);
 }
 
-static void test_acpi_q35_tcg_core_count(void)
+static void test_acpi_q35_kvm_core_count(void)
 {
 test_data data = {
 .machine = MACHINE_Q35,
@@ -1036,7 +1036,7 @@ static void test_acpi_q35_tcg_core_count(void)
 free_test_data(&data);
 }
 
-static void test_acpi_q35_tcg_core_count2(void)
+static void test_acpi_q35_kvm_core_count2(void)
 {
 test_data data = {
 .machine = MACHINE_Q35,
@@ -1053,7 +1053,7 @@ static void test_acpi_q35_tcg_core_count2(void)
 free_test_data(&data);
 }
 
-static void test_acpi_q35_tcg_thread_count(void)
+static void test_acpi_q35_kvm_thread_count(void)
 {
 test_data data = {
 .machine = MACHINE_Q35,
@@ -1070,7 +1070,7 @@ static void test_acpi_q35_tcg_thread_count(void)
 free_test_data(&data);
 }
 
-static void test_acpi_q35_tcg_thread_count2(void)
+static void test_acpi_q35_kvm_thread_count2(void)
 {
 test_data data = {
 .machine = MACHINE_Q35,
@@ -2250,15 +2250,15 @@ int main(int argc, char *argv[])
 qtest_add_func("acpi/q35/kvm/xapic", test_acpi_q35_kvm_xapic);
 qtest_add_func("acpi/q35/kvm/dmar", test_acpi_q35_kvm_dmar);
 qtest_add_func("acpi/q35/type4-count",
-   test_acpi_q35_tcg_type4_count);
+   test_acpi_q35_kvm_type4_count);
 qtest_add_func("acpi/q35/core-count",
-   test_acpi_q35_tcg_core_count);
+   test_acpi_q35_kvm_core_count);
 qtest_add_func("acpi/q35/core-count2",
-   test_acpi_q35_tcg_core_count2);
+   test_acpi_q35_kvm_core_count2);
 qtest_add_func("acpi/q35/thread-count",
-   test_acpi_q35_tcg_thread_count);
+   test_acpi_q35_kvm_thread_count);
 qtest_add_func("acpi/q35/thread-count2",
-   test_acpi_q35_tcg_thread_count2);
+   test_acpi_q35_kvm_thread_count2);
 }
 if (qtest_has_device("virtio-iommu-pci")) {
 qtest_add_func("acpi/q35/viot", test_acpi_q35_viot);
-- 
2.34.1




Re: [PATCH v1 5/7] tests/qtest/migration: Print migration incoming errors

2023-11-27 Thread Fabiano Rosas
Peter Xu  writes:

> On Fri, Nov 24, 2023 at 01:14:30PM -0300, Fabiano Rosas wrote:
>> We're currently just asserting when incoming migration fails. Let's
>> print the error message from QMP as well.
>> 
>> Signed-off-by: Fabiano Rosas 
>> ---
>>  tests/qtest/migration-helpers.c | 6 ++
>>  1 file changed, 6 insertions(+)
>> 
>> diff --git a/tests/qtest/migration-helpers.c 
>> b/tests/qtest/migration-helpers.c
>> index 24fb7b3525..f1106128a9 100644
>> --- a/tests/qtest/migration-helpers.c
>> +++ b/tests/qtest/migration-helpers.c
>> @@ -118,6 +118,12 @@ void migrate_incoming_qmp(QTestState *to, const char 
>> *uri, const char *fmt, ...)
>>  
>>  rsp = qtest_qmp(to, "{ 'execute': 'migrate-incoming', 'arguments': %p}",
>>  args);
>> +
>> +if (!qdict_haskey(rsp, "return")) {
>> +g_autoptr(GString) s = qobject_to_json_pretty(QOBJECT(rsp), true);
>> +g_test_message("%s", s->str);
>> +}
>
> This traps the "migrate-incoming" command only (which, afaiu, only setup
> the listening), would this capture the incoming error?

This is about the migrate-incoming only. We could replace "incoming
migration" with "qmp_migrate_incoming" in the commit message to clarify.



Re: [PATCH v1 6/7] tests/qtest/migration: Add a wrapper to print test names

2023-11-27 Thread Peter Xu
On Mon, Nov 27, 2023 at 12:44:53PM -0300, Fabiano Rosas wrote:
> >> +static void migration_test_wrapper(const void *data)
> >> +{
> >> +MigrationTest *test = (MigrationTest *)data;
> >> +
> >> +g_test_message("Running /%s%s", qtest_get_arch(), test->name);
> >
> > /%s/%s?
> 
> The test name contains a leading slash.

Then I suppose qtest_add_func() just tried to omit the duplicated '/'?

Never mind then.  My R-b holds.  Thanks,

-- 
Peter Xu




[PATCH] crypto: Introduce SM4 symmetric cipher algorithm

2023-11-27 Thread Hyman Huang
Introduce the SM4 cipher algorithms (OSCCA GB/T 32907-2016).

SM4 (GBT.32907-2016) is a cryptographic standard issued by the
Organization of State Commercial Administration of China (OSCCA)
as an authorized cryptographic algorithms for the use within China.

Signed-off-by: Hyman Huang 
---
 crypto/block-luks.c |  7 ++
 crypto/cipher-gcrypt.c.inc  |  4 
 crypto/cipher-nettle.c.inc  | 42 +
 crypto/cipher.c |  2 ++
 qapi/crypto.json|  5 +++-
 tests/unit/test-crypto-cipher.c | 11 +
 6 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index fb01ec38bb..1cb7f21a05 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -95,12 +95,19 @@ qcrypto_block_luks_cipher_size_map_twofish[] = {
 { 0, 0 },
 };
 
+static const QCryptoBlockLUKSCipherSizeMap
+qcrypto_block_luks_cipher_size_map_sm4[] = {
+{ 16, QCRYPTO_CIPHER_ALG_SM4},
+{ 0, 0 },
+};
+
 static const QCryptoBlockLUKSCipherNameMap
 qcrypto_block_luks_cipher_name_map[] = {
 { "aes", qcrypto_block_luks_cipher_size_map_aes },
 { "cast5", qcrypto_block_luks_cipher_size_map_cast5 },
 { "serpent", qcrypto_block_luks_cipher_size_map_serpent },
 { "twofish", qcrypto_block_luks_cipher_size_map_twofish },
+{ "sm4", qcrypto_block_luks_cipher_size_map_sm4},
 };
 
 QEMU_BUILD_BUG_ON(sizeof(struct QCryptoBlockLUKSKeySlot) != 48);
diff --git a/crypto/cipher-gcrypt.c.inc b/crypto/cipher-gcrypt.c.inc
index a6a0117717..03af50b0c3 100644
--- a/crypto/cipher-gcrypt.c.inc
+++ b/crypto/cipher-gcrypt.c.inc
@@ -35,6 +35,7 @@ bool qcrypto_cipher_supports(QCryptoCipherAlgorithm alg,
 case QCRYPTO_CIPHER_ALG_SERPENT_256:
 case QCRYPTO_CIPHER_ALG_TWOFISH_128:
 case QCRYPTO_CIPHER_ALG_TWOFISH_256:
+case QCRYPTO_CIPHER_ALG_SM4:
 break;
 default:
 return false;
@@ -219,6 +220,9 @@ static QCryptoCipher 
*qcrypto_cipher_ctx_new(QCryptoCipherAlgorithm alg,
 case QCRYPTO_CIPHER_ALG_TWOFISH_256:
 gcryalg = GCRY_CIPHER_TWOFISH;
 break;
+case QCRYPTO_CIPHER_ALG_SM4:
+gcryalg = GCRY_CIPHER_SM4;
+break;
 default:
 error_setg(errp, "Unsupported cipher algorithm %s",
QCryptoCipherAlgorithm_str(alg));
diff --git a/crypto/cipher-nettle.c.inc b/crypto/cipher-nettle.c.inc
index 24cc61f87b..cd2ca0c7b5 100644
--- a/crypto/cipher-nettle.c.inc
+++ b/crypto/cipher-nettle.c.inc
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifndef CONFIG_QEMU_PRIVATE_XTS
 #include 
 #endif
@@ -426,6 +427,28 @@ DEFINE_ECB_CBC_CTR_XTS(qcrypto_nettle_twofish,
QCryptoNettleTwofish, TWOFISH_BLOCK_SIZE,
twofish_encrypt_native, twofish_decrypt_native)
 
+typedef struct QCryptoNettleSm4 {
+QCryptoCipher base;
+struct sm4_ctx key[2];
+} QCryptoNettleSm4;
+
+static void sm4_encrypt_native(void *ctx, size_t length,
+   uint8_t *dst, const uint8_t *src)
+{
+struct sm4_ctx *keys = ctx;
+sm4_crypt(&keys[0], length, dst, src);
+}
+
+static void sm4_decrypt_native(void *ctx, size_t length,
+   uint8_t *dst, const uint8_t *src)
+{
+struct sm4_ctx *keys = ctx;
+sm4_crypt(&keys[1], length, dst, src);
+}
+
+DEFINE_ECB(qcrypto_nettle_sm4,
+   QCryptoNettleSm4, SM4_BLOCK_SIZE,
+   sm4_encrypt_native, sm4_decrypt_native)
 
 bool qcrypto_cipher_supports(QCryptoCipherAlgorithm alg,
  QCryptoCipherMode mode)
@@ -443,6 +466,7 @@ bool qcrypto_cipher_supports(QCryptoCipherAlgorithm alg,
 case QCRYPTO_CIPHER_ALG_TWOFISH_128:
 case QCRYPTO_CIPHER_ALG_TWOFISH_192:
 case QCRYPTO_CIPHER_ALG_TWOFISH_256:
+case QCRYPTO_CIPHER_ALG_SM4:
 break;
 default:
 return false;
@@ -702,6 +726,24 @@ static QCryptoCipher 
*qcrypto_cipher_ctx_new(QCryptoCipherAlgorithm alg,
 return &ctx->base;
 }
 
+case QCRYPTO_CIPHER_ALG_SM4:
+{
+QCryptoNettleSm4 *ctx = g_new0(QCryptoNettleSm4, 1);
+
+switch (mode) {
+case QCRYPTO_CIPHER_MODE_ECB:
+ctx->base.driver = &qcrypto_nettle_sm4_driver_ecb;
+break;
+default:
+goto bad_cipher_mode;
+}
+
+sm4_set_encrypt_key(&ctx->key[0], key);
+sm4_set_decrypt_key(&ctx->key[1], key);
+
+return &ctx->base;
+}
+
 default:
 error_setg(errp, "Unsupported cipher algorithm %s",
QCryptoCipherAlgorithm_str(alg));
diff --git a/crypto/cipher.c b/crypto/cipher.c
index 74b09a5b26..048ceaa6a3 100644
--- a/crypto/cipher.c
+++ b/crypto/cipher.c
@@ -38,6 +38,7 @@ static const size_t alg_key_len[QCRYPTO_CIPHER_ALG__MAX] = {
 [QCRYPTO_CIPHER_ALG_TWOFISH_128] = 16,
 [QCRYPTO_CIPHER_ALG_TWOFISH_192] = 24,
 [QCRYPTO_CIPHER_ALG_TWOFISH_25

Re: [PATCH 3/4] scsi: don't lock AioContext in I/O code path

2023-11-27 Thread Eric Blake
On Thu, Nov 23, 2023 at 02:49:30PM -0500, Stefan Hajnoczi wrote:
> blk_aio_*() doesn't require the AioContext lock and the SCSI subsystem's
> internal state also does not anymore.
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  hw/scsi/scsi-disk.c| 23 ---
>  hw/scsi/scsi-generic.c | 20 +++-
>  2 files changed, 3 insertions(+), 40 deletions(-)
>

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: [PATCH v7 01/10] hw/fsi: Introduce IBM's Local bus

2023-11-27 Thread Cédric Le Goater

On 10/26/23 18:47, Ninad Palsule wrote:

This is a part of patchset where IBM's Flexible Service Interface is
introduced.

The LBUS is modelled to maintain the qdev bus hierarchy and to take
advantage of the object model to automatically generate the CFAM
configuration block. The configuration block presents engines in the
order they are attached to the CFAM's LBUS. Engine implementations
should subclass the LBusDevice and set the 'config' member of
LBusDeviceClass to match the engine's type.

Signed-off-by: Andrew Jeffery 
Signed-off-by: Ninad Palsule 
---
v2:
- Incorporated Joel's review comments.
v5:
- Incorporated review comments by Cedric.
v6:
- Incorporated review comments by Cedric & Daniel.
v7:
- Incorporated review comments by Philippe.
---
  include/hw/fsi/lbus.h | 43 +
  hw/fsi/lbus.c | 74 +++
  hw/Kconfig|  1 +
  hw/fsi/Kconfig|  2 ++
  hw/fsi/meson.build|  1 +
  hw/meson.build|  1 +
  6 files changed, 122 insertions(+)
  create mode 100644 include/hw/fsi/lbus.h
  create mode 100644 hw/fsi/lbus.c
  create mode 100644 hw/fsi/Kconfig
  create mode 100644 hw/fsi/meson.build

diff --git a/include/hw/fsi/lbus.h b/include/hw/fsi/lbus.h
new file mode 100644
index 00..4fa696bbdb
--- /dev/null
+++ b/include/hw/fsi/lbus.h
@@ -0,0 +1,43 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Local bus and connected device structures.
+ */
+#ifndef FSI_LBUS_H
+#define FSI_LBUS_H
+
+#include "exec/memory.h"
+#include "hw/qdev-core.h"
+
+#define TYPE_FSI_LBUS_DEVICE "fsi.lbus.device"
+OBJECT_DECLARE_TYPE(FSILBusDevice, FSILBusDeviceClass, FSI_LBUS_DEVICE)
+
+#define FSI_LBUS_MEM_REGION_SIZE  (2 * 1024 * 1024)
+#define FSI_LBUSDEV_IOMEM_SIZE0x400
+
+typedef struct FSILBusDevice {
+DeviceState parent;
+
+MemoryRegion iomem;
+uint32_t address;


This is address attribute is not useful. Models/HW unit generally do not
know where their MMIO regions are mmapped in the address space.


+} FSILBusDevice;
+
+typedef struct FSILBusDeviceClass {
+DeviceClass parent;
+
+uint32_t config;
+} FSILBusDeviceClass;
+
+#define TYPE_FSI_LBUS "fsi.lbus"
+OBJECT_DECLARE_SIMPLE_TYPE(FSILBus, FSI_LBUS)
+
+typedef struct FSILBus {
+BusState bus;
+
+MemoryRegion mr;
+} FSILBus;
+
+DeviceState *lbus_create_device(FSILBus *bus, const char *type, uint32_t addr);
+int lbus_add_device(FSILBus *bus, FSILBusDevice *dev);
+#endif /* FSI_LBUS_H */
diff --git a/hw/fsi/lbus.c b/hw/fsi/lbus.c
new file mode 100644
index 00..3a7335dde5
--- /dev/null
+++ b/hw/fsi/lbus.c
@@ -0,0 +1,74 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Local bus where FSI slaves are connected
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "hw/fsi/lbus.h"
+
+#include "hw/qdev-properties.h"
+
+static void lbus_init(Object *o)
+{
+FSILBus *lbus = FSI_LBUS(o);
+
+memory_region_init(&lbus->mr, OBJECT(lbus), TYPE_FSI_LBUS,
+   FSI_LBUS_MEM_REGION_SIZE - FSI_LBUSDEV_IOMEM_SIZE);
+}
+
+static const TypeInfo lbus_info = {
+.name = TYPE_FSI_LBUS,
+.parent = TYPE_BUS,
+.instance_init = lbus_init,
+.instance_size = sizeof(FSILBus),
+};
+
+static Property lbus_device_props[] = {
+DEFINE_PROP_UINT32("address", FSILBusDevice, address, 0),
+DEFINE_PROP_END_OF_LIST(),
+};


Please remove lbus_device_props


+DeviceState *lbus_create_device(FSILBus *bus, const char *type, uint32_t addr)


This routine is only used once in a realize handler which could propagate
a possible error. Please remove.


+{
+DeviceState *ds;
+BusState *state = BUS(bus);
+FSILBusDevice *dev;
+
+ds = qdev_new(type);
+qdev_prop_set_uint32(ds, "address", addr);
+qdev_realize_and_unref(ds, state, &error_fatal);
+
+dev = FSI_LBUS_DEVICE(ds);
+memory_region_add_subregion(&bus->mr, dev->address,
+&dev->iomem);
+
+return ds;
+}
+
+static void lbus_device_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+
+dc->bus_type = TYPE_FSI_LBUS;
+device_class_set_props(dc, lbus_device_props);
+}
+
+static const TypeInfo lbus_device_type_info = {
+.name = TYPE_FSI_LBUS_DEVICE,
+.parent = TYPE_DEVICE,
+.instance_size = sizeof(FSILBusDevice),
+.abstract = true,
+.class_init = lbus_device_class_init,
+.class_size = sizeof(FSILBusDeviceClass),
+};
+
+static void lbus_register_types(void)
+{
+type_register_static(&lbus_info);
+type_register_static(&lbus_device_type_info);
+}
+
+type_init(lbus_register_types);
diff --git a/hw/Kconfig b/hw/Kconfig
index 9ca7b38c31..2c00936c28 100644
--- a/hw/Kconfig
+++ b/hw/Kconfig
@@ -9,6 +9,7 @@ source core/Kconfig
  source cxl/Kconfig
  source display/Kconfig
  source dma/Kconfig
+source fsi/Kconfig
  source gpio/Kconfig
  source hyperv/Kconfig
 

Re: [PATCH] vmdk: Don't corrupt desc file in vmdk_write_cid

2023-11-27 Thread Kevin Wolf
Am 24.11.2023 um 12:56 hat Fam geschrieben:
> From: Fam Zheng 
> 
> If the text description file is larger than DESC_SIZE, we force the last
> byte in the buffer to be 0 and write it out.
> 
> This results in a corruption.
> 
> Try to allocate a big buffer in this case.
> 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1923
> 
> Signed-off-by: Fam Zheng 

Thanks, applied to the block branch.

But while I'm looking at this function, is there really anything that
guarantees that "parentCID" always exists and comes immediately after
"CID"? This looks like a questionable assumption to me.

Kevin




Re: [PATCH v7 02/10] hw/fsi: Introduce IBM's scratchpad

2023-11-27 Thread Cédric Le Goater

On 10/26/23 18:47, Ninad Palsule wrote:

This is a part of patchset where scratchpad is introduced.

The scratchpad provides a set of non-functional registers. The firmware
is free to use them, hardware does not support any special management
support. The scratchpad registers can be read or written from LBUS
slave.

In this model, The LBUS device is parent for the scratchpad.

Signed-off-by: Andrew Jeffery 
Signed-off-by: Ninad Palsule 
---
v2:
- Incorporated Joel's review comments.
v5:
- Incorporated review comments by Cedric.
v6:
- Incorporated review comments by Daniel.
v7:
- Incorporated review comments by Philippe.
- Cleaned up unused bits.
---
  meson.build|  1 +
  hw/fsi/trace.h |  1 +
  include/hw/fsi/engine-scratchpad.h | 27 +
  include/hw/fsi/fsi.h   | 16 +
  hw/fsi/engine-scratchpad.c | 93 ++
  hw/fsi/Kconfig |  4 ++
  hw/fsi/meson.build |  1 +
  hw/fsi/trace-events|  2 +
  8 files changed, 145 insertions(+)
  create mode 100644 hw/fsi/trace.h
  create mode 100644 include/hw/fsi/engine-scratchpad.h
  create mode 100644 include/hw/fsi/fsi.h
  create mode 100644 hw/fsi/engine-scratchpad.c
  create mode 100644 hw/fsi/trace-events

diff --git a/meson.build b/meson.build
index dcef8b1e79..793c7c1f20 100644
--- a/meson.build
+++ b/meson.build
@@ -3257,6 +3257,7 @@ if have_system
  'hw/char',
  'hw/display',
  'hw/dma',
+'hw/fsi',


This should be introduced in the first patch.


  'hw/hyperv',
  'hw/i2c',
  'hw/i386',
diff --git a/hw/fsi/trace.h b/hw/fsi/trace.h
new file mode 100644
index 00..ee67c7fb04
--- /dev/null
+++ b/hw/fsi/trace.h
@@ -0,0 +1 @@
+#include "trace/trace-hw_fsi.h"
diff --git a/include/hw/fsi/engine-scratchpad.h 
b/include/hw/fsi/engine-scratchpad.h
new file mode 100644
index 00..4ffa871965
--- /dev/null
+++ b/include/hw/fsi/engine-scratchpad.h


This model is introduced to early please drop from this patch.


@@ -0,0 +1,27 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM scratchpad engne
+ */
+#ifndef FSI_ENGINE_SCRATCHPAD_H
+#define FSI_ENGINE_SCRATCHPAD_H
+
+#include "hw/fsi/lbus.h"
+#include "hw/fsi/fsi.h"
+
+#define ENGINE_CONFIG_NEXTBE_BIT(0)
+#define ENGINE_CONFIG_TYPE_PEEK   (0x02 << 4)
+#define ENGINE_CONFIG_TYPE_FSI(0x03 << 4)
+#define ENGINE_CONFIG_TYPE_SCRATCHPAD (0x06 << 4)
+
+#define TYPE_FSI_SCRATCHPAD "fsi.scratchpad"
+#define SCRATCHPAD(obj) OBJECT_CHECK(FSIScratchPad, (obj), TYPE_FSI_SCRATCHPAD)
+
+typedef struct FSIScratchPad {
+FSILBusDevice parent;
+
+uint32_t reg;
+} FSIScratchPad;
+
+#endif /* FSI_ENGINE_SCRATCHPAD_H */
diff --git a/include/hw/fsi/fsi.h b/include/hw/fsi/fsi.h
new file mode 100644
index 00..b08b97f62b
--- /dev/null
+++ b/include/hw/fsi/fsi.h
@@ -0,0 +1,16 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Flexible Service Interface
+ */
+#ifndef FSI_FSI_H
+#define FSI_FSI_H
+
+#include "qemu/bitops.h"
+
+/* Bitwise operations at the word level. */
+#define BE_BIT(x)   BIT(31 - (x))


31 ? BITS_PER_LONG would be better I think.


+#define BE_GENMASK(hb, lb)  MAKE_64BIT_MASK((lb), ((hb) - (lb) + 1))
+
+#endif
diff --git a/hw/fsi/engine-scratchpad.c b/hw/fsi/engine-scratchpad.c
new file mode 100644
index 00..a8887cd613
--- /dev/null
+++ b/hw/fsi/engine-scratchpad.c


DRop FSIScratchPad from this patch. It should be introduced later.

However, TYPE_FSI_BUS should be introcuded now because the following
patch needs it.



@@ -0,0 +1,93 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM scratchpad engine
+ */
+
+#include "qemu/osdep.h"
+
+#include "qapi/error.h"
+#include "qemu/log.h"
+#include "trace.h"
+
+#include "hw/fsi/engine-scratchpad.h"
+
+static uint64_t fsi_scratchpad_read(void *opaque, hwaddr addr, unsigned size)
+{
+FSIScratchPad *s = SCRATCHPAD(opaque);
+
+trace_fsi_scratchpad_read(addr, size);
+
+if (addr) {
+return 0;
+}
+
+return s->reg;
+}
+
+static void fsi_scratchpad_write(void *opaque, hwaddr addr, uint64_t data,
+ unsigned size)
+{
+FSIScratchPad *s = SCRATCHPAD(opaque);
+
+trace_fsi_scratchpad_write(addr, size, data);
+
+if (addr) {
+return;
+}
+
+s->reg = data;
+}
+
+static const struct MemoryRegionOps scratchpad_ops = {
+.read = fsi_scratchpad_read,
+.write = fsi_scratchpad_write,
+.endianness = DEVICE_BIG_ENDIAN,
+};
+
+static void fsi_scratchpad_realize(DeviceState *dev, Error **errp)
+{
+FSILBusDevice *ldev = FSI_LBUS_DEVICE(dev);
+
+memory_region_init_io(&ldev->iomem, OBJECT(ldev), &scratchpad_ops,
+  ldev, TYPE_FSI_SCRATCHPAD, 0x400);
+}
+
+static void fsi_scratchpad_reset(DeviceState *dev)
+

Re: [PATCH] crypto: Introduce SM4 symmetric cipher algorithm

2023-11-27 Thread Daniel P . Berrangé
On Mon, Nov 27, 2023 at 11:55:34PM +0800, Hyman Huang wrote:
> Introduce the SM4 cipher algorithms (OSCCA GB/T 32907-2016).
> 
> SM4 (GBT.32907-2016) is a cryptographic standard issued by the
> Organization of State Commercial Administration of China (OSCCA)
> as an authorized cryptographic algorithms for the use within China.

Just out of interest, what part of QEMU are you needing to use
SM4 with ? Is it for a LUKS block driver cipher ?

> 
> Signed-off-by: Hyman Huang 
> ---
>  crypto/block-luks.c |  7 ++
>  crypto/cipher-gcrypt.c.inc  |  4 

Looking at the gcrypt code, SM4 is only supported in >= 1.9.0 

QEMU min version is 1.8.0, so you'll need to modify meson.build
to check whether SM4 is supported and put conditionals in this
file

>  crypto/cipher-nettle.c.inc  | 42 +

Looking at the nettle code, SM4 is only supported in unreleased
versions thus far.

So again will need a meson.build check and conditionals.

>  crypto/cipher.c |  2 ++
>  qapi/crypto.json|  5 +++-
>  tests/unit/test-crypto-cipher.c | 11 +
>  6 files changed, 70 insertions(+), 1 deletion(-)


> diff --git a/qapi/crypto.json b/qapi/crypto.json
> index fd3d46ebd1..95fa10bb6d 100644
> --- a/qapi/crypto.json
> +++ b/qapi/crypto.json
> @@ -94,6 +94,8 @@
>  #
>  # @twofish-256: Twofish with 256 bit / 32 byte keys
>  #
> +# @sm4: SM4 with 128 bit / 16 byte keys (since 8.2)

We're in feature freeze for 8.2, so mark this 9.0 as that'll be the
next available release this could be merged for.

> +#
>  # Since: 2.6
>  ##
>  { 'enum': 'QCryptoCipherAlgorithm',
> @@ -102,7 +104,8 @@
> 'des', '3des',
> 'cast5-128',
> 'serpent-128', 'serpent-192', 'serpent-256',
> -   'twofish-128', 'twofish-192', 'twofish-256']}
> +   'twofish-128', 'twofish-192', 'twofish-256',
> +   'sm4']}
>  
>  ##
>  # @QCryptoCipherMode:
> diff --git a/tests/unit/test-crypto-cipher.c b/tests/unit/test-crypto-cipher.c
> index d9d9d078ff..80a4984e43 100644
> --- a/tests/unit/test-crypto-cipher.c
> +++ b/tests/unit/test-crypto-cipher.c
> @@ -382,6 +382,17 @@ static QCryptoCipherTestData test_data[] = {
>  .plaintext = "90afe91bb288544f2c32dc239b2635e6",
>  .ciphertext = "6cb4561c40bf0a9705931cb6d408e7fa",
>  },
> +{
> +/* SM4, GB/T 32907-2016, Appendix A.1 */
> +.path = "/crypto/cipher/sm4",
> +.alg = QCRYPTO_CIPHER_ALG_SM4,
> +.mode = QCRYPTO_CIPHER_MODE_ECB,
> +.key = "0123456789abcdeffedcba9876543210",
> +.plaintext  =
> +"0123456789abcdeffedcba9876543210",
> +.ciphertext =
> +"681edf34d206965e86b3e94f536e4246",
> +},
>  {
>  /* #1 32 byte key, 32 byte PTX */
>  .path = "/crypto/cipher/aes-xts-128-1",
> -- 
> 2.39.1
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v7 03/10] hw/fsi: Introduce IBM's cfam,fsi-slave

2023-11-27 Thread Cédric Le Goater

On 10/26/23 18:47, Ninad Palsule wrote:

This is a part of patchset where IBM's Flexible Service Interface is
introduced.

The Common FRU Access Macro (CFAM), an address space containing
various "engines" that drive accesses on busses internal and external
to the POWER chip. Examples include the SBEFIFO and I2C masters. The
engines hang off of an internal Local Bus (LBUS) which is described
by the CFAM configuration block.

The FSI slave: The slave is the terminal point of the FSI bus for
FSI symbols addressed to it. Slaves can be cascaded off of one
another. The slave's configuration registers appear in address space
of the CFAM to which it is attached.

Signed-off-by: Andrew Jeffery 
Signed-off-by: Ninad Palsule 
---
v2:
- Incorporated Joel's review comments.
v3:
- Incorporated Thomas Huth's review comments.
v5:
- Incorporated review comments by Cedric.
v6:
- Incorporated review comments by Cedric & Daniel
---
  include/hw/fsi/cfam.h  |  34 
  include/hw/fsi/fsi-slave.h |  29 +++
  include/hw/fsi/fsi.h   |  20 +
  hw/fsi/cfam.c  | 173 +
  hw/fsi/fsi-slave.c |  78 +
  hw/fsi/Kconfig |   9 ++
  hw/fsi/meson.build |   2 +
  hw/fsi/trace-events|   7 ++
  8 files changed, 352 insertions(+)
  create mode 100644 include/hw/fsi/cfam.h
  create mode 100644 include/hw/fsi/fsi-slave.h
  create mode 100644 hw/fsi/cfam.c
  create mode 100644 hw/fsi/fsi-slave.c

diff --git a/include/hw/fsi/cfam.h b/include/hw/fsi/cfam.h
new file mode 100644
index 00..842a3bad0c
--- /dev/null
+++ b/include/hw/fsi/cfam.h
@@ -0,0 +1,34 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Common FRU Access Macro
+ */
+#ifndef FSI_CFAM_H
+#define FSI_CFAM_H
+
+#include "exec/memory.h"
+
+#include "hw/fsi/fsi-slave.h"
+#include "hw/fsi/lbus.h"



Please include here the FSIScratchPad definitions. There is no need
for an extra file since the object is only used under FSICFAMState.



+
+#define TYPE_FSI_CFAM "cfam"
+#define FSI_CFAM(obj) OBJECT_CHECK(FSICFAMState, (obj), TYPE_FSI_CFAM)
+
+/* P9-ism */
+#define CFAM_CONFIG_NR_REGS 0x28
+
+typedef struct FSICFAMState {
+/* < private > */
+FSISlaveState parent;
+
+/* CFAM config address space */
+MemoryRegion config_iomem;
+
+MemoryRegion mr;
+AddressSpace as;
+
+FSILBus lbus;


Please add :

FSIScratchPad scratchpad;


+} FSICFAMState;
+
+#endif /* FSI_CFAM_H */
diff --git a/include/hw/fsi/fsi-slave.h b/include/hw/fsi/fsi-slave.h
new file mode 100644
index 00..f5f23f4457
--- /dev/null
+++ b/include/hw/fsi/fsi-slave.h
@@ -0,0 +1,29 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Flexible Service Interface slave
+ */
+#ifndef FSI_FSI_SLAVE_H
+#define FSI_FSI_SLAVE_H
+
+#include "exec/memory.h"
+#include "hw/qdev-core.h"
+
+#include "hw/fsi/lbus.h"
+
+#include 
+
+#define TYPE_FSI_SLAVE "fsi.slave"
+OBJECT_DECLARE_SIMPLE_TYPE(FSISlaveState, FSI_SLAVE)
+
+#define FSI_SLAVE_CONTROL_NR_REGS ((0x40 >> 2) + 1)
+
+typedef struct FSISlaveState {
+DeviceState parent;
+
+MemoryRegion iomem;
+uint32_t regs[FSI_SLAVE_CONTROL_NR_REGS];
+} FSISlaveState;
+
+#endif /* FSI_FSI_H */
diff --git a/include/hw/fsi/fsi.h b/include/hw/fsi/fsi.h
index b08b97f62b..3cbc685226 100644
--- a/include/hw/fsi/fsi.h
+++ b/include/hw/fsi/fsi.h
@@ -8,9 +8,29 @@
  #define FSI_FSI_H
  
  #include "qemu/bitops.h"

+#include "hw/qdev-core.h"
+
+/*
+ * TODO: Maybe unwind this dependency with const links? Store a
+ * pointer in FSIBus?
+ */
+#include "hw/fsi/cfam.h"
  
  /* Bitwise operations at the word level. */

  #define BE_BIT(x)   BIT(31 - (x))
  #define BE_GENMASK(hb, lb)  MAKE_64BIT_MASK((lb), ((hb) - (lb) + 1))
  
+#define TYPE_FSI_BUS "fsi.bus"

+OBJECT_DECLARE_SIMPLE_TYPE(FSIBus, FSI_BUS)
+
+/* TODO: Figure out what's best with a point-to-point bus */
+typedef struct FSISlaveState FSISlaveState;
+
+typedef struct FSIBus {
+BusState bus;
+
+/* XXX: It's point-to-point, just instantiate the slave directly for now */
+FSICFAMState slave;


Drop the FSICFAMState from FSIBus and introduce FSIBus in the previous
patch.


+} FSIBus;
+
  #endif
diff --git a/hw/fsi/cfam.c b/hw/fsi/cfam.c
new file mode 100644
index 00..a1c037925f
--- /dev/null
+++ b/hw/fsi/cfam.c
@@ -0,0 +1,173 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Common FRU Access Macro
+ */
+
+#include "qemu/osdep.h"
+
+#include "qapi/error.h"
+#include "trace.h"
+
+#include "hw/fsi/cfam.h"
+#include "hw/fsi/fsi.h"
+#include "hw/fsi/engine-scratchpad.h"
+
+#include "hw/qdev-properties.h"
+
+#define TO_REG(x)  ((x) >> 2)
+
+#define CFAM_ENGINE_CONFIG  TO_REG(0x04)
+
+#define CFAM_CONFIG_CHIP_IDTO_REG(0x00)
+#define CFAM_CONFIG_CHIP_ID_P9 0xc0022d15
+#

Re: [PATCH v7 04/10] hw/fsi: Introduce IBM's FSI

2023-11-27 Thread Cédric Le Goater

On 10/26/23 18:47, Ninad Palsule wrote:

This is a part of patchset where IBM's Flexible Service Interface is
introduced.

This commit models the FSI bus. CFAM is hanging out of FSI bus. The bus
is model such a way that it is embedded inside the FSI master which is a
bus controller.

The FSI master: A controller in the platform service processor (e.g.
BMC) driving CFAM engine accesses into the POWER chip. At the
hardware level FSI is a bit-based protocol supporting synchronous and
DMA-driven accesses of engines in a CFAM.

Signed-off-by: Andrew Jeffery 
Signed-off-by: Ninad Palsule 
Reviewed-by: Joel Stanley 
---
v2:
- Incorporated review comments by Joel
v5:
- Incorporated review comments by Cedric.
v6:
- Incorporated review comments by Cedric & Daniel
v7:
- Cleaned up unused bits.
---
  include/hw/fsi/fsi-master.h |  30 +++
  hw/fsi/fsi-master.c | 162 
  hw/fsi/fsi.c|  25 ++
  hw/fsi/meson.build  |   2 +-
  hw/fsi/trace-events |   2 +
  5 files changed, 220 insertions(+), 1 deletion(-)
  create mode 100644 include/hw/fsi/fsi-master.h
  create mode 100644 hw/fsi/fsi-master.c
  create mode 100644 hw/fsi/fsi.c

diff --git a/include/hw/fsi/fsi-master.h b/include/hw/fsi/fsi-master.h
new file mode 100644
index 00..847078919c
--- /dev/null
+++ b/include/hw/fsi/fsi-master.h
@@ -0,0 +1,30 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2019 IBM Corp.
+ *
+ * IBM Flexible Service Interface Master
+ */
+#ifndef FSI_FSI_MASTER_H
+#define FSI_FSI_MASTER_H
+
+#include "exec/memory.h"
+#include "hw/qdev-core.h"
+#include "hw/fsi/fsi.h"
+
+#define TYPE_FSI_MASTER "fsi.master"
+OBJECT_DECLARE_SIMPLE_TYPE(FSIMasterState, FSI_MASTER)
+
+#define FSI_MASTER_NR_REGS ((0x2e0 >> 2) + 1)
+
+typedef struct FSIMasterState {
+DeviceState parent;
+MemoryRegion iomem;
+MemoryRegion opb2fsi;
+
+FSIBus bus;
+
+uint32_t regs[FSI_MASTER_NR_REGS];


Move FSICFAMState here.



+} FSIMasterState;
+
+
+#endif /* FSI_FSI_H */
diff --git a/hw/fsi/fsi-master.c b/hw/fsi/fsi-master.c
new file mode 100644
index 00..bb7a893003
--- /dev/null
+++ b/hw/fsi/fsi-master.c
@@ -0,0 +1,162 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Flexible Service Interface master
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/log.h"
+#include "trace.h"
+
+#include "hw/fsi/fsi-master.h"
+
+#define TYPE_OP_BUS "opb"
+
+#define TO_REG(x)   ((x) >> 2)
+
+#define FSI_MENP0   TO_REG(0x010)
+#define FSI_MENP32  TO_REG(0x014)
+#define FSI_MSENP0  TO_REG(0x018)
+#define FSI_MLEVP0  TO_REG(0x018)
+#define FSI_MSENP32 TO_REG(0x01c)
+#define FSI_MLEVP32 TO_REG(0x01c)
+#define FSI_MCENP0  TO_REG(0x020)
+#define FSI_MREFP0  TO_REG(0x020)
+#define FSI_MCENP32 TO_REG(0x024)
+#define FSI_MREFP32 TO_REG(0x024)
+
+#define FSI_MVERTO_REG(0x074)
+#define FSI_MRESP0  TO_REG(0x0d0)
+
+#define FSI_MRESB0  TO_REG(0x1d0)
+#define   FSI_MRESB0_RESET_GENERAL  BE_BIT(0)
+#define   FSI_MRESB0_RESET_ERRORBE_BIT(1)
+
+static uint64_t fsi_master_read(void *opaque, hwaddr addr, unsigned size)
+{
+FSIMasterState *s = FSI_MASTER(opaque);
+
+trace_fsi_master_read(addr, size);
+
+if (addr + size > sizeof(s->regs)) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: Out of bounds read: 0x%"HWADDR_PRIx" for %u\n",
+  __func__, addr, size);
+return 0;
+}
+
+return s->regs[TO_REG(addr)];
+}
+
+static void fsi_master_write(void *opaque, hwaddr addr, uint64_t data,
+ unsigned size)
+{
+FSIMasterState *s = FSI_MASTER(opaque);
+
+trace_fsi_master_write(addr, size, data);
+
+if (addr + size > sizeof(s->regs)) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "%s: Out of bounds write: %"HWADDR_PRIx" for %u\n",
+  __func__, addr, size);
+return;
+}
+
+switch (TO_REG(addr)) {
+case FSI_MENP0:
+s->regs[FSI_MENP0] = data;
+break;
+case FSI_MENP32:
+s->regs[FSI_MENP32] = data;
+break;
+case FSI_MSENP0:
+s->regs[FSI_MENP0] |= data;
+break;
+case FSI_MSENP32:
+s->regs[FSI_MENP32] |= data;
+break;
+case FSI_MCENP0:
+s->regs[FSI_MENP0] &= ~data;
+break;
+case FSI_MCENP32:
+s->regs[FSI_MENP32] &= ~data;
+break;
+case FSI_MRESP0:
+/* Perform necessary resets leave register 0 to indicate no errors */
+b

Re: [PATCH v7 05/10] hw/fsi: IBM's On-chip Peripheral Bus

2023-11-27 Thread Cédric Le Goater

On 10/26/23 18:47, Ninad Palsule wrote:

This is a part of patchset where IBM's Flexible Service Interface is
introduced.

The On-Chip Peripheral Bus (OPB): A low-speed bus typically found in
POWER processors. This now makes an appearance in the ASPEED SoC due
to tight integration of the FSI master IP with the OPB, mainly the
existence of an MMIO-mapping of the CFAM address straight onto a
sub-region of the OPB address space.

Signed-off-by: Andrew Jeffery 
Signed-off-by: Ninad Palsule 
Reviewed-by: Joel Stanley 
---
v2:
- Incorporated review comment by Joel.
v5:
- Incorporated review comments by Cedric.
v6:
- Incorporated review comments by Cedric & Daniel
v7:
- Incorporated review comments by Cedric.
---
  include/hw/fsi/opb.h | 33 
  hw/fsi/fsi-master.c  |  3 +-
  hw/fsi/opb.c | 74 
  hw/fsi/Kconfig   |  4 +++
  hw/fsi/meson.build   |  1 +
  5 files changed, 113 insertions(+), 2 deletions(-)
  create mode 100644 include/hw/fsi/opb.h
  create mode 100644 hw/fsi/opb.c

diff --git a/include/hw/fsi/opb.h b/include/hw/fsi/opb.h
new file mode 100644
index 00..8b71bb55c2
--- /dev/null
+++ b/include/hw/fsi/opb.h
@@ -0,0 +1,33 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM On-Chip Peripheral Bus
+ */
+#ifndef FSI_OPB_H
+#define FSI_OPB_H
+
+#include "exec/memory.h"
+#include "hw/fsi/fsi-master.h"
+
+#define TYPE_OP_BUS "opb"
+OBJECT_DECLARE_SIMPLE_TYPE(OPBus, OP_BUS)
+
+typedef struct OPBus {
+/*< private >*/
+BusState bus;
+
+/*< public >*/
+MemoryRegion mr;
+AddressSpace as;
+
+/* Model OPB as dumb enough just to provide an address-space */
+/* TODO: Maybe don't store device state in the bus? */
+FSIMasterState fsi;


Please remove. FSIMasterState should be introduced later.


+} OPBus;
+
+typedef struct OPBusClass {
+BusClass parent_class;
+} OPBusClass;
+
+#endif /* FSI_OPB_H */
diff --git a/hw/fsi/fsi-master.c b/hw/fsi/fsi-master.c
index bb7a893003..ec092b42ea 100644
--- a/hw/fsi/fsi-master.c
+++ b/hw/fsi/fsi-master.c
@@ -11,8 +11,7 @@
  #include "trace.h"
  
  #include "hw/fsi/fsi-master.h"

-
-#define TYPE_OP_BUS "opb"
+#include "hw/fsi/opb.h"


ouch.

This is an ugly hack because the modeling is broken. OPB should be introduced
before tFSIMasterState.



  #define TO_REG(x)   ((x) >> 2)
  
diff --git a/hw/fsi/opb.c b/hw/fsi/opb.c

new file mode 100644
index 00..04771b4b27
--- /dev/null
+++ b/hw/fsi/opb.c
@@ -0,0 +1,74 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM On-chip Peripheral Bus
+ */
+
+#include "qemu/osdep.h"
+
+#include "qapi/error.h"
+#include "qemu/log.h"
+
+#include "hw/fsi/opb.h"
+
+static void fsi_opb_realize(BusState *bus, Error **errp)
+{
+OPBus *opb = OP_BUS(bus);
+
+memory_region_init_io(&opb->mr, OBJECT(opb), NULL, opb,
+  NULL, UINT32_MAX);
+address_space_init(&opb->as, &opb->mr, "opb");


Please keep the above and put it in a instance_init handler and remove
the rest. It should go under AspeedAPB2OPBState.


+
+if (!object_property_set_bool(OBJECT(&opb->fsi), "realized", true, errp)) {
+return;
+}
+
+memory_region_add_subregion(&opb->mr, 0x8000, &opb->fsi.iomem);
+
+/* OPB2FSI region */
+/*
+ * Avoid endianness issues by mapping each slave's memory region directly.
+ * Manually bridging multiple address-spaces causes endian swapping
+ * headaches as memory_region_dispatch_read() and
+ * memory_region_dispatch_write() correct the endianness based on the
+ * target machine endianness and not relative to the device endianness on
+ * either side of the bridge.
+ */
+/*
+ * XXX: This is a bit hairy and will need to be fixed when I sort out the
+ * bus/slave relationship and any changes to the CFAM modelling (multiple
+ * slaves, LBUS)
+ */
+memory_region_add_subregion(&opb->mr, 0xa000, &opb->fsi.opb2fsi);
+}
+
+static void fsi_opb_init(Object *o)
+{
+OPBus *opb = OP_BUS(o);
+
+object_initialize_child(o, "fsi-master", &opb->fsi, TYPE_FSI_MASTER);
+qdev_set_parent_bus(DEVICE(&opb->fsi), BUS(o), &error_abort);
+}


Drop all of the above.


Thanks,

C.




+
+static void fsi_opb_class_init(ObjectClass *klass, void *data)
+{
+BusClass *bc = BUS_CLASS(klass);
+bc->realize = fsi_opb_realize;
+}
+
+static const TypeInfo opb_info = {
+.name = TYPE_OP_BUS,
+.parent = TYPE_BUS,
+.instance_init = fsi_opb_init,
+.instance_size = sizeof(OPBus),
+.class_init = fsi_opb_class_init,
+.class_size = sizeof(OPBusClass),
+};
+
+static void fsi_opb_register_types(void)
+{
+type_register_static(&opb_info);
+}
+
+type_init(fsi_opb_register_types);
diff --git a/hw/fsi/Kconfig b/hw/fsi/Kconfig
index 8d712e77ed..0f6e6d331a 100644
--- a/hw/fsi/Kconfig
+++ b/

Re: [PATCH v7 06/10] hw/fsi: Aspeed APB2OPB interface

2023-11-27 Thread Cédric Le Goater

On 10/26/23 18:47, Ninad Palsule wrote:

This is a part of patchset where IBM's Flexible Service Interface is
introduced.

An APB-to-OPB bridge enabling access to the OPB from the ARM core in
the AST2600. Hardware limitations prevent the OPB from being directly
mapped into APB, so all accesses are indirect through the bridge.

Signed-off-by: Andrew Jeffery 
Signed-off-by: Ninad Palsule 
---
v2:
- Incorporated review comments by Joel
v3:
- Incorporated review comments by Thomas Huth
v4:
   - Compile FSI with ASPEED_SOC only.
v5:
- Incorporated review comments by Cedric.
v6:
- Incorporated review comments by Cedric.
v7:
- Incorporated review comments by Cedric.
---
  include/hw/fsi/aspeed-apb2opb.h |  33 
  hw/fsi/aspeed-apb2opb.c | 272 
  hw/arm/Kconfig  |   1 +
  hw/fsi/Kconfig  |   4 +
  hw/fsi/meson.build  |   1 +
  hw/fsi/trace-events |   2 +
  6 files changed, 313 insertions(+)
  create mode 100644 include/hw/fsi/aspeed-apb2opb.h
  create mode 100644 hw/fsi/aspeed-apb2opb.c

diff --git a/include/hw/fsi/aspeed-apb2opb.h b/include/hw/fsi/aspeed-apb2opb.h
new file mode 100644
index 00..a81ae67023
--- /dev/null
+++ b/include/hw/fsi/aspeed-apb2opb.h
@@ -0,0 +1,33 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * ASPEED APB2OPB Bridge
+ */
+#ifndef FSI_ASPEED_APB2OPB_H
+#define FSI_ASPEED_APB2OPB_H
+
+#include "hw/sysbus.h"
+#include "hw/fsi/opb.h"
+
+#define TYPE_ASPEED_APB2OPB "aspeed.apb2opb"
+OBJECT_DECLARE_SIMPLE_TYPE(AspeedAPB2OPBState, ASPEED_APB2OPB)
+
+#define ASPEED_APB2OPB_NR_REGS ((0xe8 >> 2) + 1)
+
+#define ASPEED_FSI_NUM 2
+
+typedef struct AspeedAPB2OPBState {
+/*< private >*/
+SysBusDevice parent_obj;
+
+/*< public >*/
+MemoryRegion iomem;
+
+uint32_t regs[ASPEED_APB2OPB_NR_REGS];
+qemu_irq irq;
+
+OPBus opb[ASPEED_FSI_NUM];


Please introduce :

  FSIMasterState fsi[ASPEED_FSI_NUM];


+} AspeedAPB2OPBState;
+
+#endif /* FSI_ASPEED_APB2OPB_H */
diff --git a/hw/fsi/aspeed-apb2opb.c b/hw/fsi/aspeed-apb2opb.c
new file mode 100644
index 00..4cac62a38f
--- /dev/null
+++ b/hw/fsi/aspeed-apb2opb.c
@@ -0,0 +1,272 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * ASPEED APB-OPB FSI interface
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qom/object.h"
+#include "qapi/error.h"
+#include "trace.h"
+
+#include "hw/fsi/aspeed-apb2opb.h"
+#include "hw/qdev-core.h"
+
+#define TO_REG(x) (x >> 2)
+
+#define APB2OPB_VERSIONTO_REG(0x00)
+#define APB2OPB_TRIGGERTO_REG(0x04)
+
+#define APB2OPB_CONTROLTO_REG(0x08)
+#define   APB2OPB_CONTROL_OFF  BE_GENMASK(31, 13)
+
+#define APB2OPB_OPB2FSITO_REG(0x0c)
+#define   APB2OPB_OPB2FSI_OFF  BE_GENMASK(31, 22)
+
+#define APB2OPB_OPB0_SEL   TO_REG(0x10)
+#define APB2OPB_OPB1_SEL   TO_REG(0x28)
+#define   APB2OPB_OPB_SEL_EN   BIT(0)
+
+#define APB2OPB_OPB0_MODE  TO_REG(0x14)
+#define APB2OPB_OPB1_MODE  TO_REG(0x2c)
+#define   APB2OPB_OPB_MODE_RD  BIT(0)
+
+#define APB2OPB_OPB0_XFER  TO_REG(0x18)
+#define APB2OPB_OPB1_XFER  TO_REG(0x30)
+#define   APB2OPB_OPB_XFER_FULLBIT(1)
+#define   APB2OPB_OPB_XFER_HALFBIT(0)
+
+#define APB2OPB_OPB0_ADDR  TO_REG(0x1c)
+#define APB2OPB_OPB0_WRITE_DATATO_REG(0x20)
+
+#define APB2OPB_OPB1_ADDR  TO_REG(0x34)
+#define APB2OPB_OPB1_WRITE_DATA  TO_REG(0x38)
+
+#define APB2OPB_IRQ_STSTO_REG(0x48)
+#define   APB2OPB_IRQ_STS_OPB1_TX_ACK  BIT(17)
+#define   APB2OPB_IRQ_STS_OPB0_TX_ACK  BIT(16)
+
+#define APB2OPB_OPB0_WRITE_WORD_ENDIAN TO_REG(0x4c)
+#define   APB2OPB_OPB0_WRITE_WORD_ENDIAN_BE 0x0011101b
+#define APB2OPB_OPB0_WRITE_BYTE_ENDIAN TO_REG(0x50)
+#define   APB2OPB_OPB0_WRITE_BYTE_ENDIAN_BE 0x0c330f3f
+#define APB2OPB_OPB1_WRITE_WORD_ENDIAN TO_REG(0x54)
+#define APB2OPB_OPB1_WRITE_BYTE_ENDIAN TO_REG(0x58)
+#define APB2OPB_OPB0_READ_BYTE_ENDIAN  TO_REG(0x5c)
+#define APB2OPB_OPB1_READ_BYTE_ENDIAN  TO_REG(0x60)
+#define   APB2OPB_OPB0_READ_WORD_ENDIAN_BE  0x00030b1b
+
+#define APB2OPB_OPB0_READ_DATA TO_REG(0x84)
+#define APB2OPB_OPB1_READ_DATA TO_REG(0x90)
+
+/*
+ * The following magic values came from AST2600 data sheet
+ * The register values are defined under section "FSI controller"
+ * as initial values.
+ */
+static const uint32_t aspeed_apb2opb_reset[ASPEED_APB2OPB_NR_REGS] = {
+ [APB2OPB_VERSION]= 0x00a1,
+ [APB2OPB_OPB0_WRITE_WORD_ENDIAN] = 0x0044eee4,
+ [APB2OPB_OPB0_WRITE_BYTE_ENDIAN] = 0x0055aaff,
+ [APB2OPB_OPB1_WRITE_WORD_ENDIAN] = 0x00117717,
+ [APB2OPB_OPB1_WRITE_BYTE_END

Re: [PATCH v7 01/10] hw/fsi: Introduce IBM's Local bus

2023-11-27 Thread Cédric Le Goater

On 10/26/23 18:47, Ninad Palsule wrote:

This is a part of patchset where IBM's Flexible Service Interface is
introduced.

The LBUS is modelled to maintain the qdev bus hierarchy and to take
advantage of the object model to automatically generate the CFAM
configuration block. The configuration block presents engines in the
order they are attached to the CFAM's LBUS. Engine implementations
should subclass the LBusDevice and set the 'config' member of
LBusDeviceClass to match the engine's type.

Signed-off-by: Andrew Jeffery 
Signed-off-by: Ninad Palsule 
---
v2:
- Incorporated Joel's review comments.
v5:
- Incorporated review comments by Cedric.
v6:
- Incorporated review comments by Cedric & Daniel.
v7:
- Incorporated review comments by Philippe.
---
  include/hw/fsi/lbus.h | 43 +
  hw/fsi/lbus.c | 74 +++
  hw/Kconfig|  1 +
  hw/fsi/Kconfig|  2 ++
  hw/fsi/meson.build|  1 +
  hw/meson.build|  1 +
  6 files changed, 122 insertions(+)
  create mode 100644 include/hw/fsi/lbus.h
  create mode 100644 hw/fsi/lbus.c
  create mode 100644 hw/fsi/Kconfig
  create mode 100644 hw/fsi/meson.build

diff --git a/include/hw/fsi/lbus.h b/include/hw/fsi/lbus.h
new file mode 100644
index 00..4fa696bbdb
--- /dev/null
+++ b/include/hw/fsi/lbus.h
@@ -0,0 +1,43 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Local bus and connected device structures.
+ */
+#ifndef FSI_LBUS_H
+#define FSI_LBUS_H
+
+#include "exec/memory.h"
+#include "hw/qdev-core.h"
+
+#define TYPE_FSI_LBUS_DEVICE "fsi.lbus.device"
+OBJECT_DECLARE_TYPE(FSILBusDevice, FSILBusDeviceClass, FSI_LBUS_DEVICE)
+
+#define FSI_LBUS_MEM_REGION_SIZE  (2 * 1024 * 1024)
+#define FSI_LBUSDEV_IOMEM_SIZE0x400
+
+typedef struct FSILBusDevice {
+DeviceState parent;
+
+MemoryRegion iomem;
+uint32_t address;
+} FSILBusDevice;
+
+typedef struct FSILBusDeviceClass {
+DeviceClass parent;
+
+uint32_t config;
+} FSILBusDeviceClass;
+
+#define TYPE_FSI_LBUS "fsi.lbus"
+OBJECT_DECLARE_SIMPLE_TYPE(FSILBus, FSI_LBUS)
+
+typedef struct FSILBus {
+BusState bus;
+
+MemoryRegion mr;
+} FSILBus;
+
+DeviceState *lbus_create_device(FSILBus *bus, const char *type, uint32_t addr);
+int lbus_add_device(FSILBus *bus, FSILBusDevice *dev);
+#endif /* FSI_LBUS_H */
diff --git a/hw/fsi/lbus.c b/hw/fsi/lbus.c
new file mode 100644
index 00..3a7335dde5
--- /dev/null
+++ b/hw/fsi/lbus.c
@@ -0,0 +1,74 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Local bus where FSI slaves are connected
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "hw/fsi/lbus.h"
+
+#include "hw/qdev-properties.h"
+
+static void lbus_init(Object *o)
+{
+FSILBus *lbus = FSI_LBUS(o);
+
+memory_region_init(&lbus->mr, OBJECT(lbus), TYPE_FSI_LBUS,
+   FSI_LBUS_MEM_REGION_SIZE - FSI_LBUSDEV_IOMEM_SIZE);


That's a bit difficult to understand. What are these two regions ?
CAn you explain more the mappings ?


Thanks,

C.





+}
+
+static const TypeInfo lbus_info = {
+.name = TYPE_FSI_LBUS,
+.parent = TYPE_BUS,
+.instance_init = lbus_init,
+.instance_size = sizeof(FSILBus),
+};
+
+static Property lbus_device_props[] = {
+DEFINE_PROP_UINT32("address", FSILBusDevice, address, 0),
+DEFINE_PROP_END_OF_LIST(),
+};
+
+DeviceState *lbus_create_device(FSILBus *bus, const char *type, uint32_t addr)
+{
+DeviceState *ds;
+BusState *state = BUS(bus);
+FSILBusDevice *dev;
+
+ds = qdev_new(type);
+qdev_prop_set_uint32(ds, "address", addr);
+qdev_realize_and_unref(ds, state, &error_fatal);
+
+dev = FSI_LBUS_DEVICE(ds);
+memory_region_add_subregion(&bus->mr, dev->address,
+&dev->iomem);
+
+return ds;
+}
+
+static void lbus_device_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+
+dc->bus_type = TYPE_FSI_LBUS;
+device_class_set_props(dc, lbus_device_props);
+}
+
+static const TypeInfo lbus_device_type_info = {
+.name = TYPE_FSI_LBUS_DEVICE,
+.parent = TYPE_DEVICE,
+.instance_size = sizeof(FSILBusDevice),
+.abstract = true,
+.class_init = lbus_device_class_init,
+.class_size = sizeof(FSILBusDeviceClass),
+};
+
+static void lbus_register_types(void)
+{
+type_register_static(&lbus_info);
+type_register_static(&lbus_device_type_info);
+}
+
+type_init(lbus_register_types);
diff --git a/hw/Kconfig b/hw/Kconfig
index 9ca7b38c31..2c00936c28 100644
--- a/hw/Kconfig
+++ b/hw/Kconfig
@@ -9,6 +9,7 @@ source core/Kconfig
  source cxl/Kconfig
  source display/Kconfig
  source dma/Kconfig
+source fsi/Kconfig
  source gpio/Kconfig
  source hyperv/Kconfig
  source i2c/Kconfig
diff --git a/hw/fsi/Kconfig b/hw/fsi/Kconfig
new file mode 100644
index 00..e650c660f0
--- /dev/null
+++ b/hw/fsi/Kconfig
@@ -0,0 +1,2

Re: [PATCH v7 03/10] hw/fsi: Introduce IBM's cfam,fsi-slave

2023-11-27 Thread Cédric Le Goater

On 10/26/23 18:47, Ninad Palsule wrote:

This is a part of patchset where IBM's Flexible Service Interface is
introduced.

The Common FRU Access Macro (CFAM), an address space containing
various "engines" that drive accesses on busses internal and external
to the POWER chip. Examples include the SBEFIFO and I2C masters. The
engines hang off of an internal Local Bus (LBUS) which is described
by the CFAM configuration block.

The FSI slave: The slave is the terminal point of the FSI bus for
FSI symbols addressed to it. Slaves can be cascaded off of one
another. The slave's configuration registers appear in address space
of the CFAM to which it is attached.

Signed-off-by: Andrew Jeffery 
Signed-off-by: Ninad Palsule 
---
v2:
- Incorporated Joel's review comments.
v3:
- Incorporated Thomas Huth's review comments.
v5:
- Incorporated review comments by Cedric.
v6:
- Incorporated review comments by Cedric & Daniel
---
  include/hw/fsi/cfam.h  |  34 
  include/hw/fsi/fsi-slave.h |  29 +++
  include/hw/fsi/fsi.h   |  20 +
  hw/fsi/cfam.c  | 173 +
  hw/fsi/fsi-slave.c |  78 +
  hw/fsi/Kconfig |   9 ++
  hw/fsi/meson.build |   2 +
  hw/fsi/trace-events|   7 ++
  8 files changed, 352 insertions(+)
  create mode 100644 include/hw/fsi/cfam.h
  create mode 100644 include/hw/fsi/fsi-slave.h
  create mode 100644 hw/fsi/cfam.c
  create mode 100644 hw/fsi/fsi-slave.c

diff --git a/include/hw/fsi/cfam.h b/include/hw/fsi/cfam.h
new file mode 100644
index 00..842a3bad0c
--- /dev/null
+++ b/include/hw/fsi/cfam.h
@@ -0,0 +1,34 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Common FRU Access Macro
+ */
+#ifndef FSI_CFAM_H
+#define FSI_CFAM_H
+
+#include "exec/memory.h"
+
+#include "hw/fsi/fsi-slave.h"
+#include "hw/fsi/lbus.h"
+
+#define TYPE_FSI_CFAM "cfam"
+#define FSI_CFAM(obj) OBJECT_CHECK(FSICFAMState, (obj), TYPE_FSI_CFAM)
+
+/* P9-ism */
+#define CFAM_CONFIG_NR_REGS 0x28
+
+typedef struct FSICFAMState {
+/* < private > */
+FSISlaveState parent;
+
+/* CFAM config address space */
+MemoryRegion config_iomem;
+
+MemoryRegion mr;
+AddressSpace as;
+
+FSILBus lbus;
+} FSICFAMState;
+
+#endif /* FSI_CFAM_H */
diff --git a/include/hw/fsi/fsi-slave.h b/include/hw/fsi/fsi-slave.h
new file mode 100644
index 00..f5f23f4457
--- /dev/null
+++ b/include/hw/fsi/fsi-slave.h
@@ -0,0 +1,29 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Flexible Service Interface slave
+ */
+#ifndef FSI_FSI_SLAVE_H
+#define FSI_FSI_SLAVE_H
+
+#include "exec/memory.h"
+#include "hw/qdev-core.h"
+
+#include "hw/fsi/lbus.h"
+
+#include 
+
+#define TYPE_FSI_SLAVE "fsi.slave"
+OBJECT_DECLARE_SIMPLE_TYPE(FSISlaveState, FSI_SLAVE)
+
+#define FSI_SLAVE_CONTROL_NR_REGS ((0x40 >> 2) + 1)
+
+typedef struct FSISlaveState {
+DeviceState parent;
+
+MemoryRegion iomem;
+uint32_t regs[FSI_SLAVE_CONTROL_NR_REGS];
+} FSISlaveState;
+
+#endif /* FSI_FSI_H */
diff --git a/include/hw/fsi/fsi.h b/include/hw/fsi/fsi.h
index b08b97f62b..3cbc685226 100644
--- a/include/hw/fsi/fsi.h
+++ b/include/hw/fsi/fsi.h
@@ -8,9 +8,29 @@
  #define FSI_FSI_H
  
  #include "qemu/bitops.h"

+#include "hw/qdev-core.h"
+
+/*
+ * TODO: Maybe unwind this dependency with const links? Store a
+ * pointer in FSIBus?
+ */
+#include "hw/fsi/cfam.h"
  
  /* Bitwise operations at the word level. */

  #define BE_BIT(x)   BIT(31 - (x))
  #define BE_GENMASK(hb, lb)  MAKE_64BIT_MASK((lb), ((hb) - (lb) + 1))
  
+#define TYPE_FSI_BUS "fsi.bus"

+OBJECT_DECLARE_SIMPLE_TYPE(FSIBus, FSI_BUS)
+
+/* TODO: Figure out what's best with a point-to-point bus */
+typedef struct FSISlaveState FSISlaveState;
+
+typedef struct FSIBus {
+BusState bus;
+
+/* XXX: It's point-to-point, just instantiate the slave directly for now */
+FSICFAMState slave;
+} FSIBus;
+
  #endif
diff --git a/hw/fsi/cfam.c b/hw/fsi/cfam.c
new file mode 100644
index 00..a1c037925f
--- /dev/null
+++ b/hw/fsi/cfam.c
@@ -0,0 +1,173 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * Copyright (C) 2023 IBM Corp.
+ *
+ * IBM Common FRU Access Macro
+ */
+
+#include "qemu/osdep.h"
+
+#include "qapi/error.h"
+#include "trace.h"
+
+#include "hw/fsi/cfam.h"
+#include "hw/fsi/fsi.h"
+#include "hw/fsi/engine-scratchpad.h"
+
+#include "hw/qdev-properties.h"
+
+#define TO_REG(x)  ((x) >> 2)
+
+#define CFAM_ENGINE_CONFIG  TO_REG(0x04)
+
+#define CFAM_CONFIG_CHIP_IDTO_REG(0x00)
+#define CFAM_CONFIG_CHIP_ID_P9 0xc0022d15
+#define CFAM_CONFIG_CHIP_ID_BREAK  0xc0de
+
+static uint64_t fsi_cfam_config_read(void *opaque, hwaddr addr, unsigned size)
+{
+FSICFAMState *cfam = FSI_CFAM(opaque);
+BusChild *kid;
+int i;
+
+trace_fsi_cfam_config_read(addr, size);
+
+switch (

Re: [PATCH v7 00/10] Introduce model for IBM's FSI

2023-11-27 Thread Cédric Le Goater

Hello Ninad,

On 10/26/23 18:47, Ninad Palsule wrote:

Hello,

Please review the patch-set version 7.
I have incorporated review comments from Cedric, Philippe and Thomas.



I reworked v7 with the suggestions I made in patches 1-6. Please check :

  https://github.com/legoater/qemu/commits/aspeed-8.2

I will have more questions on the mappings because some parts are really
unclear.


Thanks,

C.




Ninad Palsule (10):
   hw/fsi: Introduce IBM's Local bus
   hw/fsi: Introduce IBM's scratchpad
   hw/fsi: Introduce IBM's cfam,fsi-slave
   hw/fsi: Introduce IBM's FSI
   hw/fsi: IBM's On-chip Peripheral Bus
   hw/fsi: Aspeed APB2OPB interface
   hw/arm: Hook up FSI module in AST2600
   hw/fsi: Added qtest
   hw/fsi: Added FSI documentation
   hw/fsi: Update MAINTAINER list

  MAINTAINERS|   8 +
  docs/specs/fsi.rst | 138 +++
  docs/specs/index.rst   |   1 +
  meson.build|   1 +
  hw/fsi/trace.h |   1 +
  include/hw/arm/aspeed_soc.h|   4 +
  include/hw/fsi/aspeed-apb2opb.h|  33 
  include/hw/fsi/cfam.h  |  34 
  include/hw/fsi/engine-scratchpad.h |  27 +++
  include/hw/fsi/fsi-master.h|  30 
  include/hw/fsi/fsi-slave.h |  29 +++
  include/hw/fsi/fsi.h   |  36 
  include/hw/fsi/lbus.h  |  43 +
  include/hw/fsi/opb.h   |  33 
  hw/arm/aspeed_ast2600.c|  19 ++
  hw/fsi/aspeed-apb2opb.c| 272 +
  hw/fsi/cfam.c  | 173 ++
  hw/fsi/engine-scratchpad.c |  93 ++
  hw/fsi/fsi-master.c| 161 +
  hw/fsi/fsi-slave.c |  78 +
  hw/fsi/fsi.c   |  25 +++
  hw/fsi/lbus.c  |  74 
  hw/fsi/opb.c   |  74 
  tests/qtest/aspeed-fsi-test.c  | 205 ++
  hw/Kconfig |   1 +
  hw/arm/Kconfig |   1 +
  hw/fsi/Kconfig |  23 +++
  hw/fsi/meson.build |   6 +
  hw/fsi/trace-events|  13 ++
  hw/meson.build |   1 +
  tests/qtest/meson.build|   1 +
  31 files changed, 1638 insertions(+)
  create mode 100644 docs/specs/fsi.rst
  create mode 100644 hw/fsi/trace.h
  create mode 100644 include/hw/fsi/aspeed-apb2opb.h
  create mode 100644 include/hw/fsi/cfam.h
  create mode 100644 include/hw/fsi/engine-scratchpad.h
  create mode 100644 include/hw/fsi/fsi-master.h
  create mode 100644 include/hw/fsi/fsi-slave.h
  create mode 100644 include/hw/fsi/fsi.h
  create mode 100644 include/hw/fsi/lbus.h
  create mode 100644 include/hw/fsi/opb.h
  create mode 100644 hw/fsi/aspeed-apb2opb.c
  create mode 100644 hw/fsi/cfam.c
  create mode 100644 hw/fsi/engine-scratchpad.c
  create mode 100644 hw/fsi/fsi-master.c
  create mode 100644 hw/fsi/fsi-slave.c
  create mode 100644 hw/fsi/fsi.c
  create mode 100644 hw/fsi/lbus.c
  create mode 100644 hw/fsi/opb.c
  create mode 100644 tests/qtest/aspeed-fsi-test.c
  create mode 100644 hw/fsi/Kconfig
  create mode 100644 hw/fsi/meson.build
  create mode 100644 hw/fsi/trace-events






Re: [PATCH v2 for-8.2] ppc/amigaone: Allow running AmigaOS without firmware image

2023-11-27 Thread Cédric Le Goater




I'm not sure, I don't think it's necessary if your minimal patch works.

I'll do a PR for 8.2 for SLOF and Skiboot updates, so happy to include
this as well.


I think this is a bit late for 8.2 to change FW images, well, at least
SLOF and skiboot. Are the new versions fixing something critical ?

Thanks,

C.




Re: [PATCH] crypto: Introduce SM4 symmetric cipher algorithm

2023-11-27 Thread Yong Huang
On Tue, Nov 28, 2023 at 12:11 AM Daniel P. Berrangé 
wrote:

> On Mon, Nov 27, 2023 at 11:55:34PM +0800, Hyman Huang wrote:
> > Introduce the SM4 cipher algorithms (OSCCA GB/T 32907-2016).
> >
> > SM4 (GBT.32907-2016) is a cryptographic standard issued by the
> > Organization of State Commercial Administration of China (OSCCA)
> > as an authorized cryptographic algorithms for the use within China.
>
> Just out of interest, what part of QEMU are you needing to use
> SM4 with ? Is it for a LUKS block driver cipher ?
>

Indeed, the LUKS block driver is the cause. Since SM4 can be accelerated by
encryption cards or hardware modules, we wish to evaluate the performance
overhead of the CPU and the proprietary hardware as in our production,
And the SM4 Algo CPU implementation could be introduced beforehand.

>
> >
> > Signed-off-by: Hyman Huang 
> > ---
> >  crypto/block-luks.c |  7 ++
> >  crypto/cipher-gcrypt.c.inc  |  4 
>
> Looking at the gcrypt code, SM4 is only supported in >= 1.9.0
>
> QEMU min version is 1.8.0, so you'll need to modify meson.build
> to check whether SM4 is supported and put conditionals in this
> file


> >  crypto/cipher-nettle.c.inc  | 42 +
>
> Looking at the nettle code, SM4 is only supported in unreleased
> versions thus far.


> So again will need a meson.build check and conditionals.
>
OK, I'll check the library versions in the next versoin.

>
> >  crypto/cipher.c |  2 ++
> >  qapi/crypto.json|  5 +++-
> >  tests/unit/test-crypto-cipher.c | 11 +
> >  6 files changed, 70 insertions(+), 1 deletion(-)
>
>
> > diff --git a/qapi/crypto.json b/qapi/crypto.json
> > index fd3d46ebd1..95fa10bb6d 100644
> > --- a/qapi/crypto.json
> > +++ b/qapi/crypto.json
> > @@ -94,6 +94,8 @@
> >  #
> >  # @twofish-256: Twofish with 256 bit / 32 byte keys
> >  #
> > +# @sm4: SM4 with 128 bit / 16 byte keys (since 8.2)
>
> We're in feature freeze for 8.2, so mark this 9.0 as that'll be the
> next available release this could be merged for.
>
Get it.

>
> > +#
> >  # Since: 2.6
> >  ##
> >  { 'enum': 'QCryptoCipherAlgorithm',
> > @@ -102,7 +104,8 @@
> > 'des', '3des',
> > 'cast5-128',
> > 'serpent-128', 'serpent-192', 'serpent-256',
> > -   'twofish-128', 'twofish-192', 'twofish-256']}
> > +   'twofish-128', 'twofish-192', 'twofish-256',
> > +   'sm4']}
> >
> >  ##
> >  # @QCryptoCipherMode:
> > diff --git a/tests/unit/test-crypto-cipher.c
> b/tests/unit/test-crypto-cipher.c
> > index d9d9d078ff..80a4984e43 100644
> > --- a/tests/unit/test-crypto-cipher.c
> > +++ b/tests/unit/test-crypto-cipher.c
> > @@ -382,6 +382,17 @@ static QCryptoCipherTestData test_data[] = {
> >  .plaintext = "90afe91bb288544f2c32dc239b2635e6",
> >  .ciphertext = "6cb4561c40bf0a9705931cb6d408e7fa",
> >  },
> > +{
> > +/* SM4, GB/T 32907-2016, Appendix A.1 */
> > +.path = "/crypto/cipher/sm4",
> > +.alg = QCRYPTO_CIPHER_ALG_SM4,
> > +.mode = QCRYPTO_CIPHER_MODE_ECB,
> > +.key = "0123456789abcdeffedcba9876543210",
> > +.plaintext  =
> > +"0123456789abcdeffedcba9876543210",
> > +.ciphertext =
> > +"681edf34d206965e86b3e94f536e4246",
> > +},
> >  {
> >  /* #1 32 byte key, 32 byte PTX */
> >  .path = "/crypto/cipher/aes-xts-128-1",
> > --
> > 2.39.1
> >
>
> With regards,
> Daniel
> --
> |: https://berrange.com  -o-
> https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-
> https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-
> https://www.instagram.com/dberrange :|
>
>
Thanks for the comment,
Yong

-- 
Best regards


Re: [PATCH v5 7/9] misc: Add a pca9554 GPIO device model

2023-11-27 Thread Miles Glenn
On Wed, 2023-11-22 at 09:55 +0100, Cédric Le Goater wrote:
> On 11/21/23 20:09, Glenn Miles wrote:
> > Specs are available here:
> > 
> >  https://www.nxp.com/docs/en/data-sheet/PCA9554_9554A.pdf
> > 
> > This is a simple model supporting the basic registers for GPIO
> > mode.  The device also supports an interrupt output line but the
> > model does not yet support this.
> > 
> > Reviewed-by: Cédric Le Goater 
> 
> My R-b was on patch "ppc/pnv: Add a pca9554 I2C device to powernv10-
> rainier".
> Not on the pca9554 model itself.
> 
> Thanks,
> 
> C.
> 

Yikes!  Sorry about that.  I wonder if there's a good tool out there
to help with tracking this type of information.

Thanks,

Glenn
> 
> 
> > Signed-off-by: Glenn Miles 
> > ---
> > 
> > No change from previous version
> > 
> >   MAINTAINERS|  10 +-
> >   hw/misc/pca9554.c  | 328
> > +
> >   include/hw/misc/pca9554.h  |  36 
> >   include/hw/misc/pca9554_regs.h |  19 ++
> >   4 files changed, 391 insertions(+), 2 deletions(-)
> >   create mode 100644 hw/misc/pca9554.c
> >   create mode 100644 include/hw/misc/pca9554.h
> >   create mode 100644 include/hw/misc/pca9554_regs.h
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 695e0bd34f..4d1c991691 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -1155,9 +1155,7 @@ R: Joel Stanley 
> >   L: qemu-...@nongnu.org
> >   S: Maintained
> >   F: hw/*/*aspeed*
> > -F: hw/misc/pca9552.c
> >   F: include/hw/*/*aspeed*
> > -F: include/hw/misc/pca9552*.h
> >   F: hw/net/ftgmac100.c
> >   F: include/hw/net/ftgmac100.h
> >   F: docs/system/arm/aspeed.rst
> > @@ -1526,6 +1524,14 @@ F: include/hw/pci-host/pnv*
> >   F: pc-bios/skiboot.lid
> >   F: tests/qtest/pnv*
> >   
> > +pca955x
> > +M: Glenn Miles 
> > +L: qemu-...@nongnu.org
> > +L: qemu-...@nongnu.org
> > +S: Odd Fixes
> > +F: hw/misc/pca955*.c
> > +F: include/hw/misc/pca955*.h
> > +
> >   virtex_ml507
> >   M: Edgar E. Iglesias 
> >   L: qemu-...@nongnu.org
> > diff --git a/hw/misc/pca9554.c b/hw/misc/pca9554.c
> > new file mode 100644
> > index 00..778b32e443
> > --- /dev/null
> > +++ b/hw/misc/pca9554.c
> > @@ -0,0 +1,328 @@
> > +/*
> > + * PCA9554 I/O port
> > + *
> > + * Copyright (c) 2023, IBM Corporation.
> > + *
> > + * SPDX-License-Identifier: GPL-2.0-or-later
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/log.h"
> > +#include "qemu/module.h"
> > +#include "qemu/bitops.h"
> > +#include "hw/qdev-properties.h"
> > +#include "hw/misc/pca9554.h"
> > +#include "hw/misc/pca9554_regs.h"
> > +#include "hw/irq.h"
> > +#include "migration/vmstate.h"
> > +#include "qapi/error.h"
> > +#include "qapi/visitor.h"
> > +#include "trace.h"
> > +#include "qom/object.h"
> > +
> > +struct PCA9554Class {
> > +/*< private >*/
> > +I2CSlaveClass parent_class;
> > +/*< public >*/
> > +};
> > +typedef struct PCA9554Class PCA9554Class;
> > +
> > +DECLARE_CLASS_CHECKERS(PCA9554Class, PCA9554,
> > +   TYPE_PCA9554)
> > +
> > +#define PCA9554_PIN_LOW  0x0
> > +#define PCA9554_PIN_HIZ  0x1
> > +
> > +static const char *pin_state[] = {"low", "high"};
> > +
> > +static void pca9554_update_pin_input(PCA9554State *s)
> > +{
> > +int i;
> > +uint8_t config = s->regs[PCA9554_CONFIG];
> > +uint8_t output = s->regs[PCA9554_OUTPUT];
> > +uint8_t internal_state = config | output;
> > +
> > +for (i = 0; i < PCA9554_PIN_COUNT; i++) {
> > +uint8_t bit_mask = 1 << i;
> > +uint8_t internal_pin_state = (internal_state >> i) & 0x1;
> > +uint8_t old_value = s->regs[PCA9554_INPUT] & bit_mask;
> > +uint8_t new_value;
> > +
> > +switch (internal_pin_state) {
> > +case PCA9554_PIN_LOW:
> > +s->regs[PCA9554_INPUT] &= ~bit_mask;
> > +break;
> > +case PCA9554_PIN_HIZ:
> > +/*
> > + * pullup sets it to a logical 1 unless
> > + * external device drives it low.
> > + */
> > +if (s->ext_state[i] == PCA9554_PIN_LOW) {
> > +s->regs[PCA9554_INPUT] &= ~bit_mask;
> > +} else {
> > +s->regs[PCA9554_INPUT] |=  bit_mask;
> > +}
> > +break;
> > +default:
> > +break;
> > +}
> > +
> > +/* update irq state only if pin state changed */
> > +new_value = s->regs[PCA9554_INPUT] & bit_mask;
> > +if (new_value != old_value) {
> > +if (new_value) {
> > +/* changed from 0 to 1 */
> > +qemu_set_irq(s->gpio_out[i], 1);
> > +} else {
> > +/* changed from 1 to 0 */
> > +qemu_set_irq(s->gpio_out[i], 0);
> > +}
> > +}
> > +}
> > +}
> > +
> > +static uint8_t pca9554_read(PCA9554State *s, uint8_t reg)
> > +{
> > +switch (reg) {
> > +case PCA9554_INPUT:
> > +return s->regs[PCA9554_INPUT] ^ s->regs[P

Re: [RFC PATCH v2 17/19] heki: x86: Update permissions counters during text patching

2023-11-27 Thread Madhavan T. Venkataraman
Apologies for the late reply. I was on vacation. Please see my response below:

On 11/13/23 02:19, Peter Zijlstra wrote:
> On Sun, Nov 12, 2023 at 09:23:24PM -0500, Mickaël Salaün wrote:
>> From: Madhavan T. Venkataraman 
>>
>> X86 uses a function called __text_poke() to modify executable code. This
>> patching function is used by many features such as KProbes and FTrace.
>>
>> Update the permissions counters for the text page so that write
>> permissions can be temporarily established in the EPT to modify the
>> instructions in that page.
>>
>> Cc: Borislav Petkov 
>> Cc: Dave Hansen 
>> Cc: H. Peter Anvin 
>> Cc: Ingo Molnar 
>> Cc: Kees Cook 
>> Cc: Madhavan T. Venkataraman 
>> Cc: Mickaël Salaün 
>> Cc: Paolo Bonzini 
>> Cc: Sean Christopherson 
>> Cc: Thomas Gleixner 
>> Cc: Vitaly Kuznetsov 
>> Cc: Wanpeng Li 
>> Signed-off-by: Madhavan T. Venkataraman 
>> ---
>>
>> Changes since v1:
>> * New patch
>> ---
>>  arch/x86/kernel/alternative.c |  5 
>>  arch/x86/mm/heki.c| 49 +++
>>  include/linux/heki.h  | 14 ++
>>  3 files changed, 68 insertions(+)
>>
>> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
>> index 517ee01503be..64fd8757ba5c 100644
>> --- a/arch/x86/kernel/alternative.c
>> +++ b/arch/x86/kernel/alternative.c
>> @@ -18,6 +18,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -1801,6 +1802,7 @@ static void *__text_poke(text_poke_f func, void *addr, 
>> const void *src, size_t l
>>   */
>>  pgprot = __pgprot(pgprot_val(PAGE_KERNEL) & ~_PAGE_GLOBAL);
>>  
>> +heki_text_poke_start(pages, cross_page_boundary ? 2 : 1, pgprot);
>>  /*
>>   * The lock is not really needed, but this allows to avoid open-coding.
>>   */
>> @@ -1865,7 +1867,10 @@ static void *__text_poke(text_poke_f func, void 
>> *addr, const void *src, size_t l
>>  }
>>  
>>  local_irq_restore(flags);
>> +
>>  pte_unmap_unlock(ptep, ptl);
>> +heki_text_poke_end(pages, cross_page_boundary ? 2 : 1, pgprot);
>> +
>>  return addr;
>>  }
> 
> This makes no sense, we already use a custom CR3 with userspace alias
> for the actual pages to write to, why are you then frobbing permissions
> on that *again* ?

Today, the permissions for a guest page in the extended page table (EPT) are 
RWX (unless permissions are
restricted for some specific reason like for shadow page table pages). In this 
Heki feature, we don't allow
RWX by default in the EPT. We only allow those permissions in the EPT that the 
guest page actually needs.
E.g., for a text page, it is R_X in both the guest page table and the EPT.

For text patching, the above code establishes an alternate mapping in the guest 
page table that is RW_ so
that the text can be patched. That needs to be reflected in the EPT so that the 
EPT permissions will change
from R_X to RWX. In other words, RWX is allowed only as necessary. At the end 
of patching, the EPT permissions
are restored to R_X.

Does that address your comment?

Madhavan



Re: [PATCH for-9.0] hw: Add compat machines for 9.0

2023-11-27 Thread Eric Farman
On Mon, 2023-11-20 at 10:42 +0100, Cornelia Huck wrote:
> Add 9.0 machine types for arm/i440fx/m68k/q35/s390x/spapr.
> 
> Signed-off-by: Cornelia Huck 
> ---
>  hw/arm/virt.c  |  9 -
>  hw/core/machine.c  |  3 +++
>  hw/i386/pc.c   |  3 +++
>  hw/i386/pc_piix.c  | 17 ++---
>  hw/i386/pc_q35.c   | 13 -
>  hw/m68k/virt.c |  9 -
>  hw/ppc/spapr.c | 15 +--
>  hw/s390x/s390-virtio-ccw.c | 14 +-
>  include/hw/boards.h    |  3 +++
>  include/hw/i386/pc.h   |  3 +++
>  10 files changed, 80 insertions(+), 9 deletions(-)

Acked-by: Eric Farman   # s390x



  1   2   3   >