Re: [PATCH v3 09/11] hw/nvme: enable namespace rescap function

2024-05-24 Thread Klaus Jensen
On May 17 17:52, Changqi Lu wrote:
> This commit enables the rescap function in the
> namespace by detecting the supported reservation
> function in the backend driver.
> 
> Signed-off-by: Changqi Lu 
> Signed-off-by: zhenwei pi 
> ---
>  hw/nvme/ns.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
> index ea8db175db..bb09117f4b 100644
> --- a/hw/nvme/ns.c
> +++ b/hw/nvme/ns.c
> @@ -20,6 +20,7 @@
>  #include "qemu/bitops.h"
>  #include "sysemu/sysemu.h"
>  #include "sysemu/block-backend.h"
> +#include "block/block_int.h"
>  
>  #include "nvme.h"
>  #include "trace.h"
> @@ -55,6 +56,13 @@ void nvme_ns_init_format(NvmeNamespace *ns)
>  }
>  
>  id_ns->npda = id_ns->npdg = npdg - 1;
> +
> +/*
> + * The persistent reservation capacities of block
> + * and nvme are currently defined the same.
> + * If there are subsequent changes, this part needs to be changed.
> + */
> +id_ns->rescap = blk_bs(ns->blkconf.blk)->file->bs->bl.pr_cap;
>  }
>  
>  static int nvme_ns_init(NvmeNamespace *ns, Error **errp)
> -- 
> 2.20.1
> 

This should probably be merged with patch 10. I don't think it makes
sense on its own?




Re: [PATCH v3 08/11] hw/nvme: enable ONCS reservations

2024-05-24 Thread Klaus Jensen
On May 17 17:52, Changqi Lu wrote:
> This commit enables ONCS to support the reservation
> function at the controller level. It also lays the
> groundwork for detecting and enabling the reservation
> function on a per-namespace basis in RESCAP.
> 
> Signed-off-by: Changqi Lu 
> Signed-off-by: zhenwei pi 
> ---
>  hw/nvme/ctrl.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index 127c3d2383..182307a48b 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -8248,7 +8248,8 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice 
> *pci_dev)
>  id->nn = cpu_to_le32(NVME_MAX_NAMESPACES);
>  id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROES | NVME_ONCS_TIMESTAMP |
> NVME_ONCS_FEATURES | NVME_ONCS_DSM |
> -   NVME_ONCS_COMPARE | NVME_ONCS_COPY);
> +   NVME_ONCS_COMPARE | NVME_ONCS_COPY |
> +   NVME_ONCS_RESRVATIONS);
>  
>  /*
>   * NOTE: If this device ever supports a command set that does NOT use 0x0
> -- 
> 2.20.1
> 

Should be merged with patch 10.




Re: [PATCH v3 06/11] block/nvme: add reservation command protocol constants

2024-05-24 Thread Klaus Jensen
On May 17 17:52, Changqi Lu wrote:
> Add constants for the NVMe persistent command protocol.
> The constants include the reservation command opcode and
> reservation type values defined in section 7 of the NVMe
> 2.0 specification.
> 
> Signed-off-by: Changqi Lu 
> Signed-off-by: zhenwei pi 
> ---
>  include/block/nvme.h | 61 
>  1 file changed, 61 insertions(+)
> 
> diff --git a/include/block/nvme.h b/include/block/nvme.h
> index bb231d0b9a..84e2b2e401 100644
> --- a/include/block/nvme.h
> +++ b/include/block/nvme.h
> @@ -633,6 +633,10 @@ enum NvmeIoCommands {
>  NVME_CMD_WRITE_ZEROES   = 0x08,
>  NVME_CMD_DSM= 0x09,
>  NVME_CMD_VERIFY = 0x0c,
> +NVME_CMD_RESV_REGISTER  = 0x0d,
> +NVME_CMD_RESV_REPORT= 0x0e,
> +NVME_CMD_RESV_ACQUIRE   = 0x11,
> +NVME_CMD_RESV_RELEASE   = 0x15,
>  NVME_CMD_IO_MGMT_RECV   = 0x12,
>  NVME_CMD_COPY   = 0x19,
>  NVME_CMD_IO_MGMT_SEND   = 0x1d,
> @@ -641,6 +645,63 @@ enum NvmeIoCommands {
>  NVME_CMD_ZONE_APPEND= 0x7d,
>  };
>  
> +typedef enum {
> +NVME_RESV_REGISTER_ACTION_REGISTER  = 0x00,
> +NVME_RESV_REGISTER_ACTION_UNREGISTER= 0x01,
> +NVME_RESV_REGISTER_ACTION_REPLACE   = 0x02,
> +} NVME_RESV_REGISTER_ACTION;

Existing style would name this `NvmeReservationRegisterAction`.
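
A minimal sketch of the rename, assuming only the type name changes and the enumerator values stay as in the quoted hunk. The enum tag and the `nvme_resv_rrega()` helper are hypothetical and are not part of the patch; the helper just mirrors the `cdw10 & 0x7` extraction used in patch 10.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Enumerator values copied from the quoted hunk; only the type name
 * follows the CamelCase convention suggested above.
 */
typedef enum NvmeReservationRegisterAction {
    NVME_RESV_REGISTER_ACTION_REGISTER   = 0x00,
    NVME_RESV_REGISTER_ACTION_UNREGISTER = 0x01,
    NVME_RESV_REGISTER_ACTION_REPLACE    = 0x02,
} NvmeReservationRegisterAction;

/*
 * Hypothetical helper (not in the patch): take the register action
 * from the low three bits of command dword 10.
 */
static inline NvmeReservationRegisterAction nvme_resv_rrega(uint32_t cdw10)
{
    return (NvmeReservationRegisterAction)(cdw10 & 0x7);
}
```

With a typed enum like this, a `switch` over the action can be checked by the compiler for missing cases.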




Re: [PATCH v2] meson.build: add -mcx16 flag for x86_64 host

2024-05-24 Thread Paolo Bonzini

On 5/23/24 10:45, Daniel P. Berrangé wrote:

On Thu, May 23, 2024 at 08:11:18AM +0300, Artyom Kunakovsky wrote:

Fix linker error if the project was configured by the './configure 
--cpu=unknown --target-list=riscv64-softmmu' command


As with v1, why are you intentionally passing a bogus CPU target
name to the --cpu arg ?  QEMU already correctly sets '-mcx16' if
you omit --cpu, or pass a correct "x86_64" target name to --cpu.


The patch has a point though, in that right above we have another test
to add -march=i486.  It's just that we do that one conditionally,
because most of the time the compiler will already apply the less-
restrictive -march=i686.

The point of CPU_CFLAGS is really just to select the appropriate
multilib, for example for library linking tests, and -mcx16 is not
needed for that purpose.  And -mcx16 is not applied to cross-compiled
x86_64 code either, so why is it even in configure?

This is not to say that passing --cpu=unknown is a good idea; the
reason that Artyom gives is not really compelling.  But I think
I am going to apply it as a cleanup together with the matching
change to configure:

--- 8< -
From: Artyom Kunakovsky 
Subject: [PATCH] configure: move -mcx16 flag out of CPU_CFLAGS

The point of CPU_CFLAGS is really just to select the appropriate multilib,
for example for library linking tests, and -mcx16 is not needed for
that purpose.

Furthermore, if -mcx16 is part of QEMU's choice of a basic x86_64
instruction set, it should be applied to cross-compiled x86_64 code too;
it is plausible that tests/tcg would want to cover cmpxchg16b as well,
for example.  In the end this makes just as much sense as a per sub-build
tweak, so move the flag to meson.build and cross_cc_cflags_x86_64.

This leaves out contrib/plugins, which would fail when attempting to use
__sync_val_compare_and_swap_16 (note it does not do so yet); while minor,
this *is* a disadvantage of this change.  But building contrib/plugins
with a Makefile instead of meson.build is something self-inflicted just
for the sake of showing that it can be done, and if this kind of papercut
started becoming a problem we could make the directory part of the meson
build.  Until then, we can live with the limitation.

Signed-off-by: Artyom Kunakovsky 
Message-ID: <20240523051118.29367-1-artyomkunakov...@gmail.com>
[rewrite commit message, remove from configure. - Paolo]
Signed-off-by: Paolo Bonzini 

diff --git a/configure b/configure
index 38ee2577013..4d01a42ba65 100755
--- a/configure
+++ b/configure
@@ -512,10 +512,7 @@ case "$cpu" in
 cpu="x86_64"
 host_arch=x86_64
 linux_arch=x86
-# ??? Only extremely old AMD cpus do not have cmpxchg16b.
-# If we truly care, we should simply detect this case at
-# runtime and generate the fallback to serial emulation.
-CPU_CFLAGS="-m64 -mcx16"
+CPU_CFLAGS="-m64"
 ;;
 esac
 
@@ -1203,7 +1200,7 @@ fi

 : ${cross_cc_cflags_sparc64="-m64 -mcpu=ultrasparc"}
 : ${cross_cc_sparc="$cross_cc_sparc64"}
 : ${cross_cc_cflags_sparc="-m32 -mcpu=supersparc"}
-: ${cross_cc_cflags_x86_64="-m64"}
+: ${cross_cc_cflags_x86_64="-m64 -mcx16"}
 
 compute_target_variable() {

   eval "$2="
diff --git a/meson.build b/meson.build
index a9de71d4506..7fd82b5f48c 100644
--- a/meson.build
+++ b/meson.build
@@ -336,6 +336,13 @@ if host_arch == 'i386' and not cc.links('''
   qemu_common_flags = ['-march=i486'] + qemu_common_flags
 endif
 
+# ??? Only extremely old AMD cpus do not have cmpxchg16b.
+# If we truly care, we should simply detect this case at
+# runtime and generate the fallback to serial emulation.
+if host_arch == 'x86_64'
+  qemu_common_flags = ['-mcx16'] + qemu_common_flags
+endif
+
 if get_option('prefer_static')
   qemu_ldflags += get_option('b_pie') ? '-static-pie' : '-static'
 endif




Signed-off-by: Artyom Kunakovsky 
---
  meson.build | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/meson.build b/meson.build
index a9de71d450..e68fbfc662 100644
--- a/meson.build
+++ b/meson.build
@@ -336,6 +336,12 @@ if host_arch == 'i386' and not cc.links('''
qemu_common_flags = ['-march=i486'] + qemu_common_flags
  endif
  
+
+if host_arch == 'x86_64'
+  qemu_common_flags = ['-mcx16'] + qemu_common_flags
+endif
+
+
  if get_option('prefer_static')
qemu_ldflags += get_option('b_pie') ? '-static-pie' : '-static'
  endif
--
2.25.1




With regards,
Daniel





Re: [PATCH v3 09/11] hw/nvme: enable namespace rescap function

2024-05-24 Thread Klaus Jensen
On May 17 17:52, Changqi Lu wrote:
> This commit enables the rescap function in the
> namespace by detecting the supported reservation
> function in the backend driver.
> 
> Signed-off-by: Changqi Lu 
> Signed-off-by: zhenwei pi 
> ---
>  hw/nvme/ns.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
> index ea8db175db..bb09117f4b 100644
> --- a/hw/nvme/ns.c
> +++ b/hw/nvme/ns.c
> @@ -20,6 +20,7 @@
>  #include "qemu/bitops.h"
>  #include "sysemu/sysemu.h"
>  #include "sysemu/block-backend.h"
> +#include "block/block_int.h"
>  
>  #include "nvme.h"
>  #include "trace.h"
> @@ -55,6 +56,13 @@ void nvme_ns_init_format(NvmeNamespace *ns)
>  }
>  
>  id_ns->npda = id_ns->npdg = npdg - 1;
> +
> +/*
> + * The persistent reservation capacities of block
> + * and nvme are currently defined the same.
> + * If there are subsequent changes, this part needs to be changed.
> + */
> +id_ns->rescap = blk_bs(ns->blkconf.blk)->file->bs->bl.pr_cap;

This is very brittle. I see that you have an enum for both the block
layer and nvme. It is tricky to remember to update this if it changes in
the block layer.
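
One way to make this less brittle would be an explicit translation instead of a direct assignment. The following is only a sketch: every `SKETCH_*` name and the `sketch_blk_pr_cap_to_rescap()` helper are hypothetical, the block-layer flag values are assumed (the real ones are defined elsewhere in this series), and the NVMe side follows my reading of the RESCAP bit layout in the Identify Namespace data structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block-layer capability flags; the real names/values may differ. */
enum {
    SKETCH_BLK_PR_CAP_PTPL     = 1 << 0,
    SKETCH_BLK_PR_CAP_WR_EX    = 1 << 1,
    SKETCH_BLK_PR_CAP_EX_AC    = 1 << 2,
    SKETCH_BLK_PR_CAP_WR_EX_RO = 1 << 3,
    SKETCH_BLK_PR_CAP_EX_AC_RO = 1 << 4,
    SKETCH_BLK_PR_CAP_WR_EX_AR = 1 << 5,
    SKETCH_BLK_PR_CAP_EX_AC_AR = 1 << 6,
};

/* NVMe RESCAP bits (hypothetical names for the bits of the RESCAP byte). */
enum {
    SKETCH_NVME_RESCAP_PTPL     = 1 << 0,
    SKETCH_NVME_RESCAP_WR_EX    = 1 << 1,
    SKETCH_NVME_RESCAP_EX_AC    = 1 << 2,
    SKETCH_NVME_RESCAP_WR_EX_RO = 1 << 3,
    SKETCH_NVME_RESCAP_EX_AC_RO = 1 << 4,
    SKETCH_NVME_RESCAP_WR_EX_AR = 1 << 5,
    SKETCH_NVME_RESCAP_EX_AC_AR = 1 << 6,
};

/*
 * Translate block-layer capabilities to RESCAP bit by bit, so a change
 * on either side only requires touching this table. Unknown block-layer
 * bits are dropped rather than leaked into the NVMe field.
 */
static uint8_t sketch_blk_pr_cap_to_rescap(uint32_t pr_cap)
{
    static const struct {
        uint32_t blk;
        uint8_t nvme;
    } map[] = {
        { SKETCH_BLK_PR_CAP_PTPL,     SKETCH_NVME_RESCAP_PTPL },
        { SKETCH_BLK_PR_CAP_WR_EX,    SKETCH_NVME_RESCAP_WR_EX },
        { SKETCH_BLK_PR_CAP_EX_AC,    SKETCH_NVME_RESCAP_EX_AC },
        { SKETCH_BLK_PR_CAP_WR_EX_RO, SKETCH_NVME_RESCAP_WR_EX_RO },
        { SKETCH_BLK_PR_CAP_EX_AC_RO, SKETCH_NVME_RESCAP_EX_AC_RO },
        { SKETCH_BLK_PR_CAP_WR_EX_AR, SKETCH_NVME_RESCAP_WR_EX_AR },
        { SKETCH_BLK_PR_CAP_EX_AC_AR, SKETCH_NVME_RESCAP_EX_AC_AR },
    };
    uint8_t rescap = 0;

    for (size_t i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
        if (pr_cap & map[i].blk) {
            rescap |= map[i].nvme;
        }
    }

    return rescap;
}
```

The table is redundant while both enums share a layout, but it keeps the dependency visible in one place instead of hiding it behind an assignment.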

>  }
>  
>  static int nvme_ns_init(NvmeNamespace *ns, Error **errp)
> -- 
> 2.20.1
> 

-- 
One of us - No more doubt, silence or taboo about mental illness.




Re: [PATCH v3 10/11] hw/nvme: add reservation protocal command

2024-05-24 Thread Klaus Jensen
On May 17 17:52, Changqi Lu wrote:
> Add reservation acquire, reservation register,
> reservation release and reservation report commands
> in the nvme device layer.
> 
> By introducing these commands, this enables the nvme
> device to perform reservation-related tasks, including
> querying keys, querying reservation status, registering
> reservation keys, initiating and releasing reservations,
> as well as clearing and preempting reservations held by
> other keys.
> 
> These commands are crucial for management and control of
> shared storage resources in a persistent manner.
> 
> Signed-off-by: Changqi Lu 
> Signed-off-by: zhenwei pi 
> ---
>  hw/nvme/ctrl.c   | 321 ++-
>  hw/nvme/nvme.h   |   4 +
>  include/block/nvme.h |  38 +
>  3 files changed, 362 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index 182307a48b..ac2fbd22ec 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -294,6 +294,10 @@ static const uint32_t nvme_cse_iocs_nvm[256] = {
>  [NVME_CMD_COMPARE]  = NVME_CMD_EFF_CSUPP,
>  [NVME_CMD_IO_MGMT_RECV] = NVME_CMD_EFF_CSUPP,
>  [NVME_CMD_IO_MGMT_SEND] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
> +[NVME_CMD_RESV_REGISTER]= NVME_CMD_EFF_CSUPP,
> +[NVME_CMD_RESV_REPORT]  = NVME_CMD_EFF_CSUPP,
> +[NVME_CMD_RESV_ACQUIRE] = NVME_CMD_EFF_CSUPP,
> +[NVME_CMD_RESV_RELEASE] = NVME_CMD_EFF_CSUPP,
>  };
>  
>  static const uint32_t nvme_cse_iocs_zoned[256] = {
> @@ -308,6 +312,10 @@ static const uint32_t nvme_cse_iocs_zoned[256] = {
>  [NVME_CMD_ZONE_APPEND]  = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
>  [NVME_CMD_ZONE_MGMT_SEND]   = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
>  [NVME_CMD_ZONE_MGMT_RECV]   = NVME_CMD_EFF_CSUPP,
> +[NVME_CMD_RESV_REGISTER]= NVME_CMD_EFF_CSUPP,
> +[NVME_CMD_RESV_REPORT]  = NVME_CMD_EFF_CSUPP,
> +[NVME_CMD_RESV_ACQUIRE] = NVME_CMD_EFF_CSUPP,
> +[NVME_CMD_RESV_RELEASE] = NVME_CMD_EFF_CSUPP,
>  };
>  
>  static void nvme_process_sq(void *opaque);
> @@ -1745,6 +1753,7 @@ static void nvme_aio_err(NvmeRequest *req, int ret)
>  
>  switch (req->cmd.opcode) {
>  case NVME_CMD_READ:
> +case NVME_CMD_RESV_REPORT:
>  status = NVME_UNRECOVERED_READ;
>  break;
>  case NVME_CMD_FLUSH:
> @@ -1752,6 +1761,9 @@ static void nvme_aio_err(NvmeRequest *req, int ret)
>  case NVME_CMD_WRITE_ZEROES:
>  case NVME_CMD_ZONE_APPEND:
>  case NVME_CMD_COPY:
> +case NVME_CMD_RESV_REGISTER:
> +case NVME_CMD_RESV_ACQUIRE:
> +case NVME_CMD_RESV_RELEASE:
>  status = NVME_WRITE_FAULT;
>  break;
>  default:
> @@ -2127,7 +2139,10 @@ static inline bool nvme_is_write(NvmeRequest *req)
>  
>  return rw->opcode == NVME_CMD_WRITE ||
> rw->opcode == NVME_CMD_ZONE_APPEND ||
> -   rw->opcode == NVME_CMD_WRITE_ZEROES;
> +   rw->opcode == NVME_CMD_WRITE_ZEROES ||
> +   rw->opcode == NVME_CMD_RESV_REGISTER ||
> +   rw->opcode == NVME_CMD_RESV_ACQUIRE ||
> +   rw->opcode == NVME_CMD_RESV_RELEASE;
>  }
>  
>  static void nvme_misc_cb(void *opaque, int ret)
> @@ -2692,6 +2707,302 @@ static uint16_t nvme_verify(NvmeCtrl *n, NvmeRequest 
> *req)
>  return NVME_NO_COMPLETE;
>  }
>  
> +typedef struct NvmeKeyInfo {
> +uint64_t cr_key;
> +uint64_t nr_key;
> +} NvmeKeyInfo;
> +
> +static uint16_t nvme_resv_register(NvmeCtrl *n, NvmeRequest *req)
> +{
> +int ret;
> +NvmeKeyInfo key_info;
> +NvmeNamespace *ns = req->ns;
> +uint32_t cdw10 = le32_to_cpu(req->cmd.cdw10);
> +bool ignore_key = cdw10 >> 3 & 0x1;
> +uint8_t action = cdw10 & 0x7;
> +uint8_t ptpl = cdw10 >> 30 & 0x3;
> +bool aptpl;
> +
> +switch (ptpl) {
> +case NVME_RESV_PTPL_NO_CHANGE:
> +aptpl = (ns->id_ns.rescap & NVME_PR_CAP_PTPL) ? true : false;
> +break;
> +case NVME_RESV_PTPL_DISABLE:
> +aptpl = false;
> +break;
> +case NVME_RESV_PTPL_ENABLE:
> +aptpl = true;
> +break;
> +default:
> +return NVME_INVALID_FIELD;
> +}
> +
> +ret = nvme_h2c(n, (uint8_t *)&key_info, sizeof(NvmeKeyInfo), req);
> +if (ret) {
> +return ret;
> +}
> +
> +switch (action) {
> +case NVME_RESV_REGISTER_ACTION_REGISTER:
> +req->aiocb = blk_aio_pr_register(ns->blkconf.blk, 0,
> + key_info.nr_key, 0, aptpl,
> + ignore_key, nvme_misc_cb,
> + req);
> +break;
> +case NVME_RESV_REGISTER_ACTION_UNREGISTER:
> +req->aiocb = blk_aio_pr_register(ns->blkconf.blk, key_info.cr_key, 0,
> + 0, aptpl, ignore_key,
> + nvme_misc_cb, req);
> +break;
> +case N

[PATCH RFC 1/2] meson: Pass objects to declare_dependency()

2024-05-24 Thread Akihiko Odaki
We used to request declare_dependency() to link_whole static libraries.
If a static library is a thin archive, GNU ld needs to open all object
files referenced by the archive, and sometimes reaches the open
file limit.

Another problem with link_whole is that it does not propagate
dependencies. In particular, gnutls, a dependency of crypto, is not
propagated to its users, and we currently work around the issue by
declaring gnutls as a dependency for each crypto user.

Instead of using link_whole, extract objects included in static
libraries and pass them to declare_dependency(). This requires Meson
1.1.0 or later.

Signed-off-by: Akihiko Odaki 
---
 docs/devel/build-system.rst   |  2 +-
 meson.build   | 27 ++-
 gdbstub/meson.build   |  4 ++--
 subprojects/libvhost-user/meson.build |  2 +-
 tests/qtest/libqos/meson.build|  2 +-
 5 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/docs/devel/build-system.rst b/docs/devel/build-system.rst
index 5baf027b7614..36ad40c76d2a 100644
--- a/docs/devel/build-system.rst
+++ b/docs/devel/build-system.rst
@@ -238,7 +238,7 @@ Subsystem sourcesets:
 libchardev = static_library('chardev', chardev_ss.sources(),
 build_by_default: false)
 
-chardev = declare_dependency(link_whole: libchardev)
+chardev = declare_dependency(objects: 
libchardev.extract_all_objects(recursive: false))
 
 Target-independent emulator sourcesets:
   Various general purpose helper code is compiled only once and
diff --git a/meson.build b/meson.build
index d6549722b50d..0e6fa2e4b777 100644
--- a/meson.build
+++ b/meson.build
@@ -1,4 +1,4 @@
-project('qemu', ['c'], meson_version: '>=0.63.0',
+project('qemu', ['c'], meson_version: '>=1.1.0',
 default_options: ['warning_level=1', 'c_std=gnu11', 'cpp_std=gnu++11', 
'b_colorout=auto',
   'b_staticpic=false', 'stdsplit=false', 
'optimization=2', 'b_pie=true'],
 version: files('VERSION'))
@@ -3456,20 +3456,20 @@ subdir('gdbstub')
 
 if enable_modules
   libmodulecommon = static_library('module-common', files('module-common.c') + 
genh, pic: true, c_args: '-DBUILD_DSO')
-  modulecommon = declare_dependency(link_whole: libmodulecommon, compile_args: 
'-DBUILD_DSO')
+  modulecommon = declare_dependency(objects: 
libmodulecommon.extract_all_objects(recursive: false), compile_args: 
'-DBUILD_DSO')
 endif
 
 qom_ss = qom_ss.apply({})
 libqom = static_library('qom', qom_ss.sources() + genh,
 dependencies: [qom_ss.dependencies()],
 build_by_default: false)
-qom = declare_dependency(link_whole: libqom)
+qom = declare_dependency(objects: libqom.extract_all_objects(recursive: false))
 
 event_loop_base = files('event-loop-base.c')
 event_loop_base = static_library('event-loop-base',
  sources: event_loop_base + genh,
  build_by_default: false)
-event_loop_base = declare_dependency(link_whole: event_loop_base,
+event_loop_base = declare_dependency(objects: 
event_loop_base.extract_all_objects(recursive: false),
  dependencies: [qom])
 
 stub_ss = stub_ss.apply({})
@@ -3703,7 +3703,7 @@ libauthz = static_library('authz', authz_ss.sources() + 
genh,
   dependencies: [authz_ss.dependencies()],
   build_by_default: false)
 
-authz = declare_dependency(link_whole: libauthz,
+authz = declare_dependency(objects: libauthz.extract_all_objects(recursive: 
false),
dependencies: qom)
 
 crypto_ss = crypto_ss.apply({})
@@ -3711,7 +3711,7 @@ libcrypto = static_library('crypto', crypto_ss.sources() 
+ genh,
dependencies: [crypto_ss.dependencies()],
build_by_default: false)
 
-crypto = declare_dependency(link_whole: libcrypto,
+crypto = declare_dependency(objects: libcrypto.extract_all_objects(recursive: 
false),
 dependencies: [authz, qom])
 
 io_ss = io_ss.apply({})
@@ -3720,7 +3720,8 @@ libio = static_library('io', io_ss.sources() + genh,
link_with: libqemuutil,
build_by_default: false)
 
-io = declare_dependency(link_whole: libio, dependencies: [crypto, qom])
+io = declare_dependency(objects: libio.extract_all_objects(recursive: false),
+dependencies: [crypto, qom])
 
 libmigration = static_library('migration', sources: migration_files + genh,
   build_by_default: false)
@@ -3734,7 +3735,7 @@ libblock = static_library('block', block_ss.sources() + 
genh,
   link_depends: block_syms,
   build_by_default: false)
 
-block = declare_dependency(link_whole: [libblock],
+block = declare_dependency(objects: libblock.extract_all_objects(recursive: 
false),
 

[PATCH RFC 2/2] Revert "meson: Propagate gnutls dependency"

2024-05-24 Thread Akihiko Odaki
This reverts commit 3eacf70bb5a83e4775ad8003cbca63a40f70c8c2.

Signed-off-by: Akihiko Odaki 
---
 meson.build| 4 ++--
 block/meson.build  | 2 +-
 io/meson.build | 2 +-
 storage-daemon/meson.build | 2 +-
 ui/meson.build | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/meson.build b/meson.build
index 0e6fa2e4b777..cd5a24807ec8 100644
--- a/meson.build
+++ b/meson.build
@@ -3518,7 +3518,7 @@ if have_block
 'blockdev-nbd.c',
 'iothread.c',
 'job-qmp.c',
-  ), gnutls)
+  ))
 
   # os-posix.c contains POSIX-specific functions used by qemu-storage-daemon,
   # os-win32.c does not
@@ -4008,7 +4008,7 @@ if have_tools
   qemu_io = executable('qemu-io', files('qemu-io.c'),
  dependencies: [block, qemuutil], install: true)
   qemu_nbd = executable('qemu-nbd', files('qemu-nbd.c'),
-   dependencies: [blockdev, qemuutil, gnutls, selinux],
+   dependencies: [blockdev, qemuutil, selinux],
install: true)
 
   subdir('storage-daemon')
diff --git a/block/meson.build b/block/meson.build
index e1f03fd773e9..0165ac178370 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -39,7 +39,7 @@ block_ss.add(files(
   'throttle.c',
   'throttle-groups.c',
   'write-threshold.c',
-), zstd, zlib, gnutls)
+), zstd, zlib)
 
 system_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
 system_ss.add(files('block-ram-registrar.c'))
diff --git a/io/meson.build b/io/meson.build
index 283b9b2bdbdf..1164812f9126 100644
--- a/io/meson.build
+++ b/io/meson.build
@@ -13,4 +13,4 @@ io_ss.add(files(
   'dns-resolver.c',
   'net-listener.c',
   'task.c',
-), gnutls)
+))
diff --git a/storage-daemon/meson.build b/storage-daemon/meson.build
index 46267b63e72b..b955949fd6f3 100644
--- a/storage-daemon/meson.build
+++ b/storage-daemon/meson.build
@@ -1,6 +1,6 @@
 qsd_ss = ss.source_set()
 qsd_ss.add(files('qemu-storage-daemon.c'))
-qsd_ss.add(blockdev, chardev, qmp, qom, qemuutil, gnutls)
+qsd_ss.add(blockdev, chardev, qmp, qom, qemuutil)
 
 subdir('qapi')
 
diff --git a/ui/meson.build b/ui/meson.build
index a5ce22a678ba..9358439ceeed 100644
--- a/ui/meson.build
+++ b/ui/meson.build
@@ -43,7 +43,7 @@ vnc_ss.add(files(
   'vnc-jobs.c',
   'vnc-clipboard.c',
 ))
-vnc_ss.add(zlib, jpeg, gnutls)
+vnc_ss.add(zlib, jpeg)
 vnc_ss.add(when: sasl, if_true: files('vnc-auth-sasl.c'))
 system_ss.add_all(when: [vnc, pixman], if_true: vnc_ss)
 system_ss.add(when: vnc, if_false: files('vnc-stubs.c'))

-- 
2.45.1




[PATCH RFC 0/2] meson: Pass objects to declare_dependency()

2024-05-24 Thread Akihiko Odaki
Based-on: <20240524-xkb-v4-0-2de564e5c...@daynix.com>
("[PATCH v4 0/4] Fix sanitizer errors with clang 18.1.1")

These are the changes suggested by Paolo Bonzini at:
https://lore.kernel.org/all/CABgObfYoEFZsW-H4WJ7xW0B85OqFi932d3-DmNAb6zTohFn=o...@mail.gmail.com/

Unfortunately it broke builds on my system. Below are the errors I
observed:

clang  -o qemu-img libauthz.a.p/authz_base.c.o libauthz.a.p/authz_list.c.o 
libauthz.a.p/authz_listfile.c.o libauthz.a.p/authz_simple.c.o 
libauthz.a.p/authz_pamacct.c.o libqom.a.p/qom_container.c.o 
libqom.a.p/qom_object.c.o libqom.a.p/qom_object_interfaces.c.o 
libqom.a.p/qom_qom-qobject.c.o libblock.a.p/block.c.o libblock.a.p/blockjob.c.o 
libblock.a.p/job.c.o libblock.a.p/qemu-io-cmds.c.o libblock.a.p/replication.c.o 
libblock.a.p/nbd_client.c.o libblock.a.p/nbd_client-connection.c.o 
libblock.a.p/nbd_common.c.o libblock.a.p/scsi_utils.c.o 
libblock.a.p/scsi_pr-manager.c.o libblock.a.p/scsi_pr-manager-helper.c.o 
libblock.a.p/block_accounting.c.o libblock.a.p/block_aio_task.c.o 
libblock.a.p/block_amend.c.o libblock.a.p/block_backup.c.o 
libblock.a.p/block_blkdebug.c.o libblock.a.p/block_blklogwrites.c.o 
libblock.a.p/block_blkverify.c.o libblock.a.p/block_block-backend.c.o 
libblock.a.p/block_block-copy.c.o libblock.a.p/block_commit.c.o 
libblock.a.p/block_copy-before-write.c.o 
libblock.a.p/block_copy-on-read.c.o libblock.a.p/block_create.c.o 
libblock.a.p/block_crypto.c.o libblock.a.p/block_dirty-bitmap.c.o 
libblock.a.p/block_filter-compress.c.o libblock.a.p/block_graph-lock.c.o 
libblock.a.p/block_io.c.o libblock.a.p/block_mirror.c.o 
libblock.a.p/block_nbd.c.o libblock.a.p/block_null.c.o 
libblock.a.p/block_preallocate.c.o libblock.a.p/block_progress_meter.c.o 
libblock.a.p/block_qapi.c.o libblock.a.p/block_qcow2.c.o 
libblock.a.p/block_qcow2-bitmap.c.o libblock.a.p/block_qcow2-cache.c.o 
libblock.a.p/block_qcow2-cluster.c.o libblock.a.p/block_qcow2-refcount.c.o 
libblock.a.p/block_qcow2-snapshot.c.o libblock.a.p/block_qcow2-threads.c.o 
libblock.a.p/block_quorum.c.o libblock.a.p/block_raw-format.c.o 
libblock.a.p/block_reqlist.c.o libblock.a.p/block_snapshot.c.o 
libblock.a.p/block_snapshot-access.c.o libblock.a.p/block_throttle.c.o 
libblock.a.p/block_throttle-groups.c.o libblock.a.p/block_write-threshold.c.o 
libblock.a.p/block_qcow.c.o libblock.a.p/block_vdi.c.o 
libblock.a.p/block_vhdx-endian.c.o libblock.a.p/block_vhdx-log.c.o 
libblock.a.p/block_vhdx.c.o libblock.a.p/block_vmdk.c.o 
libblock.a.p/block_vpc.c.o libblock.a.p/block_cloop.c.o 
libblock.a.p/block_bochs.c.o libblock.a.p/block_vvfat.c.o 
libblock.a.p/block_dmg.c.o libblock.a.p/block_qed-check.c.o 
libblock.a.p/block_qed-cluster.c.o libblock.a.p/block_qed-l2-cache.c.o 
libblock.a.p/block_qed-table.c.o libblock.a.p/block_qed.c.o 
libblock.a.p/block_parallels.c.o libblock.a.p/block_parallels-ext.c.o 
libblock.a.p/block_file-posix.c.o libblock.a.p/block_nvme.c.o 
libblock.a.p/block_replication.c.o libblock.a.p/block_stream.c.o 
libblock.a.p/block_monitor_bitmap-qmp-cmds.c.o libblock.a.p/block_curl.c.o 
libblock.a.p/block_ssh.c.o libblock.a.p/block_dmg-bz2.c.o 
libblock.a.p/meson-generated_.._block_block-gen.c.o 
libcrypto.a.p/crypto_afsplit.c.o libcrypto.a.p/crypto_akcipher.c.o 
libcrypto.a.p/crypto_block-luks.c.o libcrypto.a.p/crypto_block-qcow.c.o 
libcrypto.a.p/crypto_block.c.o libcrypto.a.p/crypto_cipher.c.o 
libcrypto.a.p/crypto_der.c.o libcrypto.a.p/crypto_hash.c.o 
libcrypto.a.p/crypto_hmac.c.o libcrypto.a.p/crypto_ivgen-essiv.c.o 
libcrypto.a.p/crypto_ivgen-plain.c.o libcrypto.a.p/crypto_ivgen-plain64.c.o 
libcrypto.a.p/crypto_ivgen.c.o libcrypto.a.p/crypto_pbkdf.c.o 
libcrypto.a.p/crypto_secret_common.c.o libcrypto.a.p/crypto_secret.c.o 
libcrypto.a.p/crypto_tlscreds.c.o libcrypto.a.p/crypto_tlscredsanon.c.o 
libcrypto.a.p/crypto_tlscredspsk.c.o libcrypto.a.p/crypto_tlscredsx509.c.o 
libcrypto.a.p/crypto_tlssession.c.o libcrypto.a.p/crypto_rsakey.c.o 
libcrypto.a.p/crypto_hash-gnutls.c.o libcrypto.a.p/crypto_hmac-gnutls.c.o 
libcrypto.a.p/crypto_pbkdf-gnutls.c.o libcrypto.a.p/crypto_secret_keyring.c.o 
libio.a.p/io_channel-buffer.c.o libio.a.p/io_channel-command.c.o 
libio.a.p/io_channel-file.c.o libio.a.p/io_channel-null.c.o 
libio.a.p/io_channel-socket.c.o libio.a.p/io_channel-tls.c.o 
libio.a.p/io_channel-util.c.o libio.a.p/io_channel-watch.c.o 
libio.a.p/io_channel-websock.c.o libio.a.p/io_channel.c.o 
libio.a.p/io_dns-resolver.c.o libio.a.p/io_net-listener.c.o 
libio.a.p/io_task.c.o libevent-loop-base.a.p/event-loop-base.c.o 
qemu-img.p/qemu-img.c.o -Werror -flto -Wl,--as-needed -Wl,--no-undefined -pie 
-fsanitize=cfi-icall -fsanitize-cfi-icall-generalize-pointers 
-fsanitize=undefined -fsanitize=address -fstack-protector-strong -Wl,-z,relro 
-Wl,-z,now -fuse-ld=lld -Wl,--start-group libqemuutil.a 
subprojects/libvhost-user/libvhost-user-glib.a 
subprojects/libvhost-user/libvhost-user.a @block.syms /usr/lib64/libgio-2.0.so 
/usr/lib64/libgobject-2.0.so /usr/lib64/libg

Re: [PATCH] hw/nvme: fix mo field in io mgnt send

2024-05-24 Thread Klaus Jensen
On May  8 09:36, Vincent Fu wrote:
> On 5/7/24 10:05, Vincent Fu wrote:
> > On 5/6/24 04:06, Klaus Jensen wrote:
> > > The Management Operation field of I/O Management Send is only 8 bits,
> > > not 16.
> > > 
> > > Fixes: 73064edfb864 ("hw/nvme: flexible data placement emulation")
> > > Signed-off-by: Klaus Jensen 
> > > ---
> > >   hw/nvme/ctrl.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> > > index 9e7bbebc8bb0..ede5f281dd7c 100644
> > > --- a/hw/nvme/ctrl.c
> > > +++ b/hw/nvme/ctrl.c
> > > @@ -4387,7 +4387,7 @@ static uint16_t nvme_io_mgmt_send(NvmeCtrl *n,
> > > NvmeRequest *req)
> > >   {
> > >   NvmeCmd *cmd = &req->cmd;
> > >   uint32_t cdw10 = le32_to_cpu(cmd->cdw10);
> > > -    uint8_t mo = (cdw10 & 0xff);
> > > +    uint8_t mo = cdw10 & 0xf;
> > >   switch (mo) {
> > >   case NVME_IOMS_MO_NOP:
> > > 
> > > ---
> > > base-commit: 84b0eb1826f690aa8d51984644318ee6c810f5bf
> > > change-id: 20240506-fix-ioms-mo-97098c6c5396
> > > 
> > > Best regards,
> > 
> > Reviewed-by: Vincent Fu 
> 
> Klaus, upon taking a second look, the original code is correct. The proposed
> change would only keep the least significant 4 bits of the MO field. The
> original code gives you the 8 bits needed.
> 
> Let me withdraw my Reviewed-by.
> 
> Vincent

That was embarrassing. Thanks for catching that, Vincent :)
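
For the record, a small sketch of the difference between the two masks. Both helper names are hypothetical and exist only for this comparison; the first corresponds to the original code, the second to the withdrawn change.

```c
#include <assert.h>
#include <stdint.h>

/* Original code: keeps all 8 bits of the Management Operation field. */
static inline uint8_t sketch_ioms_mo(uint32_t cdw10)
{
    return cdw10 & 0xff;
}

/* Withdrawn change: 0xf keeps only the least significant 4 bits. */
static inline uint8_t sketch_ioms_mo_4bit(uint32_t cdw10)
{
    return cdw10 & 0xf;
}
```

Any MO value of 0x10 or above would be truncated by the 4-bit mask, which is the problem Vincent pointed out.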




Re: [PATCH v3 3/3] meson: Drop the .fa library prefix

2024-05-24 Thread Akihiko Odaki

On 2024/05/22 22:45, Paolo Bonzini wrote:

On Wed, May 22, 2024 at 12:49 PM Akihiko Odaki  wrote:

The non-standard .fa library prefix breaks the link source
de-duplication done by Meson so drop it.


Can you show the difference in the command lines?


Without this patch:
clang  -o qemu-io qemu-io.p/qemu-io.c.o -Werror -flto -Wl,--as-needed 
-Wl,--no-undefined -pie -Wl,--whole-archive libblock.fa libcrypto.fa 
libauthz.fa libqom.fa libio.fa libevent-loop-base.fa 
-Wl,--no-whole-archive -fsanitize=cfi-icall 
-fsanitize-cfi-icall-generalize-pointers -fsanitize=undefined 
-fsanitize=address -fstack-protector-strong -Wl,-z,relro -Wl,-z,now 
-fuse-ld=lld -Wl,--start-group libqemuutil.a 
subprojects/libvhost-user/libvhost-user-glib.a 
subprojects/libvhost-user/libvhost-user.a libblock.fa libcrypto.fa 
libauthz.fa libqom.fa libio.fa libevent-loop-base.fa @block.syms 
/usr/lib64/libgio-2.0.so /usr/lib64/libgobject-2.0.so 
/usr/lib64/libglib-2.0.so /usr/lib64/libgmodule-2.0.so -pthread 
/usr/lib64/libgnutls.so -lm /usr/lib64/libpixman-1.so 
/usr/lib64/libzstd.so /usr/lib64/libz.so /usr/lib64/libcurl.so 
/usr/lib64/libssh.so -lbz2 -lpam -Wl,--end-group


With this patch:
clang  -o qemu-io qemu-io.p/qemu-io.c.o -Werror -flto -Wl,--as-needed 
-Wl,--no-undefined -pie -Wl,--whole-archive -Wl,--start-group libblock.a 
libcrypto.a libauthz.a libqom.a libio.a libevent-loop-base.a 
-Wl,--no-whole-archive -fsanitize=cfi-icall 
-fsanitize-cfi-icall-generalize-pointers -fsanitize=undefined 
-fsanitize=address -fstack-protector-strong -Wl,-z,relro -Wl,-z,now 
-fuse-ld=lld libqemuutil.a 
subprojects/libvhost-user/libvhost-user-glib.a 
subprojects/libvhost-user/libvhost-user.a @block.syms 
/usr/lib64/libgio-2.0.so /usr/lib64/libgobject-2.0.so 
/usr/lib64/libglib-2.0.so /usr/lib64/libgmodule-2.0.so -pthread 
/usr/lib64/libgnutls.so -lm /usr/lib64/libpixman-1.so 
/usr/lib64/libzstd.so /usr/lib64/libz.so /usr/lib64/libcurl.so 
/usr/lib64/libssh.so -lbz2 -lpam -Wl,--end-group




One possibility to force de-duplication of objects is to change
"link_whole: foo" to "objects: foo.extract_all_objects(recursive:
false)" in all the declare_dependency() invocations that involve a
'fa' archive.

This completely gets rid of the archives, which now become just a
dummy target. I have gotten reports of "ld" exhausting the limit of
open files when using thin archives (thin archives contain "symbolic
links" to the files with the actual object code, thus reducing disk
usage), and this would also be fixed.

The disadvantage is requiring a bump to Meson 1.1.x as the minimum
required version (the recommended version is 1.2.x because earlier
versions are incompatible with recent macOS). It could be done before
this patch (because then this patch is a total no-op), or after too to
fix the immediate issue with sanitizer builds.


I wrote such a change and applied after this patch, but it caused 
dependencies to be ignored. Please see "[PATCH RFC 0/2] meson: Pass 
objects to declare_dependency()", which I sent earlier.


Regards,
Akihiko Odaki



[PATCH 01/16] target/i386: remove unnecessary gen_update_cc_op before gen_eob*

2024-05-24 Thread Paolo Bonzini
This is already handled in gen_eob().  Before adding another DISAS_*
case, remove the double calls.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 76be7425800..f44edb3c29c 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -4776,14 +4776,12 @@ static void i386_tr_tb_stop(DisasContextBase *dcbase, 
CPUState *cpu)
 gen_jmp_rel_csize(dc, 0, 0);
 break;
 case DISAS_EOB_NEXT:
-gen_update_cc_op(dc);
 gen_update_eip_cur(dc);
 /* fall through */
 case DISAS_EOB_ONLY:
 gen_eob(dc);
 break;
 case DISAS_EOB_INHIBIT_IRQ:
-gen_update_cc_op(dc);
 gen_update_eip_cur(dc);
 gen_eob_inhibit_irq(dc);
 break;
-- 
2.45.1




[PATCH 06/16] target/i386: assert that gen_update_eip_cur and gen_update_eip_next are the same in tb_stop

2024-05-24 Thread Paolo Bonzini
This is an invariant, since these cases of tb_stop() should only
be reached through the "instruction decoding completed" path of
i386_tr_translate_insn().

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 5dae890d2b6..2c7917d239f 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -4787,6 +4787,7 @@ static void i386_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
 gen_jmp_rel_csize(dc, 0, 0);
 break;
 case DISAS_EOB_NEXT:
+assert(dc->base.pc_next == dc->pc);
 gen_update_eip_cur(dc);
 /* fall through */
 case DISAS_EOB_ONLY:
@@ -4796,6 +4797,7 @@ static void i386_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
 gen_eob_syscall(dc);
 break;
 case DISAS_EOB_INHIBIT_IRQ:
+assert(dc->base.pc_next == dc->pc);
 gen_update_eip_cur(dc);
 gen_eob_inhibit_irq(dc);
 break;
-- 
2.45.1




[PATCH 00/16] target/i386/tcg: translation cleanups

2024-05-24 Thread Paolo Bonzini
Some cleanups in translate.c, which I could make now that
it's smaller and it's easier to understand how the various
utility functions are used.

1-7: cleanups for gen_eob

8-14: inlining and removing macros

15-16: cleanups for cc_op vs. helpers

Paolo

Paolo Bonzini (16):
  target/i386: remove unnecessary gen_update_cc_op before gen_eob*
  target/i386: cleanup eob handling of RSM
  target/i386: document and group DISAS_* constants
  target/i386: avoid calling gen_eob_syscall before tb_stop
  target/i386: avoid calling gen_eob_inhibit_irq before tb_stop
  target/i386: assert that gen_update_eip_cur and gen_update_eip_next
are the same in tb_stop
  target/i386: raze the gen_eob* jungle
  target/i386: reg in gen_ldst_modrm is always OR_TMP0
  target/i386: split gen_ldst_modrm for load and store
  target/i386: inline gen_add_A0_ds_seg
  target/i386: use mo_stacksize more
  target/i386: introduce gen_lea_ss_ofs
  target/i386: clean up repeated string operations
  target/i386: remove aflag argument of gen_lea_v_seg
  target/i386: cpu_load_eflags already sets cc_op
  target/i386: set CC_OP in helpers if they want CC_OP_EFLAGS

 target/i386/ops_sse.h|   8 +
 target/i386/tcg/fpu_helper.c |   2 +
 target/i386/tcg/int_helper.c |  13 +-
 target/i386/tcg/seg_helper.c |  16 +-
 target/i386/tcg/translate.c  | 322 +++
 target/i386/tcg/emit.c.inc   |  58 +++
 6 files changed, 194 insertions(+), 225 deletions(-)

-- 
2.45.1




[PATCH 15/16] target/i386: cpu_load_eflags already sets cc_op

2024-05-24 Thread Paolo Bonzini
There is no need to set it again at the end of the translation block;
cc_op_dirty can be set to false.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 37 -
 target/i386/tcg/emit.c.inc  |  2 +-
 2 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 1a776e77297..7442a8a51b1 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -332,7 +332,7 @@ static const uint8_t cc_op_live[CC_OP_NB] = {
 [CC_OP_POPCNT] = USES_CC_SRC,
 };
 
-static void set_cc_op(DisasContext *s, CCOp op)
+static void set_cc_op_1(DisasContext *s, CCOp op, bool dirty)
 {
 int dead;
 
@@ -355,20 +355,27 @@ static void set_cc_op(DisasContext *s, CCOp op)
 tcg_gen_discard_tl(s->cc_srcT);
 }
 
-if (op == CC_OP_DYNAMIC) {
-/* The DYNAMIC setting is translator only, and should never be
-   stored.  Thus we always consider it clean.  */
-s->cc_op_dirty = false;
-} else {
-/* Discard any computed CC_OP value (see shifts).  */
-if (s->cc_op == CC_OP_DYNAMIC) {
-tcg_gen_discard_i32(cpu_cc_op);
-}
-s->cc_op_dirty = true;
+if (dirty && s->cc_op == CC_OP_DYNAMIC) {
+tcg_gen_discard_i32(cpu_cc_op);
 }
+s->cc_op_dirty = dirty;
 s->cc_op = op;
 }
 
+static void set_cc_op(DisasContext *s, CCOp op)
+{
+/*
+ * The DYNAMIC setting is translator only, everything else
+ * will be spilled later.
+ */
+set_cc_op_1(s, op, op != CC_OP_DYNAMIC);
+}
+
+static void assume_cc_op(DisasContext *s, CCOp op)
+{
+set_cc_op_1(s, op, false);
+}
+
 static void gen_update_cc_op(DisasContext *s)
 {
 if (s->cc_op_dirty) {
@@ -3510,6 +3517,10 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 gen_update_cc_op(s);
 gen_update_eip_cur(s);
 gen_helper_syscall(tcg_env, cur_insn_len_i32(s));
+/* condition codes are modified only in long mode */
+if (LMA(s)) {
+assume_cc_op(s, CC_OP_EFLAGS);
+}
 /* TF handling for the syscall insn is different. The TF bit is  checked
after the syscall insn completes. This allows #DB to not be
generated after one has entered CPL0 if TF is set in FMASK.  */
@@ -3526,7 +3537,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 gen_helper_sysret(tcg_env, tcg_constant_i32(dflag - 1));
 /* condition codes are modified only in long mode */
 if (LMA(s)) {
-set_cc_op(s, CC_OP_EFLAGS);
+assume_cc_op(s, CC_OP_EFLAGS);
 }
 /* TF handling for the sysret insn is different. The TF bit is
checked after the sysret insn completes. This allows #DB to be
@@ -,7 +4455,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 g_assert_not_reached();
 #else
 gen_helper_rsm(tcg_env);
-set_cc_op(s, CC_OP_EFLAGS);
+assume_cc_op(s, CC_OP_EFLAGS);
 #endif /* CONFIG_USER_ONLY */
 s->base.is_jmp = DISAS_EOB_ONLY;
 break;
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 9eecf7ab56c..9fea395dfbf 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1881,7 +1881,7 @@ static void gen_IRET(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 gen_helper_iret_protected(tcg_env, tcg_constant_i32(s->dflag - 1),
   eip_next_i32(s));
 }
-set_cc_op(s, CC_OP_EFLAGS);
+assume_cc_op(s, CC_OP_EFLAGS);
 s->base.is_jmp = DISAS_EOB_ONLY;
 }
 
-- 
2.45.1
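
The difference between set_cc_op() and assume_cc_op() can be sketched with a small standalone model. This is only an illustration under stated assumptions: the ToyCtx struct, the toy_* names, the three-value enum and the env_cc_op field are hypothetical, and the TCG discard logic of the real set_cc_op_1() is left out. The model assumes that the helper has already stored CC_OP_EFLAGS, as the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values; the real CCOp enum is much larger. */
enum { TOY_CC_OP_DYNAMIC, TOY_CC_OP_EFLAGS, TOY_CC_OP_ADD };

typedef struct ToyCtx {
    int cc_op;          /* translator's view of cc_op */
    bool cc_op_dirty;   /* true while cc_op still has to be spilled */
    int env_cc_op;      /* simulated run-time copy of cc_op */
    int stores_emitted; /* counts simulated spills */
} ToyCtx;

static void toy_set_cc_op_1(ToyCtx *s, int op, bool dirty)
{
    s->cc_op_dirty = dirty;
    s->cc_op = op;
}

/* Everything except DYNAMIC is marked dirty and spilled later. */
static void toy_set_cc_op(ToyCtx *s, int op)
{
    toy_set_cc_op_1(s, op, op != TOY_CC_OP_DYNAMIC);
}

/* The run-time value is already correct, so nothing needs spilling. */
static void toy_assume_cc_op(ToyCtx *s, int op)
{
    toy_set_cc_op_1(s, op, false);
}

static void toy_update_cc_op(ToyCtx *s)
{
    if (s->cc_op_dirty) {
        s->env_cc_op = s->cc_op;
        s->stores_emitted++;
        s->cc_op_dirty = false;
    }
}

/* Model of a helper that reloads EFLAGS and sets cc_op itself. */
static void toy_helper_load_eflags(ToyCtx *s)
{
    s->env_cc_op = TOY_CC_OP_EFLAGS;
}

/* Number of spills at the end of the block after such a helper. */
static int toy_stores_after_helper(bool use_assume)
{
    ToyCtx s = { TOY_CC_OP_DYNAMIC, false, TOY_CC_OP_ADD, 0 };

    toy_helper_load_eflags(&s);
    if (use_assume) {
        toy_assume_cc_op(&s, TOY_CC_OP_EFLAGS);
    } else {
        toy_set_cc_op(&s, TOY_CC_OP_EFLAGS);
    }
    toy_update_cc_op(&s); /* end of the translation block */

    assert(s.env_cc_op == TOY_CC_OP_EFLAGS);
    return s.stores_emitted;
}
```

In this model both variants leave the run-time value at CC_OP_EFLAGS; the assume variant simply skips the store that the helper has made unnecessary.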




[PATCH 14/16] target/i386: remove aflag argument of gen_lea_v_seg

2024-05-24 Thread Paolo Bonzini
It is always s->aflag.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 20 ++--
 target/i386/tcg/emit.c.inc  |  6 +++---
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 18d8c0de674..1a776e77297 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -673,20 +673,20 @@ static void gen_lea_v_seg_dest(DisasContext *s, MemOp aflag, TCGv dest, TCGv a0,
 }
 }
 
-static void gen_lea_v_seg(DisasContext *s, MemOp aflag, TCGv a0,
+static void gen_lea_v_seg(DisasContext *s, TCGv a0,
   int def_seg, int ovr_seg)
 {
-gen_lea_v_seg_dest(s, aflag, s->A0, a0, def_seg, ovr_seg);
+gen_lea_v_seg_dest(s, s->aflag, s->A0, a0, def_seg, ovr_seg);
 }
 
 static inline void gen_string_movl_A0_ESI(DisasContext *s)
 {
-gen_lea_v_seg(s, s->aflag, cpu_regs[R_ESI], R_DS, s->override);
+gen_lea_v_seg(s, cpu_regs[R_ESI], R_DS, s->override);
 }
 
 static inline void gen_string_movl_A0_EDI(DisasContext *s)
 {
-gen_lea_v_seg(s, s->aflag, cpu_regs[R_EDI], R_ES, -1);
+gen_lea_v_seg(s, cpu_regs[R_EDI], R_ES, -1);
 }
 
 static inline TCGv gen_compute_Dshift(DisasContext *s, MemOp ot)
@@ -1777,7 +1777,7 @@ static void gen_lea_modrm(CPUX86State *env, DisasContext *s, int modrm)
 {
 AddressParts a = gen_lea_modrm_0(env, s, modrm);
 TCGv ea = gen_lea_modrm_1(s, a, false);
-gen_lea_v_seg(s, s->aflag, ea, a.def_seg, s->override);
+gen_lea_v_seg(s, ea, a.def_seg, s->override);
 }
 
 static void gen_nop_modrm(CPUX86State *env, DisasContext *s, int modrm)
@@ -2516,7 +2516,7 @@ static bool disas_insn_x87(DisasContext *s, CPUState *cpu, int b)
 bool update_fdp = true;
 
 tcg_gen_mov_tl(last_addr, ea);
-gen_lea_v_seg(s, s->aflag, ea, a.def_seg, s->override);
+gen_lea_v_seg(s, ea, a.def_seg, s->override);
 
 switch (op) {
 case 0x00 ... 0x07: /* fxxxs */
@@ -3313,7 +3313,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 tcg_gen_sari_tl(s->tmp0, s->T1, 3 + ot);
 tcg_gen_shli_tl(s->tmp0, s->tmp0, ot);
 tcg_gen_add_tl(s->A0, gen_lea_modrm_1(s, a, false), s->tmp0);
-gen_lea_v_seg(s, s->aflag, s->A0, a.def_seg, s->override);
+gen_lea_v_seg(s, s->A0, a.def_seg, s->override);
 if (!(s->prefix & PREFIX_LOCK)) {
 gen_op_ld_v(s, ot, s->T0, s->A0);
 }
@@ -3634,7 +3634,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 }
 gen_update_cc_op(s);
 gen_update_eip_cur(s);
-gen_lea_v_seg(s, s->aflag, cpu_regs[R_EAX], R_DS, s->override);
+gen_lea_v_seg(s, cpu_regs[R_EAX], R_DS, s->override);
 gen_helper_monitor(tcg_env, s->A0);
 break;
 
@@ -4040,7 +4040,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 } else {
 tcg_gen_movi_tl(s->A0, 0);
 }
-gen_lea_v_seg(s, s->aflag, s->A0, a.def_seg, s->override);
+gen_lea_v_seg(s, s->A0, a.def_seg, s->override);
 if (a.index >= 0) {
 tcg_gen_mov_tl(s->T0, cpu_regs[a.index]);
 } else {
@@ -4145,7 +4145,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 } else {
 tcg_gen_movi_tl(s->A0, 0);
 }
-gen_lea_v_seg(s, s->aflag, s->A0, a.def_seg, s->override);
+gen_lea_v_seg(s, s->A0, a.def_seg, s->override);
 if (a.index >= 0) {
 tcg_gen_mov_tl(s->T0, cpu_regs[a.index]);
 } else {
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index bc96735f61d..9eecf7ab56c 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -76,7 +76,7 @@ static void gen_NM_exception(DisasContext *s)
 static void gen_load_ea(DisasContext *s, AddressParts *mem, bool is_vsib)
 {
 TCGv ea = gen_lea_modrm_1(s, *mem, is_vsib);
-gen_lea_v_seg(s, s->aflag, ea, mem->def_seg, s->override);
+gen_lea_v_seg(s, ea, mem->def_seg, s->override);
 }
 
 static inline int mmx_offset(MemOp ot)
@@ -2044,7 +2044,7 @@ static void gen_MOV(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 
 static void gen_MASKMOV(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
-gen_lea_v_seg(s, s->aflag, cpu_regs[R_EDI], R_DS, s->override);
+gen_lea_v_seg(s, cpu_regs[R_EDI], R_DS, s->override);
 
 if (s->prefix & PREFIX_DATA) {
 gen_helper_maskmov_xmm(tcg_env, OP_PTR1, OP_PTR2, s->A0);
@@ -4039,7 +4039,7 @@ static void gen_XLAT(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
 /* AL is already zero-extended into s->T0.  */
 tcg_gen_add_tl(s->A0, cpu_regs[R_EBX], s->T0);
-gen_lea_v_seg(s, s->aflag, s->A0, R_D

[PATCH 03/16] target/i386: document and group DISAS_* constants

2024-05-24 Thread Paolo Bonzini
Place DISAS_* constants that update cpu_eip first, and
the "jump" ones last.  Add comments explaining the differences
and usage.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 3c7d8d72144..52d758a224b 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -144,9 +144,28 @@ typedef struct DisasContext {
 TCGOp *prev_insn_end;
 } DisasContext;
 
-#define DISAS_EOB_ONLY DISAS_TARGET_0
-#define DISAS_EOB_NEXT DISAS_TARGET_1
-#define DISAS_EOB_INHIBIT_IRQ  DISAS_TARGET_2
+/*
+ * Point EIP to next instruction before ending translation.
+ * For instructions that can change hflags.
+ */
+#define DISAS_EOB_NEXT DISAS_TARGET_0
+
+/*
+ * Point EIP to next instruction and set HF_INHIBIT_IRQ if not
+ * already set.  For instructions that activate interrupt shadow.
+ */
+#define DISAS_EOB_INHIBIT_IRQ  DISAS_TARGET_1
+
+/*
+ * EIP has already been updated.  For jumps that do not use
+ * lookup_and_goto_ptr()
+ */
+#define DISAS_EOB_ONLY DISAS_TARGET_2
+
+/*
+ * EIP has already been updated.  For jumps that wish to use
+ * lookup_and_goto_ptr()
+ */
 #define DISAS_JUMP DISAS_TARGET_3
 
 /* The environment in which user-only runs is constrained. */
-- 
2.45.1




[PATCH 05/16] target/i386: avoid calling gen_eob_inhibit_irq before tb_stop

2024-05-24 Thread Paolo Bonzini
sti only has one exit, so it does not need to generate the
end-of-translation code inline.  It can be deferred to tb_stop.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 13 -
 target/i386/tcg/emit.c.inc  |  4 +---
 2 files changed, 1 insertion(+), 16 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 24e83c1af84..5dae890d2b6 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -557,19 +557,6 @@ static void gen_update_eip_cur(DisasContext *s)
 s->pc_save = s->base.pc_next;
 }
 
-static void gen_update_eip_next(DisasContext *s)
-{
-assert(s->pc_save != -1);
-if (tb_cflags(s->base.tb) & CF_PCREL) {
-tcg_gen_addi_tl(cpu_eip, cpu_eip, s->pc - s->pc_save);
-} else if (CODE64(s)) {
-tcg_gen_movi_tl(cpu_eip, s->pc);
-} else {
-tcg_gen_movi_tl(cpu_eip, (uint32_t)(s->pc - s->cs_base));
-}
-s->pc_save = s->pc;
-}
-
 static int cur_insn_len(DisasContext *s)
 {
 return s->pc - s->base.pc_next;
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index c78e35b1e28..8e311b6d213 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -3475,9 +3475,7 @@ static void gen_STD(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 static void gen_STI(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
 gen_set_eflags(s, IF_MASK);
-/* interruptions are enabled only the first insn after sti */
-gen_update_eip_next(s);
-gen_eob_inhibit_irq(s);
+s->base.is_jmp = DISAS_EOB_INHIBIT_IRQ;
 }
 
 static void gen_VAESKEYGEN(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
-- 
2.45.1




[PATCH 11/16] target/i386: use mo_stacksize more

2024-05-24 Thread Paolo Bonzini
Use mo_stacksize for all stack accesses, including when
a 64-bit code segment is impossible and the code is
therefore checking only for SS32(s).

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 8138da23b3d..7b6bc486a63 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2068,12 +2068,12 @@ static inline void gen_pop_update(DisasContext *s, MemOp ot)
 
 static inline void gen_stack_A0(DisasContext *s)
 {
-gen_lea_v_seg(s, SS32(s) ? MO_32 : MO_16, cpu_regs[R_ESP], R_SS, -1);
+gen_lea_v_seg(s, mo_stacksize(s), cpu_regs[R_ESP], R_SS, -1);
 }
 
 static void gen_pusha(DisasContext *s)
 {
-MemOp s_ot = SS32(s) ? MO_32 : MO_16;
+MemOp s_ot = mo_stacksize(s);
 MemOp d_ot = s->dflag;
 int size = 1 << d_ot;
 int i;
@@ -2089,7 +2089,7 @@ static void gen_pusha(DisasContext *s)
 
 static void gen_popa(DisasContext *s)
 {
-MemOp s_ot = SS32(s) ? MO_32 : MO_16;
+MemOp s_ot = mo_stacksize(s);
 MemOp d_ot = s->dflag;
 int size = 1 << d_ot;
 int i;
@@ -2111,7 +2111,7 @@ static void gen_popa(DisasContext *s)
 static void gen_enter(DisasContext *s, int esp_addend, int level)
 {
 MemOp d_ot = mo_pushpop(s, s->dflag);
-MemOp a_ot = CODE64(s) ? MO_64 : SS32(s) ? MO_32 : MO_16;
+MemOp a_ot = mo_stacksize(s);
 int size = 1 << d_ot;
 
 /* Push BP; compute FrameTemp into T1.  */
-- 
2.45.1
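
A small standalone sketch of why the substitution is safe. The names below are hypothetical; the selection expression is the one that this patch replaces with mo_stacksize(s) in gen_enter(), and the sketch assumes mo_stacksize() evaluates that same expression. Whenever a 64-bit code segment is ruled out, it agrees with the SS32-only check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the MO_16/MO_32/MO_64 operand sizes. */
typedef enum { TOY_MO_16 = 1, TOY_MO_32 = 2, TOY_MO_64 = 3 } ToyMemOp;

/* Full selection: 64-bit code wins, otherwise the SS size decides. */
static ToyMemOp toy_mo_stacksize(bool code64, bool ss32)
{
    return code64 ? TOY_MO_64 : ss32 ? TOY_MO_32 : TOY_MO_16;
}

/* The narrower check used where 64-bit code cannot occur. */
static ToyMemOp toy_ss32_only(bool ss32)
{
    return ss32 ? TOY_MO_32 : TOY_MO_16;
}

/* True if both selections agree for every case without 64-bit code. */
static bool toy_agree_outside_code64(void)
{
    for (int ss32 = 0; ss32 <= 1; ss32++) {
        if (toy_mo_stacksize(false, ss32 != 0) != toy_ss32_only(ss32 != 0)) {
            return false;
        }
    }
    return true;
}
```

The two functions only differ when code64 is true, which is exactly the case that pusha and popa cannot reach.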




[PATCH 09/16] target/i386: split gen_ldst_modrm for load and store

2024-05-24 Thread Paolo Bonzini
The is_store argument of gen_ldst_modrm has only ever been passed
a constant.  Just split the function in two.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 52 +
 1 file changed, 29 insertions(+), 23 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index b75d61a9141..d32b5b63f5c 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1821,27 +1821,33 @@ static void gen_add_A0_ds_seg(DisasContext *s)
 gen_lea_v_seg(s, s->aflag, s->A0, R_DS, s->override);
 }
 
-/* generate modrm memory load or store of 'reg'. */
-static void gen_ldst_modrm(CPUX86State *env, DisasContext *s, int modrm,
-   MemOp ot, int is_store)
+/* generate modrm load of memory or register. */
+static void gen_ld_modrm(CPUX86State *env, DisasContext *s, int modrm, MemOp ot)
 {
 int mod, rm;
 
 mod = (modrm >> 6) & 3;
 rm = (modrm & 7) | REX_B(s);
 if (mod == 3) {
-if (is_store) {
-gen_op_mov_reg_v(s, ot, rm, s->T0);
-} else {
-gen_op_mov_v_reg(s, ot, s->T0, rm);
-}
+gen_op_mov_v_reg(s, ot, s->T0, rm);
 } else {
 gen_lea_modrm(env, s, modrm);
-if (is_store) {
-gen_op_st_v(s, ot, s->T0, s->A0);
-} else {
-gen_op_ld_v(s, ot, s->T0, s->A0);
-}
+gen_op_ld_v(s, ot, s->T0, s->A0);
+}
+}
+
+/* generate modrm store of memory or register. */
+static void gen_st_modrm(CPUX86State *env, DisasContext *s, int modrm, MemOp ot)
+{
+int mod, rm;
+
+mod = (modrm >> 6) & 3;
+rm = (modrm & 7) | REX_B(s);
+if (mod == 3) {
+gen_op_mov_reg_v(s, ot, rm, s->T0);
+} else {
+gen_lea_modrm(env, s, modrm);
+gen_op_st_v(s, ot, s->T0, s->A0);
 }
 }
 
@@ -3431,7 +3437,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 ot = dflag;
 modrm = x86_ldub_code(env, s);
 reg = ((modrm >> 3) & 7) | REX_R(s);
-gen_ldst_modrm(env, s, modrm, ot, 0);
+gen_ld_modrm(env, s, modrm, ot);
 gen_extu(ot, s->T0);
 
 /* Note that lzcnt and tzcnt are in different extensions.  */
@@ -3578,14 +3584,14 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 tcg_gen_ld32u_tl(s->T0, tcg_env,
  offsetof(CPUX86State, ldt.selector));
 ot = mod == 3 ? dflag : MO_16;
-gen_ldst_modrm(env, s, modrm, ot, 1);
+gen_st_modrm(env, s, modrm, ot);
 break;
 case 2: /* lldt */
 if (!PE(s) || VM86(s))
 goto illegal_op;
 if (check_cpl0(s)) {
 gen_svm_check_intercept(s, SVM_EXIT_LDTR_WRITE);
-gen_ldst_modrm(env, s, modrm, MO_16, 0);
+gen_ld_modrm(env, s, modrm, MO_16);
 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
 gen_helper_lldt(tcg_env, s->tmp2_i32);
 }
@@ -3600,14 +3606,14 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 tcg_gen_ld32u_tl(s->T0, tcg_env,
  offsetof(CPUX86State, tr.selector));
 ot = mod == 3 ? dflag : MO_16;
-gen_ldst_modrm(env, s, modrm, ot, 1);
+gen_st_modrm(env, s, modrm, ot);
 break;
 case 3: /* ltr */
 if (!PE(s) || VM86(s))
 goto illegal_op;
 if (check_cpl0(s)) {
 gen_svm_check_intercept(s, SVM_EXIT_TR_WRITE);
-gen_ldst_modrm(env, s, modrm, MO_16, 0);
+gen_ld_modrm(env, s, modrm, MO_16);
 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
 gen_helper_ltr(tcg_env, s->tmp2_i32);
 }
@@ -3616,7 +3622,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 case 5: /* verw */
 if (!PE(s) || VM86(s))
 goto illegal_op;
-gen_ldst_modrm(env, s, modrm, MO_16, 0);
+gen_ld_modrm(env, s, modrm, MO_16);
 gen_update_cc_op(s);
 if (op == 4) {
 gen_helper_verr(tcg_env, s->T0);
@@ -3880,7 +3886,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
  */
 mod = (modrm >> 6) & 3;
 ot = (mod != 3 ? MO_16 : s->dflag);
-gen_ldst_modrm(env, s, modrm, ot, 1);
+gen_st_modrm(env, s, modrm, ot);
 break;
 case 0xee: /* rdpkru */
 if (s->prefix & (PREFIX_LOCK | PREFIX_DATA
@@ -3907,7 +3913,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 break;
 }
 gen_svm_check_intercept(s, SVM_EXIT_WRITE_CR0);
-gen_ldst_modrm(env, s, modrm, MO_16, 0);
+gen_ld_modrm(env, s, modrm, MO_16);
 /*
  * Only the 4 lower bits of CR0 are m

[PATCH 12/16] target/i386: introduce gen_lea_ss_ofs

2024-05-24 Thread Paolo Bonzini
Generalize gen_stack_A0() to include an initial add and to use an arbitrary
destination.  This is a common pattern and it is not a huge burden to
add the extra arguments to the only caller of gen_stack_A0().

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 51 +++--
 target/i386/tcg/emit.c.inc  |  2 +-
 2 files changed, 22 insertions(+), 31 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 7b6bc486a63..8354209b037 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2028,24 +2028,27 @@ static inline void gen_stack_update(DisasContext *s, int addend)
 gen_op_add_reg_im(s, mo_stacksize(s), R_ESP, addend);
 }
 
+static void gen_lea_ss_ofs(DisasContext *s, TCGv dest, TCGv src, target_ulong offset)
+{
+if (offset) {
+tcg_gen_addi_tl(dest, src, offset);
+src = dest;
+}
+gen_lea_v_seg_dest(s, mo_stacksize(s), dest, src, R_SS, -1);
+}
+
 /* Generate a push. It depends on ss32, addseg and dflag.  */
 static void gen_push_v(DisasContext *s, TCGv val)
 {
 MemOp d_ot = mo_pushpop(s, s->dflag);
 MemOp a_ot = mo_stacksize(s);
 int size = 1 << d_ot;
-TCGv new_esp = s->A0;
+TCGv new_esp = tcg_temp_new();
 
-tcg_gen_subi_tl(s->A0, cpu_regs[R_ESP], size);
-
-if (!CODE64(s)) {
-if (ADDSEG(s)) {
-new_esp = tcg_temp_new();
-tcg_gen_mov_tl(new_esp, s->A0);
-}
-gen_lea_v_seg(s, a_ot, s->A0, R_SS, -1);
-}
+tcg_gen_subi_tl(new_esp, cpu_regs[R_ESP], size);
 
+/* Now reduce the value to the address size and apply SS base.  */
+gen_lea_ss_ofs(s, s->A0, new_esp, 0);
 gen_op_st_v(s, d_ot, val, s->A0);
 gen_op_mov_reg_v(s, a_ot, R_ESP, new_esp);
 }
@@ -2055,7 +2058,7 @@ static MemOp gen_pop_T0(DisasContext *s)
 {
 MemOp d_ot = mo_pushpop(s, s->dflag);
 
-gen_lea_v_seg_dest(s, mo_stacksize(s), s->T0, cpu_regs[R_ESP], R_SS, -1);
+gen_lea_ss_ofs(s, s->T0, cpu_regs[R_ESP], 0);
 gen_op_ld_v(s, d_ot, s->T0, s->T0);
 
 return d_ot;
@@ -2066,21 +2069,14 @@ static inline void gen_pop_update(DisasContext *s, MemOp ot)
 gen_stack_update(s, 1 << ot);
 }
 
-static inline void gen_stack_A0(DisasContext *s)
-{
-gen_lea_v_seg(s, mo_stacksize(s), cpu_regs[R_ESP], R_SS, -1);
-}
-
 static void gen_pusha(DisasContext *s)
 {
-MemOp s_ot = mo_stacksize(s);
 MemOp d_ot = s->dflag;
 int size = 1 << d_ot;
 int i;
 
 for (i = 0; i < 8; i++) {
-tcg_gen_addi_tl(s->A0, cpu_regs[R_ESP], (i - 8) * size);
-gen_lea_v_seg(s, s_ot, s->A0, R_SS, -1);
+gen_lea_ss_ofs(s, s->A0, cpu_regs[R_ESP], (i - 8) * size);
 gen_op_st_v(s, d_ot, cpu_regs[7 - i], s->A0);
 }
 
@@ -2089,7 +2085,6 @@ static void gen_pusha(DisasContext *s)
 
 static void gen_popa(DisasContext *s)
 {
-MemOp s_ot = mo_stacksize(s);
 MemOp d_ot = s->dflag;
 int size = 1 << d_ot;
 int i;
@@ -2099,8 +2094,7 @@ static void gen_popa(DisasContext *s)
 if (7 - i == R_ESP) {
 continue;
 }
-tcg_gen_addi_tl(s->A0, cpu_regs[R_ESP], i * size);
-gen_lea_v_seg(s, s_ot, s->A0, R_SS, -1);
+gen_lea_ss_ofs(s, s->A0, cpu_regs[R_ESP], i * size);
 gen_op_ld_v(s, d_ot, s->T0, s->A0);
 gen_op_mov_reg_v(s, d_ot, 7 - i, s->T0);
 }
@@ -2116,7 +2110,7 @@ static void gen_enter(DisasContext *s, int esp_addend, int level)
 
 /* Push BP; compute FrameTemp into T1.  */
 tcg_gen_subi_tl(s->T1, cpu_regs[R_ESP], size);
-gen_lea_v_seg(s, a_ot, s->T1, R_SS, -1);
+gen_lea_ss_ofs(s, s->A0, s->T1, 0);
 gen_op_st_v(s, d_ot, cpu_regs[R_EBP], s->A0);
 
 level &= 31;
@@ -2125,18 +2119,15 @@ static void gen_enter(DisasContext *s, int esp_addend, int level)
 
 /* Copy level-1 pointers from the previous frame.  */
 for (i = 1; i < level; ++i) {
-tcg_gen_subi_tl(s->A0, cpu_regs[R_EBP], size * i);
-gen_lea_v_seg(s, a_ot, s->A0, R_SS, -1);
+gen_lea_ss_ofs(s, s->A0, cpu_regs[R_EBP], -size * i);
 gen_op_ld_v(s, d_ot, s->tmp0, s->A0);
 
-tcg_gen_subi_tl(s->A0, s->T1, size * i);
-gen_lea_v_seg(s, a_ot, s->A0, R_SS, -1);
+gen_lea_ss_ofs(s, s->A0, s->T1, -size * i);
 gen_op_st_v(s, d_ot, s->tmp0, s->A0);
 }
 
 /* Push the current FrameTemp as the last level.  */
-tcg_gen_subi_tl(s->A0, s->T1, size * level);
-gen_lea_v_seg(s, a_ot, s->A0, R_SS, -1);
+gen_lea_ss_ofs(s, s->A0, s->T1, -size * level);
 gen_op_st_v(s, d_ot, s->T1, s->A0);
 }
 
@@ -2153,7 +2144,7 @@ static void gen_leave(DisasContext *s)
 MemOp d_ot = mo_pushpop(s, s->dflag);
 MemOp a_ot = mo_stacksize(s);
 
-gen_lea_v_seg(s, a_ot, cpu_regs[R_EBP], R_SS, -1);
+gen_lea_ss_ofs(s, s->A0, cpu_regs[R_EBP], 0);
 gen_op_ld_v(s, d_ot, s->T0, s->A0);
 
 tcg_gen_addi_tl(s->T1, cpu_regs[R_EBP

[PATCH 02/16] target/i386: cleanup eob handling of RSM

2024-05-24 Thread Paolo Bonzini
gen_helper_rsm cannot generate an exception, and reloads the flags.
So there's no need to spill cc_op and update cpu_eip, but on the
other hand cc_op must be reset to CC_OP_EFLAGS before returning.

It all works by chance, because by spilling cc_op before the call
to the helper, it becomes non-dirty and gen_eob will not overwrite
the CC_OP_EFLAGS value that is placed there by the helper.  But
let's clean it up.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index f44edb3c29c..3c7d8d72144 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -4488,9 +4488,8 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 /* we should not be in SMM mode */
 g_assert_not_reached();
 #else
-gen_update_cc_op(s);
-gen_update_eip_next(s);
 gen_helper_rsm(tcg_env);
+set_cc_op(s, CC_OP_EFLAGS);
 #endif /* CONFIG_USER_ONLY */
 s->base.is_jmp = DISAS_EOB_ONLY;
 break;
-- 
2.45.1




[PATCH 08/16] target/i386: reg in gen_ldst_modrm is always OR_TMP0

2024-05-24 Thread Paolo Bonzini
Values other than OR_TMP0 were only ever used by MOV and MOVNTI
opcodes.  Now that these have been converted to the new decoder,
remove the argument.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 33 -
 1 file changed, 12 insertions(+), 21 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index c46385be060..b75d61a9141 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1821,10 +1821,9 @@ static void gen_add_A0_ds_seg(DisasContext *s)
 gen_lea_v_seg(s, s->aflag, s->A0, R_DS, s->override);
 }
 
-/* generate modrm memory load or store of 'reg'. TMP0 is used if reg ==
-   OR_TMP0 */
+/* generate modrm memory load or store of 'reg'. */
 static void gen_ldst_modrm(CPUX86State *env, DisasContext *s, int modrm,
-   MemOp ot, int reg, int is_store)
+   MemOp ot, int is_store)
 {
 int mod, rm;
 
@@ -1832,24 +1831,16 @@ static void gen_ldst_modrm(CPUX86State *env, DisasContext *s, int modrm,
 rm = (modrm & 7) | REX_B(s);
 if (mod == 3) {
 if (is_store) {
-if (reg != OR_TMP0)
-gen_op_mov_v_reg(s, ot, s->T0, reg);
 gen_op_mov_reg_v(s, ot, rm, s->T0);
 } else {
 gen_op_mov_v_reg(s, ot, s->T0, rm);
-if (reg != OR_TMP0)
-gen_op_mov_reg_v(s, ot, reg, s->T0);
 }
 } else {
 gen_lea_modrm(env, s, modrm);
 if (is_store) {
-if (reg != OR_TMP0)
-gen_op_mov_v_reg(s, ot, s->T0, reg);
 gen_op_st_v(s, ot, s->T0, s->A0);
 } else {
 gen_op_ld_v(s, ot, s->T0, s->A0);
-if (reg != OR_TMP0)
-gen_op_mov_reg_v(s, ot, reg, s->T0);
 }
 }
 }
@@ -3440,7 +3431,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 ot = dflag;
 modrm = x86_ldub_code(env, s);
 reg = ((modrm >> 3) & 7) | REX_R(s);
-gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
+gen_ldst_modrm(env, s, modrm, ot, 0);
 gen_extu(ot, s->T0);
 
 /* Note that lzcnt and tzcnt are in different extensions.  */
@@ -3587,14 +3578,14 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 tcg_gen_ld32u_tl(s->T0, tcg_env,
  offsetof(CPUX86State, ldt.selector));
 ot = mod == 3 ? dflag : MO_16;
-gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 1);
+gen_ldst_modrm(env, s, modrm, ot, 1);
 break;
 case 2: /* lldt */
 if (!PE(s) || VM86(s))
 goto illegal_op;
 if (check_cpl0(s)) {
 gen_svm_check_intercept(s, SVM_EXIT_LDTR_WRITE);
-gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
+gen_ldst_modrm(env, s, modrm, MO_16, 0);
 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
 gen_helper_lldt(tcg_env, s->tmp2_i32);
 }
@@ -3609,14 +3600,14 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 tcg_gen_ld32u_tl(s->T0, tcg_env,
  offsetof(CPUX86State, tr.selector));
 ot = mod == 3 ? dflag : MO_16;
-gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 1);
+gen_ldst_modrm(env, s, modrm, ot, 1);
 break;
 case 3: /* ltr */
 if (!PE(s) || VM86(s))
 goto illegal_op;
 if (check_cpl0(s)) {
 gen_svm_check_intercept(s, SVM_EXIT_TR_WRITE);
-gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
+gen_ldst_modrm(env, s, modrm, MO_16, 0);
 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
 gen_helper_ltr(tcg_env, s->tmp2_i32);
 }
@@ -3625,7 +3616,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 case 5: /* verw */
 if (!PE(s) || VM86(s))
 goto illegal_op;
-gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
+gen_ldst_modrm(env, s, modrm, MO_16, 0);
 gen_update_cc_op(s);
 if (op == 4) {
 gen_helper_verr(tcg_env, s->T0);
@@ -3889,7 +3880,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
  */
 mod = (modrm >> 6) & 3;
 ot = (mod != 3 ? MO_16 : s->dflag);
-gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 1);
+gen_ldst_modrm(env, s, modrm, ot, 1);
 break;
 case 0xee: /* rdpkru */
 if (s->prefix & (PREFIX_LOCK | PREFIX_DATA
@@ -3916,7 +3907,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 break;
 }
 gen_svm_check_intercept(s, SVM_EXIT_WRITE_CR0);
-gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
+gen_ldst_mod

[PATCH 04/16] target/i386: avoid calling gen_eob_syscall before tb_stop

2024-05-24 Thread Paolo Bonzini
syscall and sysret only have one exit, so they do not need to
generate the end-of-translation code inline.  It can be
deferred to tb_stop.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 52d758a224b..24e83c1af84 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -168,6 +168,12 @@ typedef struct DisasContext {
  */
 #define DISAS_JUMP DISAS_TARGET_3
 
+/*
+ * EIP has already been updated.  Use updated value of
+ * EFLAGS.TF to determine singlestep trap (SYSCALL/SYSRET).
+ */
+#define DISAS_EOB_RECHECK_TF   DISAS_TARGET_4
+
 /* The environment in which user-only runs is constrained. */
 #ifdef CONFIG_USER_ONLY
 #define PE(S) true
@@ -3576,7 +3582,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 /* TF handling for the syscall insn is different. The TF bit is  checked
after the syscall insn completes. This allows #DB to not be
generated after one has entered CPL0 if TF is set in FMASK.  */
-gen_eob_syscall(s);
+s->base.is_jmp = DISAS_EOB_RECHECK_TF;
 break;
 case 0x107: /* sysret */
 /* For Intel SYSRET is only valid in long mode */
@@ -3595,7 +3601,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
checked after the sysret insn completes. This allows #DB to be
generated "as if" the syscall insn in userspace has just
completed.  */
-gen_eob_syscall(s);
+s->base.is_jmp = DISAS_EOB_RECHECK_TF;
 }
 break;
 case 0x1a2: /* cpuid */
@@ -4799,6 +4805,9 @@ static void i386_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
 case DISAS_EOB_ONLY:
 gen_eob(dc);
 break;
+case DISAS_EOB_RECHECK_TF:
+gen_eob_syscall(dc);
+break;
 case DISAS_EOB_INHIBIT_IRQ:
 gen_update_eip_cur(dc);
 gen_eob_inhibit_irq(dc);
-- 
2.45.1
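
The deferral pattern can be sketched in isolation. This is a toy model, not the QEMU translator: the ToyDisas and ToyEmit enums, the ToyCtx struct and the counters are hypothetical. It only shows the instruction translator recording an exit kind in is_jmp and a single tb_stop-style switch emitting the matching end-of-block code later.

```c
#include <assert.h>

/* Hypothetical exit kinds, modelled on the DISAS_EOB_* constants. */
typedef enum {
    TOY_EOB_NEXT,
    TOY_EOB_INHIBIT_IRQ,
    TOY_EOB_ONLY,
    TOY_EOB_RECHECK_TF
} ToyDisas;

/* Which end-of-block sequence was emitted in the end. */
typedef enum {
    TOY_EMIT_NONE,
    TOY_EMIT_EOB,
    TOY_EMIT_EOB_SYSCALL,
    TOY_EMIT_EOB_INHIBIT_IRQ
} ToyEmit;

typedef struct ToyCtx {
    ToyDisas is_jmp;  /* exit kind requested by the last instruction */
    int eip_updates;  /* counts simulated EIP updates */
    ToyEmit emitted;  /* what the stop hook generated */
} ToyCtx;

/* The instruction only records how the block has to end. */
static void toy_translate_syscall(ToyCtx *dc)
{
    dc->is_jmp = TOY_EOB_RECHECK_TF;
}

/* One place decides which end-of-block code is generated. */
static void toy_tb_stop(ToyCtx *dc)
{
    assert(dc->emitted == TOY_EMIT_NONE);
    switch (dc->is_jmp) {
    case TOY_EOB_NEXT:
        dc->eip_updates++;
        /* fall through */
    case TOY_EOB_ONLY:
        dc->emitted = TOY_EMIT_EOB;
        break;
    case TOY_EOB_RECHECK_TF:
        dc->emitted = TOY_EMIT_EOB_SYSCALL;
        break;
    case TOY_EOB_INHIBIT_IRQ:
        dc->eip_updates++;
        dc->emitted = TOY_EMIT_EOB_INHIBIT_IRQ;
        break;
    }
}
```

In this model the syscall path reaches the stop hook with EIP already updated, so only the TF-rechecking end of block is emitted and no further EIP update takes place.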




[PATCH 10/16] target/i386: inline gen_add_A0_ds_seg

2024-05-24 Thread Paolo Bonzini
It is only used in MONITOR, where a direct call of gen_lea_v_seg
is simpler, and in XLAT.  Inline it in the latter.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 9 +
 target/i386/tcg/emit.c.inc  | 2 +-
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index d32b5b63f5c..8138da23b3d 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1815,12 +1815,6 @@ static void gen_bndck(CPUX86State *env, DisasContext *s, int modrm,
 gen_helper_bndck(tcg_env, s->tmp2_i32);
 }
 
-/* used for LEA and MOV AX, mem */
-static void gen_add_A0_ds_seg(DisasContext *s)
-{
-gen_lea_v_seg(s, s->aflag, s->A0, R_DS, s->override);
-}
-
 /* generate modrm load of memory or register. */
 static void gen_ld_modrm(CPUX86State *env, DisasContext *s, int modrm, MemOp ot)
 {
@@ -3663,8 +3657,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 }
 gen_update_cc_op(s);
 gen_update_eip_cur(s);
-tcg_gen_mov_tl(s->A0, cpu_regs[R_EAX]);
-gen_add_A0_ds_seg(s);
+gen_lea_v_seg(s, s->aflag, cpu_regs[R_EAX], R_DS, s->override);
 gen_helper_monitor(tcg_env, s->A0);
 break;
 
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 8e311b6d213..f293db01b5c 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -4043,7 +4043,7 @@ static void gen_XLAT(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
 /* AL is already zero-extended into s->T0.  */
 tcg_gen_add_tl(s->A0, cpu_regs[R_EBX], s->T0);
-gen_add_A0_ds_seg(s);
+gen_lea_v_seg(s, s->aflag, s->A0, R_DS, s->override);
 gen_op_ld_v(s, MO_8, s->T0, s->A0);
 }
 
-- 
2.45.1




[PATCH 07/16] target/i386: raze the gen_eob* jungle

2024-05-24 Thread Paolo Bonzini
Make gen_eob take the DISAS_* constant as an argument, so that
it is not necessary to have wrappers around it.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 60 +
 1 file changed, 14 insertions(+), 46 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 2c7917d239f..c46385be060 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -260,8 +260,6 @@ STUB_HELPER(write_crN, TCGv_env env, TCGv_i32 reg, TCGv val)
 STUB_HELPER(wrmsr, TCGv_env env)
 #endif
 
-static void gen_eob(DisasContext *s);
-static void gen_jr(DisasContext *s);
 static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num);
 static void gen_jmp_rel_csize(DisasContext *s, int diff, int tb_num);
 static void gen_exception_gpf(DisasContext *s);
@@ -2259,12 +2257,13 @@ static void gen_bnd_jmp(DisasContext *s)
 }
 }
 
-/* Generate an end of block. Trace exception is also generated if needed.
-   If INHIBIT, set HF_INHIBIT_IRQ_MASK if it isn't already set.
-   If RECHECK_TF, emit a rechecking helper for #DB, ignoring the state of
-   S->TF.  This is used by the syscall/sysret insns.  */
+/*
+ * Generate an end of block, including common tasks such as generating
+ * single step traps, resetting the RF flag, and handling the interrupt
+ * shadow.
+ */
 static void
-gen_eob_worker(DisasContext *s, bool inhibit, bool recheck_tf, bool jr)
+gen_eob(DisasContext *s, int mode)
 {
 bool inhibit_reset;
 
@@ -2275,52 +2274,29 @@ gen_eob_worker(DisasContext *s, bool inhibit, bool 
recheck_tf, bool jr)
 if (s->flags & HF_INHIBIT_IRQ_MASK) {
 gen_reset_hflag(s, HF_INHIBIT_IRQ_MASK);
 inhibit_reset = true;
-} else if (inhibit) {
+} else if (mode == DISAS_EOB_INHIBIT_IRQ) {
 gen_set_hflag(s, HF_INHIBIT_IRQ_MASK);
 }
 
 if (s->base.tb->flags & HF_RF_MASK) {
 gen_reset_eflags(s, RF_MASK);
 }
-if (recheck_tf) {
+if (mode == DISAS_EOB_RECHECK_TF) {
 gen_helper_rechecking_single_step(tcg_env);
 tcg_gen_exit_tb(NULL, 0);
 } else if (s->flags & HF_TF_MASK) {
 gen_helper_single_step(tcg_env);
-} else if (jr &&
+} else if (mode == DISAS_JUMP &&
/* give irqs a chance to happen */
!inhibit_reset) {
 tcg_gen_lookup_and_goto_ptr();
 } else {
 tcg_gen_exit_tb(NULL, 0);
 }
+
 s->base.is_jmp = DISAS_NORETURN;
 }
 
-static inline void
-gen_eob_syscall(DisasContext *s)
-{
-gen_eob_worker(s, false, true, false);
-}
-
-/* End of block.  Set HF_INHIBIT_IRQ_MASK if it isn't already set.  */
-static void gen_eob_inhibit_irq(DisasContext *s)
-{
-gen_eob_worker(s, true, false, false);
-}
-
-/* End of block, resetting the inhibit irq flag.  */
-static void gen_eob(DisasContext *s)
-{
-gen_eob_worker(s, false, false, false);
-}
-
-/* Jump to register */
-static void gen_jr(DisasContext *s)
-{
-gen_eob_worker(s, false, false, true);
-}
-
 /* Jump to eip+diff, truncating the result to OT. */
 static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num)
 {
@@ -2372,9 +2348,9 @@ static void gen_jmp_rel(DisasContext *s, MemOp ot, int 
diff, int tb_num)
 tcg_gen_movi_tl(cpu_eip, new_eip);
 }
 if (s->jmp_opt) {
-gen_jr(s);   /* jump to another page */
+gen_eob(s, DISAS_JUMP);   /* jump to another page */
 } else {
-gen_eob(s);  /* exit to main loop */
+gen_eob(s, DISAS_EOB_ONLY);  /* exit to main loop */
 }
 }
 }
@@ -4787,22 +4763,14 @@ static void i386_tr_tb_stop(DisasContextBase *dcbase, 
CPUState *cpu)
 gen_jmp_rel_csize(dc, 0, 0);
 break;
 case DISAS_EOB_NEXT:
+case DISAS_EOB_INHIBIT_IRQ:
 assert(dc->base.pc_next == dc->pc);
 gen_update_eip_cur(dc);
 /* fall through */
 case DISAS_EOB_ONLY:
-gen_eob(dc);
-break;
 case DISAS_EOB_RECHECK_TF:
-gen_eob_syscall(dc);
-break;
-case DISAS_EOB_INHIBIT_IRQ:
-assert(dc->base.pc_next == dc->pc);
-gen_update_eip_cur(dc);
-gen_eob_inhibit_irq(dc);
-break;
 case DISAS_JUMP:
-gen_jr(dc);
+gen_eob(dc, dc->base.is_jmp);
 break;
 default:
 g_assert_not_reached();
-- 
2.45.1
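
For readers following the diff above, the decision tree that ends up in the single gen_eob() can be modelled outside QEMU. This is only a rough sketch under stated assumptions: the enum values, the ModelState struct and eob_action() are made-up names for illustration, not QEMU definitions, and the hflag/RF side effects are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the DISAS_* constants; values are arbitrary. */
typedef enum {
    MODE_EOB_ONLY,
    MODE_EOB_NEXT,
    MODE_EOB_INHIBIT_IRQ,
    MODE_EOB_RECHECK_TF,
    MODE_JUMP,
} EobMode;

/* Which exit sequence would be emitted (hypothetical names). */
typedef enum {
    EXIT_RECHECK_SINGLE_STEP,  /* rechecking helper followed by exit_tb */
    EXIT_SINGLE_STEP,          /* single_step helper */
    EXIT_LOOKUP_AND_GOTO,      /* lookup_and_goto_ptr */
    EXIT_TO_MAIN_LOOP,         /* plain exit_tb */
} ExitKind;

typedef struct {
    bool inhibit_irq_set;  /* models s->flags & HF_INHIBIT_IRQ_MASK */
    bool tf_set;           /* models s->flags & HF_TF_MASK */
} ModelState;

/* Mirrors the if/else chain of the patched gen_eob(), minus side effects. */
static ExitKind eob_action(const ModelState *s, EobMode mode)
{
    /* The interrupt shadow is being reset, so IRQs must get a chance. */
    bool inhibit_reset = s->inhibit_irq_set;

    if (mode == MODE_EOB_RECHECK_TF) {
        return EXIT_RECHECK_SINGLE_STEP;
    } else if (s->tf_set) {
        return EXIT_SINGLE_STEP;
    } else if (mode == MODE_JUMP && !inhibit_reset) {
        return EXIT_LOOKUP_AND_GOTO;
    }
    return EXIT_TO_MAIN_LOOP;
}
```

The point of the model is that the three bool parameters of gen_eob_worker collapse into comparisons against one mode value, so every caller passes a constant it already has.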




[PATCH 16/16] target/i386: set CC_OP in helpers if they want CC_OP_EFLAGS

2024-05-24 Thread Paolo Bonzini
Mark cc_op as clean and do not spill it at the end of the translation block.
Technically this is a tiny bit less efficient, but:

* it results in translations that are a tiny bit smaller

* for most of these instructions, it is not unlikely that they are close to
the end of the basic block, in which case the spilling of cc_op would be
there anyway

* even in other cases, the cost is probably dwarfed by that of computing flags.
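
As a rough illustration of the idea (not the QEMU code itself), the sketch below models a helper that stores the final flags and also records that they are already in EFLAGS format, so the caller does not have to update cc_op separately. Env, CcOp, compute_all() and helper_test_like() are simplified stand-ins invented for this example; only the CC_C and CC_Z bit positions match the real x86 flag bits.

```c
#include <assert.h>
#include <stdint.h>

#define CC_C 0x0001  /* carry flag, EFLAGS bit 0 */
#define CC_Z 0x0040  /* zero flag, EFLAGS bit 6 */

/* Illustrative subset of lazy-flag formats. */
typedef enum { CC_OP_EFLAGS, CC_OP_LOGICB } CcOp;

typedef struct {
    CcOp cc_op;       /* how cc_src/cc_dst must be interpreted */
    uint32_t cc_src;
    uint32_t cc_dst;
} Env;

/* Materialize the flags from the lazily stored state. */
static uint32_t compute_all(const Env *env)
{
    switch (env->cc_op) {
    case CC_OP_EFLAGS:
        return env->cc_src;                       /* flags stored directly */
    case CC_OP_LOGICB:
        return (uint8_t)env->cc_dst == 0 ? CC_Z : 0;
    }
    assert(0);
    return 0;
}

/* PTEST-like helper: computes ZF/CF and marks the format itself. */
static void helper_test_like(Env *env, uint64_t d, uint64_t s)
{
    uint64_t zf = s & d;
    uint64_t cf = s & ~d;

    env->cc_src = (zf ? 0 : CC_Z) | (cf ? 0 : CC_C);
    env->cc_op = CC_OP_EFLAGS;  /* the extra store this patch adds */
}
```

Because the helper leaves cc_op consistent with cc_src, whatever format was live before the call no longer matters to later flag consumers.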

Signed-off-by: Paolo Bonzini 
---
 target/i386/ops_sse.h|  8 
 target/i386/tcg/fpu_helper.c |  2 ++
 target/i386/tcg/int_helper.c | 13 +
 target/i386/tcg/seg_helper.c | 16 
 target/i386/tcg/translate.c  | 12 ++--
 target/i386/tcg/emit.c.inc   | 22 +++---
 6 files changed, 44 insertions(+), 29 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 6a465a35fdb..f0aa1894aa2 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -,6 +,7 @@ void helper_ucomiss(CPUX86State *env, Reg *d, Reg *s)
 s1 = s->ZMM_S(0);
 ret = float32_compare_quiet(s0, s1, &env->sse_status);
 CC_SRC = comis_eflags[ret + 1];
+CC_OP = CC_OP_EFLAGS;
 }
 
 void helper_comiss(CPUX86State *env, Reg *d, Reg *s)
@@ -1122,6 +1123,7 @@ void helper_comiss(CPUX86State *env, Reg *d, Reg *s)
 s1 = s->ZMM_S(0);
 ret = float32_compare(s0, s1, &env->sse_status);
 CC_SRC = comis_eflags[ret + 1];
+CC_OP = CC_OP_EFLAGS;
 }
 
 void helper_ucomisd(CPUX86State *env, Reg *d, Reg *s)
@@ -1133,6 +1135,7 @@ void helper_ucomisd(CPUX86State *env, Reg *d, Reg *s)
 d1 = s->ZMM_D(0);
 ret = float64_compare_quiet(d0, d1, &env->sse_status);
 CC_SRC = comis_eflags[ret + 1];
+CC_OP = CC_OP_EFLAGS;
 }
 
 void helper_comisd(CPUX86State *env, Reg *d, Reg *s)
@@ -1144,6 +1147,7 @@ void helper_comisd(CPUX86State *env, Reg *d, Reg *s)
 d1 = s->ZMM_D(0);
 ret = float64_compare(d0, d1, &env->sse_status);
 CC_SRC = comis_eflags[ret + 1];
+CC_OP = CC_OP_EFLAGS;
 }
 #endif
 
@@ -1610,6 +1614,7 @@ void glue(helper_ptest, SUFFIX)(CPUX86State *env, Reg *d, 
Reg *s)
 cf |= (s->Q(i) & ~d->Q(i));
 }
 CC_SRC = (zf ? 0 : CC_Z) | (cf ? 0 : CC_C);
+CC_OP = CC_OP_EFLAGS;
 }
 
 #define FMOVSLDUP(i) s->L((i) & ~1)
@@ -1966,6 +1971,7 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg 
*d, Reg *s,
 validd--;
 
 CC_SRC = (valids < upper ? CC_Z : 0) | (validd < upper ? CC_S : 0);
+CC_OP = CC_OP_EFLAGS;
 
 switch ((ctrl >> 2) & 3) {
 case 0:
@@ -2297,6 +2303,7 @@ void glue(helper_vtestps, SUFFIX)(CPUX86State *env, Reg 
*d, Reg *s)
 cf |= (s->L(i) & ~d->L(i));
 }
 CC_SRC = ((zf >> 31) ? 0 : CC_Z) | ((cf >> 31) ? 0 : CC_C);
+CC_OP = CC_OP_EFLAGS;
 }
 
 void glue(helper_vtestpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
@@ -2309,6 +2316,7 @@ void glue(helper_vtestpd, SUFFIX)(CPUX86State *env, Reg 
*d, Reg *s)
 cf |= (s->Q(i) & ~d->Q(i));
 }
 CC_SRC = ((zf >> 63) ? 0 : CC_Z) | ((cf >> 63) ? 0 : CC_C);
+CC_OP = CC_OP_EFLAGS;
 }
 
 void glue(helper_vpmaskmovd_st, SUFFIX)(CPUX86State *env,
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
index ece22a3553f..8df8cae6310 100644
--- a/target/i386/tcg/fpu_helper.c
+++ b/target/i386/tcg/fpu_helper.c
@@ -487,6 +487,7 @@ void helper_fcomi_ST0_FT0(CPUX86State *env)
 ret = floatx80_compare(ST0, FT0, &env->fp_status);
 eflags = cpu_cc_compute_all(env) & ~(CC_Z | CC_P | CC_C);
 CC_SRC = eflags | fcomi_ccval[ret + 1];
+CC_OP = CC_OP_EFLAGS;
 merge_exception_flags(env, old_flags);
 }
 
@@ -499,6 +500,7 @@ void helper_fucomi_ST0_FT0(CPUX86State *env)
 ret = floatx80_compare_quiet(ST0, FT0, &env->fp_status);
 eflags = cpu_cc_compute_all(env) & ~(CC_Z | CC_P | CC_C);
 CC_SRC = eflags | fcomi_ccval[ret + 1];
+CC_OP = CC_OP_EFLAGS;
 merge_exception_flags(env, old_flags);
 }
 
diff --git a/target/i386/tcg/int_helper.c b/target/i386/tcg/int_helper.c
index 4cc59f15203..e1f92405282 100644
--- a/target/i386/tcg/int_helper.c
+++ b/target/i386/tcg/int_helper.c
@@ -187,6 +187,7 @@ void helper_aaa(CPUX86State *env)
 }
 env->regs[R_EAX] = (env->regs[R_EAX] & ~0xffff) | al | (ah << 8);
 CC_SRC = eflags;
+CC_OP = CC_OP_EFLAGS;
 }
 
 void helper_aas(CPUX86State *env)
@@ -211,6 +212,7 @@ void helper_aas(CPUX86State *env)
 }
 env->regs[R_EAX] = (env->regs[R_EAX] & ~0xffff) | al | (ah << 8);
 CC_SRC = eflags;
+CC_OP = CC_OP_EFLAGS;
 }
 
 void helper_daa(CPUX86State *env)
@@ -238,6 +240,7 @@ void helper_daa(CPUX86State *env)
 eflags |= parity_table[al]; /* pf */
 eflags |= (al & 0x80); /* sf */
 CC_SRC = eflags;
+CC_OP = CC_OP_EFLAGS;
 }
 
 void helper_das(CPUX86State *env)
@@ -269,6 +272,7 @@ void helper_das(CPUX86State *env)
 eflags |= parity_table[al]; /* pf */
 eflags |= (al & 0x80); /* sf */
 CC_SRC = eflags;
+CC_OP = CC_OP_EFLAGS;
 }
 
 #ifdef TARGET_X86_64
@@ -449,10 +453,11 @@ 

[PATCH 13/16] target/i386: clean up repeated string operations

2024-05-24 Thread Paolo Bonzini
Do not bother generating inline wrappers for gen_repz and gen_repz2;
use s->prefix to separate REPZ from REPNZ in the case of SCAS and
CMPS.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 22 --
 target/i386/tcg/emit.c.inc  | 22 +-
 2 files changed, 13 insertions(+), 31 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 8354209b037..18d8c0de674 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1320,14 +1320,12 @@ static void gen_repz(DisasContext *s, MemOp ot,
 gen_jmp_rel_csize(s, -cur_insn_len(s), 0);
 }
 
-#define GEN_REPZ(op) \
-static inline void gen_repz_ ## op(DisasContext *s, MemOp ot) \
-{ gen_repz(s, ot, gen_##op); }
-
-static void gen_repz2(DisasContext *s, MemOp ot, int nz,
-  void (*fn)(DisasContext *s, MemOp ot))
+static void gen_repz_nz(DisasContext *s, MemOp ot,
+void (*fn)(DisasContext *s, MemOp ot))
 {
 TCGLabel *l2;
+int nz = (s->prefix & PREFIX_REPNZ) ? 1 : 0;
+
 l2 = gen_jz_ecx_string(s);
 fn(s, ot);
 gen_op_add_reg_im(s, s->aflag, R_ECX, -1);
@@ -1343,18 +1341,6 @@ static void gen_repz2(DisasContext *s, MemOp ot, int nz,
 gen_jmp_rel_csize(s, -cur_insn_len(s), 0);
 }
 
-#define GEN_REPZ2(op) \
-static inline void gen_repz_ ## op(DisasContext *s, MemOp ot, int nz) \
-{ gen_repz2(s, ot, nz, gen_##op); }
-
-GEN_REPZ(movs)
-GEN_REPZ(stos)
-GEN_REPZ(lods)
-GEN_REPZ(ins)
-GEN_REPZ(outs)
-GEN_REPZ2(scas)
-GEN_REPZ2(cmps)
-
 static void gen_helper_fp_arith_ST0_FT0(int op)
 {
 switch (op) {
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 83fa745fd8a..bc96735f61d 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1508,10 +1508,8 @@ static void gen_CMPccXADD(DisasContext *s, CPUX86State 
*env, X86DecodedInsn *dec
 static void gen_CMPS(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
 MemOp ot = decode->op[2].ot;
-if (s->prefix & PREFIX_REPNZ) {
-gen_repz_cmps(s, ot, 1);
-} else if (s->prefix & PREFIX_REPZ) {
-gen_repz_cmps(s, ot, 0);
+if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
+gen_repz_nz(s, ot, gen_cmps);
 } else {
 gen_cmps(s, ot);
 }
@@ -1834,7 +1832,7 @@ static void gen_INS(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)
 
 translator_io_start(&s->base);
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_ins(s, ot);
+gen_repz(s, ot, gen_ins);
 } else {
 gen_ins(s, ot);
 }
@@ -1993,7 +1991,7 @@ static void gen_LODS(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)
 {
 MemOp ot = decode->op[2].ot;
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_lods(s, ot);
+gen_repz(s, ot, gen_lods);
 } else {
 gen_lods(s, ot);
 }
@@ -2155,7 +2153,7 @@ static void gen_MOVS(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)
 {
 MemOp ot = decode->op[2].ot;
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_movs(s, ot);
+gen_repz(s, ot, gen_movs);
 } else {
 gen_movs(s, ot);
 }
@@ -2321,7 +2319,7 @@ static void gen_OUTS(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)
 
 translator_io_start(&s->base);
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_outs(s, ot);
+gen_repz(s, ot, gen_outs);
 } else {
 gen_outs(s, ot);
 }
@@ -3329,10 +3327,8 @@ static void gen_SBB(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)
 static void gen_SCAS(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
 MemOp ot = decode->op[2].ot;
-if (s->prefix & PREFIX_REPNZ) {
-gen_repz_scas(s, ot, 1);
-} else if (s->prefix & PREFIX_REPZ) {
-gen_repz_scas(s, ot, 0);
+if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
+gen_repz_nz(s, ot, gen_scas);
 } else {
 gen_scas(s, ot);
 }
@@ -3495,7 +3491,7 @@ static void gen_STOS(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)
 {
 MemOp ot = decode->op[1].ot;
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_stos(s, ot);
+gen_repz(s, ot, gen_stos);
 } else {
 gen_stos(s, ot);
 }
-- 
2.45.1
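
The shape of the change above (one generic wrapper taking the string op as a callback, with the REPZ/REPNZ distinction read from the prefix rather than passed as a parameter) can be sketched in plain C. This is an interpreter-style model, not translator code; the Model struct, op_scas() and the PREFIX_* values are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PREFIX_REPZ  0x01  /* illustrative flag values, not QEMU's */
#define PREFIX_REPNZ 0x02

typedef struct {
    unsigned prefix;      /* models s->prefix */
    const uint8_t *data;  /* string being scanned */
    size_t pos;           /* models EDI */
    size_t count;         /* models ECX */
    uint8_t al;           /* value compared against */
    bool zf;              /* zero flag after the last compare */
} Model;

typedef void (*StringOp)(Model *m);

/* One SCASB step: compare AL with the current byte, then advance. */
static void op_scas(Model *m)
{
    m->zf = (m->data[m->pos] == m->al);
    m->pos++;
}

/* Generic repeat wrapper: nz is derived from the prefix, as in the patch. */
static void rep_z_nz(Model *m, StringOp fn)
{
    int nz = (m->prefix & PREFIX_REPNZ) ? 1 : 0;

    while (m->count != 0) {
        fn(m);
        m->count--;
        /* REPNZ stops once ZF is set, REPZ stops once ZF is clear. */
        if (m->zf == (nz != 0)) {
            break;
        }
    }
}
```

With the prefix consulted inside the wrapper, callers such as gen_SCAS and gen_CMPS only need one branch for "has a REP prefix" instead of two.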




RE: [PATCH] intel_iommu: Use the latest fault reasons defined by spec

2024-05-24 Thread Duan, Zhenzhong


>-Original Message-
>From: Jason Wang 
>Subject: Re: [PATCH] intel_iommu: Use the latest fault reasons defined by
>spec
>
>On Tue, May 21, 2024 at 6:25 PM Duan, Zhenzhong
> wrote:
>>
>>
>>
>> >-Original Message-
>> >From: Jason Wang 
>> >Subject: Re: [PATCH] intel_iommu: Use the latest fault reasons defined by
>> >spec
>> >
>> >On Mon, May 20, 2024 at 12:15 PM Liu, Yi L  wrote:
>> >>
>> >> > From: Duan, Zhenzhong 
>> >> > Sent: Monday, May 20, 2024 11:41 AM
>> >> >
>> >> >
>> >> >
>> >> > >-Original Message-
>> >> > >From: Jason Wang 
>> >> > >Sent: Monday, May 20, 2024 8:44 AM
>> >> > >To: Duan, Zhenzhong 
>> >> > >Cc: qemu-devel@nongnu.org; Liu, Yi L ; Peng,
>Chao
>> >P
>> >> > >; Yu Zhang ;
>> >Michael
>> >> > >S. Tsirkin ; Paolo Bonzini
>;
>> >> > >Richard Henderson ; Eduardo
>Habkost
>> >> > >; Marcel Apfelbaum
>> >
>> >> > >Subject: Re: [PATCH] intel_iommu: Use the latest fault reasons
>defined
>> >by
>> >> > >spec
>> >> > >
>> >> > >On Fri, May 17, 2024 at 6:26 PM Zhenzhong Duan
>> >> > > wrote:
>> >> > >>
>> >> > >> From: Yu Zhang 
>> >> > >>
>> >> > >> Currently we use only VTD_FR_PASID_TABLE_INV as fault reason.
>> >> > >> Update with more detailed fault reasons listed in VT-d spec 7.2.3.
>> >> > >>
>> >> > >> Signed-off-by: Yu Zhang 
>> >> > >> Signed-off-by: Zhenzhong Duan 
>> >> > >> ---
>> >> > >
>> >> > >I wonder if this could be noticed by the guest or not. If yes should
>> >> > >we consider starting to add thing like version to vtd emulation code?
>> >> >
>> >> > Kernel only dumps the reason like below:
>> >> >
>> >> > DMAR: [DMA Write NO_PASID] Request device [20:00.0] fault addr
>> >0x123460
>> >> > [fault reason 0x71] SM: Present bit in first-level paging entry is clear
>> >>
>> >> Yes, guest kernel would notice it as the fault would be injected to vm.
>> >>
>> >> > Maybe bump 1.0 -> 1.1?
>> >> > My understanding version number is only informational and is far
>from
>> >> > accurate to mark if a feature supported. Driver should check cap/ecap
>> >> > bits instead.
>> >>
>> >> Should the version ID here be aligned with VT-d spec?
>> >
>> >Probably, this might be something that could be noticed by the
>> >management to migration compatibility.
>>
>> Could you elaborate what we need to do for migration compatibility?
>> I see version is already exported so libvirt can query it, see:
>>
>> DEFINE_PROP_UINT32("version", IntelIOMMUState, version, 0),
>
>It is the Qemu command line parameters not the version of the vmstate.
>
>For example -device intel-iommu,version=3.0
>
>Qemu then knows it should behave as 3.0.

So you want to bump vtd_vmstate.version?

In fact, this series changes the intel_iommu property from
x-scalable-mode=["on"|"off"]
to x-scalable-mode=["legacy"|"modern"|"off"].

My understanding is that the management app should use the same QEMU command
line on source and destination, so compatibility is already guaranteed even if
we don't bump vtd_vmstate.version.

Thanks
Zhenzhong


Re: [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support

2024-05-24 Thread Daniel Henrique Barboza




On 5/21/24 07:56, Björn Töpel wrote:

From: Björn Töpel 

Virtio-based memory devices (virtio-mem/virtio-pmem) allow for
dynamic resizing of virtual machine memory, and require proper
hotplugging (add/remove) support to work.

Add device memory support for RISC-V "virt" machine, and enable
virtio-md-pci with the corresponding missing hotplugging callbacks.

Signed-off-by: Björn Töpel 
---
  hw/riscv/Kconfig   |  2 +
  hw/riscv/virt.c| 83 +-
  hw/virtio/virtio-mem.c |  5 ++-
  3 files changed, 87 insertions(+), 3 deletions(-)

diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
index a2030e3a6ff0..08f82dbb681a 100644
--- a/hw/riscv/Kconfig
+++ b/hw/riscv/Kconfig
@@ -56,6 +56,8 @@ config RISCV_VIRT
  select PLATFORM_BUS
  select ACPI
  select ACPI_PCI
+select VIRTIO_MEM_SUPPORTED
+select VIRTIO_PMEM_SUPPORTED
  
  config SHAKTI_C

  bool
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 4fdb66052587..443902f919d2 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -53,6 +53,8 @@
  #include "hw/pci-host/gpex.h"
  #include "hw/display/ramfb.h"
  #include "hw/acpi/aml-build.h"
+#include "hw/mem/memory-device.h"
+#include "hw/virtio/virtio-mem-pci.h"
  #include "qapi/qapi-visit-common.h"
  #include "hw/virtio/virtio-iommu.h"
  
@@ -1407,6 +1409,7 @@ static void virt_machine_init(MachineState *machine)

  DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip;
  int i, base_hartid, hart_count;
  int socket_count = riscv_socket_count(machine);
+hwaddr device_memory_base, device_memory_size;
  
  /* Check socket count limit */

  if (VIRT_SOCKETS_MAX < socket_count) {
@@ -1420,6 +1423,12 @@ static void virt_machine_init(MachineState *machine)
  exit(1);
  }
  
+if (machine->ram_slots > ACPI_MAX_RAM_SLOTS) {

+error_report("unsupported amount of memory slots: %"PRIu64,
+ machine->ram_slots);


Let's also add the maximum amount allowed in this message, e.g. this error:

$ (...) -m 2G,slots=512,maxmem=8G
qemu-system-riscv64: unsupported amount of memory slots: 512

could be something like:

qemu-system-riscv64: unsupported amount of memory slots (512), maximum amount: 256
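
Something along these lines would produce that message. This is just a standalone sketch: format_slots_error() is a hypothetical helper used here so the string can be checked without QEMU, and max_slots stands in for ACPI_MAX_RAM_SLOTS (assumed to be 256, as in the example output above). In the patch itself the same format string would simply go into error_report().

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes the suggested message into buf.
 * max_slots is a stand-in for ACPI_MAX_RAM_SLOTS.
 */
static int format_slots_error(char *buf, size_t len,
                              uint64_t slots, unsigned max_slots)
{
    return snprintf(buf, len,
                    "unsupported amount of memory slots (%" PRIu64 "), "
                    "maximum amount: %u", slots, max_slots);
}
```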


LGTM otherwise. Thanks,


Daniel




+exit(EXIT_FAILURE);
+}
+
  /* Initialize sockets */
  mmio_irqchip = virtio_irqchip = pcie_irqchip = NULL;
  for (i = 0; i < socket_count; i++) {
@@ -1553,6 +1562,37 @@ static void virt_machine_init(MachineState *machine)
  memory_region_add_subregion(system_memory, memmap[VIRT_MROM].base,
  mask_rom);
  
+/* device memory */

+device_memory_base = ROUND_UP(s->memmap[VIRT_DRAM].base + 
machine->ram_size,
+  GiB);
+device_memory_size = machine->maxram_size - machine->ram_size;
+if (device_memory_size > 0) {
+/*
+ * Each DIMM is aligned based on the backend's alignment value.
+ * Assume max 1G hugepage alignment per slot.
+ */
+device_memory_size += machine->ram_slots * GiB;
+
+if (riscv_is_32bit(&s->soc[0])) {
+hwaddr memtop = device_memory_base + ROUND_UP(device_memory_size,
+  GiB);
+
+if (memtop > UINT32_MAX) {
+error_report("memory exceeds 32-bit limit by %lu bytes",
+ memtop - UINT32_MAX);
+exit(EXIT_FAILURE);
+}
+}
+
+if (device_memory_base + device_memory_size < device_memory_size) {
+error_report("unsupported amount of device memory");
+exit(EXIT_FAILURE);
+}
+
+machine_memory_devices_init(machine, device_memory_base,
+device_memory_size);
+}
+
  /*
   * Init fw_cfg. Must be done before riscv_load_fdt, otherwise the
   * device tree cannot be altered and we get FDT_ERR_NOSPACE.
@@ -1712,12 +1752,21 @@ static HotplugHandler 
*virt_machine_get_hotplug_handler(MachineState *machine,
  MachineClass *mc = MACHINE_GET_CLASS(machine);
  
  if (device_is_dynamic_sysbus(mc, dev) ||

-object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
+object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI) ||
+object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
  return HOTPLUG_HANDLER(machine);
  }
  return NULL;
  }
  
+static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,

+DeviceState *dev, Error **errp)
+{
+if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MD_PCI)) {
+virtio_md_pci_pre_plug(VIRTIO_MD_PCI(dev), MACHINE(hotplug_dev), errp);
+}
+}
+
  static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
  DeviceState *dev, Error **errp)
  {
@@ -1735,6 +1784,35 @@ static void 
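
The device memory placement done in virt_machine_init() above boils down to a few lines of arithmetic. Here is a minimal standalone sketch of it, assuming a hypothetical device_memory_layout() helper and DeviceMemRegion struct (neither exists in QEMU), and leaving out the 32-bit limit check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GiB (UINT64_C(1) << 30)
#define ROUND_UP(n, d) ((((n) + (d) - 1) / (d)) * (d))

typedef struct {
    uint64_t base;
    uint64_t size;
} DeviceMemRegion;

/*
 * Returns false when there is no device memory to set up (maxram == ram)
 * or when base + size would wrap around the 64-bit address space.
 */
static bool device_memory_layout(uint64_t dram_base, uint64_t ram_size,
                                 uint64_t maxram_size, uint64_t ram_slots,
                                 DeviceMemRegion *out)
{
    /* Device memory starts at the first GiB boundary above boot RAM. */
    uint64_t base = ROUND_UP(dram_base + ram_size, GiB);
    uint64_t size = maxram_size - ram_size;

    if (size == 0) {
        return false;
    }
    /* Reserve up to 1 GiB of alignment slack per slot, as in the patch. */
    size += ram_slots * GiB;

    if (base + size < size) {
        return false;  /* address wrapped */
    }
    out->base = base;
    out->size = size;
    return true;
}
```

For example, with DRAM at 0x80000000 (an assumed base), "-m 2G,slots=2,maxmem=8G" gives a region starting at 4 GiB with a size of 8 GiB (6 GiB hotpluggable plus 2 GiB of alignment slack).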

Re: [PATCH v2 3/3] hw/riscv/virt: Add ACPI GED and PC-DIMM MHP support

2024-05-24 Thread Daniel Henrique Barboza




On 5/21/24 07:56, Björn Töpel wrote:

From: Björn Töpel 

Add ACPI GED for the RISC-V "virt" machine, and wire up PC-DIMM memory
hotplugging support. Heavily based/copied from hw/arm/virt.c.

Signed-off-by: Björn Töpel 
---
  hw/riscv/Kconfig   |   3 ++
  hw/riscv/virt-acpi-build.c |  16 ++
  hw/riscv/virt.c| 104 -
  include/hw/riscv/virt.h|   6 ++-
  4 files changed, 126 insertions(+), 3 deletions(-)

diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
index 08f82dbb681a..bebe412f2107 100644
--- a/hw/riscv/Kconfig
+++ b/hw/riscv/Kconfig
@@ -56,6 +56,9 @@ config RISCV_VIRT
  select PLATFORM_BUS
  select ACPI
  select ACPI_PCI
+select MEM_DEVICE
+select DIMM
+select ACPI_HW_REDUCED
  select VIRTIO_MEM_SUPPORTED
  select VIRTIO_PMEM_SUPPORTED
  
diff --git a/hw/riscv/virt-acpi-build.c b/hw/riscv/virt-acpi-build.c

index 6dc3baa9ec86..61813abdef3f 100644
--- a/hw/riscv/virt-acpi-build.c
+++ b/hw/riscv/virt-acpi-build.c
@@ -27,6 +27,8 @@
  #include "hw/acpi/acpi-defs.h"
  #include "hw/acpi/acpi.h"
  #include "hw/acpi/aml-build.h"
+#include "hw/acpi/memory_hotplug.h"
+#include "hw/acpi/generic_event_device.h"
  #include "hw/acpi/pci.h"
  #include "hw/acpi/utils.h"
  #include "hw/intc/riscv_aclint.h"
@@ -432,6 +434,20 @@ static void build_dsdt(GArray *table_data,
  acpi_dsdt_add_gpex_host(scope, PCIE_IRQ + VIRT_IRQCHIP_NUM_SOURCES * 
2);
  }
  
+if (s->acpi_dev) {

+uint32_t event = object_property_get_uint(OBJECT(s->acpi_dev),
+  "ged-event", &error_abort);
+
+build_ged_aml(scope, "\\_SB."GED_DEVICE, HOTPLUG_HANDLER(s->acpi_dev),
+  GED_IRQ, AML_SYSTEM_MEMORY, memmap[VIRT_ACPI_GED].base);
+
+if (event & ACPI_GED_MEM_HOTPLUG_EVT) {
+build_memory_hotplug_aml(scope, ms->ram_slots, "\\_SB", NULL,
+ AML_SYSTEM_MEMORY,
+ memmap[VIRT_PCDIMM_ACPI].base);
+}
+}
+
  aml_append(dsdt, scope);
  
  /* copy AML table into ACPI tables blob and patch header there */

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 443902f919d2..2e35890187f2 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -53,10 +53,13 @@
  #include "hw/pci-host/gpex.h"
  #include "hw/display/ramfb.h"
  #include "hw/acpi/aml-build.h"
+#include "hw/acpi/generic_event_device.h"
+#include "hw/acpi/memory_hotplug.h"
  #include "hw/mem/memory-device.h"
  #include "hw/virtio/virtio-mem-pci.h"
  #include "qapi/qapi-visit-common.h"
  #include "hw/virtio/virtio-iommu.h"
+#include "hw/mem/pc-dimm.h"
  
  /* KVM AIA only supports APLIC MSI. APLIC Wired is always emulated by QEMU. */

  static bool virt_use_kvm_aia(RISCVVirtState *s)
@@ -84,6 +87,8 @@ static const MemMapEntry virt_memmap[] = {
  [VIRT_UART0] ={ 0x1000, 0x100 },
  [VIRT_VIRTIO] =   { 0x10001000,0x1000 },
  [VIRT_FW_CFG] =   { 0x1010,  0x18 },
+[VIRT_PCDIMM_ACPI] =  { 0x1020, MEMORY_HOTPLUG_IO_LEN },
+[VIRT_ACPI_GED] = { 0x1021, ACPI_GED_EVT_SEL_LEN },
  [VIRT_FLASH] ={ 0x2000, 0x400 },
  [VIRT_IMSIC_M] =  { 0x2400, VIRT_IMSIC_MAX_SIZE },
  [VIRT_IMSIC_S] =  { 0x2800, VIRT_IMSIC_MAX_SIZE },
@@ -1400,6 +1405,28 @@ static void virt_machine_done(Notifier *notifier, void 
*data)
  }
  }
  
+static DeviceState *create_acpi_ged(RISCVVirtState *s)

+{
+DeviceState *dev;
+MachineState *ms = MACHINE(s);
+uint32_t event = 0;
+
+if (ms->ram_slots) {
+event |= ACPI_GED_MEM_HOTPLUG_EVT;
+}
+
+dev = qdev_new(TYPE_ACPI_GED);
+qdev_prop_set_uint32(dev, "ged-event", event);
+sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+
+sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, s->memmap[VIRT_ACPI_GED].base);
+sysbus_mmio_map(SYS_BUS_DEVICE(dev), 1, s->memmap[VIRT_PCDIMM_ACPI].base);
+sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, qdev_get_gpio_in(s->irqchip[0],
+GED_IRQ));
+
+return dev;
+}
+
  static void virt_machine_init(MachineState *machine)
  {
  const MemMapEntry *memmap = virt_memmap;
@@ -1612,6 +1639,10 @@ static void virt_machine_init(MachineState *machine)
  
  gpex_pcie_init(system_memory, pcie_irqchip, s);
  
+if (virt_is_acpi_enabled(s)) {

+s->acpi_dev = create_acpi_ged(s);
+}
+
  create_platform_bus(s, mmio_irqchip);
  
  serial_mm_init(system_memory, memmap[VIRT_UART0].base,

@@ -1752,6 +1783,7 @@ static HotplugHandler 
*virt_machine_get_hotplug_handler(MachineState *machine,
  MachineClass *mc = MACHINE_GET_CLASS(machine);
  
  if (device_is_dynamic_sysbus(mc, dev) ||

+object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
  object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI) ||
  object

Re: [PATCH RFC 0/2] meson: Pass objects to declare_dependency()

2024-05-24 Thread Paolo Bonzini
On Fri, May 24, 2024 at 10:00 AM Akihiko Odaki  wrote:
>
> Based-on: <20240524-xkb-v4-0-2de564e5c...@daynix.com>
> ("[PATCH v4 0/4] Fix sanitizer errors with clang 18.1.1")
>
> These are the changes suggested by Paolo Bonzini at:
> https://lore.kernel.org/all/CABgObfYoEFZsW-H4WJ7xW0B85OqFi932d3-DmNAb6zTohFn=o...@mail.gmail.com/
>
> Unfortunately it broke builds on my system. Below are the errors I
> observed:
>
> ld.lld: error: undefined symbol: pam_start
> >>> referenced by pamacct.c:40 
> >>> (/home/me/q/var/qemu/build/../authz/pamacct.c:40)
> >>>   qemu-img.lto.o:(qauthz_pam_is_allowed.cfi)

Thanks Akihiko for putting together the RFC! This is simply because
dependencies need to be added to the declare_dependency(). I'll post
the full series once I finish testing it.

Paolo




RE: [PATCH v3] mem/cxl_type3: support 3, 6, 12 and 16 interleave ways

2024-05-24 Thread Xingtao Yao (Fujitsu)
ping.

> -Original Message-
> From: Yao Xingtao 
> Sent: Wednesday, May 8, 2024 8:53 AM
> To: jonathan.came...@huawei.com; fan...@samsung.com
> Cc: qemu-devel@nongnu.org; Yao, Xingtao/姚 幸涛 
> Subject: [PATCH v3] mem/cxl_type3: support 3, 6, 12 and 16 interleave ways
> 
> Since the kernel does not check the interleave capability, a
> 3-way, 6-way, 12-way or 16-way region can be created normally.
> 
> Applications can access the memory of a 16-way region normally because
> QEMU converts HPA to DPA correctly for power-of-2 interleave
> ways. Once the kernel implements the check, this kind of region will
> no longer be created.
> 
> For non-power-of-2 interleave ways, applications cannot access the
> memory normally and may hit unexpected behavior, such as a
> segmentation fault.
> 
> So implementing this feature is needed.
> 
> Link:
> https://lore.kernel.org/linux-cxl/3e84b919-7631-d1db-3e1d-33000f3f3868@fujitsu.com/
> Signed-off-by: Yao Xingtao 
> ---
>  hw/cxl/cxl-component-utils.c |  9 +++--
>  hw/mem/cxl_type3.c   | 15 +++
>  2 files changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
> index cd116c0401..473895948b 100644
> --- a/hw/cxl/cxl-component-utils.c
> +++ b/hw/cxl/cxl-component-utils.c
> @@ -243,8 +243,13 @@ static void hdm_init_common(uint32_t *reg_state,
> uint32_t *write_msk,
>  ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY,
> INTERLEAVE_4K, 1);
>  ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY,
>   POISON_ON_ERR_CAP, 0);
> -ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY,
> 3_6_12_WAY, 0);
> -ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, 16_WAY,
> 0);
> +if (type == CXL2_TYPE3_DEVICE) {
> +ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY,
> 3_6_12_WAY, 1);
> +ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY,
> 16_WAY, 1);
> +} else {
> +ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY,
> 3_6_12_WAY, 0);
> +ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY,
> 16_WAY, 0);
> +}
>  ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, UIO, 0);
>  ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY,
>   UIO_DECODER_COUNT, 0);
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index 3e42490b6c..b755318838 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -804,10 +804,17 @@ static bool cxl_type3_dpa(CXLType3Dev *ct3d, hwaddr
> host_addr, uint64_t *dpa)
>  continue;
>  }
> 
> -*dpa = dpa_base +
> -((MAKE_64BIT_MASK(0, 8 + ig) & hpa_offset) |
> - ((MAKE_64BIT_MASK(8 + ig + iw, 64 - 8 - ig - iw) & hpa_offset)
> -  >> iw));
> +if (iw < 8) {
> +*dpa = dpa_base +
> +((MAKE_64BIT_MASK(0, 8 + ig) & hpa_offset) |
> + ((MAKE_64BIT_MASK(8 + ig + iw, 64 - 8 - ig - iw) & 
> hpa_offset)
> +  >> iw));
> +} else {
> +*dpa = dpa_base +
> +((MAKE_64BIT_MASK(0, 8 + ig) & hpa_offset) |
> + ((((MAKE_64BIT_MASK(ig + iw, 64 - ig - iw) & hpa_offset)
> +   >> (ig + iw)) / 3) << (ig + 8)));
> +}
> 
>  return true;
>  }
> --
> 2.37.3
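
For anyone reviewing the address math in the patch above, here is a small standalone model of the two branches. It is a sketch only: MASK64() and hpa_offset_to_dpa_offset() are names invented for this example, dpa_base is left out, and it assumes ig and iw are the encoded register values (granularity of 256 << ig bytes; as I understand the CXL encoding, iw 0..4 means 1/2/4/8/16 ways and iw 8/9/10 means 3/6/12 ways).

```c
#include <assert.h>
#include <stdint.h>

/* `length` one-bits starting at bit `shift` (same idea as MAKE_64BIT_MASK). */
#define MASK64(shift, length) \
    (((~UINT64_C(0)) >> (64 - (length))) << (shift))

/* Translate an offset into the interleaved HPA range to a device offset. */
static uint64_t hpa_offset_to_dpa_offset(uint64_t hpa_offset,
                                         unsigned ig, unsigned iw)
{
    /* Offset within one interleave granule is kept unchanged. */
    uint64_t low = MASK64(0, 8 + ig) & hpa_offset;

    if (iw < 8) {
        /* Power-of-2 ways: drop the iw way-selector bits. */
        return low |
               ((MASK64(8 + ig + iw, 64 - 8 - ig - iw) & hpa_offset) >> iw);
    }
    /* 3/6/12 ways: strip the power-of-2 part by shifting, then divide by 3. */
    return low |
           ((((MASK64(ig + iw, 64 - ig - iw) & hpa_offset)
              >> (ig + iw)) / 3) << (ig + 8));
}
```

With 256-byte granularity (ig = 0) and 3 ways (iw = 8), granule number 3 of the host range is the second granule on its device, so offset 3 * 256 + 5 maps to device offset 256 + 5.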




Re: [PATCH 1/4] target/riscv: Add zimop extension

2024-05-24 Thread Daniel Henrique Barboza




On 5/22/24 03:29, LIU Zhiwei wrote:

Zimop extension defines an encoding space for 40 MOPs. The Zimop
extension defines 32 MOP instructions named MOP.R.n, where n is
an integer between 0 and 31, inclusive. The Zimop extension
additionally defines 8 MOP instructions named MOP.RR.n, where n
is an integer between 0 and 7.

These 40 MOPs initially are defined to simply write zero to x[rd],
but are designed to be redefined by later extensions to perform some
other action.
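
To make the numbering concrete, the sketch below shows how n could be recovered from a 32-bit instruction word using the field layout this patch adds to insn32.decode (%imm_mop5 = 30:1 26:2 20:2 and %imm_mop3 = 30:1 26:2, most significant field first). The helper names bits(), mop_r_index() and mop_rr_index() are made up for illustration; the fixed opcode bits are not checked here.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `len` bits of insn starting at bit position `pos`. */
static inline uint32_t bits(uint32_t insn, unsigned pos, unsigned len)
{
    return (insn >> pos) & ((UINT32_C(1) << len) - 1);
}

/* n for MOP.R.n: bit 30, then bits 27:26, then bits 21:20. */
static unsigned mop_r_index(uint32_t insn)
{
    return (bits(insn, 30, 1) << 4) |
           (bits(insn, 26, 2) << 2) |
            bits(insn, 20, 2);
}

/* n for MOP.RR.n: bit 30, then bits 27:26. */
static unsigned mop_rr_index(uint32_t insn)
{
    return (bits(insn, 30, 1) << 2) | bits(insn, 26, 2);
}
```

Five index bits give the 32 MOP.R.n instructions and three give the 8 MOP.RR.n instructions, which is where the total of 40 comes from.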

Signed-off-by: LIU Zhiwei 
---
  target/riscv/cpu.c  |  2 ++
  target/riscv/cpu_cfg.h  |  1 +
  target/riscv/insn32.decode  | 11 ++
  target/riscv/insn_trans/trans_rvzimop.c.inc | 37 +
  target/riscv/translate.c|  1 +
  5 files changed, 52 insertions(+)
  create mode 100644 target/riscv/insn_trans/trans_rvzimop.c.inc

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index eb1a2e7d6d..c1ac521142 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -175,6 +175,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
  ISA_EXT_DATA_ENTRY(zvkt, PRIV_VERSION_1_12_0, ext_zvkt),
  ISA_EXT_DATA_ENTRY(zhinx, PRIV_VERSION_1_12_0, ext_zhinx),
  ISA_EXT_DATA_ENTRY(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin),
+ISA_EXT_DATA_ENTRY(zimop, PRIV_VERSION_1_12_0, ext_zimop),


Shouldn't this be placed right after zihpm?

ISA_EXT_DATA_ENTRY(zihintpause, PRIV_VERSION_1_10_0, ext_zihintpause),
ISA_EXT_DATA_ENTRY(zihpm, PRIV_VERSION_1_12_0, ext_zihpm),

+ISA_EXT_DATA_ENTRY(zimop, PRIV_VERSION_1_12_0, ext_zimop),

ISA_EXT_DATA_ENTRY(zmmul, PRIV_VERSION_1_12_0, ext_zmmul),


Thanks,

Daniel



  ISA_EXT_DATA_ENTRY(smaia, PRIV_VERSION_1_12_0, ext_smaia),
  ISA_EXT_DATA_ENTRY(smepmp, PRIV_VERSION_1_12_0, ext_smepmp),
  ISA_EXT_DATA_ENTRY(smstateen, PRIV_VERSION_1_12_0, ext_smstateen),
@@ -1463,6 +1464,7 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
  MULTI_EXT_CFG_BOOL("zicsr", ext_zicsr, true),
  MULTI_EXT_CFG_BOOL("zihintntl", ext_zihintntl, true),
  MULTI_EXT_CFG_BOOL("zihintpause", ext_zihintpause, true),
+MULTI_EXT_CFG_BOOL("zimop", ext_zimop, false),
  MULTI_EXT_CFG_BOOL("zacas", ext_zacas, false),
  MULTI_EXT_CFG_BOOL("zaamo", ext_zaamo, false),
  MULTI_EXT_CFG_BOOL("zalrsc", ext_zalrsc, false),
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index cb750154bd..b547fbba9d 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -71,6 +71,7 @@ struct RISCVCPUConfig {
  bool ext_zihintntl;
  bool ext_zihintpause;
  bool ext_zihpm;
+bool ext_zimop;
  bool ext_ztso;
  bool ext_smstateen;
  bool ext_sstc;
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index f22df04cfd..972a1e8fd1 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -38,6 +38,8 @@
  %imm_bs   30:2   !function=ex_shift_3
  %imm_rnum 20:4
  %imm_z6   26:1 15:5
+%imm_mop5 30:1 26:2 20:2
+%imm_mop3 30:1 26:2
  
  # Argument sets:

  &empty
@@ -56,6 +58,8 @@
  &r2nfvmvm rd rs1 nf
  &rnfvm vm rd rs1 rs2 nf
  &k_aes shamt rs2 rs1 rd
+&mop5 imm rd rs1
+&mop3 imm rd rs1 rs2
  
  # Formats 32:

  @r   ...   . . ... . ... &r%rs2 %rs1 
%rd
@@ -98,6 +102,9 @@
  @k_aes   .. . . .  ... . ... &k_aes  shamt=%imm_bs   %rs2 
%rs1 %rd
  @i_aes   .. . . .  ... . ... &i  imm=%imm_rnum
%rs1 %rd
  
+@mop5 . . .. ..  .. . ... . ... &mop5 imm=%imm_mop5 %rd %rs1

+@mop3 . . .. .. . . . ... . ... &mop3 imm=%imm_mop3 %rd %rs1 
%rs2
+
  # Formats 64:
  @sh5 ...  . .  ... . ... &shift  shamt=%sh5  %rs1 
%rd
  
@@ -1010,3 +1017,7 @@ amocas_w00101 . . . . 010 . 010 @atom_st

  amocas_d00101 . . . . 011 . 010 @atom_st
  # *** RV64 Zacas Standard Extension ***
  amocas_q00101 . . . . 100 . 010 @atom_st
+
+# *** Zimop may-be-operation extension ***
+mop_r_n 1 . 00 .. 0111 .. . 100 . 0111011 @mop5
+mop_rr_n1 . 00 .. 1 . . 100 . 0111011 @mop3
diff --git a/target/riscv/insn_trans/trans_rvzimop.c.inc 
b/target/riscv/insn_trans/trans_rvzimop.c.inc
new file mode 100644
index 00..165aacd2b6
--- /dev/null
+++ b/target/riscv/insn_trans/trans_rvzimop.c.inc
@@ -0,0 +1,37 @@
+/*
+ * RISC-V translation routines for May-Be-Operation(zimop).
+ *
+ * Copyright (c) 2024 Alibaba Group.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GN

Re: [PATCH 3/4] target/riscv: Add zcmop extension

2024-05-24 Thread Daniel Henrique Barboza




On 5/22/24 03:29, LIU Zhiwei wrote:

Zcmop defines eight 16-bit MOP instructions named C.MOP.n, where n is
an odd integer between 1 and 15, inclusive. C.MOP.n is encoded in
the reserved encoding space corresponding to C.LUI xn, 0.

Unlike the MOPs defined in the Zimop extension, the C.MOP.n instructions
are defined to not write any register.

In the current implementation, C.MOP.n only performs an extension check
and has no other behavior.
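The encoding described above can be illustrated outside of decodetree. The following is a minimal sketch, assuming the C.LUI layout (funct3 = 011 in bits 15:13, rd in bits 11:7, a zero immediate in bits 12 and 6:2, quadrant 01) with rd restricted to the odd registers x1 to x15. The macro and function names are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Fixed bits of C.MOP.n under the assumed layout:
 *   [15:13] = 011, [12] = 0, [11] = 0, [7] = 1, [6:2] = 00000, [1:0] = 01
 * Bits [10:8] are free and select one of the eight instructions.
 */
#define SKETCH_C_MOP_MASK  0xF8FFu
#define SKETCH_C_MOP_MATCH 0x6081u

/* True if the 16-bit instruction falls in the C.MOP.n encoding space. */
static bool sketch_is_c_mop(uint16_t insn)
{
    return (insn & SKETCH_C_MOP_MASK) == SKETCH_C_MOP_MATCH;
}

/* Returns n (1, 3, ..., 15) for C.MOP.n, or -1 for any other encoding. */
static int sketch_c_mop_index(uint16_t insn)
{
    if (!sketch_is_c_mop(insn)) {
        return -1;
    }
    /* n is the rd field; bit 11 is zero and bit 7 is one, so n is odd. */
    return (int)((insn >> 7) & 0x1F);
}
```

With this layout, 0x6081 decodes as C.MOP.1 and 0x6781 as C.MOP.15, while any encoding with an even rd or a nonzero immediate is rejected.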

Signed-off-by: LIU Zhiwei 
---
  target/riscv/cpu.c  |  2 ++
  target/riscv/cpu_cfg.h  |  1 +
  target/riscv/insn16.decode  |  1 +
  target/riscv/insn_trans/trans_rvzcmop.c.inc | 29 +
  target/riscv/tcg/tcg-cpu.c  |  5 
  target/riscv/translate.c|  1 +
  6 files changed, 39 insertions(+)
  create mode 100644 target/riscv/insn_trans/trans_rvzcmop.c.inc

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index c1ac521142..5052237a5b 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -176,6 +176,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
  ISA_EXT_DATA_ENTRY(zhinx, PRIV_VERSION_1_12_0, ext_zhinx),
  ISA_EXT_DATA_ENTRY(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin),
  ISA_EXT_DATA_ENTRY(zimop, PRIV_VERSION_1_12_0, ext_zimop),
+ISA_EXT_DATA_ENTRY(zcmop, PRIV_VERSION_1_12_0, ext_zcmop),



I'm not sure if zcmop goes here. Perhaps here?


ISA_EXT_DATA_ENTRY(zce, PRIV_VERSION_1_12_0, ext_zce),

+ISA_EXT_DATA_ENTRY(zcmop, PRIV_VERSION_1_12_0, ext_zcmop),

ISA_EXT_DATA_ENTRY(zcmp, PRIV_VERSION_1_12_0, ext_zcmp),
ISA_EXT_DATA_ENTRY(zcmt, PRIV_VERSION_1_12_0, ext_zcmt),
ISA_EXT_DATA_ENTRY(zba, PRIV_VERSION_1_12_0, ext_zba),


Thanks,


Daniel



  ISA_EXT_DATA_ENTRY(smaia, PRIV_VERSION_1_12_0, ext_smaia),
  ISA_EXT_DATA_ENTRY(smepmp, PRIV_VERSION_1_12_0, ext_smepmp),
  ISA_EXT_DATA_ENTRY(smstateen, PRIV_VERSION_1_12_0, ext_smstateen),
@@ -1465,6 +1466,7 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
  MULTI_EXT_CFG_BOOL("zihintntl", ext_zihintntl, true),
  MULTI_EXT_CFG_BOOL("zihintpause", ext_zihintpause, true),
  MULTI_EXT_CFG_BOOL("zimop", ext_zimop, false),
+MULTI_EXT_CFG_BOOL("zcmop", ext_zcmop, false),
  MULTI_EXT_CFG_BOOL("zacas", ext_zacas, false),
  MULTI_EXT_CFG_BOOL("zaamo", ext_zaamo, false),
  MULTI_EXT_CFG_BOOL("zalrsc", ext_zalrsc, false),
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index b547fbba9d..e29d4f6f9c 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -72,6 +72,7 @@ struct RISCVCPUConfig {
  bool ext_zihintpause;
  bool ext_zihpm;
  bool ext_zimop;
+bool ext_zcmop;
  bool ext_ztso;
  bool ext_smstateen;
  bool ext_sstc;
diff --git a/target/riscv/insn16.decode b/target/riscv/insn16.decode
index b96c534e73..3953bcf82d 100644
--- a/target/riscv/insn16.decode
+++ b/target/riscv/insn16.decode
@@ -140,6 +140,7 @@ sw110  ... ... .. ... 00 @cs_w
  addi  000 .  .  . 01 @ci
  addi  010 .  .  . 01 @c_li
  {
+  c_mop_n 011 0 0 n:3 1 0 01
illegal 011 0  -  0 01 # c.addi16sp and c.lui, RES nzimm=0
addi011 .  00010  . 01 @c_addi16sp
lui 011 .  .  . 01 @c_lui
diff --git a/target/riscv/insn_trans/trans_rvzcmop.c.inc 
b/target/riscv/insn_trans/trans_rvzcmop.c.inc
new file mode 100644
index 00..7205586508
--- /dev/null
+++ b/target/riscv/insn_trans/trans_rvzcmop.c.inc
@@ -0,0 +1,29 @@
+/*
+ * RISC-V translation routines for compressed May-Be-Operation(zcmop).
+ *
+ * Copyright (c) 2024 Alibaba Group.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see .
+ */
+
+#define REQUIRE_ZCMOP(ctx) do {   \
+if (!ctx->cfg_ptr->ext_zcmop) {   \
+return false; \
+} \
+} while (0)
+
+static bool trans_c_mop_n(DisasContext *ctx, arg_c_mop_n *a)
+{
+REQUIRE_ZCMOP(ctx);
+return true;
+}
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index 40054a391a..499b48dce8 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -583,6 +583,11 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, 
Error **errp)
  }
  }
  
+if (cpu->cfg.ext_zcmop && !cpu->cfg.ext_zca) {

+error_se

Re: [PULL 01/10] target/loongarch/kvm: Fix VM recovery from disk failures

2024-05-24 Thread Michael Tokarev

23.05.2024 04:46, Song Gao wrote:

vmstate does not save kvm_state_counter,
which can cause VM recovery from disk to fail.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Song Gao 
Acked-by: Peter Xu 
Message-Id: <20240508024732.3127792-1-gaos...@loongson.cn>
---
  target/loongarch/machine.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c
index 9cd9e848d6..08a7fa5370 100644
--- a/target/loongarch/machine.c
+++ b/target/loongarch/machine.c
@@ -145,8 +145,8 @@ static const VMStateDescription vmstate_tlb = {
  /* LoongArch CPU state */
  const VMStateDescription vmstate_loongarch_cpu = {
  .name = "cpu",
-.version_id = 1,
-.minimum_version_id = 1,
+.version_id = 2,
+.minimum_version_id = 2,
  .fields = (const VMStateField[]) {
  VMSTATE_UINTTL_ARRAY(env.gpr, LoongArchCPU, 32),
  VMSTATE_UINTTL(env.pc, LoongArchCPU),
@@ -208,6 +208,8 @@ const VMStateDescription vmstate_loongarch_cpu = {
  VMSTATE_UINT64(env.CSR_DERA, LoongArchCPU),
  VMSTATE_UINT64(env.CSR_DSAVE, LoongArchCPU),
  
+VMSTATE_UINT64(kvm_state_counter, LoongArchCPU),

+
  VMSTATE_END_OF_LIST()
  },
  .subsections = (const VMStateDescription * const []) {


Should this really be part of any stable releases?
Wouldn't it break migration between, say, 8.2 with this change
and without?
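To illustrate the concern: a vmstate section is only accepted when the incoming version lies between the destination's minimum_version_id and version_id. A minimal sketch of that rule follows; the struct and function names are hypothetical and this is a simplified model, not QEMU's loader code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two version fields of a VMStateDescription. */
typedef struct {
    int version_id;
    int minimum_version_id;
} SketchVMSD;

/*
 * The destination accepts a section only if the version written by the
 * source is neither newer than what it knows nor older than its minimum.
 */
static bool sketch_stream_accepted(const SketchVMSD *dst, int incoming_version)
{
    return incoming_version <= dst->version_id &&
           incoming_version >= dst->minimum_version_id;
}
```

With both fields bumped from 1 to 2, a patched build rejects a version 1 stream and an unpatched build rejects a version 2 stream, so migration fails in both directions between the two builds.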

Thanks,

/mjt
--
GPG Key transition (from rsa2048 to rsa4096) since 2024-04-24.
New key: rsa4096/61AD3D98ECDF2C8E  9D8B E14E 3F2A 9DD7 9199  28F1 61AD 3D98 
ECDF 2C8E
Old key: rsa2048/457CE0A0804465C5  6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 
8044 65C5
Transition statement: http://www.corpit.ru/mjt/gpg-transition-2024.txt




Re: [PATCH 2/2] scsi-disk: Fix crash for VM configured with USB CDROM after live migration

2024-05-24 Thread Prasad Pandit
Hello Hyman,

* Is this the same patch series as sent before..?
  -> https://lists.nongnu.org/archive/html/qemu-devel/2024-04/msg00816.html

On Fri, 24 May 2024 at 12:02, Hyman Huang  wrote:
> For VMs configured with the USB CDROM device:
>
> -drive file=/path/to/local/file,id=drive-usb-disk0,media=cdrom,readonly=on...
> -device usb-storage,drive=drive-usb-disk0,id=usb-disk0...
>
> QEMU process may crash after live migration,
> Do the live migration repeatedly, crash may happen after live migration,

* Does live migration work many times before QEMU crashes on the
destination side? OR QEMU crashes at the very first migration?

>at 
> /usr/src/debug/qemu-6-6.2.0-75.7.oe1.smartx.git.40.x86_64/include/qemu/iov.h:49

* This qemu version looks quite old. Is the issue reproducible with
the latest QEMU version 9.0?

> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> +static void scsi_disk_emulate_save_request(QEMUFile *f, SCSIRequest *req)
> +{
> +SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req);
> +SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
> +
> +if (s->migrate_emulate_scsi_request) {
> +scsi_disk_save_request(f, req);
> +}
> +}
> +
>  static void scsi_disk_load_request(QEMUFile *f, SCSIRequest *req)
>  {
>  SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req);
> @@ -183,6 +193,16 @@ static void scsi_disk_load_request(QEMUFile *f, 
> SCSIRequest *req)
>  qemu_iovec_init_external(&r->qiov, &r->iov, 1);
>  }
>
> +static void scsi_disk_emulate_load_request(QEMUFile *f, SCSIRequest *req)
> +{
> +SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req);
> +SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
> +
> +if (s->migrate_emulate_scsi_request) {
> +scsi_disk_load_request(f, req);
> +}
> +}
> +
>  /*
>   * scsi_handle_rw_error has two return values.  False means that the error
>   * must be ignored, true means that the error has been processed and the
> @@ -2593,6 +2613,8 @@ static const SCSIReqOps scsi_disk_emulate_reqops = {
>  .read_data= scsi_disk_emulate_read_data,
>  .write_data   = scsi_disk_emulate_write_data,
>  .get_buf  = scsi_get_buf,
> +.load_request = scsi_disk_emulate_load_request,
> +.save_request = scsi_disk_emulate_save_request,
>  };
>
>  static const SCSIReqOps scsi_disk_dma_reqops = {
> @@ -3137,7 +3159,7 @@ static Property scsi_hd_properties[] = {
>  static int scsi_disk_pre_save(void *opaque)
>  {
>  SCSIDiskState *dev = opaque;
> -dev->migrate_emulate_scsi_request = false;
> +dev->migrate_emulate_scsi_request = true;
>

* This patch seems to add support for migrating SCSI requests. While
it looks okay, I am not sure it is required; how likely is someone to
configure a VM to use a CDROM?

*  Should the CDROM device be reset on the destination if no requests
are found? ie. if (scsi_req_get_buf -> scsi_get_buf() returns NULL)?

Thank you.
---
  - Prasad




[PATCH v2 qemu 0/6] acpi: NUMA nodes for CXL HB as GP + complex NUMA test.

2024-05-24 Thread Jonathan Cameron via
v2: Improve (mostly add detail) the qmp documentation (thanks Markus!)

ACPI 6.5 introduced Generic Port Affinity Structures to close a system
description gap that was a problem for CXL memory systems.
It defines a new SRAT Affinity structure (and hence allows creation of an
ACPI Proximity Node which can only be defined via an SRAT structure)
for the boundary between a discoverable fabric and non-discoverable
system interconnects etc.

The HMAT data on latency and bandwidth is combined with discoverable
information from the CXL bus (link speeds, lane counts) and CXL devices
(switch port to port characteristics and USP to memory, via CDAT tables
read from the device).  QEMU has supported the rest of the elements
of this chain for a while but now the kernel has caught up and we need
the missing element of Generic Ports (this code has been used extensively
in testing and debugging that kernel support, some resulting fixes
currently under review).

Generic Port Affinity Structures are very similar to the recently
added Generic Initiator Affinity Structures (GI) so this series
factors out much of that infrastructure for reuse.
There are subtle differences (beyond the obvious structure ID change).

- The ACPI spec example (and Linux kernel support) associates a Generic
  Port not with the CXL root port, but rather with the CXL Host
  bridge. As a result, an ACPI handle is used (rather
  than the PCI SBDF option for GIs). In QEMU the easiest way
  to get to this is to target the root bridge PCI Bus, and
  conveniently the root bridge bus number is used for the UID allowing
  us to construct an appropriate entry.
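As an illustration of the ACPI handle mentioned above, here is a minimal sketch of how the 16-byte Device Handle could be filled in. It assumes the ACPI form of the handle is an 8-byte _HID string, followed by a 4-byte little-endian _UID and 4 reserved bytes; the function and macro names are hypothetical and are not the ones used in this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HANDLE_LEN 16

/*
 * Fill a 16-byte ACPI Device Handle: bytes 0-7 hold the _HID string
 * (no NUL terminator), bytes 8-11 the _UID in little-endian order,
 * and bytes 12-15 are reserved and left as zero.
 */
static void sketch_build_acpi_handle(uint8_t out[SKETCH_HANDLE_LEN],
                                     const char *hid, uint32_t uid)
{
    size_t n = strlen(hid);
    int i;

    memset(out, 0, SKETCH_HANDLE_LEN);
    memcpy(out, hid, n < 8 ? n : 8);
    for (i = 0; i < 4; i++) {
        out[8 + i] = (uint8_t)(uid >> (8 * i));
    }
}
```

For a CXL host bridge (HID "ACPI0016") whose root bridge bus number is 64, this gives the bytes 41 43 50 49 30 30 31 36 40 00 00 00 00 00 00 00.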

A key addition of this series is a complex NUMA topology example that
stretches the QEMU emulation code for GI, GP and nodes with just
CPUs, just memory, just hot pluggable memory, and a mixture of memory and CPUs.

A similar test showed up a few NUMA related bugs with fixes applied for
9.0 (note that one of these needs linux booted to identify that it
rejects the HMAT table and this test is a regression test for the
table generation only).

https://lore.kernel.org/qemu-devel/2eb6672cfdaea7dacd8e9bb0523887f13b9f85ce.1710282274.git@redhat.com/
https://lore.kernel.org/qemu-devel/74e2845c5f95b0c139c79233ddb65bb17f2dd679.1710282274.git@redhat.com/

Jonathan Cameron (6):
  hw/acpi/GI: Fix trivial parameter alignment issue.
  hw/acpi: Insert an acpi-generic-node base under acpi-generic-initiator
  hw/acpi: Generic Port Affinity Structure support
  bios-tables-test: Allow for new acpihmat-generic-x test data.
  bios-tables-test: Add complex SRAT / HMAT test for GI GP
  bios-tables-test: Add data for complex numa test (GI, GP etc)

 qapi/qom.json   |  35 
 include/hw/acpi/acpi_generic_initiator.h|  33 +++-
 include/hw/pci/pci_bridge.h |   1 +
 hw/acpi/acpi_generic_initiator.c| 199 ++--
 hw/pci-bridge/pci_expander_bridge.c |   1 -
 tests/qtest/bios-tables-test.c  |  92 +
 tests/data/acpi/q35/APIC.acpihmat-generic-x | Bin 0 -> 136 bytes
 tests/data/acpi/q35/CEDT.acpihmat-generic-x | Bin 0 -> 68 bytes
 tests/data/acpi/q35/DSDT.acpihmat-generic-x | Bin 0 -> 10400 bytes
 tests/data/acpi/q35/HMAT.acpihmat-generic-x | Bin 0 -> 360 bytes
 tests/data/acpi/q35/SRAT.acpihmat-generic-x | Bin 0 -> 520 bytes
 11 files changed, 302 insertions(+), 59 deletions(-)
 create mode 100644 tests/data/acpi/q35/APIC.acpihmat-generic-x
 create mode 100644 tests/data/acpi/q35/CEDT.acpihmat-generic-x
 create mode 100644 tests/data/acpi/q35/DSDT.acpihmat-generic-x
 create mode 100644 tests/data/acpi/q35/HMAT.acpihmat-generic-x
 create mode 100644 tests/data/acpi/q35/SRAT.acpihmat-generic-x

-- 
2.39.2




[PATCH v2 1/6] hw/acpi/GI: Fix trivial parameter alignment issue.

2024-05-24 Thread Jonathan Cameron via
Before making additional modifications, tidy up this misleading indentation.

Reviewed-by: Ankit Agrawal 
Signed-off-by: Jonathan Cameron 
---
 hw/acpi/acpi_generic_initiator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/acpi/acpi_generic_initiator.c b/hw/acpi/acpi_generic_initiator.c
index 17b9a052f5..18a939b0e5 100644
--- a/hw/acpi/acpi_generic_initiator.c
+++ b/hw/acpi/acpi_generic_initiator.c
@@ -132,7 +132,7 @@ static int build_all_acpi_generic_initiators(Object *obj, 
void *opaque)
 
 dev_handle.segment = 0;
 dev_handle.bdf = PCI_BUILD_BDF(pci_bus_num(pci_get_bus(pci_dev)),
-   pci_dev->devfn);
+   pci_dev->devfn);
 
 build_srat_generic_pci_initiator_affinity(table_data,
   gi->node, &dev_handle);
-- 
2.39.2




[PATCH v2 2/6] hw/acpi: Insert an acpi-generic-node base under acpi-generic-initiator

2024-05-24 Thread Jonathan Cameron via
This will simplify reuse when adding acpi-generic-port.
Note that some error_printf() messages will now print acpi-generic-node
whereas others will move to type specific cases in the next patch, so
they are left alone for now.

Signed-off-by: Jonathan Cameron 
---
v2: Fix a typo in comment.
---
 include/hw/acpi/acpi_generic_initiator.h | 15 -
 hw/acpi/acpi_generic_initiator.c | 78 +++-
 2 files changed, 62 insertions(+), 31 deletions(-)

diff --git a/include/hw/acpi/acpi_generic_initiator.h 
b/include/hw/acpi/acpi_generic_initiator.h
index a304bad73e..dd4be19c8f 100644
--- a/include/hw/acpi/acpi_generic_initiator.h
+++ b/include/hw/acpi/acpi_generic_initiator.h
@@ -8,15 +8,26 @@
 
 #include "qom/object_interfaces.h"
 
-#define TYPE_ACPI_GENERIC_INITIATOR "acpi-generic-initiator"
+/*
+ * Abstract type to be used as base for
+ * - acpi-generic-initiator
+ * - acpi-generic-port
+ */
+#define TYPE_ACPI_GENERIC_NODE "acpi-generic-node"
 
-typedef struct AcpiGenericInitiator {
+typedef struct AcpiGenericNode {
 /* private */
 Object parent;
 
 /* public */
 char *pci_dev;
 uint16_t node;
+} AcpiGenericNode;
+
+#define TYPE_ACPI_GENERIC_INITIATOR "acpi-generic-initiator"
+
+typedef struct AcpiGenericInitiator {
+AcpiGenericNode parent;
 } AcpiGenericInitiator;
 
 /*
diff --git a/hw/acpi/acpi_generic_initiator.c b/hw/acpi/acpi_generic_initiator.c
index 18a939b0e5..c054e0e27d 100644
--- a/hw/acpi/acpi_generic_initiator.c
+++ b/hw/acpi/acpi_generic_initiator.c
@@ -10,45 +10,61 @@
 #include "hw/pci/pci_device.h"
 #include "qemu/error-report.h"
 
-typedef struct AcpiGenericInitiatorClass {
+typedef struct AcpiGenericNodeClass {
 ObjectClass parent_class;
+} AcpiGenericNodeClass;
+
+typedef struct AcpiGenericInitiatorClass {
+ AcpiGenericNodeClass parent_class;
 } AcpiGenericInitiatorClass;
 
+OBJECT_DEFINE_ABSTRACT_TYPE(AcpiGenericNode, acpi_generic_node,
+ACPI_GENERIC_NODE, OBJECT)
+
+OBJECT_DECLARE_SIMPLE_TYPE(AcpiGenericNode, ACPI_GENERIC_NODE)
+
 OBJECT_DEFINE_TYPE_WITH_INTERFACES(AcpiGenericInitiator, 
acpi_generic_initiator,
-   ACPI_GENERIC_INITIATOR, OBJECT,
+   ACPI_GENERIC_INITIATOR, ACPI_GENERIC_NODE,
{ TYPE_USER_CREATABLE },
{ NULL })
 
 OBJECT_DECLARE_SIMPLE_TYPE(AcpiGenericInitiator, ACPI_GENERIC_INITIATOR)
 
+static void acpi_generic_node_init(Object *obj)
+{
+AcpiGenericNode *gn = ACPI_GENERIC_NODE(obj);
+
+gn->node = MAX_NODES;
+gn->pci_dev = NULL;
+}
+
 static void acpi_generic_initiator_init(Object *obj)
 {
-AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+}
+
+static void acpi_generic_node_finalize(Object *obj)
+{
+AcpiGenericNode *gn = ACPI_GENERIC_NODE(obj);
 
-gi->node = MAX_NODES;
-gi->pci_dev = NULL;
+g_free(gn->pci_dev);
 }
 
 static void acpi_generic_initiator_finalize(Object *obj)
 {
-AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
-
-g_free(gi->pci_dev);
 }
 
-static void acpi_generic_initiator_set_pci_device(Object *obj, const char *val,
-  Error **errp)
+static void acpi_generic_node_set_pci_device(Object *obj, const char *val,
+ Error **errp)
 {
-AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+AcpiGenericNode *gn = ACPI_GENERIC_NODE(obj);
 
-gi->pci_dev = g_strdup(val);
+gn->pci_dev = g_strdup(val);
 }
-
-static void acpi_generic_initiator_set_node(Object *obj, Visitor *v,
-const char *name, void *opaque,
-Error **errp)
+static void acpi_generic_node_set_node(Object *obj, Visitor *v,
+   const char *name, void *opaque,
+   Error **errp)
 {
-AcpiGenericInitiator *gi = ACPI_GENERIC_INITIATOR(obj);
+AcpiGenericNode *gn = ACPI_GENERIC_NODE(obj);
 MachineState *ms = MACHINE(qdev_get_machine());
 uint32_t value;
 
@@ -58,20 +74,24 @@ static void acpi_generic_initiator_set_node(Object *obj, 
Visitor *v,
 
 if (value >= MAX_NODES) {
 error_printf("%s: Invalid NUMA node specified\n",
- TYPE_ACPI_GENERIC_INITIATOR);
+ TYPE_ACPI_GENERIC_NODE);
 exit(1);
 }
 
-gi->node = value;
-ms->numa_state->nodes[gi->node].has_gi = true;
+gn->node = value;
+ms->numa_state->nodes[gn->node].has_gi = true;
 }
 
-static void acpi_generic_initiator_class_init(ObjectClass *oc, void *data)
+static void acpi_generic_node_class_init(ObjectClass *oc, void *data)
 {
 object_class_property_add_str(oc, "pci-dev", NULL,
-acpi_generic_initiator_set_pci_device);
+acpi_generic_node_set_pci_device);
 object_class_property_add(oc, "node", "int", NULL,
-acpi_generic_initiator_set_node, NULL, NULL);
+acpi_generic_node_se

[PATCH v2 4/6] bios-tables-test: Allow for new acpihmat-generic-x test data.

2024-05-24 Thread Jonathan Cameron via
The test to be added exercises many corners of the SRAT and HMAT
table generation.

Signed-off-by: Jonathan Cameron 
---
 tests/qtest/bios-tables-test-allowed-diff.h | 5 +
 tests/data/acpi/q35/APIC.acpihmat-generic-x | 0
 tests/data/acpi/q35/CEDT.acpihmat-generic-x | 0
 tests/data/acpi/q35/DSDT.acpihmat-generic-x | 0
 tests/data/acpi/q35/HMAT.acpihmat-generic-x | 0
 tests/data/acpi/q35/SRAT.acpihmat-generic-x | 0
 6 files changed, 5 insertions(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..a5aa801c99 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,6 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/q35/APIC.acpihmat-generic-x",
+"tests/data/acpi/q35/CEDT.acpihmat-generic-x",
+"tests/data/acpi/q35/DSDT.acpihmat-generic-x",
+"tests/data/acpi/q35/HMAT.acpihmat-generic-x",
+"tests/data/acpi/q35/SRAT.acpihmat-generic-x",
diff --git a/tests/data/acpi/q35/APIC.acpihmat-generic-x 
b/tests/data/acpi/q35/APIC.acpihmat-generic-x
new file mode 100644
index 00..e69de29bb2
diff --git a/tests/data/acpi/q35/CEDT.acpihmat-generic-x 
b/tests/data/acpi/q35/CEDT.acpihmat-generic-x
new file mode 100644
index 00..e69de29bb2
diff --git a/tests/data/acpi/q35/DSDT.acpihmat-generic-x 
b/tests/data/acpi/q35/DSDT.acpihmat-generic-x
new file mode 100644
index 00..e69de29bb2
diff --git a/tests/data/acpi/q35/HMAT.acpihmat-generic-x 
b/tests/data/acpi/q35/HMAT.acpihmat-generic-x
new file mode 100644
index 00..e69de29bb2
diff --git a/tests/data/acpi/q35/SRAT.acpihmat-generic-x 
b/tests/data/acpi/q35/SRAT.acpihmat-generic-x
new file mode 100644
index 00..e69de29bb2
-- 
2.39.2




[PATCH v2 3/6] hw/acpi: Generic Port Affinity Structure support

2024-05-24 Thread Jonathan Cameron via
These are very similar to the recently added Generic Initiators
but instead of representing an initiator of memory traffic they
represent an edge point beyond which may lie either targets or
initiators.  Here we add these ports such that they may
be targets of hmat_lb records to describe the latency and
bandwidth from host side initiators to the port.  A discoverable
mechanism such as UEFI CDAT read from CXL devices and switches
is used to discover the remainder of the path and the OS can build
up full latency and bandwidth numbers as needed for work and data
placement decisions.

Signed-off-by: Jonathan Cameron 
---
v2: Updates to QMP documentation to provide a lot more information
on the parameters.
---
 qapi/qom.json|  35 ++
 include/hw/acpi/acpi_generic_initiator.h |  18 ++-
 include/hw/pci/pci_bridge.h  |   1 +
 hw/acpi/acpi_generic_initiator.c | 141 +--
 hw/pci-bridge/pci_expander_bridge.c  |   1 -
 5 files changed, 158 insertions(+), 38 deletions(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index 38dde6d785..9d1d86bdad 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -826,6 +826,39 @@
   'data': { 'pci-dev': 'str',
 'node': 'uint32' } }
 
+
+##
+# @AcpiGenericPortProperties:
+#
+# Properties for acpi-generic-port objects.
+#
+# @pci-bus: QOM path of the PCI bus of the hostbridge associated with
+# this SRAT Generic Port Affinity Structure.  This is the same as
+# the bus parameter for the root ports attached to this host bridge.
+# The resulting SRAT Generic Port Affinity Structure will refer to
+# the ACPI object in DSDT that represents the host bridge (e.g.
+# ACPI0016 for CXL host bridges.) See ACPI 6.5 Section 5.2.16.7 for
+# more information.
+#
+# @node: Similar to a NUMA node ID, but instead of providing a reference
+# point used for defining NUMA distances and access characteristics
+# to memory or from an initiator (e.g. CPU), this node defines the
+# boundary point between non-discoverable system buses which must be
+# described by firmware, and a discoverable bus.  NUMA distances
+# and access characteristics are defined to and from that point.
+# For system software to establish full initiator to target
+# characteristics this information must be combined with information
+# retrieved from the discoverable part of the path.  An example would
+# use CDAT (see UEFI.org) information read from devices and switches
+# in conjunction with link characteristics read from PCIe
+# Configuration space.
+#
+# Since: 9.1
+##
+{ 'struct': 'AcpiGenericPortProperties',
+  'data': { 'pci-bus': 'str',
+'node': 'uint32' } }
+
 ##
 # @RngProperties:
 #
@@ -953,6 +986,7 @@
 { 'enum': 'ObjectType',
   'data': [
 'acpi-generic-initiator',
+'acpi-generic-port',
 'authz-list',
 'authz-listfile',
 'authz-pam',
@@ -1025,6 +1059,7 @@
   'discriminator': 'qom-type',
   'data': {
   'acpi-generic-initiator': 'AcpiGenericInitiatorProperties',
+  'acpi-generic-port':  'AcpiGenericPortProperties',
   'authz-list': 'AuthZListProperties',
   'authz-listfile': 'AuthZListFileProperties',
   'authz-pam':  'AuthZPAMProperties',
diff --git a/include/hw/acpi/acpi_generic_initiator.h 
b/include/hw/acpi/acpi_generic_initiator.h
index dd4be19c8f..1a899af30f 100644
--- a/include/hw/acpi/acpi_generic_initiator.h
+++ b/include/hw/acpi/acpi_generic_initiator.h
@@ -30,6 +30,12 @@ typedef struct AcpiGenericInitiator {
 AcpiGenericNode parent;
 } AcpiGenericInitiator;
 
+#define TYPE_ACPI_GENERIC_PORT "acpi-generic-port"
+
+typedef struct AcpiGenericPort {
+AcpiGenericInitiator parent;
+} AcpiGenericPort;
+
 /*
  * ACPI 6.3:
  * Table 5-81 Flags – Generic Initiator Affinity Structure
@@ -49,8 +55,16 @@ typedef enum {
  * Table 5-80 Device Handle - PCI
  */
 typedef struct PCIDeviceHandle {
-uint16_t segment;
-uint16_t bdf;
+union {
+struct {
+uint16_t segment;
+uint16_t bdf;
+};
+struct {
+uint64_t hid;
+uint32_t uid;
+};
+};
 } PCIDeviceHandle;
 
 void build_srat_generic_pci_initiator(GArray *table_data);
diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
index 5cd452115a..5456e24883 100644
--- a/include/hw/pci/pci_bridge.h
+++ b/include/hw/pci/pci_bridge.h
@@ -102,6 +102,7 @@ typedef struct PXBPCIEDev {
 PXBDev parent_obj;
 } PXBPCIEDev;
 
+#define TYPE_PXB_CXL_BUS "pxb-cxl-bus"
 #define TYPE_PXB_DEV "pxb"
 OBJECT_DECLARE_SIMPLE_TYPE(PXBDev, PXB_DEV)
 
diff --git a/hw/acpi/acpi_generic_initiator.c b/hw/acpi/acpi_generic_initiator.c
index c054e0e27d..85191e90ab 100644
--- a/hw/acpi/acpi_generic_initiator.c
+++ b/hw/acpi/acpi_generic_initiator.c
@@ -7,6 +7,7 @@
 #include "hw/acpi/acpi_generic_initiator.h"
 #include "hw/acpi/aml-build.h"
 #include "

[PATCH v2 5/6] bios-tables-test: Add complex SRAT / HMAT test for GI GP

2024-05-24 Thread Jonathan Cameron via
Add a test with 6 nodes to exercise most interesting corner cases
of SRAT and HMAT generation including the new Generic Initiator
and Generic Port Affinity structures.  More details of the
setup are in the following patch, which adds the table data.

Signed-off-by: Jonathan Cameron 
---
 tests/qtest/bios-tables-test.c | 92 ++
 1 file changed, 92 insertions(+)

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index d1ff4db7a2..1651d06b7b 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -1862,6 +1862,96 @@ static void test_acpi_q35_tcg_acpi_hmat_noinitiator(void)
 free_test_data(&data);
 }
 
+/* Test intended to hit corner cases of SRAT and HMAT */
+static void test_acpi_q35_tcg_acpi_hmat_generic_x(void)
+{
+test_data data = {};
+
+data.machine = MACHINE_Q35;
+data.variant = ".acpihmat-generic-x";
+test_acpi_one(" -machine hmat=on,cxl=on"
+  " -smp 3,sockets=3"
+  " -m 128M,maxmem=384M,slots=2"
+  " -device virtio-rng-pci,id=gidev"
+  " -device pxb-cxl,bus_nr=64,bus=pcie.0,id=cxl.1"
+  " -object memory-backend-ram,size=64M,id=ram0"
+  " -object memory-backend-ram,size=64M,id=ram1"
+  " -numa node,nodeid=0,cpus=0,memdev=ram0"
+  " -numa node,nodeid=1"
+  " -object acpi-generic-initiator,id=gi0,pci-dev=gidev,node=1"
+  " -numa node,nodeid=2"
+  " -object acpi-generic-port,id=gp0,pci-bus=cxl.1,node=2"
+  " -numa node,nodeid=3,cpus=1"
+  " -numa node,nodeid=4,memdev=ram1"
+  " -numa node,nodeid=5,cpus=2"
+  " -numa hmat-lb,initiator=0,target=0,hierarchy=memory,"
+  "data-type=access-latency,latency=10"
+  " -numa hmat-lb,initiator=0,target=0,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=800M"
+  " -numa hmat-lb,initiator=0,target=2,hierarchy=memory,"
+  "data-type=access-latency,latency=100"
+  " -numa hmat-lb,initiator=0,target=2,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=200M"
+  " -numa hmat-lb,initiator=0,target=4,hierarchy=memory,"
+  "data-type=access-latency,latency=100"
+  " -numa hmat-lb,initiator=0,target=4,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=200M"
+  " -numa hmat-lb,initiator=0,target=5,hierarchy=memory,"
+  "data-type=access-latency,latency=200"
+  " -numa hmat-lb,initiator=0,target=5,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=400M"
+  " -numa hmat-lb,initiator=1,target=0,hierarchy=memory,"
+  "data-type=access-latency,latency=500"
+  " -numa hmat-lb,initiator=1,target=0,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=100M"
+  " -numa hmat-lb,initiator=1,target=2,hierarchy=memory,"
+  "data-type=access-latency,latency=50"
+  " -numa hmat-lb,initiator=1,target=2,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=400M"
+  " -numa hmat-lb,initiator=1,target=4,hierarchy=memory,"
+  "data-type=access-latency,latency=50"
+  " -numa hmat-lb,initiator=1,target=4,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=800M"
+  " -numa hmat-lb,initiator=1,target=5,hierarchy=memory,"
+  "data-type=access-latency,latency=500"
+  " -numa hmat-lb,initiator=1,target=5,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=100M"
+  " -numa hmat-lb,initiator=3,target=0,hierarchy=memory,"
+  "data-type=access-latency,latency=20"
+  " -numa hmat-lb,initiator=3,target=0,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=400M"
+  " -numa hmat-lb,initiator=3,target=2,hierarchy=memory,"
+  "data-type=access-latency,latency=80"
+  " -numa hmat-lb,initiator=3,target=2,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=200M"
+  " -numa hmat-lb,initiator=3,target=4,hierarchy=memory,"
+  "data-type=access-latency,latency=80"
+  " -numa hmat-lb,initiator=3,target=4,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=200M"
+  " -numa hmat-lb,initiator=3,target=5,hierarchy=memory,"
+  "data-type=access-latency,latency=20"
+  " -numa hmat-lb,initiator=3,target=5,hierarchy=memory,"
+  "data-type=access-bandwidth,bandwidth=400M"
+

[PATCH v2 6/6] bios-tables-test: Add data for complex numa test (GI, GP etc)

2024-05-24 Thread Jonathan Cameron via
Given this is a new configuration, there are effects on APIC, CEDT
and DSDT, but the key elements are in SRAT (plus related data in
HMAT).  The configuration has nodes to exercise many different combinations.

0) CPUs + Memory
1) GI only
2) GP only
3) CPUS only
4) Memory only
5) CPUs + HP memory

GI node, GP node, memory only node, hotplug memory
only node, latency and bandwidth such that in Linux Access0
(any initiator) and Access1 (CPU initiators only) give different
answers.  The following is cropped to remove details of each entry.

[000h  004h]   Signature : "SRAT"[System Resource 
Affinity Table]

[030h 0048 001h]   Subtable Type : 00 [Processor Local APIC/SAPIC 
Affinity]
[032h 0050 001h] Proximity Domain Low(8) : 00
[033h 0051 001h] Apic ID : 00

[040h 0064 001h]   Subtable Type : 00 [Processor Local APIC/SAPIC 
Affinity]
[042h 0066 001h] Proximity Domain Low(8) : 03
[043h 0067 001h] Apic ID : 01

[050h 0080 001h]   Subtable Type : 00 [Processor Local APIC/SAPIC 
Affinity]
[052h 0082 001h] Proximity Domain Low(8) : 05
[053h 0083 001h] Apic ID : 02

[060h 0096 001h]   Subtable Type : 01 [Memory Affinity]
[062h 0098 004h]Proximity Domain : 
[068h 0104 008h]Base Address : 
[070h 0112 008h]  Address Length : 000A

[088h 0136 001h]   Subtable Type : 01 [Memory Affinity]
[08Ah 0138 004h]Proximity Domain : 
[090h 0144 008h]Base Address : 0010
[098h 0152 008h]  Address Length : 03F0
[0A8h 0168 008h]   Reserved3 : 

[0B0h 0176 001h]   Subtable Type : 01 [Memory Affinity]
[0B2h 0178 004h]Proximity Domain : 0004
[0B8h 0184 008h]Base Address : 0400
[0C0h 0192 008h]  Address Length : 0400

//Comment in hw/i386/aml-build.c on why these exist - not part of
//ACPI requirements.
[0D8h 0216 001h]   Subtable Type : 01 [Memory Affinity]
[0DAh 0218 004h]Proximity Domain : 
[0E0h 0224 008h]Base Address : 
[0E8h 0232 008h]  Address Length : 

[100h 0256 001h]   Subtable Type : 01 [Memory Affinity]
[102h 0258 004h]Proximity Domain : 
[108h 0264 008h]Base Address : 
[110h 0272 008h]  Address Length : 

[128h 0296 001h]   Subtable Type : 01 [Memory Affinity]
[12Ah 0298 004h]Proximity Domain : 
[130h 0304 008h]Base Address : 
[138h 0312 008h]  Address Length : 

[150h 0336 001h]   Subtable Type : 01 [Memory Affinity]
[152h 0338 004h]Proximity Domain : 
[158h 0344 008h]Base Address : 
[160h 0352 008h]  Address Length : 

[178h 0376 001h]   Subtable Type : 01 [Memory Affinity]
[17Ah 0378 004h]Proximity Domain : 
[180h 0384 008h]Base Address : 
[188h 0392 008h]  Address Length : 
// End of strange empty Memory Affinity structures.

[1A0h 0416 001h]   Subtable Type : 05 [Generic Initiator Affinity]
[1A3h 0419 001h]  Device Handle Type : 01
[1A4h 0420 004h]Proximity Domain : 0001
[1A8h 0424 010h]   Device Handle : 00 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00

[1C0h 0448 001h]   Subtable Type : 06 [Generic Port Affinity]
[1C3h 0451 001h]  Device Handle Type : 00
[1C4h 0452 004h]Proximity Domain : 0002
[1C8h 0456 010h]   Device Handle : 41 43 50 49 30 30 31 36 40 00 00 00 00 00 00 00

[1E0h 0480 001h]   Subtable Type : 01 [Memory Affinity]
[1E2h 0482 004h]Proximity Domain : 0005
[1E8h 0488 008h]Base Address : 0001
[1F0h 0496 008h]  Address Length : 9000
[1FCh 0508 004h]   Flags (decoded below) : 0003
 Enabled : 1
   Hot Pluggable : 1
Non-Volatile : 0

Example block from HMAT:
[0F0h 0240 002h]  Structure Type : 0001 [System Locality Latency and Bandwidth Information]

   [0F2h 0242 002h]  

Re: [PULL 10/10] hw/loongarch/virt: Fix FDT memory node address width

2024-05-24 Thread Michael Tokarev

23.05.2024 04:46, Song Gao wrote:

From: Jiaxun Yang 

Higher bits for memory nodes were omitted at qemu_fdt_setprop_cells.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Jiaxun Yang 
Reviewed-by: Song Gao 
Message-Id: <20240520-loongarch-fdt-memnode-v1-1-5ea9be939...@flygoat.com>
Signed-off-by: Song Gao 
---
  hw/loongarch/virt.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index e3bdf085b5..3e6e93edf3 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -464,7 +464,8 @@ static void fdt_add_memory_node(MachineState *ms,
  char *nodename = g_strdup_printf("/memory@%" PRIx64, base);
  
  qemu_fdt_add_subnode(ms->fdt, nodename);

-qemu_fdt_setprop_cells(ms->fdt, nodename, "reg", 0, base, 0, size);
+qemu_fdt_setprop_cells(ms->fdt, nodename, "reg", base >> 32, base,
+   size >> 32, size);
  qemu_fdt_setprop_string(ms->fdt, nodename, "device_type", "memory");
  
  if (ms->numa_state && ms->numa_state->num_nodes) {


This commit changes exactly the same place as the previous commit,
v9.0.0-274-gb11f981452, "hw/loongarch: Fix fdt memory node wrong 'reg'".

Was it the wrong fix?

Note the previous commit isn't in any released version of qemu.  So
when picking up for any stable release, both need to be picked up :)

Thanks,

/mjt
--
GPG Key transition (from rsa2048 to rsa4096) since 2024-04-24.
New key: rsa4096/61AD3D98ECDF2C8E  9D8B E14E 3F2A 9DD7 9199  28F1 61AD 3D98 
ECDF 2C8E
Old key: rsa2048/457CE0A0804465C5  6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 
8044 65C5
Transition statement: http://www.corpit.ru/mjt/gpg-transition-2024.txt
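
For readers following the loongarch "reg" fix discussed above, here is a minimal sketch of the cell split the patch performs. It assumes two 32-bit cells each for address and size (inferred from the four-value call in the diff); split_reg_cells() is a hypothetical helper for illustration, not a QEMU function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): split a 64-bit base and size
 * into the four 32-bit cells of a "reg" property, assuming two address
 * cells and two size cells.
 */
static void split_reg_cells(uint64_t base, uint64_t size, uint32_t cells[4])
{
    assert(cells != NULL);
    cells[0] = (uint32_t)(base >> 32);  /* upper 32 bits of base */
    cells[1] = (uint32_t)base;          /* lower 32 bits of base */
    cells[2] = (uint32_t)(size >> 32);  /* upper 32 bits of size */
    cells[3] = (uint32_t)size;          /* lower 32 bits of size */
}
```

With the upper cells hard-coded to 0, as in the removed line, any base or size at or above 4 GiB would lose its high bits.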




[PATCH v7 5/8] softmmu: Replace check for RAMBlock offset 0 with xen_mr_is_memory

2024-05-24 Thread Edgar E. Iglesias
From: "Edgar E. Iglesias" 

For xen, when checking for the first RAM (xen_memory), use
xen_mr_is_memory() rather than checking for a RAMBlock with
offset 0.

All Xen machines create xen_memory first so this has no
functional change for existing machines.

Signed-off-by: Edgar E. Iglesias 
Reviewed-by: Stefano Stabellini 
Reviewed-by: David Hildenbrand 
---
 system/physmem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/system/physmem.c b/system/physmem.c
index 5e6257ef65..b7847db1a2 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2229,7 +2229,7 @@ static void *qemu_ram_ptr_length(RAMBlock *block, ram_addr_t addr,
  * because we don't want to map the entire memory in QEMU.
  * In that case just map the requested area.
  */
-if (block->offset == 0) {
+if (xen_mr_is_memory(block->mr)) {
 return xen_map_cache(block->mr, block->offset + addr,
  len, lock, lock,
  is_write);
-- 
2.40.1




[PATCH v7 1/8] xen: mapcache: Make MCACHE_BUCKET_SHIFT runtime configurable

2024-05-24 Thread Edgar E. Iglesias
From: "Edgar E. Iglesias" 

Make MCACHE_BUCKET_SHIFT runtime configurable per cache instance.

Signed-off-by: Edgar E. Iglesias 
Reviewed-by: Stefano Stabellini 
---
 hw/xen/xen-mapcache.c | 54 ++-
 1 file changed, 33 insertions(+), 21 deletions(-)

diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
index fa6813b1ad..bc860f4373 100644
--- a/hw/xen/xen-mapcache.c
+++ b/hw/xen/xen-mapcache.c
@@ -23,13 +23,10 @@
 
 
 #if HOST_LONG_BITS == 32
-#  define MCACHE_BUCKET_SHIFT 16
 #  define MCACHE_MAX_SIZE (1UL<<31) /* 2GB Cap */
 #else
-#  define MCACHE_BUCKET_SHIFT 20
 #  define MCACHE_MAX_SIZE (1UL<<35) /* 32GB Cap */
 #endif
-#define MCACHE_BUCKET_SIZE (1UL << MCACHE_BUCKET_SHIFT)
 
 /* This is the size of the virtual address space reserve to QEMU that will not
  * be use by MapCache.
@@ -65,7 +62,8 @@ typedef struct MapCache {
 /* For most cases (>99.9%), the page address is the same. */
 MapCacheEntry *last_entry;
 unsigned long max_mcache_size;
-unsigned int mcache_bucket_shift;
+unsigned int bucket_shift;
+unsigned long bucket_size;
 
 phys_offset_to_gaddr_t phys_offset_to_gaddr;
 QemuMutex lock;
@@ -95,11 +93,14 @@ static inline int test_bits(int nr, int size, const unsigned long *addr)
 
 static MapCache *xen_map_cache_init_single(phys_offset_to_gaddr_t f,
void *opaque,
+   unsigned int bucket_shift,
unsigned long max_size)
 {
 unsigned long size;
 MapCache *mc;
 
+assert(bucket_shift >= XC_PAGE_SHIFT);
+
 mc = g_new0(MapCache, 1);
 
 mc->phys_offset_to_gaddr = f;
@@ -108,12 +109,14 @@ static MapCache *xen_map_cache_init_single(phys_offset_to_gaddr_t f,
 
 QTAILQ_INIT(&mc->locked_entries);
 
+mc->bucket_shift = bucket_shift;
+mc->bucket_size = 1UL << bucket_shift;
 mc->max_mcache_size = max_size;
 
 mc->nr_buckets =
 (((mc->max_mcache_size >> XC_PAGE_SHIFT) +
-  (1UL << (MCACHE_BUCKET_SHIFT - XC_PAGE_SHIFT)) - 1) >>
- (MCACHE_BUCKET_SHIFT - XC_PAGE_SHIFT));
+  (1UL << (bucket_shift - XC_PAGE_SHIFT)) - 1) >>
+ (bucket_shift - XC_PAGE_SHIFT));
 
 size = mc->nr_buckets * sizeof(MapCacheEntry);
 size = (size + XC_PAGE_SIZE - 1) & ~(XC_PAGE_SIZE - 1);
@@ -126,6 +129,13 @@ void xen_map_cache_init(phys_offset_to_gaddr_t f, void *opaque)
 {
 struct rlimit rlimit_as;
 unsigned long max_mcache_size;
+unsigned int bucket_shift;
+
+if (HOST_LONG_BITS == 32) {
+bucket_shift = 16;
+} else {
+bucket_shift = 20;
+}
 
 if (geteuid() == 0) {
 rlimit_as.rlim_cur = RLIM_INFINITY;
@@ -146,7 +156,9 @@ void xen_map_cache_init(phys_offset_to_gaddr_t f, void *opaque)
 }
 }
 
-mapcache = xen_map_cache_init_single(f, opaque, max_mcache_size);
+mapcache = xen_map_cache_init_single(f, opaque,
+ bucket_shift,
+ max_mcache_size);
 setrlimit(RLIMIT_AS, &rlimit_as);
 }
 
@@ -195,7 +207,7 @@ static void xen_remap_bucket(MapCache *mc,
 entry->valid_mapping = NULL;
 
 for (i = 0; i < nb_pfn; i++) {
-pfns[i] = (address_index << (MCACHE_BUCKET_SHIFT-XC_PAGE_SHIFT)) + i;
+pfns[i] = (address_index << (mc->bucket_shift - XC_PAGE_SHIFT)) + i;
 }
 
 /*
@@ -266,8 +278,8 @@ static uint8_t *xen_map_cache_unlocked(MapCache *mc,
 bool dummy = false;
 
 tryagain:
-address_index  = phys_addr >> MCACHE_BUCKET_SHIFT;
-address_offset = phys_addr & (MCACHE_BUCKET_SIZE - 1);
+address_index  = phys_addr >> mc->bucket_shift;
+address_offset = phys_addr & (mc->bucket_size - 1);
 
 trace_xen_map_cache(phys_addr);
 
@@ -294,14 +306,14 @@ tryagain:
 return mc->last_entry->vaddr_base + address_offset;
 }
 
-/* size is always a multiple of MCACHE_BUCKET_SIZE */
+/* size is always a multiple of mc->bucket_size */
 if (size) {
 cache_size = size + address_offset;
-if (cache_size % MCACHE_BUCKET_SIZE) {
-cache_size += MCACHE_BUCKET_SIZE - (cache_size % MCACHE_BUCKET_SIZE);
+if (cache_size % mc->bucket_size) {
+cache_size += mc->bucket_size - (cache_size % mc->bucket_size);
 }
 } else {
-cache_size = MCACHE_BUCKET_SIZE;
+cache_size = mc->bucket_size;
 }
 
 entry = &mc->entry[address_index % mc->nr_buckets];
@@ -422,7 +434,7 @@ static ram_addr_t xen_ram_addr_from_mapcache_single(MapCache *mc, void *ptr)
 trace_xen_ram_addr_from_mapcache_not_in_cache(ptr);
 raddr = RAM_ADDR_INVALID;
 } else {
-raddr = (reventry->paddr_index << MCACHE_BUCKET_SHIFT) +
+raddr = (reventry->paddr_index << mc->bucket_shift) +
  ((unsigned long) ptr - (unsigned long) entry->vaddr_base);
 }
 mapcache_unlock(mc);
@@ -585,8 +

[PATCH v7 4/8] softmmu: xen: Always pass offset + addr to xen_map_cache

2024-05-24 Thread Edgar E. Iglesias
From: "Edgar E. Iglesias" 

Always pass address with offset to xen_map_cache().
This is in preparation for support for grant mappings.

Since this is within a block that checks for offset == 0,
this has no functional changes.

Signed-off-by: Edgar E. Iglesias 
Reviewed-by: Stefano Stabellini 
Reviewed-by: David Hildenbrand 
---
 system/physmem.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/system/physmem.c b/system/physmem.c
index 342b7a8fd4..5e6257ef65 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2230,7 +2230,8 @@ static void *qemu_ram_ptr_length(RAMBlock *block, ram_addr_t addr,
  * In that case just map the requested area.
  */
 if (block->offset == 0) {
-return xen_map_cache(block->mr, addr, len, lock, lock,
+return xen_map_cache(block->mr, block->offset + addr,
+ len, lock, lock,
  is_write);
 }
 
-- 
2.40.1




[PATCH v7 3/8] xen: Add xen_mr_is_memory()

2024-05-24 Thread Edgar E. Iglesias
From: "Edgar E. Iglesias" 

Add xen_mr_is_memory() to abstract away tests for the
xen_memory MR.

No functional changes.

Signed-off-by: Edgar E. Iglesias 
Reviewed-by: Stefano Stabellini 
Acked-by: David Hildenbrand 
---
 hw/xen/xen-hvm-common.c | 10 --
 include/sysemu/xen.h|  8 
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c
index 2d1b032121..a0a0252da0 100644
--- a/hw/xen/xen-hvm-common.c
+++ b/hw/xen/xen-hvm-common.c
@@ -12,6 +12,12 @@
 
 MemoryRegion xen_memory;
 
+/* Check for xen memory.  */
+bool xen_mr_is_memory(MemoryRegion *mr)
+{
+return mr == &xen_memory;
+}
+
 void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr,
Error **errp)
 {
@@ -28,7 +34,7 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr,
 return;
 }
 
-if (mr == &xen_memory) {
+if (xen_mr_is_memory(mr)) {
 return;
 }
 
@@ -55,7 +61,7 @@ static void xen_set_memory(struct MemoryListener *listener,
 {
 XenIOState *state = container_of(listener, XenIOState, memory_listener);
 
-if (section->mr == &xen_memory) {
+if (xen_mr_is_memory(section->mr)) {
 return;
 } else {
 if (add) {
diff --git a/include/sysemu/xen.h b/include/sysemu/xen.h
index 754ec2e6cb..dc72f83bcb 100644
--- a/include/sysemu/xen.h
+++ b/include/sysemu/xen.h
@@ -34,6 +34,8 @@ void xen_hvm_modified_memory(ram_addr_t start, ram_addr_t length);
 void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size,
struct MemoryRegion *mr, Error **errp);
 
+bool xen_mr_is_memory(MemoryRegion *mr);
+
 #else /* !CONFIG_XEN_IS_POSSIBLE */
 
 #define xen_enabled() 0
@@ -47,6 +49,12 @@ static inline void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size,
 g_assert_not_reached();
 }
 
+static inline bool xen_mr_is_memory(MemoryRegion *mr)
+{
+g_assert_not_reached();
+return false;
+}
+
 #endif /* CONFIG_XEN_IS_POSSIBLE */
 
 #endif
-- 
2.40.1




[PATCH v7 0/8] xen: Support grant mappings

2024-05-24 Thread Edgar E. Iglesias
From: "Edgar E. Iglesias" 

Hi,

Grant mappings are a mechanism in Xen for guests to grant each other
permissions to map and share pages. These grants can be temporary
so both maps and unmaps must be respected. See here for more info:
https://github.com/xen-project/xen/blob/master/docs/misc/grant-tables.txt

Currently, the primary use-case for grants in QEMU is with VirtIO backends.
Grant mappings will only work with models that use the address_space_map/unmap
interfaces; any other access will fail with appropriate error messages.

In response to feedback we got on v3, later versions switched approach
from adding new MemoryRegion types and map/unmap hooks to instead reusing
the existing xen_map_cache() hooks (with extensions). Almost all of the
changes are now contained in the Xen modules.

This approach also refactors the mapcache to support multiple instances
(one for existing foreign mappings and another for grant mappings).

I've only enabled grants for the ARM PVH machine since that is what
I can currently test on.

Cheers,
Edgar

ChangeLog:

v6 -> v7:
* Use g_autofree in xen_remap_bucket().
* Flatten nested if-statements in xen_map_cache().
* Fix typo in error message in xen_map_cache().

v5 -> v6:
* Correct passing of ram_addr_offset in xen_replace_cache_entry_unlocked.

v4 -> v5:
* Compute grant_ref from address_index to xen_remap_bucket().
* Rename grant_is_write to is_write.
* Remove unnecessary + mc->bucket_size - 1 in
  xen_invalidate_map_cache_entry_unlocked().
* Remove use of global mapcache in refactor of
  xen_replace_cache_entry_unlocked().
* Add error checking for xengnttab_unmap().
* Add assert in xen_replace_cache_entry_unlocked() against grant mappings.
* Fix memory leak when freeing first entry in mapcache buckets.
* Assert that bucket_shift is >= XC_PAGE_SHIFT when creating mapcache.
* Add missing use of xen_mr_is_memory() in hw/xen/xen-hvm-common.c.
* Rebase with master.

v3 -> v4:
* Reuse existing xen_map_cache hooks.
* Reuse existing map-cache for both foreign and grant mappings.
* Only enable grants for the ARM PVH machine (removed i386).

v2 -> v3:
* Drop patch 1/7. This was done because device unplug is an x86-only case.
* Add missing qemu_mutex_unlock() before return.

v1 -> v2:
* Split patch 2/7 to keep physmem.c changes in a separate patch.
* In patch "xen: add map and unmap callbacks for grant" add check for total
  allowed grant < XEN_MAX_VIRTIO_GRANTS.
* Fix formatting issues and rebase onto latest master.


Edgar E. Iglesias (8):
  xen: mapcache: Make MCACHE_BUCKET_SHIFT runtime configurable
  xen: mapcache: Unmap first entries in buckets
  xen: Add xen_mr_is_memory()
  softmmu: xen: Always pass offset + addr to xen_map_cache
  softmmu: Replace check for RAMBlock offset 0 with xen_mr_is_memory
  xen: mapcache: Pass the ram_addr offset to xen_map_cache()
  xen: mapcache: Add support for grant mappings
  hw/arm: xen: Enable use of grant mappings

 hw/arm/xen_arm.c|   5 +
 hw/xen/xen-hvm-common.c |  18 ++-
 hw/xen/xen-mapcache.c   | 234 
 include/hw/xen/xen-hvm-common.h |   3 +
 include/sysemu/xen-mapcache.h   |   2 +
 include/sysemu/xen.h|  15 ++
 system/physmem.c|  12 +-
 7 files changed, 224 insertions(+), 65 deletions(-)


base-commit: 70581940cabcc51b329652becddfbc6a261b1b83
-- 
2.40.1
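
The per-instance bucket arithmetic that patch 1/8 of this series introduces can be sketched in isolation. This is a simplified illustration, not the QEMU code: struct bucket_cfg and the helper names are hypothetical, and PAGE_SHIFT is assumed to be 12 as a stand-in for XC_PAGE_SHIFT.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumption: 4 KiB pages, stand-in for XC_PAGE_SHIFT */

/* Hypothetical, simplified stand-in for the per-instance mapcache fields. */
struct bucket_cfg {
    unsigned int bucket_shift;
    uint64_t bucket_size;
};

static struct bucket_cfg bucket_cfg_init(unsigned int bucket_shift)
{
    struct bucket_cfg cfg;

    /* A bucket can never be smaller than one page. */
    assert(bucket_shift >= PAGE_SHIFT && bucket_shift < 64);
    cfg.bucket_shift = bucket_shift;
    cfg.bucket_size = UINT64_C(1) << bucket_shift;
    return cfg;
}

/* Which bucket a physical address falls into. */
static uint64_t bucket_index(const struct bucket_cfg *cfg, uint64_t phys_addr)
{
    return phys_addr >> cfg->bucket_shift;
}

/* Byte offset of the address inside its bucket. */
static uint64_t bucket_offset(const struct bucket_cfg *cfg, uint64_t phys_addr)
{
    return phys_addr & (cfg->bucket_size - 1);
}

/* Number of buckets needed to cover max_size bytes, rounding up. */
static uint64_t bucket_count(const struct bucket_cfg *cfg, uint64_t max_size)
{
    unsigned int pages_shift = cfg->bucket_shift - PAGE_SHIFT;

    return ((max_size >> PAGE_SHIFT) + (UINT64_C(1) << pages_shift) - 1)
           >> pages_shift;
}
```

With a shift of 20 each bucket covers 1 MiB; with a shift equal to the page shift each bucket covers exactly one page, which is the granularity patch 7/8 uses for the grant mapcache.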




[PATCH v7 8/8] hw/arm: xen: Enable use of grant mappings

2024-05-24 Thread Edgar E. Iglesias
From: "Edgar E. Iglesias" 

Signed-off-by: Edgar E. Iglesias 
Reviewed-by: Stefano Stabellini 
Reviewed-by: Manos Pitsidianakis 
---
 hw/arm/xen_arm.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/arm/xen_arm.c b/hw/arm/xen_arm.c
index 15fa7dfa84..6fad829ede 100644
--- a/hw/arm/xen_arm.c
+++ b/hw/arm/xen_arm.c
@@ -125,6 +125,11 @@ static void xen_init_ram(MachineState *machine)
  GUEST_RAM1_BASE, ram_size[1]);
 memory_region_add_subregion(sysmem, GUEST_RAM1_BASE, &ram_hi);
 }
+
+/* Setup support for grants.  */
+memory_region_init_ram(&xen_grants, NULL, "xen.grants", block_len,
+   &error_fatal);
+memory_region_add_subregion(sysmem, XEN_GRANT_ADDR_OFF, &xen_grants);
 }
 
 void arch_handle_ioreq(XenIOState *state, ioreq_t *req)
-- 
2.40.1




[PATCH v7 6/8] xen: mapcache: Pass the ram_addr offset to xen_map_cache()

2024-05-24 Thread Edgar E. Iglesias
From: "Edgar E. Iglesias" 

Pass the ram_addr offset to xen_map_cache.
This is in preparation for adding grant mappings that need
to compute the address within the RAMBlock.

No functional changes.

Signed-off-by: Edgar E. Iglesias 
Reviewed-by: David Hildenbrand 
Reviewed-by: Stefano Stabellini 
---
 hw/xen/xen-mapcache.c | 16 +++-
 include/sysemu/xen-mapcache.h |  2 ++
 system/physmem.c  |  9 +
 3 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
index ec95445696..a07c47b0b1 100644
--- a/hw/xen/xen-mapcache.c
+++ b/hw/xen/xen-mapcache.c
@@ -167,7 +167,8 @@ static void xen_remap_bucket(MapCache *mc,
  void *vaddr,
  hwaddr size,
  hwaddr address_index,
- bool dummy)
+ bool dummy,
+ ram_addr_t ram_offset)
 {
 uint8_t *vaddr_base;
 xen_pfn_t *pfns;
@@ -266,6 +267,7 @@ static void xen_remap_bucket(MapCache *mc,
 
 static uint8_t *xen_map_cache_unlocked(MapCache *mc,
hwaddr phys_addr, hwaddr size,
+   ram_addr_t ram_offset,
uint8_t lock, bool dma, bool is_write)
 {
 MapCacheEntry *entry, *pentry = NULL,
@@ -337,14 +339,16 @@ tryagain:
 if (!entry) {
 entry = g_new0(MapCacheEntry, 1);
 pentry->next = entry;
-xen_remap_bucket(mc, entry, NULL, cache_size, address_index, dummy);
+xen_remap_bucket(mc, entry, NULL, cache_size, address_index, dummy,
+ ram_offset);
 } else if (!entry->lock) {
 if (!entry->vaddr_base || entry->paddr_index != address_index ||
 entry->size != cache_size ||
 !test_bits(address_offset >> XC_PAGE_SHIFT,
 test_bit_size >> XC_PAGE_SHIFT,
 entry->valid_mapping)) {
-xen_remap_bucket(mc, entry, NULL, cache_size, address_index, dummy);
+xen_remap_bucket(mc, entry, NULL, cache_size, address_index, dummy,
+ ram_offset);
 }
 }
 
@@ -391,13 +395,15 @@ tryagain:
 
 uint8_t *xen_map_cache(MemoryRegion *mr,
hwaddr phys_addr, hwaddr size,
+   ram_addr_t ram_addr_offset,
uint8_t lock, bool dma,
bool is_write)
 {
 uint8_t *p;
 
 mapcache_lock(mapcache);
-p = xen_map_cache_unlocked(mapcache, phys_addr, size, lock, dma, is_write);
+p = xen_map_cache_unlocked(mapcache, phys_addr, size, ram_addr_offset,
+   lock, dma, is_write);
 mapcache_unlock(mapcache);
 return p;
 }
@@ -632,7 +638,7 @@ static uint8_t *xen_replace_cache_entry_unlocked(MapCache *mc,
 trace_xen_replace_cache_entry_dummy(old_phys_addr, new_phys_addr);
 
 xen_remap_bucket(mc, entry, entry->vaddr_base,
- cache_size, address_index, false);
+ cache_size, address_index, false, old_phys_addr);
 if (!test_bits(address_offset >> XC_PAGE_SHIFT,
 test_bit_size >> XC_PAGE_SHIFT,
 entry->valid_mapping)) {
diff --git a/include/sysemu/xen-mapcache.h b/include/sysemu/xen-mapcache.h
index 1ec9e66752..b5e3ea1bc0 100644
--- a/include/sysemu/xen-mapcache.h
+++ b/include/sysemu/xen-mapcache.h
@@ -19,6 +19,7 @@ typedef hwaddr (*phys_offset_to_gaddr_t)(hwaddr phys_offset,
 void xen_map_cache_init(phys_offset_to_gaddr_t f,
 void *opaque);
 uint8_t *xen_map_cache(MemoryRegion *mr, hwaddr phys_addr, hwaddr size,
+   ram_addr_t ram_addr_offset,
uint8_t lock, bool dma,
bool is_write);
 ram_addr_t xen_ram_addr_from_mapcache(void *ptr);
@@ -37,6 +38,7 @@ static inline void xen_map_cache_init(phys_offset_to_gaddr_t f,
 static inline uint8_t *xen_map_cache(MemoryRegion *mr,
  hwaddr phys_addr,
  hwaddr size,
+ ram_addr_t ram_addr_offset,
  uint8_t lock,
  bool dma,
  bool is_write)
diff --git a/system/physmem.c b/system/physmem.c
index b7847db1a2..33d09f7571 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2231,13 +2231,14 @@ static void *qemu_ram_ptr_length(RAMBlock *block, ram_addr_t addr,
  */
 if (xen_mr_is_memory(block->mr)) {
 return xen_map_cache(block->mr, block->offset + addr,
- len, lock, lock,
- is_write);
+ len, block->offset,
+ lock, lock, is_write);
 }
 
 block->

[PATCH v7 2/8] xen: mapcache: Unmap first entries in buckets

2024-05-24 Thread Edgar E. Iglesias
From: "Edgar E. Iglesias" 

When invalidating memory ranges, if we happen to hit the first
entry in a bucket we were never unmapping it. This was harmless
for foreign mappings but now that we're looking to reuse the
mapcache for transient grant mappings, we must unmap entries
when invalidated.

Signed-off-by: Edgar E. Iglesias 
Reviewed-by: Stefano Stabellini 
---
 hw/xen/xen-mapcache.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
index bc860f4373..ec95445696 100644
--- a/hw/xen/xen-mapcache.c
+++ b/hw/xen/xen-mapcache.c
@@ -491,18 +491,23 @@ static void xen_invalidate_map_cache_entry_unlocked(MapCache *mc,
 return;
 }
 entry->lock--;
-if (entry->lock > 0 || pentry == NULL) {
+if (entry->lock > 0) {
 return;
 }
 
-pentry->next = entry->next;
 ram_block_notify_remove(entry->vaddr_base, entry->size, entry->size);
 if (munmap(entry->vaddr_base, entry->size) != 0) {
 perror("unmap fails");
 exit(-1);
 }
+
 g_free(entry->valid_mapping);
-g_free(entry);
+if (pentry) {
+pentry->next = entry->next;
+g_free(entry);
+} else {
+memset(entry, 0, sizeof *entry);
+}
 }
 
 typedef struct XenMapCacheData {
-- 
2.40.1
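
The head-versus-chained distinction in the patch above can be sketched with a simplified list. This is an illustration under assumptions, not the QEMU code: struct entry and release_entry() are hypothetical, free() stands in for munmap(), and the first entry of a bucket is assumed to live inside the bucket array while later entries are heap-allocated. Unlike the plain memset() in the patch, this sketch keeps the head's next pointer so that chained entries stay reachable; that is a choice of the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical, simplified entry: the first entry of each bucket is
 * embedded in the bucket array, later entries are heap-allocated.
 */
struct entry {
    void *mapping;       /* stand-in for the mapped region */
    struct entry *next;
};

/*
 * Release `e`, whose predecessor in the chain is `prev` (NULL when `e`
 * is the array-embedded head). Returns 1 if the node itself was freed,
 * 0 if it was only cleared.
 */
static int release_entry(struct entry *prev, struct entry *e)
{
    struct entry *next;

    assert(e != NULL);
    free(e->mapping);            /* always drop the mapping itself */

    if (prev) {
        prev->next = e->next;    /* unlink, then free the node */
        free(e);
        return 1;
    }

    /* The head is part of the array: clear it, but keep the chain. */
    next = e->next;
    memset(e, 0, sizeof *e);
    e->next = next;
    return 0;
}
```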




[PATCH v7 7/8] xen: mapcache: Add support for grant mappings

2024-05-24 Thread Edgar E. Iglesias
From: "Edgar E. Iglesias" 

Add a second mapcache for grant mappings. The mapcache for
grants needs to work with XC_PAGE_SIZE granularity since
we can't map larger ranges than what has been granted to us.

Like with foreign mappings (xen_memory), machines using grants
are expected to initialize the xen_grants MR and map it
into their address-map accordingly.

CC: Manos Pitsidianakis 
Signed-off-by: Edgar E. Iglesias 
Reviewed-by: Stefano Stabellini 
---
 hw/xen/xen-hvm-common.c |  12 ++-
 hw/xen/xen-mapcache.c   | 165 +---
 include/hw/xen/xen-hvm-common.h |   3 +
 include/sysemu/xen.h|   7 ++
 4 files changed, 150 insertions(+), 37 deletions(-)

diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c
index a0a0252da0..b8ace1c368 100644
--- a/hw/xen/xen-hvm-common.c
+++ b/hw/xen/xen-hvm-common.c
@@ -10,12 +10,18 @@
 #include "hw/boards.h"
 #include "hw/xen/arch_hvm.h"
 
-MemoryRegion xen_memory;
+MemoryRegion xen_memory, xen_grants;
 
-/* Check for xen memory.  */
+/* Check for any kind of xen memory, foreign mappings or grants.  */
 bool xen_mr_is_memory(MemoryRegion *mr)
 {
-return mr == &xen_memory;
+return mr == &xen_memory || mr == &xen_grants;
+}
+
+/* Check specifically for grants.  */
+bool xen_mr_is_grants(MemoryRegion *mr)
+{
+return mr == &xen_grants;
 }
 
 void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr,
diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
index a07c47b0b1..5f23b0adbe 100644
--- a/hw/xen/xen-mapcache.c
+++ b/hw/xen/xen-mapcache.c
@@ -14,6 +14,7 @@
 
 #include 
 
+#include "hw/xen/xen-hvm-common.h"
 #include "hw/xen/xen_native.h"
 #include "qemu/bitmap.h"
 
@@ -21,6 +22,8 @@
 #include "sysemu/xen-mapcache.h"
 #include "trace.h"
 
+#include 
+#include 
 
 #if HOST_LONG_BITS == 32
 #  define MCACHE_MAX_SIZE (1UL<<31) /* 2GB Cap */
@@ -41,6 +44,7 @@ typedef struct MapCacheEntry {
 unsigned long *valid_mapping;
 uint32_t lock;
 #define XEN_MAPCACHE_ENTRY_DUMMY (1 << 0)
+#define XEN_MAPCACHE_ENTRY_GRANT (1 << 1)
 uint8_t flags;
 hwaddr size;
 struct MapCacheEntry *next;
@@ -71,6 +75,8 @@ typedef struct MapCache {
 } MapCache;
 
 static MapCache *mapcache;
+static MapCache *mapcache_grants;
+static xengnttab_handle *xen_region_gnttabdev;
 
 static inline void mapcache_lock(MapCache *mc)
 {
@@ -131,6 +137,12 @@ void xen_map_cache_init(phys_offset_to_gaddr_t f, void *opaque)
 unsigned long max_mcache_size;
 unsigned int bucket_shift;
 
+xen_region_gnttabdev = xengnttab_open(NULL, 0);
+if (xen_region_gnttabdev == NULL) {
+error_report("mapcache: Failed to open gnttab device");
+exit(EXIT_FAILURE);
+}
+
 if (HOST_LONG_BITS == 32) {
 bucket_shift = 16;
 } else {
@@ -159,6 +171,15 @@ void xen_map_cache_init(phys_offset_to_gaddr_t f, void *opaque)
 mapcache = xen_map_cache_init_single(f, opaque,
  bucket_shift,
  max_mcache_size);
+
+/*
+ * Grant mappings must use XC_PAGE_SIZE granularity since we can't
+ * map anything beyond the number of pages granted to us.
+ */
+mapcache_grants = xen_map_cache_init_single(f, opaque,
+XC_PAGE_SHIFT,
+max_mcache_size);
+
 setrlimit(RLIMIT_AS, &rlimit_as);
 }
 
@@ -168,17 +189,24 @@ static void xen_remap_bucket(MapCache *mc,
  hwaddr size,
  hwaddr address_index,
  bool dummy,
+ bool grant,
+ bool is_write,
  ram_addr_t ram_offset)
 {
 uint8_t *vaddr_base;
-xen_pfn_t *pfns;
-int *err;
+g_autofree uint32_t *refs = NULL;
+g_autofree xen_pfn_t *pfns = NULL;
+g_autofree int *err;
 unsigned int i;
 hwaddr nb_pfn = size >> XC_PAGE_SHIFT;
 
 trace_xen_remap_bucket(address_index);
 
-pfns = g_new0(xen_pfn_t, nb_pfn);
+if (grant) {
+refs = g_new0(uint32_t, nb_pfn);
+} else {
+pfns = g_new0(xen_pfn_t, nb_pfn);
+}
 err = g_new0(int, nb_pfn);
 
 if (entry->vaddr_base != NULL) {
@@ -207,21 +235,51 @@ static void xen_remap_bucket(MapCache *mc,
 g_free(entry->valid_mapping);
 entry->valid_mapping = NULL;
 
-for (i = 0; i < nb_pfn; i++) {
-pfns[i] = (address_index << (mc->bucket_shift - XC_PAGE_SHIFT)) + i;
+if (grant) {
+hwaddr grant_base = address_index - (ram_offset >> XC_PAGE_SHIFT);
+
+for (i = 0; i < nb_pfn; i++) {
+refs[i] = grant_base + i;
+}
+} else {
+for (i = 0; i < nb_pfn; i++) {
+pfns[i] = (address_index << (mc->bucket_shift - XC_PAGE_SHIFT)) + i;
+}
 }
 
-/*
- * If the caller has requested the mapping at a specific address use
- * 

Re: [PATCH v2 01/18] migration: Fix file migration with fdset

2024-05-24 Thread Prasad Pandit
On Fri, 24 May 2024 at 00:38, Fabiano Rosas  wrote:
> This is further indicated by the presence of the 'offset'
> argument, which indicates the start of the region where QEMU is
> allowed to write.
>
> Fix the issue by replacing the O_TRUNC flag on open by an ftruncate
> call, which will take the offset into consideration.
>
> +if (ftruncate(fioc->fd, offset)) {
> +error_setg_errno(errp, errno,
> + "failed to truncate migration file to offset %" 
> PRIx64,
> + offset);
> +object_unref(OBJECT(fioc));
> +return;
> +}
> +

* Should 'offset' be checked for > zero while ftruncating? Else it'll
be same as O_TRUNC. Otherwise it looks fine.

Reviewed-by: Prasad Pandit 

Thank you.
---
  - Prasad




Re: [PULL 10/10] hw/loongarch/virt: Fix FDT memory node address width

2024-05-24 Thread Jiaxun Yang



On May 24, 2024, at 11:10 AM, Michael Tokarev wrote:
> 23.05.2024 04:46, Song Gao wrote:
>> From: Jiaxun Yang 
>> 
>> Higher bits for memory nodes were omitted at qemu_fdt_setprop_cells.
>> 
>> Cc: qemu-sta...@nongnu.org
>> Signed-off-by: Jiaxun Yang 
>> Reviewed-by: Song Gao 
>> Message-Id: <20240520-loongarch-fdt-memnode-v1-1-5ea9be939...@flygoat.com>
>> Signed-off-by: Song Gao 
>> ---
>>   hw/loongarch/virt.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
>> index e3bdf085b5..3e6e93edf3 100644
>> --- a/hw/loongarch/virt.c
>> +++ b/hw/loongarch/virt.c
>> @@ -464,7 +464,8 @@ static void fdt_add_memory_node(MachineState *ms,
>>   char *nodename = g_strdup_printf("/memory@%" PRIx64, base);
>>   
>>   qemu_fdt_add_subnode(ms->fdt, nodename);
>> -qemu_fdt_setprop_cells(ms->fdt, nodename, "reg", 0, base, 0, size);
>> +qemu_fdt_setprop_cells(ms->fdt, nodename, "reg", base >> 32, base,
>> +   size >> 32, size);
>>   qemu_fdt_setprop_string(ms->fdt, nodename, "device_type", "memory");
>>   
>>   if (ms->numa_state && ms->numa_state->num_nodes) {
>
> This commit changes exactly the same place as the previous commit,
> v9.0.0-274-gb11f981452, "hw/loongarch: Fix fdt memory node wrong 'reg'".
>
> Was it the wrong fix?

Yes, I believe the previous commit was the wrong fix for the same problem.

>
> Note the previous commit isn't in any released version of qemu.  So
> when picking up for any stable release, both need to be picked up :)

Please go ahead!

Thanks
- Jiaxun

>
> Thanks,
>
> /mjt
> -- 
> GPG Key transition (from rsa2048 to rsa4096) since 2024-04-24.
> New key: rsa4096/61AD3D98ECDF2C8E  9D8B E14E 3F2A 9DD7 9199  28F1 61AD 
> 3D98 ECDF 2C8E
> Old key: rsa2048/457CE0A0804465C5  6EE1 95D1 886E 8FFB 810D  4324 457C 
> E0A0 8044 65C5
> Transition statement: http://www.corpit.ru/mjt/gpg-transition-2024.txt

-- 
- Jiaxun



Re: [PATCH 2/2] scsi-disk: Fix crash for VM configured with USB CDROM after live migration

2024-05-24 Thread Yong Huang
On Fri, May 24, 2024 at 6:01 PM Prasad Pandit  wrote:

> Hello Hyman,
>
> * Is this the same patch series as sent before..?
>   ->
> https://lists.nongnu.org/archive/html/qemu-devel/2024-04/msg00816.html

Yes, exactly the same; I just refined the comment.


>
>
> On Fri, 24 May 2024 at 12:02, Hyman Huang  wrote:
> > For VMs configured with the USB CDROM device:
> >
> > -drive file=/path/to/local/file,id=drive-usb-disk0,media=cdrom,readonly=on...
> > -device usb-storage,drive=drive-usb-disk0,id=usb-disk0...
> >
> > QEMU process may crash after live migration,
> > Do the live migration repeatedly, crash may happen after live migration,
>
> * Does live migration work many times before QEMU crashes on the
> destination side? OR QEMU crashes at the very first migration?
>
> >at /usr/src/debug/qemu-6-6.2.0-75.7.oe1.smartx.git.40.x86_64/include/qemu/iov.h:49
>
> * This qemu version looks quite old. Is the issue reproducible with
> the latest QEMU version 9.0?
>

I haven't tested the latest QEMU version, though in theory the issue is
reproducible there. I'll check and report back.


>
> > diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> > +static void scsi_disk_emulate_save_request(QEMUFile *f, SCSIRequest *req)
> > +{
> > +SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req);
> > +SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
> > +
> > +if (s->migrate_emulate_scsi_request) {
> > +scsi_disk_save_request(f, req);
> > +}
> > +}
> > +
> >  static void scsi_disk_load_request(QEMUFile *f, SCSIRequest *req)
> >  {
> >  SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req);
> > @@ -183,6 +193,16 @@ static void scsi_disk_load_request(QEMUFile *f, SCSIRequest *req)
> >  qemu_iovec_init_external(&r->qiov, &r->iov, 1);
> >  }
> >
> > +static void scsi_disk_emulate_load_request(QEMUFile *f, SCSIRequest *req)
> > +{
> > +SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req);
> > +SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
> > +
> > +if (s->migrate_emulate_scsi_request) {
> > +scsi_disk_load_request(f, req);
> > +}
> > +}
> > +
> >  /*
> >   * scsi_handle_rw_error has two return values.  False means that the error
> >   * must be ignored, true means that the error has been processed and the
> > @@ -2593,6 +2613,8 @@ static const SCSIReqOps scsi_disk_emulate_reqops = {
> >  .read_data= scsi_disk_emulate_read_data,
> >  .write_data   = scsi_disk_emulate_write_data,
> >  .get_buf  = scsi_get_buf,
> > +.load_request = scsi_disk_emulate_load_request,
> > +.save_request = scsi_disk_emulate_save_request,
> >  };
> >
> >  static const SCSIReqOps scsi_disk_dma_reqops = {
> > @@ -3137,7 +3159,7 @@ static Property scsi_hd_properties[] = {
> >  static int scsi_disk_pre_save(void *opaque)
> >  {
> >  SCSIDiskState *dev = opaque;
> > -dev->migrate_emulate_scsi_request = false;
> > +dev->migrate_emulate_scsi_request = true;
> >
>
> * This patch seems to add support for migrating SCSI requests. While
> it looks okay, not sure if it is required, how likely is someone to
> configure a VM to use CDROM?
>

I'm not sure how common this usage is, but it is used in our
production environment.


>
> *  Should the CDROM device be reset on the destination if no requests
> are found? ie. if (scsi_req_get_buf -> scsi_get_buf() returns NULL)?
>

IMHO, resetting the CDROM device would only be a workaround, because
the request *SHOULD* not be lost in the first place. A missing request
may also have other causes, so resetting the CDROM seems crude.
The path that executes scsi_get_buf() is in a USB mass storage
device, and it is originally called by the UHCI controller, which just
handles the Frame List blindly, so a reset-based solution would be
rather complicated to implement.

Migrating the requests may be a graceful solution.

Thanks for the comments,
Yong


> Thank you.
> ---
>   - Prasad
>
>

-- 
Best regards


Re: [PATCH 1/1] target/riscv: Support Zama16b extension

2024-05-24 Thread Daniel Henrique Barboza




On 5/22/24 06:13, LIU Zhiwei wrote:

Zama16b is the property that misaligned load/stores/atomics within
a naturally aligned 16-byte region are atomic.

According to the specification, Zama16b applies only to AMOs, loads
and stores defined in the base ISAs, and loads and stores of no more
than XLEN bits defined in the F, D, and Q extensions. Thus it should
not apply to zacas or RVC instructions.

For an instruction in that set, if all accessed bytes lie within a 16B granule,
the instruction will not raise an exception for reasons of address alignment,
and the instruction will give rise to only one memory operation for the
purposes of RVWMO—i.e., it will execute atomically.
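
The "all accessed bytes lie within a 16B granule" condition above can be
sketched as a small address check. This is only an illustration under stated
assumptions: within_16b_granule() is a hypothetical helper that does not
appear in the patch, and the access size is assumed to be between 1 and 16
bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: returns true when the
 * access [addr, addr + size) stays inside one naturally aligned
 * 16-byte granule.  size is assumed to be in the range 1..16.
 */
static bool within_16b_granule(uint64_t addr, unsigned size)
{
    /* Granule base address of the first and of the last accessed byte. */
    uint64_t first = addr & ~(uint64_t)15;
    uint64_t last = (addr + size - 1) & ~(uint64_t)15;

    return first == last;
}
```

For example, a 4-byte access at 0x1003 stays inside the granule starting at
0x1000, while a 4-byte access at 0x100e crosses into the next granule and
would not be covered by the property.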

Signed-off-by: LIU Zhiwei 
---
  target/riscv/cpu.c  |  2 ++
  target/riscv/cpu_cfg.h  |  1 +
  target/riscv/insn_trans/trans_rva.c.inc | 42 ++---
  target/riscv/insn_trans/trans_rvd.c.inc | 14 +++--
  target/riscv/insn_trans/trans_rvf.c.inc | 14 +++--
  target/riscv/insn_trans/trans_rvi.c.inc |  6 
  6 files changed, 57 insertions(+), 22 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index eb1a2e7d6d..911e9892ed 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -117,6 +117,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
  ISA_EXT_DATA_ENTRY(za64rs, PRIV_VERSION_1_12_0, has_priv_1_11),
  ISA_EXT_DATA_ENTRY(zaamo, PRIV_VERSION_1_12_0, ext_zaamo),
  ISA_EXT_DATA_ENTRY(zacas, PRIV_VERSION_1_12_0, ext_zacas),
+ISA_EXT_DATA_ENTRY(zama16b, PRIV_VERSION_1_12_0, ext_zama16b),


Is this the right order? Shouldn't it be after zalrsc?


LGTM otherwise. Thanks,


Daniel
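
The ordering question above can be checked mechanically. The sketch below
assumes that the "za*" entries are meant to be kept in alphabetical order,
which is what the review comment implies but does not state. names_sorted()
and the two arrays are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true if names[0..n) are in strictly ascending
 * strcmp() order.
 */
static bool names_sorted(const char *const *names, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (strcmp(names[i - 1], names[i]) >= 0) {
            return false;
        }
    }
    return true;
}

/* Order of the "za*" entries as they appear in the patch. */
static const char *const order_in_patch[] = {
    "zaamo", "zacas", "zama16b", "zalrsc", "zawrs",
};

/* Order with zama16b moved after zalrsc, as suggested in the review. */
static const char *const order_suggested[] = {
    "zaamo", "zacas", "zalrsc", "zama16b", "zawrs",
};
```

Under that assumption, names_sorted() rejects the order used in the patch
and accepts the suggested one, because "zalrsc" compares lower than
"zama16b".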


  ISA_EXT_DATA_ENTRY(zalrsc, PRIV_VERSION_1_12_0, ext_zalrsc),
  ISA_EXT_DATA_ENTRY(zawrs, PRIV_VERSION_1_12_0, ext_zawrs),
  ISA_EXT_DATA_ENTRY(zfa, PRIV_VERSION_1_12_0, ext_zfa),
@@ -1464,6 +1465,7 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
  MULTI_EXT_CFG_BOOL("zihintntl", ext_zihintntl, true),
  MULTI_EXT_CFG_BOOL("zihintpause", ext_zihintpause, true),
  MULTI_EXT_CFG_BOOL("zacas", ext_zacas, false),
+MULTI_EXT_CFG_BOOL("zama16b", ext_zama16b, false),
  MULTI_EXT_CFG_BOOL("zaamo", ext_zaamo, false),
  MULTI_EXT_CFG_BOOL("zalrsc", ext_zalrsc, false),
  MULTI_EXT_CFG_BOOL("zawrs", ext_zawrs, true),
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index cb750154bd..eaa66eb4f8 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -81,6 +81,7 @@ struct RISCVCPUConfig {
  bool ext_zdinx;
  bool ext_zaamo;
  bool ext_zacas;
+bool ext_zama16b;
  bool ext_zalrsc;
  bool ext_zawrs;
  bool ext_zfa;
diff --git a/target/riscv/insn_trans/trans_rva.c.inc 
b/target/riscv/insn_trans/trans_rva.c.inc
index 4a9e4591d1..eb080baddd 100644
--- a/target/riscv/insn_trans/trans_rva.c.inc
+++ b/target/riscv/insn_trans/trans_rva.c.inc
@@ -103,6 +103,12 @@ static bool gen_amo(DisasContext *ctx, arg_atomic *a,
  TCGv dest = dest_gpr(ctx, a->rd);
  TCGv src1, src2 = get_gpr(ctx, a->rs2, EXT_NONE);
  
+if (ctx->cfg_ptr->ext_zama16b) {

+mop |= MO_ATOM_WITHIN16;
+} else {
+mop |= MO_ALIGN;
+}
+
  decode_save_opc(ctx);
  src1 = get_address(ctx, a->rs1, 0);
  func(dest, src1, src2, ctx->mem_idx, mop);
@@ -126,55 +132,55 @@ static bool trans_sc_w(DisasContext *ctx, arg_sc_w *a)
  static bool trans_amoswap_w(DisasContext *ctx, arg_amoswap_w *a)
  {
  REQUIRE_A_OR_ZAAMO(ctx);
-return gen_amo(ctx, a, &tcg_gen_atomic_xchg_tl, (MO_ALIGN | MO_TESL));
+return gen_amo(ctx, a, &tcg_gen_atomic_xchg_tl, MO_TESL);
  }
  
  static bool trans_amoadd_w(DisasContext *ctx, arg_amoadd_w *a)

  {
  REQUIRE_A_OR_ZAAMO(ctx);
-return gen_amo(ctx, a, &tcg_gen_atomic_fetch_add_tl, (MO_ALIGN | MO_TESL));
+return gen_amo(ctx, a, &tcg_gen_atomic_fetch_add_tl, MO_TESL);
  }
  
  static bool trans_amoxor_w(DisasContext *ctx, arg_amoxor_w *a)

  {
  REQUIRE_A_OR_ZAAMO(ctx);
-return gen_amo(ctx, a, &tcg_gen_atomic_fetch_xor_tl, (MO_ALIGN | MO_TESL));
+return gen_amo(ctx, a, &tcg_gen_atomic_fetch_xor_tl, MO_TESL);
  }
  
  static bool trans_amoand_w(DisasContext *ctx, arg_amoand_w *a)

  {
  REQUIRE_A_OR_ZAAMO(ctx);
-return gen_amo(ctx, a, &tcg_gen_atomic_fetch_and_tl, (MO_ALIGN | MO_TESL));
+return gen_amo(ctx, a, &tcg_gen_atomic_fetch_and_tl, MO_TESL);
  }
  
  static bool trans_amoor_w(DisasContext *ctx, arg_amoor_w *a)

  {
  REQUIRE_A_OR_ZAAMO(ctx);
-return gen_amo(ctx, a, &tcg_gen_atomic_fetch_or_tl, (MO_ALIGN | MO_TESL));
+return gen_amo(ctx, a, &tcg_gen_atomic_fetch_or_tl, MO_TESL);
  }
  
  static bool trans_amomin_w(DisasContext *ctx, arg_amomin_w *a)

  {
  REQUIRE_A_OR_ZAAMO(ctx);
-return gen_amo(ctx, a, &tcg_gen_atomic_fetch_smin_tl, (MO_ALIGN | MO_TESL));
+return gen_amo(ctx, a, &tcg_gen_atomic_fetch_smin_tl, MO_TESL);
  }
  
  static bool trans_amomax_w(DisasContext *ctx, arg_amomax_w *a)


[PATCH v2 0/2] Fix GICv2 handling of pending interrupts

2024-05-24 Thread Sebastian Huber
v2:

* Fix handling of SPIs.

* Remove pending state if not in new target list.

Sebastian Huber (2):
  hw/intc/arm_gic: Fix set pending of PPIs
  hw/intc/arm_gic: Fix writes to GICD_ITARGETSRn

 hw/intc/arm_gic.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

-- 
2.35.3




[PATCH v2 2/2] hw/intc/arm_gic: Fix writes to GICD_ITARGETSRn

2024-05-24 Thread Sebastian Huber
According to the GICv2 specification section 4.3.12, "Interrupt Processor
Targets Registers, GICD_ITARGETSRn":

"Any change to a CPU targets field value:
[...]
* Has an effect on any pending interrupts. This means:
  - adding a CPU interface to the target list of a pending interrupt makes that
interrupt pending on that CPU interface
  - removing a CPU interface from the target list of a pending interrupt
removes the pending state of that interrupt on that CPU interface."
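
The quoted rule can be sketched as a pure function on CPU bit masks. This is
a simplified illustration rather than the QEMU code: retarget_pending() is a
hypothetical name, and the pending state is assumed to be one bit per CPU
interface, as in the patch below.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the CPUs an interrupt is currently pending
 * on and the newly written target list, return the CPUs it is pending
 * on afterwards.  An interrupt that is not pending stays not pending.
 */
static uint8_t retarget_pending(uint8_t pending, uint8_t new_targets,
                                uint8_t all_cpu_mask)
{
    if (pending == 0) {
        return 0;
    }
    /* Pending on exactly the CPU interfaces in the new target list. */
    return new_targets & all_cpu_mask;
}
```

With two CPUs (all_cpu_mask 0x03), moving a pending interrupt from CPU 0 to
CPU 1 turns a pending mask of 0x01 into 0x02, which covers both the "adding"
and the "removing" case of the quoted text.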

Signed-off-by: Sebastian Huber 
---
 hw/intc/arm_gic.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
index 241255081d..1f9bffc88c 100644
--- a/hw/intc/arm_gic.c
+++ b/hw/intc/arm_gic.c
@@ -1410,6 +1410,13 @@ static void gic_dist_writeb(void *opaque, hwaddr offset,
 value = ALL_CPU_MASK;
 }
 s->irq_target[irq] = value & ALL_CPU_MASK;
+if (irq >= GIC_INTERNAL && s->irq_state[irq].pending) {
+/*
+ * Changing the target of an interrupt that is currently
+ * pending updates the set of CPUs it is pending on.
+ */
+s->irq_state[irq].pending = value & ALL_CPU_MASK;
+}
 }
 } else if (offset < 0xf00) {
 /* Interrupt Configuration.  */
-- 
2.35.3




[PATCH v2 1/2] hw/intc/arm_gic: Fix set pending of PPIs

2024-05-24 Thread Sebastian Huber
According to the GICv2 specification section 4.3.7, "Interrupt Set-Pending
Registers, GICD_ISPENDRn":

"In a multiprocessor implementation, GICD_ISPENDR0 is banked for each connected
processor. This register holds the Set-pending bits for interrupts 0-31."
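
A minimal sketch of the banking rule quoted above, assuming one pending bit
per CPU: for interrupts 0-31 only the writing CPU is affected, while for
higher interrupt numbers the configured target list applies.
set_pending_mask() and SKETCH_GIC_INTERNAL are hypothetical names used only
for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Interrupts 0-31 are banked per CPU (assumption taken from the quote). */
#define SKETCH_GIC_INTERNAL 32

/*
 * Hypothetical helper: select the CPUs on which a write to the
 * set-pending register makes the interrupt pending.
 */
static uint8_t set_pending_mask(unsigned irq, unsigned cpu,
                                uint8_t target_mask)
{
    if (irq < SKETCH_GIC_INTERNAL) {
        /* Banked interrupt: only the CPU doing the write is affected. */
        return (uint8_t)(1u << cpu);
    }
    /* Shared interrupt: use its configured target list. */
    return target_mask;
}
```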

Signed-off-by: Sebastian Huber 
---
 hw/intc/arm_gic.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
index 074cf50af2..241255081d 100644
--- a/hw/intc/arm_gic.c
+++ b/hw/intc/arm_gic.c
@@ -1308,12 +1308,15 @@ static void gic_dist_writeb(void *opaque, hwaddr offset,
 
 for (i = 0; i < 8; i++) {
 if (value & (1 << i)) {
+int mask = (irq < GIC_INTERNAL) ? (1 << cpu)
+: GIC_DIST_TARGET(irq + i);
+
 if (s->security_extn && !attrs.secure &&
 !GIC_DIST_TEST_GROUP(irq + i, 1 << cpu)) {
 continue; /* Ignore Non-secure access of Group0 IRQ */
 }
 
-GIC_DIST_SET_PENDING(irq + i, GIC_DIST_TARGET(irq + i));
+GIC_DIST_SET_PENDING(irq + i, mask);
 }
 }
 } else if (offset < 0x300) {
-- 
2.35.3




Re: [PATCH 5/6] target/riscv: Enable zabha for max cpu

2024-05-24 Thread Daniel Henrique Barboza




On 5/23/24 09:40, LIU Zhiwei wrote:

Signed-off-by: LIU Zhiwei 
---
  target/riscv/cpu.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 21d4e36405..9ec03a1edc 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -118,6 +118,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
  ISA_EXT_DATA_ENTRY(zaamo, PRIV_VERSION_1_12_0, ext_zaamo),
  ISA_EXT_DATA_ENTRY(zacas, PRIV_VERSION_1_12_0, ext_zacas),
  ISA_EXT_DATA_ENTRY(zama16b, PRIV_VERSION_1_12_0, ext_zama16b),
+ISA_EXT_DATA_ENTRY(zabha, PRIV_VERSION_1_12_0, ext_zabha),


I think this should be placed right after zaamo. Thanks,


Daniel


  ISA_EXT_DATA_ENTRY(zalrsc, PRIV_VERSION_1_12_0, ext_zalrsc),
  ISA_EXT_DATA_ENTRY(zawrs, PRIV_VERSION_1_12_0, ext_zawrs),
  ISA_EXT_DATA_ENTRY(zfa, PRIV_VERSION_1_12_0, ext_zfa),
@@ -1470,6 +1471,7 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
  MULTI_EXT_CFG_BOOL("zcmop", ext_zcmop, false),
  MULTI_EXT_CFG_BOOL("zacas", ext_zacas, false),
  MULTI_EXT_CFG_BOOL("zama16b", ext_zama16b, false),
+MULTI_EXT_CFG_BOOL("zabha", ext_zabha, false),
  MULTI_EXT_CFG_BOOL("zaamo", ext_zaamo, false),
  MULTI_EXT_CFG_BOOL("zalrsc", ext_zalrsc, false),
  MULTI_EXT_CFG_BOOL("zawrs", ext_zawrs, true),




Re: [PATCH 0/6] target/riscv: Support Zabha extension

2024-05-24 Thread Daniel Henrique Barboza

Hi Zhiwei!



On 5/23/24 09:40, LIU Zhiwei wrote:

Zabha adds support for AMO operations on bytes and halfwords. If zacas has been
implemented, zabha also adds support for amocas.b and amocas.h.

More details are in the specification here:
https://github.com/riscv/riscv-zabha

The implementation of zabha follows the approach used for AMOs and zacas.

This patch set is based on these two patch sets:
1. https://mail.gnu.org/archive/html/qemu-riscv/2024-05/msg00207.html
2. https://mail.gnu.org/archive/html/qemu-riscv/2024-05/msg00212.html


These two series don't seem to apply on top of each other, no matter which
order I try. Applying zimop/zcmop first, then zama16b:

$ git am \[PATCH\ 1_1\]\ target_riscv\:\ Support\ Zama16b\ extension\ -\ LIU\ Zhiwei\ 
\\ -\ 2024-05-22\ 0613.eml
Applying: target/riscv: Support Zama16b extension
error: patch failed: target/riscv/cpu.c:1464
error: target/riscv/cpu.c: patch does not apply
Patch failed at 0001 target/riscv: Support Zama16b extension
hint: Use 'git am --show-current-patch=diff' to see the failed patch


Applying zama16b first, then zimop/zcmop:

$ git am \[PATCH\ 1_1\]\ target_riscv\:\ Support\ Zama16b\ extension\ -\ LIU\ Zhiwei\ 
\\ -\ 2024-05-22\ 0613.eml
Applying: target/riscv: Support Zama16b extension
$
$ git am \[PATCH\ 1_4\]\ target_riscv\:\ Add\ zimop\ extension\ -\ LIU\ Zhiwei\ 
\\ -\ 2024-05-22\ 0329.eml \[PATCH\ 2_4\]\ 
disas_riscv\:\ Support\ zimop\ disassemble\ -\ LIU\ Zhiwei\ 
\\ -\ 2024-05-22\ 0329.eml
Applying: target/riscv: Add zimop extension
error: patch failed: target/riscv/cpu.c:1463
error: target/riscv/cpu.c: patch does not apply
Patch failed at 0001 target/riscv: Add zimop extension


If the series depend on each other, perhaps it's easier to send everything
as a single 11-patch series.


Thanks,

Daniel




LIU Zhiwei (6):
   target/riscv: Move gen_amo before implement Zabha
   target/riscv: Add AMO instructions for Zabha
   target/riscv: Move gen_cmpxchg before adding amocas.[b|h]
   target/riscv: Add amocas.[b|h] for Zabha
   target/riscv: Enable zabha for max cpu
   disas/riscv: Support zabha disassemble

  disas/riscv.c   |  60 
  target/riscv/cpu.c  |   2 +
  target/riscv/cpu_cfg.h  |   1 +
  target/riscv/insn32.decode  |  22 +++
  target/riscv/insn_trans/trans_rva.c.inc |  21 ---
  target/riscv/insn_trans/trans_rvzabha.c.inc | 145 
  target/riscv/insn_trans/trans_rvzacas.c.inc |  13 --
  target/riscv/translate.c                    |  36 +
  8 files changed, 266 insertions(+), 34 deletions(-)
  create mode 100644 target/riscv/insn_trans/trans_rvzabha.c.inc





[RFC PATCH] hw/dma: Add Intel I/OAT DMA controller emulation

2024-05-24 Thread Nikita Shubin
From: Nikita Shubin 

Add a memcpy-only model of the I/OAT DMA engine found on some Xeon-based
motherboards.

Signed-off-by: Nikita Shubin 
---
This started as a companion device for a driver that cannot work
without DMA.

So it seems worth (at least) mentioning on the mailing list.

Tested with Linux dmatest driver:

# ls -l /sys/class/dma/dma0chan0
lrwxrwxrwx 1 root root 0 May 24 11:28 /sys/class/dma/dma0chan0 -> 
../../devices/pci:00/:00:02.0/dma/dma0chan0

# lspci -vvv | grep 00:02.0
00:02.0 DMA controller: Intel Corporation Sky Lake-E CBDMA Registers (prog-if 
00 [8237])

# modprobe dmatest channel=dma0chan0 timeout=2000 iterations=1 run=1
...
[   30.600158][   T75] dmatest: dma0chan0-copy0: verifying source buffer...
[   30.603174][   T75] dmatest: dma0chan0-copy0: verifying dest buffer...
[   30.604863][   T75] dmatest: dma0chan0-copy0: result #1: 'test passed' with 
src_off=0x1cf5 dst_off=0x2993 len=0xc6d (0)
[   30.607302][   T75] dmatest: dma0chan0-copy0: summary 1 tests, 0 failures 
69.93 iops 209 KB/s (0)
---
 hw/dma/Kconfig  |   4 +
 hw/dma/ioatdma.c| 850 
 hw/dma/meson.build  |   1 +
 hw/dma/trace-events |  11 +
 4 files changed, 866 insertions(+)
 create mode 100644 hw/dma/ioatdma.c

diff --git a/hw/dma/Kconfig b/hw/dma/Kconfig
index 98fbb1bb04..a5e266085d 100644
--- a/hw/dma/Kconfig
+++ b/hw/dma/Kconfig
@@ -30,3 +30,7 @@ config SIFIVE_PDMA
 config XLNX_CSU_DMA
 bool
 select REGISTER
+
+config INTEL_IOATDMA
+default y if PCI_DEVICES
+depends on PCI && MSI_NONBROKEN
\ No newline at end of file
diff --git a/hw/dma/ioatdma.c b/hw/dma/ioatdma.c
new file mode 100644
index 00..119ad21e11
--- /dev/null
+++ b/hw/dma/ioatdma.c
@@ -0,0 +1,850 @@
+/*
+ * Intel(R) I/OAT DMA engine emulation
+ *
+ * Copyright (c) 2024 Nikita Shubin 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#include "qemu/osdep.h"
+
+#include "chardev/char-fe.h"
+#include "chardev/char-socket.h"
+#include "hw/hw.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/msix.h"
+#include "hw/pci/pci.h"
+#include "hw/qdev-properties-system.h"
+#include "hw/qdev-properties.h"
+#include "migration/blocker.h"
+#include "migration/vmstate.h"
+#include "qapi/error.h"
+#include "qapi/visitor.h"
+#include "qemu/cutils.h"
+#include "qemu/error-report.h"
+#include "qemu/event_notifier.h"
+#include "qemu/log.h"
+#include "qemu/module.h"
+#include "qemu/sockets.h"
+#include "qemu/units.h"
+#include "qom/object_interfaces.h"
+#include "sysemu/hostmem.h"
+#include "sysemu/kvm.h"
+#include "sysemu/qtest.h"
+
+#include "trace.h"
+
+#define PCI_VENDOR_ID_INTEL             0x8086
+#define PCI_DEVICE_ID_INTEL_IOAT_SKX    0x2021
+
+#define IOATDMA_BAR0_SIZE   (16 * KiB)
+#define IOATDMA_MSIX_SIZE   (16 * KiB)
+#define IOATDMA_MSIX_TABLE  (0x2000)
+#define IOATDMA_MSIX_PBA    (0x3000)
+
+/*  8-bit */
+#define IOAT_CHANCNT_OFFSET 0x00
+/*  8-bit */
+#define IOAT_XFERCAP_OFFSET 0x01
+/*  8-bit */
+#define IOAT_XFERCAP_MASK   0x1f
+
+/*  8-bit, unused */
+#define IOAT_GENCTRL_OFFSET 0x02
+#define IOAT_GENCTRL_DEBUG_EN   0x01
+
+/*  8-bit */
+#define IOAT_INTRCTRL_OFFSET                0x03
+/* Master Interrupt Enable */
+#define IOAT_INTRCTRL_MASTER_INT_EN         0x01
+/* ATTNSTATUS -or- Channel Int */
+#define IOAT_INTRCTRL_INT_STATUS            0x02
+/* INT_STATUS -and- MASTER_INT_EN */
+#define IOAT_INTRCTRL_INT                   0x04
+/* Enable all MSI-X vectors */
+#define IOAT_INTRCTRL_MSIX_VECTOR_CONTROL   0x08
+
+/* Each bit is a channel */
+#define IOAT_ATTNSTATUS_OFFSET  0x04
+
+/*  8-bit */
+#define IOAT_VER_OFFSET 0x08
+
+/* 16-bit */
+#define IOAT_PERPORTOFFSET_OFFSET   0x0A
+
+/* 16-bit */
+#define IOAT_INTRDELAY_OFFSET   0x0C
+/* Interrupt Delay Time 

Re: [PATCH 2/2] hw/arm/xilinx_zynq: Support up to two CPU cores

2024-05-24 Thread Sebastian Huber

Hello Peter,

thanks for the review.

On 20.05.24 15:58, Peter Maydell wrote:

On Tue, 7 May 2024 at 14:04, Sebastian Huber
  wrote:

The Zynq 7000 SoCs contain two Arm Cortex-A9 MPCore cores (the Zynq 7000S has
only one core).  Add support for up to two simulated cores.

Signed-off-by: Sebastian Huber
---
  hw/arm/xilinx_zynq.c | 42 +++---
  1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index 078abd77bd..3b858e3e9a 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -184,6 +184,8 @@ static void zynq_init(MachineState *machine)
  SysBusDevice *busdev;
  qemu_irq pic[64];
  int n;
+unsigned int smp_cpus = machine->smp.cpus;
+qemu_irq cpu_irq[2];

We prefer not to have arrays of qemu_irq like this that are
just passing qemu_irqs from one place to another. Instead
at the point where you want the ARM_CPU_IRQ of a particular
CPU, call qdev_get_gpio_in() on the CPU object there.

I suggest dropping the "ARMCPU *cpu" local from this function
and instead adding an "ARMCPU *cpu[ZYNQ_MAX_CPUS]" array to
the ZynqMachineState struct.


I used hw/arm/realview.c as a template for this change. I will try
to implement the suggested changes.





  /* max 2GB ram */
  if (machine->ram_size > 2 * GiB) {
@@ -191,21 +193,27 @@ static void zynq_init(MachineState *machine)
  exit(EXIT_FAILURE);
  }

-cpu = ARM_CPU(object_new(machine->cpu_type));
+for (n = 0; n < smp_cpus; n++) {
+Object *cpuobj = object_new(machine->cpu_type);

-/* By default A9 CPUs have EL3 enabled.  This board does not
- * currently support EL3 so the CPU EL3 property is disabled before
- * realization.
- */
-if (object_property_find(OBJECT(cpu), "has_el3")) {
-object_property_set_bool(OBJECT(cpu), "has_el3", false, &error_fatal);
-}
+/* By default A9 CPUs have EL3 enabled.  This board does not
+ * currently support EL3 so the CPU EL3 property is disabled before
+ * realization.
+ */

If you're moving comment text around checkpatch will suggest that
you fix it up to our current coding standard, which is that
a multiline comment has the "/*" on a line of its own.


Ok.




+if (object_property_find(cpuobj, "has_el3")) {
+object_property_set_bool(cpuobj, "has_el3", false, &error_fatal);
+}
+
+object_property_set_int(cpuobj, "midr", ZYNQ_BOARD_MIDR,
+&error_fatal);
+object_property_set_int(cpuobj, "reset-cbar", MPCORE_PERIPHBASE,
+&error_fatal);

-object_property_set_int(OBJECT(cpu), "midr", ZYNQ_BOARD_MIDR,
-&error_fatal);
-object_property_set_int(OBJECT(cpu), "reset-cbar", MPCORE_PERIPHBASE,
-&error_fatal);
-qdev_realize(DEVICE(cpu), NULL, &error_fatal);
+qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
+
+cpu_irq[n] = qdev_get_gpio_in(DEVICE(cpuobj), ARM_CPU_IRQ);
+}
+cpu = ARM_CPU(first_cpu);

  /* DDR remapped to address zero.  */
  memory_region_add_subregion(address_space_mem, 0, machine->ram);
@@ -238,10 +246,14 @@ static void zynq_init(MachineState *machine)
  sysbus_mmio_map(SYS_BUS_DEVICE(slcr), 0, 0xF800);

  dev = qdev_new(TYPE_A9MPCORE_PRIV);
-qdev_prop_set_uint32(dev, "num-cpu", 1);
+qdev_prop_set_uint32(dev, "num-cpu", smp_cpus);
  busdev = SYS_BUS_DEVICE(dev);
  sysbus_realize_and_unref(busdev, &error_fatal);
  sysbus_mmio_map(busdev, 0, MPCORE_PERIPHBASE);
+for (n = 0; n < smp_cpus; n++) {
+sysbus_connect_irq(busdev, n, cpu_irq[n]);
+}

Looks like you have based this on a version of QEMU which doesn't
have commit 68a5827b80117973 which wires up the FIQ line of the
A9MPCORE_PRIV device to the CPUs.


Yes, indeed. I originally used a QEMU version from Xilinx. They have a
huge set of patches which are not integrated in QEMU.





+zynq_binfo.gic_cpu_if_addr = MPCORE_PERIPHBASE + 0x100;
  sysbus_create_varargs("l2x0", MPCORE_PERIPHBASE + 0x2000, NULL);
  sysbus_connect_irq(busdev, 0,
 qdev_get_gpio_in(DEVICE(cpu), ARM_CPU_IRQ));
@@ -357,7 +369,7 @@ static void zynq_machine_class_init(ObjectClass *oc, void *data)
  MachineClass *mc = MACHINE_CLASS(oc);
  mc->desc = "Xilinx Zynq Platform Baseboard for Cortex-A9";
  mc->init = zynq_init;
-mc->max_cpus = 1;
+mc->max_cpus = 2;
  mc->no_sdcard = 1;
  mc->ignore_memory_transaction_failures = true;
  mc->default_cpu_type = ARM_CPU_TYPE_NAME("cortex-a9");
--

I'm not making this a condition for accepting this patch, but
since you're working on this board model would you consider
writing up some documentation for it? It's one of the boards
we do not currently have documented at all. This doesn't have to
be very extensive: a few paragraphs describing what the bo

[PATCH v2 2/2] hw/arm/xilinx_zynq: Support up to two CPU cores

2024-05-24 Thread Sebastian Huber
The Zynq 7000 SoCs contain two Arm Cortex-A9 MPCore cores (the Zynq 7000S has
only one core).  Add support for up to two simulated cores.

Signed-off-by: Sebastian Huber 
---
 hw/arm/xilinx_zynq.c | 54 +++-
 1 file changed, 33 insertions(+), 21 deletions(-)

diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index 0abb62f131..ac30026040 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -84,9 +84,12 @@ static const int dma_irqs[8] = {
 0xe3401000 + ARMV7_IMM16(extract32((val), 16, 16)), /* movt r1 ... */ \
 0xe5801000 + (addr)
 
+#define ZYNQ_MAX_CPUS 2
+
 struct ZynqMachineState {
 MachineState parent;
 Clock *ps_clk;
+ARMCPU *cpu[ZYNQ_MAX_CPUS];
 };
 
 static void zynq_write_board_setup(ARMCPU *cpu,
@@ -176,13 +179,13 @@ static inline int zynq_init_spi_flashes(uint32_t base_addr, qemu_irq irq,
 static void zynq_init(MachineState *machine)
 {
 ZynqMachineState *zynq_machine = ZYNQ_MACHINE(machine);
-ARMCPU *cpu;
 MemoryRegion *address_space_mem = get_system_memory();
 MemoryRegion *ocm_ram = g_new(MemoryRegion, 1);
 DeviceState *dev, *slcr;
 SysBusDevice *busdev;
 qemu_irq pic[64];
 int n;
+unsigned int smp_cpus = machine->smp.cpus;
 
 /* max 2GB ram */
 if (machine->ram_size > 2 * GiB) {
@@ -190,21 +193,26 @@ static void zynq_init(MachineState *machine)
 exit(EXIT_FAILURE);
 }
 
-cpu = ARM_CPU(object_new(machine->cpu_type));
+for (n = 0; n < smp_cpus; n++) {
+Object *cpuobj = object_new(machine->cpu_type);
 
-/* By default A9 CPUs have EL3 enabled.  This board does not
- * currently support EL3 so the CPU EL3 property is disabled before
- * realization.
- */
-if (object_property_find(OBJECT(cpu), "has_el3")) {
-object_property_set_bool(OBJECT(cpu), "has_el3", false, &error_fatal);
-}
+/*
+ * By default A9 CPUs have EL3 enabled.  This board does not currently
+ * support EL3 so the CPU EL3 property is disabled before realization.
+ */
+if (object_property_find(cpuobj, "has_el3")) {
+object_property_set_bool(cpuobj, "has_el3", false, &error_fatal);
+}
 
-object_property_set_int(OBJECT(cpu), "midr", ZYNQ_BOARD_MIDR,
-&error_fatal);
-object_property_set_int(OBJECT(cpu), "reset-cbar", MPCORE_PERIPHBASE,
-&error_fatal);
-qdev_realize(DEVICE(cpu), NULL, &error_fatal);
+object_property_set_int(cpuobj, "midr", ZYNQ_BOARD_MIDR,
+&error_fatal);
+object_property_set_int(cpuobj, "reset-cbar", MPCORE_PERIPHBASE,
+&error_fatal);
+
+qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
+
+zynq_machine->cpu[n] = ARM_CPU(cpuobj);
+}
 
 /* DDR remapped to address zero.  */
 memory_region_add_subregion(address_space_mem, 0, machine->ram);
@@ -237,15 +245,19 @@ static void zynq_init(MachineState *machine)
 sysbus_mmio_map(SYS_BUS_DEVICE(slcr), 0, 0xF800);
 
 dev = qdev_new(TYPE_A9MPCORE_PRIV);
-qdev_prop_set_uint32(dev, "num-cpu", 1);
+qdev_prop_set_uint32(dev, "num-cpu", smp_cpus);
 busdev = SYS_BUS_DEVICE(dev);
 sysbus_realize_and_unref(busdev, &error_fatal);
 sysbus_mmio_map(busdev, 0, MPCORE_PERIPHBASE);
+zynq_binfo.gic_cpu_if_addr = MPCORE_PERIPHBASE + 0x100;
 sysbus_create_varargs("l2x0", MPCORE_PERIPHBASE + 0x2000, NULL);
-sysbus_connect_irq(busdev, 0,
-   qdev_get_gpio_in(DEVICE(cpu), ARM_CPU_IRQ));
-sysbus_connect_irq(busdev, 1,
-   qdev_get_gpio_in(DEVICE(cpu), ARM_CPU_FIQ));
+for (n = 0; n < smp_cpus; n++) {
+DeviceState *cpudev = DEVICE(OBJECT(zynq_machine->cpu[n]));
+sysbus_connect_irq(busdev, (2 * n) + 0,
+   qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
+sysbus_connect_irq(busdev, (2 * n) + 1,
+   qdev_get_gpio_in(cpudev, ARM_CPU_FIQ));
+}
 
 for (n = 0; n < 64; n++) {
 pic[n] = qdev_get_gpio_in(dev, n);
@@ -350,7 +362,7 @@ static void zynq_init(MachineState *machine)
 zynq_binfo.board_setup_addr = BOARD_SETUP_ADDR;
 zynq_binfo.write_board_setup = zynq_write_board_setup;
 
-arm_load_kernel(cpu, machine, &zynq_binfo);
+arm_load_kernel(zynq_machine->cpu[0], machine, &zynq_binfo);
 }
 
 static void zynq_machine_class_init(ObjectClass *oc, void *data)
@@ -362,7 +374,7 @@ static void zynq_machine_class_init(ObjectClass *oc, void *data)
 MachineClass *mc = MACHINE_CLASS(oc);
 mc->desc = "Xilinx Zynq Platform Baseboard for Cortex-A9";
 mc->init = zynq_init;
-mc->max_cpus = 1;
+mc->max_cpus = ZYNQ_MAX_CPUS;
 mc->no_sdcard = 1;
 mc->ignore_memory_transaction_failures = true;
 mc->valid_cpu_types = valid_cpu_types;
-- 
2.35.3




[PATCH v2 0/2] Zynq 7000 Improvements

2024-05-24 Thread Sebastian Huber
v2:

* Add Kconfig support

* Add array of CPUs to ZynqMachineState

* Add FIQ support

Sebastian Huber (2):
  hw/arm/xilinx_zynq: Add cache controller
  hw/arm/xilinx_zynq: Support up to two CPU cores

 hw/arm/Kconfig   |  1 +
 hw/arm/xilinx_zynq.c | 55 +++-
 2 files changed, 35 insertions(+), 21 deletions(-)

-- 
2.35.3




[PATCH v2 1/2] hw/arm/xilinx_zynq: Add cache controller

2024-05-24 Thread Sebastian Huber
The Zynq 7000 SoCs contain a CoreLink L2C-310 cache controller.  Add the
corresponding QEMU device to the xilinx-zynq-a9 machine.

Signed-off-by: Sebastian Huber 
---
 hw/arm/Kconfig   | 1 +
 hw/arm/xilinx_zynq.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 8b97683a45..1ad60da7aa 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -370,6 +370,7 @@ config ZYNQ
 select A9MPCORE
 select CADENCE # UART
 select PFLASH_CFI02
+select PL310 # cache controller
 select PL330
 select SDHCI
 select SSI_M25P80
diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index fc3abcbe88..0abb62f131 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -241,6 +241,7 @@ static void zynq_init(MachineState *machine)
 busdev = SYS_BUS_DEVICE(dev);
 sysbus_realize_and_unref(busdev, &error_fatal);
 sysbus_mmio_map(busdev, 0, MPCORE_PERIPHBASE);
+sysbus_create_varargs("l2x0", MPCORE_PERIPHBASE + 0x2000, NULL);
 sysbus_connect_irq(busdev, 0,
qdev_get_gpio_in(DEVICE(cpu), ARM_CPU_IRQ));
 sysbus_connect_irq(busdev, 1,
-- 
2.35.3




Re: [PATCH] tests/qtest/migration-test: Run some basic tests on s390x and ppc64 with TCG, too

2024-05-24 Thread Fabiano Rosas
Thomas Huth  writes:

> On 24/05/2024 02.05, Nicholas Piggin wrote:
>> On Wed May 22, 2024 at 7:12 PM AEST, Thomas Huth wrote:
>>> On s390x, we recently had a regression that broke migration / savevm
>>> (see commit bebe9603fc ("hw/intc/s390_flic: Fix crash that occurs when
>>> saving the machine state")). The problem was merged without being noticed
>>> since we currently do not run any migration / savevm related tests on
>>> x86 hosts.
>>> While we currently cannot run all migration tests for the s390x target
>>> on x86 hosts yet (due to some unresolved issues with TCG), we can at
>>> least run some of the non-live tests to avoid such problems in the future.
>>> Thus enable the "analyze-script" and the "bad_dest" tests before checking
>>> for KVM on s390x or ppc64 (this also fixes the problem that the
>>> "analyze-script" test was not run on s390x at all anymore since it got
>>> disabled again by accident in a previous refactoring of the code).
>> 
>> ppc64 is working for me, can it be enabled fully, or is it still
>> breaking somewhere? FWIW I have a patch to change it from using
>> open-firmware commands to a boot file which speeds it up.
>
> IIRC last time that I tried it was working fine for me, too, but getting a 
> speedup here first would be very welcome since using the Forth code slows 
> down the whole testing quite a bit.

Yeah, we're all gonna get kicked from the project if we add 10m to make
check in CI. =)

@Nick, send us that patch and I'd be glad to reenable the tests.



Re: [PATCH v2 01/18] migration: Fix file migration with fdset

2024-05-24 Thread Fabiano Rosas
Prasad Pandit  writes:

> On Fri, 24 May 2024 at 00:38, Fabiano Rosas  wrote:
>> This is further indicated by the presence of the 'offset'
>> argument, which indicates the start of the region where QEMU is
>> allowed to write.
>>
>> Fix the issue by replacing the O_TRUNC flag on open by an ftruncate
>> call, which will take the offset into consideration.
>>
>> +if (ftruncate(fioc->fd, offset)) {
>> +error_setg_errno(errp, errno,
>> + "failed to truncate migration file to offset %" 
>> PRIx64,
>> + offset);
>> +object_unref(OBJECT(fioc));
>> +return;
>> +}
>> +
>
> * Should 'offset' be checked for > zero while ftruncating? Else it'll
> be same as O_TRUNC. Otherwise it looks fine.

That's the point. If offset==0 we truncate all the way, if not, we
truncate to the offset.
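
A minimal POSIX sketch of that behaviour, outside of QEMU:
open_and_truncate_at() opens the file without O_TRUNC and cuts it at the
given offset instead, so offset 0 behaves like O_TRUNC while a larger offset
keeps the bytes in front of it. Both helper names and the temporary file
path are assumptions made for this illustration only.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not QEMU code: open path for writing without
 * O_TRUNC and truncate the file to offset.  Returns the fd, or -1.
 */
static int open_and_truncate_at(const char *path, off_t offset)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0600);

    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, offset) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Demo helper: write 10 bytes to a fresh temporary file, reopen it with
 * open_and_truncate_at() and return the resulting file size (-1 on error).
 */
static long size_after_truncate(off_t offset)
{
    char path[] = "/tmp/ftruncate-sketch-XXXXXX";
    struct stat st;
    long size = -1;
    int fd = mkstemp(path);

    if (fd < 0) {
        return -1;
    }
    if (write(fd, "0123456789", 10) == 10) {
        close(fd);
        fd = open_and_truncate_at(path, offset);
        if (fd >= 0 && fstat(fd, &st) == 0) {
            size = (long)st.st_size;
        }
    }
    if (fd >= 0) {
        close(fd);
    }
    unlink(path);
    return size;
}
```

With offset 0 the file ends up empty, exactly as with O_TRUNC. With offset 4
the first four bytes of the existing file survive.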

>
> Reviewed-by: Prasad Pandit 

Thanks!



Re: [PATCH V1 23/26] migration: misc cpr-exec blockers

2024-05-24 Thread Fabiano Rosas
Steve Sistare  writes:

> Add blockers for cpr-exec migration mode for devices and options that do
> not support it.
>
> Signed-off-by: Steve Sistare 
> ---
>  accel/xen/xen-all.c|  5 +
>  backends/hostmem-epc.c | 12 ++--
>  hw/vfio/migration.c|  3 ++-
>  replay/replay.c|  6 ++
>  4 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/accel/xen/xen-all.c b/accel/xen/xen-all.c
> index 0bdefce..9a7ed0f 100644
> --- a/accel/xen/xen-all.c
> +++ b/accel/xen/xen-all.c

This file is missing the migration/blocker.h include.




Re: [PATCH V1 00/26] Live update: cpr-exec

2024-05-24 Thread Fabiano Rosas
Steve Sistare  writes:

> This patch series adds the live migration cpr-exec mode.  In this mode, QEMU
> stops the VM, writes VM state to the migration URI, and directly exec's a
> new version of QEMU on the same host, replacing the original process while
> retaining its PID.  Guest RAM is preserved in place, albeit with new virtual
> addresses.  The user completes the migration by specifying the -incoming
> option, and by issuing the migrate-incoming command if necessary.  This
> saves and restores VM state, with minimal guest pause time, so that QEMU may
> be updated to a new version in between.
>
> The new interfaces are:
>   * cpr-exec (MigMode migration parameter)
>   * cpr-exec-args (migration parameter)
>   * memfd-alloc=on (command-line option for -machine)
>   * only-migratable-modes (command-line argument)
>
> The caller sets the mode parameter before invoking the migrate command.
>
> Arguments for the new QEMU process are taken from the cpr-exec-args parameter.
> The first argument should be the path of a new QEMU binary, or a prefix
> command that exec's the new QEMU binary, and the arguments should include
> the -incoming option.
>
> Memory backend objects must have the share=on attribute, and must be mmap'able
> in the new QEMU process.  For example, memory-backend-file is acceptable,
> but memory-backend-ram is not.
>
> QEMU must be started with the '-machine memfd-alloc=on' option.  This causes
> implicit RAM blocks (those not explicitly described by a memory-backend
> object) to be allocated by mmap'ing a memfd.  Examples include VGA, ROM,
> and even guest RAM when it is specified without reference to a
> memory-backend object.   The memfds are kept open across exec, their values
> are saved in vmstate which is retrieved after exec, and they are re-mmap'd.
>
> The '-only-migratable-modes cpr-exec' option guarantees that the
> configuration supports cpr-exec.  QEMU will exit at start time if not.
>
> Example:
>
> In this example, we simply restart the same version of QEMU, but in
> a real scenario one would set a new QEMU binary path in cpr-exec-args.
>
>   # qemu-kvm -monitor stdio -object
>   memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/ram0,share=on
>   -m 4G -machine memfd-alloc=on ...
>
>   QEMU 9.1.50 monitor - type 'help' for more information
>   (qemu) info status
>   VM status: running
>   (qemu) migrate_set_parameter mode cpr-exec
>   (qemu) migrate_set_parameter cpr-exec-args qemu-kvm ... -incoming file:vm.state
>   (qemu) migrate -d file:vm.state
>   (qemu) QEMU 9.1.50 monitor - type 'help' for more information
>   (qemu) info status
>   VM status: running
>
> cpr-exec mode preserves attributes of outgoing devices that must be known
> before the device is created on the incoming side, such as the memfd descriptor
> number, but currently the migration stream is read after all devices are
> created.  To solve this problem, I add two VMStateDescription options:
> precreate and factory.  precreate objects are saved to their own migration
> stream, distinct from the main stream, and are read early by incoming QEMU,
> before devices are created.  Factory objects are allocated on demand, without
> relying on a pre-registered object's opaque address, which is necessary
> because the devices to which the state will apply have not been created yet
> and hence have not registered an opaque address to receive the state.
>
> This patch series implements a minimal version of cpr-exec.  Future series
> will add support for:
>   * vfio
>   * chardev's without loss of connectivity
>   * vhost
>   * fine-grained seccomp controls
>   * hostmem-memfd
>   * cpr-exec migration test
>
>
> Steve Sistare (26):
>   oslib: qemu_clear_cloexec
>   vl: helper to request re-exec
>   migration: SAVEVM_FOREACH
>   migration: delete unused parameter mis
>   migration: precreate vmstate
>   migration: precreate vmstate for exec
>   migration: VMStateId
>   migration: vmstate_info_void_ptr
>   migration: vmstate_register_named
>   migration: vmstate_unregister_named
>   migration: vmstate_register at init time
>   migration: vmstate factory object
>   physmem: ram_block_create
>   physmem: hoist guest_memfd creation
>   physmem: hoist host memory allocation
>   physmem: set ram block idstr earlier
>   machine: memfd-alloc option
>   migration: cpr-exec-args parameter
>   physmem: preserve ram blocks for cpr
>   migration: cpr-exec mode
>   migration: migrate_add_blocker_mode
>   migration: ram block cpr-exec blockers
>   migration: misc cpr-exec blockers
>   seccomp: cpr-exec blocker
>   migration: fix mismatched GPAs during cpr-exec
>   migration: only-migratable-modes
>
>  accel/xen/xen-all.c|   5 +
>  backends/hostmem-epc.c |  12 +-
>  hmp-commands.hx|   2 +-
>  hw/core/machine.c  |  22 +++
>  hw/core/qdev.c |   1 +
>  hw/intc/apic_common.c  |   2 +-
>  hw/vfio/migration.c|   3 +-
>  include/exec/cpu-commo

Re: [PATCH 1/7] hw/s390x/ccw: Make s390_ccw_get_dev_info() return a bool

2024-05-24 Thread Anthony Krowiak



On 5/22/24 1:01 PM, Cédric Le Goater wrote:

Since s390_ccw_get_dev_info() takes an 'Error **' argument, best
practices suggest to return a bool. See the qapi/error.h Rules
section. While at it, modify the call in s390_ccw_realize().

Signed-off-by: Cédric Le Goater 



Reviewed-by: Anthony Krowiak 



---
  hw/s390x/s390-ccw.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
index 5261e66724f1cc3157b9413b0d5fdf5289c92503..a06e91dfb318e3500324851488c56806fa46c08d 100644
--- a/hw/s390x/s390-ccw.c
+++ b/hw/s390x/s390-ccw.c
@@ -71,7 +71,7 @@ IOInstEnding s390_ccw_store(SubchDev *sch)
  return ret;
  }
  
-static void s390_ccw_get_dev_info(S390CCWDevice *cdev,

+static bool s390_ccw_get_dev_info(S390CCWDevice *cdev,
char *sysfsdev,
Error **errp)
  {
@@ -84,12 +84,12 @@ static void s390_ccw_get_dev_info(S390CCWDevice *cdev,
  error_setg(errp, "No host device provided");
  error_append_hint(errp,
"Use -device vfio-ccw,sysfsdev=PATH_TO_DEVICE\n");
-return;
+return false;
  }
  
  if (!realpath(sysfsdev, dev_path)) {

  error_setg_errno(errp, errno, "Host device '%s' not found", sysfsdev);
-return;
+return false;
  }
  
  cdev->mdevid = g_path_get_basename(dev_path);

@@ -98,13 +98,14 @@ static void s390_ccw_get_dev_info(S390CCWDevice *cdev,
  tmp = g_path_get_basename(tmp_dir);
  if (sscanf(tmp, "%2x.%1x.%4x", &cssid, &ssid, &devid) != 3) {
  error_setg_errno(errp, errno, "Failed to read %s", tmp);
-return;
+return false;
  }
  
  cdev->hostid.cssid = cssid;

  cdev->hostid.ssid = ssid;
  cdev->hostid.devid = devid;
  cdev->hostid.valid = true;
+return true;
  }
  
  static void s390_ccw_realize(S390CCWDevice *cdev, char *sysfsdev, Error **errp)

@@ -116,8 +117,7 @@ static void s390_ccw_realize(S390CCWDevice *cdev, char *sysfsdev, Error **errp)
  int ret;
  Error *err = NULL;
  
-s390_ccw_get_dev_info(cdev, sysfsdev, &err);

-if (err) {
+if (!s390_ccw_get_dev_info(cdev, sysfsdev, &err)) {
  goto out_err_propagate;
  }
  




Re: [PATCH v2 1/3] hw/riscv/virt: Add memory hotplugging and virtio-md-pci support

2024-05-24 Thread Daniel Henrique Barboza




On 5/21/24 07:56, Björn Töpel wrote:

From: Björn Töpel 

Virtio-based memory devices (virtio-mem/virtio-pmem) allow for
dynamic resizing of virtual machine memory, and require proper
hotplugging (add/remove) support to work.

Add device memory support for RISC-V "virt" machine, and enable
virtio-md-pci with the corresponding missing hotplugging callbacks.

Signed-off-by: Björn Töpel 
---
  hw/riscv/Kconfig   |  2 +
  hw/riscv/virt.c| 83 +-
  hw/virtio/virtio-mem.c |  5 ++-
  3 files changed, 87 insertions(+), 3 deletions(-)

diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
index a2030e3a6ff0..08f82dbb681a 100644
--- a/hw/riscv/Kconfig
+++ b/hw/riscv/Kconfig
@@ -56,6 +56,8 @@ config RISCV_VIRT
  select PLATFORM_BUS
  select ACPI
  select ACPI_PCI
+select VIRTIO_MEM_SUPPORTED
+select VIRTIO_PMEM_SUPPORTED
  
  config SHAKTI_C

  bool
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 4fdb66052587..443902f919d2 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -53,6 +53,8 @@
  #include "hw/pci-host/gpex.h"
  #include "hw/display/ramfb.h"
  #include "hw/acpi/aml-build.h"
+#include "hw/mem/memory-device.h"
+#include "hw/virtio/virtio-mem-pci.h"
  #include "qapi/qapi-visit-common.h"
  #include "hw/virtio/virtio-iommu.h"
  
@@ -1407,6 +1409,7 @@ static void virt_machine_init(MachineState *machine)

  DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip;
  int i, base_hartid, hart_count;
  int socket_count = riscv_socket_count(machine);
+hwaddr device_memory_base, device_memory_size;
  
  /* Check socket count limit */

  if (VIRT_SOCKETS_MAX < socket_count) {
@@ -1420,6 +1423,12 @@ static void virt_machine_init(MachineState *machine)
  exit(1);
  }
  
+if (machine->ram_slots > ACPI_MAX_RAM_SLOTS) {

+error_report("unsupported amount of memory slots: %"PRIu64,
+ machine->ram_slots);
+exit(EXIT_FAILURE);
+}
+
  /* Initialize sockets */
  mmio_irqchip = virtio_irqchip = pcie_irqchip = NULL;
  for (i = 0; i < socket_count; i++) {
@@ -1553,6 +1562,37 @@ static void virt_machine_init(MachineState *machine)
  memory_region_add_subregion(system_memory, memmap[VIRT_MROM].base,
  mask_rom);
  
+/* device memory */

+device_memory_base = ROUND_UP(s->memmap[VIRT_DRAM].base + machine->ram_size,
+  GiB);
+device_memory_size = machine->maxram_size - machine->ram_size;
+if (device_memory_size > 0) {
+/*
+ * Each DIMM is aligned based on the backend's alignment value.
+ * Assume max 1G hugepage alignment per slot.
+ */
+device_memory_size += machine->ram_slots * GiB;


We don't need to align to 1GiB. This calc can use 2MiB instead (or 4MiB if we're
running 32 bits).


+
+if (riscv_is_32bit(&s->soc[0])) {
+hwaddr memtop = device_memory_base + ROUND_UP(device_memory_size,
+  GiB);


Same here - alignment is 2/4 MiB.


+
+if (memtop > UINT32_MAX) {
+error_report("memory exceeds 32-bit limit by %lu bytes",
+ memtop - UINT32_MAX);
+exit(EXIT_FAILURE);
+}
+}
+
+if (device_memory_base + device_memory_size < device_memory_size) {
+error_report("unsupported amount of device memory");
+exit(EXIT_FAILURE);
+}


Took another look and found this a bit strange. These are all unsigned vars, so
if (unsigned a + unsigned b < unsigned b) will always be 'false'. The compiler is
probably cropping this out.

The calc we need to do is to ensure that the extra ram_slots * alignment will fit into
the VIRT_DRAM block, i.e. maxram_size + (ram_slots * alignment) <
memmap[VIRT_DRAM].size.
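
A minimal sketch of such a bound check, using plain integers instead of the
machine state. hotplug_region_fits() and add_wraps_u64() are hypothetical
names, not QEMU functions. Unsigned arithmetic in C wraps modulo 2^64, so a
sum that overflows compares lower than its operands; the sketch tests for
that explicitly before comparing against the region size:

```c
#include <stdbool.h>
#include <stdint.h>

/* True when a + b does not fit in 64 bits, i.e. the sum wrapped around. */
static bool add_wraps_u64(uint64_t a, uint64_t b)
{
    return a + b < b;
}

/*
 * Hypothetical helper: does maxram_size plus one alignment unit per slot
 * stay below dram_size? Each arithmetic step is checked for wrap-around.
 */
static bool hotplug_region_fits(uint64_t maxram_size, uint64_t ram_slots,
                                uint64_t alignment, uint64_t dram_size)
{
    uint64_t slack;

    if (alignment != 0 && ram_slots > UINT64_MAX / alignment) {
        return false;   /* ram_slots * alignment would wrap */
    }
    slack = ram_slots * alignment;

    if (add_wraps_u64(maxram_size, slack)) {
        return false;   /* maxram_size + slack would wrap */
    }
    return maxram_size + slack < dram_size;
}
```

With 4 GiB of maxram, 4 slots and a 2 MiB alignment, an 8 GiB region passes
the check; raising maxram to 8 GiB makes it fail.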


TBH I'm starting to have second thoughts about letting users hotplug whatever they want.
It seems cleaner to just force the 2/4 MiB alignment in pre_plug() and be done with it,
with no need to allocate ram_slots * alignment and do all these extra checks.

As I said in an earlier email, users must already comply with the alignment of the host
memory when plugging pc-dimms, so I'm not sure the value proposition of all this
extra code is worth it. The alignment will most likely be forced by the host memory
backend, so we might as well force it ourselves in pre_plug().


Thanks,


Daniel



+
+machine_memory_devices_init(machine, device_memory_base,
+device_memory_size);
+}
+
  /*
   * Init fw_cfg. Must be done before riscv_load_fdt, otherwise the
   * device tree cannot be altered and we get FDT_ERR_NOSPACE.
@@ -1712,12 +1752,21 @@ static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
  MachineClass *mc = MACHINE_GET_CLASS(machine);
  
  if (device_is_dynamic_sysbus(mc, de

Re: [PATCH 5/7] vfio/ccw: Use the 'Error **errp' argument of vfio_ccw_realize()

2024-05-24 Thread Anthony Krowiak



On 5/22/24 1:01 PM, Cédric Le Goater wrote:

The local error variable is kept for vfio_ccw_register_irq_notifier()
because it is not considered as a failing condition. We will change
how error reporting is done in following changes.

Remove the error_propagate() call.

Cc: Zhenzhong Duan 
Signed-off-by: Cédric Le Goater 



Reviewed-by: Anthony Krowiak 



---
  hw/vfio/ccw.c | 12 +---
  1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 9a8e052711fe2f7c067c52808b2af30d0ebfee0c..a468fa2342b97e0ee36bd5fb8443025cc90a0453 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -582,8 +582,8 @@ static void vfio_ccw_realize(DeviceState *dev, Error **errp)
  
  /* Call the class init function for subchannel. */

  if (cdc->realize) {
-if (!cdc->realize(cdev, vcdev->vdev.sysfsdev, &err)) {
-goto out_err_propagate;
+if (!cdc->realize(cdev, vcdev->vdev.sysfsdev, errp)) {
+return;
  }
  }
  
@@ -596,17 +596,17 @@ static void vfio_ccw_realize(DeviceState *dev, Error **errp)

  goto out_attach_dev_err;
  }
  
-if (!vfio_ccw_get_region(vcdev, &err)) {

+if (!vfio_ccw_get_region(vcdev, errp)) {
  goto out_region_err;
  }
  
-if (!vfio_ccw_register_irq_notifier(vcdev, VFIO_CCW_IO_IRQ_INDEX, &err)) {

+if (!vfio_ccw_register_irq_notifier(vcdev, VFIO_CCW_IO_IRQ_INDEX, errp)) {
  goto out_io_notifier_err;
  }
  
  if (vcdev->crw_region) {

  if (!vfio_ccw_register_irq_notifier(vcdev, VFIO_CCW_CRW_IRQ_INDEX,
-&err)) {
+errp)) {
  goto out_irq_notifier_err;
  }
  }
@@ -634,8 +634,6 @@ out_attach_dev_err:
  if (cdc->unrealize) {
  cdc->unrealize(cdev);
  }
-out_err_propagate:
-error_propagate(errp, err);
  }
  
  static void vfio_ccw_unrealize(DeviceState *dev)




Re: [PATCH 2/7] s390x/css: Make CCWDeviceClass::realize return bool

2024-05-24 Thread Anthony Krowiak



On 5/22/24 1:01 PM, Cédric Le Goater wrote:

Since the realize() handler of CCWDeviceClass takes an 'Error **'
argument, best practices suggest to return a bool. See the qapi/error.h
Rules section. While at it, modify the call in s390_ccw_realize().

Signed-off-by: Cédric Le Goater 



Reviewed-by: Anthony Krowiak 



---
  hw/s390x/ccw-device.h | 2 +-
  hw/s390x/ccw-device.c | 3 ++-
  hw/s390x/s390-ccw.c   | 3 +--
  3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/s390x/ccw-device.h b/hw/s390x/ccw-device.h
index 6dff95225df11c63f9b66975019026b215c8c448..5feeb0ee7a268b8709043b5bbc56b06e707a448d 100644
--- a/hw/s390x/ccw-device.h
+++ b/hw/s390x/ccw-device.h
@@ -36,7 +36,7 @@ extern const VMStateDescription vmstate_ccw_dev;
  struct CCWDeviceClass {
  DeviceClass parent_class;
  void (*unplug)(HotplugHandler *, DeviceState *, Error **);
-void (*realize)(CcwDevice *, Error **);
+bool (*realize)(CcwDevice *, Error **);
  void (*refill_ids)(CcwDevice *);
  };
  
diff --git a/hw/s390x/ccw-device.c b/hw/s390x/ccw-device.c

index fb8c1acc64d5002c861a4913f292d8346dbef192..a7d682e5af9ce90e7e2fad8c24b30e39328c7cf4 100644
--- a/hw/s390x/ccw-device.c
+++ b/hw/s390x/ccw-device.c
@@ -31,9 +31,10 @@ static void ccw_device_refill_ids(CcwDevice *dev)
  dev->subch_id.valid = true;
  }
  
-static void ccw_device_realize(CcwDevice *dev, Error **errp)

+static bool ccw_device_realize(CcwDevice *dev, Error **errp)
  {
  ccw_device_refill_ids(dev);
+return true;
  }
  
  static Property ccw_device_properties[] = {

diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
index a06e91dfb318e3500324851488c56806fa46c08d..4b8ede701df90949720262b6fc1b65f4e505e34d 100644
--- a/hw/s390x/s390-ccw.c
+++ b/hw/s390x/s390-ccw.c
@@ -137,8 +137,7 @@ static void s390_ccw_realize(S390CCWDevice *cdev, char *sysfsdev, Error **errp)
  goto out_err;
  }
  
-ck->realize(ccw_dev, &err);

-if (err) {
+if (!ck->realize(ccw_dev, &err)) {
  goto out_err;
  }
  




Re: [PATCH 7/7] vfio/{ap, ccw}: Use warn_report_err() for IRQ notifier registration errors

2024-05-24 Thread Anthony Krowiak



On 5/22/24 1:01 PM, Cédric Le Goater wrote:

vfio_ccw_register_irq_notifier() and vfio_ap_register_irq_notifier()
errors are currently reported using error_report_err(). Since they are
not considered as failing conditions, using warn_report_err() is more
appropriate.

Signed-off-by: Cédric Le Goater 



Reviewed-by: Anthony Krowiak 



---
  hw/vfio/ap.c  | 2 +-
  hw/vfio/ccw.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index c12531a7886a2fe87598be0861fba5923bd2c206..0c4354e3e70169ec072e16da0919936647d1d351 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -172,7 +172,7 @@ static void vfio_ap_realize(DeviceState *dev, Error **errp)
   * Report this error, but do not make it a failing condition.
   * Lack of this IRQ in the host does not prevent normal operation.
   */
-error_report_err(err);
+warn_report_err(err);
  }
  
  return;

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 36f2677a448c5e31523dcc3de7d973ec70e4a13c..1f8e1272c7555cd0a770481d1ae92988f6e2e62e 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -616,7 +616,7 @@ static void vfio_ccw_realize(DeviceState *dev, Error **errp)
   * Report this error, but do not make it a failing condition.
   * Lack of this IRQ in the host does not prevent normal operation.
   */
-error_report_err(err);
+warn_report_err(err);
  }
  
  return;




Re: [PATCH 4/7] s390x/css: Make S390CCWDeviceClass::realize return bool

2024-05-24 Thread Anthony Krowiak



On 5/22/24 1:01 PM, Cédric Le Goater wrote:

Since the realize() handler of S390CCWDeviceClass takes an 'Error **'
argument, best practices suggest to return a bool. See the qapi/error.h
Rules section. While at it, modify the call in vfio_ccw_realize().

Signed-off-by: Cédric Le Goater 



Reviewed-by: Anthony Krowiak 



---
  include/hw/s390x/s390-ccw.h | 2 +-
  hw/s390x/s390-ccw.c | 7 ---
  hw/vfio/ccw.c   | 3 +--
  3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/hw/s390x/s390-ccw.h b/include/hw/s390x/s390-ccw.h
index 2c807ee3a1ae8d85460fe65be8a62c64f212fe4b..2e0a70998132070996d6b0d083b8ddba5b9b87dc 100644
--- a/include/hw/s390x/s390-ccw.h
+++ b/include/hw/s390x/s390-ccw.h
@@ -31,7 +31,7 @@ struct S390CCWDevice {
  
  struct S390CCWDeviceClass {

  CCWDeviceClass parent_class;
-void (*realize)(S390CCWDevice *dev, char *sysfsdev, Error **errp);
+bool (*realize)(S390CCWDevice *dev, char *sysfsdev, Error **errp);
  void (*unrealize)(S390CCWDevice *dev);
  IOInstEnding (*handle_request) (SubchDev *sch);
  int (*handle_halt) (SubchDev *sch);
diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
index b3d14c61d732880a651edcf28a040ca723cb9f5b..3c0975055089c3629dd76ce2e1484a4ef66d8d41 100644
--- a/hw/s390x/s390-ccw.c
+++ b/hw/s390x/s390-ccw.c
@@ -108,7 +108,7 @@ static bool s390_ccw_get_dev_info(S390CCWDevice *cdev,
  return true;
  }
  
-static void s390_ccw_realize(S390CCWDevice *cdev, char *sysfsdev, Error **errp)

+static bool s390_ccw_realize(S390CCWDevice *cdev, char *sysfsdev, Error **errp)
  {
  CcwDevice *ccw_dev = CCW_DEVICE(cdev);
  CCWDeviceClass *ck = CCW_DEVICE_GET_CLASS(ccw_dev);
@@ -117,7 +117,7 @@ static void s390_ccw_realize(S390CCWDevice *cdev, char *sysfsdev, Error **errp)
  int ret;
  
  if (!s390_ccw_get_dev_info(cdev, sysfsdev, errp)) {

-return;
+return false;
  }
  
  sch = css_create_sch(ccw_dev->devno, errp);

@@ -142,7 +142,7 @@ static void s390_ccw_realize(S390CCWDevice *cdev, char *sysfsdev, Error **errp)
  
  css_generate_sch_crws(sch->cssid, sch->ssid, sch->schid,

parent->hotplugged, 1);
-return;
+return true;
  
  out_err:

  css_subch_assign(sch->cssid, sch->ssid, sch->schid, sch->devno, NULL);
@@ -150,6 +150,7 @@ out_err:
  g_free(sch);
  out_mdevid_free:
  g_free(cdev->mdevid);
+return false;
  }
  
  static void s390_ccw_unrealize(S390CCWDevice *cdev)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 2600e62e37238779800dc2b3a0bd315d7633017b..9a8e052711fe2f7c067c52808b2af30d0ebfee0c 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -582,8 +582,7 @@ static void vfio_ccw_realize(DeviceState *dev, Error **errp)
  
  /* Call the class init function for subchannel. */

  if (cdc->realize) {
-cdc->realize(cdev, vcdev->vdev.sysfsdev, &err);
-if (err) {
+if (!cdc->realize(cdev, vcdev->vdev.sysfsdev, &err)) {
  goto out_err_propagate;
  }
  }




Re: [PATCH 3/7] hw/s390x/ccw: Remove local Error variable from s390_ccw_realize()

2024-05-24 Thread Anthony Krowiak



On 5/22/24 1:01 PM, Cédric Le Goater wrote:

Use the 'Error **errp' argument of s390_ccw_realize() instead and
remove the error_propagate() call.

Signed-off-by: Cédric Le Goater 



Reviewed-by: Anthony Krowiak 



---
  hw/s390x/s390-ccw.c | 13 +
  1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
index 4b8ede701df90949720262b6fc1b65f4e505e34d..b3d14c61d732880a651edcf28a040ca723cb9f5b 100644
--- a/hw/s390x/s390-ccw.c
+++ b/hw/s390x/s390-ccw.c
@@ -115,13 +115,12 @@ static void s390_ccw_realize(S390CCWDevice *cdev, char *sysfsdev, Error **errp)
  DeviceState *parent = DEVICE(ccw_dev);
  SubchDev *sch;
  int ret;
-Error *err = NULL;
  
-if (!s390_ccw_get_dev_info(cdev, sysfsdev, &err)) {

-goto out_err_propagate;
+if (!s390_ccw_get_dev_info(cdev, sysfsdev, errp)) {
+return;
  }
  
-sch = css_create_sch(ccw_dev->devno, &err);

+sch = css_create_sch(ccw_dev->devno, errp);
  if (!sch) {
  goto out_mdevid_free;
  }
@@ -132,12 +131,12 @@ static void s390_ccw_realize(S390CCWDevice *cdev, char *sysfsdev, Error **errp)
  ccw_dev->sch = sch;
  ret = css_sch_build_schib(sch, &cdev->hostid);
  if (ret) {
-error_setg_errno(&err, -ret, "%s: Failed to build initial schib",
+error_setg_errno(errp, -ret, "%s: Failed to build initial schib",
   __func__);
  goto out_err;
  }
  
-if (!ck->realize(ccw_dev, &err)) {

+if (!ck->realize(ccw_dev, errp)) {
  goto out_err;
  }
  
@@ -151,8 +150,6 @@ out_err:

  g_free(sch);
  out_mdevid_free:
  g_free(cdev->mdevid);
-out_err_propagate:
-error_propagate(errp, err);
  }
  
  static void s390_ccw_unrealize(S390CCWDevice *cdev)
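
The convention this series applies can be sketched outside QEMU as follows.
This is only an illustration under simplifying assumptions: Error, set_error(),
get_dev_info() and realize() are stand-ins written for this example and are not
the qapi/error.h API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for QEMU's Error object; not the real type. */
typedef struct Error {
    const char *msg;
} Error;

static Error last_error;

/* Stand-in for error_setg(): record a message if the caller passed errp. */
static void set_error(Error **errp, const char *msg)
{
    if (errp) {
        last_error.msg = msg;
        *errp = &last_error;
    }
}

/* Returns true on success; on failure sets *errp and returns false. */
static bool get_dev_info(const char *sysfsdev, Error **errp)
{
    if (!sysfsdev) {
        set_error(errp, "No host device provided");
        return false;
    }
    return true;
}

/*
 * The caller tests the return value and hands its own errp straight down,
 * so it needs neither a local Error variable nor a propagate step.
 */
static bool realize(const char *sysfsdev, Error **errp)
{
    if (!get_dev_info(sysfsdev, errp)) {
        return false;
    }
    return true;
}
```

Because the callee reports failure through its return value, the error path in
the caller collapses to a plain early return.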




Re: [PATCH 04/16] vfio/helpers: Make vfio_set_irq_signaling() return bool

2024-05-24 Thread Anthony Krowiak



On 5/15/24 4:20 AM, Zhenzhong Duan wrote:

This is to follow the coding standard in qapi/error.h to return bool
for bool-valued functions.

Suggested-by: Cédric Le Goater 
Signed-off-by: Zhenzhong Duan 
---
  include/hw/vfio/vfio-common.h |  4 ++--
  hw/vfio/ap.c  |  8 +++
  hw/vfio/ccw.c |  8 +++
  hw/vfio/helpers.c | 18 ++--
  hw/vfio/pci.c | 40 ++-
  hw/vfio/platform.c| 18 +++-
  6 files changed, 46 insertions(+), 50 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 2d8da32df4..fdce13f0f2 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -207,8 +207,8 @@ void vfio_spapr_container_deinit(VFIOContainer *container);
  void vfio_disable_irqindex(VFIODevice *vbasedev, int index);
  void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index);
  void vfio_mask_single_irqindex(VFIODevice *vbasedev, int index);
-int vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex,
-   int action, int fd, Error **errp);
+bool vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex,
+int action, int fd, Error **errp);
  void vfio_region_write(void *opaque, hwaddr addr,
 uint64_t data, unsigned size);
  uint64_t vfio_region_read(void *opaque,
diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index ba653ef70f..d8a9615fee 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -117,8 +117,8 @@ static bool vfio_ap_register_irq_notifier(VFIOAPDevice *vapdev,
  fd = event_notifier_get_fd(notifier);
  qemu_set_fd_handler(fd, fd_read, NULL, vapdev);
  
-if (vfio_set_irq_signaling(vdev, irq, 0, VFIO_IRQ_SET_ACTION_TRIGGER, fd,

-   errp)) {
+if (!vfio_set_irq_signaling(vdev, irq, 0, VFIO_IRQ_SET_ACTION_TRIGGER, fd,
+errp)) {
  qemu_set_fd_handler(fd, NULL, NULL, vapdev);
  event_notifier_cleanup(notifier);
  }
@@ -141,8 +141,8 @@ static void vfio_ap_unregister_irq_notifier(VFIOAPDevice *vapdev,
  return;
  }
  
-if (vfio_set_irq_signaling(&vapdev->vdev, irq, 0,

-   VFIO_IRQ_SET_ACTION_TRIGGER, -1, &err)) {
+if (!vfio_set_irq_signaling(&vapdev->vdev, irq, 0,
+VFIO_IRQ_SET_ACTION_TRIGGER, -1, &err)) {
  warn_reportf_err(err, VFIO_MSG_PREFIX, vapdev->vdev.name);
  }
  
diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c

index 89bb980167..1f578a3c75 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -434,8 +434,8 @@ static bool vfio_ccw_register_irq_notifier(VFIOCCWDevice *vcdev,
  fd = event_notifier_get_fd(notifier);
  qemu_set_fd_handler(fd, fd_read, NULL, vcdev);
  
-if (vfio_set_irq_signaling(vdev, irq, 0,

-   VFIO_IRQ_SET_ACTION_TRIGGER, fd, errp)) {
+if (!vfio_set_irq_signaling(vdev, irq, 0,
+VFIO_IRQ_SET_ACTION_TRIGGER, fd, errp)) {
  qemu_set_fd_handler(fd, NULL, NULL, vcdev);
  event_notifier_cleanup(notifier);
  }
@@ -464,8 +464,8 @@ static void vfio_ccw_unregister_irq_notifier(VFIOCCWDevice *vcdev,
  return;
  }
  
-if (vfio_set_irq_signaling(&vcdev->vdev, irq, 0,

-   VFIO_IRQ_SET_ACTION_TRIGGER, -1, &err)) {
+if (!vfio_set_irq_signaling(&vcdev->vdev, irq, 0,
+VFIO_IRQ_SET_ACTION_TRIGGER, -1, &err)) {
  warn_reportf_err(err, VFIO_MSG_PREFIX, vcdev->vdev.name);
  }
  
diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c

index 0bb7b40a6a..93e6fef6de 100644
--- a/hw/vfio/helpers.c
+++ b/hw/vfio/helpers.c
@@ -107,12 +107,12 @@ static const char *index_to_str(VFIODevice *vbasedev, int index)
  }
  }
  
-int vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex,

-   int action, int fd, Error **errp)
+bool vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex,
+int action, int fd, Error **errp)
  {
  ERRP_GUARD();
  g_autofree struct vfio_irq_set *irq_set = NULL;
-int argsz, ret = 0;
+int argsz;
  const char *name;
  int32_t *pfd;
  
@@ -127,15 +127,11 @@ int vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex,

  pfd = (int32_t *)&irq_set->data;
  *pfd = fd;
  
-if (ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set)) {

-ret = -errno;
+if (!ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set)) {
+return true;



With this change, I don't see where the allocation of irq_set is freed.

g_free(irq_set);

What am I missing?
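
For context, the quoted hunk declares irq_set with g_autofree, which GLib
builds on the compiler's cleanup attribute, so the buffer is released on every
return path without an explicit g_free(). A minimal sketch of that mechanism
without GLib (GCC/Clang only). AUTOFREE, auto_free() and early_return_path()
are hypothetical names used for illustration:

```c
#include <stdlib.h>

static int free_calls;  /* counts cleanups so the effect is observable */

/* Cleanup handler: receives a pointer to the annotated variable. */
static void auto_free(void *p)
{
    void **pp = p;

    free(*pp);
    free_calls++;
}

#define AUTOFREE __attribute__((cleanup(auto_free)))

/* The buffer is released on both return paths when 'buf' leaves scope. */
static int early_return_path(int fail)
{
    AUTOFREE char *buf = malloc(16);

    if (!buf || fail) {
        return -1;
    }
    return 0;
}
```

Each call to early_return_path() runs the handler exactly once, whichever
return statement is taken.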



  }
  
-if (!ret) {

-return 0;
-}
-
-error_setg_errno(errp, -ret, "VFIO_DEVICE_SET_IRQS failure");
+error_setg_errno(errp, errno, "VFIO_DEVICE_SET_IRQS

Re: [PATCH 05/16] vfio/helpers: Make vfio_device_get_name() return bool

2024-05-24 Thread Anthony Krowiak



On 5/15/24 4:20 AM, Zhenzhong Duan wrote:

This is to follow the coding standard in qapi/error.h to return bool
for bool-valued functions.

Suggested-by: Cédric Le Goater 
Signed-off-by: Zhenzhong Duan 
---
  include/hw/vfio/vfio-common.h | 2 +-
  hw/vfio/ap.c  | 2 +-
  hw/vfio/ccw.c | 2 +-
  hw/vfio/helpers.c | 8 
  hw/vfio/pci.c | 2 +-
  hw/vfio/platform.c| 5 ++---
  6 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index fdce13f0f2..d9891c796f 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -280,7 +280,7 @@ int vfio_get_dirty_bitmap(const VFIOContainerBase *bcontainer, uint64_t iova,
uint64_t size, ram_addr_t ram_addr, Error **errp);
  
  /* Returns 0 on success, or a negative errno. */

-int vfio_device_get_name(VFIODevice *vbasedev, Error **errp);
+bool vfio_device_get_name(VFIODevice *vbasedev, Error **errp);
  void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp);
  void vfio_device_init(VFIODevice *vbasedev, int type, VFIODeviceOps *ops,
DeviceState *dev, bool ram_discard);
diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index d8a9615fee..c12531a788 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -158,7 +158,7 @@ static void vfio_ap_realize(DeviceState *dev, Error **errp)
  VFIOAPDevice *vapdev = VFIO_AP_DEVICE(dev);
  VFIODevice *vbasedev = &vapdev->vdev;
  
-if (vfio_device_get_name(vbasedev, errp) < 0) {

+if (!vfio_device_get_name(vbasedev, errp)) {
  return;
  }



snip ...


  
diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c

index 93e6fef6de..a69b4411e5 100644
--- a/hw/vfio/helpers.c
+++ b/hw/vfio/helpers.c
@@ -605,7 +605,7 @@ bool vfio_has_region_cap(VFIODevice *vbasedev, int region, uint16_t cap_type)
  return ret;
  }
  
-int vfio_device_get_name(VFIODevice *vbasedev, Error **errp)

+bool vfio_device_get_name(VFIODevice *vbasedev, Error **errp)
  {
  ERRP_GUARD();
  struct stat st;
@@ -614,7 +614,7 @@ int vfio_device_get_name(VFIODevice *vbasedev, Error **errp)
  if (stat(vbasedev->sysfsdev, &st) < 0) {
  error_setg_errno(errp, errno, "no such host device");
  error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->sysfsdev);
-return -errno;
+return false;
  }
  /* User may specify a name, e.g: VFIO platform device */
  if (!vbasedev->name) {
@@ -623,7 +623,7 @@ int vfio_device_get_name(VFIODevice *vbasedev, Error **errp)
  } else {
  if (!vbasedev->iommufd) {
  error_setg(errp, "Use FD passing only with iommufd backend");
-return -EINVAL;
+return false;
  }
  /*
   * Give a name with fd so any function printing out vbasedev->name
@@ -634,7 +634,7 @@ int vfio_device_get_name(VFIODevice *vbasedev, Error **errp)
  }
  }
  
-return 0;

+return true;
  }



For the two functions above:

Reviewed-by: Anthony Krowiak 


  



snip ...





Re: [PULL 00/72] ppc-for-9.1-1 queue

2024-05-24 Thread Richard Henderson

On 5/23/24 16:53, Nicholas Piggin wrote:

This replaces the previous PR for tags/pull-ppc-for-9.1-1-20240524 note
this tag is tags/pull-ppc-for-9.1-1-20240524-1 (added -1 suffix). The
changelog and code are unchanged. Subject for BHRB patches are fixed
and trimmed for some MMU cleanup patches. So I won't re-send individual
patches to lists.

Thanks,
Nick

The following changes since commit 70581940cabcc51b329652becddfbc6a261b1b83:

   Merge tag 'pull-tcg-20240523' of https://gitlab.com/rth7680/qemu into staging (2024-05-23 09:47:40 -0700)

are available in the Git repository at:

   https://gitlab.com/npiggin/qemu.git  tags/pull-ppc-for-9.1-1-20240524-1

for you to fetch changes up to e48fb4c590a23d81ee1d2f09ee9bcf5dd5f98e43:

   target/ppc: Remove pp_check() and reuse ppc_hash32_pp_prot() (2024-05-24 09:43:14 +1000)



* Fix an interesting TLB invalidate race
* Implement more instructions with decodetree
* Add the POWER8/9/10 BHRB facility
* Add missing instructions, registers, SMT support
* First round of a big MMU xlate cleanup


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/9.1 as appropriate.


r~




Re: [PATCH rfcv2 15/17] intel_iommu: Set default aw_bits to 48 in scalable modren mode

2024-05-24 Thread CLEMENT MATHIEU--DRIF
Hi Zhenzhong

On 22/05/2024 08:23, Zhenzhong Duan wrote:
> According to VTD spec, stage-1 page table could support 4-level and
> 5-level paging.
>
> However, 5-level paging translation emulation is not supported yet.
> That means the only supported value for aw_bits is 48.
>
> So default aw_bits to 48 in scalable modern mode. In other cases,
> it still defaults to 39 for compatibility.
>
> Add a check to ensure user specified value is 48 in modern mode
> for now.
>
> Signed-off-by: Zhenzhong Duan 
> ---
>   hw/i386/intel_iommu.c | 16 +++-
>   1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index e07daaba99..a4c241ea96 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -3748,7 +3748,7 @@ static Property vtd_properties[] = {
>   ON_OFF_AUTO_AUTO),
>   DEFINE_PROP_BOOL("x-buggy-eim", IntelIOMMUState, buggy_eim, false),
>   DEFINE_PROP_UINT8("aw-bits", IntelIOMMUState, aw_bits,
> -  VTD_HOST_ADDRESS_WIDTH),
> +  0xff),
you could define a constant for this invalid value
>   DEFINE_PROP_BOOL("caching-mode", IntelIOMMUState, caching_mode, FALSE),
>   DEFINE_PROP_BOOL("x-scalable-mode", IntelIOMMUState, scalable_mode, FALSE),
>   DEFINE_PROP_BOOL("snoop-control", IntelIOMMUState, snoop_control, false),
> @@ -4663,6 +4663,14 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
>   }
>   }
>
> +if (s->aw_bits == 0xff) {
> +if (s->scalable_modern) {
> +s->aw_bits = VTD_HOST_AW_48BIT;
> +} else {
> +s->aw_bits = VTD_HOST_AW_39BIT;
> +}
> +}
> +
>   if ((s->aw_bits != VTD_HOST_AW_39BIT) &&
>   (s->aw_bits != VTD_HOST_AW_48BIT) &&
>   !s->scalable_modern) {
> @@ -4671,6 +4679,12 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
>   return false;
>   }
>
> +if ((s->aw_bits != VTD_HOST_AW_48BIT) && s->scalable_modern) {
> +error_setg(errp, "Supported values for aw-bits are: %d",
specify 'in modern mode' in the message?
> +   VTD_HOST_AW_48BIT);
> +return false;
> +}
> +
>   if (s->scalable_mode && !s->dma_drain) {
>   error_setg(errp, "Need to set dma_drain for scalable mode");
>   return false;
> --
> 2.34.1
>
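
A minimal sketch of the two suggestions above, assuming a named sentinel for
the "not set" default. AW_BITS_UNSET, AW_39BIT, AW_48BIT and resolve_aw_bits()
are hypothetical names for this example and not existing QEMU identifiers. The
logic mirrors the checks in the quoted hunks:

```c
#include <stdbool.h>
#include <stdint.h>

#define AW_BITS_UNSET 0xff  /* hypothetical name for the "not set" default */
#define AW_39BIT      39
#define AW_48BIT      48

/*
 * Resolve the default address width, then validate it.
 * Returns false when the resulting width is not supported in the given mode.
 */
static bool resolve_aw_bits(uint8_t *aw_bits, bool scalable_modern)
{
    if (*aw_bits == AW_BITS_UNSET) {
        *aw_bits = scalable_modern ? AW_48BIT : AW_39BIT;
    }
    if (scalable_modern) {
        /* modern mode: only 48 bits is accepted for now */
        return *aw_bits == AW_48BIT;
    }
    return *aw_bits == AW_39BIT || *aw_bits == AW_48BIT;
}
```

A named constant makes the property default and the later comparison refer to
the same value, and keeping the per-mode checks apart leaves room for a
mode-specific error message.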

Re: [PATCH rfcv2 09/17] intel_iommu: Flush stage-1 cache in iotlb invalidation

2024-05-24 Thread CLEMENT MATHIEU--DRIF
Hi Zhenzhong

On 22/05/2024 08:23, Zhenzhong Duan wrote:
>
> According to spec, Page-Selective-within-Domain Invalidation (11b):
>
> 1. IOTLB entries caching second-stage mappings (PGTT=010b) or pass-through
> (PGTT=100b) mappings associated with the specified domain-id and the
> input-address range are invalidated.
> 2. IOTLB entries caching first-stage (PGTT=001b) or nested (PGTT=011b)
> mapping associated with specified domain-id are invalidated.
>
> So per spec definition the Page-Selective-within-Domain Invalidation
> needs to flush first stage and nested cached IOTLB entries as well.
>
> We don't support nested yet and pass-through mapping is never cached,
> so the IOTLB cache holds only first-stage and second-stage mappings.
>
> Add a tag pgtt in VTDIOTLBEntry to mark PGTT type of the mapping and
> invalidate entries based on PGTT type.
>
> Signed-off-by: Zhenzhong Duan 
> ---
>   include/hw/i386/intel_iommu.h |  1 +
>   hw/i386/intel_iommu.c | 20 +---
>   2 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index 011f374883..b0d5b5a5be 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -156,6 +156,7 @@ struct VTDIOTLBEntry {
>   uint64_t pte;
>   uint64_t mask;
>   uint8_t access_flags;
> +uint8_t pgtt;
>   };
>
>   /* VT-d Source-ID Qualifier types */
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 0801112e2e..0078bad9d4 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -287,9 +287,21 @@ static gboolean vtd_hash_remove_by_page(gpointer key, 
> gpointer value,
>   VTDIOTLBPageInvInfo *info = (VTDIOTLBPageInvInfo *)user_data;
>   uint64_t gfn = (info->addr >> VTD_PAGE_SHIFT_4K) & info->mask;
>   uint64_t gfn_tlb = (info->addr & entry->mask) >> VTD_PAGE_SHIFT_4K;
> -return (entry->domain_id == info->domain_id) &&
> -(((entry->gfn & info->mask) == gfn) ||
> - (entry->gfn == gfn_tlb));
> +
> +if (entry->domain_id != info->domain_id) {
> +return false;
> +}
> +
> +/*
> + * According to spec, IOTLB entries caching first-stage (PGTT=001b) or
> + * nested (PGTT=011b) mapping associated with specified domain-id are
> + * invalidated. Nested isn't supported yet, so only need to check 001b.
> + */
> +if (entry->pgtt == VTD_SM_PASID_ENTRY_FLT) {
> +return true;
> +}
> +
> +return (entry->gfn & info->mask) == gfn || entry->gfn == gfn_tlb;
>   }
>
>   /* Reset all the gen of VTDAddressSpace to zero and set the gen of
> @@ -382,6 +394,8 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t 
> source_id,
>   entry->access_flags = access_flags;
>   entry->mask = vtd_slpt_level_page_mask(level);
>   entry->pasid = pasid;
> +entry->pgtt = s->scalable_modern ? VTD_SM_PASID_ENTRY_FLT
> + : VTD_SM_PASID_ENTRY_SLT;
What about passing pgtt as a parameter so that the translation type 
detection is done only once (in vtd_do_iommu_translate)?
>
>   key->gfn = gfn;
>   key->sid = source_id;
> --
> 2.34.1
>
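A standalone sketch of the matching rule in the quoted vtd_hash_remove_by_page hunk, combined with the suggestion above to pass pgtt in from the caller. The struct and function names here are simplified stand-ins, not the QEMU types, and the PGTT encodings are taken from the spec text quoted in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12
#define PGTT_FLT 1 /* first-stage, 001b */
#define PGTT_SLT 2 /* second-stage, 010b */

/* Simplified stand-ins for VTDIOTLBEntry and VTDIOTLBPageInvInfo. */
struct iotlb_entry {
    uint16_t domain_id;
    uint64_t gfn;
    uint64_t mask;
    uint8_t pgtt;
};

struct page_inv_info {
    uint16_t domain_id;
    uint64_t addr;
    uint64_t mask;
};

/* Returns true if the cached entry must be dropped by the invalidation. */
static bool entry_hit_by_page_inv(const struct iotlb_entry *entry,
                                  const struct page_inv_info *info)
{
    uint64_t gfn = (info->addr >> PAGE_SHIFT_4K) & info->mask;
    uint64_t gfn_tlb = (info->addr & entry->mask) >> PAGE_SHIFT_4K;

    if (entry->domain_id != info->domain_id) {
        return false;
    }
    /* First-stage entries of the domain go regardless of address. */
    if (entry->pgtt == PGTT_FLT) {
        return true;
    }
    /* Second-stage entries go only if they fall in the address range. */
    return (entry->gfn & info->mask) == gfn || entry->gfn == gfn_tlb;
}

/* The caller decides pgtt once and passes it down (review suggestion). */
static void fill_entry(struct iotlb_entry *entry, uint16_t domain_id,
                       uint64_t gfn, uint64_t mask, uint8_t pgtt)
{
    assert(pgtt == PGTT_FLT || pgtt == PGTT_SLT);
    entry->domain_id = domain_id;
    entry->gfn = gfn;
    entry->mask = mask;
    entry->pgtt = pgtt;
}
```

Taking pgtt as an argument keeps the cache-fill helper free of mode checks, so the translation type is determined in one place only.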

Re: [PATCH rfcv2 06/17] intel_iommu: Implement stage-1 translation

2024-05-24 Thread CLEMENT MATHIEU--DRIF
Hi Zhenzhong,

I already sent you my comments about this patch earlier (question about 
checking pgtt), but here is a style review.

On 22/05/2024 08:23, Zhenzhong Duan wrote:
>
> From: Yi Liu 
>
> This adds stage-1 page table walking to support stage-1 only
> translation in scalable modern mode.
>
> Signed-off-by: Yi Liu 
> Signed-off-by: Yi Sun 
> Signed-off-by: Zhenzhong Duan 
> ---
>   hw/i386/intel_iommu_internal.h |  17 +
>   hw/i386/intel_iommu.c  | 128 +++--
>   2 files changed, 141 insertions(+), 4 deletions(-)
>
> diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
> index 0e240d6d54..abfdbd5f65 100644
> --- a/hw/i386/intel_iommu_internal.h
> +++ b/hw/i386/intel_iommu_internal.h
> @@ -534,6 +534,23 @@ typedef struct VTDRootEntry VTDRootEntry;
>   #define VTD_SM_PASID_ENTRY_AW  7ULL /* Adjusted guest-address-width 
> */
>   #define VTD_SM_PASID_ENTRY_DID(val)((val) & VTD_DOMAIN_ID_MASK)
>
> +#define VTD_SM_PASID_ENTRY_FLPM  3ULL
> +#define VTD_SM_PASID_ENTRY_FLPTPTR   (~0xfffULL)
> +
> +/* Paging Structure common */
> +#define VTD_FL_PT_PAGE_SIZE_MASK(1ULL << 7)
> +/* Bits to decide the offset for each level */
> +#define VTD_FL_LEVEL_BITS   9
> +
> +/* First Level Paging Structure */
> +#define VTD_FL_PT_LEVEL 1
> +#define VTD_FL_PT_ENTRY_NR  512
> +
> +/* Masks for First Level Paging Entry */
> +#define VTD_FL_RW_MASK  (1ULL << 1)
> +#define VTD_FL_PT_BASE_ADDR_MASK(aw) (~(VTD_PAGE_SIZE - 1) & 
> VTD_HAW_MASK(aw))
> +#define VTD_PASID_ENTRY_FPD (1ULL << 1) /* Fault Processing Disable 
> */
> +
>   /* Second Level Page Translation Pointer*/
>   #define VTD_SM_PASID_ENTRY_SLPTPTR (~0xfffULL)
>
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 544e8f0e40..cf29809bc1 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -50,6 +50,8 @@
>   /* pe operations */
>   #define VTD_PE_GET_TYPE(pe) ((pe)->val[0] & VTD_SM_PASID_ENTRY_PGTT)
>   #define VTD_PE_GET_LEVEL(pe) (2 + (((pe)->val[0] >> 2) & 
> VTD_SM_PASID_ENTRY_AW))
> +#define VTD_PE_GET_FLPT_LEVEL(pe) \
> +(4 + (((pe)->val[2] >> 2) & VTD_SM_PASID_ENTRY_FLPM))
>
>   /*
>* PCI bus number (or SID) is not reliable since the device is usaully
> @@ -823,6 +825,11 @@ static int 
> vtd_get_pe_in_pasid_leaf_table(IntelIOMMUState *s,
>   return -VTD_FR_PASID_TABLE_ENTRY_INV;
>   }
>
> +if (pgtt == VTD_SM_PASID_ENTRY_FLT &&
> +VTD_PE_GET_FLPT_LEVEL(pe) != 4) {
Maybe you could add a function to check if the level is supported.
And it would also be nice to rename vtd_is_level_supported (used just 
above these lines) to make it clear that it's only relevant for second 
level translations and avoid mistakes
> +return -VTD_FR_PASID_TABLE_ENTRY_INV;
> +}
> +
>   return 0;
>   }
>
> @@ -958,7 +965,11 @@ static uint32_t vtd_get_iova_level(IntelIOMMUState *s,
>
>   if (s->root_scalable) {
>   vtd_ce_get_rid2pasid_entry(s, ce, &pe, pasid);
> -return VTD_PE_GET_LEVEL(&pe);
> +if (s->scalable_modern) {
> +return VTD_PE_GET_FLPT_LEVEL(&pe);
> +} else {
> +return VTD_PE_GET_LEVEL(&pe);
same, could be renamed
> +}
>   }
>
>   return vtd_ce_get_level(ce);
> @@ -1045,7 +1056,11 @@ static dma_addr_t 
> vtd_get_iova_pgtbl_base(IntelIOMMUState *s,
>
>   if (s->root_scalable) {
>   vtd_ce_get_rid2pasid_entry(s, ce, &pe, pasid);
> -return pe.val[0] & VTD_SM_PASID_ENTRY_SLPTPTR;
> +if (s->scalable_modern) {
> +return pe.val[2] & VTD_SM_PASID_ENTRY_FLPTPTR;
> +} else {
> +return pe.val[0] & VTD_SM_PASID_ENTRY_SLPTPTR;
> +}
>   }
>
>   return vtd_ce_get_slpt_base(ce);
> @@ -1847,6 +1862,106 @@ out:
>   trace_vtd_pt_enable_fast_path(source_id, success);
>   }
>
> +/* The shift of an addr for a certain level of paging structure */
> +static inline uint32_t vtd_flpt_level_shift(uint32_t level)
> +{
> +assert(level != 0);
> +return VTD_PAGE_SHIFT_4K + (level - 1) * VTD_FL_LEVEL_BITS;
> +}
> +
> +/*
> + * Given an iova and the level of paging structure, return the offset
> + * of current level.
> + */
> +static inline uint32_t vtd_iova_fl_level_offset(uint64_t iova, uint32_t 
> level)
> +{
> +return (iova >> vtd_flpt_level_shift(level)) &
> +((1ULL << VTD_FL_LEVEL_BITS) - 1);
> +}
> +
> +/* Get the content of a flpte located in @base_addr[@index] */
> +static uint64_t vtd_get_flpte(dma_addr_t base_addr, uint32_t index)
> +{
> +uint64_t flpte;
> +
> +assert(index < VTD_FL_PT_ENTRY_NR);
> +
> +if (dma_memory_read(&address_space_memory,
> +base_addr + index * sizeof(flpte), &flpte,
> + 
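The quoted message is cut off here. As a standalone sketch of the index arithmetic in the helpers above, the following reproduces the shift and offset computation outside QEMU, together with a first-level "is this level supported" check of the kind suggested in the review. vtd_is_fl_level_supported() and vtd_flpm_to_level() are hypothetical names. The 4-level-only rule mirrors the check in the quoted hunk, and VTD_PAGE_SHIFT_4K is assumed to be 12.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VTD_PAGE_SHIFT_4K       12 /* assumed value */
#define VTD_FL_LEVEL_BITS       9
#define VTD_SM_PASID_ENTRY_FLPM 3ULL

/* Shift of an address for a given first-level paging level (1 = leaf). */
static inline uint32_t vtd_flpt_level_shift(uint32_t level)
{
    assert(level != 0);
    return VTD_PAGE_SHIFT_4K + (level - 1) * VTD_FL_LEVEL_BITS;
}

/* Index into the paging structure at @level for @iova. */
static inline uint32_t vtd_iova_fl_level_offset(uint64_t iova, uint32_t level)
{
    return (iova >> vtd_flpt_level_shift(level)) &
           ((1ULL << VTD_FL_LEVEL_BITS) - 1);
}

/* Hypothetical: decode the paging level from the FLPM field of val[2]. */
static inline uint32_t vtd_flpm_to_level(uint64_t pe_val2)
{
    return 4 + ((pe_val2 >> 2) & VTD_SM_PASID_ENTRY_FLPM);
}

/* Hypothetical: only 4-level first-stage tables are accepted for now. */
static inline bool vtd_is_fl_level_supported(uint32_t level)
{
    return level == 4;
}
```

Each level consumes 9 bits of the address above the 12-bit page offset, so levels 1 to 4 use shifts of 12, 21, 30 and 39.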

Re: [PATCH V1 05/26] migration: precreate vmstate

2024-05-24 Thread Fabiano Rosas
Steve Sistare  writes:

> Provide the VMStateDescription precreate field to mark objects that must
> be loaded on the incoming side before devices have been created, because
> they provide properties that will be needed at creation time.  They will
> be saved to and loaded from their own QEMUFile, via
> qemu_savevm_precreate_save and qemu_savevm_precreate_load, but these
> functions are not yet called in this patch.  Allow them to be called
> before or after normal migration is active, when current_migration and
> current_incoming are not valid.
>
> Signed-off-by: Steve Sistare 

Reviewed-by: Fabiano Rosas 



Re: [PATCH V1 00/26] Live update: cpr-exec

2024-05-24 Thread Steven Sistare

On 5/24/2024 9:02 AM, Fabiano Rosas wrote:

Steve Sistare  writes:


This patch series adds the live migration cpr-exec mode.  In this mode, QEMU
stops the VM, writes VM state to the migration URI, and directly exec's a
new version of QEMU on the same host, replacing the original process while
retaining its PID.  Guest RAM is preserved in place, albeit with new virtual
addresses.  The user completes the migration by specifying the -incoming
option, and by issuing the migrate-incoming command if necessary.  This
saves and restores VM state, with minimal guest pause time, so that QEMU may
be updated to a new version in between.

The new interfaces are:
   * cpr-exec (MigMode migration parameter)
   * cpr-exec-args (migration parameter)
   * memfd-alloc=on (command-line option for -machine)
   * only-migratable-modes (command-line argument)

The caller sets the mode parameter before invoking the migrate command.

Arguments for the new QEMU process are taken from the cpr-exec-args parameter.
The first argument should be the path of a new QEMU binary, or a prefix
command that exec's the new QEMU binary, and the arguments should include
the -incoming option.

Memory backend objects must have the share=on attribute, and must be mmap'able
in the new QEMU process.  For example, memory-backend-file is acceptable,
but memory-backend-ram is not.

QEMU must be started with the '-machine memfd-alloc=on' option.  This causes
implicit RAM blocks (those not explicitly described by a memory-backend
object) to be allocated by mmap'ing a memfd.  Examples include VGA, ROM,
and even guest RAM when it is specified without reference to a
memory-backend object.   The memfds are kept open across exec, their values
are saved in vmstate which is retrieved after exec, and they are re-mmap'd.
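As a rough illustration of what keeping a descriptor open across exec involves at the POSIX level, the sketch below clears the close-on-exec flag with fcntl(). The helper name clear_cloexec() is hypothetical, and this is not claimed to be the implementation of the qemu_clear_cloexec patch in this series.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: clear FD_CLOEXEC on @fd so the descriptor stays
 * open in the process image started by exec. Returns 0 on success and
 * -1 on error, with errno set by fcntl().
 */
static int clear_cloexec(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFD);
    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC);
}
```

The descriptor number is unchanged by exec, which is why saving that number in vmstate is enough for the new process to find and re-mmap the memfd.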

The '-only-migratable-modes cpr-exec' option guarantees that the
configuration supports cpr-exec.  QEMU will exit at start time if not.

Example:

In this example, we simply restart the same version of QEMU, but in
a real scenario one would set a new QEMU binary path in cpr-exec-args.

   # qemu-kvm -monitor stdio -object
   memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/ram0,share=on
   -m 4G -machine memfd-alloc=on ...

   QEMU 9.1.50 monitor - type 'help' for more information
   (qemu) info status
   VM status: running
   (qemu) migrate_set_parameter mode cpr-exec
   (qemu) migrate_set_parameter cpr-exec-args qemu-kvm ... -incoming file:vm.state
   (qemu) migrate -d file:vm.state
   (qemu) QEMU 9.1.50 monitor - type 'help' for more information
   (qemu) info status
   VM status: running

cpr-exec mode preserves attributes of outgoing devices that must be known
before the device is created on the incoming side, such as the memfd descriptor
number, but currently the migration stream is read after all devices are
created.  To solve this problem, I add two VMStateDescription options:
precreate and factory.  precreate objects are saved to their own migration
stream, distinct from the main stream, and are read early by incoming QEMU,
before devices are created.  Factory objects are allocated on demand, without
relying on a pre-registered object's opaque address, which is necessary
because the devices to which the state will apply have not been created yet
and hence have not registered an opaque address to receive the state.

This patch series implements a minimal version of cpr-exec.  Future series
will add support for:
   * vfio
   * chardev's without loss of connectivity
   * vhost
   * fine-grained seccomp controls
   * hostmem-memfd
   * cpr-exec migration test


Steve Sistare (26):
   oslib: qemu_clear_cloexec
   vl: helper to request re-exec
   migration: SAVEVM_FOREACH
   migration: delete unused parameter mis
   migration: precreate vmstate
   migration: precreate vmstate for exec
   migration: VMStateId
   migration: vmstate_info_void_ptr
   migration: vmstate_register_named
   migration: vmstate_unregister_named
   migration: vmstate_register at init time
   migration: vmstate factory object
   physmem: ram_block_create
   physmem: hoist guest_memfd creation
   physmem: hoist host memory allocation
   physmem: set ram block idstr earlier
   machine: memfd-alloc option
   migration: cpr-exec-args parameter
   physmem: preserve ram blocks for cpr
   migration: cpr-exec mode
   migration: migrate_add_blocker_mode
   migration: ram block cpr-exec blockers
   migration: misc cpr-exec blockers
   seccomp: cpr-exec blocker
   migration: fix mismatched GPAs during cpr-exec
   migration: only-migratable-modes

  accel/xen/xen-all.c|   5 +
  backends/hostmem-epc.c |  12 +-
  hmp-commands.hx|   2 +-
  hw/core/machine.c  |  22 +++
  hw/core/qdev.c |   1 +
  hw/intc/apic_common.c  |   2 +-
  hw/vfio/migration.c|   3 +-
  include/exec/cpu-common.h  |   3 +-
  include/exec/memory.h  |  15 ++
  include/exec/ramblock.h|  10 +-
  include/

Re: [PATCH 01/16] target/i386: remove unnecessary gen_update_cc_op before gen_eob*

2024-05-24 Thread Richard Henderson

On 5/24/24 01:10, Paolo Bonzini wrote:

This is already handled in gen_eob().  Before adding another DISAS_*
case, remove the double calls.

Signed-off-by: Paolo Bonzini
---
  target/i386/tcg/translate.c | 2 --
  1 file changed, 2 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 02/16] target/i386: cleanup eob handling of RSM

2024-05-24 Thread Richard Henderson

On 5/24/24 01:10, Paolo Bonzini wrote:

gen_helper_rsm cannot generate an exception, and reloads the flags.
So there's no need to spill cc_op and update cpu_eip, but on the
other hand cc_op must be reset to CC_OP_EFLAGS before returning.

It all works by chance, because by spilling cc_op before the call
to the helper, it becomes non-dirty and gen_eob will not overwrite
the CC_OP_EFLAGS value that is placed there by the helper.  But
let's clean it up.

Signed-off-by: Paolo Bonzini
---
  target/i386/tcg/translate.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 03/16] target/i386: document and group DISAS_* constants

2024-05-24 Thread Richard Henderson

On 5/24/24 01:10, Paolo Bonzini wrote:

Place DISAS_* constants that update cpu_eip first, and
the "jump" ones last.  Add comments explaining the differences
and usage.

Signed-off-by: Paolo Bonzini
---
  target/i386/tcg/translate.c | 25 ++---
  1 file changed, 22 insertions(+), 3 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 03/16] target/i386: document and group DISAS_* constants

2024-05-24 Thread Richard Henderson

On 5/24/24 01:10, Paolo Bonzini wrote:

Place DISAS_* constants that update cpu_eip first, and
the "jump" ones last.  Add comments explaining the differences
and usage.

Signed-off-by: Paolo Bonzini 
---
  target/i386/tcg/translate.c | 25 ++---
  1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 3c7d8d72144..52d758a224b 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -144,9 +144,28 @@ typedef struct DisasContext {
  TCGOp *prev_insn_end;
  } DisasContext;
  
-#define DISAS_EOB_ONLY DISAS_TARGET_0
-#define DISAS_EOB_NEXT DISAS_TARGET_1
-#define DISAS_EOB_INHIBIT_IRQ  DISAS_TARGET_2
+/*
+ * Point EIP to next instruction before ending translation.
+ * For instructions that can change hflags.
+ */
+#define DISAS_EOB_NEXT DISAS_TARGET_0
+
+/*
+ * Point EIP to next instruction and set HF_INHIBIT_IRQ if not
+ * already set.  For instructions that activate interrupt shadow.
+ */
+#define DISAS_EOB_INHIBIT_IRQ  DISAS_TARGET_1
+
+/*
+ * EIP has already been updated.  For jumps that do not use
+ * lookup_and_goto_ptr()
+ */
+#define DISAS_EOB_ONLY DISAS_TARGET_2


Better as "for instructions that must return to the main loop", because pure jumps should 
either use goto_tb (DISAS_NORETURN) or lookup_and_goto_ptr (DISAS_JUMP).


Otherwise,
Reviewed-by: Richard Henderson 


r~



Re: [PATCH 04/16] target/i386: avoid calling gen_eob_syscall before tb_stop

2024-05-24 Thread Richard Henderson

On 5/24/24 01:10, Paolo Bonzini wrote:

syscall and sysret only have one exit, so they do not need to
generate the end-of-translation code inline.  It can be
deferred to tb_stop.

Signed-off-by: Paolo Bonzini
---
  target/i386/tcg/translate.c | 13 +++--
  1 file changed, 11 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 05/16] target/i386: avoid calling gen_eob_inhibit_irq before tb_stop

2024-05-24 Thread Richard Henderson

On 5/24/24 01:10, Paolo Bonzini wrote:

sti only has one exit, so it does not need to generate the
end-of-translation code inline.  It can be deferred to tb_stop.

Signed-off-by: Paolo Bonzini
---
  target/i386/tcg/translate.c | 13 -
  target/i386/tcg/emit.c.inc  |  4 +---
  2 files changed, 1 insertion(+), 16 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 06/16] target/i386: assert that gen_update_eip_cur and gen_update_eip_next are the same in tb_stop

2024-05-24 Thread Richard Henderson

On 5/24/24 01:10, Paolo Bonzini wrote:

This is an invariant, since these cases of tb_stop() should only
be reached through the "instruction decoding completed" path of
i386_tr_translate_insn().

Signed-off-by: Paolo Bonzini
---
  target/i386/tcg/translate.c | 2 ++
  1 file changed, 2 insertions(+)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 07/16] target/i386: raze the gen_eob* jungle

2024-05-24 Thread Richard Henderson

On 5/24/24 01:10, Paolo Bonzini wrote:

Make gen_eob take the DISAS_* constant as an argument, so that
it is not necessary to have wrappers around it.

Signed-off-by: Paolo Bonzini
---
  target/i386/tcg/translate.c | 60 +
  1 file changed, 14 insertions(+), 46 deletions(-)


Reviewed-by: Richard Henderson 

r~
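A toy sketch of the refactoring pattern described in this patch: one end-of-block routine that takes the mode as an argument instead of one wrapper per mode. The enum, struct and function below are hypothetical stand-ins, not the QEMU code. The effects per mode follow the DISAS_* comments quoted in patch 03 of this thread, and where the real code performs each step is not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the DISAS_EOB_* modes. */
enum eob_mode {
    EOB_NEXT,        /* point EIP to the next instruction */
    EOB_INHIBIT_IRQ, /* as EOB_NEXT, and set the interrupt shadow */
    EOB_ONLY         /* EIP has already been updated */
};

/* What the end-of-block code has to do for a given mode. */
struct eob_effects {
    bool update_eip;
    bool set_inhibit_irq;
};

/* Single entry point: the mode argument replaces per-mode wrappers. */
static struct eob_effects gen_eob_sketch(enum eob_mode mode)
{
    struct eob_effects fx = { false, false };

    switch (mode) {
    case EOB_INHIBIT_IRQ:
        fx.set_inhibit_irq = true;
        /* fall through */
    case EOB_NEXT:
        fx.update_eip = true;
        break;
    case EOB_ONLY:
        break;
    default:
        assert(!"unknown end-of-block mode");
    }
    return fx;
}
```

Callers then name the mode at the call site, which keeps the differences between the modes visible in one switch.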



Re: [PATCH 08/16] target/i386: reg in gen_ldst_modrm is always OR_TMP0

2024-05-24 Thread Richard Henderson

On 5/24/24 01:10, Paolo Bonzini wrote:

Values other than OR_TMP0 were only ever used by MOV and MOVNTI
opcodes.  Now that these have been converted to the new decoder,
remove the argument.

Signed-off-by: Paolo Bonzini
---
  target/i386/tcg/translate.c | 33 -
  1 file changed, 12 insertions(+), 21 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 09/16] target/i386: split gen_ldst_modrm for load and store

2024-05-24 Thread Richard Henderson

On 5/24/24 01:10, Paolo Bonzini wrote:

The is_store argument of gen_ldst_modrm has only ever been passed
a constant.  Just split the function in two.

Signed-off-by: Paolo Bonzini
---
  target/i386/tcg/translate.c | 52 +
  1 file changed, 29 insertions(+), 23 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 10/16] target/i386: inline gen_add_A0_ds_seg

2024-05-24 Thread Richard Henderson

On 5/24/24 01:10, Paolo Bonzini wrote:

It is only used in MONITOR, where a direct call of gen_lea_v_seg
is simpler, and in XLAT.  Inline it in the latter.

Signed-off-by: Paolo Bonzini
---
  target/i386/tcg/translate.c | 9 +
  target/i386/tcg/emit.c.inc  | 2 +-
  2 files changed, 2 insertions(+), 9 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 11/16] target/i386: use mo_stacksize more

2024-05-24 Thread Richard Henderson

On 5/24/24 01:10, Paolo Bonzini wrote:

Use mo_stacksize for all stack accesses, including when
a 64-bit code segment is impossible and the code is
therefore checking only for SS32(s).

Signed-off-by: Paolo Bonzini
---
  target/i386/tcg/translate.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 12/16] target/i386: introduce gen_lea_ss_ofs

2024-05-24 Thread Richard Henderson

On 5/24/24 01:10, Paolo Bonzini wrote:

Generalize gen_stack_A0() to include an initial add and to use an arbitrary
destination.  This is a common pattern and it is not a huge burden to
add the extra arguments to the only caller of gen_stack_A0().

Signed-off-by: Paolo Bonzini
---
  target/i386/tcg/translate.c | 51 +++--
  target/i386/tcg/emit.c.inc  |  2 +-
  2 files changed, 22 insertions(+), 31 deletions(-)


Reviewed-by: Richard Henderson 

r~


