Re: [PATCH v3A 1/6] target/i386: add support for FRED in CPUID enumeration
On Fri, Dec 22, 2023 at 03:34:02PM +0800, Zhao Liu wrote:
> Date: Fri, 22 Dec 2023 15:34:02 +0800
> From: Zhao Liu
> Subject: Re: [PATCH v3A 1/6] target/i386: add support for FRED in CPUID
>  enumeration
>
> On Thu, Dec 21, 2023 at 07:03:36PM -0800, Xin Li wrote:
> > Date: Thu, 21 Dec 2023 19:03:36 -0800
> > From: Xin Li
> > Subject: [PATCH v3A 1/6] target/i386: add support for FRED in CPUID
> >  enumeration
> > X-Mailer: git-send-email 2.43.0
> >
> > FRED, i.e., the Intel flexible return and event delivery architecture,
> > defines simple new transitions that change privilege level (ring
> > transitions).
> >
> > The new transitions defined by the FRED architecture are FRED event
> > delivery and, for returning from events, two FRED return instructions.
> > FRED event delivery can effect a transition from ring 3 to ring 0, but
> > it is used also to deliver events incident to ring 0. One FRED
> > instruction (ERETU) effects a return from ring 0 to ring 3, while the
> > other (ERETS) returns while remaining in ring 0. Collectively, FRED
> > event delivery and the FRED return instructions are FRED transitions.
> >
> > In addition to these transitions, the FRED architecture defines a new
> > instruction (LKGS) for managing the state of the GS segment register.
> > The LKGS instruction can be used by 64-bit operating systems that do
> > not use the new FRED transitions.
> >
> > WRMSRNS is an instruction that behaves exactly like WRMSR, with the
> > only difference being that it is not a serializing instruction by
> > default. Under certain conditions, WRMSRNS may replace WRMSR to improve
> > performance. FRED uses it to switch RSP0 in a faster manner.
> >
> > Search for the latest FRED spec in most search engines with this search
> > pattern:
> >
> >   site:intel.com FRED (flexible return and event delivery) specification
> >
> > The CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[17] enumerates FRED, and
> > the CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[18] enumerates LKGS, and
> > the CPUID feature flag CPUID.(EAX=7,ECX=1):EAX[19] enumerates WRMSRNS.
> >
> > Add CPUID definitions for FRED/LKGS/WRMSRNS, and expose them to KVM guests.
> >
> > Because FRED relies on LKGS and WRMSRNS, add that to feature dependency
> > map.
> >
> > Tested-by: Shan Kang
> > Signed-off-by: Xin Li
> > ---
>
> Reviewed-by: Zhao Liu
>
> > Changelog
> > v3A:
> > - Fix reversed dependency (Wu Dan1).
> > ---
> >  target/i386/cpu.c | 10 +-
> >  target/i386/cpu.h |  6 ++
> >  2 files changed, 15 insertions(+), 1 deletion(-)
> >
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 358d9c0a65..66551c7eae 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -965,7 +965,7 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
> >             "avx-vnni", "avx512-bf16", NULL, "cmpccxadd",
> >             NULL, NULL, "fzrm", "fsrs",
> >             "fsrc", NULL, NULL, NULL,
> > -           NULL, NULL, NULL, NULL,
> > +           NULL, "fred", "lkgs", "wrmsrns",
> >             NULL, "amx-fp16", NULL, "avx-ifma",
> >             NULL, NULL, NULL, NULL,
> >             NULL, NULL, NULL, NULL,
> > @@ -1552,6 +1552,14 @@ static FeatureDep feature_dependencies[] = {
> >         .from = { FEAT_VMX_SECONDARY_CTLS, VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE },
> >         .to = { FEAT_7_0_ECX, CPUID_7_0_ECX_WAITPKG },
> >     },
> > +   {
> > +       .from = { FEAT_7_1_EAX, CPUID_7_1_EAX_LKGS },
> > +       .to = { FEAT_7_1_EAX, CPUID_7_1_EAX_FRED },
> > +   },
> > +   {
> > +       .from = { FEAT_7_1_EAX, CPUID_7_1_EAX_WRMSRNS },
> > +       .to = { FEAT_7_1_EAX, CPUID_7_1_EAX_FRED },
> > +   },

Oh, sorry, one thing that comes to mind, is this dependency required?

Since the FRED spec (v3.0) is all about WRMSR as the example, without
mentioning WRMSRNS, could there be other implementations that depend on
WRMSR instead of WRMSRNS?

The dependencies of LKGS are clearly defined in spec.

-Zhao

> >  };
> >
> >  typedef struct X86RegisterInfo32 {
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index cd2e295bd6..5faf00551d 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -934,6 +934,12 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
> >  #define CPUID_7_1_EDX_AMX_COMPLEX (1U << 8)
> >  /* PREFETCHIT0/1 Instructions */
> >  #define CPUID_7_1_EDX_PREFETCHITI (1U << 14)
> > +/* Flexible return and event delivery (FRED) */
> > +#define CPUID_7_1_EAX_FRED (1U << 17)
> > +/* Load into IA32_KERNEL_GS_BASE (LKGS) */
> > +#define CPUID_7_1_EAX_LKGS (1U << 18)
> > +/* Non-Serializing Write to Model Specific Register (WRMSRNS) */
> > +#define CPUID_7_1_EAX_WRMSRNS (1U << 19)
> >
> >  /* Do not exhibit MXCSR Configuration Dependent Timing (MCDT) behavior */
> >  #define CPUID_7_2_EDX_MCDT_NO (1U << 5)
>
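As an illustration of the enumeration described above, a user-space probe of
the three feature bits might look like the following sketch. It assumes a
GCC/Clang toolchain providing <cpuid.h> and is illustrative only, not part of
the patch:

    /* Probe CPUID.(EAX=7,ECX=1):EAX for FRED/LKGS/WRMSRNS. */
    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;

        if (!__get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx)) {
            puts("CPUID leaf 7, subleaf 1 not supported");
            return 1;
        }
        printf("FRED:    %u\n", (eax >> 17) & 1); /* CPUID.(EAX=7,ECX=1):EAX[17] */
        printf("LKGS:    %u\n", (eax >> 18) & 1); /* CPUID.(EAX=7,ECX=1):EAX[18] */
        printf("WRMSRNS: %u\n", (eax >> 19) & 1); /* CPUID.(EAX=7,ECX=1):EAX[19] */
        return 0;
    }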
Re: [PATCH 23/24] exec/cpu-all: Extract page-protection definitions to page-prot-common.h
On Tue Dec 12, 2023 at 7:20 AM AEST, Philippe Mathieu-Daudé wrote:
> Extract page-protection definitions from "exec/cpu-all.h"
> to "exec/page-prot-common.h".
>
> The list of files requiring the new header was generated
> using:
>
> $ git grep -wE \
>   'PAGE_(READ|WRITE|EXEC|BITS|VALID|ANON|RESERVED|TARGET_.|PASSTHROUGH)'

Acked-by: Nicholas Piggin (ppc)

Looks trivial for ppc, so fine. But what's the difference between
-common.h and -all.h suffix?

Thanks,
Nick
RE: [PATCH v3A 1/6] target/i386: add support for FRED in CPUID enumeration
> > >             NULL, NULL, NULL, NULL,
> > > @@ -1552,6 +1552,14 @@ static FeatureDep feature_dependencies[] = {
> > >         .from = { FEAT_VMX_SECONDARY_CTLS, VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE },
> > >         .to = { FEAT_7_0_ECX, CPUID_7_0_ECX_WAITPKG },
> > >     },
> > > +   {
> > > +       .from = { FEAT_7_1_EAX, CPUID_7_1_EAX_LKGS },
> > > +       .to = { FEAT_7_1_EAX, CPUID_7_1_EAX_FRED },
> > > +   },
> > > +   {
> > > +       .from = { FEAT_7_1_EAX, CPUID_7_1_EAX_WRMSRNS },
> > > +       .to = { FEAT_7_1_EAX, CPUID_7_1_EAX_FRED },
> > > +   },
>
> Oh, sorry, one thing that comes to mind, is this dependency required?
> Since the FRED spec (v3.0) is all about WRMSR as the example, without
> mentioning WRMSRNS, could there be other implementations that depend on
> WRMSR instead of WRMSRNS?

This is a community ask from tglx:
https://lkml.kernel.org/kvm/87y1h81ht4.ffs@tglx/

Boris had the same question:
https://lore.kernel.org/lkml/20231114050201.GAZVL%2FSd%2FyLIdON9la@fat_crate.local/

But it needs to go through a formal approach, which takes time, to reach
the FRED public spec.

Thanks!
    Xin
Re: [PATCH v3A 1/6] target/i386: add support for FRED in CPUID enumeration
On Fri, Dec 22, 2023 at 08:24:52AM +, Li, Xin3 wrote:
> Date: Fri, 22 Dec 2023 08:24:52 +
> From: "Li, Xin3"
> Subject: RE: [PATCH v3A 1/6] target/i386: add support for FRED in CPUID
>  enumeration
>
> > > >             NULL, NULL, NULL, NULL,
> > > > @@ -1552,6 +1552,14 @@ static FeatureDep feature_dependencies[] = {
> > > >         .from = { FEAT_VMX_SECONDARY_CTLS, VMX_SECONDARY_EXEC_ENABLE_USER_WAIT_PAUSE },
> > > >         .to = { FEAT_7_0_ECX, CPUID_7_0_ECX_WAITPKG },
> > > >     },
> > > > +   {
> > > > +       .from = { FEAT_7_1_EAX, CPUID_7_1_EAX_LKGS },
> > > > +       .to = { FEAT_7_1_EAX, CPUID_7_1_EAX_FRED },
> > > > +   },
> > > > +   {
> > > > +       .from = { FEAT_7_1_EAX, CPUID_7_1_EAX_WRMSRNS },
> > > > +       .to = { FEAT_7_1_EAX, CPUID_7_1_EAX_FRED },
> > > > +   },
> >
> > Oh, sorry, one thing that comes to mind, is this dependency required?
> > Since the FRED spec (v3.0) is all about WRMSR as the example, without
> > mentioning WRMSRNS, could there be other implementations that depend on
> > WRMSR instead of WRMSRNS?
>
> This is a community ask from tglx:
> https://lkml.kernel.org/kvm/87y1h81ht4.ffs@tglx/
>
> Boris had the same question:
> https://lore.kernel.org/lkml/20231114050201.GAZVL%2FSd%2FyLIdON9la@fat_crate.local/
>
> But it needs to go through a formal approach, which takes time, to reach
> the FRED public spec.
>

Thanks Xin!

You can add a simple note in the commit message, such as FRED's
dependency on WRMSRNS will be documented, to avoid confusion for later
reviewers interested in FRED.

Regards,
Zhao
[PATCH v2 2/4] hw/cxl/device: read from register values in mdev_reg_read()
In the current mdev_reg_read() implementation, it consistently returns
that the Media Status is Ready (01b). This was fine until commit
25a52959f99d ("hw/cxl: Add support for device sanitation") because the
media was presumed to be ready.

However, as per the CXL 3.0 spec "8.2.9.8.5.1 Sanitize (Opcode 4400h)",
during sanitation, the Media State should be set to Disabled (11b). The
mentioned commit correctly sets it to Disabled, but mdev_reg_read()
still returns Media Status as Ready.

To address this, update mdev_reg_read() to read register values instead
of returning dummy values.

Fixes: commit 25a52959f99d ("hw/cxl: Add support for device sanitation")
Reviewed-by: Davidlohr Bueso
Signed-off-by: Hyeonggon Yoo <42.hye...@gmail.com>
---
 hw/cxl/cxl-device-utils.c   | 17 +++--
 include/hw/cxl/cxl_device.h |  4 +++-
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index 29de298117..ba3f80e6e7 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -229,12 +229,9 @@ static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,

 static uint64_t mdev_reg_read(void *opaque, hwaddr offset, unsigned size)
 {
-    uint64_t retval = 0;
-
-    retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MEDIA_STATUS, 1);
-    retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MBOX_READY, 1);
+    CXLDeviceState *cxl_dstate = opaque;

-    return retval;
+    return cxl_dstate->mbox_reg_state64[R_CXL_MEM_DEV_STS];
 }

 static void ro_reg_write(void *opaque, hwaddr offset, uint64_t value,
@@ -371,7 +368,15 @@ static void mailbox_reg_init_common(CXLDeviceState *cxl_dstate)
     cxl_dstate->mbox_msi_n = msi_n;
 }

-static void memdev_reg_init_common(CXLDeviceState *cxl_dstate) { }
+static void memdev_reg_init_common(CXLDeviceState *cxl_dstate)
+{
+    uint64_t memdev_status_reg;
+
+    memdev_status_reg = FIELD_DP64(0, CXL_MEM_DEV_STS, MEDIA_STATUS, 1);
+    memdev_status_reg = FIELD_DP64(memdev_status_reg, CXL_MEM_DEV_STS,
+                                   MBOX_READY, 1);
+    cxl_dstate->mbox_reg_state64[R_CXL_MEM_DEV_STS] = memdev_status_reg;
+}

 void cxl_device_register_init_t3(CXLType3Dev *ct3d)
 {
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index b2cb280e16..b318d94b36 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -408,7 +408,9 @@ static inline void __toggle_media(CXLDeviceState *cxl_dstate, int val)
 {
     uint64_t dev_status_reg;

-    dev_status_reg = FIELD_DP64(0, CXL_MEM_DEV_STS, MEDIA_STATUS, val);
+    dev_status_reg = cxl_dstate->mbox_reg_state64[R_CXL_MEM_DEV_STS];
+    dev_status_reg = FIELD_DP64(dev_status_reg, CXL_MEM_DEV_STS, MEDIA_STATUS,
+                                val);
     cxl_dstate->mbox_reg_state64[R_CXL_MEM_DEV_STS] = dev_status_reg;
 }
 #define cxl_dev_disable_media(cxlds)                \
--
2.39.3
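For readers unfamiliar with the register helpers used throughout this patch,
here is a small sketch of the FIELD_DP64()/FIELD_EX64() deposit/extract
semantics from QEMU's hw/registerfields.h. The DEMO register and its field
layout are invented for the example and do not match the real CXL definitions:

    #include "hw/registerfields.h"

    REG64(DEMO_STS, 0x0)
        FIELD(DEMO_STS, MEDIA_STATUS, 2, 2) /* invented 2-bit field at bit 2 */

    static uint64_t demo(uint64_t reg)
    {
        /* deposit 0x3 ("Disabled") into the field, preserving other bits */
        reg = FIELD_DP64(reg, DEMO_STS, MEDIA_STATUS, 0x3);

        /* extract it back; yields 0x3 */
        return FIELD_EX64(reg, DEMO_STS, MEDIA_STATUS);
    }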
[PATCH v2 1/4] hw/cxl: fix build error in cxl_type3_stubs.c
Fix build errors in cxl_type3_stubs.c due to the incorrect definition
of the qmp_cxl_{add,release}_dynamic_capacity functions.

Signed-off-by: Hyeonggon Yoo <42.hye...@gmail.com>
---
 hw/mem/cxl_type3_stubs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/mem/cxl_type3_stubs.c b/hw/mem/cxl_type3_stubs.c
index 1b54ec028c..d913b11b4d 100644
--- a/hw/mem/cxl_type3_stubs.c
+++ b/hw/mem/cxl_type3_stubs.c
@@ -68,14 +68,14 @@ void qmp_cxl_inject_correctable_error(const char *path, CxlCorErrorType type,
     error_setg(errp, "CXL Type 3 support is not compiled in");
 }

-void qmp_cxl_add_dynamic_capacity(const char *path,
+void qmp_cxl_add_dynamic_capacity(const char *path, uint8_t region_id,
                                   CXLDCExtentRecordList *records,
                                   Error **errp)
 {
     error_setg(errp, "CXL Type 3 support is not compiled in");
 }

-void qmp_cxl_release_dynamic_capacity(const char *path,
+void qmp_cxl_release_dynamic_capacity(const char *path, uint8_t region_id,
                                       CXLDCExtentRecordList *records,
                                       Error **errp)
 {
--
2.39.3
[PATCH v2 0/4] A Followup for "QEMU: CXL mailbox rework and features (Part 1)"
v1: https://lore.kernel.org/qemu-devel/20231127105830.2104954-1-42.hye...@gmail.com

Changes from v1:
- Added patch 1 that fixes a build failure in Jonathan's tree.
- Added patch 3, as (partially) suggested by Davidlohr Bueso. One
  difference is that I dropped sanitize_running(), because
  cxl_dev_media_disabled() is enough for checking if the media is
  disabled (which implies sanitation is in progress)
- Added patch 4 that discards all event logs during sanitation

Thanks everyone for giving feedback!

This is a fixup for the recent patch series "QEMU: CXL mailbox rework
and features (Part 1)" [1]. I don't mind if patch 1 is squashed into
the problematic patch, as the patch is not mainlined yet.

This is based on Jonathan Cameron's git tree
(https://gitlab.com/jic23/qemu/-/tree/cxl-2023-11-02)

Sequence of Patches:

1. Fix build error when CXL is not enabled, because of mismatching
   definition in cxl_type3_stubs.c

2. Make mdev_reg_read() actually read registers, instead of returning
   a dummy value. This fixes Media Status being incorrectly read as
   "Enabled" while sanitation is in progress.

3. Introduce cxl_dev_media_disabled() and replace sanitize_running()
   with it. Also add an assert() to check the media is correctly
   disabled during sanitation. (Now enabling when already enabled, or
   vice versa, raises an assert failure.)

4. Drop all event records during sanitation, as per spec.

[1] https://lore.kernel.org/linux-cxl/20231023160806.13206-1-jonathan.came...@huawei.com

Hyeonggon Yoo (4):
  hw/cxl: fix build error in cxl_type3_stubs.c
  hw/cxl/device: read from register values in mdev_reg_read()
  hw/cxl/mbox: replace sanitize_running() with cxl_dev_media_disabled()
  hw/cxl/events: discard all event records during sanitation

 hw/cxl/cxl-device-utils.c   | 17 +++--
 hw/cxl/cxl-events.c         | 13 +
 hw/cxl/cxl-mailbox-utils.c  |  7 +--
 hw/mem/cxl_type3.c          |  4 ++--
 hw/mem/cxl_type3_stubs.c    |  4 ++--
 include/hw/cxl/cxl_device.h | 16 ++--
 6 files changed, 43 insertions(+), 18 deletions(-)

--
2.39.3
[PATCH v2 4/4] hw/cxl/events: discard all event records during sanitation
Per spec 8.2.9.9.5.1 Sanitize (Opcode 4400h), sanitize command should
delete all event logs. Introduce cxl_discard_all_event_logs() and call
this in __do_sanitization().

Signed-off-by: Hyeonggon Yoo <42.hye...@gmail.com>
---
 hw/cxl/cxl-events.c         | 13 +
 hw/cxl/cxl-mailbox-utils.c  |  1 +
 include/hw/cxl/cxl_device.h |  1 +
 3 files changed, 15 insertions(+)

diff --git a/hw/cxl/cxl-events.c b/hw/cxl/cxl-events.c
index bee6dfaf14..837b18ab47 100644
--- a/hw/cxl/cxl-events.c
+++ b/hw/cxl/cxl-events.c
@@ -141,6 +141,19 @@ bool cxl_event_insert(CXLDeviceState *cxlds, CXLEventLogType log_type,
     return cxl_event_count(log) == 1;
 }

+void cxl_discard_all_event_records(CXLDeviceState *cxlds)
+{
+    CXLEventLogType log_type;
+    CXLEventLog *log;
+
+    for (log_type = 0; log_type < CXL_EVENT_TYPE_MAX; log_type++) {
+        log = &cxlds->event_logs[log_type];
+        while (!cxl_event_empty(log)) {
+            cxl_event_delete_head(cxlds, log_type, log);
+        }
+    }
+}
+
 CXLRetCode cxl_event_get_records(CXLDeviceState *cxlds, CXLGetEventPayload *pl,
                                  uint8_t log_type, int max_recs,
                                  size_t *len)
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index efeb5f8174..2ade351d82 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -1150,6 +1150,7 @@ static void __do_sanitization(CXLType3Dev *ct3d)
             memset(lsa, 0, memory_region_size(mr));
         }
     }
+    cxl_discard_all_event_records(&ct3d->cxl_dstate);
 }

 /*
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 5618061ebe..8f05dd9beb 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -604,6 +604,7 @@ CXLRetCode cxl_event_get_records(CXLDeviceState *cxlds, CXLGetEventPayload *pl,
                                  size_t *len);
 CXLRetCode cxl_event_clear_records(CXLDeviceState *cxlds,
                                    CXLClearEventPayload *pl);
+void cxl_discard_all_event_records(CXLDeviceState *cxlds);

 void cxl_event_irq_assert(CXLType3Dev *ct3d);
--
2.39.3
[PATCH v2 3/4] hw/cxl/mbox: replace sanitize_running() with cxl_dev_media_disabled()
The spec states that reads/writes should have no effect and a part of
commands should be ignored when the media is disabled, not when the
sanitize command is running.

Introduce cxl_dev_media_disabled() to check if the media is disabled
and replace sanitize_running() with it.

Make sure that the media has been correctly disabled during sanitation
by adding an assert to __toggle_media(). Now, enabling when already
enabled or vice versa results in an assert() failure.

Suggested-by: Davidlohr Bueso
Signed-off-by: Hyeonggon Yoo <42.hye...@gmail.com>
---
 hw/cxl/cxl-mailbox-utils.c  |  6 --
 hw/mem/cxl_type3.c          |  4 ++--
 include/hw/cxl/cxl_device.h | 11 ++-
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 11ec8b648b..efeb5f8174 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -2248,6 +2248,8 @@ int cxl_process_cci_message(CXLCCI *cci, uint8_t set, uint8_t cmd,
     int ret;
     const struct cxl_cmd *cxl_cmd;
     opcode_handler h;
+    CXLDeviceState *cxl_dstate = &CXL_TYPE3(cci->intf)->cxl_dstate;
+
     *len_out = 0;

     cxl_cmd = &cci->cxl_cmd_set[set][cmd];
@@ -2268,8 +2270,8 @@ int cxl_process_cci_message(CXLCCI *cci, uint8_t set, uint8_t cmd,
         return CXL_MBOX_BUSY;
     }

-    /* forbid any selected commands while overwriting */
-    if (sanitize_running(cci)) {
+    /* forbid any selected commands while the media is disabled */
+    if (cxl_dev_media_disabled(cxl_dstate)) {
         if (h == cmd_events_get_records ||
             h == cmd_ccls_get_partition_info ||
             h == cmd_ccls_set_lsa ||
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 85fc08f118..ecb525a608 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -1286,7 +1286,7 @@ MemTxResult cxl_type3_read(PCIDevice *d, hwaddr host_addr, uint64_t *data,
         return MEMTX_ERROR;
     }

-    if (sanitize_running(&ct3d->cci)) {
+    if (cxl_dev_media_disabled(&ct3d->cxl_dstate)) {
         qemu_guest_getrandom_nofail(data, size);
         return MEMTX_OK;
     }
@@ -1314,7 +1314,7 @@ MemTxResult cxl_type3_write(PCIDevice *d, hwaddr host_addr, uint64_t data,
         return MEMTX_ERROR;
     }

-    if (sanitize_running(&ct3d->cci)) {
+    if (cxl_dev_media_disabled(&ct3d->cxl_dstate)) {
         return MEMTX_OK;
     }
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index b318d94b36..5618061ebe 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -411,6 +411,7 @@ static inline void __toggle_media(CXLDeviceState *cxl_dstate, int val)
     dev_status_reg = cxl_dstate->mbox_reg_state64[R_CXL_MEM_DEV_STS];
     dev_status_reg = FIELD_DP64(dev_status_reg, CXL_MEM_DEV_STS, MEDIA_STATUS,
                                 val);
+    assert(dev_status_reg != cxl_dstate->mbox_reg_state64[R_CXL_MEM_DEV_STS]);
     cxl_dstate->mbox_reg_state64[R_CXL_MEM_DEV_STS] = dev_status_reg;
 }
 #define cxl_dev_disable_media(cxlds)                \
@@ -418,14 +419,14 @@ static inline void __toggle_media(CXLDeviceState *cxl_dstate, int val)
 #define cxl_dev_enable_media(cxlds)                 \
     do { __toggle_media((cxlds), 0x1); } while (0)

-static inline bool scan_media_running(CXLCCI *cci)
+static inline bool cxl_dev_media_disabled(CXLDeviceState *cxl_dstate)
 {
-    return !!cci->bg.runtime && cci->bg.opcode == 0x4304;
+    uint64_t dev_status_reg = cxl_dstate->mbox_reg_state64[R_CXL_MEM_DEV_STS];
+    return FIELD_EX64(dev_status_reg, CXL_MEM_DEV_STS, MEDIA_STATUS) == 0x3;
 }
-
-static inline bool sanitize_running(CXLCCI *cci)
+static inline bool scan_media_running(CXLCCI *cci)
 {
-    return !!cci->bg.runtime && cci->bg.opcode == 0x4400;
+    return !!cci->bg.runtime && cci->bg.opcode == 0x4304;
 }

 typedef struct CXLError {
--
2.39.3
Re: [PATCH v2] target/i386: Fix physical address truncation
On Thu, Dec 21, 2023 at 10:33 PM Richard Henderson wrote:
>
> On 12/22/23 02:49, Michael Brown wrote:
> > The address translation logic in get_physical_address() will currently
> > truncate physical addresses to 32 bits unless long mode is enabled.
> > This is incorrect when using physical address extensions (PAE) outside
> > of long mode, with the result that a 32-bit operating system using PAE
> > to access memory above 4G will experience undefined behaviour.
> >
> > The truncation code was originally introduced in commit 33dfdb5 ("x86:
> > only allow real mode to access 32bit without LMA"), where it applied
> > only to translations performed while paging is disabled (and so cannot
> > affect guests using PAE).
> >
> > Commit 9828198 ("target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX")
> > rearranged the code such that the truncation also applied to the use
> > of MMU_PHYS_IDX and MMU_NESTED_IDX. Commit 4a1e9d4 ("target/i386: Use
> > atomic operations for pte updates") brought this truncation into scope
> > for page table entry accesses, and is the first commit for which a
> > Windows 10 32-bit guest will reliably fail to boot if memory above 4G
> > is present.
> >
> > The original truncation code (now ten years old) appears to be wholly
> > redundant in the current codebase. With paging disabled, the CPU
> > cannot be in long mode and so the maximum address size for any
> > executed instruction is 32 bits. This will already cause the linear
> > address to be truncated to 32 bits, and there is therefore no way for
> > get_physical_address() to be asked to translate an address outside of
> > the 32-bit range.
> >
> > Fix by removing the address truncation in get_physical_address().
> >
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2040
> > Signed-off-by: Michael Brown
> > ---
> >   target/i386/tcg/sysemu/excp_helper.c | 6 --
> >   1 file changed, 6 deletions(-)
> >
> > diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
> > index 5b86f439ad..707f7326d4 100644
> > --- a/target/i386/tcg/sysemu/excp_helper.c
> > +++ b/target/i386/tcg/sysemu/excp_helper.c
> > @@ -582,12 +582,6 @@ static bool get_physical_address(CPUX86State *env, vaddr addr,
> >
> >       /* Translation disabled. */
> >       out->paddr = addr & x86_get_a20_mask(env);
> > -#ifdef TARGET_X86_64
> > -    if (!(env->hflags & HF_LMA_MASK)) {
> > -        /* Without long mode we can only address 32bits in real mode */
> > -        out->paddr = (uint32_t)out->paddr;
> > -    }
> > -#endif
>
> If the extension is not needed, then the a20 mask isn't either.

I think it is. The extension is not needed because the masking is
applied by either TCG (e.g. in gen_lea_v_seg_dest or gen_add_A0_im) or
mmu_translate(); but the a20 mask is never applied elsewhere for either
non-paging mode or page table walks.

> But I think there are some missing masks within mmu_translate that need
> fixing at the same time:

Right.

> >     /*
> >      * Page table level 3
> >      */
> >     pte_addr = ((in->cr3 & ~0x1f) + ((addr >> 27) & 0x18)) & a20_mask;
>
> Bits 32-63 of cr3 must be ignored when !LMA.
>
> >     /*
> >      * Page table level 2
> >      */
> >     pte_addr = ((in->cr3 & ~0xfff) + ((addr >> 20) & 0xffc)) & a20_mask;
> >     if (!ptw_translate(&pte_trans, pte_addr)) {
> >         return false;
> >     }
> > restart_2_nopae:
>
> Likewise.
>
> Looking again, it appears that all of the actual pte_addr calculations
> have both PG_ADDRESS_MASK and a20_mask applied, and have verified that
> bits beyond MAXPHYSADDR are zero via rsvd_mask.

In fact, applying a20_mask is incorrect when there will be an NPT walk.

I'll include Michael's patch in a more complete series and send it out
after testing.

Paolo
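To make the failure mode concrete, a worked example with made-up numbers: a
32-bit PAE guest can hold page-table entries that point above 4G, and the
removed cast silently redirected those accesses.

    /* Illustrative values only. */
    uint64_t pte_paddr = 0x123456000ULL;      /* PAE PTE target above 4G */
    uint32_t truncated = (uint32_t)pte_paddr; /* what the removed cast did */
    /* truncated == 0x23456000: the walk now reads (and, since commit
     * 4a1e9d4, atomically updates) a PTE at the wrong physical address. */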
Re: [PATCH v3 2/6] target/i386: mark CR4.FRED not reserved
On Wed, Nov 08, 2023 at 11:20:08PM -0800, Xin Li wrote:
> Date: Wed, 8 Nov 2023 23:20:08 -0800
> From: Xin Li
> Subject: [PATCH v3 2/6] target/i386: mark CR4.FRED not reserved
> X-Mailer: git-send-email 2.42.0
>
> The CR4.FRED bit, i.e., CR4[32], is no longer a reserved bit when FRED
> is exposed to guests, otherwise it is still a reserved bit.
>
> Tested-by: Shan Kang
> Signed-off-by: Xin Li
> ---

Reviewed-by: Zhao Liu

> target/i386/cpu.h | 12 +++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 5faf00551d..e210957cba 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -262,6 +262,12 @@ typedef enum X86Seg {
>  #define CR4_PKE_MASK (1U << 22)
>  #define CR4_PKS_MASK (1U << 24)
>
> +#ifdef TARGET_X86_64
> +#define CR4_FRED_MASK (1ULL << 32)
> +#else
> +#define CR4_FRED_MASK 0
> +#endif
> +
>  #define CR4_RESERVED_MASK \
>  (~(target_ulong)(CR4_VME_MASK | CR4_PVI_MASK | CR4_TSD_MASK \
>                   | CR4_DE_MASK | CR4_PSE_MASK | CR4_PAE_MASK \
> @@ -269,7 +275,8 @@ typedef enum X86Seg {
>                   | CR4_OSFXSR_MASK | CR4_OSXMMEXCPT_MASK | CR4_UMIP_MASK \
>                   | CR4_LA57_MASK \
>                   | CR4_FSGSBASE_MASK | CR4_PCIDE_MASK | CR4_OSXSAVE_MASK \
> -                 | CR4_SMEP_MASK | CR4_SMAP_MASK | CR4_PKE_MASK | CR4_PKS_MASK))
> +                 | CR4_SMEP_MASK | CR4_SMAP_MASK | CR4_PKE_MASK | CR4_PKS_MASK \
> +                 | CR4_FRED_MASK))
>
>  #define DR6_BD (1 << 13)
>  #define DR6_BS (1 << 14)
> @@ -2520,6 +2527,9 @@ static inline uint64_t cr4_reserved_bits(CPUX86State *env)
>      if (!(env->features[FEAT_7_0_ECX] & CPUID_7_0_ECX_PKS)) {
>          reserved_bits |= CR4_PKS_MASK;
>      }
> +    if (!(env->features[FEAT_7_1_EAX] & CPUID_7_1_EAX_FRED)) {
> +        reserved_bits |= CR4_FRED_MASK;
> +    }
>      return reserved_bits;
> }
>
> --
> 2.42.0
>
>
Re: [PATCH v4 01/13] vdpa: add VhostVDPAShared
QE tested this series v4 with regression tests. It has fixed the qemu
core issues that hit last time. And everything works fine.

Tested-by: Lei Yang

On Fri, Dec 22, 2023 at 1:43 AM Eugenio Pérez wrote:
>
> It will hold properties shared among all vhost_vdpa instances associated
> with of the same device. For example, we just need one iova_tree or one
> memory listener for the entire device.
>
> Next patches will register the vhost_vdpa memory listener at the
> beginning of the VM migration at the destination. This enables QEMU to
> map the memory to the device before stopping the VM at the source,
> instead of doing while both source and destination are stopped, thus
> minimizing the downtime.
>
> However, the destination QEMU is unaware of which vhost_vdpa struct will
> register its memory_listener. If the source guest has CVQ enabled, it
> will be the one associated with the CVQ. Otherwise, it will be the
> first one.
>
> Save the memory operations related members in a common place rather than
> always in the first / last vhost_vdpa.
>
> Signed-off-by: Eugenio Pérez
> Acked-by: Jason Wang
> ---
>  include/hw/virtio/vhost-vdpa.h |  5 +
>  net/vhost-vdpa.c               | 24 ++--
>  2 files changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> index 5407d54fd7..eb1a56d75a 100644
> --- a/include/hw/virtio/vhost-vdpa.h
> +++ b/include/hw/virtio/vhost-vdpa.h
> @@ -30,6 +30,10 @@ typedef struct VhostVDPAHostNotifier {
>      void *addr;
>  } VhostVDPAHostNotifier;
>
> +/* Info shared by all vhost_vdpa device models */
> +typedef struct vhost_vdpa_shared {
> +} VhostVDPAShared;
> +
>  typedef struct vhost_vdpa {
>      int device_fd;
>      int index;
> @@ -46,6 +50,7 @@ typedef struct vhost_vdpa {
>      bool suspended;
>      /* IOVA mapping used by the Shadow Virtqueue */
>      VhostIOVATree *iova_tree;
> +    VhostVDPAShared *shared;
>      GPtrArray *shadow_vqs;
>      const VhostShadowVirtqueueOps *shadow_vq_ops;
>      void *shadow_vq_ops_opaque;
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index d0614d7954..8b661b9e6d 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -240,6 +240,10 @@ static void vhost_vdpa_cleanup(NetClientState *nc)
>          qemu_close(s->vhost_vdpa.device_fd);
>          s->vhost_vdpa.device_fd = -1;
>      }
> +    if (s->vhost_vdpa.index != 0) {
> +        return;
> +    }
> +    g_free(s->vhost_vdpa.shared);
>  }
>
>  /** Dummy SetSteeringEBPF to support RSS for vhost-vdpa backend */
> @@ -1661,6 +1665,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>                                             bool svq,
>                                             struct vhost_vdpa_iova_range iova_range,
>                                             uint64_t features,
> +                                           VhostVDPAShared *shared,
>                                             Error **errp)
>  {
>      NetClientState *nc = NULL;
> @@ -1696,6 +1701,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>      if (queue_pair_index == 0) {
>          vhost_vdpa_net_valid_svq_features(features,
>                                            &s->vhost_vdpa.migration_blocker);
> +        s->vhost_vdpa.shared = g_new0(VhostVDPAShared, 1);
>      } else if (!is_datapath) {
>          s->cvq_cmd_out_buffer = mmap(NULL, vhost_vdpa_net_cvq_cmd_page_len(),
>                                       PROT_READ | PROT_WRITE,
> @@ -1708,11 +1714,16 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
>          s->vhost_vdpa.shadow_vq_ops_opaque = s;
>          s->cvq_isolated = cvq_isolated;
>      }
> +    if (queue_pair_index != 0) {
> +        s->vhost_vdpa.shared = shared;
> +    }
> +
>      ret = vhost_vdpa_add(nc, (void *)&s->vhost_vdpa, queue_pair_index, nvqs);
>      if (ret) {
>          qemu_del_net_client(nc);
>          return NULL;
>      }
> +
>      return nc;
>  }
>
> @@ -1824,17 +1835,26 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char *name,
>      ncs = g_malloc0(sizeof(*ncs) * queue_pairs);
>
>      for (i = 0; i < queue_pairs; i++) {
> +        VhostVDPAShared *shared = NULL;
> +
> +        if (i) {
> +            shared = DO_UPCAST(VhostVDPAState, nc, ncs[0])->vhost_vdpa.shared;
> +        }
>          ncs[i] = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>                                       vdpa_device_fd, i, 2, true, opts->x_svq,
> -                                     iova_range, features, errp);
> +                                     iova_range, features, shared, errp);
>          if (!ncs[i])
>              goto err;
>      }
>
>      if (has_cvq) {
> +        VhostVDPAState *s0 = DO_UPCAST(VhostVDPAState, nc, ncs[0]);
> +        VhostVDPAShared *shared = s0->vhost_vdpa.shared;
> +
>          nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
>                                   vdpa_
[PATCH v2 1/1] dump: Fix HMP dump-guest-memory -z without -R
-z without -R has no effect: the dump format remains @elf. Fix the
logic error so it becomes @kdump-zlib.

Fixes: e6549197f7ed (dump: Add command interface for kdump-raw formats)
Fixes: CID 1523841
Signed-off-by: Markus Armbruster
Reviewed-by: Stephen Brennan
Reviewed-by: Marc-André Lureau
---
 dump/dump-hmp-cmds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dump/dump-hmp-cmds.c b/dump/dump-hmp-cmds.c
index b428ec33df..d9340427c3 100644
--- a/dump/dump-hmp-cmds.c
+++ b/dump/dump-hmp-cmds.c
@@ -41,7 +41,7 @@ void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict)
         dump_format = DUMP_GUEST_MEMORY_FORMAT_WIN_DMP;
     }

-    if (zlib && raw) {
+    if (zlib) {
         if (raw) {
             dump_format = DUMP_GUEST_MEMORY_FORMAT_KDUMP_RAW_ZLIB;
         } else {
--
2.43.0
[PATCH v2 0/1] dump: Fix issues flagged by Coverity
v2:
* PATCH 1 superseded by commit 95a40c44501 (dump: Add close fd on error
  return to avoid resource leak)
* PATCH 2 fell through the cracks; reposting unchanged

Markus Armbruster (1):
  dump: Fix HMP dump-guest-memory -z without -R

 dump/dump-hmp-cmds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
2.43.0
Re: [v2 1/2] qapi/virtio: Add feature and status bits for x-query-virtio-status
Hyman Huang writes:

> This patch allows to display feature and status bits in
> virtio-status.
>
> Applications could find it helpful to compare status and features
> that are numeric encoded. For example, an upper application could
> use the features (encoded as a number) in the output of "ovs-vsctl
> list interface" and the feature bits fields in the output of QMP
> command "x-query-virtio-status" to compare directly when attempting
> to ensure the correctness of the virtio negotiation between guest,
> QEMU, and OVS-DPDK. Not applying any more encoding.
>
> This patch also serves as a preparation for the next one, which
> implements a vhost-user test case about acked features of vhost-user
> protocol.
>
> Note that since the matching HMP command is typically used for human,
> leave it unchanged.
>
> Signed-off-by: Hyman Huang

We discussed v1 some more since you posted this. Do you intend to do a v3?
[PATCH] configure: use a native non-cross compiler for linux-user
Commit c2118e9e1ab ("configure: don't try a "native" cross for
linux-user", 2023-11-23) sought to avoid issues with using the native
compiler with a cross-endian or cross-bitness setup. However, in doing
so it ended up requiring a cross compiler setup (and most likely a slow
compiler setup) even when building TCG tests that are native to the
host architecture. Always allow the host compiler in that case.

Cc: qemu-sta...@nongnu.org
Fixes: c2118e9e1ab ("configure: don't try a "native" cross for linux-user", 2023-11-23)
Signed-off-by: Paolo Bonzini
---
 configure | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 4fb2dd179c8..21ab9a64e98 100755
--- a/configure
+++ b/configure
@@ -1356,8 +1356,8 @@ probe_target_compiler() {
   done
   try=cross
-  # For softmmu/roms we might be able to use the host compiler
-  if [ "${1%softmmu}" != "$1" ]; then
+  # For softmmu/roms also look for a bi-endian or multilib-enabled host compiler
+  if [ "${1%softmmu}" != "$1" ] || test "$target_arch" = "$cpu"; then
     case "$target_arch:$cpu" in
       aarch64_be:aarch64 | \
       armeb:arm | \
--
2.43.0
Re: [PULL 20/47] backends/iommufd: Introduce the iommufd object
Hi Cédric,

On 12/21/23 22:23, Cédric Le Goater wrote:
> On 12/21/23 18:14, Eric Auger wrote:
>> Hi Cédric,
>>
>> On 12/21/23 17:00, Cédric Le Goater wrote:
>>> [ ... ]
>>>
>>>> +static void iommufd_backend_init(Object *obj)
>>>> +{
>>>> +    IOMMUFDBackend *be = IOMMUFD_BACKEND(obj);
>>>> +
>>>> +    be->fd = -1;
>>>> +    be->users = 0;
>>>> +    be->owned = true;
>>>> +    qemu_mutex_init(&be->lock);
>>>> +}
>>>> +
>>>> +static void iommufd_backend_finalize(Object *obj)
>>>> +{
>>>> +    IOMMUFDBackend *be = IOMMUFD_BACKEND(obj);
>>>> +
>>>> +    if (be->owned) {
>>>> +        close(be->fd);
>>>> +        be->fd = -1;
>>>> +    }
>>>> +}
>>>> +
>>>> +static void iommufd_backend_set_fd(Object *obj, const char *str, Error **errp)
>>>> +{
>>>> +    IOMMUFDBackend *be = IOMMUFD_BACKEND(obj);
>>>> +    int fd = -1;
>>>> +
>>>> +    fd = monitor_fd_param(monitor_cur(), str, errp);
>>>> +    if (fd == -1) {
>>>> +        error_prepend(errp, "Could not parse remote object fd %s:", str);
>>>> +        return;
>>>> +    }
>>>> +    qemu_mutex_lock(&be->lock);
>>>> +    be->fd = fd;
>>>> +    be->owned = false;
>>>> +    qemu_mutex_unlock(&be->lock);
>>>> +    trace_iommu_backend_set_fd(be->fd);
>>>> +}
>>>> +
>>>> +static bool iommufd_backend_can_be_deleted(UserCreatable *uc)
>>>> +{
>>>> +    IOMMUFDBackend *be = IOMMUFD_BACKEND(uc);
>>>> +
>>>> +    return !be->users;
>>>
>>> Coverity CID 1531549 reports a concurrent data access violation because
>>> be->users is being accessed without holding the mutex.
>>>
>>> I wonder how useful is this mutex anyhow, since the code paths should
>>> be protected by the BQL lock. If you agree, I will send an update to
>>> simply drop be->lock and solve this report.
>> I am not totally comfortable with the fact BQL covers the same
>> protection? Please can you elaborate.
>
> These routines are called when a device is created which is called
> from the QEMU main thread which exits holding the BQL. It should be
> fine.
OK fine for me as well

Thanks

Eric
>
> Thanks,
>
> C.
>
>
VIRTIO_GPU_CMD_SUBMIT_3D error feedback?
Hi guys,

Is there a way for a client-side driver to know if a VirGL command
fails? The function virgl_renderer_submit_cmd() does return an error if
one of the commands fails, but virgl_cmd_submit_3d() ignores the return
code. It looks like VIRTIO_GPU_CMD_SUBMIT_3D would return a success,
even if one of the commands in the command stream failed.

As a simple scenario: let's say a shader object creation failed for
some reason. How would the driver know that it happened and be able to
provide feedback on why?

I know that things such as shader syntax errors would be caught before
sending the shader code to the host. However, it's still possible that
a shader object creation would fail.

regards,

Hans
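One possible direction, sketched with simplified context: propagate the
virglrenderer return code into the command response instead of dropping it.
The snippet below is a rough illustration only; the actual code in
hw/display/virtio-gpu-virgl.c may differ, and VIRTIO_GPU_RESP_ERR_UNSPEC is
just one candidate error response.

    /* Rough sketch; surrounding validation and buffer mapping elided,
     * and names simplified from hw/display/virtio-gpu-virgl.c. */
    int ret = virgl_renderer_submit_cmd(buf, cs.hdr.ctx_id, cs.size / 4);
    if (ret != 0) {
        /* surface the failure to the guest instead of ignoring it */
        cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
    }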
Re: hw: Audit of qdev properties with same name but different types
Philippe Mathieu-Daudé writes:

> Hi,
>
> I just did an audit of QDev properties from different
> models sharing the same name, but of different types
> (as of v8.2.0-rc1).
>
> Nothing wrong, but some having the same meaning could
> use the same type :)

Yes.

> Just sharing:
>
> ---
>    2 addr
>
> hw/core/generic-loader.c:183:    DEFINE_PROP_UINT64("addr", GenericLoaderState, addr, 0),
> hw/core/guest-loader.c:115:    DEFINE_PROP_UINT64("addr", GuestLoaderState, addr, 0),
> hw/ide/macio.c:469:    DEFINE_PROP_UINT32("addr", MACIOIDEState, addr, -1),
> hw/pci/pci.c:71:    DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),

"addr" is kind of vague, because the address space is anybody's guess.
Width depends on the address space.

Similar naming sins in the QAPI schema at least come with
documentation, e.g.

    ##
    # @PCDIMMDeviceInfo:
    [...]
    # @addr: physical address, where device is mapped

>    2 base
>
> hw/arm/armv7m.c:630:    DEFINE_PROP_UINT32("base", BitBandState, base, 0),
> hw/dma/i8257.c:589:    DEFINE_PROP_INT32("base", I8257State, base, 0x00),

"base" is even more vague than "addr". If it's a base *address*, width
again depends on the address space, and use of signed is fishy.

>    2 bus_nr
>
> hw/pci-bridge/pci_expander_bridge.c:412:    DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
> hw/pci-host/xilinx-pcie.c:160:    DEFINE_PROP_UINT32("bus_nr", XilinxPCIEHost, bus_nr, 0),

If these are PCI bus numbers: correct width is 8 bits.

>    2 clock-frequency
>
> hw/timer/altera_timer.c:218:    DEFINE_PROP_UINT32("clock-frequency", AlteraTimer, freq_hz, 0),
> hw/timer/mss-timer.c:284:    DEFINE_PROP_UINT32("clock-frequency", MSSTimerState, freq_hz,
> hw/timer/sifive_pwm.c:409:    DEFINE_PROP_UINT64("clock-frequency", struct SiFivePwmState,
> hw/timer/stm32f2xx_timer.c:302:    DEFINE_PROP_UINT64("clock-frequency", struct STM32F2XXTimerState,
> hw/timer/xilinx_timer.c:246:    DEFINE_PROP_UINT32("clock-frequency", XpsTimerState, freq_hz, 62 * 100),

Appropriate width depends on valid frequency range.

>    2 config
>
> hw/intc/pnv_xive2.c:1939:    DEFINE_PROP_UINT64("config", PnvXive2, config,
> hw/isa/pc87312.c:332:    DEFINE_PROP_UINT8("config", PC87312State, config, 1),

Are these two related? "config" could be anything...

>    2 core-id
>
> target/i386/cpu.c:7788:    DEFINE_PROP_INT32("core-id", X86CPU, core_id, 0),
> target/i386/cpu.c:7794:    DEFINE_PROP_INT32("core-id", X86CPU, core_id, -1),
> target/s390x/cpu.c:302:    DEFINE_PROP_UINT32("core-id", S390CPU, env.core_id, 0),

CPU core IDs are non-negative. Why signed? Perhaps to encode "N/A" as
-1?

I think we can see already any improvements would need to be made case
by case.

[Quite a few more...]
Re: [PATCH for-9.0] qapi: Add 'recurse-children' option to qom-list
Alberto Garcia writes:

> This allows returning a tree of all object properties under a given
> path, in a way similar to scripts/qmp/qom-tree.

Use case? We already have that script, and also HMP info qom-tree.

> Signed-off-by: Alberto Garcia
> ---
>  qapi/qom.json      | 10 +-
>  qom/qom-hmp-cmds.c |  4 ++--
>  qom/qom-qmp-cmds.c | 22 +-
>  3 files changed, 32 insertions(+), 4 deletions(-)
>
> diff --git a/qapi/qom.json b/qapi/qom.json
> index c53ef978ff..dfe3a20725 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -33,6 +33,10 @@
>  #     qdev device type name. Link properties form the device model
>  #     graph.
>  #
> +# @children: if specified, a list of @ObjectPropertyInfo describing

Suggest "if present".

> +#     the child properties. This requires that this property's @type
> +#     is of the form 'child' (since 9.0)

But when will it be present? In qom-list-properties's return value:
never. In qom-list's return value: if and only if passed
recurse-children=true. Awkward to document.

> +#
>  # @description: if specified, the description of the property.
>  #
>  # @default-value: the default value, if any (since 5.0)
> @@ -42,6 +46,7 @@
>  { 'struct': 'ObjectPropertyInfo',
>    'data': { 'name': 'str',
>              'type': 'str',
> +            '*children' : [ 'ObjectPropertyInfo' ],
>              '*description': 'str',
>              '*default-value': 'any' } }
>
> @@ -54,6 +59,9 @@
>  # @path: the path within the object model. See @qom-get for a
>  #     description of this parameter.
>  #
> +# @recurse-children: if true, include the child properties recursively
> +#     in the return value (default: false) (since 9.0)

Insufficiently clear on the connection to ObjectPropertyInfo member
@children, I'm afraid.

> +#
>  # Returns: a list of @ObjectPropertyInfo that describe the properties
>  #     of the object.
>  #
> @@ -69,7 +77,7 @@
>  #        { "name": "mon0", "type": "child" } ] }
>  ##
>  { 'command': 'qom-list',
> -  'data': { 'path': 'str' },
> +  'data': { 'path': 'str', '*recurse-children': 'bool' },
>    'returns': [ 'ObjectPropertyInfo' ],
>    'allow-preconfig': true }
>

[...]
Re: [RFC PATCH v3 06/30] migration/ram: Introduce 'fixed-ram' migration capability
Fabiano Rosas writes:

> Add a new migration capability 'fixed-ram'.
>
> The core of the feature is to ensure that each RAM page has a specific
> offset in the resulting migration stream. The reasons why we'd want
> such behavior are:
>
> - The resulting file will have a bounded size, since pages which are
>   dirtied multiple times will always go to a fixed location in the
>   file, rather than constantly being added to a sequential
>   stream. This eliminates cases where a VM with, say, 1G of RAM can
>   result in a migration file that's 10s of GBs, provided that the
>   workload constantly redirties memory.
>
> - It paves the way to implement O_DIRECT-enabled save/restore of the
>   migration stream as the pages are ensured to be written at aligned
>   offsets.
>
> - It allows the usage of multifd so we can write RAM pages to the
>   migration file in parallel.
>
> For now, enabling the capability has no effect. The next couple of
> patches implement the core functionality.
>
> Signed-off-by: Fabiano Rosas
> ---
> - mentioned seeking on docs
> ---
>  docs/devel/migration.rst | 21 +
>  migration/options.c      | 34 ++
>  migration/options.h      |  1 +
>  migration/savevm.c       |  1 +
>  qapi/migration.json      |  6 +-
>  5 files changed, 62 insertions(+), 1 deletion(-)
>
> diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
> index ec55089b25..eeb4fec31f 100644
> --- a/docs/devel/migration.rst
> +++ b/docs/devel/migration.rst
> @@ -572,6 +572,27 @@ Others (especially either older devices or system devices which for
>  some reason don't have a bus concept) make use of the ``instance id``
>  for otherwise identically named devices.
>
> +Fixed-ram format
> +----------------
> +
> +When the ``fixed-ram`` capability is enabled, a slightly different
> +stream format is used for the RAM section. Instead of having a
> +sequential stream of pages that follow the RAMBlock headers, the dirty
> +pages for a RAMBlock follow its header. This ensures that each RAM
> +page has a fixed offset in the resulting migration file.
> +
> +The ``fixed-ram`` capability must be enabled in both source and
> +destination with:
> +
> +``migrate_set_capability fixed-ram on``
> +
> +Since pages are written to their relative offsets and out of order
> +(due to the memory dirtying patterns), streaming channels such as
> +sockets are not supported. A seekable channel such as a file is
> +required. This can be verified in the QIOChannel by the presence of
> +the QIO_CHANNEL_FEATURE_SEEKABLE. In more practical terms, this
> +migration format requires the ``file:`` URI when migrating.
> +
>  Return path
>  ---

[...]

> diff --git a/qapi/migration.json b/qapi/migration.json
> index eb2f883513..3b93e13743 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -531,6 +531,10 @@
>  #     and can result in more stable read performance. Requires KVM
>  #     with accelerator property "dirty-ring-size" set. (Since 8.1)
>  #
> +# @fixed-ram: Migrate using fixed offsets for each RAM page. Requires

Offsets in what? Clear enough from commit message and doc update, but
the doc comment needs to make sense on its own.

> +#     a migration URI that supports seeking, such as a file. (since
> +#     8.2)

9.0

> +#
>  # Features:
>  #
>  # @deprecated: Member @block is deprecated. Use blockdev-mirror with
> @@ -555,7 +559,7 @@
>             { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
>             'validate-uuid', 'background-snapshot',
>             'zero-copy-send', 'postcopy-preempt', 'switchover-ack',
> -           'dirty-limit'] }
> +           'dirty-limit', 'fixed-ram'] }
>
>  ##
>  # @MigrationCapabilityStatus:
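For intuition, the "fixed offset" property described above can be pictured as
a pure function of a page's position within its RAMBlock. The names below are
invented for illustration and are not the actual stream code:

    /* With fixed-ram, rewriting a redirtied page overwrites the same
     * file bytes instead of appending to the stream. */
    static uint64_t fixed_ram_page_offset(uint64_t block_pages_start,
                                          uint64_t page_index,
                                          uint64_t page_size)
    {
        return block_pages_start + page_index * page_size;
    }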
Re: [PULL 20/47] backends/iommufd: Introduce the iommufd object
On 12/22/23 11:09, Eric Auger wrote:
> Hi Cédric,
>
> On 12/21/23 22:23, Cédric Le Goater wrote:
>> On 12/21/23 18:14, Eric Auger wrote:
>>> Hi Cédric,
>>>
>>> On 12/21/23 17:00, Cédric Le Goater wrote:
>>>> [ ... ]
>>>>
>>>>> +static void iommufd_backend_init(Object *obj)
>>>>> +{
>>>>> +    IOMMUFDBackend *be = IOMMUFD_BACKEND(obj);
>>>>> +
>>>>> +    be->fd = -1;
>>>>> +    be->users = 0;
>>>>> +    be->owned = true;
>>>>> +    qemu_mutex_init(&be->lock);
>>>>> +}
>>>>> +
>>>>> +static void iommufd_backend_finalize(Object *obj)
>>>>> +{
>>>>> +    IOMMUFDBackend *be = IOMMUFD_BACKEND(obj);
>>>>> +
>>>>> +    if (be->owned) {
>>>>> +        close(be->fd);
>>>>> +        be->fd = -1;
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +static void iommufd_backend_set_fd(Object *obj, const char *str, Error **errp)
>>>>> +{
>>>>> +    IOMMUFDBackend *be = IOMMUFD_BACKEND(obj);
>>>>> +    int fd = -1;
>>>>> +
>>>>> +    fd = monitor_fd_param(monitor_cur(), str, errp);
>>>>> +    if (fd == -1) {
>>>>> +        error_prepend(errp, "Could not parse remote object fd %s:", str);
>>>>> +        return;
>>>>> +    }
>>>>> +    qemu_mutex_lock(&be->lock);
>>>>> +    be->fd = fd;
>>>>> +    be->owned = false;
>>>>> +    qemu_mutex_unlock(&be->lock);
>>>>> +    trace_iommu_backend_set_fd(be->fd);
>>>>> +}
>>>>> +
>>>>> +static bool iommufd_backend_can_be_deleted(UserCreatable *uc)
>>>>> +{
>>>>> +    IOMMUFDBackend *be = IOMMUFD_BACKEND(uc);
>>>>> +
>>>>> +    return !be->users;
>>>>
>>>> Coverity CID 1531549 reports a concurrent data access violation because
>>>> be->users is being accessed without holding the mutex.
>>>>
>>>> I wonder how useful is this mutex anyhow, since the code paths should
>>>> be protected by the BQL lock. If you agree, I will send an update to
>>>> simply drop be->lock and solve this report.
>>> I am not totally comfortable with the fact BQL covers the same
>>> protection? Please can you elaborate.
>>
>> These routines are called when a device is created which is called
>> from the QEMU main thread which exits holding the BQL. It should be
>> fine.
> OK fine for me as well

I pushed 2 patches on vfio-9.0 that I will send next year. Time for a
break !

Cheers,

C.
Re: [RFC PATCH v3 23/30] migration: Add direct-io parameter
Fabiano Rosas writes:

> Add the direct-io migration parameter that tells the migration code to
> use O_DIRECT when opening the migration stream file whenever possible.

Why is that useful?

> This is currently only used with fixed-ram migration for the multifd
> channels that transfer the RAM pages. Those channels only transfer the
> pages and are guaranteed to perform aligned writes.
>
> However the parameter could be made to affect other types of
> file-based migrations in the future.
>
> Signed-off-by: Fabiano Rosas

[...]

> diff --git a/qapi/migration.json b/qapi/migration.json
> index 3b93e13743..1d38619842 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -878,6 +878,9 @@
>  # @mode: Migration mode. See description in @MigMode. Default is 'normal'.
>  #     (Since 8.2)
>  #
> +# @direct-io: Open migration files with O_DIRECT when possible. This
> +#     requires that the 'fixed-ram' capability is enabled. (since 9.0)

Two spaces between sentences for consistency, please.

> +#
>  # Features:
>  #
>  # @deprecated: Member @block-incremental is deprecated. Use
> @@ -911,7 +914,8 @@
>             'block-bitmap-mapping',
>             { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] },
>             'vcpu-dirty-limit',
> -           'mode'] }
> +           'mode',
> +           'direct-io'] }
>
>  ##
>  # @MigrateSetParameters:
> @@ -1066,6 +1070,9 @@
>  # @mode: Migration mode. See description in @MigMode. Default is 'normal'.
>  #     (Since 8.2)
>  #
> +# @direct-io: Open migration files with O_DIRECT when possible. This
> +#     requires that the 'fixed-ram' capability is enabled. (since 9.0)
> +#
>  # Features:
>  #
>  # @deprecated: Member @block-incremental is deprecated. Use
> @@ -1119,7 +1126,8 @@
>              '*x-vcpu-dirty-limit-period': { 'type': 'uint64',
>                                              'features': [ 'unstable' ] },
>              '*vcpu-dirty-limit': 'uint64',
> -            '*mode': 'MigMode'} }
> +            '*mode': 'MigMode',
> +            '*direct-io': 'bool' } }
>
>  ##
>  # @migrate-set-parameters:
> @@ -1294,6 +1302,9 @@
>  # @mode: Migration mode. See description in @MigMode. Default is 'normal'.
>  #     (Since 8.2)
>  #
> +# @direct-io: Open migration files with O_DIRECT when possible. This
> +#     requires that the 'fixed-ram' capability is enabled. (since 9.0)
> +#
>  # Features:
>  #
>  # @deprecated: Member @block-incremental is deprecated. Use
> @@ -1344,7 +1355,8 @@
>              '*x-vcpu-dirty-limit-period': { 'type': 'uint64',
>                                              'features': [ 'unstable' ] },
>              '*vcpu-dirty-limit': 'uint64',
> -            '*mode': 'MigMode'} }
> +            '*mode': 'MigMode',
> +            '*direct-io': 'bool' } }
>
>  ##
>  # @query-migrate-parameters:

[...]
[RFC PATCH] meson.build: report graphics backends
To enable accelerated VirtIO GPUs for the guest we need the rendering
support on the host but currently it's not reported in the
configuration summary. Add a graphics backend section and report the
status of the VirGL and Rutabaga support libraries.

Signed-off-by: Alex Bennée
---
 meson.build | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/meson.build b/meson.build
index 6c77d9687de..93868568870 100644
--- a/meson.build
+++ b/meson.build
@@ -4307,6 +4307,12 @@ summary_info += {'curses support':    curses}
 summary_info += {'brlapi support':    brlapi}
 summary(summary_info, bool_yn: true, section: 'User interface')

+# Graphics backends
+summary_info = {}
+summary_info += {'VirGL support':     virgl}
+summary_info += {'Rutabaga support':  rutabaga}
+summary(summary_info, bool_yn: true, section: 'Graphics backends')
+
 # Audio backends
 summary_info = {}
 if targetos not in ['darwin', 'haiku', 'windows']
--
2.39.2
Re: [PATCH 11/16] target/riscv: move 'cbom_blocksize' to riscv_cpu_properties[]
On 12/22/23 01:40, Richard Henderson wrote:
> On 12/22/23 04:51, Daniel Henrique Barboza wrote:
>> +const PropertyInfo prop_cbom_blksize = {
>
> static?  Same for cboz in the next patch.

Same in every other patch where I added a PropertyInfo it seems. I'll
fix it in v2. Thanks,

Daniel

>
> r~
Re: [PATCH V6 03/14] cpus: stop vm in suspended runstate
Steve Sistare writes:

> Currently, a vm in the suspended state is not completely stopped. The
> VCPUs have been paused, but the cpu clock still runs, and runstate
> notifiers for the transition to stopped have not been called. This
> causes problems for live migration. Stale cpu timers_state is saved to
> the migration stream, causing time errors in the guest when it wakes
> from suspend, and state that would have been modified by runstate
> notifiers is wrong.
>
> Modify vm_stop to completely stop the vm if the current state is
> suspended, transition to RUN_STATE_PAUSED, and remember that the
> machine was suspended. Modify vm_start to restore the suspended state.

Can you explain this to me in terms of the @current_run_state state
machine? Like

    Before the patch, trigger X in state Y goes to state Z.
    Afterwards, it goes to ...

> This affects all callers of vm_stop and vm_start, notably, the qapi
> stop and cont commands. For example:
>
>     (qemu) info status
>     VM status: paused (suspended)
>
>     (qemu) stop
>     (qemu) info status
>     VM status: paused
>
>     (qemu) cont
>     (qemu) info status
>     VM status: paused (suspended)
>
>     (qemu) system_wakeup
>     (qemu) info status
>     VM status: running
>
> Suggested-by: Peter Xu
> Signed-off-by: Steve Sistare
> ---
>  include/sysemu/runstate.h |  5 +
>  qapi/misc.json            | 10 --
>  system/cpus.c             | 19 ++-
>  system/runstate.c         |  3 +++
>  4 files changed, 30 insertions(+), 7 deletions(-)
>
> diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
> index f6a337b..1d6828f 100644
> --- a/include/sysemu/runstate.h
> +++ b/include/sysemu/runstate.h
> @@ -40,6 +40,11 @@ static inline bool shutdown_caused_by_guest(ShutdownCause cause)
>      return cause >= SHUTDOWN_CAUSE_GUEST_SHUTDOWN;
>  }
>
> +static inline bool runstate_is_started(RunState state)
> +{
> +    return state == RUN_STATE_RUNNING || state == RUN_STATE_SUSPENDED;
> +}
> +
>  void vm_start(void);
>
>  /**
> diff --git a/qapi/misc.json b/qapi/misc.json
> index cda2eff..efb8d44 100644
> --- a/qapi/misc.json
> +++ b/qapi/misc.json
> @@ -134,7 +134,7 @@
>  ##
>  # @stop:
>  #
> -# Stop all guest VCPU execution.
> +# Stop all guest VCPU and VM execution.

Doesn't "stop all VM execution" imply "stop all guest vCPU execution"?

>  #
>  # Since: 0.14
>  #
> @@ -143,6 +143,9 @@
>  # Notes: This function will succeed even if the guest is already in
>  #     the stopped state. In "inmigrate" state, it will ensure that
>  #     the guest remains paused once migration finishes, as if the -S
>  #     option was passed on the command line.
>  #
> +#     In the "suspended" state, it will completely stop the VM and
> +#     cause a transition to the "paused" state. (Since 9.0)
> +#

What user-observable (with query-status) state transitions are possible
here?

>  # Example:
>  #
>  # -> { "execute": "stop" }
> @@ -153,7 +156,7 @@
>  ##
>  # @cont:
>  #
> -# Resume guest VCPU execution.
> +# Resume guest VCPU and VM execution.
>  #
>  # Since: 0.14
>  #
> @@ -165,6 +168,9 @@
>  # Returns: If successful, nothing
>  #
>  # Notes: This command will succeed if the guest is currently running.
>  #     It will also succeed if the guest is in the "inmigrate" state;
>  #     in this case, the effect of the command is to make sure the
>  #     guest starts once migration finishes, removing the effect of the
>  #     -S command line option if it was passed.
>  #
> +#     If the VM was previously suspended, and not been reset or woken,
> +#     this command will transition back to the "suspended" state.
> +#     (Since 9.0)

Long line. What user-observable state transitions are possible here?

> +#
>  # Example:
>  #
>  # -> { "execute": "cont" }

Should we update documentation of query-status, too?

    ##
    # @StatusInfo:
    #
    # Information about VCPU run state
    #
    # @running: true if all VCPUs are runnable, false if not runnable
    #
    # @singlestep: true if using TCG with one guest instruction per
    #     translation block
    #
    # @status: the virtual machine @RunState
    #
    # Features:
    #
    # @deprecated: Member 'singlestep' is deprecated (with no
    #     replacement).
    #
    # Since: 0.14
    #
    # Notes: @singlestep is enabled on the command line with '-accel
    #     tcg,one-insn-per-tb=on', or with the HMP 'one-insn-per-tb'
    #     command.
    ##
    { 'struct': 'StatusInfo',
      'data': {'running': 'bool',
               'singlestep': { 'type': 'bool', 'features': [ 'deprecated' ]},
               'status': 'RunState'} }

    ##
    # @query-status:
    #
    # Query the run status of all VCPUs
    #
    # Returns: @StatusInfo reflecting all VCPUs
    #
    # Since: 0.14
    #
    # Example:
    #
    # -> { "execute": "query-status" }
    # <- { "return": { "running": true,
    #                  "singlestep": false,
    #                  "status": "running" } }
    ##
    { 'command': 'query-status',
      'returns': '
[PATCH v2 13/16] target/riscv: remove riscv_cpu_options[]
The array is empty and can be removed. Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 5 - target/riscv/cpu.h | 1 - target/riscv/kvm/kvm-cpu.c | 9 - target/riscv/tcg/tcg-cpu.c | 4 4 files changed, 19 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index f30058518a..7b1cc5d0c9 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -1860,11 +1860,6 @@ static const PropertyInfo prop_cboz_blksize = { .set_default_value = qdev_propinfo_set_default_value_uint, }; -Property riscv_cpu_options[] = { - -DEFINE_PROP_END_OF_LIST(), -}; - static Property riscv_cpu_properties[] = { DEFINE_PROP_BOOL("debug", RISCVCPU, cfg.debug, true), diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index 988471c7ba..f06987687a 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -774,7 +774,6 @@ extern const RISCVCPUMultiExtConfig riscv_cpu_extensions[]; extern const RISCVCPUMultiExtConfig riscv_cpu_vendor_exts[]; extern const RISCVCPUMultiExtConfig riscv_cpu_experimental_exts[]; extern const RISCVCPUMultiExtConfig riscv_cpu_deprecated_exts[]; -extern Property riscv_cpu_options[]; typedef struct isa_ext_data { const char *name; diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c index 137a8ab2bb..5800abc9c6 100644 --- a/target/riscv/kvm/kvm-cpu.c +++ b/target/riscv/kvm/kvm-cpu.c @@ -1443,19 +1443,10 @@ void kvm_riscv_aia_create(MachineState *machine, uint64_t group_shift, static void kvm_cpu_instance_init(CPUState *cs) { Object *obj = OBJECT(RISCV_CPU(cs)); -DeviceState *dev = DEVICE(obj); riscv_init_kvm_registers(obj); kvm_riscv_add_cpu_user_properties(obj); - -for (Property *prop = riscv_cpu_options; prop && prop->name; prop++) { -/* Check if we have a specific KVM handler for the option */ -if (object_property_find(obj, prop->name)) { -continue; -} -qdev_property_add_static(dev, prop); -} } void riscv_kvm_cpu_finalize_features(RISCVCPU *cpu, Error **errp) diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c index 84064ef7e0..d3eeedc758 100644 --- a/target/riscv/tcg/tcg-cpu.c +++ b/target/riscv/tcg/tcg-cpu.c @@ -889,10 +889,6 @@ static void riscv_cpu_add_user_properties(Object *obj) riscv_cpu_add_multiext_prop_array(obj, riscv_cpu_experimental_exts); riscv_cpu_add_multiext_prop_array(obj, riscv_cpu_deprecated_exts); - -for (Property *prop = riscv_cpu_options; prop && prop->name; prop++) { -qdev_property_add_static(DEVICE(obj), prop); -} } /* -- 2.43.0
[PATCH v2 09/16] target/riscv: move 'elen' to riscv_cpu_properties[]
Do the same thing we did with 'vlen' in the previous patch with 'elen'. Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 52 -- target/riscv/tcg/tcg-cpu.c | 5 2 files changed, 50 insertions(+), 7 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index c2ff50bcab..8be619b6f1 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -1723,9 +1723,54 @@ static const PropertyInfo prop_vlen = { .set_default_value = qdev_propinfo_set_default_value_uint, }; -Property riscv_cpu_options[] = { -DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64), +static void prop_elen_set(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +RISCVCPU *cpu = RISCV_CPU(obj); +uint16_t value; + +if (!visit_type_uint16(v, name, &value, errp)) { +return; +} + +if (!is_power_of_2(value)) { +error_setg(errp, "Vector extension ELEN must be power of 2"); +return; +} + +/* Always allow setting a default value */ +if (cpu->cfg.elen == 0) { +cpu->cfg.elen = value; +return; +} + +if (value != cpu->cfg.elen && riscv_cpu_is_vendor(obj)) { +cpu_set_prop_err(cpu, name, errp); +error_append_hint(errp, "Current '%s' val: %u\n", + name, cpu->cfg.elen); +return; +} + +cpu_option_add_user_setting(name, value); +cpu->cfg.elen = value; +} + +static void prop_elen_get(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +uint16_t value = RISCV_CPU(obj)->cfg.elen; +visit_type_uint16(v, name, &value, errp); +} + +static const PropertyInfo prop_elen = { +.name = "elen", +.get = prop_elen_get, +.set = prop_elen_set, +.set_default_value = qdev_propinfo_set_default_value_uint, +}; + +Property riscv_cpu_options[] = { DEFINE_PROP_UINT16("cbom_blocksize", RISCVCPU, cfg.cbom_blocksize, 64), DEFINE_PROP_UINT16("cboz_blocksize", RISCVCPU, cfg.cboz_blocksize, 64), @@ -1748,6 +1793,9 @@ static Property riscv_cpu_properties[] = { {.name = "vlen", .info = &prop_vlen, .set_default = true, .defval.u = 128}, +{.name = "elen", .info = &prop_elen, + .set_default = true, .defval.u = 64}, + #ifndef CONFIG_USER_ONLY DEFINE_PROP_UINT64("resetvec", RISCVCPU, env.resetvec, DEFAULT_RSTVEC), #endif diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c index 8ec858e096..84064ef7e0 100644 --- a/target/riscv/tcg/tcg-cpu.c +++ b/target/riscv/tcg/tcg-cpu.c @@ -185,11 +185,6 @@ static void riscv_cpu_validate_v(CPURISCVState *env, RISCVCPUConfig *cfg, return; } -if (!is_power_of_2(cfg->elen)) { -error_setg(errp, "Vector extension ELEN must be power of 2"); -return; -} - if (cfg->elen > 64 || cfg->elen < 8) { error_setg(errp, "Vector extension implementation only supports ELEN " -- 2.43.0
[PATCH v2 00/16] target/riscv: deprecate riscv_cpu_options[]
Hi, This new version fixes all instances of 'const PropertyInfo' added, changing them to 'static const PropertyInfo', as suggested by Richard in v1. Patches based on Alistair's riscv-to-apply.next. Series is also found here: https://gitlab.com/danielhb/qemu/-/tree/fix_cpu_opts_v2 Changes from v1: - changed 'const PropertyInfo' to 'static const PropertyInfo' in all relevant patches. - v1 link: https://lore.kernel.org/qemu-riscv/20231221175137.497379-1-dbarb...@ventanamicro.com/ Daniel Henrique Barboza (16): target/riscv/cpu_cfg.h: remove user_spec and bext_spec target/riscv: move 'pmu-mask' and 'pmu-num' to riscv_cpu_properties[] target/riscv: make riscv_cpu_is_generic() public target/riscv: move 'mmu' to riscv_cpu_properties[] target/riscv: move 'pmp' to riscv_cpu_properties[] target/riscv: rework 'priv_spec' target/riscv: rework 'vext_spec' target/riscv: move 'vlen' to riscv_cpu_properties[] target/riscv: move 'elen' to riscv_cpu_properties[] target/riscv: create finalize_features() for KVM target/riscv: move 'cbom_blocksize' to riscv_cpu_properties[] target/riscv: move 'cboz_blocksize' to riscv_cpu_properties[] target/riscv: remove riscv_cpu_options[] target/riscv/cpu.c: move 'mvendorid' to riscv_cpu_properties[] target/riscv/cpu.c: move 'mimpid' to riscv_cpu_properties[] target/riscv/cpu.c: move 'marchid' to riscv_cpu_properties[] target/riscv/cpu.c | 584 +-- target/riscv/cpu.h | 7 +- target/riscv/cpu_cfg.h | 4 - target/riscv/kvm/kvm-cpu.c | 94 +++--- target/riscv/kvm/kvm_riscv.h | 1 + target/riscv/tcg/tcg-cpu.c | 63 6 files changed, 561 insertions(+), 192 deletions(-) -- 2.43.0
[PATCH v2 10/16] target/riscv: create finalize_features() for KVM
To turn cbom_blocksize and cboz_blocksize into class properties we need KVM specific changes. KVM is creating its own version of these options with a customized setter() that prevents users from picking an invalid value during init() time. This comes at the cost of duplicating each option that KVM supports. This will keep happening for each new shared option KVM implements in the future. We can avoid that by using the same property TCG uses and adding specific KVM handling during finalize() time, like TCG already does with riscv_tcg_cpu_finalize_features(). To do that, the common CPU property offers a way of knowing if an option was user set or not, sparing us from doing unneeded syscalls. riscv_kvm_cpu_finalize_features() is then created using the same KVMScratch CPU we already use during init() time, since finalize() time is still too early to use the official KVM CPU for it. cbom_blocksize and cboz_blocksize are then handled during finalize() in the same way they're handled by their KVM specific setter. With this change we can proceed with the blocksize changes in the common code without breaking the KVM driver. Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 16 +++--- target/riscv/cpu.h | 1 + target/riscv/kvm/kvm-cpu.c | 59 target/riscv/kvm/kvm_riscv.h | 1 + 4 files changed, 72 insertions(+), 5 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 8be619b6f1..f49d31d753 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -63,6 +63,11 @@ static void cpu_option_add_user_setting(const char *optname, uint32_t value) GUINT_TO_POINTER(value)); } +bool riscv_cpu_option_set(const char *optname) +{ +return g_hash_table_contains(general_user_opts, optname); +} + #define ISA_EXT_DATA_ENTRY(_name, _min_ver, _prop) \ {#_name, _min_ver, CPU_CFG_OFFSET(_prop)} @@ -1056,17 +1061,18 @@ void riscv_cpu_finalize_features(RISCVCPU *cpu, Error **errp) { Error *local_err = NULL; -/* - * KVM accel does not have a specialized finalize() - * callback because its extensions are validated - * in the get()/set() callbacks of each property. 
- */ if (tcg_enabled()) { riscv_tcg_cpu_finalize_features(cpu, &local_err); if (local_err != NULL) { error_propagate(errp, local_err); return; } +} else if (kvm_enabled()) { +riscv_kvm_cpu_finalize_features(cpu, &local_err); +if (local_err != NULL) { +error_propagate(errp, local_err); +return; +} } #ifndef CONFIG_USER_ONLY diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index 53101b82c5..988471c7ba 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -495,6 +495,7 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int size, bool probe, uintptr_t retaddr); char *riscv_isa_string(RISCVCPU *cpu); void riscv_cpu_list(void); +bool riscv_cpu_option_set(const char *optname); #define cpu_list riscv_cpu_list #define cpu_mmu_index riscv_cpu_mmu_index diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c index 62a1e51f0a..70fb075846 100644 --- a/target/riscv/kvm/kvm-cpu.c +++ b/target/riscv/kvm/kvm-cpu.c @@ -1490,6 +1490,65 @@ static void kvm_cpu_instance_init(CPUState *cs) } } +void riscv_kvm_cpu_finalize_features(RISCVCPU *cpu, Error **errp) +{ +CPURISCVState *env = &cpu->env; +KVMScratchCPU kvmcpu; +struct kvm_one_reg reg; +uint64_t val; +int ret; + +/* short-circuit without spinning the scratch CPU */ +if (!cpu->cfg.ext_zicbom && !cpu->cfg.ext_zicboz) { +return; +} + +if (!kvm_riscv_create_scratch_vcpu(&kvmcpu)) { +error_setg(errp, "Unable to create scratch KVM cpu"); +return; +} + +if (cpu->cfg.ext_zicbom && +riscv_cpu_option_set(kvm_cbom_blocksize.name)) { + +reg.id = kvm_riscv_reg_id_ulong(env, KVM_REG_RISCV_CONFIG, +kvm_cbom_blocksize.kvm_reg_id); +reg.addr = (uint64_t)&val; +ret = ioctl(kvmcpu.cpufd, KVM_GET_ONE_REG, &reg); +if (ret != 0) { +error_setg(errp, "Unable to read cbom_blocksize, error %d", errno); +return; +} + +if (cpu->cfg.cbom_blocksize != val) { +error_setg(errp, "Unable to set cbom_blocksize to a different " + "value than the host (%lu)", val); +return; +} +} + +if (cpu->cfg.ext_zicboz && +riscv_cpu_option_set(kvm_cboz_blocksize.name)) { + +reg.id = kvm_riscv_reg_id_ulong(env, KVM_REG_RISCV_CONFIG, +kvm_cboz_blocksize.kvm_reg_id); +reg.addr = (uint64_t)&val; +ret = ioctl(kvmcpu.cpufd, KVM_GET_ONE_REG, &reg); +if (ret != 0) { +error_setg(errp, "Unable to read cboz_blocksize, error
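As a minimal self-contained sketch of the user-setting tracking this finalize() path relies on (GLib only; the two function names mirror the series, the main() harness is purely illustrative; build with pkg-config --cflags --libs glib-2.0):

#include <glib.h>
#include <stdio.h>

static GHashTable *general_user_opts;

/* a property setter records the option name when the user touches it */
static void cpu_option_add_user_setting(const char *optname, guint32 value)
{
    g_hash_table_insert(general_user_opts, (gpointer)optname,
                        GUINT_TO_POINTER(value));
}

static gboolean riscv_cpu_option_set(const char *optname)
{
    return g_hash_table_contains(general_user_opts, optname);
}

int main(void)
{
    general_user_opts = g_hash_table_new(g_str_hash, g_str_equal);

    cpu_option_add_user_setting("cbom_blocksize", 64);

    /* finalize() can skip KVM_GET_ONE_REG syscalls for untouched options */
    printf("cbom_blocksize user-set: %d\n", riscv_cpu_option_set("cbom_blocksize"));
    printf("cboz_blocksize user-set: %d\n", riscv_cpu_option_set("cboz_blocksize"));
    return 0;
}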
[PATCH v2 01/16] target/riscv/cpu_cfg.h: remove user_spec and bext_spec
They aren't being used. Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu_cfg.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h index f4605fb190..c67a8731d3 100644 --- a/target/riscv/cpu_cfg.h +++ b/target/riscv/cpu_cfg.h @@ -136,8 +136,6 @@ struct RISCVCPUConfig { uint32_t pmu_mask; char *priv_spec; -char *user_spec; -char *bext_spec; char *vext_spec; uint16_t vlen; uint16_t elen; -- 2.43.0
[PATCH v2 14/16] target/riscv/cpu.c: move 'mvendorid' to riscv_cpu_properties[]
Keep all class properties in riscv_cpu_properties[]. Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 69 +- 1 file changed, 37 insertions(+), 32 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 7b1cc5d0c9..260932e117 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -1860,6 +1860,41 @@ static const PropertyInfo prop_cboz_blksize = { .set_default_value = qdev_propinfo_set_default_value_uint, }; +static void prop_mvendorid_set(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +bool dynamic_cpu = riscv_cpu_is_dynamic(obj); +RISCVCPU *cpu = RISCV_CPU(obj); +uint32_t prev_val = cpu->cfg.mvendorid; +uint32_t value; + +if (!visit_type_uint32(v, name, &value, errp)) { +return; +} + +if (!dynamic_cpu && prev_val != value) { +error_setg(errp, "Unable to change %s mvendorid (0x%x)", + object_get_typename(obj), prev_val); +return; +} + +cpu->cfg.mvendorid = value; +} + +static void prop_mvendorid_get(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +uint32_t value = RISCV_CPU(obj)->cfg.mvendorid; + +visit_type_uint32(v, name, &value, errp); +} + +static const PropertyInfo prop_mvendorid = { +.name = "mvendorid", +.get = prop_mvendorid_get, +.set = prop_mvendorid_set, +}; + static Property riscv_cpu_properties[] = { DEFINE_PROP_BOOL("debug", RISCVCPU, cfg.debug, true), @@ -1884,6 +1919,8 @@ static Property riscv_cpu_properties[] = { {.name = "cboz_blocksize", .info = &prop_cboz_blksize, .set_default = true, .defval.u = 64}, + {.name = "mvendorid", .info = &prop_mvendorid}, + #ifndef CONFIG_USER_ONLY DEFINE_PROP_UINT64("resetvec", RISCVCPU, env.resetvec, DEFAULT_RSTVEC), #endif @@ -1948,35 +1985,6 @@ static const struct SysemuCPUOps riscv_sysemu_ops = { }; #endif -static void cpu_set_mvendorid(Object *obj, Visitor *v, const char *name, - void *opaque, Error **errp) -{ -bool dynamic_cpu = riscv_cpu_is_dynamic(obj); -RISCVCPU *cpu = RISCV_CPU(obj); -uint32_t prev_val = cpu->cfg.mvendorid; -uint32_t value; - -if (!visit_type_uint32(v, name, &value, errp)) { -return; -} - -if (!dynamic_cpu && prev_val != value) { -error_setg(errp, "Unable to change %s mvendorid (0x%x)", - object_get_typename(obj), prev_val); -return; -} - -cpu->cfg.mvendorid = value; -} - -static void cpu_get_mvendorid(Object *obj, Visitor *v, const char *name, - void *opaque, Error **errp) -{ -uint32_t value = RISCV_CPU(obj)->cfg.mvendorid; - -visit_type_uint32(v, name, &value, errp); -} - static void cpu_set_mimpid(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { @@ -2086,9 +2094,6 @@ static void riscv_cpu_class_init(ObjectClass *c, void *data) cc->gdb_arch_name = riscv_gdb_arch_name; cc->gdb_get_dynamic_xml = riscv_gdb_get_dynamic_xml; -object_class_property_add(c, "mvendorid", "uint32", cpu_get_mvendorid, - cpu_set_mvendorid, NULL, NULL); - object_class_property_add(c, "mimpid", "uint64", cpu_get_mimpid, cpu_set_mimpid, NULL, NULL); -- 2.43.0
[PATCH v2 07/16] target/riscv: rework 'vext_spec'
The same rework done for 'priv_spec' is now done for 'vext_spec'. This time it is simpler, since we only accept one value ("v1.0") and we'll always have env->vext_ver set to VEXT_VERSION_1_00_0, thus we don't need helpers to convert string to 'vext_ver' back and forth like we needed for 'priv_spec'. Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 42 ++ target/riscv/cpu.h | 1 + target/riscv/cpu_cfg.h | 1 - target/riscv/tcg/tcg-cpu.c | 15 -- 4 files changed, 39 insertions(+), 20 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 1302d32de3..d6625399a7 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -1237,6 +1237,8 @@ static void riscv_cpu_post_init(Object *obj) static void riscv_cpu_init(Object *obj) { +RISCVCPU *cpu = RISCV_CPU(obj); + #ifndef CONFIG_USER_ONLY qdev_init_gpio_in(DEVICE(obj), riscv_cpu_set_irq, IRQ_LOCAL_MAX + IRQ_LOCAL_GUEST_MAX); @@ -1249,8 +1251,11 @@ static void riscv_cpu_init(Object *obj) * for all CPUs. Each accelerator will decide what to do when * users disable them. */ -RISCV_CPU(obj)->cfg.ext_zicntr = true; -RISCV_CPU(obj)->cfg.ext_zihpm = true; +cpu->cfg.ext_zicntr = true; +cpu->cfg.ext_zihpm = true; + +/* vext_spec is always 1_00_0 */ +cpu->env.vext_ver = VEXT_VERSION_1_00_0; } typedef struct misa_ext_info { @@ -1629,9 +1634,37 @@ static const PropertyInfo prop_priv_spec = { .set = prop_priv_spec_set, }; -Property riscv_cpu_options[] = { -DEFINE_PROP_STRING("vext_spec", RISCVCPU, cfg.vext_spec), +static void prop_vext_spec_set(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +RISCVCPU *cpu = RISCV_CPU(obj); +g_autofree char *value = NULL; + +visit_type_str(v, name, &value, errp); + +if (g_strcmp0(value, VEXT_VER_1_00_0_STR) != 0) { +error_setg(errp, "Unsupported vector spec version '%s'", value); +return; +} + +cpu->env.vext_ver = VEXT_VERSION_1_00_0; +} + +static void prop_vext_spec_get(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +const char *value = VEXT_VER_1_00_0_STR; +visit_type_str(v, name, (char **)&value, errp); +} + +static const PropertyInfo prop_vext_spec = { +.name = "vext_spec", +.get = prop_vext_spec_get, +.set = prop_vext_spec_set, +}; + +Property riscv_cpu_options[] = { DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128), DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64), @@ -1652,6 +1685,7 @@ static Property riscv_cpu_properties[] = { {.name = "pmp", .info = &prop_pmp}, {.name = "priv_spec", .info = &prop_priv_spec}, +{.name = "vext_spec", .info = &prop_vext_spec}, #ifndef CONFIG_USER_ONLY DEFINE_PROP_UINT64("resetvec", RISCVCPU, env.resetvec, DEFAULT_RSTVEC), diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index e8a691ca63..53101b82c5 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -89,6 +89,7 @@ enum { }; #define VEXT_VERSION_1_00_0 0x0001 +#define VEXT_VER_1_00_0_STR "v1.0" enum { TRANSLATE_SUCCESS, diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h index 2dba1f0007..7112af6c4c 100644 --- a/target/riscv/cpu_cfg.h +++ b/target/riscv/cpu_cfg.h @@ -135,7 +135,6 @@ struct RISCVCPUConfig { bool ext_XVentanaCondOps; uint32_t pmu_mask; -char *vext_spec; uint16_t vlen; uint16_t elen; uint16_t cbom_blocksize; diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c index 4d67b72d9e..6501c29d8e 100644 --- a/target/riscv/tcg/tcg-cpu.c +++ b/target/riscv/tcg/tcg-cpu.c @@ -201,21 +201,6 @@ static void riscv_cpu_validate_v(CPURISCVState *env, RISCVCPUConfig *cfg, return; } -if (cfg->vext_spec) { -if
(!g_strcmp0(cfg->vext_spec, "v1.0")) { -env->vext_ver = VEXT_VERSION_1_00_0; -} else { -error_setg(errp, "Unsupported vector spec version '%s'", - cfg->vext_spec); -return; -} -} else if (env->vext_ver == 0) { -qemu_log("vector version is not specified, " - "use the default value v1.0\n"); - -env->vext_ver = VEXT_VERSION_1_00_0; -} } static void riscv_cpu_disable_priv_spec_isa_exts(RISCVCPU *cpu) -- 2.43.0
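Since the accept/reject check in the new setter hinges entirely on g_strcmp0() semantics, a small self-contained refresher (real GLib API; the values are illustrative):

#include <glib.h>
#include <stdio.h>

int main(void)
{
    /* g_strcmp0() returns 0 when the strings match and, unlike strcmp(),
     * tolerates NULL arguments, which is why the setter can call it on a
     * possibly-unset visitor result. */
    printf("%d\n", g_strcmp0("v1.0", "v1.0") == 0);  /* 1: match */
    printf("%d\n", g_strcmp0("v1.1", "v1.0") == 0);  /* 0: mismatch */
    printf("%d\n", g_strcmp0(NULL, "v1.0") == 0);    /* 0: NULL-safe */
    return 0;
}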
[PATCH v2 12/16] target/riscv: move 'cboz_blocksize' to riscv_cpu_properties[]
Do the same we did with 'cbom_blocksize' in the previous patch. Remove the now unused kvm_cpu_set_cbomz_blksize() setter. Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 45 +- target/riscv/kvm/kvm-cpu.c | 28 2 files changed, 44 insertions(+), 29 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 50825522b2..f30058518a 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -1818,8 +1818,49 @@ static const PropertyInfo prop_cbom_blksize = { .set_default_value = qdev_propinfo_set_default_value_uint, }; +static void prop_cboz_blksize_set(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +RISCVCPU *cpu = RISCV_CPU(obj); +uint16_t value; + +if (!visit_type_uint16(v, name, &value, errp)) { +return; +} + +/* Always allow setting a default value */ +if (cpu->cfg.cboz_blocksize == 0) { +cpu->cfg.cboz_blocksize = value; +return; +} + +if (value != cpu->cfg.cboz_blocksize && riscv_cpu_is_vendor(obj)) { +cpu_set_prop_err(cpu, name, errp); +error_append_hint(errp, "Current '%s' val: %u\n", + name, cpu->cfg.cboz_blocksize); +return; +} + +cpu_option_add_user_setting(name, value); +cpu->cfg.cboz_blocksize = value; +} + +static void prop_cboz_blksize_get(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +uint16_t value = RISCV_CPU(obj)->cfg.cboz_blocksize; + +visit_type_uint16(v, name, &value, errp); +} + +static const PropertyInfo prop_cboz_blksize = { +.name = "cboz_blocksize", +.get = prop_cboz_blksize_get, +.set = prop_cboz_blksize_set, +.set_default_value = qdev_propinfo_set_default_value_uint, +}; + Property riscv_cpu_options[] = { -DEFINE_PROP_UINT16("cboz_blocksize", RISCVCPU, cfg.cboz_blocksize, 64), DEFINE_PROP_END_OF_LIST(), }; @@ -1845,6 +1886,8 @@ static Property riscv_cpu_properties[] = { {.name = "cbom_blocksize", .info = &prop_cbom_blksize, .set_default = true, .defval.u = 64}, +{.name = "cboz_blocksize", .info = &prop_cboz_blksize, + .set_default = true, .defval.u = 64}, #ifndef CONFIG_USER_ONLY DEFINE_PROP_UINT64("resetvec", RISCVCPU, env.resetvec, DEFAULT_RSTVEC), diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c index 1866b56913..137a8ab2bb 100644 --- a/target/riscv/kvm/kvm-cpu.c +++ b/target/riscv/kvm/kvm-cpu.c @@ -343,30 +343,6 @@ static KVMCPUConfig kvm_cboz_blocksize = { .kvm_reg_id = KVM_REG_RISCV_CONFIG_REG(zicboz_block_size) }; -static void kvm_cpu_set_cbomz_blksize(Object *obj, Visitor *v, - const char *name, - void *opaque, Error **errp) -{ -KVMCPUConfig *cbomz_cfg = opaque; -RISCVCPU *cpu = RISCV_CPU(obj); -uint16_t value, *host_val; - -if (!visit_type_uint16(v, name, &value, errp)) { -return; -} - -host_val = kvmconfig_get_cfg_addr(cpu, cbomz_cfg); - -if (value != *host_val) { -error_report("Unable to set %s to a different value than " - "the host (%u)", - cbomz_cfg->name, *host_val); -exit(EXIT_FAILURE); -} - -cbomz_cfg->user_set = true; -} - static void kvm_riscv_update_cpu_cfg_isa_ext(RISCVCPU *cpu, CPUState *cs) { CPURISCVState *env = &cpu->env; @@ -484,10 +460,6 @@ static void kvm_riscv_add_cpu_user_properties(Object *cpu_obj) NULL, multi_cfg); } -object_property_add(cpu_obj, "cboz_blocksize", "uint16", -NULL, kvm_cpu_set_cbomz_blksize, -NULL, &kvm_cboz_blocksize); - riscv_cpu_add_kvm_unavail_prop_array(cpu_obj, riscv_cpu_extensions); riscv_cpu_add_kvm_unavail_prop_array(cpu_obj, riscv_cpu_vendor_exts); riscv_cpu_add_kvm_unavail_prop_array(cpu_obj, riscv_cpu_experimental_exts); -- 2.43.0
[PATCH v2 15/16] target/riscv/cpu.c: move 'mimpid' to riscv_cpu_properties[]
Keep all class properties in riscv_cpu_properties[]. Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 68 -- 1 file changed, 36 insertions(+), 32 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 260932e117..613e8d5ddc 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -1895,6 +1895,41 @@ static const PropertyInfo prop_mvendorid = { .set = prop_mvendorid_set, }; +static void prop_mimpid_set(Object *obj, Visitor *v, const char *name, +void *opaque, Error **errp) +{ +bool dynamic_cpu = riscv_cpu_is_dynamic(obj); +RISCVCPU *cpu = RISCV_CPU(obj); +uint64_t prev_val = cpu->cfg.mimpid; +uint64_t value; + +if (!visit_type_uint64(v, name, &value, errp)) { +return; +} + +if (!dynamic_cpu && prev_val != value) { +error_setg(errp, "Unable to change %s mimpid (0x%" PRIu64 ")", + object_get_typename(obj), prev_val); +return; +} + +cpu->cfg.mimpid = value; +} + +static void prop_mimpid_get(Object *obj, Visitor *v, const char *name, +void *opaque, Error **errp) +{ +uint64_t value = RISCV_CPU(obj)->cfg.mimpid; + +visit_type_uint64(v, name, &value, errp); +} + +static const PropertyInfo prop_mimpid = { +.name = "mimpid", +.get = prop_mimpid_get, +.set = prop_mimpid_set, +}; + static Property riscv_cpu_properties[] = { DEFINE_PROP_BOOL("debug", RISCVCPU, cfg.debug, true), @@ -1920,6 +1955,7 @@ static Property riscv_cpu_properties[] = { .set_default = true, .defval.u = 64}, {.name = "mvendorid", .info = &prop_mvendorid}, + {.name = "mimpid", .info = &prop_mimpid}, #ifndef CONFIG_USER_ONLY DEFINE_PROP_UINT64("resetvec", RISCVCPU, env.resetvec, DEFAULT_RSTVEC), @@ -1985,35 +2021,6 @@ static const struct SysemuCPUOps riscv_sysemu_ops = { }; #endif -static void cpu_set_mimpid(Object *obj, Visitor *v, const char *name, - void *opaque, Error **errp) -{ -bool dynamic_cpu = riscv_cpu_is_dynamic(obj); -RISCVCPU *cpu = RISCV_CPU(obj); -uint64_t prev_val = cpu->cfg.mimpid; -uint64_t value; - -if (!visit_type_uint64(v, name, &value, errp)) { -return; -} - -if (!dynamic_cpu && prev_val != value) { -error_setg(errp, "Unable to change %s mimpid (0x%" PRIu64 ")", - object_get_typename(obj), prev_val); -return; -} - -cpu->cfg.mimpid = value; -} - -static void cpu_get_mimpid(Object *obj, Visitor *v, const char *name, - void *opaque, Error **errp) -{ -uint64_t value = RISCV_CPU(obj)->cfg.mimpid; - -visit_type_uint64(v, name, &value, errp); -} - static void cpu_set_marchid(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { @@ -2094,9 +2101,6 @@ static void riscv_cpu_class_init(ObjectClass *c, void *data) cc->gdb_arch_name = riscv_gdb_arch_name; cc->gdb_get_dynamic_xml = riscv_gdb_get_dynamic_xml; -object_class_property_add(c, "mimpid", "uint64", cpu_get_mimpid, - cpu_set_mimpid, NULL, NULL); - object_class_property_add(c, "marchid", "uint64", cpu_get_marchid, cpu_set_marchid, NULL, NULL); -- 2.43.0
[PATCH v2 04/16] target/riscv: move 'mmu' to riscv_cpu_properties[]
Commit 7f0bdfb5bfc ("target/riscv/cpu.c: remove cfg setup from riscv_cpu_init()") already did some of the work by making some cpu_init() functions explicitly enable their own 'mmu' default. The generic CPUs didn't get updated by that commit, so they are still relying on the defaults set by the 'mmu' option. But having 'mmu' and 'pmp' being default=true will force CPUs that don't implement these options to set them to 'false' in their cpu_init(), which isn't ideal. We'll move 'mmu' to riscv_cpu_properties[] without any defaults, i.e. the default will be 'false'. Compensate it by manually setting 'mmu = true' for the generic CPUs that require it. Implement a setter for it to forbid changing the 'mmu' setting for vendor CPUs. This will allow the option to exist for all CPUs and, at the same time, protect vendor CPUs from undesired changes: $ ./build/qemu-system-riscv64 -M virt -cpu sifive-e51,mmu=true qemu-system-riscv64: can't apply global sifive-e51-riscv-cpu.mmu=true: CPU 'sifive-e51' does not allow changing the value of 'mmu' Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 67 +++--- 1 file changed, 63 insertions(+), 4 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index fb3d23b047..080713b9b5 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -410,6 +410,8 @@ static void riscv_max_cpu_init(Object *obj) CPURISCVState *env = &cpu->env; RISCVMXL mlx = MXL_RV64; +cpu->cfg.mmu = true; + #ifdef TARGET_RISCV32 mlx = MXL_RV32; #endif @@ -424,7 +426,11 @@ static void riscv_max_cpu_init(Object *obj) #if defined(TARGET_RISCV64) static void rv64_base_cpu_init(Object *obj) { -CPURISCVState *env = &RISCV_CPU(obj)->env; +RISCVCPU *cpu = RISCV_CPU(obj); +CPURISCVState *env = &cpu->env; + +cpu->cfg.mmu = true; + /* We set this in the realise function */ riscv_cpu_set_misa(env, MXL_RV64, 0); /* Set latest version of privileged specification */ @@ -542,13 +548,18 @@ static void rv64_veyron_v1_cpu_init(Object *obj) static void rv128_base_cpu_init(Object *obj) { +RISCVCPU *cpu = RISCV_CPU(obj); +CPURISCVState *env = &cpu->env; + if (qemu_tcg_mttcg_enabled()) { /* Missing 128-bit aligned atomics */ error_report("128-bit RISC-V currently does not work with Multi " "Threaded TCG.
Please use: -accel tcg,thread=single"); exit(EXIT_FAILURE); } -CPURISCVState *env = &RISCV_CPU(obj)->env; + +cpu->cfg.mmu = true; + /* We set this in the realise function */ riscv_cpu_set_misa(env, MXL_RV128, 0); /* Set latest version of privileged specification */ @@ -560,7 +571,11 @@ static void rv128_base_cpu_init(Object *obj) #else static void rv32_base_cpu_init(Object *obj) { -CPURISCVState *env = &RISCV_CPU(obj)->env; +RISCVCPU *cpu = RISCV_CPU(obj); +CPURISCVState *env = &cpu->env; + +cpu->cfg.mmu = true; + /* We set this in the realise function */ riscv_cpu_set_misa(env, MXL_RV32, 0); /* Set latest version of privileged specification */ @@ -1431,6 +1446,19 @@ const RISCVCPUMultiExtConfig riscv_cpu_deprecated_exts[] = { DEFINE_PROP_END_OF_LIST(), }; +static bool riscv_cpu_is_vendor(Object *obj) +{ +return !riscv_cpu_is_generic(obj); +} + +static void cpu_set_prop_err(RISCVCPU *cpu, const char *propname, + Error **errp) +{ +g_autofree char *cpuname = riscv_cpu_get_name(cpu); +error_setg(errp, "CPU '%s' does not allow changing the value of '%s'", + cpuname, propname); +} + static void prop_pmu_num_set(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { @@ -1468,8 +1496,37 @@ static const PropertyInfo prop_pmu_num = { .set = prop_pmu_num_set, }; +static void prop_mmu_set(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +RISCVCPU *cpu = RISCV_CPU(obj); +bool value; + +visit_type_bool(v, name, &value, errp); + +if (cpu->cfg.mmu != value && riscv_cpu_is_vendor(obj)) { +cpu_set_prop_err(cpu, "mmu", errp); +return; +} + +cpu->cfg.mmu = value; +} + +static void prop_mmu_get(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +bool value = RISCV_CPU(obj)->cfg.mmu; + +visit_type_bool(v, name, &value, errp); +} + +static const PropertyInfo prop_mmu = { +.name = "mmu", +.get = prop_mmu_get, +.set = prop_mmu_set, +}; + Property riscv_cpu_options[] = { -DEFINE_PROP_BOOL("mmu", RISCVCPU, cfg.mmu, true), DEFINE_PROP_BOOL("pmp", RISCVCPU, cfg.pmp, true), DEFINE_PROP_STRING("priv_spec", RISCVCPU, cfg.priv_spec), @@ -1491,6 +1548,8 @@ static Property riscv_cpu_properties[] = { MAKE_64BIT_MASK(3, 16)), {.name = "pmu-num", .info = &prop_pmu_num}, /* Deprecated *
[PATCH v2 16/16] target/riscv/cpu.c: move 'marchid' to riscv_cpu_properties[]
Keep all class properties in riscv_cpu_properties[]. Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 110 +++-- 1 file changed, 57 insertions(+), 53 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 613e8d5ddc..d2400fd447 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -1930,6 +1930,62 @@ static const PropertyInfo prop_mimpid = { .set = prop_mimpid_set, }; +static void prop_marchid_set(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +bool dynamic_cpu = riscv_cpu_is_dynamic(obj); +RISCVCPU *cpu = RISCV_CPU(obj); +uint64_t prev_val = cpu->cfg.marchid; +uint64_t value, invalid_val; +uint32_t mxlen = 0; + +if (!visit_type_uint64(v, name, &value, errp)) { +return; +} + +if (!dynamic_cpu && prev_val != value) { +error_setg(errp, "Unable to change %s marchid (0x%" PRIu64 ")", + object_get_typename(obj), prev_val); +return; +} + +switch (riscv_cpu_mxl(&cpu->env)) { +case MXL_RV32: +mxlen = 32; +break; +case MXL_RV64: +case MXL_RV128: +mxlen = 64; +break; +default: +g_assert_not_reached(); +} + +invalid_val = 1LL << (mxlen - 1); + +if (value == invalid_val) { +error_setg(errp, "Unable to set marchid with MSB (%u) bit set " + "and the remaining bits zero", mxlen); +return; +} + +cpu->cfg.marchid = value; +} + +static void prop_marchid_get(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +uint64_t value = RISCV_CPU(obj)->cfg.marchid; + +visit_type_uint64(v, name, &value, errp); +} + +static const PropertyInfo prop_marchid = { +.name = "marchid", +.get = prop_marchid_get, +.set = prop_marchid_set, +}; + static Property riscv_cpu_properties[] = { DEFINE_PROP_BOOL("debug", RISCVCPU, cfg.debug, true), @@ -1956,6 +2012,7 @@ static Property riscv_cpu_properties[] = { {.name = "mvendorid", .info = &prop_mvendorid}, {.name = "mimpid", .info = &prop_mimpid}, + {.name = "marchid", .info = &prop_marchid}, #ifndef CONFIG_USER_ONLY DEFINE_PROP_UINT64("resetvec", RISCVCPU, env.resetvec, DEFAULT_RSTVEC), @@ -2021,56 +2078,6 @@ static const struct SysemuCPUOps riscv_sysemu_ops = { }; #endif -static void cpu_set_marchid(Object *obj, Visitor *v, const char *name, -void *opaque, Error **errp) -{ -bool dynamic_cpu = riscv_cpu_is_dynamic(obj); -RISCVCPU *cpu = RISCV_CPU(obj); -uint64_t prev_val = cpu->cfg.marchid; -uint64_t value, invalid_val; -uint32_t mxlen = 0; - -if (!visit_type_uint64(v, name, &value, errp)) { -return; -} - -if (!dynamic_cpu && prev_val != value) { -error_setg(errp, "Unable to change %s marchid (0x%" PRIu64 ")", - object_get_typename(obj), prev_val); -return; -} - -switch (riscv_cpu_mxl(&cpu->env)) { -case MXL_RV32: -mxlen = 32; -break; -case MXL_RV64: -case MXL_RV128: -mxlen = 64; -break; -default: -g_assert_not_reached(); -} - -invalid_val = 1LL << (mxlen - 1); - -if (value == invalid_val) { -error_setg(errp, "Unable to set marchid with MSB (%u) bit set " - "and the remaining bits zero", mxlen); -return; -} - -cpu->cfg.marchid = value; -} - -static void cpu_get_marchid(Object *obj, Visitor *v, const char *name, - void *opaque, Error **errp) -{ -uint64_t value = RISCV_CPU(obj)->cfg.marchid; - -visit_type_uint64(v, name, &value, errp); -} - static void riscv_cpu_class_init(ObjectClass *c, void *data) { RISCVCPUClass *mcc = RISCV_CPU_CLASS(c); @@ -2101,9 +2108,6 @@ static void riscv_cpu_class_init(ObjectClass *c, void *data) cc->gdb_arch_name = riscv_gdb_arch_name; cc->gdb_get_dynamic_xml = riscv_gdb_get_dynamic_xml; -object_class_property_add(c, "marchid", "uint64", cpu_get_marchid, - 
cpu_set_marchid, NULL, NULL); - device_class_set_props(dc, riscv_cpu_properties); } -- 2.43.0
[PATCH v2 03/16] target/riscv: make riscv_cpu_is_generic() public
We'll use this function in target/riscv/cpu.c to implement setters that won't allow vendor CPU options to be changed. Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 5 + target/riscv/cpu.h | 1 + target/riscv/tcg/tcg-cpu.c | 5 - 3 files changed, 6 insertions(+), 5 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 34f7616258..fb3d23b047 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -183,6 +183,11 @@ void isa_ext_update_enabled(RISCVCPU *cpu, uint32_t ext_offset, bool en) *ext_enabled = en; } +bool riscv_cpu_is_generic(Object *cpu_obj) +{ +return object_dynamic_cast(cpu_obj, TYPE_RISCV_DYNAMIC_CPU) != NULL; +} + const char * const riscv_int_regnames[] = { "x0/zero", "x1/ra", "x2/sp", "x3/gp", "x4/tp", "x5/t0", "x6/t1", "x7/t2", "x8/s0", "x9/s1", "x10/a0", "x11/a1", "x12/a2", "x13/a3", diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index d74b361be6..cfe965e512 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -757,6 +757,7 @@ enum riscv_pmu_event_idx { void isa_ext_update_enabled(RISCVCPU *cpu, uint32_t ext_offset, bool en); bool isa_ext_is_enabled(RISCVCPU *cpu, uint32_t ext_offset); void riscv_cpu_set_misa(CPURISCVState *env, RISCVMXL mxl, uint32_t ext); +bool riscv_cpu_is_generic(Object *cpu_obj); typedef struct RISCVCPUMultiExtConfig { const char *name; diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c index 8a35683a34..a09300e908 100644 --- a/target/riscv/tcg/tcg-cpu.c +++ b/target/riscv/tcg/tcg-cpu.c @@ -658,11 +658,6 @@ bool riscv_cpu_tcg_compatible(RISCVCPU *cpu) return object_dynamic_cast(OBJECT(cpu), TYPE_RISCV_CPU_HOST) == NULL; } -static bool riscv_cpu_is_generic(Object *cpu_obj) -{ -return object_dynamic_cast(cpu_obj, TYPE_RISCV_DYNAMIC_CPU) != NULL; -} - /* * We'll get here via the following path: * -- 2.43.0
[PATCH v2 06/16] target/riscv: rework 'priv_spec'
'priv_spec' and 'vext_spec' are two string options used as a fancy way of setting integers in the CPU state (cpu->env.priv_ver and cpu->env.vext_ver). They require us to deal with string parsing and to store the strings in cpu_cfg. We must support these string options, but we don't need to store them. We have a precedent for this kind of arrangement in target/ppc/compat.c, ppc_compat_prop_get|set, getters and setters used for the 'max-cpu-compat' class property of the pseries ppc64 machine. We'll do the same with both 'priv_spec' and 'vext_spec'. For 'priv_spec', the validation from riscv_cpu_validate_priv_spec() will be done by the prop_priv_spec_set() setter, while also preventing it from being changed for vendor CPUs. Add two helpers that convert env->priv_ver back and forth to its string representation. These helpers allow us to take a string and set 'env->priv_ver', and to return a string giving the current env->priv_ver value. In other words, make the cpu->cfg.priv_spec string obsolete. Last but not least, move the reworked 'priv_spec' option to riscv_cpu_properties[]. After all is said and done, we don't need to store the 'priv_spec' string in the CPU state, and we're now protecting vendor CPUs from priv_ver changes: $ ./build/qemu-system-riscv64 -M virt -cpu sifive-e51,priv_spec="v1.12.0" qemu-system-riscv64: can't apply global sifive-e51-riscv-cpu.priv_spec=v1.12.0: CPU 'sifive-e51' does not allow changing the value of 'priv_spec' Current 'priv_spec' val: v1.10.0 $ Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 72 +- target/riscv/cpu.h | 3 ++ target/riscv/cpu_cfg.h | 1 - target/riscv/tcg/tcg-cpu.c | 29 --- 4 files changed, 74 insertions(+), 31 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index bdb6466c84..1302d32de3 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -1560,8 +1560,76 @@ static const PropertyInfo prop_pmp = { .set = prop_pmp_set, }; +static int priv_spec_from_str(const char *priv_spec_str) +{ +int priv_version = -1; + +if (!g_strcmp0(priv_spec_str, PRIV_VER_1_12_0_STR)) { +priv_version = PRIV_VERSION_1_12_0; +} else if (!g_strcmp0(priv_spec_str, PRIV_VER_1_11_0_STR)) { +priv_version = PRIV_VERSION_1_11_0; +} else if (!g_strcmp0(priv_spec_str, PRIV_VER_1_10_0_STR)) { +priv_version = PRIV_VERSION_1_10_0; +} + +return priv_version; +} + +static const char *priv_spec_to_str(int priv_version) +{ +switch (priv_version) { +case PRIV_VERSION_1_10_0: +return PRIV_VER_1_10_0_STR; +case PRIV_VERSION_1_11_0: +return PRIV_VER_1_11_0_STR; +case PRIV_VERSION_1_12_0: +return PRIV_VER_1_12_0_STR; +default: +return NULL; +} +} + +static void prop_priv_spec_set(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +RISCVCPU *cpu = RISCV_CPU(obj); +g_autofree char *value = NULL; +int priv_version = -1; + +visit_type_str(v, name, &value, errp); + +priv_version = priv_spec_from_str(value); +if (priv_version < 0) { +error_setg(errp, "Unsupported privilege spec version '%s'", value); +return; +} + +if (priv_version != cpu->env.priv_ver && riscv_cpu_is_vendor(obj)) { +cpu_set_prop_err(cpu, name, errp); +error_append_hint(errp, "Current '%s' val: %s\n", name, + object_property_get_str(obj, name, NULL)); +return; +} + +cpu->env.priv_ver = priv_version; +} + +static void prop_priv_spec_get(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +RISCVCPU *cpu = RISCV_CPU(obj); +const char *value = priv_spec_to_str(cpu->env.priv_ver); + +visit_type_str(v, name, (char **)&value, errp); +} + +static const PropertyInfo
prop_priv_spec = { +.name = "priv_spec", +.get = prop_priv_spec_get, +.set = prop_priv_spec_set, +}; + Property riscv_cpu_options[] = { -DEFINE_PROP_STRING("priv_spec", RISCVCPU, cfg.priv_spec), DEFINE_PROP_STRING("vext_spec", RISCVCPU, cfg.vext_spec), DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128), @@ -1583,6 +1651,8 @@ static Property riscv_cpu_properties[] = { {.name = "mmu", .info = &prop_mmu}, {.name = "pmp", .info = &prop_pmp}, +{.name = "priv_spec", .info = &prop_priv_spec}, + #ifndef CONFIG_USER_ONLY DEFINE_PROP_UINT64("resetvec", RISCVCPU, env.resetvec, DEFAULT_RSTVEC), #endif diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index cfe965e512..e8a691ca63 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -77,6 +77,9 @@ const char *riscv_get_misa_ext_description(uint32_t bit); #define CPU_CFG_OFFSET(_prop) offsetof(struct RISCVCPUConfig, _prop) /* Privileged specification version */ +#define PRIV_VER_1_10_0_STR "v1.10.0" +#define PRIV_VER_1_11_0_STR "v1.11.0" +#define PRIV_VER_1
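A self-contained sketch of the string<->integer round trip the two helpers implement. The PRIV_VERSION_* enum values here are assumed stand-ins (the real ones live in target/riscv/cpu.h); the logic mirrors the patch:

#include <glib.h>
#include <stdio.h>

/* assumed stand-ins for the enum values in cpu.h */
enum { PRIV_VERSION_1_10_0, PRIV_VERSION_1_11_0, PRIV_VERSION_1_12_0 };

#define PRIV_VER_1_10_0_STR "v1.10.0"
#define PRIV_VER_1_11_0_STR "v1.11.0"
#define PRIV_VER_1_12_0_STR "v1.12.0"

static int priv_spec_from_str(const char *s)
{
    if (!g_strcmp0(s, PRIV_VER_1_12_0_STR)) return PRIV_VERSION_1_12_0;
    if (!g_strcmp0(s, PRIV_VER_1_11_0_STR)) return PRIV_VERSION_1_11_0;
    if (!g_strcmp0(s, PRIV_VER_1_10_0_STR)) return PRIV_VERSION_1_10_0;
    return -1; /* unsupported version string */
}

static const char *priv_spec_to_str(int v)
{
    switch (v) {
    case PRIV_VERSION_1_10_0: return PRIV_VER_1_10_0_STR;
    case PRIV_VERSION_1_11_0: return PRIV_VER_1_11_0_STR;
    case PRIV_VERSION_1_12_0: return PRIV_VER_1_12_0_STR;
    default: return NULL;
    }
}

int main(void)
{
    int v = priv_spec_from_str("v1.12.0");
    /* round-trips without storing the string anywhere */
    printf("%d -> %s\n", v, priv_spec_to_str(v));
    printf("unknown -> %d\n", priv_spec_from_str("v9.9.9"));
    return 0;
}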
[PATCH v2 02/16] target/riscv: move 'pmu-mask' and 'pmu-num' to riscv_cpu_properties[]
Every property in riscv_cpu_options[] will be migrated to riscv_cpu_properties[]. This will make their default values be initialized earlier, allowing cpu_init() functions to overwrite them. We'll also implement common getters and setters that both accelerators will use, allowing them to share the validations that TCG already does. For pmu-mask and pmu-num it's just a matter of migrating the properties from one array to the other. While we're at it, add a 'static' modifier to 'prop_pmu_num' since we're not exporting it. Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 70bf10aa7c..34f7616258 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -1457,16 +1457,13 @@ static void prop_pmu_num_get(Object *obj, Visitor *v, const char *name, visit_type_uint8(v, name, &pmu_num, errp); } -const PropertyInfo prop_pmu_num = { +static const PropertyInfo prop_pmu_num = { .name = "pmu-num", .get = prop_pmu_num_get, .set = prop_pmu_num_set, }; Property riscv_cpu_options[] = { -DEFINE_PROP_UINT32("pmu-mask", RISCVCPU, cfg.pmu_mask, MAKE_64BIT_MASK(3, 16)), -{.name = "pmu-num", .info = &prop_pmu_num}, /* Deprecated */ - DEFINE_PROP_BOOL("mmu", RISCVCPU, cfg.mmu, true), DEFINE_PROP_BOOL("pmp", RISCVCPU, cfg.pmp, true), @@ -1485,6 +1482,10 @@ Property riscv_cpu_options[] = { static Property riscv_cpu_properties[] = { DEFINE_PROP_BOOL("debug", RISCVCPU, cfg.debug, true), +DEFINE_PROP_UINT32("pmu-mask", RISCVCPU, cfg.pmu_mask, + MAKE_64BIT_MASK(3, 16)), +{.name = "pmu-num", .info = &prop_pmu_num}, /* Deprecated */ + #ifndef CONFIG_USER_ONLY DEFINE_PROP_UINT64("resetvec", RISCVCPU, env.resetvec, DEFAULT_RSTVEC), #endif -- 2.43.0
[PATCH v2 08/16] target/riscv: move 'vlen' to riscv_cpu_properties[]
Turning 'vlen' into a class property will allow its default value to be overwritten by cpu_init() later on, solving the issue we have now where CPU-specific settings are getting overwritten by the default. For 'vlen', 'elen' and the blocksize options we need a way of tracking if the user set a value for them. This is benign for TCG since the cost of always validating these values is small, but for KVM we need syscalls to read the host values to make the validations. Knowing whether the user didn't touch the values makes a difference. We'll track user settings for these properties using a hash, like we do in the TCG driver. Common validation bits are moved from riscv_cpu_validate_v() to prop_vlen_set() to be shared with KVM. And, as done with every option we migrated to riscv_cpu_properties[], vendor CPUs can't have their 'vlen' value changed. Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 63 +- target/riscv/tcg/tcg-cpu.c | 5 --- 2 files changed, 62 insertions(+), 6 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index d6625399a7..c2ff50bcab 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -29,6 +29,7 @@ #include "qapi/visitor.h" #include "qemu/error-report.h" #include "hw/qdev-properties.h" +#include "hw/core/qdev-prop-internal.h" #include "migration/vmstate.h" #include "fpu/softfloat-helpers.h" #include "sysemu/kvm.h" @@ -53,6 +54,15 @@ const uint32_t misa_bits[] = {RVI, RVE, RVM, RVA, RVF, RVD, RVV, #define BYTE(x) (x) #endif +/* Hash that stores general user set numeric options */ +static GHashTable *general_user_opts; + +static void cpu_option_add_user_setting(const char *optname, uint32_t value) +{ +g_hash_table_insert(general_user_opts, (gpointer)optname, +GUINT_TO_POINTER(value)); +} + #define ISA_EXT_DATA_ENTRY(_name, _min_ver, _prop) \ {#_name, _min_ver, CPU_CFG_OFFSET(_prop)} @@ -1244,6 +1254,8 @@ static void riscv_cpu_init(Object *obj) IRQ_LOCAL_MAX + IRQ_LOCAL_GUEST_MAX); #endif /* CONFIG_USER_ONLY */ +general_user_opts = g_hash_table_new(g_str_hash, g_str_equal); + /* * The timer and performance counters extensions were supported * in QEMU before they were added as discrete extensions in the @@ -1664,8 +1676,54 @@ static const PropertyInfo prop_vext_spec = { .set = prop_vext_spec_set, }; +static void prop_vlen_set(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +RISCVCPU *cpu = RISCV_CPU(obj); +uint16_t value; + +if (!visit_type_uint16(v, name, &value, errp)) { +return; +} + +if (!is_power_of_2(value)) { +error_setg(errp, "Vector extension VLEN must be power of 2"); +return; +} + +/* Always allow setting a default value */ +if (cpu->cfg.vlen == 0) { +cpu->cfg.vlen = value; +return; +} + +if (value != cpu->cfg.vlen && riscv_cpu_is_vendor(obj)) { +cpu_set_prop_err(cpu, name, errp); +error_append_hint(errp, "Current '%s' val: %u\n", + name, cpu->cfg.vlen); +return; +} + +cpu_option_add_user_setting(name, value); +cpu->cfg.vlen = value; +} + +static void prop_vlen_get(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +uint16_t value = RISCV_CPU(obj)->cfg.vlen; + +visit_type_uint16(v, name, &value, errp); +} + +static const PropertyInfo prop_vlen = { +.name = "vlen", +.get = prop_vlen_get, +.set = prop_vlen_set, +.set_default_value = qdev_propinfo_set_default_value_uint, +}; + Property riscv_cpu_options[] = { -DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128), DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64), DEFINE_PROP_UINT16("cbom_blocksize", RISCVCPU, cfg.cbom_blocksize, 64),
@@ -1687,6 +1745,9 @@ static Property riscv_cpu_properties[] = { {.name = "priv_spec", .info = &prop_priv_spec}, {.name = "vext_spec", .info = &prop_vext_spec}, +{.name = "vlen", .info = &prop_vlen, + .set_default = true, .defval.u = 128}, + #ifndef CONFIG_USER_ONLY DEFINE_PROP_UINT64("resetvec", RISCVCPU, env.resetvec, DEFAULT_RSTVEC), #endif diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c index 6501c29d8e..8ec858e096 100644 --- a/target/riscv/tcg/tcg-cpu.c +++ b/target/riscv/tcg/tcg-cpu.c @@ -178,11 +178,6 @@ static void riscv_cpu_validate_misa_mxl(RISCVCPU *cpu, Error **errp) static void riscv_cpu_validate_v(CPURISCVState *env, RISCVCPUConfig *cfg, Error **errp) { -if (!is_power_of_2(cfg->vlen)) { -error_setg(errp, "Vector extension VLEN must be power of 2"); -return; -} - if (cfg->vlen > RV_VLEN_MAX || cfg->vlen < 128) { error_setg(errp, "Vector extension implementation only supports VLEN " -- 2
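The validation shared between TCG and KVM boils down to QEMU's is_power_of_2() helper; a self-contained equivalent for reference (the real helper lives in QEMU's utility headers):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* equivalent of QEMU's is_power_of_2() helper */
static bool is_power_of_2(uint64_t value)
{
    if (!value) {
        return false;
    }
    return !(value & (value - 1));
}

int main(void)
{
    /* 128 is the vlen default and passes; 100 is rejected by the setter */
    printf("128 -> %d\n", is_power_of_2(128));
    printf("100 -> %d\n", is_power_of_2(100));
    return 0;
}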
[PATCH v2 05/16] target/riscv: move 'pmp' to riscv_cpu_properties[]
Move 'pmp' to riscv_cpu_properties[], creating a new setter() for it that forbids 'pmp' to be changed in vendor CPUs, like we did with the 'mmu' option. We'll also have to manually set 'pmp = true' to generic CPUs that were still relying on the previous default to set it. Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 37 +++-- 1 file changed, 35 insertions(+), 2 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 080713b9b5..bdb6466c84 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -411,6 +411,7 @@ static void riscv_max_cpu_init(Object *obj) RISCVMXL mlx = MXL_RV64; cpu->cfg.mmu = true; +cpu->cfg.pmp = true; #ifdef TARGET_RISCV32 mlx = MXL_RV32; @@ -430,6 +431,7 @@ static void rv64_base_cpu_init(Object *obj) CPURISCVState *env = &cpu->env; cpu->cfg.mmu = true; +cpu->cfg.pmp = true; /* We set this in the realise function */ riscv_cpu_set_misa(env, MXL_RV64, 0); @@ -559,6 +561,7 @@ static void rv128_base_cpu_init(Object *obj) } cpu->cfg.mmu = true; +cpu->cfg.pmp = true; /* We set this in the realise function */ riscv_cpu_set_misa(env, MXL_RV128, 0); @@ -575,6 +578,7 @@ static void rv32_base_cpu_init(Object *obj) CPURISCVState *env = &cpu->env; cpu->cfg.mmu = true; +cpu->cfg.pmp = true; /* We set this in the realise function */ riscv_cpu_set_misa(env, MXL_RV32, 0); @@ -1526,9 +1530,37 @@ static const PropertyInfo prop_mmu = { .set = prop_mmu_set, }; -Property riscv_cpu_options[] = { -DEFINE_PROP_BOOL("pmp", RISCVCPU, cfg.pmp, true), +static void prop_pmp_set(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +RISCVCPU *cpu = RISCV_CPU(obj); +bool value; + +visit_type_bool(v, name, &value, errp); +if (cpu->cfg.pmp != value && riscv_cpu_is_vendor(obj)) { +cpu_set_prop_err(cpu, name, errp); +return; +} + +cpu->cfg.pmp = value; +} + +static void prop_pmp_get(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +bool value = RISCV_CPU(obj)->cfg.pmp; + +visit_type_bool(v, name, &value, errp); +} + +static const PropertyInfo prop_pmp = { +.name = "pmp", +.get = prop_pmp_get, +.set = prop_pmp_set, +}; + +Property riscv_cpu_options[] = { DEFINE_PROP_STRING("priv_spec", RISCVCPU, cfg.priv_spec), DEFINE_PROP_STRING("vext_spec", RISCVCPU, cfg.vext_spec), @@ -1549,6 +1581,7 @@ static Property riscv_cpu_properties[] = { {.name = "pmu-num", .info = &prop_pmu_num}, /* Deprecated */ {.name = "mmu", .info = &prop_mmu}, +{.name = "pmp", .info = &prop_pmp}, #ifndef CONFIG_USER_ONLY DEFINE_PROP_UINT64("resetvec", RISCVCPU, env.resetvec, DEFAULT_RSTVEC), -- 2.43.0
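The guard behaves like the 'mmu' one from the previous patch; assuming a vendor CPU that ships with pmp enabled, toggling it now fails along these lines (the CPU name and wrapper text follow the 'mmu' example and are illustrative):

$ ./build/qemu-system-riscv64 -M virt -cpu sifive-e51,pmp=false
qemu-system-riscv64: can't apply global sifive-e51-riscv-cpu.pmp=false: CPU 'sifive-e51' does not allow changing the value of 'pmp'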
[PATCH v2 11/16] target/riscv: move 'cbom_blocksize' to riscv_cpu_properties[]
After adding a KVM finalize() implementation, turn cbom_blocksize into a class property. Follow the same design we used with 'vlen' and 'elen'. The duplicated 'cbom_blocksize' KVM property can be removed from kvm_riscv_add_cpu_user_properties(). Signed-off-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 46 +- target/riscv/kvm/kvm-cpu.c | 4 2 files changed, 45 insertions(+), 5 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index f49d31d753..50825522b2 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -1776,8 +1776,49 @@ static const PropertyInfo prop_elen = { .set_default_value = qdev_propinfo_set_default_value_uint, }; +static void prop_cbom_blksize_set(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +RISCVCPU *cpu = RISCV_CPU(obj); +uint16_t value; + +if (!visit_type_uint16(v, name, &value, errp)) { +return; +} + +/* Always allow setting a default value */ +if (cpu->cfg.cbom_blocksize == 0) { +cpu->cfg.cbom_blocksize = value; +return; +} + +if (value != cpu->cfg.cbom_blocksize && riscv_cpu_is_vendor(obj)) { +cpu_set_prop_err(cpu, name, errp); +error_append_hint(errp, "Current '%s' val: %u\n", + name, cpu->cfg.cbom_blocksize); +return; +} + +cpu_option_add_user_setting(name, value); +cpu->cfg.cbom_blocksize = value; +} + +static void prop_cbom_blksize_get(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ +uint16_t value = RISCV_CPU(obj)->cfg.cbom_blocksize; + +visit_type_uint16(v, name, &value, errp); +} + +static const PropertyInfo prop_cbom_blksize = { +.name = "cbom_blocksize", +.get = prop_cbom_blksize_get, +.set = prop_cbom_blksize_set, +.set_default_value = qdev_propinfo_set_default_value_uint, +}; + Property riscv_cpu_options[] = { -DEFINE_PROP_UINT16("cbom_blocksize", RISCVCPU, cfg.cbom_blocksize, 64), DEFINE_PROP_UINT16("cboz_blocksize", RISCVCPU, cfg.cboz_blocksize, 64), DEFINE_PROP_END_OF_LIST(), @@ -1802,6 +1843,9 @@ static Property riscv_cpu_properties[] = { {.name = "elen", .info = &prop_elen, .set_default = true, .defval.u = 64}, +{.name = "cbom_blocksize", .info = &prop_cbom_blksize, + .set_default = true, .defval.u = 64}, + #ifndef CONFIG_USER_ONLY DEFINE_PROP_UINT64("resetvec", RISCVCPU, env.resetvec, DEFAULT_RSTVEC), #endif diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c index 70fb075846..1866b56913 100644 --- a/target/riscv/kvm/kvm-cpu.c +++ b/target/riscv/kvm/kvm-cpu.c @@ -484,10 +484,6 @@ static void kvm_riscv_add_cpu_user_properties(Object *cpu_obj) NULL, multi_cfg); } -object_property_add(cpu_obj, "cbom_blocksize", "uint16", -NULL, kvm_cpu_set_cbomz_blksize, -NULL, &kvm_cbom_blocksize); - object_property_add(cpu_obj, "cboz_blocksize", "uint16", NULL, kvm_cpu_set_cbomz_blksize, NULL, &kvm_cboz_blocksize); -- 2.43.0
Re: [PATCH v5 1/2] qom: new object to associate device to numa node
writes: > From: Ankit Agrawal > > NVIDIA GPUs support the MIG (Multi-Instance GPU) feature [1], which allows > partitioning of the GPU device resources (including device memory) into > several (up to 8) isolated instances. Each of the memory partitions needs > a dedicated NUMA node to operate. The partitions are not fixed and they > can be created/deleted at runtime. > > Unfortunately Linux OS does not provide a means to dynamically create/destroy > NUMA nodes and such feature implementation is not expected to be trivial. The > nodes that the OS discovers at boot time while parsing SRAT remain fixed. So > we utilize the Generic Initiator Affinity structures that allow association > between nodes and devices. Multiple GI structures per BDF are possible, > allowing creation of multiple nodes by exposing unique PXM in each of these > structures. > > Introduce a new acpi-generic-initiator object to allow the host admin to provide the > device and the corresponding NUMA nodes. QEMU maintains this association and > uses this object to build the requisite GI Affinity Structure. Pardon my ignorance... What makes this object an "initiator", and why is it "generic"? > An admin can provide the range of nodes through a uint16 array host-nodes > and link it to a device by providing its id. Currently, only PCI device is > supported. The following sample creates 8 nodes and links them to the PCI > device dev0: > > -numa node,nodeid=2 \ > -numa node,nodeid=3 \ > -numa node,nodeid=4 \ > -numa node,nodeid=5 \ > -numa node,nodeid=6 \ > -numa node,nodeid=7 \ > -numa node,nodeid=8 \ > -numa node,nodeid=9 \ > -device > vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \ > -object acpi-generic-initiator,id=gi0,pci-dev=dev0,host-nodes=2-9 \ Does this link *all* NUMA nodes to dev0? Would an example involving two devices be more instructive? > [1] https://www.nvidia.com/en-in/technologies/multi-instance-gpu > > Signed-off-by: Ankit Agrawal [...] > diff --git a/qapi/qom.json b/qapi/qom.json > index c53ef978ff..efcc4b8dfd 100644 > --- a/qapi/qom.json > +++ b/qapi/qom.json > @@ -794,6 +794,21 @@ > { 'struct': 'VfioUserServerProperties', >'data': { 'socket': 'SocketAddress', 'device': 'str' } } > > +## > +# @AcpiGenericInitiatorProperties: > +# > +# Properties for acpi-generic-initiator objects. > +# > +# @pci-dev: PCI device ID to be associated with the node > +# > +# @host-nodes: numa node list This feels a bit terse. The commit message makes me guess this specifies the NUMA nodes to be linked to @pci-dev. Correct? > +# > +# Since: 9.0 > +## > +{ 'struct': 'AcpiGenericInitiatorProperties', > + 'data': { 'pci-dev': 'str', > +'host-nodes': ['uint16'] } } > + > ## > # @RngProperties: > # > @@ -911,6 +926,7 @@ > ## > { 'enum': 'ObjectType', >'data': [ > +'acpi-generic-initiator', > 'authz-list', > 'authz-listfile', > 'authz-pam', > @@ -981,6 +997,7 @@ > 'id': 'str' }, >'discriminator': 'qom-type', >'data': { > + 'acpi-generic-initiator': 'AcpiGenericInitiatorProperties', >'authz-list': 'AuthZListProperties', >'authz-listfile': 'AuthZListFileProperties', >'authz-pam': 'AuthZPAMProperties',
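For what it's worth, a two-device variant of the sample above (hypothetical second BDF and node split, assuming the syntax stays the same) might look like:

-device vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
-device vfio-pci-nohotplug,host=0009:02:00.0,bus=pcie.0,addr=05.0,rombar=0,id=dev1 \
-object acpi-generic-initiator,id=gi0,pci-dev=dev0,host-nodes=2-5 \
-object acpi-generic-initiator,id=gi1,pci-dev=dev1,host-nodes=6-9 \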
Re: [RFC PATCH 01/11] acpi: hmp/qmp: Add hmp/qmp support for system_sleep
"Annie.li" writes: > Hi Markus, > > On 12/5/2023 3:34 PM, Markus Armbruster wrote: >> You neglected to cc: QAPI schema maintainers. I found it by chance. >> Next time :) > Yep, should have cc to the maintainers. >> >> Annie Li writes: >> >>> Following hmp/qmp commands are implemented for pressing virtual >>> sleep button, >>> >>> hmp: system_sleep >>> qmp: { "execute": "system_sleep" } >>> >>> These commands put the guest into suspend or other power states >>> depending on the power settings inside the guest. >> >> How is this related to system_wakeup? > > Both 'system_sleep' and 'system_wakeup' trigger the event to notify the > guest OSPM the sleep button has been pressed. 'system_wakeup' triggers > wake up notification when the guest is in suspend state. Thanks. Would it make sens to work this into the QAPI schema doc comments somehow? [...]
Re: [PATCH V8 02/12] cpus: stop vm in suspended runstate
Steven Sistare writes: > FYI for Markus and Blake. No change since V6 and V7. I accidentally replied to v6 instead of here. Sorry for the inconvenience!
Re: [PATCH v7 0/4] compare machine type compat_props
Something odd is going on here. Your cover letter and PATCH 4 arrived here with Content-Type: text/plain; charset=UTF-8 Good. PATCH 2: Content-Type: text/plain; charset="US-ASCII"; x-default=true PATCH 1 and 3: Content-Type: text/plain; charset=N git-am chokes on that: error: cannot convert from N to UTF-8
Re: [PATCH v3 52/70] i386/tdx: handle TDG.VP.VMCALL
On Fri, Dec 22, 2023 at 11:14:12AM +0800, Xiaoyao Li wrote: > On 12/21/2023 7:05 PM, Daniel P. Berrangé wrote: > > On Wed, Nov 15, 2023 at 02:15:01AM -0500, Xiaoyao Li wrote: > > > From: Isaku Yamahata > > > > > > For GetQuote, delegate a request to Quote Generation Service. > > > Add property "quote-generation-socket" to tdx-guest, which is a property > > > of type SocketAddress to specify Quote Generation Service (QGS). > > > > > > On request, connect to the QGS, read request buffer from shared guest > > > memory, send the request buffer to the server and store the response > > > into shared guest memory and notify TD guest by interrupt. > > > > > > command line example: > > >qemu-system-x86_64 \ > > > -object > > > '{"qom-type":"tdx-guest","id":"tdx0","quote-generation-socket":{"type": > > > "vsock", "cid":"2","port":"1234"}}' \ > > > > Here you're illustrating a VSOCK address. IIUC, both the 'qgs' > > daemon and QEMU will be running in the host. Why would they need > > to be using VSOCK, as opposed to a regular UNIX socket connection ? > > > > We use vsock here because the QGS server we used for testing exposes the > vsock socket. Is this the server impl you test with: https://github.com/intel/SGXDataCenterAttestationPrimitives/tree/master/QuoteGeneration/quote_wrapper/qgs or is there another impl ? With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
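For comparison, a UNIX socket variant of that command line (hypothetical path; assuming the property accepts the usual 'unix' flavour of SocketAddress) would be:

qemu-system-x86_64 \
  -object '{"qom-type":"tdx-guest","id":"tdx0","quote-generation-socket":{"type":"unix","path":"/var/run/tdx-qgs/qgs.sock"}}' \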
Re: [PATCH 37/40] plugins: add an API to read registers
Akihiko Odaki writes: > On 2023/12/21 19:38, Alex Bennée wrote: >> We can only request a list of registers once the vCPU has been >> initialised so the user needs to either call the get function on >> vCPU initialisation or during the translation phase. >> We don't expose the reg number to the plugin, instead hiding it >> behind >> an opaque handle. This allows for a bit of future proofing should the >> internals need to be changed while also being hashed against the >> CPUClass so we can handle different register sets per-vCPU in >> heterogeneous situations. >> Having an internal state within the plugins also allows us to expand >> the interface in future (for example providing callbacks on register >> change if the translator can track changes). >> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1706 >> Cc: Akihiko Odaki >> Based-on: <20231025093128.33116-18-akihiko.od...@daynix.com> >> Signed-off-by: Alex Bennée >> --- >> v2 >>- use new get whole list api, and expose upwards >> vAJB: >> The main difference to Akihiko's version is hiding the gdb register >> detail from the plugin for the reasons described above. >> --- >> include/qemu/qemu-plugin.h | 53 +- >> plugins/api.c| 102 +++ >> plugins/qemu-plugins.symbols | 2 + >> 3 files changed, 155 insertions(+), 2 deletions(-) >> diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h >> index 4daab6efd29..e3b35c6ee81 100644 >> --- a/include/qemu/qemu-plugin.h >> +++ b/include/qemu/qemu-plugin.h >> @@ -11,6 +11,7 @@ >> #ifndef QEMU_QEMU_PLUGIN_H >> #define QEMU_QEMU_PLUGIN_H >> +#include >> #include >> #include >> #include >> @@ -227,8 +228,8 @@ struct qemu_plugin_insn; >>* @QEMU_PLUGIN_CB_R_REGS: callback reads the CPU's regs >>* @QEMU_PLUGIN_CB_RW_REGS: callback reads and writes the CPU's regs >>* >> - * Note: currently unused, plugins cannot read or change system >> - * register state. >> + * Note: currently QEMU_PLUGIN_CB_RW_REGS is unused, plugins cannot change >> + * system register state. >>*/ >> enum qemu_plugin_cb_flags { >> QEMU_PLUGIN_CB_NO_REGS, >> @@ -708,4 +709,52 @@ uint64_t qemu_plugin_end_code(void); >> QEMU_PLUGIN_API >> uint64_t qemu_plugin_entry_code(void); >> +/** struct qemu_plugin_register - Opaque handle for a register */ >> +struct qemu_plugin_register; > > What about identifying a register with an index in an array returned > by qemu_plugin_get_registers(). That saves the trouble of having the handle > member in qemu_plugin_reg_descriptor. > >> + >> +/** >> + * typedef qemu_plugin_reg_descriptor - register descriptions >> + * >> + * @name: register name >> + * @handle: opaque handle for retrieving value with >> qemu_plugin_read_register >> + * @feature: optional feature descriptor, can be NULL > > Why can it be NULL? > >> + */ >> +typedef struct { >> +char name[32]; > > Why not const char *? I was trying to avoid too many free floating strings. I could intern it in the API though.
> > Is this note really necessary? You won't know vcpu_index before > qemu_plugin_register_vcpu_init_cb() anyway. Best to be clear I think. > >> + */ >> +GArray * qemu_plugin_get_registers(unsigned int vcpu_index); > > Spurious space after *. > >> + >> +/** >> + * qemu_plugin_read_register() - read register >> + * >> + * @vcpu: vcpu index >> + * @handle: a @qemu_plugin_reg_handle handle >> + * @buf: A GByteArray for the data owned by the plugin >> + * >> + * This function is only available in a context that register read access is >> + * explicitly requested. >> + * >> + * Returns the size of the read register. The content of @buf is in target >> byte >> + * order. On failure returns -1 >> + */ >> +int qemu_plugin_read_register(unsigned int vcpu, >> + struct qemu_plugin_register *handle, >> + GByteArray *buf); > > Indention is not correct. docs/devel/style.rst says: > >> In case of function, there are several variants: >> >> * 4 spaces indent from the beginning >> * align the secondary lines just after the opening parenthesis of > the first Isn't that what it does? -- Alex Bennée Virtualisation Tech Lead @ Linaro
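To make the proposed interface concrete, here is a minimal plugin sketch against the API as posted in this patch (it may change before merge; error handling is elided, and actually reading values would additionally require a callback registered with QEMU_PLUGIN_CB_R_REGS):

#include <glib.h>
#include <stdio.h>
#include <qemu-plugin.h>

QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;

static void vcpu_init(qemu_plugin_id_t id, unsigned int vcpu_index)
{
    /* the register list only exists once the vCPU is initialised */
    GArray *regs = qemu_plugin_get_registers(vcpu_index);
    if (!regs) {
        return;
    }

    for (guint i = 0; i < regs->len; i++) {
        qemu_plugin_reg_descriptor *d =
            &g_array_index(regs, qemu_plugin_reg_descriptor, i);
        printf("vcpu %u reg %s (feature %s)\n", vcpu_index, d->name,
               d->feature ? d->feature : "none");
    }
    /* caller frees the array, not the strings */
    g_array_free(regs, TRUE);
}

QEMU_PLUGIN_EXPORT
int qemu_plugin_install(qemu_plugin_id_t id, const qemu_plugin_info_t *info,
                        int argc, char **argv)
{
    qemu_plugin_register_vcpu_init_cb(id, vcpu_init);
    return 0;
}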
[PULL 02/11] next-cube.c: don't pulse SCSI DMA IRQ upon reception of FLUSH command
From: Mark Cave-Ayland Normally a DMA FLUSH command is used to ensure that data is completely written to the device and/or memory, so remove the pulse of the SCSI DMA IRQ if a DMA FLUSH command is received. This enables the NeXT ROM monitor to start to load from a SCSI disk. Signed-off-by: Mark Cave-Ayland Reviewed-by: Thomas Huth Message-ID: <20231220131641.592826-3-mark.cave-ayl...@ilande.co.uk> Signed-off-by: Thomas Huth --- hw/m68k/next-cube.c | 1 - 1 file changed, 1 deletion(-) diff --git a/hw/m68k/next-cube.c b/hw/m68k/next-cube.c index feeda23475..87ddaf4329 100644 --- a/hw/m68k/next-cube.c +++ b/hw/m68k/next-cube.c @@ -473,7 +473,6 @@ static void scr_writeb(NeXTPC *s, hwaddr addr, uint32_t value) DPRINTF("SCSICSR FIFO Flush\n"); /* will have to add another irq to the esp if this is needed */ /* esp_puflush_fifo(esp_g); */ -qemu_irq_pulse(s->scsi_dma); } if (value & SCSICSR_ENABLE) { -- 2.43.0
[PULL 10/11] next-cube.c: remove val and size arguments from nextscr2_write()
From: Mark Cave-Ayland These are now redundant with the scr2 and old_scr2 fields in NeXTPC. Rename the function from nextscr2_write() to next_scr2_rtc_update() to better reflect its purpose. At the same time replace the manual bit manipulation with the extract32() and deposit32() functions. Signed-off-by: Mark Cave-Ayland Reviewed-by: Thomas Huth Message-ID: <20231220131641.592826-11-mark.cave-ayl...@ilande.co.uk> Signed-off-by: Thomas Huth --- hw/m68k/next-cube.c | 17 ++--- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/hw/m68k/next-cube.c b/hw/m68k/next-cube.c index fd707b4b54..d9a1f234ec 100644 --- a/hw/m68k/next-cube.c +++ b/hw/m68k/next-cube.c @@ -136,18 +136,13 @@ static void next_scr2_led_update(NeXTPC *s) } } -static void nextscr2_write(NeXTPC *s, uint32_t val, int size) +static void next_scr2_rtc_update(NeXTPC *s) { uint8_t old_scr2, scr2_2; NextRtc *rtc = &s->rtc; -if (size == 4) { -scr2_2 = (val >> 8) & 0xFF; -} else { -scr2_2 = val & 0xFF; -} - -old_scr2 = (s->old_scr2 >> 8) & 0xff; +old_scr2 = extract32(s->old_scr2, 8, 8); +scr2_2 = extract32(s->scr2, 8, 8); if (scr2_2 & 0x1) { /* DPRINTF("RTC %x phase %i\n", scr2_2, rtc->phase); */ @@ -255,8 +250,8 @@ static void nextscr2_write(NeXTPC *s, uint32_t val, int size) rtc->command = 0; rtc->value = 0; } -s->scr2 = val & 0x00FF; -s->scr2 |= scr2_2 << 8; + +s->scr2 = deposit32(s->scr2, 8, 8, scr2_2); } static uint64_t next_mmio_read(void *opaque, hwaddr addr, unsigned size) @@ -325,7 +320,7 @@ static void next_mmio_write(void *opaque, hwaddr addr, uint64_t val, s->scr2 = deposit32(s->scr2, (4 - (addr - 0xd000) - size) << 3, size << 3, val); next_scr2_led_update(s); -nextscr2_write(s, val, size); +next_scr2_rtc_update(s); s->old_scr2 = s->scr2; break; -- 2.43.0
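An aside for readers new to these helpers: extract32() and deposit32() live in include/qemu/bitops.h and operate on a (start, length) bit field. A self-contained sketch of what the two calls above do to SCR2 — the helper bodies below are simplified equivalents, not the QEMU originals, and the toggled bit is chosen only for illustration:

#include <assert.h>
#include <stdint.h>

/* Simplified equivalents of QEMU's extract32()/deposit32(). */
static uint32_t extract32_like(uint32_t value, int start, int length)
{
    return (value >> start) & (~0U >> (32 - length));
}

static uint32_t deposit32_like(uint32_t value, int start, int length,
                               uint32_t fieldval)
{
    uint32_t mask = (~0U >> (32 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

int main(void)
{
    uint32_t scr2 = 0x00ff0c80;                    /* reset value in this file */
    uint32_t scr2_2 = extract32_like(scr2, 8, 8);  /* bits 15..8 -> 0x0c */
    assert(scr2_2 == 0x0c);

    scr2_2 |= 0x01;                                /* RTC logic toggles bits here */
    scr2 = deposit32_like(scr2, 8, 8, scr2_2);     /* write the byte back */
    assert(scr2 == 0x00ff0d80);                    /* other bytes untouched */
    return 0;
}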
[PULL 06/11] next-cube.c: move static led variable to NeXTPC
From: Mark Cave-Ayland The state of the LED is stored in the SCR2 register, which is part of the NeXTPC device. Note that this is a migration break for the NeXTPC device, but as nothing will currently boot, we simply bump the migration version for now. Signed-off-by: Mark Cave-Ayland Reviewed-by: Thomas Huth Message-ID: <20231220131641.592826-7-mark.cave-ayl...@ilande.co.uk> Signed-off-by: Thomas Huth --- hw/m68k/next-cube.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/hw/m68k/next-cube.c b/hw/m68k/next-cube.c index be4091ffd7..bcc7650cd9 100644 --- a/hw/m68k/next-cube.c +++ b/hw/m68k/next-cube.c @@ -92,6 +92,7 @@ struct NeXTPC { uint32_t scr2; uint32_t int_mask; uint32_t int_status; +uint32_t led; uint8_t scsi_csr_1; uint8_t scsi_csr_2; @@ -123,7 +124,6 @@ static const uint8_t rtc_ram2[32] = { static void nextscr2_write(NeXTPC *s, uint32_t val, int size) { -static int led; static int phase; static uint8_t old_scr2; uint8_t scr2_2; @@ -137,10 +137,10 @@ static void nextscr2_write(NeXTPC *s, uint32_t val, int size) if (val & 0x1) { DPRINTF("fault!\n"); -led++; -if (led == 10) { +s->led++; +if (s->led == 10) { DPRINTF("LED flashing, possible fault!\n"); -led = 0; +s->led = 0; } } @@ -926,13 +926,14 @@ static const VMStateDescription next_rtc_vmstate = { static const VMStateDescription next_pc_vmstate = { .name = "next-pc", -.version_id = 1, -.minimum_version_id = 1, +.version_id = 2, +.minimum_version_id = 2, .fields = (VMStateField[]) { VMSTATE_UINT32(scr1, NeXTPC), VMSTATE_UINT32(scr2, NeXTPC), VMSTATE_UINT32(int_mask, NeXTPC), VMSTATE_UINT32(int_status, NeXTPC), +VMSTATE_UINT32(led, NeXTPC), VMSTATE_UINT8(scsi_csr_1, NeXTPC), VMSTATE_UINT8(scsi_csr_2, NeXTPC), VMSTATE_STRUCT(rtc, NeXTPC, 0, next_rtc_vmstate, NextRtc), -- 2.43.0
[PULL 07/11] next-cube.c: move static phase variable to NextRtc
From: Mark Cave-Ayland The phase variable represents part of the state machine used to clock data out of the NextRtc device. Note that this is a migration break for the NextRtc struct, but as nothing will currently boot, we simply bump the migration version for now. Signed-off-by: Mark Cave-Ayland Reviewed-by: Thomas Huth Message-ID: <20231220131641.592826-8-mark.cave-ayl...@ilande.co.uk> Signed-off-by: Thomas Huth --- hw/m68k/next-cube.c | 31 --- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/hw/m68k/next-cube.c b/hw/m68k/next-cube.c index bcc7650cd9..f554fa 100644 --- a/hw/m68k/next-cube.c +++ b/hw/m68k/next-cube.c @@ -62,6 +62,7 @@ typedef struct next_dma { } next_dma; typedef struct NextRtc { +int8_t phase; uint8_t ram[32]; uint8_t command; uint8_t value; @@ -124,7 +125,6 @@ static const uint8_t rtc_ram2[32] = { static void nextscr2_write(NeXTPC *s, uint32_t val, int size) { -static int phase; static uint8_t old_scr2; uint8_t scr2_2; NextRtc *rtc = &s->rtc; @@ -145,25 +145,25 @@ static void nextscr2_write(NeXTPC *s, uint32_t val, int size) } if (scr2_2 & 0x1) { -/* DPRINTF("RTC %x phase %i\n", scr2_2, phase); */ -if (phase == -1) { -phase = 0; +/* DPRINTF("RTC %x phase %i\n", scr2_2, rtc->phase); */ +if (rtc->phase == -1) { +rtc->phase = 0; } /* If we are in going down clock... do something */ if (((old_scr2 & SCR2_RTCLK) != (scr2_2 & SCR2_RTCLK)) && ((scr2_2 & SCR2_RTCLK) == 0)) { -if (phase < 8) { +if (rtc->phase < 8) { rtc->command = (rtc->command << 1) | ((scr2_2 & SCR2_RTDATA) ? 1 : 0); } -if (phase >= 8 && phase < 16) { +if (rtc->phase >= 8 && rtc->phase < 16) { rtc->value = (rtc->value << 1) | ((scr2_2 & SCR2_RTDATA) ? 1 : 0); /* if we read RAM register, output RT_DATA bit */ if (rtc->command <= 0x1F) { scr2_2 = scr2_2 & (~SCR2_RTDATA); -if (rtc->ram[rtc->command] & (0x80 >> (phase - 8))) { +if (rtc->ram[rtc->command] & (0x80 >> (rtc->phase - 8))) { scr2_2 |= SCR2_RTDATA; } @@ -174,7 +174,7 @@ static void nextscr2_write(NeXTPC *s, uint32_t val, int size) if (rtc->command == 0x30) { scr2_2 = scr2_2 & (~SCR2_RTDATA); /* for now status = 0x98 (new rtc + FTU) */ -if (rtc->status & (0x80 >> (phase - 8))) { +if (rtc->status & (0x80 >> (rtc->phase - 8))) { scr2_2 |= SCR2_RTDATA; } @@ -184,7 +184,7 @@ static void nextscr2_write(NeXTPC *s, uint32_t val, int size) /* read the status 0x31 */ if (rtc->command == 0x31) { scr2_2 = scr2_2 & (~SCR2_RTDATA); -if (rtc->control & (0x80 >> (phase - 8))) { +if (rtc->control & (0x80 >> (rtc->phase - 8))) { scr2_2 |= SCR2_RTDATA; } rtc->retval = (rtc->retval << 1) | @@ -220,7 +220,7 @@ static void nextscr2_write(NeXTPC *s, uint32_t val, int size) } -if (ret & (0x80 >> (phase - 8))) { +if (ret & (0x80 >> (rtc->phase - 8))) { scr2_2 |= SCR2_RTDATA; } rtc->retval = (rtc->retval << 1) | @@ -229,8 +229,8 @@ static void nextscr2_write(NeXTPC *s, uint32_t val, int size) } -phase++; -if (phase == 16) { +rtc->phase++; +if (rtc->phase == 16) { if (rtc->command >= 0x80 && rtc->command <= 0x9F) { rtc->ram[rtc->command - 0x80] = rtc->value; } @@ -246,7 +246,7 @@ static void nextscr2_write(NeXTPC *s, uint32_t val, int size) } } else { /* else end or abort */ -phase = -1; +rtc->phase = -1; rtc->command = 0; rtc->value = 0; } @@ -911,9 +911,10 @@ static Property next_pc_properties[] = { static const VMStateDescription next_rtc_vmstate = { .name = "next-rtc", -.version_id = 1, -.minimum_version_id = 1, +.version_id = 2, +.minimum_version_id = 2, .fields = (VMStateField[]) { +VMSTATE_INT8(phase, NextRtc), VMSTATE_UINT8_ARRAY(ram, NextRtc, 32),
VMSTATE_UINT8(command, NextRtc), VMSTATE_UINT8(value, NextRtc), -- 2.43.0
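For orientation, the phase counter being moved here drives a bit-banged serial protocol between the SCR2 register and the RTC chip. My reading of the code above, condensed into a comment (a summary, not authoritative documentation):

/*
 * One RTC transaction, clocked on SCR2_RTCLK falling edges while the
 * enable bit (scr2_2 & 0x1) stays high:
 *
 *   phase 0..7   shift a command bit from SCR2_RTDATA into rtc->command
 *   phase 8..15  data phase: reads (command <= 0x1f, or status/control
 *                at 0x30/0x31) drive SCR2_RTDATA back to the host;
 *                writes accumulate host bits into rtc->value
 *   phase == 16  commit: commands 0x80..0x9f store rtc->value into
 *                rtc->ram[command - 0x80]
 *
 * Dropping the enable bit aborts the transaction: phase = -1 and
 * command/value are cleared.
 */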
[PULL 09/11] next-cube.c: move LED logic to new next_scr2_led_update() function
From: Mark Cave-Ayland Ensure that the LED status is updated by calling next_scr2_led_update() whenever the SCR2 register is written. Signed-off-by: Mark Cave-Ayland Reviewed-by: Thomas Huth Message-ID: <20231220131641.592826-10-mark.cave-ayl...@ilande.co.uk> Signed-off-by: Thomas Huth --- hw/m68k/next-cube.c | 22 +- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/hw/m68k/next-cube.c b/hw/m68k/next-cube.c index d53f73fb8b..fd707b4b54 100644 --- a/hw/m68k/next-cube.c +++ b/hw/m68k/next-cube.c @@ -124,6 +124,18 @@ static const uint8_t rtc_ram2[32] = { #define SCR2_RTDATA 0x4 #define SCR2_TOBCD(x) (((x / 10) << 4) + (x % 10)) +static void next_scr2_led_update(NeXTPC *s) +{ +if (s->scr2 & 0x1) { +DPRINTF("fault!\n"); +s->led++; +if (s->led == 10) { +DPRINTF("LED flashing, possible fault!\n"); +s->led = 0; +} +} +} + static void nextscr2_write(NeXTPC *s, uint32_t val, int size) { uint8_t old_scr2, scr2_2; @@ -135,15 +147,6 @@ static void nextscr2_write(NeXTPC *s, uint32_t val, int size) scr2_2 = val & 0xFF; } -if (val & 0x1) { -DPRINTF("fault!\n"); -s->led++; -if (s->led == 10) { -DPRINTF("LED flashing, possible fault!\n"); -s->led = 0; -} -} - old_scr2 = (s->old_scr2 >> 8) & 0xff; if (scr2_2 & 0x1) { @@ -321,6 +324,7 @@ static void next_mmio_write(void *opaque, hwaddr addr, uint64_t val, case 0xd000 ... 0xd003: s->scr2 = deposit32(s->scr2, (4 - (addr - 0xd000) - size) << 3, size << 3, val); +next_scr2_led_update(s); nextscr2_write(s, val, size); s->old_scr2 = s->scr2; break; -- 2.43.0
[PULL 08/11] next-cube.c: move static old_scr2 variable to NeXTPC
From: Mark Cave-Ayland Move the old_scr2 variable to NeXTPC so that the old SCR2 register state is stored along with the current SCR2 state. Since the SCR2 register is 32 bits wide, convert old_scr2 to uint32_t and update the SCR2 register access code to allow unaligned writes. Note that this is a migration break, but as nothing will currently boot, we do not need to worry about this now. Signed-off-by: Mark Cave-Ayland Reviewed-by: Thomas Huth Message-ID: <20231220131641.592826-9-mark.cave-ayl...@ilande.co.uk> Signed-off-by: Thomas Huth --- hw/m68k/next-cube.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/hw/m68k/next-cube.c b/hw/m68k/next-cube.c index f554fa..d53f73fb8b 100644 --- a/hw/m68k/next-cube.c +++ b/hw/m68k/next-cube.c @@ -91,6 +91,7 @@ struct NeXTPC { uint32_t scr1; uint32_t scr2; +uint32_t old_scr2; uint32_t int_mask; uint32_t int_status; uint32_t led; @@ -125,8 +126,7 @@ static const uint8_t rtc_ram2[32] = { static void nextscr2_write(NeXTPC *s, uint32_t val, int size) { -static uint8_t old_scr2; -uint8_t scr2_2; +uint8_t old_scr2, scr2_2; NextRtc *rtc = &s->rtc; @@ -144,6 +144,8 @@ static void nextscr2_write(NeXTPC *s, uint32_t val, int size) } } +old_scr2 = (s->old_scr2 >> 8) & 0xff; + if (scr2_2 & 0x1) { /* DPRINTF("RTC %x phase %i\n", scr2_2, rtc->phase); */ if (rtc->phase == -1) { @@ -252,7 +254,6 @@ static void nextscr2_write(NeXTPC *s, uint32_t val, int size) } s->scr2 = val & 0x00FF; s->scr2 |= scr2_2 << 8; -old_scr2 = scr2_2; } static uint64_t next_mmio_read(void *opaque, hwaddr addr, unsigned size) @@ -318,7 +319,10 @@ static void next_mmio_write(void *opaque, hwaddr addr, uint64_t val, break; case 0xd000 ... 0xd003: +s->scr2 = deposit32(s->scr2, (4 - (addr - 0xd000) - size) << 3, +size << 3, val); nextscr2_write(s, val, size); +s->old_scr2 = s->scr2; break; default: @@ -876,6 +880,7 @@ static void next_pc_reset(DeviceState *dev) /* 0xXX00 << vital bits */ s->scr1 = 0x00011102; s->scr2 = 0x00ff0c80; +s->old_scr2 = s->scr2; s->rtc.status = 0x90; @@ -932,6 +937,7 @@ static const VMStateDescription next_pc_vmstate = { .fields = (VMStateField[]) { VMSTATE_UINT32(scr1, NeXTPC), VMSTATE_UINT32(scr2, NeXTPC), +VMSTATE_UINT32(old_scr2, NeXTPC), VMSTATE_UINT32(int_mask, NeXTPC), VMSTATE_UINT32(int_status, NeXTPC), VMSTATE_UINT32(led, NeXTPC), -- 2.43.0
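The bit-offset expression introduced here, (4 - (addr - 0xd000) - size) << 3, is the standard trick for mapping a big-endian sub-word access onto a 32-bit register; a tiny standalone check of the arithmetic (addresses chosen for illustration):

#include <assert.h>
#include <stdint.h>

/* Bit offset within 32-bit SCR2 of a big-endian access of 'size' bytes
 * at MMIO address 'addr', as computed in the patch above. */
static int scr2_bit_ofs(uint64_t addr, unsigned size)
{
    return (4 - (int)(addr - 0xd000) - (int)size) << 3;
}

int main(void)
{
    assert(scr2_bit_ofs(0xd000, 4) == 0);   /* full 32-bit write */
    assert(scr2_bit_ofs(0xd000, 1) == 24);  /* byte 0 -> bits 31..24 (MSB) */
    assert(scr2_bit_ofs(0xd002, 1) == 8);   /* byte 2 -> bits 15..8 */
    assert(scr2_bit_ofs(0xd003, 1) == 0);   /* byte 3 -> bits 7..0 (LSB) */
    return 0;
}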
[PULL 05/11] next-cube.c: update and improve dma_ops
From: Mark Cave-Ayland Rename dma_ops to next_dma_ops and the read/write functions to next_dma_read() and next_dma_write() respectively, mark next_dma_ops as DEVICE_BIG_ENDIAN and also improve the consistency of the val variable in next_dma_read() and next_dma_write(). Signed-off-by: Mark Cave-Ayland Reviewed-by: Thomas Huth Message-ID: <20231220131641.592826-6-mark.cave-ayl...@ilande.co.uk> Signed-off-by: Thomas Huth --- hw/m68k/next-cube.c | 100 1 file changed, 63 insertions(+), 37 deletions(-) diff --git a/hw/m68k/next-cube.c b/hw/m68k/next-cube.c index 8ed9bac26d..be4091ffd7 100644 --- a/hw/m68k/next-cube.c +++ b/hw/m68k/next-cube.c @@ -491,59 +491,63 @@ static const MemoryRegionOps next_scr_ops = { #define NEXTDMA_NEXT_INIT0x4200 #define NEXTDMA_SIZE 0x4204 -static void dma_writel(void *opaque, hwaddr addr, uint64_t value, - unsigned int size) +static void next_dma_write(void *opaque, hwaddr addr, uint64_t val, + unsigned int size) { NeXTState *next_state = NEXT_MACHINE(opaque); switch (addr) { case NEXTDMA_ENRX(NEXTDMA_CSR): -if (value & DMA_DEV2M) { +if (val & DMA_DEV2M) { next_state->dma[NEXTDMA_ENRX].csr |= DMA_DEV2M; } -if (value & DMA_SETENABLE) { +if (val & DMA_SETENABLE) { /* DPRINTF("SCSI DMA ENABLE\n"); */ next_state->dma[NEXTDMA_ENRX].csr |= DMA_ENABLE; } -if (value & DMA_SETSUPDATE) { +if (val & DMA_SETSUPDATE) { next_state->dma[NEXTDMA_ENRX].csr |= DMA_SUPDATE; } -if (value & DMA_CLRCOMPLETE) { +if (val & DMA_CLRCOMPLETE) { next_state->dma[NEXTDMA_ENRX].csr &= ~DMA_COMPLETE; } -if (value & DMA_RESET) { +if (val & DMA_RESET) { next_state->dma[NEXTDMA_ENRX].csr &= ~(DMA_COMPLETE | DMA_SUPDATE | DMA_ENABLE | DMA_DEV2M); } /* DPRINTF("RXCSR \tWrite: %x\n",value); */ break; + case NEXTDMA_ENRX(NEXTDMA_NEXT_INIT): -next_state->dma[NEXTDMA_ENRX].next_initbuf = value; +next_state->dma[NEXTDMA_ENRX].next_initbuf = val; break; + case NEXTDMA_ENRX(NEXTDMA_NEXT): -next_state->dma[NEXTDMA_ENRX].next = value; +next_state->dma[NEXTDMA_ENRX].next = val; break; + case NEXTDMA_ENRX(NEXTDMA_LIMIT): -next_state->dma[NEXTDMA_ENRX].limit = value; +next_state->dma[NEXTDMA_ENRX].limit = val; break; + case NEXTDMA_SCSI(NEXTDMA_CSR): -if (value & DMA_DEV2M) { +if (val & DMA_DEV2M) { next_state->dma[NEXTDMA_SCSI].csr |= DMA_DEV2M; } -if (value & DMA_SETENABLE) { +if (val & DMA_SETENABLE) { /* DPRINTF("SCSI DMA ENABLE\n"); */ next_state->dma[NEXTDMA_SCSI].csr |= DMA_ENABLE; } -if (value & DMA_SETSUPDATE) { +if (val & DMA_SETSUPDATE) { next_state->dma[NEXTDMA_SCSI].csr |= DMA_SUPDATE; } -if (value & DMA_CLRCOMPLETE) { +if (val & DMA_CLRCOMPLETE) { next_state->dma[NEXTDMA_SCSI].csr &= ~DMA_COMPLETE; } -if (value & DMA_RESET) { +if (val & DMA_RESET) { next_state->dma[NEXTDMA_SCSI].csr &= ~(DMA_COMPLETE | DMA_SUPDATE | DMA_ENABLE | DMA_DEV2M); /* DPRINTF("SCSI DMA RESET\n"); */ @@ -552,23 +556,23 @@ static void dma_writel(void *opaque, hwaddr addr, uint64_t value, break; case NEXTDMA_SCSI(NEXTDMA_NEXT): -next_state->dma[NEXTDMA_SCSI].next = value; +next_state->dma[NEXTDMA_SCSI].next = val; break; case NEXTDMA_SCSI(NEXTDMA_LIMIT): -next_state->dma[NEXTDMA_SCSI].limit = value; +next_state->dma[NEXTDMA_SCSI].limit = val; break; case NEXTDMA_SCSI(NEXTDMA_START): -next_state->dma[NEXTDMA_SCSI].start = value; +next_state->dma[NEXTDMA_SCSI].start = val; break; case NEXTDMA_SCSI(NEXTDMA_STOP): -next_state->dma[NEXTDMA_SCSI].stop = value; +next_state->dma[NEXTDMA_SCSI].stop = val; break; case NEXTDMA_SCSI(NEXTDMA_NEXT_INIT): -next_state->dma[NEXTDMA_SCSI].next_initbuf = value; 
+next_state->dma[NEXTDMA_SCSI].next_initbuf = val; break; default: @@ -576,52 +580,73 @@ static void dma_writel(void *opaque, hwaddr addr, uint64_t value, } } -static uint64_t dma_readl(void *opaque, hwaddr addr, unsigned int size) +static uint64_t next_dma_read(void *opaque, hwaddr addr, unsigned int size) { NeXTState *next_state = NEXT_MACHINE(opaque); +uint64_t val; switch (addr) { case NEXTDMA_SCSI(NEXTDMA_CSR): DPRINTF("SCSI DMA CSR READ\n"); -return next_state->dma[NEXTDMA_SCSI
[PULL 11/11] next-cube.c: move machine MemoryRegions into NeXTState
From: Mark Cave-Ayland These static memory regions are contained within the machine and do not need to be dynamically allocated. Signed-off-by: Mark Cave-Ayland Reviewed-by: Thomas Huth Message-ID: <20231220131641.592826-12-mark.cave-ayl...@ilande.co.uk> Signed-off-by: Thomas Huth --- hw/m68k/next-cube.c | 38 +- 1 file changed, 21 insertions(+), 17 deletions(-) diff --git a/hw/m68k/next-cube.c b/hw/m68k/next-cube.c index d9a1f234ec..292f13defb 100644 --- a/hw/m68k/next-cube.c +++ b/hw/m68k/next-cube.c @@ -74,6 +74,12 @@ typedef struct NextRtc { struct NeXTState { MachineState parent; +MemoryRegion rom; +MemoryRegion rom2; +MemoryRegion dmamem; +MemoryRegion bmapm1; +MemoryRegion bmapm2; + next_dma dma[10]; }; @@ -967,13 +973,9 @@ static const TypeInfo next_pc_info = { static void next_cube_init(MachineState *machine) { +NeXTState *m = NEXT_MACHINE(machine); M68kCPU *cpu; CPUM68KState *env; -MemoryRegion *rom = g_new(MemoryRegion, 1); -MemoryRegion *rom2 = g_new(MemoryRegion, 1); -MemoryRegion *dmamem = g_new(MemoryRegion, 1); -MemoryRegion *bmapm1 = g_new(MemoryRegion, 1); -MemoryRegion *bmapm2 = g_new(MemoryRegion, 1); MemoryRegion *sysmem = get_system_memory(); const char *bios_name = machine->firmware ?: ROM_FILE; DeviceState *pcdev; @@ -1008,21 +1010,23 @@ static void next_cube_init(MachineState *machine) sysbus_mmio_map(SYS_BUS_DEVICE(pcdev), 1, 0x0210); /* BMAP memory */ -memory_region_init_ram_flags_nomigrate(bmapm1, NULL, "next.bmapmem", 64, - RAM_SHARED, &error_fatal); -memory_region_add_subregion(sysmem, 0x020c, bmapm1); +memory_region_init_ram_flags_nomigrate(&m->bmapm1, NULL, "next.bmapmem", + 64, RAM_SHARED, &error_fatal); +memory_region_add_subregion(sysmem, 0x020c, &m->bmapm1); /* The Rev_2.5_v66.bin firmware accesses it at 0x820c0020, too */ -memory_region_init_alias(bmapm2, NULL, "next.bmapmem2", bmapm1, 0x0, 64); -memory_region_add_subregion(sysmem, 0x820c, bmapm2); +memory_region_init_alias(&m->bmapm2, NULL, "next.bmapmem2", &m->bmapm1, + 0x0, 64); +memory_region_add_subregion(sysmem, 0x820c, &m->bmapm2); /* KBD */ sysbus_create_simple(TYPE_NEXTKBD, 0x0200e000, NULL); /* Load ROM here */ -memory_region_init_rom(rom, NULL, "next.rom", 0x2, &error_fatal); -memory_region_add_subregion(sysmem, 0x0100, rom); -memory_region_init_alias(rom2, NULL, "next.rom2", rom, 0x0, 0x2); -memory_region_add_subregion(sysmem, 0x0, rom2); +memory_region_init_rom(&m->rom, NULL, "next.rom", 0x2, &error_fatal); +memory_region_add_subregion(sysmem, 0x0100, &m->rom); +memory_region_init_alias(&m->rom2, NULL, "next.rom2", &m->rom, 0x0, + 0x2); +memory_region_add_subregion(sysmem, 0x0, &m->rom2); if (load_image_targphys(bios_name, 0x0100, 0x2) < 8) { if (!qtest_enabled()) { error_report("Failed to load firmware '%s'.", bios_name); @@ -1049,9 +1053,9 @@ static void next_cube_init(MachineState *machine) next_scsi_init(pcdev, cpu); /* DMA */ -memory_region_init_io(dmamem, NULL, &next_dma_ops, machine, "next.dma", - 0x5000); -memory_region_add_subregion(sysmem, 0x0200, dmamem); +memory_region_init_io(&m->dmamem, NULL, &next_dma_ops, machine, + "next.dma", 0x5000); +memory_region_add_subregion(sysmem, 0x0200, &m->dmamem); } static void next_machine_class_init(ObjectClass *oc, void *data) -- 2.43.0
[PULL 00/11] m68k next-cube patches
The following changes since commit 191710c221f65b1542f6ea7fa4d30dde6e134fd7: Merge tag 'pull-request-2023-12-20' of https://gitlab.com/thuth/qemu into staging (2023-12-20 09:40:16 -0500) are available in the Git repository at: https://gitlab.com/huth/qemu.git tags/m68k-pull-2023-12-22 for you to fetch changes up to 0d23b1ef85eb7a5ed7363da439de632783dbb37e: next-cube.c: move machine MemoryRegions into NeXTState (2023-12-22 14:08:26 +0100) * Add dummy next-cube ethernet register to allow diagnostic to timeout * Don't pulse next-cube SCSI DMA IRQ upon reception of FLUSH command * Rework next-cube mmio ops and system control register handling functions * Embed next-cube MemoryRegions into NeXTState Mark Cave-Ayland (11): next-cube.c: add dummy Ethernet register to allow diagnostic to timeout next-cube.c: don't pulse SCSI DMA IRQ upon reception of FLUSH command next-cube.c: update mmio_ops to properly use modern memory API next-cube.c: update scr_ops to properly use modern memory API next-cube.c: update and improve dma_ops next-cube.c: move static led variable to NeXTPC next-cube.c: move static phase variable to NextRtc next-cube.c: move static old_scr2 variable to NeXTPC next-cube.c: move LED logic to new next_scr2_led_update() function next-cube.c: remove val and size arguments from nextscr2_write() next-cube.c: move machine MemoryRegions into NeXTState hw/m68k/next-cube.c | 525 +++- 1 file changed, 227 insertions(+), 298 deletions(-)
[PULL 03/11] next-cube.c: update mmio_ops to properly use modern memory API
From: Mark Cave-Ayland The old QEMU memory accessors used in the original NextCube patch series had separate functions for 1, 2 and 4 byte accessors. When the series was finally merged a simple wrapper function was written to dispatch the memory accesses using the original functions. Convert mmio_ops to use the memory API directly renaming it to next_mmio_ops, marking it as DEVICE_BIG_ENDIAN, and handling any unaligned accesses. Signed-off-by: Mark Cave-Ayland Reviewed-by: Thomas Huth Message-ID: <20231220131641.592826-4-mark.cave-ayl...@ilande.co.uk> Signed-off-by: Thomas Huth --- hw/m68k/next-cube.c | 156 +--- 1 file changed, 45 insertions(+), 111 deletions(-) diff --git a/hw/m68k/next-cube.c b/hw/m68k/next-cube.c index 87ddaf4329..f73f563ac1 100644 --- a/hw/m68k/next-cube.c +++ b/hw/m68k/next-cube.c @@ -255,150 +255,84 @@ static void nextscr2_write(NeXTPC *s, uint32_t val, int size) old_scr2 = scr2_2; } -static uint32_t mmio_readb(NeXTPC *s, hwaddr addr) +static uint64_t next_mmio_read(void *opaque, hwaddr addr, unsigned size) { -switch (addr) { -case 0xc000: -return (s->scr1 >> 24) & 0xFF; -case 0xc001: -return (s->scr1 >> 16) & 0xFF; -case 0xc002: -return (s->scr1 >> 8) & 0xFF; -case 0xc003: -return (s->scr1 >> 0) & 0xFF; - -case 0xd000: -return (s->scr2 >> 24) & 0xFF; -case 0xd001: -return (s->scr2 >> 16) & 0xFF; -case 0xd002: -return (s->scr2 >> 8) & 0xFF; -case 0xd003: -return (s->scr2 >> 0) & 0xFF; -case 0x14020: -DPRINTF("MMIO Read 0x4020\n"); -return 0x7f; - -default: -DPRINTF("MMIO Read B @ %"HWADDR_PRIx"\n", addr); -return 0x0; -} -} - -static uint32_t mmio_readw(NeXTPC *s, hwaddr addr) -{ -switch (addr) { -default: -DPRINTF("MMIO Read W @ %"HWADDR_PRIx"\n", addr); -return 0x0; -} -} +NeXTPC *s = NEXT_PC(opaque); +uint64_t val; -static uint32_t mmio_readl(NeXTPC *s, hwaddr addr) -{ switch (addr) { case 0x7000: /* DPRINTF("Read INT status: %x\n", s->int_status); */ -return s->int_status; +val = s->int_status; +break; case 0x7800: DPRINTF("MMIO Read INT mask: %x\n", s->int_mask); -return s->int_mask; - -case 0xc000: -return s->scr1; +val = s->int_mask; +break; -case 0xd000: -return s->scr2; +case 0xc000 ... 0xc003: +val = extract32(s->scr1, (4 - (addr - 0xc000) - size) << 3, +size << 3); +break; -default: -DPRINTF("MMIO Read L @ %"HWADDR_PRIx"\n", addr); -return 0x0; -} -} +case 0xd000 ... 
0xd003: +val = extract32(s->scr2, (4 - (addr - 0xd000) - size) << 3, +size << 3); +break; -static void mmio_writeb(NeXTPC *s, hwaddr addr, uint32_t val) -{ -switch (addr) { -case 0xd003: -nextscr2_write(s, val, 1); +case 0x14020: +val = 0x7f; break; + default: -DPRINTF("MMIO Write B @ %x with %x\n", (unsigned int)addr, val); +val = 0; +DPRINTF("MMIO Read @ 0x%"HWADDR_PRIx" size %d\n", addr, size); +break; } +return val; } -static void mmio_writew(NeXTPC *s, hwaddr addr, uint32_t val) +static void next_mmio_write(void *opaque, hwaddr addr, uint64_t val, +unsigned size) { -DPRINTF("MMIO Write W\n"); -} +NeXTPC *s = NEXT_PC(opaque); -static void mmio_writel(NeXTPC *s, hwaddr addr, uint32_t val) -{ switch (addr) { case 0x7000: -DPRINTF("INT Status old: %x new: %x\n", s->int_status, val); +DPRINTF("INT Status old: %x new: %x\n", s->int_status, +(unsigned int)val); s->int_status = val; break; + case 0x7800: -DPRINTF("INT Mask old: %x new: %x\n", s->int_mask, val); +DPRINTF("INT Mask old: %x new: %x\n", s->int_mask, (unsigned int)val); s->int_mask = val; break; -case 0xc000: -DPRINTF("SCR1 Write: %x\n", val); -break; -case 0xd000: -nextscr2_write(s, val, 4); -break; - -default: -DPRINTF("MMIO Write l @ %x with %x\n", (unsigned int)addr, val); -} -} - -static uint64_t mmio_readfn(void *opaque, hwaddr addr, unsigned size) -{ -NeXTPC *s = NEXT_PC(opaque); - -switch (size) { -case 1: -return mmio_readb(s, addr); -case 2: -return mmio_readw(s, addr); -case 4: -return mmio_readl(s, addr); -default: -g_assert_not_reached(); -} -} - -static void mmio_writefn(void *opaque, hwaddr addr, uint64_t value, - unsigned size) -{ -NeXTPC *s = NEXT_PC(opaque); -switch (size) { -case 1: -mmio_writeb(s, addr, value); +case 0xc000 ... 0xc003: +DPRINTF("SCR1 Write: %x\n", (unsigned in
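The tail of this diff is cut off in the archive, but the shape of the conversion is clear from what survives: one read handler and one write handler taking a size argument, plus a single ops struct. Schematically — the .valid bounds here are my assumption, not quoted from the commit:

static uint64_t next_mmio_read(void *opaque, hwaddr addr, unsigned size);
static void next_mmio_write(void *opaque, hwaddr addr, uint64_t val,
                            unsigned size);

/* One struct replaces the readb/readw/readl dispatch wrapper: the memory
 * core hands the handlers the access size, DEVICE_BIG_ENDIAN fixes the
 * byte order, and extract32()/deposit32() handle unaligned sub-word
 * accesses. */
static const MemoryRegionOps next_mmio_ops = {
    .read = next_mmio_read,
    .write = next_mmio_write,
    .valid.min_access_size = 1,
    .valid.max_access_size = 4,
    .endianness = DEVICE_BIG_ENDIAN,
};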
[PULL 04/11] next-cube.c: update scr_ops to properly use modern memory API
From: Mark Cave-Ayland The old QEMU memory accessors used in the original NextCube patch series had separate functions for 1, 2 and 4 byte accessors. When the series was finally merged a simple wrapper function was written to dispatch the memory accesses using the original functions. Convert scr_ops to use the memory API directly renaming it to next_scr_ops, marking it as DEVICE_BIG_ENDIAN, and handling any unaligned accesses. Signed-off-by: Mark Cave-Ayland Reviewed-by: Thomas Huth Message-ID: <20231220131641.592826-5-mark.cave-ayl...@ilande.co.uk> Signed-off-by: Thomas Huth --- hw/m68k/next-cube.c | 155 1 file changed, 55 insertions(+), 100 deletions(-) diff --git a/hw/m68k/next-cube.c b/hw/m68k/next-cube.c index f73f563ac1..8ed9bac26d 100644 --- a/hw/m68k/next-cube.c +++ b/hw/m68k/next-cube.c @@ -335,81 +335,80 @@ static const MemoryRegionOps next_mmio_ops = { .endianness = DEVICE_BIG_ENDIAN, }; -static uint32_t scr_readb(NeXTPC *s, hwaddr addr) +#define SCSICSR_ENABLE 0x01 +#define SCSICSR_RESET 0x02 /* reset scsi dma */ +#define SCSICSR_FIFOFL 0x04 +#define SCSICSR_DMADIR 0x08 /* if set, scsi to mem */ +#define SCSICSR_CPUDMA 0x10 /* if set, dma enabled */ +#define SCSICSR_INTMASK 0x20 /* if set, interrupt enabled */ + +static uint64_t next_scr_readfn(void *opaque, hwaddr addr, unsigned size) { +NeXTPC *s = NEXT_PC(opaque); +uint64_t val; + switch (addr) { case 0x14108: DPRINTF("FD read @ %x\n", (unsigned int)addr); -return 0x40 | 0x04 | 0x2 | 0x1; +val = 0x40 | 0x04 | 0x2 | 0x1; +break; + case 0x14020: DPRINTF("SCSI 4020 STATUS READ %X\n", s->scsi_csr_1); -return s->scsi_csr_1; +val = s->scsi_csr_1; +break; case 0x14021: DPRINTF("SCSI 4021 STATUS READ %X\n", s->scsi_csr_2); -return 0x40; +val = 0x40; +break; /* * These 4 registers are the hardware timer, not sure which register - * is the latch instead of data, but no problems so far + * is the latch instead of data, but no problems so far. + * + * Hack: We need to have the LSB change consistently to make it work */ -case 0x1a000: -return 0xff & (clock() >> 24); -case 0x1a001: -return 0xff & (clock() >> 16); -case 0x1a002: -return 0xff & (clock() >> 8); -case 0x1a003: -/* Hack: We need to have this change consistently to make it work */ -return 0xFF & clock(); +case 0x1a000 ... 
0x1a003: +val = extract32(clock(), (4 - (addr - 0x1a000) - size) << 3, +size << 3); +break; /* For now return dummy byte to allow the Ethernet test to timeout */ case 0x6000: -return 0xff; +val = 0xff; +break; default: -DPRINTF("BMAP Read B @ %x\n", (unsigned int)addr); -return 0; +DPRINTF("BMAP Read @ 0x%x size %u\n", (unsigned int)addr, size); +val = 0; +break; } -} -static uint32_t scr_readw(NeXTPC *s, hwaddr addr) -{ -DPRINTF("BMAP Read W @ %x\n", (unsigned int)addr); -return 0; +return val; } -static uint32_t scr_readl(NeXTPC *s, hwaddr addr) +static void next_scr_writefn(void *opaque, hwaddr addr, uint64_t val, + unsigned size) { -DPRINTF("BMAP Read L @ %x\n", (unsigned int)addr); -return 0; -} - -#define SCSICSR_ENABLE 0x01 -#define SCSICSR_RESET 0x02 /* reset scsi dma */ -#define SCSICSR_FIFOFL 0x04 -#define SCSICSR_DMADIR 0x08 /* if set, scsi to mem */ -#define SCSICSR_CPUDMA 0x10 /* if set, dma enabled */ -#define SCSICSR_INTMASK 0x20 /* if set, interrupt enabled */ +NeXTPC *s = NEXT_PC(opaque); -static void scr_writeb(NeXTPC *s, hwaddr addr, uint32_t value) -{ switch (addr) { case 0x14108: DPRINTF("FDCSR Write: %x\n", value); - -if (value == 0x0) { +if (val == 0x0) { /* qemu_irq_raise(s->fd_irq[0]); */ } break; + case 0x14020: /* SCSI Control Register */ -if (value & SCSICSR_FIFOFL) { +if (val & SCSICSR_FIFOFL) { DPRINTF("SCSICSR FIFO Flush\n"); /* will have to add another irq to the esp if this is needed */ /* esp_puflush_fifo(esp_g); */ } -if (value & SCSICSR_ENABLE) { +if (val & SCSICSR_ENABLE) { DPRINTF("SCSICSR Enable\n"); /* * qemu_irq_raise(s->scsi_dma); @@ -423,17 +422,17 @@ static void scr_writeb(NeXTPC *s, hwaddr addr, uint32_t value) * s->scsi_csr_1 &= ~SCSICSR_ENABLE; */ -if (value & SCSICSR_RESET) { +if (val & SCSICSR_RESET) { DPRINTF("SCSICSR Reset\n"); /* I think this should set DMADIR. CPUDMA and INTMASK to 0 */ qemu_irq_raise(s->scsi_reset); s->scsi_csr_1 &= ~(SCSICSR_INTMASK | 0
[PULL 01/11] next-cube.c: add dummy Ethernet register to allow diagnostic to timeout
From: Mark Cave-Ayland Add a dummy register at address 0x6000 in the MMIO memory region to allow the initial diagnostic test to timeout rather than getting stuck in a loop continuously writing "en_write: tx not ready" to the console. Signed-off-by: Mark Cave-Ayland Tested-by: Thomas Huth Message-ID: <20231220131641.592826-2-mark.cave-ayl...@ilande.co.uk> Signed-off-by: Thomas Huth --- hw/m68k/next-cube.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/m68k/next-cube.c b/hw/m68k/next-cube.c index fabd861941..feeda23475 100644 --- a/hw/m68k/next-cube.c +++ b/hw/m68k/next-cube.c @@ -429,6 +429,10 @@ static uint32_t scr_readb(NeXTPC *s, hwaddr addr) /* Hack: We need to have this change consistently to make it work */ return 0xFF & clock(); +/* For now return dummy byte to allow the Ethernet test to timeout */ +case 0x6000: +return 0xff; + default: DPRINTF("BMAP Read B @ %x\n", (unsigned int)addr); return 0; -- 2.43.0
Re: [RFC PATCH 0/4] target/riscv/kvm: QEMU support for KVM Guest Debug on RISC-V
Hi, It seems that we still need the kernel KVM side to be sorted out first [1], so I believe we should wait a bit until we can review this RFC. Otherwise we might risk reviewing something that has to be changed later. [1] https://lore.kernel.org/kvm/20231221095002.7404-1-duc...@eswincomputing.com/ Thanks, Daniel On 12/21/23 06:49, Chao Du wrote: This series implements QEMU KVM Guest Debug on RISC-V. Currently, we can debug a RISC-V KVM guest from the host side, with software breakpoints. A brief test was done on the QEMU RISC-V hypervisor emulator. A TODO list, to be addressed later: 1. HW breakpoints support 2. Test cases This series is based on QEMU 8.2.0-rc4 and is also available at: https://github.com/Du-Chao/qemu/tree/riscv_gd_sw This is dependent on KVM side changes: https://github.com/Du-Chao/linux/tree/riscv_gd_sw Chao Du (4): target/riscv/kvm: add software breakpoints support target/riscv/kvm: implement kvm_arch_update_guest_debug() target/riscv/kvm: handle the exit with debug reason linux-headers: enable KVM GUEST DEBUG for RISC-V accel/kvm/kvm-all.c | 8 +-- include/sysemu/kvm.h | 6 +- linux-headers/asm-riscv/kvm.h | 1 + target/arm/kvm64.c| 6 +- target/i386/kvm/kvm.c | 6 +- target/mips/kvm.c | 6 +- target/ppc/kvm.c | 6 +- target/riscv/kvm/kvm-cpu.c| 101 ++ target/s390x/kvm/kvm.c| 6 +- 9 files changed, 130 insertions(+), 16 deletions(-) -- 2.17.1
Re: [PATCH 00/35] target/arm: Implement emulation of nested virtualization
Hi Peter, > On 18 Dec 2023, at 10:32, Peter Maydell wrote: > > This patchset adds support for emulating the Arm architectural features > FEAT_NV and FEAT_NV2 which allow nested virtualization, i.e. where a > hypervisor can run a guest which thinks it is running at EL2. > > Nominally FEAT_NV is sufficient for this and FEAT_NV2 merely improves > the performance in the nested-virt setup, but in practice hypervisors > such as KVM are going to require FEAT_NV2 and not bother to support > the FEAT_NV-only case, so I have implemented them one after the other > in this single patchset. > > The feature is essentially a collection of changes that allow the > hypervisor to lie to the guest so that it thinks it is running in EL2 > when it's really at EL1. The best summary of what all the changes are > is in section D8.11 "Nested virtualization" in the Arm ARM, but the > short summary is: > * EL2 system registers etc trap to EL2 rather than UNDEFing > * ERET traps to EL2 > * the CurrentEL register reports "EL2" when NV is enabled > * on exception entry, SPSR_EL1.M may report "EL2" as the EL the > exception was taken from > * when HCR_EL2.NV1 is also set, then there are some extra tweaks > (NV1 == 1 means "guest thinks it is running with HCR_EL2.E2H == 0") > * some AT S1 address translation insns can be trapped to EL2 > and FEAT_NV2 adds: > * accesses to some system registers are transformed into memory > accesses instead of trapping to EL2 > * accesses to a few EL2 system registers are redirected to the > equivalent EL1 registers > > This patchset is sufficient that you can run an L0 guest kernel that > has support for FEAT_NV/FEAT_NV2 in its KVM code, and then > inside that start a nested L1 guest that thinks it has EL2 access, > and then run an inner-nested L2 guest under that that can get > to running userspace code. To do that you'll need some not-yet-upstream > patches for both Linux and kvmtool: > > https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/nv-6.8-nv2-only > https://gitlab.arm.com/linux-arm/kvmtool/-/commits/nv-v6.6 > > You'll also want to turn off SVE and SME emulation in QEMU > (-cpu max,sve=off,sme=off) because at the moment the KVM patchset > doesn't handle SVE and nested-virt together (the other option > is to hack kvmtool to make it not ask for both at once, but this > is easier). > > (kvmtool is needed here to run the L1 because QEMU itself as a VMM > doesn't yet support asking KVM for an EL2 guest.) > > The first three patches in the series aren't strictly part of FEAT_NV: > * patch 1 is already reviewed; I put it here to avoid having > to deal with textual conflicts between it and this series > * patch 2 sets CTR_EL0.{IDC,DIC} for '-cpu max', which is a good > idea anyway and also works around what Marc Z and I think is > a KVM bug that otherwise causes boot of the L2 kernel to hang > * patch 3 is a GIC bug which is not FEAT_NV specific but for > some reason only manifests when booting an L1 kernel under NV > I've successfully replicated this setup and reached inner-nested L2 guest userspace. FWIW, feel free to add Tested-by: Miguel Luis (I've been working on QEMU asking KVM for an EL2 guest on top of this series, although there's still some debugging to do.) Thank you!
Miguel > thanks > -- PMM > > Peter Maydell (35): > target/arm: Don't implement *32_EL2 registers when EL1 is AArch64 only > target/arm: Set CTR_EL0.{IDC,DIC} for the 'max' CPU > hw/intc/arm_gicv3_cpuif: handle LPIs in in the list registers > target/arm: Handle HCR_EL2 accesses for bits introduced with FEAT_NV > target/arm: Implement HCR_EL2.AT handling > target/arm: Enable trapping of ERET for FEAT_NV > target/arm: Always honour HCR_EL2.TSC when HCR_EL2.NV is set > target/arm: Allow use of upper 32 bits of TBFLAG_A64 > target/arm: Record correct opcode fields in cpreg for E2H aliases > target/arm: *_EL12 registers should UNDEF when HCR_EL2.E2H is 0 > target/arm: Make EL2 cpreg accessfns safe for FEAT_NV EL1 accesses > target/arm: Move FPU/SVE/SME access checks up above >ARM_CP_SPECIAL_MASK check > target/arm: Trap sysreg accesses for FEAT_NV > target/arm: Make NV reads of CurrentEL return EL2 > target/arm: Set SPSR_EL1.M correctly when nested virt is enabled > target/arm: Trap registers when HCR_EL2.{NV,NV1} == {1,1} > target/arm: Always use arm_pan_enabled() when checking if PAN is >enabled > target/arm: Don't honour PSTATE.PAN when HCR_EL2.{NV,NV1} == {1,1} > target/arm: Treat LDTR* and STTR* as LDR/STR when NV,NV1 is 1,1 > target/arm: Handle FEAT_NV page table attribute changes > target/arm: Add FEAT_NV to max, neoverse-n2, neoverse-v1 CPUs > target/arm: Handle HCR_EL2 accesses for FEAT_NV2 bits > target/arm: Implement VNCR_EL2 register > target/arm: Handle FEAT_NV2 changes to when SPSR_EL1.M reports EL2 > target/arm: Handle FEAT_NV2 redirection of SPSR_EL2, ELR_EL2, ESR_EL2, >FAR_EL2 > target/arm: Implement FEAT_N
[RFC] Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2051
Hi, this patch fixes a bug: https://gitlab.com/qemu-project/qemu/-/issues/2051 From: Elen Avan Date: Fri, 22 Dec 2023 14:39:48 + Subject: [RFC] Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2051 Signed-off-by: Elen Avan --- include/ui/rect.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/ui/rect.h b/include/ui/rect.h index 94898f92d0..68f05d78a8 100644 --- a/include/ui/rect.h +++ b/include/ui/rect.h @@ -19,7 +19,7 @@ static inline void qemu_rect_init(QemuRect *rect, int16_t x, int16_t y, uint16_t width, uint16_t height) { rect->x = x; - rect->y = x; + rect->y = y; rect->width = width; rect->height = height; } -- 2.41.0
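The fix is a one-character copy-paste repair; a standalone illustration of the symptom, with the struct mirrored from rect.h as I read it (field types assumed):

#include <assert.h>
#include <stdint.h>

typedef struct QemuRect {
    int16_t x, y;
    uint16_t width, height;
} QemuRect;

static inline void qemu_rect_init(QemuRect *rect, int16_t x, int16_t y,
                                  uint16_t width, uint16_t height)
{
    rect->x = x;
    rect->y = y;        /* the fixed line; it previously assigned x */
    rect->width = width;
    rect->height = height;
}

int main(void)
{
    QemuRect r;
    qemu_rect_init(&r, 10, 20, 640, 480);
    assert(r.y == 20);  /* before the fix this was 10 */
    return 0;
}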
Re: [v2 1/2] qapi/virtio: Add feature and status bits for x-query-virtio-status
Sure, of course. I'll do that next week; v3 will be a single patch that only contains the first commit. Thanks, Yong On Fri, Dec 22, 2023 at 5:54 PM Markus Armbruster wrote: > Hyman Huang writes: > > > This patch allows displaying feature and status bits in > > virtio-status. > > > > Applications could find it helpful to compare status and features > > that are numerically encoded. For example, an upper application could > > use the features (encoded as a number) in the output of "ovs-vsctl > > list interface" and the feature bits fields in the output of QMP > > command "x-query-virtio-status" to compare directly when attempting > > to ensure the correctness of the virtio negotiation between guest, > > QEMU, and OVS-DPDK. Not applying any more encoding. > > > > This patch also serves as a preparation for the next one, which > implements > > a vhost-user test case about acked features of the vhost-user protocol. > > > > Note that since the matching HMP command is typically used by humans, > > leave it unchanged. > > > > Signed-off-by: Hyman Huang > > We discussed v1 some more since you posted this. Do you intend to do a > v3? > > -- Best regards
Re: [PATCH V6 03/14] cpus: stop vm in suspended runstate
On 12/22/2023 7:20 AM, Markus Armbruster wrote: > Steve Sistare writes: > >> Currently, a vm in the suspended state is not completely stopped. The VCPUs >> have been paused, but the cpu clock still runs, and runstate notifiers for >> the transition to stopped have not been called. This causes problems for >> live migration. Stale cpu timers_state is saved to the migration stream, >> causing time errors in the guest when it wakes from suspend, and state that >> would have been modified by runstate notifiers is wrong. >> >> Modify vm_stop to completely stop the vm if the current state is suspended, >> transition to RUN_STATE_PAUSED, and remember that the machine was suspended. >> Modify vm_start to restore the suspended state. > > Can you explain this to me in terms of the @current_run_state state > machine? Like > > Before the patch, trigger X in state Y goes to state Z. > Afterwards, it goes to ... Old behavior: RUN_STATE_SUSPENDED --> stop --> RUN_STATE_SUSPENDED New behavior: RUN_STATE_SUSPENDED --> stop --> RUN_STATE_PAUSED RUN_STATE_PAUSED--> cont --> RUN_STATE_SUSPENDED >> This affects all callers of vm_stop and vm_start, notably, the qapi stop and >> cont commands. For example: >> >> (qemu) info status >> VM status: paused (suspended) >> >> (qemu) stop >> (qemu) info status >> VM status: paused >> >> (qemu) cont >> (qemu) info status >> VM status: paused (suspended) >> >> (qemu) system_wakeup >> (qemu) info status >> VM status: running >> >> Suggested-by: Peter Xu >> Signed-off-by: Steve Sistare >> --- >> include/sysemu/runstate.h | 5 + >> qapi/misc.json| 10 -- >> system/cpus.c | 19 ++- >> system/runstate.c | 3 +++ >> 4 files changed, 30 insertions(+), 7 deletions(-) >> >> diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h >> index f6a337b..1d6828f 100644 >> --- a/include/sysemu/runstate.h >> +++ b/include/sysemu/runstate.h >> @@ -40,6 +40,11 @@ static inline bool shutdown_caused_by_guest(ShutdownCause >> cause) >> return cause >= SHUTDOWN_CAUSE_GUEST_SHUTDOWN; >> } >> >> +static inline bool runstate_is_started(RunState state) >> +{ >> +return state == RUN_STATE_RUNNING || state == RUN_STATE_SUSPENDED; >> +} >> + >> void vm_start(void); >> >> /** >> diff --git a/qapi/misc.json b/qapi/misc.json >> index cda2eff..efb8d44 100644 >> --- a/qapi/misc.json >> +++ b/qapi/misc.json >> @@ -134,7 +134,7 @@ >> ## >> # @stop: >> # >> -# Stop all guest VCPU execution. >> +# Stop all guest VCPU and VM execution. > > Doesn't "stop all VM execution" imply "stop all guest vCPU execution"? Agreed, so we simply have: # @stop: # Stop guest VM execution. # @cont: # Resume guest VM execution. >> # >> # Since: 0.14 >> # >> @@ -143,6 +143,9 @@ ># Notes: This function will succeed even if the guest is already in ># the stopped state. In "inmigrate" state, it will ensure that >> # the guest remains paused once migration finishes, as if the -S >> # option was passed on the command line. >> # >> +# In the "suspended" state, it will completely stop the VM and >> +# cause a transition to the "paused" state. (Since 9.0) >> +# > > What user-observable (with query-status) state transitions are possible > here? {"status": "suspended", "singlestep": false, "running": false} --> stop --> {"status": "paused", "singlestep": false, "running": false} >> # Example: >> # >> # -> { "execute": "stop" } >> @@ -153,7 +156,7 @@ >> ## >> # @cont: >> # >> -# Resume guest VCPU execution. >> +# Resume guest VCPU and VM execution. 
>> # >> # Since: 0.14 >> # >> @@ -165,6 +168,9 @@ ># Returns: If successful, nothing ># ># Notes: This command will succeed if the guest is currently running. ># It will also succeed if the guest is in the "inmigrate" state; ># in this case, the effect of the command is to make sure the >> # guest starts once migration finishes, removing the effect of the >> # -S command line option if it was passed. >> # >> +# If the VM was previously suspended, and not been reset or woken, >> +# this command will transition back to the "suspended" state. (Since >> 9.0) > > Long line. It fits in 80 columns, but perhaps this looks nicer: # If the VM was previously suspended, and not been reset or woken, # this command will transition back to the "suspended" state. # (Since 9.0) > What user-observable state transitions are possible here? {"status": "paused", "singlestep": false, "running": false} --> cont --> {"status": "suspended", "singlestep": false, "running": false} >> +# >> # Example: >> # >> # -> { "execute": "cont" } > > Should we update documentation of query-status, too? IMO no. The new behavior changes the status/RunState field only, and the domain of values does not change, only the transitions caused by the commands described here. -
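For completeness, the same round trip as a QMP session in the style of the qapi examples above (hypothetical transcript, status fields abbreviated):

-> { "execute": "query-status" }
<- { "return": { "status": "suspended", "running": false } }
-> { "execute": "stop" }
<- { "return": {} }
-> { "execute": "query-status" }
<- { "return": { "status": "paused", "running": false } }
-> { "execute": "cont" }
<- { "return": {} }
-> { "execute": "query-status" }
<- { "return": { "status": "suspended", "running": false } }
-> { "execute": "system_wakeup" }
<- { "return": {} }
-> { "execute": "query-status" }
<- { "return": { "status": "running", "running": true } }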
Re: [PATCH v2] target/i386: Fix physical address truncation
On Fri, Dec 22, 2023 at 10:04 AM Paolo Bonzini wrote: > > If the extension is not needed, then the a20 mask isn't either. > > I think it is. The extension is not needed because the masking is > applied by either TCG (e.g. in gen_lea_v_seg_dest or gen_add_A0_im) or > mmu_translate(); but the a20 mask is never applied elsewhere for > either non-paging mode or page table walks. Hmm, except helpers do not apply the masking. :/ So Michael's patch would for example break something as silly as a BOUND, FSAVE or XSAVE operation invoked around the 4GB boundary. The easiest way to proceed is to introduce a new MMU index MMU_PTW_IDX, which is the same as MMU_PHYS_IDX except it does not mask 32-bit addresses. Any objections? Paolo
Re: [PULL 0/2] loongarch-to-apply queue
Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/9.0 for any user-visible changes.
Re: [PULL v2 00/33] Block layer patches
Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/9.0 for any user-visible changes.
Re: [PATCH v2] target/i386: Fix physical address truncation
On Fri, Dec 22, 2023 at 5:16 PM Paolo Bonzini wrote: > > On Fri, Dec 22, 2023 at 10:04 AM Paolo Bonzini wrote: > > > If the extension is not needed, then the a20 mask isn't either. > > > > I think it is. The extension is not needed because the masking is > > applied by either TCG (e.g. in gen_lea_v_seg_dest or gen_add_A0_im) or > > mmu_translate(); but the a20 mask is never applied elsewhere for > > either non-paging mode or page table walks. > > Hmm, except helpers do not apply the masking. :/ > > So Michael's patch would for example break something as silly as a > BOUND, FSAVE or XSAVE operation invoked around the 4GB boundary. > > The easiest way to proceed is to introduce a new MMU index > MMU_PTW_IDX, which is the same as MMU_PHYS_IDX except it does not mask > 32-bit addresses. Any objections? Nevermind, I wasn't thinking straight. Helpers will not use MMU_PHYS_IDX. So those are fine, we just need to keep the masking before the "break". The only user of MMU_PHYS_IDX is VMRUN/VMLOAD/VMSAVE. We need to add checks that the VMCB is aligned there, and same for writing to MSR_HSAVE_PA. Paolo
Re: [RFC PATCH 4/5] virtio: Add VMState for early load
On Mon, Sep 18, 2023 at 6:51 AM Yajun Wu wrote: > > Define new virtio device vmstate for early save/load (happens in > LM setup stage). Same as original vmstate, except: > > In LM setup phase of the destination VM, the guest memory has not > been synchronized yet. To address this, a flag has been added to > virtio_load function to skip the index check. > > Signed-off-by: Yajun Wu > Reviewed-by: Avihai Horon > Reviewed-by: Jiri Pirko > --- > hw/virtio/virtio.c | 152 +++-- > include/hw/virtio/virtio.h | 10 ++- > 2 files changed, 103 insertions(+), 59 deletions(-) > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c > index 969c25f4cf..8c73c245dd 100644 > --- a/hw/virtio/virtio.c > +++ b/hw/virtio/virtio.c > @@ -2832,7 +2832,17 @@ virtio_device_get(QEMUFile *f, void *opaque, size_t > size, > VirtIODevice *vdev = VIRTIO_DEVICE(opaque); > DeviceClass *dc = DEVICE_CLASS(VIRTIO_DEVICE_GET_CLASS(vdev)); > > -return virtio_load(vdev, f, dc->vmsd->version_id); > +return virtio_load(vdev, f, dc->vmsd->version_id, false); > +} > + > +/* A wrapper for use as a VMState .get function */ > +static int virtio_early_device_get(QEMUFile *f, void *opaque, size_t size, > + const VMStateField *field) > +{ > +VirtIODevice *vdev = VIRTIO_DEVICE(opaque); > +DeviceClass *dc = DEVICE_CLASS(VIRTIO_DEVICE_GET_CLASS(vdev)); > + > +return virtio_load(vdev, f, dc->vmsd->version_id, true); > } > > const VMStateInfo virtio_vmstate_info = { > @@ -2841,6 +2851,12 @@ const VMStateInfo virtio_vmstate_info = { > .put = virtio_device_put, > }; > > +const VMStateInfo virtio_early_vmstate_info = { > +.name = "virtio-early", > +.get = virtio_early_device_get, > +.put = virtio_device_put, > +}; > + > static int virtio_set_features_nocheck(VirtIODevice *vdev, uint64_t val) > { > VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev); > @@ -2940,8 +2956,75 @@ size_t virtio_get_config_size(const > VirtIOConfigSizeParams *params, > return config_size; > } > > +static int virtio_load_check_index(VirtIODevice *vdev, int num) > +{ > +int i; > + > +RCU_READ_LOCK_GUARD(); > + I didn't check manually, but in the original function vdc->post_load call was also protected by the RCU. Maybe it is better to leave it at the caller? > +for (i = 0; i < num; i++) { > +if (vdev->vq[i].vring.desc) { > +uint16_t nheads; > + > +/* > + * VIRTIO-1 devices migrate desc, used, and avail ring addresses > so > + * only the region cache needs to be set up. Legacy devices need > + * to calculate used and avail ring addresses based on the desc > + * address. > + */ > +if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) { > +virtio_init_region_cache(vdev, i); > +} else { > +virtio_queue_update_rings(vdev, i); > +} > + > +if (virtio_vdev_has_feature(vdev, VIRTIO_F_RING_PACKED)) { > +vdev->vq[i].shadow_avail_idx = vdev->vq[i].last_avail_idx; > +vdev->vq[i].shadow_avail_wrap_counter = > +vdev->vq[i].last_avail_wrap_counter; > +continue; > +} > + > +nheads = vring_avail_idx(&vdev->vq[i]) - > vdev->vq[i].last_avail_idx; > +/* Check it isn't doing strange things with descriptor numbers. 
> */ > +if (nheads > vdev->vq[i].vring.num) { > +virtio_error(vdev, "VQ %d size 0x%x Guest index 0x%x " > + "inconsistent with Host index 0x%x: delta 0x%x", > + i, vdev->vq[i].vring.num, > + vring_avail_idx(&vdev->vq[i]), > + vdev->vq[i].last_avail_idx, nheads); > +vdev->vq[i].used_idx = 0; > +vdev->vq[i].shadow_avail_idx = 0; > +vdev->vq[i].inuse = 0; > +continue; > +} > +vdev->vq[i].used_idx = vring_used_idx(&vdev->vq[i]); > +vdev->vq[i].shadow_avail_idx = vring_avail_idx(&vdev->vq[i]); > + > +/* > + * Some devices migrate VirtQueueElements that have been popped > + * from the avail ring but not yet returned to the used ring. > + * Since max ring size < UINT16_MAX it's safe to use modulo > + * UINT16_MAX + 1 subtraction. > + */ > +vdev->vq[i].inuse = (uint16_t)(vdev->vq[i].last_avail_idx - > +vdev->vq[i].used_idx); > +if (vdev->vq[i].inuse > vdev->vq[i].vring.num) { > +error_report("VQ %d size 0x%x < last_avail_idx 0x%x - " > + "used_idx 0x%x", > + i, vdev->vq[i].vring.num, > + vdev-
Re: [PATCH 3/6] target/riscv: Add pointer masking tb flags
Hi Richard, Thanks for the suggestion. If it's ok to consume another bit (3 bits total) for Pointer Masking flags, I'll do it. >so that the translator can see the true width of the address I guess I'll need a helper to calculate the exact number of bits to shift (0, 7 or 16) based on those 2 extracted bits. Is it ok with you? Thanks Fri, 22 Dec 2023 at 01:49, Alexey Baturo replying to Richard Henderson : > On 12/21/23 21:40, Alexey Baturo wrote: > From: Alexey Baturo > > Signed-off-by: Alexey Baturo > --- > target/riscv/cpu.h| 19 +-- > target/riscv/cpu_helper.c | 4 > target/riscv/translate.c | 10 ++ > 3 files changed, 27 insertions(+), 6 deletions(-) > > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h > index f49d4aa52c..2099168950 100644 > --- a/target/riscv/cpu.h > +++ b/target/riscv/cpu.h > @@ -390,6 +390,10 @@ struct CPUArchState { > target_ulong senvcfg; > uint64_t henvcfg; > #endif > +/* current number of masked top bits by pointer masking */ > +target_ulong pm_pmlen; > +/* if pointer masking should do sign extension */ > +bool pm_signext; > > /* Fields from here on are preserved across CPU reset. */ > QEMUTimer *stimer; /* Internal timer for S-mode interrupt */ > @@ -538,14 +542,17 @@ FIELD(TB_FLAGS, VILL, 14, 1) > FIELD(TB_FLAGS, VSTART_EQ_ZERO, 15, 1) > /* The combination of MXL/SXL/UXL that applies to the current cpu > mode. */ > FIELD(TB_FLAGS, XL, 16, 2) > -FIELD(TB_FLAGS, VTA, 18, 1) > -FIELD(TB_FLAGS, VMA, 19, 1) > +/* If pointer masking should be applied and address sign extended */ > +FIELD(TB_FLAGS, PM_ENABLED, 18, 1) I think it would be better to add the entire two-bit field here, so that the translator can > see the true width of the address. You can then use tcg_gen_{s}extract_tl > to perform the > truncation. At which point the 'target_ulong pm_pmlen' is not required.
> > > r~ > > > +FIELD(TB_FLAGS, PM_SIGNEXTEND, 19, 1) > > +FIELD(TB_FLAGS, VTA, 20, 1) > > +FIELD(TB_FLAGS, VMA, 21, 1) > > /* Native debug itrigger */ > > -FIELD(TB_FLAGS, ITRIGGER, 20, 1) > > +FIELD(TB_FLAGS, ITRIGGER, 22, 1) > > /* Virtual mode enabled */ > > -FIELD(TB_FLAGS, VIRT_ENABLED, 21, 1) > > -FIELD(TB_FLAGS, PRIV, 22, 2) > > -FIELD(TB_FLAGS, AXL, 24, 2) > > +FIELD(TB_FLAGS, VIRT_ENABLED, 23, 1) > > +FIELD(TB_FLAGS, PRIV, 24, 2) > > +FIELD(TB_FLAGS, AXL, 25, 2) > > > > #ifdef TARGET_RISCV32 > > #define riscv_cpu_mxl(env) ((void)(env), MXL_RV32) > > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c > > index a3d477d226..79cddbd930 100644 > > --- a/target/riscv/cpu_helper.c > > +++ b/target/riscv/cpu_helper.c > > @@ -135,6 +135,10 @@ void cpu_get_tb_cpu_state(CPURISCVState *env, vaddr > *pc, > > flags = FIELD_DP32(flags, TB_FLAGS, VS, vs); > > flags = FIELD_DP32(flags, TB_FLAGS, XL, env->xl); > > flags = FIELD_DP32(flags, TB_FLAGS, AXL, cpu_address_xl(env)); > > +if (env->pm_pmlen != 0) { > > +flags = FIELD_DP32(flags, TB_FLAGS, PM_ENABLED, 1); > > +} > > +flags = FIELD_DP32(flags, TB_FLAGS, PM_SIGNEXTEND, env->pm_signext); > > > > *pflags = flags; > > } > > diff --git a/target/riscv/translate.c b/target/riscv/translate.c > > index 6b4b9a671c..4c0d526b58 100644 > > --- a/target/riscv/translate.c > > +++ b/target/riscv/translate.c > > @@ -42,6 +42,8 @@ static TCGv cpu_gpr[32], cpu_gprh[32], cpu_pc, cpu_vl, > cpu_vstart; > > static TCGv_i64 cpu_fpr[32]; /* assume F and D extensions */ > > static TCGv load_res; > > static TCGv load_val; > > +/* number of top masked address bits by pointer masking extension */ > > +static TCGv pm_pmlen; > > > > /* > >* If an operation is being performed on less than TARGET_LONG_BITS, > > @@ -103,6 +105,9 @@ typedef struct DisasContext { > > bool vl_eq_vlmax; > > CPUState *cs; > > TCGv zero; > > +/* pointer masking extension */ > > +bool pm_enabled; > > +bool pm_signext; > > /* Use icount trigger for native debug */ > > bool itrigger; > > /* FRM is known to contain a valid value. */ > > @@ -1176,6 +1181,8 @@ static void > riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs) > > ctx->xl = FIELD_EX32(tb_flags, TB_FLAGS, XL); > > ctx->address_xl = FIELD_EX32(tb_flags, TB_FLAGS, AXL); > > ctx->cs = cs; > > +ctx->pm_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_ENABLED); > > +ctx->pm_signext = FIELD_EX32(tb_flags, TB_FLAGS, PM_SIGNEXTEND); > > ctx->itrigger = FIELD_EX32(tb_flags, TB_FLAGS, ITRIGGER); > > ctx->zero = tcg_constant_tl(0); > > ctx->virt_inst_excp = false; > > @@ -1307,4 +1314,7 @@ void riscv_translate_init(void) > >"load_res"); > > load_val = tcg_global_mem_new(tcg_env, offsetof(CPURISCVState, > load_val), > >"load_val"); > > +/* Assign var with number of pointer masking m
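(The quoted diff above is cut off in the archive.) A sketch of the direction Richard suggests, with hypothetical field and helper names; the 2-bit TB_FLAGS field encodes "no masking / PMLEN 7 / PMLEN 16", and a single TCG extract performs the truncation:

/* Hypothetical decode of a 2-bit pointer-masking field from TB_FLAGS
 * into the number of valid low address bits (0, 7 or 16 masked bits). */
static unsigned pm_valid_bits(unsigned pm_field)
{
    switch (pm_field) {
    case 1:  return 64 - 7;   /* PMLEN = 7  */
    case 2:  return 64 - 16;  /* PMLEN = 16 */
    default: return 64;       /* masking disabled */
    }
}

/* Truncate (and optionally sign-extend) a virtual address at
 * translation time; ctx->pm_field/ctx->pm_signext come from TB_FLAGS. */
static void gen_pm_adjust_address(DisasContext *ctx, TCGv dst, TCGv src)
{
    unsigned len = pm_valid_bits(ctx->pm_field);

    if (len == 64) {
        tcg_gen_mov_tl(dst, src);
    } else if (ctx->pm_signext) {
        tcg_gen_sextract_tl(dst, src, 0, len);  /* replicate bit len-1 */
    } else {
        tcg_gen_extract_tl(dst, src, 0, len);   /* zero the top bits */
    }
}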
[PATCH 0/5] target/i386: Fix physical address masking bugs
The address translation logic in get_physical_address() will currently truncate physical addresses to 32 bits unless long mode is enabled. This is incorrect when using physical address extensions (PAE) outside of long mode, with the result that a 32-bit operating system using PAE to access memory above 4G will experience undefined behaviour. Instead, truncation must be applied only to non-paging mode, because all paths that go through page table accesses already produce a correctly-masked address. Furthermore, when inspecting the code I noticed that the A20 mask is applied incorrectly when NPT is active. The mask should not be applied to the addresses that are looked up in the NPT, only to the physical addresses. Obviously no hypervisor is going to leave A20 masking on, but the code actually becomes simpler so let's do it. Patches 1 and 2 fix cases in which the addresses must be masked, or overflow is otherwise invalid, for MMU_PHYS_IDX accesses. Patch 3 fixes the bug, by limiting the masking to the case of CR0.PG=0. Patches 4 and 5 further clean up the MMU functions to centralize application of the A20 mask and fix bugs in the process. Untested except for running the SVM tests from kvm-unit-tests (which is better than nothing, still). Supersedes: <0102018c8d11471f-9a6d73eb-0c34-4f61-8d37-5a4418f9e0d7-000...@eu-west-1.amazonses.com> Paolo Bonzini (5): target/i386: mask high bits of CR3 in 32-bit mode target/i386: check validity of VMCB addresses target/i386: Fix physical address truncation target/i386: remove unnecessary/wrong application of the A20 mask target/i386: leave the A20 bit set in the final NPT walk target/i386/tcg/sysemu/excp_helper.c | 44 target/i386/tcg/sysemu/misc_helper.c | 3 ++ target/i386/tcg/sysemu/svm_helper.c | 27 + 3 files changed, 43 insertions(+), 31 deletions(-) -- 2.43.0
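To make the first bug concrete: with CR0.PG=1, CR4.PAE=1 and EFER.LMA=0, a PTE can point above 4GiB, and truncating the resulting physical address to 32 bits silently aliases it to the wrong frame. A standalone arithmetic check (addresses invented for the example):

#include <assert.h>
#include <stdint.h>

int main(void)
{
    /* PAE PTE naming a 4KiB frame above 4GiB, plus a page offset: */
    uint64_t frame = 0x1234567000ULL;
    uint64_t phys  = frame | 0xabc;

    /* The old code truncated unless long mode was enabled: */
    uint32_t truncated = (uint32_t)phys;

    assert(phys      == 0x1234567abcULL);
    assert(truncated == 0x34567abcU);   /* a different frame below 4GiB */
    assert(truncated != phys);
    return 0;
}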
[PATCH 4/5] target/i386: remove unnecessary/wrong application of the A20 mask
If ptw_translate() does a MMU_PHYS_IDX access, the A20 mask is already applied in get_physical_address(), which is called via probe_access_full() and x86_cpu_tlb_fill(). If ptw_translate() on the other hand does a MMU_NESTED_IDX access, the A20 mask must not be applied to the address that is looked up in the nested page tables; it must be applied only *while* looking up NPT entries. Therefore, we can remove A20 masking from the computation of the page table entry's address, and let get_physical_address() or mmu_translate() apply it when they know they are returning a host-physical address. Cc: qemu-sta...@nongnu.org Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18) Signed-off-by: Paolo Bonzini --- target/i386/tcg/sysemu/excp_helper.c | 21 - 1 file changed, 8 insertions(+), 13 deletions(-) diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c index eee1af52710..ede8ba6b80e 100644 --- a/target/i386/tcg/sysemu/excp_helper.c +++ b/target/i386/tcg/sysemu/excp_helper.c @@ -164,8 +164,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, /* * Page table level 5 */ -pte_addr = ((in->cr3 & ~0xfff) + -(((addr >> 48) & 0x1ff) << 3)) & a20_mask; +pte_addr = (in->cr3 & ~0xfff) + (((addr >> 48) & 0x1ff) << 3); if (!ptw_translate(&pte_trans, pte_addr)) { return false; } @@ -189,8 +188,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, /* * Page table level 4 */ -pte_addr = ((pte & PG_ADDRESS_MASK) + -(((addr >> 39) & 0x1ff) << 3)) & a20_mask; +pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 39) & 0x1ff) << 3); if (!ptw_translate(&pte_trans, pte_addr)) { return false; } @@ -210,8 +208,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, /* * Page table level 3 */ -pte_addr = ((pte & PG_ADDRESS_MASK) + -(((addr >> 30) & 0x1ff) << 3)) & a20_mask; +pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 30) & 0x1ff) << 3); if (!ptw_translate(&pte_trans, pte_addr)) { return false; } @@ -238,7 +235,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, /* * Page table level 3 */ -pte_addr = ((in->cr3 & 0xffffffe0ULL) + ((addr >> 27) & 0x18)) & a20_mask; +pte_addr = (in->cr3 & 0xffffffe0ULL) + ((addr >> 27) & 0x18); if (!ptw_translate(&pte_trans, pte_addr)) { return false; } @@ -260,8 +257,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, /* * Page table level 2 */ -pte_addr = ((pte & PG_ADDRESS_MASK) + -(((addr >> 21) & 0x1ff) << 3)) & a20_mask; +pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 21) & 0x1ff) << 3); if (!ptw_translate(&pte_trans, pte_addr)) { return false; } @@ -287,8 +283,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, /* * Page table level 1 */ -pte_addr = ((pte & PG_ADDRESS_MASK) + -(((addr >> 12) & 0x1ff) << 3)) & a20_mask; +pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 12) & 0x1ff) << 3); if (!ptw_translate(&pte_trans, pte_addr)) { return false; } @@ -306,7 +301,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, /* * Page table level 2 */ -pte_addr = ((in->cr3 & 0xfffff000ULL) + ((addr >> 20) & 0xffc)) & a20_mask; +pte_addr = (in->cr3 & 0xfffff000ULL) + ((addr >> 20) & 0xffc); if (!ptw_translate(&pte_trans, pte_addr)) { return false; } @@ -335,7 +330,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, /* * Page table level 1 */ -pte_addr = ((pte & ~0xfffu) + ((addr >> 10) & 0xffc)) & a20_mask; +pte_addr = (pte & ~0xfffu) + ((addr >> 10) & 0xffc); if
(!ptw_translate(&pte_trans, pte_addr)) { return false; } -- 2.43.0
[PATCH 5/5] target/i386: leave the A20 bit set in the final NPT walk
The A20 mask is only applied to the final memory access. Nested page tables are always walked with the raw guest-physical address. Unlike the previous patch, in this one the masking must be kept, but it was done too early. Cc: qemu-sta...@nongnu.org Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18) Signed-off-by: Paolo Bonzini --- target/i386/tcg/sysemu/excp_helper.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c index ede8ba6b80e..37e650c1fcd 100644 --- a/target/i386/tcg/sysemu/excp_helper.c +++ b/target/i386/tcg/sysemu/excp_helper.c @@ -134,7 +134,6 @@ static inline bool ptw_setl(const PTETranslate *in, uint32_t old, uint32_t set) static bool mmu_translate(CPUX86State *env, const TranslateParams *in, TranslateResult *out, TranslateFault *err) { -const int32_t a20_mask = x86_get_a20_mask(env); const target_ulong addr = in->addr; const int pg_mode = in->pg_mode; const bool is_user = (in->mmu_idx == MMU_USER_IDX); @@ -417,10 +416,13 @@ do_check_protect_pse36: } } -/* align to page_size */ -paddr = (pte & a20_mask & PG_ADDRESS_MASK & ~(page_size - 1)) - | (addr & (page_size - 1)); +/* merge offset within page */ +paddr = (pte & PG_ADDRESS_MASK & ~(page_size - 1)) | (addr & (page_size - 1)); +/* + * Note that NPT is walked (for both paging structures and final guest + * addresses) using the address with the A20 bit set. + */ if (in->ptw_idx == MMU_NESTED_IDX) { CPUTLBEntryFull *full; int flags, nested_page_size; @@ -459,7 +461,7 @@ do_check_protect_pse36: } } -out->paddr = paddr; +out->paddr = paddr & x86_get_a20_mask(env); out->prot = prot; out->page_size = page_size; return true; -- 2.43.0
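For readers who have not met the A20 gate: with A20 disabled, the mask clears bit 20 of the final physical address, reproducing the 8086 wrap-around. A minimal sketch, assuming the disabled-gate mask is ~(1 << 20) (the helper used above is x86_get_a20_mask()):

    /* Sketch: A20 masking belongs on the final physical address only.
     * Assumes the mask is ~(1 << 20) when the gate is off, as on hardware. */
    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t a20_mask = ~(1u << 20);   /* A20 gate off */
        uint64_t paddr = 0x100000;         /* 1 MiB, bit 20 set */

        assert((paddr & a20_mask) == 0);   /* wraps to 0 */
        return 0;
    }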
[PATCH 1/5] target/i386: mask high bits of CR3 in 32-bit mode
CR3 bits 63:32 are ignored in 32-bit mode (either legacy 2-level paging or PAE paging). Do this in mmu_translate() to remove the last case where get_physical_address() meaningfully drops the high bits of the address. Cc: qemu-sta...@nongnu.org Suggested-by: Richard Henderson Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18) Signed-off-by: Paolo Bonzini --- target/i386/tcg/sysemu/excp_helper.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c index 5b86f439add..11126c860d4 100644 --- a/target/i386/tcg/sysemu/excp_helper.c +++ b/target/i386/tcg/sysemu/excp_helper.c @@ -238,7 +238,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, /* * Page table level 3 */ -pte_addr = ((in->cr3 & ~0x1f) + ((addr >> 27) & 0x18)) & a20_mask; +pte_addr = ((in->cr3 & 0xffffffe0ULL) + ((addr >> 27) & 0x18)) & a20_mask; if (!ptw_translate(&pte_trans, pte_addr)) { return false; } @@ -306,7 +306,7 @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, /* * Page table level 2 */ -pte_addr = ((in->cr3 & ~0xfff) + ((addr >> 20) & 0xffc)) & a20_mask; +pte_addr = ((in->cr3 & 0xfffff000ULL) + ((addr >> 20) & 0xffc)) & a20_mask; if (!ptw_translate(&pte_trans, pte_addr)) { return false; } -- 2.43.0
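A quick sanity check of the two new constants (illustrative values only): PAE consumes CR3 bits 31:5 and legacy 2-level paging bits 31:12, so junk in bits 63:32 must fall away in both cases:

    /* Sketch: the 32-bit paging modes only consume CR3 bits 31:5 / 31:12. */
    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        uint64_t cr3 = 0xdeadbeef00345018ULL;          /* junk in bits 63:32 */

        assert((cr3 & 0xffffffe0ULL) == 0x345000ULL);  /* PAE: PDPT base */
        assert((cr3 & 0xfffff000ULL) == 0x345000ULL);  /* 2-level: PD base */
        return 0;
    }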
[PATCH 2/5] target/i386: check validity of VMCB addresses
MSR_VM_HSAVE_PA bits 0-11 are reserved, as are the bits above the maximum physical address width of the processor. Setting them to 1 causes a #GP (see "15.30.4 VM_HSAVE_PA MSR" in the AMD manual). The same is true of VMCB addresses passed to VMRUN/VMLOAD/VMSAVE, even though the manual is not clear on that. Cc: qemu-sta...@nongnu.org Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18) Signed-off-by: Paolo Bonzini --- target/i386/tcg/sysemu/misc_helper.c | 3 +++ target/i386/tcg/sysemu/svm_helper.c | 27 +-- 2 files changed, 24 insertions(+), 6 deletions(-) diff --git a/target/i386/tcg/sysemu/misc_helper.c b/target/i386/tcg/sysemu/misc_helper.c index e1528b7f80b..1901712ecef 100644 --- a/target/i386/tcg/sysemu/misc_helper.c +++ b/target/i386/tcg/sysemu/misc_helper.c @@ -201,6 +201,9 @@ void helper_wrmsr(CPUX86State *env) tlb_flush(cs); break; case MSR_VM_HSAVE_PA: +if (val & (0xfff | ((~0ULL) << env_archcpu(env)->phys_bits))) { +goto error; +} env->vm_hsave = val; break; #ifdef TARGET_X86_64 diff --git a/target/i386/tcg/sysemu/svm_helper.c b/target/i386/tcg/sysemu/svm_helper.c index 32ff0dbb13c..5d6de2294fa 100644 --- a/target/i386/tcg/sysemu/svm_helper.c +++ b/target/i386/tcg/sysemu/svm_helper.c @@ -164,14 +164,19 @@ void helper_vmrun(CPUX86State *env, int aflag, int next_eip_addend) uint64_t new_cr3; uint64_t new_cr4; -cpu_svm_check_intercept_param(env, SVM_EXIT_VMRUN, 0, GETPC()); - if (aflag == 2) { addr = env->regs[R_EAX]; } else { addr = (uint32_t)env->regs[R_EAX]; } +/* Exceptions are checked before the intercept. */ +if (addr & (0xfff | ((~0ULL) << env_archcpu(env)->phys_bits))) { +raise_exception_err_ra(env, EXCP0D_GPF, 0, GETPC()); +} + +cpu_svm_check_intercept_param(env, SVM_EXIT_VMRUN, 0, GETPC()); + qemu_log_mask(CPU_LOG_TB_IN_ASM, "vmrun! " TARGET_FMT_lx "\n", addr); env->vm_vmcb = addr; @@ -463,14 +468,19 @@ void helper_vmload(CPUX86State *env, int aflag) int mmu_idx = MMU_PHYS_IDX; target_ulong addr; -cpu_svm_check_intercept_param(env, SVM_EXIT_VMLOAD, 0, GETPC()); - if (aflag == 2) { addr = env->regs[R_EAX]; } else { addr = (uint32_t)env->regs[R_EAX]; } +/* Exceptions are checked before the intercept. */ +if (addr & (0xfff | ((~0ULL) << env_archcpu(env)->phys_bits))) { +raise_exception_err_ra(env, EXCP0D_GPF, 0, GETPC()); +} + +cpu_svm_check_intercept_param(env, SVM_EXIT_VMLOAD, 0, GETPC()); + if (virtual_vm_load_save_enabled(env, SVM_EXIT_VMLOAD, GETPC())) { mmu_idx = MMU_NESTED_IDX; } @@ -519,14 +529,19 @@ void helper_vmsave(CPUX86State *env, int aflag) int mmu_idx = MMU_PHYS_IDX; target_ulong addr; -cpu_svm_check_intercept_param(env, SVM_EXIT_VMSAVE, 0, GETPC()); - if (aflag == 2) { addr = env->regs[R_EAX]; } else { addr = (uint32_t)env->regs[R_EAX]; } +/* Exceptions are checked before the intercept. */ +if (addr & (0xfff | ((~0ULL) << env_archcpu(env)->phys_bits))) { +raise_exception_err_ra(env, EXCP0D_GPF, 0, GETPC()); +} + +cpu_svm_check_intercept_param(env, SVM_EXIT_VMSAVE, 0, GETPC()); + if (virtual_vm_load_save_enabled(env, SVM_EXIT_VMSAVE, GETPC())) { mmu_idx = MMU_NESTED_IDX; } -- 2.43.0
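The check packs both constraints, reserved low bits and out-of-range high bits, into a single mask. Assuming phys_bits = 40 purely for illustration:

    /* Sketch: invalid VMCB/HSAVE address bits for a CPU with 40 physical bits. */
    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        unsigned phys_bits = 40;                         /* illustrative */
        uint64_t invalid = 0xfff | ((~0ULL) << phys_bits);

        assert(invalid == 0xffffff0000000fffULL);        /* bits 11:0 and 63:40 */
        assert((0x0000001234567000ULL & invalid) == 0);  /* aligned, in range */
        assert((0x0000001234567800ULL & invalid) != 0);  /* reserved bits: #GP */
        assert((0x0100001234567000ULL & invalid) != 0);  /* above 2^40: #GP */
        return 0;
    }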
[PATCH 3/5] target/i386: Fix physical address truncation
The address translation logic in get_physical_address() will currently truncate physical addresses to 32 bits unless long mode is enabled. This is incorrect when using physical address extensions (PAE) outside of long mode, with the result that a 32-bit operating system using PAE to access memory above 4G will experience undefined behaviour. The truncation code was originally introduced in commit 33dfdb5 ("x86: only allow real mode to access 32bit without LMA"), where it applied only to translations performed while paging is disabled (and so cannot affect guests using PAE). Commit 9828198 ("target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX") rearranged the code such that the truncation also applied to the use of MMU_PHYS_IDX and MMU_NESTED_IDX. Commit 4a1e9d4 ("target/i386: Use atomic operations for pte updates") brought this truncation into scope for page table entry accesses, and is the first commit for which a Windows 10 32-bit guest will reliably fail to boot if memory above 4G is present. The truncation code however is not completely redundant. Even though the maximum address size for any executed instruction is 32 bits, helpers for operations such as BOUND, FSAVE or XSAVE may ask get_physical_address() to translate an address outside of the 32-bit range, if invoked with an argument that is close to the 4G boundary. So, move the address truncation in get_physical_address() to the CR0.PG=0 case. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2040 Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18) Cc: qemu-sta...@nongnu.org Co-developed-by: Michael Brown Signed-off-by: Michael Brown Signed-off-by: Paolo Bonzini --- target/i386/tcg/sysemu/excp_helper.c | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c index 11126c860d4..eee1af52710 100644 --- a/target/i386/tcg/sysemu/excp_helper.c +++ b/target/i386/tcg/sysemu/excp_helper.c @@ -577,17 +577,14 @@ static bool get_physical_address(CPUX86State *env, vaddr addr, } return mmu_translate(env, &in, out, err); } + +/* No paging implies long mode is disabled. */ +addr = (uint32_t)addr; break; } -/* Translation disabled. */ +/* No translation needed. */ out->paddr = addr & x86_get_a20_mask(env); -#ifdef TARGET_X86_64 -if (!(env->hflags & HF_LMA_MASK)) { -/* Without long mode we can only address 32bits in real mode */ -out->paddr = (uint32_t)out->paddr; -} -#endif out->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC; out->page_size = TARGET_PAGE_SIZE; return true; -- 2.43.0
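The BOUND/FSAVE/XSAVE point can be made concrete (illustrative sketch): with CR0.PG=0, a multi-byte access that starts just below 4G must wrap to low memory rather than escape into bit 32:

    /* Sketch: why truncation is still needed in the CR0.PG=0 case. */
    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        /* e.g. an FSAVE area starting just below the 4G boundary... */
        uint64_t addr = 0xfffffff0ULL + 0x20;   /* crosses the boundary */

        /* ...must wrap, not access physical 0x100000010. */
        assert((uint32_t)addr == 0x10);
        return 0;
    }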
[PATCH 02/22] target/i386: speedup JO/SETO after MUL or IMUL
OF is equal to the carry flag, so use the same CCPrepare. Signed-off-by: Paolo Bonzini --- target/i386/tcg/translate.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 8fb80011a22..a16eb8d4008 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -1020,6 +1020,9 @@ static CCPrepare gen_prepare_eflags_o(DisasContext *s, TCGv reg) case CC_OP_CLR: case CC_OP_POPCNT: return (CCPrepare) { .cond = TCG_COND_NEVER, .mask = -1 }; +case CC_OP_MULB ... CC_OP_MULQ: +return (CCPrepare) { .cond = TCG_COND_NE, + .reg = cpu_cc_src, .mask = -1 }; default: gen_compute_eflags(s); return (CCPrepare) { .cond = TCG_COND_NE, .reg = cpu_cc_src, -- 2.43.0
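Spelled out (a hedged host-side sketch, not TCG code): after MUL, CF and OF are both set exactly when the upper half of the product is nonzero, and that upper half is what the CC_OP_MUL* encodings keep in CC_SRC, so a single "CC_SRC != 0" CCPrepare serves both flags. (IMUL stores an analogous nonzero-iff-overflow indicator.)

    /* Sketch: after 8-bit MUL, CF = OF = (high byte of the product != 0). */
    #include <assert.h>
    #include <stdint.h>

    static int mul_overflows(uint8_t a, uint8_t b)
    {
        uint16_t prod = (uint16_t)a * b;
        return (prod >> 8) != 0;          /* the value kept in CC_SRC */
    }

    int main(void)
    {
        assert(!mul_overflows(15, 17));   /* 255 fits in AL: CF = OF = 0 */
        assert(mul_overflows(16, 16));    /* 256 does not: CF = OF = 1 */
        return 0;
    }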
[PATCH 20/22] target/i386: adjust decoding of J operand
gen_jcc() has been changed to accept a relative offset since the new decoder was written. Adjust the J operand, which is meant to be used with jump instructions such as gen_jcc(), to not include the program counter and to not truncate the result, as both operations are now performed by common code. The result is that J is now the same as the I operand. Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/tcg/decode-new.c.inc | 10 -- 1 file changed, 10 deletions(-) diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc index 99d18d2871e..f30889dbc0a 100644 --- a/target/i386/tcg/decode-new.c.inc +++ b/target/i386/tcg/decode-new.c.inc @@ -1329,19 +1329,9 @@ static bool decode_op(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode, } case X86_TYPE_I: /* Immediate */ -op->unit = X86_OP_IMM; -decode->immediate = insn_get_signed(env, s, op->ot); -break; - case X86_TYPE_J: /* Relative offset for a jump */ op->unit = X86_OP_IMM; decode->immediate = insn_get_signed(env, s, op->ot); -decode->immediate += s->pc - s->cs_base; -if (s->dflag == MO_16) { -decode->immediate &= 0xffff; -} else if (!CODE64(s)) { -decode->immediate &= 0xffffffffu; -} break; case X86_TYPE_L: /* The upper 4 bits of the immediate select a 128-bit register */ -- 2.43.0
[PATCH 00/22] target/i386: first part of TCG changes for 9.0
The main things here are a bunch of cleanups to the common logic of the new decoder, some changes to translate.c that prepare for redoing one-byte opcodes in the new framework, and the implementation of the CMPccXADD instruction that is new in Sierra Forest processors. These are all relatively innocuous changes, and easy to bisect in case things go wrong. Paolo Paolo Bonzini (22): target/i386: optimize computation of JL and JLE from flags target/i386: speedup JO/SETO after MUL or IMUL target/i386: remove unnecessary arguments from raise_interrupt target/i386: remove unnecessary truncations target/i386: clean up cpu_cc_compute_all target/i386: document more deviations from the manual target/i386: reimplement check for validity of LOCK prefix target/i386: avoid trunc and ext for MULX and RORX target/i386: rename zext0/zext2 and make them closer to the manual target/i386: add X86_SPECIALs for MOVSX and MOVZX target/i386: do not decode string source/destination into decode->mem target/i386: do not clobber A0 in POP translation target/i386: do not clobber T0 on string operations target/i386: split eflags computation out of gen_compute_eflags target/i386: do not use s->tmp4 for push target/i386: do not use s->tmp0 for jumps on ECX ==/!= 0 target/i386: extract gen_far_call/jmp, reordering temporaries target/i386: prepare for implementation of STOS/SCAS in new decoder target/i386: move operand load and writeback out of gen_cmovcc1 target/i386: adjust decoding of J operand target/i386: introduce flags writeback mechanism target/i386: implement CMPccXADD target/i386/cpu.c| 2 +- target/i386/cpu.h| 5 +- target/i386/tcg/cc_helper.c | 6 +- target/i386/tcg/decode-new.c.inc | 152 +-- target/i386/tcg/decode-new.h | 29 +++- target/i386/tcg/emit.c.inc | 224 +-- target/i386/tcg/excp_helper.c| 7 +- target/i386/tcg/fpu_helper.c | 10 +- target/i386/tcg/helper-tcg.h | 3 +- target/i386/tcg/int_helper.c | 8 +- target/i386/tcg/misc_helper.c| 4 +- target/i386/tcg/seg_helper.c | 8 +- target/i386/tcg/translate.c | 250 ++- tests/tcg/i386/Makefile.target | 2 +- tests/tcg/i386/test-flags.c | 37 + 15 files changed, 509 insertions(+), 238 deletions(-) create mode 100644 tests/tcg/i386/test-flags.c -- 2.43.0
[PATCH 21/22] target/i386: introduce flags writeback mechanism
ALU instructions can write to both memory and flags. If the CC_SRC* and CC_DST locations have been written already when a memory access causes a fault, the value in CC_SRC* and CC_DST might be interpreted with the wrong CC_OP (the one that is in effect before the instruction). Besides just using the wrong result for the flags, something like subtracting -1 can have disastrous effects if the current CC_OP is CC_OP_EFLAGS: this is because QEMU does not expect bits outside the ALU flags to be set in CC_SRC, and env->eflags can end up set to all-ones. In the case of the attached testcase, this sets IOPL to 3 and would cause an assertion failure if SUB is moved to the new decoder. This mechanism is not really needed for BMI instructions, which can only write to a register, but put it to use anyway for cleanliness. In the case of BZHI, the code has to be modified slightly to ensure that decode->cc_src is written, otherwise the new assertions trigger. Signed-off-by: Paolo Bonzini --- target/i386/cpu.h| 1 + target/i386/tcg/decode-new.c.inc | 34 + target/i386/tcg/decode-new.h | 4 target/i386/tcg/emit.c.inc | 36 --- tests/tcg/i386/Makefile.target | 2 +- tests/tcg/i386/test-flags.c | 37 6 files changed, 101 insertions(+), 13 deletions(-) create mode 100644 tests/tcg/i386/test-flags.c diff --git a/target/i386/cpu.h b/target/i386/cpu.h index ecdd4518c64..7f0786e8b98 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1285,6 +1285,7 @@ typedef enum { CC_OP_NB, } CCOp; +QEMU_BUILD_BUG_ON(CC_OP_NB >= 128); typedef struct SegmentCache { uint32_t selector; diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc index f30889dbc0a..717d7307722 100644 --- a/target/i386/tcg/decode-new.c.inc +++ b/target/i386/tcg/decode-new.c.inc @@ -1662,6 +1662,7 @@ static void disas_insn_new(DisasContext *s, CPUState *cpu, int b) bool first = true; X86DecodedInsn decode; X86DecodeFunc decode_func = decode_root; +uint8_t cc_live; s->has_modrm = false; @@ -1815,6 +1816,7 @@ static void disas_insn_new(DisasContext *s, CPUState *cpu, int b) } memset(&decode, 0, sizeof(decode)); +decode.cc_op = -1; decode.b = b; if (!decode_insn(s, env, decode_func, &decode)) { goto illegal_op; @@ -1953,6 +1955,38 @@ static void disas_insn_new(DisasContext *s, CPUState *cpu, int b) decode.e.gen(s, env, &decode); gen_writeback(s, &decode, 0, s->T0); } + +/* + * Write back flags after last memory access. Some newer ALU instructions, as + * well as SSE instructions, write flags in the gen_* function, but that can + * cause incorrect tracking of CC_OP for instructions that write to both memory + * and flags.
+ */ +if (decode.cc_op != -1) { +if (decode.cc_dst) { +tcg_gen_mov_tl(cpu_cc_dst, decode.cc_dst); +} +if (decode.cc_src) { +tcg_gen_mov_tl(cpu_cc_src, decode.cc_src); +} +if (decode.cc_src2) { +tcg_gen_mov_tl(cpu_cc_src2, decode.cc_src2); +} +if (decode.cc_op == CC_OP_DYNAMIC) { +tcg_gen_mov_i32(cpu_cc_op, decode.cc_op_dynamic); +} +set_cc_op(s, decode.cc_op); +cc_live = cc_op_live[decode.cc_op]; +} else { +cc_live = 0; +} +if (decode.cc_op != CC_OP_DYNAMIC) { +assert(!decode.cc_op_dynamic); +assert(!!decode.cc_dst == !!(cc_live & USES_CC_DST)); +assert(!!decode.cc_src == !!(cc_live & USES_CC_SRC)); +assert(!!decode.cc_src2 == !!(cc_live & USES_CC_SRC2)); +} + return; gp_fault: gen_exception_gpf(s); diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h index 70b6717227f..25220fc4362 100644 --- a/target/i386/tcg/decode-new.h +++ b/target/i386/tcg/decode-new.h @@ -283,6 +283,10 @@ struct X86DecodedInsn { target_ulong immediate; AddressParts mem; +TCGv cc_dst, cc_src, cc_src2; +TCGv_i32 cc_op_dynamic; +int8_t cc_op; + uint8_t b; }; diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc index 4c2006fdd09..fd120e7b9b4 100644 --- a/target/i386/tcg/emit.c.inc +++ b/target/i386/tcg/emit.c.inc @@ -339,6 +339,19 @@ static inline int vector_len(DisasContext *s, X86DecodedInsn *decode) return s->vex_l ? 32 : 16; } +static void prepare_update1_cc(X86DecodedInsn *decode, DisasContext *s, CCOp op) +{ +decode->cc_dst = s->T0; +decode->cc_op = op; +} + +static void prepare_update2_cc(X86DecodedInsn *decode, DisasContext *s, CCOp op) +{ +decode->cc_src = s->T1; +decode->cc_dst = s->T0; +decode->cc_op = op; +} + static void gen_store_sse(DisasContext *s, X86DecodedInsn *decode, int src_ofs) { MemOp ot = decode->op[0].ot; @@ -1027,6 +1040,7 @@ static void gen_##uname(DisasContext *s,
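The "disastrous effects" are easy to picture with illustrative numbers: SUB stores the subtrahend in CC_SRC, so subtracting -1 leaves all-ones there, and if a fault then lets that value be read back under CC_OP_EFLAGS, IOPL alone comes out as 3:

    /* Sketch: why stale CC_SRC must not be reinterpreted as EFLAGS.
     * Numbers are illustrative only. */
    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t cc_src = (uint32_t)-1;      /* subtrahend of a SUB by -1 */
        uint32_t iopl = (cc_src >> 12) & 3;  /* EFLAGS.IOPL under CC_OP_EFLAGS */

        assert(iopl == 3);                   /* the failure the testcase catches */
        return 0;
    }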
[PATCH 01/22] target/i386: optimize computation of JL and JLE from flags
Take advantage of the fact that there can be no 1 bits between SF and OF. If they were adjacent, you could sum SF and get a carry only if SF was already set. Then the value of OF in the sum is the XOR of OF itself, the carry (which is SF) and 0 (the value of the OF bit in the addend): this is OF^SF exactly. Because OF and SF are not adjacent, just place more 1 bits to the left so that the carry propagates, which means summing CC_O - CC_S. Signed-off-by: Paolo Bonzini --- target/i386/tcg/translate.c | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 037bc47e7c2..8fb80011a22 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -1126,10 +1126,9 @@ static CCPrepare gen_prepare_cc(DisasContext *s, int b, TCGv reg) if (reg == cpu_cc_src) { reg = s->tmp0; } -tcg_gen_shri_tl(reg, cpu_cc_src, 4); /* CC_O -> CC_S */ -tcg_gen_xor_tl(reg, reg, cpu_cc_src); +tcg_gen_addi_tl(reg, cpu_cc_src, CC_O - CC_S); cc = (CCPrepare) { .cond = TCG_COND_NE, .reg = reg, - .mask = CC_S }; + .mask = CC_O }; break; default: case JCC_LE: @@ -1137,10 +1136,9 @@ static CCPrepare gen_prepare_cc(DisasContext *s, int b, TCGv reg) if (reg == cpu_cc_src) { reg = s->tmp0; } -tcg_gen_shri_tl(reg, cpu_cc_src, 4); /* CC_O -> CC_S */ -tcg_gen_xor_tl(reg, reg, cpu_cc_src); +tcg_gen_addi_tl(reg, cpu_cc_src, CC_O - CC_S); cc = (CCPrepare) { .cond = TCG_COND_NE, .reg = reg, - .mask = CC_S | CC_Z }; + .mask = CC_O | CC_Z }; break; } break; -- 2.43.0
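The argument is small enough to brute-force (sketch; CC_S and CC_O are the EFLAGS masks 0x80 and 0x800, and no live bit sits between them):

    /* Sketch: adding CC_O - CC_S turns bit 11 of the sum into SF ^ OF. */
    #include <assert.h>
    #include <stdint.h>

    #define CC_S 0x0080
    #define CC_O 0x0800

    int main(void)
    {
        for (int sf = 0; sf <= 1; sf++) {
            for (int of = 0; of <= 1; of++) {
                uint32_t src = (sf ? CC_S : 0) | (of ? CC_O : 0);
                uint32_t sum = src + (CC_O - CC_S);
                assert(!!(sum & CC_O) == (sf ^ of));
            }
        }
        return 0;
    }

Bits below SF are untouched by the addend, which is why the JLE case can still mask in CC_Z.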
[PATCH 03/22] target/i386: remove unnecessary arguments from raise_interrupt
is_int is always 1, and error_code is always zero. Signed-off-by: Paolo Bonzini --- target/i386/tcg/excp_helper.c | 7 +++ target/i386/tcg/helper-tcg.h | 3 +-- target/i386/tcg/misc_helper.c | 2 +- 3 files changed, 5 insertions(+), 7 deletions(-) diff --git a/target/i386/tcg/excp_helper.c b/target/i386/tcg/excp_helper.c index 7c3c8dc7fe8..65e37ae2a0c 100644 --- a/target/i386/tcg/excp_helper.c +++ b/target/i386/tcg/excp_helper.c @@ -28,7 +28,7 @@ G_NORETURN void helper_raise_interrupt(CPUX86State *env, int intno, int next_eip_addend) { -raise_interrupt(env, intno, 1, 0, next_eip_addend); +raise_interrupt(env, intno, next_eip_addend); } G_NORETURN void helper_raise_exception(CPUX86State *env, int exception_index) @@ -112,10 +112,9 @@ void raise_interrupt2(CPUX86State *env, int intno, /* shortcuts to generate exceptions */ -G_NORETURN void raise_interrupt(CPUX86State *env, int intno, int is_int, -int error_code, int next_eip_addend) +G_NORETURN void raise_interrupt(CPUX86State *env, int intno, int next_eip_addend) { -raise_interrupt2(env, intno, is_int, error_code, next_eip_addend, 0); +raise_interrupt2(env, intno, 1, 0, next_eip_addend, 0); } G_NORETURN void raise_exception_err(CPUX86State *env, int exception_index, diff --git a/target/i386/tcg/helper-tcg.h b/target/i386/tcg/helper-tcg.h index cd1723389ad..ce34b737bb0 100644 --- a/target/i386/tcg/helper-tcg.h +++ b/target/i386/tcg/helper-tcg.h @@ -65,8 +65,7 @@ G_NORETURN void raise_exception_err(CPUX86State *env, int exception_index, int error_code); G_NORETURN void raise_exception_err_ra(CPUX86State *env, int exception_index, int error_code, uintptr_t retaddr); -G_NORETURN void raise_interrupt(CPUX86State *nenv, int intno, int is_int, -int error_code, int next_eip_addend); +G_NORETURN void raise_interrupt(CPUX86State *nenv, int intno, int next_eip_addend); G_NORETURN void handle_unaligned_access(CPUX86State *env, vaddr vaddr, MMUAccessType access_type, uintptr_t retaddr); diff --git a/target/i386/tcg/misc_helper.c b/target/i386/tcg/misc_helper.c index babff061864..66b332a83c1 100644 --- a/target/i386/tcg/misc_helper.c +++ b/target/i386/tcg/misc_helper.c @@ -43,7 +43,7 @@ void helper_into(CPUX86State *env, int next_eip_addend) eflags = cpu_cc_compute_all(env, CC_OP); if (eflags & CC_O) { -raise_interrupt(env, EXCP04_INTO, 1, 0, next_eip_addend); +raise_interrupt(env, EXCP04_INTO, next_eip_addend); } } -- 2.43.0
[PATCH 18/22] target/i386: prepare for implementation of STOS/SCAS in new decoder
Do not use gen_op, and pull the load from the accumulator into disas_insn. Signed-off-by: Paolo Bonzini --- target/i386/tcg/translate.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index edbad0ad746..c7d48088418 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -1264,7 +1264,6 @@ static TCGLabel *gen_jz_ecx_string(DisasContext *s) static void gen_stos(DisasContext *s, MemOp ot) { -gen_op_mov_v_reg(s, MO_32, s->T0, R_EAX); gen_string_movl_A0_EDI(s); gen_op_st_v(s, ot, s->T0, s->A0); gen_op_add_reg(s, s->aflag, R_EDI, gen_compute_Dshift(s, ot)); @@ -1282,7 +1281,11 @@ static void gen_scas(DisasContext *s, MemOp ot) { gen_string_movl_A0_EDI(s); gen_op_ld_v(s, ot, s->T1, s->A0); -gen_op(s, OP_CMPL, ot, R_EAX); +tcg_gen_mov_tl(cpu_cc_src, s->T1); +tcg_gen_mov_tl(s->cc_srcT, s->T0); +tcg_gen_sub_tl(cpu_cc_dst, s->T0, s->T1); +set_cc_op(s, CC_OP_SUBB + ot); + gen_op_add_reg(s, s->aflag, R_EDI, gen_compute_Dshift(s, ot)); } @@ -4960,6 +4963,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu) case 0xaa: /* stosS */ case 0xab: ot = mo_b_d(b, dflag); +gen_op_mov_v_reg(s, MO_32, s->T0, R_EAX); if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) { gen_repz_stos(s, ot); } else { @@ -4978,6 +4982,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu) case 0xae: /* scasS */ case 0xaf: ot = mo_b_d(b, dflag); +gen_op_mov_v_reg(s, MO_32, s->T0, R_EAX); if (prefixes & PREFIX_REPNZ) { gen_repz_scas(s, ot, 1); } else if (prefixes & PREFIX_REPZ) { -- 2.43.0
[PATCH 19/22] target/i386: move operand load and writeback out of gen_cmovcc1
Similar to gen_setcc1, make gen_cmovcc1 receive TCGv. This is more friendly to simultaneous implementation in the old and the new decoder. A small wart is that s->T0 of CMOV is currently the *second* argument (which would ordinarily be in T1). Therefore, the condition has to be inverted in order to overwrite s->T0 with cpu_regs[reg] if the MOV is not performed. This only applies to the old decoder, and this code will go away soon. Signed-off-by: Paolo Bonzini --- target/i386/tcg/translate.c | 16 ++-- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index c7d48088418..53b98d5e6ac 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -2500,14 +2500,10 @@ static void gen_jcc(DisasContext *s, int b, int diff) gen_jmp_rel(s, s->dflag, diff, 0); } -static void gen_cmovcc1(CPUX86State *env, DisasContext *s, MemOp ot, int b, -int modrm, int reg) +static void gen_cmovcc1(DisasContext *s, int b, TCGv dest, TCGv src) { -CCPrepare cc; +CCPrepare cc = gen_prepare_cc(s, b, s->T1); -gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0); - -cc = gen_prepare_cc(s, b, s->T1); if (cc.mask != -1) { TCGv t0 = tcg_temp_new(); tcg_gen_andi_tl(t0, cc.reg, cc.mask); @@ -2517,9 +2513,7 @@ static void gen_cmovcc1(CPUX86State *env, DisasContext *s, MemOp ot, int b, cc.reg2 = tcg_constant_tl(cc.imm); } -tcg_gen_movcond_tl(cc.cond, s->T0, cc.reg, cc.reg2, - s->T0, cpu_regs[reg]); -gen_op_mov_reg_v(s, ot, reg, s->T0); +tcg_gen_movcond_tl(cc.cond, dest, cc.reg, cc.reg2, src, dest); } static inline void gen_op_movl_T0_seg(DisasContext *s, X86Seg seg_reg) @@ -5238,7 +5232,9 @@ static bool disas_insn(DisasContext *s, CPUState *cpu) ot = dflag; modrm = x86_ldub_code(env, s); reg = ((modrm >> 3) & 7) | REX_R(s); -gen_cmovcc1(env, s, ot, b, modrm, reg); +gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0); +gen_cmovcc1(s, b ^ 1, s->T0, cpu_regs[reg]); +gen_op_mov_reg_v(s, ot, reg, s->T0); break; // -- 2.43.0
[PATCH 10/22] target/i386: add X86_SPECIALs for MOVSX and MOVZX
Usually the registers are just moved into s->T0 without much care for their operand size. However, in some cases we can get more efficient code if the operand fetching logic syncs with the emission function on what is nicer. All the current uses are mostly demonstrative and only reduce the code in the emission functions, because the instructions do not support memory operands. However the logic is generic and applies to several more instructions such as MOVSXD (aka movslq), one-byte shift instructions, multiplications, XLAT, and indirect calls/jumps. Signed-off-by: Paolo Bonzini --- target/i386/tcg/decode-new.c.inc | 18 ++ target/i386/tcg/decode-new.h | 4 +++ target/i386/tcg/emit.c.inc | 42 +--- 3 files changed, 40 insertions(+), 24 deletions(-) diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc index 00fdb243857..d7a86d96c0c 100644 --- a/target/i386/tcg/decode-new.c.inc +++ b/target/i386/tcg/decode-new.c.inc @@ -156,6 +156,8 @@ #define op0_Rd .special = X86_SPECIAL_Op0_Rd, #define op2_Ry .special = X86_SPECIAL_Op2_Ry, #define avx_movx .special = X86_SPECIAL_AVXExtMov, +#define sextT0 .special = X86_SPECIAL_SExtT0, +#define zextT0 .special = X86_SPECIAL_ZExtT0, #define vex1 .vex_class = 1, #define vex1_rep3 .vex_class = 1, .vex_special = X86_VEX_REPScalar, @@ -571,8 +573,8 @@ static const X86OpEntry opcodes_0F38_F0toFF[16][5] = { [5] = { X86_OP_ENTRY3(BZHI, G,y, E,y, B,y, vex13 cpuid(BMI1)), {}, -X86_OP_ENTRY3(PEXT, G,y, B,y, E,y, vex13 cpuid(BMI2)), -X86_OP_ENTRY3(PDEP, G,y, B,y, E,y, vex13 cpuid(BMI2)), +X86_OP_ENTRY3(PEXT, G,y, B,y, E,y, vex13 zextT0 cpuid(BMI2)), +X86_OP_ENTRY3(PDEP, G,y, B,y, E,y, vex13 zextT0 cpuid(BMI2)), {}, }, [6] = { @@ -583,10 +585,10 @@ static const X86OpEntry opcodes_0F38_F0toFF[16][5] = { {}, }, [7] = { -X86_OP_ENTRY3(BEXTR, G,y, E,y, B,y, vex13 cpuid(BMI1)), +X86_OP_ENTRY3(BEXTR, G,y, E,y, B,y, vex13 zextT0 cpuid(BMI1)), X86_OP_ENTRY3(SHLX, G,y, E,y, B,y, vex13 cpuid(BMI1)), -X86_OP_ENTRY3(SARX, G,y, E,y, B,y, vex13 cpuid(BMI1)), -X86_OP_ENTRY3(SHRX, G,y, E,y, B,y, vex13 cpuid(BMI1)), +X86_OP_ENTRY3(SARX, G,y, E,y, B,y, vex13 sextT0 cpuid(BMI1)), +X86_OP_ENTRY3(SHRX, G,y, E,y, B,y, vex13 zextT0 cpuid(BMI1)), {}, }, }; @@ -1905,6 +1907,12 @@ static void disas_insn_new(DisasContext *s, CPUState *cpu, int b) } break; +case X86_SPECIAL_SExtT0: +case X86_SPECIAL_ZExtT0: +/* Handled in gen_load. */ +assert(decode.op[1].unit == X86_OP_INT); +break; + default: break; } diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h index b253f7457ae..70b6717227f 100644 --- a/target/i386/tcg/decode-new.h +++ b/target/i386/tcg/decode-new.h @@ -191,6 +191,10 @@ typedef enum X86InsnSpecial { * become P/P/Q/N, and size "x" becomes "q". */ X86_SPECIAL_MMX, + +/* When loaded into s->T0, register operand 1 is zero/sign extended. 
*/ +X86_SPECIAL_SExtT0, +X86_SPECIAL_ZExtT0, } X86InsnSpecial; /* diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc index f5e44117eab..4c2006fdd09 100644 --- a/target/i386/tcg/emit.c.inc +++ b/target/i386/tcg/emit.c.inc @@ -232,9 +232,30 @@ static void gen_load(DisasContext *s, X86DecodedInsn *decode, int opn, TCGv v) break; case X86_OP_INT: if (op->has_ea) { -gen_op_ld_v(s, op->ot, v, s->A0); +if (v == s->T0 && decode->e.special == X86_SPECIAL_SExtT0) { +gen_op_ld_v(s, op->ot | MO_SIGN, v, s->A0); +} else { +gen_op_ld_v(s, op->ot, v, s->A0); +} + +} else if (op->ot == MO_8 && byte_reg_is_xH(s, op->n)) { +if (v == s->T0 && decode->e.special == X86_SPECIAL_SExtT0) { +tcg_gen_sextract_tl(v, cpu_regs[op->n - 4], 8, 8); +} else { +tcg_gen_extract_tl(v, cpu_regs[op->n - 4], 8, 8); +} + +} else if (op->ot < MO_TL && v == s->T0 && + (decode->e.special == X86_SPECIAL_SExtT0 || +decode->e.special == X86_SPECIAL_ZExtT0)) { +if (decode->e.special == X86_SPECIAL_SExtT0) { +tcg_gen_ext_tl(v, cpu_regs[op->n], op->ot | MO_SIGN); +} else { +tcg_gen_ext_tl(v, cpu_regs[op->n], op->ot); +} + } else { -gen_op_mov_v_reg(s, op->ot, v, op->n); +tcg_gen_mov_tl(v, cpu_regs[op->n]); } break; case X86_OP_IMM: @@ -1084,9 +1105,6 @@ static void gen_BEXTR(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode) * Shifts larger than operand size get zeros. */ tcg_gen_ext8u_tl(s->A0, s->T1); -if (TARGET_LONG_BITS == 64 && ot == MO_32) { -
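The payoff shows up in cases like SARX on a 32-bit operand (illustrative sketch): once the source is sign-extended to the full register width at load time, one full-width arithmetic shift produces the correct 32-bit result with no extra trunc/ext:

    /* Sketch: sign-extending at load time lets a wide shift implement SARX32. */
    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t src = 0x80000010u;    /* negative as a 32-bit value */
        unsigned count = 4;

        int64_t ext = (int32_t)src;    /* what the sextT0 load produces */
        uint32_t res = (uint32_t)(ext >> count);

        assert(res == 0xf8000001u);    /* same as a 32-bit SAR */
        return 0;
    }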
[PATCH 14/22] target/i386: split eflags computation out of gen_compute_eflags
The new x86 decoder wants the gen_* functions to compute EFLAGS before writeback, which can be an issue for instructions with a memory destination such as ARPL or shifts. Extract code to compute the EFLAGS without clobbering CC_SRC, in case the memory write causes a fault. The flags writeback mechanism will take care of copying the result to CC_SRC. Signed-off-by: Paolo Bonzini --- target/i386/tcg/translate.c | 28 +++- 1 file changed, 19 insertions(+), 9 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 00ed0cc9a31..b79c312465b 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -862,22 +862,22 @@ static void gen_op_update_neg_cc(DisasContext *s) tcg_gen_movi_tl(s->cc_srcT, 0); } -/* compute all eflags to cc_src */ -static void gen_compute_eflags(DisasContext *s) +/* compute all eflags to reg */ +static void gen_mov_eflags(DisasContext *s, TCGv reg) { -TCGv zero, dst, src1, src2; +TCGv dst, src1, src2; +TCGv_i32 cc_op; int live, dead; if (s->cc_op == CC_OP_EFLAGS) { +tcg_gen_mov_tl(reg, cpu_cc_src); return; } if (s->cc_op == CC_OP_CLR) { -tcg_gen_movi_tl(cpu_cc_src, CC_Z | CC_P); -set_cc_op(s, CC_OP_EFLAGS); +tcg_gen_movi_tl(reg, CC_Z | CC_P); return; } -zero = NULL; dst = cpu_cc_dst; src1 = cpu_cc_src; src2 = cpu_cc_src2; @@ -886,7 +886,7 @@ static void gen_compute_eflags(DisasContext *s) live = cc_op_live[s->cc_op] & ~USES_CC_SRCT; dead = live ^ (USES_CC_DST | USES_CC_SRC | USES_CC_SRC2); if (dead) { -zero = tcg_constant_tl(0); +TCGv zero = tcg_constant_tl(0); if (dead & USES_CC_DST) { dst = zero; } @@ -898,8 +898,18 @@ static void gen_compute_eflags(DisasContext *s) } } -gen_update_cc_op(s); -gen_helper_cc_compute_all(cpu_cc_src, dst, src1, src2, cpu_cc_op); +if (s->cc_op != CC_OP_DYNAMIC) { +cc_op = tcg_constant_i32(s->cc_op); +} else { +cc_op = cpu_cc_op; +} +gen_helper_cc_compute_all(reg, dst, src1, src2, cc_op); +} + +/* compute all eflags to cc_src */ +static void gen_compute_eflags(DisasContext *s) +{ +gen_mov_eflags(s, cpu_cc_src); set_cc_op(s, CC_OP_EFLAGS); } -- 2.43.0
[PATCH 22/22] target/i386: implement CMPccXADD
The main difficulty here is that a page fault when writing to the destination must not overwrite the flags. Therefore, the compute-flags helper must be called with a temporary destination instead of using gen_jcc1*. For simplicity, I am using an unconditional cmpxchg operation, that becomes a NOP if the comparison fails. Signed-off-by: Paolo Bonzini --- target/i386/cpu.c| 2 +- target/i386/tcg/decode-new.c.inc | 25 target/i386/tcg/decode-new.h | 1 + target/i386/tcg/emit.c.inc | 104 +++ target/i386/tcg/translate.c | 2 + 5 files changed, 133 insertions(+), 1 deletion(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 95d5f16cd5e..fd47ee7defb 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -738,7 +738,7 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1, #define TCG_7_0_EDX_FEATURES (CPUID_7_0_EDX_FSRM | CPUID_7_0_EDX_KERNEL_FEATURES) #define TCG_7_1_EAX_FEATURES (CPUID_7_1_EAX_FZRM | CPUID_7_1_EAX_FSRS | \ - CPUID_7_1_EAX_FSRC) + CPUID_7_1_EAX_FSRC | CPUID_7_1_EAX_CMPCCXADD) #define TCG_7_1_EDX_FEATURES 0 #define TCG_7_2_EDX_FEATURES 0 #define TCG_APM_FEATURES 0 diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc index 717d7307722..426c4594120 100644 --- a/target/i386/tcg/decode-new.c.inc +++ b/target/i386/tcg/decode-new.c.inc @@ -538,6 +538,28 @@ static const X86OpEntry opcodes_0F38_00toEF[240] = { [0xdd] = X86_OP_ENTRY3(VAESENCLAST, V,x, H,x, W,x, vex4 cpuid(AES) p_66), [0xde] = X86_OP_ENTRY3(VAESDEC, V,x, H,x, W,x, vex4 cpuid(AES) p_66), [0xdf] = X86_OP_ENTRY3(VAESDECLAST, V,x, H,x, W,x, vex4 cpuid(AES) p_66), + +/* + * REG selects srcdest2 operand, VEX. selects src3. VEX class not found + * in manual, assumed to be 13 from the VEX.L0 constraint. + */ +[0xe0] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), +[0xe1] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), +[0xe2] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), +[0xe3] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), +[0xe4] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), +[0xe5] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), +[0xe6] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), +[0xe7] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), + +[0xe8] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), +[0xe9] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), +[0xea] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), +[0xeb] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), +[0xec] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), +[0xed] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), +[0xee] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), +[0xef] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66), }; /* five rows for no prefix, 66, F3, F2, 66+F2 */ @@ -1503,6 +1525,9 @@ static bool has_cpuid_feature(DisasContext *s, X86CPUIDFeature cpuid) return (s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2); case X86_FEAT_SHA_NI: return (s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_SHA_NI); + +case X86_FEAT_CMPCCXADD: 
+return (s->cpuid_7_1_eax_features & CPUID_7_1_EAX_CMPCCXADD); } g_assert_not_reached(); } diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h index 25220fc4362..15e6bfef4b1 100644 --- a/target/i386/tcg/decode-new.h +++ b/target/i386/tcg/decode-new.h @@ -104,6 +104,7 @@ typedef enum X86CPUIDFeature { X86_FEAT_AVX2, X86_FEAT_BMI1, X86_FEAT_BMI2, +X86_FEAT_CMPCCXADD, X86_FEAT_F16C, X86_FEAT_FMA, X86_FEAT_MOVBE, diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc index fd120e7b9b4..f05f79e1f62 100644 --- a/target/i386/tcg/emit.c.inc +++ b/target/i386/tcg/emit.c.inc @@ -1190,6 +1190,109 @@ static void gen_BZHI(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode) prepare_update2_cc(decode, s, CC_OP_BMILGB + ot); } +static void gen_CMPccXADD(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode) +{ +TCGLabel *label_top = gen_new_label(); +TCGLabel *label_bottom = gen_new_label(); +TCGv oldv = tcg_temp_new(); +TCGv newv = tcg_temp_new(); +TCGv cmpv = tcg_temp_new();
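For reference, the intended semantics in C-like form (a hedged paraphrase; the Intel ISE document is authoritative): the flags come from comparing the old memory value with the first register operand, that register always receives the old value, and the add happens only when the condition holds, all within one atomic read-modify-write. The helper below is hypothetical; cc stands for any of the 16 condition codes:

    /* Sketch of CMPccXADD semantics; the atomicity of the whole sequence
     * is elided here. cc() is evaluated on the flags of old - *reg. */
    #include <stdbool.h>
    #include <stdint.h>

    static uint64_t cmpccxadd(uint64_t *mem, uint64_t *reg, uint64_t addend,
                              bool (*cc)(uint64_t old, uint64_t cmp))
    {
        uint64_t old = *mem;          /* one atomic RMW on real hardware */
        if (cc(old, *reg)) {
            *mem = old + addend;      /* xadd only if the condition holds */
        }
        *reg = old;                   /* old value is always written back */
        return old;
    }

This is also why the unconditional cmpxchg in the patch works: storing the unmodified old value back when the condition fails is indistinguishable from not storing at all.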
[PATCH 08/22] target/i386: avoid trunc and ext for MULX and RORX
Use _tl operations for 32-bit operands on 32-bit targets, and only go through trunc and extu ops for 64-bit targets. While the trunc/ext ops should be pretty much free after optimization, the optimizer also does not like having the same temporary used in multiple EBBs. Therefore it is nicer to not use tmpN* unless necessary. Signed-off-by: Paolo Bonzini --- target/i386/tcg/emit.c.inc | 37 + 1 file changed, 25 insertions(+), 12 deletions(-) diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc index 98c4c9569ef..f5e44117eab 100644 --- a/target/i386/tcg/emit.c.inc +++ b/target/i386/tcg/emit.c.inc @@ -1348,7 +1348,8 @@ static void gen_MULX(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode) /* low part of result in VEX., high in MODRM */ switch (ot) { -default: +case MO_32: +#ifdef TARGET_X86_64 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0); tcg_gen_trunc_tl_i32(s->tmp3_i32, s->T1); tcg_gen_mulu2_i32(s->tmp2_i32, s->tmp3_i32, @@ -1356,13 +1357,15 @@ static void gen_MULX(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode) tcg_gen_extu_i32_tl(cpu_regs[s->vex_v], s->tmp2_i32); tcg_gen_extu_i32_tl(s->T0, s->tmp3_i32); break; -#ifdef TARGET_X86_64 -case MO_64: -tcg_gen_mulu2_i64(cpu_regs[s->vex_v], s->T0, s->T0, s->T1); -break; -#endif -} +case MO_64: +#endif +tcg_gen_mulu2_tl(cpu_regs[s->vex_v], s->T0, s->T0, s->T1); +break; + +default: +g_assert_not_reached(); +} } static void gen_PALIGNR(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode) @@ -1765,14 +1768,24 @@ static void gen_PSLLDQ_i(DisasContext *s, CPUX86State *env, X86DecodedInsn *deco static void gen_RORX(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode) { MemOp ot = decode->op[0].ot; -int b = decode->immediate; +int mask = ot == MO_64 ? 63 : 31; +int b = decode->immediate & mask; -if (ot == MO_64) { -tcg_gen_rotri_tl(s->T0, s->T0, b & 63); -} else { +switch (ot) { +case MO_32: +#ifdef TARGET_X86_64 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0); -tcg_gen_rotri_i32(s->tmp2_i32, s->tmp2_i32, b & 31); +tcg_gen_rotri_i32(s->tmp2_i32, s->tmp2_i32, b); tcg_gen_extu_i32_tl(s->T0, s->tmp2_i32); +break; + +case MO_64: +#endif +tcg_gen_rotri_tl(s->T0, s->T0, b); +break; + +default: +g_assert_not_reached(); } } -- 2.43.0
[PATCH 05/22] target/i386: clean up cpu_cc_compute_all
cpu_cc_compute_all() has an argument that is always equal to CC_OP for historical reasons (dating back to commit a7812ae4123, "TCG variable type checking.", 2008-11-17, which added the argument to helper_cc_compute_all). It does not make sense for the argument to have any other value, so remove it and clean up some lines that are not too long anymore. Signed-off-by: Paolo Bonzini --- target/i386/cpu.h | 4 ++-- target/i386/tcg/cc_helper.c | 6 +++--- target/i386/tcg/fpu_helper.c | 10 -- target/i386/tcg/int_helper.c | 8 target/i386/tcg/misc_helper.c | 2 +- target/i386/tcg/seg_helper.c | 8 6 files changed, 18 insertions(+), 20 deletions(-) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index ef987f344cf..ecdd4518c64 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -2344,13 +2344,13 @@ void cpu_x86_inject_mce(Monitor *mon, X86CPU *cpu, int bank, uint64_t status, uint64_t mcg_status, uint64_t addr, uint64_t misc, int flags); -uint32_t cpu_cc_compute_all(CPUX86State *env1, int op); +uint32_t cpu_cc_compute_all(CPUX86State *env1); static inline uint32_t cpu_compute_eflags(CPUX86State *env) { uint32_t eflags = env->eflags; if (tcg_enabled()) { -eflags |= cpu_cc_compute_all(env, CC_OP) | (env->df & DF_MASK); +eflags |= cpu_cc_compute_all(env) | (env->df & DF_MASK); } return eflags; } diff --git a/target/i386/tcg/cc_helper.c b/target/i386/tcg/cc_helper.c index c310bd842f1..f76e9cb8cfb 100644 --- a/target/i386/tcg/cc_helper.c +++ b/target/i386/tcg/cc_helper.c @@ -220,9 +220,9 @@ target_ulong helper_cc_compute_all(target_ulong dst, target_ulong src1, } } -uint32_t cpu_cc_compute_all(CPUX86State *env, int op) +uint32_t cpu_cc_compute_all(CPUX86State *env) { -return helper_cc_compute_all(CC_DST, CC_SRC, CC_SRC2, op); +return helper_cc_compute_all(CC_DST, CC_SRC, CC_SRC2, CC_OP); } target_ulong helper_cc_compute_c(target_ulong dst, target_ulong src1, @@ -335,7 +335,7 @@ target_ulong helper_read_eflags(CPUX86State *env) { uint32_t eflags; -eflags = cpu_cc_compute_all(env, CC_OP); +eflags = cpu_cc_compute_all(env); eflags |= (env->df & DF_MASK); eflags |= env->eflags & ~(VM_MASK | RF_MASK); return eflags; diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c index 4430d3d380c..4b965a5d6c4 100644 --- a/target/i386/tcg/fpu_helper.c +++ b/target/i386/tcg/fpu_helper.c @@ -484,9 +484,8 @@ void helper_fcomi_ST0_FT0(CPUX86State *env) FloatRelation ret; ret = floatx80_compare(ST0, FT0, &env->fp_status); -eflags = cpu_cc_compute_all(env, CC_OP); -eflags = (eflags & ~(CC_Z | CC_P | CC_C)) | fcomi_ccval[ret + 1]; -CC_SRC = eflags; +eflags = cpu_cc_compute_all(env) & ~(CC_Z | CC_P | CC_C); +CC_SRC = eflags | fcomi_ccval[ret + 1]; merge_exception_flags(env, old_flags); } @@ -497,9 +496,8 @@ void helper_fucomi_ST0_FT0(CPUX86State *env) FloatRelation ret; ret = floatx80_compare_quiet(ST0, FT0, &env->fp_status); -eflags = cpu_cc_compute_all(env, CC_OP); -eflags = (eflags & ~(CC_Z | CC_P | CC_C)) | fcomi_ccval[ret + 1]; -CC_SRC = eflags; +eflags = cpu_cc_compute_all(env) & ~(CC_Z | CC_P | CC_C); +CC_SRC = eflags | fcomi_ccval[ret + 1]; merge_exception_flags(env, old_flags); } diff --git a/target/i386/tcg/int_helper.c b/target/i386/tcg/int_helper.c index 05418f181f1..ab85dc55400 100644 --- a/target/i386/tcg/int_helper.c +++ b/target/i386/tcg/int_helper.c @@ -190,7 +190,7 @@ void helper_aaa(CPUX86State *env) int al, ah, af; int eflags; -eflags = cpu_cc_compute_all(env, CC_OP); +eflags = cpu_cc_compute_all(env); af = eflags & CC_A; al = env->regs[R_EAX] & 0xff; ah = (env->regs[R_EAX] >> 8) &
0xff; @@ -214,7 +214,7 @@ void helper_aas(CPUX86State *env) int al, ah, af; int eflags; -eflags = cpu_cc_compute_all(env, CC_OP); +eflags = cpu_cc_compute_all(env); af = eflags & CC_A; al = env->regs[R_EAX] & 0xff; ah = (env->regs[R_EAX] >> 8) & 0xff; @@ -237,7 +237,7 @@ void helper_daa(CPUX86State *env) int old_al, al, af, cf; int eflags; -eflags = cpu_cc_compute_all(env, CC_OP); +eflags = cpu_cc_compute_all(env); cf = eflags & CC_C; af = eflags & CC_A; old_al = al = env->regs[R_EAX] & 0xff; @@ -264,7 +264,7 @@ void helper_das(CPUX86State *env) int al, al1, af, cf; int eflags; -eflags = cpu_cc_compute_all(env, CC_OP); +eflags = cpu_cc_compute_all(env); cf = eflags & CC_C; af = eflags & CC_A; al = env->regs[R_EAX] & 0xff; diff --git a/target/i386/tcg/misc_helper.c b/target/i386/tcg/misc_helper.c index 66b332a83c1..b0f0f7b893b 100644 --- a/target/i386/tcg/misc_helper.c +++ b/target/i386/tcg/misc_helper.c @@ -41,7 +41,7 @@ void helper_into(CPUX86State *env, int next_eip_addend) { int eflags; -e
[PATCH 16/22] target/i386: do not use s->tmp0 for jumps on ECX ==/!= 0
Create a new temporary, to ease the register allocator's work. Creation of the temporary is pushed into gen_ext_tl, which also allows NULL as the first parameter now. Signed-off-by: Paolo Bonzini --- target/i386/tcg/translate.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index afe0fa6c65f..e5f71170967 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -720,6 +720,9 @@ static TCGv gen_ext_tl(TCGv dst, TCGv src, MemOp size, bool sign) if (size == MO_TL) { return src; } +if (!dst) { +dst = tcg_temp_new(); +} tcg_gen_ext_tl(dst, src, size | (sign ? MO_SIGN : 0)); return dst; } @@ -736,9 +739,9 @@ static void gen_exts(MemOp ot, TCGv reg) static void gen_op_j_ecx(DisasContext *s, TCGCond cond, TCGLabel *label1) { -tcg_gen_mov_tl(s->tmp0, cpu_regs[R_ECX]); -gen_extu(s->aflag, s->tmp0); -tcg_gen_brcondi_tl(cond, s->tmp0, 0, label1); +TCGv tmp = gen_ext_tl(NULL, cpu_regs[R_ECX], s->aflag, false); + +tcg_gen_brcondi_tl(cond, tmp, 0, label1); } static inline void gen_op_jz_ecx(DisasContext *s, TCGLabel *label1) -- 2.43.0
[PATCH 13/22] target/i386: do not clobber T0 on string operations
The new decoder would rather have the operand in T0 when expanding SCAS, rather than use R_EAX directly as gen_scas currently does. This makes SCAS more similar to CMP and SUB, in that CC_DST = T0 - T1. Signed-off-by: Paolo Bonzini --- target/i386/tcg/translate.c | 45 - 1 file changed, 24 insertions(+), 21 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index efef4e74d4c..00ed0cc9a31 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -522,9 +522,9 @@ void gen_op_add_reg_im(DisasContext *s, MemOp size, int reg, int32_t val) gen_op_mov_reg_v(s, size, reg, s->tmp0); } -static inline void gen_op_add_reg_T0(DisasContext *s, MemOp size, int reg) +static inline void gen_op_add_reg(DisasContext *s, MemOp size, int reg, TCGv val) { -tcg_gen_add_tl(s->tmp0, cpu_regs[reg], s->T0); +tcg_gen_add_tl(s->tmp0, cpu_regs[reg], val); gen_op_mov_reg_v(s, size, reg, s->tmp0); } @@ -707,10 +707,12 @@ static inline void gen_string_movl_A0_EDI(DisasContext *s) gen_lea_v_seg(s, s->aflag, cpu_regs[R_EDI], R_ES, -1); } -static inline void gen_op_movl_T0_Dshift(DisasContext *s, MemOp ot) +static inline TCGv gen_compute_Dshift(DisasContext *s, MemOp ot) { -tcg_gen_ld32s_tl(s->T0, tcg_env, offsetof(CPUX86State, df)); -tcg_gen_shli_tl(s->T0, s->T0, ot); +TCGv dshift = tcg_temp_new(); +tcg_gen_ld32s_tl(dshift, tcg_env, offsetof(CPUX86State, df)); +tcg_gen_shli_tl(dshift, dshift, ot); +return dshift; }; static TCGv gen_ext_tl(TCGv dst, TCGv src, MemOp size, bool sign) @@ -818,13 +820,16 @@ static bool gen_check_io(DisasContext *s, MemOp ot, TCGv_i32 port, static void gen_movs(DisasContext *s, MemOp ot) { +TCGv dshift; + gen_string_movl_A0_ESI(s); gen_op_ld_v(s, ot, s->T0, s->A0); gen_string_movl_A0_EDI(s); gen_op_st_v(s, ot, s->T0, s->A0); -gen_op_movl_T0_Dshift(s, ot); -gen_op_add_reg_T0(s, s->aflag, R_ESI); -gen_op_add_reg_T0(s, s->aflag, R_EDI); + +dshift = gen_compute_Dshift(s, ot); +gen_op_add_reg(s, s->aflag, R_ESI, dshift); +gen_op_add_reg(s, s->aflag, R_EDI, dshift); } static void gen_op_update1_cc(DisasContext *s) @@ -1249,8 +1254,7 @@ static void gen_stos(DisasContext *s, MemOp ot) gen_op_mov_v_reg(s, MO_32, s->T0, R_EAX); gen_string_movl_A0_EDI(s); gen_op_st_v(s, ot, s->T0, s->A0); -gen_op_movl_T0_Dshift(s, ot); -gen_op_add_reg_T0(s, s->aflag, R_EDI); +gen_op_add_reg(s, s->aflag, R_EDI, gen_compute_Dshift(s, ot)); } static void gen_lods(DisasContext *s, MemOp ot) @@ -1258,8 +1262,7 @@ static void gen_lods(DisasContext *s, MemOp ot) gen_string_movl_A0_ESI(s); gen_op_ld_v(s, ot, s->T0, s->A0); gen_op_mov_reg_v(s, ot, R_EAX, s->T0); -gen_op_movl_T0_Dshift(s, ot); -gen_op_add_reg_T0(s, s->aflag, R_ESI); +gen_op_add_reg(s, s->aflag, R_ESI, gen_compute_Dshift(s, ot)); } static void gen_scas(DisasContext *s, MemOp ot) @@ -1267,19 +1270,21 @@ static void gen_scas(DisasContext *s, MemOp ot) gen_string_movl_A0_EDI(s); gen_op_ld_v(s, ot, s->T1, s->A0); gen_op(s, OP_CMPL, ot, R_EAX); -gen_op_movl_T0_Dshift(s, ot); -gen_op_add_reg_T0(s, s->aflag, R_EDI); +gen_op_add_reg(s, s->aflag, R_EDI, gen_compute_Dshift(s, ot)); } static void gen_cmps(DisasContext *s, MemOp ot) { +TCGv dshift; + gen_string_movl_A0_EDI(s); gen_op_ld_v(s, ot, s->T1, s->A0); gen_string_movl_A0_ESI(s); gen_op(s, OP_CMPL, ot, OR_TMP0); -gen_op_movl_T0_Dshift(s, ot); -gen_op_add_reg_T0(s, s->aflag, R_ESI); -gen_op_add_reg_T0(s, s->aflag, R_EDI); + +dshift = gen_compute_Dshift(s, ot); +gen_op_add_reg(s, s->aflag, R_ESI, dshift); +gen_op_add_reg(s, s->aflag, R_EDI, dshift); } static void 
gen_bpt_io(DisasContext *s, TCGv_i32 t_port, int ot) @@ -1307,8 +1312,7 @@ static void gen_ins(DisasContext *s, MemOp ot) tcg_gen_andi_i32(s->tmp2_i32, s->tmp2_i32, 0xffff); gen_helper_in_func(ot, s->T0, s->tmp2_i32); gen_op_st_v(s, ot, s->T0, s->A0); -gen_op_movl_T0_Dshift(s, ot); -gen_op_add_reg_T0(s, s->aflag, R_EDI); +gen_op_add_reg(s, s->aflag, R_EDI, gen_compute_Dshift(s, ot)); gen_bpt_io(s, s->tmp2_i32, ot); } @@ -1321,8 +1325,7 @@ static void gen_outs(DisasContext *s, MemOp ot) tcg_gen_andi_i32(s->tmp2_i32, s->tmp2_i32, 0xffff); tcg_gen_trunc_tl_i32(s->tmp3_i32, s->T0); gen_helper_out_func(ot, s->tmp2_i32, s->tmp3_i32); -gen_op_movl_T0_Dshift(s, ot); -gen_op_add_reg_T0(s, s->aflag, R_ESI); +gen_op_add_reg(s, s->aflag, R_ESI, gen_compute_Dshift(s, ot)); gen_bpt_io(s, s->tmp2_i32, ot); } -- 2.43.0
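One convention worth restating, since gen_compute_Dshift() relies on it (note the sign-extending ld32s above): QEMU keeps DF as +1/-1 in env->df rather than as the architectural bit, so scaling it by the operand size yields the per-iteration step directly. A sketch with assumed values:

    /* Sketch: the string-op step computed by gen_compute_Dshift().
     * Assumes env->df holds +1 or -1, QEMU's internal convention. */
    #include <assert.h>

    int main(void)
    {
        int df = -1;                    /* DF set: strings run downwards */
        int ot = 2;                     /* MO_32: 4-byte elements */

        assert(df * (1 << ot) == -4);   /* EDI/ESI step for e.g. STOSD */
        return 0;
    }

The shli in the TCG version computes the same step on a two's-complement value.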
[PATCH 06/22] target/i386: document more deviations from the manual
Signed-off-by: Paolo Bonzini --- target/i386/tcg/decode-new.c.inc | 12 1 file changed, 12 insertions(+) diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc index 2bdbb1bba0f..232c6a45c96 100644 --- a/target/i386/tcg/decode-new.c.inc +++ b/target/i386/tcg/decode-new.c.inc @@ -26,6 +26,13 @@ * size (X86_SIZE_*) codes used in the manual. There are a few differences * though. * + * Operand sizes + * - + * + * The manual lists d64 ("cannot encode 32-bit size in 64-bit mode") and f64 + * ("cannot encode 16-bit or 32-bit size in 64-bit mode") as modifiers of the + * "v" or "z" sizes. The decoder simply makes them separate operand sizes. + * * Vector operands * --- * @@ -44,6 +51,11 @@ * if the difference is expressed via prefixes. Individual instructions * are separated by prefix in the generator functions. * + * There is a custom size "xh" used to address half of a SSE/AVX operand. + * This points to a 64-bit operand for SSE operations, 128-bit operand + * for 256-bit AVX operands, etc. It is used for conversion operations + * such as VCVTPH2PS or VCVTSS2SD. + * * There are a couple cases in which instructions (e.g. MOVD) write the * whole XMM or MM register but are established incorrectly in the manual * as "d" or "q". These have to be fixed for the decoder to work correctly. -- 2.43.0
[PATCH 15/22] target/i386: do not use s->tmp4 for push
Just create a temporary for the occasion. Signed-off-by: Paolo Bonzini --- target/i386/tcg/translate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index b79c312465b..afe0fa6c65f 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -2580,7 +2580,7 @@ static void gen_push_v(DisasContext *s, TCGv val) if (!CODE64(s)) { if (ADDSEG(s)) { -new_esp = s->tmp4; +new_esp = tcg_temp_new(); tcg_gen_mov_tl(new_esp, s->A0); } gen_lea_v_seg(s, a_ot, s->A0, R_SS, -1); -- 2.43.0
[PATCH 12/22] target/i386: do not clobber A0 in POP translation
The new decoder likes to compute the address in A0 very early, so the gen_lea_v_seg in gen_pop_T0 would clobber the address of the memory operand. Instead use T0 since it is already available and will be overwritten immediately after. Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/tcg/translate.c | 34 -- 1 file changed, 20 insertions(+), 14 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 73b83e07e23..efef4e74d4c 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -635,17 +635,17 @@ static TCGv eip_cur_tl(DisasContext *s) } } -/* Compute SEG:REG into A0. SEG is selected from the override segment +/* Compute SEG:REG into DEST. SEG is selected from the override segment (OVR_SEG) and the default segment (DEF_SEG). OVR_SEG may be -1 to indicate no override. */ -static void gen_lea_v_seg(DisasContext *s, MemOp aflag, TCGv a0, - int def_seg, int ovr_seg) +static void gen_lea_v_seg_dest(DisasContext *s, MemOp aflag, TCGv dest, TCGv a0, + int def_seg, int ovr_seg) { switch (aflag) { #ifdef TARGET_X86_64 case MO_64: if (ovr_seg < 0) { -tcg_gen_mov_tl(s->A0, a0); +tcg_gen_mov_tl(dest, a0); return; } break; @@ -656,14 +656,14 @@ static void gen_lea_v_seg(DisasContext *s, MemOp aflag, TCGv a0, ovr_seg = def_seg; } if (ovr_seg < 0) { -tcg_gen_ext32u_tl(s->A0, a0); +tcg_gen_ext32u_tl(dest, a0); return; } break; case MO_16: /* 16 bit address */ -tcg_gen_ext16u_tl(s->A0, a0); -a0 = s->A0; +tcg_gen_ext16u_tl(dest, a0); +a0 = dest; if (ovr_seg < 0) { if (ADDSEG(s)) { ovr_seg = def_seg; @@ -680,17 +680,23 @@ static void gen_lea_v_seg(DisasContext *s, MemOp aflag, TCGv a0, TCGv seg = cpu_seg_base[ovr_seg]; if (aflag == MO_64) { -tcg_gen_add_tl(s->A0, a0, seg); +tcg_gen_add_tl(dest, a0, seg); } else if (CODE64(s)) { -tcg_gen_ext32u_tl(s->A0, a0); -tcg_gen_add_tl(s->A0, s->A0, seg); +tcg_gen_ext32u_tl(dest, a0); +tcg_gen_add_tl(dest, dest, seg); } else { -tcg_gen_add_tl(s->A0, a0, seg); -tcg_gen_ext32u_tl(s->A0, s->A0); +tcg_gen_add_tl(dest, a0, seg); +tcg_gen_ext32u_tl(dest, dest); } } } +static void gen_lea_v_seg(DisasContext *s, MemOp aflag, TCGv a0, + int def_seg, int ovr_seg) +{ +gen_lea_v_seg_dest(s, aflag, s->A0, a0, def_seg, ovr_seg); +} + static inline void gen_string_movl_A0_ESI(DisasContext *s) { gen_lea_v_seg(s, s->aflag, cpu_regs[R_ESI], R_DS, s->override); @@ -2576,8 +2582,8 @@ static MemOp gen_pop_T0(DisasContext *s) { MemOp d_ot = mo_pushpop(s, s->dflag); -gen_lea_v_seg(s, mo_stacksize(s), cpu_regs[R_ESP], R_SS, -1); -gen_op_ld_v(s, d_ot, s->T0, s->A0); +gen_lea_v_seg_dest(s, mo_stacksize(s), s->T0, cpu_regs[R_ESP], R_SS, -1); +gen_op_ld_v(s, d_ot, s->T0, s->T0); return d_ot; } -- 2.43.0
[PATCH 07/22] target/i386: reimplement check for validity of LOCK prefix
The previous check erroneously allowed CMP to be modified with LOCK.
Instead, tag explicitly the instructions that do support LOCK.

Signed-off-by: Paolo Bonzini
---
 target/i386/tcg/decode-new.c.inc | 17 ++++++++++-------
 target/i386/tcg/decode-new.h     |  3 +++
 target/i386/tcg/emit.c.inc       |  5 -----
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 232c6a45c96..5eb2e9d0224 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -151,6 +151,7 @@
 #define cpuid(feat) .cpuid = X86_FEAT_##feat,
 
 #define xchg .special = X86_SPECIAL_Locked,
+#define lock .special = X86_SPECIAL_HasLock,
 #define mmx .special = X86_SPECIAL_MMX,
 #define zext0 .special = X86_SPECIAL_ZExtOp0,
 #define zext2 .special = X86_SPECIAL_ZExtOp2,
@@ -1103,10 +1104,6 @@ static int decode_modrm(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode,
 {
     int modrm = get_modrm(s, env);
     if ((modrm >> 6) == 3) {
-        if (s->prefix & PREFIX_LOCK) {
-            decode->e.gen = gen_illegal;
-            return 0xff;
-        }
         op->n = (modrm & 7);
         if (type != X86_TYPE_Q && type != X86_TYPE_N) {
             op->n |= REX_B(s);
@@ -1881,6 +1878,9 @@ static void disas_insn_new(DisasContext *s, CPUState *cpu, int b)
         if (decode.op[0].has_ea) {
             s->prefix |= PREFIX_LOCK;
         }
+        decode.e.special = X86_SPECIAL_HasLock;
+        /* fallthrough */
+    case X86_SPECIAL_HasLock:
         break;
 
     case X86_SPECIAL_ZExtOp0:
@@ -1909,6 +1909,12 @@ static void disas_insn_new(DisasContext *s, CPUState *cpu, int b)
         break;
     }
 
+    if (s->prefix & PREFIX_LOCK) {
+        if (decode.e.special != X86_SPECIAL_HasLock || !decode.op[0].has_ea) {
+            goto illegal_op;
+        }
+    }
+
     if (!validate_vex(s, &decode)) {
         return;
     }
@@ -1952,9 +1958,6 @@ static void disas_insn_new(DisasContext *s, CPUState *cpu, int b)
         gen_load_ea(s, &decode.mem, decode.e.vex_class == 12);
     }
     if (s->prefix & PREFIX_LOCK) {
-        if (decode.op[0].unit != X86_OP_INT || !decode.op[0].has_ea) {
-            goto illegal_op;
-        }
         gen_load(s, &decode, 2, s->T1);
         decode.e.gen(s, env, &decode);
     } else {
diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index e6c904a3192..611bfddd957 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -158,6 +158,9 @@ typedef enum X86InsnCheck {
 typedef enum X86InsnSpecial {
     X86_SPECIAL_None,
 
+    /* Accepts LOCK prefix; LOCKed operations do not load or writeback operand 0 */
+    X86_SPECIAL_HasLock,
+
     /* Always locked if it has a memory operand (XCHG) */
     X86_SPECIAL_Locked,
 
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index d444d83e534..98c4c9569ef 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -55,11 +55,6 @@ static void gen_NM_exception(DisasContext *s)
     gen_exception(s, EXCP07_PREX);
 }
 
-static void gen_illegal(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
-{
-    gen_illegal_opcode(s);
-}
-
 static void gen_load_ea(DisasContext *s, AddressParts *mem, bool is_vsib)
 {
     TCGv ea = gen_lea_modrm_1(s, *mem, is_vsib);
-- 
2.43.0
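To show how the new flag is meant to be used, here are hypothetical table entries; the one-byte ALU opcodes are not handled by the new decoder at this point in the series, so these lines are illustrative rather than part of the patch:

    /* ADD r/m8, r8: read-modify-write, so LOCK is legal with a memory
     * destination and the entry is tagged accordingly. */
    [0x00] = X86_OP_ENTRY2(ADD, E,b, G,b, lock),
    /* CMP r/m8, r8: no writeback, so the entry simply omits "lock" and
     * LOCK CMP now raises #UD instead of being accepted. */
    [0x38] = X86_OP_ENTRY2(CMP, E,b, G,b),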
[PATCH 11/22] target/i386: do not decode string source/destination into decode->mem
decode->mem is only used if one operand has has_ea == true.  String
operations will not use decode->mem and will load A0 on their own,
because they are the only case of two memory operands in a single
instruction.

Signed-off-by: Paolo Bonzini
---
 target/i386/tcg/decode-new.c.inc | 20 ++------------------
 1 file changed, 2 insertions(+), 18 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index d7a86d96c0c..99d18d2871e 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1212,6 +1212,8 @@ static bool decode_op(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode,
     case X86_TYPE_None:  /* Implicit or absent */
     case X86_TYPE_A:  /* Implicit */
     case X86_TYPE_F:  /* EFLAGS/RFLAGS */
+    case X86_TYPE_X:  /* string source */
+    case X86_TYPE_Y:  /* string destination */
         break;
 
     case X86_TYPE_B:  /* VEX.vvvv selects a GPR */
@@ -1346,24 +1348,6 @@ static bool decode_op(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode,
         op->n = insn_get(env, s, op->ot) >> 4;
         break;
 
-    case X86_TYPE_X:  /* string source */
-        op->n = -1;
-        decode->mem = (AddressParts) {
-            .def_seg = R_DS,
-            .base = R_ESI,
-            .index = -1,
-        };
-        break;
-
-    case X86_TYPE_Y:  /* string destination */
-        op->n = -1;
-        decode->mem = (AddressParts) {
-            .def_seg = R_ES,
-            .base = R_EDI,
-            .index = -1,
-        };
-        break;
-
     case X86_TYPE_2op:
         *op = decode->op[0];
         break;
-- 
2.43.0
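As context for why this works, MOVS — the canonical two-memory-operand instruction — computes each address into s->A0 only when it needs it. A simplified sketch (the helpers shown exist in translate.c; the body is abbreviated):

    static void gen_movs_sketch(DisasContext *s, MemOp ot)
    {
        gen_string_movl_A0_ESI(s);         /* s->A0 = seg:ESI, honors overrides */
        gen_op_ld_v(s, ot, s->T0, s->A0);
        gen_string_movl_A0_EDI(s);         /* s->A0 = ES:EDI, never overridden */
        gen_op_st_v(s, ot, s->T0, s->A0);
        /* ESI/EDI are then advanced by the direction-flag step (omitted) */
    }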
[PATCH 04/22] target/i386: remove unnecessary truncations
gen_lea_v_seg (called by gen_add_A0_ds_seg) already zeroes any
bits of s->A0 beyond s->aflag.  It does so before summing the segment
base and, if not in 64-bit mode, also after summing it.

Signed-off-by: Paolo Bonzini
---
 target/i386/tcg/emit.c.inc  | 4 +---
 target/i386/tcg/translate.c | 2 --
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 82da5488d47..d444d83e534 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1242,9 +1242,7 @@ static void gen_LDMXCSR(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 
 static void gen_MASKMOV(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
-    tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]);
-    gen_extu(s->aflag, s->A0);
-    gen_add_A0_ds_seg(s);
+    gen_lea_v_seg(s, s->aflag, cpu_regs[R_EDI], R_DS, s->override);
 
     if (s->prefix & PREFIX_DATA) {
         gen_helper_maskmov_xmm(tcg_env, OP_PTR1, OP_PTR2, s->A0);
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index a16eb8d4008..73b83e07e23 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -4183,7 +4183,6 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
         tcg_gen_mov_tl(s->A0, cpu_regs[R_EBX]);
         tcg_gen_ext8u_tl(s->T0, cpu_regs[R_EAX]);
         tcg_gen_add_tl(s->A0, s->A0, s->T0);
-        gen_extu(s->aflag, s->A0);
         gen_add_A0_ds_seg(s);
         gen_op_ld_v(s, MO_8, s->T0, s->A0);
         gen_op_mov_reg_v(s, MO_8, R_EAX, s->T0);
@@ -5835,7 +5834,6 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
             gen_update_cc_op(s);
             gen_update_eip_cur(s);
             tcg_gen_mov_tl(s->A0, cpu_regs[R_EAX]);
-            gen_extu(s->aflag, s->A0);
             gen_add_A0_ds_seg(s);
             gen_helper_monitor(tcg_env, s->A0);
             break;
-- 
2.43.0
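A worked example with XLAT under 16-bit addressing (arbitrary values, plain-C model of the emitted TCG ops):

    uint32_t a0 = 0x12345678 + 0x9A;   /* EBX + AL = 0x12345712 */
    a0 &= 0xffff;                      /* gen_lea_v_seg's ext16u: 0x5712 */
    /* only now is the DS base added, and outside 64-bit mode the sum is
     * truncated to 32 bits again -- the removed gen_extu() calls merely
     * duplicated this masking */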
[PATCH 09/22] target/i386: rename zext0/zext2 and make them closer to the manual
X86_SPECIAL_ZExtOp0 and X86_SPECIAL_ZExtOp2 are poorly named; they are a
hack that is needed by scalar insertion and extraction instructions, and
not really related to zero extension: for PEXTR the zero extension is
done by the generation functions, for PINSR the high bits are not used
at all and in fact are *not* filled with zeroes when loaded into s->T1.

Rename the values to match the effect described in the manual, and
explain better in the comments.

Signed-off-by: Paolo Bonzini
---
 target/i386/tcg/decode-new.c.inc | 16 ++++++++--------
 target/i386/tcg/decode-new.h     | 17 +++++++++++++----
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 5eb2e9d0224..00fdb243857 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -153,8 +153,8 @@
 #define xchg .special = X86_SPECIAL_Locked,
 #define lock .special = X86_SPECIAL_HasLock,
 #define mmx .special = X86_SPECIAL_MMX,
-#define zext0 .special = X86_SPECIAL_ZExtOp0,
-#define zext2 .special = X86_SPECIAL_ZExtOp2,
+#define op0_Rd .special = X86_SPECIAL_Op0_Rd,
+#define op2_Ry .special = X86_SPECIAL_Op2_Ry,
 #define avx_movx .special = X86_SPECIAL_AVXExtMov,
 
 #define vex1 .vex_class = 1,
@@ -632,13 +632,13 @@ static const X86OpEntry opcodes_0F3A[256] = {
     [0x05] = X86_OP_ENTRY3(VPERMILPD_i, V,x, W,x, I,b, vex6 chk(W0) cpuid(AVX) p_66),
     [0x06] = X86_OP_ENTRY4(VPERM2x128, V,qq, H,qq, W,qq, vex6 chk(W0) cpuid(AVX) p_66),
 
-    [0x14] = X86_OP_ENTRY3(PEXTRB, E,b, V,dq, I,b, vex5 cpuid(SSE41) zext0 p_66),
-    [0x15] = X86_OP_ENTRY3(PEXTRW, E,w, V,dq, I,b, vex5 cpuid(SSE41) zext0 p_66),
+    [0x14] = X86_OP_ENTRY3(PEXTRB, E,b, V,dq, I,b, vex5 cpuid(SSE41) op0_Rd p_66),
+    [0x15] = X86_OP_ENTRY3(PEXTRW, E,w, V,dq, I,b, vex5 cpuid(SSE41) op0_Rd p_66),
     [0x16] = X86_OP_ENTRY3(PEXTR, E,y, V,dq, I,b, vex5 cpuid(SSE41) p_66),
     [0x17] = X86_OP_ENTRY3(VEXTRACTPS, E,d, V,dq, I,b, vex5 cpuid(SSE41) p_66),
     [0x1d] = X86_OP_ENTRY3(VCVTPS2PH, W,xh, V,x, I,b, vex11 chk(W0) cpuid(F16C) p_66),
 
-    [0x20] = X86_OP_ENTRY4(PINSRB, V,dq, H,dq, E,b, vex5 cpuid(SSE41) zext2 p_66),
+    [0x20] = X86_OP_ENTRY4(PINSRB, V,dq, H,dq, E,b, vex5 cpuid(SSE41) op2_Ry p_66),
     [0x21] = X86_OP_GROUP0(VINSERTPS),
     [0x22] = X86_OP_ENTRY4(PINSR, V,dq, H,dq, E,y, vex5 cpuid(SSE41) p_66),
 
@@ -1883,17 +1883,17 @@ static void disas_insn_new(DisasContext *s, CPUState *cpu, int b)
     case X86_SPECIAL_HasLock:
         break;
 
-    case X86_SPECIAL_ZExtOp0:
+    case X86_SPECIAL_Op0_Rd:
         assert(decode.op[0].unit == X86_OP_INT);
         if (!decode.op[0].has_ea) {
             decode.op[0].ot = MO_32;
         }
         break;
 
-    case X86_SPECIAL_ZExtOp2:
+    case X86_SPECIAL_Op2_Ry:
         assert(decode.op[2].unit == X86_OP_INT);
         if (!decode.op[2].has_ea) {
-            decode.op[2].ot = MO_32;
+            decode.op[2].ot = s->dflag == MO_16 ? MO_32 : s->dflag;
         }
         break;
 
diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index 611bfddd957..b253f7457ae 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -165,11 +165,20 @@ typedef enum X86InsnSpecial {
     X86_SPECIAL_Locked,
 
     /*
-     * Register operand 0/2 is zero extended to 32 bits.  Rd/Mb or Rd/Mw
-     * in the manual.
+     * Rd/Mb or Rd/Mw in the manual: register operand 0 is treated as 32 bits
+     * (and writeback zero-extends it to 64 bits if applicable).  PREFIX_DATA
+     * does not trigger 16-bit writeback and, as a side effect, high-byte
+     * registers are never used.
      */
-    X86_SPECIAL_ZExtOp0,
-    X86_SPECIAL_ZExtOp2,
+    X86_SPECIAL_Op0_Rd,
+
+    /*
+     * Ry/Mb in the manual (PINSRB).  However, the high bits are never used by
+     * the instruction in either the register or memory cases; the *real* effect
+     * of this modifier is that high-byte registers are never used, even without
+     * a REX prefix.  Therefore, PINSRW does not need it despite having Ry/Mw.
+     */
+    X86_SPECIAL_Op2_Ry,
 
     /*
      * Register operand 2 is extended to full width, while a memory operand
-- 
2.43.0
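A plain-C model of what Op0_Rd means for PEXTRB may make the renaming concrete; the function below is invented for illustration and is not QEMU code:

    #include <stdbool.h>
    #include <stdint.h>

    static void pextrb_model(uint64_t *gpr, uint8_t *mem, bool op0_is_mem,
                             const uint8_t xmm[16], unsigned imm)
    {
        uint8_t byte = xmm[imm & 15];
        if (op0_is_mem) {
            *mem = byte;            /* Mb: an ordinary 8-bit store */
        } else {
            *gpr = (uint32_t)byte;  /* Rd: a 32-bit write, zero-extended to
                                       64 bits; 66h does not make it 16-bit */
        }
    }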