Re: [PATCH] s390x: step down as general arch maintainer

2022-10-11 Thread Thomas Huth

On 10/10/2022 18.09, Cornelia Huck wrote:

I haven't really been working on s390x for some time now, and in
practice, I don't have time for it, either. So let's remove myself
from this entry.

Signed-off-by: Cornelia Huck 
---
  MAINTAINERS | 2 --
  1 file changed, 2 deletions(-)


Cornelia, thank you again very much for all your s390x work that you did 
through all those years! If you ever want to come back to the heavy metal 
computers (those that you cannot throw that easily through a window ;-)), 
you're very welcome to send a reverting patch for this one here!


Queued to my s390x-next branch:

 https://gitlab.com/thuth/qemu/-/commits/s390x-next/

 Thomas




Re: [RISU PATCH 1/5] risu: Use alternate stack

2022-10-11 Thread gaosong



On 2022/10/10 22:43, Peter Maydell wrote:

On Mon, 10 Oct 2022 at 15:20, Richard Henderson
 wrote:

On 9/17/22 00:43, Song Gao wrote:

We can use an alternate stack, so that we can use the sp register as an
input/output register.
I have tested the aarch64/LoongArch architectures.

Signed-off-by: Song Gao
---
   risu.c | 16 +++-
   1 file changed, 15 insertions(+), 1 deletion(-)

Good idea.

Depending on the architecture there might still need to be
restrictions on use of the stack pointer, eg aarch64's
alignment requirements, but this at least means you can
in theory write some risu rules that use SP.

I really want to use the alternate stack, since it can reduce the number of risu rules.
What about using this only on the LoongArch architecture?

Thanks.
Song Gao




Re: [PATCH v3 0/5] vhost-user-blk: dynamically resize config space based on features

2022-10-11 Thread Daniil Tatianin




On 10/11/22 10:20 AM, Daniil Tatianin wrote:

Ping :)

Oops, didn't see the pull request. Disregard this.


On 9/6/22 10:31 AM, Daniil Tatianin wrote:

This patch set attempts to align vhost-user-blk with virtio-blk in
terms of backward compatibility and flexibility. It also improves
the virtio core by introducing new common code that can be used by
a virtio device to calculate its config space size.

In particular it adds the following things:
- Common virtio code for deducing the required device config size based
   on provided host features.
- Ability to disable modern virtio-blk features like
   discard/write-zeroes for vhost-user-blk.
- Dynamic configuration space resizing based on enabled features,
   by reusing the common code introduced in the earlier commits.
- Cleans up the VHostUserBlk structure by reusing parent fields.

Changes since v1 (mostly addresses Stefan's feedback):
- Introduce VirtIOConfigSizeParams & virtio_get_config_size
- Remove virtio_blk_set_config_size altogether, make virtio-blk-common.c
   only hold the virtio-blk config size parameters.
- Reuse parent fields in vhost-user-blk instead of introducing new ones.

Changes since v2:
- Squash the first four commits into one
- Set .min_size for virtio-net as well
- Move maintainer/meson user-blk bits to the last commit

Daniil Tatianin (5):
   virtio: introduce VirtIOConfigSizeParams & virtio_get_config_size
   virtio-blk: move config size params to virtio-blk-common
   vhost-user-blk: make it possible to disable write-zeroes/discard
   vhost-user-blk: make 'config_wce' part of 'host_features'
   vhost-user-blk: dynamically resize config space based on features

  MAINTAINERS   |  4 +++
  hw/block/meson.build  |  4 +--
  hw/block/vhost-user-blk.c | 29 +++-
  hw/block/virtio-blk-common.c  | 39 +++
  hw/block/virtio-blk.c | 28 +++
  hw/net/virtio-net.c   |  9 +--
  hw/virtio/virtio.c    | 10 ---
  include/hw/virtio/vhost-user-blk.h    |  1 -
  include/hw/virtio/virtio-blk-common.h | 20 ++
  include/hw/virtio/virtio.h    | 10 +--
  10 files changed, 105 insertions(+), 49 deletions(-)
  create mode 100644 hw/block/virtio-blk-common.c
  create mode 100644 include/hw/virtio/virtio-blk-common.h





Re: [PATCH v3 0/5] vhost-user-blk: dynamically resize config space based on features

2022-10-11 Thread Daniil Tatianin

Ping :)

On 9/6/22 10:31 AM, Daniil Tatianin wrote:

This patch set attempts to align vhost-user-blk with virtio-blk in
terms of backward compatibility and flexibility. It also improves
the virtio core by introducing new common code that can be used by
a virtio device to calculate its config space size.

In particular it adds the following things:
- Common virtio code for deducing the required device config size based
   on provided host features.
- Ability to disable modern virtio-blk features like
   discard/write-zeroes for vhost-user-blk.
- Dynamic configuration space resizing based on enabled features,
   by reusing the common code introduced in the earlier commits.
- Cleans up the VHostUserBlk structure by reusing parent fields.

Changes since v1 (mostly addresses Stefan's feedback):
- Introduce VirtIOConfigSizeParams & virtio_get_config_size
- Remove virtio_blk_set_config_size altogether, make virtio-blk-common.c
   only hold the virtio-blk config size parameters.
- Reuse parent fields in vhost-user-blk instead of introducing new ones.

Changes since v2:
- Squash the first four commits into one
- Set .min_size for virtio-net as well
- Move maintainer/meson user-blk bits to the last commit

Daniil Tatianin (5):
   virtio: introduce VirtIOConfigSizeParams & virtio_get_config_size
   virtio-blk: move config size params to virtio-blk-common
   vhost-user-blk: make it possible to disable write-zeroes/discard
   vhost-user-blk: make 'config_wce' part of 'host_features'
   vhost-user-blk: dynamically resize config space based on features

  MAINTAINERS   |  4 +++
  hw/block/meson.build  |  4 +--
  hw/block/vhost-user-blk.c | 29 +++-
  hw/block/virtio-blk-common.c  | 39 +++
  hw/block/virtio-blk.c | 28 +++
  hw/net/virtio-net.c   |  9 +--
  hw/virtio/virtio.c| 10 ---
  include/hw/virtio/vhost-user-blk.h|  1 -
  include/hw/virtio/virtio-blk-common.h | 20 ++
  include/hw/virtio/virtio.h| 10 +--
  10 files changed, 105 insertions(+), 49 deletions(-)
  create mode 100644 hw/block/virtio-blk-common.c
  create mode 100644 include/hw/virtio/virtio-blk-common.h





Re: [PATCH v9 01/10] s390x/cpus: Make absence of multithreading clear

2022-10-11 Thread Pierre Morel




On 9/28/22 18:28, Cédric Le Goater wrote:

On 9/28/22 18:16, Pierre Morel wrote:
Thinking about this some more, I will drop this patch for backward
compatibility and, in the topology masks, treat CPUs as cores*threads.


yes. You never know, people might have set threads=2 in their
domain file (like me). You could give the user a warning though,
with warn_report().


Thinking further, I have come back to the first idea after Daniel's comment, and
will protect the change with a new machine type version.





Thanks,

C.






--
Pierre Morel
IBM Lab Boeblingen



Re: [PATCH v9 01/10] s390x/cpus: Make absence of multithreading clear

2022-10-11 Thread Cédric Le Goater

On 10/11/22 09:21, Pierre Morel wrote:



On 9/28/22 18:28, Cédric Le Goater wrote:

On 9/28/22 18:16, Pierre Morel wrote:

Thinking about this some more, I will drop this patch for backward compatibility and
in the topology masks treat CPUs as cores*threads.


yes. You never know, people might have set threads=2 in their
domain file (like me). You could give the user a warning though,
with warn_report().


Thinking further, I have come back to the first idea after Daniel's comment, and will
protect the change with a new machine type version.


yes. That would be another machine class attribute to set in the new machine,
maybe 'max_threads' to compare with the user-provided value.

C.






Thanks,

C.











Re: [PATCH v2 4/7] util: Add write-only "node-affinity" property for ThreadContext

2022-10-11 Thread David Hildenbrand

On 11.10.22 08:03, Markus Armbruster wrote:

David Hildenbrand  writes:


Let's make it easier to pin threads created via a ThreadContext to
all CPUs currently belonging to a given set of NUMA nodes -- which is the
common case.

"node-affinity" is simply a shortcut for setting "cpu-affinity" manually
to the list of CPUs belonging to the set of nodes. This property can only
be written.

A simple QEMU example to set the CPU affinity to Node 1 on a system with
two NUMA nodes, 24 CPUs each:
 qemu-system-x86_64 -S \
   -object thread-context,id=tc1,node-affinity=1

And we can query the cpu-affinity via HMP/QMP:
 (qemu) qom-get tc1 cpu-affinity
 [
 1,
 3,
 5,
 7,
 9,
 11,
 13,
 15,
 17,
 19,
 21,
 23,
 25,
 27,
 29,
 31,
 33,
 35,
 37,
 39,
 41,
 43,
 45,
 47
 ]


Double-checking my understanding: on this system, the even CPUs belong
to NUMA node 0, and the odd ones to node 1.  Setting node-affinity=1 is
therefore sugar for setting cpu-affinity to the set of odd CPUs.
Correct?


Yes!

# lscpu
...
NUMA:
  NUMA node(s):  2
  NUMA node0 CPU(s): 
0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46
  NUMA node1 CPU(s): 
1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47





We cannot query the node-affinity:
 (qemu) qom-get tc1 node-affinity
 Error: Insufficient permission to perform this operation


The error message is somewhat misleading.  "Insufficient permission"
suggests this could work if I had more "permission".  Not the case.  The
message comes from object_property_get(), i.e. it's not this patch's
fault.  I'll post a patch to improve it.



I agree, thanks!

--
Thanks,

David / dhildenb




Re: [PATCH v2 3/7] util: Introduce ThreadContext user-creatable object

2022-10-11 Thread David Hildenbrand

But note that due to dynamic library loading this example will not work
before we actually make use of thread_context_create_thread() in QEMU
code, because the type will otherwise not get registered.


What do you mean exactly by "not work"?  It's not "CLI option or HMP
command fails":



For me, if I compile patch #1-#3 only, I get:

$ ./build/qemu-system-x86_64 -S -display none -nodefaults -monitor stdio 
-object thread-context,id=tc1,cpu-affinity=0-1,cpu-affinity=6-7

qemu-system-x86_64: invalid object type: thread-context


Reason is that, without a call to thread_context_create_thread(), we 
won't trigger type_init(thread_context_register_types) and consequently, 
the type won't be registered.


Is it really different in your environment? Maybe it depends on the QEMU 
config?



 $ upstream-qemu -S -display none -nodefaults -monitor stdio -object 
thread-context,id=tc1,cpu-affinity=0-1,cpu-affinity=6-7
 QEMU 7.1.50 monitor - type 'help' for more information
 (qemu) qom-get tc1 cpu-affinity
 [
 0,
 1,
 6,
 7
 ]
 (qemu) info cpus
 * CPU #0: thread_id=1670613

Even though the affinities refer to nonexistent CPUs :)


CPU affinities are CPU numbers on your system (host), not QEMU vCPU 
numbers. I could talk about physical CPU numbers in the doc here, 
although I am not sure if that really helps. What about "host CPU 
numbers" and in patch #4 "host node numbers"?


Seems to match what we document for @MemoryBackendProperties: 
"@host-nodes: the list of NUMA host nodes to bind the memory to"




But unrelated to that, pthread_setaffinity_np() won't bail out on CPUs 
that are currently not available in the host -- because one might 
online/hotplug them later. It only bails out if none of the CPUs is 
currently available in the host:


https://man7.org/linux/man-pages/man3/pthread_setaffinity_np.3.html


   EINVAL (pthread_setaffinity_np()) The affinity bit mask mask
  contains no processors that are currently physically on
  the system and permitted to the thread according to any
  restrictions that may be imposed by the "cpuset" mechanism
  described in cpuset(7).

It will bail out on CPUs that cannot be available in the host though, 
because it's impossible due to the kernel config:



   EINVAL (pthread_setaffinity_np()) cpuset specified a CPU that was
  outside the set supported by the kernel.  (The kernel
  configuration option CONFIG_NR_CPUS defines the range of
  the set supported by the kernel data type used to
  represent CPU sets.)





A ThreadContext can be reused, simply by reconfiguring the CPU affinity.


So, when a thread is created, its affinity comes from its thread context
(if any).  When I later change the context's affinity, it does *not*
affect existing threads, only future ones.  Correct?


Yes, that's the current state.




Reviewed-by: Michal Privoznik 
Signed-off-by: David Hildenbrand 
---
  include/qemu/thread-context.h |  57 +++
  qapi/qom.json |  17 +++
  util/meson.build  |   1 +
  util/oslib-posix.c|   1 +
  util/thread-context.c | 278 ++
  5 files changed, 354 insertions(+)
  create mode 100644 include/qemu/thread-context.h
  create mode 100644 util/thread-context.c

diff --git a/include/qemu/thread-context.h b/include/qemu/thread-context.h
new file mode 100644
index 00..2ebd6b7fe1
--- /dev/null
+++ b/include/qemu/thread-context.h
@@ -0,0 +1,57 @@
+/*
+ * QEMU Thread Context
+ *
+ * Copyright Red Hat Inc., 2022
+ *
+ * Authors:
+ *  David Hildenbrand 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef SYSEMU_THREAD_CONTEXT_H
+#define SYSEMU_THREAD_CONTEXT_H
+
+#include "qapi/qapi-types-machine.h"
+#include "qemu/thread.h"
+#include "qom/object.h"
+
+#define TYPE_THREAD_CONTEXT "thread-context"
+OBJECT_DECLARE_TYPE(ThreadContext, ThreadContextClass,
+THREAD_CONTEXT)
+
+struct ThreadContextClass {
+ObjectClass parent_class;
+};
+
+struct ThreadContext {
+/* private */
+Object parent;
+
+/* private */
+unsigned int thread_id;
+QemuThread thread;
+
+/* Semaphore to wait for context thread action. */
+QemuSemaphore sem;
+/* Semaphore to wait for action in context thread. */
+QemuSemaphore sem_thread;
+/* Mutex to synchronize requests. */
+QemuMutex mutex;
+
+/* Commands for the thread to execute. */
+int thread_cmd;
+void *thread_cmd_data;
+
+/* CPU affinity bitmap used for initialization. */
+unsigned long *init_cpu_bitmap;
+int init_cpu_nbits;
+};
+
+void thread_context_create_thread(ThreadContext *tc, QemuThread *thread,
+  const char *name,
+  void *(*start_routine)(void *),

Re: [PATCH v3] m68k: write bootinfo as rom section and re-randomize on reboot

2022-10-11 Thread Laurent Vivier

Le 03/10/2022 à 13:02, Jason A. Donenfeld a écrit :

Rather than poking directly into RAM, add the bootinfo block as a proper
ROM, so that it's restored when rebooting the system. This way, if the
guest corrupts any of the bootinfo items, but then tries to reboot,
it'll still be restored back to normal as expected.

Then, since the RNG seed needs to be fresh on each boot, regenerate the
RNG seed in the ROM when resetting the CPU.


As it needs to be refreshed, I think it would be better not to use a ROM and to
regenerate all the bootinfo data on reset. This will also avoid the conditional g_malloc().


Thanks,
Laurent



Cc: Geert Uytterhoeven 
Cc: Laurent Vivier 
Signed-off-by: Jason A. Donenfeld 
---
  hw/m68k/bootinfo.h | 48 +++
  hw/m68k/q800.c | 71 +-
  hw/m68k/virt.c | 51 +++--
  3 files changed, 111 insertions(+), 59 deletions(-)

diff --git a/hw/m68k/bootinfo.h b/hw/m68k/bootinfo.h
index 897162b818..eb92937cf6 100644
--- a/hw/m68k/bootinfo.h
+++ b/hw/m68k/bootinfo.h
@@ -12,66 +12,66 @@
  #ifndef HW_M68K_BOOTINFO_H
  #define HW_M68K_BOOTINFO_H
  
-#define BOOTINFO0(as, base, id) \

+#define BOOTINFO0(base, id) \
  do { \
-stw_phys(as, base, id); \
+stw_p(base, id); \
  base += 2; \
-stw_phys(as, base, sizeof(struct bi_record)); \
+stw_p(base, sizeof(struct bi_record)); \
  base += 2; \
  } while (0)
  
-#define BOOTINFO1(as, base, id, value) \

+#define BOOTINFO1(base, id, value) \
  do { \
-stw_phys(as, base, id); \
+stw_p(base, id); \
  base += 2; \
-stw_phys(as, base, sizeof(struct bi_record) + 4); \
+stw_p(base, sizeof(struct bi_record) + 4); \
  base += 2; \
-stl_phys(as, base, value); \
+stl_p(base, value); \
  base += 4; \
  } while (0)
  
-#define BOOTINFO2(as, base, id, value1, value2) \

+#define BOOTINFO2(base, id, value1, value2) \
  do { \
-stw_phys(as, base, id); \
+stw_p(base, id); \
  base += 2; \
-stw_phys(as, base, sizeof(struct bi_record) + 8); \
+stw_p(base, sizeof(struct bi_record) + 8); \
  base += 2; \
-stl_phys(as, base, value1); \
+stl_p(base, value1); \
  base += 4; \
-stl_phys(as, base, value2); \
+stl_p(base, value2); \
  base += 4; \
  } while (0)
  
-#define BOOTINFOSTR(as, base, id, string) \

+#define BOOTINFOSTR(base, id, string) \
  do { \
  int i; \
-stw_phys(as, base, id); \
+stw_p(base, id); \
  base += 2; \
-stw_phys(as, base, \
+stw_p(base, \
   (sizeof(struct bi_record) + strlen(string) + \
1 /* null termination */ + 3 /* padding */) & ~3); \
  base += 2; \
  for (i = 0; string[i]; i++) { \
-stb_phys(as, base++, string[i]); \
+stb_p(base++, string[i]); \
  } \
-stb_phys(as, base++, 0); \
-base = (base + 3) & ~3; \
+stb_p(base++, 0); \
+base = (void *)(((unsigned long)base + 3) & ~3); \
  } while (0)
  
-#define BOOTINFODATA(as, base, id, data, len) \

+#define BOOTINFODATA(base, id, data, len) \
  do { \
  int i; \
-stw_phys(as, base, id); \
+stw_p(base, id); \
  base += 2; \
-stw_phys(as, base, \
+stw_p(base, \
   (sizeof(struct bi_record) + len + \
2 /* length field */ + 3 /* padding */) & ~3); \
  base += 2; \
-stw_phys(as, base, len); \
+stw_p(base, len); \
  base += 2; \
  for (i = 0; i < len; ++i) { \
-stb_phys(as, base++, data[i]); \
+stb_p(base++, data[i]); \
  } \
-base = (base + 3) & ~3; \
+base = (void *)(((unsigned long)base + 3) & ~3); \
  } while (0)
  #endif
diff --git a/hw/m68k/q800.c b/hw/m68k/q800.c
index a4590c2cb0..e09e244ddc 100644
--- a/hw/m68k/q800.c
+++ b/hw/m68k/q800.c
@@ -321,11 +321,22 @@ static const TypeInfo glue_info = {
  },
  };
  
+typedef struct {

+M68kCPU *cpu;
+struct bi_record *rng_seed;
+} ResetInfo;
+
  static void main_cpu_reset(void *opaque)
  {
-M68kCPU *cpu = opaque;
+ResetInfo *reset_info = opaque;
+M68kCPU *cpu = reset_info->cpu;
  CPUState *cs = CPU(cpu);
  
+if (reset_info->rng_seed) {

+qemu_guest_getrandom_nofail((void *)reset_info->rng_seed->data + 2,
+be16_to_cpu(*(uint16_t *)reset_info->rng_seed->data));
+}
+
  cpu_reset(cs);
  cpu->env.aregs[7] = ldl_phys(cs->as, 0);
  cpu->env.pc = ldl_phys(cs->as, 4);
@@ -386,6 +397,7 @@ static void q800_init(MachineState *machine)
  NubusBus *nubus;
  DeviceState *glue;
  DriveInfo *dinfo;
+ResetInfo *reset_info;
  uint8_t rng_seed[32];
  
  linux_boot = (kernel_filename != NULL);

@@ -396

Re: [RFC PATCH 1/6] qemu/bswap: Add const_le64()

2022-10-11 Thread Jonathan Cameron via
On Mon, 10 Oct 2022 15:29:39 -0700
ira.we...@intel.com wrote:

> From: Ira Weiny 
> 
> Gcc requires constant versions of cpu_to_le* calls.
> 
> Add a 64 bit version.
> 
> Signed-off-by: Ira Weiny 

Seems reasonable to me but I'm not an expert in this stuff.
FWIW

Reviewed-by: Jonathan Cameron 

There are probably a lot of places in the CXL emulation where
our endian handling isn't correct but so far it hasn't mattered
as all the supported architectures are little endian.

Good to not introduce more cases however!

Jonathan


> ---
>  include/qemu/bswap.h | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/include/qemu/bswap.h b/include/qemu/bswap.h
> index 346d05f2aab3..08e607821102 100644
> --- a/include/qemu/bswap.h
> +++ b/include/qemu/bswap.h
> @@ -192,10 +192,20 @@ CPU_CONVERT(le, 64, uint64_t)
>   (((_x) & 0x0000ff00U) <<  8) |  \
>   (((_x) & 0x00ff0000U) >>  8) |  \
>   (((_x) & 0xff000000U) >> 24))
> +# define const_le64(_x)  \
> +((((_x) & 0x00000000000000ffU) << 56) |  \
> + (((_x) & 0x000000000000ff00U) << 40) |  \
> + (((_x) & 0x0000000000ff0000U) << 24) |  \
> + (((_x) & 0x00000000ff000000U) <<  8) |  \
> + (((_x) & 0x000000ff00000000U) >>  8) |  \
> + (((_x) & 0x0000ff0000000000U) >> 24) |  \
> + (((_x) & 0x00ff000000000000U) >> 40) |  \
> + (((_x) & 0xff00000000000000U) >> 56))
>  # define const_le16(_x)  \
>  ((((_x) & 0x00ff) << 8) |\
>   (((_x) & 0xff00) >> 8))
>  #else
> +# define const_le64(_x) (_x)
>  # define const_le32(_x) (_x)
>  # define const_le16(_x) (_x)
>  #endif




Re: [PATCH v2 0/7] hostmem: NUMA-aware memory preallocation using ThreadContext

2022-10-11 Thread Dr. David Alan Gilbert
* David Hildenbrand (da...@redhat.com) wrote:
> On 10.10.22 12:40, Dr. David Alan Gilbert wrote:
> > * David Hildenbrand (da...@redhat.com) wrote:
> > > This is a follow-up on "util: NUMA aware memory preallocation" [1] by
> > > Michal.
> > > 
> > > Setting the CPU affinity of threads from inside QEMU usually isn't
> > > easily possible, because we don't want QEMU -- once started and running
> > > guest code -- to be able to mess up the system. QEMU disallows relevant
> > > syscalls using seccomp, such that any such invocation will fail.
> > > 
> > > Especially for memory preallocation in memory backends, the CPU affinity
> > > can significantly increase guest startup time, for example, when running
> > > large VMs backed by huge/gigantic pages, because of NUMA effects. For
> > > NUMA-aware preallocation, we have to set the CPU affinity, however:
> > > 
> > > (1) Once preallocation threads are created during preallocation, 
> > > management
> > >  tools cannot intercept anymore to change the affinity. These threads
> > >  are created automatically on demand.
> > > (2) QEMU cannot easily set the CPU affinity itself.
> > > (3) The CPU affinity derived from the NUMA bindings of the memory backend
> > >  might not necessarily be exactly the CPUs we actually want to use
> > >  (e.g., CPU-less NUMA nodes, CPUs that are pinned/used for other VMs).
> > > 
> > > There is an easy "workaround". If we have a thread with the right CPU
> > > affinity, we can simply create new threads on demand via that prepared
> > > context. So, all we have to do is setup and create such a context ahead
> > > of time, to then configure preallocation to create new threads via that
> > > environment.
> > > 
> > > So, let's introduce a user-creatable "thread-context" object that
> > > essentially consists of a context thread used to create new threads.
> > > QEMU can either try setting the CPU affinity itself ("cpu-affinity",
> > > "node-affinity" property), or upper layers can extract the thread id
> > > ("thread-id" property) to configure it externally.
> > > 
> > > Make memory-backends consume a thread-context object
> > > (via the "prealloc-context" property) and use it when preallocating to
> > > create new threads with the desired CPU affinity. Further, to make it
> > > easier to use, allow creation of "thread-context" objects, including
> > > setting the CPU affinity directly from QEMU, before enabling the
> > > sandbox option.
> > > 
> > > 
> > > Quick test on a system with 2 NUMA nodes:
> > > 
> > > Without CPU affinity:
> > >  time qemu-system-x86_64 \
> > >  -object 
> > > memory-backend-memfd,id=md1,hugetlb=on,hugetlbsize=2M,size=64G,prealloc-threads=12,prealloc=on,host-nodes=0,policy=bind
> > >  \
> > >  -nographic -monitor stdio
> > > 
> > >  real0m5.383s
> > >  real0m3.499s
> > >  real0m5.129s
> > >  real0m4.232s
> > >  real0m5.220s
> > >  real0m4.288s
> > >  real0m3.582s
> > >  real0m4.305s
> > >  real0m5.421s
> > >  real0m4.502s
> > > 
> > >  -> It heavily depends on the scheduler CPU selection
> > > 
> > > With CPU affinity:
> > >  time qemu-system-x86_64 \
> > >  -object thread-context,id=tc1,node-affinity=0 \
> > >  -object 
> > > memory-backend-memfd,id=md1,hugetlb=on,hugetlbsize=2M,size=64G,prealloc-threads=12,prealloc=on,host-nodes=0,policy=bind,prealloc-context=tc1
> > >  \
> > >  -sandbox enable=on,resourcecontrol=deny \
> > >  -nographic -monitor stdio
> > > 
> > >  real0m1.959s
> > >  real0m1.942s
> > >  real0m1.943s
> > >  real0m1.941s
> > >  real0m1.948s
> > >  real0m1.964s
> > >  real0m1.949s
> > >  real0m1.948s
> > >  real0m1.941s
> > >  real0m1.937s
> > > 
> > > On reasonably large VMs, the speedup can be quite significant.
> > > 
> > > While this concept is currently only used for short-lived preallocation
> > > threads, nothing major speaks against reusing the concept for other
> > > threads that are harder to identify/configure -- except that
> > > we need additional (idle) context threads that are otherwise left unused.
> > > 
> > > This series does not yet tackle concurrent preallocation of memory
> > > backends. Memory backend objects are created and memory is preallocated 
> > > one
> > > memory backend at a time -- and there is currently no way to do
> > > preallocation asynchronously.
> 
> Hi Dave,
> 
> > 
> > Since you seem to have a full set of r-b's - do you intend to merge this
> > as-is or do the cuncurrenct preallocation first?
> 
> I intend to merge this as is, as it provides a benefit as it stands, and
> concurrent preallocation might not require user interface changes.

Yep, that's fair enough.

> I do have some ideas on how to implement concurrent preallocation, but it
> needs more thought (and more importantly, time).

Yep, it would be nice for the really h

Re: [RFC PATCH 2/6] qemu/uuid: Add UUID static initializer

2022-10-11 Thread Jonathan Cameron via
On Mon, 10 Oct 2022 15:29:40 -0700
ira.we...@intel.com wrote:

> From: Ira Weiny 
> 
> UUIDs are defined as network-byte-order fields.  No static initializer
> was available for UUIDs in their standard big endian format.
> 
> Define a big endian initializer for UUIDs.
> 
> Signed-off-by: Ira Weiny 

Seems sensible.  Would allow a cleanup in the existing cel_uuid handling
in the CXL code where we use a static for this and end up filling it
with the same value multiple times which is less than ideal...
A quick grep and for qemu_uuid_parse() suggests there are other cases
where it's passed a constant string.

Reviewed-by: Jonathan Cameron 

> ---
>  include/qemu/uuid.h | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/include/qemu/uuid.h b/include/qemu/uuid.h
> index 9925febfa54d..dc40ee1fc998 100644
> --- a/include/qemu/uuid.h
> +++ b/include/qemu/uuid.h
> @@ -61,6 +61,18 @@ typedef struct {
>  (clock_seq_hi_and_reserved), (clock_seq_low), (node0), (node1), (node2),\
>  (node3), (node4), (node5) }
>  
> +/* Normal (network byte order) UUID */
> +#define UUID(time_low, time_mid, time_hi_and_version,\
> +  clock_seq_hi_and_reserved, clock_seq_low, node0, node1, node2, \
> +  node3, node4, node5)   \
> +  { ((time_low) >> 24) & 0xff, ((time_low) >> 16) & 0xff,\
> +((time_low) >> 8) & 0xff, (time_low) & 0xff, \
> +((time_mid) >> 8) & 0xff, (time_mid) & 0xff, \
> +((time_hi_and_version) >> 8) & 0xff, (time_hi_and_version) & 0xff,   \
> +(clock_seq_hi_and_reserved), (clock_seq_low),\
> +(node0), (node1), (node2), (node3), (node4), (node5) \
> +  }
> +
>  #define UUID_FMT "%02hhx%02hhx%02hhx%02hhx-" \
>   "%02hhx%02hhx-%02hhx%02hhx-" \
>   "%02hhx%02hhx-" \




[PATCH v4] win32: set threads name

2022-10-11 Thread marcandre . lureau
From: Marc-André Lureau 

As described in:
https://learn.microsoft.com/en-us/visualstudio/debugger/how-to-set-a-thread-name-in-native-code?view=vs-2022

SetThreadDescription() is available since Windows 10, version 1607 and
in some versions only by "Run Time Dynamic Linking". Its declaration is
not yet in mingw, so we look up the function the same way glib does.

Tested with Visual Studio Community 2022 debugger.

Signed-off-by: Marc-André Lureau 
Acked-by: Richard Henderson 
---
 util/qemu-thread-win32.c | 55 ++--
 1 file changed, 53 insertions(+), 2 deletions(-)

diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index a2d5a6e825..b20bfa9c1f 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -19,12 +19,39 @@
 
 static bool name_threads;
 
+typedef HRESULT (WINAPI *pSetThreadDescription) (HANDLE hThread,
+ PCWSTR lpThreadDescription);
+static pSetThreadDescription SetThreadDescriptionFunc;
+static HMODULE kernel32_module;
+
+static bool load_set_thread_description(void)
+{
+static gsize _init_once = 0;
+
+if (g_once_init_enter(&_init_once)) {
+kernel32_module = LoadLibrary("kernel32.dll");
+if (kernel32_module) {
+SetThreadDescriptionFunc =
+(pSetThreadDescription)GetProcAddress(kernel32_module,
+  "SetThreadDescription");
+if (!SetThreadDescriptionFunc) {
+FreeLibrary(kernel32_module);
+}
+}
+g_once_init_leave(&_init_once, 1);
+}
+
+return !!SetThreadDescriptionFunc;
+}
+
 void qemu_thread_naming(bool enable)
 {
-/* But note we don't actually name them on Windows yet */
 name_threads = enable;
 
-fprintf(stderr, "qemu: thread naming not supported on this host\n");
+if (enable && !load_set_thread_description()) {
+fprintf(stderr, "qemu: thread naming not supported on this host\n");
+name_threads = false;
+}
 }
 
 static void error_exit(int err, const char *msg)
@@ -400,6 +427,26 @@ void *qemu_thread_join(QemuThread *thread)
 return ret;
 }
 
+static bool
+set_thread_description(HANDLE h, const char *name)
+{
+  HRESULT hr;
+  g_autofree wchar_t *namew = NULL;
+
+  if (!load_set_thread_description()) {
+  return false;
+  }
+
+  namew = g_utf8_to_utf16(name, -1, NULL, NULL, NULL);
+  if (!namew) {
+  return false;
+  }
+
+  hr = SetThreadDescriptionFunc(h, namew);
+
+  return SUCCEEDED(hr);
+}
+
 void qemu_thread_create(QemuThread *thread, const char *name,
void *(*start_routine)(void *),
void *arg, int mode)
@@ -423,7 +470,11 @@ void qemu_thread_create(QemuThread *thread, const char 
*name,
 if (!hThread) {
 error_exit(GetLastError(), __func__);
 }
+if (name_threads && name && !set_thread_description(hThread, name)) {
+fprintf(stderr, "qemu: failed to set thread description: %s\n", name);
+}
 CloseHandle(hThread);
+
 thread->data = data;
 }
 
-- 
2.37.3




Re: [RISU PATCH 1/5] risu: Use alternate stack

2022-10-11 Thread Peter Maydell
On Tue, 11 Oct 2022 at 07:57, gaosong  wrote:
>
>
> On 2022/10/10 22:43, Peter Maydell wrote:
> > On Mon, 10 Oct 2022 at 15:20, Richard Henderson
> >  wrote:
> >> On 9/17/22 00:43, Song Gao wrote:
> >>> We can use an alternate stack, so that we can use the sp register as an
> >>> input/output register.
> >>> I have tested the aarch64/LoongArch architectures.
> >>>
> >>> Signed-off-by: Song Gao
> >>> ---
> >>>risu.c | 16 +++-
> >>>1 file changed, 15 insertions(+), 1 deletion(-)
> >> Good idea.
> > Depending on the architecture there might still need to be
> > restrictions on use of the stack pointer, eg aarch64's
> > alignment requirements, but this at least means you can
> > in theory write some risu rules that use SP.
> I really want to use the alternate stack, since it can reduce the number of risu rules.
> What about using this only on the LoongArch architecture?

I just mean that although this patch is fine it might
still mean that depending on the architecture some care
and/or special casing of sp in the target risu rules
might be needed. I don't know if that applies to
loongarch or not.

-- PMM



Re: [PATCH v3] m68k: write bootinfo as rom section and re-randomize on reboot

2022-10-11 Thread Peter Maydell
On Tue, 11 Oct 2022 at 09:41, Laurent Vivier  wrote:
>
> Le 03/10/2022 à 13:02, Jason A. Donenfeld a écrit :
> > Rather than poking directly into RAM, add the bootinfo block as a proper
> > ROM, so that it's restored when rebooting the system. This way, if the
> > guest corrupts any of the bootinfo items, but then tries to reboot,
> > it'll still be restored back to normal as expected.
> >
> > Then, since the RNG seed needs to be fresh on each boot, regenerate the
> > RNG seed in the ROM when resetting the CPU.
>
> As it needs to be refreshed, I think it would be better not to use a ROM and
> to regenerate all the
> bootinfo data on reset.

I quite liked the use of a rom blob in this patch -- it gets rid
of a lot of direct stl_phys() calls (which is a semi-deprecated
API because it ignores the possibility of failure).

-- PMM



Re: [PATCH v2 3/4] crypto: Support export akcipher to pkcs8

2022-10-11 Thread Daniel P . Berrangé
On Sat, Oct 08, 2022 at 04:50:29PM +0800, Lei He wrote:
> crypto: support exporting RSA private keys in the PKCS#8 standard,
> so that users can upload these private keys to the Linux kernel.
> 
> Signed-off-by: lei he 
> ---
>  crypto/akcipher.c | 18 ++
>  crypto/rsakey.c   | 42 ++
>  crypto/rsakey.h   | 11 ++-
>  include/crypto/akcipher.h | 21 +
>  4 files changed, 91 insertions(+), 1 deletion(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




Re: [RFC PATCH 0/6] QEMU CXL Provide mock CXL events and irq support

2022-10-11 Thread Jonathan Cameron via
On Mon, 10 Oct 2022 15:29:38 -0700
ira.we...@intel.com wrote:

> From: Ira Weiny 
> 
> CXL Event records inform the OS of various CXL device events.  Thus far CXL
> memory devices are emulated and therefore don't naturally have events which
> will occur.
> 
> Add mock events and a HMP trigger mechanism to facilitate guest OS testing of
> event support.
> 
> This support requires a follow on version of the event patch set.  The RFC was
> submitted and discussed here:
> 
>   
> https://lore.kernel.org/linux-cxl/20220813053243.757363-1-ira.we...@intel.com/
> 
> I'll post the lore link to the new version shortly.
> 
> Instructions for running this test.
> 
> Add qmp option to qemu:
> 
>$ qemu-system-x86_64 ... -qmp 
> unix:/tmp/run_qemu_qmp_0,server,nowait ...
> 
>   OR
> 
>$ run_qemu.sh ... --qmp ...
> 
> Enable tracing of events within the guest:
> 
>$ echo "" > /sys/kernel/tracing/trace
>$ echo 1 > /sys/kernel/tracing/events/cxl/enable
>$ echo 1 > /sys/kernel/tracing/tracing_on
> 
> Trigger event generation and interrupts in the host:
> 
>$ echo "cxl_event_inject cxl-devX" | qmp-shell -H 
> /tmp/run_qemu_qmp_0
> 
>   Where X == one of the memory devices; cxl-dev0 should work.
> 
> View events on the guest:
> 
>$ cat /sys/kernel/tracing/trace

Hi Ira,

Why is this an RFC rather than a patch set to apply?

It's useful to have that in the cover letter so we can focus on what
you want comments on (rather than simply review).

Thanks,

Jonathan

> 
> 
> Ira Weiny (6):
>   qemu/bswap: Add const_le64()
>   qemu/uuid: Add UUID static initializer
>   hw/cxl/cxl-events: Add CXL mock events
>   hw/cxl/mailbox: Wire up get/clear event mailbox commands
>   hw/cxl/cxl-events: Add event interrupt support
>   hw/cxl/mailbox: Wire up Get/Set Event Interrupt policy
> 
>  hmp-commands.hx |  14 ++
>  hw/cxl/cxl-device-utils.c   |   1 +
>  hw/cxl/cxl-events.c | 330 
>  hw/cxl/cxl-host-stubs.c |   5 +
>  hw/cxl/cxl-mailbox-utils.c  | 224 +---
>  hw/cxl/meson.build  |   1 +
>  hw/mem/cxl_type3.c  |   7 +-
>  include/hw/cxl/cxl_device.h |  22 +++
>  include/hw/cxl/cxl_events.h | 194 +
>  include/qemu/bswap.h|  10 ++
>  include/qemu/uuid.h |  12 ++
>  include/sysemu/sysemu.h |   3 +
>  12 files changed, 802 insertions(+), 21 deletions(-)
>  create mode 100644 hw/cxl/cxl-events.c
>  create mode 100644 include/hw/cxl/cxl_events.h
> 
> 
> base-commit: 6f7f81898e4437ea544ee4ca24bef7ec543b1f06




Re: [PATCH v2 2/4] crypto: Support DER encodings

2022-10-11 Thread Daniel P . Berrangé
On Sat, Oct 08, 2022 at 04:50:28PM +0800, Lei He wrote:
> Add encoding interfaces for DER encoding:
> 1. support decoding of 'bit string', 'octet string', 'object id'
> and 'context specific tag' for DER encoder.
> 2. implemented a simple DER encoder.
> 3. add more test suites for the DER encoder.
> 
> Signed-off-by: lei he 
> ---
>  crypto/der.c | 307 
> +++
>  crypto/der.h | 211 -
>  tests/unit/test-crypto-der.c | 126 ++
>  3 files changed, 597 insertions(+), 47 deletions(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




Re: [RFC PATCH 1/6] qemu/bswap: Add const_le64()

2022-10-11 Thread Peter Maydell
On Mon, 10 Oct 2022 at 23:48,  wrote:
>
> From: Ira Weiny 
>
> GCC requires constant versions of the cpu_to_le* calls.
>
> Add a 64 bit version.
>
> Signed-off-by: Ira Weiny 
> ---
>  include/qemu/bswap.h | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/include/qemu/bswap.h b/include/qemu/bswap.h
> index 346d05f2aab3..08e607821102 100644
> --- a/include/qemu/bswap.h
> +++ b/include/qemu/bswap.h
> @@ -192,10 +192,20 @@ CPU_CONVERT(le, 64, uint64_t)
>   (((_x) & 0x0000ff00U) <<  8) |  \
>   (((_x) & 0x00ff0000U) >>  8) |  \
>   (((_x) & 0xff000000U) >> 24))
> +# define const_le64(_x)  \
> +((((_x) & 0x00000000000000ffU) << 56) |  \
> + (((_x) & 0x000000000000ff00U) << 40) |  \
> + (((_x) & 0x0000000000ff0000U) << 24) |  \
> + (((_x) & 0x00000000ff000000U) <<  8) |  \
> + (((_x) & 0x000000ff00000000U) >>  8) |  \
> + (((_x) & 0x0000ff0000000000U) >> 24) |  \
> + (((_x) & 0x00ff000000000000U) >> 40) |  \
> + (((_x) & 0xff00000000000000U) >> 56))

Can you add this in the right place, ie above the const_le32()
definition, please ?

>  # define const_le16(_x)  \
>  ((((_x) & 0x00ff) << 8) |  \
>   (((_x) & 0xff00) >> 8))
>  #else
> +# define const_le64(_x) (_x)
>  # define const_le32(_x) (_x)
>  # define const_le16(_x) (_x)
>  #endif

This is kind of a weird API, because:
 * it only exists for little-endian, not big-endian
 * we use it in exactly two files (linux-user/elfload.c and
   hw/input/virtio-input-hid.c)

which leaves me wondering if there's a better way of doing
it that I'm missing. But maybe it's just that we never filled
out the missing bits of the API surface because we haven't
needed them yet. Richard ?

thanks
-- PMM



Re: [PATCH v7 0/5] QEMU PCIe DOE for PCIe 4.0/5.0 and CXL 2.0

2022-10-11 Thread Huai-Cheng
Hi Jonathan,

We've reviewed the patches related to DOE and everything looks good. We are
glad to maintain the code as its maintainers.

Thanks for applying the changes.

Best Regards,
Huai-Cheng Kuo

On Mon, Oct 10, 2022 at 6:30 PM Jonathan Cameron <
jonathan.came...@huawei.com> wrote:

> On Fri, 7 Oct 2022 16:21:51 +0100
> Jonathan Cameron  wrote:
>
> > Whilst I have carried on Huai-Cheng Kuo's series version numbering and
> > naming, there have been very substantial changes since v6 so I would
> > suggest fresh review makes sense for anyone who has looked at this
> before.
> > In particularly if the Avery design folks could check I haven't broken
> > anything that would be great.
>
> I forgot to run checkpatch on these and there is some white space that
> will need cleaning up and one instance of missing brackets.
> As that doesn't greatly affect review, I'll wait for a few days to see
> if there is other feedback to incorporate in v8.
>
> Sorry for the resulting noise!
>
> These are now available at
> https://gitlab.com/jic23/qemu/-/commits/cxl-2022-10-09
> along with a bunch of other CXL features:
> * Compliance DOE protocol
> * SPDM / CMA over DOE supprot
> * ARM64 support in general.
> * Various small emulation additions.
> * CPMU support
>
> I'll add a few more features to similarly named branches over the next
> week or so including initial support for standalone switch CCI mailboxes.
>
> Jonathan
>
> >
> > For reference v6: QEMU PCIe DOE for PCIe 4.0/5.0 and CXL 2.0
> >
> https://lore.kernel.org/qemu-devel/1623330943-18290-1-git-send-email-cbr...@avery-design.com/
> >
> > Summary of changes:
> > 1) Linux headers definitions for DOE are now upstream so drop that patch.
> > 2) Add CDAT for switch upstream port.
> > 3) Generate 'plausible' default CDAT tables when a file is not provided.
> > 4) General refactoring to calculate the correct table sizes and allocate
> >based on that rather than copying from a local static array.
> > 5) Changes from earlier reviews such as matching QEMU type naming style.
> > 6) Moved compliance and SPDM usecases to future patch sets.
> >
> > Sign-offs on these are complex because the patches were originally
> developed
> > by Huai-Cheng Kuo, but posted by Chris Browy and then picked up by
> Jonathan
> > Cameron who made substantial changes.
> >
> > Huai-Cheng Kuo / Chris Browy, please confirm you are still happy to
> maintain this
> > code as per the original MAINTAINERS entry.
> >
> > What's here?
> >
> > This series brings generic PCI Express Data Object Exchange support (DOE)
> > DOE is defined in the PCIe Base Spec r6.0. It consists of a mailbox in
> PCI
> > config space via a PCIe Extended Capability Structure.
> > The PCIe spec defines several protocols (including one to discover what
> > protocols a given DOE instance supports) and other specification such as
> > CXL define additional protocols using their own vendor IDs.
> >
> > In this series we make use of the DOE to support the CXL spec defined
> > Table Access Protocol, specifically to provide access to CDAT - a
> > table specified in a specification that is hosted by the UEFI forum
> > and is used to provide runtime discoverability of the sort of information
> > that would otherwise be available in firmware tables (memory types,
> > latency and bandwidth information etc).
> >
> > The Linux kernel gained support for DOE / CDAT on CXL type 3 EPs in 6.0.
> > The version merged did not support interrupts (earlier versions did
> > so that support in the emulation was tested a while back).
> >
> > This series provides CDAT emulation for CXL switch upstream ports
> > and CXL type 3 memory devices. Note that to exercise the switch support
> > additional Linux kernel patches are needed.
> >
> https://lore.kernel.org/linux-cxl/20220503153449.4088-1-jonathan.came...@huawei.com/
> > (I'll post a new version of that support shortly)
> >
> > Additional protocols will be supported by follow on patch sets:
> > * CXL compliance protocol.
> > * CMA / SPDM device attestation.
> > (Old version at https://gitlab.com/jic23/qemu/-/commits/cxl-next - will
> refresh
> > that tree next week)
> >
> > Huai-Cheng Kuo (3):
> >   hw/pci: PCIe Data Object Exchange emulation
> >   hw/cxl/cdat: CXL CDAT Data Object Exchange implementation
> >   hw/mem/cxl-type3: Add CXL CDAT Data Object Exchange
> >
> > Jonathan Cameron (2):
> >   hw/mem/cxl-type3: Add MSIX support
> >   hw/pci-bridge/cxl-upstream: Add a CDAT table access DOE
> >
> >  MAINTAINERS|   7 +
> >  hw/cxl/cxl-cdat.c  | 222 
> >  hw/cxl/meson.build |   1 +
> >  hw/mem/cxl_type3.c | 236 +
> >  hw/pci-bridge/cxl_upstream.c   | 182 +++-
> >  hw/pci/meson.build |   1 +
> >  hw/pci/pcie_doe.c  | 367 +
> >  include/hw/cxl/cxl_cdat.h  | 166 +++
> >  include/hw/cxl/cxl_component.h |   7 +
> >  include/hw

Re: [PATCH v8 5/8] KVM: Register/unregister the guest private memory regions

2022-10-11 Thread Fuad Tabba
Hi,

On Thu, Sep 15, 2022 at 3:38 PM Chao Peng  wrote:
>
> If CONFIG_HAVE_KVM_PRIVATE_MEM=y, userspace can register/unregister the
> guest private memory regions through KVM_MEMORY_ENCRYPT_{UN,}REG_REGION
> ioctls. The patch reuses existing SEV ioctl number but differs that the
> address in the region for KVM_PRIVATE_MEM case is gpa while for SEV case
> it's hva. Which usages should the ioctls go is determined by the newly
> added kvm_arch_has_private_mem(). Architecture which supports
> KVM_PRIVATE_MEM should override this function.
>
> The current implementation defaults all memory to private. The shared
> memory regions are stored in a xarray variable for memory efficiency and
> zapping existing memory mappings is also a side effect of these two
> ioctls when defined.
>
> Signed-off-by: Chao Peng 
> ---
>  Documentation/virt/kvm/api.rst  | 17 ++--
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/mmu.h  |  2 -
>  include/linux/kvm_host.h| 13 ++
>  virt/kvm/kvm_main.c | 73 +
>  5 files changed, 100 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 1a6c003b2a0b..c0f800d04ffc 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -4715,10 +4715,19 @@ Documentation/virt/kvm/x86/amd-memory-encryption.rst.
>  This ioctl can be used to register a guest memory region which may
>  contain encrypted data (e.g. guest RAM, SMRAM etc).
>
> -It is used in the SEV-enabled guest. When encryption is enabled, a guest
> -memory region may contain encrypted data. The SEV memory encryption
> -engine uses a tweak such that two identical plaintext pages, each at
> -different locations will have differing ciphertexts. So swapping or
> +Currently this ioctl supports registering memory regions for two usages:
> +private memory and SEV-encrypted memory.
> +
> +When private memory is enabled, this ioctl is used to register a guest private
> +memory region, and the addr/size of kvm_enc_region represents a guest physical
> +address (GPA). In this usage, this ioctl zaps the existing guest memory
> +mappings in KVM that fall into the region.
> +
> +When SEV-encrypted memory is enabled, this ioctl is used to register guest
> +memory region which may contain encrypted data for a SEV-enabled guest. The
> +addr/size of kvm_enc_region represents userspace address (HVA). The SEV
> +memory encryption engine uses a tweak such that two identical plaintext 
> pages,
> +each at different locations will have differing ciphertexts. So swapping or
>  moving ciphertext of those pages will not result in plaintext being
>  swapped. So relocating (or migrating) physical backing pages for the SEV
>  guest will require some additional steps.
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 2c96c43c313a..cfad6ba1a70a 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -37,6 +37,7 @@
>  #include 
>
>  #define __KVM_HAVE_ARCH_VCPU_DEBUGFS
> +#define __KVM_HAVE_ZAP_GFN_RANGE
>
>  #define KVM_MAX_VCPUS 1024
>
> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index 6bdaacb6faa0..c94b620bf94b 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -211,8 +211,6 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, 
> struct kvm_mmu *mmu,
> return -(u32)fault & errcode;
>  }
>
> -void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
> -
>  int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu);
>
>  int kvm_mmu_post_init_vm(struct kvm *kvm);
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 2125b50f6345..d65690cae80b 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -260,6 +260,15 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct 
> kvm_gfn_range *range);
>  bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range);
>  #endif
>
> +#ifdef __KVM_HAVE_ZAP_GFN_RANGE
> +void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
> +#else
> +static inline void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start,
> + gfn_t gfn_end)
> +{
> +}
> +#endif
> +
>  enum {
> OUTSIDE_GUEST_MODE,
> IN_GUEST_MODE,
> @@ -795,6 +804,9 @@ struct kvm {
> struct notifier_block pm_notifier;
>  #endif
> char stats_id[KVM_STATS_NAME_SIZE];
> +#ifdef CONFIG_HAVE_KVM_PRIVATE_MEM
> +   struct xarray mem_attr_array;
> +#endif
>  };
>
>  #define kvm_err(fmt, ...) \
> @@ -1454,6 +1466,7 @@ bool kvm_arch_dy_has_pending_interrupt(struct kvm_vcpu 
> *vcpu);
>  int kvm_arch_post_init_vm(struct kvm *kvm);
>  void kvm_arch_pre_destroy_vm(struct kvm *kvm);
>  int kvm_arch_create_vm_debugfs(struct kvm *kvm);
> +bool kvm_arch_has_private_mem(struct kvm *kvm);
>
>  #ifndef __KVM_HAVE_ARCH_VM_ALLOC
>  /*
> diff --git a/virt/kvm/kvm_main.c b/vir

Re: [RFC PATCH 3/6] hw/cxl/cxl-events: Add CXL mock events

2022-10-11 Thread Jonathan Cameron via
On Mon, 10 Oct 2022 15:29:41 -0700
ira.we...@intel.com wrote:

> From: Ira Weiny 
> 
> To facilitate testing of guest software add mock events and code to
> support iterating through the event logs.
> 
> Signed-off-by: Ira Weiny 

Various comments inline, but biggest one is I'd like to see
a much more flexible injection interface.  Happy to help code one
up if that is useful.

Jonathan


> ---
>  hw/cxl/cxl-events.c | 248 
>  hw/cxl/meson.build  |   1 +
>  include/hw/cxl/cxl_device.h |  19 +++
>  include/hw/cxl/cxl_events.h | 173 +
>  4 files changed, 441 insertions(+)
>  create mode 100644 hw/cxl/cxl-events.c
>  create mode 100644 include/hw/cxl/cxl_events.h
> 
> diff --git a/hw/cxl/cxl-events.c b/hw/cxl/cxl-events.c
> new file mode 100644
> index ..c275280bcb64
> --- /dev/null
> +++ b/hw/cxl/cxl-events.c
> @@ -0,0 +1,248 @@
> +/*
> + * CXL Event processing
> + *
> + * Copyright(C) 2022 Intel Corporation.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#include 
> +
> +#include "qemu/osdep.h"
> +#include "qemu/bswap.h"
> +#include "qemu/typedefs.h"
> +#include "hw/cxl/cxl.h"
> +#include "hw/cxl/cxl_events.h"
> +
> +struct cxl_event_log *find_event_log(CXLDeviceState *cxlds, int log_type)
> +{
> +if (log_type >= CXL_EVENT_TYPE_MAX) {
> +return NULL;
> +}
> +return &cxlds->event_logs[log_type];
> +}
> +
> +struct cxl_event_record_raw *get_cur_event(struct cxl_event_log *log)
> +{
> +return log->events[log->cur_event];
> +}
> +
> +uint16_t get_cur_event_handle(struct cxl_event_log *log)
> +{
> +return cpu_to_le16(log->cur_event);
> +}
> +
> +bool log_empty(struct cxl_event_log *log)
> +{
> +return log->cur_event == log->nr_events;
> +}
> +
> +int log_rec_left(struct cxl_event_log *log)
> +{
> +return log->nr_events - log->cur_event;
> +}
> +
> +static void event_store_add_event(CXLDeviceState *cxlds,
> +  enum cxl_event_log_type log_type,
> +  struct cxl_event_record_raw *event)
> +{
> +struct cxl_event_log *log;
> +
> +assert(log_type < CXL_EVENT_TYPE_MAX);
> +
> +log = &cxlds->event_logs[log_type];
> +assert(log->nr_events < CXL_TEST_EVENT_CNT_MAX);
> +
> +log->events[log->nr_events] = event;
> +log->nr_events++;
> +}
> +
> +uint16_t log_overflow(struct cxl_event_log *log)
> +{
> +int cnt = log_rec_left(log) - 5;

Why -5?  Can't we make it actually overflow and drop records
if that happens?

> +
> +if (cnt < 0) {
> +return 0;
> +}
> +return cnt;
> +}
> +
> +#define CXL_EVENT_RECORD_FLAG_PERMANENT BIT(2)
> +#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED  BIT(3)
> +#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED BIT(4)
> +#define CXL_EVENT_RECORD_FLAG_HW_REPLACEBIT(5)
> +
> +struct cxl_event_record_raw maint_needed = {
> +.hdr = {
> +.id.data = UUID(0xDEADBEEF, 0xCAFE, 0xBABE,
> +0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
> +.length = sizeof(struct cxl_event_record_raw),
> +.flags[0] = CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,
> +/* .handle = Set dynamically */
> +.related_handle = const_le16(0xa5b6),
> +},
> +.data = { 0xDE, 0xAD, 0xBE, 0xEF },
> +};
> +
> +struct cxl_event_record_raw hardware_replace = {
> +.hdr = {
> +.id.data = UUID(0xBABECAFE, 0xBEEF, 0xDEAD,
> +0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
> +.length = sizeof(struct cxl_event_record_raw),
> +.flags[0] = CXL_EVENT_RECORD_FLAG_HW_REPLACE,
> +/* .handle = Set dynamically */
> +.related_handle = const_le16(0xb6a5),
> +},
> +.data = { 0xDE, 0xAD, 0xBE, 0xEF },
> +};
> +
> +#define CXL_GMER_EVT_DESC_UNCORECTABLE_EVENTBIT(0)
> +#define CXL_GMER_EVT_DESC_THRESHOLD_EVENT   BIT(1)
> +#define CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW  BIT(2)
> +
> +#define CXL_GMER_MEM_EVT_TYPE_ECC_ERROR 0x00
> +#define CXL_GMER_MEM_EVT_TYPE_INV_ADDR  0x01
> +#define CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR   0x02
> +
> +#define CXL_GMER_TRANS_UNKNOWN  0x00
> +#define CXL_GMER_TRANS_HOST_READ0x01
> +#define CXL_GMER_TRANS_HOST_WRITE   0x02
> +#define CXL_GMER_TRANS_HOST_SCAN_MEDIA  0x03
> +#define CXL_GMER_TRANS_HOST_INJECT_POISON   0x04
> +#define CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB 0x05
> +#define CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT0x06
> +
> +#define CXL_GMER_VALID_CHANNEL  BIT(0)
> +#define CXL_GMER_VALID_RANK BIT(1)
> +#define CXL_GMER_VALID_DEVICE   BIT(2)
> +#define CXL_GMER_VALID_COMPONENT 

[PATCH] target/s390x: Fix emulation of the VISTR instruction

2022-10-11 Thread Thomas Huth
The element size is encoded in the M3 field, not in the M4
field. Let's also add a TCG test that shows the failing
behavior without this fix.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1248
Signed-off-by: Thomas Huth 
---
 tests/tcg/s390x/vf.c| 50 +
 target/s390x/tcg/translate_vx.c.inc |  2 +-
 tests/tcg/s390x/Makefile.target |  6 
 3 files changed, 57 insertions(+), 1 deletion(-)
 create mode 100644 tests/tcg/s390x/vf.c

diff --git a/tests/tcg/s390x/vf.c b/tests/tcg/s390x/vf.c
new file mode 100644
index 00..fdc424ce7c
--- /dev/null
+++ b/tests/tcg/s390x/vf.c
@@ -0,0 +1,50 @@
+/*
+ * vf: vector facility tests
+ */
+#include 
+#include 
+#include "vx.h"
+
+static inline void vistr(S390Vector *v1, S390Vector *v2,
+ const uint8_t m3, const uint8_t m5)
+{
+asm volatile("vistr %[v1], %[v2], %[m3], %[m5]\n"
+: [v1] "=v" (v1->v)
+: [v2]  "v" (v2->v)
+, [m3]  "i" (m3)
+, [m5]  "i" (m5)
+: "cc");
+}
+
+static int test_vistr(void)
+{
+S390Vector vd = {};
+S390Vector vs16 = {
+.h[0] = 0x1234, .h[1] = 0x0056, .h[2] = 0x7800, .h[3] = 0x0000,
+.h[4] = 0x0078, .h[5] = 0x0000, .h[6] = 0x6543, .h[7] = 0x2100
+};
+S390Vector vs32 = {
+.w[0] = 0x1234, .w[1] = 0x78654300,
+.w[2] = 0x0, .w[3] = 0x12,
+};
+
+vistr(&vd, &vs16, 1, 0);
+if (vd.h[0] != 0x1234 || vd.h[1] != 0x0056 || vd.h[2] != 0x7800 ||
+vd.h[3] || vd.h[4] || vd.h[5] || vd.h[6] || vd.h[7]) {
+puts("ERROR: vistrh failed!");
+return 1;
+}
+
+vistr(&vd, &vs32, 2, 0);
+if (vd.w[0] != 0x1234 || vd.w[1] != 0x78654300 || vd.w[2] || vd.w[3]) {
+puts("ERROR: vistrf failed!");
+return 1;
+}
+
+return 0;
+}
+
+int main(int argc, char *argv[])
+{
+return test_vistr();
+}
diff --git a/target/s390x/tcg/translate_vx.c.inc 
b/target/s390x/tcg/translate_vx.c.inc
index 3526ba3e3b..b69c1a111c 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -2723,7 +2723,7 @@ static DisasJumpType op_vfene(DisasContext *s, DisasOps 
*o)
 
 static DisasJumpType op_vistr(DisasContext *s, DisasOps *o)
 {
-const uint8_t es = get_field(s, m4);
+const uint8_t es = get_field(s, m3);
 const uint8_t m5 = get_field(s, m5);
 static gen_helper_gvec_2 * const g[3] = {
 gen_helper_gvec_vistr8,
diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index c830313e67..f8e71a9439 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -18,6 +18,12 @@ TESTS+=signals-s390x
 TESTS+=branch-relative-long
 TESTS+=noexec
 
+Z13_TESTS=vf
+vf: LDFLAGS+=-lm
+$(Z13_TESTS): CFLAGS+=-march=z13 -O2
+TESTS+=$(if $(shell $(CC) -march=z13 -S -o /dev/null -xc /dev/null \
+>/dev/null 2>&1 && echo OK),$(Z13_TESTS))
+
 Z14_TESTS=vfminmax
 vfminmax: LDFLAGS+=-lm
 $(Z14_TESTS): CFLAGS+=-march=z14 -O2
-- 
2.31.1




[PING PATCH v5] Add 'q35' machine type to hotplug tests

2022-10-11 Thread Michael Labiuk

I would like to ping a patch

https://patchew.org/QEMU/20220929223547.1429580-1-michael.lab...@virtuozzo.com/ 




On 9/30/22 01:35, Michael Labiuk via wrote:

Add a PCI bridge setting to run the hotplug tests on the q35 machine type.
The hotplug tests were bound to the 'pc' machine type by commit 7b172333f1b

v5 -> v4:

* Unify device removing in tests.
* Using qtest_has_machine("q35") as condition.
* fixed typos.
* Replaced snprintf.

v4 -> v3:

* Moving helper function process_device_remove() to separate commit.
* Refactoring hd-geo-test to avoid code duplication.

Michael Labiuk (9):
   tests/x86: add helper qtest_qmp_device_del_send()
   tests/x86: Add subtest with 'q35' machine type to device-plug-test
   tests/x86: Refactor hot unplug hd-geo-test
   tests/x86: Add 'q35' machine type to override-tests in hd-geo-test
   tests/x86: Add 'q35' machine type to hotplug hd-geo-test
   tests/x86: Fix comment typo in drive_del-test
   tests/x86: replace snprint() by g_strdup_printf() in drive_del-test
   tests/x86: Add 'q35' machine type to drive_del-test
   tests/x86: Add 'q35' machine type to ivshmem-test

  tests/qtest/device-plug-test.c |  56 --
  tests/qtest/drive_del-test.c   | 125 +++--
  tests/qtest/hd-geo-test.c  | 319 -
  tests/qtest/ivshmem-test.c |  18 ++
  tests/qtest/libqos/pci-pc.c|   8 +-
  tests/qtest/libqtest.c |  16 +-
  tests/qtest/libqtest.h |  10 ++
  7 files changed, 425 insertions(+), 127 deletions(-)





Re: [PATCH v3 00/13] Misc ppc/mac machines clean up

2022-10-11 Thread BALATON Zoltan

On Mon, 3 Oct 2022, BALATON Zoltan wrote:

This series includes some clean-ups to mac_newworld and mac_oldworld
to make them a bit simpler and more readable. It also removes the
shared mac.h file, which turned out to be more of a random collection of
unrelated things. Getting rid of mac.h improves the locality of the
device models and reduces unnecessary interdependencies.


Ping?


v3: Some more patch spliting and changes I've noticed and address more
review comments
v2: Split some patches and add a few more I've noticed now and address
review comments

BALATON Zoltan (13):
 mac_newworld: Drop some variables
 mac_oldworld: Drop some more variables
 mac_{old|new}world: Set tbfreq at declaration
 mac_{old|new}world: Avoid else branch by setting default value
 mac_{old|new}world: Simplify cmdline_base calculation
 mac_newworld: Clean up creation of Uninorth devices
 mac_{old|new}world: Reduce number of QOM casts
 hw/ppc/mac.h: Move newworld specific parts out from shared header
 hw/ppc/mac.h: Move macio specific parts out from shared header
 hw/ppc/mac.h: Move grackle-pcihost type declaration out to a header
 hw/ppc/mac.h: Move PROM and KERNEL defines to board code
 hw/ppc/mac.h: Rename to include/hw/nvram/mac_nvram.h
 mac_nvram: Use NVRAM_SIZE constant

MAINTAINERS   |   2 +
hw/ide/macio.c|   1 -
hw/intc/heathrow_pic.c|   1 -
hw/intc/openpic.c |   1 -
hw/misc/macio/cuda.c  |   1 -
hw/misc/macio/gpio.c  |   1 -
hw/misc/macio/macio.c |   8 +-
hw/misc/macio/pmu.c   |   1 -
hw/nvram/mac_nvram.c  |   2 +-
hw/pci-host/grackle.c |  15 +--
hw/pci-host/uninorth.c|   1 -
hw/ppc/mac.h  | 105 
hw/ppc/mac_newworld.c | 225 --
hw/ppc/mac_oldworld.c | 111 +++--
include/hw/misc/macio/macio.h |  23 +++-
include/hw/nvram/mac_nvram.h  |  51 
include/hw/pci-host/grackle.h |  44 +++
17 files changed, 280 insertions(+), 313 deletions(-)
delete mode 100644 hw/ppc/mac.h
create mode 100644 include/hw/nvram/mac_nvram.h
create mode 100644 include/hw/pci-host/grackle.h






Re: [RFC PATCH 4/6] hw/cxl/mailbox: Wire up get/clear event mailbox commands

2022-10-11 Thread Jonathan Cameron via
On Mon, 10 Oct 2022 15:29:42 -0700
ira.we...@intel.com wrote:

> From: Ira Weiny 
> 
> Replace the stubbed out CXL Get/Clear Event mailbox commands with
> commands which return the mock event information.
> 
> Signed-off-by: Ira Weiny 
> ---
>  hw/cxl/cxl-device-utils.c  |   1 +
>  hw/cxl/cxl-mailbox-utils.c | 103 +++--
>  2 files changed, 101 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> index 687759b3017b..4bb41101882e 100644
> --- a/hw/cxl/cxl-device-utils.c
> +++ b/hw/cxl/cxl-device-utils.c
> @@ -262,4 +262,5 @@ void cxl_device_register_init_common(CXLDeviceState 
> *cxl_dstate)
>  memdev_reg_init_common(cxl_dstate);
>  
>  assert(cxl_initialize_mailbox(cxl_dstate) == 0);
> +cxl_mock_add_event_logs(cxl_dstate);

Given you add support for injection later, why start with some records?
If we do want to do this for testing detection of events before the driver
is loaded, then add a command-line parameter to turn it on.

>  }
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index bb66c765a538..df345f23a30c 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -9,6 +9,7 @@
>  
>  #include "qemu/osdep.h"
>  #include "hw/cxl/cxl.h"
> +#include "hw/cxl/cxl_events.h"
>  #include "hw/pci/pci.h"
>  #include "qemu/cutils.h"
>  #include "qemu/log.h"
> @@ -116,11 +117,107 @@ struct cxl_cmd {
>  return CXL_MBOX_SUCCESS;  \
>  }
>  
> -DEFINE_MAILBOX_HANDLER_ZEROED(events_get_records, 0x20);
> -DEFINE_MAILBOX_HANDLER_NOP(events_clear_records);
>  DEFINE_MAILBOX_HANDLER_ZEROED(events_get_interrupt_policy, 4);
>  DEFINE_MAILBOX_HANDLER_NOP(events_set_interrupt_policy);
>  
> +static ret_code cmd_events_get_records(struct cxl_cmd *cmd,
> +   CXLDeviceState *cxlds,
> +   uint16_t *len)
> +{
> +struct cxl_get_event_payload *pl;
> +struct cxl_event_log *log;
> +uint8_t log_type;
> +uint16_t nr_overflow;
> +
> +if (cmd->in < sizeof(log_type)) {
> +return CXL_MBOX_INVALID_INPUT;
> +}
> +
> +log_type = *((uint8_t *)cmd->payload);
> +if (log_type >= CXL_EVENT_TYPE_MAX) {
> +return CXL_MBOX_INVALID_INPUT;
> +}
> +
> +pl = (struct cxl_get_event_payload *)cmd->payload;
> +
> +log = find_event_log(cxlds, log_type);
> +if (!log || log_empty(log)) {
> +goto no_data;
> +}
> +
> +memset(pl, 0, sizeof(*pl));
> +pl->record_count = const_le16(1);
> +
> +if (log_rec_left(log) > 1) {

As below we need to handle a request that can take more than
one record, otherwise we aren't compliant with the spec.

> +pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
> +}
> +
> +nr_overflow = log_overflow(log);
> +if (nr_overflow) {
> +struct timespec ts;
> +uint64_t ns;
> +
> +clock_gettime(CLOCK_REALTIME, &ts);
> +
> +ns = ((uint64_t)ts.tv_sec * 1000000000) + (uint64_t)ts.tv_nsec;
> +
> +pl->flags |= CXL_GET_EVENT_FLAG_OVERFLOW;
> +pl->overflow_err_count = cpu_to_le16(nr_overflow);
> +ns -= 5000000000; /* 5s ago */
> +pl->first_overflow_timestamp = cpu_to_le64(ns);
> +ns -= 1000000000; /* 1s ago */
> +pl->last_overflow_timestamp = cpu_to_le64(ns);
> +}
> +
> +memcpy(&pl->record, get_cur_event(log), sizeof(pl->record));
> +pl->record.hdr.handle = get_cur_event_handle(log);
> +*len = sizeof(pl->record);
> +return CXL_MBOX_SUCCESS;
> +
> +no_data:
> +*len = sizeof(*pl) - sizeof(pl->record);
> +memset(pl, 0, *len);
> +return CXL_MBOX_SUCCESS;
> +}
> +
> +static ret_code cmd_events_clear_records(struct cxl_cmd *cmd,
> + CXLDeviceState *cxlds,
> + uint16_t *len)
> +{
> +struct cxl_mbox_clear_event_payload *pl;
> +struct cxl_event_log *log;
> +uint8_t log_type;
> +
> +pl = (struct cxl_mbox_clear_event_payload *)cmd->payload;
> +log_type = pl->event_log;
> +
> +/* Don't handle more than 1 record at a time */
> +if (pl->nr_recs != 1) {

I think we need to fix this so it will handle multiple clears, plus hack just
enough in on the kernel side to verify it.

I don't recall seeing that invalid input is something we can return if
we simply don't support as many clear entries as the command provides.

> +return CXL_MBOX_INVALID_INPUT;
> +}
> +
> +if (log_type >= CXL_EVENT_TYPE_MAX) {
> +return CXL_MBOX_INVALID_INPUT;
> +}
> +
> +log = find_event_log(cxlds, log_type);
> +if (!log) {
> +return CXL_MBOX_SUCCESS;
> +}
> +
> +/*
> + * The current code clears events as they are read.  Test that behavior
> + * only; don't support clearing from the middle of the log

This comment had me worried that we were looking at needing
t

[PULL 00/37] SCSI, i386 patches for 2022-10-11

2022-10-11 Thread Paolo Bonzini
The following changes since commit f1d33f55c47dfdaf8daacd618588ad3ae4c452d1:

  Merge tag 'pull-testing-gdbstub-plugins-gitdm-061022-3' of 
https://github.com/stsquad/qemu into staging (2022-10-06 07:11:56 -0400)

are available in the Git repository at:

  https://gitlab.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 5d2456789ac50b11c2bd560ddf3470fe820bb0ff:

  linux-user: i386/signal: support XSAVE/XRSTOR for signal frame fpstate 
(2022-10-11 10:27:35 +0200)


* scsi-disk: support setting CD-ROM block size via device options
* target/i386: Implement MSR_CORE_THREAD_COUNT MSR
* target/i386: notify VM exit support
* target/i386: PC-relative translation block support
* target/i386: support for XSAVE state in signal frames (linux-user)


Alexander Graf (3):
  x86: Implement MSR_CORE_THREAD_COUNT MSR
  i386: kvm: Add support for MSR filtering
  KVM: x86: Implement MSR_CORE_THREAD_COUNT MSR

Chenyi Qiang (3):
  i386: kvm: extend kvm_{get, put}_vcpu_events to support pending triple 
fault
  kvm: expose struct KVMState
  i386: add notify VM exit support

John Millikin (1):
  scsi-disk: support setting CD-ROM block size via device options

Paolo Bonzini (4):
  kvm: allow target-specific accelerator properties
  linux-user: i386/signal: move fpstate at the end of the 32-bit frames
  linux-user: i386/signal: support FXSAVE fpstate on 32-bit emulation
  linux-user: i386/signal: support XSAVE/XRSTOR for signal frame fpstate

Richard Henderson (26):
  target/i386: Remove pc_start
  target/i386: Return bool from disas_insn
  target/i386: Remove cur_eip argument to gen_exception
  target/i386: Remove cur_eip, next_eip arguments to gen_interrupt
  target/i386: Create gen_update_eip_cur
  target/i386: Create gen_update_eip_next
  target/i386: Introduce DISAS_EOB*
  target/i386: Use DISAS_EOB* in gen_movl_seg_T0
  target/i386: Use DISAS_EOB_NEXT
  target/i386: USe DISAS_EOB_ONLY
  target/i386: Create cur_insn_len, cur_insn_len_i32
  target/i386: Remove cur_eip, next_eip arguments to gen_repz*
  target/i386: Introduce DISAS_JUMP
  target/i386: Truncate values for lcall_real to i32
  target/i386: Create eip_next_*
  target/i386: Use DISAS_TOO_MANY to exit after gen_io_start
  target/i386: Create gen_jmp_rel
  target/i386: Use gen_jmp_rel for loop, repz, jecxz insns
  target/i386: Use gen_jmp_rel for gen_jcc
  target/i386: Use gen_jmp_rel for DISAS_TOO_MANY
  target/i386: Remove MemOp argument to gen_op_j*_ecx
  target/i386: Merge gen_jmp_tb and gen_goto_tb into gen_jmp_rel
  target/i386: Create eip_cur_tl
  target/i386: Add cpu_eip
  target/i386: Inline gen_jmp_im
  target/i386: Enable TARGET_TB_PCREL

 accel/kvm/kvm-all.c  |  78 +---
 hw/scsi/scsi-disk.c  |   7 +-
 include/sysemu/kvm.h |   2 +
 include/sysemu/kvm_int.h |  76 
 linux-user/i386/signal.c | 231 +++---
 qapi/run-state.json  |  17 +
 qemu-options.hx  |  11 +
 target/arm/kvm.c |   4 +
 target/i386/cpu-param.h  |   4 +
 target/i386/cpu.c|   3 +-
 target/i386/cpu.h|   4 +
 target/i386/helper.h |   2 +-
 target/i386/kvm/kvm.c| 266 +++
 target/i386/kvm/kvm_i386.h   |  11 +
 target/i386/machine.c|  20 +
 target/i386/tcg/fpu_helper.c |  64 +--
 target/i386/tcg/seg_helper.c |   6 +-
 target/i386/tcg/sysemu/misc_helper.c |   5 +
 target/i386/tcg/tcg-cpu.c|   8 +-
 target/i386/tcg/translate.c  | 830 ++-
 target/mips/kvm.c|   4 +
 target/ppc/kvm.c |   4 +
 target/riscv/kvm.c   |   4 +
 target/s390x/kvm/kvm.c   |   4 +
 24 files changed, 1102 insertions(+), 563 deletions(-)
-- 
2.37.3




[PULL 03/37] kvm: allow target-specific accelerator properties

2022-10-11 Thread Paolo Bonzini
Several hypervisor capabilities in KVM are target-specific.  When exposed
to QEMU users as accelerator properties (i.e. -accel kvm,prop=value), they
should not be available for all targets.

Add a hook for targets to add their own properties to -accel kvm; for
now, no such property is defined.
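
As a hedged illustration of how a target might eventually use this hook (no such property exists in this series; the property name and accessors below are hypothetical), a target could register a boolean accelerator property from kvm_arch_accel_class_init():

```c
/* Hypothetical sketch only: "example-feature" and its accessors are
 * invented for illustration; this series defines no such property. */
void kvm_arch_accel_class_init(ObjectClass *oc)
{
    object_class_property_add_bool(oc, "example-feature",
                                   example_feature_get,   /* hypothetical getter */
                                   example_feature_set);  /* hypothetical setter */
    object_class_property_set_description(oc, "example-feature",
        "Enable a hypothetical target-specific KVM capability");
}
```

Such a property would then be settable as -accel kvm,example-feature=on on that target only.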

Signed-off-by: Paolo Bonzini 
Message-Id: <20220929072014.20705-3-chenyi.qi...@intel.com>
Signed-off-by: Paolo Bonzini 
---
 accel/kvm/kvm-all.c| 2 ++
 include/sysemu/kvm.h   | 2 ++
 target/arm/kvm.c   | 4 
 target/i386/kvm/kvm.c  | 4 
 target/mips/kvm.c  | 4 
 target/ppc/kvm.c   | 4 
 target/riscv/kvm.c | 4 
 target/s390x/kvm/kvm.c | 4 
 8 files changed, 28 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 423fb1936f..03a69cf053 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -3731,6 +3731,8 @@ static void kvm_accel_class_init(ObjectClass *oc, void *data)
 NULL, NULL);
 object_class_property_set_description(oc, "dirty-ring-size",
 "Size of KVM dirty page ring buffer (default: 0, i.e. use bitmap)");
+
+kvm_arch_accel_class_init(oc);
 }
 
 static const TypeInfo kvm_accel_type = {
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 790d35ef78..e9a97eda8c 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -349,6 +349,8 @@ bool kvm_device_supported(int vmfd, uint64_t type);
 
 extern const KVMCapabilityInfo kvm_arch_required_capabilities[];
 
+void kvm_arch_accel_class_init(ObjectClass *oc);
+
 void kvm_arch_pre_run(CPUState *cpu, struct kvm_run *run);
 MemTxAttrs kvm_arch_post_run(CPUState *cpu, struct kvm_run *run);
 
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index e5c1bd50d2..d21603cf28 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1056,3 +1056,7 @@ bool kvm_arch_cpu_check_are_resettable(void)
 {
 return true;
 }
+
+void kvm_arch_accel_class_init(ObjectClass *oc)
+{
+}
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 3ebe8b7f1f..f18d21413c 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -5468,3 +5468,7 @@ void kvm_request_xsave_components(X86CPU *cpu, uint64_t mask)
 mask &= ~BIT_ULL(bit);
 }
 }
+
+void kvm_arch_accel_class_init(ObjectClass *oc)
+{
+}
diff --git a/target/mips/kvm.c b/target/mips/kvm.c
index caf70decd2..bcb8e06b2c 100644
--- a/target/mips/kvm.c
+++ b/target/mips/kvm.c
@@ -1294,3 +1294,7 @@ bool kvm_arch_cpu_check_are_resettable(void)
 {
 return true;
 }
+
+void kvm_arch_accel_class_init(ObjectClass *oc)
+{
+}
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 466d0d2f4c..7c25348b7b 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -2966,3 +2966,7 @@ bool kvm_arch_cpu_check_are_resettable(void)
 {
 return true;
 }
+
+void kvm_arch_accel_class_init(ObjectClass *oc)
+{
+}
diff --git a/target/riscv/kvm.c b/target/riscv/kvm.c
index 70b4cff06f..30f21453d6 100644
--- a/target/riscv/kvm.c
+++ b/target/riscv/kvm.c
@@ -532,3 +532,7 @@ bool kvm_arch_cpu_check_are_resettable(void)
 {
 return true;
 }
+
+void kvm_arch_accel_class_init(ObjectClass *oc)
+{
+}
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 6a8dbadf7e..508c24cfec 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -2581,3 +2581,7 @@ int kvm_s390_get_zpci_op(void)
 {
 return cap_zpci_op;
 }
+
+void kvm_arch_accel_class_init(ObjectClass *oc)
+{
+}
-- 
2.37.3




[PULL 01/37] scsi-disk: support setting CD-ROM block size via device options

2022-10-11 Thread Paolo Bonzini
From: John Millikin 

SunOS expects CD-ROM devices to have a block size of 512, and will
fail to mount or install using QEMU's default block size of 2048.

When initializing the SCSI device, allow the `physical_block_size'
block device option to override the default block size.
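
A minimal sketch of the resulting usage (machine type, drive id and image name are illustrative, not taken from this patch):

```shell
# Attach a CD-ROM whose logical geometry uses 512-byte blocks, as SunOS expects.
qemu-system-sparc \
  -drive if=none,id=cd0,media=cdrom,file=install.iso \
  -device scsi-cd,drive=cd0,physical_block_size=512
```

Without the physical_block_size option, the device keeps its default 2048-byte block size.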

Signed-off-by: John Millikin 
Message-Id: <20220804122950.1577012-1-j...@john-millikin.com>
Signed-off-by: Paolo Bonzini 
---
 hw/scsi/scsi-disk.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 399e1787ea..e493c28814 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2544,6 +2544,7 @@ static void scsi_cd_realize(SCSIDevice *dev, Error **errp)
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev);
 AioContext *ctx;
 int ret;
+uint32_t blocksize = 2048;
 
 if (!dev->conf.blk) {
 /* Anonymous BlockBackend for an empty drive. As we put it into
@@ -2553,9 +2554,13 @@ static void scsi_cd_realize(SCSIDevice *dev, Error **errp)
 assert(ret == 0);
 }
 
+if (dev->conf.physical_block_size != 0) {
+blocksize = dev->conf.physical_block_size;
+}
+
 ctx = blk_get_aio_context(dev->conf.blk);
 aio_context_acquire(ctx);
-s->qdev.blocksize = 2048;
+s->qdev.blocksize = blocksize;
 s->qdev.type = TYPE_ROM;
 s->features |= 1 << SCSI_DISK_F_REMOVABLE;
 if (!s->product) {
-- 
2.37.3




[PULL 09/37] target/i386: Remove cur_eip, next_eip arguments to gen_interrupt

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

All callers pass s->base.pc_next and s->pc, which we can just as
well compute within the function.  Adjust to use tcg_constant_i32
while we're at it.

Reviewed-by: Paolo Bonzini 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-5-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 617832fcb0..5a9c3b1e71 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2627,13 +2627,12 @@ static void gen_unknown_opcode(CPUX86State *env, DisasContext *s)
 
 /* an interrupt is different from an exception because of the
privilege checks */
-static void gen_interrupt(DisasContext *s, int intno,
-  target_ulong cur_eip, target_ulong next_eip)
+static void gen_interrupt(DisasContext *s, int intno)
 {
 gen_update_cc_op(s);
-gen_jmp_im(s, cur_eip);
-gen_helper_raise_interrupt(cpu_env, tcg_const_i32(intno),
-   tcg_const_i32(next_eip - cur_eip));
+gen_jmp_im(s, s->base.pc_next - s->cs_base);
+gen_helper_raise_interrupt(cpu_env, tcg_constant_i32(intno),
+   tcg_constant_i32(s->pc - s->base.pc_next));
 s->base.is_jmp = DISAS_NORETURN;
 }
 
@@ -7342,12 +7341,12 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 break;
 case 0xcc: /* int3 */
-gen_interrupt(s, EXCP03_INT3, s->base.pc_next - s->cs_base, s->pc - s->cs_base);
+gen_interrupt(s, EXCP03_INT3);
 break;
 case 0xcd: /* int N */
 val = x86_ldub_code(env, s);
 if (check_vm86_iopl(s)) {
-gen_interrupt(s, val, s->base.pc_next - s->cs_base, s->pc - s->cs_base);
+gen_interrupt(s, val);
 }
 break;
 case 0xce: /* into */
-- 
2.37.3




[PULL 04/37] kvm: expose struct KVMState

2022-10-11 Thread Paolo Bonzini
From: Chenyi Qiang 

Expose struct KVMState out of kvm-all.c so that the field of struct
KVMState can be accessed when defining target-specific accelerator
properties.

Signed-off-by: Chenyi Qiang 
Message-Id: <20220929072014.20705-4-chenyi.qi...@intel.com>
Signed-off-by: Paolo Bonzini 
---
 accel/kvm/kvm-all.c  | 74 --
 include/sysemu/kvm_int.h | 76 
 2 files changed, 76 insertions(+), 74 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 03a69cf053..fbfe948398 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -77,86 +77,12 @@
 do { } while (0)
 #endif
 
-#define KVM_MSI_HASHTAB_SIZE256
-
 struct KVMParkedVcpu {
 unsigned long vcpu_id;
 int kvm_fd;
 QLIST_ENTRY(KVMParkedVcpu) node;
 };
 
-enum KVMDirtyRingReaperState {
-KVM_DIRTY_RING_REAPER_NONE = 0,
-/* The reaper is sleeping */
-KVM_DIRTY_RING_REAPER_WAIT,
-/* The reaper is reaping for dirty pages */
-KVM_DIRTY_RING_REAPER_REAPING,
-};
-
-/*
- * KVM reaper instance, responsible for collecting the KVM dirty bits
- * via the dirty ring.
- */
-struct KVMDirtyRingReaper {
-/* The reaper thread */
-QemuThread reaper_thr;
-volatile uint64_t reaper_iteration; /* iteration number of reaper thr */
-volatile enum KVMDirtyRingReaperState reaper_state; /* reap thr state */
-};
-
-struct KVMState
-{
-AccelState parent_obj;
-
-int nr_slots;
-int fd;
-int vmfd;
-int coalesced_mmio;
-int coalesced_pio;
-struct kvm_coalesced_mmio_ring *coalesced_mmio_ring;
-bool coalesced_flush_in_progress;
-int vcpu_events;
-int robust_singlestep;
-int debugregs;
-#ifdef KVM_CAP_SET_GUEST_DEBUG
-QTAILQ_HEAD(, kvm_sw_breakpoint) kvm_sw_breakpoints;
-#endif
-int max_nested_state_len;
-int many_ioeventfds;
-int intx_set_mask;
-int kvm_shadow_mem;
-bool kernel_irqchip_allowed;
-bool kernel_irqchip_required;
-OnOffAuto kernel_irqchip_split;
-bool sync_mmu;
-uint64_t manual_dirty_log_protect;
-/* The man page (and posix) say ioctl numbers are signed int, but
- * they're not.  Linux, glibc and *BSD all treat ioctl numbers as
- * unsigned, and treating them as signed here can break things */
-unsigned irq_set_ioctl;
-unsigned int sigmask_len;
-GHashTable *gsimap;
-#ifdef KVM_CAP_IRQ_ROUTING
-struct kvm_irq_routing *irq_routes;
-int nr_allocated_irq_routes;
-unsigned long *used_gsi_bitmap;
-unsigned int gsi_count;
-QTAILQ_HEAD(, KVMMSIRoute) msi_hashtab[KVM_MSI_HASHTAB_SIZE];
-#endif
-KVMMemoryListener memory_listener;
-QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus;
-
-/* For "info mtree -f" to tell if an MR is registered in KVM */
-int nr_as;
-struct KVMAs {
-KVMMemoryListener *ml;
-AddressSpace *as;
-} *as;
-uint64_t kvm_dirty_ring_bytes;  /* Size of the per-vcpu dirty ring */
-uint32_t kvm_dirty_ring_size;   /* Number of dirty GFNs per ring */
-struct KVMDirtyRingReaper reaper;
-};
-
 KVMState *kvm_state;
 bool kvm_kernel_irqchip;
 bool kvm_split_irqchip;
diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index 1f5487d9b7..3b4adcdc10 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -10,6 +10,7 @@
 #define QEMU_KVM_INT_H
 
 #include "exec/memory.h"
+#include "qapi/qapi-types-common.h"
 #include "qemu/accel.h"
 #include "sysemu/kvm.h"
 
@@ -36,6 +37,81 @@ typedef struct KVMMemoryListener {
 int as_id;
 } KVMMemoryListener;
 
+#define KVM_MSI_HASHTAB_SIZE256
+
+enum KVMDirtyRingReaperState {
+KVM_DIRTY_RING_REAPER_NONE = 0,
+/* The reaper is sleeping */
+KVM_DIRTY_RING_REAPER_WAIT,
+/* The reaper is reaping for dirty pages */
+KVM_DIRTY_RING_REAPER_REAPING,
+};
+
+/*
+ * KVM reaper instance, responsible for collecting the KVM dirty bits
+ * via the dirty ring.
+ */
+struct KVMDirtyRingReaper {
+/* The reaper thread */
+QemuThread reaper_thr;
+volatile uint64_t reaper_iteration; /* iteration number of reaper thr */
+volatile enum KVMDirtyRingReaperState reaper_state; /* reap thr state */
+};
+struct KVMState
+{
+AccelState parent_obj;
+
+int nr_slots;
+int fd;
+int vmfd;
+int coalesced_mmio;
+int coalesced_pio;
+struct kvm_coalesced_mmio_ring *coalesced_mmio_ring;
+bool coalesced_flush_in_progress;
+int vcpu_events;
+int robust_singlestep;
+int debugregs;
+#ifdef KVM_CAP_SET_GUEST_DEBUG
+QTAILQ_HEAD(, kvm_sw_breakpoint) kvm_sw_breakpoints;
+#endif
+int max_nested_state_len;
+int many_ioeventfds;
+int intx_set_mask;
+int kvm_shadow_mem;
+bool kernel_irqchip_allowed;
+bool kernel_irqchip_required;
+OnOffAuto kernel_irqchip_split;
+bool sync_mmu;
+uint64_t manual_dirty_log_protect;
+/* The man page (and posix) say ioctl numbers are signed int, but
+ * they're not.  Linux, glibc and *

[PULL 05/37] i386: add notify VM exit support

2022-10-11 Thread Paolo Bonzini
From: Chenyi Qiang 

There are cases in which a malicious virtual machine can cause the CPU to
get stuck (because event windows don't open up), e.g., an infinite loop in
microcode on a nested #AC (CVE-2015-5307). No event window means no event
(NMI, SMI or IRQ) can be delivered, which leaves the CPU unavailable to the
host or to other VMs. Notify VM exit is introduced to mitigate this kind of
attack: it generates a VM exit if no event window occurs in VM non-root
mode for a specified amount of time (the notify window).

A new KVM capability KVM_CAP_X86_NOTIFY_VMEXIT is exposed to user space
so that the user can query the capability and set the expected notify
window when creating VMs. The format of the argument when enabling this
capability is as follows:
  Bit 63:32 - notify window specified in qemu command
  Bit 31:0  - some flags (e.g. KVM_X86_NOTIFY_VMEXIT_ENABLED is set to
  enable the feature.)

Users can configure the feature by a new (x86 only) accel property:
qemu -accel kvm,notify-vmexit=run|internal-error|disable,notify-window=n

The default option of notify-vmexit is run, which enables the capability
and does nothing when the exit happens. The internal-error option raises a
KVM internal error if it happens. The disable option does not enable the
capability. The default value of notify-window is 0; it is valid only when
notify-vmexit is not disabled, and must be non-negative. It is even safe to
set it to zero, since an internal hardware threshold is added to ensure no
false positives.

A notify VM exit may happen with VM_CONTEXT_INVALID set in the exit
qualification (no cases are anticipated that would set this bit), which
means the VM context is corrupted. This is reflected in the flags of the
KVM_EXIT_NOTIFY exit. If the KVM_NOTIFY_CONTEXT_INVALID bit is set, a KVM
internal error is raised unconditionally.

Acked-by: Peter Xu 
Signed-off-by: Chenyi Qiang 
Message-Id: <20220929072014.20705-5-chenyi.qi...@intel.com>
Signed-off-by: Paolo Bonzini 
---
 accel/kvm/kvm-all.c   |  2 +
 qapi/run-state.json   | 17 
 qemu-options.hx   | 11 +
 target/i386/kvm/kvm.c | 98 +++
 4 files changed, 128 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index fbfe948398..f99b0becd8 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -3618,6 +3618,8 @@ static void kvm_accel_instance_init(Object *obj)
 s->kernel_irqchip_split = ON_OFF_AUTO_AUTO;
 /* KVM dirty ring is by default off */
 s->kvm_dirty_ring_size = 0;
+s->notify_vmexit = NOTIFY_VMEXIT_OPTION_RUN;
+s->notify_window = 0;
 }
 
 /**
diff --git a/qapi/run-state.json b/qapi/run-state.json
index 9273ea6516..49989d30e6 100644
--- a/qapi/run-state.json
+++ b/qapi/run-state.json
@@ -643,3 +643,20 @@
 { 'struct': 'MemoryFailureFlags',
   'data': { 'action-required': 'bool',
 'recursive': 'bool'} }
+
+##
+# @NotifyVmexitOption:
+#
+# An enumeration of the options specified when enabling notify VM exit
+#
+# @run: enable the feature, do nothing and continue if the notify VM exit happens.
+#
+# @internal-error: enable the feature, raise a internal error if the notify
+#  VM exit happens.
+#
+# @disable: disable the feature.
+#
+# Since: 7.2
+##
+{ 'enum': 'NotifyVmexitOption',
+  'data': [ 'run', 'internal-error', 'disable' ] }
\ No newline at end of file
diff --git a/qemu-options.hx b/qemu-options.hx
index 95b998a13b..606d379186 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -191,6 +191,7 @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
 "split-wx=on|off (enable TCG split w^x mapping)\n"
 "tb-size=n (TCG translation block cache size)\n"
 "dirty-ring-size=n (KVM dirty ring GFN count, default 0)\n"
+"notify-vmexit=run|internal-error|disable,notify-window=n (enable notify VM exit and set notify window, x86 only)\n"
"thread=single|multi (enable multi-threaded TCG)\n", QEMU_ARCH_ALL)
 SRST
 ``-accel name[,prop=value[,...]]``
@@ -242,6 +243,16 @@ SRST
 is disabled (dirty-ring-size=0).  When enabled, KVM will instead
 record dirty pages in a bitmap.
 
+``notify-vmexit=run|internal-error|disable,notify-window=n``
+Enables or disables notify VM exit support on x86 host and specify
+the corresponding notify window to trigger the VM exit if enabled.
+``run`` option enables the feature. It does nothing and continue
+if the exit happens. ``internal-error`` option enables the feature.
+It raises a internal error. ``disable`` option doesn't enable the feature.
+This feature can mitigate the CPU stuck issue due to event windows don't
+open up for a specified of time (i.e. notify-window).
+Default: notify-vmexit=run,notify-window=0.
+
 ERST
 
 DEF("smp", HAS_ARG, QEMU_OPTION_smp,
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 

[PULL 02/37] i386: kvm: extend kvm_{get, put}_vcpu_events to support pending triple fault

2022-10-11 Thread Paolo Bonzini
From: Chenyi Qiang 

For direct triple faults, i.e. those detected by hardware and morphed by
KVM into a VM exit, KVM will never lose them. But for triple faults
synthesized by KVM, e.g. on the RSM path, if KVM exits to userspace before
the request is serviced, userspace could migrate the VM and lose the
triple fault.

A new flag KVM_VCPUEVENT_VALID_TRIPLE_FAULT is defined to signal that the
event.triple_fault_pending field contains a valid state when the
KVM_CAP_X86_TRIPLE_FAULT_EVENT capability is enabled.

Acked-by: Peter Xu 
Signed-off-by: Chenyi Qiang 
Message-Id: <20220929072014.20705-2-chenyi.qi...@intel.com>
Signed-off-by: Paolo Bonzini 
---
 target/i386/cpu.c |  1 +
 target/i386/cpu.h |  1 +
 target/i386/kvm/kvm.c | 20 
 target/i386/machine.c | 20 
 4 files changed, 42 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ad623d91e4..06884177fa 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6017,6 +6017,7 @@ static void x86_cpu_reset(DeviceState *dev)
 env->exception_has_payload = false;
 env->exception_payload = 0;
 env->nmi_injected = false;
+env->triple_fault_pending = false;
 #if !defined(CONFIG_USER_ONLY)
 /* We hard-wire the BSP to the first CPU. */
 apic_designate_bsp(cpu->apic_state, s->cpu_index == 0);
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 82004b65b9..d4124973ce 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1739,6 +1739,7 @@ typedef struct CPUArchState {
 uint8_t has_error_code;
 uint8_t exception_has_payload;
 uint64_t exception_payload;
+uint8_t triple_fault_pending;
 uint32_t ins_len;
 uint32_t sipi_vector;
 bool tsc_valid;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index f2a96492ce..3ebe8b7f1f 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -132,6 +132,7 @@ static int has_xcrs;
 static int has_pit_state2;
 static int has_sregs2;
 static int has_exception_payload;
+static int has_triple_fault_event;
 
 static bool has_msr_mcg_ext_ctl;
 
@@ -2479,6 +2480,16 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 }
 }
 
+has_triple_fault_event = kvm_check_extension(s, KVM_CAP_X86_TRIPLE_FAULT_EVENT);
+if (has_triple_fault_event) {
+ret = kvm_vm_enable_cap(s, KVM_CAP_X86_TRIPLE_FAULT_EVENT, 0, true);
+if (ret < 0) {
+error_report("kvm: Failed to enable triple fault event cap: %s",
+ strerror(-ret));
+return ret;
+}
+}
+
 ret = kvm_get_supported_msrs(s);
 if (ret < 0) {
 return ret;
@@ -4295,6 +4306,11 @@ static int kvm_put_vcpu_events(X86CPU *cpu, int level)
 }
 }
 
+if (has_triple_fault_event) {
+events.flags |= KVM_VCPUEVENT_VALID_TRIPLE_FAULT;
+events.triple_fault.pending = env->triple_fault_pending;
+}
+
 return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
 }
 
@@ -4364,6 +4380,10 @@ static int kvm_get_vcpu_events(X86CPU *cpu)
 }
 }
 
+if (events.flags & KVM_VCPUEVENT_VALID_TRIPLE_FAULT) {
+env->triple_fault_pending = events.triple_fault.pending;
+}
+
 env->sipi_vector = events.sipi_vector;
 
 return 0;
diff --git a/target/i386/machine.c b/target/i386/machine.c
index cecd476e98..310b125235 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -1562,6 +1562,25 @@ static const VMStateDescription vmstate_arch_lbr = {
 }
 };
 
+static bool triple_fault_needed(void *opaque)
+{
+X86CPU *cpu = opaque;
+CPUX86State *env = &cpu->env;
+
+return env->triple_fault_pending;
+}
+
+static const VMStateDescription vmstate_triple_fault = {
+.name = "cpu/triple_fault",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = triple_fault_needed,
+.fields = (VMStateField[]) {
+VMSTATE_UINT8(env.triple_fault_pending, X86CPU),
+VMSTATE_END_OF_LIST()
+}
+};
+
 const VMStateDescription vmstate_x86_cpu = {
 .name = "cpu",
 .version_id = 12,
@@ -1706,6 +1725,7 @@ const VMStateDescription vmstate_x86_cpu = {
 &vmstate_amx_xtile,
 #endif
 &vmstate_arch_lbr,
+&vmstate_triple_fault,
 NULL
 }
 };
-- 
2.37.3




[PULL 07/37] target/i386: Return bool from disas_insn

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Instead of returning the new pc, which is present in
DisasContext, return true if an insn was translated.
This is false when we detect a page crossing and must
undo the insn under translation.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
Reviewed-by: Paolo Bonzini 
Message-Id: <20221001140935.465607-3-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 44 +++--
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 16bf56dbc7..3f3e79c096 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -4707,7 +4707,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b)
 
 /* convert one instruction. s->base.is_jmp is set if the translation must
be stopped. Return the next pc value */
-static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
+static bool disas_insn(DisasContext *s, CPUState *cpu)
 {
 CPUX86State *env = cpu->env_ptr;
 int b, prefixes;
@@ -4734,15 +4734,16 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 break;
 case 1:
 gen_exception_gpf(s);
-return s->pc;
+return true;
 case 2:
 /* Restore state that may affect the next instruction. */
+s->pc = s->base.pc_next;
 s->cc_op_dirty = orig_cc_op_dirty;
 s->cc_op = orig_cc_op;
 s->base.num_insns--;
 tcg_remove_ops_after(s->prev_insn_end);
 s->base.is_jmp = DISAS_TOO_MANY;
-return s->base.pc_next;
+return false;
 default:
 g_assert_not_reached();
 }
@@ -8644,13 +8645,13 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 default:
 goto unknown_op;
 }
-return s->pc;
+return true;
  illegal_op:
 gen_illegal_opcode(s);
-return s->pc;
+return true;
  unknown_op:
 gen_unknown_opcode(env, s);
-return s->pc;
+return true;
 }
 
 void tcg_x86_init(void)
@@ -8815,7 +8816,6 @@ static void i386_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
 static void i386_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
 {
 DisasContext *dc = container_of(dcbase, DisasContext, base);
-target_ulong pc_next;
 
 #ifdef TARGET_VSYSCALL_PAGE
 /*
@@ -8828,21 +8828,23 @@ static void i386_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
 }
 #endif
 
-pc_next = disas_insn(dc, cpu);
-dc->base.pc_next = pc_next;
+if (disas_insn(dc, cpu)) {
+target_ulong pc_next = dc->pc;
+dc->base.pc_next = pc_next;
 
-if (dc->base.is_jmp == DISAS_NEXT) {
-if (dc->flags & (HF_TF_MASK | HF_INHIBIT_IRQ_MASK)) {
-/*
- * If single step mode, we generate only one instruction and
- * generate an exception.
- * If irq were inhibited with HF_INHIBIT_IRQ_MASK, we clear
- * the flag and abort the translation to give the irqs a
- * chance to happen.
- */
-dc->base.is_jmp = DISAS_TOO_MANY;
-} else if (!is_same_page(&dc->base, pc_next)) {
-dc->base.is_jmp = DISAS_TOO_MANY;
+if (dc->base.is_jmp == DISAS_NEXT) {
+if (dc->flags & (HF_TF_MASK | HF_INHIBIT_IRQ_MASK)) {
+/*
+ * If single step mode, we generate only one instruction and
+ * generate an exception.
+ * If irq were inhibited with HF_INHIBIT_IRQ_MASK, we clear
+ * the flag and abort the translation to give the irqs a
+ * chance to happen.
+ */
+dc->base.is_jmp = DISAS_TOO_MANY;
+} else if (!is_same_page(&dc->base, pc_next)) {
+dc->base.is_jmp = DISAS_TOO_MANY;
+}
 }
 }
 }
-- 
2.37.3




[PULL 14/37] target/i386: Use DISAS_EOB_NEXT

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Replace sequences of gen_update_cc_op, gen_update_eip_next,
and gen_eob with the new is_jmp enumerator.

Reviewed-by: Paolo Bonzini 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-10-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 40 -
 1 file changed, 13 insertions(+), 27 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 8c0ef0f212..717c978381 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -7022,8 +7022,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_pop_update(s, ot);
 set_cc_op(s, CC_OP_EFLAGS);
 /* abort translation because TF/AC flag may change */
-gen_update_eip_next(s);
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_NEXT;
 }
 break;
 case 0x9e: /* sahf */
@@ -7452,8 +7451,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_helper_rdmsr(cpu_env);
 } else {
 gen_helper_wrmsr(cpu_env);
-gen_update_eip_next(s);
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_NEXT;
 }
 }
 break;
@@ -7652,8 +7650,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 goto illegal_op;
 }
 gen_helper_clac(cpu_env);
-gen_update_eip_next(s);
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_NEXT;
 break;
 
 case 0xcb: /* stac */
@@ -7662,8 +7659,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 goto illegal_op;
 }
 gen_helper_stac(cpu_env);
-gen_update_eip_next(s);
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_NEXT;
 break;
 
 CASE_MODRM_MEM_OP(1): /* sidt */
@@ -7707,8 +7703,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[R_ECX]);
 gen_helper_xsetbv(cpu_env, s->tmp2_i32, s->tmp1_i64);
 /* End TB because translation flags may change.  */
-gen_update_eip_next(s);
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_NEXT;
 break;
 
 case 0xd8: /* VMRUN */
@@ -7769,8 +7764,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 gen_update_cc_op(s);
 gen_helper_stgi(cpu_env);
-gen_update_eip_next(s);
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_NEXT;
 break;
 
 case 0xdd: /* CLGI */
@@ -7808,8 +7802,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 tcg_gen_ext32u_tl(s->A0, cpu_regs[R_EAX]);
 }
 gen_helper_flush_page(cpu_env, s->A0);
-gen_update_eip_next(s);
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_NEXT;
 break;
 
 CASE_MODRM_MEM_OP(2): /* lgdt */
@@ -7892,8 +7885,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 tcg_gen_andi_tl(s->T1, s->T1, ~0xe);
 tcg_gen_or_tl(s->T0, s->T0, s->T1);
 gen_helper_write_crN(cpu_env, tcg_constant_i32(0), s->T0);
-gen_update_eip_next(s);
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_NEXT;
 break;
 
 CASE_MODRM_MEM_OP(7): /* invlpg */
@@ -7903,8 +7895,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_svm_check_intercept(s, SVM_EXIT_INVLPG);
 gen_lea_modrm(env, s, modrm);
 gen_helper_flush_page(cpu_env, s->A0);
-gen_update_eip_next(s);
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_NEXT;
 break;
 
 case 0xf8: /* swapgs */
@@ -8303,8 +8294,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_svm_check_intercept(s, SVM_EXIT_WRITE_CR0 + reg);
 gen_op_mov_v_reg(s, ot, s->T0, rm);
 gen_helper_write_crN(cpu_env, tcg_constant_i32(reg), s->T0);
-gen_update_eip_next(s);
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_NEXT;
 } else {
 gen_svm_check_intercept(s, SVM_EXIT_READ_CR0 + reg);
 gen_helper_read_crN(s->T0, cpu_env, tcg_constant_i32(reg));
@@ -8338,8 +8328,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_op_mov_v_reg(s, ot, s->T0, rm);
 tcg_gen_movi_i32(s->tmp2_i32, reg);
 gen_helper_set_dr(cpu_env, s->tmp2_i32, s->T0);
-gen_update_eip_next(s);
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_NEXT;
 } else {
 gen_svm_check_intercept(s, SVM_EXIT_READ_DR0 + reg);
 tcg_gen_movi_i32(s->tmp2_i32, reg);
@@ -8353,8 +

[PULL 10/37] target/i386: Create gen_update_eip_cur

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Like gen_update_cc_op, sync EIP before doing something
that could raise an exception.  Replace all gen_jmp_im
that use s->base.pc_next.

Reviewed-by: Paolo Bonzini 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-6-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 52 -
 1 file changed, 28 insertions(+), 24 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 5a9c3b1e71..85253e1e17 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -511,10 +511,14 @@ static inline void gen_op_st_rm_T0_A0(DisasContext *s, int idx, int d)
 }
 }
 
-static inline void gen_jmp_im(DisasContext *s, target_ulong pc)
+static void gen_jmp_im(DisasContext *s, target_ulong pc)
 {
-tcg_gen_movi_tl(s->tmp0, pc);
-gen_op_jmp_v(s->tmp0);
+gen_op_jmp_v(tcg_constant_tl(pc));
+}
+
+static void gen_update_eip_cur(DisasContext *s)
+{
+gen_jmp_im(s, s->base.pc_next - s->cs_base);
 }
 
 /* Compute SEG:REG into A0.  SEG is selected from the override segment
@@ -703,7 +707,7 @@ static bool gen_check_io(DisasContext *s, MemOp ot, TCGv_i32 port,
 target_ulong next_eip = s->pc - s->cs_base;
 
 gen_update_cc_op(s);
-gen_jmp_im(s, cur_eip);
+gen_update_eip_cur(s);
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
 svm_flags |= SVM_IOIO_REP_MASK;
 }
@@ -1335,7 +1339,7 @@ static void gen_helper_fp_arith_STN_ST0(int op, int opreg)
 static void gen_exception(DisasContext *s, int trapno)
 {
 gen_update_cc_op(s);
-gen_jmp_im(s, s->base.pc_next - s->cs_base);
+gen_update_eip_cur(s);
 gen_helper_raise_exception(cpu_env, tcg_const_i32(trapno));
 s->base.is_jmp = DISAS_NORETURN;
 }
@@ -2630,7 +2634,7 @@ static void gen_unknown_opcode(CPUX86State *env, DisasContext *s)
 static void gen_interrupt(DisasContext *s, int intno)
 {
 gen_update_cc_op(s);
-gen_jmp_im(s, s->base.pc_next - s->cs_base);
+gen_update_eip_cur(s);
 gen_helper_raise_interrupt(cpu_env, tcg_constant_i32(intno),
tcg_constant_i32(s->pc - s->base.pc_next));
 s->base.is_jmp = DISAS_NORETURN;
@@ -6831,7 +6835,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 do_lret:
 if (PE(s) && !VM86(s)) {
 gen_update_cc_op(s);
-gen_jmp_im(s, s->base.pc_next - s->cs_base);
+gen_update_eip_cur(s);
 gen_helper_lret_protected(cpu_env, tcg_const_i32(dflag - 1),
   tcg_const_i32(val));
 } else {
@@ -7327,7 +7331,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 if (prefixes & PREFIX_REPZ) {
 gen_update_cc_op(s);
-gen_jmp_im(s, s->base.pc_next - s->cs_base);
+gen_update_eip_cur(s);
 gen_helper_pause(cpu_env, tcg_const_i32(s->pc - s->base.pc_next));
 s->base.is_jmp = DISAS_NORETURN;
 }
@@ -7353,7 +7357,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 if (CODE64(s))
 goto illegal_op;
 gen_update_cc_op(s);
-gen_jmp_im(s, s->base.pc_next - s->cs_base);
+gen_update_eip_cur(s);
 gen_helper_into(cpu_env, tcg_const_i32(s->pc - s->base.pc_next));
 break;
 #ifdef WANT_ICEBP
@@ -7460,7 +7464,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 case 0x132: /* rdmsr */
 if (check_cpl0(s)) {
 gen_update_cc_op(s);
-gen_jmp_im(s, s->base.pc_next - s->cs_base);
+gen_update_eip_cur(s);
 if (b & 2) {
 gen_helper_rdmsr(cpu_env);
 } else {
@@ -7472,7 +7476,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 break;
 case 0x131: /* rdtsc */
 gen_update_cc_op(s);
-gen_jmp_im(s, s->base.pc_next - s->cs_base);
+gen_update_eip_cur(s);
 if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
 gen_io_start();
 }
@@ -7483,7 +7487,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 break;
 case 0x133: /* rdpmc */
 gen_update_cc_op(s);
-gen_jmp_im(s, s->base.pc_next - s->cs_base);
+gen_update_eip_cur(s);
 gen_helper_rdpmc(cpu_env);
 s->base.is_jmp = DISAS_NORETURN;
 break;
@@ -7513,7 +7517,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 case 0x105: /* syscall */
 /* XXX: is it usable in real mode ? */
 gen_update_cc_op(s);
-gen_jmp_im(s, s->base.pc_next - s->cs_base);
+gen_update_eip_cur(s);
 gen_helper_syscall(cpu_env, tcg_const_i32(s->pc - s->base.pc_next));
 /* TF handling for the syscall insn is different. The TF bit is checked
after the syscall insn completes. This allows #DB to not be
@@ -7539,13 +7543,13 @@ static bool disas_in

[PULL 06/37] target/i386: Remove pc_start

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

The DisasContext member and the disas_insn local variable of
the same name are identical to DisasContextBase.pc_next.

Reviewed-by: Paolo Bonzini 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-2-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 114 +++-
 1 file changed, 60 insertions(+), 54 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 44af8c107f..16bf56dbc7 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -76,7 +76,6 @@ typedef struct DisasContext {
 DisasContextBase base;
 
 target_ulong pc;   /* pc = eip + cs_base */
-target_ulong pc_start; /* pc at TB entry */
 target_ulong cs_base;  /* base of CS segment */
 
 MemOp aflag;
@@ -1345,13 +1344,13 @@ static void gen_exception(DisasContext *s, int trapno, target_ulong cur_eip)
the instruction is known, but it isn't allowed in the current cpu mode.  */
 static void gen_illegal_opcode(DisasContext *s)
 {
-gen_exception(s, EXCP06_ILLOP, s->pc_start - s->cs_base);
+gen_exception(s, EXCP06_ILLOP, s->base.pc_next - s->cs_base);
 }
 
 /* Generate #GP for the current instruction. */
 static void gen_exception_gpf(DisasContext *s)
 {
-gen_exception(s, EXCP0D_GPF, s->pc_start - s->cs_base);
+gen_exception(s, EXCP0D_GPF, s->base.pc_next - s->cs_base);
 }
 
 /* Check for cpl == 0; if not, raise #GP and return false. */
@@ -2016,7 +2015,7 @@ static uint64_t advance_pc(CPUX86State *env, DisasContext *s, int num_bytes)
 }
 
 s->pc += num_bytes;
-if (unlikely(s->pc - s->pc_start > X86_MAX_INSN_LENGTH)) {
+if (unlikely(s->pc - s->base.pc_next > X86_MAX_INSN_LENGTH)) {
 /* If the instruction's 16th byte is on a different page than the 1st, a
  * page fault on the second page wins over the general protection fault
  * caused by the instruction being too long.
@@ -2614,7 +2613,7 @@ static void gen_unknown_opcode(CPUX86State *env, DisasContext *s)
 if (qemu_loglevel_mask(LOG_UNIMP)) {
 FILE *logfile = qemu_log_trylock();
 if (logfile) {
-target_ulong pc = s->pc_start, end = s->pc;
+target_ulong pc = s->base.pc_next, end = s->pc;
 
 fprintf(logfile, "ILLOPC: " TARGET_FMT_lx ":", pc);
 for (; pc < end; ++pc) {
@@ -3226,8 +3225,7 @@ static const struct SSEOpHelper_table7 sse_op_table7[256] = {
 goto illegal_op; \
 } while (0)
 
-static void gen_sse(CPUX86State *env, DisasContext *s, int b,
-target_ulong pc_start)
+static void gen_sse(CPUX86State *env, DisasContext *s, int b)
 {
 int b1, op1_offset, op2_offset, is_xmm, val;
 int modrm, mod, rm, reg;
@@ -3269,7 +3267,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 }
 /* simple MMX/SSE operation */
 if (s->flags & HF_TS_MASK) {
-gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
+gen_exception(s, EXCP07_PREX, s->base.pc_next - s->cs_base);
 return;
 }
 if (s->flags & HF_EM_MASK) {
@@ -4717,11 +4715,10 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 MemOp ot, aflag, dflag;
 int modrm, reg, rm, mod, op, opreg, val;
 target_ulong next_eip, tval;
-target_ulong pc_start = s->base.pc_next;
 bool orig_cc_op_dirty = s->cc_op_dirty;
 CCOp orig_cc_op = s->cc_op;
 
-s->pc_start = s->pc = pc_start;
+s->pc = s->base.pc_next;
 s->override = -1;
 #ifdef TARGET_X86_64
 s->rex_w = false;
@@ -4745,7 +4742,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 s->base.num_insns--;
 tcg_remove_ops_after(s->prev_insn_end);
 s->base.is_jmp = DISAS_TOO_MANY;
-return pc_start;
+return s->base.pc_next;
 default:
 g_assert_not_reached();
 }
@@ -6079,7 +6076,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
 if (s->flags & (HF_EM_MASK | HF_TS_MASK)) {
 /* if CR0.EM or CR0.TS are set, generate an FPU exception */
 /* XXX: what to do if illegal op ? */
-gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
+gen_exception(s, EXCP07_PREX, s->base.pc_next - s->cs_base);
 break;
 }
 modrm = x86_ldub_code(env, s);
@@ -6620,7 +6617,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
offsetof(CPUX86State, segs[R_CS].selector));
 tcg_gen_st16_i32(s->tmp2_i32, cpu_env,
  offsetof(CPUX86State, fpcs));
-tcg_gen_st_tl(tcg_constant_tl(pc_start - s->cs_base),
+tcg_gen_st_tl(tcg_constant_tl(s->base.pc_next - s->cs_base),
   cpu_env, offsetof(CPUX86State, fpip));
 }
 }
@@ -6632,7 +6629,

[PULL 16/37] target/i386: Create cur_insn_len, cur_insn_len_i32

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Create common routines for computing the length of the insn.
Use tcg_constant_i32 in the new function, while we're at it.

Reviewed-by: Paolo Bonzini 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-12-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 31 +++
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 6b16c0b62c..fe99c4361c 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -530,6 +530,16 @@ static void gen_update_eip_next(DisasContext *s)
 gen_jmp_im(s, s->pc - s->cs_base);
 }
 
+static int cur_insn_len(DisasContext *s)
+{
+return s->pc - s->base.pc_next;
+}
+
+static TCGv_i32 cur_insn_len_i32(DisasContext *s)
+{
+return tcg_constant_i32(cur_insn_len(s));
+}
+
 /* Compute SEG:REG into A0.  SEG is selected from the override segment
(OVR_SEG) and the default segment (DEF_SEG).  OVR_SEG may be -1 to
indicate no override.  */
@@ -712,9 +722,6 @@ static bool gen_check_io(DisasContext *s, MemOp ot, TCGv_i32 port,
 gen_helper_check_io(cpu_env, port, tcg_constant_i32(1 << ot));
 }
 if (GUEST(s)) {
-target_ulong cur_eip = s->base.pc_next - s->cs_base;
-target_ulong next_eip = s->pc - s->cs_base;
-
 gen_update_cc_op(s);
 gen_update_eip_cur(s);
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
@@ -723,7 +730,7 @@ static bool gen_check_io(DisasContext *s, MemOp ot, TCGv_i32 port,
 svm_flags |= 1 << (SVM_IOIO_SIZE_SHIFT + ot);
 gen_helper_svm_check_io(cpu_env, port,
 tcg_constant_i32(svm_flags),
-tcg_constant_i32(next_eip - cur_eip));
+cur_insn_len_i32(s));
 }
 return true;
 #endif
@@ -2028,7 +2035,7 @@ static uint64_t advance_pc(CPUX86State *env, DisasContext *s, int num_bytes)
 }
 
 s->pc += num_bytes;
-if (unlikely(s->pc - s->base.pc_next > X86_MAX_INSN_LENGTH)) {
+if (unlikely(cur_insn_len(s) > X86_MAX_INSN_LENGTH)) {
 /* If the instruction's 16th byte is on a different page than the 1st, a
  * page fault on the second page wins over the general protection fault
  * caused by the instruction being too long.
@@ -2647,7 +2654,7 @@ static void gen_interrupt(DisasContext *s, int intno)
 gen_update_cc_op(s);
 gen_update_eip_cur(s);
 gen_helper_raise_interrupt(cpu_env, tcg_constant_i32(intno),
-   tcg_constant_i32(s->pc - s->base.pc_next));
+   cur_insn_len_i32(s));
 s->base.is_jmp = DISAS_NORETURN;
 }
 
@@ -7314,7 +7321,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 if (prefixes & PREFIX_REPZ) {
 gen_update_cc_op(s);
 gen_update_eip_cur(s);
-gen_helper_pause(cpu_env, tcg_const_i32(s->pc - s->base.pc_next));
+gen_helper_pause(cpu_env, cur_insn_len_i32(s));
 s->base.is_jmp = DISAS_NORETURN;
 }
 break;
@@ -7340,7 +7347,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 goto illegal_op;
 gen_update_cc_op(s);
 gen_update_eip_cur(s);
-gen_helper_into(cpu_env, tcg_const_i32(s->pc - s->base.pc_next));
+gen_helper_into(cpu_env, cur_insn_len_i32(s));
 break;
 #ifdef WANT_ICEBP
 case 0xf1: /* icebp (undocumented, exits to external debugger) */
@@ -7499,7 +7506,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 /* XXX: is it usable in real mode ? */
 gen_update_cc_op(s);
 gen_update_eip_cur(s);
-gen_helper_syscall(cpu_env, tcg_const_i32(s->pc - s->base.pc_next));
+gen_helper_syscall(cpu_env, cur_insn_len_i32(s));
 /* TF handling for the syscall insn is different. The TF bit is checked
after the syscall insn completes. This allows #DB to not be
generated after one has entered CPL0 if TF is set in FMASK.  */
@@ -7531,7 +7538,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 if (check_cpl0(s)) {
 gen_update_cc_op(s);
 gen_update_eip_cur(s);
-gen_helper_hlt(cpu_env, tcg_const_i32(s->pc - s->base.pc_next));
+gen_helper_hlt(cpu_env, cur_insn_len_i32(s));
 s->base.is_jmp = DISAS_NORETURN;
 }
 break;
@@ -7640,7 +7647,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 gen_update_cc_op(s);
 gen_update_eip_cur(s);
-gen_helper_mwait(cpu_env, tcg_const_i32(s->pc - s->base.pc_next));
+gen_helper_mwait(cpu_env, cur_insn_len_i32(s));
 s->base.is_jmp = DISAS_NORETURN;
 break;
 
@@ -7716,7 +7723,7 @@ static bool disas_insn(DisasContext *s, CPUState

[PULL 08/37] target/i386: Remove cur_eip argument to gen_exception

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

All callers pass s->base.pc_next - s->cs_base, which we can just
as well compute within the function.  Note the special case of
EXCP_VSYSCALL in which s->cs_base wasn't subtracted, but cs_base
is always zero in 64-bit mode, when vsyscall is used.

Reviewed-by: Paolo Bonzini 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-4-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 3f3e79c096..617832fcb0 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1332,10 +1332,10 @@ static void gen_helper_fp_arith_STN_ST0(int op, int opreg)
 }
 }
 
-static void gen_exception(DisasContext *s, int trapno, target_ulong cur_eip)
+static void gen_exception(DisasContext *s, int trapno)
 {
 gen_update_cc_op(s);
-gen_jmp_im(s, cur_eip);
+gen_jmp_im(s, s->base.pc_next - s->cs_base);
 gen_helper_raise_exception(cpu_env, tcg_const_i32(trapno));
 s->base.is_jmp = DISAS_NORETURN;
 }
@@ -1344,13 +1344,13 @@ static void gen_exception(DisasContext *s, int trapno, target_ulong cur_eip)
the instruction is known, but it isn't allowed in the current cpu mode.  */
 static void gen_illegal_opcode(DisasContext *s)
 {
-gen_exception(s, EXCP06_ILLOP, s->base.pc_next - s->cs_base);
+gen_exception(s, EXCP06_ILLOP);
 }
 
 /* Generate #GP for the current instruction. */
 static void gen_exception_gpf(DisasContext *s)
 {
-gen_exception(s, EXCP0D_GPF, s->base.pc_next - s->cs_base);
+gen_exception(s, EXCP0D_GPF);
 }
 
 /* Check for cpl == 0; if not, raise #GP and return false. */
@@ -3267,7 +3267,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b)
 }
 /* simple MMX/SSE operation */
 if (s->flags & HF_TS_MASK) {
-gen_exception(s, EXCP07_PREX, s->base.pc_next - s->cs_base);
+gen_exception(s, EXCP07_PREX);
 return;
 }
 if (s->flags & HF_EM_MASK) {
@@ -6077,7 +6077,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 if (s->flags & (HF_EM_MASK | HF_TS_MASK)) {
 /* if CR0.EM or CR0.TS are set, generate an FPU exception */
 /* XXX: what to do if illegal op ? */
-gen_exception(s, EXCP07_PREX, s->base.pc_next - s->cs_base);
+gen_exception(s, EXCP07_PREX);
 break;
 }
 modrm = x86_ldub_code(env, s);
@@ -7302,7 +7302,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 goto illegal_op;
 val = x86_ldub_code(env, s);
 if (val == 0) {
-gen_exception(s, EXCP00_DIVZ, s->base.pc_next - s->cs_base);
+gen_exception(s, EXCP00_DIVZ);
 } else {
 gen_helper_aam(cpu_env, tcg_const_i32(val));
 set_cc_op(s, CC_OP_LOGICB);
@@ -7336,7 +7336,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 case 0x9b: /* fwait */
 if ((s->flags & (HF_MP_MASK | HF_TS_MASK)) ==
 (HF_MP_MASK | HF_TS_MASK)) {
-gen_exception(s, EXCP07_PREX, s->base.pc_next - s->cs_base);
+gen_exception(s, EXCP07_PREX);
 } else {
 gen_helper_fwait(cpu_env);
 }
@@ -8393,7 +8393,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 goto illegal_op;
 }
 if ((s->flags & HF_EM_MASK) || (s->flags & HF_TS_MASK)) {
-gen_exception(s, EXCP07_PREX, s->base.pc_next - s->cs_base);
+gen_exception(s, EXCP07_PREX);
 break;
 }
 gen_lea_modrm(env, s, modrm);
@@ -8406,7 +8406,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 goto illegal_op;
 }
 if ((s->flags & HF_EM_MASK) || (s->flags & HF_TS_MASK)) {
-gen_exception(s, EXCP07_PREX, s->base.pc_next - s->cs_base);
+gen_exception(s, EXCP07_PREX);
 break;
 }
 gen_lea_modrm(env, s, modrm);
@@ -8418,7 +8418,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 goto illegal_op;
 }
 if (s->flags & HF_TS_MASK) {
-gen_exception(s, EXCP07_PREX, s->base.pc_next - s->cs_base);
+gen_exception(s, EXCP07_PREX);
 break;
 }
 gen_lea_modrm(env, s, modrm);
@@ -8431,7 +8431,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 goto illegal_op;
 }
 if (s->flags & HF_TS_MASK) {
-gen_exception(s, EXCP07_PREX, s->base.pc_next - s->cs_base);
+gen_exception(s, EXCP07_PREX);
 break;
 }
 gen_helper_update_mxcsr(cpu_env);
@@ -8822,7 +8822

[PULL 18/37] target/i386: Introduce DISAS_JUMP

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Drop the unused dest argument to gen_jr().
Remove most of the calls to gen_jr, and use DISAS_JUMP.
Remove some unused loads of eip for lcall and ljmp.

Reviewed-by: Paolo Bonzini 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-14-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 24 +---
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index c8ef9f0356..7db6f617a1 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -135,6 +135,7 @@ typedef struct DisasContext {
 #define DISAS_EOB_ONLY DISAS_TARGET_0
 #define DISAS_EOB_NEXT DISAS_TARGET_1
 #define DISAS_EOB_INHIBIT_IRQ  DISAS_TARGET_2
+#define DISAS_JUMP DISAS_TARGET_3
 
 /* The environment in which user-only runs is constrained. */
 #ifdef CONFIG_USER_ONLY
@@ -222,7 +223,7 @@ STUB_HELPER(wrmsr, TCGv_env env)
 #endif
 
 static void gen_eob(DisasContext *s);
-static void gen_jr(DisasContext *s, TCGv dest);
+static void gen_jr(DisasContext *s);
 static void gen_jmp(DisasContext *s, target_ulong eip);
 static void gen_jmp_tb(DisasContext *s, target_ulong eip, int tb_num);
 static void gen_op(DisasContext *s1, int op, MemOp ot, int d);
@@ -2385,7 +2386,7 @@ static void gen_goto_tb(DisasContext *s, int tb_num, target_ulong eip)
 } else {
 /* jump to another page */
 gen_jmp_im(s, eip);
-gen_jr(s, s->tmp0);
+gen_jr(s);
 }
 }
 
@@ -2754,7 +2755,7 @@ static void gen_eob(DisasContext *s)
 }
 
 /* Jump to register */
-static void gen_jr(DisasContext *s, TCGv dest)
+static void gen_jr(DisasContext *s)
 {
 do_gen_eob_worker(s, false, false, true);
 }
@@ -5328,7 +5329,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_push_v(s, s->T1);
 gen_op_jmp_v(s->T0);
 gen_bnd_jmp(s);
-gen_jr(s, s->T0);
+s->base.is_jmp = DISAS_JUMP;
 break;
 case 3: /* lcall Ev */
 if (mod == 3) {
@@ -5349,8 +5350,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
   tcg_const_i32(dflag - 1),
   tcg_const_i32(s->pc - s->cs_base));
 }
-tcg_gen_ld_tl(s->tmp4, cpu_env, offsetof(CPUX86State, eip));
-gen_jr(s, s->tmp4);
+s->base.is_jmp = DISAS_JUMP;
 break;
 case 4: /* jmp Ev */
 if (dflag == MO_16) {
@@ -5358,7 +5358,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 gen_op_jmp_v(s->T0);
 gen_bnd_jmp(s);
-gen_jr(s, s->T0);
+s->base.is_jmp = DISAS_JUMP;
 break;
 case 5: /* ljmp Ev */
 if (mod == 3) {
@@ -5376,8 +5376,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_op_movl_seg_T0_vm(s, R_CS);
 gen_op_jmp_v(s->T1);
 }
-tcg_gen_ld_tl(s->tmp4, cpu_env, offsetof(CPUX86State, eip));
-gen_jr(s, s->tmp4);
+s->base.is_jmp = DISAS_JUMP;
 break;
 case 6: /* push Ev */
 gen_push_v(s, s->T0);
@@ -6808,7 +6807,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 /* Note that gen_pop_T0 uses a zero-extending load.  */
 gen_op_jmp_v(s->T0);
 gen_bnd_jmp(s);
-gen_jr(s, s->T0);
+s->base.is_jmp = DISAS_JUMP;
 break;
 case 0xc3: /* ret */
 ot = gen_pop_T0(s);
@@ -6816,7 +6815,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 /* Note that gen_pop_T0 uses a zero-extending load.  */
 gen_op_jmp_v(s->T0);
 gen_bnd_jmp(s);
-gen_jr(s, s->T0);
+s->base.is_jmp = DISAS_JUMP;
 break;
 case 0xca: /* lret im */
 val = x86_ldsw_code(env, s);
@@ -8846,6 +8845,9 @@ static void i386_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
 gen_update_eip_cur(dc);
 gen_eob_inhibit_irq(dc, true);
 break;
+case DISAS_JUMP:
+gen_jr(dc);
+break;
 default:
 g_assert_not_reached();
 }
-- 
2.37.3
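
For readers following the series, the shape of this change — opcode handlers stop calling gen_jr() themselves and instead record `DISAS_JUMP`, with a single gen_jr() emitted from i386_tr_tb_stop() — can be modeled in a few lines of plain C. This is a simplified standalone sketch, not QEMU code; the names mirror the patch but the TCG machinery is reduced to a counter:

```c
#include <assert.h>

/* Simplified stand-ins for the translator's block-exit dispositions. */
enum disas_jmp {
    DISAS_NEXT,     /* keep translating */
    DISAS_NORETURN, /* exit already emitted */
    DISAS_JUMP      /* indirect jump: emit one lookup-and-jump at block end */
};

struct ctx {
    enum disas_jmp is_jmp;
    int jr_emitted; /* counts how many times "gen_jr" code was emitted */
};

/* After the patch, handlers for jmp Ev, ret, etc. only record the
 * disposition instead of emitting the jump themselves. */
static void handle_jmp_ev(struct ctx *s) { s->is_jmp = DISAS_JUMP; }
static void handle_ret(struct ctx *s)    { s->is_jmp = DISAS_JUMP; }

/* One dispatch point (i386_tr_tb_stop in the patch) emits the
 * end-of-block code exactly once, whichever handler ran. */
static void tb_stop(struct ctx *s)
{
    switch (s->is_jmp) {
    case DISAS_JUMP:
        s->jr_emitted++; /* stands in for gen_jr(s) */
        break;
    default:
        break;
    }
}
```

The payoff is that every indirect-jump opcode shares one exit path, so later changes to gen_jr() only touch one place.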




[PULL 17/37] target/i386: Remove cur_eip, next_eip arguments to gen_repz*

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

All callers pass s->base.pc_next and s->pc, which we can just
as well compute within the functions.  Pull out common helpers
and reduce the amount of code under macros.

Reviewed-by: Paolo Bonzini 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-13-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 116 ++--
 1 file changed, 57 insertions(+), 59 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index fe99c4361c..c8ef9f0356 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -736,7 +736,7 @@ static bool gen_check_io(DisasContext *s, MemOp ot, TCGv_i32 port,
 #endif
 }
 
-static inline void gen_movs(DisasContext *s, MemOp ot)
+static void gen_movs(DisasContext *s, MemOp ot)
 {
 gen_string_movl_A0_ESI(s);
 gen_op_ld_v(s, ot, s->T0, s->A0);
@@ -1156,18 +1156,18 @@ static inline void gen_jcc1(DisasContext *s, int b, TCGLabel *l1)
 
 /* XXX: does not work with gdbstub "ice" single step - not a
serious problem */
-static TCGLabel *gen_jz_ecx_string(DisasContext *s, target_ulong next_eip)
+static TCGLabel *gen_jz_ecx_string(DisasContext *s)
 {
 TCGLabel *l1 = gen_new_label();
 TCGLabel *l2 = gen_new_label();
 gen_op_jnz_ecx(s, s->aflag, l1);
 gen_set_label(l2);
-gen_jmp_tb(s, next_eip, 1);
+gen_jmp_tb(s, s->pc - s->cs_base, 1);
 gen_set_label(l1);
 return l2;
 }
 
-static inline void gen_stos(DisasContext *s, MemOp ot)
+static void gen_stos(DisasContext *s, MemOp ot)
 {
 gen_op_mov_v_reg(s, MO_32, s->T0, R_EAX);
 gen_string_movl_A0_EDI(s);
@@ -1176,7 +1176,7 @@ static inline void gen_stos(DisasContext *s, MemOp ot)
 gen_op_add_reg_T0(s, s->aflag, R_EDI);
 }
 
-static inline void gen_lods(DisasContext *s, MemOp ot)
+static void gen_lods(DisasContext *s, MemOp ot)
 {
 gen_string_movl_A0_ESI(s);
 gen_op_ld_v(s, ot, s->T0, s->A0);
@@ -1185,7 +1185,7 @@ static inline void gen_lods(DisasContext *s, MemOp ot)
 gen_op_add_reg_T0(s, s->aflag, R_ESI);
 }
 
-static inline void gen_scas(DisasContext *s, MemOp ot)
+static void gen_scas(DisasContext *s, MemOp ot)
 {
 gen_string_movl_A0_EDI(s);
 gen_op_ld_v(s, ot, s->T1, s->A0);
@@ -1194,7 +1194,7 @@ static inline void gen_scas(DisasContext *s, MemOp ot)
 gen_op_add_reg_T0(s, s->aflag, R_EDI);
 }
 
-static inline void gen_cmps(DisasContext *s, MemOp ot)
+static void gen_cmps(DisasContext *s, MemOp ot)
 {
 gen_string_movl_A0_EDI(s);
 gen_op_ld_v(s, ot, s->T1, s->A0);
@@ -1222,7 +1222,7 @@ static void gen_bpt_io(DisasContext *s, TCGv_i32 t_port, int ot)
 }
 }
 
-static inline void gen_ins(DisasContext *s, MemOp ot)
+static void gen_ins(DisasContext *s, MemOp ot)
 {
 gen_string_movl_A0_EDI(s);
 /* Note: we must do this dummy write first to be restartable in
@@ -1238,7 +1238,7 @@ static inline void gen_ins(DisasContext *s, MemOp ot)
 gen_bpt_io(s, s->tmp2_i32, ot);
 }
 
-static inline void gen_outs(DisasContext *s, MemOp ot)
+static void gen_outs(DisasContext *s, MemOp ot)
 {
 gen_string_movl_A0_ESI(s);
 gen_op_ld_v(s, ot, s->T0, s->A0);
@@ -1252,42 +1252,49 @@ static inline void gen_outs(DisasContext *s, MemOp ot)
 gen_bpt_io(s, s->tmp2_i32, ot);
 }
 
-/* same method as Valgrind : we generate jumps to current or next
-   instruction */
-#define GEN_REPZ(op)  \
-static inline void gen_repz_ ## op(DisasContext *s, MemOp ot,  \
- target_ulong cur_eip, target_ulong next_eip) \
-{ \
-TCGLabel *l2; \
-gen_update_cc_op(s);  \
-l2 = gen_jz_ecx_string(s, next_eip);  \
-gen_ ## op(s, ot);\
-gen_op_add_reg_im(s, s->aflag, R_ECX, -1);\
-/* a loop would cause two single step exceptions if ECX = 1   \
-   before rep string_insn */  \
-if (s->repz_opt)  \
-gen_op_jz_ecx(s, s->aflag, l2);   \
-gen_jmp(s, cur_eip);  \
+/* Generate jumps to current or next instruction */
+static void gen_repz(DisasContext *s, MemOp ot,
+ void (*fn)(DisasContext *s, MemOp ot))
+{
+TCGLabel *l2;
+gen_update_cc_op(s);
+l2 = gen_jz_ecx_string(s);
+fn(s, ot);
+gen_op_add_reg_im(s, s->aflag, R_ECX, -1);
+/*
+ * A loop would cause two single step exceptions if ECX = 1
+ * before rep string_insn
+ */
+if (s->repz_opt) {
+

[PULL 15/37] target/i386: Use DISAS_EOB_ONLY

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Replace lone calls to gen_eob() with the new enumerator.

Reviewed-by: Paolo Bonzini 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-11-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 717c978381..6b16c0b62c 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -6835,7 +6835,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 /* add stack offset */
 gen_stack_update(s, val + (2 << dflag));
 }
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_ONLY;
 break;
 case 0xcb: /* lret */
 val = 0;
@@ -6853,7 +6853,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
   tcg_const_i32(s->pc - s->cs_base));
 }
 set_cc_op(s, CC_OP_EFLAGS);
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_ONLY;
 break;
 case 0xe8: /* call im */
 {
@@ -7439,7 +7439,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_set_label(l1);
 gen_jmp_im(s, tval);
 gen_set_label(l2);
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_ONLY;
 }
 break;
 case 0x130: /* wrmsr */
@@ -7480,7 +7480,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_exception_gpf(s);
 } else {
 gen_helper_sysenter(cpu_env);
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_ONLY;
 }
 break;
 case 0x135: /* sysexit */
@@ -7491,7 +7491,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_exception_gpf(s);
 } else {
 gen_helper_sysexit(cpu_env, tcg_const_i32(dflag - 1));
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_ONLY;
 }
 break;
 #ifdef TARGET_X86_64
@@ -8574,7 +8574,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_update_eip_next(s);
 gen_helper_rsm(cpu_env);
 #endif /* CONFIG_USER_ONLY */
-gen_eob(s);
+s->base.is_jmp = DISAS_EOB_ONLY;
 break;
 case 0x1b8: /* SSE4.2 popcnt */
 if ((prefixes & (PREFIX_REPZ | PREFIX_LOCK | PREFIX_REPNZ)) !=
-- 
2.37.3
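
The refactor above is behavior-preserving: a lone gen_eob() call and a recorded `DISAS_EOB_ONLY` disposition emit the same end-of-block code, just from different places. A minimal sketch of that equivalence (simplified plain C, not QEMU code; the names are borrowed from the patch for illustration):

```c
#include <assert.h>

/* Two ways to end a translation block: call gen_eob() inline (old style),
 * or record DISAS_EOB_ONLY and let the driver emit it (this patch). */
enum disas_jmp { DISAS_NEXT, DISAS_EOB_ONLY };

struct ctx { enum disas_jmp is_jmp; int eob_count; };

static void gen_eob(struct ctx *s) { s->eob_count++; }

/* Old style: the opcode handler emits the end-of-block itself. */
static void handler_old(struct ctx *s) { gen_eob(s); }

/* New style: the handler only records the disposition... */
static void handler_new(struct ctx *s) { s->is_jmp = DISAS_EOB_ONLY; }

/* ...and the translator loop emits gen_eob() once when it stops. */
static void tb_stop(struct ctx *s)
{
    if (s->is_jmp == DISAS_EOB_ONLY) {
        gen_eob(s);
    }
}
```

Both paths emit gen_eob() exactly once; centralizing it just removes the duplicated calls scattered through disas_insn().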




[PULL 20/37] target/i386: Create eip_next_*

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Create helpers for loading the address of the next insn.
Use tcg_constant_* in adjacent code where convenient.

Reviewed-by: Paolo Bonzini 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-16-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 55 +++--
 1 file changed, 34 insertions(+), 21 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 1aa5b37ea6..be29ea7a03 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -541,6 +541,27 @@ static TCGv_i32 cur_insn_len_i32(DisasContext *s)
 return tcg_constant_i32(cur_insn_len(s));
 }
 
+static TCGv_i32 eip_next_i32(DisasContext *s)
+{
+/*
+ * This function has two users: lcall_real (always 16-bit mode), and
+ * iret_protected (16, 32, or 64-bit mode).  IRET only uses the value
+ * when EFLAGS.NT is set, which is illegal in 64-bit mode, which is
+ * why passing a 32-bit value isn't broken.  To avoid using this where
+ * we shouldn't, return -1 in 64-bit mode so that execution goes into
+ * the weeds quickly.
+ */
+if (CODE64(s)) {
+return tcg_constant_i32(-1);
+}
+return tcg_constant_i32(s->pc - s->cs_base);
+}
+
+static TCGv eip_next_tl(DisasContext *s)
+{
+return tcg_constant_tl(s->pc - s->cs_base);
+}
+
 /* Compute SEG:REG into A0.  SEG is selected from the override segment
(OVR_SEG) and the default segment (DEF_SEG).  OVR_SEG may be -1 to
indicate no override.  */
@@ -1213,12 +1234,9 @@ static void gen_bpt_io(DisasContext *s, TCGv_i32 t_port, int ot)
 /* user-mode cpu should not be in IOBPT mode */
 g_assert_not_reached();
 #else
-TCGv_i32 t_size = tcg_const_i32(1 << ot);
-TCGv t_next = tcg_const_tl(s->pc - s->cs_base);
-
+TCGv_i32 t_size = tcg_constant_i32(1 << ot);
+TCGv t_next = eip_next_tl(s);
 gen_helper_bpt_io(cpu_env, t_port, t_size, t_next);
-tcg_temp_free_i32(t_size);
-tcg_temp_free(t_next);
 #endif /* CONFIG_USER_ONLY */
 }
 }
@@ -5324,9 +5342,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 if (dflag == MO_16) {
 tcg_gen_ext16u_tl(s->T0, s->T0);
 }
-next_eip = s->pc - s->cs_base;
-tcg_gen_movi_tl(s->T1, next_eip);
-gen_push_v(s, s->T1);
+gen_push_v(s, eip_next_tl(s));
 gen_op_jmp_v(s->T0);
 gen_bnd_jmp(s);
 s->base.is_jmp = DISAS_JUMP;
@@ -5342,14 +5358,14 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 if (PE(s) && !VM86(s)) {
 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
 gen_helper_lcall_protected(cpu_env, s->tmp2_i32, s->T1,
-   tcg_const_i32(dflag - 1),
-   tcg_const_tl(s->pc - s->cs_base));
+   tcg_constant_i32(dflag - 1),
+   eip_next_tl(s));
 } else {
 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
 tcg_gen_trunc_tl_i32(s->tmp3_i32, s->T1);
 gen_helper_lcall_real(cpu_env, s->tmp2_i32, s->tmp3_i32,
-  tcg_const_i32(dflag - 1),
-  tcg_const_i32(s->pc - s->cs_base));
+  tcg_constant_i32(dflag - 1),
+  eip_next_i32(s));
 }
 s->base.is_jmp = DISAS_JUMP;
 break;
@@ -5372,7 +5388,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 if (PE(s) && !VM86(s)) {
 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
 gen_helper_ljmp_protected(cpu_env, s->tmp2_i32, s->T1,
-  tcg_const_tl(s->pc - s->cs_base));
+  eip_next_tl(s));
 } else {
 gen_op_movl_seg_T0_vm(s, R_CS);
 gen_op_jmp_v(s->T1);
@@ -6854,8 +6870,8 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 gen_helper_iret_real(cpu_env, tcg_const_i32(dflag - 1));
 } else {
-gen_helper_iret_protected(cpu_env, tcg_const_i32(dflag - 1),
-  tcg_const_i32(s->pc - s->cs_base));
+gen_helper_iret_protected(cpu_env, tcg_constant_i32(dflag - 1),
+  eip_next_i32(s));
 }
 set_cc_op(s, CC_OP_EFLAGS);
 s->base.is_jmp = DISAS_EOB_ONLY;
@@ -6867,15 +6883,13 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 } else {
 tval = (int16_t)insn_get(env, s, MO_16);
 }
-next_eip = s->pc - s->cs_base;
-tval += next_eip;
+tval +=

[PULL 12/37] target/i386: Introduce DISAS_EOB*

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Add a few DISAS_TARGET_* aliases to reduce the number of
calls to gen_eob() and gen_eob_inhibit_irq().  So far,
only update i386_tr_translate_insn for exiting the block
because of single-step or previous inhibit irq.

Reviewed-by: Paolo Bonzini 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-8-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 4c1548da8e..caa22af5a7 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -132,6 +132,10 @@ typedef struct DisasContext {
 TCGOp *prev_insn_end;
 } DisasContext;
 
+#define DISAS_EOB_ONLY DISAS_TARGET_0
+#define DISAS_EOB_NEXT DISAS_TARGET_1
+#define DISAS_EOB_INHIBIT_IRQ  DISAS_TARGET_2
+
 /* The environment in which user-only runs is constrained. */
 #ifdef CONFIG_USER_ONLY
 #define PE(S) true
@@ -8849,7 +8853,7 @@ static void i386_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
  * the flag and abort the translation to give the irqs a
  * chance to happen.
  */
-dc->base.is_jmp = DISAS_TOO_MANY;
+dc->base.is_jmp = DISAS_EOB_NEXT;
 } else if (!is_same_page(&dc->base, pc_next)) {
 dc->base.is_jmp = DISAS_TOO_MANY;
 }
@@ -8861,9 +8865,24 @@ static void i386_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
 {
 DisasContext *dc = container_of(dcbase, DisasContext, base);
 
-if (dc->base.is_jmp == DISAS_TOO_MANY) {
+switch (dc->base.is_jmp) {
+case DISAS_NORETURN:
+break;
+case DISAS_TOO_MANY:
+case DISAS_EOB_NEXT:
+gen_update_cc_op(dc);
 gen_update_eip_cur(dc);
+/* fall through */
+case DISAS_EOB_ONLY:
 gen_eob(dc);
+break;
+case DISAS_EOB_INHIBIT_IRQ:
+gen_update_cc_op(dc);
+gen_update_eip_cur(dc);
+gen_eob_inhibit_irq(dc, true);
+break;
+default:
+g_assert_not_reached();
 }
 }
 
-- 
2.37.3
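
The key detail in the new i386_tr_tb_stop() switch above is the deliberate fall-through: `DISAS_TOO_MANY` and `DISAS_EOB_NEXT` first sync cc_op and EIP, then fall into the shared gen_eob() of `DISAS_EOB_ONLY`, while `DISAS_EOB_INHIBIT_IRQ` emits its own variant. That control flow can be sketched standalone by recording the "emitted" operations as strings (a simplified model, not QEMU code; op names are shorthand for the helpers in the patch):

```c
#include <assert.h>
#include <string.h>

enum disas_jmp { DISAS_NORETURN, DISAS_TOO_MANY,
                 DISAS_EOB_NEXT, DISAS_EOB_ONLY, DISAS_EOB_INHIBIT_IRQ };

struct ctx { char ops[128]; };

static void emit(struct ctx *s, const char *op) { strcat(s->ops, op); }

/* Mirrors the shape of the i386_tr_tb_stop() switch in the patch:
 * the "sync" cases fall through into the shared gen_eob(). */
static void tb_stop(struct ctx *s, enum disas_jmp j)
{
    switch (j) {
    case DISAS_NORETURN:
        break;                  /* exception path emitted everything */
    case DISAS_TOO_MANY:
    case DISAS_EOB_NEXT:
        emit(s, "cc_op;");      /* gen_update_cc_op */
        emit(s, "eip;");        /* gen_update_eip_cur */
        /* fall through */
    case DISAS_EOB_ONLY:
        emit(s, "eob;");        /* gen_eob */
        break;
    case DISAS_EOB_INHIBIT_IRQ:
        emit(s, "cc_op;");
        emit(s, "eip;");
        emit(s, "eob_inhibit_irq;"); /* gen_eob_inhibit_irq */
        break;
    }
}
```

So `DISAS_EOB_NEXT` is "sync state, then plain end-of-block", and `DISAS_EOB_ONLY` skips the sync because the handler already updated EIP itself.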




[PULL 11/37] target/i386: Create gen_update_eip_next

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Sync EIP before exiting a translation block.
Replace all gen_jmp_im that use s->pc.

Reviewed-by: Paolo Bonzini 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-7-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 45 -
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 85253e1e17..4c1548da8e 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -521,6 +521,11 @@ static void gen_update_eip_cur(DisasContext *s)
 gen_jmp_im(s, s->base.pc_next - s->cs_base);
 }
 
+static void gen_update_eip_next(DisasContext *s)
+{
+gen_jmp_im(s, s->pc - s->cs_base);
+}
+
 /* Compute SEG:REG into A0.  SEG is selected from the override segment
(OVR_SEG) and the default segment (DEF_SEG).  OVR_SEG may be -1 to
indicate no override.  */
@@ -5719,7 +5724,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_pop_update(s, ot);
 /* Note that reg == R_SS in gen_movl_seg_T0 always sets is_jmp.  */
 if (s->base.is_jmp) {
-gen_jmp_im(s, s->pc - s->cs_base);
+gen_update_eip_next(s);
 if (reg == R_SS) {
 s->flags &= ~HF_TF_MASK;
 gen_eob_inhibit_irq(s, true);
@@ -5734,7 +5739,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_movl_seg_T0(s, (b >> 3) & 7);
 gen_pop_update(s, ot);
 if (s->base.is_jmp) {
-gen_jmp_im(s, s->pc - s->cs_base);
+gen_update_eip_next(s);
 gen_eob(s);
 }
 break;
@@ -5785,7 +5790,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_movl_seg_T0(s, reg);
 /* Note that reg == R_SS in gen_movl_seg_T0 always sets is_jmp.  */
 if (s->base.is_jmp) {
-gen_jmp_im(s, s->pc - s->cs_base);
+gen_update_eip_next(s);
 if (reg == R_SS) {
 s->flags &= ~HF_TF_MASK;
 gen_eob_inhibit_irq(s, true);
@@ -5983,7 +5988,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 /* then put the data */
 gen_op_mov_reg_v(s, ot, reg, s->T1);
 if (s->base.is_jmp) {
-gen_jmp_im(s, s->pc - s->cs_base);
+gen_update_eip_next(s);
 gen_eob(s);
 }
 break;
@@ -7039,7 +7044,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_pop_update(s, ot);
 set_cc_op(s, CC_OP_EFLAGS);
 /* abort translation because TF/AC flag may change */
-gen_jmp_im(s, s->pc - s->cs_base);
+gen_update_eip_next(s);
 gen_eob(s);
 }
 break;
@@ -7375,7 +7380,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 if (check_iopl(s)) {
 gen_helper_sti(cpu_env);
 /* interruptions are enabled only the first insn after sti */
-gen_jmp_im(s, s->pc - s->cs_base);
+gen_update_eip_next(s);
 gen_eob_inhibit_irq(s, true);
 }
 break;
@@ -7451,7 +7456,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 
 gen_set_label(l3);
-gen_jmp_im(s, next_eip);
+gen_update_eip_next(s);
 tcg_gen_br(l2);
 
 gen_set_label(l1);
@@ -7469,7 +7474,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_helper_rdmsr(cpu_env);
 } else {
 gen_helper_wrmsr(cpu_env);
-gen_jmp_im(s, s->pc - s->cs_base);
+gen_update_eip_next(s);
 gen_eob(s);
 }
 }
@@ -7669,7 +7674,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 goto illegal_op;
 }
 gen_helper_clac(cpu_env);
-gen_jmp_im(s, s->pc - s->cs_base);
+gen_update_eip_next(s);
 gen_eob(s);
 break;
 
@@ -7679,7 +7684,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 goto illegal_op;
 }
 gen_helper_stac(cpu_env);
-gen_jmp_im(s, s->pc - s->cs_base);
+gen_update_eip_next(s);
 gen_eob(s);
 break;
 
@@ -7724,7 +7729,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[R_ECX]);
 gen_helper_xsetbv(cpu_env, s->tmp2_i32, s->tmp1_i64);
 /* End TB because translation flags may change.  */
-gen_jmp_im(s, s->pc - s->cs_base);
+gen_update_eip_next(s);
 gen_eob(s);
 break;
 
@@ -7786,7 +7791,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 gen_update_cc_op(s);
 gen_helper_stgi(cpu_env);
-gen_jmp_im(s, s->pc - s->cs_b

[PULL 21/37] target/i386: Use DISAS_TOO_MANY to exit after gen_io_start

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

We can set is_jmp early, using only one if, and let that
be overwritten by gen_rep*'s calls to gen_jmp_tb.

Reviewed-by: Paolo Bonzini 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-17-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 42 +
 1 file changed, 10 insertions(+), 32 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index be29ea7a03..11aaba8a65 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -5660,14 +5660,12 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
 gen_io_start();
+s->base.is_jmp = DISAS_TOO_MANY;
 }
 gen_helper_rdrand(s->T0, cpu_env);
 rm = (modrm & 7) | REX_B(s);
 gen_op_mov_reg_v(s, dflag, rm, s->T0);
 set_cc_op(s, CC_OP_EFLAGS);
-if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-gen_jmp(s, s->pc - s->cs_base);
-}
 break;
 
 default:
@@ -6704,15 +6702,12 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
 gen_io_start();
+s->base.is_jmp = DISAS_TOO_MANY;
 }
 if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
 gen_repz_ins(s, ot);
-/* jump generated by gen_repz_ins */
 } else {
 gen_ins(s, ot);
-if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-gen_jmp(s, s->pc - s->cs_base);
-}
 }
 break;
 case 0x6e: /* outsS */
@@ -6725,15 +6720,12 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
 gen_io_start();
+s->base.is_jmp = DISAS_TOO_MANY;
 }
 if (prefixes & (PREFIX_REPZ | PREFIX_REPNZ)) {
 gen_repz_outs(s, ot);
-/* jump generated by gen_repz_outs */
 } else {
 gen_outs(s, ot);
-if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-gen_jmp(s, s->pc - s->cs_base);
-}
 }
 break;
 
@@ -6750,13 +6742,11 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
 gen_io_start();
+s->base.is_jmp = DISAS_TOO_MANY;
 }
 gen_helper_in_func(ot, s->T1, s->tmp2_i32);
 gen_op_mov_reg_v(s, ot, R_EAX, s->T1);
 gen_bpt_io(s, s->tmp2_i32, ot);
-if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-gen_jmp(s, s->pc - s->cs_base);
-}
 break;
 case 0xe6:
 case 0xe7:
@@ -6768,14 +6758,12 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
 gen_io_start();
+s->base.is_jmp = DISAS_TOO_MANY;
 }
 gen_op_mov_v_reg(s, ot, s->T1, R_EAX);
 tcg_gen_trunc_tl_i32(s->tmp3_i32, s->T1);
 gen_helper_out_func(ot, s->tmp2_i32, s->tmp3_i32);
 gen_bpt_io(s, s->tmp2_i32, ot);
-if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-gen_jmp(s, s->pc - s->cs_base);
-}
 break;
 case 0xec:
 case 0xed:
@@ -6787,13 +6775,11 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
 gen_io_start();
+s->base.is_jmp = DISAS_TOO_MANY;
 }
 gen_helper_in_func(ot, s->T1, s->tmp2_i32);
 gen_op_mov_reg_v(s, ot, R_EAX, s->T1);
 gen_bpt_io(s, s->tmp2_i32, ot);
-if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-gen_jmp(s, s->pc - s->cs_base);
-}
 break;
 case 0xee:
 case 0xef:
@@ -6805,14 +6791,12 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
 gen_io_start();
+s->base.is_jmp = DISAS_TOO_MANY;
 }
 gen_op_mov_v_reg(s, ot, s->T1, R_EAX);
 tcg_gen_trunc_tl_i32(s->tmp3_i32, s->T1);
 gen_helper_out_func(ot, s->tmp2_i32, s->tmp3_i32);
 gen_bpt_io(s, s->tmp2_i32, ot);
-if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-gen_jmp(s, s->pc - s->cs_base);
-}
 break;
 
 //
@@ -7478,11 +7462,9 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_update_eip_cur(s);
 if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
 gen_io_start();
+s->base.is_jmp = DISAS_TOO_MANY;
 }
 gen_helper_rdtsc(cpu_env);
-if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
-gen_jmp(s, s->pc - s->cs_base);
-}

[PULL 13/37] target/i386: Use DISAS_EOB* in gen_movl_seg_T0

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Set is_jmp properly in gen_movl_seg_T0, so that the callers
need do nothing special.

Reviewed-by: Paolo Bonzini 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-9-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 36 +---
 1 file changed, 5 insertions(+), 31 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index caa22af5a7..8c0ef0f212 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2457,13 +2457,15 @@ static void gen_movl_seg_T0(DisasContext *s, X86Seg seg_reg)
because ss32 may change. For R_SS, translation must always
stop as a special handling must be done to disable hardware
interrupts for the next instruction */
-if (seg_reg == R_SS || (CODE32(s) && seg_reg < R_FS)) {
-s->base.is_jmp = DISAS_TOO_MANY;
+if (seg_reg == R_SS) {
+s->base.is_jmp = DISAS_EOB_INHIBIT_IRQ;
+} else if (CODE32(s) && seg_reg < R_FS) {
+s->base.is_jmp = DISAS_EOB_NEXT;
 }
 } else {
 gen_op_movl_seg_T0_vm(s, seg_reg);
 if (seg_reg == R_SS) {
-s->base.is_jmp = DISAS_TOO_MANY;
+s->base.is_jmp = DISAS_EOB_INHIBIT_IRQ;
 }
 }
 }
@@ -5726,26 +5728,12 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 ot = gen_pop_T0(s);
 gen_movl_seg_T0(s, reg);
 gen_pop_update(s, ot);
-/* Note that reg == R_SS in gen_movl_seg_T0 always sets is_jmp.  */
-if (s->base.is_jmp) {
-gen_update_eip_next(s);
-if (reg == R_SS) {
-s->flags &= ~HF_TF_MASK;
-gen_eob_inhibit_irq(s, true);
-} else {
-gen_eob(s);
-}
-}
 break;
 case 0x1a1: /* pop fs */
 case 0x1a9: /* pop gs */
 ot = gen_pop_T0(s);
 gen_movl_seg_T0(s, (b >> 3) & 7);
 gen_pop_update(s, ot);
-if (s->base.is_jmp) {
-gen_update_eip_next(s);
-gen_eob(s);
-}
 break;
 
 /**/
@@ -5792,16 +5780,6 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 goto illegal_op;
 gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
 gen_movl_seg_T0(s, reg);
-/* Note that reg == R_SS in gen_movl_seg_T0 always sets is_jmp.  */
-if (s->base.is_jmp) {
-gen_update_eip_next(s);
-if (reg == R_SS) {
-s->flags &= ~HF_TF_MASK;
-gen_eob_inhibit_irq(s, true);
-} else {
-gen_eob(s);
-}
-}
 break;
 case 0x8c: /* mov Gv, seg */
 modrm = x86_ldub_code(env, s);
@@ -5991,10 +5969,6 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 gen_movl_seg_T0(s, op);
 /* then put the data */
 gen_op_mov_reg_v(s, ot, reg, s->T1);
-if (s->base.is_jmp) {
-gen_update_eip_next(s);
-gen_eob(s);
-}
 break;
 
 //
-- 
2.37.3




[PULL 26/37] target/i386: Remove MemOp argument to gen_op_j*_ecx

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

These functions are always passed aflag, so we might as well
read it from DisasContext directly.  While we're at it, use
a common subroutine for these two functions.

Signed-off-by: Richard Henderson 
Reviewed-by: Paolo Bonzini 
Message-Id: <20221001140935.465607-22-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 31 ---
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index cf23ae6e5e..9294f12f66 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -676,20 +676,21 @@ static void gen_exts(MemOp ot, TCGv reg)
 gen_ext_tl(reg, reg, ot, true);
 }
 
-static inline
-void gen_op_jnz_ecx(DisasContext *s, MemOp size, TCGLabel *label1)
+static void gen_op_j_ecx(DisasContext *s, TCGCond cond, TCGLabel *label1)
 {
 tcg_gen_mov_tl(s->tmp0, cpu_regs[R_ECX]);
-gen_extu(size, s->tmp0);
-tcg_gen_brcondi_tl(TCG_COND_NE, s->tmp0, 0, label1);
+gen_extu(s->aflag, s->tmp0);
+tcg_gen_brcondi_tl(cond, s->tmp0, 0, label1);
 }
 
-static inline
-void gen_op_jz_ecx(DisasContext *s, MemOp size, TCGLabel *label1)
+static inline void gen_op_jz_ecx(DisasContext *s, TCGLabel *label1)
 {
-tcg_gen_mov_tl(s->tmp0, cpu_regs[R_ECX]);
-gen_extu(size, s->tmp0);
-tcg_gen_brcondi_tl(TCG_COND_EQ, s->tmp0, 0, label1);
+gen_op_j_ecx(s, TCG_COND_EQ, label1);
+}
+
+static inline void gen_op_jnz_ecx(DisasContext *s, TCGLabel *label1)
+{
+gen_op_j_ecx(s, TCG_COND_NE, label1);
 }
 
 static void gen_helper_in_func(MemOp ot, TCGv v, TCGv_i32 n)
@@ -1183,7 +1184,7 @@ static TCGLabel *gen_jz_ecx_string(DisasContext *s)
 {
 TCGLabel *l1 = gen_new_label();
 TCGLabel *l2 = gen_new_label();
-gen_op_jnz_ecx(s, s->aflag, l1);
+gen_op_jnz_ecx(s, l1);
 gen_set_label(l2);
 gen_jmp_rel_csize(s, 0, 1);
 gen_set_label(l1);
@@ -1286,7 +1287,7 @@ static void gen_repz(DisasContext *s, MemOp ot,
  * before rep string_insn
  */
 if (s->repz_opt) {
-gen_op_jz_ecx(s, s->aflag, l2);
+gen_op_jz_ecx(s, l2);
 }
 gen_jmp_rel_csize(s, -cur_insn_len(s), 0);
 }
@@ -1306,7 +1307,7 @@ static void gen_repz2(DisasContext *s, MemOp ot, int nz,
 gen_update_cc_op(s);
 gen_jcc1(s, (JCC_Z << 1) | (nz ^ 1), l2);
 if (s->repz_opt) {
-gen_op_jz_ecx(s, s->aflag, l2);
+gen_op_jz_ecx(s, l2);
 }
 gen_jmp_rel_csize(s, -cur_insn_len(s), 0);
 }
@@ -7397,16 +7398,16 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 case 0: /* loopnz */
 case 1: /* loopz */
 gen_op_add_reg_im(s, s->aflag, R_ECX, -1);
-gen_op_jz_ecx(s, s->aflag, l2);
+gen_op_jz_ecx(s, l2);
 gen_jcc1(s, (JCC_Z << 1) | (b ^ 1), l1);
 break;
 case 2: /* loop */
 gen_op_add_reg_im(s, s->aflag, R_ECX, -1);
-gen_op_jnz_ecx(s, s->aflag, l1);
+gen_op_jnz_ecx(s, l1);
 break;
 default:
 case 3: /* jcxz */
-gen_op_jz_ecx(s, s->aflag, l1);
+gen_op_jz_ecx(s, l1);
 break;
 }
 
-- 
2.37.3




[PULL 22/37] target/i386: Create gen_jmp_rel

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Create a common helper for pc-relative branches.  The jmp jb insn
was missing a mask for CODE32.  In all cases the CODE64 check was
incorrectly placed, allowing PREFIX_DATA to truncate %rip to 16 bits.

Signed-off-by: Richard Henderson 
Reviewed-by: Paolo Bonzini 
Message-Id: <20221001140935.465607-18-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 58 ++---
 1 file changed, 29 insertions(+), 29 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 11aaba8a65..ba1bd7c707 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -226,6 +226,7 @@ static void gen_eob(DisasContext *s);
 static void gen_jr(DisasContext *s);
 static void gen_jmp(DisasContext *s, target_ulong eip);
 static void gen_jmp_tb(DisasContext *s, target_ulong eip, int tb_num);
+static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num);
 static void gen_op(DisasContext *s1, int op, MemOp ot, int d);
 static void gen_exception_gpf(DisasContext *s);
 
@@ -2792,6 +2793,21 @@ static void gen_jmp_tb(DisasContext *s, target_ulong eip, int tb_num)
 }
 }
 
+static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num)
+{
+target_ulong dest = s->pc - s->cs_base + diff;
+
+/* In 64-bit mode, operand size is fixed at 64 bits. */
+if (!CODE64(s)) {
+if (ot == MO_16) {
+dest &= 0x;
+} else {
+dest &= 0x;
+}
+}
+gen_jmp_tb(s, dest, tb_num);
+}
+
 static void gen_jmp(DisasContext *s, target_ulong eip)
 {
 gen_jmp_tb(s, eip, 0);
@@ -6862,20 +6878,12 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 break;
 case 0xe8: /* call im */
 {
-if (dflag != MO_16) {
-tval = (int32_t)insn_get(env, s, MO_32);
-} else {
-tval = (int16_t)insn_get(env, s, MO_16);
-}
-tval += s->pc - s->cs_base;
-if (dflag == MO_16) {
-tval &= 0x;
-} else if (!CODE64(s)) {
-tval &= 0x;
-}
+int diff = (dflag != MO_16
+? (int32_t)insn_get(env, s, MO_32)
+: (int16_t)insn_get(env, s, MO_16));
 gen_push_v(s, eip_next_tl(s));
 gen_bnd_jmp(s);
-gen_jmp(s, tval);
+gen_jmp_rel(s, dflag, diff, 0);
 }
 break;
 case 0x9a: /* lcall im */
@@ -6893,19 +6901,13 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 goto do_lcall;
 case 0xe9: /* jmp im */
-if (dflag != MO_16) {
-tval = (int32_t)insn_get(env, s, MO_32);
-} else {
-tval = (int16_t)insn_get(env, s, MO_16);
+{
+int diff = (dflag != MO_16
+? (int32_t)insn_get(env, s, MO_32)
+: (int16_t)insn_get(env, s, MO_16));
+gen_bnd_jmp(s);
+gen_jmp_rel(s, dflag, diff, 0);
 }
-tval += s->pc - s->cs_base;
-if (dflag == MO_16) {
-tval &= 0x;
-} else if (!CODE64(s)) {
-tval &= 0x;
-}
-gen_bnd_jmp(s);
-gen_jmp(s, tval);
 break;
 case 0xea: /* ljmp im */
 {
@@ -6922,12 +6924,10 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 goto do_ljmp;
 case 0xeb: /* jmp Jb */
-tval = (int8_t)insn_get(env, s, MO_8);
-tval += s->pc - s->cs_base;
-if (dflag == MO_16) {
-tval &= 0x;
+{
+int diff = (int8_t)insn_get(env, s, MO_8);
+gen_jmp_rel(s, dflag, diff, 0);
 }
-gen_jmp(s, tval);
 break;
 case 0x70 ... 0x7f: /* jcc Jb */
 tval = (int8_t)insn_get(env, s, MO_8);
-- 
2.37.3




[PULL 31/37] target/i386: Enable TARGET_TB_PCREL

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Signed-off-by: Richard Henderson 
Reviewed-by: Paolo Bonzini 
Message-Id: <20221001140935.465607-27-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/cpu-param.h |   4 ++
 target/i386/tcg/tcg-cpu.c   |   8 ++-
 target/i386/tcg/translate.c | 130 
 3 files changed, 113 insertions(+), 29 deletions(-)

diff --git a/target/i386/cpu-param.h b/target/i386/cpu-param.h
index 9740bd7abd..1e79389761 100644
--- a/target/i386/cpu-param.h
+++ b/target/i386/cpu-param.h
@@ -25,4 +25,8 @@
 #define TARGET_PAGE_BITS 12
 #define NB_MMU_MODES 3
 
+#ifndef CONFIG_USER_ONLY
+# define TARGET_TB_PCREL 1
+#endif
+
 #endif
diff --git a/target/i386/tcg/tcg-cpu.c b/target/i386/tcg/tcg-cpu.c
index 6cf14c83ff..828244abe2 100644
--- a/target/i386/tcg/tcg-cpu.c
+++ b/target/i386/tcg/tcg-cpu.c
@@ -49,9 +49,11 @@ static void x86_cpu_exec_exit(CPUState *cs)
 static void x86_cpu_synchronize_from_tb(CPUState *cs,
 const TranslationBlock *tb)
 {
-X86CPU *cpu = X86_CPU(cs);
-
-cpu->env.eip = tb_pc(tb) - tb->cs_base;
+/* The instruction pointer is always up to date with TARGET_TB_PCREL. */
+if (!TARGET_TB_PCREL) {
+CPUX86State *env = cs->env_ptr;
+env->eip = tb_pc(tb) - tb->cs_base;
+}
 }
 
 #ifndef CONFIG_USER_ONLY
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 689a45256c..279a3ae999 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -78,6 +78,7 @@ typedef struct DisasContext {
 
 target_ulong pc;   /* pc = eip + cs_base */
 target_ulong cs_base;  /* base of CS segment */
+target_ulong pc_save;
 
 MemOp aflag;
 MemOp dflag;
@@ -480,9 +481,10 @@ static void gen_add_A0_im(DisasContext *s, int val)
 }
 }
 
-static inline void gen_op_jmp_v(TCGv dest)
+static inline void gen_op_jmp_v(DisasContext *s, TCGv dest)
 {
 tcg_gen_mov_tl(cpu_eip, dest);
+s->pc_save = -1;
 }
 
 static inline
@@ -519,12 +521,24 @@ static inline void gen_op_st_rm_T0_A0(DisasContext *s, int idx, int d)
 
 static void gen_update_eip_cur(DisasContext *s)
 {
-tcg_gen_movi_tl(cpu_eip, s->base.pc_next - s->cs_base);
+assert(s->pc_save != -1);
+if (TARGET_TB_PCREL) {
+tcg_gen_addi_tl(cpu_eip, cpu_eip, s->base.pc_next - s->pc_save);
+} else {
+tcg_gen_movi_tl(cpu_eip, s->base.pc_next - s->cs_base);
+}
+s->pc_save = s->base.pc_next;
 }
 
 static void gen_update_eip_next(DisasContext *s)
 {
-tcg_gen_movi_tl(cpu_eip, s->pc - s->cs_base);
+assert(s->pc_save != -1);
+if (TARGET_TB_PCREL) {
+tcg_gen_addi_tl(cpu_eip, cpu_eip, s->pc - s->pc_save);
+} else {
+tcg_gen_movi_tl(cpu_eip, s->pc - s->cs_base);
+}
+s->pc_save = s->pc;
 }
 
 static int cur_insn_len(DisasContext *s)
@@ -539,6 +553,7 @@ static TCGv_i32 cur_insn_len_i32(DisasContext *s)
 
 static TCGv_i32 eip_next_i32(DisasContext *s)
 {
+assert(s->pc_save != -1);
 /*
  * This function has two users: lcall_real (always 16-bit mode), and
  * iret_protected (16, 32, or 64-bit mode).  IRET only uses the value
@@ -550,17 +565,38 @@ static TCGv_i32 eip_next_i32(DisasContext *s)
 if (CODE64(s)) {
 return tcg_constant_i32(-1);
 }
-return tcg_constant_i32(s->pc - s->cs_base);
+if (TARGET_TB_PCREL) {
+TCGv_i32 ret = tcg_temp_new_i32();
+tcg_gen_trunc_tl_i32(ret, cpu_eip);
+tcg_gen_addi_i32(ret, ret, s->pc - s->pc_save);
+return ret;
+} else {
+return tcg_constant_i32(s->pc - s->cs_base);
+}
 }
 
 static TCGv eip_next_tl(DisasContext *s)
 {
-return tcg_constant_tl(s->pc - s->cs_base);
+assert(s->pc_save != -1);
+if (TARGET_TB_PCREL) {
+TCGv ret = tcg_temp_new();
+tcg_gen_addi_tl(ret, cpu_eip, s->pc - s->pc_save);
+return ret;
+} else {
+return tcg_constant_tl(s->pc - s->cs_base);
+}
 }
 
 static TCGv eip_cur_tl(DisasContext *s)
 {
-return tcg_constant_tl(s->base.pc_next - s->cs_base);
+assert(s->pc_save != -1);
+if (TARGET_TB_PCREL) {
+TCGv ret = tcg_temp_new();
+tcg_gen_addi_tl(ret, cpu_eip, s->base.pc_next - s->pc_save);
+return ret;
+} else {
+return tcg_constant_tl(s->base.pc_next - s->cs_base);
+}
 }
 
 /* Compute SEG:REG into A0.  SEG is selected from the override segment
@@ -2260,7 +2296,12 @@ static TCGv gen_lea_modrm_1(DisasContext *s, AddressParts a)
 ea = cpu_regs[a.base];
 }
 if (!ea) {
-tcg_gen_movi_tl(s->A0, a.disp);
+if (TARGET_TB_PCREL && a.base == -2) {
+/* With cpu_eip ~= pc_save, the expression is pc-relative. */
+tcg_gen_addi_tl(s->A0, cpu_eip, a.disp - s->pc_save);
+} else {
+tcg_gen_movi_tl(s->A0, a.disp);
+}
 ea = s->A0;
 } else if (a.disp != 0) {
 tcg_gen_addi_tl(s->A0, ea, a.disp);
@@ -2748,32 +2

[PULL 32/37] x86: Implement MSR_CORE_THREAD_COUNT MSR

2022-10-11 Thread Paolo Bonzini
From: Alexander Graf 

Intel CPUs starting with Haswell-E implement a new MSR called
MSR_CORE_THREAD_COUNT which exposes the number of threads and cores
inside of a package.

This MSR is used by XNU to populate internal data structures and not
implementing it prevents virtual machines with more than 1 vCPU from
booting if the emulated CPU generation is at least Haswell-E.

This patch propagates the existing hvf logic from patch 027ac0cb516
("target/i386/hvf: add rdmsr 35H MSR_CORE_THREAD_COUNT") to TCG.

Signed-off-by: Alexander Graf 
Message-Id: <20221004225643.65036-2-ag...@csgraf.de>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/sysemu/misc_helper.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/target/i386/tcg/sysemu/misc_helper.c b/target/i386/tcg/sysemu/misc_helper.c
index 1328aa656f..e1528b7f80 100644
--- a/target/i386/tcg/sysemu/misc_helper.c
+++ b/target/i386/tcg/sysemu/misc_helper.c
@@ -450,6 +450,11 @@ void helper_rdmsr(CPUX86State *env)
  case MSR_IA32_UCODE_REV:
 val = x86_cpu->ucode_rev;
 break;
+case MSR_CORE_THREAD_COUNT: {
+CPUState *cs = CPU(x86_cpu);
+val = (cs->nr_threads * cs->nr_cores) | (cs->nr_cores << 16);
+break;
+}
 default:
 if ((uint32_t)env->regs[R_ECX] >= MSR_MC0_CTL
 && (uint32_t)env->regs[R_ECX] < MSR_MC0_CTL +
-- 
2.37.3




[PULL 23/37] target/i386: Use gen_jmp_rel for loop, repz, jecxz insns

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

With gen_jmp_rel, we may chain to the next tb instead of merely
writing to eip and exiting.  For repz, subtract cur_insn_len to
restart the current insn.

Signed-off-by: Richard Henderson 
Reviewed-by: Paolo Bonzini 
Message-Id: <20221001140935.465607-19-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 36 +++-
 1 file changed, 15 insertions(+), 21 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index ba1bd7c707..434a6ad6cd 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -224,9 +224,9 @@ STUB_HELPER(wrmsr, TCGv_env env)
 
 static void gen_eob(DisasContext *s);
 static void gen_jr(DisasContext *s);
-static void gen_jmp(DisasContext *s, target_ulong eip);
 static void gen_jmp_tb(DisasContext *s, target_ulong eip, int tb_num);
 static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num);
+static void gen_jmp_rel_csize(DisasContext *s, int diff, int tb_num);
 static void gen_op(DisasContext *s1, int op, MemOp ot, int d);
 static void gen_exception_gpf(DisasContext *s);
 
@@ -1185,7 +1185,7 @@ static TCGLabel *gen_jz_ecx_string(DisasContext *s)
 TCGLabel *l2 = gen_new_label();
 gen_op_jnz_ecx(s, s->aflag, l1);
 gen_set_label(l2);
-gen_jmp_tb(s, s->pc - s->cs_base, 1);
+gen_jmp_rel_csize(s, 0, 1);
 gen_set_label(l1);
 return l2;
 }
@@ -1288,7 +1288,7 @@ static void gen_repz(DisasContext *s, MemOp ot,
 if (s->repz_opt) {
 gen_op_jz_ecx(s, s->aflag, l2);
 }
-gen_jmp(s, s->base.pc_next - s->cs_base);
+gen_jmp_rel_csize(s, -cur_insn_len(s), 0);
 }
 
 #define GEN_REPZ(op) \
@@ -1308,7 +1308,7 @@ static void gen_repz2(DisasContext *s, MemOp ot, int nz,
 if (s->repz_opt) {
 gen_op_jz_ecx(s, s->aflag, l2);
 }
-gen_jmp(s, s->base.pc_next - s->cs_base);
+gen_jmp_rel_csize(s, -cur_insn_len(s), 0);
 }
 
 #define GEN_REPZ2(op) \
@@ -2793,6 +2793,7 @@ static void gen_jmp_tb(DisasContext *s, target_ulong eip, int tb_num)
 }
 }
 
+/* Jump to eip+diff, truncating the result to OT. */
 static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num)
 {
 target_ulong dest = s->pc - s->cs_base + diff;
@@ -2808,9 +2809,11 @@ static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num)
 gen_jmp_tb(s, dest, tb_num);
 }
 
-static void gen_jmp(DisasContext *s, target_ulong eip)
+/* Jump to eip+diff, truncating to the current code size. */
+static void gen_jmp_rel_csize(DisasContext *s, int diff, int tb_num)
 {
-gen_jmp_tb(s, eip, 0);
+/* CODE64 ignores the OT argument, so we need not consider it. */
+gen_jmp_rel(s, CODE32(s) ? MO_32 : MO_16, diff, tb_num);
 }
 
 static inline void gen_ldq_env_A0(DisasContext *s, int offset)
@@ -7404,24 +7407,18 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 case 0xe2: /* loop */
 case 0xe3: /* jecxz */
 {
-TCGLabel *l1, *l2, *l3;
-
-tval = (int8_t)insn_get(env, s, MO_8);
-tval += s->pc - s->cs_base;
-if (dflag == MO_16) {
-tval &= 0x;
-}
+TCGLabel *l1, *l2;
+int diff = (int8_t)insn_get(env, s, MO_8);
 
 l1 = gen_new_label();
 l2 = gen_new_label();
-l3 = gen_new_label();
 gen_update_cc_op(s);
 b &= 3;
 switch(b) {
 case 0: /* loopnz */
 case 1: /* loopz */
 gen_op_add_reg_im(s, s->aflag, R_ECX, -1);
-gen_op_jz_ecx(s, s->aflag, l3);
+gen_op_jz_ecx(s, s->aflag, l2);
 gen_jcc1(s, (JCC_Z << 1) | (b ^ 1), l1);
 break;
 case 2: /* loop */
@@ -7434,14 +7431,11 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 break;
 }
 
-gen_set_label(l3);
-gen_update_eip_next(s);
-tcg_gen_br(l2);
+gen_set_label(l2);
+gen_jmp_rel_csize(s, 0, 1);
 
 gen_set_label(l1);
-gen_jmp_im(s, tval);
-gen_set_label(l2);
-s->base.is_jmp = DISAS_EOB_ONLY;
+gen_jmp_rel(s, dflag, diff, 0);
 }
 break;
 case 0x130: /* wrmsr */
-- 
2.37.3




[PULL 19/37] target/i386: Truncate values for lcall_real to i32

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Use i32 not int or tl for eip and cs arguments.

Reviewed-by: Paolo Bonzini 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-15-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/helper.h | 2 +-
 target/i386/tcg/seg_helper.c | 6 ++
 target/i386/tcg/translate.c  | 3 ++-
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/target/i386/helper.h b/target/i386/helper.h
index ac3b4d1ee3..39a3c24182 100644
--- a/target/i386/helper.h
+++ b/target/i386/helper.h
@@ -37,7 +37,7 @@ DEF_HELPER_2(lldt, void, env, int)
 DEF_HELPER_2(ltr, void, env, int)
 DEF_HELPER_3(load_seg, void, env, int, int)
 DEF_HELPER_4(ljmp_protected, void, env, int, tl, tl)
-DEF_HELPER_5(lcall_real, void, env, int, tl, int, int)
+DEF_HELPER_5(lcall_real, void, env, i32, i32, int, i32)
 DEF_HELPER_5(lcall_protected, void, env, int, tl, int, tl)
 DEF_HELPER_2(iret_real, void, env, int)
 DEF_HELPER_3(iret_protected, void, env, int, int)
diff --git a/target/i386/tcg/seg_helper.c b/target/i386/tcg/seg_helper.c
index bffd82923f..539189b4d1 100644
--- a/target/i386/tcg/seg_helper.c
+++ b/target/i386/tcg/seg_helper.c
@@ -1504,14 +1504,12 @@ void helper_ljmp_protected(CPUX86State *env, int new_cs, target_ulong new_eip,
 }
 
 /* real mode call */
-void helper_lcall_real(CPUX86State *env, int new_cs, target_ulong new_eip1,
-   int shift, int next_eip)
+void helper_lcall_real(CPUX86State *env, uint32_t new_cs, uint32_t new_eip,
+   int shift, uint32_t next_eip)
 {
-int new_eip;
 uint32_t esp, esp_mask;
 target_ulong ssp;
 
-new_eip = new_eip1;
 esp = env->regs[R_ESP];
 esp_mask = get_sp_mask(env->segs[R_SS].flags);
 ssp = env->segs[R_SS].base;
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 7db6f617a1..1aa5b37ea6 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -5346,7 +5346,8 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
tcg_const_tl(s->pc - s->cs_base));
 } else {
 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
-gen_helper_lcall_real(cpu_env, s->tmp2_i32, s->T1,
+tcg_gen_trunc_tl_i32(s->tmp3_i32, s->T1);
+gen_helper_lcall_real(cpu_env, s->tmp2_i32, s->tmp3_i32,
   tcg_const_i32(dflag - 1),
   tcg_const_i32(s->pc - s->cs_base));
 }
-- 
2.37.3




[PULL 36/37] linux-user: i386/signal: support FXSAVE fpstate on 32-bit emulation

2022-10-11 Thread Paolo Bonzini
Linux can use FXSAVE to save/restore XMM registers even on 32-bit
systems.  This requires some care in order to keep the FXSAVE area
aligned to 16 bytes; for this reason, get_sigframe is changed to
pass the offset into the FXSAVE area rather than the full frame
size.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 linux-user/i386/signal.c | 129 +++
 1 file changed, 77 insertions(+), 52 deletions(-)

diff --git a/linux-user/i386/signal.c b/linux-user/i386/signal.c
index 76317a3d16..58b11be116 100644
--- a/linux-user/i386/signal.c
+++ b/linux-user/i386/signal.c
@@ -39,29 +39,7 @@ struct target_xmmreg {
 uint32_t element[4];
 };
 
-struct target_fpstate_32 {
-/* Regular FPU environment */
-uint32_t cw;
-uint32_t sw;
-uint32_t tag;
-uint32_t ipoff;
-uint32_t cssel;
-uint32_t dataoff;
-uint32_t datasel;
-struct target_fpreg st[8];
-uint16_t  status;
-uint16_t  magic;  /* 0x = regular FPU data only */
-
-/* FXSR FPU environment */
-uint32_t _fxsr_env[6];   /* FXSR FPU env is ignored */
-uint32_t mxcsr;
-uint32_t reserved;
-struct target_fpxreg fxsr_st[8]; /* FXSR FPU reg data is ignored */
-struct target_xmmreg xmm[8];
-uint32_t padding[56];
-};
-
-struct target_fpstate_64 {
+struct target_fpstate_fxsave {
 /* FXSAVE format */
 uint16_t cw;
 uint16_t sw;
@@ -75,11 +53,36 @@ struct target_fpstate_64 {
 uint32_t xmm_space[64];
 uint32_t reserved[24];
 };
+#define TARGET_FXSAVE_SIZE   sizeof(struct target_fpstate_fxsave)
+QEMU_BUILD_BUG_ON(TARGET_FXSAVE_SIZE != 512);
+
+struct target_fpstate_32 {
+/* Regular FPU environment */
+uint32_t cw;
+uint32_t sw;
+uint32_t tag;
+uint32_t ipoff;
+uint32_t cssel;
+uint32_t dataoff;
+uint32_t datasel;
+struct target_fpreg st[8];
+uint16_t  status;
+uint16_t  magic;  /* 0x = regular FPU data only */
+struct target_fpstate_fxsave fxsave;
+};
+
+/*
+ * For simplicity, setup_frame aligns struct target_fpstate_32 to
+ * 16 bytes, so ensure that the FXSAVE area is also aligned.
+ */
+QEMU_BUILD_BUG_ON(offsetof(struct target_fpstate_32, fxsave) & 15);
 
 #ifndef TARGET_X86_64
 # define target_fpstate target_fpstate_32
+# define TARGET_FPSTATE_FXSAVE_OFFSET offsetof(struct target_fpstate_32, fxsave)
 #else
-# define target_fpstate target_fpstate_64
+# define target_fpstate target_fpstate_fxsave
+# define TARGET_FPSTATE_FXSAVE_OFFSET 0
 #endif
 
 struct target_sigcontext_32 {
@@ -172,8 +175,16 @@ struct sigframe {
 struct target_fpstate fpstate_unused;
 abi_ulong extramask[TARGET_NSIG_WORDS-1];
 char retcode[8];
-struct target_fpstate fpstate;
+
+/*
+ * This field will be 16-byte aligned in memory.  Applying QEMU_ALIGNED
+ * to it ensures that the base of the frame has an appropriate alignment
+ * too.
+ */
+struct target_fpstate fpstate QEMU_ALIGNED(8);
 };
+#define TARGET_SIGFRAME_FXSAVE_OFFSET (\
+offsetof(struct sigframe, fpstate) + TARGET_FPSTATE_FXSAVE_OFFSET)
 
 struct rt_sigframe {
 abi_ulong pretcode;
@@ -183,25 +194,35 @@ struct rt_sigframe {
 struct target_siginfo info;
 struct target_ucontext uc;
 char retcode[8];
-struct target_fpstate fpstate;
+struct target_fpstate fpstate QEMU_ALIGNED(8);
 };
-
+#define TARGET_RT_SIGFRAME_FXSAVE_OFFSET ( \
+offsetof(struct rt_sigframe, fpstate) + TARGET_FPSTATE_FXSAVE_OFFSET)
 #else
 
 struct rt_sigframe {
 abi_ulong pretcode;
 struct target_ucontext uc;
 struct target_siginfo info;
-struct target_fpstate fpstate;
+struct target_fpstate fpstate QEMU_ALIGNED(16);
 };
-
+#define TARGET_RT_SIGFRAME_FXSAVE_OFFSET ( \
+offsetof(struct rt_sigframe, fpstate) + TARGET_FPSTATE_FXSAVE_OFFSET)
 #endif
 
 /*
  * Set up a signal frame.
  */
 
-/* XXX: save x87 state */
+static void fxsave_sigcontext(CPUX86State *env, struct target_fpstate_fxsave *fxsave,
+  abi_ulong fxsave_addr)
+{
+/* fxsave_addr must be 16 byte aligned for fxsave */
+assert(!(fxsave_addr & 0xf));
+
+cpu_x86_fxsave(env, fxsave_addr);
+}
+
 static void setup_sigcontext(struct target_sigcontext *sc,
 struct target_fpstate *fpstate, CPUX86State *env, abi_ulong mask,
 abi_ulong fpstate_addr)
@@ -233,13 +254,14 @@ static void setup_sigcontext(struct target_sigcontext *sc,
 
 cpu_x86_fsave(env, fpstate_addr, 1);
 fpstate->status = fpstate->sw;
-magic = 0xffff;
+if (!(env->features[FEAT_1_EDX] & CPUID_FXSR)) {
+magic = 0xffff;
+} else {
+fxsave_sigcontext(env, &fpstate->fxsave,
+  fpstate_addr + TARGET_FPSTATE_FXSAVE_OFFSET);
+magic = 0;
+}
 __put_user(magic, &fpstate->magic);
-__put_user(fpstate_addr, &sc->fpstate);
-
-/* non-iBCS2 ext

[PULL 24/37] target/i386: Use gen_jmp_rel for gen_jcc

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Signed-off-by: Richard Henderson 
Reviewed-by: Paolo Bonzini 
Message-Id: <20221001140935.465607-20-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 57 -
 1 file changed, 18 insertions(+), 39 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 434a6ad6cd..5b84be4975 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2409,32 +2409,14 @@ static void gen_goto_tb(DisasContext *s, int tb_num, target_ulong eip)
 }
 }
 
-static inline void gen_jcc(DisasContext *s, int b,
-   target_ulong val, target_ulong next_eip)
+static void gen_jcc(DisasContext *s, int b, int diff)
 {
-TCGLabel *l1, *l2;
+TCGLabel *l1 = gen_new_label();
 
-if (s->jmp_opt) {
-l1 = gen_new_label();
-gen_jcc1(s, b, l1);
-
-gen_goto_tb(s, 0, next_eip);
-
-gen_set_label(l1);
-gen_goto_tb(s, 1, val);
-} else {
-l1 = gen_new_label();
-l2 = gen_new_label();
-gen_jcc1(s, b, l1);
-
-gen_jmp_im(s, next_eip);
-tcg_gen_br(l2);
-
-gen_set_label(l1);
-gen_jmp_im(s, val);
-gen_set_label(l2);
-gen_eob(s);
-}
+gen_jcc1(s, b, l1);
+gen_jmp_rel_csize(s, 0, 1);
+gen_set_label(l1);
+gen_jmp_rel(s, s->dflag, diff, 0);
 }
 
 static void gen_cmovcc1(CPUX86State *env, DisasContext *s, MemOp ot, int b,
@@ -4780,7 +4762,6 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 int shift;
 MemOp ot, aflag, dflag;
 int modrm, reg, rm, mod, op, opreg, val;
-target_ulong next_eip, tval;
 bool orig_cc_op_dirty = s->cc_op_dirty;
 CCOp orig_cc_op = s->cc_op;
 
@@ -6933,22 +6914,20 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 }
 break;
 case 0x70 ... 0x7f: /* jcc Jb */
-tval = (int8_t)insn_get(env, s, MO_8);
-goto do_jcc;
+{
+int diff = (int8_t)insn_get(env, s, MO_8);
+gen_bnd_jmp(s);
+gen_jcc(s, b, diff);
+}
+break;
 case 0x180 ... 0x18f: /* jcc Jv */
-if (dflag != MO_16) {
-tval = (int32_t)insn_get(env, s, MO_32);
-} else {
-tval = (int16_t)insn_get(env, s, MO_16);
+{
+int diff = (dflag != MO_16
+? (int32_t)insn_get(env, s, MO_32)
+: (int16_t)insn_get(env, s, MO_16));
+gen_bnd_jmp(s);
+gen_jcc(s, b, diff);
 }
-do_jcc:
-next_eip = s->pc - s->cs_base;
-tval += next_eip;
-if (dflag == MO_16) {
-tval &= 0xffff;
-}
-gen_bnd_jmp(s);
-gen_jcc(s, b, tval, next_eip);
 break;
 
 case 0x190 ... 0x19f: /* setcc Gv */
-- 
2.37.3




[PULL 34/37] KVM: x86: Implement MSR_CORE_THREAD_COUNT MSR

2022-10-11 Thread Paolo Bonzini
From: Alexander Graf 

The MSR_CORE_THREAD_COUNT MSR describes CPU package topology, such as number
of threads and cores for a given package. This is information that QEMU has
readily available and can provide through the new user space MSR deflection
interface.

This patch propagates the existing hvf logic from patch 027ac0cb516
("target/i386/hvf: add rdmsr 35H MSR_CORE_THREAD_COUNT") to KVM.

Signed-off-by: Alexander Graf 
Message-Id: <20221004225643.65036-4-ag...@csgraf.de>
Signed-off-by: Paolo Bonzini 
---
 target/i386/kvm/kvm.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 1d9a50b02b..bed6c00f2c 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2401,6 +2401,17 @@ static int kvm_get_supported_msrs(KVMState *s)
 return ret;
 }
 
+static bool kvm_rdmsr_core_thread_count(X86CPU *cpu, uint32_t msr,
+uint64_t *val)
+{
+CPUState *cs = CPU(cpu);
+
+*val = cs->nr_threads * cs->nr_cores; /* thread count, bits 15..0 */
+*val |= ((uint32_t)cs->nr_cores << 16); /* core count, bits 31..16 */
+
+return true;
+}
+
 static Notifier smram_machine_done;
 static KVMMemoryListener smram_listener;
 static AddressSpace smram_address_space;
@@ -2613,6 +2624,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 }
 }
 if (kvm_vm_check_extension(s, KVM_CAP_X86_USER_SPACE_MSR)) {
+bool r;
+
 ret = kvm_vm_enable_cap(s, KVM_CAP_X86_USER_SPACE_MSR, 0,
 KVM_MSR_EXIT_REASON_FILTER);
 if (ret) {
@@ -2620,6 +2633,14 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
  strerror(-ret));
 exit(1);
 }
+
+r = kvm_filter_msr(s, MSR_CORE_THREAD_COUNT,
+   kvm_rdmsr_core_thread_count, NULL);
+if (!r) {
+error_report("Could not install MSR_CORE_THREAD_COUNT handler: %s",
+ strerror(-ret));
+exit(1);
+}
 }
 
 return 0;
-- 
2.37.3




[PULL 28/37] target/i386: Create eip_cur_tl

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Signed-off-by: Richard Henderson 
Reviewed-by: Paolo Bonzini 
Message-Id: <20221001140935.465607-24-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 2e7b94700b..5b0dab8633 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -562,6 +562,11 @@ static TCGv eip_next_tl(DisasContext *s)
 return tcg_constant_tl(s->pc - s->cs_base);
 }
 
+static TCGv eip_cur_tl(DisasContext *s)
+{
+return tcg_constant_tl(s->base.pc_next - s->cs_base);
+}
+
 /* Compute SEG:REG into A0.  SEG is selected from the override segment
(OVR_SEG) and the default segment (DEF_SEG).  OVR_SEG may be -1 to
indicate no override.  */
@@ -6617,7 +6622,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
offsetof(CPUX86State, segs[R_CS].selector));
 tcg_gen_st16_i32(s->tmp2_i32, cpu_env,
  offsetof(CPUX86State, fpcs));
-tcg_gen_st_tl(tcg_constant_tl(s->base.pc_next - s->cs_base),
+tcg_gen_st_tl(eip_cur_tl(s),
   cpu_env, offsetof(CPUX86State, fpip));
 }
 }
-- 
2.37.3




Re: [RFC PATCH 5/6] hw/cxl/cxl-events: Add event interrupt support

2022-10-11 Thread Jonathan Cameron via
On Mon, 10 Oct 2022 15:29:43 -0700
ira.we...@intel.com wrote:

> From: Ira Weiny 
> 
> To facilitate testing of event interrupt support add a QMP HMP command
> to reset the event logs and issue interrupts when the guest has enabled
> those interrupts.
Two things in here, so probably wants breaking into two patches:
1) Add the injection command
2) Add the interrupt support.

As on earlier patches, I think we need a more sophisticated
injection interface so we can inject individual errors (or better yet, sets of
errors, so we can trigger both the single-error case and multiple errors per interrupt).

Jonathan


> 
> Signed-off-by: Ira Weiny 
> ---
>  hmp-commands.hx | 14 +++
>  hw/cxl/cxl-events.c | 82 +
>  hw/cxl/cxl-host-stubs.c |  5 +++
>  hw/mem/cxl_type3.c  |  7 +++-
>  include/hw/cxl/cxl_device.h |  3 ++
>  include/sysemu/sysemu.h |  3 ++
>  6 files changed, 113 insertions(+), 1 deletion(-)
> 
> diff --git a/hmp-commands.hx b/hmp-commands.hx
> index 564f1de364df..c59a98097317 100644
> --- a/hmp-commands.hx
> +++ b/hmp-commands.hx
> @@ -1266,6 +1266,20 @@ SRST
>Inject PCIe AER error
>  ERST
>  
> +{
> +.name   = "cxl_event_inject",
> +.args_type  = "id:s",
> +.params = "id ",
> +.help   = "inject cxl events and interrupt\n\t\t\t"
> +  "<id> = qdev device id\n\t\t\t",
> +.cmd= hmp_cxl_event_inject,
> +},
> +
> +SRST
> +``cxl_event_inject``
> +  Inject CXL Events
> +ERST
> +
>  {
>  .name   = "netdev_add",
>  .args_type  = "netdev:O",


>  const MemoryRegionOps cfmws_ops;
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index 2b13179d116d..b4a90136d190 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -459,7 +459,7 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp)
>  ComponentRegisters *regs = &cxl_cstate->crb;
>  MemoryRegion *mr = ®s->component_registers;
>  uint8_t *pci_conf = pci_dev->config;
> -unsigned short msix_num = 3;
> +unsigned short msix_num = 7;
>  int i;
>  
>  if (!cxl_setup_memory(ct3d, errp)) {
> @@ -502,6 +502,11 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp)
>  msix_vector_use(pci_dev, i);
>  }
>  
> +ct3d->cxl_dstate.event_vector[CXL_EVENT_TYPE_INFO] = 6;
> +ct3d->cxl_dstate.event_vector[CXL_EVENT_TYPE_WARN] = 5;
> +ct3d->cxl_dstate.event_vector[CXL_EVENT_TYPE_FAIL] = 4;
> +ct3d->cxl_dstate.event_vector[CXL_EVENT_TYPE_FATAL] = 3;

For testing purposes, maybe put 2 of them on the same interrupt vector?
That way we'll verify that the kernel code deals fine with either separate
interrupts or shared vectors.




[PULL 25/37] target/i386: Use gen_jmp_rel for DISAS_TOO_MANY

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

With gen_jmp_rel, we may chain between two translation blocks
which may only be separated because of TB size limits.

Reviewed-by: Paolo Bonzini 
Signed-off-by: Richard Henderson 
Message-Id: <20221001140935.465607-21-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 5b84be4975..cf23ae6e5e 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -8798,6 +8798,9 @@ static void i386_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
 case DISAS_NORETURN:
 break;
 case DISAS_TOO_MANY:
+gen_update_cc_op(dc);
+gen_jmp_rel_csize(dc, 0, 0);
+break;
 case DISAS_EOB_NEXT:
 gen_update_cc_op(dc);
 gen_update_eip_cur(dc);
-- 
2.37.3




[PULL 35/37] linux-user: i386/signal: move fpstate at the end of the 32-bit frames

2022-10-11 Thread Paolo Bonzini
Recent versions of Linux moved the 32-bit fpstate towards the end of the
frame, so that the variable-sized xsave data does not overwrite the
(ABI-defined) extramask[] field.  Follow suit in QEMU.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 linux-user/i386/signal.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/linux-user/i386/signal.c b/linux-user/i386/signal.c
index 4372621a4d..76317a3d16 100644
--- a/linux-user/i386/signal.c
+++ b/linux-user/i386/signal.c
@@ -163,9 +163,16 @@ struct sigframe {
 abi_ulong pretcode;
 int sig;
 struct target_sigcontext sc;
-struct target_fpstate fpstate;
+/*
+ * The actual fpstate is placed after retcode[] below, to make
+ * room for the variable-sized xsave data.  The older unused fpstate
+ * has to be kept to avoid changing the offset of extramask[], which
+ * is part of the ABI.
+ */
+struct target_fpstate fpstate_unused;
 abi_ulong extramask[TARGET_NSIG_WORDS-1];
 char retcode[8];
+struct target_fpstate fpstate;
 };
 
 struct rt_sigframe {
@@ -175,8 +182,8 @@ struct rt_sigframe {
 abi_ulong puc;
 struct target_siginfo info;
 struct target_ucontext uc;
-struct target_fpstate fpstate;
 char retcode[8];
+struct target_fpstate fpstate;
 };
 
 #else
-- 
2.37.3




Re: [PATCH] docs/devel: remove incorrect claim about git send-email

2022-10-11 Thread Linus Heckemann
Alyssa Ross  writes:

> Alyssa Ross  writes:
>
>> Linus Heckemann  writes:
>>
>>> While it's unclear to me what git send-email actually does with the
>>> -v2 parameter (it is not documented, but also not rejected), it does
>>> not add a v2 tag to the email's subject, which is what led to the
>>> mishap in [1].
>>>
>>> [1]: https://lists.nongnu.org/archive/html/qemu-devel/2022-09/msg00679.html
>>
>> It does for me!
>>
>> Tested with:
>>
>>git send-email -v2 --to h...@alyssa.is HEAD~
>>
>> X-Mailer: git-send-email 2.37.1
>
> I wouldn't be surprised if it only adds it when it's generating the
> patch though.  Did you perhaps run git format-patch first to generate a
> patch file, and then use git send-email to send it?

Yes! I didn't realise that git send-email can be used without the
intermediate format-patch step. I guess it's a git bug that git
send-email will silently ignore -v when used with a patch file. I'll
have a look at fixing that.



[PATCH v5 0/6] ASID support in vhost-vdpa net

2022-10-11 Thread Eugenio Pérez
The control VQ is the way net devices send changes to the device state, like
the number of active queues or the MAC address.

QEMU needs to intercept this queue so it can track these changes and is able to
migrate the device. It has been able to do so since 1576dbb5bbc4 ("vdpa: Add x-svq to
NetdevVhostVDPAOptions"). However, enabling x-svq implies shadowing all of the
VirtIO device's virtqueues, which damages performance.

This series adds address space isolation, so that the guest and the device
communicate directly with each other (passthrough), while CVQ communication is
split in two: the guest communicates with QEMU, and QEMU forwards the commands
to the device.

Comments are welcome. Thanks!

v5:
- Move vring state in vhost_vdpa_get_vring_group instead of using a
  parameter.
- Rename VHOST_VDPA_NET_CVQ_PASSTHROUGH to VHOST_VDPA_NET_DATA_ASID

v4:
- Rebased on the last CVQ start series, which allocated CVQ cmd bufs at load
- Squash vhost_vdpa_cvq_group_is_independent.
- Do not check for cvq index on vhost_vdpa_net_prepare; we only have one
  callback registered in that NetClientInfo.
- Add comment specifying behavior if device does not support _F_ASID
- Update headers to a later Linux commit to not to remove SETUP_RNG_SEED

v3:
- Do not return an error but just print a warning if vdpa device initialization
  returns failure while getting AS num of VQ groups
- Delete extra newline

v2:
- Much as commented on series [1], handle vhost_net backend through
  NetClientInfo callbacks instead of directly.
- Fix not freeing SVQ properly when device does not support CVQ
- Add BIT_ULL missed checking device's backend feature for _F_ASID.

Eugenio Pérez (6):
  vdpa: Use v->shadow_vqs_enabled in vhost_vdpa_svqs_start & stop
  vdpa: Allocate SVQ unconditionally
  vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap
  vdpa: Store x-svq parameter in VhostVDPAState
  vdpa: Add listener_shadow_vq to vhost_vdpa
  vdpa: Always start CVQ in SVQ mode

 include/hw/virtio/vhost-vdpa.h |  10 ++-
 hw/virtio/vhost-vdpa.c |  75 ++-
 net/vhost-vdpa.c   | 128 ++---
 hw/virtio/trace-events |   4 +-
 4 files changed, 170 insertions(+), 47 deletions(-)

-- 
2.31.1





[PATCH v5 5/6] vdpa: Add listener_shadow_vq to vhost_vdpa

2022-10-11 Thread Eugenio Pérez
The memory listener that tells the device how to convert GPA to qemu's
va is registered against CVQ vhost_vdpa. This series tries to map the
memory listener translations to ASID 0, while it maps the CVQ ones to
ASID 1.

Let's tell the listener whether it needs to register them in the iova tree
or not.

Signed-off-by: Eugenio Pérez 
---
v5: Solve conflict about vhost_iova_tree_remove accepting mem_region by
value.
---
 include/hw/virtio/vhost-vdpa.h | 2 ++
 hw/virtio/vhost-vdpa.c | 6 +++---
 net/vhost-vdpa.c   | 1 +
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index 6560bb9d78..0c3ed2d69b 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -34,6 +34,8 @@ typedef struct vhost_vdpa {
 struct vhost_vdpa_iova_range iova_range;
 uint64_t acked_features;
 bool shadow_vqs_enabled;
+/* The listener must send iova tree addresses, not GPA */
+bool listener_shadow_vq;
 /* IOVA mapping used by the Shadow Virtqueue */
 VhostIOVATree *iova_tree;
 GPtrArray *shadow_vqs;
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index ad663feacc..29d009c02b 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -220,7 +220,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
  vaddr, section->readonly);
 
 llsize = int128_sub(llend, int128_make64(iova));
-if (v->shadow_vqs_enabled) {
+if (v->listener_shadow_vq) {
 int r;
 
 mem_region.translated_addr = (hwaddr)(uintptr_t)vaddr,
@@ -247,7 +247,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener *listener,
 return;
 
 fail_map:
-if (v->shadow_vqs_enabled) {
+if (v->listener_shadow_vq) {
 vhost_iova_tree_remove(v->iova_tree, mem_region);
 }
 
@@ -292,7 +292,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener *listener,
 
 llsize = int128_sub(llend, int128_make64(iova));
 
-if (v->shadow_vqs_enabled) {
+if (v->listener_shadow_vq) {
 const DMAMap *result;
 const void *vaddr = memory_region_get_ram_ptr(section->mr) +
 section->offset_within_region +
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index e8c78e4813..f7831aeb8d 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -604,6 +604,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState 
*peer,
 s->vhost_vdpa.index = queue_pair_index;
 s->always_svq = svq;
 s->vhost_vdpa.shadow_vqs_enabled = svq;
+s->vhost_vdpa.listener_shadow_vq = svq;
 s->vhost_vdpa.iova_tree = iova_tree;
 if (!is_datapath) {
 s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
-- 
2.31.1




[PULL 27/37] target/i386: Merge gen_jmp_tb and gen_goto_tb into gen_jmp_rel

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

These functions have only one caller, and the logic is more
obvious this way.

Signed-off-by: Richard Henderson 
Reviewed-by: Paolo Bonzini 
Message-Id: <20221001140935.465607-23-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 50 +
 1 file changed, 17 insertions(+), 33 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 9294f12f66..2e7b94700b 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -224,7 +224,6 @@ STUB_HELPER(wrmsr, TCGv_env env)
 
 static void gen_eob(DisasContext *s);
 static void gen_jr(DisasContext *s);
-static void gen_jmp_tb(DisasContext *s, target_ulong eip, int tb_num);
 static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num);
 static void gen_jmp_rel_csize(DisasContext *s, int diff, int tb_num);
 static void gen_op(DisasContext *s1, int op, MemOp ot, int d);
@@ -2393,23 +2392,6 @@ static inline int insn_const_size(MemOp ot)
 }
 }
 
-static void gen_goto_tb(DisasContext *s, int tb_num, target_ulong eip)
-{
-target_ulong pc = s->cs_base + eip;
-
-if (translator_use_goto_tb(&s->base, pc))  {
-/* jump to same page: we can use a direct jump */
-tcg_gen_goto_tb(tb_num);
-gen_jmp_im(s, eip);
-tcg_gen_exit_tb(s->base.tb, tb_num);
-s->base.is_jmp = DISAS_NORETURN;
-} else {
-/* jump to another page */
-gen_jmp_im(s, eip);
-gen_jr(s);
-}
-}
-
 static void gen_jcc(DisasContext *s, int b, int diff)
 {
 TCGLabel *l1 = gen_new_label();
@@ -2762,20 +2744,6 @@ static void gen_jr(DisasContext *s)
 do_gen_eob_worker(s, false, false, true);
 }
 
-/* generate a jump to eip. No segment change must happen before as a
-   direct call to the next block may occur */
-static void gen_jmp_tb(DisasContext *s, target_ulong eip, int tb_num)
-{
-gen_update_cc_op(s);
-set_cc_op(s, CC_OP_DYNAMIC);
-if (s->jmp_opt) {
-gen_goto_tb(s, tb_num, eip);
-} else {
-gen_jmp_im(s, eip);
-gen_eob(s);
-}
-}
-
 /* Jump to eip+diff, truncating the result to OT. */
 static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num)
 {
@@ -2789,7 +2757,23 @@ static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num)
 dest &= 0xffff;
 }
 }
-gen_jmp_tb(s, dest, tb_num);
+
+gen_update_cc_op(s);
+set_cc_op(s, CC_OP_DYNAMIC);
+if (!s->jmp_opt) {
+gen_jmp_im(s, dest);
+gen_eob(s);
+} else if (translator_use_goto_tb(&s->base, dest))  {
+/* jump to same page: we can use a direct jump */
+tcg_gen_goto_tb(tb_num);
+gen_jmp_im(s, dest);
+tcg_gen_exit_tb(s->base.tb, tb_num);
+s->base.is_jmp = DISAS_NORETURN;
+} else {
+/* jump to another page */
+gen_jmp_im(s, dest);
+gen_jr(s);
+}
 }
 
 /* Jump to eip+diff, truncating to the current code size. */
-- 
2.37.3




[PULL 33/37] i386: kvm: Add support for MSR filtering

2022-10-11 Thread Paolo Bonzini
From: Alexander Graf 

KVM has grown support to deflect arbitrary MSRs to user space since
Linux 5.10. For now we don't expect to make a lot of use of this
feature, so let's expose it the easiest way possible: With up to 16
individually maskable MSRs.

This patch adds a kvm_filter_msr() function that other code can call
to install a hook on KVM MSR reads or writes.

Signed-off-by: Alexander Graf 
Message-Id: <20221004225643.65036-3-ag...@csgraf.de>
Signed-off-by: Paolo Bonzini 
---
 target/i386/kvm/kvm.c  | 123 +
 target/i386/kvm/kvm_i386.h |  11 
 2 files changed, 134 insertions(+)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index ec63b5eb10..1d9a50b02b 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -141,6 +141,8 @@ static struct kvm_cpuid2 *cpuid_cache;
 static struct kvm_cpuid2 *hv_cpuid_cache;
 static struct kvm_msr_list *kvm_feature_msrs;
 
+static KVMMSRHandlers msr_handlers[KVM_MSR_FILTER_MAX_RANGES];
+
 #define BUS_LOCK_SLICE_TIME 10ULL /* ns */
 static RateLimit bus_lock_ratelimit_ctrl;
 static int kvm_get_one_msr(X86CPU *cpu, int index, uint64_t *value);
@@ -2610,6 +2612,15 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 return ret;
 }
 }
+if (kvm_vm_check_extension(s, KVM_CAP_X86_USER_SPACE_MSR)) {
+ret = kvm_vm_enable_cap(s, KVM_CAP_X86_USER_SPACE_MSR, 0,
+KVM_MSR_EXIT_REASON_FILTER);
+if (ret) {
+error_report("Could not enable user space MSRs: %s",
+ strerror(-ret));
+exit(1);
+}
+}
 
 return 0;
 }
@@ -5109,6 +5120,108 @@ void kvm_arch_update_guest_debug(CPUState *cpu, struct kvm_guest_debug *dbg)
 }
 }
 
+static bool kvm_install_msr_filters(KVMState *s)
+{
+uint64_t zero = 0;
+struct kvm_msr_filter filter = {
+.flags = KVM_MSR_FILTER_DEFAULT_ALLOW,
+};
+int r, i, j = 0;
+
+for (i = 0; i < KVM_MSR_FILTER_MAX_RANGES; i++) {
+KVMMSRHandlers *handler = &msr_handlers[i];
+if (handler->msr) {
+struct kvm_msr_filter_range *range = &filter.ranges[j++];
+
+*range = (struct kvm_msr_filter_range) {
+.flags = 0,
+.nmsrs = 1,
+.base = handler->msr,
+.bitmap = (__u8 *)&zero,
+};
+
+if (handler->rdmsr) {
+range->flags |= KVM_MSR_FILTER_READ;
+}
+
+if (handler->wrmsr) {
+range->flags |= KVM_MSR_FILTER_WRITE;
+}
+}
+}
+
+r = kvm_vm_ioctl(s, KVM_X86_SET_MSR_FILTER, &filter);
+if (r) {
+return false;
+}
+
+return true;
+}
+
+bool kvm_filter_msr(KVMState *s, uint32_t msr, QEMURDMSRHandler *rdmsr,
+QEMUWRMSRHandler *wrmsr)
+{
+int i;
+
+for (i = 0; i < ARRAY_SIZE(msr_handlers); i++) {
+if (!msr_handlers[i].msr) {
+msr_handlers[i] = (KVMMSRHandlers) {
+.msr = msr,
+.rdmsr = rdmsr,
+.wrmsr = wrmsr,
+};
+
+if (!kvm_install_msr_filters(s)) {
+msr_handlers[i] = (KVMMSRHandlers) { };
+return false;
+}
+
+return true;
+}
+}
+
+return false;
+}
+
+static int kvm_handle_rdmsr(X86CPU *cpu, struct kvm_run *run)
+{
+int i;
+bool r;
+
+for (i = 0; i < ARRAY_SIZE(msr_handlers); i++) {
+KVMMSRHandlers *handler = &msr_handlers[i];
+if (run->msr.index == handler->msr) {
+if (handler->rdmsr) {
+r = handler->rdmsr(cpu, handler->msr,
+   (uint64_t *)&run->msr.data);
+run->msr.error = r ? 0 : 1;
+return 0;
+}
+}
+}
+
+assert(false);
+}
+
+static int kvm_handle_wrmsr(X86CPU *cpu, struct kvm_run *run)
+{
+int i;
+bool r;
+
+for (i = 0; i < ARRAY_SIZE(msr_handlers); i++) {
+KVMMSRHandlers *handler = &msr_handlers[i];
+if (run->msr.index == handler->msr) {
+if (handler->wrmsr) {
+r = handler->wrmsr(cpu, handler->msr, run->msr.data);
+run->msr.error = r ? 0 : 1;
+return 0;
+}
+}
+}
+
+assert(false);
+}
+
 static bool has_sgx_provisioning;
 
 static bool __kvm_enable_sgx_provisioning(KVMState *s)
@@ -5226,6 +5339,16 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
 ret = 0;
 }
 break;
+case KVM_EXIT_X86_RDMSR:
+/* We only enable MSR filtering, any other exit is bogus */
+assert(run->msr.reason == KVM_MSR_EXIT_REASON_FILTER);
+ret = kvm_handle_rdmsr(cpu, run);
+break;
+case KVM_EXIT_X86_WRMSR:
+/* We only enable MSR filtering, any other exit is bogus */
+assert(run->msr.reason == KVM_MSR_EXIT_REASON_F

[PATCH v5 2/6] vdpa: Allocate SVQ unconditionally

2022-10-11 Thread Eugenio Pérez
SVQ may or may not run in a device, depending on runtime conditions (for
example, whether the device can move CVQ to its own group).

Allocate the resources unconditionally, and decide later whether to use
them.

Signed-off-by: Eugenio Pérez 
---
 hw/virtio/vhost-vdpa.c | 33 +++--
 1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 7f0ff4df5b..d966966131 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -410,6 +410,21 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
 int r;
 bool ok;
 
+shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
+for (unsigned n = 0; n < hdev->nvqs; ++n) {
+g_autoptr(VhostShadowVirtqueue) svq;
+
+svq = vhost_svq_new(v->iova_tree, v->shadow_vq_ops,
+v->shadow_vq_ops_opaque);
+if (unlikely(!svq)) {
+error_setg(errp, "Cannot create svq %u", n);
+return -1;
+}
+g_ptr_array_add(shadow_vqs, g_steal_pointer(&svq));
+}
+
+v->shadow_vqs = g_steal_pointer(&shadow_vqs);
+
 if (!v->shadow_vqs_enabled) {
 return 0;
 }
@@ -426,20 +441,6 @@ static int vhost_vdpa_init_svq(struct vhost_dev *hdev, struct vhost_vdpa *v,
 return -1;
 }
 
-shadow_vqs = g_ptr_array_new_full(hdev->nvqs, vhost_svq_free);
-for (unsigned n = 0; n < hdev->nvqs; ++n) {
-g_autoptr(VhostShadowVirtqueue) svq;
-
-svq = vhost_svq_new(v->iova_tree, v->shadow_vq_ops,
-v->shadow_vq_ops_opaque);
-if (unlikely(!svq)) {
-error_setg(errp, "Cannot create svq %u", n);
-return -1;
-}
-g_ptr_array_add(shadow_vqs, g_steal_pointer(&svq));
-}
-
-v->shadow_vqs = g_steal_pointer(&shadow_vqs);
 return 0;
 }
 
@@ -580,10 +581,6 @@ static void vhost_vdpa_svq_cleanup(struct vhost_dev *dev)
 struct vhost_vdpa *v = dev->opaque;
 size_t idx;
 
-if (!v->shadow_vqs) {
-return;
-}
-
 for (idx = 0; idx < v->shadow_vqs->len; ++idx) {
 vhost_svq_stop(g_ptr_array_index(v->shadow_vqs, idx));
 }
-- 
2.31.1




[PULL 37/37] linux-user: i386/signal: support XSAVE/XRSTOR for signal frame fpstate

2022-10-11 Thread Paolo Bonzini
Add support for saving/restoring extended save states when signals
are delivered.  This allows using AVX, MPX or PKRU registers in
signal handlers.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 linux-user/i386/signal.c | 119 +--
 target/i386/cpu.c|   2 +-
 target/i386/cpu.h|   3 +
 target/i386/tcg/fpu_helper.c |  64 +++
 4 files changed, 142 insertions(+), 46 deletions(-)

diff --git a/linux-user/i386/signal.c b/linux-user/i386/signal.c
index 58b11be116..60fa07d6f9 100644
--- a/linux-user/i386/signal.c
+++ b/linux-user/i386/signal.c
@@ -24,6 +24,10 @@
 
 /* from the Linux kernel - /arch/x86/include/uapi/asm/sigcontext.h */
 
+#define TARGET_FP_XSTATE_MAGIC1 0x46505853U /* FPXS */
+#define TARGET_FP_XSTATE_MAGIC2 0x46505845U /* FPXE */
+#define TARGET_FP_XSTATE_MAGIC2_SIZE4
+
 struct target_fpreg {
 uint16_t significand[4];
 uint16_t exponent;
@@ -39,6 +43,15 @@ struct target_xmmreg {
 uint32_t element[4];
 };
 
+struct target_fpx_sw_bytes {
+uint32_t magic1;
+uint32_t extended_size;
+uint64_t xfeatures;
+uint32_t xstate_size;
+uint32_t reserved[7];
+};
+QEMU_BUILD_BUG_ON(sizeof(struct target_fpx_sw_bytes) != 12*4);
+
 struct target_fpstate_fxsave {
 /* FXSAVE format */
 uint16_t cw;
@@ -51,10 +64,13 @@ struct target_fpstate_fxsave {
 uint32_t mxcsr_mask;
 uint32_t st_space[32];
 uint32_t xmm_space[64];
-uint32_t reserved[24];
+uint32_t hw_reserved[12];
+struct target_fpx_sw_bytes sw_reserved;
+uint8_t xfeatures[];
 };
 #define TARGET_FXSAVE_SIZE   sizeof(struct target_fpstate_fxsave)
 QEMU_BUILD_BUG_ON(TARGET_FXSAVE_SIZE != 512);
+QEMU_BUILD_BUG_ON(offsetof(struct target_fpstate_fxsave, sw_reserved) != 464);
 
 struct target_fpstate_32 {
 /* Regular FPU environment */
@@ -214,13 +230,39 @@ struct rt_sigframe {
  * Set up a signal frame.
  */
 
-static void fxsave_sigcontext(CPUX86State *env, struct target_fpstate_fxsave *fxsave,
-  abi_ulong fxsave_addr)
+static void xsave_sigcontext(CPUX86State *env, struct target_fpstate_fxsave *fxsave,
+ abi_ulong fxsave_addr)
 {
-/* fxsave_addr must be 16 byte aligned for fxsave */
-assert(!(fxsave_addr & 0xf));
+if (!(env->features[FEAT_1_ECX] & CPUID_EXT_XSAVE)) {
+/* fxsave_addr must be 16 byte aligned for fxsave */
+assert(!(fxsave_addr & 0xf));
 
-cpu_x86_fxsave(env, fxsave_addr);
+cpu_x86_fxsave(env, fxsave_addr);
+__put_user(0, &fxsave->sw_reserved.magic1);
+} else {
+uint32_t xstate_size = xsave_area_size(env->xcr0, false);
+uint32_t xfeatures_size = xstate_size - TARGET_FXSAVE_SIZE;
+
+/*
+ * extended_size is the offset from fpstate_addr to right after the end
+ * of the extended save states.  On 32-bit that includes the legacy
+ * FSAVE area.
+ */
+uint32_t extended_size = TARGET_FPSTATE_FXSAVE_OFFSET
++ xstate_size + TARGET_FP_XSTATE_MAGIC2_SIZE;
+
+/* fxsave_addr must be 64 byte aligned for xsave */
+assert(!(fxsave_addr & 0x3f));
+
+/* Zero the header, XSAVE *adds* features to an existing save state.  */
+memset(fxsave->xfeatures, 0, 64);
+cpu_x86_xsave(env, fxsave_addr);
+__put_user(TARGET_FP_XSTATE_MAGIC1, &fxsave->sw_reserved.magic1);
+__put_user(extended_size, &fxsave->sw_reserved.extended_size);
+__put_user(env->xcr0, &fxsave->sw_reserved.xfeatures);
+__put_user(xstate_size, &fxsave->sw_reserved.xstate_size);
+__put_user(TARGET_FP_XSTATE_MAGIC2, (uint32_t *) &fxsave->xfeatures[xfeatures_size]);
+}
 }
 
 static void setup_sigcontext(struct target_sigcontext *sc,
@@ -257,8 +299,8 @@ static void setup_sigcontext(struct target_sigcontext *sc,
 if (!(env->features[FEAT_1_EDX] & CPUID_FXSR)) {
 magic = 0xffff;
 } else {
-fxsave_sigcontext(env, &fpstate->fxsave,
-  fpstate_addr + TARGET_FPSTATE_FXSAVE_OFFSET);
+xsave_sigcontext(env, &fpstate->fxsave,
+ fpstate_addr + TARGET_FPSTATE_FXSAVE_OFFSET);
 magic = 0;
 }
 __put_user(magic, &fpstate->magic);
@@ -291,7 +333,7 @@ static void setup_sigcontext(struct target_sigcontext *sc,
 __put_user((uint16_t)0, &sc->fs);
 __put_user(env->segs[R_SS].selector, &sc->ss);
 
-fxsave_sigcontext(env, fpstate, fpstate_addr);
+xsave_sigcontext(env, fpstate, fpstate_addr);
 #endif
 
 __put_user(fpstate_addr, &sc->fpstate);
@@ -332,8 +374,12 @@ get_sigframe(struct target_sigaction *ka, CPUX86State *env, size_t fxsave_offset
 
 if (!(env->features[FEAT_1_EDX] & CPUID_FXSR)) {
 return (esp - (fxsave_offset + TARGET_FXSAVE_SIZE)) & -8ul;
-} else {
+} else if (!(env->features[FEAT_1_ECX] & CPUID_EXT_XSAVE)) {
 return ((esp - TARGET_FXSAVE

[PATCH] qgraph: implement stack as a linked list

2022-10-11 Thread Paolo Bonzini
The stack used to visit the graph is implemented as a fixed-size array,
and the array is sized according to the maximum anticipated length of
a path on the graph.  However, the worst case for a depth-first search
is to push all nodes on the graph, and in fact stack overflows have
been observed in the past depending on the ordering of the constructors.

To fix the problem once and for all, use a QSLIST instead of the array,
allocating QOSStackElements from the heap.

Signed-off-by: Paolo Bonzini 
---
 tests/qtest/libqos/qgraph.c | 35 +++
 1 file changed, 11 insertions(+), 24 deletions(-)

diff --git a/tests/qtest/libqos/qgraph.c b/tests/qtest/libqos/qgraph.c
index 0a2dddfafa..2433e6ea4b 100644
--- a/tests/qtest/libqos/qgraph.c
+++ b/tests/qtest/libqos/qgraph.c
@@ -52,6 +52,7 @@ struct QOSStackElement {
 QOSStackElement *parent;
 QOSGraphEdge *parent_edge;
 int length;
+QSLIST_ENTRY(QOSStackElement) next;
 };
 
 /* Each enty in these hash table will consist of  pair. */
@@ -59,8 +60,7 @@ static GHashTable *edge_table;
 static GHashTable *node_table;
 
 /* stack used by the DFS algorithm to store the path from machine to test */
-static QOSStackElement qos_node_stack[QOS_PATH_MAX_ELEMENT_SIZE];
-static int qos_node_tos;
+static QSLIST_HEAD(, QOSStackElement) qos_node_stack;
 
 /**
  * add_edge(): creates an edge of type @type
@@ -325,40 +325,27 @@ static void qos_print_cb(QOSGraphNode *path, int length)
 static void qos_push(QOSGraphNode *el, QOSStackElement *parent,
  QOSGraphEdge *e)
 {
+QOSStackElement *elem = g_new0(QOSStackElement, 1);
 int len = 0; /* root is not counted */
-if (qos_node_tos == QOS_PATH_MAX_ELEMENT_SIZE) {
-g_printerr("QOSStack: full stack, cannot push");
-abort();
-}
-
 if (parent) {
 len = parent->length + 1;
 }
-qos_node_stack[qos_node_tos++] = (QOSStackElement) {
+*elem = (QOSStackElement) {
 .node = el,
 .parent = parent,
 .parent_edge = e,
 .length = len,
 };
-}
-
-/* qos_tos(): returns the top of stack, without popping */
-static QOSStackElement *qos_tos(void)
-{
-return &qos_node_stack[qos_node_tos - 1];
+QSLIST_INSERT_HEAD(qos_node_stack, elem, next);
 }
 
 /* qos_pop(): pops an element from the tos, setting it unvisited*/
-static QOSStackElement *qos_pop(void)
+static void qos_pop(void)
 {
-if (qos_node_tos == 0) {
-g_printerr("QOSStack: empty stack, cannot pop");
-abort();
-}
-QOSStackElement *e = qos_tos();
+QOSStackElement *e = QSLIST_FIRST(qos_node_stack);
 e->node->visited = false;
-qos_node_tos--;
-return e;
+QSLIST_REMOVE_HEAD(qos_node_stack, next);
+g_free(e);
 }
 
 /**
@@ -400,8 +387,8 @@ static void qos_traverse_graph(QOSGraphNode *root, 
QOSTestCallback callback)
 
 qos_push(root, NULL, NULL);
 
-while (qos_node_tos > 0) {
-s_el = qos_tos();
+while (!QSLIST_EMPTY(qos_node_stack)) {
+s_el = QSLIST_HEAD(qos_node_stack);
 v = s_el->node;
 if (v->visited) {
 qos_pop();
-- 
2.37.3




[PULL 29/37] target/i386: Add cpu_eip

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Create a tcg global temp for this, and use it instead of explicit stores.

Signed-off-by: Richard Henderson 
Reviewed-by: Paolo Bonzini 
Message-Id: <20221001140935.465607-25-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 5b0dab8633..f08fa060c4 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -64,6 +64,7 @@
 
 /* global register indexes */
 static TCGv cpu_cc_dst, cpu_cc_src, cpu_cc_src2;
+static TCGv cpu_eip;
 static TCGv_i32 cpu_cc_op;
 static TCGv cpu_regs[CPU_NB_REGS];
 static TCGv cpu_seg_base[6];
@@ -481,7 +482,7 @@ static void gen_add_A0_im(DisasContext *s, int val)
 
 static inline void gen_op_jmp_v(TCGv dest)
 {
-tcg_gen_st_tl(dest, cpu_env, offsetof(CPUX86State, eip));
+tcg_gen_mov_tl(cpu_eip, dest);
 }
 
 static inline
@@ -518,7 +519,7 @@ static inline void gen_op_st_rm_T0_A0(DisasContext *s, int 
idx, int d)
 
 static void gen_jmp_im(DisasContext *s, target_ulong pc)
 {
-gen_op_jmp_v(tcg_constant_tl(pc));
+tcg_gen_movi_tl(cpu_eip, pc);
 }
 
 static void gen_update_eip_cur(DisasContext *s)
@@ -8614,6 +8615,13 @@ void tcg_x86_init(void)
 [R_EDI] = "edi",
 [R_EBP] = "ebp",
 [R_ESP] = "esp",
+#endif
+};
+static const char eip_name[] = {
+#ifdef TARGET_X86_64
+"rip"
+#else
+"eip"
 #endif
 };
 static const char seg_base_names[6][8] = {
@@ -8640,6 +8648,7 @@ void tcg_x86_init(void)
 "cc_src");
 cpu_cc_src2 = tcg_global_mem_new(cpu_env, offsetof(CPUX86State, cc_src2),
  "cc_src2");
+cpu_eip = tcg_global_mem_new(cpu_env, offsetof(CPUX86State, eip), 
eip_name);
 
 for (i = 0; i < CPU_NB_REGS; ++i) {
 cpu_regs[i] = tcg_global_mem_new(cpu_env,
-- 
2.37.3




[PATCH v5 4/6] vdpa: Store x-svq parameter in VhostVDPAState

2022-10-11 Thread Eugenio Pérez
CVQ can be shadowed two ways:
- Device has x-svq=on parameter (current way)
- The device can isolate CVQ in its own vq group

QEMU needs to check for the second condition dynamically, because CVQ
index is not known at initialization time. Since this is dynamic, the
CVQ isolation could vary with different conditions, making it possible
to go from "not isolated group" to "isolated".

Save the cmdline parameter in an extra field so that we never disable CVQ
SVQ if the device was started with x-svq=on on the cmdline.

Signed-off-by: Eugenio Pérez 
---
 net/vhost-vdpa.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 025fbbc41b..e8c78e4813 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -38,6 +38,8 @@ typedef struct VhostVDPAState {
 void *cvq_cmd_out_buffer;
 virtio_net_ctrl_ack *status;
 
+/* The device always have SVQ enabled */
+bool always_svq;
 bool started;
 } VhostVDPAState;
 
@@ -600,6 +602,7 @@ static NetClientState *net_vhost_vdpa_init(NetClientState 
*peer,
 
 s->vhost_vdpa.device_fd = vdpa_device_fd;
 s->vhost_vdpa.index = queue_pair_index;
+s->always_svq = svq;
 s->vhost_vdpa.shadow_vqs_enabled = svq;
 s->vhost_vdpa.iova_tree = iova_tree;
 if (!is_datapath) {
-- 
2.31.1




Re: [PATCH v4] win32: set threads name

2022-10-11 Thread Bin Meng
On Tue, Oct 11, 2022 at 5:29 PM  wrote:
>
> From: Marc-André Lureau 
>
> As described in:
> https://learn.microsoft.com/en-us/visualstudio/debugger/how-to-set-a-thread-name-in-native-code?view=vs-2022
>
> SetThreadDescription() is available since Windows 10, version 1607 and
> in some versions only by "Run Time Dynamic Linking". Its declaration is
> not yet in mingw, so we lookup the function the same way glib does.
>
> Tested with Visual Studio Community 2022 debugger.
>
> Signed-off-by: Marc-André Lureau 
> Acked-by: Richard Henderson 
> ---
>  util/qemu-thread-win32.c | 55 ++--
>  1 file changed, 53 insertions(+), 2 deletions(-)
>
> diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
> index a2d5a6e825..b20bfa9c1f 100644
> --- a/util/qemu-thread-win32.c
> +++ b/util/qemu-thread-win32.c
> @@ -19,12 +19,39 @@
>
>  static bool name_threads;
>
> +typedef HRESULT (WINAPI *pSetThreadDescription) (HANDLE hThread,
> + PCWSTR lpThreadDescription);
> +static pSetThreadDescription SetThreadDescriptionFunc;
> +static HMODULE kernel32_module;
> +
> +static bool load_set_thread_description(void)
> +{
> +static gsize _init_once = 0;
> +
> +if (g_once_init_enter(&_init_once)) {
> +kernel32_module = LoadLibrary("kernel32.dll");
> +if (kernel32_module) {
> +SetThreadDescriptionFunc =
> +(pSetThreadDescription)GetProcAddress(kernel32_module,
> +  
> "SetThreadDescription");
> +if (!SetThreadDescriptionFunc) {
> +FreeLibrary(kernel32_module);
> +}
> +}
> +g_once_init_leave(&_init_once, 1);
> +}
> +
> +return !!SetThreadDescriptionFunc;
> +}
> +
>  void qemu_thread_naming(bool enable)
>  {
> -/* But note we don't actually name them on Windows yet */
>  name_threads = enable;
>
> -fprintf(stderr, "qemu: thread naming not supported on this host\n");
> +if (enable && !load_set_thread_description()) {
> +fprintf(stderr, "qemu: thread naming not supported on this host\n");
> +name_threads = false;
> +}
>  }
>
>  static void error_exit(int err, const char *msg)
> @@ -400,6 +427,26 @@ void *qemu_thread_join(QemuThread *thread)
>  return ret;
>  }
>
> +static bool

This is still not fixed

> +set_thread_description(HANDLE h, const char *name)
> +{
> +  HRESULT hr;

and the 4 spaces here ...

> +  g_autofree wchar_t *namew = NULL;
> +
> +  if (!load_set_thread_description()) {
> +  return false;
> +  }
> +
> +  namew = g_utf8_to_utf16(name, -1, NULL, NULL, NULL);
> +  if (!namew) {
> +  return false;
> +  }
> +
> +  hr = SetThreadDescriptionFunc(h, namew);
> +
> +  return SUCCEEDED(hr);
> +}
> +
>  void qemu_thread_create(QemuThread *thread, const char *name,
> void *(*start_routine)(void *),
> void *arg, int mode)
> @@ -423,7 +470,11 @@ void qemu_thread_create(QemuThread *thread, const char 
> *name,
>  if (!hThread) {
>  error_exit(GetLastError(), __func__);
>  }
> +if (name_threads && name && !set_thread_description(hThread, name)) {
> +fprintf(stderr, "qemu: failed to set thread description: %s\n", 
> name);
> +}
>  CloseHandle(hThread);
> +
>  thread->data = data;
>  }
>

Regards,
Bin



[PULL 30/37] target/i386: Inline gen_jmp_im

2022-10-11 Thread Paolo Bonzini
From: Richard Henderson 

Expand this function at each of its callers.

Signed-off-by: Richard Henderson 
Reviewed-by: Paolo Bonzini 
Message-Id: <20221001140935.465607-26-richard.hender...@linaro.org>
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index f08fa060c4..689a45256c 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -517,19 +517,14 @@ static inline void gen_op_st_rm_T0_A0(DisasContext *s, 
int idx, int d)
 }
 }
 
-static void gen_jmp_im(DisasContext *s, target_ulong pc)
-{
-tcg_gen_movi_tl(cpu_eip, pc);
-}
-
 static void gen_update_eip_cur(DisasContext *s)
 {
-gen_jmp_im(s, s->base.pc_next - s->cs_base);
+tcg_gen_movi_tl(cpu_eip, s->base.pc_next - s->cs_base);
 }
 
 static void gen_update_eip_next(DisasContext *s)
 {
-gen_jmp_im(s, s->pc - s->cs_base);
+tcg_gen_movi_tl(cpu_eip, s->pc - s->cs_base);
 }
 
 static int cur_insn_len(DisasContext *s)
@@ -2767,17 +2762,17 @@ static void gen_jmp_rel(DisasContext *s, MemOp ot, int 
diff, int tb_num)
 gen_update_cc_op(s);
 set_cc_op(s, CC_OP_DYNAMIC);
 if (!s->jmp_opt) {
-gen_jmp_im(s, dest);
+tcg_gen_movi_tl(cpu_eip, dest);
 gen_eob(s);
 } else if (translator_use_goto_tb(&s->base, dest))  {
 /* jump to same page: we can use a direct jump */
 tcg_gen_goto_tb(tb_num);
-gen_jmp_im(s, dest);
+tcg_gen_movi_tl(cpu_eip, dest);
 tcg_gen_exit_tb(s->base.tb, tb_num);
 s->base.is_jmp = DISAS_NORETURN;
 } else {
 /* jump to another page */
-gen_jmp_im(s, dest);
+tcg_gen_movi_tl(cpu_eip, dest);
 gen_jr(s);
 }
 }
-- 
2.37.3




Re: [PATCH v5 1/9] tests/x86: add helper qtest_qmp_device_del_send()

2022-10-11 Thread Thomas Huth

On 30/09/2022 00.35, Michael Labiuk via wrote:

Move sending the 'device_del' command to a separate function. The
function can be used when an additional action is needed to start
actually removing the device after sending the command.

Signed-off-by: Michael Labiuk 
---
  tests/qtest/device-plug-test.c | 15 ++-
  tests/qtest/drive_del-test.c   |  6 +-
  tests/qtest/libqos/pci-pc.c|  8 +---
  tests/qtest/libqtest.c | 16 ++--
  tests/qtest/libqtest.h | 10 ++
  5 files changed, 24 insertions(+), 31 deletions(-)

[...]

diff --git a/tests/qtest/libqtest.h b/tests/qtest/libqtest.h
index 3abc75964d..29ea9c697d 100644
--- a/tests/qtest/libqtest.h
+++ b/tests/qtest/libqtest.h
@@ -761,12 +761,22 @@ void qtest_qmp_device_add(QTestState *qts, const char 
*driver, const char *id,
  void qtest_qmp_add_client(QTestState *qts, const char *protocol, int fd);
  #endif /* _WIN32 */
  
+/**

+ * qtest_qmp_device_del_send:
+ * @qts: QTestState instance to operate on
+ * @id: Identification string
+ *
+ * Generic hot-unplugging test via the device_del QMP command.
+ */
+void qtest_qmp_device_del_send(QTestState *qts, const char *id);
+
  /**
   * qtest_qmp_device_del:
   * @qts: QTestState instance to operate on
   * @id: Identification string
   *
   * Generic hot-unplugging test via the device_del QMP command.
+ * Waiting for command comlition event.


Typo: "comlition" should be "completion", I guess?

Apart from that, patch looks fine, so with that fixed:
Reviewed-by: Thomas Huth 




Re: [PATCH v4] virtio-scsi: Send "REPORTED LUNS CHANGED" sense data upon disk hotplug events.

2022-10-11 Thread Paolo Bonzini
Queued, thanks.

Paolo




Re: [PATCH] migration: Fix a potential guest memory corruption

2022-10-11 Thread Dr. David Alan Gilbert
* Zhenzhong Duan (zhenzhong.d...@intel.com) wrote:

Hi,

> Imagine a rare case, after a dirty page is sent to compression threads's
> ring, dirty bitmap sync trigger right away and mark the same page dirty
> again and sent. Then the new page may be overwriten by stale page in
> compression threads's ring in the destination.

Yes, I think we had a similar problem in multifd.

> So we need to ensure there is only one copy of the same dirty page either
> by flushing the ring after every bitmap sync or avoiding processing same
> dirty page continuously.
> 
> I choose the 2nd which avoids the time consuming flush operation.

I'm not sure this guarantees it; it makes it much less likely, but if
only a few pages are dirtied and you have lots of threads, I think the
same thing could still happen.

I think you're going to need to flush the ring after each sync.

Dave

> Signed-off-by: Zhenzhong Duan 
> ---
>  migration/ram.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index dc1de9ddbc68..67b2035586bd 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1551,7 +1551,7 @@ static bool find_dirty_block(RAMState *rs, 
> PageSearchStatus *pss, bool *again)
>  pss->postcopy_requested = false;
>  pss->postcopy_target_channel = RAM_CHANNEL_PRECOPY;
>  
> -pss->page = migration_bitmap_find_dirty(rs, pss->block, pss->page);
> +pss->page = migration_bitmap_find_dirty(rs, pss->block, pss->page + 1);
>  if (pss->complete_round && pss->block == rs->last_seen_block &&
>  pss->page >= rs->last_page) {
>  /*
> @@ -1564,7 +1564,7 @@ static bool find_dirty_block(RAMState *rs, 
> PageSearchStatus *pss, bool *again)
>  if (!offset_in_ramblock(pss->block,
>  ((ram_addr_t)pss->page) << TARGET_PAGE_BITS)) {
>  /* Didn't find anything in this RAM Block */
> -pss->page = 0;
> +pss->page = -1;
>  pss->block = QLIST_NEXT_RCU(pss->block, next);
>  if (!pss->block) {
>  /*
> @@ -2694,7 +2694,7 @@ static void ram_state_reset(RAMState *rs)
>  {
>  rs->last_seen_block = NULL;
>  rs->last_sent_block = NULL;
> -rs->last_page = 0;
> +rs->last_page = -1;
>  rs->last_version = ram_list.version;
>  rs->xbzrle_enabled = false;
>  postcopy_preempt_reset(rs);
> @@ -2889,7 +2889,7 @@ void ram_postcopy_send_discard_bitmap(MigrationState 
> *ms)
>  /* Easiest way to make sure we don't resume in the middle of a host-page 
> */
>  rs->last_seen_block = NULL;
>  rs->last_sent_block = NULL;
> -rs->last_page = 0;
> +rs->last_page = -1;
>  
>  postcopy_each_ram_send_discard(ms);
>  
> -- 
> 2.25.1
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [RFC PATCH v2 2/4] acpi: fadt: support revision 6.0 of the ACPI specification

2022-10-11 Thread Miguel Luis

> On 11 Oct 2022, at 05:02, Ani Sinha  wrote:
> 
> On Mon, Oct 10, 2022 at 6:53 PM Miguel Luis  wrote:
>> 
>> Update the Fixed ACPI Description Table (FADT) to revision 6.0 of the ACPI
>> specification adding the field "Hypervisor Vendor Identity" that was missing.
>> 
>> This field's description states the following: "64-bit identifier of 
>> hypervisor
>> vendor. All bytes in this field are considered part of the vendor identity.
>> These identifiers are defined independently by the vendors themselves,
>> usually following the name of the hypervisor product. Version information
>> should NOT be included in this field - this shall simply denote the vendor's
>> name or identifier. Version information can be communicated through a
>> supplemental vendor-specific hypervisor API. Firmware implementers would
>> place zero bytes into this field, denoting that no hypervisor is present in
>> the actual firmware."
>> 
>> Hereupon, what should a valid identifier of an Hypervisor Vendor ID be and
>> where should QEMU provide that information?
>> 
>> On the v1 [1] of this RFC there's the suggestion of having this information
>> in sync by the current acceleration name. This also seems to imply that QEMU,
>> which generates the FADT table, and the FADT consumer need to be in sync with
>> the values of this field.
>> 
>> This version follows Ani Sinha's suggestion [2] of using "QEMU" for the
>> hypervisor vendor ID.
>> 
>> [1]: https://lists.nongnu.org/archive/html/qemu-devel/2022-10/msg00911.html
>> [2]: https://lists.nongnu.org/archive/html/qemu-devel/2022-10/msg00989.html
>> 
>> Signed-off-by: Miguel Luis 
> 
> Reviewed-by: Ani Sinha 
> 

Thank you Ani. In the meantime, taking the part of the description that says
"Firmware implementers would place zero bytes into this field, denoting that
no hypervisor is present in the actual firmware", I arrived at something
along the lines below:

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 42feb4d4d7..e719afe0cb 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -2198,7 +2198,11 @@ void build_fadt(GArray *tbl, BIOSLinker *linker, const 
AcpiFadtData *f,
 }
 
 /* Hypervisor Vendor Identity */
-build_append_padded_str(tbl, "QEMU", 8, '\0');
+if (f->hyp_is_present) {
+build_append_padded_str(tbl, "QEMU", 8, '\0');
+} else {
+build_append_int_noprefix(tbl, 0, 8);
+}
 
 /* TODO: extra fields need to be added to support revisions above rev6 */
 assert(f->rev == 6);
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 72bb6f61a5..d238ce2b88 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -818,6 +818,7 @@ static void build_fadt_rev6(GArray *table_data, BIOSLinker 
*linker,
 .minor_ver = 0,
 .flags = 1 << ACPI_FADT_F_HW_REDUCED_ACPI,
 .xdsdt_tbl_offset = &dsdt_tbl_offset,
+.hyp_is_present = vms->virt && (kvm_enabled() || hvf_enabled()),
 };
 
 switch (vms->psci_conduit) {
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index 2b42e4192b..2aff5304af 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -79,7 +79,7 @@ typedef struct AcpiFadtData {
 uint16_t arm_boot_arch;/* ARM_BOOT_ARCH */
 uint16_t iapc_boot_arch;   /* IAPC_BOOT_ARCH */
 uint8_t minor_ver; /* FADT Minor Version */
-
+bool hyp_is_present;
 /*
  * respective tables offsets within ACPI_BUILD_TABLE_FILE,
  * NULL if table doesn't exist (in that case field's value

Any thoughts on this?

Thanks
Miguel

>> ---
>> hw/acpi/aml-build.c  | 13 ++---
>> hw/arm/virt-acpi-build.c | 10 +-
>> 2 files changed, 15 insertions(+), 8 deletions(-)
>> 
>> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
>> index e6bfac95c7..42feb4d4d7 100644
>> --- a/hw/acpi/aml-build.c
>> +++ b/hw/acpi/aml-build.c
>> @@ -2070,7 +2070,7 @@ void build_pptt(GArray *table_data, BIOSLinker 
>> *linker, MachineState *ms,
>> acpi_table_end(linker, &table);
>> }
>> 
>> -/* build rev1/rev3/rev5.1 FADT */
>> +/* build rev1/rev3/rev5.1/rev6.0 FADT */
>> void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
>> const char *oem_id, const char *oem_table_id)
>> {
>> @@ -2193,8 +2193,15 @@ void build_fadt(GArray *tbl, BIOSLinker *linker, 
>> const AcpiFadtData *f,
>> /* SLEEP_STATUS_REG */
>> build_append_gas_from_struct(tbl, &f->sleep_sts);
>> 
>> -/* TODO: extra fields need to be added to support revisions above rev5 
>> */
>> -assert(f->rev == 5);
>> +if (f->rev == 5) {
>> +goto done;
>> +}
>> +
>> +/* Hypervisor Vendor Identity */
>> +build_append_padded_str(tbl, "QEMU", 8, '\0');
>> +
>> +/* TODO: extra fields need to be added to support revisions above rev6 
>> */
>> +assert(f->rev == 6);
>> 
>> done:
>> acpi_table_end(linker, &table);
>> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
>> index 9b3aee01bf..7

[PATCH v5 3/6] vdpa: Add asid parameter to vhost_vdpa_dma_map/unmap

2022-10-11 Thread Eugenio Pérez
So the caller can choose which ASID the mapping is destined for.

No need to update the batch functions as they will always be called from
memory listener updates at the moment. Memory listener updates will
always update ASID 0, as it's the passthrough ASID.

All vhost devices' ASIDs are 0 at this moment.

Signed-off-by: Eugenio Pérez 
---
v5:
* Solve conflict, now vhost_vdpa_svq_unmap_ring returns void
* Change comment on zero initialization.

v4: Add comment specifying behavior if device does not support _F_ASID

v3: Deleted unneeded space
---
 include/hw/virtio/vhost-vdpa.h |  8 +---
 hw/virtio/vhost-vdpa.c | 29 +++--
 net/vhost-vdpa.c   |  6 +++---
 hw/virtio/trace-events |  4 ++--
 4 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
index d85643..6560bb9d78 100644
--- a/include/hw/virtio/vhost-vdpa.h
+++ b/include/hw/virtio/vhost-vdpa.h
@@ -29,6 +29,7 @@ typedef struct vhost_vdpa {
 int index;
 uint32_t msg_type;
 bool iotlb_batch_begin_sent;
+uint32_t address_space_id;
 MemoryListener listener;
 struct vhost_vdpa_iova_range iova_range;
 uint64_t acked_features;
@@ -42,8 +43,9 @@ typedef struct vhost_vdpa {
 VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
 } VhostVDPA;
 
-int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
-   void *vaddr, bool readonly);
-int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size);
+int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
+   hwaddr size, void *vaddr, bool readonly);
+int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
+ hwaddr size);
 
 #endif
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index d966966131..ad663feacc 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -72,22 +72,24 @@ static bool 
vhost_vdpa_listener_skipped_section(MemoryRegionSection *section,
 return false;
 }
 
-int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, hwaddr size,
-   void *vaddr, bool readonly)
+int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
+   hwaddr size, void *vaddr, bool readonly)
 {
 struct vhost_msg_v2 msg = {};
 int fd = v->device_fd;
 int ret = 0;
 
 msg.type = v->msg_type;
+msg.asid = asid; /* 0 if vdpa device does not support asid */
 msg.iotlb.iova = iova;
 msg.iotlb.size = size;
 msg.iotlb.uaddr = (uint64_t)(uintptr_t)vaddr;
 msg.iotlb.perm = readonly ? VHOST_ACCESS_RO : VHOST_ACCESS_RW;
 msg.iotlb.type = VHOST_IOTLB_UPDATE;
 
-   trace_vhost_vdpa_dma_map(v, fd, msg.type, msg.iotlb.iova, msg.iotlb.size,
-msg.iotlb.uaddr, msg.iotlb.perm, msg.iotlb.type);
+trace_vhost_vdpa_dma_map(v, fd, msg.type, msg.asid, msg.iotlb.iova,
+ msg.iotlb.size, msg.iotlb.uaddr, msg.iotlb.perm,
+ msg.iotlb.type);
 
 if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
 error_report("failed to write, fd=%d, errno=%d (%s)",
@@ -98,18 +100,24 @@ int vhost_vdpa_dma_map(struct vhost_vdpa *v, hwaddr iova, 
hwaddr size,
 return ret;
 }
 
-int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, hwaddr iova, hwaddr size)
+int vhost_vdpa_dma_unmap(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
+ hwaddr size)
 {
 struct vhost_msg_v2 msg = {};
 int fd = v->device_fd;
 int ret = 0;
 
 msg.type = v->msg_type;
+/*
+ * The caller must set asid = 0 if the device does not support asid.
+ * This is not an ABI break since it is set to 0 by the initializer anyway.
+ */
+msg.asid = asid;
 msg.iotlb.iova = iova;
 msg.iotlb.size = size;
 msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
 
-trace_vhost_vdpa_dma_unmap(v, fd, msg.type, msg.iotlb.iova,
+trace_vhost_vdpa_dma_unmap(v, fd, msg.type, msg.asid, msg.iotlb.iova,
msg.iotlb.size, msg.iotlb.type);
 
 if (write(fd, &msg, sizeof(msg)) != sizeof(msg)) {
@@ -229,7 +237,7 @@ static void vhost_vdpa_listener_region_add(MemoryListener 
*listener,
 }
 
 vhost_vdpa_iotlb_batch_begin_once(v);
-ret = vhost_vdpa_dma_map(v, iova, int128_get64(llsize),
+ret = vhost_vdpa_dma_map(v, 0, iova, int128_get64(llsize),
  vaddr, section->readonly);
 if (ret) {
 error_report("vhost vdpa map fail!");
@@ -303,7 +311,7 @@ static void vhost_vdpa_listener_region_del(MemoryListener 
*listener,
 vhost_iova_tree_remove(v->iova_tree, *result);
 }
 vhost_vdpa_iotlb_batch_begin_once(v);
-ret = vhost_vdpa_dma_unmap(v, iova, int128_get64(llsize));
+ret = vhost_vdpa_dma_unmap(v, 0, iova, int128_get64(llsize));
 if (ret) {
 error_report("vhost_vdpa dma unmap error!");

Re: [PATCH v5 2/9] tests/x86: Add subtest with 'q35' machine type to device-plug-test

2022-10-11 Thread Thomas Huth

On 30/09/2022 00.35, Michael Labiuk via wrote:

Configure pci bridge setting to plug pci device and unplug.

Signed-off-by: Michael Labiuk 
---
  tests/qtest/device-plug-test.c | 41 ++
  1 file changed, 41 insertions(+)


Reviewed-by: Thomas Huth 





[PATCH v3 1/5] hw/smbios: add core_count2 to smbios table type 4

2022-10-11 Thread Julia Suvorova
In order to use the increased number of cpus, we need to bring smbios
tables in line with the SMBIOS 3.0 specification. This allows us to
introduce core_count2 which acts as a duplicate of core_count if we have
fewer than 256 cores, and contains the actual core count per socket if
we have more.

core_enabled2 and thread_count2 fields work the same way.

Signed-off-by: Julia Suvorova 
Reviewed-by: Igor Mammedov 
Message-Id: <20220731162141.178443-2-jus...@redhat.com>
---
 hw/smbios/smbios.c   | 19 ---
 hw/smbios/smbios_build.h |  9 +++--
 include/hw/firmware/smbios.h | 12 
 3 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index 4c9f664830..591481d449 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -681,8 +681,14 @@ static void smbios_build_type_3_table(void)
 static void smbios_build_type_4_table(MachineState *ms, unsigned instance)
 {
 char sock_str[128];
+size_t tbl_len = SMBIOS_TYPE_4_LEN_V28;
 
-SMBIOS_BUILD_TABLE_PRE(4, T4_BASE + instance, true); /* required */
+if (smbios_ep_type == SMBIOS_ENTRY_POINT_TYPE_64) {
+tbl_len = SMBIOS_TYPE_4_LEN_V30;
+}
+
+SMBIOS_BUILD_TABLE_PRE_SIZE(4, T4_BASE + instance,
+true, tbl_len); /* required */
 
 snprintf(sock_str, sizeof(sock_str), "%s%2x", type4.sock_pfx, instance);
 SMBIOS_TABLE_SET_STR(4, socket_designation_str, sock_str);
@@ -709,8 +715,15 @@ static void smbios_build_type_4_table(MachineState *ms, 
unsigned instance)
 SMBIOS_TABLE_SET_STR(4, serial_number_str, type4.serial);
 SMBIOS_TABLE_SET_STR(4, asset_tag_number_str, type4.asset);
 SMBIOS_TABLE_SET_STR(4, part_number_str, type4.part);
-t->core_count = t->core_enabled = ms->smp.cores;
-t->thread_count = ms->smp.threads;
+
+t->core_count = (ms->smp.cores > 255) ? 0xFF : ms->smp.cores;
+t->core_enabled = t->core_count;
+
+t->core_count2 = t->core_enabled2 = cpu_to_le16(ms->smp.cores);
+
+t->thread_count = (ms->smp.threads > 255) ? 0xFF : ms->smp.threads;
+t->thread_count2 = cpu_to_le16(ms->smp.threads);
+
 t->processor_characteristics = cpu_to_le16(0x02); /* Unknown */
 t->processor_family2 = cpu_to_le16(0x01); /* Other */
 
diff --git a/hw/smbios/smbios_build.h b/hw/smbios/smbios_build.h
index 56b5a1e3f3..351660024e 100644
--- a/hw/smbios/smbios_build.h
+++ b/hw/smbios/smbios_build.h
@@ -27,6 +27,11 @@ extern unsigned smbios_table_max;
 extern unsigned smbios_table_cnt;
 
 #define SMBIOS_BUILD_TABLE_PRE(tbl_type, tbl_handle, tbl_required)\
+SMBIOS_BUILD_TABLE_PRE_SIZE(tbl_type, tbl_handle, tbl_required,   \
+sizeof(struct smbios_type_##tbl_type))\
+
+#define SMBIOS_BUILD_TABLE_PRE_SIZE(tbl_type, tbl_handle, \
+tbl_required, tbl_len)\
 struct smbios_type_##tbl_type *t; \
 size_t t_off; /* table offset into smbios_tables */   \
 int str_index = 0;\
@@ -39,12 +44,12 @@ extern unsigned smbios_table_cnt;
 /* use offset of table t within smbios_tables */  \
 /* (pointer must be updated after each realloc) */\
 t_off = smbios_tables_len;\
-smbios_tables_len += sizeof(*t);  \
+smbios_tables_len += tbl_len; \
 smbios_tables = g_realloc(smbios_tables, smbios_tables_len);  \
 t = (struct smbios_type_##tbl_type *)(smbios_tables + t_off); \
   \
 t->header.type = tbl_type;\
-t->header.length = sizeof(*t);\
+t->header.length = tbl_len;   \
 t->header.handle = cpu_to_le16(tbl_handle);   \
 } while (0)
 
diff --git a/include/hw/firmware/smbios.h b/include/hw/firmware/smbios.h
index 4b7ad77a44..9615446f5d 100644
--- a/include/hw/firmware/smbios.h
+++ b/include/hw/firmware/smbios.h
@@ -18,6 +18,8 @@
 
 
 #define SMBIOS_MAX_TYPE 127
+#define offsetofend(TYPE, MEMBER) \
+   (offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))
 
 /* memory area description, used by type 19 table */
 struct smbios_phys_mem_area {
@@ -187,8 +189,18 @@ struct smbios_type_4 {
 uint8_t thread_count;
 uint16_t processor_characteristics;
 uint16_t processor_family2;
+/* SMBIOS spec 3.0.0, Table 21 */
+uint16_t core_count2;
+uint16_t core_enabled2;
+uint16_t thread_count2;
 } QEMU_PACKED;
 
+typedef enum smbios_type_4_len_ver {
+SMBIOS_TYPE_4_LEN_V28 = offsetofend(struct smbios_type_4,
+ 

Re: [RFC PATCH 6/6] hw/cxl/mailbox: Wire up Get/Set Event Interrupt policy

2022-10-11 Thread Jonathan Cameron via
On Mon, 10 Oct 2022 15:29:44 -0700
ira.we...@intel.com wrote:

> From: Ira Weiny 
> 
> Replace the stubbed out CXL Get/Set Event interrupt policy mailbox
> commands.  Enable those commands to control interrupts for each of the
> event log types.
> 
> Signed-off-by: Ira Weiny 
A few trivial comments inline.

Thanks,

Jonathan

> ---
>  hw/cxl/cxl-mailbox-utils.c  | 129 ++--
>  include/hw/cxl/cxl_events.h |  21 ++
>  2 files changed, 129 insertions(+), 21 deletions(-)
> 
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index df345f23a30c..52e8804c24ed 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -101,25 +101,6 @@ struct cxl_cmd {
>  uint8_t *payload;
>  };
>  
> -#define DEFINE_MAILBOX_HANDLER_ZEROED(name, size) \
> -uint16_t __zero##name = size; \
> -static ret_code cmd_##name(struct cxl_cmd *cmd,   \
> -   CXLDeviceState *cxl_dstate, uint16_t *len) \
> -{ \
> -*len = __zero##name;  \
> -memset(cmd->payload, 0, *len);\
> -return CXL_MBOX_SUCCESS;  \
> -}
> -#define DEFINE_MAILBOX_HANDLER_NOP(name)  \
> -static ret_code cmd_##name(struct cxl_cmd *cmd,   \
> -   CXLDeviceState *cxl_dstate, uint16_t *len) \
> -{ \
> -return CXL_MBOX_SUCCESS;  \
> -}
> -
> -DEFINE_MAILBOX_HANDLER_ZEROED(events_get_interrupt_policy, 4);
> -DEFINE_MAILBOX_HANDLER_NOP(events_set_interrupt_policy);
> -
>  static ret_code cmd_events_get_records(struct cxl_cmd *cmd,
> CXLDeviceState *cxlds,
> uint16_t *len)
> @@ -218,6 +199,110 @@ static ret_code cmd_events_clear_records(struct cxl_cmd 
> *cmd,
>  return CXL_MBOX_SUCCESS;
>  }
>  
> +static ret_code cmd_events_get_interrupt_policy(struct cxl_cmd *cmd,
> +CXLDeviceState *cxl_dstate,
> +uint16_t *len)
> +{
> +struct cxl_event_interrupt_policy *policy;
> +struct cxl_event_log *log;
> +
> +policy = (struct cxl_event_interrupt_policy *)cmd->payload;
> +memset(policy, 0, sizeof(*policy));
> +
> +log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_INFO);

Less obvious than below case, but again, perhaps a little utility function
to cut down on duplication.

> +if (log->irq_enabled) {
> +policy->info_settings = CXL_EVENT_INT_SETTING(log->irq_vec);
> +}
> +
> +log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_WARN);
> +if (log->irq_enabled) {
> +policy->warn_settings = CXL_EVENT_INT_SETTING(log->irq_vec);
> +}
> +
> +log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_FAIL);
> +if (log->irq_enabled) {
> +policy->failure_settings = CXL_EVENT_INT_SETTING(log->irq_vec);
> +}
> +
> +log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_FATAL);
> +if (log->irq_enabled) {
> +policy->fatal_settings = CXL_EVENT_INT_SETTING(log->irq_vec);
> +}
> +
> +log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_DYNAMIC_CAP);
> +if (log->irq_enabled) {
> +/* Dynamic Capacity borrows the same vector as info */
> +policy->dyn_cap_settings = CXL_INT_MSI_MSIX;
> +}
> +
> +*len = sizeof(*policy);
> +return CXL_MBOX_SUCCESS;
> +}
> +
> +static ret_code cmd_events_set_interrupt_policy(struct cxl_cmd *cmd,
> +CXLDeviceState *cxl_dstate,
> +uint16_t *len)
> +{
> +struct cxl_event_interrupt_policy *policy;
> +struct cxl_event_log *log;
> +
> +policy = (struct cxl_event_interrupt_policy *)cmd->payload;
> +
> +log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_INFO);
Maybe a utility function?

set_int_policy(cxl_dstate, CXL_EVENT_TYPE_INFO,
   policy->info_settings);
set_int_policy(cxl_dstate, CXL_EVENT_TYPE_WARN,
   policy->warn_settings);
etc


> +if ((policy->info_settings & CXL_EVENT_INT_MODE_MASK) ==
> +CXL_INT_MSI_MSIX) {
> +log->irq_enabled = true;
> +log->irq_vec = cxl_dstate->event_vector[CXL_EVENT_TYPE_INFO];
> +} else {
> +log->irq_enabled = false;
> +log->irq_vec = 0;
> +}
> +
> +log = find_event_log(cxl_dstate, CXL_EVENT_TYPE_WARN);
> +if ((policy->warn_settings & CXL_EVENT_INT_MODE_MASK) ==
> +  

Re: [PATCH 49/51] io/channel-watch: Fix socket watch on Windows

2022-10-11 Thread Bin Meng
Hi Paolo,

On Thu, Oct 6, 2022 at 11:03 AM Bin Meng  wrote:
>
> Hi Paolo,
>
> On Wed, Sep 28, 2022 at 2:10 PM Bin Meng  wrote:
> >
> > Hi Paolo,
> >
> > On Wed, Sep 21, 2022 at 9:02 AM Bin Meng  wrote:
> > >
> > > On Wed, Sep 14, 2022 at 4:08 PM Bin Meng  wrote:
> > > >
> > > > On Wed, Sep 7, 2022 at 1:07 PM Bin Meng  wrote:
> > > > >
> > > > > Hi Clément,
> > > > >
> > > > > On Tue, Sep 6, 2022 at 8:06 PM Clément Chigot  
> > > > > wrote:
> > > > > >
> > > > > > > > > I checked your patch, what you did seems to be something one 
> > > > > > > > > would
> > > > > > > > > naturally write, but what is currently in the QEMU sources 
> > > > > > > > > seems to be
> > > > > > > > > written intentionally.
> > > > > > > > >
> > > > > > > > > +Paolo Bonzini , you are the one who implemented the socket 
> > > > > > > > > watch on
> > > > > > > > > Windows. Could you please help analyze this issue?
> > > > > > > > >
> > > > > > > > > > to avoid WSAEnumNetworkEvents for the master GSource which 
> > > > > > > > > > only has
> > > > > > > > > > G_IO_HUP (or for any GSource having only that).
> > > > > > > > > > As I said above, the current code doesn't do anything with 
> > > > > > > > > > it anyway.
> > > > > > > > > > So, IMO, it's safe to do so.
> > > > > > > > > >
> > > > > > > > > > I'll send you my patch attached. I was planning to send it 
> > > > > > > > > > in the following
> > > > > > > > > > weeks anyway. I was just waiting to be sure everything 
> > > > > > > > > > looks fine on our
> > > > > > > > > > CI. Feel free to test and modify it if needed.
> > > > > > > > >
> > > > > > > > > I tested your patch. Unfortunately there is still one test 
> > > > > > > > > case
> > > > > > > > > (migration-test.exe) throwing up the "Broken pipe" message.
> > > > > > > >
> > > > > > > > I must say I didn't fully test it against qemu testsuite yet. 
> > > > > > > > Maybe there are
> > > > > > > > some refinements to be done. "Broken pipe" might be linked to 
> > > > > > > > the missing
> > > > > > > > G_IO_HUP support.
> > > > > > > >
> > > > > > > > > Can you test my patch instead to see if your gdb issue can be 
> > > > > > > > > fixed?
> > > > > > > >
> > > > > > > > Yeah sure. I'll try to do it this afternoon.
> > > > > >
> > > > > > I can't explain how mad at me I am... I'm pretty sure your patch 
> > > > > > was the first
> > > > > > thing I've tried when I encountered this issue. But it wasn't 
> > > > > > working
> > > > > > or IIRC the
> > > > > > issue went away but that was because the polling was actually 
> > > > > > disabled (looping
> > > > > > indefinitely)...I'm suspecting that I already had changed the 
> > > > > > CreateEvent for
> > > > > > WSACreateEvent which forces you to handle the reset.
> > > > > > Finally, I end up struggling reworking the whole check function...
> > > > > > But yeah, your patch does work fine on my gdb issues too.
> > > > >
> > > > > Good to know this patch works for you too.
> > > > >
> > > > > > And I guess the events are reset when recv() is being called 
> > > > > > because of the
> > > > > > auto-reset feature set up by CreateEvent().
> > > > > > IIUC, what Marc-André means by busy loop is the polling being 
> > > > > > looping
> > > > > > indefinitely as I encountered. I can ensure that this patch doesn't 
> > > > > > do that.
> > > > > > It can be easily checked by setting the env variable 
> > > > > > G_MAIN_POLL_DEBUG.
> > > > > > It'll show what g_poll is doing and it's normally always available 
> > > > > > on
> > > > > > Windows.
> > > > > >
> > > > > > Anyway, we'll wait for Paolo to see if he remembers why he had to 
> > > > > > call
> > > > > > WSAEnumNetworkEvents. Otherwise, let's go for your patch. Mine might
> > > > > > be a good start to improve the whole polling on Windows but if it 
> > > > > > doesn't
> > > > > > work in your case, it then needs some refinements.
> > > > > >
> > > > >
> > > > > Yeah, this issue bugged me quite a lot. If we want to reset the event
> > > > > in qio_channel_socket_source_check(), we will have to do the following
> > > > > to make sure qtests are happy.
> > > > >
> > > > > diff --git a/io/channel-watch.c b/io/channel-watch.c
> > > > > index 43d38494f7..f1e1650b81 100644
> > > > > --- a/io/channel-watch.c
> > > > > +++ b/io/channel-watch.c
> > > > > @@ -124,8 +124,6 @@ qio_channel_socket_source_check(GSource *source)
> > > > > return 0;
> > > > > }
> > > > > - WSAEnumNetworkEvents(ssource->socket, ssource->ioc->event, &ev);
> > > > > -
> > > > > FD_ZERO(&rfds);
> > > > > FD_ZERO(&wfds);
> > > > > FD_ZERO(&xfds);
> > > > > @@ -153,6 +151,10 @@ qio_channel_socket_source_check(GSource *source)
> > > > > ssource->revents |= G_IO_PRI;
> > > > > }
> > > > > + if (ssource->revents) {
> > > > > + WSAEnumNetworkEvents(ssource->socket, ssource->ioc->event, &ev);
> > > > > + }
> > > > > +
> > > > > return ssource->revents;
> > > > > }
> > > > >
> > > > > Removing "if (ssource->revents)" won't work.
> > > > >
> > > > > It seems to me tha

[PATCH v3 3/5] tests/acpi: allow changes for core_count2 test

2022-10-11 Thread Julia Suvorova
Signed-off-by: Julia Suvorova 
Message-Id: <20220731162141.178443-4-jus...@redhat.com>
---
 tests/data/acpi/q35/APIC.core-count2| 0
 tests/data/acpi/q35/DSDT.core-count2| 0
 tests/data/acpi/q35/FACP.core-count2| 0
 tests/qtest/bios-tables-test-allowed-diff.h | 3 +++
 4 files changed, 3 insertions(+)
 create mode 100644 tests/data/acpi/q35/APIC.core-count2
 create mode 100644 tests/data/acpi/q35/DSDT.core-count2
 create mode 100644 tests/data/acpi/q35/FACP.core-count2

diff --git a/tests/data/acpi/q35/APIC.core-count2 b/tests/data/acpi/q35/APIC.core-count2
new file mode 100644
index 00..e69de29bb2
diff --git a/tests/data/acpi/q35/DSDT.core-count2 b/tests/data/acpi/q35/DSDT.core-count2
new file mode 100644
index 00..e69de29bb2
diff --git a/tests/data/acpi/q35/FACP.core-count2 b/tests/data/acpi/q35/FACP.core-count2
new file mode 100644
index 00..e69de29bb2
diff --git a/tests/qtest/bios-tables-test-allowed-diff.h b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..e81dc67a2e 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,4 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/q35/APIC.core-count2",
+"tests/data/acpi/q35/DSDT.core-count2",
+"tests/data/acpi/q35/FACP.core-count2",
-- 
2.37.3




[PATCH v3 5/5] tests/acpi: update tables for new core count test

2022-10-11 Thread Julia Suvorova
Changes in the tables (for 275 cores):
FACP:
+ Use APIC Cluster Model (V4) : 1

APIC:
+[02Ch 0044   1]Subtable Type : 00 [Processor Local APIC]
+[02Dh 0045   1]   Length : 08
+[02Eh 0046   1] Processor ID : 00
+[02Fh 0047   1]Local Apic ID : 00
+[030h 0048   4]Flags (decoded below) : 0001
+   Processor Enabled : 1
...
+
+[81Ch 2076   1]Subtable Type : 00 [Processor Local APIC]
+[81Dh 2077   1]   Length : 08
+[81Eh 2078   1] Processor ID : FE
+[81Fh 2079   1]Local Apic ID : FE
+[820h 2080   4]Flags (decoded below) : 0001
+   Processor Enabled : 1
+  Runtime Online Capable : 0
+
+[824h 2084   1]Subtable Type : 09 [Processor Local x2APIC]
+[825h 2085   1]   Length : 10
+[826h 2086   2] Reserved : 
+[828h 2088   4]  Processor x2Apic ID : 00FF
+[82Ch 2092   4]Flags (decoded below) : 0001
+   Processor Enabled : 1
+[830h 2096   4]Processor UID : 00FF
...

DSDT:
+Processor (C001, 0x01, 0x, 0x00)
+{
+Method (_STA, 0, Serialized)  // _STA: Status
+{
+Return (CSTA (One))
+}
+
+Name (_MAT, Buffer (0x08)  // _MAT: Multiple APIC Table Entry
+{
+ 0x00, 0x08, 0x01, 0x01, 0x01, 0x00, 0x00, 0x00   // 

+})
+Method (_EJ0, 1, NotSerialized)  // _EJx: Eject Device, x=0-9
+{
+CEJ0 (One)
+}
+
+Method (_OST, 3, Serialized)  // _OST: OSPM Status Indication
+{
+COST (One, Arg0, Arg1, Arg2)
+}
+}
...
+Processor (C0FE, 0xFE, 0x, 0x00)
+{
+Method (_STA, 0, Serialized)  // _STA: Status
+{
+Return (CSTA (0xFE))
+}
+
+Name (_MAT, Buffer (0x08)  // _MAT: Multiple APIC Table Entry
+{
+ 0x00, 0x08, 0xFE, 0xFE, 0x01, 0x00, 0x00, 0x00   // 

+})
+Method (_EJ0, 1, NotSerialized)  // _EJx: Eject Device, x=0-9
+{
+CEJ0 (0xFE)
+}
+
+Method (_OST, 3, Serialized)  // _OST: OSPM Status Indication
+{
+COST (0xFE, Arg0, Arg1, Arg2)
+}
+}
+
+Device (C0FF)
+{
+Name (_HID, "ACPI0007" /* Processor Device */)  // _HID: 
Hardware ID
+Name (_UID, 0xFF)  // _UID: Unique ID
+Method (_STA, 0, Serialized)  // _STA: Status
+{
+Return (CSTA (0xFF))
+}
+
+Name (_MAT, Buffer (0x10)  // _MAT: Multiple APIC Table Entry
+{
+/*  */  0x09, 0x10, 0x00, 0x00, 0xFF, 0x00, 0x00, 
0x00,  // 
+/* 0008 */  0x01, 0x00, 0x00, 0x00, 0xFF, 0x00, 0x00, 0x00 
  // 
+})
+Method (_EJ0, 1, NotSerialized)  // _EJx: Eject Device, x=0-9
+{
+CEJ0 (0xFF)
+}
+
+Method (_OST, 3, Serialized)  // _OST: OSPM Status Indication
+{
+COST (0xFF, Arg0, Arg1, Arg2)
+}
+}
+
...

Signed-off-by: Julia Suvorova 
Message-Id: <20220731162141.178443-6-jus...@redhat.com>
---
 tests/data/acpi/q35/APIC.core-count2| Bin 0 -> 2478 bytes
 tests/data/acpi/q35/DSDT.core-count2| Bin 0 -> 32414 bytes
 tests/data/acpi/q35/FACP.core-count2| Bin 0 -> 244 bytes
 tests/qtest/bios-tables-test-allowed-diff.h |   3 ---
 4 files changed, 3 deletions(-)

diff --git a/tests/data/acpi/q35/APIC.core-count2 b/tests/data/acpi/q35/APIC.core-count2
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..a255082ef5bc39f0d92d3e372b91f09dd6d0d9a1 100644
GIT binary patch
literal 2478
zcmXZeWl$AS7=Youz=a#M-K}6ZwgLuNAQ;$~*xjvQcY@uCVs{~Sf`WpLVt2Rbe!ORA
z_B`J^b9R56*&pj2=M2p(=#;b`y@>U1KQZ2
ztu5Nwq0xx;_UPb%CKH;?XtAKxijI!xf-7uzGc@Q3Gq%
z#9Fnmc5SRv2fe+~#|M4+PE2*{()H?L{rcFT0s8r&zdtr?h>aRyuWjcwXs+qT%Q9ky?e9Xepgju;w>ojPIX&e)|3
zcI}GYx?%V37#4;-dSK6<*sB-z?u~u=VBfyjuOIgBj{^qaz=1eu5Dp%ULx$kcp*U<9
z4j+yqM&QViIBFD*9*twh;MlP^ZXAvuj}s=~#ECd*5{8FkL5Yu4b}wYY8_u3wKEHsHpMxM>q^-i%we;MT3UZ5u{MiV+Y2>;Le@6
zYZva`jeGXs-o3bQAMW3e2M*xDgLvo=9zKjmj^NRwcZnq0$#t4H*R2JA|@r_&6{}Z
z7A7ZSN($b-jd$+g-Me`29^Su?4<6vdhnSj*j~?OU$C#FePoCh@r}*p{K7WocUf|1@
z`05qDevNP5;M=$O?j62=j~_nZ$B+2w6Mp`TU%ueiulVg7e*ca?e&Ela`0E$`{*8bB
z;NQQPo-UeQHSM3S%%ZeJ#vXl>ElNA87Nwn3i_*@jMQIn+qO_}OQQA$lDDAE~Lr49*wAgf6Z

[PATCH v5 6/6] vdpa: Always start CVQ in SVQ mode

2022-10-11 Thread Eugenio Pérez
Isolate the control virtqueue in its own group, allowing QEMU to intercept
control commands while letting the dataplane run as pure passthrough to the
guest.

Signed-off-by: Eugenio Pérez 
---
v5:
* Fixing the not adding cvq buffers when x-svq=on is specified.
* Move vring state in vhost_vdpa_get_vring_group instead of using a
  parameter.
* Rename VHOST_VDPA_NET_CVQ_PASSTHROUGH to VHOST_VDPA_NET_DATA_ASID

v4:
* Squash vhost_vdpa_cvq_group_is_independent.
* Rebased on last CVQ start series, that allocated CVQ cmd bufs at load
* Do not check for cvq index on vhost_vdpa_net_prepare, we only have one
  that callback registered in that NetClientInfo.

v3:
* Make asid related queries print a warning instead of returning an
  error and stop the start of qemu.
---
 hw/virtio/vhost-vdpa.c |   3 +-
 net/vhost-vdpa.c   | 118 +++--
 2 files changed, 115 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 29d009c02b..fd4de06eab 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -682,7 +682,8 @@ static int vhost_vdpa_set_backend_cap(struct vhost_dev *dev)
 {
 uint64_t features;
 uint64_t f = 0x1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2 |
-0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH;
+0x1ULL << VHOST_BACKEND_F_IOTLB_BATCH |
+0x1ULL << VHOST_BACKEND_F_IOTLB_ASID;
 int r;
 
 if (vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES, &features)) {
diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index f7831aeb8d..6f6ef59ea3 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -38,6 +38,9 @@ typedef struct VhostVDPAState {
 void *cvq_cmd_out_buffer;
 virtio_net_ctrl_ack *status;
 
+/* Number of address spaces supported by the device */
+unsigned address_space_num;
+
 /* The device always have SVQ enabled */
 bool always_svq;
 bool started;
@@ -102,6 +105,9 @@ static const uint64_t vdpa_svq_device_features =
 BIT_ULL(VIRTIO_NET_F_RSC_EXT) |
 BIT_ULL(VIRTIO_NET_F_STANDBY);
 
+#define VHOST_VDPA_NET_DATA_ASID 0
+#define VHOST_VDPA_NET_CVQ_ASID 1
+
 VHostNetState *vhost_vdpa_get_vhost_net(NetClientState *nc)
 {
 VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);
@@ -226,6 +232,34 @@ static NetClientInfo net_vhost_vdpa_info = {
 .check_peer_type = vhost_vdpa_check_peer_type,
 };
 
+static uint32_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
+{
+struct vhost_vring_state state = {
+.index = vq_index,
+};
+int r = ioctl(device_fd, VHOST_VDPA_GET_VRING_GROUP, &state);
+
+return r < 0 ? 0 : state.num;
+}
+
+static int vhost_vdpa_set_address_space_id(struct vhost_vdpa *v,
+   unsigned vq_group,
+   unsigned asid_num)
+{
+struct vhost_vring_state asid = {
+.index = vq_group,
+.num = asid_num,
+};
+int ret;
+
+ret = ioctl(v->device_fd, VHOST_VDPA_SET_GROUP_ASID, &asid);
+if (unlikely(ret < 0)) {
+warn_report("Can't set vq group %u asid %u, errno=%d (%s)",
+asid.index, asid.num, errno, g_strerror(errno));
+}
+return ret;
+}
+
 static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
 {
 VhostIOVATree *tree = v->iova_tree;
@@ -300,11 +334,50 @@ dma_map_err:
 static int vhost_vdpa_net_cvq_start(NetClientState *nc)
 {
 VhostVDPAState *s;
-int r;
+struct vhost_vdpa *v;
+uint32_t cvq_group;
+int cvq_index, r;
 
 assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
 
 s = DO_UPCAST(VhostVDPAState, nc, nc);
+v = &s->vhost_vdpa;
+
+v->listener_shadow_vq = s->always_svq;
+v->shadow_vqs_enabled = s->always_svq;
+s->vhost_vdpa.address_space_id = VHOST_VDPA_NET_DATA_ASID;
+
+if (s->always_svq) {
+goto out;
+}
+
+if (s->address_space_num < 2) {
+return 0;
+}
+
+/**
+ * Check if all the virtqueues of the virtio device are in a different vq
+ * than the last vq. VQ group of last group passed in cvq_group.
+ */
+cvq_index = v->dev->vq_index_end - 1;
+cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
+for (int i = 0; i < cvq_index; ++i) {
+uint32_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
+
+if (unlikely(group == cvq_group)) {
+warn_report("CVQ %u group is the same as VQ %u one (%u)", cvq_group,
+i, group);
+return 0;
+}
+}
+
+r = vhost_vdpa_set_address_space_id(v, cvq_group, VHOST_VDPA_NET_CVQ_ASID);
+if (r == 0) {
+v->shadow_vqs_enabled = true;
+s->vhost_vdpa.address_space_id = VHOST_VDPA_NET_CVQ_ASID;
+}
+
+out:
 if (!s->vhost_vdpa.shadow_vqs_enabled) {
 return 0;
 }
@@ -576,12 +649,38 @@ static const VhostShadowVirtqueueOps vhost_vdpa_net_svq_ops = {
 .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
 };
 
+static uint32_t vhost_vdp

[PATCH v5 1/6] vdpa: Use v->shadow_vqs_enabled in vhost_vdpa_svqs_start & stop

2022-10-11 Thread Eugenio Pérez
This function used to rely on v->shadow_vqs != NULL to know whether it
must start SVQ or not.

This is no longer valid, as qemu is going to allocate the SVQs
unconditionally (but will only start them conditionally).

Signed-off-by: Eugenio Pérez 
---
 hw/virtio/vhost-vdpa.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 7468e44b87..7f0ff4df5b 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1029,7 +1029,7 @@ static bool vhost_vdpa_svqs_start(struct vhost_dev *dev)
 Error *err = NULL;
 unsigned i;
 
-if (!v->shadow_vqs) {
+if (!v->shadow_vqs_enabled) {
 return true;
 }
 
@@ -1082,7 +1082,7 @@ static void vhost_vdpa_svqs_stop(struct vhost_dev *dev)
 {
 struct vhost_vdpa *v = dev->opaque;
 
-if (!v->shadow_vqs) {
+if (!v->shadow_vqs_enabled) {
 return;
 }
 
-- 
2.31.1




[PATCH v3 2/5] bios-tables-test: teach test to use smbios 3.0 tables

2022-10-11 Thread Julia Suvorova
Introduce the 64-bit entry point. Since we no longer have a total
number of structures, stop scanning for new ones at the end-of-table
structure (type 127).

Signed-off-by: Julia Suvorova 
Reviewed-by: Igor Mammedov 
Message-Id: <20220731162141.178443-3-jus...@redhat.com>
---
 tests/qtest/bios-tables-test.c | 100 +
 1 file changed, 76 insertions(+), 24 deletions(-)

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index 2ebeb530b2..f5fffdc348 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -88,8 +88,8 @@ typedef struct {
 uint64_t rsdp_addr;
 uint8_t rsdp_table[36 /* ACPI 2.0+ RSDP size */];
 GArray *tables;
-uint32_t smbios_ep_addr;
-struct smbios_21_entry_point smbios_ep_table;
+uint64_t smbios_ep_addr[SMBIOS_ENTRY_POINT_TYPE__MAX];
+SmbiosEntryPoint smbios_ep_table;
 uint16_t smbios_cpu_max_speed;
 uint16_t smbios_cpu_curr_speed;
 uint8_t *required_struct_types;
@@ -533,10 +533,9 @@ static void test_acpi_asl(test_data *data)
 free_test_data(&exp_data);
 }
 
-static bool smbios_ep_table_ok(test_data *data)
+static bool smbios_ep2_table_ok(test_data *data, uint32_t addr)
 {
-struct smbios_21_entry_point *ep_table = &data->smbios_ep_table;
-uint32_t addr = data->smbios_ep_addr;
+struct smbios_21_entry_point *ep_table = &data->smbios_ep_table.ep21;
 
 qtest_memread(data->qts, addr, ep_table, sizeof(*ep_table));
 if (memcmp(ep_table->anchor_string, "_SM_", 4)) {
@@ -559,13 +558,29 @@ static bool smbios_ep_table_ok(test_data *data)
 return true;
 }
 
-static void test_smbios_entry_point(test_data *data)
+static bool smbios_ep3_table_ok(test_data *data, uint64_t addr)
+{
+struct smbios_30_entry_point *ep_table = &data->smbios_ep_table.ep30;
+
+qtest_memread(data->qts, addr, ep_table, sizeof(*ep_table));
+if (memcmp(ep_table->anchor_string, "_SM3_", 5)) {
+return false;
+}
+
+if (acpi_calc_checksum((uint8_t *)ep_table, sizeof *ep_table)) {
+return false;
+}
+
+return true;
+}
+
+static SmbiosEntryPointType test_smbios_entry_point(test_data *data)
 {
 uint32_t off;
 
 /* find smbios entry point structure */
 for (off = 0xf0000; off < 0x100000; off += 0x10) {
-uint8_t sig[] = "_SM_";
+uint8_t sig[] = "_SM_", sig3[] = "_SM3_";
 int i;
 
 for (i = 0; i < sizeof sig - 1; ++i) {
@@ -574,14 +589,30 @@ static void test_smbios_entry_point(test_data *data)
 
 if (!memcmp(sig, "_SM_", sizeof sig)) {
 /* signature match, but is this a valid entry point? */
-data->smbios_ep_addr = off;
-if (smbios_ep_table_ok(data)) {
+if (smbios_ep2_table_ok(data, off)) {
+data->smbios_ep_addr[SMBIOS_ENTRY_POINT_TYPE_32] = off;
+}
+}
+
+for (i = 0; i < sizeof sig3 - 1; ++i) {
+sig3[i] = qtest_readb(data->qts, off + i);
+}
+
+if (!memcmp(sig3, "_SM3_", sizeof sig3)) {
+if (smbios_ep3_table_ok(data, off)) {
+data->smbios_ep_addr[SMBIOS_ENTRY_POINT_TYPE_64] = off;
+/* found 64-bit entry point, no need to look for 32-bit one */
 break;
 }
 }
 }
 
-g_assert_cmphex(off, <, 0x100000);
+/* found at least one entry point */
+g_assert_true(data->smbios_ep_addr[SMBIOS_ENTRY_POINT_TYPE_32] ||
+  data->smbios_ep_addr[SMBIOS_ENTRY_POINT_TYPE_64]);
+
+return data->smbios_ep_addr[SMBIOS_ENTRY_POINT_TYPE_64] ?
+   SMBIOS_ENTRY_POINT_TYPE_64 : SMBIOS_ENTRY_POINT_TYPE_32;
 }
 
 static inline bool smbios_single_instance(uint8_t type)
@@ -625,16 +656,23 @@ static bool smbios_cpu_test(test_data *data, uint32_t addr)
 return true;
 }
 
-static void test_smbios_structs(test_data *data)
+static void test_smbios_structs(test_data *data, SmbiosEntryPointType ep_type)
 {
 DECLARE_BITMAP(struct_bitmap, SMBIOS_MAX_TYPE+1) = { 0 };
-struct smbios_21_entry_point *ep_table = &data->smbios_ep_table;
-uint32_t addr = le32_to_cpu(ep_table->structure_table_address);
-int i, len, max_len = 0;
+
+SmbiosEntryPoint *ep_table = &data->smbios_ep_table;
+int i = 0, len, max_len = 0;
 uint8_t type, prv, crt;
+uint64_t addr;
+
+if (ep_type == SMBIOS_ENTRY_POINT_TYPE_32) {
+addr = le32_to_cpu(ep_table->ep21.structure_table_address);
+} else {
+addr = le64_to_cpu(ep_table->ep30.structure_table_address);
+}
 
 /* walk the smbios tables */
-for (i = 0; i < le16_to_cpu(ep_table->number_of_structures); i++) {
+do {
 
 /* grab type and formatted area length from struct header */
 type = qtest_readb(data->qts, addr);
@@ -660,19 +698,33 @@ static void test_smbios_structs(test_data *data)
 }
 
 /* keep track of max. struct size */
-if (max_len < len) {
+if (ep_type == SMBIOS_ENTRY_P

[PATCH v1 2/4] tests/docker: update test-mingw to run single build

2022-10-11 Thread Alex Bennée
While the test-build test happily runs for mingw, the test-mingw case
runs more of the packaging, in line with what our CI does. However, it
fails if we don't find both compilers, as it expects to be run on a
docker image with both.

Remove that distinction and make it work more like the other build
test scripts.

Signed-off-by: Alex Bennée 
---
 tests/docker/test-mingw | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/tests/docker/test-mingw b/tests/docker/test-mingw
index 0bc6d78872..18366972eb 100755
--- a/tests/docker/test-mingw
+++ b/tests/docker/test-mingw
@@ -13,14 +13,12 @@
 
 . common.rc
 
-requires_binary x86_64-w64-mingw32-gcc
-requires_binary i686-w64-mingw32-gcc
+requires_binary x86_64-w64-mingw32-gcc i686-w64-mingw32-gcc
 
 cd "$BUILD_DIR"
 
-for prefix in x86_64-w64-mingw32- i686-w64-mingw32-; do
-TARGET_LIST=${TARGET_LIST:-$DEF_TARGET_LIST} \
-build_qemu --cross-prefix=$prefix \
+TARGET_LIST=${TARGET_LIST:-$DEF_TARGET_LIST} \
+build_qemu \
 --enable-trace-backends=simple \
 --enable-gnutls \
 --enable-nettle \
@@ -29,8 +27,6 @@ for prefix in x86_64-w64-mingw32- i686-w64-mingw32-; do
 --enable-bzip2 \
 --enable-guest-agent \
 --enable-docs
-install_qemu
-make installer
-make clean
-
-done
+install_qemu
+make installer
+make clean
-- 
2.34.1




[PATCH v1 0/4] testing/next hotfix (revert bios build, mingw)

2022-10-11 Thread Alex Bennée
Hi,

Consider this a hotfix testing/next series. I hadn't noticed that the
update to build the BIOSes would trigger a lot of downloading for a
normal build. I've reverted one patch, which stops that from happening,
and we can revisit enabling this in a more sustainable way later.

Also we have updates for the win32/64 builds which didn't make the
last PR, although they currently rely on an out-of-tree libvirt-ci
update.

I'm still without CI minutes so haven't been able to run this through
gitlab yet.

Please review (and push to CI) so I can spin a PR today.

Alex Bennée (4):
  tests/docker: update fedora-win[32|64]-cross with lcitool
  tests/docker: update test-mingw to run single build
  Revert "configure: build ROMs with container-based cross compilers"
  configure: expose the direct container command

 configure |  33 ++-
 tests/docker/dockerfiles/alpine.docker|   2 +-
 tests/docker/dockerfiles/centos8.docker   |   2 +-
 .../dockerfiles/debian-amd64-cross.docker | 234 -
 tests/docker/dockerfiles/debian-amd64.docker  | 236 +-
 .../dockerfiles/debian-arm64-cross.docker | 232 -
 .../dockerfiles/debian-armel-cross.docker | 230 -
 .../dockerfiles/debian-armhf-cross.docker | 232 -
 .../dockerfiles/debian-mips64el-cross.docker  | 226 -
 .../dockerfiles/debian-mipsel-cross.docker| 226 -
 .../dockerfiles/debian-ppc64el-cross.docker   | 230 -
 .../dockerfiles/debian-s390x-cross.docker | 228 -
 .../dockerfiles/fedora-win32-cross.docker | 139 ---
 .../dockerfiles/fedora-win64-cross.docker | 138 +++---
 tests/docker/dockerfiles/fedora.docker| 230 -
 tests/docker/dockerfiles/opensuse-leap.docker |   2 +-
 tests/docker/dockerfiles/ubuntu2004.docker| 234 -
 tests/docker/test-mingw   |  16 +-
 tests/lcitool/libvirt-ci  |   2 +-
 tests/lcitool/refresh |  48 ++--
 20 files changed, 1520 insertions(+), 1400 deletions(-)

-- 
2.34.1




[PATCH v3 4/5] bios-tables-test: add test for number of cores > 255

2022-10-11 Thread Julia Suvorova
The new test runs with a large number of CPUs and checks whether the
core_count field in the SMBIOS type-4 structure (verified in
smbios_cpu_test()) is correct.

Choose q35 as it allows running with -smp > 255.

Signed-off-by: Julia Suvorova 
Message-Id: <20220731162141.178443-5-jus...@redhat.com>
---
 tests/qtest/bios-tables-test.c | 58 ++
 1 file changed, 45 insertions(+), 13 deletions(-)

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index f5fffdc348..4a76befc93 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -92,6 +92,8 @@ typedef struct {
 SmbiosEntryPoint smbios_ep_table;
 uint16_t smbios_cpu_max_speed;
 uint16_t smbios_cpu_curr_speed;
+uint8_t smbios_core_count;
+uint16_t smbios_core_count2;
 uint8_t *required_struct_types;
 int required_struct_types_len;
 QTestState *qts;
@@ -631,29 +633,42 @@ static inline bool smbios_single_instance(uint8_t type)
 }
 }
 
-static bool smbios_cpu_test(test_data *data, uint32_t addr)
+static void smbios_cpu_test(test_data *data, uint32_t addr,
+SmbiosEntryPointType ep_type)
 {
-uint16_t expect_speed[2];
-uint16_t real;
+uint8_t core_count, expected_core_count = data->smbios_core_count;
+uint16_t speed, expected_speed[2];
+uint16_t core_count2, expected_core_count2 = data->smbios_core_count2;
 int offset[2];
 int i;
 
 /* Check CPU speed for backward compatibility */
 offset[0] = offsetof(struct smbios_type_4, max_speed);
 offset[1] = offsetof(struct smbios_type_4, current_speed);
-expect_speed[0] = data->smbios_cpu_max_speed ? : 2000;
-expect_speed[1] = data->smbios_cpu_curr_speed ? : 2000;
+expected_speed[0] = data->smbios_cpu_max_speed ? : 2000;
+expected_speed[1] = data->smbios_cpu_curr_speed ? : 2000;
 
 for (i = 0; i < 2; i++) {
-real = qtest_readw(data->qts, addr + offset[i]);
-if (real != expect_speed[i]) {
-fprintf(stderr, "Unexpected SMBIOS CPU speed: real %u expect %u\n",
-real, expect_speed[i]);
-return false;
-}
+speed = qtest_readw(data->qts, addr + offset[i]);
+g_assert_cmpuint(speed, ==, expected_speed[i]);
 }
 
-return true;
+core_count = qtest_readb(data->qts,
+addr + offsetof(struct smbios_type_4, core_count));
+
+if (expected_core_count) {
+g_assert_cmpuint(core_count, ==, expected_core_count);
+}
+
+if (ep_type == SMBIOS_ENTRY_POINT_TYPE_64) {
+core_count2 = qtest_readw(data->qts,
+  addr + offsetof(struct smbios_type_4, core_count2));
+
+/* Core Count has reached its limit, checking Core Count 2 */
+if (expected_core_count == 0xFF && expected_core_count2) {
+g_assert_cmpuint(core_count2, ==, expected_core_count2);
+}
+}
 }
 
 static void test_smbios_structs(test_data *data, SmbiosEntryPointType ep_type)
@@ -686,7 +701,7 @@ static void test_smbios_structs(test_data *data, SmbiosEntryPointType ep_type)
 set_bit(type, struct_bitmap);
 
 if (type == 4) {
-g_assert(smbios_cpu_test(data, addr));
+smbios_cpu_test(data, addr, ep_type);
 }
 
 /* seek to end of unformatted string area of this struct ("\0\0") */
@@ -908,6 +923,21 @@ static void test_acpi_q35_tcg(void)
 free_test_data(&data);
 }
 
+static void test_acpi_q35_tcg_core_count2(void)
+{
+test_data data = {
+.machine = MACHINE_Q35,
+.variant = ".core-count2",
+.required_struct_types = base_required_struct_types,
+.required_struct_types_len = ARRAY_SIZE(base_required_struct_types),
+.smbios_core_count = 0xFF,
+.smbios_core_count2 = 275,
+};
+
+test_acpi_one("-machine smbios-entry-point-type=64 -smp 275", &data);
+free_test_data(&data);
+}
+
 static void test_acpi_q35_tcg_bridge(void)
 {
 test_data data;
@@ -1859,6 +1889,8 @@ int main(int argc, char *argv[])
 qtest_add_func("acpi/q35/tpm12-tis",
test_acpi_q35_tcg_tpm12_tis);
 }
+qtest_add_func("acpi/q35/core-count2",
+   test_acpi_q35_tcg_core_count2);
 qtest_add_func("acpi/q35/bridge", test_acpi_q35_tcg_bridge);
 qtest_add_func("acpi/q35/multif-bridge",
test_acpi_q35_multif_bridge);
-- 
2.37.3




Re: [PATCH v5 4/9] tests/x86: Add 'q35' machine type to override-tests in hd-geo-test

2022-10-11 Thread Thomas Huth

On 30/09/2022 00.35, Michael Labiuk via wrote:

Signed-off-by: Michael Labiuk 
---
  tests/qtest/hd-geo-test.c | 97 +++
  1 file changed, 97 insertions(+)

diff --git a/tests/qtest/hd-geo-test.c b/tests/qtest/hd-geo-test.c
index 61f4c24b81..278464c379 100644
--- a/tests/qtest/hd-geo-test.c
+++ b/tests/qtest/hd-geo-test.c
@@ -741,6 +741,27 @@ static void test_override_ide(void)
  test_override(args, "pc", expected);
  }
  
+static void test_override_sata(void)
+{
+TestArgs *args = create_args();
+CHSResult expected[] = {
+{"/pci@i0cf8/pci8086,2922@1f,2/drive@0/disk@0", {1, 120, 30} },
+{"/pci@i0cf8/pci8086,2922@1f,2/drive@1/disk@0", {9000, 120, 30} },
+{"/pci@i0cf8/pci8086,2922@1f,2/drive@2/disk@0", {0, 1, 1} },
+{"/pci@i0cf8/pci8086,2922@1f,2/drive@3/disk@0", {1, 0, 0} },
+{NULL, {0, 0, 0} }
+};
+add_drive_with_mbr(args, empty_mbr, 1);
+add_drive_with_mbr(args, empty_mbr, 1);
+add_drive_with_mbr(args, empty_mbr, 1);
+add_drive_with_mbr(args, empty_mbr, 1);
+add_ide_disk(args, 0, 0, 0, 1, 120, 30);
+add_ide_disk(args, 1, 1, 0, 9000, 120, 30);
+add_ide_disk(args, 2, 2, 0, 0, 1, 1);
+add_ide_disk(args, 3, 3, 0, 1, 0, 0);
+test_override(args, "q35", expected);
+}
+
  static void test_override_scsi(void)
  {
  TestArgs *args = create_args();
@@ -763,6 +784,42 @@ static void test_override_scsi(void)
  test_override(args, "pc", expected);
  }
  
+static void setup_pci_bridge(TestArgs *args, const char *id, const char *rootid)
+{
+
+char *root, *br;
+root = g_strdup_printf("-device pcie-root-port,id=%s", rootid);
+br = g_strdup_printf("-device pcie-pci-bridge,bus=%s,id=%s", rootid, id);
+
+args->argc = append_arg(args->argc, args->argv, ARGV_SIZE, root);
+args->argc = append_arg(args->argc, args->argv, ARGV_SIZE, br);
+}
+
+static void test_override_scsi_q35(void)
+{
+TestArgs *args = create_args();
+CHSResult expected[] = {
+{   "/pci@i0cf8/pci-bridge@1/scsi@3/channel@0/disk@0,0",
+{1, 120, 30}
+},
+{"/pci@i0cf8/pci-bridge@1/scsi@3/channel@0/disk@1,0", {9000, 120, 30} 
},
+{"/pci@i0cf8/pci-bridge@1/scsi@3/channel@0/disk@2,0", {1, 0, 0} },
+{"/pci@i0cf8/pci-bridge@1/scsi@3/channel@0/disk@3,0", {0, 1, 0} },
+{NULL, {0, 0, 0} }
+};
+add_drive_with_mbr(args, empty_mbr, 1);
+add_drive_with_mbr(args, empty_mbr, 1);
+add_drive_with_mbr(args, empty_mbr, 1);
+add_drive_with_mbr(args, empty_mbr, 1);
+setup_pci_bridge(args, "pcie.0", "br");
+add_scsi_controller(args, "lsi53c895a", "br", 3);
+add_scsi_disk(args, 0, 0, 0, 0, 0, 1, 120, 30);
+add_scsi_disk(args, 1, 0, 0, 1, 0, 9000, 120, 30);
+add_scsi_disk(args, 2, 0, 0, 2, 0, 1, 0, 0);
+add_scsi_disk(args, 3, 0, 0, 3, 0, 0, 1, 0);
+test_override(args, "q35", expected);
+}
+
  static void test_override_scsi_2_controllers(void)
  {
  TestArgs *args = create_args();
@@ -801,6 +858,22 @@ static void test_override_virtio_blk(void)
  test_override(args, "pc", expected);
  }
  
+static void test_override_virtio_blk_q35(void)
+{
+TestArgs *args = create_args();
+CHSResult expected[] = {
+{"/pci@i0cf8/pci-bridge@1/scsi@3/disk@0,0", {1, 120, 30} },
+{"/pci@i0cf8/pci-bridge@1/scsi@4/disk@0,0", {9000, 120, 30} },
+{NULL, {0, 0, 0} }
+};
+add_drive_with_mbr(args, empty_mbr, 1);
+add_drive_with_mbr(args, empty_mbr, 1);
+setup_pci_bridge(args, "pcie.0", "br");
+add_virtio_disk(args, 0, "br", 3, 1, 120, 30);
+add_virtio_disk(args, 1, "br", 4, 9000, 120, 30);
+test_override(args, "q35", expected);
+}
+
  static void test_override_zero_chs(void)
  {
  TestArgs *args = create_args();
@@ -812,6 +885,17 @@ static void test_override_zero_chs(void)
  test_override(args, "pc", expected);
  }
  
+static void test_override_zero_chs_q35(void)
+{
+TestArgs *args = create_args();
+CHSResult expected[] = {
+{NULL, {0, 0, 0} }
+};
+add_drive_with_mbr(args, empty_mbr, 1);
+add_ide_disk(args, 0, 0, 0, 0, 0, 0);
+test_override(args, "q35", expected);
+}
+
  static void test_override_hot_unplug(TestArgs *args, const char *devid,
                                 CHSResult expected[], CHSResult expected2[])
  {
@@ -944,6 +1028,19 @@ int main(int argc, char **argv)
 test_override_scsi_hot_unplug);
  qtest_add_func("hd-geo/override/virtio_hot_unplug",
 test_override_virtio_hot_unplug);
+
+if (qtest_has_machine("q35")) {
+qtest_add_func("hd-geo/override/sata", test_override_sata);
+qtest_add_func("hd-geo/override/virtio_blk_q35",
+   test_override_virtio_blk_q35);
+qtest_add_func("hd-geo/override/zero_chs_q35",
+   test_override_zero_chs_q35);
+
+if

Re: [PATCH v4] virtio-scsi: Send "REPORTED LUNS CHANGED" sense data upon disk hotplug events.

2022-10-11 Thread Venu Busireddy
On 2022-10-11 12:34:56 +0200, Paolo Bonzini wrote:
> Queued, thanks.

Thank you!

Venu

> 
> Paolo
> 



Re: [PATCH v5 7/9] tests/x86: replace snprint() by g_strdup_printf() in drive_del-test

2022-10-11 Thread Thomas Huth

On 30/09/2022 00.35, Michael Labiuk via wrote:

Use g_autofree char * and g_strdup_printf(...) instead of an ugly
snprintf() into a stack array.

Signed-off-by: Michael Labiuk 
---
  tests/qtest/drive_del-test.c | 10 --
  1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/tests/qtest/drive_del-test.c b/tests/qtest/drive_del-test.c
index 44b9578801..106c613f4f 100644
--- a/tests/qtest/drive_del-test.c
+++ b/tests/qtest/drive_del-test.c
@@ -123,12 +123,10 @@ static const char *qvirtio_get_dev_type(void)
  
  static void device_add(QTestState *qts)
  {
-QDict *response;
-char driver[32];
-snprintf(driver, sizeof(driver), "virtio-blk-%s",
- qvirtio_get_dev_type());
-
-response = qtest_qmp(qts, "{'execute': 'device_add',"
+g_autofree char *driver = g_strdup_printf("virtio-blk-%s",
+  qvirtio_get_dev_type());
+QDict *response =
+   qtest_qmp(qts, "{'execute': 'device_add',"
" 'arguments': {"
"   'driver': %s,"
"   'drive': 'drive0',"


Reviewed-by: Thomas Huth 




Re: [PATCH v5 5/9] tests/x86: Add 'q35' machine type to hotplug hd-geo-test

2022-10-11 Thread Thomas Huth

On 30/09/2022 00.35, Michael Labiuk via wrote:

Add pci bridge setting to test hotplug.
Duplicate tests for plugging scsi and virtio devices for q35 machine type.

Signed-off-by: Michael Labiuk 
---
  tests/qtest/hd-geo-test.c | 76 ++-
  1 file changed, 75 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/hd-geo-test.c b/tests/qtest/hd-geo-test.c
index 278464c379..4a7628077b 100644
--- a/tests/qtest/hd-geo-test.c
+++ b/tests/qtest/hd-geo-test.c
@@ -963,6 +963,42 @@ static void test_override_scsi_hot_unplug(void)
  test_override_hot_unplug(args, "scsi-disk0", expected, expected2);
  }
  
+static void test_override_scsi_hot_unplug_q35(void)
+{
+TestArgs *args = create_args();
+CHSResult expected[] = {
+{
+"/pci@i0cf8/pci-bridge@1/pci-bridge@0/scsi@2/channel@0/disk@0,0",
+{1, 120, 30}
+},
+{
+"/pci@i0cf8/pci-bridge@1/pci-bridge@0/scsi@2/channel@0/disk@1,0",
+{20, 20, 20}
+},
+{NULL, {0, 0, 0} }
+};
+CHSResult expected2[] = {
+{
+"/pci@i0cf8/pci-bridge@1/pci-bridge@0/scsi@2/channel@0/disk@1,0",
+{20, 20, 20}
+},
+{NULL, {0, 0, 0} }
+};
+
+args->argc = append_arg(args->argc, args->argv, ARGV_SIZE,
+g_strdup("-device pcie-root-port,id=p0 "
+ "-device pcie-pci-bridge,bus=p0,id=b1 "
+ "-machine q35"));
+
+add_drive_with_mbr(args, empty_mbr, 1);
+add_drive_with_mbr(args, empty_mbr, 1);
+add_scsi_controller(args, "virtio-scsi-pci", "b1", 2);
+add_scsi_disk(args, 0, 0, 0, 0, 0, 1, 120, 30);
+add_scsi_disk(args, 1, 0, 0, 1, 0, 20, 20, 20);
+
+test_override_hot_unplug(args, "scsi-disk0", expected, expected2);
+}
+
  static void test_override_virtio_hot_unplug(void)
  {
  TestArgs *args = create_args();
@@ -986,6 +1022,41 @@ static void test_override_virtio_hot_unplug(void)
  test_override_hot_unplug(args, "virtio-disk0", expected, expected2);
  }
  
+static void test_override_virtio_hot_unplug_q35(void)
+{
+TestArgs *args = create_args();
+CHSResult expected[] = {
+{
+"/pci@i0cf8/pci-bridge@1/pci-bridge@0/scsi@2/disk@0,0",
+{1, 120, 30}
+},
+{
+"/pci@i0cf8/pci-bridge@1/pci-bridge@0/scsi@3/disk@0,0",
+{20, 20, 20}
+},
+{NULL, {0, 0, 0} }
+};
+CHSResult expected2[] = {
+{
+"/pci@i0cf8/pci-bridge@1/pci-bridge@0/scsi@3/disk@0,0",
+{20, 20, 20}
+},
+{NULL, {0, 0, 0} }
+};
+
+args->argc = append_arg(args->argc, args->argv, ARGV_SIZE,
+g_strdup("-device pcie-root-port,id=p0 "
+ "-device pcie-pci-bridge,bus=p0,id=b1 "
+ "-machine q35"));
+
+add_drive_with_mbr(args, empty_mbr, 1);
+add_drive_with_mbr(args, empty_mbr, 1);
+add_virtio_disk(args, 0, "b1", 2, 1, 120, 30);
+add_virtio_disk(args, 1, "b1", 3, 20, 20, 20);
+
+test_override_hot_unplug(args, "virtio-disk0", expected, expected2);
+}
+
  int main(int argc, char **argv)
  {
  Backend i;
@@ -1035,11 +1106,14 @@ int main(int argc, char **argv)
 test_override_virtio_blk_q35);
  qtest_add_func("hd-geo/override/zero_chs_q35",
 test_override_zero_chs_q35);
-
  if (qtest_has_device("lsi53c895a")) {
  qtest_add_func("hd-geo/override/scsi_q35",
 test_override_scsi_q35);
  }
+qtest_add_func("hd-geo/override/scsi_hot_unplug_q35",
+   test_override_scsi_hot_unplug_q35);
+qtest_add_func("hd-geo/override/virtio_hot_unplug_q35",
+   test_override_virtio_hot_unplug_q35);
  }
  } else {
  g_test_message("QTEST_QEMU_IMG not set or qemu-img missing; "


Acked-by: Thomas Huth 




[PATCH v1 1/4] tests/docker: update fedora-win[32|64]-cross with lcitool

2022-10-11 Thread Alex Bennée
Convert another two dockerfiles to lcitool and update. I renamed the
helper because it is not Debian-specific. We need an updated lcitool
for this to deal with the weirdness of a 32-bit NSIS tool for both the
32- and 64-bit builds. As a result there are some minor whitespace and
re-ordering changes in a bunch of the docker files.

Signed-off-by: Alex Bennée 
Message-Id: <20220929114231.583801-10-alex.ben...@linaro.org>
---
 tests/docker/dockerfiles/alpine.docker|   2 +-
 tests/docker/dockerfiles/centos8.docker   |   2 +-
 .../dockerfiles/debian-amd64-cross.docker | 234 -
 tests/docker/dockerfiles/debian-amd64.docker  | 236 +-
 .../dockerfiles/debian-arm64-cross.docker | 232 -
 .../dockerfiles/debian-armel-cross.docker | 230 -
 .../dockerfiles/debian-armhf-cross.docker | 232 -
 .../dockerfiles/debian-mips64el-cross.docker  | 226 -
 .../dockerfiles/debian-mipsel-cross.docker| 226 -
 .../dockerfiles/debian-ppc64el-cross.docker   | 230 -
 .../dockerfiles/debian-s390x-cross.docker | 228 -
 .../dockerfiles/fedora-win32-cross.docker | 139 ---
 .../dockerfiles/fedora-win64-cross.docker | 138 +++---
 tests/docker/dockerfiles/fedora.docker| 230 -
 tests/docker/dockerfiles/opensuse-leap.docker |   2 +-
 tests/docker/dockerfiles/ubuntu2004.docker| 234 -
 tests/lcitool/libvirt-ci  |   2 +-
 tests/lcitool/refresh |  48 ++--
 18 files changed, 1499 insertions(+), 1372 deletions(-)

diff --git a/tests/docker/dockerfiles/alpine.docker b/tests/docker/dockerfiles/alpine.docker
index 9b7541261a..a854ae6b78 100644
--- a/tests/docker/dockerfiles/alpine.docker
+++ b/tests/docker/dockerfiles/alpine.docker
@@ -119,8 +119,8 @@ RUN apk update && \
 ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/g++ && \
 ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/gcc
 
+ENV CCACHE_WRAPPERSDIR "/usr/libexec/ccache-wrappers"
 ENV LANG "en_US.UTF-8"
 ENV MAKE "/usr/bin/make"
 ENV NINJA "/usr/bin/ninja"
 ENV PYTHON "/usr/bin/python3"
-ENV CCACHE_WRAPPERSDIR "/usr/libexec/ccache-wrappers"
diff --git a/tests/docker/dockerfiles/centos8.docker b/tests/docker/dockerfiles/centos8.docker
index d89113c0df..1f70d41aeb 100644
--- a/tests/docker/dockerfiles/centos8.docker
+++ b/tests/docker/dockerfiles/centos8.docker
@@ -130,8 +130,8 @@ RUN dnf distro-sync -y && \
 ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/g++ && \
 ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/gcc
 
+ENV CCACHE_WRAPPERSDIR "/usr/libexec/ccache-wrappers"
 ENV LANG "en_US.UTF-8"
 ENV MAKE "/usr/bin/make"
 ENV NINJA "/usr/bin/ninja"
 ENV PYTHON "/usr/bin/python3"
-ENV CCACHE_WRAPPERSDIR "/usr/libexec/ccache-wrappers"
diff --git a/tests/docker/dockerfiles/debian-amd64-cross.docker b/tests/docker/dockerfiles/debian-amd64-cross.docker
index 9047759e76..8311024632 100644
--- a/tests/docker/dockerfiles/debian-amd64-cross.docker
+++ b/tests/docker/dockerfiles/debian-amd64-cross.docker
@@ -11,62 +11,62 @@ RUN export DEBIAN_FRONTEND=noninteractive && \
 apt-get install -y eatmydata && \
 eatmydata apt-get dist-upgrade -y && \
 eatmydata apt-get install --no-install-recommends -y \
-bash \
-bc \
-bison \
-bsdextrautils \
-bzip2 \
-ca-certificates \
-ccache \
-dbus \
-debianutils \
-diffutils \
-exuberant-ctags \
-findutils \
-flex \
-gcovr \
-genisoimage \
-gettext \
-git \
-hostname \
-libglib2.0-dev \
-libpcre2-dev \
-libspice-protocol-dev \
-llvm \
-locales \
-make \
-meson \
-ncat \
-ninja-build \
-openssh-client \
-perl-base \
-pkgconf \
-python3 \
-python3-numpy \
-python3-opencv \
-python3-pillow \
-python3-pip \
-python3-sphinx \
-python3-sphinx-rtd-theme \
-python3-venv \
-python3-yaml \
-rpm2cpio \
-sed \
-sparse \
-tar \
-tesseract-ocr \
-tesseract-ocr-eng \
-texinfo && \
+  bash \
+  bc \
+  bison \
+  bsdextrautils \
+  bzip2 \
+  ca-certificates \
+  ccache \
+  dbus \
+  debianutils \
+  diffutils \
+  exuberant-ctags \
+  findutils \
+  flex \
+  

[PATCH v3 0/5] hw/smbios: add core_count2 to smbios table type 4

2022-10-11 Thread Julia Suvorova
The SMBIOS 3.0 specification provides the ability to reflect over
255 cores. The 64-bit entry point has been used for a while, but
structure type 4 has not been updated before, so the dmidecode output
looked like this (-smp 280):

Handle 0x0400, DMI type 4, 42 bytes
Processor Information
...
Core Count: 24
Core Enabled: 24
Thread Count: 1
...

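The "Core Count: 24" above is exactly 280 modulo 256: the legacy type-4 field is a single byte wide. A minimal sketch of the encoding being fixed here (the helper names are made up, not QEMU's code; per the SMBIOS 3.0 spec, a count above 255 stores 0xFF in the byte field and the real value in the 16-bit Core Count 2 field):

```python
def encode_core_count(cores: int) -> tuple[int, int]:
    """Return (core_count, core_count2) as a type-4 table would store them."""
    core_count = 0xFF if cores > 255 else cores   # one-byte legacy field
    core_count2 = cores                           # 16-bit SMBIOS 3.0 field
    return core_count, core_count2

def legacy_truncation(cores: int) -> int:
    """What a table without core_count2 effectively reported: the low byte."""
    return cores & 0xFF

print(legacy_truncation(280))   # 24 -- matching the dmidecode output above
print(encode_core_count(280))   # (255, 280)
```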
Big update in the bios-tables-test as it couldn't work with SMBIOS 3.0.

v3:
* rebase on fresh master
* crop lines to 80 characters [Igor]
* add conditions for cc2 field check in the test [Igor]

v2:
* generate tables type 4 of different sizes based on the
  selected smbios version
* use SmbiosEntryPoint* types instead of creating new constants
* refactor smbios_cpu_test [Igor, Ani]
* clarify signature check [Igor]
* add comments with specifications and clarification of the structure loop [Ani]


Julia Suvorova (5):
  hw/smbios: add core_count2 to smbios table type 4
  bios-tables-test: teach test to use smbios 3.0 tables
  tests/acpi: allow changes for core_count2 test
  bios-tables-test: add test for number of cores > 255
  tests/acpi: update tables for new core count test

 hw/smbios/smbios.c   |  19 +++-
 hw/smbios/smbios_build.h |   9 +-
 include/hw/firmware/smbios.h |  12 ++
 tests/data/acpi/q35/APIC.core-count2 | Bin 0 -> 2478 bytes
 tests/data/acpi/q35/DSDT.core-count2 | Bin 0 -> 32414 bytes
 tests/data/acpi/q35/FACP.core-count2 | Bin 0 -> 244 bytes
 tests/qtest/bios-tables-test.c   | 158 ---
 7 files changed, 156 insertions(+), 42 deletions(-)
 create mode 100644 tests/data/acpi/q35/APIC.core-count2
 create mode 100644 tests/data/acpi/q35/DSDT.core-count2
 create mode 100644 tests/data/acpi/q35/FACP.core-count2

-- 
2.37.3




Re: [PATCH v5 6/9] tests/x86: Fix comment typo in drive_del-test

2022-10-11 Thread Thomas Huth

On 30/09/2022 00.35, Michael Labiuk via wrote:

Signed-off-by: Michael Labiuk 
---
  tests/qtest/drive_del-test.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/drive_del-test.c b/tests/qtest/drive_del-test.c
index 467e752b0d..44b9578801 100644
--- a/tests/qtest/drive_del-test.c
+++ b/tests/qtest/drive_del-test.c
@@ -327,7 +327,7 @@ static void test_blockdev_add_device_add_and_del(void)
  qts = qtest_init(machine_addition);
  
  /*
- * blockdev_add/device_add and device_del.  The it drive is used by a
+ * blockdev_add/device_add and device_del. The drive is used by a
   * device that unplugs after reset, but it doesn't go away.
   */
  blockdev_add_with_media(qts);


Reviewed-by: Thomas Huth 




Re: [PING PATCH v5] Add 'q35' machine type to hotplug tests

2022-10-11 Thread Thomas Huth

On 11/10/2022 12.18, Michael Labiuk wrote:

I would like to ping this patch series.


Sorry, it took me a little bit longer to get back to this...

Anyway, patches look fine, and I've queued them now (with the typo fixed in 
the first patch) to my testing-next branch:


 https://gitlab.com/thuth/qemu/-/commits/testing-next

 Thomas




On 9/30/22 01:35, Michael Labiuk via wrote:

Add pci bridge setting to run hotplug tests on q35 machine type.
Hotplug tests were bound to the 'pc' machine type by commit 7b172333f1b

v5 -> v4:

* Unify device removal in tests.
* Use qtest_has_machine("q35") as the condition.
* Fix typos.
* Replace snprintf().

v4 -> v3:

* Moving helper function process_device_remove() to separate commit.
* Refactoring hd-geo-test to avoid code duplication.

Michael Labiuk (9):
   tests/x86: add helper qtest_qmp_device_del_send()
   tests/x86: Add subtest with 'q35' machine type to device-plug-test
   tests/x86: Refactor hot unplug hd-geo-test
   tests/x86: Add 'q35' machine type to override-tests in hd-geo-test
   tests/x86: Add 'q35' machine type to hotplug hd-geo-test
   tests/x86: Fix comment typo in drive_del-test
   tests/x86: replace snprint() by g_strdup_printf() in drive_del-test
   tests/x86: Add 'q35' machine type to drive_del-test
   tests/x86: Add 'q35' machine type to ivshmem-test

  tests/qtest/device-plug-test.c |  56 --
  tests/qtest/drive_del-test.c   | 125 +++--
  tests/qtest/hd-geo-test.c  | 319 -
  tests/qtest/ivshmem-test.c |  18 ++
  tests/qtest/libqos/pci-pc.c    |   8 +-
  tests/qtest/libqtest.c |  16 +-
  tests/qtest/libqtest.h |  10 ++
  7 files changed, 425 insertions(+), 127 deletions(-)








Re: [PATCH v5 9/9] tests/x86: Add 'q35' machine type to ivshmem-test

2022-10-11 Thread Thomas Huth

On 30/09/2022 00.35, Michael Labiuk via wrote:

Configure pci bridge setting to test ivshmem on 'q35'.

Signed-off-by: Michael Labiuk 
---
  tests/qtest/ivshmem-test.c | 18 ++
  1 file changed, 18 insertions(+)


Reviewed-by: Thomas Huth 




Re: [PATCH v5 3/9] tests/x86: Refactor hot unplug hd-geo-test

2022-10-11 Thread Thomas Huth

On 30/09/2022 00.35, Michael Labiuk via wrote:

Move common code into a function.

Signed-off-by: Michael Labiuk 
---
  tests/qtest/hd-geo-test.c | 144 +++---
  1 file changed, 57 insertions(+), 87 deletions(-)


Nice refactoring, nice diffstat!

Reviewed-by: Thomas Huth 




Re: [RFC PATCH v2 2/4] acpi: fadt: support revision 6.0 of the ACPI specification

2022-10-11 Thread Ani Sinha
On Tue, Oct 11, 2022 at 4:33 PM Miguel Luis  wrote:
>
>
> > On 11 Oct 2022, at 05:02, Ani Sinha  wrote:
> >
> > On Mon, Oct 10, 2022 at 6:53 PM Miguel Luis  wrote:
> >>
> >> Update the Fixed ACPI Description Table (FADT) to revision 6.0 of the ACPI
> >> specification, adding the field "Hypervisor Vendor Identity" that was
> >> missing.
> >>
> >> This field's description states the following: "64-bit identifier of 
> >> hypervisor
> >> vendor. All bytes in this field are considered part of the vendor identity.
> >> These identifiers are defined independently by the vendors themselves,
> >> usually following the name of the hypervisor product. Version information
> >> should NOT be included in this field - this shall simply denote the 
> >> vendor's
> >> name or identifier. Version information can be communicated through a
> >> supplemental vendor-specific hypervisor API. Firmware implementers would
> >> place zero bytes into this field, denoting that no hypervisor is present in
> >> the actual firmware."
> >>
> >> Hereupon, what should a valid Hypervisor Vendor ID be, and
> >> where should QEMU provide that information?
> >>
> >> In v1 [1] of this RFC there is a suggestion to keep this information in
> >> sync with the current acceleration name. This also seems to imply that
> >> QEMU, which generates the FADT table, and the FADT consumer need to agree
> >> on the values of this field.
> >>
> >> This version follows Ani Sinha's suggestion [2] of using "QEMU" for the
> >> hypervisor vendor ID.
> >>
> >> [1]: https://lists.nongnu.org/archive/html/qemu-devel/2022-10/msg00911.html
> >> [2]: https://lists.nongnu.org/archive/html/qemu-devel/2022-10/msg00989.html
> >>
> >> Signed-off-by: Miguel Luis 
> >
> > Reviewed-by: Ani Sinha 
> >
>
> Thank you Ani. In the meantime, taking the description part, "Firmware
> implementers would place zero bytes into this field, denoting that no
> hypervisor is present in the actual firmware.", I arrived at something
> along the lines below:

That line is meant for hardware vendors shipping BIOSes with physical
HW. All VMs run with QEMU run in a hypervisor environment.
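The field both diffs manipulate is a fixed 8-byte, NUL-padded string. A minimal sketch of the two cases under discussion (pad_vendor_id is a hypothetical stand-in for what build_append_padded_str() produces here, not QEMU code):

```python
def pad_vendor_id(vendor: str) -> bytes:
    """NUL-pad (or truncate) a hypervisor vendor identifier to 8 bytes."""
    return vendor.encode("ascii")[:8].ljust(8, b"\x00")

print(pad_vendor_id("QEMU"))   # b'QEMU\x00\x00\x00\x00'
# All-zero bytes are what the spec text prescribes for "no hypervisor present":
print(pad_vendor_id(""))       # b'\x00\x00\x00\x00\x00\x00\x00\x00'
```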

>
> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> index 42feb4d4d7..e719afe0cb 100644
> --- a/hw/acpi/aml-build.c
> +++ b/hw/acpi/aml-build.c
> @@ -2198,7 +2198,11 @@ void build_fadt(GArray *tbl, BIOSLinker *linker, const 
> AcpiFadtData *f,
>  }
>
>  /* Hypervisor Vendor Identity */
> -build_append_padded_str(tbl, "QEMU", 8, '\0');
> +if (f->hyp_is_present) {
> +build_append_padded_str(tbl, "QEMU", 8, '\0');
> +} else {
> +build_append_int_noprefix(tbl, 0, 8);
> +}
>
>  /* TODO: extra fields need to be added to support revisions above rev6 */
>  assert(f->rev == 6);
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 72bb6f61a5..d238ce2b88 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -818,6 +818,7 @@ static void build_fadt_rev6(GArray *table_data, 
> BIOSLinker *linker,
>  .minor_ver = 0,
>  .flags = 1 << ACPI_FADT_F_HW_REDUCED_ACPI,
>  .xdsdt_tbl_offset = &dsdt_tbl_offset,
> +.hyp_is_present = vms->virt && (kvm_enabled() || hvf_enabled()),
>  };
>
>  switch (vms->psci_conduit) {
> diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
> index 2b42e4192b..2aff5304af 100644
> --- a/include/hw/acpi/acpi-defs.h
> +++ b/include/hw/acpi/acpi-defs.h
> @@ -79,7 +79,7 @@ typedef struct AcpiFadtData {
>  uint16_t arm_boot_arch;/* ARM_BOOT_ARCH */
>  uint16_t iapc_boot_arch;   /* IAPC_BOOT_ARCH */
>  uint8_t minor_ver; /* FADT Minor Version */
> -
> +bool hyp_is_present;
>  /*
>   * respective tables offsets within ACPI_BUILD_TABLE_FILE,
>   * NULL if table doesn't exist (in that case field's value
>
> Any thoughts on this?
>
> Thanks
> Miguel
>
> >> ---
> >> hw/acpi/aml-build.c  | 13 ++---
> >> hw/arm/virt-acpi-build.c | 10 +-
> >> 2 files changed, 15 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> >> index e6bfac95c7..42feb4d4d7 100644
> >> --- a/hw/acpi/aml-build.c
> >> +++ b/hw/acpi/aml-build.c
> >> @@ -2070,7 +2070,7 @@ void build_pptt(GArray *table_data, BIOSLinker 
> >> *linker, MachineState *ms,
> >> acpi_table_end(linker, &table);
> >> }
> >>
> >> -/* build rev1/rev3/rev5.1 FADT */
> >> +/* build rev1/rev3/rev5.1/rev6.0 FADT */
> >> void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
> >> const char *oem_id, const char *oem_table_id)
> >> {
> >> @@ -2193,8 +2193,15 @@ void build_fadt(GArray *tbl, BIOSLinker *linker, 
> >> const AcpiFadtData *f,
> >> /* SLEEP_STATUS_REG */
> >> build_append_gas_from_struct(tbl, &f->sleep_sts);
> >>
> >> -/* TODO: extra fields need to be added to support revisions above 
> >> rev5 */
> >> -assert(f-
