date:20240314

Re: [PATCH-for-9.0? 01/12] accel/tcg/plugin: Remove CONFIG_SOFTMMU_GATE definition

2024-03-14 Thread Thomas Huth


On 13/03/2024 22.33, Philippe Mathieu-Daudé wrote:

The CONFIG_SOFTMMU_GATE definition was never used, remove it.

Signed-off-by: Philippe Mathieu-Daudé 
---
  accel/tcg/plugin-gen.c | 6 --
  1 file changed, 6 deletions(-)

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 8028786c7b..cd78ef94a1 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -57,12 +57,6 @@
  #include "exec/helper-info.c.inc"
  #undef  HELPER_H
  
-#ifdef CONFIG_SOFTMMU

-# define CONFIG_SOFTMMU_GATE 1
-#else
-# define CONFIG_SOFTMMU_GATE 0
-#endif
-
  /*
   * plugin_cb_start TCG op args[]:
   * 0: enum plugin_gen_from


Reviewed-by: Thomas Huth

Re: [PATCH-for-9.0? 02/12] travis-ci: Rename SOFTMMU -> SYSTEM

2024-03-14 Thread Thomas Huth


On 13/03/2024 22.33, Philippe Mathieu-Daudé wrote:

Since we *might* have user emulation with softmmu,
rename MAIN_SOFTMMU_TARGETS as MAIN_SYSTEM_TARGETS
to express 'system emulation targets'.

Signed-off-by: Philippe Mathieu-Daudé 
---
  .travis.yml | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)


Reviewed-by: Thomas Huth

Re: [External] Re: [PATCH v2 1/1] memory tier: acpi/hmat: create CPUless memory tiers after obtaining HMAT info

2024-03-14 Thread Huang, Ying

"Ho-Ren (Jack) Chuang"  writes:

> On Tue, Mar 12, 2024 at 2:21 AM Huang, Ying  wrote:
>>
>> "Ho-Ren (Jack) Chuang"  writes:
>>
>> > The current implementation treats emulated memory devices, such as
>> > CXL1.1 type3 memory, as normal DRAM when they are emulated as normal memory
>> > (E820_TYPE_RAM). However, these emulated devices have different
>> > characteristics than traditional DRAM, making it important to
>> > distinguish them. Thus, we modify the tiered memory initialization process
>> > to introduce a delay specifically for CPUless NUMA nodes. This delay
>> > ensures that the memory tier initialization for these nodes is deferred
>> > until HMAT information is obtained during the boot process. Finally,
>> > demotion tables are recalculated at the end.
>> >
>> > * Abstract common functions into `find_alloc_memory_type()`
>>
>> We should move kmem_put_memory_types() (renamed to
>> mt_put_memory_types()?) too.  This can be put in a separate patch.
>>
>
> Will do! Thanks,
>
>
>>
>> > Since different memory devices require finding or allocating a memory type,
>> > these common steps are abstracted into a single function,
>> > `find_alloc_memory_type()`, enhancing code scalability and conciseness.
>> >
>> > * Handle cases where there is no HMAT when creating memory tiers
>> > There is a scenario where a CPUless node does not provide HMAT information.
>> > If no HMAT is specified, it falls back to using the default DRAM tier.
>> >
>> > * Change adist calculation code to use another new lock, mt_perf_lock.
>> > In the current implementation, iterating through CPUlist nodes requires
>> > holding the `memory_tier_lock`. However, `mt_calc_adistance()` will end up
>> > trying to acquire the same lock, leading to a potential deadlock.
>> > Therefore, we propose introducing a standalone `mt_perf_lock` to protect
>> > `default_dram_perf`. This approach not only avoids deadlock but also
>> > prevents holding a large lock simultaneously.
>> >
>> > Signed-off-by: Ho-Ren (Jack) Chuang 
>> > Signed-off-by: Hao Xiang 
>> > ---
>> >  drivers/acpi/numa/hmat.c | 11 ++
>> >  drivers/dax/kmem.c   | 13 +--
>> >  include/linux/acpi.h |  6 
>> >  include/linux/memory-tiers.h |  8 +
>> >  mm/memory-tiers.c| 70 +---
>> >  5 files changed, 92 insertions(+), 16 deletions(-)
>> >
>> > diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
>> > index d6b85f0f6082..28812ec2c793 100644
>> > --- a/drivers/acpi/numa/hmat.c
>> > +++ b/drivers/acpi/numa/hmat.c
>> > @@ -38,6 +38,8 @@ static LIST_HEAD(targets);
>> >  static LIST_HEAD(initiators);
>> >  static LIST_HEAD(localities);
>> >
>> > +static LIST_HEAD(hmat_memory_types);
>> > +
>>
>> HMAT isn't a device driver for some memory devices.  So I don't think we
>> should manage memory types in HMAT.
>
> I can put it back in memory-tier.c. How about the list? Do we still
> need to keep a separate list for storing late_inited memory nodes?
> And how about the list name if we need to remove the prefix "hmat_"?

I don't think we need a separate list for memory-less nodes.  Just
iterate all memory-less nodes.

>
>> Instead, if the memory_type of a
>> node isn't set by the driver, we should manage it in memory-tier.c as
>> fallback.
>>
>
> Do you mean some device drivers may init memory tiers between
> memory_tier_init() and late_initcall(memory_tier_late_init);?
> And this is the reason why you mention to exclude
> "node_memory_types[nid].memtype != NULL" in memory_tier_late_init().
> Is my understanding correct?

Yes.

>> >  static DEFINE_MUTEX(target_lock);
>> >
>> >  /*
>> > @@ -149,6 +151,12 @@ int acpi_get_genport_coordinates(u32 uid,
>> >  }
>> >  EXPORT_SYMBOL_NS_GPL(acpi_get_genport_coordinates, CXL);
>> >
>> > +struct memory_dev_type *hmat_find_alloc_memory_type(int adist)
>> > +{
>> > + return find_alloc_memory_type(adist, &hmat_memory_types);
>> > +}
>> > +EXPORT_SYMBOL_GPL(hmat_find_alloc_memory_type);
>> > +
>> >  static __init void alloc_memory_initiator(unsigned int cpu_pxm)
>> >  {
>> >   struct memory_initiator *initiator;
>> > @@ -1038,6 +1046,9 @@ static __init int hmat_init(void)
>> >   if (!hmat_set_default_dram_perf())
>> >   register_mt_adistance_algorithm(&hmat_adist_nb);
>> >
>> > + /* Post-create CPUless memory tiers after getting HMAT info */
>> > + memory_tier_late_init();
>> > +
>>
>> This should be called in memory-tier.c via
>>
>> late_initcall(memory_tier_late_init);
>>
>> Then, we don't need hmat to call it.
>>
>
> Thanks. Learned!
>
>
>> >   return 0;
>> >  out_put:
>> >   hmat_free_structures();
>> > diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
>> > index 42ee360cf4e3..aee17ab59f4f 100644
>> > --- a/drivers/dax/kmem.c
>> > +++ b/drivers/dax/kmem.c
>> > @@ -55,21 +55,10 @@ static LIST_HEAD(kmem_memory_types);
>> >
>> >  static struct memory_dev_type *kmem_find_alloc_memory_type(int adist)
>> >  {
>> > - bool found = fa

Re: [PATCH v3] target/riscv: Implement dynamic establishment of custom decoder

2024-03-14 Thread Philippe Mathieu-Daudé


On 14/3/24 07:24, Huang Tao wrote:

Hi,

On 2024/3/13 18:47, Philippe Mathieu-Daudé wrote:



+{
+    GPtrArray *dynamic_decoders;
+    dynamic_decoders = g_ptr_array_sized_new(decoder_table_size);
+    for (size_t i = 0; i < decoder_table_size; ++i) {
+    if (decoder_table[i].guard_func &&
+    decoder_table[i].guard_func(&cpu->cfg)) {
+    g_ptr_array_add(dynamic_decoders,
+ (gpointer)decoder_table[i].decode_fn);
+    }
+    }
+
+    cpu->decoders = dynamic_decoders;
+}


Move this function to translate.c and make decoder_table[] static.
Then we don't need the "cpu_decoder.h", it is specific to TCG and
declarations go in "target/riscv/tcg/tcg-cpu.h".

This function is about finalizing the feature of cpu, it is not suitable 
to move it to translate.c from the perspective of code structure and 
readability.


I will try to move the function to tcg-cpu.c, and the declarations to 
tcg-cpu.h according to your suggestion.



diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 177418b2b9..332f0bfd4e 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -115,6 +115,7 @@ typedef struct DisasContext {
  bool frm_valid;
  /* TCG of the current insn_start */
  TCGOp *insn_start;
+    const GPtrArray *decoders;


Why do we need this reference? We can use env_archcpu(env)->decoders.


As Richard said before:

 > We try to avoid placing env into DisasContext, so that it is much 
harder to make the mistake of referencing env fields at 
translation-time, when you really needed to generate tcg code to 
reference the fields at runtime.


Right, he told me the same on IRC recently ;)



It also applies to the ArchCPU case.


Thanks to your review, I will adopt the other suggestions in the next 
version.



Thanks,

Huang Tao

Re: [PATCH] vfio/iommufd: Fix memory leak

2024-03-14 Thread Cédric Le Goater


On 3/14/24 04:31, Duan, Zhenzhong wrote:




-Original Message-
From: Cédric Le Goater 
Sent: Thursday, March 14, 2024 5:06 AM
To: qemu-devel@nongnu.org
Cc: Alex Williamson ; Cédric Le Goater
; Eric Auger ; Liu, Yi L
; Duan, Zhenzhong 
Subject: [PATCH] vfio/iommufd: Fix memory leak

Make sure variable contents is freed if scanf fails.

Cc: Eric Auger 
Cc: Yi Liu 
Cc: Zhenzhong Duan 
Fixes: CID 1540007
Fixes: 5ee3dc7af785 ("vfio/iommufd: Implement the iommufd backend")
Signed-off-by: Cédric Le Goater 


Reviewed-by: Zhenzhong Duan 

Unrelated to this patch, I see there are four g_free calls, not clear if it's 
deserved
to cleanup with g_autofree.


Ah yes. This is much better indeed.


Thanks,

C.

[PATCH v2] vfio/iommufd: Fix memory leak

2024-03-14 Thread Cédric Le Goater

Coverity reported a memory leak on variable 'contents' in routine
iommufd_cdev_getfd(). Use g_autofree variables to simplify the exit
path and get rid of g_free() calls.

Cc: Eric Auger 
Cc: Yi Liu 
Fixes: CID 1540007
Fixes: 5ee3dc7af785 ("vfio/iommufd: Implement the iommufd backend")
Suggested-by: Zhenzhong Duan 
Signed-off-by: Cédric Le Goater 
---
 hw/vfio/iommufd.c | 19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 
a75a785e90c64cdcc4d10c88d217801b3f536cdb..b9c7efb3ef11e49e189103ae6fb9011a631b60da
 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -118,10 +118,12 @@ static int iommufd_cdev_getfd(const char *sysfs_path, 
Error **errp)
 {
 ERRP_GUARD();
 long int ret = -ENOTTY;
-char *path, *vfio_dev_path = NULL, *vfio_path = NULL;
+g_autofree char *path = NULL;
+g_autofree char *vfio_dev_path = NULL;
+g_autofree char *vfio_path = NULL;
 DIR *dir = NULL;
 struct dirent *dent;
-gchar *contents;
+g_autofree gchar *contents = NULL;
 gsize length;
 int major, minor;
 dev_t vfio_devt;
@@ -130,7 +132,7 @@ static int iommufd_cdev_getfd(const char *sysfs_path, Error 
**errp)
 dir = opendir(path);
 if (!dir) {
 error_setg_errno(errp, errno, "couldn't open directory %s", path);
-goto out_free_path;
+goto out;
 }
 
 while ((dent = readdir(dir))) {
@@ -147,14 +149,13 @@ static int iommufd_cdev_getfd(const char *sysfs_path, 
Error **errp)
 
 if (!g_file_get_contents(vfio_dev_path, &contents, &length, NULL)) {
 error_setg(errp, "failed to load \"%s\"", vfio_dev_path);
-goto out_free_dev_path;
+goto out_close_dir;
 }
 
 if (sscanf(contents, "%d:%d", &major, &minor) != 2) {
 error_setg(errp, "failed to get major:minor for \"%s\"", 
vfio_dev_path);
-goto out_free_dev_path;
+goto out_close_dir;
 }
-g_free(contents);
 vfio_devt = makedev(major, minor);
 
 vfio_path = g_strdup_printf("/dev/vfio/devices/%s", dent->d_name);
@@ -164,17 +165,13 @@ static int iommufd_cdev_getfd(const char *sysfs_path, 
Error **errp)
 }
 
 trace_iommufd_cdev_getfd(vfio_path, ret);
-g_free(vfio_path);
 
-out_free_dev_path:
-g_free(vfio_dev_path);
 out_close_dir:
 closedir(dir);
-out_free_path:
+out:
 if (*errp) {
 error_prepend(errp, VFIO_MSG_PREFIX, path);
 }
-g_free(path);
 
 return ret;
 }
-- 
2.44.0

Re: [PATCH v3 3/3] Add support for RAPL MSRs in KVM/Qemu

2024-03-14 Thread Anthony Harivel



Hi Daniel,


> You don't need to access it via the /node/ hierarchy
>
> The canonical path for CPUs would be
>
>   /sys/devices/system/cpu/cpuNNN/topology
>
> The core_cpus_list file is giving you hyper-thread siblings within
> a core, which I don't think is what you want.
>
> If you're after discrete physical packages, then 'package_cpus_list'
> gives you all CPUs within a physical socket (package) I believe.
>

Yes, this could work.
However, on laptop, I've got: 
cat package_cpus_list
0-11

Where on server: 
package_cpus_list
0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46

I asked my teammate: always the same results. This is I guess due to 
either a difference in the kernel version or how the kernel is handling 
the case where there is only one package, versus the case with multiple 
packages. 

Anyway, writing a C function to handle both cases might not be easy. 

The other solution would be to loop through 
/sys/devices/system/cpu/cpuNNN/ and update a table of integer of a fixed 
size (*) where I increment table[0] for package_0 when I encounter a CPU 
that belongs to package_0 and so on. Reading 
cpuNNN/topology/physical_package_id is giving to whom package the cpu 
belongs to.

This is a bit tedious but a safer solution, I think.


(*): Maybe dynamic allocation is better ?

Thanks again for your guidance.

Regards,
Anthony

[PATCH v3 2/2] vhost: Perform memory section dirty scans once per iteration

2024-03-14 Thread Si-Wei Liu

On setups with one or more virtio-net devices with vhost on,
dirty tracking iteration increases cost the bigger the number
amount of queues are set up e.g. on idle guests migration the
following is observed with virtio-net with vhost=on:

48 queues -> 78.11%  [.] vhost_dev_sync_region.isra.13
8 queues -> 40.50%   [.] vhost_dev_sync_region.isra.13
1 queue -> 6.89% [.] vhost_dev_sync_region.isra.13
2 devices, 1 queue -> 18.60%  [.] vhost_dev_sync_region.isra.14

With high memory rates the symptom is lack of convergence as soon
as it has a vhost device with a sufficiently high number of queues,
the sufficient number of vhost devices.

On every migration iteration (every 100msecs) it will redundantly
query the *shared log* the number of queues configured with vhost
that exist in the guest. For the virtqueue data, this is necessary,
but not for the memory sections which are the same. So essentially
we end up scanning the dirty log too often.

To fix that, select a vhost device responsible for scanning the
log with regards to memory sections dirty tracking. It is selected
when we enable the logger (during migration) and cleared when we
disable the logger. If the vhost logger device goes away for some
reason, the logger will be re-selected from the rest of vhost
devices.

After making mem-section logger a singleton instance, constant cost
of 7%-9% (like the 1 queue report) will be seen, no matter how many
queues or how many vhost devices are configured:

48 queues -> 8.71%[.] vhost_dev_sync_region.isra.13
2 devices, 8 queues -> 7.97%   [.] vhost_dev_sync_region.isra.14

Co-developed-by: Joao Martins 
Signed-off-by: Joao Martins 
Signed-off-by: Si-Wei Liu 
---
v2 -> v3:
  - add after-fix benchmark to commit log
  - rename vhost_log_dev_enabled to vhost_dev_should_log
  - remove unneeded comparisons for backend_type
  - use QLIST array instead of single flat list to store vhost
logger devices
  - simplify logger election logic

---
 hw/virtio/vhost.c | 63 ++-
 include/hw/virtio/vhost.h |  1 +
 2 files changed, 58 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index efe2f74..d91858b 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -45,6 +45,7 @@
 
 static struct vhost_log *vhost_log[VHOST_BACKEND_TYPE_MAX];
 static struct vhost_log *vhost_log_shm[VHOST_BACKEND_TYPE_MAX];
+static QLIST_HEAD(, vhost_dev) vhost_log_devs[VHOST_BACKEND_TYPE_MAX];
 
 /* Memslots used by backends that support private memslots (without an fd). */
 static unsigned int used_memslots;
@@ -149,6 +150,43 @@ bool vhost_dev_has_iommu(struct vhost_dev *dev)
 }
 }
 
+static inline bool vhost_dev_should_log(struct vhost_dev *dev)
+{
+assert(dev->vhost_ops);
+assert(dev->vhost_ops->backend_type > VHOST_BACKEND_TYPE_NONE);
+assert(dev->vhost_ops->backend_type < VHOST_BACKEND_TYPE_MAX);
+
+return dev == QLIST_FIRST(&vhost_log_devs[dev->vhost_ops->backend_type]);
+}
+
+static inline void vhost_dev_elect_mem_logger(struct vhost_dev *hdev, bool add)
+{
+VhostBackendType backend_type;
+
+assert(hdev->vhost_ops);
+
+backend_type = hdev->vhost_ops->backend_type;
+assert(backend_type > VHOST_BACKEND_TYPE_NONE);
+assert(backend_type < VHOST_BACKEND_TYPE_MAX);
+
+if (add && !QLIST_IS_INSERTED(hdev, logdev_entry)) {
+if (QLIST_EMPTY(&vhost_log_devs[backend_type])) {
+QLIST_INSERT_HEAD(&vhost_log_devs[backend_type],
+  hdev, logdev_entry);
+} else {
+/*
+ * The first vhost_device in the list is selected as the shared
+ * logger to scan memory sections. Put new entry next to the head
+ * to avoid inadvertent change to the underlying logger device.
+ */
+QLIST_INSERT_AFTER(QLIST_FIRST(&vhost_log_devs[backend_type]),
+   hdev, logdev_entry);
+}
+} else if (!add && QLIST_IS_INSERTED(hdev, logdev_entry)) {
+QLIST_REMOVE(hdev, logdev_entry);
+}
+}
+
 static int vhost_sync_dirty_bitmap(struct vhost_dev *dev,
MemoryRegionSection *section,
hwaddr first,
@@ -166,12 +204,14 @@ static int vhost_sync_dirty_bitmap(struct vhost_dev *dev,
 start_addr = MAX(first, start_addr);
 end_addr = MIN(last, end_addr);
 
-for (i = 0; i < dev->mem->nregions; ++i) {
-struct vhost_memory_region *reg = dev->mem->regions + i;
-vhost_dev_sync_region(dev, section, start_addr, end_addr,
-  reg->guest_phys_addr,
-  range_get_last(reg->guest_phys_addr,
- reg->memory_size));
+if (vhost_dev_should_log(dev)) {
+for (i = 0; i < dev->mem->nregions; ++i) {
+struct vhost_memory_region *reg = dev->mem->regions + i;
+vhost_dev_sync_region(dev, section,

[PATCH v3 1/2] vhost: dirty log should be per backend type

2024-03-14 Thread Si-Wei Liu

There could be a mix of both vhost-user and vhost-kernel clients
in the same QEMU process, where separate vhost loggers for the
specific vhost type have to be used. Make the vhost logger per
backend type, and have them properly reference counted.

Suggested-by: Michael S. Tsirkin 
Signed-off-by: Si-Wei Liu 
---
v2->v3: 
  - remove non-effective assertion that never be reached
  - do not return NULL from vhost_log_get()
  - add neccessary assertions to vhost_log_get()

---
 hw/virtio/vhost.c | 50 ++
 1 file changed, 38 insertions(+), 12 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 2c9ac79..efe2f74 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -43,8 +43,8 @@
 do { } while (0)
 #endif
 
-static struct vhost_log *vhost_log;
-static struct vhost_log *vhost_log_shm;
+static struct vhost_log *vhost_log[VHOST_BACKEND_TYPE_MAX];
+static struct vhost_log *vhost_log_shm[VHOST_BACKEND_TYPE_MAX];
 
 /* Memslots used by backends that support private memslots (without an fd). */
 static unsigned int used_memslots;
@@ -287,6 +287,10 @@ static int vhost_set_backend_type(struct vhost_dev *dev,
 r = -1;
 }
 
+if (r == 0) {
+assert(dev->vhost_ops->backend_type == backend_type);
+}
+
 return r;
 }
 
@@ -319,16 +323,22 @@ static struct vhost_log *vhost_log_alloc(uint64_t size, 
bool share)
 return log;
 }
 
-static struct vhost_log *vhost_log_get(uint64_t size, bool share)
+static struct vhost_log *vhost_log_get(VhostBackendType backend_type,
+   uint64_t size, bool share)
 {
-struct vhost_log *log = share ? vhost_log_shm : vhost_log;
+struct vhost_log *log;
+
+assert(backend_type > VHOST_BACKEND_TYPE_NONE);
+assert(backend_type < VHOST_BACKEND_TYPE_MAX);
+
+log = share ? vhost_log_shm[backend_type] : vhost_log[backend_type];
 
 if (!log || log->size != size) {
 log = vhost_log_alloc(size, share);
 if (share) {
-vhost_log_shm = log;
+vhost_log_shm[backend_type] = log;
 } else {
-vhost_log = log;
+vhost_log[backend_type] = log;
 }
 } else {
 ++log->refcnt;
@@ -340,11 +350,20 @@ static struct vhost_log *vhost_log_get(uint64_t size, 
bool share)
 static void vhost_log_put(struct vhost_dev *dev, bool sync)
 {
 struct vhost_log *log = dev->log;
+VhostBackendType backend_type;
 
 if (!log) {
 return;
 }
 
+assert(dev->vhost_ops);
+backend_type = dev->vhost_ops->backend_type;
+
+if (backend_type == VHOST_BACKEND_TYPE_NONE ||
+backend_type >= VHOST_BACKEND_TYPE_MAX) {
+return;
+}
+
 --log->refcnt;
 if (log->refcnt == 0) {
 /* Sync only the range covered by the old log */
@@ -352,13 +371,13 @@ static void vhost_log_put(struct vhost_dev *dev, bool 
sync)
 vhost_log_sync_range(dev, 0, dev->log_size * VHOST_LOG_CHUNK - 1);
 }
 
-if (vhost_log == log) {
+if (vhost_log[backend_type] == log) {
 g_free(log->log);
-vhost_log = NULL;
-} else if (vhost_log_shm == log) {
+vhost_log[backend_type] = NULL;
+} else if (vhost_log_shm[backend_type] == log) {
 qemu_memfd_free(log->log, log->size * sizeof(*(log->log)),
 log->fd);
-vhost_log_shm = NULL;
+vhost_log_shm[backend_type] = NULL;
 }
 
 g_free(log);
@@ -376,7 +395,8 @@ static bool vhost_dev_log_is_shared(struct vhost_dev *dev)
 
 static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
 {
-struct vhost_log *log = vhost_log_get(size, vhost_dev_log_is_shared(dev));
+struct vhost_log *log = vhost_log_get(dev->vhost_ops->backend_type,
+  size, vhost_dev_log_is_shared(dev));
 uint64_t log_base = (uintptr_t)log->log;
 int r;
 
@@ -2037,8 +2057,14 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice 
*vdev, bool vrings)
 uint64_t log_base;
 
 hdev->log_size = vhost_get_log_size(hdev);
-hdev->log = vhost_log_get(hdev->log_size,
+hdev->log = vhost_log_get(hdev->vhost_ops->backend_type,
+  hdev->log_size,
   vhost_dev_log_is_shared(hdev));
+if (!hdev->log) {
+VHOST_OPS_DEBUG(r, "vhost_log_get failed");
+goto fail_vq;
+}
+
 log_base = (uintptr_t)hdev->log->log;
 r = hdev->vhost_ops->vhost_set_log_base(hdev,
 hdev->log_size ? log_base : 0,
-- 
1.8.3.1

Re: [PATCH v3 3/3] Add support for RAPL MSRs in KVM/Qemu

2024-03-14 Thread Daniel P . Berrangé

On Thu, Mar 14, 2024 at 09:26:53AM +0100, Anthony Harivel wrote:
> 
> Hi Daniel,
> 
> 
> > You don't need to access it via the /node/ hierarchy
> >
> > The canonical path for CPUs would be
> >
> >   /sys/devices/system/cpu/cpuNNN/topology
> >
> > The core_cpus_list file is giving you hyper-thread siblings within
> > a core, which I don't think is what you want.
> >
> > If you're after discrete physical packages, then 'package_cpus_list'
> > gives you all CPUs within a physical socket (package) I believe.
> >
> 
> Yes, this could work.
> However, on laptop, I've got: 
> cat package_cpus_list
> 0-11
> 
> Where on server: 
> package_cpus_list
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46
> 
> I asked my teammate: always the same results. This is I guess due to 
> either a difference in the kernel version or how the kernel is handling 
> the case where there is only one package, versus the case with multiple 
> packages.

Both are the same data format - it is a list of ranges. In the
laptop example, there's only a single range to parse. In the
server example there are many ranges but since each range is
only 1 cpu, it has collapsed the ranges to the single CPU id.

> Anyway, writing a C function to handle both cases might not be easy.

Approximately

  * Read the whole file with g_get_file_contents
  * Use  g_strsplit(data, ",", 0) to get a list of ranges
  * Iterate over the return list of ranges and g_strsplit(range, "-", 2);
  - The returned list should be either a single element (if there was
 no '-'), or a pair of elements (if there was a N-M)


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH v6 03/17] hw/loongarch: Add slave cpu boot_code

2024-03-14 Thread gaosong





在 2024/3/14 9:31, maobibo 写道:



On 2024/3/11 下午2:50, maobibo wrote:



On 2024/3/8 下午5:36, gaosong wrote:



在 2024/3/8 16:27, maobibo 写道:



On 2024/3/8 上午12:48, Song Gao wrote:

Signed-off-by: Song Gao 
Message-Id: <20240301093839.663947-4-gaos...@loongson.cn>
---
  hw/loongarch/boot.c | 70 
-

  1 file changed, 69 insertions(+), 1 deletion(-)

diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c
index 149deb2e01..e560ac178a 100644
--- a/hw/loongarch/boot.c
+++ b/hw/loongarch/boot.c
@@ -15,6 +15,54 @@
  #include "sysemu/reset.h"
  #include "sysemu/qtest.h"
+static const unsigned int slave_boot_code[] = {
+  /* Configure reset ebase. */
+    0x0400302c,   /* csrwr  $r12,0xc    */
+
+  /* Disable interrupt. */
+    0x0380100c,   /* ori    $r12,$r0,0x4    */
+    0x04000180,   /* csrxchg    $r0,$r12,0x0    */
+
+  /* Clear mailbox. */
+    0x142d,   /* lu12i.w    $r13,1(0x1) */
+    0x038081ad,   /* ori    $r13,$r13,0x20  */
+    0x06481da0,   /* iocsrwr.d  $r0,$r13    */
+
+  /* Enable IPI interrupt.  */
+    0x142c,   /* lu12i.w    $r12,1(0x1) */
+    0x0400118c,   /* csrxchg    $r12,$r12,0x4   */
+    0x02fffc0c,   /* addi.d $r12,$r0,-1(0xfff)  */
+    0x142d,   /* lu12i.w    $r13,1(0x1) */
+    0x038011ad,   /* ori    $r13,$r13,0x4   */
+    0x064819ac,   /* iocsrwr.w  $r12,$r13   */
+    0x142d,   /* lu12i.w    $r13,1(0x1) */
+    0x038081ad,   /* ori    $r13,$r13,0x20  */
+
+  /* Wait for wakeup  <.L11>:   */
+    0x06488000,   /* idle   0x0 */
+    0x0340,   /* andi   $r0,$r0,0x0 */
+    0x064809ac,   /* iocsrrd.w  $r12,$r13   */
+    0x43fff59f,   /* beqz   $r12,-12(0x74) # 48 <.L11> */
+
+  /* Read and clear IPI interrupt.  */
+    0x142d,   /* lu12i.w    $r13,1(0x1) */
+    0x064809ac,   /* iocsrrd.w  $r12,$r13   */
+    0x142d,   /* lu12i.w    $r13,1(0x1) */
+    0x038031ad,   /* ori    $r13,$r13,0xc   */
+    0x064819ac,   /* iocsrwr.w  $r12,$r13   */
+
+  /* Disable  IPI interrupt.    */
+    0x142c,   /* lu12i.w    $r12,1(0x1) */
+    0x04001180,   /* csrxchg    $r0,$r12,0x4    */
+
+  /* Read mail buf and jump to specified entry */
+    0x142d,   /* lu12i.w    $r13,1(0x1) */
+    0x038081ad,   /* ori    $r13,$r13,0x20  */
+    0x06480dac,   /* iocsrrd.d  $r12,$r13   */
+    0x00150181,   /* move   $r1,$r12    */
+    0x4c20,   /* jirl   $r0,$r1,0   */
+};
+
  static uint64_t cpu_loongarch_virt_to_phys(void *opaque, uint64_t 
addr)

  {
  return addr & MAKE_64BIT_MASK(0, TARGET_PHYS_ADDR_SPACE_BITS);
@@ -111,8 +159,15 @@ static void 
loongarch_firmware_boot(LoongArchMachineState *lams,

  fw_cfg_add_kernel_info(info, lams->fw_cfg);
  }
+static void init_boot_rom(struct loongarch_boot_info *info, void *p)
+{
+    memcpy(p, &slave_boot_code, sizeof(slave_boot_code));
+    p += sizeof(slave_boot_code);
+}
+
  static void loongarch_direct_kernel_boot(struct 
loongarch_boot_info *info)

  {
+    void  *p, *bp;
  int64_t kernel_addr = 0;
  LoongArchCPU *lacpu;
  CPUState *cs;
@@ -126,11 +181,24 @@ static void 
loongarch_direct_kernel_boot(struct loongarch_boot_info *info)

  }
  }
+    /* Load 'boot_rom' at [0 - 1MiB] */
+    p = g_malloc0(1 * MiB);
+    bp = p;
+    init_boot_rom(info, p);
+    rom_add_blob_fixed("boot_rom", bp, 1 * MiB, 0);
+
The secondary cpu waiting on the bootrom located memory address 
0x0-0x10.


Is it possible that primary cpu clears the memory located at bootrom
and then wakeup the secondary cpu?


I think it impossible，0-1M is ROM。

I am not sure whether it is ok if area between 0-1M is ROM.

For the memory map table, low memory area (0 - 256M) is still ddr ram.
And it is passed to kernel with fdt system table, rather than 
area(1-256M). Is that right?


There are some lines like this:
 /* Node0 memory */
 memmap_add_entry(VIRT_LOWMEM_BASE, VIRT_LOWMEM_SIZE, 1);

Song,

Can the base memory address of bootrom for secondary cpus be set as base 
address of flash like bios, such as VIRT_FLASH0_BASE/VIRT_FLASH1_BASE?

 > And ddr memory map area is kept unchanged.


Good suggestions, I wil do this on v7.

Thanks.
Song Gao


Regards
Bibo Mao



Regards
Bibo Mao



Thanks.
Song Gao

Regards
Bibo Mao


  CPU_FOREACH(cs) {
  lacpu = LOONGARCH_CPU(cs);
  lacpu->env.load_elf = true;
-    lacpu->env.elf_address = kernel_addr;
+    if (cs == first_cpu) {
+    lacpu->env.elf_address = kernel_addr;
+    } else {
+    lacpu->env.elf_address = 0;
+    }
+    lacpu->env.boot_info

Re: [PATCH v6 03/17] hw/loongarch: Add slave cpu boot_code

2024-03-14 Thread gaosong





在 2024/3/14 10:28, chen huacai 写道:

Song,

On Fri, Mar 8, 2024 at 12:51 AM Song Gao  wrote:


Signed-off-by: Song Gao 
Message-Id: <20240301093839.663947-4-gaos...@loongson.cn>
---
  hw/loongarch/boot.c | 70 -
  1 file changed, 69 insertions(+), 1 deletion(-)

diff --git a/hw/loongarch/boot.c b/hw/loongarch/boot.c
index 149deb2e01..e560ac178a 100644
--- a/hw/loongarch/boot.c
+++ b/hw/loongarch/boot.c
@@ -15,6 +15,54 @@
  #include "sysemu/reset.h"
  #include "sysemu/qtest.h"

+static const unsigned int slave_boot_code[] = {
+  /* Configure reset ebase. */
+0x0400302c,   /* csrwr  $r12,0xc*/

Use reg-names may be a little better than reg-nums.


Got it.

Thanks.
Song Gao


Huacai


+
+  /* Disable interrupt. */
+0x0380100c,   /* ori$r12,$r0,0x4*/
+0x04000180,   /* csrxchg$r0,$r12,0x0*/
+
+  /* Clear mailbox. */
+0x142d,   /* lu12i.w$r13,1(0x1) */
+0x038081ad,   /* ori$r13,$r13,0x20  */
+0x06481da0,   /* iocsrwr.d  $r0,$r13*/
+
+  /* Enable IPI interrupt.  */
+0x142c,   /* lu12i.w$r12,1(0x1) */
+0x0400118c,   /* csrxchg$r12,$r12,0x4   */
+0x02fffc0c,   /* addi.d $r12,$r0,-1(0xfff)  */
+0x142d,   /* lu12i.w$r13,1(0x1) */
+0x038011ad,   /* ori$r13,$r13,0x4   */
+0x064819ac,   /* iocsrwr.w  $r12,$r13   */
+0x142d,   /* lu12i.w$r13,1(0x1) */
+0x038081ad,   /* ori$r13,$r13,0x20  */
+
+  /* Wait for wakeup  <.L11>:   */
+0x06488000,   /* idle   0x0 */
+0x0340,   /* andi   $r0,$r0,0x0 */
+0x064809ac,   /* iocsrrd.w  $r12,$r13   */
+0x43fff59f,   /* beqz   $r12,-12(0x74) # 48 <.L11> */
+
+  /* Read and clear IPI interrupt.  */
+0x142d,   /* lu12i.w$r13,1(0x1) */
+0x064809ac,   /* iocsrrd.w  $r12,$r13   */
+0x142d,   /* lu12i.w$r13,1(0x1) */
+0x038031ad,   /* ori$r13,$r13,0xc   */
+0x064819ac,   /* iocsrwr.w  $r12,$r13   */
+
+  /* Disable  IPI interrupt.*/
+0x142c,   /* lu12i.w$r12,1(0x1) */
+0x04001180,   /* csrxchg$r0,$r12,0x4*/
+
+  /* Read mail buf and jump to specified entry */
+0x142d,   /* lu12i.w$r13,1(0x1) */
+0x038081ad,   /* ori$r13,$r13,0x20  */
+0x06480dac,   /* iocsrrd.d  $r12,$r13   */
+0x00150181,   /* move   $r1,$r12*/
+0x4c20,   /* jirl   $r0,$r1,0   */
+};
+
  static uint64_t cpu_loongarch_virt_to_phys(void *opaque, uint64_t addr)
  {
  return addr & MAKE_64BIT_MASK(0, TARGET_PHYS_ADDR_SPACE_BITS);
@@ -111,8 +159,15 @@ static void loongarch_firmware_boot(LoongArchMachineState 
*lams,
  fw_cfg_add_kernel_info(info, lams->fw_cfg);
  }

+static void init_boot_rom(struct loongarch_boot_info *info, void *p)
+{
+memcpy(p, &slave_boot_code, sizeof(slave_boot_code));
+p += sizeof(slave_boot_code);
+}
+
  static void loongarch_direct_kernel_boot(struct loongarch_boot_info *info)
  {
+void  *p, *bp;
  int64_t kernel_addr = 0;
  LoongArchCPU *lacpu;
  CPUState *cs;
@@ -126,11 +181,24 @@ static void loongarch_direct_kernel_boot(struct 
loongarch_boot_info *info)
  }
  }

+/* Load 'boot_rom' at [0 - 1MiB] */
+p = g_malloc0(1 * MiB);
+bp = p;
+init_boot_rom(info, p);
+rom_add_blob_fixed("boot_rom", bp, 1 * MiB, 0);
+
  CPU_FOREACH(cs) {
  lacpu = LOONGARCH_CPU(cs);
  lacpu->env.load_elf = true;
-lacpu->env.elf_address = kernel_addr;
+if (cs == first_cpu) {
+lacpu->env.elf_address = kernel_addr;
+} else {
+lacpu->env.elf_address = 0;
+}
+lacpu->env.boot_info = info;
  }
+
+g_free(bp);
  }

  void loongarch_load_kernel(MachineState *ms, struct loongarch_boot_info *info)
--
2.34.1

Re: [PATCH v4 05/23] qapi: create QAPISchemaDefinition

2024-03-14 Thread Markus Armbruster

John Snow  writes:

> Include entities don't have names, but we generally expect "entities" to
> have names. Reclassify all entities with names as *definitions*, leaving
> the nameless include entities as QAPISchemaEntity instances.
>
> This is primarily to help simplify typing around expectations of what
> callers expect for properties of an "entity".
>
> Suggested-by: Markus Armbruster 
> Signed-off-by: John Snow 
> ---
>  scripts/qapi/schema.py | 144 +++--
>  1 file changed, 82 insertions(+), 62 deletions(-)
>
> diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
> index 117f0f78f0c..da273c1649d 100644
> --- a/scripts/qapi/schema.py
> +++ b/scripts/qapi/schema.py

[...]

> @@ -107,6 +90,46 @@ def _set_module(self, schema, info):
>  def set_module(self, schema):
>  self._set_module(schema, self.info)
>  
> +def visit(self, visitor):
> +# pylint: disable=unused-argument
> +assert self._checked
> +
> +
> +class QAPISchemaDefinition(QAPISchemaEntity):
> +meta: Optional[str] = None
> +
> +def __init__(self, name: str, info, doc, ifcond=None, features=None):
> +assert isinstance(name, str)
> +super().__init__(info)
> +for f in features or []:
> +assert isinstance(f, QAPISchemaFeature)
> +f.set_defined_in(name)
> +self.name = name
> +self.doc = doc
> +self._ifcond = ifcond or QAPISchemaIfCond()
> +self.features = features or []
> +
> +def __repr__(self):
> +return "<%s:%s at 0x%x>" % (type(self).__name__, self.name,
> +id(self))
> +
> +def c_name(self):
> +return c_name(self.name)
> +
> +def check(self, schema):
> +assert not self._checked
> +seen = {}
> +for f in self.features:
> +f.check_clash(self.info, seen)
> +super().check(schema)

v3 called super().check() first.  Why did you move it?  I'm asking just
to make sure I'm not missing something subtle.

> +
> +def connect_doc(self, doc=None):
> +super().connect_doc(doc)
> +doc = doc or self.doc
> +if doc:
> +for f in self.features:
> +doc.connect_feature(f)
> +
>  @property
>  def ifcond(self):
>  assert self._checked

[...]

RE: [PATCH v2] vfio/iommufd: Fix memory leak

2024-03-14 Thread Duan, Zhenzhong



>-Original Message-
>From: Cédric Le Goater 
>Subject: [PATCH v2] vfio/iommufd: Fix memory leak
>
>Coverity reported a memory leak on variable 'contents' in routine
>iommufd_cdev_getfd(). Use g_autofree variables to simplify the exit
>path and get rid of g_free() calls.
>
>Cc: Eric Auger 
>Cc: Yi Liu 
>Fixes: CID 1540007
>Fixes: 5ee3dc7af785 ("vfio/iommufd: Implement the iommufd backend")
>Suggested-by: Zhenzhong Duan 
>Signed-off-by: Cédric Le Goater 

Reviewed-by: Zhenzhong Duan 

Thanks
Zhenzhong

>---
> hw/vfio/iommufd.c | 19 ---
> 1 file changed, 8 insertions(+), 11 deletions(-)
>
>diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>index
>a75a785e90c64cdcc4d10c88d217801b3f536cdb..b9c7efb3ef11e49e189103
>ae6fb9011a631b60da 100644
>--- a/hw/vfio/iommufd.c
>+++ b/hw/vfio/iommufd.c
>@@ -118,10 +118,12 @@ static int iommufd_cdev_getfd(const char
>*sysfs_path, Error **errp)
> {
> ERRP_GUARD();
> long int ret = -ENOTTY;
>-char *path, *vfio_dev_path = NULL, *vfio_path = NULL;
>+g_autofree char *path = NULL;
>+g_autofree char *vfio_dev_path = NULL;
>+g_autofree char *vfio_path = NULL;
> DIR *dir = NULL;
> struct dirent *dent;
>-gchar *contents;
>+g_autofree gchar *contents = NULL;
> gsize length;
> int major, minor;
> dev_t vfio_devt;
>@@ -130,7 +132,7 @@ static int iommufd_cdev_getfd(const char
>*sysfs_path, Error **errp)
> dir = opendir(path);
> if (!dir) {
> error_setg_errno(errp, errno, "couldn't open directory %s", path);
>-goto out_free_path;
>+goto out;
> }
>
> while ((dent = readdir(dir))) {
>@@ -147,14 +149,13 @@ static int iommufd_cdev_getfd(const char
>*sysfs_path, Error **errp)
>
> if (!g_file_get_contents(vfio_dev_path, &contents, &length, NULL)) {
> error_setg(errp, "failed to load \"%s\"", vfio_dev_path);
>-goto out_free_dev_path;
>+goto out_close_dir;
> }
>
> if (sscanf(contents, "%d:%d", &major, &minor) != 2) {
> error_setg(errp, "failed to get major:minor for \"%s\"", 
> vfio_dev_path);
>-goto out_free_dev_path;
>+goto out_close_dir;
> }
>-g_free(contents);
> vfio_devt = makedev(major, minor);
>
> vfio_path = g_strdup_printf("/dev/vfio/devices/%s", dent->d_name);
>@@ -164,17 +165,13 @@ static int iommufd_cdev_getfd(const char
>*sysfs_path, Error **errp)
> }
>
> trace_iommufd_cdev_getfd(vfio_path, ret);
>-g_free(vfio_path);
>
>-out_free_dev_path:
>-g_free(vfio_dev_path);
> out_close_dir:
> closedir(dir);
>-out_free_path:
>+out:
> if (*errp) {
> error_prepend(errp, VFIO_MSG_PREFIX, path);
> }
>-g_free(path);
>
> return ret;
> }
>--
>2.44.0

[PATCH v4] target/riscv: Implement dynamic establishment of custom decoder

2024-03-14 Thread Huang Tao

In this patch, we modify the decoder to be a freely composable data
structure instead of a hardcoded one. It can be dynamically builded up
according to the extensions.
This approach has several benefits:
1. Provides support for heterogeneous cpu architectures. As we add decoder in
   RISCVCPU, each cpu can have their own decoder, and the decoders can be
   different due to cpu's features.
2. Improve the decoding efficiency. We run the guard_func to see if the decoder
   can be added to the dynamic_decoder when building up the decoder. Therefore,
   there is no need to run the guard_func when decoding each instruction. It can
   improve the decoding efficiency
3. For vendor or dynamic cpus, it allows them to customize their own decoder
   functions to improve decoding efficiency, especially when vendor-defined
   instruction sets increase. Because of dynamic building up, it can skip the 
other
   decoder guard functions when decoding.
4. Pre patch for allowing adding a vendor decoder before decode_insn32() with 
minimal
   overhead for users that don't need this particular vendor decoder.

Signed-off-by: Huang Tao 
Suggested-by: Christoph Muellner 
Co-authored-by: LIU Zhiwei 
Reviewed-by: Richard Henderson 
---
Changes in v4:
- fix typo
- rename function
- add 'if tcg_enable()'
- move function to tcg-cpu.c and declarations to tcg-cpu.h

Changes in v3:
- use GPtrArray to save decode function poionter list.
---
 target/riscv/cpu.c |  1 +
 target/riscv/cpu.h |  1 +
 target/riscv/tcg/tcg-cpu.c | 15 +++
 target/riscv/tcg/tcg-cpu.h | 15 +++
 target/riscv/translate.c   | 31 +++
 5 files changed, 47 insertions(+), 16 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index c160b9216b..17070b82a7 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1132,6 +1132,7 @@ void riscv_cpu_finalize_features(RISCVCPU *cpu, Error 
**errp)
 error_propagate(errp, local_err);
 return;
 }
+riscv_tcg_cpu_finalize_dynamic_decoder(cpu);
 } else if (kvm_enabled()) {
 riscv_kvm_cpu_finalize_features(cpu, &local_err);
 if (local_err != NULL) {
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 3b1a02b944..48e67410e1 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -457,6 +457,7 @@ struct ArchCPU {
 uint32_t pmu_avail_ctrs;
 /* Mapping of events to counters */
 GHashTable *pmu_event_ctr_map;
+const GPtrArray *decoders;
 };
 
 /**
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index ab6db817db..c9ab92ea2f 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -853,6 +853,21 @@ void riscv_tcg_cpu_finalize_features(RISCVCPU *cpu, Error 
**errp)
 }
 }
 
+void riscv_tcg_cpu_finalize_dynamic_decoder(RISCVCPU *cpu)
+{
+GPtrArray *dynamic_decoders;
+dynamic_decoders = g_ptr_array_sized_new(decoder_table_size);
+for (size_t i = 0; i < decoder_table_size; ++i) {
+if (decoder_table[i].guard_func &&
+decoder_table[i].guard_func(&cpu->cfg)) {
+g_ptr_array_add(dynamic_decoders,
+(gpointer)decoder_table[i].riscv_cpu_decode_fn);
+}
+}
+
+cpu->decoders = dynamic_decoders;
+}
+
 bool riscv_cpu_tcg_compatible(RISCVCPU *cpu)
 {
 return object_dynamic_cast(OBJECT(cpu), TYPE_RISCV_CPU_HOST) == NULL;
diff --git a/target/riscv/tcg/tcg-cpu.h b/target/riscv/tcg/tcg-cpu.h
index f7b32417f8..ce94253fe4 100644
--- a/target/riscv/tcg/tcg-cpu.h
+++ b/target/riscv/tcg/tcg-cpu.h
@@ -26,4 +26,19 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error 
**errp);
 void riscv_tcg_cpu_finalize_features(RISCVCPU *cpu, Error **errp);
 bool riscv_cpu_tcg_compatible(RISCVCPU *cpu);
 
+struct DisasContext;
+struct RISCVCPUConfig;
+typedef struct RISCVDecoder {
+bool (*guard_func)(const struct RISCVCPUConfig *);
+bool (*riscv_cpu_decode_fn)(struct DisasContext *, uint32_t);
+} RISCVDecoder;
+
+typedef bool (*riscv_cpu_decode_fn)(struct DisasContext *, uint32_t);
+
+extern const size_t decoder_table_size;
+
+extern const RISCVDecoder decoder_table[];
+
+void riscv_tcg_cpu_finalize_dynamic_decoder(RISCVCPU *cpu);
+
 #endif
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index ea5d52b2ef..bce16d5054 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -37,6 +37,8 @@
 #include "exec/helper-info.c.inc"
 #undef  HELPER_H
 
+#include "tcg/tcg-cpu.h"
+
 /* global register indices */
 static TCGv cpu_gpr[32], cpu_gprh[32], cpu_pc, cpu_vl, cpu_vstart;
 static TCGv_i64 cpu_fpr[32]; /* assume F and D extensions */
@@ -117,6 +119,7 @@ typedef struct DisasContext {
 bool frm_valid;
 /* TCG of the current insn_start */
 TCGOp *insn_start;
+const GPtrArray *decoders;
 } DisasContext;
 
 static inline bool has_ext(DisasContext *ctx, uint32_t ext)
@@ -1120,21 +1123,16 @@ static inline int insn_len(uint16_t first_wo

[PATCH v2 3/4] tests/avocado: use OpenBSD 7.4 for sbsa-ref

2024-03-14 Thread Marcin Juszkiewicz

7.4 was released in October 2023, time to update before 7.3 gets dropped.

Disabled tests for 'max' cpu as OpenBSD fails there.

Signed-off-by: Marcin Juszkiewicz 
---
 tests/avocado/machine_aarch64_sbsaref.py | 47 +++-
 1 file changed, 34 insertions(+), 13 deletions(-)

diff --git a/tests/avocado/machine_aarch64_sbsaref.py 
b/tests/avocado/machine_aarch64_sbsaref.py
index 259225f15f..0e52dc5854 100644
--- a/tests/avocado/machine_aarch64_sbsaref.py
+++ b/tests/avocado/machine_aarch64_sbsaref.py
@@ -159,14 +159,14 @@ def test_sbsaref_alpine_linux_max(self):
 # This tests the whole boot chain from EFI to Userspace
 # We only boot a whole OS for the current top level CPU and GIC
 # Other test profiles should use more minimal boots
-def boot_openbsd73(self, cpu):
+def boot_openbsd(self, cpu):
 self.fetch_firmware()
 
 img_url = (
-"https://cdn.openbsd.org/pub/OpenBSD/7.3/arm64/miniroot73.img";
+"https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/miniroot74.img";
 )
 
-img_hash = 
"7fc2c75401d6f01fbfa25f4953f72ad7d7c18650056d30755c44b9c129b707e5"
+img_hash = 
"7b08b2ce081cff6408d183f7152ddcfd2779912104866e4fdf6ae2d864b51142"
 img_path = self.fetch_asset(img_url, algorithm="sha256", 
asset_hash=img_hash)
 
 self.vm.set_console()
@@ -180,23 +180,44 @@ def boot_openbsd73(self, cpu):
 self.vm.launch()
 wait_for_console_pattern(self,
  "Welcome to the OpenBSD/arm64"
- " 7.3 installation program.")
+ " 7.4 installation program.")
 
-def test_sbsaref_openbsd73_cortex_a57(self):
+def test_sbsaref_openbsd_cortex_a57(self):
 """
 :avocado: tags=cpu:cortex-a57
+:avocado: tags=os:openbsd
 """
-self.boot_openbsd73("cortex-a57")
+self.boot_openbsd("cortex-a57")
 
-def test_sbsaref_openbsd73_neoverse_n1(self):
+def test_sbsaref_openbsd_neoverse_n1(self):
 """
 :avocado: tags=cpu:neoverse-n1
+:avocado: tags=os:openbsd
 """
-self.boot_openbsd73("neoverse-n1")
+self.boot_openbsd("neoverse-n1")
 
-def test_sbsaref_openbsd73_max(self):
-"""
-:avocado: tags=cpu:max
-"""
-self.boot_openbsd73("max")
+# OpenBSD 7.4 does not boot on current max cpu.
+#
+#   def test_sbsaref_openbsd_max_pauth_off(self):
+#   """
+#   :avocado: tags=cpu:max
+#   :avocado: tags=os:openbsd
+#   """
+#   self.boot_openbsd("max,pauth=off")
+
+#   @skipUnless(os.getenv('AVOCADO_TIMEOUT_EXPECTED'), 'Test might timeout')
+#   def test_sbsaref_openbsd_max_pauth_impdef(self):
+#   """
+#   :avocado: tags=cpu:max
+#   :avocado: tags=os:openbsd
+#   """
+#   self.boot_openbsd("max,pauth-impdef=on")
+
+#   @skipUnless(os.getenv('AVOCADO_TIMEOUT_EXPECTED'), 'Test might timeout')
+#   def test_sbsaref_openbsd_max(self):
+#   """
+#   :avocado: tags=cpu:max
+#   :avocado: tags=os:openbsd
+#   """
+#   self.boot_openbsd("max")
 

-- 
2.44.0

[PATCH v2 2/4] tests/avocado: drop virtio-rng from sbsa-ref tests

2024-03-14 Thread Marcin Juszkiewicz

sbsa-ref is supposed to emulate real hardware so virtio-rng-pci
does not fit here

Signed-off-by: Marcin Juszkiewicz 
---
 tests/avocado/machine_aarch64_sbsaref.py | 8 
 1 file changed, 8 deletions(-)

diff --git a/tests/avocado/machine_aarch64_sbsaref.py 
b/tests/avocado/machine_aarch64_sbsaref.py
index cbab793455..259225f15f 100644
--- a/tests/avocado/machine_aarch64_sbsaref.py
+++ b/tests/avocado/machine_aarch64_sbsaref.py
@@ -132,10 +132,6 @@ def boot_alpine_linux(self, cpu):
 cpu,
 "-drive",
 f"file={iso_path},format=raw",
-"-device",
-"virtio-rng-pci,rng=rng0",
-"-object",
-"rng-random,id=rng0,filename=/dev/urandom",
 )
 
 self.vm.launch()
@@ -179,10 +175,6 @@ def boot_openbsd73(self, cpu):
 cpu,
 "-drive",
 f"file={img_path},format=raw",
-"-device",
-"virtio-rng-pci,rng=rng0",
-"-object",
-"rng-random,id=rng0,filename=/dev/urandom",
 )
 
 self.vm.launch()

-- 
2.44.0

[PATCH v2 1/4] tests/avocado: update sbsa-ref firmware

2024-03-14 Thread Marcin Juszkiewicz

We now have CI job to build those and publish in space with
readable urls.

Firmware is built using Debian 'bookworm' cross toolchain (gcc 12.2.0).

Used versions:

- Trusted Firmware v2.10.2
- Tianocore EDK2 stable202402
- Tianocore EDK2 Platforms code commit 085c2fb

Signed-off-by: Marcin Juszkiewicz 
---
 tests/avocado/machine_aarch64_sbsaref.py | 40 +---
 1 file changed, 21 insertions(+), 19 deletions(-)

diff --git a/tests/avocado/machine_aarch64_sbsaref.py 
b/tests/avocado/machine_aarch64_sbsaref.py
index 528c7d2934..cbab793455 100644
--- a/tests/avocado/machine_aarch64_sbsaref.py
+++ b/tests/avocado/machine_aarch64_sbsaref.py
@@ -1,6 +1,6 @@
 # Functional test that boots a Linux kernel and checks the console
 #
-# SPDX-FileCopyrightText: 2023 Linaro Ltd.
+# SPDX-FileCopyrightText: 2023-2024 Linaro Ltd.
 # SPDX-FileContributor: Philippe Mathieu-Daudé 
 # SPDX-FileContributor: Marcin Juszkiewicz 
 #
@@ -32,34 +32,36 @@ def fetch_firmware(self):
 """
 Flash volumes generated using:
 
-- Fedora GNU Toolchain version 13.2.1 20230728 (Red Hat 13.2.1-1)
+Toolchain from Debian:
+aarch64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0
 
-- Trusted Firmware-A
-  https://github.com/ARM-software/arm-trusted-firmware/tree/7c3ff62d
+Used components:
+
+- Trusted Firmware 2.10.2
+- Tianocore EDK2 stable202402
+- Tianocore EDK2-platforms commit 085c2fb
 
-- Tianocore EDK II
-  https://github.com/tianocore/edk2/tree/0f9283429dd4
-  https://github.com/tianocore/edk2/tree/ad1c0394b177
-  https://github.com/tianocore/edk2-platforms/tree/d03a60523a60
 """
 
 # Secure BootRom (TF-A code)
 fs0_xz_url = (
-"https://fileserver.linaro.org/s/rE43RJyTfxPtBkc/";
-"download/SBSA_FLASH0.fd.xz"
+"https://artifacts.codelinaro.org/artifactory/linaro-419-sbsa-ref/";
+"20240313-116475/edk2/SBSA_FLASH0.fd.xz"
 )
-fs0_xz_hash = "cdb8e4ffdaaa79292b7b465693f9e5fae6b7062d"
-tar_xz_path = self.fetch_asset(fs0_xz_url, asset_hash=fs0_xz_hash)
+fs0_xz_hash = 
"637593749cc307dea7dc13265c32e5d020267552f22b18a31850b8429fc5e159"
+tar_xz_path = self.fetch_asset(fs0_xz_url, asset_hash=fs0_xz_hash,
+  algorithm='sha256')
 archive.extract(tar_xz_path, self.workdir)
 fs0_path = os.path.join(self.workdir, "SBSA_FLASH0.fd")
 
 # Non-secure rom (UEFI and EFI variables)
 fs1_xz_url = (
-"https://fileserver.linaro.org/s/AGWPDXbcqJTKS4R/";
-"download/SBSA_FLASH1.fd.xz"
+"https://artifacts.codelinaro.org/artifactory/linaro-419-sbsa-ref/";
+"20240313-116475/edk2/SBSA_FLASH1.fd.xz"
 )
-fs1_xz_hash = "411155ae6984334714dff08d5d628178e790c875"
-tar_xz_path = self.fetch_asset(fs1_xz_url, asset_hash=fs1_xz_hash)
+fs1_xz_hash = 
"cb0a5e8cf5e303c5d3dc106cfd5943ffe9714b86afddee7164c69ee1dd41991c"
+tar_xz_path = self.fetch_asset(fs1_xz_url, asset_hash=fs1_xz_hash,
+  algorithm='sha256')
 archive.extract(tar_xz_path, self.workdir)
 fs1_path = os.path.join(self.workdir, "SBSA_FLASH1.fd")
 
@@ -96,15 +98,15 @@ def test_sbsaref_edk2_firmware(self):
 
 # AP Trusted ROM
 wait_for_console_pattern(self, "Booting Trusted Firmware")
-wait_for_console_pattern(self, "BL1: v2.9(release):v2.9")
+wait_for_console_pattern(self, "BL1: v2.10.2(release):")
 wait_for_console_pattern(self, "BL1: Booting BL2")
 
 # Trusted Boot Firmware
-wait_for_console_pattern(self, "BL2: v2.9(release)")
+wait_for_console_pattern(self, "BL2: v2.10.2(release)")
 wait_for_console_pattern(self, "Booting BL31")
 
 # EL3 Runtime Software
-wait_for_console_pattern(self, "BL31: v2.9(release)")
+wait_for_console_pattern(self, "BL31: v2.10.2(release)")
 
 # Non-trusted Firmware
 wait_for_console_pattern(self, "UEFI firmware (version 1.0")

-- 
2.44.0

[PATCH v2 0/4] tests/avocado: update sbsa-ref firmware to latest

2024-03-14 Thread Marcin Juszkiewicz

Updating sbsa-ref firmware for QEMU CI was manual task. Now it is
replaced by CI job run on CodeLinaro Gitlab instance.

This patchset updates to current state:

- Trusted Firmware v2.10.2 (latest LTS)
- Tianocore EDK2 stable202402 (latest release)

And Tianocore EDK2-platforms commit 085c2fb (edk2-platforms does not
have releases).

Firmware images were built using Debian 'bookworm' cross gcc 12.2.0
compiler.

And while I am in that file I dropped use of 'virtio-rng-pci' device as
sbsa-ref is supposed to emulate physical hardware.

OpenBSD updated to 7.4 version.

(1/8) test_sbsaref_edk2_firmware: PASS (2.49 s)
(2/8) test_sbsaref_alpine_linux_cortex_a57: PASS (23.78 s)
(3/8) test_sbsaref_alpine_linux_neoverse_n1: PASS (23.38 s)
(4/8) test_sbsaref_alpine_linux_max_pauth_off: PASS (23.31 s)
(5/8) test_sbsaref_alpine_linux_max_pauth_impdef: PASS (29.27 s)
(6/8) test_sbsaref_alpine_linux_max: PASS (80.13 s)
(7/8) test_sbsaref_openbsd_cortex_a57: PASS (16.01 s)
(8/8) test_sbsaref_openbsd_neoverse_n1: PASS (16.05 s)

Signed-off-by: Marcin Juszkiewicz 
---
Changes in v2:
- disabled 'max' tests on OpenBSD
- moved tags to 'one tag per line'
- added 'os:linux' tags to Alpine ones
- Link to v1: 
https://lore.kernel.org/r/20240313-sbsa-ref-firmware-update-v1-0-e166703c5...@linaro.org

---
Marcin Juszkiewicz (4):
  tests/avocado: update sbsa-ref firmware
  tests/avocado: drop virtio-rng from sbsa-ref tests
  tests/avocado: use OpenBSD 7.4 for sbsa-ref
  tests/avocado: sbsa-ref: add Alpine tests for misc 'max' setup

 tests/avocado/machine_aarch64_sbsaref.py | 113 ---
 1 file changed, 73 insertions(+), 40 deletions(-)
---
base-commit: 0748129684be2773117b0b8fc3c60161abdb7bb8
change-id: 20240313-sbsa-ref-firmware-update-7579d9f6d59b

Best regards,
-- 
Marcin Juszkiewicz

[PATCH v2 4/4] tests/avocado: sbsa-ref: add Alpine tests for misc 'max' setup

2024-03-14 Thread Marcin Juszkiewicz

PAuth makes run timeout on CI so add tests using 'max' without it
and with impdef one.

Signed-off-by: Marcin Juszkiewicz 
---
 tests/avocado/machine_aarch64_sbsaref.py | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/tests/avocado/machine_aarch64_sbsaref.py 
b/tests/avocado/machine_aarch64_sbsaref.py
index 0e52dc5854..0dd2eff0de 100644
--- a/tests/avocado/machine_aarch64_sbsaref.py
+++ b/tests/avocado/machine_aarch64_sbsaref.py
@@ -140,18 +140,36 @@ def boot_alpine_linux(self, cpu):
 def test_sbsaref_alpine_linux_cortex_a57(self):
 """
 :avocado: tags=cpu:cortex-a57
+:avocado: tags=os:linux
 """
 self.boot_alpine_linux("cortex-a57")
 
 def test_sbsaref_alpine_linux_neoverse_n1(self):
 """
 :avocado: tags=cpu:neoverse-n1
+:avocado: tags=os:linux
 """
 self.boot_alpine_linux("neoverse-n1")
 
+def test_sbsaref_alpine_linux_max_pauth_off(self):
+"""
+:avocado: tags=cpu:max
+:avocado: tags=os:linux
+"""
+self.boot_alpine_linux("max,pauth=off")
+
+def test_sbsaref_alpine_linux_max_pauth_impdef(self):
+"""
+:avocado: tags=cpu:max
+:avocado: tags=os:linux
+"""
+self.boot_alpine_linux("max,pauth-impdef=on")
+
+@skipUnless(os.getenv('AVOCADO_TIMEOUT_EXPECTED'), 'Test might timeout')
 def test_sbsaref_alpine_linux_max(self):
 """
 :avocado: tags=cpu:max
+:avocado: tags=os:linux
 """
 self.boot_alpine_linux("max")
 

-- 
2.44.0

Re: [PATCH for-9.0 v14 3/8] target/riscv: always clear vstart in whole vec move insns

2024-03-14 Thread Daniel Henrique Barboza





On 3/13/24 19:36, Richard Henderson wrote:

On 3/13/24 12:01, Daniel Henrique Barboza wrote:

These insns have 2 paths: we'll either have vstart already cleared if
vstart_eq_zero or we'll do a brcond to check if vstart >= maxsz to call
the 'vmvr_v' helper. The helper will clear vstart if it executes until
the end, or if vstart >= vl.

However, if vstart >= maxsz, the helper will be skipped, and vstart
won't be cleared since the helper is being responsible from doing it.

We want to make the helpers responsible to manage vstart, including
these corner cases, precisely to avoid these situations. Move the vstart

= maxsz cond to the helper, and be sure to clear vstart if that

happens. This way we're now sure that vstart is being cleared in the end
of the execution, regardless of the path taken.

Fixes: f714361ed7 ("target/riscv: rvv-1.0: implement vstart CSR")
Signed-off-by: Daniel Henrique Barboza 
---
  target/riscv/insn_trans/trans_rvv.c.inc | 3 ---
  target/riscv/vector_helper.c    | 5 +
  2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index 8c16a9f5b3..52c26a7834 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -3664,12 +3664,9 @@ static bool trans_##NAME(DisasContext *s, arg_##NAME * 
a)   \
   vreg_ofs(s, a->rs2), maxsz, maxsz);    \
  mark_vs_dirty(s);   \
  } else {    \
-    TCGLabel *over = gen_new_label();   \
-    tcg_gen_brcondi_tl(TCG_COND_GEU, cpu_vstart, maxsz, over);  \
  tcg_gen_gvec_2_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), \
 tcg_env, maxsz, maxsz, 0, gen_helper_vmvr_v); \
  mark_vs_dirty(s);   \
-    gen_set_label(over);    \
  }   \
  return true;    \
  }   \
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index ca79571ae2..cd8235ea98 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -5075,6 +5075,11 @@ void HELPER(vmvr_v)(void *vd, void *vs2, CPURISCVState 
*env, uint32_t desc)
  uint32_t startb = env->vstart * sewb;
  uint32_t i = startb;
+    if (env->vstart >= maxsz) {
+    env->vstart = 0;
+    return;
+    }


I think you were right to be concerned about vlmax -- you need to use startb 
not vstart in this comparison, otherwise you've got mixed units on the 
comparison.


Hm, so it seems like the comparison is broken in the translation today as well 
...





  memcpy((uint8_t *)vd + H1(i),
 (uint8_t *)vs2 + H1(i),
 maxsz - startb);


Unrelated to this patch series, this has a big-endian host error.
With big-endian, the bytes to be copied may not be sequential.


We're going for v15. Might as well fix this too.

At this point I'll rename the series to "riscv: vector fixes" since there's
a lot of stuff going on.



     if (HOST_BIG_ENDIAN && i % 8 != 0) {
     uint32_t j = ROUND_UP(i, 8);
     memcpy(vd + H(j - 1), vs2 + H1(j - 1), j - i);
     i = j;
     }
     memcpy(vd + i, vs2 + i, maxsz - i);


I'll do an extra patch with it. Thanks,

Daniel




r~

Re: [PATCH for-9.0 v14 4/8] target/riscv/vector_helpers: do early exit when vstart >= vl

2024-03-14 Thread Daniel Henrique Barboza





On 3/14/24 00:52, Max Chou wrote:

Hi Daniel,

According the v spec section 15.2 & 15.3.

"The vcpop.m instruction writes x[rd] even if vl=0 (with the value 0, since no 
mask elements are active).
   Traps on vcpop.m are always reported with a vstart of 0. The vcpop.m instruction 
will raise an illegal instruction exception if vstart is non-zero."

"The vfirst.m instruction writes x[rd] even if vl=0 (with the value -1, since 
no mask elements are active).
  Traps on vfirst are always reported with a vstart of 0. The vfirst instruction 
will raise an illegal instruction exception if vstart is non-zero."

Both the vcpop.m and vfirst.m instructions will raise illegal instruction 
exception with non-zero vstart.

And currently both the trans_vcpop_m and trans_vfirst_m translate functions 
check the vstart_eq_zero flag.
So I think the early exit checking in the vcpop.m and vfirstm helper functions 
may be redundant.


@@ -4585,6 +4641,11 @@ target_ulong HELPER(vcpop_m)(void *v0, void *vs2, 
CPURISCVState *env,
  uint32_t vl = env->vl;
  int i;
  
+if (env->vstart >= env->vl) {

+env->vstart = 0;
+return 0;
+}
+
  for (i = env->vstart; i < vl; i++) {
  if (vm || vext_elem_mask(v0, i)) {
  if (vext_elem_mask(vs2, i)) {


According v spec section 15.3

""The vfirst.m instruction writes x[rd] even if vl=0 (with the value -1, since no 
mask elements are active)."

If both the vstart and vl are 0 here, the early exit checking will return the 
wrong value 0 (the return value should be -1) here.


Let's just remove these early exits from both vcpop.m and vfirst.m functions 
then.

Thanks,

Daniel




@@ -4604,6 +4665,11 @@ target_ulong HELPER(vfirst_m)(void *v0, void *vs2, 
CPURISCVState *env,
  uint32_t vl = env->vl;
  int i;
  
+if (env->vstart >= env->vl) {

+env->vstart = 0;
+return 0;
+}
+
  for (i = env->vstart; i < vl; i++) {
  if (vm || vext_elem_mask(v0, i)) {
  if (vext_elem_mask(vs2, i)) {

[PATCH v4] linux-aio: add IO_CMD_FDSYNC command support

2024-03-14 Thread Prasad Pandit

From: Prasad Pandit 

Libaio defines IO_CMD_FDSYNC command to sync all outstanding
asynchronous I/O operations, by flushing out file data to the
disk storage.

Enable linux-aio to submit such aio request. This helps to
reduce latency induced via pthread_create calls by
thread-pool (aio=threads).

Signed-off-by: Prasad Pandit 
---
 block/file-posix.c  |  7 +++
 block/linux-aio.c   | 21 -
 include/block/raw-aio.h |  1 +
 3 files changed, 28 insertions(+), 1 deletion(-)

v4: New boolean field to indicate if aio_fdsync is available or not.
It is set at file open time and checked before AIO_FLUSH call.
  - https://lists.nongnu.org/archive/html/qemu-devel/2024-03/msg03701.html

diff --git a/block/file-posix.c b/block/file-posix.c
index 35684f7e21..78a8cea03b 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -159,6 +159,7 @@ typedef struct BDRVRawState {
 bool has_discard:1;
 bool has_write_zeroes:1;
 bool use_linux_aio:1;
+bool has_laio_fdsync:1;
 bool use_linux_io_uring:1;
 int page_cache_inconsistent; /* errno from fdatasync failure */
 bool has_fallocate;
@@ -718,6 +719,7 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 ret = -EINVAL;
 goto fail;
 }
+s->has_laio_fdsync = laio_has_fdsync(s->fd);
 #else
 if (s->use_linux_aio) {
 error_setg(errp, "aio=native was specified, but is not supported "
@@ -2599,6 +2601,11 @@ static int coroutine_fn 
raw_co_flush_to_disk(BlockDriverState *bs)
 if (raw_check_linux_io_uring(s)) {
 return luring_co_submit(bs, s->fd, 0, NULL, QEMU_AIO_FLUSH);
 }
+#endif
+#ifdef CONFIG_LINUX_AIO
+if (s->has_laio_fdsync && raw_check_linux_aio(s)) {
+return laio_co_submit(s->fd, 0, NULL, QEMU_AIO_FLUSH, 0);
+}
 #endif
 return raw_thread_pool_submit(handle_aiocb_flush, &acb);
 }
diff --git a/block/linux-aio.c b/block/linux-aio.c
index ec05d946f3..e3b5ec9aba 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -384,6 +384,9 @@ static int laio_do_submit(int fd, struct qemu_laiocb 
*laiocb, off_t offset,
 case QEMU_AIO_READ:
 io_prep_preadv(iocbs, fd, qiov->iov, qiov->niov, offset);
 break;
+case QEMU_AIO_FLUSH:
+io_prep_fdsync(iocbs, fd);
+break;
 /* Currently Linux kernel does not support other operations */
 default:
 fprintf(stderr, "%s: invalid AIO request type 0x%x.\n",
@@ -412,7 +415,7 @@ int coroutine_fn laio_co_submit(int fd, uint64_t offset, 
QEMUIOVector *qiov,
 AioContext *ctx = qemu_get_current_aio_context();
 struct qemu_laiocb laiocb = {
 .co = qemu_coroutine_self(),
-.nbytes = qiov->size,
+.nbytes = qiov ? qiov->size : 0,
 .ctx= aio_get_linux_aio(ctx),
 .ret= -EINPROGRESS,
 .is_read= (type == QEMU_AIO_READ),
@@ -486,3 +489,19 @@ void laio_cleanup(LinuxAioState *s)
 }
 g_free(s);
 }
+
+bool laio_has_fdsync(int fd)
+{
+struct iocb cb;
+struct iocb *cbs[] = {&cb, NULL};
+
+io_context_t ctx = 0;
+io_setup(1, &ctx);
+
+/* check if host kernel supports IO_CMD_FDSYNC */
+io_prep_fdsync(&cb, fd);
+int ret = io_submit(ctx, 1, cbs);
+
+io_destroy(ctx);
+return (ret == -EINVAL) ? false : true;
+}
diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h
index 20e000b8ef..626706827f 100644
--- a/include/block/raw-aio.h
+++ b/include/block/raw-aio.h
@@ -60,6 +60,7 @@ void laio_cleanup(LinuxAioState *s);
 int coroutine_fn laio_co_submit(int fd, uint64_t offset, QEMUIOVector *qiov,
 int type, uint64_t dev_max_batch);
 
+bool laio_has_fdsync(int);
 void laio_detach_aio_context(LinuxAioState *s, AioContext *old_context);
 void laio_attach_aio_context(LinuxAioState *s, AioContext *new_context);
 #endif
-- 
2.44.0

[PATCH] file-posix: rearrange BDRVRawState fields

2024-03-14 Thread Prasad Pandit

From: Prasad Pandit 

Rearrange BRDVRawState structure fields to avoid memory
fragments in its object's memory and save some(~8) bytes
per object.

Signed-off-by: Prasad Pandit 
---
 block/file-posix.c | 39 +++
 1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 78a8cea03b..584346ea3e 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -135,12 +135,6 @@
 #define RAW_LOCK_SHARED_BASE   200
 
 typedef struct BDRVRawState {
-int fd;
-bool use_lock;
-int type;
-int open_flags;
-size_t buf_align;
-
 /* The current permissions. */
 uint64_t perm;
 uint64_t shared_perm;
@@ -151,29 +145,34 @@ typedef struct BDRVRawState {
 uint64_t locked_shared_perm;
 
 uint64_t aio_max_batch;
+struct {
+uint64_t discard_nb_ok;
+uint64_t discard_nb_failed;
+uint64_t discard_bytes_ok;
+} stats;
 
+PRManager *pr_mgr;
+BDRVReopenState *reopen_state;
+
+size_t buf_align;
+int fd;
+int type;
+int open_flags;
 int perm_change_fd;
 int perm_change_flags;
-BDRVReopenState *reopen_state;
+int page_cache_inconsistent; /* errno from fdatasync failure */
 
+bool use_lock;
+bool has_fallocate;
+bool needs_alignment;
+bool force_alignment;
+bool drop_cache;
+bool check_cache_dropped;
 bool has_discard:1;
 bool has_write_zeroes:1;
 bool use_linux_aio:1;
 bool has_laio_fdsync:1;
 bool use_linux_io_uring:1;
-int page_cache_inconsistent; /* errno from fdatasync failure */
-bool has_fallocate;
-bool needs_alignment;
-bool force_alignment;
-bool drop_cache;
-bool check_cache_dropped;
-struct {
-uint64_t discard_nb_ok;
-uint64_t discard_nb_failed;
-uint64_t discard_bytes_ok;
-} stats;
-
-PRManager *pr_mgr;
 } BDRVRawState;
 
 typedef struct BDRVRawReopenState {
-- 
2.44.0

[PATCH v6 0/3] Introduce sdtrig ISA extension

2024-03-14 Thread Himanshu Chauhan

All the CPUs may or may not implement the debug triggers. Some CPUs
may implement only debug specification v0.13 and not sdtrig ISA
extension.

This patchset, adds sdtrig ISA as an extension which can be turned on or off by
sdtrig= option. It is turned off by default.

When debug is true and sdtrig is false, the behaviour is as defined in debug
specification v0.13. If sdtrig is turned on, the behaviour is as defined
in the sdtrig ISA extension.

The "sdtrig" string is concatenated to ISA string when debug or sdtrig is 
enabled.

Changes from v1:
  - Replaced the debug property with ext_sdtrig
  - Marked it experimenatal by naming it x-sdtrig
  - x-sdtrig is added to ISA string
  - Reversed the patch order

Changes from v2:
  - Mark debug property as deprecated and replace internally with sdtrig 
extension
  - setting/unsetting debug property shows warning and sets/unsets ext_sdtrig
  - sdtrig is added to ISA string as RISC-V debug specification is frozen

Changes from v3:
  - debug propery is not deprecated but it is superceded by sdtrig extension
  - Mcontrol6 support is not published when only debug property is turned
on as debug spec v0.13 doesn't define mcontrol6 match triggers.
  - Enabling sdtrig extension turns of debug property and a warning is printed.
This doesn't break debug specification implemenation since sdtrig is
backward compatible with debug specification.
  - Disable debug property and enable sdtrig by default for Ventana's Veyron
CPUs.

Changes from v4:
  - Enable debug flag if sdtrig was enabled but debug was disabled.
  - Other cosmetic changes.

Changes from v5:
  - Addressed comments from Andrew Jones

Himanshu Chauhan (3):
  target/riscv: Enable mcontrol6 triggers only when sdtrig is selected
  target/riscv: Expose sdtrig ISA extension
  target/riscv: Enable sdtrig for Ventana's Veyron CPUs

 target/riscv/cpu.c | 12 +-
 target/riscv/cpu_cfg.h |  1 +
 target/riscv/csr.c |  2 +-
 target/riscv/debug.c   | 90 +-
 4 files changed, 65 insertions(+), 40 deletions(-)

-- 
2.34.1

[PATCH v6 2/3] target/riscv: Expose sdtrig ISA extension

2024-03-14 Thread Himanshu Chauhan

This patch adds "sdtrig" in the ISA string when sdtrig extension is enabled.
The sdtrig extension may or may not be implemented in a system. Therefore, the
   -cpu rv64,sdtrig=
option can be used to dynamically turn sdtrig extension on or off.

Since, the sdtrig ISA extension is a superset of debug specification, disable
the debug property when sdtrig is enabled. A warning is printed when this is
done.

By default, the sdtrig extension is disabled and debug property enabled as 
usual.

Signed-off-by: Himanshu Chauhan 
---
 target/riscv/cpu.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 2602aae9f5..66c91fffd6 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -175,6 +175,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(zvkt, PRIV_VERSION_1_12_0, ext_zvkt),
 ISA_EXT_DATA_ENTRY(zhinx, PRIV_VERSION_1_12_0, ext_zhinx),
 ISA_EXT_DATA_ENTRY(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin),
+ISA_EXT_DATA_ENTRY(sdtrig, PRIV_VERSION_1_12_0, ext_sdtrig),
 ISA_EXT_DATA_ENTRY(smaia, PRIV_VERSION_1_12_0, ext_smaia),
 ISA_EXT_DATA_ENTRY(smepmp, PRIV_VERSION_1_12_0, ext_smepmp),
 ISA_EXT_DATA_ENTRY(smstateen, PRIV_VERSION_1_12_0, ext_smstateen),
@@ -1008,6 +1009,11 @@ static void riscv_cpu_reset_hold(Object *obj)
 set_default_nan_mode(1, &env->fp_status);
 
 #ifndef CONFIG_USER_ONLY
+if (!cpu->cfg.debug && cpu->cfg.ext_sdtrig) {
+warn_report("Enabling 'debug' since 'sdtrig' is enabled.");
+cpu->cfg.debug = true;
+}
+
 if (cpu->cfg.debug || cpu->cfg.ext_sdtrig) {
 riscv_trigger_reset_hold(env);
 }
@@ -1480,6 +1486,7 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
 MULTI_EXT_CFG_BOOL("zvfhmin", ext_zvfhmin, false),
 MULTI_EXT_CFG_BOOL("sstc", ext_sstc, true),
 
+MULTI_EXT_CFG_BOOL("sdtrig", ext_sdtrig, false),
 MULTI_EXT_CFG_BOOL("smaia", ext_smaia, false),
 MULTI_EXT_CFG_BOOL("smepmp", ext_smepmp, false),
 MULTI_EXT_CFG_BOOL("smstateen", ext_smstateen, false),
-- 
2.34.1

[PATCH v6 1/3] target/riscv: Enable mcontrol6 triggers only when sdtrig is selected

2024-03-14 Thread Himanshu Chauhan

The mcontrol6 triggers are not defined in debug specification v0.13
These triggers are defined in sdtrig ISA extension.

This patch:
   * Adds ext_sdtrig capability which is used to select mcontrol6 triggers
   * Keeps the debug property. All triggers that are defined in v0.13 are
 exposed.

Signed-off-by: Himanshu Chauhan 
---
 target/riscv/cpu.c |  4 +-
 target/riscv/cpu_cfg.h |  1 +
 target/riscv/csr.c |  2 +-
 target/riscv/debug.c   | 90 +-
 4 files changed, 57 insertions(+), 40 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index c160b9216b..2602aae9f5 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1008,7 +1008,7 @@ static void riscv_cpu_reset_hold(Object *obj)
 set_default_nan_mode(1, &env->fp_status);
 
 #ifndef CONFIG_USER_ONLY
-if (cpu->cfg.debug) {
+if (cpu->cfg.debug || cpu->cfg.ext_sdtrig) {
 riscv_trigger_reset_hold(env);
 }
 
@@ -1168,7 +1168,7 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
**errp)
 riscv_cpu_register_gdb_regs_for_features(cs);
 
 #ifndef CONFIG_USER_ONLY
-if (cpu->cfg.debug) {
+if (cpu->cfg.debug || cpu->cfg.ext_sdtrig) {
 riscv_trigger_realize(&cpu->env);
 }
 #endif
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index 2040b90da0..0c57e1acd4 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -114,6 +114,7 @@ struct RISCVCPUConfig {
 bool ext_zvfbfwma;
 bool ext_zvfh;
 bool ext_zvfhmin;
+bool ext_sdtrig;
 bool ext_smaia;
 bool ext_ssaia;
 bool ext_sscofpmf;
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 726096444f..26623d3640 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -546,7 +546,7 @@ static RISCVException have_mseccfg(CPURISCVState *env, int 
csrno)
 
 static RISCVException debug(CPURISCVState *env, int csrno)
 {
-if (riscv_cpu_cfg(env)->debug) {
+if (riscv_cpu_cfg(env)->debug || riscv_cpu_cfg(env)->ext_sdtrig) {
 return RISCV_EXCP_NONE;
 }
 
diff --git a/target/riscv/debug.c b/target/riscv/debug.c
index e30d99cc2f..674223e966 100644
--- a/target/riscv/debug.c
+++ b/target/riscv/debug.c
@@ -100,13 +100,15 @@ static trigger_action_t get_trigger_action(CPURISCVState 
*env,
 target_ulong tdata1 = env->tdata1[trigger_index];
 int trigger_type = get_trigger_type(env, trigger_index);
 trigger_action_t action = DBG_ACTION_NONE;
+const RISCVCPUConfig *cfg = riscv_cpu_cfg(env);
 
 switch (trigger_type) {
 case TRIGGER_TYPE_AD_MATCH:
 action = (tdata1 & TYPE2_ACTION) >> 12;
 break;
 case TRIGGER_TYPE_AD_MATCH6:
-action = (tdata1 & TYPE6_ACTION) >> 12;
+if (cfg->ext_sdtrig)
+action = (tdata1 & TYPE6_ACTION) >> 12;
 break;
 case TRIGGER_TYPE_INST_CNT:
 case TRIGGER_TYPE_INT:
@@ -727,7 +729,12 @@ void tdata_csr_write(CPURISCVState *env, int tdata_index, 
target_ulong val)
 type2_reg_write(env, env->trigger_cur, tdata_index, val);
 break;
 case TRIGGER_TYPE_AD_MATCH6:
-type6_reg_write(env, env->trigger_cur, tdata_index, val);
+if (riscv_cpu_cfg(env)->ext_sdtrig) {
+type6_reg_write(env, env->trigger_cur, tdata_index, val);
+} else {
+qemu_log_mask(LOG_UNIMP, "trigger type: %d is not supported\n",
+  trigger_type);
+}
 break;
 case TRIGGER_TYPE_INST_CNT:
 itrigger_reg_write(env, env->trigger_cur, tdata_index, val);
@@ -750,9 +757,14 @@ void tdata_csr_write(CPURISCVState *env, int tdata_index, 
target_ulong val)
 
 target_ulong tinfo_csr_read(CPURISCVState *env)
 {
-/* assume all triggers support the same types of triggers */
-return BIT(TRIGGER_TYPE_AD_MATCH) |
-   BIT(TRIGGER_TYPE_AD_MATCH6);
+target_ulong ts = 0;
+
+ts = BIT(TRIGGER_TYPE_AD_MATCH);
+
+if (riscv_cpu_cfg(env)->ext_sdtrig)
+ts |= BIT(TRIGGER_TYPE_AD_MATCH6);
+
+return ts;
 }
 
 void riscv_cpu_debug_excp_handler(CPUState *cs)
@@ -803,19 +815,21 @@ bool riscv_cpu_debug_check_breakpoint(CPUState *cs)
 }
 break;
 case TRIGGER_TYPE_AD_MATCH6:
-ctrl = env->tdata1[i];
-pc = env->tdata2[i];
-
-if ((ctrl & TYPE6_EXEC) && (bp->pc == pc)) {
-if (env->virt_enabled) {
-/* check VU/VS bit against current privilege level */
-if ((ctrl >> 23) & BIT(env->priv)) {
-return true;
-}
-} else {
-/* check U/S/M bit against current privilege level */
-if ((ctrl >> 3) & BIT(env->priv)) {
-return true;
+if (cpu->cfg.ext_sdtrig) {
+ctrl = env->tdata1[i];
+pc = env->tdata2[i];
+
+if (

[PATCH v6 3/3] target/riscv: Enable sdtrig for Ventana's Veyron CPUs

2024-03-14 Thread Himanshu Chauhan

Ventana's Veyron CPUs support sdtrig ISA extension. By default, enable
the sdtrig extension and disable the debug property for these CPUs.

Signed-off-by: Himanshu Chauhan 
---
 target/riscv/cpu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 66c91fffd6..3c7ad1c903 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -569,6 +569,7 @@ static void rv64_veyron_v1_cpu_init(Object *obj)
 cpu->cfg.cbom_blocksize = 64;
 cpu->cfg.cboz_blocksize = 64;
 cpu->cfg.ext_zicboz = true;
+cpu->cfg.ext_sdtrig = true;
 cpu->cfg.ext_smaia = true;
 cpu->cfg.ext_ssaia = true;
 cpu->cfg.ext_sscofpmf = true;
-- 
2.34.1

[RFC PATCH v2] contrib/plugins: control flow plugin (WIP!)

2024-03-14 Thread Alex Bennée

This is a simple control flow tracking plugin that uses the latest
inline and conditional operations to detect and track control flow
changes. It is currently an exercise at seeing how useful the changes
are.

Based-on: <20240312075428.244210-1-pierrick.bouv...@linaro.org>
Cc: Gustavo Romero 
Cc: Pierrick Bouvier 
Signed-off-by: Alex Bennée 
Message-Id: <20240311153432.1395190-1-alex.ben...@linaro.org>

---
v2
  - only need a single call back
  - drop need for INSN_WIDTH
  - still don't understand the early exits

I'm still seeing weirdness in the generated code, for example the
plugin reports "early exits" which doesn't make sense for a ret which
terminates a block:

  addr: 0x403c88 hexchar: ret  (1/1)
early exits 1280
branches 1280
  to 0x403d00 (639)
  to 0x403d24 (639)

If I look in the debug I see:

2560 matches for "403c88" in buffer: debug.log
   7934:vcpu_tb_branched_exec: pc=403c84, npc=403c8c, lpc=403c5c, epbc=403c88
   7935:vcpu_tb_branched_exec: pc=403d00, npc=403d24, lpc=403c88, epbc=403d20
   7938:vcpu_tb_branched_exec: pc=403c84, npc=403c8c, lpc=403c5c, epbc=403c88
   7939:vcpu_tb_branched_exec: pc=403d24, npc=403d48, lpc=403c88, epbc=403d44
   7943:vcpu_tb_branched_exec: pc=403c84, npc=403c8c, lpc=403c5c, epbc=403c88

showing last pc values that don't sync up with epbc (end of block +
insn_width). AFAICT this exit is only translated once. This makes me
wonder if we are correctly setting values. Looking at the translator
debug (-d plugin,in_asm,op,op_opt,out_asm -dfilter
0x403c00..0x403d00):


IN: hexchar
0x00403c84:  a8c27bfd  ldp  x29, x30, [sp], #0x20
0x00403c88:  d65f03c0  ret

OP:
 ld_i32 loc0,env,$0xfff0
 brcond_i32 loc0,$0x0,lt,$L0

^ is the start of block check

 st8_i32 $0x0,env,$0xfff4
 ld_i32 tmp2,env,$0xdb08
 mul_i32 tmp2,tmp2,$0x18
 ext_i32_i64 tmp3,tmp2
 mov_i64 tmp5,$0x5612c297b8a0
 add_i64 tmp5,tmp5,tmp3
 mov_i64 tmp4,$0x403c88
 st_i64 tmp4,tmp5,$0x0

^ is this a store to the scoreboard?

 ld_i32 tmp2,env,$0xdb08
 mul_i32 tmp2,tmp2,$0x18
 ext_i32_i64 tmp3,tmp2
 mov_i64 tmp5,$0x5612c297b8a8
 add_i64 tmp5,tmp5,tmp3
 mov_i64 tmp4,$0x403c8c
 st_i64 tmp4,tmp5,$0x0

^ and I guess this? Should we be honouring the order we add TB
instrumentation or is it driven by the different types of operation?

 mov_i64 tmp8,$0x403c84
 ld_i32 tmp2,env,$0xdb08
 mul_i32 tmp7,tmp2,$0x18
 ext_i32_i64 tmp3,tmp7
 mov_i64 tmp5,$0x5612c297b8a8
 add_i64 tmp5,tmp5,tmp3
 ld_i64 tmp4,tmp5,$0x0
 brcond_i64 tmp4,$0x403c84,eq,$L1

^ this is the conditional check of tmp4 (what was expected vs current pc)

 call plugin(0x7f84c99bf5c0),$0x11,$0,tmp2,tmp8
 set_label $L1

  00403c84  
 ld_i32 tmp2,env,$0xdb08
 mul_i32 tmp2,tmp2,$0x18
 ext_i32_i64 tmp3,tmp2
 mov_i64 tmp5,$0x5612c297b8b0
 add_i64 tmp5,tmp5,tmp3
 mov_i64 tmp4,$0x403c84
 st_i64 tmp4,tmp5,$0x0
 mov_i64 loc9,sp
 shl_i64 loc10,loc9,$0x8
 sar_i64 loc10,loc10,$0x8
 and_i64 loc10,loc10,loc9
 qemu_ld_a64_i128 loc12,loc13,loc10,noat+un+leo,0
 mov_i64 x29,loc12
 mov_i64 lr,loc13
 add_i64 loc9,loc9,$0x20
 mov_i64 sp,loc9

  00403c88  
 ld_i32 tmp2,env,$0xdb08
 mul_i32 tmp2,tmp2,$0x18
 ext_i32_i64 tmp3,tmp2
 mov_i64 tmp5,$0x5612c297b8b0
 add_i64 tmp5,tmp5,tmp3
 mov_i64 tmp4,$0x403c88
 st_i64 tmp4,tmp5,$0x0
 shl_i64 pc,lr,$0x8
 sar_i64 pc,pc,$0x8
 and_i64 pc,pc,lr
 call lookup_tb_ptr,$0x6,$1,tmp3,env
 goto_ptr tmp3
 set_label $L0
 exit_tb $0x7f84b80d34c3

OP after optimization and liveness analysis:
 ld_i32 tmp0,env,$0xfff0  pref=0x
 brcond_i32 tmp0,$0x0,lt,$L0  dead: 0
 st8_i32 $0x0,env,$0xfff4 dead: 0
 ld_i32 tmp2,env,$0xdb08  pref=0x
 mul_i32 tmp2,tmp2,$0x18  dead: 1  pref=0x
 ext_i32_i64 tmp3,tmp2dead: 1  pref=0x
 add_i64 tmp5,tmp3,$0x5612c297b8a0dead: 1 2  pref=0x
 st_i64 $0x403c88,tmp5,$0x0   dead: 0 1
 ld_i32 tmp2,env,$0xdb08  pref=0x
 mul_i32 tmp2,tmp2,$0x18  dead: 1  pref=0x
 ext_i32_i64 tmp3,tmp2dead: 1  pref=0x
 add_i64 tmp5,tmp3,$0x5612c297b8a8dead: 1  pref=0x
 st_i64 $0x403c8c,tmp5,$0x0   dead: 0 1
 ld_i32 tmp2,env,$0xdb08  dead: 1  pref=0x80
 mul_i32 tmp7,tmp2,$0x18  dead: 2  pref=0x
 ext_i32_i64 tmp3,tmp7dead: 1  pref=0x
 add_i64 tmp5,tmp3,$0x5612c297b8a8dead: 1 2  pref=0x
 ld_i64 tmp4,tmp5,$0x0dead: 1  pref=0x
 brcond_i64 tmp4,$0x403c84,eq,$L1 dead: 0
 call plugin(0x7f84c99bf5c0),$0x11,$0,tmp2,$0x403c84  dead: 0 1
 set_label $L1

  00403c84  
 ld_i32 tmp2,env,$0xdb08  pref=0x
 mul_i32 tmp2,tmp2,$0x18  dead: 1  pref=0x
 ext_i32_i64 tmp3,tmp2

[PATCH 1/5] roms/efi: clean up edk2 build config

2024-03-14 Thread Gerd Hoffmann

Needed to avoid stale toolchain configurations breaking firmware builds.

Signed-off-by: Gerd Hoffmann 
---
 roms/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/roms/Makefile b/roms/Makefile
index 8e5d8d26a9a0..edc234a0e886 100644
--- a/roms/Makefile
+++ b/roms/Makefile
@@ -187,6 +187,7 @@ clean:
rm -rf seabios/.config seabios/out seabios/builds
$(MAKE) -C ipxe/src veryclean
$(MAKE) -C edk2/BaseTools clean
+   rm -rf 
edk2/Conf/{.cache,BuildEnv.sh,build_rule.txt,target.txt,tools_def.txt}
$(MAKE) -C SLOF clean
rm -rf u-boot/build-e500
$(MAKE) -C u-boot-sam460ex distclean
-- 
2.44.0

[PATCH 4/5] roms/efi: use pure 64-bit build for edk2-x86_64-secure-code.fd

2024-03-14 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 roms/edk2-build.config | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/roms/edk2-build.config b/roms/edk2-build.config
index ef3eb7beebe7..cc9b21154205 100644
--- a/roms/edk2-build.config
+++ b/roms/edk2-build.config
@@ -70,11 +70,11 @@ cpy1 = FV/OVMF_CODE.fd edk2-x86_64-code.fd
 
 [build.ovmf.x86_64.secure]
 desc = ovmf build (64-bit, secure boot)
-conf = OvmfPkg/OvmfPkgIa32X64.dsc
-arch = IA32 X64
+conf = OvmfPkg/OvmfPkgX64.dsc
+arch = X64
 opts = common
ovmf.sb.smm
-plat = Ovmf3264
+plat = OvmfX64
 dest = ../pc-bios
 cpy1 = FV/OVMF_CODE.fd edk2-x86_64-secure-code.fd
 
-- 
2.44.0

[PATCH 3/5] roms/efi: exclude efi shell from secure boot builds

2024-03-14 Thread Gerd Hoffmann

Bugzilla: https://bugzilla.tianocore.org/show_bug.cgi?id=4641
Signed-off-by: Gerd Hoffmann 
---
 roms/edk2-build.config | 1 +
 1 file changed, 1 insertion(+)

diff --git a/roms/edk2-build.config b/roms/edk2-build.config
index 05cbafef70cb..ef3eb7beebe7 100644
--- a/roms/edk2-build.config
+++ b/roms/edk2-build.config
@@ -18,6 +18,7 @@ CAVIUM_ERRATUM_27456 = TRUE
 [opts.ovmf.sb.smm]
 SECURE_BOOT_ENABLE   = TRUE
 SMM_REQUIRE  = TRUE
+BUILD_SHELL  = FALSE
 
 [opts.armvirt.silent]
 DEBUG_PRINT_ERROR_LEVEL  = 0x8000
-- 
2.44.0

[PATCH 0/5] roms/efi: cleanup fix, config update, ekd2 binary rebuild

2024-03-14 Thread Gerd Hoffmann




Gerd Hoffmann (5):
  roms/efi: clean up edk2 build config
  roms/efi: drop workaround for edk2-stable202308
  roms/efi: exclude efi shell from secure boot builds
  roms/efi: use pure 64-bit build for edk2-x86_64-secure-code.fd
  update edk2 binaries for arm, risc-v and x86 secure boot.

 pc-bios/edk2-aarch64-code.fd.bz2   | Bin 1589320 -> 1589310 bytes
 pc-bios/edk2-arm-code.fd.bz2   | Bin 1571418 -> 1571693 bytes
 pc-bios/edk2-i386-secure-code.fd.bz2   | Bin 2130741 -> 1876986 bytes
 pc-bios/edk2-riscv-code.fd.bz2 | Bin 1345420 -> 1289160 bytes
 pc-bios/edk2-x86_64-secure-code.fd.bz2 | Bin 2214892 -> 1967967 bytes
 roms/Makefile  |   1 +
 roms/edk2-build.config |  13 -
 7 files changed, 5 insertions(+), 9 deletions(-)

-- 
2.44.0

[PATCH 2/5] roms/efi: drop workaround for edk2-stable202308

2024-03-14 Thread Gerd Hoffmann

Not needed for newer edk2 versions.

Signed-off-by: Gerd Hoffmann 
---
 roms/edk2-build.config | 6 --
 1 file changed, 6 deletions(-)

diff --git a/roms/edk2-build.config b/roms/edk2-build.config
index 0d367dbdb775..05cbafef70cb 100644
--- a/roms/edk2-build.config
+++ b/roms/edk2-build.config
@@ -32,9 +32,6 @@ PcdDxeNxMemoryProtectionPolicy = 0xC0007FD1
 # shim.efi has broken MemAttr code
 PcdUninstallMemAttrProtocol= TRUE
 
-[pcds.workaround.202308]
-PcdFirstTimeWakeUpAPsBySipi = FALSE
-
 

 # i386
 
@@ -66,7 +63,6 @@ desc = ovmf build (64-bit)
 conf = OvmfPkg/OvmfPkgX64.dsc
 arch = X64
 opts = common
-pcds = workaround.202308
 plat = OvmfX64
 dest = ../pc-bios
 cpy1 = FV/OVMF_CODE.fd edk2-x86_64-code.fd
@@ -77,7 +73,6 @@ conf = OvmfPkg/OvmfPkgIa32X64.dsc
 arch = IA32 X64
 opts = common
ovmf.sb.smm
-pcds = workaround.202308
 plat = Ovmf3264
 dest = ../pc-bios
 cpy1 = FV/OVMF_CODE.fd edk2-x86_64-secure-code.fd
@@ -87,7 +82,6 @@ desc = ovmf build for microvm
 conf = OvmfPkg/Microvm/MicrovmX64.dsc
 arch = X64
 opts = common
-pcds = workaround.202308
 plat = MicrovmX64
 dest = ../pc-bios
 cpy1 = FV/MICROVM.fd  edk2-x86_64-microvm.fd
-- 
2.44.0

Re: [RFC PATCH v2] contrib/plugins: control flow plugin (WIP!)

2024-03-14 Thread Alex Bennée

Alex Bennée  writes:

> This is a simple control flow tracking plugin that uses the latest
> inline and conditional operations to detect and track control flow
> changes. It is currently an exercise at seeing how useful the changes
> are.
>
> Based-on: <20240312075428.244210-1-pierrick.bouv...@linaro.org>
> Cc: Gustavo Romero 
> Cc: Pierrick Bouvier 
> Signed-off-by: Alex Bennée 
> Message-Id: <20240311153432.1395190-1-alex.ben...@linaro.org>
>
> ---
> v2
>   - only need a single call back
>   - drop need for INSN_WIDTH
>   - still don't understand the early exits
>
> I'm still seeing weirdness in the generated code, for example the
> plugin reports "early exits" which doesn't make sense for a ret which
> terminates a block:
>
>   addr: 0x403c88 hexchar: ret  (1/1)
> early exits 1280
> branches 1280
>   to 0x403d00 (639)
>   to 0x403d24 (639)
>

> +
> +/*
> + * At the start of each block we need to resolve two things:
> + *
> + *  - is last_pc == block_end, if not we had an early exit
> + *  - is start of block last_pc + insn width, if not we jumped
> + *
> + * Once those are dealt with we can instrument the rest of the
> + * instructions for their execution.
> + *
> + */
> +static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
> +{
> +uint64_t pc = qemu_plugin_tb_vaddr(tb);
> +size_t insns = qemu_plugin_tb_n_insns(tb);
> +struct qemu_plugin_insn *last_insn = qemu_plugin_tb_get_insn(tb, insns - 
> 1);
> +
> +/*
> + * check if we are executing linearly after the last block. We can
> + * handle both early block exits and normal branches in the
> + * callback if we hit it.
> + */
> +gpointer udata = GUINT_TO_POINTER(pc);
> +qemu_plugin_register_vcpu_tb_exec_cond_cb(
> +tb, vcpu_tb_branched_exec, QEMU_PLUGIN_CB_NO_REGS,
> +QEMU_PLUGIN_COND_NE, pc_after_block, pc, udata);
> +
> +/*
> + * Now we can set start/end for this block so the next block can
> + * check where we are at.
> + */
> +qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
> +  
> QEMU_PLUGIN_INLINE_STORE_U64,
> +  end_block, 
> qemu_plugin_insn_vaddr(last_insn));
> +qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
> +  
> QEMU_PLUGIN_INLINE_STORE_U64,
> +  pc_after_block,
> +  
> qemu_plugin_insn_vaddr(last_insn) +
> +
> qemu_plugin_insn_size(last_insn));

With the following:

modified   contrib/plugins/cflow.c
@@ -220,7 +220,7 @@ static void vcpu_tb_branched_exec(unsigned int cpu_index, 
void *udata)
 g_mutex_lock(&node->lock);
 
 if (early_exit) {
-fprintf(stderr, "%s: pc=%"PRIx64", epbc=%"PRIx64"
+fprintf(stderr, "%s: pc=%"PRIx64", epbc=%"PRIx64
 " npc=%"PRIx64", lpc=%"PRIx64", \n",
 __func__, pc, ebpc, npc, lpc);
 node->early_exit++;
@@ -264,6 +264,7 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct 
qemu_plugin_tb *tb)
 {
 uint64_t pc = qemu_plugin_tb_vaddr(tb);
 size_t insns = qemu_plugin_tb_n_insns(tb);
+struct qemu_plugin_insn *first_insn = qemu_plugin_tb_get_insn(tb, 0);
 struct qemu_plugin_insn *last_insn = qemu_plugin_tb_get_insn(tb, insns - 
1);
 
 /*
@@ -278,12 +279,13 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct 
qemu_plugin_tb *tb)
 
 /*
  * Now we can set start/end for this block so the next block can
- * check where we are at.
+ * check where we are at. Do this on the first instruction and not
+ * the TB so we don't get mixed up with above.
  */
-qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
+qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(first_insn,
   
QEMU_PLUGIN_INLINE_STORE_U64,
   end_block, 
qemu_plugin_insn_vaddr(last_insn));
-qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
+qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(first_insn,
   
QEMU_PLUGIN_INLINE_STORE_U64,
   pc_after_block,
   
qemu_plugin_insn_vaddr(last_insn) +

The report looks more sane:

  collected 9013 control flow nodes in the hash table
addr: 0x403c88 hexchar: ret  (0/1)
  branches 1280  
to 0x403d00 (639)
to 0x403d24 (639)

So I think we need to think about preserving the ordering of
instrumentation (at least from the same plugin) so we are not laying any
API bear traps for users.

I assume because loads and stores are involved we won't see the
optimiser trying to swap stuff around.

-- 
Alex Bennée
Virtualisation Te

Re: [PATCH v2 2/4] tests/avocado: drop virtio-rng from sbsa-ref tests

2024-03-14 Thread Alex Bennée

Marcin Juszkiewicz  writes:

> sbsa-ref is supposed to emulate real hardware so virtio-rng-pci
> does not fit here

What is real anyway ;-)

VirtIO devices exist in real life and I would argue for PCI the device
appears very much like a normal PCI device. I could see the argument for
avoiding virtio-mmio devices which are a lot more VM-only in flavour.

>
> Signed-off-by: Marcin Juszkiewicz 
> ---
>  tests/avocado/machine_aarch64_sbsaref.py | 8 
>  1 file changed, 8 deletions(-)
>
> diff --git a/tests/avocado/machine_aarch64_sbsaref.py 
> b/tests/avocado/machine_aarch64_sbsaref.py
> index cbab793455..259225f15f 100644
> --- a/tests/avocado/machine_aarch64_sbsaref.py
> +++ b/tests/avocado/machine_aarch64_sbsaref.py
> @@ -132,10 +132,6 @@ def boot_alpine_linux(self, cpu):
>  cpu,
>  "-drive",
>  f"file={iso_path},format=raw",
> -"-device",
> -"virtio-rng-pci,rng=rng0",
> -"-object",
> -"rng-random,id=rng0,filename=/dev/urandom",
>  )
>  
>  self.vm.launch()
> @@ -179,10 +175,6 @@ def boot_openbsd73(self, cpu):
>  cpu,
>  "-drive",
>  f"file={img_path},format=raw",
> -"-device",
> -"virtio-rng-pci,rng=rng0",
> -"-object",
> -"rng-random,id=rng0,filename=/dev/urandom",
>  )
>  
>  self.vm.launch()

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [PATCH v2 2/4] tests/avocado: drop virtio-rng from sbsa-ref tests

2024-03-14 Thread Peter Maydell

On Thu, 14 Mar 2024 at 12:11, Alex Bennée  wrote:
>
> Marcin Juszkiewicz  writes:
>
> > sbsa-ref is supposed to emulate real hardware so virtio-rng-pci
> > does not fit here
>
> What is real anyway ;-)
>
> VirtIO devices exist in real life and I would argue for PCI the device
> appears very much like a normal PCI device. I could see the argument for
> avoiding virtio-mmio devices which are a lot more VM-only in flavour.

But our platform for testing virtio-rng-pci should be the 'virt'
board (and our avocado test for virt does indeed have that device).
What exactly are we adding by testing it also in sbsa-ref,
where the typical user will *not* be using it (if they wanted
a virtio setup they would be using 'virt')?

thanks
-- PMM

Re: [PATCH v2 3/4] tests/avocado: use OpenBSD 7.4 for sbsa-ref

2024-03-14 Thread Alex Bennée

Marcin Juszkiewicz  writes:

> 7.4 was released in October 2023, time to update before 7.3 gets dropped.
>
> Disabled tests for 'max' cpu as OpenBSD fails there.
>
> Signed-off-by: Marcin Juszkiewicz 
> ---
>  tests/avocado/machine_aarch64_sbsaref.py | 47 
> +++-
>  1 file changed, 34 insertions(+), 13 deletions(-)
>
> diff --git a/tests/avocado/machine_aarch64_sbsaref.py 
> b/tests/avocado/machine_aarch64_sbsaref.py
> index 259225f15f..0e52dc5854 100644
> --- a/tests/avocado/machine_aarch64_sbsaref.py
> +++ b/tests/avocado/machine_aarch64_sbsaref.py
> @@ -159,14 +159,14 @@ def test_sbsaref_alpine_linux_max(self):
>  # This tests the whole boot chain from EFI to Userspace
>  # We only boot a whole OS for the current top level CPU and GIC
>  # Other test profiles should use more minimal boots
> -def boot_openbsd73(self, cpu):
> +def boot_openbsd(self, cpu):
>  self.fetch_firmware()
>  
>  img_url = (
> -"https://cdn.openbsd.org/pub/OpenBSD/7.3/arm64/miniroot73.img";
> +"https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/miniroot74.img";
>  )
>  
> -img_hash = 
> "7fc2c75401d6f01fbfa25f4953f72ad7d7c18650056d30755c44b9c129b707e5"
> +img_hash = 
> "7b08b2ce081cff6408d183f7152ddcfd2779912104866e4fdf6ae2d864b51142"
>  img_path = self.fetch_asset(img_url, algorithm="sha256", 
> asset_hash=img_hash)
>  
>  self.vm.set_console()
> @@ -180,23 +180,44 @@ def boot_openbsd73(self, cpu):
>  self.vm.launch()
>  wait_for_console_pattern(self,
>   "Welcome to the OpenBSD/arm64"
> - " 7.3 installation program.")
> + " 7.4 installation program.")
>  
> -def test_sbsaref_openbsd73_cortex_a57(self):
> +def test_sbsaref_openbsd_cortex_a57(self):
>  """
>  :avocado: tags=cpu:cortex-a57
> +:avocado: tags=os:openbsd
>  """
> -self.boot_openbsd73("cortex-a57")
> +self.boot_openbsd("cortex-a57")
>  
> -def test_sbsaref_openbsd73_neoverse_n1(self):
> +def test_sbsaref_openbsd_neoverse_n1(self):
>  """
>  :avocado: tags=cpu:neoverse-n1
> +:avocado: tags=os:openbsd
>  """
> -self.boot_openbsd73("neoverse-n1")
> +self.boot_openbsd("neoverse-n1")
>  
> -def test_sbsaref_openbsd73_max(self):
> -"""
> -:avocado: tags=cpu:max
> -"""
> -self.boot_openbsd73("max")
> +# OpenBSD 7.4 does not boot on current max cpu.
> +#
> +#   def test_sbsaref_openbsd_max_pauth_off(self):
> +#   """
> +#   :avocado: tags=cpu:max
> +#   :avocado: tags=os:openbsd
> +#   """
> +#   self.boot_openbsd("max,pauth=off")

If we are not going to delete the entries then at least use a @skip
instead of commenting. Maybe:

  @skip("Potential un-diagnosed upstream bug?")

but it would be nice to figure out exactly where is breaks.

> +
> +#   @skipUnless(os.getenv('AVOCADO_TIMEOUT_EXPECTED'), 'Test might timeout')
> +#   def test_sbsaref_openbsd_max_pauth_impdef(self):
> +#   """
> +#   :avocado: tags=cpu:max
> +#   :avocado: tags=os:openbsd
> +#   """
> +#   self.boot_openbsd("max,pauth-impdef=on")
> +
> +#   @skipUnless(os.getenv('AVOCADO_TIMEOUT_EXPECTED'), 'Test might timeout')
> +#   def test_sbsaref_openbsd_max(self):
> +#   """
> +#   :avocado: tags=cpu:max
> +#   :avocado: tags=os:openbsd
> +#   """
> +#   self.boot_openbsd("max")

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [RFC PATCH v2] contrib/plugins: control flow plugin (WIP!)

2024-03-14 Thread Pierrick Bouvier


On 3/14/24 16:02, Alex Bennée wrote:

Alex Bennée  writes:


This is a simple control flow tracking plugin that uses the latest
inline and conditional operations to detect and track control flow
changes. It is currently an exercise at seeing how useful the changes
are.

Based-on: <20240312075428.244210-1-pierrick.bouv...@linaro.org>
Cc: Gustavo Romero 
Cc: Pierrick Bouvier 
Signed-off-by: Alex Bennée 
Message-Id: <20240311153432.1395190-1-alex.ben...@linaro.org>

---
v2
   - only need a single call back
   - drop need for INSN_WIDTH
   - still don't understand the early exits

I'm still seeing weirdness in the generated code, for example the
plugin reports "early exits" which doesn't make sense for a ret which
terminates a block:

   addr: 0x403c88 hexchar: ret  (1/1)
 early exits 1280
 branches 1280
   to 0x403d00 (639)
   to 0x403d24 (639)




+
+/*
+ * At the start of each block we need to resolve two things:
+ *
+ *  - is last_pc == block_end, if not we had an early exit
+ *  - is start of block last_pc + insn width, if not we jumped
+ *
+ * Once those are dealt with we can instrument the rest of the
+ * instructions for their execution.
+ *
+ */
+static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
+{
+uint64_t pc = qemu_plugin_tb_vaddr(tb);
+size_t insns = qemu_plugin_tb_n_insns(tb);
+struct qemu_plugin_insn *last_insn = qemu_plugin_tb_get_insn(tb, insns - 
1);
+
+/*
+ * check if we are executing linearly after the last block. We can
+ * handle both early block exits and normal branches in the
+ * callback if we hit it.
+ */
+gpointer udata = GUINT_TO_POINTER(pc);
+qemu_plugin_register_vcpu_tb_exec_cond_cb(
+tb, vcpu_tb_branched_exec, QEMU_PLUGIN_CB_NO_REGS,
+QEMU_PLUGIN_COND_NE, pc_after_block, pc, udata);
+
+/*
+ * Now we can set start/end for this block so the next block can
+ * check where we are at.
+ */
+qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
+  
QEMU_PLUGIN_INLINE_STORE_U64,
+  end_block, 
qemu_plugin_insn_vaddr(last_insn));
+qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
+  
QEMU_PLUGIN_INLINE_STORE_U64,
+  pc_after_block,
+  
qemu_plugin_insn_vaddr(last_insn) +
+
qemu_plugin_insn_size(last_insn));


With the following:

modified   contrib/plugins/cflow.c
@@ -220,7 +220,7 @@ static void vcpu_tb_branched_exec(unsigned int cpu_index, 
void *udata)
  g_mutex_lock(&node->lock);
  
  if (early_exit) {

-fprintf(stderr, "%s: pc=%"PRIx64", epbc=%"PRIx64"
+fprintf(stderr, "%s: pc=%"PRIx64", epbc=%"PRIx64
  " npc=%"PRIx64", lpc=%"PRIx64", \n",
  __func__, pc, ebpc, npc, lpc);
  node->early_exit++;
@@ -264,6 +264,7 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct 
qemu_plugin_tb *tb)
  {
  uint64_t pc = qemu_plugin_tb_vaddr(tb);
  size_t insns = qemu_plugin_tb_n_insns(tb);
+struct qemu_plugin_insn *first_insn = qemu_plugin_tb_get_insn(tb, 0);
  struct qemu_plugin_insn *last_insn = qemu_plugin_tb_get_insn(tb, insns - 
1);
  
  /*

@@ -278,12 +279,13 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct 
qemu_plugin_tb *tb)
  
  /*

   * Now we can set start/end for this block so the next block can
- * check where we are at.
+ * check where we are at. Do this on the first instruction and not
+ * the TB so we don't get mixed up with above.
   */
-qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
+qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(first_insn,

QEMU_PLUGIN_INLINE_STORE_U64,
end_block, 
qemu_plugin_insn_vaddr(last_insn));
-qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(tb,
+qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(first_insn,

QEMU_PLUGIN_INLINE_STORE_U64,
pc_after_block,

qemu_plugin_insn_vaddr(last_insn) +

The report looks more sane:

   collected 9013 control flow nodes in the hash table
 addr: 0x403c88 hexchar: ret  (0/1)
   branches 1280
 to 0x403d00 (639)
 to 0x403d24 (639)

So I think we need to think about preserving the ordering of
instrumentation (at least from the same plugin) so we are not laying any
API bear traps for users.

I assume because loads and stores are involved we won't see the
optimiser trying to swap stuff around.



Currently, order is fixed.

Series introducing more inline ops changed order to be:
- inline ops:
  - PLUGIN

Re: [PATCH v2 1/6] virtio/virtio-pci: Handle extra notification data

2024-03-14 Thread Jonah Palmer





On 3/13/24 11:01 PM, Jason Wang wrote:

On Wed, Mar 13, 2024 at 7:55 PM Jonah Palmer  wrote:


Add support to virtio-pci devices for handling the extra data sent
from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA
transport feature has been negotiated.

The extra data that's passed to the virtio-pci device when this
feature is enabled varies depending on the device's virtqueue
layout.

In a split virtqueue layout, this data includes:
  - upper 16 bits: shadow_avail_idx
  - lower 16 bits: virtqueue index

In a packed virtqueue layout, this data includes:
  - upper 16 bits: 1-bit wrap counter & 15-bit shadow_avail_idx
  - lower 16 bits: virtqueue index

Tested-by: Lei Yang 
Reviewed-by: Eugenio Pérez 
Signed-off-by: Jonah Palmer 
---
  hw/virtio/virtio-pci.c | 10 +++---
  hw/virtio/virtio.c | 18 ++
  include/hw/virtio/virtio.h |  1 +
  3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index cb6940fc0e..0f5c3c3b2f 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -384,7 +384,7 @@ static void virtio_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
  {
  VirtIOPCIProxy *proxy = opaque;
  VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
-uint16_t vector;
+uint16_t vector, vq_idx;
  hwaddr pa;

  switch (addr) {
@@ -408,8 +408,12 @@ static void virtio_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
  vdev->queue_sel = val;
  break;
  case VIRTIO_PCI_QUEUE_NOTIFY:
-if (val < VIRTIO_QUEUE_MAX) {
-virtio_queue_notify(vdev, val);
+vq_idx = val;
+if (vq_idx < VIRTIO_QUEUE_MAX) {
+if (virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) {
+virtio_queue_set_shadow_avail_data(vdev, val);
+}
+virtio_queue_notify(vdev, vq_idx);
  }
  break;
  case VIRTIO_PCI_STATUS:
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index d229755eae..bcb9e09df0 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2255,6 +2255,24 @@ void virtio_queue_set_align(VirtIODevice *vdev, int n, 
int align)
  }
  }

+void virtio_queue_set_shadow_avail_data(VirtIODevice *vdev, uint32_t data)
+{
+/* Lower 16 bits is the virtqueue index */
+uint16_t i = data;
+VirtQueue *vq = &vdev->vq[i];
+
+if (!vq->vring.desc) {
+return;
+}
+
+if (virtio_vdev_has_feature(vdev, VIRTIO_F_RING_PACKED)) {
+vq->shadow_avail_wrap_counter = (data >> 31) & 0x1;
+vq->shadow_avail_idx = (data >> 16) & 0x7FFF;
+} else {
+vq->shadow_avail_idx = (data >> 16);


Do we need to do a sanity check for this value?

Thanks



It can't hurt, right? What kind of check did you have in mind?

if (vq->shadow_avail_idx >= vq->vring.num)

Or something else?


+}
+}
+
  static void virtio_queue_notify_vq(VirtQueue *vq)
  {
  if (vq->vring.desc && vq->handle_output) {
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index c8f72850bc..53915947a7 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -335,6 +335,7 @@ void virtio_queue_update_rings(VirtIODevice *vdev, int n);
  void virtio_init_region_cache(VirtIODevice *vdev, int n);
  void virtio_queue_set_align(VirtIODevice *vdev, int n, int align);
  void virtio_queue_notify(VirtIODevice *vdev, int n);
+void virtio_queue_set_shadow_avail_data(VirtIODevice *vdev, uint32_t data);
  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n);
  void virtio_queue_set_vector(VirtIODevice *vdev, int n, uint16_t vector);
  int virtio_queue_set_host_notifier_mr(VirtIODevice *vdev, int n,
--
2.39.3

Re: [PATCH v2 2/4] tests/avocado: drop virtio-rng from sbsa-ref tests

2024-03-14 Thread Alex Bennée

Peter Maydell  writes:

> On Thu, 14 Mar 2024 at 12:11, Alex Bennée  wrote:
>>
>> Marcin Juszkiewicz  writes:
>>
>> > sbsa-ref is supposed to emulate real hardware so virtio-rng-pci
>> > does not fit here
>>
>> What is real anyway ;-)
>>
>> VirtIO devices exist in real life and I would argue for PCI the device
>> appears very much like a normal PCI device. I could see the argument for
>> avoiding virtio-mmio devices which are a lot more VM-only in flavour.
>
> But our platform for testing virtio-rng-pci should be the 'virt'
> board (and our avocado test for virt does indeed have that device).
> What exactly are we adding by testing it also in sbsa-ref,
> where the typical user will *not* be using it (if they wanted
> a virtio setup they would be using 'virt')?

I assume just ensuring the board has enough entropy to boot cleanly. If
it doesn't need it then it doesn't matter I guess.

>
> thanks
> -- PMM

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [PATCH v2] gitlab: aggressively avoid extra GIT data

2024-03-14 Thread Manos Pitsidianakis

On Tue, 12 Mar 2024 at 19:09, Alex Bennée  wrote:
>
> This avoids fetching blobs and tree references for branches we are not
> going to worry about. Also skip tag references which are similarly not
> useful and keep the default --prune. This keeps the .git data to
> around 100M rather than the ~400M even a shallow clone takes.
>
> So we can check the savings we also run a quick du while setting up
> the build.
>
> We also have to have special settings of GIT_FETCH_EXTRA_FLAGS for the
> Windows build, the migration legacy test and the custom runners. In
> the case of the custom runners we also move the free floating variable
> to the runner template.
>
> Signed-off-by: Alex Bennée 
>
> ---
> v2
>   - make custom runners follow the legacy options
> ---

Reviewed-by: Manos Pitsidianakis

Re: [PATCH v2] block: Use LVM tools for LV block device truncation

2024-03-14 Thread Daniel P . Berrangé

On Wed, Mar 13, 2024 at 11:43:27AM +0100, Alexander Ivanov wrote:
> If a block device is an LVM logical volume we can resize it using
> standard LVM tools.
> 
> Add a helper to detect if a device is a DM device. In raw_co_truncate()
> check if the block device is DM and resize it executing lvresize.
> 
> Signed-off-by: Alexander Ivanov 
> ---
>  block/file-posix.c | 61 ++
>  1 file changed, 61 insertions(+)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 35684f7e21..5f07d98aa5 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -2642,6 +2642,38 @@ raw_regular_truncate(BlockDriverState *bs, int fd, 
> int64_t offset,
>  return raw_thread_pool_submit(handle_aiocb_truncate, &acb);
>  }

>  static int coroutine_fn raw_co_truncate(BlockDriverState *bs, int64_t offset,
>  bool exact, PreallocMode prealloc,
>  BdrvRequestFlags flags, Error **errp)
> @@ -2670,6 +2702,35 @@ static int coroutine_fn 
> raw_co_truncate(BlockDriverState *bs, int64_t offset,
>  if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) {
>  int64_t cur_length = raw_getlength(bs);
>  
> +/*
> + * Try to resize an LVM device using LVM tools.
> + */
> +if (device_is_dm(&st) && offset > 0) {
> +int spawn_flags = G_SPAWN_SEARCH_PATH | 
> G_SPAWN_STDOUT_TO_DEV_NULL;
> +int status;
> +bool success;
> +char *err;
> +GError *gerr = NULL;
> +g_autofree char *size_str = g_strdup_printf("%ldB", offset);

offset is 64-bit, but '%ld' is not guaranteed to be 64-bit. I expect
this will break on 32-bit platforms. Try PRId64 instead.

> +const char *cmd[] = {"lvresize", "-f", "-L",
> + size_str, bs->filename, NULL};
> +
> +success = g_spawn_sync(NULL, (gchar **)cmd, NULL, spawn_flags,
> +   NULL, NULL, NULL, &err, &status, &gerr);
> +
> +if (success && WEXITSTATUS(status) == 0) {
> +return 0;
> +}

We should probably check  'g_spawn_check_wait_status' rather than
WEXITSTATUS, as this then gives us further eror message details
that

> +
> +if (!success) {
> +error_setg(errp, "lvresize execution error: %s", 
> gerr->message);
> +} else {
> +error_setg(errp, "%s", err);

...we would also include here, such as the exit code or terminal
signal.

> +}
> +
> +return -EINVAL;
> +}
> +
>  if (offset != cur_length && exact) {
>  error_setg(errp, "Cannot resize device files");
>  return -ENOTSUP;
> -- 
> 2.40.1
> 
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH v4] linux-aio: add IO_CMD_FDSYNC command support

2024-03-14 Thread Stefan Hajnoczi

On Thu, Mar 14, 2024 at 04:46:28PM +0530, Prasad Pandit wrote:
> From: Prasad Pandit 
> 
> Libaio defines IO_CMD_FDSYNC command to sync all outstanding
> asynchronous I/O operations, by flushing out file data to the
> disk storage.
> 
> Enable linux-aio to submit such aio request. This helps to
> reduce latency induced via pthread_create calls by
> thread-pool (aio=threads).
> 
> Signed-off-by: Prasad Pandit 
> ---
>  block/file-posix.c  |  7 +++
>  block/linux-aio.c   | 21 -
>  include/block/raw-aio.h |  1 +
>  3 files changed, 28 insertions(+), 1 deletion(-)
> 
> v4: New boolean field to indicate if aio_fdsync is available or not.
> It is set at file open time and checked before AIO_FLUSH call.
>   - https://lists.nongnu.org/archive/html/qemu-devel/2024-03/msg03701.html

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

Re: [PATCH] file-posix: rearrange BDRVRawState fields

2024-03-14 Thread Stefan Hajnoczi

On Thu, Mar 14, 2024 at 04:47:41PM +0530, Prasad Pandit wrote:
> From: Prasad Pandit 
> 
> Rearrange BRDVRawState structure fields to avoid memory
> fragments in its object's memory and save some(~8) bytes
> per object.
> 
> Signed-off-by: Prasad Pandit 
> ---
>  block/file-posix.c | 39 +++
>  1 file changed, 19 insertions(+), 20 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

Re: [PATCH v4 16/23] qapi/schema: Don't initialize "members" with `None`

2024-03-14 Thread Markus Armbruster

John Snow  writes:

> Declare, but don't initialize the "members" field with type
> List[QAPISchemaObjectTypeMember].
>
> This simplifies the typing from what would otherwise be
> Optional[List[T]] to merely List[T]. This removes the need to add
> assertions to several callsites that this value is not None - which it
> never will be after the delayed initialization in check() anyway.
>
> The type declaration without initialization trick will cause accidental
> uses of this field prior to full initialization to raise an
> AttributeError.
>
> (Note that it is valid to have an empty members list, see the internal
> q_empty object as an example. For this reason, we cannot use the empty
> list as a replacement test for full initialization and instead rely on
> the _checked/_check_complete fields.)
>
> Signed-off-by: John Snow 
> ---
>  scripts/qapi/schema.py | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
> index 50ebc4f12de..fb30314741a 100644
> --- a/scripts/qapi/schema.py
> +++ b/scripts/qapi/schema.py
> @@ -20,7 +20,7 @@
>  from collections import OrderedDict
>  import os
>  import re
> -from typing import List, Optional
> +from typing import List, Optional, cast
>  
>  from .common import (
>  POINTER_SUFFIX,
> @@ -449,7 +449,7 @@ def __init__(self, name, info, doc, ifcond, features,
>  self.base = None
>  self.local_members = local_members
>  self.variants = variants
> -self.members = None
> +self.members: List[QAPISchemaObjectTypeMember]
>  self._check_complete = False
>  
>  def check(self, schema):
> @@ -482,7 +482,11 @@ def check(self, schema):
>  for m in self.local_members:
>  m.check(schema)
>  m.check_clash(self.info, seen)
> -members = seen.values()
> +
> +# check_clash works in terms of the supertype, but local_members
> +# is asserted to be List[QAPISchemaObjectTypeMember].

Do you mean "but self.members is declared as
List[QAPISchemaObjectTypeMember]"?

> +# Cast down to the subtype.
> +members = cast(List[QAPISchemaObjectTypeMember], list(seen.values()))

Let's break the line after the comma.

>  
>  if self.variants:
>  self.variants.check(schema, seen)
> @@ -515,11 +519,9 @@ def is_implicit(self):
>  return self.name.startswith('q_')
>  
>  def is_empty(self):
> -assert self.members is not None
>  return not self.members and not self.variants
>  
>  def has_conditional_members(self):
> -assert self.members is not None
>  return any(m.ifcond.is_present() for m in self.members)
>  
>  def c_name(self):

Re: [PATCH for-9.0 v14 4/8] target/riscv/vector_helpers: do early exit when vstart >= vl

2024-03-14 Thread Max Chou


According v spec section 7.9. Vector Load/Store Whole Register Instructions

"The instructions operate with an effective vector length, 
evl=NFIELDS*VLEN/EEW, regardless of current settings in vtype and vl. 
The usual property that no elements are written if vstart ≥ vl does not 
apply to these instructions. Instead, no elements are written if vstart 
≥ evl."


The VSTART_CHECK_EARLY_EXIT in vext_ldst_whole function may causes 
unexpected result. We may replace the VSTART_CHECK_EARLY_EXIT function by


- VSTART_CHECK_EARLY_EXIT(env); + if (env->vstart >= ((vlenb * nf) >> 
log2_esz)) { + env->vstart = 0; + return; + }



@@ -572,6 +580,8 @@ vext_ldst_whole(void *vd, target_ulong base, CPURISCVState 
*env, uint32_t desc,
  uint32_t vlenb = riscv_cpu_cfg(env)->vlenb;
  uint32_t max_elems = vlenb >> log2_esz;
  
+VSTART_CHECK_EARLY_EXIT(env);

+
  k = env->vstart / max_elems;
  off = env->vstart % max_elems;
  
@@ -877,6 +887,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \

  uint32_t vta = vext_vta(desc);\
  uint32_t i;   \
\
+VSTART_CHECK_EARLY_EXIT(env); \
+  \
  for (i = env->vstart; i < vl; i++) {  \
  ETYPE s1 = *((ETYPE *)vs1 + H(i));\
  ETYPE s2 = *((ETYPE *)vs2 + H(i));\
@@ -909,6 +921,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void 
*vs2,\
  uint32_t vta = vext_vta(desc);   \
  uint32_t i;  \
   \
+VSTART_CHECK_EARLY_EXIT(env);\
+ \
  for (i = env->vstart; i < vl; i++) { \
  ETYPE s2 = *((ETYPE *)vs2 + H(i));   \
  ETYPE carry = vext_elem_mask(v0, i); \
@@ -944,6 +958,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, 
  \
  uint32_t vta_all_1s = vext_vta_all_1s(desc);  \
  uint32_t i;   \
\
+VSTART_CHECK_EARLY_EXIT(env); \
+  \
  for (i = env->vstart; i < vl; i++) {  \
  ETYPE s1 = *((ETYPE *)vs1 + H(i));\
  ETYPE s2 = *((ETYPE *)vs2 + H(i));\
@@ -982,6 +998,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,  
\
  uint32_t vta_all_1s = vext_vta_all_1s(desc);\
  uint32_t i; \
  \
+VSTART_CHECK_EARLY_EXIT(env);   \
+\
  for (i = env->vstart; i < vl; i++) {\
  ETYPE s2 = *((ETYPE *)vs2 + H(i));  \
  ETYPE carry = !vm && vext_elem_mask(v0, i); \
@@ -1078,6 +1096,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,  
\
  uint32_t vma = vext_vma(desc);\
  uint32_t i;   \
\
+VSTART_CHECK_EARLY_EXIT(env); \
+  \
  for (i = env->vstart; i < vl; i++) {  \
  if (!vm && !vext_elem_mask(v0, i)) {  \
  /* set masked-off elements to 1s */   \
@@ -1125,6 +1145,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,
  \
  uint32_t vma = vext_vma(desc);  \
  uint32_t i; \
  \
+VSTART_CHECK_EARLY_EXIT(env);   \
+\
  for (i = env->vstart; i < vl; i++) {\
  if (!vm && !vext_elem_mask(v0, i)) {\
  /* set masked-off elements to 1s */ \
@@ -1187,6 +1209,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void 
*vs2,   \
  uint32_t vma = vext_vma(desc);\
  uint32_t i;

Re: [PATCH for-9.0 v14 4/8] target/riscv/vector_helpers: do early exit when vstart >= vl

2024-03-14 Thread Daniel Henrique Barboza





On 3/14/24 10:14, Max Chou wrote:

According v spec section 7.9. Vector Load/Store Whole Register Instructions

"The instructions operate with an effective vector length, evl=NFIELDS*VLEN/EEW, 
regardless of current settings in vtype and vl. The usual property that no elements are 
written if vstart ≥ vl does not apply to these instructions. Instead, no elements are 
written if vstart ≥ evl."

The VSTART_CHECK_EARLY_EXIT in vext_ldst_whole function may causes unexpected 
result. We may replace the VSTART_CHECK_EARLY_EXIT function by

- VSTART_CHECK_EARLY_EXIT(env); + if (env->vstart >= ((vlenb * nf) >> log2_esz)) { 
+ env->vstart = 0; + return; + }


Do we need to do an early exit in this case? If the function is able to handle
gracefully whatever env->vstart value it faces (it seems to be the case) then we
should just remove the exit entirely.

In fact I removed all the early exits from all helpers that are guarded
by vstart_eq_zero: vcpop_m(), vfirst_m(), vmsetm(), GEN_VEXT_VIOTA_M(),
GEN_VEXT_VCOMPRESS_VM(), GEN_VEXT_RED() and GEN_VEXT_FRED(). For these case
the helpers can either do nothing if vl = 0 or throw some exception like
vcpop and first does.


Thanks,

Daniel






@@ -572,6 +580,8 @@ vext_ldst_whole(void *vd, target_ulong base, CPURISCVState 
*env, uint32_t desc,
  uint32_t vlenb = riscv_cpu_cfg(env)->vlenb;
  uint32_t max_elems = vlenb >> log2_esz;
  
+VSTART_CHECK_EARLY_EXIT(env);

+
  k = env->vstart / max_elems;
  off = env->vstart % max_elems;
  
@@ -877,6 +887,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2,   \

  uint32_t vta = vext_vta(desc);\
  uint32_t i;   \
\
+VSTART_CHECK_EARLY_EXIT(env); \
+  \
  for (i = env->vstart; i < vl; i++) {  \
  ETYPE s1 = *((ETYPE *)vs1 + H(i));\
  ETYPE s2 = *((ETYPE *)vs2 + H(i));\
@@ -909,6 +921,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void 
*vs2,\
  uint32_t vta = vext_vta(desc);   \
  uint32_t i;  \
   \
+VSTART_CHECK_EARLY_EXIT(env);\
+ \
  for (i = env->vstart; i < vl; i++) { \
  ETYPE s2 = *((ETYPE *)vs2 + H(i));   \
  ETYPE carry = vext_elem_mask(v0, i); \
@@ -944,6 +958,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, 
  \
  uint32_t vta_all_1s = vext_vta_all_1s(desc);  \
  uint32_t i;   \
\
+VSTART_CHECK_EARLY_EXIT(env); \
+  \
  for (i = env->vstart; i < vl; i++) {  \
  ETYPE s1 = *((ETYPE *)vs1 + H(i));\
  ETYPE s2 = *((ETYPE *)vs2 + H(i));\
@@ -982,6 +998,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,  
\
  uint32_t vta_all_1s = vext_vta_all_1s(desc);\
  uint32_t i; \
  \
+VSTART_CHECK_EARLY_EXIT(env);   \
+\
  for (i = env->vstart; i < vl; i++) {\
  ETYPE s2 = *((ETYPE *)vs2 + H(i));  \
  ETYPE carry = !vm && vext_elem_mask(v0, i); \
@@ -1078,6 +1096,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,  
\
  uint32_t vma = vext_vma(desc);\
  uint32_t i;   \
\
+VSTART_CHECK_EARLY_EXIT(env); \
+  \
  for (i = env->vstart; i < vl; i++) {  \
  if (!vm && !vext_elem_mask(v0, i)) {  \
  /* set masked-off elements to 1s */   \
@@ -1125,6 +1145,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,
  \
  uint32_t vma = vext_vma(desc);  \
  uint32_t i;

Re: [PATCH v4 05/23] qapi: create QAPISchemaDefinition

2024-03-14 Thread John Snow

On Thu, Mar 14, 2024, 5:12 AM Markus Armbruster  wrote:

> John Snow  writes:
>
> > Include entities don't have names, but we generally expect "entities" to
> > have names. Reclassify all entities with names as *definitions*, leaving
> > the nameless include entities as QAPISchemaEntity instances.
> >
> > This is primarily to help simplify typing around expectations of what
> > callers expect for properties of an "entity".
> >
> > Suggested-by: Markus Armbruster 
> > Signed-off-by: John Snow 
> > ---
> >  scripts/qapi/schema.py | 144 +++--
> >  1 file changed, 82 insertions(+), 62 deletions(-)
> >
> > diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
> > index 117f0f78f0c..da273c1649d 100644
> > --- a/scripts/qapi/schema.py
> > +++ b/scripts/qapi/schema.py
>
> [...]
>
> > @@ -107,6 +90,46 @@ def _set_module(self, schema, info):
> >  def set_module(self, schema):
> >  self._set_module(schema, self.info)
> >
> > +def visit(self, visitor):
> > +# pylint: disable=unused-argument
> > +assert self._checked
> > +
> > +
> > +class QAPISchemaDefinition(QAPISchemaEntity):
> > +meta: Optional[str] = None
> > +
> > +def __init__(self, name: str, info, doc, ifcond=None,
> features=None):
> > +assert isinstance(name, str)
> > +super().__init__(info)
> > +for f in features or []:
> > +assert isinstance(f, QAPISchemaFeature)
> > +f.set_defined_in(name)
> > +self.name = name
> > +self.doc = doc
> > +self._ifcond = ifcond or QAPISchemaIfCond()
> > +self.features = features or []
> > +
> > +def __repr__(self):
> > +return "<%s:%s at 0x%x>" % (type(self).__name__, self.name,
> > +id(self))
> > +
> > +def c_name(self):
> > +return c_name(self.name)
> > +
> > +def check(self, schema):
> > +assert not self._checked
> > +seen = {}
> > +for f in self.features:
> > +f.check_clash(self.info, seen)
> > +super().check(schema)
>
> v3 called super().check() first.  Why did you move it?  I'm asking just
> to make sure I'm not missing something subtle.
>

I don't believe you are, I think it was just ... something I didn't notice
when rebasing, since I merged this patch by hand. I recall having to insert
the super call manually, and I guess my preferences are nondeterministic.


> > +
> > +def connect_doc(self, doc=None):
> > +super().connect_doc(doc)
> > +doc = doc or self.doc
> > +if doc:
> > +for f in self.features:
> > +doc.connect_feature(f)
> > +
> >  @property
> >  def ifcond(self):
> >  assert self._checked
>
> [...]
>
>

Re: [PATCH v4 16/23] qapi/schema: Don't initialize "members" with `None`

2024-03-14 Thread John Snow

On Thu, Mar 14, 2024, 9:01 AM Markus Armbruster  wrote:

> John Snow  writes:
>
> > Declare, but don't initialize the "members" field with type
> > List[QAPISchemaObjectTypeMember].
> >
> > This simplifies the typing from what would otherwise be
> > Optional[List[T]] to merely List[T]. This removes the need to add
> > assertions to several callsites that this value is not None - which it
> > never will be after the delayed initialization in check() anyway.
> >
> > The type declaration without initialization trick will cause accidental
> > uses of this field prior to full initialization to raise an
> > AttributeError.
> >
> > (Note that it is valid to have an empty members list, see the internal
> > q_empty object as an example. For this reason, we cannot use the empty
> > list as a replacement test for full initialization and instead rely on
> > the _checked/_check_complete fields.)
> >
> > Signed-off-by: John Snow 
> > ---
> >  scripts/qapi/schema.py | 12 +++-
> >  1 file changed, 7 insertions(+), 5 deletions(-)
> >
> > diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
> > index 50ebc4f12de..fb30314741a 100644
> > --- a/scripts/qapi/schema.py
> > +++ b/scripts/qapi/schema.py
> > @@ -20,7 +20,7 @@
> >  from collections import OrderedDict
> >  import os
> >  import re
> > -from typing import List, Optional
> > +from typing import List, Optional, cast
> >
> >  from .common import (
> >  POINTER_SUFFIX,
> > @@ -449,7 +449,7 @@ def __init__(self, name, info, doc, ifcond, features,
> >  self.base = None
> >  self.local_members = local_members
> >  self.variants = variants
> > -self.members = None
> > +self.members: List[QAPISchemaObjectTypeMember]
> >  self._check_complete = False
> >
> >  def check(self, schema):
> > @@ -482,7 +482,11 @@ def check(self, schema):
> >  for m in self.local_members:
> >  m.check(schema)
> >  m.check_clash(self.info, seen)
> > -members = seen.values()
> > +
> > +# check_clash works in terms of the supertype, but local_members
> > +# is asserted to be List[QAPISchemaObjectTypeMember].
>
> Do you mean "but self.members is declared as
> List[QAPISchemaObjectTypeMember]"?
>

Argh. I meant asserted in the linguistic sense. mypy asserts it to be; not
a runtime assertion.

I do this a lot, apparently.


> > +# Cast down to the subtype.
> > +members = cast(List[QAPISchemaObjectTypeMember],
> list(seen.values()))
>
> Let's break the line after the comma.
>

Go for it.

After this series I may send a patchset showing what changes black would
make. I cannot be trusted with aesthetic consistency.


> >
> >  if self.variants:
> >  self.variants.check(schema, seen)
> > @@ -515,11 +519,9 @@ def is_implicit(self):
> >  return self.name.startswith('q_')
> >
> >  def is_empty(self):
> > -assert self.members is not None
> >  return not self.members and not self.variants
> >
> >  def has_conditional_members(self):
> > -assert self.members is not None
> >  return any(m.ifcond.is_present() for m in self.members)
> >
> >  def c_name(self):
>
>

Re: [PATCH for-9.0 v14 4/8] target/riscv/vector_helpers: do early exit when vstart >= vl

2024-03-14 Thread Daniel Henrique Barboza





On 3/14/24 10:27, Daniel Henrique Barboza wrote:



On 3/14/24 10:14, Max Chou wrote:

According v spec section 7.9. Vector Load/Store Whole Register Instructions

"The instructions operate with an effective vector length, evl=NFIELDS*VLEN/EEW, 
regardless of current settings in vtype and vl. The usual property that no elements are 
written if vstart ≥ vl does not apply to these instructions. Instead, no elements are 
written if vstart ≥ evl."

The VSTART_CHECK_EARLY_EXIT in vext_ldst_whole function may causes unexpected 
result. We may replace the VSTART_CHECK_EARLY_EXIT function by

- VSTART_CHECK_EARLY_EXIT(env); + if (env->vstart >= ((vlenb * nf) >> log2_esz)) { 
+ env->vstart = 0; + return; + }


Do we need to do an early exit in this case? If the function is able to handle
gracefully whatever env->vstart value it faces (it seems to be the case) then we
should just remove the exit entirely.


Nevermind. Here's the guard that I am removing in the next patch for 
vext_lsdt_whole:

-uint32_t evl = s->cfg_ptr->vlenb * nf / width;
-TCGLabel *over = gen_new_label();
-tcg_gen_brcondi_tl(TCG_COND_GEU, cpu_vstart, evl, over);

If we just remove it and put nothing else in its place we'll end up breaking it.

To preserve the existing behavior we'll have to change the early exit to

if (vstart >= evl) {...}


'evl' is being calculated as

uint32_t evl = s->cfg_ptr->vlenb * nf / width;


So yeah, your suggestion to exit the helper early with:

+ if (env->vstart >= ((vlenb * nf) >> log2_esz)) {
+ env->vstart = 0;
+ return;
+ }

Is correct. I'll change it in v15.


Thanks,


Daniel



In fact I removed all the early exits from all helpers that are guarded
by vstart_eq_zero: vcpop_m(), vfirst_m(), vmsetm(), GEN_VEXT_VIOTA_M(),
GEN_VEXT_VCOMPRESS_VM(), GEN_VEXT_RED() and GEN_VEXT_FRED(). For these case
the helpers can either do nothing if vl = 0 or throw some exception like
vcpop and first does.


Thanks,

Daniel






@@ -572,6 +580,8 @@ vext_ldst_whole(void *vd, target_ulong base, CPURISCVState 
*env, uint32_t desc,
  uint32_t vlenb = riscv_cpu_cfg(env)->vlenb;
  uint32_t max_elems = vlenb >> log2_esz;
+    VSTART_CHECK_EARLY_EXIT(env);
+
  k = env->vstart / max_elems;
  off = env->vstart % max_elems;
@@ -877,6 +887,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, 
  \
  uint32_t vta = vext_vta(desc);    \
  uint32_t i;   \
    \
+    VSTART_CHECK_EARLY_EXIT(env); \
+  \
  for (i = env->vstart; i < vl; i++) {  \
  ETYPE s1 = *((ETYPE *)vs1 + H(i));    \
  ETYPE s2 = *((ETYPE *)vs2 + H(i));    \
@@ -909,6 +921,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void 
*vs2,    \
  uint32_t vta = vext_vta(desc);   \
  uint32_t i;  \
   \
+    VSTART_CHECK_EARLY_EXIT(env);    \
+ \
  for (i = env->vstart; i < vl; i++) { \
  ETYPE s2 = *((ETYPE *)vs2 + H(i));   \
  ETYPE carry = vext_elem_mask(v0, i); \
@@ -944,6 +958,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, 
  \
  uint32_t vta_all_1s = vext_vta_all_1s(desc);  \
  uint32_t i;   \
    \
+    VSTART_CHECK_EARLY_EXIT(env); \
+  \
  for (i = env->vstart; i < vl; i++) {  \
  ETYPE s1 = *((ETYPE *)vs1 + H(i));    \
  ETYPE s2 = *((ETYPE *)vs2 + H(i));    \
@@ -982,6 +998,8 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,  
    \
  uint32_t vta_all_1s = vext_vta_all_1s(desc);    \
  uint32_t i; \
  \
+    VSTART_CHECK_EARLY_EXIT(env);   \
+    \
  for (i = env->vstart; i < vl; i++) {    \
  ETYPE s2 = *((ETYPE *)vs2 + H(i));  \
  ETYPE carry = !vm && vext_elem_mask(v0, i); \
@@ -1078,6 +1096,8 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,  
    \
  uint32_t vma = vext_vma(desc);

Re: [PATCH v4 4/5] qapi: blockdev-backup: add discard-source parameter

2024-03-14 Thread Vladimir Sementsov-Ogievskiy


On 13.03.24 19:08, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:


Add a parameter that enables discard-after-copy. That is mostly useful
in "push backup with fleecing" scheme, when source is snapshot-access
format driver node, based on copy-before-write filter snapshot-access
API:

[guest]  [snapshot-access] ~~ blockdev-backup ~~> [backup target]
||
| root   | file
vv
[copy-before-write]
| |
| file| target
v v
[active disk]   [temp.img]

In this case discard-after-copy does two things:

  - discard data in temp.img to save disk space
  - avoid further copy-before-write operation in discarded area

Note that we have to declare WRITE permission on source in
copy-before-write filter, for discard to work. Still we can't take it
unconditionally, as it will break normal backup from RO source. So, we
have to add a parameter and pass it thorough bdrv_open flags.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Fiona Ebner 
Tested-by: Fiona Ebner 


[...]


diff --git a/qapi/block-core.json b/qapi/block-core.json
index 1874f880a8..2ef52ae9a7 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1610,6 +1610,9 @@
  # node specified by @drive.  If this option is not given, a node
  # name is autogenerated.  (Since: 4.2)
  #
+# @discard-source: Discard blocks on source which are already copied


"have been copied"?


Oh, right




+# to the target.  (Since 9.1)
+#
  # @x-perf: Performance options.  (Since 6.0)
  #
  # Features:
@@ -1631,6 +1634,7 @@
  '*on-target-error': 'BlockdevOnError',
  '*auto-finalize': 'bool', '*auto-dismiss': 'bool',
  '*filter-node-name': 'str',
+'*discard-source': 'bool',
  '*x-perf': { 'type': 'BackupPerf',
   'features': [ 'unstable' ] } } }


QAPI schema
Acked-by: Markus Armbruster 



Thanks!

--
Best regards,
Vladimir

Re: [PATCH v4 16/23] qapi/schema: Don't initialize "members" with `None`

2024-03-14 Thread Markus Armbruster

John Snow  writes:

> On Thu, Mar 14, 2024, 9:01 AM Markus Armbruster  wrote:
>
>> John Snow  writes:
>>
>> > Declare, but don't initialize the "members" field with type
>> > List[QAPISchemaObjectTypeMember].
>> >
>> > This simplifies the typing from what would otherwise be
>> > Optional[List[T]] to merely List[T]. This removes the need to add
>> > assertions to several callsites that this value is not None - which it
>> > never will be after the delayed initialization in check() anyway.
>> >
>> > The type declaration without initialization trick will cause accidental
>> > uses of this field prior to full initialization to raise an
>> > AttributeError.
>> >
>> > (Note that it is valid to have an empty members list, see the internal
>> > q_empty object as an example. For this reason, we cannot use the empty
>> > list as a replacement test for full initialization and instead rely on
>> > the _checked/_check_complete fields.)
>> >
>> > Signed-off-by: John Snow 
>> > ---
>> >  scripts/qapi/schema.py | 12 +++-
>> >  1 file changed, 7 insertions(+), 5 deletions(-)
>> >
>> > diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
>> > index 50ebc4f12de..fb30314741a 100644
>> > --- a/scripts/qapi/schema.py
>> > +++ b/scripts/qapi/schema.py
>> > @@ -20,7 +20,7 @@
>> >  from collections import OrderedDict
>> >  import os
>> >  import re
>> > -from typing import List, Optional
>> > +from typing import List, Optional, cast
>> >
>> >  from .common import (
>> >  POINTER_SUFFIX,
>> > @@ -449,7 +449,7 @@ def __init__(self, name, info, doc, ifcond, features,
>> >  self.base = None
>> >  self.local_members = local_members
>> >  self.variants = variants
>> > -self.members = None
>> > +self.members: List[QAPISchemaObjectTypeMember]
>> >  self._check_complete = False
>> >
>> >  def check(self, schema):
>> > @@ -482,7 +482,11 @@ def check(self, schema):
>> >  for m in self.local_members:
>> >  m.check(schema)
>> >  m.check_clash(self.info, seen)
>> > -members = seen.values()
>> > +
>> > +# check_clash works in terms of the supertype, but local_members
>> > +# is asserted to be List[QAPISchemaObjectTypeMember].
>>
>> Do you mean "but self.members is declared as
>> List[QAPISchemaObjectTypeMember]"?
>
> Argh. I meant asserted in the linguistic sense. mypy asserts it to be; not
> a runtime assertion.
>
> I do this a lot, apparently.

Okay to adjust your comment to my version?

[...]

Re: [PATCH v4 16/23] qapi/schema: Don't initialize "members" with `None`

2024-03-14 Thread John Snow

On Thu, Mar 14, 2024, 9:58 AM Markus Armbruster  wrote:

> John Snow  writes:
>
> > On Thu, Mar 14, 2024, 9:01 AM Markus Armbruster 
> wrote:
> >
> >> John Snow  writes:
> >>
> >> > Declare, but don't initialize the "members" field with type
> >> > List[QAPISchemaObjectTypeMember].
> >> >
> >> > This simplifies the typing from what would otherwise be
> >> > Optional[List[T]] to merely List[T]. This removes the need to add
> >> > assertions to several callsites that this value is not None - which it
> >> > never will be after the delayed initialization in check() anyway.
> >> >
> >> > The type declaration without initialization trick will cause
> accidental
> >> > uses of this field prior to full initialization to raise an
> >> > AttributeError.
> >> >
> >> > (Note that it is valid to have an empty members list, see the internal
> >> > q_empty object as an example. For this reason, we cannot use the empty
> >> > list as a replacement test for full initialization and instead rely on
> >> > the _checked/_check_complete fields.)
> >> >
> >> > Signed-off-by: John Snow 
> >> > ---
> >> >  scripts/qapi/schema.py | 12 +++-
> >> >  1 file changed, 7 insertions(+), 5 deletions(-)
> >> >
> >> > diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
> >> > index 50ebc4f12de..fb30314741a 100644
> >> > --- a/scripts/qapi/schema.py
> >> > +++ b/scripts/qapi/schema.py
> >> > @@ -20,7 +20,7 @@
> >> >  from collections import OrderedDict
> >> >  import os
> >> >  import re
> >> > -from typing import List, Optional
> >> > +from typing import List, Optional, cast
> >> >
> >> >  from .common import (
> >> >  POINTER_SUFFIX,
> >> > @@ -449,7 +449,7 @@ def __init__(self, name, info, doc, ifcond,
> features,
> >> >  self.base = None
> >> >  self.local_members = local_members
> >> >  self.variants = variants
> >> > -self.members = None
> >> > +self.members: List[QAPISchemaObjectTypeMember]
> >> >  self._check_complete = False
> >> >
> >> >  def check(self, schema):
> >> > @@ -482,7 +482,11 @@ def check(self, schema):
> >> >  for m in self.local_members:
> >> >  m.check(schema)
> >> >  m.check_clash(self.info, seen)
> >> > -members = seen.values()
> >> > +
> >> > +# check_clash works in terms of the supertype, but
> local_members
> >> > +# is asserted to be List[QAPISchemaObjectTypeMember].
> >>
> >> Do you mean "but self.members is declared as
> >> List[QAPISchemaObjectTypeMember]"?
> >
> > Argh. I meant asserted in the linguistic sense. mypy asserts it to be;
> not
> > a runtime assertion.
> >
> > I do this a lot, apparently.
>
> Okay to adjust your comment to my version?
>

Absolutely, of course!


> [...]
>
>

Re: [PATCH v4 05/23] qapi: create QAPISchemaDefinition

2024-03-14 Thread Markus Armbruster

John Snow  writes:

> On Thu, Mar 14, 2024, 5:12 AM Markus Armbruster  wrote:
>
>> John Snow  writes:
>>
>> > Include entities don't have names, but we generally expect "entities" to
>> > have names. Reclassify all entities with names as *definitions*, leaving
>> > the nameless include entities as QAPISchemaEntity instances.
>> >
>> > This is primarily to help simplify typing around expectations of what
>> > callers expect for properties of an "entity".
>> >
>> > Suggested-by: Markus Armbruster 
>> > Signed-off-by: John Snow 
>> > ---
>> >  scripts/qapi/schema.py | 144 +++--
>> >  1 file changed, 82 insertions(+), 62 deletions(-)
>> >
>> > diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
>> > index 117f0f78f0c..da273c1649d 100644
>> > --- a/scripts/qapi/schema.py
>> > +++ b/scripts/qapi/schema.py
>>
>> [...]
>>
>> > @@ -107,6 +90,46 @@ def _set_module(self, schema, info):
>> >  def set_module(self, schema):
>> >  self._set_module(schema, self.info)
>> >
>> > +def visit(self, visitor):
>> > +# pylint: disable=unused-argument
>> > +assert self._checked
>> > +
>> > +
>> > +class QAPISchemaDefinition(QAPISchemaEntity):
>> > +meta: Optional[str] = None
>> > +
>> > +def __init__(self, name: str, info, doc, ifcond=None, features=None):
>> > +assert isinstance(name, str)
>> > +super().__init__(info)
>> > +for f in features or []:
>> > +assert isinstance(f, QAPISchemaFeature)
>> > +f.set_defined_in(name)
>> > +self.name = name
>> > +self.doc = doc
>> > +self._ifcond = ifcond or QAPISchemaIfCond()
>> > +self.features = features or []
>> > +
>> > +def __repr__(self):
>> > +return "<%s:%s at 0x%x>" % (type(self).__name__, self.name,
>> > +id(self))
>> > +
>> > +def c_name(self):
>> > +return c_name(self.name)
>> > +
>> > +def check(self, schema):
>> > +assert not self._checked
>> > +seen = {}
>> > +for f in self.features:
>> > +f.check_clash(self.info, seen)
>> > +super().check(schema)
>>
>> v3 called super().check() first.  Why did you move it?  I'm asking just
>> to make sure I'm not missing something subtle.
>>
>
> I don't believe you are, I think it was just ... something I didn't notice
> when rebasing, since I merged this patch by hand. I recall having to insert
> the super call manually, and I guess my preferences are nondeterministic.

I'd like to move it back, for consistency with its subtypes' .check(),
which all call super().check() early.  Okay?

>> > +
>> > +def connect_doc(self, doc=None):
>> > +super().connect_doc(doc)
>> > +doc = doc or self.doc
>> > +if doc:
>> > +for f in self.features:
>> > +doc.connect_feature(f)
>> > +
>> >  @property
>> >  def ifcond(self):
>> >  assert self._checked
>>
>> [...]
>>
>>

Re: [PATCH v4 05/23] qapi: create QAPISchemaDefinition

2024-03-14 Thread John Snow

On Thu, Mar 14, 2024, 10:04 AM Markus Armbruster  wrote:

> John Snow  writes:
>
> > On Thu, Mar 14, 2024, 5:12 AM Markus Armbruster 
> wrote:
> >
> >> John Snow  writes:
> >>
> >> > Include entities don't have names, but we generally expect "entities"
> to
> >> > have names. Reclassify all entities with names as *definitions*,
> leaving
> >> > the nameless include entities as QAPISchemaEntity instances.
> >> >
> >> > This is primarily to help simplify typing around expectations of what
> >> > callers expect for properties of an "entity".
> >> >
> >> > Suggested-by: Markus Armbruster 
> >> > Signed-off-by: John Snow 
> >> > ---
> >> >  scripts/qapi/schema.py | 144
> +++--
> >> >  1 file changed, 82 insertions(+), 62 deletions(-)
> >> >
> >> > diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
> >> > index 117f0f78f0c..da273c1649d 100644
> >> > --- a/scripts/qapi/schema.py
> >> > +++ b/scripts/qapi/schema.py
> >>
> >> [...]
> >>
> >> > @@ -107,6 +90,46 @@ def _set_module(self, schema, info):
> >> >  def set_module(self, schema):
> >> >  self._set_module(schema, self.info)
> >> >
> >> > +def visit(self, visitor):
> >> > +# pylint: disable=unused-argument
> >> > +assert self._checked
> >> > +
> >> > +
> >> > +class QAPISchemaDefinition(QAPISchemaEntity):
> >> > +meta: Optional[str] = None
> >> > +
> >> > +def __init__(self, name: str, info, doc, ifcond=None,
> features=None):
> >> > +assert isinstance(name, str)
> >> > +super().__init__(info)
> >> > +for f in features or []:
> >> > +assert isinstance(f, QAPISchemaFeature)
> >> > +f.set_defined_in(name)
> >> > +self.name = name
> >> > +self.doc = doc
> >> > +self._ifcond = ifcond or QAPISchemaIfCond()
> >> > +self.features = features or []
> >> > +
> >> > +def __repr__(self):
> >> > +return "<%s:%s at 0x%x>" % (type(self).__name__, self.name,
> >> > +id(self))
> >> > +
> >> > +def c_name(self):
> >> > +return c_name(self.name)
> >> > +
> >> > +def check(self, schema):
> >> > +assert not self._checked
> >> > +seen = {}
> >> > +for f in self.features:
> >> > +f.check_clash(self.info, seen)
> >> > +super().check(schema)
> >>
> >> v3 called super().check() first.  Why did you move it?  I'm asking just
> >> to make sure I'm not missing something subtle.
> >>
> >
> > I don't believe you are, I think it was just ... something I didn't
> notice
> > when rebasing, since I merged this patch by hand. I recall having to
> insert
> > the super call manually, and I guess my preferences are nondeterministic.
>
> I'd like to move it back, for consistency with its subtypes' .check(),
> which all call super().check() early.  Okay?
>

Yes, please do.

(You *really* don't have to ask, but I understand you are being courteous
to a fault. It's your module, though! I argue a lot but it *is* yours. Run
it through the armbru delinter as much as you'd like.)


> >> > +
> >> > +def connect_doc(self, doc=None):
> >> > +super().connect_doc(doc)
> >> > +doc = doc or self.doc
> >> > +if doc:
> >> > +for f in self.features:
> >> > +doc.connect_feature(f)
> >> > +
> >> >  @property
> >> >  def ifcond(self):
> >> >  assert self._checked
> >>
> >> [...]
> >>
> >>
>
>

Re: [PATCH v2 3/4] tests/avocado: use OpenBSD 7.4 for sbsa-ref

2024-03-14 Thread Marcin Juszkiewicz


W dniu 14.03.2024 o 13:14, Alex Bennée pisze:

+# OpenBSD 7.4 does not boot on current max cpu.
+#
+#   def test_sbsaref_openbsd_max_pauth_off(self):
+#   """
+#   :avocado: tags=cpu:max
+#   :avocado: tags=os:openbsd
+#   """
+#   self.boot_openbsd("max,pauth=off")

If we are not going to delete the entries then at least use a @skip
instead of commenting. Maybe:

   @skip("Potential un-diagnosed upstream bug?")

but it would be nice to figure out exactly where is breaks.


I am going to subscribe to openbsd mailing list and ask there.

OpenBSD 7.3 works, 7.4/7.5-current does not.
And it is on Neoverse-V1/N2 and max cpu types.

Re: [PATCH for-9.0] mirror: Don't call job_pause_point() under graph lock

2024-03-14 Thread Stefan Hajnoczi

On Wed, Mar 13, 2024 at 04:30:00PM +0100, Kevin Wolf wrote:
> Calling job_pause_point() while holding the graph reader lock
> potentially results in a deadlock: bdrv_graph_wrlock() first drains
> everything, including the mirror job, which pauses it. The job is only
> unpaused at the end of the drain section, which is when the graph writer
> lock has been successfully taken. However, if the job happens to be
> paused at a pause point where it still holds the reader lock, the writer
> lock can't be taken as long as the job is still paused.
> 
> Mark job_pause_point() as GRAPH_UNLOCKED and fix mirror accordingly.
> 
> Cc: qemu-sta...@nongnu.org
> Buglink: https://issues.redhat.com/browse/RHEL-28125
> Fixes: 004915a96a7a40e942ac85e6d22518cbcd283506
> Signed-off-by: Kevin Wolf 
> ---
>  include/qemu/job.h |  2 +-
>  block/mirror.c | 10 ++
>  2 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/include/qemu/job.h b/include/qemu/job.h
> index 9ea98b5927..2b873f2576 100644
> --- a/include/qemu/job.h
> +++ b/include/qemu/job.h
> @@ -483,7 +483,7 @@ void job_enter(Job *job);
>   *
>   * Called with job_mutex *not* held.
>   */
> -void coroutine_fn job_pause_point(Job *job);
> +void coroutine_fn GRAPH_UNLOCKED job_pause_point(Job *job);
>  
>  /**
>   * @job: The job that calls the function.
> diff --git a/block/mirror.c b/block/mirror.c
> index 5145eb53e1..1bdce3b657 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -479,9 +479,9 @@ static unsigned mirror_perform(MirrorBlockJob *s, int64_t 
> offset,
>  return bytes_handled;
>  }
>  
> -static void coroutine_fn GRAPH_RDLOCK mirror_iteration(MirrorBlockJob *s)
> +static void coroutine_fn GRAPH_UNLOCKED mirror_iteration(MirrorBlockJob *s)
>  {
> -BlockDriverState *source = s->mirror_top_bs->backing->bs;
> +BlockDriverState *source;
>  MirrorOp *pseudo_op;
>  int64_t offset;
>  /* At least the first dirty chunk is mirrored in one iteration. */
> @@ -489,6 +489,10 @@ static void coroutine_fn GRAPH_RDLOCK 
> mirror_iteration(MirrorBlockJob *s)
>  bool write_zeroes_ok = 
> bdrv_can_write_zeroes_with_unmap(blk_bs(s->target));
>  int max_io_bytes = MAX(s->buf_size / MAX_IN_FLIGHT, MAX_IO_BYTES);
>  
> +bdrv_graph_co_rdlock();
> +source = s->mirror_top_bs->backing->bs;

Is bdrv_ref(source) needed here so that source cannot go away if someone
else write locks the graph and removes it? Or maybe something else
protects against that. Either way, please add a comment that explains
why this is safe.

> +bdrv_graph_co_rdunlock();
> +
>  bdrv_dirty_bitmap_lock(s->dirty_bitmap);
>  offset = bdrv_dirty_iter_next(s->dbi);
>  if (offset < 0) {
> @@ -1066,9 +1070,7 @@ static int coroutine_fn mirror_run(Job *job, Error 
> **errp)
>  mirror_wait_for_free_in_flight_slot(s);
>  continue;
>  } else if (cnt != 0) {
> -bdrv_graph_co_rdlock();
>  mirror_iteration(s);
> -bdrv_graph_co_rdunlock();
>  }
>  }
>  
> -- 
> 2.44.0
> 


signature.asc
Description: PGP signature

Re: [PATCH v2 3/4] tests/avocado: use OpenBSD 7.4 for sbsa-ref

2024-03-14 Thread Philippe Mathieu-Daudé


On 14/3/24 13:14, Alex Bennée wrote:

Marcin Juszkiewicz  writes:


7.4 was released in October 2023, time to update before 7.3 gets dropped.

Disabled tests for 'max' cpu as OpenBSD fails there.

Signed-off-by: Marcin Juszkiewicz 
---
  tests/avocado/machine_aarch64_sbsaref.py | 47 +++-
  1 file changed, 34 insertions(+), 13 deletions(-)

diff --git a/tests/avocado/machine_aarch64_sbsaref.py 
b/tests/avocado/machine_aarch64_sbsaref.py
index 259225f15f..0e52dc5854 100644
--- a/tests/avocado/machine_aarch64_sbsaref.py
+++ b/tests/avocado/machine_aarch64_sbsaref.py
@@ -159,14 +159,14 @@ def test_sbsaref_alpine_linux_max(self):
  # This tests the whole boot chain from EFI to Userspace
  # We only boot a whole OS for the current top level CPU and GIC
  # Other test profiles should use more minimal boots
-def boot_openbsd73(self, cpu):
+def boot_openbsd(self, cpu):
  self.fetch_firmware()
  
  img_url = (

-"https://cdn.openbsd.org/pub/OpenBSD/7.3/arm64/miniroot73.img";
+"https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/miniroot74.img";
  )
  
-img_hash = "7fc2c75401d6f01fbfa25f4953f72ad7d7c18650056d30755c44b9c129b707e5"

+img_hash = 
"7b08b2ce081cff6408d183f7152ddcfd2779912104866e4fdf6ae2d864b51142"
  img_path = self.fetch_asset(img_url, algorithm="sha256", 
asset_hash=img_hash)
  
  self.vm.set_console()

@@ -180,23 +180,44 @@ def boot_openbsd73(self, cpu):
  self.vm.launch()
  wait_for_console_pattern(self,
   "Welcome to the OpenBSD/arm64"
- " 7.3 installation program.")
+ " 7.4 installation program.")
  
-def test_sbsaref_openbsd73_cortex_a57(self):

+def test_sbsaref_openbsd_cortex_a57(self):
  """
  :avocado: tags=cpu:cortex-a57
+:avocado: tags=os:openbsd
  """
-self.boot_openbsd73("cortex-a57")
+self.boot_openbsd("cortex-a57")
  
-def test_sbsaref_openbsd73_neoverse_n1(self):

+def test_sbsaref_openbsd_neoverse_n1(self):
  """
  :avocado: tags=cpu:neoverse-n1
+:avocado: tags=os:openbsd
  """
-self.boot_openbsd73("neoverse-n1")
+self.boot_openbsd("neoverse-n1")
  
-def test_sbsaref_openbsd73_max(self):

-"""
-:avocado: tags=cpu:max
-"""
-self.boot_openbsd73("max")
+# OpenBSD 7.4 does not boot on current max cpu.
+#
+#   def test_sbsaref_openbsd_max_pauth_off(self):
+#   """
+#   :avocado: tags=cpu:max
+#   :avocado: tags=os:openbsd
+#   """
+#   self.boot_openbsd("max,pauth=off")


If we are not going to delete the entries then at least use a @skip
instead of commenting. Maybe:

   @skip("Potential un-diagnosed upstream bug?")


Daniel or Peter suggested to open a GitLab issue and use

@skip("https://gitlab.com/qemu-project/qemu/-/issues/xyz";)

to track progress.


but it would be nice to figure out exactly where is breaks.

Re: [PATCH v4 00/23] qapi: statically type schema.py

2024-03-14 Thread Markus Armbruster

John Snow  writes:

> This is *v4*, for some definitions of "version" and "four".

Series
Reviewed-by: Markus Armbruster 

I'll replace PATCH 12, not because it's wrong, just because I like my
replacement better.  I'll post this and two follow-up patches as v5.
I will not let the two follow-up patches delay queuing, though.

Thanks!

Re: [PATCH v4 00/23] qapi: statically type schema.py

2024-03-14 Thread John Snow

On Thu, Mar 14, 2024, 10:43 AM Markus Armbruster  wrote:

> John Snow  writes:
>
> > This is *v4*, for some definitions of "version" and "four".
>
> Series
> Reviewed-by: Markus Armbruster 
>
> I'll replace PATCH 12, not because it's wrong, just because I like my
> replacement better.  I'll post this and two follow-up patches as v5.
> I will not let the two follow-up patches delay queuing, though.
>
> Thanks!
>

Sounds good! I'll divert efforts to improving sphinx docs next unless you
have a higher priority to suggest.

Re: [PATCH v2 1/6] virtio/virtio-pci: Handle extra notification data

2024-03-14 Thread Eugenio Perez Martin

On Thu, Mar 14, 2024 at 1:16 PM Jonah Palmer  wrote:
>
>
>
> On 3/13/24 11:01 PM, Jason Wang wrote:
> > On Wed, Mar 13, 2024 at 7:55 PM Jonah Palmer  
> > wrote:
> >>
> >> Add support to virtio-pci devices for handling the extra data sent
> >> from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA
> >> transport feature has been negotiated.
> >>
> >> The extra data that's passed to the virtio-pci device when this
> >> feature is enabled varies depending on the device's virtqueue
> >> layout.
> >>
> >> In a split virtqueue layout, this data includes:
> >>   - upper 16 bits: shadow_avail_idx
> >>   - lower 16 bits: virtqueue index
> >>
> >> In a packed virtqueue layout, this data includes:
> >>   - upper 16 bits: 1-bit wrap counter & 15-bit shadow_avail_idx
> >>   - lower 16 bits: virtqueue index
> >>
> >> Tested-by: Lei Yang 
> >> Reviewed-by: Eugenio Pérez 
> >> Signed-off-by: Jonah Palmer 
> >> ---
> >>   hw/virtio/virtio-pci.c | 10 +++---
> >>   hw/virtio/virtio.c | 18 ++
> >>   include/hw/virtio/virtio.h |  1 +
> >>   3 files changed, 26 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> >> index cb6940fc0e..0f5c3c3b2f 100644
> >> --- a/hw/virtio/virtio-pci.c
> >> +++ b/hw/virtio/virtio-pci.c
> >> @@ -384,7 +384,7 @@ static void virtio_ioport_write(void *opaque, uint32_t 
> >> addr, uint32_t val)
> >>   {
> >>   VirtIOPCIProxy *proxy = opaque;
> >>   VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
> >> -uint16_t vector;
> >> +uint16_t vector, vq_idx;
> >>   hwaddr pa;
> >>
> >>   switch (addr) {
> >> @@ -408,8 +408,12 @@ static void virtio_ioport_write(void *opaque, 
> >> uint32_t addr, uint32_t val)
> >>   vdev->queue_sel = val;
> >>   break;
> >>   case VIRTIO_PCI_QUEUE_NOTIFY:
> >> -if (val < VIRTIO_QUEUE_MAX) {
> >> -virtio_queue_notify(vdev, val);
> >> +vq_idx = val;
> >> +if (vq_idx < VIRTIO_QUEUE_MAX) {
> >> +if (virtio_vdev_has_feature(vdev, 
> >> VIRTIO_F_NOTIFICATION_DATA)) {
> >> +virtio_queue_set_shadow_avail_data(vdev, val);
> >> +}
> >> +virtio_queue_notify(vdev, vq_idx);
> >>   }
> >>   break;
> >>   case VIRTIO_PCI_STATUS:
> >> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> >> index d229755eae..bcb9e09df0 100644
> >> --- a/hw/virtio/virtio.c
> >> +++ b/hw/virtio/virtio.c
> >> @@ -2255,6 +2255,24 @@ void virtio_queue_set_align(VirtIODevice *vdev, int 
> >> n, int align)
> >>   }
> >>   }
> >>
> >> +void virtio_queue_set_shadow_avail_data(VirtIODevice *vdev, uint32_t data)

Maybe I didn't explain well, but I think it is better to pass directly
idx to a VirtQueue *. That way only the caller needs to check for a
valid vq idx, and (my understanding is) the virtio.c interface is
migrating to VirtQueue * use anyway.

> >> +{
> >> +/* Lower 16 bits is the virtqueue index */
> >> +uint16_t i = data;
> >> +VirtQueue *vq = &vdev->vq[i];
> >> +
> >> +if (!vq->vring.desc) {
> >> +return;
> >> +}
> >> +
> >> +if (virtio_vdev_has_feature(vdev, VIRTIO_F_RING_PACKED)) {
> >> +vq->shadow_avail_wrap_counter = (data >> 31) & 0x1;
> >> +vq->shadow_avail_idx = (data >> 16) & 0x7FFF;
> >> +} else {
> >> +vq->shadow_avail_idx = (data >> 16);
> >
> > Do we need to do a sanity check for this value?
> >
> > Thanks
> >
>
> It can't hurt, right? What kind of check did you have in mind?
>
> if (vq->shadow_avail_idx >= vq->vring.num)
>

I'm a little bit lost too. shadow_avail_idx can take all uint16_t
values. Maybe you meant checking for a valid vq index, Jason?

Thanks!

> Or something else?
>
> >> +}
> >> +}
> >> +
> >>   static void virtio_queue_notify_vq(VirtQueue *vq)
> >>   {
> >>   if (vq->vring.desc && vq->handle_output) {
> >> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> >> index c8f72850bc..53915947a7 100644
> >> --- a/include/hw/virtio/virtio.h
> >> +++ b/include/hw/virtio/virtio.h
> >> @@ -335,6 +335,7 @@ void virtio_queue_update_rings(VirtIODevice *vdev, int 
> >> n);
> >>   void virtio_init_region_cache(VirtIODevice *vdev, int n);
> >>   void virtio_queue_set_align(VirtIODevice *vdev, int n, int align);
> >>   void virtio_queue_notify(VirtIODevice *vdev, int n);
> >> +void virtio_queue_set_shadow_avail_data(VirtIODevice *vdev, uint32_t 
> >> data);
> >>   uint16_t virtio_queue_vector(VirtIODevice *vdev, int n);
> >>   void virtio_queue_set_vector(VirtIODevice *vdev, int n, uint16_t vector);
> >>   int virtio_queue_set_host_notifier_mr(VirtIODevice *vdev, int n,
> >> --
> >> 2.39.3
> >>
> >
>

Re: [PATCH v2 3/4] tests/avocado: use OpenBSD 7.4 for sbsa-ref

2024-03-14 Thread Alex Bennée

Philippe Mathieu-Daudé  writes:

> On 14/3/24 13:14, Alex Bennée wrote:
>> Marcin Juszkiewicz  writes:
>> 
>>> 7.4 was released in October 2023, time to update before 7.3 gets dropped.
>>>
>>> Disabled tests for 'max' cpu as OpenBSD fails there.
>>>
>>> Signed-off-by: Marcin Juszkiewicz 
>>> ---
>>>   tests/avocado/machine_aarch64_sbsaref.py | 47 
>>> +++-
>>>   1 file changed, 34 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/tests/avocado/machine_aarch64_sbsaref.py 
>>> b/tests/avocado/machine_aarch64_sbsaref.py
>>> index 259225f15f..0e52dc5854 100644
>>> --- a/tests/avocado/machine_aarch64_sbsaref.py
>>> +++ b/tests/avocado/machine_aarch64_sbsaref.py
>>> @@ -159,14 +159,14 @@ def test_sbsaref_alpine_linux_max(self):
>>>   # This tests the whole boot chain from EFI to Userspace
>>>   # We only boot a whole OS for the current top level CPU and GIC
>>>   # Other test profiles should use more minimal boots
>>> -def boot_openbsd73(self, cpu):
>>> +def boot_openbsd(self, cpu):
>>>   self.fetch_firmware()
>>> img_url = (
>>> -"https://cdn.openbsd.org/pub/OpenBSD/7.3/arm64/miniroot73.img";
>>> +"https://cdn.openbsd.org/pub/OpenBSD/7.4/arm64/miniroot74.img";
>>>   )
>>>   -img_hash =
>>> "7fc2c75401d6f01fbfa25f4953f72ad7d7c18650056d30755c44b9c129b707e5"
>>> +img_hash = 
>>> "7b08b2ce081cff6408d183f7152ddcfd2779912104866e4fdf6ae2d864b51142"
>>>   img_path = self.fetch_asset(img_url, algorithm="sha256", 
>>> asset_hash=img_hash)
>>> self.vm.set_console()
>>> @@ -180,23 +180,44 @@ def boot_openbsd73(self, cpu):
>>>   self.vm.launch()
>>>   wait_for_console_pattern(self,
>>>"Welcome to the OpenBSD/arm64"
>>> - " 7.3 installation program.")
>>> + " 7.4 installation program.")
>>>   -def test_sbsaref_openbsd73_cortex_a57(self):
>>> +def test_sbsaref_openbsd_cortex_a57(self):
>>>   """
>>>   :avocado: tags=cpu:cortex-a57
>>> +:avocado: tags=os:openbsd
>>>   """
>>> -self.boot_openbsd73("cortex-a57")
>>> +self.boot_openbsd("cortex-a57")
>>>   -def test_sbsaref_openbsd73_neoverse_n1(self):
>>> +def test_sbsaref_openbsd_neoverse_n1(self):
>>>   """
>>>   :avocado: tags=cpu:neoverse-n1
>>> +:avocado: tags=os:openbsd
>>>   """
>>> -self.boot_openbsd73("neoverse-n1")
>>> +self.boot_openbsd("neoverse-n1")
>>>   -def test_sbsaref_openbsd73_max(self):
>>> -"""
>>> -:avocado: tags=cpu:max
>>> -"""
>>> -self.boot_openbsd73("max")
>>> +# OpenBSD 7.4 does not boot on current max cpu.
>>> +#
>>> +#   def test_sbsaref_openbsd_max_pauth_off(self):
>>> +#   """
>>> +#   :avocado: tags=cpu:max
>>> +#   :avocado: tags=os:openbsd
>>> +#   """
>>> +#   self.boot_openbsd("max,pauth=off")
>> If we are not going to delete the entries then at least use a @skip
>> instead of commenting. Maybe:
>>@skip("Potential un-diagnosed upstream bug?")
>
> Daniel or Peter suggested to open a GitLab issue and use
>
> @skip("https://gitlab.com/qemu-project/qemu/-/issues/xyz";)
>
> to track progress.

That's a good idea. Are you going to respin?

>
>> but it would be nice to figure out exactly where is breaks.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [RFC PATCH 4/5] cxl/core: add report option for cxl_mem_get_poison()

2024-03-14 Thread Shiyang Ruan via





在 2024/2/10 14:49, Dan Williams 写道:

Shiyang Ruan wrote:

When a poison event is received, driver uses GET_POISON_LIST command
to get the poison list.  Now driver has cxl_mem_get_poison(), so
reuse it and add a parameter 'bool report', report poison record to MCE
if set true.


If the memory error record has the poison event, why does the poison
list need to be retrieved by the kernel? I would expect it is sufficient
to just report the single poison event and leave it to userspace to
react to that event and retrieve more data if it wants.


The GMER has only physical address field, no range/length of the POISON, 
we can't get the poison range from the single event record.  Since the 
POISON range is injected by one command, one GMER is sent to driver, we 
have to use GET_POISON_LIST command to get the length.



--
Thanks,
Ruan.

Re: [PATCH v2 2/2] migration/multifd: Ensure we're not given a socket for file migration

2024-03-14 Thread Peter Xu

On Wed, Mar 13, 2024 at 06:28:24PM -0300, Fabiano Rosas wrote:
> When doing migration using the fd: URI, the incoming migration starts
> before the user has passed the file descriptor to QEMU. This means
> that the checks at migration_channels_and_transport_compatible()
> happen too soon and we need to allow a migration channel of type
> SOCKET_ADDRESS_TYPE_FD even though socket migration is not supported
> with multifd.

Hmm, bare with me if this is a stupid one.. why the incoming migration can
start _before_ the user passed in the fd?

IOW, why can't we rely on a single fd_is_socket() check for
SOCKET_ADDRESS_TYPE_FD in transport_supports_multi_channels()?

> 
> The commit decdc76772 ("migration/multifd: Add mapped-ram support to
> fd: URI") was supposed to add a second check prior to starting
> migration to make sure a socket fd is not passed instead of a file fd,
> but failed to do so.
> 
> Add the missing verification.
> 
> Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
> Signed-off-by: Fabiano Rosas 
> ---
>  migration/fd.c   | 8 
>  migration/file.c | 7 +++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/migration/fd.c b/migration/fd.c
> index 39a52e5c90..c07030f715 100644
> --- a/migration/fd.c
> +++ b/migration/fd.c
> @@ -22,6 +22,7 @@
>  #include "migration.h"
>  #include "monitor/monitor.h"
>  #include "io/channel-file.h"
> +#include "io/channel-socket.h"
>  #include "io/channel-util.h"
>  #include "options.h"
>  #include "trace.h"
> @@ -95,6 +96,13 @@ void fd_start_incoming_migration(const char *fdname, Error 
> **errp)
>  }
>  
>  if (migrate_multifd()) {
> +if (fd_is_socket(fd)) {
> +error_setg(errp,
> +   "Multifd migration to a socket FD is not supported");
> +object_unref(ioc);
> +return;
> +}
> +
>  file_create_incoming_channels(ioc, errp);
>  } else {
>  qio_channel_set_name(ioc, "migration-fd-incoming");
> diff --git a/migration/file.c b/migration/file.c
> index ddde0ca818..b6e8ba13f2 100644
> --- a/migration/file.c
> +++ b/migration/file.c
> @@ -15,6 +15,7 @@
>  #include "file.h"
>  #include "migration.h"
>  #include "io/channel-file.h"
> +#include "io/channel-socket.h"
>  #include "io/channel-util.h"
>  #include "options.h"
>  #include "trace.h"
> @@ -58,6 +59,12 @@ bool file_send_channel_create(gpointer opaque, Error 
> **errp)
>  int fd = fd_args_get_fd();
>  
>  if (fd && fd != -1) {
> +if (fd_is_socket(fd)) {
> +error_setg(errp,
> +   "Multifd migration to a socket FD is not supported");
> +goto out;
> +}
> +
>  ioc = qio_channel_file_new_dupfd(fd, errp);
>  } else {
>  ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
> -- 
> 2.35.3
> 

-- 
Peter Xu

Re: Intention to work on GSoC project

2024-03-14 Thread Eugenio Perez Martin

On Fri, Mar 1, 2024 at 7:29 PM Sahil  wrote:
>
> Hi,
>
> On Friday, March 1, 2024 1:10:49 PM IST you wrote:
> > [...]
> > You can add the recently published
> > "virtio-live-migration-technical-deep-dive" :) [1].
> >
> > [1]
> > https://developers.redhat.com/articles/2024/02/21/virtio-live-migration-technical-deep-dive
>
> Thank you. This resource will help me cover a lot of ground.
>
> > [...]
> > There are few other tasks pending for QEMU in general and SVQ in
> > particular. I'll add them as gitlab issues by today.
> >
>
> That would be nice. I'll keep an eye out for the new issues.
>

Hi Sahil,

It's being hard to find a good self-contained small task related to
the project to be honest. As it would be out of SVQ, would it be ok
for you if we start straight to the task of adding the packed vq
format to SVQ?

Thanks!

Re: [PATCH v2 1/2] migration: Fix iocs leaks during file and fd migration

2024-03-14 Thread Peter Xu

On Wed, Mar 13, 2024 at 06:28:23PM -0300, Fabiano Rosas wrote:
> The memory for the io channels is being leaked in three different ways
> during file migration:
> 
> 1) if the offset check fails we never drop the ioc reference;
> 
> 2) we allocate an extra channel for no reason;
> 
> 3) if multifd is enabled but channel creation fails when calling
>dup(), we leave the previous channels around along with the glib
>polling;
> 
> Fix all issues by restructuring the code to first allocate the
> channels and only register the watches when all channels have been
> created.
> 
> For multifd, the file and fd migrations can share code because both
> are backed by a QIOChannelFile. For the non-multifd case, the fd needs
> to be separate because it is backed by a QIOChannelSocket.
> 
> Fixes: 2dd7ee7a51 ("migration/multifd: Add incoming QIOChannelFile support")
> Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
> Reported-by: Peter Xu 
> Signed-off-by: Fabiano Rosas 

Reviewed-by: Peter Xu 

-- 
Peter Xu

Re: [PATCH v2 0/2] migration mapped-ram fixes

2024-03-14 Thread Peter Xu

On Wed, Mar 13, 2024 at 06:28:22PM -0300, Fabiano Rosas wrote:
> Hi,
> 
> In this v2:
> 
> patch 1 - The fix for the ioc leaks, now including the main channel
> 
> patch 2 - A fix for an fd: migration case I thought I had written code
>   for, but obviously didn't.

Maybe I found one more issue.. I'm looking at fd_start_outgoing_migration().

ioc = qio_channel_new_fd(fd, errp);  <- here the fd is consumed and
then owned by the IOC
if (!ioc) {
close(fd);
return;
}

outgoing_args.fd = fd;   <- here we use the fd again,
and "owned" by outgoing_args
even if it shouldn't?

The problem is outgoing_args.fd will be cleaned up with a close().  I had a
feeling that it's possible it will close() something else if the fd reused
before that close() but after the IOC's.  We may want yet another dup() for
outgoing_args.fd?

If you agree, we may also want to avoid doing:

outgoing_args.fd = -1;

We could assert it instead making sure no fd leak.

> 
> Thank you for your patience.
> 
> based-on: https://gitlab.com/peterx/qemu/-/commits/migration-stable
> CI run: https://gitlab.com/farosas/qemu/-/pipelines/1212483701
> 
> Fabiano Rosas (2):
>   migration: Fix iocs leaks during file and fd migration
>   migration/multifd: Ensure we're not given a socket for file migration
> 
>  migration/fd.c   | 35 +++---
>  migration/file.c | 65 
>  migration/file.h |  1 +
>  3 files changed, 60 insertions(+), 41 deletions(-)
> 
> -- 
> 2.35.3
> 

-- 
Peter Xu

[PATCH v4 07/21] smbios: avoid mangling user provided tables

2024-03-14 Thread Igor Mammedov

currently smbios_entry_add() preserves internally '-smbios type='
options but tables provided with '-smbios file=' are stored directly
into blob that eventually will be exposed to VM. And then later
QEMU adds default/'-smbios type' entries on top into the same blob.

It makes impossible to generate tables more than once, hence
'immutable' guard was used.
Make it possible to regenerate final blob by storing user provided
blobs into a dedicated area (usr_blobs) and then copy it when
composing final blob. Which also makes handling of -smbios
options consistent.

As side effect of this and previous commits there is no need to
generate legacy smbios_entries at the time options are parsed.
Instead compose smbios_entries on demand from  usr_blobs like
it is done for non-legacy SMBIOS tables.

Signed-off-by: Igor Mammedov 
Tested-by: Fiona Ebner 
Reviewed-by: Ani Sinha 
---
v4:
   fix conflicts when rebasing on top of:
 04f143d828 Implement SMBIOS type 9 v2.6
 735eee07d1 Implement base of SMBIOS type 9 descriptor
---
 hw/smbios/smbios.c | 181 +++--
 1 file changed, 93 insertions(+), 88 deletions(-)

diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index 1bae36a6e0..db422f4fb0 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -57,6 +57,14 @@ static size_t smbios_entries_len;
 static bool smbios_uuid_encoded = true;
 /* end: legacy structures & constants for <= 2.0 machines */
 
+/*
+ * SMBIOS tables provided by user with '-smbios file=' option
+ */
+uint8_t *usr_blobs;
+size_t usr_blobs_len;
+static GArray *usr_blobs_sizes;
+static unsigned usr_table_max;
+static unsigned usr_table_cnt;
 
 uint8_t *smbios_tables;
 size_t smbios_tables_len;
@@ -67,7 +75,6 @@ static SmbiosEntryPointType smbios_ep_type = 
SMBIOS_ENTRY_POINT_TYPE_32;
 static SmbiosEntryPoint ep;
 
 static int smbios_type4_count = 0;
-static bool smbios_immutable;
 static bool smbios_have_defaults;
 static uint32_t smbios_cpuid_version, smbios_cpuid_features;
 
@@ -632,9 +639,8 @@ static void smbios_build_type_1_fields(void)
 
 uint8_t *smbios_get_table_legacy(uint32_t expected_t4_count, size_t *length)
 {
-/* drop unwanted version of command-line file blob(s) */
-g_free(smbios_tables);
-smbios_tables = NULL;
+int i;
+size_t usr_offset;
 
 /* also complain if fields were given for types > 1 */
 if (find_next_bit(have_fields_bitmap,
@@ -644,12 +650,33 @@ uint8_t *smbios_get_table_legacy(uint32_t 
expected_t4_count, size_t *length)
 exit(1);
 }
 
-if (!smbios_immutable) {
-smbios_build_type_0_fields();
-smbios_build_type_1_fields();
-smbios_validate_table(expected_t4_count);
-smbios_immutable = true;
+g_free(smbios_entries);
+smbios_entries_len = sizeof(uint16_t);
+smbios_entries = g_malloc0(smbios_entries_len);
+
+for (i = 0, usr_offset = 0; usr_blobs_sizes && i < usr_blobs_sizes->len;
+ i++)
+{
+struct smbios_table *table;
+struct smbios_structure_header *header;
+size_t size = g_array_index(usr_blobs_sizes, size_t, i);
+
+header = (struct smbios_structure_header *)(usr_blobs + usr_offset);
+smbios_entries = g_realloc(smbios_entries, smbios_entries_len +
+   size + sizeof(*table));
+table = (struct smbios_table *)(smbios_entries + smbios_entries_len);
+table->header.type = SMBIOS_TABLE_ENTRY;
+table->header.length = cpu_to_le16(sizeof(*table) + size);
+memcpy(table->data, header, size);
+smbios_entries_len += sizeof(*table) + size;
+(*(uint16_t *)smbios_entries) =
+cpu_to_le16(le16_to_cpu(*(uint16_t *)smbios_entries) + 1);
+usr_offset += size;
 }
+
+smbios_build_type_0_fields();
+smbios_build_type_1_fields();
+smbios_validate_table(expected_t4_count);
 *length = smbios_entries_len;
 return smbios_entries;
 }
@@ -1217,68 +1244,68 @@ void smbios_get_tables(MachineState *ms,
 {
 unsigned i, dimm_cnt, offset;
 
-/* drop unwanted (legacy) version of command-line file blob(s) */
-g_free(smbios_entries);
-smbios_entries = NULL;
+g_free(smbios_tables);
+smbios_tables = g_memdup2(usr_blobs, usr_blobs_len);
+smbios_tables_len = usr_blobs_len;
+smbios_table_max = usr_table_max;
+smbios_table_cnt = usr_table_cnt;
 
-if (!smbios_immutable) {
-smbios_build_type_0_table();
-smbios_build_type_1_table();
-smbios_build_type_2_table();
-smbios_build_type_3_table();
+smbios_build_type_0_table();
+smbios_build_type_1_table();
+smbios_build_type_2_table();
+smbios_build_type_3_table();
 
-assert(ms->smp.sockets >= 1);
+assert(ms->smp.sockets >= 1);
 
-for (i = 0; i < ms->smp.sockets; i++) {
-smbios_build_type_4_table(ms, i);
-}
+for (i = 0; i < ms->smp.sockets; i++) {
+smbios_build_type_4_table(ms,

[PATCH v4 00/21] Workaround Windows failing to find 64bit SMBIOS entry point with SeaBIOS

2024-03-14 Thread Igor Mammedov

Changelog:
 v4:
   * rebase on top of current master due to conflict with
 new SMBIOS table 9 commits
   * add extra patch with comments about obscure legacy entries counting
 v3:
   * whitespace missed by checkpatch
   * fix idndent in QAPI
   * reorder 17/20 before 1st 'auto' can be used
   * pick up acks
 v2:
   * QAPI style fixes (Markus Armbruster )
   * squash 11/19 into 10/19 (Ani Sinha )
   * split '[PATCH 09/19] smbios: build legacy mode code only for 'pc' machine'
 in 3 smaller patches, to make it more readable
   smbios: add smbios_add_usr_blob_size() helper
  
   smbios: rename/expose structures/bitmaps used by both legacy and modern 
code  
   smbios: build legacy mode code only for 'pc' machine
   * pick up acks

Windows (10) bootloader when running on top of SeaBIOS, fails to find
SMBIOSv3 entry point. Tracing it shows that it looks for v2 anchor markers
only and not v3. Tricking it into believing that entry point is found
lets Windows successfully locate and parse SMBIOSv3 tables. Whether it
will be fixed on Windows side is not clear so here goes a workaround.

Idea is to try build v2 tables if QEMU configuration permits,
and fallback to v3 tables otherwise. That will mask Windows issue
form majority of users.
However if VM configuration can't be described (typically large VMs)
by v2 tables, QEMU will use SMBIOSv3 and Windows will hit the issue
again. In this case complain to Microsoft and/or use UEFI instead of
SeaBIOS (requires reinstall).

Default compat setting of smbios-entry-point-type after series
for pc/q35 machines:
  * 9.0-newer: 'auto'
  * 8.1-8.2: '64'
  * 8.0-older: '32'

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2008

Igor Mammedov (21):
  tests: smbios: make it possible to write SMBIOS only test
  tests: smbios: add test for -smbios type=11 option
  tests: smbios: add test for legacy mode CLI options
  smbios: cleanup smbios_get_tables() from legacy handling
  smbios: get rid of smbios_smp_sockets global
  smbios: get rid of smbios_legacy global
  smbios: avoid mangling user provided tables
  smbios: don't check type4 structures in legacy mode
  smbios: add smbios_add_usr_blob_size() helper
  smbios: rename/expose structures/bitmaps used by both legacy and
modern code
  smbios: build legacy mode code only for 'pc' machine
  smbios: handle errors consistently
  smbios: get rid of global smbios_ep_type
  smbios: clear smbios_type4_count before building tables
  smbios: extend smbios-entry-point-type with 'auto' value
  smbios: in case of entry point is 'auto' try to build v2 tables 1st
  smbios: error out when building type 4 table is not possible
  tests: acpi/smbios: whitelist expected blobs
  pc/q35: set SMBIOS entry point type to 'auto' by default
  tests: acpi: update expected SSDT.dimmpxm blob
  smbios: add extra comments to smbios_get_table_legacy()

 hw/i386/fw_cfg.h |   3 +-
 include/hw/firmware/smbios.h |  28 +-
 hw/arm/virt.c|   6 +-
 hw/i386/Kconfig  |   1 +
 hw/i386/fw_cfg.c |  14 +-
 hw/i386/pc.c |   4 +-
 hw/i386/pc_piix.c|   4 +
 hw/i386/pc_q35.c |   3 +
 hw/loongarch/virt.c  |   7 +-
 hw/riscv/virt.c  |   6 +-
 hw/smbios/Kconfig|   2 +
 hw/smbios/meson.build|   4 +
 hw/smbios/smbios.c   | 483 +++
 hw/smbios/smbios_legacy.c| 192 +++
 hw/smbios/smbios_legacy_stub.c   |  15 +
 qapi/machine.json|   5 +-
 tests/data/acpi/q35/SSDT.dimmpxm | Bin 1815 -> 1815 bytes
 tests/data/smbios/type11_blob| Bin 0 -> 11 bytes
 tests/data/smbios/type11_blob.legacy | Bin 0 -> 10 bytes
 tests/qtest/bios-tables-test.c   |  81 -
 20 files changed, 541 insertions(+), 317 deletions(-)
 create mode 100644 hw/smbios/smbios_legacy.c
 create mode 100644 hw/smbios/smbios_legacy_stub.c
 create mode 100644 tests/data/smbios/type11_blob
 create mode 100644 tests/data/smbios/type11_blob.legacy

--
2.39.3

[PATCH v4 13/21] smbios: get rid of global smbios_ep_type

2024-03-14 Thread Igor Mammedov

Signed-off-by: Igor Mammedov 
Acked-by: Daniel Henrique Barboza 
Reviewed-by: Ani Sinha 
Tested-by: Fiona Ebner 
---
 hw/i386/fw_cfg.h |  3 ++-
 include/hw/firmware/smbios.h |  5 +++--
 hw/arm/virt.c|  4 ++--
 hw/i386/fw_cfg.c |  8 
 hw/i386/pc.c |  2 +-
 hw/loongarch/virt.c  |  7 ---
 hw/riscv/virt.c  |  6 +++---
 hw/smbios/smbios.c   | 27 +++
 hw/smbios/smbios_legacy.c|  2 +-
 9 files changed, 35 insertions(+), 29 deletions(-)

diff --git a/hw/i386/fw_cfg.h b/hw/i386/fw_cfg.h
index 1e1de6b4a3..92e310f5fd 100644
--- a/hw/i386/fw_cfg.h
+++ b/hw/i386/fw_cfg.h
@@ -23,7 +23,8 @@
 FWCfgState *fw_cfg_arch_create(MachineState *ms,
uint16_t boot_cpus,
uint16_t apic_id_limit);
-void fw_cfg_build_smbios(PCMachineState *ms, FWCfgState *fw_cfg);
+void fw_cfg_build_smbios(PCMachineState *pcms, FWCfgState *fw_cfg,
+ SmbiosEntryPointType ep_type);
 void fw_cfg_build_feature_control(MachineState *ms, FWCfgState *fw_cfg);
 void fw_cfg_add_acpi_dsdt(Aml *scope, FWCfgState *fw_cfg);
 
diff --git a/include/hw/firmware/smbios.h b/include/hw/firmware/smbios.h
index d4b91d5a14..8d3fb2fb3b 100644
--- a/include/hw/firmware/smbios.h
+++ b/include/hw/firmware/smbios.h
@@ -326,16 +326,17 @@ struct smbios_type_127 {
 struct smbios_structure_header header;
 } QEMU_PACKED;
 
-bool smbios_validate_table(Error **errp);
+bool smbios_validate_table(SmbiosEntryPointType ep_type, Error **errp);
 void smbios_add_usr_blob_size(size_t size);
 void smbios_entry_add(QemuOpts *opts, Error **errp);
 void smbios_set_cpuid(uint32_t version, uint32_t features);
 void smbios_set_defaults(const char *manufacturer, const char *product,
  const char *version,
- bool uuid_encoded, SmbiosEntryPointType ep_type);
+ bool uuid_encoded);
 void smbios_set_default_processor_family(uint16_t processor_family);
 uint8_t *smbios_get_table_legacy(size_t *length, Error **errp);
 void smbios_get_tables(MachineState *ms,
+   SmbiosEntryPointType ep_type,
const struct smbios_phys_mem_area *mem_array,
const unsigned int mem_array_size,
uint8_t **tables, size_t *tables_len,
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b634c908a7..a9a913aead 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1651,13 +1651,13 @@ static void virt_build_smbios(VirtMachineState *vms)
 
 smbios_set_defaults("QEMU", product,
 vmc->smbios_old_sys_ver ? "1.0" : mc->name,
-true, SMBIOS_ENTRY_POINT_TYPE_64);
+true);
 
 /* build the array of physical mem area from base_memmap */
 mem_array.address = vms->memmap[VIRT_MEM].base;
 mem_array.length = ms->ram_size;
 
-smbios_get_tables(ms, &mem_array, 1,
+smbios_get_tables(ms, SMBIOS_ENTRY_POINT_TYPE_64, &mem_array, 1,
   &smbios_tables, &smbios_tables_len,
   &smbios_anchor, &smbios_anchor_len,
   &error_fatal);
diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
index e387bf50d0..d802d2787f 100644
--- a/hw/i386/fw_cfg.c
+++ b/hw/i386/fw_cfg.c
@@ -48,7 +48,8 @@ const char *fw_cfg_arch_key_name(uint16_t key)
 return NULL;
 }
 
-void fw_cfg_build_smbios(PCMachineState *pcms, FWCfgState *fw_cfg)
+void fw_cfg_build_smbios(PCMachineState *pcms, FWCfgState *fw_cfg,
+ SmbiosEntryPointType ep_type)
 {
 #ifdef CONFIG_SMBIOS
 uint8_t *smbios_tables, *smbios_anchor;
@@ -63,8 +64,7 @@ void fw_cfg_build_smbios(PCMachineState *pcms, FWCfgState 
*fw_cfg)
 if (pcmc->smbios_defaults) {
 /* These values are guest ABI, do not change */
 smbios_set_defaults("QEMU", mc->desc, mc->name,
-pcmc->smbios_uuid_encoded,
-pcms->smbios_entry_point_type);
+pcmc->smbios_uuid_encoded);
 }
 
 /* tell smbios about cpuid version and features */
@@ -89,7 +89,7 @@ void fw_cfg_build_smbios(PCMachineState *pcms, FWCfgState 
*fw_cfg)
 array_count++;
 }
 }
-smbios_get_tables(ms, mem_array, array_count,
+smbios_get_tables(ms, ep_type, mem_array, array_count,
   &smbios_tables, &smbios_tables_len,
   &smbios_anchor, &smbios_anchor_len,
   &error_fatal);
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index feb7a93083..44eb073abd 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -672,7 +672,7 @@ void pc_machine_done(Notifier *notifier, void *data)
 
 acpi_setup();
 if (x86ms->fw_cfg) {
-fw_cfg_build_smbios(pcms, x86ms->fw_cfg);
+fw_cfg_build_smbios(pcms, x86ms->fw_cfg, 
pcms->smbios_entry_point_type);
 fw_cfg_build_feat

[PATCH v4 11/21] smbios: build legacy mode code only for 'pc' machine

2024-03-14 Thread Igor Mammedov

basically moving code around without functional change.
And exposing some symbols so that they could be shared
between smbbios.c and new smbios_legacy.c

plus some meson magic to build smbios_legacy.c only
for 'pc' machine and otherwise replace it with stub
if not selected.

Signed-off-by: Igor Mammedov 
Reviewed-by: Ani Sinha 
---
v2:
  moved type0/type1/have_binfile_bitmap/have_fields_bitmap rename into
  a separate patch
v3:
  drop blank space at the end of hw/smbios/smbios_legacy_stub.c
---
 include/hw/firmware/smbios.h   |   5 +
 hw/i386/Kconfig|   1 +
 hw/smbios/Kconfig  |   2 +
 hw/smbios/meson.build  |   4 +
 hw/smbios/smbios.c | 164 +-
 hw/smbios/smbios_legacy.c  | 179 +
 hw/smbios/smbios_legacy_stub.c |  15 +++
 7 files changed, 207 insertions(+), 163 deletions(-)
 create mode 100644 hw/smbios/smbios_legacy.c
 create mode 100644 hw/smbios/smbios_legacy_stub.c

diff --git a/include/hw/firmware/smbios.h b/include/hw/firmware/smbios.h
index 05707c6341..ccc51e72f5 100644
--- a/include/hw/firmware/smbios.h
+++ b/include/hw/firmware/smbios.h
@@ -17,6 +17,9 @@
  *
  */
 
+extern uint8_t *usr_blobs;
+extern GArray *usr_blobs_sizes;
+
 typedef struct {
 const char *vendor, *version, *date;
 bool have_major_minor, uefi;
@@ -323,6 +326,8 @@ struct smbios_type_127 {
 struct smbios_structure_header header;
 } QEMU_PACKED;
 
+void smbios_validate_table(void);
+void smbios_add_usr_blob_size(size_t size);
 void smbios_entry_add(QemuOpts *opts, Error **errp);
 void smbios_set_cpuid(uint32_t version, uint32_t features);
 void smbios_set_defaults(const char *manufacturer, const char *product,
diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index a1846be6f7..a6ee052f9a 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -76,6 +76,7 @@ config I440FX
 select PIIX
 select DIMM
 select SMBIOS
+select SMBIOS_LEGACY
 select FW_CFG_DMA
 
 config ISAPC
diff --git a/hw/smbios/Kconfig b/hw/smbios/Kconfig
index 553adf4bfc..8d989a2f1b 100644
--- a/hw/smbios/Kconfig
+++ b/hw/smbios/Kconfig
@@ -1,2 +1,4 @@
 config SMBIOS
 bool
+config SMBIOS_LEGACY
+bool
diff --git a/hw/smbios/meson.build b/hw/smbios/meson.build
index 7046967462..a59039f669 100644
--- a/hw/smbios/meson.build
+++ b/hw/smbios/meson.build
@@ -4,5 +4,9 @@ smbios_ss.add(when: 'CONFIG_IPMI',
   if_true: files('smbios_type_38.c'),
   if_false: files('smbios_type_38-stub.c'))
 
+smbios_ss.add(when: 'CONFIG_SMBIOS_LEGACY',
+  if_true: files('smbios_legacy.c'),
+  if_false: files('smbios_legacy_stub.c'))
+
 system_ss.add_all(when: 'CONFIG_SMBIOS', if_true: smbios_ss)
 system_ss.add(when: 'CONFIG_SMBIOS', if_false: files('smbios-stub.c'))
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index e93b8f9cb1..f9efe01233 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -31,38 +31,12 @@
 #include "hw/pci/pci_device.h"
 #include "smbios_build.h"
 
-/* legacy structures and constants for <= 2.0 machines */
-struct smbios_header {
-uint16_t length;
-uint8_t type;
-} QEMU_PACKED;
-
-struct smbios_field {
-struct smbios_header header;
-uint8_t type;
-uint16_t offset;
-uint8_t data[];
-} QEMU_PACKED;
-
-struct smbios_table {
-struct smbios_header header;
-uint8_t data[];
-} QEMU_PACKED;
-
-#define SMBIOS_FIELD_ENTRY 0
-#define SMBIOS_TABLE_ENTRY 1
-
-static uint8_t *smbios_entries;
-static size_t smbios_entries_len;
 static bool smbios_uuid_encoded = true;
-/* end: legacy structures & constants for <= 2.0 machines */
-
 /*
  * SMBIOS tables provided by user with '-smbios file=' option
  */
 uint8_t *usr_blobs;
 size_t usr_blobs_len;
-static GArray *usr_blobs_sizes;
 static unsigned usr_table_max;
 static unsigned usr_table_cnt;
 
@@ -546,7 +520,7 @@ static void smbios_check_type4_count(uint32_t 
expected_t4_count)
 }
 }
 
-static void smbios_validate_table(void)
+void smbios_validate_table(void)
 {
 if (smbios_ep_type == SMBIOS_ENTRY_POINT_TYPE_32 &&
 smbios_tables_len > SMBIOS_21_MAX_TABLES_LEN) {
@@ -556,134 +530,6 @@ static void smbios_validate_table(void)
 }
 }
 
-
-/* legacy setup functions for <= 2.0 machines */
-static void smbios_add_field(int type, int offset, const void *data, size_t 
len)
-{
-struct smbios_field *field;
-
-if (!smbios_entries) {
-smbios_entries_len = sizeof(uint16_t);
-smbios_entries = g_malloc0(smbios_entries_len);
-}
-smbios_entries = g_realloc(smbios_entries, smbios_entries_len +
-  sizeof(*field) + len);
-field = (struct smbios_field *)(smbios_entries + smbios_entries_len);
-field->header.type = SMBIOS_FIELD_ENTRY;
-field->header.length = cpu_to_le16(sizeof(*field) + len);
-
-field->type = type;
-field->offset = cpu_to_le16(offset);
-memcpy(field->data, data, len);
-
-smbios_entries_len

[PATCH v4 09/21] smbios: add smbios_add_usr_blob_size() helper

2024-03-14 Thread Igor Mammedov

it will be used by follow up patch when legacy handling
is moved out into a separate file.

Signed-off-by: Igor Mammedov 
Reviewed-by: Ani Sinha 
---
 hw/smbios/smbios.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index a96e5cd839..d530667a9d 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -1411,6 +1411,14 @@ static bool save_opt_list(size_t *ndest, char ***dest, 
QemuOpts *opts,
 return true;
 }
 
+static void smbios_add_usr_blob_size(size_t size)
+{
+if (!usr_blobs_sizes) {
+usr_blobs_sizes = g_array_new(false, false, sizeof(size_t));
+}
+g_array_append_val(usr_blobs_sizes, size);
+}
+
 void smbios_entry_add(QemuOpts *opts, Error **errp)
 {
 const char *val;
@@ -1458,10 +1466,12 @@ void smbios_entry_add(QemuOpts *opts, Error **errp)
 smbios_type4_count++;
 }
 
-if (!usr_blobs_sizes) {
-usr_blobs_sizes = g_array_new(false, false, sizeof(size_t));
-}
-g_array_append_val(usr_blobs_sizes, size);
+/*
+ * preserve blob size for legacy mode so it could build its
+ * blobs flavor from 'usr_blobs'
+ */
+smbios_add_usr_blob_size(size);
+
 usr_blobs_len += size;
 if (size > usr_table_max) {
 usr_table_max = size;
-- 
2.39.3

[PATCH v4 02/21] tests: smbios: add test for -smbios type=11 option

2024-03-14 Thread Igor Mammedov

Signed-off-by: Igor Mammedov 
Reviewed-by: Ani Sinha 
Tested-by: Fiona Ebner 
---
 tests/data/smbios/type11_blob  | Bin 0 -> 11 bytes
 tests/qtest/bios-tables-test.c |  17 +
 2 files changed, 17 insertions(+)
 create mode 100644 tests/data/smbios/type11_blob

diff --git a/tests/data/smbios/type11_blob b/tests/data/smbios/type11_blob
new file mode 100644
index 
..1d8fea4b0c6f040a13ba99c3fad762538b795614
GIT binary patch
literal 11
Scmd;PW!S(N;u;*nzyJUX)&c?m

literal 0
HcmV?d1

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index b2992bafa8..a116f88e1d 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -2091,6 +2091,21 @@ static void test_acpi_pc_smbios_options(void)
 free_test_data(&data);
 }
 
+static void test_acpi_pc_smbios_blob(void)
+{
+uint8_t req_type11[] = { 11 };
+test_data data = {
+.machine = MACHINE_PC,
+.variant = ".pc_smbios_blob",
+.required_struct_types = req_type11,
+.required_struct_types_len = ARRAY_SIZE(req_type11),
+};
+
+test_smbios("-machine smbios-entry-point-type=32 "
+"-smbios file=tests/data/smbios/type11_blob", &data);
+free_test_data(&data);
+}
+
 static void test_oem_fields(test_data *data)
 {
 int i;
@@ -2244,6 +2259,8 @@ int main(int argc, char *argv[])
 #endif
 qtest_add_func("acpi/piix4/smbios-options",
test_acpi_pc_smbios_options);
+qtest_add_func("acpi/piix4/smbios-blob",
+   test_acpi_pc_smbios_blob);
 }
 if (qtest_has_machine(MACHINE_Q35)) {
 qtest_add_func("acpi/q35", test_acpi_q35_tcg);
-- 
2.39.3

[PATCH v4 15/21] smbios: extend smbios-entry-point-type with 'auto' value

2024-03-14 Thread Igor Mammedov

later patches will use it to pick SMBIOS version at runtime
depending on configuration.

Signed-off-by: Igor Mammedov 
Acked-by: Markus Armbruster 
Reviewed-by: Ani Sinha 
Tested-by: Fiona Ebner 
---
v3:
- make sure i2nd line of comment is indented on 4 spaces only
---
 qapi/machine.json | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/qapi/machine.json b/qapi/machine.json
index bb5a178909..0840c91e70 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1797,10 +1797,13 @@
 #
 # @64: SMBIOS version 3.0 (64-bit) Entry Point
 #
+# @auto: Either 2.x or 3.x SMBIOS version, 2.x if configuration can be
+# described by it and 3.x otherwise (since: 9.0)
+#
 # Since: 7.0
 ##
 { 'enum': 'SmbiosEntryPointType',
-  'data': [ '32', '64' ] }
+  'data': [ '32', '64', 'auto' ] }
 
 ##
 # @MemorySizeConfiguration:
-- 
2.39.3

[PATCH v4 14/21] smbios: clear smbios_type4_count before building tables

2024-03-14 Thread Igor Mammedov

it will help to keep type 4 tables accounting correct in case
SMBIOS tables are built multiple times.

Signed-off-by: Igor Mammedov 
Tested-by: Fiona Ebner 
---
 hw/smbios/smbios.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index 7fefe98c85..9aad8f2df2 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -1096,6 +1096,7 @@ void smbios_get_tables(MachineState *ms,
ep_type == SMBIOS_ENTRY_POINT_TYPE_64);
 
 g_free(smbios_tables);
+smbios_type4_count = 0;
 smbios_tables = g_memdup2(usr_blobs, usr_blobs_len);
 smbios_tables_len = usr_blobs_len;
 smbios_table_max = usr_table_max;
-- 
2.39.3

[PATCH v4 18/21] tests: acpi/smbios: whitelist expected blobs

2024-03-14 Thread Igor Mammedov

Signed-off-by: Igor Mammedov 
Acked-by: Ani Sinha 
---
 tests/qtest/bios-tables-test-allowed-diff.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..81148a604f 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,2 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/q35/SSDT.dimmpxm",
-- 
2.39.3

[PATCH v4 05/21] smbios: get rid of smbios_smp_sockets global

2024-03-14 Thread Igor Mammedov

it makes smbios_validate_table() independent from
smbios_smp_sockets global, which in turn lets
smbios_get_tables() avoid using not related legacy code.

Signed-off-by: Igor Mammedov 
Reviewed-by: Ani Sinha 
Tested-by: Fiona Ebner 
---
goal here is to isolate legacy handling from generic smbios_get_tables()
---
 include/hw/firmware/smbios.h |  2 +-
 hw/i386/fw_cfg.c |  2 +-
 hw/smbios/smbios.c   | 22 +-
 3 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/include/hw/firmware/smbios.h b/include/hw/firmware/smbios.h
index c21b8d3285..36744b6cc9 100644
--- a/include/hw/firmware/smbios.h
+++ b/include/hw/firmware/smbios.h
@@ -313,7 +313,7 @@ void smbios_set_defaults(const char *manufacturer, const 
char *product,
  const char *version, bool legacy_mode,
  bool uuid_encoded, SmbiosEntryPointType ep_type);
 void smbios_set_default_processor_family(uint16_t processor_family);
-uint8_t *smbios_get_table_legacy(MachineState *ms, size_t *length);
+uint8_t *smbios_get_table_legacy(uint32_t expected_t4_count, size_t *length);
 void smbios_get_tables(MachineState *ms,
const struct smbios_phys_mem_area *mem_array,
const unsigned int mem_array_size,
diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
index a635234e68..fcb4fb0769 100644
--- a/hw/i386/fw_cfg.c
+++ b/hw/i386/fw_cfg.c
@@ -70,7 +70,7 @@ void fw_cfg_build_smbios(PCMachineState *pcms, FWCfgState 
*fw_cfg)
 /* tell smbios about cpuid version and features */
 smbios_set_cpuid(cpu->env.cpuid_version, cpu->env.features[FEAT_1_EDX]);
 
-smbios_tables = smbios_get_table_legacy(ms, &smbios_tables_len);
+smbios_tables = smbios_get_table_legacy(ms->smp.cpus, &smbios_tables_len);
 if (smbios_tables) {
 fw_cfg_add_bytes(fw_cfg, FW_CFG_SMBIOS_ENTRIES,
  smbios_tables, smbios_tables_len);
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index 22369b49fb..50617fa585 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -70,7 +70,7 @@ static SmbiosEntryPoint ep;
 static int smbios_type4_count = 0;
 static bool smbios_immutable;
 static bool smbios_have_defaults;
-static uint32_t smbios_cpuid_version, smbios_cpuid_features, 
smbios_smp_sockets;
+static uint32_t smbios_cpuid_version, smbios_cpuid_features;
 
 static DECLARE_BITMAP(have_binfile_bitmap, SMBIOS_MAX_TYPE+1);
 static DECLARE_BITMAP(have_fields_bitmap, SMBIOS_MAX_TYPE+1);
@@ -539,14 +539,11 @@ opts_init(smbios_register_config);
  */
 #define SMBIOS_21_MAX_TABLES_LEN 0x
 
-static void smbios_validate_table(MachineState *ms)
+static void smbios_validate_table(uint32_t expected_t4_count)
 {
-uint32_t expect_t4_count = smbios_legacy ?
-ms->smp.cpus : smbios_smp_sockets;
-
-if (smbios_type4_count && smbios_type4_count != expect_t4_count) {
+if (smbios_type4_count && smbios_type4_count != expected_t4_count) {
 error_report("Expected %d SMBIOS Type 4 tables, got %d instead",
- expect_t4_count, smbios_type4_count);
+ expected_t4_count, smbios_type4_count);
 exit(1);
 }
 
@@ -634,7 +631,7 @@ static void smbios_build_type_1_fields(void)
 }
 }
 
-uint8_t *smbios_get_table_legacy(MachineState *ms, size_t *length)
+uint8_t *smbios_get_table_legacy(uint32_t expected_t4_count, size_t *length)
 {
 if (!smbios_legacy) {
 *length = 0;
@@ -644,7 +641,7 @@ uint8_t *smbios_get_table_legacy(MachineState *ms, size_t 
*length)
 if (!smbios_immutable) {
 smbios_build_type_0_fields();
 smbios_build_type_1_fields();
-smbios_validate_table(ms);
+smbios_validate_table(expected_t4_count);
 smbios_immutable = true;
 }
 *length = smbios_entries_len;
@@ -1235,10 +1232,9 @@ void smbios_get_tables(MachineState *ms,
 smbios_build_type_2_table();
 smbios_build_type_3_table();
 
-smbios_smp_sockets = ms->smp.sockets;
-assert(smbios_smp_sockets >= 1);
+assert(ms->smp.sockets >= 1);
 
-for (i = 0; i < smbios_smp_sockets; i++) {
+for (i = 0; i < ms->smp.sockets; i++) {
 smbios_build_type_4_table(ms, i);
 }
 
@@ -1284,7 +1280,7 @@ void smbios_get_tables(MachineState *ms,
 smbios_build_type_41_table(errp);
 smbios_build_type_127_table();
 
-smbios_validate_table(ms);
+smbios_validate_table(ms->smp.sockets);
 smbios_entry_point_setup();
 smbios_immutable = true;
 }
-- 
2.39.3

[PATCH v4 16/21] smbios: in case of entry point is 'auto' try to build v2 tables 1st

2024-03-14 Thread Igor Mammedov

QEMU for some time now uses SMBIOS 3.0 for PC/Q35 machines by
default, however Windows has a bug in locating SMBIOS 3.0
entrypoint and fails to find tables when booted on SeaBIOS
(on UEFI SMBIOS 3.0 tables work fine since firmware hands
over tables in another way)

Missing SMBIOS tables may lead to some issues for guest
though (worst are: possible reactiveation, inability to
get virtio drivers from 'Windows Update')

It's unclear  at this point if MS will fix the issue on their
side. So instead of it (or rather in addition) this patch
will try to workaround the issue.

aka, use smbios-entry-point-type=auto to make QEMU try
generating conservative SMBIOS 2.0 tables and if that
fails (due to limits/requested configuration) fallback
to SMBIOS 3.0 tables.

With this in place majority of users will use SMBIOS 2.0
tables which work fine with (Windows + legacy BIOS).
The configurations that is not to possible to describe
with SMBIOS 2.0 will switch automatically to SMBIOS 3.0
(which will trigger Windows bug but there is nothing
QEMU can do here, so go and aks Microsoft to real fix).

Signed-off-by: Igor Mammedov 
Reviewed-by: Ani Sinha 
Tested-by: Fiona Ebner 
---
 hw/smbios/smbios.c | 52 +++---
 1 file changed, 49 insertions(+), 3 deletions(-)

diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index 9aad8f2df2..d9dda226e6 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -1082,7 +1082,7 @@ static void smbios_entry_point_setup(SmbiosEntryPointType 
ep_type)
 }
 }
 
-void smbios_get_tables(MachineState *ms,
+static bool smbios_get_tables_ep(MachineState *ms,
SmbiosEntryPointType ep_type,
const struct smbios_phys_mem_area *mem_array,
const unsigned int mem_array_size,
@@ -1091,6 +1091,7 @@ void smbios_get_tables(MachineState *ms,
Error **errp)
 {
 unsigned i, dimm_cnt, offset;
+ERRP_GUARD();
 
 assert(ep_type == SMBIOS_ENTRY_POINT_TYPE_32 ||
ep_type == SMBIOS_ENTRY_POINT_TYPE_64);
@@ -1177,11 +1178,56 @@ void smbios_get_tables(MachineState *ms,
 abort();
 }
 
-return;
+return true;
 err_exit:
 g_free(smbios_tables);
 smbios_tables = NULL;
-return;
+return false;
+}
+
+void smbios_get_tables(MachineState *ms,
+   SmbiosEntryPointType ep_type,
+   const struct smbios_phys_mem_area *mem_array,
+   const unsigned int mem_array_size,
+   uint8_t **tables, size_t *tables_len,
+   uint8_t **anchor, size_t *anchor_len,
+   Error **errp)
+{
+Error *local_err = NULL;
+bool is_valid;
+ERRP_GUARD();
+
+switch (ep_type) {
+case SMBIOS_ENTRY_POINT_TYPE_AUTO:
+case SMBIOS_ENTRY_POINT_TYPE_32:
+is_valid = smbios_get_tables_ep(ms, SMBIOS_ENTRY_POINT_TYPE_32,
+mem_array, mem_array_size,
+tables, tables_len,
+anchor, anchor_len,
+&local_err);
+if (is_valid || ep_type != SMBIOS_ENTRY_POINT_TYPE_AUTO) {
+break;
+}
+/*
+ * fall through in case AUTO endpoint is selected and
+ * SMBIOS 2.x tables can't be generated, to try if SMBIOS 3.x
+ * tables would work
+ */
+case SMBIOS_ENTRY_POINT_TYPE_64:
+error_free(local_err);
+local_err = NULL;
+is_valid = smbios_get_tables_ep(ms, SMBIOS_ENTRY_POINT_TYPE_64,
+mem_array, mem_array_size,
+tables, tables_len,
+anchor, anchor_len,
+&local_err);
+break;
+default:
+abort();
+}
+if (!is_valid) {
+error_propagate(errp, local_err);
+}
 }
 
 static void save_opt(const char **dest, QemuOpts *opts, const char *name)
-- 
2.39.3

[PATCH v4 21/21] smbios: add extra comments to smbios_get_table_legacy()

2024-03-14 Thread Igor Mammedov

Signed-off-by: Igor Mammedov 
---
 hw/smbios/smbios_legacy.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/smbios/smbios_legacy.c b/hw/smbios/smbios_legacy.c
index 06907cd16c..c37a8ee821 100644
--- a/hw/smbios/smbios_legacy.c
+++ b/hw/smbios/smbios_legacy.c
@@ -151,6 +151,9 @@ uint8_t *smbios_get_table_legacy(size_t *length, Error 
**errp)
 smbios_entries_len = sizeof(uint16_t);
 smbios_entries = g_malloc0(smbios_entries_len);
 
+/*
+ * build a set of legacy smbios_table entries using user provided blobs
+ */
 for (i = 0, usr_offset = 0; usr_blobs_sizes && i < usr_blobs_sizes->len;
  i++)
 {
@@ -166,6 +169,10 @@ uint8_t *smbios_get_table_legacy(size_t *length, Error 
**errp)
 table->header.length = cpu_to_le16(sizeof(*table) + size);
 memcpy(table->data, header, size);
 smbios_entries_len += sizeof(*table) + size;
+/*
+ * update number of entries in the blob,
+ * see SeaBIOS: qemu_cfg_legacy():QEMU_CFG_SMBIOS_ENTRIES
+ */
 (*(uint16_t *)smbios_entries) =
 cpu_to_le16(le16_to_cpu(*(uint16_t *)smbios_entries) + 1);
 usr_offset += size;
-- 
2.39.3

[PATCH v4 12/21] smbios: handle errors consistently

2024-03-14 Thread Igor Mammedov

Current code uses mix of error_report()+exit(1)
and error_setg() to handle errors.
Use newer error_setg() everywhere, beside consistency
it will allow to detect error condition without killing
QEMU and attempt switch-over to SMBIOS3.x tables/entrypoint
in follow up patch.

while at it, clear smbios_tables pointer after freeing.
that will avoid double free if smbios_get_tables() is called
multiple times.

Signed-off-by: Igor Mammedov 
Reviewed-by: Ani Sinha 
---
 include/hw/firmware/smbios.h |  4 ++--
 hw/i386/fw_cfg.c |  3 ++-
 hw/smbios/smbios.c   | 34 ++
 hw/smbios/smbios_legacy.c| 22 ++
 4 files changed, 40 insertions(+), 23 deletions(-)

diff --git a/include/hw/firmware/smbios.h b/include/hw/firmware/smbios.h
index ccc51e72f5..d4b91d5a14 100644
--- a/include/hw/firmware/smbios.h
+++ b/include/hw/firmware/smbios.h
@@ -326,7 +326,7 @@ struct smbios_type_127 {
 struct smbios_structure_header header;
 } QEMU_PACKED;
 
-void smbios_validate_table(void);
+bool smbios_validate_table(Error **errp);
 void smbios_add_usr_blob_size(size_t size);
 void smbios_entry_add(QemuOpts *opts, Error **errp);
 void smbios_set_cpuid(uint32_t version, uint32_t features);
@@ -334,7 +334,7 @@ void smbios_set_defaults(const char *manufacturer, const 
char *product,
  const char *version,
  bool uuid_encoded, SmbiosEntryPointType ep_type);
 void smbios_set_default_processor_family(uint16_t processor_family);
-uint8_t *smbios_get_table_legacy(size_t *length);
+uint8_t *smbios_get_table_legacy(size_t *length, Error **errp);
 void smbios_get_tables(MachineState *ms,
const struct smbios_phys_mem_area *mem_array,
const unsigned int mem_array_size,
diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
index d1281066f4..e387bf50d0 100644
--- a/hw/i386/fw_cfg.c
+++ b/hw/i386/fw_cfg.c
@@ -71,7 +71,8 @@ void fw_cfg_build_smbios(PCMachineState *pcms, FWCfgState 
*fw_cfg)
 smbios_set_cpuid(cpu->env.cpuid_version, cpu->env.features[FEAT_1_EDX]);
 
 if (pcmc->smbios_legacy_mode) {
-smbios_tables = smbios_get_table_legacy(&smbios_tables_len);
+smbios_tables = smbios_get_table_legacy(&smbios_tables_len,
+&error_fatal);
 fw_cfg_add_bytes(fw_cfg, FW_CFG_SMBIOS_ENTRIES,
  smbios_tables, smbios_tables_len);
 return;
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index f9efe01233..512ecd46f3 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -19,7 +19,6 @@
 #include "qemu/units.h"
 #include "qapi/error.h"
 #include "qemu/config-file.h"
-#include "qemu/error-report.h"
 #include "qemu/module.h"
 #include "qemu/option.h"
 #include "sysemu/sysemu.h"
@@ -511,23 +510,25 @@ opts_init(smbios_register_config);
  */
 #define SMBIOS_21_MAX_TABLES_LEN 0x
 
-static void smbios_check_type4_count(uint32_t expected_t4_count)
+static bool smbios_check_type4_count(uint32_t expected_t4_count, Error **errp)
 {
 if (smbios_type4_count && smbios_type4_count != expected_t4_count) {
-error_report("Expected %d SMBIOS Type 4 tables, got %d instead",
- expected_t4_count, smbios_type4_count);
-exit(1);
+error_setg(errp, "Expected %d SMBIOS Type 4 tables, got %d instead",
+   expected_t4_count, smbios_type4_count);
+return false;
 }
+return true;
 }
 
-void smbios_validate_table(void)
+bool smbios_validate_table(Error **errp)
 {
 if (smbios_ep_type == SMBIOS_ENTRY_POINT_TYPE_32 &&
 smbios_tables_len > SMBIOS_21_MAX_TABLES_LEN) {
-error_report("SMBIOS 2.1 table length %zu exceeds %d",
- smbios_tables_len, SMBIOS_21_MAX_TABLES_LEN);
-exit(1);
+error_setg(errp, "SMBIOS 2.1 table length %zu exceeds %d",
+   smbios_tables_len, SMBIOS_21_MAX_TABLES_LEN);
+return false;
 }
+return true;
 }
 
 bool smbios_skip_table(uint8_t type, bool required_table)
@@ -1151,15 +1152,18 @@ void smbios_get_tables(MachineState *ms,
 smbios_build_type_41_table(errp);
 smbios_build_type_127_table();
 
-smbios_check_type4_count(ms->smp.sockets);
-smbios_validate_table();
+if (!smbios_check_type4_count(ms->smp.sockets, errp)) {
+goto err_exit;
+}
+if (!smbios_validate_table(errp)) {
+goto err_exit;
+}
 smbios_entry_point_setup();
 
 /* return tables blob and entry point (anchor), and their sizes */
 *tables = smbios_tables;
 *tables_len = smbios_tables_len;
 *anchor = (uint8_t *)&ep;
-
 /* calculate length based on anchor string */
 if (!strncmp((char *)&ep, "_SM_", 4)) {
 *anchor_len = sizeof(struct smbios_21_entry_point);
@@ -1168,6 +1172,12 @@ void smbios_get_tables(MachineState *ms,
 } else {
 abort();
 }
+
+return;
+err_exit:
+g_free(sm

Re: [RFC PATCH 3/5] cxl/core: introduce cxl_mem_report_poison()

2024-03-14 Thread Shiyang Ruan via





在 2024/2/10 14:46, Dan Williams 写道:

Shiyang Ruan wrote:

If poison is detected(reported from cxl memdev), OS should be notified to
handle it.  Introduce this function:
   1. translate DPA to HPA;
   2. construct a MCE instance; (TODO: more details need to be filled)
   3. log it into MCE event queue;

After that, MCE mechanism can walk over its notifier chain to execute
specific handlers.

Signed-off-by: Shiyang Ruan 
---
  arch/x86/kernel/cpu/mce/core.c |  1 +
  drivers/cxl/core/mbox.c| 33 +
  2 files changed, 34 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index bc39252bc54f..a64c0aceb7e0 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -131,6 +131,7 @@ void mce_setup(struct mce *m)
m->ppin = cpu_data(m->extcpu).ppin;
m->microcode = boot_cpu_data.microcode;
  }
+EXPORT_SYMBOL_GPL(mce_setup);


No, mce_setup() is x86 specific and the CXL subsystem is CPU
architecture independent. My expectation is that CXL should translate
errors for edac similar to how the ACPI GHES code does it. See usage of
edac_raw_mc_handle_error() and memory_failure_queue().

Otherwise an MCE is a CPU consumption of poison event, and CXL is
reporting device-side discovery of poison.


Yes, I misunderstood here.  I was mean to use MCE to finally call 
memory_failure(). I think memory_failure_queue() is what I need.


  void memory_failure_queue(unsigned long pfn, int flags)

But it can only queue one PFN at a time, we may need to make it support 
queuing a range of PFN.



--
Thanks,
Ruan.

[PATCH v4 20/21] tests: acpi: update expected SSDT.dimmpxm blob

2024-03-14 Thread Igor Mammedov

address shift is caused by switch to 32-bit SMBIOS entry point
which has slightly different size from 64-bit one and happens
to trigger a bit different memory layout.

Expected diff:

-Name (MEMA, 0x07FFE000)
+Name (MEMA, 0x07FFF000)

Signed-off-by: Igor Mammedov 
Acked-by: Ani Sinha 
---
 tests/qtest/bios-tables-test-allowed-diff.h |   1 -
 tests/data/acpi/q35/SSDT.dimmpxm| Bin 1815 -> 1815 bytes
 2 files changed, 1 deletion(-)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index 81148a604f..dfb8523c8b 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1,2 +1 @@
 /* List of comma-separated changed AML files to ignore */
-"tests/data/acpi/q35/SSDT.dimmpxm",
diff --git a/tests/data/acpi/q35/SSDT.dimmpxm b/tests/data/acpi/q35/SSDT.dimmpxm
index 
70f133412f5e0aa128ab210245a8de7304eeb843..9ea4e0d0ceaa8a5cbd706afb6d49de853fafe654
 100644
GIT binary patch
delta 23
ecmbQvH=U0wIM^jboSlJzam_|9E_UV*|JeaVTLvQl

delta 23
ecmbQvH=U0wIM^jboSlJzanD9BE_UVz|JeaVy9Ofw

-- 
2.39.3

[PATCH v4 06/21] smbios: get rid of smbios_legacy global

2024-03-14 Thread Igor Mammedov

clean up smbios_set_defaults() which is reused by legacy
and non legacy machines from being aware of 'legacy' notion
and need to turn it off. And push legacy handling up to
PC machine code where it's relevant.

Signed-off-by: Igor Mammedov 
Reviewed-by: Ani Sinha 
Acked-by: Daniel Henrique Barboza 
Tested-by: Fiona Ebner 
---
PS: I've moved/kept legacy smbios_entries to smbios_get_tables()
but it at least is not visible to API users. To get rid of it
as well, it would be necessary to change how '-smbios' CLI
option is processed. Which is done later in the series.
---
 include/hw/firmware/smbios.h |  2 +-
 hw/arm/virt.c|  2 +-
 hw/i386/fw_cfg.c |  7 ---
 hw/loongarch/virt.c  |  2 +-
 hw/riscv/virt.c  |  2 +-
 hw/smbios/smbios.c   | 35 +++
 6 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/include/hw/firmware/smbios.h b/include/hw/firmware/smbios.h
index 36744b6cc9..7b42e7b4ac 100644
--- a/include/hw/firmware/smbios.h
+++ b/include/hw/firmware/smbios.h
@@ -310,7 +310,7 @@ struct smbios_type_127 {
 void smbios_entry_add(QemuOpts *opts, Error **errp);
 void smbios_set_cpuid(uint32_t version, uint32_t features);
 void smbios_set_defaults(const char *manufacturer, const char *product,
- const char *version, bool legacy_mode,
+ const char *version,
  bool uuid_encoded, SmbiosEntryPointType ep_type);
 void smbios_set_default_processor_family(uint16_t processor_family);
 uint8_t *smbios_get_table_legacy(uint32_t expected_t4_count, size_t *length);
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index e5cd935232..b634c908a7 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1650,7 +1650,7 @@ static void virt_build_smbios(VirtMachineState *vms)
 }
 
 smbios_set_defaults("QEMU", product,
-vmc->smbios_old_sys_ver ? "1.0" : mc->name, false,
+vmc->smbios_old_sys_ver ? "1.0" : mc->name,
 true, SMBIOS_ENTRY_POINT_TYPE_64);
 
 /* build the array of physical mem area from base_memmap */
diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
index fcb4fb0769..c1e9c0fd9c 100644
--- a/hw/i386/fw_cfg.c
+++ b/hw/i386/fw_cfg.c
@@ -63,15 +63,16 @@ void fw_cfg_build_smbios(PCMachineState *pcms, FWCfgState 
*fw_cfg)
 if (pcmc->smbios_defaults) {
 /* These values are guest ABI, do not change */
 smbios_set_defaults("QEMU", mc->desc, mc->name,
-pcmc->smbios_legacy_mode, 
pcmc->smbios_uuid_encoded,
+pcmc->smbios_uuid_encoded,
 pcms->smbios_entry_point_type);
 }
 
 /* tell smbios about cpuid version and features */
 smbios_set_cpuid(cpu->env.cpuid_version, cpu->env.features[FEAT_1_EDX]);
 
-smbios_tables = smbios_get_table_legacy(ms->smp.cpus, &smbios_tables_len);
-if (smbios_tables) {
+if (pcmc->smbios_legacy_mode) {
+smbios_tables = smbios_get_table_legacy(ms->smp.cpus,
+&smbios_tables_len);
 fw_cfg_add_bytes(fw_cfg, FW_CFG_SMBIOS_ENTRIES,
  smbios_tables, smbios_tables_len);
 return;
diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index efce112310..53bfdcee61 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -355,7 +355,7 @@ static void virt_build_smbios(LoongArchMachineState *lams)
 return;
 }
 
-smbios_set_defaults("QEMU", product, mc->name, false,
+smbios_set_defaults("QEMU", product, mc->name,
 true, SMBIOS_ENTRY_POINT_TYPE_64);
 
 smbios_get_tables(ms, NULL, 0, &smbios_tables, &smbios_tables_len,
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index a094af97c3..535fd047ba 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -1275,7 +1275,7 @@ static void virt_build_smbios(RISCVVirtState *s)
 product = "KVM Virtual Machine";
 }
 
-smbios_set_defaults("QEMU", product, mc->name, false,
+smbios_set_defaults("QEMU", product, mc->name,
 true, SMBIOS_ENTRY_POINT_TYPE_64);
 
 if (riscv_is_32bit(&s->soc[0])) {
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index 50617fa585..1bae36a6e0 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -54,7 +54,6 @@ struct smbios_table {
 
 static uint8_t *smbios_entries;
 static size_t smbios_entries_len;
-static bool smbios_legacy = true;
 static bool smbios_uuid_encoded = true;
 /* end: legacy structures & constants for <= 2.0 machines */
 
@@ -633,9 +632,16 @@ static void smbios_build_type_1_fields(void)
 
 uint8_t *smbios_get_table_legacy(uint32_t expected_t4_count, size_t *length)
 {
-if (!smbios_legacy) {
-*length = 0;
-return NULL;
+/* drop unwanted version of command-line file blob(s) */
+g_free(smbios_tables);
+smbios_tables = NULL;
+
+/* also complain if fields wer

[PATCH v4 08/21] smbios: don't check type4 structures in legacy mode

2024-03-14 Thread Igor Mammedov

legacy mode doesn't support structures of type 2 and more,
and CLI has a check for '-smbios type' option, however it's
still possible to sneak in type4 as a blob with '-smbios file'
option. However doing the later makes SMBIOS tables broken
since SeaBIOS doesn't expect that.

Rather than trying to add support for type4 to legacy code
(both QEMU and SeaBIOS), simplify smbios_get_table_legacy()
by dropping not relevant check in legacy code and error out
on type4 blob.

Signed-off-by: Igor Mammedov 
Reviewed-by: Ani Sinha 
Tested-by: Fiona Ebner 
---
 * The issue affects 'isapc' and pc-i440fx-2.0. the later is
   in deprecated state and to be dropped in near future
 * possibly the same issue applies to other SMBIOS types above type 1
   but I haven't tested that, and well tables that aren't
   generated by SeaBIOS can get be added just fine
   (tested type11 blob). So I went with a minimal change
   to fixup type4 only that I'm touching. Leaving the rest
   for other time or when someone complains about it, which is
   very unlikely given it's really only remaining isapc machine.

   I'd very much prefer to deprecate 'isapc' and then drop
   all legacy related code (it will benefit not only SMBIOS
   but other code as well).
   BTW: 'isapc' is in semi-dead, I cna't boot RHEL6 on it
   with KVM enabled anymore (RHEL9 host), TCG still boots though.
   One more reason to get deprecate it.
---
 include/hw/firmware/smbios.h |  2 +-
 hw/i386/fw_cfg.c |  3 +--
 hw/smbios/smbios.c   | 18 ++
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/include/hw/firmware/smbios.h b/include/hw/firmware/smbios.h
index 7b42e7b4ac..0f0dca8f83 100644
--- a/include/hw/firmware/smbios.h
+++ b/include/hw/firmware/smbios.h
@@ -313,7 +313,7 @@ void smbios_set_defaults(const char *manufacturer, const 
char *product,
  const char *version,
  bool uuid_encoded, SmbiosEntryPointType ep_type);
 void smbios_set_default_processor_family(uint16_t processor_family);
-uint8_t *smbios_get_table_legacy(uint32_t expected_t4_count, size_t *length);
+uint8_t *smbios_get_table_legacy(size_t *length);
 void smbios_get_tables(MachineState *ms,
const struct smbios_phys_mem_area *mem_array,
const unsigned int mem_array_size,
diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
index c1e9c0fd9c..d1281066f4 100644
--- a/hw/i386/fw_cfg.c
+++ b/hw/i386/fw_cfg.c
@@ -71,8 +71,7 @@ void fw_cfg_build_smbios(PCMachineState *pcms, FWCfgState 
*fw_cfg)
 smbios_set_cpuid(cpu->env.cpuid_version, cpu->env.features[FEAT_1_EDX]);
 
 if (pcmc->smbios_legacy_mode) {
-smbios_tables = smbios_get_table_legacy(ms->smp.cpus,
-&smbios_tables_len);
+smbios_tables = smbios_get_table_legacy(&smbios_tables_len);
 fw_cfg_add_bytes(fw_cfg, FW_CFG_SMBIOS_ENTRIES,
  smbios_tables, smbios_tables_len);
 return;
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index db422f4fb0..a96e5cd839 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -545,14 +545,17 @@ opts_init(smbios_register_config);
  */
 #define SMBIOS_21_MAX_TABLES_LEN 0x
 
-static void smbios_validate_table(uint32_t expected_t4_count)
+static void smbios_check_type4_count(uint32_t expected_t4_count)
 {
 if (smbios_type4_count && smbios_type4_count != expected_t4_count) {
 error_report("Expected %d SMBIOS Type 4 tables, got %d instead",
  expected_t4_count, smbios_type4_count);
 exit(1);
 }
+}
 
+static void smbios_validate_table(void)
+{
 if (smbios_ep_type == SMBIOS_ENTRY_POINT_TYPE_32 &&
 smbios_tables_len > SMBIOS_21_MAX_TABLES_LEN) {
 error_report("SMBIOS 2.1 table length %zu exceeds %d",
@@ -637,7 +640,7 @@ static void smbios_build_type_1_fields(void)
 }
 }
 
-uint8_t *smbios_get_table_legacy(uint32_t expected_t4_count, size_t *length)
+uint8_t *smbios_get_table_legacy(size_t *length)
 {
 int i;
 size_t usr_offset;
@@ -650,6 +653,12 @@ uint8_t *smbios_get_table_legacy(uint32_t 
expected_t4_count, size_t *length)
 exit(1);
 }
 
+if (test_bit(4, have_binfile_bitmap)) {
+error_report("can't process table for smbios "
+ "type 4 on machine versions < 2.1!");
+exit(1);
+}
+
 g_free(smbios_entries);
 smbios_entries_len = sizeof(uint16_t);
 smbios_entries = g_malloc0(smbios_entries_len);
@@ -676,7 +685,7 @@ uint8_t *smbios_get_table_legacy(uint32_t 
expected_t4_count, size_t *length)
 
 smbios_build_type_0_fields();
 smbios_build_type_1_fields();
-smbios_validate_table(expected_t4_count);
+smbios_validate_table();
 *length = smbios_entries_len;
 return smbios_entries;
 }
@@ -1304,7 +1313,8 @@ void smbios_get_tables(MachineState *ms,
 smbios_build_type_41_table(errp);
 smbios_build_type_127_tabl

[PATCH v4 10/21] smbios: rename/expose structures/bitmaps used by both legacy and modern code

2024-03-14 Thread Igor Mammedov

As a preparation to move legacy handling into a separate file,
add prefix 'smbios_' to type0/type1/have_binfile_bitmap/have_fields_bitmap
and expose them in smbios.h so that they can be reused in
legacy and modern code.

Doing it as a separate patch to avoid rename cluttering follow-up
patch which will move legacy code into a separate file.

Signed-off-by: Igor Mammedov 
Reviewed-by: Ani Sinha 
---
 include/hw/firmware/smbios.h |  16 +
 hw/smbios/smbios.c   | 113 ---
 2 files changed, 69 insertions(+), 60 deletions(-)

diff --git a/include/hw/firmware/smbios.h b/include/hw/firmware/smbios.h
index 0f0dca8f83..05707c6341 100644
--- a/include/hw/firmware/smbios.h
+++ b/include/hw/firmware/smbios.h
@@ -2,6 +2,7 @@
 #define QEMU_SMBIOS_H
 
 #include "qapi/qapi-types-machine.h"
+#include "qemu/bitmap.h"
 
 /*
  * SMBIOS Support
@@ -16,8 +17,23 @@
  *
  */
 
+typedef struct {
+const char *vendor, *version, *date;
+bool have_major_minor, uefi;
+uint8_t major, minor;
+} smbios_type0_t;
+extern smbios_type0_t smbios_type0;
+
+typedef struct {
+const char *manufacturer, *product, *version, *serial, *sku, *family;
+/* uuid is in qemu_uuid */
+} smbios_type1_t;
+extern smbios_type1_t smbios_type1;
 
 #define SMBIOS_MAX_TYPE 127
+extern DECLARE_BITMAP(smbios_have_binfile_bitmap, SMBIOS_MAX_TYPE + 1);
+extern DECLARE_BITMAP(smbios_have_fields_bitmap, SMBIOS_MAX_TYPE + 1);
+
 #define offsetofend(TYPE, MEMBER) \
(offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))
 
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index d530667a9d..e93b8f9cb1 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -78,19 +78,11 @@ static int smbios_type4_count = 0;
 static bool smbios_have_defaults;
 static uint32_t smbios_cpuid_version, smbios_cpuid_features;
 
-static DECLARE_BITMAP(have_binfile_bitmap, SMBIOS_MAX_TYPE+1);
-static DECLARE_BITMAP(have_fields_bitmap, SMBIOS_MAX_TYPE+1);
+DECLARE_BITMAP(smbios_have_binfile_bitmap, SMBIOS_MAX_TYPE + 1);
+DECLARE_BITMAP(smbios_have_fields_bitmap, SMBIOS_MAX_TYPE + 1);
 
-static struct {
-const char *vendor, *version, *date;
-bool have_major_minor, uefi;
-uint8_t major, minor;
-} type0;
-
-static struct {
-const char *manufacturer, *product, *version, *serial, *sku, *family;
-/* uuid is in qemu_uuid */
-} type1;
+smbios_type0_t smbios_type0;
+smbios_type1_t smbios_type1;
 
 static struct {
 const char *manufacturer, *product, *version, *serial, *asset, *location;
@@ -599,36 +591,36 @@ static void smbios_maybe_add_str(int type, int offset, 
const char *data)
 static void smbios_build_type_0_fields(void)
 {
 smbios_maybe_add_str(0, offsetof(struct smbios_type_0, vendor_str),
- type0.vendor);
+ smbios_type0.vendor);
 smbios_maybe_add_str(0, offsetof(struct smbios_type_0, bios_version_str),
- type0.version);
+ smbios_type0.version);
 smbios_maybe_add_str(0, offsetof(struct smbios_type_0,
  bios_release_date_str),
- type0.date);
-if (type0.have_major_minor) {
+ smbios_type0.date);
+if (smbios_type0.have_major_minor) {
 smbios_add_field(0, offsetof(struct smbios_type_0,
  system_bios_major_release),
- &type0.major, 1);
+ &smbios_type0.major, 1);
 smbios_add_field(0, offsetof(struct smbios_type_0,
  system_bios_minor_release),
- &type0.minor, 1);
+ &smbios_type0.minor, 1);
 }
 }
 
 static void smbios_build_type_1_fields(void)
 {
 smbios_maybe_add_str(1, offsetof(struct smbios_type_1, manufacturer_str),
- type1.manufacturer);
+ smbios_type1.manufacturer);
 smbios_maybe_add_str(1, offsetof(struct smbios_type_1, product_name_str),
- type1.product);
+ smbios_type1.product);
 smbios_maybe_add_str(1, offsetof(struct smbios_type_1, version_str),
- type1.version);
+ smbios_type1.version);
 smbios_maybe_add_str(1, offsetof(struct smbios_type_1, serial_number_str),
- type1.serial);
+ smbios_type1.serial);
 smbios_maybe_add_str(1, offsetof(struct smbios_type_1, sku_number_str),
- type1.sku);
+ smbios_type1.sku);
 smbios_maybe_add_str(1, offsetof(struct smbios_type_1, family_str),
- type1.family);
+ smbios_type1.family);
 if (qemu_uuid_set) {
 /* We don't encode the UUID in the "wire format" here because this
  * function is for legacy mode and needs to keep the guest ABI, and
@@ -646,14 +638,14 @@ uint

[PATCH v4 01/21] tests: smbios: make it possible to write SMBIOS only test

2024-03-14 Thread Igor Mammedov

Cureently it not possible to run SMBIOS test without ACPI one,
which gets into the way when testing ACPI-less configs.

Extract SMBIOS testing into separate routines that could also
be run without ACPI dependency and use that for testing SMBIOS.

As the 1st user add "acpi/piix4/smbios-options" test case.

Signed-off-by: Igor Mammedov 
Reviewed-by: Ani Sinha 
Tested-by: Fiona Ebner 
---
 tests/qtest/bios-tables-test.c | 47 +++---
 1 file changed, 38 insertions(+), 9 deletions(-)

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index 21811a1ab5..b2992bafa8 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -858,16 +858,8 @@ static void test_vm_prepare(const char *params, test_data 
*data)
 g_free(args);
 }
 
-static void process_acpi_tables_noexit(test_data *data)
+static void process_smbios_tables_noexit(test_data *data)
 {
-test_acpi_load_tables(data);
-
-if (getenv(ACPI_REBUILD_EXPECTED_AML)) {
-dump_aml_files(data, true);
-} else {
-test_acpi_asl(data);
-}
-
 /*
  * TODO: make SMBIOS tests work with UEFI firmware,
  * Bug on uefi-test-tools to provide entry point:
@@ -879,6 +871,27 @@ static void process_acpi_tables_noexit(test_data *data)
 }
 }
 
+static void test_smbios(const char *params, test_data *data)
+{
+test_vm_prepare(params, data);
+boot_sector_test(data->qts);
+process_smbios_tables_noexit(data);
+qtest_quit(data->qts);
+}
+
+static void process_acpi_tables_noexit(test_data *data)
+{
+test_acpi_load_tables(data);
+
+if (getenv(ACPI_REBUILD_EXPECTED_AML)) {
+dump_aml_files(data, true);
+} else {
+test_acpi_asl(data);
+}
+
+process_smbios_tables_noexit(data);
+}
+
 static void process_acpi_tables(test_data *data)
 {
 process_acpi_tables_noexit(data);
@@ -2064,6 +2077,20 @@ static void test_acpi_q35_pvpanic_isa(void)
 free_test_data(&data);
 }
 
+static void test_acpi_pc_smbios_options(void)
+{
+uint8_t req_type11[] = { 11 };
+test_data data = {
+.machine = MACHINE_PC,
+.variant = ".pc_smbios_options",
+.required_struct_types = req_type11,
+.required_struct_types_len = ARRAY_SIZE(req_type11),
+};
+
+test_smbios("-smbios type=11,value=TEST", &data);
+free_test_data(&data);
+}
+
 static void test_oem_fields(test_data *data)
 {
 int i;
@@ -2215,6 +2242,8 @@ int main(int argc, char *argv[])
 #ifdef CONFIG_POSIX
 qtest_add_func("acpi/piix4/acpierst", test_acpi_piix4_acpi_erst);
 #endif
+qtest_add_func("acpi/piix4/smbios-options",
+   test_acpi_pc_smbios_options);
 }
 if (qtest_has_machine(MACHINE_Q35)) {
 qtest_add_func("acpi/q35", test_acpi_q35_tcg);
-- 
2.39.3

[PATCH v4 04/21] smbios: cleanup smbios_get_tables() from legacy handling

2024-03-14 Thread Igor Mammedov

smbios_get_tables() bails out right away if leagacy mode is enabled
and won't generate any SMBIOS tables. At the same time x86 specific
fw_cfg_build_smbios() will genarate legacy tables and then proceed
to preparing temporary mem_array for useless call to
smbios_get_tables() and then discard it.

Drop legacy related check in smbios_get_tables() and return from
fw_cfg_build_smbios() early if legacy tables where built without
proceeding to non legacy part of the function.

Signed-off-by: Igor Mammedov 
Reviewed-by: Ani Sinha 
Tested-by: Fiona Ebner 
---
 hw/i386/fw_cfg.c   | 1 +
 hw/smbios/smbios.c | 6 --
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
index 98a478c276..a635234e68 100644
--- a/hw/i386/fw_cfg.c
+++ b/hw/i386/fw_cfg.c
@@ -74,6 +74,7 @@ void fw_cfg_build_smbios(PCMachineState *pcms, FWCfgState 
*fw_cfg)
 if (smbios_tables) {
 fw_cfg_add_bytes(fw_cfg, FW_CFG_SMBIOS_ENTRIES,
  smbios_tables, smbios_tables_len);
+return;
 }
 
 /* build the array of physical mem area from e820 table */
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index e3d5d8f2e2..22369b49fb 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -1229,12 +1229,6 @@ void smbios_get_tables(MachineState *ms,
 {
 unsigned i, dimm_cnt, offset;
 
-if (smbios_legacy) {
-*tables = *anchor = NULL;
-*tables_len = *anchor_len = 0;
-return;
-}
-
 if (!smbios_immutable) {
 smbios_build_type_0_table();
 smbios_build_type_1_table();
-- 
2.39.3

[PATCH v4 17/21] smbios: error out when building type 4 table is not possible

2024-03-14 Thread Igor Mammedov

If SMBIOS v2 version is requested but number of cores/threads
are more than it's possible to describe with v2, error out
instead of silently ignoring the fact and filling core/thread
count with bogus values.

This will help caller to decide if it should fallback to
SMBIOSv3 when smbios-entry-point-type='auto'

Signed-off-by: Igor Mammedov 
Reviewed-by: Ani Sinha 
Tested-by: Fiona Ebner 
---
 hw/smbios/smbios.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index d9dda226e6..be64919def 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -669,7 +669,8 @@ static void smbios_build_type_3_table(void)
 }
 
 static void smbios_build_type_4_table(MachineState *ms, unsigned instance,
-  SmbiosEntryPointType ep_type)
+  SmbiosEntryPointType ep_type,
+  Error **errp)
 {
 char sock_str[128];
 size_t tbl_len = SMBIOS_TYPE_4_LEN_V28;
@@ -723,6 +724,12 @@ static void smbios_build_type_4_table(MachineState *ms, 
unsigned instance,
 if (tbl_len == SMBIOS_TYPE_4_LEN_V30) {
 t->core_count2 = t->core_enabled2 = cpu_to_le16(cores_per_socket);
 t->thread_count2 = cpu_to_le16(threads_per_socket);
+} else if (t->core_count == 0xFF || t->thread_count == 0xFF) {
+error_setg(errp, "SMBIOS 2.0 doesn't support number of processor "
+ "cores/threads more than 255, use "
+ "-machine smbios-entry-point-type=64 option to enable 
"
+ "SMBIOS 3.0 support");
+return;
 }
 
 SMBIOS_BUILD_TABLE_POST;
@@ -,7 +1118,10 @@ static bool smbios_get_tables_ep(MachineState *ms,
 assert(ms->smp.sockets >= 1);
 
 for (i = 0; i < ms->smp.sockets; i++) {
-smbios_build_type_4_table(ms, i, ep_type);
+smbios_build_type_4_table(ms, i, ep_type, errp);
+if (*errp) {
+goto err_exit;
+}
 }
 
 smbios_build_type_8_table();
-- 
2.39.3

[PATCH v4 03/21] tests: smbios: add test for legacy mode CLI options

2024-03-14 Thread Igor Mammedov

Unfortunately having 2.0 machine type deprecated is not enough
to get rid of legacy SMBIOS handling since 'isapc' also uses
that and it's staying around.

Hence add test for CLI options handling to be sure that it
ain't broken during SMBIOS code refactoring.

Signed-off-by: Igor Mammedov 
Reviewed-by: Ani Sinha 
Tested-by: Fiona Ebner 
---
 tests/data/smbios/type11_blob.legacy | Bin 0 -> 10 bytes
 tests/qtest/bios-tables-test.c   |  17 +
 2 files changed, 17 insertions(+)
 create mode 100644 tests/data/smbios/type11_blob.legacy

diff --git a/tests/data/smbios/type11_blob.legacy 
b/tests/data/smbios/type11_blob.legacy
new file mode 100644
index 
..aef463aab903405958b0a85f85c5980671c08bee
GIT binary patch
literal 10
Rcmd;PW!S(N;u;*n000Tp0s;U4

literal 0
HcmV?d1

diff --git a/tests/qtest/bios-tables-test.c b/tests/qtest/bios-tables-test.c
index a116f88e1d..d1ff4db7a2 100644
--- a/tests/qtest/bios-tables-test.c
+++ b/tests/qtest/bios-tables-test.c
@@ -2106,6 +2106,21 @@ static void test_acpi_pc_smbios_blob(void)
 free_test_data(&data);
 }
 
+static void test_acpi_isapc_smbios_legacy(void)
+{
+uint8_t req_type11[] = { 1, 11 };
+test_data data = {
+.machine = "isapc",
+.variant = ".pc_smbios_legacy",
+.required_struct_types = req_type11,
+.required_struct_types_len = ARRAY_SIZE(req_type11),
+};
+
+test_smbios("-smbios file=tests/data/smbios/type11_blob.legacy "
+"-smbios type=1,family=TEST", &data);
+free_test_data(&data);
+}
+
 static void test_oem_fields(test_data *data)
 {
 int i;
@@ -2261,6 +2276,8 @@ int main(int argc, char *argv[])
test_acpi_pc_smbios_options);
 qtest_add_func("acpi/piix4/smbios-blob",
test_acpi_pc_smbios_blob);
+qtest_add_func("acpi/piix4/smbios-legacy",
+   test_acpi_isapc_smbios_legacy);
 }
 if (qtest_has_machine(MACHINE_Q35)) {
 qtest_add_func("acpi/q35", test_acpi_q35_tcg);
-- 
2.39.3

[PATCH v4 19/21] pc/q35: set SMBIOS entry point type to 'auto' by default

2024-03-14 Thread Igor Mammedov

Use smbios-entry-point-type='auto' for newer machine types as a workaround
for Windows not detecting SMBIOS tables. Which makes QEMU pick SMBIOS tables
based on configuration (with 2.x preferred and fallback to 3.x if the former
isn't compatible with configuration)

Default compat setting of smbios-entry-point-type after series
for pc/q35 machines:
  * 9.0-newer: 'auto'
  * 8.1-8.2: '64'
  * 8.0-older: '32'

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2008
Signed-off-by: Igor Mammedov 
Reviewed-by: Ani Sinha 
Tested-by: Fiona Ebner 
---
 hw/i386/pc.c  | 2 +-
 hw/i386/pc_piix.c | 4 
 hw/i386/pc_q35.c  | 3 +++
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 44eb073abd..e80f02bef4 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1832,7 +1832,7 @@ static void pc_machine_class_init(ObjectClass *oc, void 
*data)
 mc->nvdimm_supported = true;
 mc->smp_props.dies_supported = true;
 mc->default_ram_id = "pc.ram";
-pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_64;
+pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_AUTO;
 
 object_class_property_add(oc, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
 pc_machine_get_max_ram_below_4g, pc_machine_set_max_ram_below_4g,
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index c9a6c0aa68..18ba076609 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -525,12 +525,16 @@ DEFINE_I440FX_MACHINE(v9_0, "pc-i440fx-9.0", NULL,
 
 static void pc_i440fx_8_2_machine_options(MachineClass *m)
 {
+PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+
 pc_i440fx_9_0_machine_options(m);
 m->alias = NULL;
 m->is_default = false;
 
 compat_props_add(m->compat_props, hw_compat_8_2, hw_compat_8_2_len);
 compat_props_add(m->compat_props, pc_compat_8_2, pc_compat_8_2_len);
+/* For pc-i44fx-8.2 and 8.1, use SMBIOS 3.X by default */
+pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_64;
 }
 
 DEFINE_I440FX_MACHINE(v8_2, "pc-i440fx-8.2", NULL,
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 8a427c4647..b5922b44af 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -376,11 +376,14 @@ DEFINE_Q35_MACHINE(v9_0, "pc-q35-9.0", NULL,
 
 static void pc_q35_8_2_machine_options(MachineClass *m)
 {
+PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 pc_q35_9_0_machine_options(m);
 m->alias = NULL;
 m->max_cpus = 1024;
 compat_props_add(m->compat_props, hw_compat_8_2, hw_compat_8_2_len);
 compat_props_add(m->compat_props, pc_compat_8_2, pc_compat_8_2_len);
+/* For pc-q35-8.2 and 8.1, use SMBIOS 3.X by default */
+pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_64;
 }
 
 DEFINE_Q35_MACHINE(v8_2, "pc-q35-8.2", NULL,
-- 
2.39.3

Re: [PATCH v3 1/2] vhost: dirty log should be per backend type

2024-03-14 Thread Eugenio Perez Martin

On Thu, Mar 14, 2024 at 9:38 AM Si-Wei Liu  wrote:
>
> There could be a mix of both vhost-user and vhost-kernel clients
> in the same QEMU process, where separate vhost loggers for the
> specific vhost type have to be used. Make the vhost logger per
> backend type, and have them properly reference counted.
>
> Suggested-by: Michael S. Tsirkin 
> Signed-off-by: Si-Wei Liu 
> ---
> v2->v3:
>   - remove non-effective assertion that never be reached
>   - do not return NULL from vhost_log_get()
>   - add neccessary assertions to vhost_log_get()
>
> ---
>  hw/virtio/vhost.c | 50 ++
>  1 file changed, 38 insertions(+), 12 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 2c9ac79..efe2f74 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -43,8 +43,8 @@
>  do { } while (0)
>  #endif
>
> -static struct vhost_log *vhost_log;
> -static struct vhost_log *vhost_log_shm;
> +static struct vhost_log *vhost_log[VHOST_BACKEND_TYPE_MAX];
> +static struct vhost_log *vhost_log_shm[VHOST_BACKEND_TYPE_MAX];
>
>  /* Memslots used by backends that support private memslots (without an fd). 
> */
>  static unsigned int used_memslots;
> @@ -287,6 +287,10 @@ static int vhost_set_backend_type(struct vhost_dev *dev,
>  r = -1;
>  }
>
> +if (r == 0) {
> +assert(dev->vhost_ops->backend_type == backend_type);
> +}
> +
>  return r;
>  }
>
> @@ -319,16 +323,22 @@ static struct vhost_log *vhost_log_alloc(uint64_t size, 
> bool share)
>  return log;
>  }
>
> -static struct vhost_log *vhost_log_get(uint64_t size, bool share)
> +static struct vhost_log *vhost_log_get(VhostBackendType backend_type,
> +   uint64_t size, bool share)
>  {
> -struct vhost_log *log = share ? vhost_log_shm : vhost_log;
> +struct vhost_log *log;
> +
> +assert(backend_type > VHOST_BACKEND_TYPE_NONE);
> +assert(backend_type < VHOST_BACKEND_TYPE_MAX);
> +
> +log = share ? vhost_log_shm[backend_type] : vhost_log[backend_type];
>
>  if (!log || log->size != size) {
>  log = vhost_log_alloc(size, share);
>  if (share) {
> -vhost_log_shm = log;
> +vhost_log_shm[backend_type] = log;
>  } else {
> -vhost_log = log;
> +vhost_log[backend_type] = log;
>  }
>  } else {
>  ++log->refcnt;
> @@ -340,11 +350,20 @@ static struct vhost_log *vhost_log_get(uint64_t size, 
> bool share)
>  static void vhost_log_put(struct vhost_dev *dev, bool sync)
>  {
>  struct vhost_log *log = dev->log;
> +VhostBackendType backend_type;
>
>  if (!log) {
>  return;
>  }
>
> +assert(dev->vhost_ops);
> +backend_type = dev->vhost_ops->backend_type;
> +
> +if (backend_type == VHOST_BACKEND_TYPE_NONE ||
> +backend_type >= VHOST_BACKEND_TYPE_MAX) {
> +return;
> +}
> +
>  --log->refcnt;
>  if (log->refcnt == 0) {
>  /* Sync only the range covered by the old log */
> @@ -352,13 +371,13 @@ static void vhost_log_put(struct vhost_dev *dev, bool 
> sync)
>  vhost_log_sync_range(dev, 0, dev->log_size * VHOST_LOG_CHUNK - 
> 1);
>  }
>
> -if (vhost_log == log) {
> +if (vhost_log[backend_type] == log) {
>  g_free(log->log);
> -vhost_log = NULL;
> -} else if (vhost_log_shm == log) {
> +vhost_log[backend_type] = NULL;
> +} else if (vhost_log_shm[backend_type] == log) {
>  qemu_memfd_free(log->log, log->size * sizeof(*(log->log)),
>  log->fd);
> -vhost_log_shm = NULL;
> +vhost_log_shm[backend_type] = NULL;
>  }
>
>  g_free(log);
> @@ -376,7 +395,8 @@ static bool vhost_dev_log_is_shared(struct vhost_dev *dev)
>
>  static inline void vhost_dev_log_resize(struct vhost_dev *dev, uint64_t size)
>  {
> -struct vhost_log *log = vhost_log_get(size, 
> vhost_dev_log_is_shared(dev));
> +struct vhost_log *log = vhost_log_get(dev->vhost_ops->backend_type,
> +  size, 
> vhost_dev_log_is_shared(dev));
>  uint64_t log_base = (uintptr_t)log->log;
>  int r;
>
> @@ -2037,8 +2057,14 @@ int vhost_dev_start(struct vhost_dev *hdev, 
> VirtIODevice *vdev, bool vrings)
>  uint64_t log_base;
>
>  hdev->log_size = vhost_get_log_size(hdev);
> -hdev->log = vhost_log_get(hdev->log_size,
> +hdev->log = vhost_log_get(hdev->vhost_ops->backend_type,
> +  hdev->log_size,
>vhost_dev_log_is_shared(hdev));
> +if (!hdev->log) {

I thought vhost_log_get couldn't return NULL :).

Other than that,

Acked-by: Eugenio Pérez 

> +VHOST_OPS_DEBUG(r, "vhost_log_get failed");
> +goto fail_vq;
> +}
> +
>  log_base = (uintptr_t)hdev->log->log;
>  r = hdev

Re: [PATCH v4 00/21] Workaround Windows failing to find 64bit SMBIOS entry point with SeaBIOS

2024-03-14 Thread Michael S. Tsirkin

On Thu, Mar 14, 2024 at 04:22:41PM +0100, Igor Mammedov wrote:
> Changelog:
>  v4:
>* rebase on top of current master due to conflict with
>  new SMBIOS table 9 commits
>* add extra patch with comments about obscure legacy entries counting


Thanks a lot for the quick turnaround Igor!

>  v3:
>* whitespace missed by checkpatch
>* fix idndent in QAPI
>* reorder 17/20 before 1st 'auto' can be used
>* pick up acks
>  v2:
>* QAPI style fixes (Markus Armbruster )
>* squash 11/19 into 10/19 (Ani Sinha )
>* split '[PATCH 09/19] smbios: build legacy mode code only for 'pc' 
> machine'
>  in 3 smaller patches, to make it more readable
>smbios: add smbios_add_usr_blob_size() helper  
> 
>smbios: rename/expose structures/bitmaps used by both legacy and 
> modern code  
>smbios: build legacy mode code only for 'pc' machine
>* pick up acks
> 
> Windows (10) bootloader when running on top of SeaBIOS, fails to find
> SMBIOSv3 entry point. Tracing it shows that it looks for v2 anchor markers
> only and not v3. Tricking it into believing that entry point is found
> lets Windows successfully locate and parse SMBIOSv3 tables. Whether it
> will be fixed on Windows side is not clear so here goes a workaround.
> 
> Idea is to try build v2 tables if QEMU configuration permits,
> and fallback to v3 tables otherwise. That will mask Windows issue
> form majority of users.
> However if VM configuration can't be described (typically large VMs)
> by v2 tables, QEMU will use SMBIOSv3 and Windows will hit the issue
> again. In this case complain to Microsoft and/or use UEFI instead of
> SeaBIOS (requires reinstall).
> 
> Default compat setting of smbios-entry-point-type after series
> for pc/q35 machines:
>   * 9.0-newer: 'auto'
>   * 8.1-8.2: '64'
>   * 8.0-older: '32'
> 
> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2008
> 
> Igor Mammedov (21):
>   tests: smbios: make it possible to write SMBIOS only test
>   tests: smbios: add test for -smbios type=11 option
>   tests: smbios: add test for legacy mode CLI options
>   smbios: cleanup smbios_get_tables() from legacy handling
>   smbios: get rid of smbios_smp_sockets global
>   smbios: get rid of smbios_legacy global
>   smbios: avoid mangling user provided tables
>   smbios: don't check type4 structures in legacy mode
>   smbios: add smbios_add_usr_blob_size() helper
>   smbios: rename/expose structures/bitmaps used by both legacy and
> modern code
>   smbios: build legacy mode code only for 'pc' machine
>   smbios: handle errors consistently
>   smbios: get rid of global smbios_ep_type
>   smbios: clear smbios_type4_count before building tables
>   smbios: extend smbios-entry-point-type with 'auto' value
>   smbios: in case of entry point is 'auto' try to build v2 tables 1st
>   smbios: error out when building type 4 table is not possible
>   tests: acpi/smbios: whitelist expected blobs
>   pc/q35: set SMBIOS entry point type to 'auto' by default
>   tests: acpi: update expected SSDT.dimmpxm blob
>   smbios: add extra comments to smbios_get_table_legacy()
> 
>  hw/i386/fw_cfg.h |   3 +-
>  include/hw/firmware/smbios.h |  28 +-
>  hw/arm/virt.c|   6 +-
>  hw/i386/Kconfig  |   1 +
>  hw/i386/fw_cfg.c |  14 +-
>  hw/i386/pc.c |   4 +-
>  hw/i386/pc_piix.c|   4 +
>  hw/i386/pc_q35.c |   3 +
>  hw/loongarch/virt.c  |   7 +-
>  hw/riscv/virt.c  |   6 +-
>  hw/smbios/Kconfig|   2 +
>  hw/smbios/meson.build|   4 +
>  hw/smbios/smbios.c   | 483 +++
>  hw/smbios/smbios_legacy.c| 192 +++
>  hw/smbios/smbios_legacy_stub.c   |  15 +
>  qapi/machine.json|   5 +-
>  tests/data/acpi/q35/SSDT.dimmpxm | Bin 1815 -> 1815 bytes
>  tests/data/smbios/type11_blob| Bin 0 -> 11 bytes
>  tests/data/smbios/type11_blob.legacy | Bin 0 -> 10 bytes
>  tests/qtest/bios-tables-test.c   |  81 -
>  20 files changed, 541 insertions(+), 317 deletions(-)
>  create mode 100644 hw/smbios/smbios_legacy.c
>  create mode 100644 hw/smbios/smbios_legacy_stub.c
>  create mode 100644 tests/data/smbios/type11_blob
>  create mode 100644 tests/data/smbios/type11_blob.legacy
> 
> --
> 2.39.3

Re: [PATCH v3 2/2] vhost: Perform memory section dirty scans once per iteration

2024-03-14 Thread Eugenio Perez Martin

On Thu, Mar 14, 2024 at 9:38 AM Si-Wei Liu  wrote:
>
> On setups with one or more virtio-net devices with vhost on,
> dirty tracking iteration increases cost the bigger the number
> amount of queues are set up e.g. on idle guests migration the
> following is observed with virtio-net with vhost=on:
>
> 48 queues -> 78.11%  [.] vhost_dev_sync_region.isra.13
> 8 queues -> 40.50%   [.] vhost_dev_sync_region.isra.13
> 1 queue -> 6.89% [.] vhost_dev_sync_region.isra.13
> 2 devices, 1 queue -> 18.60%  [.] vhost_dev_sync_region.isra.14
>
> With high memory rates the symptom is lack of convergence as soon
> as it has a vhost device with a sufficiently high number of queues,
> the sufficient number of vhost devices.
>
> On every migration iteration (every 100msecs) it will redundantly
> query the *shared log* the number of queues configured with vhost
> that exist in the guest. For the virtqueue data, this is necessary,
> but not for the memory sections which are the same. So essentially
> we end up scanning the dirty log too often.
>
> To fix that, select a vhost device responsible for scanning the
> log with regards to memory sections dirty tracking. It is selected
> when we enable the logger (during migration) and cleared when we
> disable the logger. If the vhost logger device goes away for some
> reason, the logger will be re-selected from the rest of vhost
> devices.
>
> After making mem-section logger a singleton instance, constant cost
> of 7%-9% (like the 1 queue report) will be seen, no matter how many
> queues or how many vhost devices are configured:
>
> 48 queues -> 8.71%[.] vhost_dev_sync_region.isra.13
> 2 devices, 8 queues -> 7.97%   [.] vhost_dev_sync_region.isra.14
>
> Co-developed-by: Joao Martins 
> Signed-off-by: Joao Martins 
> Signed-off-by: Si-Wei Liu 
> ---
> v2 -> v3:
>   - add after-fix benchmark to commit log
>   - rename vhost_log_dev_enabled to vhost_dev_should_log
>   - remove unneeded comparisons for backend_type
>   - use QLIST array instead of single flat list to store vhost
> logger devices
>   - simplify logger election logic
>
> ---
>  hw/virtio/vhost.c | 63 
> ++-
>  include/hw/virtio/vhost.h |  1 +
>  2 files changed, 58 insertions(+), 6 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index efe2f74..d91858b 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -45,6 +45,7 @@
>
>  static struct vhost_log *vhost_log[VHOST_BACKEND_TYPE_MAX];
>  static struct vhost_log *vhost_log_shm[VHOST_BACKEND_TYPE_MAX];
> +static QLIST_HEAD(, vhost_dev) vhost_log_devs[VHOST_BACKEND_TYPE_MAX];
>
>  /* Memslots used by backends that support private memslots (without an fd). 
> */
>  static unsigned int used_memslots;
> @@ -149,6 +150,43 @@ bool vhost_dev_has_iommu(struct vhost_dev *dev)
>  }
>  }
>
> +static inline bool vhost_dev_should_log(struct vhost_dev *dev)
> +{
> +assert(dev->vhost_ops);
> +assert(dev->vhost_ops->backend_type > VHOST_BACKEND_TYPE_NONE);
> +assert(dev->vhost_ops->backend_type < VHOST_BACKEND_TYPE_MAX);
> +
> +return dev == QLIST_FIRST(&vhost_log_devs[dev->vhost_ops->backend_type]);
> +}
> +
> +static inline void vhost_dev_elect_mem_logger(struct vhost_dev *hdev, bool 
> add)
> +{
> +VhostBackendType backend_type;
> +
> +assert(hdev->vhost_ops);
> +
> +backend_type = hdev->vhost_ops->backend_type;
> +assert(backend_type > VHOST_BACKEND_TYPE_NONE);
> +assert(backend_type < VHOST_BACKEND_TYPE_MAX);
> +
> +if (add && !QLIST_IS_INSERTED(hdev, logdev_entry)) {
> +if (QLIST_EMPTY(&vhost_log_devs[backend_type])) {
> +QLIST_INSERT_HEAD(&vhost_log_devs[backend_type],
> +  hdev, logdev_entry);
> +} else {
> +/*
> + * The first vhost_device in the list is selected as the shared
> + * logger to scan memory sections. Put new entry next to the head
> + * to avoid inadvertent change to the underlying logger device.
> + */

Why is changing the logger device a problem? All the code paths are
either changing the QLIST or logging, isn't it?

> +QLIST_INSERT_AFTER(QLIST_FIRST(&vhost_log_devs[backend_type]),
> +   hdev, logdev_entry);
> +}
> +} else if (!add && QLIST_IS_INSERTED(hdev, logdev_entry)) {
> +QLIST_REMOVE(hdev, logdev_entry);
> +}
> +}
> +
>  static int vhost_sync_dirty_bitmap(struct vhost_dev *dev,
> MemoryRegionSection *section,
> hwaddr first,
> @@ -166,12 +204,14 @@ static int vhost_sync_dirty_bitmap(struct vhost_dev 
> *dev,
>  start_addr = MAX(first, start_addr);
>  end_addr = MIN(last, end_addr);
>
> -for (i = 0; i < dev->mem->nregions; ++i) {
> -struct vhost_memory_region *reg = dev->mem->regions + i;
> -vhost_dev_sync_region(dev, section, start_addr, end_addr,
>

Re: [PATCH v2 2/2] migration/multifd: Ensure we're not given a socket for file migration

2024-03-14 Thread Peter Xu

On Thu, Mar 14, 2024 at 11:10:12AM -0400, Peter Xu wrote:
> On Wed, Mar 13, 2024 at 06:28:24PM -0300, Fabiano Rosas wrote:
> > When doing migration using the fd: URI, the incoming migration starts
> > before the user has passed the file descriptor to QEMU. This means
> > that the checks at migration_channels_and_transport_compatible()
> > happen too soon and we need to allow a migration channel of type
> > SOCKET_ADDRESS_TYPE_FD even though socket migration is not supported
> > with multifd.
> 
> Hmm, bare with me if this is a stupid one.. why the incoming migration can
> start _before_ the user passed in the fd?
> 
> IOW, why can't we rely on a single fd_is_socket() check for
> SOCKET_ADDRESS_TYPE_FD in transport_supports_multi_channels()?
> 
> > 
> > The commit decdc76772 ("migration/multifd: Add mapped-ram support to
> > fd: URI") was supposed to add a second check prior to starting
> > migration to make sure a socket fd is not passed instead of a file fd,
> > but failed to do so.
> > 
> > Add the missing verification.
> > 
> > Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
> > Signed-off-by: Fabiano Rosas 
> > ---
> >  migration/fd.c   | 8 
> >  migration/file.c | 7 +++
> >  2 files changed, 15 insertions(+)
> > 
> > diff --git a/migration/fd.c b/migration/fd.c
> > index 39a52e5c90..c07030f715 100644
> > --- a/migration/fd.c
> > +++ b/migration/fd.c
> > @@ -22,6 +22,7 @@
> >  #include "migration.h"
> >  #include "monitor/monitor.h"
> >  #include "io/channel-file.h"
> > +#include "io/channel-socket.h"
> >  #include "io/channel-util.h"
> >  #include "options.h"
> >  #include "trace.h"
> > @@ -95,6 +96,13 @@ void fd_start_incoming_migration(const char *fdname, 
> > Error **errp)
> >  }
> >  
> >  if (migrate_multifd()) {
> > +if (fd_is_socket(fd)) {
> > +error_setg(errp,
> > +   "Multifd migration to a socket FD is not 
> > supported");
> > +object_unref(ioc);
> > +return;
> > +}

And... I just noticed this is forbiding multifd+socket+fd in general?  But
isn't that the majority of multifd usage when with libvirt over sockets?

Shouldn't it about fd's seekable-or-not instead when mapped-ram enabled
(IOW, migration_needs_seekable_channel() only)?

> > +
> >  file_create_incoming_channels(ioc, errp);
> >  } else {
> >  qio_channel_set_name(ioc, "migration-fd-incoming");
> > diff --git a/migration/file.c b/migration/file.c
> > index ddde0ca818..b6e8ba13f2 100644
> > --- a/migration/file.c
> > +++ b/migration/file.c
> > @@ -15,6 +15,7 @@
> >  #include "file.h"
> >  #include "migration.h"
> >  #include "io/channel-file.h"
> > +#include "io/channel-socket.h"
> >  #include "io/channel-util.h"
> >  #include "options.h"
> >  #include "trace.h"
> > @@ -58,6 +59,12 @@ bool file_send_channel_create(gpointer opaque, Error 
> > **errp)
> >  int fd = fd_args_get_fd();
> >  
> >  if (fd && fd != -1) {
> > +if (fd_is_socket(fd)) {
> > +error_setg(errp,
> > +   "Multifd migration to a socket FD is not 
> > supported");
> > +goto out;
> > +}
> > +
> >  ioc = qio_channel_file_new_dupfd(fd, errp);
> >  } else {
> >  ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, 
> > errp);
> > -- 
> > 2.35.3
> > 
> 
> -- 
> Peter Xu

-- 
Peter Xu

Re: [PATCH v2 3/4] tests/avocado: use OpenBSD 7.4 for sbsa-ref

2024-03-14 Thread Marcin Juszkiewicz


W dniu 14.03.2024 o 15:56, Alex Bennée pisze:

If we are not going to delete the entries then at least use a @skip
instead of commenting. Maybe:
@skip("Potential un-diagnosed upstream bug?")

Daniel or Peter suggested to open a GitLab issue and use

 @skip("https://gitlab.com/qemu-project/qemu/-/issues/xyz";)

to track progress.

That's a good idea. Are you going to respin?


Opened https://gitlab.com/qemu-project/qemu/-/issues/2224 to track 
problem. Subscribed to arm@openbsd mailing list.


Will walk the dog and then mail them with problem.

And respin patch series tomorrow.

question on s390x topology: KVM only, or also TCG?

2024-03-14 Thread Claudio Fontana

Hello Pierre, Ilya,

I have a question on the s390x "topology" feature and examples.

Mainly, is this feature supposed to be KVM accelerator-only, or also available 
when using the TCG accelerator?

(docs/devel/s390-cpu-topology.rst vs 
https://www.qemu.org/docs/master/system/s390x/cpu-topology.html)

I see stsi-topology.c in target/s390x/kvm/ , so that part is clearly 
KVM-specific,

but in hw/s390x/cpu-topology.c I read:

"
 * - The first part in this file is taking care of all common functions 

 *   used by KVM and TCG to create and modify the topology. 

 *  

 * - The second part, building the topology information data for the

 *   guest with CPU and KVM specificity will be implemented inside  

 *   the target/s390/kvm sub tree.
"

In the docs/devel/s390-cpu-topology.rst

I see the example command:

 qemu-system-s390x \
-enable-kvm \
-cpu z14,ctop=on \
-smp 1,drawers=3,books=3,sockets=2,cores=2,maxcpus=36 \
-device z14-s390x-cpu,core-id=19,entitlement=high \
-device z14-s390x-cpu,core-id=11,entitlement=low \
-device z14-s390x-cpu,core-id=12,entitlement=high \
   ...


which uses KVM only.

In https://www.qemu.org/docs/master/system/s390x/cpu-topology.html

I read:

"Prerequisites:
To use the CPU topology, you need to run with KVM on a s390x host that uses the 
Linux kernel v6.0 or newer (which provide the so-called 
KVM_CAP_S390_CPU_TOPOLOGY capability that allows QEMU to signal the CPU 
topology facility via the so-called STFLE bit 11 to the VM).
"

So I would assume this is KVM-only, but then in the "Examples" section below I 
see the example:

"
$ qemu-system-s390x -m 2G \
  -cpu gen16b,ctop=on \
  -smp cpus=5,sockets=8,cores=4,maxcpus=32 \
  -device host-s390x-cpu,core-id=14 \
"

and

"
qemu-system-s390x -m 2G \
  -cpu gen16b,ctop=on \
  -smp cpus=1,sockets=8,cores=4,maxcpus=32 \
  \
  -device gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=1 \
  -device gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=2 \
  -device gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=3 \
  \
  -device gen16b-s390x-cpu,drawer-id=0,book-id=0,socket-id=0,core-id=9 \
  -device gen16b-s390x-cpu,drawer-id=0,book-id=0,socket-id=0,core-id=14 \
  \
  -device gen16b-s390x-cpu,core-id=4,dedicated=on,entitlement=high
"

We received questions about this, so I hope you can shed some light, maybe it 
would be good to just update the web page to include -accel kvm or -enable-kvm 
everywhere for clarity?

Thanks for your help on this,

Claudio

-- 
Claudio Fontana
Engineering Manager Virtualization, SUSE Labs Core

SUSE Software Solutions Italy Srl

Re: question on s390x topology: KVM only, or also TCG?

2024-03-14 Thread Thomas Huth


On 14/03/2024 16.49, Claudio Fontana wrote:

Hello Pierre, Ilya,

I have a question on the s390x "topology" feature and examples.

Mainly, is this feature supposed to be KVM accelerator-only, or also available 
when using the TCG accelerator?


 Hi Claudio!

Pierre left IBM, please CC: Nina with regards to s390x topology instead.

But with regards to your question, I think I can answer that, too: The 
topology feature is currently working with KVM only, yes. It hasn't been 
implemented for TCG yet.



(docs/devel/s390-cpu-topology.rst vs 
https://www.qemu.org/docs/master/system/s390x/cpu-topology.html)

I see stsi-topology.c in target/s390x/kvm/ , so that part is clearly 
KVM-specific,

but in hw/s390x/cpu-topology.c I read:

"
  * - The first part in this file is taking care of all common functions
  *   used by KVM and TCG to create and modify the topology.
  *
  * - The second part, building the topology information data for the
  *   guest with CPU and KVM specificity will be implemented inside
  *   the target/s390/kvm sub tree.
"

In the docs/devel/s390-cpu-topology.rst

I see the example command:

  qemu-system-s390x \
 -enable-kvm \
 -cpu z14,ctop=on \
 -smp 1,drawers=3,books=3,sockets=2,cores=2,maxcpus=36 \
 -device z14-s390x-cpu,core-id=19,entitlement=high \
 -device z14-s390x-cpu,core-id=11,entitlement=low \
 -device z14-s390x-cpu,core-id=12,entitlement=high \
...


which uses KVM only.

In https://www.qemu.org/docs/master/system/s390x/cpu-topology.html

I read:

"Prerequisites:
To use the CPU topology, you need to run with KVM on a s390x host that uses the 
Linux kernel v6.0 or newer (which provide the so-called 
KVM_CAP_S390_CPU_TOPOLOGY capability that allows QEMU to signal the CPU 
topology facility via the so-called STFLE bit 11 to the VM).
"

So I would assume this is KVM-only, but then in the "Examples" section below I 
see the example:

"
$ qemu-system-s390x -m 2G \
   -cpu gen16b,ctop=on \
   -smp cpus=5,sockets=8,cores=4,maxcpus=32 \
   -device host-s390x-cpu,core-id=14 \
"

and

"
qemu-system-s390x -m 2G \
   -cpu gen16b,ctop=on \
   -smp cpus=1,sockets=8,cores=4,maxcpus=32 \
   \
   -device gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=1 \
   -device gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=2 \
   -device gen16b-s390x-cpu,drawer-id=1,book-id=1,socket-id=2,core-id=3 \
   \
   -device gen16b-s390x-cpu,drawer-id=0,book-id=0,socket-id=0,core-id=9 \
   -device gen16b-s390x-cpu,drawer-id=0,book-id=0,socket-id=0,core-id=14 \
   \
   -device gen16b-s390x-cpu,core-id=4,dedicated=on,entitlement=high
"

We received questions about this, so I hope you can shed some light, maybe it 
would be good to just update the web page to include -accel kvm or -enable-kvm 
everywhere for clarity?


Yes, it would be better to include "-accel kvm" in those examples. Would you 
like to send a patch?


 Thanks,
  Thomas

Re: [PATCH v2 1/6] virtio/virtio-pci: Handle extra notification data

2024-03-14 Thread Jonah Palmer





On 3/14/24 10:55 AM, Eugenio Perez Martin wrote:

On Thu, Mar 14, 2024 at 1:16 PM Jonah Palmer  wrote:




On 3/13/24 11:01 PM, Jason Wang wrote:

On Wed, Mar 13, 2024 at 7:55 PM Jonah Palmer  wrote:


Add support to virtio-pci devices for handling the extra data sent
from the driver to the device when the VIRTIO_F_NOTIFICATION_DATA
transport feature has been negotiated.

The extra data that's passed to the virtio-pci device when this
feature is enabled varies depending on the device's virtqueue
layout.

In a split virtqueue layout, this data includes:
   - upper 16 bits: shadow_avail_idx
   - lower 16 bits: virtqueue index

In a packed virtqueue layout, this data includes:
   - upper 16 bits: 1-bit wrap counter & 15-bit shadow_avail_idx
   - lower 16 bits: virtqueue index

Tested-by: Lei Yang 
Reviewed-by: Eugenio Pérez 
Signed-off-by: Jonah Palmer 
---
   hw/virtio/virtio-pci.c | 10 +++---
   hw/virtio/virtio.c | 18 ++
   include/hw/virtio/virtio.h |  1 +
   3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index cb6940fc0e..0f5c3c3b2f 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -384,7 +384,7 @@ static void virtio_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
   {
   VirtIOPCIProxy *proxy = opaque;
   VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
-uint16_t vector;
+uint16_t vector, vq_idx;
   hwaddr pa;

   switch (addr) {
@@ -408,8 +408,12 @@ static void virtio_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
   vdev->queue_sel = val;
   break;
   case VIRTIO_PCI_QUEUE_NOTIFY:
-if (val < VIRTIO_QUEUE_MAX) {
-virtio_queue_notify(vdev, val);
+vq_idx = val;
+if (vq_idx < VIRTIO_QUEUE_MAX) {
+if (virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) {
+virtio_queue_set_shadow_avail_data(vdev, val);
+}
+virtio_queue_notify(vdev, vq_idx);
   }
   break;
   case VIRTIO_PCI_STATUS:
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index d229755eae..bcb9e09df0 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2255,6 +2255,24 @@ void virtio_queue_set_align(VirtIODevice *vdev, int n, 
int align)
   }
   }

+void virtio_queue_set_shadow_avail_data(VirtIODevice *vdev, uint32_t data)


Maybe I didn't explain well, but I think it is better to pass directly
idx to a VirtQueue *. That way only the caller needs to check for a
valid vq idx, and (my understanding is) the virtio.c interface is
migrating to VirtQueue * use anyway.



Oh, are you saying to just pass in a VirtQueue *vq instead of 
VirtIODevice *vdev and get rid of the vq->vring.desc check in the function?



+{
+/* Lower 16 bits is the virtqueue index */
+uint16_t i = data;
+VirtQueue *vq = &vdev->vq[i];
+
+if (!vq->vring.desc) {
+return;
+}
+
+if (virtio_vdev_has_feature(vdev, VIRTIO_F_RING_PACKED)) {
+vq->shadow_avail_wrap_counter = (data >> 31) & 0x1;
+vq->shadow_avail_idx = (data >> 16) & 0x7FFF;
+} else {
+vq->shadow_avail_idx = (data >> 16);


Do we need to do a sanity check for this value?

Thanks



It can't hurt, right? What kind of check did you have in mind?

if (vq->shadow_avail_idx >= vq->vring.num)



I'm a little bit lost too. shadow_avail_idx can take all uint16_t
values. Maybe you meant checking for a valid vq index, Jason?

Thanks!


Or something else?


+}
+}
+
   static void virtio_queue_notify_vq(VirtQueue *vq)
   {
   if (vq->vring.desc && vq->handle_output) {
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index c8f72850bc..53915947a7 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -335,6 +335,7 @@ void virtio_queue_update_rings(VirtIODevice *vdev, int n);
   void virtio_init_region_cache(VirtIODevice *vdev, int n);
   void virtio_queue_set_align(VirtIODevice *vdev, int n, int align);
   void virtio_queue_notify(VirtIODevice *vdev, int n);
+void virtio_queue_set_shadow_avail_data(VirtIODevice *vdev, uint32_t data);
   uint16_t virtio_queue_vector(VirtIODevice *vdev, int n);
   void virtio_queue_set_vector(VirtIODevice *vdev, int n, uint16_t vector);
   int virtio_queue_set_host_notifier_mr(VirtIODevice *vdev, int n,
--
2.39.3

Re: [PATCH V4 1/1] target/loongarch: Fixed tlb huge page loading issue

2024-03-14 Thread Richard Henderson


On 3/13/24 15:33, Xianglai Li wrote:

+if (unlikely((level == 0) || (level > 4))) {
+return base;
+}
+
+if (FIELD_EX64(base, TLBENTRY, HUGE)) {
+if (FIELD_EX64(base, TLBENTRY, LEVEL)) {
+return base;
+} else {
+return  FIELD_DP64(base, TLBENTRY, LEVEL, level);
+}
+
+if (unlikely(level == 4)) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "Attempted use of level %lu huge page\n", level);
+}


This block is unreachable, because you've already returned.
Perhaps it would be worthwhile to add another for the level==0 or > 4 case 
above?


@@ -530,20 +553,34 @@ void helper_ldpte(CPULoongArchState *env, target_ulong 
base, target_ulong odd,
  CPUState *cs = env_cpu(env);
  target_ulong phys, tmp0, ptindex, ptoffset0, ptoffset1, ps, badv;
  int shift;
-bool huge = (base >> LOONGARCH_PAGE_HUGE_SHIFT) & 0x1;
  uint64_t ptbase = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, PTBASE);
  uint64_t ptwidth = FIELD_EX64(env->CSR_PWCL, CSR_PWCL, PTWIDTH);
+uint64_t dir_base, dir_width;
  
  base = base & TARGET_PHYS_MASK;

+if (FIELD_EX64(base, TLBENTRY, HUGE)) {
+/*
+ * Gets the huge page level and Gets huge page size
+ * Clears the huge page level information in the address
+ * Clears huge page bit
+ */
+get_dir_base_width(env, &dir_base, &dir_width,
+   FIELD_EX64(base, TLBENTRY, LEVEL));
+
+FIELD_DP64(base, TLBENTRY, LEVEL, 0);
+FIELD_DP64(base, TLBENTRY, HUGE, 0);
+if (FIELD_EX64(base, TLBENTRY, HG)) {
+FIELD_DP64(base, TLBENTRY, HG, 0);
+FIELD_DP64(base, TLBENTRY, G, 1);


FIELD_DP64 returns a value.  You need

base = FIELD_DP64(base, ...);


r~

1 2 >

1 - 100 of 188 matches

Mail list logo